DATA HOMEWORK 4

    This homework is on the information-structure annotations of the Switchboard corpus.

  1. Skim this guidelines paper on the annotations:
    Nissim, Malvina. 2003. Annotation Scheme for Information Status in Dialogue.

  2. Go to the SWBD information-structure annotations in

    /afs/ir.stanford.edu/data/linguistic-data/Treebank/LINK-swbd/backtrans

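    Not all the dialogues there are annotated for information structure. To find
    out which ones are, run

    egrep -l '_old' * | more

    The -l option prints only the names of the files that contain a match, and
    '_old' occurs in the markable labels of annotated files (e.g.,
    NP-SBJ_MARKABLE_human_old_general).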

  3. Look at the file sw2041.backtranslated.mrg. Explain why "the aerobics class" is annotated as mediated (med_set) in the following sentence:
    ( (S
     (SBAR-SBJ
      (WHADVP-N40040B (WRB how) )
      (S
       (NP-SBJ_MARKABLE_human_old_general (PRP I) )
       (VP (VBD met) 
        (NP_MARKABLE_human_old_ident_ANAPH2_ANTEC3 (PRP him) )
        (ADVP (-NONE- -N40040B) ))))
     (VP (VBD was) 
      (PP-PRD (IN through) 
       (, ,)
       (INTJ (UH uh) )
       (, ,)
       (NP_MARKABLE_nonconc_med_set
        (NP_MARKABLE_nonconc_med_set_ANTEC29 (DT the)  (NN aerobics)  (NN class) )
        (SBAR
         (WHNP-N40044D_MARKABLE_nonconc_old_relative_ANAPH29 (WDT that) )
         (S
          (NP-SBJ-N40043E_MARKABLE_human_old_ident_ANAPH3 (PRP he) )
          (VP (VBD used) 
           (S
            (NP-SBJ_MARKABLE (-NONE- -N40043E) )
            (VP (TO to) 
             (VP (VB teach) 
              (NP_MARKABLE (-NONE- -N40044D) ))))))))))
     (. .) (-DFL- E_S) ))
    ( (CODE (SYM SpeakerA7) (. .) ))
    

  4. Write a paragraph on something interesting you notice anywhere in the data.
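
    If you want a quick overview of a file before you start reading, a short
    script can tally the labels for you. The following is only a minimal sketch,
    not part of the corpus tools. It assumes that everything after _MARKABLE in a
    node label is a list of underscore-separated fields, that the second and third
    fields are the information status and its subtype (as in old_ident or med_set
    in the excerpt above), and that ANAPHn and ANTECn are coreference links. These
    assumptions are read off the excerpt, so check them against the guidelines
    paper. The names markables and status_counts are made up for this example.

```python
import re
from collections import Counter

# A node label such as "(NP_MARKABLE_nonconc_med_set_ANTEC29": group 1 is the
# syntactic category, group 2 the underscore-joined fields after _MARKABLE.
MARKABLE = re.compile(r"\(([^\s()_]+)_MARKABLE((?:_[A-Za-z0-9]+)*)")
LINK = re.compile(r"(?:ANAPH|ANTEC)\d+")


def markables(text):
    """Yield (category, fields, links) for every MARKABLE node label in text."""
    for m in MARKABLE.finditer(text):
        parts = [p for p in m.group(2).split("_") if p]
        # Assumption: ANAPHn / ANTECn are links, everything else is a field.
        links = [p for p in parts if LINK.fullmatch(p)]
        fields = [p for p in parts if not LINK.fullmatch(p)]
        yield m.group(1), fields, links


def status_counts(text):
    """Count status_subtype pairs, assuming fields are [type, status, subtype]."""
    counts = Counter()
    for _, fields, _ in markables(text):
        if len(fields) >= 3:  # bare MARKABLE nodes (e.g. traces) are skipped
            counts["_".join(fields[1:3])] += 1
    return counts


# A few node labels copied from the excerpt in step 3.
sample = """
(NP-SBJ_MARKABLE_human_old_general (PRP I) )
(NP_MARKABLE_human_old_ident_ANAPH2_ANTEC3 (PRP him) )
(NP_MARKABLE_nonconc_med_set_ANTEC29 (DT the)  (NN aerobics)  (NN class) )
(NP-SBJ_MARKABLE (-NONE- -N40043E) )
"""

for label, n in status_counts(sample).most_common():
    print(label, n)
```

    To run it on a whole dialogue, replace sample with the contents of a file,
    for example open("sw2041.backtranslated.mrg").read().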