Anaphora resolution

IA161 Advanced NLP Course, Course Guarantor: Aleš Horák

Prepared by: Marek Medveď

State of the Art

Anaphora resolution (or pronoun resolution) is the problem of resolving references to earlier or later items in the discourse.
Main approaches:

  1. Knowledge-rich approaches:
    1. Syntax-based approaches
    2. Discourse-based approaches
    3. Hybrid approaches
    4. Corpus-based approaches
  2. Knowledge-poor approaches:
    1. Machine learning techniques


References

  1. Anaphora Resolution (Studies in Language and Linguistics), Ruslan Mitkov, 2014, Taylor & Francis, ISBN 9781317881810
  2. Anaphora resolution: the state of the art, Ruslan Mitkov, 1999, Citeseer
  3. Strategies of anaphora resolution, Tanya Reinhart, 2006, North Holland
  4. Discriminative Approach to Predicate-argument Structure Analysis with Zero-anaphora Resolution, Kenji Imamura, Kuniko Saito and Tomoko Izumi, 2009, Association for Computational Linguistics, ACMID 1667611
  5. The Influence of Minimum Edit Distance on Reference Resolution, Michael Strube, Stefan Rapp and Christoph Muller, EMNLP 2002, Association for Computational Linguistics, ACMID 1118733
  6. Combining Sample Selection and Error-driven Pruning for Machine Learning of Coreference Rules, Vincent Ng and Claire Cardie, EMNLP 2002, Association for Computational Linguistics, ACMID 1118701

Practical Session

Students have to understand Hobbs' definition of anaphora resolution and, based on it, implement the main function of Hobbs' algorithm in the provided Python script, which already contains all the other necessary functions. Students then test the program on real data (syntactic trees) and evaluate it. At the end of the session, students hand in the results to prove that they completed the task. Students who finish early have an additional task: find sentence structures that are not covered by Hobbs' algorithm.
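Hobbs' algorithm searches syntactic trees breadth-first and left to right, proposing NP nodes as antecedent candidates. The sketch below illustrates only that traversal order on a one-line bracketed tree. It is not the course script: it uses only the standard library instead of NLTK, the helper names parse_tree, leaves and bfs_noun_phrases are hypothetical, and it leaves out the path constraints and agreement checks of the full algorithm.

```python
from collections import deque


def parse_tree(text):
    """Parse a one-line bracketed tree into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())       # nested subtree
            else:
                children.append(tokens[pos])  # leaf word
                pos += 1
        pos += 1  # skip the closing bracket
        return (label, children)

    return read()


def leaves(node):
    """Return the words under a node, left to right."""
    _, children = node
    words = []
    for child in children:
        words.extend(leaves(child) if isinstance(child, tuple) else [child])
    return words


def bfs_noun_phrases(tree):
    """Yield the text of every NP node in breadth-first, left-to-right order."""
    queue = deque([tree])
    while queue:
        node = queue.popleft()
        label, children = node
        if label == "NP":
            yield " ".join(leaves(node))
        # Enqueue subtrees only; plain strings are leaf words.
        queue.extend(c for c in children if isinstance(c, tuple))


sentence = "(S (NP (NNP John)) (VP (VBD saw) (NP (DT the) (NN dog))))"
candidates = list(bfs_noun_phrases(parse_tree(sentence)))
print(candidates)  # -> ['John', 'the dog']
```

The subject NP is proposed before the object NP because it sits higher and further left in the tree, which is the preference the breadth-first order encodes.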

The task:

  1. Download the script with data, available here.
  2. The NLTK package is required. Run 'pip3 install nltk --user' in a terminal to install it.
  3. Understand Hobbs' definition of anaphora resolution and replace the 'XXX' function call with the correct one.
  4. Find 20 nontrivial sentences with anaphora: 10 that Hobbs' algorithm can resolve and 10 that it cannot. You can use the Stanford parser to obtain trees for new sentences: copy the tree to one line and remove the ROOT tag.
  5. Submit the script, together with the 10 examples that are recognized correctly and the 10 examples that are not, to the homework vault. For each unrecognized example, write an explanation into one separate file, unrecognized_notes.txt (first column: example id, second column: explanation).
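The tree preparation mentioned in the task (one line, no ROOT tag) can also be done programmatically. The following is a minimal sketch, assuming the parser prints a Penn Treebank style bracketed tree wrapped in a single (ROOT ...) node; the function name to_one_line and the sample tree are illustrative and are not part of the course script.

```python
import re


def to_one_line(parser_output):
    """Collapse a multi-line bracketed tree to one line and drop the ROOT wrapper."""
    # Replace every run of whitespace (including newlines) with one space.
    flat = " ".join(parser_output.split())
    # Strip the outer "(ROOT ... )" if present; otherwise return the tree unchanged.
    match = re.fullmatch(r"\(ROOT (.*)\)", flat)
    return match.group(1) if match else flat


raw = """(ROOT
  (S
    (NP (PRP He))
    (VP (VBD slept))
    (. .)))"""

print(to_one_line(raw))  # -> (S (NP (PRP He)) (VP (VBD slept)) (. .))
```

Each resulting line can then be appended to a file of test sentences, one tree per line.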


  1. execute Hobbs script: python ./ demosents.txt He
