Anaphora resolution

IA161 Advanced NLP Course, Course Guarantor: Aleš Horák

Prepared by: Marek Medveď

State of the Art

Anaphora resolution (or pronoun resolution) is the problem of resolving references to earlier or later items in the discourse.
Main approaches:

  1. Knowledge-rich approaches:
    1. Syntax-based approaches
    2. Discourse-based approaches
    3. Hybrid approaches
    4. Corpus-based approaches
  2. Knowledge-poor approaches:
    1. Machine learning techniques

References

  1. Anaphora Resolution, Studies in Language and Linguistics, Ruslan Mitkov, 2014, Taylor & Francis, ISBN 9781317881810
  2. Anaphora resolution: the state of the art, Ruslan Mitkov, 1999, Citeseer
  3. Strategies of anaphora resolution, Tanya Reinhart, 2006, North Holland
  4. Discriminative Approach to Predicate-argument Structure Analysis with Zero-anaphora Resolution, Kenji Imamura, Kuniko Saito, Tomoko Izumi, 2009, Association for Computational Linguistics, ACMID 1667611
  5. The Influence of Minimum Edit Distance on Reference Resolution, Michael Strube, Stefan Rapp, Christoph Muller, EMNLP 2002, Association for Computational Linguistics, ACMID 1118733
  6. Combining Sample Selection and Error-driven Pruning for Machine Learning of Coreference Rules, Vincent Ng, Claire Cardie, EMNLP 2002, Association for Computational Linguistics, ACMID 1118701

Practical Session

The student has to understand Hobbs' definition of anaphora resolution and, following it, implement the main function of Hobbs' algorithm in the provided Python script, which already contains all the other necessary functions. The student then tests the program on real data (syntactic trees) and evaluates it. At the end of the session, the student has to hand in the results to prove that the task was completed. A student who finishes early gets an additional task: find sentence structures that are not covered by Hobbs' algorithm.
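As a warm-up for reading the script, the sketch below illustrates the breadth-first, left-to-right search for noun-phrase candidates that Hobbs' algorithm relies on. It is only a simplified illustration, not the function from hobbs.py: the nested-tuple tree format and the names bfs_candidates and words are assumptions made for this example, and the full algorithm adds further conditions (the path from the pronoun, intervening NP or S nodes, agreement checks) that are omitted here.

```python
from collections import deque

# Assumed toy format: an inner node is (label, [children]),
# a leaf is (pos_tag, "word"). This is NOT the nltk Tree class.

def bfs_candidates(tree, target_labels=("NP",)):
    """Return subtrees with a target label in breadth-first, left-to-right order."""
    queue = deque([tree])
    found = []
    while queue:
        label, children = queue.popleft()
        if label in target_labels:
            found.append((label, children))
        # Only inner nodes have a list of children to visit.
        if isinstance(children, list):
            queue.extend(children)
    return found

def words(node):
    """Collect the words under a node, left to right."""
    label, children = node
    if isinstance(children, str):
        return [children]
    return [w for child in children for w in words(child)]

sentence = ("S", [
    ("NP", [("NNP", "John")]),
    ("VP", [("VBD", "saw"),
            ("NP", [("DT", "the"), ("NN", "dog")])]),
])

candidates = [" ".join(words(np)) for np in bfs_candidates(sentence)]
print(candidates)
```

Because the traversal is breadth-first, candidates closer to the root of the tree are proposed before more deeply embedded ones.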

The task:

  1. Download the script with data, available here.
  2. Understand Hobbs' definition of anaphora resolution and replace the 'XXX' function call with the correct one.
  3. Try to find sentences that cannot be resolved by Hobbs' algorithm. You can use the Stanford parser to test new sentences: copy the tree onto one line and remove the ROOT tag.
  4. Submit the hobbs.py script to the homework vault.
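The tree preparation mentioned in step 3 (one line, no ROOT tag) can also be done with a few lines of Python instead of by hand. This is a minimal sketch, assuming the parser prints the tree wrapped as "(ROOT ...)" across several lines; the function name flatten_parse is hypothetical and not part of the course script.

```python
import re

def flatten_parse(parse_text):
    """Collapse a bracketed parse to one line and drop an outer (ROOT ...) wrapper."""
    # Join all lines and squeeze runs of whitespace into single spaces.
    one_line = re.sub(r"\s+", " ", parse_text).strip()
    # Assumption: the wrapper has the form "(ROOT <tree>)"; otherwise keep the text as is.
    match = re.fullmatch(r"\(ROOT\s+(.*)\)", one_line)
    return match.group(1).strip() if match else one_line

parser_output = """
(ROOT
  (S
    (NP (NNP John))
    (VP (VBZ sleeps))))
"""

print(flatten_parse(parser_output))
```

The resulting line can then be appended to a text file of test sentences and passed to the script in place of demosents.txt.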

Important commands (aisa):

  1. Python with nltk (version 3.0.2): /home/xmedved1/nlp/python-env/bin/python
  2. Copy /home/xmedved1/nltk_data/ to your home directory.
  3. Execute the Hobbs script: /home/xmedved1/nlp/python-env/bin/python ./hobbs.py demosents.txt He