Anaphora resolution
IA161 Advanced NLP Course, Course Guarantee: Aleš Horák
Prepared by: Marek Medveď
State of the Art
Anaphora resolution (or pronoun resolution) is the problem of resolving references to earlier or later items in the discourse.
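To make the definition concrete, here is a minimal sketch of the task's input and output. It is not one of the approaches listed on this page: it is a naive recency baseline that picks the closest preceding mention agreeing with the pronoun in gender and number. The names `PRONOUN_FEATURES` and `resolve_by_recency`, the feature labels, and the sample mentions are all assumptions made for illustration, and only references to earlier items are handled.

```python
# Toy agreement features (assumed for illustration); a real system would
# take these from a lexicon or a morphological tagger.
PRONOUN_FEATURES = {
    "he": ("masc", "sg"),
    "she": ("fem", "sg"),
    "it": ("neut", "sg"),
    "they": (None, "pl"),  # None means "any gender"
}


def resolve_by_recency(pronoun, preceding_mentions):
    """Return the closest preceding mention that agrees in gender and number.

    preceding_mentions is a list of (text, gender, number) tuples in the
    order in which the mentions appear before the pronoun.
    """
    gender, number = PRONOUN_FEATURES[pronoun.lower()]
    # Walk backwards so that the most recent mention is tried first.
    for text, m_gender, m_number in reversed(preceding_mentions):
        if m_number == number and gender in (None, m_gender):
            return text
    return None  # no agreeing antecedent found


# "Mary gave the report to John. She ..." (hypothetical example)
mentions = [
    ("Mary", "fem", "sg"),
    ("the report", "neut", "sg"),
    ("John", "masc", "sg"),
]
antecedent = resolve_by_recency("she", mentions)
```

Here `antecedent` is `"Mary"`: the two more recent mentions are skipped because they disagree in gender. Such a baseline ignores syntactic structure entirely, which is what the approaches below add.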
Main approaches:
- Knowledge-rich approaches:
  - Syntax-based approaches
  - Discourse-based approaches
  - Hybrid approaches
  - Corpus-based approaches
- Knowledge-poor approaches:
  - Machine learning techniques
References
- Anaphora Resolution, Studies in Language and Linguistics by Mitkov, R., 2014, Taylor & Francis, ISBN 9781317881810
- Anaphora resolution: the state of the art, Ruslan Mitkov, 1999, Citeseer
- Strategies of anaphora resolution, Tanya Reinhart, 2006, North Holland
- Discriminative Approach to Predicate-argument Structure Analysis with Zero-anaphora Resolution, Kenji Imamura and Kuniko Saito and Tomoko Izumi, 2009, Association for Computational Linguistics, ACMID 1667611
- The Influence of Minimum Edit Distance on Reference Resolution, Michael Strube and Stefan Rapp and Christoph Muller, EMNLP 2002, Association for Computational Linguistics, ACMID 1118733
- Combining Sample Selection and Error-driven Pruning for Machine Learning of Coreference Rules, Vincent Ng and Claire Cardie, EMNLP 2002, Association for Computational Linguistics, ACMID 1118701
Practical Session
The student has to understand Hobbs' definition of anaphora resolution and, according to it, implement the main function of Hobbs' algorithm in the provided Python script, which contains all the necessary functions. The student then tests the program on real data (syntactic trees) and evaluates it. At the end of the session, the student has to hand in the results to prove that the task has been completed. A student who finishes early gets an additional task: find sentence structures that are not covered by Hobbs' algorithm.
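One building block of Hobbs' algorithm is the order in which a parse tree is searched for candidate antecedents: left to right and breadth first, proposing NP nodes as they are encountered. The sketch below shows only that traversal, not the whole algorithm. It assumes a simple tuple representation of trees instead of the NLTK trees used by hobbs.py, and the names `bfs_noun_phrases` and `words` are hypothetical.

```python
from collections import deque

# Assumed representation: a node is (label, child, child, ...),
# and a leaf is a plain string holding the word.


def bfs_noun_phrases(tree):
    """Yield NP subtrees in left-to-right, breadth-first order."""
    queue = deque([tree])
    while queue:
        node = queue.popleft()
        if isinstance(node, str):
            continue  # a word has no label and no children
        label, *children = node
        if label == "NP":
            yield node
        # Appending children in order keeps the search left to right.
        queue.extend(children)


def words(node):
    """Return the words covered by a node, in sentence order."""
    if isinstance(node, str):
        return [node]
    return [w for child in node[1:] for w in words(child)]


sentence = (
    "S",
    ("NP",
     ("DT", "The"), ("NN", "castle"),
     ("PP", ("IN", "in"), ("NP", ("NNP", "Camelot")))),
    ("VP",
     ("VBD", "remained"),
     ("NP", ("DT", "the"), ("NN", "residence"))),
)

candidates = [" ".join(words(np)) for np in bfs_noun_phrases(sentence)]
```

The resulting order is "The castle in Camelot", "the residence", "Camelot": the embedded NP comes last because it lies deeper in the tree, even though it appears earlier in the sentence than "the residence". A depth-first search would rank it second, so the choice of traversal changes which antecedent is proposed first.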
The task:
- Download the script with data from here.
- The NLTK package is required for hobbs.py. When running on your own computer, paste the following command into a terminal to install it (faculty machines should already have NLTK installed):
  pip3 install nltk --user
- Understand Hobbs' definition of anaphora resolution and replace the XXX function calls with the correct ones.
- Find 20 nontrivial sentences with anaphora: 10 that Hobbs' algorithm can recognize and 10 that it cannot. You can use the Stanford parser to test new sentences: copy the tree to one line and remove the ROOT tag.
- Submit your hobbs.py script, together with the 10 examples that hobbs.py recognizes correctly and the 10 examples that it does not, to the homework vault. For each unrecognized example, write an explanation into one separate file, unrecognized_notes.txt (first column: example id, second column: explanation).
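The tree preparation step mentioned in the task (copy the tree to one line and remove the ROOT tag) can be sketched as follows. This is a minimal sketch that assumes the parser prints a bracketed tree wrapped in a single outer (ROOT ...) node. The function name `flatten_parse` and the sample parse are made up for illustration, and the exact input format that hobbs.py expects should be checked against the sentences shipped with the script.

```python
import re


def flatten_parse(parser_output):
    """Collapse a bracketed parse to one line and drop the outer ROOT node."""
    # Joining on single spaces removes newlines and indentation.
    one_line = " ".join(parser_output.split())
    # Strip the wrapping "(ROOT ... )" if it is present.
    match = re.fullmatch(r"\(ROOT\s+(.*)\)", one_line)
    return match.group(1).strip() if match else one_line


# Illustrative multi-line parse (assumed, not taken from the course data).
raw = """
(ROOT
  (S
    (NP (NNP John))
    (VP (VBD saw)
      (NP (PRP him)))
    (. .)))
"""

tree_line = flatten_parse(raw)
```

For this input, `tree_line` is `(S (NP (NNP John)) (VP (VBD saw) (NP (PRP him))) (. .))`. A tree without a ROOT wrapper is only collapsed to one line and otherwise left unchanged.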
Commands:
- execute Hobbs script:
python3 ./hobbs.py demosents.txt He
Attachments (3)
- hobbs_2-IA161.zip (40.9 KB), added 4 years ago.
- hobbs_correct.py (14.6 KB), added 17 months ago. Correct implementation of the Hobbs script.
- hobbs.zip (21.7 KB), added 5 months ago. Homework.