
Version 2 (modified by Zuzana Nevěřilová, 9 years ago)


Named Entity Recognition

IA161 Advanced NLP Course, Course Guarantor: Aleš Horák

Prepared by: Zuzana Nevěřilová

TODO until 31.5.2015

  1. choose specific papers for the References below (they will later serve as input for the lecture)
  2. prepare the Practical Session

State of the Art

References

  1. David Nadeau, Satoshi Sekine: A survey of named entity recognition and classification. In Satoshi Sekine and Elisabete Ranchhod (eds.) Named Entities: Recognition, classification and use. Lingvisticæ Investigationes 30:1. 2007. pp. 3–26 http://brown.cl.uni-heidelberg.de/~sourjiko/NER_Literatur/survey.pdf
  2. Charles Sutton and Andrew McCallum: An Introduction to Conditional Random Fields. Foundations and Trends in Machine Learning 4 (4). 2012. http://homepages.inf.ed.ac.uk/csutton/publications/crftut-fnt.pdf

Practical Session

Try the naive gazetteer method (implement substring search) on the prepared data. Observe the recognition results:

  1. what happens to every string present in the gazetteer?
  2. what happens to NEs not present in the gazetteer?
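
A minimal sketch of the naive gazetteer method, assuming the gazetteer is a plain mapping from strings to NE labels. The function name gazetteer_tag, the labels, and the sample sentence are illustrative assumptions and are not part of the prepared data. It reports every substring occurrence, which makes both questions above easy to observe.

```python
def gazetteer_tag(text, gazetteer):
    """Return (start, end, label, surface) for every gazetteer string found in text.

    Plain substring search: no tokenization, no word-boundary check.
    """
    matches = []
    for entry, label in gazetteer.items():
        start = text.find(entry)
        while start != -1:
            matches.append((start, start + len(entry), label, entry))
            # Continue searching after the current hit to catch repeated occurrences.
            start = text.find(entry, start + 1)
    return sorted(matches)


# Hypothetical toy gazetteer and text (assumptions for illustration only).
gazetteer = {"Brno": "LOC", "Masaryk University": "ORG", "Mark": "PER"}
text = "Masaryk University is in Brno. Marketing is not a person, and Ostrava is missing."

for match in gazetteer_tag(text, gazetteer):
    print(match)
```

In this toy input, "Mark" is matched inside "Marketing" (a false positive caused by substring search), while "Ostrava" is never found because it is not listed in the gazetteer.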

Try the machine learning approach (use Stanford NER) with the prepared data. Observe the recognition results:

  1. measure precision, recall, and F1-score on the test data
  2. find NEs not present in the training data
  3. find NEs that were not recognized
  4. discuss what types of NE are easy/difficult to recognize
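
The measurements above can be sketched as follows. This is a hedged example, not the evaluation script of the course: it assumes exact-match, entity-level scoring, with each entity represented as a (sentence_id, start_token, end_token, label) tuple. The function names prf1 and unseen_entities and all sample values are hypothetical.

```python
def prf1(gold, predicted):
    """Entity-level precision, recall and F1 with exact span and label match."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def unseen_entities(train_surfaces, test_surfaces):
    """Surface forms of test NEs that never occur as NEs in the training data."""
    return sorted(set(test_surfaces) - set(train_surfaces))


# Hypothetical annotations: (sentence_id, start_token, end_token, label).
gold = {(0, 0, 2, "ORG"), (0, 5, 6, "LOC"), (1, 0, 1, "PER"), (1, 3, 4, "LOC")}
predicted = {(0, 0, 2, "ORG"), (0, 5, 6, "PER"), (1, 0, 1, "PER")}

precision, recall, f1 = prf1(gold, predicted)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")

# NEs that were not recognized (wrong label counts as a miss under exact match).
print(sorted(gold - predicted))

# NEs in the test data that are absent from the training data.
print(unseen_entities(["Brno", "Praha"], ["Brno", "Ostrava"]))
```

Exact matching is the strictest choice: an entity with the right span but the wrong label counts as both a false positive and a false negative. Partial-match scoring would give different numbers.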
