Changes between Version 1 and Version 2 of private/NlpInPracticeCourse/NamedEntityRecognition


Timestamp:
Jul 23, 2015, 4:17:32 PM
Author:
Zuzana Nevěřilová
Comment:

--

  • private/NlpInPracticeCourse/NamedEntityRecognition

    v1 v2  
    14 14 === References ===
    15 15
    16 Approx 3 current papers (preferably from best NLP conferences/journals, eg. [[https://www.aclweb.org/anthology/|ACL Anthology]]) that will be used as a source for the one-hour lecture:
    17 
    18  1. paper1
    19  1. paper2
    20  1. paper3
     16 1. David Nadeau, Satoshi Sekine: A survey of named entity recognition and classification. In Satoshi Sekine and Elisabete Ranchhod (eds.) Named Entities: Recognition, classification and use. Lingvisticæ Investigationes 30:1. 2007. pp. 3–26 [[http://brown.cl.uni-heidelberg.de/~sourjiko/NER_Literatur/survey.pdf]]
     17 1. Charles Sutton and Andrew !McCallum: An Introduction to Conditional Random Fields. Foundations and Trends in Machine Learning 4 (4). 2012. [[http://homepages.inf.ed.ac.uk/csutton/publications/crftut-fnt.pdf]]
    21 18
    22 19 == Practical Session ==
    23 20
    24 Concrete description of work assignment for students for the second one-hour part of the lecture. The work will consist of tasks connected with practical implementations of algorithms connected with the current topic (probably not the state-of-the-art algorithms mentioned in the first part) and with real data. Students can test the algorithms, evaluate them and possibly try some short adaptations for various subtasks.
     21 Try the naive gazetteer method (implement substring search) on the prepared data.
     22 Observe the recognition:
     23  1. what happens to every string present in the gazetteer?
     24  1. what happens to NEs not present in the gazetteer?
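
The naive gazetteer method above could be sketched as follows. This is only a minimal illustration, not the course's reference solution: the function name `find_gazetteer_matches`, the sample gazetteer, and the sample sentence are all made up for the example, and the prepared data may use a different format.

```python
def find_gazetteer_matches(text, gazetteer):
    """Return sorted (start, end, entry) tuples for every occurrence
    of every gazetteer entry found in text by plain substring search."""
    matches = []
    for entry in gazetteer:
        if not entry:
            continue  # an empty string would match at every position
        start = text.find(entry)
        while start != -1:
            matches.append((start, start + len(entry), entry))
            # continue searching right after the start of the last hit
            start = text.find(entry, start + 1)
    return sorted(matches)


# Hypothetical sample data, not the prepared course data.
gazetteer = ["Brno", "Masaryk University", "Eva"]
text = "Eva studies at Masaryk University in Brno; Evaluation is due in Prague."

matches = find_gazetteer_matches(text, gazetteer)
for start, end, entry in matches:
    print(start, end, entry)
```

In this sample, "Eva" is also matched inside "Evaluation" (every gazetteer string is tagged wherever it occurs, even as part of another word), while "Prague" is not found because it is missing from the gazetteer. These are the two effects the questions above ask about.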
    25 25
    26 Students can also be required to generate some results of their work and hand them in to prove completing the tasks.
     26 Try a machine learning approach (use the Stanford NER) with the prepared data.
     27 Observe the recognition:
     28  1. measure precision, recall, and F1-score on the test data
     29  1. find NEs not present in the training data
     30  1. find NEs that were not recognized
     31  1. discuss what types of NE are easy/difficult to recognize
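
The evaluation steps above might be sketched like this. It assumes exact-match, entity-level scoring where each entity is a `(start, end, type)` tuple; that representation, the helper names, and the sample annotations are assumptions for illustration and are not taken from the Stanford NER output format.

```python
def precision_recall_f1(gold, predicted):
    """Entity-level scores: a prediction counts as correct only if its
    span and type exactly match a gold entity."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def unseen_entities(train_names, test_names):
    """Entity strings that occur in the test data but not in the training data."""
    return set(test_names) - set(train_names)


# Hypothetical annotations as (start, end, type) tuples.
gold = {(0, 3, "PER"), (15, 33, "ORG"), (37, 41, "LOC"), (64, 70, "LOC")}
predicted = {(0, 3, "PER"), (15, 33, "ORG"), (43, 46, "PER")}

precision, recall, f1 = precision_recall_f1(gold, predicted)
not_recognized = gold - set(predicted)  # gold entities the tagger missed
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
print("not recognized:", sorted(not_recognized))
```

Exact matching is the strictest choice; partial-overlap or token-level scoring would give different numbers, so it is worth stating which variant was used when reporting results.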