Changes between Version 2 and Version 3 of private/NlpInPracticeCourse/NamedEntityRecognition


Timestamp:
Jul 23, 2015, 4:30:29 PM
Author:
Zuzana Nevěřilová
Comment:

--

  • private/NlpInPracticeCourse/NamedEntityRecognition

    v2 v3  
    1212== State of the Art ==
    1313
     14NER aims to ''recognize'' and ''classify'' names of people, locations, organizations, products, and artworks, and sometimes also dates, amounts of money, measurements (numbers with units), and law or patent numbers. Known issues are the ambiguity of words (e.g. ''May'' can be a month, a verb, or a name), the ambiguity of classes (e.g. ''HMS Queen Elizabeth'' looks like a person's name but denotes a ship), and the inherent incompleteness of lists of NEs.
     15
     16Named entity recognition (NER) is used mainly in information extraction (IE), but it can also significantly improve other NLP tasks such as syntactic parsing.
     17
     18=== Example from IE ===
     19
     20In 2003, Hannibal Lecter (as portrayed by Hopkins) was chosen by the American Film Institute as the #1 movie villain.
     21
     22Hannibal Lecter <-> Hopkins (IE should link the character to the actor who portrays him)
     23
     24=== Example concerning syntactic parsing ===
     25
     26Wish You Were Here is the ninth studio album by the English progressive rock group Pink Floyd.
     27
     28vs.
     29
     30Wish_You_Were_Here is the ninth studio album by the English progressive rock group Pink Floyd.
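
The second variant above can be produced by a small preprocessing step that joins a recognized multiword name into a single token before parsing. The following is only a sketch under stated assumptions: the function name `join_multiword_entities` and the one-item entity list are hypothetical, and the names are assumed to be already recognized.

```python
def join_multiword_entities(sentence, entities):
    """Replace each known multiword entity with an underscore-joined token."""
    # Longest names first, so a long name is handled before any shorter
    # name that happens to be contained in it.
    for entity in sorted(entities, key=len, reverse=True):
        sentence = sentence.replace(entity, entity.replace(" ", "_"))
    return sentence


sentence = ("Wish You Were Here is the ninth studio album by "
            "the English progressive rock group Pink Floyd.")
entities = ["Wish You Were Here"]  # hypothetical output of a NER step

joined = join_multiword_entities(sentence, entities)
print(joined)
```

With the title joined into one token, a parser no longer has to analyze ''Wish'', ''You'', ''Were'', and ''Here'' as a separate clause.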
     31
    1432=== References ===
    1533
     
    2038
    2139Try the naive gazetteer method (implement substring search) on the prepared data.
    22 Observe the recognition:
     40Observe the results:
    2341  1. what happens to every string present in the gazetteer?
    2442  1. what happens to NEs not present in the gazetteer?
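
One possible shape of the naive gazetteer method is sketched below. It is an illustration, not the course solution: the function name `gazetteer_search`, the toy gazetteer, its labels, and the sample text are all assumptions made for this example.

```python
def gazetteer_search(text, gazetteer):
    """Return (start, end, name, label) for every gazetteer string found in text."""
    matches = []
    for name, label in gazetteer.items():
        start = text.find(name)
        while start != -1:
            matches.append((start, start + len(name), name, label))
            # Continue searching after the current hit.
            start = text.find(name, start + 1)
    return sorted(matches)


# Toy gazetteer and text (hypothetical data, not the prepared course data).
gazetteer = {"May": "PERSON", "Pink Floyd": "ORG"}
text = "In May, Pink Floyd fans said the album may be reissued. Maybe."

matches = gazetteer_search(text, gazetteer)
for start, end, name, label in matches:
    print(start, end, name, label)
```

Every occurrence of a gazetteer string is tagged regardless of context, so the month ''May'' and the prefix of ''Maybe'' both receive the label from the list, while a name missing from the gazetteer is never found.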
    2543
    2644Try the machine learning approach (use the Stanford NER) with the prepared data.
    27 Observe the recognition:
     45Observe the results:
    2846  1. measure precision, recall, and F1-score on the test data
    2947  1. find NEs not present in the training data
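
The two measurements above could be computed along the following lines. This sketch assumes entity-level exact matching on `(start, end, label)` triples and does not call the Stanford NER itself; the function names and the sample spans are hypothetical.

```python
def precision_recall_f1(gold, predicted):
    """Entity-level scores; an entity counts only if span and label both match."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def unseen_entities(train_names, test_names):
    """Names that occur in the test data but never in the training data."""
    return set(test_names) - set(train_names)


# Hypothetical annotations as (start, end, label) triples.
gold = [(0, 15, "PERSON"), (30, 37, "PERSON"), (60, 83, "ORG")]
predicted = [(0, 15, "PERSON"), (60, 83, "ORG"), (90, 95, "LOC")]

precision, recall, f1 = precision_recall_f1(gold, predicted)
unseen = unseen_entities(["Pink Floyd"], ["Pink Floyd", "Hannibal Lecter"])
print(precision, recall, f1, unseen)
```

Comparing the recognizer's output on the unseen names with its output on names seen in training shows how well the model generalizes beyond memorized strings.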