Changes between Version 16 and Version 17 of private/NlpInPracticeCourse/NamedEntityRecognition


Timestamp: Oct 9, 2017, 11:44:01 AM
Author: Ales Horak

  • private/NlpInPracticeCourse/NamedEntityRecognition

    v16 v17
     56  56  1. train the model using the default settings (cnec.prop). N.B. `convert_cnec_stanford.py` only recognizes PERSON, LOCATION and ORGANIZATION; you can extend the markup conversion later:
     57  57  {{{
     58      java -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier \
         58  java -cp stanford-ner-2017-06-09/stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier \
     59  59    -prop cnec.prop
     60  60  }}}
     61  61  1. convert the test data to the Stanford NER format:
     62  62   {{{
     63      python convert_cnec_stanford.py named_ent_dtest.xml \
         63  python convert_cnec_stanford.py cnec2.0/data/xml/named_ent_dtest.xml \
     64  64    > named_ent_dtest.tsv
     65  65  }}}
     66  66  1. evaluate the model on `dtest`:
     67  67  {{{
     68      java -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier \
         68  java -cp stanford-ner-2017-06-09/stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier \
     69  69    -loadClassifier cnec-3class-model.ser.gz \
     70  70    -testFile named_ent_dtest.tsv

     83  83  10. evaluate the model on `dtest` with only NEs that are not present in the train data:
     84  84   {{{
     85      java -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier \
         85  java -cp stanford-ner-2017-06-09/stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier \
     86  86    -loadClassifier cnec-3class-model.ser.gz \
     87  87    -testFile named_ent_dtest_unknown.tsv

     91  91  11. test on your own input:
     92  92   {{{
     93      java -mx600m -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier \
         93  java -mx600m -cp stanford-ner-2017-06-09/stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier \
     94  94    -loadClassifier cnec-3class-model.ser.gz -textFile sample.txt
     95  95  }}}
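
In the hunks shown, this revision changes only two paths: the jar path in the four `java` commands and the location of the `dtest` XML file. The following is a minimal sketch, not part of the course materials, of how the jar path could be kept in one place so that a later version bump touches a single line. It assumes Python 3.8+ (for `shlex.join`). The names `NER_JAR`, `crf_command`, `steps` and the step keys are hypothetical; the flags, class name and file names are copied from the commands in the diff.

```python
import shlex

# Jar location as of v17 of the page; in v16 it was "stanford-ner.jar".
NER_JAR = "stanford-ner-2017-06-09/stanford-ner.jar"
CRF_CLASS = "edu.stanford.nlp.ie.crf.CRFClassifier"
MODEL = "cnec-3class-model.ser.gz"


def crf_command(*args, jar=NER_JAR, java_opts=()):
    """Return the argument list for one CRFClassifier invocation."""
    return ["java", *java_opts, "-cp", jar, CRF_CLASS, *args]


def steps(jar=NER_JAR):
    """Build the java commands from the page, keyed by a hypothetical step name."""
    return {
        "train": crf_command("-prop", "cnec.prop", jar=jar),
        "eval_dtest": crf_command(
            "-loadClassifier", MODEL, "-testFile", "named_ent_dtest.tsv", jar=jar
        ),
        "eval_unknown": crf_command(
            "-loadClassifier", MODEL, "-testFile", "named_ent_dtest_unknown.tsv", jar=jar
        ),
        "tag_sample": crf_command(
            "-loadClassifier", MODEL, "-textFile", "sample.txt",
            jar=jar, java_opts=("-mx600m",),
        ),
    }


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be inspected first.
    for name, cmd in steps().items():
        print(f"# {name}")
        print(shlex.join(cmd))
```

To actually run a step, the argument list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains the unpacked Stanford NER distribution. Calling `steps(jar="stanford-ner.jar")` reproduces the v16 form of the commands.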