Changes between Version 15 and Version 16 of private/NlpInPracticeCourse/NamedEntityRecognition


Timestamp:
Oct 9, 2017, 11:37:57 AM (7 years ago)
Author:
Ales Horak
Comment:

--

  • private/NlpInPracticeCourse/NamedEntityRecognition

    v15 v16  
    48481. convert the train data to the Stanford NER format:
    4949{{{
    50 python convert_cnec_stanford.py cnec2.0/data/xml/named_ent_train.xml > named_ent_train.tsv
     50python convert_cnec_stanford.py cnec2.0/data/xml/named_ent_train.xml \
     51  > named_ent_train.tsv
    5152}}}
    5253
     
    55561. train the model using the default settings (cnec.prop). N.B. `convert_cnec_stanford.py` only recognizes PERSON, LOCATION and ORGANIZATION; you can extend the markup conversion later:
    5657{{{
    57 java -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier -prop cnec.prop
     58java -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier \
     59  -prop cnec.prop
    5860}}}
    59611. convert the test data to the Stanford NER format:
    6062 {{{
    61  python convert_cnec_stanford.py named_ent_dtest.xml > named_ent_dtest.tsv
     63python convert_cnec_stanford.py named_ent_dtest.xml \
     64  > named_ent_dtest.tsv
    6265}}}
    63661. evaluate the model on `dtest`:
    6467{{{
    6568java -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier \
    66   -loadClassifier cnec-3class-model.ser.gz -testFile named_ent_dtest.tsv
     69  -loadClassifier cnec-3class-model.ser.gz \
     70  -testFile named_ent_dtest.tsv
    6771}}}
    6872
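The evaluation step above can be cross-checked by hand. The following is a minimal sketch, assuming the `-testFile` run prints one token per line as three tab-separated columns (word, gold label, predicted label) and that `O` is the background label. The function name `token_scores` is hypothetical. It computes token-level scores, which are simpler than, and not identical to, the per-entity summary that the classifier prints itself.

```python
from collections import Counter


def token_scores(lines, background="O"):
    """Return {label: (precision, recall, f1)} computed per token.

    Assumes each data line is: word<TAB>gold<TAB>guess.
    Lines with any other number of columns are skipped.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # blank sentence separators and non-data lines
        _, gold, guess = parts
        if gold == guess:
            if gold != background:
                tp[gold] += 1
        else:
            if guess != background:
                fp[guess] += 1  # predicted a label that is not the gold one
            if gold != background:
                fn[gold] += 1  # missed the gold label
    scores = {}
    for label in sorted(set(tp) | set(fp) | set(fn)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        p = tp[label] / predicted if predicted else 0.0
        r = tp[label] / actual if actual else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        scores[label] = (p, r, f1)
    return scores
```

To use it, redirect the classifier's standard output to a file of your choice and pass the open file object (or a list of its lines) to `token_scores`.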
     
    8084 {{{
    8185java -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier \
    82    -loadClassifier cnec-3class-model.ser.gz -testFile named_ent_dtest_unknown.tsv
     86  -loadClassifier cnec-3class-model.ser.gz \
     87  -testFile named_ent_dtest_unknown.tsv
    8388}}}
    8489
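For orientation, the conversion steps above target the column format that Stanford NER reads for training and testing: one token per line, a tab, then the label, with `O` for tokens outside any entity. The sketch below only illustrates that target format. It is not the actual `convert_cnec_stanford.py`, whose source is not shown here. The function name `to_stanford_tsv` and the sample sentences are hypothetical, and separating sentences with a blank line is an assumption about how the `.tsv` files are laid out.

```python
def to_stanford_tsv(sentences):
    """Render sentences as token<TAB>label lines.

    `sentences` is a list of sentences, each a list of (token, label) pairs.
    Sentences are separated by one blank line (an assumed layout).
    """
    blocks = []
    for sentence in sentences:
        lines = [f"{token}\t{label}" for token, label in sentence]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


# Hypothetical sample data, not taken from CNEC.
sample = [
    [("Václav", "PERSON"), ("Havel", "PERSON"), ("navštívil", "O"),
     ("Brno", "LOCATION"), (".", "O")],
    [("Masarykova", "ORGANIZATION"), ("univerzita", "ORGANIZATION")],
]

if __name__ == "__main__":
    print(to_stanford_tsv(sample), end="")
```

Only the three labels PERSON, LOCATION and ORGANIZATION appear because those are the ones the conversion script is stated to recognize.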