Changes between Version 14 and Version 15 of private/NlpInPracticeCourse/InformationExtraction


Timestamp:
Nov 20, 2017, 11:15:41 AM (6 years ago)
Author:
Ales Horak
Comment:

--

  • private/NlpInPracticeCourse/InformationExtraction

    v14 v15  
    3131
    3232 1. Create {{{<YOUR_FILE>}}}, a text file named {{{ia161-UCO-08.txt}}} where '''UCO''' is your university ID.
    33  1. Download and install GATE (Java 8 is necessary) from https://gate.ac.uk/download/ ({{{java -jar gate-<VERSION>-installer.jar}}})
    34  1. Run GATE ({{{GATE_Developer_<VERSION>/bin/gate.sh}}})
     33 1. Download and install GATE (Java 8 is necessary) from https://gate.ac.uk/download/
     34 {{{
     35java -jar gate-<VERSION>-installer.jar
     36}}}
     37 1. Run GATE
     38 {{{
     39GATE_Developer_<VERSION>/bin/gate.sh
     40}}}
    3541 1. Load ANNIE (with defaults), read about its components
    3642 1. Create document(s):
    37    * right click on Language !Resources/New/GATE Document in the left menu
     43   * right click on `Language Resources/New/GATE Document` in the left menu
    3844   * change {{{markupAware}}} to {{{false}}}
    3945   * change {{{sourceUrl}}} to {{{stringContent}}} and paste some news text
     
    4147   * you can find three sample texts here: [raw-attachment:text1.txt text1.txt], [raw-attachment:text2.txt text2.txt], [raw-attachment:text3.txt text3.txt]
    4248 1. Create corpus:
    43    * right click on Language !Resources/New/GATE Corpus in the left menu
     49   * right click on `Language Resources/New/GATE Corpus` in the left menu
    4450   * drag and drop the documents to put them into the corpus
    45  1. Run ANNIE: Click on !Applications/Annie in the left menu, select Corpus
    46  1. Observe the annotated results, click on a document, then Annotation Sets and/or Annotation List.
     51 1. Run ANNIE: Click on `Applications/Annie` in the left menu, select `Corpus`
     52 1. Observe the annotated results, click on a document, then `Annotation Sets` and/or `Annotation List`.
    4753
    4854So far, GATE has done little more than Stanford NER did in lecture 04. Note, however, that all tokens are annotated and POS-tagged. Also note the annotation type Lookup.
    4955
    50 We add rules for extracting job titles and the respective person names. The rules are defined in the grammars [raw-attachment:jobtitle.jape] and [raw-attachment:jobtitleperson.jape]
     56We add rules for extracting ''job titles'' and the respective ''person names''. The rules are defined in the grammars [raw-attachment:jobtitle.jape] and [raw-attachment:jobtitleperson.jape]
    5157
    52  1. Right click Processing !Resources/New/JAPE Transducer in the left menu
     58 1. Right click `Processing Resources/New/JAPE Transducer` in the left menu
    5359 1. Download the grammar(s).
    5460 1. Click on {{{grammarURL}}} and choose the grammar file {{{jobtitle.jape}}}
    55  1. Click on !Applications/Annie in the left menu and add the JAPE Transducer to the ANNIE pipeline
    56  1. Run ANNIE again: Click on !Applications/Annie in the left menu
    57  1. Observe the annotated results, click on a document, then Annotation Sets and/or Annotation List. If applicable, you can see new annotation JobTitle.
     61 1. Click on `Applications/Annie` in the left menu and add the JAPE Transducer to the ANNIE pipeline
     62 1. Run ANNIE again: Click on `Applications/Annie` in the left menu
     63 1. Observe the annotated results, click on a document, then `Annotation Sets` and/or `Annotation List`. If the text contains any job titles, you should see a new `JobTitle` annotation.
    5864 1. Observe the grammars {{{jobtitle.jape}}} and {{{jobtitleperson.jape}}}
    5965
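
When reading the grammars, it may help to compare them with a minimal JAPE rule. The sketch below is not the content of the attached {{{jobtitle.jape}}}; it only illustrates the general shape of such a rule. It assumes that the ANNIE gazetteer marks job titles with {{{Lookup}}} annotations whose {{{majorType}}} is {{{jobtitle}}}; the phase name, rule name and label are hypothetical.

{{{
// Hypothetical sketch, not the attached jobtitle.jape
Phase: JobTitleSketch
Input: Lookup
Options: control = appelt

// Assumption: the gazetteer produces Lookup annotations with majorType "jobtitle"
Rule: JobTitleFromGazetteer
(
  {Lookup.majorType == "jobtitle"}
):title
-->
// Create a JobTitle annotation over the matched span
:title.JobTitle = {rule = "JobTitleFromGazetteer"}
}}}

With {{{Input: Lookup}}}, the transducer matches only against {{{Lookup}}} annotations, and every match yields a {{{JobTitle}}} annotation over the same span.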