31 | | Write a short program in Python which will extract simple information about who was who, from the parsed file. The result should look like [attachment:wiki.output this file]. |
| 32 | 1. Create {{{<YOUR_FILE>}}}, a text file named {{{ia161-UCO-08.txt}}} where '''UCO''' is your university ID. |
| 33 | 1. Download and install GATE (Java 8 is required) from https://gate.ac.uk/download/
| 34 | 1. Run GATE |
| 35 | 1. Load ANNIE (with defaults) |
| 36 | 1. Create language resources: |
| 37 | * right click on Language !Resources/New/GATE Document in the left menu |
| 38 | * change {{{markupAware}}} to {{{false}}} |
| 39 | * change {{{sourceUrl}}} to {{{stringContent}}} and paste some news text |
| 40 | * you can find three sample texts here: |
| 41 | 1. Create corpus: |
| 42 | * right-click on Language !Resources/New/GATE Corpus in the left menu
| 43 | * drag and drop the documents to put them into the corpus
| 44 | 1. Run ANNIE: click on !Applications/Annie in the left menu and select the corpus
| 45 | 1. Observe the annotated results: click on a document, then on Annotation Sets and/or Annotation List.
| 46 | |
| 47 | So far, GATE has not done much more than Stanford NER did in lecture 04. Note, however, that all tokens are annotated and POS-tagged.
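
To make the Annotation List view easier to read: GATE keeps annotations as standoff records (a type, start and end character offsets, and a feature map) rather than as inline markup. The following is a minimal Python sketch of that idea, not GATE's API. The {{{Annotation}}} class, the sample sentence, and the feature values are made up for illustration; only the {{{string}}} and {{{category}}} (POS tag) feature names mirror what ANNIE puts on Token annotations.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """Hypothetical stand-in for one row of the Annotation List."""
    type: str
    start: int  # character offset where the annotation begins
    end: int    # character offset just past the annotation
    features: dict = field(default_factory=dict)


# Made-up sample sentence and hand-written annotations (assumptions).
text = "Chairman John Smith resigned."

annotations = [
    Annotation("Token", 0, 8, {"string": "Chairman", "category": "NNP"}),
    Annotation("Token", 9, 13, {"string": "John", "category": "NNP"}),
    Annotation("Token", 14, 19, {"string": "Smith", "category": "NNP"}),
    Annotation("Token", 20, 28, {"string": "resigned", "category": "VBD"}),
    Annotation("Token", 28, 29, {"string": ".", "category": "."}),
    Annotation("Person", 9, 19, {"gender": "male"}),
]


def covered_text(ann, text):
    """Return the slice of the document that an annotation spans."""
    return text[ann.start:ann.end]


def by_type(annotations, ann_type):
    """Select annotations of one type, like picking a type in the GUI."""
    return [a for a in annotations if a.type == ann_type]


for ann in by_type(annotations, "Person"):
    print(ann.type, ann.start, ann.end, covered_text(ann, text))
```

Because the annotations only point into the text, several layers (Token, Person, and later JobTitle) can overlap on the same characters without interfering with each other.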
| 48 | |
| 49 | We add rules for extracting job titles and the respective person names: |
| 50 | |
| 51 | 1. Right-click Processing !Resources/New/JAPE Transducer in the left menu
| 52 | 1. Click on {{{grammarURL}}} and choose grammar {{{jobtitle.jape}}}
| 53 | 1. Click on !Applications/Annie in the left menu and add the JAPE Transducer to the ANNIE pipeline |
| 54 | 1. Run ANNIE again: click on !Applications/Annie in the left menu and select the corpus
| 55 | 1. Observe the annotated results: click on a document, then on Annotation Sets and/or Annotation List. If the text contains job titles, you will see a new annotation type, JobTitle.
| 56 | 1. Inspect the grammars {{{jobtitle.jape}}} and {{{jobtitleperson.jape}}}
| 57 | |
| 58 | Add the new grammar {{{jobtitleperson.jape}}} to the pipeline and observe the results.
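
As a rough intuition for what a grammar that combines job titles with person names can do, here is a minimal Python sketch of one such pattern: a JobTitle annotation immediately followed by a Person annotation. It is not a translation of {{{jobtitleperson.jape}}}, whose actual rules are not reproduced here. The function name, the tuple layout {{{(type, start, end)}}}, and the sample sentence are all assumptions made for illustration.

```python
def pair_titles_with_persons(text, annotations):
    """Pair each JobTitle with a Person that directly follows it.

    `annotations` is a list of (type, start, end) tuples (an assumed,
    simplified layout). Only whitespace may separate title and person.
    """
    titles = sorted((a for a in annotations if a[0] == "JobTitle"),
                    key=lambda a: a[1])
    persons = sorted((a for a in annotations if a[0] == "Person"),
                     key=lambda a: a[1])
    pairs = []
    for _, t_start, t_end in titles:
        for _, p_start, p_end in persons:
            gap = text[t_end:p_start]
            if p_start >= t_end and gap.strip() == "":
                pairs.append((text[t_start:t_end], text[p_start:p_end]))
                break
    return pairs


# Made-up sample sentence with hand-written offsets (assumptions).
text = "Chairman John Smith met Mary Jones, the new director."
annotations = [
    ("JobTitle", 0, 8),    # "Chairman"
    ("Person", 9, 19),     # "John Smith"
    ("Person", 24, 34),    # "Mary Jones"
    ("JobTitle", 44, 52),  # "director"
]

print(pair_titles_with_persons(text, annotations))
# → [('Chairman', 'John Smith')]
```

Note that this sketch misses "Mary Jones, the new director" because the title follows the name. When you test further documents, it is worth checking which word orders the real grammar covers.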
| 59 | |
| 60 | Optionally, you can add further documents and observe how well the {{{jobtitleperson.jape}}} grammar generalizes to them.
| 61 | |
| 62 | Write your observations to {{{<YOUR_FILE>}}}. |
| 63 | |