Changes between Version 12 and Version 13 of private/NlpInPracticeCourse/InformationExtraction
Timestamp: Nov 15, 2017, 7:15:05 PM (6 years ago)
private/NlpInPracticeCourse/InformationExtraction
 v12  v13
   3    3    [[https://is.muni.cz/auth/predmet/fi/ia161|IA161]] [[en/AdvancedNlpCourse|Advanced NLP Course]], Course Guarantee: Aleš Horák
   4    4
   5       - Prepared by: Vojtěch Kovář
        5  + Prepared by: Zuzana Nevěřilová
   6    6
   7    7
   …    …
  38   38    * change {{{markupAware}}} to {{{false}}}
  39   39    * change {{{sourceUrl}}} to {{{stringContent}}} and paste some news text
  40       - * you can find three sample texts here:
       40  + * you can find three sample texts here: [raw-attachment:text1.txt text1.txt], [raw-attachment:text2.txt text2.txt], [raw-attachment:text3.txt text3.txt]
  41   41    1. Create corpus:
  42   42    * right click on Language !Resources/New/GATE Corpus in the left menu
   …    …
  47   47    So far, GATE did not much more than Stanford NER in lecture 04. Note, however, that all tokens are annotated and POS-tagged.
  48   48
  49       - We add rules for extracting job titles and the respective person names :
       49  + We add rules for extracting job titles and the respective person names. The rules are defined in the grammars [raw-attachment:jobtitle.jape] and [raw-attachment:jobtitleperson.jape]
  50   50
  51   51    1. Right click Processing !Resources/New/JAPE Transducer in the left menu
   …    …
  56   56    1. Observer the grammars {{{jobtitle.jape}}} and {{{jobtitleperson.jape}}}
  57   57
  58       - Add new grammar {{{jobtitleperson.jape}}} and observe the results.
       58  + Add new transducer with the grammar {{{jobtitleperson.jape}}} and observe the results.
  59   59
  60   60    Optionally, you can add further documents and observe how universal the {{{jobtitleperson.jape}}} grammar is.
  61   61
  62   62    Write your observations to {{{<YOUR_FILE>}}}.
       63  +
       64  +
       65  + You may modify or draw inspiration from [raw-attachment:demo.py this demo script].
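
The page describes rules that extract job titles together with the respective person names. The sketch below illustrates that idea in Python with plain regular expressions; it is not JAPE and does not reproduce {{{jobtitle.jape}}}, {{{jobtitleperson.jape}}}, or {{{demo.py}}}, whose contents are not shown here. The title list, the name pattern, and the function name `extract_title_person` are all assumptions made for illustration.

```python
import re

# Hypothetical mini-gazetteer of job titles (an assumption, not taken from jobtitle.jape).
JOB_TITLES = ["chief executive", "prime minister", "president", "spokesman", "director"]

# Rough approximation of a person name: two or more capitalized words.
NAME = r"[A-Z][a-z]+(?: [A-Z][a-z]+)+"

# Longer titles first, so multi-word titles win over shorter alternatives.
TITLE = "|".join(re.escape(t) for t in sorted(JOB_TITLES, key=len, reverse=True))

# A job title (matched case-insensitively) directly followed by a name.
# The name part stays case-sensitive, so lowercase words end the match.
TITLE_PERSON = re.compile(rf"\b(?P<title>(?i:{TITLE})) (?P<person>{NAME})")


def extract_title_person(text):
    """Return (job title, person name) pairs found in the text."""
    return [(m.group("title"), m.group("person")) for m in TITLE_PERSON.finditer(text)]


if __name__ == "__main__":
    sample = "Prime Minister Jane Doe met director John Smith in Brno."
    print(extract_title_person(sample))
```

Unlike a JAPE grammar, which matches over annotations already produced by earlier GATE processing resources (tokens, POS tags, named entities), this sketch works on raw characters only, so it will miss names and titles that a real pipeline would catch.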