Changes between Version 22 and Version 23 of private/NlpInPracticeCourse/LanguageResourcesFromWeb


Timestamp:
Oct 23, 2017, 5:22:35 PM (8 years ago)
Author:
xsuchom2
Comment:

--

  • private/NlpInPracticeCourse/LanguageResourcesFromWeb

    v22 v23  
    31 31  * Text processing pipelines for converting a text file to a 3-column vertical:
    32 32    * Czech: {{{alba:/opt/majka/majka-desamb-czech.sh | cut -f1-3}}} or a [http://nlp.fi.muni.cz/projekty/rule_ind/index.cgi web interface] (short documents only)
       33      * See an example below.
    33 34    * English: {{{alba:/opt/TreeTagger/tools/tt-english_v2.sh | awk '{print $1"\t"$3"\t"$2}'}}}
    34 35  * For each plagiarism:
     
    76 77  }}}
    77 78
       79  How to produce the 3-column POS-tagged vertical from a plain-text file:
       80  {{{
       81  scp plagiarism.txt aurora.fi.muni.cz:~/
       82  ssh aurora.fi.muni.cz
       83  ssh alba
       84  /opt/majka/majka-desamb-czech.sh < ~/plagiarism.txt | cut -f1-3 > ~/plagiarism.vert
       85  logout
       86  logout
       87  scp aurora.fi.muni.cz:~/plagiarism.vert ./
       88  }}}
       89
       90
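
The copy, log in, tag, log out, copy back sequence added in this version could probably be collapsed into one command that streams the file through the tagger. The following is only a sketch under several assumptions that the page does not confirm: the local client is OpenSSH 7.3 or newer (for the `-J` jump-host option), `alba` is reachable under that short name from `aurora.fi.muni.cz`, and your credentials are accepted on both hosts when connecting from your own machine. The function name `tag_czech` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of the course materials.
# Usage: tag_czech INPUT.txt OUTPUT.vert
tag_czech() {
    # -J aurora.fi.muni.cz : hop through aurora to reach alba (assumes OpenSSH >= 7.3)
    # The quoted pipeline runs on alba; it is the same tagger + cut step
    # as in the manual procedure, reading the plaintext from stdin.
    # Local redirections feed the input file in and save the 3-column vertical.
    ssh -J aurora.fi.muni.cz alba \
        '/opt/majka/majka-desamb-czech.sh | cut -f1-3' < "$1" > "$2"
}
```

For example, `tag_czech plagiarism.txt plagiarism.vert` would leave no temporary files on either server. If the jump-host setup does not work in your environment, the manual `scp`/`ssh` steps remain the reference procedure.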