Changes between Version 15 and Version 16 of private/NlpInPracticeCourse/RelationExtraction


Timestamp:
Dec 7, 2022, 5:11:37 PM (17 months ago)
Author:
xrambous
Comment:

--

  • private/NlpInPracticeCourse/RelationExtraction

    v15 v16  
    1818== Practical Session ==
    1919
    20 Enhance hypernym detection to provide better results.
     20=== Technical Requirements ===
    2121
    22  * Download [[htdocs:bigdata/ia161-hyper.zip|prepared scripts and data]]:
    23  {{{
    24 wget https://nlp.fi.muni.cz/trac/research/chrome/site/bigdata/ia161-hyper.zip
    25 }}}
    26  * `pip install majka`
    27  * Unzip, `cd ia161-hyper` and run {{{./hyper.py}}}
    28  * The script reads file {{{vstup.txt}}} (each line is word|definition) and outputs hypernym for each word.
     22The task is carried out in a Python notebook running in a web browser in the [https://colab.research.google.com/ Google Colaboratory] environment
     23with MU G-Suite disk access.
     24
     25To run the code in a local environment instead, the requirements are
     26Python 3 and the NLTK module.
     27
     28 * Access the [https://colab.research.google.com/drive/1kQdFno7kDalQkGSFgSYXT6EDbNSgPruP Python notebook in the Google Colab environment] and make your own copy. Remember to save your work if you want to keep your changes: leaving the browser discards all unsaved changes!
     29
     30
     31 * The script reads the file {{{input.txt}}} (each line is word|definition) and outputs a hypernym for each word.
    2932 * The default approach is naive: ''the first noun in the definition is the hypernym''
    30  * majka gives ''noun'' to some ''adjectives'', deal with this to improve results
    31  * Update the {{{find_hyper()}}} function in `hyper.py` to provide better results.
     33 * Update the {{{find_hyper()}}} function to provide better results.
    3234 * Upload updated script plus the output.
    33  * Gold standard to evaluate your result: [[raw-attachment:gold.txt|gold.txt]]
     35 * Gold standard to evaluate your result: [[raw-attachment:gold_en.txt|gold_en.txt]]
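
The naive baseline and its evaluation described in the list above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the `find_hyper()` signature, the `parse_line()` and `accuracy()` helpers, the `toy_tagger` stand-in, and the sample lines are all assumptions made for this example. In particular, the gold standard is assumed here to map each word to a single hypernym, which may not match the real layout of `gold_en.txt`. In the notebook, a real part-of-speech tagger such as `nltk.pos_tag` would take the place of `toy_tagger`.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A tagger maps tokens to (token, tag) pairs with Penn Treebank-style tags.
# nltk.pos_tag has this shape; here it is passed in so the sketch runs
# without downloading any NLTK data.
Tagger = Callable[[List[str]], List[Tuple[str, str]]]


def parse_line(line: str) -> Tuple[str, str]:
    """Split one 'word|definition' input line into its two fields."""
    word, definition = line.rstrip("\n").split("|", 1)
    return word.strip(), definition.strip()


def find_hyper(definition: str, tagger: Tagger) -> Optional[str]:
    """Naive baseline: the first token tagged as a noun is the hypernym."""
    tokens = definition.split()
    for token, tag in tagger(tokens):
        if tag.startswith("NN"):
            return token.strip(".,;:()").lower()
    return None  # no noun found in the definition


def accuracy(predicted: Dict[str, Optional[str]], gold: Dict[str, str]) -> float:
    """Share of gold entries whose predicted hypernym matches exactly."""
    if not gold:
        return 0.0
    hits = sum(1 for word, hyper in gold.items() if predicted.get(word) == hyper)
    return hits / len(gold)


def toy_tagger(tokens: List[str]) -> List[Tuple[str, str]]:
    """Hypothetical stand-in tagger: a tiny hand-written noun list."""
    nouns = {"animal", "pet", "road", "vehicle", "wheels"}
    return [(t, "NN" if t.strip(".,").lower() in nouns else "XX") for t in tokens]


# Made-up sample data in the word|definition format.
lines = [
    "dog|a domesticated animal kept as a pet",
    "car|a road vehicle with four wheels",
]
gold = {"dog": "animal", "car": "vehicle"}  # assumed gold format

predicted = {}
for line in lines:
    word, definition = parse_line(line)
    predicted[word] = find_hyper(definition, toy_tagger)

print(predicted)
print(accuracy(predicted, gold))
```

In this made-up sample the baseline picks "road" for "car", because a noun used as a modifier comes before the head noun. Handling cases like that is the kind of improvement the task asks for in `find_hyper()`.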