= Automatic relation extraction =

[[https://is.muni.cz/auth/predmet/fi/ia161|IA161]] [[en/AdvancedNlpCourse|Advanced NLP Course]], Course Guarantor: Aleš Horák

Prepared by: Adam Rambousek

== State of the Art ==

=== References ===

Approximately three current papers (preferably from top NLP conferences or journals, e.g. the [[https://www.aclweb.org/anthology/|ACL Anthology]]) used as sources for the one-hour lecture:

 1. Lefever, Els, Marjan Van de Kauter, and Véronique Hoste. "Evaluation of Automatic Hypernym Extraction from Technical Corpora in English and Dutch." Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014). 2014.
 1. Wang, Tong, and Graeme Hirst. "Exploring patterns in dictionary definitions for synonym extraction." Natural Language Engineering 18.03 (2012): 313-342.
 1. Schropp, Gwendolijn, Els Lefever, and Véronique Hoste. "A Combined Pattern-based and Distributional Approach for Automatic Hypernym Detection in Dutch." RANLP. 2013.
 1. Grefenstette, Gregory. "INRIASAC: Simple Hypernym Extraction Methods." arXiv preprint arXiv:1502.01271 (2015).

== Practical Session ==

Improve the hypernym detection script so that it gives better results.

 * Download [[htdocs:bigdata/ia161-hyper.zip|prepared scripts and data]] 
 * Unzip the archive, {{{cd ia161-hyper}}}, and run {{{./hyper.py}}}
 * If you run into encoding problems, use {{{PYTHONIOENCODING=UTF-8 ./hyper.py}}}
 * The script reads the file {{{vstup.txt}}} (each line has the form {{{word|definition}}}) and outputs a hypernym for each word.
 * The default approach is naive: ''the first noun in the definition is the hypernym''.
 * The morphological analyser majka tags some ''adjectives (k2)'' as ''nouns (k1)''; handle these cases to improve the results.
 * Update the {{{find_hyper()}}} function in {{{hyper.py}}} to provide better results.
 * Upload the updated script together with its output.
 * Gold standard for evaluating your results: [[raw-attachment:gold.txt|gold.txt]]
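
One possible way to handle the k1/k2 ambiguity is sketched below. It is only an illustration, not the actual {{{find_hyper()}}} from {{{hyper.py}}}: the function name {{{pick_hypernym}}}, the token representation (a lemma plus the set of all tags the analyser offers for it) and the example tags are assumptions made for this sketch. The heuristic skips a noun candidate that also has an adjective reading when it stands directly before another noun, because in that position it is most likely a modifier.

{{{#!python
def is_noun(tags):
    """True if any analysis of the token is a noun (tag starts with k1)."""
    return any(tag.startswith("k1") for tag in tags)


def is_adjective(tags):
    """True if any analysis of the token is an adjective (tag starts with k2)."""
    return any(tag.startswith("k2") for tag in tags)


def pick_hypernym(tokens):
    """Return the lemma of the first plausible noun in a definition.

    tokens is a list of (lemma, tags) pairs, where tags is a collection of
    tag strings. This input format is an assumption of the sketch, not the
    format used inside hyper.py.
    """
    for i, (lemma, tags) in enumerate(tokens):
        if not is_noun(tags):
            continue
        next_is_noun = i + 1 < len(tokens) and is_noun(tokens[i + 1][1])
        # A token that is ambiguous between noun and adjective and is
        # directly followed by a noun is treated as a modifier and skipped.
        if is_adjective(tags) and next_is_noun:
            continue
        return lemma
    return None


# Illustrative definition "domácí zvíře" (domestic animal); the tags are
# made up for the example and are not real majka output.
definition = [
    ("domácí", {"k1gMnSc1", "k2eAgNnSc1d1"}),
    ("zvíře", {"k1gNnSc1"}),
]
print(pick_hypernym(definition))
}}}

With the naive first-noun rule the example would yield ''domácí''; the sketch returns ''zvíře'' instead. An ambiguous token at the end of a definition is still accepted, so a hypernym is returned whenever the definition contains at least one noun candidate.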
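
To compare your output with the gold standard, a small accuracy script can help. The sketch below assumes that both the script output and {{{gold.txt}}} contain one {{{word|hypernym}}} pair per line; that format is an assumption, so check the actual files and adjust the parsing if they differ. The function names are hypothetical.

{{{#!python
def parse_pairs(lines):
    """Parse lines of the assumed form word|hypernym into a dict."""
    pairs = {}
    for line in lines:
        line = line.strip()
        if not line or "|" not in line:
            continue  # skip empty or malformed lines
        word, hypernym = line.split("|", 1)
        pairs[word.strip()] = hypernym.strip()
    return pairs


def accuracy(predicted_lines, gold_lines):
    """Fraction of gold-standard words whose predicted hypernym matches exactly."""
    predicted = parse_pairs(predicted_lines)
    gold = parse_pairs(gold_lines)
    if not gold:
        return 0.0
    correct = sum(1 for word, hyper in gold.items() if predicted.get(word) == hyper)
    return correct / len(gold)


# Tiny in-memory example; the data is invented for illustration.
predicted = ["pes|zvíře", "stůl|nábytek"]
gold = ["pes|zvíře", "stůl|kus"]
print(accuracy(predicted, gold))
}}}

To run it on real files, pass open file objects instead of lists, for example {{{accuracy(open("output.txt", encoding="utf-8"), open("gold.txt", encoding="utf-8"))}}}, where {{{output.txt}}} is a hypothetical name for the saved output of {{{hyper.py}}}.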