Automatic relation extraction

IA161 NLP in Practice Course, Course Guarantor: Aleš Horák

Prepared by: Adam Rambousek

State of the Art

Lefever, Els, Marjan Van de Kauter, and Véronique Hoste. "Evaluation of Automatic Hypernym Extraction from Technical Corpora in English and Dutch." Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014). 2014.
Wang, Tong, and Graeme Hirst. "Exploring patterns in dictionary definitions for synonym extraction." Natural Language Engineering 18.03 (2012): 313-342.
Schropp, Gwendolijn, Els Lefever, and Véronique Hoste. "A Combined Pattern-based and Distributional Approach for Automatic Hypernym Detection in Dutch." RANLP. 2013.
Grefenstette, Gregory. "INRIASAC: Simple Hypernym Extraction Methods." arXiv preprint arXiv:1502.01271 (2015).
Shen, Yatian, and Xuan-Jing Huang. "Attention-based convolutional neural network for semantic relation extraction." Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers. 2016.
Li, Qing, et al. "A comprehensive exploration of semantic relation extraction via pre-trained CNNs." Knowledge-Based Systems (2020): 105488.

The task uses a Python notebook that runs in the web browser in the Google Colaboratory environment, accessed through the MU G-Suite disk.

To run the code in a local environment instead, you need Python 3 and the NLTK module.

Access the Python notebook in the Google Colab environment and make your own copy. Do not forget to save your work if you want to see your changes later; leaving the browser without saving discards all your changes.

The script reads the file input.txt, in which each line has the form word|definition, and outputs a hypernym for each word.
The default approach is naive: the first noun in the definition is taken as the hypernym.
Update the find_hyper() function to provide better results.
Upload the updated script together with its output.
Use the gold standard gold_en.txt to evaluate your results.
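
The naive baseline and a simple evaluation can be sketched as follows. This is only an illustration under stated assumptions, not the notebook's actual code: the names parse_lines, first_noun, accuracy and toy_tagger are hypothetical, the gold file is assumed to use the same word|hypernym line format as the input, and exact-match accuracy is assumed as the metric. The tagger is passed in as a parameter so that the sketch runs without NLTK; in the notebook, something like `lambda s: nltk.pos_tag(nltk.word_tokenize(s))` could play that role.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# A tagger maps a definition string to (token, POS tag) pairs.
# Assumption: tags follow the Penn Treebank convention, where noun tags start with "NN".
Tagger = Callable[[str], List[Tuple[str, str]]]


def parse_lines(lines: Iterable[str]) -> Dict[str, str]:
    """Parse 'word|value' lines into a dict, skipping blank or malformed lines."""
    entries: Dict[str, str] = {}
    for line in lines:
        line = line.strip()
        if "|" not in line:
            continue
        word, value = line.split("|", 1)
        entries[word.strip()] = value.strip()
    return entries


def first_noun(definition: str, tagger: Tagger) -> Optional[str]:
    """Naive baseline: return the first token whose POS tag starts with 'NN'."""
    for token, tag in tagger(definition):
        if tag.startswith("NN"):
            return token.lower()
    return None


def accuracy(predicted: Dict[str, Optional[str]], gold: Dict[str, str]) -> float:
    """Share of gold entries whose predicted hypernym matches exactly (assumed metric)."""
    if not gold:
        return 0.0
    hits = sum(1 for word, hyper in gold.items() if predicted.get(word) == hyper.lower())
    return hits / len(gold)


def toy_tagger(text: str) -> List[Tuple[str, str]]:
    """Hypothetical stand-in tagger with a tiny hand-made noun lexicon (demo only)."""
    nouns = {"animal", "fur", "tool", "nails", "kind", "fruit"}
    return [(tok, "NN" if tok.lower() in nouns else "XX") for tok in text.split()]


# Made-up sample data in the assumed word|definition and word|hypernym formats.
sample_input = [
    "cat|a small animal with soft fur",
    "hammer|a tool for hitting nails",
    "apple|a kind of fruit",
]
sample_gold = ["cat|animal", "hammer|tool", "apple|fruit"]

definitions = parse_lines(sample_input)
predictions = {word: first_noun(text, toy_tagger) for word, text in definitions.items()}
score = accuracy(predictions, parse_lines(sample_gold))
print(predictions, score)
```

On this made-up sample the baseline picks "kind" for "apple", because "a kind of fruit" puts a generic noun before the real hypernym. Definitions of this shape are one place where an improved find_hyper() could do better than taking the first noun.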