Machine translation
IA161 NLP in Practice Course, Course Guarantor: Aleš Horák
Prepared by: Pavel Rychlý
State of the Art
Neural Machine Translation systems are structured as an encoder-decoder pair. They are trained on parallel corpora; each training example is a pair of a source sentence and a reference translation. Big advances can be made by preparing cleaner parallel data and by feeding the network sentences in the right order during training.
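To make the encoder-decoder structure concrete, below is a minimal sketch in PyTorch (the framework used in the practical session's notebook). All class names and sizes are illustrative assumptions, not the architecture of any particular state-of-the-art system:

```python
# Minimal sketch of an encoder-decoder translation model (illustrative only).
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, src_vocab_size, hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(src_vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size, batch_first=True)

    def forward(self, src_ids):
        # src_ids: (batch, src_len) token indices of the source sentence
        embedded = self.embedding(src_ids)
        outputs, hidden = self.gru(embedded)
        return outputs, hidden  # hidden summarizes the source sentence

class Decoder(nn.Module):
    def __init__(self, tgt_vocab_size, hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(tgt_vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, tgt_vocab_size)

    def forward(self, tgt_ids, hidden):
        # tgt_ids: (batch, tgt_len) reference translation, shifted right
        embedded = self.embedding(tgt_ids)
        outputs, hidden = self.gru(embedded, hidden)
        return self.out(outputs), hidden  # logits over the target vocabulary

# One training example is a (source sentence, reference translation) pair;
# the loss is cross-entropy between the decoder logits and reference tokens.
encoder = Encoder(src_vocab_size=1000, hidden_size=256)
decoder = Decoder(tgt_vocab_size=1200, hidden_size=256)
```

In a real system the decoder also attends over the encoder outputs at every step; the practical session's notebook covers this.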
References
- Alammar, Jay (2018). The Illustrated Transformer [Blog post]. Retrieved from https://jalammar.github.io/illustrated-transformer/
- Popel, Martin, et al. (2020). "Transforming machine translation: a deep learning system reaches news translation quality comparable to human professionals." Nature Communications 11.1: 1-15.
- Thompson, Brian, and Philipp Koehn (2019). "Vecalign: Improved Sentence Alignment in Linear Time and Space." Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).
Practical Session
Technical Requirements
The task will be completed in a Python notebook running in a web browser in the Google Colaboratory environment.
If you run the code in a local environment instead, the requirements are Python 3.6+ and Jupyter Notebook.
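If it is unclear which environment the code is running in, a small check like the following can be placed at the top of the notebook. This is an optional sketch, not part of the course notebook itself:

```python
# Detect whether the notebook is running in Google Colab or locally,
# and verify the minimum Python version stated in the requirements.
import sys

IN_COLAB = "google.colab" in sys.modules
print("Running in Colab" if IN_COLAB else "Running locally")

assert sys.version_info >= (3, 6), "Python 3.6+ is required"
```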
Translation with a Sequence to Sequence Network and Attention
Access the notebook in the Google Colab environment,
OR
download the notebook or a plain Python file from the shared notebook (File > Download) and run it in your local environment.
Follow the notebook. Choose one of the tasks at the end of the notebook (a short background sketch of the attention step is given below).
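As background for the attention part of the notebook, here is a minimal sketch of one attention step: the decoder scores each encoder output against its current hidden state and forms a weighted context vector. Dot-product scoring is used here for brevity; the notebook's own formulation (e.g. additive/Bahdanau attention) may differ:

```python
# Sketch of a single attention step in seq2seq translation (illustrative).
import torch
import torch.nn.functional as F

def attend(decoder_hidden, encoder_outputs):
    # decoder_hidden:  (batch, hidden)          current decoder state
    # encoder_outputs: (batch, src_len, hidden) all encoder states
    scores = torch.bmm(encoder_outputs, decoder_hidden.unsqueeze(2))  # (batch, src_len, 1)
    weights = F.softmax(scores.squeeze(2), dim=1)                     # (batch, src_len)
    context = torch.bmm(weights.unsqueeze(1), encoder_outputs)        # (batch, 1, hidden)
    return context.squeeze(1), weights

# Example with random tensors:
ctx, w = attend(torch.randn(2, 8), torch.randn(2, 5, 8))
print(ctx.shape, w.shape)  # torch.Size([2, 8]) torch.Size([2, 5])
```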
Upload
Upload your modified notebook or Python script with results to the homework vault (odevzdávárna).