Language modelling
IA161 Advanced NLP Course, Course Guarantor: Aleš Horák
Prepared by: Vít Baisa
State of the Art
The goal of a language model is (a) to predict a following word or phrase given a history and (b) to assign a probability (a score) to any possible input sentence. In the past this was achieved mainly with n-gram models, in use since the mid-20th century. Recently, however, deep learning has entered language modelling as well, and neural models have turned out to be substantially better than the traditional Markov n-gram models.
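To make the two tasks concrete, here is a minimal sketch of a bigram (n-gram) language model; the tiny training corpus and the add-one smoothing are illustrative assumptions only, not part of the course materials.

```python
# Minimal bigram language model: (a) predict a likely next word,
# (b) assign a (log-)probability score to a whole sentence.
from collections import Counter, defaultdict
import math

corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "the cat ate the fish".split(),
]

unigrams = Counter()
bigrams = defaultdict(Counter)
for sentence in corpus:
    tokens = ["<s>"] + sentence + ["</s>"]
    unigrams.update(tokens)
    for prev, word in zip(tokens, tokens[1:]):
        bigrams[prev][word] += 1

def next_word(history_word):
    """Return the most frequent continuation of a one-word history."""
    return bigrams[history_word].most_common(1)[0][0]

def sentence_logprob(sentence):
    """Log-probability of a sentence under the bigram model (add-one smoothing)."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    vocab_size = len(unigrams)
    logp = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        logp += math.log((bigrams[prev][word] + 1) / (unigrams[prev] + vocab_size))
    return logp

print(next_word("the"))                            # e.g. 'cat'
print(sentence_logprob("the cat sat on the mat"))  # higher score
print(sentence_logprob("mat the on sat cat the"))  # lower score for scrambled order
```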
References
- Bengio, Yoshua, et al. "A neural probabilistic language model." The Journal of Machine Learning Research 3 (2003): 1137-1155.
- Mikolov, Tomas, et al. "Efficient estimation of word representations in vector space." arXiv preprint arXiv:1301.3781 (2013).
- Mikolov, Tomas, et al. "Distributed representations of words and phrases and their compositionality." Advances in Neural Information Processing Systems. 2013.
- Chelba, Ciprian, et al. "One billion word benchmark for measuring progress in statistical language modeling." arXiv preprint arXiv:1312.3005 (2013).
Practical Session
We will build a simple language model (skip-gram) which has very interesting properties. When trained properly, the word vectors obey simple vector arithmetic, e.g. vector "king" − vector "man" + vector "woman" ≈ vector "queen". We will train this model on large Czech and English corpora and evaluate the results; a rough sketch of the workflow follows.
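The sketch below shows one possible way to train a skip-gram model and test the vector arithmetic, using the gensim library (assumed version >= 4.0). The corpus file name `corpus_en.txt` is a placeholder for the actual course corpus, expected to contain one tokenized sentence per line.

```python
# Sketch: train a skip-gram word2vec model and test the king/queen analogy.
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

# Placeholder path; the real corpus will be provided in the practical session.
sentences = LineSentence("corpus_en.txt")

model = Word2Vec(
    sentences,
    sg=1,             # 1 = skip-gram (0 would be CBOW)
    vector_size=100,  # dimensionality of the word vectors
    window=5,         # context window size
    min_count=5,      # ignore words occurring fewer than 5 times
    workers=4,
)

# The vector arithmetic from the text: king - man + woman ~= queen
print(model.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```

For the Czech corpus the same call applies; only the input file and the analogy test words change. Larger corpora generally need a higher `min_count` and more training epochs to produce stable vectors.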