Changes between Version 1 and Version 2 of private/NlpInPracticeCourse/LanguageModelling


Timestamp: Jun 1, 2015, 9:36:52 AM
Author: Vít Baisa


Prepared by: Vít Baisa

== State of the Art ==

The goal of a language model is a) to predict a following word or phrase based on a given text history, and b) to assign a probability (a score) to any possible input sentence. This has traditionally been done with n-gram models, known since WWII, but recently the deep learning buzzword has penetrated language modelling as well, and it has turned out that neural networks beat classic n-gram models substantially.
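For a concrete picture of both tasks a) and b), here is a minimal sketch of the classic n-gram approach: a bigram model with add-one smoothing. The toy corpus, tokenization, and smoothing choice are illustrative assumptions, not part of the course materials:

{{{#!python
# Minimal sketch: a bigram language model with add-one (Laplace) smoothing.
# The toy corpus below is a placeholder; a real model would be trained on
# a large corpus and use better smoothing (e.g. Kneser-Ney).
import math
from collections import Counter

corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]

unigrams = Counter()
bigrams = Counter()
for sentence in corpus:
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

vocab = set(unigrams)

def prob(word, history):
    # P(word | history) with add-one smoothing over the vocabulary
    return (bigrams[(history, word)] + 1) / (unigrams[history] + len(vocab))

def predict_next(history):
    # a) predict the most likely following word given one word of history
    return max(vocab, key=lambda w: prob(w, history))

def score(sentence):
    # b) assign a log-probability to a whole input sentence
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(math.log(prob(w, h)) for h, w in zip(tokens, tokens[1:]))

print(predict_next("the"))               # e.g. 'cat' or 'dog'
print(score("the cat sat on the mat"))   # less negative = more probable
print(score("mat the on sat cat the"))   # scrambled sentence scores lower
}}}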
=== References ===
     
Approx. 3 current papers (preferably from the best NLP conferences/journals, e.g. [[https://www.aclweb.org/anthology/|ACL Anthology]]) that will be used as a source for the one-hour lecture:

 1. Bengio, Yoshua, et al. "A neural probabilistic language model." The Journal of Machine Learning Research 3 (2003): 1137-1155.
 1. Mikolov, Tomas, et al. "Efficient estimation of word representations in vector space." arXiv preprint arXiv:1301.3781 (2013).
 1. Mikolov, Tomas, et al. "Distributed representations of words and phrases and their compositionality." Advances in Neural Information Processing Systems. 2013.
 1. Chelba, Ciprian, et al. "One billion word benchmark for measuring progress in statistical language modeling." arXiv preprint arXiv:1312.3005 (2013).
== Practical Session ==

We will build a simple language model (skip-gram) which has very interesting properties. When trained properly, the word vectors obey simple vector arithmetic, e.g.

  vector("king") − vector("man") + vector("woman") ≈ vector("queen")

We will train this model on large Czech and English corpora and evaluate the results.
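The following is a minimal sketch of how such a skip-gram model could be trained, assuming the gensim library (version 4 or newer) is installed; the file name corpus.txt and all hyperparameters are placeholders, not the actual course setup:

{{{#!python
# Sketch only: trains a skip-gram word2vec model with gensim (>= 4.0).
# "corpus.txt" is a placeholder for a tokenized, one-sentence-per-line
# corpus (Czech or English); all hyperparameters are illustrative.
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

sentences = LineSentence("corpus.txt")  # streams whitespace-tokenized lines

model = Word2Vec(
    sentences,
    sg=1,             # 1 = skip-gram (0 = CBOW)
    vector_size=100,  # dimensionality of the word vectors
    window=5,         # context window size
    min_count=5,      # drop words seen fewer than 5 times
    workers=4,        # training threads
)

# The vector-arithmetic test from above: king - man + woman ~= queen
print(model.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
}}}

For a more systematic evaluation, gensim's model.wv.evaluate_word_analogies() can score the trained vectors against an analogy test set such as the questions-words.txt file from the Mikolov et al. papers above.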