Changes between Version 3 and Version 4 of private/NlpInPracticeCourse/TopicModelling
- Timestamp: Nov 1, 2015, 8:27:18 PM (8 years ago)
private/NlpInPracticeCourse/TopicModelling
v3  v4
 6   6
 7   7    == State of the Art ==
 8   8    Topic modeling is a statistical approach for discovering abstract topics hidden in text documents. A document usually consists of multiple topics with different weights. Each topic can be described by typical words belonging to the topic. The most frequently used methods of topic modeling are Latent Semantic Analysis and Latent Dirichlet Allocation.
 9   9    === References ===
10      -
11      - Approx 3 current papers (preferably from best NLP conferences/journals, eg. [[https://www.aclweb.org/anthology/|ACL Anthology]]) that will be used as a source for the one-hour lecture:
12      -
    10  +
13  11    1. David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent Dirichlet Allocation. Journal of Machine Learning Research, 3:993 – 1022, 2003.
 …   …
17  15    == Practical Session ==
18  16
19      - In this session we will use [[http://radimrehurek.com/gensim/|Gensim]] to model latent topics of various texts. We will focus on Latent Semantic Analysis and Latent Dirichlet Allocation models.
    17  + In this session we will use [[http://radimrehurek.com/gensim/|Gensim]] to model latent topics of Wikipedia documents. We will focus on Latent Semantic Analysis and Latent Dirichlet Allocation models.
20  18
21  19    Students will also be required to generate some results of their work and hand them in to prove completing the tasks.