Changes between Version 3 and Version 4 of private/NlpInPracticeCourse/TopicModelling


Timestamp:
Nov 1, 2015, 8:27:18 PM (8 years ago)
Author:
ymaterna
Comment:

--

  • private/NlpInPracticeCourse/TopicModelling

    v3  v4
     6   6
     7   7    == State of the Art ==
     8      -
         8  + Topic modeling is a statistical approach for discovering abstract topics hidden in text documents. A document usually consists of multiple topics with different weights. Each topic can be described by typical words belonging to the topic. The most frequently used methods of topic modeling are Latent Semantic Analysis and Latent Dirichlet Allocation.
     9   9    === References ===
    10      -
    11      - Approx 3 current papers (preferably from best NLP conferences/journals, eg. [[https://www.aclweb.org/anthology/|ACL Anthology]]) that will be used as a source for the one-hour lecture:
    12  10
    13  11     1. David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent Dirichlet Allocation. Journal of Machine Learning Research, 3:993 – 1022, 2003.

    17  15    == Practical Session ==
    18  16
    19      - In this session we will use [[http://radimrehurek.com/gensim/|Gensim]] to model latent topics of various texts. We will focus on Latent Semantic Analysis and Latent Dirichlet Allocation models.
        17  + In this session we will use [[http://radimrehurek.com/gensim/|Gensim]] to model latent topics of Wikipedia documents. We will focus on Latent Semantic Analysis and Latent Dirichlet Allocation models.
    20  18
    21  19    Students will also be required to generate some results of their work and hand them in to prove completing the tasks.
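
The "State of the Art" paragraph added in v4 says that a document mixes several topics with different weights and that each topic is described by its typical words. The sketch below is one way to make that concrete. It is not Gensim's implementation and not part of the course material: it is a small collapsed Gibbs sampler for Latent Dirichlet Allocation written with the Python standard library only (Gensim's LDA model uses online variational Bayes instead). The function names `lda_gibbs` and `top_words`, the hyperparameter values, and the toy corpus are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative, not Gensim's algorithm).

    docs is a list of token lists. Returns (vocab, theta, phi), where
    theta[d][k] is the weight of topic k in document d and
    phi[k][w] is the probability of vocabulary word w under topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                w = word_id[word]
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed estimates from the final counts.
    theta = [
        [(count + alpha) / (len(doc) + num_topics * alpha) for count in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(count + beta) / (topic_total[k] + vocab_size * beta)
         for count in topic_word[k]]
        for k in range(num_topics)
    ]
    return vocab, theta, phi


def top_words(vocab, topic_row, n=3):
    """Return the n most probable words of one topic."""
    ranked = sorted(range(len(vocab)), key=lambda i: topic_row[i], reverse=True)
    return [vocab[i] for i in ranked[:n]]


# Invented toy corpus; a real session would use tokenised Wikipedia articles.
DOCS = [
    "cat dog pet animal cat dog".split(),
    "dog pet animal vet cat".split(),
    "stock market trade price stock".split(),
    "market price trade bank stock".split(),
    "pet animal market price".split(),
]

vocab, theta, phi = lda_gibbs(DOCS, num_topics=2)
for k, row in enumerate(phi):
    print("topic", k, "typical words:", top_words(vocab, row))
for d, weights in enumerate(theta):
    print("document", d, "topic weights:", [round(x, 2) for x in weights])
```

Each row of `theta` sums to 1 and gives the topic weights of one document; each row of `phi` sums to 1 and ranks the vocabulary for one topic. The sampler is stochastic, so the topics found depend on the seed and on the corpus, and a corpus this small is only suitable for checking that the code runs.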