Changes between Version 13 and Version 14 of private/NlpInPracticeCourse/TopicModelling


Timestamp:
Oct 20, 2021, 10:16:10 AM (2 years ago)
Author:
xrambous
Comment:

--

  • private/NlpInPracticeCourse/TopicModelling

    v13 v14  
    24 24 1. Download and extract the corpus of Czech Wikipedia documents: [[htdocs:bigdata/wiki.tar.bz2|wiki corpus]].
    25 25 1. Train LSA and LDA models of the corpus for various numbers of topics using Gensim. You can use this template: [raw-attachment:models.py models.py].
    26    1. For both LSA and LDA select the best model (by looking at the data or by computing perplexity of a test set for LDA).
       26 1. Check the coherence for various parameters.
       27 1. For both LSA and LDA select the best model (by looking at the data or by coherence).
    27 28 1. Select 5 most important topics with 10 most important words, give them a name, save it into a text file and upload it into odevzdavarna.
    28 29
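
The coherence check added in v14 can be illustrated with a small, dependency-free sketch. It is not the course template (`models.py`) and not Gensim's implementation: it assumes a UMass-style measure averaged over word pairs, and the function name `umass_coherence` and the toy corpus are hypothetical. In practice Gensim's `CoherenceModel` (in `gensim.models`) would likely be used on the trained LSA and LDA models instead.

```python
import math
from itertools import combinations


def umass_coherence(topic_words, documents):
    """UMass-style coherence of one topic (illustrative helper, not a Gensim API).

    topic_words: the topic's top words, ordered from most to least important.
    documents:   tokenised documents (lists of strings).
    Returns the mean of log((D(w_i, w_j) + 1) / D(w_i)) over all pairs in which
    w_i is ranked before w_j; D counts the documents containing the given words.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    scores = []
    for earlier, later in combinations(topic_words, 2):
        denom = doc_freq(earlier)
        if denom == 0:
            continue  # skip words that never occur in the corpus
        scores.append(math.log((doc_freq(earlier, later) + 1) / denom))
    return sum(scores) / len(scores) if scores else float("nan")


# Toy tokenised corpus (made-up data, only for illustration).
docs = [
    ["prague", "castle", "river"],
    ["prague", "river", "bridge"],
    ["hockey", "goal", "team"],
    ["hockey", "team", "league"],
]

# Words that co-occur in documents score higher than words that never do.
coherent = umass_coherence(["prague", "river"], docs)
mixed = umass_coherence(["prague", "hockey"], docs)
print(coherent > mixed)
```

Under this assumption, a higher (less negative) score means the topic's top words tend to appear in the same documents, which is one way to compare models trained with different numbers of topics.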