Changes between Version 17 and Version 18 of private/NlpInPracticeCourse/TopicModelling
- Timestamp: Nov 3, 2022, 7:16:28 PM (18 months ago)
private/NlpInPracticeCourse/TopicModelling
v17 v18
22  22
23  23    1. Gensim is already installed on epimetheus1.fi.muni.cz and it also offers faster model processing.
24      - 1. Download and extract the corpus of Czech Wikipedia documents: [[htdocs:bigdata/wiki.tar.bz2|wiki corpus]].
    24  + 1. Download and extract the corpus of Wikipedia documents: [[htdocs:bigdata/wiki.tar.bz2|Czech wiki corpus]], [[htdocs:bigdata/wiki_en.tar.bz2|English wiki corpus]].
25  25    1. Train LSA and LDA models of the corpus for various numbers of topics using Gensim. You can use this template: [raw-attachment:models.py models.py].
26  26    1. Check the coherence for various parameters.