
NLP lab seminar

The laboratory seminar is primarily meant to present the activities of the active laboratory members: what they are working on, what results they have, what problems they face, which subtasks they cannot solve and would like to collaborate on with someone else, etc. Occasionally, though rather rarely, presentations by related department members can also be expected.

The seminar is now held on Thursdays at 14:00 in B203 (Autumn 2021) and is open to anyone interested in the subject (you do not have to be active in the lab). It can also be taken as the course PV173 NLP Lab Seminar, which earns three credits for active participation, including a presentation of your results (achieved in NLP Centre projects or on a relevant issue). The seminar is given in English. Presentations can be in English, Czech or Slovak.

Selected presentations are also streamed online via authenticated Zoom. Please upload any attachments to the online presentation by following the instructions. Uploaded videos are available on the video page.

Presentations wanted:

Presentations offered:

Seminar programme in the autumn semester 2021

date program


seminar programme for this semester
Aleš Horák: RASLAN 2021 Call for Papers


Vít Novotný: SIGIR 2021 and RANLP 2021
Adam Rambousek: AHISTO project


Michaela Denisová: Crosslingual embedding models


Mikuláš Bankovič: Superresolution techniques for OCR


Rastislav Papčo: Topic classification in web corpora
Edoardo Signoroni: Corpus alignment by machine translation techniques


Dalibor Bačovský: Improving the Subword Model of fastText


Ondřej Sotolář: Facebook conversations classification
Radoslav Sabol: Language identification and sentiment analysis for social network texts


Tereza Vrabcová: Preparation of Parallel Corpora for Machine Translation
Adam Hájek: Automatic text summarization using GPT-2


Petr Zelina: Czech transformers
Samuel Špalek: Tokenizers: comparison of 'utok' and 'unitok'


Marek Medveď: Answer Context in Question Answering
Kristína Němcová: Multimodal machine learning


Tomáš Houfek: Information extraction from medical records
Daniel Krátký: TBA


Krištof Anetta, Mahmut Arslan: Electronic health records processing
Ondřej Herman: TBA

Seminar programme in the spring semester 2021

date program


seminar programme for this semester


Pavel Rychlý: LINDAT/CLARIAH-CZ project
Pavel Rychlý: machine translation project
Pavel Rychlý: dictionary generation project


Helena Medková: Zeugma Detection using Word Sketch
Vítek Novotný: EDS-MEMBED: Multi-Sense Embeddings Based on Enhanced Distributional Semantic Structures via a Graph Walk over Word Senses


Michal Štefánik: Unsupervised Estimation of Out-of-Domain Performance of Language Models
Marek Medveď: SQAD database update


Hien Thi Ha: Block type classification from scanned invoices
Vítek Novotný: Combining log-bilinear language models with Transformers


Tomáš Houfek: Data extraction from medical reports


Mikuláš Bankovič: Application of super-resolution on OCR of historical documents
Adam Hájek: GPT-2 computation on MetaCentrum


Tereza Vrabcová: Parallel corpus from web pages
Vítek Novotný: When FastText Pays Attention (preprint)


Tereza Kinská: Creation of Judikatura corpora of court decisions
Pavel Rychlý: Using Makefiles for NLP projects


Petr Zelina: ALBERT Training with TensorFlow and PyTorch


Krištof Anetta: Electronic Health Records processing, Apache cTakes


Ondřej Sotolář: Building a Corpus for Personal Data Detection


Michal Starý: Event Detection

It is also possible to view the seminar programme in preceding semesters.