Indexing and Searching Very Large Texts

IA161 Advanced NLP Course, Course Guarantor: Aleš Horák

Prepared by: Miloš Jakubíček

State of the Art


Practical Session

  1. Log in to aurora.
  2. Write a program or script that finds all occurrences of a given word form in the vertical file /corpora-fast1/vert/bnc/bnc.vert and prints each one with a small context (at least 5 preceding and 5 following words).
  3. The script takes two arguments: the path to the vertical file and the word to search for.
  4. Submit the script to the IS vault.
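
One possible starting point for the task above is the following Python sketch; it is not a reference solution. It assumes the usual vertical layout: one token per line, the word form in the first tab-separated column, and structural markup (such as <s> or <doc ...>) on separate lines starting with "<". Check these assumptions against the actual file before relying on them. The names read_tokens, concordance and CONTEXT are illustrative only. The file is streamed line by line, so memory use stays small regardless of corpus size.

```python
import sys
from collections import deque

CONTEXT = 5  # words of context on each side (the task asks for at least 5)


def read_tokens(lines):
    """Yield word forms from vertical-format lines.

    Assumptions (not verified against bnc.vert): lines starting with "<"
    are structural markup, and the word form is the first tab-separated
    column of every other non-empty line.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("<"):
            continue
        yield line.split("\t", 1)[0]


def concordance(tokens, word, context=CONTEXT):
    """Yield (left, word, right) for every exact match of `word`."""
    left = deque(maxlen=context)  # sliding window of preceding words
    pending = []                  # hits still waiting for right context
    for token in tokens:
        still_open = []
        for hit in pending:
            hit[1].append(token)
            if len(hit[1]) >= context:
                yield hit[0], word, hit[1]
            else:
                still_open.append(hit)
        pending = still_open
        if token == word:
            pending.append([list(left), []])
        left.append(token)
    # Hits too close to the end of the file get a shorter right context.
    for hit in pending:
        yield hit[0], word, hit[1]


def main(path, word):
    with open(path, encoding="utf-8", errors="replace") as vert:
        for left, kw, right in concordance(read_tokens(vert), word):
            print(" ".join(left), "<" + kw + ">", " ".join(right))


if __name__ == "__main__":
    if len(sys.argv) == 3:
        main(sys.argv[1], sys.argv[2])
    else:
        print("usage: script VERTICAL_FILE WORD", file=sys.stderr)
```

The sketch does a linear scan for every query, matches the word form case-sensitively, and lets the context run across sentence and document boundaries. Whether any of that is acceptable is a design decision left to the submitted solution.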