= Indexing and Searching Very Large Texts =
[[|IA161]] [[en/AdvancedNlpCourse|Advanced NLP Course]], Course Guarantor: Aleš Horák
Prepared by: Miloš Jakubíček
== State of the Art ==
== Practical Session ==
Compare searching through (A) plain text using grep, (B) an indexed corpus using Manatee, and (C) a corpus indexed in an arbitrary SQL database.
Use the vertical text of the BNC, available at aurora:/corpora/vert/bnc/bnc.vert.xz.
Search for the phrase "test case" and display a context of 10 words before and after each occurrence of the phrase.
(A) plain text
Hint: use grep -C to display context.
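In vertical text every token sits on its own line, so the phrase spans two consecutive lines, while grep matches line by line. As an alternative to a grep pipeline, the following is a minimal Python sketch of the same unindexed, sequential scan. It assumes that the word form is the first tab-separated column, that lines starting with `<` are structural tags, and that the file is UTF-8; the function names `read_words`, `find_phrase` and `search_file` are illustrative only.

```python
import lzma
from collections import deque
from itertools import chain, islice


def read_words(lines):
    """Yield word forms from vertical text, skipping structural tag lines."""
    for line in lines:
        line = line.rstrip("\n")
        # Assumption: structures such as <s> or <doc ...> start with "<".
        if not line or line.startswith("<"):
            continue
        # Assumption: the word form is the first tab-separated column.
        yield line.split("\t", 1)[0]


def find_phrase(words, phrase, context=10):
    """Yield (left, match, right) for each occurrence of phrase in words."""
    phrase = list(phrase)
    n = len(phrase)
    size = 2 * context + n
    pad = [None] * context  # sentinels so that hits near the edges are found
    window = deque(maxlen=size)
    for tok in chain(pad, words, pad):
        window.append(tok)
        if len(window) < size or window[context] != phrase[0]:
            continue
        middle = list(islice(window, context, context + n))
        if middle == phrase:
            left = [t for t in islice(window, 0, context) if t is not None]
            right = [t for t in islice(window, context + n, size) if t is not None]
            yield left, middle, right


def search_file(path, phrase, context=10):
    """Scan an xz-compressed vertical file and print keyword-in-context lines."""
    with lzma.open(path, "rt", encoding="utf-8", errors="replace") as f:
        for left, middle, right in find_phrase(read_words(f), phrase, context):
            print(" ".join(left), "[" + " ".join(middle) + "]", " ".join(right))
```

Calling `search_file("/corpora/vert/bnc/bnc.vert.xz", ["test", "case"])` on aurora would stream through the whole file once, which is the cost that variants (B) and (C) avoid by using an index.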
(B) Manatee
The corpus is already indexed in Manatee; try:
time corpquery bnc '[word="test"] [word="case"]'
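For intuition about why an indexed query can answer without reading the whole text, here is a toy positional inverted index in Python. It is only a sketch of the general idea and makes no claim about how Manatee stores or evaluates its indexes; `build_index` and `phrase_positions` are hypothetical names.

```python
from collections import defaultdict


def build_index(tokens):
    """Map each word to the sorted list of positions where it occurs."""
    index = defaultdict(list)
    for pos, tok in enumerate(tokens):
        index[tok].append(pos)
    return index


def phrase_positions(index, phrase):
    """Return start positions where the words of phrase occur consecutively."""
    first, *rest = phrase
    starts = index.get(first, [])
    for offset, word in enumerate(rest, start=1):
        # Keep only the starts that are followed by `word` at the right offset.
        following = set(index.get(word, ()))
        starts = [p for p in starts if p + offset in following]
    return starts
```

Only the position lists of the query words are touched, so the work depends on how often "test" and "case" occur rather than on the size of the corpus.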
(C) SQL database
Use your favourite SQL database; on aurora you can use sqlite3.
Hint on how to import vertical text:
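One possible way to do the import and the search is sketched below in Python with the standard `sqlite3` module. The table layout (`tokens` with columns `pos` and `word`), the index name and the function names are assumptions made for illustration, as is the treatment of the vertical format (first tab-separated column is the word form, lines starting with `<` are structural tags).

```python
import sqlite3


def import_vertical(conn, lines):
    """Load word forms from vertical text into a table keyed by token position."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tokens (pos INTEGER PRIMARY KEY, word TEXT NOT NULL)"
    )
    words = (
        line.rstrip("\n").split("\t", 1)[0]
        for line in lines
        if line.strip() and not line.startswith("<")
    )
    conn.executemany(
        "INSERT INTO tokens (pos, word) VALUES (?, ?)", enumerate(words)
    )
    # The index on word is what makes the lookup fast; build it after loading.
    conn.execute("CREATE INDEX IF NOT EXISTS tokens_word ON tokens (word)")
    conn.commit()


def search_bigram(conn, first, second, context=10):
    """Yield (position, words) for each hit, with context words on both sides."""
    hits = conn.execute(
        "SELECT a.pos FROM tokens a JOIN tokens b ON b.pos = a.pos + 1 "
        "WHERE a.word = ? AND b.word = ? ORDER BY a.pos",
        (first, second),
    ).fetchall()
    for (pos,) in hits:
        rows = conn.execute(
            "SELECT word FROM tokens WHERE pos BETWEEN ? AND ? ORDER BY pos",
            (pos - context, pos + 1 + context),
        )
        yield pos, [word for (word,) in rows]
```

For the exercise, `conn` would be opened on a database file, for example `sqlite3.connect("bnc.db")`, and `lines` would come from the decompressed vertical file; the import time and the query time are worth reporting separately.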
For (A), (B) and (C), submit the commands you used and how long the search took to evaluate.