Changes between Version 3 and Version 4 of en/WordLevelAnalysis
- Timestamp: Jun 5, 2014, 10:53:51 AM
--- en/WordLevelAnalysis (v3)
+++ en/WordLevelAnalysis (v4)
@@ -1,4 +1,8 @@
 = Word Level Analysis =
 == Motivation ==
+
+[[Image(/trac/research/raw-attachment/wiki/en/WordLevelAnalysis/chladnicka.png)]]
+
+
 Many applications need a tool for “clustering” of word forms appearing in texts:
 
@@ -57,2 +61,17 @@
 * ''polydaktylie, polydaktiliích, polydaktylií, ... <=> polydaktylie''
 * => extension of data or more precise “guessing”
+
+
+== Resolving Ambiguities Using Context ==
+
+An extreme case ''Stroj ženu holí.''
+* ''Já stroj ženu holí, ty stroj ženu holí, ten stroj ženu holí.''
+
+Usual case is e.g. ''stát''
+* noun: ''Stát jsem já.''
+* verb: ''Celá továrna musela hodinu stát.''
+* at the part of speech level, it is a bigger problem for English
+
+The context of the word determines its interpretation
+* rules and/or statistical data describe typical contexts of nouns, verbs, etc.
+* using such information one can tell that ''stát'' is noun/verb
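The added section says that rules describing typical contexts can tell whether ''stát'' is a noun or a verb. A minimal sketch of one such rule follows, as an illustration only. The modal-verb list, the window size, and the names `MODAL_FORMS`, `tokenize` and `tag_stat` are assumptions made for this example and do not come from the wiki page. A real tagger would use far richer rules or statistical data.

```python
# Toy context rule, invented for illustration (not taken from the wiki page).
# Assumption: a modal verb form shortly before "stát" signals the infinitive.
MODAL_FORMS = {"musel", "musela", "muselo", "musí", "může", "mohla", "chce"}

PUNCTUATION = ".,!?;:"


def tokenize(sentence):
    """Lowercase the sentence and split it into words, dropping punctuation."""
    words = (w.strip(PUNCTUATION).lower() for w in sentence.split())
    return [w for w in words if w]


def tag_stat(sentence, window=3):
    """Guess whether 'stát' is a noun or a verb from its left context.

    Returns "verb" if a modal verb form occurs within `window` words to
    the left of 'stát', "noun" otherwise, and None when the sentence
    does not contain 'stát' at all.
    """
    tokens = tokenize(sentence)
    if "stát" not in tokens:
        return None
    i = tokens.index("stát")
    left_context = tokens[max(0, i - window):i]
    if any(word in MODAL_FORMS for word in left_context):
        return "verb"
    return "noun"


print(tag_stat("Stát jsem já."))                     # noun
print(tag_stat("Celá továrna musela hodinu stát."))  # verb
```

The rule handles only the two example sentences from the page. It falls back to "noun" whenever no modal is found, which would be wrong for many real sentences.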