Changes between Version 3 and Version 4 of en/WordLevelAnalysis


Timestamp:
Jun 5, 2014, 10:53:51 AM
Author:
xkocinc

  • en/WordLevelAnalysis

    v3 v4  
    11= Word Level Analysis =
    22== Motivation ==
     3
     4[[Image(/trac/research/raw-attachment/wiki/en/WordLevelAnalysis/chladnicka.png)]]
     5
     6
    37Many applications need a tool for “clustering” the word forms that appear in texts:
    48
     
    5761 * ''polydaktylie, polydaktyliích, polydaktylií, ... <=> polydaktylie''
    5862 * => extension of data or more precise “guessing”
     63
     64
     65== Resolving Ambiguities Using Context ==
     66
     67An extreme case: ''Stroj ženu holí.''
     68 * ''Já stroj ženu holí, ty stroj ženu holí, ten stroj ženu holí.''
     69
     70A more typical case is, e.g., ''stát'':
     71 * noun: ''Stát jsem já.'' ("I am the state.")
     72 * verb: ''Celá továrna musela hodinu stát.'' ("The whole factory had to stand idle for an hour.")
     73 * at the part-of-speech level, this kind of ambiguity is a bigger problem for English
     74
     75The context of a word determines its interpretation:
     76 * rules and/or statistical data describe typical contexts of nouns, verbs, etc.
     77 * using such information, one can tell whether ''stát'' is a noun or a verb
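
The idea of letting context decide between noun and verb can be illustrated with a minimal sketch. It is not the method used by any particular tagger. The cue-word sets `VERB_CUES` and `NOUN_CUES` and the function names are hypothetical, chosen only to cover the two ''stát'' examples above. A real system would derive such cues from rules or from statistics over an annotated corpus.

```python
# Hypothetical cue words, hand-picked for illustration only.
# A real tagger would learn such contexts from annotated data.
VERB_CUES = {"musela", "musel", "musí", "může"}  # modal verbs that take an infinitive
NOUN_CUES = {"jsem", "je"}                       # copula forms linking a noun subject


def tokenize(sentence):
    """Split on whitespace, strip basic punctuation, and lowercase."""
    tokens = (t.strip(".,!?").lower() for t in sentence.split())
    return [t for t in tokens if t]


def disambiguate(sentence, target="stát"):
    """Guess the part of speech of `target` from the other words in the sentence."""
    tokens = tokenize(sentence)
    if target not in tokens:
        return None
    i = tokens.index(target)
    context = tokens[:i] + tokens[i + 1:]
    verb_score = sum(t in VERB_CUES for t in context)
    noun_score = sum(t in NOUN_CUES for t in context)
    if verb_score == noun_score:
        return "ambiguous"
    return "verb" if verb_score > noun_score else "noun"


print(disambiguate("Stát jsem já."))                     # noun
print(disambiguate("Celá továrna musela hodinu stát."))  # verb
```

Counting cue words is the simplest possible scoring. It ignores word order and distance, so a sentence without any listed cue is reported as `ambiguous` rather than guessed.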