= Text Characteristics =

== Keyword extraction ==

[[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/example.png)]]

'''Definition'''

Words used to characterise the contents of a document.

'''Method'''

Select words that appear with statistically unusual frequency in a text.

'''Applications'''
 * Text classification (topic, spam)
 * Search Engine Optimisation (SEO)
 * Text filtering (job advertising, RSS)
 * Text summarisation
 * Text clustering and reorganisation

[[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/seo.png)]]

== Communication Pattern Analysis ==

[[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/text_characteristics.png)]]

'''Motivation'''
 * Analysis of personality traits from the author’s verbal style
 * Optimisation of communication strategies
 * Behaviour prediction

[[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/applications.png)]]

== Author’s traits ==

[[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/vocabulary.png)]]

== Problem Definition ==

[[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/auth_ver.png)]]

[[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/auth_att.png)]]

[[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/auth_clu.png)]]

== Author Writeprint/Stylome ==

[[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/collection.png)]]

== Authorship Verification ==

[[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/stylometry.png)]]

== Machine learning approach ==

[[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/simML.png)]]

== Accuracy ==

[[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/verification.png)]]

== Conclusions ==

'''Keyword Extraction'''

A brief representation of the content of a document.

'''Communication Pattern Analysis'''

An analysis of personality traits from the author’s verbal style.

'''Authorship Recognition'''

Uncovering the authorship of anonymous texts.
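
== Example: keyword extraction sketch ==

The keyword extraction method above (select words that appear with statistically unusual frequency in a text) can be illustrated with a small sketch. This page does not say which statistic is used, so the code below makes several assumptions: "unusual" is measured against a separate reference text, the score is a log-ratio of relative frequencies with add-one smoothing, and the names `tokenize`, `extract_keywords`, `top_n` and `min_count` are hypothetical. It is one possible reading of the method, not the implementation behind the figures.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_keywords(document, reference, top_n=5, min_count=2):
    """Rank words by how over-represented they are in `document`.

    Assumption: the score is the log-ratio of add-one smoothed relative
    frequencies in the document versus a reference text. Words seen
    fewer than `min_count` times in the document are ignored.
    """
    doc_counts = Counter(tokenize(document))
    ref_counts = Counter(tokenize(reference))
    doc_total = sum(doc_counts.values())
    ref_total = sum(ref_counts.values())
    # Shared vocabulary size, used as the smoothing denominator term.
    vocab_size = len(set(doc_counts) | set(ref_counts))

    scores = {}
    for word, count in doc_counts.items():
        if count < min_count:
            continue
        p_doc = (count + 1) / (doc_total + vocab_size)
        p_ref = (ref_counts[word] + 1) / (ref_total + vocab_size)
        scores[word] = math.log(p_doc / p_ref)

    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_n]]
```

Calling `extract_keywords(document, reference, top_n=2)` returns the two words whose frequency in `document` most exceeds what the reference text would suggest. Common function words such as "the" tend to score low because they are frequent in both texts. A larger and more representative reference text should give more stable rankings.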