| 1 | = Text Characteristics = |
| 2 | |
| 3 | == Keyword extraction == |
| 4 | |
| 5 | [[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/example.png)]] |
| 6 | |
| 7 | '''Definition''' |
| 8 | Words used to characterise the contents of a document. |
| 9 | |
| 10 | '''Method''' |
| 11 | Select words that appear with statistically unusual frequency in a text. |
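
The method above can be illustrated with a minimal sketch. It assumes that "statistically unusual" means over-represented relative to a reference text, and it scores each word with a smoothed log-ratio of relative frequencies. The reference text, the `tokenize` and `extract_keywords` helpers, and the choice of score are assumptions made for illustration, not the method behind the figures on this page.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_keywords(document, reference, top_n=5):
    """Rank the words of `document` by how over-represented they are
    compared with `reference` (hypothetical helper, illustrative only)."""
    doc_counts = Counter(tokenize(document))
    ref_counts = Counter(tokenize(reference))
    doc_total = sum(doc_counts.values())
    ref_total = sum(ref_counts.values())
    vocab_size = len(set(doc_counts) | set(ref_counts))

    scores = {}
    for word, count in doc_counts.items():
        # Add-one smoothing keeps the score finite for words
        # that never occur in the reference text.
        p_doc = (count + 1) / (doc_total + vocab_size)
        p_ref = (ref_counts[word] + 1) / (ref_total + vocab_size)
        scores[word] = math.log(p_doc / p_ref)

    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


if __name__ == "__main__":
    doc = "the spam filter blocks spam mail and the spam folder stores spam"
    ref = "the cat sat on the mat and the dog sat on the log"
    for word, score in extract_keywords(doc, ref, top_n=3):
        print(f"{word}\t{score:.2f}")
```

Words that are common everywhere, such as "the", receive a low or negative score, while words concentrated in the document rise to the top. A larger reference corpus or a significance test such as log-likelihood could replace the simple ratio used here.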
| 12 | |
| 13 | '''Applications''' |
| 14 | * Text classification (topic, spam) |
| 15 | * Search Engine Optimisation (SEO) |
| 16 | * Text filtering (job advertising, RSS) |
| 17 | * Text summarisation |
| 18 | * Text clustering and reorganisation |
| 19 | |
| 20 | [[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/seo.png)]] |
| 21 | |
| 22 | |
| 23 | == Communication Pattern Analysis == |
| 24 | |
| 25 | [[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/.png)]] |
| 26 | |
| 27 | |
| 28 | '''Motivation''' |
| 29 | * Analysis of personality traits from an author’s verbal style |
| 30 | * Optimisation of communication strategies |
| 31 | * Behaviour prediction |
| 32 | |
| 33 | [[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/applications.png)]] |
| 34 | |
| 35 | |
| 36 | == Author’s traits == |
| 37 | |
| 38 | [[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/.png)]] |
| 39 | |
| 40 | |
| 41 | == Problem Definition == |
| 42 | |
| 43 | [[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/auth_ver.png)]] |
| 44 | |
| 45 | [[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/auth_att.png)]] |
| 46 | |
| 47 | [[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/auth_clu.png)]] |
| 48 | |
| 49 | |
| 50 | == Author Writeprint/Stylome == |
| 51 | |
| 52 | [[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/collection.png)]] |
| 53 | |
| 54 | |
| 55 | == Authorship Verification == |
| 56 | |
| 57 | [[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/stylometry.png)]] |
| 58 | |
| 59 | |
| 60 | == Machine learning approach == |
| 61 | |
| 62 | [[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/simML.png)]] |
| 63 | |
| 64 | == Accuracy == |
| 65 | |
| 66 | [[Image(/trac/research/raw-attachment/wiki/en/TextCharacteristics/.png)]] |
| 67 | |
| 68 | |
| 69 | == Conclusions == |
| 70 | |
| 71 | ''' Keyword Extraction ''' |
| 72 | A brief representation of the content of a document. |
| 73 | |
| 74 | ''' Communication Pattern Analysis ''' |
| 75 | An analysis of personality traits. |
| 76 | |
| 77 | ''' Authorship Recognition ''' |
| 78 | Uncovering the authorship of anonymous texts. |
| 79 | |