Opinion mining, sentiment analysis
IA161 Advanced NLP Course, Course Guarantor: Aleš Horák
Prepared by: Zuzana Nevěřilová
State of the Art
Sentiment analysis can be seen as a text categorization task: is the writer's opinion on a discussed topic X or Y? It consists of detecting the topic (which can be easy in focused reviews) and detecting the sentiment (which is generally difficult). Opinions are sometimes expressed very subtly; for example, the sentence "How could anyone sit through this movie?" contains no negative word [3]. Sentiments are usually classified simply by polarity (positive, negative), but they can also be recognized at a finer grain (e.g. strongly negative). Recognized opinions can in turn be summarized (e.g. how many people like the new iPhone design?).
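To make the difficulty concrete, here is a minimal sketch of the simplest possible polarity detector: counting hits in positive and negative word lists. The word lists and the function names `lexicon_polarity` and `summarize` are assumptions made for this illustration and are not part of the course materials. The sketch shows that a purely lexical approach labels the subtle movie sentence as neutral, and how recognized polarities can be summarized by counting.

```python
import re
from collections import Counter

# Tiny illustrative lexicons (assumed for this sketch, not a real resource).
POSITIVE = {"good", "great", "excellent", "love", "wonderful"}
NEGATIVE = {"bad", "awful", "boring", "terrible", "hate"}


def lexicon_polarity(text):
    """Return 'positive', 'negative', or 'neutral' by counting lexicon hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def summarize(reviews):
    """Count how many reviews fall into each polarity class."""
    return dict(Counter(lexicon_polarity(r) for r in reviews))


if __name__ == "__main__":
    reviews = [
        "I love this great movie",
        "A boring, awful film",
        "How could anyone sit through this movie?",
    ]
    for review in reviews:
        print(lexicon_polarity(review), "|", review)
    print(summarize(reviews))
```

The third review is clearly negative to a human reader, yet it contains no word from either list, which is one reason why trained classifiers are generally preferred over plain word counting.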
References
- [1] Bing Liu. Sentiment Analysis and Opinion Mining. Synthesis Lectures on Human Language Technologies. 2012, 5(1): 1-167. DOI: 10.2200/s00416ed1v01y201204hlt016. Draft version available at http://www.cs.uic.edu/~liub/FBS/SentimentAnalysis-and-OpinionMining.pdf
- [2] Bing Liu. Sentiment Analysis Tutorial. AAAI-2011, August 8, 2011. Slides available at http://www.cs.uic.edu/~liub/FBS/Sentiment-Analysis-tutorial-AAAI-2011.pdf
- [3] Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. Thumbs up? Sentiment Classification using Machine Learning Techniques. Proceedings of EMNLP 2002. Available at http://www.cs.cornell.edu/home/llee/papers/sentiment.pdf
Bing Liu's References: http://www.cs.uic.edu/~liub/FBS/AAAI-2011-tutorial-references.pdf
Practical Session
Train a classifier on the Movie Review Data http://www.cs.cornell.edu/people/pabo/movie-review-data/. Measure precision, recall, and F1-score.
Train a classifier on e-shop reviews written by customers and users of www.zbozi.cz. Measure precision, recall, and F1-score.
Discuss the differences between training classifiers on Czech and English data.
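As a starting point for the tasks above, the following is a minimal sketch of a bag-of-words Naive Bayes classifier with add-one smoothing, together with precision, recall, and F1-score for one target class. It is written in plain Python and is not the attached classify.py. The toy training sentences, the labels "pos" and "neg", whitespace tokenization, and all function names are assumptions made for illustration; the real data sets have to be loaded and tokenized separately.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train_naive_bayes(docs):
    """docs is a list of (text, label) pairs; returns the counts the model needs."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()            # label -> number of documents
    vocab = set()
    for text, label in docs:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return word_counts, label_counts, vocab


def classify(model, text):
    """Return the label with the highest log-probability under the model."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words unseen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def precision_recall_f1(gold, predicted, positive="pos"):
    """Precision, recall, and F1 for the class named by `positive`."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up toy data; replace with the real review corpora.
    train = [
        ("a great and moving film", "pos"),
        ("great acting wonderful story", "pos"),
        ("boring plot and awful acting", "neg"),
        ("an awful boring film", "neg"),
    ]
    test = [("wonderful film", "pos"), ("boring story", "neg"), ("awful", "neg")]
    model = train_naive_bayes(train)
    gold = [label for _, label in test]
    predicted = [classify(model, text) for text, _ in test]
    print(precision_recall_f1(gold, predicted))
```

The same evaluation function can be reused unchanged for the English and the Czech data, which keeps the two experiments comparable; only loading and tokenization need to differ.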
Attachments (4)
- book-reviews.txt (4.0 KB)
- book-reviews.2.txt (4.0 KB)
- classify.py (2.9 KB): fix for the NLTK Unicode problem in NLTK 3.1 (tokenization of incomplete words)
- Aspect_Based_Sentiment_Analysis.ipynb (240.3 KB)