# 😀😄😂😭 Awesome Sentiment Analysis 😥😟😱😤
A curated list of sentiment analysis methods, implementations, and miscellaneous resources.
> Sentiment Analysis is the field of study that analyzes people's opinions, sentiments, evaluations, attitudes, and emotions from written languages. (Liu 2012)
## Contents
<!-- TOC -->
- [Contents](#contents)
- [Objective](#objective)
- [Introduction](#introduction)
- [Survey Papers](#survey-papers)
- [Baseline Systems](#baseline-systems)
- [Resources and Corpora](#resources-and-corpora)
- [Open Source Implementations](#open-source-implementations)
  - [NodeJS](#nodejs)
  - [Java](#java)
  - [Python](#python)
  - [R](#r)
  - [Golang](#golang)
  - [Ruby](#ruby)
  - [CSharp](#csharp)
- [SaaS APIs](#saas-apis)
- [Contributing](#contributing)
<!-- /TOC -->
## Objective
The goal of this repository is to provide adequate links for scholars who want to do research in this domain while remaining accessible to developers who want to integrate sentiment analysis into their applications.
## Introduction
Sentiment Analysis happens at various levels:
- Document-level Sentiment Analysis evaluates the sentiment toward a single entity (e.g., a product) from a review document.
- Sentence-level Sentiment Analysis evaluates the sentiment of a single sentence.
- Aspect-level Sentiment Analysis performs finer-grained analysis. For example, the sentence “the iPhone's call quality is good, but its battery life is short.” evaluates two aspects of the iPhone (the entity): call quality and battery life. The sentiment on the iPhone's call quality is positive, but the sentiment on its battery life is negative. (Liu 2012)

Most recent research focuses on aspect-based approaches, but not all open source implementations have caught up yet.

There are many different approaches to the problem. Lexical methods, for example, count the words expressing positive and negative sentiment (taken from a lexicon such as SentiWordNet) that occur in a given sentence. Supervised Machine Learning methods, such as Naive Bayes and Support Vector Machines (SVM), can be used when labeled training data is available. Since labeled training examples are difficult to obtain, Unsupervised Machine Learning methods, such as Latent Dirichlet Allocation (LDA) and word embeddings (Word2Vec), are also used on large unlabeled datasets. Recent work also applies Deep Learning methods such as Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks, as well as their attention-based variants. You will find more details in the survey papers.
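
As a rough illustration of the lexical approach, the sketch below sums per-word scores over a sentence. The lexicon `TOY_LEXICON` and the function `lexical_sentiment` are hypothetical names made up for this example, and the word scores are invented; a real system would load scores from a resource such as AFINN or SentiWordNet.

```python
import re

# Hypothetical toy lexicon for illustration only; a real system would load
# word scores from a resource such as AFINN or SentiWordNet.
TOY_LEXICON = {
    "good": 1,
    "great": 2,
    "love": 2,
    "bad": -1,
    "short": -1,
    "terrible": -2,
}


def lexical_sentiment(sentence, lexicon=TOY_LEXICON):
    """Sum the lexicon scores of the words in a sentence.

    Returns a positive number for mostly positive wording, a negative
    number for mostly negative wording, and 0 when the scores cancel
    out or no lexicon word occurs.
    """
    # Lowercase the text and keep runs of letters and apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


print(lexical_sentiment("The call quality is good"))    # 1
print(lexical_sentiment("The battery life is short"))   # -1
print(lexical_sentiment("Good call quality, but short battery life"))  # 0
```

The last call shows a limitation of this kind of sentence-level scoring: the positive and negative words cancel out, even though the sentence expresses a clear opinion on each aspect. That is the gap aspect-level methods try to close.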
## Survey Papers
Liu, Bing. "Sentiment analysis and opinion mining." Synthesis lectures on human language technologies 5.1 (2012): 1-167. [[pdf]](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.244.9480&rep=rep1&type=pdf)

Vinodhini, G., and R. M. Chandrasekaran. "Sentiment analysis and opinion mining: a survey." International Journal 2.6 (2012): 282-292. [[pdf]](http://www.dmi.unict.it/~faro/tesi/sentiment_analysis/SA2.pdf)

Medhat, Walaa, Ahmed Hassan, and Hoda Korashy. "Sentiment analysis algorithms and applications: A survey." Ain Shams Engineering Journal 5.4 (2014): 1093-1113. [[pdf]](http://www.sciencedirect.com/science/article/pii/S2090447914000550)
## Baseline Systems
Wang, Sida, and Christopher D. Manning. "Baselines and bigrams: Simple, good sentiment and topic classification." Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers-Volume 2. Association for Computational Linguistics, 2012. [[pdf]](http://nlp.stanford.edu/pubs/sidaw12_simple_sentiment.pdf)

Cambria, Erik, Daniel Olsher, and Dheeraj Rajagopal. "SenticNet 3: a common and common-sense knowledge base for cognition-driven sentiment analysis." Proceedings of the twenty-eighth AAAI conference on artificial intelligence. AAAI Press, 2014. [[pdf]](http://www.aaai.org/ocs/index.php/AAAI/AAAI14/paper/download/8479/8602)
## Resources and Corpora
AFINN: List of English words rated for valence. [[web]](http://www2.imm.dtu.dk/pubdb/views/publication_details.php?id=6010)

SentiWordNet: Lexical resource devised for supporting sentiment analysis. [[web]](http://sentiwordnet.isti.cnr.it/) [[paper]](https://www.researchgate.net/profile/Fabrizio_Sebastiani/publication/220746537_SentiWordNet_30_An_Enhanced_Lexical_Resource_for_Sentiment_Analysis_and_Opinion_Mining/links/545fbcc40cf27487b450aa21.pdf)

GloVe: Algorithm for obtaining word vectors. Pretrained word vectors are available for download. [[web]](http://nlp.stanford.edu/projects/glove/) [[paper]](http://nlp.stanford.edu/pubs/glove.pdf)

SemEval14-Task4: Annotated aspects and sentiments of laptop and restaurant reviews. [[web]](http://alt.qcri.org/semeval2014/task4/) [[paper]](http://www.aclweb.org/anthology/S14-2004)

Stanford Sentiment Treebank: Sentiment dataset with fine-grained sentiment annotations. [[web]](http://nlp.stanford.edu/sentiment/code.html) [[paper]](http://nlp.stanford.edu/~socherr/EMNLP2013_RNTN.pdf)
## Open Source Implementations
The characteristics of each implementation are described next to its link.
### NodeJS
[thisandagain/sentiment](https://github.com/thisandagain/sentiment): Lexical, Dictionary-based, AFINN-based.

[thinkroth/Sentimental](https://github.com/thinkroth/Sentimental): Lexical, Dictionary-based, AFINN-based.
### Java
[LingPipe](http://alias-i.com/): Lexical, Corpus-based, Supervised Machine Learning.

[CoreNLP](https://github.com/stanfordnlp/CoreNLP): Supervised Machine Learning, Deep Learning.

[ASUM](http://uilab.kaist.ac.kr/research/WSDM11/): Unsupervised Machine Learning, Latent Dirichlet Allocation. [[paper]](http://www.cs.cmu.edu/~yohanj/research/papers/WSDM11.pdf)
### Python
[nltk](http://www.nltk.org/): [VADER](https://github.com/cjhutto/vaderSentiment) sentiment analysis tool, Lexical, Dictionary-based, Rule-based. [[paper]](http://comp.social.gatech.edu/papers/icwsm14.vader.hutto.pdf)

[vivekn/sentiment](https://github.com/vivekn/sentiment): Supervised Machine Learning, Naive Bayes Classifier. [[paper]](https://arxiv.org/abs/1305.6143)

[xiaohan2012/twitter-sent-dnn](https://github.com/xiaohan2012/twitter-sent-dnn): Supervised Machine Learning, Deep Learning, Convolutional Neural Network. [[paper]](http://nal.co/papers/Kalchbrenner_DCNN_ACL14)

[kevincobain2000/sentiment_classifier](https://github.com/kevincobain2000/sentiment_classifier): Supervised Machine Learning, Naive Bayes Classifier, Max Entropy Classifier, SentiWordNet.

[pedrobalage/SemevalAspectBasedSentimentAnalysis](https://github.com/pedrobalage/SemevalAspectBasedSentimentAnalysis): Aspect-Based, Supervised Machine Learning, Conditional Random Field.

[ganeshjawahar/mem_absa](https://github.com/ganeshjawahar/mem_absa): Aspect-Based, Supervised Machine Learning, Deep Learning, Attention-based, External Memory. [[paper]](https://arxiv.org/abs/1605.08900)
### R
[timjurka/sentiment](https://github.com/timjurka/sentiment): Supervised Machine Learning, Naive Bayes Classifier.
### Golang
[cdipaolo/sentiment](https://github.com/cdipaolo/sentiment): Supervised Machine Learning, Naive Bayes Classifier. Based on [cdipaolo/goml](https://github.com/cdipaolo/goml).
### Ruby
[malavbhavsar/sentimentalizer](https://github.com/malavbhavsar/sentimentalizer): Lexical, Dictionary-based.

[7compass/sentimental](https://github.com/7compass/sentimental): Lexical, Dictionary-based.
### CSharp
[amrish7/Dragon](https://github.com/amrish7/Dragon): Supervised Machine Learning, Naive Bayes Classifier.
## SaaS APIs
Google Cloud Natural Language API [[web]](https://cloud.google.com/natural-language/)

IBM Watson Alchemy Language [[web]](https://www.ibm.com/watson/developercloud/alchemy-language.html)

Microsoft Cognitive Services [[web]](https://www.microsoft.com/cognitive-services/en-us/text-analytics-api)

Aylien [[web]](https://developer.aylien.com/text-api-demo)
## Contributing
:+1::tada: First off, thanks for taking the time to contribute! :tada::+1:

Steps to contribute:
- Make your awesome changes.
- Submit a pull request. If you add a new entry, please briefly explain why you think it should be added.