This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.
Expert Analysis: Is Sentiment Analysis an 80% Solution?
Sentiment-analysis technologies aren't perfect. But what critics are missing is the value of automation, the inaccuracy of human assessment, and the many applications that require only "good-enough" accuracy.
So perhaps human sentiment analysis isn't as good as folks suppose; certainly not 100%. Try a few examples yourself. Imagine that you're not just reading, that you're generating data for monitoring/measurement/analysis purposes. Are these tweets positive, negative, both, or neutral -- at the tweet, sentence, and feature [e.g., public option, Obama, GOP, Avatar] levels?
Of course, I cherry-picked those examples to illustrate that it's sometimes difficult to assign sentiment polarity precisely. Take the first example. It's not explicitly pro a health care public option, is it, even while it's implicitly against public-option opponents? At the tweet level, is it pro, con, neither, or both?
Automated Sentiment Analysis
Regarding automated systems, Bing Liu says "acceptable accuracy and even the measure of it is quite tricky because sentiment analysis is a multi-faceted problem with several sub-problems. For most practical applications, they all need to be solved... In terms of precision and recall of opinion orientation classification (not other sub-problems), I believe a precision around 90% will be sufficient, but some companies asked for near 100% precision based on my practical experience. (They need to be educated!) Recall is a slightly different issue. A reasonable value will be OK as one does not need to catch every sentence with opinions to find the problems of a product."
Mikko Kotila says leading providers such as Sysomos and Radian6 estimate their automated sentiment analysis and scoring system to be 80% accurate. Without citing examples, I asserted that many systems don't do even that well, not that they have to in order to be useful. But can anyone do better?
Dave Nadeau, creator of restaurant-review start-up InfoGlutton, can. According to Nadeau, InfoGlutton is trained on restaurant reviews from 25 sources for more than 100 restaurants. The proprietary corpus is made of 6,000 reviews totaling 40,000 sentences. Nadeau offers the statistics that:
"InfoGlutton sentiment analysis at sentence level is 89.5% accurate, with classifiers tuned for very high (~92%) precision for the positive and negative sentiments.
InfoGlutton sentiment analysis at review level is 94% accurate, with classifiers tuned for very high (~96%) precision for the positive and negative sentiments."
Accuracy Beyond Precision
My earlier Twitter examples allow me to introduce the notion that there's more to accuracy than classification precision. Accuracy in text analytics and search is typically computed from both precision and recall. Recall is the proportion of target features (documents, entities, whatever) that were found; precision is the proportion of those found that were found correctly. Doesn't it go without saying that human methods will never match the recall, speed, and reach of automated methods?
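To make the distinction concrete, here is a minimal Python sketch using the standard definitions: precision is the share of flagged items that were flagged correctly, recall is the share of true targets that were found, and the F1 score (one common way to combine the two) is their harmonic mean. The function name and the tweet IDs are hypothetical, made up purely for illustration; they do not come from any system discussed here.

```python
def precision_recall_f1(flagged, actual):
    """Compare the set of items a system flagged against the true targets."""
    true_positives = len(flagged & actual)
    # Precision: of everything flagged, how much was correct?
    precision = true_positives / len(flagged) if flagged else 0.0
    # Recall: of everything that should have been found, how much was?
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        # F1: harmonic mean of precision and recall
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical data: IDs of tweets that really are negative,
# and IDs of tweets a system flagged as negative.
actual_negative = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
flagged_negative = {1, 2, 3, 4, 11}

p, r, f = precision_recall_f1(flagged_negative, actual_negative)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

In this made-up case the system is right four times out of five when it flags something, yet it finds fewer than half of the negative tweets. High precision alone says nothing about how much was missed.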
Hint: "vraiment mauvais" is French for truly bad, and "le service après vente" is post-sales service. I'm sure you recognize "Toshiba" and I'd infer that "Regza LED" is a product. It's a computer's ability to find and analyze sources across languages, and to operate 24/7, scanning volumes of data that would overwhelm humans, that gives an edge to automation.
This example illustrates recall. If Toshiba (or their rivals) monitored only English-language sources, they would miss A LOT of relevant material. The tweet I offered above is in French; I can only guess at the volume of material posted in Japanese, Korean, Spanish, Arabic, Russian, and other languages. I'm not claiming that every automation solution can handle every language, nor that your efforts have to be exhaustive. But I will say that if your brand is multi-national, your recall, and thus your overall sentiment-analysis accuracy, are going to suffer without automation.
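The effect of language coverage on recall can be sketched as simple arithmetic. The mention counts and per-language recall figures below are invented for illustration only; they are not measurements of Toshiba, of any vendor, or of any real data set, and `overall_recall` is a hypothetical helper name.

```python
def overall_recall(mentions_by_language, recall_by_language):
    """Share of all relevant mentions found, given per-language recall.

    Languages missing from recall_by_language are treated as unmonitored
    (recall 0.0).
    """
    total = sum(mentions_by_language.values())
    found = sum(
        count * recall_by_language.get(lang, 0.0)
        for lang, count in mentions_by_language.items()
    )
    return found / total if total else 0.0


# Hypothetical volume of relevant brand mentions per language.
mentions = {"en": 500, "fr": 200, "ja": 300}

# Hypothetical recall when only English sources are monitored,
# versus when French and Japanese are covered too.
english_only = {"en": 0.9}
multilingual = {"en": 0.9, "fr": 0.8, "ja": 0.7}

print(f"English only: {overall_recall(mentions, english_only):.2f}")
print(f"Multilingual: {overall_recall(mentions, multilingual):.2f}")
```

Under these assumed numbers, a system that does well on English alone still misses more than half of the relevant material overall, while somewhat weaker per-language performance across all three languages finds most of it.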