Expert Analysis: Is Sentiment Analysis an 80% Solution? - InformationWeek

InformationWeek is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them.Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

Software // Information Management
04:13 PM
Connect Directly

Expert Analysis: Is Sentiment Analysis an 80% Solution?

Sentiment-analysis technologies aren't perfect. But what critics are missing is the value of automation, the inaccuracy of human assessment, and the many applications that require only "good-enough" accuracy.

So perhaps human sentiment analysis isn't as good as folks suppose; certainly not 100%. Try a few examples yourself. Imagine that you're not just reading, that you're generating data for monitoring/measurement/analysis purposes. Are these tweets positive, negative, both, or neutral -- at the tweet, sentence, and feature [e.g., public option, Obama, GOP, Avatar] levels?

Here's a thought for those against a public option. Sign a legal doc removing u from any government health care EVER - U pay no matter how $

Obama trying bipartisanship delayed #hcr and allowed GOP to redefine in negatively. Takes willing sides for cooperation #p2 #tcot

Seattle's hippest pastor says "Avatar is Satan."

Of course, I cherry-picked those examples to illustrate that it's sometimes difficult to assign sentiment polarity precisely. Take the first example. It's not explicitly pro a health care public option, is it, even while it's implicitly against public-option opponents? At the tweet level, is it pro, con, neither, or both?

Automated Sentiment Analysis

Regarding automated systems, Bing Liu says "acceptable accuracy and even the measure of it is quite tricky because sentiment analysis is a multi-faceted problem with several sub-problems. For most practical applications, they all need to be solved*** In terms of precision and recall of opinion orientation classification (not other sub-problems), I believe a precision around 90% will be sufficient, but some companies asked for near 100% precision based on my practical experience. (They need to be educated!) Recall is a slightly different issue. A reasonable value will be OK as one does not need to catch every sentence with opinions to find the problems of a product."

Mikko Kotila says leading providers such as Sysomos and Radian6 estimate their automated sentiment analysis and scoring system to be 80% accurate. Without citing examples, I asserted that many systems don't do even that well, not that they have to in order to be useful. But can anyone do better?

Dave Nadeau, creator of restaurant-review start-up InfoGlutton, can. According to Nadeau, InfoGlutton is trained on restaurant reviews from 25 sources for more than 100 restaurants. The proprietary corpus is made of 6,000 reviews totaling 40,000 sentences. Nadeau offers the statistics that:

  • "InfoGlutton sentiment analysis at sentence level is 89.5% accurate, with classifiers tuned for very high (~92%) precision for the positive and negative sentiments.
  • InfoGlutton sentiment analysis at review level is 94% accurate, with classifiers tuned for very high (~96%) precision for the positive and negative sentiments.

Accuracy Beyond Precision

My earlier Twitter examples allow me to introduce the notion that there's more to accuracy than classification precision. Accuracy in text analytics and search is typically computed from both precision and recall. Recall is the proportion of target features (documents, entities, whatever) found and the proportion of those found that were found correctly. Doesn't it go without saying that human methods will never match the recall, speed, and reach of automated methods?

What do you make of this tweet?

Le service apres vente de Toshiba est vraiment... MAUVAIS!!!! (Probleme: Regza LED)

Hint: "vraiment mauvais" is French for truly bad, and "le service après vente" is post-sales service. I'm sure you recognize "Toshiba" and I'd infer that "Regza LED" is a product. It's a computer's ability to find and analyze sources across languages, and to operate 24/7, scanning volumes of data that would overwhelm humans, that gives an edge to automation.

This example illustrates recall. If Toshiba (or their rivals) monitored only English-language sources, they would miss A LOT of relevant material. The tweet I offered above is in French; I can only guess at the volume of material posted in Japanese, Korean, Spanish, Arabic, Russian, and other languages. I'm not claiming that every automation solution can handle every language, nor that your efforts have to be exhaustive. But I will say that if your brand is multi-national, your recall, and thus your overall sentiment-analysis accuracy, are going to suffer without automation.

We welcome your comments on this topic on our social media channels, or [contact us directly] with questions about the site.
2 of 3
Comment  | 
Print  | 
More Insights
10 Ways to Transition Traditional IT Talent to Cloud Talent
Lisa Morgan, Freelance Writer,  11/23/2020
What Comes Next for the COVID-19 Computing Consortium
Joao-Pierre S. Ruth, Senior Writer,  11/24/2020
Top 10 Data and Analytics Trends for 2021
Jessica Davis, Senior Editor, Enterprise Apps,  11/13/2020
White Papers
Register for InformationWeek Newsletters
Current Issue
Why Chatbots Are So Popular Right Now
In this IT Trend Report, you will learn more about why chatbots are gaining traction within businesses, particularly while a pandemic is impacting the world.
Flash Poll