Tweets Tell Whether You Have A Job - InformationWeek

Twitter data mining could provide governments with an alternative means of measuring unemployment, researchers say.

Researchers at universities in Australia, Spain, and the US, in conjunction with UNICEF, have found that Twitter posts can be mined to measure unemployment.

In a study published through Cornell's arXiv repository, researchers Alejandro Llorente, Manuel Garcia-Herranz, Manuel Cebrian, and Esteban Moro "demonstrate that behavioral features related to unemployment can be recovered from the digital exhaust left by the microblogging network Twitter."

"Digital exhaust" is a curious choice of words, because it implies that social media data is a worthless byproduct of online interaction, something to be cast aside. Yet the researchers' findings suggest the very opposite: Social media exhaust is the primary product. It's gold, rather than garbage, and social media users don't realize the value they're throwing away. The social media realm is a charity to benefit businesses.

Three decades ago, the technologist and writer Stewart Brand famously observed the tension inherent in how we value information.

"Information wants to be free," Brand said in one of his several variations on this theme. "Information also wants to be expensive. Information wants to be free because it has become so cheap to distribute, copy, and recombine -- too cheap to meter. It wants to be expensive because it can be immeasurably valuable to the recipient. That tension will not go away."

Facebook, Google, Twitter, and the rest of the social media and advertising industry want people to believe that their online work -- their posts, pictures, and associated data -- isn't worth anything, in order to capture the full value of this free bounty of insight. Pollute the world with your digital exhaust; we'll clean up, all the way to the bank.

Llorente and his colleagues used a data set of 19.6 million geolocated Twitter messages in Spain from Nov. 29, 2012, to June 30, 2013, and a data set detailing unemployment across various regions of the country to uncover a relationship between economic metrics and social behavior. Interestingly, they noted a correlation between misspellings in tweets, which they take as a proxy for education level, and unemployment.
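
To make that kind of comparison concrete, here is a minimal Python sketch. It is not the researchers' method: the whitespace tokenization, the tiny word list, the function names (misspelling_rate, pearson), and every regional figure are hypothetical, invented purely for illustration. It computes the share of words in each region's tweets that are missing from a word list, then the Pearson correlation between that share and the region's unemployment rate.

```python
from math import sqrt


def misspelling_rate(tweets, dictionary):
    """Fraction of alphabetic words in the tweets that are not in the dictionary."""
    words = [w.lower() for t in tweets for w in t.split() if w.isalpha()]
    if not words:
        return 0.0
    return sum(w not in dictionary for w in words) / len(words)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical inputs: a toy word list, a few made-up tweets per region,
# and made-up unemployment rates (percent). None of this is from the study.
dictionary = {"hola", "mundo", "buenos", "dias"}
tweets_by_region = {
    "region_a": ["hola mundo", "buenos dias"],
    "region_b": ["ola mundo", "buenos dias"],
    "region_c": ["ola mundo", "buenos dias", "k tal"],
}
unemployment = {"region_a": 12.0, "region_b": 20.0, "region_c": 31.0}

regions = list(unemployment)
rates = [misspelling_rate(tweets_by_region[r], dictionary) for r in regions]
r = pearson(rates, [unemployment[name] for name in regions])
print(f"Pearson r = {r:.2f}")
```

A coefficient near +1 would indicate that regions with more misspellings also tend to have higher unemployment. It would not show that one causes the other, and a real analysis would need far more data and a proper spelling model.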

The researchers consider their success in using Twitter posts to assess employment status to be "a proof of concept for how a wide range of behavioral features linked to socioeconomic behavior can be inferred from the digital traces that are left by publicly available social media." And they argue that governments might be able to adapt social media surveillance as an alternative to more costly traditional data-gathering methods related to public policy.

"The immediacy of social media may also allow governments to better measure and understand the effect of policies, social changes, natural or man-made disasters in the economical status of cities in almost real-time," the researchers state.

Others have already reached this conclusion and are monitoring social media for threats to the powers that be, at home and abroad. Intelligence agencies have been wise to the value of social media exhaust for years, both for the overtly expressed sentiment and for the web of personal relationships exposed through social network links.

In fact, data mining to assess aspects of society has become commonplace. Google began using search queries to assess flu infections in 2008. This year, researchers have demonstrated that Twitter posts can be used to infer whether the person posting is ill. And web pages are laden with tracking scripts to gather data.

Interpreting that data correctly, however, remains a challenge. Google's flu tracking system turned out to be inaccurate. Worse, it would not be difficult to construct a social media study to support a predetermined conclusion for political or economic gain. So any use of Twitter to measure employment should be assessed with caution.

Twitter has long been wise to the value of its data. In its IPO filing, it disclosed that it had made $47.5 million selling data to other companies in 2012, up from $28.6 million the year before.

Companies want information to be free, so they can sell it at great cost.

Thomas Claburn has been writing about business and technology since 1996, for publications such as New Architect, PC Computing, InformationWeek, Salon, Wired, and Ziff Davis Smart Business. Before that, he worked in film and television, having earned a not particularly useful ...

D. Henschen
User Rank: Author
11/20/2014 | 4:22:00 PM
"Digital Exhaust" is being misused
From what's described, it sounds like they're misusing the phrase "digital exhaust." Tweets are human expressions of information, opinion, and insight. It's Twitter's social network, and to my knowledge it has never treated tweets as "exhaust." That data is its lifeblood, and it has made APIs to that data widely available. Aggregators capture that data and offer it as well.

The term "digital exhaust" was coined to describe the data that most enterprises used to routinely throw away. It was described as exhaust because it was not seen as being "business data." Log files, for example, contain super high-scale machine data. IT might have kept this data on a short-term basis as a way to monitor the functioning of IT systems.

Now that Hadoop and other high-scale platforms are available, it's possible, economically speaking, to store this "exhaust data," and analytics firms such as Splunk have found ways to correlate this data with business events and customer data to develop business insights. To sum up, tweets aren't exhaust data. That term describes types of data that used to be thrown away because there was no perceived value and no economically sensible way to store that information.