Boost Your Analytics, Machine Learning with Alternative Data - InformationWeek

4/12/2019
08:00 AM
Boost Your Analytics, Machine Learning with Alternative Data

Adding external data sources to your analytics and machine learning initiatives can provide new dimensions of insights. Here are some sources of data you can tap.

Finding data for your analytics and machine learning initiatives has generally not been a problem for most organizations. Enterprise organizations collect data as an operational part of doing business. There are transactions, customer records, ERP, CRM, financials, human capital management, and more. Your organization has gathered metrics from web site visits and marketing email responses. There's plenty of data you already have that can fuel your data, analytics or machine learning initiatives.

But if you are using the same data sources you've always used, you may not be getting the range and dimensions of insights that could be available to you. It may be time to consider going beyond those traditional in-house sources to tap alternatives.

"Your largest sources of data aren't those you own," said Lydia Clougherty Jones, a research director in the Gartner data and analytics group, speaking at the recent Gartner Data and Analytics Summit in Orlando, Florida. "They are the ones that are out there in the data ecosystem."

Image: vegefox.com - stock.adobe.com

There's a whole world of unexplored data sources out there, if you haven't yet made use of anything from outside your organization. A Gartner survey from about a year ago revealed that just under half of organizations were tapping external data sources.

Jones groups alternative data sources into seven categories -- enterprise, dark, open, web, social, partner, and syndicated. Here's how she defines each of them.

Enterprise data is actually the kind of data you already have -- data about customers, suppliers, partners, and employees that is already readily accessible. This could be transactional data or manufacturing supply chain data.

Dark data is also data that is already available internally to your organization. This is data that was used for a single purpose and then forgotten about or archived. It includes emails, contracts, documents, multimedia, system logs, or other intellectual property.

"Parsing, tagging, linking, or otherwise structuring or extracting usable data from these sources can offer the greatest immediate opportunity," said Jones. One potential analytics use case for this data is helping to identify insurance fraud.
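
As a rough illustration of what "parsing and structuring" dark data can look like, here is a minimal Python sketch that turns raw system log lines into structured records and counts claim submissions per user. The log format, field names, and sample lines are all hypothetical, invented for this example; real archives will need their own patterns.

```python
import re
from collections import Counter

# Hypothetical log format, assumed for illustration only:
# "<timestamp> <LEVEL> user=<name> action=<action>"
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\S+) (?P<level>[A-Z]+) user=(?P<user>\S+) action=(?P<action>\S+)$"
)

def parse_log_lines(lines):
    """Turn raw log lines into dicts, skipping lines that do not match."""
    records = []
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match:
            records.append(match.groupdict())
    return records

# Invented sample data standing in for an archived log file.
sample_lines = [
    "2019-04-01T09:15:00 INFO user=alice action=claim_submitted",
    "2019-04-01T09:16:30 WARN user=bob action=claim_submitted",
    "malformed line without structure",
    "2019-04-01T09:20:10 INFO user=bob action=claim_submitted",
]

records = parse_log_lines(sample_lines)

# Once structured, the data can feed simple analytics, such as
# counting how many claims each user submitted.
claims_per_user = Counter(
    r["user"] for r in records if r["action"] == "claim_submitted"
)
```

A count like this is only a starting point; a fraud model would combine many such signals.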

Open data is another alternative data source. Jones said governments have begun opening their data up to the public as a matter of principle or mandate. There are an estimated 10 million such data sets available worldwide. These data sets can include data about the economy, labor, the population, health and welfare, citizen services, infrastructure, and more. This data may also have commercial value, particularly if you combine it with your own data or other external data sources. A potential use case here is retail chains leveraging this data to determine the best locations for new stores.
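
To make the store-location use case concrete, here is a small Python sketch that combines open population data with internal store counts and ranks regions by residents per store. The region names, figures, and the ranking metric itself are assumptions made up for this example, not anything Jones or Gartner prescribed.

```python
# Hypothetical open-data records, e.g. population figures by region
# as a government data portal might publish them.
open_population = {
    "Region A": 120_000,
    "Region B": 45_000,
    "Region C": 300_000,
}

# Hypothetical internal data: how many stores the chain already runs.
internal_store_counts = {"Region A": 3, "Region B": 0, "Region C": 10}

def residents_per_store(population, store_counts):
    """Rank regions by residents per store if one more store were added."""
    rows = []
    for region, residents in population.items():
        stores = store_counts.get(region, 0)
        # Adding 1 counts the proposed new store and avoids dividing by zero.
        rows.append((region, residents / (stores + 1)))
    # Highest residents-per-store first: the least-served regions.
    return sorted(rows, key=lambda row: row[1], reverse=True)

ranking = residents_per_store(open_population, internal_store_counts)
```

With these invented numbers, Region B ranks first because it has no existing stores. A real analysis would weigh income, competition, and other factors as well.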

Web data is data that you scrape from websites, often to track the activities of competitors, partners, suppliers, and others, Jones said. You may want to track a competitor's pricing, for instance, or keep track of their job postings. Jones noted that there is a growing marketplace of web content harvesting tools including Connotate, Mozenda, Kofax, Import.io, and DeiXTo.
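
For a sense of what price tracking involves at its simplest, the following Python sketch pulls prices out of an HTML snippet using only the standard library. The markup, the `price` class name, and the products are hypothetical; a real page would have its own structure, and the commercial tools named above handle far more than this.

```python
from html.parser import HTMLParser

class PriceParser(HTMLParser):
    """Collect the text of <span class="price"> elements as floats."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples.
        if tag == "span" and ("class", "price") in attrs:
            self._in_price = True

    def handle_data(self, data):
        if self._in_price:
            # Assumes prices look like "$19.99" (an illustrative format).
            self.prices.append(float(data.strip().lstrip("$")))
            self._in_price = False

# Invented HTML standing in for a downloaded competitor product page.
sample_html = (
    '<ul>'
    '<li>Widget <span class="price">$19.99</span></li>'
    '<li>Gadget <span class="price">$4.50</span></li>'
    '</ul>'
)

parser = PriceParser()
parser.feed(sample_html)
prices = parser.prices
```

Here the page content is a hard-coded string so the sketch runs offline; fetching live pages is a separate step.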

Social media data is another growing source of data for organizations. This can include content from posts on Twitter, Facebook, LinkedIn, Instagram, Pinterest, YouTube, Reddit, blogs, review sites, and more. Organizations can use these sources to focus on consumer sentiment and trends and get a better sense of brand awareness and consumer engagement. Plus, they can monitor their reputation. Tools to help include Meltwater, Clarabridge, Synthesio, Brandwatch and Zoho, among others.

Partner data comes from suppliers and resellers and may include data about sales, inventory, capacity, forecasts, product or equipment specifications, and customers. Jones said that many companies give this data away already, but some are considering charging for it or using it to barter.

Syndicated data is the final source of the seven. This data comes from data brokers or marketplaces and could include consumer data, financial data, weather data, images, market intelligence, product master or reference data, and industry-specific data, according to Jones. She said there are thousands of data brokers now and the market for data exchanges is just getting started. These will connect buyers and sellers of proprietary data in the years to come.

Jones recommends that enterprises set up a dedicated practice for identifying and procuring data sources. Such a group inside the larger organization can navigate questions about legality, ownership, and rights. It will also have the expertise to determine the value of the data.

"Identify the range of internal and external data sources of value to your organization," Jones said. "Explore the variety of potential use cases for these data sources."


Jessica Davis has spent a career covering the intersection of business and technology at titles including IDG's Infoworld, Ziff Davis Enterprise's eWeek and Channel Insider, and Penton Technology's MSPmentor. She's passionate about the practical use of business intelligence, ...
