How to Spin Up an AI Flywheel with Data


Gather as much data as you can about your customers, optimize metrics to address business questions, and build out your AI-based applications.

Guest Commentary

April 30, 2018


Enterprises everywhere are racing to figure out how to make artificial intelligence work for their business. But the reality is that only a few are making real progress -- almost everyone else is still fighting with their data.

Over the years, I’ve had the good fortune of working with two companies at the very forefront of AI: Facebook and Microsoft. Here’s what 20-plus years of building massive databases for these pioneers has taught me about spinning up an AI flywheel with data.

Stop throwing away data

Companies traditionally classify their data in one of two ways: It’s either high-fidelity data that’s closely governed and religiously maintained (e.g., crown jewels like customer and transaction data), or it’s raw unstructured data that’s archived or thrown away (e.g., user interaction data).

For decades, this approach made sense -- but not anymore. With the rise of cloud technologies, the cost of storing and processing data has fallen dramatically. What would have once consumed an entire IT budget is now a tiny drop in the bucket.

If your company is still practicing data austerity, the first step in your AI journey is to stop. Data is your most valuable asset -- keep it all. That’s because modern approaches to AI are fundamentally based on training algorithms with data. Generally speaking, the more data you have, the better your AI gets. Now that collecting and storing large data sets is widely accessible, there’s no reason for any company with its sights set on AI to throw data away.

Check your intuition

When data is expensive to store and process, making data-driven decisions can be hard. It’s simply too costly (and in some cases technically infeasible) to get the right data in front of the right people at the right time. That’s why organizations traditionally lean on the intuition of senior leaders when making important decisions.

Today, the technology limitations are quickly fading. It’s now possible to collect very fine-grained information about how people are interacting with your company, product or service. It’s also possible to make data available in near real time.

But solving the technical challenges of data storage and processing alone does not guarantee success. To turn the corner on AI, a company must also get comfortable relying less on the intuition of its leaders. Remember: making better decisions with data is the entire point of adopting AI.

Experiment often and everywhere

With more data informing your decision-making and less intuition clouding it, it’s time to start setting up experiments. To get the best results, you want to create a culture where a broad group of individuals and teams -- not just a small set of senior leaders -- has access to the data and is not afraid to try out different ideas and see what works.

Most machine learning and AI approaches involve trying out many different algorithms with different metrics and parameters, then training models to see what impact each one has on the specific problem you're trying to solve.
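To make that kind of experiment concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the data is synthetic, the "model" is a one-parameter threshold rule, and the names `make_data`, `accuracy`, and `run_experiments` are hypothetical. The point is only the pattern: define candidates, score each one on data it was not tuned on, and compare.

```python
import random

def make_data(n, seed):
    """Synthetic toy data (an assumption): label is 1 when x > 0.6, with 10% label noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.6)
        if rng.random() < 0.1:  # flip roughly one label in ten
            label = 1 - label
        rows.append((x, label))
    return rows

def accuracy(threshold, rows):
    """Share of rows where the rule 'predict 1 if x > threshold' matches the label."""
    correct = sum(int(x > threshold) == y for x, y in rows)
    return correct / len(rows)

def run_experiments(thresholds, train, valid):
    """Score every candidate on both sets; rank by validation accuracy, best first."""
    results = [
        {"threshold": t, "train_acc": accuracy(t, train), "valid_acc": accuracy(t, valid)}
        for t in thresholds
    ]
    return sorted(results, key=lambda r: r["valid_acc"], reverse=True)

train = make_data(500, seed=1)
valid = make_data(500, seed=2)
results = run_experiments([0.2, 0.4, 0.6, 0.8], train, valid)
best = results[0]

for r in results:
    print(r)
```

Ranking on the validation set rather than the training set is the design choice that matters here: it keeps the comparison honest when many candidates are tried.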

If your company already has a culture of experimentation, adopting machine learning and AI-based practices is relatively straightforward. Chances are you have processes in place to collect, analyze and use data effectively. From there, it’s often easy to find opportunities where AI can augment these human-based processes.

Close the loop

When you train machine learning and AI algorithms with data, start with a large, rich, structured data set that humans can query to answer questions on their own. Next, train an algorithm on the same data to achieve the same goal, such as answering a particular question or identifying certain patterns. You can then evaluate the algorithm against the known human-answered data set to determine how good it is.
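A rough sketch of that evaluation step, in Python, might look like the following. The support-ticket examples, the word-count "model," and the function names `fit` and `predict` are all hypothetical and chosen only to keep the example self-contained; a real system would use far more data and a real learning algorithm.

```python
from collections import Counter, defaultdict

# Hypothetical human-answered examples: (ticket text, category a person assigned).
human_labeled = [
    ("refund for double charge", "billing"),
    ("charge appeared twice on invoice", "billing"),
    ("cannot log in to account", "access"),
    ("password reset link not working", "access"),
    ("invoice shows wrong amount", "billing"),
    ("locked out of my account", "access"),
]
# Train on the first four examples; hold the last two back for evaluation.
train, held_out = human_labeled[:4], human_labeled[4:]

def fit(examples):
    """Count how often each word appears under each human-assigned label."""
    counts = defaultdict(Counter)
    for text, label in examples:
        for word in text.split():
            counts[word][label] += 1
    return counts

def predict(counts, text, default="unknown"):
    """Sum per-word label counts and return the top label, or `default` if no word is known."""
    votes = Counter()
    for word in text.split():
        votes.update(counts.get(word, {}))
    return votes.most_common(1)[0][0] if votes else default

model = fit(train)
predictions = [predict(model, text) for text, _ in held_out]

# Agreement rate: how often the algorithm gives the same answer a human gave.
agreement = sum(p == label for p, (_, label) in zip(predictions, held_out)) / len(held_out)
print(predictions, agreement)
```

The agreement rate is the simplest possible yardstick. It is only meaningful because the held-out answers came from people and were not shown to the algorithm during training.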

As the algorithm gets better at finding the right solutions or patterns, you can start asking it more advanced questions, with the eventual goal of having it answer questions human trainers are less capable of answering themselves. The key to ensuring optimal results as algorithms become more sophisticated and complex is “closing the loop.” In other words, make sure input data is always good, and constantly evaluate outcomes against the right metrics.
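One way to picture the two halves of closing the loop is the sketch below, written in Python under stated assumptions: the required field names, the 0.05 tolerance, and the weekly accuracy figures are all made up for illustration, and `validate_batch` and `needs_review` are hypothetical helpers rather than part of any library.

```python
def validate_batch(rows, required=("user_id", "event", "timestamp")):
    """Split incoming records into usable rows and rejects that lack a required field."""
    good, bad = [], []
    for row in rows:
        missing = [f for f in required if row.get(f) in (None, "")]
        (bad if missing else good).append(row)
    return good, bad

def needs_review(history, latest, tolerance=0.05):
    """Flag the model when the latest score is more than `tolerance` below the past average."""
    if not history:
        return False
    baseline = sum(history) / len(history)
    return latest < baseline - tolerance

# Input side: keep bad records out of training and scoring.
batch = [
    {"user_id": 1, "event": "click", "timestamp": "2018-04-30T10:00:00"},
    {"user_id": None, "event": "click", "timestamp": "2018-04-30T10:00:05"},
    {"user_id": 2, "event": "", "timestamp": "2018-04-30T10:00:09"},
]
good, bad = validate_batch(batch)

# Outcome side: compare the newest evaluation with recent history (illustrative numbers).
weekly_accuracy = [0.91, 0.90, 0.92]
alert = needs_review(weekly_accuracy, latest=0.82)
print(len(good), len(bad), alert)
```

In practice both checks would run on a schedule, so that a drop in data quality or model quality is noticed before it affects decisions.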

Choose metrics wisely

The more decisions are based on data, the more important it becomes to identify what you are optimizing for and define the metrics you are tracking against.

Generally speaking, choosing metrics that are a proxy for customer satisfaction is a tried-and-true approach. When measured properly, customer satisfaction can provide insights across every aspect of your business. For example, renewal rates can tell you not only whether customers liked your product but also whether they’re happy with the customer support experience.

For the most accurate reflection of customer satisfaction, optimize across a range of metrics. While it’s tempting to search and optimize for the one “perfect” metric, over-optimizing for a single metric can lead to distortions in product behavior and create new challenges down the road. Case in point: Facebook’s recent decision to pivot away from solely optimizing for time spent on its platform because it was ultimately leading to less meaningful interactions.
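The idea of optimizing across a range of metrics can be sketched as a primary metric plus guardrails. The Python below is an illustration only: the metric names, the numbers, and the 2% allowed drop are assumptions, and `passes_guardrails` and `pick_variant` are hypothetical helpers. A variant wins only if it improves the primary metric without noticeably hurting the others.

```python
def passes_guardrails(candidate, baseline, guardrails, max_drop=0.02):
    """True when no guardrail metric falls more than `max_drop` (relative) below baseline."""
    return all(candidate[m] >= baseline[m] * (1 - max_drop) for m in guardrails)

def pick_variant(baseline, variants, primary, guardrails):
    """Return the eligible variant with the best primary metric, or None if nothing beats baseline."""
    eligible = {
        name: metrics
        for name, metrics in variants.items()
        if passes_guardrails(metrics, baseline, guardrails)
    }
    if not eligible:
        return None
    best = max(eligible, key=lambda name: eligible[name][primary])
    return best if eligible[best][primary] > baseline[primary] else None

# Illustrative numbers, not real measurements.
baseline = {"time_spent_min": 30.0, "renewal_rate": 0.80, "support_csat": 4.2}
variants = {
    "A": {"time_spent_min": 36.0, "renewal_rate": 0.74, "support_csat": 4.1},
    "B": {"time_spent_min": 32.0, "renewal_rate": 0.81, "support_csat": 4.2},
}

winner = pick_variant(
    baseline, variants,
    primary="time_spent_min",
    guardrails=("renewal_rate", "support_csat"),
)
print(winner)
```

Here variant A posts the largest gain in time spent but is rejected because its renewal rate drops too far, which is the kind of distortion a single-metric view would miss.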

The lesson here is to choose your metrics wisely. They have a lasting impact on your business and are yet another linchpin in your data and AI journey.

Sameet Agarwal is Vice President of Engineering at Snowflake Computing, where he manages and leads the company’s product development.

About the Author(s)

Guest Commentary

The InformationWeek community brings together IT practitioners and industry experts with IT advice, education, and opinions. We strive to highlight technology executives and subject matter experts and use their knowledge and experiences to help our audience of IT professionals in a meaningful way. We publish Guest Commentaries from IT practitioners, industry analysts, technology evangelists, and researchers in the field. We are focusing on four main topics: cloud computing; DevOps; data and analytics; and IT leadership and career development. We aim to offer objective, practical advice to our audience on those topics from people who have deep experience in these topics and know the ropes. Guest Commentaries must be vendor neutral. We don't publish articles that promote the writer's company or product.
