Keep Up With Streaming Data - InformationWeek

3/29/2005
02:00 PM

Keep Up With Streaming Data

Technology proven in stock trading can tap the details in the flood of streaming data coming your way.

Seth Grimes

A new class of streaming data software, responding to what may soon be a common demand, is melding deep analytics with the ability to crunch torrents of data in real time. Analyzing continuous, high-volume data feeds poses a special challenge for applications as varied as automated financial-market trading, security-incident detection and weather forecasting. These applications all use analytically discovered patterns to generate predictions, yet the value of these predictions is degraded by long processing times.

Until recently, if you needed real-time results, you had to settle for simple analyses such as scoring, where you plug up-to-the-second numbers into canned models and — based on the outputs — fire off alerts, make routing choices or make thumbs-up or thumbs-down decisions. Streaming data solutions offer more sophisticated analyses.
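To make "scoring" concrete, here is a minimal Python sketch of the idea: an up-to-the-second number is plugged into a canned model and the output drives a thumbs-up or thumbs-down decision. The `Tick` type, the deviation-from-reference model and the 5% threshold are illustrative assumptions, not taken from any product mentioned here.

```python
from dataclasses import dataclass


@dataclass
class Tick:
    """One quote from a feed (hypothetical shape, for illustration only)."""
    symbol: str
    price: float
    volume: int


def score(tick: Tick, reference_price: float) -> float:
    """Canned model: relative deviation of the latest price from a fixed reference."""
    return abs(tick.price - reference_price) / reference_price


def decide(tick: Tick, reference_price: float, threshold: float = 0.05) -> str:
    """Turn the score into an alert decision; the threshold is an assumed value."""
    return "ALERT" if score(tick, reference_price) > threshold else "OK"


if __name__ == "__main__":
    print(decide(Tick("ABC", 106.0, 100), reference_price=100.0))
    print(decide(Tick("ABC", 101.0, 100), reference_price=100.0))
```

Each decision here looks at a single point in time. The model itself never changes and never sees history, which is the limitation that streaming analytics aims to remove.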

Take securities-trading data, which, to simplify, consists of streams of ticker symbols and prices, lot sizes, times of the last trade, and bids and offers. NASDAQ alone hosts trading in 3,300 companies with billions of daily price quotes and trades. Apama, a U.K.-based vendor, is focusing on this market by offering systems that filter, join and analyze market data feeds. These solutions support "algorithmic" securities trading that applies complex, adaptive market models.

StreamBase, a company founded by database pioneer and Ingres/Postgres inventor Michael Stonebraker, is taking a general-purpose approach to securities trading. Stonebraker characterizes real-time, complex processing as "a very different challenge from the stored data challenge solved by relational databases." The stored-data approach involves assimilating operational data into warehouses that host many simultaneous, diverse data analyses or, alternatively, into smaller marts refined for narrower analyses. Regardless of scale, data acquisition involves time-wasting data cleansing, loading and indexing. Analytically structured databases simply can't keep up with the flood of data that a single stream may deliver.

Streaming data arrives when it's ready — irregularly and unpredictably. While point-in-time values matter, the data may contain important patterns that can be discerned only by looking at "time windows" rather than points and only by correlating data from multiple sources. In the securities industry, there's often interest in trends among tracked entities relative to comparators and historic patterns. Traders might want to detect anomalies that hint at risks and opportunities, perhaps fleeting, to either hedge or exploit. Meanwhile, most of the data in a feed might be extraneous and should be filtered out before analysis. Imagine sipping water from a fire hose.
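The combination of filtering and "time windows" described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feed is a sequence of `(timestamp, symbol, price)` tuples, the window spans 60 seconds, and an "anomaly" is a price more than 2% away from the window's mean. The names `TimeWindow` and `detect_anomalies` are hypothetical.

```python
from collections import deque


class TimeWindow:
    """Keeps only the prices that arrived within the last `span` seconds."""

    def __init__(self, span: float):
        self.span = span
        self.items = deque()  # (timestamp, price) pairs, oldest first

    def evict(self, now: float) -> None:
        # Drop entries that have aged out of the window.
        while self.items and self.items[0][0] <= now - self.span:
            self.items.popleft()

    def add(self, timestamp: float, price: float) -> None:
        self.items.append((timestamp, price))

    def mean(self) -> float:
        return sum(price for _, price in self.items) / len(self.items)


def detect_anomalies(feed, watched, span=60.0, tolerance=0.02):
    """Yield ticks whose price deviates from the recent window mean.

    Ticks for symbols outside `watched` are filtered out before any analysis.
    """
    windows = {}
    for timestamp, symbol, price in feed:
        if symbol not in watched:
            continue  # extraneous data: discard early
        window = windows.setdefault(symbol, TimeWindow(span))
        window.evict(timestamp)
        if window.items:
            baseline = window.mean()
            if abs(price - baseline) / baseline > tolerance:
                yield timestamp, symbol, price
        window.add(timestamp, price)


if __name__ == "__main__":
    feed = [
        (0, "ABC", 100.0),
        (10, "XYZ", 5.0),
        (20, "ABC", 100.5),
        (30, "ABC", 110.0),
        (200, "ABC", 120.0),
    ]
    for hit in detect_anomalies(feed, watched={"ABC"}):
        print(hit)
```

Because timestamps drive eviction rather than a fixed row count, the window copes with data that arrives irregularly. The tick at time 200 raises no flag in this sketch because the earlier prices have aged out and there is nothing recent to compare against.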

Notable streaming-data projects have emerged from industrial and academic research labs. RiverGlass is a security- and financial-risk-oriented commercialization of technology created at the University of Illinois to federate and detect patterns in heterogeneous document and data streams. Hancock from AT&T Labs Research is designed to monitor communications traffic. Intel Research Berkeley is developing software to handle large arrays of environmental, security and tracking "mote" sensor devices that produce data streams. And Coral8, a startup that leverages research done at Stanford University, is similarly targeting sensor-data analysis as well as financial, security-incident and operational intelligence applications that rely on continuous detection, calculation and analysis.

Like StreamBase, Coral8 provides an extended version of standard SQL designed for long-running, "incremental" queries over continuous data streams as well as querying conventional stored, relational data. Stanford University's Continuous Query Language (CQL) is another variation. These query languages bring familiar SQL constructs, such as subqueries and joins, to data streams and add new operators designed for them.
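What an "incremental" query over a stream joined to stored data might do can be approximated in plain Python. The sketch below is not StreamBase, Coral8 or CQL syntax. It is an assumed analogue of a long-running query that joins each arriving trade to a stored symbol-to-sector table and keeps a per-sector average price up to date. The function name and data shapes are hypothetical.

```python
from collections import defaultdict


def continuous_avg_by_sector(trades, sector_of):
    """Emit an updated per-sector average price after every matching trade.

    `trades` is any iterable of (symbol, price) tuples, possibly unbounded.
    `sector_of` stands in for a conventional stored table: symbol -> sector.
    """
    count = defaultdict(int)
    total = defaultdict(float)
    for symbol, price in trades:
        sector = sector_of.get(symbol)
        if sector is None:
            continue  # no matching row in the stored table: inner-join behavior
        # Incremental update: adjust running state instead of rescanning history.
        count[sector] += 1
        total[sector] += price
        yield sector, total[sector] / count[sector]


if __name__ == "__main__":
    sector_of = {"ABC": "tech", "DEF": "tech", "GHI": "energy"}
    trades = [("ABC", 10.0), ("GHI", 50.0), ("DEF", 20.0), ("ZZZ", 1.0)]
    for result in continuous_avg_by_sector(trades, sector_of):
        print(result)
```

The query never "finishes": it produces a fresh answer each time a tuple arrives, and its cost per tuple stays constant because only the running count and total are kept.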

Streaming-data technology is likely to gain market acceptance much more rapidly than relational database systems did. It targets a critical need for complex, real-time processing — a need that isn't met by sluggish (by comparison) RDBMS-reliant approaches or by activity-monitoring systems with shallow analytics. And the technology can query both streaming and conventional relational data, easing integration, first in the securities-trading niche but soon in a spectrum of applications that could benefit from data-intensive, real-time operational analytics.

Seth Grimes is a principal of Alta Plana Corp., a Washington, D.C.-based consultancy specializing in large-scale analytic computing systems. Write to him at [email protected].
