Government Toils To Create Big Data Infrastructure - InformationWeek


Government Toils To Create Big Data Infrastructure

Government is slowly puzzling out how to extract knowledge from the largest volume of data it has ever had available.


Climate researchers at the Energy Department's Lawrence Berkeley National Laboratory are using the power of the Edison and Hopper supercomputers to run global weather simulations at levels of granularity never before possible.

"We can do calculations I have been waiting my entire career to do," said Michael Wehner, senior staff scientist in the lab's Computational Research Division. "This brings a leap forward. The simulations are much more realistic; the storms are much more interesting."

The supercomputer simulations have produced more than 400 terabytes of modeling data in the last two years, and this creates a challenge of its own, Wehner said. "How to extract something meaningful from this data?"

There are two big challenges to making use of big data. The first is that the ability of supercomputers to create these large datasets has outstripped their ability to use them. It is a problem of input/output, said Wehner. High-performance computers are good at generating data, but not so good at putting it out. They are not designed to take it back in for analysis. "The input is killing us," Wehner said.


"This is not necessarily a new problem," said Steve Wallach, former technical executive at the National Geospatial-Intelligence Agency (NGA). As long as 30 years ago, computers were producing more data than could practically be used, and the ability to produce data has outpaced the ability to manage it ever since, he noted. "We are moving into a new area," said Wallach.

The other major challenge is making the data available to other researchers who can add value to it. "I spend a lot of the taxpayers' money producing this data with the big machines," Wehner said. "We bend over backwards trying to get this out to other collaborators."


Making big data accessible requires more than large volumes of storage and high-bandwidth network links. It requires metadata so that data can be searched and located, and it requires new techniques for storage and retrieval so that data stored across distributed systems can be found quickly and delivered efficiently to users.
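To make the role of metadata concrete, here is a minimal sketch of a searchable catalog, written in Python. It is an illustration only, not a description of any tool used at the labs mentioned in this article. The class names, field names, storage locations, and sample records are all hypothetical. The idea is that a small descriptive record is kept for each dataset, and a keyword index over those records lets a user locate data without scanning the data itself.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Descriptive metadata for one stored dataset (all fields hypothetical)."""
    dataset_id: str
    location: str     # where the bytes live, e.g. a storage node and path
    keywords: tuple   # searchable tags such as variable names or regions
    size_tb: float    # dataset size in terabytes


class MetadataCatalog:
    """In-memory inverted index mapping keywords to dataset records."""

    def __init__(self):
        self._records = {}
        self._index = defaultdict(set)

    def register(self, record):
        """Store a record and index it under each of its keywords."""
        self._records[record.dataset_id] = record
        for keyword in record.keywords:
            self._index[keyword.lower()].add(record.dataset_id)

    def search(self, *keywords):
        """Return the records that match every keyword, sorted by dataset_id."""
        if not keywords:
            return []
        id_sets = [self._index.get(k.lower(), set()) for k in keywords]
        matches = set.intersection(*id_sets)
        return [self._records[i] for i in sorted(matches)]


# Hypothetical sample records, invented for this sketch.
catalog = MetadataCatalog()
catalog.register(DatasetRecord(
    "run-001", "node-a:/sim/run-001", ("precipitation", "global", "daily"), 1.5))
catalog.register(DatasetRecord(
    "run-002", "node-b:/sim/run-002", ("temperature", "global", "daily"), 2.0))
catalog.register(DatasetRecord(
    "run-003", "node-c:/sim/run-003", ("precipitation", "regional", "hourly"), 0.8))

hits = catalog.search("global", "daily")
locations = [record.location for record in hits]
```

Only the small records are searched here; the large datasets stay where they are until a user asks for one by location. A production catalog would also need persistence, access control, and richer queries such as time ranges, none of which this sketch attempts.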

In many cases, "it's pretty much roll your own" in developing tools for more efficient use of big data, Wehner said. Now, the experience of the government's early big data adopters -- especially the Energy Department's national labs and big players in the intelligence community, such as the NGA and NSA -- is trickling into the private sector, which has begun producing commercial tools for big data that can be used by government.

In the past, "industry learned how to do it by working with us," said Gary Grider, who leads the High Performance Computing division at Los Alamos National Lab. "That no longer is entirely true. Some commercial entities are catching up. We have some brethren, which is good for us."

Big data and big bucks
The federal government has been producing and using big data almost from its beginning. The first national census in 1790 gathered information on nearly 4 million people, which Commerce Department Under Secretary for Economic Affairs Mark Doms called "a huge dataset for its day, and not too shabby by today's standards, as well."

The scale has increased dramatically since then. The United States now has a population of more than 300 million with an annual economy of $17 trillion, and the government's principal statistical agencies spend about $3.7 billion a


William Jackson is a writer with the Tech Writers Bureau, with more than 35 years' experience reporting for daily, business and technical publications, including two decades covering information ...

Charlie Babcock,
User Rank: Author
10/3/2014 | 6:44:04 PM
Let's put more investment into Dept. of Energy data
Nice piece by William Jackson. Government has the data -- lots of it. But it's hard to get it into the right hands and systems to analyze it and make it useful. I would think the Department of Energy would benefit from a big investment in handling big data.