NOAA CIO Tackles Big Data
The climate, oceanography, and weather agency has untold petabytes of data to manage, and the volume is only going to grow faster.
Immediately after the 9.0 earthquake struck off the coast of Japan on March 11, the National Oceanic and Atmospheric Administration, using real-time data from ocean sensors, generated computer models of the tsunami to follow. Those models were quickly shared around the world via YouTube and other websites, providing vital information to an anxious public.
The sensors, located on buoys and the ocean floor, are part of a global network that provides a steady stream of data on the Earth's oceans and weather. Together with a vast archive of historical data, that stream makes the agency's databases some of the largest in the federal government. Its Princeton, N.J., data center alone stores more than 20 petabytes of data.
"I focus much of my time on data lifecycle management," said Joe Klimavicz, who discussed his IT strategy in a recent interview with InformationWeek at NOAA headquarters in Silver Spring, Md. The keys to ensuring that data is useable and easy to find, he says, include using accurate metadata, publishing data in standard formats, and having a well-conceived data storage strategy.
NOAA is responsible for weather and climate forecasts, coastal restoration, and fisheries management, and much of Uncle Sam’s oceanic, environmental, and climate research. The agency, which spends about $1 billion annually on IT, is investing in new supercomputers for improved weather and climate forecasting and making information available to the public through Web portals such as Climate.gov and Drought.gov.
Weather and climate sensors attached to planes, bridges, and buildings are becoming ubiquitous, feeding data into NOAA's forecasting models. NOAA collects 80 TB of scientific data daily, and Klimavicz expects there to be a ten-fold increase in measurements by 2020.
As is true in other agencies, NOAA uses a mix of legacy IT systems and newer platforms. "While we probably have some of the most cutting-edge storage around at some locations, we have more primitive technology elsewhere," he said. "A lot of these things take time, and I don’t see a lot of influx of cash to do this in a big bang, so we're tackling it incrementally."
Last year, NOAA began real-time monitoring from a new cybersecurity center, which operates 12 hours a day, five days a week; Klimavicz wants to expand that to 24/7 coverage. The agency uses ArcSight security tools to monitor events and correlate logs. "The earlier you react, the less work you have to do," he said.
With 122 field offices, NOAA is highly decentralized. The agency’s CIO office -- with about 115 federal employees and an equal number of contractors -- oversees IT policy and manages telecom, supercomputing, cybersecurity, and other IT operations. Six line offices, including the National Weather Service and the National Environmental Satellite, Data and Information Service, have their own CIOs, who work under Klimavicz's guidance and meet with him weekly.
Klimavicz’s current top priority is replacing the agency’s outdated email systems with cloud-based email. Following two email-as-a-service pilots, NOAA is evaluating vendor proposals now. A transition to the cloud is expected this summer.
NOAA is heavily invested in high-performance computing, which supports the agency’s scientific research. "We've been executing our HPC plan on schedule and on budget," Klimavicz said.
Last year, NOAA deployed a 260-teraflop Cray XT6 supercomputer for climate research at the Department of Energy's Oak Ridge National Laboratory. It plans to upgrade the machine to 1.1 petaflops this summer, which would place it among the world's most powerful computers. NOAA decided to work with Oak Ridge because the lab had ample space and affordable power. Later this year, the agency will bring another climate and weather supercomputer online at its Environmental Security Computing Center in West Virginia, and it has issued a request for proposals to upgrade its forecasting system.
One goal is to double the resolution of NOAA's climate models, which will make local forecasts more accurate and let NOAA model storms and weather across wider geographic areas and longer timelines. Each doubling of resolution requires 16 times the computing horsepower, Klimavicz says, because the models are multi-dimensional: twice as many grid points in each of three spatial dimensions, plus a correspondingly finer time step.
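A back-of-the-envelope sketch of that scaling, assuming three spatial dimensions and a time step that shrinks in proportion to the grid spacing:

```python
# Rough illustration of why doubling model resolution multiplies compute by ~16x:
# twice the grid points in each of three spatial dimensions (2**3 = 8), and a
# time step that must shrink in proportion (another factor of 2).
def compute_factor(resolution_multiplier: int, spatial_dims: int = 3) -> int:
    """Relative computing cost when grid spacing shrinks by resolution_multiplier."""
    return resolution_multiplier ** (spatial_dims + 1)  # +1 for the finer time step

print(compute_factor(2))  # 16
print(compute_factor(4))  # 256 -- two successive doublings
```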
The agency is also looking to add more non-atmospheric data into its models to improve forecasts. It plans to factor in readings on circular ocean currents called mesoscale eddies, for example. “I don't think we're looking at it enough from a holistic perspective," he said. "We need to improve that."
Proposed budget cuts could put some of NOAA’s IT plans on hold. The House of Representatives has passed a bill that would cut NOAA's funding for the rest of fiscal 2011 by 14%, limiting the agency’s Tsunami Warning Centers' ability to upgrade tsunami models and delaying deployment of a next-generation satellite.
Klimavicz, who joined NOAA from the National Geospatial-Intelligence Agency, also serves as the Department of Commerce's senior geospatial official. His experience proved valuable last year in NOAA's response to the Deepwater Horizon oil spill, as the agency scrambled to create a mapping website that tracked the oil, fishery closings, threats to marine life, and other data.
GeoPlatform.gov/gulfresponse, co-developed by NOAA and the University of New Hampshire's Coastal Response Research Center, was moved from an academic environment to a public website in weeks, receiving millions of hits in its first day alone. "Not only did we have to support the first responders, but we had to be sure we were putting data out there for public consumption," he said.
The site is being expanded for use during disasters ranging from fires to earthquakes. "We're committed to situational awareness for all big events," Klimavicz said.
Key to GeoPlatform.gov's success will be its ability to integrate data from other agencies and the private sector. During the Deepwater disaster, data had to be approved by the response's unified command for accuracy before being posted to the Web. Klimavicz says the goal is to let people submit their own data to the site via open APIs.
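A minimal sketch of what such an open submission API might look like in practice. The endpoint URL and payload fields are hypothetical, since NOAA has not published a specification:

```python
# Hypothetical sketch of submitting an observation to an open data API.
# The endpoint URL and payload fields are assumptions for illustration,
# not a published NOAA interface.
import json
import urllib.request

payload = {
    "source": "volunteer-observer-001",
    "observation_type": "shoreline_oiling",
    "lat": 29.25,
    "lon": -89.95,
    "observed_at": "2010-06-15T14:30:00Z",
    "notes": "Light sheen visible near the marsh edge.",
}

req = urllib.request.Request(
    "https://geoplatform.example.gov/api/observations",  # placeholder URL
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# As during Deepwater Horizon, submissions would still be validated and
# reviewed before publication:
# urllib.request.urlopen(req)
```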