Harvard Goes To School On Big Data - InformationWeek


Commentary
7/5/2011
02:21 PM
Doug Henschen

Harvard Goes To School On Big Data

IBM Netezza appliance powers medical school's analysis of 10-million-plus patient records for drug safety research.

"If a patient has a high lipid test [LDL] level, for example, it's more likely they will take a lipid-lowering medication, and at the same time it's more likely that they will have a heart attack," Schneeweiss explains.

LDL levels are just one risk factor out of hundreds that are identified and prioritized by high-dimensional propensity scores. It takes time to develop and run the algorithms, and that gets back to the capacity and speed of the analytics platform. Without elaborate and time-consuming database tuning and optimization work, researchers found that many of their iterative algorithms took as long as overnight or a weekend to run.
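The confounding Schneeweiss describes can be illustrated with a small sketch. The counts below are invented purely for illustration and are not Harvard's data, and the code adjusts for a single risk factor (LDL level) by simple stratification. It is not the high-dimensional propensity score method itself, which weighs hundreds of factors. It only shows why a crude comparison can make a drug look harmful when sicker patients are the ones more likely to receive it.

```python
# Hypothetical counts, invented for illustration only (not Harvard's data).
# Each stratum: (treated_n, treated_events, untreated_n, untreated_events)
# "events" here stands for heart attacks.
STRATA = {
    "high_ldl": (800, 80, 200, 30),  # most high-LDL patients get the drug
    "low_ldl": (200, 4, 800, 24),    # most low-LDL patients do not
}


def crude_risk_difference(strata):
    """Risk in treated minus risk in untreated, ignoring LDL level."""
    treated_n = sum(s[0] for s in strata.values())
    treated_events = sum(s[1] for s in strata.values())
    untreated_n = sum(s[2] for s in strata.values())
    untreated_events = sum(s[3] for s in strata.values())
    return treated_events / treated_n - untreated_events / untreated_n


def stratified_risk_difference(strata):
    """Average of within-stratum risk differences, weighted by stratum size."""
    total = sum(s[0] + s[2] for s in strata.values())
    adjusted = 0.0
    for treated_n, treated_events, untreated_n, untreated_events in strata.values():
        weight = (treated_n + untreated_n) / total
        adjusted += weight * (treated_events / treated_n
                              - untreated_events / untreated_n)
    return adjusted


if __name__ == "__main__":
    print(f"crude risk difference:      {crude_risk_difference(STRATA):+.3f}")
    print(f"stratified risk difference: {stratified_risk_difference(STRATA):+.3f}")
```

In this made-up data the crude comparison suggests the drug raises heart attack risk, while comparing like with like inside each LDL group shows lower risk among treated patients. Propensity scores extend the same idea by combining many risk factors into a single probability of being treated.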

"By 2009 we recognized that we needed a fundamentally different approach," says Schneeweiss.

The different approach embraced by the commercial world for big-data processing has been massively parallel processing appliances built on commodity (mostly Intel X86) servers rather than clusters of expensive proprietary symmetric multiprocessor servers. Harvard didn't have to look far to find such an appliance as it was approached by IBM Netezza, headquartered in nearby Marlborough, Mass., in 2010 to explore the possibility of a research partnership.

(Competitors will undoubtedly point out that Netezza still uses proprietary Field Programmable Gate Arrays for data filtering, but the company switched to commodity X86 processors and storage in 2009 with the move to its TwinFin architecture.)

Appliances are typically a seven-figure investment, but through the partnership, Harvard did not have to pay for its appliance. "That explains why we didn't shop around -- it was a Godsend that came at the right moment," says Schneeweiss.

The transition to IBM Netezza happened quickly early this year, as IBM Netezza had a TwinFin appliance up and running at a Harvard research data center within two days. Once data was migrated to the new environment, Schneeweiss says the school's six programmers were able to do analyses at least ten times faster without any optimization.

"We have one analysis of data on 150,000 patients that took 20 minutes, with optimization, in the old environment, and it now takes two seconds without any special tuning," he says.

Given the faster analysis speeds and the minimal tuning required, researchers now routinely apply high-dimensional propensity scoring techniques to improve the accuracy of their research. "That gets us that much closer to causal conclusions, and researchers can act upon that insight," Schneeweiss says.

The faster Harvard's researchers can develop conclusive research, the sooner they will be able to help drug companies, the FDA, and other regulatory agencies take risky drugs off the market and steer practitioners toward the safest and most effective medications available.

For IBM Netezza, promoting the use of the company's technology among prestigious researchers helps open doors at other research facilities and at commercial firms, such as pharmaceutical giants. "We at Netezza are excited that our collaboration with these notable Harvard Medical School faculty and researchers has already led to leveraging IBM research development efforts and existing products toward revolutionizing computational pharmacoepidemiology," wrote Shawn Dolley, vice president and general manager of the Healthcare & Life Sciences practice at IBM Netezza.

It's the kind of good-will gesture that has always paid off for IBM, even if it means giving away a million-dollar-plus appliance.
