Data Scientists Want Big Data Ethics Standards - InformationWeek

Nearly half of data scientists surveyed last month say Facebook's controversial "mood manipulation study" was unethical, and many support ethics guidelines for big data research.

The vast majority of statisticians and data scientists believe that consumers should worry about privacy issues related to data being collected on them, and most have qualms about the questionable ethics behind Facebook's undisclosed psychological experiment on its users in 2012.

Those are just two of the findings from a Revolution Analytics survey of 144 data scientists at JSM (Joint Statistical Meetings) 2014, an annual gathering of statisticians, to gauge their thoughts on big data ethics. The Boston conference ran Aug. 2-7.

The survey results show data scientists are largely a principled bunch concerned over the lack of ethical guidelines for big data research, at least in some industries.

The Facebook study is a case in point. In January 2012, the social network placed positive or negative posts and images in nearly 700,000 of its users' news feeds to gauge whether the information would sway people's emotions. The Facebook users were unaware they were subjects in the study.

In the JSM survey, 47% of respondents considered the Facebook study unethical; another 40% said they "don't know" whether the mood manipulation study was ethical.

Big data researchers can glean an important lesson from the Facebook study and the criticism it received, said David Smith, chief community officer at Revolution Analytics, one of the leading commercial providers of software and services based on the open-source R programming language. Smith is responsible for developing relationships with the community of statisticians and data scientists who use and develop R.

In a phone interview with InformationWeek, Smith said data scientists and statisticians working in the scientific and health science fields already have "a lot of regulation around how data is collected and analyzed."

One example involves medical research conducted for the US Department of Health and Human Services' National Institutes of Health (NIH). "If you want to run a study, say, a psychological study through the NIH with actual patients or human subjects, you need to go through an ethics review before you go ahead and do that," said Smith.

In the tech industry, however, big data ethical guidelines are far more opaque.

"I think what's interesting about the Facebook [study] is that there's this whole new Wild West, if you like, of data coming from Internet applications, Internet services, the Internet of Things, where these practices and procedures aren't really in place yet," said Smith.

When asked if there should be an ethical framework for collecting and using data, 42% of JSM survey respondents agreed that an industry standard should be in place, while 43% said that ethics already plays "a big part" in their research.

If people feel there isn't an ethical standard in place for data collection and analysis, "then naturally they should worry about privacy issues associated with that data," Smith said. "Statisticians and data scientists have an important role to play in the practices and standards around handling and analyzing data in the world at large. I think the Facebook example should teach us a lesson, and my hope is that web and technology companies will involve data scientists more in analyzing the data that they collect."

Jeff Bertolucci is a technology journalist in Los Angeles who writes mostly for Kiplinger's Personal Finance, The Saturday Evening Post, and InformationWeek.

9/29/2014 | 2:58:33 AM
Nice try, but I don't buy the soul searching from data scientists
Ethical standards and laws are two very different things; no one ever went to jail for violating ethics.
9/22/2014 | 12:37:18 PM
Ethics should be part of research
Not surprised that this debate has come up. Big Data, the Internet of Things and so on have created new sources for significant research. Even in other areas of research where there are standards and regulations, some researchers bend the rules. In this situation, a body of standards needs to exist to be a baseline for appropriate usage of new sources of vast data.