Big Data Fakers: 5 Warning Signs - InformationWeek

12:52 PM
Mark E. Johnson

Data falsification at research institutions to make results look better is nothing new. Here's what it can teach us about misuse of big data in business.

Data fabrication and falsification pose a major problem in academic research, especially for projects funded by government agencies. Large fines and moratoria for researchers await those individuals and institutions caught cheating. The extent to which this problem also occurs in the amorphous world of big data is difficult to assess, but worth evaluating given the embarrassments in academia and the likelihood that motivations to cheat are universal.

Universities are increasingly aware of the problem, and their compliance offices are taking aggressive steps to show funding agencies that they are handling it proactively.

At the University of Central Florida, in response to a request from senior management for a seminar on data fabrication and falsification, I developed a two-hour module addressing scientific misconduct and compensatory measures. Graduate students are required to attend the seminar to be officially admitted to Ph.D. candidacy.

Although I suspect InformationWeek readers are better informed than our Ph.D. students about data fabrication and falsification scandals, I thought I'd share some of the preliminary conclusions from my seminar on academia's dealings with data misconduct.

Here are some of the most egregious cases I came across. We'll start with five all-star perps, researchers who have made an embarrassing name for themselves by falsifying or misrepresenting data. Then we'll move on to five types of big-data people or scenarios that should make you suspicious enough to do some additional digging.

1. Eric "Massage Muscles not Data" Poehlman.

This University of Vermont kinesiologist was the first researcher to earn a federal prison term -- 366 days -- owing to extensive data fabrications. If the data did not support his hypothesis, he changed the data to suit his purposes. Credit should be given to his graduate student/technician Walter DeNino, who had the courage and fortitude to question the honesty of his supervisor's analyses. Poehlman cited the need to fund his lab as his motivation for tampering with the data.

2. Yoshitaka "Retracto" Fujii.

Fujii, an anesthesiologist at Toho University, likely holds the all-time record for retracted papers, with 172 found to be bogus by an expert panel and thus in various stages of retraction. The panel found that 126 of his randomized controlled studies -- double blind, no less -- "were totally fabricated." Some of his co-authors were in fact unaware that they were even co-authors because he forged their signatures.

3. Dipak "Sommelier" Das.

Das, a researcher at the Cardiovascular Research Center at the University of Connecticut, avoided detection for many years because the results of his studies -- a glass of red wine per day is good for health -- were so comforting. Who wanted to overturn this result? He eventually was caught tinkering with Western blots, a type of figure for identifying proteins. Das unsuccessfully tried to transfer the blame to his students, one of whom admitted that he changed a figure the way Das wanted him to.

4. Diederik "Media Dude" Stapel.

This Tilburg University researcher studied human phenomena of great topical interest -- bias and stereotypes -- leading to numerous interviews with the mainstream media regarding his findings. Unfortunately, because he was the sole proprietor of his data, much of it faked from his office, it took years before his falsehoods were discovered.

5. Eric "Not So" Smart.

For at least 10 years, Smart falsified data in grant proposals and publications in his areas, cardiovascular disease and diabetes. A key problem area was again Western blots, and he also reported results on genetically engineered mice -- "knockout" mice -- that did not exist. Some of these publications garnered more than 100 citations, and he drew funding to the University of Kentucky to the tune of $8 million. Smart resigned from the university and evidently works now as a science teacher in the Lexington area.

These are just five of the bad actors; there may be many more world-class data fabricators or manipulators we don't know about. The Department of Health and Human Services maintains a list that currently has 43 individuals with active administrative actions against them, a data falsification wall of shame if you will. Publicizing the guilty parties, their crimes and the corresponding penalties is in stark contrast to the old days of handling data fraud cases internally and quietly -- and ineffectively.
