Welcome to the Nexus of Ethics, Psychology, Morality, Philosophy and Health Care

Saturday, September 18, 2021

Fraudulent data raise questions about superstar honesty researcher

Cathleen O'Grady
Sciencemag.com
Originally posted 24 Aug 21

Here is an excerpt:

Some time later, a group of anonymous researchers downloaded those data, according to last week’s post on Data Colada. A simple look at the participants’ mileage distribution revealed something very suspicious. Other data sets of people’s driving distances show a bell curve, with some people driving a lot, a few very little, and most somewhere in the middle. In the 2012 study, there was an unusually equal spread: Roughly the same number of people drove every distance between 0 and 50,000 miles. “I was flabbergasted,” says the researcher who made the discovery. (They spoke to Science on condition of anonymity because of fears for their career.)
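The contrast between a bell-shaped and an evenly spread distribution can be illustrated with a small sketch. This is not the researchers' actual analysis: the data are simulated, and the mean, spread, sample size, and bin count are made-up values chosen only for illustration. The helper names `bin_counts` and `flatness_ratio` are hypothetical.

```python
import random

def bin_counts(values, low, high, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if low <= v < high:
            # min() guards against a floating-point edge case at the top bin.
            counts[min(int((v - low) / width), n_bins - 1)] += 1
    return counts

def flatness_ratio(counts):
    """Fullest bin divided by emptiest bin; a value near 1 means an even spread."""
    return max(counts) / max(min(counts), 1)

rng = random.Random(0)
# Assumption: bell-shaped distances with an invented mean and spread.
bell = [rng.gauss(25_000, 7_000) for _ in range(10_000)]
# Assumption: distances spread evenly between 0 and 50,000 miles.
flat = [rng.uniform(0, 50_000) for _ in range(10_000)]

bell_counts = bin_counts(bell, 0, 50_000, 10)
flat_counts = bin_counts(flat, 0, 50_000, 10)
print("bell-shaped:", bell_counts, round(flatness_ratio(bell_counts), 1))
print("even spread:", flat_counts, round(flatness_ratio(flat_counts), 1))
```

In the bell-shaped sample the middle bins hold far more drivers than the outer ones, while in the evenly spread sample every bin holds roughly the same number, which is the pattern the excerpt describes as suspicious.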

Worrying that PNAS would not investigate the issue thoroughly, the whistleblower contacted the Data Colada bloggers instead, who conducted a follow-up review that convinced them the field study results were statistically impossible.

For example, a set of odometer readings provided by customers when they first signed up for insurance, apparently real, was duplicated to suggest the study had twice as many participants, with random numbers between one and 1000 added to the original mileages to disguise the deceit. In the spreadsheet, the original figures appeared in the font Calibri, but each had a close twin in another font, Cambria, with the same number of cars listed on the policy, and odometer readings within 1000 miles of the original. In 1 million simulated versions of the experiment, the same kind of similarity appeared not a single time, Simmons, Nelson, and Simonsohn found. “These data are not just excessively similar,” they write. “They are impossibly similar.”
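The "close twin" pattern described above can be sketched in a few lines. This is a simplified illustration, not the Data Colada procedure: the rows are simulated, the sample size, car counts, and mileage range are invented, and `count_close_twins` is a hypothetical helper. It only shows why copying rows and adding a small random number leaves a signature that independently drawn rows rarely match in full.

```python
import random

def count_close_twins(group_a, group_b, max_gap=1000):
    """Count rows in group_a that have a row in group_b with the same
    number of cars and an odometer reading 0..max_gap miles higher."""
    twins = 0
    for cars, miles in group_a:
        if any(c == cars and 0 <= m - miles <= max_gap for c, m in group_b):
            twins += 1
    return twins

rng = random.Random(1)
n = 100  # assumption: a small sample to keep the sketch fast

# Hypothetical "original" rows: (cars on policy, odometer reading).
originals = [(rng.randint(1, 4), rng.randint(0, 200_000)) for _ in range(n)]
# Fabricated copies: same car count, mileage plus a random 1..1000.
copies = [(cars, miles + rng.randint(1, 1000)) for cars, miles in originals]
# Independent rows drawn the same way, for comparison.
independent = [(rng.randint(1, 4), rng.randint(0, 200_000)) for _ in range(n)]

twins_in_copies = count_close_twins(originals, copies)
twins_in_independent = count_close_twins(originals, independent)
print("twins among fabricated copies:", twins_in_copies)   # every row has one
print("twins among independent rows:", twins_in_independent)  # chance matches only
```

Every original row finds a twin among the fabricated copies by construction, whereas only a minority of rows find one by chance among independent rows. Repeating the independent draw many times, as the excerpt describes for 1 million simulated versions, would show how rarely chance alone produces a twin for every row.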

Ariely calls the analysis “damning” and “clear beyond doubt.” He says he has requested a retraction, as have his co-authors, separately. “We are aware of the situation and are in communication with the authors,” PNAS Editorial Ethics Manager Yael Fitzpatrick said in a statement to Science.

Three of the authors say they were only involved in the two lab studies reported in the paper; a fourth, Boston University behavioral economist Nina Mazar, forwarded the Data Colada investigators a 16 February 2011 email from Ariely with an attached Excel file that contains the problems identified in the blog post. Its metadata suggest Ariely had created the file 3 days earlier.

Ariely tells Science he made a mistake in not checking the data he received from the insurance company, and that he no longer has the company’s original file. He says Duke’s integrity office told him the university’s IT department does not have email records from that long ago. His contacts at the insurance company no longer work there, Ariely adds, but he is seeking someone at the company who could find archived emails or files that could clear his name. His publication of the full data set last year showed he was unaware of any problems with it, he says: “I’m not an idiot. This is a very easy fraud to catch.”