Princeton-led team sounds alarm: AI poses risks to scientific integrity

AI is transforming scientific research, but without proper guidance it could do more harm than good.

This is the pointed conclusion of a new paper published in Science Advances by an interdisciplinary team of 19 researchers led by computer scientists Arvind Narayanan and Sayash Kapoor of Princeton University.

The team argues that misuse of machine learning across scientific disciplines is fueling a reproducibility crisis that threatens to undermine the foundations of science.

“As we move from traditional statistical methods to machine learning methods, there are a far greater variety of ways to shoot yourself in the foot,” said Narayanan, who directs the Princeton Center for Information Technology Policy.

“If we don’t take action to improve our scientific and reporting standards in the field of machine learning-based science, we risk not just one discipline, but many different scientific disciplines rediscovering these crises one after another.”

The problem, according to the authors, is that machine learning has been rapidly adopted in virtually every scientific field, often without clear standards to ensure the integrity and reproducibility of results.

They emphasize that thousands of papers using flawed machine learning methods have already been published.

But the Princeton-led team says there is still time to avert this looming crisis. They have put forward a straightforward checklist of best practices that, if widely adopted, could ensure the reliability of machine learning in science.

The checklist, called REFORMS (Recommendations for Machine-Learning-based Science), consists of 32 questions in eight key areas:

  1. Study objectives: Clearly state what scientific claim is being made and how machine learning is used to support it. Justify the choice of machine learning over traditional statistical methods.
  2. Computational reproducibility: Provide the code, data, computing environment specifications, documentation, and a reproduction script necessary for others to independently reproduce the study’s results.
  3. Data quality: Document the data sources, sampling frame, outcome variables, sample size, and amount of missing data. Justify that the dataset is appropriate for, and representative of, the scientific question.
  4. Data preprocessing: Report how the data was cleaned, transformed, and split into training and test sets. Provide a justification for any excluded data.
  5. Modeling: Describe and justify all models tried, the method used to select the final models, and the hyperparameter tuning process. Compare performance to appropriate baselines.
  6. Data leakage: Ensure that the modeling process did not inadvertently use information from the test data and that input features do not encode the outcome (see the sketch after this list).
  7. Metrics and uncertainty: Report all performance metrics used, with appropriate measures of uncertainty, and justify that they fit the scientific claim.
  8. Generalizability and limitations: Describe the conditions under which the findings are expected to hold and the limitations of the study’s scope.
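To make the data-leakage point concrete, here is a minimal illustrative sketch in Python, using scikit-learn on synthetic data (a hypothetical example, not code from the REFORMS paper). Fitting a preprocessing step on the full dataset before splitting lets test-set statistics bleed into training; fitting it on the training split alone does not.

```python
# Illustrative sketch of the data-leakage pitfall from item 6 of the
# checklist (hypothetical example, not code from the REFORMS paper).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                      # synthetic features
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)

# LEAKY: the scaler is fit on ALL rows before the split, so statistics
# from the future test set bleed into the training pipeline.
X_leaky = StandardScaler().fit_transform(X)
Xtr, Xte, ytr, yte = train_test_split(X_leaky, y, random_state=0)
leaky_acc = LogisticRegression().fit(Xtr, ytr).score(Xte, yte)

# CORRECT: split first; fit preprocessing on the training split only,
# then apply the already-fitted transform to the held-out test split.
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(Xtr)
clean_acc = (LogisticRegression()
             .fit(scaler.transform(Xtr), ytr)
             .score(scaler.transform(Xte), yte))

print(f"leaky estimate: {leaky_acc:.3f}  honest estimate: {clean_acc:.3f}")
```

With a benign transform like standardization the gap may be small, but with target-dependent steps such as feature selection, imputation, or oversampling, fitting on the full dataset can inflate reported performance dramatically.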

“This is a scientific problem with systematic solutions,” explains Kapoor.

The costs of getting it wrong, however, could be immense. Flawed science can derail promising research, discourage researchers, and undermine public trust in science.

Previous research, such as Nature’s large-scale survey of scientists on generative AI in science, suggested that the continued integration of AI into scientific workflows is inevitable. Participants highlighted numerous benefits: 66% said AI enables faster data processing, 58% believed it improves computations, and 55% said it saves time and money.

However, 53% felt the results might not be reproducible, 58% feared bias, and 55% believed AI could enable fraudulent research.

Ultimately, like any tool, AI is only as safe and effective as the people behind it. Careless use, even when unintentional, can mislead science.

We have already seen evidence of this: researchers published a paper with nonsensical AI-generated diagrams in the journal Frontiers – a rat with giant testicles, no less. Funny, but it showed that peer review may fail to catch even obvious uses of AI.

The new guidelines aim to “keep honest people honest,” as Narayanan put it.

Widespread adoption by researchers, reviewers, and journals could set a new standard for scientific integrity in the age of AI.

However, reaching consensus will be difficult, especially since the reproducibility crisis is already flying under the radar.
