Scientific discovery is among the most demanding human activities. First, scientists must understand existing knowledge and identify a significant gap in it. Next, they must formulate a research question and design and conduct an experiment to answer it. Then, they must analyze and interpret the results of the experiment, which may raise yet another research question.
Can such a complex process be automated? Last week, Sakana AI Labs announced the creation of an “AI scientist” – an artificial intelligence system that they claim can make scientific discoveries in the field of machine learning in a fully automated way.
Using generative large language models (LLMs) like those that underpin ChatGPT and other AI chatbots, the system can brainstorm, select promising ideas, code new algorithms, plot results, and write a paper summarizing the experiment and its findings, complete with references. Sakana claims the AI tool can run the entire lifecycle of a scientific experiment at a cost of just US$15 per paper – less than the price of a scientist’s lunch.
These are big claims. Do they hold up? And even if they do, would an army of AI scientists churning out research papers at inhuman speed really be good news for science?
How a computer can “do science”
Much of science is conducted in the open, and almost all scientific knowledge has been written down somewhere (otherwise we would have no way of “knowing” it). Millions of scientific papers are freely available online, for instance in repositories such as arXiv and PubMed.
LLMs trained on this data capture the language of science and its patterns, so it is perhaps not surprising that a generative LLM can produce something that looks like a scientific paper – it has ingested plenty of examples to copy.
What is less clear is whether an AI system is capable of writing an interesting scientific paper. The key point is that good science requires novelty.
But is it interesting?
Scientists do not want to be told about things that are already known. Rather, they want to learn new things, especially new things that differ significantly from what is already known. This requires judgment about the scope and value of a contribution.
The Sakana system attempts to determine interestingness in two ways. First, it “scores” new paper ideas based on their similarity to existing research (indexed in the Semantic Scholar repository). Anything that is too similar is discarded – a sketch of this kind of filter follows below.
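To make that idea concrete, here is a minimal sketch of such a similarity-based novelty filter, assuming paper ideas and indexed abstracts have already been converted to embedding vectors. The `is_novel` function and the 0.85 cutoff are illustrative assumptions, not details of Sakana’s actual implementation.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_novel(idea: np.ndarray, prior_work: list[np.ndarray],
             threshold: float = 0.85) -> bool:
    """Keep an idea only if no indexed paper is too similar to it.

    The threshold is a hypothetical cutoff; a production system would
    tune it and query a search index instead of scanning linearly.
    """
    return all(cosine_similarity(idea, e) < threshold for e in prior_work)
```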
Second, Sakana’s system introduces a “peer review” step – another LLM is used to judge the quality and novelty of the generated paper. Again, there are plenty of examples of peer review online, on websites such as openreview.net, that can serve as a guide for critiquing a paper. LLMs have ingested these too.
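The review step follows the familiar “LLM as judge” pattern. Below is a minimal sketch, assuming a generic `complete(prompt)` function that wraps whatever chat model is available; the rubric and output format are invented for illustration and are not Sakana’s actual prompt.

```python
import json

def review_paper(paper_text: str, complete) -> dict:
    """Ask an LLM to act as a peer reviewer and return structured scores.

    `complete` is a placeholder for any text-completion call; the rubric
    below is a hypothetical example of a reviewing prompt.
    """
    prompt = (
        "You are a peer reviewer. Rate the following paper from 1 to 10 "
        "on novelty, soundness and clarity, and briefly summarize its "
        "strengths and weaknesses. Respond only with JSON using the keys "
        "'novelty', 'soundness', 'clarity' and 'summary'.\n\n"
        + paper_text
    )
    # Assumes the model complies with the JSON instruction; a robust
    # system would validate and retry on malformed output.
    return json.loads(complete(prompt))
```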
AI may be a poor judge of AI output
The response to Sakana AI’s results has been mixed. Some have described it as producing “endless scientific slop”.
Even the system’s own review of its outputs concludes that the papers are weak at best. This is likely to improve as the technology advances, but the question of whether automated scientific papers are valuable remains open.
The ability of LLMs to judge the quality of research is also an open question. My own work (soon to be published in Research Synthesis Methods) shows that LLMs are not particularly good at assessing the risk of bias in medical research studies, although this too may improve over time.
Sakana’s system automates discoveries in computational research, which is much easier than in other branches of science that require physical experiments. Sakana’s experiments are carried out in code, which is itself structured text that LLMs can learn to generate.
AI tools should support scientists, not replace them
AI researchers have been developing systems to support science for decades. Given the huge volume of published research, it can be difficult even to find publications relevant to a specific scientific question.
Specialized search tools use AI to help researchers find and summarize existing work. These include the Semantic Scholar mentioned above, but also newer systems such as Elicit, Research Rabbit, scite and Consensus.
Text mining tools like PubTator dig deeper into papers to identify key points of focus, such as specific genetic mutations and diseases and their established associations. This is especially useful for curating and organizing scientific information.
Machine learning is also used to support the synthesis and analysis of medical evidence, in tools such as RobotReviewer. Summaries that compare and contrast the claims made in papers, such as those provided by Scholarcy, help with conducting literature reviews.
All of these tools aim to help scientists do their work more efficiently, not to replace them.
AI research could exacerbate existing problems
While Sakana AI states that it does not expect the role of human scientists to diminish, the company’s vision of a “fully AI-driven scientific ecosystem” would have major implications for science.
One concern is that if AI-generated papers flood the scientific literature, future AI systems may be trained on AI output and undergo model collapse. This means their ability to innovate could increasingly stagnate.
However, the implications for science go far beyond the impact on AI science systems themselves.
There are already bad actors in science, including “paper mills” churning out fake papers. This problem will only get worse if a scientific paper can be produced with US$15 and a vague starting prompt.
The need to check a mountain of automatically generated research for errors could quickly overwhelm the capacity of actual scientists. The peer review system is arguably already broken, and feeding more research of dubious quality into it will not fix that.
Science is fundamentally based on trust. Scientists emphasize the integrity of the scientific process so that we can trust that our understanding of the world (and now, the world’s machines) is valid and improving.
A scientific ecosystem in which AI systems play a central role raises fundamental questions about the meaning and value of this process, and about how much trust we should place in AI scientists. Is this the kind of scientific ecosystem we want?