In Ridley Scott's Blade Runner, the bounty hunter Deckard is sure he's got the movie's central dilemma figured out: if the rogue replicant (a bioengineered humanoid, indistinguishable from a human) he is hunting is useful, it's not his problem. He's wrong almost immediately, of course, and spends the rest of the film learning that the question was never really about the machines. FQxI's Hiranya Peiris, a cosmologist at the Institute of Astronomy, Cambridge University, opens her new Nature Astronomy comment piece with this scene and makes a similar move. The debate about large language models in astrophysics, she argues, isn't really about large language models.
Scientists from multiple disciplines are grappling with the rights and wrongs, uses and abuses, benefits and dangers of using large language models (LLMs) and artificial intelligence to support their work. Last year, for instance, FQxI's Hector Zenil, a computer scientist at King's College London, co-authored a review for npj Artificial Intelligence on how LLMs and generative AI are redefining the scientific method, in light of AlphaFold's Nobel-winning achievements in predicting protein folding, one of the biggest challenges in biology. In a news piece for King's College London, Zenil explained his view that while such tools can be powerful for predicting things (protein-folding patterns, say), they offer little in terms of understanding why things happen (why these particular folds occur rather than others, in this example). “This is no longer about whether AI can do science. It is about whether we understand the science that AI does — and whether that still counts as science at all,” Zenil told KCL's News Centre.
Peiris's piece reflects on the fears surrounding AI within the scientific community. Will machines replace human scientists? Her core argument is that if an LLM can replicate your scientific contribution, the problem is not the LLM. The anxiety around AI, specifically in astrophysics, she says, is a symptom of pre-existing problems in standards, scientific practice and incentive structures (see “The Perils of Peer Review” for a summary of such issues). There's an assumption in parts of the community that ideas are cheap and the field is simply rate-limited by how fast we can turn them into papers. Peiris, however, says that the flood of incremental publications isn't a sign that we have too many good ideas and not enough hands, but that execution has been substituting for thought. The hard part of being a scientist, she argues, is working out which problems are actually worth spending years on, and learning which papers not to write.
Beyond her use of machine-learning methods for research, Peiris writes that she personally uses LLMs as a sounding board of sorts, a way to pressure-test an idea or get oriented in unfamiliar territory before bringing it to a colleague. Her team also uses AI-assisted coding tools, though the resulting code goes through the same validation, review and testing as anything written by hand.
A key distinction she makes is between automating and augmenting. Automating astrophysics would mean building systems that do science without us, which is concerning not because the outputs may be wrong, but because it takes the human out of the loop of understanding (echoing Zenil, the reason science is done in the first place). Augmenting astrophysics, by contrast, would mean giving people better tools. “This is what telescopes do. This is what computers do,” she argues in the piece.
Peiris goes on to discuss how this extends to training the next generation to be more than just prompt engineers. Read her full essay for her suggestions on addressing this problem, one that she argues was established long before generative AI joined the picture: too many students have picked up the habits of production rather than the habits of thought.
Explore more:
“Large language models are not the problem,” by Hiranya Peiris: Nature Astronomy
“The Age of AI Sci,” by Hector Zenil: KCL News Centre
“Is AI Research Physics?” by Gerardo Adesso: QSpace
“Scientific production in the era of large language models,” by K. Kusemi et al.: Science
Peiris also recently appeared alongside fellow FQxI Scientific Advisory Council member Jim Al-Khalili on The Life Scientific. You can listen on BBC Radio 4, though it may only be available in the UK.