A curated list of articles discussing the use of large language models (LLMs) in scientific research.
Title | Type of Resource | Link to Resource | Date Recorded | Open Science | Use of LLM | Research Discipline(s) | Description of Resource |
---|---|---|---|---|---|---|---|
Why and how to embrace AI such as ChatGPT in your academic life | Research Article, Documentation, Tutorial w/o Code, Application/Tool, Discussion Article, Use Case Example | Why and how to use AI in science | November 10, 2024 | Preprint | Research Design, Data Collection, Data Cleaning/Preparation, Data Generation, Dataset Joining, Data Analysis, Describing Results, Web Scraping, Science Communication, Other | Any Discipline | Generative artificial intelligence (AI), including large language models (LLMs), is poised to transform scientific research, enabling researchers to elevate their research productivity. This article presents a how-to guide for employing LLMs in academic settings, focusing on their unique strengths, constraints and implications through the lens of philosophy of science and epistemology. Using ChatGPT as a case study, I identify and elaborate on three attributes contributing to its effectiveness—intelligence, versatility and collaboration—accompanied by tips on crafting effective prompts, practical use cases and a living resource online (https://osf.io/8vpwu/). Next, I evaluate the limitations of generative AI and its implications for ethical use, equality and education. Regarding ethical and responsible use, I argue from technical and epistemic standpoints that there is no need to restrict the scope or nature of AI assistance, provided that its use is transparently disclosed. A pressing challenge, however, lies in detecting fake research, which can be mitigated by embracing open science practices, such as transparent peer review and sharing data, code and materials. Addressing equality, I contend that while generative AI may promote equality for some, it may simultaneously exacerbate disparities for others—an issue with potentially significant yet unclear ramifications as it unfolds. Lastly, I consider the implications for education, advocating for active engagement with LLMs and cultivating students' critical thinking and analytical skills. The how-to guide seeks to empower researchers with the knowledge and resources necessary to effectively harness generative AI while navigating the complex ethical dilemmas intrinsic to its application. |
Generative AI for Economic Research: LLMs Learn to Collaborate and Reason | Discussion Article, Use Case Example | Econ Research | November 26, 2024 | Open Source | Other | Economics | Large language models (LLMs) have seen remarkable progress in speed, cost efficiency, accuracy, and the capacity to process larger amounts of text over the past year. This article is a practical guide to update economists on how to use these advancements in their research. The main innovations covered are (i) new reasoning capabilities, (ii) novel workspaces for interactive LLM collaboration such as Claude's Artifacts, ChatGPT's Canvas or Microsoft's Copilot, and (iii) recent improvements in LLM-powered internet search. Incorporating these capabilities in their work allows economists to achieve significant productivity gains. Additionally, I highlight new use cases in promoting research, such as automatically generated blog posts, presentation slides and interviews as well as podcasts via Google's NotebookLM. |
AI-Empowered Human Research Integrating Brain Science and Social Sciences Insights | Discussion Article | Three Collaboration Models | November 21, 2024 | Preprint | Research Design, Other | Psychology | This paper explores the transformative role of artificial intelligence (AI) in enhancing scientific research, particularly in the fields of brain science and social sciences. We analyze the fundamental aspects of human research and argue that it is high time for researchers to transition to human-AI joint research. Building upon this foundation, we propose two innovative research paradigms of human-AI joint research: "AI-Brain Science Research Paradigm" and "AI-Social Sciences Research Paradigm". In these paradigms, we introduce three human-AI collaboration models: AI as a research tool (ART), AI as a research assistant (ARA), and AI as a research participant (ARP). Furthermore, we outline the methods for conducting human-AI joint research. This paper seeks to redefine the collaborative interactions between human researchers and AI systems, setting the stage for future research directions and sparking innovation in this interdisciplinary field. |
Artificial intelligence, machine learning, and big data: Improvements to the science of people at work and applications to practice | Discussion Article | Personnel | November 20, 2024 | | Research Design, Other | Business, Psychology | Currently, in the organizational research community, artificial intelligence (AI), machine learning (ML), and big data techniques are being vigorously explored as a set of modern-day approaches contributing to a multidisciplinary science of people at work. This paper discusses more specifically how these sophisticated technologies, methods, and data might together advance the science of people at work through various routes, including improving theory and knowledge, construct measurements, and predicting real-world outcomes. Inspired by the four articles in the current special issue highlighting several of these aspects in essential ways, we also share other possibilities for future organizational research. In addition, we indicate many key practical, ethical, and institutional challenges with research involving AI/ML and big data (i.e., data accessibility, methodological skill gaps, data transparency, privacy, reproducibility, generalizability, and interpretability). Taken together, the opportunities and challenges that lie ahead in the areas of AI and ML promise to reshape organizational research and practice in many exciting and impactful ways. |
A Primer on Word Embeddings: AI Techniques for Text Analysis in Social Work | Discussion Article | Embeddings in Social Work | November 12, 2024 | Preprint | Data Analysis | Psychology, Sociology, Other | Word embeddings represent a transformative technology for analyzing text data in social work research, offering sophisticated tools for understanding case notes, policy documents, research literature, and other text-based materials. This methodological paper introduces word embeddings to social work researchers, explaining how these mathematical representations capture meaning and relationships in text data more effectively than traditional keyword-based approaches. We discuss fundamental concepts, technical foundations, and practical applications, including semantic search, clustering, and retrieval augmented generation. The paper demonstrates how embeddings can enhance research workflows through concrete examples from social work practice, such as analyzing case notes for housing instability patterns and comparing social work licensing examinations across languages. While highlighting the potential of embeddings for advancing social work research, we acknowledge limitations including information loss, training data constraints, and potential biases. We conclude that successfully implementing embedding technologies in social work requires developing domain-specific models, creating accessible tools, and establishing best practices aligned with social work's ethical principles. This integration can enhance our ability to analyze complex patterns in text data while supporting more effective services and interventions. |
The Problems of LLM-generated Data in Social Science Research | Discussion Article | Problems with LLM Data | November 10, 2024 | Open Source | Data Generation | Sociology, Other | Beyond being used as fast and cheap annotators for otherwise complex classification tasks, LLMs have seen a growing adoption for generating synthetic data for social science and design research. Researchers have used LLM-generated data for data augmentation and prototyping, as well as for direct analysis where LLMs acted as proxies for real human subjects. LLM-based synthetic data build on fundamentally different epistemological assumptions than previous synthetically generated data and are justified by a different set of considerations. In this essay, we explore the various ways in which LLMs have been used to generate research data and consider the underlying epistemological (and accompanying methodological) assumptions. We challenge some of the assumptions made about LLM-generated data, and we highlight the main challenges that social sciences and humanities need to address if they want to adopt LLMs as synthetic data generators. |
12 Best Practices for Leveraging Generative AI in Experimental Research | Discussion Article | Best Practices | October 21, 2024 | | Other | Economics | We provide twelve best practices and discuss how each practice can help researchers accurately, credibly, and ethically use Generative AI (GenAI) to enhance experimental research. We split the twelve practices into four areas. First, in the pre-treatment stage, we discuss how GenAI can aid in pre-registration procedures, data privacy concerns, and ethical considerations specific to GenAI usage. Second, in the design and implementation stage, we focus on GenAI's role in identifying new channels of variation, piloting and documentation, and upholding the four exclusion restrictions. Third, in the analysis stage, we explore how prompting and training set bias can impact results as well as necessary steps to ensure replicability. Finally, we discuss forward-looking best practices that are likely to gain importance as GenAI evolves. |
The why, what, and how of AI-based coding in scientific research | Discussion Article | Coding | October 4, 2024 | Preprint | Other | Computer Science | Computer programming (coding) is indispensable for researchers across disciplines, yet it remains challenging to learn and time-consuming to carry out. Generative AI, particularly large language models (LLMs), has the potential to transform coding into intuitive conversations, but best practices and effective workflows are only emerging. We dissect AI-based coding through three key lenses: the nature and role of LLMs in coding (why), six types of coding assistance they provide (what), and a five-step workflow in action with practical implementation strategies (how). Additionally, we address the limitations and future outlook of AI in coding. By offering actionable insights, this framework helps to guide researchers in effectively leveraging AI to enhance coding practices and education, accelerating scientific progress. |
Generative AI in Academic Research: Perspectives and Cultural Norms | Discussion Article, Other | Cornell | September 23, 2024 | Open Source | Other | Other | This report offers perspectives and practical guidelines to the Cornell community, specifically on the use of Generative Artificial Intelligence (GenAI) in the practice and dissemination of academic research. As emphasized in the charge to a Cornell task force representing input across all campuses, the report aims to establish the initial set of perspectives and cultural norms for Cornell researchers, research team leaders, and research administration staff. It is meant as internal advice rather than a set of binding rules. As GenAI policies and guardrails are rapidly evolving, we stress the importance of staying current with the latest developments, and of thoughtfully updating procedures and rules governing the use of GenAI tools in research over time. This report was developed within the same 12-month period in which GenAI became available to a much wider population of researchers (and citizens) than the AI specialists who help create such tools. While the Cornell community is the intended audience, this report is publicly available as a resource for other research communities to use or adapt. No endorsement of specific tools is implied, but specific examples are referenced to illustrate concepts. |
ChatGPT is a Remarkable Tool—For Experts | Discussion Article | Experts | September 23, 2024 | Open Source | Other | Other | This paper investigates the capabilities of ChatGPT as an automated assistant in diverse domains, including scientific writing, mathematics, education, programming, and healthcare. We explore the potential of ChatGPT to enhance productivity, streamline problem-solving processes, and improve writing style. Furthermore, we highlight the potential risks associated with excessive reliance on ChatGPT in these fields. These limitations encompass factors like incorrect and fictitious responses, inaccuracies in code, limited logical reasoning abilities, overconfidence, and critical ethical concerns of copyright and privacy violation. We outline areas and objectives where ChatGPT proves beneficial, applications where it should be used judiciously, and scenarios where its reliability may be limited. In light of observed limitations, and given that the tool's fundamental errors may pose a special challenge for non-experts, ChatGPT should be used with a strategic methodology. By drawing from comprehensive experimental studies, we offer methods and flowcharts for effectively using ChatGPT. Our recommendations emphasize iterative interaction with ChatGPT and independent verification of its outputs. Considering the importance of utilizing ChatGPT judiciously and with expertise, we recommend its usage for experts who are well-versed in the respective [research] domains. [https://direct.mit.edu/view-large/figure/4705485/dint_a_00235.figure.12.jpg] |
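The word-embeddings primer above (A Primer on Word Embeddings) describes how embeddings capture meaning as vectors and enable semantic search via similarity between those vectors. The idea can be sketched in a few lines of plain Python. The vocabulary and 3-dimensional vectors below are hypothetical values chosen only for illustration; real embedding models (e.g., word2vec or sentence-transformer models) learn vectors with hundreds of dimensions from large corpora.

```python
import math

# Toy "embeddings": hypothetical 3-d vectors for a tiny vocabulary.
# In practice these would come from a trained embedding model.
EMBEDDINGS = {
    "eviction": [0.9, 0.1, 0.0],
    "homeless": [0.8, 0.2, 0.1],
    "housing":  [0.7, 0.3, 0.0],
    "diploma":  [0.1, 0.9, 0.2],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def semantic_search(query, k=2):
    """Rank the other vocabulary terms by similarity to the query term."""
    q = EMBEDDINGS[query]
    ranked = sorted(
        ((term, cosine_similarity(q, vec))
         for term, vec in EMBEDDINGS.items() if term != query),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]

# Terms related to housing instability rank above unrelated terms,
# even though none of them share characters with the query.
print(semantic_search("eviction"))
```

This is the property the primer contrasts with keyword matching: "eviction" retrieves "homeless" and "housing" by proximity in the vector space, with no string overlap required.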