Articles that discuss the use of LLMs in science.
Title | Type of Resource | Link to Resource | Date Recorded | Open Science | Use of LLM | Research Discipline(s) | Description of Resource |
---|---|---|---|---|---|---|---|
Ten simple rules for using large language models in science | Discussion Article | 10 Rules | September 22, 2024 | Open Source | Other | Biology, Public Health, Other | Generative artificial intelligence (AI) tools, including large language models (LLMs), are expected to radically alter the way we live and work, with as many as 300 million jobs at risk [1]. Arguably the most well-known LLM currently is GPT (generative pre-trained transformer), developed by American company OpenAI [2]. Since its release in late 2022, GPT’s chatbot interface, ChatGPT, has exploded in popularity, setting a new record for the fastest growing user base in history [3]. The appeal of GPT and other LLMs stems from their ability to effectively carry out multistep tasks and provide clear, human-like responses to complicated queries and prompts (Box 1). Unsurprisingly, this capacity is catching the eye of scientists [4]. |
Automated Social Science: Language Models as Scientist and Subjects | Discussion Article | Automated Social Science | March 11, 2024 | Open Source | Other | Other | We present an approach for automatically generating and testing, in silico, social scientific hypotheses. This automation is made possible by recent advances in large language models (LLM), but the key feature of the approach is the use of structural causal models. Structural causal models provide a language to state hypotheses, a blueprint for constructing LLM-based agents, an experimental design, and a plan for data analysis. The fitted structural causal model becomes an object available for prediction or the planning of follow-on experiments. We demonstrate the approach with several scenarios: a negotiation, a bail hearing, a job interview, and an auction. In each case, causal relationships are proposed and tested, finding evidence for some and not others. In the auction experiment, we show that the in silico simulation results closely match the predictions of auction theory, but elicited predictions of the clearing prices from an LLM are inaccurate. However, the LLM’s predictions are dramatically improved if the model can condition on the fitted structural causal model. When given a proposed structural causal model for one of the scenarios, the LLM is good at predicting the signs of estimated effects, but it cannot reliably predict the magnitudes of those effects. This suggests that social simulations give the model insight not available purely through direct elicitation. In short, the LLM knows more than it can (immediately) tell. |
Start Generating: Harnessing Generative Artificial Intelligence for Sociological Research | Discussion Article | Sociological | September 22, 2024 | Open Source | Other | Sociology | How can generative artificial intelligence (GAI) be used for sociological research? The author explores applications to the study of text and images across multiple domains, including computational, qualitative, and experimental research. Drawing upon recent research and stylized experiments with DALL·E and GPT-4, the author illustrates the potential applications of text-to-text, image-to-text, and text-to-image models for sociological research. Across these areas, GAI can make advanced computational methods more efficient, flexible, and accessible. The author also emphasizes several challenges raised by these technologies, including interpretability, transparency, reliability, reproducibility, ethics, and privacy, as well as the implications of bias and bias mitigation efforts and the trade-offs between proprietary models and open-source alternatives. When used with care, these technologies can help advance many different areas of sociological methodology, complementing and enhancing our existing toolkits. |
“ChatGPT Assists Me in My Reference List:” Exploring the Chatbot’s Potential as Citation Formatting Tool | Discussion Article | Citations | September 22, 2024 | | Science Communication, Other | Other | This inquiry unveiled the potential of ChatGPT as a viable alternative to traditional citation generators. Findings showed the substantial potential and reliability of the chatbot as a citation formatting tool. Notably, the study revealed ChatGPT’s remarkable accuracy in configuring reference citations for journal articles and books across a range of styles, including APA 7, MLA 9, IEEE, and Harvard. Furthermore, the tool demonstrated proficiency in organizing in-text citations for multiple references. Despite the commendable performance of ChatGPT, manual editing remains essential for the final verification of the references to ensure the utmost accuracy and credibility of sourced materials. |
Challenges and Applications of Large Language Models | Discussion Article | Challenges and Applications of L | August 12, 2024 | Preprint | Other | Other | Large Language Models (LLMs) went from non-existent to ubiquitous in the machine learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify the remaining challenges and already fruitful application areas. In this paper, we aim to establish a systematic set of open problems and application successes so that ML researchers can comprehend the field's current state more quickly and become productive. [with discussion of challenges related to scientific applications] |
We Have No Satisfactory Social Epistemology of AI-Based Science | Discussion Article | No Satisfactory Social Epistemology | June 9, 2024 | Open Source | Other | Philosophy | In the social epistemology of scientific knowledge, it is largely accepted that relationships of trust, not just reliance, are necessary in contemporary collaborative science characterised by relationships of opaque epistemic dependence. Such relationships of trust are taken to be possible only between agents who can be held accountable for their actions. But today, knowledge production in many fields makes use of AI applications that are epistemically opaque in an essential manner. This creates a problem for the social epistemology of scientific knowledge, as scientists are now epistemically dependent on AI applications that are not agents, and therefore not appropriate candidates for trust. |
Science Based on Artificial Intelligence Need not Pose a Social Epistemological Problem | Discussion Article | Need not Pose a Social Epistemological Problem | June 9, 2024 | Preprint | Other | Philosophy | It has been argued that our currently most satisfactory social epistemology of science can’t account for science that is based on artificial intelligence (AI) because this social epistemology requires trust between scientists who can take full responsibility for the research tools they use, and scientists can’t take full responsibility for the AI tools they use since these systems are epistemically opaque. I think this argument overlooks that much AI-based science can be done without opaque models, and that agents can take full responsibility for the systems they use even if these systems are opaque. Requiring that an agent fully understand how a system works is an untenably strong condition for that agent to take full responsibility for the system and risks absolving AI developers from responsibility for their products. AI-based science need not create trust-related social epistemological problems if we keep in mind that what makes both individual scientists and their use of AI systems trustworthy isn’t full transparency of the internal processing but their adherence to social and institutional norms that ensure that scientific claims can be trusted. |
Living with Uncertainty: Full Transparency of AI isn’t Needed for Epistemic Trust in AI-based Science | Discussion Article | Living with Uncertainty | June 9, 2024 | Preprint | Other | Philosophy | Can AI developers be held epistemically responsible for the processing of their AI systems when these systems are epistemically opaque? And can explainable AI (XAI) provide public justificatory reasons for opaque AI systems’ outputs? Koskinen (2024) gives negative answers to both questions. Here, I respond to her and argue for affirmative answers. More generally, I suggest that when considering people’s uncertainty about the factors causally determining an opaque AI’s output, it might be worth keeping in mind that a degree of uncertainty about conclusions is inevitable even in entirely human-based empirical science because in induction there’s always a risk of getting it wrong. Keeping this in mind may help appreciate that requiring full transparency from AI systems before epistemically trusting their outputs might be unusually (and potentially overly) demanding. |
The illusion of artificial inclusion | Discussion Article | artificial inclusion | March 25, 2024 | Preprint | Data Collection, Data Generation | Computer Science | Human participants play a central role in the development of modern artificial intelligence (AI) technology, in psychological science, and in user research. Recent advances in generative AI have attracted growing interest in the possibility of replacing human participants in these domains with AI surrogates. We survey several such "substitution proposals" to better understand the arguments for and against substituting human participants with modern generative AI. Our scoping review indicates that the recent wave of these proposals is motivated by goals such as reducing the costs of research and development work and increasing the diversity of collected data. However, these proposals ignore and ultimately conflict with foundational values of work with human participants: representation, inclusion, and understanding. This paper critically examines the principles and goals underlying human participation to help chart out paths for future work that truly centers and empowers participants. |
Generative AI Can Supercharge Your Academic Research | Discussion Article, Use Case Example | Using LLM in Research Process | March 19, 2024 | Open Source | Other | Other | Conducting relevant scholarly research can be a struggle. Educators must employ innovative research methods, carefully analyze complex data, and then master the art of writing clearly, all while keeping the interest of a broad audience in mind. Generative AI is revolutionizing this sometimes tedious aspect of academia by providing sophisticated tools to help educators navigate and elevate their research. But there are concerns, too. AI’s capabilities are rapidly expanding into areas that were once considered exclusive to humans, like creativity and ingenuity. This could lead to improved productivity, but it also raises questions about originality, data manipulation, and credibility in research. With a simple prompt, AI can easily generate falsified datasets, mimic others’ research, and avoid plagiarism detection. [4 how-to tutorials follow] |