Articles

Below are articles that use LLMs in their research workflows. Use the Search option to find examples from your discipline, or examples of the specific workflow applications you may be considering.

Each entry below lists: Title, Type of Resource, Date Recorded, Open Science status, Use of LLM, Research Discipline(s), and a Description of the Resource.

Title: LLMs Model Non-WEIRD Populations: Experiments with Synthetic Cultural Agents
Type of Resource: Research Article
Date Recorded: February 9, 2025
Open Science: Preprint
Use of LLM: Data Generation
Research Discipline(s): Computer Science, Economics
Description: Despite its importance, studying economic behavior across diverse, non-WEIRD (Western, Educated, Industrialized, Rich, and Democratic) populations presents significant challenges. We address this issue by introducing a novel methodology that uses Large Language Models (LLMs) to create synthetic cultural agents (SCAs) representing these populations. We subject these SCAs to classic behavioral experiments, including the dictator and ultimatum games. Our results demonstrate substantial cross-cultural variability in experimental behavior. Notably, for populations with available data, SCAs' behaviors qualitatively resemble those of real human subjects. For unstudied populations, our method can generate novel, testable hypotheses about economic behavior. By integrating AI into experimental economics, this approach offers an effective and ethical method to pilot experiments and refine protocols for hard-to-reach populations. Our study provides a new tool for cross-cultural economic studies and demonstrates how LLMs can support experimental behavioral research.

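As a concrete illustration of the kind of prompting this methodology implies, here is a minimal sketch of a synthetic cultural agent playing the dictator game. The persona template, the `call_llm` helper, and the naive reply parsing are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: a synthetic cultural agent (SCA) playing the dictator game.
# `call_llm` is a hypothetical stand-in for any chat-completion API, and the
# persona template is an illustrative guess at the kind of prompt used.

def call_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM and return its text reply."""
    raise NotImplementedError("wire this to your LLM provider")

PERSONA_TEMPLATE = (
    "You are a {age}-year-old {occupation} living in {community}. "
    "You have been given 100 tokens. You may give any number of them to an "
    "anonymous person from your community and keep the rest. "
    "Reply with a single integer: the number of tokens you give away."
)

def dictator_offers(persona: dict, n_trials: int = 30) -> list[int]:
    """Collect repeated dictator-game offers from one synthetic agent."""
    offers = []
    for _ in range(n_trials):
        reply = call_llm(PERSONA_TEMPLATE.format(**persona))
        digits = "".join(ch for ch in reply if ch.isdigit())  # naive parsing
        if digits:
            offers.append(min(int(digits), 100))  # clamp to the endowment
    return offers

# Hypothetical usage:
# dictator_offers({"age": 34, "occupation": "fisher",
#                  "community": "a coastal village"})
```
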
Title: Exploring the potential of LLM to enhance teaching plans through teaching simulation
Type of Resource: Research Article
Date Recorded: February 9, 2025
Open Science: Open Source
Use of LLM: Data Generation
Research Discipline(s): Education
Description: The introduction of large language models (LLMs) may change future pedagogical practices. Current research mainly focuses on the use of LLMs to tutor students, while the exploration of LLMs' potential to assist teachers is limited. Taking high school mathematics as an example, we propose a method that utilizes LLMs to enhance the quality of teaching plans by guiding the LLM to simulate teacher-student interactions, generate teaching reflections, and subsequently refine the teaching plan by integrating the simulated teaching process and reflections. Human evaluation results show that this method significantly elevates the quality of the original teaching plans generated directly by the LLM. The improved teaching plans are comparable to high-quality ones crafted by human teachers across various assessment dimensions and knowledge modules. This approach provides a pre-class rehearsal simulation and ideas for teaching plan refinement, offering practical evidence for the widespread application of LLMs in teaching preparation.

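A minimal sketch of the simulate-reflect-refine loop the abstract describes might look like the following; the prompt wording and the `call_llm` helper are illustrative assumptions rather than the paper's code.

```python
# Minimal sketch of the simulate -> reflect -> refine loop, assuming a generic
# `call_llm` chat-completion helper. Prompt wording is illustrative only.

def call_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM and return its text reply."""
    raise NotImplementedError("wire this to your LLM provider")

def refine_teaching_plan(plan: str, rounds: int = 2) -> str:
    """Iteratively improve a teaching plan via simulated lessons."""
    for _ in range(rounds):
        # 1. Simulate the lesson as a teacher-student dialogue.
        dialogue = call_llm(
            "Simulate a high-school mathematics lesson as a teacher-student "
            f"dialogue that follows this teaching plan:\n\n{plan}"
        )
        # 2. Generate a teaching reflection on the simulated lesson.
        reflection = call_llm(
            "As the teacher, reflect on what worked and what confused "
            f"students in this simulated lesson:\n\n{dialogue}"
        )
        # 3. Refine the plan using the simulated process and the reflection.
        plan = call_llm(
            "Revise the teaching plan below in light of the simulated lesson "
            f"and reflection. Return only the revised plan.\n\nPlan:\n{plan}\n\n"
            f"Dialogue:\n{dialogue}\n\nReflection:\n{reflection}"
        )
    return plan
```
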
Title: Enhancing Participatory Development Research in South Asia through LLM Agents System: An Empirically-Grounded Methodological Initiative and Agenda from Field Evidence in Sri Lanka
Type of Resource: Research Article
Date Recorded: February 9, 2025
Open Science: Open Source
Use of LLM: Data Analysis, Other
Research Discipline(s): Computer Science, Languages
Description: The integration of artificial intelligence into development research methodologies offers unprecedented opportunities to address persistent challenges in participatory research, particularly in linguistically diverse regions like South Asia. Drawing on empirical implementation in Sri Lanka's Sinhala-speaking communities, this study presents a methodological framework designed to transform participatory development research in the multilingual context of Sri Lanka's flood-prone Nilwala River Basin. Moving beyond conventional translation and data collection tools, the proposed framework leverages a multi-agent system architecture to redefine how data collection, analysis, and community engagement are conducted in linguistically and culturally complex research settings. This structured, agent-based approach facilitates participatory research that is both scalable and adaptive, ensuring that community perspectives remain central to research outcomes. Field experiences underscore the immense potential of LLM-based systems in addressing long-standing issues in development research across resource-limited regions, delivering both quantitative efficiencies and qualitative improvements in inclusivity. At a broader methodological level, this research advocates for AI-driven participatory research tools that prioritize ethical considerations, cultural sensitivity, and operational efficiency. It highlights strategic pathways for deploying AI systems to reinforce community agency and equitable knowledge generation, offering insights that could inform broader research agendas across the Global South.

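The multi-agent architecture is described only at a high level, but a minimal sketch of such a pipeline, with translation, thematic-analysis, and summarisation agents chained over a field response, could look like the following; the agent roles and the `call_llm` helper are assumptions for illustration.

```python
# Minimal sketch of a chained multi-agent pipeline of the kind the abstract
# describes. Roles, prompts, and `call_llm` are illustrative assumptions.

def call_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM and return its text reply."""
    raise NotImplementedError("wire this to your LLM provider")

def make_agent(instruction: str):
    """Wrap a fixed role instruction into a single-purpose agent."""
    def run(text: str) -> str:
        return call_llm(f"{instruction}\n\n{text}")
    return run

translate = make_agent(
    "Translate this Sinhala interview response into English, preserving idioms:"
)
extract_themes = make_agent(
    "List the key themes raised in this translated response:"
)
summarise = make_agent(
    "Summarise these themes in plain language for a community feedback session:"
)

def process_response(sinhala_text: str) -> str:
    """Run one field response through the agent chain."""
    return summarise(extract_themes(translate(sinhala_text)))
```
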
Title: Literature Reviews with LLM-Based Tools
Type of Resource: Research Article
Date Recorded: February 9, 2025
Open Science: Preprint
Use of LLM: Research Design
Research Discipline(s): Business, Other
Description: The integration of large language models (LLMs) into academic research represents a potential change in how research engages with existing knowledge. While literature reviews have served as a significant means of passing on academic research, the exponential growth of output has created an unsustainable burden. No one can read it all; far too much of it is repetitive and unoriginal. The time needed to engage in meaningful fieldwork is endangered. This paper examines how LLM integration can aid research practice by automating aspects of literature synthesis, freeing up time for experiential investigation and theory development. Through analysis of emerging practices, we highlight how technological augmentation can create space for engagement with the empirical, while maintaining rigor and relevance. We demonstrate our position via an exemplary case and its analysis. We suggest that thoughtful LLM integration can address a critical tension in organizational studies: maintaining awareness of existing scholarship while fostering engagement with living organizational reality.

Title: Highlighting Case Studies in LLM Literature Review of Interdisciplinary System Science
Type of Resource: Research Article
Date Recorded: February 9, 2025
Open Science: Open Source
Use of LLM: Data Collection
Research Discipline(s): Computer Science
Description: Large Language Models (LLMs) were used to assist four Commonwealth Scientific and Industrial Research Organisation (CSIRO) researchers in performing systematic literature reviews (SLR). We evaluate the performance of LLMs for SLR tasks in these case studies. In each, we explore the impact of changing parameters on the accuracy of LLM responses. The LLM was tasked with extracting evidence from chosen academic papers to answer specific research questions. We evaluate the models' performance in faithfully reproducing quotes from the literature, and subject experts assessed the models' performance in answering the research questions. We developed a semantic text highlighting tool to facilitate expert review of LLM responses. We found that state-of-the-art LLMs were able to reproduce quotes from texts with greater than 95% accuracy and answer research questions with an accuracy of approximately 83%. We use two methods to determine the correctness of LLM responses: expert review and the cosine similarity of transformer embeddings of LLM and expert answers. The correlation between these methods ranged from 0.48 to 0.77, providing evidence that the latter is a valid metric for measuring semantic similarity.

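The embedding-based correctness check is straightforward to reproduce in outline: embed the LLM answer and the expert answer, then take their cosine similarity. The encoder named below is an assumed choice; the abstract does not specify one.

```python
# Sketch of the embedding-based correctness metric: cosine similarity between
# transformer embeddings of an LLM answer and an expert answer.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumption: any sentence encoder works

def answer_similarity(llm_answer: str, expert_answer: str) -> float:
    """Cosine similarity in [-1, 1]; higher means more semantically similar."""
    a, b = model.encode([llm_answer, expert_answer])
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Example:
# answer_similarity("The catalyst lowers activation energy.",
#                   "Activation energy is reduced by the catalyst.")
```
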
Title: Agent Laboratory: Using LLM Agents as Research Assistants
Type of Resource: Research Article, Application/Tool
Date Recorded: February 9, 2025
Open Science: Preprint
Use of LLM: Research Design, Data Collection, Data Analysis, Describing Results, Science Communication
Research Discipline(s): Computer Science
Description: Historically, scientific discovery has been a lengthy and costly process, demanding substantial time and resources from initial conception to final results. To accelerate scientific discovery, reduce research costs, and improve research quality, we introduce Agent Laboratory, an autonomous LLM-based framework capable of completing the entire research process. This framework accepts a human-provided research idea and progresses through three stages (literature review, experimentation, and report writing) to produce comprehensive research outputs, including a code repository and a research report, while enabling users to provide feedback and guidance at each stage. We deploy Agent Laboratory with various state-of-the-art LLMs and invite multiple researchers to assess its quality by participating in a survey, providing human feedback to guide the research process, and then evaluating the final paper. We found that: (1) Agent Laboratory driven by o1-preview generates the best research outcomes; (2) the generated machine learning code is able to achieve state-of-the-art performance compared to existing methods; (3) human involvement, providing feedback at each stage, significantly improves the overall quality of research; and (4) Agent Laboratory significantly reduces research expenses, achieving an 84% decrease compared to previous autonomous research methods. We hope Agent Laboratory enables researchers to allocate more effort toward creative ideation rather than low-level coding and writing, ultimately accelerating scientific discovery.

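In outline, the staged workflow with per-stage human feedback could be sketched as below. This is an illustrative outline, not the Agent Laboratory codebase; `call_llm` and the checkpoint logic are assumptions.

```python
# Minimal sketch of a three-stage pipeline (literature review, experimentation,
# report writing) with a human feedback checkpoint after each stage.

def call_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM and return its text reply."""
    raise NotImplementedError("wire this to your LLM provider")

def human_checkpoint(stage: str, artifact: str) -> str:
    """Let a human accept the stage output or steer a revision."""
    feedback = input(f"[{stage}] feedback (blank to accept): ").strip()
    if not feedback:
        return artifact
    return call_llm(
        f"Revise this {stage} output according to the feedback.\n\n"
        f"Output:\n{artifact}\n\nFeedback:\n{feedback}"
    )

def run_pipeline(idea: str) -> str:
    """Run the idea through all three stages and return the final report."""
    review = human_checkpoint(
        "literature review", call_llm(f"Survey prior work relevant to: {idea}")
    )
    experiments = human_checkpoint(
        "experimentation",
        call_llm(f"Design and report experiments for: {idea}\n\nPrior work:\n{review}"),
    )
    return human_checkpoint(
        "report writing",
        call_llm(
            f"Write a research report.\n\nIdea: {idea}\n\n"
            f"Literature review:\n{review}\n\nExperiments:\n{experiments}"
        ),
    )
```
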
Title: Beyond Likert Scales: Convergent Validity of an NLP-Based Future Self-Continuity Assessment from Verbal Data
Type of Resource: Research Article
Date Recorded: February 8, 2025
Open Science: Preprint
Use of LLM: Data Analysis
Research Discipline(s): Psychology
Description: Psychological assessment using self-report Likert items suffers from numerous inherent biases. These biases limit our capability to assess complex psychological constructs such as Future Self-Continuity (FSC), i.e., the perceived connection between one's present and future self. However, recent advances in Natural Language Processing (NLP) and Large Language Models (LLMs) have opened new possibilities for psychological assessment. In this paper, we introduce a novel method of psychological assessment, applied to measuring FSC, that uses an LLM for NLP of transcripts from self-recorded audio responses to 15 structured interview prompts developed from FSC theory and research. 164 whitelisted MTurk workers completed an online survey and interview task. Claude 3.5 Sonnet was used to process the transcripts and generate quantitative scores. The resulting FSC scores (including the total score and the similarity, vividness, and positivity components) showed significant correlations with scores on the Future Self-Continuity Questionnaire (FSCQ), a well-validated Likert item measure of FSC, supporting the new method's convergent validity. A Bland-Altman analysis indicating general agreement with standard FSCQ scores, replication using an updated Claude 3.5 Sonnet model, and the strong correlations between NLP-based FSC scores from the two models support the new assessment method's validity and robustness. This measurement approach can inform treatment planning and interventions by providing clinicians with a more authentic FSC assessment. Beyond FSC, this NLP/LLM approach can enhance psychological assessment broadly, with significant implications for research and clinical practice.

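In outline, the scoring step might look like the sketch below, which sends a transcript to Claude and parses a JSON score. The prompt, the JSON schema, the model identifier, and the assumption that the model returns bare JSON are all illustrative; the paper's actual rubric and prompts are not reproduced here.

```python
# Sketch of the transcript-scoring step using the Anthropic SDK.
import json
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def score_fsc(transcript: str) -> dict:
    """Ask Claude for 1-7 FSC component scores and parse them as JSON."""
    message = client.messages.create(
        model="claude-3-5-sonnet-20240620",  # assumed model identifier
        max_tokens=300,
        messages=[{
            "role": "user",
            "content": (
                "Rate this interview transcript for future self-continuity. "
                "Return only JSON with integer 1-7 scores for the keys "
                '"similarity", "vividness", and "positivity".\n\n' + transcript
            ),
        }],
    )
    return json.loads(message.content[0].text)  # assumes the reply is bare JSON
```
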
Title: From Assistance to Autonomy – A Researcher Study on the Potential of AI Support for Qualitative Data Analysis
Type of Resource: Research Article
Date Recorded: February 3, 2025
Open Science: Preprint
Use of LLM: Data Analysis
Research Discipline(s): Computer Science
Description: The advent of AI tools, such as Large Language Models, has introduced new possibilities for Qualitative Data Analysis (QDA), offering both opportunities and challenges. To help navigate the responsible integration of AI into QDA, we conducted semi-structured interviews with 15 HCI researchers experienced in QDA. While our participants were open to AI support in their QDA workflows, they expressed concerns about data privacy, autonomy, and the quality of AI outputs. In response, we developed a framework that spans from minimal to high AI involvement, providing tangible scenarios for integrating AI into HCI researchers' QDA practices while addressing their needs and concerns. Aligned with real-life QDA workflows, we identify potentials for AI tools in areas such as data pre-processing, researcher onboarding, or mediation. Our framework aims to provoke further discussion on the development of AI-supported QDA and to help establish community standards for their responsible use.

Title: Quantifying the use and potential benefits of artificial intelligence in scientific research
Type of Resource: Research Article
Date Recorded: January 16, 2025
Open Science: Open Source
Use of LLM: Other
Research Discipline(s): Data Science, Any Discipline
Description: The rapid advancement of artificial intelligence (AI) is poised to reshape almost every line of work. Despite enormous efforts devoted to understanding AI's economic impacts, we lack a systematic understanding of the benefits to scientific research associated with the use of AI. Here we develop a measurement framework to estimate the direct use of AI and associated benefits in science. We find that the use and benefits of AI appear widespread throughout the sciences, growing especially rapidly since 2015. However, there is a substantial gap between AI education and its application in research, highlighting a misalignment between AI expertise supply and demand. Our analysis also reveals demographic disparities, with disciplines with higher proportions of women or Black scientists reaping fewer benefits from AI, potentially exacerbating existing inequalities in science. These findings have implications for the equity and sustainability of the research enterprise, especially as the integration of AI with science continues to deepen.

Title: Use of large language models as artificial intelligence tools in academic research and publishing among global clinical researchers
Type of Resource: Research Article
Date Recorded: January 10, 2025
Open Science: Preprint
Use of LLM: Other
Research Discipline(s): Medicine
Description: With breakthroughs in Natural Language Processing and Artificial Intelligence (AI), the use of Large Language Models (LLMs) in academic research has increased tremendously. Models such as the Generative Pre-trained Transformer (GPT) are used by researchers for literature review, abstract screening, and manuscript drafting. However, these models also present the attendant challenge of providing ethically questionable scientific information. Our study provides a snapshot of global researchers' perceptions of current trends and the future impact of LLMs in research. Using a cross-sectional design, we surveyed 226 medical and paramedical researchers from 59 countries across 65 specialties, trained in the Global Clinical Scholars' Research Training certificate program of Harvard Medical School between 2020 and 2024. A majority (57.5%) of these participants practiced in an academic setting, with a median of 7 (2, 18) PubMed-indexed published articles. 198 respondents (87.6%) were aware of LLMs, and those who were aware had a higher number of publications (p < 0.001). 18.7% of aware respondents (n = 37) had previously used LLMs in publications, especially for grammatical errors and formatting (64.9%); however, a plurality (40.5%) did not acknowledge this use in their papers. 50.8% of aware respondents (n = 95) predicted an overall positive future impact of LLMs, while 32.6% were unsure of its scope. 52% of aware respondents (n = 102) believed that LLMs would have a major impact in areas such as grammatical errors and formatting (66.3%), revision and editing (57.2%), writing (57.2%), and literature review (54.2%). 58.1% of aware respondents opined that journals should allow the use of AI in research, and 78.3% believed that regulations should be put in place to avoid its abuse. Given researchers' perceptions of LLMs and the significant association between awareness of LLMs and number of published works, we emphasize the importance of developing comprehensive guidelines and an ethical framework to govern the use of AI in academic research and address the current challenges.