LLM in Science

Resource Description

Title
Utilizing Machine Learning Algorithms Trained on AI-generated Synthetic Participant Recent Music-Listening Activity in Predicting Big Five Personality Traits
Description of Resource
The recent rise of publicly available artificial intelligence (AI) tools such as ChatGPT has raised a plethora of questions among users and skeptics alike. One major question asks, "Has AI gained the ability to indistinguishably mimic the psychology of its organic, human counterpart?" Because music is known to be a positive predictor of personality traits, owing to the individuality of personal preference, in this paper we use machine learning (ML) algorithms to analyze how well the Big Five personality traits (Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism) of AI-generated, or "synthetic", participants can be predicted from their recent music listening activity and their motivations for listening to music. Recent music listening history for the synthetic participants is generated using ChatGPT, and the corresponding audio features for the songs (beats per minute, danceability, instrumentalness, happiness, etc.) are derived via the Spotify Application Programming Interface (API). This study will also administer the Uses of Music Inventory to account for synthetic participants' motivations for listening to music: emotional, cognitive, and background.
Scaler-model combinations will be trained and tested on the dataset to identify the predictions with the lowest mean absolute error, using ML models such as Random Forest, Decision Tree, K-Nearest Neighbors, Logistic Regression, and Support Vector Machine. Both regression (continuous numeric value) and classification (Likert scale option) prediction methods will be used. An Exploratory Factor Analysis (EFA) will be conducted on the audio features to find a latent representation of the dataset, on which the ML models will also be trained and tested.
A full literature review indicated that this is the first study to use both Spotify API data (rather than self-reported music preference) and machine learning (in addition to traditional statistical tests and regression models) to predict the personality of a synthetic college student demographic. The findings of this study show that ChatGPT struggles to mimic the diverse and complex nature of human personality psychology and music taste. This paper is a pilot study for a broader ongoing investigation in which the findings for synthetic participants are compared with those for real college students who complete the same inventories; data collection for the real-student sample is ongoing.
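
The scaler-model comparison described above can be illustrated with a minimal sketch. It is not the study's code: it assumes a pure-Python K-Nearest Neighbors regressor and a mean-prediction baseline in place of the full model list, three hypothetical scaling options, and randomly generated placeholder data standing in for audio features and a 1-5 trait score. All function and variable names are illustrative assumptions.

```python
import math
import random

def identity_scale(train, test):
    """No scaling; a reference point for the two scalers below."""
    return train, test

def standard_scale(train, test):
    """Zero mean and unit variance per column, using training statistics only."""
    cols = list(zip(*train))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    def apply(rows):
        return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]
    return apply(train), apply(test)

def minmax_scale(train, test):
    """Map each column's training range onto [0, 1]."""
    cols = list(zip(*train))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    def apply(rows):
        return [[(v - lo) / sp for v, lo, sp in zip(r, lows, spans)] for r in rows]
    return apply(train), apply(test)

def knn_regress(train_X, train_y, test_X, k=5):
    """Predict the mean target of the k nearest training rows (Euclidean distance)."""
    preds = []
    for x in test_X:
        nearest = sorted(range(len(train_X)),
                         key=lambda i: math.dist(x, train_X[i]))[:k]
        preds.append(sum(train_y[i] for i in nearest) / k)
    return preds

def mean_baseline(train_X, train_y, test_X):
    """Always predict the training mean; unaffected by feature scaling."""
    mean = sum(train_y) / len(train_y)
    return [mean] * len(test_X)

def mean_absolute_error(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def compare(train_X, train_y, test_X, test_y, scalers, models):
    """Return the test MAE for every (scaler, model) pair."""
    scores = {}
    for s_name, scale in scalers.items():
        tr, te = scale(train_X, test_X)
        for m_name, model in models.items():
            scores[(s_name, m_name)] = mean_absolute_error(
                test_y, model(tr, train_y, te))
    return scores

# Placeholder data (NOT study data): tempo in BPM, danceability, happiness,
# and a made-up 1-5 trait score loosely tied to danceability.
rng = random.Random(0)
X = [[rng.uniform(60, 180), rng.random(), rng.random()] for _ in range(120)]
y = [min(5.0, max(1.0, 1 + 4 * row[1] + rng.gauss(0, 0.3))) for row in X]
train_X, test_X, train_y, test_y = X[:90], X[90:], y[:90], y[90:]

SCALERS = {"none": identity_scale, "standard": standard_scale, "minmax": minmax_scale}
MODELS = {"knn": knn_regress, "mean": mean_baseline}

scores = compare(train_X, train_y, test_X, test_y, SCALERS, MODELS)
best = min(scores, key=scores.get)  # combination with the least MAE
print(best, round(scores[best], 3))
```

Fitting each scaler on the training split only, then applying it to the test split, keeps test information out of the comparison. The same loop extends to other models by adding entries to `MODELS`.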
Type of Resource
Research Article
Research Discipline(s)
Engineering, Psychology
Open Science
Preprint, Open Source
Use of LLM
Data Generation