Journal of Applied Psychology, Vol 108(8), Aug 2023, 1277-1299; doi:10.1037/apl0001082
The present study examines the feasibility of measuring personality indirectly through an artificial intelligence (AI) chatbot. The chatbot mines textual features from users’ free-text responses collected during an online conversational interview and then uses machine learning algorithms to infer personality scores. We comprehensively examine the psychometric properties of the machine-inferred personality scores, including reliability (internal consistency, split-half, and test–retest), factorial validity, convergent and discriminant validity, and criterion-related validity. Participants were undergraduate students (n = 1,444) enrolled at a large southeastern public university in the United States who completed a self-report Big Five personality measure (IPIP-300) and engaged with the AI chatbot for approximately 20–30 min. In a subsample (n = 407), we obtained participants’ cumulative grade point averages from the University Registrar and had their peers rate their college adjustment. In an additional sample (n = 61), we obtained test–retest data. Results indicated that machine-inferred personality scores (a) had overall acceptable reliability at both the domain and facet levels, (b) yielded a factor structure comparable to that of self-reported, questionnaire-derived personality scores, (c) displayed good convergent validity but relatively poor discriminant validity (average convergent correlation = .48 vs. average correlation among machine scores of different traits = .35 in the test sample), (d) showed low criterion-related validity, and (e) exhibited incremental validity over self-reported, questionnaire-derived personality scores in some analyses. In addition, there was strong evidence for the cross-sample generalizability of the psychometric properties of the machine scores. Theoretical implications, future research directions, and practical considerations are discussed.
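To make the text-to-trait approach concrete, the following is a minimal sketch of the kind of pipeline the abstract describes, not the authors' actual implementation: TF-IDF textual features feeding a ridge regression, evaluated by the correlation between machine-inferred and self-reported scores in a held-out sample. The inputs `texts` (chatbot transcripts) and `trait_scores` (self-reported scores for one Big Five trait) are hypothetical.

```python
# Illustrative sketch only -- not the study's pipeline.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

def fit_trait_model(texts, trait_scores):
    """Fit a text-to-trait regression: TF-IDF features -> ridge regression."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, trait_scores, test_size=0.3, random_state=0)
    model = make_pipeline(
        TfidfVectorizer(min_df=2, ngram_range=(1, 2)),  # mine textual features
        Ridge(alpha=1.0))                               # shrinkage regression
    model.fit(X_train, y_train)
    # Convergent validity in the held-out sample: correlation between
    # machine-inferred and self-reported scores for the same trait.
    r = np.corrcoef(model.predict(X_test), y_test)[0, 1]
    return model, r
```

In practice, one such model would be fit per trait (or per facet), yielding the five machine-inferred domain scores whose psychometric properties the study evaluates.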
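The psychometric checks the abstract lists can likewise be sketched, again only loosely mirroring the reported analyses. Here, a Spearman–Brown-corrected split-half reliability and a multitrait comparison of average convergent (same-trait) versus discriminant (different-trait) correlations; the `item_scores`, `machine`, and `self_report` arrays are assumed inputs, not data from the study.

```python
# Illustrative psychometric checks -- a sketch, not the reported analyses.
import numpy as np

def split_half_reliability(item_scores, seed=0):
    """Spearman-Brown corrected split-half reliability for one scale.
    item_scores: (n_participants, n_items) array."""
    rng = np.random.default_rng(seed)
    items = rng.permutation(item_scores.shape[1])   # random split of items
    half_a = item_scores[:, items[::2]].sum(axis=1)
    half_b = item_scores[:, items[1::2]].sum(axis=1)
    r = np.corrcoef(half_a, half_b)[0, 1]
    return 2 * r / (1 + r)  # Spearman-Brown step-up formula

def convergent_discriminant(machine, self_report):
    """Average same-trait vs. different-trait correlations between
    machine scores and questionnaire scores, each (n_participants, 5)."""
    # Cross-correlation block: machine traits (rows) x self-report traits (cols).
    R = np.corrcoef(machine.T, self_report.T)[:5, 5:]
    convergent = np.mean(np.diag(R))                     # same trait
    discriminant = np.mean(np.abs(R[~np.eye(5, dtype=bool)]))  # different traits
    return convergent, discriminant
```

Good discriminant validity would require the same-trait correlations to clearly exceed the different-trait ones; the abstract's .48 vs. .35 gap illustrates why the authors judge discriminant validity relatively poor.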