Beyond the Checkbox: Leveraging AI Chatbots for Inclusive Demographic Data Collection
| dc.contributor.author | Chekili, Amel | en |
| dc.contributor.committeechair | Hernandez, Jorge Ivan | en |
| dc.contributor.committeemember | Diana, Rachel A. | en |
| dc.contributor.committeemember | Hickman, Louis | en |
| dc.contributor.committeemember | Hsu, Ning | en |
| dc.contributor.department | Psychology | en |
| dc.date.accessioned | 2025-09-20T08:01:05Z | en |
| dc.date.available | 2025-09-20T08:01:05Z | en |
| dc.date.issued | 2025-09-19 | en |
| dc.description.abstract | Traditional demographic surveys compress rich identities into rigid checkboxes. This dissertation asks whether a conversational chatbot, powered by GPT-4o, can restore that nuance. In a within-subjects experiment, 230 participants completed both a chatbot conversation and the standard Office of Management and Budget (OMB) form. Exploratory analyses showed that participants' open-ended narratives frequently moved beyond the OMB labels. When these responses were encoded with the INSTRUCTOR embedding model and organized via hierarchical clustering, the resulting categorization could be "cut" at multiple levels of granularity, yielding broad solutions that can satisfy regulatory reporting and finer leaves that reveal national, regional, and mixed-heritage detail. Hypothesis-driven tests of user experience reinforced these advantages. On the User Experience Questionnaire, the chatbot outscored the demographic checklist on hedonic qualities, novelty, and stimulation, while the checklist retained pragmatic strengths such as dependability. Perceived group inclusivity also rose when data were collected through the chatbot, regardless of how closely respondents' identities aligned with OMB categories. Overall, the findings indicate that a carefully engineered chatbot, paired with advanced natural-language-processing analyses, can enhance race and ethnicity data collection by producing richer information and fostering a more inclusive, engaging respondent experience. | en |
| dc.description.abstractgeneral | Most surveys that ask about race or ethnicity limit respondents to a handful of checkboxes. These boxes make record-keeping simple, yet they flatten the richness of personal heritage. This dissertation investigates whether a conversational artificial-intelligence assistant can restore that nuance. A sample of 230 adults first completed the standard Office of Management and Budget race and ethnicity form, and then engaged in a short dialogue with a GPT-4o-powered chatbot that encouraged open self-description. The conversation yielded responses that named specific countries, regions, and blended lineages that never appear on the official list. Natural-language software grouped the free-text answers into hierarchies. At the broadest level, the groupings still satisfied regulatory reporting. At more granular levels, they revealed detailed threads of identity such as national origin and mixed heritage. Participants judged the chatbot to be more engaging, enjoyable, and welcoming than the traditional checklist, though the checklist remained slightly easier to finish. Feelings of inclusion also rose after interacting with the chatbot, regardless of how well respondents' identities aligned with government categories. The results demonstrate that a thoughtfully engineered chatbot can meet formal data requirements while allowing people to express who they truly are. This approach makes demographic information richer, more accurate, and more respectful of individual identity. | en |
| dc.description.degree | Doctor of Philosophy | en |
| dc.format.medium | ETD | en |
| dc.identifier.other | vt_gsexam:44388 | en |
| dc.identifier.uri | https://hdl.handle.net/10919/137810 | en |
| dc.language.iso | en | en |
| dc.publisher | Virginia Tech | en |
| dc.rights | Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International | en |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | en |
| dc.subject | Chatbots | en |
| dc.subject | Demographic data collection | en |
| dc.subject | Race | en |
| dc.subject | Ethnicity | en |
| dc.subject | Natural language processing | en |
| dc.subject | Inclusivity | en |
| dc.title | Beyond the Checkbox: Leveraging AI Chatbots for Inclusive Demographic Data Collection | en |
| dc.type | Dissertation | en |
| thesis.degree.discipline | Psychology | en |
| thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
| thesis.degree.level | doctoral | en |
| thesis.degree.name | Doctor of Philosophy | en |