Evaluating Human-LLM Alignment in ETD Subject Classification
| Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Klair, Hajra | en |
| dc.contributor.author | German, Fausto | en |
| dc.contributor.author | Banerjee, Bipasha | en |
| dc.contributor.author | Ingram, William A. | en |
| dc.date.accessioned | 2026-02-09T15:32:15Z | en |
| dc.date.available | 2026-02-09T15:32:15Z | en |
| dc.date.issued | 2025-09-27 | en |
| dc.description.abstract | Author-assigned subject labels in Electronic Theses and Dissertations (ETDs) are often inconsistent, overly broad, or misaligned with the research focus. This hampers discovery, aggregation, and analysis, especially for interdisciplinary research. Large language models (LLMs) offer a scalable alternative for automated classification, but their labeling rationale is opaque and introduces systematic biases. This study compares subject labels generated by LLMs with human-assigned labels for over 9,000 ETDs across 21 academic categories to assess where the two diverge. We evaluate multiple prompt-based and fine-tuned LLM configurations and analyze areas of agreement and disagreement to identify patterns of misclassification. LLMs achieve competitive performance overall but frequently misclassify theoretical or interdisciplinary texts, often because they overweight lexical cues and disregard context. We show that such errors are not random but reflect structured semantic divergences from human interpretation. These findings suggest a need for hybrid frameworks that combine LLM scalability with human contextual judgment to improve subject labeling in academic repositories. | en |
| dc.description.version | Accepted version | en |
| dc.format.extent | Pages 57-69 | en |
| dc.format.extent | 13 page(s) | en |
| dc.format.mimetype | application/pdf | en |
| dc.identifier.doi | https://doi.org/10.1007/978-3-032-06136-2_6 | en |
| dc.identifier.eissn | 1865-0937 | en |
| dc.identifier.isbn | 978-3-032-06135-5 | en |
| dc.identifier.issn | 1865-0929 | en |
| dc.identifier.orcid | Ingram, William [0000-0002-8307-8844] | en |
| dc.identifier.orcid | Banerjee, Bipasha [0000-0003-4472-1902] | en |
| dc.identifier.uri | https://hdl.handle.net/10919/141203 | en |
| dc.identifier.volume | 2694 | en |
| dc.language.iso | en | en |
| dc.publisher | Springer | en |
| dc.rights | Creative Commons Attribution 4.0 International | en |
| dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | en |
| dc.subject | Classification | en |
| dc.subject | Large Language Models | en |
| dc.title | Evaluating Human-LLM Alignment in ETD Subject Classification | en |
| dc.title.serial | New Trends in Theory and Practice of Digital Libraries, TPDL 2025 | en |
| dc.type | Conference proceeding | en |
| dc.type.dcmitype | Text | en |
| dc.type.other | Proceedings Paper | en |
| dc.type.other | Book in series | en |
| pubs.finish-date | 2025-09-26 | en |
| pubs.organisational-group | Virginia Tech | en |
| pubs.organisational-group | Virginia Tech/Engineering | en |
| pubs.organisational-group | Virginia Tech/Engineering/Computer Science | en |
| pubs.organisational-group | Virginia Tech/Library | en |
| pubs.organisational-group | Virginia Tech/All T&R Faculty | en |
| pubs.organisational-group | Virginia Tech/Library/Library assessment administrators | en |
| pubs.organisational-group | Virginia Tech/Library/Dean's office | en |
| pubs.organisational-group | Virginia Tech/Library/Information Technology | en |
| pubs.organisational-group | Virginia Tech/Graduate students | en |
| pubs.organisational-group | Virginia Tech/Graduate students/Doctoral students | en |
| pubs.start-date | 2025-09-23 | en |