MetaEnhance: Metadata Quality Improvement for Electronic Theses and Dissertations of University Libraries

Field | Value | Language
dc.contributor.author | Choudhury, Muntabir Hasan | en
dc.contributor.author | Salsabil, Lamia | en
dc.contributor.author | Jayanetti, Himarsha R. | en
dc.contributor.author | Wu, Jian | en
dc.contributor.author | Ingram, William A. | en
dc.contributor.author | Fox, Edward A. | en
dc.date.accessioned | 2024-01-22T13:08:02Z | en
dc.date.available | 2024-01-22T13:08:02Z | en
dc.date.issued | 2023 | en
dc.description.abstract | Metadata quality is crucial for discovering digital objects through digital library (DL) interfaces. However, due to various reasons, the metadata of digital objects often exhibits incomplete, inconsistent, and incorrect values. We investigate methods to automatically detect, correct, and canonicalize scholarly metadata, using seven key fields of electronic theses and dissertations (ETDs) as a case study. We propose MetaEnhance, a framework that utilizes state-of-the-art artificial intelligence (AI) methods to improve the quality of these fields. To evaluate MetaEnhance, we compiled a metadata quality evaluation benchmark containing 500 ETDs, by combining subsets sampled using multiple criteria. We evaluated MetaEnhance against this benchmark and found that the proposed methods achieved nearly perfect F1-scores in detecting errors and F1-scores ranging from 0.85 to 1.00 for correcting five of seven key metadata fields. The codes and data are publicly available on GitHub (https://github.com/lamps-lab/ETDMiner/tree/master/metadata-correction). | en
dc.description.version | Submitted version | en
dc.format.extent | Pages 61-65 | en
dc.format.extent | 5 page(s) | en
dc.format.mimetype | application/pdf | en
dc.identifier.doi | https://doi.org/10.1109/JCDL57899.2023.00019 | en
dc.identifier.eissn | 2575-8152 | en
dc.identifier.isbn | 9798350399318 | en
dc.identifier.issn | 2575-7865 | en
dc.identifier.orcid | Ingram, William [0000-0002-8307-8844] | en
dc.identifier.orcid | Fox, Edward [0000-0003-1447-6870] | en
dc.identifier.uri | https://hdl.handle.net/10919/117434 | en
dc.identifier.volume | 2023-June | en
dc.language.iso | en | en
dc.publisher | ACM | en
dc.rights | In Copyright | en
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en
dc.subject | Digital Libraries | en
dc.subject | Scholarly Big Data | en
dc.subject | ETD | en
dc.subject | Metadata Quality | en
dc.subject | Artificial Intelligence | en
dc.title | MetaEnhance: Metadata Quality Improvement for Electronic Theses and Dissertations of University Libraries | en
dc.title.serial | 2023 ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES, JCDL | en
dc.type | Conference proceeding | en
dc.type.dcmitype | Text | en
dc.type.other | Proceedings Paper | en
dc.type.other | Book in series | en
pubs.finish-date | 2023-06-30 | en
pubs.organisational-group | /Virginia Tech | en
pubs.organisational-group | /Virginia Tech/Engineering | en
pubs.organisational-group | /Virginia Tech/Engineering/Computer Science | en
pubs.organisational-group | /Virginia Tech/Library | en
pubs.organisational-group | /Virginia Tech/All T&R Faculty | en
pubs.organisational-group | /Virginia Tech/Engineering/COE T&R Faculty | en
pubs.organisational-group | /Virginia Tech/Library/Library assessment administrators | en
pubs.organisational-group | /Virginia Tech/Library/Dean's office | en
pubs.organisational-group | /Virginia Tech/Library/Information Technology | en
pubs.organisational-group | /Virginia Tech/Graduate students | en
pubs.organisational-group | /Virginia Tech/Graduate students/Doctoral students | en
pubs.start-date | 2023-06-26 | en

Files

Original bundle
Name: MetaEnhance-arXiv.pdf
Size: 252.61 KB
Format: Adobe Portable Document Format
Description: Submitted version
License bundle
Name: license.txt
Size: 1.5 KB
Format: Plain Text