ETDs Knowledge Graph Building
Abstract
ETDs are Electronic Theses and Dissertations. This project aims to improve the storage, accessibility, and exploration of Virginia Tech's ETDs by transforming traditional metadata into a searchable knowledge graph. Because flat or relational storage represents rich academic relationships poorly, we developed a dual-database architecture using Virtuoso (RDF/SPARQL) and Neo4j (property graph/Cypher) to model key entities such as authors, advisors, departments, and disciplines. A Streamlit-based web interface searches across both databases, letting users explore semantic connections by keyword, year, and entity type. The backend includes a Python data pipeline that transforms flat CSV data into normalized graph structures, optimized for batch loading at scale. This framework demonstrates a scalable approach to managing large volumes of academic content, supporting more meaningful discovery and long-term preservation of institutional knowledge.
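The CSV-to-graph step can be illustrated with a minimal Python sketch. This is not the project's actual pipeline: the column names (`title`, `author`, `advisor`, `department`, `year`), the relationship names, the sample rows, and the `normalize` and `batches` helpers are all assumptions made for illustration. It shows how flat rows might be split into deduplicated nodes and typed edges, then grouped into fixed-size batches for a loader.

```python
import csv
import io
from itertools import islice

# Made-up sample rows; the real CSV schema is not given in the abstract.
SAMPLE_CSV = """title,author,advisor,department,year
Graph Search for ETDs,Ada Park,Lee Chen,Computer Science,2021
Semantic Metadata,Sam Ortiz,Lee Chen,Computer Science,2022
"""

# (node label, CSV column, relationship type) -- hypothetical names.
ENTITY_COLUMNS = (
    ("Author", "author", "AUTHORED_BY"),
    ("Advisor", "advisor", "ADVISED_BY"),
    ("Department", "department", "IN_DEPARTMENT"),
)

def normalize(rows):
    """Split flat rows into deduplicated nodes and typed edges."""
    nodes = {}     # (label, name) -> properties; the key deduplicates entities
    edges = set()  # (source key, relationship, target key)
    for row in rows:
        etd = ("ETD", row["title"].strip())
        nodes[etd] = {"label": "ETD", "name": etd[1], "year": int(row["year"])}
        for label, column, rel in ENTITY_COLUMNS:
            key = (label, row[column].strip())
            nodes.setdefault(key, {"label": label, "name": key[1]})
            edges.add((etd, rel, key))
    return nodes, sorted(edges)

def batches(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk

nodes, edges = normalize(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for batch in batches(edges, 4):
    # A real loader would send each batch to Neo4j or Virtuoso here.
    print(len(batch), "edges in batch")
```

Keying nodes by `(label, name)` means an advisor shared by several ETDs becomes a single node with multiple incoming edges, which is the kind of relationship a flat CSV row cannot express directly.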