Enhancing Layout Understanding via Human-in-the-Loop: A User Study on PDF-to-HTML Conversion for Long Documents
dc.contributor.author | Mao, Chenyu | en |
dc.contributor.committeechair | Fox, Edward A. | en |
dc.contributor.committeemember | Chen, Yan | en |
dc.contributor.committeemember | Lee, Sang Won | en |
dc.contributor.department | Computer Science & Applications | en |
dc.date.accessioned | 2025-03-25T08:00:18Z | en |
dc.date.available | 2025-03-25T08:00:18Z | en |
dc.date.issued | 2025-03-24 | en |
dc.description.abstract | Document layout understanding often utilizes object detection to locate and parse document elements, enabling systems that convert documents into searchable and editable formats to enhance accessibility and usability. However, the recognition results often contain errors that require manual correction, owing to small training datasets, model limitations, and defects in training annotations. Many of these problems can be addressed via human review. We first improved our system by combining the previous Electronic Thesis/Dissertation (ETD) parsing tool with an AI-aided annotation tool, providing instant and accurate file output. We then used the new pipeline to investigate the effectiveness and efficiency of manual correction strategies in improving object detection accuracy through a user study with 8 participants: four STEM and four non-STEM researchers, all with some background in ETDs. Each participant was assigned correction tasks on a set of ETDs from both STEM and non-STEM disciplines to ensure comprehensive evaluation across different document types. We collected quantitative metrics, such as completion times, accuracy rates, and number of wrong labels, along with feedback from a post-survey, to assess the usability and performance of the manual correction process and to examine their relationship with users' academic backgrounds. Results demonstrate that manual adjustment significantly enhanced the accuracy of document element identification and classification, with experienced participants achieving superior correction precision. Furthermore, usability feedback revealed a strong correlation between user satisfaction and system design, providing valuable insights for future system enhancement and development. | en |
dc.description.abstractgeneral | With the development of technology, there is an increasing demand to make printed and scanned documents more accessible. Organizations such as universities and libraries hold millions of valuable documents, including theses, dissertations, and research papers, that exist only as PDFs, often in scanned form. While these works contain valuable knowledge, they can be challenging to search through or access, especially for those with low vision. To solve this problem, we need computer systems that automatically recognize and convert different parts of these documents (such as titles, headings, paragraphs, and figures) into more usable forms. Our research focuses on improving how these document recognition systems work by combining computer automation with human expertise. While computers can process documents quickly, they sometimes need more training data for complex document layouts. We developed a web-based tool that allows people to review the computer's work and correct errors, such as mislabeled sections or missed elements. To understand how effective this human-computer collaboration could be, we conducted a detailed study with 8 participants who used our correction tool. We carefully measured several aspects of their experience: how many pages they annotated in a fixed amount of time, how accurate their corrections were, and how they felt about using the tool. We also used a post-survey to gather feedback about their experience with the tool. The results were very encouraging. When humans reviewed and corrected the computer's work, the accuracy of document recognition improved significantly. We found that participants could effectively identify and fix errors in the computer's output, especially when the tool was easy to use. Higher user satisfaction was strongly linked to how intuitive and straightforward participants found the correction process. One useful finding was that this process creates a positive feedback loop. Every correction a person makes helps expand the training data available to the computer system, which means the system can learn from these corrections and gradually become better at recognizing similar elements in future documents, reducing the number of errors that need to be corrected over time. Our research offers insights into building advanced object detection systems that combine computational efficiency with human review. The results inform strategies for developing user-centric interfaces and effective document correction workflows. This work has practical implications for making academic and research documents more accessible to everyone, including those relying on screen readers or other assistive technologies. This research represents a step forward in making the vast knowledge held in digital documents more accessible, searchable, and usable for all readers. By showing how humans and computers can work together effectively, we are helping to build better systems for preserving and sharing knowledge in the digital age. | en |
dc.description.degree | Master of Science | en |
dc.format.medium | ETD | en |
dc.identifier.other | vt_gsexam:42536 | en |
dc.identifier.uri | https://hdl.handle.net/10919/125076 | en |
dc.language.iso | en | en |
dc.publisher | Virginia Tech | en |
dc.rights | In Copyright | en |
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en |
dc.subject | ETD | en |
dc.subject | deep learning | en |
dc.subject | object detection | en |
dc.subject | document layout analysis | en |
dc.title | Enhancing Layout Understanding via Human-in-the-Loop: A User Study on PDF-to-HTML Conversion for Long Documents | en |
dc.type | Thesis | en |
thesis.degree.discipline | Computer Science & Applications | en |
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
thesis.degree.level | masters | en |
thesis.degree.name | Master of Science | en |