Team 3: Object Detection and Topic Modeling (Objects&Topics) CS 5604 F2022

dc.contributor.author: Devera, Alan (en)
dc.contributor.author: Sahu, Raj (en)
dc.contributor.author: Masrourisaadat, Nila (en)
dc.contributor.author: Amirthalingam, Nirmal (en)
dc.contributor.author: Mao, Chenyu (en)
dc.date.accessioned: 2023-03-10T18:02:15Z (en)
dc.date.available: 2023-03-10T18:02:15Z (en)
dc.date.issued: 2023-01-17 (en)
dc.description.abstract: The CS 5604: Information Storage and Retrieval class (Fall 2022), led by Dr. Edward Fox, was assigned the task of designing and implementing a state-of-the-art information retrieval and analysis system to support Electronic Theses & Dissertations (ETDs). Given a large collection of ETDs, we want to run different kinds of learning algorithms to categorize them into logical groups and, ultimately, to suggest to an end user the documents most strongly related to the one they are looking for. The overall goal of the project is a service that can upload, search, and retrieve ETDs with their derived digital objects in a human-readable format. Specifically, our team is tasked with analyzing documents using object detection and topic models, with the final deliverable being the Experimenter web page for the derived objects and topics. The object detection team worked with Faster R-CNN and YOLOv7 models and implemented post-processing rules for saving objects in a structured format. As the final deliverable for object detection, inference on 5k ETDs has been completed, and the refined objects have been saved to the Repository. The topic modeling team clustered ETDs into 10, 25, 50, and 100 topics using different models (LDA, NeuralLDA, CTM, ProdLDA). As the final deliverable for topic modeling, we stored the related topics and related documents for 5k ETDs in the Team 1 database, so that Team 2 can show related topics and documents on the documents page. By the end of the semester, the team had delivered the Experimenter web page for the derived objects and topics, along with the related objects and topics for 5k ETDs stored in the Team 1 database. (en)
dc.description.notes: (en)
TopicBubbleDemo.mp4 - Video file demonstrating the topic bubble visualization
ETDViewerDemo.mp4 - Video file illustrating the use of the topic viewer
ObjectsTopicsReport.zip - Final report (Zip from Overleaf)
ObjectsTopicsReport.pdf - Final report (PDF version)
ObjectsTopicsPresentation.pptx - Final presentation (PowerPoint version)
ObjectsTopicsPresentation.pdf - Final presentation (PDF version)
dc.identifier.uri: http://hdl.handle.net/10919/114081 (en)
dc.language.iso: en_US (en)
dc.publisher: Virginia Tech (en)
dc.rights: Attribution-NonCommercial 4.0 International (en)
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/ (en)
dc.subject: ETD (en)
dc.subject: Electronic Theses and Dissertations (en)
dc.subject: Deep Learning (en)
dc.subject: Object Detection (en)
dc.subject: Topic Modeling (en)
dc.title: Team 3: Object Detection and Topic Modeling (Objects&Topics) CS 5604 F2022 (en)
dc.type: Presentation (en)
dc.type: Report (en)
dc.type: Video (en)

Files

Original bundle
Name:
ObjectsTopicsPresentation.pptx
Size:
5.33 MB
Format:
Microsoft Powerpoint XML
Name:
ObjectsTopicsPresentation.pdf
Size:
3.82 MB
Format:
Adobe Portable Document Format
Name:
ETDViewerDemo.mp4
Size:
24.02 MB
Format:
MP4 Container format for video files
Name:
TopicBubbleDemo.mp4
Size:
6.66 MB
Format:
MP4 Container format for video files
Name:
ObjectsTopicsReport.pdf
Size:
2.93 MB
Format:
Adobe Portable Document Format
License bundle
Name:
license.txt
Size:
1.5 KB
Format:
Item-specific license agreed upon to submission
Description: