Topic Modeling Toolkit

dc.contributor.author: Lin, Jiayue [en]
dc.contributor.author: Pang, Mingkai [en]
dc.contributor.author: Liu, Yulong [en]
dc.date.accessioned: 2023-07-05T17:43:28Z [en]
dc.date.available: 2023-07-05T17:43:28Z [en]
dc.date.issued: 2023-05-08 [en]
dc.description.abstract: The Topic Modeling Toolkit project began with an existing text mining toolkit and aimed to enhance its functionality by incorporating modern topic modeling techniques. Specifically, BERTopic, CTM, and LDA are used to extract topics from a corpus of text documents. The resulting web-based platform provides a search engine, a recommendation system, and an interface for browsing and exploring these topics. Our team also implemented a text-filtering framework and redesigned the user interface using Tailwind CSS. The final deliverables include a fully functional website, user documentation, and an open-source toolkit that can be used to train machine learning models and to support browsing and searching of various text datasets; users can install and run the toolkit on their own computers. The current version includes BERTopic, CTM, and LDA, and future work could incorporate additional topic modeling methods. Although the project originally focused on electronic theses and dissertations (ETDs), the platform can be used to explore and understand complex subjects within any corpus of text documents. The intended users are researchers, students, and others interested in exploring and understanding complex topics within a given corpus. [en]
dc.description.notes: TopicModelingToolkitReport.pdf - Final report as PDF file; TopicModelingToolkitReport.zip - Final report downloaded from Overleaf; TopicModelingToolkitFinalPres.pdf - Final presentation as PDF file; TopicModelingToolkitFinalPres.pptx - Final presentation as pptx file [en]
dc.identifier.uri: http://hdl.handle.net/10919/115648 [en]
dc.language.iso: en_US [en]
dc.publisher: Virginia Tech [en]
dc.rights: Attribution 4.0 International [en]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [en]
dc.subject: Machine learning [en]
dc.subject: Topic modeling [en]
dc.subject: User interface design [en]
dc.subject: Text filtering [en]
dc.subject: Tailwind CSS [en]
dc.subject: BERTopic [en]
dc.subject: Search algorithms [en]
dc.subject: Recommendation algorithms [en]
dc.title: Topic Modeling Toolkit [en]
dc.type: Presentation [en]
dc.type: Report [en]

Files

Original bundle
Name:
TopicModelingToolkitReport.zip
Size:
5.49 MB
Format:
Name:
TopicModelingToolkitReport.pdf
Size:
5.53 MB
Format:
Adobe Portable Document Format
Name:
TopicModelingToolkitFinalPres.pptx
Size:
4.96 MB
Format:
Microsoft Powerpoint XML
Name:
TopicModelingToolkitFinalPres.pdf
Size:
3.4 MB
Format:
Adobe Portable Document Format
License bundle
Name:
license.txt
Size:
1.5 KB
Format:
Item-specific license agreed upon to submission
Description: