Browsing by Author "Masrourisaadat, Nila"
- Integrated Digital Library System for Long Documents and their Elements
  Chekuri, Satvik; Chandrasekar, Prashant; Banerjee, Bipasha; Park, Sung Hee; Masrourisaadat, Nila; Ahuja, Aman; Ingram, William A.; Fox, Edward A. (ACM, 2023)
  We describe a next-generation integrated Digital Library (DL) system that addresses the numerous goals associated with long documents such as Electronic Theses and Dissertations (ETDs). Our extensible workflow-centric design supports a variety of users/personas (e.g., researchers, curators, and experimenters) who can benefit from improved access to ETDs and the content buried therein. Our approach leverages natural language processing, deep learning, information retrieval, and software engineering methods. The services cover ingesting, storing, curating, analyzing, detecting, extracting, classifying, summarizing, topic modeling, browsing, searching, retrieving, recommending, visualizing/reporting, and interacting with ETDs and derivative text/image-based elements/objects. Workflows connect the services and their APIs, along with UI-based access. We believe our approach can guide others to combine tailored user support, research, and education by way of extensible DLs.
- Quantitative and Qualitative Analysis of Text-to-Image models
  Masrourisaadat, Nila (Virginia Tech, 2023-08-30)
  The field of image synthesis has seen significant progress recently, including great strides with generative models like Generative Adversarial Networks (GANs), Diffusion Models, and Transformers. These models have shown they can create high-quality images from a variety of text prompts. However, a comprehensive analysis that examines both their performance and possible biases is often missing from existing research. In this thesis, I undertake a thorough examination of several leading text-to-image models, namely Stable Diffusion, DALL-E Mini, Lafite, and Ernie-ViLG. I assess their performance in generating accurate images of human faces, groups, and specified numbers of objects, using both Frechet Inception Distance (FID) scores and R-precision as my evaluation metrics. Moreover, I uncover inherent gender or social biases these models may possess. My research reveals a noticeable bias in these models, which show a tendency towards generating images of white males, thus under-representing minorities in their output of human faces. This finding contributes to the broader dialogue on ethics in AI and sets the stage for further research aimed at developing more equitable AI systems. Furthermore, based on the metrics I used for evaluation, the Stable Diffusion model outperforms the others in generating images from text prompts. This information could be particularly useful for researchers and practitioners trying to choose the most effective model for their future projects. To facilitate further research in this field, I have made my findings, the related data, and the source code publicly available.
- Team 3: Object Detection and Topic Modeling (Objects&Topics) CS 5604 F2022
  Devera, Alan; Sahu, Raj; Masrourisaadat, Nila; Amirthalingam, Nirmal; Mao, Chenyu (Virginia Tech, 2023-01-17)
  The CS 5604: Information Storage and Retrieval class (Fall 2022), led by Dr. Edward Fox, has been assigned the task of designing and implementing a state-of-the-art information retrieval and analysis system that will support Electronic Theses & Dissertations (ETDs). Given a large collection of ETDs, we want to run different kinds of learning algorithms to categorize them into logical groups, and by the end, be able to suggest to an end-user the documents which are strongly related to the one they are looking for. The overall goal for the project is to have a service that can upload, search, and retrieve ETDs with their derived digital objects, in a human-readable format. Specifically, our team is tasked with analyzing documents using object detection and topic models, with the final deliverable being the Experimenter web page for the derived objects and topics. The object detection team worked with Faster R-CNN and YOLOv7 models, and implemented post-processing rules for saving objects in a structured format. As the final deliverable for object detection, inference on 5k ETDs has been completed, and the refined objects have been saved to the Repository. The topic modeling team worked with clustering ETDs to 10, 25, 50, and 100 topics with different models (LDA, NeuralLDA, CTM, ProdLDA). As the final deliverable for topic modeling, we store the related topics and related documents for 5k ETDs in the Team 1 database, so that Team 2 could provide the related topic and documents on the documents page. By the end of the semester the team was able to deliver the Experimenter web page for the derived objects and topics, and the related objects and topics for 5k ETDs stored in the Team 1 database.