Browsing by Author "Manogaran, Harish Babu"
Now showing 1 - 3 of 3
Results Per Page
Sort Options
- CS5604 Fall 2022 - Team 5 INTShukla, Anmol; Travasso, Aaron; Manogaran, Harish Babu; Sisodia, Pallavi Kishor; Li, Yuze (Virginia Tech, 2022-01-08)The primary objective of the project is to build a state-of-the-art system to search and retrieve relevant information effectively from a large corpus of electronic theses and dissertations. The system is targeted towards documents such as academic textbooks, dissertations and theses where the information available is enormous, compared to websites or blogs, which the conventional search engines are equipped to handle effectively. The entire work involved in developing the system has been divided into five areas such as data management (Team-1, Curator); search and retrieval (Team-2, User); object detection and topic analysis (Team-3, Objects & Topics); language models, classification, summarization and segmentation (Team-4, Classification & Summarization); and lastly integration (Team-5, Integration). The teams and their operations are structured in a way to mirror an environment of a company working on new product development. The Integration (INT) team focuses on one of the important aspects such as setting up work environments with all requirements for the teams, integrating the work done by the other four teams, and deploying suitable Docker containers for seamless operation (workflow) along with maintaining the cluster infrastructure. The INT team archives this distribution of code and containers on the Virginia Tech Docker Container Registry and deploys it on the Virginia Tech CS Cloud. The INT team also guides team evaluations of prospective container components and workflows. Additionally, the team implements continuous integration and continuous deployment to enable seamless integration, building and testing of code as they are developed. Furthermore, the team works on setting up a workflow management system that employs Apache Airflow to automate creating, scheduling, and monitoring of workflows. We have created customized containers for each team based on their individual requirements. We have developed a workflow management system using Apache Airflow that creates and manages workflows to achieve the goals of each team such as indexing, object detection, segmentation, summarization, and classification. We have also implemented a Continuous Integration and Continuous Deployment (CI/CD) pipeline to automatically create, test and deploy the updated image whenever a new push is made to a Git repository. Additionally, we extended our support to other teams in troubleshooting the issues they faced in deployment. Our current cluster statistics (i.e., Kubernetes Resource Definitions) are: 45 deployments, 40 ingresses, 39 pods, 180 services, and 13 secrets. Lastly, the INT team would like to express its gratitude to the work of the INT-2020 team and the predecessors who have done substantial work upon which we built. We would like to acknowledge here their significant contribution.
- Discovering Novel Biological Traits From Images Using Phylogeny-Guided Neural NetworksElhamod, Mohannad; Khurana, Mridul; Manogaran, Harish Babu; Uyeda, Josef C.; Balk, Meghan A.; Dahdul, Wasila; Bakış, Yasin; Bart, Henry L. Jr.; Mabee, Paula M.; Lapp, Hilmar; Balhoff, James P.; Charpentier, Caleb; Carlyn, David; Chao, Wei-Lun; Stewart, Charles V.; Rubenstein, Daniel I.; Berger-Wolf, Tanya; Karpatne, Anuj (ACM, 2023-08-06)Discovering evolutionary traits that are heritable across species on the tree of life (also referred to as a phylogenetic tree) is of great interest to biologists to understand how organisms diversify and evolve. However, the measurement of traits is often a subjective and labor-intensive process, making trait discovery a highly label-scarce problem. We present a novel approach for discovering evolutionary traits directly from images without relying on trait labels. Our proposed approach, Phylo-NN, encodes the image of an organism into a sequence of quantized feature vectors–or codes–where different segments of the sequence capture evolutionary signals at varying ancestry levels in the phylogeny. We demonstrate the effectiveness of our approach in producing biologically meaningful results in a number of downstream tasks including species image generation and species-to-species image translation, using fish species as a target example.
- Hierarchy Aligned Commonality Through Prototypical Networks: Discovering Evolutionary Traits over Tree-of-LifeManogaran, Harish Babu (Virginia Tech, 2024-10-11)A grand challenge in biology is to discover evolutionary traits, which are features of organisms common to a group of species with a shared ancestor in the Tree of Life (also referred to as phylogenetic tree). With the recent availability of large-scale image repositories in biology and advances in the field of explainable machine learning (ML) such as ProtoPNet and other prototype-based methods, there is a tremendous opportunity to discover evolutionary traits directly from images in the form of a hierarchy of prototypes learned at internal nodes of the phylogenetic tree. However, current prototype-based methods are mostly designed to operate over a flat structure of classes and face several challenges in discovering hierarchical prototypes on a tree, including the problem of learning over-specific features at internal nodes in the tree. To overcome these challenges, we introduce the framework of Hierarchy aligned Commonality through Prototypical Networks (HComP-Net), which learns common features shared by all descendant species of an internal node and avoids the learning of over-specific prototypes. We empirically show that HComP-Net learns prototypes that are of high accuracy, semantically consistent, and generalizable to unseen species in comparison to baselines. While we focus on the biological problem of discovering evolutionary traits, our work can be applied to any domain involving a hierarchy of classes.