Browsing by Author "Ramakrishnan, Narendran"
Now showing 1 - 20 of 38
- Achieving More with Less: Learning Generalizable Neural Networks With Less Labeled Data and Computational Overheads. Bu, Jie (Virginia Tech, 2023-03-15). Recent advancements in deep learning have demonstrated its incredible ability to learn generalizable patterns and relationships automatically from data in a number of mainstream applications. However, the generalization power of deep learning methods largely comes at the cost of working with very large datasets and using highly compute-intensive models. Many applications cannot afford the costs needed to ensure the generalizability of deep learning models. For instance, obtaining labeled data can be costly in scientific applications, and using large models may not be feasible in resource-constrained environments involving portable devices. This dissertation aims to improve efficiency in machine learning by exploring different ways to learn generalizable neural networks that require less labeled data and computational resources. We demonstrate that using physics supervision in scientific problems can reduce the need for labeled data, thereby improving data efficiency without compromising model generalizability. Additionally, we investigate the potential of transfer learning powered by transformers in scientific applications as a promising direction for further improving data efficiency. On the computational efficiency side, we present two efforts for increasing the parameter efficiency of neural networks through novel architectures and structured network pruning.
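To make the physics-supervision idea concrete, here is a minimal sketch (my illustration, not the dissertation's code, assuming a toy governing equation du/dx = -u) in which unlabeled inputs supply a label-free training signal by penalizing the equation's residual:

```python
import torch

net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)
)

def physics_loss(x_unlabeled):
    # Penalize violation of the assumed governing equation du/dx = -u.
    x = x_unlabeled.requires_grad_(True)
    u = net(x)
    du_dx, = torch.autograd.grad(u.sum(), x, create_graph=True)
    return ((du_dx + u) ** 2).mean()  # residual of du/dx = -u

x_unlabeled = torch.rand(256, 1)   # inputs only: no labels needed
loss = physics_loss(x_unlabeled)   # would be added to any supervised loss
loss.backward()
```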
- aiWATERS: An Artificial Intelligence Framework for the Water Sector. Vekaria, Darshan (Virginia Tech, 2023-07-20). The ubiquity of Artificial Intelligence (AI) and Machine Learning (ML) applications has led to their widespread adoption across diverse domains like education, self-driving cars, healthcare, and more. AI is making its way into industry, beyond research and academia. Concurrently, the water sector is undergoing a digital transformation, driven by challenges such as water demand forecasting, wastewater treatment, asset maintenance and management, and water quality assessment. Water utilities are at different stages in their journey of digital transformation, and their decision-makers, who are non-expert stakeholders in AI applications, must understand the technology to make informed decisions. The non-expert stakeholders should know that while AI has numerous benefits to offer, there are also many challenges related to data, model development, knowledge integration, and ethical concerns that should be considered before implementing it for real-world applications. Civil engineering is a licensed profession where critical decision-making is involved. Failure of critical decisions by civil engineers may put their license at risk, and therefore trust in any decision-support technology is crucial for its acceptance in real-world applications. This research proposes a framework called aiWATERS (Artificial Intelligence for the Water Sector) to facilitate the successful application of AI in the water sector. Based on this framework, we conduct pilot interviews and surveys with various small, medium, and large water utilities to capture their current state of AI implementation and identify the challenges faced by them. The research findings reveal that most of the water utilities are at an early stage of implementing AI, as they face concerns regarding the black-box nature, trustworthiness, and sustainability of AI technology in their systems. The aiWATERS framework is intended to help the utilities navigate these issues in their journey of digital transformation.
- Amplifying the Griot: Technology for Preserving, Retelling, and Supporting Underrepresented Stories. Kotut, Lindah Jerop (Virginia Tech, 2021-05-24). As we develop intelligent systems to handle online interactions and digital stories, how do we address those stories that are unwritten and invisible? How do we ensure that communities who value oral histories are not left behind, and that their voices also inform the design of these systems? How do we determine that the technology we design respects the agency and ownership of the stories, without imposing our own biases? To answer these questions, I rely on accounts from different underrepresented communities as avenues to examine how digital technology affects their stories and the agency they have over them. From these stories, I elicit guidelines for the design of equitable and resilient tools and technologies. I sought wisdom from griots, who are master storytellers and story-keepers, on the craft of handling both written and unwritten stories, which informed the development of the Respectful Space for technology typology, a framework that shapes our understanding of and interaction with underrepresented stories. The framework guided the approach to understanding technology use by inhabitants of rural spaces in the United States, particularly long-distance hikers who traverse these spaces. I further discuss the framework's extensibility by considering its use for community self-reflection, and for researchers to query the ethical implications of their research, the technology they develop, and the consideration for the voices that the technology amplifies or suppresses. The intention is to highlight the vast resources that exist in domains we do not consider, and the importance of underrepresented voices in informing the future of technology.
- Anomalous Information Detection in Social Media. Tao, Rongrong (Virginia Tech, 2021-03-10). This dissertation focuses on identifying various types of anomalous information patterns in social media and news outlets. We focus on three types of anomalous information: (1) media censorship in news outlets, which is information that should be published but is actually missing, (2) fake news in social media, which is unreliable information shown to the public, and (3) media propaganda in news outlets, which is trustworthy information that is over-populated. For the first problem, existing approaches to censorship detection mostly rely on monitoring posts in social media. However, media censorship in news outlets has not received nearly as much attention, mostly because it is difficult to detect systematically. The contributions of our work include: (1) a hypothesis testing framework to identify and evaluate censored clusters of keywords, (2) a near-linear-time algorithm to identify the highest-scoring clusters as indicators of censorship, and (3) extensive experiments on six Latin American countries for performance evaluation. For the second problem, existing approaches studying fake news in social media primarily focus on topic-level modeling or prediction based on a set of aggregated features from a collection of posts. However, the credibility of various information components within the same topic can be quite different. The contributions of our work in this space include: (1) a new benchmark dataset for fake news research, (2) a cluster-based approach to improve instance-level prediction of information credibility, and (3) extensive experiments for performance evaluation. For the last problem, existing approaches to media propaganda detection primarily focus on investigating the pattern of information shared over social media or evaluation by domain experts. However, these approaches cannot be generalized to a large-scale analysis of media propaganda in news outlets. The contributions of our work include: (1) non-parametric scan statistics to identify clusters of over-populated keywords, (2) a near-linear-time algorithm to identify the highest-scoring clusters as indicators of propaganda, and (3) extensive experiments on two Latin American countries for performance evaluation.
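For flavor, one common way to score keyword clusters is an expectation-based Poisson scan statistic; the sketch below (illustrative, not necessarily the dissertation's exact statistic, with made-up counts) flags a cluster when observed counts C far exceed baselines B:

```python
import math

def poisson_scan_score(observed: float, baseline: float) -> float:
    # Expectation-based Poisson score: 0 unless the keyword is over-populated.
    if observed <= baseline or baseline <= 0:
        return 0.0
    return observed * math.log(observed / baseline) + baseline - observed

# (observed count, baseline count) per keyword -- hypothetical numbers
counts = {"protest": (40, 12.0), "strike": (25, 10.0), "weather": (9, 9.5)}
cluster = [k for k, (c, b) in counts.items() if poisson_scan_score(c, b) > 0]
score = sum(poisson_scan_score(c, b) for c, b in counts.values())
print(cluster, round(score, 2))   # ['protest', 'strike'] with a high score
```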
- Automatic Question Answering and Knowledge Discovery from Electronic Health Records. Wang, Ping (Virginia Tech, 2021-08-25). Electronic Health Records (EHR) data contain comprehensive longitudinal patient information, which is usually stored in databases in the form of either multi-relational structured tables or unstructured texts, e.g., clinical notes. EHRs provide a useful resource to assist doctors' decision making; however, they also present many unique challenges that limit the efficient use of the valuable information, such as large data volume, heterogeneous and dynamic information, medical term abbreviations, and the noise caused by misspelled words. This dissertation focuses on the development and evaluation of advanced machine learning algorithms to solve the following research questions: (1) how to seek answers from EHR for clinical activity related questions posed in human language without the assistance of database and natural language processing (NLP) domain experts, (2) how to discover underlying relationships of different events and entities in structured tabular EHRs, and (3) how to predict when a medical event will occur and estimate its probability based on previous medical information of patients. First, to automatically retrieve answers for natural language questions from the structured tables in EHR, we study the question-to-SQL generation task by generating the corresponding SQL query of the input question. We propose a translation-edit model driven by a language generation module and an editing module for the SQL query generation task. This model helps automatically translate clinical activity related questions to SQL queries, so that the doctors only need to provide their questions in natural language to get the answers they need. We also create a large-scale dataset for question answering on tabular EHR to simulate a more realistic setting. Our performance evaluation shows that the proposed model is effective in handling the unique challenges of clinical terminologies, such as abbreviations and misspelled words. Second, to automatically identify answers for natural language questions from unstructured clinical notes in EHR, we propose to achieve this goal by querying a knowledge base constructed from fine-grained document-level expert annotations of clinical records for various NLP tasks. We first create a dataset for clinical knowledge base question answering with two sets: a clinical knowledge base and question-answer pairs. An attention-based aspect-level reasoning model is developed and evaluated on the new dataset. Our experimental analysis shows that it is effective in identifying answers and also allows us to analyze the impact of different answer aspects in predicting correct answers. Third, we focus on discovering underlying relationships of different entities (e.g., patient, disease, medication, and treatment) in tabular EHR, which can be formulated as a link prediction problem in the graph domain. We develop a self-supervised learning framework for better representation learning of entities across a large corpus and also consider local contextual information for the downstream link prediction task. We demonstrate the effectiveness, interpretability, and scalability of the proposed model on the healthcare network built from tabular EHR. It is also successfully applied to solve link prediction problems in a variety of domains, such as e-commerce, social networks, and academic networks.
Finally, to dynamically predict the occurrence of multiple correlated medical events, we formulate the problem as a temporal (multiple time-points) and multi-task learning problem using tensor representation. We propose an algorithm to jointly and dynamically predict several survival problems at each time point and optimize it with the Alternating Direction Method of Multipliers (ADMM) algorithm. The model allows us to consider both the dependencies between different tasks and the correlations of each task at different time points. We evaluate the proposed model on two real-world applications and demonstrate its effectiveness and interpretability.
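As a toy sketch of the question-to-SQL setting described above (the dissertation's translation-edit model is far richer; the table, data, and query here are hypothetical stand-ins), a generated SQL query is executed against a structured EHR table to answer a clinical question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE labs (patient_id INT, test TEXT, value REAL)")
conn.executemany("INSERT INTO labs VALUES (?, ?, ?)",
                 [(1, "hba1c", 6.1), (1, "hba1c", 7.2), (2, "hba1c", 5.4)])

question = "What is patient 1's highest HbA1c?"
# In the real system this SQL would be produced by the model; hard-coded here.
generated_sql = ("SELECT MAX(value) FROM labs "
                 "WHERE patient_id = 1 AND test = 'hba1c'")
print(conn.execute(generated_sql).fetchone())   # (7.2,)
```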
- Bilevel Optimization in the Deep Learning Era: Methods and Applications. Zhang, Lei (Virginia Tech, 2024-01-05). Neural networks, coupled with their associated optimization algorithms, have demonstrated remarkable efficacy and versatility across an extensive array of tasks, encompassing image recognition, speech recognition, object detection, sentiment analysis, and more. The inherent strength of neural networks lies in their capability to autonomously learn intricate representations that map input data to corresponding output labels seamlessly. Nevertheless, not all tasks can be neatly encapsulated within the confines of an end-to-end learning paradigm. The complexity and diversity of real-world challenges necessitate innovative approaches that extend beyond conventional formulations. This calls for the exploration of specialized architectures and optimization strategies tailored to the unique intricacies of specific tasks, ensuring a more nuanced and effective solution to the myriad demands of diverse applications. The bi-level optimization problem stands out as a distinctive form of optimization, characterized by the embedding or nesting of one problem within another. Its relevance persists significantly in the current era dominated by deep learning. A notable instance of its application in the realm of deep learning is observed in hyperparameter optimization. In the context of neural networks, the automatic training of weights through backpropagation represents a crucial aspect. However, certain hyperparameters, such as the learning rate (lr) and the number of layers, must be predetermined and cannot be optimized through the conventional chain rule employed in backpropagation. This underscores the importance of bi-level optimization in addressing the intricate task of fine-tuning these hyperparameters to enhance the overall performance of deep learning models. The domain of deep learning presents a fertile ground for further exploration and discoveries in optimization. The untapped potential for refining hyperparameters and optimizing various aspects of neural network architectures highlights the ongoing opportunities for advancements and breakthroughs in this dynamic field. Within this thesis, we delve into significant bi-level optimization challenges, applying these techniques to pertinent real-world tasks. Given that bi-level optimization entails dual layers of optimization, we explore scenarios where neural networks are present in the upper level, the inner level, or both. To be more specific, we systematically investigate four distinct tasks: optimizing neural networks towards optimizing neural networks, optimizing attractors towards optimizing neural networks, optimizing graph structures towards optimizing neural network performance, and optimizing architecture towards optimizing neural networks. For each of these tasks, we formulate the problems using the bi-level optimization approach mathematically, introducing more efficient optimization strategies. Furthermore, we meticulously evaluate the performance and efficiency of our proposed techniques. Importantly, our methodologies and insights transcend the realm of bi-level optimization, extending their applicability broadly to various deep learning models. The contributions made in this thesis offer valuable perspectives and tools for advancing optimization techniques in the broader landscape of deep learning.
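The hyperparameter example can be made concrete with a minimal bilevel sketch (my illustration, not the thesis's method): the inner problem fits ridge-regression weights for a given regularization strength, and the outer problem tunes that strength by gradient descent on validation loss, differentiating through the inner closed-form solution:

```python
import torch

torch.manual_seed(0)
Xtr, ytr = torch.randn(80, 5), torch.randn(80)   # training split
Xva, yva = torch.randn(40, 5), torch.randn(40)   # validation split
log_lam = torch.zeros(1, requires_grad=True)     # outer (hyper)variable
opt = torch.optim.Adam([log_lam], lr=0.1)

for _ in range(100):
    lam = log_lam.exp()
    A = Xtr.T @ Xtr + lam * torch.eye(5)
    w = torch.linalg.solve(A, Xtr.T @ ytr)       # inner closed-form solution
    val_loss = ((Xva @ w - yva) ** 2).mean()     # outer objective
    opt.zero_grad(); val_loss.backward(); opt.step()

print(float(log_lam.exp()))                      # tuned regularization strength
```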
- Can an LLM find its way around a Spreadsheet? Lee, Cho Ting (Virginia Tech, 2024-06-05). Spreadsheets are routinely used in business and scientific contexts, and one of the most vexing challenges data analysts face is performing data cleaning prior to analysis and evaluation. The ad-hoc and arbitrary nature of data cleaning problems, such as typos, inconsistent formatting, missing values, and a lack of standardization, often creates the need for highly specialized pipelines. We ask whether an LLM can find its way around a spreadsheet and how to support end-users in taking their free-form data processing requests to fruition. Just as RAG retrieves context to answer users' queries, we demonstrate how we can retrieve elements from a code library to compose data processing pipelines. Through comprehensive experiments, we demonstrate the quality of our system and how it is able to continuously augment its vocabulary by saving new code and pipelines back to the code library for future retrieval.
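The retrieval idea can be sketched schematically (the library entries and the token-overlap matching below are hypothetical stand-ins, not the system's actual components): a free-form request is matched against each snippet's description, and the retrieved snippets compose a pipeline:

```python
CODE_LIBRARY = {
    "strip whitespace from every cell":
        lambda col: [c.strip() if isinstance(c, str) else c for c in col],
    "fill missing values with a default":
        lambda col: [c if c else "N/A" for c in col],
    "lowercase text for consistent formatting":
        lambda col: [c.lower() if isinstance(c, str) else c for c in col],
}

def retrieve(request, k=2):
    # Rank library entries by token overlap with the free-form request.
    req = set(request.lower().split())
    ranked = sorted(CODE_LIBRARY, key=lambda desc: -len(req & set(desc.split())))
    return [CODE_LIBRARY[d] for d in ranked[:k]]

column = ["  Alice ", None, " BOB"]
for step in retrieve("clean whitespace and fix missing entries"):
    column = step(column)
print(column)   # ['Alice', 'N/A', 'BOB']
```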
- Capsule Networks: Framework and Application to Disentanglement for Generative Models. Moghimi, Zahra (Virginia Tech, 2021-06-30). Generative models are one of the most prominent components of unsupervised learning models, with a plethora of applications in various domains such as image-to-image translation, video prediction, and generating synthetic data where accessing real data is expensive, unethical, or compromising to privacy. One of the main challenges in designing a generative model is creating a disentangled representation of generative factors, which gives control over various characteristics of the generated data. Since the architecture of variational autoencoders is centered around latent variables and their objective function directly governs the generative factors, they are the perfect choice for creating a more disentangled representation. However, these architectures generate samples that are blurry and of lower quality compared to other state-of-the-art generative models such as generative adversarial networks. Thus, we attempt to increase the disentanglement of latent variables in variational autoencoders without compromising the generated image quality. In this thesis, a novel generative model based on capsule networks and a variational autoencoder is proposed. Motivated by the concept of capsule neural networks and their vectorized outputs, these structures are employed to create a disentangled representation of latent features in variational autoencoders. In particular, the proposed structure, called CapsuleVAE, utilizes a capsule encoder whose vector outputs can translate to latent variables in a meaningful way. It is shown that CapsuleVAE generates results that are sharper and more diverse based on FID score and a metric inspired by the inception score. Furthermore, two different methods for training CapsuleVAE are proposed, and the generated results are investigated. In the first method, an objective function with regularization is proposed, and the optimal regularization hyperparameter is derived. In the second method, called sequential optimization, a novel training technique for CapsuleVAE is proposed and the results are compared to the first method. Moreover, a novel metric for measuring disentanglement in latent variables is introduced. Based on this metric, it is shown that the proposed CapsuleVAE creates more disentangled representations. In summary, our proposed generative model enhances the disentanglement of latent variables, which helps the model generalize well to new tasks and gives more control over the generated data. Our model also increases the generated image quality, addressing a common disadvantage of variational autoencoders.
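A simplified sketch of the core architectural idea (my own simplification, omitting capsule routing and squashing, and not the thesis's CapsuleVAE) is an encoder that emits K capsule vectors, each read as the mean and log-variance of one group of latent dimensions:

```python
import torch
import torch.nn as nn

class CapsuleEncoder(nn.Module):
    def __init__(self, in_dim=784, n_caps=8, cap_dim=16):
        super().__init__()
        # Each capsule outputs 2*cap_dim numbers: a mean and a log-variance.
        self.body = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                  nn.Linear(256, n_caps * cap_dim * 2))
        self.n_caps, self.cap_dim = n_caps, cap_dim

    def forward(self, x):
        caps = self.body(x).view(-1, self.n_caps, 2 * self.cap_dim)
        mu, logvar = caps.chunk(2, dim=-1)     # one (mu, logvar) per capsule
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return z.flatten(1), mu, logvar

x = torch.randn(4, 784)
z, mu, logvar = CapsuleEncoder()(x)
print(z.shape)   # torch.Size([4, 128]) -- 8 capsules x 16 latent dims
```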
- Commonsense for Zero-Shot Natural Language Video Localization. Holla, Meghana (Virginia Tech, 2023-07-07). Zero-shot Natural Language-Video Localization (NLVL) has shown promising results in training NLVL models solely with raw video data through dynamic video segment proposal generation and pseudo-query annotations. However, existing pseudo-queries lack grounding in the source video and suffer from a lack of common ground due to their unstructured nature. In this work, we investigate the effectiveness of commonsense reasoning in zero-shot NLVL. Specifically, we present CORONET, a zero-shot NLVL framework that utilizes commonsense information to bridge the gap between videos and generated pseudo-queries through a commonsense enhancement module. Our approach employs Graph Convolutional Networks (GCN) to encode commonsense information extracted from a knowledge graph, conditioned on the video, and cross-attention mechanisms to enhance the encoded video and pseudo-query vectors prior to localization. Through empirical evaluations on two benchmark datasets, we demonstrate that our model surpasses both zero-shot and weakly supervised baselines. These results underscore the significance of leveraging commonsense reasoning abilities in multimodal understanding tasks.
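The cross-attention step can be illustrated with a hedged sketch (dimensions are arbitrary, and random tensors stand in for the GCN-encoded knowledge-graph nodes): video segment features attend over commonsense embeddings to produce enhanced video vectors:

```python
import torch
import torch.nn as nn

d = 64
video = torch.randn(1, 20, d)        # 20 video segment features
commonsense = torch.randn(1, 50, d)  # 50 knowledge-graph node embeddings (stand-ins)

# Video queries attend over commonsense keys/values.
attn = nn.MultiheadAttention(embed_dim=d, num_heads=4, batch_first=True)
enhanced, weights = attn(query=video, key=commonsense, value=commonsense)
print(enhanced.shape)                # torch.Size([1, 20, 64])
```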
- Data Centric Defenses for Privacy Attacks. Abhyankar, Nikhil Suhas (Virginia Tech, 2023-08-14). Recent research shows that machine learning algorithms are highly susceptible to attacks that try to extract sensitive information about the data used in model training. These attacks, called privacy attacks, exploit the model training process. Contemporary defense techniques make alterations to the training algorithm. Such defenses are computationally expensive, cause a noticeable privacy-utility tradeoff, and require control over the training process. This thesis presents a data-centric approach using data augmentations to mitigate privacy attacks. We present privacy-focused data augmentations to change the sensitive data submitted to the model trainer. Compared to traditional defenses, our method provides more control to the individual data owner to protect their private data. The defense is model-agnostic and does not require the data owner to have any control over the model training. Privacy-preserving augmentations are implemented for two attacks, namely membership inference and model inversion, using two distinct techniques. While the proposed augmentations offer a better privacy-utility tradeoff on CIFAR-10 for membership inference, they reduce the reconstruction rate to ≤ 1% while reducing the classification accuracy by only 2% against model inversion attacks. This is the first attempt to defend against model inversion and membership inference attacks using decentralized privacy protection.
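A toy illustration of the data-centric idea (the thesis's actual augmentations are more specialized; the flip-and-noise transform here is just a generic example): the data owner transforms records locally before they ever reach the model trainer, with no change to the training code:

```python
import torch

def privacy_augment(images: torch.Tensor) -> torch.Tensor:
    # Owner-side transform: flip horizontally, add mild Gaussian noise.
    flipped = torch.flip(images, dims=[-1])
    noisy = flipped + 0.05 * torch.randn_like(flipped)
    return noisy.clamp(0.0, 1.0)

batch = torch.rand(8, 3, 32, 32)     # e.g., CIFAR-10-sized images
released = privacy_augment(batch)    # only this ever leaves the owner's hands
```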
- Deep Learning Empowered Unsupervised Contextual Information Extraction and its applications in Communication Systems. Gusain, Kunal (Virginia Tech, 2023-01-16).
- Design and Maintenance of Event Forecasting Systems. Muthiah, Sathappan (Virginia Tech, 2021-03-26). With significant growth in modern forms of communication such as social media and micro-blogs, we are able to gain a real-time understanding of events happening in many parts of the world. In addition, these modern forms of communication have helped shed light on the increasing instabilities across the world via the design of anticipatory intelligence systems [45, 43, 20] that can forecast population-level events like civil unrest and disease occurrences with reasonable accuracy. Event forecasting systems are generally prone to becoming outdated (model drift) as they fail to keep up with constantly changing patterns, and thus require regular re-training in order to sustain their accuracy and reliability. In this dissertation we try to address some of the issues associated with the design and maintenance of event forecasting systems in general. We propose and showcase performance results for a drift adaptation technique in event forecasting systems, and also build a hybrid system for event coding which is cognizant of and seeks human intervention in uncertain prediction contexts, maintaining a good balance between prediction fidelity and the cost of human effort. Specifically, we identify several micro-tasks for event coding and build separate pipelines for each with uncertainty estimation capabilities, thereby being able to seek human feedback whenever required for each micro-task independent of the rest.
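A small sketch of the hybrid routing idea (the classes, probabilities, and threshold are made up): each micro-task prediction carries an uncertainty estimate, and only high-entropy cases are routed to a human coder:

```python
import math

def entropy(probs):
    # Shannon entropy of a predicted class distribution (nats).
    return -sum(p * math.log(p) for p in probs if p > 0)

predictions = [
    ("protest", [0.9, 0.05, 0.05]),   # confident -> auto-accept
    ("strike",  [0.4, 0.35, 0.25]),   # uncertain -> human review
]
THRESHOLD = 0.8
for label, probs in predictions:
    route = "human review" if entropy(probs) > THRESHOLD else "auto-accept"
    print(label, round(entropy(probs), 2), route)
```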
- Designing Human-Centered Collaborative Systems for School Redistricting. Sistrunk, Virginia Andreea (Virginia Tech, 2024-07-24). In a multitude of nations, the provision of education is predominantly facilitated through public schooling systems. These systems are structured in accordance with school districts, which are geographical territories where educational institutions share identical administrative frameworks and frequently coincide with the confines of a city or county. To enhance the operational efficiency of these schooling systems, the demarcations of public schools undergo periodic modifications. This procedure, also known as school redistricting, invariably engenders a myriad of tensions within the associated communities. This dissertation addresses the potential and necessity of integrating geographically enabled crowd-sourced input into the redistricting process, and concurrently presents and evaluates a feasible solution. The pivotal contributions of this dissertation encompass: i) the delineation of the interdisciplinary sub-field at the nexus of HCI, CSCW, and education policy, ii) the identification of requirements from participants proficient in traditional, face-to-face deliberations, representing a diverse array of stakeholder groups, iii) the conception of a self-serve interactive boundary optimization system, and iv) a comprehensive user study conducted during a live public school rezoning deliberation utilizing the newly proposed hybrid approach. The live study specifically elucidates the efficacy of key design choices and the representation and rationalization of intricate user constraints in civic deliberations and educational policy architecture. My research looks into four primary areas of exploration: (i) the application of computer science usability-design principles to augment and expedite the visual deconstruction of intricate multi-domain data, thereby enhancing comprehension for novice users, (ii) the identification of salient elements of experiential learning within the milieu of visual scaffolding, (iii) the development of a preliminary platform designed to expand the capacity for crowd-sourcing novice users in the act of reconciling geo-spatial constraints, and finally, (iv) the utilization of Human-Computer Interaction (HCI) and data-driven analysis to discern, consolidate, and inaugurate novel communication channels that foster the restoration of trust within communities. To do so, I analyzed previous work in the domain, proposed a new direction, and created a web application called Redistrict, an online platform that allows the user to generate and explore "what if" scenarios, express opinions, and participate asynchronously in proximity-based public school boundary deliberations. I first evaluated the perceived value added by Redistrict through a user study with 12 participants experienced in traditional in-person deliberations, representing multiple stakeholder groups. Subsequently, I expanded the testing to an online rezoning. As a result of all interactions and the use of the web application, the participants reported a better understanding of geographically enabled projections and proposals from public officials, and increased consideration of how difficult it is to balance multidisciplinary constraints. Here, I present the design possibilities explored and an effective online aid for public school rezoning deliberations and redistricting.
This data-driven approach aids the school board and decision makers by offering automated strategies and a straightforward, visual, intuitive method to comprehend intricate geographical limitations. The users demonstrated the ability to navigate the interface without any previous training or explanation. In this work, I propose the following three new concepts: (i) A new interdisciplinary subfield for Human-Computer Interaction and Computer-Supported Cooperative Work that combines Computer Science, Geography, and Education Policy. We explain and demonstrate how a single-domain approach failed to support this field and how complex geo-spatial problems require intensive technology to simultaneously balance all education policy constraints; this sits only at the intersection of a multi-domain approach. (ii) A sophisticated deconstruction of intricate data sets is presented through this methodology. It enables users to assimilate, comprehend, and formulate decisions predicated on the information delineated on a geospatial representation, leveraging preexisting knowledge of geographical proximity, and engaging in scenario analysis. Each iterative attempt facilitates incremental understanding, epitomizing the concept of information scaffolding. The efficacy of this process is demonstrated by its ability to foster independent thought and comprehension, obviating the need for explicit instructions. This technique is henceforth referred to as 'visual scaffolding'. (iii) In our most recent investigation, we engage in an introspective analysis of the observed input in civic decision making. We present the proposition of integrating digital civic engagement with user geolocation data. We advocate for the balancing of this input, as certain geographical areas may be disproportionately represented in civic deliberations. The introduction of a weighting mechanism could facilitate a deeper understanding of the foundational premises on which civic decisions are based. We coin the term 'digital geo-civics' to characterize this pioneering approach.
- Detecting Irregular Network Activity with Adversarial Learning and Expert Feedback. Rathinavel, Gopikrishna (Virginia Tech, 2022-06-15). Anomaly detection is a ubiquitous and challenging task relevant across many disciplines. With the vital role communication networks play in our daily lives, the security of these networks is imperative for the smooth functioning of society. This thesis proposes a novel self-supervised deep learning framework, CAAD, for anomaly detection in wireless communication systems. Specifically, CAAD employs powerful adversarial learning and contrastive learning techniques to learn effective representations of normal and anomalous behavior in wireless networks. Rigorous performance comparisons of CAAD with several state-of-the-art anomaly detection techniques have been conducted, verifying that CAAD yields a mean performance improvement of 92.84%. Additionally, CAAD is augmented with the ability to systematically incorporate expert feedback through a novel contrastive learning feedback loop to improve the learned representations and thereby reduce prediction uncertainty (CAAD-EF). CAAD-EF is a novel, holistic, and widely applicable solution to anomaly detection.
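For context, a generic contrastive objective of the kind such frameworks build on is NT-Xent; the sketch below is my illustration of that standard loss, not the CAAD objective itself:

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    # z1, z2: embeddings of two augmented views of the same N samples.
    z = F.normalize(torch.cat([z1, z2]), dim=1)   # 2N normalized embeddings
    sim = z @ z.T / tau                           # pairwise similarities
    sim.fill_diagonal_(float("-inf"))             # mask self-similarity
    n = z1.size(0)
    # Positive of row i is row i+N, and vice versa.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

z1, z2 = torch.randn(16, 32), torch.randn(16, 32)  # two views, batch of 16
print(nt_xent(z1, z2))
```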
- Domain-based Frameworks and Embeddings for Dynamics over Networks. Adhikari, Bijaya (Virginia Tech, 2020-06-01). Broadly, this thesis looks into network and time-series mining problems pertaining to dynamics over networks in various domains. Which locations and staff should we monitor in order to detect C. Difficile outbreaks in hospitals? How do we predict the peak intensity of influenza incidence in an interpretable fashion? How do we infer the states of all nodes in a critical infrastructure network where failures have occurred? Leveraging domain-based information should make it possible to answer these questions. However, several new challenges arise, such as (a) presence of more complex dynamics. The dynamics over networks that we consider are complex. For example, C. Difficile spreads via both people-to-people and surface-to-people interactions, and correlations between failures in critical infrastructures go beyond the network structure and depend on the geography as well. Traditional approaches either rely on models like Susceptible-Infectious (SI) and Independent Cascade (IC), which are too restrictive because they focus only on single pathways, or do not incorporate the model at all, resulting in sub-optimality. (b) data sparsity. Data sparsity also persists in this space. Specifically, it is difficult to collect the exact state of each node in the network, as it is high-dimensional and difficult to directly sample from. (c) mismatch between data and process. In many situations, the underlying dynamical process is unknown or depends on a mixture of several models. In such cases, there is a mismatch between the data collected and the model representing the dynamics. For example, the weighted influenza-like illness (wILI) count released by the CDC, which is meant to represent the raw fraction of the total population infected by influenza, actually depends on multiple factors like the number of health-care providers reporting the number and the public tendency to seek medical advice. In such cases, methods which generalize well to unobserved (or unknown) models are required. Current approaches often fail in tackling these challenges as they either rely on restrictive models, require large volumes of data, and/or work only for predefined models. In this thesis, we propose to leverage domain-based frameworks, which include novel models and analysis techniques, and domain-based low-dimensional representation learning to tackle the challenges mentioned above for network and time-series mining tasks. By developing novel frameworks, we can capture the complex dynamics accurately and analyze them more efficiently. For example, to detect C. Difficile outbreaks in a hospital setting, we use a two-mode disease model to capture multiple pathways of outbreaks and a discrete lattice-based optimization framework. Similarly, we propose an information-theoretic framework which includes geographically correlated failures in critical infrastructure networks to infer the status of network components. Moreover, as we use more realistic frameworks to accurately capture and analyze the mechanistic processes themselves, our approaches are effective even with sparse data. At the same time, learning low-dimensional domain-aware embeddings captures domain-specific properties (like incidence-based similarity between historical influenza seasons) more efficiently from sparse data, which is useful for subsequent tasks.
Similarly, since the domain-aware embeddings capture the model information directly from the data without any modeling assumptions, they generalize better to new models. Our domain-aware frameworks and embeddings enable many applications in critical domains. For example, our domain-aware framework for C. Difficile allows different monitoring rates for people and locations, thus detecting more than 95% of outbreaks. Likewise, our framework for product recommendation in e-commerce for queries with sparse engagement data resulted in a 34% improvement over the current Walmart.com search engine. Our novel framework also leads to near-optimal algorithms, with additive approximation guarantees, for inferring network states given a partial observation of the failures in networks. Additionally, by exploiting domain-aware embeddings, we outperform non-trivial competitors by up to 40% in influenza forecasting. Similarly, domain-aware representations of subgraphs helped us outperform non-trivial baselines by up to 68% in the graph classification task. We believe our techniques will be useful for a variety of other applications in many areas like social networks, urban computing, and so on.
- Enhancing Identity Theory Measurement: A Case Study in Ways to Advance the Subfield. Hayes, Whitney Ann (Virginia Tech, 2024-01-23). Identity theory (IT) is a sociological theory that helps to explain how societal patterns and norms shape the ways in which people behave and make decisions. The current project presents a comprehensive exploration of IT in the context of academic conferences, shedding light on the multifaceted identities of sociologists as scholars, educators, activists, and beyond. It examines how these diverse roles intersect and influence behaviors within professional settings. The first article critiques traditional IT research's limitations and adopts a qualitative approach to more accurately capture how participants describe themselves, moving beyond the constraints of previous methodologies. The second piece investigates homophily, the tendency to associate with similar others. Focusing on minority identities in higher education, this study explores homophily across various demographics, such as race, gender, and academic rank, thus providing insights into the nuances of inequality within academic circles. The final article examines the impact of technology in academic conferences, particularly in the post-COVID-19 era. It analyzes how oppressed identities leverage a conference mobile app for networking, highlighting technology's role in creating inclusive environments and enhancing connections among marginalized groups. Collectively, this dissertation offers a nuanced view of identity within the academic sphere. By challenging existing IT research paradigms, introducing innovative survey techniques, linking IT with homophily, and assessing technology's influence on conference dynamics, this work enriches our understanding of sociologists' identities and interactions. It holds significant implications for future research and the development of more equitable and inclusive sociological communities, emphasizing the complex interplay of personal and professional identities in academic settings.
- Evaluating, Understanding, and Mitigating Unfairness in Recommender Systems. Yao, Sirui (Virginia Tech, 2021-06-10). Recommender systems are information filtering tools that discover potential matchings between users and items and benefit both parties. This benefit can be considered a social resource that should be equitably allocated across users and items, especially in critical domains such as education and employment. Biases and unfairness in recommendations raise both ethical and legal concerns. In this dissertation, we investigate the concept of unfairness in the context of recommender systems. In particular, we study appropriate unfairness evaluation metrics, examine the relation between bias in recommender models and inequality in the underlying population, and propose effective unfairness mitigation approaches. We start by exploring the implications of fairness in recommendation and formulating unfairness evaluation metrics. We focus on the task of rating prediction. We identify the insufficiency of demographic parity for scenarios where the target variable is justifiably dependent on demographic features. We then propose an alternative set of unfairness metrics, measured by how much the average predicted ratings deviate from the average true ratings. We also reduce this unfairness in matrix factorization (MF) models by explicitly adding the metrics as penalty terms to the learning objectives. Next, we target a form of unfairness in matrix factorization models observed as disparate model performance across user groups. We identify four types of biases in the training data that contribute to higher subpopulation error. We then propose personalized regularization learning (PRL), which learns personalized regularization parameters that directly address the data biases. PRL poses the hyperparameter search problem as a secondary learning task. It enables back-propagation to learn the personalized regularization parameters by leveraging the closed-form solutions of alternating least squares (ALS) for solving MF. Furthermore, the learned parameters are interpretable and provide insights into how fairness is improved. Third, we conduct theoretical analysis of the long-term dynamics of inequality in the underlying population, in terms of the fit between users and items. We view the task of recommendation as solving a set of classification problems through threshold policies. We mathematically formulate the transition dynamics of user-item fit in one step of recommendation. We then prove that a system with the formulated dynamics always has at least one equilibrium, and we provide sufficient conditions for the equilibrium to be unique. We also show that, depending on the item category relationships and the recommendation policies, recommendations in one item category can reshape the user-item fit in another item category. To summarize, in this research, we examine different fairness criteria in rating prediction and recommendation, study the dynamics of interactions between recommender systems and users, and propose mitigation methods to promote fairness and equality.
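The ALS building block that PRL differentiates through can be sketched compactly (illustrative only; data and the per-user weights below are random stand-ins): each user's factor vector has a closed-form solution in which a personalized regularization weight appears:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 5, 8, 3
R = rng.random((n_users, n_items))          # observed ratings (toy, dense)
V = rng.normal(size=(n_items, k))           # fixed item factors for this half-step
lam = rng.uniform(0.1, 1.0, size=n_users)   # personalized regularization weights

# Closed-form per-user update: (V'V + lam_u I)^{-1} V' r_u
U = np.stack([
    np.linalg.solve(V.T @ V + lam[u] * np.eye(k), V.T @ R[u])
    for u in range(n_users)
])
print(U.shape)                              # (5, 3)
```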
- Explainable and Network-based Approaches for Decision-making in Emergency Management. Tabassum, Anika (Virginia Tech, 2021-10-19). Critical Infrastructures (CIs), such as power, transportation, and healthcare, refer to systems, facilities, technologies, and networks vital to national security, public health, and the socio-economic well-being of people. CIs play a crucial role in emergency management. For example, Hurricane Ida, the Texas winter storm, and the Colonial Pipeline cyber-attack, all of which occurred in the US during 2021, show that CIs are highly inter-dependent, with complex interactions. Hence power system failures and the shutdown of natural gas pipelines led, in turn, to debilitating impacts on communication, waste systems, public health, etc. Consider power failures during a disaster, such as a hurricane. Subject Matter Experts (SMEs) such as emergency management authorities may be interested in several decision-making tasks. Can we identify disaster phases in terms of the severity of damage by analyzing changes in power failures? Can we tell the SMEs which power grids or regions are the most affected during each disaster phase and need immediate action to recover? Answering these questions can help SMEs respond quickly and send resources for fast recovery from damage. Can we systematically characterize how the failure of different power grids may impact the whole set of CIs due to inter-dependencies? This can help SMEs better prepare for and mitigate risks by improving system resiliency. In this thesis, we explore problems in efficiently operating decision-making tasks during a disaster for emergency management authorities. Our research has two primary directions: guiding decision-making in resource allocation, and planning to improve system resiliency. Our work is done in collaboration with the Oak Ridge National Laboratory to contribute impactful research on real-life CIs and disaster power-failure data. 1. Explainable resource allocation: In contrast to current interpretable or explainable models that provide answers to understand a model's output, we view explanations as answers to guide resource allocation decision-making. In this thesis, we focus on developing a novel model and algorithm to identify disaster phases from changes in power failures, and to pinpoint the regions that may be most affected at each disaster phase so the SMEs can send resources for fast recovery. 2. Networks for improving system resiliency: We view CIs as a large heterogeneous network with nodes as infrastructure components and dependencies as edges. Our goal is to construct a visual analytics tool and develop a domain-inspired model to identify the important components and connections on which the SMEs need to focus and better prepare to mitigate the risk of a disaster.
- A Framework for Automated Discovery and Analysis of Suspicious Trade Records. Datta, Debanjan (Virginia Tech, 2022-05-27). Illegal logging and timber trade present a persistent threat to global biodiversity and national security due to their ties with illicit financial flows, and cause revenue loss. The scale of global commerce in timber and associated products, combined with the complexity and geographical spread of the supply chain entities, presents a non-trivial challenge in detecting such transactions. International shipment records, specifically those containing bills of lading, are a key source of data which can be used to detect, investigate, and act upon such transactions. The comprehensive problem can be described as building a framework that can perform automated discovery and facilitate actionability on detected transactions. A data-driven, machine learning based approach is necessitated by the volume, velocity, and complexity of international shipping data. Such an automated framework can immensely benefit our targeted end-users, specifically the enforcement agencies. This overall problem comprises multiple connected sub-problems with associated research questions. We incorporate crucial domain knowledge, in terms of data as well as modeling, by employing the expertise of collaborating domain specialists from ecological conservationist agencies. The collaborators provide formal and informal inputs spanning the stages from requirement specification to design. Following the paradigm of similar problems such as fraud detection explored in prior literature, we formulate the core problem of discovering suspicious transactions as an anomaly detection task. The first sub-problem is to build a system that can be used to find suspicious transactions in shipment data pertaining to imports and exports of multiple countries with different country-specific schemas. We present a novel anomaly detection approach for multivariate categorical data, following the constraints of the data characteristics, combined with a data pipeline that incorporates domain knowledge. The focus of the second problem is U.S.-specific imports, where the data characteristics differ from the prior sub-problem, with heterogeneous attributes present. This problem is important since the U.S. is a top consumer and there is scope for actionable enforcement. For this we present a contrastive learning based anomaly detection model for heterogeneous tabular data, with performance and scalability characteristics applicable to real-world trade data. While the first two problems address the task of detecting suspicious trades through anomaly detection, a practical challenge with anomaly detection based systems is that of relevancy, or scenario-specific precision. The third sub-problem addresses this through a human-in-the-loop approach augmented by visual analytics, to re-rank anomalies in terms of relevance, providing explanations for the cause of anomalies and soliciting feedback. The last sub-problem pertains to explainability and actionability towards suspicious records, through algorithmic recourse. Algorithmic recourse aims to provide meaningful alternatives for flagged anomalous records, such that those counterfactual examples are not judged anomalous by the underlying anomaly detection system. This can help enforcement agencies advise verified trading entities in modifying their trading patterns to avoid false detection, thus streamlining the process.
We present a novel formulation and metrics for this unexplored problem of algorithmic recourse in anomaly detection, and a deep learning based approach to explaining anomalies and generating counterfactuals. The overall research contributions presented in this dissertation thus address the requirements of the framework, and have general applicability in similar scenarios beyond the scope of this framework.
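A stylized sketch of algorithmic recourse for anomaly detection (my framing, not the dissertation's formulation; the detector below is a toy stand-in): nudge a flagged record along the anomaly-score gradient until the detector no longer judges it anomalous:

```python
import torch

def anomaly_score(x):
    # Toy differentiable detector: distance from the "normal" center.
    center = torch.tensor([0.0, 0.0])
    return ((x - center) ** 2).sum()

x = torch.tensor([3.0, 4.0], requires_grad=True)   # flagged record
threshold = 1.0
for _ in range(200):
    score = anomaly_score(x)
    if score < threshold:
        break
    score.backward()
    with torch.no_grad():
        x -= 0.05 * x.grad     # small step toward a counterfactual
    x.grad = None

print(x.detach(), float(anomaly_score(x)))   # record now scores below threshold
```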
- Information Extraction of Technical Details From Scholarly Articles. Kaushal, Kulendra Kumar (Virginia Tech, 2021-06-16). Researchers have made significant progress in information extraction from short documents in the last few years, including social media interactions, news articles, and email excerpts. This research aims to extract technical entities like hardware resources, computing platforms, compute time, programming languages, and libraries from scholarly research articles. Research articles are generally long documents containing both salient and non-salient entities. Analyzing cross-sectional relations, filtering the relevant information, measuring the saliency of mentioned entities, and extracting novel entities are some of the technical challenges involved in this research. This work presents a detailed study of the performance, effectiveness, and scalability of rule-based weakly supervised algorithms. We also develop an automated end-to-end Research Entity and Relationship Extractor (E2R Extractor). Additionally, we perform a comprehensive study of the effectiveness of existing deep learning-based information extraction tools like DyGIE, DyGIE++, and SciREX. The research also contributes a dataset containing novel entities annotated in BILUO format, and reports baseline results using the E2R Extractor on the proposed dataset. The results indicate that the E2R Extractor successfully extracts salient entities from research articles.
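For reference, the BILUO scheme tags each token as B-egin, I-nside, or L-ast of a multi-token entity, U-nit for a single-token entity, or O-utside any entity; the tokens and tags below are a made-up example, not drawn from the contributed dataset:

```python
tokens = ["Trained", "on", "four", "NVIDIA", "V100", "GPUs",
          "using", "PyTorch", "."]
tags   = ["O", "O", "O", "B-HARDWARE", "I-HARDWARE", "L-HARDWARE",
          "O", "U-LIBRARY", "O"]
for tok, tag in zip(tokens, tags):
    print(f"{tok}\t{tag}")
```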