Browsing by Author "Huang, Bert"
Now showing 1 - 20 of 53
- Action Recognition with Knowledge Transfer. Choi, Jin-Woo (Virginia Tech, 2021-01-07). Recent progress on deep neural networks has shown remarkable action recognition performance from videos. The remarkable performance is often achieved by transfer learning: training a model on a large-scale labeled dataset (source) and then fine-tuning the model on the small-scale labeled datasets (targets). However, existing action recognition models do not always generalize well on new tasks or datasets for two reasons. i) Current action recognition datasets have a spurious correlation between action types and background scene types. The models trained on these datasets are biased towards the scene instead of focusing on the actual action. This scene bias leads to poor generalization performance. ii) Directly testing the model trained on the source data on the target data leads to poor performance because the source and target distributions are different. Fine-tuning the model on the target data can mitigate this issue. However, manually labeling small-scale target videos is labor-intensive. In this dissertation, I propose solutions to these two problems. For the first problem, I propose to learn scene-invariant action representations to mitigate the scene bias in action recognition models. Specifically, I augment the standard cross-entropy loss for action classification with 1) an adversarial loss for the scene types and 2) a human mask confusion loss for videos where the human actors are invisible. These two losses encourage learning representations unsuitable for predicting 1) the correct scene types and 2) the correct action types when there is no evidence. I validate the efficacy of the proposed method by transfer learning experiments. I transfer the pre-trained model to three different tasks, including action classification, temporal action localization, and spatio-temporal action detection. The results show consistent improvement over the baselines for every task and dataset. I formulate human action recognition as an unsupervised domain adaptation (UDA) problem to handle the second problem. In the UDA setting, we have many labeled videos as source data and unlabeled videos as target data. We can use already existing labeled video datasets as source data in this setting. The task is to align the source and target feature distributions so that the learned model can generalize well on the target data. I propose 1) aligning the more important temporal part of each video and 2) encouraging the model to focus on action, not the background scene, to learn domain-invariant action representations. The proposed method is simple and intuitive while achieving state-of-the-art performance without training on a lot of labeled target videos. I relax the unsupervised target data setting to a sparsely labeled target data setting. Then I explore semi-supervised video action recognition, where we have a lot of labeled videos as source data and sparsely labeled videos as target data. The semi-supervised setting is practical as sometimes we can afford a little bit of cost for labeling target data. I propose multiple video data augmentation methods to inject photometric, geometric, temporal, and scene invariances into the action recognition model in this setting. The resulting method shows favorable performance on the public benchmarks.
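A hedged reading of the debiasing objective in the entry above: the standard action cross-entropy is combined with an adversarial scene-classification term and a confusion (maximum-entropy) term on human-masked clips. The PyTorch sketch below illustrates that combination; the function and argument names, the loss weights, and the gradient-reversal detail are illustrative assumptions, not the dissertation's actual code.

```python
import torch
import torch.nn.functional as F

def debiasing_losses(action_logits, scene_logits, masked_action_logits,
                     action_labels, scene_labels,
                     lambda_scene=0.5, lambda_mask=0.5):
    # 1) Standard cross-entropy for action classification.
    loss_action = F.cross_entropy(action_logits, action_labels)

    # 2) Adversarial scene loss: a scene head predicts the scene type from the
    #    action features; a gradient-reversal layer (not shown) makes the
    #    backbone learn features that are poor at this prediction.
    loss_scene = F.cross_entropy(scene_logits, scene_labels)

    # 3) Human-mask confusion loss: with the actor masked out there is no
    #    evidence for the action, so push predictions toward uniform by
    #    minimizing negative entropy (i.e., maximizing entropy).
    probs = F.softmax(masked_action_logits, dim=1)
    loss_confusion = (probs * torch.log(probs.clamp_min(1e-8))).sum(dim=1).mean()

    return loss_action + lambda_scene * loss_scene + lambda_mask * loss_confusion
```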
- Addressing Occlusion in Panoptic Segmentation. Sarkaar, Ajit Bhikamsingh (Virginia Tech, 2021-01-20). Visual recognition tasks have witnessed vast improvements in performance since the advent of deep learning. Despite the gains in performance, image understanding algorithms are still not completely robust to partial occlusion. In this work, we propose a novel object classification method based on compositional modeling and explore its effect in the context of the newly introduced panoptic segmentation task. The panoptic segmentation task combines both semantic and instance segmentation to perform labelling of the entire image. The novel classification method replaces the object detection pipeline in UPSNet, a Mask R-CNN based design for panoptic segmentation. We also discuss an issue with the segmentation mask prediction of Mask R-CNN that affects overlapping instances. We perform extensive experiments and showcase results on the complex COCO and Cityscapes datasets. The novel classification method shows promising results for object classification on occluded instances in complex scenes.
- ALJI: Active Listening Journal Interaction. Sullivan, Patrick Ryan (Virginia Tech, 2019-10-29). Depression is a crippling burden on a great many people, and it is often well hidden. Mental health professionals are able to treat depression, but the general public is not well versed in recognizing depression symptoms or assessing their own mental health. Active Listening Journal Interaction (ALJI) is a computer program that seeks to identify and refer people suffering from depression to mental health support services. It does this by analyzing personal journal entries using machine learning and then privately responding to the author with proper guidance. In this thesis, we focus on determining the feasibility and usefulness of the machine learning models that drive ALJI. Under heavy data limitations, we cautiously report that, from a single journal entry, our model detects when a person's symptoms warrant professional intervention with 61% accuracy. The thesis includes extensive discussion of the proposed solution, methods, results, and future directions of ALJI.
- ‘Beating the news’ with EMBERS: Forecasting Civil Unrest using Open Source Indicators. Ramakrishnan, Naren; Butler, Patrick; Self, Nathan; Khandpur, Rupinder P.; Saraf, Parang; Wang, Wei; Cadena, Jose; Vullikanti, Anil Kumar S.; Korkmaz, Gizem; Kuhlman, Christopher J.; Marathe, Achla; Zhao, Liang; Ting, Hua; Huang, Bert; Srinivasan, Aravind; Trinh, Khoa; Getoor, Lise; Katz, Graham; Doyle, Andy; Ackermann, Chris; Zavorin, Ilya; Ford, Jim; Summers, Kristen; Fayed, Youssef; Arredondo, Jaime; Gupta, Dipak; Mares, David; Muthia, Sathappan; Chen, Feng; Lu, Chang-Tien (2014). We describe the design, implementation, and evaluation of EMBERS, an automated, 24x7 continuous system for forecasting civil unrest across 10 countries of Latin America using open source indicators such as tweets, news sources, blogs, economic indicators, and other data sources. Unlike retrospective studies, EMBERS has been making forecasts into the future since Nov 2012, which have been (and continue to be) evaluated by an independent T&E team (MITRE). Of note, EMBERS has successfully forecast the uptick and downtick of incidents during the June 2013 protests in Brazil. We outline the system architecture of EMBERS, individual models that leverage specific data sources, and a fusion and suppression engine that supports trading off specific evaluation criteria. EMBERS also provides an audit trail interface that enables the investigation of why specific predictions were made along with the data utilized for forecasting. Through numerous evaluations, we demonstrate the superiority of EMBERS over baserate methods and its capability to forecast significant societal happenings.
- Bounded Expectation of Label Assignment: Dataset Annotation by Supervised Splitting with Bias-Reduction Techniques. Herbst, Alyssa Kathryn (Virginia Tech, 2020-01-20). Annotating large unlabeled datasets can be a major bottleneck for machine learning applications. We introduce a scheme for inferring labels of unlabeled data at a fraction of the cost of labeling the entire dataset. We refer to the scheme as Bounded Expectation of Label Assignment (BELA). BELA greedily queries an oracle (or human labeler) and partitions a dataset to find data subsets that have mostly the same label. BELA can then infer labels by majority vote of the known labels in each subset. BELA makes the decision to split or label a subset by maximizing a lower bound on the expected number of correctly labeled examples. BELA improves upon existing hierarchical labeling schemes by using supervised models to partition the data, therefore avoiding reliance on unsupervised clustering methods that may not accurately group data by label. We design BELA with strategies to avoid bias that could be introduced through this adaptive partitioning. We evaluate BELA on labeling four datasets and find that it outperforms existing strategies for adaptive labeling.
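The inference step described above, assigning each unlabeled example the majority label of the oracle-labeled examples in its subset, can be pictured with the minimal sketch below. The partitioning itself (supervised splitting driven by the expected-correct-labels bound) is outside this snippet, and the names are illustrative.

```python
import numpy as np
from collections import Counter

def infer_labels_by_majority(partition_ids, oracle_labels):
    """partition_ids: one subset id per example (from the supervised splits).
    oracle_labels: dict {example_index: label} for the examples already queried."""
    inferred = {}
    for subset in np.unique(partition_ids):
        members = np.where(partition_ids == subset)[0]
        votes = [oracle_labels[i] for i in members if i in oracle_labels]
        if votes:  # at least one queried label in this subset
            majority = Counter(votes).most_common(1)[0][0]
            for i in members:
                inferred[i] = oracle_labels.get(i, majority)
    return inferred
```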
- Continual Learning for Deep Dense Prediction. Lokegaonkar, Sanket Avinash (Virginia Tech, 2018-06-11). Transferring a deep learning model from old tasks to a new one is known to suffer from catastrophic forgetting. Such a forgetting mechanism is problematic as it does not allow us to accumulate knowledge sequentially and requires retaining and retraining on all the training data. Existing techniques for mitigating the abrupt performance degradation on previously trained tasks have mainly been studied in the context of image classification. In this work, we present a simple method to alleviate catastrophic forgetting for pixel-wise dense labeling problems. We build upon a regularization technique using knowledge distillation to minimize the discrepancy between the posterior distributions of pixel class labels for old tasks predicted from 1) the original and 2) the updated networks. This technique, however, might fail in circumstances where the source and target distributions differ significantly. To handle this scenario, we further propose an improvement to the distillation-based approach by adding adaptive l2-regularization that depends on the per-parameter importance to the older tasks. We train our model on FCN8s, but our training can be generalized to stronger models like DeepLab, PSPNet, etc. Through extensive evaluation and comparisons, we show that our technique can incrementally train dense prediction models for novel object classes, different visual domains, and different visual tasks.
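As a rough illustration of the two regularizers in the entry above (knowledge distillation against the original network's per-pixel posteriors, plus an importance-weighted L2 penalty on parameter drift), here is a PyTorch-style sketch. The temperature, the weighting, and how the importance values are computed are assumptions; the thesis's exact formulation may differ.

```python
import torch
import torch.nn.functional as F

def continual_dense_loss(new_logits, old_logits, named_params, old_param_values,
                         importance, temperature=2.0, lambda_l2=1.0):
    """new_logits / old_logits: per-pixel class scores (N, C, H, W) from the
    updated and the frozen original network on the same input.
    named_params: e.g. model.named_parameters(); importance: per-parameter
    weights estimating relevance to the older tasks."""
    # Distillation: keep the updated network's posterior over old classes
    # close to the original network's posterior, pixel-wise.
    log_p_new = F.log_softmax(new_logits / temperature, dim=1)
    p_old = F.softmax(old_logits / temperature, dim=1)
    distill = F.kl_div(log_p_new, p_old, reduction="batchmean")

    # Adaptive L2: penalize drift of each parameter in proportion to its
    # estimated importance for the previously learned tasks.
    drift = sum((importance[name] * (p - old_param_values[name]).pow(2)).sum()
                for name, p in named_params)
    return distill + lambda_l2 * drift
```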
- Data Augmentation with Seq2Seq Models. Granstedt, Jason Louis (Virginia Tech, 2017-07-06). Paraphrase sparsity is an issue that complicates the training process of question answering systems: syntactically diverse but semantically equivalent sentences can have significant disparities in predicted output probabilities. We propose a method for generating an augmented paraphrase corpus for the visual question answering system to make it more robust to paraphrases. This corpus is generated by concatenating two sequence-to-sequence models. In order to generate diverse paraphrases, we sample the neural network using diverse beam search. We evaluate the results on the standard VQA validation set. Our approach results in a significantly expanded training dataset and vocabulary size, but has slightly worse performance when tested on the validation split. Although not as fruitful as we had hoped, our work highlights additional avenues for investigation into selecting more optimal model parameters and the development of a more sophisticated paraphrase filtering algorithm. The primary contribution of this work is the demonstration that decent paraphrases can be generated from sequence-to-sequence models and the development of a pipeline for producing an augmented dataset.
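Diverse beam search, as used in the entry above, is available in common seq2seq toolkits; the snippet below sketches the idea with Hugging Face transformers. The checkpoint name is a placeholder (no specific model from the thesis is implied), and the thesis's own models were trained rather than loaded this way.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

def diverse_paraphrases(text, model_name="paraphrase-model-a", n=4):
    """Return n paraphrases generated with diverse beam search.
    "paraphrase-model-a" is a placeholder checkpoint name."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        num_beams=n,
        num_beam_groups=n,        # one beam per group
        diversity_penalty=1.0,    # pushes the groups to differ from one another
        num_return_sequences=n,
        max_new_tokens=40,
    )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

# "Concatenating" two seq2seq models can then be read as feeding each output
# of the first model through a second paraphrasing model.
```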
- Data Mining Academic Emails to Model Employee Behaviors and Analyze Organizational Structure. Straub, Kayla Marie (Virginia Tech, 2016-06-06). Email correspondence has become the predominant method of communication for businesses. If not for the inherent privacy concerns, this electronically searchable data could be used to better understand how employees interact. After the Enron dataset was made available, researchers were able to provide great insight into employee behaviors based on the available data, despite the many challenges with that dataset. The work in this thesis applies a suite of methods to an appropriately anonymized academic email dataset created from volunteers' email metadata. This new dataset, from an internal email server, is first used to validate feature extraction and machine learning algorithms in order to generate insight into the interactions within the center. Based solely on email metadata, a random forest approach models behavior patterns and predicts employee job titles with 96% accuracy. This result represents classifier performance not only on participants in the study but also on other members of the center who were connected to participants through email. Furthermore, the data revealed relationships not present in the center's formal operating structure. The culmination of this work is an organic organizational chart, which contains a fuller understanding of the center's internal structure than can be found in the official organizational chart.
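The classification setup described above (a random forest over email-metadata features predicting job titles) follows a standard scikit-learn pattern. The sketch below uses made-up feature names and a placeholder file path purely for illustration; the thesis engineers its own features from senders, recipients, and timestamps, never message content.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical metadata-derived features and file path.
emails = pd.read_csv("email_metadata_features.csv")
X = emails[["msgs_sent", "msgs_received", "unique_contacts",
            "avg_recipients", "after_hours_fraction"]]
y = emails["job_title"]

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("cross-validated accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```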
- Data Mining Twitter to Improve Automated Vehicle Safety. McDonald, Anthony D.; Huang, Bert; Wei, Ran; Alambeigi, Hananeh; Arachie, Chidubem; Smith, Alexander Charles; Jefferson, Jacelyn (SAFE-D: Safety Through Disruption National University Transportation Center, 2021-02). Automated vehicle (AV) technologies may significantly improve driving safety, but only if they are widely adopted and used appropriately. Adoption and appropriate use are influenced by user expectations, which are increasingly being driven by social media. In the context of AVs, prior studies have observed that major news events such as crashes and technology announcements influence user responses to AVs; however, the exact impact and dynamics of this influence are not well understood. The goals of this project were to develop a novel search method to identify AV-relevant user comments on Twitter, mine these tweets to understand the influence of crashes and news events on user sentiment about AVs, and finally translate these findings into a set of guidelines for reporting about AV crashes. In service of these goals, we developed a novel semi-supervised constrained-level learning machine search approach to identify relevant tweets and demonstrated that it outperformed alternative methods. We used the relevant tweets identified to develop a topic model of AV events, which illustrated that crashes, fault and safety, and technology companies were the most discussed topics following major events. While the sentiment among these topics was mostly neutral, tweets about crashes and fault and safety were negatively biased. We combined these findings with a series of interviews with Public Information Officers to develop a set of five basic guidelines for AV communication. These guidelines should aid proper public calibration and subsequent acceptance and use of AVs.
- Data-driven customer energy behavior characterization for distributed energy management. Afzalan, Milad (Virginia Tech, 2020-07-01). With ever-growing environmental and climate concerns about energy consumption in our society, it is crucial to develop novel solutions that improve the efficient utilization of distributed energy resources for energy efficiency and demand response (DR). As such, there is a need to develop targeted energy programs, which not only meet the requirement of energy goals for a community but also take the energy use patterns of individual households into account. To this end, a sound understanding of the energy behavior of customers at the neighborhood level is needed, which requires operational analytics on the wealth of energy data from customers and devices. In this dissertation, we focus on data-driven solutions for customer energy behavior characterization with applications to distributed energy management and flexibility provision. To do so, the following problems were studied: (1) how different customers can be segmented for DR events based on their energy-saving potential and balancing peak and off-peak demand, (2) what the opportunities are for extracting Time-of-Use of specific loads for automated DR applications from whole-house energy data without in-situ training, and (3) how flexibility in customer demand and adoption of renewable and distributed resources (e.g., solar panels, batteries, and smart loads) can improve the demand-supply problem. In the first study, a segmentation methodology from historical energy data of households is proposed to estimate the energy-saving potential for DR programs at a community level. The proposed approach characterizes certain attributes in time-series data such as frequency, consistency, and peak time usage. The empirical evaluation of real energy data of 400 households shows the successful ranking of different subsets of consumers according to their peak energy reduction potential for the DR event. Specifically, it was shown that the proposed approach could successfully identify the 20-30% of customers who could achieve 50-70% of the total possible demand reduction for DR. Furthermore, the rebound effect problem (creating undesired peak demand after a DR event) was studied, and it was shown that the proposed approach has the potential of identifying a subset of consumers (~5%-40%, with specific loads like AC and electric vehicles) who contribute to balancing the peak and off-peak demand. A projection on Austin, TX showed that a 16 MWh reduction during a 2-h event can be achieved by a justified selection of 20% of residential customers. In the second study, the feasibility of inferring time-of-use (ToU) operation of flexible loads for DR applications was investigated. Unlike several efforts that required considerable model parameter selection or training, we sought to infer ToU from machine learning models without in-situ training. As the first part of this study, ToU inference from low-resolution 15-minute data (smart meter data) was investigated. A framework was introduced which leveraged the smart meter data from a set of neighbor buildings (equipped with plug meters) with similar energy use behavior for training. Through identifying similar buildings in energy use behavior, machine learning classification models (including neural networks, SVM, and random forests) were employed for inference of appliance ToU in buildings by accounting for resident behavior reflected in their energy load shapes from smart meter data.
Investigation of electric vehicle (EV) and dryer loads for 10 buildings over 20 days showed average F-scores of 83% and 71%. As the second part of this study, ToU inference from high-resolution data (60 Hz) was investigated. A self-configuring framework, based on the concept of spectral clustering, was introduced that automatically extracts the appliance signature from historical data in the environment to avoid the problem of model parameter selection. Using the framework, appliance signatures are matched with new events in the electricity signal to identify the ToU of major loads. The results on ~1500 events showed an F-score of >80% for major loads like AC, washing machine, and dishwasher. In the third study, the problem of demand-supply balance, in the presence of varying levels of small-scale distributed resources (solar panels, batteries, and smart loads), was investigated. The concept of load complementarity between consumers and prosumers for load balancing among a community of ~250 households was investigated. The impact of different scenarios, such as varying levels of solar penetration and battery integration, in addition to users' flexibility, on balancing the supply and demand was quantitatively measured. It was shown that (1) even with 100% adoption of solar panels, the renewable supply cannot cover the demand of the network during afternoon times (e.g., after 3 pm), (2) integrating batteries for individual households could improve self-sufficiency by more than 15% during solar generation time, and (3) without any battery, smart loads are also capable of improving self-sufficiency as an alternative, providing ~60% of what commercial battery systems would offer. The contribution of this dissertation is in introducing data-driven solutions and investigations for characterizing the energy behavior of households, which could increase the flexibility of the aggregate daily energy load profiles for a community. When combined, the findings of this research can serve the field of utility-scale energy analytics for the integration of DR and improved reshaping of network energy profiles (i.e., mitigating the peaks and valleys in daily demand profiles).
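For the high-resolution, self-configuring framework in the second study above, one way to read "extract appliance signatures by spectral clustering and match new events to them" is sketched below with scikit-learn. The event featurization and the choice of the number of clusters are simplified assumptions, not the dissertation's procedure.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def extract_signatures(event_features, n_appliances):
    """Cluster historical power events and keep the cluster means as
    appliance signatures (event featurization not shown)."""
    labels = SpectralClustering(n_clusters=n_appliances,
                                affinity="nearest_neighbors").fit_predict(event_features)
    return np.vstack([event_features[labels == k].mean(axis=0)
                      for k in range(n_appliances)])

def match_event(new_event, signatures):
    """Assign a new event in the electricity signal to the closest signature."""
    return int(np.linalg.norm(signatures - new_event, axis=1).argmin())
```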
- Deep Learning for Enhancing Precision Medicine. Oh, Min (Virginia Tech, 2021-06-07). Most medical treatments have been developed aiming at the best-on-average efficacy for large populations, resulting in treatments successful for some patients but not for others. This necessitates precision medicine that tailors medical treatment to individual patients. Omics data holds comprehensive genetic information on individual variability at the molecular level and hence the potential to be translated into personalized therapy. However, attempts to transform omics data-driven insights into clinically actionable models for individual patients have been limited. Meanwhile, advances in deep learning, one of the most promising branches of artificial intelligence, have produced unprecedented performance in various fields. Although several deep learning-based methods have been proposed to predict individual phenotypes, they have not established the state of the practice, due to the instability of selected or learned features derived from extremely high-dimensional data with low sample sizes, which often results in overfitted models with high variance. To overcome the limitations of omics data, recent advances in deep learning models, including representation learning models, generative models, and interpretable models, can be considered. The goal of the proposed work is to develop deep learning models that can overcome the limitations of omics data to enhance the prediction of personalized medical decisions. To achieve this, three key challenges should be addressed: 1) effectively reducing dimensions of omics data, 2) systematically augmenting omics data, and 3) improving the interpretability of omics data.
- Deep Representation Learning on Labeled Graphs. Fan, Shuangfei (Virginia Tech, 2020-01-27). We introduce recurrent collective classification (RCC), a variant of ICA analogous to recurrent neural network prediction. RCC accommodates any differentiable local classifier and relational feature functions. We provide gradient-based strategies for optimizing over model parameters to more directly minimize the loss function. In our experiments, this direct loss minimization translates to improved accuracy and robustness on real network data. We demonstrate the robustness of RCC in settings where local classification is very noisy, settings that are particularly challenging for ICA. As a new way to train generative models, generative adversarial networks (GANs) have achieved considerable success in image generation, and this framework has also recently been applied to data with graph structures. We identify the drawbacks of existing deep frameworks for generating graphs, and we propose labeled-graph generative adversarial networks (LGGAN) to train deep generative models for graph-structured data with node labels. We test the approach on various types of graph datasets, such as collections of citation networks and protein graphs. Experiment results show that our model can generate diverse labeled graphs that match the structural characteristics of the training data and outperforms all baselines in terms of quality, generality, and scalability. To further evaluate the quality of the generated graphs, we apply it to a downstream task for graph classification, and the results show that LGGAN can better capture the important aspects of the graph structure.
- Detecting Bots using Stream-based System with Data Synthesis. Hu, Tianrui (Virginia Tech, 2020-05-28). Machine learning has shown great success in building security applications including bot detection. However, many machine learning models are difficult to deploy since model training requires a continuous supply of representative labeled data, which is expensive and time-consuming to obtain in practice. In this thesis, we build a bot detection system with a data synthesis method to explore detecting bots with limited data to address this problem. We collected the network traffic from 3 online services in three different months within a year (23 million network requests). We develop a novel stream-based feature encoding scheme to support our model to perform real-time bot detection on anonymized network data. We propose a data synthesis method to synthesize unseen (or future) bot behavior distributions to enable our system to detect bots with extremely limited labeled data. The synthesis method is distribution-aware, using two different generators in a Generative Adversarial Network to synthesize data for the clustered regions and the outlier regions in the feature space. We evaluate this idea and show our method can train a model that outperforms existing methods with only 1% of the labeled data. We show that data synthesis also improves the model's sustainability over time and speeds up retraining. Finally, we compare data synthesis and adversarial retraining and show they can work complementarily with each other to improve the model generalizability.
- Directional Airflow for HVAC Systems. Abedi, Milad (Virginia Tech, 2019). Directional airflow has been utilized to enable targeted air conditioning in cars and airplanes for many years, where the occupants can adjust the direction of flow. In the building sector, however, HVAC systems are usually equipped with stationary diffusors that can only supply air either in the form of diffusion or in a fixed direction to the room in which they have been installed. In the present thesis, the possibility of adopting directional airflow in lieu of conventional uniform diffusors has been investigated. The potential benefits of such a modification to the control capabilities of the HVAC system, in terms of improvements in overall occupant thermal comfort and energy consumption of the HVAC system, have been investigated via a simulation study and an experimental study. In the simulation study, an average of 59% per cycle reduction was achieved in energy consumption. The reduction in the required duration of airflow (proportional to energy consumption) in the experimental study was 64% per cycle. The feasibility of autonomous control of the directional airflow has been studied in a simulation experiment by utilizing the Reinforcement Learning algorithm, an artificial intelligence approach that facilitates autonomous control in unknown environments. In order to demonstrate the feasibility of enabling existing HVAC systems to control the direction of airflow, a device (called an active diffusor) was designed and prototyped. The active diffusor successfully replaced the existing uniform diffusor and was able to effectively target the occupant positions by accurately directing the airflow jet to the desired positions.
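The reinforcement learning controller mentioned above can be pictured as a tabular Q-learning loop over discretized thermal states and airflow directions. The state, action, and reward definitions below are placeholders, not the thesis's actual formulation.

```python
import numpy as np

# Tabular Q-learning over discretized thermal states and airflow directions.
n_states, n_actions = 50, 8            # placeholder discretization
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.1
rng = np.random.default_rng(0)

def choose_direction(state):
    """Epsilon-greedy choice of airflow direction for the current state."""
    if rng.random() < eps:
        return int(rng.integers(n_actions))        # explore
    return int(Q[state].argmax())                  # exploit

def update(state, action, reward, next_state):
    """One Q-learning update; the reward would encode comfort vs. energy use."""
    target = reward + gamma * Q[next_state].max()
    Q[state, action] += alpha * (target - Q[state, action])
```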
- Distinguishing Dynamical Kinds: An Approach for Automating Scientific Discovery. Shea-Blymyer, Colin (Virginia Tech, 2019-07-02). The automation of scientific discovery has been an active research topic for many years. The promise of a formalized approach to developing and testing scientific hypotheses has attracted researchers from the sciences, machine learning, and philosophy alike. Leveraging the concept of dynamical symmetries, a new paradigm is proposed for the collection of scientific knowledge, and algorithms are presented for the development of EUGENE, an automated scientific discovery tool-set. These algorithms have direct applications in model validation, time series analysis, and system identification. Further, the EUGENE tool-set provides a novel metric of dynamical similarity that allows a system to be clustered into its dynamical regimes. This dynamical distance is sensitive to the presence of chaos, effective order, and nonlinearity. I discuss the history and background of these algorithms, provide examples of their behavior, and present their use for exploring system dynamics.
- End-To-End Text Detection Using Deep Learning. Ibrahim, Ahmed Sobhy Elnady (Virginia Tech, 2017-12-19). Text detection in the wild is the problem of locating text in images of everyday scenes. It is a challenging problem due to the complexity of everyday scenes. This problem possesses great importance for many trending applications, such as self-driving cars. Previous research in text detection has been dominated by multi-stage sequential approaches which suffer from many limitations, including error propagation from one stage to the next. Another line of work is the use of deep learning techniques. Some of the deep methods used for text detection are box detection models and fully convolutional models. Box detection models suffer from the nature of the annotations, which may be too coarse to provide detailed supervision. Fully convolutional models learn to generate pixel-wise maps that represent the location of text instances in the input image. These models suffer from the inability to create accurate word-level annotations without heavy post-processing. To overcome these problems, we propose a novel end-to-end system based on a mix of novel deep learning techniques. The proposed system consists of an attention model, based on a new deep architecture proposed in this dissertation, followed by a deep network based on Faster-RCNN. The attention model produces a high-resolution map that indicates likely locations of text instances. A novel aspect of the system is an early fusion step that merges the attention map directly with the input image prior to word-box prediction. This approach suppresses but does not eliminate contextual information from consideration. Progressively larger models were trained in 3 separate phases. The resulting system has demonstrated an ability to detect text under difficult conditions related to illumination, resolution, and legibility. The system has exceeded the state of the art on the ICDAR 2013 and COCO-Text benchmarks with F-measure values of 0.875 and 0.533, respectively.
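The early-fusion step described above merges the attention map with the input image before word-box prediction while suppressing, but not eliminating, context. One plausible reading is a soft pixel-wise reweighting, sketched below; the actual fusion operator in the dissertation may differ.

```python
import torch

def early_fusion(image, attention_map, floor=0.5):
    """image: (N, 3, H, W); attention_map: (N, 1, H, W) with values in [0, 1].
    Reweight pixels by the text-likelihood map but keep a floor on the
    original signal, so context is suppressed rather than eliminated."""
    return image * (floor + (1.0 - floor) * attention_map)
```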
- Enhancing Trust in Autonomous Systems without Verifying Software. Stamenkovich, Joseph Allan (Virginia Tech, 2019-06-12). The complexity of the software behind autonomous systems is rapidly growing, as are the applications of what they can do. It is not unusual for the lines of code to reach the millions, which adds to the verification challenge. The machine learning algorithms involved are often "black boxes" where the precise workings are not known by the developer applying them, and their behavior is undefined when encountering an untrained scenario. With so much code, the possibility of bugs or malicious code is considerable. An approach is developed to monitor and possibly override the behavior of autonomous systems independent of the software controlling them. Application-isolated safety monitors are implemented in configurable hardware to ensure that the behavior of an autonomous system is limited to what is intended. The sensor inputs may be shared with the software, but the output from the monitors is only engaged when the system violates its prescribed behavior. For each specific rule the system is expected to follow, a monitor is present processing the relevant sensor information. The behavior is defined in linear temporal logic (LTL) and the associated monitors are implemented in a field programmable gate array (FPGA). An off-the-shelf drone is used to demonstrate the effectiveness of the monitors without any physical modifications to the drone. Upon detection of a violation, appropriate corrective actions are persistently enforced on the autonomous system.
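A runtime monitor for a globally-holds safety rule, of the kind the entry above synthesizes from LTL into FPGA logic, can be pictured with the toy Python state machine below; the property and signal names are illustrative, and the real monitors run in hardware rather than software.

```python
class AlwaysMonitor:
    """Toy monitor for a rule of the form G(p): "property p holds at every
    step". Once p is observed false, G(p) is permanently violated and the
    override signal stays asserted."""

    def __init__(self):
        self.violated = False

    def step(self, p: bool) -> bool:
        """Feed one sensor-derived truth value per step; returns whether the
        corrective override should be engaged."""
        if not p:
            self.violated = True
        return self.violated

geofence = AlwaysMonitor()
geofence.step(p=True)    # False: behavior within bounds
geofence.step(p=False)   # True: violation detected, override engaged
geofence.step(p=True)    # still True: corrective action persists
```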
- Evaluating, Understanding, and Mitigating Unfairness in Recommender Systems. Yao, Sirui (Virginia Tech, 2021-06-10). Recommender systems are information filtering tools that discover potential matchings between users and items and benefit both parties. This benefit can be considered a social resource that should be equitably allocated across users and items, especially in critical domains such as education and employment. Biases and unfairness in recommendations raise both ethical and legal concerns. In this dissertation, we investigate the concept of unfairness in the context of recommender systems. In particular, we study appropriate unfairness evaluation metrics, examine the relation between bias in recommender models and inequality in the underlying population, and propose effective unfairness mitigation approaches. We start by exploring the implications of fairness in recommendation and formulating unfairness evaluation metrics. We focus on the task of rating prediction. We identify the insufficiency of demographic parity for scenarios where the target variable is justifiably dependent on demographic features. Then we propose an alternative set of unfairness metrics, measured based on how much the average predicted ratings deviate from the average true ratings. We also reduce these forms of unfairness in matrix factorization (MF) models by explicitly adding them as penalty terms to the learning objectives. Next, we target a form of unfairness in matrix factorization models observed as disparate model performance across user groups. We identify four types of biases in the training data that contribute to higher subpopulation error. Then we propose personalized regularization learning (PRL), which learns personalized regularization parameters that directly address the data biases. PRL poses the hyperparameter search problem as a secondary learning task. It enables back-propagation to learn the personalized regularization parameters by leveraging the closed-form solutions of alternating least squares (ALS) to solve MF. Furthermore, the learned parameters are interpretable and provide insights into how fairness is improved. Third, we conduct theoretical analysis on the long-term dynamics of inequality in the underlying population, in terms of the fit between users and items. We view the task of recommendation as solving a set of classification problems through threshold policies. We mathematically formulate the transition dynamics of user-item fit in one step of recommendation. Then we prove that a system with the formulated dynamics always has at least one equilibrium, and we provide sufficient conditions for the equilibrium to be unique. We also show that, depending on the item category relationships and the recommendation policies, recommendations in one item category can reshape the user-item fit in another item category. To summarize, in this research, we examine different fairness criteria in rating prediction and recommendation, study the dynamics of interactions between recommender systems and users, and propose mitigation methods to promote fairness and equality.
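The unfairness metrics described above compare how far average predicted ratings deviate from average true ratings across user groups. The NumPy sketch below shows one such group-deviation score and how it could enter an MF objective as a penalty; the dissertation's metric definitions are richer (for instance, computed per item and then averaged), so this is only an illustration of the pattern.

```python
import numpy as np

def group_deviation_unfairness(pred, true, group):
    """pred, true: arrays of predicted and true ratings; group: 0/1 per rating.
    Score is the gap between the two groups' average prediction errors."""
    dev = [pred[group == g].mean() - true[group == g].mean() for g in (0, 1)]
    return abs(dev[0] - dev[1])

# During MF training, a score like this can be added to the usual squared
# reconstruction error, e.g. loss = mse + lam * group_deviation_unfairness(...).
```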
- Exploits in Concurrency for Boolean Satisfiability. Sohanghpurwala, Ali Asgar Ali Akbar (Virginia Tech, 2018-12-14). Boolean Satisfiability (SAT) is a problem that holds great theoretical significance along with effective formulations that benefit many real-world applications. While the general problem is NP-complete, advanced solver algorithms and heuristics allow for fast solutions to many large industrial problems. In addition to SAT, many applications rely on generalizations of Satisfiability such as MaxSAT and Satisfiability Modulo Theories (SMT). Much of the advancement in SAT solver performance has been in the realm of improved sequential solvers with advanced conflict resolution, learning mechanisms, and sophisticated heuristics. There have been some successful demonstrations of massively parallel and hardware-accelerated solvers for SAT, but these have failed to find their way into mainstream usage. This document first presents previous work in Hardware Acceleration of Satisfiability, followed by an analysis of why these attempts failed to gain widespread acceptance. It then demonstrates an alternative, hardware-centric approach, based on distributed Stochastic Local Search (SLS), that is better suited to efficient hardware implementation. Then a parallel SLS/CDCL hybrid approach is proposed that is suitable for distributed search with minimal communication overhead while maintaining completeness. Finally, the efficacy and flexibility of distributed local search is considered with an adaptation to Weighted Partial MaxSAT (WPMS) and a focused case study on converted Probabilistic Inference instances.
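Stochastic local search of the kind discussed above is commonly illustrated with a WalkSAT-style loop; the minimal Python sketch below is a generic SLS solver over DIMACS-style clauses, not the hardware-mapped design from the dissertation.

```python
import random

def walksat(clauses, n_vars, max_flips=10000, p=0.5):
    """Clauses are lists of signed ints, e.g. [[1, -2], [2, 3]] encodes
    (x1 or not x2) and (x2 or x3). Returns a satisfying assignment or None."""
    assign = {v: random.choice([True, False]) for v in range(1, n_vars + 1)}

    def satisfied(lit):
        return assign[abs(lit)] == (lit > 0)

    def num_unsat():
        return sum(not any(satisfied(l) for l in c) for c in clauses)

    for _ in range(max_flips):
        unsat = [c for c in clauses if not any(satisfied(l) for l in c)]
        if not unsat:
            return assign                        # all clauses satisfied
        clause = random.choice(unsat)
        if random.random() < p:                  # noisy move: random variable
            var = abs(random.choice(clause))
        else:                                    # greedy move: least-damaging flip
            def cost(v):
                assign[v] = not assign[v]        # tentatively flip
                c = num_unsat()
                assign[v] = not assign[v]        # undo
                return c
            var = min((abs(l) for l in clause), key=cost)
        assign[var] = not assign[var]
    return None                                  # gave up within the flip budget
```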
- Going Deeper with Images and Natural Language. Ma, Yufeng (Virginia Tech, 2019-03-29). One aim in the area of artificial intelligence (AI) is to develop a smart agent with high intelligence that is able to perceive and understand the complex visual environment around us. More ambitiously, it should be able to interact with us about its surroundings in natural language. Thanks to the progress made in deep learning, we've seen huge breakthroughs towards this goal over the last few years. The developments have been extremely rapid in visual recognition, in which machines now can categorize images into multiple classes and detect various objects within an image, with an ability that is competitive with or even surpasses that of humans. Meanwhile, we have witnessed similar strides in natural language processing (NLP). Computers are now able to do text classification, machine translation, and similar tasks almost perfectly. However, despite much inspiring progress, most of the achievements made are still within one domain, not handling inter-domain situations. The interaction between the visual and textual areas is still quite limited, although there has been progress in image captioning, visual question answering, etc. In this dissertation, we design models and algorithms that enable us to build in-depth connections between images and natural languages, which help us to better understand their inner structures. In particular, we first study how to make machines generate image descriptions that are indistinguishable from ones expressed by humans, which as a result also achieves better quantitative evaluation performance. Second, we devise a novel algorithm for measuring review congruence, which takes an image and review text as input and quantifies the relevance of each sentence to the image. The whole model is trained without any supervised ground truth labels. Finally, we propose a brand new AI task called Image Aspect Mining, to detect visual aspects in images and identify aspect-level ratings within the review context. On the theoretical side, this research contributes to multiple research areas in Computer Vision (CV), Natural Language Processing (NLP), interactions between CV and NLP, and Deep Learning. Regarding impact, these techniques will benefit related users such as the visually impaired, customers reading reviews, merchants, and AI researchers in general.