Browsing by Author "Fan, Shuangfei"
- CS5604: Information and Storage Retrieval Fall 2016 - CMT (Collection Management Tweets)
  Wagner, Mitchell J.; Abidi, Faiz; Fan, Shuangfei (Virginia Tech, 2016-12-08)
  As the Collection Management Tweets team in the Fall 2016 CS5604 class, we were responsible for processing >1.2 billion tweets, including data transfer, noise reduction, tweet augmentation, and storage via several technologies. Our work was the first step in a pipeline that included many teams and ultimately culminated in a comprehensive information retrieval system. We were also responsible for building a social network (or set of networks) for those tweets, along with their tweeters. In this report, we detail our experience with this project. Additionally, we propose solutions for transferring incremental database updates from MySQL to HDFS and subsequently to HBase, derive a graph structure and relationships from entities identified in tweet collections, and offer a query-independent method for estimating the importance of those entities. We achieve these goals through the use of several open-source software packages, and present open, scalable solutions addressing the objectives we were given.
- Deep Representation Learning on Labeled Graphs
  Fan, Shuangfei (Virginia Tech, 2020-01-27)
  We introduce recurrent collective classification (RCC), a variant of the iterative classification algorithm (ICA) analogous to recurrent neural network prediction. RCC accommodates any differentiable local classifier and relational feature functions. We provide gradient-based strategies for optimizing over model parameters to more directly minimize the loss function. In our experiments, this direct loss minimization translates to improved accuracy and robustness on real network data. We demonstrate the robustness of RCC in settings where local classification is very noisy, settings that are particularly challenging for ICA.
  As a new way to train generative models, generative adversarial networks (GANs) have achieved considerable success in image generation, and this framework has also recently been applied to data with graph structures. We identify the drawbacks of existing deep frameworks for generating graphs, and we propose labeled-graph generative adversarial networks (LGGAN) to train deep generative models for graph-structured data with node labels. We test the approach on various types of graph datasets, such as collections of citation networks and protein graphs. Experimental results show that our model can generate diverse labeled graphs that match the structural characteristics of the training data and that it outperforms all baselines in terms of quality, generality, and scalability. To further evaluate the quality of the generated graphs, we use them in a downstream graph classification task, and the results show that LGGAN can better capture the important aspects of the graph structure.