Browsing by Author "Thorve, Swapna"
Now showing 1 - 3 of 3
Results Per Page
Sort Options
- CS5604: Clustering and Social Networks for IDEALVishwasrao, Saket; Thorve, Swapna; Tang, Lijie (2016-05-03)The Integrated Digital Event Archiving and Library (IDEAL) project of Virginia Tech provides services for searching, browsing, analysis, and visualization of over 1 billion tweets and over 65 million webpages. The project development involved a problem based learning approach which aims to build a state-of-the-art information retrieval system in support of IDEAL. With the primary objective of building a robust search engine on top of Solr, the entire project is divided into various segments like classification, clustering, topic modeling, etc., for improving search results. Our team focuses on two tasks: clustering and social networks. Both these tasks will be considered independent for now. The clustering task aims to congregate documents in groups such that documents within a cluster would be as similar as possible. Documents are tweets and webpages and we present results for different collections. The k-means algorithm is employed for clustering the documents. Two methods were employed for feature extraction, namely, TF-IDF score and the word2vec method. Evaluation of clusters is done by two methods – Within Set Sum of Squares (WSSE) and analyzing the output of the topic analysis team to extract cluster labels and find probability scores for a document. The later strategy is a novel approach for evaluation. This strategy can be used for assessing problems of cluster labeling, likelihood of a document belonging to a cluster, and hierarchical distribution of topics and cluster. The social networking task will extract information from Twitter data by building graphs. Graph theory concepts will be applied for accomplishing this task. Using dimensionality reduction techniques and probabilistic algorithms for clustering, as well as using improving on the cluster labelling and evaluation are some of the things that can be improved on our existing work in the future. Also, the clusters that we have generated can be used as an input source in Classification, Topic Analysis and Collaborative filtering for more accurate results.
- EpiViewer: an epidemiological application for exploring time series dataThorve, Swapna; Wilson, Mandy L.; Lewis, Bryan L.; Swarup, Samarth; Vullikanti, Anil Kumar S.; Marathe, Madhav V. (2018-11-22)Background Visualization plays an important role in epidemic time series analysis and forecasting. Viewing time series data plotted on a graph can help researchers identify anomalies and unexpected trends that could be overlooked if the data were reviewed in tabular form; these details can influence a researcher’s recommended course of action or choice of simulation models. However, there are challenges in reviewing data sets from multiple data sources – data can be aggregated in different ways (e.g., incidence vs. cumulative), measure different criteria (e.g., infection counts, hospitalizations, and deaths), or represent different geographical scales (e.g., nation, HHS Regions, or states), which can make a direct comparison between time series difficult. In the face of an emerging epidemic, the ability to visualize time series from various sources and organizations and to reconcile these datasets based on different criteria could be key in developing accurate forecasts and identifying effective interventions. Many tools have been developed for visualizing temporal data; however, none yet supports all the functionality needed for easy collaborative visualization and analysis of epidemic data. Results In this paper, we present EpiViewer, a time series exploration dashboard where users can upload epidemiological time series data from a variety of sources and compare, organize, and track how data evolves as an epidemic progresses. EpiViewer provides an easy-to-use web interface for visualizing temporal datasets either as line charts or bar charts. The application provides enhanced features for visual analysis, such as hierarchical categorization, zooming, and filtering, to enable detailed inspection and comparison of multiple time series on a single canvas. Finally, EpiViewer provides several built-in statistical Epi-features to help users interpret the epidemiological curves. Conclusions EpiViewer is a single page web application that provides a framework for exploring, comparing, and organizing temporal datasets. It offers a variety of features for convenient filtering and analysis of epicurves based on meta-attribute tagging. EpiViewer also provides a platform for sharing data between groups for better comparison and analysis. Our user study demonstrated that EpiViewer is easy to use and fills a particular niche in the toolspace for visualization and exploration of epidemiological data.
- EpiViewer: An Epidemiological Application For Exploring Time Series DataThorve, Swapna (Virginia Tech, 2018-11)Visualization plays an important role in epidemic time series analysis and forecasting. Viewing time series data plotted on a graph can help researchers identify anomalies and unexpected trends that could be overlooked if the data were reviewed in tabular form. However,there are challenges in reviewing data sets from multiple data sources (data can be aggregated in different ways and measure different criteria which can make a direct comparison between time series difficult. In the face of an emerging epidemic, the ability to visualize time series from various sources and organizations and to reconcile these datasets based on different criteria could be key in developing accurate forecasts and identifying effective interventions. Many tools have been developed for visualizing temporal data; however, none yet supports all the functionality needed for easy collaborative visualization and analysis of epidemic data. In this thesis, we develop EpiViewer, a time series exploration dashboard where users can upload epidemiological time series data from a variety of sources and compare, organize, and track how data evolves as an epidemic progresses. EpiViewer provides an easy-to-use web interface for visualizing temporal datasets either as line charts or bar charts. The application provides enhanced features for visual analysis, such as hierarchical categorization, zooming, and filtering, to enable detailed inspection and comparison of multiple time series on a single canvas. Finally, EpiViewer provides a built-in statistical Epi-features module to help users interpret the epidemiological curves.