Browsing by Author "Marathe, Madhav V."
- Analysis system using brokers that access information sources (United States Patent and Trademark Office, 2018-01-16). Systems, methods, and computer-readable media for generating a data set are provided. One method includes generating a data set based on input data using a plurality of brokers. The method further includes receiving a request from a user and determining whether the request can be fulfilled using data currently in the data set. When the request can be fulfilled using data currently in the data set, the data is accessed using broker(s) configured to provide access to data within the data set. When the request cannot be fulfilled using data currently in the data set, at least one new broker is spawned using existing broker(s) and additional data needed to fulfill the request is added to the data set using the new broker. The method further includes generating a response to the request using one or more of the plurality of brokers.
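The request-handling flow this abstract claims can be sketched schematically. This is a toy illustration, not the patented implementation: the dictionary-backed data set, the `spawn_broker` callable, and all names below are assumptions made for the sketch.

```python
def handle_request(request, dataset, brokers, spawn_broker):
    """Schematic of the claimed method: fulfill `request` (a set of data
    keys) from the current data set when possible; otherwise spawn a new
    broker, use it to add the missing data, then answer the request."""
    missing = set(request) - dataset.keys()
    if missing:
        new_broker = spawn_broker(missing)   # spawned as needed
        dataset.update(new_broker(missing))  # missing data added via the new broker
        brokers.append(new_broker)
    return {k: dataset[k] for k in request}
```

Here `spawn_broker` stands in for the broker-creation step; any callable that returns key-to-value data for the missing keys fits the sketch.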
- The Betweenness Centrality Of Biological Networks. Narayanan, Shivaram (Virginia Tech, 2005-09-16). In the last few years, large-scale experiments have generated genome-wide protein interaction networks for many organisms including Saccharomyces cerevisiae (baker's yeast), Caenorhabditis elegans (worm) and Drosophila melanogaster (fruit fly). In this thesis, we examine the vertex and edge betweenness centrality measures of these graphs. These measures capture how "central" a vertex or an edge is in the graph by considering the fraction of shortest paths that pass through that vertex or edge. Our primary observation is that the distribution of the vertex betweenness centrality follows a power law, but the distribution of the edge betweenness centrality has a Poisson-like distribution with a very sharp spike. To investigate this phenomenon, we generated random networks with degree distributions identical to those of the protein interaction networks. To our surprise, we found that the random networks and the protein interaction networks had almost identical distributions of edge betweenness. We conjecture that the "Poisson-like" distribution of the edge betweenness centrality is a property of any graph whose degree distribution satisfies a power law.
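The edge betweenness measure described above counts, for each edge, the fraction of shortest paths between every node pair that pass through it. A minimal brute-force sketch (fine for toy graphs, though far too slow for the genome-scale networks the thesis studies):

```python
from collections import deque
from itertools import combinations

def shortest_paths(adj, s, t):
    """All shortest s-t paths, via BFS with predecessor lists."""
    dist, preds = {s: 0}, {s: []}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v], preds[v] = dist[u] + 1, [u]
                q.append(v)
            elif dist[v] == dist[u] + 1:
                preds[v].append(u)
    if t not in dist:
        return []
    def build(v):  # walk predecessor lists back to s
        if v == s:
            return [[s]]
        return [p + [v] for u in preds[v] for p in build(u)]
    return build(t)

def edge_betweenness(adj):
    """Sum, over node pairs, of the fraction of the pair's shortest
    paths that cross each edge."""
    scores = {}
    for s, t in combinations(adj, 2):
        paths = shortest_paths(adj, s, t)
        for p in paths:
            for u, v in zip(p, p[1:]):
                e = frozenset((u, v))
                scores[e] = scores.get(e, 0.0) + 1.0 / len(paths)
    return scores
```

On the path graph a-b-c, both edges lie on two of the three pairwise shortest paths, so both score 2.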
- Capacity Characterization of Multi-Hop Wireless Networks - A Cross Layer Approach. Chafekar, Deepti Ramesh (Virginia Tech, 2009-03-11). A fundamental problem in multi-hop wireless networks is to estimate their throughput capacity. The problem can be informally stated as follows: given a multi-hop wireless network and a set of source-destination pairs, determine the maximum rate r at which data can be transmitted between each source-destination pair. Estimating the capacity of a multi-hop wireless network is practically useful --- it yields insights into the fundamental performance limits of the wireless network and at the same time aids the development of protocols that can utilize the network close to this limit. A goal of this dissertation is to develop rigorous mathematical foundations to compute the capacity of any given multi-hop wireless network with known source-destination pairs. An important factor that affects the capacity of multi-hop wireless networks is radio interference. As a result, researchers have proposed increasingly realistic interference models that aim to capture the physical characteristics of radio signals. Some of the commonly used simple models that capture radio interference are based on geometric disk-graphs. The simplicity of these models facilitates the development of provable and often conceptually simple methods for estimating the capacity of wireless networks. A potential weakness of this class of models is that they oversimplify the physical process by assuming that the signal ends abruptly at the boundary of a geometric region (a disk for omni-directional antennas). A more sophisticated interference model is the physical interference model, also known as the Signal to Interference Plus Noise Ratio (SINR) model. This model is more realistic than disk-graph models as it captures the effects of signal fading and ambient noise. This work considers both disk-graph and SINR interference models.
In addition to radio interference, the throughput capacity of a multi-hop wireless network also depends on other factors, including the specific paths selected to route the packets between the source-destination pairs (routing), the time at which packets are transmitted (scheduling), the power with which nodes transmit (power control) and the rate at which packets are injected (rate control). In this dissertation, we consider three different problems related to estimating network capacity. We propose an algorithmic approach for solving these problems. We first consider the problem of maximizing throughput with the SINR interference model by jointly considering the effects of routing and scheduling constraints. Second, we consider the problem of maximizing throughput by performing adaptive power control, scheduling and routing for disk-graph interference models. Finally, we examine the problem of minimizing end-to-end latency by performing joint routing, scheduling and power control using the SINR interference model. Recent results have shown that traditional layered networking principles lead to inefficient utilization of resources in multi-hop wireless networks. Motivated by these observations, recent papers have begun investigating cross-layer design approaches. Although our work does not develop new cross-layer protocols, it yields new insights that could contribute to the development of such protocols in the future. Our approach for solving these multi-objective optimization problems is based on combining mathematical programming with randomized rounding to obtain polynomial time approximation algorithms with provable worst case performance ratios. For the problems considered in this work, our results provide the best analytical performance guarantees currently known in the literature. We complement our rigorous theoretical and algorithmic analysis with simulation-based experimental analysis.
Our experimental results help us understand the limitations of our approach and assist in identifying certain parameters for improving the performance of our techniques.
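Under the SINR (physical) interference model discussed in this abstract, a transmission succeeds when its signal-to-interference-plus-noise ratio clears a threshold. A toy per-slot feasibility check (the gain matrix, power levels, and threshold below are invented for illustration):

```python
def sinr(link, active, power, gain, noise):
    """SINR at the receiver of `link` when every link in `active` transmits.
    gain[i][j] is the path gain from the transmitter of link i to the
    receiver of link j."""
    signal = power[link] * gain[link][link]
    interference = sum(power[k] * gain[k][link] for k in active if k != link)
    return signal / (noise + interference)

def slot_feasible(active, power, gain, noise, beta):
    """A scheduling slot is feasible only if every concurrently active
    link meets the SINR threshold beta."""
    return all(sinr(l, active, power, gain, noise) >= beta for l in active)
```

Scheduling under this model amounts to choosing sets of links for which `slot_feasible` holds, which is what couples the routing, scheduling, and power-control decisions the dissertation optimizes jointly.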
- Cascading Events in the Aftermath of a Targeted Physical Attack on the Power Grid. Meyur, Rounak (Virginia Tech, 2019-03-29). This work studies the consequences of a human-initiated targeted attack on the electric power system by simulating the detonation of a bomb at one or more substations in and around Washington DC. An AC power flow based transient analysis on a realistic power grid model of the Eastern Interconnection is considered to study the cascading events. A detailed model of the control and protection system in the power grid is considered to ensure the accurate representation of cascading outages. Particularly, the problem of identifying a set of k critical nodes, whose failure/attack leads to the maximum adverse impact on the power system, has been analyzed in detail. It is observed that a greedy approach yields node sets with higher criticality than a degree-based approach, which has been suggested in many prior works. Furthermore, it is seen that the impact of a targeted attack exhibits a nonmonotonic behavior as a function of the target set size k. The consideration of hidden failures in the protective relays has revealed that the outage of certain lines/buses in the course of cascading events can save the power grid from a system collapse. Finally, a comparison with the DC steady state analysis of cascading events shows that a transient stability assessment is necessary to obtain the complete picture of cascading events in the aftermath of a targeted attack on the power grid.
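The greedy-versus-degree comparison above can be sketched on a toy graph. The impact function here (nodes cut off from a designated source) is only a stand-in for the thesis's cascading-outage impact, which requires full transient simulation:

```python
def impact(adj, removed, source):
    """Toy impact: how many nodes are no longer reachable from `source`
    once the nodes in `removed` have failed."""
    seen, stack = {source}, [source]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(adj) - len(seen)

def greedy_targets(adj, k, source):
    """Greedily add the node whose failure increases impact the most,
    k times, as in the greedy approach the thesis evaluates."""
    chosen = set()
    for _ in range(k):
        candidates = set(adj) - chosen - {source}
        best = max(candidates, key=lambda v: impact(adj, chosen | {v}, source))
        chosen.add(best)
    return chosen
```

On a hub-and-spoke graph the greedy pick is the hub, whose failure isolates every spoke.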
- Cognitive Networks. Thomas, Ryan William (Virginia Tech, 2007-06-15). For complex computer networks with many tunable parameters and network performance objectives, the task of selecting the ideal network operating state is difficult. To improve the performance of these kinds of networks, this research proposes the idea of the cognitive network. A cognitive network is a network composed of elements that, through learning and reasoning, dynamically adapt to varying network conditions in order to optimize end-to-end performance. In a cognitive network, decisions are made to meet the requirements of the network as a whole, rather than the individual network components. We examine the cognitive network concept by first providing a definition and then outlining the difference between it and other cognitive and cross-layer technologies. From this definition, we develop a general, three-layer cognitive network framework, based loosely on the framework used for cognitive radio. In this framework, we consider the possibility of a cognitive process consisting of one or more cognitive elements, software agents that operate somewhere between autonomy and cooperation. To understand how to design a cognitive network within this framework we identify three critical design decisions that affect the performance of the cognitive network: the selfishness of the cognitive elements, their degree of ignorance, and the amount of control they have over the network. To evaluate the impact of these decisions, we created a metric called the price of a feature, defined as the ratio of the network performance with a certain design decision to the performance without the feature. To further aid in the design of cognitive networks, we identify classes of cognitive networks that are structurally similar to one another. We examined two of these classes: the potential class and the quasi-concave class.
Both classes of networks will converge to a Nash equilibrium under selfish behavior, and in the quasi-concave class this equilibrium is both Pareto and globally optimal. Furthermore, we found the quasi-concave class has other desirable properties, reacting well to the absence of certain kinds of information and degrading gracefully under reduced network control. In addition to these analytical, high-level contributions, we develop cognitive networks for two open problems in resource management for self-organizing networks, validating and illustrating the cognitive network approach. For the first problem, a cognitive network is shown to increase the lifetime of a wireless multicast route by up to 125%. For this problem, we show that the price of selfishness and control are more significant than the price of ignorance. For the second problem, a cognitive network minimizes the transmission power and spectral impact of a wireless network topology under static and dynamic conditions. The cognitive network, utilizing a distributed, selfish approach, minimizes the maximum power in the topology and reduces (on average) the channel usage to within 12% of the minimum channel assignment. For this problem, we investigate the price of ignorance under dynamic networks and the cost of maintaining knowledge in the network. Today's computer networking technology will not be able to solve the complex problems that arise from increasingly bandwidth-intensive applications competing for scarce resources. Cognitive networks have the potential to change this trend by adding intelligence to the network. This work introduces the concept and provides a foundation for future investigation and implementation.
- Cognitive Networks: Foundations to Applications. Friend, Daniel (Virginia Tech, 2009-03-06). Fueled by the rapid advancement in digital and wireless technologies, the ever-increasing capabilities of wireless devices have placed upon us a tremendous challenge - how to put all of this capability to effective use. Individually, wireless devices have outpaced the ability of users to optimally configure them. Collectively, the complexity is far more daunting. Research in cognitive networks seeks to provide a solution to the difficulty of effectively using the expanding capabilities of wireless networks by embedding greater degrees of intelligence within the network itself. In this dissertation, we address some fundamental questions related to cognitive networks, such as "What is a cognitive network?" and "What methods may be used to design a cognitive network?" We relate cognitive networks to a common artificial intelligence (AI) framework, the multi-agent system (MAS). We also discuss the key elements of learning and reasoning, with the ability to learn being the primary differentiator for a cognitive network. Having discussed some of the fundamentals, we proceed to further illustrate the cognitive networking principle by applying it to two problems: multichannel topology control for dynamic spectrum access (DSA) and routing in a mobile ad hoc network (MANET). The multichannel topology control problem involves configuring secondary network parameters to minimize the probability that the secondary network will cause an outage to a primary user in the future. This requires the secondary network to estimate an outage potential map, essentially a spatial map of predicted primary user density, which must be learned using prior observations of spectral occupancy made by secondary nodes.
Due to the complexity of the objective function, we provide a suboptimal heuristic and compare its performance against heuristics targeting power-based and interference-based topology control objectives. We also develop a genetic algorithm to provide reference solutions since obtaining optimal solutions is impractical. We show how our approach to this problem qualifies as a cognitive network. In presenting our second application, we address the role of network state observations in cognitive networking. Essentially, we need a way to quantify how much information is needed regarding the state of the network to achieve a desired level of performance. This question is applicable to networking in general, but becomes increasingly important in the cognitive network context because of the potential volume of information that may be desired for decision-making. In this case, the application is routing in MANETs. Current MANET routing protocols are largely adapted from routing algorithms developed for wired networks. Although optimal routing in wired networks is grounded in dynamic programming, the critical assumption, static link costs and states, that enables the use of dynamic programming for wired networks need not apply to MANETs. We present a link-level model of a MANET, which models the network as a stochastically varying graph that possesses the Markov property. We present the Markov decision process as the appropriate framework for computing optimal routing policies for such networks. We then proceed to analyze the relationship between optimal policy and link state information as a function of minimum distance from the forwarding node. The applications that we focus on are quite different, both in their models as well as their objectives. This difference is intentional and significant because it disassociates the technology, i.e., cognitive networks, from the application of the technology.
As a consequence, the versatility of the cognitive networks concept is demonstrated. Simultaneously, we are able to address two open problems and provide useful results, as well as new perspective, on both multichannel topology control and MANET routing. This material is posted here with permission from the IEEE. Such permission of the IEEE does not in any way imply IEEE endorsement of any of Virginia Tech library's products or services. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to pubs-permissions@ieee.org. By choosing to view this material, you agree to all provisions of the copyright laws protecting it.
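The Markov decision process framework for routing described in this abstract can be illustrated with value iteration on a tiny two-state example. The states, transition probabilities, and costs below are invented for illustration and are not the dissertation's link-level MANET model:

```python
def value_iteration(states, actions, trans, cost, iters=100):
    """Expected cost-to-go under the optimal policy.
    actions[s]: available actions (empty list marks a terminal state);
    trans[(s, a)]: list of (probability, next_state);
    cost[(s, a)]: one-step cost of taking a in s."""
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        for s in states:
            if actions[s]:  # skip terminal states
                V[s] = min(cost[(s, a)] +
                           sum(p * V[t] for p, t in trans[(s, a)])
                           for a in actions[s])
    return V
```

With a reliable link of cost 1.0 and an unreliable link of cost 0.4 that delivers only half the time, the fixed point is V[s] = min(1.0, 0.4 + 0.5 V[s]) = 0.8, so the optimal policy prefers the cheap, lossy link.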
- Combining Participatory Influenza Surveillance with Modeling and Forecasting: Three Alternative Approaches. Brownstein, John S.; Marathe, Achla (JMIR Publications, 2017). Background: Influenza outbreaks affect millions of people every year, and their surveillance is usually carried out in developed countries through a network of sentinel doctors who report the weekly number of Influenza-like Illness cases observed among the visited patients. Monitoring and forecasting the evolution of these outbreaks supports decision makers in designing effective interventions and allocating resources to mitigate their impact. Objective: Describe the existing participatory surveillance approaches that have been used for modeling and forecasting of the seasonal influenza epidemic, and how they can help strengthen real-time epidemic science and provide a more rigorous understanding of epidemic conditions. Methods: We describe three different participatory surveillance systems, WISDM (Widely Internet Sourced Distributed Monitoring), Influenzanet and Flu Near You (FNY), and show how modeling and simulation can be or has been combined with participatory disease surveillance to: i) measure the non-response bias in a participatory surveillance sample using WISDM; and ii) nowcast and forecast influenza activity in different parts of the world (using Influenzanet and Flu Near You). Results: WISDM-based results measure the participatory and sample bias for three epidemic metrics, i.e., attack rate, peak infection rate, and time-to-peak, and find the participatory bias to be the largest component of the total bias. The Influenzanet platform shows that digital participatory surveillance data combined with a realistic data-driven epidemiological model can provide both short-term and long-term forecasts of epidemic intensities, and the ground truth data lie within the 95 percent confidence intervals for most weeks. The statistical accuracy of the ensemble forecasts increases as the season progresses.
The Flu Near You platform shows that participatory surveillance data provide accurate short-term forecasts of influenza activity. The correlation of the HealthMap Flu Trends estimates with the observed CDC ILI rates is 0.99 for 2013-2015. Additional data sources lead to an error reduction of about 40% when compared to the estimates of the model that only incorporates CDC historical information. Conclusions: While the advantages of participatory surveillance, compared to traditional surveillance, include its timeliness, lower costs, and broader reach, it is limited by a lack of control over the characteristics of the population sample. Modeling and simulation can help overcome this limitation as well as provide real-time and long-term forecasting of influenza activity in data-poor parts of the world.
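The three epidemic metrics named in the WISDM bias analysis (attack rate, peak infection rate, and time-to-peak) are simple to compute from a daily case series; a minimal sketch with invented numbers:

```python
def epidemic_metrics(new_cases, population):
    """Attack rate, peak infection rate, and time-to-peak (index of the
    peak day) computed from a series of daily new infections."""
    peak = max(new_cases)
    return (sum(new_cases) / population,  # attack rate
            peak / population,            # peak infection rate
            new_cases.index(peak))        # time to peak
```

Comparing these three numbers between a participatory sample and the full population is one way to quantify the sample's bias, which is the comparison the abstract describes.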
- Comparing Effectiveness of Top-Down and Bottom-Up Strategies in Containing Influenza. Marathe, Achla; Lewis, Bryan L.; Barrett, Christopher L.; Chen, Jiangzhuo; Marathe, Madhav V.; Eubank, Stephen; Ma, Yifei (Public Library of Science, 2011-09-22). This research compares the performance of bottom-up, self-motivated behavioral interventions with top-down interventions targeted at controlling an “Influenza-like-illness”. Both types of interventions use a variant of the ring strategy. In the first case, when the fraction of a person's direct contacts who are diagnosed exceeds a threshold, that person decides to seek prophylaxis, e.g., vaccine or antivirals; in the second case, we consider two intervention protocols, denoted Block and School: when the fraction of people who are diagnosed in a Census Block (resp., School) exceeds the threshold, prophylax the entire Block (resp., School). Results show that the bottom-up strategy outperforms the top-down strategies under our parameter settings. Even in situations where the Block strategy reduces the overall attack rate well, it incurs a much higher cost. These findings lend credence to the notion that if people used antivirals effectively, making them available quickly on demand to private citizens could be a very effective way to control an outbreak.
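Both intervention protocols in this abstract hinge on the same threshold trigger, applied either to a person's contacts (bottom-up) or to a whole Census Block or School (top-down). A sketch of the two trigger rules, with invented thresholds:

```python
def seeks_prophylaxis(contacts, diagnosed, threshold):
    """Bottom-up rule: a person seeks prophylaxis once the fraction of
    their direct contacts who are diagnosed exceeds the threshold."""
    return sum(c in diagnosed for c in contacts) / len(contacts) > threshold

def group_triggered(members, diagnosed, threshold):
    """Top-down Block/School rule: prophylax the entire group once the
    diagnosed fraction within it exceeds the threshold."""
    return sum(m in diagnosed for m in members) / len(members) > threshold
```

The paper's comparison is between networks of agents applying the first rule individually versus an authority applying the second rule to administrative groups.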
- Complex situation analysis system that generates a social contact network, uses edge brokers and service brokers, and dynamically adds brokers (United States Patent and Trademark Office, 2013-04-16). A system for generating a representation of a situation is disclosed. The system comprises one or more computer-readable media including computer-executable instructions that are executable by one or more processors to implement a method of generating a representation of a situation. The method comprises receiving input data regarding a target population. The method further comprises constructing a synthetic data set including a synthetic population based on the input data. The synthetic population includes a plurality of synthetic entities. Each synthetic entity has a one-to-one correspondence with an entity in the target population. Each synthetic entity is assigned one or more attributes based on information included in the input data. The method further comprises receiving activity data for a plurality of entities in the target population.
- Complex situation analysis system that spawns/creates new brokers using existing brokers as needed to respond to requests for data (United States Patent and Trademark Office, 2014-03-25). Systems, methods, and computer-readable media for generating a data set are provided. One method includes generating a data set based on input data using a plurality of brokers. The method further includes receiving a request from a user and determining whether the request can be fulfilled using data currently in the data set. When the request can be fulfilled using data currently in the data set, the data is accessed using broker(s) configured to provide access to data within the data set. When the request cannot be fulfilled using data currently in the data set, at least one new broker is spawned using existing broker(s) and additional data needed to fulfill the request is added to the data set using the new broker. The method further includes generating a response to the request using one or more of the plurality of brokers.
- Complex situation analysis system using a plurality of brokers that control access to information sources (United States Patent and Trademark Office, 2016-06-14). Systems, methods, and computer-readable media for generating a data set are provided. One method includes generating a data set based on input data using a plurality of brokers. The method further includes receiving a request from a user and determining whether the request can be fulfilled using data currently in the data set. When the request can be fulfilled using data currently in the data set, the data is accessed using broker(s) configured to provide access to data within the data set. When the request cannot be fulfilled using data currently in the data set, at least one new broker is spawned using existing broker(s) and additional data needed to fulfill the request is added to the data set using the new broker. The method further includes generating a response to the request using one or more of the plurality of brokers.
- Data analysis and modeling pipelines for controlled networked social science experiments. Cedeno-Mieles, Vanessa; Hu, Zhihao; Ren, Yihui; Deng, Xinwei; Contractor, Noshir; Ekanayake, Saliya; Epstein, Joshua M.; Goode, Brian J.; Korkmaz, Gizem; Kuhlman, Christopher J.; Machi, Dustin; Macy, Michael; Marathe, Madhav V.; Ramakrishnan, Naren; Saraf, Parang; Self, Nathan (PLOS, 2020-11-24). There is considerable interest in networked social science experiments for understanding human behavior at-scale. Significant effort is required to perform data analytics on experimental outputs and for computational modeling of custom experiments. Moreover, experiments and modeling are often performed in a cycle, enabling iterative experimental refinement and data modeling to uncover interesting insights and to generate/refute hypotheses about social behaviors. The current practice for social analysts is to develop tailor-made computer programs and analytical scripts for experiments and modeling. This often leads to inefficiencies and duplication of effort. In this work, we propose a pipeline framework to take a significant step towards overcoming these challenges. Our contribution is to describe the design and implementation of a software system to automate many of the steps involved in analyzing social science experimental data, building models to capture the behavior of human subjects, and providing data to test hypotheses. The proposed pipeline framework consists of formal models, formal algorithms, and theoretical models as the basis for the design and implementation. We propose a formal data model, such that if an experiment can be described in terms of this model, then our pipeline software can be used to analyze data efficiently. The merits of the proposed pipeline framework are elaborated by several case studies of networked social science experiments.
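The core idea of chaining analysis and modeling steps can be illustrated generically. This is a toy composition helper, not the paper's pipeline software; the stage names in the usage note are hypothetical:

```python
def pipeline(*stages):
    """Compose analysis stages left to right: each stage is a function
    mapping the data produced by the previous stage to new data."""
    def run(data):
        for stage in stages:
            data = stage(data)
        return data
    return run
```

A run such as `pipeline(clean, featurize, fit_model)` captures the "analyze, model, refine" cycle the paper automates, with each experiment described once in the formal data model rather than in a tailor-made script.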
- Deep Learning for Taxonomy Prediction. Ramesh, Shreyas (Virginia Tech, 2019-06-04). The last decade has seen great advances in Next-Generation Sequencing technologies, and, as a result, there has been a rise in the number of genomes sequenced each year. In 2017, there were as many as 10,000 new organisms sequenced and added into the RefSeq Database. Taxonomy prediction is a science involving the hierarchical classification of DNA fragments up to the rank of species. In this research, we introduce Predicting Linked Organisms, or Plinko for short. Plinko is a fully-functioning, state-of-the-art predictive system that accurately captures DNA - Taxonomy relationships where other state-of-the-art algorithms falter. Plinko leverages multi-view convolutional neural networks and the pre-defined taxonomy tree structure to improve multi-level taxonomy prediction. In the Plinko strategy, each network takes advantage of different word usage patterns corresponding to different levels of evolutionary divergence. Plinko has the advantages of relatively low storage and GPGPU parallel training and inference, making the solution portable and scalable with anticipated genome database growth. To the best of our knowledge, Plinko is the first to use multi-view convolutional neural networks as the core algorithm in a compositional, alignment-free approach to taxonomy prediction.
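The abstract calls Plinko compositional and alignment-free and refers to "word usage patterns". Compositional methods typically start from k-mer (word) counts of a DNA fragment; the sketch below illustrates that common starting point and is an assumption, since the abstract does not spell out Plinko's features:

```python
def kmer_counts(seq, k):
    """Count overlapping k-mers: the 'word usage' profile of a fragment,
    computed without any alignment to reference genomes."""
    counts = {}
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        counts[word] = counts.get(word, 0) + 1
    return counts
```

Different values of k capture different levels of sequence conservation, which is one plausible reading of "different word usage patterns corresponding to different levels of evolutionary divergence".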
- Design and Implementation of An Emulation Testbed for Optimal Spectrum Sharing in Multi-hop Cognitive Radio Networks. Liu, Tong (Virginia Tech, 2007-07-09). Cognitive Radio (CR) capitalizes on advances in signal processing and radio technology and is capable of reconfiguring RF and switching to desired frequency bands. It is a frequency-agile data communication device that is vastly more powerful than existing multi-channel multi-radio (MC-MR) technology. In this thesis, we investigate the important problem of multi-hop networking with CR nodes. In a CR network, each node has a set of frequency bands (not necessarily of equal size) that may not be the same as those at other nodes. The uneven size of frequency bands prompts the need for further division into sub-bands for optimal spectrum sharing. We characterize behaviors and constraints for such a multi-hop CR network from multiple layers, including modeling of spectrum sharing and sub-band division, scheduling and interference constraints, and flow routing. We give a formal mathematical formulation with the objective of maximizing the network throughput for a set of user communication sessions. Since such a problem formulation falls into mixed integer non-linear programming (MINLP), which is NP-hard in general, we develop a lower bound for the objective by relaxing the integer variables and applying linearization. Subsequently, we develop a near-optimal algorithm for this MINLP problem. This algorithm is based on a novel sequential fixing (SF) procedure, where the integer variables are determined iteratively via a sequence of linear programs (LPs). In order to implement and evaluate these algorithms in a controlled laboratory setting, we design and implement an emulation testbed.
The highlights of our experimental research include:
• Emulation of a multi-hop CR network with arbitrary topology;
• An implementation of the proposed SF algorithm at the application layer;
• A source routing implementation that can easily support comparative study between SF algorithm and other schemes;
• Experiments comparing the SF algorithm with another algorithm called Layered Greedy Algorithm (LGA);
• Experimental results show that the proposed SF significantly outperforms LGA.
In summary, the experimental research in this thesis has demonstrated that SF algorithm is a viable algorithm for optimal spectrum sharing in multi-hop CR networks.
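The sequential fixing idea above (relax the integer variables, solve, pin the most nearly integral one, re-solve) can be sketched abstractly. The toy `solve_relaxation` callable in the test stands in for the thesis's linear-program solver and is invented for illustration:

```python
def sequential_fixing(solve_relaxation, int_vars):
    """Fix relaxed 0/1 variables one per round: solve the relaxation with
    the current fixings applied, then pin the free variable whose relaxed
    value is closest to an integer, until all integer variables are fixed."""
    fixed = {}
    while len(fixed) < len(int_vars):
        sol = solve_relaxation(fixed)
        free = [v for v in int_vars if v not in fixed]
        v = min(free, key=lambda x: min(sol[x], 1 - sol[x]))
        fixed[v] = round(sol[v])
    return fixed
```

Each round costs one LP solve, so the whole procedure is a polynomial sequence of LPs, which is what makes SF tractable where the exact MINLP is not.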
- Detail in network models of epidemiology: are we there yet? Eubank, Stephen; Barrett, Christopher L.; Beckman, Richard J.; Bisset, Keith R.; Durbeck, L.; Kuhlman, Christopher J.; Lewis, Bryan L.; Marathe, Achla; Marathe, Madhav V.; Stretz, P. (Taylor & Francis, 2010). Network models of infectious disease epidemiology can potentially provide insight into how to tailor control strategies for specific regions, but only if the network adequately reflects the structure of the region’s contact network. Typically, the network is produced by models that incorporate details about human interactions. Each detail added renders the models more complicated and more difficult to calibrate, but also more faithful to the actual contact network structure. We propose a statistical test to determine when sufficient detail has been added to the models and demonstrate its application to the models used to create a synthetic population and contact network for the USA.
- Discovering contextual connections between biological processes using high-throughput data. Lasher, Christopher Donald (Virginia Tech, 2011-09-12). Hearkening to calls from life scientists for aid in interpreting rapidly-growing repositories of data, the fields of bioinformatics and computational systems biology continue to bear increasingly sophisticated methods capable of summarizing and distilling pertinent phenomena captured by high-throughput experiments. Techniques in analysis of genome-wide gene expression (e.g., microarray) data, for example, have moved beyond simply detecting individual genes perturbed in treatment-control experiments to reporting the collective perturbation of biologically-related collections of genes, or "processes". Recent expression analysis methods have focused on improving comprehensibility of results by reporting concise, non-redundant sets of processes by leveraging statistical modeling techniques such as Bayesian networks. Simultaneously, integrating gene expression measurements with gene interaction networks has led to computation of response networks--subgraphs of interaction networks in which genes exhibit strong collective perturbation or co-expression. Methods that integrate process annotations of genes with interaction networks identify high-level connections between biological processes themselves. Identifying context-specific changes in these inter-process connections, however, required techniques beyond the existing ones: process-based expression analysis reports only perturbed processes, not their relationships; response networks are composed of interactions between genes rather than processes; and existing techniques for detecting inter-process connections do not incorporate specific biological context. We present two novel methods which take inspiration from the latest techniques in process-based gene expression analysis, computation of response networks, and computation of inter-process connections.
We motivate the need for detecting inter-process connections by identifying a collection of processes exhibiting significant differences in collective expression in two liver tissue culture systems widely used in toxicological and pharmaceutical assays. Next, we identify perturbed connections between these processes via a novel method that integrates gene expression, interaction, and annotation data. Finally, we present another novel method that computes non-redundant sets of perturbed inter-process connections, and apply it to several additional liver-related data sets. These applications demonstrate the ability of our methods to capture and report biologically relevant high-level trends.
- Economic and Social Impact of Influenza Mitigation Strategies by Demographic ClassBarrett, Christopher L.; Bisset, Keith R.; Leidig, Jonathan; Marathe, Achla; Marathe, Madhav V. (Elsevier, 2011-03-01)Background—We aim to determine the economic and social impact of typical interventions proposed by public health officials and of preventive behavioral changes adopted by private citizens in the event of a “flu-like” epidemic. Method—We apply an individual-based simulation model to the New River Valley area of Virginia to address this problem. The economic costs include not only the loss in productivity due to sickness but also the indirect costs incurred through disease avoidance and caring for dependents. Results—The results show that the most important factor in preventing income loss is the modification of individual behavior, which reduces total income loss by 62% compared to the base case. The next most important factor is school closure, which reduces total income loss by another 40%. Conclusions—The preventive behavior of private citizens is the most important factor in controlling the epidemic.
- Effective Search in Online Knowledge Communities: A Genetic Algorithm ApproachZhang, Xiaoyu (Virginia Tech, 2009-09-11)Online Knowledge Communities, also known as online forums, are popular web-based tools that allow members to seek and share knowledge. Documents answering a wide variety of questions are produced during this knowledge exchange. The social network of members in an Online Knowledge Community is an important factor for improving search precision; however, prior ranking functions do not exploit this information. In this study, we address the problem of finding authoritative documents for a user query within an Online Knowledge Community. Unlike prior ranking functions, which consider content-based, hyperlink-based, or document-structure-based features, we exploited the community's social network structure and members' social interaction activities to design features that gauge the two major factors affecting a user's knowledge adoption decision: argument quality and source credibility. We then designed a customized Genetic Algorithm to adjust the weights of the new features we proposed. We compared the performance of our ranking strategy with several baselines on a real-world data set from www.vbcity.com/forums/. The evaluation results demonstrated that our method noticeably improves user search satisfaction. We conclude that our approach, based on the knowledge adoption model and a Genetic Algorithm, is a better ranking strategy for Online Knowledge Communities.
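Using a genetic algorithm to tune the weights of ranking features can be sketched as follows. The feature values, fitness function (precision@1 over toy queries), and GA parameters below are illustrative assumptions for demonstration, not the actual features, data, or configuration used in the thesis.

```python
import random

random.seed(42)

# Toy training data: for each query, candidate documents as
# (feature vector, relevance label). The three hypothetical features
# stand in for argument quality, source credibility, and relevance.
QUERIES = [
    [([0.9, 0.7, 0.8], 1), ([0.2, 0.3, 0.9], 0), ([0.6, 0.8, 0.4], 1)],
    [([0.1, 0.9, 0.5], 0), ([0.8, 0.6, 0.7], 1), ([0.3, 0.2, 0.2], 0)],
]


def precision_at_1(weights):
    """Fitness: fraction of queries whose top-ranked document is relevant."""
    hits = 0
    for docs in QUERIES:
        ranked = sorted(
            docs,
            key=lambda d: sum(w * f for w, f in zip(weights, d[0])),
            reverse=True,
        )
        hits += ranked[0][1]
    return hits / len(QUERIES)


def evolve(pop_size=20, generations=30, mutation_rate=0.2):
    """Evolve a population of weight vectors toward higher fitness."""
    pop = [[random.random() for _ in range(3)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=precision_at_1, reverse=True)
        survivors = pop[: pop_size // 2]          # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, 3)          # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < mutation_rate:   # point mutation
                child[random.randrange(3)] = random.random()
            children.append(child)
        pop = survivors + children
    return max(pop, key=precision_at_1)


best = evolve()
print("best weights:", [round(w, 2) for w in best])
print("precision@1:", precision_at_1(best))
```

In practice the fitness function would be a retrieval-quality metric computed over held-out queries, and the learned weights would parameterize the linear combination of features in the final ranking function.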
- Epidemiology Experimentation and Simulation Management through Scientific Digital LibrariesLeidig, Jonathan Paul (Virginia Tech, 2012-07-20)Advances in scientific data management, discovery, dissemination, and sharing are changing the manner in which scientific studies are being conducted and repurposed. Data-intensive scientific practices increasingly require data management related services not available in existing digital libraries. Complicating the issue are the diversity of functional requirements and content in scientific domains as well as scientists' lack of expertise in information and library sciences. Researchers that utilize simulation and experimentation systems need digital libraries to maintain datasets, input configurations, results, analyses, and related documents. A digital library may be integrated with simulation infrastructures to provide automated support for research components, e.g., simulation interfaces to models, data warehouses, simulation applications, computational resources, and storage systems. Managing and provisioning simulation content allows streamlined experimentation, collaboration, discovery, and content reuse within a simulation community. Formal definitions of this class of digital libraries provide a foundation for producing a software toolkit and the semi-automated generation of digital library instances. We present a generic, component-based SIMulation-supporting Digital Library (SimDL) framework. The framework is formally described and provides a deployable set of domain-free services, schema-based domain knowledge representations, and extensible lower and higher level service abstractions. Services in SimDL are specialized for semi-structured simulation content and large-scale data producing infrastructures, as exemplified in data storage, indexing, and retrieval service implementations. 
Contributions to the scientific community include previously unavailable simulation-specific services, e.g., incentivizing public contributions, semi-automated content curating, and memoizing simulation-generated data products. The practicality of SimDL is demonstrated through several case studies in computational epidemiology and network science as well as performance evaluations.
- EpiViewer: an epidemiological application for exploring time series dataThorve, Swapna; Wilson, Mandy L.; Lewis, Bryan L.; Swarup, Samarth; Vullikanti, Anil Kumar S.; Marathe, Madhav V. (2018-11-22)Background Visualization plays an important role in epidemic time series analysis and forecasting. Viewing time series data plotted on a graph can help researchers identify anomalies and unexpected trends that could be overlooked if the data were reviewed in tabular form; these details can influence a researcher’s recommended course of action or choice of simulation models. However, there are challenges in reviewing data sets from multiple data sources – data can be aggregated in different ways (e.g., incidence vs. cumulative), measure different criteria (e.g., infection counts, hospitalizations, and deaths), or represent different geographical scales (e.g., nation, HHS Regions, or states), which can make a direct comparison between time series difficult. In the face of an emerging epidemic, the ability to visualize time series from various sources and organizations and to reconcile these datasets based on different criteria could be key in developing accurate forecasts and identifying effective interventions. Many tools have been developed for visualizing temporal data; however, none yet supports all the functionality needed for easy collaborative visualization and analysis of epidemic data. Results In this paper, we present EpiViewer, a time series exploration dashboard where users can upload epidemiological time series data from a variety of sources and compare, organize, and track how data evolves as an epidemic progresses. EpiViewer provides an easy-to-use web interface for visualizing temporal datasets either as line charts or bar charts. The application provides enhanced features for visual analysis, such as hierarchical categorization, zooming, and filtering, to enable detailed inspection and comparison of multiple time series on a single canvas. 
Finally, EpiViewer provides several built-in statistical Epi-features to help users interpret the epidemiological curves. Conclusions EpiViewer is a single page web application that provides a framework for exploring, comparing, and organizing temporal datasets. It offers a variety of features for convenient filtering and analysis of epicurves based on meta-attribute tagging. EpiViewer also provides a platform for sharing data between groups for better comparison and analysis. Our user study demonstrated that EpiViewer is easy to use and fills a particular niche in the toolspace for visualization and exploration of epidemiological data.