Browsing by Author "Shea-Blymyer, Colin"
- Distinguishing Dynamical Kinds: An Approach for Automating Scientific Discovery
  Shea-Blymyer, Colin (Virginia Tech, 2019-07-02)
  The automation of scientific discovery has been an active research topic for many years. The promise of a formalized approach to developing and testing scientific hypotheses has attracted researchers from the sciences, machine learning, and philosophy alike. Leveraging the concept of dynamical symmetries, a new paradigm is proposed for the collection of scientific knowledge, and algorithms are presented for the development of EUGENE, an automated scientific discovery tool-set. These algorithms have direct applications in model validation, time series analysis, and system identification. Further, the EUGENE tool-set provides a novel metric of dynamical similarity that would allow a system to be clustered into its dynamical regimes. This dynamical distance is sensitive to the presence of chaos, effective order, and nonlinearity. I discuss the history and background of these algorithms, provide examples of their behavior, and present their use for exploring system dynamics.
- A General Metric for the Similarity of Both Stochastic and Deterministic System Dynamics
  Shea-Blymyer, Colin; Roy, Subhradeep; Jantzen, Benjamin C. (MDPI, 2021-09-09)
  Many problems in the study of dynamical systems (including identification of effective order, detection of nonlinearity or chaos, and change detection) can be reframed in terms of assessing the similarity between dynamical systems or between a given dynamical system and a reference. We introduce a general metric of dynamical similarity that is well posed for both stochastic and deterministic systems and is informative of the aforementioned dynamical features even when only partial information about the system is available. We describe methods for estimating this metric in a range of scenarios that differ with respect to control over the systems under study, the deterministic or stochastic nature of the underlying dynamics, and whether or not a fully informative set of variables is available. Through numerical simulation, we demonstrate the sensitivity of the proposed metric to a range of dynamical properties, its utility in mapping the dynamical properties of parameter space for a given model, and its power for detecting structural changes through time series data.
- OutbreakSum: Automatic Summarization of Texts Relating to Disease Outbreaks
  Gruss, Richard; Morgado, Daniel; Craun, Nate; Shea-Blymyer, Colin (2014-12)
  The goal of the fall 2014 Disease Outbreak Project (OutbreakSum) was to develop software for automatically analyzing and summarizing large collections of texts pertaining to disease outbreaks. Although our code was tested on collections about specific diseases (a small one about encephalitis and a large one about Ebola), most of our tools would work on texts about any infectious disease, where the key information relates to locations, dates, number of cases, symptoms, prognosis, and government and healthcare organization interventions. In the course of the project, we developed a code base that performs several key Natural Language Processing (NLP) functions. Some of the tools that could potentially be useful for other Natural Language Generation (NLG) projects include: 1. A framework for developing MapReduce programs in Python that allows for local running and debugging; 2. Tools for document collection cleanup procedures such as small-file removal, duplicate-file removal (based on content hashes), sentence and paragraph tokenization, nonrelevant file removal, and encoding translation; 3. Utilities to simplify and speed up Named Entity Recognition with Stanford NER by using the Java API directly; 4. Utilities to leverage the full extent of the Stanford CoreNLP library, which includes tools for parsing and coreference resolution; 5. Utilities to simplify using the OpenNLP Java library for text processing. By configuring and running a single Java class, you can use OpenNLP to perform part-of-speech tagging and named entity recognition on your entire collection in minutes. We’ve classified the tools available in OutbreakSum into four major modules: 1. Collection Processing; 2. Local Language Processing; 3. MapReduce with Apache Hadoop; 4. Summarization.
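
  The OutbreakSum abstract mentions duplicate-file removal based on content hashes. As a rough illustration of that general idea, and not the project's actual code, the sketch below hashes each file's bytes and reports any file whose hash has already been seen. The function names `content_hash` and `find_duplicates` are hypothetical, and the choice of SHA-256 is an assumption; the abstract does not say which hash the project used.

```python
import hashlib
from pathlib import Path


def content_hash(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks.

    SHA-256 is an assumed choice; any stable content hash would do.
    """
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files are not loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(directory: Path) -> list[Path]:
    """List files whose content matches an earlier file in sorted order.

    Hypothetical helper: the first file with a given hash is kept,
    and every later file with the same hash is reported as a duplicate.
    """
    seen: dict[str, Path] = {}
    duplicates: list[Path] = []
    for path in sorted(p for p in directory.rglob("*") if p.is_file()):
        key = content_hash(path)
        if key in seen:
            duplicates.append(path)
        else:
            seen[key] = path
    return duplicates
```

  Returning the duplicates instead of deleting them leaves the removal decision to the caller, which makes a dry run over a collection possible before anything is unlinked.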