Browsing by Author "Hossain, M. Shahriar"
Now showing 1 - 2 of 2
Results Per Page
Sort Options
- Connecting the Dots between PubMed AbstractsHossain, M. Shahriar; Gresock, Joseph; Edmonds, Yvette M.; Helm, Richard F.; Potts, Malcolm; Ramakrishnan, Naren (PLOS, 2012-01-03)Background There are now a multitude of articles published in a diversity of journals providing information about genes, proteins, pathways, and diseases. Each article investigates subsets of a biological process, but to gain insight into the functioning of a system as a whole, we must integrate information from multiple publications. Particularly, unraveling relationships between extra-cellular inputs and downstream molecular response mechanisms requires integrating conclusions from diverse publications. Methodology We present an automated approach to biological knowledge discovery from PubMed abstracts, suitable for “connecting the dots” across the literature. We describe a storytelling algorithm that, given a start and end publication, typically with little or no overlap in content, identifies a chain of intermediate publications from one to the other, such that neighboring publications have significant content similarity. The quality of discovered stories is measured using local criteria such as the size of supporting neighborhoods for each link and the strength of individual links connecting publications, as well as global metrics of dispersion. To ensure that the story stays coherent as it meanders from one publication to another, we demonstrate the design of novel coherence and overlap filters for use as post-processing steps. Conclusions We demonstrate the application of our storytelling algorithm to three case studies: i) a many-one study exploring relationships between multiple cellular inputs and a molecule responsible for cell-fate decisions, ii) a many-many study exploring the relationships between multiple cytokines and multiple downstream transcription factors, and iii) a one-to-one study to showcase the ability to recover a cancer related association, viz. the Warburg effect, from past literature. The storytelling pipeline helps narrow down a scientist's focus from several hundreds of thousands of relevant documents to only around a hundred stories. We argue that our approach can serve as a valuable discovery aid for hypothesis generation and connection exploration in large unstructured biological knowledge bases.
- Probability-one Homotopy Maps for Constrained Clustering ProblemsEasterling, David R.; Watson, Layne T.; Ramakrishnan, Naren; Hossain, M. Shahriar (Department of Computer Science, Virginia Polytechnic Institute & State University, 2013-12-31)Many algorithms for constrained clustering have been developed in the literature that aim to balance vector quantization requirements of cluster prototypes against the discrete satisfaction requirements of constraint (must-link or cannot-link) sets. A significant amount of research has been devoted to designing new algorithms for constrained clustering and understanding when constraints help clustering. However, no method exists to systematically characterize solution sets as constraints are gently introduced and how to assist practitioners in choosing a sweet spot between vector quantization and constraint satisfaction. A homotopy method is presented that can smoothly track solutions from unconstrained to constrained formulations of clustering. Beginning the homotopy zero curve tracking where the solution is (fairly) well-understood, the curve can then be tracked into regions where there is only a qualitative understanding of the solution set, finding multiple local solutions along the way. Experiments demonstrate how the new homotopy method helps identify better tradeoffs and reveals insight into the structure of solution sets not obtainable using pointwise exploration of parameters.