Show simple item record

dc.contributor.authorCadena, Jose Eduardoen_US
dc.date.accessioned2018-01-30T09:00:53Z
dc.date.available2018-01-30T09:00:53Z
dc.date.issued2018-01-29en_US
dc.identifier.othervt_gsexam:13510en_US
dc.identifier.urihttp://hdl.handle.net/10919/81960
dc.description.abstractNetworks are a mathematical abstraction of the interactions between a set of entities, with extensive applications in social science, epidemiology, bioinformatics, and cybersecurity, among others. There are many fundamental problems when analyzing network data, such as anomaly detection, dense subgraph mining, motif finding, information diffusion, and epidemic spread. A common underlying task in all these problems is finding an "interesting subgraph"; that is, finding a part of the graph---usually small relative to the whole---that optimizes a score function and has some property of interest, such as connectivity or a minimum density. Finding subgraphs that satisfy common constraints of interest, such as the ones above, is computationally hard in general, and state-of-the-art algorithms for many problems in network analysis are heuristic in nature. These methods are fast and usually easy to implement. However, they come with no theoretical guarantees on the quality of the solution, which makes it difficult to assess how the discovered subgraphs compare to an optimal solution, which in turn affects the data mining task at hand. For instance, in anomaly detection, solutions with low anomaly score lead to sub-optimal detection power. On the other end of the spectrum, there have been significant advances on approximation algorithms for these challenging graph problems in the theoretical computer science community. However, these algorithms tend to be slow, difficult to implement, and they do not scale to the large datasets that are common nowadays. The goal of this dissertation is developing scalable algorithms with theoretical guarantees for various network analysis problems, where the underlying task is to find subgraphs with constraints. We find interesting subgraphs with guarantees by adapting techniques from parameterized complexity, convex optimization, and submodularity optimization. These techniques are well-known in the algorithm design literature, but they lead to slow and impractical algorithms. One unifying theme in the problems that we study is that our methods are scalable without sacrificing the theoretical guarantees of these algorithm design techniques. We accomplish this combination of scalability and rigorous bounds by exploiting properties of the problems we are trying to optimize, decomposing or compressing the input graph to a manageable size, and parallelization. We consider problems on network analysis for both static and dynamic network models. And we illustrate the power of our methods in applications, such as public health, sensor data analysis, and event detection using social media data.en_US
dc.format.mediumETDen_US
dc.publisherVirginia Techen_US
dc.rightsThis item is protected by copyright and/or related rights. Some uses of this item may be deemed fair and permitted by law even without permission from the rights holder(s), or the rights holder(s) may have licensed the work for use under certain conditions. For other uses you need to obtain permission from the rights holder(s).en_US
dc.subjectGraph Miningen_US
dc.subjectData Miningen_US
dc.subjectGraph Algorithmsen_US
dc.subjectAnomaly Detectionen_US
dc.subjectFinding Subgraphsen_US
dc.subjectParameterized Complexityen_US
dc.subjectDistributed Algorithmsen_US
dc.titleFinding Interesting Subgraphs with Guaranteesen_US
dc.typeDissertationen_US
dc.contributor.departmentComputer Scienceen_US
dc.description.degreePh. D.en_US
thesis.degree.namePh. D.en_US
thesis.degree.leveldoctoralen_US
thesis.degree.grantorVirginia Polytechnic Institute and State Universityen_US
thesis.degree.disciplineComputer Science and Applicationsen_US
dc.contributor.committeechairVullikanti, Anil Kumar S.en_US
dc.contributor.committeememberMarathe, Madhav Vishnuen_US
dc.contributor.committeememberLu, Chang Tienen_US
dc.contributor.committeememberKonjevod, Goranen_US
dc.contributor.committeememberRamakrishnan, Narendranen_US


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record