Browsing by Author "Karra, Kiran"
- Modeling and Analysis of Non-Linear Dependencies using Copulas, with Applications to Machine Learning
Karra, Kiran (Virginia Tech, 2018-09-21)
Many machine learning (ML) techniques rely on probability, random variables, and stochastic modeling. Although statistics pervades this field, there is a large disconnect between the copula modeling and the machine learning communities. Copulas are stochastic models that capture the full dependence structure between random variables and allow flexible modeling of multivariate joint distributions. Elidan was the first to recognize this disconnect, and introduced copula-based models to the ML community that demonstrated orders-of-magnitude better performance than non-copula-based models [Elidan, 2013]. However, a limitation of these models is that they apply only to continuous random variables, whereas real-world data is often naturally modeled as jointly continuous and discrete. This report details our work in bridging this gap by modeling and analyzing data that is jointly continuous and discrete using copulas. Our first research contribution details modeling of jointly continuous and discrete random variables using the copula framework with Bayesian networks, termed Hybrid Copula Bayesian Networks (HCBN) [Karra and Mili, 2016], a continuation of Elidan's work on Copula Bayesian Networks [Elidan, 2010]. In this work, we extend the theorems proved by Nešlehová [2007] from bivariate to multivariate copulas with discrete and continuous marginal distributions. Using the multivariate copula with discrete and continuous marginal distributions as a theoretical basis, we construct an HCBN that can model all possible permutations of discrete and continuous random variables for parent and child nodes, unlike the popular conditional linear Gaussian network model.
Finally, we demonstrate on numerous synthetic datasets and a real-life dataset that our HCBN compares favorably, from a modeling and flexibility viewpoint, to other hybrid models, including the conditional linear Gaussian and the mixture of truncated exponentials models. Our second research contribution deals with the analysis side, and discusses how one may use copulas for exploratory data analysis. To this end, we introduce a nonparametric copula-based index for detecting the strength and monotonicity structure of linear and nonlinear statistical dependence between pairs of random variables or stochastic signals. Our index, termed Copula Index for Detecting Dependence and Monotonicity (CIM), satisfies several desirable properties of measures of association, including Rényi's properties, the data processing inequality (DPI), and consequently self-equitability. Synthetic data simulations reveal that the statistical power of CIM compares favorably to other state-of-the-art measures of association that are proven to satisfy the DPI. Simulation results with real-world data reveal CIM's unique ability to detect the monotonicity structure among stochastic signals and thereby find interesting dependencies in large datasets. Additionally, simulations show that CIM performs favorably compared with estimators of mutual information when discovering Markov network structure. Our third research contribution deals with how to assess an estimator's performance in the scenario where multiple estimates of the strength of association between random variables need to be rank ordered. More specifically, we introduce a new property of estimators of the strength of statistical association, which helps characterize how well an estimator will perform in scenarios where dependencies between continuous and discrete random variables need to be rank ordered.
The new property, termed the estimator response curve, is easily computable and provides a marginal-distribution-agnostic way to assess an estimator's performance. It overcomes notable drawbacks of current metrics of assessment, including statistical power, bias, and consistency. We utilize the estimator response curve to test various measures of the strength of association that satisfy the data processing inequality (DPI), and show that the CIM estimator's performance compares favorably to kNN, vME, AP, and HMI estimators of mutual information. The estimators identified as suboptimal by the estimator response curve perform worse than the better-performing estimators when tested with real-world data from four different areas of science, all with varying dimensionalities and sizes.
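The abstract above rests on the fact that a copula describes dependence separately from the marginal distributions. The following is a minimal illustrative sketch of that idea, not the CIM or HCBN method from the thesis: it maps samples to rank-based pseudo-observations and checks that Kendall's tau, a standard rank-based measure of association, is unchanged by strictly increasing transforms of each variable. The function names, sample size, noise level, and choice of transforms are all assumptions made for illustration, and the code assumes continuous data with no ties.

```python
import math
import random


def pseudo_observations(xs):
    """Map a sample into (0, 1) by rank: u_i = rank(x_i) / (n + 1). Assumes no ties."""
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def sign(a):
    """Return -1, 0, or 1 according to the sign of a."""
    return (a > 0) - (a < 0)


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant pairs) / total pairs. Assumes no ties."""
    n = len(xs)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            s += sign(xs[i] - xs[j]) * sign(ys[i] - ys[j])
    return s / (n * (n - 1) / 2)


# Hypothetical data: y depends on x plus Gaussian noise.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(200)]
y = [xi + 0.5 * random.gauss(0.0, 1.0) for xi in x]

# Strictly increasing transforms change the marginals but not the ranks.
x_t = [math.exp(xi) for xi in x]
y_t = [yi ** 3 for yi in y]

u, v = pseudo_observations(x), pseudo_observations(y)
u_t, v_t = pseudo_observations(x_t), pseudo_observations(y_t)

tau_raw = kendall_tau(x, y)
tau_transformed = kendall_tau(x_t, y_t)
tau_ranks = kendall_tau(u, v)
```

Because the pseudo-observations depend only on ranks, all three tau values coincide here; the data's marginal distributions play no role in the measured dependence.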
- Probabilistic Load-Margin Assessment using Vine Copula and Gaussian Process Emulation
Xu, Yijun; Karra, Kiran; Mili, Lamine M.; Korkali, Mert; Chen, Xiao; Hu, Zhixiong (IEEE, 2020)
The increasing penetration of renewable energy, along with variations in the loads, brings large uncertainties in the power system states that threaten the security of power system planning and operation. Facing these challenges, this paper proposes a cost-effective, nonparametric method to quantify the impact of uncertain power injections on the load margins. First, we propose to generate uncertain system inputs via a novel vine copula because of its ability to simulate complex, highly dependent multivariate model inputs. Furthermore, to reduce the prohibitive computational time required by the traditional Monte Carlo method, we propose to use a nonparametric, Gaussian-process-emulator-based reduced-order model to replace the original complicated continuation power-flow model. This emulator allows us to execute the time-consuming continuation power-flow solver at the sampled values with a negligible computational cost. The simulations conducted on the IEEE 57-bus system, to which correlated renewable generation is attached, reveal the excellent performance of the proposed method.
- Wireless Distributed Computing on the Android Platform
Karra, Kiran (Virginia Tech, 2012-09-27)
The last couple of years have seen explosive growth in smartphone sales. Additionally, the computational power of modern smartphones has been increasing at a high rate. For example, the popular iPhone 4S has a 1 GHz processor with 512 MB of RAM [5]. Other popular smartphones such as the Samsung Galaxy Nexus S also have similar specifications. These smartphones are as powerful as desktop computers of the 2005 era, and the tight integration of many different hardware chipsets in these mobile devices makes for a unique mobile platform that can be exploited for capabilities other than traditional uses of a phone, such as talk and text [4]. In this work, the concept of using smartphones that run the Android operating system for distributed computing over a wireless mesh network is explored. This is also known as wireless distributed computing (WDC). The complexities of WDC on mobile devices are different from those of traditional distributed computing because of, among other things, the unreliable wireless communications channel and the limited power available to each computing node. This thesis develops the theoretical foundations for WDC. A mathematical model representing the total amount of resources required to distribute a task with WDC is developed. It is shown that, given a task that is distributable, under certain conditions there exists a theoretical minimum amount of resources that can be used to perform the task using WDC. Finally, the WDC architecture is developed, an Android app implementation of the WDC architecture is tested, and it is shown in a practical application that using WDC to perform a task provides a performance increase over processing the job locally on the Android OS.