Department of Statistics
http://hdl.handle.net/10919/24213
2019-05-21T20:59:27Z
High Breakdown Estimation Methods for Phase I Multivariate Control Charts
http://hdl.handle.net/10919/89427
Jensen, Willis A.; Birch, Jeffrey B.; Woodall, William H.
2005-01-01T00:00:00Z
The goal of Phase I monitoring of multivariate data is to identify multivariate outliers and step changes so that the estimated control limits are sufficiently accurate for Phase II monitoring. High breakdown estimation methods based on the minimum volume ellipsoid (MVE) or the minimum covariance determinant (MCD) are well suited to detecting multivariate outliers in data. However, they are difficult to implement in practice due to the extensive computation required to obtain the estimates. Based on previous studies, it is not clear which of these two estimation methods is best for control chart applications. The comprehensive simulation study here gives guidance for when to use which estimator, and control limits are provided. High breakdown estimation methods such as MCD and MVE can be applied to a wide variety of multivariate quality control data.
Robust Parameter Design: A Semi-Parametric Approach
http://hdl.handle.net/10919/89428
Pickle, Stephanie M.; Robinson, Timothy J.; Birch, Jeffrey B.; Anderson-Cook, Christine M.
2005-01-01T00:00:00Z
Parameter design or robust parameter design (RPD) is an engineering methodology intended as a cost-effective approach for improving the quality of products and processes. The goal of parameter design is to choose the levels of the control variables that optimize a defined quality characteristic. An essential component of robust parameter design involves the assumption of well estimated models for the process mean and variance. Traditionally, the modeling of the mean and variance has been done parametrically. It is often the case, particularly when modeling the variance, that nonparametric techniques are more appropriate due to the nature of the curvature in the underlying function. Most response surface experiments involve sparse data. In sparse data situations with unusual curvature in the underlying function, nonparametric techniques often result in estimates with problematic variation whereas their parametric counterparts may result in estimates with problematic bias. We propose the use of semi-parametric modeling within the robust design setting, combining parametric and nonparametric functions to improve the quality of both mean and variance model estimation. The proposed method will be illustrated with an example and simulations.
Statistical Monitoring of Heteroscedastic Dose-Response Profiles from High-throughput Screening
http://hdl.handle.net/10919/89429
Williams, J.D.; Birch, J.B.; Woodall, W.H.; Ferry, N.M.
2006-01-01T00:00:00Z
In pharmaceutical drug discovery and agricultural crop product discovery, in vivo bioassay experiments are used to identify promising compounds for further research. The reproducibility and accuracy of the bioassay are crucial to be able to correctly distinguish between active and inactive compounds. In the case of agricultural product discovery, a replicated dose-response of commercial crop protection products is assayed and used to monitor test quality. The activity of these compounds on the test organisms, the weeds, insects, or fungi, is characterized by a dose-response curve measured from the bioassay. These curves are used to monitor the quality of the bioassays. If undesirable conditions in the bioassay arise, such as equipment failure or problems with the test organisms, then a bioassay monitoring procedure is needed to quickly detect such issues. In this paper we illustrate a proposed nonlinear profile monitoring method to monitor the variability of multiple assays, the adequacy of the dose-response model chosen, and the estimated dose-response curves for aberrant cases in the presence of heteroscedasticity. We illustrate these methods with in vivo bioassay data collected over one year from DuPont Crop Protection.
A Bayesian Hierarchical Approach to Dual Response Surface Modeling
http://hdl.handle.net/10919/89426
Chen, Younan; Ye, Keying
2005-01-01T00:00:00Z
In modern quality engineering, dual response surface methodology is a powerful tool to monitor an industrial process by using both the mean and the standard deviation of the measurements as the responses. The least squares method in regression is often used to estimate the coefficients in the mean and standard deviation models, and various decision criteria are proposed by researchers to find the optimal conditions. Based on the inherent hierarchical structure of the dual response problems, we propose a hierarchical Bayesian approach to model dual response surfaces. Such an approach is compared with two frequentist least squares methods by using two real data sets and simulated data.
Construction Concepts for Continuum Regression
http://hdl.handle.net/10919/89425
Spitzner, Dan J.
2004-08-28T00:00:00Z
Approaches for meaningful regressor construction in the linear prediction problem are investigated in a framework similar to partial least squares and continuum regression, but weighted to allow for intelligent specification of an evaluative scheme. A cross-validatory continuum regression procedure is proposed, and shown to compare well with ordinary continuum regression in empirical demonstrations. Similar procedures are formulated from model-based constructive criteria, but are shown to be severely limited in their potential to enhance predictive performance. By paying careful attention to the interpretability of the proposed methods, the paper addresses a long-standing criticism that the current methodology relies on arbitrary mechanisms.
Dimension Reduction for Multinomial Models Via a Kolmogorov-Smirnov Measure (KSM)
http://hdl.handle.net/10919/89423
Loftus, Stephen C.; House, Leanna L.; Hughley, Myra C.; Walke, Jenifer B.; Becker, Matthew H.; Belden, Lisa K.
2015-01-01T00:00:00Z
Due to advances in technology and data collection techniques, the number of measurements often exceeds the number of samples in ecological datasets. As such, standard models that attempt to assess the relationship between variables and a response are inapplicable and require a reduction in the number of dimensions to be estimable. Several filtering methods exist to accomplish this, including Indicator Species Analyses and Sure Information Screening, but these techniques often have questionable asymptotic properties or are not readily applicable to data with multinomial responses. As such, we propose and validate a new metric called the Kolmogorov-Smirnov Measure (KSM) to be used for filtering variables. In the paper, we develop the KSM, investigate its asymptotic properties, and compare it to group equalized Indicator Species Values through simulation studies and application to a well-known biological dataset.
Speculations Concerning the First Ultraintelligent Machine
http://hdl.handle.net/10919/89424
Good, Irving John
2005-03-05T00:00:00Z
The survival of man depends on the early construction of an ultraintelligent machine. In order to design an ultraintelligent machine we need to understand more about the human brain or human thought or both. In the following pages an attempt is made to take more of the magic out of the brain by means of a "subassembly" theory, which is a modification of Hebb's famous speculative cell-assembly theory. My belief is that the first ultraintelligent machine is most likely to incorporate vast artificial neural circuitry, and that its behavior will be partly explicable in terms of the subassembly theory. Later machines will all be designed by ultra-intelligent machines, and who am I to guess what principles they will devise? But probably Man will construct the deus ex machina in his own image.
A Phase I Cluster-Based Method for Analyzing Nonparametric Profiles
http://hdl.handle.net/10919/89420
Chen, Yajuan; Birch, Jeffrey B.; Woodall, William H.
2014-01-01T00:00:00Z
A cluster-based method was used by Chen et al.²⁴ to analyze parametric profiles in Phase I of the profile monitoring process. They showed performance advantages in using their cluster-based method of analyzing parametric profiles over a non-cluster-based method with respect to more accurate estimates of the parameters and improved classification performance criteria. However, it is known that, in many cases, profiles can be better represented using a nonparametric method. In this study, we use the cluster-based method to analyze profiles that cannot be easily represented by a parametric function. The similarity matrix used during the clustering phase is based on the fits of the individual profiles with p-spline regression. The clustering phase will determine an initial main cluster set which contains more than half of the total profiles in the historical data set. The profiles with in-control T² statistics are sequentially added to the initial main cluster set and, upon completion of the algorithm, the profiles in the main cluster set are classified as the in-control profiles and the profiles not in the main cluster set are classified as out-of-control profiles. A Monte Carlo study demonstrates that the cluster-based method results in superior performance over a non-cluster-based method with respect to better classification and higher power in detecting out-of-control profiles. Also, our Monte Carlo study shows that the cluster-based method has better performance than a non-cluster-based method whether the model is correctly specified or not. We illustrate the use of our method with data from the automotive industry.
Interaction Analysis of Three Combination Drugs via a Modified Genetic Algorithm
http://hdl.handle.net/10919/89422
Wan, Wen; Pei, Xin-Yan; Grant, Steven; Birch, Jeffrey B.; Felthousen, Jessica; Dai, Yun; Fang, Hong-Bin; Tan, Ming; Sun, Shumei
2014-01-01T00:00:00Z
Few articles have been written on analyzing and visualizing three-way interactions between drugs. Although it may be quite straightforward to extend a statistical method from two drugs to three drugs, it is hard to visually illustrate which dose regions are synergistic, additive, or antagonistic, due to the four-dimensional (4-D) problem of plotting three-drug dose regions plus a response. This problem can be converted and solved by showing the dose regions of interest within the 3-D, three-drug dose space. We propose to apply a modified genetic algorithm (MGA) to construct the dose regions of interest after fitting the response surface to the interaction index (II) by a semiparametric method, the model robust regression method (MRR). A case study with three anti-cancer drugs in an in vitro experiment is employed to illustrate how to find the dose regions of interest. For example, suppose researchers are interested in visualizing where the synergistic areas with II ≤ 0.4 are in 3-D. After fitting an MRR model to the calculated II, the MGA procedure is used to collect those feasible points whose estimated values satisfy II ≤ 0.4. All these feasible points are used to construct the approximate dose regions of interest in 3-D.
An Improved Hybrid Genetic Algorithm with a New Local Search Procedure
http://hdl.handle.net/10919/89418
Wan, Wen; Birch, Jeffrey B.
2012-01-01T00:00:00Z
A hybrid genetic algorithm (HGA) combines a genetic algorithm (GA) with an individual learning procedure. One such learning procedure is a local search technique (LS) used by the GA for refining global solutions. An HGA is also called a memetic algorithm (MA), one of the most successful and popular heuristic search methods. An important challenge of MAs is the trade-off between global and local searching, because the cost of an LS can be rather high. This paper proposes a novel, simplified, and efficient HGA with a new individual learning procedure that performs an LS only when the best offspring (solution) in the offspring population is also the best in the current parent population. Additionally, a new LS method is developed based on a three-directional search (TD), which is derivative-free and self-adaptive. The new HGA with two different LS methods (the TD and Nelder-Mead simplex) is compared with a traditional HGA. Two benchmark functions are employed to illustrate the improvement of the proposed method with the new learning procedure. The results show that the new HGA greatly reduces the number of function evaluations and converges much faster to the global optimum than a traditional HGA. The TD local search method is a good choice in helping to locate a global “mountain” (or “valley”) but may not perform as well as the Nelder-Mead method in the final fine tuning toward the optimal solution.
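The selective local-search rule in the abstract above can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the authors' implementation: the names `sphere`, `local_search`, and `hybrid_ga` are hypothetical, the benchmark is a simple sum of squares, the crossover and mutation operators are generic, and the local search is a plain coordinate search standing in for the paper's TD and Nelder-Mead methods. The rule "best offspring is also the best in the current parent population" is read here as "the best offspring is at least as good as the best parent".

```python
import random


def sphere(x):
    """Assumed benchmark objective: sum of squares, minimized at the origin."""
    return sum(v * v for v in x)


def local_search(f, x, step=0.1, iters=20):
    """Stand-in derivative-free coordinate search (not the paper's TD method)."""
    best, best_val = list(x), f(x)
    for _ in range(iters):
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                cand = list(best)
                cand[i] += delta
                val = f(cand)
                if val < best_val:
                    best, best_val = cand, val
                    improved = True
        if not improved:
            step /= 2  # shrink the step once no neighbor improves
    return best


def hybrid_ga(f, dim=2, pop_size=20, generations=50, seed=0):
    """Minimize f; run the local search only when the best offspring
    is at least as good as the best current parent."""
    rng = random.Random(seed)
    parents = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    ls_calls = 0
    for _ in range(generations):
        offspring = []
        for _ in range(pop_size):
            a, b = rng.sample(parents, 2)
            w = rng.random()
            # Arithmetic crossover plus small Gaussian mutation.
            child = [w * p + (1 - w) * q + rng.gauss(0, 0.1) for p, q in zip(a, b)]
            offspring.append(child)
        best_child = min(offspring, key=f)
        best_parent = min(parents, key=f)
        if f(best_child) <= f(best_parent):
            # Selective learning step: refine only the promising offspring.
            offspring[offspring.index(best_child)] = local_search(f, best_child)
            ls_calls += 1
        # Elitist survivor selection over parents and offspring.
        parents = sorted(parents + offspring, key=f)[:pop_size]
    return parents[0], f(parents[0]), ls_calls


if __name__ == "__main__":
    best, value, ls_calls = hybrid_ga(sphere)
    print("best point:", best)
    print("objective:", value)
    print("local searches run:", ls_calls, "of 50 generations")
```

Comparing `ls_calls` with the number of generations gives a rough sense of how often the expensive refinement step is skipped under this rule; a traditional HGA would typically refine in every generation.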