Browsing by Author "Spitzner, Dan J."
- Applications of Control Charts in Medicine and Epidemiology. Sego, Landon Hugh (Virginia Tech, 2006-04-05). We consider two applications of control charts in health care. The first involves the comparison of four methods designed to detect an increase in the incidence rate of a rare health event, such as a congenital malformation. A number of methods have been proposed: among these are the Sets method, two modifications of the Sets method, and the CUSUM method based on the Poisson distribution. Many of the previously published comparisons of these methods used unrealistic assumptions or ignored implicit assumptions, which led to misleading conclusions. We consider the situation where data are observed as a sequence of Bernoulli trials and propose the Bernoulli CUSUM chart as a desirable method for the surveillance of rare health events. We compare the steady-state average run length performance of the Sets method and its modifications to that of the Bernoulli CUSUM chart under a wide variety of circumstances. Except in a very few instances, we find that the Bernoulli CUSUM chart performs better than the Sets method and its modifications for the extensive number of cases considered. The second application area involves monitoring clinical outcomes, which requires accounting for the fact that each patient has a different risk of death prior to undergoing a health care procedure. We propose a risk-adjusted survival time CUSUM chart (RAST CUSUM) for monitoring clinical outcomes where the primary endpoint is a continuous, time-to-event variable that is right censored. Risk adjustment is accomplished using accelerated failure time regression models. We compare the average run length performance of the RAST CUSUM chart to that of the risk-adjusted Bernoulli CUSUM chart, using data from cardiac surgeries to motivate the details of the comparison. The comparisons show that the RAST CUSUM chart is more efficient at detecting deterioration in the quality of a clinical procedure than the risk-adjusted Bernoulli CUSUM chart, especially when the fraction of censored observations is not too high. We address details regarding the implementation of a prospective monitoring scheme using the RAST CUSUM chart.
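The Bernoulli CUSUM chart described above has a simple recursive form: each Bernoulli outcome contributes its log-likelihood ratio for a shift from an in-control rate p0 to an out-of-control rate p1, and the statistic is truncated at zero from below. A minimal sketch in Python follows; the rates, control limit h, and simulated data are illustrative assumptions, not values from the dissertation.

```python
import numpy as np

def bernoulli_cusum(x, p0, p1, h):
    """Upper Bernoulli CUSUM for detecting a rate increase from p0 to p1.

    x : sequence of 0/1 outcomes (1 = adverse event), p0 : in-control rate,
    p1 : out-of-control rate to detect quickly, h : control limit.
    Returns the CUSUM path and the index of the first signal (or None).
    """
    llr1 = np.log(p1 / p0)               # increment when an event occurs
    llr0 = np.log((1 - p1) / (1 - p0))   # increment when no event occurs
    c, path, signal = 0.0, [], None
    for t, xt in enumerate(x):
        c = max(0.0, c + (llr1 if xt else llr0))
        path.append(c)
        if signal is None and c >= h:
            signal = t
    return np.array(path), signal

# Illustration with simulated trials: in-control rate 0.001, shifted to 0.004 after 5000 trials.
rng = np.random.default_rng(1)
x = np.concatenate([rng.binomial(1, 0.001, 5000), rng.binomial(1, 0.004, 5000)])
path, signal = bernoulli_cusum(x, p0=0.001, p1=0.004, h=4.0)
print("first signal at trial:", signal)
```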
- Bayesian D-Optimal Design for Generalized Linear Models. Zhang, Ying (Virginia Tech, 2006-12-07). Bayesian optimal designs have received increasing attention in recent years, especially in biomedical and clinical trials. Bayesian design procedures can utilize the available prior information about the unknown parameters so that a better design can be achieved. However, a difficulty in dealing with Bayesian design is the lack of efficient computational methods. In this research, a hybrid computational method, which combines a rough global optimum search with a more precise local optimum search, is proposed to efficiently search for Bayesian D-optimal designs for multi-variable generalized linear models. In particular, Poisson regression models and logistic regression models are investigated. Designs are examined for a range of prior distributions, and the equivalence theorem is used to verify design optimality. Design efficiency for various models is examined and compared with that of non-Bayesian designs. Bayesian D-optimal designs are found to be more efficient and robust than non-Bayesian D-optimal designs. Furthermore, the idea of Bayesian sequential design is introduced and a Bayesian two-stage D-optimal design approach is developed for generalized linear models. With the incorporation of the first-stage data information into the second stage, the two-stage design procedure can improve design efficiency and produce more accurate and robust designs. The Bayesian two-stage D-optimal designs for Poisson and logistic regression models are evaluated based on simulation studies. The Bayesian two-stage optimal design approach is superior to the one-stage approach in terms of a design efficiency criterion.
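For a sense of what the Bayesian D-criterion involves, the sketch below evaluates E_prior[log det M(design, beta)] by Monte Carlo for a one-variable logistic regression, where M is the GLM information matrix with weights p(1-p). This is only the criterion evaluation, not the hybrid global/local search proposed in the dissertation; the prior, candidate designs, and model form are assumptions chosen for illustration.

```python
import numpy as np

def bayes_d_criterion(design_x, weights, beta_draws):
    """Monte Carlo estimate of the Bayesian D-criterion E_prior[log det M(xi, beta)]
    for a one-variable logistic regression with linear predictor beta0 + beta1 * x.

    design_x : support points, weights : design weights (summing to 1),
    beta_draws : (n_draws, 2) array of prior draws for (beta0, beta1).
    """
    F = np.column_stack([np.ones_like(design_x), design_x])  # model matrix f(x) = (1, x)
    vals = []
    for b in beta_draws:
        p = 1.0 / (1.0 + np.exp(-(F @ b)))   # success probabilities at the support points
        v = weights * p * (1.0 - p)          # GLM information weights
        M = F.T @ (F * v[:, None])           # information matrix for this beta draw
        vals.append(np.linalg.slogdet(M)[1])
    return np.mean(vals)

# Illustrative prior and two candidate designs (assumed values, for demonstration only).
rng = np.random.default_rng(0)
beta_draws = rng.normal([0.0, 1.0], [0.3, 0.3], size=(500, 2))
for xs in (np.array([-2.0, 0.0, 2.0]), np.array([-1.0, 1.0])):
    w = np.full(len(xs), 1.0 / len(xs))
    print(xs, round(bayes_d_criterion(xs, w, beta_draws), 3))
```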
- Clustering Response-Stressor Relationships in Ecological StudiesGao, Feng (Virginia Tech, 2007-06-20)This research is motivated by an issue frequently encountered in water quality monitoring and ecological assessment. One concern for researchers and watershed resource managers is how the biological community in a watershed is affected by human activities. The conventional single model approach based on regression and logistic regression usually fails to adequately model the relationship between biological responses and environmental stressors since the study samples are collected over a large spatial region and the response-stressor relationships are usually weak in this situation. In this dissertation, we propose two alternative modeling approaches to partition the whole region of study into disjoint subregions and model the response-stressor relationships within subregions simultaneously. In our examples, these modeling approaches found stronger relationships within subregions and should help the resource managers improve impairment assessment and decision making. The first approach is an adjusted Bayesian classification and regression tree (ABCART). It is based on the Bayesian classification and regression tree approach (BCART) and is modified to accommodate spatial partitions in ecological studies. The second approach is a Voronoi diagram based partition approach. This approach uses the Voronoi diagram technique to randomly partition the whole region into subregions with predetermined minimum sample size. The optimal partition/cluster is selected by Monte Carlo simulation. We propose several model selection criteria for optimal partitioning and modeling according to the nature of the study and extend it to multivariate analysis to find the underlying structure of response-stressor relationships. We also propose a multivariate hotspot detection approach (MHDM) to find the region where the response-stressor relationship is the strongest according to an R-square-like criterion. Several sets of ecological data are studied in this dissertation to illustrate the implementation of the above partition modeling approaches. The findings from these studies are consistent with other studies.
- Construction Concepts for Continuum Regression. Spitzner, Dan J. (Virginia Tech, 2004-08-28). Approaches for meaningful regressor construction in the linear prediction problem are investigated in a framework similar to partial least squares and continuum regression, but weighted to allow for intelligent specification of an evaluative scheme. A cross-validatory continuum regression procedure is proposed, and shown to compare well with ordinary continuum regression in empirical demonstrations. Similar procedures are formulated from model-based constructive criteria, but are shown to be severely limited in their potential to enhance predictive performance. By paying careful attention to the interpretability of the proposed methods, the paper addresses a long-standing criticism that the current methodology relies on arbitrary mechanisms.
- Contributions to Profile Monitoring and Multivariate Statistical Process ControlWilliams, James Dickson (Virginia Tech, 2004-12-01)The content of this dissertation is divided into two main topics: 1) nonlinear profile monitoring and 2) an improved approximate distribution for the T² statistic based on the successive differences covariance matrix estimator. Part 1: Nonlinear Profile Monitoring In an increasing number of cases the quality of a product or process cannot adequately be represented by the distribution of a univariate quality variable or the multivariate distribution of a vector of quality variables. Rather, a series of measurements are taken across some continuum, such as time or space, to create a profile. The profile determines the product quality at that sampling period. We propose Phase I methods to analyze profiles in a baseline dataset where the profiles can be modeled through either a parametric nonlinear regression function or a nonparametric regression function. We illustrate our methods using data from Walker and Wright (2002) and from dose-response data from DuPont Crop Protection. Part 2: Approximate Distribution of T² Although the T² statistic based on the successive differences estimator has been shown to be effective in detecting a shift in the mean vector (Sullivan and Woodall (1996) and Vargas (2003)), the exact distribution of this statistic is unknown. An accurate upper control limit (UCL) for the T² chart based on this statistic depends on knowing its distribution. Two approximate distributions have been proposed in the literature. We demonstrate the inadequacy of these two approximations and derive useful properties of this statistic. We give an improved approximate distribution and recommendations for its use.
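For Part 2, the successive-differences estimator and the associated T-squared statistics can be written in a few lines: S_D = V'V / (2(n-1)), where the rows of V are the differences of consecutive observation vectors, and T2_i = (x_i - xbar)' S_D^{-1} (x_i - xbar). A minimal sketch follows, with simulated data standing in for a real Phase I baseline sample.

```python
import numpy as np

def t2_successive_differences(X):
    """Phase I T-squared statistics using the successive-differences covariance estimator.

    X : (n, p) array of multivariate observations in time order.
    Returns T2_i = (x_i - xbar)' S_D^{-1} (x_i - xbar), where
    S_D = V'V / (2(n-1)) and the rows of V are the successive differences x_{i+1} - x_i.
    """
    n, p = X.shape
    xbar = X.mean(axis=0)
    V = np.diff(X, axis=0)                    # successive differences
    S_D = V.T @ V / (2.0 * (n - 1))           # successive-differences estimator
    S_inv = np.linalg.inv(S_D)
    centered = X - xbar
    return np.einsum("ij,jk,ik->i", centered, S_inv, centered)

# Illustration: 30 in-control observations plus a mean shift in the last 5.
rng = np.random.default_rng(2)
X = rng.normal(size=(35, 3))
X[30:] += 2.0
print(np.round(t2_successive_differences(X), 2))
```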
- Contributions to Robust Adaptive Signal Processing with Application to Space-Time Adaptive Radar. Schoenig, Gregory Neumann (Virginia Tech, 2007-04-12). Classical adaptive signal processors typically utilize assumptions in their derivation. The presence of adequate Gaussian and independent and identically distributed (i.i.d.) input data is central among such assumptions. However, classical processors have a tendency to suffer a degradation in performance when assumptions like these are violated. Worse yet, such degradation is not guaranteed to be proportional to the level of deviation from the assumptions. This dissertation proposes new signal processing algorithms based on aspects of modern robustness theory, including methods to enable adaptivity of presently non-adaptive robust approaches. The contributions presented are the result of research performed jointly in two disciplines, namely robustness theory and adaptive signal processing. This joint consideration of robustness and adaptivity enables improved performance in assumption-violating scenarios, that is, scenarios in which classical adaptive signal processors fail. Three contributions are central to this dissertation. First, a new adaptive diagnostic tool for high-dimension data is developed and shown to be robust to problematic contamination. Second, a robust data pre-whitening method is presented based on the new diagnostic tool. Finally, a new suppression-based robust estimator is developed for use with complex-valued adaptive signal processing data. To exercise the proposals and compare their performance to state-of-the-art methods, data sets commonly used in statistics as well as Space-Time Adaptive Processing (STAP) radar data, both real and simulated, are processed, and performance is subsequently computed and displayed. The new algorithms are shown to outperform their state-of-the-art counterparts in terms of both signal-to-interference-plus-noise ratio (SINR) convergence rate and target detection.
- The Effects of Household Fabric Softeners on the Thermal Comfort and Flammability of Cotton and Polyester FabricsGuo, Jiangman (Virginia Tech, 2003-04-29)This study examined the effects of household fabric softeners on the thermal comfort and flammability of 100% cotton and 100% polyester fabrics after repeated laundering. Two fabric properties related to thermal comfort, water vapor transmission and air permeability, were examined. A 3 X 2 X 3 experimental design (i.e., 18 experimental cells) was developed to conduct the research. Three independent variables were selected: fabric softener treatments (i.e., rinse cycle softener, dryer sheet softener, no softener), fabric types (i.e., 100% cotton, 100% polyester), and number of laundering cycles (i.e., 1, 15, 25 cycles). Three dependent variables were tested: water vapor transmission, air permeability, and flammability. The test fabrics were purchased from Testfabrics, Inc. To examine the influence of the independent variables and their interactions on each dependent variable, two-way or three-way Analysis of Variance (ANOVA) tests were used to analyze the data. Results in this study showed that both the rinse cycle softener and the dryer sheet softener significantly decreased the water vapor transmission of test specimens to a similar degree. The rinse cycle softener decreased the air permeability of test specimens most and was followed by the dryer sheet softener. The rinse cycle softener increased the flammability of both cotton and polyester fabrics, but the dryer sheet softener had no significant effect on the flammability of both fabric types. Statistical analysis also indicated that the interactions were significant among the independent variables on water vapor transmission, air permeability, and flammability of the test specimens. For example, the rinse cycle softener significantly decreased the water vapor transmission and air permeability of cotton fabric but had no effect on polyester fabric. The dryer sheet softener also decreased the water vapor transmission of cotton fabric but had no effect on polyester fabric, and it had no effect on the air permeability of both cotton and polyester fabrics. In addition, the air permeability of cotton specimens treated with the rinse cycle softener continuously reduced after repeated laundering, but that of polyester fabrics treated with the rinse cycle softener only reduced after 15 laundering cycles and showed no continuous decrease when laundering cycles increased. When the influence of fabric softener treatments on flammability was examined, the results showed that the more the specimens were laundered with the rinse cycle softener, the greater the flammability of the test specimens. However, the dryer sheet softener did not have a significant effect on the flammability of the test fabrics even after repeated laundering. For the polyester fabric, all specimens treated with the dryer sheet softener or no softener passed the standard of children's sleepwear even after 25 laundering cycles, but those treated with the rinse cycle softener did not pass the standard. In conclusion, fabric softener treatment had a significant influence on the thermal comfort (i.e., water vapor transmission and air permeability) and flammability of 100% cotton and 100% polyester fabrics after repeated laundering cycles and the effects were significantly different among the three independent variables (i.e., fabric softener treatments, fabric types, and number of laundering cycles). 
The applications of these results were also discussed.
- A Framework for Monitoring Performance-Based Road Maintenance. Pinero, Juan Carlos (Virginia Tech, 2003-12-08). In the late 1980s and early 1990s, a few transportation agencies around the world considered performance-based specifications as an alternative to improve the efficiency of the services provided to the public. These initiatives are better known as Performance-Based Road Maintenance (PBRM). PBRM calls for performance-based work, in which a desired outcome is specified rather than a material or method. This type of specification promises to be an excellent tool to improve government efficiency in maintaining transportation networks; however, without proper monitoring, it could likely yield adverse outcomes. Since PBRM is relatively new, the availability of reliable and comprehensive sets of guidelines to evaluate the effectiveness and efficiency of this type of specification in the roadway maintenance arena is limited. Transportation agencies currently rely on criteria and procedures they have developed from their traditional methods used to evaluate performance. Unfortunately, some of these procedures cannot appropriately assess the benefits, if any, accrued by the government as a result of implementing performance-based specifications for the maintenance of the roadway system. This research presents the development of a framework for monitoring PBRM more comprehensively and accurately. The framework considers the assessment of five main areas -- Level of Service Effectiveness, Cost-Efficiency, Timeliness of Response, Safety Procedures, and Quality of Services -- in order to guarantee the comprehensiveness and reliability of the evaluation process. The major contribution of this framework is to provide transportation agencies with guidelines for evaluating the effectiveness and efficiency of PBRM as an alternative delivery method to maintain and preserve the roadway system.
- Improving Turbidity-Based Estimates of Suspended Sediment Concentrations and Loads. Jastram, John Dietrich (Virginia Tech, 2007-05-04). As the impacts of human activities increase sediment transport by aquatic systems, the need to accurately quantify this transport becomes paramount. Turbidity is recognized as an effective tool for monitoring suspended sediments in aquatic systems, and with recent technological advances turbidity can be measured in-situ remotely, continuously, and at much finer temporal scales than was previously possible. Although turbidity provides an improved method for estimation of suspended-sediment concentration (SSC), compared to traditional discharge-based methods, there is still significant variability in turbidity-based SSC estimates and in sediment loadings calculated from those estimates. The purpose of this study was to improve the turbidity-based estimation of SSC. At two monitoring sites on the Roanoke River in southwestern Virginia, stage, turbidity, and other water-quality parameters were monitored with in-situ instrumentation; suspended sediments were sampled manually during elevated turbidity events, and those samples were analyzed for SSC and for physical properties; rainfall was quantified by geologic source area. The study identified physical properties of the suspended-sediment samples that contribute to SSC-estimation variance and hydrologic variables that contribute to variance in those physical properties. Results indicated that the inclusion of any of the measured physical properties, which included grain-size distributions, specific surface area, and organic carbon, in turbidity-based SSC estimation models reduces unexplained variance. Further, the use of hydrologic variables, which were measured remotely and on the same temporal scale as turbidity, to represent these physical properties resulted in a model that was equally capable of predicting SSC. A square-root transformed turbidity-based SSC estimation model developed for the Roanoke River at Route 117 monitoring station, which included a water level variable, provided 63% less unexplained variance in SSC estimations and 50% narrower 95% prediction intervals for an annual loading estimate, when compared to a simple linear regression using a logarithmic transformation of the response and regressor (turbidity). Unexplained variance and prediction interval width were also reduced using this approach at a second monitoring site, Roanoke River at Thirteenth Street Bridge; the log-based transformation of SSC and regressors was found to be most appropriate at that monitoring station. Furthermore, this study demonstrated the potential for a single model, generated from a pooled set of data from the two monitoring sites, to estimate SSC with less variance than a model generated only from data collected at a single site. When applied at suitable locations, this pooled-model approach could provide many benefits to monitoring programs, such as developing SSC-estimation models for multiple sites that individually do not have enough data to generate a robust model, extending the model to monitoring sites between those for which the model was developed, and significantly reducing sampling costs for intensive monitoring programs.
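As a rough illustration of a transformed SSC estimation model with an added water-level (stage) regressor, the sketch below fits sqrt(SSC) on sqrt(turbidity) and stage by ordinary least squares and back-transforms predictions by squaring. The exact variable set, transformations, and bias correction used in the dissertation are not reproduced here; the regressor transformation and the simulated data are assumptions.

```python
import numpy as np

def fit_sqrt_ssc_model(turbidity, stage, ssc):
    """Fit a square-root transformed SSC estimation model,
    sqrt(SSC) = b0 + b1*sqrt(turbidity) + b2*stage + error, by ordinary least squares.

    Returns the coefficients and a prediction function on the original scale.
    (A naive back-transform by squaring is used; in practice a bias correction
    such as a smearing estimator would normally be applied.)
    """
    X = np.column_stack([np.ones_like(turbidity), np.sqrt(turbidity), stage])
    beta, *_ = np.linalg.lstsq(X, np.sqrt(ssc), rcond=None)

    def predict(turb_new, stage_new):
        Xn = np.column_stack([np.ones_like(turb_new), np.sqrt(turb_new), stage_new])
        return (Xn @ beta) ** 2

    return beta, predict

# Illustration with simulated data (assumed relationship, for demonstration only).
rng = np.random.default_rng(3)
turb = rng.uniform(5, 500, 200)
stage = rng.uniform(1, 4, 200)
ssc = (0.5 + 0.9 * np.sqrt(turb) + 0.8 * stage + rng.normal(0, 1, 200)) ** 2
beta, predict = fit_sqrt_ssc_model(turb, stage, ssc)
print(np.round(beta, 2), round(predict(np.array([100.0]), np.array([2.5]))[0], 1))
```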
- Intelligent Fusion of Evidence from Multiple Sources for Text Classification. Zhang, Baoping (Virginia Tech, 2006-06-20). Automatic text classification using current approaches is known to perform poorly when documents are noisy or when only limited amounts of textual content are available. Yet, many users need access to such documents, which are found in large numbers in digital libraries and on the WWW. If documents are not classified, they are difficult to find when browsing. Further, search precision suffers when categories cannot be checked, since many documents may be retrieved that would fail to meet category constraints. In this work, we study how different types of evidence from multiple sources can be intelligently fused to improve the classification of text documents into predefined categories. We present a classification framework based on an inductive learning method -- Genetic Programming (GP) -- to fuse evidence from multiple sources. We show that good classification is possible with documents which are noisy or which have small amounts of text (e.g., short metadata records) -- if multiple sources of evidence are fused in an intelligent way. The framework is validated through experiments performed on documents in two testbeds. One is the ACM Digital Library (using a subset available in connection with CITIDEL, part of NSF's National Science Digital Library). The other is Web data, in particular the portion associated with the Cadê Web directory. Our studies have shown that improvement can be achieved relative to other machine learning approaches if genetic programming methods are combined with classifiers such as kNN. Extensive analysis was performed to study the results generated through the GP-based fusion approach and to understand the key factors that promote good classification.
- Methods of Determining the Number of Clusters in a Data Set and a New Clustering Criterion. Yan, Mingjin (Virginia Tech, 2005-11-28). In cluster analysis, a fundamental problem is to determine the best estimate of the number of clusters, which has a deterministic effect on the clustering results. However, a limitation in current applications is that no convincingly acceptable solution to the best-number-of-clusters problem is available, due to the high complexity of real data sets. In this dissertation, we tackle the problem of estimating the number of clusters, with particular attention to very complicated data that may contain multiple types of cluster structure. Two new methods of choosing the number of clusters are proposed, which have been shown empirically to be highly effective given clear and distinct cluster structure in a data set. In addition, we propose a sequential type of clustering approach, called multi-layer clustering, by combining these two methods. Multi-layer clustering not only functions as an efficient method of estimating the number of clusters, but also, by superimposing a sequential idea, improves the flexibility and effectiveness of any arbitrary existing one-layer clustering method. Empirical studies have shown that multi-layer clustering has higher efficiency than one-layer clustering approaches, especially in detecting clusters in complicated data sets. The multi-layer clustering approach has been successfully implemented in clustering the WTCHP microarray data, and the results can be interpreted very well based on known biological knowledge. Choosing an appropriate clustering method is another critical step in clustering. K-means clustering is one of the most popular clustering techniques used in practice. However, the k-means method tends to generate clusters containing a nearly equal number of objects, which is referred to as the "equal-size" problem. We propose a clustering method that competes with the k-means method. Our newly defined method is aimed at overcoming the so-called "equal-size" problem associated with the k-means method, while maintaining its advantage of computational simplicity. Advantages of the proposed method over k-means clustering have been demonstrated empirically using simulated data with low dimensionality.
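To make the best-number-of-clusters problem concrete, the sketch below applies a standard criterion, the average silhouette width, across candidate values of k. It is only a stand-in for illustration and is not the new criteria or the multi-layer procedure proposed in the dissertation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def choose_k_by_silhouette(X, k_range=range(2, 9), seed=0):
    """Pick the number of clusters that maximizes the average silhouette width.
    (A standard criterion used here only for illustration, not the methods
    proposed in the dissertation.)"""
    scores = {}
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    best_k = max(scores, key=scores.get)
    return best_k, scores

# Illustration: three well-separated Gaussian clusters in two dimensions.
rng = np.random.default_rng(4)
X = np.vstack([rng.normal(c, 0.4, size=(60, 2)) for c in ((0, 0), (4, 0), (2, 4))])
best_k, scores = choose_k_by_silhouette(X)
print("chosen k:", best_k)
```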
- A Modified Bayesian Power Prior Approach with Applications in Water Quality EvaluationDuan, Yuyan (Virginia Tech, 2005-11-28)This research is motivated by an issue frequently encountered in environmental water quality evaluation. Many times, the sample size of water monitoring data is too small to have adequate power. Here, we present a Bayesian power prior approach by incorporating the current data and historical data and/or the data collected at neighboring stations to make stronger statistical inferences on the parameters of interest. The elicitation of power prior distributions is based on the availability of historical data, and is realized by raising the likelihood function of the historical data to a fractional power. The power prior Bayesian analysis has been proven to be a useful class of informative priors in Bayesian inference. In this dissertation, we propose a modified approach to constructing the joint power prior distribution for the parameter of interest and the power parameter. The power parameter, in this modified approach, quantifies the heterogeneity between current and historical data automatically, and hence controls the influence of historical data on the current study in a sensible way. In addition, the modified power prior needs little to ensure its propriety. The properties of the modified power prior and its posterior distribution are examined for the Bernoulli and normal populations. The modified and the original power prior approaches are compared empirically in terms of the mean squared error (MSE) of parameter estimates as well as the behavior of the power parameter. Furthermore, the extension of the modified power prior to multiple historical data sets is discussed, followed by its comparison with the random effects model. Several sets of water quality data are studied in this dissertation to illustrate the implementation of the modified power prior approach with normal and Bernoulli models. Since the power prior method uses information from sources other than current data, it has advantages in terms of power and estimation precision for decisions with small sample sizes, relative to methods that ignore prior information.
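With a fixed power parameter a0 and a Bernoulli model, the power prior stays conjugate, which makes the weighting of historical data easy to see: the historical counts enter the Beta posterior multiplied by a0. The sketch below uses this fixed-a0 form for illustration only; the modified approach in the dissertation instead treats the power parameter as random, so it is estimated rather than fixed, and the example counts are assumed values.

```python
from scipy import stats

def power_prior_posterior(x, n, x0, n0, a0, alpha=1.0, beta=1.0):
    """Posterior for a Bernoulli success probability under a power prior with a
    *fixed* power parameter a0 (0 = ignore historical data, 1 = pool fully).

    The historical likelihood is raised to the power a0, so with a Beta(alpha, beta)
    initial prior the posterior is conjugate:
        Beta(alpha + a0*x0 + x,  beta + a0*(n0 - x0) + (n - x)).
    """
    return stats.beta(alpha + a0 * x0 + x, beta + a0 * (n0 - x0) + (n - x))

# Illustration: 3 exceedances in 12 current samples, 10 in 60 historical samples.
for a0 in (0.0, 0.5, 1.0):
    post = power_prior_posterior(x=3, n=12, x0=10, n0=60, a0=a0)
    print(f"a0={a0}: mean={post.mean():.3f}, "
          f"95% interval=({post.ppf(0.025):.3f}, {post.ppf(0.975):.3f})")
```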
- Online Impulse Buying Behavior with Apparel Products: Relationships with Apparel Involvement, Website Attributes, and Product Category/Price. Rhee, Young-Ju (Virginia Tech, 2006-09-06). The potential use of the Internet for apparel retail marketing is extremely viable (Murphy, 1998); however, most of the journal papers on apparel Internet shoppers are limited to the comparison of demographic, psychographic, and behavioral characteristics between shoppers and non-shoppers (McKinney, 2004). Little empirical research has addressed the role of impulsiveness in online apparel shopping behavior. In the past, impulse buying was considered something bad, and consumers felt guilty after impulse buying (Ainslie 1975; Levy 1976). However, most researchers no longer view impulse buying as a negative phenomenon because studies showed that impulse buying satisfies a number of hedonic desires (Piron 1991; Rook & Fisher 1995; Thompson, Locander, & Pollio 1990). Impulse buyers exhibited greater feelings of amusement, delight, enthusiasm, and joy (Weinberg & Gottwald, 1982) and often felt uplifted or energized after a shopping experience that involved impulse buying (Rook, 1987; Gardner & Rook, 1988; 1993), because impulse buying can provide the enjoyment of novelty and surprise and the ability to alter mood (i.e., breaking out of a negative mood state) (Gardner & Rook, 1988; Rook, 1987). Recognizing the positive feelings generated from impulse buying and considering the increasing frequency of college students' Internet shopping (Seock, 2003), one strategy to create competitive advantages in the apparel market of college students is to understand the variables related to impulse buying and, based on that understanding, provide a website that generates pleasurable shopping. The purpose of this study was to examine the relationships between online apparel impulse buying behavior and apparel involvement, apparel website attributes, and product category/price. The data were collected using an online survey with a structured questionnaire. To recruit participants, 37,590 e-mails were sent to six universities located in different regions of the United States. A total of 687 college students responded to the survey, including 284 online apparel buyers, 194 non-online apparel buyers, and 209 non-apparel website visitors. When the impulsiveness of online apparel purchases in general was used to divide the participants into impulse buyer and non-impulse buyer groups, the Chi-square test results showed that there were significantly more female respondents in the impulse buyer group than in the non-impulse buyer group. However, when impulsiveness of the last purchase was used to divide the participants into impulse purchase and non-impulse purchase groups, the results showed no significant difference between the genders. For other results, the findings were all consistent. Respondents in the impulse buyer and purchase groups had greater total monthly income and spent more money on apparel products than those in the non-impulse buyer and purchase groups. The impulse buyer and purchase groups visited websites that sold clothing/accessories more frequently and purchased more apparel products online over the past six months than the non-impulse buyer and purchase groups. These results suggest that impulse buyers are an important segment of the online apparel market. Four hypotheses were put forward to test the relationships among the variables.
Before the proposed hypotheses could be examined, a factor analysis was first conducted to determine the constructs of apparel involvement and website attributes. The results showed that apparel involvement consisted of three factors (i.e., sign value/perceived importance, pleasure value, risk importance/probability) and website attributes consisted of four factors (i.e., website design, product presentation, promotion, product search/policy information). The results of MANOVA showed that the impulse buyer group perceived the sign value/perceived importance and the pleasure value of apparel involvement significantly higher, and perceived the risk importance/probability of apparel involvement significantly lower, than the non-impulse buyer group. Based on the results, H1 was supported: impulsive and non-impulsive online apparel buyers differed significantly in their apparel involvement. For H2, the results indicated that the impulse purchase group evaluated the website where they bought the last apparel item significantly better in website design, product presentation, promotion, and product search/policy information than the non-impulse purchase group. Based on the results, H2 was supported: the evaluations of the attributes of websites where impulse purchases and non-impulse purchases of apparel products were made were significantly different. The test of H3 showed that some product categories purchased by the respondents in the impulse purchase group were significantly different from those bought by the non-impulse purchase group. Categories such as shirt/blouse and belt were bought more frequently by the respondents in the impulse purchase group, whereas shoes were bought more frequently by those in the non-impulse purchase group. The respondents in the impulse purchase group bought more items that cost less than $25 than those in the non-impulse purchase group. Based on the results, H3 was supported: the product categories purchased by the impulse purchase group and non-impulse purchase group were significantly different. The multiple regression results showed that the sign value/perceived importance of apparel involvement contributed the most in explaining impulsiveness of online apparel buying behavior, followed by product price, risk importance/probability of apparel involvement, and product presentation of website attributes. Other factors, such as the pleasure value of apparel involvement and website attributes in website design, promotion, and product search/policy information, had no significant linear relationships with the impulsiveness of online apparel buying behavior. Based on the results, H4 was partially supported. From the results of the present study, it is concluded that apparel involvement, website attributes, and product price are closely related to the impulsiveness of consumers' online apparel buying behavior. This study is beneficial to researchers and marketers by identifying possible psychological reasons for impulse buying as well as suggesting strategies to develop an apparel website that facilitates impulse buying behavior.
- Optimal Blocking for Three Treatments and BIBD Robustness - Two Problems in Design Optimality. Parvu, Valentin (Virginia Tech, 2004-11-29). Design optimality plays a central role in the area of statistical experimental design. In general, problems in design optimality are composed of two vital, but separable, components. One of these is determining conditions under which a design is optimal (such as criterion bounds, values of design parameters, or special structure in the information matrix). The other is construction of designs satisfying those conditions. Most papers deal with either optimality conditions or design construction in accordance with desired combinatorial properties, but not both. This dissertation determines optimal designs for three treatments in the one-way and multi-way heterogeneity settings, first proving optimality through a series of bounding arguments, then applying combinatorial techniques for their construction. Among the results established are optimality with respect to the well-known E and A criteria. A- and E-optimal block designs and row-column designs with three treatments are found for any parameter set. E-optimal hyperrectangles with three treatments are also found for any parameter set. Systems of distinct representatives theory is used for the construction of optimal designs. Efficiencies relative to optimal criterion values are used to determine the robustness of block designs against the loss of a small number of blocks. Nonisomorphic balanced incomplete block designs are ranked based on their robustness. A complete list of the most robust BIBDs for v ≤ 10, r ≤ 15 is compiled.
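The A- and E-criteria referred to above are functions of the eigenvalues of the treatment information matrix C = R - N K^{-1} N' of a block design, where N is the treatment-by-block incidence matrix, R holds the replications, and K holds the block sizes. A short sketch, using a small BIBD as an assumed example, computes both criterion values from the incidence matrix.

```python
import numpy as np

def block_design_criteria(N):
    """A- and E-criterion values from the treatment incidence matrix N (v x b),
    via the information matrix C = R - N K^{-1} N'.

    Returns (sum of reciprocals of the nonzero eigenvalues, smallest nonzero
    eigenvalue); a design is A-better when the first value is smaller and
    E-better when the second is larger.
    """
    r = N.sum(axis=1)                 # treatment replications
    k = N.sum(axis=0)                 # block sizes
    C = np.diag(r) - (N / k) @ N.T    # information matrix for treatment contrasts
    eig = np.sort(np.linalg.eigvalsh(C))
    nonzero = eig[eig > 1e-8]         # drop the structural zero eigenvalue
    return np.sum(1.0 / nonzero), nonzero.min()

# Illustration: the BIBD with v = 3 treatments in blocks {1,2}, {1,3}, {2,3}.
N = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1]], dtype=float)
print(block_design_criteria(N))
```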
- Profile Monitoring for Mixed Model DataJensen, Willis Aaron (Virginia Tech, 2006-04-10)The initial portion of this research focuses on appropriate parameter estimators within a general context of multivariate quality control. The goal of Phase I analysis of multivariate quality control data is to identify multivariate outliers and step changes so that the estimated control limits are sufficiently accurate for Phase II monitoring. High breakdown estimation methods based on the minimum volume ellipsoid (MVE) or the minimum covariance determinant (MCD) are well suited to detecting multivariate outliers in data. Because of the inherent difficulties in computation many algorithms have been proposed to obtain them. We consider the subsampling algorithm to obtain the MVE estimators and the FAST-MCD algorithm to obtain the MCD estimators. Previous studies have not clearly determined which of these two estimation methods is best for control chart applications. The comprehensive simulation study here gives guidance for when to use which estimator. Control limits are provided. High breakdown estimation methods such as MCD and MVE can be applied to a wide variety of multivariate quality control data. The final, lengthier portion of this research considers profile monitoring. Profile monitoring is a relatively new technique in quality control used when the product or process quality is best represented by a profile (or a curve) at each time period. The essential idea is often to model the profile via some parametric method and then monitor the estimated parameters over time to determine if there have been changes in the profiles. Because the estimated parameters may be correlated, it is convenient to monitor them using a multivariate control method such as the T-squared statistic. Previous modeling methods have not incorporated the correlation structure within the profiles. We propose the use of mixed models (both linear and nonlinear) to monitor linear and nonlinear profiles in order to account for the correlation structure within a profile. We consider various data scenarios and show using simulation when the mixed model approach is preferable to an approach that ignores the correlation structure. Our focus is on Phase I control chart applications.
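As an illustration of the high-breakdown estimation step, the sketch below computes Phase I T-squared values from MCD estimates of location and scatter using scikit-learn's FAST-MCD implementation. The data, contamination pattern, and default tuning are assumptions, and the control limits from the dissertation's simulation study are not reproduced here.

```python
import numpy as np
from sklearn.covariance import MinCovDet

def phase1_t2_mcd(X, random_state=0):
    """Phase I T-squared statistics based on the minimum covariance determinant (MCD)
    estimators of location and scatter, so that multivariate outliers in the
    baseline data do not inflate the estimates."""
    mcd = MinCovDet(random_state=random_state).fit(X)
    return mcd.mahalanobis(X)   # squared robust distances serve as T^2 values

# Illustration: 50 in-control observations with 3 gross outliers appended.
rng = np.random.default_rng(5)
X = np.vstack([rng.normal(size=(50, 4)), rng.normal(6.0, 1.0, size=(3, 4))])
t2 = phase1_t2_mcd(X)
print(np.round(t2[-5:], 1))     # the last three values should stand out
```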
- Prospective Spatio-Temporal Surveillance Methods for the Detection of Disease Clusters. Marshall, J. Brooke (Virginia Tech, 2009-06-10). In epidemiology, it is often useful to monitor disease occurrences prospectively to determine the location and time when clusters of disease are forming. This aids in the prevention of illness and injury of the public and is the reason spatio-temporal disease surveillance methods are implemented. Care must be taken in the design and implementation of these types of surveillance methods so that the methods provide accurate information on the development of clusters. Here two spatio-temporal methods for prospective disease surveillance are considered. These include the local Knox monitoring method and a new wavelet-based prospective monitoring method. The local Knox surveillance method uses a cumulative sum (CUSUM) control chart for monitoring the local Knox statistic, which tests for space-time clustering each time there is an incoming observation. The detection of clusters of events occurring close together both temporally and spatially is important in finding outbreaks of disease within a specified geographic region. The local Knox surveillance method is based on the Knox statistic, which is often used in epidemiology to test for space-time clustering retrospectively. In this method, a local Knox statistic is developed for use with the CUSUM chart for prospective monitoring so that epidemics can be detected more quickly. The design of the CUSUM chart used in this method is considered by determining the in-control average run length (ARL) performance for different space and time closeness thresholds as well as for different control limit values. The effect of nonuniform population density and region shape on the in-control ARL is explained, and some issues that should be considered when implementing this method are also discussed. In the wavelet-based prospective monitoring method, a surface of incidence counts is modeled over time in the geographical region of interest. This surface is modeled using Poisson regression where the regressors are wavelet functions from the Haar wavelet basis. The surface is estimated each time new incidence data is obtained, using both past and current observations and weighting current observations more heavily. The flexibility of this method allows for the detection of changes in the incidence surface, increases in the overall mean incidence count, and clusters of disease occurrences within individual areas of the region, through the use of control charts. This method is also able to incorporate information on population size and other covariates as they change in the geographical region over time. The control charts developed for use in this method are evaluated based on their in-control and out-of-control ARL performance, and recommendations on the most appropriate control chart to use for different monitoring scenarios are provided.
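The classical Knox statistic underlying the local Knox method simply counts case pairs that are close in both space and time. A minimal sketch follows; the thresholds and simulated cluster are illustrative assumptions, and the prospective local-Knox CUSUM developed in the dissertation is not implemented here.

```python
import numpy as np

def knox_statistic(coords, times, space_thresh, time_thresh):
    """Knox statistic: number of case pairs that are close in both space and time.

    coords : (n, 2) array of event locations, times : length-n array of event times,
    space_thresh / time_thresh : closeness thresholds.
    (This is the classical retrospective statistic; the local Knox statistic and
    its CUSUM monitoring are a prospective extension of this idea.)
    """
    n = len(times)
    d_space = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    d_time = np.abs(times[:, None] - times[None, :])
    close = (d_space <= space_thresh) & (d_time <= time_thresh)
    iu = np.triu_indices(n, k=1)          # count each unordered pair once
    return int(close[iu].sum())

# Illustration: random background cases plus a tight space-time cluster.
rng = np.random.default_rng(6)
coords = np.vstack([rng.uniform(0, 10, size=(80, 2)), rng.normal([5, 5], 0.2, size=(8, 2))])
times = np.concatenate([rng.uniform(0, 365, 80), rng.uniform(200, 205, 8)])
print(knox_statistic(coords, times, space_thresh=1.0, time_thresh=7.0))
```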
- Recommendations for Design Parameters for Central Composite Designs with Restricted Randomization. Wang, Li (Virginia Tech, 2006-08-15). In response surface methodology, the central composite design is the most popular choice for fitting a second order model. The choice of the distance for the axial runs, alpha, in a central composite design is very crucial to the performance of the design. In the literature, there are plenty of discussions and recommendations for the choice of alpha, among which a rotatable alpha and an orthogonal blocking alpha receive the greatest attention. Box and Hunter (1957) discuss and calculate the values of alpha that achieve rotatability, which is a way to stabilize the prediction variance of the design. They also give the values of alpha that make the design orthogonally blocked, where the estimates of the model coefficients remain the same even when block effects are added to the model. In the last ten years, people have begun to realize the importance of a split-plot structure in industrial experiments, and constructing response surface designs with a split-plot structure is now an active research area. In this dissertation, Box and Hunter's choice of alpha for rotatability and orthogonal blocking is extended to central composite designs with a split-plot structure. By assigning different values to the axial run distances of the whole-plot factors and the subplot factors, we propose two-strata rotatable split-plot central composite designs and orthogonally blocked split-plot central composite designs. Since the construction of the two-strata rotatable split-plot central composite design involves an unknown variance components ratio d, we further study the robustness of two-strata rotatability to d through simulation. Our goal is to provide practical recommendations for the value of the design parameter alpha based on the philosophy of traditional response surface methodology.
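For the completely randomized case, the classical rotatable axial distance is alpha = F^(1/4), with F the number of factorial points. The sketch below builds a coded central composite design using that default; the split-plot, two-strata rotatable designs proposed in the dissertation, which use separate whole-plot and subplot axial distances, are not reproduced here, and the run counts are assumed for illustration.

```python
import itertools
import numpy as np

def central_composite_design(k, alpha=None, n_center=4):
    """Coded design matrix for a (completely randomized) central composite design
    with k factors: a full 2^k factorial, 2k axial runs at distance alpha, and
    n_center center runs. By default alpha = (2^k)^(1/4), the classical rotatable
    choice of Box and Hunter (1957)."""
    factorial = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
    if alpha is None:
        alpha = len(factorial) ** 0.25          # rotatable axial distance
    axial = np.zeros((2 * k, k))
    for i in range(k):
        axial[2 * i, i] = -alpha
        axial[2 * i + 1, i] = alpha
    center = np.zeros((n_center, k))
    return np.vstack([factorial, axial, center]), alpha

design, alpha = central_composite_design(k=3)
print("rotatable alpha:", round(alpha, 3))      # (2^3)^(1/4) ~= 1.682
print(design.shape)                             # (8 + 6 + 4) runs by 3 factors
```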
- Semiparametric Techniques for Response Surface Methodology. Pickle, Stephanie M. (Virginia Tech, 2006-06-28). Many industrial statisticians employ the techniques of Response Surface Methodology (RSM) to study and optimize products and processes. A second-order Taylor series approximation is commonly utilized to model the data; however, parametric models are not always adequate. In these situations, any degree of model misspecification may result in serious bias of the estimated response. Nonparametric methods have been suggested as an alternative as they can capture structure in the data that a misspecified parametric model cannot. Yet nonparametric fits may be highly variable especially in small sample settings which are common in RSM. Therefore, semiparametric regression techniques are proposed for use in the RSM setting. These methods will be applied to an elementary RSM problem as well as the robust parameter design problem.
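One common semiparametric device in this literature combines a parametric second-order fit with a scaled nonparametric smooth of its residuals. The sketch below is a simplified, one-factor version of that idea using a Gaussian kernel smoother; the specific estimator, bandwidth selection, and mixing-parameter choice in the dissertation may differ, so the function name and tuning values here are assumptions for illustration.

```python
import numpy as np

def semiparametric_fit(x, y, bandwidth=0.5, lam=0.5):
    """Semiparametric (model-robust style) fit: a second-order parametric fit plus
    lam times a kernel smooth of its residuals. The bandwidth and mixing parameter
    lam are fixed here; in practice both would be chosen by a data-driven criterion."""
    X = np.column_stack([np.ones_like(x), x, x ** 2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta

    def predict(x_new):
        Xn = np.column_stack([np.ones_like(x_new), x_new, x_new ** 2])
        w = np.exp(-0.5 * ((x_new[:, None] - x[None, :]) / bandwidth) ** 2)
        smooth = (w * resid).sum(axis=1) / w.sum(axis=1)   # kernel smooth of residuals
        return Xn @ beta + lam * smooth

    return predict

# Illustration: a response that a pure quadratic slightly misspecifies.
rng = np.random.default_rng(7)
x = np.linspace(-2, 2, 40)
y = 1 + x - 0.5 * x ** 2 + 0.4 * np.sin(3 * x) + rng.normal(0, 0.2, 40)
print(np.round(semiparametric_fit(x, y)(np.array([-1.0, 0.0, 1.0])), 2))
```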
- Some Model-Based and Distance-Based Clustering Methods for Characterization of Regional Ecological Stressor-Response Patterns and Regional Environmental Quality Trends. Farrar, David B. (Virginia Tech, 2006-08-15). We develop statistical methods for evaluation of regional variation of ecological stressor-response relationships, and regional variation in temporal profiles of water quality, for application to data from monitoring stations on bodies of water. To evaluate regional variation in regression relationships, we use model-based clustering procedures with class-specific regression models. Units for clustering are taken to be basins, or combinations of basins and ecoregions. We rely on a Bayesian formulation and sample the posterior distribution using a Markov chain Monte Carlo algorithm. Two general approaches to the label-switching problem are considered, each leading to procedures that we apply in data analyses. Two applications are presented. We explore some relationships among priors with a Dirichlet distribution for class probabilities. We compare two rank-based criteria for grouping stations according to similarities in temporal profiles. The two criteria are illustrated in a hierarchical cluster analysis based on measurements of a water quality variable.
- Univariate and Multivariate Surveillance Methods for Detecting Increases in Incidence RatesJoner, Michael D. Jr. (Virginia Tech, 2007-03-30)It is often important to detect an increase in the frequency of some event. Particular attention is given to medical events such as mortality or the incidence of a given disease, infection or birth defect. Observations are regularly taken in which either an incidence occurs or one does not. This dissertation contains the result of an investigation of prospective monitoring techniques in two distinct surveillance situations. In the first situation, the observations are assumed to be the results of independent Bernoulli trials. Some have suggested adapting the scan statistic to monitor such rates and detect a rate increase as soon as possible after it occurs. Other methods could be used in prospective surveillance, such as the Bernoulli cumulative sum (CUSUM) technique. Issues involved in selecting parameters for the scan statistic and CUSUM methods are discussed, and a method for computing the expected number of observations needed for the scan statistic method to signal a rate increase is given. A comparison of these methods shows that the Bernoulli CUSUM method tends to be more effective in detecting increases in the rate. In the second situation, the incidence information is available at multiple locations. In this case the individual sites often report a count of incidences on a regularly scheduled basis. It is assumed that the counts are Poisson random variables which are independent over time, but the counts at any given time are possibly correlated between regions. Multivariate techniques have been suggested for this situation, but many of these approaches have shortcomings which have been demonstrated in the quality control literature. In an attempt to remedy some of these shortcomings, a new control chart is recommended based on a multivariate exponentially weighted moving average. The average run-length performance of this chart is compared with that of the existing methods.
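For the multivariate setting, the recommended chart builds on the multivariate exponentially weighted moving average (MEWMA); a generic MEWMA sketch is given below. It smooths the centered count vectors and monitors a T-squared-type statistic against its time-varying covariance. The in-control parameters, smoothing constant, and simulated counts are assumptions, and the control-limit calibration studied in the dissertation is not included.

```python
import numpy as np

def mewma(X, mu0, sigma, lam=0.2):
    """Multivariate EWMA (MEWMA) statistics for a sequence of observation vectors.

    X : (t, p) observations (e.g., regional incidence counts), mu0 : in-control mean
    vector, sigma : in-control covariance matrix, lam : smoothing constant.
    Returns the T^2-type statistic at each time point, to be compared with a control
    limit h chosen to give a desired in-control ARL (not derived here).
    """
    p = X.shape[1]
    z = np.zeros(p)
    stats = []
    for t, x in enumerate(X, start=1):
        z = lam * (x - mu0) + (1 - lam) * z
        cov_z = (lam / (2 - lam)) * (1 - (1 - lam) ** (2 * t)) * sigma
        stats.append(float(z @ np.linalg.solve(cov_z, z)))
    return np.array(stats)

# Illustration: Poisson-like counts for 3 regions, with a rate increase late in the series.
rng = np.random.default_rng(8)
counts = rng.poisson(5, size=(60, 3)).astype(float)
counts[45:] += rng.poisson(3, size=(15, 3))     # rate increase in all regions
stats = mewma(counts, mu0=np.full(3, 5.0), sigma=np.diag(np.full(3, 5.0)))
print(np.round(stats[40:50], 1))
```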