Browsing by Author "Smith, Eric P."
Now showing 1 - 20 of 170
- Abiotic and biotic factors influencing the decline of native unionid mussels in the Clinch River, Virginia. Yeager, Mary Melinda (Virginia Tech, 1994). Declining unionid populations in the Clinch River are of concern due to the high endemism in the diverse fauna of the Cumberlandian region. Increases in agricultural and mining activities, as well as in industry and urbanization, are coupled with unionid declines throughout the watershed. In many reaches of the Clinch River, mussel populations exist which fail to show recruitment, suggesting that recruitment is the weak link in the complex life cycle. Two possible factors which could endanger the sensitive juvenile stage are the presence of sediment toxicants or adult Corbicula fluminea in the depositional areas, the preferred habitat of the juveniles. Before investigating the impacts of these factors, it was necessary to characterize the relationship of the juveniles with the sediment they inhabit. Observations of feeding behavior using videotape, dye studies in a feeding chamber, and gut content analysis were used to determine mechanisms of feeding, the primary food source, and the origin of substances taken up by juveniles. Exposure to sediment came not only through direct contact, but also through filtration of interstitial water and sediment-associated fine particulate organic matter. Juveniles used pedal locomotory and pedal sweep feeding behaviors to facilitate movement of particles into the pedal gape. Intermittent sediment toxicity was found in laboratory bioassays using Daphnia magna and Chironomus riparius. These data, along with fluctuating metals in the Clinch River sediments, indicated that acute insults existed from which recovery would depend on the frequency, intensity and duration of the events. Field studies revealed that the intermittent toxicity is reflected in the community structure of benthic macroinvertebrates and impairs growth of juvenile unionids in in-situ studies.
The intermittent toxicity, which may be associated with rain events, impairs stream biota and may prevent recruitment of juvenile unionids. The presence of adult C. fluminea in sediments was found to decrease juvenile unionid growth and recovery from test sediments and to increase mortality and resuspension of juveniles into the water column. Both the presence of sediment-bound toxicants and C. fluminea may be contributing to unionid bivalve declines in the Clinch River, Virginia.
- An Alternative Estimate of Preferred Direction for Circular Data. Otieno, Bennett Sango (Virginia Tech, 2002-07-25). Circular or angular data occur in many fields of applied statistics. A common problem of interest in circular data is estimating a preferred direction and its corresponding distribution. This problem is complicated by the so-called wrap-around effect, which exists because there is no minimum or maximum on the circle. The usual statistics employed for linear data are inappropriate for directional data, as they do not account for its circular nature. Common choices for summarizing the preferred direction are the sample circular mean and the sample circular median. A circular analog of the Hodges-Lehmann estimator is proposed as an alternative estimate of preferred direction. The new measure of preferred direction is a robust compromise between the circular mean and the circular median. Theoretical results show that the new measure is asymptotically more efficient than the circular median and that its asymptotic efficiency relative to the circular mean is quite comparable. Descriptions of how to use the methods for constructing confidence intervals and testing hypotheses are provided. Simulation results demonstrate the relative strengths and weaknesses of the new approach for a variety of distributions.
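The two classical summaries this abstract compares, the circular mean (direction of the vector sum) and the circular median (arc-length minimizer), can be sketched in a few lines; this is illustrative code of my own, not the dissertation's, and the function names are mine:

```python
import math

def circular_mean(angles):
    """Circular mean: the direction of the resultant of unit vectors."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c) % (2 * math.pi)

def circular_median(angles):
    """Candidate-based circular median: the observation minimizing the
    sum of arc-length distances to all observations."""
    def arc(a, b):
        d = abs(a - b) % (2 * math.pi)
        return min(d, 2 * math.pi - d)
    return min(angles, key=lambda m: sum(arc(m, a) for a in angles))
```

For angles clustered near 0 that wrap past 2π, such as [0.1, 6.2, 0.3], both summaries stay near 0, whereas a naive linear average lands near π; that wrap-around failure of linear statistics is exactly the motivation the abstract describes.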
- The analysis of longitudinal ordinal data. Schabenberger, Oliver (Virginia Tech, 1995). Longitudinal data, in which a series of observations is collected on a typically large number of experimental units, is one of the most frequent and important sources of quantitative information in forestry. The dependencies among repeated observations for an experimental unit must be accounted for in order to validate statistical estimation and inference in modeling efforts. Recent advances in statistical theory for correlated data have created a body of theory that will become increasingly important as analysts realize the limitations of traditional methods that ignore these dependencies. Longitudinal data foster research questions that focus on the individual experimental unit rather than the population, as in classical cross-sectional data analysis. Mixed model techniques have emerged as powerful tools to address research problems of this kind and are treated extensively in this dissertation. In recent years, interest in modeling quantal responses, which take on only a countable, discrete number of possible values, has also increased throughout the discipline. The theory of generalized linear models provides the groundwork to embody quantal response models into the toolbox of applied analysts. The focus of this dissertation is to combine modern analytical tools for longitudinal data with regression methods for quantal responses. Special emphasis is placed on ordinal and binary data because of their prevalence in ecological, biological, and environmental statistics. The first chapters review the literature and introduce the necessary theory. The second part of this dissertation consists of a case study in which binary and ordinal fusiform rust response on loblolly and slash pine is modeled in a longitudinal data base provided by the East Texas Pine Plantation Research Project.
- Analysis of multispecies microcosm experiments. Mercante, Donald Eugene (Virginia Tech, 1990). Traditionally, single species toxicity tests have been the primary tool for assessment of hazard of toxic substances in aquatic ecosystems. These tests are inadequate for accurately reflecting the impact of toxicants on the community structure inherent in ecosystems. Multispecies microcosm experiments are gaining widespread acceptance as an important vehicle in understanding the nature and magnitude of effects for more complex systems. Microcosm experiments are complex and costly to conduct. Consequently, sample sizes are typically small (8-20 replicates). In addition, these experiments are difficult to analyze due to their multivariate and repeated measures nature. Working under the constraint of small sample sizes, we develop inferential as well as diagnostic methods that detect and measure community changes as a result of an intervention (i.e. toxicant), and assess the importance of individual species. A multi-factorial simulation analysis is used to compare several methods. The Multi-Response Permutation Procedure (MRPP) and a regression method incorporating a correlation structure are found to be the most powerful procedures for detecting treatment differences. The MRPP is particularly suited to experiments with replication and to cases where the response variable may not be normally distributed. The regression model for dissimilarity data has the advantage of enabling direct estimation of many parameters not possible with the MRPP, as well as of the magnitude of treatment effects. A stepwise dependent variable selection algorithm with a selection criterion based on a conditional p-value argument is proposed and applied to a real data set. It is seen to have advantages over other methods for assessing species importance.
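The MRPP mentioned above is, at its core, a permutation test on a weighted average of within-group pairwise distances; a minimal sketch follows (my own simplified group-size weighting and naming, not the dissertation's implementation):

```python
import itertools
import random

def mrpp(groups, dist, n_perm=999, seed=0):
    """Multi-Response Permutation Procedure (sketch).

    delta = weighted mean of average within-group pairwise distances;
    a small observed delta relative to permuted deltas indicates that
    the grouping (e.g. toxicant treatment) structures the data.
    """
    def delta(gs):
        total_n = sum(len(g) for g in gs)
        d = 0.0
        for g in gs:
            pairs = list(itertools.combinations(g, 2))
            if pairs:
                d += (len(g) / total_n) * sum(dist(a, b) for a, b in pairs) / len(pairs)
        return d

    observed = delta(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    rng = random.Random(seed)
    count = 1  # the observed arrangement counts toward the p-value
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm, i = [], 0
        for s in sizes:
            perm.append(pooled[i:i + s])
            i += s
        if delta(perm) <= observed:
            count += 1
    return observed, count / (n_perm + 1)
```

Because only group labels are permuted, no distributional assumption on the responses is needed, which matches the abstract's point that the MRPP suits non-normal responses and small replicated designs.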
- An analysis of palustrine forested wetland compensation effectiveness in Virginia. Atkinson, Robert B. (Virginia Tech, 1991-09-03). Plans to construct wetlands to replace wetland losses have become a common feature of permit requests. The purpose of this project is to suggest a methodology for quantifying the effectiveness of palustrine forested wetland construction in Virginia. Wetlands constructed by the Virginia Department of Transportation and the U.S. Army Corps of Engineers were surveyed, and the Wagner Road constructed wetland in Petersburg, Virginia was selected as the primary study site. Chapter One of the present study suggests a method for early assessment of revegetation success utilizing weighted averages of colonizing vegetation. A reference site in close proximity to the constructed site was chosen and used for comparison. Results from the Wagner Road site and the reference wetland indicated that colonizing vegetation weighted averages provide a more sensitive measure of revegetation success than the methods described in the federal wetland delineation manual.
- Analysis of Zero-Heavy Data Using a Mixture Model Approach. Wang, Shin Cheng (Virginia Tech, 1998-03-18). The problem of a high proportion of zeroes has long been of interest in data analysis and modeling; however, it has no unique solution. The appropriate solution depends on the particular situation and the design of the experiment. For example, different biological, chemical, or physical processes may follow different distributions and behave differently, and different mechanisms may generate the zeroes and require different modeling approaches, so no single general solution can be expected. In this dissertation, I focus on cases where zeroes are produced by mechanisms that create distinct sub-populations of zeroes. The dissertation is motivated by problems of chronic toxicity testing, whose data contain a high proportion of zeroes. The analysis of chronic test data is complicated because there are two different sources of zeroes in the data: mortality and non-reproduction. Researchers therefore have to separate zeroes due to mortality from those due to non-reproduction. The mixture model approach, which combines the two mechanisms in modeling the data, is appropriate here because it can incorporate the mortality type of extra zeroes. A zero-inflated Poisson (ZIP) model is used for modeling fecundity in the Ceriodaphnia dubia toxicity test. A generalized estimating equation (GEE) based ZIP model is developed to handle longitudinal data with zeroes due to mortality. A joint estimate of inhibition concentration (ICx) is also developed as a potency estimate based on the mixture model approach. The ZIP model is found to perform better than the regular Poisson model when mortality is high. This kind of toxicity testing also involves longitudinal data, with the same subject measured over a period of seven days.
The GEE model allows the flexibility to incorporate the extra zeroes and a correlation structure among the repeated measures. The problem of zero-heavy data also exists in environmental studies in which the growth or reproduction rates of multiple species are measured, giving rise to multivariate data. Since the inter-relationships between different species are embedded in the correlation structure, study of the information in the correlation of the variables, often accessed through principal component analysis, is one of the major interests in multivariate data. In cases where mortality influences the variables of interest but is not itself the subject of interest, the mixture approach can be applied to recover the information in the correlation structure. To investigate the effect of zeroes on multivariate data, simulation studies on principal component analysis are performed, and a method that recovers the information in the correlation structure is presented.
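The zero-inflated Poisson mechanism described in this abstract, structural zeroes (e.g. mortality) mixed with ordinary Poisson counts, can be sketched as follows; this is an illustrative sketch with my own function names, not the dissertation's code:

```python
import math
import random

def zip_pmf(k, pi, lam):
    """P(Y = k) under a zero-inflated Poisson: with probability pi the
    subject is a structural zero (e.g. dead, so it cannot reproduce);
    otherwise Y ~ Poisson(lam)."""
    pois = math.exp(-lam) * lam ** k / math.factorial(k)
    return pi * (k == 0) + (1 - pi) * pois

def zip_sample(pi, lam, n, seed=0):
    """Draw n ZIP observations via the two-stage mixture mechanism."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        if rng.random() < pi:
            out.append(0)  # structural zero
        else:
            # Poisson draw by inversion of the CDF
            u = rng.random()
            k, p = 0, math.exp(-lam)
            cdf = p
            while u > cdf:
                k += 1
                p *= lam / k
                cdf += p
            out.append(k)
    return out
```

Under this mechanism P(Y = 0) = π + (1 − π)e^{−λ}, which exceeds the plain Poisson zero probability; that excess is exactly the "extra zeroes" the regular Poisson model fails to capture when mortality is high.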
- Assessment of morphological and molecular genetic variation of freshwater mussel species belonging to the genera Fusconaia, Pleurobema, and Pleuronaia in the upper Tennessee River basin. Schilling, Daniel Edward (Virginia Tech, 2015-07-07). Select freshwater mussels in the genera Fusconaia, Pleurobema, and Pleuronaia were collected primarily in the upper Tennessee River basin from 2012 to 2014 for phylogenetic and morphological assessments. Freshwater mussels in these genera are similar in appearance, hence the need for phylogenetic verification and morphological assessment. Phylogenetic analyses of the mitochondrial gene ND1 and the nuclear gene ITS1 revealed three unrecognized, phylogenetically distinct species. These species were separated from their closest congener by 2.85%, 3.17%, and 6.32% based on pairwise genetic distances of ND1. Gaps created from aligning ITS1 sequences were coded as fifth characters, which phylogenetically separated most closely related species. Analyses of ND1 agreed with previous literature on the phylogenetic distinctiveness of Pleuronaia species, with the exception of the DNA sequences of P. gibberum, which grouped outside this genus based on the analyses conducted in this study. Morphological variation was recorded for eight of the species to include quantitative and qualitative characters as well as geometric morphometric analyses. Three decision trees were created from quantitative and qualitative characters using classification and regression tree analyses. The best-performing tree used quantitative and qualitative characters describing shell-only scenarios and obtained 80.6% correct classification on terminal nodes. Canonical variates analysis on geometric morphometric shell data revealed large morphological overlap between species.
Goodall's F-tests between pairs of species revealed significant differences (α = 0.05) between all but one species pair; however, examination of landmarks on shells concluded large overlap of landmarks between species pairs. The lack of morphologically distinct characters to readily identify these phylogenetically distinct species indicates large morphological overlap among them. Biologists need to be cognizant that morphologically cryptic species may exist in systems often explored. Three dichotomous keys were created from classification trees to identify select individuals in the genera Fusconaia, Pleurobema, and Pleuronaia; two of these keys, one for shells and one for live mussels, were tested by participants with varying mussel identification skills to represent novices and experts. Both keys used continuous (quantitative) and categorical variables to guide participants to identifications. Novices, who had no prior mussel identification experience, correctly identified mussels with 50% accuracy using the shell key and with 51% accuracy using the live key. Experts, who had at least three years of experience identifying mussels, correctly identified mussels with 58% accuracy using the shell key and with 68% accuracy using the live key; however, one expert noted not having used the live key to correctly identify one mussel. Morphological overlap of variables between mussels likely resulted in failure to consistently identify mussels correctly. Important management decisions and project implementations require accurate assessment of species' localities and populations. Incorrect species identification could hinder species' recovery efforts or halt projects that otherwise could have continued. If a mussel collection is thought to be a new record or could affect a project, I recommend that molecular genetic identification be used to verify the species identity.
- Asymptotic Results for Model Robust Regression. Starnes, Brett Alden (Virginia Tech, 1999-12-14). Since the mid-1980s, many statisticians have studied methods for combining parametric and nonparametric estimates to improve the quality of fits in a regression problem. Notably, in 1987 Einsporn and Birch proposed the Model Robust Regression estimate (MRR1), in which estimates of the parametric function, ƒ, and the nonparametric function, 𝑔, were combined in a straightforward fashion via the use of a mixing parameter, λ. This technique was studied extensively with small samples and was shown to be quite effective at modeling various unusual functions. In 1995, Mays and Birch developed the MRR2 estimate as an alternative to MRR1. This model involved first forming the parametric fit to the data, and then adding in an estimate of 𝑔 according to the lack of fit demonstrated by the error terms. Using small samples, they illustrated the superiority of MRR2 to MRR1 in most situations. In this dissertation we have developed asymptotic convergence rates for both MRR1 and MRR2 in OLS and GLS (maximum likelihood) settings. In many of these settings, it is demonstrated that the user of MRR1 or MRR2 achieves the best convergence rates available regardless of whether or not the model is properly specified. This is the "Golden Result of Model Robust Regression". It turns out that the selection of the mixing parameter is paramount in determining whether or not this result is attained.
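The MRR1 idea of blending a parametric and a nonparametric fit via λ can be sketched as follows, assuming an OLS line for ƒ and a Nadaraya-Watson kernel smoother for 𝑔; this illustrates the general mixing idea under those assumptions, not Einsporn and Birch's exact estimator:

```python
import numpy as np

def mrr1_fit(x, y, lam, h):
    """Model-robust regression sketch: convex combination of a parametric
    OLS line and a Gaussian-kernel Nadaraya-Watson smoother via lambda.
    lam = 0 trusts the parametric model; lam = 1 trusts the smoother."""
    # Parametric piece: simple linear model via least squares
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    y_par = X @ beta
    # Nonparametric piece: kernel-weighted local average at each x_i
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    y_npar = (w @ y) / w.sum(axis=1)
    return (1 - lam) * y_par + lam * y_npar
```

When the parametric model is correct, small λ inherits the parametric convergence rate; when it is misspecified, larger λ lets the smoother pick up the lack of fit, which is the trade-off the "Golden Result" formalizes.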
- Bacteria Total Maximum Daily Load Issues: Report of the Bacteria TMDL Subcommittee of the Water Quality Academic Advisory Committee. Dillaha, Theo A. III; Hershner, Carl H.; Kator, Howard I.; Mostaghimi, Saied; Shabman, Leonard A.; Smith, Eric P.; Younos, Tamim M.; Zipper, Carl E. (Virginia Water Resources Research Center, 2002-10)
- Bald eagle distribution, abundance, roost use and response to human activity on the northern Chesapeake Bay, Maryland. Buehler, David A. (Virginia Tech, 1990-01-11). I studied bald eagle (Haliaeetus leucocephalus) distribution, abundance, roost use and response to human activity on the northern Chesapeake Bay from 1984-89. The eagle population consisted of Chesapeake breeding eagles, Chesapeake nonbreeding eagles, northern-origin eagles and southern-origin eagles; changes in overall eagle distribution and abundance reflected the net changes in these 4 groups. Breeding territories on the northern Chesapeake increased from 12 to 28 from 1984 to 1988. Breeding eagles were resident all year, always within 7 km of the nest. Chesapeake nonbreeding eagles moved throughout most of the bay, but rarely left it (at most 5% of the radio-tagged eagles were off the bay during any month). Northern eagles migrated into the bay in late fall (x̄ = 21 December, n = 7, range = 61 days) and departed in early spring (x̄ = 27 March, n = 14, range = 43 days). Southern eagles arrived on the northern bay throughout April-August (x̄ = 6 June, n = 11, range = 94 days) and departed from June-October (x̄ = 3 September, n = 22, range = 119 days). Northern Chesapeake eagle abundance peaked twice annually: in winter (261 eagles, December 1987), driven by the presence of northern eagles, and in summer (604 eagles, August 1988), driven by the presence of southern birds. Of 1,117 radio-tagged eagle locations, only 55 (4.9%) occurred in human-developed habitat, which composed 27.7% of 1,442 km² of potential eagle habitat on the northern Chesapeake Bay (P < 0.001). During 36 aerial shoreline surveys, eagles were observed on only 111 of 700 (15.9%) 250-m shoreline segments that had development within 100 m, whereas eagles were observed on 312 of 859 (36.3%) segments when development was absent (P < 0.001).
On average, eagles were observed on 1.0 segment/survey that had coincident pedestrian use within 500 m, compared to 3.6 segments/survey expected if eagles and pedestrians were distributed along the shoreline independently (n = 34 surveys, P < 0.001).
- Bandwidth Selection Concerns for Jump Point Discontinuity Preservation in the Regression Setting Using M-smoothers and the Extension to Hypothesis Testing. Burt, David Allan (Virginia Tech, 2000-03-23). Most traditional parametric and nonparametric regression methods operate under the assumption that the true function is continuous over the design space. For methods such as ordinary least squares polynomial regression and local polynomial regression, the functional estimates are constrained to be continuous. Fitting a function that is not continuous with a continuous estimate has practical scientific implications as well as important model misspecification effects. Scientifically, breaks in the continuity of the underlying mean function may correspond to specific physical phenomena that will be hidden from the researcher by a continuous regression estimate. Statistically, misspecifying a mean function as continuous when it is not will result in an increased bias in the estimate. One recently developed nonparametric regression technique that does not constrain the fit to be continuous is the jump-preserving M-smooth procedure of Chu, Glad, Godtliebsen & Marron (1998), 'Edge-preserving smoothers for image processing', Journal of the American Statistical Association 93(442), 526-541. Chu et al.'s (1998) M-smoother is defined in such a way that the noise about the mean function is smoothed out while jumps in the mean function are preserved. Before the jump-preserving M-smoother can be used in practice, the choice of the bandwidth parameters must be addressed. The jump-preserving M-smoother requires two bandwidth parameters, h and g. These two parameters determine the amount of noise that is smoothed out as well as the size of the jumps which are preserved. If these parameters are chosen haphazardly, the resulting fit could exhibit worse bias properties than traditional regression methods which assume a continuous mean function.
Currently there are no automatic bandwidth selection procedures available for the jump-preserving M-smoother of Chu et al. (1998). One of the main objectives of this dissertation is to develop an automatic, data-driven bandwidth selection procedure for Chu et al.'s (1998) M-smoother. We present two bandwidth selection procedures: the first is a crude rule-of-thumb method and the second is a more sophisticated direct plug-in method. Our bandwidth selection procedures are modeled after the methods of Chu et al. (1998) with two significant modifications which make the methods robust to possible jump points. Another objective of this dissertation is to provide a nonparametric hypothesis test, based on Chu et al.'s (1998) M-smoother, for a break in the continuity of an underlying regression mean function. Our proposed hypothesis test is nonparametric in the sense that the mean function away from the jump point(s) is not required to follow a specific parametric model. In addition, the test does not require the user to specify the number, position, or size of the jump points in the alternative hypothesis, as many current methods do. Thus the null and alternative hypotheses for our test are: H0: the mean function is continuous (i.e. no jump points) vs. HA: the mean function is not continuous (i.e. there is at least one jump point). Our testing procedure takes the form of a critical bandwidth hypothesis test. The test statistic is essentially the largest bandwidth that allows Chu et al.'s (1998) M-smoother to satisfy the null hypothesis. The significance of the test is then calculated via a bootstrap method. This test is currently in the experimental stage of its development. In this dissertation we outline the steps required to calculate the test as well as assess its power based on a small simulation study. Future work, such as a faster calculation algorithm, is required before the testing procedure will be practical for the general user.
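The role of the two bandwidths h and g can be illustrated with a bilateral-weighting sketch: h controls smoothing over the design space, while g controls how large a response difference is still averaged over rather than treated as a jump. This is a simplified illustration in the spirit of the edge-preserving M-smoother, not Chu et al.'s exact estimator, and the names are mine:

```python
import math

def m_smooth(x, y, h, g):
    """Jump-preserving smoother sketch: Gaussian weights both in the
    design space (bandwidth h) and in the response space (bandwidth g),
    so observations on the far side of a jump receive negligible weight
    and the jump is not smeared out."""
    out = []
    for xi, yi in zip(x, y):
        num = den = 0.0
        for xj, yj in zip(x, y):
            w = math.exp(-0.5 * ((xj - xi) / h) ** 2) * \
                math.exp(-0.5 * ((yj - yi) / g) ** 2)
            num += w * yj
            den += w
        out.append(num / den)
    return out
```

On a step function, a kernel smoother using h alone would blur the step over a width of order h, whereas the response-space weight suppresses cross-jump averaging whenever the jump size is large relative to g, which is why haphazard choices of g (too large) can destroy exactly the discontinuity the method is meant to preserve.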
- Banker needs for accounting information. Calderon, Thomas G. (Virginia Polytechnic Institute and State University, 1987). This research examines the extent to which user needs are affected by differences in the size and ownership characteristics of reporting entities. Bank loan officers constitute the target group of financial statement users and the study focuses on the perceived need for sixteen financial statement items. Among these are twelve items for which differentiation in financial reporting has been proposed (key items), and four items that bankers generally require when evaluating a loan application (control items). The research model is based on the hypothesis that perceptions of accounting information are affected by the decision context, complexity of the organization in which the decision is being made, and the behavior response repertoire of the user. A quasi-experimental design with two treatments is utilized. The treatments are (1) a commercial loan decision involving a small privately held corporation, and (2) a commercial loan decision involving a large public corporation. A questionnaire was mailed to gather the data. Three hundred and fifteen usable responses were received, for a response rate of 21%. The data were analyzed using multivariate analysis of variance and canonical correlation analysis. Differences in the size and ownership characteristics of commercial loan applicants were found to have a statistically significant impact on the perceived needs of bankers for financial statement information. This relationship is most observable among disclosures that are perceived to be of lesser importance in the loan evaluation process. The perceived needs for items that are considered to be of greater importance (for example, the control items) are relatively insensitive to variations in the size and ownership characteristics of commercial loan applicants.
Overall, commercial loan officers tend to perceive a relatively high need for general financial statement items, but tend to downplay the importance of the more specific and detailed items. The results also indicate that the organizational complexity of a bank, and the degree to which its commercial loan officers are committed to the work ethic of the banking profession, are significantly related to the perceived need for financial statement disclosures.
- Bayesian Analysis of Temporal and Spatio-temporal Multivariate Environmental Data. El Khouly, Mohamed Ibrahim (Virginia Tech, 2019-05-09). High-dimensional space-time datasets are available nowadays in various aspects of life such as the economy, agriculture, health, and the environment. Meanwhile, it is challenging to reveal possible connections between climate change and extreme weather events such as hurricanes or tornadoes. In particular, the relationship between tornado occurrence and climate change has remained elusive. Moreover, modeling multivariate spatio-temporal data is computationally expensive. There is a great need for computationally feasible models that account for temporal, spatial, and inter-variable dependence. Our research focuses on those areas in two ways. First, we investigate connections between changes in tornado risk and the increase in atmospheric instability over Oklahoma. Second, we propose two multiscale spatio-temporal models, one for multivariate Gaussian data and the other for matrix-variate Gaussian data. These frameworks are novel additions to the existing literature on Bayesian multiscale models. In addition, we propose parallelizable MCMC algorithms to sample from the posterior distributions of the model parameters with improved computational efficiency.
- Bayesian Approach Dealing with Mixture Model Problems. Zhang, Huaiye (Virginia Tech, 2012-04-23). In this dissertation, we focus on two research topics related to mixture models. The first topic is Adaptive Rejection Metropolis Simulated Annealing for Detecting Global Maximum Regions, and the second topic is Bayesian Model Selection for Nonlinear Mixed Effects Model. In the first topic, we consider a finite mixture model, which is used to fit data from heterogeneous populations in many applications. The Expectation Maximization (EM) algorithm and Markov Chain Monte Carlo (MCMC) are two popular methods for estimating parameters in a finite mixture model. However, both methods may converge to local maximum regions rather than the global maximum when multiple local maxima exist. In this dissertation, we propose a new approach, Adaptive Rejection Metropolis Simulated Annealing (ARMS annealing), to improve the EM algorithm and MCMC methods. Combining simulated annealing (SA) and adaptive rejection Metropolis sampling (ARMS), ARMS annealing generates a set of proper starting points which help to reach all possible modes. ARMS uses a piecewise linear envelope function as a proposal distribution; under the SA framework, starting from a set of proposal distributions constructed by ARMS, the method finds a set of proper starting points which help to detect separate modes. By combining ARMS annealing with the EM algorithm and with the Bayesian approach, respectively, we propose two approaches: an EM ARMS annealing algorithm and a Bayesian ARMS annealing approach. EM ARMS annealing implements the EM algorithm using the starting points proposed by ARMS annealing, and ARMS annealing likewise helps MCMC approaches determine starting points. Both approaches capture the global maximum region and estimate the parameters accurately.
An illustrative example uses survey data on the number of charitable donations. The second topic is related to the nonlinear mixed effects model (NLME). Typically, a parametric NLME model requires strong assumptions which make the model less flexible and often are not satisfied in real applications. To allow the NLME model more flexible assumptions, we present three semiparametric Bayesian NLME models, constructed with Dirichlet process (DP) priors. A Dirichlet process model is often viewed as an infinite mixture model. We propose a unified approach, the penalized posterior Bayes factor, for the purpose of model comparison. Using simulation studies, we compare the performance of two of the three semiparametric hierarchical Bayesian approaches with that of the parametric Bayesian approach. Simulation results suggest that our penalized posterior Bayes factor is a robust method for comparing hierarchical parametric and semiparametric models. An application to gastric emptying studies is used to demonstrate the advantage of our estimation and evaluation approaches.
- Bayesian D-Optimal Design for Generalized Linear Models. Zhang, Ying (Virginia Tech, 2006-12-07). Bayesian optimal designs have received increasing attention in recent years, especially in biomedical and clinical trials. Bayesian design procedures can utilize the available prior information about the unknown parameters so that a better design can be achieved. However, a difficulty in dealing with Bayesian design is the lack of efficient computational methods. In this research, a hybrid computational method, which combines a rough global optimum search with a more precise local optimum search, is proposed to efficiently search for Bayesian D-optimal designs for multi-variable generalized linear models. In particular, Poisson regression models and logistic regression models are investigated. Designs are examined for a range of prior distributions, and the equivalence theorem is used to verify design optimality. Design efficiency for various models is examined and compared with non-Bayesian designs. Bayesian D-optimal designs are found to be more efficient and robust than non-Bayesian D-optimal designs. Furthermore, the idea of the Bayesian sequential design is introduced and the Bayesian two-stage D-optimal design approach is developed for generalized linear models. With the incorporation of the first-stage data information into the second stage, the two-stage design procedure can improve design efficiency and produce more accurate and robust designs. The Bayesian two-stage D-optimal designs for Poisson and logistic regression models are evaluated based on simulation studies. The Bayesian two-stage optimal design approach is superior to the one-stage approach in terms of a design efficiency criterion.
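The Bayesian D-criterion behind such designs, the prior-expected log-determinant of the Fisher information, can be sketched for a one-covariate logistic model, approximating the prior expectation by Monte Carlo draws. This is an illustrative evaluation of the criterion only (no design search), with my own function names:

```python
import numpy as np

def bayes_d_criterion(design, beta_draws):
    """Bayesian D-criterion sketch for logit(p) = b0 + b1*x.

    For logistic regression the Fisher information at design points x_i is
    I(beta) = sum_i w_i z_i z_i^T with z_i = (1, x_i), w_i = p_i (1 - p_i).
    The criterion averages log det I(beta) over prior draws of beta."""
    design = np.asarray(design, dtype=float)
    Z = np.column_stack([np.ones_like(design), design])
    vals = []
    for b in beta_draws:
        p = 1.0 / (1.0 + np.exp(-(Z @ b)))
        w = p * (1 - p)
        info = Z.T @ (Z * w[:, None])
        vals.append(np.linalg.slogdet(info)[1])
    return float(np.mean(vals))
```

Comparing candidate designs under this criterion shows why spread-out design points dominate tightly clustered ones: clustered points make the information matrix nearly singular for every plausible parameter draw, so the expected log-determinant collapses.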
- Bayesian hierarchical approaches to analyze spatiotemporal dynamics of fish populations. Bi, Rujia (Virginia Tech, 2020-09-03). The study of spatiotemporal dynamics of fish populations is important for both stock assessment and fishery management. I explored the impacts of environmental and anthropogenic factors on spatiotemporal patterns of fish populations, and contributed to stock assessment and management by incorporating the inherent spatial structure. Hierarchical models were developed to specify spatial and temporal variations, and Bayesian methods were adopted to fit the models. Yellow perch (Perca flavescens) supports one of the most important commercial and recreational fisheries in Lake Erie, which is currently managed using four management units (MUs), with each assessed by a spatially-independent stock-specific assessment model. The current spatially-independent stock-specific assessment assumes that movement of yellow perch among MUs in Lake Erie is statistically negligible and biologically insignificant. I investigated whether this assumption is violated and the effect it has on assessment. I first explored the spatiotemporal patterns of yellow perch abundance in Lake Erie based on data from a 27-year gillnet survey, and analyzed the impacts of environmental factors on spatiotemporal dynamics of the population. I found that the yellow perch relative biomass index displayed clear temporal variation and spatial heterogeneity; however, the two middle MUs displayed spatial similarities. I then developed a state-space model based on 7 years of tag-recovery data to explore movements of yellow perch among MUs, and performed a simulation analysis to evaluate the impacts of sample size on movement estimates. The results suggested substantial movement between the two stocks in the central basin, and the accuracy and precision of movement estimates increased with increasing sample size.
These results demonstrate that the assumption of no movement among MUs is violated, and it is necessary to incorporate regional connectivity into stock assessment. I thus developed a tag-integrated multi-region model that incorporates movement into a spatial stock assessment by integrating the tag-recovery data with 45 years of fisheries data. I then compared population projections, such as recruitment and abundance, derived from the tag-integrated multi-region model and from the current spatially-independent stock-specific assessment model, to assess the influence of including or excluding movement among MUs. Differences between the population projections from the two models suggested that integrating regional stock dynamics has a significant influence on stock estimates. American Shad (Alosa sapidissima), Hickory Shad (A. mediocris) and the river herrings, Alewife (A. pseudoharengus) and Blueback Herring (A. aestivalis), are anadromous pelagic fishes that spend most of the annual cycle at sea and enter coastal rivers in spring to spawn. Alosa fisheries were once among the most valuable along the Atlantic coast, but have declined in recent decades due to pollution, overfishing and dam construction. Management actions have been implemented to restore the populations, and stocks in different river systems have displayed different recovery trends. I developed a Bayesian hierarchical spatiotemporal model to identify the population trends of these species among rivers in the Chesapeake Bay basin and to identify environmental and anthropogenic factors influencing their distribution and abundance. The results demonstrated river-specific heterogeneity in the spatiotemporal dynamics of these species and indicated river-specific impacts of multiple factors, including water temperature, river flow, chlorophyll a concentration and total phosphorus concentration, on their population dynamics.
Given the importance of these two case studies, analyses that diagnose the factors influencing population dynamics, and models that account for spatial complexity, are highly valuable to practical fisheries management. Models incorporating spatiotemporal variation describe population dynamics more accurately, improve the accuracy of stock assessments, and provide better recommendations for management.
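The partial pooling that a Bayesian hierarchical model provides across rivers can be illustrated with a minimal normal-normal shrinkage sketch. All numbers here are hypothetical, and the actual dissertation models are far richer; this only shows the mechanism by which river-specific estimates borrow strength from the basin-wide mean:

```python
import numpy as np

# hypothetical per-river mean log-abundance indices (illustrative only)
y = np.array([2.1, 0.4, 1.3, 3.0])   # observed river-level means
s2 = 0.5                              # assumed within-river sampling variance

# moment estimates of the hyperparameters (overall mean, between-river variance)
mu = y.mean()
tau2 = max(y.var(ddof=1) - s2, 1e-6)

# posterior means shrink each river's estimate toward the overall mean mu
shrunk = (tau2 * y + s2 * mu) / (tau2 + s2)
```

Each shrunken estimate lies between the raw river mean and the overall mean, and the overall mean itself is preserved, which is the hallmark of hierarchical pooling.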
- Bayesian Hierarchical Methods and the Use of Ecological Thresholds and Changepoints for Habitat Selection Models. Pooler, Penelope S. (Virginia Tech, 2005-12-02). Modeling the complex relationships between habitat characteristics and a species' habitat preferences poses many difficult problems for ecological researchers. These problems are complicated further when information is collected over a range of time or space. Additionally, the variety of factors affecting these choices is difficult to understand, and accurate information about them is even more difficult to collect. In light of these concerns, we evaluate the performance of current standard habitat preference models based on Bayesian methods and then present some extensions and supplements to those methods that prove very useful. More specifically, we demonstrate the value of extending the standard Bayesian hierarchical model using finite mixture model methods. Additionally, we demonstrate that an extension of the Bayesian hierarchical changepoint model to estimate multiple changepoints simultaneously can be very informative when applied to data about multiple habitat locations or species. These models allow the researcher to compare sites or species with respect to a very specific ecological question and consequently provide definitive answers that are often not available with more commonly used models containing many explanatory factors. Throughout our work we use a complex data set containing information about horseshoe crab spawning habitat preferences in the Delaware Bay over a five-year period. These data epitomize some of the difficult issues inherent to studying habitat preferences. The data are collected over time at many sites, have missing observations, and include explanatory variables that, at best, only provide surrogate information for what researchers feel is important in explaining spawning preferences throughout the bay.
We also examined a smaller data set on freshwater mussel habitat selection in relation to bridge construction on the Kennerdell River in western Pennsylvania. Together, these two data sets provided us with insight into developing and refining the methods we present. They also help illustrate the strengths and weaknesses of the methods we discuss by assessing their performance in real situations where data are inevitably complex and relationships are difficult to discern.
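The single-changepoint building block underlying such models can be sketched by enumerating candidate split points and weighting them by likelihood. This is a deliberately simplified version, with a known noise standard deviation, a flat prior on the changepoint, and profiled segment means, not the dissertation's hierarchical multiple-changepoint model:

```python
import numpy as np

def changepoint_posterior(y, sigma=1.0):
    """Posterior over a single changepoint k (mean shifts after index k),
    assuming known noise sd, a flat prior on k, and profiled segment means."""
    n = len(y)
    loglik = np.empty(n - 1)
    for k in range(1, n):                      # split into y[:k] and y[k:]
        a, b = y[:k], y[k:]
        rss = ((a - a.mean())**2).sum() + ((b - b.mean())**2).sum()
        loglik[k - 1] = -rss / (2 * sigma**2)
    w = np.exp(loglik - loglik.max())          # stabilized exponentiation
    return w / w.sum()                         # probabilities for k = 1..n-1

# simulated series with a true mean shift after observation 40
rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(0, 1, 40), rng.normal(3, 1, 40)])
post = changepoint_posterior(y)
k_hat = int(np.argmax(post)) + 1               # most probable changepoint
```

Extending this to several series or species, with changepoints tied together hierarchically, gives the flavor of the simultaneous multiple-changepoint comparisons the abstract describes.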
- Bayesian hierarchical modelling of dual response surfaces. Chen, Younan (Virginia Tech, 2005-11-29). Dual response surface methodology (Vining and Myers (1990)) has been successfully used as a cost-effective approach to improving the quality of products and processes since Taguchi (Taguchi (1985)) introduced the idea of robust parameter design for quality improvement in the United States in the mid-1980s. The original procedure is to use the mean and the standard deviation of the characteristic to form a dual response system in a linear model structure, and to estimate the model coefficients using least squares methods. In this dissertation, a Bayesian hierarchical approach is proposed to model the dual response system so that the inherent hierarchical variance structure of the response can be modeled naturally. The Bayesian model is developed for both univariate and multivariate dual response surfaces, and for both fully replicated and partially replicated dual response surface designs. To evaluate its performance, the Bayesian method has been compared with the original method under a wide range of scenarios, and it shows higher efficiency and more robustness. In applications, the Bayesian approach retains all the advantages provided by the original dual response surface modelling method. Moreover, the Bayesian analysis allows inference on the uncertainty of the model parameters, and thus can give practitioners complete information on the distribution of the characteristic of interest.
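The original (non-Bayesian) dual response setup, separate linear models for the per-run mean and log standard deviation fitted by least squares, can be sketched as follows. The replicated 2x2 design and response values are made up for illustration:

```python
import numpy as np

# hypothetical replicated two-factor design: columns are intercept, x1, x2
X = np.array([[1.0, -1.0, -1.0],
              [1.0, -1.0,  1.0],
              [1.0,  1.0, -1.0],
              [1.0,  1.0,  1.0]])

# three replicate observations of the quality characteristic at each run
reps = np.array([[ 7.1,  7.4,  6.9],
                 [ 9.8, 10.3, 10.1],
                 [ 6.2,  5.5,  6.0],
                 [11.9, 12.6, 12.2]])

ybar = reps.mean(axis=1)                    # per-run sample means
logs = np.log(reps.std(axis=1, ddof=1))     # per-run log sample std devs

# fit the mean-response model and the dispersion model by least squares
beta_mean, *_ = np.linalg.lstsq(X, ybar, rcond=None)
beta_sd, *_ = np.linalg.lstsq(X, logs, rcond=None)
```

Because the design columns are orthogonal, the fitted intercept of the mean model equals the grand mean of the run averages. The Bayesian hierarchical approach replaces these two separate least squares fits with a joint model of the mean and variance structure.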
- Bayesian Methodology for Missing Data, Model Selection and Hierarchical Spatial Models with Application to Ecological Data. Boone, Edward L. (Virginia Tech, 2003-01-31). Ecological data are often fraught with problems such as missing data and spatial correlation. In this dissertation we use a data set collected by the Ohio EPA, concerning the benthic health of Ohio's waterways, as motivation for studying techniques to address these problems. A new method for incorporating covariate structure and missing data mechanisms into missing data analysis is considered. This method allows us to detect relationships that other popular methods do not. We then further extend this method to model selection. In the special case where the unobserved covariates are assumed normally distributed, we use the Bayesian Model Averaging method to average the models, select the highest probability model and perform variable assessment. Accuracy in calculating the posterior model probabilities using the Laplace approximation and an approximation based on the Bayesian Information Criterion (BIC) is explored. It is shown by simulation that the Laplace approximation is superior to the BIC-based approximation. Finally, Hierarchical Spatial Linear Models are considered for the data, and we show how to combine analyses that have spatial correlation within and between clusters.
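The BIC-based approximation to posterior model probabilities mentioned here amounts to exponentiating minus half the BIC differences and normalizing. A minimal sketch, with made-up log-likelihoods and parameter counts for three candidate models:

```python
import numpy as np

def bic_weights(logliks, ks, n):
    """Approximate posterior model probabilities from BIC,
    assuming equal prior probability for each model."""
    bic = -2.0 * np.asarray(logliks) + np.asarray(ks) * np.log(n)
    w = np.exp(-0.5 * (bic - bic.min()))   # stabilized relative weights
    return w / w.sum()

# three hypothetical models: maximized log-likelihoods, parameter counts, n = 50
probs = bic_weights([-120.0, -112.0, -111.5], [2, 3, 4], n=50)
```

The middle model wins here: its likelihood gain over the smallest model outweighs its extra parameter, while the largest model's tiny likelihood gain does not justify a fourth parameter. The Laplace approximation refines these weights (and, as the dissertation shows, is the more accurate of the two).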
- Bayesian Model Averaging and Variable Selection in Multivariate Ecological Models. Lipkovich, Ilya A. (Virginia Tech, 2002-04-09). Bayesian Model Averaging (BMA) is a new area in modern applied statistics that provides data analysts with an efficient tool for discovering promising models and obtaining estimates of their posterior probabilities via Markov chain Monte Carlo (MCMC). These probabilities can be further used as weights for model-averaged predictions and estimates of the parameters of interest. As a result, variance components due to model selection are estimated and accounted for, contrary to the practice of conventional data analysis (such as, for example, stepwise model selection). In addition, variable activation probabilities can be obtained for each variable of interest. This dissertation is aimed at connecting BMA and various ramifications of the multivariate technique called Reduced-Rank Regression (RRR). In particular, we are concerned with Canonical Correspondence Analysis (CCA) in ecological applications where the data are represented by a site-by-species abundance matrix with site-specific covariates. Our goal is to incorporate multivariate techniques, such as Redundancy Analysis and Canonical Correspondence Analysis, into the general machinery of BMA, taking into account such complicating phenomena as outliers and clustering of observations within a single data-analysis strategy. Traditional implementations of model averaging are concerned with selection of variables. We extend the methodology of BMA to selection of subgroups of observations and implement several approaches to cluster and outlier analysis in the context of the multivariate regression model. The proposed cluster analysis algorithm can accommodate restrictions on the resulting partition of observations when some of them form sub-clusters that have to be preserved when larger clusters are formed.
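The variable activation probabilities described above follow directly from posterior model probabilities: each variable's activation probability is the total probability of the models that include it. A toy example with four hypothetical models over three variables:

```python
import numpy as np

# hypothetical posterior model probabilities (e.g., estimated via MCMC)
model_probs = np.array([0.50, 0.30, 0.15, 0.05])

# inclusion indicators: row = model, column = variable (x1, x2, x3)
inclusion = np.array([[1, 1, 0],    # model 1 uses x1, x2
                      [1, 0, 1],    # model 2 uses x1, x3
                      [1, 1, 1],    # model 3 uses all three
                      [0, 1, 0]])   # model 4 uses x2 only

# activation probability of each variable: probability-weighted inclusion
activation = model_probs @ inclusion   # -> [0.95, 0.70, 0.45]
```

The same weighting scheme gives model-averaged predictions: multiply each model's prediction by its posterior probability and sum, so that model-selection uncertainty is carried into the final estimates rather than discarded.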