A Comparison of Principal Components from Real and Random Data

Date
1985
Publisher
Ecological Society of America
Abstract

We compared principal components derived from sets of real data with dimensions of 120 x 7, 120 x 4, 150 x 11, 150 x 8, 150 x 5, 454 x 12, 454 x 8, and 454 x 5 to those from sets of randomly generated data of corresponding size. Principal components from subsets of 25, 50, 75, and 100 observations from the 120- and 150-observation data sets and those from subsets of 25, 50, 75, 100, 150, 200, 300, and 400 observations from the 454-observation data sets were compared. Percent variance associated with components from real data was relatively constant over all sample sizes; percent variance decreased with larger samples of random data. A bootstrap method was used to develop standard error estimates for percent variance and percent of remaining variance associated with components from real data. Percent of remaining variance associated with the first four components from real data was significantly higher than that associated with analogous components from random data.
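The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure. Several choices are assumptions made only for illustration: components are taken from the correlation matrix, random data are drawn from a standard normal distribution, and "percent of remaining variance" is read as a component's variance expressed as a percentage of the variance left unexplained by the preceding components. The function names (`percent_variance`, `percent_remaining_variance`, `bootstrap_se`) are hypothetical.

```python
import numpy as np


def percent_variance(data):
    """Percent of total variance per principal component.

    Assumption: components come from the correlation matrix.
    """
    corr = np.corrcoef(data, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)[::-1]  # eigvalsh is ascending; reverse it
    return 100.0 * eigvals / eigvals.sum()


def percent_remaining_variance(pct):
    """Each component's variance as a percent of what earlier components left.

    Assumption: this is one plausible reading of "percent of remaining variance".
    """
    explained_before = np.concatenate(([0.0], np.cumsum(pct)[:-1]))
    return 100.0 * pct / (100.0 - explained_before)


def bootstrap_se(data, n_boot=200, rng=None):
    """Bootstrap standard error of percent variance, resampling rows."""
    rng = np.random.default_rng(rng)
    n = data.shape[0]
    reps = np.array([
        percent_variance(data[rng.integers(0, n, size=n)])
        for _ in range(n_boot)
    ])
    return reps.std(axis=0, ddof=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random data sized like one of the data sets in the abstract (120 x 7),
    # compared against smaller subsets of the same random matrix.
    random_data = rng.standard_normal((120, 7))
    for n in (25, 50, 75, 100, 120):
        pct = percent_variance(random_data[:n])
        print(n, np.round(pct[:4], 1))
    print("bootstrap SE:", np.round(bootstrap_se(random_data, rng=1), 2))
    print("remaining %:", np.round(
        percent_remaining_variance(percent_variance(random_data)), 1))
```

With uncorrelated random data, the leading eigenvalues tend to shrink toward equal shares as the number of observations grows, which is consistent with the decrease the abstract reports for random data.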

Keywords
bootstrap, confidence intervals, principal components analysis, random data, significance tests, standard error estimates
Citation
Dean F. Stauffer, Edward O. Garton, and R. Kirk Steinhorst. 1985. A Comparison of Principal Components from Real and Random Data. Ecology 66:1693-1698. http://dx.doi.org/10.2307/2937364