Seven methods of handling missing data using samples from a national data base

dc.contributor.author[en]: Witta, Eleanor Lea
dc.contributor.committeechair[en]: Kaiser, J.
dc.contributor.committeemember[en]: Fortune, Jimmie C.
dc.contributor.committeemember[en]: Hereford, Karl T.
dc.contributor.committeemember[en]: Keith, Timothy Z.
dc.contributor.committeemember[en]: Strickland, Deborah C.
dc.contributor.department[en]: Educational Research and Evaluation
dc.date.accessioned[en]: 2014-03-14T21:14:21Z
dc.date.adate[en]: 2008-06-06
dc.date.available[en]: 2014-03-14T21:14:21Z
dc.date.issued[en]: 1992
dc.date.rdate[en]: 2008-06-06
dc.date.sdate[en]: 2008-06-06
dc.description.abstract[en]: The effectiveness of seven methods of handling missing data was investigated in a factorial design using random samples selected from the National Education Longitudinal Study of 1988 (NELS-88). The methods evaluated were listwise deletion, pairwise deletion, mean substitution, Buck's procedure, mean regression, one-iteration regression, and iterative regression. The factors controlled were number of variables (4 and 8), average intercorrelation (0.2 and 0.4), sample size (200 and 2000), and proportion of incomplete cases (10%, 20%, and 40%). The pattern of missing values was determined by the pattern existing in the variables selected from the NELS-88 data base. Covariance matrices resulting from the use of each missing data method were compared to the 'true' covariance matrix using multi-sample analysis in LISREL 7. Variable means were compared to the 'true' means using the MANOVA procedure in SPSS/PC+. Statistically significant differences (p≤.05) were detected in both comparisons. The most surprising result of this study was the effectiveness (p>.05) of pairwise deletion whenever the sample size was large, thus supporting the contention that the error term disappears as sample size approaches infinity (Glasser, 1964). Listwise deletion was also effective (p>.05) whenever there were four variables or the sample size was small. Almost as surprising was the relative ineffectiveness (p<.05) of the regression methods. This is explained by the difference between the proportion of incomplete cases and the proportion of missing values, and by the distribution of the missing values within the incomplete cases.
dc.description.degree[en]: Ph. D.
dc.format.extent[en]: vii, 82 leaves
dc.format.medium[en]: BTD
dc.format.mimetype[en]: application/pdf
dc.identifier.other[en]: etd-06062008-170840
dc.identifier.sourceurl[en]: http://scholar.lib.vt.edu/theses/available/etd-06062008-170840/
dc.identifier.uri[en]: http://hdl.handle.net/10919/38437
dc.language.iso[en]: en
dc.publisher[en]: Virginia Tech
dc.relation.haspart[en]: LD5655.V856_1992.W588.pdf
dc.relation.isformatof[en]: OCLC# 28310136
dc.rights[en]: In Copyright
dc.rights.uri[en]: http://rightsstatements.org/vocab/InC/1.0/
dc.subject.lcc[en]: LD5655.V856 1992.W588
dc.subject.lcsh[en]: Missing observations (Statistics)
dc.title[en]: Seven methods of handling missing data using samples from a national data base
dc.type[en]: Dissertation
dc.type.dcmitype[en]: Text
thesis.degree.discipline[en]: Educational Research and Evaluation
thesis.degree.grantor[en]: Virginia Polytechnic Institute and State University
thesis.degree.level[en]: doctoral
thesis.degree.name[en]: Ph. D.
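
Illustration of three methods named in the abstract

The abstract compares covariance matrices produced by different missing data methods. As a rough illustration only, the Python sketch below computes a covariance matrix under three of the seven methods: listwise deletion, pairwise deletion, and mean substitution. It is not the dissertation's code (the study used LISREL 7 and SPSS/PC+). The function names, the use of None to mark a missing value, and the sample data are all assumptions made for this example. The pairwise version shown here takes each covariance from the cases observed on both variables and uses the means of those same cases, which is only one of several variants of pairwise deletion.

```python
def _cov(xs, ys):
    """Sample covariance (n - 1 denominator) of two equal-length lists."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def _cov_matrix(rows):
    """Covariance matrix of fully observed rows (a list of equal-length lists)."""
    p = len(rows[0])
    cols = [[r[j] for r in rows] for j in range(p)]
    return [[_cov(cols[j], cols[k]) for k in range(p)] for j in range(p)]


def listwise_cov(rows):
    """Listwise deletion: drop every case that has any missing value."""
    complete = [r for r in rows if all(v is not None for v in r)]
    return _cov_matrix(complete)


def pairwise_cov(rows):
    """Pairwise deletion: each entry uses the cases observed on both variables."""
    p = len(rows[0])
    matrix = []
    for j in range(p):
        row_out = []
        for k in range(p):
            both = [(r[j], r[k]) for r in rows
                    if r[j] is not None and r[k] is not None]
            row_out.append(_cov([a for a, _ in both], [b for _, b in both]))
        matrix.append(row_out)
    return matrix


def mean_substitution_cov(rows):
    """Mean substitution: replace each missing value with its variable's observed mean."""
    p = len(rows[0])
    means = []
    for j in range(p):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    filled = [[means[j] if r[j] is None else r[j] for j in range(p)]
              for r in rows]
    return _cov_matrix(filled)


# Hypothetical sample: four cases, two variables, one missing value.
sample = [
    [1.0, 2.0],
    [2.0, 4.0],
    [3.0, None],
    [4.0, 8.0],
]

for name, method in [("listwise", listwise_cov),
                     ("pairwise", pairwise_cov),
                     ("mean substitution", mean_substitution_cov)]:
    print(name, method(sample))
```

On this small sample, mean substitution gives a smaller variance for the second variable than listwise deletion does, because the substituted value sits exactly at the mean and adds nothing to the sum of squared deviations. Pairwise deletion differs from listwise deletion only in the variance of the first variable, where it can use all four cases.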

Files

Original bundle
Name: LD5655.V856_1992.W588.pdf
Size: 3.67 MB
Format: Adobe Portable Document Format