An exploration of the robustness of traditional regression analysis versus analysis using backpropagation networks

Virginia Tech

Research linking neural networks and statistics has tended to sit at one of two ends of a spectrum: either highly theoretical or application specific. This research attempts to bridge that gap by exploring the robustness of regression analysis and backpropagation networks in conducting data analysis. Robustness is viewed as the degree to which a technique is insensitive to abnormalities in data sets, such as violations of assumptions.

The central focus of regression analysis is the establishment of an equation that describes the relationship between the variables in a data set. This relationship is used primarily for the prediction of one variable based on the known values of the other variables. Certain assumptions have to be made regarding the data in order to obtain a tractable solution, and the failure of one or more of these assumptions results in poor prediction.

The data sets in this research are characterized in terms of the assumptions underlying linear regression, grouped as follows: (a) sample size and error variance, (b) outliers, skewness, and kurtosis, (c) multicollinearity, and (d) nonlinearity and underspecification.

By using this characterization, the robustness of each technique is studied under what is, in effect, the relaxation of assumptions one at a time. The comparison between regression and backpropagation is made using the root mean square difference between the predicted output from each technique and the actual output.
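
As an illustration of the comparison criterion described above, the following is a minimal sketch, not the procedure used in this research. It assumes NumPy, an ordinary least squares fit as the regression technique, and a synthetic data set invented purely for the example; the names rms_difference, fit_ols, and predict_ols are hypothetical. Predictions from a backpropagation network could be scored with the same rms_difference function.

```python
import numpy as np


def rms_difference(predicted, actual):
    """Root mean square difference between predicted and actual outputs."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))


def fit_ols(X, y):
    """Fit a linear regression with an intercept by ordinary least squares."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_ols(coef, X):
    """Predict outputs from fitted coefficients (intercept first)."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


# Synthetic data (an assumption for illustration only): a linear
# relationship with small Gaussian error, so the regression
# assumptions hold.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=50)

coef = fit_ols(X, y)
rms_ols = rms_difference(predict_ols(coef, X), y)
print("RMS difference (OLS):", rms_ols)
```

Relaxing one assumption at a time would then amount to changing how the data set is generated (for example, adding outliers or a nonlinear term) and recomputing the same measure for each technique.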