Principal Component Analysis

Principal component analysis is probably the oldest and best known of the techniques of multivariate analysis. It was first introduced by Pearson (1901), and developed independently by Hotelling (1933). Like many multivariate methods, it was not widely used until the advent of electronic computers, but it is now well entrenched in virtually every statistical computer package.

The central idea of principal component analysis is to reduce the dimensionality of a data set in which there are a large number of interrelated variables, while retaining as much as possible of the variation present in the data set. This reduction is achieved by transforming to a new set of variables, the principal components, which are uncorrelated, and which are ordered so that the first few retain most of the variation present in all of the original variables.

Computation of the principal components reduces to the solution of an eigenvalue-eigenvector problem for a positive-semidefinite symmetric matrix. Thus, the definition and computation of principal components are straightforward but, as will be seen, this apparently simple technique has a wide variety of different applications, as well as a number of different derivations. Any feelings that principal component analysis is a narrow subject should soon be dispelled by the present book; indeed some quite broad topics which are related to principal component analysis receive no more than a brief mention in the final two chapters.
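The eigenvalue-eigenvector computation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the book's own procedure: the function name `pca` and its signature are invented for this example. The sample covariance matrix is symmetric positive-semidefinite, so its eigenvectors give the (uncorrelated) principal components and its eigenvalues give the variance each component retains.

```python
import numpy as np

def pca(X, n_components):
    """Illustrative PCA via eigendecomposition of the sample covariance matrix.

    X: (n_samples, n_features) data matrix.
    Returns (scores, components, variances).
    """
    Xc = X - X.mean(axis=0)                 # centre each variable
    cov = np.cov(Xc, rowvar=False)          # symmetric positive-semidefinite matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh is for symmetric matrices; ascending order
    order = np.argsort(eigvals)[::-1]       # reorder so the first PCs retain most variation
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components]  # columns are the principal directions
    scores = Xc @ components                # new variables: uncorrelated by construction
    return scores, components, eigvals[:n_components]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
scores, components, variances = pca(X, 2)
```

Because the transformation is orthogonal, the covariance matrix of the scores is diagonal, which is the "uncorrelated" property mentioned above.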
Contents

Mathematical and Statistical Properties of Population Principal Components
Graphical Representation of Data Using Principal Components
Principal Components in Regression Analysis
Principal Components Used with Other Multivariate Techniques
Appendix
References