principal components analysis (PCA)
A statistical procedure for transforming an observations-by-variables data matrix so that the variables in the new matrix are uncorrelated with one another. Unlike factor analysis, the transformed matrix contains as many new variables (termed components) as the original.
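The components are extracted one at a time; in current software they are typically obtained from an eigendecomposition of the correlation (or covariance) matrix of the original variables. The first principal component occupies a position as close to (i.e. as highly correlated with) all of the original variables as possible. The second is as close as possible to the residual variation left by the first while remaining uncorrelated with it, and so on until all have been extracted.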
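The one-at-a-time extraction described above can be sketched with power iteration and deflation: find the direction that accounts for the most variation, subtract that variation from the correlation matrix, and repeat on the residual. This is only an illustrative sketch, not the routine any particular package uses; the function name `extract_components`, its parameters and the simulated data are assumptions introduced for the example.

```python
import numpy as np

def extract_components(X, n_iter=10000, tol=1e-12, seed=0):
    """Extract components one at a time by power iteration with deflation.

    Returns the eigenvalues (largest first) and the matching unit-length
    component directions as columns. Hypothetical helper for illustration.
    """
    rng = np.random.default_rng(seed)
    # Standardize so that the analysis works on the correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    residual = (Z.T @ Z) / (Z.shape[0] - 1)
    p = residual.shape[0]
    eigenvalues = np.zeros(p)
    directions = np.zeros((p, p))
    for k in range(p):
        # Start from a random unit vector and repeatedly apply the residual matrix.
        v = rng.normal(size=p)
        v /= np.linalg.norm(v)
        for _ in range(n_iter):
            w = residual @ v
            norm = np.linalg.norm(w)
            if norm < 1e-15:  # no variation left to extract
                break
            w /= norm
            converged = np.linalg.norm(w - v) < tol
            v = w
            if converged:
                break
        lam = v @ residual @ v
        eigenvalues[k] = lam
        directions[:, k] = v
        # Remove the variation this component accounts for.
        residual = residual - lam * np.outer(v, v)
    return eigenvalues, directions


# Simulated (assumed) data: 200 observations on 4 variables, two correlated pairs.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))
X[:, 1] += X[:, 0]
X[:, 3] += 0.5 * X[:, 2]

eigenvalues, directions = extract_components(X)
# Scores of the observations on the extracted components.
scores = ((X - X.mean(axis=0)) / X.std(axis=0, ddof=1)) @ directions
```

With standardized variables the eigenvalues sum to the number of variables, and the resulting scores are uncorrelated with one another.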
The output of a PCA (which normally takes only a few seconds on modern computers, with many standard packages available) comprises three important sets of information. The eigenvalues measure the relative importance of each component: each eigenvalue, divided by the sum of all the eigenvalues, gives the proportion of the variation in the original variables accounted for by that component. The larger an eigenvalue, the greater the commonality among the original variables that its component captures. The component loadings show the correlations between the original variables and the new ones, thus identifying which groups of variables have common patterns. Finally, the component scores are the values of the observations on each of the new variables.
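As a minimal sketch of these three outputs, the following computes eigenvalues, loadings and scores from the correlation matrix with NumPy. The function name `pca_outputs` and the simulated data are assumptions made for the example, and the loadings equal correlations only because the variables are standardized here.

```python
import numpy as np

def pca_outputs(X):
    """Return eigenvalues, proportions of variation, loadings and scores.

    Hypothetical helper; works on the correlation matrix of X
    (rows are observations, columns are variables).
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(X, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(R)
    # eigh returns ascending order; put the most important component first.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = np.clip(eigenvalues[order], 0.0, None)
    eigenvectors = eigenvectors[:, order]
    # Share of the total variation accounted for by each component.
    proportion = eigenvalues / eigenvalues.sum()
    # Loadings: correlations between original variables (rows) and components (columns).
    loadings = eigenvectors * np.sqrt(eigenvalues)
    # Scores: values of the observations on each component.
    scores = Z @ eigenvectors
    return eigenvalues, proportion, loadings, scores


# Simulated (assumed) data: 200 observations on 4 variables, two correlated pairs.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))
X[:, 1] += X[:, 0]
X[:, 3] += 0.5 * X[:, 2]

eigenvalues, proportion, loadings, scores = pca_outputs(X)
```

Variables that load heavily on the same component form one of the groups with common patterns; in this simulated data the first two variables should load together, as should the last two.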
Principal components analysis has been used by geographers: (a) to identify groups of inter-correlated variables, in an inductive search for common patterns; (b) to simplify a data set by removing redundant information resulting from inter-correlated variables; (c) to reorganize a data set by removing collinearity (see regression; general linear model); and (d) to test hypotheses. (RJJ)
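Uses (b) and (c) can be illustrated with a short sketch that keeps only the leading components and returns their scores, which are uncorrelated by construction. The 90 per cent cut-off, the function name `reduce_dataset` and the simulated data are assumptions chosen for the example, not a rule given in the entry.

```python
import numpy as np

def reduce_dataset(X, min_proportion=0.9):
    """Replace inter-correlated variables with scores on the leading components.

    Keeps the smallest number of components whose cumulative share of the
    variation reaches min_proportion (an assumed, arbitrary threshold).
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    eigenvalues, eigenvectors = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    k = min(int(np.searchsorted(cumulative, min_proportion)) + 1, len(eigenvalues))
    return Z @ eigenvectors[:, :k]


# Simulated (assumed) data: the third variable is almost the sum of the first two,
# so it carries redundant information.
rng = np.random.default_rng(7)
a = rng.normal(size=300)
b = rng.normal(size=300)
X = np.column_stack([a, b, a + b + 0.05 * rng.normal(size=300)])

reduced = reduce_dataset(X)
```

Because the retained scores are uncorrelated, they can stand in for the original variables as predictors in a regression without the collinearity problem.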
Suggested Reading
Johnston, R.J. 1978: Multivariate statistical analysis in geography: a primer on the general linear model. London and New York: Longman.