See also
* [[Singular value decomposition]]
* [[Eigenvalues and eigenvectors]]
[[Category:Multivariate statistics]]
[[Category:Dimension reduction]]
[[Category:Statistical models]]
[[Category:Data analysis]]
{{stub}}
Latest revision as of 01:36, 2 January 2025
Principal Component Analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. This transformation is defined in such a way that the first principal component has the largest possible variance (that is, accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it is orthogonal to the preceding components. The resulting vectors (each being a linear combination of the variables and containing n observations) are an uncorrelated orthogonal basis set. PCA is sensitive to the relative scaling of the original variables.
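The claim that the components are uncorrelated can be checked directly: projecting centered data onto the eigenvectors of its covariance matrix yields scores whose covariance matrix is diagonal. A minimal NumPy sketch (the synthetic data and variable names here are illustrative, not part of the article):

```python
import numpy as np

rng = np.random.default_rng(0)
# Two correlated variables: the second is a noisy copy of the first.
x = rng.normal(size=200)
X = np.column_stack([x, x + 0.1 * rng.normal(size=200)])

# Center the data, then project it onto the eigenvectors of the covariance matrix.
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc.T))
scores = Xc @ eigvecs  # principal component scores

# The scores are uncorrelated up to floating-point error:
# their covariance matrix is diagonal (equal to the eigenvalues).
cov_scores = np.cov(scores.T)
print(abs(cov_scores[0, 1]) < 1e-10)
```

Because PCA is sensitive to the relative scaling of the variables, it is common to standardize each column (subtract the mean, divide by the standard deviation) before this step when the variables are measured in different units.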
History
PCA was invented in 1901 by Karl Pearson, as an analogue of the principal axis theorem in mechanics; it was later independently developed and named by Harold Hotelling in the 1930s.
Definition
PCA is mathematically defined as an orthogonal linear transformation that transforms the data to a new coordinate system such that the greatest variance by some projection of the data comes to lie on the first coordinate (called the first principal component), the second greatest variance on the second coordinate, and so on.
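This definition is commonly realized via the singular value decomposition of the centered data matrix: the right singular vectors are the principal directions, and NumPy returns the singular values sorted in descending order, so the first coordinate automatically captures the greatest variance. A sketch of that standard construction (the `pca` helper and test data are illustrative, not from the article):

```python
import numpy as np

def pca(X, k):
    """Project the rows of X onto the first k principal components.

    Centers the data, then uses the SVD: the right singular vectors of
    the centered matrix are the principal directions, ordered by the
    variance they capture.
    """
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T, Vt[:k]

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 5))  # correlated columns
scores, components = pca(X, 2)

# Per the definition, the variance along the first coordinate is the
# largest; each later coordinate captures no more than the one before it.
v1, v2 = scores.var(axis=0)
print(v1 >= v2)
```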
Applications
PCA is used in exploratory data analysis and for making predictive models. It is commonly used in fields such as face recognition and image compression. It is also used in finance, genetics, neuroscience, and many other fields.
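The compression use mentioned above amounts to keeping only the leading components and reconstructing the data from them. A hedged sketch on synthetic low-rank data (all names and data here are illustrative assumptions, not from the article):

```python
import numpy as np

rng = np.random.default_rng(2)
# A low-rank "image-like" matrix plus small noise: most of the variance
# lies in the first few principal components.
X = rng.normal(size=(60, 3)) @ rng.normal(size=(3, 20)) + 0.01 * rng.normal(size=(60, 20))

Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

def reconstruct(k):
    # Project onto the first k principal directions, then map back.
    return (Xc @ Vt[:k].T) @ Vt[:k] + X.mean(axis=0)

# Reconstruction error shrinks as more components are kept; by rank 3
# only the small noise term remains.
errors = [np.linalg.norm(X - reconstruct(k)) for k in (1, 2, 3)]
print(errors[0] > errors[1] > errors[2])
```

Storing the k component vectors and the k scores per observation, instead of all original variables, is what yields the compression.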


