The central idea of principal component analysis (PCA) is to reduce the dimensionality of a data set consisting of a large number of interrelated variables while retaining as much as possible of the variation present in the data set. This is achieved by transforming to a new set of variables, the principal components (PCs), which are uncorrelated and which are ordered so that the first few... For example, by looking only at the data distribution projected onto the principal direction in Figures 9-10 and 12-13, it is almost impossible to recover the corresponding original data set. To address these issues, the literature turns to kernel PCA or independent component analysis (ICA) in cases where PCA fails.
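
As a rough illustration of the idea above, the following sketch computes principal components with a singular value decomposition in NumPy. The function name `pca_svd` and the synthetic data are assumptions made for this example only; they do not come from any of the quoted sources.

```python
import numpy as np

def pca_svd(X, k):
    """Project X (n samples x p variables) onto its first k principal components."""
    Xc = X - X.mean(axis=0)                  # centre each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                      # rows are principal directions
    scores = Xc @ components.T               # data in the reduced space
    explained = s**2 / np.sum(s**2)          # share of variance per component
    return scores, components, explained

# Synthetic data (assumed): 5 interrelated variables driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))

scores, components, explained = pca_svd(X, k=2)
```

Here `explained` is sorted from largest to smallest, so the first few components carry most of the variation, and the columns of `scores` are uncorrelated with each other.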

Be able to demonstrate that PCA/factor analysis can be undertaken with either raw data or a set of correlations. After you have worked through this chapter, if you feel you have learnt something not mentioned above, please add it below: Factor analysis and Principal Component Analysis (PCA)
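
One way to check the "raw data or a set of correlations" claim numerically is sketched below. It assumes that PCA on raw data means PCA on standardized variables; the data are synthetic and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic raw data (assumed) with three correlated variables.
raw = rng.normal(size=(100, 3)) @ np.array([[2.0, 0.5, 0.0],
                                             [0.0, 1.0, 0.3],
                                             [0.0, 0.0, 0.2]])

# Route 1: start from the raw data, standardize, then decompose the covariance.
z = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
eigvals_raw = np.linalg.eigvalsh(np.cov(z, rowvar=False))[::-1]

# Route 2: start from the correlation matrix alone.
R = np.corrcoef(raw, rowvar=False)
eigvals_corr, eigvecs_corr = np.linalg.eigh(R)
eigvals_corr = eigvals_corr[::-1]            # largest eigenvalue first
```

Both routes give the same eigenvalues, because the covariance matrix of standardized variables is the correlation matrix. The correlation-only route yields eigenvalues and loadings but not component scores for individual observations, since those need the raw data.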

At the end of the last lecture, I set as our goal to find ways of reducing the dimensionality of our data by means of linear projections, and of choosing pro-... A plot of the data in the space of the first two principal components, with the points labelled by the name of the corresponding competitor, can be produced as shown in Figure 13.3.

### PCA yields the directions (principal components) that maximize the variance of the data, whereas LDA aims to find the directions that maximize the separation (or discrimination) between different classes, which can be useful in pattern classification problems (PCA “ignores” class labels).
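
The contrast can be made concrete with a small two-class example. This is a sketch on assumed synthetic data, using the classical two-class Fisher discriminant as the LDA direction; it is not taken from the quoted source.

```python
import numpy as np

rng = np.random.default_rng(2)
# Two classes (assumed): large shared variance along x, separation along y.
cov = np.array([[9.0, 0.0], [0.0, 0.25]])
class_a = rng.multivariate_normal([0.0, 0.0], cov, size=300)
class_b = rng.multivariate_normal([0.0, 2.0], cov, size=300)
X = np.vstack([class_a, class_b])

# PCA: leading eigenvector of the pooled covariance; labels are never used.
eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
pca_dir = eigvecs[:, -1]                     # eigh sorts eigenvalues ascending

# Two-class Fisher LDA: direction proportional to Sw^{-1} (mean_b - mean_a),
# where Sw is proportional to the within-class scatter (equal class sizes here).
Sw = np.cov(class_a, rowvar=False) + np.cov(class_b, rowvar=False)
lda_dir = np.linalg.solve(Sw, class_b.mean(axis=0) - class_a.mean(axis=0))
lda_dir /= np.linalg.norm(lda_dir)
```

On this data `pca_dir` points almost along the x axis, where the variance is largest, while `lda_dir` points almost along the y axis, where the classes actually separate.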

- (Chapter 18, Principal Components Analysis) Setting the derivatives to zero at the optimum, we get $w^T w = 1$ (18.19) and $v w = \lambda w$ (18.20). Thus, the desired vector $w$ is an eigenvector of the covariance matrix $v$, and the maxi-
- COEFF = princomp(X) performs principal components analysis (PCA) on the n-by-p data matrix X, and returns the principal component coefficients, also known as loadings. Rows of X correspond to observations, columns to variables.
- I have a large set of variables measured on different scales/units, and want to standardize them to the same scale, with a mean of 0 and standard deviation of 1, so that I can run a PCA on them. I
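
The snippets above touch on three related steps: standardizing variables measured on different scales, obtaining principal component coefficients (loadings), and the condition that a principal direction is a unit-length eigenvector of the covariance matrix. The sketch below strings them together in NumPy rather than MATLAB. The data are synthetic, and the name `coeff` is chosen only to echo the `COEFF` output described above; it is not a claim about how `princomp` is implemented.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical variables on very different scales (assumed for illustration).
X = rng.normal(size=(150, 3)) * np.array([1.0, 100.0, 0.01])
X[:, 1] += 50.0 * X[:, 0]                    # make two variables correlated

# Standardize each column to mean 0 and standard deviation 1.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Covariance matrix of the standardized data and its eigendecomposition.
v = np.cov(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(v)         # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]            # reorder: largest variance first
eigvals, coeff = eigvals[order], eigvecs[:, order]

w = coeff[:, 0]                              # first principal direction
lam = eigvals[0]                             # variance of the data along w
```

With these definitions, `w @ w` equals 1 and `v @ w` equals `lam * w` up to rounding, and the variance of the projected data `Z @ w` equals `lam`. The sign of each column of `coeff` is arbitrary, so results from other software may differ by a sign flip.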

