Principal Component Analysis Barnabás Póczos University of Alberta Nov 24, 2009

References: B: Chapter 12; HRF: Chapter 14.5

Contents:
- Motivation
- PCA algorithms
- Applications: face recognition, facial expression recognition
- PCA theory
- Kernel-PCA

Some of these slides are taken from Karl Booksh's research group, Tom Mitchell, and Ron Parr.

PCA Applications:
- Data Visualization
- Data Compression
- Noise Reduction
- Data Classification
- Trend Analysis
- Factor Analysis

Data Visualization

Example: Given 53 blood and urine measurements (features) from 65 people, how can we visualize the measurements?

Matrix format (65 × 53, instances × features): difficult to see the correlations between the features.

Spectral format (65 pictures, one for each person): difficult to compare the different patients.

Spectral format (53 pictures, one for each feature): difficult to see the correlations between the features.

Bi-variate and tri-variate plots: how can we visualize the other variables? It is difficult to see anything in 4- or higher-dimensional spaces.

Is there a representation better than the coordinate axes? Is it really necessary to show all 53 dimensions? What if there are strong correlations between the features? How could we find the smallest subspace of the 53-D space that keeps the most information about the original data?

A solution: Principal Component Analysis.
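The idea of projecting the 53-D measurements down to a plottable 2-D subspace can be sketched in NumPy. This is only an illustration, not code from the slides: random data stands in for the actual blood/urine matrix, and the variable names are my own.

```python
import numpy as np

# Hypothetical stand-in for the 65 x 53 matrix of blood/urine
# measurements (65 patients, 53 features); random data for illustration.
rng = np.random.default_rng(0)
data = rng.standard_normal((65, 53))

Xc = data - data.mean(axis=0)                  # center each feature
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:2].T                         # each patient as a 2-D point

# scores[:, 0] vs. scores[:, 1] can now be shown in one scatter plot
```

The 2-D coordinates (`scores`) are the projections of each patient onto the two directions of largest variance, so a single scatter plot replaces 53 separate axes.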

Principal Component Analysis

Orthogonal projection of the data onto a lower-dimensional linear space that
- maximizes the variance of the projected data (purple line), and
- minimizes the mean squared distance between the data points and their projections (sum of the blue lines).

PCA: Principal Components Analysis

Idea: Given data points in a d-dimensional space, project them into a lower-dimensional space while preserving as much information as possible.
- E.g., find the best planar approximation to 3-D data.
- E.g., find the best 12-D approximation to 10^4-D data.
In particular, choose the projection that minimizes the squared error in reconstructing the original data.

The Principal Components
- Vectors originating from the center of mass.
- Principal component 1 points in the direction of the largest variance.
- Each subsequent principal component is orthogonal to the previous ones, and points in the direction of the largest variance of the residual subspace.
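The two characterizations (maximum projected variance, minimum reconstruction error) pick out the same direction. A minimal sketch, using a synthetic correlated 2-D point cloud of my own making, that compares the reconstruction error of the top eigenvector against an arbitrary coordinate axis:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 300))
X[1] = 0.9 * X[0] + 0.2 * X[1]          # correlated 2-D point cloud
X = X - X.mean(axis=1, keepdims=True)   # center the data

def recon_error(X, u):
    """Sum of squared distances between points and their projections onto span{u}."""
    u = u / np.linalg.norm(u)
    Xhat = np.outer(u, u @ X)           # project each column onto u, reconstruct
    return np.sum((X - Xhat) ** 2)

evals, evecs = np.linalg.eigh(X @ X.T)  # eigenvalues in ascending order
pc1 = evecs[:, -1]                      # direction of largest variance

err_pca = recon_error(X, pc1)
err_axis = recon_error(X, np.array([1.0, 0.0]))
# the variance-maximizing direction is also the error-minimizing one
```

Because the total variance is fixed once the data are centered, maximizing the variance captured by the projection is the same as minimizing the variance left in the residual, which is exactly the squared reconstruction error.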

Figures: a 2-D Gaussian dataset, its 1st PCA axis, and its 2nd PCA axis.

PCA algorithm I (sequential)

Given the centered data {x1, …, xm}, compute the principal vectors:

1st PCA vector (we maximize the variance of the projection of x):
  w1 = arg max_{||w||=1} (1/m) Σ_{i=1..m} (w^T xi)^2

kth PCA vector (we maximize the variance of the projection in the residual subspace; x' is the PCA reconstruction of x from the first k−1 components):
  wk = arg max_{||w||=1} (1/m) Σ_{i=1..m} [w^T (xi − Σ_{j=1..k−1} wj wj^T xi)]^2

PCA algorithm II (sample covariance matrix)

Given data {x1, …, xm}, compute the covariance matrix
  Σ = (1/m) Σ_{i=1..m} (xi − x̄)(xi − x̄)^T,  where  x̄ = (1/m) Σ_{i=1..m} xi.

The PCA basis vectors are the eigenvectors of Σ; the larger the eigenvalue, the more important the eigenvector.

PCA algorithm II:
  PCA(X, k):  % top k eigenvalues/eigenvectors
  % X = N × m data matrix; each data point xi = a column vector, i = 1..m
  x̄ ← (1/m) Σ_i xi
  X ← X with the mean x̄ subtracted from each column xi
  Σ ← X X^T  % covariance matrix of X
  {λi, ui}_{i=1..N} ← eigenvalues/eigenvectors of Σ, with λ1 ≥ λ2 ≥ … ≥ λN
  return {λi, ui}_{i=1..k}  % top k principal components
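PCA algorithm II (eigenvectors of the sample covariance matrix) can be sketched in a few lines of NumPy. This is an illustrative sketch, not code from the slides; the function and variable names are my own.

```python
import numpy as np

def pca(X, k):
    """Top-k principal values/vectors of an N x m data matrix X
    (one data point per column), following 'PCA algorithm II'."""
    Xc = X - X.mean(axis=1, keepdims=True)   # subtract the mean from each column
    Sigma = Xc @ Xc.T / Xc.shape[1]          # sample covariance matrix
    evals, evecs = np.linalg.eigh(Sigma)     # eigh returns ascending eigenvalues
    order = np.argsort(evals)[::-1][:k]      # re-order: largest eigenvalue first
    return evals[order], evecs[:, order]

# toy usage: 2-D points stretched along the diagonal
rng = np.random.default_rng(1)
pts = rng.standard_normal((2, 500))
pts[1] = pts[0] + 0.1 * pts[1]               # strong correlation between the rows
lams, W = pca(pts, 1)                        # W[:, 0] close to (1,1)/sqrt(2), up to sign
```

`np.linalg.eigh` is used rather than `eig` because the covariance matrix is symmetric, which guarantees real eigenvalues and orthogonal eigenvectors.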

PCA algorithm III (SVD of the data matrix)

Singular Value Decomposition of the centered data matrix X (features × samples):
  X = U S V^T

Columns of U: the principal vectors {u(1), …, u(k)}; they are orthogonal and have unit norm, so U^T U = I. We can reconstruct the data using linear combinations of {u(1), …, u(k)}.

Matrix S: diagonal; its entries show the importance of each eigenvector, separating the significant directions from the noise.

Columns of V^T: the coefficients for reconstructing the samples.

Face recognition
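Algorithm III maps directly onto `np.linalg.svd`. A sketch on synthetic data (the matrix shapes and names here are my own, chosen for illustration):

```python
import numpy as np

# Synthetic centered data matrix X: features x samples
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 100))
X = X - X.mean(axis=1, keepdims=True)

U, s, Vt = np.linalg.svd(X, full_matrices=False)
# Columns of U: principal vectors (orthonormal, so U^T U = I)
# s: singular values (diagonal of S), the importance of each direction
# Rows of Vt: coefficients for reconstructing the samples

k = 2
X_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]   # best rank-k reconstruction of X
```

Truncating to the k largest singular values gives the best rank-k approximation of X in the least-squares sense, which is exactly the PCA reconstruction.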

Challenge: Facial Recognition

We want to identify a specific person based on a facial image, robust to glasses, lighting, etc. We can't just use the given 256 × 256 pixels.

Applying PCA: Eigenfaces

Example data set: images of faces; the famous Eigenface approach [Turk & Pentland], [Sirovich & Kirby]. Each face x is 256 × 256 luminance values, viewed as a 64K-dimensional vector. Form the centered data matrix X = [x1, …, xm] (256 × 256 real values, m faces). Compute S = X X^T. Problem: S is 64K × 64K. HUGE!

Method A: Build a PCA subspace for each person and check which subspace can reconstruct the test image the best.
Method B: Build one PCA database for the whole dataset and then classify based on the weights.

Computational Complexity

Suppose m instances, each of size N. Eigenfaces: m = 500 faces, each of size N = 64K. Given the N × N covariance matrix S, we can compute all N eigenvectors/eigenvalues in O(N^3), or the first k eigenvectors/eigenvalues in O(kN^2). But if N = 64K, this is EXPENSIVE!

A Clever Workaround

Note that m << 64K. Use the small m × m matrix L = X^T X instead of the 64K × 64K matrix S = X X^T.

If v is an eigenvector of L, then Xv is an eigenvector of S.

Proof:
  L v = λ v
  X^T X v = λ v
  X (X^T X v) = X (λ v) = λ (X v)
  (X X^T)(X v) = λ (X v)
  S (X v) = λ (X v)

Principal Components (Method B)

Reconstructing (Method B): faster if we train with only people without glasses, under the same lighting conditions.
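The workaround can be checked numerically. A sketch with a tall, thin random matrix standing in for the face data (dimensions and names are my own, chosen so that N >> m as in the Eigenfaces setting):

```python
import numpy as np

rng = np.random.default_rng(0)
N, m = 1000, 20                        # dimension N >> number of faces m
X = rng.standard_normal((N, m))
X = X - X.mean(axis=1, keepdims=True)  # center the columns

L = X.T @ X                            # small m x m matrix instead of N x N
evals, V = np.linalg.eigh(L)           # eigenpairs of L (ascending eigenvalues)
U = X @ V                              # map back: u = Xv is an eigenvector of S
U = U / np.linalg.norm(U, axis=0)      # give each column unit norm

# check the top eigenpair against the big matrix: S (Xv) = lambda (Xv)
S = X @ X.T
u, lam = U[:, -1], evals[-1]
```

Diagonalizing the 20 × 20 matrix L costs O(m^3) instead of O(N^3) for the 1000 × 1000 matrix S, and the nonzero eigenvalues of the two matrices coincide.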

Shortcomings

- Requires carefully controlled data: all faces centered in the frame, same size, and some sensitivity to angle.
- Alternative: "learn" one set of PCA vectors for each angle, and use the one with the lowest error.
- The method is completely knowledge-free (sometimes this is good!): it doesn't know that faces are wrapped around 3-D objects (heads), and it makes no effort to preserve class distinctions.

Facial expression recognition

Happiness subspace (Method A)

Thanks for the Attention!
