A Survey on Distance Metric Learning (Part 1)
Gerry Tesauro, IBM T.J. Watson Research Center


Acknowledgement
Lecture material shamelessly adapted/stolen from the following sources:
- Kilian Weinberger: "Survey on Distance Metric Learning" slides; IBM summer intern talk slides (Aug. 2006)
- Sam Roweis: slides (NIPS 2006 workshop on "Learning to Compare Examples")
- Yann LeCun: talk slides (NIPS 2006 workshop on "Learning to Compare Examples")

Outline
Part 1:
- Motivation and basic concepts: ML tasks where it is useful to learn a distance metric
- Overview of dimensionality reduction
- Mahalanobis metric learning for clustering with side information (Xing et al.)
- Pseudo-metric online learning (Shalev-Shwartz et al.)
Part 2:
- Neighbourhood Components Analysis (Goldberger et al.); Metric Learning by Collapsing Classes (Globerson & Roweis)
- Metric Learning for Kernel Regression (Weinberger & Tesauro)
- Metric learning for RL basis-function construction (Keller et al.)
- Similarity learning for image processing (LeCun et al.)


Motivation
Many ML algorithms and tasks require a distance metric (equivalently, a "dissimilarity" metric):
- Clustering (e.g. k-means)
- Classification and regression: kernel methods, nearest-neighbor methods
- Document/text retrieval: find the most similar fingerprints in a DB to a given sample; find the most similar web pages to a document or set of keywords
- Nonlinear dimensionality reduction methods: Isomap, Maximum Variance Unfolding, Laplacian Eigenmaps, etc.

Motivation (2)
Many problems may lack a well-defined, relevant distance metric:
- Incommensurate features: Euclidean distance is not meaningful
- Side information: Euclidean distance is not relevant
Learning a distance metric may thus be desirable. A sensible similarity/distance metric may be highly task-dependent or semantics-dependent:
- What do these data points "mean"?
- What are we using the data for?
- Which images are most similar?

Which images are most similar? It depends: [image examples] the same faces can be grouped as centered / left / right, or as male / female; it depends what you are looking for (student vs. professor, nature background vs. plain background).

Key DML concept: the Mahalanobis distance metric
The simplest mapping is a linear transformation x -> Lx, under which distances become

    d(x_i, x_j) = ||L x_i - L x_j|| = sqrt((x_i - x_j)^T M (x_i - x_j)),   with M = L^T L

Algorithms can learn either matrix: L directly, or the positive semidefinite matrix M.
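As a minimal illustration (a NumPy sketch of the standard definition, not code from the original slides), the Mahalanobis distance under a learned linear map L can be computed as:

```python
import numpy as np

def mahalanobis(x, y, L):
    """Distance under the linear map L: d(x, y) = ||L x - L y||.

    Equivalent to sqrt((x - y)^T M (x - y)) with M = L^T L,
    so M is positive semidefinite by construction.
    """
    diff = L @ (x - y)
    return float(np.sqrt(diff @ diff))

# With L = I this reduces to plain Euclidean distance.
x = np.array([1.0, 2.0])
y = np.array([4.0, 6.0])
print(mahalanobis(x, y, np.eye(2)))  # -> 5.0
```

Choosing to learn L (any real matrix) rather than M sidesteps the positive-semidefiniteness constraint, at the cost of a non-convex objective; several of the algorithms surveyed here differ exactly in which parameterization they pick.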

A 5-Minute Introduction to Dimensionality Reduction
How can the dimensionality be reduced?
- eliminate redundant features
- eliminate irrelevant features
- extract low-dimensional structure

Notation
Input: x_1, ..., x_n in R^d
Output: y_1, ..., y_n in R^r, with r << d
Embedding principle: nearby points remain nearby, distant points remain distant. The target dimensionality r must be estimated.

Two classes of DR algorithms: linear and non-linear.

Linear dimensionality reduction: Principal Component Analysis (Jolliffe 1986)
Project the data into the subspace of maximum variance.

Optimization
Maximize the variance of the projection: max_w w^T C w subject to ||w|| = 1, where C is the data covariance matrix. Eigenvalue solution: C w = lambda w.

Facts about PCA
- uses the eigenvectors of the covariance matrix C
- minimizes the sum-of-squares reconstruction error
- the dimensionality r can be estimated from the eigenvalues of C
- PCA requires meaningful scaling of the input features
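The eigenvalue solution above can be sketched in a few lines of NumPy (an illustrative implementation, not from the slides): center the data, form the covariance matrix, and project onto its top-r eigenvectors.

```python
import numpy as np

def pca(X, r):
    """Project the rows of X onto the top-r eigenvectors of the covariance matrix."""
    Xc = X - X.mean(axis=0)            # center the data
    C = Xc.T @ Xc / len(X)             # covariance matrix
    evals, evecs = np.linalg.eigh(C)   # eigh returns ascending eigenvalues
    order = np.argsort(evals)[::-1]    # sort descending
    W = evecs[:, order[:r]]            # top-r eigenvectors
    return Xc @ W, evals[order]
```

The returned eigenvalue spectrum is what the slide refers to for estimating r: small trailing eigenvalues indicate directions that can be dropped with little reconstruction error.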

Multidimensional Scaling (MDS)
Given pairwise distances D_ij, form the inner-product (Gram) matrix B = -(1/2) J D_sq J, where D_sq holds the squared distances and J = I - (1/n) 1 1^T is the centering matrix; the embedding comes from the top eigenvectors of B.
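The classical MDS construction can be sketched as follows (a NumPy illustration of the standard double-centering recipe, assuming D holds exact Euclidean distances):

```python
import numpy as np

def classical_mds(D, r):
    """Recover an r-dimensional embedding from a pairwise distance matrix D."""
    n = len(D)
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # inner-product (Gram) matrix
    evals, evecs = np.linalg.eigh(B)
    idx = np.argsort(evals)[::-1][:r]     # top-r eigenpairs
    # clip tiny negative eigenvalues caused by round-off before the sqrt
    return evecs[:, idx] * np.sqrt(np.maximum(evals[idx], 0.0))
```

For exact Euclidean input distances the recovered configuration reproduces the original pairwise distances up to rotation and translation.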


Multidimensional Scaling (MDS)
- equivalent to PCA
- uses the eigenvectors of the inner-product matrix
- requires only pairwise distances

Non-linear dimensionality reduction

From subspace to submanifold
We assume the data is sampled from a manifold with a lower-dimensional number of degrees of freedom. How can we find a faithful embedding? Approximate the manifold with a neighborhood graph.
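The neighborhood-graph approximation mentioned above is typically built from k-nearest neighbors; a minimal sketch (illustrative, using plain Euclidean distances) is:

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric k-nearest-neighbor adjacency matrix over the rows of X."""
    D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    np.fill_diagonal(D, np.inf)                 # exclude self-edges
    A = np.zeros_like(D, dtype=bool)
    nn = np.argsort(D, axis=1)[:, :k]           # indices of the k closest points
    rows = np.repeat(np.arange(len(X)), k)
    A[rows, nn.ravel()] = True
    return A | A.T                              # symmetrize the directed kNN edges
```

Methods such as Isomap and Maximum Variance Unfolding operate on a graph of this kind, replacing ambient distances by quantities computed along its edges.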

Neighborhood Components Analysis (Goldberger et al. 2004)
A distance metric for visualization and kNN.
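NCA learns a linear map A by maximizing the expected leave-one-out accuracy of a stochastic nearest-neighbor classifier. A sketch of that objective (an illustrative NumPy version of the formula in Goldberger et al. 2004, not optimized code) is:

```python
import numpy as np

def nca_objective(A, X, y):
    """Expected number of correctly classified points under soft neighbor assignment.

    Neighbor probabilities are p_ij proportional to exp(-||A x_i - A x_j||^2),
    with p_ii = 0; the objective sums p_ij over same-class pairs.
    """
    Z = X @ A.T                                     # project the data
    d2 = ((Z[:, None] - Z[None, :]) ** 2).sum(-1)   # squared distances
    np.fill_diagonal(d2, np.inf)                    # enforce p_ii = 0
    P = np.exp(-d2)
    P /= P.sum(axis=1, keepdims=True)               # softmax over neighbors
    same = y[:, None] == y[None, :]
    return float((P * same).sum())
```

Gradient ascent on this quantity with respect to A yields the NCA metric; with well-separated classes the objective approaches n, the number of training points.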
