## Which Spatial Partition Trees Are Adaptive to Intrinsic Dimension

Nakul Verma, Samory Kpotufe, and Sanjoy Dasgupta
{naverma, skpotufe, dasgupta}@ucsd.edu, University of California, San Diego

### Why is this important?

Spatial partition trees are at the heart of many machine learning tasks (e.g. regression, nearest-neighbor search, vector quantization). Such a tree builds a hierarchy of nested partitions of the data space by recursively bisecting it. However, these trees tend to suffer from the curse of dimensionality: the rate at which the diameter of the data decreases as we go down the tree depends on the dimension D of the ambient space. In particular, we might require partitions of size O(2^D) to attain small data diameters. Fortunately, much real-world data has low intrinsic dimension (e.g. manifolds, sparse datasets), and we would like to benefit from such situations.

### Standard characterizations of intrinsic dimension

Common notions of intrinsic dimension (e.g. box dimension, doubling dimension) originally emerged from fractal geometry. They, however, have several issues in the context of machine learning, discussed below.

### Diameter decrease rate

We quantify adaptivity by the diameter decrease rate k: the smallest k such that the data diameter is halved every k levels of the tree. We characterize k for each of the trees we study.

### The trees we consider

- **Dyadic tree:** pick a coordinate direction and split the data at the midpoint along this direction.
- **kd tree:** pick a coordinate direction and split the data at the median along this direction.
- **RP tree:** pick a random direction and split the data at the median along this direction.
- **PCA/PD tree:** split the data at the median along the principal direction.
- **2-Means tree:** compute the 2-means solution and split the data according to the cluster assignment.
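The five splitting rules can each be written in a few lines. The following is our own minimal sketch (not the authors' code), assuming NumPy; each rule takes an (n, D) data matrix and returns a boolean mask assigning points to the left child.

```python
import numpy as np

rng = np.random.default_rng(0)

def dyadic_split(X, depth):
    """Cycle through coordinates; split at the midpoint of the cell."""
    i = depth % X.shape[1]
    mid = (X[:, i].min() + X[:, i].max()) / 2
    return X[:, i] <= mid

def kd_split(X, depth):
    """Cycle through coordinates; split at the median."""
    i = depth % X.shape[1]
    return X[:, i] <= np.median(X[:, i])

def rp_split(X, depth):
    """Project onto a random direction; split at the median."""
    u = rng.normal(size=X.shape[1])
    proj = X @ u
    return proj <= np.median(proj)

def pd_split(X, depth):
    """Project onto the principal direction; split at the median."""
    _, _, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
    proj = X @ Vt[0]
    return proj <= np.median(proj)

def two_means_split(X, depth, iters=20):
    """Lloyd's algorithm with k = 2; split by cluster assignment.
    (depth is unused; kept for a uniform signature.)"""
    centers = X[rng.choice(len(X), size=2, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for c in range(2):
            if (assign == c).any():
                centers[c] = X[assign == c].mean(axis=0)
    return assign == 0
```

Note the shared structure: the dyadic and kd trees split along axis-parallel directions, while the RP, PD, and 2-Means rules are free to pick directions adapted to the data.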
The fractal notions of intrinsic dimension have the following issues in the context of machine learning:

- They are purely geometric and do not account for the underlying distribution.
- They are not robust to distributional noise: for a noisy manifold, these dimensions can be very high.
- They are difficult to verify empirically.

We therefore need a more statistical notion of intrinsic dimension, one that characterizes the underlying distribution, is robust to noise, and is easy to verify on real-world datasets.

### Some real-world data with low intrinsic dimension

- **Rotating teapot.** One degree of freedom (the rotation angle).
- **Movement of a robotic arm.** Two degrees of freedom, one for each joint.
- **Handwritten characters.** The tilt angle, stroke thickness, etc. govern the final written form.
- **Speech.** A few anatomical characteristics govern the spoken phonemes.
- **Hand gestures in sign language.** Only a few gestures can follow other gestures.

### Experiments

All experiments use 10-fold cross-validation, with trees built on separate training data. [Figures: the number of levels needed to halve the diameter (k) for the dyadic, kd, and RP trees on data of intrinsic dimension d = 1, 2, 3; local covariance dimension estimates at different scales for some real-world datasets.]

- **Vector quantization:** quantization error of test data at different levels of the various partition trees. The 2-Means and PD trees perform best.
- **Nearest neighbor:** quality of the neighbor found at various levels of the partition trees.
- **Regression:** l2 regression error in predicting the rotation angle at different tree levels.

### Theoretical guarantees

Trees such as the RP tree, PD tree, and 2-Means tree adapt to the intrinsic dimension of the data in terms of the rate at which they decrease diameter down the tree. This has strong implications for the performance of these trees on the various learning tasks they are used for.
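The diameter decrease rate k can also be estimated empirically: grow a tree with a given split rule and record the average cell diameter at each level. A minimal sketch of this measurement (our own, not the authors' code), assuming NumPy and an RP-style split:

```python
import numpy as np

rng = np.random.default_rng(0)

def cell_diameter(X):
    """Cheap upper bound on cell diameter: bounding-box diagonal."""
    return np.linalg.norm(X.max(axis=0) - X.min(axis=0))

def rp_split(X):
    """Random-projection split at the median."""
    u = rng.normal(size=X.shape[1])
    proj = X @ u
    return proj <= np.median(proj)

def avg_diameters(X, split, max_depth=8, min_size=4):
    """Average cell diameter at levels 0..max_depth of the tree."""
    cells = [X]
    diams = [cell_diameter(X)]
    for _ in range(max_depth):
        nxt = []
        for c in cells:
            if len(c) < min_size:
                nxt.append(c)          # too small to split further
            else:
                mask = split(c)
                nxt.extend([c[mask], c[~mask]])
        cells = nxt
        diams.append(np.mean([cell_diameter(c) for c in cells]))
    return diams

# Example: points on a 1-d curve embedded in R^20 (low intrinsic dim).
t = rng.uniform(0, 1, size=(2000, 1))
X = np.hstack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t),
               np.zeros((2000, 18))])
diams = avg_diameters(X, rp_split)
```

The smallest k such that `diams[i + k] <= diams[i] / 2` for all levels i then gives an empirical diameter decrease rate, which can be compared across split rules.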
### Local covariance dimension

A set S ⊂ R^D is said to have local covariance dimension (d, ε) if the largest d eigenvalues σ_1^2 ≥ … ≥ σ_D^2 of its covariance matrix satisfy

σ_1^2 + … + σ_d^2 ≥ (1 − ε)(σ_1^2 + … + σ_D^2),

i.e. a (1 − ε) fraction of the total variance is concentrated along d directions. Unlike the fractal notions above, empirical estimates of the local covariance dimension are straightforward to compute from data.

### What we show

Axis-parallel splitting rules (the dyadic and kd trees) do not always adapt to intrinsic dimension; their upper bounds have matching lower bounds. The irregular splitting rules (the RP, PD, and 2-Means trees), on the other hand, always adapt to intrinsic dimension. They therefore tend to perform better on real-world tasks.
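The local covariance dimension lends itself to a direct empirical estimate: at a given scale r, restrict the data to a ball of radius r and find the smallest d whose top-d covariance eigenvalues capture a (1 − ε) fraction of the total variance. A sketch (a hypothetical helper of our own, assuming NumPy):

```python
import numpy as np

def local_covariance_dim(X, center, r, eps=0.1):
    """Smallest d such that the top-d eigenvalues of the covariance of
    the points within distance r of `center` account for a (1 - eps)
    fraction of the total variance."""
    ball = X[np.linalg.norm(X - center, axis=1) <= r]
    if len(ball) < 2:
        return 0
    cov = np.cov(ball, rowvar=False)
    eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]   # descending
    cum = np.cumsum(eigvals)
    # first index where the cumulative sum reaches (1 - eps) * total
    return int(np.searchsorted(cum, (1 - eps) * cum[-1]) + 1)

# Example: a noisy 2-d plane embedded in R^10.
rng = np.random.default_rng(0)
Z = rng.normal(size=(5000, 2))
X = np.hstack([Z, 0.01 * rng.normal(size=(5000, 8))])
d_hat = local_covariance_dim(X, X.mean(axis=0), r=3.0, eps=0.1)
```

Sweeping r over several scales produces curves like the ones shown on the poster: the estimate stays near the manifold dimension at moderate scales even though the ambient dimension is much larger.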
