Accelerated Gradient Method as long as Multi-Task Sparse Learning Problems

Accelerated Gradient Method as long as Multi-Task Sparse Learning Problems

Accelerated Gradient Method as long as Multi-Task Sparse Learning Problems

Jarolim, Edie, Freelance Writer has reference to this Academic Journal, PHwiki organized this Journal Accelerated Gradient Method as long as Multi-Task Sparse Learning ProblemsXi Chen, Weike Pan+, James T. Kwok+, Jaime G. Carbonell Machine Learning DepartmentCarnegie Mellon University+ Computer Science EngineeringThe Hong Kong University of Science & Tech TexPoint fonts used in EMF. Read the TexPoint manual be as long as e you delete this box.: AAAAAAAAAAAAAAContentQ&AConvergence RateMulti-Task Sparse Learning ProblemsExperimental ResultsAccelerated Gradient MethodMulti-Task Learning ProblemsMulti-Task Learning: Combine data from various data sources Potentially exploit the inter-relation among different data

Arizona State University US

This Particular University is Related to this Particular Journal

Multi-Task Learning ProblemsCollection of TasksM linear classifiers Optimization over M tasks jointly:Sparse Multi-Task Learning ProblemsCoefficients as long as a feature (2nd)Coefficients as long as a task (2nd)[Tropp 2006]Solution where only a few features are involved but the involved features will contribute in solving the problems as long as all tasks !Sub-gradient DescentOptimization FormulationSub-gradient method: Convergence Rate :

Gradient update as long as smooth function:Generalized gradient update with non-smooth term:Equivalent View of Gradient DescentNesterov’s Accelerated Gradient Method:Define two alternating sequences in addition to is affine combination of in addition to Per as long as m the generalized gradient update at Special choice of step-size Nesterov’s Accelerated Gradient Method[Nesterov 1983 & 2003] Accelerated Gradient AlgorithmAccelerated Gradient Algorithm

Fast Computation of Generalized Gradient UpdateKey observation: row decomposition: Generalized Gradient Update SubproblemDual Form[J. Duchi et al. 2008]Sorting in addition to thresholding: O(MlogM)ExtensionsConvergence Rate

Comparisons MTL-AGM (Multi-Task Learning – Accelerated Gradient Method)MTL-PGM (Projected Gradient Descent) MTL-FOLOS (Forward Looking Subgradient)Datasets Classification (Letters Dataset) : 8 binary classification tasksRegression (School Dataset): 139 regression tasks (half training , half testing) Experimental Results[A. Quattoni et al. 2009][J. Duchi et al. 2009]Letter Dataset ( =0.01, C=100, weak sparsity)Experimental ResultsHinge loss NOT smooth MTL-AGM Superior Letter Dataset ( =0.05, C=50, strong sparsity)Experimental Results

School Dataset ( =1, C=100, Square Loss, RMSE)Experimental ResultsThank YouGeneralized Gradient Update Sorting & thresholding procedure:Time complexity: O(MlogM)

Jarolim, Edie Jarolim, Edie Freelance Writer

Jarolim, Edie Freelance Writer

Jarolim, Edie is from United States and they belong to Jarolim, Edie and they are from  Tucson, United States got related to this Particular Journal. and Jarolim, Edie deal with the subjects like Beverages; Food; Pets and Pet Products; Restaurants/Dining; Travel

Journal Ratings by Arizona State University

This Particular Journal got reviewed and rated by Arizona State University and short form of this particular Institution is US and gave this Journal an Excellent Rating.