Particle Filtered MCMC-MLE with Connections to Contrastive Divergence
Arthur Asuncion, Qiang Liu, Alexander Ihler, Padhraic Smyth
Department of Computer Science, University of California, Irvine

1. Motivation
Undirected models are useful in many settings. Consider models in exponential family form:
  p(x | theta) = exp(theta' s(x)) / Z(theta)
Task: given i.i.d. data, estimate the parameters accurately and quickly.
Maximum likelihood estimation (MLE): the partition function Z(theta) is usually intractable, so we need to resort to approximate techniques:
- Pseudolikelihood / composite likelihoods
- Sampling-based techniques (e.g. MCMC-MLE)
- Contrastive divergence (CD) learning
We propose particle filtered MCMC-MLE.

2. MCMC-MLE
Widely used in statistics [Geyer, 1991]. Idea: draw samples from an alternate distribution p(x | theta_0) using MCMC to approximate the likelihood, and use importance sampling to estimate the gradient:
  grad ~= E_data[s(x)] - sum_i w_i s(x_i) / sum_i w_i,   where w_i = exp((theta - theta_0)' s(x_i))
Algorithm: run MCMC under p(x | theta_0) until equilibrium; calculate new weights; update theta using the approximate gradient.
Degeneracy problems arise if theta moves far from the initial theta_0.

3. Contrastive Divergence (CD)
A widely-used machine learning algorithm for learning undirected models [Hinton, 2002]. CD can be motivated by taking the gradient of the log-likelihood directly:
  grad log-likelihood = E_data[s(x)] - E_theta[s(x)]
and forming a Monte Carlo approximation of the model expectation. CD-n samples from the current model only approximately:
- Initialize chains at the empirical data distribution
- Run only n MCMC steps
Persistent CD instead initializes the chains at the samples from the previous iteration [Tieleman, 2008].

4. Particle Filtered MCMC-MLE (PF)
Use sampling-importance-resampling (SIR) with MCMC rejuvenation to estimate the gradient, monitoring the effective sample size (ESS) of the particles. If the ESS (the "health" of the particles) is low:
- Resample particles in proportion to their weights w
- Rejuvenate with n MCMC steps based on the current theta
Algorithm: run MCMC under theta for n steps; calculate the weights and check the ESS; if the ESS is low, resample and rejuvenate.
PF avoids MCMC-MLE's degeneracy issues, and can be faster than CD since it only rejuvenates when the ESS is low. As the number of particles approaches infinity, PF recovers the exact log-likelihood gradient. PF can be viewed as a hybrid between MCMC-MLE and CD.

5. Experimental Analysis
Experiments on four model classes:
- Visible Boltzmann machines
- Exponential random graph models (ERGMs), with network statistics: edges, 2-stars, triangles
- Conditional random fields (CRFs)
- Restricted Boltzmann machines (RBMs): experiments on MNIST data, with 500 hidden units

6. Conclusions
Particle filtered MCMC-MLE avoids the degeneracy issues of MCMC-MLE by performing resampling and rejuvenation. It is sometimes faster than CD since it only rejuvenates when needed. There is a unified view of all these algorithms.
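The importance weights and the ESS check that drive both MCMC-MLE and PF can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the sufficient-statistic vectors s(x_i) have already been computed for each sample, and the names are ours.

```python
import math

def importance_weights(stats, theta, theta0):
    # w_i = exp((theta - theta0) . s(x_i)) for samples x_i drawn from the
    # model at theta0; `stats` holds precomputed sufficient-statistic vectors.
    return [math.exp(sum((t - t0) * s for t, t0, s in zip(theta, theta0, si)))
            for si in stats]

def effective_sample_size(weights):
    # ESS = (sum w)^2 / (sum w^2): equals N for uniform weights and
    # collapses toward 1 as a few particles dominate.
    return sum(weights) ** 2 / sum(w * w for w in weights)
```

When theta == theta0 every weight is 1 and the ESS equals the number of particles; as theta drifts away, the weights skew and the ESS drops, which is exactly the degeneracy signal PF monitors.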
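The CD-n recipe (initialize chains at the data, run only n MCMC steps, take the difference of expectations) can be sketched for a small fully visible Boltzmann machine. This is our own toy instantiation under assumed conventions (binary units in {0, 1}, Gibbs sweeps, pairwise weights W and biases b), not code from the poster.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gibbs_sweep(x, W, b, rng):
    # One Gibbs sweep for a fully visible Boltzmann machine with
    # p(x) proportional to exp(sum_i b_i x_i + sum_{i<j} W_ij x_i x_j).
    for i in range(len(x)):
        act = b[i] + sum(W[i][j] * x[j] for j in range(len(x)) if j != i)
        x[i] = 1 if rng.random() < sigmoid(act) else 0

def cd_n_gradient(data, W, b, n, rng):
    # CD-n: start a chain AT each data point, run only n Gibbs sweeps, and
    # use the resulting near-data samples for the model expectation.
    # Gradient estimate for W_ij: E_data[x_i x_j] - E_model[x_i x_j].
    d = len(b)
    grad = [[0.0] * d for _ in range(d)]
    for x in data:
        chain = list(x)
        for _ in range(n):
            gibbs_sweep(chain, W, b, rng)
        for i in range(d):
            for j in range(d):
                grad[i][j] += (x[i] * x[j] - chain[i] * chain[j]) / len(data)
    return grad
```

Persistent CD differs only in that each chain is carried over between parameter updates instead of being re-initialized at the data.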
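The "resample in proportion to w" step of PF might look as follows. The poster does not specify a resampling scheme, so multinomial selection and resetting the weights to 1 afterwards are our assumptions; rejuvenation would then apply n MCMC steps to each surviving particle.

```python
import random

def resample_and_reset(particles, weights, rng=random):
    # Multinomial resampling: draw N particle indices in proportion to the
    # weights, then reset every weight to 1. (Multinomial selection is an
    # assumed detail; systematic resampling would also fit the description.)
    n = len(particles)
    idx = rng.choices(range(n), weights=weights, k=n)
    return [particles[i] for i in idx], [1.0] * n
```

Because resampling duplicates heavy particles, the subsequent MCMC rejuvenation is what restores diversity to the particle set.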
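The three ERGM network statistics named in the experiments (edges, 2-stars, triangles) are easy to compute from an adjacency matrix. A minimal sketch, assuming an undirected graph given as a symmetric 0/1 matrix:

```python
def network_statistics(adj):
    # Edges, 2-stars, and triangles of an undirected graph; a 2-star is an
    # unordered pair of edges sharing a node, i.e. sum over nodes of C(deg, 2).
    n = len(adj)
    edges = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    two_stars = sum(d * (d - 1) // 2 for d in (sum(row) for row in adj))
    triangles = sum(adj[i][j] * adj[j][k] * adj[i][k]
                    for i in range(n)
                    for j in range(i + 1, n)
                    for k in range(j + 1, n))
    return edges, two_stars, triangles
```

For the complete graph on three nodes this yields 3 edges, 3 two-stars, and 1 triangle; these counts are the sufficient statistics s(x) that the ERGM gradient estimates compare between data and model.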