Using Statistical Machine Learning in Cloud Computing


David Patterson, UC Berkeley, Reliable Adaptive Distributed Systems Lab
Image: John Curley, http://www.flickr.com/photos/jay-que/1834540/

Datacenter is the new “server”
- “Program” == Web search, email, map/GIS, ...
- “Computer” == 1000’s of computers, storage, network
- Warehouse-sized facilities and workloads
- New datacenter ideas (2007-2008): truck container (Sun), floating (Google), datacenter-in-a-tent (Microsoft)
- How to enable innovation in new services without first building & capitalizing a large company?
Photos: Sun Microsystems & datacenterknowledge.com

Outline
- Cloud Computing
- RAD Lab
- Case Study 1: Director & SCADS (cloud storage and automatic management)
- Case Study 2: Automatic Log Analysis (find anomalous behavior)
- Case Study 3: Predicting Map Reduce jobs (help with scheduling)
- Summary


Cloud Computing is Hot
- 9/15/09: Federal CIO Vivek Kundra embraces Cloud Computing
- “We’ve been building data center after data center, acquiring application after application, and frankly what it’s done is drive up the cost of technology immensely across the board. What we need is to find a more innovative path in addressing these problems.”

But what is cloud computing, exactly?
- “It’s nothing (new)”
- “... we’ve redefined Cloud Computing to include everything that we already do ... I don’t understand what we would do differently ... other than change the wording of some of our ads.” Larry Ellison, CEO, Oracle (Wall Street Journal, Sept. 26, 2008)

Above the Clouds: A Berkeley View of Cloud Computing
abovetheclouds.cs.berkeley.edu
- 2/09 white paper by RAD Lab PIs and students
  - Clarify terminology around Cloud Computing
  - Quantify comparison with conventional computing
  - Identify Cloud Computing challenges & opportunities
- Why can we offer a new perspective?
  - Strong engagement with industry
  - Users of cloud computing in our own research and teaching in the last 18 months
- Goal: stimulate discussion on what’s really new, without resorting to weather analogies ad nauseam

Utility Computing Arrives
- Amazon Elastic Compute Cloud (EC2)
- “Compute unit” rental: $0.10-0.80/hr.
  - 1 CU ≈ 1.0-1.2 GHz 2007 AMD Opteron/Xeon core
- No up-front cost, no contract, no minimum
- Billing rounded to nearest hour; pay-as-you-go storage also available
- A new paradigm (!) for deploying services

What is it? What’s new?
- Old idea: Software as a Service (SaaS)
  - Basic idea predates MULTICS
  - Software hosted in the infrastructure vs. installed on local servers or desktops; dumb (but brawny) terminals
- New: pay-as-you-go utility computing
  - Illusion of infinite resources on demand
  - Fine-grained billing: release == don’t pay
  - Earlier examples: Sun, Intel Computing Services (longer commitment, more $$$/hour, no storage)
  - Public (utility) vs. private clouds

Why Now (not then)?
- “The Web Space Race”: build-out of extremely large datacenters (10,000’s of commodity PCs)
  - Build-out driven by growth in demand (more users)
  - => Infrastructure software: e.g., Google File System
  - => Operational expertise: failover, DDoS, firewalls
- Discovered economy of scale: 5-7x cheaper than provisioning a medium-sized (100’s of machines) facility
- More pervasive broadband Internet
- Commoditization of HW & SW
  - Fast virtualization
  - Standardized software stacks

Classifying Clouds
- Instruction Set VM (Amazon EC2)
- Managed runtime VM (Microsoft Azure)
- Framework VM (Google AppEngine)
- Tradeoff: flexibility/portability vs. “built in” functionality
- Spectrum: EC2 (lower-level, less managed) -> Azure -> AppEngine (higher-level, more managed)

Cloud Economics 101
- Cloud Computing User: static provisioning for peak is wasteful, but necessary for SLA
- [Figure: “statically provisioned” data center vs. “virtual” data center in the cloud]

Cloud Economics 101
- Cloud Computing Provider: could save energy
- [Figure: “statically provisioned” data center vs. real data center in the cloud]

Risk of Underutilization
- Underutilization results if “peak” predictions are too optimistic
- [Figure: unused resources in a static data center]

Risks of Underprovisioning
- Lost revenue
- Lost users
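The two risks on the Cloud Economics slides can be made concrete with a small sketch. The hourly demand trace, the function names, and the capacity values below are all hypothetical and chosen only for illustration; none of them come from the talk.

```python
import math

def static_capacity(demand, headroom=0.0):
    """Servers a statically provisioned data center must own to cover peak demand."""
    return math.ceil(max(demand) * (1.0 + headroom))

def utilization(demand, capacity):
    """Fraction of owned server-hours actually used (demand above capacity is not served)."""
    served = sum(min(d, capacity) for d in demand)
    return served / (capacity * len(demand))

def unserved(demand, capacity):
    """Server-hours of demand turned away: the under-provisioning risk."""
    return sum(max(d - capacity, 0) for d in demand)

# Hypothetical demand, in servers needed per hour (illustrative numbers only).
demand = [20, 20, 20, 20, 40, 60, 80, 100, 100, 80, 60, 40]

peak = static_capacity(demand)            # provision for the peak: 100 servers
waste = 1.0 - utilization(demand, peak)   # share of owned server-hours left idle
lost = unserved(demand, 60)               # server-hours lost if only 60 servers are owned
elastic_hours = sum(demand)               # pay-as-you-go pays only for hours used
```

With this trace, provisioning for the peak leaves a little under half of the owned server-hours idle, while owning only 60 servers turns away 120 server-hours of demand. An elastic "virtual" data center pays for the 640 server-hours actually used.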

New Scenarios Enabled by “Risk Transfer” to Cloud
- “Cost associativity”: 1,000 CPUs for 1 hour cost the same as 1 CPU for 1,000 hours (@ $0.10/hour)
  - Washington Post converted Hillary Clinton’s travel documents to post on the WWW <1 day after release
  - RAD Lab graduate students demonstrated an improved Hadoop (batch job) scheduler on 1,000 servers
- Major enabler for SaaS startups
  - Animoto traffic doubled every 12 hours for 3 days when released as a Facebook plug-in
  - Scaled from 50 to >3500 servers, then scaled back down

Cloud Computing & Statistical Machine Learning
- Before CC: only performance optimization, on mostly small-scale systems
- CC: detailed cost-performance model
  - Optimization more difficult with more metrics
- CC: everyone can use 1000+ servers
  - Optimization more difficult at large scale
- Economics rewards scale up AND down
  - Optimization more difficult if servers are added/dropped
- SML as optimization difficulty increases

RAD Lab 5-year Mission
- Enable 1 person to develop, deploy, and operate a next-generation Internet application
- Key enabling technology: statistical machine learning
  - Management, scaling, anomaly detection, performance prediction, ...
- Highly interdisciplinary faculty & students
  - PIs: Fox/Katz/Patterson (systems/networks), Jordan (machine learning), Stoica (networks & P2P), Joseph (systems/security), Franklin (databases)
  - 2 postdocs, ~30 PhD students, ~5 undergrads
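The “cost associativity” and Animoto figures from the New Scenarios slide can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the function names are hypothetical, the price is the $0.10/hour quoted on the slide (held in integer cents to avoid rounding noise), and the doubling calculation assumes servers grow in proportion to traffic, which the slide does not state.

```python
def rental_cost_dollars(servers, hours, cents_per_server_hour=10):
    """Pay-as-you-go cost: only the product servers * hours matters."""
    return servers * hours * cents_per_server_hour / 100

def servers_after_doubling(start_servers, elapsed_hours, doubling_period_hours):
    """Server count if capacity doubles once per full doubling period (assumed model)."""
    doublings = elapsed_hours // doubling_period_hours
    return start_servers * 2 ** doublings

burst = rental_cost_dollars(servers=1000, hours=1)    # 1,000 CPUs for 1 hour
steady = rental_cost_dollars(servers=1, hours=1000)   # 1 CPU for 1,000 hours

# Animoto-style growth: doubling every 12 hours for 3 days, starting at 50 servers.
animoto = servers_after_doubling(50, elapsed_hours=72, doubling_period_hours=12)
```

Both rentals come to $100, which is the whole point of cost associativity: finishing 1,000 times sooner costs nothing extra. Six doublings from 50 servers gives 3,200, the same order of magnitude as the “>3500” reported on the slide.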

RAD Lab Prototype: System Architecture
[Figure: architecture diagram. Recoverable labels: new apps, equipment, global policies (e.g., SLA); offered load, resource utilization, etc.; Chukwa & XTrace (monitoring); training data; Director; performance & cost models; Log Mining; Automatic Workload Evaluation (AWE)]

Successes
- Automatically add/drop servers to fit demand, without violating the Service Level Agreement (SLA)
- Predict performance of a complex software system when demand is scaled up
- Distill millions of lines of log messages into an operator-friendly “decision tree” that pinpoints “unusual” incidents/conditions
- Recurring theme: cutting-edge Statistical Machine Learning (SML) works where simpler methods have failed

Outline
- Cloud Computing
- RAD Lab
- Case Study 1: Director & SCADS (cloud storage and automatic management)
- Case Study 2: Automatic Log Analysis (find anomalous behavior)
- Case Study 3: Predicting Map Reduce jobs (help with scheduling)
- Summary

Automatic Management of a Datacenter
- As datacenters grow, need to automatically manage the applications and resources
  - Examples: deploy applications; change configuration; add/remove virtual machines; recover from failures
- Director: mechanism for executing datacenter actions
- Advisors: intelligence behind datacenter management

Director Framework
- Director
  - Issues low-level/physical actions to the DC/VMs: request a VM, start/stop a service
  - Manages configuration of the datacenter: list of applications, VMs, ...
- Advisors
  - Update performance, utilization metrics
  - Use workload, performance models
  - Issue logical actions to the Director: start an app, add 2 app servers
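The split between Director (mechanism) and Advisors (intelligence) might look roughly like the sketch below. This is not the RAD Lab implementation: every class, method, and action name is hypothetical, and a fixed requests-per-server figure stands in for the learned workload and performance models the slides describe.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Director:
    """Mechanism: tracks datacenter configuration and executes low-level actions."""
    vms: dict = field(default_factory=dict)   # app name -> list of VM ids
    log: list = field(default_factory=list)   # low-level actions, in the order issued
    _next_id: int = 0

    def add_servers(self, app, count):
        """Logical action, expanded into request-VM and start-service steps."""
        for _ in range(count):
            vm = f"vm-{self._next_id}"
            self._next_id += 1
            self.log.append(("request_vm", vm))
            self.log.append(("start_service", app, vm))
            self.vms.setdefault(app, []).append(vm)

    def remove_servers(self, app, count):
        """Logical action, expanded into stop-service and release-VM steps."""
        for _ in range(min(count, len(self.vms.get(app, [])))):
            vm = self.vms[app].pop()
            self.log.append(("stop_service", app, vm))
            self.log.append(("release_vm", vm))

class ThresholdAdvisor:
    """Intelligence: decides how many servers an app needs (toy fixed-capacity model)."""
    def __init__(self, app, requests_per_server):
        self.app = app
        self.requests_per_server = requests_per_server

    def advise(self, director, offered_load):
        needed = max(1, math.ceil(offered_load / self.requests_per_server))
        current = len(director.vms.get(self.app, []))
        if needed > current:
            director.add_servers(self.app, needed - current)
        elif needed < current:
            director.remove_servers(self.app, current - needed)
        return needed

director = Director()
advisor = ThresholdAdvisor("web", requests_per_server=100)
advisor.advise(director, offered_load=250)   # scale up to 3 servers
advisor.advise(director, offered_load=90)    # scale back down to 1 server
```

Keeping the Advisor free of any VM-level detail is the design choice the slide points at: a smarter model can replace `ThresholdAdvisor` without touching the code that issues physical actions.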


What About Storage?
- Easy to imagine how to scale up and scale down computation
- Databases don’t scale down, and usually run into limits when scaling up
- What would it mean to have datacenter storage that could scale up and down as well, so as to save money for storage in idle times?

DC Storage Motivation
- Most popular websites follow the same pattern
  - Rapidly developed on SQLServer / PostgreSQL / MySQL / etc.
  - Become popular and realize scaling limitations
  - Build large, complicated ad-hoc systems to deal with scaling limitations as they arise
- Websites that can’t scale fast enough lose customers

SCADS: Scalable, Consistency-Adjustable Data Storage
- Scale Independence: as the user base grows
  - No changes to application
  - Cost per user doesn’t increase
  - Request latency doesn’t change
- Key Innovations
  - Performance-safe query language
  - Declarative performance/consistency tradeoffs
  - Automatic scale up and down using machine learning

Scale Independence Arch
- Developers provide performance-safe queries along with consistency requirements
- Use ML, workload information, and requirements to provision proactively via repartitioning keys and replicas

SCADS Performance Model (on m1.small, all data in memory)
[Figure: plot residue. Recoverable labels: SLA threshold; 5% writes; 1% writes; 99th percentile; median; low workload, low put rate]
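Proactive provisioning against an SLA threshold can be sketched as a search over server counts, driven by a performance model. The latency curve below is a toy closed-form stand-in and not the learned SCADS model; its shape, its parameters (`capacity`, `base_ms`), and both function names are assumptions made only for illustration.

```python
def predicted_p99_ms(rate_per_server, capacity=1000.0, base_ms=10.0):
    """Toy stand-in for a learned model: latency rises sharply near saturation."""
    if rate_per_server >= capacity:
        return float("inf")
    return base_ms / (1.0 - rate_per_server / capacity)

def servers_needed(total_rate, sla_ms, model=predicted_p99_ms, max_servers=10_000):
    """Smallest server count whose predicted 99th-percentile latency meets the SLA."""
    for n in range(1, max_servers + 1):
        if model(total_rate / n) <= sla_ms:
            return n
    raise ValueError("SLA cannot be met within max_servers")

busy = servers_needed(total_rate=5000, sla_ms=100)   # forecast peak load
quiet = servers_needed(total_rate=1000, sla_ms=100)  # forecast idle period
```

Under this toy model the forecast peak needs 6 servers and the idle period needs 2, so the same calculation drives both scale up and scale down. Swapping in a model fitted to measured latencies only requires passing a different `model` function.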

Energy & Cloud Computing
- Cloud Computing saves energy
  - Don’t buy machines for local use that are often idle
  - Better to ship bits as photons over fiber vs. ship electrons over transmission lines to spin disks, power processors locally
  - Clouds use nearby (hydroelectric) power
  - Leverage economies of scale of cooling, power distribution
- Techniques developed to stop using idle servers to save money in Cloud Computing can also be used to save power
  - Up to the Cloud Computing provider to decide what to do with idle resources
- New requirement: scale DOWN and up
  - Who decides when to scale down in a datacenter?
  - How can datacenter storage systems improve energy?

Hybrid / Surge Computing
- Keep a local “private cloud” running the same protocols as the public cloud
- When need more, “surge” onto the public cloud, and scale back when need fulfilled
- Saves energy (and capital expenditures) by not buying and deploying power distribution, cooling, machines that are mostly idle
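The surge policy reduces to a simple split of demand between owned and rented servers. The sketch below assumes demand is measured in whole servers per hour; the function name, the private capacity of 50, and the demand trace are hypothetical.

```python
def split_load(demand, private_capacity):
    """Return (servers run on the private cloud, servers surged onto the public cloud)."""
    local = min(demand, private_capacity)
    return local, demand - local

# Hypothetical hourly demand, in servers (illustrative numbers only).
hourly_demand = [30, 45, 120, 200, 80, 40]
private_capacity = 50

plan = [split_load(d, private_capacity) for d in hourly_demand]
public_server_hours = sum(public for _, public in plan)   # rented only during the surge
owned_if_static = max(hourly_demand)                      # servers to buy without surging
```

For this trace the private cloud owns 50 servers instead of the 200 a peak-provisioned facility would need, and rents 250 public server-hours during the three surge hours.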
