
## Patterns of Influence in a Recommendation Network

Jure Leskovec (CMU), Ajit Singh (CMU), Jon Kleinberg (Cornell)

### Spread of information

- Social networks play a fundamental role in the spread of information or influence.
- Viral marketing (word of mouth): an idea gains sudden widespread popularity.
  - Example: GMail achieved wide popularity, and the only way to obtain an account was through referral.
- In blogs, a piece of information spreads rapidly before eventually being picked up by mass media.

### Information cascades

- Cascades are phenomena in which an action or idea becomes widely adopted due to influence by others.
- Traditionally, sociologists studied the diffusion of innovation:
  - Hybrid corn (Ryan and Gross, 1943)
  - Prescription drugs (Coleman et al. 1957)


### Cascade formation process

Time: t1 < t2 < ... < tn

[Figure: cascade formation over time. Legend: node that received a recommendation and propagated it forward; node that received a recommendation but didn't propagate it.]

### Work on information cascades

Cascades have also been studied to:

- Select trendsetters for viral marketing (Kempe et al. 2003, Richardson et al. 2002)
- Find inoculation targets in epidemiology (Newman 2002)
- Explain trends in blogspace (Adar and Adamic 2005, Gruhl et al. 2004)

Since it is hard to obtain reliable data on cascades, previous studies focused primarily on large-scale (coarse) analysis.

### Our work

We look at the fine-grained patterns of influence in a large-scale, real recommendation network. Given a directed who-influences-whom graph, we:

- Find cascades
- Examine their topological structure:
  - What kinds of cascades arise frequently in real life?
  - Are they like trees, stars, or something else?
  - What is the distribution of cascade sizes (all the same size, exponential tail, or heavy-tailed)?

### Roadmap

- The recommendation network dataset
- Proposed method:
  - Identifying cascades
  - Enumerating cascades
  - Counting cascades (approximate graph isomorphism)
- Experimental results:
  - Distribution of cascade sizes
  - Frequent cascade subgraphs
- Conclusion

### The data: recommendation network

- Senders and followers of recommendations receive discounts on products
- Recommendations are made to any number of people at the time of purchase

### The data: recommendations

For each recommendation we have:

- sender ID
- recipient ID
- recommendation time
- response (buy / no buy)
- purchase time

### The data: description

- A large online retailer (June 2001 to May 2003)
- Over a gigabyte in size
- 15,646,121 recommendations
- 3,943,084 distinct customers
- 548,523 products recommended, 99% of them
belonging to 4 main product groups: books, DVDs, music CDs, and VHS

### The data: statistics

- Networks are very sparsely connected (low average degree)
- 9% of DVD purchases are due to recommendations
- Book recommendations are influential

### Product recommendation network

- The majority of recommendations cause neither purchases nor propagation
- Notice many star-like patterns
- Many disconnected components

### Identifying cascades

Given a set of recommendations, find cascades. We use the following approach:

- Create a separate graph for each product.
- Delete late recommendations: delete recommendations that happened after the first purchase of the product. We get a time-increasing graph.
- Delete no-purchase nodes: we find many star-like patterns with no propagation of influence, so we delete nodes that did not purchase a product.
- Now connected components correspond to maximal cascades.

### Cascade enumeration

- Maximal cascades do not reveal the building blocks (local structures) of cascades.
- Given a maximal cascade, we want to enumerate all local cascades: for every node, we explore the cascade in its neighborhood up to 1, 2, 3, ... steps away.
- This way we capture the local structure of the cascade around the node.

[Figure: local cascades around a source node, 1 step away and 2 steps away.]

### Counting cascades (graph isomorphism)

- To count cascades, we need to determine whether a new cascade is isomorphic to one already seen.
- No polynomial-time graph isomorphism algorithm is known, so we resort to an approximate solution.
- Graphs are isomorphic if there exists a node mapping such that mapped nodes have the same neighbors.

We do not compare the graphs directly. Instead, for each graph we create a signature. A good signature is one where isomorphic graphs have the same signature, but few non-isomorphic graphs share the same signature.
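The cascade identification approach described under "Identifying cascades" above (a graph per product, removal of late recommendations and of non-purchasing nodes, then connected components) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The record layout `(sender, recipient, product, rec_time, purchase_time)` and the function name `maximal_cascades` are hypothetical, and the sketch assumes that a "late" recommendation is one that arrives after the recipient's own first purchase and that every sender counts as a purchaser (since recommendations are made at the time of purchase).

```python
from collections import defaultdict


def maximal_cascades(recs):
    """Group recommendations into maximal cascades, one product at a time.

    recs: iterable of (sender, recipient, product, rec_time, purchase_time),
    where purchase_time is None for a "no buy" response (assumed layout).
    Returns {product: [set_of_node_ids, ...]}.
    """
    # A separate edge list for each product.
    by_product = defaultdict(list)
    for sender, recipient, product, rec_time, purchase_time in recs:
        by_product[product].append((sender, recipient, rec_time, purchase_time))

    cascades = {}
    for product, edges in by_product.items():
        # Earliest recorded purchase time of each recipient who bought.
        first_buy = {}
        for _, recipient, _, purchase_time in edges:
            if purchase_time is not None:
                earlier = first_buy.get(recipient, purchase_time)
                first_buy[recipient] = min(earlier, purchase_time)
        # Assumption: senders recommend at purchase time, so they are buyers.
        buyers = set(first_buy) | {sender for sender, _, _, _ in edges}

        # Skip late recommendations and recommendations to non-purchasers.
        neighbors = defaultdict(set)
        for sender, recipient, rec_time, _ in edges:
            if recipient not in buyers:
                continue  # recipient never purchased
            if recipient in first_buy and rec_time > first_buy[recipient]:
                continue  # arrived after the recipient's first purchase
            neighbors[sender].add(recipient)
            neighbors[recipient].add(sender)  # undirected view for components

        # Connected components of what remains are the maximal cascades.
        seen, components = set(), []
        for start in neighbors:
            if start in seen:
                continue
            seen.add(start)
            stack, component = [start], set()
            while stack:
                node = stack.pop()
                component.add(node)
                for other in neighbors[node]:
                    if other not in seen:
                        seen.add(other)
                        stack.append(other)
            components.append(component)
        if components:
            cascades[product] = components
    return cascades


example = [
    ("a", "b", "dvd1", 1, 2),      # b buys at time 2
    ("b", "c", "dvd1", 3, 4),      # c buys at time 4
    ("a", "d", "dvd1", 1, None),   # d never buys
    ("e", "c", "dvd1", 5, 4),      # late: sent after c already bought
    ("x", "y", "book1", 1, None),  # no propagation at all
]
print(maximal_cascades(example))
```

In this sketch, buyers who end up with no remaining recommendation edges are simply left out rather than reported as single-node cascades; whether such nodes should count is a design choice the slides do not settle.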
We then compare the graph signatures.

### Creating a signature

- We propose a multilevel approach.
- Complexity (and accuracy) depends on the size of the graph.
- Different levels of the signature, ordered from simple (fast, less accurate) to complex (slow, more accurate):
  - Number of nodes, number of edges
  - Sorted in- and out-degree sequence
  - Singular values of the graph adjacency matrix
  - For small graphs (n < 9) we perform an exact isomorphism test

### Comparing signatures

- First compare simple signatures.
- Compare the graphs with the same simple signature using more and more complicated (expensive, accurate) signatures.
- At the end (for small graphs) we perform exact isomorphism resolution.
- Since we are interested in the building blocks of cascades, which are generally small, the precision for small graphs is more important.

### Comparing signatures: example

- Compare simple signature (number of nodes/edges)
- Compare simple signature (degree sequence)
- Compare simple signature (singular values)

### Counting subgraphs: related work

Work on frequent subgraph mining:

- Apriori-based algorithm (Inokuchi et al. 2000)
- gSpan (Yan and Han, 2002)
- Kuramochi and Karypis 2004; Pei, Jiang and Zhang 2005; and many more

This work mainly focuses on richly labeled undirected graphs (e.g. chemical compounds). We are interested in enumerating subgraphs based only on their structure:

- We have no labels on nodes and edges.
- So heuristics for pruning the search space using node and edge labels cannot be applied.

### Measuring maximal cascade sizes

- Count how many people are in a single cascade.
- We observe a heavy-tailed distribution, which cannot be explained by a simple branching process.

[Figure: cascade size distribution for books; very few large cascades.]
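The multilevel comparison described under "Creating a signature" and "Comparing signatures" above could look roughly like the sketch below. It is an illustration under stated assumptions rather than the authors' code: all function names are hypothetical, a graph is assumed to be a `(nodes, edges)` pair with directed edges as `(u, v)` tuples, and the singular-value level is left out to keep the example dependency-free (it would need a linear algebra library). The exact test is a brute-force search over node permutations, which is only practical for the small graphs (n < 9) mentioned in the slides.

```python
from itertools import permutations


def simple_signature(nodes, edges):
    # Level 1: number of nodes and number of edges.
    return (len(nodes), len(edges))


def degree_signature(nodes, edges):
    # Level 2: sorted in-degree and out-degree sequences.
    indeg = {v: 0 for v in nodes}
    outdeg = {v: 0 for v in nodes}
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
    return (tuple(sorted(indeg.values())), tuple(sorted(outdeg.values())))


def exactly_isomorphic(nodes_a, edges_a, nodes_b, edges_b):
    # Brute force: try every node mapping (assumes equal node/edge counts).
    nodes_a = list(nodes_a)
    for perm in permutations(nodes_b):
        mapping = dict(zip(nodes_a, perm))
        if {(mapping[u], mapping[v]) for u, v in edges_a} == edges_b:
            return True
    return False


def same_cascade(graph_a, graph_b, exact_limit=9):
    """Cheap signatures first, exact test last (small graphs only)."""
    nodes_a, edges_a = graph_a[0], set(graph_a[1])
    nodes_b, edges_b = graph_b[0], set(graph_b[1])
    if simple_signature(nodes_a, edges_a) != simple_signature(nodes_b, edges_b):
        return False
    if degree_signature(nodes_a, edges_a) != degree_signature(nodes_b, edges_b):
        return False
    # The singular-value level would go here; omitted in this sketch.
    if len(nodes_a) < exact_limit:
        return exactly_isomorphic(nodes_a, edges_a, nodes_b, edges_b)
    return True  # approximate answer: all computed signatures agree


def count_cascades(graphs):
    """Return [representative_graph, count] pairs, one per cascade shape."""
    classes = []
    for graph in graphs:
        for entry in classes:
            if same_cascade(entry[0], graph):
                entry[1] += 1
                break
        else:
            classes.append([graph, 1])
    return classes


chain = (["a", "b", "c"], [("a", "b"), ("b", "c")])
relabeled_chain = (["x", "y", "z"], [("y", "z"), ("x", "y")])
split = (["a", "b", "c"], [("a", "b"), ("a", "c")])
for representative, count in count_cascades([chain, split, relabeled_chain]):
    print(count, representative[1])
```

Ordering the checks from cheapest to most expensive means most non-matching pairs are rejected by the node/edge counts or degree sequences before any permutation search is attempted.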

### Cascade sizes for DVDs

- DVD cascades can grow large.
- This is possibly a product of websites where people sign up to exchange recommendations.

[Figure: cascade size distribution for DVDs; shallow drop-off, fat tail: a number of large cascades.]

### Music CD and VHS cascades

- Music and VHS cascades don't grow large.

[Figures: cascade size distributions for music and VHS.]

### Frequent cascade subgraphs (1)

General observations:

- DVDs have the richest cascades (most recommendations, most densely linked).
- Books have small cascades.
- Music is 3 times larger than video but does not have much variety in cascades.

[Table: number of all words and vocabulary size per product group, shaded from high to low.]

### Frequent cascade subgraphs (2)

- [figure] is the most common cascade subgraph. It accounts for ~75% of cascades in books, CD and VHS, but only 12% of DVD cascades.
- [figure] is 6 (1.2 for DVD) times more frequent than [figure].
- For DVDs, [figure] is more frequent than [figure].
- Chains ([figure]) are more frequent than [figure].
- [figure] is more frequent than a collision ([figure]) (but a collision has fewer edges).
- Late split ([figure]) is more frequent than [figure].

### Typical classes of cascades

- No propagation
- Common friends
- Nodes having the same friends
- A complicated cascade

### Conclusion (1)

- Cascades are a form of collective behavior.
- We developed a scalable algorithm for identifying and counting cascades (approximate graph isomorphism).
- We illustrate the existence of cascades and measure their frequencies in a large real-world dataset.

### Conclusion (2)

From our experiments we found:

- Most cascades are small, but large bursts can occur.
- Cascade sizes follow a heavy-tailed distribution.
- The frequency of different cascade subgraphs depends on the product type.
- Cascade frequencies do not simply decrease monotonically for denser subgraphs, but reflect more subtle features of the domain in which the recommendations are operating.

Thank you! Questions? jure@cs.cmu.edu
