Huntingthetruffle:Predicting ... · Shulman et al., Predictability of popularity: Gaps between...

Hunting the truffle: Predicting success in social systems by tracking key individuals Computational Social Science Seminar Zurich, 24.09.2019 Manuel Sebastian Mariani URPP Social Networks (University of Zurich) Institute of Fundamental and Frontier Sciences (UESTC, Chengdu) | 31.01.2019


Hunting the truffle: Predicting success |


Individuals whose early adoptions predict success

Research questions

1. Are there individuals who repeatedly purchase in recently opened shops that later become successful? Who repeatedly adopt, at an early stage, innovations that later succeed?

2. If there are, how can we use them for success predictions?

3. If there are, which socioeconomic, demographic, and behavioral traits characterize them? Are they social hubs?


Contribution

Success prediction in social systems.
■ Linking individual-level behavioral patterns and the emergence of success.
■ Tracking and targeting the "right" individuals can help companies predict and enhance the success of their new products’ diffusion.

Important nodes in social systems.
■ Most of the literature on opinion leaders and influencers has focused on centrality.
■ Unlike these studies, we search for individuals whose adoptions predict success directly from the purchase time series, without looking into social network data.
■ We quantify the out-of-sample predictive power of different groups of individuals.


Outline

1 Success prediction in social systems: Background

2 Influencers: Do they have predictive power?

3 Discoverers of success

4 Quantifying the predictive power of different groups of individuals

5 Who are the discoverers of success?

6 Open challenges and take-home messages


Success prediction in social systems: Background

Hunting the truffle: Predicting success | 1. Success prediction in social systems: Background


Prediction in social systems

There is increasing, interdisciplinary interest in prediction in social systems.

Topics covered: Predicting online cascades, election outcomes, scientific impact, policy implications, and more.

■ Increasing availability of high-resolution data on socioeconomic and information systems (Lazer et al., Science, 2009; Gao et al., Physics Reports, 2019).

■ Growing interest among computational scientists in traditionally social scientific topics, e.g., the evolution of social networks, the diffusion of information, and the generation of inequality (Hofman et al., Science, 2017).


Predicting success in social systems

Success prediction

■ Recent interdisciplinary efforts have advanced our ability to predict success in diverse social systems.

■ Success is viewed as a collective phenomenon that emerges from the actions and interactions of the members of the social system.

■ It is typically measured in terms of popularity-based metrics.

Two main approaches to success prediction:
■ Modeling the success dynamics.
■ Machine learning.


Two main approaches in the literature on success

1. Modeling the success dynamics.
■ We start by unveiling the basic mechanisms that govern the dynamics of success.
■ We design an aggregate dynamic model that includes the observed mechanisms.
■ We validate the model by using it to predict future success based on early data.

2. Machine learning.
■ We design a classification/regression model that includes multiple features.
■ We aim to understand which combination of features leads to the best predictive accuracy.
■ We attempt to interpret the most predictive features.


The dynamics of success

In social systems, success can often be modeled as a combination of preferential attachment, fitness, and aging.

■ Preferential attachment. Future popularity increase is proportional to current popularity.

■ Fitness. The popularity you would get in the absence of preferential attachment.

■ Aging. Attractiveness decays over time.

Medo et al., PRL 2011
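The three ingredients can be combined into a single attachment weight. The sketch below is only an illustration: the exponential form of the aging function, the decay rate, and all item values are assumptions made for the example, not quantities taken from the talk.

```python
import math

def attachment_probabilities(popularity, fitness, age, decay=0.1):
    """Probability that each item receives the next adoption.

    Weight of item i = popularity_i * fitness_i * exp(-decay * age_i).
    The exponential aging function and the decay rate are illustrative
    assumptions, not values from the talk.
    """
    weights = [
        c * eta * math.exp(-decay * a)
        for c, eta, a in zip(popularity, fitness, age)
    ]
    total = sum(weights)
    return [w / total for w in weights]

# Three hypothetical items: the first two differ only in fitness,
# the third is brand new but has little popularity so far.
probs = attachment_probabilities(
    popularity=[10, 10, 1],
    fitness=[1.0, 2.0, 2.0],
    age=[5, 5, 0],
)
print([round(p, 3) for p in probs])
```

With equal popularity and age, doubling the fitness doubles the chance of receiving the next adoption, while the young item partly compensates for its low popularity through the absence of aging.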


Success and fitness/performance/talent are not equivalent

■ Found in an experimental study by Salganik et al., Science 2006, and in empirical data on paper citations and the WWW.

■ Present in success dynamics models.

Relevance model

■ Probability that paper $i$ receives a new citation at time $t$:

$$P_i(t) \sim c_i(t)\,\eta_i\,f(t - t_i), \tag{1}$$

where $c_i(t)$ denotes previous citations, $\eta_i$ the fitness, and $f(t - t_i)$ an aging function (Medo et al., PRL 2011).

■ According to mean-field theory, the expected citation count of paper $i$ is

$$c_i(\infty) \sim \exp(A\,\eta_i). \tag{2}$$

■ Small differences in fitness can lead to wide differences in success.
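As a worked illustration of Eq. (2), consider two papers $i$ and $j$. Assuming the same proportionality constant and the same $A$ for both, the ratio of their expected final citation counts depends exponentially on the fitness gap. The numerical values below are hypothetical and chosen only to make the arithmetic concrete.

```latex
\frac{c_i(\infty)}{c_j(\infty)} \sim \exp\bigl(A\,(\eta_i - \eta_j)\bigr),
\qquad
A = 5,\ \eta_i - \eta_j = 0.2
\;\Rightarrow\;
\frac{c_i(\infty)}{c_j(\infty)} \sim e^{1} \approx 2.72 .
```

Under the same assumptions, doubling the fitness gap to 0.4 squares the ratio (about 7.39), which is the sense in which small fitness differences are amplified.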


Predicting scientific impact and bestseller sales

Success trajectories follow "universal" patterns. By properly rescaling the citation trajectories of different papers, a single dynamic curve is obtained.

Wang et al., Science (2013); Yucesoy et al., EPJ Data Science (2018).


Early detection of milestone papers and patents

Network centralities can substantially outperform citation counts in the early detection of milestone papers and patents.

Mariani et al., Journal of Informetrics (2016), Technol. Forecast. Soc. Change (2019).


Predicting individual scientific impact

The timing of a researcher’s "largest breakthrough" was found to be random.

Sinatra et al., Science (2016).


Predicting actors’ peak year

We can accurately predict if an actor has already achievedhis/her productivity peak or not.

Williams et al., Nature Communications (2019).


Machine-learning approach

Example: Predicting the future success of an online cascade based on early activity.

Features

■ Temporal features. Early-adoption speed.
■ Structural features. The structure of the network around the early adopters.
■ Early adopters’ features. Specific information about the early adopters (popularity, reputation, activity levels).
■ Similarity features. How similar the early adopters are. Difference between niche items vs. items of broad interest.
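To make two of the feature families concrete, the sketch below computes one temporal feature and one early-adopter feature from the first k adoptions of a cascade. The function name, the input layout, and the toy numbers are hypothetical. Structural and similarity features are left out because they would require network data.

```python
from statistics import mean

def early_cascade_features(adoption_times, adopter_popularity, k=5):
    """Features computed from the first k adoptions of one cascade.

    adoption_times: time of each adoption, measured from the first post
                    (hypothetical unit, e.g. hours).
    adopter_popularity: a popularity score of each adopter, in the same
                        order (hypothetical, e.g. follower counts).
    """
    if len(adoption_times) < k or len(adopter_popularity) < k:
        raise ValueError("need at least k early adoptions")
    # Indices of the k earliest adoptions.
    order = sorted(range(len(adoption_times)), key=lambda j: adoption_times[j])[:k]
    return {
        # Temporal feature: a smaller value means faster early adoption.
        "time_to_kth_adoption": adoption_times[order[-1]],
        # Early adopters' feature: average popularity of the first k adopters.
        "mean_early_adopter_popularity": mean(adopter_popularity[j] for j in order),
    }

features = early_cascade_features(
    adoption_times=[1, 2, 4, 8, 16, 32],
    adopter_popularity=[10, 20, 30, 40, 50, 60],
    k=5,
)
print(features)
```

Such a feature dictionary would then be fed to a classification or regression model of the analyst's choice.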


Predicting online cascades based on early adoptions

The speed of early adoption is the best predictor of content popularity, but properties of the early adopters might be predictive as well.

Chen et al., WWW (2014), Shulman et al., AAAI Conference on Web and Social Media (2016).


Predicting online cascades a priori

Content that evokes high-arousal positive (awe) or negativeemotions (anger, anxiety) is more viral.

Berger et al., Journal of Marketing Research, 2011.


Researcher degrees of freedom

■ Choose the data and the items. For example, online cascades.
■ Define a success variable. For example, the total number of reshares.

Hofman, Sharma and Watts, Science 355, 486-488 (2017).


More degrees of freedom

Problem choice.
■ Ex-ante predictions. We attempt to use only information available before each store is opened.
■ Early detection. We are allowed to "peek" into early activity data on the store, e.g., on the early purchases made in the store.
■ The general expectation is that predictive performance is higher in the latter scenario.

Which predictive model?
■ Simple models might have a clear interpretation.
■ Complex models with many features might lead to better performance, but reduced interpretability.
■ There can be a tradeoff between accuracy and interpretability.


Early detection: Peeking-based strategies

Problem

Given a set of items and data on their early adoptions, which are the most likely ones to become successful?

■ One question, a broad range of formulations.
■ How do we define the early-adoption window?
■ How much activity can we look into?
■ How do we define successful items?
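The questions above become explicit parameters once the problem is written down. The following sketch is one possible formulation among many, not the one used in any cited study: it assumes a fixed time window after launch, ranks items by early adoption counts, defines success as the top fraction of items by final adoptions, and scores the ranking with precision at k. All names and data are hypothetical.

```python
def early_ranking(adoptions, window):
    """Rank items by the number of adoptions within `window` time units of launch.

    adoptions: dict mapping item -> list of adoption times since launch.
    """
    early = {
        item: sum(1 for t in times if t <= window)
        for item, times in adoptions.items()
    }
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(early, key=lambda item: (-early[item], item))

def successful_items(adoptions, top_fraction):
    """Items in the top `top_fraction` by final number of adoptions."""
    final = {item: len(times) for item, times in adoptions.items()}
    n_top = max(1, round(top_fraction * len(final)))
    ranked = sorted(final, key=lambda item: (-final[item], item))
    return set(ranked[:n_top])

def precision_at_k(ranking, successes, k):
    """Fraction of the top-k ranked items that turned out to be successful."""
    return sum(1 for item in ranking[:k] if item in successes) / k

# Toy data: item "d" starts late but ends up popular.
adoptions = {
    "a": [1, 2, 3, 50, 60, 70, 80],
    "b": [1, 40],
    "c": [2, 3, 4, 5],
    "d": [30, 31, 32, 33, 34, 35],
}
ranking = early_ranking(adoptions, window=7)
successes = successful_items(adoptions, top_fraction=0.5)
print(ranking, successes, precision_at_k(ranking, successes, k=2))
```

Changing `window` or `top_fraction` changes which items count as early leaders and which as successes, which is one way conclusions can differ between studies.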


Different studies, different choices

Shulman et al., Predictability of popularity: Gaps between prediction and understanding, 2016

Various choices in the literature; conclusions are not always consistent.


Predicting success in social systems: choices

■ Classification vs. regression.
■ Evaluation metrics?
■ Ex-ante predictions (explanation) vs. early detection.
■ Accuracy vs. interpretability.

Hybrid approach

Start from a question of interest, and then design the prediction problem (and methods adopted) to answer the question.

Hofman, Sharma and Watts, Science 355, 486-488 (2017).


Predicting success in social systems: Main insights

■ "Success" can be quantified and predicted in diverse areas of human activity: Online content (cascades), Science (papers, researchers), Art (artists), Show business (actors), Bestsellers (authors, books).

■ Two main approaches to the success prediction problem:
    ■ Models of success dynamics.
    ■ Machine learning.

■ Through success dynamics models, researchers have found dynamical patterns that generalize across domains of human activity.

■ Machine learning approaches are typically used to predict the future popularity of online content.

■ Many researcher degrees of freedom need to be fixed, and the choice can substantially affect the results.


Influencers: Do they have predictive power?

Hunting the truffle: Predicting success | 2. Influencers: Do they have predictive power?


Linking individual-level behavioral patterns and success prediction

■ Usually, research on success and innovation diffusion studies individual-level behavioral patterns and success/diffusion dynamics in isolation.

■ A long-standing assumption is that some individuals – referred to as influencers – can have a disproportionate influence on spreading processes.

Linking individuals and success

The main attempts to link individuals and success predictions have been made for the influencers in online systems.

■ Can the influencers accelerate a diffusion process?
■ Are the influencers’ adoptions predictive of diffusion success?


Influencers: three dimensions

■ Who one is. Personality traits, socio-demographic backgrounds, and lifestyles.

■ What one knows. Competence, such as one’s knowledge, expertise, or ability to provide information or guidance on particular issues.

■ Whom one knows. The structural position of the person in a network.

Muller and Peres, International Journal of Research in Marketing (2019), 36 (1): 3-19


The role of influencers: triggering larger cascades

According to diffusion models, influencers can trigger large spreading processes when targeted.

F. Iannelli, M. S. Mariani, and I. M. Sokolov. Physical Review E 98.6 (2018): 062302.


The role of influencers: rapidly dismantling a social network

F. Morone and H. A. Makse. Nature 524.7563 (2015): 65.


The role of influencers: accelerating empirical diffusion

The influencers’ adoptions are predictive of success

■ Goldenberg et al. (2009) found that in an online social network, a small sample of social hubs offers accurate early-stage success predictions.

■ Chen et al. (2014) found that in predictive models with many different features, the early adoptions by central individuals are among the most significant predictors of success.


The role of influencers: inconsistent predictive signal

When using network-related traits for prediction, results do not generalize across platforms.

■ Weng et al. (2013) found that the number of communities infected by a piece of content is a more important predictor than the early adopters’ centrality.

■ Shulman et al. (2016) found that features based on the network structure around the early adopters are not consistently predictive of success.


The role of influencers: bottom line

■ According to diffusion models, influencers can trigger large spreading processes when targeted.

■ Removing the influencers from a network can rapidly dismantle it.

■ If our goal is to predict success, existing studies suggest that the predictive signal from the influencers is inconsistent.

Unlike most existing literature, we search for individuals relevant to success prediction directly from transaction data, without looking into social network data.


Discoverers of success

Hunting the truffle: Predicting success | 3. Discoverers of success


Defining the discoverers

Discoverers

Individuals who are repeatedly among the first ones to adopt innovations that later gain many adopters.

■ We will identify them from the purchase time series through a general procedure.

1. We count the number of discoveries, $d^*_i$, per individual.

2. We introduce a null model: everyone has the same likelihood of making a discovery.

3. Is $d^*_i$ surprising?

The identification procedure does not require social network data.
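The three steps can be sketched as a simple significance test. The slide does not specify the statistic, so the version below is an assumption: under the null model every purchase has the same pooled probability of being a discovery, and the count of an individual is judged by the upper tail of a binomial distribution. The input layout and the toy numbers are hypothetical.

```python
from math import comb

def binomial_tail(n, d, p):
    """P(X >= d) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(d, n + 1))

def discoverer_p_values(purchases, discoveries):
    """Probability, under the null model, of at least the observed discoveries.

    purchases[i]: number of purchases made by individual i (assumed to be
                  the number of opportunities to make a discovery).
    discoveries[i]: observed number of discoveries of individual i.
    Null model: every purchase is a discovery with the same pooled probability.
    """
    p = sum(discoveries.values()) / sum(purchases.values())
    return {
        i: binomial_tail(purchases[i], discoveries[i], p)
        for i in purchases
    }

# Toy example: u1 makes many more discoveries than the pooled rate suggests.
purchases = {"u1": 10, "u2": 10, "u3": 10}
discoveries = {"u1": 5, "u2": 0, "u3": 1}
p_values = discoverer_p_values(purchases, discoveries)
print({i: round(v, 4) for i, v in p_values.items()})
```

A small tail probability means the observed count is surprising under the null model, so the individual is a candidate discoverer. With many individuals, a correction for multiple testing would likely be needed.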


A nationwide socioeconomic system

We analyze two large, nationwide datasets:
■ Credit-card records (CCRs) from a large bank over a three-year temporal window (from June 2015 to May 2018).
■ Call data records (CDRs) from a large mobile phone operator over a one-year temporal window (2016).

Matching

We can partially match the two datasets:
■ The bank customers had to provide a mobile phone number to the bank.
■ Among the 251,405 telco customers, 146,762 (58.4%) are also bank customers.
■ Among the 1,417,937 bank customers, 146,762 (10.4%) are also telco customers.
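The two overlap percentages follow directly from the three counts on the slide. A short check, with variable names chosen here for illustration:

```python
telco_customers = 251_405
bank_customers = 1_417_937
matched = 146_762  # customers present in both datasets

share_of_telco = 100 * matched / telco_customers
share_of_bank = 100 * matched / bank_customers

print(f"{share_of_telco:.1f}% of telco customers are also bank customers")
print(f"{share_of_bank:.1f}% of bank customers are also telco customers")
```

The same matched group is a majority of the telco sample but only about a tenth of the bank sample, so analyses that need the communication network cover a small share of the bank customers.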


Networks

We can build two networks:
■ CCRs → Individual-shop transaction network (18 months): who bought where at which time. We consider three networks corresponding to three shop categories:
  ■ Eating places (restaurants, bars, drinking places, etc.).
  ■ Clothing stores (children's wear stores, shoe stores, etc.).
  ■ Food stores (grocery stores, supermarkets, candy stores, etc.).
■ CDRs → Individual-individual communication network (12 one-month snapshots of the undirected, unweighted network): who is connected with whom.


Defining the discoverers I: Counting discoveries

■ The discoverers are defined in terms of discoveries.
■ Popular shops. A shop α is considered popular if, compared against shops of the same category launched in the same month, it is among the top-z% by final number of visitors, v.
■ Discovery. Individual i discovers shop α if i purchases in shop α no later than ∆ days after α is opened, and α turns out to be a popular shop.
■ ∆ and z are parameters of the method (in the following, z% = 10% and ∆ = 90 days).
■ For each individual i, we count their number of discoveries, $d_i^*$.
■ Is $d_i^*$ surprising?
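The counting step above can be sketched as follows. This is an illustrative sketch rather than the authors' code: the data layout, the function names, the rounding of the top-z% cut-off, and the tie-breaking are all assumptions made here.

```python
from collections import defaultdict
from datetime import date, timedelta


def popular_shops(shops, final_visitors, z=0.10):
    """shops: {shop: (category, launch_month)}; final_visitors: {shop: v}.

    Returns the shops in the top-z fraction of their (category, month) group.
    Assumption: at least one shop per group is kept; ties are broken arbitrarily.
    """
    groups = defaultdict(list)
    for shop, key in shops.items():
        groups[key].append(shop)
    popular = set()
    for members in groups.values():
        members.sort(key=lambda s: final_visitors[s], reverse=True)
        n_top = max(1, round(z * len(members)))
        popular.update(members[:n_top])
    return popular


def count_discoveries(purchases, opening, popular, delta_days=90):
    """purchases: iterable of (individual, shop, purchase_date).

    A discovery is a purchase in a popular shop within delta_days of its opening;
    each (individual, shop) pair is counted once.
    """
    discovered = defaultdict(set)
    for person, shop, day in purchases:
        if shop in popular and timedelta(0) <= day - opening[shop] <= timedelta(days=delta_days):
            discovered[person].add(shop)
    return {person: len(found) for person, found in discovered.items()}


# Toy data: ten eating places launched in the same month; s9 ends up most visited.
shops = {f"s{n}": ("eating", "2016-01") for n in range(10)}
final_visitors = {f"s{n}": n + 1 for n in range(10)}
opening = {shop: date(2016, 1, 1) for shop in shops}
purchases = [
    ("alice", "s9", date(2016, 1, 6)),   # early purchase in a popular shop
    ("alice", "s9", date(2016, 2, 1)),   # repeated purchase, counted once
    ("bob", "s9", date(2016, 6, 1)),     # more than 90 days after opening
    ("carol", "s0", date(2016, 1, 2)),   # early, but the shop is not popular
]
popular = popular_shops(shops, final_visitors)
counts = count_discoveries(purchases, opening, popular)
```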


Defining the discoverers II: Introducing a null model

The null model
We define a statistical null model where each individual is equally likely to make a discovery.

■ L is the total number of links, D the total number of discoveries, and $k_i$ the number of shops visited by i.
■ There are L marbles inside an urn, D of which are discoveries.
■ Individual i extracts $k_i$ marbles without replacement.
■ We expect $d_i = p\,k_i$ discoveries, where $p = D/L$.
■ The number of discoveries by individual i follows the hypergeometric distribution.
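For reference, the urn picture above corresponds to the standard hypergeometric probability mass function. Written in the slide's symbols (with d, a symbol introduced here only for illustration, denoting a possible number of discoveries), it reads:

```latex
P(d_i = d) = \frac{\binom{D}{d}\,\binom{L-D}{k_i-d}}{\binom{L}{k_i}},
\qquad
\mathbb{E}[d_i] = k_i\,\frac{D}{L} = p\,k_i .
```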


Defining the discoverers III: Is $d_i^*$ surprising?

Observations vs. expectation
The discoverers' number of discoveries is so high that it cannot be explained by chance (i.e., by the null model).

■ We define the statistical surprisal, $S_i(d_i^* \mid k_i)$, associated with the observed number of discoveries $d_i^*$:

$$S_i(d_i^* \mid k_i) = -\log P(d_i \geq d_i^*)$$

■ $P(d_i \geq d_i^*)$ represents the probability that the individual would have made at least $d_i^*$ discoveries under the null model.
■ Low P → it is unlikely that individual i achieved $d_i^*$ discoveries by chance → high S.
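A minimal sketch of the surprisal computation, using only the Python standard library. The function names are hypothetical, and the natural logarithm is an assumption (the slide writes only "log"). Exact integer binomials are used, which is fine for small examples. For very small tail probabilities the float may underflow to zero, so a production version would need log-space arithmetic.

```python
from math import comb, log


def tail_probability(d_obs, k, D, L):
    """P(d >= d_obs) when k marbles are drawn without replacement
    from an urn of L marbles, D of which are discoveries."""
    lowest = max(d_obs, 0)
    highest = min(k, D)
    favourable = sum(
        comb(D, d) * comb(L - D, k - d) for d in range(lowest, highest + 1)
    )
    return favourable / comb(L, k)


def surprisal(d_obs, k, D, L):
    """Surprisal S = -log P(d >= d_obs) under the hypergeometric null model."""
    return -log(tail_probability(d_obs, k, D, L))


# Toy urn: 10 links, 3 of which are discoveries; the individual visited 2 shops.
p_at_least_one = tail_probability(1, 2, 3, 10)
s_two_discoveries = surprisal(2, 2, 3, 10)
```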


How it works

■ Individual DB9594 ... ("Mark") visited $k_{\mathrm{Mark}} = 263$ eating places, and (s)he collected $d^*_{\mathrm{Mark}} = 44$ discoveries.
■ Mark's expected number of discoveries under the null model was 13.74.
■ The probability that Mark achieved 44 discoveries or more under the null model was $P(d_{\mathrm{Mark}} \geq d^*_{\mathrm{Mark}}) \sim 10^{-11}$.
■ Mark's surprisal is $S_{\mathrm{Mark}} = 25.28$.
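As a consistency check on these numbers (a worked calculation, assuming the logarithm in the surprisal is the natural logarithm, which the slide does not state explicitly), the expected count gives the discovery probability p, and exponentiating the surprisal recovers the tail probability:

```latex
p = \frac{13.74}{263} \approx 0.052,
\qquad
e^{-S_{\mathrm{Mark}}} = e^{-25.28} \approx 1.05 \times 10^{-11} .
```

The second value agrees with the reported tail probability of order $10^{-11}$.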


Comparing against the null model

■ We resample the individuals' number of discoveries by drawing from the distribution under the null hypothesis (bootstrap).
■ The surprisal values achieved by the top individuals are significantly larger than those achieved by the corresponding top individuals obtained with the bootstrap.
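A hedged sketch of this kind of resampling, not the authors' implementation. It assumes each individual's count is drawn independently from the hypergeometric null with their own $k_i$, and it records only the single largest surprisal in each round; the function names and toy parameters are hypothetical.

```python
import random
from math import comb, log


def surprisal(d_obs, k, D, L):
    """-log P(d >= d_obs) under the hypergeometric null model."""
    favourable = sum(
        comb(D, d) * comb(L - D, k - d) for d in range(max(d_obs, 0), min(k, D) + 1)
    )
    return -log(favourable / comb(L, k))


def draw_null_discoveries(k, D, L, rng):
    """Draw k marbles without replacement from the urn and count the discoveries."""
    found, left_discoveries, left_total = 0, D, L
    for _ in range(k):
        if rng.random() < left_discoveries / left_total:
            found += 1
            left_discoveries -= 1
        left_total -= 1
    return found


def bootstrap_top_surprisal(degrees, D, L, n_rounds=200, seed=0):
    """For each round, resample every individual under the null model and
    keep the largest surprisal obtained in that round."""
    rng = random.Random(seed)
    tops = []
    for _ in range(n_rounds):
        tops.append(
            max(surprisal(draw_null_discoveries(k, D, L, rng), k, D, L) for k in degrees)
        )
    return tops


# Toy run: three individuals who visited 5, 10 and 20 shops.
null_tops = bootstrap_top_surprisal([5, 10, 20], D=50, L=1000, n_rounds=20, seed=1)
```

An observed top surprisal can then be compared against the distribution of `null_tops`, for instance by counting how many resampled values exceed it.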


Surprisal vs activity

■ The surprisal metric is positively yet weakly correlated with the number of visited stores per individual.
■ Purchasing in many different stores does not necessarily make you a discoverer.


Quantifying the predictive power of different groups of individuals


Framing a predictive problem

Our predictive question
Are there specific groups of individuals whose presence/absence among the earliest customers of a shop is informative about the shop's future success?

■ Our intuition: If the discoverers truly have the "habit" of visiting successful shops early, their presence (absence) among a shop's early visitors might be a signal that the shop will (not) be popular.


Framing a predictive problem

■ Identification period (18 months). We identify four classes of relevant individuals:
  ■ From the CCRs: Discoverers, Store explorers.
  ■ From the CDRs: Social Hubs / Influencers [Goldenberg et al., 2009; Morone and Makse, 2015], Explorers by mobility traits.
■ Validation period (12 months). We attempt to predict whether a previously unseen shop will be successful, given the presence/absence of previously identified individuals of a given class among its earliest visitors.
  ■ For each group I of individuals, we build a classifier that classifies a shop as successful if and only if an individual in I is found among the earliest V visitors.
  ■ The ground-truth group of successful shops comprises the shops that are ranked in the top-10% by total number of visitors, among shops of the same category and launched in the same month.
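The single-group classifier described above can be sketched in a few lines. The data layout (an ordered list of first-time visitors per shop) and the function name `classify_shops` are assumptions made for illustration.

```python
def classify_shops(early_visitors, group, V=30):
    """Predict 'successful' for a shop iff some member of `group`
    is among its earliest V visitors.

    early_visitors: {shop: [individuals ordered by time of first purchase]}
    group: set of individuals identified in the identification period
    """
    return {
        shop: any(person in group for person in visitors[:V])
        for shop, visitors in early_visitors.items()
    }


# Toy example: u3 is the only previously identified individual.
toy_visitors = {"shop_a": ["u1", "u2", "u3"], "shop_b": ["u4", "u5", "u6"]}
pred_v2 = classify_shops(toy_visitors, {"u3"}, V=2)
pred_v3 = classify_shops(toy_visitors, {"u3"}, V=3)
```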


Evaluation metrics

For each group I, we build its classifier and we measure:
■ Precision / Success rate. Success rate of the stores that received an early purchase by an individual in I.
  ■ We compare the success rate against the baseline given by a random classifier, obtaining the success-rate fold increase.
■ Recall. Fraction of successful stores that received an early visit by an individual in I.
■ Positive likelihood ratio. Probability of a store that received an early visit by an individual in I being successful, divided by the probability of a store that did not receive such a visit being successful.
■ Matthews correlation.
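These metrics can be sketched from the four confusion-matrix counts. The likelihood ratio below follows the definition given on the slide (success rate among flagged shops divided by success rate among unflagged shops), which differs from the textbook sensitivity / (1 − specificity) form. The random-classifier baseline is taken to be the overall share of successful shops. Function names are hypothetical, and the sketch assumes no denominator is zero.

```python
from math import sqrt


def confusion_counts(predicted, successful):
    """predicted: {shop: bool}; successful: set of ground-truth successful shops."""
    tp = sum(1 for shop, flag in predicted.items() if flag and shop in successful)
    fp = sum(1 for shop, flag in predicted.items() if flag and shop not in successful)
    fn = sum(1 for shop, flag in predicted.items() if not flag and shop in successful)
    tn = len(predicted) - tp - fp - fn
    return tp, fp, fn, tn


def evaluate(tp, fp, fn, tn):
    """Return the slide's metrics; assumes all denominators are non-zero."""
    n = tp + fp + fn + tn
    precision = tp / (tp + fp)            # success rate of flagged shops
    base_rate = (tp + fn) / n             # expected precision of a random classifier
    return {
        "precision": precision,
        "fold_increase": precision / base_rate,
        "recall": tp / (tp + fn),
        "likelihood_ratio": precision / (fn / (fn + tn)),
        "mcc": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


# Toy example: 100 shops, 10 successful, 10 flagged, 4 of them correctly.
toy_metrics = evaluate(tp=4, fp=6, fn=6, tn=84)
```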


Predictive performance: Precision

Insights
■ The discoverers have the largest predictive power.
■ Explorers by radius of gyration are highly competitive for clothing stores.
■ The discoverers' predictive power generalizes across categories, whereas the same does not hold for the other groups of individuals.
■ The social hubs' predictive power is inconsistent.


Predictive performance: Matthews correlation

Insights
■ The discoverers have the largest predictive power.
■ Explorers by radius of gyration are highly competitive for clothing stores.
■ The discoverers' predictive power generalizes across categories, whereas the same does not hold for the other groups of individuals.
■ The social hubs' predictive power is inconsistent.


Combining groups of individuals

2-dimensional classifiers
For each pair $(I_1, I_2)$ of groups of individuals, we build a classifier that classifies a shop as successful if and only if both an individual in $I_1$ and an individual in $I_2$ are found among the earliest 30 visitors.

■ Can the presence of early customers from different groups of selected individuals be a stronger predictor of success?
■ Note: We are not attempting to maximize predictive performance, but to understand whether the co-presence of pairs of groups of individuals is a better predictor of success.
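A minimal sketch of such a pair classifier, under the same assumed data layout as a per-shop ordered list of early visitors. The function name is hypothetical. Note one ambiguity the slide leaves open: here a single individual who belongs to both groups satisfies both conditions.

```python
def classify_shops_pair(early_visitors, group_1, group_2, V=30):
    """Predict 'successful' iff the earliest V visitors of a shop include
    at least one member of group_1 and at least one member of group_2.

    Assumption: one person belonging to both groups counts for both.
    """
    predictions = {}
    for shop, visitors in early_visitors.items():
        early = set(visitors[:V])
        predictions[shop] = bool(early & group_1) and bool(early & group_2)
    return predictions


# Toy example: shop_a is visited early by members of both groups, shop_b by one only.
toy_pair_visitors = {"shop_a": ["u1", "u2", "u3"], "shop_b": ["u1", "u4", "u5"]}
pair_pred = classify_shops_pair(toy_pair_visitors, {"u1"}, {"u3"})
```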


Combining groups of individuals

Insights
■ Combining the discoverers with other groups of individuals can improve the predictive power.
■ Combining the discoverers with the groups of social hubs does not improve the predictive power for clothing stores, where the hubs were underperforming when considered individually.


Who are the discoverers of success?


Who are the discoverers?

Once we have identified the discoverers and quantified their predictive performance, it is natural to investigate which traits characterize them.
■ What is their typical age and gender?
■ How socially well-connected are they?
■ Do they explore many different stores?
■ Do they spend/travel a lot?

Our unique combination of bank and telco data allows us to answer these questions.


Socio-economic and demographic traits

■ The discoverers and store explorers tend to be above the median in network centrality, number of visited stores, and mobility diversity.
■ The discoverers are not outstanding in any of these traits, though.
■ The discoverers' demographic traits are not consistent across categories.
■ The discoverers can have different traits than the store explorers (e.g., food stores).
■ The social hubs have above-median expenditures and number of visited stores.


Open challenges and take-home messages


Summary of results

We analyzed a three-year CCR dataset from an entire nation, and investigated whether there exist individuals whose early visits to a shop reveal increased odds of success for the visited shop.
■ Identification. The discoverers repeatedly visit, early on, shops that later become successful. Their behavior cannot be explained by chance.
■ Predictive performance. The discoverers' consistent performance cannot be achieved by any other group of top individuals. The social hubs' performance is weaker and inconsistent.
■ Characterization. The discoverers exhibit consistent socio-economic traits (above-median centrality, expenditures, mobility), but they are not outstanding in any of them.


Limitations: Social network data

■ We implicitly assumed that mobile-phone communication data provide us with sufficiently good estimates of the individuals' centrality in society.
■ On the other hand, they can only provide us with a partial representation of the actual communication flows in society.
■ Obtaining a complete representation of the social communication patterns across an entire nation is not feasible.
■ At the same time, our study mimics a real-world problem where an organization only has access to incomplete social information about its customers.
■ We are attempting to generalize our results to online communities where the complete social network is available.


Limitations: Predictive performance

■ We aimed to uncover the predictive power hidden in the actions by specific groups of selected individuals.
  ■ This allowed us to address specific research questions.
  ■ At the same time, it might be limiting if one's goal is purely predictive performance.
  ■ To that end, machine learning algorithms that include multiple features might achieve better performance.


Beyond the discoverers: Anticipating rising and declining popularity trends

■ Behaving as a discoverer is only one possible behavioral pattern.
■ We are currently searching for individuals with other interesting behavioral traits.
■ In particular, individuals who consistently:
  ■ Anticipate rising popularity trends.
  ■ Anticipate declining popularity trends.
■ We can find these kinds of individuals (with little overlap with the discoverers) that have out-of-sample predictive power.
■ We are developing an agent-based model to model the heterogeneous adoption patterns of different groups of identified individuals.


Better understanding the discoverers

The discoverers' number of contacts tends to be above the median, but the social mechanisms behind their emergence need to be clarified.

Research questions
■ Are they able to effectively influence their peers, acting as influencers?
■ Are they innovative individuals who adopt before all their peers?
■ Are they susceptible individuals who are rapid in following their peers' actions?
■ Can we show that they have a "taste" that resonates well with their community or the population?


References: Background

▶ J. Goldenberg, et al. "The role of hubs in the adoption process." Journal of Marketing 73.2 (2009): 1-13.
▶ B. Shulman, A. Sharma, and D. Cosley. "Predictability of popularity: Gaps between prediction and understanding." Tenth International AAAI Conference on Web and Social Media. 2016.
▶ J. M. Hofman, A. Sharma, and D. J. Watts. "Prediction and explanation in social systems." Science 355.6324 (2017): 486-488.


References: Early detection of seminal papers and patents

▶ M. S. Mariani, M. Medo, and Y.-C. Zhang. "Identification of milestone papers through time-balanced network centrality." Journal of Informetrics 10.4 (2016): 1207-1223.
▶ M. S. Mariani, M. Medo, and F. Lafond. "Early identification of important patents: Design and validation of citation network metrics." Technological Forecasting and Social Change 146 (2019): 644-654.


References: Discoverers

▶ M. Medo, et al. "Identification and impact of discoverers in online social systems." Scientific Reports 6 (2016): 34218.
▶ M. S. Mariani, Y. Gimenez, J. Brea, M. Minnoni, R. Algesheimer, C. J. Tessone. "Predicting success in socioeconomic systems by tracking key individuals." Working paper.


Thanks to my collaborators

URPP Social Networks: Rene Algesheimer, Francesco De Collibus, Claudio Juan Tessone
Grandata: Jorge Brea, Yanina Gimenez, Martin Minnoni


Take-home messages

1. We framed the problem of predicting the future success of a store based on its early customers. We are able to compare the predictive power of different groups of individuals.
2. The discoverers of success differ from the widely studied influencers / social hubs as they are not necessarily central in the social network, but their monetary transactions consistently reveal future success for the recipient of the transaction.
3. Once identified, agent-level patterns of adoptions/purchases can be leveraged to predict a collective phenomenon (success).


Manuel Sebastian Mariani

URPP Social Networks (University of Zurich)
Institute of Fundamental and Frontier Sciences (UESTC, Chengdu)

[email protected]

https://www.business.uzh.ch/en/research/professorships/networkscience/people/Dr-Manuel-Mariani.html