Aslay Ph.D. Defense

From Viral Marketing to Social Advertising: Ad Allocation Under Social Influence

Çiğdem Aslay (UPF)

Supervisors: Prof. Dr. Ricardo Baeza-Yates (UPF), Dr. Francesco Bonchi (ISI)


Outline


• Introduction

• Influence in Online Social Networks

• Viral Marketing and Influence Maximization

• Social Advertising: Promoted Posts

• Part I - Online Topic-aware Influence Maximization Queries

• Part II - Social Advertising: Regret Minimization

• Part III - Social Advertising: Revenue Maximization

• Conclusion

Influence in Online Social Networks

Grumpy Cat
• 25K+ votes on Reddit (< 1 day)
• 1M+ views on Imgur (< 2 days)
• 300+ variants on Reddit
• 100+ Quickmeme macros

[Figure: users reacting to the meme: “nice meme!”, “indeed!”]

• Social Influence Induced Viral Phenomena

• Attached a promotional message with a clickable URL for free sign-up
• Spent merely $50K
• 12M users signed up within the first 18 months

• Sign-up to the service only through invitation from a friend
• No money spent on marketing
• Resulted in bidding on eBay for invites

Influence in Online Social Networks

Viral Marketing*

Exploit the “word of mouth” effect in a social network to achieve marketing goals through self-replicating viral processes

* S. Jurvetson, “What Exactly is Viral Marketing”, Red Herring

• Given

• a directed social network G = (V,E)

• a propagation model m

• a cardinality budget k

• Define
• S: initial set of k (seed) nodes to start the propagation
• $\sigma_m(S)$: expected size of the influence propagation from S

• Find

$$S^* = \arg\max_{S \subseteq V,\ |S| = k} \sigma_m(S)$$

Influence Maximization

Discrete Optimization Problem*

* Kempe et al., “Maximizing the spread of influence through a social network”, KDD 2003

Influence Propagation Models

Independent Cascade (IC) Model
• Each arc (u,v) is associated with an influence probability $p_{u,v}$
• A node u activated at time t tries to influence each inactive neighbor v, with a success probability $p_{u,v}$
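A minimal sketch of the IC process described above, together with a Monte Carlo estimate of the expected spread. The adjacency-dict representation, the function names and the toy graph are assumptions made for illustration only.

```python
import random

def simulate_ic(graph, seeds, rng):
    """Run one Independent Cascade diffusion; return the set of activated nodes.

    graph maps each node u to a dict {v: p_uv} of out-arcs and probabilities.
    """
    active = set(seeds)
    frontier = list(seeds)  # nodes activated in the previous step
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p_uv in graph.get(u, {}).items():
                # u gets a single chance to activate each inactive neighbor v
                if v not in active and rng.random() < p_uv:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

def monte_carlo_spread(graph, seeds, runs=2000, seed=0):
    """Estimate the expected spread as the average cascade size over many runs."""
    rng = random.Random(seed)
    return sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs)) / runs

# Toy graph (hypothetical): a -> b always, a -> c half of the time, c -> d always
toy = {"a": {"b": 1.0, "c": 0.5}, "c": {"d": 1.0}}
print(monte_carlo_spread(toy, ["a"]))  # close to 3.0 (= 1 + 1 + 0.5 + 0.5)
```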

Topic-aware Independent Cascade (TIC) Model*
• An item i is described as a distribution over K topics
• Topic-specific influence probabilities on arcs: $p^z_{u,v}$
• Item-specific success probabilities on arcs: $p^i_{u,v}$

*N. Barbieri, F. Bonchi and G. Manco, “Topic-aware Social Influence Propagation Models”, ICDM 2012
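A small sketch of how an item-specific probabilistic graph can be derived from topic-wise probabilities. It assumes the item-specific probability of an arc is the topic-weighted mixture of its topic-specific probabilities; the function name and data layout are hypothetical.

```python
def item_specific_graph(topic_probs, item_topics):
    """Collapse topic-wise arc probabilities into one probability per arc.

    topic_probs: {(u, v): [p^1_uv, ..., p^K_uv]}
    item_topics: distribution of the item over the K topics.
    Assumption: p^i_uv = sum over topics z of item_topics[z] * p^z_uv.
    """
    if abs(sum(item_topics) - 1.0) > 1e-9:
        raise ValueError("item_topics must sum to 1")
    return {
        arc: sum(g * p for g, p in zip(item_topics, probs))
        for arc, probs in topic_probs.items()
    }

# Two topics, two arcs (toy values)
topic_probs = {("u", "v"): [0.8, 0.1], ("u", "w"): [0.0, 0.6]}
print(item_specific_graph(topic_probs, [0.5, 0.5]))
```

Once the graph is collapsed this way, any IC-based routine can run on it unchanged.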

Complexity and Approximation
• Influence Maximization is NP-hard under both models
• TIC boils down to IC on the probabilistic graph $G_i = (V, A, p_i)$
• Reduction from the Set Cover problem
• Computing the expected spread exactly is #P-hard

• Greedy algorithm
• (1 - 1/e)-approximation* using monotonicity and submodularity

*Nemhauser et al., “An analysis of approximations for maximizing submodular set functions I”, Mathematical Programming 1978
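The greedy scheme behind the (1 - 1/e) guarantee can be sketched as follows. To keep the example deterministic, a set-coverage function stands in for the expected spread; it is monotone and submodular, but it is only a toy substitute for the real spread function.

```python
def greedy_seeds(candidates, spread, k):
    """Pick up to k seeds, each time adding the node with the largest marginal gain."""
    seeds = []
    for _ in range(min(k, len(candidates))):
        base = spread(seeds)
        best = max(
            (v for v in candidates if v not in seeds),
            key=lambda v: spread(seeds + [v]) - base,
        )
        seeds.append(best)
    return seeds

# Toy stand-in for the spread: number of nodes "reached" by the seed set
reach = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6}, "d": {1}}

def coverage(seeds):
    covered = set()
    for s in seeds:
        covered |= reach[s]
    return len(covered)

print(greedy_seeds(list(reach), coverage, k=2))  # ['a', 'c']
```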

• Implemented by online social networking platforms

• “Promoted Posts” are injected to the social feeds of users

• Similar to organic posts from friends in a social network

• Contain an advertising message: text, image or video

• Can propagate to friends via social actions: “likes”, “shares”

• Each click to a promoted post produces social proof to friends

• Advertisers have to pay for engagements / clicks


Social Advertising

A market that did not exist until Facebook launched its first advertising service in May 2005, projected to generate $11 billion revenue by 2017*

* http://www.unified.com/historyofsocialadvertising/

Motivation

• Part II - Social Advertising: Regret Minimization
• Part III - Social Advertising: Revenue Maximization

• Part I - Online Topic-aware Influence Maximization Queries

Enable online social influence analytics in support of viral marketing decision making

Influence Maximization

Computational Advertising

Part I: Online Topic-aware Influence Maximization Queries

• C. Aslay, N. Barbieri, F. Bonchi, and R. Baeza-Yates. “Online Topic-aware Influence Maximization Queries”. Published in International Conference on Extending Database Technology (EDBT) 2014.

Given
• a social graph G = (V,E)
• a space of Z topics
• topic-specific peer-influence probabilities on arcs, $p^z_{u,v}$
• a query item q
• a cardinality budget k

• A TIM query asks to find a seed set of k nodes that maximizes the expected number of nodes adopting item q in the network.

Topic-aware Influence Maximization (TIM) Queries

• TIM query can be processed by any influence maximization algorithm:

• Reduce TIC to IC via the derived graph Gq = (V,A,pq)

• Enjoy (1 – 1/e)-approximation guarantee


Topic-aware Influence Maximization (TIM) Queries

• Challenge: enormous number of potential queries
• Any possible point lying on the probability simplex
• Any potential query induces a different probabilistic graph

• Indexing is necessary for online TIM query processing
• Need millisecond response times to enable online viral marketing analytics

Efficiency compromised: with CELF++*, processing a single query for k = 50 takes days on a graph with 30K nodes and 425K edges

*Goyal et al., “CELF++: optimizing the greedy algorithm for influence maximization in social networks”, WWW 2011

Influence Index

Index over pre-computed solutions of a limited number of TIM queries.

• Similar peer influence probabilities
• Similar influence propagation patterns

Similar items are likely to interest similar users

INFLEX

Index Construction (Offline)
• Phase 1: seed node extraction
• Phase 2: tree-based index construction
• Phase 3: list-based index construction

Query Processing (Online)
• Phase 1: topic-wise NN retrieval
• Phase 2: aggregation of the pre-computed seed sets of the NNs w.r.t. topic-wise similarity

Selection of Index Items

• Space-based selection:
• Equidistantly positioned topic distributions on the probability simplex
• (+) Fair coverage of the simplex
• (-) Disregards the available workload

• Data-driven selection:
• Catalog of items learnt from the log of past propagations
• (+) Queried items are likely to follow the distributions learnt from past data
• (-) Sparsity issues for skewed topic distributions in the catalog

The best of both approaches: Simplex Sampling

Selection of Index Items

• Sampling from the probability simplex
• Estimate the Dirichlet distribution maximizing the log-likelihood of the available workload
• Generate a large sample with good simplex coverage
• Bregman K-means++ clustering on the generated sample
• Take distributions on the centroids as the index items
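The sampling and clustering steps can be sketched as below. This is only an illustration: the Dirichlet parameters are taken as given (fitting them by maximum likelihood is not shown), KL divergence is used as the Bregman divergence, and all names and values are hypothetical.

```python
import math
import random

def sample_dirichlet(alpha, rng):
    """Draw one point on the simplex via normalized Gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def kl(p, q):
    """KL(p || q); terms with p_z = 0 contribute nothing."""
    return sum(pz * math.log(pz / qz) for pz, qz in zip(p, q) if pz > 0)

def kl_kmeans(points, k, rng, iters=20):
    """Lloyd-style clustering with KL divergence and k-means++-style seeding."""
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # next centroid drawn proportionally to its divergence from the chosen ones
        weights = [min(kl(x, c) for c in centroids) for x in points]
        centroids.append(rng.choices(points, weights=weights)[0])
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for x in points:
            nearest = min(range(k), key=lambda j: kl(x, centroids[j]))
            clusters[nearest].append(x)
        # for a Bregman divergence, the arithmetic mean is the optimal centroid
        centroids = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else centroids[j]
            for j, cl in enumerate(clusters)
        ]
    return centroids

rng = random.Random(7)
alpha = [2.0, 1.0, 1.0]  # hypothetical fitted Dirichlet parameters
sample = [sample_dirichlet(alpha, rng) for _ in range(500)]
index_items = kl_kmeans(sample, k=5, rng=rng)
```

Each centroid is itself a topic distribution, so it can serve directly as an index item.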


Tree Construction

• KL-divergence for measuring similarity between probability distributions: non-metric search space!

Bregman Ball Trees (refs. 1, 2)
• Hierarchical space partition based on convex Bregman balls
• Bregman k-means++ to generate child nodes from parent nodes
• Gaussian clustering to find the optimal number of child nodes (k in k-means)

1 Cayton, “Fast Nearest Neighbor Retrieval for Bregman Divergences”, ICML 2008
2 Nielsen et al., “Tailored Bregman Ball Trees for Effective Nearest Neighbors”, EuroCG 2009
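A quick numerical check of why KL divergence yields a non-metric search space: it is asymmetric and can violate the triangle inequality, so metric-tree pruning rules do not apply. The three distributions are arbitrary toy values.

```python
import math

def kl(p, q):
    """KL(p || q) for two discrete distributions with q_z > 0."""
    return sum(pz * math.log(pz / qz) for pz, qz in zip(p, q) if pz > 0)

p, q, r = [0.9, 0.1], [0.5, 0.5], [0.1, 0.9]

# Asymmetry: KL(p || q) differs from KL(q || p)
print(kl(p, q), kl(q, p))

# Triangle inequality fails: going "directly" is costlier than via q
print(kl(p, r), kl(p, q) + kl(q, r))
```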


Similarity Search

• DFS starting from the root node to the leaf nodes
• Navigation via projection of the query point onto Bregman balls

• Pruning strategy
• use an upper bound from the current NN set
• visit a subtree only if it improves the current bound

• Neither range nor k-NN search
• Anderson-Darling statistical test as stopping criterion
• if the leaves visited so far provide “good enough” neighbours, return

Rank Aggregation

• Combine the seed node rankings of the NNs into a “consensus” ranking

Kemeny-Optimal Rank Aggregation
• Find a ranked list that has the min. Kendall-Tau distance to the input lists
• Kendall-Tau distance: # of pairwise disagreements between 2 ranked lists

NP-hard even for 4 input permutations*

Approximation via techniques from Social Choice Theory

*Dwork et al., “Rank aggregation methods for the web.”, WWW 2001
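The Kendall-Tau distance is easy to state in code. This sketch assumes both rankings are permutations of the same items, which need not hold for the truncated seed lists used in practice.

```python
from itertools import combinations

def kendall_tau(rank_a, rank_b):
    """Count the pairs of items that the two rankings order differently."""
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    return sum(
        1
        for x, y in combinations(rank_a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )

# The pairs (a, b) and (c, d) are ordered differently: distance 2
print(kendall_tau(["a", "b", "c", "d"], ["b", "a", "d", "c"]))  # 2
```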

INFLEX – Rank Aggregation

Aggregation weights: non-linear transformation of KL-Divergence

Social Choice Theory strives for fairness...

Weighted Borda Aggregation
• Borda score: total # of list elements preceded in all the input lists
• 5-approximation to the optimal Kemeny ranking

Weighted Copeland Aggregation
• Copeland score: total # of list elements that were defeated in the pairwise comparison among all the input lists
• 4-approximation to the optimal Kemeny ranking
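A minimal sketch of weighted Borda aggregation over the seed lists of the retrieved neighbours. The weights here are arbitrary placeholders; how they are derived from KL divergence is not reproduced.

```python
from collections import defaultdict

def weighted_borda(rankings, weights, k):
    """Aggregate ranked lists; each list's Borda points are scaled by its weight."""
    scores = defaultdict(float)
    for ranking, w in zip(rankings, weights):
        n = len(ranking)
        for position, item in enumerate(ranking):
            # an item at `position` precedes n - 1 - position items of its list
            scores[item] += w * (n - 1 - position)
    # descending score; ties broken by item name for determinism
    ordered = sorted(scores, key=lambda item: (-scores[item], item))
    return ordered[:k]

neighbour_lists = [["a", "b", "c"], ["b", "c", "a"], ["c", "b", "a"]]
weights = [0.6, 0.3, 0.1]  # hypothetical similarity-derived weights
print(weighted_borda(neighbour_lists, weights, k=2))  # ['b', 'a']
```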


Experiments

• Real-world FLIXSTER dataset
• Social graph: 30K users, 425K unidirectional social links
• Propagation log: {(User, Movie, Time)}
• Ratings on 12K movies

• Benchmarks devised via various INFLEX components
• exactKNN: exact K-NN search (with best performing K)
• approxKNN: approximate K-NN search (with best performing K)
• approxKNN + Sel: approximate K-NN search + automatic list selection
• approxAD: Anderson-Darling test based approximate NN search
• INFLEX: Anderson-Darling test based approximate NN search with automatic list selection

https://github.com/aslayci/INFLEX

• Ground truth: standard (offline) greedy algorithm


Experiments

Part II: Social Advertising: Regret Minimization

• C. Aslay, W. Lu, F. Bonchi, A. Goyal, and, L. V. Lakshmanan. “Viral Marketing Meets Social Advertising: Ad Allocation with Minimum Regret”. Published in International Conference on Very Large Data Bases (VLDB) 2015.

Social Advertising

Cost per Engagement (CPE) Model

• The social network platform owner (a.k.a. host)
– Sells “ad-engagements” (“clicks”) to advertisers
– Inserts promoted posts into the social feeds of users likely to click, i.e. users with a high click-through probability (CTP)

• Advertiser
– Willing to pay a fixed CPE to the host for each click

Ad allocation under social influence

Strategically allocate users to advertisers, leveraging social influence and the propensity of ads to propagate, subject to limited advertisers’ budgets

TIC-CTP Propagation Model

Extending the TIC model with click-through probabilities

• Balance between intrinsic relevance in the absence of social proof and peer influence
• Ad-specific CTP for each user: δ(u,i)
• Probability that user u will click ad i in the absence of social proof
• Lemma 4.1: TIC-CTP reduces to the TIC model with $p^i_{H,u} = \delta(u,i)$
• When δ(u,i) = 1 for all u and i, TIC = TIC-CTP

[Figure: example graph with nodes u, v, w and H; arc labels $p_{u,v}$, $p_{u,w}$, $p_{H,u}$, $p_{H,v}$, $p_{H,w}$]

Budget and Regret

• Host:
• Owns the directed social graph G = (V,E) and a TIC-CTP model instance
• Sets a user attention bound $\kappa_u$ for each user u ∈ V

• Advertiser i:
• agrees to pay CPE(i) for each click, up to his budget $B_i$

• Expected revenue of the host from allocating seed set $S_i$ to advertiser i: $\min(\sigma_i(S_i) \times \mathrm{CPE}(i),\ B_i)$

Host’s regret
• $\sigma_i(S_i) \times \mathrm{CPE}(i) < B_i$: lost revenue opportunity for the host
• $\sigma_i(S_i) \times \mathrm{CPE}(i) > B_i$: free service to the advertiser

Budget and Regret

(Raw) Allocation Regret
• Regret of the host from allocating seed set $S_i$ to advertiser i:

$$R_i(S_i) = |B_i - \sigma_i(S_i) \times \mathrm{CPE}(i)|$$

• Overall allocation regret:

$$R(S_1, \dots, S_h) = \sum_{i=1}^{h} R_i(S_i)$$

Penalized Allocation Regret
• λ: penalty to discourage selecting a large number of poor-quality seeds
• Regret of the host with seed set size penalization:

$$R_i(S_i) = |B_i - \sigma_i(S_i) \times \mathrm{CPE}(i)| + \lambda \times |S_i|$$
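The two regret definitions translate directly into code. In this sketch the expected spread of each seed set is supplied as a plain number; the function names and the sample figures are made up for illustration.

```python
def ad_regret(budget, spread, cpe, num_seeds=0, lam=0.0):
    """Regret for one ad: |B_i - sigma_i(S_i) * CPE(i)| + lambda * |S_i|."""
    return abs(budget - spread * cpe) + lam * num_seeds

def overall_regret(ads, lam=0.0):
    """Sum of per-ad regrets; each ad is (budget, spread, cpe, num_seeds)."""
    return sum(ad_regret(b, s, c, n, lam) for b, s, c, n in ads)

ads = [
    (100.0, 40.0, 2.0, 5),  # expected payment 80 < budget 100: lost revenue of 20
    (50.0, 30.0, 2.0, 3),   # expected payment 60 > budget 50: free service of 10
]
print(overall_regret(ads))           # 30.0 (raw regret, lambda = 0)
print(overall_regret(ads, lam=0.5))  # 34.0 (adds 0.5 per seed, 8 seeds)
```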


Regret Minimization

• Given
• a social graph G = (V,E)
• TIC-CTP propagation model
• h advertisers with budget $B_i$ and CPE(i) for each advertiser i
• attention bound $\kappa_u$ for each user u ∈ V
• penalty parameter λ ≥ 0

• Find a valid allocation $S = (S_1, \dots, S_h)$ that minimizes the overall regret of the host from the allocation.

Theoretical Analysis

• Regret-Minimization is NP-hard and is NP-hard to approximate
• Reduction from the 3-PARTITION problem
• Regret function is neither monotone nor submodular
• Still, a greedy algorithm: selects the (ad, user) pair that gives the max. reduction in regret

Approximation guarantee w.r.t. the total budget of all advertisers

• Theorem 4.2: Penalized allocation regret

• Raw allocation regret

• Theorem 4.3:

• Theorem 4.4:
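The greedy rule from the analysis above can be sketched as follows. The spread function is passed in as a callable, and the loop stops once no (ad, user) pair reduces the regret; that stopping rule, the attention-bound bookkeeping and the additive toy spread are assumptions of this sketch.

```python
def greedy_allocation(users, ads, spread, attention, lam=0.0):
    """ads: {ad: (budget, cpe)}; spread(ad, seeds) gives expected clicks."""
    seeds = {ad: [] for ad in ads}
    load = {u: 0 for u in users}  # number of ads already assigned to each user

    def regret(ad, s):
        budget, cpe = ads[ad]
        return abs(budget - spread(ad, s) * cpe) + lam * len(s)

    while True:
        best, best_drop = None, 0.0
        for ad in ads:
            current = regret(ad, seeds[ad])
            for u in users:
                if u in seeds[ad] or load[u] >= attention[u]:
                    continue  # respect the user's attention bound
                drop = current - regret(ad, seeds[ad] + [u])
                if drop > best_drop:
                    best, best_drop = (ad, u), drop
        if best is None:  # no pair reduces the regret any further
            return seeds
        ad, u = best
        seeds[ad].append(u)
        load[u] += 1

# Toy additive spread: each seed contributes a fixed number of clicks
clicks = {"a": 6.0, "b": 3.0, "c": 2.0}

def toy_spread(ad, seed_list):
    return sum(clicks[u] for u in seed_list)

ads = {"x": (10.0, 1.0), "y": (6.0, 2.0)}
allocation = greedy_allocation(list(clicks), ads, toy_spread, {u: 1 for u in clicks})
print(allocation)  # {'x': ['a', 'c'], 'y': ['b']}
```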

Theoretical Analysis


Scalable Algorithms

Two-Phase Influence Maximization (TIM) Algorithm*
• Estimates influence spread for the most influential “s” nodes from a random sample of “θ(s)” RR-sets
• θ(s): statistically sufficient sample size needed for accurate estimation of the influence spread of s nodes
• Estimator:

TIM cannot be used for minimizing the regret
• Does not handle CTPs
• Requires a predefined seed set size s

Two-Phase Iterative Regret Minimization (TIRM)
• Built on the Reverse Influence Sampling framework of TIM

* Tang et al., “Influence maximization: Near-optimal time complexity meets practical efficiency”, SIGMOD 2014
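A sketch of reverse influence sampling under the IC model, using the standard estimator: the spread of S is approximated by n times the fraction of RR-sets that S intersects. The graph encoding and names are illustrative, and the sample size is fixed by hand rather than derived from θ(s).

```python
import random

def random_rr_set(nodes, in_arcs, rng):
    """Sample one reverse-reachable set: pick a random root, walk arcs backwards."""
    root = rng.choice(nodes)
    rr, stack = {root}, [root]
    while stack:
        v = stack.pop()
        for u, p_uv in in_arcs.get(v, {}).items():
            # arc (u, v) is "live" with probability p_uv
            if u not in rr and rng.random() < p_uv:
                rr.add(u)
                stack.append(u)
    return rr

def rr_estimate(seeds, rr_sets, n):
    """Spread estimate: n times the fraction of RR-sets hit by the seed set."""
    seed_set = set(seeds)
    hit = sum(1 for rr in rr_sets if rr & seed_set)
    return n * hit / len(rr_sets)

nodes = ["a", "b", "c", "d"]
# in_arcs[v][u] = p_uv: a -> b always, a -> c half of the time, c -> d always
in_arcs = {"b": {"a": 1.0}, "c": {"a": 0.5}, "d": {"c": 1.0}}
rng = random.Random(0)
rr_sets = [random_rr_set(nodes, in_arcs, rng) for _ in range(20000)]
print(rr_estimate(["a"], rr_sets, n=len(nodes)))  # close to 3.0
```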


(1) RR-sets sampling under TIC-CTP model: RRC-sets
• Sample a random RR set R for advertiser i
• Remove every node u in R with probability $1 - \delta(u,i)$
• Form the “RRC-set” from the remaining nodes

Scalability compromised: requires a sample size at least 2 orders of magnitude bigger for CTP = 0.01.

Theorem 4.5: MG(u | S) in IC-CTP = δ(u) × MG(u | S) in IC
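The RRC-set construction is a simple filtering step on top of an RR-set. The sketch below also hints at the scalability issue: with CTPs around 0.01 almost every RRC-set ends up empty, so far more samples are needed to obtain the same number of informative sets. The helper name and toy values are hypothetical.

```python
import random

def to_rrc_set(rr_set, ctp, rng):
    """Keep each node u of an RR set with probability ctp[u], i.e. delta(u, i)."""
    # sorted() only fixes the iteration order so runs are reproducible
    return {u for u in sorted(rr_set) if rng.random() < ctp[u]}

rng = random.Random(0)
rr = {"a", "b", "c"}
low_ctp = {u: 0.01 for u in rr}

# Fraction of RRC-sets that are non-empty when every CTP is 0.01
non_empty = sum(1 for _ in range(10000) if to_rrc_set(rr, low_ctp, rng))
print(non_empty / 10000)  # roughly 1 - 0.99**3, i.e. about 3%
```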

TIRM


TIRM

(2) Iterative Seed Set Size Estimation

For each advertiser i:
• Start with a “safe” initial seed set size $s_i$
• Sample the $\theta_i(s_i)$ RR sets required for $s_i$
• Update $s_i$ based on the current regret
• Revise $\theta_i(s_i)$, sample additional RR sets, revise estimates

Estimation accuracy of TIRM: Theorem 4.6

Datasets and Parameters

• Peer influence probabilities: TIC EM Learning; Exponential Distribution; WC Model; WC Model
• CTPs: sampled uniformly at random from [0.01, 0.03]

Experiments

https://github.com/aslayci/TIRM

Algorithms Tested

• MYOPIC: top $\kappa_u$ ads for which u has the highest δ(u,i) × CPE(i)
• MYOPIC+: budget-aware MYOPIC enhancement
• Greedy-IRIE: instantiation of the Greedy algorithm with the IRIE* heuristic
• TIRM:
• ε set to 0.1 for quality experiments on FLIXSTER and EPINIONS
• ε set to 0.2 for scalability experiments on DBLP and LIVEJOURNAL

* K. Jung, W. Heo, and W. Chen, "IRIE: Scalable and Robust Influence Maximization in Social Networks", ICDM 2012
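The MYOPIC baseline can be sketched in a few lines: each user independently receives the ads with the highest expected payment per impression, with no regard to budgets or propagation. The data layout and sample values are assumptions.

```python
def myopic(users, ads, ctp, cpe, attention):
    """Give each user u the attention[u] ads with the highest ctp[u][i] * cpe[i]."""
    allocation = {}
    for u in users:
        ranked = sorted(ads, key=lambda i: ctp[u][i] * cpe[i], reverse=True)
        allocation[u] = ranked[: attention[u]]
    return allocation

ads = ["x", "y", "z"]
cpe = {"x": 1.0, "y": 3.0, "z": 0.5}
ctp = {
    "u1": {"x": 0.02, "y": 0.01, "z": 0.03},
    "u2": {"x": 0.03, "y": 0.005, "z": 0.02},
}
attention = {"u1": 2, "u2": 1}
print(myopic(["u1", "u2"], ads, ctp, cpe, attention))
# {'u1': ['y', 'x'], 'u2': ['x']}
```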

Experiments

[Figure: overall regret; value labels 6.5%, 16%, 145%, 205% and 2.5%, 26%, 122%, 141%]

Scalability Experiments – Running Time

[Figure: running time; annotated points: 16 min (47 seeds), 5 hours (4649 seeds), 1.5 hours (5866 seeds)]

Part III: Social Advertising: Revenue Maximization

• C. Aslay, F. Bonchi, L. V. Lakshmanan, and W. Lu. “Revenue Maximization in Incentivized Social Advertising”. Submitted to International Conference on Very Large Data Bases (VLDB) 2017. (ArXiv e-prints, arXiv: 1612.00531)

Incentivized Social Advertising

CPE model with seed user incentives

• Advertiser
• Pays a fixed CPE to the host for each engagement
• Pays a monetary incentive to each seed user engaging with his ad
• Total payment subject to his budget

• Host
• Sells ad-engagements to advertisers
• Inserts promoted posts into the feeds of users in exchange for monetary incentives
• Seed users take a cut of the social advertising revenue

Revenue Maximization

• Given
• a social graph G = (V,E)
• TIC propagation model
• h advertisers with budget $B_i$ and CPE(i) for each ad i
• seed user incentives $c_i(u)$ for each user u ∈ V and for each ad i

• Find an allocation $S = (S_1, \dots, S_h)$ that maximizes the overall revenue of the host from the allocation.

Theoretical Analysis

• Revenue-Maximization problem is NP-hard
• Restricted special case with h = 1: NP-hard Submodular-Cost Submodular-Knapsack* (SCSK) problem
• Constraints: partition matroid and submodular knapsack constraints
• The family 𝘊 of feasible solutions forms an independence system
• Two greedy approximation algorithms, differing in their sensitivity to seed user costs during node selection

*Iyer et al., “Submodular optimization with submodular cover and submodular knapsack constraints”, NIPS 2013.

Theoretical Analysis

• Cost-agnostic greedy algorithm
• Selects the (node, ad) pair giving the max. marginal increase in revenue
• Theorem 5.2: Approximation guarantee follows* from 𝘊 forming an independence system

where
• R and r are, respectively, the upper and lower rank of 𝘊
• $\kappa_\pi$ is the curvature of the total revenue function π(·)

* Conforti et al., "Submodular set functions, matroids and the greedy algorithm: tight worst-case bounds and some generalizations of the Rado-Edmonds theorem.", Discrete Applied Mathematics 1984

Theoretical Analysis

• Cost-sensitive greedy algorithm
• Selects the (node, ad) pair giving the max. rate of marginal gain in revenue per marginal gain in payment
• Theorem 5.3: Approximation guarantee obtained

where
• $\rho_{\max}$ and $\rho_{\min}$ are, respectively, the max. and min. singleton payments
• $\kappa_{\rho_i}$ is the curvature of ad i’s payment function $\rho_i(\cdot)$
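The difference between the two greedy selection rules can be illustrated on a single step. The marginal gains below are made-up numbers, and the feasibility checks (budgets, matroid constraint) are deliberately left out of this sketch.

```python
def pick_pair(candidates, cost_sensitive):
    """candidates: {(node, ad): (marginal_revenue, marginal_payment)}.

    Cost-agnostic: maximize the marginal revenue.
    Cost-sensitive: maximize marginal revenue per unit of marginal payment.
    Assumes every marginal payment is strictly positive.
    """
    def key(pair):
        revenue_gain, payment_gain = candidates[pair]
        return revenue_gain / payment_gain if cost_sensitive else revenue_gain
    return max(candidates, key=key)

candidates = {
    ("u", "ad1"): (10.0, 8.0),  # large gain, but expensive
    ("v", "ad1"): (6.0, 2.0),   # smaller gain, much cheaper
}
print(pick_pair(candidates, cost_sensitive=False))  # ('u', 'ad1')
print(pick_pair(candidates, cost_sensitive=True))   # ('v', 'ad1')
```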


Scalable Algorithms

Two-Phase Iterative Revenue Maximization
• Built on the Reverse Influence Sampling framework of TIRM (Part II)
• Latent seed set size estimation
• Two-Phase Iterative Cost-Agnostic Revenue Maximization (TI-CARM)
• Two-Phase Iterative Cost-Sensitive Revenue Maximization (TI-CSRM)

Datasets and Parameters

• Peer influence probabilities: TIC EM Learning; TIC WC Model; WC Model; WC Model

Experiments

Algorithms Tested

• TI-CARM
• TI-CSRM
• PageRank
• For each ad i, select the best candidate user w.r.t. PageRank ordering
• Among those, select the (user, ad) pair giving the maximum marginal increase in the revenue of the host

• ε set to 0.1 for quality experiments on FLIXSTER and EPINIONS
• ε set to 0.2 for scalability experiments on DBLP and LIVEJOURNAL

Experiments

[Figure: Revenue vs Seed User Incentive Costs]

[Figure: Revenue vs Window Size]

[Figure: Scalability Results - Running Time]

[Figure: Scalability Results - Memory (GB)]

Contributions

Part I - Online Topic-aware Influence Maximization Queries

• Novel problem formulation
• Initiated the investigation of topic-aware influence indexing techniques in the influence maximization literature
• First step towards enabling online social influence analytics
• Orthogonal to efforts on scalable and efficient influence maximization algorithms
• Many direct follow-ups (refs. 1, 2, 3)

1 S. Chen et al., "Online Topic-aware Influence Maximization", VLDB 2015

2 Li et al., "Real-time Targeted Influence Maximization for Online Advertisements", VLDB 2015

3 W. Chen et al., "Real-Time Topic-aware Influence Maximization using Preprocessing", ICCS 2015

C. Aslay, N. Barbieri, F. Bonchi, and R. Baeza-Yates. “Online Topic-aware Influence Maximization Queries”. Published in EDBT 2014.

Contributions

Part II - Social Advertising: Regret Minimization

• Initiated the investigation in the area of Social Advertising through the Viral Marketing lens to address problems that the Influence Maximization and Computational Advertising literature fail to address in isolation
• Introduced a novel discrete optimization problem with provable approximation guarantees
• Introduced the TIC-CTP propagation model
• Extended the state-of-the-art influence maximization algorithms for scalable greedy approximation
• Latent seed set size estimation
• Handling the TIC-CTP propagation model

C. Aslay, W. Lu, F. Bonchi, A. Goyal, and L. V. Lakshmanan. “Viral Marketing Meets Social Advertising: Ad Allocation with Minimum Regret”. Published in VLDB 2015.

Contributions

Part III - Social Advertising: Revenue Maximization

• Initiated the investigation in the area of Incentivized Social Advertising through the Viral Marketing lens
• Introduced a novel discrete optimization problem
• Provided cost-agnostic and cost-sensitive approximation guarantees for submodular function maximization subject to a matroid and multiple submodular knapsack constraints
• Generalization of the restricted single submodular knapsack version of the problem (SCSK*)
• Theoretical results also valid for linear knapsack constraints
• $\kappa_{\rho_i} = 0$ when the payment function for ad i is modular

*Iyer et al., “Submodular optimization with submodular cover and submodular knapsack constraints”, NIPS 2013.

C. Aslay, F. Bonchi, L. V. Lakshmanan, and W. Lu. “Revenue Maximization in Incentivized Social Advertising”. Submitted to VLDB 2017. (arXiv: 1612.00531)

Thank you!
