MIDDLEWARE SYSTEMS RESEARCH GROUP Modelling Performance Optimizations for Content-based Publish/Subscribe Alex Wun and Hans-Arno Jacobsen Department of Electrical and Computer Engineering Department of Computer Science University of Toronto

Page 1:

MIDDLEWARE SYSTEMS RESEARCH GROUP

Modelling Performance Optimizations for Content-based Publish/Subscribe
Alex Wun and Hans-Arno Jacobsen

Department of Electrical and Computer Engineering
Department of Computer Science
University of Toronto

Page 2:

Matching Performance Optimizations
• Often based on exploiting similarities between subscriptions
• Avoid unnecessary subscription and predicate evaluations

Can we abstract these optimizations?
• Formalize content-based Matching Plans (order of predicate evaluations)
• Theoretically quantify performance of matching plans
• Compare heuristic techniques with optimal matching plans
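To make "matching plan" and "quantify performance" concrete, here is a minimal sketch of the expected number of predicate evaluations for one conjunctive subscription under a given evaluation order. It assumes independent predicates with known probabilities of being true and unit evaluation cost; the predicate names and probabilities are hypothetical and do not come from the talk.

```python
from itertools import permutations

def expected_evaluations(order, p_true):
    """Expected number of predicate evaluations for a conjunction
    evaluated in `order`, stopping at the first false predicate.
    Assumes independent predicates and unit cost per evaluation."""
    expected = 0.0
    reach = 1.0  # probability that evaluation reaches the current predicate
    for name in order:
        expected += reach        # one evaluation is paid if we get this far
        reach *= p_true[name]    # we continue only if this predicate held
    return expected

# Hypothetical probabilities that each predicate is true for a publication.
p_true = {"a": 0.9, "b": 0.5, "c": 0.1}

# Brute-force search over all plans; only feasible for tiny examples.
best = min(permutations(p_true), key=lambda o: expected_evaluations(o, p_true))
```

In this toy setting the cheapest plan evaluates the predicate most likely to fail first, which is the kind of ordering decision a matching plan captures.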

Page 3:

Commonality Model

For a subscription set $\{S_1, \ldots, S_m\}$, a commonality expression relates the subscriptions $S_1, \ldots, S_m$ to a commonality $C$ and is either disjunctive or conjunctive.

Disjunctive Commonality Expression
• Per-Link Matching
• DNF Subscriptions

Conjunctive Commonality Expression
• Shared predicates
• Clustering on subscription classes or attributes
• “Pruning” strategies (e.g., number of attributes)

A set of commonality expressions is a subscription topology.

Page 4:

Link-Group Topology

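[Equations: commonality expression over $S_1, \ldots, S_m$ and $L$; commonality expression over $S_1, \ldots, S_m$ and $C$; probability expression in $P$, $S$ and $L$ with indices $1, \ldots, m$ and $n$]

Depth-first algorithm to determine the probabilistically optimal matching plan [Greiner2006] in $O(N \ln N)$.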
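The following is a rough sketch of a depth-first ordering heuristic for one link whose subscriptions are conjunctions of predicates, assuming the link matches as soon as any subscription matches. It is written in the spirit of the depth-first strategies studied for And-Or trees and is not taken from [Greiner2006] or from the talk. Independent predicates, strictly positive costs, no predicates shared between subscriptions, and all names and numbers are assumptions made for illustration.

```python
def order_conjunction(preds):
    """preds: list of (name, p_true, cost) with cost > 0.
    Returns (ordered_preds, p_match, expected_cost).
    Predicates most likely to fail per unit cost are evaluated first."""
    ordered = sorted(preds, key=lambda q: (1.0 - q[1]) / q[2], reverse=True)
    reach, expected_cost = 1.0, 0.0
    for _, p, cost in ordered:
        expected_cost += reach * cost  # paid only if all earlier ones held
        reach *= p
    return ordered, reach, expected_cost

def depth_first_plan(subscriptions):
    """subscriptions: dict name -> list of (pred_name, p_true, cost).
    Subscriptions most likely to match per unit expected cost go first."""
    summaries = []
    for name, preds in subscriptions.items():
        ordered, p_match, expected_cost = order_conjunction(preds)
        summaries.append((name, ordered, p_match, expected_cost))
    summaries.sort(key=lambda s: s[2] / s[3], reverse=True)
    return summaries

def expected_link_cost(plan):
    """Expected cost of deciding whether the link matches under `plan`."""
    reach, total = 1.0, 0.0
    for _, _, p_match, expected_cost in plan:
        total += reach * expected_cost
        reach *= 1.0 - p_match  # continue only if this subscription failed
    return total

# Hypothetical link with two subscriptions.
subs = {
    "S1": [("x", 0.9, 1.0), ("y", 0.2, 1.0)],
    "S2": [("z", 0.8, 1.0)],
}
plan = depth_first_plan(subs)
cost = expected_link_cost(plan)
```

Both orderings are plain sorts, so the planning step in this sketch is dominated by sorting.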

Page 5:

Link-Group Topology

[Figure: panels “Low Selectivity” and “High Selectivity”]

Page 6:

Link-Cluster Topology

Multi-Cluster-Link Topology

Cluster Topology

Multi-Link Topology

Dynamic Programming (not very efficient)

Arbitrary Topologies

Page 7:

Cluster Topology

• Dramatic scalability effects of clustering in CPS
• Observed trend depends on the proportion of commonalities, not the number of predicates
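A toy cost model can illustrate why the proportion of shared predicates, rather than their absolute number, might drive the trend. This is not the model from the talk. It assumes every subscription in a cluster has the same number of predicates, the shared predicates are evaluated once per cluster, the remaining predicates run only when all shared ones hold, and there is no short-circuiting inside a subscription. All function names and numbers are hypothetical.

```python
def cost_without_clustering(n_subs, preds_per_sub):
    """Every subscription evaluates all of its predicates."""
    return n_subs * preds_per_sub

def cost_with_clustering(n_subs, preds_per_sub, n_shared, p_shared):
    """Shared predicates are evaluated once; each subscription's own
    predicates run only if the shared ones all hold (probability p_shared)."""
    own = preds_per_sub - n_shared
    return n_shared + p_shared * n_subs * own

def relative_cost(n_subs, preds_per_sub, proportion_shared, p_shared):
    """Clustered cost as a fraction of the unclustered cost."""
    n_shared = round(proportion_shared * preds_per_sub)
    clustered = cost_with_clustering(n_subs, preds_per_sub, n_shared, p_shared)
    return clustered / cost_without_clustering(n_subs, preds_per_sub)

# Same proportion of shared predicates, different predicate counts.
small = relative_cost(1000, 10, 0.2, 0.1)
large = relative_cost(1000, 20, 0.2, 0.1)
```

Under these assumptions, doubling the number of predicates while keeping the shared proportion fixed leaves the relative cost unchanged.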

Page 8:

Applications – DoS Resilience

[Figure labels: “Normal”, “Subscription Migration”]

Page 9:

Applications – DoS Resilience

[Figure labels: “High Commonality”, “Low Commonality”, “High Commonality”]

Page 10:

Related Work

• Carzaniga et al. [Carzaniga2001]: formal notation for covering
• Mühl [Mühl2002]: formal syntax for CPS routing
• Li et al. [Li2005] and Campailla et al. [Campailla2001]: BDD-based CPS matching algorithms

Page 11:

Conclusion

• Probabilistically optimal matching plans are known for some subscription topologies
• Scalable CPS matching depends heavily on commonalities
  • Focus on abstracting commonalities
• Future work
  • Express covering, correlation, …
  • Arbitrary subscription topologies
  • Metrics for expressing compression due to existence of commonalities

Page 12:

References

[Greiner2006] Finding optimal satisficing strategies for And-Or trees, Artificial Intelligence

[Carzaniga2001] Design and Evaluation of a Wide-Area Event Notification Service, ACM Transactions on Computer Systems

[Mühl2002] Large-Scale Content-Based Publish/Subscribe Systems, PhD Thesis

[Li2005] A Unified Approach to Routing, Covering and Merging in Publish/Subscribe Systems based on Modified Binary Decision Diagrams, ICDCS

[Campailla2001] Efficient filtering in Publish-Subscribe Systems using Binary Decision Diagrams, International Conference on Software Engineering

Page 13:

Extra Slides

Page 14:

Table-based versus Tree-based

[Equations for the Naive, Table-based and Tree-based cases, over the symbols $N$, $n$, $S$, $C$, $k$, $R$, $c$ and $p$]

Page 15:

Disjunctive Commonalities

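“Shortcut” unnecessary subscription/predicate evaluations

Examples:
• Per-Link Matching [Banavar1999, Carzaniga2003]
• DNF Subscriptions

[Equations: disjunctive commonality expression over $S_1, \ldots, S_m$ and $C$; given some publication $P$, a relation between $C(P)$ and $S_i(P)$, annotated “Computed by matching algorithm”]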
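As an illustration of the shortcut in the per-link matching example, the sketch below assumes that the commonality $C$ stands for "forward on this link" and is decided as soon as any subscription $S_i$ from that link matches the publication $P$. That reading is an assumption, and the types, function name and sample subscriptions are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Publication = Dict[str, float]
Subscription = Callable[[Publication], bool]

def link_matches(subscriptions: List[Subscription],
                 pub: Publication) -> Tuple[bool, int]:
    """Return (link matched?, number of subscriptions evaluated).
    Stops at the first matching subscription: the rest cannot change
    the forwarding decision for this link."""
    evaluated = 0
    for sub in subscriptions:
        evaluated += 1
        if sub(pub):
            return True, evaluated
    return False, evaluated

# Hypothetical subscriptions registered on one link.
subs: List[Subscription] = [
    lambda p: p["price"] > 100,
    lambda p: p["price"] > 10,
    lambda p: p["volume"] > 5,
]
matched, evaluated = link_matches(subs, {"price": 50, "volume": 9})
```

Here the third subscription is never evaluated because the second one already settles the link.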

Page 16:

Conjunctive Commonalities

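“Shortcut” unnecessary subscription/predicate evaluations

Examples:
• Shared predicates
• Clustering on subscription classes or attributes
• “Pruning” strategies (e.g., number of attributes)

[Equations: conjunctive commonality expression over $S_1, \ldots, S_m$ and $C$; given some publication $P$, a relation between $C(P)$ and $S_i(P)$, annotated “Computed by matching algorithm”]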
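For the shared-predicate example, the sketch below assumes that every subscription $S_i$ in a group contains the common predicate $C$, so a publication $P$ that fails $C$ cannot match any $S_i$. That reading is an assumption, and the function name, predicates and publications are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Publication = Dict[str, float]
Predicate = Callable[[Publication], bool]

def match_group(shared: Predicate,
                members: Dict[str, Predicate],
                pub: Publication) -> Tuple[List[str], int]:
    """Return (names of matching subscriptions, predicate evaluations).
    `shared` is the predicate common to all members; `members` maps each
    subscription name to the rest of its condition."""
    if not shared(pub):
        return [], 1  # one evaluation rules out every member
    matched = [name for name, rest in members.items() if rest(pub)]
    return matched, 1 + len(members)

# Hypothetical group: all members require symbol code 7.
shared = lambda p: p["symbol"] == 7
members = {
    "S1": lambda p: p["price"] > 100,
    "S2": lambda p: p["price"] > 10,
}
hit = match_group(shared, members, {"symbol": 7, "price": 50})
miss = match_group(shared, members, {"symbol": 3, "price": 500})
```

When the shared predicate fails, the whole group is dismissed after a single evaluation instead of one per subscription.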