Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is...

27
Alan G. Labouseur School of Computer Science and Mathematics Marist College Poughkeepsie, NY 12601 [email protected] Justin Svegliato School of Computer Science and Mathematics Marist College Poughkeepsie, NY 12601 [email protected] Jeong-Hyon Hwang Dept. of Computer Science University at Albany State University of New York Albany, NY 12222 [email protected] Distributed Graph Snapshot Placement and Query Performance in a Data Center Environment MARIST MARIST

Transcript of Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is...

Page 1: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

Alan G. Labouseur School of Computer Science

and Mathematics Marist College

Poughkeepsie, NY 12601 [email protected]

Justin Svegliato School of Computer Science

and Mathematics Marist College

Poughkeepsie, NY 12601 [email protected]

Jeong-Hyon Hwang Dept. of Computer Science

University at Albany State University of New York

Albany, NY 12222 [email protected]

Distributed Graph Snapshot Placement and Query Performance

in a Data Center Environment

MARIST MARIST

Page 2: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

draft 1.3 2Labouseur, Svegliato, and Hwang

IoTisalwaysGrowing• moresourcesandsensors• moreproductsandservices• moremessages• moretransactions

IoTisalwaysChanging• adding,removing,andmodifyingconnections

Inotherwords...IoTisalwaysEvolving

Q: Howdowegaininsightfromevolvingnetworks?

Daily Deluge of Data

Page 3: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

3

Dynamic Graphs

Labouseur, Svegliato, and Hwang

A:Wegaininsightfromevolvingnetworksbytreatingthemlike DynamicGraphswherevertices(dots)modelentitiesand edges(lines)modelrelationshipsbetweenentities.

Theevolutionofanetworkcanbemodeledasaseriesofgraphsnapshotsthatrepresentthatnetworkatdifferentpointsintime.

G1

ac

bd

G2 ec

d

a

bf

ce

d

a

b

G3

snapshot snapshotsnapshot

!mepasses !mepasses

draft 1.3

Page 4: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

4

G* Overview

Labouseur, Svegliato, and Hwang

G*isourdynamicgraphdatabasesystem.• graphsnapshotdistribution• multi-corescaleup• multi-serverscaleout• deduplicatedstorageforlargegraphs• in-memorycompactindexing• sharedcomputation• sophisticatedparallelgraphqueries• integrationwithRelationaldatabasesandotherstores

• aconvenienteasy-to-useGUI• adistributiondashboardforanalysis

draft 1.3

Page 5: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

5

G* Details: Distributed Architecture

Labouseur, Svegliato, and Hwang

QueryParser

QueryOptimizer

QueryCoordinator

querynetwork

executionplan

G* master

query

Communication Layer

Graph ManagerMemory Buffer

Disk

Index

Query Execution Engine

HA

Communication Layer

network

......Graph Manager

Memory Buffer

Disk

Index

Query Execution Engine

HA

Communication Layer

α β

control

data controldata control

(Justin)

draft 1.3

Page 6: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

6

G* Details: Deduplicated Snapshot Storage

Labouseur, Svegliato, and Hwang

c

bd

G2 G3ce

G1

df

ce

d

ca

b

a

b

a

bd

G1!G2!G3

ac

b

G1!G2!G3 G1-G2-G3

df

ce

(G2!G3)-G1 G3-G1-G2

!" #

d

(G1!G2)-G3

commonality-baseddeduplication

draft 1.3

Page 7: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

7

G* Details: Distributed Query Processing

Labouseur, Svegliato, and Hwang

Average Degree Query (BSP) operator vertex@* = VertexOperator([], [α,β,γ]);operator degree@* = ProjectionOperator([vertex@local], [id, graph.id, cardinality(outgoing_edges)+cardinality(incoming_edges)], [id, graph.id, degree]);operator partial@* = AggregateOperator([degree@local], [count, sum], [*, degree], [count, sum], [graph.id]);operator union@α = UnionOperator([partial@*]);operator total@α = AggregateOperator([union@α], [sum, sum], [count, sum], [count, sum], [graph.id]);operator avg@α = ProjectionOperator([total@local], [graph.id, 1.0*sum/count], [graph.id, avg]);

draft 1.3

(1,1,{G1,G2})

(c,0,{G1}), (d,0,{G1,G2}), (c,1,{G2}), (e,0,{G2}), (a,2,{G1,G2}) (b,1,{G1,G2})

(c, !,{G1}), (d, !,{G1,G2}), (c, !,{G2}), (e, !,{G2})(a,!,{G1,G2}) (b, !,{G1,G2})

(1,2,{G1,G2})

(3/4,{G1}), (4/5,{G2})

(1,2,{G1,G2}), (1,1,{G1,G2}), (2,0,{G1}), (3,1,{G2})

(2,0,{G1}), (3,1,{G2}))

c

bd

{G1,G2,G3}

ac

b

{G1,G2,G3} {G1}

df

ce

{G2,G3} {G3}

d

{G1,G2}

G2G1

ac

e

bd

ac

bd

vertex

degree

count, sum

average

union

vertex

degree

count, sum

vertex

degree

count, sum

vertex

degree

count, sum

average

union

vertex

degree

count, sum

vertex

degree

count, sum

(c,0,{G1}), (d,0,{G1,G2}), (c,1,{G2}), (e,0,{G2}),

(1,2,{G1,G2}), (1,1,{G1,G2}), (2,0,{G1}), (3,1,{G2})

!" #

query: average degree variation

{G1,G2}

Page 8: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

(1,1,{G1,G2})

(c,0,{G1}), (d,0,{G1,G2}), (c,1,{G2}), (e,0,{G2}), (a,2,{G1,G2}) (b,1,{G1,G2})

(c, !,{G1}), (d, !,{G1,G2}), (c, !,{G2}), (e, !,{G2})(a,!,{G1,G2}) (b, !,{G1,G2})

(1,2,{G1,G2})

(3/4,{G1}), (4/5,{G2})

(1,2,{G1,G2}), (1,1,{G1,G2}), (2,0,{G1}), (3,1,{G2})

(2,0,{G1}), (3,1,{G2}))

c

bd

{G1,G2,G3}

ac

b

{G1,G2,G3} {G1}

df

ce

{G2,G3} {G3}

d

{G1,G2}

G2G1

ac

e

bd

ac

bd

vertex

degree

count, sum

average

union

vertex

degree

count, sum

vertex

degree

count, sum

vertex

degree

count, sum

average

union

vertex

degree

count, sum

vertex

degree

count, sum

(c,0,{G1}), (d,0,{G1,G2}), (c,1,{G2}), (e,0,{G2}),

(1,2,{G1,G2}), (1,1,{G1,G2}), (2,0,{G1}), (3,1,{G2})

!" #

query: average degree variation

{G1,G2}

8

G* Details: Distributed Query Processing

Labouseur, Svegliato, and Hwang

Average Degree Query (BSP) operator vertex@* = VertexOperator([], [α,β,γ]);operator degree@* = ProjectionOperator([vertex@local], [id, graph.id, cardinality(outgoing_edges)+cardinality(incoming_edges)], [id, graph.id, degree]);operator partial@* = AggregateOperator([degree@local], [count, sum], [*, degree], [count, sum], [graph.id]);operator union@α = UnionOperator([partial@*]);operator total@α = AggregateOperator([union@α], [sum, sum], [count, sum], [count, sum], [graph.id]);operator avg@α = ProjectionOperator([total@local], [graph.id, 1.0*sum/count], [graph.id, avg]);

draft 1.3

Page 9: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

(1,1,{G1,G2})

(c,0,{G1}), (d,0,{G1,G2}), (c,1,{G2}), (e,0,{G2}), (a,2,{G1,G2}) (b,1,{G1,G2})

(c, !,{G1}), (d, !,{G1,G2}), (c, !,{G2}), (e, !,{G2})(a,!,{G1,G2}) (b, !,{G1,G2})

(1,2,{G1,G2})

(3/4,{G1}), (4/5,{G2})

(1,2,{G1,G2}), (1,1,{G1,G2}), (2,0,{G1}), (3,1,{G2})

(2,0,{G1}), (3,1,{G2}))

c

bd

{G1,G2,G3}

ac

b

{G1,G2,G3} {G1}

df

ce

{G2,G3} {G3}

d

{G1,G2}

G2G1

ac

e

bd

ac

bd

vertex

degree

count, sum

average

union

vertex

degree

count, sum

vertex

degree

count, sum

vertex

degree

count, sum

average

union

vertex

degree

count, sum

vertex

degree

count, sum

(c,0,{G1}), (d,0,{G1,G2}), (c,1,{G2}), (e,0,{G2}),

(1,2,{G1,G2}), (1,1,{G1,G2}), (2,0,{G1}), (3,1,{G2})

!" #

query: average degree variation

{G1,G2}

9

G* Details: Distributed Query Processing

Labouseur, Svegliato, and Hwang

Average Degree Query (BSP) operator vertex@* = VertexOperator([], [α,β,γ]);operator degree@* = ProjectionOperator([vertex@local], [id, graph.id, cardinality(outgoing_edges)+cardinality(incoming_edges)], [id, graph.id, degree]);operator partial@* = AggregateOperator([degree@local], [count, sum], [*, degree], [count, sum], [graph.id]);operator union@α = UnionOperator([partial@*]);operator total@α = AggregateOperator([union@α], [sum, sum], [count, sum], [count, sum], [graph.id]);operator avg@α = ProjectionOperator([total@local], [graph.id, 1.0*sum/count], [graph.id, avg]);

draft 1.3

Page 10: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

10

G* Details: Distributed Query Processing

Labouseur, Svegliato, and Hwang

(1,1,{G1,G2})

(c,0,{G1}), (d,0,{G1,G2}), (c,1,{G2}), (e,0,{G2}), (a,2,{G1,G2}) (b,1,{G1,G2})

(c, !,{G1}), (d, !,{G1,G2}), (c, !,{G2}), (e, !,{G2})(a,!,{G1,G2}) (b, !,{G1,G2})

(1,2,{G1,G2})

(3/4,{G1}), (4/5,{G2})

(1,2,{G1,G2}), (1,1,{G1,G2}), (2,0,{G1}), (3,1,{G2})

(2,0,{G1}), (3,1,{G2}))

c

bd

{G1,G2,G3}

ac

b

{G1,G2,G3} {G1}

df

ce

{G2,G3} {G3}

d

{G1,G2}

G2G1

ac

e

bd

ac

bd

vertex

degree

count, sum

average

union

vertex

degree

count, sum

vertex

degree

count, sum

vertex

degree

count, sum

average

union

vertex

degree

count, sum

vertex

degree

count, sum

(c,0,{G1}), (d,0,{G1,G2}), (c,1,{G2}), (e,0,{G2}),

(1,2,{G1,G2}), (1,1,{G1,G2}), (2,0,{G1}), (3,1,{G2})

!" #

query: average degree variation

Average Degree Query (BSP) operator vertex@* = VertexOperator([], [α,β,γ]);operator degree@* = ProjectionOperator([vertex@local], [id, graph.id, cardinality(outgoing_edges)+cardinality(incoming_edges)], [id, graph.id, degree]);operator partial@* = AggregateOperator([degree@local], [count, sum], [*, degree], [count, sum], [graph.id]);operator union@α = UnionOperator([partial@*]);operator total@α = AggregateOperator([union@α], [sum, sum], [count, sum], [count, sum], [graph.id]);operator avg@α = ProjectionOperator([total@local], [graph.id, 1.0*sum/count], [graph.id, avg]);

draft 1.3

Page 11: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

11

G* Details: Distributed Query Processing

Labouseur, Svegliato, and Hwang

(1,1,{G1,G2})

(c,0,{G1}), (d,0,{G1,G2}), (c,1,{G2}), (e,0,{G2}), (a,2,{G1,G2}) (b,1,{G1,G2})

(c, !,{G1}), (d, !,{G1,G2}), (c, !,{G2}), (e, !,{G2})(a,!,{G1,G2}) (b, !,{G1,G2})

(1,2,{G1,G2})

(3/4,{G1}), (4/5,{G2})

(1,2,{G1,G2}), (1,1,{G1,G2}), (2,0,{G1}), (3,1,{G2})

(2,0,{G1}), (3,1,{G2}))

c

bd

{G1,G2,G3}

ac

b

{G1,G2,G3} {G1}

df

ce

{G2,G3} {G3}

d

{G1,G2}

G2G1

ac

e

bd

ac

bd

vertex

degree

count, sum

average

union

vertex

degree

count, sum

vertex

degree

count, sum

vertex

degree

count, sum

average

union

vertex

degree

count, sum

vertex

degree

count, sum

(c,0,{G1}), (d,0,{G1,G2}), (c,1,{G2}), (e,0,{G2}),

(1,2,{G1,G2}), (1,1,{G1,G2}), (2,0,{G1}), (3,1,{G2})

!" #

query: average degree variation

Average Degree Query (BSP) operator vertex@* = VertexOperator([], [α,β,γ]);operator degree@* = ProjectionOperator([vertex@local], [id, graph.id, cardinality(outgoing_edges)+cardinality(incoming_edges)], [id, graph.id, degree]);operator partial@* = AggregateOperator([degree@local], [count, sum], [*, degree], [count, sum], [graph.id]);operator union@α = UnionOperator([partial@*]);operator total@α = AggregateOperator([union@α], [sum, sum], [count, sum], [count, sum], [graph.id]);operator avg@α = ProjectionOperator([total@local], [graph.id, 1.0*sum/count], [graph.id, avg]);

draft 1.3

Page 12: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

12

G* Details: Distributed Query Processing

Labouseur, Svegliato, and Hwang

(1,1,{G1,G2})

(c,0,{G1}), (d,0,{G1,G2}), (c,1,{G2}), (e,0,{G2}), (a,2,{G1,G2}) (b,1,{G1,G2})

(c, !,{G1}), (d, !,{G1,G2}), (c, !,{G2}), (e, !,{G2})(a,!,{G1,G2}) (b, !,{G1,G2})

(1,2,{G1,G2})

(3/4,{G1}), (4/5,{G2})

(1,2,{G1,G2}), (1,1,{G1,G2}), (2,0,{G1}), (3,1,{G2})

(2,0,{G1}), (3,1,{G2}))

c

bd

{G1,G2,G3}

ac

b

{G1,G2,G3} {G1}

df

ce

{G2,G3} {G3}

d

{G1,G2}

G2G1

ac

e

bd

ac

bd

vertex

degree

count, sum

average

union

vertex

degree

count, sum

vertex

degree

count, sum

vertex

degree

count, sum

average

union

vertex

degree

count, sum

vertex

degree

count, sum

(c,0,{G1}), (d,0,{G1,G2}), (c,1,{G2}), (e,0,{G2}),

(1,2,{G1,G2}), (1,1,{G1,G2}), (2,0,{G1}), (3,1,{G2})

!" #

query: average degree variation

Average Degree Query (BSP) operator vertex@* = VertexOperator([], [α,β,γ]);operator degree@* = ProjectionOperator([vertex@local], [id, graph.id, cardinality(outgoing_edges)+cardinality(incoming_edges)], [id, graph.id, degree]);operator partial@* = AggregateOperator([degree@local], [count, sum], [*, degree], [count, sum], [graph.id]);operator union@α = UnionOperator([partial@*]);operator total@α = AggregateOperator([union@α], [sum, sum], [count, sum], [count, sum], [graph.id]);operator avg@α = ProjectionOperator([total@local], [graph.id, 1.0*sum/count], [graph.id, avg]);

draft 1.3

Page 13: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

13

Dynamic Graph Analytics

Labouseur, Svegliato, and Hwang

Usingdistributedqueries,wecananalyzenetworkevolutiontodiscovertrendsandinsightscrucialtomanyLieldswithintheInternetofThings:

• networksofelectronichealthrecords• geolocationtrackers• diagnosticdatafromconnecteddevices(e.g.,smartwatches)• andmore…

G1

ac

bd

G2 ec

d

a

bf

ce

d

a

b

G3

!mepasses !mepasses

draft 1.3

Page 14: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

14

Wait. Isn’t this already solved?

Labouseur, Svegliato, and Hwang

Sofar...• Acceleratingqueriesbydistributingstaticdataovermultipleworkershasbeenwellexplored.

However...• Acceleratingqueriesbydistributingcontinuouslygenerateddatapresentsnewchallenges.

• Techniquesforpartitioningindividualgraphstofacilitateparallelcomputationhavebeendeveloped.

• Techniquesforpartitioningseriesofgraphs(snapshots)tofacilitateparallelcomputationraisenewchallenges.‣ cannotconsideronlyonegraphatatime

‣ cannotdistributeeverysnapshottoallworkers

Sothere’saproblem.

draft 1.3

Page 15: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

15

The Problem

Labouseur, Svegliato, and Hwang

Thesnapshotdistributionproblem:Howdowedistributesnapshotsofacontinuouslyevolvingnetworkamongouravailableworkersinawaythat’sef:icient,scalable,andoptimizedtoprocessvarioustypesofqueries?

Formally:

draft 1.3

Page 16: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

16

Queries

Labouseur, Svegliato, and Hwang

PageRankrequiresmanyverticestobeexaminedtocomputethevalueforeachvertex.

AverageDegreerequiresonlyonevertextobeexaminedtocomputethevalueforeachvertex.

n. . . α β γ

. . . . . .

. . . . . .

n. . . α β γ

. . . . . .

. . . . . .

draft 1.3

Page 17: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

17

Distributions

Labouseur, Svegliato, and Hwang

SharedNothingstoresalloftheverticescomprisingeachsnapshotononeofthenworkers.SnapshotG2isstoredentirelyonworkerβ.

n. . .

1-1

. . . . . .

. . . . . .

1-2

1-n

2-1

2-2

3-1

3-2

2-n 3-n

m-1

m-2

m-n

α β γ

n. . .

1-1

. . . . . .

. . . . . .

2-1

m-1

1-2

2-2

1-3

2-3

m-2 m-3

1-m

2-m

m-n

α β γ

SharedEverythingstorestheverticescomprisingeachsnapshotsuchthattheyaresharedamongallofthenworkers.SnapshotG2issharedacrossallworkers.

draft 1.3

Page 18: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

18

Data Center Environment

Labouseur, Svegliato, and Hwang

IBMPureFlexhardware• three2.9GHZIntelXeonE5-2690CPUsof8coreseach(for24corestotal)

• 131GBRAM• dedicated20TBstorageareanetwork

Software• Ubuntu14.04.2LTS(GNU/Linux 3.13.0-57-generic x86 64)virtualmachinesoneachofourthreebladesconLiguredwithaccesstoall8coresanda70GBSANpartition

• OpenJDKRuntimeEnvironment(IcedTea2.5.5).• G*Databaseinstalledononemasterand23workers,eachtasksettoasinglecore

draft 1.3

Page 19: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

Wecomputedtherelativespeedupfor• PageRankandAverageDegreequeries• ononesnapshotandallsnapshots(23evolutionarygraphs)• placedinSharedNothingandSharedEverythingdistributions• withvertexcountsof1K,2K,4K,and8K

19

Experiments

Labouseur, Svegliato, and Hwangdraft 1.3

Page 20: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

20

Relative Speedup

Labouseur, Svegliato, and Hwang

RelativespeedupistheaveragequeryexecutiontimeforagivenqueryinSharedEverythingplacement(τse)dividedbytheaveragequeryexecutiontimeforSharedNothingplacement(τsn).

Speedup=τse÷τsn

BothPageRankandAverageDegreequerieswereexecutedLivetimesandtheresultingexecutiontimesaveraged.

draft 1.3

Page 21: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

21

Results: PageRank on all snapshots

Labouseur, Svegliato, and Hwang

NotetheincreasingspeedupofSharedNothingoverSharedEverythingforPageRankqueriesonallsnapshots.

Speedupgrowsasdatasetgrows.

Shows• beneLitofparallelismvs.overheadofcommunication

• importanceofsnapshotplacement

Remember:PageRankqueriesinvolvesigniLicantcommunicationamongvertices.

Commoncase:AnalysisofchanginginLluenceasnetworksevolve.

draft 1.3

Page 22: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

22

Results: Average Degree on all snapshots

Labouseur, Svegliato, and Hwang

NotetheconsistencyinSharedNothingandSharedEverythingforAverageDegreequeriesonallshapshots.

Speedupisunchangedasdatasetgrows.

Shows• beneLitsof“embarrassing”parallelism(nocommunicationoverhead)

• snapshotplacementdoesnotmatterRemember:AverageDegreequeriesinvolve

nocommunicationamongvertices.

Commoncase:Analysisofchangingpopularityasnetworksevolve.

draft 1.3

Page 23: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

23

Results: PageRank on one snapshot

Labouseur, Svegliato, and Hwang

NotethedecreasingspeedupofSharedNothingoverSharedEverythingforPageRankqueriesononesnapshot.

Speedupshrinksasdatasetgrows.

Shows• asthedatasetgrows,thebeneLitofparallelismdecreasesduetoincreasingcommunicationoverhead

Remember:PageRankqueriesinvolvesigniLicantcommunicationamongvertices.

draft 1.3

Page 24: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

24

Results: Average Degree on one snapshot

Labouseur, Svegliato, and Hwang

NotethedecreasingspeedupofSharedNothingoverSharedEverythingforAverageDegreequeriesononeshapshot.

Speedupshrinksasdatasetgrows.

Shows• impactofcommunicationoverheadonparallelism

• importanceofsnapshotplacement

Remember:AverageDegreequeriesinvolvenocommunicationamongvertices.

draft 1.3

Page 25: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

25

Conclusions So Far

Labouseur, Svegliato, and Hwang

Eveninadatacenterenvironment(withfast,sharedresources),therearesigniMicantbeneMitsofcarefulsnapshotplacementinthemostcommonandimportantcases.• WhenmultiplesnapshotsarequeriedtogetherandthosequeriesinvolvesigniLicantcommunication(i.e.,PageRankonallsnapshots)itisadvantageoustostoreeachsnapshotonfewerservers.

• But…itisimportantthattheoverallquerieddatabebalancedoverallserverstoavoidhurtingtheperformanceofqueriesthatdonotinvolvesigniLicantcommunicationamongworkers(i.e.,AverageDegreeononesnapshot).

draft 1.3

Page 26: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

26

Next Steps

Labouseur, Svegliato, and Hwang

Ourworkcontinues.We’re...• experimentingwithlargeranddifferentdatasets

‣ long-tailBarabasi-Albertgraphs‣ socialgraphsgeneratedbytheLinkedDataBenchmarkCouncilbenchmarkingtool

• increasingtheresolutionofourexecutiontimemeasurementstorecorddataattheBSPsupersteplevel.

• comparingourdatacenterresultswiththoseobtainedfromexperimentingonthe64-coreserverclusterweusedinourpriorwork‣ Thismaysuggestinsightintographqueryperformanceonclustersofconnectedcomputerswithnosharedresourcesascomparedtodatacenterenvironmentswithmanysharedresources.

draft 1.3

Page 27: Graph Placement in a Data Center - Labouseur · draft 1.3 Labouseur, Svegliato, and Hwang 2 IoT is always Growing • more sources and sensors • more products and services • more

27

Thank you. Questions?

Alan G. Labouseur School of Computer Science

and Mathematics Marist College

Poughkeepsie, NY 12601 [email protected]

Justin Svegliato School of Computer Science

and Mathematics Marist College

Poughkeepsie, NY 12601 [email protected]

Jeong-Hyon Hwang Dept. of Computer Science

University at Albany State University of New York

Albany, NY 12222 [email protected]

Distributed Graph Snapshot Placement and Query Performance

in a Data Center Environment

draft 1.3