an I Online Graph Partitioning algorithm for distributed ... Incremental Online Graph Partitioning...

24
an Incremental Online Graph Partitioning algorithm for distributed graph databases Dong Dai*, Wei Zhang, Yong Chen IOGP

Transcript of an I Online Graph Partitioning algorithm for distributed ... Incremental Online Graph Partitioning...

an Incremental Online Graph Partitioning algorithm for

distributed graph databases

Dong Dai*, Wei Zhang, Yong Chen

IOGP

Workflow of The Presentation

6/30/17 HPDC 2017, Washington D.C. 2

A Use Case

OLTP vs. OLAP

RethinkGraph

Partitioning

Modeling and

Analysis

IOGP Design Idea

IOGP DetailsEvaluation

Setup

Partition Quality

OLTPPerformance

A Use Case

• MeetDanielfrom“All-About-HPC”LLC• HewantstobuildaprovenancesystemforHPC

• Hetracesallprovenance-relevantevents• Hemodelsandstoresthemintoaprovenancegraph• Heoffersusersgraphinterfacestoqueryprovenance

6/30/17 HPDC 2017, Washington D.C. 3

New events inserted continuously

Build large graphs with properties

Thousands of clients query it

Typical OLTP Workloads

OLTP vs. OLAP

• Wehavetwosetsoftoolstohandlegraphs• GraphDatabases• short,finishinmilliseconds• touchasmallpartofthegraph• measuredbytimecostofeachtransaction

• GraphProcessingEngines• longer,finishinseconds,minutes• touchalargepartoforthewholegraph• measuredbysystemthroughput

6/30/17 HPDC 2017, Washington D.C. 4

OLTP

OLAP

Daniel needs distributed graph databases for his OLTP opsBut, how he partitions the graphs?

Rethink the Graph Partitioning

•Manygraphpartitioningalgorithms•Multi-levelscheme• METIS,Chaco,PMRSB,Scotch,ParMetis,Pt-Scotch,etc.

• Heuristicsingle-passmethods• LDG,Fennel,2D-Grid,Vertex-Cut,etc.

• Onlinepartitioningalgorithms• Hashing,Leopard,etc.

6/30/17 HPDC 2017, Washington D.C. 5

Not Designed for OLTP Workloads

Rethink the Graph Partitioning

• WhatdoesOLTPneed?

• Responsequicklywheninsertionhappens• Itneedstofinishfast(inms)• Notimetowaitformulti-levelschemesorevenre-partitioning

• Responsewhileknownothingaboutthevertex• Theinsertionmaybe“random”• Donotassumeknowconnectivityofinsertedvertex

• Measurementofagoodpartitioning• OLTPcaresresponsetimeofeachoperation• Onlyminimizingedge-cuts/maximizingthebalanceisnotenough• Thinkaboutagraphwithnequal-size subgraphs

6/30/17 HPDC 2017, Washington D.C. 6

Modeling and Analysis

• Genericdistributedgraphdatabasemodel• directed/undirectedgraphs• bi-directionaccessesonthegraphs• queryablepropertiesonverticesandedges

6/30/17 HPDC 2017, Washington D.C. 7

Factors of OLTP Performance

• Single-PointAccessPerformance• One-hop:ifclientsknowthelocationofquerieddata• Itsavestimeforqueryinglocationinformation

• TraversalPerformance(Edge-CutandVertex-Cut)

6/30/17 HPDC 2017, Washington D.C. 8

IOGP Core Idea

• Leveragedeterministichashingtoplacenewvertices• fastandeasytocalculate• enableone-hopaccess• needno infotoconductpartitioning

• Incrementallyreassignverticestobetterlocation• continuouslygetbetterlocalitywithincreasingknowledge

• Forvertexistoolarge,changefromedge-cuttovertex-cut• increasetheparallelism• improveOLTPresponsetime

6/30/17 HPDC 2017, Washington D.C. 9

IOGP - Quiet Stage

• QuietStageisthedefaultstage• avertexv isinserted,itwillbeplacedbasedonHashing• anedgeu->v isinserted,itwillbeplacedtogetherwithitsconnectedvertices

6/30/17 HPDC 2017, Washington D.C. 10

v à u

v

Vertex Reassign Stage

• Whenweknewenoughconnectivityofavertexv• moveittothepartitionthatstoresthemostofitsneighbors• Aheuristicfunction(derivedfromFennel)

6/30/17 HPDC 2017, Washington D.C. 11

v à u

v|u1, u2, u3, ...

Edge Splitting Stage

• Whenavertexbecomesextremelylarge• Edge-cutleadstosignificantperformancedegradation• Splitalledgestoincreasetheparallelismofdataaccesses

6/30/17 HPDC 2017, Washington D.C. 12

IOGP Details

• We need per-vertex metadata• To record vertex location• To know vertex degree or it has been split• To efficiently calculate

• Counters

6/30/17 HPDC 2017, Washington D.C. 13

split(v) ...,...loc(v)

alo(v)

ali(v)

plo(v)

pli(v)

plo(v)

pli(v)

split(v)loc(v)

counters(v)

Original Location

Update Counters after Reassignment

1. Updateloc(v) intheoriginalserver

2. InserverSu (vismovingoutofit)• v’sactuallocalityturnsintopotentiallocality• v’slocalneighbors

• reduceactuallocalityby1becausevisnotlocalanymore

3. InserverSk (vismovingintoit)• v’spotentiallocalityturnsintoactuallocality• v’slocalneighbors

• increasetheiractuallocalityby1becausev islocaltothemnow

6/30/17 HPDC 2017, Washington D.C. 14

Optimization: Timing of Vertex Reassignment

• Vertexreassignmentistime-consuming• Observation• moreedges,morestableconnectivity;• lessreassignmentisneeded

• Design• deferringvertexreassignmentuntilitsconnectivitystabilizes• reducingvertexreassignmentfrequencywithmoreedges

• Implementation• Startreassignmentuntilitsdegreeishigherthank• Checkforreassignmentwhenitsdegreereaches[2*k,4*k,...,2l*k...]

6/30/17 HPDC 2017, Washington D.C. 15

Optimization: Asynchronous Data Movement

• Vertexreassignmentandedgesplitting• syncdatamovementmaystaleOLTPoperations• optimization:asynchronousdatamovement

• Edgesplittingexample• Updatein-memorycounters• addvertexinpendingmovementqueue• Serverstartstorejectnewedges

• Datamayindifferentlocationsduringmovement• originalserver,newserver,andmaybeboth• clientsneedtoreadmultiplecopiesandaggregatethem• Prefertheonefrom“newserver”

6/30/17 HPDC 2017, Washington D.C. 16

Evaluation Setup

• AllevaluationswereconductedontheCloudLab cluster• 32serversforback-endstorage

• WechosevariousdatasetsfromSNAPforevaluations• scalesfrom200Kedgestoover100Medges• fromvariousdomains,formingdifferentshapes

• WeevaluatedIOGPonadistributedgraphdatabaseprototype,namelySimpleGDB(https://github.com/daidong/simplegdb-Java)• Aprototypesystemhasbeenusedinseveralresearchprojects• Forfaircomparisonandcontrollableoptimizations

6/30/17 HPDC 2017, Washington D.C. 17

Partitioning Quality

METIS is not always balancedOthers (including IOGP) are well balanced

6/30/17 HPDC 2017, Washington D.C. 18

• METIS runs directly on the final graph• Fennel runs in random order

METIS is still the bestIOGP achieves very well

Continuous Refinement of IOGP

• Usingsimilarheuristic,IOGPperformsbetterthanFennelduetocontinuousrefinement

6/30/17 HPDC 2017, Washington D.C. 19

Key Tunable Parameters

Reassignment Threshold

somewhere around average degree of the graph

6/30/17 HPDC 2017, Washington D.C. 20

Split Threshold

Relevant with 1) hardware; 2) scale of the database cluster; 3) vertex degree and property size

Memory Footprint

• HowmanymemoryisusedinIOGPtokeepthein-memorycounters?

6/30/17 HPDC 2017, Washington D.C. 21

OLTP Performance

Graph Insertionless than 10% overhead

6/30/17 HPDC 2017, Washington D.C. 22

Graph TravelAs much as 2x benefits

Conclusion & Future Work

• OLTPworkloadsofdistributedgraphdatabasesrequirenewsetofgraphpartitioningalgorithms.

• IOGPachieveslessthan10%overheadsoninsertion,andasmuchas2xperformanceimprovementonqueries.

• Therearestilllotsofthingscanbedone• Reducetheoverheadsofvertexreassignments• Partitioningthegraphsbasedontheexactworkloadpatterns• ...

6/30/17 HPDC 2017, Washington D.C. 23

Thanks and Questions

6/30/17 HPDC 2017, Washington D.C. 24