Giraph - University of Notre Dame · 2018. 10. 2. · •Developed by Apache •Based off...

Giraph Neil Butcher


Giraph

Neil Butcher


Background
• Giraph is a scalable platform for implementing graph algorithms
• Developed by Apache
• Based on ‘Pregel’
• Utilizes the Hadoop MapReduce framework to target graph problems
• Open source


Advantages of Solving Problems with Giraph
• Message-based communication: no locks
• Global synchronization: no semaphores
• Simple to program
• Massively parallel: task-based programming
• Fault tolerant: saves intermediate results


Giraph Algorithms: Basic Idea
• Algorithms are written from the perspective of a vertex
• Vertices send messages to each other to share pertinent information


How it Works
• The ‘compute’ function can:
– Modify the state of the vertex and its outgoing edges
– Send messages to other vertices
– Receive messages sent in the previous superstep
• Things that happen during a superstep:
– The ‘compute’ function is invoked on each vertex that received a message in the previous superstep (in the first superstep, on every vertex)
– The next superstep begins only after all vertices have completed their work
– If no messages are in flight, the program halts
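The superstep rules above can be sketched as a small single-process loop. This is an illustration in plain Python, not Giraph code: `run_supersteps`, `propagate_max`, and the callback signature are hypothetical names chosen for this sketch, and the example algorithm (spreading the largest vertex value through the graph) is an assumption picked only because it is short.

```python
from collections import defaultdict


def run_supersteps(values, edges, compute, max_supersteps=50):
    """Hypothetical driver: call compute() on active vertices until no messages remain."""
    inbox = {v: [] for v in values}
    active = set(values)              # every vertex runs in the first superstep
    superstep = 0
    while active and superstep < max_supersteps:
        outbox = defaultdict(list)

        def send(target, message):
            outbox[target].append(message)

        for v in sorted(active):
            values[v] = compute(superstep, values[v], edges[v], inbox[v], send)
        # Barrier: messages sent now are only delivered in the next superstep.
        inbox = {v: outbox.get(v, []) for v in values}
        # Only vertices that received a message are invoked next time.
        active = {v for v in values if inbox[v]}
        superstep += 1
    return values, superstep


def propagate_max(superstep, value, neighbors, messages, send):
    """Example compute(): keep the largest value seen and forward it when it changes."""
    new_value = max([value] + messages)
    if superstep == 0 or new_value > value:
        for neighbor in neighbors:
            send(neighbor, new_value)
    return new_value


edges = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
final_values, supersteps_used = run_supersteps({"a": 3, "b": 6, "c": 2}, edges, propagate_max)
```

The loop stops as soon as a superstep produces no messages, which mirrors the halting rule on the slide; a real Giraph job distributes the same loop across workers.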


Single Source Shortest Path Algorithm
• Read updates from other vertices, find minimum
• Send distance to other vertices
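The two steps named on this slide can be sketched as a vertex-centric loop. The code originally shown on the slide did not survive in the transcript, so the following is a plain-Python simulation written for illustration, not the slide's code. The function name `sssp` and the graph layout (a dict of `{vertex: {neighbor: weight}}` in which every neighbor also appears as a key) are assumptions, and edge weights are assumed to be non-negative.

```python
import math


def sssp(edges, source):
    """Simulate superstep-based single source shortest path on one machine."""
    dist = {v: math.inf for v in edges}
    inbox = {v: [] for v in edges}
    active = set(edges)                       # all vertices run in the first superstep
    while active:
        outbox = {v: [] for v in edges}
        for v in active:
            # Read updates from other vertices, find minimum.
            candidate = 0.0 if v == source else math.inf
            candidate = min([candidate] + inbox[v])
            if candidate < dist[v]:
                dist[v] = candidate
                # Send distance to other vertices.
                for neighbor, weight in edges[v].items():
                    outbox[neighbor].append(candidate + weight)
        inbox = outbox                        # barrier between supersteps
        active = {v for v in edges if inbox[v]}
    return dist


graph = {
    "A": {"B": 1.0, "C": 4.0},
    "B": {"C": 2.0, "D": 6.0},
    "C": {"D": 1.0},
    "D": {},
}
distances = sssp(graph, "A")
```

A vertex only sends messages when its distance improves, so the message traffic dies out and the loop ends once every shortest distance is known.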


Single Source Shortest Path Example


More Complex Example: PageRank
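The slide's PageRank code is not present in the transcript, so the following is a hedged plain-Python sketch of how PageRank is commonly written in the vertex-centric style. The function name `pagerank`, the damping factor of 0.85, the fixed count of 30 supersteps, and the requirement that every vertex has at least one outgoing edge are all assumptions made for this illustration.

```python
def pagerank(edges, supersteps=30, damping=0.85):
    """Simulate vertex-centric PageRank; edges maps each vertex to its out-neighbors."""
    n = len(edges)
    rank = {v: 1.0 / n for v in edges}
    inbox = {v: [] for v in edges}
    for step in range(supersteps):
        outbox = {v: [] for v in edges}
        for v in edges:
            if step > 0:
                # Combine the rank shares received from in-neighbors.
                rank[v] = (1.0 - damping) / n + damping * sum(inbox[v])
            # Split the current rank evenly across outgoing edges.
            for neighbor in edges[v]:
                outbox[neighbor].append(rank[v] / len(edges[v]))
        inbox = outbox                # barrier between supersteps
    return rank


web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(web)
```

Unlike shortest path, every vertex stays active in every superstep here, so the job ends after a fixed number of supersteps rather than when messages run out.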


Giraph Job Lifetime


Implementing an Algorithm in Giraph
• Define a Vertex class
– Subclass of existing implementations
• Define a VertexInputFormat to read the graph
• Define a VertexOutputFormat that defines how to extract the result based on the Vertex final state
• Many other features can be utilized to improve performance


Aggregators
• Each vertex can contribute values that all vertices can read in the following superstep
• Can maintain values (sum, min, max, accumulate, user defined, etc.)
• Aggregators must be registered on the master
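The aggregator behaviour described above can be sketched as follows. This is not Giraph's aggregator API: the `Aggregator` class, its `pending`/`visible` fields, and `register_aggregators` are hypothetical names used to show the idea that values folded in one superstep become readable in the next.

```python
class Aggregator:
    """Hypothetical aggregator: folds values in one superstep, exposes them in the next."""

    def __init__(self, initial, combine):
        self.initial = initial
        self.combine = combine
        self.pending = initial    # values aggregated during the current superstep
        self.visible = initial    # result of the previous superstep, readable by vertices

    def aggregate(self, value):
        self.pending = self.combine(self.pending, value)

    def end_superstep(self):
        # At the barrier the folded value becomes readable and the accumulator resets.
        self.visible, self.pending = self.pending, self.initial


def register_aggregators():
    # Stand-in for registration on the master: names map to aggregator instances.
    return {
        "edge_count": Aggregator(0, lambda a, b: a + b),
        "max_degree": Aggregator(0, max),
    }


edges = {"a": ["b", "c"], "b": ["c"], "c": []}
aggregators = register_aggregators()

# Superstep 0: every vertex contributes its out-degree.
for vertex, neighbors in edges.items():
    aggregators["edge_count"].aggregate(len(neighbors))
    aggregators["max_degree"].aggregate(len(neighbors))
for aggregator in aggregators.values():
    aggregator.end_superstep()

# Superstep 1: any vertex can now read the global values.
total_edges = aggregators["edge_count"].visible
max_degree = aggregators["max_degree"].visible
```

Because sum, min, and max are associative, each worker could fold its own vertices first and the partial results could be merged afterwards without changing the outcome.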


Combiners
• User-defined function to combine messages before being sent or delivered
• Saves on network and memory
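As a rough illustration of what a combiner saves, the sketch below merges all messages bound for the same vertex into one. The function name `combine_messages` and the `(target, message)` tuple layout are assumptions for this example, not Giraph's combiner interface. A `min` combiner suits shortest path, where only the smallest candidate distance matters.

```python
def combine_messages(outgoing, combiner):
    """Merge messages per target vertex; outgoing is a list of (target, message) pairs."""
    combined = {}
    for target, message in outgoing:
        if target in combined:
            combined[target] = combiner(combined[target], message)
        else:
            combined[target] = message
    return combined


# Four candidate distances produced on one worker during a superstep.
outgoing = [("d", 7.0), ("d", 5.0), ("c", 3.0), ("d", 4.0)]
to_send = combine_messages(outgoing, min)
```

Here four messages shrink to two before anything crosses the network. This only gives correct results when the receiving vertex needs nothing more than the combined value, as with min or sum.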


Checkpointing
• Can be expensive but necessary
• Ensures no single point of failure
• Store work at user-defined intervals
• Restart on failure
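The store-at-intervals and restart-on-failure idea can be sketched on a single machine with local files. Everything here is hypothetical and chosen for illustration: the file naming scheme, the `run` function, the `fail_at` switch that simulates a crash, and the counter that stands in for real superstep work. Giraph itself writes checkpoints to a distributed file system rather than a local directory.

```python
import os
import pickle
import tempfile


def save_checkpoint(directory, superstep, state):
    path = os.path.join(directory, f"checkpoint_{superstep}.pkl")
    with open(path, "wb") as f:
        pickle.dump({"superstep": superstep, "state": state}, f)


def load_latest_checkpoint(directory):
    steps = [
        int(name[len("checkpoint_"):-len(".pkl")])
        for name in os.listdir(directory)
        if name.startswith("checkpoint_") and name.endswith(".pkl")
    ]
    if not steps:
        return None
    with open(os.path.join(directory, f"checkpoint_{max(steps)}.pkl"), "rb") as f:
        return pickle.load(f)


def run(directory, total_supersteps, interval, fail_at=None):
    """Resume from the newest checkpoint, if any, and checkpoint every `interval` supersteps."""
    checkpoint = load_latest_checkpoint(directory)
    if checkpoint:
        superstep, state = checkpoint["superstep"], checkpoint["state"]
    else:
        superstep, state = 0, {"counter": 0}
    while superstep < total_supersteps:
        if superstep == fail_at:
            raise RuntimeError("simulated worker failure")
        state["counter"] += 1          # stand-in for one superstep of real work
        superstep += 1
        if superstep % interval == 0:
            save_checkpoint(directory, superstep, state)
    return state


with tempfile.TemporaryDirectory() as checkpoint_dir:
    try:
        run(checkpoint_dir, total_supersteps=10, interval=3, fail_at=7)
    except RuntimeError:
        pass                            # the job died after superstep 6 was saved
    resumed_from = load_latest_checkpoint(checkpoint_dir)["superstep"]
    final_state = run(checkpoint_dir, total_supersteps=10, interval=3)
```

The interval is the trade-off the slide points at: a short interval loses less work on failure but pays the write cost more often.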


ZooKeeper Responsibilities: Computation State
• Handles partition/worker mapping
• Global state
• Checkpoint paths, aggregator values, statistics


Master Responsibilities: Coordination
• Assigns partitions to workers
– Hash mapping is the default
– Can be user defined
• Monitors workers
• Coordinates supersteps (ending, starting, etc.)
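Hash-based partition assignment can be sketched in a few lines. The function names, the use of CRC32 as the hash, and the round-robin placement of partitions on workers are assumptions for this example; the sketch does not reproduce Giraph's actual hashing or its partitioner classes.

```python
import zlib


def hash_partition(vertex_id, num_partitions):
    """Map a vertex ID to a partition with a stable hash (CRC32 used as a stand-in)."""
    return zlib.crc32(str(vertex_id).encode("utf-8")) % num_partitions


def assign_partitions(num_partitions, workers):
    """Spread partitions over workers round-robin, as a master might."""
    return {p: workers[p % len(workers)] for p in range(num_partitions)}


partition_owner = assign_partitions(4, ["worker-0", "worker-1"])


def worker_for(vertex_id):
    return partition_owner[hash_partition(vertex_id, 4)]
```

A user-defined scheme would replace `hash_partition`, for example with ID ranges, when neighboring vertices should land on the same worker to cut cross-worker messages.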


Worker Responsibilities: Vertices

• Workers are assigned vertices
• Perform compute
• Pass messages between vertices
• Compute local aggregation values
