HIVE: Scalable, Cross Platform Graph Analytics Framework in Python Vincent Cavé - Intel Stanley Seibert - Anaconda FOSDEM 2020


Page 1

HIVE: Scalable, Cross Platform Graph Analytics Framework in Python

Vincent Cavé - Intel

Stanley Seibert - Anaconda

FOSDEM 2020

Page 2


Outline

• What is HIVE?
• Architecture
• Interfaces
• Extensibility
• Summary

Page 3


HIVE: A Bridge Between Graphs and Data Science

HIVE

• Graph Analytics in Python
• Data-science Inter-Operability
• High Performance
• Transparent Orchestration
• Community Driven
• Hardware Agnostic

• In development, to be open sourced in 2020

[Diagram labels: Hardware Vendors, Research Community, High Perf Graph Libraries, DASK, Graph Users, Python Data Science Packages]

Page 4


One Indirection to target them all

Hardware Architectures

Graph Frameworks
• SuiteSparse
• Galois
• GraphIt
• Gunrock
• …

Graph Representation

Graph Algorithm using Paradigm & API

• High-Level Graph API
• Graph Query API with Numba
• Data Inter-Operability

• Dynamic Task Graph
• Orchestrate compute & data
• Extensible via plugins

Data Science Ecosystem

HIVE DASK Runtime

HIVE APIs
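The "Dynamic Task Graph" idea on this slide can be illustrated with a small sketch. It is not HIVE or DASK code: it uses only the standard-library `graphlib` module as a stand-in scheduler, and the task names (`load`, `make_graph`, `count_nodes`, `count_edges`, `report`) are made up for illustration.

```python
from graphlib import TopologicalSorter


def run_task_graph(tasks, dependencies):
    """Run every task once, after all of the tasks it depends on.

    tasks:        name -> callable taking the results of its dependencies
    dependencies: name -> list of task names it depends on (in argument order)
    """
    results = {}
    for name in TopologicalSorter(dependencies).static_order():
        inputs = [results[dep] for dep in dependencies.get(name, [])]
        results[name] = tasks[name](*inputs)
    return results


def make_graph(edges):
    # Build a plain adjacency dict from an edge list.
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, [])
    return graph


# Hypothetical workflow: the names are illustrative only.
TASKS = {
    "load": lambda: [(0, 1), (1, 2), (2, 0)],
    "make_graph": make_graph,
    "count_nodes": lambda g: len(g),
    "count_edges": lambda g: sum(len(nbrs) for nbrs in g.values()),
    "report": lambda n, m: f"{n} nodes, {m} edges",
}
DEPENDENCIES = {
    "load": [],
    "make_graph": ["load"],
    "count_nodes": ["make_graph"],
    "count_edges": ["make_graph"],
    "report": ["count_nodes", "count_edges"],
}

results = run_task_graph(TASKS, DEPENDENCIES)
```

A real runtime would also decide where each task runs and move data accordingly; this sketch only shows the dependency ordering.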

Page 5


HIVE Framework Interfaces

User API: Louvain(G)

Data Models: Graphs: {DF@CPU, CSR@CPU, …}

Transformers: {DF@CPU => CSR@CPU}, …

Graph Algorithms Backends: {Louvain, XBLAS, CPU, CSR}, …

HIVE / DASK <orchestrate>

Congratulations, you’ve just built a graph!
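A minimal sketch of how the four interfaces might fit together, assuming a simple dictionary-based registry. HIVE's actual API is not shown in the slides, so every name here (`DataModel`, `BackendKey`, `transformers`, `backends`, `out_degree`) is hypothetical, and a trivial out-degree computation stands in for Louvain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataModel:
    fmt: str      # e.g. "DF" (edge table) or "CSR"
    device: str   # e.g. "CPU"


@dataclass(frozen=True)
class BackendKey:
    algorithm: str
    library: str
    device: str
    fmt: str


DF_CPU = DataModel("DF", "CPU")
CSR_CPU = DataModel("CSR", "CPU")

transformers = {}   # (source model, target model) -> conversion function
backends = {}       # BackendKey -> implementation


def edges_to_csr(edges):
    """Transformer DF@CPU => CSR@CPU: edge list -> (indptr, indices)."""
    n = 1 + max(max(u, v) for u, v in edges)
    counts = [0] * n
    for u, _ in edges:
        counts[u] += 1
    indptr = [0]
    for c in counts:
        indptr.append(indptr[-1] + c)
    indices = [0] * len(edges)
    cursor = list(indptr[:-1])      # next free slot in each row
    for u, v in edges:
        indices[cursor[u]] = v
        cursor[u] += 1
    return indptr, indices


def out_degree_csr(csr):
    """Backend operating on CSR@CPU (stand-in for a real algorithm)."""
    indptr, _ = csr
    return [indptr[i + 1] - indptr[i] for i in range(len(indptr) - 1)]


transformers[(DF_CPU, CSR_CPU)] = edges_to_csr
backends[BackendKey("out_degree", "pure_python", "CPU", "CSR")] = out_degree_csr


def out_degree(graph, model):
    """User API: pick a backend and convert the input if necessary."""
    for key, impl in backends.items():
        if key.algorithm != "out_degree":
            continue
        target = DataModel(key.fmt, key.device)
        if target == model:
            return impl(graph)
        convert = transformers.get((model, target))
        if convert is not None:
            return impl(convert(graph))
    raise LookupError(f"no backend reachable from {model}")


EDGES = [(0, 1), (0, 2), (1, 2), (2, 0)]
degrees = out_degree(EDGES, DF_CPU)   # converted to CSR behind the scenes
```

The caller passes an edge table and never mentions CSR; the conversion is found through the transformer registry.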

Page 6


All this time, it was a graph of plugins

[Diagram: a graph of plugin nodes labeled User API, Graph Algorithms Backends (three nodes), Data Model (five nodes), and Transformers]

Page 7


Doing Graph Analytics With The Help of Graphs

Workflow Task Graphs
[Diagram nodes: Load Data, Preprocessing, Make Graph (two nodes), Graph Op #1, Graph Op #2, Graph Op #3, Save, Visualize]

Data Transformation Graphs
[Diagram nodes: File Format #1, File Format #2, Table, Array, Graph Format #1, Graph Format #2]

[Diagram label: HIVE]

• Orchestrate HW backend selection & data movement
• Automated data transformers selection
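One way automated transformer selection could work is a shortest-path search over the data transformation graph. The sketch below is an assumption, not HIVE's documented method: it uses breadth-first search, and the edges between formats are invented for illustration (the slide names the nodes but the extracted text does not say which transformers exist).

```python
from collections import deque


def find_transformer_path(transformers, source, target):
    """Return the shortest chain of formats from source to target, or None."""
    adjacency = {}
    for src, dst in transformers:
        adjacency.setdefault(src, []).append(dst)

    previous = {source: None}       # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in adjacency.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical set of available transformers (directed edges).
TRANSFORMERS = [
    ("File Format #1", "Table"),
    ("File Format #2", "Array"),
    ("Table", "Array"),
    ("Table", "Graph Format #1"),
    ("Array", "Graph Format #2"),
    ("Graph Format #1", "Graph Format #2"),
]

path = find_transformer_path(TRANSFORMERS, "File Format #1", "Graph Format #2")
```

Breadth-first search minimizes the number of conversions; a runtime that knows conversion costs could swap in a weighted search instead.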

Page 8


Extensibility: Supporting New Hardware

HIVE / DASK

User API: Louvain(G)

Data Models: CSR@XPU, …

Transformers: {DF@CPU => CSR@XPU}, …

Graph Algorithms Backends: {Louvain, XBLAS, XPU, CSR}, …

• No functional changes to User API

• New hardware only requires a few plugins

• Becomes part of the HIVE runtime toolbox

• Mixing between HW architectures is automatically supported
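The claim that new hardware needs only plugins, with no change to the user-facing call, can be sketched as follows. Everything here is hypothetical: the decorator `register_backend`, the `DEVICE_PREFERENCE` policy, and the `edge_count` algorithm (a stand-in for Louvain) are illustrative names, and the "XPU" function runs ordinary Python rather than device code.

```python
# Hypothetical plugin registry; none of these names come from HIVE itself.
BACKENDS = {}                        # (algorithm, device) -> implementation
DEVICE_PREFERENCE = ["XPU", "CPU"]   # assumed policy: try the accelerator first


def register_backend(algorithm, device):
    """Decorator that adds an implementation to the runtime toolbox."""
    def decorator(func):
        BACKENDS[(algorithm, device)] = func
        return func
    return decorator


def edge_count(graph):
    """User API: callers never name a device."""
    for device in DEVICE_PREFERENCE:
        impl = BACKENDS.get(("edge_count", device))
        if impl is not None:
            return device, impl(graph)
    raise LookupError("no backend registered for edge_count")


@register_backend("edge_count", "CPU")
def edge_count_cpu(graph):
    return sum(len(neighbors) for neighbors in graph.values())


GRAPH = {0: [1, 2], 1: [2], 2: []}
before = edge_count(GRAPH)   # served by the CPU plugin


@register_backend("edge_count", "XPU")
def edge_count_xpu(graph):
    # Stand-in: a real plugin would launch device code here.
    return sum(len(neighbors) for neighbors in graph.values())


after = edge_count(GRAPH)    # same call, now routed to the XPU plugin
```

The call site `edge_count(GRAPH)` is identical before and after the XPU plugin is registered; only the registry contents changed.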

Page 9


Extensibility: Supporting a new User API

HIVE / DASK

User API: TC(G)

Data Models

Transformers

Graph Algorithms Backends: {TC, XBLAS, CPU, CSR}, …

• Extend the User API

• Provide at least one implementation

• Becomes part of the HIVE runtime toolbox
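As an illustration of "provide at least one implementation", here is a sketch of a pure-Python backend for `TC(G)` on CSR input. It assumes TC stands for triangle counting, which the slides do not spell out, and it assumes an undirected graph stored as a symmetric CSR structure with no self-loops. The function name and signature are hypothetical.

```python
def triangle_count_csr(indptr, indices):
    """Count triangles in an undirected graph given as symmetric CSR.

    Each triangle {u, v, w} is counted once, at the ordering u < v < w.
    """
    n = len(indptr) - 1
    neighbors = [set(indices[indptr[u]:indptr[u + 1]]) for u in range(n)]
    total = 0
    for u in range(n):
        for v in neighbors[u]:
            if v > u:
                # A common neighbor w > v closes the triangle u < v < w.
                total += sum(1 for w in neighbors[u] & neighbors[v] if w > v)
    return total


# Edges 0-1, 0-2, 1-2, 2-3 stored in both directions.
INDPTR = [0, 2, 4, 7, 8]
INDICES = [1, 2, 0, 2, 0, 1, 3, 2]
triangles = triangle_count_csr(INDPTR, INDICES)
```

Registering such a function under a key like {TC, pure Python, CPU, CSR} would be the single implementation the new User API entry needs; faster backends could be added later without touching callers.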

Page 10


Stakeholders View

Graph Framework Developers
• Python frontend for algorithms
• Increased user base
• Performance feedback

Data Scientists
• Unified API for Graph Analytics
• Python inter-operability
• State of the art backends
• Transparent orchestration
• Increased workflow portability

Researchers
• Easy integration in workflows
• Easily extensible
• Performance monitoring & optimization

Page 11


HIVE: A Bridge Between Graphs and Data Science

HIVE

[Diagram labels: Hardware Vendors, Research Community, High Perf Graph Libraries, DASK, Graph Users, Python Data Science Packages]

Questions?