Journey of The Connected Enterprise - Knowledge Graphs - Smart Data

Knowledge Graphs Journey of the Connected Enterprise Benjamin Nussbaum @bennussbaum | [email protected] www.atomrain.com | www.graphgrid.com


Knowledge Graphs: Journey of the Connected Enterprise

Benjamin Nussbaum
@bennussbaum | [email protected]

www.atomrain.com | www.graphgrid.com

What is a graph?

This is a graph…

Not this…

Vocabulary is important

Why do graphs matter?

Graphs fundamentally change how we interact with our data

Data used to be stored like this and was completely disconnected

RDBMS promised it would connect our data, but index-based joining was too costly

…and NoSQL stores went even further away from connection being front and center

We need graphs because our data is actually a graph

The world we live in is driven by the context of how things are related

And we can represent those things

And the nature of the relationship to other things

Along with important contextual information about that relationship
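The three slides above describe things, the relationships between them, and context carried on each relationship. A minimal sketch of that property graph model follows. The class names, the `WORKS_AT` relationship type, and the sample data are hypothetical illustrations, not taken from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A 'thing': a label plus descriptive properties."""
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed connection that carries its own context."""
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


def describe(rel: Relationship) -> str:
    """Render a relationship as start -[TYPE {context}]-> end."""
    start = rel.start.properties["name"]
    end = rel.end.properties["name"]
    return f"{start} -[{rel.type} {rel.properties}]-> {end}"


# Hypothetical sample data: two things and one relationship with context.
alice = Node("Person", {"name": "Alice"})
acme = Node("Company", {"name": "Acme"})
works_at = Relationship(alice, "WORKS_AT", acme, {"since": 2012, "role": "Engineer"})

print(describe(works_at))
```

The context ("since", "role") lives on the relationship itself rather than on either node, which is the point the last slide makes.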

Without connection the picture our data provides is incomplete at best

And that is why big data left so many with solutions that were lacking

Big Data Ingest Headlines

• We’re writing 1 billion rows per day into Redshift
• etc.

• We’re writing petabytes of sensor output per year into Dynamo
• etc.

It’s time to go from Big Data to Smart Data

General Data Goals

• Intuitive
• Speed
• Agility

Smart Data Characteristics We Want

• Connected
• Easily Explored (ad-hoc queries)
• Dynamic Graph Traversal
• Constant Time Graph Traversal

• Pattern Detection
• Contextually Relevant Edges
• Guaranteed Edge Integrity
• Easily understood by non-tech
• Knowledge Representation

An immediately accessible graph changes everything

Index-Free
• Native Property

Index-Based (join pain)
• Native Property
• Native Triple
• RDBMS (Graph Layer)
• HDFS (Graph Layer)
• NoSQL (Graph Layer)
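One way to picture the index-free versus index-based distinction is the sketch below. It assumes a toy edge list and in-memory objects; real graph stores are considerably more involved. In the index-based version every hop searches a sorted relationship table, so its cost grows with the size of the table. In the index-free version each node holds direct references to its neighbors, so a hop costs only as much as that node's own relationships. All names here are hypothetical.

```python
from bisect import bisect_left

# Toy relationship table, sorted by source id (stands in for an index).
edges = sorted([(1, 2), (1, 3), (2, 4), (3, 4), (4, 5)])


def neighbors_index_based(node_id):
    """Each hop is a binary search over the whole table: O(log n) per hop."""
    i = bisect_left(edges, (node_id,))
    out = []
    while i < len(edges) and edges[i][0] == node_id:
        out.append(edges[i][1])
        i += 1
    return out


class Node:
    """Index-free adjacency: a node stores references to its neighbors."""

    def __init__(self, node_id):
        self.id = node_id
        self.out = []


# Build the object graph once; traversal below never consults a lookup table.
nodes = {i: Node(i) for i in range(1, 6)}
for src, dst in edges:
    nodes[src].out.append(nodes[dst])


def neighbors_index_free(node):
    """Each hop follows stored references, independent of total graph size."""
    return [n.id for n in node.out]


def reachable(start):
    """Depth-first traversal that only follows direct references."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in node.out:
            if nxt.id not in seen:
                seen.add(nxt.id)
                stack.append(nxt)
    return sorted(seen)


print(neighbors_index_based(1), neighbors_index_free(nodes[1]), reachable(nodes[1]))
```

Both approaches return the same neighbors; the difference is what each hop costs as the data grows.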

The Graph Components

Storage Models vs Semantic Models

Fortunately you don’t have to choose RDF or Property

What are the benefits of an index-free native graph property storage model?
• You’re interacting with your data in its true form
• Everyone can understand the data design and organization
• Developers get more done in less time
• Your organization’s data is connected across all silos
• Understanding the connections becomes very apparent
• Non-invasive low burn integration with existing data architecture

Improved Data Understanding and Interaction

[Figure: diagram in which every connection is labeled JOIN]

Improved Data Understanding and Interaction

Improved Developer Productivity

“Complex Join” in SQL
opencypher.org: Native Query Language for Graphs

SQL Query vs Native Graph Query (Cypher)

Equivalent queries for finding the reporting chain within an organization
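The queries shown on this slide are not preserved in the transcript. As a rough illustration of the comparison, the sketch below runs a reporting-chain query in SQL (a recursive common table expression, executed with Python's built-in `sqlite3`) and places a Cypher pattern next to it for reading only. The `employee` table, the `Employee` label, the `REPORTS_TO` relationship type, and the sample names are all assumptions made for this example.

```python
import sqlite3

# Hypothetical schema: each employee row points at its manager's row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Dana", None), (2, "Lee", 1), (3, "Sam", 2), (4, "Kim", 3)],
)

# SQL: walk up the hierarchy by repeatedly joining the table to itself.
SQL_REPORTING_CHAIN = """
WITH RECURSIVE chain(id, name, manager_id, depth) AS (
    SELECT id, name, manager_id, 0 FROM employee WHERE name = ?
    UNION ALL
    SELECT m.id, m.name, m.manager_id, chain.depth + 1
    FROM employee AS m JOIN chain ON m.id = chain.manager_id
)
SELECT name FROM chain WHERE depth > 0 ORDER BY depth
"""

# Cypher: the same question as a variable-length path pattern.
# Shown for comparison only; it is not executed in this sketch.
CYPHER_REPORTING_CHAIN = """
MATCH path = (e:Employee {name: $name})-[:REPORTS_TO*]->(m:Employee)
RETURN m.name ORDER BY length(path)
"""


def reporting_chain(name):
    """Return the managers above `name`, nearest manager first."""
    return [row[0] for row in conn.execute(SQL_REPORTING_CHAIN, (name,))]


print(reporting_chain("Kim"))
```

The SQL version has to spell out the self-join and the recursion, while the Cypher pattern states the shape of the path directly.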

Improved Cross-Functional Collaboration

Graphs Connect Not Only Your Data But Your Whole Organization

Seamless Integration with Existing Systems

• Very low-risk, non-invasive operation
• Create connectors for existing databases
• Flow data into your knowledge graph
• Real-time, analytics, learning, understanding, and other applications interact with the native graph database directly
• Start flowing new data directly into your knowledge graph (assuming you chose one that is ACID and guarantees referential integrity)
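The connector step above can be sketched as follows, under heavy simplification. An in-memory class stands in for the graph database, and a small function plays the role of a connector that reads rows shaped like existing relational tables and writes nodes and relationships. The class, the function, the `PLACED` relationship type, and the row layout are hypothetical. The integrity check only illustrates the idea of refusing dangling relationships; it is not how any particular product enforces it.

```python
class KnowledgeGraph:
    """Toy stand-in for a graph store that refuses dangling relationships."""

    def __init__(self):
        self.nodes = {}  # node key -> properties
        self.edges = []  # (start key, relationship type, end key)

    def merge_node(self, key, **props):
        """Create the node if missing, then update its properties."""
        self.nodes.setdefault(key, {}).update(props)

    def add_edge(self, start, rel_type, end):
        """Add a relationship only if both endpoints already exist."""
        for key in (start, end):
            if key not in self.nodes:
                raise KeyError(f"unknown node {key!r}")
        self.edges.append((start, rel_type, end))


def load_orders(graph, customers, orders):
    """Connector sketch: relational-style rows in, nodes and edges out."""
    for c in customers:
        graph.merge_node(("Customer", c["id"]), name=c["name"])
    for o in orders:
        graph.merge_node(("Order", o["id"]), total=o["total"])
        graph.add_edge(("Customer", o["customer_id"]), "PLACED", ("Order", o["id"]))


# Hypothetical rows, as they might come from an existing database.
graph = KnowledgeGraph()
load_orders(
    graph,
    customers=[{"id": 1, "name": "Acme"}],
    orders=[{"id": 10, "customer_id": 1, "total": 99.5}],
)
print(graph.edges)
```

The source systems stay untouched in this sketch: the connector only reads from them, which is what makes the approach non-invasive.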

These last few slides may sound simple

• But if you’ve never gone through it there are challenges looming
• It’s not just a technical choice
• It’s a complete paradigm shift for the organization

Graph Thinking is a Paradigm Shift

And no paradigm has ever made more sense

• This is how the brain works: dealing with “things” and how they’re related is already how we’re wired to function
• Solve more complex problems with less effort
• Improved collaboration between technical teams and everyone else
• Flexibility to evolve your data naturally as your business changes

But don’t go it alone

• The challenge of asking an RDBMS developer to work with graph for the first few times is that old habits die hard. They tend to treat it like a relational database, which is death for an index-free native graph.
• “An ounce of prevention is worth a pound of cure” holds true. Engage an expert early and your time to market will shorten by years.
• Hiring and training a team around this will take years, but you can get a jump start and avoid all the pitfalls that experience teaches

And don’t just use any existing vendor

• “Oh we’ll just use (insert global consultancy) because they’re on the vendor list already and they told us they could do it and they’ll put a larger team on it.”
• In software a bigger team does NOT mean you’ll get a better result. Especially when it is dealing with a fundamental paradigm shift.
• Few individuals are capable of leading an enterprise through the transformation needed to become a connected enterprise. It’s a completely different world view. It is not simply the introduction of a technology.

Hire a Professional Team with Experience in Connected Data

• Often we end up at organizations, from global telecoms to national transportation leaders and global financial institutions, with the goal:
  • Fix our graph architecture that is failing in production
  • Get our graph architecture that has run 1-2 years over into production

• It doesn’t need to be us, but it probably should be
• There are few teams with as much expertise in this area as we have

• At least make sure you’re getting individuals that know how to work with native graph data

Just get started

• Getting a POC going on one piece of the enterprise data is the first step, and it is where the new paradigm can be proven
• Trying to start with everything at once will likely keep the initiative from getting off the ground

In Summary

• Pick your first connected data initiative
• Use an index-free native graph property storage model
• If you think you have too much data for it to handle come see me

• Have an expert in enterprise graph data architectures help you
• Engage your non-technical teams from the beginning in the initiative

Q&A

Thank  You!
