When a relational database doesn’t work
And why a graph database might help
Contents
• Franz and customers
• Two Use Cases
– Amdocs: a real time semantic platform for telecom that knows everything about everyone in real time
– Real time news and social network analysis using the Linked Open Data Cloud
• Scalability?
• Integration with other NoSQL databases – Solr, MongoDB
Franz Inc – Who We Are
• Private, founded 1984
• We are an AI and Semantic Technology company
• Out of Berkeley
(1 (2 3) (4 5) (6 7) (8 9) (10 11) (12 13) (14 15)(16 17) (18 19 20 21 22 23 24 27 28) (29 30))
[Figure: example graph with nodes Bob, Alice, Craig and Bill]
How is it different from an RDB, and why is it more flexible?
• No Schema
  – Say whatever you want to say, but
  – ontologies may constrain what you put in the triple store
• No Link Tables
  – because you can do one-to-many relationships directly
• No Indexing Choices
  – Can add new data attributes (predicates) on-the-fly that will be real-time available for querying, because everything is automatically indexed
• Takes anything you give it: it is trivial to consume
  – Rows and columns from RDB, XML, RDF(S), OWL, Text and Extracted Entities, JSON
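The points above can be illustrated with a toy in-memory triple store. This is only a sketch of the idea, not AllegroGraph's API: the class name `TinyTripleStore`, its methods, and the predicates `knows` and `twitterHandle` are made up for illustration. It shows a one-to-many relationship stored without a link table, a new predicate added on the fly, and every position of every triple being indexed automatically.

```python
from collections import defaultdict


class TinyTripleStore:
    """Toy triple store for illustration only (hypothetical, not AllegroGraph)."""

    def __init__(self):
        self.triples = set()
        # Every triple is indexed by subject, predicate and object,
        # so no indexing choice has to be made up front.
        self.by_s = defaultdict(set)
        self.by_p = defaultdict(set)
        self.by_o = defaultdict(set)

    def add(self, s, p, o):
        t = (s, p, o)
        self.triples.add(t)
        self.by_s[s].add(t)
        self.by_p[p].add(t)
        self.by_o[o].add(t)

    def match(self, s=None, p=None, o=None):
        """Return sorted triples matching the pattern; None is a wildcard."""
        candidates = [
            index.get(key, set())
            for index, key in ((self.by_s, s), (self.by_p, p), (self.by_o, o))
            if key is not None
        ]
        pool = set.intersection(*candidates) if candidates else self.triples
        return sorted(pool)


store = TinyTripleStore()
# One-to-many relationship stored directly, without a link table.
store.add("Bob", "knows", "Alice")
store.add("Bob", "knows", "Craig")
store.add("Bob", "knows", "Bill")
# A brand-new predicate, added without any schema change.
store.add("Alice", "twitterHandle", "@alice")

friends = [o for _, _, o in store.match(s="Bob", p="knows")]
handles = store.match(p="twitterHandle")
```

The new `twitterHandle` predicate is queryable immediately because `add` updates all three indexes for every triple.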
AllegroGraph: RDF Graph Store
[Architecture diagram. Labels: REST, Java, JavaScript, SPARQL, Prolog, Rules, CLIF++, Geo, SNA, Time, RDFS+; Backup/Restore, Replication, Warm Failover, Security, Management; Session Management, Query Engine, Federation; Storage layer (compression, indexing, freetext, transactions)]
Use Case Amdocs
Build a semantic platform that knows everything about everyone in real time.
Telco Call Center Volume Quadruples Since 2007
• On average, each call
  – Lasts 10 minutes
  – Goes through 68 screens
• One call costs 3 months' profit from that customer
• It's getting worse every day!
Typical Interaction Begins in the Dark
[Figure: the agent's separate screens: Bill, Plan, Past Payments, Device, Calculator (avg peak usage), Past Interactions (Memos), Statements]
• The unknown – why calling? How to help?
• No real-time context - insight & guidance
• High AHT, poor FCR, low customer and agent satisfaction
AIDA Maps Events to Concepts
Events from many source systems are transformed into a set of related business concepts
[Figure: many events are mapped into a triple store with business concepts. Labels: Interactions, Bills, Orders, Payments, Collections, Charge dispute, Pay instructions, Individual, Customer, Subscriptions, Device Activated, Device heartbeat, Device changes, Chronology of events]
Kinds of concepts:
• Subjective: "good payer"
• Patterns: "always pays 2 days late"
• Trends: "improving payer"
• Geospatial: "within 5 miles of the tower"
• Time: "within 5 minutes of an outage"
• Probability: "probably will call about the bill"
• Absence of occurrence: "missed payment"
• Relationship between: "friend of a friend"
[Architecture diagram. Labels: Event Data Sources (NW, Web 2.0); Operational Systems (CRM, RM, OMS); Amdocs Event Collector; Amdocs Integration Framework; Events; Scheduled Events; Event Ingestion; Inference; Inference Engine (Business Rules); Bayesian Belief Network; Decision Engine; Actions; SBA Application Server; Container; "Sesame"; AllegroGraph Triple Store DB]
AIDA Event Collection
[Diagram: Amdocs Event Collector stages: Event Sources, Collection, Parsing, Mapping, Publishing; downstream stages: Ingestion, Inference & Decision]
• Events are collected from many heterogeneous, configured event sources
  – Phone calls, texting, video upload, roaming, etc.
  – iTunes download, web site interaction, media upload
  – Emails, support calls
  – Bill payment or non-payment
  – Phones stop working or disconnect
• All fused and mapped into a single event knowledge base
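As a rough sketch of the mapping step, the snippet below turns heterogeneous event records into triples in one shared knowledge base. Everything here is an assumption made for illustration: the `aida:` prefix, the field names (`id`, `kind`, `customer`, `time`, `details`) and the sample events are hypothetical and do not come from the Amdocs Event Collector.

```python
def map_event(event):
    """Map one raw event (a dict) to a list of (subject, predicate, object) triples.

    The field names and the 'aida:' prefix are hypothetical.
    """
    event_id = f"event:{event['id']}"
    triples = [
        (event_id, "rdf:type", f"aida:{event['kind']}"),
        (event_id, "aida:customer", f"customer:{event['customer']}"),
        (event_id, "aida:timestamp", event["time"]),
    ]
    # Source-specific fields simply become additional predicates.
    for key, value in event.get("details", {}).items():
        triples.append((event_id, f"aida:{key}", value))
    return triples


# Two events from different (made-up) sources, describing the same customer.
raw_events = [
    {"id": 1, "kind": "PhoneCall", "customer": "c42",
     "time": "2010-09-22T10:00:00", "details": {"durationSeconds": 600}},
    {"id": 2, "kind": "BillPayment", "customer": "c42",
     "time": "2010-09-22T11:30:00", "details": {"amount": 59.90}},
]

# All events are fused into one knowledge base of triples.
knowledge_base = [t for event in raw_events for t in map_event(event)]
events_for_c42 = sorted(
    s for (s, p, o) in knowledge_base
    if p == "aida:customer" and o == "customer:c42"
)
```

Because both events end up as triples about the same customer node, a single pattern match finds them regardless of which source system produced them.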
AIDA Semantic Inference
• Define rules that create higher-level concepts
  – Event (mapping) rules - Map event data into the domain ontology
  – Automatic rules - Compute new properties defined by the ontology
  – On-demand rules - Perform inference for the services
• Rules are triggered upon event ingestion, service request or schedule
• Semantic rule inference generates new triples from existing ones
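The last bullet, generating new triples from existing ones, can be sketched as naive forward chaining: apply every rule to the known triples and repeat until nothing new appears. This is a generic illustration, not AIDA's inference engine; the rule functions and the predicates `madePayment`, `hasLatePayment` and `paymentPattern` are hypothetical.

```python
def rule_payment_pattern(triples):
    """Hypothetical rule: a customer with a late payment has a 'Bad' payment pattern."""
    return {(s, "paymentPattern", "Bad")
            for (s, p, o) in triples if p == "hasLatePayment"}


def rule_late_payment(triples):
    """Hypothetical rule: link a customer to each of their payments marked Late."""
    late = {s for (s, p, o) in triples if p == "timeliness" and o == "Late"}
    return {(s, "hasLatePayment", o)
            for (s, p, o) in triples if p == "madePayment" and o in late}


def forward_chain(triples, rules):
    """Apply all rules repeatedly until no new triples are produced."""
    known = set(triples)
    while True:
        inferred = set()
        for rule in rules:
            inferred |= rule(known)
        fresh = inferred - known
        if not fresh:
            return known
        known |= fresh


facts = {
    ("cust1", "madePayment", "pay7"),
    ("pay7", "timeliness", "Late"),
}
closure = forward_chain(facts, [rule_payment_pattern, rule_late_payment])
new_triples = sorted(closure - facts)
```

The second inferred triple depends on the first, so it only appears on the second pass; that is why the loop runs to a fixed point rather than applying each rule once.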
[Ontology diagram. Labels: Customer; Bills (Charges, Amount, Due Date); Payments (Payment, "Timeliness": OnTime, Early, Late); Pattern (Good, Bad, Improving, Worsening); Devices (Make, Model, Status)]
Semantic Inference – Using Business Rules to generate high level concepts
• AIDA provides Workbench for business rule construction
• Utilizes a sophisticated magnetic block GUI for business analysts
• Rules triggered to infer and generate new business concepts

"Late Payment" defined in Workbench:

    rule PaymentDetails.timeliness {
      if date within EarlyPeriod days after customerBill.billDate
      then timeliness = Early ;
      else if date not within LatePeriod days after customerBill.billDate
      then timeliness = Late ;
      else timeliness = OnTime ;
    }

• Each business rule defines an attribute. This rule defines an attribute of the PaymentDetails class called timeliness
• All classes and their attributes are defined in the application ontology
[Figure label: Java code]
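To make the logic of the timeliness rule concrete, here is one possible reading of it in Python. The slides do not give values for EarlyPeriod and LatePeriod, so the 7 and 30 days below are assumptions, as is treating "within N days after" as "on or before billDate + N days".

```python
from datetime import date, timedelta

# Assumed values; the slides do not state EarlyPeriod or LatePeriod.
EARLY_PERIOD_DAYS = 7
LATE_PERIOD_DAYS = 30


def timeliness(payment_date, bill_date,
               early_days=EARLY_PERIOD_DAYS, late_days=LATE_PERIOD_DAYS):
    """Classify a payment as 'Early', 'Late' or 'OnTime' relative to the bill date.

    Assumption: 'within N days after billDate' means on or before billDate + N days.
    """
    if payment_date <= bill_date + timedelta(days=early_days):
        return "Early"
    if payment_date > bill_date + timedelta(days=late_days):
        return "Late"
    return "OnTime"


bill_date = date(2010, 9, 1)
early = timeliness(date(2010, 9, 5), bill_date)
on_time = timeliness(date(2010, 9, 20), bill_date)
late = timeliness(date(2010, 10, 15), bill_date)
```

The branches mirror the rule's order: the early test runs first, the late test second, and OnTime is the fallback.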
Decisioning – Probabilistic Assessment
• AIDA also incorporates Bayesian Belief Networks (BBN)
• These are graphical models for reasoning under uncertainty
• Important part of decision making – the likelihood of something happening is estimated by how often it occurred in the past (primarily used in medical research until recently)
• Evidence consists of observations on certain nodes leading to conclusions
[Diagram: evidence nodes leading to conclusion nodes. Labels: Payment Pattern, Bill, Payment, Expect Payment, Expect Payment Arrangement Setup]
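The smallest possible belief network has one evidence node and one conclusion node. The sketch below estimates its probabilities from how often things occurred in the past and then applies Bayes' rule. The counts are invented for illustration and the node names are hypothetical; a real BBN would have many nodes and a dedicated inference algorithm.

```python
# Hypothetical history of 100 customers: did they pay late, and did they call about the bill?
history = {
    ("late", "called"): 14,
    ("on_time", "called"): 6,
    ("late", "no_call"): 8,
    ("on_time", "no_call"): 72,
}


def posterior_called_given_late(counts):
    """P(called | late payment) via Bayes' rule, with probabilities estimated from counts."""
    total = sum(counts.values())
    called = counts[("late", "called")] + counts[("on_time", "called")]
    not_called = total - called

    prior = called / total                                       # P(called)
    like_called = counts[("late", "called")] / called            # P(late | called)
    like_not_called = counts[("late", "no_call")] / not_called   # P(late | no call)

    evidence = like_called * prior + like_not_called * (1 - prior)  # P(late)
    return like_called * prior / evidence


p_call = posterior_called_given_late(history)
```

With these made-up counts, observing a late payment raises the probability of a bill-related call from the prior of 0.2 to 14/22, roughly 0.64.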
Presenting insight to the CSR
[Screenshot callouts:]
• Prediction on reason for the call – ranked by probability
• Process opens relevant screen for reference and action
• Presentation of recent interactions and events
• Prioritized recommended treatment and script
First application: CRM
Amdocs Guided Interaction Advisor
First Call Resolution
• Increase up to 15%
Average Handling Time
• Reduce up to 30%
Training Costs
• Reduce up to 25%
Triples all the way down
So why a triple store
• Flexibility, flexibility and flexibility
  – Change the schema on a daily basis
  – Customers create new policies which in turn will create new schemas on the fly
• Needed to work with meaning
  – RDF describes data
• Needed to be declarative for everything
  – Most RTBI is a combination of data in the DB and Java variables in the application.
Text Intelligence for DOD/IS
How would you do this with your standard search engine?
• Give me a newspaper text with a republican and a democrat that serve on two subcommittees that have the same parent committee.
• Which [democrat|republican] is most vocal in the oil spill disaster?
• Given this text, find all the other texts that have the same people and the same main topics but not democrats in the text.
• Which newspaper favors [democrats|republicans]?
• Which [democrat|republican|senator|representative] got most of the attention in the last week?
• Give me the distribution of the most important topics yesterday
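The first question is a graph join, which is why a keyword search engine struggles with it. A minimal sketch over plain Python tuples follows. Only `has-people` is a predicate named in these slides; `party`, `serves-on`, `parent-committee` and all the data are invented for illustration.

```python
def objects(triples, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the triples."""
    return {o for (s, p, o) in triples if s == subject and p == predicate}


def articles_with_cross_party_siblings(triples):
    """Articles mentioning a Republican and a Democrat who serve on two
    different subcommittees that share a parent committee."""
    hits = set()
    articles = {s for (s, p, o) in triples if p == "has-people"}
    for article in articles:
        people = objects(triples, article, "has-people")
        reps = [x for x in people if "Republican" in objects(triples, x, "party")]
        dems = [x for x in people if "Democrat" in objects(triples, x, "party")]
        for rep in reps:
            for dem in dems:
                for sub_r in objects(triples, rep, "serves-on"):
                    for sub_d in objects(triples, dem, "serves-on"):
                        shared = (objects(triples, sub_r, "parent-committee")
                                  & objects(triples, sub_d, "parent-committee"))
                        if sub_r != sub_d and shared:
                            hits.add(article)
    return sorted(hits)


# Entirely hypothetical data.
data = {
    ("article1", "has-people", "SenA"), ("article1", "has-people", "RepB"),
    ("article2", "has-people", "SenA"), ("article2", "has-people", "RepC"),
    ("SenA", "party", "Republican"),
    ("RepB", "party", "Democrat"), ("RepC", "party", "Democrat"),
    ("SenA", "serves-on", "SubX"),
    ("RepB", "serves-on", "SubY"), ("RepC", "serves-on", "SubQ"),
    ("SubX", "parent-committee", "CommZ"),
    ("SubY", "parent-committee", "CommZ"),
    ("SubQ", "parent-committee", "CommW"),
}
matches = articles_with_cross_party_siblings(data)
```

In a triple store the nested loops collapse into one declarative query pattern; the explicit loops here only spell out the joins involved.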
The process
• We spider more than 300 on-line newspapers and thousands of blogs daily
• And search specifically for all the members of the Senate and House of Representatives and the executive branch
• Apply entity extractor to the text and extract main concepts
  – About 150 triples per text…
• Hook up these concepts with a detailed database of each politician and with information from the linked open data cloud
From News Article to
• People (has-people)
  – And their roles
• Places (has-places)
  – And the county, state, country they are in
• Organizations (has-organizations)
  – Government departments, company names, etc.
• Main Categories (has-domains)
  – Politics, sports, ministries, energy, finance, economics, ecology, oil, mining industry, etc.
• Main Concepts (has-main-groups)
  – Other important nouns and phrases in a text
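The mapping from extractor output to triples can be sketched as below. The five predicate names are the ones listed above; the shape of the `extracted` dictionary and the sample values are assumptions, since the slides do not show the entity extractor's output format.

```python
# Predicate names come from the slide; the dictionary keys are hypothetical.
FIELD_TO_PREDICATE = {
    "people": "has-people",
    "places": "has-places",
    "organizations": "has-organizations",
    "domains": "has-domains",
    "main_groups": "has-main-groups",
}


def article_to_triples(article_id, extracted):
    """Turn one article's extracted entities into (subject, predicate, object) triples."""
    triples = []
    for field, predicate in FIELD_TO_PREDICATE.items():
        for value in extracted.get(field, []):
            triples.append((article_id, predicate, value))
    return triples


# Made-up extractor output for one article.
extracted = {
    "people": ["Jane Doe"],
    "places": ["Louisiana"],
    "organizations": ["Department of Energy"],
    "domains": ["politics", "oil"],
    "main_groups": ["oil spill"],
}
article_triples = article_to_triples("article:123", extracted)
```

Once people, places and organizations are nodes in the graph, they can be linked to the politician database and to Linked Open Data entries with further triples.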
LOD cloud – Sept 22 2010
latest LOD cloud
AllegroText
• A little demo?
How scalable is this?
Loading
Queries
• Query planner now takes 99% of SPARQL 1.0 and automatically compiles it into a query graph flow language…
You can write this by hand if you want to optimize yourself.
This will actually work on Prolog with rules too!
Query performance notes: Wins
• Indices are small enough to fit in memory of conventional machines
• Simultaneous access to indices (see next slide)
• Pipeline architecture
  – Stream-based processing (all nodes can be active in parallel; most nodes can begin before the end of data is reached)
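Stream-based processing can be sketched with Python generators. This is an analogy only: generators interleave work on a single thread rather than running nodes in parallel, and the node names below are invented, not AllegroGraph's query graph operators. The sketch shows the property the slide highlights, namely that a downstream node produces output before the upstream scan has reached the end of its data.

```python
trace = []  # Records the order in which pipeline nodes do their work.


def scan(index):
    """Source node: streams triples from an index one at a time."""
    for triple in index:
        trace.append(f"scan {triple[0]} {triple[1]}")
        yield triple


def keep_predicate(stream, predicate):
    """Filter node: passes on triples with the given predicate as they arrive."""
    for triple in stream:
        if triple[1] == predicate:
            yield triple


def subjects(stream):
    """Projection node: emits the subject of each incoming triple."""
    for s, _, _ in stream:
        trace.append(f"emit {s}")
        yield s


index = [
    ("Bob", "knows", "Alice"),
    ("Bob", "age", "42"),
    ("Alice", "knows", "Craig"),
]
results = list(subjects(keep_predicate(scan(index), "knows")))
```

The trace shows `emit Bob` recorded before the scan reads the remaining triples, so no node waits for a complete intermediate result.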
The end