1. Knowledge graph panel (to share) - ISWC...

Enterprise-scaleknowledge graphs

● Yuqing Gao, Microsoft● Anant Narayanan, Facebook● Alan Patterson, eBay● Jamie Taylor, Google● Anshu Jain, IBM

Panel: Enterprise-scale knowledge graphs

Knowledge Systems

Yuqing GaoAI and ResearchMicrosoft

Knowledge systems: World graph

Web pages, Web documents, Images, …

Search queries, views, click throughs, …

World graph• People• Places• Organizations• Things• Actions…

Academic graph• People• Publications• Fields of Study• Venues…

Knowledge systems: Academic graph

Authors, institutions, articles, conferences …

Knowledge acquisition, search, recommendation …

Work graph• People• Groups• Messages• Activities

…

Knowledge systems: Work graph

Emails, Messages, Documents, Meetings, …

Messages read/sent, Document author/shared, …

Infusing knowledge: From search to conversation

Search

Q&A

Enrichment Recommendations

Conversation

Relatedness

Bing – knowledge in answers

Knowledge in Conversation

I want to travel to NY 2 days before Thanksgiving, staying for a week

Okay, booking a flight to JFK from November 20 to November 27. Where will you be flying from?

Got it, I’ve found some flights for you …

From San Francisco, and also non-stop in first class

How about leaving in the afternoon

NY

JFKairport

Thanksgiving

SF

SFOairport

Nov 22

travel

NY

Thanksgiving

week

San Francisco

non-stop first class

leaving afternoon

FlightSearch

Delta2648

Filter Set

…

…

How do we bring knowledge systems to life?

Knowledge creation

Raw DataStructured + Unstructured

High-qualitySemanticallyOrganized

Challenges of scaled KGsBuilding a small KG is easy - building a vast system like Satori is a huge challenge

Coverage

Freshness

Have we got the information we need?

CorrectnessIs our information accurate?

Is information up to date?

Will Smith: Single entity, 108K facts assembled from 41 web sites.There are 200 Will Smiths on Wikipedia alone.

Three forces in constant conflict:

Increased freshness and coverage

Harder to ensure correctness

Increased correctness Harder to ensure freshness and coverage

Correctness is always hard – what is true and correct? Particularly critical in today’s world

Active research and product efforts in knowledge

• Unsupervised knowledge extraction from unstructured data in open domain

• Knowledge graph semantic embedding

• Autonomous knowledge inference & verification

• Real-time knowledge graph with archiving

• Large scale entity linking and disambiguation

• Ultra-scale knowledge representations

• Knowledge system for multi-lingual

• Knowledge Precision vs Comprehensiveness

• …

Thank you!

Facebook is known for the world’s largest social graph spanning billions of entities and trillions of assertions.

We’ve built technology over the last decade to enable rich connections between users and we’re eager to apply it to build a deeper understanding of not just people, but also the things that people care about.

Why a Knowledge Graph for FB?

The most socially relevant entities, concepts and words.

People, CelebritiesPlaces, Points of InterestMovies, TV, MusicSports

What do we put in it?

Deliberately focused and small to start with, US only.

~500M assertions~50M entities~100s of types

How large is it?

Attribute: adventurous, casual, sustainable, trendyDish: coffee and tea, bread, drink, parfait, cake, belgian waffle, gingerbread, liege waffle, turkey sandwich, fresh-squeezed lemonade, bacon waffleFeatures: Credit cards, Takeout, Wifi, -Parking, -ADAMeals: Breakfast, LunchSuggestions: liege waffle, lemonadeTelephone: (555) 987-1234Hours: { … }Website: http://www.heidiswafflehouse.com

Knowledge of the worldPeople, Places, Things

Heidis Waffle House

Use it to provide utility...Recommend smart

repliesEasy sharingEntity Detection

Memory

… and delight the user!

eBay Product Knowledge Graph

Alan Patterson, Structured Data [email protected]

ISWC 2018

Memorabilia & Collectables

ParisFranceEuropeWartime

Michael JordanBasketballChicago Bulls

Pipeline

Graph - Products, properties, value types, fitment, relationships, standards, variations, people, places, brands, companies, events, dates.

Statistical Data Mining - Discovery (over billions of listings)

Buyer & Seller Data - Queries, clicks & conversions, seller descriptions

Knowledge Matching and Semantic Verification

Graph stats - Expect around 100M products, >1Bn triples

Problems

Biggest Problem: Identity Are these listings the same product

Identity depends on who’s asking

Buyer and Seller have different requirements

Proprietary + Confidential

Knowledge Graph

Jamie TaylorGoogle - Knowledge Graph Schema Team

ISWC 2018

Knowledge Graph

1 Billion Things

70 Billion Facts

What is it used for

Challenges

Challenges

What is personally challengingLong evolutionBigger than my brainSo many stakeholders

Challenges


Wonderful Problems!

Challenges


Survival Mechanism:A Principled Approach

Managing Identity

Managing Identity

Santos FC Before Pelé

Sailing

Cuba

Managing Identity


Sailing

Cuba

Maintaining Semantic Stability

E-Sports & Sports

Movies & Television

Broadcast, Podcast & Streaming

Managing Identity


Sailing

Cuba

Maintaining Semantic Stability

Knowledge Graph @ Watson - Context

Watson Jeopardy: Fact based QA on Wikipedia and other sourceshttps://en.wikipedia.org/wiki/Watson_(computer)

Strategic IP Insight Platform - SIIP: Interactive exploration & discovery on patents & journalshttps://www-935.ibm.com/services/us/gbs/bao/siip/, http://time.com/3208716/ibm-watson-cancer/

Watson Discovery Advisor: OnPrem QA and interactive exploration based discovery https://youtu.be/pHfmh1Bi6aA

Watson Discovery Service: Cloud SAAS model for discovery on unstructured text (focused on developer)https://www.ibm.com/watson/developercloud/discovery.html

Knowledge Integration Toolkit (KnIT): Understanding all the worlds medical abstractshttps://www-935.ibm.com/services/us/gbs/bao/siip/, http://time.com/3208716/ibm-watson-cancer/

Several Watson Services: Watson NLU, Retrieve and Rank, Document Conversion etc. https://www.ibm.com/watson/developercloud/services-catalog.html

Discovery is about the non-obviousDiscovery creates new knowledge (hitherto unknown / unrecorded in domain docs, data sources). New knowledge is surprising and anomalous. It could be formally abstracted in several forms:

Search and Exploration look for knowledge already available in the knowledge sources available to the system. They are necessary for Discovery but not sufficient

Confirmation of facts known to an SME of a domain is not discovery (listing all possible side effects of a drug which may have been mined from structured or unstructured data)

New link between entities: A new side effect of a drug, a new potential emerging company as an acquisition target or sales lead, a non-obvious but relevant case / law applicable in a particular trial / proceeding, a person of interest in a terrorist attack. (Link Prediction, Relationship Discovery, Relationship Ranking)

A potential new important entity in the domain: e.g. a new material for display technologies, a new investor for a particular investment area etc. (Entity Discovery, Entity Recommendation, Entity Ranking)

Changing significance of an existing entity: it’s changing relationships/attributes/metrics. e.g. increasing stake by an investor in an organization, increasing interaction between a person of interest and some criminal in an intelligence gathering scenario, decreasing number complaints for a particular product or service in a retail / consumer scenario. (Trend analysis, Distribution Analysis, Anomaly Detection)

In a nutshell• We haven’t focused on ONE Global knowledge graph• We have a framework for anyone to build their domain specific KGs• Offered as a part of Watson Discovery Advisor and now Watson

Discovery Service, also consumed internally by services / solutions• Clients In various sectors:

– Banking and Finance– IT Services, Customer Service– Cyber Security– Scientific Discovery: Life Science, Oil & Gas, Chemicals & Petroleum– Defense– Space Exploration– Media and Entertainment

valuable things we learnt..• polymorphic stores is the answer, stop fighting for one

store

• build surprise into your model, into your api

• evidence is primitive to the system, which every node and edge maps to

• push entity resolution to runtime through context

• every new piece of knowledge should know its cascade effect

Detailed description can be found at talks.anshu.co

Open problems that haunt us

• modeling and analyzing changing knowledge

• merge relationships discovered from unstructured data with known relationships

• federation of global, domain specific and customer specific knowledge

• incremental update of global knowledge on horizontally scaled stores

• scanned PDF extraction ..garbage in garbage out

Thank you.

1. Knowledge graph panel (to share) - ISWC...

Documents

Transcript of 1. Knowledge graph panel (to share) - ISWC...