[Webinar]Building Knowledge through Data Visualization

Transcript of [Webinar]Building Knowledge through Data Visualization

Page 1: [Webinar]Building Knowledge through Data Visualization

Data Visualization with GraphDB and Workbench

[email protected]

Co-lead, Innovation and Consulting Group, Ontotext Corp

Page 2: [Webinar]Building Knowledge through Data Visualization

Outline

↗ Intro: Ontotext, GraphDB, Webinar

↗ Writing SPARQL

↗ Built-in SPARQL Result Visualizations

↗ Using SPARQL Results in Spreadsheets

↗ Invoking SPARQL Queries, Parameterization

↗ Tools that Help With Writing SPARQL Queries

↗ Tools for Statistical Visualizations

↗ Graph Visualizations: Built-in, Developing

↗ Visualization Toolkits

↗ Declarative Visualization

↗ JDBC Data Access API

↗ Q&A

Page 3: [Webinar]Building Knowledge through Data Visualization

Ontotext History and Essential Facts

↗ Started in 2000 as a Semantic Web pioneer

↗ As Innovation lab within Sirma Group (listed as SKK), the biggest Bulgarian software house

↗ Got spun-off and took VC investment in 2008

↗ 65 staff, HQ in Bulgaria, reps in Canada, UK, Germany and USA

↗ Over 400 person-years invested in R&D

↗ Multiple innovation & technology awards: Washington Post, BBC, FT, BAIT, etc.

↗ Member of multiple industry bodies: W3C, EDMC, ODI, LDBC, STI, DBPedia Foundation

Page 4: [Webinar]Building Knowledge through Data Visualization

Clients (selection)

Page 5: [Webinar]Building Knowledge through Data Visualization

GraphDB

↗ Scalable RDF 1.1 engine

↗ Platform independent

↗ W3C standards support

↗ Open source API

↗ Reasoning and consistency checking

↗ Main contributor to RDF4J project

↗ Excellent support

Page 6: [Webinar]Building Knowledge through Data Visualization

This webinar

• SPARQL editing and data visualization features available in GraphDB Workbench (GDB WB)

• Using queries written by others: query URL, parameterization

• Data visualizations that can be added with little programming

• 3rd party SPARQL writing aids and visualization tools that can be integrated with GraphDB (we'd be glad to do that for you)

• Full report: HTML, PDF

• Webinar: presentation, TODO recording

Page 8: [Webinar]Building Knowledge through Data Visualization

SPARQL Editing

• GDB WB integrates the YASGUI editor

• Automatic prefix addition (best practice: load prefixes.ttl)

• Class autocompletion

• Property autocompletion

Page 11: [Webinar]Building Knowledge through Data Visualization

FactForge Charts: Bar

Page 12: [Webinar]Building Knowledge through Data Visualization

FactForge Charts: Pie

Page 14: [Webinar]Building Knowledge through Data Visualization

SPARQL Results in Google Sheet FactForge-Industries

Page 15: [Webinar]Building Knowledge through Data Visualization

Google Sheet Formulas

● Top left cell: get data (see next for the long ugly URL)

=importdata("http://factforge.net/repositories/ff-news?query=%23+F4%3A+Top-level+industries+by+number+of+companies%0A%23+-+benefits+from+the+mapping+and+consolidation+of+industry+classifications%0A%23+++and+predicates+in+DBPedia+done+in+the+FactForge%0A%23+-+benefits+from+reasoning+-+transitive+and+symmetric+properties+across%0A%23+++the+industry+classification+taxonomy+of+FactForge%0A%0APREFIX+dbo%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2F%3E%0APREFIX+ff-map%3A+%3Chttp%3A%2F%2Ffactforge.net%2Fff2016-mapping%2F%3E%0A%0ASELECT+DISTINCT+%3Ftop_industry+(COUNT(*)+AS+%3Fcount)%0A%7B%0A+++%3Fcompany+dbo%3Aindustry+%3Findustry+.%0A+++%3Findustry+%5Eff-map%3AindustryVariant+%2F+ff-map%3AindustryCenter+%3Ftop_industry+.%0A%7D%0AGROUP+BY+%3Ftop_industry+ORDER+BY+DESC(%3Fcount)+")

● Third col: extract industry name from industry URL

=regexreplace(A2,"http://dbpedia.org/resource/","")
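
The long URL in the importdata formula is the programmatic endpoint plus a percent-encoded `query` parameter. A minimal Python sketch of how such a URL could be assembled, assuming the query text is what the formula decodes to (comment lines omitted); the helper names `query_url`, `sheet_formula` and `industry_name` are illustrative and not part of GraphDB:

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Programmatic endpoint as given on the slide.
ENDPOINT = "http://factforge.net/repositories/ff-news"

# Query text decoded from the formula (leading comment lines left out).
QUERY = """\
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX ff-map: <http://factforge.net/ff2016-mapping/>

SELECT DISTINCT ?top_industry (COUNT(*) AS ?count)
{
   ?company dbo:industry ?industry .
   ?industry ^ff-map:industryVariant / ff-map:industryCenter ?top_industry .
}
GROUP BY ?top_industry ORDER BY DESC(?count)
"""


def query_url(endpoint: str, sparql: str) -> str:
    """Return endpoint?query=<percent-encoded SPARQL>."""
    return endpoint + "?" + urlencode({"query": sparql})


def sheet_formula(url: str) -> str:
    """Wrap a query URL in a Google Sheets importdata() call."""
    return '=importdata("' + url + '")'


def industry_name(industry_url: str) -> str:
    """Python counterpart of the regexreplace formula: drop the DBpedia prefix."""
    return industry_url.replace("http://dbpedia.org/resource/", "")


url = query_url(ENDPOINT, QUERY)
print(sheet_formula(url))
print(industry_name("http://dbpedia.org/resource/Software"))
```

Pasting the printed formula into the top-left cell should behave like the hand-built one, since only the encoding step was automated.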

Page 17: [Webinar]Building Knowledge through Data Visualization

Query URL

• Interactive endpoint: http://factforge.net/sparql

− versus programmatic endpoint: http://factforge.net/repositories/ff-news

• List of repos as JSON: http://factforge.net/rest/repositories

• Get query URL, then replace the endpoint

• If you dislike CSV, add Accept header, e.g. curl -H Accept:text/tab-separated-values
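
The same Accept-header idea can be sketched in Python with only the standard library. The function name `build_request` and the sample query are assumptions for illustration; the endpoint is the programmatic one from the slide. The request is only constructed here, and the commented lines show how it might be sent:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Programmatic endpoint as given on the slide.
ENDPOINT = "http://factforge.net/repositories/ff-news"


def build_request(sparql: str, accept: str = "text/tab-separated-values") -> Request:
    """Build a GET request for a SPARQL query, asking for the given result format."""
    url = ENDPOINT + "?" + urlencode({"query": sparql})
    return Request(url, headers={"Accept": accept})


req = build_request("SELECT * { ?s ?p ?o } LIMIT 5")

# Sending it requires network access:
#   from urllib.request import urlopen
#   with urlopen(req) as resp:
#       print(resp.read().decode("utf-8"))
```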

Page 18: [Webinar]Building Knowledge through Data Visualization

Query Parameters

• E.g. find the industries of a given $company
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?industry {$company dbo:industry ?industry}

• Add parameter to query URL (value in NTriples format):
&$company=<http://dbpedia.org/resource/Google>

− URL: <http://dbpedia.org/resource/Google>

− plain string: "Google"

− string with language: "Google"@en

− date with XSD type: "2017-05-25"^^<http://www.w3.org/2001/XMLSchema#date>

• Try it, returns
?industry
<http://dbpedia.org/resource/Software>
<http://dbpedia.org/resource/Internet>
<http://dbpedia.org/resource/Mobile_device>
<http://dbpedia.org/resource/Cloud_computing>
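
A small Python sketch of the parameter mechanism: it formats a value as an NTriples term and appends it to the query URL as `$company`. The helpers `iri`, `literal` and `parameterized_url` are hypothetical names introduced here, and the escaping covers only backslashes and double quotes:

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Programmatic endpoint and query as given on the slides.
ENDPOINT = "http://factforge.net/repositories/ff-news"
QUERY = (
    "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
    "SELECT ?industry {$company dbo:industry ?industry}"
)


def iri(value: str) -> str:
    """NTriples form of a URL: <...>."""
    return "<" + value + ">"


def literal(text: str, lang: str = None, datatype: str = None) -> str:
    """NTriples form of a string, optionally with a language tag or datatype."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    term = '"' + escaped + '"'
    if lang:
        return term + "@" + lang
    if datatype:
        return term + "^^" + iri(datatype)
    return term


def parameterized_url(endpoint: str, sparql: str, **bindings: str) -> str:
    """Append each binding as a $name=<NTriples term> URL parameter."""
    params = {"query": sparql}
    for name, term in bindings.items():
        params["$" + name] = term
    # urlencode writes "$" as %24; the server sees "$company" after decoding.
    return endpoint + "?" + urlencode(params)


url = parameterized_url(ENDPOINT, QUERY, company=iri("http://dbpedia.org/resource/Google"))
print(url)
```

Keeping the query text fixed and varying only the binding is what lets a query written by someone else be reused without editing the SPARQL.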

Page 20: [Webinar]Building Knowledge through Data Visualization

IRISA SQUALL (CNL)

• SQUALL (Semantic Query and Update High-Level Language). 2011-2013. Paper 1, 2, 3, examples.

• Example question
Which person is an author of at least 10 publication-s?

• Translates to
SELECT DISTINCT ?x1 WHERE {
  ?x1 a :person .
  {SELECT DISTINCT ?x1 (COUNT(DISTINCT ?x3) AS ?x2) WHERE {
    ?x3 a :publication .
    ?x3 :author ?x1 .

Page 23: [Webinar]Building Knowledge through Data Visualization

MOLTO: CNL query to SPARQL

Question in English/Swedish is translated to SPARQL

Page 24: [Webinar]Building Knowledge through Data Visualization

MOLTO: RDF to NL Generation (Lexicalization)

Painting description in a dozen languages

Page 26: [Webinar]Building Knowledge through Data Visualization

W3C Data Cube

W3C Data Cube ontology:

• OLAP data model

• Statistical classifications following SDMX

Many statistical datasets available as RDF, e.g.:

• Linked SDMX Data developed by Sarven Capadisli: International Monetary Fund IMF, OECD, UN Food and Agriculture Organization FAO, Swiss Federal Statistical Office BFS, European Central Bank ECB, World Bank, Transparency International.

• Eurostat developed by the LOD Around the Clock (LATC) project (static)

• Eurostat wrapper developed by Benedikt Kämpgen (updateable)

• US Securities and Exchange Commission SEC Edgar Wrapper developed by Benedikt Kämpgen

• UN ComTrade developed by the Multisensor project
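
To get a first impression of a Data Cube dataset loaded into a repository, one could count observations per dataset. The sketch below only builds the SPARQL text; `qb:Observation` and `qb:dataSet` come from the W3C Data Cube vocabulary, while the function name and the `limit` parameter are assumptions made for this example:

```python
# Namespace of the W3C RDF Data Cube vocabulary.
QB = "http://purl.org/linked-data/cube#"


def observations_per_dataset_query(limit: int = 20) -> str:
    """Return a SPARQL query that counts qb:Observation instances per qb:DataSet."""
    return (
        f"PREFIX qb: <{QB}>\n"
        "SELECT ?dataset (COUNT(?obs) AS ?observations)\n"
        "{\n"
        "   ?obs a qb:Observation ;\n"
        "        qb:dataSet ?dataset .\n"
        "}\n"
        f"GROUP BY ?dataset ORDER BY DESC(?observations) LIMIT {int(limit)}"
    )


print(observations_per_dataset_query(5))
```

The resulting text can be pasted into the Workbench SPARQL editor or sent to the programmatic endpoint of whichever repository holds the cubes.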

Page 27: [Webinar]Building Knowledge through Data Visualization

AKSW CubeViz

CubeViz: faceted statistical browser, visualization charts.

● Original project: OntoWiki addon (dependency), PHP: demo, source, wiki, used at the EU Open Data Portal.

● Currently being rewritten to JavaScript: demo (doesn't quite work), source

Page 28: [Webinar]Building Knowledge through Data Visualization

AKSW CubeViz

Polar Chart (EU Digital Agenda Scoreboard)

Page 29: [Webinar]Building Knowledge through Data Visualization

OpenCube Toolkit

OpenCube Toolkit developed by OpenCube project. Tools for:

Data Creation (conversion)

• TARQL extension: CSV/TSV files

• D2RQ extension for data cubes: relational databases

• JSON-stat2qb extension: JSON-stat

• R2RML extension: relational databases, following W3C standard

Data Expanding

• OpenCube Compatibility Explorer: (a) search LOD and find cubes compatible to expand initial cube, (b) establish typed links

• OpenCube Aggregator: (a) creates 2^n − 1 new cubes: all combinations of n dimensions. (b) new observations for all attributes of a hierarchical dimension.

• OpenCube Expander: merge two compatible cubes.

Data Exploring

• Data catalogue management: user interface (UI) templates for managing metadata on RDF data cubes and supporting search and discovery

• OpenCube Browser: table-based visualizations

• OpenCube OLAP Browser: OLAP operations: pivot, drill-down, and roll-up

• R statistical analysis: run R data analysis scripts

• Interactive chart visualization widgets: cube slices with charts

• OpenCube MapView: visualize geo-spatial dimension: choropleth, markers, bubbles
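
The Aggregator's cube count follows from simple combinatorics: a cube with n dimensions has 2^n − 1 non-empty combinations of dimensions. A short Python sketch that enumerates them; the dimension names are made up, and whether the toolkit enumerates combinations in this order is an assumption:

```python
from itertools import combinations


def dimension_subsets(dimensions):
    """List every non-empty combination of the given dimensions."""
    dims = list(dimensions)
    return [
        combo
        for size in range(1, len(dims) + 1)
        for combo in combinations(dims, size)
    ]


# Hypothetical cube with three dimensions.
subsets = dimension_subsets(["year", "region", "sex"])
for combo in subsets:
    print(combo)
print(len(subsets), "combinations for 3 dimensions")
```

The count doubles with every added dimension, so aggregating a cube with many dimensions can produce a large number of derived cubes.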

Page 30: [Webinar]Building Knowledge through Data Visualization

CubesViewer

• CubesViewer: excellent OLAP visualization tool: demo, CubesViewer Studio demo, source, documentation.

• Based on DataBrewery Cubes framework: source, documentation.

• Unfortunately does not yet support W3C Data Cube

− We'd love to develop such a feature for you (tracking issue)

Page 31: [Webinar]Building Knowledge through Data Visualization

CubesViewer

Page 33: [Webinar]Building Knowledge through Data Visualization

GDB WB Builtin Overview: Class Relations

Page 34: [Webinar]Building Knowledge through Data Visualization

GDB WB Builtin: Class Instances & Hierarchy

Page 35: [Webinar]Building Knowledge through Data Visualization

GDB WB Builtin: Domain/Range Graph

Page 36: [Webinar]Building Knowledge through Data Visualization

GDB WB Builtin Detail: Visual Graph

Page 37: [Webinar]Building Knowledge through Data Visualization

GDB WB Visual Graph: Relations of Google

Page 38: [Webinar]Building Knowledge through Data Visualization

GDB Graph Viz Dev: Company Relations

GDB Dev Hub: Visualizing GraphDB data with Ogma JS (library developed by Linkurious)

Page 39: [Webinar]Building Knowledge through Data Visualization

GDB Graph Viz Dev: Flight Routes

Page 41: [Webinar]Building Knowledge through Data Visualization

Visualization Toolkits

Numerous powerful and popular visualization tools, creating an amazing variety of graphs and charts, e.g.:

● d3.js, with addons (e.g. interactive selection of chart type)

● Tableau Public edition

● Microsoft PowerBI

● GoJS

● Google Charts

● Linkurious

Specialized tools, e.g.

● CrossFilter for "faceting" of multidimensional data

● Cubism for viewing time series

● CubeViz and OpenCube Toolkit for statistical data

● Histropedia for making advanced timelines

Page 42: [Webinar]Building Knowledge through Data Visualization

Example with GDB and Tableau

Public procurement spending through last 5 Bulgarian cabinets (2011-2016). Sofia Datathon, March 2017. Slides, Visualization

Page 43: [Webinar]Building Knowledge through Data Visualization

Example with GDB and PowerBI

Procurements by one contracting authority in time. Filtering by government cabinet, focusing by time interval. Sofia Hackathon, Apr 2017

Page 45: [Webinar]Building Knowledge through Data Visualization

RDF by Example

• ONTO tool for RDF instance visualization (rdfpuml) and R2RML generation (rdf2rml).

• E.g. mapping Dun & Bradstreet company data to Financial Industry Business Ontology (FIBO)

Page 46: [Webinar]Building Knowledge through Data Visualization

RDF by Example

• Dun & Bradstreet details (top-right): 3 "measures" (NetWorth, AnnualSales, ProfitLoss)

• Total of 152 fields grouped in 32 nodes: impossible to comprehend without such a diagram

Page 47: [Webinar]Building Knowledge through Data Visualization

R2RML Generation

• Model of Museum Exhibitions (for J. Paul Getty Museum)

• Includes RDB joins and field names (Gallery TMS)

Page 48: [Webinar]Building Knowledge through Data Visualization

R2RML Generated From Model

• R2RML is verbose: 3 nodes, 15 statements for every model statement

• 1 model node (representing an Exhibition at a Venue) is expanded to 15 R2RML nodes: huge savings in complexity and maintainability

• R2RML requires semantic experts, whereas model diagrams can be understood by subject-matter experts (museum curators, commodity trade analysts, etc)

• Details in SWIB'16 presentation

Page 49: [Webinar]Building Knowledge through Data Visualization

R2RML Generated From Model: Detail

Page 51: [Webinar]Building Knowledge through Data Visualization

Why JDBC/ODBC?

• Many viz tools (e.g. Pentaho, Centrifuge, QlikView, Tableau) have ODBC/JDBC interfaces

• To save the effort of constructing query URLs and saving results, we can provide a JDBC API to GraphDB

• The user feeds SPARQL queries (not SQL) through JDBC, and the SPARQL tabular results are returned to the tool

• We can reuse Jena JDBC or another open source library

• If the tool supports ODBC but not JDBC, we can use the JDBC-ODBC bridge (sun.jdbc.odbc.JdbcOdbcDriver). This bridge was removed in Java 8, so later Java versions need a third-party bridge.

• E.g. connecting from Java to Excel using ODBC and the JDBC-ODBC bridge

Page 52: [Webinar]Building Knowledge through Data Visualization
Page 53: [Webinar]Building Knowledge through Data Visualization

• Contact: [email protected]

Lead, Innovation and Consulting Group, Ontotext Corp

• We'd be glad to deploy any 3rd party tools and integrate them with GraphDB for you!

Thanks for your attention.

Question time!
