Skolemising Blank Nodes while Preserving Isomorphism

Aidan Hogan – DCC, Universidad de Chile

WHY? BLANK NODES ARE GREAT!

When life gives you blank nodes …

Blank Nodes are glue!

Blank node names aren’t important …

(Isomorphic)

Blank nodes are common in real-world data …

Aidan Hogan, Marcelo Arenas, Alejandro Mallea and Axel Polleres "Everything You Always Wanted to Know About Blank Nodes". Journal of Web Semantics 27: pp. 42–69, 2014

BLANK NODES ENABLE SYNTAX SHORTCUTS
• They represent implicit nodes in the graph
• They help specify order, higher-arity relations, reification, etc., succinctly
• They are common in real-world data

BLANK NODES: WHAT’S THE PROBLEM?

Are two RDF graphs isomorphic?

RDF ISOMORPHISM IS GI-COMPLETE
• A general algorithm to see if two RDF graphs are the “same” will (probably) not be tractable

BLANK NODES ADD COMPLEXITY? WHAT TO DO?

RDF 1.1 proposes Skolemisation

But minting fresh IRIs every time is not ideal

Would prefer a “consistent” labelling
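
To illustrate why fresh IRIs are not ideal, here is a minimal sketch of Skolemisation in Python. It assumes triples are plain tuples of strings with blank nodes written as "_:label"; the helper name skolemise and the example.org base are hypothetical, while the /.well-known/genid/ path follows the RDF 1.1 recommendation for Skolem IRIs. Because each run mints random identifiers, Skolemising the same graph twice yields two different ground graphs.

```python
import uuid

def skolemise(triples, base="http://example.org/.well-known/genid/"):
    """Replace every blank node with a freshly minted Skolem IRI.

    `base` is a hypothetical authority; RDF 1.1 only recommends the
    /.well-known/genid/ path component.
    """
    mapping = {}  # blank node label -> fresh IRI, consistent within one run
    result = set()
    for s, p, o in triples:
        terms = []
        for term in (s, o):
            if term.startswith("_:"):
                if term not in mapping:
                    mapping[term] = base + uuid.uuid4().hex
                term = mapping[term]
            terms.append(term)
        result.add((terms[0], p, terms[1]))
    return result

graph = {("_:a", "ex:knows", "_:b"), ("_:b", "ex:name", '"Bob"')}
first = skolemise(graph)
second = skolemise(graph)
# `first` and `second` describe the same structure but share no Skolem IRIs.
```

A consistent labelling would instead derive the identifier from the structure of the graph, so that isomorphic inputs produce identical outputs.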

Compute isomorphically-unique graph hash

Finding duplicate documents from a crawler

CANONICAL LABELLING USEFUL FOR:
1. Mapping blank nodes to IRIs
2. Computing unique hashes for RDF graphs
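
Once blank nodes carry canonical labels, a graph-level hash can be computed without regard to triple order. The following is a rough sketch, assuming the input triples are already canonically labelled (or contain no blank nodes); the function name graph_hash and the line-based serialisation are illustrative choices, not taken from the talk.

```python
import hashlib

def graph_hash(canonical_triples):
    """Hash a set of triples independently of the order they arrive in.

    Assumes blank nodes (if any) already have canonical labels.
    """
    lines = sorted(" ".join(triple) + " ." for triple in canonical_triples)
    return hashlib.md5("\n".join(lines).encode("utf-8")).hexdigest()

doc_a = [("ex:s", "ex:p", "ex:o1"), ("ex:s", "ex:p", "ex:o2")]
doc_b = [("ex:s", "ex:p", "ex:o2"), ("ex:s", "ex:p", "ex:o1")]
# A crawler could keep a dict from hash to document and treat equal
# hashes as duplicate candidates.
same = graph_hash(doc_a) == graph_hash(doc_b)
```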

OLD BUT RECURRING QUESTION

An old question that won’t go away …

Jeremy J. Carroll. “Signing RDF Graphs.” ISWC 2003.

Edzard Höfig, Ina Schieferdecker. “Hashing of RDF Graphs and a Solution to the Blank Node Problem.” URSW 2014.

NO EXISTING APPROACH IS GENERAL
• Hard cases seem unlikely in practice
• Let’s build a general (and thus worst-case exponential) algorithm that’s efficient for practical cases

NAÏVE CANONICAL LABELLING SCHEME

(Naïve) Canonical labels for blank nodes

But wait … what happens if ... ?

Or another case …

Fixpoint does not distinguish all blank nodes!

NAÏVE: COLOUR BLANK NODES RECURSIVELY UNTIL FIXPOINT
• Efficient
• Incomplete
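
The naïve scheme can be sketched as follows. This is an illustrative approximation rather than the exact procedure from the talk: triples are assumed to be tuples of strings, blank nodes start with "_:", and the helper names h and naive_colours are hypothetical. Each blank node starts with the same colour and is repeatedly recoloured from the colours of its neighbours until the partition of blank nodes stops changing.

```python
import hashlib

def h(*parts):
    """Combine strings into one 128-bit hex digest (MD5)."""
    return hashlib.md5("\x00".join(parts).encode("utf-8")).hexdigest()

def term_colour(term, colours):
    # Blank nodes contribute their current colour, ground terms themselves.
    return "B" + colours[term] if term in colours else "G" + term

def naive_colours(triples):
    """Colour blank nodes by neighbourhood until the partition is stable."""
    blanks = {t for s, _, o in triples for t in (s, o) if t.startswith("_:")}
    colours = {b: h("blank") for b in blanks}
    while True:
        new = {}
        for b in blanks:
            edges = []
            for s, p, o in triples:
                if s == b:
                    edges.append(h("out", p, term_colour(o, colours)))
                if o == b:
                    edges.append(h("in", p, term_colour(s, colours)))
            # Including the old colour means classes can only ever split.
            new[b] = h(colours[b], *sorted(edges))
        stable = len(set(new.values())) == len(set(colours.values()))
        colours = new
        if stable:
            return colours

path = {("_:x", "ex:p", "_:y"), ("_:y", "ex:p", "_:z")}
cycle = {("_:a", "ex:p", "_:b"), ("_:b", "ex:p", "_:c"), ("_:c", "ex:p", "_:a")}
path_classes = len(set(naive_colours(path).values()))
cycle_classes = len(set(naive_colours(cycle).values()))
```

On the path every blank node ends up with its own colour, whereas on the directed cycle all three nodes look alike and keep a shared colour, which is the incompleteness noted above.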

CANONICAL LABELLING SCHEME: ALWAYS DISTINGUISH ALL BLANK NODES

Brendan D. McKay. "Practical graph isomorphism". Congressus Numerantium 30: pp. 45–87, 1981.

Start with a (non-distinguished) colouring …

Let’s distinguish a node …

Colouring is no longer a fixpoint!

Rerun colouring to fixpoint

Fixpoint reached: still not finished!

So again let’s distinguish another …

… and rerun colouring to fixpoint

Now all blank nodes are distinguished!

Blank node labels computed from colour

Let’s go back: first, why pick _:a and _:c?

Okay so: why _:a …

Adapt ideas from the Nauty algorithm (for standard graph isomorphism)

Check all leaves for the minimum graph

What happened?

Automorphisms cause repetitions

CORE ALGORITHM: FIND MINIMAL GRAPH FOLLOWING FIXED COLOURING RULES
• Complete
• Efficient for many cases?
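
The distinguish-and-refine search can be sketched roughly as below. This is a simplified reading of the slides, not the author's implementation: the function names, the choice of the smallest non-trivial colour class, and the use of MD5 are all assumptions, hash collisions are ignored, and no automorphism pruning is done. Whenever colouring reaches a fixpoint with blank nodes still sharing a colour, each node of one such class is distinguished in turn, colouring is rerun, and the lowest graph over all leaves is kept.

```python
import hashlib

def h(*parts):
    return hashlib.md5("\x00".join(parts).encode("utf-8")).hexdigest()

def term_colour(term, colours):
    return "B" + colours[term] if term in colours else "G" + term

def refine(triples, colours):
    """Recolour blank nodes from their neighbours until the partition is stable."""
    while True:
        new = {}
        for b in colours:
            edges = []
            for s, p, o in triples:
                if s == b:
                    edges.append(h("out", p, term_colour(o, colours)))
                if o == b:
                    edges.append(h("in", p, term_colour(s, colours)))
            new[b] = h(colours[b], *sorted(edges))
        stable = len(set(new.values())) == len(set(colours.values()))
        colours = new
        if stable:
            return colours

def canonical(triples, colours=None):
    """Return the minimal relabelled graph as a sorted list of triples."""
    triples = set(triples)
    if colours is None:
        blanks = {t for s, _, o in triples for t in (s, o) if t.startswith("_:")}
        colours = {b: h("blank") for b in blanks}
    colours = refine(triples, colours)
    classes = {}
    for b, c in colours.items():
        classes.setdefault(c, []).append(b)
    # Colour classes with more than one member are not yet distinguished.
    ties = sorted((len(members), c) for c, members in classes.items() if len(members) > 1)
    if not ties:
        def label(t):
            return "_:" + colours[t] if t in colours else t
        return sorted((label(s), p, label(o)) for s, p, o in triples)
    _, chosen = ties[0]
    best = None
    for b in classes[chosen]:
        marked = dict(colours)
        marked[b] = h(colours[b], "distinguished")  # distinguish one node
        candidate = canonical(triples, marked)
        if best is None or candidate < best:
            best = candidate
    return best

g1 = {("_:a", "ex:p", "_:b"), ("_:b", "ex:p", "_:c"), ("_:c", "ex:p", "_:a")}
g2 = {("_:x", "ex:p", "_:z"), ("_:z", "ex:p", "_:y"), ("_:y", "ex:p", "_:x")}
isomorphic = canonical(g1) == canonical(g2)
```

Because every branch of the chosen class is explored, symmetric graphs such as the cycle above visit several leaves that all yield the same graph; those are the repetitions caused by automorphisms.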

OKAY … SO WHAT HASHING TO USE?

What about hash collisions?

128 bit: MD5, Murmur3_128
160 bit: SHA1

HASHING MAY LEAD TO COLLISIONS
• The approach does not depend on which hash function you use
• A 128-bit hash is the shortest with an acceptable collision probability
• For cryptographic use-cases, SHA-256 or better might be needed
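
As a rough guide to what "acceptable" might mean, the standard birthday approximation estimates the chance that any two of n hashed items collide. The sketch below assumes hash values are uniformly distributed; the function name collision_probability is hypothetical and the figures are estimates, not results from the talk.

```python
import math

def collision_probability(n_items, bits):
    """Birthday approximation: 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = n_items * (n_items - 1) / 2 ** (bits + 1)
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-exponent)

p32 = collision_probability(10 ** 9, 32)
p128 = collision_probability(10 ** 9, 128)
p160 = collision_probability(10 ** 9, 160)
```

For a billion items, a 32-bit hash makes a collision practically certain, while at 128 bits the estimated probability is vanishingly small and 160 bits lowers it further.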

EVALUATION

Evaluation: Real-world Graphs

Evaluation: Nasty Synthetic Graphs

CONCLUSIONS

In loving memory of

Linked Data

2007–2012

Survived by its research community

_:b 1999–2015

Conclusions

Aside: Why GI-Hard?

Aside: Why GI-Hard? (Can encode graph isomorphism as RDF isomorphism)

if and only if
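
One way to picture this reduction: turn every vertex into a blank node and every undirected edge into a pair of triples with a fixed predicate. The sketch below is an assumption about how such an encoding could look, not the construction shown on the slide; the predicate ex:edge and both function names are hypothetical, isolated vertices are ignored, and the isomorphism check is a brute-force search suitable only for tiny graphs.

```python
from itertools import permutations

def graph_to_rdf(edges, predicate="ex:edge"):
    """Encode an undirected graph as RDF: vertices become blank nodes."""
    triples = set()
    for u, v in edges:
        triples.add((f"_:v{u}", predicate, f"_:v{v}"))
        triples.add((f"_:v{v}", predicate, f"_:v{u}"))
    return triples

def rdf_isomorphic(g, h):
    """Brute-force check: try every bijection between blank node sets."""
    def blanks(ts):
        return sorted({t for s, _, o in ts for t in (s, o) if t.startswith("_:")})
    bg, bh = blanks(g), blanks(h)
    if len(bg) != len(bh) or len(g) != len(h):
        return False
    for perm in permutations(bh):
        m = dict(zip(bg, perm))
        if {(m.get(s, s), p, m.get(o, o)) for s, p, o in g} == set(h):
            return True
    return False

triangle = graph_to_rdf([(1, 2), (2, 3), (3, 1)])
triangle2 = graph_to_rdf([(7, 9), (9, 8), (8, 7)])
path = graph_to_rdf([(1, 2), (2, 3)])
```

Under this encoding the two triangles are RDF-isomorphic and the triangle and the path are not, mirroring the answers for the original graphs.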

Aside: Why GI-Complete? (Can we encode RDF isomorphism as graph isomorphism?)

if and only if

Aside: Why GI-Complete? (Yes: We can encode RDF isomorphism as graph isomorphism)

if and only if

COMPLETE CANONICAL LABELLING SCHEME

A complete canonical labelling?

Find a canonical labelling for H

Choose the lowest possible graph

COMPLETE: FIND MINIMUM POSSIBLE GRAPH USING FIXED BLANK NODE LABELS
• Complete
• Inefficient
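
A brute-force version of this idea fits in a few lines. The sketch assumes triples are tuples of strings and that the fixed labels are _:b0, _:b1, and so on; both the label scheme and the function name are hypothetical. It tries every assignment of the fixed labels to the blank nodes and keeps the lowest resulting graph, which costs n! relabellings for n blank nodes.

```python
from itertools import permutations

def brute_force_canonical(triples):
    """Return the lowest graph over all assignments of fixed labels."""
    triples = set(triples)
    blanks = sorted({t for s, _, o in triples for t in (s, o) if t.startswith("_:")})
    labels = [f"_:b{i}" for i in range(len(blanks))]
    best = None
    for perm in permutations(labels):
        mapping = dict(zip(blanks, perm))
        candidate = sorted((mapping.get(s, s), p, mapping.get(o, o))
                           for s, p, o in triples)
        if best is None or candidate < best:
            best = candidate
    return best

g = {("_:x", "ex:p", "_:y"), ("_:y", "ex:q", "ex:o")}
h = {("_:m", "ex:q", "ex:o"), ("_:k", "ex:p", "_:m")}
same = brute_force_canonical(g) == brute_force_canonical(h)
```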

The need for a graph-level hash

OPTIMISATION: PRUNE THE TREE USING AUTOMORPHISMS

Trim the search tree using “found” automorphisms

Found Automorphisms …

PRUNING PER AUTOMORPHISMS AVOIDS SYMMETRIC REPETITIONS
• Automorphisms are found naturally
• Makes very “regular” structures (like cliques) a lot easier
• Need to be careful how to manage the automorphism group
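
The way automorphisms are "found naturally" can be sketched as follows: when two leaves of the search tree produce the same graph, composing one leaf's labelling with the inverse of the other gives a mapping of the graph onto itself. The helper names below are hypothetical, and the orbit computation is a simplification that is only valid at the root of the search tree; deeper levels would need automorphisms that fix the nodes already distinguished.

```python
def relabel(triples, mapping):
    return {(mapping.get(s, s), p, mapping.get(o, o)) for s, p, o in triples}

def automorphism_from_leaves(leaf_a, leaf_b):
    """Compose leaf_a with the inverse of leaf_b (both: blank node -> label)."""
    inverse_b = {label: node for node, label in leaf_b.items()}
    return {node: inverse_b[label] for node, label in leaf_a.items()}

def orbits(nodes, automorphisms):
    """Group nodes that the given automorphisms map onto each other."""
    parent = {n: n for n in nodes}
    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n
    for auto in automorphisms:
        for a, b in auto.items():
            parent[find(a)] = find(b)
    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return {frozenset(g) for g in groups.values()}

cycle = {("_:a", "ex:p", "_:b"), ("_:b", "ex:p", "_:c"), ("_:c", "ex:p", "_:a")}
leaf_a = {"_:a": "_:1", "_:b": "_:2", "_:c": "_:3"}
leaf_b = {"_:b": "_:1", "_:c": "_:2", "_:a": "_:3"}
auto = automorphism_from_leaves(leaf_a, leaf_b)
found = orbits(leaf_a.keys(), [auto])
```

Here both leaves yield the same graph, the derived mapping rotates the cycle, and all three blank nodes fall into a single orbit, so only one of them would need to be distinguished at the root.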
