The Mystical Principles of XSLT: Enlightenment through Software Visualization

Evan Lenz, PresidentAugust 2, 2016

The Mystical Principles of XSLTEnlightenment through Software Visualization

2

Grokking XSLT Easy to learn just enough to be dangerous

− Lots of terrible code in the wild

Understanding template rules is elusive− Not what procedural programmers expect− Lots of explanations that are partial, imprecise, or ambiguous

XSLT really isn’t that difficult a language− But only if you grok template rules

A lot that’s invisible

3

Avoiding xsl:apply-templates

4

Befriending xsl:apply-templates

5

By analogy to OO From XSLT 1.0 Pocket Reference:

− http://lenzconsulting.com/how-xslt-works/#applying_template_rules

• “In object-oriented (OO) terms, xsl:apply-templates is like a function that iterates over a list of objects (nodes) and, for each object, calls the same polymorphic function. Each template rule in your stylesheet defines a different implementation of that single polymorphic function. Which implementation is chosen depends on the runtime characteristics of the object (node). Loosely speaking, you define all the potential bindings by associating a “type” (pattern) with each implementation (template rule).”

− http://lenzconsulting.com/how-xslt-works/#modes• “Furthering the OO analogy, a mode name identifies which polymorphic

function to execute repeatedly in a call to xsl:apply-templates. When mode="foo" is set, foo acts as the name of a polymorphic function, and each template rule with mode="foo" defines an implementation of the foo ‘function.’”

http://lenzconsulting.com/how-xslt-works/#applying_template_rules

http://lenzconsulting.com/how-xslt-works/#applying_template_rules

http://lenzconsulting.com/how-xslt-works/#modes

http://lenzconsulting.com/how-xslt-works/#modes

6

By analogy to other instructions This:

<xsl:apply-templates select="animal"/>…<xsl:template match="animal[@type eq 'cat']">…</xsl:template><xsl:template match="animal[@type eq 'dog']">…</xsl:template>

is (more or less) equivalent to this:

<xsl:for-each select="animal"> <xsl:choose> <xsl:when test="@type eq 'cat'">…</xsl:when> <xsl:when test="@type eq 'dog'">…</xsl:when> </xsl:choose></xsl:for-each>

7

De-mystifying XSLT wizardry Mysticism:

− “a theory postulating the possibility of direct and intuitive acquisition of ineffable knowledge or power”

In relation to XSLT:− Belief in the possibility that there is a way to make XSLT’s

behavior more directly and intuitively understandable• By making implicit behavior explicit and the abstract concrete, through an

interactive visualization

8

Elusiveness of the task Alternating between faith and doubt that what I’m

imagining is possible or useful Grand schemes that collapse into themselves Making abstract ideas concrete

− Shapes in my mind converted to shapes on the screen

Disillusionment− Finding out that certain ideas weren’t well-formed− The crucible of incarnation

The feeling that there’s nothing really “there”

9

Like a debugger? Debugging is like the scientific method

− The debugger is the instrument of measurement− Start with a hypothesis− Set:

• breakpoints• watch variables

− One step at a time, careful not to “step over” when we want to “step into”—otherwise we’ve missed our chance and have to start all over (because we can’t go backwards in time)

10

Not like a debugger Software visualization for XSLT:

− A holistic experience, rather than a narrow line of inquiry− A space in which to freely explore, in any direction− Not bound by our previous steps− No need to start with a question− Just start playing around and see what we will see

11

Not like a debugger Sri Aurobindo on gnostic knowledge:

− “For while the reason [like a debugger] proceeds from moment to moment of time and loses and acquires and again loses and again acquires, the gnosis [or visualization tool] dominates time in a one view and perpetual power and links past, present and future in their indivisible connections, in a single continuous map of knowledge, side by side.” (The Synthesis of Yoga, p. 464)

12

Escaping the bounds of time A transformation is fundamentally:

− a data set− not a process

XSLT’s functional nature− More like an abstract sculpture, showing relationships− Less like a movie

Time is just one way to traverse the relationships:− we can go forward and backward in time (result document order)− we can step outside of time− we can traverse the relationships using a different dimension

The essence of transformation

14

Transformation as microcosm “The Big Bang”

− the initial context node

“The Engine of Creation”− template rules

“Emergence/Manifestation”− the result tree

15

Scope and method Scope:

− Template rules (and maybe xsl:for-each)• XSLT’s core processing model

− XPath, though foundational, is out of scope

Method:− Explode a transformation into all its parts− Put them back together

16

Fundamental unit: the focus As defined in the XSLT Recommendation, consists of:

− the context item (.),− the context position (position()), and− the context size (last()).

17

An expanded definition of focus We will define “focus” as:

− the particular instance of a focus− i.e. each instantiation of a focus-changed sequence constructor

• body of <xsl:template>, body of <xsl:for-each>, etc.

Then we can consider the shallow result chunk implied by a focus to be an intrinsic property of that focus

− 1:1 relationship

18

For example Given a single instantiation of this rule:

<xsl:template match="heading"> <title> <xsl:value-of select="."/> </title></xsl:template>

We could represent the “focus” like this:

<trace:focus context-id="n1a833c7b3a005151" context-position="1" context-size="1" rule-id="nfa49863d62353482" invocation-id="bc08a736-13f7-e97b-a272"> <title>This is the title</title></trace:focus>

19





Cosmic address (GUID) for the

current invocation of <xsl:apply-templates/>

20





How many nodes are in the

list being processed by that invocation

21





In other words: how many foci belong to the

current invocation

22





The position of the current

focus in that list

23





generated ID of the node being processed (a <heading> element)

Cosmic address (generated ID) of the node being

processed(a <heading>

element)

24





generated ID of the node being processed (a <heading> element)

Cosmic address (generated ID)of the matching template rule

in the stylesheet

25





The shallow result chunk that this focus creates

26





The shallow result chunk that this focus creates

27

Why it’s called a “shallow” result chunk It doesn’t include the results of nested invocations:

<trace:focus context-id="n292122aca28a94ca" context-position="1" context-size="1" rule-id="n8ef8c66cac078ff" invocation-id="4dd7580b-d44e-a0e4-714d"> <html> <head> <trace:invocation id="bc08a736-13f7-e97b-a272"/>

</head> <body> <trace:invocation id="6129c19f-2632-e99c-53b5"/>

</body> </html></trace:focus>

28


<trace:focus context-id="n292122aca28a94ca" context-position="1" context-size="1" rule-id="n8ef8c66cac078ff" invocation-id="4dd7580b-d44e-a0e4-714d"> <html> <head> <trace:invocation id="bc08a736-13f7-e97b-a272"/>  </head> <body> <trace:invocation id="6129c19f-2632-e99c-53b5"/>  </body> </html></trace:focus>

29


<trace:focus context-id="n292122aca28a94ca" context-position="1" context-size="1" rule-id="n8ef8c66cac078ff" invocation-id="4dd7580b-d44e-a0e4-714d"> <html> <head> <trace:invocation id="bc08a736-13f7-e97b-a272"/>  </head> <body> <trace:invocation id="6129c19f-2632-e99c-53b5"/>  </body> </html></trace:focus>

Links to zero or more other foci

having the given invocation ID

30

Mystical principle: To see is to create Inner seeing results in outer manifestation

− They are not separate; they are two sides of the same coin.

To focus on the input is to create the output− (as defined by the template rule)

They are inextricably linked, 1-to-1− As defined here, the focus and the result chunk are intrinsic to

each other.

31

Mystical principle: All is accessible Each focus is connected to every other focus via

successive invocation-id links Every entity (node, rule, or focus) has a “cosmic address”

(GUID or generated node ID)− Unique, globally accessible address within the world of the

transformation

You can begin anywhere in the world and traverse to anywhere else in the world, including past, present, future

Nothing is lost through the passage of time

32

Mystical principle: Inner upholds outer The result tree is not only isomorphic to the tree of foci

(the XSLT execution tree) It is an adornment of that tree

− The result is the part we see− The inner structure upholds the outer result

The visible world is a decoration of the invisible− a manifestation of a deeper, unseen reality

33

Mystical principle: There is no separation Distinctions are only by definition and thus arbitrary A focus and result chunk are “intrinsic” to each other only

because we are looking at it that way A focus is “extrinsic” to another focus only because we

defined it that way Ultimately, there is only an unbroken whole But:

− Division and distinction are functions we possess− There is joy in the division and joy in the reunion− That’s why we do it

Case Study: Enrichment Tracer

35

Bringing this down to earth: a case study Client project to rewrite an enrichment engine Technical articles are automatically enriched with search

terms based on their content− (using a taxonomy of terms and reverse queries in MarkLogic)

There are various business rules about what types of article should be enriched with what types of terms

The rules need to evolve over time Most importantly, the rule behavior needs to be

transparent to the analysts and taxonomy managers

36

XSLT pipeline for enrichment rules For the rewritten engine’s enrichment rules, I decided to

use XSLT as the rules language (surprise, surprise!)− declarative and powerful− simplified by using a multi-stage pipeline− good in conjunction with the modified identity transform

The article passes through unchanged, except that new metadata (“enrichment”) is inserted into each section

Example stages:− get-terms− add-inherited− remove-subsumed

− remove-excluded− add-more-terms− filter-terms

37

Distinction between “rules” and “engine” Entire enrichment engine consists of XSLT template rules

− (and imported XQuery libraries)

But not all template rules are created equal Some modes are “engine-level”

− not meant to be regularly customized

Some modes are “rules-level”− direct implementations of the business rules− meant to be customized to handle different content scenarios

The latter are the ones that need to be made transparent

38

Making the business rules transparent Modes are grouped by pipeline stage Each mode has a default behavior

− e.g. mode="terms" by default uses a reverse query to find all matching terms

− can be overridden to provide a different behavior• (such as to fix the term for a particular article type, regardless of the content)

The default rules don’t need to be shown− Only the custom overrides

Enrichment Tracer demo

40

Trace-enabling the XSLT Used XSLT to pre-process (“trace-enable”) the original

engine and rules XSLT The trace-enabled stylesheet additionally generates:

− inline trace data• <trace:match-start> and <trace:match-end> markers in the result tree

− out-of-band trace data• using xdmp:set() in MarkLogic

The “rule-level” modes are annotated in the original engine code

− so the trace-enabler knows which template rules to augment− [show example, line 538 of engine.xsl]− [also show trace data example in oXygen]

41

Other techniques used Documenting the rules inline, so they get fed straight to

the tracer interface Storing the trace data for each input document into the

(MarkLogic) database for faster subsequent renders Automatically invalidating (and thus forcing re-generation

of) the cached trace data whenever a rule or data change is detected

Generalizing to arbitrary XSLT

43

Characteristics to preserve Colored lines to depict relationships Visually represent XML as XML Interactive slider for building/un-building the result

44

Difficulties How to visualize temporary trees?

− i.e. results stored in <xsl:variable>

Modifying template rule results could break the stylesheet− type errors− node count dependencies, etc.

45

Solution: Store trace data out-of-band Via side effects

− Write a document to a database, or− Update a global variable

Benefits:− avoid type errors− avoid unexpected (altered) stylesheet behavior− shredded trace data may suggest more scalable visualization

implementations• (e.g. for large source documents)

Demo of early results

47

Links Online demo here:

− http://xmlportfolio.com/xslt-visualizer-demo/

Code on GitHub:− http://github.com/evanlenz/xslt-visualizer

http://xmlportfolio.com/xslt-visualizer-demo/

http://xmlportfolio.com/xslt-visualizer-demo/

http://github.com/evanlenz/xslt-visualizer

The Mystical Principles of XSLT: Enlightenment through Software Visualization

Software

Transcript of The Mystical Principles of XSLT: Enlightenment through Software Visualization