Post on 26-Jul-2015
#SMX #21C1 @minderwinter
Charles Midwinter, Collegis Education
Visualizing Attribution in Living Color
▪ When multiple channels or tactics assist with a conversion, an attribution model is the set of rules we use to “attribute” portions of the conversion to each assisting touch-point.
▪ But you already knew that…
What is Attribution (review, obviously)?
▪ Last Interaction ▪ Last Non-direct Click ▪ Last AdWords Click ▪ First Interaction ▪ Linear ▪ Time Decay ▪ Position Based
Google Analytics Attribution Models
Almost anything is better than “Last Click,” but black boxes aren’t much better.
▪ No visibility on the details of the attribution calculation
▪ Possible pitfalls with certain channels
▪ Too many groundless assumptions required
The Problem with Out-of-the-Box Attribution Models
If you want to understand multi-channel attribution, the “Multi-Channel Funnels” reports in Google Analytics are your first stop.
▪ Take a look at the “Top Conversion Paths” report
▪ This is great information, but how do you summarize it at a high level?
Google Analytics & Channel/Tactic Interactions
The object that can summarize these conversion paths is called an “edge matrix.”
▪ Usually used for the analysis of networks (e.g., social networks)
▪ Encodes the connections among entities
▪ Can be visualized as a “node graph” with open source software (Gephi)
Edge Matrices
Consider the following conversion paths: ▪ A > C > B > C ▪ A > B ▪ B > C
Edge Matrix Example 1/3
In words:
▪ A
  ▪ referred to C once
  ▪ referred to B once
▪ B
  ▪ referred to C twice
▪ C
  ▪ referred to B once
Edge Matrix Example 2/3
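The counting in this example can be sketched in a few lines of Python. This is a minimal illustration, not the script from the talk: the function name `edge_matrix` and the " > " separator are assumptions made for the sketch. Rows are the referring node and columns are the node referred to.

```python
from collections import Counter

def edge_matrix(paths, sep=" > "):
    """Count how often each node refers directly to another across all paths."""
    edges = Counter()
    for path in paths:
        steps = path.split(sep)
        # Each consecutive pair (source, target) in a path is one referral.
        for source, target in zip(steps, steps[1:]):
            edges[(source, target)] += 1
    nodes = sorted({node for path in paths for node in path.split(sep)})
    # Rows are the referring node, columns are the node referred to.
    matrix = [[edges[(r, c)] for c in nodes] for r in nodes]
    return nodes, matrix

paths = ["A > C > B > C", "A > B", "B > C"]
nodes, matrix = edge_matrix(paths)
for name, row in zip(nodes, matrix):
    print(name, row)
# A [0, 1, 1]
# B [0, 0, 2]
# C [0, 1, 0]
```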
Just use my handy dandy Python script.
▪ Go to: traffictheory.org/smx-2015
▪ Download the script
▪ Make sure you have Python 2.7 installed (not Python 3!)
▪ Follow the instructions at the URL above to run.
MCF Top Conversion Paths to Edge Matrix
To visualize the “Edge Matrix” as a Node Graph, you’ll need Gephi, open source graph software.
▪ Open the “edge_matrix.csv” file created by the Python script (see website for more details)
▪ Import the “last_click.csv” file created by the Python script (see website for more details)
Turning an Edge Matrix into a Node Graph
▪ A layout algorithm uses the weights of the connections/edges to re-arrange the nodes.
▪ Usually physics-based, involving a gravitation-like attraction that scales with the edge weights between nodes, and often a repulsion that separates weakly connected nodes.
Layout Algorithms
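To make the physics concrete, here is a toy force-directed layout in Python. It is a sketch of the general idea only and is not Force Atlas 2: the force formulas, the constant `k`, the step size, and all function names are assumptions chosen for illustration. Attraction grows with edge weight and distance, while repulsion acts between every pair of nodes and fades with distance.

```python
import math

def layout_step(pos, weights, k=1.0, step=0.05):
    """Move every node once along its net force and return the new positions."""
    new_pos = {}
    for a, (ax, ay) in pos.items():
        fx = fy = 0.0
        for b, (bx, by) in pos.items():
            if a == b:
                continue
            dx, dy = bx - ax, by - ay
            dist_sq = max(dx * dx + dy * dy, 1e-9)  # avoid division by zero
            # Attraction scales with edge weight (edges treated as undirected here).
            w = weights.get((a, b), 0) + weights.get((b, a), 0)
            # Repulsion (k / dist_sq) pushes the pair apart and fades with distance.
            pull = w - k / dist_sq
            fx += pull * dx
            fy += pull * dy
        new_pos[a] = (ax + step * fx, ay + step * fy)
    return new_pos

def run_layout(pos, weights, iterations=200):
    """Repeat the step a fixed number of times."""
    for _ in range(iterations):
        pos = layout_step(pos, weights)
    return pos

def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

start = {"A": (0.0, 0.0), "B": (1.0, 0.0)}
weak = run_layout(start, {("A", "B"): 1})
strong = run_layout(start, {("A", "B"): 4})
print(distance(weak["A"], weak["B"]), distance(strong["A"], strong["B"]))
```

In this two-node setup the pair settles where attraction and repulsion cancel, so the pair joined by the heavier edge ends up closer together.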
▪ Nodes that refer to each other often are now placed close together in 2D space.
▪ Two central communities of nodes are identifiable (“direct/(none)” and “google/organic”)
The Result of Layout Algorithm “Force Atlas 2”
▪ To make this graph more useful, we’d like to map a metric to node size
▪ The metric should give us some indication of the node’s importance to the conversion process
▪ In order to proceed, we should understand a bit more about the node graph
Measuring Node Importance
▪ Degree: the number of a node’s connections.
▪ In-Degree: the number of a node’s incoming connections
▪ Out-Degree: the number of a node’s out-going connections
Degree
▪ A
  ▪ Degree = 2
  ▪ In-Degree = 0
  ▪ Out-Degree = 2
Degree Example
   A  B  C
A  0  1  1
B  0  0  2
C  0  1  0
▪ B
  ▪ Degree = 3
  ▪ In-Degree = 2 (from A and from C)
  ▪ Out-Degree = 1 (to C)
Degree Example
   A  B  C
A  0  1  1
B  0  0  2
C  0  1  0
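The three degree counts can be read straight off the matrix: non-zero cells in a node's row are its out-going connections, and non-zero cells in its column are its incoming ones. The sketch below assumes that degree is the sum of the two; the function name `degrees` is made up for illustration.

```python
nodes = ["A", "B", "C"]
matrix = [
    [0, 1, 1],  # A referred to B once and to C once
    [0, 0, 2],  # B referred to C twice
    [0, 1, 0],  # C referred to B once
]

def degrees(matrix, i):
    """Return (degree, in_degree, out_degree) for node i, ignoring weights."""
    out_degree = sum(1 for w in matrix[i] if w > 0)     # non-zero cells in row i
    in_degree = sum(1 for row in matrix if row[i] > 0)  # non-zero cells in column i
    return in_degree + out_degree, in_degree, out_degree

for i, name in enumerate(nodes):
    print(name, degrees(matrix, i))
# A (2, 0, 2)
# B (3, 2, 1)
# C (3, 2, 1)
```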
▪ Weighted Degree: the sum of the weights of a node’s connections.
▪ Weighted In-Degree: the sum of the weights of a node’s incoming connections.
▪ Weighted Out-Degree: the sum of the weights of a node’s out-going connections.
Weighted Degree
▪ B
  ▪ Weighted Degree = 4
  ▪ Weighted In-Degree = 2 (1 from A, 1 from C)
  ▪ Weighted Out-Degree = 2 (2 to C)
Weighted Degree Example
   A  B  C
A  0  1  1
B  0  0  2
C  0  1  0
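The weighted versions are row and column sums of the same matrix. As a rough sketch (the helper name `weighted_degrees` is an assumption, and weighted degree is taken to be the sum of the in and out values):

```python
nodes = ["A", "B", "C"]
matrix = [
    [0, 1, 1],
    [0, 0, 2],
    [0, 1, 0],
]

def weighted_degrees(matrix, i):
    """Return (weighted_degree, weighted_in, weighted_out) for node i."""
    weighted_out = sum(matrix[i])                # sum of row i
    weighted_in = sum(row[i] for row in matrix)  # sum of column i
    return weighted_in + weighted_out, weighted_in, weighted_out

for i, name in enumerate(nodes):
    print(name, weighted_degrees(matrix, i))
# A (2, 0, 2)
# B (4, 2, 2)
# C (4, 3, 1)
```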
▪ The most important nodes are the ones generating incremental conversions
▪ Conceptually, they generate a net output.
  ▪ A node that gets no in-bound connections, but has many out-bound connections is a source of conversions, and should be highly valued.
  ▪ A node that generates a lot of last-click conversions has value, but its net output should be adjusted so that in-bound connections are subtracted.
  ▪ A node that has as many in-bound connections as it does last-click/out-bound connections is adding little value from an incremental perspective.
Assessing Node (Campaign or Source/Medium) Importance
(Weighted Out-Degree + Last Click) – Weighted In-Degree
▪ This metric gives us an indication of node importance from an incremental conversion perspective.
Net Output
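Applied to the three example paths from the edge matrix slides, the formula can be sketched as follows. The last-click count is assumed here to be the number of paths that end on a node; the variable and function names are illustrative and not taken from the talk's script.

```python
from collections import Counter

paths = [["A", "C", "B", "C"], ["A", "B"], ["B", "C"]]

weighted_out, weighted_in, last_click = Counter(), Counter(), Counter()
for path in paths:
    for source, target in zip(path, path[1:]):
        weighted_out[source] += 1  # source handed the visitor on
        weighted_in[target] += 1   # target received the visitor
    last_click[path[-1]] += 1      # final touch-point before the conversion

def net_output(node):
    """(Weighted Out-Degree + Last Click) - Weighted In-Degree."""
    return (weighted_out[node] + last_click[node]) - weighted_in[node]

for node in ["A", "B", "C"]:
    print(node, net_output(node))
# A 2
# B 1
# C 0
```

In this toy data C collects two of the three last clicks, yet its net output is 0 because every one of its appearances was fed by another node.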
▪ Nodes that generate more incremental conversions are larger
▪ Caveat: flawed tracking means this metric is far from perfect
Mapping “Net Output” to Node Size
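Gephi can do this mapping itself, but the idea is a simple linear rescale. The sketch below is only an illustration: the function name `node_size` and the size range of 10 to 50 are hypothetical choices, and the net output values are the ones from the three-path toy example.

```python
def node_size(value, lo, hi, min_size=10.0, max_size=50.0):
    """Linearly rescale a metric value from [lo, hi] to [min_size, max_size]."""
    if hi == lo:
        # All nodes share one value, so give them all a middle size.
        return (min_size + max_size) / 2
    fraction = (value - lo) / (hi - lo)
    return min_size + fraction * (max_size - min_size)

net_output = {"A": 2, "B": 1, "C": 0}
lo, hi = min(net_output.values()), max(net_output.values())
sizes = {node: node_size(value, lo, hi) for node, value in net_output.items()}
print(sizes)
# {'A': 50.0, 'B': 30.0, 'C': 10.0}
```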
▪ Positioning tells us which nodes are closely connected, and size tells us how well nodes generate incremental conversions
▪ It would also be nice to know how each node tends to assist in the conversion process: does it produce last clicks, or is it higher in the funnel?
Assessing Node Function
▪ The lower a node is in the conversion funnel, the more last clicks it should have
▪ The higher a node is in the funnel, the more likely it is to push traffic to other nodes (high weighted out-degree)
Funnel Position 1/2
Last Click / (Weighted Out-Degree + Last Click)
▪ 0 for nodes with no last clicks
▪ 1 for nodes with only last clicks (no out-bound connections)
▪ Varies from 0 to 1 as the ratio of last clicks to weighted out-degree increases
Funnel Position 2/2
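A short sketch of the ratio, using the last-click and weighted out-degree values from the three-path toy example. The function name and the choice to return `None` for a node with no activity at all are assumptions made for this illustration.

```python
def funnel_position(last_click, weighted_out):
    """Last Click / (Weighted Out-Degree + Last Click), or None if both are zero."""
    total = weighted_out + last_click
    if total == 0:
        return None  # no last clicks and no out-bound connections
    return last_click / total

# (last clicks, weighted out-degree) per node in the toy example
stats = {"A": (0, 2), "B": (1, 2), "C": (2, 1)}
for node, (clicks, out) in stats.items():
    print(node, funnel_position(clicks, out))
```

A scores 0 (top of the funnel), B scores one third, and C scores two thirds, so C sits lowest in the funnel of the three.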
▪ Nodes high in the funnel are redder
▪ Nodes lower in the funnel are bluer
▪ In-between nodes are lighter in color, sometimes almost white.
Mapping Funnel Position to Node Color
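One way to picture this color scale is a red-to-white-to-blue gradient over the funnel position. The sketch below is an assumption about the palette, not the exact gradient used in the talk's Gephi graph, and the function name `funnel_color` is hypothetical.

```python
def funnel_color(position):
    """Map funnel position (0 = top, 1 = bottom) to an (r, g, b) tuple."""
    position = min(max(position, 0.0), 1.0)  # clamp to the valid range
    if position <= 0.5:
        t = position / 0.5  # t runs 0 -> 1 as the color goes red -> white
        return (255, round(255 * t), round(255 * t))
    t = (position - 0.5) / 0.5  # t runs 0 -> 1 as the color goes white -> blue
    return (round(255 * (1 - t)), round(255 * (1 - t)), 255)

print(funnel_color(0.0))  # (255, 0, 0)     red: high in the funnel
print(funnel_color(0.5))  # (255, 255, 255) white: in between
print(funnel_color(1.0))  # (0, 0, 255)     blue: low in the funnel
```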
▪ Proximity tells you how often channels interact
▪ Color tells you a channel/campaign’s position in the funnel
▪ Size tells you how many incremental conversions are likely generated by a channel/campaign
How to Interpret the Result
▪ Identify “sinks”
  ▪ Sinks are bluish.
  ▪ These kinds of channels are at the end of the conversion path
  ▪ They are linchpins in the network, fed by channels higher in the funnel
  ▪ Overvalued by last click
Sinks
▪ Identify “sources”:
  ▪ Reddish
  ▪ Tend to be earlier in the conversion path
  ▪ Undervalued by last click
Sources
▪ Identify “assistors”:
  ▪ Pale, or sometimes white
  ▪ Beware of small assistors
  ▪ Tend to be midway in the conversion path
  ▪ Undervalued by last click, but can be overvalued by other models
Assistors
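If you want a rough first pass before eyeballing the graph, the three roles can be approximated from the funnel position score. The cut-offs of 0.25 and 0.75 below are hypothetical values picked for illustration; the talk identifies roles by color and does not give numeric thresholds.

```python
def classify(position, low=0.25, high=0.75):
    """Label a node by funnel position (0 = top of funnel, 1 = bottom)."""
    if position < low:
        return "source"    # reddish: mostly pushes traffic to other nodes
    if position > high:
        return "sink"      # bluish: mostly collects last clicks
    return "assistor"      # pale: a mix of both

for score in (0.0, 0.5, 1.0):
    print(score, classify(score))
# 0.0 source
# 0.5 assistor
# 1.0 sink
```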
▪ Display
  ▪ Retargeting
  ▪ Direct Buy
  ▪ Behavioral
▪ Paid Search
  ▪ Branded
  ▪ Unbranded
▪ Organic Search
▪ Referral
▪ Social
▪ Direct
Source, Sink, or Assistor?
▪ Display
  ▪ Retargeting (Assistor)
  ▪ Direct Buy (Source)
  ▪ Behavioral (Source/Assistor)
▪ Paid Search
  ▪ Branded (Sink)
  ▪ Unbranded (Source/Assistor)
▪ Organic Search (Assistor/Sink)
▪ Referral (Source/Assistor)
▪ Social (Assistor)
▪ Direct (Assistor/Sink)
Source, Sink, or Assistor?
▪ Depending on your sales cycle, channels & campaigns may function differently in the conversion funnel
Results May Vary
▪ Nodes with little visibility are hard to interpret:
  ▪ Organic: because of (not provided), it’s a mix of branded and unbranded. Its “Funnel Position” will be determined by the strength of your brand and the amount of unbranded organic traffic you receive.
  ▪ Direct: can skew your results. We know it contains all kinds of poorly tracked traffic. Sometimes, I just go ahead and remove direct from the graph.
Caveats
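Removing a node such as direct can be done in Gephi or before import. One possible way to do it beforehand is to drop the node's row and column from the edge matrix, sketched below. The helper name `drop_node`, the "google / cpc" node, and every number in the matrix are made up purely for illustration.

```python
def drop_node(nodes, matrix, name):
    """Return copies of nodes and matrix without the named node's row and column."""
    keep = [i for i, node in enumerate(nodes) if node != name]
    new_nodes = [nodes[i] for i in keep]
    new_matrix = [[matrix[r][c] for c in keep] for r in keep]
    return new_nodes, new_matrix

# Illustrative data only: these counts are invented for the example.
nodes = ["direct / (none)", "google / organic", "google / cpc"]
matrix = [
    [0, 3, 1],
    [5, 0, 2],
    [4, 1, 0],
]

trimmed_nodes, trimmed_matrix = drop_node(nodes, matrix, "direct / (none)")
print(trimmed_nodes)   # ['google / organic', 'google / cpc']
print(trimmed_matrix)  # [[0, 2], [1, 0]]
```

If you trim the matrix this way, the last-click counts for the removed node would need to be dropped as well so the node metrics stay consistent.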
▪ Select an attribution model that fits your conversion process
  ▪ Sources are undervalued by both last click and time decay, for example.
▪ Identify outliers and understand what they say about your mix (discover fraud)
▪ Use the visualization rhetorically to justify budget for exposure tactics
How to Make This Actionable