@analyticsseo #bigdatascience
Big Data Science for Content Marketing Success
[email protected] www.analyticsseo.com +44 208 977 4465
Keyword Clustering: How Big Data is taking the guesswork out of Digital Content Publishing Strategy
The Web is not a random network
http://www.amazon.co.uk/Linked-Albert-laszlo-Barabasi/dp/0465085733
Mapping the Internet
http://internet-map.net/#9-158.57192434422797-63.9119835263755
http://www.wired.com/2015/06/mapping-the-internet
Power Laws
• Pareto Principle – 80:20 - 80% of your market is dominated by 20% of websites
• The ‘Long Tail’ of search
• Characterised by a small number of large items, and a large number of small items
• Why is it like this? - Preferential Attachment – “Rich get Richer”
https://en.wikipedia.org/wiki/Power_law
http://www.thelongtail.com/about.html
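To make the 80:20 idea concrete, here is a minimal sketch that measures how much of a total is held by the largest 20% of items. The visibility scores are hypothetical (a simple 1/rank curve), not data from this deck, and `top_share` is an illustrative helper name.

```python
def top_share(values, top_fraction=0.2):
    """Return the share of the total held by the largest `top_fraction` of items."""
    ranked = sorted(values, reverse=True)
    # Always keep at least one item so tiny lists still work.
    k = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:k]) / sum(ranked)


# Hypothetical visibility scores for 100 websites, falling off as 1/rank.
visibility = [1000 / rank for rank in range(1, 101)]
share = top_share(visibility)
print(f"Top 20% of sites hold {share:.0%} of total visibility")
```

Running the same function on real visibility data for a market shows how concentrated that market actually is.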
The Long Tail – Websites – Users & Visitors
http://www.hpl.hp.com/research/idl/papers/ranking/ranking.html
Pareto Effect – In Effect in Most Markets
So What? How does this apply to Content Marketing?
• The Long Tail theory argues that there is demand for more and more specific niches
• Google and other search engines are trying to add structure to unstructured data and make sense of your market
• Analysis of the web for your market can show the long tail at work
• Large players dominating but also aggregating niches
• Niche players building their businesses off the back of quality content satisfying the peculiar needs of the many
• Understanding this can be the strategy to success in all markets
• Helping you analyse niches of relative strength vs the competition
• Supporting you to build a content marketing strategy based on Google’s view of the world
Knowledge Graph Results
Direct Answers
Where’s Google Going?
• 200+ Ranking Signals/Factors (each having up to 50 variants)
• 10,000+ Ranking Signals http://searchengineland.com/seotable/
• RankBrain algorithm – Artificial Intelligence
• It’s about becoming an Authority in an area – See Google’s Natural Language Search Results for Intent Queries - they selected authoritative pages which:
• Were frequently selected in search results
• Consistently rank high in search results for related topics
• If you publish quality content giving good answers covering a natural cluster then you will do well
• Think EAT, Users’ Needs and Mobile
What’s the difference between a Ranking Factor and a Ranking Signal?
http://searchengineland.com/close-smx-west-growth-direct-answers-seos-react-216009
EAT
Only Quality Content will make it in the end
Google Quality Raters Guidelines
Expertise, Authority, Trustworthiness
http://www.thesempost.com/google-quality-raters-guide-mobile/
Google Quality Raters PDF
Google Quality Raters Guidelines
+ How well you satisfy users’ needs
+ Think Mobile
What is Graph Clustering?
6 Degrees of Separation
http://barokas.com/2014/11/10-reasons-give-thanks-pr/
http://www.dailymail.co.uk/sciencetech/article-2064746/Facebook-shrinks-degrees-separation-just-FOUR.html
Graph Analysis
[Figure: graph of Actor and Movie nodes. Actors: Rod Steiger, Martin Sheen, Charlie Sheen, Julia Roberts, Clint Eastwood, Kevin Bacon, Louis Anderson, Kiefer Sutherland, William Baldwin. Movies: Truth or Consequences, JFK, Ferris Bueller’s Day Off, Quicksilver, Mystic River, Flatliners, A Few Good Men.]
So you calculate how connected an actor is – or their ‘Bacon number’.
You can also calculate how ‘central’ an actor is. E.g. Eric Roberts.
http://oracleofbacon.org
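A ‘Bacon number’ is a shortest-path length in the actor and movie graph. The sketch below computes it with a breadth-first search. The cast lists are a small hand-typed subset used for illustration only (an assumption, not a complete or verified filmography), and `bacon_numbers` is a hypothetical helper name.

```python
from collections import deque

# Illustrative, hand-typed cast lists (assumption: a partial subset only).
CASTS = {
    "Flatliners": ["Kevin Bacon", "Julia Roberts", "Kiefer Sutherland", "William Baldwin"],
    "A Few Good Men": ["Kevin Bacon", "Kiefer Sutherland"],
    "Truth or Consequences": ["Kiefer Sutherland", "Rod Steiger", "Martin Sheen"],
}


def bacon_numbers(casts, source="Kevin Bacon"):
    """Breadth-first search: number of shared-movie hops from `source` to each actor."""
    # Index the movies each actor appears in.
    movies_of = {}
    for movie, actors in casts.items():
        for actor in actors:
            movies_of.setdefault(actor, []).append(movie)

    distance = {source: 0}
    queue = deque([source])
    while queue:
        actor = queue.popleft()
        for movie in movies_of.get(actor, []):
            for co_star in casts[movie]:
                if co_star not in distance:
                    distance[co_star] = distance[actor] + 1
                    queue.append(co_star)
    return distance


numbers = bacon_numbers(CASTS)
for actor, hops in sorted(numbers.items(), key=lambda item: item[1]):
    print(hops, actor)
```

The same search works on any graph of pages, keywords and domains: swap actors for URLs and movies for keywords.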
Big Data - Graph Analysis for Content Marketing
It’s like Venn Diagrams on Steroids!
Graph Clustering shows Google’s algorithms at work
Graph data example: natural clusters
Green = URLs, Brown = KWs, Blue = Domains
Imagine if you could get this view of your market?
Market Analysis
• What does your market look like online? You might analyse..
• Yourself
• Your direct competitors
• Consumer behaviour
• Trends in consumer demand
• Or more importantly all of this and …… how Google’s algorithm works in your market?
How companies typically do market analysis….
[Figure: diagram of entities A, B, C and D]
An old friend… the Venn Diagram
[Figure: Venn diagrams of increasing size, from two sets (A, B) up to sets labelled A to Z]
You need a method for comparing A against everyone, then B against everyone and so on….
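One simple way to compare everyone against everyone is a pairwise overlap score over the keywords each site ranks for. The sketch below uses Jaccard similarity, chosen here for illustration rather than taken from the deck. The sites and keyword sets are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap of two sets: size of intersection divided by size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(keyword_sets):
    """Score every unordered pair of sites exactly once."""
    return {
        (x, y): jaccard(keyword_sets[x], keyword_sets[y])
        for x, y in combinations(sorted(keyword_sets), 2)
    }


# Hypothetical keyword sets for three sites.
rankings = {
    "A": {"roast beef", "beef stew", "pasta"},
    "B": {"roast beef", "pasta", "bread"},
    "C": {"bread"},
}
overlap = pairwise_overlap(rankings)
for pair, score in overlap.items():
    print(pair, round(score, 2))
```

With n sites this produces n(n-1)/2 scores, which is why a Venn diagram stops being readable long before the data runs out.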
Even Venn diagrams have their limitations!
[Figure: series of Venn diagrams over sets labelled A to Z]
How companies should do market analysis…
[Figure: comparison matrix with entities labelled A to AY along the top and the same entities listed down the side]
Gain unparalleled insights with Graph Analysis and Clustering
You simply need:
• A large database of keywords
• Your ranking position for each keyword
• Demand for those keywords (Search Volume) and £value of each keyword
• A way to measure your strength vs competition
• Graph analysis database and tools
• Method for clustering results (Latent Semantic Indexing & Graph Structure)
• Method for visualising or distilling the results into Excel or PowerBI
• Time, money and some technical skills
[Figure: market graph linking domains, pages and keywords, with regions labelled “You”, “The competition!” and “Opportunity!”]
RD = Ranking domain
P = Your page
KW = Keyword
CD = Competing domains
CP = Competitors’ pages
How to ‘graph’ your market
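A minimal sketch of such a market graph, using plain dictionaries rather than a graph database. The ranking rows and domain names are hypothetical. It assumes that “opportunity” means keywords the market ranks for and you do not, which is consistent with the case-study figures later in this deck (113K market keywords, 24K ranked, 89K opportunity keywords).

```python
from collections import defaultdict

# Hypothetical ranking rows: (domain, page URL, keyword).
RANKINGS = [
    ("you.example", "you.example/roast-beef", "roast beef"),
    ("you.example", "you.example/steak", "how to cook steak"),
    ("rival.example", "rival.example/beef", "roast beef"),
    ("rival.example", "rival.example/xmas", "christmas dinner"),
    ("other.example", "other.example/pasta", "pasta recipes"),
]


def build_graph(rows):
    """Build two edge maps: domain -> pages and page -> keywords."""
    domain_pages = defaultdict(set)
    page_keywords = defaultdict(set)
    for domain, page, keyword in rows:
        domain_pages[domain].add(page)
        page_keywords[page].add(keyword)
    return domain_pages, page_keywords


def keywords_for(domain, domain_pages, page_keywords):
    """Walk domain -> pages -> keywords."""
    return {kw for page in domain_pages.get(domain, ()) for kw in page_keywords[page]}


def split_market(rows, your_domain):
    """Return (contested keywords, opportunity keywords) for `your_domain`."""
    domain_pages, page_keywords = build_graph(rows)
    yours = keywords_for(your_domain, domain_pages, page_keywords)
    theirs = set()
    for domain in domain_pages:
        if domain != your_domain:
            theirs |= keywords_for(domain, domain_pages, page_keywords)
    return yours & theirs, theirs - yours


contested, opportunity = split_market(RANKINGS, "you.example")
print("Contested:", sorted(contested))
print("Opportunity:", sorted(opportunity))
```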
Graph Clustering - Structure & Semantics
What we do: The data we obtain, comprising groups of keywords and their associated ranking pages, sheds light on Google’s view of the relationship between the structure of the web and search intent.
Data & Segmentation: Based on this, we can model related keywords and ranking URLs, then segment the data into groups revealing natural clusters of search topics.
Unique insight: Determining the maximum gain that could be derived, and a recommended course of action for each of these groups, provides actionable insight.
Graph data example: natural clusters
Green = URLs, Brown = KWs, Blue = Domains
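As a rough illustration of the structural half of this idea, the sketch below groups keywords that share ranking URLs, using a union-find over the keyword set. This is a deliberately simple stand-in, not the tuned algorithm described in this deck, and it ignores the semantic side entirely. The SERP data is hypothetical.

```python
from collections import defaultdict


def cluster_keywords(keyword_urls, min_shared=1):
    """Group keywords whose ranking URL sets share at least `min_shared` URLs."""
    parent = {kw: kw for kw in keyword_urls}

    def find(kw):
        # Follow parent links to the cluster root, compressing the path as we go.
        while parent[kw] != kw:
            parent[kw] = parent[parent[kw]]
            kw = parent[kw]
        return kw

    keywords = list(keyword_urls)
    for i, a in enumerate(keywords):
        for b in keywords[i + 1:]:
            if len(keyword_urls[a] & keyword_urls[b]) >= min_shared:
                parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for kw in keywords:
        clusters[find(kw)].add(kw)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical SERP data: keyword -> URLs ranking for it.
SERPS = {
    "roast beef": {"a.example/beef", "b.example/roast"},
    "roast beef recipe": {"a.example/beef", "c.example/sunday"},
    "how to cook steak": {"c.example/sunday", "d.example/steak"},
    "pasta bake": {"e.example/pasta"},
    "easy pasta": {"e.example/pasta", "f.example/quick"},
}
clusters = cluster_keywords(SERPS)
for members in clusters:
    print(members)
```

Raising `min_shared` makes clusters tighter. A production approach would also weight edges by rank and add a semantic similarity measure for labelling.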
What is Keyword Clustering?
• It’s about more than keywords
• It’s about the structure of the web – or more exactly how Google interprets the structure of the web to present its search results
• It’s about how the SERPs fall into natural clusters
• It’s about labelling those clusters semantically
• It’s about finding relevant clusters with high demand where you are competing against weaker competitors for short-term wins
• It’s about defining a ‘real-time’ strategy that shows the algorithm at work in your market so you can plan short-term and long-term growth
• It’s about becoming an Authority in a cluster
• If you publish quality content giving good answers covering a natural cluster then you have a chance to become an Authority
https://patentscope.wipo.int/search/en/detail.jsf?docId=WO2014197227
Everyone here is already clustering!
• Keyword research into keyword groups
• Grouping URLs
• Grouping competitors or other websites (blogs, affiliates, partners, media, social networks, etc)
• Everyone here sees their market through their lens – if you were sitting in your closest competitor’s boardroom there would be lots of similarities but you would see things slightly differently
• The common denominator is Google, who aggregates all the content and all the links and clusters the web into its different niches
DIY Clustering – 1 Keyword, 1 Page of SERPs, Un-tuned Algorithm
http://search.carrotsearch.com/carrot2-webapp/search
DIY Clustering – Stage 2
http://app.raw.densitydesign.org
Now imagine you could do this with…
• Millions of keywords
• Millions of pages from the SERPs
• Millions of websites
• Using a Graph Database
• With a tuned algorithm - Using the structure of the web and semantic algorithms to refine the model
Using this tech and data you can really understand your market
We can filter the data to:
1) Identify core niche competitors
2) Market Dynamics – Size and Concentration & Growth
3) Analyse the performance of existing keywords & pages
4) Suggest new keywords for existing pages
5) Suggest new keywords for new pages
Commercial & in-confidence
www.analyticsseo.com
GreatBritishChefs.com case study
Taking the guesswork out of your content marketing efforts
GreatBritishChefs.com’s Market
[Chart: Market Visibility Share % by domain]
The initial ‘net casting’ shows that the 19 biggest players account for 81% of the market.
Identifying Core Niche Competitors
[Chart: bubble chart of the top 100 market domains in four quadrants: Core Niche Competitors, Niche Competitors, Powerful Mainstream Competitors and Fringe Competitors. GreatBritishChefs.com is marked.]
X axis: Strength of domain (low to high), determined by Majestic® metrics
Y axis: Organic search visibility (low to high), sum of estimated organic search visibility (log scale)
Bubble size = number of unique keywords
Analysing Core Niche Competitors
[Chart: quadrant bubble chart with the same quadrants and axes (strength of domain vs organic search visibility). Halfords in red.]
Analysing Core Niche Competitors
[Charts: two quadrant bubble charts with the same quadrants and axes (strength of domain vs organic search visibility). Labelled Webuyanycar.com.]
Identifying Core Niche Competitors
[Chart: quadrant bubble chart with the same quadrants and axes. Labelled Visionexpress.com.]
Top 100 market domains shown. Bubble size = number of unique keywords
X axis: Strength determined by Majestic® metrics
Y axis: Sum of estimated organic search visibility (log scale)
Initial Market Share results for GreatBritishChefs.com
Your Market Share (by visibility): 2.8%
Market Keywords: 113K
Keywords you rank for: 24K
Opportunity Keywords: 89K
Market Value: £6.1M
Your Market Value: £115K
Market Searches: 24.7M
Searches for your keywords: 4.8M
Which existing keywords for existing content to focus on?
9,215 of your keywords have growth potential, clustered into >400 categories below:
[Chart: bubble chart of keyword clusters in four quadrants: Long-Term ROI, Low/No ROI, Quick ROI and Maintain ROI. Annotations: “HARDER to rank for” and “EASIER to rank for”.]
X axis: Average relative strength of cluster (low to high), determined by Majestic® metrics
Y axis: Organic growth potential (low to high), sum of estimated organic traffic growth per cluster (log scale)
Bubble size = number of unique keywords
Weekly estimated traffic growth possible in this quadrant alone: 556,837
Estimated current weekly traffic from Market Share: 204,703
So with this quadrant one could (in theory) increase traffic by: 172%
Which existing keywords for existing content to focus on?
Example of 3 Clusters:
Cluster label | Your keyword count | Average keyword frequency across 100 domains | Sum of potential traffic increase | Average relative strength
beef | 214 | 16.29 | 30,940 | -25.12
dinner | 268 | 17.27 | 30,818 | -14.5
Jelly | 82 | 12.96 | 28,117 | -20.57

Example Keywords from 1st Cluster:
Your keyword | Top competing URL | Your top ranking page for this keyword | Keyword frequency across domains | Potential traffic increase (max) | Potential traffic increase (5 ranks up) | Your current rank
roast beef | www.jamieoliver.com/recipes/beef-recipes/perfect-roast-beef/ | www.greatbritishchefs.com/recipes/roast-beef-recipe-mushrooms-brandy-potatoes | 15 | 1,150 | 400 | 29
roast beef recipe | www.jamieoliver.com/recipes/beef-recipes/perfect-roast-beef/ | www.greatbritishchefs.com/recipes/roast-beef-recipe-mushrooms-brandy-potatoes | 20 | 669 | 250 | 17
how to cook steak | www.bbcgoodfood.com/technique/how-cook-steak | www.greatbritishchefs.com/how-to-cook/how-to-cook-steak | 17 | 635 | 300 | 22
[Figure: Example ‘Beef’ Cluster]
Which NEW keywords for existing content to focus on?
43,449 keywords GreatBritishChefs.com could rank for that relate closely to existing content:
[Chart: bubble chart of keyword clusters in four quadrants, labelled Long-Term High ROI, Short-Term Low ROI, Short-Term High ROI and Short-Term Low ROI.]
X axis: Average relative strength, domain level (low to high), determined by Majestic® metrics
Y axis: Sum of organic growth potential (low to high), sum of estimated organic traffic growth per cluster (log scale)
Bubble size = number of unique keywords
Which NEW keywords for existing content to focus on?
Weekly estimated traffic growth possible in this quadrant alone: 164,592
Estimated current weekly traffic from Market Share: 204,703
So with this quadrant one could (in theory) increase traffic by: 80%
Chart annotations: “HARDER to rank for”, “EASIER to rank for”
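The 80% figure above can be reproduced with one line of arithmetic, assuming the estimated growth is additional traffic on top of the current weekly figure. That reading is an assumption, and `pct_increase` is an illustrative helper name.

```python
def pct_increase(extra_weekly_traffic, current_weekly_traffic):
    """Percentage increase, assuming the growth is additional to current traffic."""
    return 100 * extra_weekly_traffic / current_weekly_traffic


# Figures quoted on this slide.
increase = pct_increase(164_592, 204_703)
print(f"{increase:.0f}%")
```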
Which NEW keywords for existing content to focus on?
Example of 3 clusters:
Cluster label | Count of unique opportunity keywords | Sum of potential traffic increase | Average relative strength | Average keyword frequency across 100 domains
turkey | 859 | 29,222 | -26.50 | 3.62
beef | 1,462 | 16,810 | -23.58 | 4.49
dinner | 1,261 | 12,194 | -27.07 | 2.89

Example Keywords from 1st Cluster:
Relevant opportunity keyword | Top competing URL | Your most suitable* ranking page for this keyword | Keyword frequency across domains | Potential increase in search volume (position 1) | Potential increase in search volume (position 5) | Potential increase in search volume (position 10)
christmas dinner | www.bbcgoodfood.com/recipes/category/christmas-dinner | www.greatbritishchefs.com/collections/christmas-recipes | 14 | 59,786 | 20,000 | 500
christmas food | en.wikipedia.org/wiki/List_of_Christmas_dishes | www.greatbritishchefs.com/collections/christmas-recipes | 10 | 100,000 | 24,383 | 750
christmas food ideas | en.wikipedia.org/wiki/List_of_Christmas_dishes | www.greatbritishchefs.com/collections/christmas-recipes | 15 | 500,000 | 100,000 | 22,371
* Suitability is determined by semantic similarity of each ranking page’s keywords to the proposed new keyword phrase.
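One simple way to approximate this kind of suitability score is cosine similarity between word counts. The sketch below is a stand-in for whatever semantic model was actually used (the deck mentions Latent Semantic Indexing elsewhere). The page paths and keyword lists are hypothetical.

```python
import math
from collections import Counter


def tokens(phrases):
    """Bag of lower-cased words across a list of phrases."""
    return Counter(word for phrase in phrases for word in phrase.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def most_suitable_page(new_keyword, page_keywords):
    """Pick the page whose existing keywords are closest to the new keyword."""
    target = tokens([new_keyword])
    return max(page_keywords, key=lambda page: cosine(target, tokens(page_keywords[page])))


# Hypothetical pages and the keywords they already rank for.
PAGES = {
    "/collections/christmas-recipes": ["christmas recipes", "christmas dinner ideas"],
    "/how-to-cook/how-to-cook-steak": ["how to cook steak", "steak recipe"],
}
best = most_suitable_page("christmas food ideas", PAGES)
print(best)
```

Word overlap misses synonyms, so a real system would need a richer representation than raw counts.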
Which NEW keywords for NEW content to focus on?
20,863 new keywords for content creation strategies:
Showing clusters of new potential opportunity keywords that are less related to existing content
[Chart: bubble chart of keyword clusters in four quadrants: Long-Term High ROI, Long-Term Average ROI, Short-Term High ROI and Short-Term Average ROI.]
X axis: Relative strength, domain level (low to high)
Y axis: Search volumes, keyword cluster level (low to high)
Which NEW keywords for NEW content to focus on?
Weekly estimated traffic growth possible in this quadrant alone: 302,272
Estimated current weekly traffic from Market Share: 204,703
So with this quadrant one could (in theory) increase traffic by: 148%
Chart annotations: “HARDER to rank for”, “EASIER to rank for”
Which NEW keywords for NEW content to focus on?
Example of 3 clusters:
Cluster label | Count of unique opportunity keywords | Sum of potential traffic increase | Average relative strength | Average keyword frequency across 100 domains
recipes | 2,638 | 32,626 | -22.00 | 4.22
pasta | 655 | 20,687 | -29.55 | 3.50
bread | 822 | 20,079 | -22.49 | 2.64

Example Keywords from 1st Cluster:
Relevant opportunity keyword | Top competing URL | Keyword frequency across domains | Potential search volume (position 1) | Potential search volume (position 5) | Potential search volume (position 10)
quick dinner ideas | www.bbcgoodfood.com/recipes/category/quick-easy | 10 | 10,969 | 731 | 50
simple recipes | www.bbcgoodfood.com/recipes/collection/easy | 10 | 4,969 | 950 | 70
how to cook | www.theguardian.com/lifeandstyle/series/how-to-cook-the-perfect | 10 | 2,999 | 647 | 11
Short-term, High ROI Recommendations for GreatBritishChefs.com
Content | Type of optimisation | Max % increase in weekly traffic (est.) | Most rewarding clusters
Existing | Optimise existing content and keywords | 172% | Top 3 (of 70): beef, dinner, jelly
Existing | Optimise existing content with new keyword suggestions | 80% | Top 3 (of 34): turkey, beef, dinner
New | Create new content with new keyword suggestions | 148% | Top 3 (of 41): recipes, pasta, bread
Content Marketing Strategy
[Cycle diagram:]
Sustainable investment in new quality content
→ Engaged users
→ Increased shares, links and click-thrus
→ Increased relevance in your cluster
→ Increased share of voice in SERPs
→ Increase in Traffic and Sales
→ Re-invest profits
Take-aways
• Don’t try to second-guess Google’s algorithm – let the data speak for itself
• Analyse your whole market to get a different perspective and to see as many opportunities as possible
• You need ‘Artificial Intelligence for Natural Search’ - if your competitors are using Big Data and Data Science and you’re not, then you’ll face an uphill battle
• Produce quality content - http://www.thesempost.com/google-quality-raters-guide-mobile/
• Gary Illyes, Google: ask “How many visitors have I helped today?” and not just “How many visitors did I get?”
• If you want ‘Direct Answers’ then you need to know the questions!
• Look at your site’s E-A-T, that is, analyse the page’s “expertise, authoritativeness and trustworthiness”
Useful Links
carrotsearch.com/ - Lingo3G Clustering Engine – Circles and FoamTree
www.visualisingdata.com/resources/ - great set of resources for visualising data
www.visualcomplexity.com/vc/ - a great resource about visualising Big Data
www.linkurious.com – visualising and clustering graph data
www.keylines.com – visualising and clustering graph data
www.zoomcharts.com – online HTML 5 charting tool
www.neo4j.com – graph database and tutorials
www.thesempost.com/google-quality-raters-guide-mobile/ - recent article on the latest manual (not a link to the manual)
Google Quality Raters PDF – essential reading for Content Marketers
About Analytics SEO
@analyticsseo #bigdatascience
Big Data Science for Content Marketing Success
[email protected] www.analyticsseo.com +44 208 977 4465