Transcript of Max Prin - SMX West 2017 - What To Do When Google Can't Understand Your JavaScript
#SMX #23A2 @maxxeight

SEO Best Practices for JavaScript:
What To Do When Google Can't Understand Your JavaScript
How To Make Sure Google Can Understand Your Pages

Crawling
– Don't block resources via robots.txt
– onclick + window.location != <a href="link.html">
– 1 unique "clean" URL per piece of content (and vice-versa)
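The onclick + window.location point can be sketched in a few lines. Crawlers discover URLs from href attributes in the HTML; they do not reliably execute arbitrary click handlers. The helper below is a deliberately simplified, hypothetical link extractor (real crawlers use full HTML parsers), but it shows why the onclick pattern leaves a page undiscoverable:

```javascript
// Two ways a site can implement navigation. Only the first leaves a
// discoverable URL in the markup for a crawler to follow.
//
// Crawlable:      <a href="/shoes.html">Shoes</a>
// Not reliable:   <span onclick="window.location = '/hats.html'">Hats</span>
//
// Simplified sketch of href-based link discovery (real crawlers parse
// the HTML properly; a regex is used here only for illustration).
function extractLinks(html) {
  const links = [];
  const re = /<a\b[^>]*\bhref="([^"]+)"/g;
  let m;
  while ((m = re.exec(html)) !== null) links.push(m[1]);
  return links;
}

const markup =
  '<a href="/shoes.html">Shoes</a>' +
  '<span onclick="window.location = \'/hats.html\'">Hats</span>';

console.log(extractLinks(markup)); // → [ '/shoes.html' ] — /hats.html is never found
```

The onclick navigation works fine for users in a browser, but nothing in the markup tells a crawler that /hats.html exists.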
URL Structures (With AJAX Websites)

Fragment Identifier: example.com/#url
– Not supported. Ignored. URL = example.com

Hashbang: example.com/#!url (pretty URL)
– Google and Bing will request: example.com/?_escaped_fragment_=url (ugly URL)
– The _escaped_fragment_ URL should return an HTML snapshot

Clean URL: example.com/url
– Leveraging the pushState function from the History API
– Must return a 200 status code when loaded directly
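The pretty-to-ugly URL translation described above is purely mechanical. A small sketch of it, assuming a hypothetical helper name (this mirrors the old AJAX crawling scheme's rewrite; it is not an official API):

```javascript
// Sketch: how Google/Bing translated a hashbang ("pretty") URL into the
// "ugly" _escaped_fragment_ URL under the old AJAX crawling scheme.
// toEscapedFragment is a hypothetical helper name for illustration.
function toEscapedFragment(url) {
  const i = url.indexOf('#!');
  if (i === -1) return url;           // no hashbang: nothing to translate
  const base = url.slice(0, i);
  const fragment = url.slice(i + 2);  // everything after "#!"
  const sep = base.includes('?') ? '&' : '?';
  return base + sep + '_escaped_fragment_=' + encodeURIComponent(fragment);
}

console.log(toEscapedFragment('http://example.com/#!url'));
// → http://example.com/?_escaped_fragment_=url
```

The server was then expected to answer that ugly URL with a pre-rendered HTML snapshot of what the hashbang URL shows in a browser.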
How To Make Sure Google Can Understand Your Pages

Rendering
– Load content automatically, not based on user interaction (click, mouseover, scroll) – the 5-second rule
– Avoid JavaScript errors (bots vs. browsers)
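The "avoid JavaScript errors" point matters because one uncaught error can halt script execution in a bot's renderer, leaving the rest of the page unrendered even though it works in a modern browser. A defensive sketch (function names are hypothetical): isolate optional features so their failure cannot take the main content down with them.

```javascript
// A bot's renderer may lack APIs a current browser has; if an optional
// widget throws, the critical content should still get rendered.
// renderMainContent / initOptionalWidget are hypothetical names.
function renderPage() {
  const rendered = [];
  renderMainContent(rendered);   // critical path: run this first
  try {
    initOptionalWidget();        // may rely on APIs a bot lacks
  } catch (e) {
    // Contain the error: the widget fails, the content survives.
  }
  return rendered;
}

function renderMainContent(out) { out.push('main content'); }
function initOptionalWidget() { missingBrowserApi(); } // throws where the API is absent

console.log(renderPage()); // → [ 'main content' ] despite the widget error
```

Without the try/catch, the ReferenceError would stop execution before anything after it runs, which is exactly the bots-vs-browsers gap the slide warns about.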
The "Old" AJAX Crawling Scheme And HTML Snapshots

HTML snapshots are only required with uncrawlable URLs (#!)
– When used with clean URLs: 2 URLs requested for each piece of content (crawl budget!)

HTML snapshots should be:
– Served directly to (other) crawlers (Facebook, Twitter, LinkedIn, etc.)
– Matching the content in the DOM
– No JavaScript (except JSON-LD markup)
– Not blocked from crawling

(Diagram: DOM → HTML snapshot)
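Serving a snapshot under the old scheme comes down to routing: when the request carries the _escaped_fragment_ parameter, return the pre-rendered HTML instead of the JavaScript app shell. A minimal sketch, with hypothetical handler and renderer names (a real setup would use a headless browser or prerendering service to produce the snapshot):

```javascript
// Sketch of old-scheme snapshot serving. handleRequest / renderSnapshot
// are hypothetical names, not a real framework API.
function handleRequest(url) {
  const u = new URL(url, 'http://example.com');
  if (u.searchParams.has('_escaped_fragment_')) {
    const fragment = u.searchParams.get('_escaped_fragment_');
    // Crawler asked for the ugly URL: return pre-rendered HTML.
    return { body: renderSnapshot(fragment), type: 'snapshot' };
  }
  // Normal visitor: return the JavaScript app shell.
  return { body: '<div id="app"></div><script src="app.js"></script>', type: 'app' };
}

// Stand-in for a real pre-renderer (e.g. a headless browser capture).
function renderSnapshot(fragment) {
  return '<h1>Content for ' + fragment + '</h1>';
}

console.log(handleRequest('/?_escaped_fragment_=products').type); // → "snapshot"
console.log(handleRequest('/products').type);                     // → "app"
```

Per the slide's checklist, whatever renderSnapshot returns must match the content in the DOM and contain no JavaScript (except JSON-LD markup).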
How To Make Sure Google Can Understand Your Pages

Indexing
– Mind the order of precedence of SEO signals and content (e.g. when the static HTML and the rendered DOM give conflicting values)
Tools For SEO And JavaScript

– Google cache (unless HTML snapshots)
– Google Fetch & Render (Search Console)
  – Limitation in terms of bytes (~200 KB)
  – Doesn't show the HTML snapshot (DOM)
– Fetch & Render As Any Bot (TechnicalSEO.com)
– Chrome DevTools (JavaScript Console)
– SEO crawlers
  – ScreamingFrog
  – Botify
  – Scalpel (Merkle proprietary tool)
LEARN MORE: UPCOMING @SMX EVENTS

THANK YOU! SEE YOU AT THE NEXT #SMX