High Performance Website
Chhorn Chamnap
30 April 2011
1
About Me
• Chhorn Chamnap
• Senior Developer at Yoolk Inc.
• Blog: http://chamnapchhorn.blogspot.com/
=> http://blog.placexpert.com/
• Twitter: @chamnap
• Email: [email protected]
2
The 80/20 Performance Rule
• A theory from Vilfredo Pareto, an economist in the early 1900s.
• 80% of the consequences come from 20% of the causes.
• 80% of the time is spent in only 20% of the code.
• Focus on the 20% that affects 80% of the end-user response time.
• Start from the front end.
3
1. Minimize HTTP Requests
• 80% of the end-user response time is spent on the front-end
[Figures: loading http://www.yahoo.com/; time spent loading popular web sites]
4
1. Minimize HTTP Requests
• Solutions
– Combined scripts and stylesheets
– CSS Sprites
– Image maps
– Browser Cache Usage - 40-60% of daily visitors to your site come in with an empty cache.
5
Combined Scripts and Stylesheets
• Combining six scripts into one eliminates five HTTP requests
• http://stevesouders.com/examples/combo.php
6
Image maps
• Combine multiple images into a single image.
• client-side – preferred
<img usemap="#map1" border=0 src="/images/imagemap.gif">
<map name="map1">
<area shape="rect" coords="0,0,31,31" href="home.html" title="Home">
…
</map>
• drawbacks:
– images must be contiguous
– defining area coordinates is tedious and error-prone
• http://www.w3.org/TR/html401/struct/objects.html#h-13.6
• http://stevesouders.com/examples/rule-min-http.php
7
CSS Sprites
• The preferred method for reducing the number of image requests
• Combine into a single image and use background-image and background-position properties to display.
<span style="
background-image: url('sprites.gif');
background-position: -260px -90px;">
</span>
• http://stevesouders.com/examples/sprites.php
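The background-position values in a sprite are just negative offsets into the combined image. A minimal sketch, assuming the icons sit on a regular grid; the helper name, the grid layout, and the 26x30 cell size are hypothetical and only chosen so that the result matches the offsets on the slide:

```javascript
// Hypothetical helper: the grid layout (cell size, column, row) is an
// assumption for illustration, not something the slides prescribe.
function px(n) {
  // Avoid printing "-0px" for the first row or column.
  return n === 0 ? "0" : `${n}px`;
}

function spritePosition(column, row, cellWidth, cellHeight) {
  // Offsets are negative because the sprite sheet is shifted up and left
  // so that the wanted cell lands in the element's top-left corner.
  const x = -(column * cellWidth);
  const y = -(row * cellHeight);
  return `${px(x)} ${px(y)}`;
}

console.log(spritePosition(10, 3, 26, 30)); // "-260px -90px"
```

The returned string can be assigned to an element's background-position, together with a fixed width and height equal to the cell size.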
8
Content Delivery Network
How does CDN work?
9
2. Use a Content Delivery Network
• The user's proximity has an impact on response times.
• Solutions
– Use a CDN: Akamai Technologies, EdgeCast, Amazon CloudFront, or Level 3.
– Distribute your static content before distributing your dynamic content.
• At Yahoo!, using a CDN improved end-user response times by 20% or more.
• http://stevesouders.com/examples/rule-cdn.php
10
3. Add an Expires or a Cache-Control Header
• Implement a "Never expire" policy for all static components.
• Avoids unnecessary HTTP requests on subsequent page views after the first visit.
• How to expire the cache? Change the file name or query string:
yahoo_2.0.6.js
css_based.css?30042011
• http://stevesouders.com/examples/rule-expires.php
HTTP/1.1 200 OK
Expires: Thu, 31 Dec 2037 23:55:55 GMT
Cache-Control: max-age=315360000
Content-Length: 12195
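With a far-future Expires header, the only way to push a changed file to returning visitors is to change its URL. One possible sketch of that idea derives the version from the file content instead of a hand-edited date such as css_based.css?30042011. The function name and the choice of an 8-character MD5 prefix are assumptions made for illustration:

```javascript
const crypto = require("crypto");

// Hypothetical helper: returns the asset path with a short content hash
// appended as a query string. The URL changes only when the content changes.
function versionedUrl(path, content) {
  const digest = crypto.createHash("md5").update(content).digest("hex");
  return `${path}?${digest.slice(0, 8)}`;
}

const before = versionedUrl("/css/css_based.css", "body { color: black; }");
const after = versionedUrl("/css/css_based.css", "body { color: navy; }");
console.log(before !== after); // true: edited content yields a new URL
```

Because unchanged files keep their URL, browsers continue to serve them from cache across deployments.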
11
Empty vs. Full Cache
Loading http://www.yahoo.com/
12
13
Expires Header Configuration
• Apache
ExpiresDefault "access plus 10 years"
• Nginx
if ($request_filename ~* ^.+\.(jpg|jpeg|gif|png|ico|css|js|swf)$) {
expires max;
break;
}
14
4. Gzip Components
• Reduces response times.
• Reduces the response size by about 70%.
• 90%+ of browsers support gzip.
• Compress any text responses
• Image and PDF files SHOULD NOT be gzipped.
• http://stevesouders.com/examples/rule-gzip.php
HTTP/1.1 200 OK
Content-Length: 12195
Content-Encoding: gzip
GET /i/yahoo.gif HTTP/1.1
Host: example.com
Accept-Encoding: gzip
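The negotiation above can be sketched in a few lines with Node's built-in zlib module. This is only an illustration of the rule "compress text, skip images and PDFs"; the function name, the returned object shape, and the list of compressible types are assumptions:

```javascript
const zlib = require("zlib");

// Content types treated as compressible text (an illustrative list).
const COMPRESSIBLE = /^(text\/|application\/(javascript|x-javascript|json|xml))/;

// Hypothetical helper: gzip the body only when the client advertises gzip
// support and the content type is text-like.
function maybeGzip(contentType, body, acceptEncoding) {
  const clientAcceptsGzip = /\bgzip\b/.test(acceptEncoding || "");
  if (!clientAcceptsGzip || !COMPRESSIBLE.test(contentType)) {
    return { headers: {}, body: Buffer.from(body) };
  }
  return {
    headers: { "Content-Encoding": "gzip" },
    body: zlib.gzipSync(body),
  };
}

const html = "<li>High Performance Website</li>\n".repeat(200);
const response = maybeGzip("text/html", html, "gzip, deflate");
console.log(Buffer.byteLength(html), "->", response.body.length);
```

Already-compressed formats such as PNG or PDF pass through untouched, since gzipping them costs CPU without shrinking the payload much.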
15
Gzip Configuration
• Apache 1.3 uses mod_gzip, Apache 2.x uses mod_deflate
• Apache 2.x: mod_deflate
AddOutputFilterByType DEFLATE text/html text/css application/x-javascript
• Nginx
gzip on;
gzip_comp_level 6;
gzip_types text/plain text/css text/javascript
application/javascript application/json
application/x-javascript text/xml application/xml
application/xml+rss;
16
5. Put Stylesheets at the Top
• Research at Yahoo! shows that moving stylesheets to the HEAD makes pages appear to load faster.
– It allows the page to render progressively.
• If stylesheets are put at the bottom, the user is stuck viewing a blank white page.
– The browser blocks rendering to avoid redrawing elements.
• The HTML specification says the same thing.
• http://stevesouders.com/examples/rule-css-top.php
17
6. Put Scripts at the Bottom
• Browsers download no more than two components in parallel per hostname.
• Scripts block parallel downloads across all hostnames.
• Scripts block rendering of everything below them in the page.
• A script moved to the bottom must never use document.write.
• The script DEFER attribute is not a solution.
• http://stevesouders.com/examples/rule-js-bottom.php
18
7. Avoid CSS Expressions
• Powerful but dangerous way to set CSS properties dynamically.
• Supported since IE5 but deprecated from IE8.
background-color: expression( (new Date()).getHours()%2 ? "#B8D4FF" : "#F08A00" );
• Expressions are re-evaluated very often (> 10,000 times)
– mouse move, key press, resize, scroll, etc.
• http://stevesouders.com/examples/rule-expr.php
19
8. Make JavaScript and CSS External
• Using external files produces faster pages. Why?
– JS and CSS files are cached.
– Inlining increases the document’s size.
– Many documents can reuse them.
• To reduce HTTP requests on the front page:
– inline JavaScript and CSS, but dynamically download the external files after the page has finished loading.
• http://stevesouders.com/examples/rule-inline.php
20
DNS Lookup
21
9. Reduce DNS Lookups
• 20-120 ms per lookup.
• Blocks parallel downloads until the lookup completes.
• IE caches
– DnsCacheTimeout: 30 minutes
– KeepAliveTimeout: 1 minute
– ServerInfoTimeout: 2 minutes
• Firefox
– network.dnsCacheExpiration: 1 minute
– network.dnsCacheEntries: 20
– network.http.keep-alive.timeout: 5 minutes
• Reducing hostnames reduces the DNS lookups.
– But it reduces parallel downloads as well.
• The rule of thumb is two per hostname.
• http://stevesouders.com/examples/rule-dns.php
22
Configuration
• Rails
ActionController::Base.asset_host = Proc.new { |source|
  "http://assets#{source.hash % 2 + 1}.placexpert.com"
}
• Nginx
server_name assets1.placexpert.com assets2.placexpert.com;
location / {
  alias /var/www/placexpert/public/;
  if ($request_filename ~* ^.+\.(jpg|jpeg|gif|png|ico|css|js|swf)$) {
    expires max;
    break;
  }
}
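The asset-host idea is to spread static files over a small number of hostnames while always mapping a given file to the same one. A rough JavaScript sketch of that mapping follows; the hash function is an illustrative choice (it is not the hash Ruby's String#hash uses), and the function names are hypothetical:

```javascript
// Simple deterministic string hash, kept as an unsigned 32-bit integer.
function hashString(s) {
  let h = 0;
  for (let i = 0; i < s.length; i++) {
    h = (h * 31 + s.charCodeAt(i)) >>> 0;
  }
  return h;
}

// Hypothetical helper: pick assets1 or assets2 based on the asset path.
function assetHost(source, hostCount = 2) {
  const n = (hashString(source) % hostCount) + 1;
  return `http://assets${n}.placexpert.com`;
}

console.log(assetHost("/images/logo.png") + "/images/logo.png");
```

Determinism matters here: if the same image were served from assets1 on one page and assets2 on another, the browser would cache it twice and download it twice.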
23
Parallel Downloads
24
10. Minify JavaScript and CSS
• Minification reduces file size and response time.
• Popular tools are:
– JSMin
– YUI Compressor
– Google Closure Compiler
• Obfuscation is an alternative optimization.
• Both methods achieve roughly the same size reduction.
• Minify inline JS and CSS as well (about 5% reduction).
• http://stevesouders.com/examples/rule-minify.php
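To make concrete what minification removes, here is a deliberately naive CSS minifier. It is a toy for illustration only and is not how JSMin, YUI Compressor, or Closure Compiler work; it ignores string literals and other edge cases that real tools handle:

```javascript
// Toy minifier: strips comments and unnecessary whitespace from CSS.
function minifyCss(css) {
  return css
    .replace(/\/\*[\s\S]*?\*\//g, "")  // drop /* ... */ comments
    .replace(/\s+/g, " ")              // collapse runs of whitespace
    .replace(/\s*([{};,])\s*/g, "$1")  // trim around braces, semicolons, commas
    .replace(/;}/g, "}")               // drop the last semicolon in a block
    .trim();
}

const source = `
/* navigation */
.nav {
  color: red;
  margin: 0 auto;
}
`;
console.log(minifyCss(source)); // .nav{color: red;margin: 0 auto}
```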
25
11. Avoid Redirects
• Redirects are accomplished using the 301 and 302 status codes.
• Redirects slow down the user experience.
– Nothing can be rendered
– Round-trip request
• Add Expires headers to cache redirects
• Make sure to use the standard 3xx HTTP status codes.
• http://stevesouders.com/examples/rule-redir.php
HTTP/1.1 301 Moved Permanently
Location: http://example.com/newuri
Content-Type: text/html
26
Wasteful redirects
http://astrology.yahoo.com/astrology =>
http://astrology.yahoo.com/astrology/
• This is fixed in Apache by using Alias or mod_rewrite, or the DirectorySlash directive if you're using Apache handlers.
• Nginx
– rewrite ^(.*[^/])$ $1/ permanent;
27
12. Remove Duplicate Scripts
• A typical problem
– 2 of the 10 top sites contain duplicate scripts
• Hurts performance
– extra HTTP requests (IE only)
– extra executions (even when cached)
• http://stevesouders.com/examples/rule-js-dupes.php
28
13. Configure ETags
• ETag stands for Entity Tag.
• A unique identifier that is more flexible than the Last-Modified date.
• Doesn’t work well across a cluster of servers.
• To turn off in Apache:
FileETag none
• http://stevesouders.com/examples/rule-etags.php
HTTP/1.1 200 OK
Last-Modified: Tue, 12 Dec 2006 03:03:59 GMT
ETag: "10c24bc-4ab-457e1c1f"
Content-Length: 12195
GET /i/yahoo.gif HTTP/1.1
Host: us.yimg.com
If-Modified-Since: Tue, 12 Dec 2006 03:03:59 GMT
If-None-Match: "10c24bc-4ab-457e1c1f"
HTTP/1.1 304 Not Modified
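The exchange above is a conditional GET: the browser sends back the ETag it holds, and the server answers 304 with no body if the component is unchanged. A minimal sketch of the server-side decision, assuming the ETag is derived from an MD5 of the content (an assumption for illustration; it is not the scheme Apache uses by default), with hypothetical function names:

```javascript
const crypto = require("crypto");

// Assumed strategy: hash the bytes, so identical content always gets the
// same quoted ETag value.
function makeEtag(body) {
  return '"' + crypto.createHash("md5").update(body).digest("hex") + '"';
}

// Hypothetical handler: header names are expected in lower case.
function respond(requestHeaders, body) {
  const etag = makeEtag(body);
  if (requestHeaders["if-none-match"] === etag) {
    // The cached copy is still valid: send headers only.
    return { status: 304, headers: { ETag: etag }, body: "" };
  }
  return { status: 200, headers: { ETag: etag }, body };
}

const first = respond({}, "GIF89a...");
const second = respond({ "if-none-match": first.headers.ETag }, "GIF89a...");
console.log(first.status, second.status); // 200 304
```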
29
14. Make Ajax Cacheable
• Expires or Cache-Control header
• Adding a timestamp to the URL
– &t=1190241612
• When the data has been modified, send the request with a new timestamp.
• http://stevesouders.com/examples/rule-ajax.php
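The timestamp technique can be sketched as a small URL builder. The function name, the endpoint, and the parameter names other than t are hypothetical; the assumption is that the application knows when the underlying data last changed:

```javascript
// Hypothetical helper: lastModified is the Unix time (in seconds) at which
// the data behind this Ajax response last changed.
function ajaxUrl(base, params, lastModified) {
  const query = new URLSearchParams({ ...params, t: String(lastModified) });
  return `${base}?${query.toString()}`;
}

console.log(ajaxUrl("/addressbook", { user: "alice" }, 1190241612));
// /addressbook?user=alice&t=1190241612
```

As long as the data is unchanged the URL stays the same and the response can be served from the browser cache; a modification produces a new t value and therefore a fresh request.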
30
15. Flush the Buffer Early
• Don’t let the browser idle.
• In PHP, use flush() to send data partially.
• Good for busy backends or light frontends.
• Flushing after the HEAD
... <!-- css, js -->
</head>
<?php flush(); ?>
<body>
... <!-- content -->
• Yahoo! search
31
16. Use GET for AJAX Requests
• Browsers implement POST as a two-step process:
– sending the headers first,
• Best to use GET.
• POST without posting any data behaves like GET.
• GET is meant for retrieving information
• POST is for sending data to be stored server-side.
32
17. Post-load Components
• What's absolutely required in order to render the page initially?
• The others can wait.
• Post-loading should include
– drag and drop
– animations
– hidden content
– images below the fold.
• Load them in the onLoad event.
• Tools
– YUI Image Loader
– YUI Get utility
• Yahoo! Home Page
33
Post-Onload Script
• inline in front page
window.onload = downloadComponents;
function downloadComponents() {
var elem = document.createElement("script");
elem.src = "http://.../file1.js";
document.body.appendChild(elem);
...
}
• speeds up secondary pages
34
18. Preload Components
• Preload what you’ll need in the future.
• Unconditional preload
• Conditional preload - make an educated guess
• Anticipated preload - preload in advance before redesign
35
19. Reduce the Number of DOM Elements
• The number of DOM elements:
document.getElementsByTagName('*').length
• More bytes to download and slower DOM access.
• Using nested tables for layout purposes?
• Are you throwing in more <div>s only to fix layout issues?
• Try YUI CSS utilities: grids.css, fonts.css and reset.css
• And how many DOM elements are too many?
• The Yahoo! Home Page is a pretty busy page and still under 700 elements (HTML tags).
36
20. Split Components Across Domains
• Maximize parallel downloads.
• 2-4 sub-domains only because of the DNS lookup penalty
– static1.example.org,
– static2.example.org
• Maximizing Parallel Downloads in the Carpool Lane
37
21. Minimize the Number of iframes
• Pros
– Helps with slow third-party content like badges and ads
– Security sandbox
– Download scripts in parallel
• Cons
– Costly even if blank
– Blocks page onload
– Non-semantic
38
22. No 404s
• A useless request.
39
23. Reduce Cookie Size
• Authentication and personalization only
• Small impact on the response time.
• When the Cookie Crumbles
– Eliminate unnecessary cookies
– Keep cookie sizes small
– Set cookies at the appropriate domain level
– Set an Expires date appropriately
GET /index.html HTTP/1.1
Host: www.example.org
HTTP/1.1 200 OK
Content-type: text/html
Set-Cookie: name=value
Set-Cookie: name2=value2; Expires=Wed, 09 Jun 2021 10:18:14 GMT
GET /spec.html HTTP/1.1
Host: www.example.org
Cookie: name=value; name2=value2
Accept: */*
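Every cookie set on a domain travels with each later request to that domain, so the cost multiplies with the number of components. A back-of-the-envelope sketch of that overhead, using the cookies from the exchange above; the function names and the figure of 30 requests per page are assumptions for illustration:

```javascript
// Serialize cookies the way a browser sends them in the Cookie header.
function cookieHeader(cookies) {
  return Object.entries(cookies)
    .map(([name, value]) => `${name}=${value}`)
    .join("; ");
}

// Bytes added by the Cookie header line across a number of requests.
function cookieOverheadBytes(cookies, requestCount) {
  const line = `Cookie: ${cookieHeader(cookies)}\r\n`;
  return Buffer.byteLength(line) * requestCount;
}

const cookies = { name: "value", name2: "value2" };
console.log(cookieHeader(cookies));            // name=value; name2=value2
console.log(cookieOverheadBytes(cookies, 30)); // 1020
```

Two tiny cookies already add about 1 KB of upload per page at 30 requests; real session and tracking cookies are usually much larger.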
40
Impact of cookies on response time
41
24. Use Cookie-free Domains for Components
• Create a www site and a subdomain
– that subdomain is cookie-free
• Buy a whole new domain if cookies are already set on the domain without www.
– Yahoo! uses yimg.com,
– YouTube uses ytimg.com,
– Amazon uses images-amazon.com and so on.
• Some proxies might refuse to cache the components that are requested with cookies.
42
25. Minimize DOM Access
• Accessing DOM elements is slow
– Cache references
– Update nodes "offline" and add them later
– Avoid fixing layout with JavaScript
• High Performance Ajax Applications
43
26. Develop Smart Event Handlers
• Using event delegation is a good approach.
• Use the DOMContentLoaded event instead of onload.
• High Performance Ajax Applications
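Event delegation means attaching one handler to a container instead of one handler per child element. A minimal sketch of the lookup logic follows; the function name is hypothetical, and the nodes only need a parentNode property, so the demo uses plain objects instead of a real DOM:

```javascript
// Walk from the event target up to the container, looking for a node that
// satisfies `matches`. Returns true if the handler was invoked.
function delegate(container, matches, handler) {
  return function onEvent(event) {
    let node = event.target;
    while (node && node !== container) {
      if (matches(node)) {
        handler(node, event);
        return true;
      }
      node = node.parentNode;
    }
    return false;
  };
}

// Stand-in nodes for <ul><li><span>…</span></li></ul>.
const list = { tagName: "UL", parentNode: null };
const item = { tagName: "LI", parentNode: list };
const label = { tagName: "SPAN", parentNode: item };

const onClick = delegate(list, (n) => n.tagName === "LI", (n) => {
  console.log("clicked item:", n.tagName);
});
onClick({ target: label }); // logs "clicked item: LI"
```

In a browser the returned function would be registered once with list.addEventListener("click", onClick), however many items the list contains.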
44
27. Choose <link> over @import
• One of the previous best practices states that CSS should be at the top in order to allow for progressive rendering.
• In IE @import behaves the same as using <link> at the bottom of the page, so it's best not to use it.
45
28. Avoid Filters
• The IE-proprietary AlphaImageLoader filter blocks rendering and freezes the browser while the image is being downloaded.
• It also increases memory consumption and is applied per element, not per image.
• Avoid AlphaImageLoader completely and use gracefully degrading PNG8 instead, which is fine in IE.
46
29. Optimize CSS Sprites
• Arrange the images horizontally rather than vertically; this usually results in a smaller file size.
• Combining similar colors in a sprite helps you keep the color count low, ideally under 256 colors so as to fit in a PNG8.
• Don't leave big gaps between the images in a sprite.
– The user agent then needs less memory to decompress the image.
47
30. Don't Scale Images in HTML
• Don't use a bigger image than you need.
• If you need <img width="100" height="100" src="mycat.jpg" alt="My Cat" /> then mycat.jpg should be 100x100px.
48
31. Make favicon.ico Small and Cacheable
• The browser will always request it, so it's better not to respond with a 404.
– Cookies are sent every time it's requested.
– Also interferes with the download sequence in IE.
• Solutions
– Make it small, preferably under 1K.
– Set an Expires header with a lifetime you feel comfortable with.
• ImageMagick can help you create small favicons
49
32. Keep Components under 25K
• iPhone won't cache components > 25K (uncompressed size).
• Minification is important because gzip alone may not be sufficient.
• Performance Research, Part 5: iPhone Cacheability - Making it Stick
50
33. Avoid Empty Image src
• An image with an empty string src attribute occurs more often than one would expect. It appears in two forms:
– Straight HTML
<img src="">
– JavaScript
var img = new Image();
img.src = "";
• The browser makes another request to your server.
• This can cripple your servers by sending a large amount of unexpected traffic.
• Wastes server computing cycles.
• Possibly corrupts user data.
• HTML5 instructs browsers not to make an additional request.
• Empty image src can destroy your site
51
Developer Tools
• Firebug
• YSlow
• http://pagespeed.googlelabs.com/
52
References
• http://yuiblog.com/blog/2006/11/28/performance-research-part-1/
• http://yuiblog.com/blog/2007/01/04/performance-research-part-2/
• http://yuiblog.com/blog/2007/03/01/performance-research-part-3/
• http://yuiblog.com/blog/2007/04/11/performance-research-part-4/
• http://developer.yahoo.com/performance/rules.html
• http://stevesouders.com/examples/rules.php
• http://www.slideshare.net/w3guru/high-performance-websites-by-souders-steve-presentation
53