Shared Dictionary Compression Over Http Presentation

Shared-Dictionary Compression over HTTP (SDCH) Wei-Hsin Lee June 2008

description

The Shared Dictionary Compression over HTTP (SDCH) protocol aims to reduce data redundancy across HTTP responses. It is meant to work alongside current schemes (gzip, deflate) to compress HTTP responses further. It differs from the originally proposed RFC 3229 (differential compression) in that it does not require the browser to cache the last version of each page.

Transcript of Shared Dictionary Compression Over Http Presentation

Page 1: Shared Dictionary Compression Over Http Presentation

Shared-Dictionary Compression over HTTP (SDCH)

Wei-Hsin Lee, June 2008

Page 2: Shared Dictionary Compression Over Http Presentation

Why do we care?

• Speeding up Google and the Web
– The faster the Web is, the more useful it is.
– The faster Google web search is, the more searches people do.
– Lots of users still suffer from slow networks, for example in developing countries.

Page 3: Shared Dictionary Compression Over Http Presentation

Reduce transmission time

• Reducing payload size is the key.
• Gzip works well as the compression for each individual response.
• What about common data shared by a group of pages (inter-response redundancy), or pages that change often but only slightly?
• Only transmit the data that is common to each response once.
• Thereafter, send only the parts of the response that differ.
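The effect of inter-response redundancy can be illustrated with a small sketch. It does not use SDCH or VCDIFF; it substitutes zlib's preset-dictionary feature (`zdict`) as a stand-in for a shared dictionary, and the `SHARED` boilerplate and page content are made up for the example.

```python
import zlib

# Hypothetical boilerplate assumed to appear in every page of a group.
SHARED = (
    b'<html><head><title>Example search</title></head>'
    b'<body><div class="header">Example navigation bar</div>'
)


def compress(payload: bytes, dictionary: bytes = b"") -> bytes:
    """Deflate payload, optionally priming the compressor with a dictionary."""
    if dictionary:
        comp = zlib.compressobj(level=9, zdict=dictionary)
    else:
        comp = zlib.compressobj(level=9)
    return comp.compress(payload) + comp.flush()


def decompress(blob: bytes, dictionary: bytes = b"") -> bytes:
    """Inflate blob; the same dictionary must be supplied if one was used."""
    if dictionary:
        decomp = zlib.decompressobj(zdict=dictionary)
    else:
        decomp = zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()


if __name__ == "__main__":
    page = SHARED + b"<p>result 1</p></body></html>"
    print("plain deflate:  ", len(compress(page)), "bytes")
    print("with dictionary:", len(compress(page, SHARED)), "bytes")
```

Because both sides already hold `SHARED`, the compressed response only needs to carry a reference to it plus the part of the page that differs.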

Page 4: Shared Dictionary Compression Over Http Presentation

Why not RFC 3229?

• RFC 3229 “Delta encoding in HTTP”
– Good for saving bandwidth
• But
– Too many states for the server to track
  • The number of possible states of www.google.com/search is larger than the number of all possible search results.
– Only applicable to the same URL
  • Discourages aggressive caching.
– No benefit for similar pages that don’t share a URL.

Page 5: Shared Dictionary Compression Over Http Presentation

Shared-Dictionary Compression over HTTP (SDCH)

• An addition to HTTP
• Small set of states (dictionaries) shared between client and server.
• Dictionaries are scoped by domain name and path, just like cookies. This allows a dictionary to apply to multiple URLs.
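Cookie-like scoping can be sketched as a simple match on host and path. This is a simplified illustration, not the matching rules of the SDCH specification; the function name `dictionary_applies` and the example domain are hypothetical.

```python
from urllib.parse import urlsplit


def dictionary_applies(dict_domain: str, dict_path: str, url: str) -> bool:
    """Return True if a dictionary scoped to (domain, path) covers the URL.

    Simplified cookie-style rule (an assumption for illustration):
    the host must equal the domain or be a subdomain of it, and the
    URL path must start with the dictionary's path prefix.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    domain = dict_domain.lower().lstrip(".")
    domain_ok = host == domain or host.endswith("." + domain)
    path_ok = (parts.path or "/").startswith(dict_path)
    return domain_ok and path_ok


if __name__ == "__main__":
    for url in (
        "http://www.example.com/search?q=a",
        "http://www.example.com/search?q=b",
        "http://www.example.com/maps",
    ):
        print(url, dictionary_applies("example.com", "/search", url))
```

One dictionary scoped to `/search` covers every query URL under that path, which is what lets a small set of dictionaries serve many distinct URLs.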

Page 6: Shared Dictionary Compression Over Http Presentation

SDCH protocol details

• SDCH defines
– How the client informs the server of its capability and state.
– How the server should respond to the client when the client is SDCH capable.
– How dictionaries get loaded into the client.
• Implements the VCDIFF (RFC 3284) differential compression format with enhancements
– Interleave instructions with data so that each network packet can be decoded as it arrives. (chunked encoding)
– Checksum to ensure data integrity
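The idea of interleaved instructions plus a checksum can be sketched with a toy delta decoder. This is not the VCDIFF wire format: the tuple-based instruction stream, the restriction of COPY to the dictionary only, and the use of Adler-32 as the checksum are all assumptions made for illustration.

```python
import zlib
from typing import Iterable, Tuple, Union

# Toy instructions: ("COPY", offset, length) copies bytes from the
# dictionary; ("ADD", data) appends literal bytes carried in the stream.
Instruction = Union[Tuple[str, int, int], Tuple[str, bytes]]


def apply_delta(
    dictionary: bytes,
    instructions: Iterable[Instruction],
    expected_checksum: int,
) -> bytes:
    """Rebuild a response from a dictionary and a stream of instructions."""
    out = bytearray()
    for ins in instructions:
        # Each instruction carries its own data, so output can grow as
        # soon as the instruction arrives, without waiting for the rest.
        if ins[0] == "COPY":
            _, offset, length = ins
            out += dictionary[offset:offset + length]
        elif ins[0] == "ADD":
            out += ins[1]
        else:
            raise ValueError(f"unknown instruction: {ins[0]!r}")
    # Integrity check over the fully decoded response.
    if zlib.adler32(bytes(out)) != expected_checksum:
        raise ValueError("checksum mismatch")
    return bytes(out)


if __name__ == "__main__":
    dictionary = b"<html><body>Hello, </body></html>"
    target = b"<html><body>Hello, world</body></html>"
    delta = [("COPY", 0, 19), ("ADD", b"world"), ("COPY", 19, 14)]
    print(apply_delta(dictionary, delta, zlib.adler32(target)))
```

Only the literal `world` travels in the delta; everything else is referenced from the dictionary the client already holds.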

Page 7: Shared Dictionary Compression Over Http Presentation

Example 1

Page 8: Shared Dictionary Compression Over Http Presentation

Example 2

Page 9: Shared Dictionary Compression Over Http Presentation

Other details

• Complement to Gzip or Deflate.
– Should be applied before Gzip
• Lab results
– About 40 percent better data reduction than Gzip alone on Google search.
– Faster Google search results, especially under low-bandwidth, high-latency conditions.
• Working on the best way to get this out to users.

Page 10: Shared Dictionary Compression Over Http Presentation

Your help counts!

• Please join the group
– http://groups.google.com/group/SDCH
– The protocol spec and the encoder/decoder code will be there soon.
• Getting your hands dirty is even better!
– Make your web site use SDCH.
– Make Squid or Apache web servers SDCH capable.

Page 11: Shared Dictionary Compression Over Http Presentation

Don’t forget to join the group.

http://groups.google.com/group/SDCH