How GZIP compression works - JS Conf EU 2014
How GZIP compression works
Raúl Fraile
JSConf EU, Berlin
About me
• PHP/JS software developer.
• MS (Res) student in Computing Technologies.
• Made in Spain.
Data compression
Not an expert*
Data compression is an AMAZING topic. REALLY!
It can be seen like … magic
flickr.com/photos/jeffkrause/6799254170
flickr.com/photos/t_e_brown/8677750589
… it’s not
Information theory
Claude Shannon
Entropy
flickr.com/photos/95303997@N07/10074330416
$H = -\sum_{x} p(x) \log_2 p(x)$
Average amount of information contained in each message ≈ number of bits to represent the message
225 days/year 62 %
17 days/year 6 %
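As a minimal sketch of the entropy formula, the function below estimates H for a string. It assumes p(x) is simply the relative frequency of each character; the name `shannon_entropy` is illustrative and not from the talk.

```python
import math
from collections import Counter


def shannon_entropy(message: str) -> float:
    """Return the entropy in bits per symbol, using character frequencies as p(x)."""
    counts = Counter(message)
    total = len(message)
    # Each term is -p(x) * log2 p(x); rare symbols contribute more bits each.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


print(shannon_entropy("AAAA"))  # one repeated symbol carries no information
print(shannon_entropy("ABAB"))  # two equally likely symbols need one bit each
```

Multiplying the result by the message length gives a rough lower bound on the bits needed when symbols are coded independently.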
flickr.com/photos/aigle_dore/5952296478
flickr.com/photos/mariano-mantel/13955110319
The human brain is designed to compress data
flickr.com/photos/birthintobeing/11841180046
flickr.com/photos/neolao/3105372669
flickr.com/photos/tommiephotography/6840025942
flickr.com/photos/earlysound/2186172726
Morse code: shorter sequences for common characters
flickr.com/photos/amboo213/9044879245
Data compression in HTTP
GET index.html
Accept-Encoding: gzip, deflate
GZIP + HTTP
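The negotiation can be sketched as follows: a server compresses the body only when the client lists gzip in Accept-Encoding. The function `encode_body` is a hypothetical helper, and the header parsing is deliberately simplified (q-values are ignored).

```python
import gzip


def encode_body(body: bytes, accept_encoding: str) -> tuple[bytes, dict[str, str]]:
    """Gzip the body only if the client listed gzip in Accept-Encoding.

    Simplification: q-values such as "gzip;q=0" are not honoured.
    """
    offered = {token.split(";")[0].strip().lower() for token in accept_encoding.split(",")}
    if "gzip" in offered:
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}  # fall back to sending the body unchanged


payload, headers = encode_body(b"<p>hello</p>" * 100, "gzip, deflate")
print(headers, len(payload))
```

The client reverses the step with `gzip.decompress` when it sees `Content-Encoding: gzip` in the response.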
GZIP compression
• Based on the DEFLATE algorithm.
• DEFLATE was designed by Phil Katz.
• DEFLATE is used in HTTP, PNG and PDF.
GZIP
DEFLATE
LZ77 + Huffman coding
LZ77 (variation)
THIS FILE IS HUGE! THAT'S BECAUSE THE FILE IS NOT COMPRESSED
Search buffer (up to 32 KB) · Look-ahead
<33, 9>: the second " FILE IS " repeats text found 33 characters back, 9 characters long
Literals · Lengths · Distances
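To make the <33, 9> pair concrete, here is a brute-force sketch of the match search. It assumes the pair is read as (distance, length); `longest_match` is an illustrative name. Real DEFLATE encoders use hash chains and also enforce a minimum match length of 3 and a maximum of 258, which this sketch omits.

```python
def longest_match(data: str, pos: int, window: int = 32 * 1024) -> tuple[int, int]:
    """Return (distance, length) of the longest earlier match for data[pos:].

    Brute-force scan of the search buffer; returns (0, 0) when nothing matches.
    """
    best_distance, best_length = 0, 0
    for start in range(max(0, pos - window), pos):
        length = 0
        # The match may run past pos into the look-ahead, which LZ77 allows.
        while pos + length < len(data) and data[start + length] == data[pos + length]:
            length += 1
        if length > best_length:
            best_distance, best_length = pos - start, length
    return best_distance, best_length


text = "THIS FILE IS HUGE! THAT'S BECAUSE THE FILE IS NOT COMPRESSED"
second_file = text.index(" FILE", 5)  # position of the second " FILE IS "
print(longest_match(text, second_file))
```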
Huffman coding
HELLO WORLD
01001000 01000101 01001100 01001100 01001111 00100000 01010111 01001111 01010010 01001100 01000100
88 bits
Fixed-length codes
H 000 · E 001 · L 010 · O 011 · W 100 · R 101 · D 110 · _ 111
000 001 010 010 011 111 100 011 101 010 110
33 bits
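The 88-bit and 33-bit totals can be reproduced with a short sketch. It assigns codes in sorted symbol order, so the individual code words differ from the table, but the width and the total are the same.

```python
import math

message = "HELLO WORLD"
alphabet = sorted(set(message))  # 8 distinct symbols, including the space
width = math.ceil(math.log2(len(alphabet)))  # bits needed to number every symbol
codes = {symbol: format(index, f"0{width}b") for index, symbol in enumerate(alphabet)}
encoded = "".join(codes[symbol] for symbol in message)

print(len(message) * 8)  # size with one byte per character
print(width, len(encoded))  # code width and size with fixed-length codes
```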
Huffman coding
Variable-length codes
Character frequency:
Char  Freq  Code
L     3     0
O     2     1
H     1     00
E     1     01
W     1     10
R     1     11
D     1     000
_     1     001
HELLO WORLD
0001001001101110000
19 bits
It’s ambiguous: HE, LHO, DO…
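The ambiguity can be checked mechanically. The sketch below enumerates every way to split a bit string into code words from the naive table; `decodings` is an illustrative helper, not something shown in the talk.

```python
def decodings(bits: str, codes: dict[str, str]) -> list[str]:
    """List every way to split bits into code words from the table."""
    if not bits:
        return [""]
    results = []
    for symbol, code in codes.items():
        if bits.startswith(code):
            # Consume this code word and decode the remainder recursively.
            results += [symbol + rest for rest in decodings(bits[len(code):], codes)]
    return results


naive = {"L": "0", "O": "1", "H": "00", "E": "01", "W": "10", "R": "11", "D": "000", "_": "001"}
print(decodings("0001", naive))  # the first four bits of the 19-bit message
```

Because some code words are prefixes of others, even four bits admit several readings. A prefix-free table yields exactly one.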
Huffman coding
Char  Freq  Code
L     3     10
O     2     111
H     1     001
E     1     1100
W     1     011
R     1     000
D     1     1101
_     1     010
HELLO WORLD
001 1100 10 10 111 010 011 111 000 10 1101
32 bits
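A minimal sketch of Huffman's construction follows, repeatedly merging the two least frequent subtrees. Tie-breaking is arbitrary, so the code words it produces may differ from the table while the total stays at 32 bits. The name `huffman_codes` is illustrative.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(message: str) -> dict[str, str]:
    """Build a prefix-free code table by merging the two rarest subtrees until one remains."""
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(freq, next(tiebreak), {symbol: ""}) for symbol, freq in Counter(message).items()]
    heapq.heapify(heap)
    if not heap:
        return {}
    if len(heap) == 1:  # a single distinct symbol still needs one bit
        return {symbol: "0" for symbol in heap[0][2]}
    while len(heap) > 1:
        weight_a, _, table_a = heapq.heappop(heap)
        weight_b, _, table_b = heapq.heappop(heap)
        # Prefix one subtree with 0 and the other with 1.
        merged = {s: "0" + c for s, c in table_a.items()}
        merged.update({s: "1" + c for s, c in table_b.items()})
        heapq.heappush(heap, (weight_a + weight_b, next(tiebreak), merged))
    return heap[0][2]


codes = huffman_codes("HELLO WORLD")
encoded = "".join(codes[ch] for ch in "HELLO WORLD")
print(codes)
print(len(encoded))
```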
Huffman coding
Table 1: literals + lengths
Table 2: distances
Blocks
[M] Block 1 · [M] Block 2 · [M] … · [M] Block N
Mode 1: no compression
Mode 2: fixed code tables
Mode 3: generated code tables
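The mode of a block can be observed in real output. In the DEFLATE format each block starts with a 1-bit BFINAL flag followed by a 2-bit BTYPE field (0 for no compression, 1 for fixed tables, 2 for generated tables). The sketch below reads that field from a raw DEFLATE stream produced by Python's `zlib`; `first_block_type` is an illustrative helper.

```python
import zlib


def first_block_type(data: bytes, level: int) -> int:
    """Return BTYPE of the first DEFLATE block: 0 stored, 1 fixed tables, 2 generated tables."""
    # Negative wbits asks zlib for a raw DEFLATE stream without a zlib or gzip header.
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    stream = compressor.compress(data) + compressor.flush()
    # Header bits are packed from the least significant bit: BFINAL first, then BTYPE.
    return (stream[0] >> 1) & 0b11


print(first_block_type(b"hello", 0))  # level 0 stores the data uncompressed
print(first_block_type(b"hello", 6))  # short inputs typically use the fixed tables
```

Larger inputs with a skewed symbol distribution usually make the encoder switch to generated tables, since the cost of transmitting the tables is then outweighed by the shorter codes.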
flickr.com/photos/functoruser/2436979033
GZIP compression: implementations
GNU gzip · Zopfli · 7-Zip
Modes: fast, normal, high compression
General rule: more time, better compression ratio
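The time versus ratio rule can be tried with the compression levels exposed by Python's `gzip` module (1 is fastest, 9 is slowest). The sample text below is synthetic and only an assumption for illustration, so the exact sizes will differ on real data.

```python
import gzip
import random

rng = random.Random(2014)  # fixed seed so the sample text is reproducible
words = ["gzip", "deflate", "huffman", "lz77", "block", "window", "literal", "distance"]
sample = " ".join(rng.choice(words) for _ in range(20000)).encode()

# Compress the same input at a fast, a default and a thorough setting.
sizes = {level: len(gzip.compress(sample, compresslevel=level)) for level in (1, 6, 9)}
print(len(sample), sizes)
```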
GZIP compression: why GZIP?
• Good compression ratio.
• Fast to (un)compress.
• In the worst case, expands the data slightly.
• Memory independent.
• Free implementations that avoid patents.
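The worst-case claim in the list above can be checked with incompressible input. This sketch feeds random bytes to Python's `gzip` module and prints how much the output grows; the 100,000-byte size is an arbitrary choice.

```python
import gzip
import os

noise = os.urandom(100_000)  # random bytes have no redundancy to exploit
packed = gzip.compress(noise)

# The growth comes from the gzip header and trailer plus per-block overhead.
print(len(noise), len(packed), len(packed) - len(noise))
```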
Tradeoff
Newer algorithms
Issues trying to add bzip2 support to Chrome
GZIP compression: beyond GZIP
Preprocess data to optimize matches
GZIP(T(data)) < GZIP(data)
Transposing JSON
{ "name": "John", "country": "USA" }, { "name": "Stephan", "country": "Germany" }, { "name": "Rob", "country": "USA" }
{ "name": [ "John", "Stephan", "Rob" ], "country": [ "USA", "Germany", "USA" ] }
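A sketch of the transposition and its inverse is shown below, assuming every record has the same keys. The helper names `transpose` and `untranspose` are illustrative. With only three records the fixed gzip overhead dominates, so any gain is more likely to show on larger payloads.

```python
import gzip
import json


def transpose(rows: list[dict]) -> dict[str, list]:
    """Turn a list of records into one list per key (assumes identical keys in every record)."""
    return {key: [row[key] for row in rows] for key in rows[0]}


def untranspose(columns: dict[str, list]) -> list[dict]:
    """Inverse of transpose: rebuild the original list of records."""
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*columns.values())]


rows = [
    {"name": "John", "country": "USA"},
    {"name": "Stephan", "country": "Germany"},
    {"name": "Rob", "country": "USA"},
]
columns = transpose(rows)
for label, value in (("rows", rows), ("columns", columns)):
    raw = json.dumps(value).encode()
    print(label, len(raw), len(gzip.compress(raw)))  # raw size, gzipped size
```

The receiver must apply `untranspose` after decompression, so the transformation has to be agreed on by both sides.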
XML/HTML attributes order
<input id='f1' class='field' name="f1" type="text" /> <input class="field" id="f2" type="text" name="f2" />
<input id="f1" class="field" name="f1" type="text" /> <input class="field" id="f2" type="text" name="f2" />
<input id="f1" class="field" name="f1" type="text" /> <input id="f2" class="field" name="f2" type="text" />
<input type="text" class="field" id="f1" name="f1" /> <input type="text" class="field" id="f2" name="f2" />
17.76%
27.10%
38.32%
38.32%
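One way to get consistent quoting and attribute order is to normalize tags before compression. The sketch below is regex-based, handles only simple self-closing tags like the ones shown, and sorts attributes alphabetically, which is an assumption: any fixed order would serve. `normalize_tag` is a hypothetical helper.

```python
import re

# Matches name="value" or name='value'; the backreference requires matching quotes.
ATTRIBUTE = re.compile(r"""([\w-]+)=(["'])(.*?)\2""")


def normalize_tag(tag: str) -> str:
    """Rewrite one self-closing tag with double quotes and alphabetically sorted attributes."""
    name = re.match(r"<\s*([\w-]+)", tag).group(1)
    attributes = sorted((key, value) for key, _, value in ATTRIBUTE.findall(tag))
    rendered = " ".join(f'{key}="{value}"' for key, value in attributes)
    return f"<{name} {rendered} />"


first = normalize_tag("""<input id='f1' class='field' name="f1" type="text" />""")
second = normalize_tag("""<input class="field" id="f2" type="text" name="f2" />""")
print(first)
print(second)
```

After normalization the two tags differ only in their id and name values, which gives the LZ77 stage longer repeated runs to match.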
http://goo.gl/GgMw26
References
“Compressor Head”, Colt McAnlis
“Data Compression: The Complete Reference”, David Salomon
“A Universal Algorithm for Sequential Data Compression”, Jacob Ziv & Abraham Lempel
“A Method for the Construction of Minimum-Redundancy Codes”, David A. Huffman
Thank you
Raúl Fraile · @raulfraile