A Benchmark for HTTP 2.0 Header Compression · Header cleaning I Apply substitution ciphers...

14
A Benchmark for HTTP 2.0 Header Compression Christian Grothoff Technische Universit¨ at M¨ unchen 31.07.2013 https://gnunet.org/httpbenchmark/

Transcript of A Benchmark for HTTP 2.0 Header Compression · Header cleaning I Apply substitution ciphers...

Page 1: A Benchmark for HTTP 2.0 Header Compression · Header cleaning I Apply substitution ciphers depending on payload: I Same substitution for all URIs in a trace I Same substitution for

A Benchmark for HTTP 2.0 Header Compression

Christian Grothoff

Technische Universitat Munchen

31.07.2013

https://gnunet.org/httpbenchmark/

Page 2: A Benchmark for HTTP 2.0 Header Compression · Header cleaning I Apply substitution ciphers depending on payload: I Same substitution for all URIs in a trace I Same substitution for

Goals

I Realistic benchmark based on diverse, real-world user data

I Preserve realistic privacy expectations of monitored users

I Preserve compression characteristics

I Allow assessment of tunneling multiple requests in one stream

I Capture data at high-speed link (no TCP streamreconstruction)

Page 3: A Benchmark for HTTP 2.0 Header Compression · Header cleaning I Apply substitution ciphers depending on payload: I Same substitution for all URIs in a trace I Same substitution for

Goals

I Realistic benchmark based on diverse, real-world user data

I Preserve realistic privacy expectations of monitored users

I Preserve compression characteristics

I Allow assessment of tunneling multiple requests in one stream

I Capture data at high-speed link (no TCP streamreconstruction)

Page 4: A Benchmark for HTTP 2.0 Header Compression · Header cleaning I Apply substitution ciphers depending on payload: I Same substitution for all URIs in a trace I Same substitution for

Methods

I Capture data with libpcap on port 80 that looks like HTTPheader

I Traces are headers with same source IP to same destinationIP within 5 minutes

I Store in sqlite database with trace- and timing information

I Remove IP addresses

I Obscure possibly private data in headers

Page 5: A Benchmark for HTTP 2.0 Header Compression · Header cleaning I Apply substitution ciphers depending on payload: I Same substitution for all URIs in a trace I Same substitution for

Header cleaning

I Apply substitution ciphers depending on payload:I Same substitution for all URIs in a traceI Same substitution for all (expected) occurences of hostnameI Same substitution for cookie values in traceI Fresh, character-set preserving substitutions for other headers

I Write regular expressions for headers that are not cleaned

I Discard headers for which we do not have a regular expression

I Pass internal review for data export

Page 6: A Benchmark for HTTP 2.0 Header Compression · Header cleaning I Apply substitution ciphers depending on payload: I Same substitution for all URIs in a trace I Same substitution for

Header cleaning

I Apply substitution ciphers depending on payload:I Same substitution for all URIs in a traceI Same substitution for all (expected) occurences of hostnameI Same substitution for cookie values in traceI Fresh, character-set preserving substitutions for other headers

I Write regular expressions for headers that are not cleaned

I Discard headers for which we do not have a regular expression

I Pass internal review for data export

Page 7: A Benchmark for HTTP 2.0 Header Compression · Header cleaning I Apply substitution ciphers depending on payload: I Same substitution for all URIs in a trace I Same substitution for

Length distribution for the traces

1

10

100

1000

10000

100000

1 10 100 1000 10000

# t

race

s of

giv

en leng

th

number of requests in trace

Length distribution of the traces

Data set 1Data set 2Data set 3Data set 4Data set 5

The length of a trace is defined as the number of headers in agiven trace.

Page 8: A Benchmark for HTTP 2.0 Header Compression · Header cleaning I Apply substitution ciphers depending on payload: I Same substitution for all URIs in a trace I Same substitution for

Number of bytes per header

1

10

100

1000

10000

100000

0 200 400 600 800 1000 1200 1400

# h

ead

ers

of

giv

en s

ize

number of bytes per header

Size distribution of the headers

Data set 1Data set 2Data set 3Data set 4Data set 5

Page 9: A Benchmark for HTTP 2.0 Header Compression · Header cleaning I Apply substitution ciphers depending on payload: I Same substitution for all URIs in a trace I Same substitution for

Number of key-value pairs in the headers

1

10

100

1000

10000

100000

1e+06

0 10 20 30 40 50 60

# h

ead

ers

wit

h t

his

many v

alu

es

number of values per header

Number of key-value pairs in the headers

Data set 1Data set 2Data set 3Data set 4Data set 5

Page 10: A Benchmark for HTTP 2.0 Header Compression · Header cleaning I Apply substitution ciphers depending on payload: I Same substitution for all URIs in a trace I Same substitution for

Compression (public benchmark)

Algorithm Compressed size Comp. time Decomp. time

memcpy req. 260 ± 44 MB 166 ± 6 ms 148 ± 3 msgzip req. 46 ± 18 MB 9,3 ± 2,2s 2 ± 0,5 sbzip2 req. 215 ± 33 MB 103 ± 14 s 26 ± 4 s

memcpy res. 157 ± 12 MB 155 ± 5 ms 137 ± 3 msgzip res. 30 ± 4 MB 7,1 ± 0,7s 1,4 ± 0,1 sbzip2 res. 138 ± 10 MB 70 ± 5 s 17 ± 1,2 s

Page 11: A Benchmark for HTTP 2.0 Header Compression · Header cleaning I Apply substitution ciphers depending on payload: I Same substitution for all URIs in a trace I Same substitution for

Compression (raw data)

Algorithm Compressed size Comp. time Decomp. time

memcpy req. 265 ± 45 MB 169 ± 4 ms 147 ± 3 msgzip req. 37 ± 6 MB 8,6 ± 1,4s 2,1 ± 0,2 sbzip2 req. 210 ± 31 MB 103 ± 15 s 26 ± 3,9 s

memcpy res. 163 ± 12 MB 159 ± 3 ms 138 ± 3 msgzip res. 29 ± 4 MB 7,4 ± 0,7s 1,6 ± 0,1 sbzip2 res. 138 ± 10 MB 71 ± 4,6s 17 ± 1,1 s

Page 12: A Benchmark for HTTP 2.0 Header Compression · Header cleaning I Apply substitution ciphers depending on payload: I Same substitution for all URIs in a trace I Same substitution for

Publication

I PDF with detailed description of method

I Five sqlite3 databases with ≈ 1 million HTTP headers each

I C source code for capture, cleanup and evaluation

I ODS spreadsheed with statistical analysis of data output by Ccode

I License: Code is GPL, use of data should be attributed

Page 13: A Benchmark for HTTP 2.0 Header Compression · Header cleaning I Apply substitution ciphers depending on payload: I Same substitution for all URIs in a trace I Same substitution for

Do you have any questions?

References:

I Christian Grothoff. A Benchmark for HTTP 2.0 HeaderCompression. https://gnunet.org/httpbenchmark/,2013.

Page 14: A Benchmark for HTTP 2.0 Header Compression · Header cleaning I Apply substitution ciphers depending on payload: I Same substitution for all URIs in a trace I Same substitution for

Content-length distribution