Bitcoin Internals


Page 1: Bitcoin Internals

Bitcoin Internals

Page 2: Bitcoin Internals

Who am I?

James Turner, polyglot programmer

Worked for eBay, BBC, BSkyB

CTO @ magnr.com


Page 3: Bitcoin Internals

Topics

• Binary protocols

• Hashing and probability

• Bloom Filters

• Merkle Trees

• P2P networks and CAP theorem


Page 4: Bitcoin Internals

What is a binary protocol?


Page 5: Bitcoin Internals

What is a binary protocol?

Wikipedia says…

“A binary protocol is a protocol which is intended or expected to be read by a machine rather than a human being, as opposed to a plain text protocol such as IRC, SMTP, or HTTP. Binary protocols have the advantage of terseness, which translates into speed of transmission and interpretation.”

https://en.wikipedia.org/wiki/Binary_protocol

Page 6: Bitcoin Internals

NOT a binary protocol


Page 7: Bitcoin Internals

Our own binary protocol?

We can define our own “Sandwich” protocol as:

1) a 32-bit integer for the number of cheese slices, followed by
2) a 32-bit integer for the number of ham slices

So our binary protocol (assuming big-endian) for a 1,1 sandwich would be:

00000000 00000000 00000000 00000001 00000000 00000000 00000000 00000001

This is a fixed format. There are no variable-sized parts.
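The fixed format above can be sketched with Python's standard `struct` module. This is a minimal illustration, assuming the two integers are unsigned (the slide does not say); the names `SANDWICH_FORMAT`, `encode_sandwich` and `decode_sandwich` are made up for this example.

```python
import struct

# ">" selects big-endian byte order; "I" is an unsigned 32-bit integer.
# Unsigned is an assumption: the slide only says "32 bit Integer".
SANDWICH_FORMAT = ">II"

def encode_sandwich(cheese: int, ham: int) -> bytes:
    """Pack the two slice counts into exactly 8 bytes."""
    return struct.pack(SANDWICH_FORMAT, cheese, ham)

def decode_sandwich(data: bytes) -> tuple:
    """Unpack 8 bytes back into (cheese, ham)."""
    cheese, ham = struct.unpack(SANDWICH_FORMAT, data)
    return cheese, ham

encoded = encode_sandwich(1, 1)
bits = " ".join(f"{byte:08b}" for byte in encoded)
print(bits)
```

Because every field has a known width, the decoder never has to search for a delimiter: it reads 4 bytes, then 4 more.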

https://en.wikipedia.org/wiki/Comparison_of_data_serialization_formats

Page 8: Bitcoin Internals

Binary Protocol Efficiency

Our “Sandwich” protocol uses 8 bytes to transmit the cheese and ham information.

Compare this to JSON, where we might have {"cheese":1,"ham":1}

This is 20 bytes.

In this example, the binary encoding is 60% smaller (8 bytes instead of 20).
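The byte counts can be checked directly. The sketch below assumes the JSON is written without any whitespace, as on the slide, and that the binary form is two big-endian unsigned 32-bit integers.

```python
import json
import struct

sandwich = {"cheese": 1, "ham": 1}

# Compact separators drop the spaces json.dumps adds by default,
# giving the whitespace-free form shown on the slide.
as_json = json.dumps(sandwich, separators=(",", ":")).encode("ascii")
as_binary = struct.pack(">II", sandwich["cheese"], sandwich["ham"])

saving = 1 - len(as_binary) / len(as_json)
print(len(as_json), len(as_binary), f"{saving:.0%}")
```

The gap is large here because the JSON form repeats the field names in every message. With bigger numeric values the JSON grows one byte per digit while the binary form stays at 8 bytes.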

However, a binary protocol is not always human-readable. Printed to a terminal, our 8 bytes would appear as “” (there are 8 bytes here, honestly!).

Page 9: Bitcoin Internals

Variable length binary protocol

The “Message” protocol:

1) a 32-bit integer giving the length, followed by
2) that number of bytes (chars)

So our binary output for 5 “hello” would be:

00000000 00000000 00000000 00000101 01101000 01100101 01101100 01101100 01101111
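A length-prefixed message like this one can be sketched as follows. The example assumes a big-endian unsigned length and ASCII text; `encode_message` and `decode_message` are illustrative names only.

```python
import struct

def encode_message(text: str) -> bytes:
    payload = text.encode("ascii")
    # 4-byte big-endian length prefix, then the raw bytes.
    return struct.pack(">I", len(payload)) + payload

def decode_message(data: bytes) -> str:
    # Read the fixed-size prefix first; it says how many bytes follow.
    (length,) = struct.unpack(">I", data[:4])
    payload = data[4:4 + length]
    if len(payload) != length:
        raise ValueError("truncated message")
    return payload.decode("ascii")

encoded = encode_message("hello")
print(encoded.hex(" "))
print(decode_message(encoded))
```

The reader only learns the size of the variable part after parsing the fixed part, which is why the length has to come first.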

https://en.wikipedia.org/wiki/Comparison_of_data_serialization_formats

Page 10: Bitcoin Internals

Bitcoin protocol

https://en.bitcoin.it/wiki/Protocol_documentation#Message_structure

Block

Message Header
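As a rough sketch of the message header described in the linked protocol documentation: a 4-byte magic value, a 12-byte NUL-padded command name, a 4-byte payload length and a 4-byte checksum (the first four bytes of the double SHA-256 of the payload), with the integers in little-endian order. The function name `message_header` is an assumption for this example, and only the header is built, not a full network client.

```python
import hashlib
import struct

MAINNET_MAGIC = 0xD9B4BEF9  # appears on the wire as f9 be b4 d9

def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def message_header(command: str, payload: bytes) -> bytes:
    checksum = double_sha256(payload)[:4]
    return struct.pack(
        "<I12sI4s",               # "<" = little-endian, no padding
        MAINNET_MAGIC,
        command.encode("ascii"),  # "12s" pads short names with NUL bytes
        len(payload),
        checksum,
    )

# "verack" carries no payload, so the message is just the 24-byte header.
header = message_header("verack", b"")
print(len(header), header.hex())
```

Note the mix of ideas from the earlier slides: the header is a fixed format, and its length field is what makes the variable-sized payload readable.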

Page 11: Bitcoin Internals

What is hashing?

A function that takes an input of arbitrary size and produces an output of fixed size.

e.g. sha256(“hello”) produces “2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824”

Hashing has certain properties which we find useful:

• It’s extremely hard to reverse, i.e. to calculate the original data from the hash.

• If the input data changes even slightly, the hash output is completely different.
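Both properties are easy to observe with Python's `hashlib`. In this small sketch the second input “hellp” is an arbitrary choice for illustration; it differs from “hello” in one character.

```python
import hashlib

def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

a = sha256_hex("hello")
b = sha256_hex("hellp")  # one character changed

# Count how many of the 256 output bits differ between the two digests.
differing_bits = bin(int(a, 16) ^ int(b, 16)).count("1")

print(a)
print(b)
print(len(a), len(b), differing_bits)
```

Both digests are 64 hex characters (256 bits) regardless of input length, and for a well-behaved hash roughly half of the output bits are expected to flip after a one-character change.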


Page 12: Bitcoin Internals

Hashing collision probability

If 2 different pieces of input data produce the same output hash, we have a “collision”.

Given the random nature of hashing, what is the probability that any 2 given pieces of input data generate identical hashes?

Suppose the output of the hashing function is a single byte, e.g. 01101011 or 01111111 or 01110001.

If every output is equally likely, there is a 1 in 256 (2^8) chance of a collision.

This can be generalised to 1/(2^n), where n is the number of output bits.
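The 1/(2^n) figure can be checked empirically. The sketch below builds a toy 8-bit hash by keeping only the first byte of SHA-256 (an assumption made for this example, not something from the slides) and counts how often two unrelated inputs collide.

```python
import hashlib

def one_byte_hash(data: bytes) -> int:
    # Keep only the first byte of SHA-256, so n = 8 output bits.
    return hashlib.sha256(data).digest()[0]

def collision_rate(pairs: int) -> float:
    """Fraction of input pairs whose one-byte hashes are identical."""
    collisions = 0
    for i in range(pairs):
        left = one_byte_hash(f"left-{i}".encode())
        right = one_byte_hash(f"right-{i}".encode())
        if left == right:
            collisions += 1
    return collisions / pairs

rate = collision_rate(200_000)
print(f"observed {rate:.5f}, predicted {1 / 2**8:.5f}")
```

The observed rate should land close to 1/256. With the full 256-bit output the same reasoning gives 1/(2^256) for any one pair, which is why collisions are not a practical concern there.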


Page 13: Bitcoin Internals

Merkle Trees

Root Hash 1234 = hash(Hash01, Hash23)

Hash01 = hash(Hash0, Hash1)
Hash23 = hash(Hash2, Hash3)

Hash0 = hash(TX1)
Hash1 = hash(TX2)
Hash2 = hash(TX3)
Hash3 = hash(TX4)

Data: TX1, TX2, TX3, TX4
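The tree above can be computed bottom-up. This is a minimal sketch, assuming double SHA-256 as the hash (which is what Bitcoin uses) and plain placeholder byte strings in place of real serialized transactions.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(transactions: list) -> bytes:
    if not transactions:
        raise ValueError("need at least one transaction")
    # Leaf level: Hash0..Hash3 in the diagram.
    level = [double_sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Bitcoin pairs an odd last hash with itself.
            level.append(level[-1])
        # Each parent is the hash of its two children concatenated.
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]

txs = [b"TX1", b"TX2", b"TX3", b"TX4"]  # placeholder data, not real transactions
root = merkle_root(txs)
print(root.hex())
```

Changing any single transaction changes its leaf hash, then every hash on the path above it, and so the root.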

Page 14: Bitcoin Internals

Verify a TX using Merkle Trees

Verify TX8 exists in block

Root: H12345678

Level 2: H1234 H5678

Level 1: H12 H34 H56 H78

Leaves: H1 H2 H3 H4 H5 H6 H7 H8

Data: TX1 TX2 TX3 TX4 TX5 TX6 TX7 TX8

https://en.bitcoin.it/wiki/Merged_mining_specification#Merkle_Branch
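To verify TX8, a verifier needs only the sibling hash at each level on the path to the root (H7, H56 and H1234) plus the root itself, not all eight transactions. A minimal sketch, assuming double SHA-256, a power-of-two number of leaves and placeholder transaction bytes; the function names are illustrative.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def build_levels(leaves):
    """Return every level of the tree, leaf hashes first, root last.
    Assumes the number of leaves is a power of two, as in the slide."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def merkle_branch(levels, index):
    """Collect the sibling hash at each level on the path to the root."""
    branch = []
    for level in levels[:-1]:
        branch.append(level[index ^ 1])  # flip the low bit to get the sibling
        index //= 2
    return branch

def verify(leaf, index, branch, root):
    node = h(leaf)
    for sibling in branch:
        # Even index: node is the left child; odd index: the right child.
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

txs = [f"TX{i}".encode() for i in range(1, 9)]  # placeholder data
levels = build_levels(txs)
root = levels[-1][0]
branch = merkle_branch(levels, 7)  # TX8 sits at index 7
print(len(branch), verify(b"TX8", 7, branch, root))
```

The branch grows with the logarithm of the number of transactions: 3 hashes for 8 transactions, and only about 10 for 1,000.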

Page 15: Bitcoin Internals

Merkle Blocks


Merkle Block Message

https://github.com/bitcoin/bips/blob/master/bip-0037.mediawiki

Page 16: Bitcoin Internals

Bloom Filters

Let’s assume we have 2 hash functions, “f” and “g”.

f(x) and g(x) each produce a pseudo-random number.

e.g.

f(“hello”) => 123

g(“hello”) => 192

https://en.wikipedia.org/wiki/Bloom_filter

Page 17: Bitcoin Internals

Bloom Filters

We have an array of bits, let’s say 8 (this will fit in a single byte). The empty bitset looks like this:

0 0 0 0 0 0 0 0


Page 18: Bitcoin Internals

Bloom Filters

Taking the modulus (%) of each hash output with 8 (the size of the bitset), we get the following:

123 % 8 = 3

192 % 8 = 0

We now mark positions 0 and 3 as “1” bits.
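The marking step can be sketched with an integer used as the bitset. The values 123 and 192 are the example outputs from the slides; printing position 0 as the rightmost bit is a choice made for this example.

```python
BITSET_SIZE = 8

def set_bits(bitset: int, hash_outputs) -> int:
    for value in hash_outputs:
        position = value % BITSET_SIZE  # map the hash output onto the bitset
        bitset |= 1 << position         # mark that position as a "1" bit
    return bitset

bitset = set_bits(0, [123, 192])  # f("hello"), g("hello") from the slides
print(f"{bitset:08b}")  # position 0 is the rightmost bit
```

Setting a bit that is already 1 changes nothing, which is why adding the same item twice leaves the filter unchanged.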


Page 19: Bitcoin Internals

Bloom Filters


f(“hello”) g(“hello”) f(“world”) g(“world”)

Page 20: Bitcoin Internals

Bloom Filters (exists)


f(“world”) g(“world”) f(“bar”) g(“bar”)

Page 21: Bitcoin Internals

Bloom Filters (false positives)


f(“foo”) g(“foo”)
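Adding, testing for existence and false positives can be put together in a small sketch. The hash functions here are stand-ins built from salted SHA-256, an assumption made for this example; Bitcoin's BIP 37 filters use MurmurHash3 instead. The class and method names are illustrative.

```python
import hashlib

class BloomFilter:
    def __init__(self, size_bits: int = 8, num_hashes: int = 2):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item: str):
        # Stand-ins for f and g: SHA-256 salted with the hash index.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest, "big") % self.size_bits

    def add(self, item: str) -> None:
        for position in self._positions(item):
            self.bits |= 1 << position

    def might_contain(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits >> p & 1 for p in self._positions(item))

bloom = BloomFilter()
bloom.add("hello")
bloom.add("world")
print(bloom.might_contain("hello"), bloom.might_contain("world"))
```

An item that was added is always reported as possibly present. An item that was never added is reported as present whenever all of its positions happen to have been set by other items, which becomes more likely as the 8 bits fill up.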

Page 22: Bitcoin Internals

Bloom Filters (error rate)


m is the number of bits in our bitset: m = 8

k is the number of hashing functions: k = 2 (hash functions f & g)

n is the number of items represented: n = 1 (“hello”)

The probability of a single bit NOT being set is (1 - 1/8)^2, or more generally (1 - 1/m)^k.

As n grows, this becomes (1 - 1/m)^(kn).
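These expressions are straightforward to evaluate. The sketch below also computes a false positive rate as (1 - (1 - 1/m)^(kn))^k; that last formula is not on the slide but is the standard approximation, which treats each bit as set independently of the others.

```python
def prob_bit_unset(m: int, k: int, n: int) -> float:
    """Probability that one given bit is still 0 after inserting n items."""
    return (1 - 1 / m) ** (k * n)

def false_positive_rate(m: int, k: int, n: int) -> float:
    # A lookup is a false positive when all k probed bits happen to be set.
    # Standard approximation, assuming the bits are independent.
    return (1 - prob_bit_unset(m, k, n)) ** k

for n in (1, 2, 4, 8):
    unset = prob_bit_unset(8, 2, n)
    fpr = false_positive_rate(8, 2, n)
    print(n, round(unset, 4), round(fpr, 4))
```

For the slide's values (m = 8, k = 2, n = 1) the chance that a given bit is still unset is (7/8)^2 = 0.765625. The false positive rate rises quickly with n, which is why real filters use far more than 8 bits.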

Page 23: Bitcoin Internals

Bloom Filter properties

• Memory compaction (lots of items in a small space)

• Possible existence (and false positives)

• Collision probability determined by number of items/number of bits


Page 24: Bitcoin Internals

Bloom Filters in Bitcoin

filterload, filteradd, filterclear

Page 25: Bitcoin Internals

CAP theorem

• Consistency

• Availability

• Partition Tolerance


https://en.wikipedia.org/wiki/CAP_theorem

The CAP theorem is a negative result that says you cannot simultaneously achieve all three goals in the presence of errors. Hence, you must pick one objective to give up.

http://cacm.acm.org/blogs/blog-cacm/83396-errors-in-database-systems-eventual-consistency-and-the-cap-theorem/fulltext

Page 26: Bitcoin Internals

CAP theorem in P2P networks


A node in the Bitcoin network

Page 27: Bitcoin Internals

Availability

Figure: a network of nodes in which two are Unavailable (marked X) and the rest are Available.

Network: Available

Page 28: Bitcoin Internals

Partitioning

Figure: a Partitioned network, with broken links marked X and TX messages between nodes.

Page 29: Bitcoin Internals

Consistency

Figure: some nodes hold Block 222B while the others hold Block 222A.

Network: Inconsistent

Page 30: Bitcoin Internals

Consistency

Figure: every node holds Block 223.

Network: Consistent

Page 31: Bitcoin Internals

Other P2P protocols

• BitTorrent (Distributed Hash Tables)

• Gnutella (Query Routing Tables)

https://en.wikipedia.org/wiki/List_of_P2P_protocols

Page 32: Bitcoin Internals

Questions?
