Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

40
Erlang, random numbers, and the security 1 Kenji Rikitake 10-SEP-2013 for London Erlang User Group meetup 1 Sunday, September 8, 13

description

London Erlang User Group presentation on Erlang and the random numbers

Transcript of Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Page 1: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Erlang, random numbers, and the security

1

Kenji Rikitake10-SEP-2013

for London Erlang User Group meetup

1Sunday, September 8, 13

Page 2: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Executive summary

2

Random number generation is very

diffcult to do properly; so don’t try

to invent yours!

2Sunday, September 8, 13

Page 3: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Dilbert’s definition of randomness (NOT)

3

http://dilbert.com/strips/comic/2001-10-25/

http://insanity-works.blogspot.jp/2008/05/debianubuntu-serious-opensslssh.html

3Sunday, September 8, 13

Page 4: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

What does randomness mean? (Wikipedia)

4

• Commonly, it means lack of pattern or predictability in events

•Non-order or non-coherence in a sequence, no intelligible pattern or combination

• Statistically, random processes still follow a probabilistic distribution

4Sunday, September 8, 13

Page 5: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

“Pseudo” randomness?

• The results of deterministic computations can be predicted; there is no real randomness from those algorithms

• A predictive sequence generator with a very long period (e.g., (2^19937)-1 times of computations) can be used as a list of pseudo random number generator (PRNG)

• Each generation needs a seed for to determine the starting point (aka Initialization Vector)

5

5Sunday, September 8, 13

Page 6: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Then what is “real” randomness?

• Mostly depending on the physical events which is very unlikely to be predictable by the consumer of the randomness itself

• You can’t really avoid the interference of the exterior environment

• But at least statistically whether a sequence is random or not can be measured

6

6Sunday, September 8, 13

Page 7: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Entropy pool and the level of randomness• Entropy is the unit of randomness

• If 32-bit word of complete random number exists, it has 32 bits of entropy

• If the 32-bit word takes only four different values and each value has a 25% chance of occurring, the word has 2 bits of entropy

• The Dilbert’s case has ZERO bit of entropy

7

Source: Neils Ferguson, Bruce Schneier, and Tadayoshi Kohno: Cryptography Engineering, Chapter 9 (Generating Randomness), p. 137, John Wiley and Sons, 2010, ISBN-13: 9780470474242

7Sunday, September 8, 13

Page 8: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Sources of “real” high-entropy sources

• Thermal noise (of resistors)

• Avalanche noise of Zener diodes

• Optical noise (e.g., LavaCan)

• Radio noise (static) (random.org)

• Rolling the dice (Dice-O-Matic)

• Very expensive (= low entropy bit rate)

• Practical for seeding purpose only

• Not necessarily statistically uniform or gaussian

8

8Sunday, September 8, 13

Page 9: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Avalanche diode noise generator example

9

Source: http://www.erlang-factory.com/conference/SFBay2011/speakers/kenjirikitake

9Sunday, September 8, 13

Page 10: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Arduino Duemilanove and the noise generator

10

Source: http://www.erlang-factory.com/conference/SFBay2011/speakers/kenjirikitake

10Sunday, September 8, 13

Page 11: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Can OS gather entropy from various events in a computer? • Possible events: hard disk access duration,

mouse/keyboard timing, ethernet packet arrival timing, wall-clock values, etc.

• Note: those events can only happen after when the system is booted; at the boot time the entropy pool is virtually empty

• Solutions: use an external entropy source to gather sufficient entropy, or wait for the enough entropy to be gathered and filled into the pool

11

Source: Nadia Heninger, Zakir Durumeric, Eric Wustrow, J. Alex Halderman: Your Ps and Qs: Detection of Widespread Weak Keys in Network Devices, Proceedings of the 21st USENIX Security Symposium, August 2012, https://factorable.net/paper.html

11Sunday, September 8, 13

Page 12: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

So what are good PRNGs?

• Output distribution

• uniform distribution: equally covering the all possible values in the same probability

• Gaussian/normal distribution: representing the sum of many independent random values

• The generation period length should be very large (especially when sharing the same algorithm between multiple processes)

12

12Sunday, September 8, 13

Page 13: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Gaussian/normal distribution

13

Source: http://en.wikipedia.org/wiki/File:Standard_deviation_diagram.svg

13Sunday, September 8, 13

Page 14: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Uniformity is hard to achieve in the real world

14

http://nakedsecurity.sophos.com/2013/07/09/anatomy-of-a-pseudorandom-number-generator-visualising-cryptocats-buggy-prng/

An example: Cryptocat's buggy PRNG written in JavaScript

14Sunday, September 8, 13

Page 15: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Another example: PHP5 rand(0,1) on Windows: see the pattern?

15

http://boallen.com/random-numbers.html

15Sunday, September 8, 13

Page 16: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Android Bitcoin Wallets had PRNG vulnerability to let 55 BTC stolen

16

• Java’s SecureRandom class (wasn’t secure enough on Android actually)

• “In order to remain secure the random numbers used to generate private keys must be nondeterministic, meaning that the output of the generator cannot be predicted”

• “Android phones/tablets are weak and some signatures have been observed to have colliding R values”

http://thegenesisblock.com/security-vulnerability-in-all-android-bitcoin-wallets/

16Sunday, September 8, 13

Page 17: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Executive summary (again)

17

Random number generation is very diffcult to do properly; so don’t try to invent yours!

http://imgs.xkcd.com/comics/random_number.png

17Sunday, September 8, 13

Page 18: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

So how Erlang/OTP does on PRNGs?

18

• random module: non-secure PRNG (Wichmann-Hill AS183, in 1982)

• crypto module: cryptographically secure PRNG, OpenSSL API in NIFs

• For generating passwords and keys, always use the crypto module

• There was a bug in SSH module using random function instead of crypto (CVE-2011-0766, discovered by Geoff Cant, fixed on R14B03)

18Sunday, September 8, 13

Page 20: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

20

%% Code of AS183 PRNG%% from Erlang/OTP R16B01 lib/stdlib/src/random.erl%% This hasn’t been really changed at least since R14B02

-define(PRIME1, 30269).-define(PRIME2, 30307).-define(PRIME3, 30323).

uniform() -> {A1, A2, A3} = case get(random_seed) of undefined -> seed0(); Tuple -> Tuple end, B1 = (A1*171) rem ?PRIME1, B2 = (A2*172) rem ?PRIME2, B3 = (A3*170) rem ?PRIME3, put(random_seed, {B1,B2,B3}), R = B1/?PRIME1 + B2/?PRIME2 + B3/?PRIME3, R - trunc(R).

20Sunday, September 8, 13

Page 21: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Issues on random module, aka AS183

21

• The period length is short (~ 2^43)

• Very old design (in 1980s)

• Not designed for parallelism

• Only three 16-bit integers as the seed

• Not designed for speed (float division)

• Good thing: it’s pure Erlang code

21Sunday, September 8, 13

Page 22: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Goals for solving issues on random module • Larger period length PRNG needed

• New or modern design PRNGs preferred

• Addressing parallelism requirements

• Non-overlapping = orthogonal sequences

• Larger state length for seeding

• Availability on both with and without NIFs

22

22Sunday, September 8, 13

Page 23: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Required/suggested features on parallelism• Each PRNG has to manage its own state

• Independent = orthogonal sequences from the same algorithm

• Splitting the internal state

• Using orthogonal polynomials

• Seed jumping: fast calculation for advancing the internal state

23

23Sunday, September 8, 13

Page 24: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Alternatives I implented for the random module• sfmt-erlang

• tinymt-erlang

• Both available at GitHub

• Details available as ACM Erlang Workshop papers, as well as from my own web site

• These are non-secure PRNGs

24

24Sunday, September 8, 13

Page 25: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Mersenne Twister

• BSD licensed

• By Matsumoto and Nishimura, 1996

• Very long period ((2^19937) -1)

• Uniformly distributed in 623-dimension hypercube, mathematically proven

• Widely used on many open-source languages: R, Python, mruby, etc.

25

25Sunday, September 8, 13

Page 26: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

SIMD-oriented Fast Mersenne Twister (SFMT)• Improved MT by Saito and Matsumoto

(2006)

• SIMD-oriented, optimized for x86(_64)

• Various period length supported

• ((2^607)-1) ~ ((2^216091)-1)

• Faster recovery from 0-excess initial state

26

26Sunday, September 8, 13

Page 27: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

sfmt-erlang

• A straight-forward implementation of SFMT in both pure Erlang and with NIFs

• Suggested period length: (2^19937-1)

• with NIFS: >x3 faster than random module

• A choice on PropEr Erlang test tool

27

27Sunday, September 8, 13

Page 28: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Porting tips of SFMT code from C to Erlang

• Make it reentrant: removed all static arrays for the internal state table (no mutable data structure)

• SFMT itself can be written as a recursion

a[X] = r(a[X-N], a[X-(N-POS1)], a[X-1], a[X-2])

• An Erlang way: adding elements to the heads and do the lists:reverse/1 made the code 50% faster than using the ++ operator!

28

28Sunday, September 8, 13

Page 29: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Speed difference between C and Erlang• On sfmt-erlang

• Pure Erlang code: x300 slower than C

• Erlang NIF code: still x10 slower than C

• Some dilemma: speed optimization is important, but Erlang VM can hardly beat the native C code

29

29Sunday, September 8, 13

Page 30: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

TinyMT

• MT for smaller memory footprint (and where shorter period length is OK)

• Saito and Matsumoto, 2011

• Period: ((2^127)-1), still much > 2^43

• Seed: seven 32-bit unsigned integers

• Parallel execution oriented features: ~2^56 independent generation polynomials

30

30Sunday, September 8, 13

Page 31: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

tinymt-erlang

• A straight-forward implementation of TinyMT in both pure Erlang and with NIFs

• Some of sfmt-erlang code are equally applicable (e.g., seed initialization)

• x86_64 HiPE optimization: x3 speed

• Wall-clock speed: roughly the same as random module

• In fprof: x2~x6 slower than random module

31

31Sunday, September 8, 13

Page 32: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Observations on tinymt-erlang code• 32-bit integer ops are on BIGNUMs

• Small integer on 32bit Erlang: max 28 bits

• Overhead of calling functions and BEAM memory allocation might have taken a significant portion of exec time

• sfmt-erlang batch exec is still x3~4 faster than tinymt-erlang batch exec

32

32Sunday, September 8, 13

Page 33: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

NIF scheduling issue

• NIF blocks the Erlang scheduler

• How long can a CPU spend time in a NIF?

• Rule of thumb: <1msec per exec

• Target for sfmt/tinymt-erlang: < 0.1msec

• On Riak this has been a serious issue

• Scott Fritchie has an MD5 example code

• A preference for the pure Erlang code

33

33Sunday, September 8, 13

Page 34: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

TinyMT precomputed polynomial parameters• On 2011, I decided to pre-compute the

polynomial parameters at Kyoto University’s super-computer (x86_64) cluster (I could only use two 16-core nodes (32 cores))

• 2^28 (~256M) parameter sets are available for both 32- and 64-bit implementations

• Took ~32 days of continuous execution

• 18~19 sets/sec for each core

34

34Sunday, September 8, 13

Page 35: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

“Secure” PRNGs• One-way function: guessing input or internal state

from the output should be kept practically impossible

• A PRNG is less secure when:

• generation period is shorter

• easy function exists to guess the internal state from the output

• the output sequence is predictable from the past output sequence data (= a simple PRNG is unsecure)

35

35Sunday, September 8, 13

Page 36: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

For building more secure PRNGs

• Reducing predictability: always incorporate necessary external extropy

• Cryptographic strength: use well-proven algorithms, e.g., AES, to “encrypt” the PRNG output

• Rule of thumb: use crypto:strong_rand_bytes/1 to guarantee sufficient entropy is consumed

36

36Sunday, September 8, 13

Page 37: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Tips for parallel execution of PRNGs

• Use completely orthogonal sequences

• e.g., TinyMT independent polynomials

• If the seeding space is large, it is unlikely that PRNG sequences from the seeds generated from another PRNGs will collide with each other (but this is not mathematically proven)

• Beware of execution sequence change when predictability is important

37

37Sunday, September 8, 13

Page 38: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Tips on secure PRNGs

• Don’t invent yours unless you hire a well-experienced cryptographers (i.e., stick to proven functions)

• Ensure the system has enough entropy to guarantee the unpredictability

• An external hardware RNG is suggested

• And don’t let NSA cripple the algorithms!

38

38Sunday, September 8, 13

Page 39: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Executive summary again

39

Random number

generation is very

diffcult to do properly; so don’t try

to invent yours!

http://imgs.xkcd.com/comics/security_holes.png

39Sunday, September 8, 13

Page 40: Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

Thanks

Questions?

40

40Sunday, September 8, 13