Lies, Damned Lies, and Substrings

Post on 22-Jan-2018

HASEEB QURESHI

SOFTWARE ENGINEER @

Let me tell you a story about a time Ruby lied to me.

A coworker and I were arguing about an algorithm.

Him

Me

It started with a classic problem:

How to generate all of the substrings of a string?

Hello

H, e, l, l, o

He, el, ll, lo

Hel, ell, llo

Hell, ello

Hello

Hello (i = 0, j = 3)

Hello (i = 1, j = 4)

Each substring is defined by a unique start and end index.

def substrings(str)
  (0...str.length).each_with_object([]) do |i, subs|
    (i...str.length).each do |j|
      subs << str[i..j]
    end
  end
end

There are quadratically many pairs of indices, so the inner loop body runs O(n²) times.

Me: This algorithm is O(n²).
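The quadratic count can be checked directly. Here is a minimal sketch that counts the (i, j) pairs the two loops visit and compares the result with n(n + 1)/2. The helper name count_index_pairs is hypothetical and not part of the talk's code.

```ruby
# Hypothetical helper: counts the (i, j) pairs visited by the nested loops,
# without building any substrings.
def count_index_pairs(n)
  count = 0
  (0...n).each do |i|
    (i...n).each { |_j| count += 1 }
  end
  count
end

[5, 10, 100].each do |n|
  puts "n = #{n}: #{count_index_pairs(n)} pairs, n(n + 1)/2 = #{n * (n + 1) / 2}"
end
```

For "Hello" (n = 5) this gives 15 pairs, one per substring in the list above.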

def substrings(str)
  (0...str.length).each_with_object([]) do |i, subs|
    (i...str.length).each do |j|
      subs << str[i..j]
    end
  end
end

But what about what’s inside the loop?

How long does it actually take to build a substring?

(We’re going to assume fixed-width [ASCII/UTF-32] strings for simplicity.)

(Also, Ruby treats strings shorter than 24 characters differently, but we can ignore that for large n.)

Memory

str: H e l l o (memory cells 8fe0 8fe1 8fe2 8fe3 8fe4 8fe5)

str2 = str[1..3]: e l l (memory cells 52a0 52a1 52a2 52a3 52a4 52a5)

Obviously, copying each substring takes linear time.

That is, linear in the length of the average substring.

O(1)? O(log n)? O(n)?

H, e, l, l, o

He, el, ll, lo

Hel, ell, llo

Hell, ello

Hello

… Which is how long?

require_relative 'substrings'

def average_substring_ratio(original_string_length)
  str = 'a' * original_string_length
  substring_lengths = substrings(str).map(&:length)
  average_substring_length = substring_lengths.reduce(:+)
                                              .fdiv(substring_lengths.count)

  average_substring_length / original_string_length
end

(1..150).step(5).each do |count|
  puts "#{count}: #{average_substring_ratio(count)}"
end

1: 1.0
6: 0.4444444444444444
11: 0.3939393939393939
16: 0.375
21: 0.3650793650793651
26: 0.358974358974359
31: 0.3548387096774194
36: 0.35185185185185186
41: 0.34959349593495936
46: 0.34782608695652173
51: 0.34640522875817
56: 0.34523809523809523
61: 0.3442622950819672
66: 0.3434343434343434
71: 0.3427230046948357
76: 0.34210526315789475
81: 0.34156378600823045
86: 0.34108527131782945
91: 0.34065934065934067
96: 0.34027777777777773
101: 0.33993399339933994
106: 0.33962264150943394
111: 0.3393393393393393
116: 0.339080459770115
121: 0.33884297520661155
126: 0.3386243386243386
131: 0.3384223918575064
136: 0.3382352941176471
141: 0.3380614657210402
146: 0.33789954337899547

(You can also prove this mathematically.)

As n → ∞, the average substring length approaches ⅓ n.
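One way to sketch that proof: a string of length n has n − k + 1 substrings of length k, so the average length is the total number of characters divided by the total number of substrings. This derivation is not from the slides; it is a standard counting argument and it agrees with the measured ratios (for n = 6 it gives 8/18 ≈ 0.444).

```latex
\[
\text{average length}
= \frac{\sum_{k=1}^{n} k\,(n-k+1)}{\sum_{k=1}^{n} (n-k+1)}
= \frac{n(n+1)(n+2)/6}{n(n+1)/2}
= \frac{n+2}{3},
\qquad
\lim_{n\to\infty} \frac{1}{n}\cdot\frac{n+2}{3} = \frac{1}{3}.
\]
```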

H, e, l, l, o

He, el, ll, lo

Hel, ell, llo

Hell, ello

Hello

So the average substring grows linearly with the original string.

def substrings(str)
  (0...str.length).each_with_object([]) do |i, subs|
    (i...str.length).each do |j|
      subs << str[i..j]
    end
  end
end

Thus, this copy is O(n).

So this whole thing takes O(n³) time.

Colleague: Not so fast. (Or slow.)

Enter COW (copy-on-write)

Copy-on-write is a kind of structural sharing.

Memory

str: H e l l o (memory cells 8fe0 8fe1 8fe2 8fe3 8fe4 8fe5)

str2 = str[1..3] (str_ptr: 8fe1, length: 3)

Here’s the proof.

require_relative 'display_string' # credit to Pat Shaughnessy

debug = Debug.new

str = ('a'..'z').to_a.join
str2 = str.dup

debug.display_string(str)
# DEBUG: RString = 0x7f98fb05b090
# DEBUG: ptr = 0x7f98fc0aa970 -> "abcdefghijklmnopqrstuvwxyz"
# DEBUG: len = 26

debug.display_string(str2)
# DEBUG: RString = 0x7f98fb05afa0
# DEBUG: ptr = 0x7f98fc0aa970 -> "abcdefghijklmnopqrstuvwxyz"
# DEBUG: len = 26

Pointer to same string in memory!

require_relative 'display_string' # credit to Pat Shaughnessy

debug = Debug.new

str = ('a'..'z').to_a.join
str2 = str[1..-1]

debug.display_string(str)
# DEBUG: RString = 0x7f98fb05b090
# DEBUG: ptr = 0x7f98fc0aa970 -> "abcdefghijklmnopqrstuvwxyz"
# DEBUG: len = 26

debug.display_string(str2)
# DEBUG: RString = 0x7f98fb05afa0
# DEBUG: ptr = 0x7f98fc0aa971 -> "bcdefghijklmnopqrstuvwxyz"
# DEBUG: len = 25

Still the same string, but now offset by 1.

What happens if either string gets mutated?

require_relative 'display_string' # credit to Pat Shaughnessy

debug = Debug.new

str = ('a'..'z').to_a.join
str2 = str[1..-1]
str[1] = '&'

debug.display_string(str)
# DEBUG: RString = 0x7fa2a304fbf8
# DEBUG: ptr = 0x7fa2a2f1f170 -> "a&cdefghijklmnopqrstuvwxyz"
# DEBUG: len = 26

debug.display_string(str2)
# DEBUG: RString = 0x7fa2a304fae0
# DEBUG: ptr = 0x7fa2a2f50b11 -> "bcdefghijklmnopqrstuvwxyz"
# DEBUG: len = 25

The write forced a copy to a new string in memory.
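The idea can be modeled in a few lines of plain Ruby. This is only a toy sketch: SharedView is a hypothetical class, not how MRI actually implements strings, and which side receives the private copy on a write is an arbitrary choice here.

```ruby
# Toy model of copy-on-write substrings. SharedView is a hypothetical class
# used only for illustration; it is not MRI's real string implementation.
class SharedView
  attr_reader :buffer, :offset, :length

  def initialize(buffer, offset = 0, length = buffer.length)
    @buffer = buffer # backing storage, possibly shared with other views
    @offset = offset
    @length = length
  end

  # O(1): the new view points into the same buffer; no characters are copied.
  def slice(start, len)
    SharedView.new(@buffer, @offset + start, len)
  end

  # O(length): a write first copies this view's characters into a private
  # buffer, so other views of the old buffer are unaffected.
  def write(index, char)
    @buffer = @buffer[@offset, @length].dup
    @offset = 0
    @buffer[index] = char
  end

  def to_s
    @buffer[@offset, @length]
  end
end

str  = SharedView.new('Hello')
str2 = str.slice(1, 3)
puts str2.buffer.equal?(str.buffer) # true: both views share one buffer

str.write(1, '&')
puts str.to_s                       # H&llo
puts str2.to_s                      # ell
puts str2.buffer.equal?(str.buffer) # false: the write forced a copy
```

Slicing stays constant-time because it only records an offset and a length. The linear cost is deferred until someone writes.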

Memory

str: H e l l o (memory cells 8fe0 8fe1 8fe2 8fe3 8fe4 8fe5), callbacks: [str2]

str2 = str[1..3] (str_ptr: 8fe1, length: 3)

str[1] = '&'

Memory

str: H e l l o (memory cells 8fe0 8fe1 8fe2 8fe3 8fe4 8fe5), callbacks: [str2]

str2 = str[1..3]: e l l (memory cells 52a0 52a1 52a2 52a3 52a4 52a5)

Memory

str: H & l l o (memory cells 8fe0 8fe1 8fe2 8fe3 8fe4 8fe5)

str2 = str[1..3]: e l l (memory cells 52a0 52a1 52a2 52a3 52a4 52a5)

So…

def substrings(str)
  (0...str.length).each_with_object([]) do |i, subs|
    (i...str.length).each do |j|
      subs << str[i..j]
    end
  end
end

This is a shallow copy, which is actually O(1).

And this whole thing takes O(n²) time.

Case closed.

require_relative 'substrings'
require 'benchmark'

str = 'abcdefgh' * 128
str2 = str * 2

benchmarks = Benchmark.bmbm do |bm|
  bm.report(str.length) do
    substrings(str)
  end

  bm.report(str2.length) do
    substrings(str2)
  end
end

# Interpolate the ratio: String + Float would raise a TypeError.
puts "Growth: #{benchmarks[1].real / benchmarks[0].real}"

Rehearsal ----------------------------------------
1024   0.290000   0.070000   0.360000 (  0.357953)
2048   2.360000   0.500000   2.860000 (  2.876344)
------------------------------- total: 3.220000sec

           user     system      total        real
1024   0.270000   0.070000   0.340000 (  0.338351)
2048   2.200000   0.400000   2.600000 (  2.601713)

Growth: 7.689380300623611

When the input doubles, the time grows by a factor of 8.

This algorithm is not quadratic.

wat

require 'benchmark'
NUM_TIMES = 100_000

str = 'abcde' * 2 ** 10
str2 = str * 2

Benchmark.bmbm do |bm|
  bm.report(str.length) do
    NUM_TIMES.times { str[1..-1] }
  end

  bm.report(str2.length) do
    NUM_TIMES.times { str2[1..-1] }
  end
end

Rehearsal -----------------------------------------
...
---------------------------------------------------

            user     system      total        real
5120    0.020000   0.000000   0.020000 (  0.021144)
10240   0.020000   0.000000   0.020000 (  0.020291)

That sure looks like copy-on-write optimization…

require 'benchmark'
NUM_TIMES = 100_000

str = 'abcde' * 2 ** 10
str2 = str * 2

Benchmark.bmbm do |bm|
  bm.report(str.length) do
    NUM_TIMES.times { str[1..-2] }
  end

  bm.report(str2.length) do
    NUM_TIMES.times { str2[1..-2] }
  end
end

Rehearsal -----------------------------------------
...
---------------------------------------------------

            user     system      total        real
5120    0.110000   0.060000   0.170000 (  0.171367)
10240   0.200000   0.140000   0.340000 (  0.347153)

Only substrings that include the last character are copy-on-write.

So it turns out:

the vast majority of substrings don’t include the last character.

H, e, l, l, o

He, el, ll, lo

Hel, ell, llo

Hell, ello

Hello
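How small is the minority that does include the last character? A rough sketch, using a hypothetical helper share_ending_at_last_char: out of n(n + 1)/2 index pairs, exactly n have j equal to n − 1, one for each start index.

```ruby
# Hypothetical helper: fraction of (i, j) index pairs whose substring
# ends at the last character (j == n - 1).
def share_ending_at_last_char(n)
  total  = n * (n + 1) / 2 # all (i, j) pairs with i <= j
  shared = n               # one pair per start index i, with j fixed at n - 1
  shared.fdiv(total)       # simplifies to 2 / (n + 1)
end

[5, 1024, 2048].each do |n|
  percent = (share_ending_at_last_char(n) * 100).round(2)
  puts "n = #{n}: #{percent}% of substrings include the last character"
end
```

The share shrinks like 2/(n + 1), so for the benchmark sizes almost every substring takes the copying path.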

And, of course,

def substrings(str)
  (0...str.length).each_with_object([]) do |i, subs|
    (i...str.length).each do |j|
      subs << str[i..j]
    end
  end
end

So this on average is linear.

And this whole thing is O(n³).

It was all a lie.

WHY HAVE YOU BETRAYED ME

RUBY

Naturally…

Hmm.

¯\_(ツ)_/¯

ᕕ( ᐛ )ᕗ

Let’s…

… recompile Ruby…?

maml004775hquresh:ruby haseeb_qureshi$ make install
CC = clang
LD = ld
LDSHARED = clang -dynamic -bundle
CFLAGS = -O3 -fno-fast-math -ggdb3 -Wall -Wextra -Wno-unused-parameter -Wno-parentheses -Wno-long-long -Wno-missing-field-initializers -Wno-tautological-compare -Wno-parentheses-equality -Wno-constant-logical-operand -Wno-self-assign -Wunused-variable -Werror=implicit-int -Werror=pointer-arith -Werror=write-strings -Werror=declaration-after-statement -Werror=shorten-64-to-32 -Werror=implicit-function-declaration -Werror=division-by-zero -Werror=deprecated-declarations -Werror=extra-tokens -pipe
XCFLAGS = -D_FORTIFY_SOURCE=2 -fstack-protector -fno-strict-overflow -fvisibility=hidden -DRUBY_EXPORT -fPIE
CPPFLAGS = -D_XOPEN_SOURCE -D_DARWIN_C_SOURCE -D_DARWIN_UNLIMITED_SELECT -D_REENTRANT -I. -I.ext/include/x86_64-darwin15 -I./include -I. -I./enc/unicode/9.0.0
DLDFLAGS = -Wl,-undefined,dynamic_lookup -Wl,-multiply_defined,suppress -fstack-protector -Wl,-u,_objc_msgSend -Wl,-pie -framework CoreFoundation
SOLIBS =
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.4.0
Thread model: posix

I now have a custom version of Ruby in my /usr/local/bin

ml004775hquresh:bin haseeb_qureshi$ ls -l
...
-rwxr-xr-x 1 haseeb_qureshi admin 3.1M Oct 23 00:37 ruby
...

ml004775hquresh:bin haseeb_qureshi$ ./ruby -v
ruby 2.4.0dev (2016-10-23 trunk 56478) [x86_64-darwin15]

require 'benchmark'
NUM_TIMES = 100_000

str = 'abcde' * 2 ** 10
str2 = str * 2

Benchmark.bmbm do |bm|
  bm.report(str.length) do
    NUM_TIMES.times { str[1..-2] }
  end

  bm.report(str2.length) do
    NUM_TIMES.times { str2[1..-2] }
  end
end

Let’s run this benchmark again…

ml004775hquresh:bin haseeb_qureshi$ ./ruby ~/Projects/substrings/benchmark3.rb

Rehearsal -----------------------------------------
...
---------------------------------------------------

            user     system      total        real
5120    0.020000   0.000000   0.020000 (  0.020432)
10240   0.020000   0.000000   0.020000 (  0.020300)

Boom.

Ruby is now doing copy-on-write optimization on all strings!

def substrings(str)
  (0...str.length).each_with_object([]) do |i, subs|
    (i...str.length).each do |j|
      subs << str[i..j]
    end
  end
end

And this bad boy, finally, takes O(n²) time.

(Applause break)

But you have to wonder…

why was that the default behavior?

str: H e l l o \0 (memory cells 8fe0 8fe1 8fe2 8fe3 8fe4 8fe5)

In C, strings should end with a null terminator, also called a null byte.

This is how C knows it’s reached the end of a string.

Null terminator

str: H e l l o \0 (memory cells 8fe0 8fe1 8fe2 8fe3 8fe4 8fe5)

If you passed a substring that did not include a NUL into a library written in C, it might keep reading bytes until it found the NUL.

Null terminator

str2 = str[1..3]

Essentially, copying the substring ensures that any C extension treats all Ruby strings correctly.
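The overrun is easy to simulate in Ruby. The sketch below is not real C interop: c_style_read is a hypothetical helper that mimics a C routine scanning bytes until it reaches a NUL (0), applied to an array of bytes standing in for memory.

```ruby
# Simulation only: models a C routine that reads bytes until it hits NUL (0).
# c_style_read is a hypothetical helper, not a real Ruby or C API.
def c_style_read(memory, start)
  bytes = []
  index = start
  while index < memory.length && memory[index] != 0
    bytes << memory[index]
    index += 1
  end
  bytes.pack('C*')
end

memory = "Hello\0".bytes # the original buffer, NUL-terminated

# A shared substring str[1..3] would mean "start at byte 1, length 3",
# but a C reader ignores the length and stops only at the NUL.
puts c_style_read(memory, 1) # ello

# A private copy gets its own terminator, so the C reader stops correctly.
copy = "ell\0".bytes
puts c_style_read(copy, 0)   # ell
```

Only a substring that runs to the end of the original buffer already has the NUL in the right place, which matches the benchmark result for str[1..-1].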

So that’s it.

We’re finally done.

We have an O(n²) algorithm for substrings.

Except one thing…

Remember where we started?

We need to generate all the substrings.

Did we actually… generate them?

def substrings(str)
  (0...str.length).each_with_object([]) do |i, subs|
    (i...str.length).each do |j|
      subs << str[i..j]
    end
  end
end

puts substrings("Hello")

It takes linear time to print a substring, so printing all the substrings will still take O(n³) time.
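The cubic cost of printing can be seen by counting characters instead of timing anything. A minimal sketch, with a hypothetical helper total_substring_chars: there are n − k + 1 substrings of each length k, and the total matches the closed form n(n + 1)(n + 2)/6.

```ruby
# Hypothetical helper: total number of characters across all substrings
# of a string of length n.
def total_substring_chars(n)
  (1..n).sum { |k| k * (n - k + 1) } # (n - k + 1) substrings of each length k
end

[5, 10, 20, 40].each do |n|
  closed_form = n * (n + 1) * (n + 2) / 6
  puts "n = #{n}: #{total_substring_chars(n)} characters, closed form = #{closed_form}"
end
```

For "Hello" that is 35 characters to print, and the total grows roughly eightfold each time n doubles.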

So in what sense is this O(n²)?

If you think about it, the whole idea of copy-on-write is laziness.

What we’ve created are lazy strings.

Instead of making these:

H, e, l, l, o

He, el, ll, lo

Hel, ell, llo

Hell, ello

Hello

We made these:

str[0..-1]

str[0..3], str[1..4]

str[0..2], str[1..3], str[2..4]

str[0..1], str[1..2], str[2..3], str[3..4]

str[0..0], str[1..1], str[2..2], str[3..3], str[4..4]

All we’ve really done is build each pair of indices.

The Ruby array that substrings(str) returns does not actually contain the substrings.

It’s just a clever, lazy way to express them.
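The laziness can be made explicit. The sketch below is not the talk's code: substring_indices is a hypothetical helper that enumerates only the (i, j) pairs, and the characters are touched only when a substring is actually requested.

```ruby
# Hypothetical helper: enumerates the (i, j) index pairs without
# reading or copying any characters.
def substring_indices(str)
  Enumerator.new do |yielder|
    (0...str.length).each do |i|
      (i...str.length).each { |j| yielder << [i, j] }
    end
  end
end

str   = 'Hello'
pairs = substring_indices(str)

puts pairs.count # 15 index pairs for "Hello"

# Materialize only the first three substrings, on demand.
first_three = pairs.lazy.map { |(i, j)| str[i..j] }.first(3)
puts first_three.inspect # ["H", "He", "Hel"]
```

Enumerating the pairs is the O(n²) part. Each substring that is actually read or printed still costs time proportional to its length.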

It’s lies all the way down.

Thanks for listening.

You can follow me at @hosseeb

Special thanks to Ned Ruggeri, David Runger, and Pat Shaughnessy.

You can find the code on GitHub: Haseeb-Qureshi