Introduction to Data-Oriented Design

Introduction to Data-Oriented Design @YaroslavBunyak Senior Software Engineer, SoftServe

Slides for a talk I gave at IT Weekend Rivne, November 2014. And I'm too lazy to add comments for each slide :)

Transcript of Introduction to Data-Oriented Design

Page 1: Introduction to Data-Oriented Design

Introduction to Data-Oriented Design

@YaroslavBunyak Senior Software Engineer, SoftServe

Page 2: Introduction to Data-Oriented Design

Programming, M**********r Do you speak it?

Page 3: Introduction to Data-Oriented Design

Story

Page 4: Introduction to Data-Oriented Design

Sieve of Eratosthenes

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40

41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80

81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100

Page 11: Introduction to Data-Oriented Design

Sieve of Eratosthenes

Simple algorithm

Easy to implement

Page 12: Introduction to Data-Oriented Design

Sieve of Eratosthenes

int array[SIZE];
array[i] = 1;
if (array[i]) ...

vs

unsigned bits[(SIZE + 31) / 32];
bits[i / 32] |= 1u << (i % 32);
if (bits[i / 32] & (1u << (i % 32))) ...
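
For reference, the two storage schemes above can be dropped into a complete sieve. This is a minimal sketch, not the code from the talk: the function names `count_primes_array` and `count_primes_bits` are invented here, and the flag meaning (non-zero means "composite") is an assumption. It makes no timing claim; it only shows that both versions compute the same answer while the second one touches 32 times less memory.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Version 1: one int per number; composite[i] != 0 means "i is not prime".
std::size_t count_primes_array(std::size_t limit) {
    if (limit < 2) return 0;
    std::vector<int> composite(limit + 1, 0);
    std::size_t count = 0;
    for (std::size_t i = 2; i <= limit; ++i) {
        if (composite[i]) continue;
        ++count;
        // Cross out multiples of i, starting at i * i.
        for (std::size_t j = i * i; j <= limit; j += i)
            composite[j] = 1;
    }
    return count;
}

// Version 2: one bit per number, packed 32 to a word.
std::size_t count_primes_bits(std::size_t limit) {
    if (limit < 2) return 0;
    std::vector<std::uint32_t> bits(limit / 32 + 1, 0u);
    std::size_t count = 0;
    for (std::size_t i = 2; i <= limit; ++i) {
        if (bits[i / 32] & (1u << (i % 32))) continue;
        ++count;
        for (std::size_t j = i * i; j <= limit; j += i)
            bits[j / 32] |= 1u << (j % 32);
    }
    return count;
}
```

Both functions return 25 for a limit of 100. The bit version does extra shifts and masks per access, which is the "more work" the next slides refer to.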

Page 13: Introduction to Data-Oriented Design

Sieve of Eratosthenes

Simple algorithm

Easy to implement

But...

unexpected results

Page 14: Introduction to Data-Oriented Design

Sieve of Eratosthenes

The second implementation (bitset) is 3-5x faster than the first (array)

Even though it actually does more work

Page 15: Introduction to Data-Oriented Design

Why?!

Page 16: Introduction to Data-Oriented Design

Fast Forward

Page 17: Introduction to Data-Oriented Design

...

• Years have passed

• I became a software engineer

• And one day...

Page 18: Introduction to Data-Oriented Design

This Graph

[Graph: CPU/Memory performance]

Source: Computer Architecture: A Quantitative Approach, by John L. Hennessy, David A. Patterson, Andrea C. Arpaci-Dusseau

Page 19: Introduction to Data-Oriented Design

This Table

                          1980    Modern PC                    Improvement
Clock speed, MHz          6       3000                         +500x
Memory size, MB           2       2000                         +1000x
Memory bandwidth, MB/s    13      7000 (read), 2000 (write)    +540x, +150x
Memory latency, ns        225     ~70                          +3x
Memory latency, cycles    1.4     210                          -150x

Page 20: Introduction to Data-Oriented Design

Memory Hierarchy

• CPU registers

• Cache Level 1

• Cache Level 2

• RAM

• HDD

[Diagram: CPU, L1i Cache, L1d Cache, L2 Cache, RAM, Disk]

Page 21: Introduction to Data-Oriented Design

Distance Metaphor

• L1 cache: it's on your desk, pick it up.

• L2 cache: it's on the bookshelf in your office, get up out of the chair.

• Main memory: it's on the shelf in your garage downstairs, might as well get a snack while you're down there.

• Disk: it's in, um, California. Walk there. Walk back. Really.

http://hacksoflife.blogspot.com/2011/04/going-to-california-with-aching-in-my.html

Page 22: Introduction to Data-Oriented Design

Fact

• Memory access is expensive

• CPU cycles are cheap

Page 23: Introduction to Data-Oriented Design

Modern Programming

• High-level languages and abstractions

• OOP

  • everywhere!

  • objects scattered throughout the address space

  • memory access patterns are unpredictable

Page 24: Introduction to Data-Oriented Design

Meet Data-Oriented Design

Page 25: Introduction to Data-Oriented Design

Ideas

• code transforms data

• data >> code

• hardware is not a black box

Page 26: Introduction to Data-Oriented Design

Program

data -> xform -> data

Page 27: Introduction to Data-Oriented Design

Example 1: AoS vs SoA

struct Tile
{
    bool ready;
    Data pixels; // big chunk of data
};

Tile tiles[SIZE];

vs

struct Image
{
    bool ready[SIZE];  // hot data
    Data pixels[SIZE]; // cold data
};

Page 28: Introduction to Data-Oriented Design

Example 1: AoS vs SoA

for (int i = 0; i < SIZE; ++i)
{
    if (tiles[i].ready)
        draw(tiles[i].pixels);
}

vs

for (int i = 0; i < SIZE; ++i)
{
    if (image.ready[i])
        draw(image.pixels[i]);
}
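
A compilable sketch of the two layouts, for experimenting. Everything not on the slides is an assumption made for illustration: the 4 KB `Data` payload, the value of `SIZE`, the function names `draw_ready_aos` and `draw_ready_soa`, and the counter that stands in for `draw`.

```cpp
#include <array>
#include <cstddef>
#include <vector>

constexpr std::size_t SIZE = 256;  // assumed tile count

// Assumed pixel payload: 4 KB per tile.
struct Data { std::array<unsigned char, 4096> bytes{}; };

// AoS: each flag is followed by its 4 KB payload, so flags sit ~4 KB apart.
struct Tile {
    bool ready = false;
    Data pixels;
};

// SoA: all flags are contiguous; payloads live in a separate array.
struct Image {
    std::array<bool, SIZE> ready{};                      // hot data
    std::vector<Data> pixels = std::vector<Data>(SIZE);  // cold data
};

std::size_t draw_ready_aos(const std::vector<Tile>& tiles) {
    std::size_t drawn = 0;
    for (const Tile& t : tiles)
        if (t.ready) ++drawn;  // stand-in for draw(t.pixels)
    return drawn;
}

std::size_t draw_ready_soa(const Image& image) {
    std::size_t drawn = 0;
    for (std::size_t i = 0; i < image.ready.size(); ++i)
        if (image.ready[i]) ++drawn;  // stand-in for draw(image.pixels[i])
    return drawn;
}
```

Both loops do the same logical work. The difference is the stride between consecutive `ready` flags: `sizeof(Tile)` bytes in the first layout, one byte in the second.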

Page 29: Introduction to Data-Oriented Design

Example 1: AoS vs SoA

[Figure: AoS vs SoA]

Page 30: Introduction to Data-Oriented Design

By The Way

• Memory loads in chunks, not single bytes

• One such chunk is called a cache line

• Typical size: 64 or 128 bytes
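
To make the chunking concrete, here is a rough sketch that counts how many cache lines a linear scan touches when it reads one byte from each element. It assumes 64-byte lines and a buffer that starts on a line boundary; `lines_touched` and `kCacheLine` are names invented for this illustration, and real hardware (prefetching, 128-byte lines) will behave differently in detail.

```cpp
#include <cstddef>

constexpr std::size_t kCacheLine = 64;  // assumed line size in bytes

// Number of distinct cache lines touched when reading one byte at the start
// of each of `count` elements laid out `stride` bytes apart.
inline std::size_t lines_touched(std::size_t count, std::size_t stride,
                                 std::size_t line = kCacheLine) {
    if (count == 0) return 0;
    if (stride >= line) return count;        // every element is in its own line
    return (count - 1) * stride / line + 1;  // several elements share a line
}
```

For 1024 flags stored one byte apart this gives 16 lines. For 1024 flags stored 4 KB apart (one per large tile, as in Example 1) it gives 1024 lines, each loaded only to read a single byte.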

Page 31: Introduction to Data-Oriented Design

Example 1: AoS vs SoA

[Figure: AoS vs SoA]

Page 32: Introduction to Data-Oriented Design

Example 2: Existence

struct Image
{
    bool ready[SIZE];
    Data pixels[SIZE];
};

Image image;

vs

Data ready_pixels[N]; // N ≤ SIZE

Page 33: Introduction to Data-Oriented Design

Example 2: Existence

for (int i = 0; i < SIZE; ++i)
{
    if (image.ready[i])
        draw(image.pixels[i]);
}

vs

for (int i = 0; i < N; ++i)
{
    draw(ready_pixels[i]);
}
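
The slides do not show how `ready_pixels` gets filled. One possible way, sketched under assumptions: `Data` is reduced to a tiny stand-in struct, and `collect_ready` is a hypothetical helper that copies the ready entries into a dense array once, so the drawing loop needs no branch.

```cpp
#include <cstddef>
#include <vector>

struct Data { int id; };  // hypothetical stand-in for the real pixel data

// Copy only the entries whose flag is set. Being present in the result is
// what "ready" means, so no flag has to be stored or tested later.
std::vector<Data> collect_ready(const std::vector<bool>& ready,
                                const std::vector<Data>& pixels) {
    std::vector<Data> ready_pixels;
    for (std::size_t i = 0; i < ready.size() && i < pixels.size(); ++i)
        if (ready[i])
            ready_pixels.push_back(pixels[i]);
    return ready_pixels;
}
```

In a real program the dense array would more likely be maintained incrementally (append when a tile becomes ready) rather than rebuilt by a pass like this; the sketch only shows the relationship between the two representations.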

Page 34: Introduction to Data-Oriented Design

Example 3: Locality

array<float> numbers; // any contiguous array type, e.g. std::vector<float>
float sum = 0.0f;
for (float x : numbers)
    sum += x;

vs

list<float> numbers; // linked list, one node per element
float sum = 0.0f;
for (float x : numbers)
    sum += x;
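
A compilable version of the comparison, as a sketch: `sum_all` is a name invented here, and `std::vector` is assumed as the contiguous array type. The loop body is identical for both containers; only the memory layout behind the iteration differs.

```cpp
#include <list>
#include <vector>

// Sums any container of floats. With std::vector the elements are adjacent
// in memory; with std::list each element lives in its own heap node.
template <typename Container>
float sum_all(const Container& numbers) {
    float sum = 0.0f;
    for (float x : numbers)
        sum += x;
    return sum;
}
```

Calling `sum_all` on a `std::vector<float>` and on a `std::list<float>` with the same values returns the same result, so any difference in running time comes from how the data is reached, not from the arithmetic.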

Page 35: Introduction to Data-Oriented Design

Example 3: Locality

[Figure: locality]

Page 36: Introduction to Data-Oriented Design

Advice

• Keep your data closer to registers and cache (hot data)

• Don’t touch what you don’t have to (cold data)

• Prefer predictable access patterns (e.g. linear arrays)

• What’s good for memory is good for you

Page 37: Introduction to Data-Oriented Design

DOD Patterns

• A to B transform

• In-place transform

• Existence based processing

• Data normalization

  • DB design says hello!

• Task, gather, dispatch, and more...
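
The first two patterns in the list can be illustrated with a small sketch. The position/velocity example and the names `integrate` and `integrate_in_place` are assumptions made for this note, not taken from the talk.

```cpp
#include <cstddef>
#include <vector>

// A to B transform: read the input arrays, write a separate output array.
// The inputs are left untouched.
std::vector<float> integrate(const std::vector<float>& pos,
                             const std::vector<float>& vel, float dt) {
    std::vector<float> out(pos.size());
    for (std::size_t i = 0; i < pos.size() && i < vel.size(); ++i)
        out[i] = pos[i] + vel[i] * dt;
    return out;
}

// In-place transform: the same computation, overwriting the input array.
void integrate_in_place(std::vector<float>& pos,
                        const std::vector<float>& vel, float dt) {
    for (std::size_t i = 0; i < pos.size() && i < vel.size(); ++i)
        pos[i] += vel[i] * dt;
}
```

The A to B form keeps the previous state available and has no aliasing between input and output; the in-place form uses less memory. Both walk their arrays linearly.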

Page 38: Introduction to Data-Oriented Design

DOD Benefits

• Maximum performance

  • the CPU doesn’t wait and starve

• Easy to parallelize

  • data is grouped, transforms are separated

  • ready for parallel processing; OOP isn’t

• Simpler code

  • surprise!

Page 39: Introduction to Data-Oriented Design

References: Memory

• Ulrich Drepper “What Every Programmer Should Know About Memory”

• Kris Kaspersky “Program Optimization Techniques: Efficient Memory Usage” (in Russian)

• Christer Ericson “Memory Optimization”

• Igor Ostrovsky “Gallery of Processor Cache Effects”

Page 40: Introduction to Data-Oriented Design

References: DOD

• Noel Llopis “Data-Oriented Design”, Game Developer Magazine, September 2009

• Richard Fabian “Data-Oriented Design”, book draft http://www.dataorienteddesign.com/dodmain/

• Tony Albrecht “Pitfalls of Object-Oriented Programming”

• Niklas Frykholm “Practical Examples of Data Oriented Design”, also everything on http://bitsquid.blogspot.com/

• Mike Acton “Typical C++ Bullshit”

• Data Oriented Design @ Google+

Page 41: Introduction to Data-Oriented Design

Bonus: Object or not?

Q: What is a table?

A: Flat top and 4 legs.

Q: Object? (OOP)

A: Yes.

Q: If we remove one leg, is it still an object?

A: …

DOD: There is no table :)

Page 42: Introduction to Data-Oriented Design

Bonus: Object or not?

Q: You are modelling a pile of sand. Is it an object?

A: Yes.

Q: What is the borderline number of particles N at which a bunch of sand particles becomes a pile? 10? 1000? 1000000?

(i.e. can we say that N particles are just a bunch of particles, but N+1 particles are a pile of sand?)

A: …

DOD: Sand particles are data.

Page 43: Introduction to Data-Oriented Design

Thank You!

Page 44: Introduction to Data-Oriented Design

Q?