Transcript
Page 1: IEEE Fast Square Root

IEEE Fast Square Root

Ref: Graphics Gems III; 2

Spring 2003

Page 2: IEEE Fast Square Root

Motivation

• Square root operations are frequently used in many applications (e.g., computer graphics)

• Usually speed is more important than accuracy

• Is there any way faster than sqrt( )?
  – Idea: tabulated sqrt!

Page 3: IEEE Fast Square Root

Math Background

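$$(m \times 2^{e})^{1/2} = m^{1/2} \times 2^{e/2}$$

Exponent part:

if $e$ is even (say, 4): $(2^{4})^{1/2} = 2^{2}$

if $e$ is odd (say, 3): $(2^{3})^{1/2} = (2 \times 2^{2})^{1/2} = 2^{1/2} \times 2^{1}$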

For a 52-bit mantissa (double), only a limited number of cases need to be computed: $2 \times 2^{52}$ entries (one set for even exponents, one for odd), each entry with 52 bits

Negative exponents: same!

Page 4: IEEE Fast Square Root

Abridged Table

• Sacrifice accuracy for smaller tables

• Indexed by the first 13 bits of the mantissa only
  – Only $2 \times 2^{13}$ entries; each entry with 20 significant binary bits

• Further accuracy, if required, can be obtained by one or two Newton iterations, using the tabulated value as initial guess

Try this yourself!
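
One way to try the Newton refinement is sketched below. The function names are hypothetical, and the initial guess is passed in by the caller (for example, a tabulated value); this is an illustration of the standard Newton step for $y^2 - x = 0$, not code from the slides.

```c
/* One Newton step for f(y) = y*y - x:  y_next = (y + x / y) / 2 */
static double newton_sqrt_step(double x, double y)
{
    return 0.5 * (y + x / y);
}

/* Refine an initial guess of sqrt(x) with a fixed number of Newton steps.
   Assumption: x > 0 and guess > 0. */
double refine_sqrt(double x, double guess, int iterations)
{
    double y = guess;
    for (int i = 0; i < iterations; i++) {
        y = newton_sqrt_step(x, y);
    }
    return y;
}
```

Each step roughly doubles the number of correct bits, which is why one or two iterations are enough when the starting value is already good to about 13 bits. Note that each step costs a division, so part of the speed advantage is given back.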

Page 5: IEEE Fast Square Root

How Numbers are Stored in Memory

SEEEEEEEEEEEMMMM MMMMMMMMMMMMMMMM MMMMMMMMMMMMMMMM MMMMMMMMMMMMMMMM

Conceptually: B0 B1 B2 B3 B4 B5 B6 B7

Stored (on PC): byte swapping; least significant byte first: B7 B6 B5 B4 B3 B2 B1 B0

Byte swapping: be cautious when exchanging binary files and when accessing the bytes directly. When we read or operate on a value as its declared data type, we do not need to worry (it is read backward).

This is why the examiner works

Page 6: IEEE Fast Square Root

Byte Swapping (cont)

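Value: hexadecimal form; bytes as stored on a PC (least significant byte first)

float 3.5: 0x 4060 0000; stored as 00 00 60 40

double 3.5: 0x 400c 0000 0000 0000; stored as 00 00 00 00 00 00 0c 40

short 1029: 0x 0405; stored as 05 04

int 2^22+5: 0x 0040 0005; stored as 05 00 40 00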

Use this program to see for yourself
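
The program from the slide is not reproduced in this transcript. A minimal sketch of such a byte examiner might look like the following; the names `bytes_to_hex` and `show_bytes` are hypothetical, and objects of at most 16 bytes are assumed.

```c
#include <stddef.h>
#include <stdio.h>

/* Write the bytes of an object, in memory order, as "xx xx ..." into out.
   out must have room for at least 3 * n characters (and at least 1). */
void bytes_to_hex(const void *obj, size_t n, char *out)
{
    const unsigned char *p = (const unsigned char *)obj;
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        /* two hex digits, then a space except after the last byte */
        sprintf(out + 3 * i, "%02x%s", p[i], (i + 1 < n) ? " " : "");
    }
}

/* Print a label followed by the object's bytes in memory order.
   Assumption: n <= 16. */
void show_bytes(const char *label, const void *obj, size_t n)
{
    char buf[3 * 16];
    bytes_to_hex(obj, n, buf);
    printf("%-12s %s\n", label, buf);
}
```

For example, after `float f = 3.5f;`, calling `show_bytes("float 3.5", &f, sizeof f);` on a little-endian machine with IEEE 754 floats should list the bytes least significant first, as in the values above. On a big-endian machine the order would be reversed.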

Page 7: IEEE Fast Square Root

Implementation

Setup Table

Evaluation

Page 8: IEEE Fast Square Root

Details: access M.S.Byte

B7 B6 B5 B4 B3 B2 B1 B0

f fi

Page 9: IEEE Fast Square Root

MMMMMMMMMMMMMMMM MMMMMMMMMMMMMMMM SEEEEEEEEEEEMMMM MMMMMMMMMMMMMMMM

Of the 20 mantissa bits in the most significant word: 13 are used as the table index, 7 are dropped

f fi
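
The figures on these slides show a double `f` and an integer view `fi` of its most significant word. A portable way to get that word is sketched below; the function names are hypothetical, and it assumes 64-bit IEEE doubles and that integers and floating-point values share the same byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the least significant byte is stored first (as on a PC). */
int is_little_endian(void)
{
    uint16_t one = 1;
    unsigned char first;
    memcpy(&first, &one, 1);
    return first == 1;
}

/* Copy out the 32-bit word that holds the sign, the 11 exponent bits,
   and the top 20 mantissa bits of a double. */
uint32_t most_significant_word(double f)
{
    uint32_t fi;
    /* on a little-endian machine that word is the second one in memory */
    size_t offset = is_little_endian() ? 4 : 0;
    memcpy(&fi, (const unsigned char *)&f + offset, sizeof fi);
    return fi;
}

/* Top 13 mantissa bits: mask the 20 mantissa bits, drop the low 7. */
uint32_t mantissa_index(uint32_t fi)
{
    return (fi & 0x000FFFFFu) >> 7;
}
```

Using `memcpy` instead of casting `&f` to an integer pointer avoids aliasing problems in modern C while still reading the same bytes.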

Page 10: IEEE Fast Square Root

Evaluation

Page 11: IEEE Fast Square Root

Time a function

Page 12: IEEE Fast Square Root

Example

Page 13: IEEE Fast Square Root

Twice as fast, but note the overhead of building the tables