Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

Page 1: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

iMAGIS is a joint project of CNRS - INPG - INRIA - UJF

iMAGIS-GRAVIR / IMAG

Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

François X. Sillion, Jean-Marc Hasenfratz

iMAGIS

Page 2: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

Radiosity

Page 3: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

Hierarchical Radiosity

• Hierarchical representation (mesh)
• Interactions computed at appropriate level
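
The idea of computing each interaction at an appropriate level of the mesh hierarchy can be sketched as follows. This is a minimal illustration, not the authors' code: the `Element`, `Link`, `error_estimate` and `refine` names are hypothetical, and the product of areas is only a stand-in for a real error oracle.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A node of the hierarchical mesh (hypothetical, simplified)."""
    area: float
    children: list = field(default_factory=list)

    def subdivide(self):
        # Quadtree-style split into four equal children, created once.
        if not self.children:
            self.children = [Element(self.area / 4.0) for _ in range(4)]
        return self.children


@dataclass
class Link:
    """An interaction between a receiver and a source element."""
    receiver: Element
    source: Element


def error_estimate(link):
    # Assumption: product of areas stands in for a real error oracle.
    return link.receiver.area * link.source.area


def refine(link, eps, min_area):
    """Return the links that represent `link` at a fine enough level."""
    r, s = link.receiver, link.source
    if error_estimate(link) <= eps:
        return [link]                      # accurate enough at this level
    if max(r.area, s.area) / 4.0 < min_area:
        return [link]                      # cannot subdivide any further
    if r.area >= s.area:                   # split the larger element
        pending = [Link(c, s) for c in r.subdivide()]
    else:
        pending = [Link(r, c) for c in s.subdivide()]
    result = []
    for child in pending:
        result.extend(refine(child, eps, min_area))
    return result
```

Calling `refine(Link(Element(1.0), Element(1.0)), eps=0.1, min_area=0.01)` pushes the interaction down the hierarchy until every resulting link satisfies the threshold.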

Page 4: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

Strategies for Hierarchical Radiosity

• Gathering
  – Memory consuming (store links)
  – Easier dynamic modifications
• Shooting
  – Memory efficient
  – Requires heuristic to decide shooting level
  – Links recomputed as needed
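
The memory trade-off between the two strategies can be illustrated on a flat (non-hierarchical) system. The sketch below is an assumption-laden toy: `gather` reads form factors from links kept in memory, while `shoot` asks a hypothetical `form_factor(i, j)` callable to recompute them on demand.

```python
def gather(emission, reflectance, links, radiosity):
    """One gathering sweep: every receiver pulls energy through stored links.

    links[i] is a list of (j, form_factor) pairs kept between sweeps,
    which is where the memory cost of gathering comes from.
    """
    new = []
    for i, e in enumerate(emission):
        incoming = sum(f * radiosity[j] for j, f in links[i])
        new.append(e + reflectance[i] * incoming)
    return new


def shoot(source, unshot, radiosity, reflectance, form_factor):
    """One shooting step: push the unshot radiosity of `source` to all others.

    form_factor(i, j) is recomputed on demand, so no link is stored.
    Both `radiosity` and `unshot` are updated in place.
    """
    delta_b = unshot[source]
    unshot[source] = 0.0
    for i in range(len(radiosity)):
        if i == source:
            continue
        received = reflectance[i] * form_factor(i, source) * delta_b
        radiosity[i] += received
        unshot[i] += received
```

In a hierarchical solver the shooting variant would also need the heuristic mentioned above to choose the level from which energy is shot; that part is not modelled here.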

Page 5: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

Parallel Approaches

• Two approaches:
  – Data exchange via message-passing algorithms
  – Shared memory
• Partial solutions possible if “natural” partitioning exists (e.g. inside buildings) [Fun96, FY97]
• Virtual interfaces are harder to handle [RAPP97]
• Load balancing problem [Cav99]

Page 6: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

Scheduler

• Force all link refinement operations through a scheduler object
• Natural place for
  – Parallel synchronization
  – Orientation and steering of calculation
• Advantages of using a scheduler:
  – Global view of all pending tasks at any given time
  – Task extraction can be made according to various selection criteria
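
A scheduler object of this kind might look like the sketch below. It is only an illustration under stated assumptions: the `Scheduler` class, its method names and the `select` callback are hypothetical, not taken from the authors' implementation.

```python
import threading


class Scheduler:
    """Single object through which all pending refinement tasks pass."""

    def __init__(self, select=None):
        self._tasks = []
        self._lock = threading.Lock()   # the one synchronization point
        # select(tasks) returns the index of the task to extract next.
        # Default (an assumption): take the most recently submitted task.
        self._select = select or (lambda tasks: len(tasks) - 1)

    def submit(self, task):
        with self._lock:
            self._tasks.append(task)

    def extract(self):
        """Remove and return the next task, or None when nothing is pending."""
        with self._lock:
            if not self._tasks:
                return None
            return self._tasks.pop(self._select(self._tasks))

    def pending(self):
        """Global view: how many tasks are waiting right now."""
        with self._lock:
            return len(self._tasks)
```

Because every task goes through `submit` and `extract`, changing the selection criterion only means passing a different `select` function; the refinement code itself stays untouched.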

Page 7: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

Example (sequential) schedulers

• Stack scheduler (depth first refinement)
• Priority scheduler
  – Use simple structure (heap)
  – Hierarchical level (breadth first)
  – Size, energy, error
  – Interactive user control
• Random scheduler...
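
The three sequential schedulers listed above can be sketched with standard containers. The class names and the shared `push(task, priority)` / `pop()` interface are assumptions made for this example.

```python
import heapq
import itertools
import random


class StackScheduler:
    """Last in, first out: refinement proceeds depth first."""

    def __init__(self):
        self._items = []

    def push(self, task, priority=0.0):
        self._items.append(task)        # priority is ignored here

    def pop(self):
        return self._items.pop()


class PriorityScheduler:
    """Heap-based: the task with the smallest priority value comes out first.

    Using the hierarchical level as priority gives breadth first refinement;
    size, energy or error estimates could be used instead.
    """

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker, keeps insertion order

    def push(self, task, priority=0.0):
        heapq.heappush(self._heap, (priority, next(self._order), task))

    def pop(self):
        return heapq.heappop(self._heap)[2]


class RandomScheduler:
    """Extracts a uniformly random pending task."""

    def __init__(self, seed=None):
        self._items = []
        self._rng = random.Random(seed)

    def push(self, task, priority=0.0):
        self._items.append(task)

    def pop(self):
        i = self._rng.randrange(len(self._items))
        # Swap the chosen task to the end so removal is O(1).
        self._items[i], self._items[-1] = self._items[-1], self._items[i]
        return self._items.pop()
```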

Page 8: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

Architecture

[Diagram: a Main / GUI box, a Solver box and multiple Refiner boxes]

Page 9: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

Synchronization

1. Scheduler
  – Single object talks to all refiners => Danger!
  – Use simple blocks of refinement jobs
2. Hierarchical data structure
  – Consistency of hierarchical scene structure
3. Interactions
  – Links or energy representations
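
The first point, handing out blocks of refinement jobs so that the single scheduler is locked less often, can be sketched as below. This is a simplified illustration with hypothetical names (`BlockScheduler`, `run_refiners`); it ignores the new links that real refinement would push back, and in CPython the global interpreter lock means it shows the synchronization pattern rather than an actual speed-up.

```python
import threading


class BlockScheduler:
    """Hands out refinement jobs in blocks to reduce lock contention."""

    def __init__(self, links, block_size):
        if block_size < 1:
            raise ValueError("block_size must be at least 1")
        self._links = list(links)
        self._block_size = block_size
        self._lock = threading.Lock()
        self.lock_acquisitions = 0      # instrumentation for the example

    def next_block(self):
        """Return up to block_size links, or an empty list when done."""
        with self._lock:
            self.lock_acquisitions += 1
            block = self._links[-self._block_size:]
            del self._links[-self._block_size:]
            return block


def run_refiners(links, block_size, n_threads, refine):
    """Run n_threads refiner threads; return (results, lock acquisitions)."""
    scheduler = BlockScheduler(links, block_size)
    results = []
    results_lock = threading.Lock()

    def refiner():
        while True:
            block = scheduler.next_block()
            if not block:
                return                  # nothing left to refine
            done = [refine(link) for link in block]
            with results_lock:
                results.extend(done)

    threads = [threading.Thread(target=refiner) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, scheduler.lock_acquisitions
```

With larger blocks each refiner visits the scheduler less often, at the cost of coarser load balancing near the end of the run.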

Page 10: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

Test scenes

VRLab - 51 182 polygons
Aircraft - 184 456 polygons
Office - 5 285 polygons

Page 11: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

Measurements

• Hardware architecture:
  – ccNUMA SGI 2000 computer with 64 microprocessors
  – Runs limited to 40 R10000 microprocessors at 195 MHz

[Diagram: ccNUMA node architecture. Nodes are labeled Node 0, Node 1, ..., Node 511; a node contains Proc A, Proc B, a Hub chip, Mem & Dir, and an IO Xbar with IO Ctrls; routers (R) connect the nodes to the Scalable Interconnect Network]

Page 12: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

Measurements

• Time measurements:
  – Refinement: the times system call, which returns clock ticks
  – Memory access, cache access…: the perfex software tool, which uses the 31 hardware counters of the R10000
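
Timing a refinement pass with the `times` system call can be sketched as follows. The example uses Python's `os.times()`, which wraps that call and already converts clock ticks to seconds; the `cpu_seconds` and `timed` helpers are hypothetical names introduced for illustration.

```python
import os


def cpu_seconds():
    """CPU time (user + system) consumed so far by this process."""
    t = os.times()
    return t.user + t.system


def timed(fn, *args):
    """Call fn(*args) and return (result, CPU seconds spent in the call)."""
    start = cpu_seconds()
    result = fn(*args)
    return result, cpu_seconds() - start
```

On Unix systems, `os.sysconf("SC_CLK_TCK")` gives the number of clock ticks per second if the raw tick count is needed.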

Page 13: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

Results

CPU refinement time

[Plots: refinement time in seconds versus number of processors for the VRLab scene (logarithmic time axis, 0 to 40 processors), the Aircraft scene (0 to 40 processors) and the Office scene (0 to 30 processors); one curve per iteration, Iterations 1 to 4]

Page 14: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

Results

Speed-up

[Plots: speed-up versus number of processors (up to 40). One plot compares the VRLab, Office and Aircraft scenes with the ideal speed-up; another compares Iterations 1 to 4 with the ideal speed-up]
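
The slides do not define speed-up explicitly. Assuming the usual definition, with T(p) the refinement time measured on p processors, the plotted quantity, the corresponding efficiency E(p) and the "Ideal" reference line would be:

```latex
S(p) = \frac{T(1)}{T(p)}, \qquad
E(p) = \frac{S(p)}{p}, \qquad
S_{\text{ideal}}(p) = p
```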

Page 15: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

Results

Influence of the size of link blocks on overall CPU time

[Plot: Office scene, time in seconds versus number of processors (0 to 40), with one curve each for blocks of 1 link, 10 links and 100 links]

Page 16: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

Results

Memory used before and during the iterations

[Plot: memory occupation in MB (axis from 0 to 800) and number of links (axis from 0 to 6 000 000) at load time and after Iterations 1 to 4, for the Aircraft, TD, BMW and VRLab scenes]

Page 17: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

Conclusions

• Very simple atomic tasks
• Easily managed with a single scheduler structure
• Easily implemented on top of an existing radiosity simulation code
  – Thread setup
  – New link creation upon refinement decision

Page 18: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

Future work

•Understanding the peculiar behaviour observed for the aircraft scene

•Dealing with graphics resources for “optimized” calculations using graphics hardware

Page 19: Efficient Parallel Refinement for Hierarchical Radiosity on a DSM computer

Acknowledgements

• Peter Kipfer contributed to the design and early implementation of this work.
• Thanks to Centre Charles Hermite for providing access to its computational resources.
• Laurent Alonso provided useful advice on performance questions.
• This work was supported in part by the European Union’s ESPRIT project #24944, ARCADE (“Making Radiosity Usable”).