Multiprocessor Game Loops: Lessons from Uncharted 2: Among Thieves
Jason Gregory, Naughty Dog, Inc.


Game Forum Germany 2010

Agenda

• Goal: Learn about modern multiprocessor game engine update loops...
    • ... by investigating Naughty Dog’s train level.


Agenda

• During our investigation, we’ll answer the following questions:
    • Why did we make the train itself dynamic (not static)?
    • How does the train move?
    • How does it tie into the update of other engine systems?
    • How do player mechanics and animation work on the train?
    • How do game object attachment hierarchies work?
    • How are ray and sphere casts used on the train?
    • How do we utilize the PS3’s parallel computing resources?

Static or Dynamic?

Could the Train be Static?

• In the past, games with trains in them have used a static train approach.
    • Train is actually stationary.
    • Background scrolls by to produce illusion of movement.
• This solves a lot of problems.
    • Player mechanics, NPC locomotion, weapon mechanics, etc. are all the same as in a “regular” non-moving game level.

Was That Good Enough for Us?

• No way!
• Stationary train design imposes undue limitations:
    • Train must move in a perfectly straight line...
    • ... or along a broad circular arc (constant curvature).
    • No ups and downs allowed in the track either.
    • Not nearly as much fun to jump and shoot between trains.

∴ We decided to go for a fully dynamic train.

Train Movement

Spline Tracking

The Master Car

• Each train car is an independent game object (GO).
• One car is designated as the master.
    • It moves without regard for the other cars.
    • Every other car is a slave: it simply maintains proper spacing with the car(s) in front of and/or behind it.

Spacing and Speed

[Figure: points along the track spline labeled with spline parameter u = 0, 0.5, 1 and arc length s = 0, 7, 14.]

Updating the Cars

• Train car game objects need to update in a specific order:
    • Master first.
    • Then cars in front of master, from master to locomotive.
    • Then cars behind master, from master to caboose.

[Figure: train cars numbered in update order: 1, 2, 3, 4 ... n, then n + 1, n + 2 ... m.]
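The update order above can be sketched roughly as follows, modelling the track as a one-dimensional arc-length coordinate s. The names `TrainCar`, `UpdateTrain` and `kCarSpacing`, the fixed spacing value, and the convention that index 0 is the locomotive are all assumptions made for illustration; this is not the Uncharted 2 code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical train car: its position is reduced to an arc length s
// (in meters) along the track spline.
struct TrainCar {
    float s = 0.0f;
};

// Assumed constant distance between neighboring cars.
constexpr float kCarSpacing = 7.0f;

// Moves the master without regard for the other cars, then lets every other
// car restore its spacing relative to the neighbor that was updated just
// before it. Index 0 is the locomotive; higher indices are further back.
inline void UpdateTrain(std::vector<TrainCar>& cars, std::size_t master,
                        float speed, float dt) {
    assert(master < cars.size());

    // 1. Master first.
    cars[master].s += speed * dt;

    // 2. Cars in front of the master, from the master toward the locomotive.
    for (std::size_t i = master; i-- > 0;) {
        cars[i].s = cars[i + 1].s + kCarSpacing;
    }

    // 3. Cars behind the master, from the master toward the caboose.
    for (std::size_t i = master + 1; i < cars.size(); ++i) {
        cars[i].s = cars[i - 1].s - kCarSpacing;
    }
}
```

Because each slave reads only the neighbor that was already updated this frame, no car ever sees stale data from the previous frame.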

Updating the Cars

• In the Uncharted 2 engine, game objects are managed as a tree.
    • Children update after their parent.
• For the train...

[Figure: update tree rooted at the master car M, with one chain of children M+1, M+2, M+3 and another chain M−1, M−2, M−3.]

Teleporting the Train

• The train sometimes teleports in the game.
    • e.g. To transition from a looped section to a straight-away.
    • Typically hidden by a camera cut.

Teleporting the Train

• Problem: Teleport the train such that whichever car the player is on moves to a predefined location on the track.
    • To implement, change the master to be the player’s car...
    • ... and teleport it to the desired location.
    • All other cars follow automatically.
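A minimal sketch of that teleport, assuming each car's position is stored as an arc length along the track and that cars are evenly spaced. `Train`, `SnapSlavesToMaster`, `TeleportTrain` and the spacing value are hypothetical names and numbers chosen for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical train: arc-length position per car (index 0 = locomotive),
// the index of the current master car, and an assumed constant spacing.
struct Train {
    std::vector<float> carS;
    std::size_t master = 0;
    float spacing = 7.0f;
};

// Re-derives every slave position from the master's position.
inline void SnapSlavesToMaster(Train& t) {
    for (std::size_t i = 0; i < t.carS.size(); ++i) {
        // Positive when car i is ahead of the master, negative when behind.
        const float offsetInCars =
            static_cast<float>(t.master) - static_cast<float>(i);
        t.carS[i] = t.carS[t.master] + offsetInCars * t.spacing;
    }
}

// Makes the player's car the master, moves it to the target location, and
// lets all other cars follow.
inline void TeleportTrain(Train& t, std::size_t playerCar, float targetS) {
    assert(playerCar < t.carS.size());
    t.master = playerCar;
    t.carS[t.master] = targetS;
    SnapSlavesToMaster(t);
}
```

Switching the master first is what guarantees that the player's car, rather than whichever car happened to be master before, lands exactly on the predefined location.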

Updating Large-Scale Engine Systems

Large-Scale Engine Systems

• Let’s define large-scale engine systems as:
    • Engine components that operate on lots of data...
    • ... and require careful performance optimization.
• Examples include:
    • skeletal animation,
    • collision detection,
    • rigid body dynamics,
    • rendering, ...

A Simple Approach (That Doesn’t Work)

• Most game programmers first learn about the game loop from rendering tutorials.

    while (!quit)
    {
        ReadJoypad();
        UpdateScene();
        DrawScene();
        FlipBuffers();
    }

A Simple Approach (That Doesn’t Work)

• Since we need to update our game objects anyway...
    • UpdateScene() becomes UpdateGameObjects()
• In the spirit of good object-oriented design, we should let the game objects drive the large-scale engine systems.
    • (Right???)

A Simple Approach (That Doesn’t Work)

    void Tank::Update()
    {
        MoveTank();
        AimTurret();
        FireIfNecessary();

        Animate();
        DetectCollisions();
        SimulatePhysics();
        UpdateAudio();
        Draw();
    }

    while (!quit)
    {
        ReadJoypad();
        UpdateGameObjects();
        // DrawScene();
        FlipBuffers();
    }

A Simple Approach (That Doesn’t Work)

• The only problem with this is... it doesn’t work!
• For one thing, it’s just not feasible for some engine systems.
    • e.g. Collision detection cannot be done (properly) one object at a time.
        • Need to solve iteratively, account for time of impact (TOI),
        • optimize collision detection via spatial subdivision (e.g. broadphase AABB sweep and prune),
        • optimize dynamics by grouping into islands, etc.

A Simple Approach (That Doesn’t Work)

• It’s also terribly inefficient:
    • potential for duplicated computations,
    • possible reallocation of resources,
    • poor data and instruction cache coherence,
    • not conducive to parallel computation.

Hardware in the Old Days

• On an Intel 80286 (circa 1982):
    • CPU speed was the bottleneck (6 MHz).
        • ∴ Use if() tests to avoid unnecessary computations.
    • Integer instructions faster than floating-point.
        • ∴ Use fixed-point numbers, or FPU.
    • More compute power provided by adding transistors.

Modern Hardware

• On a Cell (PS3) or Xenon (Xbox 360):
    • Memory access time is the bottleneck.
        • L1 cache miss = ~60 cycles.
        • L2 cache miss = ~500 cycles.
        • ∴ Organize data in compact, contiguous blocks.
        • Use struct of arrays rather than array of structs.
    • Pipelined architecture = branching & data conversion are slow.
        • ∴ Use duplicated computations to avoid if() tests.
    • Compute power is provided via parallelism.
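The two recommendations above (struct of arrays, and computing unconditionally instead of branching) might look something like this. The particle example, the field names and the 0/1 mask are assumptions for illustration; whether the branch-free form is actually faster depends on the hardware and should be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array of structs: one particle's fields sit next to each other, so a loop
// that only needs x and vx still pulls health through the cache.
struct ParticleAoS {
    float x;
    float vx;
    float health;
};

// Struct of arrays: each field is its own compact, contiguous block.
struct ParticlesSoA {
    std::vector<float> x;
    std::vector<float> vx;
    std::vector<float> active;  // 1.0f = active, 0.0f = inactive
};

// Integrates x for every particle. Instead of testing `if (active[i])`, the
// step is always computed and then scaled by the 0/1 mask, trading a little
// redundant arithmetic for a loop body with no branch.
inline void Integrate(ParticlesSoA& p, float dt) {
    assert(p.x.size() == p.vx.size() && p.x.size() == p.active.size());
    const std::size_t n = p.x.size();
    for (std::size_t i = 0; i < n; ++i) {
        p.x[i] += p.active[i] * p.vx[i] * dt;
    }
}
```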

Optimizing Large-Scale Updates

[Figure: two data layouts. Inefficient: each game object (GameObject 1, GameObject 2, GameObject 3, ...) holds its own Anim Data, Coll. Data and Draw Data. Much better: the Animation Engine, Collision Engine and Rendering Engine each hold the corresponding data for GO 1, GO 2, GO 3, ... together.]

Optimizing Large-Scale Updates

• Game objects do not own their large-scale system data...
    • ... they point to data that is owned by the large-scale systems.
• Game objects should not mutate their large-scale system data...
    • ... they should only request changes.
• That way, each large-scale system can:
    • apply any requested changes as part of its batch update,
    • organize its data in whatever manner is most efficient.
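One way to picture "request, don't mutate" is a system that queues requests and applies them at the start of its batch update. The class below is a simplified sketch under that assumption; `AnimationEngine`, `RequestState` and the integer state IDs are hypothetical stand-ins, not the engine's real interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A change a game object would like the system to make.
struct AnimRequest {
    std::size_t instance;  // which animation instance to change
    int newState;          // hypothetical animation state ID
};

class AnimationEngine {
public:
    // The engine owns the data; game objects only hold the returned index.
    std::size_t CreateInstance() {
        m_states.push_back(0);
        return m_states.size() - 1;
    }

    // Called by game objects during their own update: records intent only.
    void RequestState(std::size_t instance, int newState) {
        assert(instance < m_states.size());
        m_requests.push_back({instance, newState});
    }

    // Batch update: apply all pending requests, then process every instance
    // in one pass over the contiguous state array.
    void Update() {
        for (const AnimRequest& r : m_requests) {
            m_states[r.instance] = r.newState;
        }
        m_requests.clear();
        // ... advance all instances here ...
    }

    int State(std::size_t instance) const {
        assert(instance < m_states.size());
        return m_states[instance];
    }

private:
    std::vector<int> m_states;
    std::vector<AnimRequest> m_requests;
};
```

Since nothing changes until `Update()` runs, the engine is free to reorder or repack `m_states` however it likes between frames.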

Optimizing Large-Scale Updates

    class Tank : public GameObject
    {
    public:
        // interface...

    private:
        // components -- owned by engine subsystems
        AnimControl*  m_pAnimControl;
        RigidBody*    m_pRigidBody;
        MeshInstance* m_pRenderable;
    };

[Figure: the Tank's component pointers refer to data owned by the engine subsystems: the Animation Engine (animControl7/skelInstance7, animControl15/skelInstance15, animControl42/skelInstance42, ...), the Collision/Physics World (rigidBody3/collidable3, rigidBody9/collidable9, rigidBody41/collidable41, ...) and the Scene Graph (meshInstance5, meshInstance8, meshInstance37, ...).]

    while (!quit)
    {
        ReadJoypad();

        g_gameWorld.UpdateAllGameObjects();
            // for (each GameObject* pGo)
            //     pGo->Update();
        g_animationEngine.Update();
        g_collisionWorld.DetectCollisions();
        g_physicsWorld.Simulate();
        g_audioEngine.Update();
        g_renderingEngine.DrawAndFlipBuffers();
    }

Player Mechanics and Animation

Player Mechanics on a Moving Object

• Player’s position and orientation represented as an attached frame of reference:
    • reference to parent game object,
    • local transform (relative to parent),
    • cached world-space transform (with dirty flag).
• Not quite as simple as it may sound!
    • Just switching to attached frames got us ~60% of the way there.
    • Lots of bugs to fix.
    • Special-case handling of transitions between attached frames.
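The three pieces of an attached frame listed above can be sketched as a small class. To keep it short, this version handles translation only (a real frame would also carry orientation), and `AttachedFrame`, `MarkDirty` and `Reattach` are hypothetical names; `Reattach` shows one plausible way to handle a transition between parents without the character popping.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };
inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }

// Stand-in for a parent game object such as a train car.
struct GameObject {
    Vec3 worldPos{0.0f, 0.0f, 0.0f};
};

class AttachedFrame {
public:
    void AttachTo(const GameObject* parent, Vec3 local) {
        assert(parent != nullptr);
        m_parent = parent;
        m_local = local;
        m_dirty = true;
    }

    // Call after the parent has moved so the cached transform is rebuilt.
    void MarkDirty() { m_dirty = true; }

    // Returns the cached world-space position, recomputing it only if dirty.
    Vec3 World() {
        assert(m_parent != nullptr);
        if (m_dirty) {
            m_world = m_parent->worldPos + m_local;
            m_dirty = false;
        }
        return m_world;
    }

    // Transition to a new parent while keeping the same world position.
    void Reattach(const GameObject* newParent) {
        assert(newParent != nullptr);
        const Vec3 world = World();
        m_parent = newParent;
        m_local = world - newParent->worldPos;
        m_dirty = true;
    }

    Vec3 Local() const { return m_local; }

private:
    const GameObject* m_parent = nullptr;
    Vec3 m_local{0.0f, 0.0f, 0.0f};
    Vec3 m_world{0.0f, 0.0f, 0.0f};
    bool m_dirty = true;
};
```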

Animation Pipeline Review

1. Animation Update
    • Update clocks, trigger keyframed events, detect end-of-animation conditions, take transitions to new animation states.
2. Pose Blending Phase
    • Extract poses from anim clips, blend individual joint poses.
        • Generates local joint transforms (parent-relative).

Animation Pipeline Review

3. Global Pose Phase
    • Walk hierarchy, calculate global joint transforms (model-space or world-space) and matrix palette (input to renderer).
4. Post-Processing
    • Apply IK, procedural animation, etc.
        • Generates new local joint transforms.
        • Recalculate global poses if necessary (re-run phase 3).
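The hierarchy walk in phase 3 can be illustrated with a deliberately reduced example in which each joint pose is only a translation; a real engine would concatenate full transforms. `JointPose`, `CalcGlobalPoses` and the parent-index layout (parents stored before their children) are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Translation-only joint pose, to keep the sketch short.
struct JointPose {
    float tx, ty, tz;
};

constexpr int kNoParent = -1;

// Converts parent-relative (local) poses into model-space (global) poses.
// Assumes joints are sorted so every parent precedes its children, which
// lets a single forward pass over the array walk the whole hierarchy.
inline std::vector<JointPose> CalcGlobalPoses(const std::vector<int>& parents,
                                              const std::vector<JointPose>& local) {
    assert(parents.size() == local.size());
    std::vector<JointPose> global(local.size());
    for (std::size_t j = 0; j < local.size(); ++j) {
        const int p = parents[j];
        if (p == kNoParent) {
            global[j] = local[j];  // root joint: local pose is already global
        } else {
            assert(p >= 0 && static_cast<std::size_t>(p) < j);
            const JointPose& g = global[static_cast<std::size_t>(p)];
            global[j] = {g.tx + local[j].tx, g.ty + local[j].ty, g.tz + local[j].tz};
        }
    }
    return global;
}
```

If post-processing (phase 4) changes any local pose, calling the same function again reproduces the "re-run phase 3" step.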

Game Object Hooks

• It’s convenient to provide game objects with hooks into the various phases of the animation update.
• What we do at Naughty Dog:
    • GameObject::Update()
        • Regular GO update; runs before animation.
    • GameObject::PostAnimUpdate()
        • Runs after Animation Phase 1.
        • Game objects may respond to animation events, force transitions to new animation states, etc.

Game Object Hooks

    • GameObject::PostAnimBlending()
        • Runs after Animation Phase 2.
        • Game objects can apply procedural animation in local space.
    • GameObject::PostJointUpdate()
        • Runs after Animation Phase 3.
        • This is the first time during the frame that the global transform of every joint is known.
        • Game objects can apply IK, or use the global space joint transforms in other ways.

    while (!quit)
    {
        // ...
        g_gameWorld.UpdateAllGameObjects();

        g_animationEngine.RunPhase1();
        g_gameWorld.CallPostAnimUpdateHooks();

        g_animationEngine.RunPhase2();
        g_gameWorld.CallPostAnimBlendingHooks();

        g_animationEngine.RunPhase3();
        g_gameWorld.CallPostJointUpdateHooks();
        // ...
    }

Object Attachment Hierarchy

Bucketed Updates

Bucketed Game Object Updates

• Can implement this by updating game objects in buckets.
    • Game object dependencies represented by a dependency tree.
    • Each bucket corresponds to one level of the tree.

[Figure: dependency tree of game objects (trainCar7, trainCar8, trainCar9, crate21, box12, playerCharacter, npc65, weapon17, weapon3, ...) with its levels labeled Bucket 0, Bucket 1, Bucket 2 and Bucket 3.]
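If each bucket is one level of the dependency tree, a bucket index is simply an object's depth in that tree. The sketch below assumes the tree is stored as an array of parent indices; `AssignBuckets` and that representation are illustrative choices, not how the engine stores its objects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kNoParent = -1;

// parents[i] is the index of the object that object i depends on, or
// kNoParent for a root. Returns, for every object, the bucket it belongs to:
// its depth in the dependency tree. Assumes the dependencies form a tree
// (no cycles).
inline std::vector<int> AssignBuckets(const std::vector<int>& parents) {
    std::vector<int> bucket(parents.size(), 0);
    for (std::size_t i = 0; i < parents.size(); ++i) {
        int depth = 0;
        // Walk up to the root, counting levels on the way.
        for (int p = parents[i]; p != kNoParent;
             p = parents[static_cast<std::size_t>(p)]) {
            assert(p >= 0 && static_cast<std::size_t>(p) < parents.size());
            ++depth;
        }
        bucket[i] = depth;
    }
    return bucket;
}
```

For example, with a train car as root, a player standing on it, a weapon held by the player and a crate sitting on the car, the car lands in bucket 0, the player and crate in bucket 1, and the weapon in bucket 2.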

Bucketed Game Object Updates

• Simpler to group objects into pre-determined buckets.

[Figure: the same game objects (trainCar7, trainCar8, trainCar9, crate21, box12, playerCharacter, npc65, weapon17, weapon3, ...) grouped into four pre-determined buckets: Vehicle Bucket, Object Bucket, Character Bucket and Prop Bucket.]

Bucketed Game Object Updates

    while (!quit)
    {
        // ...
        for (each bucket)
        {
            g_gameObjectMgr.UpdateObjects(bucket);
            AnimateBucket(bucket);
        }
        g_collphysWorld.CollideAndSimulate(dt);
        g_audioEngine.Update(dt);
        g_renderingEngine.DrawAndFlipBuffers(dt);
    }

    void AnimateBucket(bucket)
    {
        g_animationEngine.UpdateClocks(bucket);
        for (each GameObject* pGo in bucket)
            pGo->PostAnimUpdate();

        g_animationEngine.CalcLocalPoses(bucket);
        for (each GameObject* pGo in bucket)
            pGo->PostAnimBlending();

        g_animationEngine.CalcGlobalPoses(bucket);
        for (each GameObject* pGo in bucket)
            pGo->PostJointUpdate();
    }

Ray and Sphere Casts

Ray and Sphere Casts

• A collision cast is an instantaneous collision query.
    • Input:
        • Snapshot of collision geometry in game world at time t.
        • One or more rays / moving spheres (capsules) to cast.
    • Output:
        • Would any of the rays/spheres strike anything?
        • If so, what?

Uses for Collision Casting

• Player and NPCs use downward casts to determine what surface they are standing on.
• NPCs use ray casts to answer line of sight questions.
• Weapons use ray casts to determine bullet impacts.
• and the list goes on...

Inter-Game-Object Queries and Synchronization

• Synchronization problems can arise whenever:
    • game object A...
    • ... queries game object B.
• Not just limited to ray and sphere casts.
    • Any kind of inter-object query can be affected.

Game Object State Vectors

$$S_i(t) = [\, r_i(t),\ v_i(t),\ m_i,\ \ldots,\ \mathit{health}_i(t),\ \mathit{ammo}_i(t),\ \ldots \,]$$

Bucketing and Game Object State Vectors

[Figures: a timeline from t1 to t2 = t1 + ∆t showing the state vectors SA, SB and SC of three game objects as they are updated during the frame, ending with a query issued between objects.]

Bucketing and Game Object State Vectors

• Major contributor to the ubiquitous “one frame off bug.”
• Update ordering via bucketing helps, but only when the following rule is adhered to:
    An object in bucket b may only read the state of objects in buckets (b - 1), (b - 2), ...
• You can’t reliably read the state of objects in your own bucket!

One Frame Off Bugs in Collision Queries

• Another complication arises because:
    • Collision and physics run after the game objects have updated.
    • So, the location of all collision geometry is one frame old when we cast our rays/spheres.

[Figure: the collision geo is left behind, so the ray cast misses its target.]

One Frame Off Bugs in Collision Queries

• Problem:
    • What we really want is to update each game object’s collision geometry with its bucket.
        • Impossible, because coll/phys update is monolithic: it happens after all game objects have been updated.
    • ... or is it?

One Frame Off Bugs in Collision Queries

• Observation:
    • Ray and sphere casts don’t care about the full collision/physics update.
        • Don’t need contact information, velocity, etc.
        • Only need to know where the collision geo will be.
• Solution:
    • All we need to do is update the broadphase AABBs between buckets.

Solving One Frame Off Bugs in Collision Queries

• Instead, we’ll update the AABBs early, in between game object bucket updates.
    • This allows our collision queries to return correct results.

[Figure labels: collision geo is left behind; broadphase AABB update; ray cast hits.]

Achieving Parallelism

Game Console Architecture: Xbox 360

[Diagram: three PowerPC cores (Core 0, Core 1, Core 2), each with its own L1 data and L1 instruction caches; a shared L2 cache (1 MB); unified RAM (512 MB); GPU.]

Game Console Architecture: PLAYSTATION 3

[Diagram: PPU with L1 data and L1 instruction caches and an L2 cache (512 kB); SPU0, SPU1 ... SPU6, each with a local store (256 kB); DMAC and EIB; system RAM (256 MB); video RAM (256 MB); GPU.]

Thinking About Modern Multiprocessor Hardware

• We’ll think in terms of multiple hardware threads.
• These could be provided by:
    • hyperthreaded CPU,
    • multiple cores,
    • SPUs on PS3/Cell.

Ways to Achieve Parallelism: Thread per Subsystem

[Figure: the game loop thread alongside separate threads for animation, collision/physics, rendering and audio.]

Ways to Achieve Parallelism: Fork/Join

[Figure: set up, fork into thread0, thread1, thread2 and thread3, join, then process results.]

Ways to Achieve Parallelism: Jobs

[Figure: the game loop on the PPU kicks jobs (anim pose, ray cast, broad phase, visibility, mesh proc, ...) that run on SPU0, SPU1 ... SPU5, and later waits for them.]

Ways to Achieve Parallelism: Jobs

• Job = [ (code + input data) → output data ]
    • Kick job = request job to be scheduled on a HW thread.
    • Must wait for job before processing its output data.
        • If job is done, wait takes close to zero time.
        • If job is not done, main thread (PPU) blocks until it is done.
• Job manager handles scheduling, allocates buffers in local store, and coordinates DMAs.
    • Programmer specifies data sources & destinations via a DMA list.
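On a desktop platform the kick/wait pattern can be approximated with `std::async` and `std::future` from the C++ standard library. This is only an analogy: there is no local store or DMA list here, and `KickJob`, `WaitJob` and the summing "job" are hypothetical stand-ins for a real job manager. The lambda init-capture requires C++14.

```cpp
#include <cassert>
#include <future>
#include <numeric>
#include <utility>
#include <vector>

// Handle to a running job; the job's output data is an int in this sketch.
using JobHandle = std::future<int>;

// Kick: request that (code + input data) be run on another hardware thread.
// The input is moved into the job so the job owns its own copy of the data.
inline JobHandle KickJob(std::vector<int> input) {
    return std::async(std::launch::async, [data = std::move(input)] {
        return std::accumulate(data.begin(), data.end(), 0);
    });
}

// Wait: returns almost immediately if the job is done; otherwise blocks the
// calling thread until it is. Only after this may the output be processed.
inline int WaitJob(JobHandle& h) {
    assert(h.valid());
    return h.get();
}
```

Typical use is to call `KickJob`, carry on with unrelated work on the main thread, and call `WaitJob` only at the point where the result is actually needed.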

Parallelism on the Train

    // Variant 1: results are picked up next frame.
    RayCastHandle hRayCast = INVALID;

    while (!quit)
    {
        // ...

        // wait and re-kick immediately
        if (hRayCast.IsValid())
        {
            waitRayCast(hRayCast);
            processRayCastResults(hRayCast);
        }
        hRayCast = kickRayCast(...);

        // ...
    }

    // Variant 2: results are picked up later in the same frame.
    while (!quit)
    {
        // ...

        RayCastHandle hRayCast = kickRayCast(...);

        // do other useful work on PPU
        // ...

        waitRayCast(hRayCast);
        processRayCastResults(hRayCast);

        // ...
    }

Parallelism on the Train

• How does parallelism with jobs affect our train design?
    • Large-scale engine system updates and ray/sphere casts are largely asynchronous in U2:AT.
        • Main game loop (PPU) kicks jobs on SPU.
        • Other work can proceed on the PPU while jobs are running.
        • Results picked up later this frame... or next frame.

Parallelism on the Train

• Data must be compact and contiguous so it can be DMA’d to the SPUs.
    • We’re doing this already, to maintain good cache coherency.
• The collision world must be locked in order to update broadphase AABBs:
    • Wait until all outstanding ray/sphere jobs are done.
    • Lock.
    • Update AABBs.
    • Unlock.

Parallelism on the Train

• We can even do our broadphase AABB updates asynchronously.

[Figure: PPU and SPU timelines. PPU: Update Bucket 0, Kick BP Job, Update Bucket 1, query/cast. SPU: Broadphase AABB Job.]

Conclusions

Conclusions

• The combination of modern hardware restrictions and the problems generated by the train level...
    • forced us to design our engine in a robust and efficient manner.
• Results:
    • U2:AT boasts near 100% hardware utilization on PS3 (PPU + all 6 SPUs).
    • Way-cool train level as pay-off for all the hard work.
    • Technology was the primary enabler for the convoy level as well.

Thanks For Listening!

• Feel free to send questions to me at: [email protected]