Multiprocessor Game Loops: Lessons from Jason Gregory Naughty Dog, Inc.
Date posted: 22-Dec-2015
Game Forum Germany 2010

Agenda
• Goal: Learn about modern multiprocessor game engine update loops...
  - ... by investigating Naughty Dog’s train level.

Agenda
• During our investigation, we’ll answer the following questions:
  - Why did we make the train itself dynamic (not static)?
  - How does the train move?
  - How does it tie into the update of other engine systems?
  - How do player mechanics and animation work on the train?
  - How do game object attachment hierarchies work?
  - How are ray and sphere casts used on the train?
  - How do we utilize the PS3’s parallel computing resources?

Could the Train be Static?
• In the past, games with trains in them have used a static train approach.
  - Train is actually stationary.
  - Background scrolls by to produce illusion of movement.
• This solves a lot of problems.
  - Player mechanics, NPC locomotion, weapon mechanics, etc. are all the same as in a “regular” non-moving game level.

Was That Good Enough for Us?
• No way!
• Stationary train design imposes undue limitations:
  - Train must move in a perfectly straight line...
  - ... or along a broad circular arc (constant curvature).
  - No ups and downs allowed in the track either.
  - Not nearly as much fun to jump and shoot between trains.
∴ We decided to go for a fully dynamic train.

The Master Car
• Each train car is an independent game object (GO).
• One car is designated as the master.
  - It moves without regard for the other cars.
  - Every other car is a slave: it simply maintains proper spacing with the car(s) in front of and/or behind it.

Spacing and Speed
[Figure: track with points labeled u = 0, u = 0.5, u = 1 and s = 0, s = 7, s = 14.]

Updating the Cars
• Train car game objects need to update in a specific order:
  - Master first.
  - Then cars in front of master, from master to locomotive.
  - Then cars behind master, from master to caboose.
[Figure: train cars labeled 1, 2, 3, 4 ... n, n + 1, n + 2 ... m.]

Updating the Cars
• In the Uncharted 2 engine, game objects are managed as a tree.
  - Children update after their parent.
• For the train...
[Figure: tree rooted at the master car M, with nodes M+1, M+2, M+3 and M−1, M−2, M−3.]

Teleporting the Train
• The train sometimes teleports in the game.
  - e.g. To transition from a looped section to a straight-away.
  - Typically hidden by a camera cut.

Teleporting the Train
• Problem: Teleport the train such that whichever car the player is on moves to a predefined location on the track.
  - To implement, change the master to be the player’s car...
  - ... and teleport it to the desired location.
  - All other cars follow automatically.
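
The update order and the master switch described above can be sketched as a single function. This is only an illustration, not the Uncharted 2 code: `TrainUpdateOrder`, the locomotive-first indexing, and the flat list (instead of a game object tree) are all assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: cars are indexed from the locomotive (0) to the
// caboose (carCount - 1). Returns the indices in update order:
// master first, then the cars ahead of it (master -> locomotive),
// then the cars behind it (master -> caboose).
std::vector<std::size_t> TrainUpdateOrder(std::size_t carCount,
                                          std::size_t masterIndex)
{
    std::vector<std::size_t> order;
    if (carCount == 0 || masterIndex >= carCount)
        return order;

    order.reserve(carCount);
    order.push_back(masterIndex);

    // Cars in front of the master, walking toward the locomotive.
    for (std::size_t i = masterIndex; i-- > 0; )
        order.push_back(i);

    // Cars behind the master, walking toward the caboose.
    for (std::size_t i = masterIndex + 1; i < carCount; ++i)
        order.push_back(i);

    return order;
}
```

Under these assumptions, changing the master for a teleport amounts to calling the function again with the player's car as `masterIndex`; every other car is then placed relative to it during the same update pass.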

Large-Scale Engine Systems
• Let’s define large-scale engine systems as:
  - Engine components that operate on lots of data...
  - ... and require careful performance optimization.
• Examples include:
  - skeletal animation,
  - collision detection,
  - rigid body dynamics,
  - rendering, ...

A Simple Approach (That Doesn’t Work)
• Most game programmers first learn about the game loop from rendering tutorials.

    while (!quit)
    {
        ReadJoypad();
        UpdateScene();
        DrawScene();
        FlipBuffers();
    }

A Simple Approach (That Doesn’t Work)
• Since we need to update our game objects anyway...
  - UpdateScene() becomes UpdateGameObjects()
• In the spirit of good object-oriented design, we should let the game objects drive the large-scale engine systems.
(Right???)

A Simple Approach (That Doesn’t Work)

    void Tank::Update()
    {
        MoveTank();
        AimTurret();
        FireIfNecessary();

        Animate();
        DetectCollisions();
        SimulatePhysics();
        UpdateAudio();
        Draw();
    }

    while (!quit)
    {
        ReadJoypad();
        UpdateGameObjects();
        // DrawScene();
        FlipBuffers();
    }

A Simple Approach (That Doesn’t Work)
• The only problem with this is... it doesn’t work!
• For one thing, it’s just not feasible for some engine systems.
  - e.g. Collision detection cannot be done (properly) one object at a time.
    • Need to solve iteratively, account for time of impact (TOI),
    • optimize collision detection via spatial subdivision (e.g. broadphase AABB sweep and prune),
    • optimize dynamics by grouping into islands, etc.

A Simple Approach (That Doesn’t Work)
• It’s also terribly inefficient:
  - potential for duplicated computations,
  - possible reallocation of resources,
  - poor data and instruction cache coherence,
  - not conducive to parallel computation.

Hardware in the Old Days
• On an Intel 80286 (circa 1982):
  - CPU speed was the bottleneck (6 MHz).
    • ∴ Use if() tests to avoid unnecessary computations.
  - Integer instructions faster than floating-point.
    • ∴ Use fixed-point numbers, or FPU.
  - More compute power provided by adding transistors.

Modern Hardware
• On a Cell (PS3) or Xenon (Xbox 360):
  - Memory access time is the bottleneck.
    • L1 cache miss = ~60 cycles.
    • L2 cache miss = ~500 cycles.
    • ∴ Organize data in compact, contiguous blocks.
    • Use struct of arrays rather than array of structs.
  - Pipelined architecture = branching & data conversion are slow.
    • ∴ Use duplicated computations to avoid if() tests.
  - Compute power is provided via parallelism.
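
To make the "struct of arrays rather than array of structs" advice concrete, here is a minimal sketch. The particle fields and the names `ParticleAoS`, `ParticlesSoA`, and `Integrate` are invented for illustration; they are not from the talk.

```cpp
#include <cstddef>
#include <vector>

// Array of structs: the fields of each particle are interleaved in memory,
// so a loop that only needs positions and velocities also pulls every
// mass value through the cache.
struct ParticleAoS
{
    float x, y, z;
    float vx, vy, vz;
    float mass;
};

// Struct of arrays: each field is stored in its own contiguous block.
struct ParticlesSoA
{
    std::vector<float> x, y, z;
    std::vector<float> vx, vy, vz;
    std::vector<float> mass;
};

// Advances positions by velocity * dt. Each loop streams through two
// contiguous arrays and never touches 'mass'.
void Integrate(ParticlesSoA& p, float dt)
{
    const std::size_t n = p.x.size();
    for (std::size_t i = 0; i < n; ++i) p.x[i] += p.vx[i] * dt;
    for (std::size_t i = 0; i < n; ++i) p.y[i] += p.vy[i] * dt;
    for (std::size_t i = 0; i < n; ++i) p.z[i] += p.vz[i] * dt;
}
```

The sketch assumes all seven arrays are kept the same length; a real system would enforce that in one place.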

Optimizing Large-Scale Updates
[Figure: two data layouts. "Inefficient": GameObject 1, GameObject 2, GameObject 3, ... each contain their own Anim Data, Coll. Data, and Draw Data. "Much better!": the Animation Engine holds GO 1, GO 2, GO 3, ... Anim Data; the Collision Engine holds GO 1, GO 2, GO 3, ... Coll. Data; the Rendering Engine holds GO 1, GO 2, GO 3, ... Draw Data.]

Optimizing Large-Scale Updates
• Game objects do not own their large-scale system data...
  - ... they point to data that is owned by the large-scale systems.
• Game objects should not mutate their large-scale system data...
  - ... they should only request changes.
• That way, each large-scale system can:
  - apply any requested changes as part of its batch update,
  - organize its data in whatever manner is most efficient.
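
One way to picture "request, don't mutate" is a system that queues requests and applies them at the start of its batch update. The sketch below is an assumption-laden illustration: `AnimationEngine`, `RequestClip`, and the index-based `Handle` are hypothetical, and the talk's own code uses raw pointers rather than handles.

```cpp
#include <cstddef>
#include <vector>

// Per-object animation state, owned by the engine (not by the game object).
struct AnimControl
{
    int   clipId = -1;   // -1 means "no clip playing"
    float time   = 0.0f;
};

class AnimationEngine
{
public:
    using Handle = std::size_t;   // hypothetical: index into m_controls

    Handle Create()
    {
        m_controls.emplace_back();
        return m_controls.size() - 1;
    }

    // Called by game objects. Nothing changes until Update() runs.
    void RequestClip(Handle h, int clipId)
    {
        m_requests.push_back({h, clipId});
    }

    // Batch update: apply the queued requests, then advance every control.
    void Update(float dt)
    {
        for (const Request& r : m_requests)
        {
            m_controls[r.handle].clipId = r.clipId;
            m_controls[r.handle].time   = 0.0f;
        }
        m_requests.clear();

        for (AnimControl& c : m_controls)
            if (c.clipId >= 0)
                c.time += dt;
    }

    const AnimControl& Get(Handle h) const { return m_controls[h]; }

private:
    struct Request { Handle handle; int clipId; };

    std::vector<AnimControl> m_controls;   // contiguous, engine-owned
    std::vector<Request>     m_requests;   // pending changes
};
```

Because the engine is the only writer, it is free to reorder or repack `m_controls` between frames without any game object noticing.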

Optimizing Large-Scale Updates

    class Tank : public GameObject
    {
    public:
        // interface...

    private:
        // components -- owned by engine subsystems
        AnimControl*  m_pAnimControl;
        RigidBody*    m_pRigidBody;
        MeshInstance* m_pRenderable;
    };

[Figure: the engine systems and the components they own. Animation Engine: animControl15, animControl42, animControl7, ... and skelInstance15, skelInstance42, skelInstance7, ... Collision/Physics World: rigidBody9 / collidable9, rigidBody3 / collidable3, rigidBody41 / collidable41, ... Scene Graph: meshInstance8, meshInstance5, meshInstance37, ...]

    while (!quit)
    {
        ReadJoypad();

        g_gameWorld.UpdateAllGameObjects();
        //for (each GameObject* pGo)
        //    pGo->Update();

        g_animationEngine.Update();
        g_collisionWorld.DetectCollisions();
        g_physicsWorld.Simulate();
        g_audioEngine.Update();
        g_renderingEngine.DrawAndFlipBuffers();
    }

Player Mechanics on a Moving Object
• Player’s position and orientation represented as an attached frame of reference:
  - reference to parent game object,
  - local transform (relative to parent),
  - cached world-space transform (with dirty flag).
• Not quite as simple as it may sound!
  - Just switching to attached frames got us ~60% of the way there.
  - Lots of bugs to fix.
  - Special-case handling of transitions between attached frames.
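
A rough sketch of an attached frame of reference with a cached world transform follows. It is simplified on purpose: transforms are translation-only, and a version counter (an addition for this example, not something the slides mention) lets a child notice that its parent has moved. `Frame`, `AttachTo`, and the other names are hypothetical.

```cpp
struct Vec3 { float x, y, z; };

inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }

class Frame
{
public:
    explicit Frame(const Frame* parent = nullptr) : m_parent(parent) {}

    // Local transform, relative to the parent (translation only here).
    void SetLocal(Vec3 local) { m_local = local; m_dirty = true; }

    // Switch parents without moving in world space, e.g. when the player
    // jumps from one train car to another.
    void AttachTo(const Frame* newParent)
    {
        const Vec3 world       = World();
        const Vec3 parentWorld = newParent ? newParent->World() : Vec3{0.0f, 0.0f, 0.0f};
        m_parent = newParent;
        m_local  = world - parentWorld;
        m_dirty  = true;
    }

    // Cached world-space transform; recomputed only when this frame is
    // dirty or the parent's cached transform has changed since last time.
    Vec3 World() const
    {
        Vec3     parentWorld{0.0f, 0.0f, 0.0f};
        unsigned parentVersion = 0;
        if (m_parent)
        {
            parentWorld   = m_parent->World();
            parentVersion = m_parent->m_version;
        }
        if (m_dirty || parentVersion != m_seenParentVersion)
        {
            m_world             = parentWorld + m_local;
            m_seenParentVersion = parentVersion;
            m_dirty             = false;
            ++m_version;
        }
        return m_world;
    }

private:
    const Frame*     m_parent;
    Vec3             m_local{0.0f, 0.0f, 0.0f};
    mutable Vec3     m_world{0.0f, 0.0f, 0.0f};
    mutable bool     m_dirty = true;
    mutable unsigned m_version = 0;
    mutable unsigned m_seenParentVersion = 0;
};
```

A production version would carry rotation as well and would need the special-case transition handling the slide mentions; this sketch only shows the parent reference, local transform, and cache.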

Animation Pipeline Review
1. Animation Update
  - Update clocks, trigger keyframed events, detect end-of-animation conditions, take transitions to new animation states.
2. Pose Blending Phase
  - Extract poses from anim clips, blend individual joint poses.
    • Generates local joint transforms (parent-relative).

Animation Pipeline Review
3. Global Pose Phase
  - Walk hierarchy, calculate global joint transforms (model-space or world-space) and matrix palette (input to renderer).
4. Post-Processing
  - Apply IK, procedural animation, etc.
    • Generates new local joint transforms.
    • Recalculate global poses if necessary (re-run phase 3).

Game Object Hooks
• It’s convenient to provide game objects with hooks into the various phases of the animation update.
• What we do at Naughty Dog:
  - GameObject::Update()
    • Regular GO update; runs before animation.
  - GameObject::PostAnimUpdate()
    • Runs after Animation Phase 1.
    • Game objects may respond to animation events, force transitions to new animation states, etc.

Game Object Hooks
  - GameObject::PostAnimBlending()
    • Runs after Animation Phase 2.
    • Game objects can apply procedural animation in local space.
  - GameObject::PostJointUpdate()
    • Runs after Animation Phase 3.
    • This is the first time during the frame that the global transform of every joint is known.
    • Game objects can apply IK, or use the global space joint transforms in other ways.

    while (!quit)
    {
        // ...
        g_gameWorld.UpdateAllGameObjects();

        g_animationEngine.RunPhase1();
        g_gameWorld.CallPostAnimUpdateHooks();

        g_animationEngine.RunPhase2();
        g_gameWorld.CallPostAnimBlendingHooks();

        g_animationEngine.RunPhase3();
        g_gameWorld.CallPostJointUpdateHooks();
        // ...
    }

Bucketed Game Object Updates
• Can implement this by updating game objects in buckets.
  - Game object dependencies represented by a dependency tree.
  - Each bucket corresponds to one level of the tree.
[Figure: dependency tree of game objects (trainCar7, trainCar8, trainCar9, crate21, box12, playerCharacter, npc65, weapon17, weapon3, ...) divided into Bucket 0, Bucket 1, Bucket 2, and Bucket 3.]

Bucketed Game Object Updates
• Simpler to group objects into pre-determined buckets.
[Figure: the same game objects (trainCar7, trainCar8, trainCar9, crate21, box12, playerCharacter, npc65, weapon17, weapon3, ...) grouped into buckets labeled Vehicle Bucket, Object Bucket, Character Bucket, and Prop Bucket.]

Bucketed Game Object Updates

    while (!quit)
    {
        // ...
        for (each bucket)
        {
            g_gameObjectMgr.UpdateObjects(bucket);
            AnimateBucket(bucket);
        }
        g_collphysWorld.CollideAndSimulate(dt);
        g_audioEngine.Update(dt);
        g_renderingEngine.DrawAndFlipBuffers(dt);
    }

    void AnimateBucket(bucket)
    {
        g_animationEngine.UpdateClocks(bucket);
        for (each GameObject* pGo in bucket)
            pGo->PostAnimUpdate();

        g_animationEngine.CalcLocalPoses(bucket);
        for (each GameObject* pGo in bucket)
            pGo->PostAnimBlending();

        g_animationEngine.CalcGlobalPoses(bucket);
        for (each GameObject* pGo in bucket)
            pGo->PostJointUpdate();
    }

Ray and Sphere Casts
• A collision cast is an instantaneous collision query.
  - Input:
    • Snapshot of collision geometry in game world at time t.
    • One or more rays / moving spheres (capsules) to cast.
  - Output:
    • Would any of the rays/spheres strike anything?
    • If so, what?

Uses for Collision Casting
• Player and NPCs use downward casts to determine what surface they are standing on.
• NPCs use ray casts to answer line of sight questions.
• Weapons use ray casts to determine bullet impacts.
• and the list goes on...

Inter-Game-Object Queries and Synchronization
• Synchronization problems can arise whenever:
  - game object A...
  - ... queries game object B.
• Not just limited to ray and sphere casts.
  - Any kind of inter-object query can be affected.

Game Object State Vectors
$$S_i(t) = [\, \mathbf{r}_i(t),\ \mathbf{v}_i(t),\ m_i,\ \ldots,\ \mathrm{health}_i(t),\ \mathrm{ammo}_i(t),\ \ldots \,]$$

Bucketing and Game Object State Vectors
[Figure sequence (three slides): state vectors S_A, S_B, and S_C shown at t_1 and at t_2 = t_1 + ∆t; the last slide shows S_A together with a "query" label.]

Bucketing and Game Object State Vectors
• Major contributor to the ubiquitous “one frame off bug.”
• Update ordering via bucketing helps, but only when the following rule is adhered to:
  An object in bucket b may only read the state of objects in buckets (b - 1), (b - 2), ...
• You can’t reliably read the state of objects in your own bucket!
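
The bucket rule above is simple enough to check mechanically, for instance in a debug build. The following is a sketch only; `StateRead`, `IsSafeRead`, and `FindUnsafeReads` are made-up names, and nothing here is claimed to exist in the Naughty Dog engine.

```cpp
#include <string>
#include <vector>

// One recorded "object A read the state of object B" event.
struct StateRead
{
    std::string reader;
    int         readerBucket;
    std::string target;
    int         targetBucket;
};

// The rule from the slide: an object in bucket b may only read objects in
// buckets b - 1, b - 2, ... Reads within the same bucket are not reliable.
inline bool IsSafeRead(const StateRead& r)
{
    return r.targetBucket < r.readerBucket;
}

// Returns every read that breaks the rule and could therefore see state
// that is one frame off.
std::vector<StateRead> FindUnsafeReads(const std::vector<StateRead>& reads)
{
    std::vector<StateRead> unsafe;
    for (const StateRead& r : reads)
        if (!IsSafeRead(r))
            unsafe.push_back(r);
    return unsafe;
}
```

For example, a character in bucket 2 reading a train car in bucket 0 passes the check, while two objects in bucket 2 reading each other do not.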

One Frame Off Bugs in Collision Queries
• Another complication arises because:
  - Collision and physics run after the game objects have updated.
  - So, the location of all collision geometry is one frame old when we cast our rays/spheres.
[Figure labels: collision geo is left behind; ray cast misses its target.]

One Frame Off Bugs in Collision Queries
• Problem:
  - What we really want is to update each game object’s collision geometry with its bucket.
    • Impossible, because coll/phys update is monolithic: it happens after all game objects have been updated.
  - ... or is it?

One Frame Off Bugs in Collision Queries
• Observation:
  - Ray and sphere casts don’t care about the full collision/physics update.
    • Don’t need contact information, velocity, etc.
    • Only need to know where the collision geo will be.
• Solution:
  - All we need to do is update the broadphase AABBs between buckets.

Solving One Frame Off Bugs in Collision Queries
• Instead, we’ll update the AABBs early, in between game object bucket updates.
  - This allows our collision queries to return correct results.
[Figure labels: collision geo is left behind; broadphase AABB update; ray cast hits.]
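
The effect of refreshing broadphase AABBs between buckets can be shown with a toy frame. Everything in this sketch is an assumption made for illustration: boxes are one-dimensional, the "cast" is a point test, and `CollisionWorld`, `TrainCar`, and `CastHitsMovedCar` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// 1-D stand-in for a broadphase AABB; a real engine uses 3-D boxes.
struct Aabb { float min, max; };

struct CollisionWorld
{
    std::vector<Aabb> proxies;   // one broadphase box per object

    bool PointCastHits(float x) const
    {
        for (const Aabb& b : proxies)
            if (x >= b.min && x <= b.max)
                return true;
        return false;
    }
};

struct TrainCar
{
    float       center;
    float       halfLength;
    std::size_t proxy;           // index into CollisionWorld::proxies
};

// Simulates one frame with two buckets and reports whether a cast made in
// bucket 1 hits the car that moved in bucket 0.
bool CastHitsMovedCar(bool refreshAabbsBetweenBuckets)
{
    CollisionWorld world;
    world.proxies.push_back({0.0f, 2.0f});   // car's box from last frame
    TrainCar car{1.0f, 1.0f, 0};

    // Bucket 0: the car moves 10 units along the track.
    car.center += 10.0f;

    // Early broadphase update, between bucket 0 and bucket 1.
    if (refreshAabbsBetweenBuckets)
        world.proxies[car.proxy] = {car.center - car.halfLength,
                                    car.center + car.halfLength};

    // Bucket 1: the player casts at the car's current position.
    const bool hit = world.PointCastHits(car.center);

    // The full collision/physics update would run here, after all buckets.
    return hit;
}
```

Without the refresh the cast tests against last frame's box and misses; with it, the cast sees the box where the car actually is.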

Game Console Architecture: Xbox 360
[Figure: three PowerPC cores (Core 0, Core 1, Core 2), each with L1 data and L1 instruction caches; a shared L2 cache (1 MB); unified RAM (512 MB); GPU.]

Game Console Architecture: PLAYSTATION 3
[Figure: PPU with L1 data and L1 instruction caches and an L2 cache (512 kB); SPU0, SPU1, ..., SPU6, each with a local store (256 kB); DMAC; EIB; system RAM (256 MB); GPU; video RAM (256 MB).]

Thinking About Modern Multiprocessor Hardware
• We’ll think in terms of multiple hardware threads.
• These could be provided by:
  - hyperthreaded CPU,
  - multiple cores,
  - SPUs on PS3/Cell.

Ways to Achieve Parallelism: Thread per Subsystem
[Figure: the game loop alongside separate threads for animation, collision/physics, rendering, and audio.]

Ways to Achieve Parallelism: Fork/Join
[Figure: set up; fork into thread0, thread1, thread2, thread3; join; process results.]
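
The fork/join pattern in the figure can be sketched with standard C++ threads. This is a generic illustration rather than anything console-specific: `ForkJoinSum` and the choice of summing integers as the workload are assumptions for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Splits 'data' into one chunk per thread, sums the chunks in parallel
// (fork), waits for every thread (join), then combines the partial sums
// (process results).
long long ForkJoinSum(const std::vector<int>& data, unsigned threadCount)
{
    // Set up.
    if (threadCount == 0)
        threadCount = 1;
    std::vector<long long>   partial(threadCount, 0);
    std::vector<std::thread> workers;
    const std::size_t chunk = (data.size() + threadCount - 1) / threadCount;

    // Fork: each worker writes only its own slot in 'partial'.
    for (unsigned t = 0; t < threadCount; ++t)
    {
        const std::size_t begin = std::min(data.size(), t * chunk);
        const std::size_t end   = std::min(data.size(), begin + chunk);
        workers.emplace_back([&data, &partial, t, begin, end] {
            partial[t] = std::accumulate(
                data.begin() + static_cast<std::ptrdiff_t>(begin),
                data.begin() + static_cast<std::ptrdiff_t>(end), 0LL);
        });
    }

    // Join.
    for (std::thread& w : workers)
        w.join();

    // Process results.
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

The main thread does nothing useful between fork and join here, which is one reason a job system can make better use of the hardware.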

Ways to Achieve Parallelism: Jobs
[Figure: the game loop on the PPU kicks jobs and later waits on them; jobs such as anim pose, ray cast, broad phase, visibility, and mesh proc run on SPU0, SPU1, ..., SPU5.]

Ways to Achieve Parallelism: Jobs
• Job = [ (code + input data) → output data ]
  - Kick job = request job to be scheduled on a HW thread.
  - Must wait for job before processing its output data.
    • If job is done, wait takes close to zero time.
    • If job is not done, main thread (PPU) blocks until it is done.
• Job manager handles scheduling, allocates buffers in local store, and coordinates DMAs.
  - Programmer specifies data sources & destinations via a DMA list.
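
The kick/wait contract described above maps loosely onto `std::async` and `std::future` on a PC. The sketch below is an analogy under that assumption: `JobHandle` and `KickJob` are hypothetical names, and there is no local store or DMA list here, so it models only the scheduling and blocking behavior.

```cpp
#include <future>
#include <utility>

// Handle to a job that has been kicked. Wait() returns the output data.
template <typename Output>
class JobHandle
{
public:
    JobHandle() = default;
    explicit JobHandle(std::future<Output> f) : m_future(std::move(f)) {}

    bool IsValid() const { return m_future.valid(); }

    // Returns almost immediately if the job is done; otherwise blocks the
    // calling thread until it is. The handle is invalid afterwards.
    Output Wait() { return m_future.get(); }

private:
    std::future<Output> m_future;
};

// Job = (code + input data) -> output data. Kicking asks for the job to be
// run on another thread; the input is copied/moved into the job.
template <typename Fn, typename Input>
auto KickJob(Fn code, Input input) -> JobHandle<decltype(code(input))>
{
    using Output = decltype(code(input));
    return JobHandle<Output>(std::async(
        std::launch::async,
        [code = std::move(code), input = std::move(input)]() mutable {
            return code(input);
        }));
}
```

A caller would kick the job, do unrelated work, and call `Wait()` only when it needs the output, which keeps the blocking case rare.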

Parallelism on the Train

    RayCastHandle hRayCast = INVALID;

    while (!quit)
    {
        // ...

        // wait and re-kick immediately
        if (hRayCast.IsValid())
        {
            waitRayCast(hRayCast);
            processRayCastResults(hRayCast);
        }
        hRayCast = kickRayCast(...);

        // ...
    }

    while (!quit)
    {
        // ...

        RayCastHandle hRayCast = kickRayCast(...);

        // do other useful work on PPU
        // ...

        waitRayCast(hRayCast);
        processRayCastResults(hRayCast);

        // ...
    }

Parallelism on the Train
• How does parallelism with jobs affect our train design?
  - Large-scale engine system updates and ray/sphere casts are largely asynchronous in U2:AT.
    • Main game loop (PPU) kicks jobs on SPU.
    • Other work can proceed on the PPU while jobs are running.
    • Results picked up later this frame... or next frame.

Parallelism on the Train
• Data must be compact and contiguous so it can be DMA’d to the SPUs.
  - We’re doing this already, to maintain good cache coherency.
• The collision world must be locked in order to update broadphase AABBs:
  - Wait until all outstanding ray/sphere jobs are done.
  - Lock.
  - Update AABBs.
  - Unlock.

Parallelism on the Train
• We can even do our broadphase AABB updates asynchronously.
[Figure: PPU/SPU timeline with labels Update Bucket 0, Kick BP Job, Update Bucket 1, and query/cast (PPU) and Broadphase AABB Job (SPU).]

Conclusions
• The combination of modern hardware restrictions and the problems generated by the train level...
  - forced us to design our engine in a robust and efficient manner.
• Results:
  - U2:AT boasts near 100% hardware utilization on PS3 (PPU + all 6 SPUs).
  - Way-cool train level as pay-off for all the hard work.
  - Technology was the primary enabler for the convoy level as well.

Thanks For Listening!
• Feel free to send questions to me at: [email protected]