Overview Of Parallel Development - Ericnel
![Page 1: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/1.jpg)
Overview of Parallel Development
Eric Nelson
http://geekswithblogs.net/iupdateable
http://blogs.msdn.com/goto100
http://twitter.com/ericnel
![Page 2: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/2.jpg)
Agenda
Overview of what we are up to
Drill down into parallel programming for managed developers
![Page 3: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/3.jpg)
Things I learnt...
We have a very large investment in parallel computing
We have “something for everyone”
  It is not all synced, it is sometimes overlapping
It is a big topic
  Managed vs native vs client vs server vs task vs data...
Even with the investment, design/code/test for parallel is far harder
  Locking, Deadlocks, Livelocks
It is about getting ready for the future
  Code today – run better tomorrow?
VS2010 CTP – not a great place for parallel
  Single core in guest
  Unsupported route to use Hyper-V
Easiest route to dabble – Microsoft Parallel Extensions June CTP for VS2008
![Page 4: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/4.jpg)
Buying a new Processor
£100 - £300
2-3GHz
2 cores or 4
64-bit
[Diagram: Core, Core]
![Page 5: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/5.jpg)
Buying a new Processor
£200 - £500
2-3GHz
4 cores with HT
64-bit
[Diagram: Core, Core, Core, Core, QuickPath Interconnect, Memory Controller]
![Page 6: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/6.jpg)
Where will it all end?
Unisys ES7000 (7600R) used with kind permission of Mr Henk van der Valk, Unisys, NL
![Page 7: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/7.jpg)
Was it a wise purchase?
[Diagram: Windows OS hosting App 1, App 2, ...; inside App 1: .NET CLR, .NET Framework, My Code]
![Page 8: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/8.jpg)
Was it a wise purchase?
Some environments scale to take advantage of additional CPU cores (mostly server-side)
A lot of code does not (mostly client-side)
  This code will see little benefit from future hardware advances
[Diagram: ASP.NET Web Forms/Services, WCF Services, WF Engine, ... on top of the .NET ThreadPool or a Custom Threading Strategy]
![Page 9: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/9.jpg)
What happened to “The Free Lunch”?
Bad sequential code will run faster on a faster processor
Just using parallel code is not enough
Bad parallel code WILL NOT run faster on more cores
![Page 10: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/10.jpg)
Applications Can Scale Well
[Chart: parallel speedup (0 to 64) against number of cores (0 to 64)]
Workloads plotted: Production Fluid, Production Face, Production Cloth, Game Fluid, Game Rigid Body, Game Cloth, Marching Cubes, Sports Video Analysis, Video Cast Indexing, Home Video Editing, Text Indexing, Ray Tracing, Foreground Estimation, Human Body Tracker, Portfolio Management, Geometric Mean
Graphics Rendering – Physical Simulation – Vision – Data Mining – Analytics
![Page 11: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/11.jpg)
What's The Problem?
Multithreaded programming is “hard” today
  Doable by only a subgroup of senior specialists
  Parallel patterns are not prevalent, well known, nor easy to implement
  So many potential problems
    Races, deadlocks, livelocks, lock convoys, cache coherency overheads, lost event notifications, broken serializability, priority inversion, and so on…
Businesses have little desire to “go deep”
  Best developers should focus on business value, not concurrency
  Need simple ways to allow all developers to write concurrent code
![Page 12: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/12.jpg)
void MatrixMult(int size, double** m1, double** m2, double** result)
{
    for (int i = 0; i < size; i++) {
        for (int j = 0; j < size; j++) {
            result[i][j] = 0;
            for (int k = 0; k < size; k++) {
                result[i][j] += m1[i][k] * m2[k][j];
            }
        }
    }
}
![Page 13: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/13.jpg)
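#include <windows.h>
#include <thread>

// NUMPROCS is assumed to be defined elsewhere as the number of processors.
void MatrixMult(int size, double** m1, double** m2, double** result)
{
    int N = size;
    int P = 2 * NUMPROCS;
    int Chunk = N / P;
    HANDLE hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);
    volatile long counter = P;  // number of chunks still running
    for (int c = 0; c < P; c++) {
        std::thread t([&, c] {
            // Each thread handles one block of rows; the last block takes the remainder.
            for (int i = c * Chunk; i < (c + 1 == P ? N : (c + 1) * Chunk); i++) {
                for (int j = 0; j < size; j++) {
                    result[i][j] = 0;
                    for (int k = 0; k < size; k++) {
                        result[i][j] += m1[i][k] * m2[k][j];
                    }
                }
            }
            // The last chunk to finish signals the waiting caller.
            if (InterlockedDecrement(&counter) == 0) SetEvent(hEvent);
        });
        t.detach();  // destroying a joinable std::thread would call std::terminate
    }
    WaitForSingleObject(hEvent, INFINITE);
    CloseHandle(hEvent);
}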
Synchronization Knowledge
Error prone
Heavy synchronization
Static partitioning
Lack of thread reuse
Tricks
Lots of boilerplate
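As a point of comparison with the hand-rolled version above, here is a minimal sketch of the same row-partitioned multiply in portable standard C++. It is not from the original deck: the `Matrix` alias, the function name `MatrixMultJoined` and the `workers` parameter are assumptions made for illustration. Joining the threads replaces the event and the interlocked counter, but the partitioning is still static and the threads are still not reused, which is what the library support later in the deck is meant to address.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // hypothetical alias for a square matrix

// Multiply two square matrices, splitting the rows of the result across up to
// `workers` threads. Each thread writes a disjoint set of rows, so no locks
// are needed; join() replaces the event-and-counter bookkeeping.
Matrix MatrixMultJoined(const Matrix& m1, const Matrix& m2, unsigned workers) {
    const std::size_t n = m1.size();
    Matrix result(n, std::vector<double>(n, 0.0));
    workers = std::max(1u, workers);
    const std::size_t chunk = (n + workers - 1) / workers;  // rows per thread, rounded up

    std::vector<std::thread> threads;
    for (std::size_t begin = 0; begin < n; begin += chunk) {
        const std::size_t end = std::min(n, begin + chunk);
        threads.emplace_back([&, begin, end] {
            for (std::size_t i = begin; i < end; ++i) {
                for (std::size_t j = 0; j < n; ++j) {
                    double sum = 0.0;
                    for (std::size_t k = 0; k < n; ++k) {
                        sum += m1[i][k] * m2[k][j];
                    }
                    result[i][j] = sum;
                }
            }
        });
    }
    for (auto& t : threads) {
        t.join();  // wait for every chunk to finish
    }
    return result;
}
```

Calling `MatrixMultJoined(a, b, std::thread::hardware_concurrency())` would use one chunk per hardware thread where that value is available.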
![Page 14: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/14.jpg)
Microsoft Parallel Computing Technologies
•Robotics-based manufacturing assembly line
•Silverlight Olympics viewer
•Enterprise search, OLTP, collab
•Animation / CGI rendering
•Weather forecasting
•Seismic monitoring
•Oil exploration
•Automotive control system
•Internet-based photo services
•Ultrasound imaging equipment
•Media encode/decode
•Image processing/enhancement
•Data visualization
Task Concurrency
Data Parallelism
Distributed/Cloud Computing
Local Computing
CCR
Maestro
TPL / PPL
Cluster TPL
Cluster PLINQ
MPI / MPI.Net
WCF
Cluster SOA
WF
PLINQ
TPL / PPL
CDS
OpenMP
WF
Compute Shader
![Page 15: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/15.jpg)
Visual Studio 2010: Tools / Programming Models / Runtimes
[Diagram: layered architecture with a Managed Library stack and a Native Library stack]
Integrated Tooling (Tools): Parallel Debugger, Profiler Concurrency Analysis
Programming Models: Task Parallel Library, PLINQ, Parallel Pattern Library, Agents Library, Data Structures
Concurrency Runtime: ThreadPool, Task Scheduler, Resource Manager
Operating System: Threads
![Page 16: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/16.jpg)
Explicit Tasking Support
.NET 4.0 Task Parallel Library
  Task, TaskFactory
  Parallel.For
  Parallel.ForEach
  Parallel.Invoke
  Concurrent data structures
Visual Studio 2010 C++ Parallel Pattern Library
  task, task_group
  parallel_for
  parallel_for_each
  parallel_invoke
  Concurrent data structures
  Primitives for message passing
  User-mode locks
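To give a feel for what explicit tasking looks like in code, here is a minimal sketch using the C++ standard library's `std::async` and `std::future` as a portable stand-in. It is not the Task Parallel Library or Parallel Pattern Library API named on this slide, and the function name `SumWithTasks` is an assumption for illustration. The shape is the same, though: start a unit of work, carry on, then wait for its result.

```cpp
#include <future>
#include <numeric>
#include <vector>

// Sum a vector by running the first half as an independent task while the
// calling thread sums the second half.
long long SumWithTasks(const std::vector<int>& values) {
    const auto mid = values.begin() + values.size() / 2;

    // Launch the first half as an asynchronous task.
    std::future<long long> first = std::async(std::launch::async, [&] {
        return std::accumulate(values.begin(), mid, 0LL);
    });

    // Compute the second half on the calling thread in the meantime.
    const long long second = std::accumulate(mid, values.end(), 0LL);

    // get() waits for the task to finish and returns its result.
    return first.get() + second;
}
```

The caller never creates, joins or signals a thread directly; the task object carries both the work and its eventual result.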
![Page 17: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/17.jpg)
Task Parallel Library (TPL)
![Page 18: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/18.jpg)
Task
No Threading to Threading to Tasks
![Page 19: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/19.jpg)
CLR Thread Pool
[Diagram: Program Thread, Program Thread, User Mode Scheduler, Global Queue, Worker Thread 1 ... Worker Thread p]
![Page 20: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/20.jpg)
CLR Thread Pool: Work-Stealing
[Diagram: Program Thread, Program Thread, User Mode Scheduler For Tasks, Global Queue, Local Queue, Local Queue, Worker Thread 1 ... Worker Thread p, Task 1 to Task 6]
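The local queues in the work-stealing diagram can be illustrated with a deliberately simple sketch. It assumes the usual work-stealing convention: the owning worker pushes and pops at one end (newest first), while an idle worker steals from the opposite end (oldest first). The class name `WorkStealingQueue` and its methods are hypothetical, and a mutex is used for clarity where a production scheduler would typically use a lock-free structure.

```cpp
#include <deque>
#include <mutex>

// A mutex-based illustration of a per-worker local queue of task ids.
class WorkStealingQueue {
public:
    // The owning worker adds new work at the back.
    void Push(int task) {
        std::lock_guard<std::mutex> lock(mutex_);
        tasks_.push_back(task);
    }

    // The owning worker takes the most recently pushed task (LIFO),
    // which tends to still be warm in its cache.
    bool PopLocal(int& task) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (tasks_.empty()) return false;
        task = tasks_.back();
        tasks_.pop_back();
        return true;
    }

    // Another worker with nothing to do takes the oldest task (FIFO),
    // from the opposite end to the owner.
    bool Steal(int& task) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (tasks_.empty()) return false;
        task = tasks_.front();
        tasks_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::deque<int> tasks_;
};
```

Because the owner and the thieves work at opposite ends, they rarely compete for the same task.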
![Page 21: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/21.jpg)
Debugger Support
Support both managed and native
1. Parallel Tasks
2. Parallel Stacks
![Page 22: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/22.jpg)
Higher Level Constructs
Even with Task there are common patterns that build into higher level abstractions
The Parallel class
  Invoke, For, For<T>, ForEach
Care needs to be taken with state, ordering
  “This is not your Father’s for loop”
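The warning about state and ordering can be made concrete with a small sketch. The `ParallelFor` helper below is hypothetical: a hand-written, chunked stand-in built on `std::thread`, not the .NET `Parallel.For` or the PPL `parallel_for`. The point it shows is that iterations run concurrently and in no guaranteed order, so any state shared between them has to be made safe, here with `std::atomic`.

```cpp
#include <algorithm>
#include <atomic>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: run body(i) for every i in [from, to) on up to
// `workers` threads. Iterations may execute in any order.
void ParallelFor(int from, int to, unsigned workers,
                 const std::function<void(int)>& body) {
    const int count = to - from;
    if (count <= 0) return;
    const int w = static_cast<int>(std::max(1u, workers));
    const int chunk = (count + w - 1) / w;  // iterations per thread, rounded up

    std::vector<std::thread> threads;
    for (int begin = from; begin < to; begin += chunk) {
        const int end = std::min(to, begin + chunk);
        threads.emplace_back([=, &body] {
            for (int i = begin; i < end; ++i) body(i);
        });
    }
    for (auto& t : threads) t.join();
}

// Sums 1..n. The shared total is a std::atomic, so concurrent updates are
// safe; a plain `long long` updated from several threads would be a data race.
long long SumTo(int n) {
    std::atomic<long long> total{0};
    ParallelFor(1, n + 1, 4, [&](int i) { total += i; });
    return total.load();
}
```

A loop body that appended to a shared list, or relied on iteration `i` running before `i + 1`, would need rethinking before being moved into a parallel loop.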
![Page 23: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/23.jpg)
Parallel
Parallel.ForEach
Parallel.Invoke
![Page 24: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/24.jpg)
Declarative Data Parallelism
Parallel LINQ-to-Objects (PLINQ)
  Enables LINQ devs to leverage multiple cores
  Fully supports all .NET standard query operators
  Minimal impact to existing LINQ model
var q = from p in people
        where p.Name == queryInfo.Name &&
              p.State == queryInfo.State &&
              p.Year >= yearStart &&
              p.Year <= yearEnd
        orderby p.Year ascending
        select p;
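For readers who think in native code, the following sketch shows roughly what running such a query in parallel involves: partition the input, filter each partition concurrently, merge the matches, then order them. It is an illustration in standard C++ and makes no claim about how PLINQ is implemented; the `Person` struct, the `QueryPeople` function and the `parts` parameter are assumptions that mirror the field names in the query above.

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <string>
#include <vector>

struct Person {  // hypothetical record matching the fields used in the query
    std::string Name;
    std::string State;
    int Year;
};

// Filter `people` in parallel partitions, then return the matches ordered by Year.
std::vector<Person> QueryPeople(const std::vector<Person>& people,
                                const std::string& name, const std::string& state,
                                int yearStart, int yearEnd, std::size_t parts = 4) {
    parts = std::max<std::size_t>(1, parts);
    const std::size_t chunk = (people.size() + parts - 1) / parts;

    // One asynchronous task per partition, each producing its own match list.
    std::vector<std::future<std::vector<Person>>> partials;
    for (std::size_t begin = 0; begin < people.size(); begin += chunk) {
        const std::size_t end = std::min(people.size(), begin + chunk);
        partials.push_back(std::async(std::launch::async, [&, begin, end] {
            std::vector<Person> matches;
            for (std::size_t i = begin; i < end; ++i) {
                const Person& p = people[i];
                if (p.Name == name && p.State == state &&
                    p.Year >= yearStart && p.Year <= yearEnd) {
                    matches.push_back(p);
                }
            }
            return matches;
        }));
    }

    // Merge the per-partition results, then apply the ordering once.
    std::vector<Person> result;
    for (auto& f : partials) {
        std::vector<Person> part = f.get();
        result.insert(result.end(), part.begin(), part.end());
    }
    std::stable_sort(result.begin(), result.end(),
                     [](const Person& a, const Person& b) { return a.Year < b.Year; });
    return result;
}
```

The appeal of the declarative form on the slide is that none of this partitioning and merging appears in the developer's own code.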
![Page 25: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/25.jpg)
Parallel LINQ
![Page 26: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/26.jpg)
What Next?
Download VS 2010 CTP
  Remember to set the clock back
Or
Download Parallel Extensions June CTP for VS2008
  Experiment with runtime and API
Team is working on Visual Studio 2010 beta
  Very open to feedback
  Join in the discussion forums
  http://blogs.msdn.com/pfxteam/
![Page 27: Overview Of Parallel Development - Ericnel](https://reader035.fdocuments.us/reader035/viewer/2022062614/545d44b8b0af9fa42c8b4df1/html5/thumbnails/27.jpg)
Parallel Computing Resources
Downloads, Binaries, Code, Forums, Blogs, Videos, Screencasts, Podcasts, Articles, Samples
http://msdn.com/concurrency
http://blogs.msdn.com/pfxteam/