Sly and the RoarVM: Exploring the Manycore Future of Programming
-
Upload
stefan-marr -
Category
Technology
-
view
6.797 -
download
4
description
Transcript of Sly and the RoarVM: Exploring the Manycore Future of Programming
Sly and the RoarVM Exploring the Manycore Future of Programming
Renaissance Project
h=p://[email protected]/~smarr/renaissance/
Stefan Marr
So@ware Languages Lab, Vrije Universiteit Brussel ExaScience Lab, 2011-‐08-‐24
Agenda
• Context: Renaissance Project • Sly – Demo – Language Overview
• RoarVM – Intercore Messaging – Read-‐mostly heap – Pauseless GC
• Outlook 23/08/11 2
Renaissance Project
• CollaboraWon between – IBM Research, Portland State Univ, VUB
• Problem – Parallel systems are nondeterminisWc – CommunicaWon is expensive
• The vision – Embrace nondeterminism – Harness emergence – Confidence → Delay
23/08/11 3
Possible Real Life ApplicaWon
23/08/11 5
Image: h=p://blogs.technet.com/b/andrew/archive/2007/08/22/olap-‐cubes-‐and-‐mulWdimensional-‐analysis.aspx
Possible Real Life ApplicaWon
23/08/11 6
• ACK: ongoing work at IBM Research
• Online analyWcal processing (OLAP) – Business intelligence, reporWng, data mining
• Query mulWdimensional data sets – Includes dependent/calculated data dimensions – Sync. required to avoid use of stale data
• Without sync.: how to bound error? – RecalculaWng, refreshing results
Sly
• Ensembles: collecWons represenWng a whole, mulW-‐part enWWes – e.g. a flock of birds
• Adverbs: modifiers to operands – individualLY, serialLY, randomLY: n
• Gerunds: reducWon semanWcs – averagING, selectINGpredicate, ensemblING
23/08/11 7
Examples of Boids
• Every boid is its own thread, no synch.
computeCentroid ^ self flock boids averagINGserialLYposition
computeNeighbors ^ self flock boids selectINGisNear: self
matchVelocityWithNeighbors ^ self neighbors averagINGvelocity / 8.0 More details: h=p://[email protected]/~smarr/renaissance/sly3-‐overview/
23/08/11 8
The Manycore Problem
• Shared memory access very expensive
– Inter-‐core messaging – Purpose-‐specific networks
23/08/11 10
0 250 500 750 1000 1 word
10 words
DMA Load Mem. Msg: 3 hops Msg: 1 hop
(from [2])
Architecture Overview
• Core = Interpreter Instance = OS Process/Thread • ‘Single System Image’/shared memory VM
23/08/11 11
Message-‐Passing between Interpreters
• To avoid main-‐memory latency
• VM internal use only – GC
• preGCAcWon + ACK • doAllRootsHere + Response, …
– SafepoinWng • requestSafepointOnOtherCores • grantSafepoint, …
23/08/11 12
Memory System
• Object table, moving stop-‐the-‐world GC – Usually allocaWon on heap of local core
24/08/11 13
Object Heap
Memory System
• Object table, moving stop-‐the-‐world GC – Usually allocaWon on heap of local core
23/08/11 14
Object Heap
Global Safepoint
Read-‐Mostly Heap [2]
• TILE64 cache coherency VERY expensive – Read-‐mostly heap: incoherent, ‘user managed CC’
23/08/11 15
Cache
Cache
Cache
Read/Write
Read-‐mostly
Stop-‐the-‐World in Manycore Era?
23/08/11 16
Work
Arrival at safe point
Do garbage collect
Work
Run out of space
Pauseless GC
• Based on a paper by Click et al., 2005 [3] • No global synchronizaWon – Local checkpoints: 1 mutator and GC thread
• ConWnuously running GC threads • Load-‐Value-‐barrier + self-‐healing of stale refs. • Constant performance overhead – Without hardware support
• Easy to implement (2 MA students, 1 month)
23/08/11 18
Conclusion
• Sly: map/reduce-‐like programming model – Based on ensembles, adverbs and gerunds
• RoarVM – Research arWfact, stable, used daily – No opWmizaWon of sequenWal performance – Tested up to 59cores – Shows weak scalability
23/08/11 20
Outlook
• Sly – Ideas for non-‐linear syntax – Algorithms for nondeterminisWc programming
• RoarVM: focus on research – VMs are mulW-‐language plarorms – Support for concurrency models – DSLs for specific concurrent/parallel problems – Language guarantees in inter-‐model interacWon
23/08/11 21
References and Material [1] J. Pallas and D. Ungar. MulWprocessor Smalltalk: A Case Study of a MulWprocessor-‐based Programming Environment. In PLDI ’88, p. 268–277. ACM, 1988.
[2] D. Ungar and S. Adams. HosWng an Object Heap on Manycore Hardware: An ExploraWon. In DLS’09, p. 99-‐110. ACM, 2009.
[3] C. Click, G. Tene, and M. Wolf. The Pauseless GC Algorithm. In VEE ’05, p. 46-‐56. ACM, 2005.
[4] D. Ungar, D. Kimelman, and S. Adams. Inconsistency Robustness for Scalability in InteracWve Concurrent‑Update In-‐Memory MOLAP Cubes. In Inconsistency Robustness Symposium 2011.
[5] P. Inostroza Valdera. The Sly3 Programming Language: An IntroducWon and Analysis. h=p://[email protected]/~smarr/renaissance/sly3-‐overview/
[6] Renaissance Project: h=p://[email protected]/~smarr/renaissance/
23/08/11 22 Roundup and Conclusion