The ParLab Stack
Parallel Computing Laboratory
Sarah Bird
May 30, 2013
Bridging the Gap
Parallel Applications
Parallel Hardware
Parallel Software
IT industry Users
Easy to write correct programs that run efficiently on manycore
Integrated Software Stack
Personal Health
Image Retrieval
Hearing, Music
Speech
Parallel Browser
Motifs/Dwarfs
Sketching
Legacy Code
Schedulers
Communication & Synch. Primitives
Efficiency Language Compilers
Legacy OS
Multicore/GPGPU
OS Libraries & Services
RAMP Manycore
Hypervisor
Correctness
Composition & Coordination Language (C&CL)
Parallel Libraries
Parallel Frameworks
Static Verification
Dynamic Checking
Debugging with Replay
Directed Testing
Autotuners
C&CL Compiler/Interpreter
Efficiency Languages
Type Systems
Hardware
• RAMP Gold
  – Simulation
  – 64 cores
  – FPGA
• RISC-V
  – Implementations written in Chisel
  – Rocket
    • 6 stage in-order
  – Hwacha
    • 64 bit vector core
  – FPGA
    • 2 Rocket
  – 45 nm Chip
    • 1 Rocket, 1 Hwacha
    • 1 GHz
• x86
• GPGPU
  – CUDA
  – OpenGL
[Stack diagram labels: RISC-V (Chip, FPGA); SPARC (RAMP Gold); PTX (GPGPU); x86 (Multicore)]
Operating Systems
• Akaros
  – Cloud OS
• Tessellation
  – Client OS
  – Space-Time Partitioning
  – Two-Level Scheduling
  – QoS to Applications
  – PACORA
• Linux
[Stack diagram: Akaros, Tessellation, and Linux added above the hardware layer]
Schedulers
• Lithe
  – Compose Parallel Runtimes
  – Threading Building Blocks (TBB)
  – OpenMP
• PULSE
  – Framework to write schedulers
  – Earliest Deadline First (EDF)
  – Global Round Robin (GRR)
[Stack diagram: PULSE (EDF, GRR) and Lithe (OpenMP, TBB) added above the operating systems]
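The two scheduling policies named on this slide can be illustrated with a small sketch. This is not PULSE code and does not use its API, which the slides do not show. The `Task` class, the tick-based time model, and both function names are assumptions made for illustration only.

```python
import heapq
from collections import deque
from dataclasses import dataclass, field


@dataclass(order=True)
class Task:
    # Hypothetical task record; only the deadline takes part in comparisons.
    deadline: int                          # absolute deadline, in ticks
    name: str = field(compare=False)
    remaining: int = field(compare=False)  # ticks of work still needed


def edf_schedule(tasks):
    """Earliest Deadline First: each tick runs the task with the nearest deadline."""
    heap = [Task(t.deadline, t.name, t.remaining) for t in tasks]
    heapq.heapify(heap)
    timeline = []
    while heap:
        task = heapq.heappop(heap)
        timeline.append(task.name)
        task.remaining -= 1
        if task.remaining > 0:
            heapq.heappush(heap, task)
    return timeline


def round_robin_schedule(tasks, quantum=1):
    """Global round robin: one shared queue, `quantum` ticks per turn."""
    queue = deque(Task(t.deadline, t.name, t.remaining) for t in tasks)
    timeline = []
    while queue:
        task = queue.popleft()
        ran = min(quantum, task.remaining)
        timeline.extend([task.name] * ran)
        task.remaining -= ran
        if task.remaining > 0:
            queue.append(task)
    return timeline


if __name__ == "__main__":
    demo = [Task(5, "audio", 2), Task(3, "video", 1), Task(9, "ui", 2)]
    print("EDF:", edf_schedule(demo))
    print("GRR:", round_robin_schedule(demo))
```

EDF orders work by urgency, which suits tasks with deadlines, while round robin ignores deadlines and shares time evenly among all runnable tasks.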
SEJITS
[Stack diagram: ASP, Copperhead, PyCASP, and TFJ added to the stack]
Applications
[Stack diagram: Optical Flow, Music Synthesis, Parallel Browser, and PARDORA added to the stack]
Why Create an Integrated Prototype?
• Encourages collaboration
• Prevents neglecting important pieces of the problem
• Uncovers opportunities for invention by seeing which side of an interface is the best place to satisfy a requirement
• Demonstrates the importance of design simplicity
• Enhances the education of the PhD students in areas beyond their own specialties
• Helps with technology transfer by giving concrete examples of our ideas for our colleagues in industry
Forces for Integration
• Design Compatibility
  – Shared Space and Discussions
  – Symbiotic Designs
  – Example: Music and Tessellation
• Customized Support
  – In-house experts helping adapt their design to your problem
  – Examples: Lithe and Tessellation; CAA and Applications
• Motivating Applications
  – Exciting to show your research running a compelling application
  – Examples: Patterns and MRI; BFS and RISC-V
Integrated Demos in ParLab History
Stack Redesign
• Productivity programs can create applications that require efficiency
  – Scale
  – Performance
• Easily target many platforms and features
  – Example: Vector units on RISC-V Chip
• Efficient performance predictability
  – Interactivity and Responsiveness
  – Realtime Performance
What did rethinking the entire computing stack at once get us?
Integrated Demos
• Two Demos to show off the ParLab Stack
• Fun and compelling applications that require efficiency
• Easily target many platforms and features
• Interactivity and Realtime Performance
• Integration!
Music Exploration and Recommendation
• Better Pandora
• Audio Content Analysis Framework
• Parallel Browser Big Data Visualization
• Demonstrates Scale and Responsiveness
Music Recommendation and Exploration
[Stack diagram, components used: PARDORA; Parallel Browser; PyCASP; ASP; TBB; Lithe; Tessellation; x86 Multicore; PTX GPGPU; RISC-V (RV FPGA)]
Virtual Musical Instrument
• Musical Instrument using a camera
• Music Synthesis
• Vision Applications
• Demonstrate Realtime
Virtual Musical Instrument
[Stack diagram, components used: Optical Flow; Music Synthesis; TFJ; PULSE; GRR; Tessellation (Tess); Linux; x86 Multicore; PTX GPGPU; RISC-V (FPGA, Chip)]
Music Recommendation & Exploration Demo
Leo Meyerovich, Katya Gonina, Gage Eads, Eric Roman, Eric Battenberg, Henry Cook, Gerald Friedland
End of ParLab Celebration
May 30, 2013
Demo
System Overview
[System diagram labels: Client (Layout Engine, Parallel Browser); Server (Recommendation Engine, SEJITS, Tessellation); example queries “Radiohead” and “King Tubby”; song IDs sid_1, sid_2; 1M songs; 1K clustered songs]
Client Architecture
[Client diagram labels: Safari Web Browser; Parallel JavaScript Libraries; Parse (JSON), Layout, Render; viz.ftl; Synthesizer (offline: compile-time); 1M songs; 1K clustered songs]
Music Recommendation Stack
[Stack diagram labels: PARDORA; Parallel Browser; PyCASP; ASP; JavaScript; OpenCL; CUDA / Cilk+; TBB; Lithe; Tessellation; Linux; x86 Multicore; PTX GPGPU; RISC-V (RV FPGA)]
Server Architecture
Offline Phase
• Train a UBM*
Online Phase
• Find Closest Songs
• Adapt UBM on query songs
• Get potential neighbors using Collaborative Filtering
* UBM = Universal Background Model
[Diagram labels: query “Radiohead”; 1M songs; 1K clustered songs; SEJITS; GPU, CPU, FPGA; Tessellation]
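The online phase on this slide can be sketched roughly as follows, under heavy assumptions. The slides do not give the order of the steps, the feature representation, or the adaptation method. The sketch assumes the order is candidate filtering, then query modelling, then nearest-neighbour search. It stands in a shared-listener filter for collaborative filtering and a plain mean feature vector for UBM adaptation. All function names and data layouts are hypothetical and are not PyCASP APIs.

```python
import math


def candidate_songs(query_ids, listeners_by_song):
    """Stand-in for collaborative filtering: songs sharing a listener with the query."""
    query_listeners = set()
    for sid in query_ids:
        query_listeners |= listeners_by_song.get(sid, set())
    return [
        sid for sid, listeners in listeners_by_song.items()
        if sid not in query_ids and listeners & query_listeners
    ]


def query_profile(query_ids, features):
    """Placeholder for UBM adaptation: the mean feature vector of the query songs."""
    vectors = [features[sid] for sid in query_ids]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def closest_songs(query_ids, features, listeners_by_song, k=2):
    """Rank the candidate songs by Euclidean distance to the query profile."""
    profile = query_profile(query_ids, features)
    candidates = candidate_songs(query_ids, listeners_by_song)
    ranked = sorted(candidates, key=lambda sid: math.dist(profile, features[sid]))
    return ranked[:k]


if __name__ == "__main__":
    # Toy data: two-dimensional features and listener sets, invented for the demo.
    features = {
        "sid_1": [0.0, 0.0], "sid_2": [2.0, 0.0], "sid_3": [1.0, 0.1],
        "sid_4": [5.0, 5.0], "sid_5": [1.0, 0.0],
    }
    listeners = {
        "sid_1": {"u1"}, "sid_2": {"u2"}, "sid_3": {"u1"},
        "sid_4": {"u2"}, "sid_5": {"u9"},
    }
    print(closest_songs(["sid_1", "sid_2"], features, listeners))
```

Filtering candidates first keeps the distance computation away from the full catalogue, which is presumably why the slide lists a neighbor-selection step alongside the closest-song search.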