INTEL® HPC DEVELOPER CONFERENCE
Fuel Your Insight
Large-scale Distributed Rendering with the OSPRay Ray Tracing Framework
Carson Brownlee
Shared-memory
Distributed-memory
Why MPI?
Data that exceeds the memory limits of a single node
Performance limitations
Tiled displays
In Situ
Strong Scaling
Weak Scaling
High Fidelity Rendering
Related Work
Sending Rays, Kilauea - Kato '01, '02, '03
Interactive Ray Tracing on Clusters - Wald et al. '03
Distributed Shared Memory - DeMarle et al. '03
IceT Compositing - Moreland et al. '11
Multiple Device
API commands are processed by the currently active device, which provides a modular backend for handling API calls. Current devices include:
1. Local
2. MPI
3. COI (now deprecated in favor of MPI)
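This device-dispatch pattern can be illustrated with a minimal stand-in (not OSPRay's actual classes): every API entry point forwards to whichever device is active, so swapping the device swaps the entire backend.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Minimal sketch of OSPRay's device abstraction (illustrative names):
// API calls are forwarded to the active Device, so the Local and MPI
// backends are interchangeable behind the same API surface.
struct Device {
  virtual ~Device() {}
  virtual std::string renderFrame() = 0;
};

struct LocalDevice : Device {
  std::string renderFrame() override { return "rendered locally"; }
};

struct MPIDevice : Device {
  std::string renderFrame() override { return "broadcast to workers"; }
};

static std::unique_ptr<Device> activeDevice;

// Stand-in for an API entry point such as ospRenderFrame():
// it only dispatches to the active backend.
std::string apiRenderFrame() {
  return activeDevice->renderFrame();
}
```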
Using OSPRay MPI
Compile
OSPRAY_BUILD_MPI_DEVICE=ON
Requires an MPI library with multi-threading support (Intel MPI recommended)
OSPRAY_EXP_DATA_PARALLEL=ON (experimental)
Run
mpirun -n 3 ./ospGlutViewer --osp:mpi teapot.obj (mpirun args vary)
mpirun -ppn 1 -n 1 -host localhost ./ospGlutViewer --osp:mpi teapot.obj : -n 2 -host n1,n2 ./ospray_mpi_worker --osp:mpi
ParaView
VTKOSPRAY_ARGS="--osp:mpi" mpirun -ppn 1 -n 1 -host localhost ./paraview : -n 1 -host n1,n2 ./ospray_mpi_worker --osp:mpi
Distributed Framebuffer
Supports both data-replicated and data-distributed rendering
Tile ownership: each rank owns a subset of the framebuffer tiles
Each owner stores its tiles' accumulation buffer locally
Pixel operations
Processed tiles with final framebuffer colors are sent to the display node
Tiling Pseudocode
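A minimal sketch of the per-frame tiling loop described above (illustrative names, not OSPRay's actual code): each rank renders a strided subset of the tiles, then sends each finished tile to the rank that owns it, where it is accumulated and later forwarded to the display node.

```cpp
#include <cassert>
#include <map>

// Owner of a tile; a simple round-robin assignment for this sketch.
int tileOwner(int tile, int numRanks) { return tile % numRanks; }

// Per-rank render loop: rank r renders tiles r, r+numRanks, r+2*numRanks, ...
// deliveries[tile] records which rank the finished tile was sent to
// (a stand-in for the MPI send to the owning rank).
void renderFrameOnRank(int myRank, int numRanks, int numTiles,
                       std::map<int, int> &deliveries) {
  for (int t = myRank; t < numTiles; t += numRanks) {
    // renderTile(t) would trace rays for this tile here
    deliveries[t] = tileOwner(t, numRanks);
  }
}
```

Running the loop on every rank covers each tile exactly once, which is the point of the strided work division.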
Load Balancing
Static load balancing
Tiles are strided to avoid work imbalance
1 2 3 1 2 3
2 3 1 2 3 1
3 1 2 3 1 2
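The row-shifted pattern above can be generated by offsetting a round-robin assignment by one rank per tile row (a sketch of the striding idea; the exact formula in OSPRay may differ):

```cpp
#include <cassert>

// Rank (1-based, as in the table above) assigned to tile (x, y).
// Shifting the round-robin by one rank per row avoids giving any
// single rank a contiguous image region, which evens out the work.
int tileRank(int x, int y, int numRanks) {
  return (x + y) % numRanks + 1;
}
```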
Work API Comm
API:
    ospRenderFrame() { ... }

MPIDevice:
    MPIDevice::renderFrame()
    {
      work::RenderFrame work(_fb, _renderer, fbChannelFlags);
      processWork(&work);
    }

Work:
    void RenderFrame::serialize(SerialBuffer &b) const {
      b << (int64)fbHandle << (int64)rendererHandle << channels;
    }
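A minimal stand-in for the SerialBuffer used above (not OSPRay's implementation) shows the idea: each work item packs its POD fields into a byte buffer that can travel to the workers as a single MPI message and be unpacked on the other side.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Illustrative byte buffer: operator<< appends a POD value's bytes,
// operator>> reads them back in the same order.
struct SerialBuffer {
  std::vector<uint8_t> bytes;
  size_t readPos = 0;

  template <typename T>
  SerialBuffer &operator<<(const T &v) {
    const uint8_t *p = reinterpret_cast<const uint8_t *>(&v);
    bytes.insert(bytes.end(), p, p + sizeof(T));  // pack
    return *this;
  }

  template <typename T>
  SerialBuffer &operator>>(T &v) {
    std::memcpy(&v, bytes.data() + readPos, sizeof(T));  // unpack
    readPos += sizeof(T);
    return *this;
  }
};
```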
Work API Comm
Work:
    void RenderFrame::run() {
      FrameBuffer *fb = (FrameBuffer*)fbHandle.lookup();
      Renderer *renderer = (Renderer*)rendererHandle.lookup();
      renderer->renderFrame(fb, channels);
    }

Worker:
    mpi::recv(mpi::Address(&mpi::app, (int32)mpi::RECV_ALL), workCommands);
    for (work::Work *&w : workCommands)
      w->run();
Async Comm Layer
Actions are separated into three queues:
receive queue
process queue
send queue
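A single-threaded sketch of that separation (illustrative, not OSPRay's actual CommLayer; the real layer drains the queues from dedicated threads): incoming MPI messages land on the receive queue, a worker moves them to the process queue where handlers run, and replies are staged on the send queue.

```cpp
#include <cassert>
#include <queue>
#include <string>

// Illustrative three-queue comm layer: separating receive, process,
// and send keeps MPI progress, message handling, and outbound traffic
// from blocking one another.
struct CommLayer {
  std::queue<std::string> recvQ, processQ, sendQ;

  // Called when a message arrives from MPI.
  void onMpiReceive(const std::string &msg) { recvQ.push(msg); }

  // One drain pass: move received messages to processing, run the
  // handler, and stage any reply for the send thread.
  void drainOnce() {
    while (!recvQ.empty()) {
      processQ.push(recvQ.front());
      recvQ.pop();
    }
    while (!processQ.empty()) {
      // a real handler would dispatch on the message's command tag
      sendQ.push("ack:" + processQ.front());
      processQ.pop();
    }
  }
};
```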
Async Comm Layer
    struct MasterTileMessage : public mpi::async::CommLayer::Message {
      vec2i coords;
      float error;
      uint32 color[TILE_SIZE][TILE_SIZE];
    };

    void DFB::incoming(mpi::async::CommLayer::Message *_msg) {
      switch (_msg->command) {
      case MASTER_WRITE_TILE_NONE:
        this->processMessage((MasterTileMessage_NONE*)_msg);
        break;
      }
    }
Distributed Data
Currently experimental and only for volume data
Set the environment variable OSPRAY_DATA_PARALLEL=<blockX>x<blockY>x<blockZ>
Data is projected onto tiles; all nodes determine tile overlap
Tiles are sent to the owning node for compositing
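The compositing step can be sketched per pixel (illustrative, not OSPRay's code): each rank whose volume block overlaps the tile contributes a partial color, opacity, and block depth, and the owning node sorts the fragments front to back and blends them with the "over" operator.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// One rank's contribution to a pixel: the depth of its volume block
// plus the partial color and opacity it accumulated through it.
struct Fragment {
  float depth, color, alpha;
};

// Front-to-back "over" compositing at the tile's owning node.
float compositePixel(std::vector<Fragment> frags) {
  std::sort(frags.begin(), frags.end(),
            [](const Fragment &a, const Fragment &b) { return a.depth < b.depth; });
  float color = 0.f, transmittance = 1.f;
  for (const Fragment &f : frags) {
    color += transmittance * f.alpha * f.color;  // blend in front-to-back order
    transmittance *= (1.f - f.alpha);            // light remaining for farther blocks
  }
  return color;
}
```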
Strong Scaling
Distributed API
Ability to specify what runs where
3 modes:
Master/Slave
- All ranks other than the master run commands issued by the master rank
Collaborative
- All ranks issue the same commands
Independent
- Commands run locally only
D-API Example - Distributed Volume Rendering
Sync: initialization
Sync: create shared volume
Local: create resident volume section
Local: add local volume to synchronous volume
Master: add annotations
Sync: render
Distributed API
ospdApiMode(OSPD_MODE_INDEPENDENT);
OSPVolume localVol = ospNewVolume("shared_structured_volume");
OSPData ospLocalVolData = ospNewData(volumeData.size(), OSP_UCHAR, volumeData.data(), OSP_DATA_SHARED_BUFFER);
ospCommit(ospLocalVolData);
// Switch back to collaborative mode and commit the collab volume and add it to the world
ospdApiMode(OSPD_MODE_COLLABORATIVE);
ospCommit(volume);
ospAddVolume(world, volume);
ospCommit(world);
D-API Implementation
    void MPIDevice::processWork(work::Work *work)
    {
      if (currentApiMode == OSPD_MODE_MASTER) {
        mpi::send(mpi::Address(&mpi::worker, (int32)mpi::SEND_ALL), work);
      } else if (currentApiMode == OSPD_MODE_COLLABORATIVE) {
        // sync calls
      }
      work->run();
    }
Tiled Displays
DisplayWald - Experimental
Built as an OSPRay module
Requires MPI
Stereo supported
Routing through a single head node is supported if the display nodes are not accessible from the compute nodes
DisplayWald - Experimental
Server (displays):
    mpirun -perhost 1 -n 6 ./ospDisplayWald -w 3 -h 2 --no-head-node
    mpirun -perhost 1 -n 6 ./ospDisplayWald -w 3 -h 2 --head-node
    // prints the head node's hostname and port
Client (renderer):
    mpirun -n <N> ./ospDwViewer --display-wall-host host:port
Performance Tips
Wayness - a single MPI process per node is ideal
Excessive API calls can currently cause very long load times
Affinity issues - check that CPU utilization is pegged at 100%
KNL cache mode - OSPRay runs best in cache/quadrant mode
Samples per pixel - negative values render a subset of the image each frame
Questions?
Legal Notices and Disclaimers
Intel technologies' features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors.
Performance tests are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
Copyright © 2016 Intel Corporation. All rights reserved. Intel, Intel Inside, the Intel logo, Intel Xeon and Intel Xeon Phi are trademarks of Intel Corporation in the United States and other countries. *Other names and brands may be claimed as the property of others.