C++ on the Web (GDCE 2013)
C++ on the Web: A Tale from the Trenches
Andre Weissflog
Head of Development, Berlin
Bigpoint GmbH
GDC Europe 2013
What’s this about?
• the web as a new target platform for C++ code
• differences to traditional platforms
• differences between C++/web technologies
• porting problems and solutions
Demos
• Dragons Demo: minimal 3D skinned character demo [show demo]
• Map Demo: more advanced 3D demo [show demo]
• based on Nebula3 engine, also used in Drakensang Online
Why develop for the web (HTML5 + WebGL)?
Create Deploy Play
• no walled gardens, no gate-keepers, no certification process
• free choice of hosting & payment providers
• no installations, no updates, no plugins, no lengthy downloads
• multi-platform “for free”
• battle-hardened security infrastructure
The web is the most open and seamless platform for users and developers.
C++ to web technologies
Google’s pNaCl
Mozilla’s emscripten
Adobe’s crossbridge
LLVM has opened up a lot of new usage scenarios for C/C++...
...for instance running C/C++ code inside byte code VMs and other sandboxed environments:
Mozilla’s emscripten
• OpenSource project, started in 2010
• .cpp → LLVM .bc → .js
• extremely active and responsive dev team
• lots of wrapper APIs (OpenGL, SDL, GLUT, ...)
• limited threading support (no pthreads)
Recent Developments:
• asm.js (highly optimizable subset of JS)
• massive compilation speed improvements
• inline Javascript directly into C++
Google’s pNaCl
• OpenSource project, started in 2008
• .cpp → LLVM .bc → (deploy) x86/x64/ARM
• Google Chrome only
• safe sandbox for native code execution
• full pthreads implementation
Recent Developments:
• pNaCl finally ready for prime-time
• enabled in Chrome v.30 and up
• no longer restricted to Chrome Web Store apps
Adobe’s crossbridge
• formerly known as Alchemy and flascc
• started in 2008, recently open-sourced
• .cpp → LLVM .bc → AVM2 byte code
• runs in Flash plugin
• proprietary 3D API (Stage3D)
• incredibly slow and resource hungry build process :/
Focus...
Will mostly talk about emscripten (and some pNaCl)
Why:
• emscripten has widest reach (all major browsers)
• emscripten progresses incredibly fast
• pNaCl currently has edge in threading support
• pNaCl and emscripten are actually quite similar from dev perspective
But Javascript is slow, isn’t it?
asm.js generated code is probably faster than you think, and pNaCl generated code is probably slower than you think (don’t have hard benchmark numbers yet... sorry).
IMHO: for 3D games, the real performance gains will come through WebGL extensions: the high call overhead requires extensions that reduce the number of GL calls!
My [OSX] dev environment
• Xcode (for compiling/debugging native OSX and iOS apps)
• Eclipse (for emscripten and NaCl specific dev work)
• emscripten SDK
• NaCl SDK
• cmake
• a local HTTP server (e.g. “python -m SimpleHTTPServer”)
Multiplatform Build System
cmake: flexible meta-build-system, generates IDE project files and/or makefiles from generic “CMakeLists” files.
• CMakeLists.txt files define compile targets and their source files
• cmake toolchain files define platform-specific tools, header/lib search paths and compile options
• toolchain files: ios.toolchain.cmake, osx.toolchain.cmake, pnacl.toolchain.cmake, emscripten.toolchain.cmake, android.toolchain.cmake, windows.toolchain.cmake
[Diagram: CMakeLists.txt and a toolchain file go into cmake, which feeds make]
Multiplatform Ecosystem
• 32 bit + 64 bit (x86, x86_64, ARM): code must be 32/64 bit clean, BigEndian no longer matters
• POSIX + Windows: Windows still big, everything else is POSIX-ish
• OpenGL vs Direct3D?: OpenGL Renaissance, but D3D9 still relevant
• no exceptions, no RTTI, no STL, no Boost: these make porting to exotic platforms often harder, not easier
What keeps me awake at night:
• WinXP is still incredibly big in Eastern Europe & Asia
• 3D feature base-line is OpenGL ES 2.0 w/o extensions (== WebGL)
• go fully GL on all platforms? (GL driver quality? Win8 Metro apps? ANGLE?)
N3 Multiplatform Philosophy
Platform-specific code lives in its own sub-directories.
Platform-specific pre-processor defines provided by build system: __POSIX__, __IOS__, __NACL__, __EMSCRIPTEN__, ...
Diamond-shape class hierarchy resolved at compile time:
• top: class DisplayCoreBase
• middle: class NaclDisplayCore (#if __NACL__ ... #endif), class EmscDisplayCore (#if __EMSCRIPTEN__ ... #endif), class IOSDisplayCore (#if __IOS__ ... #endif)
• bottom: class DisplayCore
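A minimal sketch of how such a compile-time resolved hierarchy could look. The class names DisplayCoreBase, NaclDisplayCore, EmscDisplayCore, IOSDisplayCore and DisplayCore come from the slide; the Name() method, the GenericDisplayCore fallback and ActiveDisplayCoreName() are assumptions added only so the sketch compiles and can be observed on a desktop compiler. It is not the actual Nebula3 code.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Platform-agnostic base class (name from the slide).
class DisplayCoreBase {
public:
    virtual ~DisplayCoreBase() = default;
    // Hypothetical method, only here to make the selected class observable.
    virtual std::string Name() const { return "base"; }
};

// Exactly one platform class is compiled in, chosen by the defines
// that the build system provides.
#if defined(__NACL__)
class NaclDisplayCore : public DisplayCoreBase {
public:
    std::string Name() const override { return "nacl"; }
};
class DisplayCore : public NaclDisplayCore {};
#elif defined(__EMSCRIPTEN__)
class EmscDisplayCore : public DisplayCoreBase {
public:
    std::string Name() const override { return "emscripten"; }
};
class DisplayCore : public EmscDisplayCore {};
#elif defined(__IOS__)
class IOSDisplayCore : public DisplayCoreBase {
public:
    std::string Name() const override { return "ios"; }
};
class DisplayCore : public IOSDisplayCore {};
#else
// Hypothetical fallback so the sketch builds on any other platform.
class GenericDisplayCore : public DisplayCoreBase {
public:
    std::string Name() const override { return "generic"; }
};
class DisplayCore : public GenericDisplayCore {};
#endif

// Whatever the platform, DisplayCore still derives from the shared base.
static_assert(std::is_base_of<DisplayCoreBase, DisplayCore>::value,
              "DisplayCore must derive from DisplayCoreBase");

// Platform-agnostic code only ever names DisplayCore.
std::string ActiveDisplayCoreName() {
    DisplayCore core;
    return core.Name();
}
```

Calling code never mentions a platform class, so adding a new platform only means adding one more branch and one more sub-directory.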
Multiplatform Line Counts
Dragons Demo (~170k lines of code)
• platform-agnostic: 148k
• POSIX/CRT: 7k
• OpenGL: 6.7k
• emscripten: 3k
• pNaCl: 3.5k
• OSX/iOS: 2.2k
about ~2% platform-specific code
Size Comparisons (Dragons Demo)
~170k lines of C++ code
OSX (-arch i386 -O3):
• orig: 2027 kByte
• +no asserts: 1457 kByte
• +stripped: 1237 kByte
• +gzipped: 413 kByte
OSX (-arch x86_64 -O3):
• orig: 2134 kByte
• +no asserts: 1663 kByte
• +stripped: 1427 kByte
• +gzipped: 460 kByte
iOS (-arch armv7 -O3):
• orig: 1542 kByte
• +no asserts: 1196 kByte
• +stripped: 972 kByte
• +gzipped: 395 kByte
pNaCl (-O2):
• orig: 1654 kByte
• +no asserts: 1333 kByte
• +stripped: 1333 kByte
• +gzipped: 842 kByte
emscripten (-O2 --llvm-opts 3 --llvm-lto 3):
• orig: 5414 kByte
• +no asserts: 2154 kByte
• +closure pass: 1951 kByte
• +gzipped: 486 kByte
wow, surprisingly compact!
smaller than expected
bigger than expected
closure: Google’s JS optimizer/minifier
The Callback Problem
Key Point to understand (and accept):
Browser runtime environment uses callback model for asynchronous programming.
Start lengthy operation, provide callback which will be called when operation is finished: becomes very messy very quickly.
Games are usually frame-driven, not callback-driven.
This is the main riddle when trying to port a game engine to browser platforms.
The Game Loop Problem
Most event-driven platforms don’t let you “own the game loop”.
Instead the application runs completely inside event callback functions which must return quickly.
Failing to return quickly results in unresponsive behaviour or even your app being killed.
The Game Loop Problem (pNaCl)
Best solution is to use the app main thread exclusively for system event handling...
...and spawn a “Game Thread” which runs the actual game loop.
• Main Thread: only wakes up on system events.
• Game Thread: runs your typical “infinite” game loop.
[Diagram: Main Thread and Game Thread, connected by input events, system events, display change events and the quit event]
The CallOnMainThread Problem
Some platforms have restrictions on what OS functionality is accessible from threads.
E.g. must call OpenGL or IO functions from the main thread only.
Either run everything on main thread, or dispatch “system calls” to run asynchronously on main thread.
CallOnMainThread problem (pNaCl)
All “PPAPI calls” must happen on main thread, and the main thread must never block.
Threads can push function pointers for deferred execution on main thread.
Deferred function calls and result callbacks execute in a simple run-loop after your per-frame callback on the main thread.
This primitive runloop/callback model makes it easy to shoot yourself in the foot by waiting for events triggered by your own callbacks. This stops the entire runloop and freezes the app.
But: All other threads can block as much as they want, waiting for events triggered by callbacks on the main thread. Nice way to simulate blocking I/O.
Conclusion: pNaCl’s full threading support can be used to work around many of its restrictions by moving the actual game logic into its own thread and using the main thread only for “system calls” and their result callbacks.
CallOnMainThread visualized (pNaCl)
Main Thread (your pNaCl main-thread code):
• Init(): launch Game Thread
• StartIO(): begin async IO, set finish-callback to FinishIO()
• FinishIO(): set finished-condvar
Game Thread (new thread, your Game Thread):
• CallOnMainThread(StartIO): put StartIO func ptr on main thread’s run queue and wait for finished-condvar
• ...blocked...
• finished-condvar is set, continue Game Thread
pNaCl runtime (runloop/callbacks) invokes callbacks to pNaCl app code:
• initialization
• invoke deferred funcs
• invoke result callbacks
• ...
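The StartIO / FinishIO / finished-condvar sequence can be sketched in portable C++ as below. This is an illustration under assumptions, not PPAPI code: MainThreadQueue, BlockingRead and RunCallOnMainThreadDemo are hypothetical names, std::thread stands in for the pNaCl game thread, a simple loop stands in for the pNaCl run loop, and the "async IO" is simulated by queueing FinishIO for a later run-loop pass.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for the main thread's run queue.
class MainThreadQueue {
public:
    // Callable from any thread: defer a function to the main thread.
    void CallOnMainThread(std::function<void()> func) {
        std::lock_guard<std::mutex> lock(mutex_);
        funcs_.push(std::move(func));
    }
    // Called by the main thread's run loop: runs deferred functions, never waits.
    void RunDeferred() {
        std::queue<std::function<void()>> pending;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            std::swap(pending, funcs_);
        }
        while (!pending.empty()) {
            pending.front()();
            pending.pop();
        }
    }
private:
    std::mutex mutex_;
    std::queue<std::function<void()>> funcs_;
};

// Runs on the game thread and looks like blocking IO to the caller.
std::string BlockingRead(MainThreadQueue& mainQueue) {
    std::mutex m;
    std::condition_variable finished;  // the "finished-condvar"
    bool done = false;
    std::string data;

    // FinishIO: result callback, executed on the main thread.
    auto finishIO = [&m, &finished, &done, &data] {
        std::lock_guard<std::mutex> lock(m);
        data = "file contents";  // placeholder payload
        done = true;
        finished.notify_one();
    };
    // StartIO: executed on the main thread. The async IO is simulated by
    // queueing FinishIO for a later pass of the run loop.
    mainQueue.CallOnMainThread([&mainQueue, finishIO] {
        mainQueue.CallOnMainThread(finishIO);
    });

    // Only the game thread blocks here; the main thread keeps running.
    std::unique_lock<std::mutex> lock(m);
    finished.wait(lock, [&done] { return done; });
    return data;
}

std::string RunCallOnMainThreadDemo() {
    MainThreadQueue mainQueue;
    std::string result;
    std::atomic<bool> gameThreadDone{false};

    // Init(): launch the game thread.
    std::thread gameThread([&] {
        result = BlockingRead(mainQueue);
        gameThreadDone = true;
    });
    // Main thread run loop: invoke deferred funcs and result callbacks.
    while (!gameThreadDone) {
        mainQueue.RunDeferred();
        std::this_thread::yield();
    }
    gameThread.join();
    return result;
}
```

Note that BlockingRead must never be called from the main thread itself: the run loop would stop and FinishIO would never execute, which is exactly the foot-gun described on the previous slide.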
Limitations (emscripten)
Similar restrictions as pNaCl, but can’t easily use threads to work around them:
• most “interesting functions” (WebGL!) must be called from main thread
• main thread must not block
• no pthreads, only WebWorkers for threading
• WebWorkers have their own “address space”
Can’t move entire game loop into WebWorker thread (yet?)
Browser vendors working towards more flexible WebWorkers, but HTML5 standardization takes time.
Limitation Workarounds
All your code must run inside “slices”, always return within 16 or 32 ms to browser.
If something takes longer, either spread work over several frames, or move into WebWorker.
N3 has new “PhasedApplication” model: app goes through phases, which tick themselves forward when finished.
Phases: OnInit → OnPreloading → OnOpening → OnRunning → OnClosing → OnQuit
[Diagram: the emscripten runtime environment calls OnFrame, which dispatches to the current phase and must return within max 16ms or 32ms (for 60 or 30 fps)]
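A phased application along these lines could be sketched as a small state machine. The phase names come from the slide; the class layout, the frame counters and RunPhasedDemo() are assumptions for illustration, not the Nebula3 implementation. In a browser build the per-frame call would come from the runtime (for example via emscripten_set_main_loop); here a plain loop stands in for it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Phase names follow the slide; everything else is a hypothetical sketch.
enum class Phase { Init, Preloading, Opening, Running, Closing, Quit, Finished };

class PhasedApplication {
public:
    // Called once per frame by the runtime; must return quickly.
    void OnFrame() {
        switch (phase_) {
            case Phase::Init:
                Log("OnInit");
                phase_ = Phase::Preloading;
                break;
            case Phase::Preloading:
                Log("OnPreloading");
                // Pretend the downloads need two frames to finish.
                if (++preloadFrames_ == 2) phase_ = Phase::Opening;
                break;
            case Phase::Opening:
                Log("OnOpening");
                phase_ = Phase::Running;
                break;
            case Phase::Running:
                Log("OnRunning");
                // Pretend the user quits after three frames of gameplay.
                if (++runFrames_ == 3) phase_ = Phase::Closing;
                break;
            case Phase::Closing:
                Log("OnClosing");
                phase_ = Phase::Quit;
                break;
            case Phase::Quit:
                Log("OnQuit");
                phase_ = Phase::Finished;
                break;
            case Phase::Finished:
                break;
        }
    }
    bool Finished() const { return phase_ == Phase::Finished; }
    const std::vector<std::string>& Trace() const { return trace_; }

private:
    void Log(const char* name) { trace_.push_back(name); }
    Phase phase_ = Phase::Init;
    int preloadFrames_ = 0;
    int runFrames_ = 0;
    std::vector<std::string> trace_;
};

// A plain loop stands in for the browser calling OnFrame once per frame.
std::vector<std::string> RunPhasedDemo() {
    PhasedApplication app;
    while (!app.Finished()) app.OnFrame();
    return app.Trace();
}
```

Each phase decides by itself when to advance, so no call ever has to wait: a phase that is not finished yet simply returns and is called again on the next frame.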
Threading Workarounds
Failed approach: Try to wrap low-level threading code in some sort of “co-operative thread scheduling” system.
Success: Move abstraction to a higher level (don’t wrap “low level threads”, but wrap “parallel task system”).
2 uses for threading: hide blocking / make use of additional CPU cores.
Nebula3 parallel task system model
[Diagram: Dispatcher and WorkerThread(s), connected by a request queue and a response]
3 Flavours:
• Blocking: thread sleeps until messages arrive
• Timeout: block until messages arrive, or timeout occurs
• Run-through: infinite loop doing per-frame work, pull messages
emscripten port adds 2 “run modes”:
• Parallel: work is pushed to WebWorker threads (makes use of cpu cores)
• Sliced: runs on main-thread, work is “triggered” per frame (hides callback mess)
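The “Sliced” run mode can be pictured as a handler that is triggered once per frame and only does a bounded amount of queued work before returning. The sketch below is an assumption about the general shape, not Nebula3 code: SlicedHandler, Trigger() and FramesToDrain() are hypothetical names, and the per-frame limit is a simple item count rather than a time budget. The “Parallel” mode is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical handler for the "Sliced" run mode.
class SlicedHandler {
public:
    // maxPerFrame must be greater than zero.
    explicit SlicedHandler(std::size_t maxPerFrame) : maxPerFrame_(maxPerFrame) {}

    // Dispatcher side: queue a work item.
    void Push(std::function<void()> work) { queue_.push_back(std::move(work)); }

    // Called once per frame on the main thread: process a bounded number
    // of items, then return so the browser stays responsive.
    std::size_t Trigger() {
        std::size_t processed = 0;
        while (processed < maxPerFrame_ && !queue_.empty()) {
            std::function<void()> work = std::move(queue_.front());
            queue_.pop_front();
            work();
            ++processed;
        }
        return processed;
    }

    bool Idle() const { return queue_.empty(); }

private:
    std::size_t maxPerFrame_;
    std::deque<std::function<void()>> queue_;
};

// Pushes `tasks` work items and returns how many frames it took to drain them.
int FramesToDrain(int tasks, std::size_t maxPerFrame) {
    SlicedHandler handler(maxPerFrame);
    int completed = 0;
    for (int i = 0; i < tasks; ++i) {
        handler.Push([&completed] { ++completed; });
    }
    int frames = 0;
    while (!handler.Idle()) {
        handler.Trigger();
        ++frames;
    }
    assert(completed == tasks);  // every item ran exactly once
    return frames;
}
```

Because the dispatcher only sees Push(), the same calling code could feed either a sliced handler or a worker thread, which is the point of wrapping the task system instead of low-level threads.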
Nebula3 IO System
Closer to HTTP philosophy than fopen()/fclose():
• URLs instead of file system paths
• asynchronous IO is default, synchronous is special case
• pluggable filesystem handlers associated with URL scheme (http://, file://, ...)
• Stream objects with StreamReaders and StreamWriters
[Diagram: App Code sends an IO request to the IOSystem, which hands it to the HTTP File System or Local File System (http://.., file://...); the IO response comes back with a Stream object holding the file data]
• Filesystem modules return Stream objects holding downloaded data
• Stream objects have typical Read/Seek/... methods
• IO response is a “Future” object, app code polls whether response has become valid
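The polling “Future” idea can be illustrated with a few lines of C++. All names here (Stream, IoResponse, IoSystem, CompletePending, PollingDemo) and the URL are hypothetical, and the download is faked by CompletePending(); the real Nebula3 interfaces are not shown in the slides.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical Stream: just holds the downloaded bytes.
struct Stream {
    std::string data;
};

// Hypothetical "Future": the app polls `valid` once per frame.
struct IoResponse {
    bool valid = false;
    Stream stream;
};

class IoSystem {
public:
    // Asynchronous by default: returns immediately with a pending response.
    std::shared_ptr<IoResponse> Request(const std::string& url) {
        auto response = std::make_shared<IoResponse>();
        pending_.emplace_back(url, response);
        return response;
    }
    // Stand-in for downloads finishing; a real filesystem handler would
    // fill the stream from HTTP or disk.
    void CompletePending() {
        for (auto& entry : pending_) {
            entry.second->stream.data = "data for " + entry.first;
            entry.second->valid = true;
        }
        pending_.clear();
    }
private:
    std::vector<std::pair<std::string, std::shared_ptr<IoResponse>>> pending_;
};

std::string PollingDemo() {
    IoSystem io;
    auto response = io.Request("http://localhost/dragon.mesh");  // made-up URL
    int framesWaited = 0;
    // Each loop iteration represents one frame; the app never blocks.
    while (!response->valid) {
        // ... regular per-frame work would happen here ...
        if (++framesWaited == 3) io.CompletePending();  // download "arrives"
    }
    return response->stream.data;
}
```

The app code keeps rendering while the response is pending and only touches the Stream once the response has become valid.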
Asset Loading
Easy way: emscripten can pre-load assets into memory before app starts, accessible through fopen() / fread()
Downside: delay on startup, memory cost - doesn’t work well for big asset sets.
Solution: need to stream and uncompress all assets on demand asynchronously
Problem: HTTP downloads much slower than loading from HDD, can’t block while waiting for download to finish.
[Diagram: App Code, HTTP File System, Uncompress WebWorker and Web Server, connected by IO request / IO response and HTTP request / HTTP response]
HTTP File System has platform-specific implementations:
• emscripten: emscripten_async_wget_data()
• pNaCl: pp::URLLoader
• OSX/iOS: NSURLRequest
• Linux / Windows: libCURL
• fallback: home-made HTTP client using raw TCP sockets (tricky!)
Preloading Phase
Problem: Sometimes asynchronous loading is too much hassle, or even impossible (for instance when using 3rd party libs).
Solution: Have pre-loading app phases, show loading screen, download and pin files into a memory filesystem, continue to next app phase when files have finished downloading.
Synchronous IO functions exclusively access data in memory filesystem, fail if file hasn’t been preloaded.
[Diagram: app phases alternate between Preloading (Loading Screen On) and Running (Loading Screen Off); the Memory File System is populated from the Web Server over HTTP and read through fread()]
Only use this approach when absolutely necessary and only for small files, not for textures, geometry, audio, etc...
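The rule that synchronous IO only sees preloaded files can be sketched with an in-memory map. MemoryFileSystem, Populate(), Read() and the file name are hypothetical names chosen for this illustration; in the real system the populate step would be driven by the asynchronous HTTP download during the preloading phase.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical memory filesystem used by synchronous IO functions.
class MemoryFileSystem {
public:
    // Called by the preloading phase when a download has finished.
    void Populate(const std::string& path, std::string data) {
        files_[path] = std::move(data);
    }
    // Synchronous read: succeeds only if the file was preloaded.
    bool Read(const std::string& path, std::string& out) const {
        auto it = files_.find(path);
        if (it == files_.end()) return false;  // not preloaded: fail, never block
        out = it->second;
        return true;
    }
private:
    std::map<std::string, std::string> files_;
};

bool PreloadDemo() {
    MemoryFileSystem memFs;
    std::string data;
    // Before preloading, the synchronous read fails immediately.
    bool beforePreload = memFs.Read("config.xml", data);
    // Preloading phase: the downloaded file is pinned into memory.
    memFs.Populate("config.xml", "<config/>");
    // Running phase: the same read now succeeds without any waiting.
    bool afterPreload = memFs.Read("config.xml", data);
    return !beforePreload && afterPreload && data == "<config/>";
}
```

Failing fast on a missing file keeps the main thread from ever waiting on the network, at the cost of holding every preloaded file in memory.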
Debugging
None of the C++ web solutions have really good interactive debugging support (yet).
Develop and debug your app mainly as a native desktop app for OSX or Windows inside XCode or VStudio; this gives the best turn-around time and “debugging experience”.
Only fall back to low-level debugging for platform-specific code.
emscripten debugging can be surprisingly easy:
• generated Javascript can be made very readable (see -g options in emcc)
• can inject debugging statements without recompiling
• see emscripten/src/settings.js for some interesting runtime debug options
JS Debugging with Source Maps
emcc -g4 generates source maps containing reference data to the original C++ sources.
Interactively debug C++ code in the browser! (still feels very rough around the edges though)
Too many slides, too little time...
Other interesting problem areas:
Audio
• WebAudio vs Audio tag
• no common compressed audio format across browsers
Networking
• WebSockets or WebRTC
• much more restrictive than Berkeley Sockets (security reasons)
Feels like back in the 90’s, have to roll our own Audio and Networking libs AGAIN :(
OpenGL
• have to settle on OpenGL ES 2 feature set
• “it just works!”...even on mobile
• Main problem is call overhead into WebGL, but it’s still surprisingly fast.
Questions?
Resources
http://flohofwoe.blogspot.com
http://www.flohofwoe.net/demos.html