High Performance NodeJS
Introduction
Name : Harimurti Prasetio
email : [email protected]
twitter : http://twitter.com/harippe
facebook : https://www.facebook.com/harippe.murti
GitHub : http://github.com/aerios
What is NodeJS?
A JavaScript platform that runs on top of the V8 engine
Non-blocking I/O
Single-threaded by nature
Originally developed by Ryan Dahl for his internal project
How NodeJS works
NodeJS uses an event loop at its core, provided by the libuv library. The event
loop is single-threaded and runs indefinitely. It is responsible for :
- abstracting I/O access from external requests
- invoking the handler for an I/O operation and delegating the operation to it
- receiving events from the I/O handler when the operation completes (success or
error)
- triggering any callbacks associated with the event
Event Loop
As we can see, although NodeJS is single-threaded, internally it still uses
multithreading for I/O operations. This strategy lets NodeJS keep processing
the next request while it waits for the results of I/O operations.
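The ordering described above can be observed with a small sketch. It assumes fs.stat on the temporary directory as a stand-in for any I/O operation; the function name demoNonBlockingIO and the labels in the array are hypothetical, chosen only for illustration.

```javascript
const fs = require("fs");
const os = require("os");

// Starts one asynchronous I/O operation and records the order of events.
function demoNonBlockingIO() {
  return new Promise(function (resolve, reject) {
    const order = [];
    order.push("I/O requested");
    // fs.stat returns immediately; the work is delegated outside the JS thread.
    fs.stat(os.tmpdir(), function (error) {
      if (error) return reject(error);
      order.push("I/O callback fired");
      resolve(order);
    });
    // Runs before the callback above, while the I/O is still pending.
    order.push("other work done");
  });
}

demoNonBlockingIO().then(function (order) {
  // prints: I/O requested -> other work done -> I/O callback fired
  console.log(order.join(" -> "));
});
```

The callback always runs after the synchronous code has finished, which is why the main thread stays free for the next request.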
Another benefit of a single-threaded environment is that no memory
synchronization is needed between callbacks. If two or more callbacks
manipulate the same variable, they never run at the same time, because each
callback runs to completion before the next one starts, so they cannot corrupt
it mid-update. This feature, in my personal experience, is the reason why
developing applications with NodeJS is easy.
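As a rough illustration of this point, the sketch below schedules many callbacks that all perform a read-modify-write on one shared variable without any lock. The name incrementFromCallbacks is hypothetical, and setImmediate is used only as a convenient way to queue callbacks.

```javascript
// Queues `count` callbacks that all update the same variable, with no locking.
function incrementFromCallbacks(count) {
  return new Promise(function (resolve) {
    if (count === 0) return resolve(0);
    let total = 0;     // shared by every callback
    let finished = 0;
    for (let i = 0; i < count; i++) {
      setImmediate(function () {
        const current = total;  // read
        total = current + 1;    // write: no other callback can run in between
        finished += 1;
        if (finished === count) resolve(total);
      });
    }
  });
}

incrementFromCallbacks(1000).then(function (total) {
  console.log(total); // prints 1000
});
```

With real threads, the same read-modify-write would need a lock or an atomic operation to guarantee the final count.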
High Performance in NodeJS
In a sense, high performance means the ability to use all available resources
provided by the host to deliver higher throughput
Depends on the application's requirements
It is harder to optimize NodeJS for CPU-intensive applications than for I/O-intensive applications
Due to its single-threaded nature, it is almost impossible to perform parallel
programming using multithreading
"Almost" means that there are several ways to use multithreading, but with limited
functionality
Case Study
Suppose our web application, built using NodeJS, needs to be enhanced with
analytics functionality. After several discussions and benchmarks, it is decided
to perform data aggregation in the application, because performing JOIN and
GROUP BY in the database would degrade its performance significantly.
Case Study
In the middle of the sprint, the developer assigned to this task found that
the application freezes while crunching a very large dataset. He quickly
realized that the application gets stuck when it enters several tight loops
needed by the APIs that provide the analytics functionality. Changing the loops
to native for loops did not help either. So he needs to find a way to keep the
application from freezing despite the tight loops.
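The freeze can be reproduced with a minimal sketch: a timer scheduled for 0 ms cannot fire until the tight loop releases the only JavaScript thread. The names busyLoop and measureTimerDelay are hypothetical, and the loop merely spins on the clock as a stand-in for real number crunching.

```javascript
// Blocks the single JavaScript thread for roughly `ms` milliseconds.
function busyLoop(ms) {
  const end = Date.now() + ms;
  let iterations = 0;
  while (Date.now() < end) iterations++;
  return iterations;
}

// Schedules a 0 ms timer, then runs the tight loop.
// Resolves with the number of milliseconds the timer actually waited.
function measureTimerDelay(busyMs) {
  return new Promise(function (resolve) {
    const scheduledAt = Date.now();
    setTimeout(function () {
      resolve(Date.now() - scheduledAt);
    }, 0);
    busyLoop(busyMs); // the event loop cannot run the timer until this returns
  });
}

measureTimerDelay(200).then(function (delay) {
  console.log("0 ms timer fired after " + delay + " ms");
});
```

Any incoming request is delayed in the same way as the timer, which is what the user perceives as a frozen application.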
Case Study
A tight loop is one case of a CPU-intensive operation. Roughly, there are 2 ways
to solve this problem :
1. Decrease the input size
2. Increase the power
Number (1) is a no-go, because reducing the input size would produce misleading
output. So number (2) is the only option. But the question is, how do we
increase the (CPU) power when NodeJS is single-threaded by nature? How does a
NodeJS application explicitly consume more CPU power from the host?
Case Study
After a quick Google search, there are several ways to achieve this :
1. Multiprocessing
2. WebWorker
3. Parallel.js
Multiprocess - spawn()
- enables NodeJS to create a child process from another command
- suppose we need to run a Python script from our NodeJS application
- we can invoke the script from NodeJS using spawn
- syntax :
var spawn = require("child_process").spawn
// first argument is the command, second is the list of its arguments
var inst = spawn("py",["./path/to/python_script.py","parameter_for_py1","parameter_for_py2"])
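A slightly fuller sketch of spawn() might collect the child's output and exit code. To avoid depending on a Python installation, it assumes the child is the node binary itself (process.execPath) running a one-line script; the helper name runCommand is hypothetical.

```javascript
const { spawn } = require("child_process");

// Runs `command` with `args`; resolves with the exit code and collected stdout.
function runCommand(command, args) {
  return new Promise(function (resolve, reject) {
    const child = spawn(command, args);
    let stdout = "";
    // Output arrives as a stream of chunks, not as one buffered string.
    child.stdout.on("data", function (chunk) {
      stdout += chunk.toString();
    });
    child.on("error", reject); // for example, the command was not found
    child.on("close", function (code) {
      resolve({ code: code, stdout: stdout });
    });
  });
}

// Assumption: the node binary stands in for the Python script of the slide.
runCommand(process.execPath, ["-e", "console.log(6 * 7)"]).then(function (result) {
  console.log(result.code, result.stdout.trim()); // prints: 0 42
});
```

Because the output is streamed, spawn() suits long-running commands or commands that produce a lot of data.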
Multiprocess - fork()
- a special case of child_process.spawn
- enables a NodeJS application to run another NodeJS application as its child
and set up a bidirectional link between them
- by using fork, the parent and its children can communicate via message
passing using the send() method
- syntax
Multiprocess - fork()
main.js
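var fork = require("child_process").fork
// start worker.js as a child process and pass it one argument
var inst = fork("path/to/worker.js",["this is argument"])
// print every message the worker sends back
inst.on("message",function(message){
console.log(message)
})
inst.send({data:"Hi worker"})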
Multiprocess - fork()
worker.js
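// argv[0] is node and argv[1] is worker.js, so the parent's argument is argv[2]
var argFromParent = process.argv[2]
process.on("message",function(message){
// echo the data back to the parent together with the argument
process.send({data : message.data,"response":argFromParent})
})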
Multiprocess - fork()
By using fork(), the main application and its children can communicate back
and forth. This is the simplest form of message passing, a method for
exchanging data between different processes or actors without sharing memory.
Using message passing :
- the main application can send data or commands to its children
- workers (child processes) can send back the output of a calculation or
command execution
This feature enables developers to create a simple job queue system using
purely NodeJS
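A simple job queue of this kind could look roughly like the sketch below. It is only one possible design, under several assumptions: the worker source is written to a temporary file so the example fits in a single file, each job is a hypothetical { id, numbers } object, and the worker just sums the numbers. The names runJobs and workerSource are illustrative.

```javascript
const { fork } = require("child_process");
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical worker: sums the numbers of each job and sends the result back.
const workerSource = [
  'process.on("message", function (job) {',
  '  var sum = job.numbers.reduce(function (a, b) { return a + b }, 0)',
  '  process.send({ id: job.id, sum: sum })',
  '})'
].join("\n");

// Distributes `jobs` over `workerCount` forked workers; resolves with { id: sum }.
function runJobs(jobs, workerCount) {
  return new Promise(function (resolve, reject) {
    if (jobs.length === 0) return resolve({});
    const dir = fs.mkdtempSync(path.join(os.tmpdir(), "job-queue-"));
    const workerPath = path.join(dir, "worker.js");
    fs.writeFileSync(workerPath, workerSource);

    const results = {};
    const workers = [];
    let next = 0;        // index of the next job to hand out
    let done = 0;        // number of results received
    let cleaned = false;

    function cleanUp() {
      if (cleaned) return;
      cleaned = true;
      workers.forEach(function (worker) { worker.kill(); });
      fs.unlinkSync(workerPath);
      fs.rmdirSync(dir);
    }
    function dispatch(worker) {
      if (next < jobs.length) worker.send(jobs[next++]);
    }

    for (let i = 0; i < workerCount; i++) {
      // execArgv: [] keeps the parent's own node flags out of the child.
      const worker = fork(workerPath, [], { execArgv: [] });
      workers.push(worker);
      worker.on("error", function (error) { cleanUp(); reject(error); });
      worker.on("message", function (result) {
        results[result.id] = result.sum;
        done += 1;
        if (done === jobs.length) { cleanUp(); resolve(results); }
        else dispatch(worker); // this worker is idle again, give it more work
      });
      dispatch(worker);
    }
  });
}

const jobs = [{ id: "a", numbers: [1, 2, 3] }, { id: "b", numbers: [4, 5, 6] }];
runJobs(jobs, 2).then(function (results) {
  console.log(results); // prints { a: 6, b: 15 }
});
```

Each worker receives a new job only after it has reported a result, so a slow job never blocks the other workers. A real application would keep worker.js as a normal file in the project instead of generating it.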
Multiprocess - exec()
- NodeJS spawns a shell and executes the command within that shell
- any output or error is buffered and provided via the callback
- useful for calling shell commands
- syntax :
var exec = require("child_process").exec
// callback arguments: error (if any), buffered stdout, buffered stderr
var inst = exec("ls -lah ~/",function(error,stdout,stderr){console.log(stdout)})
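The callback form can be wrapped so that failures are handled as well. The sketch below assumes a trivial echo command, which exists in common shells; the helper name runInShell is hypothetical.

```javascript
const { exec } = require("child_process");

// Runs `command` in a shell; resolves with the buffered stdout and stderr.
function runInShell(command) {
  return new Promise(function (resolve, reject) {
    exec(command, function (error, stdout, stderr) {
      // `error` is set when the command fails to start or exits with non-zero code.
      if (error) return reject(error);
      resolve({ stdout: stdout, stderr: stderr });
    });
  });
}

runInShell("echo hello").then(function (result) {
  console.log(result.stdout.trim()); // prints hello
});
```

Since the whole output is buffered in memory before the callback runs, exec() fits short commands with small output; spawn() is the safer choice for large output.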
Multithreading in NodeJS
By default, multithreading is not supported in NodeJS releases that predate the
built-in worker_threads module. But some npm modules, such as webworker-threads
and parallel.js, enable developers to create new threads. Both of these modules
use the Web Worker API, which is part of the HTML5 specification rather than
ECMAScript 5.
Multithreading in NodeJS
From my experience, using threads provided via WebWorker has some
benefits :
- Lower resource overhead for initialization than multiprocessing
- Passing data back and forth incurs lower overhead
The drawback of using threads :
- Because the threads created using WebWorker are not native NodeJS threads,
it is not guaranteed that several features provided by NodeJS are present
(non-blocking I/O, module loading, etc)
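To give a feel for moving a tight loop onto a thread, here is a minimal sketch. Note that it does not use webworker-threads or parallel.js: it swaps in the worker_threads module built into newer NodeJS releases, so that it runs without installing anything. The function name sumInThread and the summing loop are hypothetical.

```javascript
const { Worker } = require("worker_threads");

// Code executed inside the worker thread, passed as a string via `eval: true`.
const threadSource = `
  const { parentPort, workerData } = require("worker_threads");
  let sum = 0;
  for (let i = 1; i <= workerData.limit; i++) sum += i; // the tight loop
  parentPort.postMessage(sum);
`;

// Sums 1..limit in a separate thread so the main event loop stays responsive.
function sumInThread(limit) {
  return new Promise(function (resolve, reject) {
    const worker = new Worker(threadSource, {
      eval: true,
      workerData: { limit: limit }
    });
    worker.once("message", resolve);
    worker.once("error", reject);
  });
}

sumInThread(1000000).then(function (sum) {
  console.log(sum); // prints 500000500000
});
```

As with fork(), the thread and the main application exchange data through messages, but no new process has to be started.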
Conclusion
NodeJS provides several ways to achieve high performance using parallel
programming
Developers can choose tools provided natively by NodeJS or
community-provided modules
In the end, high performance is not a problem that can be solved simply by
using tools. Understanding how things work and performing iterative
benchmarking are a must.
Reference
http://mcgill-csus.github.io/student_projects/Submission2.pdf
http://www.journaldev.com/7462/node-js-processing-model-single-threaded-model-with-event-loop-architecture
http://nikhilm.github.io/uvbook/An%20Introduction%20to%20libuv.pdf
https://nikhilm.github.io/uvbook/threads.html
http://docs.libuv.org/en/v1.x/design.html
https://www.npmjs.com/package/webworker-threads