On Github brycebaril / perf_node_interactive
Two primary performance targets:
"Work is done in a timely manner."
Often includes (and dominated by) to Asynchronous IO in Node.js
task_end - task_start
"Work isn't blocking other work."
Blocking is usually Javascript execution or Garbage Collection in Node.js
requests-per-second
IMPORTANT: Blocking operations can cause cascading blockage resulting in significant task completion delays.
A,B,C scheduled at the same time, each will take 10ms A takes 10ms B takes 10ms plus A's 10ms so 20ms C takes 10ms plus A and B, so 30msDo NOT skip straight to step 5. Randomly applied V8 compiler tricks are unlikely to have a significant impact.
Saving 20ms on a task that includes a 5 second database query won't provide a significant improvement.The logs show some images take well over a second to manipulate with the manipulation replaceBackground, and this is a blocking operation.
Based on logs and the code, I have identified the bulk of the time is the image manipulation. It's not doing any network calls or other asynchronous work.
(out of scope for this presentation)
These are usually infrequent and slow enough you can always log. This is essentially what APM vendors do for you automatically.var THRESHOLD = 40 var image = /* read file, call readimage */ function fillBlack(frame) { return frame.fill(0) } replaceBackground(image.frames, fillBlack, THRESHOLD) /* call writegif on image, write to file */replaceBackground calculates the median frame by comparing every pixel in every frame, then for each frame calculates the difference from that background frame
Most tools fall into one of three categories:
With our selected poor-performing image, we'll use /usr/bin/time to measure a baseline:
$ \time node harness.js ~/Downloads/bling2.gif 8.67user 0.06system 0:08.71elapsed 100%CPU (0avgtext+0avgdata 181988maxresident)k 0inputs+752outputs (0major+41833minor)pagefaults 0swaps
(They are not up-to-date but the steps are all there)
The avg() function averages every frame to make a simulated background frame.
Both perf and v8-profiler indicate we're spending the bulk of the time there.
If the problem is slow JavaScript execution:
(out of scope for this presentation)
(Load the two files in assets/irhydra into IRHydra to explore)
avg() calls medianPixel()
The algorithm to calculate median requires a sort
We don't see medianPixel in the flamegraph or v8-profiler output because it was inlined.Our image is 800 x 450 pixels with 51 frames
800 * 450 = 360000 pixels per frame
So 360000 51 pixel sortPixels calls. 😨
Calculating mean doesn't require a sort. Maybe it will work?
$ \time node harness.js ~/Downloads/bling2.gif 3.40user 0.04system 0:03.43elapsed 100%CPU (0avgtext+0avgdata 182048maxresident)k 0inputs+704outputs (0major+44011minor)pagefaults 0swaps
8.67 seconds to 3.40 seconds!
$ \time node harness.js ~/Downloads/bling2.gif 2.21user 0.06system 0:02.28elapsed 99%CPU (0avgtext+0avgdata 183944maxresident)k 10584inputs+704outputs (0major+43860minor)pagefaults 0swaps
Another second saved!
Capture flamegraphs of a production process with the click of a button!
Bryce B. Baril - http://brycebaril.com
Twitter: @brycebaril
NSolid: Try NSolid!