On Github brycebaril / perf_goto_chicago
❔🔍🔬📐📍👓📐🔁
Bryce B. Baril -- @brycebaril
Just tweeted the slide deck locationTwo primary performance concepts:
"Work is done in a timely manner."
Often includes (and is dominated by) Asynchronous IO in Node.js
task_end - task_start
"Work isn't blocking other work."
Blocking work is usually Javascript execution or Garbage Collection in Node.js
requests-per-second
JavaScript execution is single-threaded and on-CPU tasks will be serialized.
IMPORTANT: This can cascade into significant task completion delays.
A,B,C scheduled at the same time, each will take 10ms A takes 10ms B takes 10ms plus A's 10ms so 20ms C takes 10ms plus A and B, so 30msCollaborative art project that did user-driven image glitching and manipulation. Core manipulation logic is in glitcher.
Do NOT skip straight to step 5. Randomly applied V8 compiler tricks are unlikely to have any significant impact.
Saving 20ms on a task that includes a 5 second database query won't provide a significant improvement.The logs show some images take well over a second with the manipulation replaceBackground.
Based on logs and the code, the bulk of the time is the image manipulation. It's not doing any network calls or other asynchronous work.
Feels like a cop-out... how to do this with tooling?(out of scope for this presentation)
These are usually infrequent and slow enough you can always log. This is essentially what APM vendors do for you automatically. Caching: recurring theme--don't do things if you don't have tovar THRESHOLD = 40 var image = /* read file, call readimage */ function fillBlack(frame) { return frame.fill(0) } replaceBackground(image.frames, fillBlack, THRESHOLD) /* call writegif on image, write to file */replaceBackground calculates the median frame by comparing every pixel in every frame, then for each frame calculates the difference from that background frame
Most tools fall into one of three categories:
If the problem is slow JavaScript execution:
With our selected poor-performing image, we'll use /usr/bin/time to measure a baseline:
$ \time node harness.js ~/Downloads/bling2.gif 8.67user 0.06system 0:08.71elapsed 100%CPU (0avgtext+0avgdata 181988maxresident)k 0inputs+752outputs (0major+41833minor)pagefaults 0swaps
]
(open assets/median.cpuprofile in Chrome DevTools)
We can see garbage collection in V8-Profiler output, can eliminate that concern(out of scope for this presentation)
The avg() function averages every frame to make a simulated background frame.
Both perf and v8-profiler indicate we're spending the bulk of the time there.
These things can absolutely make your code easier
for you
to work with.
Generally these:
Focus your effort on high return-on-investment. Don't sacrifice dev convenience by refactoring everything to speed up fast enough code.
Some things could cause systemic slowdowns, hard to see if every function got slower.The optimizing compiler doesn't care if your code is GOOD
(Load the two files in assets/irhydra into IRHydra to explore)
avg() calls medianPixel()
The algorithm to calculate median requires a sort
We don't see medianPixel in the flamegraph or v8-profiler output because it was inlined.Our image is 800 x 450 pixels with 51 frames
800 * 450 = 360000 pixels per frame
So 360_000 sortPixels calls, each sorting 51 pixels. 😨
Calculating mean doesn't require a sort. Maybe it will work?
$ \time node harness.js ~/Downloads/bling2.gif 3.40user 0.04system 0:03.43elapsed 100%CPU (0avgtext+0avgdata 182048maxresident)k 0inputs+704outputs (0major+44011minor)pagefaults 0swaps
8.67 seconds to 3.40 seconds!
]
$ \time node harness.js ~/Downloads/bling2.gif 2.21user 0.06system 0:02.28elapsed 99%CPU (0avgtext+0avgdata 183944maxresident)k 10584inputs+704outputs (0major+43860minor)pagefaults 0swaps
Another second saved!
Capture flamegraphs of a production process with the click of a button!
Bryce B. Baril - http://brycebaril.com
Twitter: @brycebaril
NSolid: Try NSolid!