On GitHub: kripken / mloc_emscripten_talk
JavaScript is standards-based and the only language that runs in all web browsers
You can run only JavaScript in browsers, but you can write in another language - if you compile it to JavaScript
The choice of language doesn't matter for every codebase, but it tends to matter more in large ones
JavaScript engines have gotten fast enough to run large compiled codebases
Late 2008/early 2009: V8, TraceMonkey, and Nitro were released, and the race for JavaScript speed was on
That race enabled running large compiled codebases
Compiled JavaScript can be faster than "regular" handwritten JavaScript
Wait, compiled JavaScript is a subset of JavaScript! How can it be faster?
C/C++ => LLVM => Emscripten => JavaScript
LLVM's optimizer uses type information to perform many useful optimizations. Decades of work have gone into developing optimization passes for C/C++ compilers.
...dce, inline, constmerge, constprop, dse, licm, gvn, instcombine, mem2reg, scalarrepl...
These optimizations are only available for compiled code!
Running them manually on a "normal" JavaScript codebase would be hard and make the code less maintainable
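As a rough illustration of the maintainability cost, here is a sketch of one such pass, loop-invariant code motion (licm), applied by hand to a small function. The function names and the scaling formula are invented for this example, and the real LLVM passes operate on LLVM IR, not on JavaScript source.

```javascript
// Hypothetical example: the readable version a person would normally write.
function scaleReadable(values, width, height) {
  var out = [];
  for (var i = 0; i < values.length; i++) {
    // width * height * 2 is recomputed on every iteration.
    out.push(values[i] / (width * height * 2));
  }
  return out;
}

// The same function after hand-applied loop-invariant code motion:
// the invariant product is hoisted out of the loop and the length is cached.
function scaleHandOptimized(values, width, height) {
  var n = values.length;
  var out = new Array(n);
  var divisor = width * height * 2; // hoisted loop invariant
  for (var i = 0; i < n; i++) {
    out[i] = values[i] / divisor;
  }
  return out;
}

var readable = scaleReadable([10, 20, 40], 2, 5);
var optimized = scaleHandOptimized([10, 20, 40], 2, 5);
```

Both versions compute the same result, but the second one already hides the original intent behind temporaries. A compiler can apply dozens of such transformations without anyone having to maintain the output.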
Modern JavaScript engines infer types at runtime
This especially helps on code that is implicitly typed - which is exactly what compiled code is!
function compiledCalculation() {
  var x = f()|0;  // x is a 32-bit value
  var y = g()|0;  // so is y
  return (x+y)|0; // 32-bit addition, no type or overflow checks
}
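To make the effect of `|0` concrete, here is a small sketch of how the coercion behaves on a few inputs. The stand-in functions `f` and `g` are hypothetical; the slide does not define them.

```javascript
// |0 converts its operand to a signed 32-bit integer.
var truncatedPositive = 3.7 | 0;   // fractional part is dropped
var truncatedNegative = -3.7 | 0;  // truncation is toward zero
var wrapped = 4294967296 | 0;      // 2^32 wraps around to 0

// Hypothetical stand-ins for the f() and g() used on the slide.
function f() { return 2147483647; } // largest signed 32-bit value
function g() { return 1; }

function compiledCalculation() {
  var x = f()|0;
  var y = g()|0;
  return (x+y)|0; // wraps around like 32-bit signed addition
}

var sum = compiledCalculation(); // -2147483648
```

Because every value is coerced at the point where it is produced, an engine can treat `x`, `y`, and the result as 32-bit integers without waiting to observe them at runtime.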
Modern JavaScript engines optimize typed arrays very well
var MEM8 = new Uint8Array(1024*1024);
var MEM32 = new Uint32Array(MEM8.buffer); // alias MEM8's data
function compiledMemoryAccess(x) {
  MEM8[x] = MEM8[x+10]; // read from x+10, write to x
  MEM32[(x+16)>>2] = 100;
}
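The aliasing between the two views can be checked directly. The following is a minimal sketch with a deliberately tiny buffer; the variable names other than `MEM8` and `MEM32` are made up for illustration.

```javascript
var MEM8 = new Uint8Array(16);
var MEM32 = new Uint32Array(MEM8.buffer); // same bytes, viewed as 32-bit words

// Four byte-sized writes through MEM8 are visible as one word through MEM32.
MEM8[0] = MEM8[1] = MEM8[2] = MEM8[3] = 0xFF;
var word0 = MEM32[0]; // 0xFFFFFFFF regardless of byte order

// A byte address becomes a 32-bit index by shifting right by 2.
var ptr = 8;
MEM32[ptr >> 2] = 100;

// The four bytes at addresses 8..11 now hold the value 100; which of them
// holds the nonzero byte depends on the platform's byte order.
var bytesAtPtr = [MEM8[8], MEM8[9], MEM8[10], MEM8[11]];
```

This is the same trick a C program relies on when it casts a `char*` to an `int*`: one block of memory, several typed ways of reading it.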
Compiled C/C++ uses a typed array as "memory"
asm.js (spec) is a research project at Mozilla that aims to formally define the subset of JavaScript that compilers like Emscripten and Mandreel already generate (typed arrays as memory, etc.)
function strlen(ptr) { // calculate length of C string
  ptr = ptr|0;
  var curr = 0;
  curr = ptr;
  while ((MEM8[curr]|0) != 0) { // parenthesized: != binds tighter than |
    curr = (curr + 1)|0;
  }
  return (curr - ptr)|0;
}
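A short usage sketch shows how such a function is called once a C string sits in the typed-array "memory". The helper `writeCString` and the address 32 are hypothetical and exist only to set up the example.

```javascript
var MEM8 = new Uint8Array(1024); // small "memory" for the example

function strlen(ptr) { // calculate length of C string
  ptr = ptr|0;
  var curr = 0;
  curr = ptr;
  while ((MEM8[curr]|0) != 0) {
    curr = (curr + 1)|0;
  }
  return (curr - ptr)|0;
}

// Hypothetical helper: copy an ASCII string into memory at ptr
// and terminate it with a NUL byte, as C expects.
function writeCString(str, ptr) {
  for (var i = 0; i < str.length; i++) {
    MEM8[ptr + i] = str.charCodeAt(i);
  }
  MEM8[ptr + str.length] = 0;
}

writeCString("hello", 32);
var len = strlen(32);  // 5
var tail = strlen(34); // 3, the length of "llo"
```

Pointers are plain integers here, so pointer arithmetic such as `ptr + 2` works the same way it does in C.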
asm.js code avoids potential slowdowns in code: no variables with mixed types, etc.
asm.js code does only low-level assembly-like computation, precisely what compiled C/C++ needs (and hence the name)
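For orientation, here is a minimal sketch of what a complete module in this style can look like, following the module shape in the published asm.js specification (a function taking `stdlib`, `foreign`, and `heap`, opening with a `"use asm"` directive). The slides do not show this wrapper, the names `TinyModule`, `add`, and `load8` are invented, and whether a particular engine validates this exact code as asm.js is not verified here. It runs as ordinary JavaScript either way.

```javascript
function TinyModule(stdlib, foreign, heap) {
  "use asm";
  // Typed-array view over the heap: this is the module's "memory".
  var MEM8 = new stdlib.Uint8Array(heap);

  function add(x, y) {
    x = x|0;           // parameter type annotation: 32-bit integer
    y = y|0;
    return (x + y)|0;  // return type annotation: 32-bit integer
  }

  function load8(ptr) {
    ptr = ptr|0;
    return MEM8[ptr]|0; // assembly-like load of one byte
  }

  return { add: add, load8: load8 };
}

var heap = new ArrayBuffer(0x10000); // 64 KiB of memory
new Uint8Array(heap)[5] = 42;        // place a byte at address 5
var tiny = TinyModule(globalThis, {}, heap);

var sum = tiny.add(2147483647, 1); // -2147483648 (32-bit wraparound)
var byte = tiny.load8(5);          // 42
```

Every variable has a single type fixed by its coercion, and the only operations are arithmetic, loads, stores, and calls, which is what makes the code resemble assembly.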
Type check output of a C/C++ to JavaScript compiler
Type check input to a JavaScript engine at runtime
Variable types pop out during type checking. This makes ahead-of-time (AOT) compilation possible, not only just-in-time (JIT) compilation
JavaScript engine has a guarantee that there are no speed bumps - variable types won't change, etc. - so it can generate simpler and more efficient code
The asm.js type system makes it easy to reason about global program structure: function calls, memory access, etc.
Code can run around 2X slower than native - comparable to Java, C# - and will get even faster
Optimizations can be done quickly and straightforwardly in existing JavaScript engines - not a new VM or JIT, just some additional optimizations to existing engines
Code is just a subset of JavaScript (like Crockford's "Good Parts") so already runs in all browsers
Not a new language
Compile your code using Emscripten with ASM_JS=1
Run it in a build of Firefox from this branch
Not supported yet: C++ exceptions, setjmp/longjmp
C/C++ compiled to JavaScript can be fast (and even faster with asm.js). But what about other languages?
Many languages can be compiled to C, C++ or LLVM IR, which means they can be compiled to JavaScript with the same approach and benefits
Entire C/C++ runtimes can be compiled and the original language interpreted with proper semantics, but this is not lightweight
Source-to-source compilers from such languages to JavaScript ignore semantic differences (for example, numeric types)
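The numeric-type problem can be sketched as follows, assuming a source language whose `int` is a wrapping 32-bit integer with truncating division (as in Java). All function names are hypothetical, and `Math.imul` is a standard function from ECMAScript 2015 that postdates these slides. Division by zero is not handled.

```javascript
// Naive source-to-source translation: int operations become plain
// JavaScript number operations, which are double-precision floats.
function addNaive(a, b) { return a + b; }
function divNaive(a, b) { return a / b; }

// Translation that preserves 32-bit integer semantics.
function addInt32(a, b) { return (a + b) | 0; }   // wraps on overflow
function divInt32(a, b) { return (a / b) | 0; }   // truncates toward zero
function mulInt32(a, b) { return Math.imul(a, b); } // exact low 32 bits

var naiveSum = addNaive(2147483647, 1); // 2147483648, no wraparound
var int32Sum = addInt32(2147483647, 1); // -2147483648
var naiveQuot = divNaive(7, 2);         // 3.5
var int32Quot = divInt32(7, 2);         // 3

// Multiplying two large ints as doubles loses low bits before truncation,
// so (a * b) | 0 is not a safe substitute for a 32-bit multiply.
var mulViaDouble = (2147483647 * 2147483647) | 0;
var mulExact = mulInt32(2147483647, 2147483647); // 1
```

A compiler that skips these coercions produces shorter output, but the program can silently compute different results from the original.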
Actually, Java and C# have a similar predicament: Both languages depend on special VMs to be efficient
Source-to-source compilers for them lose out on the optimizations done in those VMs
AOT compilers for them can at least gain LLVM-type optimizations - but still something is missing
Should we compile entire VMs from C/C++ to JavaScript, and implement JavaScript-emitting JITs?
Seems the only way to run most languages with perfect semantics + maximum speed
This is why I believe C/C++ to JavaScript translation is the core issue regarding compilation to JavaScript
Statically-typed languages and especially C/C++ can be compiled effectively to JavaScript
Expect the speed of compiled C/C++ to get to just 2X slower than native code, or better, later this year
Thanks for listening! Questions?