Ruby 2.0 – Under a Microscope – Overview



Ruby 2.0 – Under a Microscope – Overview

0 3


ruby-under-a-microscope-introduction-slides

Slides for <Ruby under a microscope> introduction

On Github bachue / ruby-under-a-microscope-introduction-slides

Ruby 2.0

Under a Microscope

Created by Bachue / @iBachue

  • Just a share.
  • I'll introduce some points in this book to you.
  • Don't ask me tough questions, I can't answer then. :)
This book is available to buy on http://www.nostarch.com/rum, price is $31.95, about 200RMB.

Who is this book for?

  • Ruby programmer
  • No C programming knowledge is required
  • No Ruby basis explaination in these slides!
  • Only for Ruby programmer who shows strong interest in Ruby internal!
  • No C code shown in these slides!
bachue.github.io/ruby-under-a-microscope-introduction-slides

Agenda

  • Overview
  • Tokenize
  • Parsing
  • Compilation
  • Execution as double stack machine
  • Control Structures
  • Objects & Class
  • Method Lookup & Constant Lookup
  • Closure & Binding
  • GC
Basically map to each characters in the book, just a little changes.

Overview

Tokenize, Parse, Compile => YARV instruments
Bison: GNU Yacc

JRuby

  • Steps are similar, except JVM bytecode is generated
  • Jay is Bison-like but written in Java
  • JIT in Java could compile bytecode to machine language

JIT

Rubinius

  • Steps are similar, except Rubinius instruments and LLV instruments are generated
  • Rubinius compile Ruby to Rubinius instruments, and then use LLVM compile them to LLVM instruments, then may compile to machine language by JIT compiler

Tokenize

10.times do |n|
  puts n
end
						

Before

After

parser_yylex in parse.y

Keywords: defs/keywords

ruby/parse.y:6933 in `parser_yylex'

start from line 6968: switch (c = nextc())

keywords: rb_reserved_word, defs/keywords => lex.c

MRI doesn't use Lex tokenization tool, but write these code by hand, because of the performance or flexibility.

Ripper.tokenize & Ripper.lex

require 'ripper'
require 'pp'
code = <<-STR
  10.times do |n|
    puts n
  end
STR
p Ripper.tokenize(code)
pp Ripper.lex(code)
						
Ripper, which was provided since 1.9, allows you to call the same tokenization and parsing code that Ruby uses to process text from code files.
["  ", "10", ".", "times", " ", "do", " ", "|", "n", "|", "\n", "    ", "puts", " ", "n", "\n", "  ", "end", "\n"]

[[[1, 0], :on_sp, "  "],
 [[1, 2], :on_int, "10"],
 [[1, 4], :on_period, "."],
 [[1, 5], :on_ident, "times"],
 [[1, 10], :on_sp, " "],
 [[1, 11], :on_kw, "do"],
 [[1, 13], :on_sp, " "],
 [[1, 14], :on_op, "|"],
 [[1, 15], :on_ident, "n"],
 [[1, 16], :on_op, "|"],
 [[1, 17], :on_ignored_nl, "\n"],
 [[2, 0], :on_sp, "    "],
 [[2, 4], :on_ident, "puts"],
 [[2, 8], :on_sp, " "],
 [[2, 9], :on_ident, "n"],
 [[2, 10], :on_nl, "\n"],
 [[3, 0], :on_sp, "  "],
 [[3, 2], :on_kw, "end"],
 [[3, 5], :on_nl, "\n"]]
						
Result

Parsing

LALR, Bison

MRI uses GNU Bison to generate the parser code, from parse.y to parse.c

Spanish => English Example

Me gusta el Ruby. => I Like Ruby.
Le gusta el Ruby. => She likes Ruby.
						
An Simple Example

Bison/Yacc grammar

SpanishPhrase: VerbAndObject el ruby {
  printf("%s Ruby\n", $1);
};
VerbAndObject: SheLikes | ILike {
  $$ = $1;
};
SheLikes: le gusta {
  $$ = "She likes";
}
ILike: me gusta {
  $$ = "I like";
}
						
Grammar Rule

Simplified Steps

Grammar Rule Stack Tokens Action le gusta el ruby le gusta el ruby Shift le gusta el ruby Shift SheLikes el ruby Reduce VerbAndObject el ruby Reduce VerbAndObject el ruby Shift VerbAndObject el ruby Shift SpanishPhrase Match
  • Looks too simple to explain
  • But LALR is very complicated actually we must follow LALR Goto Table
  • No real Ruby example here, let's have a glance at the real code

parse.y

parse.y:860

BNF Syntax of Ruby 1.4

ruby -y

10.times do |n|
  puts n
end
						
Another example
Starting parse
Entering state 0
Reducing stack by rule 1 (line 782):
-> $$ = nterm @1 ()
Stack now 0
Entering state 2
Reading a token: Next token is token tINTEGER ()
Shifting token tINTEGER ()
Entering state 41
Reducing stack by rule 470 (line 4255):
   $1 = token tINTEGER ()
-> $$ = nterm numeric ()
Stack now 0 2
Entering state 104
Reducing stack by rule 428 (line 3830):
   $1 = nterm numeric ()
-> $$ = nterm literal ()
Stack now 0 2
Entering state 94
Reducing stack by rule 270 (line 2616):
   $1 = nterm literal ()
-> $$ = nterm primary ()
Stack now 0 2
Entering state 80
Reading a token: Next token is token '.' ()
Reducing stack by rule 329 (line 3071):
   $1 = nterm primary ()
-> $$ = nterm primary_value ()
Stack now 0 2
Entering state 81
Next token is token '.' ()
Shifting token '.' ()
Entering state 336
Reading a token: Next token is token tIDENTIFIER ()
Shifting token tIDENTIFIER ()
Entering state 533
Reading a token: Next token is token keyword_do ()
Reducing stack by rule 550 (line 4823):
   $1 = token tIDENTIFIER ()
-> $$ = nterm operation2 ()
Stack now 0 2 81 336
Entering state 538
Next token is token keyword_do ()
Reducing stack by rule 572 (line 4874):
-> $$ = nterm none ()
Stack now 0 2 81 336 538
Entering state 676
Reducing stack by rule 246 (line 2426):
   $1 = nterm none ()
-> $$ = nterm opt_paren_args ()
Stack now 0 2 81 336 538
Entering state 674
Reducing stack by rule 404 (line 3635):
   $1 = nterm primary_value ()
   $2 = token '.' ()
   $3 = nterm operation2 ()
   $4 = nterm opt_paren_args ()
-> $$ = nterm method_call ()
Stack now 0 2
Entering state 93
Next token is token keyword_do ()
Shifting token keyword_do ()
Entering state 380
Reducing stack by rule 414 (line 3735):
-> $$ = nterm @28 ()
Stack now 0 2 93 380
Entering state 576
Reading a token: Next token is token '|' ()
Shifting token '|' ()
Entering state 653
Reading a token: Next token is token tIDENTIFIER ()
Shifting token tIDENTIFIER ()
Entering state 761
Reading a token: Next token is token '|' ()
Reducing stack by rule 518 (line 4543):
   $1 = token tIDENTIFIER ()
-> $$ = nterm f_norm_arg ()
Stack now 0 2 93 380 576 653
Entering state 636
Reducing stack by rule 519 (line 4550):
   $1 = nterm f_norm_arg ()
-> $$ = nterm f_arg_item ()
Stack now 0 2 93 380 576 653
Entering state 637
Reducing stack by rule 521 (line 4578):
   $1 = nterm f_arg_item ()
-> $$ = nterm f_arg ()
Stack now 0 2 93 380 576 653
Entering state 765
Next token is token '|' ()
Reducing stack by rule 572 (line 4874):
-> $$ = nterm none ()
Stack now 0 2 93 380 576 653 765
Entering state 748
Reducing stack by rule 537 (line 4723):
   $1 = nterm none ()
-> $$ = nterm opt_f_block_arg ()
Stack now 0 2 93 380 576 653 765
Entering state 850
Reducing stack by rule 372 (line 3377):
   $1 = nterm f_arg ()
   $2 = nterm opt_f_block_arg ()
-> $$ = nterm block_param ()
Stack now 0 2 93 380 576 653
Entering state 763
Next token is token '|' ()
Reducing stack by rule 572 (line 4874):
-> $$ = nterm none ()
Stack now 0 2 93 380 576 653 763
Entering state 770
Reducing stack by rule 385 (line 3479):
   $1 = nterm none ()
-> $$ = nterm opt_bv_decl ()
Stack now 0 2 93 380 576 653 763
Entering state 847
Next token is token '|' ()
Shifting token '|' ()
Entering state 905
Reducing stack by rule 384 (line 3468):
   $1 = token '|' ()
   $2 = nterm block_param ()
   $3 = nterm opt_bv_decl ()
   $4 = token '|' ()
-> $$ = nterm block_param_def ()
Stack now 0 2 93 380 576
Entering state 655
Reducing stack by rule 381 (line 3444):
   $1 = nterm block_param_def ()
-> $$ = nterm opt_block_param ()
Stack now 0 2 93 380 576
Entering state 716
Reading a token: Next token is token tIDENTIFIER ()
Shifting token tIDENTIFIER ()
Entering state 35
Reading a token: Next token is token tIDENTIFIER ()
Reducing stack by rule 547 (line 4818):
   $1 = token tIDENTIFIER ()
-> $$ = nterm operation ()
Stack now 0 2 93 380 576 716
Entering state 110
Next token is token tIDENTIFIER ()
Reducing stack by rule 258 (line 2499):
-> $$ = nterm @7 ()
Stack now 0 2 93 380 576 716 110
Entering state 219
Next token is token tIDENTIFIER ()
Shifting token tIDENTIFIER ()
Entering state 35
Reading a token: Next token is token '\n' ()
Reducing stack by rule 474 (line 4275):
   $1 = token tIDENTIFIER ()
-> $$ = nterm user_variable ()
Stack now 0 2 93 380 576 716 110 219
Entering state 209
Next token is token '\n' ()
Reducing stack by rule 486 (line 4291):
   $1 = nterm user_variable ()
-> $$ = nterm var_ref ()
Stack now 0 2 93 380 576 716 110 219
Entering state 107
Reducing stack by rule 276 (line 2622):
   $1 = nterm var_ref ()
-> $$ = nterm primary ()
Stack now 0 2 93 380 576 716 110 219
Entering state 80
Next token is token '\n' ()
Reducing stack by rule 239 (line 2375):
   $1 = nterm primary ()
-> $$ = nterm arg ()
Stack now 0 2 93 380 576 716 110 219
Entering state 203
Next token is token '\n' ()
Reducing stack by rule 240 (line 2381):
   $1 = nterm arg ()
-> $$ = nterm arg_value ()
Stack now 0 2 93 380 576 716 110 219
Entering state 204
Next token is token '\n' ()
Reducing stack by rule 263 (line 2531):
   $1 = nterm arg_value ()
-> $$ = nterm args ()
Stack now 0 2 93 380 576 716 110 219
Entering state 207
Next token is token '\n' ()
Reducing stack by rule 572 (line 4874):
-> $$ = nterm none ()
Stack now 0 2 93 380 576 716 110 219 207
Entering state 398
Reducing stack by rule 262 (line 2525):
   $1 = nterm none ()
-> $$ = nterm opt_block_arg ()
Stack now 0 2 93 380 576 716 110 219 207
Entering state 397
Reducing stack by rule 254 (line 2463):
   $1 = nterm args ()
   $2 = nterm opt_block_arg ()
-> $$ = nterm call_args ()
Stack now 0 2 93 380 576 716 110 219
Entering state 409
Reducing stack by rule 259 (line 2499):
   $1 = nterm @7 ()
   $2 = nterm call_args ()
-> $$ = nterm command_args ()
Stack now 0 2 93 380 576 716 110
Entering state 387
Next token is token '\n' ()
Reducing stack by rule 58 (line 1344):
   $1 = nterm operation ()
   $2 = nterm command_args ()
-> $$ = nterm command ()
Stack now 0 2 93 380 576 716
Entering state 72
Next token is token '\n' ()
Reducing stack by rule 51 (line 1297):
   $1 = nterm command ()
-> $$ = nterm command_call ()
Stack now 0 2 93 380 576 716
Entering state 70
Reducing stack by rule 44 (line 1249):
   $1 = nterm command_call ()
-> $$ = nterm expr ()
Stack now 0 2 93 380 576 716
Entering state 69
Next token is token '\n' ()
Reducing stack by rule 41 (line 1225):
   $1 = nterm expr ()
-> $$ = nterm stmt ()
Stack now 0 2 93 380 576 716
Entering state 245
Next token is token '\n' ()
Reducing stack by rule 14 (line 933):
   $1 = nterm stmt ()
-> $$ = nterm stmts ()
Stack now 0 2 93 380 576 716
Entering state 244
Next token is token '\n' ()
Shifting token '\n' ()
Entering state 289
Reducing stack by rule 569 (line 4866):
   $1 = token '\n' ()
-> $$ = nterm term ()
Stack now 0 2 93 380 576 716 244
Entering state 291
Reducing stack by rule 570 (line 4869):
   $1 = nterm term ()
-> $$ = nterm terms ()
Stack now 0 2 93 380 576 716 244
Entering state 433
Reading a token: Next token is token keyword_end ()
Reducing stack by rule 560 (line 4847):
   $1 = nterm terms ()
-> $$ = nterm opt_terms ()
Stack now 0 2 93 380 576 716 244
Entering state 432
Reducing stack by rule 12 (line 913):
   $1 = nterm stmts ()
   $2 = nterm opt_terms ()
-> $$ = nterm compstmt ()
Stack now 0 2 93 380 576 716
Entering state 817
Next token is token keyword_end ()
Shifting token keyword_end ()
Entering state 881
Reducing stack by rule 415 (line 3734):
   $1 = token keyword_do ()
   $2 = nterm @28 ()
   $3 = nterm opt_block_param ()
   $4 = nterm compstmt ()
   $5 = token keyword_end ()
-> $$ = nterm brace_block ()
Stack now 0 2 93
Entering state 382
Reducing stack by rule 298 (line 2781):
   $1 = nterm method_call ()
   $2 = nterm brace_block ()
-> $$ = nterm primary ()
Stack now 0 2
Entering state 80
Reading a token: Next token is token '\n' ()
Reducing stack by rule 239 (line 2375):
   $1 = nterm primary ()
-> $$ = nterm arg ()
Stack now 0 2
Entering state 79
Next token is token '\n' ()
Reducing stack by rule 49 (line 1282):
   $1 = nterm arg ()
-> $$ = nterm expr ()
Stack now 0 2
Entering state 69
Next token is token '\n' ()
Reducing stack by rule 41 (line 1225):
   $1 = nterm expr ()
-> $$ = nterm stmt ()
Stack now 0 2
Entering state 67
Next token is token '\n' ()
Reducing stack by rule 8 (line 855):
   $1 = nterm stmt ()
-> $$ = nterm top_stmt ()
Stack now 0 2
Entering state 66
Reducing stack by rule 5 (line 833):
   $1 = nterm top_stmt ()
-> $$ = nterm top_stmts ()
Stack now 0 2
Entering state 65
Next token is token '\n' ()
Shifting token '\n' ()
Entering state 289
Reducing stack by rule 569 (line 4866):
   $1 = token '\n' ()
-> $$ = nterm term ()
Stack now 0 2 65
Entering state 291
Reducing stack by rule 570 (line 4869):
   $1 = nterm term ()
-> $$ = nterm terms ()
Stack now 0 2 65
Entering state 292
Reading a token: Now at end of input.
Reducing stack by rule 560 (line 4847):
   $1 = nterm terms ()
-> $$ = nterm opt_terms ()
Stack now 0 2 65
Entering state 290
Reducing stack by rule 3 (line 813):
   $1 = nterm top_stmts ()
   $2 = nterm opt_terms ()
-> $$ = nterm top_compstmt ()
Stack now 0 2
Entering state 64
Reducing stack by rule 2 (line 782):
   $1 = nterm @1 ()
   $2 = nterm top_compstmt ()
-> $$ = nterm program ()
Stack now 0
Entering state 1
Now at end of input.
Stack now 0 1
Cleanup: popping nterm program ()
						

Ripper.sexp

require 'ripper'
require 'pp'
code = <<-STR
  10.times do |n|
    puts n
  end
STR
puts code
pp Ripper.sexp(code)
						
Parse source code and create S-exp tree.
[:program,
 [[:method_add_block,
   [:call, [:@int, "10", [1, 3]], :".", [:@ident, "times", [1, 6]]],
   [:do_block,
    [:block_var,
     [:params, [[:@ident, "n", [1, 16]]], nil, nil, nil, nil],
     nil],
    [[:command,
      [:@ident, "puts", [2, 0]],
      [:args_add_block, [[:var_ref, [:@ident, "n", [2, 5]]]], false]]]]]]]
						

ruby --dump parsetree

@ NODE_SCOPE (line: 3)
+- nd_tbl: (empty)
+- nd_args:
|   (null node)
+- nd_body:
    @ NODE_ITER (line: 1)
    +- nd_iter:
    |   @ NODE_CALL (line: 1)
    |   +- nd_mid: :times
    |   +- nd_recv:
    |   |   @ NODE_LIT (line: 1)
    |   |   +- nd_lit: 10
    |   +- nd_args:
    |       (null node)
    +- nd_body:
        @ NODE_SCOPE (line: 3)
        +- nd_tbl: :n
        +- nd_args:
        |   @ NODE_ARGS (line: 1)
        |   +- nd_frml: 1
        |   +- nd_next:
        |   |   @ NODE_ARGS_AUX (line: 1)
        |   |   +- nd_rest: (null)
        |   |   +- nd_body: (null)
        |   |   +- nd_next:
        |   |       (null node)
        |   +- nd_opt:
        |       (null node)
        +- nd_body:
            @ NODE_FCALL (line: 2)
            +- nd_mid: :puts
            +- nd_args:
                @ NODE_ARRAY (line: 2)
                +- nd_alen: 1
                +- nd_head:
                |   @ NODE_DVAR (line: 2)
                |   +- nd_vid: :n
                +- nd_next:
                    (null node)
						
Also shows node info, which is very related to the implementation.

Equivalent AST nodes

NODE_LIT means literal.

Compilation

Ruby 1.8

No compilation in Ruby 1.8, Ruby 1.8 execute AST directly.

Ruby 1.9 & 2.0

Ruby 1.9 compiles Ruby code to YARV instruments.

Benchmark

i= 0
while i < ARGV[0].to_i
  i += 1
end
						
Performance competition between AST and YARV instruments.

logarithmic scale

You can see that for short-lived processes, Ruby 1.8.7 is actually faster than Ruby 1.9.3 and 2.0 because there is no need to compile the Ruby code into YARV instructions. However, after about 11,000 iterations, Ruby 1.9.3 and 2.0 are faster.

linear scale

Displaying YARV Instructions

code = <<-RUBY
  puts 1 + 2
RUBY

puts RubyVM::InstructionSequence.compile(code).disasm
						

Compiled

== disasm: <RubyVM::InstructionSequence:<compiled>@<compiled>>==========
0000 trace            1                                               (   1)
0002 putself
0003 putobject_OP_INT2FIX_O_1_C_
0004 putobject        2
0006 opt_plus         <callinfo!mid:+, argc:1, ARGS_SKIP>
0008 opt_send_simple  <callinfo!mid:puts, argc:1, FCALL|ARGS_SKIP>
0010 leave
						
  • Execute in Stack machine
  • putobject_OP_INT2FIX_O_1_C_ is equivalent to putobject 1
  • ARGS_SKIP indicates the arguments are simple values (not blocks or arrays of unnamed arguments), allowing YARV to skip some work it would have to do otherwise.
  • MRI traverses the AST nodes(result of ruby -y) and generate these instruments.

Another Example

code = <<-RUBY
  10.times do |n|
    puts n
  end
RUBY

puts RubyVM::InstructionSequence.compile(code).disasm
						

Compiled

== disasm: <RubyVM::InstructionSequence:<compiled>@<compiled>>==========
== catch table
| catch type: break  st: 0002 ed: 0006 sp: 0000 cont: 0006
|------------------------------------------------------------------------
0000 trace            1                                               (   1)
0002 putobject        10
0004 send             <callinfo!mid:times, argc:0, block:block in <compiled>>
0006 leave
== disasm: <RubyVM::InstructionSequence:block in <compiled>@<compiled>>=
== catch table
| catch type: redo   st: 0000 ed: 0009 sp: 0000 cont: 0000
| catch type: next   st: 0000 ed: 0009 sp: 0000 cont: 0009
|------------------------------------------------------------------------
local table (size: 2, argc: 1 [opts: 0, rest: -1, post: 0, block: -1] s3)
[ 2] n<Arg>
0000 trace            256                                             (   1)
0002 trace            1                                               (   2)
0004 putself
0005 getlocal_OP__WC__0 2
0007 opt_send_simple  <callinfo!mid:puts, argc:1, FCALL|ARGS_SKIP>
0009 trace            512                                             (   3)
0011 leave                                                            (   2)
						
  • Notice Local variable
  • getlocal_OP__WC__0 is equivalent to getlocal *, 0
  • This trick allows Ruby 2.0 to save a bit of time because it doesn’t need to pass the 0 argument separately.

iseq_compile_each in compile.c

How Ruby interates thought the AST

compile.c:3191

Execution

As double stack machine

rb_control_frame_t

pc, sp, self, type(frame type)

caller

  • Just like we call caller
  • CFP => current frame pointer

Example

10.times do
  puts "The quick brown fox jumps over the lazy dog."
end
						

Ruby always creates "TOP" frame first when starting a new program. At the top of the stack, at least initially, a frame of type "EVAL" corresponds to the top level or main scope of the Ruby script.

Local Variable Access

Environment Pointer

  • EP = SP - 1
  • Special: to track information related to blocks(rb_block_t).
  • Local variable table is not a table(Just like Stack variable in C)
  • special: to track information related to blocks(rb_block_t) (doesn't mean special variable table).
  • svar: pointer to special variable table(only for ruby method call)
  • cref: pointer to current lexical scope(designed for *_eval, *_exec methods, only for block call)
  • Method Arguments Are Treated Like Local Variables

Dynamic Variable Access

code = <<-RUBY
  def display_string
    str = "Dynamic access."
    10.times do
      puts str
    end
  end
RUBY

puts RubyVM::InstructionSequence.compile(code).disasm
						
== disasm: <RubyVM::InstructionSequence:f@<compiled>>===================
== catch table
| catch type: break  st: 0010 ed: 0014 sp: 0000 cont: 0014
|------------------------------------------------------------------------
local table (size: 2, argc: 0 [opts: 0, rest: -1, post: 0, block: -1] s1)
[ 2] str
0000 trace            8                                               (   1)
0002 trace            1                                               (   2)
0004 putstring        "Dynamic access."
0006 setlocal_OP__WC__0 2
0008 trace            1                                               (   3)
0010 putobject        10
0012 send             <callinfo!mid:times, argc:0, block:block in f>
0014 trace            16                                              (   6)
0016 leave                                                            (   3)
== disasm: <RubyVM::InstructionSequence:block in f@<compiled>>==========
== catch table
| catch type: redo   st: 0000 ed: 0009 sp: 0000 cont: 0000
| catch type: next   st: 0000 ed: 0009 sp: 0000 cont: 0009
|------------------------------------------------------------------------
0000 trace            256                                             (   3)
0002 trace            1                                               (   4)
0004 putself
0005 getlocal_OP__WC__1 2
0007 opt_send_simple  <callinfo!mid:puts, argc:1, FCALL|ARGS_SKIP>
0009 trace            512                                             (   5)
0011 leave                                                            (   4)
						
setlocal_OP__WC__0 is the same as setlocal *, 0
getlocal_OP__WC__1 2 means to follow Previous EP once, then minus 2

Special Variable Access

Guess?

str = "The quick brown fox jumped over the lazy dog.\n"
/fox/.match(str)
def search(str)
  /dog/.match(str)
  puts "Value of $& inside method: #{$&}"
end
search(str)
puts "Value of $& in the top level scope: #{$&}"
						
Guess: What's the answer?
Each method call has its own special variable table

Guess Again?

str = "The quick brown fox jumped over the lazy dog.\n"
/fox/.match(str)
2.times do
  /dog/.match(str)
  puts "Value of $& inside block: #{$&}"
end
puts "Value of $& in the top level scope: #{$&}"
						
Guess: What's the answer?
When it's a block call, EP - 1 is cref, means Lexical Scope.

YARV Instruments Definitions

tool/insns2vm.rb: insns.def => vm.inc

  • Generator: tool/insns2vm.rb, Run by miniruby during the build process
  • Instruments Definition: insns.def
  • insns.def is not C code
  • After compilation: vm.inc, code we really execute

Control Structures

Example

code = <<-RUBY
  i = 0
  while i < 20
    if i < 10
      puts 'small'
    else
      for j in 0..3
        puts "large \#{j}"
      end
    end
    i += 1
  end
  puts 'done'
RUBY

puts RubyVM::InstructionSequence.compile(code).disasm
						
== disasm: <RubyVM::InstructionSequence:<compiled>@<compiled>>==========
== catch table
| catch type: break  st: 0035 ed: 0039 sp: 0000 cont: 0039
| catch type: break  st: 0013 ed: 0058 sp: 0000 cont: 0058
| catch type: next   st: 0013 ed: 0058 sp: 0000 cont: 0010
| catch type: redo   st: 0013 ed: 0058 sp: 0000 cont: 0013
|------------------------------------------------------------------------
local table (size: 3, argc: 0 [opts: 0, rest: -1, post: 0, block: -1] s1)
[ 3] i          [ 2] j
0000 trace            1                                               (   1)
0002 putobject_OP_INT2FIX_O_0_C_
0003 setlocal_OP__WC__0 3
0005 trace            1                                               (   2)
0007 jump             49
0009 putnil
0010 pop
0011 jump             49
0013 trace            1                                               (   3)
0015 getlocal_OP__WC__0 3
0017 putobject        10
0019 opt_lt           <callinfo!mid:<, argc:1, ARGS_SKIP>
0021 branchunless     33
0023 trace            1                                               (   4)
0025 putself
0026 putstring        "small"
0028 opt_send_simple  <callinfo!mid:puts, argc:1, FCALL|ARGS_SKIP>
0030 pop
0031 jump             40                                              (   3)
0033 trace            1                                               (   6)
0035 putobject        0..3
0037 send             <callinfo!mid:each, argc:0, block:block in <compiled>>
0039 pop
0040 trace            1                                               (  10)
0042 getlocal_OP__WC__0 3
0044 putobject_OP_INT2FIX_O_1_C_
0045 opt_plus         <callinfo!mid:+, argc:1, ARGS_SKIP>
0047 setlocal_OP__WC__0 3
0049 getlocal_OP__WC__0 3                                             (   2)
0051 putobject        20
0053 opt_lt           <callinfo!mid:<, argc:1, ARGS_SKIP>
0055 branchif         13
0057 putnil
0058 pop
0059 trace            1                                               (  12)
0061 putself
0062 putstring        "done"
0064 opt_send_simple  <callinfo!mid:puts, argc:1, FCALL|ARGS_SKIP>
0066 leave
== disasm: <RubyVM::InstructionSequence:block in <compiled>@<compiled>>=
== catch table
| catch type: redo   st: 0004 ed: 0018 sp: 0000 cont: 0004
| catch type: next   st: 0004 ed: 0018 sp: 0000 cont: 0018
|------------------------------------------------------------------------
local table (size: 2, argc: 1 [opts: 0, rest: -1, post: 0, block: -1] s3)
[ 2] ?<Arg>
0000 getlocal_OP__WC__0 2                                             (   8)
0002 setlocal_OP__WC__1 2                                             (   6)
0004 trace            256
0006 trace            1                                               (   7)
0008 putself
0009 putobject        "large "
0011 getlocal_OP__WC__1 2
0013 tostring
0014 concatstrings    2
0016 opt_send_simple  <callinfo!mid:puts, argc:1, FCALL|ARGS_SKIP>
0018 trace            512                                             (   8)
0020 leave                                                            (   7)
						
  • if: branchunless
  • while: branchif
  • for: equivalent to (0..3).each

Catch table

  • break, redo, next are not implemented by jump
  • From up to down, ask one by one
  • assume there's a break or redo or next in for statement in the last example.

Objects & Class

RObject

includes/ruby/ruby.h

At the top of the figure is a pointer to the RObject structure. (Internally, Ruby always refers to any value with a VALUE pointer.) Below this pointer, the RObject structure contains an inner RBasic structure and information specific to custom objects. The RBasic section contains information that all values use: a set of Boolean values called flags that store a variety of internal technical values, and a class pointer called klass.

The class pointer indicates which class an object is an instance of. In the RObject section, Ruby saves an array of instance variables that each object contains, using numiv, the instance variable count, and ivptr, a pointer to an array of values.

RBasic: includes/ruby/ruby.h:747

RObject: includes/ruby/ruby.h:762

Example

class Mathematician
  attr_accessor :first_name
  attr_accessor :last_name
end

euler = Mathematician.new
euler.first_name = 'Leonhard'
euler.last_name = 'Euler'

euclid = Mathematician.new
euclid.first_name = 'Euclid'
						
As you can see, instance variable names are not stored in RObject

Instance Variable Access

  • In Ruby 1.8, RObject stores instance variables as a hashmap.
  • In Ruby 1.9, RObject just stores values of instance variables in an Array, and RClass stores names of them. Use index to connect them.
  • Instance variable values table will preallocate more space.

Generic Objects

Fixnum

  • Symbol, NilClass, TrueClass, FalseClass has their owned flags.
  • Float uses double internally, so it has its own class.

How do generic objects have instance variables?

All generic objects don't have their owned numiv and ivptr. Ruby saves it in a special hash called generic_iv_tbl. This hash maintains a map between generic objects and pointers to other hashes that contain each object’s instance variables.

RClass

RClass: include/ruby/ruby.h:790

rb_classext_struct: internal.h:264

Class Instance Variable Access

  • Both class instance variables and class variables are stored in class-level instance variable table
  • But use two different algorithm
  • Class instance variable is just stored in RClass simply.

Class Variable Access

Ruby search through RClass superclass chain for the Class variable.

Guess?

class SuperClass
end
class SubClass < SuperClass
end

SubClass.class_variable_set :@@b, 'b'
SuperClass.class_variable_set :@@b, 'a'
SubClass.class_variable_set :@@b
                        
But Ruby always use the variable which is the super class even subclass has the same class variable.

Method Lookup & Constant Lookup

Module

There's no RModule in MRI

Share data structure(RClass) with class

Include Module in a Class

module Professor
end

class Methematician < Person
  include Professor
end
                        
Ruby just copy the RClass object and insert it into the linked list between the class and its superclass.

Include one Module into Another

In that case, just copy the entire module tree.

Prepend Module in a Class

  • Since 2.0
  • When you prepend a module, Ruby creates a copy of the target class and set it as the superclass of the prepended module. And move all methods from the original class to the copy.
  • Why does this design so tricky? Why not move prepended module as the subclass of the class? If we call method on this class directly, we won't find it if it is defined in prepended module.

Guess: Does it work?

module A; end

class C
  include A
end

module A
  def f
    'module A'
  end
end

C.new.f
                        
Works, m_tbl will be copyed but still share memory.

Guess Again?

module A
  def f
    'module A'
  end
end

module B
end

class C
  include B
end

module B
  include A
end

C.new.f
                        

No, module B has been copyed, it can't include module A in its copy.

rb_include_module in class.c

rb_include_module: class.c:823

include_modules_at: class.c:848

rb_include_class_new: class.c:788

Global Method Cache

klass defined_class Fixnum#times Integer#times Object#puts BasicObject#puts etc ... etc ...

The Inline Method Cache

  • Any possible method creation, importing or removing will clean all caches.
  • Since 2.1, invalid only sub-classes under effective class

Lexical Scope

Source code - class - lexical scope

Constant Access

We will talk about the algorithm of Constant search

Guess?

class SuperClass
  FIND_ME = "Found in SuperClass"
end

module ParentLexicalScope
  FIND_ME = "Found in ParentLexicalScope"

  class SubClass < SuperClass
    p FIND_ME
  end
end
                        
  • search through lexical scope chain
  • for each scope's class, if no constant found, check for autoload
  • search through superclass chain
  • for each superclass, if no constant found, check for autoload
  • if still no constant found, call const_missing

Closure & Binding

Block

10.times do
  str = "The quick brown fox jumps over the lazy dog."
  puts str
end
                        
We will talk about initialization of rb_block_t.
  • Before call times method, Ruby creates and initializes a new rb_block_t structure to represent the block. Copy the current EP into it.
  • When yield is called, Ruby can access local variables directly from rb_control_frame_t structure, and access variables from the parent scope indirectly from EP in the block using dynamic variable access.
  • Ruby doesn't really allocate memory for and initialize rb_block_t, just references subset of rb_control_frame_t, that's why block is not really an Object.

vm_core.h

  • rb_control_frame_t: vm_core.h:445
  • rb_block_t: vm_core.h:462
  • proc in rb_block_t is used when Ruby creates proc object from a block

Lambda

def message_function
  str = "The quick brown fox"
  lambda do |animal|
    puts "#{str} jumps over the lazy #{animal}." 
  end
end

function_value = message_function
function_value.call 'dog'
                        
Then let's talk about lambda.

rb_lambda_t?

rb_proc_t

rb_proc_t contains rb_block_t, means a proc is a kind of Ruby object that wraps up a block.
  • When you call lambda, Ruby copies the entire contents of the current YARV stack frame into the heap.
  • rb_env_t is a wrapper for the heap copy of the stack.
  • When Ruby calls the block inside the proc, it copies the stack frame in the heap. The new stack frame contains an EP that points to the heap.

Proc is an Object

Block is not an Object, but Proc is.

Guess?

def message_function
  str = "The quick brown fox"
  func = lambda do |animal|
    puts "#{str} jumps over the lazy #{animal}." 
  end
  str = "The sly brown fox"
  func
end

function_value = message_function
function_value.call 'dog'
                        
  • Once Ruby creates the new heap copy of the stack (the new rb_env_t structure or internal environment object), it resets the EP in the rb_control_frame_t structure to point to the copy.
  • Each block you pass to the lambdas accesses the same variable in the parent scope. Ruby achieves this by checking whether the EP already points to the heap. So if you create multiple lambdas in the same scope, Ruby won't create the second copy, it will reuse the same rb_env_t structure in the second rb_proc_t structure.

Benchmark

require 'benchmark'
ITERATIONS = 1000000
Benchmark.bm do |bench|
  bench.report("While") do
    ITERATIONS.times do
      sum = 0
      i= 1
      while i <= 10
        sum += i
        i += 1 
      end
    end
  end

  bench.report("Block") do
    ITERATIONS.times do
      sum = 0
      (1..10).each do |i|
        sum += i
      end
    end
  end

  bench.report("Lambda") do
    ITERATIONS.times do
      sum = 0

      blk = lambda do |i|
        sum += i
      end

      (1..10).each(&blk)
    end
  end
end
                        
Let's see a performance benchmark between while, block and lambda.

Result

        user       system      total        real
While   0.480000   0.000000   0.480000 (  0.479636)
Block   0.900000   0.000000   0.900000 (  0.899904)
Lambda  1.720000   0.050000   1.770000 (  1.763199)
                        

Binding

def get_binding
  a = 2
  b = 3
  binding
end

eval('puts a + b', get_binding)
                        

rb_binding_t

As the implementation of *_eval, *_exec, EP in new rb_control_frame_t points to the lower stack frame. So Ruby can access to local variable directly.
The binding structure is simply a wrapper around the internal environment structure—the heap copy of the stack frame. The binding structure also contains the file name and line number of the loca- tion from where you called binding.

GC

Algorithms

Let's just talk about algorithms.

MRI

mark-and-sweep garbage collection

Non-Moving strategy

RVALUE

gc.c

  • RVALUE as a basic unit
  • RVALUE: gc.c:325

Mark

MRI has marked five active objects (gray) with five garbage objects remaining in the heap (white).

Since 2.0, Bitmap Marking

Since 2.0, because of Unix memory optimization technique "copy on write" when fork.

Sweeping

While sweeping, MRI places unused RVALUE structures back on the free list.

Since 1.9.3, Lazy Sweeping

  • Lazy sweeping sweeps only enough garbage objects back to the free list to create a few new Ruby objects and to allow your program to continue
  • Lazy sweeping can reduce the amount of time your program is paused waiting for garbage collection; however, it doesn’t reduce the overall amount of garbage collection work to do. Lazy sweeping amortizes the same total amount of sweeping work over multiple GC pauses.

JRuby & Rubinius

copying garbage collection

generational garbage collection

concurrent garbage collection

Bump Allocation

Copying garbage collection is the ability to create objects of different sizes

The Semi-Space Algorithm

Moving strategy

Generational Garbage Collection

JRuby & Rubinius have used it for years. Intrduced into MRI since 2.1.
  • Keywords: minor GC, major GC
  • Popular Algorithms combination: minor: Copy, major: Mark & Sweep
  • MRI 2.1: both Mark & Sweep
  • Promotion: Rubinius: 2, MRI 2.1: 1(limited by Mark & Sweep), JRuby: dynamic
  • JRuby: another "permanent generation" for internal objects for JVM ifself
  • Rubinius: Also uses a third generation for very large objects by Mark & Sweep

Since 2.1, RGenGC

Restricted Generational Garbage Collection

Problem: Mature object could reference young object.

Write barriers

Ruby 2.1 uses an old GC technique called write barriers to monitor changes to mature objects – whenever you add a reference from one object to another (whenever you write to or modify an object), a write barrier is triggered. The barriers check whether the source object is mature, and if so adds the mature object to a special list. Later Ruby 2.1 includes these just these modified mature objects in the next mark and sweep process, preventing active, young objects from being incorrectly considered garbage.

Koichi’s fascinating presentation from EuRuKo 2013

Difficult to implement WB in C-ext, watch the video for more details!

Concurrect Garbage Collection

  • Run GC on separate thread
  • Since 2.1, MRI reduce sweeping time by introducing this tech

Marking While the Object Graph Changes

New object hasn't been marked here, so it'll be GCed!

Tricolor Marking

  • White: haven't been marked, all objects start from here.
  • Grey: reachable, won't be GCed, but we still need to check any objects that the object references
  • Black: marked, have no references to objects in the white set, surely won't be GCed
  • Pick an object from the grey set. Blacken this object (move it to the black set), by greying all the white objects it references directly
  • Repeat the previous step until the grey set is empty
  • When there are no more objects in the grey set, then all the objects remaining in the white set will be GCed
  • The collector moves a marked object back to the mark stack because the application modified it.

Thanks!