Austin.RB – Ruby to Elixir – Overview




Austin.RB 2015-08-03 Ruby to Elixir

On Github KronicDeth / ruby-to-elixir

Austin.RB

Ruby to Elixir

2015-08-03

Luke Imhoff

luke_imhoff@rapid7.com, Kronic.Deth@gmail.com, @limhoff-r7, @KronicDeth, @KronicDeth

I am a Senior Software Engineer on the Metasploit Architecture team at Rapid7. In my spare time, I maintain IntelliJ Elixir, the Elixir plugin for JetBrains IDEs like IntelliJ and RubyMine.

Outline

  • Overview
  • Installation
  • Interactive
  • Types
  • Control Flow
  • Pattern Matching
  • Project
  • Code
  • Testing
  • Metaprogramming
  • Concurrency
  • Resources
  • Upcoming Dates

Overview

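Category        | Feature         | Ruby                   | Elixir
Paradigms       | Imperative      | ✓                      |
Paradigms       | Concurrent      |                        | ✓
Paradigms       | Functional      | ✓                      | ✓
Paradigms       | Object-Oriented | ✓                      |
Typing          | Dynamic         | ✓                      | ✓
Typing          | Duck            | ✓                      |
Typing          | Strong          |                        | ✓
Mutability      |                 | Mutable                | Immutable
Concurrency     | CPU-bound       | OS Processes           | VM Processes
Concurrency     | IO-bound        | Threads, Fibers        | VM Processes
Metaprogramming |                 | Runtime, Class Methods | Compilation, Macros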

Elixir is a functional, dynamically typed, highly concurrent language built on immutable data.

MRI Ruby's go-to concurrency approach of forking a process has too high an overhead and is only tolerable on systems with Copy-On-Write (COW) fork, which excludes Windows. Threads and fibers in MRI cannot get around the Global Interpreter Lock (GIL), so they only help with IO-bound code.
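To make the IO-bound case concrete, here is a minimal sketch in which sleep stands in for a blocking network call, since MRI releases the GIL while a thread waits. The names fake_io_call and run_in_threads are hypothetical, made up for this illustration.

```ruby
# Simulated IO-bound work: sleeping releases MRI's GIL, as blocking IO does.
# (fake_io_call is a hypothetical stand-in for a real network or disk call.)
def fake_io_call(delay)
  sleep(delay)
  :done
end

# Run `count` simulated IO calls, one thread each, and return the results
# together with the wall-clock time the whole batch took.
def run_in_threads(count, delay)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  results = Array.new(count) { Thread.new { fake_io_call(delay) } }.map(&:value)
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [results, elapsed]
end

results, elapsed = run_in_threads(4, 0.2)
puts "#{results.size} calls finished in #{elapsed.round(2)}s"
```

The four waits overlap, so the batch should take roughly as long as one call rather than four. Replace the sleep with a CPU-heavy loop and the threads would take turns holding the GIL instead.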

On the other hand, Elixir runs on the Erlang VM, which, thanks to immutable data, can run isolated processes inside the VM so quickly and cheaply that my laptop can run as many as I want and dedicate a separate one to each logically concurrent task.

Just as you learned imperative programming over mathematical thinking, you can retrain to think in functional languages and even switch back and forth. I currently switch between Java, Ruby, and Elixir with only a few hiccups on namespace syntax and string quoting. If you're used to using map or select from Enumerable in Ruby, you are already using functional programming. If you ever used class methods in Ruby that didn't write to class or instance variables, you're already using functional programming.
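As a reminder of what that everyday functional Ruby looks like, here is a small sketch. The Pricing class and its with_tax method are hypothetical names invented for this example; amounts are in cents to keep the arithmetic exact.

```ruby
# A class method that reads only its arguments and writes no class or
# instance variables: the same input always gives the same output.
# (Pricing and with_tax are hypothetical names for this illustration.)
class Pricing
  def self.with_tax(cents, percent)
    cents + cents * percent / 100
  end
end

prices = [500, 1250, 4000]

# select and map return new arrays and leave `prices` untouched.
affordable = prices.select { |cents| cents < 2000 }
taxed = affordable.map { |cents| Pricing.with_tax(cents, 8) }

p taxed
p prices
```

Nothing here mutates shared state, which is the habit Elixir asks you to apply everywhere.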

Learning functional programming will also prepare you for the future. If you watch conference talks in other languages, such as JavaScript or C++, languages are moving to have more and more functional features as the programming community adapts to the many-cores future. They have figured out that locks, threads, and shared mutable state are too error prone.

The processor speed increases that used to come with Moore's law stopped 10 years ago. If you want to take advantage of Moore's law now and in the future, you need to write concurrent programs now that can be parallelized automatically across the additional cores of the future.

Switching from imperative, mutable, object-oriented Ruby to functional, immutable, concurrent Elixir may be intimidating, but I'll show you how easy it is to translate your Ruby skills to Elixir to quickly get started learning Elixir on your own.

Installation

OSX

                | Ruby                | Elixir
Homebrew        | brew install ruby   | brew install elixir
Version Manager | rvm install VERSION | kiex install VERSION

Ruby and Elixir both have Homebrew packages and version managers.

If you use kiex, you'll need to install Erlang separately using either kerl, spelled K-E-R-L, or Homebrew.

Windows

           | Ruby              | Elixir
Installer  | rubyinstaller.exe | elixir-websetup.exe
Chocolatey | cinst ruby        | cinst elixir

Both Ruby and Elixir have installers for installing the languages as a Program in Windows.

You can alternatively use the Chocolatey (or NuGet) package manager for Windows.

Linux

Use your package manager

Interactive

Starting Interactive

Ruby | Elixir
irb  | iex

Breaking the current command

Ruby   | Elixir
CTRL+C | #iex:break on line by itself

If you make a typing mistake in irb, you're probably used to hitting CTRL+C, but if you do that in iex you'll get a prompt asking whether to abort, continue, kill, and some other options. To mimic the behavior of CTRL+C from irb, type the comment iex colon break (#iex:break) on a line by itself.

Exiting

Ruby | Elixir
exit | CTRL+C CTRL+C

A single CTRL+C in iex will bring up the break handler, which allows you to inspect the running VM and kill individual processes.

Types

Numeric Types

Ruby Name | Ruby Example     | Elixir Name | Elixir Example
Integer   | 9, 0b1, 0o7, 0xF | Integer     | 9, 0b1, 0o7, 0xF
Float     | 1.2, 3e+0        | float       | 1.2, 3.0e+0

Integer formats are the same for Ruby and Elixir. They both support decimal, binary, octal, and hexadecimal. Underscore (_) can be used to separate digit groups.

Float formats are nearly the same for Ruby and Elixir. They both support 'e' notation, but Elixir requires digits on both sides of the decimal point, so Ruby's 3e+0 is written 3.0e+0 in Elixir.
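The literal formats are easy to check from the Ruby side. A quick sketch (the constant names are made up for this example):

```ruby
# The same value, nine, written in decimal, binary, octal, and hexadecimal.
NINE_FORMS = [9, 0b1001, 0o11, 0x9].freeze

# Underscores are ignored by the parser and only aid readability.
ONE_MILLION = 1_000_000

# 'e' notation: 3e+0 is 3 times 10 to the power 0.
THREE = 3e+0

p NINE_FORMS.uniq          # [9]
p ONE_MILLION == 1000000   # true
p THREE == 3.0             # true
```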

Constant Types

Ruby Name         | Ruby Example                   | Elixir Name      | Elixir Example
Symbol            | :symbol, :"symbol", :'symbol'  | Atom             | :atom, :"atom", :'atom'
Class/Module name | MyClass, MyNamespace::MyModule | Alias            | MyModule, MyNamespace.MyModule, :erlang_module
Constant          | MY_CONSTANT                    | Module Attribute | @my_attribute

Symbol in Ruby just becomes the word Atom in Elixir.

Class/Module names and Aliases both share camel-casing, but namespaces in Class/Module names are separated with colon-colon (::), while Alias names are separated with dot (.). Additionally, Erlang modules are just atoms. Aliases are called Aliases because they are actually syntactic sugar for an atom starting with Elixir dot (Elixir.).

Your brain may read module attributes as being like class instance variables in Ruby since they both start with at (@) and look like a variable, but module attributes don't get assigned to with equals (=). Instead, the attribute name is written before the value to put into the module attribute. This is because module attributes can be configured to either reset their value or accumulate all values passed to them. This allows for some nice features, which I'll get to later.

Boolean Types

Ruby  | Elixir
false | false, :false
nil   | nil, :nil
true  | true, :true

In Elixir, false, nil, and true are all just syntactic sugar for the atoms of the same name.

Both Ruby and Elixir have falsy logic, so both false and nil are false for boolean operations.
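On the Ruby side, this falsy logic means that only false and nil fail a condition, while values such as 0 and the empty string pass. A small sketch, with truthiness as a hypothetical helper name:

```ruby
# Returns :truthy or :falsy depending on how Ruby treats the value
# in a boolean context. (truthiness is a hypothetical helper.)
def truthiness(value)
  value ? :truthy : :falsy
end

[false, nil, true, 0, "", []].each do |value|
  puts "#{value.inspect} is #{truthiness(value)}"
end
```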

Strings

Format
  Ruby:   "string"
  Elixir: "string"
Interpolation
  Ruby:   "Hello #{:world}" ✓
  Elixir: "Hello #{:world}" ✓
Encoding
  Ruby:   UTF-8
  Elixir: UTF-8
Unicode Capitalization
  Ruby:   "José Valim".upcase # "JOSé VALIM" ❌
  Elixir: String.upcase "José Valim" # "JOSÉ VALIM" ✓
Unicode Graphemes Rendering
  Ruby:   "\u0065\u0301" # "é" ✓
  Elixir: "\x{0065}\x{0301}" # "é" ✓
Unicode Graphemes Length
  Ruby:   "\u0065\u0301".length # 2 ❌
  Elixir: String.length "\x{0065}\x{0301}" # 1 ✓

Both Ruby and Elixir support double-quoted strings with interpolation, encoded as UTF-8.

Elixir excels at proper unicode handling compared to Ruby: Elixir properly capitalizes e-acute (é) in José Valim while Ruby does not; Elixir properly handles the separate e and acute accent being one grapheme while Ruby counts them as 2 characters for the string length.

If you care about what the user actually sees, you want the number of graphemes in a string, not the number of code points or raw bytes, so Elixir's approach is correct.
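The three ways of measuring the same string can be compared from Ruby. This is a minimal sketch that assumes Ruby 2.5 or later, where String#grapheme_clusters is available (it was not at the time of this talk); grapheme_length is a hypothetical helper name.

```ruby
# "e" followed by a combining acute accent: two code points, one grapheme.
DECOMPOSED = "\u0065\u0301"

# Counts what a reader would see as separate characters.
# Assumes Ruby 2.5+ for String#grapheme_clusters.
def grapheme_length(string)
  string.grapheme_clusters.length
end

puts DECOMPOSED.length            # 2 code points
puts DECOMPOSED.bytesize          # 3 bytes in UTF-8
puts grapheme_length(DECOMPOSED)  # 1 grapheme
```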

Regular Expressions

Literals

Ruby:
  • /[A-Z]+/
  • %r{[A-Z]+}

Elixir:
  • ~r{[A-Z]+}
  • ~r[(.*)]
  • ~r<[A-Z]+>
  • ~r"[A-Z]+"
  • ~r/[A-Z]+/
  • ~r([A-Z]+)
  • ~r|[A-Z]+|
  • ~r'[A-Z]+'

Compile

Ruby:   Regexp.new "string"
Elixir: Regex.compile! "string"

Replace

Ruby:
'`spec` is a task for `rake`'.gsub(
  /`(.*?)`/,
  '<code>\1</code>'
) # "<code>spec</code> is a task for <code>rake</code>"

Elixir:
Regex.replace(
  ~r/`(.*?)`/,
  "`test` is a task for `mix`",
  "<code>\\1</code>"
) # "<code>test</code> is a task for <code>mix</code>"

Unlike Ruby, which only allows

Anonymous Functions

Declaration

Ruby:
  • add = ->(a,b){ a + b }
  • add = lambda { |a,b| a + b }
  • add = proc { |a,b| a + b }
  • add = Proc.new { |a, b| a + b}

Elixir:
  • add = fn a, b -> a + b end
  • add = fn (a, b) -> a + b end
  • add = &(&1 + &2)

Calling

Ruby:
  • add.call(1,2)
  • add.call 1, 2
  • add.(1,2)
  • add[1,2]

Elixir:
  • add.(1,2)

Collections

Ruby Array / Elixir Tuple
  Ruby:   [1,2,3]
  Elixir: {1,2,3}

Ruby Hash / Elixir Map
  Ruby:
  • {a: 1, b: 2}
  • {:a => 1, :b => 2}
  Elixir:
  • %{a: 1, b: 2}
  • %{:a => 1, :b => 2}

Ruby Set / Elixir HashSet
  Ruby:   Set.new [1,2,3]
  Elixir: Enum.into [1,2,3], HashSet.new

Linked List
  Ruby:   ❌
  Elixir:
  • [1,2,3]
  • [1 | [2 | [3 | []]]]

Keyword List
  Ruby:   ❌
  Elixir:
  • [a: 1, a: 2, b: 3]
  • [{:a, 1}, {:a, 2}, {:b, 3}]

Arrays and tuples are both contiguous in memory, but tuples are a fixed size while Ruby Arrays are resizable. Both allow lookup by index.

Both Set and HashSet are built on top of Hashes and don't support a built-in syntax for initialization, and so here are populated using an Array and Linked List, respectively.

A linked list can access the first (head) element or the rest (tail) of the elements quickly. The Ruby standard library has no linked list implementation. The hamster gem has Hamster.list .
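To show what head and tail mean in practice, here is a minimal hand-rolled sketch in Ruby. Cons, list_from, and list_to_a are hypothetical names for this example, and nil stands in for the empty list; this is neither Hamster's API nor how Elixir implements lists internally.

```ruby
# A cons cell: one element (head) plus the rest of the list (tail).
# The empty list is represented by nil in this sketch.
Cons = Struct.new(:head, :tail)

# Build a linked list from an array, starting from the last element.
def list_from(array)
  array.reverse.reduce(nil) { |tail, head| Cons.new(head, tail) }
end

# Walk the list back into an array for display.
def list_to_a(list)
  result = []
  until list.nil?
    result << list.head
    list = list.tail
  end
  result
end

list = list_from([1, 2, 3])
p list.head                      # 1
p list_to_a(list.tail)           # [2, 3]
p list_to_a(Cons.new(0, list))   # [0, 1, 2, 3]
```

Reading the head, reading the tail, and prepending a new head each touch a single cell, which is why those operations are quick.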

A Keyword List allows multiple values for the same key and is implemented as a list of 2-tuples. It is not as efficient as a map, which is optimized for lookup by key, but it is useful for named function arguments where an option can be repeated.
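The closest Ruby analogue is an array of key and value pairs. The sketch below is only an analogy, with OPTIONS and values_for as hypothetical names. It shows the trade-off: repeated keys survive in the pair list but collapse when converted to a Hash.

```ruby
# An array of [key, value] pairs keeps duplicates and order,
# much like an Elixir keyword list keeps its 2-tuples.
OPTIONS = [[:a, 1], [:a, 2], [:b, 3]].freeze

# All values stored under a key, in order (a linear scan, unlike a Hash).
def values_for(pairs, key)
  pairs.select { |k, _| k == key }.map { |_, v| v }
end

p values_for(OPTIONS, :a)   # both values for the repeated key
p OPTIONS.assoc(:a)         # the first pair with that key
p OPTIONS.to_h              # a Hash keeps only the last value per key
```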

Control Flow

Boolean Control Flow

Ruby:
if false
  'This will never be seen'
else
  'This will'
end

Elixir:
if false do
  "This will never be seen"
else
  "This will"
end

Ruby:
unless true
  'This will never be seen'
else
  'This will'
end

Elixir:
unless true do
  "This will never be seen"
else
  "This will"
end

Ruby:
if one
  'one is true'
elsif two
  'two is true'
else
  'neither one nor two is true'
end

Elixir:
cond do
  one -> "one is true"
  two -> "two is true"
  true -> "neither one nor two is true"
end

Take note that if and unless look the same in Ruby and Elixir, except that Elixir has a 'do' after the condition. I'll explain why that is later.

Although Elixir has control flow, it is rare in idiomatic code to see if, unless or cond. It is much more likely to see pattern matching, which I'll cover next.

Rescuing Exceptions

Ruby:
begin
  raise 'some error'
rescue RuntimeError => runtime_error
  puts runtime_error
rescue ArgumentError
  puts 'argument error occurred'
rescue => exception
  puts exception
rescue
  puts 'some exception'
else
  puts 'no exception'
ensure
  puts 'always runs'
end

Elixir:
try do
  raise "some error"
rescue
  x in [RuntimeError] ->
    IO.puts x.message
  ArgumentError ->
    IO.puts "argument error occurred"
  error ->
    IO.puts Exception.message(error)
  _ ->
    IO.puts "some error"
else
  _ ->
    IO.puts "no error"
after
  IO.puts "always runs"
end

Ruby                     | Elixir
begin                    | try do
rescue Klass => instance | variable in [Alias] ->
rescue Klass             | Alias ->
rescue => exception      | error ->
rescue                   | _ ->
else                     | else
ensure                   | after
end                      | end

Catching Throws And Exits

Ruby:
answer = nil
catch(:done) do
  answer = 42
  throw :done
end

Elixir:
try do
  name = "Alice"
  throw {"Hello", name}
  exit "I am exiting"
catch
  {greeting, name} ->
    IO.puts "#{greeting} to you, #{name}"
  :exit, _ -> "not really"
after
  IO.puts "Nothing thrown"
end

Catch and throw are quite esoteric in Ruby. The only case I can think of them being used in production is in parts of Rack.

In Elixir, catch and throw are only meant for use when you can't send a message up the stack any other way. However, Elixir's catch is far more flexible, as the throw and catch don't have to agree on a single symbol to match on the way Ruby's do.
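For comparison, here is a sketch of the one situation where Ruby's catch and throw read reasonably well: leaving nested loops early. The find_position method and GRID constant are hypothetical names for this example.

```ruby
# Search nested rows and leave both loops as soon as a match is found.
# The throw must name the same symbol (:found) that the catch is waiting for.
def find_position(rows, target)
  catch(:found) do
    rows.each_with_index do |row, y|
      row.each_with_index do |cell, x|
        throw :found, [x, y] if cell == target
      end
    end
    nil # value of the catch block when nothing was thrown
  end
end

GRID = [[1, 2], [3, 42]].freeze

p find_position(GRID, 42)   # [1, 1]
p find_position(GRID, 7)    # nil
```

The second argument to throw becomes the value of the catch block, but the pairing itself still hinges on that one agreed symbol.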

Catch and exit are used all the time to monitor for VM processes exiting. VM processes monitoring each other is part of the resiliency features of Elixir, so it is very important. Most of the time, the standard library will handle the exit catching behind the scenes.

You may have also noticed that rescuing exceptions and catching throws and exits all use try. It is actually possible to rescue exceptions and catch throws and exits in the same try.

Pattern Matching

Ruby Assignment to Elixir Matching

Step | Expression | Ruby        | Elixir
1    | foo = 1    | foo is 1    | foo is 1
2    | 1 = foo    | SyntaxError | foo is 1
3    | 2 = foo    | SyntaxError | ** (MatchError) no match of right hand side value: 1

In Elixir, the equals sign (=) is the match operator. The match operator should not be thought of as assignment, but instead as trying to get the two sides of the equals sign to match, which is why you can do foo equals 1 or 1 equals foo.

However, after already doing foo equals 1 (foo = 1) you can't do 2 equals foo because the match operator will only rebind a variable if it is on the left-hand side.

Destructuring to Match

Ruby:   a, b = [1, 2]          variables: a = 1, b = 2
Elixir: {a, b} = {1, 2}        variables: a = 1, b = 2

Ruby:   _, b = [1, 2]          variables: b = 2
Elixir: {_, b} = {1, 2}        variables: b = 2

Ruby:   a, *b = [1, 2, 3]      variables: a = 1, b = [2, 3]
Elixir: [a | b] = [1, 2, 3]    variables: a = 1, b = [2, 3]

ArgumentError to Match

Ruby:
a, b = [nil, 2]
unless a == 1
  raise ArgumentError,
        "a should be 1"
end
# ArgumentError: a should be 1

Elixir:
{1, b} = {nil, 2}
# ** (MatchError) no match of right hand side value: {nil, 2}

Ruby:
opening, closing = [:td, :th]
unless opening == closing
  raise ArgumentError,
        "opening and closing tag don't match"
end
# ArgumentError: opening and closing tag don't match

Elixir:
{tag, tag} = {:td, :th}
# ** (MatchError) no match of right hand side value: {:td, :th}

Instead of destructuring an argument and then checking its value as would be required in Ruby, you can put the expected value into the pattern in Elixir and have Elixir check the value for you.

Patterns automatically enforce that when the same variable is used more than once, it must have the same value each time. For example, you could check that the closing and opening tags match when parsing XML.
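Ruby later gained a comparable feature. The sketch below assumes Ruby 3.0 or later, where case/in pattern matching is stable, and uses hypothetical method names. Unlike Elixir, Ruby does not allow the same variable to appear twice in a pattern, so the second occurrence is pinned with ^.

```ruby
# Assumes Ruby 3.0+ (case/in pattern matching).

# Destructure a pair and require the first element to be 1.
def second_when_first_is_one(pair)
  case pair
  in [1, b]
    b
  end
end

# Require both tags to be equal by pinning the first binding with ^.
def matching_tag(tags)
  case tags
  in [tag, ^tag]
    tag
  end
end

p second_when_first_is_one([1, 2])   # 2
p matching_tag([:td, :td])           # :td

begin
  matching_tag([:td, :th])
rescue NoMatchingPatternError => error
  puts "no match: #{error.message}"
end
```

A case/in with no else raises NoMatchingPatternError when nothing matches, which plays the role of Elixir's MatchError here.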

Function Clauses

Ruby:
def cat_greet(who)
  case who
  when :owner
    'Purr!'
  when :dog
    'Hiss!'
  else
    '*ignore*'
  end
end
cat_greet(:owner) # "Purr!"
cat_greet(:dog) # "Hiss!"
cat_greet(:sitter) # "*ignore*"

Elixir:
cat_greet = fn
  :owner -> "Purr!"
  :dog -> "Hiss!"
  _ -> "*ignore*"
end

cat_greet.(:owner) # "Purr!"
cat_greet.(:dog) # "Hiss!"
cat_greet.(:sitter) # "*ignore*"

Using pattern matching, we can have a function behave differently based on its inputs without the need to write our own conditional logic. In Ruby, we're stuck with a case statement unless the argument is a class and the responses can be made polymorphic.
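For reference, the polymorphic alternative in Ruby could look like the sketch below. The Visitor, Owner, Dog, and Sitter classes are hypothetical, invented to mirror the cat_greet example.

```ruby
# Each kind of visitor knows how a cat responds to it, so cat_greet
# needs no case statement. (All class names here are hypothetical.)
class Visitor
  def cat_response
    '*ignore*'
  end
end

class Owner < Visitor
  def cat_response
    'Purr!'
  end
end

class Dog < Visitor
  def cat_response
    'Hiss!'
  end
end

class Sitter < Visitor; end

def cat_greet(who)
  who.cat_response
end

puts cat_greet(Owner.new)    # Purr!
puts cat_greet(Dog.new)      # Hiss!
puts cat_greet(Sitter.new)   # *ignore*
```

This removes the conditional, but at the cost of four classes for what Elixir expresses in three function clauses.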

The pattern matching is very efficient in the compiled BEAM bytecode. If multiple function clauses have the same prefix, such as when matching strings or packets, the compiler will produce a tree of checks so that the prefix is matched first before moving on to the parts of the patterns that differ.

If-else to case

Ruby:
fizz = n % 3 == 0
buzz = n % 5 == 0

if fizz && buzz
  'FizzBuzz'
elsif fizz
  'Fizz'
elsif buzz
  'Buzz'
else
  n
end

Elixir:
case {rem(n, 3), rem(n, 5), n} do
  {0, 0, _} -> "FizzBuzz"
  {0, _, _} -> "Fizz"
  {_, 0, _} -> "Buzz"
  {_, _, n} -> n
end

In Ruby, the case statement uses the triple-equals (===) operator, so a when clause can match the exact value, a Class the argument is an instance of, or a Regexp the argument matches.
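A short Ruby sketch of those three kinds of when clause follows; describe is a hypothetical method name, and the comments show the === call that each clause performs.

```ruby
# Describe a value using the three kinds of when clause:
# an exact value, a Class, and a Regexp.
def describe(value)
  case value
  when 42        then 'the exact value 42'   # 42 === value
  when Integer   then 'some other Integer'   # Integer === value
  when /\A\d+\z/ then 'a String of digits'   # /\A\d+\z/ === value
  else 'something else'
  end
end

puts describe(42)      # the exact value 42
puts describe(7)       # some other Integer
puts describe('123')   # a String of digits
puts describe(nil)     # something else
```

Clauses are tried top to bottom, so the exact value 42 has to come before the broader Integer check.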

In Elixir, case statements are a way to match a bunch of patterns in order, the same as with function clauses.

Project

Let's start from a clean slate and say we're creating a new project.

Tools

Ruby                | Elixir
gem list            | mix archive
gem build *.gemspec | mix archive.build
gem install *.gem   | mix archive.install
gem uninstall NAME  | mix archive.uninstall NAME
rm *.gem            | mix clean
bundle list         | mix deps
bundle install      | mix deps.get
rm Gemfile.lock     | mix deps.unlock --all
bundle update       | mix deps.update --all
rake TASK1 TASK2    | mix do TASK1, TASK2
bundle help         | mix help
gem help            | mix help
rake -T             | mix help
bundle outdated     | mix hex.outdated
gem owner           | mix hex.owner
gem push            | mix hex.publish
gem query           | mix hex.search
bundle gem          | mix new
rake spec           | mix test

Unlike Ruby, where you need to know whether to use bundle, gem, or rake, everything is a mix task in Elixir.

Mix tasks are namespaced just like rake tasks, but mix tasks use dot (.) instead of colon (:).

Most of the gem commands map to mix archive namespace tasks. Most bundle commands map to mix deps, except for outdated, which is mix hex dot outdated (mix hex.outdated).

Any gem commands interacting with rubygems.org are mapped to mix hex tasks because the rubygems.org equivalent is hex.pm.

Ruby:
  • gem install bundler
  • bundle gem example --coc --mit --test=rspec

Elixir:
  • mix new example

There is no step equivalent to gem install bundler for Elixir because mix is part of the Elixir install.

File Layout

Ruby                   | Elixir
example.gemspec        | mix.exs
.gitignore             | .gitignore
Gemfile                | mix.exs
lib/example.rb         | lib/example.ex
lib/example/version.rb | mix.exs
README.md              | README.md
spec/spec_helper.rb    | test/test_helper.exs
spec/example_spec.rb   | test/example_test.exs

Both bundle gem and mix new handle git ignores, a stubbed packaging file, versioning, README, and tests.

In version 1.10, bundler really stepped up its game: it now prompts for a Code-of-Conduct (COC) and license, and automatically generates a .travis.yml for travis-ci.org.

Unlike bundler's Gemfile and rubygems' gemspec, Elixir does not split packaging and dependency management: everything is in mix.exs. Additionally, the version of the project is stored directly in the mix.exs instead of being in a separate file like lib/example/version.rb.

Packaging

Ruby's example.gemspec, followed by Elixir's mix.exs:
# coding: utf-8
lib = File.expand_path('../lib', __FILE__)
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
require 'example/version'

Gem::Specification.new do |spec|
  spec.name          = "example"
  spec.version       = Example::VERSION
  spec.authors       = ["Luke Imhoff"]
  spec.email         = ["luke_imhoff@rapid7.com"]

  spec.summary       = %q{TODO: Write a short summary, because Rubygems requires one.}
  spec.description   = %q{TODO: Write a longer description or delete this line.}
  spec.homepage      = "TODO: Put your gem's website or public repo URL here."
  spec.license       = "MIT"

  spec.files         = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
  spec.bindir        = "exe"
  spec.executables   = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
  spec.require_paths = ["lib"]

  spec.add_development_dependency "bundler", "~> 1.10"
  spec.add_development_dependency "rake", "~> 10.0"
  spec.add_development_dependency "rspec"
end
defmodule Example.Mixfile do
  use Mix.Project

  def project do
    [app: :example,
     version: "0.0.1",
     elixir: "~> 1.0",
     deps: deps]
  end

  def application do
    [applications: [:logger]]
  end

  defp deps do
    []
  end
end

As you can see at the top of mix.exs, it is a standard Elixir module under the project's Example namespace, so the version can be extracted from Example.Mixfile.project at runtime without the need for a separate version file like Ruby's lib/example/version.rb constants.

The app keyword in the project function gives the entirety of this project a name that can be used to include this project in other projects' releases. The applications list in the application function lists the applications this project depends on at runtime. In Erlang and Elixir, applications are libraries or trees of VM processes that are started as a group to provide some service to other applications.

The version keyword in the project function sets the version of this project when it is released as an application or published to hex.

hex.pm is the Elixir equivalent of rubygems.org.

The deps function lists the dev, prod, and/or test compile time dependencies.

Dependency - Source

Source | Ruby (*.gemspec) | Ruby (Gemfile) | Elixir (mix.exs)
Packager | spec.add_runtime_dependency 'mydep', '~> 1.2.3' | gem 'mydep', '~> 1.2.3' | {:mydep, "~> 0.3.0"}*
Github | ❌ | gem 'mydep', github: 'myorg/mydep', tag: 'v1.2.3' | {:mydep, github: "myorg/mydep", tag: "v1.2.3"}
Path | ❌ | gem 'mydep', path: 'path/to/mydep' | {:mydep, path: "path/to/mydep"}

*All mix dependencies are added to the [] in Example.Mixfile.deps/0

Because rubygems and bundler grew up after Ruby was started, their individual responsibilities for dependencies overlap.