On Github KronicDeth / ruby-to-elixir
2015-08-03
Luke Imhoff
luke_imhoff@rapid7.com Kronic.Deth@gmail.com @limhoff-r7 @KronicDeth @KronicDeth
I am a Senior Software Engineer on the Metasploit Architecture team at Rapid7. In my spare time, I maintain IntelliJ Elixir, the Elixir plugin for JetBrains IDEs like IntelliJ and RubyMine.
Elixir is a functional, dynamically typed language that uses immutable data and is highly concurrent.
MRI Ruby's go-to concurrency approach of forking a process has too much overhead and is only tolerable on systems with Copy-On-Write (COW) fork, which excludes Windows. Threading and fibers in MRI cannot get around the GIL and so only help with IO-bound code.
On the other hand, Elixir uses the Erlang VM, which, thanks to immutable data, can run isolated processes inside the VM so fast and so cheaply that my laptop can run as many as I want and dedicate a separate one to each logically concurrent task.
Just as you learned imperative programming over mathematical thinking, you can retrain to think in functional languages and even switch back and forth. I currently switch between Java, Ruby, and Elixir with only a few hiccups on namespace syntax and string quoting. If you're used to using map or select from Enumerable in Ruby, you are already using functional programming. If you ever used class methods in Ruby that didn't write to class or instance variables, you're already using functional programming.
Learning functional programming will also prepare you for the future. If you watch conference talks in other languages, such as JavaScript or C++, languages are moving to have more and more functional features as the programming community adapts to the many-cores future. They have figured out that locks, threads, and shared mutable state are too error-prone.
The processor speed increases that came with Moore's law stopped 10 years ago. If you want to take advantage of Moore's law now and in the future, you need to write concurrent programs now that can be parallelized across the additional cores of the future automatically.
Switching from imperative, mutable, object-oriented Ruby to functional, immutable, concurrent Elixir may be intimidating, but I'll show you how easy it is to translate your Ruby skills to Elixir to quickly get started learning Elixir on your own.
Ruby and Elixir both have homebrew packages and version managers.
If you use kiex, you'll need to install Erlang separately using either kerl, spelled K-E-R-L, or Homebrew.
Both Ruby and Elixir have installers for installing the languages as a Program in Windows.
You can alternatively use the Chocolatey (or NuGet) package manager for Windows.
Use your package manager
Integer formats are the same for Ruby and Elixir. They both support decimal, binary, octal, and hexadecimal. Underscore (_) can be used to separate digit groups.
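As a quick illustration of the integer formats above, here is a minimal Ruby sketch. The constant name and the specific values are made up for this example; according to the text, the same literal forms should also be accepted by Elixir.

```ruby
# Hypothetical constant holding one literal per supported base.
INTEGER_LITERALS = {
  decimal:     1_000_000, # underscores only group digits; the value is 1000000
  binary:      0b1010,    # base 2
  octal:       0o17,      # base 8
  hexadecimal: 0xFF       # base 16
}.freeze

# Print each literal's value in decimal.
INTEGER_LITERALS.each { |name, value| puts "#{name}: #{value}" }
```

All four literals are plain Integers once parsed, so the base only affects how the number is written in the source.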
Float formats are the same for Ruby and Elixir. They both support 'e' notation.
Symbol in Ruby just becomes the word Atom in Elixir.
Class/Module names and Aliases both share camel-casing, but namespaces in Class/Module names are separated with colon-colon (::), while Alias names are separated with dot (.). Additionally, Erlang modules are just atoms. Aliases are called Aliases because they are actually syntactic sugar for an atom starting with Elixir dot (Elixir.).
Your brain may read module attributes as being like class instance variables in Ruby since they both start with at (@) and look like a variable, but module attributes aren't assigned with equals (=); instead, the attribute name is written before the value to put into the module attribute. This is because module attributes can be configured to either reset their value or accumulate all values passed to them. This allows for some nice features, which I'll get to later.
In Elixir, false, nil, and true are all just syntactic sugar for the atoms of the same name.
Both Ruby and Elixir have falsy logic, so both false and nil are false for boolean operations.
Ruby and Elixir support double-quoted strings with interpolation, encoded as UTF-8.
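To make the falsy rule concrete on the Ruby side, here is a small sketch. The helper name truthy? is hypothetical and exists only for this example.

```ruby
# Hypothetical helper: collapses any value to true or false using
# Ruby's own truthiness rules.
def truthy?(value)
  value ? true : false
end

# Only false and nil are falsy; everything else, including 0 and the
# empty string, is truthy.
[false, nil, 0, '', []].each do |value|
  puts "#{value.inspect} is #{truthy?(value) ? 'truthy' : 'falsy'}"
end
```

Readers coming from languages where 0 or an empty string is falsy should note that neither is falsy here.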
Elixir excels at proper unicode handling compared to Ruby: Elixir properly capitalizes e-acute (é) in José Valim while Ruby does not; Elixir properly handles the separate e and acute accent being one grapheme while Ruby counts them as 2 characters for the string length.
If you care about what the user actually sees, you want the number of graphemes in a string, not the raw bytes, so Elixir's approach is correct.
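The difference between bytes, codepoints, and graphemes can be sketched in Ruby. This example assumes Ruby 2.5 or newer, because it relies on String#grapheme_clusters, which was not available when this talk was given; the helper name string_sizes is hypothetical.

```ruby
# Hypothetical helper reporting three different "sizes" of a string.
def string_sizes(string)
  {
    bytes:      string.bytesize,                 # raw UTF-8 bytes
    codepoints: string.length,                   # what String#length counts
    graphemes:  string.grapheme_clusters.length  # what the user sees
  }
end

# "e" followed by a combining acute accent (U+0301): one visible character.
DECOMPOSED = "e\u0301"
# The precomposed e-acute (U+00E9): also one visible character.
COMPOSED = "\u00e9"

puts string_sizes(DECOMPOSED).inspect
puts string_sizes(COMPOSED).inspect
```

Both strings render as a single character, but only the grapheme count agrees on that for both forms.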
'`spec` is a task for `rake`'.gsub(
  /`(.*?)`/,
  '<code>\1</code>'
)
# "<code>spec</code> is a task for <code>rake</code>"
Regex.replace(
  ~r/`(.*?)`/,
  "`test` is a task for `mix`",
  "<code>\\1</code>"
)
# "<code>test</code> is a task for <code>mix</code>"
Unlike Ruby, which only allows
Arrays and tuples are both contiguous in memory, but tuples are a fixed size while Ruby Arrays are resizable. Both allow lookup by index.
Both Set and HashSet are built on top of Hashes and don't support a built-in syntax for initialization, and so here are populated using an Array and Linked List, respectively.
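On the Ruby side, populating a Set from an Array looks roughly like the sketch below. The constant name TAGS and its contents are made up for illustration.

```ruby
require 'set'

# Hypothetical example data: the Array literal contains a duplicate,
# which the Set silently drops.
TAGS = Set.new([:td, :th, :td])

puts TAGS.size            # number of unique elements
puts TAGS.include?(:th)   # membership check
```

Because there is no literal syntax for a Set, the Array is only a temporary container for the initial elements.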
A linked list can access the first (head) element or the rest (tail) of the elements quickly. The Ruby standard library has no linked list implementation, but the hamster gem has Hamster.list.
A Keyword List allows multiple values for the same key and is implemented as a list of 2-tuples. So, it is not as efficient as a map, which is optimized for lookup, but it is useful for named function arguments where an option can be repeated.
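Ruby has no keyword list type, but the idea can be approximated as an Array of [key, value] pairs. This is only an analogy for Ruby readers, not how Elixir implements it; the constant and helper names below are hypothetical.

```ruby
# Hypothetical options list: the :include key is repeated, which a
# Hash could not represent.
OPTIONS = [[:include, 'lib'], [:include, 'spec'], [:verbose, true]].freeze

# Hypothetical helper: collects every value stored under a key by
# scanning the pairs in order (a linear search, unlike a Hash lookup).
def values_for(pairs, key)
  pairs.select { |k, _| k == key }.map(&:last)
end

puts values_for(OPTIONS, :include).inspect
# Converting to a Hash keeps only the last value for a repeated key.
puts OPTIONS.to_h.inspect
```

The linear scan is the cost of allowing repeats, which mirrors the efficiency trade-off described above.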
if false
  'This will never be seen'
else
  'This will'
end
if false do
  "This will never be seen"
else
  "This will"
end
unless true
  'This will never be seen'
else
  'This will'
end
unless true do
  "This will never be seen"
else
  "This will"
end
if one
  'one is true'
elsif two
  'two is true'
else
  'neither one nor two is true'
end
cond do
  one -> "one is true"
  two -> "two is true"
  true -> "neither one nor two is true"
end
Take note that if and unless look the same in Ruby and Elixir, except that Elixir has a 'do' after the condition. I'll explain why that is later.
Although Elixir has control flow, it is rare in idiomatic code to see if, unless or cond. It is much more likely to see pattern matching, which I'll cover next.
begin
  raise 'some error'
rescue RuntimeError => runtime_error
  puts runtime_error
rescue ArgumentError
  puts 'argument error occurred'
rescue => exception
  puts exception
rescue
  puts 'some exception'
else
  puts 'no exception'
ensure
  puts 'always runs'
end
try do
  raise "some error"
rescue
  x in [RuntimeError] -> IO.puts x.message
  ArgumentError -> IO.puts "argument error occurred"
  error -> IO.puts Exception.message(error)
  _ -> IO.puts "some error"
else
  _ -> IO.puts "no error"
after
  IO.puts "always run"
end
Ruby                        Elixir
begin                       try do
rescue Klass => instance    variable in [Alias] ->
rescue Klass                Alias ->
rescue => exception         error ->
rescue                      _ ->
else                        else
ensure                      after
end                         end
answer = nil
catch(:done) do
  answer = 42
  throw :done
end
try do
  name = "Alice"
  throw({"Hello", name})
  exit "I am exiting"
catch
  {greeting, name} -> IO.puts "#{greeting} to you, #{name}"
  :exit, _ -> "not really"
after
  IO.puts "Nothing thrown"
end
Catch and throw are quite esoteric in Ruby. The only case I can think of them being used in production is in parts of Rack.
In Elixir, catch and throw are only meant for use when you can't send a message up the stack any other way. However, Elixir's catch is far more flexible, as the throw and catch don't have to agree on a single symbol to match on the way Ruby does.
Catch and exit are used all the time to monitor for VM processes exiting. VM processes monitoring each other is part of the resiliency features of Elixir, so it is very important. Most of the time, the standard library will handle the exit catching behind the scenes.
You may have also noticed that rescuing exceptions and catching throws and exits all use try. It is actually possible to rescue exceptions and catch throws and exits in the same try.
In Elixir, the equals sign (=) is the match operator. The match operator should not be thought of as assignment, but instead as trying to get the two sides of the equals sign to match, which is why you can do foo equals 1 (foo = 1) and then 1 equals foo (1 = foo).
However, after already doing foo equals 1 (foo = 1) you can't do 2 equals foo because the match operator will only rebind a variable if it is on the left-hand side.
a, b = [nil, 2]
unless a == 1
  raise ArgumentError, "a should be 1"
end
ArgumentError: a should be 1
{1, b} = {nil, 2}
** (MatchError) no match of right hand side value: {nil, 2}
opening, closing = [:td, :th]
unless opening == closing
  raise ArgumentError, "opening and closing tag don't match"
end
ArgumentError: opening and closing tag don't match
{tag, tag} = {:td, :th}
** (MatchError) no match of right hand side value: {:td, :th}
Instead of destructuring an argument and then checking its value as would be required in Ruby, you can put the expected value into the pattern in Elixir and have Elixir check the value for you.
The patterns automatically will enforce that when the same variable is used more than once it must have the same value. For example, you could check the closing and opening tag match when parsing XML.
def cat_greet(who)
  case who
  when :owner
    'Purr!'
  when :dog
    'Hiss!'
  else
    '*ignore*'
  end
end
cat_greet(:owner)  # "Purr!"
cat_greet(:dog)    # "Hiss!"
cat_greet(:sitter) # "*ignore*"
cat_greet = fn
  :owner -> "Purr!"
  :dog -> "Hiss!"
  _ -> "*ignore*"
end
cat_greet.(:owner)  # "Purr!"
cat_greet.(:dog)    # "Hiss!"
cat_greet.(:sitter) # "*ignore*"
Using pattern matching, we can have a function behave differently based on its inputs without the need to write our own conditional logic. In Ruby, we're stuck with a case statement unless the argument is a class and the responses can be made polymorphic.
The pattern matching is very efficient in the compiled BEAM bytecode. If multiple function clauses have the same prefix, such as when matching strings or packets, the compiler will produce a tree of checks so that the prefix is matched first before moving on to the parts of the pattern that differ.
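For comparison, the polymorphic alternative mentioned for Ruby could look like the sketch below. The class names Owner, Dog, and Sitter and the method name cat_greeting are hypothetical and chosen only to mirror the cat_greet example.

```ruby
# Each hypothetical class supplies its own response, so the caller
# needs no case statement.
class Owner
  def cat_greeting
    'Purr!'
  end
end

class Dog
  def cat_greeting
    'Hiss!'
  end
end

class Sitter
  def cat_greeting
    '*ignore*'
  end
end

# Dispatch happens on the receiver's class instead of on a symbol.
def cat_greet(who)
  who.cat_greeting
end

puts cat_greet(Owner.new)
puts cat_greet(Dog.new)
puts cat_greet(Sitter.new)
```

This only works when the arguments are objects you control; for plain symbols, the case statement remains the straightforward Ruby option.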
fizz = n % 3 == 0
buzz = n % 5 == 0
if fizz && buzz
  'FizzBuzz'
elsif fizz
  'Fizz'
elsif buzz
  'Buzz'
else
  n
end
case {rem(n, 3), rem(n, 5), n} do
  {0, 0, _} -> "FizzBuzz"
  {0, _, _} -> "Fizz"
  {_, 0, _} -> "Buzz"
  {_, _, n} -> n
end
In Ruby, the case statement uses the triple-equals (===) operator, so a when clause can match an exact value, a Class that the value is an instance of, or a Regexp that the value matches.
In Elixir, case statements are a way to match a bunch of patterns in order, the same as with function clauses.
Let's start from a clean slate and say we're creating a new project.
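A minimal Ruby sketch of the three kinds of when clause described above follows. The method name describe and the specific clauses are hypothetical and exist only for this example.

```ruby
# Hypothetical method: each when clause is tested with
# `clause === value`, top to bottom.
def describe(value)
  case value
  when 42        then 'exactly 42'                  # Integer#=== is equality
  when Integer   then 'some other Integer'          # Class#=== is an is_a? check
  when /\Afizz/i then 'a string starting with fizz' # Regexp#=== is a match test
  else 'no match'
  end
end

puts describe(42)
puts describe(7)
puts describe('FizzBuzz')
puts describe(3.5)
```

Order matters: 42 is also an Integer, so the exact-value clause has to come before the Class clause to be reachable.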
Unlike Ruby, where you need to know whether to use bundle, gem, or rake, everything is a mix task in Elixir.
Mix tasks are namespaced just like rake tasks, but mix tasks use dot (.) instead of colon (:).
Most of the gem commands map to mix archive namespace tasks. Most bundle commands map to mix deps, except for outdated, which is mix hex dot outdated (mix hex.outdated).
Any gem commands interacting with rubygems.org are mapped to mix hex tasks because the rubygems.org equivalent is hex.pm.
There is no step equivalent to gem install bundler for Elixir because mix is part of the Elixir install.
Both bundle gem and mix new handle git ignores, a stubbed packaging file, versioning, README, and tests.
In version 1.10, bundler really stepped up its game and now prompts for Code-of-Conduct (COC), license, and automatically generates a .travis.yml for travis-ci.org.
Unlike bundler's Gemfile and rubygems' gemspec, there is no split between packaging and dependency management in Elixir: everything is in mix.exs. Additionally, the version of the project is stored directly in the mix.exs instead of being in a separate file like lib/example/version.rb.
# coding: utf-8
lib = File.expand_path('../lib', __FILE__)
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
require 'example/version'
Gem::Specification.new do |spec|
  spec.name          = "example"
  spec.version       = Example::VERSION
  spec.authors       = ["Luke Imhoff"]
  spec.email         = ["luke_imhoff@rapid7.com"]
  spec.summary       = %q{TODO: Write a short summary, because Rubygems requires one.}
  spec.description   = %q{TODO: Write a longer description or delete this line.}
  spec.homepage      = "TODO: Put your gem's website or public repo URL here."
  spec.license       = "MIT"
  spec.files         = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
  spec.bindir        = "exe"
  spec.executables   = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
  spec.require_paths = ["lib"]
  spec.add_development_dependency "bundler", "~> 1.10"
  spec.add_development_dependency "rake", "~> 10.0"
  spec.add_development_dependency "rspec"
end
defmodule Example.Mixfile do
  use Mix.Project
  def project do
    [app: :example,
     version: "0.0.1",
     elixir: "~> 1.0",
     deps: deps]
  end
  def application do
    [applications: [:logger]]
  end
  defp deps do
    []
  end
end
As you can see at the top of mix.exs, it is a standard Elixir module and under the project's Example namespace, so the version can be extracted from Example.Mixfile.project at runtime without the need for a separate version file like Ruby's lib/example/version.rb constants.
The app keyword in the project function gives the entirety of this project a name that can be used to include this project in other project releases. The applications list in the application function lists the applications this project depends on at runtime. In Erlang and Elixir, applications are libraries or trees of VM processes that are started as a group to provide some service to other applications.
The version keyword in the project function sets the version of this project when it is released as an application or published to hex.
hex.pm is the Elixir equivalent of rubygems.org.
The deps function lists the dev, prod, and/or test compile time dependencies.
*All mix dependencies are added to the [] in Example.Mixfile.deps/0.
Because rubygems and bundler grew up after Ruby was started, their individual responsibilities for dependencies overlap.