Introducing Erlang – Bob Ippolito – RuPy 2013Budapest, Hungary



Introducing Erlang – Bob Ippolito – RuPy 2013Budapest, Hungary

1 4


intro-to-erlang-2013

Intro to Erlang (Sonoma State 2013 talk)

On Github etrepum / intro-to-erlang-2013

Introducing Erlang

Bob Ippolito

RuPy 2013Budapest, Hungary

What do I know?

  • Erlang user since 2006
  • Founded Mochi Media, Inc. (CTO from 2005-2012)
  • Developed popular Erlang web server (mochiweb)

Erlang

  • Ericsson language
Agner Krarup Erlang

Raison d'être

  • Came from Ericsson R&D in the 80s
  • Telecom shifting to packet switching
  • Multi-service networks
  • Improve development of telephony applications
  • Scalable, distributed, no down time, fast development

Domain

Distributed massively concurrent soft-realtime systems that do not stop.

Applications

  • Networking (Switching, Routing, SDN)
  • Messaging (SMS, MMS, IM) services
  • Distributed systems (Databases, cluster management)
  • Web services (analytics, ad servers)
  • Simulation (testing)
  • Game servers

Telecom

  • Conferences
  • Training
  • Certification
  • Consulting
  • Support

Database Vendors

Advertising

  • Mochi Media
  • AdRoll
  • OpenX
  • AdGear
  • AOL

Cloud / Data Center Automation

  • Rackspace
  • Opscode
  • Heroku
  • VMWare

Mobile/VoIP services

  • WhatsApp
  • 2600hz
  • TigerText

Monitoring / Ops

  • Boundary
  • BugSense
  • Alert Logic

Media

Games

  • Spilgames
  • Wooga
  • MuchDifferent / Pikkotekk
  • Linden Lab

Money

  • Klarna (payments)
  • Smarkets (sports betting)

Open Source

  • Riak (Basho)
  • CouchDB (CouchBase, Cloudant)
  • ejabberd (Process-One)
  • RabbitMQ (LShift → VMWare)
  • Disco (Nokia)

Not good for

  • Client side applications (GUIs)
  • … but web-based UIs work well
  • Number crunching

Ancestry

  • Syntax mostly from Prolog and ML
  • List comprehensions from Miranda
  • Message passing inspired by Smalltalk

Erlang was prototyped in Prolog:

% factorial.pl (Prolog)
:- module(factorial, [factorial/2]).

factorial(0, 1).
factorial(N, F) :-
  N>0,
  N1 is N-1,
  factorial(N1, F1),
  F is N * F1.
% factorial.erl (Erlang)
-module(factorial).
-export([factorial/2]).

factorial(0) -> 1;
factorial(N) when N > 0 ->
  N * factorial(N - 1).

List comprehensions from Miranda (like Haskell)

-- Subsets.hs
module Subsets (subsets) where

subsets :: [a] -> [[a]]
subsets []     = [[]]
subsets (x:xs) = [(x:y) | y <- ys] ++ ys
  where ys = subsets xs
% subsets.erl
-module(subsets).
-export([subsets/1]).

subsets([]) ->
    [[]];
subsets([X | Xs]) ->
    Ys = subsets(Xs),
    [[X | Y] || Y <- Ys] ++ Ys.

Pattern matching like ML

(* reverse.sml *)
fun reverse(l : 'a list) : 'a list =
    reverse1(l, [])

fun reverse1(l : 'a list, acc : 'a list) =
    case l of
        [] => acc
      | (x::xs) => reverse1(xs, x::acc)
% reverse.erl
-module(reverse).
-export([reverse/1]).

reverse(L) ->
    reverse(L, []).

reverse(L, Acc) ->
    case L of
        [] -> Acc;
        [X | Rest] -> reverse(Rest, [X | Acc])
    end.
Smalltalk didn't really use concurrency for actors or message passing, so the relationship is mostly conceptual.

Key Features

  • Declarative
  • Concurrent
  • Multi-core
  • Distributed
  • Robust
  • Versatile

Declarative

  • "Describe the problem, not the solution"
  • Great syntax for this (but not C-like)!
  • Functional code is (often) easier to understand and test
  • … but not 100% pure (message passing is a side-effect)
-module(merge_sort).
-export([merge_sort/1]).

% bottom-up merge sort
merge_sort([]) ->
    [];
merge_sort(L) ->
    iterate([[X] || X <- L]).

iterate([Xs]) ->
    Xs;
iterate(Lists) ->
    iterate(merge_lists(Lists)).

merge_lists([Xs, Ys | Rest]) ->
    [merge2(Xs, Ys) | merge_lists(Rest)];
merge_lists(Lists) ->
    Lists.

merge2(Xs=[X | _], [Y | Ys]) when X > Y ->
    [Y | merge2(Xs, Ys)];
merge2([X | Xs], Ys) ->
    [X | merge2(Xs, Ys)];
merge2([], Ys) ->
    Ys.
# merge_sort.py
def merge_sort(lst):
    if not lst:
        return []
    lists = [[x] for x in lst]
    while len(lists) > 1:
        lists = merge_lists(lists)
    return lists[0]

def merge_lists(lists):
    result = []
    for i in range(0, len(lists) // 2):
        result.append(merge2(lists[i*2], lists[i*2 + 1]))
    if len(lists) % 2:
        result.append(lists[-1])
    return result

def merge2(xs, ys):
    i = 0
    j = 0
    result = []
    while i < len(xs) and j < len(ys):
        x = xs[i]
        y = ys[j]
        if x > y:
            result.append(y)
            j += 1
        else:
            result.append(x)
            i += 1
    result.extend(xs[i:])
    result.extend(ys[j:])
    return result

Convenient bit syntax for parsing binary data

%% This parses a TCP packet header!
<< SourcePort:16, DestinationPort:16, SequenceNumber:32,
   AckNumber:32, DataOffset:4, _Reserved:4, Flags:8,
   WindowSize:16, Checksum:16, UrgentPointer:16,
   Payload/binary >> = Segment,
OptSize = (DataOffset - 5)*32,
<< Options:OptSize, Message/binary >> = Payload,
<< CWR:1, ECE:1, URG:1, ACK:1, PSH:1,
   RST:1, SYN:1, FIN:1>> = <<Flags:8>>,

%% Can now process the Message according to the 
%% Options (if any) and the flags CWR, …, FIN etc.

Concurrent

  • Erlang concurrency is cheap
  • Per-process heaps with incremental GC
  • Message passing, not mutexes
  • Straightforward control flow (not callbacks!)

RAM footprint per unit of concurrency (approx)

1.3KB
Haskell ThreadId + MVar (GHC 7.6.3, 64-bit) 2.6 KB
Erlang process (64-bit) 8.0 KB
Go goroutine 64.0 KB
Java thread stack (minimum) 64.0 KB
C pthread stack (minimum) 1 MB
Java thread stack (default) 8 MB
C pthread stack (default, 64-bit Mac OS X)

Per-process heaps

  • No sharing
  • GC is per-process, and not "stop the world"!
  • Process references do not prevent GC
  • Explicitly hibernate idle processes for compaction

RPC with a Counter process

Counter ! {self(), {add, 1}},
receive
  {Counter, {result, N}} ->
    io:format("~p~n", [N])
end

Multi-core

  • One scheduler per core, scales well to 32+ cores
  • Better scalability to more cores is in-progress
  • Schedulers understand IO (disk, network calls)
  • No implicit synchronization
%% Parse HTTP headers from Socket
headers(Socket, Request, Headers) ->
    ok = inet:setopts(Socket, [{active, once}]),
    receive
        {http, _, http_eoh} ->
            %% All of the HTTP headers are parsed
            handle_request(Socket, Request, Headers);
        {http, _, {http_header, _, Name, _, Value}} ->
            headers(Socket, Request, [{Name, Value} | Headers]);
        {tcp_closed, _} ->
            exit(normal);
        _Other ->
            %% Invalid request
            exit(normal)
    after ?HEADERS_RECV_TIMEOUT ->
        exit(normal)
    end.

Distributed

  • Built-in distributed computing support
  • Easy to get started, "no configuration"
  • Same syntax, similar semantics, whether local or not
  • Necessary for true fault tolerant systems (2+ nodes)
  • Scales well to hundreds of nodes (LAN only)

Connect to other nodes by name

(node1@res)1> net_adm:ping(node2@res).
pong
(node1@res)2> self().
<0.38.0>
(node1@res)3> rpc:call(node2@res, erlang, self, []).
<6943.43.0>
(node1@res)4> nodes().
[node2@res]
(node1@res)5> rpc:cast(node2@res, init, stop, []).
true
(node1@res)6> nodes().
[]
alto[email protected]PidAnet_kernelepmdonyx[email protected]PidBnet_kernelepmd

Robust

  • OTP framework encodes these best practices
  • Processes can be linked for exit signal propagation
  • Errors are localized and do not corrupt application state
  • Supervisor processes start/stop/monitor other processes
  • "Let it crash" means far less error handling code

OTP

Open Telephony Platform

Framework has little to do with telephony.
SupervisorAWorkerASupervisorBWorkerB1WorkerB2WorkerB3

Versatile

  • Upgrade/fix live systems with hot code loading
  • Debug/profile live systems with tracing
  • Easily attach a console to a live system
  • Good support for instrumentation

Language Overview

Functional

  • No mutable data structures
  • Linked lists (like LISP)
  • All state is on the stack
  • Tail-call elimiation
atom (interned string, like LISP)
this_is_an_atom
'also an atom'
binary (contiguous array of bytes w/ O(1) slice)
<<1, 2, "bytes", 9999:64/little>>
integer (unbounded), float (64-bit double)
1125899839733759
2.5
ref (unique reference)
1> make_ref().
#Ref<0.0.0.30>

Collections

Lists
[1, 2, 3] = [1 | [2 | [3 | []]]]
Tuples
{one, two, 3}
fun (function reference)
1> fun () -> ok end.
#Fun<erl_eval.20.80484245>
2> fun erlang:self/0.
#Fun<erlang.self.0>
pid (process identifier)
1> self().
<0.35.0>
port (port identifier, similar to pid)
1> erlang:open_port(
    {spawn, "/bin/ls"}, []).     
#Port<0.606>
booleans are atoms
true /= false
characters are integers, strings are lists
"abc" == [$a, $b | [99]]
records are tuples
-record(foo, {bar}).
#foo{bar=1} == {foo, 1}
Variables are single assignment (and UpperCase)
1> X = 1, X = 2.
** exception error: no match …
Assignment is pattern matching! (_ is wildcard)
1> {[X], _} = {[foo], bar}, X.
foo
2> {X, Y} = {foo, bar}, Y.
bar
Functions and case can use guards when pattern matching
case date() of
  {2013, 10, 31} ->
     halloween;
  {2013, 10, N} when N >= 11, 
                     N =< 13 ->
     rupy_conference;
  _ ->
     some_other_day
end.
Comma delimits expressions ("then")
first(), second(), third().
Semicolon delimits clauses ("or")
case X of
  1 -> one;
  2 -> two;
  _ -> other_number
end.
Period ends the last clause (".")
simple_function() -> ok.

Modules

  • Flat namespace
  • Explicit export of public functions
  • Referenced by name (atom)
  • Compiled to ".beam" file
  • Can be reloaded in a running system
%% hello.erl
%% USAGE: hello:world().
-module(hello).
-export([world/0]).

world() -> io:format("hello world~n", []).

More Info

Thanks!

Slides

http://bob.ippoli.to/intro-to-erlang-2013/

Source

github.com/etrepum/intro-to-erlang-2013

Email

bob@redivi.com

Twitter

@etrepum