On Github rntz / monoid-slides
Michael Arntzenius
http://www.rntz.netMotto: Simple concepts in abstract mathematics lead to powerful programming techniques
I will show this by example: monoids.
Explain monoids from the ground up & why they're useful.
A binary operator
that's associative
with an identity element.
a · b
Integer addition, +
Set union, ∪
String concatenation, ⧺
Integer maximum
Division, /
Here is a list of binary operators
(a · b) · c = a · (b · c)
Integer addition, +
Set union, ∪
String concatenation, ⧺
Integer maximum
Division, / (a/b)/c ≠ a/(b/c)
It doesn't matter where you put the parentheses!
a · id = id · a = a
Integer addition, +, with 0
Set union, ∪, with {}
String concatenation, ⧺, with ""
Integer maximum no least integer
Division, / (a/b)/c ≠ a/(b/c)
Associative binary operators with identity.
Monoids are so simple they show up everywhere.
Which I'll demonstrate by giving some more examples.
Matrix multiplication
Merging sorted lists (merge step in mergesort!)
Unix commands under |, with cat
Composing functions of type (A → A), with (λx.x)
Regular expressions under choice ("|" in (a|b))
CFGs under choice ("|" in A ::= B | C)
Functions for free
Parallelization / MapReduce
Starting point for exploring algebra
instance Monoid String where mappend = (++) -- ++ is the operator mempty = "" -- "" is the identityGet this for free!
mconcat ["monoids", " are ", "great"] -- Evaluates to "monoids are great"
Idea: any monoid gives you an n-ary operator for free. Associative: apply to N things w/o worry parentheses If N=0, return the identity.
This is exactly what mconcat does.
Split the list in half;
send each half to a different processor;
recurse!
Once both halves return, mappend their results together.
(Do whiteboard diagram.)
Theoretical speedup: O(n) to O(log n).
The reduce step is a distributed, parallel mconcat
MapReduce is a framework (originally from Google) for doing parallel computation on massive datasets.
The "map" part is a parallel map.
Parallelizing mconcat is the core of the "reduce" part of MapReduce.
It's not quite that simple (MapReduce makes assumptions about monoid, sorts for you.)
Associative binary operators with identity
Turn up everywhere
Make the "reduce" in map-reduce go
A binary operator
that's associative
and commutative
and idempotent
with an identity element.
Semilattice is overloaded terminology.
Technically: join-semilattices with least elements; I'll just say "semilattice".
A monoid
that's commutative
and idempotent.
Integer addition, +, with 0
Set union, ∪, with {}
String concatenation, ⧺, with ""
Integer maximum no least integer
Division, / (a/b)/c ≠ a/(b/c)
a · b = b · a
Integer addition, +, with 0
Set union, ∪, with {}
String concatenation, ⧺, with ""
Integer maximum no least integer
Division, / (a/b)/c ≠ a/(b/c)
a · a = a
Integer addition, +, with 0 1 + 1 = 2
Set union, ∪, with {}
String concatenation, ⧺, with ""
Integer maximum no least integer
Division, / (a/b)/c ≠ a/(b/c)
Natural numbers under maximum
Booleans under logical "or"
N-bit integers under bitwise "or"
Bloom filters under union
Functions for free
Distributed programming (where they are called CvRDTs)
Convergent replicated datatype
Used to implement eventually-consistent systems
Core idea: Allow changes in any order, possibly multiple times, yet guarantee the same result