On Github benkolera / talk-haskell-pure-and-simple
Brisbane Functional Programming Group - 2014-01-28
Show off some cool haskell code! :)
Show that haskell can do lots of super useful things.
Even in something that has lots of side effects.
And that a degree in Category Theory is not required.
Feel comfortable with the basics of the haskell toolbox.
Less afraid of haskell than what you are now.
Have a glimpse of the beauty & merit of the haskell way.
Inspired to experiment and learn!
We've got the rest of the meetup to talk. Plenty of time!
Ask lots of questions if something isn't clear.
Code is available on github so you can go through it later.
We'll stop for a break whenever the pizza arrives.
module FizzBuzz (fizzbuzz) where import Control.Monad (mapM_) data Result = Number Integer | Fizz | Buzz | FizzBuzz deriving (Show,Eq) fizzbuzz :: IO () fizzbuzz = mapM_ (print . result) [1..100] result :: Integer -> Result result n | n `mod` 15 == 0 = FizzBuzz | n `mod` 5 == 0 = Fizz | n `mod` 3 == 0 = Buzz | otherwise = Number n
Can be explicit or implict with both:
module FizzBuzz (fizzbuzz) where import Control.Monad (mapM_)vs
module FizzBuzz where import Control.Monad
I prefer being explicit, so my code is less terse than some haskell code.
Constructors aren't types themselves.
data Result = Number Integer | Fizz | Buzz | FizzBuzz deriving (Show,Eq)
Not possible:
printBuzz :: Buzz -> String
Can take a Result and match each constructor:
printResult :: Result -> String printResult Fizz = "Fizz" printResult Buzz = "Buzz" printResult FizzBuzz = "FizzBuzz" printResult (Number n) = show n
Alias (can be used interchangibly)
type FilePath = String
Newtype (cannot be used interchangibly)
newtype FilePath = FilePath String
Records (generates getters and setters
data FilePath = FilePath { dir::String , filename::String }
Polymorphism over disparate types.
print :: Show a => IO () print a = putStrLn (show a)
And to define one for Result:
instance Show Result where show Fizz = "Fizz" show Buzz = "Buzz" show FizzBuzz = "FizzBuzz" show (Number n) = show n
ghci> :t map map :: (a -> b) -> [a] -> [b]
Is done with a space. Returns rest of the function:
ghci> :t map (+1) map (+1) :: Num b => [b] -> [b]
Which can then be applied to get the result:
ghci> map (+1) [1,2,3] [2,3,4]
Function application has highest precedence (10).
Functions that don't start with a letter are infix.
Can make a function infix with bacticks (5 `div` 2).
Infix operators each have a precedence level.
Composition (.) has 9 and ($) has 0.
Safest way is to use parens where you feel like it.
Hlint will tell you which parens are unnecessary.
As top level functions / in let or where blocks.
As Lambdas / anonymous functions:
ghci> :t (\a -> a + 1 ) (\a -> a + 1 ) :: Num a => a -> a
As sections (better style than lambdas):
ghci> :t (+ 1) (+ 1) :: Num a => a -> a
Composition:
ghci> :t show . (+1) show . (+1) :: (Num b, Show b) => b -> String ghci> show . (+1) $ (5::Int) "6"
Applicative:
ghci> import Control.Applicative ghci> import Data.Char ghci> :t liftA2 (||) isDigit isAlpha liftA2 (||) isDigit isAlpha :: Char -> Bool ghci> filter (liftA2 (||) isDigit isAlpha) "abcd34efg!.,' " "abcd34efg"
import Data.Arrow ghci> :t (***) (***) :: Arrow a => a b c -> a b' c' -> a (b, b') (c, c') ghci> :t (++"bar") *** (+9000) (++"bar") *** (+9000) :: Num a => ([Char], a) -> ([Char], a) ghci> (++"bar") *** (+9000) $ ("foo",0) ("foobar",9000) ghci> :t join (***) (+1) join (***) (+1) :: Num a => (a, a) -> (a, a) ghci> join (***) (+1) $ (1,2) (2,3)
(This is as weird as it gets, I promise!)
Lines that start at the same level of indentation are different expressions.
To continue the same expression on a new line, indent at least one space:
do x <- getXFromDatabase y <- getYFromDatabase return $ thisFunctionHasAReallyReallyLongNameSoINeedToPutTheParamsOnTheNextLine x y
ghci> :t putStrLn putStrLn :: String -> IO () ghci> :t getLine getLine :: IO String ghci> :t map putStrLn ["foo","bar"] --List of IO that are never run! map putStrLn ["foo","bar"] :: [IO ()]
Are actions that are run by the RTS only if the action is sequenced to the main function.
IO actions are values. Can do some useful things with them.
There is (almost) no way to get a value out of IO. Can only sequence with other IO.
main = do fromHandle <- getAndOpenFile "in.txt" ReadMode toHandle <- getAndOpenFile "ALLCAPS.txt" WriteMode contents <- hGetContents fromHandle hPutStr toHandle $ map toUpper contents hClose toHandle putStr "DONE!!!"
Hint: Separating the execution of the IO action from the program code that generates it turns out to be super duper useful!
Lots and lots of things require some kind of sequencing bits of stuff.
IO is one such thing, but there are lots of others and we'll see few.
Do notation and monad combinators is just a DSL for sequencing and manipulating these bits.
Not a burrito.
String is a list of Characters (Obsolete)
Text is a lazy or strict unicode data structure. Fast and correct.
ByteString is lazy or strict binary data structure. Must choose a decoding method yourself (ASCII,utf8,etc) to get to String or Text.
Overloaded strings extension means that string literals can either be String,Text or ByteString.
Going from lazy to strict involves reading everything into memory.
A simple, open, server-to-server webhook-based pubsub (publish/subscribe) protocol for any web accessible resources.
Parties (servers) speaking the PubSubHubbub protocol can get near-instant notifications (via webhook callbacks) when a topic (resource URL) they're interested in is updated.
Subscripion takes an optional secret.
Subscriptions must be 202 accepted and then verified with the subscriber.
Verification involves calling GET on the callback url with the topic, and a challenge. Subscriber must return a 200 OK with the challenge in the body.
Subscription also optionally includes a secret string shared to the hub. The hub must hmac sign the http body of any content distribution if it was given.
Low latency and publisher friendly feed consumers (read: news readers)!
At work, our internal services publish event streams via atom feeds.
Nice solution as PubSubHubbub allows consumers to be push or pull based as required.
Publisher.hs: Scotty web server that allows post creation from a browser.
Subscriber.hs: Scotty web server that subscribes to hub, listen for callbacks and pushes the updated post list to the browser via a websocket.
Main.hs: The actual PubSubHubbub scotty web server.
Lots of asynchronicity and state needed. Very few pure bits.
Can haskell even do this in a nice way?!?
Haskell-hubbub was my experimentation to help answer this.
TL;DR: The answer is 'definitely yes'. :)
Use it for any non-throwaway program.
Akin to sbt/lein/mvn/gradle/gem/pip/etc.
Package can product one or more libraries and/or binaries.
Can also have testsuites and benchmarks that aren't compiled into the final artifacts.
Pulls dependencies from hackage.haskell.org.
Builds c code, links external libraries
And lots more!
Library code in src/, tests in test/ and binary entry points at top level. (Main.hs,Publisher.hs,Subscriber.hs)
Three different build options:
Initalize sandbox and install every dependency into the sandbox:
cabal sandbox init cabal install --enable-tests --flag=mocks --only-dependencies
Configure and run tests:
cabal configure --enable-tests --flag=tests #Only needs to be done once. cabal test
Build everything including mock binaries:
cabal configure --flag=mock #Only needs to be done once. cabal build dist/build/hubbub/hubbub dist/build/hubbub-mock-publisher/hubbub-mock-publisher dist/build/hubbub-mock-subscriber/hubbub-mock-subscriber
Cabal is extraordinarily flexible. Can set it up however you like.
This just works best for me. Other packages set things up differently.
Helps to see this as it is quite confusing at first.
Declare the datatypes with standard haskell.
Use safecopy for versioned serialisation.
Make transactions in Query and Update contexts.
Can build bigger Queries/Updates out of others.
Make acidic build the top level transactions that are written to the logs.
Two different contexts: ScottyM and ActionM.
ScottyM is for configuring scotty (routes, middleware).
ActionM is for parsing the request and producing a response.
These contexts, like IO have a Monad interface so we reuse do notation with them.
Able to lift IO actions into either context with liftIO.
Lots more to conduit that we see here.
Implements a streaming interface so that you can stream to disk / other sockets.
Works well for simple things too!
Yet another monad: sequences HTML elements together. Reuse of do notation, forM, etc is very handy.
Everything is guaranteed to be escaped.
Bit verbose, though. Hamlet is a quasiquoter built on top that can help.
Container that has a value. Value can be replaced.
Part of STM (Software Transactional Memory).
Build up STM actions and run the action with atomically to execute it.
Anything that has the var can atomically read/write its value.
We use this for sharing an in memory database of websocket connections.
Set an intercept handler for new websocket connections in wai.
We create an IO action that handles all comms for that socket.
IO action runs forever until the connection disconnects.
Upon disconnect, the IO action will remove the connection from the shared state.
Distribute pushes payload down each connection.
src/Network/Hubbub.hs # Top API (client talks to this). src/Network/Hubbub/Http.hs # Http calls to the Pubs and Subs. src/Network/Hubbub/Hmac.hs # Hmac signs bits of text src/Network/Hubbub/Queue.hs # Handles queuing of events. src/Network/Hubbub/Internal.hs # Actions events. src/Network/Hubbub/SubscriptionDb.hs # Generics Api to persist Subs. src/Network/Hubbub/SubscriptionDb/Acid.hs # AcidState impl of DbApi src/Network/Hubbub/SubscriptionDb/SqLite.hs # SQLite impl of DbApi.
test/Test.hs # Main for tests test/Network/Hubbub/TestHelpers.hs # Various helper funcs test/Network/Hubbub/Test.hs # Test for Hubbub.hs test/Network/Hubbub/Queue/Test.hs # Test for Queue.hs test/Network/Hubbub/Http/Test.hs # etc. test/Network/Hubbub/Hmac/Test.hs test/Network/Hubbub/SubscriptionDb/Test.hs test/Network/Hubbub/SubscriptionDb/Acid/Test.hs test/Network/Hubbub/SubscriptionDb/SqLite/Test.hs test/Network/Hubbub/Internal/Test.hs
Holds everything about a subscription that we need to distribute content to it and expire subscriptions.
Newtypes so that we cannot mix things up.
Having a type for HttpResource is a bit pedantic. Saves having a string (which may not be a URL) or having a Network.URL (which may be relative or not for HTTP) causing issues later.
Just defines all of the functions we need for the API.
Just a datatype with record accessors. Zero magic.
EitherT can be an error or a good result. Encourages catching exceptions in the impl.
toRow goes from a thing to a list of fields.
fromRow and fromField return RowParsers.
A row parser is a sequential thing that steps through the columns of a result set and returns whether it found what it was looking for or not.
RowParser is a Monad and an Applicative.
Use applicative <$> to lift constructor into RowParser and apply to many RowParsers.
Currying works awesomely for filling in the SqLite specific bit.
There is an IsString instance for Query, thus string literal are Querys.
fmap (fmap func) is to fmap IO then fmap the list inside.
ToRow is used to convert params into columns. Only is to take a single thing that has a toField and make a toRow out of it.
Tasty provides a way to group tests into trees.
Allows multiple ways to test (HUnit,Quickcheck,Smallcheck)
Test tree from this file is imported by SubscriptionDb/Test.hs.
Test tree from this SubscriptionDb/Test.hs is imported by test/Test.hs.
Deals with making assertions. Throws exceptions if not met.
Assertion is just IO () so can do anything in tests.
assertFail and @?= are enough for my use.
Combining multiple other transactions in expireSubscriptions, using do notition and mapM_ monad machinery to help.
DbApi implementation just takes advantage of currying once again.
Datatypes for Subscription,Publication and Distribution Events
Retryable just gives a polymorphic way of extracting/incrementing the attempt count.
Also allows retry delay to be implemented differently depending on the type.
Allows the queue to work on all Retryable events without sub typing.
Takes an event actioner, error handler and a queue of retryables.
Atomically dequeues (or blocks waiting for an event) from the queue.
Requeues failed events based on the retry delay.
IO Action never returns. This is actually not a problem!
Because all the IO is lazy, we can test a lot here without too much boilerplate.
forkIO is a way to run any IO action in a different green thread.
Mutable vars let us collect the state from the other threads.
And this is important, because any code that needs a sleep 100ms in a test just to pass needs a lot of testing! ;)
Gets the content type and body of the topic of a publication event.
httpResourceToRequest . fromTopic . publicationTopic $ ev -- is the same as httpResourceToRequest (fromTopic (publicationTopic ev))
&&& ("fanout") is from Arrow that applies both functions to the response and returns a tuple.
contentType = fmap snd . find ((== hContentType) . fst) . responseHeaders -- is the same as contentType r = fmap snd $ find ((== hContentType) . fst) $ responseHeaders r
Just assembles the post request from the event. Nothing fancy.
Pattern matching on constructor to get mode.
Takes a random number generator as a param.
Needs to keep query params from original callback, so filter out hub ones from input.
Generates random challenge and checks response.
handleT from the Errors library to recover from NotFound.
There are tests. Using scotty as a test server for the http calls.
Lots of details for scotty stuff we've seen before.
We'll skip them in the talk but they are on github to peek at.
Largely boring. Just gluing http calls together into event handlers.
We'll skip them unless we have time / interest.
Config for client to configure us.
Env for all of the state our code needs to run (TQueues, DbApi,etc)
Constructor or accessors of HubbubEnv are not exported, so clients can't fiddle with it or inspect it.
Initializer for either AcidState or SQLite
subscribe / publish just push onto the queues.
List just reads from the DbApi
All the real work happens asynchronously by the worker threads.
Where the IO actions really become awesome.
Can just fork as many of the loops as we like.
They do their stuff until the main thread exits.
This part is ridiculously easy.
Wonderfully boring code.
None of the asynch bleeds into here.
Just really about parsing requests and giving outputs, as it should be.
This code was pretty much all impure.
Was that actually a big deal?
Except for the type sigs, did you even notice?
Title should be: Haskell can do real world real stuff just as well as your favourite language!
I'd argue that we're actually better off in haskell.
We just did asynch, non blocking code and it didn't even hurt (or involve Node.js).
Laziness is a virtue (forever $ doStuff)
Furthermore the fact that IO is a value gives rise to lots of cool combinators (mapM,replicateM,etc) that are reusable for other sequencey type things.
Don't listen to the FUD: Haskell can do real world things very well.
Just look at how warp beats the pants off of node/python/etc http://www.yesodweb.com/blog/2011/03/preliminary-warp-cross-language-benchmarks.
This code: https://github.com/benkolera/haskell-hubbub/
Search packages on hackage: http://hackage.haskell.org/
Read LYAH: http://learnyouahaskell.com/
Hoogle: http://www.haskell.org/hoogle/
GHCi: run "cabal repl" in your project to get a repl. :t and :i do awesome things.