Backend



Backend

0 0


ghc-intro-nov-2014

Presentation about ghc internals, Stockholm Haskell Meetup Group Nov 2014

On Github Tarrasch / ghc-intro-nov-2014

# GHC internals Haskell Stockholm Wednesday, November 27, 2014
This is *my* take on GHC internals! Much of what I cover will not be specific to GHC, as I think compilers in general are interesting.
# Frontend Lexing, parsing and type checking.
## Lexing ![Chomsky Hierarchy](./img/Chomsky-hierarchy.svg) Lexers are only as powerful as finite automata, they fall in the smallest set of recognized languages in the Chomsky Hierarchy, the *regular* languages.
```haskell lexer :: String -> [Token] data Token = TIdent String | TOperator String | TNumericLiteral Int | TStringLiteral String | ... > lexer "5 + abc +\"six\"" [TNumericLiteral 5, TOperator "+", TIdent "abc", TOperator "+", TStringLiteral "six"] ``` The lexer gives out a *linear* structure
```haskell fun :: Int -> Int fun x = x + ♠♣♥♦ ``` ``` ❯ ghc LexerError.hs [1 of 1] Compiling Main ( LexerError.hs, LexerError.o ) LexerError.hs:2:13: parse error on input `♠♣♥♦' ```
## Parsing ![Chomsky Hierarchy](./img/Chomsky-hierarchy.svg) Most parsers are less powerful than non-deterministic push-down automata, which correspond to the *context-free* languages in the Chomsky Hierarchy. Perhaps you've seen and forgotten names like LL(1), LL(2), LR(1), GLR etc. from the Dragon book. :)
```haskell parse :: [Token] -> AST -- | AST -- Abstract Syntax Tree data AST = PBinOp AST String AST | POperator String | PIdent String | PNumericLiteral Int | PStringLiteral String | ... > parse [NumericLiteral 5, Operator "+", Ident "abc", Operator "+", StringLiteral "six"] PBinOp (PBinOp (PNumericLiteral 5) (POperator "+") (PIdent "abc")) (POperator "+") (PStringLiteral "six") ``` The parser gives out a *tree* structure.
```haskell fun :: Int -> Int fun x = x + + x ``` ``` ❯ ghc ParserError.hs [1 of 1] Compiling Main ( ParserError.hs, ParserError.o ) ParserError.hs:2:13: parse error on input `+' ```
We are still very close to Haskell syntax, for example, we've not flattened `(BinOp x y z)` to an `(Application y x z)`. ## why?
## Type checking Type checking would fail for ```haskell fun :: Int -> Int fun x = x + "hello" ``` ``` ❯ ghc TypeError.hs [1 of 1] Compiling Main ( TypeError.hs, TypeError.o ) TypeError.hs:1:1: The function `main' is not defined in module `Main' TypeError.hs:2:13: Couldn't match expected type `Int' with actual type `[Char]' In the second argument of `(+)', namely `"hello"' In the expression: x + "hello" In an equation for `fun': fun x = x + "hello" ``` I won't cover type checking.

Backend

Convert memory representation of Haskell to executable code and optimize while doing so.

GHC converts Haskell to executable code by successive recasting

Each recasting gets us closer to machine code, we call these Intermediate Representations IR for short.

At each representation, we do optimizations that are appropriate for that level of abstraction. (Discuss while looking at them)

![Recastings in Haskell Backend](./img/suc_recasts.png)
# Runtime The runtime system will be embedded in your binary. It things that can't be implemented in library code. Garbage collector, exceptions, parallelism, locks, STM, etc.
# Thank you Arash Rouhani [@Tarrasch](https://github.com/tarrasch/) on github http://tarrasch.github.io/ghc-intro-nov-2014/ recursion yay