The Hack Programming Language – A better way to PHP – The Type Checker





presenting-hack

A presentation prepared for GrPHPDev on the Hack programming language

On Github HackPack / presenting-hack

The Hack Programming Language

A better way to PHP

By Brian Scaturro / @scaturr

Hack is a language for HHVM that interoperates seamlessly with PHP. The goal of Hack is to offer developers a way to write cleaner, safer and refactorable code while trying to maintain a level of compatibility with current PHP codebases.

http://docs.hhvm.com/manual/en/hack.intro.whatis.php

Current Hack things in the works:

i.e., how I know things about Hack

The Type Checker

The real value of Hack

Catch those type errors early

... even while coding

Tough type checker

The type checker is your best line of defense. However, the runtime will allow things the type checker does not approve of.

... forgiving runtime

Using hh_client

The type checker is invoked by using the hh_client executable that installs with Hack. This executable is currently only available on *nix platforms.

... be sure to search the right path

hh_client will search for types based on the location of a .hhconfig file. It is an empty file. Be sure to include this in the root of your projects.

Type Annotations

Type annotations allow for PHP code to be explicitly typed on parameters, class member variables and return values (types are inferred for locals). These annotated types are checked via a type checker.

http://docs.hhvm.com/manual/en/hack.annotations.php

Typed parameters and return types

                            
<?hh //strict

function sum(int $x, int $y): int {
    return $x + $y;
}
                            
                        

Typed class members

                            
<?hh //strict

class TestResult
{
    protected int $runCount;

    protected array<Failure> $failures;
}
                            
                        

Annotating closures

                            
<?hh //strict

function invoke((function(int $x): int) $fn): int {
    return $fn(1);
}
                            
                        

Annotating with this

                            
<?hh //strict

class FluentObject
{
    public function doThing(): this
    {
        //do something
        return $this;
    }
}
                            
                        

... needs a little work yet

https://github.com/HackPack/HackUnit/blob/bug-broken-this-annotation/Runner/Options.php

... can work around it pretty easily for now

                            
<?hh //strict
namespace HackPack\HackUnit\Runner;

class Options
{
    protected string $testPath;

    /**
     * Use Options type since "this" annotation is broken
     * when namespaces are used
     */
    public function setTestPath(string $testPath): Options
    {
        $this->testPath = $testPath;
        return $this;
    }
}
                            
                        

The mixed type

The Hack type checker has some pretty interesting ways of handling the "mixed" type that many PHP programmers are familiar with. The type checker will give you a pass on it if it sees you are validating the type.

... has to be checked

                            
<?hh

function sum(mixed $x): int {
  if (is_array($x) || $x instanceof Vector) {
    $s = 0;
    foreach ($x as $v) {
      $s += $v;
    }
    return $s;
  }
  // Not something we know how to sum
  throw new InvalidArgumentException('Expected an array or a Vector');
}
                            
                        

http://docs.hhvm.com/manual/en/hack.annotations.mixedtypes.php

Annotating generators and coroutines

PHP 5.5 introduced some features that programmers in other languages have been enjoying for years: generators and coroutines. Hack has given types to both of these.

Annotating generators

                            
<?hh //strict

function infiniteIterator(): Continuation<int> {
    $i = 0;
    while (true) {
        yield ++$i;
    }
}
                            
                        

Annotating coroutines

                            
<?hh
async function f(): Awaitable<int> {
  return 42;
}

async function g(): Awaitable<string> {
  $f = await f();
  $f++;
  return 'hi test ' . $f;
}
                            
                        

http://docs.hhvm.com/manual/en/hack.annotations.generators.php

Hack Modes

Hack has various levels of tolerance for its type checker. These are known as modes, and they are triggered by comments.

Strict mode

For the pure of heart. Everything must be annotated and everything is type checked. Strict code cannot call into non-Hack code.

... a strict example

                            
<?hh //strict
class TestCase
{
    public function __construct(protected string $name)
    {
    }

    public function setUp(): void
    {
    }

    public function expect<T>(T $context): Expectation<T>
    {
        return new Expectation($context);
    }
}
                            
                        

... strict continued

It is worth noting that top level code cannot exist in strict mode. This makes it impossible to execute a purely strict program (there is no "main" method). Partial or UNSAFE must be used as strict entry points.

A note on hhi interfaces

Strict mode will throw "Unbound name" errors when it can't find a corresponding type (including functions). Out of the box, this also includes native PHP functions! You can imagine how frustrating this would be every time you reach for your favorite PHP function.

https://github.com/facebook/hhvm/pull/2656

... hhi interfaces continued

hhi files contain interfaces for most of the PHP core library. These files basically provide type information to Hack's type checker. You only have to make sure they are on the path of the type checker. After install, these are found at /usr/share/hhvm/Hack/hhi. Just copy them to your project's directory and you should be good.

An example hhi

Partial mode

Partial mode is the default of Hack. In partial mode, the type checker checks all types other than those covered by an // UNSAFE comment. Partial mode also allows partial typing of a class, method or function (e.g., only typing a subset of its arguments). And, unlike strict mode, partial mode allows engineers to call code that has not yet been "Hack-ified" (in other words, they can call into untyped code).

http://docs.hhvm.com/manual/en/hack.modes.partial.php
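A minimal sketch of partial typing, assuming the HHVM 3.x-era syntax used throughout these slides. The function describe() and its parameters are made up for illustration: $count is annotated, $label is deliberately left untyped.

```hack
<?hh
// No mode comment, so this file is checked in partial mode.

// $count is annotated and checked; $label is left untyped on purpose,
// which partial mode tolerates.
function describe(int $count, $label): string {
    return $label . ': ' . $count;
}
```

Calling describe(3, 'tests') should give back 'tests: 3'. Moving this function into a strict file would mean annotating $label too.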

... the safe bet for program entry

Since strict mode does not allow top level code, partial mode is the method for program entry.

https://github.com/HackPack/HackUnit/blob/master/bin/Hackunit

Decl mode

Decl mode is used to allow Hack code written in strict mode to call into legacy code, without having to fix the issues that would be pointed out by partial mode. The type checker will "absorb" the signatures of the code, but will not type check the code. Decl is mainly used when annotating old, existing APIs (i.e., when the code does not meet Hack's stricter subset of PHP).

http://docs.hhvm.com/manual/en/hack.modes.decl.php

UNSAFE

// UNSAFE disables the type checker from the point of the declaration until the end of the current block of code, where the end of the block generally means the closing brace (}) of the block in which // UNSAFE appears.

https://github.com/HackPack/HackUnit/blob/master/Runner/Loading/StandardLoader.php#L98
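A small sketch of where // UNSAFE starts and stops, again assuming the HHVM 3.x-era syntax of these slides. to_number() is a hypothetical helper, and the claim that the checker would otherwise object to arithmetic on a string is my expectation, not something taken from the linked file.

```hack
<?hh

function to_number(string $raw): int {
    // The signature and anything above the next comment are still checked.
    // UNSAFE
    return $raw + 0; // arithmetic on a string: skipped until the closing brace
}
```

At runtime to_number('42') simply relies on PHP-style coercion, which is exactly the kind of thing the checker is being told to look away from.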

Generics

Hack introduces generics to PHP (in the same vein as statically typed languages such as C# and Java). Generics allow classes and methods to be parameterized (i.e., a type associated when a class is instantiated or a method is called).

The benefit of course being: generics can be statically checked.

http://docs.hhvm.com/manual/en/hack.generics.php
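To make "parameterized" concrete, here is a minimal sketch of a generic class. Box and unbox_int() are hypothetical names, not part of HackPack or the Hack docs, and the syntax assumes the HHVM 3.x era of these slides.

```hack
<?hh // strict

// Box is a hypothetical generic container; T is fixed when a Box is created.
class Box<T> {
    public function __construct(private T $value) {}

    public function get(): T {
        return $this->value;
    }
}

function unbox_int(Box<int> $box): int {
    // The checker knows get() returns an int here, so the addition is fine.
    return $box->get() + 1;
}
```

Passing a Box holding a string to unbox_int() is the sort of mistake the type checker should flag before the code ever runs.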

Generic interfaces are great for design

                            
<?hh //strict
namespace HackPack\Hacktions;

trait EventEmitter
{
    protected Map<string, Vector<(function(...): void)>> $listeners = Map {};
}
                            
                        

The type checker can easily catch these sorts of type constraints, and our program design is better for them.

... the dream is better than reality

While Hack generics are useful type annotations, they are nowhere near as useful as generics in Java/C#.

This mainly stems from Hack's preference for inference and the fact that a type is not a concrete thing in Hack.

Inference

The preference for inference seems like a blow to readability. The following results in a type error:

                            
$fun = () ==> { $fn = $this->callable; $fn(); };
$this->expectCallable($fun)->toThrow<ExpectationException>();

//Tests/Core/CallableExpectationTest.php|54 col 70 error|  This operator is not associative, add parentheses
                            
                        

http://bit.ly/1mBUm25

... this is ok

                            
$fun = () ==> { $fn = $this->callable; $fn();};
$this->expectCallable($fun)->toThrow('\HackPack\HackUnit\Core\ExpectationException');
                            
                        

This snippet also demonstrates we haven't left the "magic string" pattern of PHP. Types are not concrete things in Hack - we still have to check types against strings.

Generics don't stack

Generics are not as useful in Hack because they don't stack. What I mean by stack is explained by the following from the docs:

A generic method must not collide with any existing, non-generic method name (i.e., public function swap<T> and public function swap).

http://docs.hhvm.com/manual/en/Hack.generics.method.php

... even if they should

                            
<?hh //strict                                
class Cook
{
    use Subject<Waiter>;
    use Subject<Busboy>;
}

//throws type errors
                            
                        

https://github.com/HackPack/Hacktions

Summary

Generics are really useful design tools. Their presence in Hack is a welcome addition that is not present in vanilla PHP. However, they are not as useful as they are in other languages.

Nullable Types

Hack introduces a safer way to deal with nulls through a concept known as the "Nullable" type. Nullable allows any type to have null assigned and checked on it.

http://docs.hhvm.com/manual/en/Hack.nullable.php

Be explicit about the possibility of null

                            
<?hh //strict
class Options
{
    protected ?string $HackUnitFile;

    public function getHackUnitFile(): ?string
    {
        $path = (string) getcwd() . '/Hackunit.php';
        if (! is_null($this->HackUnitFile)) {
            $path = $this->HackUnitFile;
        }
        $path = realpath($path);
        return $path ?: null;
    }
}
                            
                        

... be sure to check nulls

The following results in a type error:

                            
<?hh //strict
class TestResult
{
    protected ?float $startTime;

    public function getTime(): ?float
    {
        $time = null;
        $startTime = $this->startTime;
        $time = microtime(true) - $startTime;
        return $time;
    }

    //TestResult.php|39 col 35 error|  Typing error
    //TestResult.php|39 col 35 error|  This is a num (int/float) because this is used in an arithmetic operation
    //TestResult.php|13 col 15 error|  It is incompatible with a nullable type
}
                            
                        

... the following is ok

                            
<?hh //strict
class TestResult
{
    protected ?float $startTime;

    public function getTime(): ?float
    {
        $time = null;
        $startTime = $this->startTime;
        if (!is_null($startTime)) {
            $time = microtime(true) - $startTime;
        }
        return $time;
    }
}
                            
                        

http://bit.ly/1piowdZ

Summary

Nullable allows you to be explicit about the possibility of null. This makes code more readable and easier to reason about.

Collections

The goals of Hack collections are four-fold:

  • Provide a unified collections framework that is simple and intuitive.
  • Provide equal or better performance than the equivalent PHP array pattern.
  • Provide a collection implementation that allows for optional static typing, integrating seamlessly with Hack.
  • Provide an easy migration path to this framework by building on top of standard functionality from PHP5.

http://docs.hhvm.com/manual/en/Hack.collections.goals.php

Vectors

A Vector is an integer-indexed (zero-based) collection with similar semantics to a C++ vector or a C#/Java ArrayList. Random access to elements happens in O(1) time. Inserts take O(1) when added to the end, but can hit O(n) when inserting elsewhere. Removal has similar time semantics.

http://docs.hhvm.com/manual/en/Hack.collections.vector.php

... an example

https://github.com/HackPack/Hacktions/blob/master/Subject.php
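For a quick feel of the API without leaving the slides, here is a small sketch. vector_sketch() is a made-up function; the collection literal and methods are the HHVM 3.x-era ones as I understand them.

```hack
<?hh

function vector_sketch(): int {
    $numbers = Vector {10, 20, 30};  // zero-based, integer-indexed
    $numbers[] = 40;                 // append to the end
    $numbers->add(50);               // same thing, method form
    $numbers->removeKey(0);          // drop the first element; the rest shift down

    // The Vector now holds 20, 30, 40, 50.
    return $numbers[0] + $numbers->count();
}
```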

Maps

A Map is an ordered dictionary-style collection. Elements are stored as key/value pairs. Maps retain element insertion order, meaning that iterating over a Map will visit the elements in the same order that they were inserted. Insert, remove and search operations are performed in O(lg n) time or better.

http://docs.hhvm.com/manual/en/Hack.collections.map.php

... an example

https://github.com/HackPack/Hacktions/blob/master/EventEmitter.php

Note: Maps only support integer and string keys for now.
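A short sketch of the insertion-order behavior described above. map_sketch() and its keys are invented for illustration, and the syntax assumes the HHVM 3.x era of these slides.

```hack
<?hh

function map_sketch(): string {
    $scores = Map {'passed' => 3, 'failed' => 1};
    $scores['skipped'] = 2;  // new keys are appended after existing ones

    $parts = '';
    foreach ($scores as $name => $count) {
        // Iteration follows insertion order: passed, failed, skipped.
        $parts .= $name . '=' . $count . ' ';
    }
    return trim($parts);
}
```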

Sets

A Set is an ordered collection that stores unique values. Unlike vectors and maps, sets do not have keys, and thus cannot be iterated on keys.

http://docs.hhvm.com/manual/en/Hack.collections.set.php

... an example

http://bit.ly/1ntUqVg

Note: Sets only support integer and string values for now.
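A tiny sketch of the uniqueness guarantee. set_sketch() and the tag names are hypothetical; the syntax assumes the HHVM 3.x era of these slides.

```hack
<?hh

function set_sketch(): int {
    $tags = Set {'unit', 'slow'};
    $tags->add('unit');  // already present, so the Set is unchanged
    $tags->add('db');    // new value

    // contains() is the membership test; there are no keys to look up.
    return $tags->contains('db') ? $tags->count() : -1;
}
```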

Pairs

A Pair is an indexed container restricted to containing exactly two elements. Pair has integer keys; key 0 refers to the first element and key 1 refers to the second element (all other integer keys are out of bounds).

http://docs.hhvm.com/manual/en/Hack.collections.pair.php

... an example

http://bit.ly/1k08Rhs
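In case the link is out of reach, a minimal sketch of creating and reading a Pair. pair_sketch() and its values are made up; the syntax assumes the HHVM 3.x era of these slides.

```hack
<?hh

function pair_sketch(): string {
    // A Pair always has exactly two elements, at keys 0 and 1.
    $result = Pair {'testSum', true};
    $name = $result[0];
    $passed = $result[1];

    return $name . ($passed ? ' passed' : ' failed');
}
```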

A note on immutable collections

Most Hack collections have immutable variants, e.g., ImmVector. Immutable variants function like their mutable counterparts with the exception that items cannot be added or removed.
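A small sketch of the difference, assuming the HHVM 3.x-era collections API. imm_sketch() is a made-up function.

```hack
<?hh

function imm_sketch(): int {
    $frozen = ImmVector {1, 2, 3};
    // $frozen[] = 4;  // not allowed: items cannot be added to an ImmVector

    $copy = $frozen->toVector();  // mutable copy; $frozen itself is untouched
    $copy[] = 4;

    // 3 items in the immutable original, 4 in the mutable copy.
    return $frozen->count() * 10 + $copy->count();
}
```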

A note on arrays

Collections are now the preferred method of storing things. Arrays are still allowed, but they must be used in a new way to conform to the type checker.

http://docs.hhvm.com/manual/en/Hack.arrays.php

Shapes

Since PHP does not have the concept of structs or records, arrays are often used to mimic a struct or record-like entity. Arrays are also used as "argument bags" to hold a bunch of arguments that will be passed to a function or method. Shapes were created to bring some structure (no pun intended) and type-checking sanity to this use case.

http://docs.hhvm.com/manual/en/Hack.shapes.php

... an example

https://github.com/HackPack/HackUnit/blob/master/Error/TraceParser.php
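A minimal sketch of a shape standing in for an "argument bag". TraceLine, describe_line() and shape_sketch() are hypothetical names (this is not the code from TraceParser.php), and the type keyword used here is the aliasing feature covered in the Type Aliasing slides.

```hack
<?hh

// A shape names the allowed keys and the type of each value.
type TraceLine = shape('file' => string, 'line' => int);

function describe_line(TraceLine $entry): string {
    // The checker knows 'file' is a string and 'line' is an int.
    return $entry['file'] . ':' . $entry['line'];
}

function shape_sketch(): string {
    return describe_line(shape('file' => 'Options.php', 'line' => 12));
}
```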

Type Aliasing

Many programming languages allow existing types to be redefined as new type names. The C language has typedefs. OCaml has type abbreviations. PHP even has a rudimentary mechanism with its class_alias() function. Hack and HHVM offer two ways to redefine type names: type aliasing and opaque type aliasing.

http://docs.hhvm.com/manual/en/Hack.typealiasing.php

Type aliasing

                                
<?hh //strict
type Origin = shape(
    'method' => string,
    'message' => string,
    'location' => string
);
                                
                            

http://bit.ly/1tOkPwd

Opaque type aliasing

                                
<?hh //strict
newtype Location = shape(
    'file' => string,
    'line' => int
);
                                
                            

Opaque type aliases work like their non-opaque counterpart with the exception that they cannot escape the confines of the file they were defined in.

http://bit.ly/1mDUWwj

Mimic linear types with aliases

                                
<?hh
newtype closedfile = resource;
newtype openfile = resource;

function get_file_handler(string $filename): closedfile {
  return some_wrapped_function($filename);
}

function open_file_handler(closedfile $file): openfile {
  $file->open();
  return $file;
}

function read(openfile $file): string {
  return $file->read();
}                                    
                                
                            

http://bit.ly/1irg6cD

Async

Asynchronous programming refers to a programming design pattern that allows several distinct tasks to cooperatively transfer control to one another on a given thread of execution.

http://docs.hhvm.com/manual/en/Hack.async.php

What async is not

Async is not threading. It is cooperative multitasking. Dash those ideas of easily running things in parallel.

http://bit.ly/1nTAV4e

An example

https://gist.github.com/brianium/ff20fe672939de8865fa

Summary

While not threading, this is still extremely useful, and it will only get better. See the official example on coalesced fetching.

Lambdas

To address the shortcomings of PHP closures, HHVM introduced a "lambda expression" feature. Lambda expressions offer similar functionality to PHP closures, but they capture variables from the enclosing function body implicitly and are less verbose in general.

http://docs.hhvm.com/manual/en/Hack.lambda.php

So fresh and so clean

Lambdas make event driven code gorgeous.

                                
$ui = new Text();
$this->runner->on('testFailed', (...) ==> $ui->printFeedback("\033[41;37mF\033[0m"));
$this->runner->on('testPassed', (...) ==> $ui->printFeedback('.'));
                                
                            

https://github.com/HackPack/HackUnit/blob/master/UI/Console.php

Higher order functions are actually pleasant to look at

                                
$squared = array_map($x ==> $x*$x, array(1,2,3));
                                
                            

http://docs.hhvm.com/manual/en/Hack.lambda.examples.php

A huge victory

This may seem like a small addition, but it is probably one of the coolest features of Hack.

                                
$ui = new Text();

$this->runner->on('testFailed', (...) ==> $ui->printFeedback("\033[41;37mF\033[0m"));
//vs
$this->runner->on('testFailed', function(...) use ($ui) {
   $ui->printFeedback("\033[41;37mF\033[0m");
});
                                
                            

Tuples

Hack has a basic implementation for tuples. A tuple is, for all intents and purposes, an immutable array. With a PHP array, elements can be added and removed at will. With a tuple, after initialization, elements cannot be added to or removed from the tuple.

http://docs.hhvm.com/manual/en/Hack.tuples.php

Type annotating with tuples

                                
function returns_tuple(): (int, string, bool) {
    return tuple(1, "two", true);
}

class UsesTuples
{
    protected Vector<(int, string)> $vectorOfIntAndStringTuples;
}
                                
                            

Override Attribute

The Override attribute is the child class' counterpart to the parent class' abstract declaration. It is an optional annotation on a method, implemented with the <<Override>> user attribute. If the <<Override>> attribute is found on a method, and that method does not exist in the set of (non-abstract) methods inherited from parents, an error is thrown.

http://docs.hhvm.com/manual/en/Hack.overrideattribute.php

... an example

http://bit.ly/1knbWao
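A minimal sketch of the attribute in use, assuming the HHVM 3.x-era syntax of these slides. BaseReporter and ConsoleReporter are hypothetical classes, not taken from the linked code.

```hack
<?hh // strict

class BaseReporter {
    public function report(): string {
        return 'base';
    }
}

class ConsoleReporter extends BaseReporter {
    // Fine: report() exists as a non-abstract method on the parent.
    <<Override>>
    public function report(): string {
        return 'console';
    }

    // Putting <<Override>> on a misspelled method such as reprot() would be
    // reported as an error, because no parent defines a method by that name.
}
```

The payoff is catching typos and stale overrides after a parent method gets renamed.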

Constructor Argument Promotion

                            
<?hh //strict
class TestResult
{
    public function __construct(protected int $runCount = 0, protected int $errorCount = 0)
    {

    }
}
                            
                        

http://bit.ly/1piowdZ

Other Hack Rules And Features

Hack has some other rules and features, but here are some of the more relevant ones.

Variable number of arguments

Hack provides the capability to indicate that a function takes a variable number of arguments. This is done using ... (three dots).

http://docs.hhvm.com/manual/en/Hack.otherrulesandfeatures.varargs.php

... an example

http://bit.ly/SPEor0
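A short sketch of the three-dot syntax, assuming the HHVM 3.x-era form shown elsewhere in these slides, where the extra arguments are read with func_get_args(). count_extra() is a made-up function.

```hack
<?hh

// One required, typed parameter followed by any number of extra arguments.
function count_extra(string $label, ...): string {
    $args = func_get_args();  // includes $label as the first entry
    return $label . ': ' . (count($args) - 1);
}
```

Calling count_extra('extras', 1, 2, 3) should report three extra arguments.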

Callbacks and fun()

In order to make these types of callbacks type-checkable and type-safe, Hack has introduced fun(). fun() is a special function used to create a "pointer" to a function in a type-safe way. fun() takes a string corresponding to the name of the function to be called. It returns a "type-safe" string that can be used in, for example, call_user_func().

http://docs.hhvm.com/manual/en/Hack.otherrulesandfeatures.callbacks.php
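A minimal sketch of fun() handing a named function to array_map(), assuming the HHVM 3.x-era API described above. twice() and fun_sketch() are hypothetical names.

```hack
<?hh

function twice(int $x): int {
    return $x * 2;
}

function fun_sketch(): array<int> {
    // fun('twice') lets the checker tie the callback back to twice(),
    // where a bare 'twice' string would tell it nothing.
    return array_map(fun('twice'), array(1, 2, 3));
}
```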

fun and family

There are a few variants of fun:

  • class_meth() - Creates a callable to call a static method
  • inst_meth() - Creates a callable to call an instance method on a specific object
  • meth_caller() - Creates a callable to call an instance method on any object

... examples

                                
$factory = class_meth('\HackPack\HackUnit\Runner\Loading\StandardLoader', 'create');

$update = inst_meth($observer, 'update');                                    
                                
                            

https://github.com/HackPack/HackUnit/blob/master/UI/Console.php https://github.com/HackPack/Hacktions/blob/master/Subject.php

invariant()

There are times when it is desirable to have an object be type-checked as a more specific type than it is currently declared. For example, an interface needs to be type-checked as one of its implementing classes. invariant() is used to help the Hack type-checker make this more specific type determination.

http://docs.hhvm.com/manual/en/Hack.otherrulesandfeatures.invariant.php

... an example

                                
public function notifyObservers(...): void
{
    foreach ($this->observers as $observer) {
        invariant($observer instanceof Observer, 'Subjects can only notify Observers');
        $update = inst_meth($observer, 'update');
        $args = array_merge([$this], func_get_args());
        call_user_func_array($update, $args);
    }
}
                                
                            

https://github.com/HackPack/Hacktions/blob/master/Subject.php

Questions?