Generating Power with Yield




A presentation given at the Nashville PHP Group

On Github jasonamyers / php-yield-presentation

Jason Myers / @jasonamyers

Yield, a modern language love story

Originally proposed in 1995, the yield keyword became official via the Generators RFC and shipped on June 20th, 2013 with PHP 5.5.

Facebook said a Hip, Hop and Don't Stop, and did their own yield generators in HipHop PHP

They are HEAVILY BASED off of Python, with a nod towards the Mozilla JS implementation and the await C# concept.

I gotta say they did a fantastic job of pulling it all together.
So what are generators? Well, to understand those, it's helpful to have a quick look at iterators.

Iterator

An object that lets us traverse a container

A thing that lets us read through a data structure as if it were a list

PHP Iterator Interface

         Iterator extends Traversable {
            /* Methods */
            abstract public mixed current ( void )
            abstract public scalar key ( void )
            abstract public void next ( void )
            abstract public void rewind ( void )
            abstract public boolean valid ( void )
         }
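
To make the five methods concrete, here is a minimal sketch of a hand-written iterator. The Countdown class is a made-up example, not something from the talk, and it uses PHP 8 return types; on PHP 5.5 you would drop the type declarations.

```php
<?php
// Hypothetical class for illustration: walks from $start down to 1.
class Countdown implements Iterator
{
    private $start;
    private $value;
    private $position = 0;

    public function __construct(int $start)
    {
        $this->start = $start;
        $this->value = $start;
    }

    // Called once when a foreach loop begins
    public function rewind(): void
    {
        $this->value = $this->start;
        $this->position = 0;
    }

    // Checked before every pass through the loop
    public function valid(): bool
    {
        return $this->value >= 1;
    }

    public function current(): mixed
    {
        return $this->value;
    }

    public function key(): mixed
    {
        return $this->position;
    }

    // Called after every pass through the loop
    public function next(): void
    {
        $this->value--;
        $this->position++;
    }
}

$seen = array();
foreach (new Countdown(3) as $key => $value) {
    $seen[$key] = $value;
}
// $seen is now array(0 => 3, 1 => 2, 2 => 1)
```

That is a fair amount of bookkeeping for a three-value loop.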
         

For example, check out the ArrayIterator class

ArrayIterator Example

        $fruits = array(
            "apple" => "yummy",
            "orange" => "ah ya, nice",
            "grape" => "wow, I love it!",
            "plum" => "nah, not me"
        );
        $obj = new ArrayObject( $fruits );
        $it = $obj->getIterator();
        echo "Iterating over: " . $obj->count() . " values\n";
        while( $it->valid() )
        {
              echo $it->key() . "=" . $it->current() . "\n";
              $it->next();
        }
        
        

        Iterating over: 4 values
        apple=yummy
        orange=ah ya, nice
        grape=wow, I love it!
        plum=nah, not me

Can anyone tell me a major drawback of an iterator? Memory and speed! In the example above, the whole array has to exist in memory before we can walk it.

Generator

a special routine that can be used to control the iteration behavior of a loop, and yields the values one at a time

This is markedly different from an iterator over a container. A generator is extremely performant: it uses less memory because nothing is built up front, and you get to the first piece of data FAST

TL;DR

A generator looks like a function but behaves like an iterator
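
A small sketch of what that one-liner means in practice. The greetings() function and the $state object are made up for this illustration: calling the function runs none of its body, it just hands back a Generator object that foreach can walk like any other iterator.

```php
<?php
// Hypothetical generator for illustration.
function greetings(stdClass $state) {
    $state->started = true;   // runs only once iteration actually begins
    yield 'hello';
    yield 'world';
}

$state = new stdClass();
$state->started = false;

$gen = greetings($state);                 // looks like a function call...
$startedBeforeLoop = $state->started;     // ...but the body has not run yet: false

$isIterator = $gen instanceof Iterator;   // behaves like an iterator: true

$words = array();
foreach ($gen as $word) {                 // the body runs here, one yield at a time
    $words[] = $word;
}
// $words is now array('hello', 'world')
```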

Performant?

          range(0, 1000000)
          

Uses over 100MB of RAM

Generator Version

            function xrange($start, $limit, $step = 1) {
                if ($start <= $limit) {
                    if ($step <= 0) {
                        throw new LogicException('Step must be +ve');
                    }
                    for ($i = $start; $i <= $limit; $i += $step) {
                        yield $i;
                    }
                } else {
                    if ($step >= 0) {
                        throw new LogicException('Step must be -ve');
                    }
                    for ($i = $start; $i >= $limit; $i += $step) {
                        yield $i;
                    }
                }
            }
          

uses less than 1KB!

A generator only ever needs enough memory to create an Iterator object and track its current state internally
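
If you want to see the difference on your own machine, a rough sketch like the one below works. lazyRange() is a hypothetical, cut-down stand-in for xrange(), and the exact byte counts depend on your PHP version and build, so treat the printed numbers as indicative only.

```php
<?php
// Hypothetical helper: a reduced version of the xrange() idea.
function lazyRange($start, $limit) {
    for ($i = $start; $i <= $limit; $i++) {
        yield $i;
    }
}

$before = memory_get_usage();
$array = range(0, 100000);            // the whole array exists at once
$arrayBytes = memory_get_usage() - $before;

$before = memory_get_usage();
$generator = lazyRange(0, 100000);    // only the generator object exists
$generatorBytes = memory_get_usage() - $before;

$sum = 0;
foreach ($generator as $n) {
    $sum += $n;                       // values are produced one at a time
}

printf("array: %d bytes, generator: %d bytes\n", $arrayBytes, $generatorBytes);
```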

TL;DR

          function xrange($min, $max) {
              for ($i = $min; $i < $max; $i++) {
                  yield $i;
              }
          }
          

Sequences

          function collatz($val) {
              yield $val;

              while ($val != 1) {
                  if ($val%2 == 0) {
                      $val /= 2;
                  } else {
                      $val = 3*$val + 1;
                  }

                  yield $val;
              }
          }
          foreach (collatz(11) as $c) {
              echo $c," ";
          }
          
          11 34 17 52 26 13 40 20 10 5 16 8 4 2 1
          
The Collatz sequence: the conjecture is that no matter what positive integer you start with, you will always eventually reach 1. The property has also been called oneness, and the rule is also known as HOTPO (Half Or Triple Plus One).
I'm well aware of how big of a nerd this makes me

Y U Only LOOPin?

A generator will work with any function that takes an Iterator or a Traversable as an argument

          $arr = iterator_to_array(collatz(11));
          
Traversable is basically anything you can walk with foreach. In PHP it is also a real interface: you cannot implement Traversable directly, only through Iterator or IteratorAggregate. Every generator is an Iterator, so it qualifies automatically.
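
Once a generator has been turned into an array, the usual array functions apply. The sketch below repeats collatz() so it runs on its own; pairs() is a hypothetical generator added only to show what happens when keys repeat.

```php
<?php
// Same idea as collatz() from the Sequences slide, repeated so this runs standalone.
function collatz($val) {
    yield $val;
    while ($val != 1) {
        $val = ($val % 2 == 0) ? $val / 2 : 3 * $val + 1;
        yield $val;
    }
}

$steps = iterator_to_array(collatz(6));
$length = count($steps);   // 9 values: 6 3 10 5 16 8 4 2 1
$peak = max($steps);       // 16

// Hypothetical generator that yields the same key twice.
function pairs() {
    yield 'a' => 1;
    yield 'a' => 2;
}

$withKeys = iterator_to_array(pairs());           // later value wins: array('a' => 2)
$withoutKeys = iterator_to_array(pairs(), false); // keys ignored: array(1, 2)
```

Passing false as the second argument is worth remembering whenever a generator yields its own keys.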

Transformations

          function multiply_sequence($a, $fac) {
              foreach ($a as $val) {
                  yield $val*$fac;
              }
          }

          function to_html_list($input) {
              foreach ($input as $val) {
                    yield "<li>".$val."</li>";
              }
          }
          
These are extremely simple examples with the purpose of illustrating a usage. In my daily work, we regularly use generators to transform data from a query into the format needed to display to the end user, or perform math on data points prior to formatting.

Chaining

          foreach (to_html_list(multiply_sequence(collatz(5),2)) as $val) {
              echo $val,"\n";
          }
          
          <li>10</li>
          <li>32</li>
          <li>16</li>
          <li>8</li>
          <li>4</li>
          <li>2</li>
          
          
This is a perfect, if extremely simple, example of that: I chain a few generators together to get the desired result.

Selections

          function select_pattern($input, $pattern) {
              foreach ($input as $val) {
                  if (preg_match($pattern, $val)) {
                      yield $val;
                  }
              }
          }
          
          
Sometimes I wanna chain generators, but I need different behaviors based on the data I'm operating on. Selection generators give me a great way to do just that.
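
Here is a small usage sketch for the selection idea. The log lines are invented sample data, and select_pattern() is repeated so the snippet runs on its own. Because foreach accepts arrays as well as generators, the same function can sit at the start or in the middle of a chain.

```php
<?php
function select_pattern($input, $pattern) {
    foreach ($input as $val) {
        if (preg_match($pattern, $val)) {
            yield $val;
        }
    }
}

// Hypothetical input data for illustration.
$lines = array(
    "INFO boot ok",
    "ERROR disk full",
    "INFO idle",
    "ERROR net down",
);

// Route the same source through two different selections.
$errors = iterator_to_array(select_pattern($lines, '/^ERROR/'), false);
$infos = iterator_to_array(select_pattern($lines, '/^INFO/'), false);
// $errors is array("ERROR disk full", "ERROR net down")
// $infos  is array("INFO boot ok", "INFO idle")
```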

Breathe In

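          function getLines($file) {
                $f = fopen($file, 'r');
                if (!$f) {
                    throw new Exception("Could not open $file");
                }
                try {
                    // Compare with false so a final line of just "0" does not end the loop early
                    while (($line = fgets($f)) !== false) {
                        yield $line;
                    }
                } finally {
                    // Close the handle even if the caller stops iterating early
                    fclose($f);
                }
          }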

          foreach (getLines("someFile") as $line) {
                doSomethingWithLine($line);
          }
          
Skip reading the whole file into memory: a generator lets you slide through it one line at a time.

Breathe Out

          function createLog($file) {
                $f = fopen($file, 'a');
                while (true) {
                    $line = yield;
                    fwrite($f, $line);
                }
          }
          $log = createLog($file);
          $log->send("First");
          $log->send("Second");
          $log->send("Third");
          
We can also send data into a generator; the sent value becomes the result of the yield expression. Here we open the file, then sit in the loop and accept data as it's given to us. Again, useful with a selection or chain of generators.
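
A yield can hand a value out and take one in at the same time. The sketch below uses a made-up runningTotal() coroutine to show the round trip; the parentheses around yield keep it compatible with PHP 5.5.

```php
<?php
// Hypothetical coroutine for illustration: keeps a running total of sent values.
function runningTotal() {
    $total = 0;
    while (true) {
        // Hands $total out, pauses, then receives whatever send() passes in
        $amount = (yield $total);
        $total += $amount;
    }
}

$acc = runningTotal();
$start = $acc->current();     // runs to the first yield: 0
$afterFive = $acc->send(5);   // resumes with 5, returns the next yielded total: 5
$afterTen = $acc->send(10);   // 15
```

send() both resumes the generator and returns whatever it yields next, which is what makes two-way conversations like this possible.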
So these next parts are going to be a bit weird

Bro Remote Me!

Fake the simultaneous processing of data

We have a way to basically fake multithreading

Green Threads

threads that are scheduled by a virtual machine (VM/interpreter?) instead of natively by the underlying operating system

This is really context switching at this point. Worth noting people call this many things, and most languages have it or a library that does it. So let's look at a super simple example.
          function step1() {
              $f = fopen("file.txt", 'r');
              while (($line = fgets($f)) !== false) {
                  processLine($line);
                  yield true;
              }
          }
          
Function 1: opens a file, then processes one line and yields true each time around the loop.
          function step2() {
              $f = fopen("file2.txt", 'r');
              while (($line = fgets($f)) !== false) {
                  processLine($line);
                  yield true;
              }
          }
          
Function 2: does the same thing.
          function step3() {
              $f = fsockopen("www.example.com", 80);
              stream_set_blocking($f, false);
              $headers = "GET / HTTP/1.1\r\n";
              $headers .= "Host: www.example.com\r\n";
              $headers .= "Connection: Close\r\n\r\n";
              fwrite($f, $headers);
              $body = '';
              while (!feof($f)) {
                  $body .= fread($f, 8192);
                  yield true;
              }
              processBody($body);
          }
          
Function 3: fetches a page, yielding after each read of up to 8192 bytes, and calls processBody() when done.
          function runner(array $steps) {
              while (true) {
                  foreach ($steps as $key => $step) {
                      $step->next();
                      if (!$step->valid()) {
                          unset($steps[$key]);
                      }
                  }
                  if (empty($steps)) return;
              }
          }
          runner(array(step1(), step2(), step3()));
          
Finally, our runner function pulls it all together: it loops over the steps, switching from one to the next at each yield, and drops a step once it has finished.
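
To watch the interleaving without touching files or sockets, here is a self-contained sketch of the same round-robin idea. countTo() and roundRobin() are hypothetical names; unlike the runner above, this variant checks valid() before reading current(), and it records the order in which the tasks got a turn.

```php
<?php
// Hypothetical task: yields a label for each step it takes.
function countTo($name, $n) {
    for ($i = 1; $i <= $n; $i++) {
        yield "$name:$i";
    }
}

// Hypothetical scheduler: gives every live task one step per round.
function roundRobin(array $tasks) {
    $order = array();
    while (!empty($tasks)) {
        foreach ($tasks as $key => $task) {
            if (!$task->valid()) {
                unset($tasks[$key]);      // task finished, drop it
                continue;
            }
            $order[] = $task->current();  // record whose turn it was
            $task->next();                // run the task up to its next yield
        }
    }
    return $order;
}

$order = roundRobin(array(countTo('a', 2), countTo('b', 3)));
// $order is array('a:1', 'b:1', 'a:2', 'b:2', 'b:3')
```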

ZOMG... THERE BE DRAGONS!

This relies on making sure we have no blocking IO

overREACTPHP much?

event-based, non-blocking IO - ReactPHP

One More Thing

So if I can flip control, I can haz an Async?

          class Buffer {
              protected $reads, $data;

              public function __construct() {
                  $this->reads = new SplQueue();
                  $this->data = new SplQueue();
              }

              public function read() {
                  if( $this->data->isEmpty() ) {
                      $deferred = new \React\Promise\Deferred();
                      $this->reads->enqueue($deferred->resolver());
                      return $deferred->promise();
                  } else {
                      return \React\Promise\When::resolve($this->data->dequeue());
                  }
              }

              public function write($str) {
                  if( $this->reads->isEmpty() ) {
                      $this->data->enqueue($str);
                  } else {
                      $this->reads->dequeue()->resolve($str);
                  }
              }
          }
          
A class for a buffer of values where potential future reads are represented with promises. read() returns a promise that will be fulfilled with a value at some point in the future; write() writes a string to the buffer, which can be used to fulfil a pending promise.
          function printer(Buffer $buffer) {
              while( true ) {
                  $value = ( yield Util::async($buffer->read()) );

                  echo "Printer: ", $value, PHP_EOL;

                  yield Util::async(nested_printer($buffer));
              }
          }
          
Generator that prints a value from a buffer and then defers to nested_printer
          function nested_printer(Buffer $buffer) {
              for( $i = 0; $i < 5; $i++ ) {
                  // Yield a promise task and wait for the result - this is non-blocking
                  $value = ( yield Util::async($buffer->read()) );

                  echo "Nested printer: ", $value, PHP_EOL;
              }
          }
          
Generator that prints 5 values from a buffer
          $buffer = new Buffer();

          $scheduler = new \Async\Scheduler();

          $scheduler->add(new \Async\Task\GeneratorTask(printer($buffer)));

          $i = 0;
          $scheduler->add(new \Async\Task\RecurringTask(
              function() use($buffer, &$i) { $buffer->write(++$i); }
          ));

          $scheduler->run();
          


          Printer: 1
          Nested printer: 2
          Nested printer: 3
          Nested printer: 4
          Nested printer: 5
          Nested printer: 6
          Printer: 7
          Nested printer: 8
          Nested printer: 9
          Nested printer: 10
          Nested printer: 11
          Nested printer: 12
          ...
          
This creates the Buffer and the Scheduler, schedules a generator task for printer, and schedules a recurring task that writes incrementing integers to the buffer.
          $loop = \React\EventLoop\Factory::create();

          $scheduler = new \Async\Scheduler();

          $scheduler->add(new \Async\Task\RecurringTask([$loop, 'tick']));

          $scheduler->run();
        
To take advantage of ReactPHP's non-blocking IO, just add the tick method of the React event loop as a recurring task with the scheduler.
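
The core trick of flipping control can be sketched without any library. This is not how the Async scheduler is implemented; it is a toy with hypothetical names (drive(), workflow()) in which the generator yields a piece of deferred work as a closure, and the driver runs it and sends the result back in as the value of the yield.

```php
<?php
// Hypothetical driver: runs each yielded closure and feeds its result back in.
function drive(Generator $task) {
    $pending = $task->current();            // run up to the first yield
    while ($task->valid()) {
        $result = call_user_func($pending); // do the deferred work
        $pending = $task->send($result);    // resume the task with the result
    }
}

// Hypothetical task: reads like straight-line code, but pauses at every yield.
function workflow(ArrayObject $log) {
    $a = (yield function () { return 2; });
    $log[] = "got $a";

    $b = (yield function () use ($a) { return $a * 10; });
    $log[] = "got $b";
}

$log = new ArrayObject();
drive(workflow($log));
$lines = $log->getArrayCopy();
// $lines is array('got 2', 'got 20')
```

A real scheduler would hold many such tasks and resume each one only when its promise resolves, rather than running the work immediately.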

Async

Created by Matt Pryor, on Bitbucket

Thanks

Huge thanks to Paul M. Jones and William Golden!

THE END

@jasonamyers