Streams
Solving I/O Bound Problems
Created by Evan Oxfeld / @evanoxfeld
- Show of hands if you've tried NodeJS
- Show of hands if you've done any async programming
NodeJS
- Server-side JS platform built on Chrome's V8 engine
- Easily build fast, scalable network applications
- Event-driven, non-blocking
I/O bound problems
- E.g. database, LDAP, REST call, call to ext. service
- You're reading from / writing to slow interfaces
- I/O bound if the interfaces, not the CPU or memory, constrain the problem
Overview
- Explore I/O bound problems in Node
- Explore Streams API
You'll not only have the knowledge to use Streams, but also to write your own
Streams Example - Unzip
var fs = require('fs');
var unzip = require('unzip');
var fstream = require('fstream');

fs.createReadStream('path/to/archive.zip')
  .pipe(unzip.Parse())
  .pipe(fstream.Writer('output/path'));
- Similar to unix pipes
- Streams are readable, writable, or both
Code!
Behind the curtain
Stream in node 0.8 inherits from EventEmitter
util.pump(readable, writable)
- Sets up event handlers for 'data', 'end'
- Ignores other events e.g. 'error', 'close'
- API limits customization, not chainable
Streams in 0.8
readable.pipe(duplex).pipe(writable)
- Looks more like JavaScript - pipes are chainable
- 0.8 was the stable version of node prior to 3/11/2013
Streams in 0.8
- Readable
- Emit data events
- Optionally implement pause() and resume()
- Writable
- Implement write() and end()
Issues - Backpressure
Readable
Writable
What happened?
- Naive implementation didn't handle buffering and backpressure
- Backpressure - stream signals to its source to stop sending data
Issues - Backpressure
Readable
Writable
write() returns false; stream is full
Issues - Backpressure
Readable
Writable
Writable emits 'drain' to signal it's ok to resume
More Issues
- Buffering and backpressure
- No on('pipe') method
- pause() isn't a guarantee
- Backpressure is a dance, hyperactivity is bad
- Data events start immediately (big problem at scale)
- Data starts before event handlers set up
Streams of Tomorrow, Finally Here
Streams in 0.10
- Readable streams are now "suck" (pull) streams
- read([size]) is the pull-side counterpart of write(data)
- read() returns null if less data is buffered than size
- write() returns false when full, or pass it a callback
- Object mode
- Same composable pipe API
- Shared base classes for backpressure and buffering
- Readable/writable/duplex was one base class in Node 0.8
- highWaterMark option (16kb default) determines how much data is buffered
Streams 0.10 Base Classes
- Readable, Writable, Duplex
- Transform: implement _transform(chunk, encoding, cb)
- PassThrough: no method to implement
Transform - output is derived from the input. Saves implementing both
_read() and _write()
Example stream.Transform Code!
Conclusions
- Stream API is great for solving I/O bound problems
- Streams2 developed in the open with a parallel user-land module (readable-stream)
- Now you can use and write streams.
- If you have an I/O bound problem
- Can you leverage NodeJS to solve it?
- Or think about what's possible with your current languages/platforms
"Streams make programming in node simple, elegant, and composable." - @substack
References