Before we dig in, let me provide some context around the
problem we were trying to solve at Heroku that led to this
talk.
heroku connect
data sync product with a dashboard web UI
gigantic state machine
lots of state changes happen async on the backend
customer trust of the product relies on
communication/transparency
polling wasn't cutting it
constant feedback from customers: I can't tell what's
happening
inefficient API usage (lots of no-op calls)
scaling problems (needed to scale up an order of magnitude)
defining the problem
we had a REST API
we needed to add a realtime channel
we didn't want to drastically increase client
complexity
let's talk about
the web
Now that you have some understanding of the problem we were
trying to solve, let's pull back a bit and talk about the web
and the direction it's going.
the web is
service oriented
we've moved from this
Remember the days when you could describe your stack with an
acronym?
1 data store, usually relational
request/response cycle encapsulated all business logic
server-side views responsible for all content rendering
to this
er, this
A single service is potentially comprised of many
processes
Services (mostly) talk HTTP to each other
Pubsub (usually via redis) allows efficient 1:many
communication between services
MOST IMPORTANTLY: the web client is now just another
service, a consumer of public services published by the
server
the shift was gradual
First we published APIs for others ("platform play")
Then we began to consume those same APIs in native
clients on other platforms (the rise of mobile)
Finally, we figured out how to effectively turn our web
clients into service consumers as well (JS templating,
client-side MVC, web components)
TL;DR: We've had plenty of time to understand and develop
best practices around the way machines talk http to each
other
best practices
This is part of the table of contents for Heroku's HTTP API
design document.
There's all kinds of stuff in here to help you build a great
API.
platform play is table stakes
When we think about our own web applications as a network
of services, the previously novel idea of "platform" becomes a
given.
A public REST API service is no longer a bonus. It's now a
core part of our application's architecture.
the web is
realtime
responsiveness is an expectation
If your site looks like an application, users expect it to
act like one
Fallacies of distributed computing mean nothing to
users.
In the past, the only way to push reliably was by supporting
multiple transports (long-polling, flash sockets, websockets,
etc)
Now, support for proper websockets is good enough to rely
on by itself.
just because it's easy, doesn't mean it's easy to do right
Think of a product you've built. Now think how a realtime
streaming transport design might look for that product.
If you're like me, you immediately begin to think in terms
of events and data payloads associated with event types.
Then you begin to think about the process of reacting to
events in the interface. How event publishing would propagate
through client views and state.
Head hurting yet? Mine is.
realtime transport design is not a solved problem
I'd consider a technology "mature" when its usage patterns
are well established and understood. The realtime web just
isn't there yet.
the web is
… complicated
platform 🆚 realtime
You should probably be doing both
Commonly solved orthogonally, meaning:
they easily go out of sync
they have human resource contention
2x surface area for errors
common realtime solution
Server side opt-in per event
Client side opt-in per event per component
That's a lot of code for each supported event!
reframing the problem
We've outlined a bit of the mess we're in with regard to web
app architecture. Before we dig into solutions, let's take a
moment to reframe the problem.
events are a proxy for state change
"X happened" is actually shorthand for "A, B, and C objects
have been mutated".
If instead of events, we simply enumerated all state
mutations (including Nth degrees like aggregates), the event
itself would be useless.
The reason we tend to think in terms of events is because it
makes more sense to our causal-driven brains. Computers
care about data flow, not causality.
This creates an implicit contract based on derivation of
cause-effect relationships in our code. That's why it's so
difficult to understand, maintain, and untangle!
isolating data mutation is key
If we agree that state change is what we're targeting, all
we need is a way to know about all operations which change
state.
Once we isolate mutation of data, we have all the
hooks necessary to enumerate mutations explicitly.
Our streaming contract can then become both comprehensible and
maintainable.
No more spooky action at a distance where the name of an
event type means any number of side-effects to our client
interface.
REST endpoints fully describe application state
If this wasn't true, you couldn't have built your app with
just the API.
This API is an existing and complete contract that already
has a client consumer.
Both of these make it the perfect choice for a realtime
streaming contract as well.
a better way
You're probably able to guess at what all this is leading
to. There's a better way to solve our problem of adding a
realtime transport without making our client app too
complex.
let's review
pubsub channel per user
user authenticates to stream producer with a private key
which is passed on to an identity service for
verification
stream producer subscribes to the user-specific pubsub
channel
events are published to that channel and flow down to the
client
the client has logic to deal with each event type
realtime producer as REST consumer
pubsub channel per endpoint
client asks stream producer to subscribe to REST endpoints on
its behalf
On a state change event, all relevant endpoint channels are
published to
those REST payloads are tunneled to the client via the stream
producer
the client deals with them the same way it would had it
requested the payload via AJAX
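To make the flow concrete, here's a minimal sketch of the exchange from the client's side, assuming a socket.io-style transport; the "subscribe" and "update" message names, the stream URL, apiToken, and handleApiPayload are hypothetical, not from the talk.

    // ask the stream producer to watch REST endpoints on our behalf
    var socket = io.connect('https://stream.example.com', {
      query: 'token=' + apiToken   // the same credential the REST API accepts
    });

    socket.emit('subscribe', {path: '/items/'});
    socket.emit('subscribe', {path: '/items/42/'});

    // every message is just a REST payload plus the path and method it mirrors
    socket.on('update', function (msg) {
      // msg = {path: '/items/42/', method: 'PUT', body: {...}}
      // hand it to the same code path an AJAX response would take
      handleApiPayload(msg);
    });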
challenges
No solution is a panacea, and no analogy is perfect.
Let's look at some of the challenges we faced when
implementing this technique.
isolating data mutation
This is simple if you have 1 source of truth (rdbms) and 1
API service which writes to it:
Most ORMs have hooks after saving a model instance to the
database. Register a global post-save hook and you've isolated
all data mutation.
Otherwise, you need a data service which all other services
interact with to mutate data.
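The talk's implementation hangs this off Django's post_save signal; since the rest of these sketches are JavaScript, here's the same idea expressed with a hand-rolled emitter (persist() and publishEndpointsFor() are hypothetical placeholders).

    var EventEmitter = require('events').EventEmitter;
    var mutations = new EventEmitter();

    // funnel every write through one helper so a single hook sees all mutations
    function save(modelName, instance) {
      persist(modelName, instance);                 // however your data layer writes
      mutations.emit('saved', modelName, instance);
    }

    // the one global "post-save" hook: enumerate and publish affected endpoints
    mutations.on('saved', function (modelName, instance) {
      publishEndpointsFor(modelName, instance);     // see the url registry below
    });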
authentication
If we no longer have a channel per user, how do we verify
that the endpoint the client is asking to subscribe to is
allowed?
Our REST API already has auth built in, so let it solve the
problem!
2 common client auth mechanisms: session cookie or access
token
client passes auth along to stream producer, producer issues a
HEAD request to REST API
If 200, then valid subscribe request
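Here's roughly what that check looks like in the stream producer, assuming Node's https module and a hypothetical API host; whatever Authorization value the client handed the producer is simply forwarded.

    var https = require('https');

    function authorizeSubscribe(path, authHeader, callback) {
      var req = https.request({
        host: 'api.example.com',                 // hypothetical REST API host
        path: path,
        method: 'HEAD',                          // we only care about the status code
        headers: {'Authorization': authHeader}   // token or session from the client
      }, function (res) {
        callback(res.statusCode === 200);        // 200 = this user may subscribe
      });
      req.on('error', function () { callback(false); });
      req.end();
    }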
what API endpoints changed?
The hardest part: requires active participation on the part
of the REST API
Registry of model : endpoints needed, describing how to get
from a model instance to the endpoint url
Once that registry has been created, you simply iterate
through the associated endpoint generators for a model
type and render the REST paths. You end up with a full list of
modified API endpoints.
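A sketch of what such a registry might look like; the model and route names are hypothetical.

    // map each model type to functions that render the endpoint paths an
    // instance of it can appear in
    var endpointRegistry = {
      Item: [
        function (item) { return '/items/'; },                          // list
        function (item) { return '/items/' + item.id + '/'; },          // detail
        function (item) { return '/users/' + item.userId + '/items/'; } // aggregate
      ]
    };

    function endpointsFor(modelName, instance) {
      return (endpointRegistry[modelName] || []).map(function (render) {
        return render(instance);
      });
    }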
performance
One model instance can affect many endpoints. Obviously this
becomes expensive to compute if the data changes frequently.
Computing just the endpoint paths is cheap. Then check to
ensure an endpoint has a subscriber before rendering its
payload.
Push the endpoint rendering off to a work queue if it's too
expensive to do synchronously.
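One way to implement that check, assuming the producer mirrors its active subscriptions into a redis set; the key name and enqueueRender() are hypothetical.

    function maybePublish(redisClient, path) {
      redisClient.sismember('subscribed:endpoints', path, function (err, isSubscribed) {
        if (err || !isSubscribed) return;   // nobody is listening; skip the render
        enqueueRender(path);                // worker GETs the endpoint and publishes it
      });
    }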
server implementation
let's look at some code
Django ORM hooks
URL Registry
Node.js socket server
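The talk walks through the real code; a compressed sketch of the socket server piece, assuming socket.io and node_redis (message names match the earlier client sketch and are hypothetical, and authorizeSubscribe comes from the authentication sketch), might look like this.

    var io = require('socket.io')(8080);
    var redis = require('redis');

    io.on('connection', function (socket) {
      var sub = redis.createClient();        // one redis subscriber per connection

      socket.on('subscribe', function (msg) {
        var auth = socket.handshake.headers.authorization;
        authorizeSubscribe(msg.path, auth, function (ok) {
          if (ok) sub.subscribe(msg.path);   // channel name == REST endpoint path
        });
      });

      // whatever the API publishes for a subscribed endpoint is relayed straight down
      sub.on('message', function (channel, payload) {
        socket.emit('update', JSON.parse(payload));
      });

      socket.on('disconnect', function () { sub.quit(); });
    });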
the client
react.js
a digression
why react?
this technique can be used with any client framework
in fact, one of its strengths is its ability to plug into
existing infrastructure
I believe react.js (along with flux and immutable js) is the
ideal choice of client tech for this purpose
built for data mutation over time
react handles propagation of state down through hierarchies
of components extremely well
because react effectively re-renders your entire component tree
on every state change, you know your UI always accurately
reflects state
the virtual dom
react's super-power
makes it not only feasible but FAST to re-render components
every single time
server-side rendering
not directly applicable here, but an obvious win for any api-consuming client
drastically reduces time to usable page load
requires very little code to implement (provided the vast
majority of your markup is in react components)
uni-directional data flow
this is the big one for our purposes
think of the complexities 2-way data binding would introduce
to an app whose state is being pushed to it
uni-directional data flow guarantees that all data coming
into your app, no matter the source, will be accurately
represented in your UI. It's just that simple.
flux architecture
react is fairly common, so I didn't want to spend a lot of
time going over the basics
flux is less generally understood, and it's more important
to my demonstration, so let's review its basics
view
essentially react components, but in flux architectures,
views are commonly rendered hierarchically
a parent view receives events from a store (explanation
forthcoming) and calls its own setState to re-render
itself
the entire state is then passed down to child views so that
ui changes naturally flow uni-directionally down the
dependency chain
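A sketch of a parent view in that style, using the createClass API of the era; ItemStore and ItemList are hypothetical.

    var ItemsPage = React.createClass({
      getInitialState: function () {
        return {items: ItemStore.getAll()};
      },
      componentDidMount: function () {
        ItemStore.addChangeListener(this.onChange);
      },
      componentWillUnmount: function () {
        ItemStore.removeChangeListener(this.onChange);
      },
      onChange: function () {
        this.setState({items: ItemStore.getAll()});   // re-render from the top
      },
      render: function () {
        // state flows down; children never reach back up into the store
        return <ItemList items={this.state.items} />;
      }
    });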
store
a data store, similar to a model, but think of a store as a
table, not a row
stores register themselves with dispatchers (explanation
forthcoming) so that when data changes, they are notified
once the store is updated, it emits an event that is
commonly used by views (components) to re-render
themselves
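A sketch of such a store; the action type is hypothetical, and AppDispatcher is the dispatcher introduced on the next slide.

    var EventEmitter = require('events').EventEmitter;

    var _items = {};   // id -> item, a table rather than a row

    var ItemStore = Object.assign(new EventEmitter(), {
      getAll: function () { return _items; },
      addChangeListener: function (cb) { this.on('change', cb); },
      removeChangeListener: function (cb) { this.removeListener('change', cb); }
    });

    // registered with the dispatcher so every dispatched payload lands here
    ItemStore.dispatchToken = AppDispatcher.register(function (action) {
      if (action.type === 'ITEM_UPDATED') {
        _items[action.item.id] = action.item;
        ItemStore.emit('change');   // views listening for 'change' re-render
      }
    });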
dispatcher
Like the name sounds, the dispatcher is responsible for
invoking callbacks registered by other components.
When an action (explanation forthcoming) receives new data,
it calls the dispatch method
Since stores commonly register themselves with the
dispatcher, their callback is called when the dispatch method
is invoked
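With Facebook's flux package this piece is almost nothing.

    var Dispatcher = require('flux').Dispatcher;
    var AppDispatcher = new Dispatcher();

    // stores:  AppDispatcher.register(function (action) { ... });
    // actions: AppDispatcher.dispatch({type: 'ITEM_UPDATED', item: item});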
actions
in generic terms, an action's job is to provide data to the
dispatcher
practically speaking, you can think of it as your REST API
client
this is part of what makes flux such a natural fit for our
realtime channel: all we need to do is get data to our
actions
Actions may be called in response to a view's event handler.
This is how flux avoids 2-way data binding without
sacrificing user interactivity.
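A sketch of such an action module; the names and the window.fetch call are illustrative.

    var ItemActions = {
      // pull: the traditional AJAX path
      fetch: function (id) {
        fetch('/items/' + id + '/')
          .then(function (res) { return res.json(); })
          .then(ItemActions.receive);
      },
      // push: the realtime channel calls this exact same entry point
      receive: function (item) {
        AppDispatcher.dispatch({type: 'ITEM_UPDATED', item: item});
      }
    };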
client implementation
more code diving
items store -> crud store -> base store
items actions -> crud actions -> base actions
items views
api subscription service
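A sketch of that subscription service: it keeps the list of subscribed endpoints and feeds every pushed payload into the same actions the AJAX path uses (message names match the earlier sketches and are hypothetical).

    var subscriptions = [];
    var socket = io.connect('https://stream.example.com', {query: 'token=' + apiToken});

    function subscribe(path) {
      subscriptions.push(path);
      socket.emit('subscribe', {path: path});
    }

    socket.on('update', function (msg) {
      // the body is exactly what GET <path> would have returned,
      // so it takes the same route into the dispatcher as an AJAX response
      ItemActions.receive(msg.body);
    });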
live demo time
page initial render
curl API PUT, POST, DELETE
drawbacks
http method is a stretched analogy
it works (mostly), but the methods are meant for requests,
not responses
you can begin to see it break down with PUT vs POST, and it
further degenerates when you begin to think about PATCH
I've presented a purist view of this technique, but there
are some practical things that can be done to shore up the
analogy.
clients must maintain a list of subscriptions
this list can get fairly large, and without sticky sessions,
any socket disconnect means resubscribing to all channels
it's somewhat difficult to answer the question "how do I
know when there's something new I should subscribe to?".
Ideally, list endpoints would act like a PATCH for new members
(again stretching the http method analogy).
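The reconnect case is mechanical but easy to forget; with the subscription list from the earlier sketch it's a few lines.

    socket.on('reconnect', function () {
      subscriptions.forEach(function (path) {
        socket.emit('subscribe', {path: path});
      });
    });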
beware the race conditions
always use a "last modified" timestamp in the data
store
always inspect the client timestamp against the one in the
socket payload to avoid overwriting with older state
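A sketch of that guard, assuming every payload carries an updated_at timestamp.

    function isFresh(current, incoming) {
      if (!current) return true;   // nothing local yet, always apply
      return new Date(incoming.updated_at) >= new Date(current.updated_at);
    }

    // in the store: only overwrite local state with newer state
    if (isFresh(_items[action.item.id], action.item)) {
      _items[action.item.id] = action.item;
    }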
tl;dl
realtime API mirrors REST API
client consumer doesn't care about push vs pull
data service triggers the publishing of REST endpoints