Decoupled Drupal with Silex – About – CMS for Video Metadata







On Github palantirnet / slides-decoupled-drupal-silex

Decoupled Drupal with Silex

Presented by Larry Garfield (@Crell)

@Crell

  • Senior Architect, Palantir.net
  • Drupal 8 Web Services Lead
  • Drupal Representative, PHP-FIG
  • Advisor, Drupal Association
  • Loveable pedant

About

  • Leading OVP provider
  • Founded in 2007
  • 300+ employees worldwide
  • Global footprint of 200M unique users in 130 countries
  • Ooyala works with the most successful broadcast and media companies in the world

Ooyala Customers

CMS for Video Metadata

  • Support for structured metadata
  • UI to manage metadata
  • Publishing workflow
  • Curate content
    • Lists, Collections
    • Banners (Hero)
    • Homepage and lists of lists

Think Netflix/HBO Go

… and

  • Data sync with Ooyala Backlot
  • APIs that work at scale
    • Thousands of movies
    • Tens of Thousands of users
    • Support website and applications

Team Palantir

You know what would be an awesome platform for this project?

Oh, right, 2013...

So now what?

  • Drupal 7 with Services: antiquated approach, not really REST
  • Drupal 7 with RestWs: young, still dealing with poor Drupal APIs
  • Symfony: unfamiliar, would mean a lot of home-grown work
  • Silex: very fast, but lacks even Symfony's UI support

Why choose?

Drupal 7

Pro: CMS and page display

Con: Mediocre at REST

Silex

Pro: HTTP handling

Con: Hand-rolled artisanal UI

Communication?

  • Drupal data is Drupal's
  • Editorial structure != API structure

Answer: Elasticsearch

  • Lucene engine, like Solr
  • JSON/REST based API
  • Dynamic schema (no restart)
  • Much easier for custom dev

The data model

(aka The Nodes)

Major content

  • Program
  • Asset
  • Offer
  • Collections
  • Various app-specific content

Where does the data come from?

<featureexport>
  <header>
    <exhibitionwindow start="2013-03-12T13:00:00.0Z" end="2014-07-30T13:00:00.0Z"/>
    <destinations>
      <destination destinationid="svod" deviceid="STBmanaged"/>
    </destinations>
  </header>
  <feature>
    <description>
      <title>The Dark Knight</title>
      <sorttitle/>
      <shortsynopsis>When Batman, Gordon and Harvey Dent...</shortsynopsis>
      <genre sub="1" main="4">Movie:Action/Adventure</genre>
      <production>
        <credits><credit role="Actor"><person>Christian Bale</person></credit></credits>
      </production>
    </description>
  </feature>
  <media>
    <videos>
      <video class="feature">
        <duration>PT152M0S</duration>
        <files base="">
          <file filename="ON600007.gxf" assetid="ON600007"/>
        </files>
      </video>
    </videos>
    <images>
      <imagesource base="">
        <img filename="the_dark_knight.jpg" imageid="I@023072"/>
      </imagesource>
    </images>
  </media>
</featureexport>
  • Greatly simplified version of file
  • Note devices (destinations)
  • Note Offer info (exhibition window)
  • Note asset Id on video files (Asset files separate)
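
As a rough illustration of what an importer has to pull out of that feed, here is a minimal Python sketch using the standard library's ElementTree. The element and attribute names come from the sample above; the `summarize_feed` function and the shape of its result are assumptions made for this example, not the project's actual import code (which was PHP inside Drupal).

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the feed shown above, reduced to the parts the notes call out.
SAMPLE_FEED = """
<featureexport>
  <header>
    <exhibitionwindow start="2013-03-12T13:00:00.0Z" end="2014-07-30T13:00:00.0Z"/>
    <destinations>
      <destination destinationid="svod" deviceid="STBmanaged"/>
    </destinations>
  </header>
  <feature>
    <description>
      <title>The Dark Knight</title>
    </description>
  </feature>
  <media>
    <videos>
      <video class="feature">
        <files base="">
          <file filename="ON600007.gxf" assetid="ON600007"/>
        </files>
      </video>
    </videos>
  </media>
</featureexport>
"""


def summarize_feed(xml_text):
    """Extract the Program title, Offer window, devices and Asset IDs."""
    root = ET.fromstring(xml_text)
    window = root.find("header/exhibitionwindow")
    return {
        "title": root.findtext("feature/description/title"),
        "offer_window": (window.get("start"), window.get("end")),
        "devices": [d.get("deviceid") for d in root.iter("destination")],
        "asset_ids": [f.get("assetid") for f in root.iter("file")],
    }


summary = summarize_feed(SAMPLE_FEED)
```

One file already feeds three kinds of content (Program, Offer, Asset), which is why a one-file-to-one-object importer is a poor fit.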

Importing

  • Migrate
  • Feeds
  • Services
  • Custom
  • Migrate: Don't want CLI triggering, want REST
  • Feeds: Fetch-based, maps to one object, too over-built
  • Services: Clunky

Importing

+

Closure-based mapping objects

=

Potentially multiple nodes

  • OOP made it easy
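
The idea of closure-based mapping objects can be sketched in a few lines of Python. Everything here (`field`, `constant`, `NodeMapping`, the record keys) is a hypothetical stand-in chosen for illustration; the slides do not show the real class names. Each mapping object holds one closure per target field, and one incoming record is run through several mappings to yield several nodes.

```python
def field(key):
    """Return a closure that copies one value out of the source record."""
    def mapper(record):
        return record.get(key)
    return mapper


def constant(value):
    """Return a closure that ignores the record and yields a fixed value."""
    def mapper(record):
        return value
    return mapper


class NodeMapping:
    """Describes how to build one node type from a source record."""

    def __init__(self, node_type, field_mappers):
        self.node_type = node_type
        self.field_mappers = field_mappers

    def build(self, record):
        node = {"type": self.node_type}
        for name, mapper in self.field_mappers.items():
            node[name] = mapper(record)
        return node


def import_record(record, mappings):
    """One incoming record can produce several nodes."""
    return [mapping.build(record) for mapping in mappings]


MAPPINGS = [
    NodeMapping("program", {"title": field("title")}),
    NodeMapping("asset", {"asset_id": field("assetid")}),
    NodeMapping("offer", {"start": field("window_start"),
                          "status": constant("unapproved")}),
]

RECORD = {
    "title": "The Dark Knight",
    "assetid": "ON600007",
    "window_start": "2013-03-12T13:00:00.0Z",
}

nodes = import_record(RECORD, MAPPINGS)
```

Adding a new field or node type means adding a closure or a mapping object, not touching the import loop.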

HTTP

  • Cheat: POST-only, no REST
  • 3 page callbacks, common code
  • Vanilla Drupal
  • HTTP Basic (over SSL) (simplehttpauth)
  • Modified simplehttp_auth to be path-based, not role-based
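
A minimal sketch of what path-based HTTP Basic checking looks like, written in Python rather than as a Drupal module. The protected prefix, the credential store and the function names are all assumptions for illustration; the real work was done by a modified Drupal module whose internals are not shown in the slides.

```python
import base64
import hmac

PROTECTED_PREFIXES = ("/import/",)      # hypothetical import endpoint prefix
CREDENTIALS = {"importer": "s3cret"}    # hypothetical user; use a real secret store


def parse_basic_auth(header):
    """Return (user, password) from an Authorization header, or None."""
    if not header or not header.startswith("Basic "):
        return None
    try:
        decoded = base64.b64decode(header[len("Basic "):], validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return None
    user, sep, password = decoded.partition(":")
    return (user, password) if sep else None


def is_authorized(method, path, auth_header):
    """Protection is decided by the request path, not by a user role."""
    if not path.startswith(PROTECTED_PREFIXES):
        return True                      # not an import endpoint
    if method != "POST":
        return False                     # import endpoints are POST-only
    creds = parse_basic_auth(auth_header)
    if creds is None:
        return False
    user, password = creds
    expected = CREDENTIALS.get(user)
    if expected is None:
        return False
    # Constant-time comparison to avoid leaking password prefixes.
    return hmac.compare_digest(expected.encode(), password.encode())


GOOD_HEADER = "Basic " + base64.b64encode(b"importer:s3cret").decode()
BAD_HEADER = "Basic " + base64.b64encode(b"importer:wrong").decode()
```

Basic credentials are only base64-encoded, which is why the slide insists on SSL.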

One more thing…

  • Incoming data often incomplete
  • Need 3rd party in many cases
  • Guzzle
  • Guzzle lib for Rotten Tomatoes
  • Rotten Tomatoes module
  • Map nodes by IMDB ID
  • Refetch periodically
  • Movies and Comments
  • Merge nodes on export
  • Expose comments to API directly

Editing

Drupal's got this, right?

...Almost

Content index

  • Imported data
  • Note filters, courtesy Views

Editorial content

  • "Application" data, not imported.

Program editing

  • Overview for editors
  • See status of related objects, too.

Program editing

  • Big blank space would have video preview if available.
  • Editors change status and QA state

Program editing

  • Rest of page; can see image previews in many sizes

Program editing

  • Edit credits
  • Select credit source

Program editing

  • This page goes on for a while, lots of fields

Program editing

  • Note synopsis source.

Where does the data go?

Export to ElasticSearch

  • Lots of preprocessing needed
  • Existing contribs weak (2013)
  • Rules has too many moving parts
  • Exporting some nodes requires others

Another mini-custom OOP system!

  • Node structure inappropriate for outgoing API
  • Need to do custom processing per node
  • Merge in Rotten Tomatoes content
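
To make "custom processing per node" concrete, here is a small Python sketch of a per-type exporter that reshapes a node into an API document and merges in third-party data keyed by IMDB ID. The field names, the `EXPORTERS` registry and the sample values are invented for the example; the real exporter was PHP and is not shown in the slides.

```python
def export_program(node, rt_by_imdb_id):
    """Reshape a Program node into the document the API will serve."""
    doc = {
        "id": node["nid"],
        "title": node["title"],
        "synopsis": node.get("synopsis", ""),
    }
    rt = rt_by_imdb_id.get(node.get("imdb_id"))
    if rt is not None:
        doc["rating"] = rt["rating"]
        # Fall back to the third-party synopsis only when ours is empty.
        if not doc["synopsis"]:
            doc["synopsis"] = rt.get("synopsis", "")
    return doc


# One exporter per node type; other types would register here.
EXPORTERS = {"program": export_program}


def export_node(node, rt_by_imdb_id):
    exporter = EXPORTERS.get(node["type"])
    if exporter is None:
        raise ValueError(f"no exporter for node type {node['type']!r}")
    return exporter(node, rt_by_imdb_id)


# Made-up sample data, not real ratings or IDs.
RT_INDEX = {"tt0000001": {"rating": 90, "synopsis": "Synopsis from the third party."}}
NODE = {"type": "program", "nid": 42, "title": "Example Movie",
        "synopsis": "", "imdb_id": "tt0000001"}

document = export_node(NODE, RT_INDEX)
```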

When?

  • Published == Public
  • Public == In Elasticsearch
  • Published == In Elasticsearch

So when should it be public?

Publication rules

A Program is publishable when:

  • It is curatorially approved

Actually it's a bit more complex than that

Publication rules

An Offer is publishable when:

  • It is curatorially approved
  • … and the Offer is within its Publication window
  • … or it's about to be

Actually it's a bit more complex than that

Publication rules

An Asset is publishable when:

  • It is curatorially approved
  • … and it has an Offer that is approved
  • … or the Offer is within its Publication window
  • … or it's about to be
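
One possible reading of these rules, written out as Python predicates. The bullets mix "and" and "or" loosely, so the grouping here is an assumption: an Offer needs approval plus a window that is open or opens soon, and an Asset needs approval plus at least one such Offer. The one-hour `LEAD_TIME` for "about to be" is also made up for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

LEAD_TIME = timedelta(hours=1)  # assumed meaning of "about to be"


@dataclass
class Offer:
    approved: bool
    window_start: datetime
    window_end: datetime


@dataclass
class Asset:
    approved: bool
    offers: list = field(default_factory=list)


def offer_is_publishable(offer, now):
    """Approved, and inside the publication window or about to enter it."""
    in_or_near_window = offer.window_start - LEAD_TIME <= now <= offer.window_end
    return offer.approved and in_or_near_window


def asset_is_publishable(asset, now):
    """Approved, and backed by at least one publishable Offer."""
    return asset.approved and any(offer_is_publishable(o, now) for o in asset.offers)


WINDOW = (datetime(2013, 3, 12, 13, 0), datetime(2014, 7, 30, 13, 0))
offer = Offer(True, *WINDOW)
asset = Asset(True, [offer])

HALF_HOUR_BEFORE = datetime(2013, 3, 12, 12, 30)   # window opens within LEAD_TIME
TWO_HOURS_BEFORE = datetime(2013, 3, 12, 11, 0)    # too early
```

Because an Asset's answer depends on its Offers and on the clock, the result can change without anyone editing the Asset, which is what makes these checks awkward to run at save time.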

Actually it's a bit more complex than that

Publication rules

It's complicated…

…And potentially expensive

  • Need to not do work on node save to avoid slowing down UI.

Cron & Queues

Cron->Publish queue: Ready nodes
Cron->Publish queue: Related nodes
Publish queue->Node save: Publish
Cron->Publish queue: Expiring nodes
Cron->Publish queue: Related nodes
Publish queue->Node save: Unpublish
Node save->Index queue: Always
opt node published
Index queue->Index queue: Prepare data
Index queue->Elasticsearch: PUT
end
opt node unpublished
Index queue->Elasticsearch: DELETE
end
  • Hard work is all in cron and queues
  • Format translation all happens in index queue
  • Cron and Queue run very frequently.
  • Node save is cheap!
  • Over-writing to Elasticsearch is cheap
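
The flow above can be sketched with two in-memory queues. This is a simplified Python model, not the Drupal implementation: `FakeSearchIndex` stands in for Elasticsearch, the node dictionaries and the `is_publishable` callback are assumptions, and the "related nodes" step from the diagram is left out to keep it short.

```python
from collections import deque


class FakeSearchIndex:
    """In-memory stand-in for Elasticsearch (PUT overwrites, DELETE removes)."""

    def __init__(self):
        self.docs = {}

    def put(self, doc_id, doc):
        self.docs[doc_id] = doc

    def delete(self, doc_id):
        self.docs.pop(doc_id, None)


def cron(nodes, is_publishable, publish_queue):
    """Decide which nodes need a status change and queue them."""
    for node in nodes.values():
        should_publish = is_publishable(node)
        if should_publish != node["published"]:
            publish_queue.append((node["nid"], should_publish))


def node_save(nodes, nid, published, index_queue):
    """Saving stays cheap: flip the flag and always enqueue for indexing."""
    nodes[nid]["published"] = published
    index_queue.append(nid)


def run_publish_queue(nodes, publish_queue, index_queue):
    while publish_queue:
        nid, publish = publish_queue.popleft()
        node_save(nodes, nid, publish, index_queue)


def run_index_queue(nodes, index_queue, index):
    """The expensive format translation happens here, off the save path."""
    while index_queue:
        node = nodes[index_queue.popleft()]
        if node["published"]:
            index.put(node["nid"], {"title": node["title"]})
        else:
            index.delete(node["nid"])


nodes = {
    1: {"nid": 1, "title": "Ready", "published": False, "approved": True},
    2: {"nid": 2, "title": "Expiring", "published": True, "approved": False},
}
index = FakeSearchIndex()
index.put(2, {"title": "Expiring"})
publish_queue, index_queue = deque(), deque()

cron(nodes, lambda node: node["approved"], publish_queue)
run_publish_queue(nodes, publish_queue, index_queue)
run_index_queue(nodes, index_queue, index)
```

Since a PUT simply overwrites the existing document, re-indexing a node that has not changed is harmless, so the queues can err on the side of doing too much.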

Serving the API

  • Lightweight microframework
  • Same Kernel/Routing as Symfony (and Drupal 8)
  • Bare-bones, add what you need
    • Elastica (Elasticsearch)
    • Guzzle (talk back to Ooyala)
    • Nocarrier/Hal (the API format)
  • No UI at all! (so no Twig)
  • Hypertext Application Language
  • IETF Draft
  • JSON and XML (but who uses the XML version?)

Silex overview

Client->+Kernel: Request
Kernel->+Controller: Routing
Controller->+ESRepository: Load
ESRepository->-Controller:
Controller->+Formatter: toHal()
Formatter->-Controller:
Controller->-Kernel: Return
Kernel->+View Listener:
View Listener->View Listener: Render to Response (JSON or XML)
View Listener->View Listener: HTTP caching
View Listener->-Kernel:
Kernel->-Client:
  • Note separation of controller from view listener
  • Note more controllers than just single object

The HAL Browser

  • Best thing about HAL!
  • Index resource is all you need, just links from there
  • Many require device ID for filtering
  • Follow link for individual program...

Program resource

  • Individual program. Lots of data.
  • Note headers
  • Note caching. Off for dev, easily configurable. Varnish.

Person resource

  • Referenced from program.
  • Note links, especially actedIn
  • Goes to collection of Programs, link to those...
  • Navigate the API like a web site = REST!
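
For readers who have not seen HAL, a Person resource with an `actedIn` link might be assembled roughly like this. The `_links`, `self` and `href` keys are part of the HAL format; the URL patterns, the sample data and the function names are hypothetical, and the real API used the Nocarrier/Hal PHP library rather than hand-built dictionaries.

```python
import json


def person_to_hal(person):
    """A Person points at the collection of Programs they acted in."""
    base = f"/people/{person['id']}"
    return {
        "_links": {
            "self": {"href": base},
            "actedIn": {"href": f"{base}/programs"},
        },
        "name": person["name"],
    }


def acted_in_to_hal(person_id, programs):
    """The collection links onward to each individual Program."""
    return {
        "_links": {
            "self": {"href": f"/people/{person_id}/programs"},
            "item": [
                {"href": f"/programs/{p['id']}", "title": p["title"]}
                for p in programs
            ],
        },
        "count": len(programs),
    }


person = person_to_hal({"id": 3, "name": "Christian Bale"})
collection = acted_in_to_hal(3, [{"id": 7, "title": "The Dark Knight"}])

# A client never builds URLs itself; it follows links from one resource to the next.
next_href = person["_links"]["actedIn"]["href"]
person_json = json.dumps(person, sort_keys=True)
```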

High availability?

  • Silex is stateless: Spin up several, load balance
  • Elasticsearch clusters easily
  • Varnish caching = FAST!

Fast forward a year...

One API, Palantir uninvolved

And growing

  • Highly reliable (no known downtime)
  • Library of thousands of assets
  • Thousands of subscribers
  • New features added regularly; architecture held up
  • Not aware of any downtime of the service

Larry Garfield

Palantir.net

Let's make something good together

Keep tabs on our work at @Palantir

For more information