On Github KronicDeth / austin-elixir-alembic
2016-06-20
Author
limhoff@csd.org Kronic.Deth@gmail.com @KronicDeth

I am a Senior Software Engineer at Communication Service for the Deaf (CSD). CSD is fully remote for the development team and for most of the non-profit outside the call centers. I work on Vineya, a marketplace and SaaS that allows the Deaf community to select their own American Sign Language (ASL) interpreter. The current, marketplace-only version of Vineya is a Ruby application that was developed before I joined CSD. The SaaS version, under development, is a mix of Ruby and Elixir applications.
In my free time I am the maintainer of IntelliJ Elixir, an Elixir plugin for JetBrains IDEs like IntelliJ IDEA and RubyMine.
When I announced the release of Alembic, multiple people wondered how it compared to JaSerializer, but for Vineya we use JaSerializer with Alembic. We use JaSerializer's plugs to handle the content-type and accept headers and to convert from JSONAPI's hyphenated keys to the underscored keys used by Ecto. Once inside each action, we use Alembic to validate that the params form a valid JSONAPI document, to convert them to the params format used by Ecto.Changeset, and to convert included relationships to association preloads, before using JaSerializer again to render the view.
When I started at CSD in November 2015, JaSerializer did not have Params.to_attributes (it was only added in April 2016), so params arrived at the Phoenix.Controller action with no validation that they were even structured as a proper JSONAPI document. We could have manually picked out the data attributes and relationships needed for each controller and action, but that didn't seem right to me. My understanding of the purpose of JSONAPI was to make the encoding and decoding of resources more regular, so the code for any given action should be the same across all controllers for JSONAPI resources. Furthermore, the JSONAPI spec outlines a format for errors and when they should be returned, so I knew the error handling could be extracted to its own library and kept out of the business logic of each action.
Document.from_json takes params with underscored keys (in other words, the JSON) and an error template.
The error template is necessary because the JSONAPI spec allows slightly different formats for different actions. Importantly, Resource ids are optional when the "action" is :create and the "sender" is :client. The error template also contains the source with a JSON pointer to where the JSON is located in a JSONAPI document. For Documents, that pointer is always the root pointer: the empty string.
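To make the action/sender rule concrete, here is a minimal, self-contained sketch of how a template carrying the action, the sender, and a source pointer can drive validation of a resource's "id". The module and function names (TemplateSketch, validate_id) are hypothetical stand-ins, not Alembic's API:

```elixir
defmodule TemplateSketch do
  # A simplified stand-in for Alembic's error template: the action and
  # sender determine which rules apply; the pointer locates the JSON.
  defstruct action: nil, sender: nil, pointer: ""

  # An "id" is always acceptable when present.
  def validate_id(%{"id" => _}, %__MODULE__{}), do: :ok

  # Resource ids are optional only when a :client sends a :create.
  def validate_id(_json, %__MODULE__{action: :create, sender: :client}), do: :ok

  # Any other action/sender combination reports the missing "id", with a
  # source pointer built from the template's pointer.
  def validate_id(_json, %__MODULE__{pointer: pointer}) do
    {:error,
     %{
       "detail" => "`#{pointer}/id` is missing",
       "source" => %{"pointer" => pointer}
     }}
  end
end
```

For example, a client creating a resource may omit the id, while a client updating one may not.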
An empty single resource can be represented with "data": null in encoded JSON, which would show up as data: nil in the json passed to from_json.
Alembic isn't doing much here: it turns the map into a struct, so the string "data" becomes the atom key data.
In general, throughout Alembic, if a JSON object has a fixed set of keys it becomes a struct, while JSON objects with free-form keys remain maps with string keys. This prevents atom table exhaustion.
A present single resource gets more interesting: the data becomes a nested Resource struct.
In addition to Resources, JSONAPI supports Resource Identifiers, which only have a "type" and "id".
There are no type hints of whether the top-level "data" is a Resource or Resource Identifier when converting the json map, so the JSON object in "data" is treated as a Resource if "attributes" or "relationships" is present, otherwise it is a Resource Identifier.
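The heuristic above can be sketched as a single classification function. This is a simplified, hypothetical reimplementation (DataTypeSketch is not an Alembic module), but it captures the rule as described:

```elixir
defmodule DataTypeSketch do
  # A JSON object in "data" is treated as a Resource if "attributes" or
  # "relationships" is present; otherwise it is a Resource Identifier,
  # which only carries "type" and "id".
  def classify(json) when is_map(json) do
    if Map.has_key?(json, "attributes") or Map.has_key?(json, "relationships") do
      :resource
    else
      :resource_identifier
    end
  end
end
```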
"data" can also be a collection of resources, such as for as has-many relationship or the return from an index.
Just like singletons, collections can be made of either Resources or Resource Identifiers.
The JSONAPI spec considers foreign keys an implementation detail and expressly states that they shouldn't be included in the attributes of a resource, so when you want to represent a foreign key, you make a relationship with a Resource Identifier.
You can side-load relationships to any depth (assuming the server supports the relationship path) and the resources will be elements of included.
Notice how most containers are structs with atom keys, but the relationships of a Resource is a map with string keys because the names are not predefined by the JSONAPI spec.
OK, that's great: Alembic can parse JSON into structs, but that alone doesn't buy us much. What we want is for Alembic to automatically handle error reporting, so that if the json is not a valid JSONAPI document, instead of returning {:ok, Document.t}, Alembic returns {:error, Document.t}.
If there's an error, how can Alembic still return a Document? It's not a partial Document containing only the valid parts of the JSON; instead, it is a JSONAPI error document that can be re-encoded and sent back to the sender.
The format of detail and meta isn't very specific in the JSONAPI spec, but in general Alembic tries to include machine-parsable representations of anything in detail in meta as well, so that you can construct your own error messages if you don't like the detail message.
Alembic doesn't just detect errors in the top-level keys of the JSONAPI document, but in all the nested JSON objects that are defined by the spec.
When an error is found in a nested part of the Document, the source pointer shows which element has the error. The source pointer is a JSON Pointer, defined in RFC 6901.
Alembic can generate the nested JSON pointers because from_json is a behaviour callback in Alembic.FromJson.
All the nested JSON objects in documents have modules (Error, Link, Links, Meta, Relationship, Relationships, Resource, ResourceIdentifier, ResourceLinkage, and Source) that implement the FromJson behaviour.
Because each nested JSON object has a module, converting json to an Alembic.Document struct is a matter of each level converting its children by passing an error template with that child's source pointer.
The child's error template is made using Error.descend.
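The descent can be sketched as JSON Pointer concatenation per RFC 6901. DescendSketch below is a simplified, hypothetical reimplementation operating on a bare map, not Alembic's Error.descend itself:

```elixir
defmodule DescendSketch do
  # Each level appends its key (or array index) to the parent's source
  # pointer as a new reference token, per RFC 6901, so errors found in a
  # child report the child's location in the original document.
  def descend(%{pointer: pointer} = template, key) do
    %{template | pointer: "#{pointer}/#{key}"}
  end
end
```

Starting from the Document's root pointer (the empty string), each nested from_json call descends one more level.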
The final piece of the error handling is being able to report multiple errors instead of just the first one. This isn't required by the JSONAPI spec, which says "A server MAY choose to stop processing as soon as a problem is encountered, or it MAY continue processing and encounter multiple problems", but I wanted all the errors in one request to be available, so users don't need multiple round trips to fix them.
FromJson.reduce is the enumerable version of FromJson.merge.
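The idea behind merging can be sketched as follows (MergeSketch is a hypothetical simplification; Alembic's actual FromJson.merge and FromJson.reduce work on error documents rather than bare lists): merging two :ok values keeps accumulating, any :error flips the accumulator to :error, and two :errors concatenate their error lists, so one pass over the input reports every problem at once.

```elixir
defmodule MergeSketch do
  # Two oks: keep accumulating converted values.
  def merge({:ok, acc}, {:ok, value}), do: {:ok, [value | acc]}
  # An error anywhere poisons the overall result...
  def merge({:ok, _acc}, {:error, errors}), do: {:error, errors}
  def merge({:error, errors}, {:ok, _value}), do: {:error, errors}
  # ...and multiple errors are concatenated instead of dropped.
  def merge({:error, acc}, {:error, errors}), do: {:error, acc ++ errors}

  # The enumerable version: fold a list of per-element results into one.
  def reduce(results), do: Enum.reduce(results, {:ok, []}, &merge(&2, &1))
end
```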
So, how do we use this nice error handling in our controllers? After JaSerializer.Deserializer has converted the hyphenated params to underscores, the action gets the params, and uses those as the json for Document.from_json.
In our InterpreterServer app for Vineya, I've made a module, InterpreterServer.Controller.Ecto, that has common actions for each of the REST actions required by JSONAPI, so for the create action, it calls InterpreterServer.Controller.Ecto.create.
The general structure of the controller functions uses a happy path within the action function, while the sad path is handled by each called function generating the error response. This effectively works like halting the Plug.Conn, but I don't have to set up guard clauses at the top of the controller to restrict plugs to a single action.
valid_create_changeset and preloads_from_include_param both use Alembic, so we'll focus on those functions.
valid_create_changeset calls create_changeset and create_changeset calls document_from_json.
document_from_json calls Alembic.Document.from_json with the action name and uses with to handle the sad path of rendering the error.
document_from_json is only called for create and update actions, which match the names used in the JSONAPI spec, but there are other actions that don't match exactly.
The JSONAPI spec doesn't differentiate between the index and show actions; it calls both fetch.
When document_from_json doesn't encounter an error, create_changeset's with passes the Document to Document.to_params and then ensures belongs_to associations will insert correctly using ToParams.nested_to_foreign_keys.
A JSONAPI Document is at most two levels deep: the Resource(s) in data, with any direct or indirect relationships grouped together in included. The nested params format has no fixed depth, however, so to_params needs to pass a lookup table keyed by type and id.
However, the lookup table wasn't enough. JSONAPI allows ResourceIdentifiers, which are effectively pointers, and it represents has_one and belongs_to associations the same way: as to-one relationships. So, when both the Resource with the belongs_to association and the related resource with the has_one are included in a JSONAPI document, you get circular references between the ResourceIdentifiers in the relationships. converted_by_id_by_type prevents those circular references from leading to infinite recursion by tracking the {type, id} pairs that have already been converted from JSONAPI to nested params.
If a type and id have already been seen, the conversion can't return nothing, because that would make it look like the nested association is nil; instead, if the type and id are in converted_by_id_by_type, just the primary key is included.
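The cycle-breaking can be sketched like this. CircularSketch is a self-contained simplification (its lookup-table shape and names are hypothetical; Alembic tracks the same information in converted_by_id_by_type): each conversion records the {type, id} it is converting, and a resource that has already been seen on the current path is emitted as just its primary key instead of recursing forever.

```elixir
defmodule CircularSketch do
  # resources_by_id_by_type is the lookup table keyed by type and id:
  #   %{type => %{id => %{attributes: map, relationships: %{name => {type, id}}}}}
  # seen is the set of {type, id} pairs already converted on this path.
  def to_params({type, id}, resources_by_id_by_type, seen \\ MapSet.new()) do
    if MapSet.member?(seen, {type, id}) do
      # Already converted: include only the primary key, so the nested
      # association does not look like nil.
      %{"id" => id}
    else
      seen = MapSet.put(seen, {type, id})
      resource = resources_by_id_by_type[type][id]

      relationship_params =
        Map.new(resource.relationships, fn {name, pointer} ->
          {name, to_params(pointer, resources_by_id_by_type, seen)}
        end)

      resource.attributes
      |> Map.put("id", id)
      |> Map.merge(relationship_params)
    end
  end
end
```

A post whose author points back at the post terminates with the author's nested post reduced to `%{"id" => ...}`.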
The ToParams.to_params behaviour only nests params; it does not transform nested params into the format needed for Ecto.Changeset.cast, because to_params is not passed the Ecto.Schema modules. This is a deliberate design choice: the behaviour doesn't have a dependency on Ecto, and it reduces the number of arguments passed down to the child to_params calls.
Once Document.to_params returns the nested params, they are passed, along with the primary Ecto.Schema module, to ToParams.nested_to_foreign_keys.
When nested_to_foreign_keys converts a nested primary key to a foreign key, it removes the nested params from the returned map.
When the nested params are nil, the foreign key has to be set to nil, to distinguish it from the case where the relationship was simply not included.
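A minimal sketch of that transformation (ForeignKeySketch is a hypothetical reimplementation; Alembic derives the association and foreign-key names from the Ecto.Schema module passed to nested_to_foreign_keys rather than taking them as a string):

```elixir
defmodule ForeignKeySketch do
  # Replace a nested belongs_to params map with its foreign key:
  #   %{"author" => %{"id" => 9}} -> %{"author_id" => 9}
  # A nil nested value becomes an explicit nil foreign key, so it can be
  # distinguished from the relationship not being sent at all.
  def nested_to_foreign_key(params, association) when is_map(params) do
    foreign_key = "#{association}_id"

    case Map.fetch(params, association) do
      # Relationship not sent: leave the params untouched.
      :error ->
        params

      # Relationship explicitly emptied: null out the foreign key.
      {:ok, nil} ->
        params |> Map.delete(association) |> Map.put(foreign_key, nil)

      # Relationship present: hoist the primary key into the foreign key.
      {:ok, %{"id" => id}} ->
        params |> Map.delete(association) |> Map.put(foreign_key, id)
    end
  end
end
```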
We have a Changeset for create, but once created, we may want to return some associations as included relationships.
I call preloads_from_include_param before calling mutate_repo because the preloads could be malformed, and I don't want errors in the input preventing me from rendering a proper response after I've changed the Repo: that could lead the caller to believe the Repo did not change when it did.
Fetch.from_params can't error out because there is no validation on the include parameter: the allowed relationships aren't checked until Fetch.to_query.
If a relationship path (here, "secret") isn't mapped to preloads, an error is returned, so that callers don't have to wonder why resources aren't being included. This can be very important when you have explicitly decided not to include an association in the list of includable relationships.
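The mapping from include paths to preloads can be sketched as a lookup against an explicit whitelist. IncludeSketch and its preload_by_include map are hypothetical names (in Alembic, the allowed relationships are checked in Fetch.to_query); the point is that an unmapped path produces a JSONAPI error instead of being silently dropped:

```elixir
defmodule IncludeSketch do
  # preload_by_include whitelists which relationship paths may be
  # preloaded, mapping each include path to an Ecto-style preload.
  def to_preloads(include_param, preload_by_include) do
    include_param
    |> String.split(",")
    |> Enum.reduce({:ok, []}, fn path, acc ->
      case Map.fetch(preload_by_include, path) do
        {:ok, preload} -> merge(acc, {:ok, preload})
        :error -> merge(acc, {:error, [unknown_include_error(path)]})
      end
    end)
  end

  # Collect preloads on the happy path; accumulate every error otherwise.
  defp merge({:ok, preloads}, {:ok, preload}), do: {:ok, preloads ++ [preload]}
  defp merge({:ok, _}, {:error, errors}), do: {:error, errors}
  defp merge({:error, errors}, {:ok, _}), do: {:error, errors}
  defp merge({:error, acc}, {:error, errors}), do: {:error, acc ++ errors}

  defp unknown_include_error(path) do
    %{
      "detail" => "`#{path}` is an unknown relationship path",
      "source" => %{"parameter" => "include"},
      "title" => "Unknown relationship path"
    }
  end
end
```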
With a valid changeset and preloads, we can change the database, knowing that we'll be able to render the response in the format requested by the caller.
The last step is rendering with a JaSerializer.PhoenixView.
So far, I've shown Alembic working as the server, but Alembic can also be used in a client.
In addition to using Alembic for our controllers that interface with Ember, we put JSONAPI inside of JSONRPC and use that to communicate with Ruby applications over RabbitMQ. Our Elixir application has both RPC servers for the resources it owns and RPC clients for getting resources that the Ruby applications own.
When the JSONRPC response comes back, it has a result field. That field is the result argument to result_to_document.
Api.deserialize_keys does the same as JaSerializer.Deserializer, but isn't dependent on the argument being a Plug.Conn.
Internally, Api.deserialize_keys uses the same formatting functions as JaSerializer.Deserializer: JaSerializer.ParamParser.Utils.format_keys.
result_to_document's Document is used in reply, which formats the GenServer reply.
As part of that formatting, I wanted the RPC client to return a nested Ecto.Schema struct that would look just like the return from Ecto.Repo calls, so Document.to_ecto_schema does that conversion.
When there is nil data in JSONAPI, it becomes nil for the Ecto.Schema struct.
When there is a single Resource in "data", it becomes an Ecto.Schema struct corresponding to that type using the ecto_schema_by_type map passed to to_ecto_schema.
When there are relationships, they get attached to the struct using the associations.
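The idea can be sketched without Ecto at all. SchemaSketch and its struct_by_type map are hypothetical stand-ins for Alembic's to_ecto_schema and its ecto_schema_by_type argument, and for brevity this sketch inlines related resources in "relationships" (whereas JSONAPI carries ResourceIdentifiers there and the full resources in "included"):

```elixir
defmodule SchemaSketch do
  defmodule Author do
    defstruct [:id, :name]
  end

  defmodule Post do
    defstruct [:id, :text, :author]
  end

  # nil data becomes nil, mirroring an empty to-one relationship.
  def to_struct(nil, _struct_by_type), do: nil

  # The "type" picks the module; attributes become fields; to-one
  # relationships recurse and land under the association name.
  def to_struct(%{"type" => type} = resource, struct_by_type) do
    module = Map.fetch!(struct_by_type, type)

    fields =
      resource
      |> Map.get("attributes", %{})
      |> Map.put("id", resource["id"])
      |> Map.new(fn {key, value} -> {String.to_existing_atom(key), value} end)

    relationship_fields =
      resource
      |> Map.get("relationships", %{})
      |> Map.new(fn {name, related} ->
        {String.to_existing_atom(name), to_struct(related, struct_by_type)}
      end)

    struct(module, Map.merge(fields, relationship_fields))
  end
end
```

Note that nothing here touches a database; the modules only declare the shape of the structs, mirroring how our RPC clients use Ecto.Schema modules that aren't backed by tables.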
This may seem like I'm overly tying the relationships to the associations, but none of this code cares that an Ecto.Repo exists. The Ecto.Schema modules are only used as a convenient way to declare the casting rules for the attributes and relationships, mapping JSONAPI Resource attributes to Ecto.Schema fields and relationships to associations. The Ecto.Schema modules don't have to be backed by a database table and aren't for our usage of RPC clients.
It so happens that for our API controllers, the Ecto.Schema module used to cast the JSONAPI to structs is the same as the database table, but that doesn't have to be the case and is only that way because none of our JSONAPI resources are composite views of database tables.
Document.to_ecto_schema eventually calls ToEctoSchema.to_ecto_schema, which uses Ecto.Changeset.cast to apply the casting rules defined by the Ecto.Schema module's fields. It does not use a changeset function, because for the client use case there is no validation of the attributes and associations: whatever the server sent is the truth.
Because to_ecto_schema uses only cast and __schema__, which are available to any module using Ecto.Schema's schema DSL, no new DSL is needed to use an Ecto.Schema module with to_ecto_schema.
The to_string in put_named_association is the code that cements that relationship names (after the earlier reformatting from hyphenated to underscored) must match association names. If one wanted to make them differ, the to_string could be changed to a transformation function that is passed in. PRs welcome.
put_association handles the differences between different types of associations and starts recursively handling indirect relationships by calling to_ecto_schema on each of the associated params.
With the conversion functions provided by Alembic, we are able to use it for clients and servers, and for both API and RPC. It can work with Plugs, such as those from JaSerializer, but does not require a Plug.Conn to be available. It leaves the casting and associations to Ecto, without needing access to a database through an Ecto.Repo.