sumodb_talk





On Github amilkr / sumodb_talk

SumoDB

https://github.com/inaka/sumo_db

A clean persistence layer for Erlang/OTP created to make your life easier

Good afternoon everyone, and thank you for coming. This talk is about sumo_db, an Erlang persistence layer designed to be easy to use in your Erlang applications. I hope you enjoy it.

~$ whoami

Marcos Amilcar Almonacid

marcos@inaka.net / github: amilkr

hello@inaka.net / github: inaka / http://www.inaka.net

My name is Marcos Almonacid, I'm from Argentina. I study computer science at Universidad Tecnologica Nacional. I've been working on Erlang applications for about 3 years now. Before that I worked mainly with Ruby. Currently I'm a developer at Inaka, where I've worked on some really cool applications, like Whisper. Inaka is a US company with offices in Argentina. We have quite a large team of Erlangers. We build end-to-end applications, mainly with Erlang, Ruby, iOS and Android. We also do Erlang consulting. Ok. Let's get started.

The God Module

We are looking at a handle_call clause extracted from a very big gen_server. Its goal is to execute a select query against a database and return the results. How many of you have written a god module with functions like this one to access your databases? (show hands) It's the first natural thing to do. Inside this function we have a hardcoded query, and we're executing it through a db driver, emysql in this case. We're also handling the different types of results, including errors. But this tends to produce crappy projects, because...

No single responsibility

... there's no clear definition of concerns and responsibilities. Our business logic is mixed with database implementation details.

Untestable

&

Tightly Coupled

It also makes the application harder to test without running the whole system. We can't test our business logic in isolation from the data access layer.

A lot of Duplicated Code

It has a lot of duplicated code, since we're copy-pasting code to create new handle_call statements.

Hard to scale and refactor

For the same reason, it's tedious to scale and refactor. And it's very easy to make a mistake.

NIGHTMARE

This is a snippet of the god module. We're only showing four functions, and the code doesn't even fit on the slide. As you can see, it's an absolute nightmare to work with this code... So, let's see how sumodb can help us write this in a cleaner way...

The Sumo way

With sumo, we only need one line of code for most of the common queries. We can do a lot with just one line: persisting and deleting an entity (such as a user, a message, etc.), fetching by conditions, paginating results. We can also create the schema for each of our entities.

How does it work?

So let's see what sumo is about, how it works, and why it is interesting. The secret is just a pattern.

Repositories. How many of you are familiar with this pattern? To try to understand what repositories are...

God Module

Multiple Entities

SQL driver

State translation

Business Logic

DB connection details

...let's go back to the god module for a bit. On the slide we can see its responsibilities. The first thing we notice is that the list is long, and also that the module handles multiple entities at the same time. So let's split each entity into its own module: one module, one entity. This sounds like ActiveRecord, right?

ActiveRecord

One Entity

SQL driver

State translation

Business Logic

DB connection details

How many of you have used ActiveRecord? In Ruby, for example. With ActiveRecord, instead of one big god module, we have several modules, one per entity. It's a first improvement. But...

STILL A MESS!

We still have a lot of mini god modules, and the code is still coupled to the db details... So let's go one step further...

One Entity

Business Logic

State translation

SQL

DB driver details

Domain Entity

Repository

Storage Backend

...and let's take each entity and start splitting its responsibilities into new modules. The first module is a domain entity, which encapsulates the business logic and the state for the entity. It knows nothing about the storage details. Then we have our repository, which is the layer that translates our own representation of those entities back and forth into something a database can use. The repository also holds the SQL code (like custom selectors or deleters). Finally, there's a layer that actually knows how to talk to a database, and those are our storage backends. These are just a convenient layer of abstraction over the db driver. In sumo these concepts are implemented with behaviors.

Domain Entity

=

sumo_doc

Domain entities in sumodb are just modules that implement the sumo_doc behavior. This behavior only needs 3 callbacks: sumo_schema, sumo_sleep and sumo_wakeup. Let's take a look at them.

sumo_schema/0

Called when creating a schema in the db for entities of this type

sumo_schema is used to create the entity's schema in the database. In it, we build the schema description that sumo needs in order to create it in the DB. To help with that, sumo provides two convenient functions: new_schema and new_field.
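As a rough illustration of the callback just described, here is a minimal sketch for a hypothetical user entity. It assumes sumo_db is on the code path and that new_schema takes a name plus a list of fields and new_field takes a name, a type and a list of attributes. The module name, the field names, the types and the attribute atoms are assumptions made for this example, so check them against the sumo_db version you use.

```erlang
-module(user_schema_sketch).
-export([sumo_schema/0]).

%% Sketch only: a real entity module would also declare
%% -behaviour(sumo_doc) and implement sumo_sleep/1 and sumo_wakeup/1.
%%
%% ASSUMPTION: the field types (integer, string) and attributes
%% (id, not_null, auto_increment, {length, N}) are illustrative values
%% in the style of a MySQL-backed schema.
sumo_schema() ->
    sumo:new_schema(?MODULE, [
        sumo:new_field(id,    integer, [id, not_null, auto_increment]),
        sumo:new_field(name,  string,  [{length, 128}, not_null]),
        sumo:new_field(email, string,  [{length, 128}, not_null])
    ]).
```

The function only describes the schema; nothing touches the database until sumo is asked to create it.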

sumo_sleep/1

Called when the entity is going to be persisted

sumo_sleep is called when we want to persist the entity. sumodb uses this function to translate our state representation into a proplist, which is sumo's internal representation. In this case we're translating from a record.

sumo_wakeup/1

Called when loading the entity from the db

When sumo loads an entity from the db, it uses sumo_wakeup to translate the entity from the proplist back to our representation (a record in this case).

Sumo Doc - Life Cycle

So let's see what the workflow looks like for the persist and find_by operations. When we call the persist function, sumodb uses the sumo_sleep callback to translate the entity's state representation into a proplist, and then converts this proplist into what the database driver needs. The exact opposite happens when we fetch an entity from the database: the database representation is taken and transformed into our own state representation by sumo_wakeup. And that's it. The point here is that we only need to implement the sleep and wakeup functions. The rest of the work is done by sumodb. That's all about the domain entities.
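The sleep and wakeup translation can be sketched in plain Erlang, without sumodb loaded at all. The module name, the user record and its fields are hypothetical, chosen only for this example.

```erlang
-module(user_doc_sketch).
-export([new/2, sumo_sleep/1, sumo_wakeup/1]).

%% Hypothetical state representation for the entity.
-record(user, {id, name, email}).

%% Hypothetical constructor. The id is left undefined here, on the
%% assumption that the storage layer assigns it.
new(Name, Email) ->
    #user{name = Name, email = Email}.

%% Record -> proplist: the direction used before persisting.
sumo_sleep(#user{id = Id, name = Name, email = Email}) ->
    [{id, Id}, {name, Name}, {email, Email}].

%% Proplist -> record: the direction used after loading.
sumo_wakeup(Fields) ->
    #user{id    = proplists:get_value(id, Fields),
          name  = proplists:get_value(name, Fields),
          email = proplists:get_value(email, Fields)}.
```

Calling sumo_wakeup on the result of sumo_sleep gives back the original record, which is the round trip the life cycle relies on.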

Repository

=

sumo_repo

Repository. The repositories know about queries and how to work with the different storage backends, and they also translate the information back and forth between the domain and the storage backends. This really helps to minimize duplicated query logic. So the idea is that our domain entities can focus on the business logic and delegate the storage implementation details to someone else.

sumo repo functions

By default sumo comes with basic implementations of repositories for different databases, like MySQL and MongoDB. These repositories already implement these functions (persist, delete, find_by, etc.). In almost all cases you will use one of these default repositories. And if your application needs some special query, you can use them as the foundation for your own repositories.

Storage Backend

=

sumo_backend

Storage Backend. The storage backends are modules that implement the sumo_backend behavior. A backend knows how to use the database driver, such as emysql. It also starts the connection to the database and provides this connection to the repositories.

Domain Events

As a bonus, sumodb is capable of dispatching events that affect the state of the domain...

schema_created

created

updated

deleted

deleted_all

Like notifying that a certain entity was created, updated or deleted. It will also notify when a schema is created for any entity. This is useful for reacting to the events coming in from the domain and dispatching other kinds of events, for example into a RabbitMQ system, or for doing cleanup tasks, etc.
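One way to react to such events is a plain OTP gen_event handler. The sketch below is self-contained and does not call sumodb. The message shape {DocName, EventName, Args} is an assumption made for this example, as is the way a handler gets registered, so verify both against the sumo_db version you use.

```erlang
-module(user_events_sketch).
-behaviour(gen_event).
-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).

%% The handler state is the list of events seen so far, newest first.
init(_Args) ->
    {ok, []}.

%% ASSUMPTION: events arrive as {DocName, EventName, Args}.
%% Only entity-level events are recorded; everything else is ignored.
handle_event({DocName, Event, Args}, Seen)
  when Event =:= created; Event =:= updated; Event =:= deleted ->
    {ok, [{DocName, Event, Args} | Seen]};
handle_event(_Other, Seen) ->
    {ok, Seen}.

%% Returns the recorded events in arrival order.
handle_call(seen, Seen) ->
    {ok, lists:reverse(Seen), Seen}.

handle_info(_Info, Seen) ->
    {ok, Seen}.

terminate(_Reason, _Seen) ->
    ok.

code_change(_OldVsn, Seen, _Extra) ->
    {ok, Seen}.
```

In a real system the body of handle_event is where a message would be published to a queue or a cleanup task would be started.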

So...

How can we use it?

Ok, we get the idea. But how do we use sumodb in our system? Well, let's say we want to work with an entity called user. Let's see what we need to do to implement it with sumodb.

The first step is to write the entity module, where we declare the sumo_doc behavior and the entity's state representation. In this case it's a user record. Then we implement the three callbacks that we mentioned before: sumo_schema to create the schema in the database, and sumo_sleep and sumo_wakeup to translate from the user record to a proplist and vice versa. We could also implement a constructor function, but it's not necessary.

The next step is the configuration. First of all we bind our entity to a repository called user_repo. Then we tie that repository name to the module that has the implementation, which in this case is the default repository for MySQL. In addition we add some options, like the storage_backend and the number of workers. Finally we configure the storage backend with the specific database driver options, such as user, password, host, etc.
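A configuration along those lines might look like the sketch below. The key names (docs, repositories, storage_backends), the module names sumo_repo_mysql and sumo_backend_mysql, and the driver options are assumptions based on how sumo_db was laid out around the time of this talk, and other releases may name them differently. The entity module is called app_user rather than user, because user is already the name of a module in OTP's kernel application.

```erlang
%% sys.config sketch. Every name and value here is illustrative.
[
 {sumo_db, [
   %% 1. Bind the entity (doc) to a repository name.
   {docs, [
     {app_user, user_repo}
   ]},
   %% 2. Tie the repository name to its implementation and options.
   {repositories, [
     {user_repo, sumo_repo_mysql, [
       {storage_backend, user_backend},
       {workers, 10}
     ]}
   ]},
   %% 3. Configure the storage backend with the driver options.
   {storage_backends, [
     {user_backend, sumo_backend_mysql, [
       {username, "root"},
       {password, ""},
       {host, "127.0.0.1"},
       {port, 3306},
       {database, "my_app"},
       {poolsize, 10}
     ]}
   ]}
 ]}
].
```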

And That's it!

Done... Now we can start to persist and find users in our system.

For instance we could create a module called user_service with functions like new, get, and delete which use the sumo functions to work.
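Such a service module could be sketched as follows. It assumes sumo_db is started and configured, and that a hypothetical entity module app_user (named that way because user is taken by OTP's kernel) exports a new/2 constructor. The return values mentioned in the comments describe how sumo's persist, find and delete are expected to behave and should be checked against the version in use.

```erlang
-module(user_service).
%% get/1 shadows the auto-imported BIF erlang:get/1, so say so explicitly.
-compile({no_auto_import, [get/1]}).
-export([new/2, get/1, delete/1]).

%% Builds a user with the hypothetical app_user:new/2 constructor and
%% persists it. Expected to return the stored entity.
new(Name, Email) ->
    sumo:persist(app_user, app_user:new(Name, Email)).

%% Fetches one user by id. Expected to return the entity, or the atom
%% notfound when there is no match.
get(Id) ->
    sumo:find(app_user, Id).

%% Deletes one user by id.
delete(Id) ->
    sumo:delete(app_user, Id).
```

Each function is one line of sumo code, and none of them knows which database sits behind the repository.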

SumoDB + Cowboy

Let's see how we can combine sumodb with cowboy to get an HTTP server with a storage backend in a few lines.

Canillita

SumoDB + Cowboy

We will explore the canillita app, a sample application created by Inaka to teach Erlang to newcomers. Canillita is a simple RESTful server with server-sent events capabilities.

It has two endpoints: one to publish newsflashes and another one to receive them. Cowboy talks to the clients, and sumodb talks to the database.

GET /news

curl -vX GET http://localhost:4004/news
> GET /news HTTP/1.1
> User-Agent: curl/7.30.0
> Host: localhost:4004
> Accept: */* 
>

< HTTP/1.1 200 OK
< transfer-encoding: chunked
< connection: keep-alive
< server: Cowboy
< date: Thu, 07 Nov 2013 14:31:10 GMT
< content-type: text/event-stream
<
event: old_news_flash
data: The first news flash
data: This is an old news flash. keep waiting for more

With the GET endpoint we can start listening for news from the server. When the communication starts, the server retrieves all the newsflashes from the db and keeps the connection open.

POST /news

curl -vX POST http://localhost:4004/news -H "Content-Type: application/json" \
  -d '{ "title": "This is the title for the news flash",
       "content": "And this is the content...." }'
> POST /news HTTP/1.1
> User-Agent: curl/7.30.0
> Host: localhost:4004
> Accept: */*
> Content-Type:application/json
> Content-Length: 50
>
< HTTP/1.1 204 No Content
< connection: keep-alive
< server: Cowboy
< date: Fri, 08 Nov 2013 20:06:01 GMT
< content-length: 0
<

With the POST we send a newsflash to the server. It will store the newsflash in MySQL using sumodb...

GET /news

curl -vX GET http://localhost:4004/news
> GET /news HTTP/1.1
> User-Agent: curl/7.30.0
> Host: localhost:4004
> Accept: */* 
>

< HTTP/1.1 200 OK
< transfer-encoding: chunked
< connection: keep-alive
< server: Cowboy
< date: Thu, 07 Nov 2013 14:31:10 GMT
< content-type: text/event-stream
<
event: old_news_flash
data: The first news flash
data: This is an old news flash. keep waiting for more

event: news_flash
data: This is the title for the news flash
data: And this is the content...

It will also send the newsflash to the open connections.

Let's see the main part of the code. Canillita has only one sumo doc, canillita_news. In it we can see the sumo_doc behavior declaration and the news_flash record, which is the state representation for this entity. We can also see the sumo_schema function. Below we have the sumo_sleep and sumo_wakeup functions. The first one translates a newsflash record into a proplist, and the other one translates in the opposite direction. Then we have a constructor function, which receives two fields and returns a newsflash record. And finally we have 3 getter functions.

Here we have the cowboy handler functions. When a new POST request arrives, canillita uses the sumo:persist function to save the news_flash in the db. And when a connection starts, through a GET request, canillita calls sumo:find_all to get all the stored news_flashes. Very simple. In two slides I showed you an application that really works.

Current State

MySQL - MongoDB

Redis - SQLite3

Mnesia - ETS

DynamoDB

Current State. Right now sumodb supports MySQL and MongoDB very well. We actually have a big project running on sumo and MySQL, and it works really well. Support for Redis and SQLite3 is almost done. Next we want to add Mnesia and ETS, and also DynamoDB... and maybe Cassandra.

Last Words

The last thing I want to say is that...

SumoDB

erlang + databases

Sumo was born from our need to find a way to write fast and clean code to access databases. And since we couldn't find any application that did that in the pure Erlang world, we decided to create it. Sumo is a young project. It does its work pretty well, and we want it to grow. So you're invited to contribute.

SumoDB

is our way...

and it can be yours

Sumo is our new way to work in Erlang. And it can be yours, too.

Thank You!

https://github.com/inaka/sumo_db

https://github.com/inaka/erlang_training

Questions?