Service Oriented Architecture with Thrift – Who am I? – Imagine you inherited a huge monolithic web app!







On Github devill / soa-with-thrift

Service Oriented Architecture with Thrift

Created by Rafael Ordog a.k.a. DeVill / @devillsroom

Who am I?

Poker Croupier

https://github.com/devill/poker-croupier

Imagine you inherited a huge monolithic web app!

Single database server

Several million lines of code

Monolithic application

One database

MLOC

What is your first reaction to performance problems?

Performance optimization

It's fine as long as there are still really bad parts left to fix

Leaking logic to SQL to get better performance is a huge red flag

Performance optimization

Logic in SQL

Scaling by replication

Running the same process on several threads is hard

Concurrency issues

The amount of communication can explode

Threading is hard

Locks

Lots of communication

Scaling a single database

Master/Slave and sharding?

Lots of concurrent queries

Hard to tell which one caused a performance issue

One database

Master/Slave

Sharding

Many queries

Who is causing the trouble?

Dependencies

A single deployment can break everything

A single query can paralyze the entire DB

Runaway dependencies

Deployment

Chain reaction

Teams cannot take full responsibility for their component

Team responsibility?

What if...

Service oriented architecture

is the idea of breaking up your application into smaller independent applications

Host each on separate (virtual) machines

Failures and performance issues are localized

Dependent services can fail gracefully

It's easier to monitor performance issues per service
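A minimal sketch of what "failing gracefully" can look like on the calling side. The names `RecommendationGateway` and `BrokenClient`, the half-second timeout, and the choice of an empty list as fallback are all assumptions made for illustration; they are not part of the talk.

```ruby
require 'timeout'

# Hypothetical gateway in front of a remote recommendation service.
# `client` is any object that responds to `recommendations_for(user_id)`.
class RecommendationGateway
  def initialize(client, timeout_seconds: 0.5)
    @client = client
    @timeout_seconds = timeout_seconds
  end

  # Returns the remote result, or an empty list if the service is slow or down.
  def recommendations_for(user_id)
    Timeout.timeout(@timeout_seconds) { @client.recommendations_for(user_id) }
  rescue Timeout::Error, IOError, SystemCallError
    [] # degrade gracefully: the page renders without recommendations
  end
end

# Stand-in for a service that is down, for illustration only.
class BrokenClient
  def recommendations_for(_user_id)
    raise IOError, 'connection refused'
  end
end

puts RecommendationGateway.new(BrokenClient.new).recommendations_for(42).inspect
```

The caller keeps working with a reduced feature set instead of failing together with the service it depends on.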

Independent databases

If there is a database issue, it's immediately clear which team should take care of it

Extra level of structure

Stronger encapsulation

Teams can take full responsibility

DevOps - Development and operations are taken care of by the same team

Bad decisions are more localized

It's easier to remove past mistakes

It's less risky to experiment

Good architecture lets you defer hard decisions

With a service oriented architecture your hard decisions are also localized

Just never try to rewrite a service

Rewrites almost never work out well

INTERMISSION

à la 2001: A Space Odyssey

Your hammer ain't gonna make everything a nail

It's a chance to try new technologies

Just beware of the shiny nickel

Especially if it's similar to something you already have

Each new technology needs to be maintained

So what is worth trying?

Try using different languages

Try using different types of databases

etc...

Experimenting with languages

Each language has its virtues

Choose the best tool

Node.js

Use it for server side UI to reduce duplication

Use it for IO-heavy applications with little logic in them

Ruby and Python

Use them when fast time to market is important

Use them if business logic is complex but CPU and RAM are unlikely to be a problem

C, C++, CUDA

Great for computation heavy applications where scaling up is the only viable option

Although this situation is getting pretty rare

Experiment with new database technologies

Each database technology has its virtues

That DOES NOT mean that you should have all of Oracle SQL, MySQL and Microsoft SQL Server

Paying for an SQL server

SQL - the good old Swiss Army knife

It's kind of good at handling unexpected requirements for a wide variety of aggregates

BUT it promotes leaking logic to the database

It's not the best choice for most operational purposes

Key-value stores

Really good for storing session related data

Non persistent versions can be used as in-memory cache
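A rough sketch of the in-memory cache idea, assuming a single process and no eviction beyond expiry. `TtlCache` and its injectable `clock` are hypothetical names; a real deployment would more likely use a dedicated key-value store.

```ruby
# Minimal in-memory key-value cache with per-entry expiry.
class TtlCache
  # `clock` is injectable so that expiry can be tested without sleeping.
  def initialize(clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @clock = clock
    @entries = {}
  end

  # Store `value` under `key` for `ttl_seconds` seconds.
  def set(key, value, ttl_seconds)
    @entries[key] = [value, @clock.call + ttl_seconds]
  end

  # Return the stored value, or nil if it is missing or has expired.
  def get(key)
    value, expires_at = @entries[key]
    return nil if expires_at.nil?

    if @clock.call >= expires_at
      @entries.delete(key)
      return nil
    end
    value
  end
end

cache = TtlCache.new
cache.set('session:abc', { 'user_id' => 7 }, 30)
puts cache.get('session:abc').inspect
```

Because nothing is persisted, losing the process loses the data, which is acceptable for a cache but not for anything that must survive a restart.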

Document databases

Preferable when a data set is presented in the same way many times

Moves the schema to the code, where it belongs
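One way to read "the schema lives in the code" is sketched below, with documents modelled as plain hashes. `ARTICLE_SCHEMA` and `schema_errors` are made-up names, and the check is deliberately naive.

```ruby
# The "schema" lives in application code: required keys and their types.
ARTICLE_SCHEMA = { 'title' => String, 'tags' => Array, 'views' => Integer }.freeze

# Returns a list of problems; an empty list means the document is valid.
def schema_errors(document, schema = ARTICLE_SCHEMA)
  schema.each_with_object([]) do |(key, type), errors|
    if !document.key?(key)
      errors << "missing #{key}"
    elsif !document[key].is_a?(type)
      errors << "#{key} should be a #{type}"
    end
  end
end

valid = { 'title' => 'SOA with Thrift', 'tags' => ['soa'], 'views' => 10 }
broken = { 'title' => 1, 'tags' => [] }
puts schema_errors(valid).inspect
puts schema_errors(broken).inspect
```

The database stores whatever it is given; the application decides what a well-formed document is.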

Graph databases

Similar to SQL in that they let you slice and dice the data freely

Gives a lot more freedom than SQL, and it's a lot better at recursive queries

Horrible at performing similar queries repeatedly
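To make "recursive query" concrete, here is a small sketch of the kind of question that is natural in a graph: everyone reachable from a starting node by following edges any number of times. The data and the name `reachable_from` are invented, and the code runs on an in-memory hash rather than a real graph database.

```ruby
require 'set'

# Adjacency list: who follows whom (made-up data).
FOLLOWS = {
  'ann' => ['bob'],
  'bob' => %w[cat dan],
  'cat' => ['ann'],
  'dan' => []
}.freeze

# Everyone reachable from `start` by following edges one or more times.
def reachable_from(graph, start)
  seen = Set.new
  queue = [start]
  until queue.empty?
    node = queue.shift
    graph.fetch(node, []).each do |neighbour|
      next if seen.include?(neighbour)

      seen << neighbour
      queue << neighbour
    end
  end
  seen
end

puts reachable_from(FOLLOWS, 'ann').to_a.sort.inspect
```

In SQL the same question needs a recursive common table expression or repeated self-joins, which is where graph databases tend to be more comfortable.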

Plain old files

No... I'm serious!

Plain old files

They're perfect for storing a large corpus of unindexed text

Okay! So you are about to write your first service!

What would be a natural choice of a communication channel?

REST APIs work well between different organizations

It's a viable option for in-house communication too

But today I'd like to show you another option

Thrift

A remote procedure call framework by

Okay! But what's so cool about it?

Single interface definition can be compiled to many languages

Both server and client work out of the box

And it's pretty simple to use

Step 1: Define the service interface

enum BinaryOperation {
    ADDITION = 1,
    SUBTRACTION = 2,
    MULTIPLICATION = 3,
    DIVISION = 4,
    MODULUS = 5
}

struct ArithmeticOperation {
    1:BinaryOperation op,
    2:double lh_term,
    3:double rh_term,
}

service Calculator {
    double calc(1:ArithmeticOperation op),
}

Note the numbers before each field

They help with versioning
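A simplified model of why the field numbers help, assuming a message is just a map from field id to value. This is not Thrift's actual wire encoding; `FIELDS_V1`, `FIELDS_V2`, `decode` and the `precision` field are hypothetical. The point is that a reader looks fields up by id and skips ids it does not know.

```ruby
# Field-id => name tables as two versions of a reader might know them.
# Version 2 adds a hypothetical `precision` field under a new id.
FIELDS_V1 = { 1 => :op, 2 => :lh_term, 3 => :rh_term }.freeze
FIELDS_V2 = FIELDS_V1.merge({ 4 => :precision }).freeze

# Decode a message given as { field_id => value }, skipping unknown ids.
def decode(wire_message, known_fields)
  wire_message.each_with_object({}) do |(field_id, value), result|
    name = known_fields[field_id]
    result[name] = value if name
  end
end

message_from_v2_client = { 1 => 1, 2 => 25.0, 3 => 10.0, 4 => 2 }
puts decode(message_from_v2_client, FIELDS_V1).inspect # old reader ignores id 4
puts decode(message_from_v2_client, FIELDS_V2).inspect # new reader sees it
```

As long as existing ids are never reused for a different meaning, old and new versions of a service can keep talking to each other.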

Step 2: Run the Thrift compiler

thrift -gen rb calculator.thrift

Step 3: Implement a handler

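class CalculatorHandler
    # `val` is the ArithmeticOperation struct generated from calculator.thrift
    def calc(val)
        lh_term = val.lh_term
        rh_term = val.rh_term
        # BinaryOperation constants come from the generated calculator_types.rb
        case val.op
            when BinaryOperation::ADDITION
                lh_term + rh_term
            when BinaryOperation::SUBTRACTION
                lh_term - rh_term
            when BinaryOperation::MULTIPLICATION
                lh_term * rh_term
            when BinaryOperation::DIVISION
                lh_term / rh_term
            when BinaryOperation::MODULUS
                lh_term % rh_term
        end
    end
end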

Step 4: Start the service

create_server(CalculatorHandler.new).serve

Okay... you've got me... it's a little more than that

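$:.push('gen-rb')      # default output directory of `thrift -gen rb`
require 'thrift'
require 'calculator'   # generated service code (Calculator::Processor)

def create_server(handler)
    processor = Calculator::Processor.new(handler)
    transport = Thrift::ServerSocket.new(9090)
    transport_factory = Thrift::BufferedTransportFactory.new
    Thrift::ThreadPoolServer.new(processor, transport, transport_factory)
end

create_server(CalculatorHandler.new).serve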

Step 5: Enjoy

ruby calculator.rb

But what's the use of a service without a service consumer?

Just use the generated client as any other class

$client = new CalculatorClient(ThriftProtocol::get());

$arithmeticOperation = new ArithmeticOperation();
$arithmeticOperation->op = BinaryOperation::ADDITION;
$arithmeticOperation->lh_term = 25;
$arithmeticOperation->rh_term = 10;

echo $client->calc($arithmeticOperation) . "\n";

ThriftProtocol::close();

Okay... you've got me again... batteries are not included

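class ThriftProtocol {
    private static $transport = null;
    private static $protocol = null;

    // Lazily open one shared connection to the calculator service.
    static function get() {
        if (self::$protocol === null) {
            self::$transport = new TSocket('localhost', 9090);
            self::$transport->open();
            self::$protocol = new TBinaryProtocol(self::$transport);
        }
        return self::$protocol;
    }

    // Close the connection if one was opened, so get() can reconnect later.
    static function close() {
        if (self::$transport !== null) {
            self::$transport->close();
            self::$transport = null;
            self::$protocol = null;
        }
    }
}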

Thrift also supports async messages

service AsyncService {
    // The method name is missing on the original slide; `notify` is a placeholder
    oneway void notify(1:string message),
}

You can call it just like any other function, but the client will not wait for a response
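The calling pattern can be imitated locally with a queue and a worker thread. This is only an analogy for the fire-and-forget behaviour, not how Thrift implements it; `FireAndForget`, `notify` and `close` are hypothetical names.

```ruby
# Hypothetical local stand-in for a oneway call: enqueue and return at once.
class FireAndForget
  def initialize(&handler)
    @queue = Queue.new
    @worker = Thread.new do
      while (message = @queue.pop) != :stop
        handler.call(message)
      end
    end
  end

  # Returns immediately; nothing is reported back to the caller.
  def notify(message)
    @queue << message
    nil
  end

  # Process what is still queued, then stop the worker (clean shutdown only).
  def close
    @queue << :stop
    @worker.join
  end
end

received = []
service = FireAndForget.new { |message| sleep 0.1; received << message }
service.notify('hello') # does not wait for the 0.1 s of work
service.close
puts received.inspect
```

The price of not waiting is that the caller never learns whether the work succeeded, so this style suits notifications and logging better than anything that needs a result.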

DEMO

Thrift vs. REST

Thrift has static types, whereas REST promotes dynamic types

Thrift has versioning support

Thrift vs. REST

REST is not a strict standard

The documentation of Thrift is horrible

Thrift

Server type

Transport

Protocol
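A toy illustration of how the protocol and transport layers divide the work: the protocol decides how values become bytes, the transport decides how those bytes travel. The helpers below (`encode_double`, `frame`, `unframe`) are hypothetical and heavily simplified; they imitate a big-endian binary encoding and a length-prefixed frame, and leave out everything else a real Thrift message carries.

```ruby
# Protocol layer (simplified): turn a value into bytes.
def encode_double(value)
  [value].pack('G') # 8 bytes, big-endian IEEE 754
end

# Transport layer (simplified): prefix the payload with its 4-byte length.
def frame(payload)
  [payload.bytesize].pack('N') + payload
end

# Reverse of `frame`: read the length, then return exactly that many bytes.
def unframe(bytes)
  length = bytes[0, 4].unpack1('N')
  bytes[4, length]
end

payload = encode_double(25.0) + encode_double(10.0)
framed = frame(payload)
lh_term, rh_term = unframe(framed).unpack('G2')
puts framed.bytesize
puts [lh_term, rh_term].inspect
```

Because the layers are independent, a server and client only have to agree on the same combination; swapping one layer does not touch the handler code.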

Okay, but what about that monolithic app I've been talking about?

Let's suppose we already have tests

Step 0: Try to isolate a few service candidates

It may change later, but you need a starting point

Try to isolate something small and simple first

Step 1: Isolate the database

As long as there are shared tables, the service will not be independent

In terms of performance this is the most important step anyway

Find shared tables

Decide which service will own each of them

Replicate the data for other service candidates as needed

Once all necessary data is replicated, read from the new sources
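A sketch of the ownership and replication idea, with in-memory hashes standing in for tables. `UserOwner` and `BillingUserReplica` are invented names, and a synchronous callback stands in for whatever replication mechanism is actually used.

```ruby
# Owning service candidate: the only writer of user data.
class UserOwner
  def initialize
    @users = {}
    @listeners = []
  end

  # Other candidates register here to be told about changes.
  def on_change(&listener)
    @listeners << listener
  end

  def save(id, attributes)
    @users[id] = attributes
    @listeners.each { |listener| listener.call(id, attributes) }
  end
end

# Another candidate keeps a read-only copy of just the fields it needs.
class BillingUserReplica
  def initialize(owner)
    @emails = {}
    owner.on_change { |id, attributes| @emails[id] = attributes[:email] }
  end

  def email_for(id)
    @emails[id]
  end
end

owner = UserOwner.new
replica = BillingUserReplica.new(owner)
owner.save(1, { name: 'Ann', email: 'ann@example.com' })
puts replica.email_for(1)
```

Once billing reads only from its replica, the shared table is no longer shared and the two candidates can move to separate databases.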

Step 2: Isolate the service candidate as a module

Dependencies should be pointing in one direction

Each service candidate should only read and write its own DB

All other DB access should go through the owning module

At this point the service candidates can be deployed independently

Also the databases can be on separate machines

Step 3: Introduce facades between service candidates

This class will become your service gateway
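A minimal sketch of such a facade, assuming a hypothetical "pricing" service candidate. The class names and tax rates are made up for illustration.

```ruby
# Internals of the service candidate; nobody outside should touch these.
class TaxTable
  RATES = { 'HU' => 0.27, 'DE' => 0.19 }.freeze # illustrative values

  def rate_for(country)
    RATES.fetch(country)
  end
end

class PriceCalculator
  def initialize(tax_table)
    @tax_table = tax_table
  end

  def gross(net, country)
    (net * (1 + @tax_table.rate_for(country))).round(2)
  end
end

# The facade: the only class other service candidates are allowed to call.
class PricingFacade
  def initialize
    @calculator = PriceCalculator.new(TaxTable.new)
  end

  def gross_price(net, country)
    @calculator.gross(net, country)
  end
end

puts PricingFacade.new.gross_price(100.0, 'HU')
```

Everything behind the facade can change freely, since callers only ever see `gross_price`.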

Step 4: Split the facade to a client and a server side class

Make sure that only plain old data structures are passed between the two classes
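The split might look like the sketch below, again using a hypothetical pricing facade. Hashes of strings and numbers play the role of plain old data structures; both halves still live in the same process at this point.

```ruby
# Server side: owns the logic, accepts and returns only plain data.
class PricingServerFacade
  RATES = { 'HU' => 0.27 }.freeze # illustrative value

  def gross_price(request)
    net = request.fetch('net')
    rate = RATES.fetch(request.fetch('country'))
    { 'gross' => (net * (1 + rate)).round(2) }
  end
end

# Client side: turns the caller's arguments into plain data and back.
class PricingClientFacade
  def initialize(server)
    @server = server
  end

  def gross_price(net, country)
    response = @server.gross_price({ 'net' => net, 'country' => country })
    response.fetch('gross')
  end
end

client = PricingClientFacade.new(PricingServerFacade.new)
puts client.gross_price(100.0, 'HU')
```

Since nothing but plain data crosses the boundary, the direct method call between the two classes can later be replaced by a remote call without touching either side's callers.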

Step 5: Introduce an abstract factory that can build the facade and the data structures

Note that up until now all our tests were kept green
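A sketch of the factory from Step 5, under the assumption that Ruby's duck typing stands in for an explicit abstract base class. `LocalPricingFactory`, `LocalPricingFacade`, `Checkout` and the flat tax rate are all hypothetical.

```ruby
# Concrete product: the in-process facade used so far.
class LocalPricingFacade
  def gross_price(request)
    { 'gross' => (request['net'] * 1.27).round(2) } # made-up flat tax rate
  end
end

# Concrete factory: builds the facade and the data structures it accepts.
class LocalPricingFactory
  def create_facade
    LocalPricingFacade.new
  end

  def create_request(net)
    { 'net' => net }
  end
end

# Application code depends only on the factory's interface.
class Checkout
  def initialize(factory)
    @factory = factory
  end

  def total(net)
    request = @factory.create_request(net)
    @factory.create_facade.gross_price(request).fetch('gross')
  end
end

puts Checkout.new(LocalPricingFactory.new).total(100.0)
```

`Checkout` never names a concrete facade or request class, so a different factory can be handed in later without changing it.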

Step 6: Copy the facade's interface into a Thrift definition

Run the compiler

Step 7: Create an implementation of the abstract factory that returns the thrift objects

With the new factory you have an independent service

Your tests will still run with the original factory
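The second factory can be sketched without Thrift installed by letting a JSON round trip play the part of the network. In a real setup the facade would wrap the generated Thrift client and `create_request` would return the generated struct; every name below is hypothetical.

```ruby
require 'json'

# Stand-in for the remote service process: receives text, returns text.
class FakeRemotePricingService
  def handle(raw_request)
    request = JSON.parse(raw_request)
    JSON.generate({ 'gross' => (request['net'] * 1.27).round(2) }) # made-up rate
  end
end

# Client-side facade with the same interface as the in-process one.
class RemotePricingFacade
  def initialize(service)
    @service = service
  end

  def gross_price(request)
    JSON.parse(@service.handle(JSON.generate(request)))
  end
end

# Same interface as the original factory, different products.
class RemotePricingFactory
  def initialize(service = FakeRemotePricingService.new)
    @service = service
  end

  def create_facade
    RemotePricingFacade.new(@service)
  end

  def create_request(net)
    { 'net' => net }
  end
end

factory = RemotePricingFactory.new
puts factory.create_facade.gross_price(factory.create_request(100.0)).fetch('gross')
```

Choosing which factory to construct at startup is then the only place where the application knows whether pricing runs in-process or as a separate service.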

Sounds simple, but there is a catch...

There is no out of the box solution to inject a dependency through a remote call

Design your services with that in mind
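One way to design around this, sketched under the assumption that the injected dependency is a small piece of behaviour: send a name that the service resolves on its side instead of sending the object. `Marshal` is used here only as a convenient stand-in for "can this cross a process boundary"; the strategy names are invented.

```ruby
# Behaviours the service knows about, addressable by name.
ROUNDING_STRATEGIES = {
  'floor' => ->(x) { x.floor },
  'round' => ->(x) { x.round }
}.freeze

# In-process you could inject a lambda; across a wire you can only send data.
def serializable?(value)
  Marshal.dump(value)
  true
rescue TypeError
  false
end

# Service side: look the behaviour up by name instead of receiving it.
def apply_rounding(strategy_name, amount)
  ROUNDING_STRATEGIES.fetch(strategy_name).call(amount)
end

puts serializable?(->(x) { x.floor }) # a lambda cannot be serialized
puts serializable?('floor')           # a name can
puts apply_rounding('floor', 12.7)
```

In a Thrift definition the name would typically become an enum, which keeps the set of allowed behaviours explicit in the interface.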