Mimic
An API Compatible Mock Service For OpenStack
Lekha
software developer in test
(at) Rackspace
github: lekhajee freenode irc: lekha twitter: @lekha_j1
Lekha: Hi, I am Lekha Jeevan, a software developer in test at Rackspace
Glyph
software developer
(at) Rackspace
github: glyph freenode irc: glyph twitter: @glyph
Glyph: and I'm Glyph, also a software developer at Rackspace
What?
Lekha: Today Glyph and I are here to talk about Mimic, an
open source framework that allows for testing of OpenStack and
Rackspace APIs.
Mimic has been making testing across products at Rackspace a
cakewalk. And we think it could do the same for OpenStack-backed
applications someday. We are not there yet, but your
contributions will help us get there soon.
Who?
Lekha: I originally created Mimic to help with testing Rackspace
Auto Scale, because I hate waiting for tests to run. Also, as I was
creating it I realized I wanted to make something reusable for
projects across Rackspace and OpenStack. And Glyph volunteered to
help.
Glyph: I saw Mimic had a lot of promise but needed some
help, so I came onto the project and I've been helping improve
its architecture, making it more generic and modular.
Sneak Peek
Lekha: Before we even begin to dive into the details, let us take a
quick sneak peek to see how easy it is to get started with Mimic.
Lekha: It is only a three-step process! In a virtualenv, we
pip install mimic, run mimic, and hit the endpoint!
Mimic returns the authentication endpoint, which we can use to
authenticate and get a service catalog containing the (OpenStack)
services that Mimic implements.
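Lekha: As a minimal sketch of those three steps (assuming a
virtualenv, the requests library, and Mimic listening on port 8900
as in the demos later; any credentials will do, since Mimic only
pretends to authenticate):

# Step 1: pip install mimic   Step 2: run `mimic`   Step 3: hit the endpoint
import requests

auth_body = {"auth": {"passwordCredentials": {"username": "username",
                                              "password": "password"},
                      "tenantName": "11111"}}
response = requests.post("http://localhost:8900/identity/v2.0/tokens",
                         json=auth_body)

# Print the (mocked) services advertised in the returned service catalog.
for entry in response.json()["access"]["serviceCatalog"]:
    print(entry["type"], entry["name"])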
Why?
Lekha: While today Mimic is for general-purpose testing, it was originally
created for testing a specific application: Rackspace Auto Scale.
So let's talk about what Auto Scale is and why we needed Mimic.
Rackspace Auto Scale
An open source project
Source: https://github.com/rackerlabs/otter
Lekha: Rackspace Auto Scale is an open source project: a solution
that Rackspace uses to automate getting the right
amount of compute capacity for an application, by creating (scaling up)
and deleting (scaling down) servers and associating them with
load balancers.
Dependencies
(of Auto Scale)
Lekha: In order to perform these tasks, Auto Scale speaks to three
back-end Rackspace APIs.
RackspaceIdentity
Lekha: Identity for authentication and impersonation.
RackspaceCloud Servers
Lekha: Cloud Servers for provisioning and deleting servers.
RackspaceCloud Load Balancers
Lekha: Rackspace Cloud Load Balancers for adding servers to and removing
them from load balancers as they are created or deleted.
Rackspace Identity
is API compatible with
Openstack Identity v2
Lekha: Rackspace Identity is API-compatible with (CLICK) OpenStack
Identity v2.
Rackspace Cloud Servers
is powered by
Openstack Compute
Lekha: Rackspace Cloud Servers is powered by (CLICK) OpenStack Compute.
Rackspace Cloud Load Balancers
is a
Custom API
Lekha: And Rackspace Cloud Load Balancers is a custom API.
Testing
(for Auto Scale)
Lekha: As Auto Scale was interacting with so many other systems,
testing Auto Scale did not mean just testing the features of
Auto Scale, but also verifying that, if any of the systems it depended on
did not behave as expected, Auto Scale did not crumble and crash,
but was consistent and able to handle these upstream failures
gracefully.
So, there were two kinds of tests for Auto Scale:
Functional
API contracts
Lekha: One was the functional tests to validate the API
contracts. These tests verified the responses of the Auto Scale
API calls given valid or malformed requests.
System Integration
↱
Identity
Auto Scale
→
Compute
↳
Load Balancers
Lekha: And the other was the system integration tests. These
were more complex. These tests verified integration between Auto
Scale and Identity, Compute, and Load Balancers.
System Integration
Success
Failure
Lekha: For example: when a scaling group was created, one such
test would verify that the servers on that group were provisioned
successfully. That is, that the servers went into an 'active'
state and were then added as nodes to the load balancer on the
scaling group. (DOUBLE CLICK)
Or, if a server went into an error state (yes! that can
happen!), that Auto Scale was able to re-provision that server
successfully, and then add that active server to the load balancer
on the scaling group.
Testing Problems
Test run time ≥ server build time
Lekha: All these tests were set up to run against the real
services. And... here are some observations I had whilst writing
the tests: (CLICK)
Servers could take over a minute, or ten minutes, or
longer to provision. And the tests would run that much longer.
BUILD → ACTIVE ERROR ACTIVE
Lekha: Sometimes, the tests would fail due to random upstream failures!
Like, a test would expect a building server to go into an 'active' state,
but it would (CLICK) go into an ERROR state.
unknown errors
Lekha: And such negative scenarios, like actually testing how
Auto Scale would behave if the server did go into an 'error' state,
could not be tested, because they could not be
reproduced consistently.
However...
Improving test coverage
Tests → gate
Lekha: However, (CLICK) the overall test coverage was improving. And I
continued to add tests, oblivious of the time it was taking to run
the entire test suite!
Later, (CLICK) we started using these tests as a gate in the
Auto Scale merge pipeline.
And...
Slow, flaky tests
Unhappy peers
Lekha: And, (CLICK) the tests were running for so long and were sometimes flaky.
Nobody dared to run these tests locally! Not even me, when I was
adding more tests! (CLICK)
Also, our peers from the Compute and Load Balancers teams, whose
resources we were using up for our Auto Scale testing, were
not happy! So much so that we were pretty glad we were in a
remote office!
We've Had Enough!
(on Auto Scale)
Lekha: But! We had had enough! This had to change! We needed
something to save us from these slow, flaky tests!
There And Back Again
Specific
→
General
Auto Scale Mimic
General
→
Specific
Mocking Failure Mimic Mimicking OpenStack
Glyph: Now that we've had enough, how are we going to solve this
problem? (CLICK) Since we've been proceeding from the specific
case of Auto Scale to the general utility of Mimic, (click) let's
go back to the general problem of testing for failure, and proceed
to the specific benefits that Mimic provides.
General →
Negative Path Testing
Glyph: Whenever you have code that handles failures, (CLICK) you
need to have tests to ensure that that code works properly.
Real Services
FAIL
Glyph: And if you have code that talks to external services, those
services are going to fail, and you're going to need to write code
to handle that.
But Not When You
WANT THEM TO
Glyph: But if your only integration tests are against real versions
of those external services, then only your unit tests are going to
give you any idea of whether you have handled those failure cases
correctly.
Succeeding
At Success
Glyph: Your positive-path code - the code that submits a request
and gets the response that it expects - is going to get lots of
testing in the real world. Services usually work, and when they
don't, the whole business of service providers is to fix it so they
do. So most likely, the positive-path code is going to get
exercised all the time and you will have plenty of opportunities to
flush out bugs.
Means Failing
At Failure
Glyph: If you test against real services, your negative-path code
will only get invoked in production when there's a real error. If
everything is going as planned, this should be infrequent, which is
great for your real service but terrible for your test coverage.
Mimic Succeeds
At Failure!
Glyph: It's really important to get negative-path code right. If
all the external services you rely on are working fine,
then it's probably okay if your code has a couple of bugs.
You might be able to manually work around them.
😈 ☁
(Production)
Glyph: But if things are starting to fail with some regularity in
your cloud - that is to say - if you are using a cloud - that is
exactly the time you want to make sure your system is
behaving correctly: accurately reporting the errors, measuring the
statistics on those errors, and allowing you to stay on top of
incident management for your service and your cloud.
😇 ☁
(Staging)
Glyph: Even worse, when you test against a real service, you are
probably testing against a staging instance. And, if your staging
instance is typical, it probably doesn't have as much hardware, or
as many concurrent users, as your production environment. Every
additional piece of hardware or concurrent user is another
opportunity for failure, so that means your staging environment is
even less likely to fail.
import unittest
Glyph: I remember the bad old days of the 1990s when most projects
didn't have any unit tests. Things are better than that now.
OpenStack itself has great test coverage. We have unit tests for
individual components and integration tests for testing real
components together.
test_stuff ... [OK]
Glyph: We all know that when you have code like this:
try:
    result = service_request()
except:
    return error
else:
    return ok(result)
Glyph: ... that we need to write tests for this part:
try:
    result = service_request()
except:
    return error
else:
    return ok(result)
Glyph: ... and one popular way to get test coverage for those error
lines is by writing a custom mock for it in your unit tests.
Glyph: So if we can't trust real systems for error conditions, why
isn't it sufficient to simply trust your unit tests to cover error
conditions, and have your integration tests for making sure that
things work in a more realistic scenario?
For those of you who don't recognize it, this is the Mock Turtle
from Alice in Wonderland. As you can see, he's not quite the
same as a real turtle, just like your test mocks aren't quite the
same as a real system.
Glyph: There's a reason that the mock turtle is crying. He knows
that he can't quite do the things a real turtle can do, just like
your test mocks can't quite replace those real systems.
Let's take a specific example from OpenStack Compute.
if not os.chdir(ca_folder(project_id)):
    raise exception.ProjectNotFound(
        project_id=project_id)
Glyph: In June of this year, OpenStack
Compute introduced
a bug making it impossible to revoke a certificate. The
lines of code at fault were these two additions here.
This is not a criticism of Nova itself;
the
bug has already been fixed. My point is that they fell into
a very common trap.
if not os.chdir(ca_folder(project_id)):
    raise exception.ProjectNotFound(
        project_id=project_id)
Glyph: The bug here is that chdir does not actually
return a value.
@mock.patch.object(os, 'chdir', return_value=True)
def test_revoke_cert_process_execution_error(self):
    "..."

@mock.patch.object(os, 'chdir', return_value=False)
def test_revoke_cert_project_not_found_chdir_fails(self):
    "..."
Glyph: Because the unit tests introduced with that change construct
their own mocks for chdir, Nova's unit tests properly
cover all the code, but the code is not integrated with a system
that is verified in any way against what the real system (in this
case, Python's chdir) does.
Glyph: In this specific case, Nova might have simply
tested against a real directory structure in the file system,
because relative to the value of testing against a real
implementation, "New Folder" is not a terribly expensive operation.
Glyph: However, standing up an OpenStack cloud is significantly
more work than running mkdir. If you are developing
an application against OpenStack, deploying a real cloud to test
against can be expensive, error-prone, and slow, as Auto Scale's
experience shows.
The Best Of
Both Worlds?
Glyph: Creating a one-off mock for every test is quick, but error-prone.
Good mocks rapidly become a significant maintenance burden in
their own right. Auto Scale needed something that could produce
all possible behaviors like a unit-test mock, but ensure those
behaviors accurately reflected a production environment. (CLICK) It
should be something maintained as a separate project, not part of a
test suite, that can have its own tests and code review to ensure
its behavior is accurate.
→Specific
Mimic
Glyph: Since we've been proceeding from the general to the
specific, (CLICK) right here, where we need a realistic mock of a
back-end OpenStack service, is where the specific value of Mimic
comes in.
Mimic
Version 0.0
Lekha: The first version of Mimic was built as a stand-in service
for Identity, Compute and Rackspace Load balancers, the services
that Auto Scale depends on.
Pretending
...
Lekha: The essence of Mimic is pretending. The first thing that you
must do to interact with it is to...
Pretending
to authenticate
Lekha: ...pretend to authenticate.
Mimic does not validate credentials - all authentications will
succeed. As with the real Identity endpoint, Mimic's identity
endpoint has a service catalog which includes endpoints for all the
services implemented within Mimic.
A well behaved OpenStack client will use the service catalog to
look up URLs for its service endpoints. Such a client will only
need two pieces of configuration to begin communicating with the
cloud, i.e. credentials and the identity endpoint. A client
written this way will only need to change the Identity endpoint to
be that of Mimic.
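Lekha: Here is a hedged sketch of what such a well-behaved client
looks like against Mimic (assuming the requests library and the
demo port 8900; the field names follow the standard Identity v2
token response that Mimic mimics). The only configured URL is the
Identity endpoint; everything else is discovered from the catalog.

import requests

AUTH_URL = "http://localhost:8900/identity/v2.0/tokens"
credentials = {"auth": {"passwordCredentials": {"username": "username",
                                                "password": "password"},
                        "tenantName": "11111"}}
access = requests.post(AUTH_URL, json=credentials).json()["access"]
token = access["token"]["id"]

# Discover the compute endpoint from the service catalog rather than
# configuring its URL directly.
compute = next(entry for entry in access["serviceCatalog"]
               if entry["type"] == "compute")
servers_url = compute["endpoints"][0]["publicURL"] + "/servers"

# List this tenant's (pretend) servers.
print(requests.get(servers_url, headers={"X-Auth-Token": token}).json())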
Pretending
to Boot Servers
Lekha: When you ask Mimic to create a server, it pretends to create
one. This is not like stubbing with static responses: when Mimic
pretends to build a server, it remembers the information about that
server and will tell you about it in the subsequent requests.
Pretending
is faster
Lekha: Mimic was originally created to speed things up. So it was
very important that it be fast, both to respond to requests and for
developers to set up.
in-memory
Lekha: It uses in-memory data structures.
minimal dependencies
(almost entirely pure Python)
Lekha: with minimal software dependencies, almost entirely pure Python.
Service Dependencies
Lekha: With no service dependencies
Configuration
Lekha: and no configuration
self-contained
Lekha: And is entirely self-contained.
Demo!
Nova command-line client
Lekha: Let's see how we can run the Python nova command-line client against Mimic.
config.sh
export OS_USERNAME=username
export OS_PASSWORD=password
export OS_TENANT_NAME=11111
export OS_AUTH_URL=http://localhost:8900/identity/v2.0/tokens
Lekha: Here is the config file that holds
the environment variables required for the OpenStack
command-line clients.
config.sh
export OS_USERNAME=username
export OS_PASSWORD=password
export OS_TENANT_NAME=11111
export OS_AUTH_URL=http://localhost:8900/identity/v2.0/tokens
Lekha: We have set a random username, password
and tenant name, as Mimic only pretends to authenticate.
config.sh
export OS_USERNAME=username
export OS_PASSWORD=password
export OS_TENANT_NAME=11111
export OS_AUTH_URL=http://localhost:8900/identity/v2.0/tokens
Lekha: And the auth URL is set to be that of Mimic.
Now, let's continue where we left off with our first demo. So we
already have an instance of Mimic running.
Lekha: Let's pip install the Python nova client and ensure the
config file has the AUTH_URL pointing to that of Mimic. We source
the config file and we see that no servers exist on Mimic start-up! Let's
create a server with a random flavor and image. The server created
is in an active state. Let's create a second server, which
is also built immediately and is in an active state. Now we have 2
active servers that Mimic knows of. Let's delete the second
server... and now Mimic knows of the deleted server and has only
the one server remaining.
Using Mimic
(on Auto Scale)
Lekha: We did the same thing with Auto Scale. We pointed the tests
and the Auto Scale API at an instance of Mimic.
The Results!
(Functional tests using Mimic)
Lekha: This reduced the test time dramatically!
Before Mimic, the functional tests would take...
Functional Tests:
15 minutes
against a real system
vs.
30 seconds
against Mimic
Lekha: (CLICK) 15 minutes to complete, and now they run in (CLICK) less than 30
seconds!
The Results!
(Integration tests using Mimic)
Lekha: In the system integration tests, if one of the servers in
the test remained in "building" for fifteen minutes longer than
usual, then the tests would run fifteen minutes slower.
Integration Tests:
3 hours or more
against a real system
vs.
3 minutes
against Mimic
Lekha: These tests took (CLICK) over 3 hours to complete, and using
Mimic this went down to (CLICK) less than 3 *minutes*,
consistently!
✈
Lekha: All our dev VMs are now configured to run against
Mimic. One of our devs from the Rackspace Cloud Intelligence
team calls this "Developing on Airplane Mode!", as we can work
offline without having to worry about uptimes of the upstream
systems and get immediate feedback on the code being written.
What about negative paths?
Glyph: But Lekha, what about all the negative-path testing stuff I
was talking about before? Does Mimic simulate errors? How did
this dev VM test Auto Scale's error conditions?
Mimic does simulate errors
Lekha: Well Glyph, I am as pleased as I am surprised that you asked
that. Mimic does simulate errors.
Error injection using metadata
Lekha: Earlier, when I said Mimic pretends to create a server, that
wasn't entirely true - sometimes Mimic pretends to *not* create a
server. It inspects the metadata provided during the creation of the
server, and sets the state of the server accordingly.
Let's go back to the demo and see how this can be done.
Lekha: So, we had the one active server. Now, let's create a
server with the `metadata`: `"server_building": 30`. This will
keep the server in the build state for 30 seconds. Now we have 2
servers: the active and the building server. We can also create a server
that goes into an error state, using the `metadata`: `"server_error":
True`. As you can see, we now have 3 different servers, with 3
different states.
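Lekha: As a rough sketch of the requests behind that demo (assuming
the requests library; `servers_url` and `token` are hypothetical
placeholders standing in for the values discovered from the service
catalog earlier, and the metadata keys are the ones from the demo,
though the exact value types Mimic expects may differ):

import requests

# Hypothetical placeholders: in practice, discover these from the
# service catalog and token, as in the earlier sketch.
servers_url = "http://localhost:8900/mimicking/NovaApi-.../ORD/v2/tenant_id_.../servers"
token = "token-id-from-the-identity-response"

def create_pretend_server(name, metadata):
    # A normal Nova create-server body; only the metadata is special.
    body = {"server": {"name": name,
                       "imageRef": "any-image",  # Mimic only pretends, so
                       "flavorRef": "2",         # any image/flavor will do
                       "metadata": metadata}}
    return requests.post(servers_url, json=body,
                         headers={"X-Auth-Token": token}).json()

# Stays in the BUILD state for 30 (simulated) seconds.
create_pretend_server("slow-server", {"server_building": "30"})

# Goes straight into an ERROR state.
create_pretend_server("broken-server", {"server_error": "True"})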
Retry On Errors
Lekha: For the purposes of Auto Scale it was important that we
had the right number of servers on a scaling group, even if a number
of attempts to create one failed. We chose to use
metadata for error injection so that requests with
injected errors could also be run against real services. For Auto
Scale, the expected end result is the same number of servers
created, irrespective of the number of failures. But this behavior
may also be useful to many other applications, because retrying is
a common pattern for handling errors.
Mimic 0.0 was...
Too Limited
Lekha: (PAUSE) However, the first implementation of Mimic had some flaws.
It was fairly Rackspace-specific and only implemented the endpoints
of the services that Auto Scale depends upon. And they were all
implemented as part of Mimic's core. It ran each service on a
different port, meaning that for N endpoints you would need not
just N port numbers, but N *consecutive* port numbers. It allowed
for testing error scenarios, but only using the metadata. This was
not useful for all cases, for example, for a control panel that
does not allow the user to enter any metadata.
Mimic 0.0 was...
Single Region
Lekha: Mimic also did not implement multiple regions. It used
global variables for storing all state, which meant that it was
hard to add additional endpoints with different state in the same
running Mimic instance.
Beyond Auto Scale:
Refactoring Mimic
Glyph: Mimic had an ambitious vision: to be a one-stop mock for all
OpenStack and Rackspace services that needed fast integration
testing. However, its architecture at the time severely limited
the ability of other teams to use it or contribute to it. As Lekha
mentioned, it was specific not only to Rackspace but to Auto Scale.
YAGNI
Glyph: On balance, Mimic was also extremely simple. It followed
the You Aren't Gonna Need It principle of extreme programming very
well, and implemented just the bare minimum to satisfy its
requirements, so there wasn't a whole lot of terrible code to throw
out or much unnecessary complexity to eliminate.
E(ITO)YAGNI
Glyph: There is, however, a corollary to YAGNI, which is
E(ITO)YAGNI: Eventually, It Turns Out, You *Are* Going To Need It.
As Mimic grew, other services within Rackspace wanted to make use
of its functionality, and a couple of JSON response dictionaries in
global variables were not going to cut it any more.
Plugins!
Glyph: So we created a plugin architecture.
Identity
Is the Entry Point
(Not A Plugin)
Glyph: Mimic's Identity endpoint is the top-level entry point to
Mimic as a service. Every other URL to a mock is available from
within the service catalog. As we were designing the plugin API,
it was clear that this top-level Identity endpoint needed to be the
core part of Mimic, and plug-ins would each add an entry for
themselves to the service catalog.
http://localhost:8900/mimicking/NovaApi-78bc54/ORD/v2/tenant_id_f15c1028/servers
Glyph: URLs within Mimic's service catalog all look similar. In
order to prevent conflicts between plugins, Mimic's core
encodes the name of your plugin and the region name specified by
your plugin's endpoint. Here we can see what a URL for the Compute
mock looks like. (CLICK) This portion of the URL, which identifies
which mock is being referenced, is handled by Mimic itself, so that
it's always addressing the right plugin. (CLICK) Then there's the
part of the URL that your plugin itself handles, which identifies
the tenant and the endpoint within your API.
Plugin Interface: “API Mock”
Glyph: Each plugin is an API mock, which has only two methods:
class YourAPIMock():
    def catalog_entries(...)
    def resource_for_region(...)
(that's it!)
Glyph: (click) catalog_entries (click) and resource_for_region. (click) That's it!
def catalog_entries(self,
                    tenant_id):
Glyph: catalog_entries takes a tenant ID and returns the
entries in Mimic's service catalog for that particular API mock.
APIs have catalog entries for each API type, which in turn have
endpoints for each virtual region they represent.
return [
    Entry(
        tenant_id, "compute", "cloudServersOpenStack",
        [
            Endpoint(tenant_id, region="ORD",
                     endpoint_id=text_type(uuid4()),
                     prefix="v2"),
            Endpoint(tenant_id, region="DFW",
                     endpoint_id=text_type(uuid4()),
                     prefix="v2")
        ]
    )
]
Glyph: This takes the form of an iterable of a class called
(CLICK) Entry, each of which has (CLICK) a tenant ID,
(CLICK) a type, (CLICK) a name, (CLICK) and a collection of
(CLICK) Endpoint objects, each (CLICK) containing (CLICK)
the name of a pretend region and (CLICK) a URI version prefix that
should appear in the service catalog after the generated service
URL but before the tenant ID.
def resource_for_region(
        self, region, uri_prefix,
        session_store
):
    return (YourRegion(...)
            .app.resource())
Glyph: resource_for_region takes (CLICK) the name of a
region, (CLICK) a URI prefix - produced by Mimic's core to make the URI
for each service unique, so you can generate URLs to your services
in any responses which need them - (CLICK) and a session store
where the API mock may look up the state of the resources it pretended
to provision for the respective
tenants. (CLICK) resource_for_region returns an HTTP resource
associated with the top level of the given region. This resource
then routes requests to any tenant-specific resources
associated with the full URL path.
class YourRegion():
    app = MimicApp()

    @app.route('/v2/<string:tenant_id>/servers',
               methods=['GET'])
    def list_servers(self, request, tenant_id):
        return json.dumps({"servers": []})
Glyph: Once you've created a resource for your region, it has
routes for the part of the URI that starts after the prefix
handled by Mimic's core. Here you can see what the Nova "list servers" endpoint would
look like using Mimic's API; as you can see, it's not a lot of work
at all to return a canned response. It would be a little beyond
the scope of this brief talk to do a full tutorial of how resource
traversal works in the web framework that Mimic uses, but hopefully
this slide - which is a fully working response - shows that it
is pretty easy to get started.
Tell Mimic
To Load It
Glyph: Now that we have most of a plugin written, let's get Mimic
to load it up.
# mimic/plugins/your_plugin.py
from your_api import YourAPIMock
the_mock_plugin = YourAPIMock()
Glyph: To register your plugin with Mimic, you just need to drop an
instance of it into any module of the mimic.plugins
package.
Mimic Remembers
(until you restart it)
Glyph: This, of course, just shows you how to create ephemeral,
static responses - but as Lekha said previously, Mimic doesn't just
create fake responses; it remembers - (CLICK) in memory - what
you've asked it to do.
session = session_store.session_for_tenant_id(tenant_id)

class YourMockData():
    "..."

your_data = session.data_for_api(your_api_mock,
                                 YourMockData)
Glyph: That "session_store" object passed to resource_for_region is
the place you can keep any relevant state. It gives you a
per-tenant session object, and then you can ask that session for
any mock-specific data you want to store for that tenant. All
session data is created on demand, so you pass in a callable which
will create your data if no data exists for that tentant/API pair.
session = session_store.session_for_tenant_id(tenant_id)

from mimic.plugins.other_mock import (other_api_mock,
                                      OtherMockData)

other_data = session.data_for_api(other_api_mock,
                                  OtherMockData)
Glyph: Note that you can pass other API mocks as well, so if you
want to inspect a tenant's session state for other services and
factor that into your responses, it's easy to do so. This pattern
of inspecting and manipulating a different mock's data can also be
used to create control planes for your plugins, so that one plugin
can tell another plugin how and when to fail by storing information
about the future expected failure on its session.
Errors As A Service
Glyph: We are still working on the first error-injection endpoint
that works this way, by having a second plugin tell the first what
its failures are, but this is an aspect of Mimic's development we
are really excited about, because that control plane API also
doubles as a memory of the unexpected, and even potentially
undocumented, ways in which the mocked service can fail.
Error Conditions Repository
Lekha: Anyone testing a product will run into unexpected
errors. That's why we test! But we don't know what we don't know, and
can't be prepared for this ahead of time, right?
Discovering Errors
Against Real Services
Lekha: When we were running the Auto Scale tests against Compute,
we began to see some one-off errors. Like, when provisioning a
server, the test expected the server to go into a building state for
some time before it became active, but it would remain in the building
state for over an hour, or would sometimes even go into an error
state afterwards.
Record Those Errors
Within Mimic
Lekha: Auto Scale had to handle such scenarios gracefully and the
code was changed to do so. And Mimic provided a way to test
this consistently.
Discover More Errors
Against Real Services
Lekha: However, like I said, we don't know what we don't know. We
were not anticipating finding any other such errors. But there were more!
And... it was a slow process for us to uncover such errors, as we
tested against the real services.
Record Those Errors
For The Next Project
Lekha: And we continued to add such errors to Mimic.
Now, wouldn't it be great if not every client that depended on
a service had to go through this same cycle? If not everyone
had to find all the possible error conditions in the service by
experience, and deal with them at the pace at which they
occur?
Share A Repository
For Future Projects
Lekha: What if we had a repository of all such known errors, that
everyone contributes to? The next person using the plugin could
use the existing ones, and ensure their application behaves
consistently irrespective of the errors. And be able to add any new
ones to it.
Mimic Is A Repository
Lekha: Mimic is just that: a repository of all known responses,
including the error responses.
Mimic Endpoint
/mimic/v1.0/presets
Lekha: Mimic has an endpoint, `presets`, that today lists all
the metadata-related error conditions that can be simulated using Mimic.
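Lekha: For instance, a minimal sketch of querying it (assuming the
requests library, the demo port, and that the endpoint answers a
simple GET; the exact shape of the response depends on your Mimic
version):

import requests

# List the metadata-based error conditions Mimic can currently simulate.
print(requests.get("http://localhost:8900/mimic/v1.0/presets").json())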
Control
Glyph: In addition to storing a repository of errors, Mimic allows
for finer control of behavior beyond simple success and error. You
can determine the behavior of a mimicked service in some detail.
Now & Later
Glyph: We're not just here today to talk about exactly what Mimic
offers right now, but where we'd like it to go. And in that spirit
I will discuss one feature that Mimic has for controlling behavior
today, and one which we would like to have in the future.
Now
Glyph: Appropriately enough, since I'm talking about things now and
things in the future, the behavior-control feature I'd like to talk
about that Mimic has right now is the ability to control time.
now()
Glyph: That is to say: when you do something against Mimic that
will take some time, such as building a server, time does not
actually pass ... for the purposes of that operation.
/mimic/v1.1/tick
Glyph: Instead of simply waiting 10 seconds, you can hit this
second out-of-band endpoint, the "tick" endpoint ...
{
    "amount": 1.0
}
Glyph: with a payload like this. It will tell you that time has
passed, like so:
{
    "advanced": 1.0,
    "now": "1970-01-01T00:00:01.000000Z"
}
Glyph: Now, you may notice there's something a little funny about
that timestamp - it's suspiciously close to midnight, January
first, 1970. Mimic begins each restart thinking it's
1970, at the Unix epoch; if you want to advance the clock, just
plug in the number of seconds since the epoch as the "amount" and
your Mimic will appear to catch up to real time.
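Glyph: As a minimal sketch (assuming the requests library, the demo
port, and that the tick endpoint accepts a POST, as the payload
above suggests):

import requests

# Out of band: tell Mimic that one second of simulated time has passed.
response = requests.post("http://localhost:8900/mimic/v1.1/tick",
                         json={"amount": 1.0})
print(response.json())  # e.g. {"advanced": 1.0, "now": "1970-01-01T00:00:01.000000Z"}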
{
    "server": {
        "status": "BUILD",
        "updated": "1970-01-01T00:00:00.000000Z",
        "OS-EXT-STS:task_state": null,
        "user_id": "170454",
        "addresses": {},
        "...": "..."
    }
}
Glyph: If you've previously created a server with "server_building"
metadata that tells it to build for some number of seconds, and you
hit the 'tick' endpoint telling it to advance time the
server_building number of seconds...
{
    "server": {
        "status": "ACTIVE",
        "updated": "1970-01-01T00:00:01.000000Z",
        "OS-EXT-STS:task_state": null,
        "user_id": "170454",
        "addresses": {},
        "...": "..."
    }
}
Glyph: that server (and any others) will now show up as "active",
as it should. This means you can set up very long timeouts, and
have servers behave "realistically", but in a way where you can
test several hours of timeouts at a time.
--realtime
Glyph: You can ask Mimic to actually pay attention to the real
clock with the --realtime command-line option; that
disables this time-advancing endpoint, but it will allow any test
suites that rely on real time passing to keep running.
Later
Glyph: Another feature that isn't implemented yet, that we hope to
design later, is the ability to inject errors ahead of time, using
a separate control-plane interface which is not part of a mock's
endpoint.
Error Injection
Glyph: We've begun work on a branch doing this for Compute, but we
feel that every service should have the ability to inject arbitrary
errors.
Error Injection
Currently: Metadata-Based
Glyph: As Lekha explained, Mimic can already inject some errors by
supplying metadata within a request itself.
Error Injection
Currently: In-Band
Glyph: However, this means that in order to cause an error to
happen, you need to modify the request that you're making to mimic,
which means your application isn't entirely unmodified.
Error Injection
Future: Separate Catalog Entry
Glyph: What we'd like to do in the future is to put the
error-injection control plane into the service catalog, with a
special entry type so that your testing infrastructure can talk to
it.
Error Injection
Future: Out-Of-Band
Glyph: This way, your testing tool would authenticate to mimic, and
tell Mimic to cause certain upcoming requests to succeed or fail
before the system that you're testing even communicates with
it. Your system would not need to relay any expected-failure data
itself, and so no metadata would need to be passed through.
Error Injection
Future: With Your Help
Glyph: What we'd really like to build with these out-of-band
failures, though, is not just a single feature, but an API that
allows people developing applications against openstack to make
those applications as robust as possible by easily determining how
they will react at scale, under load, and under stress, even if
they've never experienced those conditions. So we need you to
contribute the errors and behaviors that you have
experienced.
Even Later...
Glyph: Mimic is based on a networking framework ...
Glyph: ... some of you know which one I'm talking about ...
Even Later...
Future Possibilities,Crazy Features!
Glyph: ... which has such features as built-in DNS and SSH servers.
Even Later...
Real SSH Server
For Fake Servers
Glyph: It would be really cool if when a virtual server was booted,
the advertised SSH port really did give you access to an SSH
server, albeit one that can be cheaply created from a local shell
as a restricted user or a container deployment, not a real virtual
machine.
Even Later...
Real DNS Server
For Fake Zones
Glyph: Similarly, if we were to have a Designate mock, it would be
really cool to have real DNS entries.
Mimic for OpenStack
Lekha: Mimic can be the tool where you do not have to stand up
the entire dev stack to understand how an OpenStack API behaves.
Mimic can be the tool which enables an OpenStack developer to get
quick feedback on the code he/she is writing, and not have to go
through the gate multiple times to understand that "maybe I
should have handled that one error that the upstream system
decides to throw my way every now and then".
It's Easy!
Glyph: One of the things that I like to point out is that Mimic is
not real software. It's tiny, self-contained, doesn't need to
interact with a database, or any external services. Since it
mimics exclusively existing APIs, there are very few design
decisions. As a result, contributing to Mimic is a lot easier than
contributing to OpenStack proper.
We need your help!
Source: https://github.com/rackerlabs/mimic
Issues: https://github.com/rackerlabs/mimic/issues
Chat: ##mimic on Freenode
Lekha: So, please come join us in building Mimic. Together we can make
this a repository of all known responses (including errors!) for the
OpenStack APIs.
As we mentioned earlier, Mimic is open source and here is the
GitHub link to the repository.
All the features and issues we are working on, or planning to work
on in the near future, are under the issues tab on GitHub.
You can start by using Mimic and giving us your feedback. Or better
yet, by forking it and contributing to it, by adding plugins for
services that do not exist today!
Thank you!