The other side of DevOps: Virtualization to support JavaScript Developers

2015-jsDay

Slides for our presentation at jsDay 2015.

On GitHub: bithound / 2015-jsDay

The other side of DevOps:

Virtualization to support JavaScript Developers

Presented by Tony Thompson
www.bithound.io / @bithoundio

DevOps

Missing the point...

I've heard some people describe DevOps as 'sysadmins who program'. And that's not entirely true. This is from a random devops job posting I found on craigslist.

DevOps has been around since 1987?

Sysadmins have been programming since dinosaurs roamed the machine room. Who do you think uses languages like bash or perl or awk? Either DevOps has been around since 1987, or the term means something more.
In the 'bad old days', a dev team would finish up and toss a product over the wall to the systems team, who would then have the curse of deploying it. Kind of like Acme: poor Wile E. Coyote never knows what he's really going to get next. It's not the worst ad. There are plenty of 'give us the DevOps, no, we won't change our procedure' type jobs out there, where the 'hopeful' employer thinks that DevOps is just some bandaid they can apply to their current (broken) process to make it better.
So I guess what I'm trying to say is that as a DevOps guy, if a dev team comes to me with a finished product and says 'deploy this!', it's already too late. We're going to miss the deadline. This is especially true for node.js/io.js: Apache/nginx support for JS is pretty primitive, and OS distributions lag badly. As somebody with fifteen years of sysadmin experience, I hate running software that's not well maintained upstream. DevOps is a holistic approach. The goal is to get ahead of mere 'sysadmin' concerns. And that's not just a matter of applying developer tools to operations; it's also important to bring operations tools and culture into the developer world, early in the development process.

Tools

We've all gone to a talk about Vagrant that ends in "And now you never have to say 'but it runs on my machine'". This is not that talk.
There's nothing in this talk that couldn't be done in Puppet. Or PowerShell. Or as grunt scripts. bitHound tends to be a Mac/Linux shop, and bash is a cheap, fast, universal tool for us. I like using tools that let me duct-tape together other tools. So bash it is. Bash is the Cowbell of DevOps.
“The only DevOps tool is someone with the title ‘Director of DevOps’.”
-- @nathenharvey
At its heart, DevOps is an approach; a culture. To really do DevOps right, your entire organization has to be geared towards building deployable software. The entire point of DevOps is not to have silos; there really aren't any software tools that are DevOps-specific. You shouldn't choose new tools because they are more devops-y. And if you're under pressure to use a given tool 'because devops', then something isn't right. There is no tool that "gives you devops". However, Vagrant and bash are a good place to start.
  • Vagrant
  • Docker
  • VirtualBox
  • VMware
  • AWS?
Really, what you want in a virtualizer from an Ops perspective is the same thing you want in any other software from a dev's perspective: something that can be scripted, and that is supported in your organization.

Challenge 1: Virtualization is Slow!

Define slow. 'Slow' is a pretty vague term. Is it a CPU, network, storage, or memory constraint? I'm going to talk about speed here, but it's a little like talking about the top speed of a family minivan: it misses the point. Ultimately, speed isn't the point. Consistency and convention are.
http://mitchellh.com/comparing-filesystem-performance-in-virtual-machines Bigger is better. Here you can see that for small files, NFS outperforms VirtualBox's native shared folders. And that's actually operating-system specific. In our in-house tests, we were seeing linear performance with file size on Linux, and roughly constant time with file size on OS X.
http://mitchellh.com/comparing-filesystem-performance-in-virtual-machines But NFS doesn't outperform native when it comes to writes. A virtual disk, an NFS mount, and your virtualizer's shared folders all have different performance characteristics. For example, we have our code mounted in via NFS, but our app's scratch space is on a virtual HD that is not available to the host. Benchmark, benchmark, benchmark. Your app is a unique snowflake. Y'know how I know your app is a beautiful unique snowflake? It's written in JavaScript, and best practices for deploying JavaScript are still emerging.
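Since all of this depends on your workload, here's a hypothetical micro-benchmark, not our actual test suite; the paths and sizes are assumptions. Run inside the guest, it compares write throughput on the shared /vagrant mount against the guest's own disk:

#!/bin/bash
# Hypothetical micro-benchmark (run inside the guest): time a 256MB write
# to the shared /vagrant mount vs. the guest's native disk.
time dd if=/dev/zero of=/vagrant/bench.tmp bs=1M count=256 conv=fdatasync
time dd if=/dev/zero of=/tmp/bench.tmp bs=1M count=256 conv=fdatasync
rm -f /vagrant/bench.tmp /tmp/bench.tmp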

Some tasks can run on the host

#!/bin/bash
# Watch source files on the host, where file I/O is fast, and kick the
# app inside the guest on every change.
nodemon \
  -e js,html,css \
  --exec "bash ./restart.sh"
Also, it's possible to work around some of these issues. For example, we know file reads are slow inside the guest, right? That makes things like nodemon very slow too. But we don't have to run nodemon inside the VM. It's not part of our production setup! There's no reason we can't run it in the host environment instead. I'll get into where we'd stash this in the next section.
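The restart.sh above isn't shown in the deck; a minimal sketch, assuming the forever-based start/stop scripts from later in this talk, would just bounce the app inside the guest:

#!/bin/bash
# Hypothetical restart.sh, run on the host by nodemon: restart the app
# inside the guest. Assumes the forever setup shown later in this talk.
vagrant ssh --command "forever stop app" || true
vagrant ssh --command "cd /vagrant/app && forever --uid 'app' -a start ./bin/cli server start"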

Tweak your virtualizer

Vagrant.configure(2) do |config|

  config.vm.provider :virtualbox do |vb|
    vb.cpus = 3
    vb.memory = 2048
    vb.customize ["modifyvm", :id, "--natdnshostresolver1", "on"]
    vb.customize ["modifyvm", :id, "--natdnsproxy1", "on"]
  end

end
Most VMs can be tuned to meet your needs. Vagrant's default is (I think) 512MB of RAM and a single CPU. This kind of tuning helps, but regardless, your app will not be as fast inside a virtualizer, no matter which one, as it is on bare metal. But keep in mind, you're almost certainly deploying to a virtualized environment anyway, unless you own your own hardware.

Challenge 2: Virtualization is Not Transparent to Developers

Not if you have to run scripts inside the environment.
If your app is virtualized, then as a dev you are (naturally) separated from your running code. Vagrant actually gets us about halfway there: mounting /vagrant by default was a brilliant design decision. It effectively uncouples your editing environment from your execution environment. You can do that in every other virtualizer, too, but setting it up takes work.

Lay out your project

|-- Vagrantfile
|-- Dockerfile
|-- app
|-- bin
|   |-- start.sh
|   +-- stop.sh
|-- etc
+-- scripts
    |-- provision.sh
    |-- deploy.sh
    +-- scripts.sh
We actually have two separate places where we keep scripts. Our main repo looks something like this. And just to make things difficult, we keep our systems-level code in a separate repo from our actual app. We include the app as a submodule.

Lay out your project

|-- Vagrantfile
|-- Dockerfile
|-- app
|   |-- bin
|   |   |-- start.sh
|   |   +-- stop.sh
|   +-- scripts
|       +-- migrate.sh 
|-- bin
|   |-- start.sh
|   +-- stop.sh
|-- etc
+-- scripts
    |-- provision.sh
    |-- deploy.sh
    +-- scripts.sh
If we pull in that submodule, our directory tree looks like this. Scripts in 'app/bin' handle regular operation (e.g. starting, stopping) of our app within the VM. 'bin' handles the same, but from the host. 'app/scripts' holds our app-specific management scripts and 'scripts' holds our system-wide, common scripts. This is our convention. There are many like it, but this one works for us at bitHound. So depending on your path, you can tell if you are trying to interact from inside or outside the guest.
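Because the layout is a fixed convention, a script can cheaply check which side of the boundary it's running on. A minimal sketch, assuming the guest mounts the repo at Vagrant's default /vagrant:

#!/bin/bash
# Hypothetical helper: detect whether we're inside the guest or on the host.
# Inside the guest, the repo is mounted at /vagrant (Vagrant's default).
if [ -d /vagrant ]; then
  echo "running inside the guest"
else
  echo "running on the host"
fi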

Starting and stopping

#!/bin/bash

#start app
vagrant ssh --command "cd /vagrant/app && ./bin/cli server start"
So here's a really naive way of starting an app in a Vagrant box. This runs, but never comes back: the ssh session blocks for as long as the server does.

Starting and stopping

#!/bin/bash

#start app
vagrant ssh --command "cd /vagrant/app && ./bin/cli server start &"
#!/bin/bash

#start app
vagrant ssh --command "cd /vagrant/app && ./bin/cli server start" &
So we can push it into the background, either on the guest (with & inside the command) or on the host (with & after vagrant ssh), but either way we don't have a handle on the process, or a way to stop or restart the app.

Starting and stopping

#!/bin/bash

#start app
vagrant ssh --command "cd /vagrant/app; forever --uid 'app' -a start ./bin/cli server start"
#!/bin/bash

#stop app
vagrant ssh --command "forever stop app"
            
And here's a version using 'forever'. It's a JS tool that's a lot like supervisor, built to manage long-running processes: start, stop, restart. Here, it's running inside the guest, but we're managing it from the host.

Passing parameters

#!/bin/bash

vagrant ssh --command "cd /vagrant/app && ./bin/bithound.js $*"
Utility / proxy scripts let you interact with your project inside the VM. You could call this a trampoline, or a thunk if you lived through the Win16-to-Win32 transition. (I didn't, but I came just after that and still had to deal with the old developer documentation.) $* expands to the parameter list that this script itself was called with.
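If that proxy script were saved as, say, scripts/bithound.sh (a hypothetical name), any dev could drive the in-VM CLI straight from the host:

# Hypothetical usage: arguments pass straight through to the CLI in the guest.
./scripts/bithound.sh worker --help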
So that call took about three seconds to connect to the Vagrant guest. And this varies: Docker is pretty much instantaneous. Most of our tasks are CPU-bound, so this usually isn't a critical amount of overhead.
Vagrantfiles in particular are just Ruby!
unless Vagrant.has_plugin?('vagrant-s3auth')
  # Attempt to install ourself. Bail out on failure so we don't get
  # stuck in an infinite loop.
  system('vagrant plugin install vagrant-s3auth') || exit!

  # Relaunch Vagrant so the plugin is detected.
  # Exit with the same status code.
  exit system('vagrant', *ARGV)
end

Challenge 3: Production will differ

Different deployment locations in the filesystem. Different users. Different security contexts. Different environments.
Virtualization as a tool can give your dev team an approximation of production, and it's up to your ops team to make that approximation as close as possible. Ultimately, there is no perfect answer. Vagrant isn't quite what it says on the label: you have to be aware of the ways in which your Vagrant environment differs from your production environment.

Multiple servers

Vagrant.configure(2) do |config|
  config.vm.define "app" do |app|
    app.vm.box = "trusty64"
    app.vm.provision "shell", path: "scripts/provision_app.sh"
    app.vm.network "private_network", ip: "10.10.11.11"
  end

  config.vm.define "worker1" do |worker|
    worker.vm.box = "trusty64"
    worker.vm.provision "shell", path: "scripts/provision_worker.sh"
    worker.vm.network "private_network", ip: "10.10.11.12"
  end

  config.vm.define "worker2" do |worker|
    worker.vm.box = "trusty64"
    worker.vm.provision "shell", path: "scripts/provision_worker.sh"
    worker.vm.network "private_network", ip: "10.10.11.13"
  end
end
So if you are building a distributed system, develop on a distributed system! Almost all virtualization environments now let you define multiple VMs and private networks.

Multiple servers

#!/bin/bash

#start app
vagrant ssh app --command "cd /vagrant/app;   forever --uid 'app' -a start ./bin/cli server start"

#start workers
vagrant ssh worker1 --command "cd /vagrant/app; forever --uid 'worker' -a start ./bin/bithound.js worker 10.10.11.11"
vagrant ssh worker2 --command "cd /vagrant/app; forever --uid 'worker' -a start ./bin/bithound.js worker 10.10.11.11"

SSL

## Do we need fake SSL keys?
ssl_pem=/etc/ssl/private/www_bithound_io.pem
ssl_key=/etc/ssl/private/www_bithound_io.key
ssl_crt=/etc/ssl/private/www_bithound_io.crt

if [ ! -e $ssl_pem ]; then
  # No PEM.
  if [ ! -e $ssl_key ] || [ ! -e $ssl_crt ]; then
    # No keys.
    country=CA
    state=Ontario
    locality=Kitchener
    organization=bitHound
    name=app.bithound.io

    openssl req -x509 \
      -newkey rsa:2048 \
      -subj "/C=$country/ST=$state/L=$locality/O=$organization/CN=$name" \
      -keyout $ssl_key \
      -out $ssl_crt \
      -days 90 \
      -nodes
  fi

  cat $ssl_crt $ssl_key > $ssl_pem
fi
We also deliver everything over SSL, so we do that on our dev boxes too: we autogenerate a self-signed cert if there isn't one present already. This pattern is useful in other places too. We use Amazon S3 for file storage in production; we do static analysis of code and generate a lot of data. In development, we use an S3 simulator called 's3rver' to stand in for S3, 'cause it's cheaper to not send that data out if we don't have to.
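The s3rver setup isn't in the deck; a rough sketch of a dev-only stand-in looks like this, with the caveat that the flag names are assumptions and may vary between s3rver versions:

#!/bin/bash
# Hypothetical dev-only S3 stand-in: serve a local directory through s3rver.
# Flag names are assumptions and may differ between s3rver versions.
npm install -g s3rver
mkdir -p /tmp/s3
s3rver --directory /tmp/s3 --port 4568 &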

Challenge 4: Provisioning is hard

The learning curve for any given configuration management tool is massive.

Provisioners

  • Chef
  • Puppet
  • Ansible
  • CFEngine
  • Cobbler
  • SaltStack
  • ...
Chef, Puppet, and Ansible are all pretty heavy for small projects. If you're already using one, great! Keep it! Don't change unless there's a damn good reason.

Provisioner features

  • Configuration Management
  • Orchestration
  • Verification / Auditing
We only really care about the configuration management part. Our VM environment handles orchestration, and we really don't care about verification. We take a hard line here: if there are changes to the environment that are permanent, they must be scripted and committed. The configuration as present in version control is the gold standard. So if you suspect that your dev environment has gone off the rails, and you do have a good provisioning script, don't bother auditing. Just nuke it from orbit, and create a new one.
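With Vagrant, nuking it from orbit is literally two commands:

#!/bin/bash
# Throw away the drifted guest and rebuild it from the committed scripts.
vagrant destroy -f
vagrant up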

Provisioner features

  • Configuration Management
  • Orchestration
  • Verification / Auditing
As long as your provisioner is idempotent, you can provision with just about any tool. Don't try to modify config files in place. Just check 'em in, and copy them into place.
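For example, one way to keep the copy-into-place approach idempotent right down to the service restart (the paths are illustrative):

#!/bin/bash
# Copy the checked-in config only when it differs from what's installed,
# and restart the service only when the file actually changed.
if ! cmp -s /vagrant/etc/nginx/nginx.conf /etc/nginx/nginx.conf; then
  cp /vagrant/etc/nginx/nginx.conf /etc/nginx/nginx.conf
  /etc/init.d/nginx restart
fi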

Provisioning with bash

#!/bin/bash
BASE='/vagrant'

apt-get update
apt-get install -y build-essential curl git mongodb-clients nginx tmux vim

cp "$BASE/etc/nginx/nginx.conf" /etc/nginx/nginx.conf
/etc/init.d/nginx restart

# Assumes node and npm are already on the box (e.g. from the base image).
npm -g install forever
npm -g install nodemon
Here's a really early version of our provision script. We don't need to template our nginx config because it's pretty simple. Chef recipes and Puppet rules tend to be written to be flexible and generic, and that's great if you're maintaining hundreds of servers, each configured slightly differently. If you want to configure your servers identically, blindly copying a file into place is much simpler.

Provisioning with bash

Vagrant.configure(2) do |config|
  config.vm.provision "shell", path: "scripts/provision.sh"
end
And then calling a shell provisioner is really simple. In real life, we have parameters that we pass to our provisioning script: we can tell it what user our app will run as, and what the source and destination paths actually are. We actually use the same scripts to provision our dev machines and our production machines.
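A sketch of what that parameterization can look like; the argument names and defaults here are assumptions, not our actual script:

#!/bin/bash
# Hypothetical parameterized provisioner. A Vagrantfile can invoke it as:
#   config.vm.provision "shell", path: "scripts/provision.sh",
#     args: ["app", "/vagrant/app", "/opt/app"]
APP_USER="${1:-app}"          # user the app runs as
SRC_PATH="${2:-/vagrant/app}" # where the code lives on this box
DEST_PATH="${3:-/opt/app}"    # where it should be deployed

echo "Provisioning: user=$APP_USER src=$SRC_PATH dest=$DEST_PATH"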

Provisioning with bash

FROM ubuntu:14.04

# Reuse the same provision script as the Vagrant guest; it has to be
# copied into the image before it can be run.
ADD scripts /scripts
RUN /scripts/provision.sh
ADD app /app
RUN cd /app && npm install

CMD ["/app/scripts/entrypoint.sh"]
Incidentally, you can then use the same bash file with Docker.

Challenge 5: Keeping it going.

We don't need maintenance because devs regularly destroy and rebuild their vagrant environments, right?
Just because you've built a virtual dev environment doesn't mean it will stay in sync. Your upstream images may change. There will be security updates. Your own application may change. If you don't regularly rebuild your dev environment, you will be surprised. And that's the best case. If you do a lot of work in a particular guest, your dev environment as running may no longer match your dev environment as spec'd. If you don't reprovision regularly, you will drift out of sync and lose many of the benefits of virtualization.

Why people don't reprovision:

  • It takes too long.
  • Reprovisioning fails.
  • Data loss.
You may have to push your developers to remain in sync. They'll probably push back. Why don't people stay in sync? It takes time. If your provisioning process takes two hours, that's two hours of lost productivity. If it takes any longer than a coffee break, developers will resist reprovisioning. Even if it only takes a coffee break, there will still be resistance.

Why people don't reprovision:

  • It takes too long.
  • Reprovisioning fails.
  • Data loss.
We've already talked about data loss. Anything that affects the state of the guest should be checked in. Fixtures, for example. It's also possible to copy database dumps in and out of a VM. I haven't had anybody raise this about production data yet; it's usually about reproducing test cases, so 'build fixtures' is usually a reasonable answer.
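Copying a database in and out is a few lines of bash. This sketch assumes MongoDB (our provision script installs the clients) and an illustrative dump path:

#!/bin/bash
# Hypothetical sketch: snapshot the guest's MongoDB into the shared mount,
# rebuild the guest, then restore into the fresh environment.
vagrant ssh --command "mongodump --out /vagrant/tmp/dump"
vagrant destroy -f && vagrant up
vagrant ssh --command "mongorestore /vagrant/tmp/dump"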

Improving provision speed

  • Examine your provisioner.
  • Package your code.
  • Create custom images.
Again, benchmark! We've tested out building custom base boxes, and thus far it doesn't seem to help: we're trading off the time it takes to provision the box against the time it takes to download it.
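The custom-image route is cheap to try with Vagrant itself; the box name below is illustrative:

#!/bin/bash
# Snapshot a fully provisioned guest into a reusable base box, then
# register it locally so Vagrantfiles can reference it by name.
vagrant package --output bithound-base.box
vagrant box add bithound-base bithound-base.box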

Reducing failures

  • Eat your own dogfood.
  • Provision regularly.
  • Watch for bitrot.
I maintain our dev environment, so whenever I make a new branch, I reprovision. That's at least once a day, usually two or three times. It's part of the job, and it could also very easily be automated. I have worked with projects where the Vagrantfile is poorly maintained. I've even seen some that ship with a README suggesting you run the Vagrantfile, followed by a list of manual tasks you need to complete provisioning. Devs need to trust their dev environment.

Nag your users

On the other hand, trust but verify. It helps to have scripts that check on the host machine. This one, for example, will not start if the systems code is obviously too old. Okay, there's a -f option, but my point still stands.
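That script isn't in the deck; a minimal sketch of that kind of check, with the age threshold and the -f handling as assumptions:

#!/bin/bash
# Hypothetical nag script: refuse to start if the systems code hasn't been
# updated recently, unless the dev forces it with -f.
MAX_AGE_DAYS=14
if [ "$1" != "-f" ]; then
  last_commit=$(git log -1 --format=%ct)
  age_days=$(( ($(date +%s) - last_commit) / 86400 ))
  if [ "$age_days" -gt "$MAX_AGE_DAYS" ]; then
    echo "Systems code is $age_days days old. Run 'git pull', or use -f to skip this check." >&2
    exit 1
  fi
fi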

Does it work?

It does for us. This January, we hired two new staff members. We're a startup; we don't really have a staff onboarding process. It's pretty informal. So we had brand-new, still-shrinkwrapped MacBook Pros, and using the tools that I've outlined in this talk, both people had a complete, functional dev environment and had made (trivial) commits to our codebase before lunch. Including code reviews. So it takes time up front, and you spend a little more time on each page load in a dev environment. But what you get for that is vastly reduced time building dev environments, and an easier path to production. You don't get this stuff out of the box by just using Vagrant.

Thank you!


www.bithound.io / @bithoundio