Release – Engineering 101 Building the Pipeline PLOW Lecture, Ecole Polytechnique, October 2014 – Building the Pipeline






Release

Engineering 101

Building the Pipeline

PLOW Lecture, Ecole Polytechnique, October 2014

Your instructor: Kim Moir / @kmoir
This presentation: https://github.com/kmoir/Releng-tutorial

Outline - Lectures

  • Introductions
  • Overview of Release Engineering
  • Continuous Integration
  • Build and Test pipelines
  • Scaling your CI
  • REST APIs
  • Further reading

Outline - Labs

  • Using your AWS Instance and Git
  • Setup a build on Jenkins
  • Build from commit on Jenkins
  • Invoke build and test scripts
  • Add test results report
  • Add Jenkins Plugins
  • Create a pipeline by chaining jobs together
  • Query Jenkins REST API
  • Challenge Exercise

About: me

I currently work at Mozilla as a release engineer. Prior to that, I worked at IBM as a release engineer in the Eclipse community. So I've worked in open source communities for a long time. Most of the tools I'll show you today are open source too.

Mozilla Releng Worldwide

About: you

?

How many of you have worked on a development team before? What open source tools do you use? Git? Eclipse? What languages do you write code in? Java? Ruby? Python? C++? JavaScript? How many of you plan to work in industry when you graduate?
© Forest Wander, Creative Commons by-nc-sa 2.0

What is release engineering?

“a sub-discipline in software engineering concerned with the compilation, assembly, and delivery of source code into finished products or other software components.” (Wikipedia)
At its very essence, we take source code and turn it into the products that you see on your phone, your desktop, at the bank machine, or on an airplane. We ensure that it compiles, is packaged, tested, signed, and can be updated and deployed. You may be wondering how our jobs are different from those of traditional software developers.

We build pipelines

An old rusty pipe on Deception Island in Antarctica. February 2009 © Ville Miettinen, Creative Commons by-nc-sa 2.0

Role of the Release Engineer

  • Focus on writing code to build a pipeline versus a product
    • scope depends on where you work
    • scale depends on
      • number of products
      • number of platforms x supported versions
      • number of tests
      • number of commits
      • release cadence
Why can't it be just a regular developer doing this job? Sometimes it is. However, developers get very attached to their code and want to push new features despite them lacking stability.

Committing code is only the first step

  • Build, test, package, sign and deploy
  • mobile and desktop apps
  • social media, banks, web sites for online retailers...
  • on multiple platforms
    • Desktop: Windows, Linux, Mac...
    • Mobile: Android, iOS, Firefox OS, BlackBerry...
    • Other: Trains, planes, cars, and drones...

Websites such as Google, Facebook, Twitter

if software then releng

  • Akamai, Facebook, Google, LinkedIn, Twitter
  • Amazon, Mozilla, Etsy, Netflix, Tasktop
  • Oracle, IBM, HP, Microsoft, Red Hat
  • NASA, Boeing, Swiss Railways
  • QNX, Hortonworks, Shopify, Puppet
  • Open Source communities like
    • Linux, Apache, Mozilla, Eclipse, Gentoo, Debian

Where do we work?

Chuck Rossi, Director of Release Engineering, Facebook

start 1:38 end 5:30

Release Engineering Focus

  • Schedule
  • Quality
  • Features
  • = shipping quality products on time

Release engineers are not emotionally attached to the features in a product. Their focus is on building the pipeline that allows them to ship products on time. A software developer in a traditional role may have invested months working on a new feature in a product. But if this new feature is not of sufficient quality to be included in the current release, it should not be included. Release management or release engineering may make this decision, depending on where you work.

Releng as a Force multiplier

  • Optimize the pipeline
  • Make the development team as a whole more effective
    • Reduce the end to end build time by 50%?
    • Scale testing in the cloud to reduce wait times?
    • Ship a release in a day?
    • Implement a new build system?

John O'Duinn gave a talk a couple of years ago called Release Engineering as a Force Multiplier. It describes in detail how the changes in Mozilla release engineering make the development team as a whole more effective.

Scale

  • At Mozilla we run about 4k build jobs and 70k test jobs a day
  • 80% of builds and 50% of tests run on cloud instances
  • Build farm consists of 6000 devices
  • Scaling issues are fun!

Tools

John O'Duinn's Mozilla release pipeline Tree Herder Status

This is a picture of how the different parts of our build farm work together. Developers land changes on code repositories such as hg.mozilla.org. As I mentioned before, we use an open source continuous integration engine called Buildbot. We have over 50 Buildbot masters. Masters are segregated by function: tests, builds, scheduling, and try. Test and build masters are further divided by function so we can limit the types of jobs they run and the types of slaves they serve. For instance, a master may have Windows build slaves allocated to it, or Android test slaves. This makes the masters more efficient because they don't need to have every type of job loaded and consuming memory. It also makes maintenance more efficient: you can bring down, for example, the Android test masters for maintenance without having to touch other platforms.

Buildbot polls the hg push log for each of the code repositories (Hgpoller). When the poller detects a change, the information about the change is written into the scheduler database. The Buildbot scheduler masters are responsible for taking this request in the database and creating a new build request. The build request will then appear as pending on the web page in the previous slide. The jobs may run on existing hardware in our data centre, or new VMs may be started or created in the cloud to run these pending jobs. When the job finishes, logs and artifacts are uploaded to FTP. The results appear on tbpl. Performance results are stored in the graphs database.

References:
http://people.mozilla.org/~catlee/relengdocs/flows.html#firefox-builds-from-checkin-to-tbpl
http://oduinn.com/blog/2013/06/07/release-engineering-as-a-force-multiplier-relengcon-2013/
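The flow described above (poll the push log, record new changes, turn them into pending build requests) can be sketched in a few lines of Python. This is a toy model only, not Buildbot code: the `Pipeline` class, its method names, the revision strings and the platform names are all made up for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Pipeline:
    """Toy model of the poll -> schedule -> pending-job flow (hypothetical names)."""
    seen: set = field(default_factory=set)              # revisions already handled
    scheduler_db: deque = field(default_factory=deque)  # changes awaiting scheduling
    pending: list = field(default_factory=list)         # build requests waiting for a machine

    def poll(self, push_log):
        """Record every revision in the push log that has not been seen before."""
        for rev in push_log:
            if rev not in self.seen:
                self.seen.add(rev)
                self.scheduler_db.append(rev)

    def schedule(self, platforms):
        """Turn each recorded change into one build request per platform."""
        while self.scheduler_db:
            rev = self.scheduler_db.popleft()
            for platform in platforms:
                self.pending.append((rev, platform))


pipeline = Pipeline()
pipeline.poll(["a1", "b2"])
pipeline.poll(["a1", "b2", "c3"])        # only c3 is new on the second poll
pipeline.schedule(["linux", "windows"])  # made-up platform names
print(pipeline.pending)
```

Each new revision produces one pending request per platform, which is why the number of jobs grows so quickly with the number of commits and supported platforms.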

Selena Deckelman's diagram of Mozilla Release Engineering flows
This is a more complicated picture of how the release engineering pipeline at Mozilla works. As you can see, it's really a large distributed system. Implementing changes on it is quite complex.

Other related disciplines:

Configuration Management

Puppet, Chef, Ansible, Docker. We have several thousand devices in our build farm at Mozilla. We need to manage them at scale. For many of our devices we use Puppet to configure them automatically, e.g. install a new package, change a file, or configure a new platform.

Build, Test and Release Automation

© Ian Britton, Creative Commons by-nc-sa 2.0

Automate all the things

Continuous Integration/Deployment

  • Evaluating and writing tools
  • Optimizing the pipeline for shorter end to end times
  • Removing bottlenecks then finding new ones
  • Reducing costs and scaling capacity

Release Management

Coordinating with a development team to determine what is shipped in a release

Repository Management

Branching Strategies

Package and Dependency Management

© xkcd - Dependencies - Randall Munroe
Artifactory, Maven repos, p2 repos, PyPI, custom RPMs: different languages or operating systems have different package management systems and thus different ways of managing dependencies, e.g. pip, apt-get, Java jars. There are also different types of deployment artifacts.

Going from 3 week to Daily releases

Daniel Zapata, Senior Software Engineer, Netflix

0:44-3:15

Testing and Quality Assurance

  • Porting tests to new platforms
  • Writing Test harnesses
    • Unit tests
    • Performance tests
    • Regression testing

Building a physical device lab

Lara Hogan, Senior Engineering Manager of Performance, Etsy

4:45-7:49

Our project

  • Setup a simple build pipeline using Jenkins on AWS
  • Install additional Jenkins plugins
  • Query Jenkins build status using REST API

Lab 1: Using your AWS instance

Goal: Login to your AWS instance via ssh. Query the packages that are available on it

  • AWS = Amazon Web Services
  • Shared cloud resources
  • Different instance types - different resources - memory, disk, CPU
  • We are using an m3.large instance type running Ubuntu 14.04
There are other cloud services too: Azure, IBM, etc.

Lab 1: Login to your instance

  • ssh -l username hostname
  • if needed, you can become root with sudo, e.g. sudo su - root
  • look at packages installed
dpkg -l | more 
dpkg -l | grep jenkins
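
If you would rather query the package list from a script than by eye, the same output can be parsed in Python. This is a minimal sketch: the `installed_packages` helper is made up for this example, and the sample text (including the version numbers) is illustrative rather than taken from the lab image.

```python
def installed_packages(dpkg_output):
    """Return {name: version} for lines marked 'ii' (installed) in `dpkg -l` output."""
    packages = {}
    for line in dpkg_output.splitlines():
        fields = line.split()
        # Package rows look like: status, name, version, architecture, description...
        if len(fields) >= 3 and fields[0] == "ii":
            packages[fields[1]] = fields[2]
    return packages


# Illustrative sample only; real output on your instance will differ.
SAMPLE = """\
Desired=Unknown/Install/Remove/Purge/Hold
||/ Name           Version      Architecture Description
+++-==============-============-============-=================
ii  git            1:1.9.1-1    amd64        fast, scalable, distributed revision control system
ii  jenkins        1.580        all          Continuous integration system
rc  oldpkg         0.1          all          removed package with config files left
"""

packages = installed_packages(SAMPLE)
print(packages.get("jenkins"))
```

On the instance itself you could feed it real data with `subprocess.check_output(["dpkg", "-l"], text=True)`.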
				     

Lab 1: Git

  • Git is a distributed version control system
  • Git Design Goals: fast, secure, distributed
  • Each commit has a corresponding hash key
  • Everyone has a local copy of the history

Lab 1: Git - quick command guide

git clone repo_name           #clone a repository locally
cd repo_name        
vi my_file          
git add my_file               #add my_file to git tracking
git status                    #see the status of the git repository 
git commit -a -m "a comment"   #commit the change to the local repository
git push                      #push the change to the remote repository
git branch                    #show current branch
git checkout -b branch2       #create new local branch
				     
You can use Git within an IDE if you want, but these are the command line equivalents

Lab 1: Git exercise

cd /home/ubuntu
git clone https://github.com/salimfadhley/jenkinsapi     
cd jenkinsapi
git branch 
git tag
git log
git status 
git config --global user.name "Your Name"
git config --global user.email you@example.com
                                     
Clone the jenkinsapi project locally to your AWS machine. Look at the status of the repo (no pending changes). Use git log to look at recent changes.

Lab 2: Continuous Integration

  • Everything in version control - build, test and deployment scripts
  • Automate all the things: Build, test, deployment, update process
  • Build on commit
  • Fix problems as they arise
  • Popular CI servers

Some are open source, some are commercial, e.g. Jenkins, Hudson, Travis CI (which integrates with GitHub), Buildbot, Bamboo from Atlassian

Lab 2: Jenkins

  • Jenkins is a continuous integration engine
  • Written in Java
  • Open source
  • Plugin architecture that allows you to add functionality in a modular fashion
  • Many community plugins available
https://wiki.jenkins-ci.org/display/JENKINS/Plugins
Examples in the wild: Netflix, as well as the Apache project (https://builds.apache.org). Mozilla uses Buildbot, which scales better for our volume.

Lab 2: Jenkins

  • Jenkins welcome page
  • at your machine's http://fqdn:8080

Lab 2: Jenkins login

  • Select log in on the left hand side
  • Username and password are on whiteboard
  • Select Remember me on this computer

Lab 2: Jenkins: Create a job

  • Select New Item on left side of screen
  • In Item name give your job a name
  • Select Freestyle project and select OK

Lab 2: Jenkins: Clone from Git

  • Under Source Code Management select Git
  • Under Repository URL enter the path to your local git repository i.e. /home/ubuntu/jenkinsapi and select Save
  • Then select Build Now on the left hand side of the screen

Lab 2: Jenkins: Build Results

  • When the build finishes, look at Console Output

Lab 3: Jenkins: Build on Commit

  • Select Back to Project
  • Select Configure
  • Select Poll SCM
  • Under Schedule enter H/2 * * * * to poll git every two minutes
  • Select Save

Lab 3: Jenkins: Commit a file

ubuntu@ip-172-31-31-167:~/jenkinsapi$ cat TODO 
TODO:

* Add a testsuite (preferably nose or py.test) which doesn't rely on a local jenkins setup (or instantiates one during test)
* Clean up the fingerprint code
* Clean up the resultset and results code
* Make all objects inherit off jenkins_base where that makes sense
* Add ability to add/modify/delete jobs
* Add ability to query jenkins for plugin data

ubuntu@ip-172-31-31-167:~/jenkinsapi$ vi TODO
ubuntu@ip-172-31-31-167:~/jenkinsapi$ git commit -a -m "updated TODO"
[master d55e7e2] updated TODO

                                    

You can update any file, it doesn't matter

Lab 3: Jenkins: Look at job output

  • Another job should start within 2 minutes
  • But only if there is a new commit
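
The "only if there is a new commit" behaviour can be pictured as a small polling function. This is a sketch of the idea, not how Jenkins implements Poll SCM; `poll_once` and its parameters are hypothetical names, and the second revision hash is made up.

```python
def poll_once(get_head, last_built, trigger):
    """Trigger a build only when the repository head differs from the last built revision."""
    head = get_head()
    if head != last_built:
        trigger(head)
        return head
    return last_built


# Simulate three polls: the head is unchanged on the second poll.
heads = iter(["d55e7e2", "d55e7e2", "9f3c1ab"])  # 9f3c1ab is a made-up revision
builds = []
last = None
for _ in range(3):
    last = poll_once(lambda: next(heads), last, builds.append)
print(builds)
```

In a real script, `get_head` could run `git rev-parse HEAD` in the repository and `trigger` could start the job.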

Build on Commit vs build on Trunk

Why is it important?

Many small projects just build from trunk (the master branch) or another designated branch. However, this often does not scale when you have a large code base and a large team of developers. It's difficult to determine who caused the breaking failure if you have a large number of changes and you only run a build once a day. If you build on commit, you can quickly see what caused the breaking failure and back it out before other changes cascade on top of it. If you don't have the infrastructure budget to build on commit, you can build and run tests locally, e.g. in your IDE, before committing code to the canonical repo. However, this doesn't fully address integration: having your code integrate with all the other code in the repo.

Bisect, fix bustage or backout
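
When builds do not run on every commit, bisecting narrows down which change broke things. Below is a minimal sketch of the idea as a binary search, assuming the first commit in the range is known good, the last is known bad, and everything after the first bad commit also fails. The commit names and the `is_good` check are invented for the example; `git bisect` automates the same search against a real repository.

```python
def first_bad_commit(commits, is_good):
    """Binary search for the first failing commit.

    Assumes commits[0] is good, commits[-1] is bad, and failures persist
    once introduced.
    """
    lo, hi = 0, len(commits) - 1      # lo is always good, hi is always bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]


commits = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]  # made-up history
tested = []


def is_good(commit):
    """Stand-in for 'build and run the tests'; here c5 introduced the breakage."""
    tested.append(commit)
    return int(commit[1:]) < 5


culprit = first_bad_commit(commits, is_good)
print(culprit, "found after", len(tested), "builds")
```

Eight commits need only three test builds here, which is the point of bisecting instead of rebuilding every commit in order.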

Lab 4: Add more build steps

  • Select your build
  • Go to Configure
  • Under the Build heading, select Add build step
  • Select Execute shell

The people who maintain the Python Jenkins API were kind enough to include a Jenkins script we can run to replicate their build

It uses virtualenv to create a Python virtual environment to run the tests

This is a good practice because it keeps the test environment separate from the machine environment

Lab 4: Add more build steps

  • Text to put in the Command box
chmod 755 $WORKSPACE/jenkins_build.sh
$WORKSPACE/jenkins_build.sh
                                    
  • Select Save
  • Select Build now
  • The build will take about 7 minutes
  • What can you tell from the build output?

Lab 4: Add Test Results report

  • Go back to your project
  • Select Configure
  • Under Add post-build action
  • Select Publish JUnit test result report
  • Enter nosetests.xml
  • Select Save and rerun your build
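
The nosetests.xml file that Jenkins reads is a JUnit-style XML report, so you can also summarize it yourself. This is a minimal sketch using only the standard library; the `summarize` helper and the test names in the sample are made up, and a real report will carry more attributes than shown here.

```python
import xml.etree.ElementTree as ET


def summarize(xml_text):
    """Count test cases and list the ones that contain a <failure> or <error> child."""
    root = ET.fromstring(xml_text)
    cases = list(root.iter("testcase"))
    failed = [
        "%s.%s" % (case.get("classname"), case.get("name"))
        for case in cases
        if case.find("failure") is not None or case.find("error") is not None
    ]
    return {"total": len(cases), "failed": failed}


# Illustrative report with hypothetical test names.
SAMPLE_REPORT = """\
<testsuite name="nosetests" tests="3" errors="0" failures="1" skip="0">
  <testcase classname="tests.test_job" name="test_name" time="0.01"/>
  <testcase classname="tests.test_job" name="test_last_build" time="0.02">
    <failure type="AssertionError" message="expected 3, got 2"/>
  </testcase>
  <testcase classname="tests.test_view" name="test_len" time="0.01"/>
</testsuite>
"""

summary = summarize(SAMPLE_REPORT)
print(summary)
```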

Lab 4: Test Results report

  • Go back to your project
  • On left hand side, select test result

Lab 4: Break things

  • Go change some code in the repository to break the build
  • e.g. comment out some lines of code

Lab 4: Bonus labs

  • Write a script to generate many commits to your git repository
  • Add additional executors to the Jenkins install
  • Look at the number of queued jobs

Lab 5: Add a plugin

  • On the left hand side select Back to Dashboard
  • On the left hand side select Manage Jenkins
  • Select Manage Plugins
  • Select the Available tab
  • You'll see a list of available plugins
  • Check the Build Pipeline plugin
  • Select Install plugin
  • Select Download now and install after restart plugin
  • Select Restart Jenkins when installation is complete and no jobs are running

Lab 5: Restart Jenkins to enable new plugins

  • Will take a few minutes to complete
  • After Jenkins has restarted, verify that the Build Pipeline plugin is listed under the Installed tab
  • Go to http://your_jenkins_server/restart if it doesn't restart automatically

Lab 5: Add a new build pipeline view

  • Select My Views on the left hand side
  • On the views pages, select + to add a new view
  • Select Build Pipeline View
  • Select OK
  • Enter Build Pipeline for the name
  • The defaults are okay, select OK

Lab 5: Look at build pipeline view

Lab 5: Add two new build jobs

  • Select Jenkins to go back to the dashboard
  • Select New Item on left side of screen
  • In Item name give your job a name like Performance1
  • Select Freestyle project and select OK
  • Add Build step and select Execute Shell
  • echo "$JOB_NAME is running"
  • Select Save
  • Repeat these same steps for a job called Deploy1

Lab 5: Configure chained builds

  • On the Jenkins dashboard select build1 and then configure
  • Select Add post-build action
  • Select Trigger parameterized build on other projects
  • Select Projects to build and select Performance1
  • Select Add Parameters
  • Select Current build parameters
  • Select Save

Lab 5: Configure chained builds (2)

  • Repeat these same steps so the performance build triggers the deployment build

Lab 5: Run builds and look at pipeline

  • Look at the pipeline view - does it show build -> performance -> deploy?
  • Run the performance and deploy tests (to save time) or build1 if you want
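
One way to think about the chained jobs is as a mapping from each job to the jobs it triggers. The sketch below walks that mapping to show the order in which jobs would run. It models the configuration from this lab and does not talk to Jenkins; `pipeline_order` and `DOWNSTREAM` are names invented for the example.

```python
def pipeline_order(start, downstream):
    """Return jobs in the order they would run, following downstream triggers."""
    order, queue = [], [start]
    while queue:
        job = queue.pop(0)
        if job in order:
            continue                      # guard against accidental cycles
        order.append(job)
        queue.extend(downstream.get(job, []))
    return order


# Job names from this lab: build1 triggers Performance1, which triggers Deploy1.
DOWNSTREAM = {"build1": ["Performance1"], "Performance1": ["Deploy1"]}

order = pipeline_order("build1", DOWNSTREAM)
print(" -> ".join(order))
```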

Lab 5: Bonus

  • Implement another pipeline that runs from a different branch or tag
  • Hint: You can copy existing jobs and modify

Scaling CI in the cloud

  • Why do we need to scale CI?
  • Different cloud vendors: Amazon, IBM, Microsoft, Rackspace...
  • Goal is to increase capacity while minimizing cost

If you build on commit, the number of builds that you run will scale up as people wake up in different time zones and start landing code

More Capacity, less $

  • Run instances in multiple regions
  • Start instances in cheaper regions first
  • Scripts to automatically shut down inactive instances
  • Start instances that have been recently running
  • Use spot instances vs on demand instances for tests
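
Two of these rules, cheaper regions first and recently running instances first, amount to a sort order over the stopped instances. This is a rough sketch under that reading: the `Instance` fields, the prices and the `pick_instances` helper are all hypothetical, and it is not the scheduling code used at Mozilla.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    region: str
    hourly_price: float       # hypothetical price for the region
    minutes_since_stop: int   # how long ago the instance last ran


def pick_instances(stopped, needed):
    """Prefer the cheapest instances; among equals, prefer the most recently running."""
    ranked = sorted(stopped, key=lambda i: (i.hourly_price, i.minutes_since_stop))
    return [i.name for i in ranked[:needed]]


stopped = [
    Instance("a", "us-east-1", 0.14, 5),
    Instance("b", "us-west-2", 0.14, 90),
    Instance("c", "eu-west-1", 0.16, 1),
    Instance("d", "us-east-1", 0.14, 30),
]
chosen = pick_instances(stopped, 3)
print(chosen)
```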

Lab 6: REST APIs

  • Representational state transfer
  • client server
  • stateless
  • lightweight
  • scalable

Lab 6: Jenkins REST APIs
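
Before using a client library, it helps to see what the raw API looks like. Jenkins exposes JSON at `/api/json` under the server URL, and the top-level response includes a `jobs` list. This is a minimal sketch using only the standard library; the helper names and the sample payload are made up, and your server may require authentication, which is not handled here.

```python
import json
from urllib.request import urlopen


def api_url(base_url):
    """Build the JSON API URL for a Jenkins server or job page."""
    return base_url.rstrip("/") + "/api/json"


def job_status(payload):
    """Map each job name to its 'color' field (e.g. blue for success, red for failure)."""
    return {job["name"]: job["color"] for job in json.loads(payload)["jobs"]}


def fetch_job_status(base_url):
    """Fetch and parse the job list from a live server (needs network access)."""
    with urlopen(api_url(base_url)) as response:
        return job_status(response.read().decode("utf-8"))


# Illustrative payload, trimmed to the fields used above.
SAMPLE_PAYLOAD = '{"jobs": [{"name": "build1", "color": "blue"}, {"name": "Deploy1", "color": "red"}]}'

print(api_url("http://localhost:8080/"))
print(job_status(SAMPLE_PAYLOAD))
```

On your instance, `fetch_job_status("http://localhost:8080")` would query the live server.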

Lab 6: Query your Jenkins server

  • Install jenkinsapi from pypi to ensure it's sane
  • sudo pip install jenkinsapi
  • Jenkins API reference
  • Example python to query the last successful builds for each of the builds defined on your Jenkins server
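from jenkinsapi.jenkins import Jenkins

# Connect to the Jenkins server running on this machine
J = Jenkins('http://localhost:8080')
print("Jenkins Version %s" % J.version)
# Names of the jobs defined on the server
build_list = J.keys()
print("Jenkins keys %s" % build_list)
for b in build_list:
    # get_last_good_build() may raise an error if a job has never built successfully
    print("Build name %s Last good build %s" % (b, J[b].get_last_good_build()))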

Lab 6: Query your neighbour's Jenkins server

Replace localhost in the previous example with their FQDN

Lab 6: Query the plugins installed

  • Example Python to query the plugins installed on your Jenkins server. Query your neighbour's too!

from jenkinsapi.jenkins import Jenkins

# Connect to the Jenkins server running on this machine
J = Jenkins('http://localhost:8080')
print("Jenkins Version %s" % J.version)
# Print the details of each installed plugin
for plugin in J.get_plugins().values():
    print("Short Name:%s" % plugin.shortName)
    print("Long Name:%s" % plugin.longName)
    print("Version:%s" % plugin.version)
    print("URL:%s" % plugin.url)
    print("Active:%s" % plugin.active)
    print("Enabled:%s" % plugin.enabled)

Challenge Exercise #1

  • Create a new build pipeline with a different open source project

OR

Challenge Exercise #2

  • Query the REST API of Apache project builds
  • Please provide source code and define functions as appropriate
  • The Python Jenkins API doc is here
  • If you prefer Java, you can use the Jenkins API for Java
  • Or, for Ruby, the Ruby Jenkins API
    • What plugins are installed?
    • What builds are running?
    • Do the running builds have downstream builds?
    • Which jobs haven't had a successful build in the past week?
    • For failed builds, what is the reason they failed?
    • Extra bonus: present this information graphically

Further Reading

Release Engineering and Devops books and blogs to Read