Release Engineering 101

Building the Pipeline

PLOW Lecture, Ecole Polytechnique, October 2014

Your instructor: Kim Moir / @kmoir
This presentation: https://github.com/kmoir/Releng-tutorial

Outline - Lectures

Introductions
Overview of Release Engineering
Continuous Integration
Build and Test pipelines
Scaling your CI
REST APIs
Further reading

Outline - Labs

Using your AWS Instance and Git
Set up a build on Jenkins
Build from commit on Jenkins
Invoke build and test scripts
Add test results report
Add Jenkins Plugins
Create a pipeline by chaining jobs together
Query Jenkins REST API
Challenge Exercise

About: me

I currently work at Mozilla as a release engineer. Prior to that, I worked at IBM as a release engineer in the Eclipse community. So I've worked in open source communities for a long time. Most of the tools I'll show you today are open source too.

Mozilla Releng Worldwide

About: you

?

How many of you have worked on a development team before? What open source tools do you use? Git? Eclipse? What languages do you write code in? Java? Ruby? Python? C++? JavaScript? How many of you plan to work in industry when you graduate?

What is release engineering?

“a sub-discipline in software engineering concerned with the compilation, assembly, and delivery of source code into finished products or other software components.” --Wikipedia

At its very essence, we take source code and turn it into the products that you see on your phone, your desktop, at the bank machine, or on an airplane. We ensure that it compiles, is packaged, tested and signed, and can be updated and deployed. You may be wondering how our jobs are different from those of traditional software developers.

We build pipelines

An old rusty pipe on Deception Island in Antarctica. February 2009 © Ville Miettinen, Creative Commons by-nc-sa 2.0

Role of the Release Engineer

Focus on writing code to build a pipeline versus a product
- scope depends on where you work
- scale depends on
  - number of products
  - number of platforms x supported versions
  - number of tests
  - number of commits
  - release cadence
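
To make the "platforms x supported versions" multiplier concrete, here is a minimal Python sketch. The platform names are taken from the slides that follow; the version labels and test suite names are invented for illustration only.

```python
from itertools import product

# Platform names appear later in this deck; versions and suites are hypothetical.
platforms = ["Windows", "Linux", "Mac", "Android"]
versions = ["release", "beta", "nightly"]
suites = ["unit", "performance", "regression"]

# Every commit can fan out into one job per combination.
matrix = list(product(platforms, versions, suites))

print("%d jobs per commit" % len(matrix))
print("first job: %s / %s / %s" % matrix[0])
```

Adding one more platform or one more supported version multiplies the job count rather than adding to it, which is why scale grows so quickly.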

Why can't it be just a regular developer doing this job? Sometimes it is. However, developers get very attached to their code and want to push new features despite them lacking stability.

Committing code is only the first step

Build, test, package, sign and deploy
mobile and desktop apps
social media, banks, web sites for online retailers...
on multiple platforms
- Desktop: Windows, Linux, Mac...
- Mobile: Android, iOS, FirefoxOS, BlackBerry...
- Other: Trains, planes, cars, and drones...

Websites such as Google, Facebook, Twitter

if software then releng

Akamai, Facebook, Google, LinkedIn, Twitter
Amazon, Mozilla, Etsy, Netflix, Tasktop
Oracle, IBM, HP, Microsoft, RedHat
NASA, Boeing, Swiss Railways
QNX, Hortonworks, Shopify, Puppet
Open Source communities like
- Linux, Apache, Mozilla, Eclipse, Gentoo, Debian

Where do we work?

Chuck Rossi, Director of Release Engineering, Facebook

start 1:38 end 5:30

Release Engineering Focus

Schedule
Quality
Features
= shipping quality products on time

Release engineers are not emotionally attached to the features in a product. Their focus is on building the pipeline that allows them to ship products on time. A software developer in a traditional role may have invested months working on a new feature in a product. But if this new feature is not of sufficient quality to be included in the current release, it should not be included. Release management or release engineering may make this decision, depending on where you work.

Releng as a Force multiplier

Optimize the pipeline
Make the development team as a whole more effective
- Reduce the end to end build time by 50%?
- Scale testing in the cloud to reduce wait times?
- Ship a release in a day?
- Implement a new build system?

John O'Duinn gave a talk a couple of years ago called "Release Engineering as a Force Multiplier". It describes in detail how the changes in Mozilla release engineering make the development team as a whole more effective.

Scale

At Mozilla we run about 4k build jobs and 70k test jobs a day
80% of builds and 50% of tests run on cloud instances
Build farm consists of 6000 devices
Scaling issues are fun!

Tools

John O'Duinn's Mozilla release pipeline
Tree Herder Status

This is a picture of how the different parts of our build farm work together. Developers land changes on code repositories such as hg.mozilla.org. As I mentioned before, we use an open source continuous integration engine called Buildbot. We have over 50 Buildbot masters. Masters are segregated by function: running tests, builds, scheduling, and try. Test and build masters are further divided by function so we can limit the type of jobs they run and the types of slaves they serve. For instance, a master may have Windows build slaves allocated to it, or Android test slaves. This makes the masters more efficient because they don't need to have every type of job loaded and consuming memory. It also makes maintenance more efficient: you can bring down, for example, the Android test masters for maintenance without having to touch other platforms.

Buildbot polls the hg push log for each of the code repositories (Hgpoller). When the poller detects a change, the information about the change is written into the scheduler database. The Buildbot scheduler masters are responsible for taking this request in the database and creating a new build request. The build request will then appear as pending on the web page in the previous slide. The jobs may run on existing hardware in our data centre, or new VMs may be started or created in the cloud to run these pending jobs. When the job finishes, logs and artifacts are uploaded to FTP. The results appear on tbpl. Performance results are stored in the graphs database.

References
http://people.mozilla.org/~catlee/relengdocs/flows.html#firefox-builds-from-checkin-to-tbpl
http://oduinn.com/blog/2013/06/07/release-engineering-as-a-force-multiplier-relengcon-2013/

Selena Deckelman's diagram of Mozilla Release Engineering flows

This is a more complicated picture of how the release engineering pipeline at Mozilla works. As you can see, it's really a large distributed system. Implementing changes on it is quite complex.

Other related disciplines:

Configuration Management

Puppet, Chef, Ansible, Docker

We have several thousand devices in our build farm at Mozilla. We need to manage them at scale. For many of our devices we use Puppet to configure them automatically, e.g. install a new package, change a file, configure a new platform.

Build, Test and Release Automation

Automate all the things

Continuous Integration/Deployment

Evaluating and writing tools
Optimizing the pipeline for shorter end to end times
Removing bottlenecks then finding new ones
Reducing costs and scaling capacity

Release Management

Coordinating with a development team to determine what is shipped in a release

Repository Management

Branching Strategies

Package and Dependency Management

© xkcd - Dependencies - Randall Munroe

Artifactory, Maven repos, p2 repos, PyPI, custom RPMs: different languages and operating systems have different package management systems and thus different ways of managing dependencies, e.g. pip, apt-get, Java jars. There are also different types of deployment artifacts.

Going from 3 week to Daily releases

Daniel Zapata, Senior Software Engineer, Netflix

0:44-3:15

Testing and Quality Assurance

Porting tests to new platforms
Writing Test harnesses
- Unit tests
- Performance tests
- Regression testing

Building a physical device lab

Lara Hogan, Senior Engineering Manager of Performance, Etsy

4:45-7:49

Our project

Set up a simple build pipeline using Jenkins on AWS
Install additional Jenkins plugins
Query Jenkins build status using REST API

Lab 1: Using your AWS instance

Goal: Log in to your AWS instance via ssh. Query the packages that are available on it.

AWS = Amazon Web Services
Shared cloud resources
Different instance types - different resources - memory, disk, CPU
We are using an m3.large instance type running Ubuntu 14.04

mention other cloud services Azure, IBM etc

Lab 1: Login to your instance

ssh -l username hostname
if needed, you can use sudo to become root
look at packages installed

dpkg -l | more 
dpkg -l | grep jenkins

Lab 1: Git

Git is a distributed version control system
Git Design Goals: fast, secure, distributed
Each commit has a corresponding hash key
Everyone has a local copy of the history
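
As a rough illustration of where these hash keys come from, the sketch below computes the id Git gives to a file's contents (a "blob"): a SHA-1 over a short header plus the bytes. Commits are hashed the same way, with a `commit` header and a body that names the tree, the parent commits, the author and the message. The helper name `git_blob_hash` is made up for this example.

```python
import hashlib

def git_blob_hash(content):
    """Return the SHA-1 id Git assigns to a blob with these bytes."""
    # Git hashes "blob <size in bytes>\0" followed by the content itself.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

# Compare with: printf 'hello world\n' | git hash-object --stdin
print(git_blob_hash(b"hello world\n"))
```

Because the id depends only on content, two people who commit identical history end up with identical hashes, which is what makes the distributed model workable.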

Lab 1: Git - quick command guide

git clone repo_name           #clone a repository locally
cd repo_name        
vi my_file          
git add my_file               #add my_file to git tracking
git status                    #see the status of the git repository 
git commit -a -m "a comment"   #commit the change to the local repository
git push                      #push the change to the remote repository
git branch                    #show current branch
git checkout -b branch2       #create a new local branch and switch to it

You can use Git within an IDE if you want, but these are the command line equivalents

Lab 1: Git exercise

cd /home/ubuntu
git clone https://github.com/salimfadhley/jenkinsapi     
cd jenkinsapi
git branch 
git tag
git log
git status 
git config --global user.name "Your Name"
git config --global user.email you@example.com

Clone the jenkinsapi project locally to your AWS machine. Look at the status of the repo (no pending changes). Use git log to look at recent changes.

Lab 2: Continuous Integration

Everything in version control - build, test and deployment scripts
Automate all the things: Build, test, deployment, update process
Build on commit
Fix problems as they arise
Popular CI servers

Some are open source, some are commercial, e.g. Jenkins, Hudson, Travis CI (a hosted service that integrates with GitHub), Buildbot, Bamboo from Atlassian

Lab 2: Jenkins

Jenkins is a continuous integration engine
Written in Java
Open source
Plugin architecture that allows you to add functionality in a modular fashion
Many community plugins available

https://wiki.jenkins-ci.org/display/JENKINS/Plugins
Examples in the wild: Netflix, as well as the Apache project (https://builds.apache.org). Mozilla uses Buildbot, which scales better for our volume.

Lab 2: Jenkins

Jenkins welcome page
at your machine's http://fqdn:8080

Lab 2: Jenkins login

Select log in on the left-hand side
Username and password are on whiteboard
Select Remember me on this computer

Lab 2: Jenkins: Create a job

Select New Item on left side of screen
In Item name give your job a name
Select Freestyle project and select OK

Lab 2: Jenkins: Clone from Git

Under Source Code Management select Git
Under Repository URL enter the path to your local git repository i.e. /home/ubuntu/jenkinsapi and select Save
Then select Build Now on the left hand side of the screen

Lab 2: Jenkins: Build Results

When the build finishes, look at Console Output

Lab 3: Jenkins: Build on Commit

Select Back to Project
Select Configure
Select Poll SCM
Under Schedule enter H/2 * * * * to poll git every two minutes
Select Save
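
The schedule uses cron-style fields: minute, hour, day of month, month and day of week. The `H` asks Jenkins to pick a stable offset derived from a hash of the job name, so that many jobs with the same schedule do not all poll at the same instant. The sketch below only illustrates that idea; MD5 is an assumption standing in for whatever hash Jenkins uses internally, and `poll_minutes` is a made-up helper.

```python
import hashlib

def poll_minutes(job_name, step=2):
    """Minutes of the hour at which a schedule like H/<step> would fire.

    Assumption: MD5 stands in for Jenkins' internal hash. Only the idea
    (a stable per-job offset smaller than the step) is illustrated here.
    """
    digest = hashlib.md5(job_name.encode("utf-8")).hexdigest()
    offset = int(digest, 16) % step
    return list(range(offset, 60, step))

minutes = poll_minutes("build1")
print("build1 polls at minutes: %s" % minutes)
```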

Lab 3: Jenkins: Commit a file

ubuntu@ip-172-31-31-167:~/jenkinsapi$ cat TODO 
TODO:

* Add a testsuite (preferably nose or py.test) which doesn't rely on a local jenkins setup (or instantiates one during test)
* Clean up the fingerprint code
* Clean up the resultset and results code
* Make all objects inherit off jenkins_base where that makes sense
* Add ability to add/modify/delete jobs
* Add ability to query jenkins for plugin data

ubuntu@ip-172-31-31-167:~/jenkinsapi$ vi TODO
ubuntu@ip-172-31-31-167:~/jenkinsapi$ git commit -a -m "updated TODO"
[master d55e7e2] updated TODO

You can update any file; it doesn't matter which one

Lab 3: Jenkins: Look at job output

Another job should start within 2 minutes
But only if there is a new commit
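
The "only if there is a new commit" behaviour can be sketched as comparing the revision seen at poll time with the revision that was last built. This is a simplified model, not Jenkins' implementation; the first revision id is hypothetical and the second is the commit from the previous slide.

```python
def should_build(last_built_rev, head_rev):
    """Return True when a poll should trigger a build."""
    return head_rev != last_built_rev

# HEAD revision observed on four successive polls, two minutes apart.
seen_on_each_poll = ["a1b2c3d", "a1b2c3d", "d55e7e2", "d55e7e2"]

last_built = None
builds = []
for head in seen_on_each_poll:
    if should_build(last_built, head):
        builds.append(head)
        last_built = head

print("builds triggered for: %s" % builds)
```

Four polls produce only two builds here: one for the initial revision and one after the new commit lands.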

Build on Commit vs build on Trunk

Why is it important?

Many small projects just build from trunk (master branch) or another designated branch. However, this often does not scale when you have a large code base and a large team of developers. It's difficult to determine who caused the breaking failure if you have a large number of changes and you only run a build once a day. If you build on commit, you can quickly see what caused the breaking failure and back it out before other changes cascade on top of it. If you don't have the infrastructure budget to build on commit, you can build and run tests locally, e.g. in your IDE, before committing code to the canonical repo. However, this doesn't fully address integration: having your code integrate with all the other code in the repo.

Bisect, fix bustage or backout

Use tools such as git bisect to identify the offending commit
Back out the change set or fix the code
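
git bisect works by binary search between a commit known to be good and one known to be bad. The sketch below shows that search on a made-up linear history; it assumes a single breaking change, and the commit names and the `is_good` check are placeholders for a real build-and-test run.

```python
def first_bad_commit(commits, is_good):
    """Binary search for the first commit where is_good() fails.

    commits is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad.
    """
    good, bad = 0, len(commits) - 1
    steps = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        steps += 1
        if is_good(commits[mid]):
            good = mid
        else:
            bad = mid
    return commits[bad], steps

# Hypothetical history of 100 commits where the build broke at c63.
history = ["c%d" % i for i in range(100)]
broken_from = 63
culprit, builds_needed = first_bad_commit(
    history, lambda c: int(c[1:]) < broken_from)

print("first bad commit: %s after %d test builds" % (culprit, builds_needed))
```

The number of test builds grows with the logarithm of the number of commits, which is why bisecting stays practical even when a daily build contains many changes.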

Lab 4: Add more build steps

Select your build
Go to Configure
Under the Build heading, select Add build step
Select Execute shell

The people who maintain the Python Jenkins API were nice enough to include a Jenkins script we can run to replicate their build

It uses virtualenv to create a Python virtual environment to run the tests

This is a good practice because it keeps the test environment separate from the machine environment

Lab 4: Add more build steps

Text to put in the Command box

chmod 755 $WORKSPACE/jenkins_build.sh
$WORKSPACE/jenkins_build.sh

Select Save
Select Build now
The build will take about 7 minutes
What can you tell from the build output?

Lab 4: Add Test Results report

Go back to your project
Select Configure
Under Add post-build action
Select Publish JUnit test result report
Enter nosetests.xml
Select Save and rerun your build
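
The report Jenkins reads here is an XML file in the JUnit/xUnit layout. As a rough sketch of what the plugin extracts from it, the Python below parses a small hand-written sample; the test names and the failure are invented, and a real nosetests.xml will contain more attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical report in the xUnit layout; test names are made up.
SAMPLE = """<testsuite name="nosetests" tests="3" errors="0" failures="1" skip="0">
  <testcase classname="tests.test_job" name="test_name" time="0.01"/>
  <testcase classname="tests.test_job" name="test_build" time="0.02">
    <failure type="AssertionError" message="1 != 2">traceback here</failure>
  </testcase>
  <testcase classname="tests.test_view" name="test_list" time="0.01"/>
</testsuite>
"""

def summarize(xml_text):
    """Return (total, failed_test_names) from an xUnit-style report."""
    suite = ET.fromstring(xml_text)
    failed = []
    for case in suite.iter("testcase"):
        # A failed or errored test carries a <failure> or <error> child.
        if case.find("failure") is not None or case.find("error") is not None:
            failed.append("%s.%s" % (case.get("classname"), case.get("name")))
    return int(suite.get("tests")), failed

total, failed = summarize(SAMPLE)
print("%d tests, %d failed: %s" % (total, len(failed), failed))
```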

Lab 4: Test Results report

Go back to your project
On the left-hand side, select Test Result

Lab 4: Break things

Go change some code in the repository to break the build
e.g. comment out some code in one of the files

Lab 4: Bonus labs

Write a script to generate many commits to your git repository
Add additional executors to the Jenkins install
Look at the number of queued jobs

Lab 5: Add a plugin

On the left hand side select Back to Dashboard
On the left hand side select Manage Jenkins
Select Manage Plugins
Select the Available tab
You'll see a list of available plugins
Check the Build Pipeline plugin
Select Install plugin
Select Download now and install after restart
Select Restart Jenkins when installation is complete and no jobs are running

Lab 5: Restart Jenkins to enable new plugins

Will take a few minutes to complete
After Jenkins has restarted, verify that the Build Pipeline plugin is listed under the Installed tab
Go to http://your_jenkins_server/restart if it doesn't restart automatically

Lab 5: Add a new build pipeline view

Select My Views on the left hand side
On the views pages, select + to add a new view
Select Build Pipeline View
Select OK
Enter Build Pipeline for the name
The defaults are okay, select OK

Lab 5: Look at build pipeline view

Lab 5: Add two new build jobs

Select Jenkins to go back to the dashboard
Select New Item on left side of screen
In Item name give your job a name like Performance1
Select Freestyle project and select OK
Add Build step and select Execute Shell
echo "$JOB_NAME is running"
Select Save
Repeat these same steps for a job called Deploy1

Lab 5: Configure chained builds

On the Jenkins dashboard select build1 and then Configure
Select Add post-build action
Select Trigger parameterized build on other projects
Select Projects to build and select Performance1
Select Add Parameters
Select Current build parameters
Select Save

Lab 5: Configure chained builds (2)

Repeat these same steps so the performance build triggers the deployment build

Lab 5: Run builds and look at pipeline

Look at the pipeline view: does it show build -> performance -> deploy?
Run the performance and deploy tests (to save time) or build1 if you want
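
The chain configured above can be modelled as a mapping from each job to the jobs it triggers. This small Python sketch walks that mapping to print the order the pipeline view should show; `build1` stands for whatever you named your first job, and the data structure is an illustration rather than anything Jenkins exposes in this form.

```python
# Job names follow the lab; the mapping itself is a hand-written model.
downstream = {
    "build1": ["Performance1"],
    "Performance1": ["Deploy1"],
    "Deploy1": [],
}

def pipeline_order(start, graph):
    """Walk the trigger chain from start and return jobs in run order."""
    order, queue = [], [start]
    while queue:
        job = queue.pop(0)
        if job in order:
            continue  # skip a job that was already scheduled
        order.append(job)
        queue.extend(graph.get(job, []))
    return order

print(" -> ".join(pipeline_order("build1", downstream)))
```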

Lab 5: Bonus

Implement another pipeline that runs from a different branch or tag
Hint: You can copy existing jobs and modify

Scaling CI in the cloud

Why do we need to scale CI?
Different cloud vendors: Amazon, IBM, Microsoft, Rackspace...
Goal is to increase capacity while minimizing cost

If you build on commit, the number of builds that you run will scale up as people in different time zones wake up and start landing code

More Capacity, less $

Run instances in multiple regions
Start instances in cheaper regions first
Scripts to automatically shut down inactive instances
Start instances that have been recently running
Use spot instances vs on demand instances for tests
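
Two of these ideas, starting in cheaper regions first and shutting down inactive instances, can be sketched in a few lines of Python. All region names, prices, instance names and the idle limit below are made-up illustration values, not real figures.

```python
# All names, prices and limits here are hypothetical.
HOURLY_PRICE = {"region-a": 0.14, "region-b": 0.11, "region-c": 0.17}
IDLE_LIMIT_MINUTES = 30

def regions_cheapest_first(prices):
    """Order regions so new instances are requested where they cost least."""
    return sorted(prices, key=prices.get)

def instances_to_stop(idle_minutes_by_instance, limit=IDLE_LIMIT_MINUTES):
    """Pick instances that have been idle for at least `limit` minutes."""
    return sorted(name for name, idle in idle_minutes_by_instance.items()
                  if idle >= limit)

print(regions_cheapest_first(HOURLY_PRICE))
print(instances_to_stop({"tst-001": 5, "tst-002": 45, "bld-001": 30}))
```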

Lab 6: REST APIs

Representational state transfer
client server
stateless
lightweight
scalable

Lab 6: Jenkins REST APIs

Jenkins REST APIs
- XML
- JSON
- Python
extend tooling or reporting
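
As a minimal sketch of the JSON flavour, the Python 3 code below reads the top-level document that Jenkins serves at `/api/json` and maps each job name to its `color` field, which Jenkins uses as a status code (for example blue for success, red for failure). The sample payload is hand-written and trimmed to the fields used; `fetch_jobs` is not called here, and a server that requires login would also need credentials.

```python
import json
from urllib.request import urlopen

def fetch_jobs(base_url):
    """GET <base_url>/api/json and return the decoded top-level document."""
    with urlopen(base_url.rstrip("/") + "/api/json") as response:
        return json.loads(response.read().decode("utf-8"))

def job_status(document):
    """Map each job name to the 'color' field Jenkins reports for it."""
    return {job["name"]: job["color"] for job in document.get("jobs", [])}

# Hypothetical response trimmed to the fields used here.
sample = json.loads('{"jobs": [{"name": "build1", "color": "blue"},'
                    ' {"name": "Performance1", "color": "red"}]}')
print(job_status(sample))
# Against a live server: job_status(fetch_jobs("http://localhost:8080"))
```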

Lab 6: Query your Jenkins server

Install jenkinsapi from pypi to ensure it's sane
sudo pip install jenkinsapi
Jenkins API reference
Example python to query the last successful builds for each of the builds defined on your Jenkins server

from jenkinsapi.jenkins import Jenkins

# Connect to the Jenkins server running on this machine
J = Jenkins('http://localhost:8080')
print("Jenkins Version %s" % J.version)

# J.keys() returns the names of the jobs defined on the server
build_list = J.keys()
print("Jenkins keys %s" % build_list)
for b in build_list:
    print("Build name %s Last good build %s" % (b, J[b].get_last_good_build()))

Lab 6: Query your neighbour's Jenkins server

Replace localhost in the previous example with their FQDN

Lab 6: Query the plugins installed

Example Python to query the plugins installed on your Jenkins server. Query your neighbour's too!

from jenkinsapi.jenkins import Jenkins

# Connect to the Jenkins server running on this machine
J = Jenkins('http://localhost:8080')
print("Jenkins Version %s" % J.version)

# Print the details of every installed plugin
for plugin in J.get_plugins().values():
    print("Short Name:%s" % plugin.shortName)
    print("Long Name:%s" % plugin.longName)
    print("Version:%s" % plugin.version)
    print("URL:%s" % plugin.url)
    print("Active:%s" % plugin.active)
    print("Enabled:%s" % plugin.enabled)

Challenge Exercise #1

Create a new build pipeline with a different open source project

Challenge Exercise #2

Query the REST API of Apache project builds
Please provide source code and define functions as appropriate
The Python Jenkins API doc is here
If you prefer Java, you can use the Jenkins API for Java
Or, for Ruby, the Ruby Jenkins API
- What plugins are installed?
- What builds are running?
- Do the running builds have downstream builds?
- Which jobs haven't run a successful build in the past week?
- For failed builds, what is the reason they failed?
- Extra bonus: present this information graphically

Further Reading