The Purely Functional Software Deployment Model – Context – Software deployment doesn't "feel" like a research topic





pf-deployment

Presentation on The Purely Functional Software Deployment Model by Eelco Dolstra

On Github philandstuff / pf-deployment

The Purely Functional Software Deployment Model

Eelco Dolstra, PhD Thesis, University of Utrecht

Philip Potter / @philandstuff

philandstuff.github.io/pf-deployment/

UK Government Digital Service

Context

PhD viva: 18th January 2006

Software deployment doesn't "feel" like a research topic

System administration is often seen as more vocational than academic

Tools like puppet, chef, rpm, dpkg were written to scratch an itch

(With exceptions, of course)

M. Burgess, Cfengine: a site configuration engine, USENIX Computing Systems, Vol. 8, No. 3, 1995

In fact, there's a whole Cfengine papers page

This PhD thesis presents, first and foremost, a theory of software deployment

The nix package manager is an implementation of that theory

see also Naur's Programming as Theory Building for more exploration of this idea

What is a package manager?

Which of these tools are package managers?

rpm, dpkg, rubygems, bundler, zip, docker, rsync?

Package manager (n.): a tool that deploys software correctly

Deployment (n.): getting computer programs from one machine to another—and having them still work when they get there.

From the first sentence of Chapter 1

rsync: the null package manager

"deployment" should be just "copying some files"

rsync allows you to shoot yourself in the foot

What is correct deployment?

given identical inputs, the software should behave the same on an end-user machine as on the developer machine

-- section 1.1

What goes wrong with deployments?

Missing dependencies

Wrong versions of dependencies

Dependency conflicts

Missing dependencies

Outside scope of package manager (libyaml, libxml for ruby or python)

Accidentally missed by package (don't all machines have zlib? openssl?)

Wrong versions of dependencies

"It works on my machine"

CI and production don't have same version of library

Conflicting dependencies

My app was working fine, until someone deployed something that needed a later version of libfoo

Now my app is broken because of the libfoo change

DLL hell, Cabal hell

Installing or upgrading one package shouldn't break other packages

Basic principles

Use cases

Arbitrary software (not limited by language)

Allow multiple package versions

Allow unprivileged users to install software

Dependencies

Identify packages by cryptographic hash

Hardcode the specific hash of the package version in the built artefact

The nix store

A content-addressable store, where each package has a path based on a cryptographic hash

Once a package is created, it can't be changed (because the hash would have to change)
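To make the "can't be changed" property concrete, here is a minimal sketch of a toy content-addressed store. The class name `ToyStore`, the `/toy/store` prefix and the use of a truncated hex SHA-256 are all assumptions for illustration; real nix computes its hashes differently (by default over the inputs of a build rather than over its output).

```python
import hashlib


class ToyStore:
    """A toy content-addressed store (hypothetical, not nix's real algorithm)."""

    def __init__(self) -> None:
        self._objects = {}  # maps store path -> immutable bytes

    def add(self, name: str, content: bytes) -> str:
        # The path is derived from the content, so different content
        # can never end up at the same path.
        digest = hashlib.sha256(content).hexdigest()[:32]
        path = f"/toy/store/{digest}-{name}"
        # setdefault never overwrites an existing entry.
        self._objects.setdefault(path, content)
        return path

    def read(self, path: str) -> bytes:
        return self._objects[path]
```

"Changing" a package here really means adding a second, differently named path; anything that referred to the old path keeps seeing the old bytes.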

Structure of a nix path

/nix/store/n0a1y0yd54sh10p7rdi0alysscvac5x5-certificate-transparency-2016-01-14/bin/ct
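The path above splits into a hash part, a human-readable name, and a path inside the package. A small sketch of that split, assuming (as in the examples in this talk) that the hash part is 32 lowercase alphanumeric characters; `parse_store_path` and `StorePath` are names made up for this example:

```python
import re
from typing import NamedTuple

# Assumption: /nix/store/<32-char hash>-<name>[/<path inside the package>]
STORE_PATH_RE = re.compile(
    r"^/nix/store/(?P<digest>[0-9a-z]{32})-(?P<name>[^/]+)(?:/(?P<rest>.*))?$"
)


class StorePath(NamedTuple):
    digest: str  # the cryptographic hash part
    name: str    # human-readable package name and version
    rest: str    # path inside the package, "" if none


def parse_store_path(path: str) -> StorePath:
    match = STORE_PATH_RE.match(path)
    if match is None:
        raise ValueError(f"not a store path: {path!r}")
    return StorePath(match["digest"], match["name"], match["rest"] or "")
```

For the path on this slide, the three parts are `n0a1y0yd54sh10p7rdi0alysscvac5x5`, `certificate-transparency-2016-01-14` and `bin/ct`.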

Locating dependencies

Where other binaries might search for libjson in /usr/lib or /usr/local/lib, we search in /nix/store/6hcccvdx29r4j8n0wxl85b6mlmq0gvs1-libjson-7.6.1, and nowhere else

Only the exact same version is acceptable

Querying dependencies

$ nix-store --query --references /nix/store/n0a1y0yd54sh10p7rdi0alysscvac5x5-certificate-transparency-2016-01-14

$ nix-store --query --requisites /nix/store/n0a1y0yd54sh10p7rdi0alysscvac5x5-certificate-transparency-2016-01-14
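The two queries differ in depth: `--references` lists direct dependencies, while `--requisites` lists the whole closure, including the path itself. A rough sketch of the difference over a hypothetical dependency graph (the package names and the `GRAPH` dict are invented for illustration):

```python
from typing import Dict, List, Set

# Hypothetical graph: each package maps to its direct references.
GRAPH: Dict[str, List[str]] = {
    "app": ["libjson", "glibc"],
    "libjson": ["glibc"],
    "glibc": [],
}


def references(graph: Dict[str, List[str]], path: str) -> List[str]:
    """Direct dependencies only."""
    return sorted(graph[path])


def requisites(graph: Dict[str, List[str]], path: str) -> Set[str]:
    """The path plus everything reachable from it (its closure)."""
    closure: Set[str] = set()
    stack = [path]
    while stack:
        current = stack.pop()
        if current in closure:
            continue
        closure.add(current)
        stack.extend(graph[current])
    return closure
```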

How to deploy a package:

  • Identify the package's dependencies (by hash)
  • Correctly deploy each of the package's dependencies (recursively if necessary)
  • Copy the package itself

This works for any target machine (with matching arch & kernel): Ubuntu, RHEL, CentOS...

Command: nix-copy-closure

Avoids conflicts like an omnibus package

Shares common dependencies like a regular package manager
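The three deployment steps can be sketched as a recursive copy. This is only an illustration of the idea, not how nix-copy-closure is implemented: stores are modelled as plain dicts, and `copy_closure`, `REFERENCES` and `SOURCE` are hypothetical names. The sketch assumes the target store is always closed, i.e. if a path is present then so are its dependencies.

```python
from typing import Dict, List


def copy_closure(references: Dict[str, List[str]],
                 source: Dict[str, bytes],
                 target: Dict[str, bytes],
                 path: str) -> List[str]:
    """Copy `path` and its missing dependencies; return what was copied."""
    copied: List[str] = []

    def visit(p: str) -> None:
        if p in target:
            return  # already deployed, so it is shared rather than re-copied
        for dep in references[p]:
            visit(dep)  # deploy dependencies first, recursively
        target[p] = source[p]  # then copy the package itself
        copied.append(p)

    visit(path)
    return copied


# Hypothetical packages: two apps sharing libc.
REFERENCES = {
    "app-a": ["libjson", "libc"],
    "app-b": ["libc"],
    "libjson": ["libc"],
    "libc": [],
}
SOURCE = {name: f"contents of {name}".encode() for name in REFERENCES}
```

Deploying `app-a` to an empty target copies three paths; deploying `app-b` afterwards copies only `app-b`, because `libc` is already there under the same name.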

Aside: FHS considered harmful

FHS is about guessing

Whatever is at /usr/lib/libjson.so.1 is fine

v1.0.0? v1.52.65? v1.1.3-random-fork?

This is totally against the goal of correct deployment!

Making a nix package

A worked example: sassc

An example of ./configure ; make ; make install

{ stdenv, fetchurl, autoreconfHook, libsass }:

stdenv.mkDerivation rec {
  name = "sassc-${version}";
  version = "3.3.2";

  src = fetchurl {
    url = "https://github.com/sass/sassc/archive/${version}.tar.gz";
    sha256 = "15a2b2698639dfdc7bd6a5ba7a9ecdaf8ebb9f15503fb04dea1be3133308e41d";
  };

  patchPhase = ''
    export SASSC_VERSION=${version}
  '';

  nativeBuildInputs = [ autoreconfHook ];

  buildInputs = [ libsass ];

  meta = with stdenv.lib; {
    description = "A front-end for libsass";
    homepage = https://github.com/sass/sassc/;
    license = licenses.mit;
    maintainers = with maintainers; [ codyopel pjones ];
    platforms = platforms.unix;
  };
}

{ stdenv, fetchurl, autoreconfHook, libsass }:

stdenv.mkDerivation rec {
  # ...
}

{ stdenv, fetchurl, autoreconfHook, libsass }:

stdenv.mkDerivation rec {
  name = "sassc-${version}";
  version = "3.3.2";

  src = fetchurl {
    url = "https://github.com/sass/sassc/archive/${version}.tar.gz";
    sha256 = "15a2b2698639dfdc7bd6a5ba7a9ecdaf8ebb9f15503fb04dea1be3133308e41d";
  };

  # ...
}

{ stdenv, fetchurl, autoreconfHook, libsass }:

stdenv.mkDerivation rec {
  name = "sassc-${version}";
  version = "3.3.2";

  src = fetchurl # ...;

  patchPhase = ''
    export SASSC_VERSION=${version}
  '';

  nativeBuildInputs = [ autoreconfHook ];

  buildInputs = [ libsass ];

  # ...;
}

{ stdenv, fetchurl, autoreconfHook, libsass }:

stdenv.mkDerivation rec {
  name = "sassc-${version}";
  version = "3.3.2";

  src = fetchurl # ...;

  # ...;

  meta = with stdenv.lib; {
    description = "A front-end for libsass";
    homepage = https://github.com/sass/sassc/;
    license = licenses.mit;
    maintainers = with maintainers; [ codyopel pjones ];
    platforms = platforms.unix;
  };
}

demo

nix-shell --pure -A pkgs.sassc

env | grep libsass

A non-autoconf package: ponysay

{ stdenv, fetchurl, python3, texinfo, makeWrapper }:
stdenv.mkDerivation rec {
  buildInputs = [ python3 texinfo makeWrapper ];

  phases = "unpackPhase installPhase fixupPhase";

  installPhase = ''
    find -type f -name "*.py" | xargs sed -i "s@/usr/bin/env python3@$python3/bin/python3@g"
    substituteInPlace setup.py --replace \
        "fileout.write(('#!/usr/bin/env %s\n' % env).encode('utf-8'))" \
        "fileout.write(('#!%s/bin/%s\n' % (os.environ['python3'], env)).encode('utf-8'))"
    python3 setup.py --prefix=$out --freedom=partial install \
        --with-shared-cache=$out/share/ponysay \
        --with-bash
  '';
}

evilvte: compiled-in configuration

{ stdenv, fetchgit, makeWrapper, /* ... */ configH }:

stdenv.mkDerivation rec {
  # ...;

  buildPhase = ''
    cat >src/config.h <<EOF
    ${configH}
    EOF
    make
  '';
}

Glossary so far

  • Nix store: A read-only content-addressable filesystem
  • Store path: A path in the nix store, identified by the hash of its contents (a "package")
  • Derivation: A recipe for constructing a store path
  • Building: Creating a store path from a derivation
  • Closure: A store path and all of its dependencies needed to function correctly

Nixpkgs and Hydra

nixpkgs: the nix package collection

https://github.com/NixOS/nixpkgs

76k commits, 696 contributors

21-28 Jan: 59 PRs merged by 38 people

Substitutions

Nix derivations are defined in terms of source code

Building a derivation can be skipped by substitution from a prebuilt binary cache

hydra: the nix continuous build server

every new commit to nixpkgs is built on a build farm

binaries made available at cache.nixos.org substitution server

Profiles and environments

An environment is a set of installed packages

With corresponding environment variables PATH, LD_LIBRARY_PATH, PKG_CONFIG_PATH...

Multiple environments can coexist (because multiple packages are allowed)

A profile is a set of environments over time

Each environment is called a "generation"

Each operation on a profile creates a new generation

Each user can have their own profile

Profile operations

  • Install packages
  • Upgrade packages
  • Remove packages
  • Rollback to a previous generation
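A profile can be pictured as an append-only list of generations plus a pointer to the current one. The `Profile` class below is a hypothetical model, not nix-env's implementation; in particular, having rollback move the pointer instead of appending a new generation is an assumption of this sketch. The package names are made up.

```python
from typing import FrozenSet, List


class Profile:
    """Toy model: a history of environments (generations)."""

    def __init__(self) -> None:
        self._generations: List[FrozenSet[str]] = [frozenset()]
        self._current = 0

    @property
    def environment(self) -> FrozenSet[str]:
        return self._generations[self._current]

    def _commit(self, env) -> None:
        # Old generations are never modified, only appended to.
        self._generations.append(frozenset(env))
        self._current = len(self._generations) - 1

    def install(self, pkg: str) -> None:
        self._commit(self.environment | {pkg})

    def remove(self, pkg: str) -> None:
        self._commit(self.environment - {pkg})

    def upgrade(self, old: str, new: str) -> None:
        self._commit((self.environment - {old}) | {new})

    def rollback(self) -> None:
        # Assumption: rollback just points back at the previous generation.
        if self._current > 0:
            self._current -= 1
```

Because every generation is kept intact, rolling back after a bad upgrade is only a pointer change.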

Aside: mass upgrades

Ever upgraded to a new major distro release (e.g. 14.10 to 15.04)?

How do you recover?

Rollbacks save you here!

Development environments

An environment can also be created for a particular project context

Minimum necessary dependencies for project

We already saw this with nix-shell for building sassc

Functional programming parallels

Deployment as memory management (Ch 3)

Runtime-provided dependency

class Foo {
  Foo () {}
  int run(Bar y) { return y.doIt(); }
}

export PYTHONPATH=/path/to/foo-module
python script-that-needs-foo-module.py

Compile-time-provided dependency

class Foo {
  int x;
  Foo (Bar y) { x = y.doIt(); }
  int run() { return x; }
}

gcc -o foo foo.c -static -L/path/to/libjson -ljson
./foo            # no runtime dependency on libjson

Note that y and libjson are not needed after construction or compilation, respectively

Retained dependency

class Foo {
  Bar x;
  Foo (Bar y) { x = y; }
  int run() { return x.doIt(); }
}

gcc -o foo foo.c -L/path/to/libjson -ljson # no `-static`
./foo            # runtime dependency on libjson.so

Here, y and libjson need to be retained after construction or compilation, respectively, or the constructed artefact will fail

Conservative GC

Has a built artefact retained a dependency, or is it now unneeded?

Use a conservative GC: scan for pointer references

GC roots are user profiles, system profiles, etc

Scanning is simply a grep for the hash part of each store path
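A rough sketch of the scan-and-mark idea, assuming a toy store modelled as a dict from hash to file contents. The short hashes, the `/store/...` strings and both function names are invented for illustration and are not nix's implementation.

```python
from typing import Dict, Iterable, Set


def find_references(content: bytes, known_hashes: Iterable[str]) -> Set[str]:
    """Conservative scan: any known hash appearing in the bytes counts."""
    return {h for h in known_hashes if h.encode() in content}


def collect_garbage(store: Dict[str, bytes], roots: Iterable[str]) -> Set[str]:
    """Return the hashes that are unreachable from the GC roots."""
    live: Set[str] = set()
    stack = list(roots)
    while stack:
        h = stack.pop()
        if h in live:
            continue
        live.add(h)
        # Follow every hash that appears inside this package's contents.
        stack.extend(find_references(store[h], store.keys()) - live)
    return set(store) - live


# Hypothetical store: "aaaa1111" is a root and mentions "bbbb2222".
STORE = {
    "aaaa1111": b"exec /store/bbbb2222-interp/bin/sh",
    "bbbb2222": b"no references here",
    "cccc3333": b"links /store/bbbb2222-interp/lib",
}
```

With `aaaa1111` as the only root, `bbbb2222` is kept because its hash appears inside a live package, and `cccc3333` is garbage because nothing live mentions it.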

PS: pointer arithmetic

It is possible to retain dependencies but "hide" them from the GC

This problem exists in theory both in regular conservative GC and in nix's GC, but in practice it is not a problem

Binary substitution as memoization

A package description is a function that takes source code and returns compiled code

A binary substitution is a cached evaluation of the same function which can be used instead of rebuilding the package

(In this way we neatly have a single definition for source and binary packages: binary packages come "for free")
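The memoization analogy can be sketched directly: key a cache by a hash of the package description and only build on a miss. `Builder`, `derivation_key` and the dict-based cache are hypothetical stand-ins for this example, not nix's API, and the "build" is just a string operation.

```python
import hashlib
import json
from typing import Dict, Optional


def derivation_key(derivation: dict) -> str:
    """Hash of the package description; equal inputs give equal keys."""
    encoded = json.dumps(derivation, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


class Builder:
    def __init__(self, binary_cache: Optional[Dict[str, bytes]] = None) -> None:
        self.binary_cache = {} if binary_cache is None else binary_cache
        self.builds_run = 0

    def _build_from_source(self, derivation: dict) -> bytes:
        self.builds_run += 1  # stands in for an expensive compile
        return f"compiled({derivation['name']})".encode()

    def realise(self, derivation: dict) -> bytes:
        key = derivation_key(derivation)
        if key in self.binary_cache:
            return self.binary_cache[key]  # substitution: skip the build
        output = self._build_from_source(derivation)
        self.binary_cache[key] = output
        return output
```

Two builders sharing one cache behave like a build farm and a user machine: the first pays for the build, the second gets the binary without building.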

Intensional vs extensional equality

Extensional equality

Two functions $f$ and $g$ are extensionally equal if, for all $x$, $f(x) = g(x)$.

Two software programs foo and bar are extensionally equal if, for all inputs, foo and bar have the same behaviour.

Intensional equality

Two functions $f$ and $g$ are intensionally equal if they have the same syntactic definition.

Two software programs foo and bar are intensionally equal if they have the same representation on disk, byte-for-byte.

Purity of nix packages

Whether you accept the substitution model depends on your definition of function purity.

A nix build may, on repeated evaluation, produce artefacts which are only extensionally equal, but not intensionally.
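A small sketch of how this happens, assuming a hypothetical `build` step that embeds a timestamp in its output. The artefacts here are just Python source bytes, and "behaviour" is checked on a finite sample of inputs, which can suggest extensional equality but never prove it.

```python
def build(source: str, build_time: str) -> bytes:
    """Hypothetical build: the output embeds the time it was built."""
    return f"# built at {build_time}\n{source}".encode()


def run(artefact: bytes, x: int) -> int:
    """Execute the artefact and call its main() on x."""
    namespace: dict = {}
    exec(artefact.decode(), namespace)
    return namespace["main"](x)


def intensionally_equal(a: bytes, b: bytes) -> bool:
    """Same representation, byte for byte."""
    return a == b


def extensionally_equal_on(a: bytes, b: bytes, inputs) -> bool:
    """Same behaviour on every sampled input."""
    return all(run(a, x) == run(b, x) for x in inputs)


SOURCE = "def main(x):\n    return x * 2\n"
first = build(SOURCE, "10:00")
second = build(SOURCE, "10:05")
```

Here `first` and `second` behave identically but differ in their bytes, so a cache keyed on the build description may hand out either one.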

Aside: reproducible builds

There is a growing community around reproducible builds (eg @ReproBuilds)

The value of reproducible builds is precisely that they make builds respect intensional equality

Summary

Package management is about deployment

Deployment is about dependency management

Avoid conflicts, version mismatches by using the correct theory

Thanks!

Philip Potter / @philandstuff

philandstuff.github.io/pf-deployment/

UK Government Digital Service
