Conda – A cross-platform package manager



Presentation slides about Conda at PyCon.PL 2015

On GitHub: mattpap/pyconpl-2015-conda

Conda

A cross-platform package manager

Created by Mateusz Paprocki

Before we begin — installing conda

  • Miniconda (quick installation instructions)
    • only Python, conda and a few dependencies
    • small size (about 25 MB)
  • Anaconda (full installation instructions)
    • Python, conda and lots of common packages (>=150)
    • big download (about 300 MB)

Miniconda installers

              Windows           Mac OS X   Linux
Python 2.7    64-bit / 32-bit   64-bit     64-bit / 32-bit
Python 3.4    64-bit / 32-bit   64-bit     64-bit / 32-bit

Install on Linux and Mac OS X:

$ bash Miniconda-latest-Linux-x86_64.sh # or MacOSX instead of Linux

Install on Windows:

Miniconda-latest-Windows-x86_64.exe

Presentation plan

  • about package management in general
  • package management in Python
    • setup.py, pip, wheels, ...
  • conda
    • introduction
    • commands
    • packages
    • channels
    • environments
    • building packages
    • disadvantages
  • final words

Why do we care about package management?

  • user's perspective:
    • seamlessly install software
    • possibly not an expert
  • developer's perspective:
    • allow users to use their programs
    • developer is a user as well

We all simply don't want to waste time.

Package manager or package management system

  • is a collection of software tools
  • automates the process of software:
    • installation
    • configuration
    • upgrading
    • removal
    • ...
  • resolves dependencies
  • allows for reliability and reproducibility

Package managers

  • Windows
    • chocolatey
    • npackd
  • Mac OS X
    • macports
    • homebrew
  • Linux
    • apt-get (dpkg)
    • yum (rpm)
Package managers

  • pip (Python)
  • Bundler (Ruby)
  • Composer (PHP)
  • npm (Node.js)
  • CocoaPods (Objective-C)
  • Ivy (Java)
  • Lein (Clojure)
  • Cabal (Haskell)

Package management in Python

The history

Installing

  • setup.py install
  • easy_install
  • pip
  • apt-get
  • rpm
  • emerge
  • homebrew
  • ...

Why not use a system package manager?

  • missing or (very) outdated packages
  • hard to get packages into certain repositories
  • requires duplication of packages for different distributions or operating systems
  • usually requires administrator privileges

setup.py install

  • fine if it's pure Python, not so much if it isn't
  • you have to have compilers installed
    distutils.errors.DistutilsError: Setup script exited with
    error: command 'gcc' failed with exit status 1

setup.py install — installing NumPy

$ git clone -b v1.10.1 git@github.com:numpy/numpy.git
$ cd numpy
$ python setup.py install
Running from numpy source directory.
Cythonizing sources
Processing numpy/random/mtrand/mtrand.pyx
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named Cython.Compiler.Main
Traceback (most recent call last):
  File "/home/mateusz/repos/numpy/tools/cythonize.py", line 199, in <module>
    main()
  File "/home/mateusz/repos/numpy/tools/cythonize.py", line 195, in main
    find_process_files(root_dir)
  File "/home/mateusz/repos/numpy/tools/cythonize.py", line 187, in find_process_files
    process(cur_dir, fromfile, tofile, function, hash_db)
  File "/home/mateusz/repos/numpy/tools/cythonize.py", line 161, in process
    processor_function(fromfile, tofile)
  File "/home/mateusz/repos/numpy/tools/cythonize.py", line 81, in process_pyx
    raise Exception('Cython failed')
Exception: Cython failed
Traceback (most recent call last):
  File "setup.py", line 264, in <module>
    setup_package()
  File "setup.py", line 252, in setup_package
    generate_cython()
  File "setup.py", line 199, in generate_cython
    raise RuntimeError("Running cythonize failed!")
RuntimeError: Running cythonize failed!

setup.py install

You are your own package manager.

pip

  • only works with Python
  • not so great for packages with native dependencies
    • e.g. scientific packages, but not limited to them
    • try installing h5py if you don't have HDF5
  • doesn't have a true dependency solver

pip — installing h5py

$ pip install h5py
Collecting h5py
  Downloading h5py-2.5.0.tar.gz (684kB)
    100% |████████████████████████████████| 688kB 167kB/s
Collecting numpy>=1.6.1 (from h5py)
  Downloading numpy-1.10.1.tar.gz (4.0MB)
    100% |████████████████████████████████| 4.1MB 78kB/s
Collecting Cython>=0.17 (from h5py)
  Downloading Cython-0.23.4.tar.gz (1.6MB)
    100% |████████████████████████████████| 1.6MB 139kB/s
Collecting six (from h5py)
  Downloading six-1.10.0-py2.py3-none-any.whl
Building wheels for collected packages: h5py, numpy, Cython

(...)

In file included from /tmp/pip-build-rDj6mj/h5py/h5py/defs.c:279:0:
/tmp/pip-build-rDj6mj/h5py/h5py/api_compat.h:27:18: fatal error: hdf5.h: No such file or directory
#include "hdf5.h"
                 ^
compilation terminated.
error: command 'gcc' failed with exit status 1

pip

You are a "self integrator".

What about wheels?

  • Python package specific
    • can't build wheels for native libraries
    • can't make a wheel for Python itself
  • still doesn't address the problem that some metadata is only in the package itself
  • you are still a "self integrator"

wheel — installing h5py

No wheel easily available for Linux (see PyPI).

Packaging problem?

Package maintainers hate having packages that no one can install.

Conda

  • system-level package manager (Python agnostic)
  • Python, HDF5 and h5py are all conda packages
  • cross-platform (works on Windows, OS X and Linux)
  • doesn't require administrator privileges (sudo)
  • installs binaries (no more compiler woes)
  • metadata stored separately in the repository index
  • uses a SAT solver to resolve dependencies
  • seamlessly manages virtual environments
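
What "solving" dependencies means can be shown with a brute-force search over version combinations. This is only a sketch of the idea, not conda's SAT solver: the INDEX contents, the version numbers and the solve helper are all made up for illustration.

```python
from itertools import product

# Hypothetical index: package -> {version: {dependency: allowed versions}}
INDEX = {
    "h5py": {"2.5.0": {"hdf5": {"1.8.15"}, "numpy": {"1.9.3", "1.10.1"}}},
    "hdf5": {"1.8.14": {}, "1.8.15": {}},
    "numpy": {"1.9.3": {}, "1.10.1": {}},
    "legacy": {"1.0": {"numpy": {"1.9.3"}}},
}


def version_key(version):
    """Compare versions numerically, so that 1.10.1 is newer than 1.9.3."""
    return tuple(int(part) for part in version.split("."))


def solve(requested):
    """Return one consistent {package: version} choice, or None."""
    # Collect every package reachable from the request.
    needed, stack = set(), list(requested)
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            for deps in INDEX[name].values():
                stack.extend(deps)
    names = sorted(needed)
    # Try version combinations, newest first, and keep the first
    # combination in which every dependency constraint holds.
    choices = [sorted(INDEX[n], key=version_key, reverse=True) for n in names]
    for combo in product(*choices):
        picked = dict(zip(names, combo))
        if all(picked[dep] in allowed
               for name, version in picked.items()
               for dep, allowed in INDEX[name][version].items()):
            return picked
    return None


alone = solve(["h5py"])
together = solve(["h5py", "legacy"])
```

Requested alone, h5py gets the newest numpy; requested together with the made-up "legacy" package, the search falls back to the older numpy that both accept. A real solver encodes the same constraints as boolean clauses instead of enumerating combinations.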

Basic conda usage

Install a package
conda install sympy
List all installed packages
conda list
Search for packages
conda search numpy
Create a new environment
conda create -n py3 python=3
Remove a package
conda remove sympy
Get help
conda install --help

Advanced conda usage

Install to an environment
conda install -n py35 sympy
Update all packages
conda update --all
Export list of packages
conda list --export > pkgs.txt
Install from an export
conda install --file pkgs.txt
See package history
conda list --revisions
Revert to a revision
conda install --revision 23
Clean installation
conda clean -pt
Install anaconda
conda create -n my-an anaconda

What is a conda package?

A tarball (.tar.bz2) with:

  • the files that form a software package
    • /lib
    • /include
    • /bin
    • /man
    • /info
  • some metadata
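
A package of this shape can be built and inspected with Python's standard library alone. The sketch below assumes the metadata lives in info/index.json inside the tarball, as in conda's package layout; the "demo" package, its file and its metadata values are invented for the example.

```python
import io
import json
import tarfile


def build_package(files, index):
    """Pack files plus an info/index.json metadata entry into .tar.bz2 bytes."""
    members = dict(files)
    members["info/index.json"] = json.dumps(index).encode()
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:bz2") as tar:
        for name, data in sorted(members.items()):
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_package(blob):
    """Return (member names, parsed info/index.json) from package bytes."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:bz2") as tar:
        names = tar.getnames()
        index = json.loads(tar.extractfile("info/index.json").read().decode())
    return names, index


# Hypothetical package contents and metadata.
blob = build_package(
    {"lib/python2.7/site-packages/demo.py": b"print('hi')\n"},
    {"name": "demo", "version": "1.0", "build": "py27_0",
     "depends": ["python 2.7*"]},
)
names, index = read_package(blob)
```

Nothing in the archive format is Python specific: the payload is just files under a prefix, and the metadata says what the package is and what it depends on.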

Python agnostic

A conda package can be anything:

  • pure Python packages
  • Python package with C extensions
  • Python itself
  • C/C++/Fortran etc. libraries (GDAL, netCDF4, dynd, ...)
  • R
  • node.js
  • Perl
  • ...

Installation

$ conda install sympy
  • the tarball is unarchived in the pkgs directory
  • files are hard-linked to the install path
  • shebang lines and other instances of a place-holder prefix are replaced with the install prefix
  • the metadata is updated, so that conda knows that a package is installed
  • post-link script is run (these are rare)
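
The linking steps can be imitated in a few lines. This is a simplified sketch, not conda's code: PLACEHOLDER, link_package and the demo package are hypothetical, and files that embed the placeholder are copied rather than hard-linked here so that rewriting them leaves the extracted package untouched.

```python
import json
import os
import tempfile

PLACEHOLDER = "/opt/placeholder_prefix"  # hypothetical build-time prefix


def link_package(pkg_dir, prefix, files, has_prefix, name="demo-1.0-0"):
    """Link an extracted package into prefix and record it as installed."""
    for rel in files:
        src = os.path.join(pkg_dir, rel)
        dst = os.path.join(prefix, rel)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        if rel in has_prefix:
            # Copy and rewrite: replace the placeholder with the real prefix.
            with open(src) as f:
                text = f.read()
            with open(dst, "w") as f:
                f.write(text.replace(PLACEHOLDER, prefix))
        else:
            # Hard link: no data is copied, both names share one file.
            os.link(src, dst)
    # Record the installation (directory name modelled on conda-meta).
    meta_dir = os.path.join(prefix, "conda-meta")
    os.makedirs(meta_dir, exist_ok=True)
    with open(os.path.join(meta_dir, name + ".json"), "w") as f:
        json.dump({"files": list(files)}, f)


# Demo: an "extracted" package under pkgs/ and an empty install prefix.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "pkgs", "demo-1.0-0")
prefix = os.path.join(root, "envs", "test")
contents = {
    "bin/demo": "#!" + PLACEHOLDER + "/bin/python\nimport demo\n",
    "lib/demo.py": "print('hello')\n",
}
for rel, text in contents.items():
    path = os.path.join(pkg_dir, rel)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(text)

link_package(pkg_dir, prefix, list(contents), has_prefix={"bin/demo"})
```

After the call, lib/demo.py in the prefix is the same file on disk as the copy under pkgs/, while bin/demo has a shebang that points into the install prefix.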

Installation — conda install sympy

$ conda install sympy
Fetching package metadata: ....
Solving package specifications: ...................
Package plan for installation in environment /home/mateusz/miniconda:

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    fastcache-1.0.2            |           py27_0          40 KB
    sympy-0.7.6.1              |           py27_0         6.2 MB
    ------------------------------------------------------------
                                           Total:         6.2 MB

The following NEW packages will be INSTALLED:

    fastcache: 1.0.2-py27_0
    sympy:     0.7.6.1-py27_0

Proceed ([y]/n)? y

Fetching packages ...
fastcache-1.0. 100% |######################| Time: 0:00:00 164.82 kB/s
sympy-0.7.6.1- 100% |######################| Time: 0:00:20 314.02 kB/s
Extracting packages ...
[      COMPLETE      ]|#####################| 100%
Linking packages ...
[      COMPLETE      ]|#####################| 100%

Environments

$ conda create -n py35 python=3.5
  • environments are simple: just link package(s) to a different directory
  • hard-links are very cheap and very fast
  • conda environments are completely independent installations of everything
    • no fiddling with PYTHONPATH
    • no symlinking to site-packages
  • "activating" an environment just means changing your PATH, so that the environment's bin/ or Scripts/ comes first
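
The PATH change behind activation is small enough to write out. A minimal sketch, assuming a Unix-style layout; the activate function and the /home/user paths are illustrative and not part of conda.

```python
import os


def activate(path, root_bin, env_bin):
    """Return PATH with root_bin dropped and env_bin put first."""
    parts = [p for p in path.split(os.pathsep) if p not in (root_bin, env_bin)]
    return os.pathsep.join([env_bin] + parts)


root_bin = "/home/user/miniconda/bin"
env_bin = "/home/user/miniconda/envs/py35/bin"
old_path = os.pathsep.join([root_bin, "/usr/bin", "/bin"])
new_path = activate(old_path, root_bin, env_bin)
```

Because the environment's directory now comes first, a plain python resolves to the interpreter inside the environment, with no changes to PYTHONPATH or site-packages.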

Activating an environment

Unix
$ source activate py35
Windows
$ activate py35

Use cases for environments

  • trying new versions of Python
  • exploring new packages from PyPI
  • testing (Python 2.6, 2.7, 3.3, 3.4, 3.5, ...)
  • development
  • reproducible science

Creating a new environment

$ conda create -n py35 python=3.5
Fetching package metadata: ....
Solving package specifications: .
Package plan for installation in environment /home/mateusz/miniconda/envs/py35:

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    xz-5.0.5                   |                0         504 KB
    python-3.5.0               |                0        16.6 MB
    setuptools-18.4            |           py35_0         345 KB
    wheel-0.26.0               |           py35_1          77 KB
    pip-7.1.2                  |           py35_0         1.4 MB
    ------------------------------------------------------------
                                           Total:        19.0 MB

The following NEW packages will be INSTALLED:

    openssl:    1.0.1k-1
    pip:        7.1.2-py35_0
    python:     3.5.0-0
    readline:   6.2-2
    setuptools: 18.4-py35_0
    sqlite:     3.8.4.1-1
    tk:         8.5.18-0
    wheel:      0.26.0-py35_1
    xz:         5.0.5-0
    zlib:       1.2.8-0

Proceed ([y]/n)? y

Fetching packages ...
xz-5.0.5-0.tar 100% |#######################| Time: 0:00:01 488.65 kB/s
python-3.5.0-0 100% |#######################| Time: 0:00:19 889.30 kB/s
setuptools-18. 100% |#######################| Time: 0:00:00 546.22 kB/s
wheel-0.26.0-p 100% |#######################| Time: 0:00:00 200.20 kB/s
pip-7.1.2-py35 100% |#######################| Time: 0:00:11 127.70 kB/s
Extracting packages ...
[      COMPLETE      ]|######################| 100%
Linking packages ...
[      COMPLETE      ]|######################| 100%
#
# To activate this environment, use:
# $ source activate py35
#
# To deactivate this environment, use:
# $ source deactivate
#

$ source activate py35
discarding /home/mateusz/miniconda/bin from PATH
prepending /home/mateusz/miniconda/envs/py35/bin to PATH

$ python --version
Python 3.5.0 :: Continuum Analytics, Inc.

$ source deactivate
discarding /home/mateusz/miniconda/envs/py35/bin from PATH

$ python --version
Python 2.7.10 :: Continuum Analytics, Inc.

Channels

$ conda install -c https://conda.anaconda.org/travis simpy
  • simply URLs to locations with conda packages
  • allow you to explore anaconda.org
    $ conda install -c travis simpy
  • configure channels in .condarc
    $ conda config --add channels travis
  • use conda info to inspect the configuration
    $ conda info
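
Since a channel is only a URL, the short form and the full form can be mapped onto each other mechanically. The sketch below assumes that short names expand under https://conda.anaconda.org/ and that each platform subdirectory holds a repodata.json index; channel_index_url is a hypothetical helper, not a conda function.

```python
def channel_index_url(channel, platform="linux-64"):
    """Expand a channel name or URL into the URL of its package index."""
    if "://" in channel:
        base = channel  # already a full URL
    else:
        base = "https://conda.anaconda.org/" + channel  # assumed default host
    return base.rstrip("/") + "/" + platform + "/repodata.json"


short_form = channel_index_url("travis")
long_form = channel_index_url("https://conda.anaconda.org/travis")
```

Both calls give the same URL, which is why the two conda install -c spellings behave the same way.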

Building

Conda Recipes

$ conda build path/to/recipe
  • meta.yaml contains metadata
  • build.sh is the build script for Unix and bld.bat is the build script for Windows

Example meta.yaml

Lots more

  • command line entry points
  • fine-grained control over conda's relocation logic
  • inequalities for version of dependencies (like >=1.2, <2.0)
  • "Preprocessing selectors" allow using the same meta.yaml for many platforms
  • see full documentation
  • conda build is only a convenient wrapper
  • you can also build packages manually just by following the package specification
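
Version inequalities such as >=1.2, <2.0 boil down to comparing version tuples. This is a deliberately simplified sketch that handles purely numeric versions only; it is not conda's matching logic, and satisfies and parse are hypothetical helpers.

```python
import operator

# Two-character operators come first so that ">=" is not read as ">".
OPERATORS = [
    (">=", operator.ge), ("<=", operator.le),
    ("==", operator.eq), ("!=", operator.ne),
    (">", operator.gt), ("<", operator.lt),
]


def parse(version):
    """Turn '1.10.1' into (1, 10, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.strip().split("."))


def satisfies(version, spec):
    """True if version meets every comma-separated constraint in spec."""
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in OPERATORS:
            if clause.startswith(symbol):
                if not compare(parse(version), parse(clause[len(symbol):])):
                    return False
                break
        else:
            raise ValueError("unsupported constraint: " + clause)
    return True


ok = satisfies("1.10.1", ">=1.2, <2.0")
too_new = satisfies("2.0", ">=1.2, <2.0")
too_old = satisfies("1.1.9", ">=1.2, <2.0")
```

Comparing tuples rather than strings matters: as text, "1.10.1" would sort before "1.2".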

Sharing

  • Anaconda Cloud (anaconda.org)
    • something like GitHub, but for package maintainers
    • public and private packages
    • build automation
  • self-hosting with conda index
    • anaconda.org is a convenient wrapper

Downsides

  • conda uses its own build of Python (!= python.org)
    $ python
    Python 2.7.10 |Continuum Analytics, Inc.| (default, Sep 15 2015, 14:50:01)
    [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    Anaconda is brought to you by Continuum Analytics.
    Please check out: http://continuum.io/thanks and https://anaconda.org
    >>>
  • developed mainly by one company
  • conda packages can't be used by pip
  • many programs and libraries don't have a conda package
  • environments may need manual activation outside bash/zsh (see e.g. fish shell issue #540 and a solution)

Unfortunate (?) features

  • conda always attempts to update installed software
    • unless it was installed with explicit version number, e.g.
      conda install numpy=1.8.1
    • because people do not update their software
    • and then they report issues about arbitrary versions of obsolete software (maintenance nightmare)
  • conda doesn't distinguish between manually and automatically (dependencies) installed packages
    • this is an issue when you remove a package and its dependencies are not removed as well
    • this is a bug in conda (see issue #232)
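
If explicitly requested packages were tracked separately, leftover dependencies could be found with a graph walk. The sketch below shows the idea only; it is not something conda does, the orphans helper is hypothetical, and the dependency edges are invented.

```python
def orphans(installed, explicit):
    """Return installed packages not reachable from the explicit ones.

    installed maps each package name to the list of its dependencies.
    """
    keep, stack = set(), list(explicit)
    while stack:
        name = stack.pop()
        if name in installed and name not in keep:
            keep.add(name)
            stack.extend(installed[name])
    return sorted(set(installed) - keep)


# Hypothetical environment state; the edges are for illustration only.
installed = {
    "python": [],
    "ipython": ["python", "decorator", "traitlets", "pickleshare"],
    "decorator": ["python"],
    "traitlets": ["python", "decorator"],
    "pickleshare": ["python"],
}
before = orphans(installed, ["python", "ipython"])

del installed["ipython"]  # the user removes the package they asked for
after = orphans(installed, ["python"])
```

Before the removal nothing is orphaned; afterwards the three packages that were only pulled in for ipython are reported and could be offered for removal.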

Unfortunate (?) features - example 1

$ conda install ipython
Fetching package metadata: ....
Solving package specifications: ..................
Package plan for installation in environment /home/mateusz/miniconda:

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    decorator-4.0.4            |           py27_0          11 KB
    ipython_genutils-0.1.0     |           py27_0          32 KB
    path.py-8.1.2              |           py27_0          45 KB
    pexpect-3.3                |           py27_0          60 KB
    simplegeneric-0.8.1        |           py27_0           6 KB
    conda-3.18.3               |           py27_0         175 KB
    pickleshare-0.5            |           py27_0           8 KB
    traitlets-4.0.0            |           py27_0          88 KB
    ipython-4.0.0              |           py27_0         915 KB
    ------------------------------------------------------------
                                           Total:         1.3 MB

The following NEW packages will be INSTALLED:

    decorator:        4.0.4-py27_0
    ipython:          4.0.0-py27_0
    ipython_genutils: 0.1.0-py27_0
    path.py:          8.1.2-py27_0
    pexpect:          3.3-py27_0
    pickleshare:      0.5-py27_0
    simplegeneric:    0.8.1-py27_0
    traitlets:        4.0.0-py27_0

The following packages will be UPDATED:

    conda:            3.18.2-py27_0 --> 3.18.3-py27_0

Proceed ([y]/n)?

Unfortunate (?) features - example 2

$ conda create -n ipy python
(...)
$ conda list -n ipy | grep -v '^#' | wc -l
9
$ conda install -n ipy ipython
(...)
$ conda list -n ipy | grep -v '^#' | wc -l
18
$ conda remove -n ipy ipython
Fetching package metadata: ....

Package plan for package removal in environment /home/mateusz/miniconda/envs/ipy:

The following packages will be REMOVED:

    ipython: 4.0.0-py27_0

Proceed ([y]/n)? y

Unlinking packages ...
[      COMPLETE      ]|####################################| 100%
$ conda list -n ipy | grep -v '^#' | wc -l
17

Final words
