Software Development and Deployment with Spack

Massimiliano Culpo - SCITAS, EPFL

Why are we here, talking about package managers?

Complexity of a typical scientific software package

Combinatorial explosion of deployments in HPC

Installing software manually isn't an option!

Using scripts is only slightly better!

Compare with what happens on a workstation...


$ apt install hwloc libboost-dev libboost-atomic-dev libboost-program-options-dev libboost-filesystem-dev libboost-regex-dev libboost-system-dev
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
  binutils binutils-common binutils-x86-64-linux-gnu build-essential cpp cpp-7 dirmngr dpkg-dev fakeroot file fontconfig-config
[...]
Setting up libharfbuzz-dev:amd64 (1.7.2-1ubuntu1) ...
Setting up libicu-le-hb-dev:amd64 (1.0.3+git161113-4) ...
Setting up libicu-dev (60.2-3ubuntu3) ...
Setting up libboost-regex1.65-dev:amd64 (1.65.1+dfsg-0ubuntu5) ...
Setting up libboost-regex-dev:amd64 (1.65.1.0ubuntu1) ...
Processing triggers for libc-bin (2.27-3ubuntu1) ...
                
Scientific software must be that easy to use!

What is Spack?

Level 1

A quick overview of the tool

Spack is a package manager for HPC


$ # Clone a git repository (or extract a release tarball)
$ git clone https://github.com/spack/spack.git

$ # Source a setup file (optional)
$ . spack/share/spack/setup-env.sh

$ # Spack is ready to use
$ spack install hdf5
==> zlib is already installed in [...]
[...]
==> Successfully installed hdf5
  Fetch: 21.62s.  Build: 4m 3.15s.  Total: 4m 24.77s.
[+] [...]/gcc-8.2.0/hdf5-1.10.5-n3z5wdfvv4gutcjjktb77kt7zswwp2e7
                

https://spack.io

Spack is an open source project

  • Lead developer: Todd Gamblin (LLNL)
  • Codebase hosted on GitHub
  • Dual-licensed under Apache-2.0 or MIT
  • Actively seeking buy-in and PRs from HPC vendors
  • Very active and engaging community

https://github.com/spack/spack

Spack is used worldwide!

Sessions on spack.readthedocs.io for one month

Ongoing efforts to disseminate best practices

How is software complexity tamed by Spack?

The spec syntax describes the user's needs


   $ # Install a specific version by appending @
   $ spack install hdf5@1.10.1

   $ # Specify a compiler (and optional version), with %
   $ spack install hdf5@1.10.1 %gcc@4.7.3

   $ # Add special boolean compile-time options with +
   $ spack install hdf5@1.10.1 +szip

   $ # Add custom compiler flags
   $ spack install hdf5@1.10.1 cppflags="-O3 -floop-block"

   $ # Cross-compile on a Cray or Blue Gene/Q
   $ spack install hdf5@1.10.1 target=backend
                

Directives model allowed configurations


class Openblas(MakefilePackage):
    """OpenBLAS: An optimized BLAS library"""

    homepage = 'http://www.openblas.net'
    url      = 'http://github.com/OpenBLAS/v0.2.19.tar.gz'

    version('0.3.4', sha256='4b4b4453251')
    version('0.3.3', sha256='49d88f4494a')

    variant('shared', default=True,
            description='Build shared libraries')
    variant('ilp64', default=False,
            description='64 bit integers')

    conflicts('%intel@16', when='@0.2.15:0.2.19')

                

The concretizer fills in missing details

The abstract spec is turned into a concrete configuration that can be installed

Each configuration is installed in its own prefix

All the software is installed in Spack's store
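
For instance, the hashed prefixes shown in this talk follow a layout like the one below (illustrative, hashes shortened):

opt/spack/
└── linux-ubuntu18.04-x86_64/
    └── gcc-8.2.0/
        ├── zlib-1.2.11-ivqu252.../
        └── hdf5-1.10.5-n3z5wdf.../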

APIs with multiple providers are allowed

Virtual dependencies declare a dependency on a versioned API and not on a specific provider
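
As a minimal sketch of how this looks in the recipes (the class names are real packages, but the version ranges and bodies are illustrative):

class Hdf5(AutotoolsPackage):
    # Depend on the "mpi" virtual package, not on a specific library
    depends_on('mpi', when='+mpi')

class Mpich(Package):
    # Providers declare which version of the API they implement
    provides('mpi@3.0')

class Openmpi(Package):
    provides('mpi@3.1')

A provider can then be requested explicitly, e.g. spack install hdf5 ^mpich, or left to the concretizer.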

What information is stored at install time?

Installed configurations are stored in a JSON file


{
 "database": {
  "installs": {
   "ivqu252fvh7r5iar6zwx4fmeoxiykln7": {
    "explicit": true,
    "installation_time": 1548272929.178339,
    "ref_count": 0,
    "installed": true,
    "path": "/home/mculpo/PycharmProjects/spack/opt/spack/linux-ubuntu18.04-x86_64/gcc-8.2.0/zlib-1.2.11-ivqu252fvh7r5iar6zwx4fmeoxiykln7",
    "spec": {
     "zlib": {
      "version": "1.2.11",
      "arch": {
       "platform": "linux",
       "platform_os": "ubuntu18.04",
       "target": "x86_64"
      },
      "compiler": {
       "name": "gcc",
       "version": "8.2.0"
      },
      "namespace": "builtin",
      "parameters": {
       "optimize": true,
       "pic": true,
       "shared": true,
       "cflags": [],
       "cppflags": [],
       "cxxflags": [],
       "fflags": [],
       "ldflags": [],
       "ldlibs": []
      }
     }
    }
   }
  },
  "version": "0.9.3"
 }
}
                
opt/spack/.spack-db/index.json

Provenance is preserved for each configuration


$ tree $(spack location -i hdf5)/.spack
<prefix>/.spack
├── archived-files
│   └── config.log
├── build.env
├── build.out
├── repos
│   └── builtin
│       ├── packages
│       │   ├── hdf5
│       │   │   └── package.py
│       │   └── zlib
│       │       └── package.py
│       └── repo.yaml
└── spec.yaml

6 directories, 7 files
                
This makes it possible to regenerate the JSON database!
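
For example, if the index is ever lost or corrupted, spack reindex rebuilds it by scanning the metadata stored in each prefix:

$ # Rebuild the database from the .spack metadata of installed prefixes
$ spack reindex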

Tools are then built on top of the data in the DB

Check the result of concretization beforehand


$ spack spec -Il hpx cxxstd=14
Input spec
--------------------------------
 -   hpx cxxstd=14

Concretized
--------------------------------
 -   qjjz5gy  hpx@1.2.1%gcc@8.2.0 build_type=RelWithDebInfo ~cuda cuda_arch=none cxxstd=14 instrumentation=none malloc=tcmalloc +networking~tools arch=linux-ubuntu18.04-x86_64
[+]  zllsejt      ^boost@1.70.0%gcc@8.2.0+atomic+chrono~clanglibcpp~context~coroutine cxxstd=14 +date_time~debug+exception~fiber+filesystem+graph~icu+iostreams+locale+log+math~mpi+multithreaded~numpy patches=2ab6c72d03dec6a4ae20220a9dfd5c8c572c5294252155b85c6874d97c323199 ~pic+program_options~python+random+regex+serialization+shared+signals~singlethreaded+system~taggedlayout+test+thread+timer~versionedlayout+wave arch=linux-ubuntu18.04-x86_64
[...]
[+]  kgkd3ck          ^libffi@3.2.1%gcc@8.2.0 arch=linux-ubuntu18.04-x86_64
[+]  oukq22h          ^sqlite@3.26.0%gcc@8.2.0~functions arch=linux-ubuntu18.04-x86_64
                

Query what's installed from the command line


$ spack find zlib
==> 1 installed package
-- linux-ubuntu18.04-x86_64 / gcc@8.2.0 ----
zlib@1.2.11

$ spack find --start-date 'a month ago'
==> 3 installed packages
-- linux-ubuntu18.04-x86_64 / gcc@8.2.0 ----
hdf5@1.10.4  openblas@0.3.5  zlib@1.2.11
                

Uninstall anything in an easy and safe way


$ spack find zlib
==> 2 installed packages
-- linux-ubuntu18.04-x86_64 / gcc@8.2.0 ----
zlib@1.2.8  zlib@1.2.11

$ spack uninstall zlib@1.2.8
==> The following packages will be uninstalled:

-- linux-ubuntu18.04-x86_64 / gcc@8.2.0 ----
    yxoie27 zlib@1.2.8%gcc+optimize+pic+shared
==> Do you want to proceed? [y/N] y
==> Successfully uninstalled zlib@1.2.8%gcc@8.2.0 [...] /yxoie27
                

How are shared libraries found at run-time?

A compiler wrapper injects RPATHs during compilation

In this way the software runs the way it was built
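
A quick way to verify this (paths and package choice are illustrative) is to inspect the dynamic section of an installed library and check that the RPATH entries point into Spack's store:

$ readelf -d $(spack location -i hdf5)/lib/libhdf5.so | grep -iE 'rpath|runpath'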

How can I fine-tune Spack's configuration?

Settings are stored in a hierarchy of files

Scope      Directory
User       ~/.spack
Site       ${spack_root}/etc/spack
System     /etc/spack
Defaults   ${spack_root}/etc/spack/defaults
Each configuration is a merge of multiple scopes

Configuration files are written in YAML format


compilers::
- compiler:
    environment: {}
    extra_rpaths: []
    flags: {}
    modules: []
    operating_system: ubuntu16.04
    paths:
      cc: /usr/bin/gcc
      cxx: /usr/bin/g++
      f77: /usr/bin/gfortran
      fc: /usr/bin/gfortran
    spec: gcc@5.4.0
    target: x86_64
                
A double colon overrides lower scopes

Packaging your software

Level 2

How does Spack store recipes?

Packages are Python classes


class Kripke(Package):
    """Kripke is a simple, scalable, 3D Sn
    deterministic particle transport mini app.
    """
    homepage = "https://codesign.llnl.gov/kripke.php"
    url      = "https://codesign.llnl.gov/kripke-1.1.tar.gz"

    version('1.1', '7fe6f2b26ed983a6ce5495ab701f85bf')
    version('1.0', 'f4247dde07952a5ff866b24e45b5cdd1')

    variant('mpi', default=True, description='Build with MPI.')

    depends_on('mpi', when="+mpi")
                

It's really easy to work with packages


$ # Create a new package in the built-in repository
$ spack create <url>

$ # Modify an existing package
$ spack edit <package-name>

$ # Scrape for versions of an existing package
$ spack versions <package-name>
                

Declare package versions and how to fetch them


class QuantumEspresso(Package):
    url = 'https://gitlab.com/QEF/.../q-e-qe-6.3.tar.gz'
    git = 'https://gitlab.com/QEF/q-e.git'

    version('6.4', sha256='781366d...bfe')
    version('5.3', md5='be3f877...592',
            url='https:///old-url.com/qe5.3.tgz')

    version('develop', branch='develop')
    version('latest-backports', branch='qe-6.3-backports')

    def url_for_version(self, version):
        """Returns the correct url for a given version"""
                

Finding new versions of a package


class Mpich(Package):
    url        = "http://[...]/downloads/3.0.4/3.0.4.tar.gz"
    list_url   = "http://[...]/downloads/"
    list_depth = 1
                
                
$ spack checksum mpich
==> Found 34 versions of mpich:

  3.3rc1  http://[...]/downloads/3.3rc1/mpich-3.3rc1.tar.gz
  3.3b3   http://[...]/downloads/3.3b3/mpich-3.3b3.tar.gz
  [...]

==> How many would you like to checksum?
                

Allow different configurations with variants

                    
class Hdf5(AutotoolsPackage):

    variant('shared', default=True, description='...')

    def configure_args(self):
        extra_args = []
        if '+shared' in self.spec:
            extra_args.append('--enable-shared')
        else:
            extra_args.append('--disable-shared')
            extra_args.append('--enable-static-exec')
        ...
                    
                    
$ spack install hdf5~shared
[...]
                    
                

Variants support complex configurations


class Blis(AutotoolsPackage):
    variant('threads', default='none',
            description='Multithreading support',
            values=('pthreads', 'openmp', 'none'), multi=False)

    def configure_args(self):
        options = []
        if self.spec.variants['threads'].value == 'none':
            options.append('--no-threads')
        ...

class Adios(AutotoolsPackage):
    variant('staging',
            values=any_combination_of('flexpath', 'dataspaces'),
            description='Enable dataspaces and/or flexpath')
                

Model the dependencies of a package


class Hpx(CMakePackage, CudaPackage):
    variant(
        'cxxstd', default='17', values=('98', '11', '14', '17'),
        description='C++ standard used when building.'
    )

    depends_on('boost')
    depends_on('python', type=('build', 'test', 'run'))

    # Boost is further constrained depending on cxxstd
    depends_on('boost cxxstd=98', when='cxxstd=98')
    depends_on('boost cxxstd=11', when='cxxstd=11')
    depends_on('boost cxxstd=14', when='cxxstd=14')
    depends_on('boost cxxstd=17', when='cxxstd=17')
                

Does Spack support different build systems?

Spack supports many different build systems

Base class         Purpose
Package            Generic, non-specialized
MakefilePackage    Handwritten Makefiles
AutotoolsPackage   GNU Autotools
CMakePackage       CMake-built packages
...                ...
CudaPackage        Mixin to help with CUDA

Each base class defines a specific installation procedure


class CMakePackage(PackageBase):

    phases = ['cmake', 'build', 'install']

    variant('build_type', default='RelWithDebInfo', ...)
    depends_on('cmake', type='build')

    def cmake(self, spec, prefix):
        """Runs ``cmake`` in the build directory"""
        options = [os.path.abspath(self.root_cmakelists_dir)]
        options += self.std_cmake_args
        options += self.cmake_args()
        with working_dir(self.build_directory, create=True):
            inspect.getmodule(self).cmake(*options)
                
The defaults work fine for most packages
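
When they are not, a package usually overrides only the relevant hook, e.g. cmake_args(); a minimal sketch with a hypothetical package:

class Mylib(CMakePackage):
    """Hypothetical package, shown only to illustrate cmake_args()."""

    variant('cuda', default=False, description='Enable CUDA kernels')

    def cmake_args(self):
        # The cmake/build/install phases and the standard arguments
        # (build type, install prefix, ...) still come from CMakePackage
        return ['-DENABLE_CUDA:BOOL=%s'
                % ('ON' if '+cuda' in self.spec else 'OFF')]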

Decorators allow for an easy customization


class R(AutotoolsPackage):

    @run_after('install')
    def copy_makeconf(self):
        # Make a copy of Makeconf because it will be
        # needed to properly build R dependencies
        src_makeconf = join_path(self.etcdir, 'Makeconf')
        dst_makeconf = join_path(self.etcdir, 'Makeconf.spack')
        install(src_makeconf, dst_makeconf)
                
Actions can be executed before or after each phase
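
The same mechanism works before a phase too; a hedged sketch with a hypothetical package:

class Mysolver(AutotoolsPackage):

    @run_before('configure')
    def check_fortran_compiler(self):
        # Fail early with a clear message instead of deep inside configure
        if self.compiler.fc is None:
            raise InstallError('mysolver requires a Fortran compiler')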

How can I debug failing package builds?

Stop at a specific phase of the installation


$ spack configure hdf5~mpi
==> Checking dependencies for hdf5~mpi
==> zlib is already installed in /home/mculpo/[...]
==> Installing hdf5
[...]
==> Created stage in /home/mculpo/[...]
==> No patches needed for hdf5
==> Building hdf5 [AutotoolsPackage]
==> Executing phase: 'autoreconf'
==> Executing phase: 'configure'
==> Stopping at 'configure' phase

$ spack cd hdf5~mpi
                

Parse the build logs to search for errors


$ spack log-parse $(spack location -i hdf5)/.spack/build.out
0 errors

$ spack log-parse --show=errors,warnings \
  $(spack location -i hdf5)/.spack/build.out
0 errors
741 warnings
     534     CC       H5dbg.lo
     535     CC       H5system.lo
     536   H5system.c: In function 'HDfprintf':
  >> 537   H5system.c:276: warning: [...] [-Wformat-nonliteral]
     538                  n = fprintf(stream, format_templ, x);
     539                                      ^~~~~~~~~~~~
[...]
                

Spawn a shell with Spack's build-environment


$ spack stage zlib
==> Fetching file:///home/[...]/zlib-1.2.11.tar.gz
==> Staging archive: /home/[...]/zlib-1.2.11.tar.gz
==> Created stage in /home/[...]/zlib-1.2.11-ivqu252fvh7

$ spack cd zlib
$ spack build-env zlib -- /bin/bash
                

The development environment

Level 3

Spack environments

Spack environments are virtualized instances


$ spack env create my-project
==> Created environment 'my-project' in [...]/my-project

$ spack env activate my-project
$ spack env status
==> In environment my-project

$ spack find
==> In environment my-project
==> No root specs

==> 0 installed packages
                

Environments are built on top of Spack's store


$ spack find
==> In environment my-project
==> No root specs

==> 0 installed packages

$ spack install zlib hdf5~mpi
==> zlib is already installed in [...]
==> Installing hdf5
[...]
==> Successfully installed hdf5
  Fetch: 0.06s.  Build: 1m 14.67s.  Total: 1m 14.73s.
[+] /home/.../hdf5-1.10.4-xucyflhbo2p47n46smfsqo7p2y3hijmd
                

Users can set up groups of packages at once


$ spack add hdf5~mpi python mpich
==> Adding hdf5~mpi to environment my-project
==> Adding python to environment my-project
==> Adding mpich to environment my-project

$ spack find -v
==> In environment my-project
==> Root specs
hdf5~mpi  mpich  python

==> 0 installed packages

$ spack install
[...]
                
Specs can be concretized separately or together

Details are configurable from a YAML file


$ spack config get
# This is a Spack Environment file.
#
# It describes a set of packages to be installed, along with
# configuration settings.
spack:
  # add package specs to the `specs` list
  specs: [hdf5~mpi, python, mpich]
  mirrors: {}
  modules:
    enable: []
  repos: []
  packages: {}
  config: {}
  concretize_together: false
                
This is the manifest of your environment

spack.yaml stores abstract user requests

  • Might live inside Spack's tree or not
  • The environment is implicitly active when it is in the current working directory
  • Spack provides a CLI to manage environments
  • Can be bundled with any software project

Fully concretized specs are in spack.lock


{
  "concrete_specs": {
    "teneqii2xv5u6zl5r6qi3pwurc6pmypz": {
      "xz": {
        "version": "5.2.4",
        "arch": {
          "platform": "linux",
          "platform_os": "ubuntu16.04",
          "target": "x86_64"
        },
        ...
Reproduce using either the manifest or the lockfile
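
Both files can be used to seed a new environment (the environment names below are illustrative):

$ # Re-concretize starting from the abstract requests...
$ spack env create my-copy spack.yaml

$ # ...or reproduce exactly the same concrete configurations
$ spack env create my-exact-copy spack.lock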

Environments come with an associated view


spack:
  specs:
  - hdf5+mpi
  - bzip2

  view: True
                                

$ ls .spack-env/view/bin
h5clear   h5diff   ...   h5unjam
h5copy    h5dump   ...   ph5diff
h5debug   ...      ...

$ ls .spack-env/view/include
H5ACpublic.h     ...   hdf5.h
H5Apublic.h      ...   zconf.h
H5Cpublic.h      ...   zlib.h
                                
Single prefix projection of the combinatorial space

How can I use Spack environments then?

Create manifest files for development


$ spack env create -d hpx-dev-env
==> Updating view at [...]/hpx-dev-env/.spack-env/view
==> Created environment in [...]/hpx-dev-env

$ spack add boost@1.68.0 cxxstd=14 cmake git \
  gperftools hwloc python
==> Adding boost@1.68.0 cxxstd=14 to environment [...]/hpx-dev-env
==> Adding cmake to environment [...]/hpx-dev-env
==> Adding git to environment [...]/hpx-dev-env
==> Adding gperftools to environment [...]/hpx-dev-env
==> Adding hwloc to environment [...]/hpx-dev-env
==> Adding python to environment [...]/hpx-dev-env
==> Updating view at [...]/hpx-dev-env/.spack-env/view
                
spack.yaml files can be part of your sources

Set up a development environment with them


$ spack concretize
 -   ttrkdry  boost@1.68.0%gcc@8.2.0+atomic[...]
[+]  uisaimo      ^bzip2@1.0.6%gcc@8.2.0+shared
[+]  cgavvmi          ^diffutils@3.7%gcc@8.2.0
[+]  ivqu252      ^zlib@1.2.11%gcc@8.2.0+optimize
[...]
[+]  oukq22h      ^sqlite@3.26.0%gcc@8.2.0

==> Updating view at [...]hpx-dev-env/.spack-env/view

$ spack install
==> Installing environment [...]/hpx-dev-env
[...]
                
The same environment might be installed directly on your workstation or used to build container images
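
As a sketch, a container build (e.g. a Dockerfile RUN step or a CI job) runs the same commands used interactively; the paths below are assumptions:

$ git clone https://github.com/spack/spack.git /opt/spack
$ . /opt/spack/share/spack/setup-env.sh
$ spack env activate -d /opt/hpx-dev-env   # directory containing spack.yaml
$ spack install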

Can I extend Spack with custom commands?

Spack can be extended with external plugins


$ # Folder structure for Spack's extension
spack-scripting/
├── pytest.ini
├── scripting
│   └── cmd
│       └── filter.py
├── tests
│   ├── conftest.py
│   └── test_filter.py
└── templates
                
You can develop your own site-specific extensions, as sketched below
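
A command module is a Python file following Spack's command conventions; below is a hedged sketch of scripting/cmd/filter.py (the option and query logic are illustrative, not the real extension code):

import spack.store

description = "filter specs based on their properties"
section = "scripting"
level = "long"


def setup_parser(subparser):
    subparser.add_argument(
        '--explicit', action='store_true',
        help='select only specs that were installed explicitly')


def filter(parser, args):
    # Query the same database that backs "spack find"
    specs = (spack.store.db.query(explicit=True)
             if args.explicit else spack.store.db.query())
    for spec in specs:
        print(spec)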

Extensions can be configured in config.yaml


config:
  extensions:
    - [...]/spack-scripting
                

$ spack filter --help
usage: spack filter [-h] [--installed | --not-installed]
                    [--explicit | --implicit] [--output OUTPUT]
                    ...

filter specs based on their properties
[...]

$ spack test --extension=scripting
[...]
                

Deploy stable versions

Level 4

How can I define a stack to be deployed?

YAML format to describe entire stacks


spack:
  matrices:
  - specs:
    - zlib@1.2.8
    - zlib@1.2.11
    - hdf5~mpi@1.10.2
    - hdf5+mpi@1.10.2
    toolchains:
    - ['%gcc@4.9.3', ^mvapich2@2.2]
    toolchain-generator:
    - ['%gcc@7.3.0', '%intel@18.0.0']
    - [^mvapich2@2.2, ^mvapich2@2.3]
    exclude:
    - '%intel@18.0.0 ^mvapich2@2.2'

  - specs:
    - cmake@3.8.2
    - python@2.7.15
    toolchains:
    - ['%gcc@7.3.0']
                
Deploy all your software with a single command

Can I generate module files for installed software?

Modules are generated by hooks or on demand

Module files are configurable in modules.yaml


modules:
  lmod:
    core_compilers:
      - gcc@4.8.5
    hierarchy:
      - mpi
      - lapack
    hash_length: 0
    all:
      suffixes:
        +mpi: mpi
        +openmp: openmp
        ...
                

See the Spack documentation for a complete tutorial on module files

Module files can be managed using the CLI


$ spack module tcl --help
usage: spack module tcl [-h] SUBCOMMAND ...

positional arguments:
  SUBCOMMAND
    refresh   regenerate module files
    find      find module files for packages
    rm        remove module files
    loads     prompt the list of modules associated with
              a constraint

optional arguments:
  -h, --help  show this help message and exit
                
There's also extensive support for lmod hierarchies

Can I maintain repositories with site-specific recipes?

Spack provides support for custom repositories


$ spack repo create -h
usage: spack repo create [-h] directory [namespace]

positional arguments:
  directory   directory to create the repo in
  namespace   namespace to identify packages in the repository

optional arguments:
  -h, --help  show this help message and exit

$ spack repo create repositories/hpx-customization
==> Created repo with namespace 'hpx-customization'.
==> To register it with spack, run this command:
  spack repo add /home/spack/repositories/hpx-customization
                
etc/spack/defaults/repos.yaml

Common uses for custom repositories

  1. Independently maintain your own packages
  2. Share packages without using the built-in repo
  3. Override built-in packages with your own recipes (see the repos.yaml sketch below)
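
For the last point, precedence follows the order of entries in repos.yaml; a minimal sketch (paths are illustrative):

repos:
  # Recipes in repositories listed first shadow those listed later
  - /home/spack/repositories/hpx-customization
  - $spack/var/spack/repos/builtin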

Can I reuse what is installed in another Spack instance?

Spack chains make it possible to read upstream DBs
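
A minimal sketch of the corresponding upstreams.yaml configuration (the name and path are illustrative):

upstreams:
  site-deployment:
    # Packages found in this upstream store are reused instead of rebuilt
    install_tree: /central/spack/opt/spack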

Deploy with different levels of stability

What comes next?

Bonus level

CI infrastructure for sources and binaries

Specific target information in specs

Given a target, automatically activate optimizations

Concretization prefers already available binaries

Probably the most wanted feature in Spack

Thanks for listening! Questions?