Scientific Computing Tools
Luis Yanes, Tim Stitt
Lmod - Environment management
- Lmod is a popular Lua-based HPC tool for environment management, developed at TACC and widely adopted by the community
- Helps with the issue of having "too many versions, too many tools…"
- It also improves dependency management between software packages
Compilers
- Can I use a particular version of a compiler?
- Am I really using that compiler?
- Can I sanely manage software compilation dependencies?
- GCC example
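A minimal sketch of how one might answer "am I really using that compiler?" from Python, assuming `gcc` is the executable name that the loaded module puts on `PATH`. The helper name `describe_compiler` is hypothetical; it reports the resolved path and the first line of `gcc --version`, or `None` when no such executable is found.

```python
import shutil
import subprocess


def describe_compiler(name="gcc"):
    """Return (path, version_line) for the first `name` on PATH, or None."""
    path = shutil.which(name)
    if path is None:
        return None
    # `--version` is supported by GCC; other compilers may differ (assumption).
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    version_line = lines[0] if lines else ""
    return path, version_line


if __name__ == "__main__":
    info = describe_compiler()
    if info is None:
        print("gcc not found on PATH; is the module loaded?")
    else:
        print("Using %s (%s)" % info)
```

Running this before and after `module load` of a GCC module should show the path changing if the module really took effect.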
Software dependencies
- There are too many dependencies! How do I find out which software I have loaded, and which packages a particular tool needs?
- Python example
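One way to see what is currently loaded, sketched in Python: Lmod (like other environment-modules systems) records the loaded modules in the colon-separated `LOADEDMODULES` environment variable. The function name `loaded_modules` is hypothetical, and the sketch assumes entries of the form `name/version`.

```python
import os


def loaded_modules(environ=None):
    """Return the loaded modules as a list of (name, version) tuples."""
    env = os.environ if environ is None else environ
    raw = env.get("LOADEDMODULES", "")
    modules = []
    for entry in raw.split(":"):
        if not entry:
            continue
        # Assumption: the last path component is the version, if there is one.
        if "/" in entry:
            name, version = entry.rsplit("/", 1)
        else:
            name, version = entry, ""
        modules.append((name, version))
    return modules


if __name__ == "__main__":
    for name, version in loaded_modules():
        print(name, version or "(no version)")
```

On the command line, `module list` shows the same information.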
Software requirements
- I want to do QC so I need (a verylonglistoftools…)
- I loaded my verylonglistoftools, but… What has it made available?
- Nice! But… This other tool doesn't play nicely with XYZ-2.3.4, I need XYZ-2.1.2 at most
- Wait, are you telling me I can do all of this within my submission scripts? Sure…
- QC meta-module example
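A meta-module is a modulefile that only loads other modules. As a rough sketch, the Python helper below (hypothetical name `render_meta_module`) generates the text of a Lua modulefile using Lmod's `whatis` and `load` functions. The tool names and versions are placeholders, apart from XYZ 2.1.2, which is the version pinned in the bullets above.

```python
def render_meta_module(description, tools):
    """Build Lua modulefile text that loads each tool at a pinned version.

    `tools` maps a module name to the version to load.
    """
    lines = ['whatis("%s")' % description]
    for name, version in sorted(tools.items()):
        lines.append('load("%s/%s")' % (name, version))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder tool list; a real QC stack would name real modules.
    text = render_meta_module(
        "QC tool collection", {"XYZ": "2.1.2", "othertool": "1.0"}
    )
    print(text)
```

Pinning the version inside the meta-module is what lets a submission script load one name and still get XYZ 2.1.2 rather than a newer default.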
DMTCP - Checkpoint & Restart
- DMTCP stands for Distributed MultiThreaded Checkpointing
- DMTCP's idea is to wrap your program and provide external checkpoint/restart facilities
- DMTCP saves your process state (open files, active memory, subprocesses) to disk and lets you restart from the last checkpoint
- DMTCP can be integrated with SLURM and PBS
Single-threaded processes
- Checkpoint and restart a blast command
- Show CPU usage at 100%
- Sample script
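A minimal sketch of wrapping a command with DMTCP, built in Python. It assumes `dmtcp_launch` is on `PATH` and uses its `--interval` option (seconds between automatic checkpoints). The helper name `dmtcp_launch_command`, the BLAST+ arguments and the file names are placeholders, not the command used in the talk.

```python
import shlex


def dmtcp_launch_command(program_args, interval_seconds=3600):
    """Return the argument list that runs `program_args` under dmtcp_launch."""
    if not program_args:
        raise ValueError("program_args must not be empty")
    return ["dmtcp_launch", "--interval", str(interval_seconds)] + list(program_args)


if __name__ == "__main__":
    # Placeholder BLAST+ invocation (assumed file and database names).
    blast = ["blastn", "-query", "reads.fa", "-db", "nt", "-out", "hits.txt"]
    cmd = dmtcp_launch_command(blast, interval_seconds=600)
    # Print rather than execute, so the line can be pasted into a job script.
    print(" ".join(shlex.quote(part) for part in cmd))
```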
Multi-threaded processes
- Checkpoint and restart a multi-core blast command
- Show CPU usage at ~200%
- Sample script
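Restarting can be sketched in Python as well. DMTCP writes checkpoint images named `ckpt_*.dmtcp`, one per process, and `dmtcp_restart` takes those files as arguments. A multi-threaded BLAST run is still a single process, so it should normally leave a single image. The helper name `dmtcp_restart_command` and the checkpoint directory are assumptions.

```python
import glob
import os


def dmtcp_restart_command(ckpt_dir="."):
    """Return the dmtcp_restart argument list for images in `ckpt_dir`, or None."""
    images = sorted(glob.glob(os.path.join(ckpt_dir, "ckpt_*.dmtcp")))
    if not images:
        # Nothing to restart from: the job has to start from scratch.
        return None
    return ["dmtcp_restart"] + images


if __name__ == "__main__":
    cmd = dmtcp_restart_command()
    if cmd is None:
        print("No checkpoint images found in the current directory")
    else:
        print(" ".join(cmd))
```

A submission script can use the `None` case to decide between a fresh launch and a restart.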
Code customisation
- I need to use this software, but it won't work =(
- No worries, we will make it work! =)
- I want to make this run better on our resources
BioNano pipeline for PBS on UV2000
- Debugged the BioNano pipeline and the PBS-DRMAA library to find a bug in pbs_submit
- Provided a workaround…
- Xeon Phi support coming soon*
*Terms and conditions apply...
Migrating tools to SLURM
- PBS, SGE and LSF are bioinformatics' friends
- SLURM is the new kid on the bioinformatics block but hasn't got many friends yet…
- We help SLURM meet new friends!
STRUCTURE
- Markov chain Monte Carlo (MCMC) codes are HARD!
- Debugging stochastic code is HARD because its output is meant to vary between runs, which makes it difficult to test/reproduce
- Unless it does the exact same thing all the time!
- Initialising the random seeds randomly when running in parallel helps get the expected different results!
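A toy illustration of the seeding issue in Python, not STRUCTURE's actual code: each "chain" is a short random walk driven by its own `random.Random` instance. With a shared fixed seed every parallel chain is identical; drawing one seed per chain from the operating system's entropy source, and logging it, gives runs that differ yet can still be reproduced. The function names `run_chain` and `fresh_seeds` are hypothetical.

```python
import random
import secrets


def run_chain(seed, steps=5):
    """Toy stand-in for an MCMC chain: a random walk fully determined by `seed`."""
    rng = random.Random(seed)
    state = 0.0
    trace = []
    for _ in range(steps):
        state += rng.gauss(0.0, 1.0)
        trace.append(state)
    return trace


def fresh_seeds(n_chains):
    """Draw one independent 64-bit seed per chain from the OS entropy source."""
    return [secrets.randbits(64) for _ in range(n_chains)]


if __name__ == "__main__":
    same = [run_chain(42) for _ in range(3)]
    print("fixed seed, all chains identical:", same[0] == same[1] == same[2])

    seeds = fresh_seeds(3)
    print("seeds to log for reproducibility:", seeds)
    different = [run_chain(s) for s in seeds]
    print("random seeds, chains differ:", different[0] != different[1])
```

Logging the seeds is the design choice that matters here: a run that misbehaves can be replayed exactly by passing its seed back in.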
Acknowledgements
- Lmod
- Gonzalo Garcia, Luca Venturini, Matthew Hartley and Chris Bridson
- DMTCP
- Tim Stitt, Michael Burrell, Chris Bridson, Sam Gallop and Adam Carrgilson
- BioNano pipeline
- Graham Etherington, Ricardo Ramirez, Pirita Paajanen and the CiS Team
- STRUCTURE
- Tim Stitt, Diane Saunders, Pillar Corredor-Moreno, Antoine Persoons, Vanessa Bueno, Ricardo Ramirez
Many thanks to all for your help
Sources & Scripts
- GitHub page with materials
- The presentation was generated using Emacs and reveal.js
Luis.Yanes@tgac.ac.uk, Tim.Stitt@tgac.ac.uk