These pipelines run the Cactus whole genome alignment tool efficiently on SLURM (and possibly other) clusters.
Tutorials available on the FAS Informatics website:
To install, clone the repository:
git clone https://github.com/harvardinformatics/cactus-snakemake.git
Snakemake and Singularity are also required. For more information, see the setup instructions in any of the tutorials linked above.
Each pipeline requires its own config file, which specifies input and output options and cluster resources.
With the config file set up, the pipelines are generally run as:
snakemake -j <number of jobs to submit simultaneously> -e slurm -s </path/to/snakefile.smk> --configfile </path/to/your/snakemake-config.yml>
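As a rough illustration of filling in that template, the sketch below assembles the command from three variables and prints it for review rather than running it. The job count and both paths are placeholder assumptions, not values from this repository; substitute the snakefile and config file for the pipeline you are using.

```shell
#!/usr/bin/env bash
# All three values are hypothetical placeholders; adjust them for your setup.
JOBS=10                                      # max jobs submitted simultaneously
SNAKEFILE="path/to/snakefile.smk"            # snakefile of the pipeline to run
CONFIG="path/to/your/snakemake-config.yml"   # your edited config file

# Build the command as an array so that paths containing spaces stay intact.
CMD=(snakemake -j "$JOBS" -e slurm -s "$SNAKEFILE" --configfile "$CONFIG")

# Print the command for inspection.
printf '%s ' "${CMD[@]}"; printf '\n'
```

Once the printed command looks right, it can be executed by replacing the final line with `"${CMD[@]}"`.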
💡 Tip: Cannon cluster Snakemake plugin
If you are on the Harvard Cannon cluster, you can use the snakemake-executor-plugin-cannon to do automatic partition selection instead of the generic SLURM executor plugin. Install the plugin with pip or mamba and then use `-e cannon` in all of your commands instead of `-e slurm`.
For more information, see the setup and run instructions in each of the tutorials linked above.
Several meta config options are available across all pipelines as pseudo-command line flags:
| Command line flag | Description |
|---|---|
| `--config display=T` | Print the current config settings and exit |
| `--config info=T` | Display some information about the pipelines, including version and last commit date |
| `--config version=T` | Display the version of the pipeline |
| `--config prep=T` | Run all pre-processing steps and exit (e.g. output directory creation, cactus image download, running `cactus-prepare`) |
| `--config debug=T` | The same as `prep`, but display extra information about the pre-processing steps |
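A minimal sketch of how one of these flags might be combined with the usual run command, assuming the meta option is simply appended to it. The helper name `meta_cmd`, the paths, and the choice of `-j 1` are illustrative assumptions, not part of the pipelines; the function only prints the command so it can be checked before running.

```shell
#!/usr/bin/env bash
# Hypothetical placeholder paths; replace with your own files.
SNAKEFILE="path/to/snakefile.smk"
CONFIG="path/to/your/snakemake-config.yml"

# meta_cmd is an illustrative helper (not provided by the pipelines).
# It prints the run command with one meta option (display, info, version,
# prep, or debug) appended as --config <option>=T.
meta_cmd() {
  local option="$1"
  printf 'snakemake -j 1 -e slurm -s %s --configfile %s --config %s=T\n' \
    "$SNAKEFILE" "$CONFIG" "$option"
}

# Show the command that would run only the pre-processing steps.
meta_cmd prep
```

Running with `prep` or `display` first is a cheap way to confirm that the config file is being read as intended before submitting a full alignment.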