
Cuttlefish 2.0.0

@jamshed jamshed released this 03 Feb 01:53
· 75 commits to master since this release

New features

  • Cuttlefish can now construct compacted de Bruijn graphs from short-read sets, in addition to reference sequences. This is made possible by a new algorithm, Cuttlefish 2, which works on both forms of input. A pre-print describing the algorithm is available on bioRxiv.

Enhancements

The implementation of the earlier Cuttlefish algorithm (also referred to as Cuttlefish 1) has been enhanced in a number of ways:

  • An explicitly built KMC database is no longer required as input to Cuttlefish. The database construction has been incorporated into the Cuttlefish execution.
  • The plain-text output has been replaced with the FASTA format.
  • A meta-information file is also output along with the compacted graph, containing summary statistics of the graph.
  • Memory usage can now be traded off for faster execution, either by providing a (soft) memory bound or by lifting any strict memory requirement.

All of these features are also available when executing the Cuttlefish 2 algorithm.

Fixes

  • A bug in outputting the compacted graph in the GFA 1 format has been fixed (reported in #8, #9). The bug produced an extra prepended overlap value in the "Overlaps" field of a "Path" line.
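
For context, a GFA 1 "Path" line over n oriented segments carries at most n - 1 values in its "Overlaps" field, one per adjacent pair of segments, so an extra leading value makes the line inconsistent. The sketch below illustrates that invariant only. The helper `gfa1_path_line` is hypothetical and is not taken from the Cuttlefish source, and it assumes every overlap is k - 1 matches, as is usual between adjacent unitigs of a de Bruijn graph.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not Cuttlefish's code): formats a GFA 1 Path (P) line
// for a path over oriented segment names such as "11+" or "12-".
std::string gfa1_path_line(const std::string& path_name,
                           const std::vector<std::string>& segments,
                           std::size_t k)
{
    assert(!segments.empty());
    assert(k > 0);

    // Assumption: each adjacent pair of segments overlaps by k - 1 bases.
    const std::string overlap = std::to_string(k - 1) + "M";

    std::string names;
    std::string overlaps;
    for (std::size_t i = 0; i < segments.size(); ++i) {
        if (i > 0) {
            names += ',';
            // One overlap per adjacent pair: n segments yield n - 1 overlaps.
            if (i > 1) overlaps += ',';
            overlaps += overlap;
        }
        names += segments[i];
    }

    // A single-segment path has no overlaps; GFA 1 permits "*" in that field.
    if (overlaps.empty()) overlaps = "*";

    return "P\t" + path_name + "\t" + names + "\t" + overlaps;
}
```

Calling `gfa1_path_line("p1", {"11+", "12-", "13+"}, 5)` gives a line whose last field holds two overlaps (`4M,4M`) for three segments.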

Other changes

  • The default k value has been raised from 25 to 27.
  • The default thread-count t is set to a quarter of the number of concurrent threads supported, instead of just 1.
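
As a rough illustration of the new thread-count default, the sketch below derives a quarter of the supported concurrency using the standard `std::thread::hardware_concurrency()`. The function name `default_thread_count` and the clamp to a minimum of one thread are assumptions of this sketch. The actual Cuttlefish implementation may compute the value differently.

```cpp
#include <cstddef>
#include <thread>

// Hypothetical helper (not Cuttlefish's code): a quarter of the supported
// concurrency, clamped to at least one thread. The clamp is an assumption,
// added so that small machines still get a usable default.
constexpr std::size_t default_thread_count(std::size_t supported)
{
    return supported / 4 > 0 ? supported / 4 : 1;
}

// Convenience overload that queries the machine it runs on.
// hardware_concurrency() may return 0 when the value cannot be determined;
// the clamp above then yields a single thread.
inline std::size_t default_thread_count()
{
    return default_thread_count(std::thread::hardware_concurrency());
}
```

Under these assumptions, a machine reporting 16 concurrent threads would default to 4, and one reporting fewer than 8 would default to 1.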

Full Changelog: v1.0.0...v2.0.0