Cuttlefish 2.0.0
New features
- Cuttlefish can now construct compacted de Bruijn graphs from short-read sets, in addition to reference sequences. This is made possible by a new algorithm, Cuttlefish 2, that works on both forms of input. A pre-print describing the algorithm is available on bioRxiv.
Enhancements
The implementation of the earlier Cuttlefish algorithm (also referred to as Cuttlefish 1) has been enhanced in a number of ways:
- An explicitly built KMC database is no longer required as input to Cuttlefish. The database construction has been incorporated into the Cuttlefish execution.
- The plain-text output has been replaced with the FASTA format.
- A meta-information file is also output along with the compacted graph, containing summary statistics of the graph.
- Memory usage can now be traded off for faster execution, either by providing a (soft) memory bound or by lifting any strict memory requirement.
All of these enhancements are also available when executing the Cuttlefish 2 algorithm.
Fixes
- A bug in outputting the compacted graph in the GFA 1 format has been fixed (reported in #8, #9). The bug produced an extra overlap value at the start of the "Overlaps" field of a "Path" line.
Other changes
- The default k value is bumped to 27 from 25.
- The default thread count t is set to a quarter of the number of concurrent threads supported, instead of just 1.
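To illustrate the new thread-count default, here is a minimal Python sketch of deriving "a quarter of the supported concurrent threads". It is not Cuttlefish's actual code: the function name `default_thread_count` is hypothetical, and rounding down and clamping the result to at least 1 are assumptions these notes do not state.

```python
import os


def default_thread_count(supported=None):
    """Return a quarter of the supported concurrent threads, but at least 1.

    Hypothetical helper: rounding down and clamping to 1 are assumptions.
    """
    if supported is None:
        # os.cpu_count() may return None when the count cannot be determined.
        supported = os.cpu_count() or 1
    return max(1, supported // 4)


print(default_thread_count(16))  # 4
print(default_thread_count(2))   # 1 (clamped so at least one thread is used)
```

On a 16-thread machine, this default would use 4 threads where the previous default of 1 would have used a single thread.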
Full Changelog: v1.0.0...v2.0.0