changelog and final dependency upgrade
Showing 3 changed files with 51 additions and 61 deletions.
This file was deleted.
@@ -0,0 +1,48 @@
## 🚀 New features

### Multi-chain clone assembly for single-cell data

MiXCR now calculates combined Heavy-Light antibody clones and combined Alpha-Beta and Gamma-Delta TCR clones for single-cell data. Two new commands enable this functionality: `groupClones` and `exportCloneGroups`. The first calculates multi-chain clones from assembled clonotypes and writes the result in binary format; the second exports information about the combined clonotypes.

All single-cell presets now automatically produce combined multi-chain output in both binary and textual formats; see the files matching the `*.clone.groups.tsv` pattern in the output folder.
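
As a rough illustration of picking up the textual output, the sketch below collects the files matching the `*.clone.groups.tsv` pattern in a folder and reads only their header rows. The helper names `find_clone_group_tables` and `read_header` are hypothetical, and no particular column names are assumed, since the exported columns depend on the preset.

```python
import csv
from pathlib import Path


def find_clone_group_tables(output_dir):
    """Return sorted paths of files matching *.clone.groups.tsv in output_dir."""
    return sorted(Path(output_dir).glob("*.clone.groups.tsv"))


def read_header(tsv_path):
    """Return the column names from the first row of a tab-separated file."""
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        # An empty file yields an empty header instead of raising StopIteration.
        return next(csv.reader(handle, delimiter="\t"), [])


if __name__ == "__main__":
    # Hypothetical usage: inspect the tables in the current directory.
    for table in find_clone_group_tables("."):
        print(table.name, len(read_header(table)), "columns")
```

Reading only the header keeps the sketch independent of whatever columns a given preset exports.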

### New characteristics in clonotype export

- Export biochemical properties of gene regions with the `-biochemicalProperty <geneFeature> <property>` or `-baseBiochemicalProperties <geneFeature>` export options. Available in export for alignments, clones and SHM tree nodes. Available properties: `Hydrophobicity`, `Charge`, `Polarity`, `Volume`, `Strength`, `MjEnergy`, `Kf1`, `Kf2`, `Kf3`, `Kf4`, `Kf5`, `Kf6`, `Kf7`, `Kf8`, `Kf9`, `Kf10`, `Rim`, `Surface`, `Turn`, `Alpha`, `Beta`, `Core`, `Disorder`, `N2Strength`, `N2Hydrophobicity`, `N2Volume`, `N2Surface`.
- Export isotype with `-isotype [<(primary|subclass|auto)>]`.
- Export `-mutationRate [<gene_feature>]` in the `exportShmTreesWithNodes`, `exportClones` and `exportCloneGroups` commands: the number of mutations relative to the corresponding germline divided by the target sequence size. For `exportClones` and `exportCloneGroups`, CDR3 is not included in the calculation.
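
The mutation rate definition above (mutations relative to germline divided by target size) can be sketched in a few lines. This is a simplified illustration, not MiXCR's implementation: it assumes the germline and target are already aligned to equal length and counts substitutions only, so insertions and deletions are out of scope. The function name `mutation_rate` is hypothetical.

```python
def mutation_rate(germline, target):
    """Fraction of mismatching positions between two pre-aligned sequences.

    Assumption: both sequences have the same length and only substitutions
    count as mutations; indels are not handled in this sketch.
    """
    if len(germline) != len(target):
        raise ValueError("sequences must be aligned to the same length")
    if not target:
        raise ValueError("target sequence must not be empty")
    mutations = sum(1 for g, t in zip(germline, target) if g != t)
    return mutations / len(target)


if __name__ == "__main__":
    # Two substitutions over eight positions.
    print(mutation_rate("ACGTACGT", "ACGTTCGA"))  # 0.25
```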

### Support for wider set of input formats

- Support for `cram` files as input for the `analyze` and `align` commands. Optionally, a reference genome can be specified with `--reference-for-cram`.
- Fixed usage of BAM input for `analyze` and `align` when the file contains both paired and single reads.

### Algorithm enhancements

- The global consensus assembly algorithm, applied in `assemble` to collapse UMI/Cell groups into contigs, now has a much better empirical seed selection step for multi-consensus assembly scenarios. This significantly increases sensitivity when assembling secondary consensuses from the same group of sequences.
- New constraint in the low-quality read mapping procedure that prevents cross-cell read mapping.

## 📚 Preset updates

- Additional improvement of clone filters in the `10x-sc-xcr-vdj` preset.
- Tag pattern upgrade for `cellecta-human-rna-xcr-umi-drivermap-air`. The UMI now includes part of the C-gene primer to increase diversity, and R2 is also used for payload.
- Assembling feature fix for the `irepertoire-human-rna-xcr-repseq-plus` preset. The assembling feature is now `{CDR2Begin:FR4End}`.
- New preset for the BD full-length protocol with enhanced beads V2 featuring B384 whitelists: `bd-sc-xcr-rhapsody-full-length-enhanced-bead-v2`.
- New preset for Takara Bio SMART-Seq Mouse TCR (with UMIs): `takara-mouse-rna-tcr-umi-smarseq`.
- Presets for new Cellecta kits: `cellecta-human-dna-xcr-umi-drivermap-air`, `cellecta-human-rna-xcr-full-length-umi-drivermap-air`, `cellecta-mouse-rna-xcr-umi-drivermap-air`.
- Presets for iRepertoire RepSeq+ kits with UMI: `irepertoire-mouse-rna-xcr-repseq-plus-umi-pe`, `irepertoire-human-rna-xcr-repseq-plus-umi-se`, `irepertoire-human-rna-xcr-repseq-plus-umi-pe`.
- `isotype` field added to `exportClones` for presets supporting isotype identification.
- Split by C-gene enabled in the `thermofisher-human-rna-igh-oncomine-lr` and `cellecta-human-rna-xcr-umi-drivermap-air` presets to facilitate isotype separation.
- Default consensus assembly parameters `maxNormalizedAlignmentPenalty` and `altSeedPenaltyTolerance` were adjusted to increase sensitivity.
- The `--split-by-sample` option is now set to `true` by default for all `align` presets and for all presets that inherit from them. This default applies unless it is overridden directly in the preset or with the `--dont-split-by-sample` mix-in.
- `exportAlignments` now reports UMI and/or Cell barcodes by default for presets with barcodes.

## 🛠️ Minor improvements & fixes

- Fixed a possible crash with the `--dry-run` option in `analyze`.
- More informative help message when a deprecated preset is used; the previous message incorrectly suggested `--assemble-contigs-by` instead of `--assemble-clonotypes-by`.
- When split-by-tags is enabled, `exportClones` and `exportShmTreesWithNodes` now output the read count as the sum of reads for the given tag selection; previous versions used a more complicated formula.
- `exportAlignments` now includes the `topChains` column by default. `exportClones` reports `topChains` for single-cell presets.
- Fixed calculation of `geneFamilyName` for genes like `IGHA*00` (without a number before the `*` symbol).
- Better formatting in the `listPresets` command. Added grouping by vendor, labels and optional filtering.
- Validation of input types in `align` or `analyze` against the given tag pattern.
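
To make the `geneFamilyName` edge case concrete, here is a minimal sketch of one plausible naming rule: drop the allele suffix after `*`, then keep the part before the first `-`. This rule and the function name `gene_family_name` are assumptions for illustration only and are not taken from MiXCR's code; the point is that a name such as `IGHA*00`, which has no number before `*`, should still yield a sensible family.

```python
def gene_family_name(gene_name):
    """Derive a family name from an allele-qualified gene name.

    Assumed rule (illustrative, not MiXCR's implementation):
    strip the allele suffix after '*', then keep the part before the first '-'.
    """
    base = gene_name.split("*", 1)[0]  # "IGHV3-23*01" -> "IGHV3-23"
    return base.split("-", 1)[0]       # "IGHV3-23" -> "IGHV3"


if __name__ == "__main__":
    for name in ("IGHV3-23*01", "IGHA*00"):
        print(name, "->", gene_family_name(name))
```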