feat: Return the status of unproductive contigs in is_valid function #287
base: master
vdj_ann/src/transcript.rs
Outdated
pub enum UnproductiveContigCause {
    NoCdr3,
    Misordered,
    NotFull,
    TooLarge,
}
Can we add comments above each failure state describing exactly what criteria need to be met to qualify?
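One way the request above could look is sketched here. The variant names come from the diff, but the criteria in the doc comments and the `description` helper are assumptions made for illustration; the real conditions have to be read off the logic in `is_valid`.

```rust
/// Reason a contig was judged unproductive.
///
/// NOTE: the criteria written in these doc comments are placeholders
/// (assumptions), not the confirmed conditions used by `is_valid`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UnproductiveContigCause {
    /// Assumed: no CDR3 could be located on the contig.
    NoCdr3,
    /// Assumed: the annotated segments do not appear in the expected order.
    Misordered,
    /// Assumed: the contig lacks a V start or a J stop.
    NotFull,
    /// Assumed: the span from V start to J stop exceeds a size limit.
    TooLarge,
}

impl UnproductiveContigCause {
    /// Hypothetical helper: a short label for logs or reports.
    pub fn description(self) -> &'static str {
        match self {
            UnproductiveContigCause::NoCdr3 => "no CDR3 found",
            UnproductiveContigCause::Misordered => "segments out of order",
            UnproductiveContigCause::NotFull => "not full length",
            UnproductiveContigCause::TooLarge => "V..J span too large",
        }
    }
}
```

Keeping the criteria in doc comments means they show up in `cargo doc` output next to each variant.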
// Unwrap gamma/delta mode flag
let gd_mode = is_gd.unwrap_or(false);
let refs = &refdata.refs;
let rheaders = &refdata.rheaders;
let mut ret_vec = Vec::new();
let mut never_full = true;
// two passes, one for light chains and one for heavy chains
for pass in 0..2 {
    let mut m = "A";
    if pass == 1 {
The `ann` tuple below is all over the vdj code base, and it makes everything confusing and hard to follow. Converting it into a struct is a lot more work, but if we could at least rename the unpacked fields in this function to something more intuitive, it would really help. I think it's defined here:
rust-toolbox/vdj_ann/src/annotate.rs
Line 1098 in 03aa9d3
// { ( sequence start, match length, ref tig, ref tig start, {mismatches} ) }.
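A rough sketch of the struct idea follows, with field names taken from the comment above. The `Annotation` type, its field types (`i32` and `Vec<i32>`), the `AnnTuple` alias, and the `seq_end` helper are all hypothetical; the actual tuple type in `annotate.rs` may differ.

```rust
/// Assumed shape of the annotation tuple, following the comment
/// "( sequence start, match length, ref tig, ref tig start, {mismatches} )".
/// The concrete element types here are a guess for illustration.
type AnnTuple = (i32, i32, i32, i32, Vec<i32>);

/// Hypothetical named replacement for the positional tuple.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Annotation {
    /// Start of the match on the contig sequence.
    pub seq_start: i32,
    /// Length of the match.
    pub match_len: i32,
    /// Index of the matched reference contig.
    pub ref_id: i32,
    /// Start of the match on the reference contig.
    pub ref_start: i32,
    /// Mismatch positions.
    pub mismatches: Vec<i32>,
}

impl From<AnnTuple> for Annotation {
    fn from(t: AnnTuple) -> Self {
        // Name the fields once, at the boundary, instead of at every use site.
        let (seq_start, match_len, ref_id, ref_start, mismatches) = t;
        Annotation { seq_start, match_len, ref_id, ref_start, mismatches }
    }
}

impl Annotation {
    /// One past the last contig position covered by the match.
    pub fn seq_end(&self) -> i32 {
        self.seq_start + self.match_len
    }
}
```

A `From` conversion like this would let a single function adopt named fields without touching callers that still pass tuples.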
if pass == 2 || n % 3 == 1 {
// on second pass, go through with checking for stop codon regardless of n % 3 value
if inner_pass == 2 || n % 3 == 1 {
I think the original code was conflating two checks: (1) full length, i.e. having a vstart and jstop, and (2) finding a frameshift and/or stop codon. We should split the two checks and specify them as separate variants in the enum. FYI, these are the categories described on our software support site: https://support.10xgenomics.com/single-cell-vdj/software/pipelines/latest/algorithms/annotation#productive
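To illustrate the split being proposed, here is a minimal sketch that reports the full-length check separately from the frameshift and stop-codon checks. Everything in it is hypothetical: the `FrameCheckFailure` enum, the `check_contig` function, and the half-open `vstart..jstop` coordinates. It also simplifies the frame test to "span length divisible by 3", whereas the code under review uses `n % 3 == 1` under its own coordinate convention.

```rust
/// Hypothetical failure categories, one per independent check.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FrameCheckFailure {
    /// No V start and/or no J stop was found.
    NotFullLength,
    /// The V start..J stop span is not a whole number of codons.
    Frameshift,
    /// An in-frame stop codon occurs inside the span.
    StopCodon,
}

/// Run the checks separately and report every failure found.
/// `vstart` and `jstop` are assumed to be a half-open range into `seq`.
pub fn check_contig(
    seq: &[u8],
    vstart: Option<usize>,
    jstop: Option<usize>,
) -> Vec<FrameCheckFailure> {
    // Check 1: full length. Without both ends the frame checks are undefined.
    let (start, stop) = match (vstart, jstop) {
        (Some(s), Some(e)) if s <= e && e <= seq.len() => (s, e),
        _ => return vec![FrameCheckFailure::NotFullLength],
    };
    let span = &seq[start..stop];
    let mut failures = Vec::new();

    // Check 2a: frameshift (simplified convention, see the lead-in).
    if span.len() % 3 != 0 {
        failures.push(FrameCheckFailure::Frameshift);
    }

    // Check 2b: stop codon in the reading frame that begins at `start`.
    let has_stop = span
        .chunks_exact(3)
        .any(|codon| matches!(codon, b"TAA" | b"TAG" | b"TGA"));
    if has_stop {
        failures.push(FrameCheckFailure::StopCodon);
    }

    failures
}
```

Returning a list rather than the first failure keeps the categories independent, so a contig that is both frameshifted and contains a stop codon is reported under both.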
63e7479 to 16a96cf
Defining failure categories and returning why a contig is not productive.
Todo:
JIRA: https://10xtech.atlassian.net/browse/CELLRANGER-7568