Extract could operate on either type of BAM ordering #27

Open
cerebis opened this issue Oct 23, 2019 · 0 comments
Labels
enhancement New feature or request

cerebis commented Oct 23, 2019

bin3C requires that input BAMs be query-name sorted. This makes pair matching trivial and keeps memory use low. However, when invoking bin3C extract -f bam ..., a coordinate-sorted and indexed BAM would be much faster to process.
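To illustrate why query-name ordering keeps pair matching cheap, here is a minimal sketch. It is not bin3C's actual code; `iter_name_groups` is a hypothetical helper, and the records are only assumed to expose a `query_name` attribute, as pysam's aligned segments do. Because records sharing a name are adjacent, only one group needs to be held in memory at a time.

```python
from itertools import groupby
from operator import attrgetter


def iter_name_groups(alignments):
    """Yield (query_name, [records]) from a query-name-sorted stream.

    Hypothetical helper: `alignments` is any iterable of objects with a
    `query_name` attribute. Correct grouping relies on records with the
    same name being adjacent, which query-name sorting guarantees.
    """
    for name, group in groupby(alignments, key=attrgetter("query_name")):
        # Only the current group is materialised, so memory stays small.
        yield name, list(group)
```

On a coordinate-sorted file the same names are scattered, so this streaming approach would no longer find mates together.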

Fix

We should inspect the BAM for its sort order and adapt the parsing logic accordingly: instead of iterating over the entire input BAM (i.e. fetch(until_eof=True)), iterate over only the references involved and fetch their alignments.

For example:

for ref_name in cluster:
    # bam: an open, coordinate-sorted and indexed BAM (variable name assumed)
    for aln in bam.fetch(ref_name):
        # do something
        pass
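
A slightly fuller sketch of the proposed dispatch, under stated assumptions: `bam` behaves like an open pysam `AlignmentFile` (with `header.to_dict()`, `has_index()` and `fetch()`), and the function names `sort_order` and `iter_cluster_alignments` are hypothetical, not part of bin3C. The sort order is read from the `SO` field of the `@HD` header line; anything other than an indexed, coordinate-sorted file falls back to the existing full scan.

```python
from typing import Iterable, Iterator


def sort_order(header: dict) -> str:
    """Return the SO value of the @HD header line, or 'unknown' if absent."""
    return header.get("HD", {}).get("SO", "unknown")


def iter_cluster_alignments(bam, ref_names: Iterable[str]) -> Iterator:
    """Yield alignments whose reference is one of `ref_names`.

    `bam` is assumed to behave like an open pysam.AlignmentFile.
    """
    # De-duplicate while preserving the caller's order.
    wanted = list(dict.fromkeys(ref_names))
    if sort_order(bam.header.to_dict()) == "coordinate" and bam.has_index():
        # Indexed random access: touch only the references in the cluster.
        for ref_name in wanted:
            yield from bam.fetch(ref_name)
    else:
        # Fallback: a single pass over the whole file, filtering as we go.
        wanted_set = set(wanted)
        for aln in bam.fetch(until_eof=True):
            if aln.reference_name in wanted_set:
                yield aln
```

With a real file this might be called as `iter_cluster_alignments(pysam.AlignmentFile(path, "rb"), cluster)`. Note that the two branches yield records in different orders, so any downstream pair matching that relies on adjacent mates would need to account for that.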

cerebis added the enhancement label Oct 23, 2019