-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Download files via htsget #371
Conversation
Nice job, @pahatz ! Works well for me! |
Codecov ReportAttention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## main #371 +/- ##
==========================================
+ Coverage 42.25% 43.15% +0.89%
==========================================
Files 12 13 +1
Lines 1678 1840 +162
==========================================
+ Hits 709 794 +85
- Misses 878 938 +60
- Partials 91 108 +17
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It works very well when I tested!
Apart from the comments I left in the code. I have a few general comments.
-
I think it would be better to show a message when the file is successfully downloaded.
-
When the output file exists already, currently the command just overwrites the existing file without any notification. It would be better to show if the file already exist or not, and probably also add an option for force overwriting.
-
For the example given, when the
filename
does not have the.bam
extension, the downloaded file also does not have a.bam
extension. However, the file in s3 does have the.bam
extension. I think it would be great to add a proper file extension if it is recognizable.
Is it ready for review? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good and works well!
I added one suggestion to the Readme, but nothing blocking an approval.
fyi, once you rebase tests should work as expected (the waiting status is due to the update to go 1.22) |
I followed the instructions on the description (also changed the name of the
with logs:
Something strange is going on with |
Also, trying something stupid like neglecting to input a filename for the flag
produces the same error as above:
where logs
however the filename requested is |
This is not handled properly in general, we would need the following change in helpers/helpers.go:
edit: On second thought, the above fix is wrong. Filenames could start with a dash character. The other solution I had in mind is to:
I could not think of some relatively small solution that could ensure this handled better. I'll amend the last commit for now. It might be a good idea to use a library to handle the parsing of the arguments in the future. |
Maybe some leftover volume/container?
|
- move variables as arguments - create single file
- changed the way we input arguments - added force-overwrite check and option - added message for successful operation
Co-authored-by: Malin Klang <ahlberg.malin@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have a tendency to prefix everything inside the package with the package name for the variables.
Some of the error messages are not what the user would expect but what you as a developer sees in the current context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works well for me!
Co-authored-by: Joakim Bygdell <joakim.bygdell@nbis.se>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works great when I tested and all my previous comments are addressed, good job @pahatz!
I left a minor comments about an probably unexpected indentation in the helping message.
Apart from that, I can approve it when the failing unit tests are solved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Related issue(s) and PR(s)
This PR closes #280.
Description
This adds a component called htsget to the sda-cli that allows to download (partial) files through the htsget server.
How to test
To test this, you need to run the docker compose of
genomicDataInfra/starter-kit-storage-and-interfaces
with:docker compose -f docker-compose-demo.yml up -d
Then get the token as listed in that repo and insert it into the testing/s3cmd.conf file of the sda-cli repo with the name of
access_token=<token>
Last step for the dev environment is to run the docker compose from
genomicDataInfra/starter-kit-htsget
withdocker compose up -d
Then you can run the htsget sda-cli command with something like:
./sda-cli htsget -config testing/s3cmd.conf -dataset DATASET0001 -filename htsnexus_test_NA12878 -reference 11 -host http://localhost:8088 -pubkey testing/c4gh.pub.pem -output ./result.c4gh --force-overwrite
, wherepubkey
is your own c4gh key pair provided to be used.This will download the file
result.c4gh
You can then decrypt the file with:
./sda-cli decrypt -key c4gh.sec.pem result.c4gh
To check that it worked, check the file contents with samtools:
samtools view result