Retrieve spectra by USI #14
Sorry, apparently the history in my branch has been messed up a bit. I will fix that (along with the other edits) tomorrow.
Thank you for writing this. I'll take a closer look later today. Could you please compare your spectrum retrieval code with https://github.com/mobiusklein/mzdata/blob/main/src/io/proxi.rs to see if they're compatible? Alternatively, we can remove my existing PROXI codec, since it's only for parsing, not for making network requests. I also tangled with
All the same, please gate this functionality behind the
Will do! Thanks for the pointers.
I merged the two files, mostly leaning on your existing serializations and deserializations, as they seemed more complete than mine. I fixed the parsing errors that arose with most PROXI servers but am stuck on one from PRIDE, where the CURIE string has text in the accession number (see below). I can either work on the CURIE handling to allow text in accession numbers, or try to ignore this in parsing.
{
  "accession": "NCIT:C19067",
  "name": "project title",
  "value": "Sequencing the anti-MUC1 hybridoma antibody 139H2"
}
Additionally, some of the PROXI servers are stricter in handling the interpretation part (they need the charge for the peptide), so I chose to strip the interpretation from the USI before sending it to any server. Do you think there is a better way of doing this? I could add the original USI back when the result is formed so that this behaviour is hidden from end users. I also want to add an async version to allow for different access patterns, so that is still to come. Oh, and I will fix the git history at some point.
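As an illustration of the first option (allowing text in the accession), here is a minimal sketch that keeps the accession as a string and only offers a numeric view when the text happens to be all digits. The names `TextCurie`, `parse_curie`, and `numeric_accession` are hypothetical and are not part of mzdata's existing CURIE type; the sketch only assumes a CURIE has the form `PREFIX:ACCESSION`.

```rust
/// Hypothetical CURIE representation that keeps the accession as text,
/// so that values such as "NCIT:C19067" can be stored without loss.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TextCurie {
    pub prefix: String,
    pub accession: String,
}

/// Split a CURIE on its first ':'; returns `None` if either side is empty
/// or if there is no ':' at all.
pub fn parse_curie(s: &str) -> Option<TextCurie> {
    let (prefix, accession) = s.split_once(':')?;
    if prefix.is_empty() || accession.is_empty() {
        return None;
    }
    Some(TextCurie {
        prefix: prefix.to_string(),
        accession: accession.to_string(),
    })
}

impl TextCurie {
    /// Numeric view of the accession, available only when it is all digits
    /// (as is the case for the MS and UO vocabularies).
    pub fn numeric_accession(&self) -> Option<u32> {
        self.accession.parse().ok()
    }
}
```

With this shape, `parse_curie("NCIT:C19067")` succeeds but has no numeric accession, while a term such as `MS:1000511` still yields a number. The trade-off is a heap allocation per term, which a compact numeric representation avoids.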
The PROXI docs are incomplete/outdated and leave much as implementation details. For example, if I remember correctly, the MassIVE endpoint would include the USI in the response while the PRIDE one wouldn't. Additionally, different implementations might indicate failure differently. If you have the will to do it, you could send the USI as-is and, if it is rejected, re-send it without the interpretation. But what might be best right now is to understand the failure modes rather than trying to hide them from the end user, and then, once those modes are well understood, add error handling.
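The "send as-is, then retry without the interpretation" idea could be sketched roughly as follows. This is not the code in the PR: `strip_interpretation` and `fetch_with_fallback` are hypothetical names, the network call is abstracted as a closure, and the splitting assumes that the collection and run name contain no `:` characters, which is a simplification of the USI format (`mzspec:collection:run:indexType:index[:interpretation]`).

```rust
/// Drop the optional interpretation (sixth field onward) from a USI.
/// Returns `None` when the USI has no interpretation to strip.
/// Assumption: no ':' appears inside the collection or run name.
pub fn strip_interpretation(usi: &str) -> Option<String> {
    let parts: Vec<&str> = usi.splitn(6, ':').collect();
    if parts.len() < 6 {
        return None;
    }
    Some(parts[..5].join(":"))
}

/// Try the USI as given; if the server rejects it, retry once without the
/// interpretation. The returned flag reports whether the fallback was used,
/// so the caller can still see that the original request failed.
/// If both attempts fail, the error from the first attempt is returned.
pub fn fetch_with_fallback<T, E>(
    usi: &str,
    mut fetch: impl FnMut(&str) -> Result<T, E>,
) -> Result<(T, bool), E> {
    match fetch(usi) {
        Ok(value) => Ok((value, false)),
        Err(first) => match strip_interpretation(usi) {
            Some(bare) => match fetch(&bare) {
                Ok(value) => Ok((value, true)),
                Err(_) => Err(first),
            },
            None => Err(first),
        },
    }
}
```

Returning the flag rather than silently retrying keeps the failure visible to the library user, in line with the point above about understanding the failure modes before hiding them.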
I was not planning to handle additional controlled vocabularies beyond the two used for mzML in
I'll expand on the drawbacks later today.
I removed the USI interpretation stripping and tortured some APIs a bit to get more known error messages. I do agree with the idea of giving the library user as much control as possible; it is still very possible to remove the interpretation field before running the
On the NCIT, I think I understand what you mean by the perfect hashing scheme, but I would not trust my own implementation at this point. Do you need this behaviour in more places, or do you think you will need it in the future? Otherwise I suggest to create a
I wanted to handle USIs in the annotator, so I needed support for getting a spectrum by USI. Let me know if I am doing something totally off.
I want to add some tiny things (mostly error handling) but the main stuff of the PR is fixed.