-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Properly limit the collections that are read #290
base: main
Are you sure you want to change the base?
Conversation
3ab05db
to
c01ecd5
Compare
I think once the |
I have refactored the The on thing that can become inconsistent now is when a user instantiates their own |
The problems in the writer should also be fixed now. They have become necessary because the changes to the our python wrapper around the |
k4FWCore/components/Writer.cpp
Outdated
warning() << "Failed to retrieve collection " << coll << endmsg; | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these changes are not needed for this PR. I think originally I put them there and this code never runs so there was never the need to change it, but it should never happen. Throwing is probably the intended usage here. This first one should never fail because that would mean there are collections that are read in a Frame that are not in the store so something should have before in the Reader. The one below would happen if the unregisterObject
call would fail which is probably only when the object is not in the store, already checked above. Maybe not checking this second one would also be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked again and these are indeed not needed. I am not sure any longer why I thought they were necessary in the first place, but it looks like things are working as expected and we can change this in another PR if we want to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Now I think I really remember why I did this in the first place. If podio doesn't yet support filling a frame with only a subset of all collections, the Writer crashes (see this CI run). Changing the early return to a continue
fixes that issue and we would not need to bump the minimal podio version necessary for k4FWCore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok now I see that is necessary. I don't think it is related to writing only a subset of collections, that has been there for a while and was working before this PR. I think the explanation is related to a mismatch in the collections that are returned with getAvailableCollections
and the ones that are actually in the store, which is something that couldn't happen before because all the collections were being read. I think the messages should not be warnings then, because imagine for a frame with many collections of which only a single one is read, then there will be a warning for every collection that is not read for every event. Alternatively, if there is a way from obtaining only the collections that are being read from podio (taking into account this doesn't change for different events), then that could solve the issue.
Make the IOSvc only read the collections that are actually requested by the user, or all of them if not set
Lift into its own function and only import podio if strictly necessary
The IOSvc machinery has enough functionality to emulate selective reading and only reading a limited set of collections is effectively an optimization at this point.
36c33ca
to
40827b1
Compare
Needs to happen even if the event number is unknown
Co-authored-by: Juan Miguel Carceller <22276694+jmcarcell@users.noreply.github.com>
BEGINRELEASENOTES
PodioInput.collections
,IOSvc.CollectionNames
andReader.InputCollections
properties actually work like expected. They now properly limit the collections that are read and no other collections will be available. This requires building against podio > v1.2`, otherwiset the current behavior will be used, where collections that are not requested will still be available.ENDRELEASENOTES
Fixes #150
documentation foralready presentIOSvc
(?)k4DataSvc
version are actually checking that this works (otherwise they are just skipped).Especially for the
IOSvc
+Reader
side of things there are quite a few handles to change which collections should be read and they have to be kept somewhat consistent. I think we should try to find a way to harmonize that a bit, but I am not entirely sure which way is the one that should be kept. @jmcarcell any preference there, resp. do you see more ways this could be made more robust?