Currently, data is stored by simply compressing pickled Python classes.
This approach was chosen over other serialisation methods as a quick, good-enough solution. However, as time passes and the codebase evolves, the dependency of existing serialised instances on a particular class version becomes increasingly problematic. It can prevent users from going back to old data and reanalysing it with a newer version of the software, since the class can no longer be deserialised.
Either we must provide conversions between class versions or, better, avoid the problem entirely.
Therefore, bin3C should switch to a class-agnostic and efficient means of storing intermediate analysis results (contact map, clusterings). Though we could pickle plain datatypes, an obvious candidate is HDF5, which would itself introduce a chunk of dependencies. Another alternative is to adopt an existing Hi-C HDF5 format, so long as it does not itself include external class implementation details or extraneous fields not relevant to metagenomics.
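
To make the "pickle plain datatypes" option concrete, here is a minimal sketch using only the standard library. It is not bin3C's current code: the function names, the `format_version` field and the layout of the payload (a sparse contact map as `(row, col, count)` triples, a clustering as a dict of sequence names) are all assumptions made for illustration. The point is only that the file contains nothing but built-in types, so it can be read back regardless of how the classes in the codebase change.

```python
import gzip
import io
import pickle

# Hypothetical schema version, bumped whenever the payload layout changes.
FORMAT_VERSION = 1


def save_results(fileobj, contact_map, clustering):
    """Write results as a gzip-compressed pickle of built-in types only."""
    payload = {
        "format_version": FORMAT_VERSION,
        # Assumed layout: sparse contact map as (row, col, count) triples.
        "contact_map": [(int(i), int(j), int(n)) for i, j, n in contact_map],
        # Assumed layout: cluster id -> list of sequence names.
        "clustering": {int(k): [str(s) for s in v] for k, v in clustering.items()},
    }
    with gzip.GzipFile(fileobj=fileobj, mode="wb") as gz:
        pickle.dump(payload, gz)


def load_results(fileobj):
    """Read a payload back; no project classes are needed to unpickle it."""
    with gzip.GzipFile(fileobj=fileobj, mode="rb") as gz:
        payload = pickle.load(gz)
    if payload.get("format_version") != FORMAT_VERSION:
        raise ValueError("unsupported format version: %r" % payload.get("format_version"))
    return payload


# Round trip through an in-memory buffer standing in for a file on disk.
buf = io.BytesIO()
save_results(buf, [(0, 1, 12), (1, 2, 3)], {0: ["seq_a", "seq_b"], 1: ["seq_c"]})
buf.seek(0)
restored = load_results(buf)
```

The explicit version field is what would let a later release convert old files rather than fail on them. An HDF5 layout could carry the same fields as datasets and attributes, at the cost of the extra dependencies mentioned above.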