Output Struct Overhaul #445

Open: wants to merge 18 commits into v4-prep
Conversation

steven-murray
Member

@steven-murray steven-murray commented Dec 4, 2024

Summary

This changes the output structure interface to be simpler and more streamlined.

It is quite a comprehensive set of changes that touches a lot of things on the Python side. I'll try to list as many as I can here for easy reference:

Arrays and Backend mapping

  • New arrays.py module that implements an Array object. This object knows about the shape and dtype of an array, without necessarily having it instantiated, but also knows how to instantiate it, pass it to C, and keeps track of the ArrayState.
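As a rough illustration of the idea (names and details hypothetical, not the actual py21cmfast implementation), such a lazy array descriptor might look like:

```python
import numpy as np

class LazyArray:
    """Illustrative sketch: knows its shape/dtype up front, allocates on demand,
    and tracks a simple state string standing in for the real ArrayState."""

    def __init__(self, shape: tuple, dtype=np.float32):
        self.shape = shape
        self.dtype = np.dtype(dtype)
        self.state = "uninitialized"
        self._value = None

    def initialize(self) -> np.ndarray:
        """Allocate the underlying numpy array the first time it is needed."""
        if self._value is None:
            self._value = np.zeros(self.shape, dtype=self.dtype)
            self.state = "initialized"
        return self._value
```

The point is that shape/dtype metadata is available (for validation, or for passing pointers to C) without paying the memory cost until the array is actually used.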

OutputStructs

  • The OutputStruct is now an attrs class. More importantly, all of the arrays that it needs to handle are defined directly on the class as Array parameters, making it easier to track them.
  • Each output struct now has a .new() classmethod that instantiates it from an InputParameters object, getting the shape/dtype info (and which arrays need to be present) from the inputs.
  • The downside to the above way of managing the C/Python/Disk interface with Array objects is that the attributes of the OutputStruct are no longer numpy arrays, so you can't do, for example, np.mean(ics.lowres_density) any more. This is smoothed over a bit by new get() and set() methods specifically for the arrays, so you can do np.mean(ics.get('lowres_density')). This has the added advantage of transparently loading the array from disk if it exists there. Note that on a Coeval object, any field of any OutputStruct can be accessed directly via attribute name, as an array.
  • I've also taken all the caching and I/O management out of the OutputStruct class, instead moving it to the new io subpackage.
  • There's a new _compat_hash attribute on each OutputStruct that tells it the level of input-hash required.
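A minimal sketch of the resulting interface, using a plain dataclass in place of attrs and a hypothetical MiniOutputStruct (the real class derives shapes/dtypes from an InputParameters object):

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class MiniOutputStruct:
    """Toy analogue of the new OutputStruct: arrays live behind get()/set()
    rather than being plain numpy-array attributes."""
    arrays: dict = field(default_factory=dict)

    @classmethod
    def new(cls, shape):
        # In py21cmfast, shape/dtype info comes from the InputParameters.
        return cls(arrays={"lowres_density": np.zeros(shape)})

    def get(self, name: str) -> np.ndarray:
        """Return the named array (the real method can also load it from disk)."""
        return self.arrays[name]

    def set(self, name: str, value) -> None:
        self.arrays[name] = np.asarray(value)
```

Usage then looks like np.mean(ics.get('lowres_density')) rather than np.mean(ics.lowres_density).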

Caching / IO of single-fields (OutputStruct)

  • The new io.caching module implements classes/functions for dealing with the cache. I think this is a bit more intuitive than in previous versions.
  • The OutputCache object has methods for introspecting a particular cache (defined by some directory the user gives at runtime) and reading/writing OutputStructs to it.
  • The RunCache manages full runs (i.e. all boxes belonging to a full redshift-evolved simulation), allowing simple determination of which cache files are present, and which haven't yet been run (useful for checkpointing).
  • The CacheConfig class simply defines a namespace for defining which boxes to write to cache during a larger run (coeval/lightcone).
  • The cache_tools module has been removed as it is redundant with the above module.
  • All the reading/writing of HDF5 boxes has moved to io/h5.py, and so is separated from the OutputStruct class definitions themselves. This might facilitate implementing different cache formats in the future. The file format is also slightly different (I think it's slightly better now -- the format is specified in the docstring of the module, so you can check).
  • There is also a mechanism now for being able to read files written by older versions of the code, so we can maintain explicit backwards compatibility with older outputs.
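The cache-introspection idea can be sketched with a hypothetical mini cache (class and method names illustrative, not the real OutputCache/RunCache API):

```python
from pathlib import Path
import tempfile

class MiniCache:
    """Illustrative stand-in: a directory of files keyed by struct kind and
    redshift, with simple introspection of what has/hasn't been computed."""

    def __init__(self, direc: str):
        self.direc = Path(direc)

    def path_for(self, kind: str, redshift: float) -> Path:
        return self.direc / f"{kind}_z{redshift:.2f}.h5"

    def exists(self, kind: str, redshift: float) -> bool:
        return self.path_for(kind, redshift).exists()

    def missing(self, kind: str, redshifts) -> list:
        """Which redshifts still need to be run -- useful for checkpointing."""
        return [z for z in redshifts if not self.exists(kind, z)]
```

This is the kind of "which cache files are present, and which haven't yet been run" question the RunCache answers for a full redshift-evolved simulation.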

Single-Field Computations

  • The single_field module is a lot simpler. I have moved most of the boilerplate logic to a class-style decorator in _param_config.
  • This new decorator checks redshift consistency, input parameter consistency, manages the cache and sets the current redshift appropriately given all inputs.
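A toy version of such a class-style decorator, assuming a simplified inputs dict (the real decorator in _param_config does considerably more, including parameter-consistency and cache management):

```python
import functools

class check_redshift_consistency:
    """Class-style decorator sketch: validate the requested redshift against
    the inputs before running a single-field computation."""

    def __call__(self, func):
        @functools.wraps(func)
        def wrapper(*, redshift, inputs, **kw):
            # Illustrative check: the redshift must be one of the node redshifts.
            if redshift not in inputs["node_redshifts"]:
                raise ValueError(f"z={redshift} is not a node redshift")
            return func(redshift=redshift, inputs=inputs, **kw)
        return wrapper

@check_redshift_consistency()
def compute_field(*, redshift, inputs):
    # Stand-in for a single-field computation.
    return {"redshift": redshift}
```

Centralising these checks in one decorator is what lets each single-field function shed its copy of the boilerplate.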

Lightcone / Coeval

  • I refactored some re-used code in run_coeval and run_lightcone into a set of external functions: evolve_perturb_halos and _redshift_loop_generator.
  • The Coeval and Lightcone objects are much more slim now. I removed the ability to "gather" the cached files associated with a coeval/lc, instead relying on the improved caching module to let people deal with their full-run caches.
  • Also, to read a Coeval/Lightcone you now use Coeval.from_file instead of Coeval.read(), which I think is more intuitive.

Configuration

  • I actually think we should generally move away from package-wide configuration, because it always causes trouble. I haven't removed the module itself here because it's slightly outside the scope of the PR, but I did remove the "regenerate" and "write" configuration options, and removed all places where the config was used.
  • We will have to think about how to re-implement all the functionality we had in the config (e.g. number of sigfigs for the cache). Probably most of this can be put directly into new objects (like the CacheConfig).

Other Stuff

  • I've removed any documentation or caching references to "global params". These are now to be treated as almost purely read-only (and we should move towards them being completely removed soon).
  • I moved the definition of InputParameters from param_config to inputs just because I was getting circular imports.

Meta-info:

  • These changes break strict backwards-compatibility

Issues Solved

@steven-murray steven-murray marked this pull request as ready for review December 14, 2024 00:51

codecov bot commented Dec 24, 2024

Codecov Report

Attention: Patch coverage is 78.69393% with 323 lines in your changes missing coverage. Please review.

Project coverage is 76.88%. Comparing base (5930245) to head (6b894c8).
Report is 3 commits behind head on v4-prep.

Files with missing lines Patch % Lines
src/py21cmfast/io/caching.py 54.43% 74 Missing and 3 partials ⚠️
src/py21cmfast/wrapper/outputs.py 82.63% 47 Missing and 19 partials ⚠️
src/py21cmfast/io/h5.py 71.52% 31 Missing and 12 partials ⚠️
src/py21cmfast/drivers/_param_config.py 82.94% 19 Missing and 10 partials ⚠️
src/py21cmfast/drivers/coeval.py 79.85% 21 Missing and 6 partials ⚠️
src/py21cmfast/wrapper/inputs.py 84.96% 15 Missing and 8 partials ⚠️
src/py21cmfast/wrapper/arrays.py 75.00% 9 Missing and 7 partials ⚠️
src/py21cmfast/drivers/lightcone.py 90.35% 5 Missing and 6 partials ⚠️
src/py21cmfast/drivers/single_field.py 85.33% 7 Missing and 4 partials ⚠️
src/py21cmfast/cli.py 55.55% 4 Missing ⚠️
... and 6 more
Additional details and impacted files
@@             Coverage Diff             @@
##           v4-prep     #445      +/-   ##
===========================================
- Coverage    79.56%   76.88%   -2.69%     
===========================================
  Files           24       27       +3     
  Lines         3803     3747      -56     
  Branches       647      611      -36     
===========================================
- Hits          3026     2881     -145     
- Misses         558      648      +90     
+ Partials       219      218       -1     


Contributor

@daviesje daviesje left a comment

Looks great! I had a few questions and minor points, but this looks like a huge improvement.

pf = pf2
_bt = None
hb = hb2
st = st2
Contributor

How are we separating the node redshifts from the output redshifts here? Previously, we only created the coevals on the outputs and only updated the previous snapshot on the nodes.

Member Author

Hmmm I think I may have slightly broken this. The idea is still to only evolve based on the node redshifts, but to yield on every redshift (either out_redshift or node_redshift). Currently, it looks like I might be evolving on everything, so I should check and fix that.

Member Author

Yeah, I had to fix this. I just put an if z in inputs.node_redshifts: check in before updating the "current" boxes.

v
for k, v in outputs.items()
if not k.startswith("previous_") and not k.startswith("descendant_")
]:
Contributor

I haven't got to the single fields yet, but I'm curious how this works with the XraySourceBox, which needs the whole HaloBox history

Member Author

Yes, I'd appreciate a closer look at that, as it's not something I'm as familiar with.

Contributor

It looks like at the moment it won't check the HaloBox redshifts, since it is passed as a list(OutputStruct) instead of an OutputStruct. One could add a third option here, e.g. a history_ prefix or similar, which takes a list and needs one entry for each progenitor z.

Member Author

OK so the way I dealt with this is to get two lists of output structs passed to the function:

  1. All output structs that are top-level
  2. All output structs that are either top-level OR exist inside a list of such structs (by setting recurse=True).

The first list gets checked for redshift compatibility, while the second list gets checked for parameter compatibility etc.
So there are no checks that the list of boxes follows the z-grid specified. We could add that as well later if we want.
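The two-list collection described above can be sketched as follows (function and key names hypothetical):

```python
def collect_structs(outputs: dict, recurse: bool = False) -> list:
    """Gather output structs passed to a compute function: top-level entries,
    plus (when recurse=True) entries nested inside lists, such as a HaloBox
    history passed to XraySourceBox."""
    found = []
    for value in outputs.values():
        if isinstance(value, list):
            if recurse:
                found.extend(value)  # structs inside a list of structs
        else:
            found.append(value)      # top-level struct
    return found
```

The recurse=False list would then be checked for redshift compatibility, and the recurse=True list for parameter compatibility.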

if descendant_halos is None:
descendant_halos = HaloField(
descendant_halos = HaloField.new(
redshift=0.0,
inputs=inputs,
dummy=True,
)
Contributor

This reminds me that in the backend, the sampling from grid/descendants is controlled by the redshift of this object being <=0.

Member Author

I wonder if having a .dummy() constructor method would be neater (auto-setting the redshift to, say, -1 and setting dummy=True).

Member Author

I implemented this.

elif perturbed_field:
inputs = perturbed_field[0].inputs

if not out_redshifts and not perturbed_field and not inputs.node_redshifts:
Contributor

What do we want to happen when out_redshifts is None and we have some inputs with node redshifts?

Member Author

The current behaviour (and I'm happy to discuss this) is to yield on all node redshifts and out_redshifts. So as long as at least one of them is non-empty, everything is fine.

Contributor

Reminder: output either the list of out_redshifts Coeval objects with the yield, or a boolean is_in_output flag.

Member Author

I've done the latter, so that memory use is reduced. In the generate_coeval function, two things are now yielded: the Coeval and a bool representing whether the redshift is in out_redshifts. From run_coeval, only the final list of coevals is returned.
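The yield contract can be illustrated with a stripped-down stand-in for generate_coeval/run_coeval (real signatures differ):

```python
def generate_items(redshifts, out_redshifts):
    """Sketch of the generator contract: yield (result, is_in_output) so
    callers can discard non-output snapshots and keep memory low."""
    for z in redshifts:
        yield {"z": z}, z in out_redshifts

def run_items(redshifts, out_redshifts):
    # Mirrors run_coeval: keep only the requested output redshifts.
    return [c for c, keep in generate_items(redshifts, out_redshifts) if keep]
```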


@classmethod
def new(cls, x: dict | InputStruct | None = None, **kwargs):
"""
Contributor

I should update this docstring

Member Author

I updated it modestly -- is that what you were thinking?

default_input_struct.check_output_compatibility([example_ib])

default_input_struct.check_output_compatibility([perturbed_field])
# def test_inputstruct_outputs(
Contributor

Do we want to rewrite this test to cover the compatibility checks?

Member Author

Probably a good idea. I can't remember what all I've covered in tests now, but will have another think tomorrow.

@daviesje
Contributor

daviesje commented Dec 31, 2024

One of the current failing tests (the macOS 3.12) is the same issue of workflows losing one of the temp directories, but there's a GSL error in the Ubuntu 3.12 run which I haven't seen before. Nikos found something similar when running the database. I'm curious what's causing this, since sometimes just rerunning makes it work again. I don't think it has much to do with this PR, but we should look into it.


if lib.photon_cons_allocated:
lib.FreePhotonConsMemory()


def run_coeval(**kwargs) -> list[Coeval]: # noqa: D103
return [coeval for coeval, in_nodes in generate_coeval(**kwargs) if in_nodes]
Contributor

I think this is in_outputs rather than in_nodes?

# the last one.
minimum_node = len(inputs.node_redshifts) - 1

if minimum_node < 0 or inputs.flag_options.USE_HALO_FIELD:
Contributor

This might be apparent later, but why are we starting at the beginning every time with the halos? Is it so the HaloBox array passed to XraySourceBox always gets populated? It might be a good idea (not necessarily in this PR) to extend this function to run through all three redshift loops (perturbed field, halos, forward loop) to find where we stopped last time and gather all the necessary OutputStructs.

# def is_partial(self):
# """Whether the cache is complete down to some redshift, but not the last z."""
# z, idx = self.get_completed_redshift()
# return idx == len(self.inputs.node_redshifts) - 1
Contributor

We don't need this anymore?

def initial(cls, inputs: InputParameters = InputParameters(random_seed=1)):
"""Create a dummy instance with the given inputs."""
return cls.new(inputs=inputs, redshift=-1.0, initial=True)

Contributor

I just wanted to double-check that not having the same InputParameters won't mess anything up, either in the parameter checks or the backend. Since all the calls to these constructors are made without the inputs kwarg, I'm guessing it doesn't matter, but in that case should we just force the input parameters to be InputParameters(random_seed=1) and make sure they are never called/used?

@@ -260,7 +274,7 @@ def rectlcn(

@pytest.fixture(scope="session")
def lc(rectlcn, ic, cache, default_input_struct_lc):
*_, lc = exhaust_lightcone(
*_, lc = run_lightcone(
Contributor

I didn't know you could unpack like this, cool
