Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSV ADRIO time-series correction #124

Merged
merged 3 commits into from
Jul 1, 2024
Merged

CSV ADRIO time-series correction #124

merged 3 commits into from
Jul 1, 2024

Conversation

TJohnsonAZ
Copy link
Contributor

@TJohnsonAZ TJohnsonAZ commented Jun 27, 2024

Correction for the current implementation of CSV ADRIOs not correctly loading in time-series data when given a CSVSpecTime object and instead simply filtering out rows with a time value outside of the GeoSpec's time period. This case will now correctly reshape the data and produce a TxN array. New cells were added to the 2024-06-12 devlog to demonstrate this. The CSVSpecMatrixTime dataclass was removed as TxNxN shaped data attributes are not currently supported.

@JavadocMD
Copy link
Contributor

Thank you for catching this! Looking at it a bit further, let's make the following two additional changes:

  • Remove time filtering as a feature -- this isn't sufficient in all cases (e.g., DateRanges that cover multiple years) and I can't think of a way to produce an obvious and predictable result for all cases. This will be left to the user for now to pre-process their data; if they want an N-shaped result, they need N-shaped file, etc. (Although we may come up with a universal solution later, perhaps along the lines of the interpolation/padding discussion from Tuesday.)
  • If you pass key_col and data_col as the same index, you get unfriendly errors. I don't think there's a good use-case that would argue for supporting this, so let's check for it and just raise a friendly error.

Copy link
Contributor

@JavadocMD JavadocMD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor tweak, then good to merge. Thanks!

"""Makes an ADRIO to fetch data from a single column within a .csv file and converts it to matrix format."""
if len(set([spec.from_key_col, spec.to_key_col, spec.data_col])) != 3:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer set construction syntax here: {item1, item2, item3} instead of set([item1, item2, item3])

@TJohnsonAZ TJohnsonAZ merged commit 9a2608a into main Jul 1, 2024
1 check passed
@TJohnsonAZ TJohnsonAZ deleted the csv-time-series branch July 1, 2024 19:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants