-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Add asynchronous load method #10327
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add asynchronous load method #10327
Changes from 38 commits
01e7518
83e553b
e44326d
4e4eeb0
d858059
d377780
3132f6a
900eef5
4c4462f
5b9b749
fadb953
57d9d23
11170fc
0b8fa41
f769f85
4eef318
29242a4
e6b3b3b
3ceab60
071c35a
29374f9
ab12bb8
62aa39d
dfe8bf7
a906dec
629ab31
7e9ae0f
d288351
e0731a0
9b41e78
67ba26a
9344e2e
f8f8563
30ce9be
5d15bbd
1f02de1
2342b50
b6d4a82
2079d7e
48e4534
cca7589
dfe9b87
84099f3
ab000c8
a8b7b46
82c7654
5eacdb0
9f33c09
093bf50
4073a24
842a06c
e19ab55
b9e8e06
8bc7bea
6c47e3f
a86f646
884ce13
a43af86
17d7a0e
d824a2d
6a13611
d79ed54
1da3359
dded9e0
4c347ad
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -30,6 +30,7 @@ | |
"coveralls", | ||
"pip", | ||
"pytest", | ||
"pytest-asyncio", | ||
"pytest-cov", | ||
"pytest-env", | ||
"pytest-mypy-plugins", | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,6 +4,7 @@ | |
import os | ||
import time | ||
import traceback | ||
from abc import ABC, abstractmethod | ||
from collections.abc import Hashable, Iterable, Mapping, Sequence | ||
from glob import glob | ||
from typing import TYPE_CHECKING, Any, ClassVar, TypeVar, Union, overload | ||
|
@@ -267,13 +268,23 @@ def robust_getitem(array, key, catch=Exception, max_retries=6, initial_delay=500 | |
time.sleep(1e-3 * next_delay) | ||
|
||
|
||
class BackendArray(NdimSizeLenMixin, indexing.ExplicitlyIndexed): | ||
class BackendArray(ABC, NdimSizeLenMixin, indexing.ExplicitlyIndexed): | ||
__slots__ = () | ||
|
||
@abstractmethod | ||
def __getitem__(key: indexing.ExplicitIndexer) -> np.typing.ArrayLike: ... | ||
|
||
async def async_getitem(key: indexing.ExplicitIndexer) -> np.typing.ArrayLike: | ||
raise NotImplementedError("Backend does not not support asynchronous loading") | ||
Comment on lines
+273
to
+274
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've implemented this for the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This might not be the desired behaviour though - this currently means if you opened a dataset from netCDF and called There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes absolutely. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Okay I can do that. But can you explain why you feel that this would be better behaviour? Asking for something to be done async and it quietly blocking also seems not great... |
||
|
||
def get_duck_array(self, dtype: np.typing.DTypeLike = None): | ||
key = indexing.BasicIndexer((slice(None),) * self.ndim) | ||
return self[key] # type: ignore[index] | ||
|
||
async def async_get_duck_array(self, dtype: np.typing.DTypeLike = None): | ||
key = indexing.BasicIndexer((slice(None),) * self.ndim) | ||
return await self.async_getitem(key) # type: ignore[index] | ||
|
||
|
||
class AbstractDataStore: | ||
__slots__ = () | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -185,6 +185,8 @@ class ZarrArrayWrapper(BackendArray): | |
def __init__(self, zarr_array): | ||
# some callers attempt to evaluate an array if an `array` property exists on the object. | ||
# we prefix with _ to avoid this inference. | ||
|
||
# TODO type hint this? | ||
self._array = zarr_array | ||
self.shape = self._array.shape | ||
|
||
|
@@ -212,6 +214,10 @@ def _vindex(self, key): | |
def _getitem(self, key): | ||
return self._array[key] | ||
|
||
async def _async_getitem(self, key): | ||
async_array = self._array._async_array | ||
return await async_array.getitem(key) | ||
|
||
def __getitem__(self, key): | ||
array = self._array | ||
if isinstance(key, indexing.BasicIndexer): | ||
|
@@ -227,6 +233,22 @@ def __getitem__(self, key): | |
# if self.ndim == 0: | ||
# could possibly have a work-around for 0d data here | ||
|
||
async def async_getitem(self, key): | ||
array = self._array | ||
if isinstance(key, indexing.BasicIndexer): | ||
method = self._async_getitem | ||
elif isinstance(key, indexing.VectorizedIndexer): | ||
# TODO | ||
method = self._vindex | ||
elif isinstance(key, indexing.OuterIndexer): | ||
# TODO | ||
method = self._oindex | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @ianhi almost certainly these need to become async to fix your bug There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Outer (also known as "Orthogonal") indexing support added in 5eacdb0, but requires changes to zarr-python: zarr-developers/zarr-python#3083 |
||
|
||
print("did an async get") | ||
return await indexing.async_explicit_indexing_adapter( | ||
key, array.shape, indexing.IndexingSupport.VECTORIZED, method | ||
) | ||
|
||
|
||
def _determine_zarr_chunks( | ||
enc_chunks, var_chunks, ndim, name, safe_chunks, region, mode, shape | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As
__getitem__
is required, I feel likeBackendArray
should always have been an ABC.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class is public API and this is a backwards incompatible change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is technically, but only if someone is using this class in a way counter to what the docs explicitly tell you to do (i.e. subclass it).
Regardless this is orthogonal to the rest of the PR, I can remove it, I was just trying to clean up bad things I found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted in 6c47e3f