-
-
Notifications
You must be signed in to change notification settings - Fork 18.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adjust Series specific tests for string option #55538
Conversation
doc/source/whatsnew/v2.1.2.rst
Outdated
@@ -39,7 +39,7 @@ Bug fixes | |||
Other | |||
~~~~~ | |||
- Fixed non-working installation of optional dependency group ``output_formatting``. Replacing underscore ``_`` with a dash ``-`` fixes broken dependency resolution. A correct way to use now is ``pip install pandas[output-formatting]``. | |||
- | |||
- Setting the environment variable ``PANDAS_INFER_STRING`` to ``"1"`` will now enable the option ``pd.options.future.infer_string`` (:issue:`55533`)^ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wasnt this a separate PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah good point, removed
pandas/_testing/asserters.py
Outdated
@@ -440,7 +440,10 @@ def assert_is_sorted(seq) -> None: | |||
if isinstance(seq, (Index, Series)): | |||
seq = seq.values | |||
# sorting does not change precisions | |||
assert_numpy_array_equal(seq, np.sort(np.array(seq))) | |||
if isinstance(seq, np.ndarray): | |||
assert_numpy_array_equal(seq, np.sort(np.array(seq))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be faster to use libalgos.is_monotonic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used in one place, tbh I want to get rid of the method completely, but that wasn't in scope here
pandas/conftest.py
Outdated
@pytest.fixture | ||
def using_infer_string() -> bool: | ||
""" | ||
Fixture to check if Copy-on-Write is enabled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docstring looks copy/pasted, needs updating
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx for catching
@@ -71,7 +71,7 @@ def test_getitem_unrecognized_scalar(self): | |||
def test_getitem_negative_out_of_bounds(self): | |||
ser = Series(["a"] * 10, index=["a"] * 10) | |||
|
|||
msg = "index -11 is out of bounds for axis 0 with size 10" | |||
msg = "index -11 is out of bounds for axis 0 with size 10|index out of bounds" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you use a "|".join pattern
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel super strong here but I prefer the pipe pattern for 2 options
# Conflicts: # pandas/tests/series/test_constructors.py
# GH#22717 inserting a Timedelta should _not_ cast to int64 | ||
expected = Series(["x", td], index=[0, "td"], dtype=object) | ||
|
||
ser = Series(["x"]) | ||
ser["td"] = td | ||
tm.assert_series_equal(ser, expected) | ||
assert isinstance(ser["td"], Timedelta) | ||
if using_infer_string and not isinstance(td, Timedelta): | ||
assert not isinstance(ser["td"], Timedelta) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would ser["td"]
be here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one of those 2:
Timedelta("9 days").to_timedelta64(),
Timedelta("9 days").to_pytimedelta(),
maybe we should xfail that test, not sure anymore what my reasoning here was
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I find it weird to begin with that this always promoted timedeltas to pd.Timedelta
.
Agree to xfail this test for now to nail down the correct behavior in a follow up
msg = "has no kernel" | ||
# with tm.assert_produces_warning(DeprecationWarning, match="comparison"): | ||
with pytest.raises(pa.lib.ArrowNotImplementedError, match=msg): | ||
s == s2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably return all False
in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes if we want to patch the arrow behaviour, makes probably sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK could you add a TODO here as a reminder? Would be good to include it in an issue referencing things that would need fixing before 3.0
I'll open issues for them and link them here |
Opened issues for both |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Will can merge in a few days unless others have comments
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.