
fix test by increasing parallel test tolerance #130


Merged
merged 2 commits into EdiGlacUQ:main on May 15, 2024

Conversation

@bearecinos (Collaborator)

@jrmaddison @dngoldberg

I only changed the tolerances for the ice_stream.toml parallel test!

We already know that the parallel tests are less stable, so the tolerances for those were set to higher values (i.e. we tolerate a larger difference, in decimal places, between the expected value and the test output). The tests that were failing were inversion, forward, error_prop and eigendec, but they do pass for the Smith real test case without adjusting the tolerances!

So I think this could be down to a simple upgrade of multiple libraries... hard to track.

Let me know if you are happy with this and I will merge it to main.
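(Editor's note: a minimal sketch of the kind of relative-tolerance comparison this PR loosens, written to match the assertion messages in the test logs quoted later in this thread. It is an illustration only, not the actual implementation of fenics_ice's pytest.check_float_result; the function name check_float is hypothetical.)

def check_float(computed, expected, tol):
    # Pass if the relative difference between computed and expected is
    # within tol; a larger tol therefore tolerates more drift (fewer
    # matching decimals) between the expected value and the test output.
    rel_diff = abs(computed - expected) / abs(expected)
    assert rel_diff <= tol, (
        f"Expected value: {expected} Computed value: {computed} Tolerance: {tol}"
    )

# Using the inversion values from the failing parallel run quoted below:
# with tol=1e-4 this raises AssertionError; loosening the tolerance passes.
check_float(146.95494786331122, 147.09799182811255, tol=1e-3)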

@bearecinos added the labels 'test maintenance' (related to keeping tests up to date with the model: numerical changes, dependencies, etc) and 'Priority (medium)' on May 14, 2024
@jrmaddison (Collaborator)

I suspect this is a real failure caused by #126. Running the tests locally with an old version, I don't see a failure.

@jrmaddison (Collaborator) left a comment

I think something has changed in #126

@bearecinos (Collaborator, Author)

@jrmaddison did you make a new environment, or did you just go back to a different commit?

@dngoldberg (Collaborator) left a comment

See inline comments about hardcoding the tolerance; otherwise looks fine.

pytest.check_float_result(delta_qoi,
                          expected_delta_qoi,
                          work_dir, 'expected_delta_qoi', tol=1.0e-5)
@dngoldberg (Collaborator):

should this be tol=tol since it is already set above?

@bearecinos (Collaborator, Author):

@dngoldberg it is... you are looking at the red lines, which are what I changed; in green you can see that it is tol=tol -- see line 174.


pytest.check_float_result(u_norm,
                          expected_u_norm,
                          work_dir, 'expected_u_norm', tol=1.0e-5)
@dngoldberg (Collaborator):

similarly for this

@bearecinos (Collaborator, Author):

Same; see line 178.

@dngoldberg (Collaborator)

> I think something has changed in #126

@jrmaddison could this be? I mean, do you think this is affecting the invsigma test (and no others)? I merged this in a hurry. But still, when I merged main into my feature branch for 126, there were no conflicts. Why do you think something has changed?

@bearecinos (Collaborator, Author)

bearecinos commented May 15, 2024

I think this is down to an old environment, so if @jrmaddison did not change his environment, this is definitely down to a new Python package installation... I made a new environment, and the GitHub workflow creates a new one too every time something is pushed.

What I find even odder is that if I run the tests twice:

FAILED tests/test_runs.py::test_run_inversion[3] - AssertionError: Expected value: 147.09799182811255 Computed value: 146.95494786331122 Tolerance: 0.0001
FAILED tests/test_runs.py::test_run_forward[3] - AssertionError: Expected value: -69669835.07911515 Computed value: -69683940.81161451 Tolerance: 1e-05
FAILED tests/test_runs.py::test_run_eigendec[3] - AssertionError: Expected value: 736865.5570195307 Computed value: 735332.7760197466 Tolerance: 1e-05
FAILED tests/test_runs.py::test_run_errorprop[3] - AssertionError: Expected value: 96652.11259899345 Computed value: 96199.44931681202 Tolerance: 5e-05

every time I run this I get a new set of computed values, even if I update the expected values... the tests never pass. That led me to increase the tolerances, since in this version of the code the parallel computations seem to be less stable.
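(Editor's note: as a quick sanity check, the relative differences implied by the four failures above can be computed directly from the reported values; they exceed the configured tolerances by one to two orders of magnitude, so these are not borderline failures.)

# Relative difference for each reported failure, versus its tolerance.
cases = [
    ("inversion", 147.09799182811255, 146.95494786331122, 1e-4),
    ("forward", -69669835.07911515, -69683940.81161451, 1e-5),
    ("eigendec", 736865.5570195307, 735332.7760197466, 1e-5),
    ("errorprop", 96652.11259899345, 96199.44931681202, 5e-5),
]
for name, expected, computed, tol in cases:
    rel = abs(computed - expected) / abs(expected)
    print(f"{name}: rel diff {rel:.1e} vs tol {tol:.0e}")
# inversion: rel diff 9.7e-04 vs tol 1e-04
# forward: rel diff 2.0e-04 vs tol 1e-05
# eigendec: rel diff 2.1e-03 vs tol 1e-05
# errorprop: rel diff 4.7e-03 vs tol 5e-05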

@dngoldberg (Collaborator)

Dammit!!! @bearecinos @jrmaddison I see what has happened. I pushed changes to example_cases/ismipc_30x30 without realising. It is my fault for doing this too hastily!!! I should have recognised that inv_sigma should only fail in parallel -- I'm sorry about that.

@bearecinos I'm not as good with GitHub as you. Is there a fast, optimal way to solve this? My best bet is that if the changes to example_cases/ismipc_30x30 are undone, the tests will pass as before.

@dngoldberg (Collaborator)

> I think this is down to an old environment, so if @jrmaddison did not change his environment, this is definitely down to a new Python package installation... I made a new environment, and the GitHub workflow creates a new one too every time something is pushed.

See my comment @bearecinos -- I changed example case files. I'm so sorry!!!

@bearecinos (Collaborator, Author)

Yes, but that test is not run in parallel @dngoldberg, only in serial. Regardless, I will check now between the two versions.

@bearecinos (Collaborator, Author)

bearecinos commented May 15, 2024

@jrmaddison @dngoldberg ... everything runs, but with an old environment ... so it is clearly a Python package. I suggest we just increase the test tolerances for now, until we find the culprit dependency.

And it has nothing to do with Dan's code. All tests in the new code pass with the old environment.

@dngoldberg (Collaborator)

dngoldberg commented May 15, 2024

> @jrmaddison @dngoldberg ... everything runs, but with an old environment ... so it is clearly a Python package. I suggest we just increase the test tolerances for now, until we find the culprit dependency.
>
> And it has nothing to do with Dan's code. All tests in the new code pass with the old environment.

Thanks @bearecinos, sounds good to me -- really annoying!

But -- I did still make changes to the ismipc_30x30 toml:
fbea1d9#diff-ed4123d01136b7cb05783fa150cabde71cba847ecaef661aabfa87aa780ea3ba
and this experiment IS used in the tests, whether or not those tests are run as part of GitHub Actions:

pytest.case_list.append({"case_dir": "ismipc_30x30",

so the latest changes to example_cases/ismipc_30x30/ismipc_30x30.toml and example_cases/ismipc_30x30/ismipc_30x30.sh should be rolled back, but I don't know how to do this through git commands without undoing the other changes in the PR.

I could open a fix PR and push the prior versions of these files -- but do you know a cleaner way? Just let me know, I'm happy to do it.

Also, I guess I should raise an issue to add a new velocity sensitivity test to the example cases. I'd definitely need help implementing that!

@bearecinos (Collaborator, Author)

bearecinos commented May 15, 2024

@dngoldberg if you can think of a test that looks like this: https://github.com/EdiGlacUQ/fenics_ice/blob/main/tests/test_runs.py#L270-L286

So basically a Python script that runs an experiment with vel_obs and obtains a value that we can add to the toml; I am happy to implement whatever you code in that script.

If you open an issue we can work on it together... actually, we can turn your idealised appendix case into a test.
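(Editor's note: a hypothetical sketch of what such a test might look like, following the pytest.check_float_result pattern visible elsewhere in this PR. Everything else here -- setup_case, run_velocity_sensitivity, and the params.testing fields -- is a placeholder, not real fenics_ice code.)

def test_run_velocity_sensitivity():
    # Placeholders: set up the case and run the vel_obs experiment,
    # producing a single scalar summary value.
    work_dir, params = setup_case("ismipc_30x30")
    vel_sens = run_velocity_sensitivity(params)

    # Compare against an expected value stored in the case's toml,
    # using the same tolerance machinery as the existing tests.
    pytest.check_float_result(vel_sens,
                              params.testing.expected_vel_sens,
                              work_dir, 'expected_vel_sens',
                              tol=params.testing.float_tolerance)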

@dngoldberg (Collaborator)

> @dngoldberg if you can think of a test that looks like this: https://github.com/EdiGlacUQ/fenics_ice/blob/main/tests/test_runs.py#L270-L286
>
> So basically a Python script that runs an experiment with vel_obs and obtains a value that we can add to the toml; I am happy to implement whatever you code in that script.
>
> If you open an issue we can work on it together... actually, we can turn your idealised appendix case into a test.

Sounds good Bea -- I will start an issue to remind myself.

@bearecinos bearecinos merged commit c0c83c4 into EdiGlacUQ:main May 15, 2024
1 check passed