Have fsurdat_modifier system test use conda environment rather than ncar_pylib #1798

ekluzek · 2022-07-06T23:21:44Z

Description of changes

Use the manage_python_env script to setup the conda environment and use it rather than ncar_pylib

Specific notes

Contributors other than yourself, if any: None

CTSM Issues Fixed (include github issue #):
Fixes #1786
Fixes #1925

Are answers expected to change (and if so in what way)? No

Any User Interface Changes (namelist or namelist defaults changes)? No

Testing performed, if any:
FSURDATMODIFYCTSM_D_Mmpi-serial_Ld1.5x5_amazon.I2000Clm50SpRs.cheyenne_intel runs and PASSes on cheyenne

…ronment rather than ncar_pylib

billsacks · 2022-07-07T04:01:04Z

Thanks for working on this. It's not as pretty or simple as I hoped, but it's still an improvement over what we had before - and I really hope any such mechanism can go away before too long.

Can you confirm two things:

Does this work both for a user's shell being csh and bash? (given the shell-specific logic you have here)
Does it work to run FSURDATMODIFYCTSM as one of a suite of tests? I'm both wanting to confirm that earlier tests don't mess anything up about the environment that would interfere with this one (which I think I've seen before) and that this one doesn't mess anything up with the environment that could impact following tests.

ekluzek · 2022-07-07T07:39:31Z

I think I can improve on what I have here, so I'll try to do that. But, I also really like your suggestions on tests as well, so I'll do that.

To simplify this further we'll need conda to be loaded on machines in ccs_config/cime rather than python. That should be OK as when you activate an environment with conda you get a python version with it. But, we've got to convince CESM people that doing that would be OK.

billsacks · 2022-07-07T13:59:00Z

I'm not sure it's worth spending much more time on this. (Sorry: my last comment probably suggested that it would be good to improve this, but I don't really think that should be a priority.) The vision we have discussed in CSEG is that users will load an appropriate python environment before doing anything with CESM / CIME. We don't want CIME to be responsible for controlling the python environment. Once we are comfortable with that (maybe the switch to derecho will be a good time???), I believe we can remove any 'module load python' from CIME. (I don't think we want a conda load in ccs_config/cime: I think we want to leave this up to the user - at least based on previous discussions.) So I feel like a less-than-elegant solution is okay for now as long as it works, and then we can hopefully remove this soon.

billsacks · 2022-07-07T13:59:32Z

Also: I appreciate that this new solution works on multiple machines, not just cheyenne.

billsacks · 2022-07-07T18:26:21Z

One other thing to confirm, if you haven't already, is: if there are failures in various aspects of the subprocess call, make sure that the test stops at that point with a FAIL result noted in the TestStatus file. You should definitely confirm this is the case for a possible failure in running fsurdat_modifier (e.g., deliberately introduce an error there and make sure the test aborts appropriately), but I'd also want to know that a failure in an earlier step in the subprocess call (where you have strung a bunch of commands together) correctly results in an error being reported. (I see you use check=True, but I think that will only work right if you get an error return code from the subprocess. Given the complexity of the subprocess call, it isn't immediately obvious to me that that is guaranteed to be the case.)

billsacks

Approving with the testing caveats noted in my earlier comments. Thanks for your work on changing this so it will work on other systems as well as not relying on the deprecated ncar_pylib!

slevis-lmwg · 2022-07-11T21:44:58Z

Thanks for working on this. It's not as pretty or simple as I hoped, but it's still an improvement over what we had before

Thanks, @ekluzek
It's not a lot of code, but I doubt that I would have figured this out without help :-)
Let me know if you'd like me to try any or all of the testing that @billsacks recommended.

The tools modify_fsurdat and subset_data have similar options modify_fsurdat: zero_nonveg subset_data: include-nonveg modify_fsurdat: dom_plant subset_data: dom-pft Make more similar and use README file for documenting modify_fsurdat. There's potential for additional changes to be addressed later. Change name of manage_python_env to py_env_create as agreed in CTSM software meeting

…m_pylib to match ncar_pylib and to use a different name from the Hammon script that uses a ctsm_py environment name

…black check isn't duplicated

…directory

ekluzek · 2022-12-14T19:28:47Z

Oh, I had forgotten about this one too. Thanks @billsacks

ekluzek · 2022-12-15T16:15:50Z

@slevisconsulting I'm not going to be able to get back to working on this one for awhile, I wonder if you shouldn't merge this into your PR and shepherd bringing it in to CTSM main?

This set of changes allows CTSM to continue API compatability with changes to the FATES API. FATES has updated its nutrient dynamics routine, and required a modification to the test environment, some minor updates to variable dimensions in the history, and a call to a new FATES history routine. Implicitly, the updating of the FATES tag introduces new content in the FATES model since the last API update (mostly bug fixes). Resolved conflicts: cime_config/SystemTests/fsurdatmodifyctsm.py test/tools/test_driver.sh

slevis-lmwg · 2022-12-15T21:28:03Z

@slevisconsulting I'm not going to be able to get back to working on this one for awhile, I wonder if you shouldn't merge this into your PR and shepherd bringing it in to CTSM main?

@ekluzek looks as though I can push directly to your PR, so I will work here for now, and we will likely close the redundant PR that I opened (#1925)

ekluzek · 2022-12-15T21:45:22Z

@slevisconsulting I think some of the documentation changes you have in #1925, might still apply. So you could merge this into that, or the other way around, whichever you prefer. I'm glad you'll be able to work on it though, it would be nice to see this come in.

FSURDATMODIFYCTSM_D_Mmpi-serial_Ld1.5x5_amazon.I2000Clm50SpRs.cheyenne_intel

slevis-lmwg · 2022-12-19T22:14:22Z

Okay, all sounds good - thank you @slevisconsulting and @ekluzek . It might be good to confirm that this works in the context of a larger test suite, if you haven't already. This wouldn't necessarily need the full aux_clm, but you could create a text-based testlist file containing this test and a few others – with one or more tests before this one and one or more after it – to make sure this doesn't interfere with the python environment loaded by the test system and vice versa.

I ran aux_clm on cheyenne and it passed other than expected failures. I did not get motivated to try a custom list of tests :-)

billsacks · 2022-12-19T22:24:37Z

I ran aux_clm on cheyenne and it passed other than expected failures. I did not get motivated to try a custom list of tests :-)

Awesome, thanks @slevisconsulting

slevis-lmwg · 2022-12-20T01:13:08Z

Tested the latest commit in clm_pymods, as well as with ./create_test and both passed.
@ekluzek please let me know if you agree with my implementation.

slevis-lmwg · 2023-01-24T22:24:34Z

@ekluzek and @slevisconsulting agreed this morning (2023/1/24) that this PR will likely get merged/tagged as dev118, pending successful testing once I update to dev117.

slevis-lmwg · 2023-02-03T00:04:47Z

@ekluzek I think something has changed in my permissions here. I updated to dev117, but the push to the PR has been failing. If nothing has changed on your side, we may need to troubleshoot tomorrow morning after Stand-up.

slevis-lmwg · 2023-02-03T00:15:43Z

Meanwhile
PASS make all from the /python directory
but
FAIL ./run_sys_tests -s clm_pymods -c ctsm5.1.dev117 --skip-generate
so I'm trying this PR in ctsm5.1.dev115 where it used to work:
PASS ./run_sys_tests -s clm_pymods -c ctsm5.1.dev115 --skip-generate
and vanilla ctsm5.1.dev117
PASS ./run_sys_tests -s clm_pymods -c ctsm5.1.dev116 --skip-generate

So now I need to difference this PR from vanilla dev117 to figure out what causes the above failure...

Resolved conflicts: cime_config/SystemTests/fsurdatmodifyctsm.py test/tools/test_driver.sh

slevis-lmwg · 2023-02-03T21:06:24Z

I have tried once more doing everything from scratch and the FSURDAT system test still fails.

This is how I resolved the conflicts:

In test/tools/test_driver.sh I'm keeping escomp/master:

# Setup conda environement
<<<<<<< HEAD
\$CLM_ROOT/py_env_create
conda activate ctsm_pylib
||||||| 17e2acb6a
\$CLM_ROOT/py_env_create
conda activate ctsm_py
=======
conda activate ctsm_py
if [ \$? -ne 0 ]; then
   echo "ERROR: Trouble activating the ctsm_py conda environment, be sure it's setup with \$CLM_ROOT/py_env_create, then rerun"
   exit 4
fi
>>>>>>> escomp/master

but changing ctsm_py to ctsm_pylib as I did elsewhere in this PR.

In cime_config/SystemTests/fsurdatmodifyctsm.py I'm keeping HEAD:

<<<<<<< HEAD

        self._case.load_env(reset=True)
        conda_env = ". "+self._get_caseroot()+"/.env_mach_specific.sh; "
        # Preprend the commands to get the conda environment for python first
        conda_env += self._get_conda_env()
        # Source the env
        try:
            subprocess.run( conda_env+"python3 "+tool_path+" "+self._cfg_file_path, shell=True, check=True)
        except subprocess.CalledProcessError as error:
            print("ERROR while getting the conda environment and/or ")
            print("running the fsurdat_modifier tool: ")
            print("(1) If your ctsm_pylib environment is out of date or you ")
            print("have not created the ctsm_pylib environment, yet, you may ")
            print("get past this error by running ./py_env_create ")
            print("in your ctsm directory and trying this test again. ")
            print("(2) If conda is not available, install and load conda, ")
            print("run ./py_env_create, and then try this test again. ")
            print("(3) If (1) and (2) are not the issue, then you may be ")
            print("getting an error within the fsurdat_modifier tool itself. ")
            print("Default error message: ")
            print(error.output)
        except:
            print("ERROR trying to run fsurdat_modifier tool.")
            raise
||||||| 17e2acb6a
        # Need to specify a specific python version that has the required
        # dependencies
        python_path = _get_python_path()
        subprocess.check_call([python_path, tool_path, self._cfg_file_path])
=======
        # Need to specify a specific python version that has the required
        # dependencies
        python_path = _get_python_path()
        subprocess.check_call([python_path, tool_path, self._cfg_file_path, "--verbose", "--overwrite", "--allow_ideal_and_include_non_veg"])
>>>>>>> escomp/master

I now suspect that one or more new things in this PR fail for some reason. New items include:

load_env
.env_mach_specific.sh
which conda
module unload python; module load conda
conda activate ctsm_pylib

Also, I still get an error when I try to push to the PR:

ERROR: Authentication error: Authentication required: You must have push access to verify locks
error: failed to push some refs to 'https://github.com/ekluzek/CTSM.git'

billsacks · 2023-02-03T21:12:38Z

Regarding the permissions: I have noticed that there is a bug with git-lfs that can prevent you from pushing to other people's forks. The workaround is to do a git lfs uninstall, then git push, then restore with git lfs install.

ekluzek · 2023-02-03T22:48:43Z

I think the permission is just the problem with making @slevisconsulting a collaborator on my fork, which we've resolved now.

In test driver you should change the use of ctsm_py to ctsm_pylib as follows...

conda activate ctsm_pylib
if [ \$? -ne 0 ]; then
   echo "ERROR: Trouble activating the ctsm_py conda environment, be sure it's setup with \$CLM_ROOT/py_env_create, then rerun"
   exit 4
fi

in fsurdatmodifyctsm.py you do need to change the line in

        try:
            subprocess.run( conda_env+"python3 "+tool_path+" "+self._cfg_file_path, shell=True, check=True)
        except subprocess.CalledProcessError as error:

to

        try:
            subprocess.check_call(conda_env+"python3 "+tool_path+" "+self._cfg_file_path+"--verbose --overwrite --allow_ideal_and_include_non_veg", shell=True, check=True)
        except subprocess.CalledProcessError as error:

Do you see why those need to change? Just want to make sure this makes sense to you...

slevis-lmwg · 2023-02-03T22:48:47Z

@ekluzek I think I have found the source of the problem. For some reason it eluded me until now. I have discovered this message, and wanted to ask whether you had a specific reason to add this requirement or if simply it made sense to you:
ERROR: idealized AND include_nonveg can NOT both be on, pick one or the other

slevis-lmwg · 2023-02-03T22:53:27Z

If you had a specific reason, then I will keep it. If not, then I would rather remove this requirement.

slevis-lmwg · 2023-02-03T22:57:20Z

I now see (probably saw in the review, as well, but forgot) that you had added this option:
--allow_ideal_and_include_non_veg
I will try to resolve using this option.

ekluzek · 2023-02-03T23:01:25Z

@slevisconsulting the reason I thought that idealized and include_nonveg should be on together is because idealized explicitly sets the dominant PFT to zero, while include_nonveg (actually false) zeros' out the PFT's. So they contradict each other. It looks like I got the logic wrong though -- and that combination should be allowed. First I got it wrong about it being about include_nonveg True, when it should be False. But, also I got it wrong that include_nonveg=False isn't zeroing vegetation -- just all the other landunit types -- so that's actually fine.

So you should remove that restriction and option I added.

The one that still makes sense is idealized and dom_pft, because idealized explicitly sets the dominant PFT to bare-soil, while dom_pft sets it to the value the user asks for. What behavior you get is then dependent on how it's implemented in the code -- but since it's ambiguous that means it's something that should be checked for.

Sorry I added a non useful restriction there. It is good to think these things through and make sure options make sense together and restrict ones that don't.

slevis-lmwg · 2023-02-03T23:23:03Z

@ekluzek I thought about it some more and decided that I also disagree about keeping the second restriction. The comments in the configure file explain why:

# PFT/CFT to be set to 100% according to user-defined mask.
# If idealized = True and dom_pft = UNSET, the latter defaults to 0
# (bare soil). Valid values range from 0 to a max value (int) that one can
# obtain from the fsurdat_in file using ncdump (or method preferred by user).
# The max valid value will equal (lsmpft - 1) and will also equal the last
# value of cft(cft).
dom_pft = UNSET

In other words, dom_pft is meant to be zero if the user leaves it UNSET or if the user explicitly sets it to zero.

slevis-lmwg · 2023-02-03T23:44:06Z

Regarding the permissions: I have noticed that there is a bug with git-lfs that can prevent you from pushing to other people's forks. The workaround is to do a git lfs uninstall, then git push, then restore with git lfs install.

Thanks @billsacks .
@ekluzek made me collaborator, so I am able to push again.
My only confusion is that I had pushed to this PR previously (e.g. Dec 15-19). Do collaborator permissions expire?

billsacks · 2023-02-03T23:48:41Z

@ekluzek made me collaborator, so I am able to push again.
My only confusion is that I had pushed to this PR previously (e.g. Dec 15-19). Do collaborator permissions expire?

I really think this is due to the lfs issue. There are two ways you can have permission to push to someone else's fork:

(1) If they make you a collaborator. This works even if you have git lfs installed.

(2) If you are a "maintainer" of CTSM (which I think means you have write permission), then by default you are given permission to push to any branch for which there is a PR. (Authors can choose not to give this permission, but the default is to give it. For this pr, it is allowed: I see the message "Maintainers are allowed to edit this pull request" in the metadata on the top right of the PR.) This feature is broken by git lfs for some reason.

ekluzek · 2023-02-03T23:57:48Z

@slevisconsulting that makes sense, and nicely documented in the file. Thanks for pointing that out.

It might be good to add a comment in the code that the dom_pft section HAS to be after the the idealized part for those reasons above. You could also add a comment to the idealized section that this should be done first, so that other options could override the behavior there. The idea that the idealized section is done and other things override its behavior then becomes an expectation of how the script is supposed to function. It's good to make all the expectations on how the script is intended to behavior to be as clear as possible.

Update ChangeLog and python tests

slevis-lmwg · 2023-02-04T01:06:43Z

Tests now PASS:
clm_pymods test-suite and make all in /python directory
I linked the dev117 baselines to dev118: ln -s ctsm5.1.dev117 ctsm5.1.dev118
I have now also updated the ChangeLog
@ekluzek I have added comments in the code as you suggested. Pls let me know if you have further concerns. Otherwise I think this is ready for the merge.

Working version of fsurdatmodifyctsm system test that uses conda envi…

03c6786

…ronment rather than ncar_pylib

ekluzek added enhancement new capability or improved behavior of existing capability testing additions or changes to tests labels Jul 6, 2022

ekluzek requested review from billsacks, negin513 and slevis-lmwg July 6, 2022 23:21

ekluzek self-assigned this Jul 6, 2022

billsacks approved these changes Jul 7, 2022

View reviewed changes

ekluzek added 8 commits July 28, 2022 12:36

Change the name of the conda environment for CTSM from ctsm_py to cts…

dca339e

…m_pylib to match ncar_pylib and to use a different name from the Hammon script that uses a ctsm_py environment name

Add a workflow for python directory testing

d943730

Add conda-forge channel to action, and only do test and lint test so …

f485034

…black check isn't duplicated

Have to manage externals so that cime exists

82adf40

Seperate tests out, and add git lfs

e7c69a6

Add an explicit check for the make returning an exit code

ae3acd0

Remove set of bash for login, and add explicit rule for cd to python …

c9d4861

…directory

billsacks mentioned this pull request Dec 14, 2022

Replace ncar_pylib because it is deprecated and will disappear soon #1925

Closed

This comment was marked as resolved.

Sign in to view

Minor corrections for system test to pass

d8eedef

FSURDATMODIFYCTSM_D_Mmpi-serial_Ld1.5x5_amazon.I2000Clm50SpRs.cheyenne_intel

slevis-lmwg added 2 commits December 19, 2022 16:53

Update test_driver.sh according to Erik's review

31798f1

If "which conda" fails, execute module load/unload, eg on cheyenne

4057f9a

Rm .github/workflows/python_testing.yml to unblock merging this PR

6fea524

Merge remote-tracking branch 'escomp/master' into fsurdatsystest_conda

131cf0e

Resolved conflicts: cime_config/SystemTests/fsurdatmodifyctsm.py test/tools/test_driver.sh

Rm override allow_ideal_and_include_non_veg & corresponding restriction

07e8462

slevis-lmwg added 2 commits February 3, 2023 17:06

First draft of ChangeLog

662bf92

Rm override allow_dom_pft_and_idealized & corresponding restriction

6699c0d

Update ChangeLog and python tests

Update changelog

f3f26fb

ekluzek merged commit e099103 into ESCOMP:master Feb 6, 2023

ekluzek deleted the fsurdatsystest_conda branch February 6, 2023 01:35

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Have fsurdat_modifier system test use conda environment rather than ncar_pylib #1798

Have fsurdat_modifier system test use conda environment rather than ncar_pylib #1798

ekluzek commented Jul 6, 2022 •

edited by slevis-lmwg

Loading

billsacks commented Jul 7, 2022

ekluzek commented Jul 7, 2022

billsacks commented Jul 7, 2022

billsacks commented Jul 7, 2022

billsacks commented Jul 7, 2022

billsacks left a comment

slevis-lmwg commented Jul 11, 2022

ekluzek commented Dec 14, 2022

ekluzek commented Dec 15, 2022

slevis-lmwg commented Dec 15, 2022

ekluzek commented Dec 15, 2022

This comment was marked as resolved.

slevis-lmwg commented Dec 19, 2022

billsacks commented Dec 19, 2022

slevis-lmwg commented Dec 20, 2022

slevis-lmwg commented Jan 24, 2023

slevis-lmwg commented Feb 3, 2023

slevis-lmwg commented Feb 3, 2023 •

edited

Loading

slevis-lmwg commented Feb 3, 2023

billsacks commented Feb 3, 2023

ekluzek commented Feb 3, 2023

slevis-lmwg commented Feb 3, 2023

slevis-lmwg commented Feb 3, 2023

slevis-lmwg commented Feb 3, 2023

ekluzek commented Feb 3, 2023

slevis-lmwg commented Feb 3, 2023

slevis-lmwg commented Feb 3, 2023

billsacks commented Feb 3, 2023

ekluzek commented Feb 3, 2023

slevis-lmwg commented Feb 4, 2023

Have fsurdat_modifier system test use conda environment rather than ncar_pylib #1798

Have fsurdat_modifier system test use conda environment rather than ncar_pylib #1798

Conversation

ekluzek commented Jul 6, 2022 • edited by slevis-lmwg Loading

Description of changes

Specific notes

billsacks commented Jul 7, 2022

ekluzek commented Jul 7, 2022

billsacks commented Jul 7, 2022

billsacks commented Jul 7, 2022

billsacks commented Jul 7, 2022

billsacks left a comment

Choose a reason for hiding this comment

slevis-lmwg commented Jul 11, 2022

ekluzek commented Dec 14, 2022

ekluzek commented Dec 15, 2022

slevis-lmwg commented Dec 15, 2022

ekluzek commented Dec 15, 2022

This comment was marked as resolved.

slevis-lmwg commented Dec 19, 2022

billsacks commented Dec 19, 2022

slevis-lmwg commented Dec 20, 2022

slevis-lmwg commented Jan 24, 2023

slevis-lmwg commented Feb 3, 2023

slevis-lmwg commented Feb 3, 2023 • edited Loading

slevis-lmwg commented Feb 3, 2023

billsacks commented Feb 3, 2023

ekluzek commented Feb 3, 2023

slevis-lmwg commented Feb 3, 2023

slevis-lmwg commented Feb 3, 2023

slevis-lmwg commented Feb 3, 2023

ekluzek commented Feb 3, 2023

slevis-lmwg commented Feb 3, 2023

slevis-lmwg commented Feb 3, 2023

billsacks commented Feb 3, 2023

ekluzek commented Feb 3, 2023

slevis-lmwg commented Feb 4, 2023

ekluzek commented Jul 6, 2022 •

edited by slevis-lmwg

Loading

slevis-lmwg commented Feb 3, 2023 •

edited

Loading