
Test new github.com backfill-tool deployment in UAT #28

Open
jamesfwood opened this issue Aug 20, 2024 · 23 comments
@jamesfwood
Collaborator

@davidcolemanjpl

Please test the backfill tool, which is now located here:

https://github.com/podaac/hitide-backfill-tool

Install command line tool from here:

https://test.pypi.org/project/podaac-hitide-backfill-tool/

Also, you'll need to update the default_message_config.json file.

Update from here:
https://github.jpl.nasa.gov/podaac/hitide-backfill-tool/blob/develop/podaac/hitide_backfill_tool/default_message_config.json

Let me or Simon know if you have any issues setting it up.

Thanks!

@davidcolemanjpl

@jamesfwood

Please advise regarding the HiTIDE backfill tool version under test for this effort.
Also, what are the associated forge and tig versions in the backfill tool in services-uat?

Thanks!

@jamesfwood
Collaborator Author

jamesfwood commented Aug 20, 2024

@davidcolemanjpl

backfill-tool version 0.9.0rc3
forge version 0.11.0-rc.3
tig version 0.12.0-rc.3
forge-py version 0.2.0-rc.4

The versions deployed to each environment should be specified here:
https://github.com/podaac/hitide-backfill-tool/tree/develop/terraform-deploy/terraform_env

@jamesfwood
Collaborator Author

@davidcolemanjpl
There are also a few collections that use the new forge-py library for footprinting.
These collections are:

  • SCATSAT1_ESDR_L2_WIND_STRESS_V1.1
  • COWVR_STPH8_L2_EDR_V9.0

and a few new ones coming soon.

Please also test those collections in UAT.

Thanks!

@davidcolemanjpl davidcolemanjpl moved this from 🔖 Ready to 🏗 In progress in hitide-24.3 Aug 21, 2024
@davidcolemanjpl

davidcolemanjpl commented Aug 22, 2024

podaac-hitide-backfill-tool v0.9.0rc3
podaac-app-services-uat-1858
TIG v0.12.0-rc.3
FORGE v0.11.0-rc.3
FORGE-py v0.2.0-rc.4

Verified that podaac-hitide-backfill-tool works as expected for the following collections:
ASCATB-L2-Coastal
VIIRS_N21-NAVO-L2P-v3.0
AQUARIUS_L2_SSS_V5
SMAP_RSS_L2_SSS_V5
AMSR2-REMSS-L2P-v8.2
MODIS_T-JPL-L2P-v2019.0

When testing 'SCATSAT1_ESDR_L2_WIND_STRESS_V1.1*', observed errors for podaac-services-uat-hitide-backfill-forge executions.
The executions fail at the 'ForgePyProcess' step:

"Error": "TypeError",
"Cause": "{"errorMessage": "ufunc 'create_collection' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''", "errorType": "TypeError", "requestId": "d8f12ff6-465c-49a4-8b63-190877499101", "stackTrace": [" File \"/var/task/podaac/lambda_handler/lambda_handler.py\", line 311, in handler\n return FootprintGenerator.cumulus_handler(event, context=context)\n", " File \"/var/task/cumulus_process/process.py\", line 315, in cumulus_handler\n return run_cumulus_task(cls.handler, event, context)\n", " File \"/var/task/run_cumulus_task.py\", line 85, in run_cumulus_task\n return handle_task_exception(exception, cumulus_message, logger)\n", " File \"/var/task/run_cumulus_task.py\", line 83, in run_cumulus_task\n task_response = task_function(nested_event, context, **taskargs)\n", " File \"/var/task/podaac/lambda_handler/lambda_handler.py\", line 268, in handler\n return cls.run(path=path, noclean=noclean, context=context, **event)\n", " File \"/var/task/podaac/lambda_handler/lambda_handler.py\", line 276, in run\n output = process.process()\n", " File \"/var/task/podaac/lambda_handler/lambda_handler.py\", line 257, in process\n file_dict = self.footprint_generate(file_, config_file_path, granule_id)\n", " File \"/var/task/podaac/lambda_handler/lambda_handler.py\", line 199, in footprint_generate\n wkt_representation = forge.generate_footprint(lon_data, lat_data, thinning_fac=thinning_fac, alpha=alpha, is360=is360, simplify=simplify,\n", " File \"/var/task/podaac/forge_py/forge.py\", line 115, in generate_footprint\n alpha_shape = selected_function(lon_array, lat, thinning_fac=thinning_fac, alpha=alpha)\n", " File \"/var/task/podaac/forge_py/forge.py\", line 39, in scatsat_footprint\n alpha_shape = alphashape.alphashape(xy, alpha=alpha)\n", " File \"/var/task/alphashape/alphashape.py\", line 180, in alphashape\n m = MultiLineString([coords[np.array(edge)] for edge in perimeter_edges])\n", " 
File \"/var/task/shapely/geometry/multilinestring.py\", line 60, in new\n return shapely.multilinestrings(subs)\n", " File \"/var/task/shapely/decorators.py\", line 77, in wrapped\n return func(*args, **kwargs)\n", " File \"/var/task/shapely/creation.py\", line 393, in multilinestrings\n return lib.create_collection(geometries, typ, out=out, **kwargs)\n"]}"

Recent related forge executions that failed:
e0e25761-b08d-46a4-9e02-ba368ea09f83
f79f549a-ad1e-4f6f-8814-4f41356b94fa
0192ac6b-292a-495a-9962-8f01ac2502ef



When testing 'SCATSAT1_ESDR_L2_WIND_STRESS_V1.1', observed errors for podaac-services-uat-hitide-backfill-tig executions:

podaac-services-uat-hitide-backfill-tig fails at the 'ImageProcess' step:

see recent hitide-backfill-tig executions:
ece1baf0-aa16-409d-9cea-ca33fb3952ad
656c2ec8-fa45-4da7-b943-69f09f24f54d
5d435340-06c0-4bef-bbdc-3c3b353947a1

"Error": "Lambda.Unknown",
"Cause": "The cause could not be determined because Lambda did not return an error type. Returned payload: {"errorMessage":"2024-08-22T01:07:54.717Z 7c61f7e1-1642-4dec-a6df-96034c45a128 Task timed out after 900.11 seconds"}"
},
"delaySeconds": 0
}



COWVR_STPH8_L2_EDR_V9.0:

  • Unable to locate the associated cumulus-collections configuration "COWVR_STPH8_L2_EDR_V9.0.json" (or COWVR_STPH8_L2_EDR_V9.0-rules.json)

When testing using COWVR_STPH8_L2_EDR_V8.0 (C1257664524-POCLOUD):

  • The user receives the following message (whether footprint is enabled or disabled in config.yml):
    % backfill --config config.yml
    Started backfill: 2024-08-26 19:09:13 UTC
    collection config: searching /Users/colemand/work/backfill/cumulus-configurations for /uat//COWVR_STPH8_L2_EDR_V8.0.json
    collection config: found /Users/colemand/work/backfill/cumulus-configurations/cumulus-collections/uat/cowvr/COWVR_STPH8_L2_EDR_V8.0.json
    Checking S3 settings
    Traceback (most recent call last):
    File "/Users/colemand/work/Backfill/venv/bin/backfill", line 8, in
    sys.exit(main())
    File "/Users/colemand/work/Backfill/venv/lib/python3.10/site-packages/podaac/hitide_backfill_tool/cli.py", line 490, in main
    raise Exception("There is no footprint settings for this collection, please disable footprint for backfilling")
    Exception: There is no footprint settings for this collection, please disable footprint for backfilling
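The guard that produces this exception can be sketched as follows; the function and config keys are illustrative, not the actual hitide-backfill-tool internals:

```python
# Illustrative sketch of the guard behind this traceback; names are
# hypothetical, not the real hitide-backfill-tool code.
def check_footprint_settings(collection_config, footprint_requested):
    """Raise when footprint backfilling is requested but the collection
    configuration carries no footprint settings."""
    if footprint_requested and not collection_config.get("footprint"):
        raise Exception(
            "There is no footprint settings for this collection, "
            "please disable footprint for backfilling"
        )

# A collection config without footprint settings fails fast:
try:
    check_footprint_settings({"name": "COWVR_STPH8_L2_EDR_V8.0"}, True)
except Exception as exc:
    print(exc)
```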

@davidcolemanjpl davidcolemanjpl moved this from 🏗 In progress to 👎 Test Failed in hitide-24.3 Aug 22, 2024
@sliu008
Contributor

sliu008 commented Aug 22, 2024

We have granules that don't belong in the SCATSAT1_ESDR_L2_WIND_STRESS_V1.1 collection. All of the granules with "ancillary" in their names don't belong, which is why these forge/tig executions are failing.

We should delete these granules from UAT.
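Identifying the misfiled granules can be sketched like this (granule names below are made up for the example):

```python
# Illustrative sketch: flag misfiled granules whose names contain
# "ancillary" so they can be deleted from the UAT collection.
granules = [
    "S1_L2B_WIND_STRESS_00123.nc",
    "S1_L2B_WIND_STRESS_00123_ancillary.nc",
    "S1_L2B_WIND_STRESS_00124.nc",
]

to_delete = [name for name in granules if "ancillary" in name.lower()]
keep = [name for name in granules if "ancillary" not in name.lower()]

print("delete from UAT:", to_delete)
```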

@jamesfwood
Collaborator Author

jamesfwood commented Jan 15, 2025

Hi @davidcolemanjpl, this is now ready for you to test again.

Here are the details:

backfill-tool version 0.10.0rc17
forge-py version 0.4.0rc3
forge version 0.11.0-rc.3
tig version 0.12.0-rc.3
hitide-backfill-lambdas version 0.4.1rc3
postworkflow-normalizer version 0.4.1rc2

Also, don't test the COWVR V9.0 (retired) above, but test this one instead:
COWVR_STPH8_L1_TSDR_V10.0

and this:
COWVR_STPH8_L2_EDR_V10.0

Test other random normal collections too, like MODIS and VIIRS.

And include these:
AVHRR19_G-NAVO-L2P-v1.0 (forge-py)
AVHRRMTA_G-NAVO-L2P-v1.0 (forge-py)
AVHRRMTA_G-NAVO-L2P-v2.0 (forge-py)
AVHRRMTB_G-NAVO-L2P-v1.0 (forge-py)
AVHRRMTA_G-NAVO-L2P-v1.0 (forge-py)
SCATSAT1_ESDR_L2_WIND_STRESS_V1.1 (forge-py) (This one may fail but that is expected until we fix the UAT issues)
EWSG2-NAVO-L2P-v01 (forge-py)
SMAP_RSS_L2_SSS_V6 (forge-py)

Also one more thing. We are waiting for a new version of Cumulus before doing the final delivery. It will include new lambda functions. We don't expect anything to change, but you may need to test some things again after we do that update.
Cumulus is at 18.5.2 now, but it will probably go to 18.5.3 before final release.

Thanks!

@jamesfwood jamesfwood moved this to 🔖 Ready in hitide-25.1 Jan 15, 2025
@jamesfwood
Collaborator Author

@davidcolemanjpl please include this in your new testing https://jira.jpl.nasa.gov/browse/PODAAC-6139

@davidcolemanjpl

davidcolemanjpl commented Jan 16, 2025

podaac-hitide-backfill-tool v0.10.0rc17
podaac-app-services-uat-1858
tig:v0.12.0-rc.3
forge-py version v0.4.0rc3
forge version 0.11.0-rc.3

  • VIIRS_NPP-NAVO-L2P-v3.0:

podaac-services-uat-hitide-backfill-forge failed executions (Fails at the 'ForgeProcess' step function):

b927dbad-8cc5-47a9-a3a8-f1ed2e52e523
3a0cd29d-7668-4870-9a4e-88e539b27d19
a587ca63-3a24-4d0b-9402-2ff32b77c3e3

"Error": "cumulus_message_adapter.message_parser.MessageAdapterException",
    "Cause": "{\"errorMessage\":\"An error occurred in the Cumulus Message Adapter: gov.nasa.podaac.forge.FootprintHandlerException: Error processing granule\",\"errorType\":\"cumulus_message_adapter.message_parser.MessageAdapterException\",\"stackTrace\":[\"cumulus_message_adapter.message_parser.MessageParser.HandleMessage(MessageParser.java:99)\",\"cumulus_message_adapter.message_parser.MessageParser.RunCumulusTask(MessageParser.java:116)\",\"gov.nasa.podaac.forge.FootprintHandler.handleRequestStreams(FootprintHandler.java:90)\",\"java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\",\"java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)\",\"java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)\",\"java.base/java.lang.reflect.Method.invoke(Unknown Source)\"],\"cause\":{\"errorMessage\":\"Error processing granule\",\"errorType\":\"gov.nasa.podaac.forge.FootprintHandlerException\",\"stackTrace\":[\"gov.nasa.podaac.forge.FootprintHandler.PerformFunction(FootprintHandler.java:187)\",\"cumulus_message_adapter.message_parser.MessageParser.HandleMessage(MessageParser.java:76)\",\"cumulus_message_adapter.message_parser.MessageParser.RunCumulusTask(MessageParser.java:116)\",\"gov.nasa.podaac.forge.FootprintHandler.handleRequestStreams(FootprintHandler.java:90)\",\"java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\",\"java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)\",\"java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)\",\"java.base/java.lang.reflect.Method.invoke(Unknown Source)\"],\"cause\":{\"errorMessage\":\"Illegal Range for dimension 1: last requested 3200 > max 
3199\",\"errorType\":\"ucar.ma2.InvalidRangeException\",\"stackTrace\":[\"ucar.ma2.Section.fill(Section.java:179)\",\"ucar.nc2.Variable.read(Variable.java:709)\",\"ucar.nc2.Variable.read(Variable.java:683)\",\"gov.nasa.podaac.forge.Footprinter.constructCoordsFromNetcdf(Footprinter.java:233)\",\"gov.nasa.podaac.forge.Footprinter.processRange(Footprinter.java:287)\",\"gov.nasa.podaac.forge.Footprinter.footprint(Footprinter.java:169)\",\"gov.nasa.podaac.forge.FootprintHandler.PerformFunction(FootprintHandler.java:185)\",\"cumulus_message_adapter.message_parser.MessageParser.HandleMessage(MessageParser.java:76)\",\"cumulus_message_adapter.message_parser.MessageParser.RunCumulusTask(MessageParser.java:116)\",\"gov.nasa.podaac.forge.FootprintHandler.handleRequestStreams(FootprintHandler.java:90)\",\"java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\",\"java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)\",\"java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)\",\"java.base/java.lang.reflect.Method.invoke(Unknown Source)\"]}}}"
  }

podaac-services-uat-hitide-backfill-tig failed executions (Fails at the 'ImageProcess' step function):
e7929e2b-1a30-48ae-b3f2-38bbf134a8fb
9b87db20-a6a5-439d-bdfe-61c4ef438255

 "Error": "Lambda.Unknown",
    "Cause": "The cause could not be determined because Lambda did not return an error type. Returned payload: {\"errorMessage\":\"2025-01-15T23:41:53.082Z a8a7b3b3-5db1-4e4f-8477-84436bd63b4f Task timed out after 900.11 seconds\"}"
  }


  • COWVR_STPH8_L1_TSDR_V10.0:

podaac-services-uat-hitide-backfill-tig (Failed at the 'ImageProcess' Step):

see recently failed hitide-backfill-tig executions:

9f7f4cc4-2fb5-4c8c-acf9-70a6066d31bc
a6d87898-238d-492b-b4c1-a55e57aad787
7817f60f-3a29-4cc6-9b0a-eb05ba1a799b
3e741706-c5bd-49c5-a429-65f22c2fa784

"Error": "KeyError",
    "Cause": "{\"errorMessage\": \"'imgVariables'\", \"errorType\": \"KeyError\", \"requestId\": \"6802e5f7-7779-4fcc-a3ec-35dc43113faf\", \"stackTrace\": [\"  File \\\"/var/task/podaac/lambda_handler/lambda_handler.py\\\", line 451, in handler\\n    return ImageGenerator.cumulus_handler(event, context=context)\\n\", \"  File \\\"/var/task/cumulus_process/process.py\\\", line 315, in cumulus_handler\\n    return run_cumulus_task(cls.handler, event, context)\\n\", \"  File \\\"/var/task/run_cumulus_task.py\\\", line 85, in run_cumulus_task\\n    return handle_task_exception(exception, cumulus_message, logger)\\n\", \"  File \\\"/var/task/run_cumulus_task.py\\\", line 83, in run_cumulus_task\\n    task_response = task_function(nested_event, context, **taskargs)\\n\", \"  File \\\"/var/task/podaac/lambda_handler/lambda_handler.py\\\", line 408, in handler\\n    return cls.run(path=path, noclean=noclean, context=context, **event)\\n\", \"  File \\\"/var/task/podaac/lambda_handler/lambda_handler.py\\\", line 416, in run\\n    output = process.process()\\n\", \"  File \\\"/var/task/podaac/lambda_handler/lambda_handler.py\\\", line 259, in process\\n    self.download_palette_files(config_file_path)\\n\", \"  File \\\"/var/task/podaac/lambda_handler/lambda_handler.py\\\", line 196, in download_palette_files\\n    for variable in data['imgVariables']:\\n\"]}"
  }
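The stack trace above ends in download_palette_files indexing data['imgVariables'] directly, so any forge-tig configuration without that key raises KeyError. A hedged sketch of a tolerant lookup (illustrative only, not the actual tig fix):

```python
# Illustrative tolerant lookup: .get() returns an empty list when the
# forge-tig config defines no image variables, avoiding the KeyError.
def download_palette_files(config):
    palettes = []
    for variable in config.get("imgVariables", []):
        palette = variable.get("palette")
        if palette:
            palettes.append(palette)
    return palettes

print(download_palette_files({}))  # config without 'imgVariables' -> []
```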


  • COWVR_STPH8_L2_EDR_V10.0:
    podaac-services-uat-hitide-backfill-tig (Failed at the 'ImageProcess' Step):

see recently failed hitide-backfill-tig executions:
754f68ed-1249-462e-b95e-6ccdd7ea0435
2c9acbc1-6867-4950-bb0f-6033a5bce771
5d3b5ca7-465b-448b-b68b-a43767cb0a00

 "Error": "KeyError",
    "Cause": "{\"errorMessage\": \"'imgVariables'\", \"errorType\": \"KeyError\", \"requestId\": \"540857d3-d1eb-4bf5-8035-7df4b5f16923\", \"stackTrace\": [\"  File \\\"/var/task/podaac/lambda_handler/lambda_handler.py\\\", line 451, in handler\\n    return ImageGenerator.cumulus_handler(event, context=context)\\n\", \"  File \\\"/var/task/cumulus_process/process.py\\\", line 315, in cumulus_handler\\n    return run_cumulus_task(cls.handler, event, context)\\n\", \"  File \\\"/var/task/run_cumulus_task.py\\\", line 85, in run_cumulus_task\\n    return handle_task_exception(exception, cumulus_message, logger)\\n\", \"  File \\\"/var/task/run_cumulus_task.py\\\", line 83, in run_cumulus_task\\n    task_response = task_function(nested_event, context, **taskargs)\\n\", \"  File \\\"/var/task/podaac/lambda_handler/lambda_handler.py\\\", line 408, in handler\\n    return cls.run(path=path, noclean=noclean, context=context, **event)\\n\", \"  File \\\"/var/task/podaac/lambda_handler/lambda_handler.py\\\", line 416, in run\\n    output = process.process()\\n\", \"  File \\\"/var/task/podaac/lambda_handler/lambda_handler.py\\\", line 259, in process\\n    self.download_palette_files(config_file_path)\\n\", \"  File \\\"/var/task/podaac/lambda_handler/lambda_handler.py\\\", line 196, in download_palette_files\\n    for variable in data['imgVariables']:\\n\"]}"
  }


  • SCATSAT1_ESDR_L2_WIND_STRESS_V1.1:
    Not all podaac-services-uat-hitide-backfill-tig executions completed as expected (1 of 4 and 1 of 5 tig executions were successful).

Failed hitide-backfill-tig executions:
c6542751-bbbc-45dd-be97-8b6a6216795a
d891a25c-b852-44e8-a1c4-16300e0a21ef
58095f8e-eac0-4154-baf7-a518f018fa85

Executions failed at the 'ImageProcess' step:

"exception": {
    "Error": "Lambda.Unknown",
    "Cause": "The cause could not be determined because Lambda did not return an error type. Returned payload: {\"errorMessage\":\"2025-01-17T00:08:58.176Z 0b6710d7-1def-4c03-b5d2-25704e896947 Task timed out after 900.11 seconds\"}"
  }


NOTE:
podaac-hitide-backfill-tool executions (hitide-backfill-forge and hitide-backfill-tig) completed successfully, as expected, for the following collections under test:

  • AVHRR19_G-NAVO-L2P-v1.0
  • AVHRRMTA_G-NAVO-L2P-v1.0
  • AVHRRMTB_G-NAVO-L2P-v1.0
  • AVHRRMTA_G-NAVO-L2P-v2.0
  • AVHRRMTB_G-NAVO-L2P-v2.0
  • EWSG2-NAVO-L2P-v01
  • SMAP_RSS_L2_SSS_V6

@davidcolemanjpl davidcolemanjpl moved this from 🔖 Ready to 👎 Test Failed in hitide-25.1 Jan 16, 2025
@jamesfwood jamesfwood moved this from 👎 Test Failed to 🔖 Ready in hitide-25.1 Jan 16, 2025
@jamesfwood
Collaborator Author

Hi @davidcolemanjpl
I fixed the COWVR issue. Please test the COWVR collections again with the latest CLI release
0.10.0rc18

Basically, these two collections don't support image generation, so the backfill client should exit and not allow you to run image mode on them. Please test this. Thanks!

The same applies to EWSG2-NAVO-L2P-v01: no image generation is enabled.

@jamesfwood
Collaborator Author

jamesfwood commented Jan 16, 2025

@sliu008 is working on the VIIRS issue.

@davidcolemanjpl davidcolemanjpl moved this from 🔖 Ready to 🏗 In progress in hitide-25.1 Jan 21, 2025
@davidcolemanjpl

podaac-hitide-backfill-tool v0.10.0rc18
podaac-app-services-uat-1858
tig:v0.12.0-rc.3
forge-py version v0.4.0rc3
forge version 0.11.0-rc.3
HiTIDE-UI UAT v4.17.3-rc.2
HiTIDE Profile v4.10.1-rc.16

podaac-hitide-backfill-tool executions (hitide-backfill-forge and hitide-backfill-tig) completed successfully, as expected, for the following collections under test:

AVHRRMTB_G-NAVO-L2P-v2.0
AVHRRMTB_G-NAVO-L2P-v1.0
AVHRRMTA_G-NAVO-L2P-v2.0
AVHRRMTA_G-NAVO-L2P-v1.0
SMAP_RSS_L2_SSS_V6

EWSG2-NAVO-L2P-v01 (verified that there is no image setting for this collection)
COWVR_STPH8_L2_EDR_V10.0 (verified that there is no image setting for this collection)
COWVR_STPH8_L1_TSDR_V10.0 (verified that there is no image setting for this collection)
AVHRR19_G-NAVO-L2P-v1.0
MODIS_A-JPL-L2P-v2019.0


Collection FAILed test:
VIIRS_NPP-NAVO-L2P-v3.0 still fails FORGE/TIG executions, DEV currently investigating (see https://jira.jpl.nasa.gov/browse/PODAAC-6139)

Side Observation:
(PODAAC-6656) - When the image setting is commented out in config.yml, the backfill execution still warns the user:
"There is no image setting for this collection, please disable image for backfilling"
The user must manually set the image setting to "off" for the backfill tool to run as expected.
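From the observation above, a config.yml that skips image backfilling must carry an explicit "off" value rather than a commented-out line. An illustrative fragment (key names assumed from the messages in this thread, not verified against the tool's schema):

```yaml
# Illustrative config.yml fragment only. Explicitly disable image
# backfilling; commenting the key out triggers the warning instead.
image: "off"
footprint: "on"
dmrpp: "on"
```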


Test Complete

@davidcolemanjpl davidcolemanjpl moved this from 🏗 In progress to ✅ Done in hitide-25.1 Jan 21, 2025
@jamesfwood jamesfwood moved this from ✅ Done to 🔖 Ready in hitide-25.1 Jan 23, 2025
@jamesfwood
Collaborator Author

@davidcolemanjpl We should have fixed the VIIRS issues. Please test those collections again. Thanks!
This includes new versions of forge, tig also.

@jamesfwood
Collaborator Author

@davidcolemanjpl please test this again too

https://jira.jpl.nasa.gov/browse/PODAAC-6139

@jamesfwood
Collaborator Author

jamesfwood commented Jan 23, 2025

@davidcolemanjpl We designed it to fail on empty granules. Now it should fail much more quickly; this is normal behavior.

@davidcolemanjpl

davidcolemanjpl commented Jan 23, 2025

@davidcolemanjpl We should have fixed the VIIRS issues. Please test those collections again. Thanks! This includes new versions of forge, tig also.

@jamesfwood Please list the updated versions for the re-test. Does the backfill tool version remain the same (v0.10.0rc18)?
It seems the tig version remains at v0.12.0-rc.3 for UAT, correct?

Also, please see PODAAC-6656 issue related to HiTIDE-backfill-tool

@jamesfwood
Collaborator Author

@davidcolemanjpl

Ok, the new versions now are:
backfill tool: 0.10.0rc29
forge: 0.12.0-rc.14
tig: 0.13.0rc3
forge-py: 0.4.0rc5

Can you tell me why you need to know the version numbers?

I updated it so that it requires each image, footprint, and dmrpp to be explicitly defined.
I fixed that new issue in ticket 6656 also.

Please test everything again.
Thanks!

@davidcolemanjpl

davidcolemanjpl commented Jan 27, 2025

Can you tell me why you need to know the version numbers?
@jamesfwood Version numbers are essential for test traceability, which includes issue tracking and test artifacts.

https://github.com/podaac/hitide-backfill-tool/blob/develop/terraform-deploy/terraform_env/uat.json

https://github.com/podaac/hitide-backfill-tool/blob/release/0.10.0/terraform-deploy/terraform_env/uat.json

https://test.pypi.org/project/podaac-hitide-backfill-tool/#history

note:
Per the most recent request, I'll update podaac-hitide-backfill-tool from v0.10.0rc18 to v0.10.0rc29 for another round of backfill-tool testing.

@davidcolemanjpl davidcolemanjpl moved this from 🔖 Ready to 🏗 In progress in hitide-25.1 Jan 28, 2025
@davidcolemanjpl

davidcolemanjpl commented Jan 28, 2025

podaac-hitide-backfill-tool v0.10.0rc29
podaac-app-services-uat-1858
HiTIDE-UI UAT v4.17.3-rc.2
HiTIDE-Profile v4.10.1-rc.16
forge-py: v0.4.0rc5
forge: 0.12.0-rc.14
tig: v0.13.0rc3

***Issues observed in this podaac-hitide-backfill-tool version (see PODAAC-6139):

Test Failed

SMAP_RSS_L2_SSS_V6:
When testing 'SMAP_RSS_L2_SSS_V6', observed errors for podaac-services-uat-hitide-backfill-tig executions.

podaac-services-uat-hitide-backfill-tig fails at the 'ImageProcess' step:

Failed hitide-backfill-tig executions:

5b29915a-f822-4479-8791-1335829a3d12
8895857a-e669-4893-b005-c5c9added39b
66ecefbb-e73a-4946-9f4c-89c20e2b63ec
bebe6e3b-56c9-4804-98d6-8e9542a9c430

"Error": "KeyError",
    "Cause": "{\"errorMessage\": \"'k'\", \"errorType\": \"KeyError\", \"requestId\": \"c57f9633-1a2b-4f7f-84e1-d13235a88819\", \"stackTrace\": [\"  File \\\"/var/task/podaac/lambda_handler/lambda_handler.py\\\", line 503, in handler\\n    return ImageGenerator.cumulus_handler(event, context=context)\\n\", \"  File \\\"/var/task/cumulus_process/process.py\\\", line 315, in cumulus_handler\\n    return run_cumulus_task(cls.handler, event, context)\\n\", \"  File \\\"/var/task/run_cumulus_task.py\\\", line 85, in run_cumulus_task\\n    return handle_task_exception(exception, cumulus_message, logger)\\n\", \"  File \\\"/var/task/run_cumulus_task.py\\\", line 83, in run_cumulus_task\\n    task_response = task_function(nested_event, context, **taskargs)\\n\", \"  File \\\"/var/task/podaac/lambdahandler/lambdahandler.py\\\", line 460, in handler\\n    return cls.run(path=path, noclean=noclean, context=context, **event)\\n\", \"  File \\\"/var/task/podaac/lambdahandler/lambdahandler.py\\\", line 468, in run\\n    output = process.process()\\n\", \"  File \\\"/var/task/podaac/lambdahandler/lambdahandler.py\\\", line 271, in process\\n    uploadedimages = self.imagegenerate(file_, configfilepath, self.path, granuleid)\\n\", \"  File \\\"/var/task/podaac/lambdahandler/lambdahandler.py\\\", line 362, in imagegenerate\\n    self.logger.error(\\\"Error during image generation: {}\\\".format(ex), excinfo=True)\\n\", \"  File \\\"/var/task/cumuluslogger.py\\\", line 301, in error\\n    self.log(logging.ERROR, message, *args, **kwargs)\\n\", \"  File \\\"/var/task/cumuluslogger.py\\\", line 326, in log\\n    msg = self.createMessage(message, *args, **kwargs)\\n\", \"  File \\\"/var/task

 

SCATSAT1_ESDR_L2_WIND_STRESS_V1.1:

podaac-services-uat-hitide-backfill-tig executions fail at the 'ImageProcess' step.
Failed hitide-backfill-tig executions:

46906f65-983e-434d-a6ec-85fed1eb8b66
4e43ba6e-86fa-41e9-8e7d-e59ab1a4c8a6
abb377d5-3351-4e0d-827d-d5cc5931fe7c
80278636-ee9c-4871-ade9-7a9fcb84871d
bb7e312d-ba46-48ce-aafb-665089eba355
4b079810-4360-4016-a555-a57be02bc10c

"Error": "Exception",
    "Cause": "{\"errorMessage\": \"Process error: \\\"No variable named 'en_wind_speed'. Variables on the dataset include ['time', 'lat', 'lon', 'flags', 'quality_indicator', ..., 'era_sst', 'era_boundary_layer_height', 'era_rel_humidity_2m', 'globcurrent_u', 'globcurrent_v']\\\"\\nTraceback (most recent call last):\\n  File \\\"/var/task/xarray/core/dataset.py\\\", line 1512, in _construct_dataarray\\n    variable = self._variables[name]\\nKeyError: 'en_wind_speed'\\n\\nDuring handling of the above exception, another exception occurred:\\n\\nTraceback (most recent call last):\\n  File \\\"/var/task/xarray/core/dataset.py\\\", line 1611, in __getitem__\\n    return self._construct_dataarray(key)\\n  File \\\"/var/task/xarray/core/dataset.py\\\", line 1514, in _construct_dataarray\\n    _, name, variable = _get_virtual_variable(self._variables, name, self.sizes)\\n  File \\\"/var/task/xarray/core/dataset.py\\\", line 221, in _get_virtual_variable\\n    raise KeyError(key)\\nKeyError: 'en_wind_speed'\\n\\nThe above exception was the direct cause of the following exception:\\n\\nTraceback (most recent call last):\\n  File \\\"/var/task/podaac/lambda_handler/lambda_handler.py\\\", line 51, in generate_images\\n    images = image_gen.generate_images(granule_id=granule_id)\\n  File \\\"/var/task/podaac/tig/tig.py\\\", line 406, in generate_images\\n    output_images = self.generate_images_group(image_format, world_file, granule_id, group=None)\\n  File \\\"/var/task/podaac/tig/tig.py\\\", line 478, in generate_images_group\\n    output_image_file = self.process_variable(var,\\n  File \\\"/var/task/podaac/tig/tig.py\\\", line 660, in process_variable\\n    var_array = local_dataset[variable].to_masked_array().flatten()\\n  File \\\"/var/task/xarray/core/dataset.py\\\", line 1617, in __getitem__\\n    raise KeyError(message) from e\\nKeyError: \\\"No variable named 'en_wind_speed'. 
Variables on the dataset include ['time', 'lat', 'lon', 'flags', 'quality_indicator', ..., 'era_sst', 'era_boundary_layer_height', 'era_rel_humidity_2m', 'globcurrent_u', 'globcurrent_v']\\\"\\n\", \"errorType\": \"Exception\", \"requestId\": \"1e381a6d-106f-4b4e-94b1-4a18a2f3680b\", \"stackTrace\": [\"  File \\\"/var/task/podaac/lambda_handler/lambda_handler.py\\\", line 503, in handler\\n    return ImageGenerator.cumulus_handler(event, context=context)\\n\", \"  File \\\"/var/task/cumulus_process/process.py\\\", line 315, in cumulus_handler\\n    return run_cumulus_task(cls.handler, event, context)\\n\", \"  File \\\"/var/task/run_cumulus_task.py\\\", line 85, in run_cumulus_task\\n    return handle_task_exception(exception, cumulus_message, logger)\\n\", \"  File \\\"/var/task/run_cumulus_task.py\\\", line 83, in run_cumulus_task\\n    task_response = task_function(nested_event, context, **taskargs)\\n\", \"  File \\\"/var/task/podaac/lambda_handler/lambda_handler.py\\\", line 460, in handler\\n    return cls.run(path=path, noclean=noclean, context=context, **event)\\n\", \"  File \\\"/var/task/podaac/lambda_handler/lambda_handler.py\\\", line 468, in run\\n    output = process.process()\\n\", \"  File \\\"/var/task/podaac/lambda_handler/lambda_handler.py\\\", line 271, in process\\n    uploaded_images = self.image_generate(file_, config_file_path, self.path, granule_id)\\n\", \"  File \\\"/var/task/podaac/lambda_handler/lambda_handler.py\\\", line 357, in image_generate\\n    image_list = self._generate_images(local_file, config_file, palette_dir, granule_id, variables_config)\\n\", \"  File \\\"/var/task/podaac/lambda_handler/lambda_handler.py\\\", line 413, in _generate_images\\n    raise Exception(\\\"\\\\n\\\".join(errors))\\n\"]}"
  }
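A pre-flight check along these lines would surface the missing 'en_wind_speed' before image generation starts. Plain Python sets stand in for the xarray dataset here; this is an illustrative sketch, not tig code:

```python
# Illustrative pre-flight check (not tig code): verify every variable the
# forge-tig config wants to image actually exists in the dataset, instead
# of raising KeyError mid-generation. Names come from the error above.
dataset_variables = {
    "time", "lat", "lon", "flags", "quality_indicator",
    "era_sst", "globcurrent_u", "globcurrent_v",
}
configured_variables = ["en_wind_speed", "era_sst"]

missing = [v for v in configured_variables if v not in dataset_variables]
if missing:
    print("skipping image generation; variables not in dataset:", missing)
```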

@davidcolemanjpl davidcolemanjpl moved this from 🏗 In progress to 👎 Test Failed in hitide-25.1 Jan 28, 2025
@jamesfwood
Collaborator Author

@davidcolemanjpl Can you provide the granule ID that you are running, and the command line so we can run this ourselves?

Can you run it again for SMAP_RSS_L2_SSS_V6 so we can find the failed step function more easily? It's hard to track those down right now.

Also, there's no need to test SCATSAT1_ESDR_L2_WIND_STRESS_V1.1 right now because its CMR entry is broken in UAT, so skip that collection for now.

Thanks!

@jamesfwood
Collaborator Author

jamesfwood commented Jan 28, 2025

@davidcolemanjpl By the way, you can now run the backfill tool with an input granule list. You can run a specific granule that fails so we can try it too; a fixed command line makes issues easier to track. Can you test all the newly added features?

These were all the major changes (from the CHANGELOG):

CLI Changes:

  • Updated to require --image, --footprint, and --dmrpp to be explicitly defined in args or input config
  • Added a new input argument granule-list-file to input a specific list of granules to process,
    and ignore start-date, end-date, cycles, etc
    • List can be a list of GranuleURs or granule concept-IDs
  • Made arguments --cumulus-configurations and --default-message-config optional in preview mode
  • Added a check before backfilling images to make sure imaging is enabled in the forge-tig configuration
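The granule-list-file entries can mix GranuleURs and granule concept-IDs. A hypothetical sketch of how the two forms could be told apart (the regex and helper are illustrative, not the tool's actual parser):

```python
import re

# Hypothetical classifier: CMR granule concept-IDs look like
# "G1234567890-POCLOUD"; anything else is treated as a GranuleUR.
CONCEPT_ID_RE = re.compile(r"^G\d+-[A-Z0-9_]+$")

def classify(entry):
    return "concept-id" if CONCEPT_ID_RE.match(entry.strip()) else "granule-ur"

for line in ["G1257664524-POCLOUD",
             "20240801000000-NAVO-L2P_GHRSST-SST1m-VIIRS_NPP-v02.0-fv03.0"]:
    print(line, "->", classify(line))
```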

Internal:

  • Updated DB instance size from t2.micro to t3.micro
  • Updated forge-py to 0.4.0
  • Updated cumulus-postworkflow-normalizer to 0.4.1
  • Updated hitide-backfill-lambdas to 0.4.1
  • Updated metadata aggregator to cumulus-metadata-aggregator-8.7.0-alpha.6-SNAPSHOT
  • Updated forge-py memory to 2048 MB
  • Added a forge-py Fargate task
  • Updated the forge and tig workflows to fork based on granule size when determining whether to run on Lambda or Fargate
  • Updated GitHub Actions workflow and versioning
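The size-based fork between Lambda and Fargate can be sketched as follows; the 500 MB cutoff is an assumption for the example, not the deployed threshold:

```python
# Illustrative sketch of the size-based fork: small granules run on the
# Lambda task, large ones on Fargate. The 500 MB cutoff is assumed.
LAMBDA_MAX_BYTES = 500 * 1024 * 1024

def choose_runtime(granule_size_bytes):
    return "lambda" if granule_size_bytes <= LAMBDA_MAX_BYTES else "fargate"

print(choose_runtime(50 * 1024 * 1024))   # small granule
print(choose_runtime(2 * 1024 ** 3))      # large granule
```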

Thank you!

@jamesfwood jamesfwood moved this from 👎 Test Failed to 🔖 Ready in hitide-25.1 Jan 29, 2025
@jamesfwood
Collaborator Author

jamesfwood commented Jan 29, 2025

@davidcolemanjpl ok we did an update to tig to fix that SMAP issue.

Can you please retest SMAP and other normal collections, but exclude SCATSAT1_ESDR_L2_WIND_STRESS_V1.1 in UAT. Please also test forge and forge-py thoroughly, including with those empty VIIRS granules.

New versions:
Backfill Tool - 0.10.0rc31
TIG - 0.13.1rc1

The others are the same. Thanks!!

@jamesfwood
Collaborator Author

@davidcolemanjpl
please retest this too https://jira.jpl.nasa.gov/browse/PODAAC-6139
And any other JIRA tickets related to Backfill-tool or tig, forge, forge-py, etc.

Thanks!

@davidcolemanjpl davidcolemanjpl moved this from 🔖 Ready to 🏗 In progress in hitide-25.1 Feb 1, 2025
@davidcolemanjpl

davidcolemanjpl commented Feb 1, 2025

2/3/25:

podaac-hitide-backfill-tool v0.10.0rc31
podaac-app-services-uat-1858
HiTIDE-UI UAT v4.17.3-rc.2
HiTIDE-Profile v4.10.1-rc.16
TIG v0.13.1rc1
forge v0.11.0-rc.3

SMAP_RSS_L2_SSS_V6 - PASSED - podaac-services-uat-hitide-backfill-tig executions no longer fail at the 'ImageProcess' step as before.

VIIRS_NPP-NAVO-L2P-v3.0 - Conditional PASS - (FORGE/TIG errors observed) - DEV designed it to fail on empty granules. "Now it should fail much quicker. This is normal behavior" (PODAAC-6139)

Reviewed the other respective FORGE/FORGE-py/TIG collections as well in this backfill-tool version.
Verified the CLI changes, including the new input argument granule-list-file.

NOTE: Bugs are still an issue in this backfill-tool version:
PODAAC-6695
PODAAC-4780

Marking this ticket as Done at this time.

@davidcolemanjpl davidcolemanjpl moved this from 🏗 In progress to ✅ Done in hitide-25.1 Feb 3, 2025