Segment large batch processes #2873
Comments
I know it's a long shot since it was back in June, but does anyone remember which GoodJob error this delete-parent-objects job was receiving? (https://collections-uat.library.yale.edu/management/batch_processes/2039). It would only have been displayed on the main GoodJob dashboard under the job's name.
PR ready for review - yalelibrary/yul-dc-management#1475
Deployed to Test v2.74.2
I feel like something strange is going on with the 'UpdateParentObjects' batch process. It's taking far longer than I would expect just to update a single metadata field. The first parent object received a 'Complete' status, but the rest have no status information.
Looking into why those "Digital Object Source = None" objects are being treated like Preservica objects. That's weird. For the second issue, we always sync from Preservica when we update Preservica objects. I just tried updating one of the "Unable to login" objects with a single-line CSV upload, and it updated the extent_of_digitization successfully with no errors. I'm looking into this. Putting back in progress.
This ticket was spawned from this job (https://collections-uat.library.yale.edu/management/batch_processes/2039), which failed and re-ran multiple times because GoodJob lost its connection and timed out. Instead of segmenting the jobs, it would be cleaner to add more robust error handling by rescuing and recording the specific GoodJob error. The main issue is that we no longer have the original GoodJob error to reference. Putting this in the Backlog. If a future job fails from a lost GoodJob connection, we will have the error to reference and can implement better error handling.
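As a hedged sketch of that "rescue and record the specific error" idea (the `UpdateParentObjectsJob` name comes from the comment above, but the `run_update` method and `error_message` column are illustrative assumptions, not the repository's actual code):

```ruby
class UpdateParentObjectsJob < ApplicationJob
  # Retry transient database-connection failures a few times before
  # letting the job fail outright.
  retry_on ActiveRecord::ConnectionNotEstablished, wait: 5.seconds, attempts: 3

  def perform(batch_process)
    batch_process.run_update # hypothetical: whatever work the job already does
  rescue StandardError => e
    # Record the specific error on the batch process so it can still be
    # referenced after the GoodJob dashboard entry is gone, then re-raise
    # so GoodJob still marks the execution as failed.
    batch_process.update(error_message: "#{e.class}: #{e.message}")
    raise
  end
end
```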
Story
As described in a comment on #2859, a job may sometimes time out and fail before all records in a CSV are processed, which causes some jobs to run multiple times. We would like to change the batch-process behavior to process CSVs in segments of 50 rows at a time, to prevent the process from timing out and re-running.
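A minimal sketch of the 50-row segmenting, assuming a CSV already parsed into row hashes; the `SegmentedUpdateJob` class, file name, and method bodies are placeholders, not the repository's actual code:

```ruby
require "csv"

SEGMENT_SIZE = 50

class SegmentedUpdateJob < ApplicationJob
  # Each invocation handles one segment, so a timeout or lost connection
  # only re-runs up to 50 rows instead of the entire CSV.
  def perform(rows)
    rows.each do |row|
      # ... apply the metadata update for one parent object ...
    end
  end
end

# Enqueue one job per 50-row segment instead of one long-running job.
rows = CSV.read("upload.csv", headers: true).map(&:to_h)
rows.each_slice(SEGMENT_SIZE) do |segment|
  SegmentedUpdateJob.perform_later(segment)
end
```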
This behavior should be applied to the following batch processes:
Acceptance
The following jobs run in segments of 50 rows until completion:
Engineering Notes
Jobs that have batching patterns to pull from:
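For reference, GoodJob itself ships a batch primitive that could serve as one such pattern. A hedged sketch, assuming GoodJob's documented `GoodJob::Batch.enqueue` API with an `on_finish` callback, and reusing the placeholder `rows` and `SegmentedUpdateJob` from the sketch above:

```ruby
class BatchCompleteJob < ApplicationJob
  # GoodJob invokes the on_finish job with the batch and callback params.
  def perform(batch, _params)
    Rails.logger.info("All segments finished for batch #{batch.id}")
  end
end

# Wrap the per-segment jobs in a GoodJob batch so completion (and any
# failures) can be tracked across all segments.
GoodJob::Batch.enqueue(on_finish: BatchCompleteJob) do
  rows.each_slice(50) do |segment|
    SegmentedUpdateJob.perform_later(segment)
  end
end
```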