Sync Temporary Goobi Ingest objects with Preservica #2465
Comments
We should bring this into sprint, as we are working on this. I have rewritten this ticket, as Law and Medical objects are being ingested into Preservica right now by DPS. The Music and MSSA items are already in Preservica, so they could be reassociated now. I would be happy to take this on. |
Music and MSSA items have now been updated with Preservica info in Prod. Next steps are to get the Law and Medical content into Preservica and connected up |
Music batch is actually taking a while, so need to keep an eye on this https://collections.library.yale.edu/management/batch_processes/13334 |
Music update batch process stalled at OID 32205563. Because this parent failed, the batch itself appears to have stalled. There were 88 parents on the manifest to update. It got to number 25 (which failed) and then did not process any more. We should discuss whether this is intended functionality. If so, the job needs more status reporting to make it clear that one parent failed and that the rest of the batch was therefore not processed. Batch: https://collections.library.yale.edu/management/batch_processes/13334 Remaining parents to update: Music_parent_preservica_update.csv Will run an updated manifest with the remaining parents. |
Update on the above: it looks like the batch failed due to Preservica API availability issues. DPS are investigating a fix. Even though the first 24 parents say they completed, when you go into the parents it looks as if neither the SOLR records nor the PDFs have regenerated. So the fix is probably to resolve the Preservica API issues and then run the whole parent update job for Music again |
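For the discussion of whether a single failed parent should halt the batch, here is a minimal sketch of the alternative behaviour: keep going after a failure and report per-parent status at the end. This is not the DCS implementation; `run_batch`, `BatchReport`, and the `update_parent` callable are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class BatchReport:
    """Per-parent outcome of a batch run (hypothetical structure)."""
    succeeded: List[int] = field(default_factory=list)
    failed: Dict[int, str] = field(default_factory=dict)  # oid -> error message

    def summary(self) -> str:
        total = len(self.succeeded) + len(self.failed)
        return f"{len(self.succeeded)}/{total} parents updated, {len(self.failed)} failed"


def run_batch(oids: Iterable[int], update_parent: Callable[[int], None]) -> BatchReport:
    """Update each parent in turn; record failures instead of stopping the batch."""
    report = BatchReport()
    for oid in oids:
        try:
            update_parent(oid)  # assumed to raise on failure
        except Exception as exc:
            # Record the failure and move on to the next parent on the manifest.
            report.failed[oid] = str(exc)
            continue
        report.succeeded.append(oid)
    return report
```

With a report like this, the batch page could show which parent failed and confirm that the remaining parents were still attempted, rather than leaving the job looking stalled.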
Waiting for previous ticket to close |
Issue with https://collections.library.yale.edu/management/parent_objects/10022080 but different to the one before. This was a parent I tried to update previously, and it brought in the new children from Preservica but also left the existing children. Tried a straight resync and it did nothing https://collections.library.yale.edu/management/batch_processes/13493. It said that DCS matched Preservica? So then I cleared out all of the existing children for this parent, so the parent was left with 0 children. Then ran a resync with Preservica. This brought in the children, but their sort order does not match Preservica. Parent is still not displaying in Blacklight, but I don't know if this is because jobs are still running https://collections.library.yale.edu/catalog/10022080. There is a large backlog of jobs (mainly PDF jobs) which might be holding this up |
Testing script and batch process CSVs to test the Preservica issues in DCS UAT - for @K8Sewell Testing script for DCSPreservica Issues - 08162023.pdf |
Getting some Preservica errors (https://collections-uat.library.yale.edu/management/batch_processes/1504) but still investigating. Will retry the process and see if I can discern what is hanging us up. |
Testing Results Update Parent Script - Failed - kept old child objects instead of removing them. Will craft some tests that should reveal why they are not being removed as expected. |
PR ready for review - yalelibrary/yul-dc-management#1247 |
Deployed to Test with release v2.63.1 but will need to be deployed to UAT for testing. |
PR ready for review yalelibrary/yul-dc-management#1251 It's not elegant but it will get us past the issue we had with the last attempt. |
Deployed to UAT for testing with release v2.63.2 |
I think the issue is fixed. While there was an error raised because of a checksum mismatch, the parent object 900124050 now matches with the 46 child objects in Test Preservica for structural object ...76868 and they appear to be in the correct order as well. I'm currently testing the other parent object 900099833 up for update testing. The before screenshot below shows both the old and the Preservica child objects, but hopefully once this object has processed (waiting on a few delayed jobs) we will see only the expected 54 child objects for structural object ...babeb. Before |
@K8Sewell I have reported the Preservica Test outage to our digital preservation folks. They will work on a fix |
Need to roll work into PROD but still keep ticket for other things. Can split out. |
Spawning jobs again. |
I tried to resync this object again in Production and it is still not working as expected. The notable issue here is the sort order is still wrong https://collections.library.yale.edu/management/parent_objects/10022080. Additionally the parent will not display in Blacklight. Is this possibly just an issue with this parent we need to fix? |
Can we change the Bitstream filename over in Preservica? That's how the caption and ordering are created and thus what is throwing the sort order off. I am unable to find the matching record in Preservica test, so if anyone has a link to that it would be greatly appreciated. I'd like to confirm the bitstream filename matches what is in Preservica for parent object 10022080, change it from _1 to _01 so that it captures the correct ordering, and then try updating the parent to confirm the sort order gets fixed. In the meantime I can draft up some logic that will adjust the filename to avoid this order issue, but it feels like a bit of overkill now that I have an idea why the sort order was incorrect. |
11 days of work as of 9.18. As per David, "so it looks like the API returns as lexicographic sorting, which is alphabetic sorting of the numbers instead of numerically". Will break adjusting the sorting we interpret from the API into another ticket #2621 |
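To illustrate the lexicographic versus numeric ordering problem described above, here is a small sketch of a "natural" sort key that compares digit runs as integers. The filenames are made up for the example (the real bitstream names are not shown in this thread), and `natural_key` is a hypothetical helper, not code from yul-dc-management.

```python
import re


def natural_key(name: str):
    """Split a filename into text and digit runs so digits compare numerically."""
    # re.split with a capturing group keeps the digit runs in the result,
    # alternating text, digits, text, ... so types line up position by position.
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


# Hypothetical bitstream filenames, in the style of the _1 / _01 suffix discussed above.
names = ["page_1.tif", "page_10.tif", "page_2.tif"]

lexicographic = sorted(names)               # plain string ordering
natural = sorted(names, key=natural_key)    # numeric-aware ordering
```

Here `lexicographic` places `page_10.tif` before `page_2.tif`, while `natural` gives 1, 2, 10. Zero-padding the suffix (`_01`, `_02`, ...) would make plain string ordering agree with the numeric one, which is why renaming in Preservica would also work.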
Waiting for sync issue to be resolved |
Just a note to say that Medical objects are now in Preservica. I may try to resync one or two objects, to see if we have a similar issue as the one MUS object we are trying to resolve. |
Attempted to sync parent 32320833 with Preservica files. Received 'Unable to login' error in DCS. Confirmed with Digital Preservation that the object has the correct security tag in Preservica (one that the DCS user s_dcs_medical should have access to), and that the correct structure and representation type were added on the spreadsheet. Will likely need to investigate on our end, as the Preservica stuff should be fine. https://collections.library.yale.edu/management/batch_processes/14323 |
Still experiencing login issue with Medical. |
Login issue with Medical fixed, will start work on these again. |
For Medical, tried to update parent 32320833. Received the following error: As you can see from the error, I put in "Structural" and not "Information", so I don't understand why it's telling me there is no Information object with that ref but a different type of entity with ref. In Preservica, the object is a Structural Object with a Preservation representation type. As such, it seems like this is a DCS issue. Could we please investigate? |
Created ticket #2691 for the above issue. |
@motropuk Do you remember if your child OIDs were retained when you synced your Goobi objects, or were they replaced? Today, I: Used ‘Update Parent Objects’ batch process to add Preservica information to Parent 32329442, which had one child object (32329443). Parent now has two child objects with the following oids: The caption for both parents is The old child, 32329443, appears to have been deleted; it is no longer in the Child Objects data table. I resynced with Preservica, and it retained both new children with the new OIDs. The folder in Preservica located at the assigned UUID only has one image. We should not add Preservica information for any more Medical objects until we can confirm that (a) the original child OID can be retained and (b) the correct number of children are created for each object. These issues might be solved with #2510 ? |
@sshetenhelm good question. I honestly can't remember, or at least don't know if I checked. I think for the Music objects I was not too concerned if the child OIDs were updated, so didn't pay enough attention to that. |
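The two conditions above, (a) original child OIDs retained and (b) correct number of children, could be checked after each sync with something like the sketch below. It assumes the child OIDs before and after the sync, and the number of images in the Preservica folder, are already available as plain values; `check_resync` is a hypothetical helper and does not call DCS or Preservica.

```python
from typing import Iterable, List


def check_resync(before_oids: Iterable[int], after_oids: Iterable[int],
                 expected_count: int) -> List[str]:
    """Return a list of problems found; an empty list means both checks passed."""
    before, after = list(before_oids), list(after_oids)
    problems = []
    # (a) every original child OID should still be present after the sync
    missing = sorted(set(before) - set(after))
    if missing:
        problems.append(f"original child OIDs not retained: {missing}")
    # (b) the child count should match the number of images in Preservica
    if len(after) != expected_count:
        problems.append(f"expected {expected_count} children, found {len(after)}")
    return problems


# Parent 32329442 had one child (32329443) and ended up with two new children.
# The two new OIDs below are placeholders; the real values are not in this thread.
problems = check_resync([32329443], [900000001, 900000002], expected_count=1)
```

For the case described above, both checks would be flagged: the original OID is gone, and two children exist where Preservica holds one image.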
A sample of Preservica-reassociated Music objects have:
Affected parents include: We should put this ticket back on hold until these issues are rectified. Will push back to backlog and pull in #2510 to fix errors (#2510 includes specifications to retain child OIDs and remove existing child objects without Preservica info). |
The alternative is to decide whether we are comfortable creating new child OIDs for all objects, and then manually deleting the prior "double" images. |
Created #2703 for cleaning up parents |
Follow up to #2426
All objects uploaded with the Goobi Temporary Ingests need to be synced to their files in Preservica. A spreadsheet with these is here https://docs.google.com/spreadsheets/d/1iqsayDnZz_ur8dH5iUHGap9kDfQkn2vMJ1dQms0hXBc/edit#gid=1783466688
Some materials are not yet ingested into Preservica, so these will need to be ingested into Preservica before we can reassociate in DCS. These are
The following are already in Preservica and could be reassociated now