Skip to content

fix(hybrid-cloud): Adds support for cross database tombstones #69480

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Apr 25, 2024

Conversation

GabeVillalobos
Copy link
Member

@GabeVillalobos GabeVillalobos commented Apr 23, 2024

Adds cross-database tombstone deletion support

The current tombstone code performs a join between the model and
tombstone tables to construct a minimal set of IDs to update or delete
from the target table.

Because this isn't possible when the target model lives in a different
database (e.g. crons monitors), we have to take a 2 step approach:

  1. Query the tombstone table for the current IDs list matching the
    watermarking batch.
  2. Query the model table for all rows with a matching identifier on the
    target field

There are possible performance concerns with the model-side query, so
I've placed this behind a new option named
hyrid_cloud.allow_cross_db_tombstones. If the flag is disabled and a
cross db tombstone cleanup is attempted, an explicit exception is raised
noting why the cleanup failed.

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Apr 23, 2024
@GabeVillalobos GabeVillalobos force-pushed the gv/adds-support-for-cross-db-tombstoning branch from dce5a74 to f1835b9 Compare April 23, 2024 16:38
Copy link

codecov bot commented Apr 23, 2024

Codecov Report

Attention: Patch coverage is 93.10345% with 2 lines in your changes are missing coverage. Please review.

Project coverage is 79.84%. Comparing base (47b4e5b) to head (0b6ecdc).
Report is 3 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #69480   +/-   ##
=======================================
  Coverage   79.84%   79.84%           
=======================================
  Files        6466     6466           
  Lines      287413   287435   +22     
  Branches    49535    49538    +3     
=======================================
+ Hits       229483   229507   +24     
+ Misses      57508    57506    -2     
  Partials      422      422           
Files Coverage Δ
src/sentry/options/defaults.py 100.00% <100.00%> (ø)
src/sentry/tasks/deletion/hybrid_cloud.py 92.51% <92.85%> (-0.35%) ⬇️

... and 7 files with indirect coverage changes

Copy link
Member

@markstory markstory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks correct to me.

watermark_batch: WatermarkBatch,
) -> tuple[list[int], datetime.datetime]:
"""
Queries the database or databases if spanning multiple), and returns
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Queries the database or databases if spanning multiple), and returns
Queries the database or databases if spanning multiple, and returns

Comment on lines +381 to +385
[cascade_data[0]["monitor"].id, cascade_data[1]["monitor"].id],
),
(
{"low": in_order_tombstones[1].id, "up": in_order_tombstones[2].id},
[cascade_data[2]["monitor"].id],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If cascade_data has four sets of data in it, and all four users were deleted why are only three monitors being deleted?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's a good point, let me fix this to span 3 instead. I'll also add a case that tests that the full set is deleted if the watermark low is set to -1

@GabeVillalobos
Copy link
Member Author

Some manual testing notes:

  1. I set up a local instance running in split silo mode
  2. I created a few users attached to the same organization, created monitors with different owners and validated the owner_user_ids
  3. I deleted the monitor owner who was also the organization owner.
  4. I validated that the organization ownership was passed appropriately.
  5. I validated that the correct tombstone was generated for the auth_user model affected, and checked that the user IDs were nulled out successfully.
  6. I validated from the UI that affected monitors had no associated owners.

Going to merge this change tomorrow now that the tombstone behavior has been tested.

@GabeVillalobos GabeVillalobos merged commit 2b93ffe into master Apr 25, 2024
49 checks passed
@GabeVillalobos GabeVillalobos deleted the gv/adds-support-for-cross-db-tombstoning branch April 25, 2024 16:23
@GabeVillalobos GabeVillalobos added the Trigger: Revert Add to a merged PR to revert it (skips CI) label Apr 25, 2024
@getsentry-bot
Copy link
Contributor

PR reverted: 6e7147f

getsentry-bot added a commit that referenced this pull request Apr 25, 2024
…#69480)"

This reverts commit 2b93ffe.

Co-authored-by: GabeVillalobos <5643012+GabeVillalobos@users.noreply.github.com>
@github-actions github-actions bot locked and limited conversation to collaborators May 11, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Scope: Backend Automatically applied to PRs that change backend components Trigger: Revert Add to a merged PR to revert it (skips CI)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants