
perf(issues): getting users seen counts from cache rather than directly from Snuba #69556


Merged
merged 20 commits into from
Apr 25, 2024

Conversation

vartec
Contributor

@vartec vartec commented Apr 23, 2024

At its most basic, this is an optimization to avoid going to Snuba for every new event in the issue.
While the number of events seen since the last unique-users check is less than the number of additional users needed to reach the threshold, we can be sure the threshold has not been reached.

Caveat: when the cached count is just one user away from triggering, we'll still call Snuba on every new event in the issue until a new user is seen.
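The mechanism described above can be sketched as follows. This is a minimal sketch with hypothetical names: the actual PR uses Sentry's Django cache client and Snuba query helpers, and a tiny in-memory cache stands in for them here.

```python
class InMemoryCache:
    """Tiny stand-in for django.core.cache, just for this sketch."""
    def __init__(self):
        self._data = {}
    def get(self, key, default=None):
        return self._data.get(key, default)
    def set(self, key, value, timeout=None):
        self._data[key] = value
    def decr(self, key):
        self._data[key] = self._data.get(key, 0) - 1
        return self._data[key]

cache = InMemoryCache()
CACHE_KEY = "groupsnooze:v1:{group_id}:events_to_skip"  # hypothetical key

def user_threshold_reached(group_id, threshold, query_snuba_user_count):
    """Return True once `threshold` unique users have seen the group."""
    key = CACHE_KEY.format(group_id=group_id)
    # While we can still safely skip events, do so without touching Snuba.
    if cache.get(key, 0) > 0:
        cache.decr(key)
        return False
    # Authoritative (and expensive) unique-user count from Snuba.
    users_seen = query_snuba_user_count(group_id)
    if users_seen >= threshold:
        return True
    # Each missing user needs at least one new event, so the next
    # (threshold - users_seen - 1) events cannot possibly reach the
    # threshold; query again only on the event after that.
    cache.set(key, threshold - users_seen - 1, timeout=3600)
    return False
```

Note that the caveat falls out of the arithmetic: when `users_seen == threshold - 1`, the skip budget is zero, so Snuba is queried on every event until a new user appears.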

@vartec vartec requested a review from wedamija April 23, 2024 23:52
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Apr 23, 2024
Bartek Ogryczak added 2 commits April 23, 2024 16:53
Member

@wedamija wedamija left a comment


I'm wondering if we should also clear the cache explicitly when we hit the threshold, and maybe if a user modifies the snooze?

@@ -80,6 +80,7 @@ def register_temporary_features(manager: FeatureManager):
manager.add("organizations:grouping-suppress-unnecessary-secondary-hash", OrganizationFeature, FeatureHandlerStrategy.INTERNAL)
manager.add("organizations:grouping-title-ui", OrganizationFeature, FeatureHandlerStrategy.REMOTE)
manager.add("organizations:grouping-tree-ui", OrganizationFeature, FeatureHandlerStrategy.REMOTE)
manager.add("organizations:groupsnooze-cached-counts", OrganizationFeature, FeatureHandlerStrategy.OPTION)
Member


🥳

@@ -88,13 +91,7 @@ def is_valid(self, group=None, test_rates=False, use_pending_data=False):
if self.user_window:
if not self.test_user_rates():
Member


Ahh, I was confused. I guess we also need caching like this here, but that could be a follow-up.

@vartec
Contributor Author

vartec commented Apr 24, 2024

I'm wondering if we should also clear the cache explicitly when we hit the threshold, and maybe if a user modifies the snooze?

Is modifying a snooze actually modifying the existing GroupSnooze, or does it delete and recreate it?

In either case, what's kept in the cache is the count of events seen since the last check, so it's independent of the GroupSnooze params.

Member

@wedamija wedamija left a comment


lgtm, this approach makes sense


if cache.get(cache_key, float("inf")) < threshold - 1:
# if we've seen less than that many events, we can't possibly have seen enough users
cache.incr(cache_key)
Member


I think we should also look at the post-incr value that is returned, and if that's above threshold - 1, then we query. This avoids a race condition where two processes fetch the same value, both increment, and one ends up over the threshold.

Contributor Author


Good callout. I've changed it to increment and use the returned value before the if check.
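The fix agreed on here, relying on the atomic increment's return value instead of a separate get, might look roughly like this. Names are hypothetical, and a small fake cache stands in for Django's cache client (whose `incr` raises `ValueError` when the key is missing).

```python
def should_query_snuba(cache, cache_key, threshold):
    """Decide whether to query Snuba, using only the atomic incr result.

    Using the post-increment value (instead of get-then-incr) closes the
    race where two processes read the same count, both increment, and one
    slips past the threshold without ever querying.
    """
    try:
        events_seen = cache.incr(cache_key)  # atomic on Redis/memcached
    except ValueError:
        # Key missing or expired: no cached bound, so query Snuba.
        return True
    # Query once enough events have accumulated that the user threshold
    # could plausibly have been reached; otherwise keep skipping.
    return events_seen > threshold - 1


class FakeCache:
    """Tiny in-memory stand-in mimicking django.core.cache's incr()."""
    def __init__(self, data=None):
        self.data = dict(data or {})
    def incr(self, key):
        if key not in self.data:
            raise ValueError(f"Key '{key}' not found")
        self.data[key] += 1
        return self.data[key]
```

Because each process sees a distinct post-increment value, at most one of two racing processes can land exactly on the threshold boundary, and whichever crosses it queries Snuba.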

@vartec vartec marked this pull request as ready for review April 25, 2024 20:41
@vartec vartec requested review from anonrig and wedamija April 25, 2024 20:41
Member

@wedamija wedamija left a comment


I think this is good to go, just needs a couple tests

@vartec vartec enabled auto-merge (squash) April 25, 2024 23:06
@vartec vartec merged commit f9f9d4c into master Apr 25, 2024
47 checks passed
@vartec vartec deleted the vartec/reduce-snoozedgroup-snuba-queries branch April 25, 2024 23:22

sentry-io bot commented Apr 26, 2024

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

  • ‼️ RateLimitExceeded: Query on could not be run due to allocation policies, details: {'ConcurrentRateLimitAllocationPol... sentry.tasks.post_process.post_process_group View Issue
  • ‼️ RateLimitExceeded: Query on could not be run due to allocation policies, details: {'ConcurrentRateLimitAllocationPol... sentry.tasks.post_process.post_process_group View Issue
  • ‼️ RateLimitExceeded: Query on could not be run due to allocation policies, details: {'ConcurrentRateLimitAllocationPol... sentry.tasks.post_process.post_process_group View Issue
  • ‼️ SnubaError: HTTPConnectionPool(host='127.0.0.1', port=10006): Max retries exceeded with url: /events/snql (Ca... sentry.tasks.post_process.post_process_group View Issue
  • ‼️ SnubaError: HTTPConnectionPool(host='127.0.0.1', port=10006): Max retries exceeded with url: /events/snql (Ca... sentry.tasks.post_process.post_process_group View Issue


vartec pushed a commit that referenced this pull request Apr 30, 2024
…#69939)

Reducing unnecessary Snuba queries by only querying once the Redis count
is reached.
Follows exactly the same pattern as
#69556
vartec pushed a commit that referenced this pull request May 1, 2024
Cleaning up #69556 and
#69939

Also changing cached `value` to more meaningful names.
@github-actions github-actions bot locked and limited conversation to collaborators May 17, 2024
Labels
Scope: Backend Automatically applied to PRs that change backend components
3 participants