Skip to content

fix(taskworker) Remove pickle usage from sentryapp hooks #91435

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 12, 2025

Conversation

markstory
Copy link
Member

Redo of #91372 which was reverted because of rate limiting. This time around, use an option to control task variants, and only generate a payload when a hook is going to be sent. For projects with multiple service hooks, this should result in fewer snuba calls as events are only serialized once.

Redo of #91372 which was reverted because of rate limiting.
This time around, use an option to control task variants,
and only generate a payload when a hook is going to be sent.
For projects with multiple service hooks, this should result in fewer
snuba calls as events are only serialized once.
@markstory markstory requested review from a team as code owners May 12, 2025 14:16
@markstory markstory requested a review from a team May 12, 2025 14:16
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label May 12, 2025
Copy link

codecov bot commented May 12, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #91435      +/-   ##
==========================================
- Coverage   87.68%   87.64%   -0.05%     
==========================================
  Files       10320    10313       -7     
  Lines      588042   586255    -1787     
  Branches    22604    22604              
==========================================
- Hits       515639   513815    -1824     
- Misses      71975    72012      +37     
  Partials      428      428              

# only generate the payload once to reduce snuba queries
if payload is None:
# TODO: when we have multiple service hook versions, this will need to change.
payload = json.dumps(get_payload_v0(event))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to serialize the event without doing this call up front? Then we could move all this to the async task function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a way to get the event/project/group id, and then use that to rebuild the event in the webhook task. For projects with multiple hooks, we'll read the event multiple times though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to pass project_id, group_id, and event_id. This approach moves serialization of the event until the hook is delivered. We have to re-read the event from nodestore, but that read should be cached most of the time as the event was recently in post_process.

I've also checked that all the events that go to service hooks are error events, so we don't have to worry about issue occurrences. 🤞

Copy link
Contributor

@Christinarlong Christinarlong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

O nice change! I'm surprised the current implementation doesn't currently cause a bunch of rate limited errors (?). Like serializing Group -> snuba call so we may have to live with rate limited errors 🤷‍♀️ I think sentry apps' event webhooks (issue.created) also has this rate limited issue which we ignore :oopsie:

I guess the question here is, I think it's likely we'll still get rate limited errors, so at what threshold do we tolerate having the error happen vs must address?

@markstory
Copy link
Member Author

I'm surprised the current implementation doesn't currently cause a bunch of rate limited errors (?). Like serializing Group -> snuba call so we may have to live with rate limited errors 🤷‍♀️

I think we're squeaking by because not all orgs/projects use sentryapps. If they did, we would be facing rate limits for sure 😢

I guess the question here is, I think it's likely we'll still get rate limited errors, so at what threshold do we tolerate having the error happen vs must address?

I think as long as we stay above SLO targets we're good 😄 With the current approach we can't entirely avoid rate limits.

@markstory markstory merged commit 6c90baa into master May 12, 2025
60 checks passed
@markstory markstory deleted the fix-sentryapp-param-redo branch May 12, 2025 21:12
Copy link

sentry-io bot commented May 13, 2025

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

  • ‼️ ReadTimeout: SafeHTTPSConnectionPool(host='applications.zoom.us', port=443): Read timed out. (read timeout=5) sentry.sentry_apps.tasks.service_hooks.process_... View Issue
  • ‼️ TypeError: Task <@task: sentry.sentry_apps.tasks.service_hooks.process_service_hook of sentry at 0x7bb24e7e52b0> was invoked with an object that we do not want to pass via pickle (<sentry.eventstore.models.GroupEvent object at 0x7bb1ef9ae660>, reason is do not pic... sentry.sentry_apps.tasks.service_hooks.process_... View Issue
  • ‼️ ReadTimeout: SafeHTTPSConnectionPool(host='applications.zoom.us', port=443): Read timed out. (read timeout=5) sentry.sentry_apps.tasks.service_hooks.process_... View Issue
  • ‼️ RateLimitExceeded: Query on could not be run due to allocation policies, info: {'details': {'ConcurrentRateLimitAllocationPolicy': {'can_run': False, 'max_threads': 0, 'explanation': {'reason': 'concurrent policy 19 exceeds limit of 18', 'overrides': {}, 'storage_key': 'S... sentry.sentry_apps.tasks.service_hooks.process_... View Issue
  • ‼️ ReadTimeout: SafeHTTPSConnectionPool(host='applications.zoom.us', port=443): Read timed out. (read timeout=5) sentry.sentry_apps.tasks.service_hooks.process_... View Issue

Did you find this useful? React with a 👍 or 👎

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Scope: Backend Automatically applied to PRs that change backend components
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants