🤖 Optimize Snuba Query Size #67247

Closed

seer-by-sentry[bot]
Contributor

👋 Hi there! This PR was automatically generated 🤖

Triggered by tillman.elser@sentry.io

Fixes SENTRY-2Z31

The issue is a 'QuerySizeExceeded' exception, raised when a query sent to ClickHouse exceeds the maximum allowed query size. To resolve this, this PR checks the query size before execution and splits oversized queries into smaller chunks where necessary.

The steps that were performed:

  1. Implement query size check and splitting in _bulk_snuba_query
  2. Add utility function to estimate query size
  3. Add utility function to split large queries
  4. Modify _bulk_snuba_query to use the new utility functions
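
The steps above can be illustrated with a minimal sketch. This is not the code from this PR: `estimate_query_size`, `split_query`, the `MAX_QUERY_SIZE_BYTES` value, and the dict-shaped query are all hypothetical and chosen only for illustration. It assumes the query size can be approximated by the byte length of its JSON serialization and that the query carries one large list (for example, a list of IDs) that can be halved.

```python
from __future__ import annotations

import json
from typing import Any

# Hypothetical limit for illustration; the real limit is configured server-side.
MAX_QUERY_SIZE_BYTES = 256 * 1024


def estimate_query_size(query: dict[str, Any]) -> int:
    """Approximate a query's size as the byte length of its JSON form."""
    return len(json.dumps(query).encode("utf-8"))


def split_query(
    query: dict[str, Any], key: str, max_size: int = MAX_QUERY_SIZE_BYTES
) -> list[dict[str, Any]]:
    """Recursively halve the list stored at query[key] until each piece fits."""
    if estimate_query_size(query) <= max_size:
        return [query]
    values = query[key]
    if len(values) <= 1:
        # Nothing left to halve: the query is too large for another reason.
        raise ValueError("query cannot be split further")
    mid = len(values) // 2
    left = {**query, key: values[:mid]}
    right = {**query, key: values[mid:]}
    return split_query(left, key, max_size) + split_query(right, key, max_size)


# Usage: split a query carrying 1000 IDs into chunks of at most 500 bytes each.
query = {"dataset": "events", "project_ids": list(range(1000))}
chunks = split_query(query, "project_ids", max_size=500)
```

Splitting like this is only safe when the partial results can be merged afterwards. Queries with aggregations, ordering, or limits would need extra handling, which is one reason a change of this kind touches a lot of behavior.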

📣 Instructions for the reviewer which is you, yes you:

  • If these changes were incorrect, please close this PR and comment explaining why.
  • If these changes were incomplete, please continue working on this PR then merge it.
  • If you are feeling confident in my changes, please merge this PR.

This will greatly help us improve the autofix system. Thank you! 🙏

If there are any questions, please reach out to the AI/ML Team on #proj-autofix

🤓 Stats for the nerds:

Prompt tokens: 191577
Completion tokens: 4178
Total tokens: 195755


sentry-io bot commented Mar 19, 2024

🔍 Existing Issues For Review

Your pull request is modifying functions with the following pre-existing issues:

📄 File: src/sentry/utils/snuba.py

Function | Unhandled Issue | Event Count
--- | --- | ---
_apply_cache_and_build_results | RateLimitExceeded: Query on could not be run due to allocation policies, details: {'ConcurrentRateLimitAllocationPol... ... | 2.6k
_apply_cache_and_build_results | RateLimitExceeded: Query on could not be run due to allocation policies, details: {'ConcurrentRateLimitAllocationPol... ... | 394
_apply_cache_and_build_results | RateLimitExceeded: Query on could not be run due to allocation policies, details: {'ReferrerGuardRailPolicy': {'can_... ... | 210
_apply_cache_and_build_results | QuerySizeExceeded: DB::Exception: Received from snuba-errors-tiger-mz-2-2:9000. DB::Exception: Syntax error: failed ... ... | 98

Did you find this useful? React with a 👍 or 👎

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Mar 19, 2024

codecov bot commented Mar 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 68.43%. Comparing base (e8612e2) to head (8c04af4).
Report is 949 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##           master   #67247       +/-   ##
===========================================
- Coverage   84.33%   68.43%   -15.91%     
===========================================
  Files        5307     5305        -2     
  Lines      237145   236917      -228     
  Branches    41014    40981       -33     
===========================================
- Hits       199990   162126    -37864     
- Misses      36937    74567    +37630     
- Partials      218      224        +6     
Files | Coverage Δ
--- | ---
src/sentry/utils/snuba.py | 84.21% <ø> (-9.38%) ⬇️

... and 1548 files with indirect coverage changes

@wedamija
Member

wedamija commented Mar 19, 2024

It has a decent idea here. The implementation is bad, but the idea of auto-splitting queries so we don't go over the max isn't bad. I'm not sure we'd commit such a wide-ranging change (even if correctly implemented) without a lot of caution.

@getsantry
Contributor

getsantry bot commented Apr 10, 2024

This pull request has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you add the label WIP, I will leave it alone unless WIP is removed ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀

@getsantry getsantry bot added the Stale label Apr 10, 2024
@getsantry getsantry bot closed this Apr 18, 2024
@github-actions github-actions bot locked and limited conversation to collaborators May 3, 2024
Labels
Scope: Backend Automatically applied to PRs that change backend components Stale