-
-
Notifications
You must be signed in to change notification settings - Fork 4.3k
fix(perf-issues): Use parameterized URL for similarities on N+1 API Calls #89362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
src/sentry/utils/performance_issues/detectors/n_plus_one_api_calls_detector.py
Outdated
Show resolved
Hide resolved
Codecov ReportAttention: Patch coverage is ✅ All tests successful. No failed tests found.
Additional details and impacted files@@ Coverage Diff @@
## master #89362 +/- ##
==========================================
- Coverage 85.26% 85.26% -0.01%
==========================================
Files 10172 10172
Lines 574286 574301 +15
Branches 22636 22636
==========================================
+ Hits 489671 489680 +9
- Misses 84070 84076 +6
Partials 545 545 |
🤔 This looks pretty good to me! Interesting find, for sure. I don't remember why the span similarity is done using hashes rather than parameterized URLs. I'm guessing it's something that slipped through the cracks in #43506 and I never noticed. Comparing the parameterized URLs seems okay to me, it's just hard to predict exactly what'll happen, and I'm paranoid. I think this is good? re: sentry/src/sentry/event_manager.py Line 2451 in 321dc2c
I think you'd have to be pretty careful about changing those. N+1 API Calls won't be affected since it makes its own hashes for fingerprints, but other detectors might rely on These span hashes are also used for the "suspect spans" feature, though I don't know if that feature is going to stick around Lastly, I'm not sure if you're worried about rolling this change out (I'm not sure if you need to be, but maybe?) but there's always a risk of creating a bunch of unwanted issues for people. If you want to be rigorous, what we've done in the past is copy/paste a detector, register it, run it silently for a while in parallel with the original and check the output via telemetry. If the output looks good, then you can add those changes to the current detector. I'm not saying this PR needs that, but it's worth considering. |
f2f114f
to
b4b185a
Compare
fca2156
to
749da92
Compare
Closing in favour of #90070 |
…90070) For context, I'm auditing the current performance issue detectors and will likely be making some changes to them. Risky changes like touching fingerprinting should probably run silently beforehand to ensure we catch false positives or fingerprint explosion (see [guidance](#89362 (comment))). In this PR, I'm creating a structure to make these experiments easier to review and colocated since I'll be working on different detectors in parallel. Going forward, my workflow will probably be the following: 1. Copy the detector and test files into `/experiments` and set up a dummy group type. This copy will be identical to the current detector. (1st PR, This one) 2. Apply experimental changes in a follow up PR (2nd PR, making it easier to review) 3. Add the options via `sentry createissueflag` ([docs](https://develop.sentry.dev/backend/issue-platform/#releasing-your-issue-type)) to run in prod (3rd PR, in `options-automator`) 5. If needed, follow up with a fix/change to the experimental detector. Repeat as needed 6. When happy with the output, apply changes to the real detector and delete experiment files. The reason I'm being so formal about this all is because I'll be parallelizing these experiments so I gotta keep this straight for myself as well 👍
Sanity check here, but I believe these parameterized URLs are already being used for fingerprinting, but they are not being used for checking similarity. Instead we are checking hashes from
span_hashes
which is generated from a pinned function which does not attempt to parameterize, meaning the hash of/posts/1/comments/
is not going to match/posts/2/comments/
.Rather than hashing the URLs, we can compare them directly after parameterization, in order to increase how often we can fingerprint. Added tests for this new behaviour and adjusted some previous ones.
As per the recommendation, I think I'm going to have this shadow for a while before actually implementing this to ensure we don't cause any fingerprint explosions.
This will allow for N+1 API calls to trigger when we process a transaction that does something like:
rather than just for query params like: