Skip to content

feat(workflow_engine): Add state tracking for priority levels for stateful detectors #91791

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
May 21, 2025

Conversation

saponifi3d
Copy link
Contributor

@saponifi3d saponifi3d commented May 16, 2025

Description

Implement state tracking for threshold changes on detectors.

This is pt 4 of the great stateful detector refactoring. Taking a moment to also document these changes. For now, adding these as inline markdown. (Not sure if we have a way to do docgen from comment headers or not)

Up next, - the final refactoring - make the Stateful handler easier to reckon with grouping.

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label May 16, 2025
@saponifi3d saponifi3d marked this pull request as ready for review May 16, 2025 06:21
@saponifi3d saponifi3d requested a review from a team as a code owner May 16, 2025 06:21
Copy link

codecov bot commented May 16, 2025

Codecov Report

Attention: Patch coverage is 94.21053% with 11 lines in your changes missing coverage. Please review.

⚠️ Parser warning

The parser emitted a warning. Please review your JUnit XML file:

Warning while parsing testcase attributes: Limit of string is 1000 chars, for name, we got 2083 at 1:157236 in /home/runner/work/sentry/sentry/.artifacts/pytest.junit.xml
Files with missing lines Patch % Lines
...ntry/workflow_engine/handlers/detector/stateful.py 83.33% 11 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##           master   #91791    +/-   ##
========================================
  Coverage   87.63%   87.63%            
========================================
  Files       10356    10357     +1     
  Lines      587192   587355   +163     
  Branches    22585    22585            
========================================
+ Hits       514558   514712   +154     
- Misses      72206    72215     +9     
  Partials      428      428            

Copy link
Member

@ceorourke ceorourke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes lgtm, regarding documentation I think we have something for public facing documentation generation but not internal. Typically we just do big docstrings but this is fine as long as people find it

storing counter values.
"""
pass
# Merge all kinds of counters together for the state manager
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: These are already merged in self._thresholds

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way i was thinking about this is that there might be additional counters that the implementer wants to track in their detector handler. this is meant to give the space to allow for that override fairly easily -- but yeah, i feel like this is probs an over optimization, i can simplify to just having threshold counters on the stateful detector for now.

**self._thresholds,
}

self.state_manager = DetectorStateManager(detector, list(counters.keys()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just use list(self._thresholds.keys()) here?

Comment on lines 385 to 443
updated_status_count = (state.counter_updates.get(new_priority) or 0) + 1
self.state_manager.enqueue_counter_update(None, {new_priority: updated_status_count})

if self._thresholds[new_priority] > updated_status_count:
# We haven't met the threshold yet, so don't trigger
return None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing to keep in mind - probably, higher priorities should also increment counts for all lower priorities.

For metric alerts, if we had a threshold of 3, and then we got crit, crit, warn, the warning should fire.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 makes sense. will update.

Comment on lines +293 to +306
def _get_configured_detector_levels(self) -> list[DetectorPriorityLevel]:
# TODO - Is this something that should be provided by the detector itself rather
# than having to query the db for each level?
priority_levels: list[DetectorPriorityLevel] = [level for level in DetectorPriorityLevel]

if self.detector.workflow_condition_group is None:
# TODO - Should this default to _all_ levels or no levels?
return priority_levels

condition_result_levels = self.detector.workflow_condition_group.conditions.filter(
condition_result__in=priority_levels
).values_list("condition_result", flat=True)

return list(DetectorPriorityLevel(level) for level in condition_result_levels)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wedamija this is how i was thinking we could filter to just the levels configure -- any ways we can easily cache this data on the model or should i make a field in the model that looks up all these values and caches it there? 🤔

@saponifi3d saponifi3d requested a review from wedamija May 19, 2025 23:50
@saponifi3d saponifi3d force-pushed the jcallender/aci/ref-stateful-priorities branch from 58fe124 to 3a9cf30 Compare May 20, 2025 00:30
@saponifi3d saponifi3d force-pushed the jcallender/aci/ref-stateful-priorities branch from 3a9cf30 to 2611032 Compare May 20, 2025 00:35
@saponifi3d saponifi3d merged commit 27067a1 into master May 21, 2025
60 checks passed
@saponifi3d saponifi3d deleted the jcallender/aci/ref-stateful-priorities branch May 21, 2025 17:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Scope: Backend Automatically applied to PRs that change backend components
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants