
feat(eap-api): support conditional aggregations in SELECT #6870

Merged
merged 15 commits into from
Feb 18, 2025

Conversation

@xurui-c (Member) commented Feb 10, 2025

https://github.com/getsentry/eap-planning/issues/166

We now support conditional aggregations for EndpointTraceItemTable. Time series support comes next.
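To illustrate the semantics being added: a conditional aggregation aggregates only the rows that satisfy a condition, analogous to ClickHouse's `-If` combinators (e.g. `avgIf(value, cond)`) that snuba queries compile down to. A minimal, hypothetical sketch of that behavior (not snuba code):

```python
def avg_if(rows, value, cond):
    """Average `value(row)` over only the rows where `cond(row)` holds,
    mirroring ClickHouse's avgIf(value, cond) combinator."""
    matched = [value(r) for r in rows if cond(r)]
    return sum(matched) / len(matched) if matched else None

rows = [
    {"category": "db", "duration": 10.0},
    {"category": "http", "duration": 30.0},
    {"category": "db", "duration": 20.0},
]
# avg(duration) restricted to category == "db" -> (10.0 + 20.0) / 2
result = avg_if(rows, lambda r: r["duration"], lambda r: r["category"] == "db")
```

When no filter is supplied, the condition defaults to true and the result matches a plain aggregation over all rows.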

@xurui-c xurui-c force-pushed the rachel/aggregateIf branch 2 times, most recently from 3224936 to 4217164 Compare February 10, 2025 22:19
codecov bot commented Feb 10, 2025

❌ 1 test failed:

Tests completed: 2882 | Failed: 1 | Passed: 2881 | Skipped: 11
View the top 1 failed test(s) by shortest run time
tests.web.rpc.v1.visitors.test_sparse_aggregate_attribute_transformer::test_basic
Stack Traces | 0.002s run time
Traceback (most recent call last):
  File ".../v1/visitors/test_sparse_aggregate_attribute_transformer.py", line 79, in test_basic
    assert transformed.filter == TraceItemFilter(
AssertionError: assert exists_filter {\n  key {\n    type: TYPE_STRING\n    name: "sentry.category"\n  }\n}\n == and_filter {\n  filters {\n    exists_filter {\n      key {\n        type: TYPE_STRING\n        name: "sentry.category"\n      }\n    }\n  }\n  filters {\n    or_filter {\n      filters {\n        exists_filter {\n          key {\n            type: TYPE_DOUBLE\n            name: "my.float.field"\n          }\n        }\n      }\n      filters {\n        exists_filter {\n          key {\n            type: TYPE_DOUBLE\n            name: "my.float.field"\n          }\n        }\n      }\n    }\n  }\n}\n
 +  where exists_filter {\n  key {\n    type: TYPE_STRING\n    name: "sentry.category"\n  }\n}\n = meta {\n  organization_id: 1\n  cogs_category: "something"\n  referrer: "something"\n  project_ids: 1\n  project_ids: 2\n  project_ids: 3\n  start_timestamp {\n    seconds: 1739530800\n  }\n  end_timestamp {\n    seconds: 1739530800\n  }\n  trace_item_type: TRACE_ITEM_TYPE_SPAN\n}\ncolumns {\n  key {\n    type: TYPE_STRING\n    name: "location"\n  }\n}\ncolumns {\n  aggregation {\n    aggregate: FUNCTION_MAX\n    key {\n      type: TYPE_DOUBLE\n      name: "my.float.field"\n    }\n    label: "max(my.float.field)"\n    extrapolation_mode: EXTRAPOLATION_MODE_NONE\n  }\n}\ncolumns {\n  aggregation {\n    aggregate: FUNCTION_AVG\n    key {\n      type: TYPE_DOUBLE\n      name: "my.float.field"\n    }\n    label: "avg(my.float.field)"\n    extrapolation_mode: EXTRAPOLATION_MODE_NONE\n  }\n}\nfilter {\n  exists_filter {\n    key {\n      type: TYPE_STRING\n      name: "sentry.category"\n    }\n  }\n}\norder_by {\n  column {\n    key {\n      type: TYPE_STRING\n      name: "location"\n    }\n  }\n}\ngroup_by {\n  type: TYPE_STRING\n  name: "location"\n}\nlimit: 5\n.filter
 +  and   and_filter {\n  filters {\n    exists_filter {\n      key {\n        type: TYPE_STRING\n        name: "sentry.category"\n      }\n    }\n  }\n  filters {\n    or_filter {\n      filters {\n        exists_filter {\n          key {\n            type: TYPE_DOUBLE\n            name: "my.float.field"\n          }\n        }\n      }\n      filters {\n        exists_filter {\n          key {\n            type: TYPE_DOUBLE\n            name: "my.float.field"\n          }\n        }\n      }\n    }\n  }\n}\n = TraceItemFilter(and_filter=filters {\n  exists_filter {\n    key {\n      type: TYPE_STRING\n      name: "sentry.category"\n    }\n  }\n}\nfilters {\n  or_filter {\n    filters {\n      exists_filter {\n        key {\n          type: TYPE_DOUBLE\n          name: "my.float.field"\n        }\n      }\n    }\n    filters {\n      exists_filter {\n        key {\n          type: TYPE_DOUBLE\n          name: "my.float.field"\n        }\n      }\n    }\n  }\n}\n)
 +    where filters {\n  exists_filter {\n    key {\n      type: TYPE_STRING\n      name: "sentry.category"\n    }\n  }\n}\nfilters {\n  or_filter {\n    filters {\n      exists_filter {\n        key {\n          type: TYPE_DOUBLE\n          name: "my.float.field"\n        }\n      }\n    }\n    filters {\n      exists_filter {\n        key {\n          type: TYPE_DOUBLE\n          name: "my.float.field"\n        }\n      }\n    }\n  }\n}\n = AndFilter(filters=[exists_filter {\n  key {\n    type: TYPE_STRING\n    name: "sentry.category"\n  }\n}\n, or_filter {\n  filters {\n    exists_filter {\n      key {\n        type: TYPE_DOUBLE\n        name: "my.float.field"\n      }\n    }\n  }\n  filters {\n    exists_filter {\n      key {\n        type: TYPE_DOUBLE\n        name: "my.float.field"\n      }\n    }\n  }\n}\n])


@xurui-c xurui-c marked this pull request as ready for review February 11, 2025 21:43
@xurui-c xurui-c requested review from a team as code owners February 11, 2025 21:43
@davidtsuk (Contributor) left a comment

I think you also need to update the confidence interval calculations to include the condition. Also, could you please add a test for extrapolated conditional aggregates?

snuba/web/db_query.py Outdated Show resolved Hide resolved
snuba/web/rpc/v1/endpoint_trace_item_table.py Outdated Show resolved Hide resolved
@kylemumma (Member) left a comment

I think you should clean up those print statements lying around, but other than that LGTM, assuming the tests pass. Also, thanks for including a PR description.

@volokluev (Member) left a comment

From the changes that I see in sentry-protos, you have recreated the aggregation struct with the condition field that you need (we spoke about this):

https://github.com/getsentry/sentry-protos/pull/109/files#diff-4b3d0e1ccced0b672ff3b41746de66f106ab7239b75c2bcc10048ec4c0017a9bR9-R13

But in this code, you are creating branches for both cases, which makes the code more fragmented.

What you should do instead is convert the old (soon-to-be-deprecated) object into the new AttributeConditionalAggregation object at the very beginning, and then do all of your operations from there.
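The up-front conversion the reviewer describes can be sketched as follows. The dataclasses here are simplified, hypothetical stand-ins for the sentry-protos messages, which carry many more fields; the point is only the shape of the transformation: copy every field across and leave the filter unset so downstream code treats it as an always-true condition.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-ins for the sentry-protos message types under discussion.
@dataclass
class AttributeAggregation:
    aggregate: str
    key: str
    label: str

@dataclass
class AttributeConditionalAggregation:
    aggregate: str
    key: str
    label: str
    # Left as None (unset) so the default condition is "always true".
    filter: Optional[object] = None

def to_conditional(agg: AttributeAggregation) -> AttributeConditionalAggregation:
    """Convert the soon-to-be-deprecated aggregation into its conditional
    counterpart, deliberately leaving `filter` unset."""
    return AttributeConditionalAggregation(
        aggregate=agg.aggregate, key=agg.key, label=agg.label
    )
```

After this conversion runs once at the start of request handling, the rest of the pipeline only ever sees AttributeConditionalAggregation and the per-case branching disappears.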

@xurui-c xurui-c force-pushed the rachel/aggregateIf branch 2 times, most recently from d9ff324 to 4c5dcc2 Compare February 12, 2025 22:23
@xurui-c (Member, Author) commented Feb 12, 2025

There's still some fragmented code in aggregation.py because it's still used by the TimeSeries endpoint. I'll get rid of it in a subsequent PR for time series.

@xurui-c xurui-c force-pushed the rachel/aggregateIf branch 2 times, most recently from 68376ab to cea3456 Compare February 13, 2025 17:48
Rachel Chen added 2 commits February 13, 2025 13:15
@xurui-c xurui-c force-pushed the rachel/aggregateIf branch 2 times, most recently from c8d8a92 to dca9e59 Compare February 14, 2025 01:56
Rachel Chen and others added 2 commits February 13, 2025 18:02
),
alias=alias,
)


-def _get_count_column_alias(aggregation: AttributeAggregation) -> str:
+def _get_count_column_alias(
+    aggregation: AttributeAggregation | AttributeConditionalAggregation,
A Member left a comment

if you've transformed everything up front, when would you still have an AttributeAggregation in the arguments here?

@xurui-c (Member, Author) replied Feb 14, 2025

The time series endpoint calls get_count_column, which calls _get_count_column_alias. We haven't applied the transformation in the time series endpoint yet (that will come in a later PR).

@@ -80,6 +87,48 @@ def _transform_request(request: TraceItemTableRequest) -> TraceItemTableRequest:
return SparseAggregateAttributeTransformer(request).transform()


def convert_to_conditional_aggregation(in_msg: TraceItemTableRequest) -> None:
A Member left a comment

How different would this function be for the TimeSeries endpoint?

A Member left a comment

  1. Leave a comment explaining why this is being done; it's not immediately clear unless you are the user.
  2. Test this function independently; it has clearly defined inputs and outputs.
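An independent unit test along the lines the reviewer suggests might look like this. The message classes and the converter below are simplified, hypothetical stand-ins for the real protobuf types and the snuba helper; the real test would construct a TraceItemTableRequest and assert on its columns after conversion.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Aggregation:
    label: str

@dataclass
class Column:
    aggregation: Optional[Aggregation] = None
    conditional_aggregation: Optional[Aggregation] = None

def convert_to_conditional_aggregation(columns: List[Column]) -> None:
    # Rewrite each plain aggregation as a conditional one with no filter,
    # which downstream code treats as an always-true condition.
    for col in columns:
        if col.aggregation is not None:
            col.conditional_aggregation = col.aggregation
            col.aggregation = None

def test_convert_to_conditional_aggregation() -> None:
    cols = [Column(aggregation=Aggregation("sum(duration)")), Column()]
    convert_to_conditional_aggregation(cols)
    # The plain aggregation is cleared and carried over unchanged.
    assert cols[0].aggregation is None
    assert cols[0].conditional_aggregation == Aggregation("sum(duration)")
    # Columns with no aggregation are left untouched.
    assert cols[1].conditional_aggregation is None
```

Because the function has clearly defined inputs and outputs, such a test needs no database or endpoint fixtures.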

@volokluev (Member) left a comment

Overall looks good! My main concern is just testing that request processor you wrote independently

@@ -2595,12 +2785,73 @@ def test_nonexistent_attribute(setup_teardown: Any) -> None:
]


def _build_sum_attribute_aggregation_column_with_name(name: str) -> Column:
A Member left a comment

By which I mean this should be in its own file.

@xurui-c (Member, Author) replied

test_conditional_aggregation.py

This function adds the equivalent conditional aggregation for every aggregation in each Column or AggregationFilter. We don't add the filter field so outside code logic will set the default condition to true. The purpose of this function is to "transform" every AttributeAggregation to AttributeConditionalAggregation in order to avoid code fragmentation
"""

def _add_conditional_aggregation(
A Member left a comment

See if you can replace it in all cases so you don't have the split behavior between Column and AggregationComparisonFilter.

@xurui-c (Member, Author) replied

Replace is done with input.ClearField("aggregation"), but I still need the separate isinstance(input, Column) and isinstance(input, AggregationFilter) checks because of mypy. Also, I feel like it makes _convert more readable(?)

@@ -80,6 +87,52 @@ def _transform_request(request: TraceItemTableRequest) -> TraceItemTableRequest:
return SparseAggregateAttributeTransformer(request).transform()


def convert_to_conditional_aggregation(in_msg: TraceItemTableRequest) -> None:
"""
This function adds the equivalent conditional aggregation for every aggregation in each Column or AggregationFilter. We don't add the filter field so outside code logic will set the default condition to true. The purpose of this function is to "transform" every AttributeAggregation to AttributeConditionalAggregation in order to avoid code fragmentation
A Member left a comment

Explain the historical reasons here; that's the important part. Think of yourself as someone new to this code.

@xurui-c xurui-c requested a review from volokluev February 18, 2025 19:55
@volokluev (Member) left a comment

much better

@xurui-c xurui-c enabled auto-merge (squash) February 18, 2025 23:16
@xurui-c xurui-c merged commit 7b60369 into master Feb 18, 2025
32 checks passed
@xurui-c xurui-c deleted the rachel/aggregateIf branch February 18, 2025 23:43