[FEATURE]: is_(not)_in_range supports min/max_limit from the dataframe #87

Open
1 task done
harlankad-db opened this issue Jan 14, 2025 · 0 comments · May be fixed by #153

@harlankad-db

Is there an existing issue for this?

  • I have searched the existing issues

Problem statement

Customer GWSC wants to group the data by entity and check whether a column’s values fall within, say, 3 standard deviations of the mean for each entity group.

Proposed Solution

Modify is_in_range() (and is_not_in_range()) to accept dataframe columns for min_limit and max_limit. I can then join a dataframe of pre-computed, group-specific ranges and pass the column names to is_in_range(). Example ranges to support include multiples of the standard deviation, the interquartile range, or the mean absolute deviation.
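
To make the requested semantics concrete, here is a minimal sketch in plain Python (deliberately not Spark, and not this library's API). It assumes the limits are pre-computed per group from a baseline dataset as mean ± 3 sample standard deviations, joined onto the rows to check, and then read per row instead of being passed as constants. All names here (`group_limits`, `join_limits`, the `min_limit`/`max_limit` keys, and this stand-in `is_in_range`) are hypothetical illustrations.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical baseline data used to pre-compute group-specific ranges.
baseline = [
    {"entity": "a", "value": v} for v in (10.0, 11.0, 9.0, 10.0, 10.0)
] + [
    {"entity": "b", "value": v} for v in (100.0, 105.0, 95.0, 100.0, 100.0)
]


def group_limits(rows, key, col, k=3.0):
    """Return {group: (min_limit, max_limit)} as mean -/+ k sample std devs."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[col])
    limits = {}
    for group, values in groups.items():
        m = mean(values)
        # stdev() needs at least two points; fall back to a zero-width range.
        s = stdev(values) if len(values) > 1 else 0.0
        limits[group] = (m - k * s, m + k * s)
    return limits


def join_limits(rows, limits, key):
    """Attach min_limit/max_limit to each row, like a join on the group key."""
    return [
        {**row, "min_limit": limits[row[key]][0], "max_limit": limits[row[key]][1]}
        for row in rows
    ]


def is_in_range(row, col, min_col, max_col):
    """Stand-in check: limits come from the row's own columns, not constants."""
    return row[min_col] <= row[col] <= row[max_col]


# Rows to validate; the same value (13.0) is judged against different limits
# depending on which entity group the row belongs to.
to_check = [
    {"entity": "a", "value": 10.5},
    {"entity": "a", "value": 13.0},
    {"entity": "b", "value": 13.0},
    {"entity": "b", "value": 108.0},
]

limits = group_limits(baseline, key="entity", col="value", k=3.0)
joined = join_limits(to_check, limits, key="entity")
flags = [is_in_range(r, "value", "min_limit", "max_limit") for r in joined]
print(flags)  # [True, False, False, True]
```

The same shape would apply to interquartile-range or mean-absolute-deviation limits: only `group_limits` changes, while the row-level check keeps reading its bounds from the joined columns.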

Additional Context

This feature may be included in [FEATURE]: Data set level rules #43, but I want to make sure my customer's need is addressed.

@harlankad-db harlankad-db added the enhancement New feature or request label Jan 14, 2025
@harlankad-db harlankad-db changed the title [FEATURE]: is_(not)_in_range suppors min/max_limit from the dataframe [FEATURE]: is_(not)_in_range supports min/max_limit from the dataframe Jan 14, 2025
@karthik-ballullaya-db karthik-ballullaya-db self-assigned this Feb 5, 2025