-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✨ override builtin functions #5156
Conversation
Today, there is no official way to override builtin functions like `length`, this change tries to do the most minimal change in MQL to allow a unique pattern for providers to override builtin functions. The core change is an additional check when executing a bound function. We check if the resource defines the builtin function like `length`, and if so, we prioritize it. If the provider doesn't define it, we execute the function as we did before, by loading first the builtin function and if not existing, then we run the resource function. Here is an example of this pattern to override the `length` builtin function. Having a resource that exposes a function that loads big amounts of resources: ``` cloud { resources() []resources } ``` The provider implementation would look something like this: ```go func (c *mqlCloud) resources() ([]interface, error) { // API call to fetch all resources } ``` When running the MQL query `cloud.resources.length`, we first load all the resources, and then we count them. Since we do not need any information about the resources themselves, this could delay policies if we do things like: ``` cloud.resources.length < 5 && cloud.resources.length > 1 ``` An alternative to improve these resources is to override the builtin function with a more performant implementation. ``` cloud { // update function with a new custom list resource resources() customResources } // definition of a custom list resource customResources { []resource // overrides builtin function length() int } ``` The new implementation moves the logic that loads the resources into a `list()` function and exposes a new function that overrides the builtin function: would look like: ```go func (c *mqlCloud) resources() (*mqlCustomResources, error) { // Only initializes the custom list resource } func (c *mqlCustomResources) list() ([]interface, error) { // API call to fetch all resources } // length() overrides the builtin function, func (c *mqlCustomResources) length() (int64, error) { // This should be a more performant way to count the "resources" } ``` Additionally, this change moves the `findField()` function from the `compiler` to the resource `Schema`. Signed-off-by: Salim Afiune Maya <afiune@mondoo.com>
Test Results3 216 tests 3 212 ✅ 1m 46s ⏱️ Results for commit f896a66. ♻️ This comment has been updated with latest results. |
…de-builtin-functions
Converted to draft as it requires more discussion with @arlimus |
Depends on #5156 Contributes to #5109 Relates to #5153 Use Microsoft `$count` parameter to make `microsoft.applications.length` more performant. https://learn.microsoft.com/en-us/graph/query-parameters?tabs=http#count-parameter Signed-off-by: Salim Afiune Maya <afiune@mondoo.com>
When writing policies that require fetching huge amount of data only to search or filter for specific resources, we fetch all resources and then we apply the filters using the builtin functions `where()` `any()` and more. An example of a policy check that uses these patterns is: ``` // search for emergency accounts microsoft.users.any(displayName == /emergency/) // emrgIds holds the ids which match the above criteria emrgID = microsoft.users.where(displayName == /emergency/).map(id) // check if at least one of the accounts identified as such is attached to the "Global Administrator" role microsoft.rolemanagement.roleDefinitions.where(displayName == "Global Administrator").all(assignments.any(principalId == emrgID)) ``` To improve these resources, I am proposing a new pattern, similar to the one used at #5156, but with the difference that it doesn't override builtin functions, instead it leverages list resources which are natively supported in MQL with additional query parameters `filter` and `search`. These query parameters will be used directly when executing API requests against Microsoft Graph API. These query parameters are documented at: https://learn.microsoft.com/en-us/graph/filter-query-parameter?tabs=http#filter-using-lambda-operators The above example can be rewritten using these two new filtered resources like: ``` // search for emergency accounts microsoft.users(search: "displayName:emergency").any() // emrgIds holds the ids which match the above criteria emrgID = microsoft.users(search: "displayName:emergency").map(id) // check if at least one of the accounts identified as such is attached to the "Global Administrator" role microsoft.roles(filter: "displayName eq 'Global Administrator'").all(assignments.any(principalId == emrgID)) ``` Additionally, since these query parameters are directly passed to Microsoft API's, we can write very complex filters for these two new resources. A couple examples are: ``` microsoft.roles(filter: "isBuiltIn eq true and startswith(displayName, 'Global')") microsoft.users(filter: "accountEnabled eq true AND userType eq 'Member'", search: "officeLocation:berlin") ``` Closes #5110 Signed-off-by: Salim Afiune Maya <afiune@mondoo.com>
When writing policies that require fetching huge amount of data only to search or filter for specific resources, we fetch all resources and then we apply the filters using the builtin functions `where()` `any()` and more. An example of a policy check that uses these patterns is: ``` // search for emergency accounts microsoft.users.any(displayName == /emergency/) // emrgIds holds the ids which match the above criteria emrgID = microsoft.users.where(displayName == /emergency/).map(id) // check if at least one of the accounts identified as such is attached to the "Global Administrator" role microsoft.rolemanagement.roleDefinitions.where(displayName == "Global Administrator").all(assignments.any(principalId == emrgID)) ``` To improve these resources, I am proposing a new pattern, similar to the one used at #5156, but with the difference that it doesn't override builtin functions, instead it leverages list resources which are natively supported in MQL with additional query parameters `filter` and `search`. These query parameters will be used directly when executing API requests against Microsoft Graph API. These query parameters are documented at: https://learn.microsoft.com/en-us/graph/filter-query-parameter?tabs=http#filter-using-lambda-operators The above example can be rewritten using these two new filtered resources like: ``` // search for emergency accounts microsoft.users(search: "displayName:emergency").any() // emrgIds holds the ids which match the above criteria emrgID = microsoft.users(search: "displayName:emergency").map(id) // check if at least one of the accounts identified as such is attached to the "Global Administrator" role microsoft.roles(filter: "displayName eq 'Global Administrator'").all(assignments.any(principalId == emrgID)) ``` Additionally, since these query parameters are directly passed to Microsoft API's, we can write very complex filters for these two new resources. A couple examples are: ``` microsoft.roles(filter: "isBuiltIn eq true and startswith(displayName, 'Global')") microsoft.users(filter: "accountEnabled eq true AND userType eq 'Member'", search: "officeLocation:berlin") ``` Closes #5110 Signed-off-by: Salim Afiune Maya <afiune@mondoo.com>
Talked with @arlimus and we are ready to review this PR. |
mql/mql_test.go
Outdated
Code: "mos.groups.list.length", | ||
ResultIndex: 0, Expectation: int64(7), | ||
}, | ||
// Same here, builtint `length` function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: builtin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for contributing the overrides @afiune !
For anyone who may find this PR in the future:
We introduce this builtin override pattern as an effective short-term pattern to take care of large-scale MQL. It gives us a ton of flexibility to add optimized solutions for some of these super long-running calls.
In parallel we are working on a separate RFC to tackle this problem long-term. The advantages of this override-pattern is the ability to do very efficient calls, which are just not possible today. The reasons we will take a different path mid-term is to avoid changes to the underlying expected user behavior of these builtin functions (eg: other types of return values, other behaviors despite having the same name). (from my past experience in eg ruby and scala: it's a fun feature, very powerful, but is harder to read and easier to make mistakes)
Today, there is no official way to override builtin functions like
length
, this change tries to do the most minimal change in MQL to allow a unique pattern for providers to override builtin functions.The core change is an additional check when executing a bound function. We check if the resource defines the builtin function like
length
, and if so, we prioritize it. If the provider doesn't define it, we execute the function as we did before, by loading first the builtin function and if not existing, then we run the resource function.Here is an example of this pattern to override the
length
builtin function.Having a resource that exposes a function that loads big amounts of resources:
The provider implementation would look something like this:
When running the MQL query
cloud.resources.length
, we first load all the resources, and then we count them.Since we do not need any information about the resources themselves, this could delay policies if we do things like:
An alternative to improve these resources is to override the builtin function with a more performant implementation.
The new implementation moves the logic that loads the resources into a
list()
function and exposes a new function that overrides the builtin function:would look like:
Additionally, this change moves the
findField()
function from thecompiler
to the resourceSchema
.