Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ override builtin functions #5156

Merged
merged 3 commits into from
Feb 20, 2025
Merged

Conversation

afiune
Copy link
Contributor

@afiune afiune commented Feb 3, 2025

Today, there is no official way to override builtin functions like length, this change tries to do the most minimal change in MQL to allow a unique pattern for providers to override builtin functions.

The core change is an additional check when executing a bound function. We check if the resource defines the builtin function like length, and if so, we prioritize it. If the provider doesn't define it, we execute the function as we did before, by loading first the builtin function and if not existing, then we run the resource function.

Here is an example of this pattern to override the length builtin function.

Having a resource that exposes a function that loads big amounts of resources:

cloud {
  resources() []resources
}

The provider implementation would look something like this:

func (c *mqlCloud) resources() ([]interface, error) {
	// API call to fetch all resources
}

When running the MQL query cloud.resources.length, we first load all the resources, and then we count them.

Since we do not need any information about the resources themselves, this could delay policies if we do things like:

cloud.resources.length < 5 && cloud.resources.length > 1

An alternative to improve these resources is to override the builtin function with a more performant implementation.

cloud {
  // update function with a new custom list resource
  resources() customResources
}

// definition of a custom list resource
customResources {
  []resource

  // overrides builtin function
  length() int
}

The new implementation moves the logic that loads the resources into a list() function and exposes a new function that overrides the builtin function:
would look like:

func (c *mqlCloud) resources() (*mqlCustomResources, error) {
	// Only initializes the custom list resource
}
func (c *mqlCustomResources) list() ([]interface, error) {
	// API call to fetch all resources
}
// length() overrides the builtin function,
func (c *mqlCustomResources) length() (int64, error) {
        // This should be a more performant way to count the "resources"
}

Additionally, this change moves the findField() function from the compiler to the resource Schema.

Today, there is no official way to override builtin functions like
`length`, this change tries to do the most minimal change in MQL to
allow a unique pattern for providers to override builtin functions.

The core change is an additional check when executing a bound function.
We check if the resource defines the builtin function like `length`, and
if so, we prioritize it. If the provider doesn't define it, we execute
the function as we did before, by loading first the builtin function and
if not existing, then we run the resource function.

Here is an example of this pattern to override the `length` builtin
function.

Having a resource that exposes a function that loads big amounts of
resources:

```
cloud {
  resources() []resources
}
```

The provider implementation would look something like this:
```go
func (c *mqlCloud) resources() ([]interface, error) {
	// API call to fetch all resources
}
```

When running the MQL query `cloud.resources.length`, we first load all
the resources, and then we count them.

Since we do not need any information about the resources themselves,
this could delay policies if we do things like:
```
cloud.resources.length < 5 && cloud.resources.length > 1
```

An alternative to improve these resources is to override the builtin
function with a more performant implementation.

```
cloud {
  // update function with a new custom list resource
  resources() customResources
}

// definition of a custom list resource
customResources {
  []resource

  // overrides builtin function
  length() int
}
```

The new implementation moves the logic that loads the resources into a
`list()` function and exposes a new function that overrides the builtin
function:
would look like:
```go
func (c *mqlCloud) resources() (*mqlCustomResources, error) {
	// Only initializes the custom list resource
}
func (c *mqlCustomResources) list() ([]interface, error) {
	// API call to fetch all resources
}
// length() overrides the builtin function,
func (c *mqlCustomResources) length() (int64, error) {
        // This should be a more performant way to count the "resources"
}
```

Additionally, this change moves the `findField()` function from the
`compiler` to the resource `Schema`.

Signed-off-by: Salim Afiune Maya <afiune@mondoo.com>
Copy link
Contributor

github-actions bot commented Feb 3, 2025

Test Results

3 216 tests   3 212 ✅  1m 46s ⏱️
  385 suites      4 💤
   29 files        0 ❌

Results for commit f896a66.

♻️ This comment has been updated with latest results.

@chris-rock chris-rock marked this pull request as draft February 3, 2025 21:09
@chris-rock
Copy link
Member

Converted to draft as it requires more discussion with @arlimus

afiune added a commit that referenced this pull request Feb 4, 2025
Depends on #5156
Contributes to #5109
Relates to #5153

Use Microsoft `$count` parameter to make `microsoft.applications.length` more performant.

https://learn.microsoft.com/en-us/graph/query-parameters?tabs=http#count-parameter

Signed-off-by: Salim Afiune Maya <afiune@mondoo.com>
afiune added a commit that referenced this pull request Feb 5, 2025
When writing policies that require fetching huge amount of data only to
search or filter for specific resources, we fetch all resources and then
we apply the filters using the builtin functions `where()` `any()` and
more.

An example of a policy check that uses these patterns is:
```
// search for emergency accounts
microsoft.users.any(displayName == /emergency/)

// emrgIds holds the ids which match the above criteria
emrgID = microsoft.users.where(displayName == /emergency/).map(id)

// check if at least one of the accounts identified as such is attached to the "Global Administrator" role
microsoft.rolemanagement.roleDefinitions.where(displayName == "Global Administrator").all(assignments.any(principalId == emrgID))
```

To improve these resources, I am proposing a new pattern, similar to the
one used at #5156, but with the
difference that it doesn't override builtin functions, instead it
leverages list resources which are natively supported in MQL with
additional query parameters `filter` and `search`.

These query parameters will be used directly when executing API requests
against Microsoft Graph API. These query parameters are documented at:

https://learn.microsoft.com/en-us/graph/filter-query-parameter?tabs=http#filter-using-lambda-operators

The above example can be rewritten using these two new filtered
resources like:

```
// search for emergency accounts
microsoft.users(search: "displayName:emergency").any()

// emrgIds holds the ids which match the above criteria
emrgID = microsoft.users(search: "displayName:emergency").map(id)

// check if at least one of the accounts identified as such is attached to the "Global Administrator" role
microsoft.roles(filter: "displayName eq 'Global Administrator'").all(assignments.any(principalId == emrgID))
```

Additionally, since these query parameters are directly passed to
Microsoft API's, we can write very complex filters for these two new
resources.

A couple examples are:
```
microsoft.roles(filter: "isBuiltIn eq true and startswith(displayName, 'Global')")
microsoft.users(filter: "accountEnabled eq true AND userType eq 'Member'", search: "officeLocation:berlin")
```

Closes #5110

Signed-off-by: Salim Afiune Maya <afiune@mondoo.com>
afiune added a commit that referenced this pull request Feb 7, 2025
When writing policies that require fetching huge amount of data only to
search or filter for specific resources, we fetch all resources and then
we apply the filters using the builtin functions `where()` `any()` and
more.

An example of a policy check that uses these patterns is:
```
// search for emergency accounts
microsoft.users.any(displayName == /emergency/)

// emrgIds holds the ids which match the above criteria
emrgID = microsoft.users.where(displayName == /emergency/).map(id)

// check if at least one of the accounts identified as such is attached to the "Global Administrator" role
microsoft.rolemanagement.roleDefinitions.where(displayName == "Global Administrator").all(assignments.any(principalId == emrgID))
```

To improve these resources, I am proposing a new pattern, similar to the
one used at #5156, but with the
difference that it doesn't override builtin functions, instead it
leverages list resources which are natively supported in MQL with
additional query parameters `filter` and `search`.

These query parameters will be used directly when executing API requests
against Microsoft Graph API. These query parameters are documented at:

https://learn.microsoft.com/en-us/graph/filter-query-parameter?tabs=http#filter-using-lambda-operators

The above example can be rewritten using these two new filtered
resources like:

```
// search for emergency accounts
microsoft.users(search: "displayName:emergency").any()

// emrgIds holds the ids which match the above criteria
emrgID = microsoft.users(search: "displayName:emergency").map(id)

// check if at least one of the accounts identified as such is attached to the "Global Administrator" role
microsoft.roles(filter: "displayName eq 'Global Administrator'").all(assignments.any(principalId == emrgID))
```

Additionally, since these query parameters are directly passed to
Microsoft API's, we can write very complex filters for these two new
resources.

A couple examples are:
```
microsoft.roles(filter: "isBuiltIn eq true and startswith(displayName, 'Global')")
microsoft.users(filter: "accountEnabled eq true AND userType eq 'Member'", search: "officeLocation:berlin")
```

Closes #5110

Signed-off-by: Salim Afiune Maya <afiune@mondoo.com>
@afiune afiune marked this pull request as ready for review February 13, 2025 18:20
@afiune
Copy link
Contributor Author

afiune commented Feb 13, 2025

Talked with @arlimus and we are ready to review this PR.

@arlimus arlimus self-assigned this Feb 19, 2025
mql/mql_test.go Outdated
Code: "mos.groups.list.length",
ResultIndex: 0, Expectation: int64(7),
},
// Same here, builtint `length` function
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: builtin

Copy link
Member

@arlimus arlimus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for contributing the overrides @afiune !

For anyone who may find this PR in the future:

We introduce this builtin override pattern as an effective short-term pattern to take care of large-scale MQL. It gives us a ton of flexibility to add optimized solutions for some of these super long-running calls.

In parallel we are working on a separate RFC to tackle this problem long-term. The advantages of this override-pattern is the ability to do very efficient calls, which are just not possible today. The reasons we will take a different path mid-term is to avoid changes to the underlying expected user behavior of these builtin functions (eg: other types of return values, other behaviors despite having the same name). (from my past experience in eg ruby and scala: it's a fun feature, very powerful, but is harder to read and easier to make mistakes)

@afiune afiune merged commit 03ab221 into main Feb 20, 2025
17 checks passed
@afiune afiune deleted the afiune/override-builtin-functions branch February 20, 2025 17:24
@github-actions github-actions bot locked and limited conversation to collaborators Feb 20, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants