[8.x] [Security Solution] Reduce the _review rule upgrade endpoint response size (#211045) #212921

Merged
merged 1 commit into from
Mar 3, 2025


kibanamachine
Contributor

Backport

This will backport the following commits from main to 8.x:

Questions ?

Please refer to the Backport tool documentation

… size (elastic#211045)

**Resolves: elastic#208361**
**Resolves: elastic#210544**

## Summary

This PR significantly reduces the memory consumption of the prebuilt
rule endpoints, so users are less likely to encounter out-of-memory
(OOM) errors on memory-limited Kibana instances.

Memory consumption testing results are provided in
elastic#211045 (comment).

## Details

This PR implements a number of memory usage optimizations for the
prebuilt rule endpoints, with the final goal of reducing the chances of
OOM errors. The changes are extensive and require thorough testing
before merging.

The changes are described in the following bullets:

- The most significant change is the addition of pagination to the
`upgrade/_review` endpoint. This endpoint was known for causing OOM
errors due to its large and ever-growing response size. With pagination,
it now returns upgrade information for one page of 20-100 rules at a
time, significantly reducing its memory footprint.
- New backend methods, such as
`ruleObjectsClient.fetchInstalledRuleVersions`, have been introduced.
These methods return rule IDs with their corresponding installed
versions, which makes it possible to build a map of outdated rules
without loading all available rules into memory. Previously, all
installed rules, along with their base and target versions, were fetched
unconditionally before filtering for updates.
- The `stats` data structure of the review endpoint has been deprecated
(it can be safely removed after one Serverless release cycle). Since the
endpoint now returns paginated results, building stats is no longer
feasible due to the limited rule set size fetched on the server side. As
a side effect, the related Cypress tests asserting that `Update All` is
disabled when rules can't be updated had to be removed.
- All changes to the endpoints are backward-compatible. All previously
required structures are still present in the response. All newly added
structures are optional.
- Upgradeable rule tags are now returned from the prebuilt rule status
endpoint.
- The frontend logic has been updated to move sorting and filtering of
prebuilt rules from the client side to the server side.
- The `upgrade/_perform` endpoint has been rewritten to use lightweight
rule version information rather than full rules to determine upgradeable
rules. Additionally, upgrades are now performed in batches of up to 100
rules, further reducing memory usage.
- A dry run option has been added to the `upgrade/_perform` endpoint.
This is needed for the "Update all" rules scenario to determine whether
any rules contain conflicts and to display a confirmation modal to the
user.
- An option to skip conflicting rules has been added to the upgrade
endpoint when called with the `ALL_RULES` mode.
- The `install/_review` endpoint's memory consumption has been optimized
by avoiding loading all rules into memory to determine available rules
for installation. Redundant fetching of all base versions has also been
removed, as they do not participate in the calculation.
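
To illustrate the approach described in the bullets above, here is a minimal TypeScript sketch of how outdated rules could be detected from lightweight version information, then paginated for review and batched for upgrade. This is not the code from this PR: the `RuleVersionInfo` shape and the `findOutdatedRuleIds`, `paginate`, and `chunk` helpers are hypothetical names introduced for illustration only, and the real `ruleObjectsClient.fetchInstalledRuleVersions` return type may differ.

```typescript
// Hypothetical lightweight shape: only what is needed to compare versions.
interface RuleVersionInfo {
  rule_id: string;
  version: number;
}

// Returns the IDs of installed rules that have a newer version available.
// Only ID/version pairs are held in memory, never full rule objects.
function findOutdatedRuleIds(
  installed: RuleVersionInfo[],
  latest: RuleVersionInfo[]
): string[] {
  const latestById = new Map<string, number>(
    latest.map((r): [string, number] => [r.rule_id, r.version])
  );
  return installed
    .filter((r) => {
      const target = latestById.get(r.rule_id);
      return target !== undefined && target > r.version;
    })
    .map((r) => r.rule_id);
}

// Returns one page of items (pages are 1-based), as a paginated
// review response would.
function paginate<T>(items: T[], page: number, perPage: number): T[] {
  const start = (page - 1) * perPage;
  return items.slice(start, start + perPage);
}

// Splits items into batches of at most `size` elements, e.g. to upgrade
// rules in batches of up to 100.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) {
    throw new RangeError('size must be positive');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

With helpers like these, full rule objects (with base and target versions) would only need to be fetched for the IDs on the current page or in the current batch, which is the source of the memory savings described above.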

---------

Co-authored-by: Maxim Palenov <maxim.palenov@elastic.co>
(cherry picked from commit c4a016e)
@elasticmachine
Contributor

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #15 / maps app geo file upload shapefile upload should add as document layer

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 9.1MB | 9.1MB | +1.0KB |

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 83.8KB | 83.9KB | +62.0B |

cc @xcrzx

@kibanamachine kibanamachine merged commit dd55c99 into elastic:8.x Mar 3, 2025
11 checks passed
SoniaSanzV pushed a commit to SoniaSanzV/kibana that referenced this pull request Mar 4, 2025
…sponse size (elastic#211045) (elastic#212921)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[Security Solution] Reduce the _review rule upgrade endpoint response
size (elastic#211045)](elastic#211045)

<!--- Backport version: 9.6.6 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)


Co-authored-by: Dmitrii Shevchenko <dmitrii.shevchenko@elastic.co>