-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add additional_parameters to diagnostics action #3333
Add additional_parameters to diagnostics action #3333
Conversation
a442ea9
to
09e57aa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Is there an elastic-agent change needed to make use of the CPU option?
The agent change is elastic/elastic-agent#4394 |
I'm going to hold off on merging this until I can get an e2e test working |
I managed to do an e2e test. I started 8.14.0-SNAPSHOT builds locally then enrolled a new fleet-server running under the elastic agent with changes from elastic/elastic-agent#4394 First I validated that the request diagnostics button on the UI works as intended (changes in the eefb64c commit were needed for that). Then I created a user with permissions to index documents in
The resulting diagnostics bundle was available through the UI, and it contains
|
|
@@ -549,7 +549,14 @@ components: | |||
# unenroll actions have no `data` attribute | |||
actionRequestDiagnostics: | |||
description: The REQUEST_DIAGNOSTICS action data. | |||
# diagnostics actions have no `data` attribute | |||
properties: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to double check, this is backwards compatible with older agents that don't understand this yet? They'll just ignore it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
I ran another e2e test by enrolling an agent without the changes in elastic/elastic-agent#4394 to a fleet-server instance build with the changes in this PR and manually inserting an action with CPU in additional_metrics
.
A diagnostics bundle was generated with no cpu.pprof
as a part of it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
What is the problem this PR solves?
Allow fleet-server to specify if a diagnostics bundle should contain additional CPU metrics.
How does this PR solve the problem?
Add an optional parameter to the RequestDiagnostics action. This parameter is a list of strings (enum) that has a single value
"CPU"
. If this parameter is specified, the agent should collect a CPU trace.Design Checklist
I have ensured my design is stateless and will work when multiple fleet-server instances are behind a load balancer.I have or intend to scale test my changes, ensuring it will work reliably with 100K+ agents connected.I have included fail safe mechanisms to limit the load on fleet-server: rate limiting, circuit breakers, caching, load shedding, etc.Checklist
I have commented my code, particularly in hard-to-understand areasI have made corresponding changes to the documentationI have made corresponding change to the default configuration files./changelog/fragments
using the changelog toolRelated issues