
Add ECK version to version.json #270

Open
stefnestor opened this issue Oct 16, 2024 · 8 comments
Labels: enhancement (New feature or request)

Comments

@stefnestor commented Oct 16, 2024

👋 howdy, team (cc: @kunisen @jakommo 🙏!)

AFAICT, to determine the running ECK version today we have to regex-search for ECK version is (.+?)\n in eck-diagnostics.log, which is emitted by this code:

func logVersion(v *version.Version) {
    s := v.String()
    if v == fallbackMaxVersion {
        s = "unknown"
    }
    logger.Printf("ECK version is %s\n", s)
}
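
For illustration, this is roughly the extraction a consumer of the archive has to do today (a minimal sketch; the log filename relative to the unpacked archive is assumed):

package main

import (
    "fmt"
    "os"
    "regexp"
)

func main() {
    // Hypothetical path inside an unpacked eck-diagnostics archive.
    data, err := os.ReadFile("eck-diagnostics.log")
    if err != nil {
        panic(err)
    }
    // The same pattern we currently have to regex-search for in the log output.
    re := regexp.MustCompile(`ECK version is (.+?)\n`)
    if m := re.FindSubmatch(data); m != nil {
        fmt.Println(string(m[1])) // e.g. "2.10.0"
    }
}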

Will you kindly consider expanding the version.json file to include this content, likely by extending this code:

// versionInfo exists to marshal both eck-diagnostics version information and K8s server version information.
type versionInfo struct {
    DiagnosticsVersion DiagnosticsVersion
    ServerVersion      *version.Info
}

// Version is inspired by "kubectl version" but includes version information about this tool in addition to K8s
// server version information.
func (c Kubectl) Version(out io.Writer) error {
    v := versionInfo{
        DiagnosticsVersion: about(),
    }
    client, err := c.factory.ToDiscoveryClient()
    if err != nil {
        return err
    }
    // mirroring kubectl behaviour to fetch fresh data from server
    client.Invalidate()
    serverVersion, err := client.ServerVersion()
    if err != nil {
        return err
    }
    v.ServerVersion = serverVersion
    bytes, err := json.MarshalIndent(v, "", " ")
    if err != nil {
        return err
    }
    _, err = out.Write(bytes)
    return err
}
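
For example, a rough sketch only (the EckVersion field name is my assumption, not an existing field):

// Sketch: versionInfo extended with the detected ECK operator version.
type versionInfo struct {
    DiagnosticsVersion DiagnosticsVersion
    ServerVersion      *version.Info
    EckVersion         string // e.g. "2.10.0", or "unknown" if detection fails
}

Version (or its caller) would then set EckVersion from the same detection logic that feeds logVersion today, and the field would land in version.json via the existing json.MarshalIndent call.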

Example diagnostic of current state:

[screenshot attached]

After filing, I was told this information also potentially comes through in a lower-level file; however, since the namespace can be dynamic, I'm still requesting it in a root-folder file:

$ cat ./elastic-system/configmaps.json | jq '.items[]|select(.metadata.name=="elastic-operator")|.metadata.labels."app.kubernetes.io/version"'
"2.10.0"
stefnestor added the enhancement label Oct 16, 2024
@thbkrkr (Contributor) commented Oct 17, 2024

One thing to note: eck-diagnostics supports multiple operator namespaces, so there can be different operator versions in the same archive:

> grep eck-operator- -r eck-diagnostics-2024-10-17T14-59-53 | grep serviceaccount
eck-diagnostics-2024-10-17T14-59-53/elastic-system/serviceaccount.json:      "helm.sh/chart": "eck-operator-2.14.0"
eck-diagnostics-2024-10-17T14-59-53/elastic-system-11/serviceaccount.json:   "helm.sh/chart": "eck-operator-2.11.0"

We are currently printing the maximum version of all operator versions.

maxOperatorVersion := maxVersion(operatorVersions)
logVersion(maxOperatorVersion)
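
For illustration, the comparison is a version comparison rather than a string comparison, so 2.14.0 wins over 2.11.0. A standalone sketch (not the actual maxVersion implementation):

package main

import (
    "fmt"
    "strconv"
    "strings"
)

// compareVersions compares two "x.y.z" strings numerically, so "2.14.0" > "2.11.0".
func compareVersions(a, b string) int {
    as, bs := strings.Split(a, "."), strings.Split(b, ".")
    for i := 0; i < len(as) && i < len(bs); i++ {
        ai, _ := strconv.Atoi(as[i])
        bi, _ := strconv.Atoi(bs[i])
        if ai != bi {
            if ai < bi {
                return -1
            }
            return 1
        }
    }
    return len(as) - len(bs)
}

func main() {
    operatorVersions := []string{"2.14.0", "2.11.0"}
    maxV := operatorVersions[0]
    for _, v := range operatorVersions[1:] {
        if compareVersions(v, maxV) > 0 {
            maxV = v
        }
    }
    fmt.Println(maxV) // 2.14.0
}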

@stefnestor (Author) commented Oct 23, 2024

@thbkrkr, cheers! Sorry, my team's a tad confused on how to action your excellent clarification. As follow-up:

  1. If an ECK Diag did pull two setups at once, e.g. namespace: [{"operator":"elastic-system", "workload":"default"},{"operator":"elastic-v2","workload":"asdf"}], then how would you line up each operator to its respective workload/deployment from the diagnostic?
  2. In our support experience, we can't think of a situation where pulling multiple operator-workload pairs makes more sense than having users pull them separately. May I ask if your team would consider updating the diag to that pattern (at least only allowing one operator, even if it manages multiple resource namespaces), or is there a pattern y'all see that Support does not where pulling multiple operators at once makes sense (beyond being faster than running the diag multiple times)?

I'll send you the internal discussion link in a DM.

@pebrc (Collaborator) commented Oct 24, 2024

how would you line up each operator to its respective workload/deployment from the diagnostic?

You have to look at the operator configuration, specifically the namespaces attribute in the operator config map, which indicates which namespaces the operator is responsible for.

if there's a pattern y'all see that Support does not where pulling multiple operators at once makes sense

The only other potential benefit I can see is that it allows debugging misconfigurations where multiple operators are responsible for the same namespace. This is hard to figure out if you only pull the diagnostic for one operator out of many in a k8s cluster.

@pebrc (Collaborator) commented Oct 24, 2024

We could either log all operator versions in this file, or just the max operator version. Or, as a third option, we could create a new file in the archive called operators.json that contains something like:

{
  "elastic-system/elastic-operator": {"version": "2.14.0", "managed_namespaces": ["ns1", "ns2"]},
  "other-ns/elastic-operator": {"version": "2.13.0", "managed_namespaces": ["*"]}
}

The above example would show a misconfiguration if we use * to mean all namespaces.
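
A rough sketch of the Go types that could produce such a file (names and shape are illustrative only, not a settled design):

// Sketch only: hypothetical types for a new operators.json file,
// keyed by "<namespace>/<operator name>".
type operatorInfo struct {
    Version           string   `json:"version"`
    ManagedNamespaces []string `json:"managed_namespaces"`
}

type operatorsFile map[string]operatorInfo

Marshalling an operatorsFile with json.MarshalIndent would yield output in the shape shown above.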

@pebrc (Collaborator) commented Oct 24, 2024

@stefnestor maybe you could explain what the motivation for this change request is. Are you planning to build some tooling around this that requires the version information to be in a structured format or are you just trying to understand how to read the data in the archive?

The reason I am asking is that there are different levels of effort associated with the different ways to solve this, so it would be good to understand the use case better.

@stefnestor (Author) commented

Initially, the plan was to write an analyzer, until @thbkrkr's first response wiped out my assumptions 😂.

So for now, I'm just interested in understanding, so that I can flesh out "how to read ECK diags" for my team. We usually only see one operator paired with one workload, so we can use whichever of the two namespaces reports elastic-operator (per the description) to tell which is which. But I'm learning that's a false assumption, and I'm trying to work out how to read the underlying data so I know when I'm actually in my common situation rather than just assuming I am 🙂.

But fairly quickly I'd be interested in building a diag analyzer on top, so I'd prefer to derive this from .json rather than .log files. AFAICT the analyzer would want to begin by reporting:

  1. For a diagnostic, which namespace is the operator and which is the workload/deployment? And if there are multiple pairs, what's the mapping between them (and/or are any marooned by themselves)?
  2. Is the ECK operator version EOL? Is the stack version EOL? Does the KB version equal the ES version, etc.?
  3. Then check health, e.g.: does the ECK operator report healthy? Per workload pod/resource, do any report changes currently applying and/or health issues?
  4. Then check recommendations, e.g.: does ES qualify as highly available?

So I don't mean at all to say there's a problem with the current diag, other than apparently ignorant assumptions on my side, sorry 🙃. I think a gap I had, which you highlighted well in DM, is relevant here:

you don’t have the namespaces attribute in the configmap because this operator is configured to manage all namespaces (which is the default): See one of the first lines in the operator’s log where it says Operator configured to manage all namespaces.

If I'm reading correctly, an operator restricted to namespaces would be configured via this doc. From your outline it sounds like that'd still be reflected in the existing configmaps.json. May I have an example so that I can understand the JSON reference path? I assume the final key will be managed_namespaces? Then I'll assume that any time that path doesn't exist it's interpreted as *, without you needing to expand the diag to generate an operators.json on my behalf.

@pebrc (Collaborator) commented Oct 24, 2024

If I'm reading correctly, an operator restricted to namespaces would be via this doc.

No, that's not it. It is this document, which then translates to a configmap with the namespaces flag set as discussed. The document I linked is for Helm users. But not all our users are Helm users; many use custom-built installation manifests or a modified version of our official YAML manifests. That opens up additional options for how to configure the namespace restriction. In total we have three options:

  • config file, typically called eck.yaml: namespaces (configmaps.json in the diagnostic archive; see the lookup sketch after this list)
  • config flag when executing the operator binary: --namespaces (pods.json)
  • environment variable: NAMESPACES (pods.json; this last one is typically used in OLM-based installations, e.g. on OpenShift)
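
For the first option, a rough sketch of pulling the value out of the archive (the file path, the elastic-operator config map name, and the line-based scan instead of a real YAML parse are all assumptions for illustration):

package main

import (
    "encoding/json"
    "fmt"
    "os"
    "strings"
)

func main() {
    // Hypothetical path inside an unpacked diagnostic archive.
    raw, err := os.ReadFile("elastic-system/configmaps.json")
    if err != nil {
        panic(err)
    }
    var list struct {
        Items []struct {
            Metadata struct {
                Name string `json:"name"`
            } `json:"metadata"`
            Data map[string]string `json:"data"`
        } `json:"items"`
    }
    if err := json.Unmarshal(raw, &list); err != nil {
        panic(err)
    }
    for _, cm := range list.Items {
        if cm.Metadata.Name != "elastic-operator" {
            continue
        }
        // eck.yaml holds the operator config; scan it line by line for the namespaces flag.
        for _, line := range strings.Split(cm.Data["eck.yaml"], "\n") {
            if strings.HasPrefix(strings.TrimSpace(line), "namespaces:") {
                fmt.Println(strings.TrimSpace(line)) // e.g. "namespaces: [ns1, ns2]"
            }
        }
    }
}

If nothing is printed, the flag isn't set in the config file, which per the defaults below means all namespaces, unless one of the other two mechanisms sets it.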

Then with that I'll assume any time that path doesn't exist it's interpreted as *

Yes, if the value is not configured or set to "" it means all namespaces. See operator config doc:

Flag         Default   Description
namespaces   ""        Namespaces in which this operator should manage resources. Accepts multiple comma-separated values. Defaults to all namespaces if empty or unspecified.

@stefnestor (Author) commented

Ah, this is more complicated than I thought 🙈. I would instead like to request your earlier idea of an operators.json file 🙏
