"method" metric label has unbound granularity #10208

Open
@jacekn

Description

What happened:

I noticed a problem with metrics that can lead to a metric cardinality explosion.
In the default configuration the ingress will accept any HTTP method, even nonsense ones. HTTP methods are exposed as values of the method label in metrics. This means that requests with random, unique HTTP methods ("AAA", "AAB", "AAC" and so on) will cause the number of exported time series to explode, causing monitoring issues or possibly even affecting the controller itself (DoS).
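For context, this is roughly how the cardinality grows: a Prometheus counter vector keyed on an unbounded label creates a new time series for every distinct label value it sees. A minimal sketch using the Prometheus Go client library (the metric name and the loop are made up for illustration; this is not the controller's actual metrics code):

```go
package main

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
)

func main() {
	// Counter vector keyed on an unbounded "method" label (illustrative name).
	requests := prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "example_requests_total",
			Help: "Requests partitioned by HTTP method.",
		},
		[]string{"method"},
	)

	// Every distinct method value materialises a new child series.
	for i := 0; i < 1000; i++ {
		requests.WithLabelValues(fmt.Sprintf("AA%03d", i)).Inc()
	}
	// The vector now holds 1000 series, and they live until the process restarts.
}
```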

What you expected to happen:

Prometheus best practice is for label values to be bounded, so I expected non-standard HTTP requests to be handled in a way that doesn't affect metrics.

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):

This was tested with version 1.8.0

Kubernetes version (use kubectl version):

Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.14", GitCommit:"0018fa8af8ffaff22f36d3bd86289753ca61da81", GitTreeState:"clean", BuildDate:"2023-05-17T16:21:16Z", GoVersion:"go1.19.9", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: AWS EC2

  • OS (e.g. from /etc/os-release): Ubuntu 20.04

  • Kernel (e.g. uname -a): 5.11.0-1021-aws

  • Install tools: kubeadm

  • Kubectl info: Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.3", GitCommit:"25b4e43193bcda6c7328a6d147b1fb73a33f1598", GitTreeState:"clean", BuildDate:"2023-06-15T02:15:11Z", GoVersion:"go1.20.5", Compiler:"gc", Platform:"linux/amd64"}

  • How was the ingress-nginx-controller installed:

We installed using helm template, with the values file stored in a git repo:

helm template --values=- --namespace=ingress-nginx --repo https://kubernetes.github.io/ingress-nginx ingress-nginx ingress-nginx

  • Current state of the controller:

The controller is fully functional and works without problems. The only visible issue is the very high number of metrics being exported.

  • Current state of ingress object, if applicable:

All Ingress objects are healthy and the controller handles requests correctly.

How to reproduce this issue:

  1. Deploy the controller and add an Ingress object.
  2. Confirm which HTTP methods are exposed as metrics using the following Prometheus query:
sum(nginx_ingress_controller_requests) by (method)
  3. Make a few HTTP requests with random HTTP methods (a scripted Go equivalent is sketched after this list):
curl --request "AAA" https://example.com
curl --request "AAB" https://example.com
curl --request "AAC" https://example.com
curl --request "AAD" https://example.com
  4. Confirm that the new methods now appear as label values:
sum(nginx_ingress_controller_requests{method=~"AA.*"}) by (method)
  5. Note that the above label values persist until the controller is restarted.
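The same requests can be scripted. Here is a small Go equivalent of the curl commands above (https://example.com is a placeholder for an ingress-backed hostname):

```go
package main

import (
	"fmt"
	"net/http"
)

func main() {
	// Placeholder target; point this at a hostname served by the ingress.
	target := "https://example.com"

	// Each made-up method becomes a new "method" label value in the metrics.
	for _, method := range []string{"AAA", "AAB", "AAC", "AAD"} {
		req, err := http.NewRequest(method, target, nil)
		if err != nil {
			fmt.Println("building request failed:", err)
			continue
		}
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			fmt.Println("request failed:", err)
			continue
		}
		resp.Body.Close()
		fmt.Println(method, resp.Status)
	}
}
```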

Anything else we need to know:

The Kubernetes security team's assessment is that this bug doesn't require a CVE.

With regards to the best solution: we could list the valid HTTP methods in the controller code and report any method not matching the allowlist as method="invalid" or similar.
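A minimal sketch of that approach (the function name normalizeMethod and the fallback value "invalid" are illustrative, not existing ingress-nginx code):

```go
package metrics

// Standard HTTP methods per RFC 9110, plus PATCH from RFC 5789.
var allowedMethods = map[string]bool{
	"GET": true, "HEAD": true, "POST": true, "PUT": true, "PATCH": true,
	"DELETE": true, "CONNECT": true, "OPTIONS": true, "TRACE": true,
}

// normalizeMethod keeps the "method" label bounded: standard methods pass
// through unchanged, anything else collapses into a single "invalid" value.
func normalizeMethod(method string) string {
	if allowedMethods[method] {
		return method
	}
	return "invalid"
}
```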

    Labels

    kind/bug: Categorizes issue or PR as related to a bug.
    lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.
    needs-priority
    needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
