Flagger complaining about not reaching prometheus. #1758

Open
LittaKake opened this issue Jan 23, 2025 · 0 comments

Describe the bug

The Flagger controller tries to connect to Prometheus but cannot: the DNS lookup for the host prometheus fails.

{"level":"error","ts":"2025-01-23T09:22:50.821Z","caller":"controller/events.go:39","msg":"Error checking metric providers: prometheus not avaiable: running query failed: request failed: Get \"http://prometheus:9090/api/v1/query?query=vector%281%29\": dial tcp: lookup prometheus on 10.100.0.10:53: no such host","canary":"fluxcd-test.fluxcd-test","stacktrace":"github.com/fluxcd/flagger/pkg/controller.(*Controller).recordEventErrorf\n\t/workspace/pkg/controller/events.go:39\ngithub.com/fluxcd/flagger/pkg/controller.(*Controller).advanceCanary\n\t/workspace/pkg/controller/scheduler.go:207\ngithub.com/fluxcd/flagger/pkg/controller.CanaryJob.Start.func1\n\t/workspace/pkg/controller/job.go:39"}

The Flagger canary is stuck in the initializing state.

kubectl describe canary ...
 Warning  Synced  25m (x785 over 13h)    flagger  reconcileDestinationRule failed: DestinationRule fluxcd-test-canary.fluxcd-test create error: the server could not find the requested resource (post destinationrules.networking.istio.io)
  Warning  Synced  4m12s (x807 over 13h)  flagger  Error checking metric providers: prometheus not avaiable: running query failed: request failed: Get "http://prometheus:9090/api/v1/query?query=vector%281%29": dial tcp: lookup prometheus on 10.100.0.10:53: no such host
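
To help narrow this down, here is a minimal Python sketch (not part of Flagger) that issues the same vector(1) instant query seen in the log above and reports whether the failure is at DNS resolution or later. It assumes it is run from a pod inside the cluster; build_query_url and probe are hypothetical helper names introduced only for this illustration.

```python
import json
import socket
import urllib.error
import urllib.parse
import urllib.request


def build_query_url(base_url: str, query: str = "vector(1)") -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    encoded = urllib.parse.urlencode({"query": query})
    return f"{base_url.rstrip('/')}/api/v1/query?{encoded}"


def probe(base_url: str, timeout: float = 5.0) -> str:
    """Return a one-line diagnosis: DNS failure, request failure, or API status."""
    host = urllib.parse.urlsplit(base_url).hostname
    try:
        # Step 1: resolve the host name, which is the step that fails in the log.
        socket.getaddrinfo(host, None)
    except socket.gaierror as exc:
        return f"DNS lookup failed for {host!r}: {exc}"
    try:
        # Step 2: run the query and decode the JSON response.
        with urllib.request.urlopen(build_query_url(base_url), timeout=timeout) as resp:
            body = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError) as exc:
        return f"request failed: {exc}"
    return f"Prometheus answered with status {body.get('status')!r}"


if __name__ == "__main__":
    # The base URL is taken from the error message; adjust it to test another address.
    print(probe("http://prometheus:9090"))
```

If the first step fails, no service named prometheus is resolvable from where the script runs, which matches the "no such host" error.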

To Reproduce

  1. Install Flux (flux install) and bootstrap.
  2. Apply these manifests:
apiVersion: v1
kind: Namespace
metadata:
  name: flagger-system
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: flagger
  namespace: flagger-system
spec:
  interval: 1h
  releaseName: flagger
  install: # override existing Flagger CRDs
    crds: CreateReplace
  upgrade: # update Flagger CRDs
    crds: CreateReplace
  chart:
    spec:
      chart: flagger
      version: 1.x # update Flagger to the latest minor version
      interval: 6h # scan for new versions every six hours
      sourceRef:
        kind: HelmRepository
        name: flagger
      verify: # verify the chart signature with Cosign keyless
        provider: cosign 
  values:
    nodeSelector:
      beta.kubernetes.io/os: linux
---
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: flagger-loadtester
  namespace: flagger-system
spec:
  interval: 6h
  wait: true
  timeout: 5m
  prune: true
  sourceRef:
    kind: OCIRepository
    name: flagger-loadtester
  path: ./tester
  targetNamespace: flagger-system
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: flagger
  namespace: flagger-system
spec:
  interval: 1h
  url: oci://ghcr.io/fluxcd/charts
  type: oci
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: OCIRepository
metadata:
  name: flagger-loadtester
  namespace: flagger-system
spec:
  interval: 6h
  url: oci://ghcr.io/fluxcd/flagger-manifests
  ref:
    semver: 1.x
  verify:
    provider: cosign

And these are the application-specific manifests:

apiVersion: v1
kind: Namespace
metadata:
  name: fluxcd-test
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: fluxcd-test
  name: fluxcd-test
  namespace: fluxcd-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: fluxcd-test
  template:
    metadata:
      labels:
        app: fluxcd-test
    spec:
      containers:
      - image: myimage:latest
        name: fluxcd-test
---
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: fluxcd-test
  namespace: fluxcd-test
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: fluxcd-test
  service:
    port: 5000
  analysis:
    interval: 1m
    threshold: 10
    maxWeight: 50
    stepWeight: 5
    metrics:
    # every minute, check that at least 99% of responses are not 5xx
    - name: request-success-rate
      thresholdRange:
        min: 99
      interval: 1m
    # every minute, check that the response time is below 500ms
    - name: request-duration
      thresholdRange:
        max: 500
      interval: 1m
    webhooks:
    - name: load-test
      url: http://flagger-loadtester.flagger-system/
      metadata:
        cmd: "hey -z 1m -q 10 -c 2 http://fluxcd-test-canary.fluxcd-test:5000/"

Expected behavior

I expect no errors from the controller.

I also expect the canary to move past the initializing state.

Additional context

  • Flagger version: 1.40.0
  • Kubernetes version: Server version is v1.30.8-gke.1051000
  • Service Mesh provider: n/a
  • Ingress provider: n/a