Query about visibility of alerts in replicas in Alertmanager cluster mode #4249

Open
shivgana opened this issue Feb 12, 2025 · 4 comments

Comments

@shivgana

shivgana commented Feb 12, 2025

Alertmanager Version: 0.27.0

Details:
Override yaml file

replicaCount: 3
extraArgs:
  data.retention: 130m
  log.level: debug
config:
  global:
    resolve_timeout: 24h
  receivers:
  - name: webhook
    webhook_configs:
    - url: http://test-streamer:500/
  route:
    group_by: 
    - alertname
    group_wait: 30s
    group_interval: 10s
    repeat_interval: 10m
    receiver: webhook

When we deploy Alertmanager with 3 replicas, it is started with the Alertmanager CLI cluster options.
Pods are up and running.
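
For readers unfamiliar with cluster mode, the cluster options on each replica typically look something like the sketch below. The `--cluster.listen-address` and `--cluster.peer` flags are real Alertmanager flags, but the pod and service names here are hypothetical placeholders, and the exact arguments rendered by a given Helm chart may differ.

```yaml
# Illustrative container args for one Alertmanager replica (hostnames are assumed, not from this issue)
args:
- --cluster.listen-address=0.0.0.0:9094                    # gossip port used between peers
- --cluster.peer=alertmanager-0.alertmanager-headless:9094
- --cluster.peer=alertmanager-1.alertmanager-headless:9094
- --cluster.peer=alertmanager-2.alertmanager-headless:9094
```

Port 9094 is the default cluster port and is separate from the API port (9093) that receives alerts.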

I then sent an alert to the Alertmanager on the first replica pod.

Query:

  • Will the Alertmanager on the first pod forward the alert to its peers? Or does it only monitor, and rely on the peers receiving the same alert themselves, in order to keep sending the alert status to the webhook?
@grobinson-grafana
Contributor

It does not. As per the docs:

Important: Do not load balance traffic between Prometheus and its Alertmanagers, but instead point Prometheus to a list of all Alertmanagers. The Alertmanager implementation expects all alerts to be sent to all Alertmanagers to ensure high availability.
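
As a minimal sketch of what that recommendation could look like in the Prometheus configuration, assuming three replicas reachable at the hypothetical addresses below (the hostnames are placeholders, not taken from this issue), each Alertmanager is listed as its own target rather than hidden behind a single load-balanced service:

```yaml
# prometheus.yml fragment: list every Alertmanager replica explicitly
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - alertmanager-0.alertmanager-headless:9093   # hypothetical pod DNS names
      - alertmanager-1.alertmanager-headless:9093
      - alertmanager-2.alertmanager-headless:9093
```

With this layout Prometheus sends each alert to all three targets, which is what the quoted documentation asks for.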

@shivgana
Author

So we have to point Prometheus to all replicas of Alertmanager.

@kapilnayar

@grobinson-grafana is there any info on how the deduplication of alerts is done? The cluster nodes will need to communicate some information about the consumed alerts at some point. Also, if every alert is sent to all Alertmanager nodes, does that impact scalability when there is a high number of alerts in the system?

@grobinson-grafana
Contributor

The official docs don't have a lot of information on how it works, but this talk from PromCon 2017 is still relevant and covers it in useful detail: https://youtu.be/i6Hmc0bKHAY.
