SNAT port exhaustion alerting
Neill Turner committed Jan 9, 2025
1 parent d52ad9a commit 7afe812
Showing 1 changed file with 20 additions and 0 deletions.
20 changes: 20 additions & 0 deletions documentation/monitoring.md
@@ -119,3 +119,23 @@ Following auth keys need to be stored on azure vault as a secret.
1. PROMETHEUS-AUTH
2. ALERTMANAGER-AUTH
3. THANOS-AUTH

## Azure Alerts

AKS uses an Azure load balancer for inbound and outbound connections, which can lead to SNAT port exhaustion if a node makes a lot of outbound network requests.

We have two alerts defined; each sends a message to the Slack channel.

## High Port Usage

If SNAT port usage exceeds a threshold, we raise a warning so that we can take pre-emptive action.

## Port Exhaustion

If connections start failing because of port exhaustion, we raise an error alert.
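
The relationship between the two alerts can be illustrated with a minimal Python sketch. It is not the actual alert definition: the `SnatSample` type, the `classify` function, and the 0.8 warning ratio are all hypothetical, since this document does not state the real threshold or how the alerts are configured.

```python
from dataclasses import dataclass

# Hypothetical warning threshold (fraction of allocated ports in use).
# The real threshold is not stated in this document.
WARN_RATIO = 0.8


@dataclass
class SnatSample:
    """One observation of SNAT port metrics (illustrative field names)."""
    used_ports: int
    allocated_ports: int
    failed_connections: int


def classify(sample: SnatSample, warn_ratio: float = WARN_RATIO) -> str:
    """Return "error", "warning", or "ok" for a single sample."""
    # Port exhaustion: connections are already failing, so this is an error.
    if sample.failed_connections > 0:
        return "error"
    # High port usage: nothing is failing yet, but usage is over the threshold.
    if sample.allocated_ports > 0:
        if sample.used_ports / sample.allocated_ports > warn_ratio:
            return "warning"
    return "ok"
```

The error check comes first so that a node with failing connections is never reported as a mere warning.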

## Troubleshooting Port Exhaustion

Unfortunately, we can't alert on which Kubernetes service is using a high number of ports, so finding it is a troubleshooting exercise that follows this guide:

[Troubleshoot SNAT port exhaustion on Azure Kubernetes Service nodes](https://learn.microsoft.com/en-us/troubleshoot/azure/azure-kubernetes/connectivity/snat-port-exhaustion?tabs=for-a-linux-pod)
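
As a rough aid when working on a node, the sketch below counts TCP connections per local IP address so that the busiest source stands out. It is not taken from the linked guide: it assumes text in the column layout printed by `ss -tan` (state, two queue columns, local address, peer address), and the function name `count_connections_by_local_ip` is hypothetical.

```python
from collections import Counter


def count_connections_by_local_ip(ss_output: str) -> Counter:
    """Count connections per local IP from `ss -tan`-style text (assumed format)."""
    counts: Counter = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        # Skip the header row and any line too short to hold both addresses.
        if len(fields) < 5 or fields[0] == "State":
            continue
        local_address = fields[3]
        # Split on the last colon so the port is dropped and the IP is kept.
        ip = local_address.rsplit(":", 1)[0]
        counts[ip] += 1
    return counts
```

Calling `count_connections_by_local_ip(text).most_common(5)` lists the five local IPs with the most connections; an IP can then be matched to a pod with `kubectl get pods -A -o wide`.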
