Commit 7fb0a90

Kyle-Neale and alai97 authored
Add Traefik Mesh integration (#17585)
* Initial commit
* Add unit and end to end tests
* added more test and assets to integration
* sync CI
* fix dashboard and E2E tests
* fix metadata types
* added more Traefik config metrics
* Add changelog and readme
* remove additional changelog added
* apply readme suggestions

  Co-authored-by: Austin Lai <76412946+alai97@users.noreply.github.com>
* apply suggestions to readme

  Co-authored-by: Austin Lai <76412946+alai97@users.noreply.github.com>
* added logo and fixed nits
* removed unneeded PY2 import and made tests more readable

---------

Co-authored-by: Austin Lai <76412946+alai97@users.noreply.github.com>
1 parent 45d1714 commit 7fb0a90

34 files changed, +4346 -0 lines changed

.codecov.yml

Lines changed: 9 additions & 0 deletions
@@ -598,6 +598,10 @@ coverage:
         target: 75
         flags:
         - torchserve
+      Traefik_Mesh:
+        target: 75
+        flags:
+        - traefik_mesh
       Traffic_Server:
         target: 75
         flags:
@@ -1440,6 +1444,11 @@ flags:
     paths:
     - torchserve/datadog_checks/torchserve
     - torchserve/tests
+  traefik_mesh:
+    carryforward: true
+    paths:
+    - traefik_mesh/datadog_checks/traefik_mesh
+    - traefik_mesh/tests
   traffic_server:
     carryforward: true
     paths:

.github/workflows/config/labeler.yml

Lines changed: 2 additions & 0 deletions
@@ -469,6 +469,8 @@ integration/tomcat:
 - tomcat/**/*
 integration/torchserve:
 - torchserve/**/*
+integration/traefik_mesh:
+- traefik_mesh/**/*
 integration/traffic_server:
 - traffic_server/**/*
 integration/twemproxy:

.github/workflows/test-all.yml

Lines changed: 19 additions & 0 deletions
@@ -3355,6 +3355,25 @@ jobs:
       test-py3: ${{ inputs.test-py3 }}
       minimum-base-package: ${{ inputs.minimum-base-package }}
     secrets: inherit
+  j41a575a:
+    uses: ./.github/workflows/test-target.yml
+    with:
+      job-name: Traefik Mesh
+      target: traefik_mesh
+      platform: linux
+      runner: '["ubuntu-22.04"]'
+      repo: "${{ inputs.repo }}"
+      python-version: "${{ inputs.python-version }}"
+      standard: ${{ inputs.standard }}
+      latest: ${{ inputs.latest }}
+      agent-image: "${{ inputs.agent-image }}"
+      agent-image-py2: "${{ inputs.agent-image-py2 }}"
+      agent-image-windows: "${{ inputs.agent-image-windows }}"
+      agent-image-windows-py2: "${{ inputs.agent-image-windows-py2 }}"
+      test-py2: ${{ inputs.test-py2 }}
+      test-py3: ${{ inputs.test-py3 }}
+      minimum-base-package: ${{ inputs.minimum-base-package }}
+    secrets: inherit
   j8a1b5bb:
     uses: ./.github/workflows/test-target.yml
     with:

traefik_mesh/CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+# CHANGELOG - traefik_mesh
+
+<!-- towncrier release notes start -->
+

traefik_mesh/README.md

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
+# Agent Check: traefik_mesh
+
+## Overview
+
+Traefik Mesh is a lightweight and easy-to-deploy service mesh that offers advanced traffic management, security, and observability features for microservices applications, leveraging the capabilities of Traefik Proxy. With Datadog's Traefik Mesh integration, you can:
+- Obtain insights into the traffic entering your service mesh.
+- Gain insight into the performance, reliability, and security of individual services within your mesh, so you can confirm that services are operating efficiently and identify and resolve issues quickly.
+- Gain detailed insights into the internal traffic flows within your service mesh, which helps you monitor performance and ensure reliability.
+
+This check monitors [Traefik Mesh][1] through the Datadog Agent.
+
+## Setup
+
+Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the [Autodiscovery Integration Templates][3] for guidance on applying these instructions.
+
+### Installation
+
+Starting from Agent release v7.55.0, the Traefik Mesh check is included in the [Datadog Agent][2] package. No additional installation is needed on your server.
+
+**Note**: This check requires Agent v7.55.0 or later.
+
+### Configuration
+
+Traefik Mesh can be configured to expose Prometheus-formatted metrics. The Datadog Agent can collect these metrics using the integration described below. Follow the instructions to configure data collection for your Traefik Mesh instances. For the required configurations to expose the Prometheus metrics, see the [Observability page in the official Traefik Mesh documentation][10].
+
+In addition, a small subset of metrics can be collected by communicating with different API endpoints. Specifically:
+- `/api/version`: Version information on the Traefik proxy.
+- `/api/status/nodes`: Ready status of nodes visible to the Traefik [controller][12].
+- `/api/status/readiness`: Ready status of the Traefik controller.
+
+**Note**: This check uses [OpenMetrics][11] for metric collection, which requires Python 3.
+
+#### Containerized
+##### Metric collection
+
+Make sure that the Prometheus-formatted metrics are exposed in your Traefik Mesh cluster. You can configure and customize this by following the instructions on the [Observability page in the official Traefik Mesh documentation][10]. In order for the Agent to start collecting metrics, the Traefik Mesh pods need to be annotated. For more information about annotations, see the [Autodiscovery Integration Templates][3]. You can find additional configuration options by reviewing the [`traefik_mesh.d/conf.yaml` sample][4].
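As a rough illustration of the pod annotation mentioned above, the following sketch uses the Datadog Autodiscovery `checks` annotation format. It is not taken from this commit: `<CONTAINER_NAME>` is a placeholder that must match the container name in your pod spec, and the port `8082` and the `/metrics` path are assumptions based on the defaults described in this README and on Traefik's usual Prometheus setup.

```yaml
# Hypothetical pod metadata fragment (illustrative only).
# <CONTAINER_NAME>, the port, and the /metrics path are assumptions.
metadata:
  annotations:
    ad.datadoghq.com/<CONTAINER_NAME>.checks: |
      {
        "traefik_mesh": {
          "init_config": {},
          "instances": [
            {
              "openmetrics_endpoint": "http://%%host%%:8082/metrics"
            }
          ]
        }
      }
```

If your metrics entry point listens on another address, adjust the URL accordingly.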
+
+**Note**: The following metrics can only be collected if they are available. Some metrics are generated only when certain actions are performed.
+
+When configuring the Traefik Mesh check, you can use the following parameters:
+- `openmetrics_endpoint`: This parameter should be set to the location where the Prometheus-formatted metrics are exposed. The default port is `8082`, but it can be configured using the `--entryPoints.metrics.address` option. In containerized environments, `%%host%%` can be used for [host autodetection][3].
+- `traefik_proxy_api_endpoint`: This parameter is optional. The default port is `8080` and can be configured using the `--entryPoints.traefik.address` option. In containerized environments, `%%host%%` can be used for [host autodetection][3].
+- `traefik_controller_api_endpoint`: This parameter is optional. The default port is set to `9000`.
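The three parameters above could be combined in a `traefik_mesh.d/conf.yaml` instance along these lines. This is a minimal sketch, not the shipped sample: the `localhost` hosts are placeholders, the ports are the defaults listed above, and the `/metrics` path is an assumption about how the Prometheus entry point is exposed.

```yaml
# Sketch of traefik_mesh.d/conf.yaml (hosts and /metrics path are assumptions).
init_config:

instances:
    # Required: where the Prometheus-formatted metrics are exposed.
  - openmetrics_endpoint: http://localhost:8082/metrics
    # Optional: Traefik proxy API (serves /api/version).
    traefik_proxy_api_endpoint: http://localhost:8080
    # Optional: Traefik Mesh controller API (serves /api/status/nodes
    # and /api/status/readiness).
    traefik_controller_api_endpoint: http://localhost:9000
```

Leaving out either optional endpoint should simply skip the metrics that depend on it; check the `conf.yaml` sample for the authoritative list of options.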
+
+### Validation
+
+[Run the Agent's status subcommand][6] and look for `traefik_mesh` under the Checks section.
+
+## Data Collected
+
+### Metrics
+
+See [metadata.csv][7] for a list of metrics provided by this integration.
+
+### Events
+
+The Traefik Mesh integration does not include any events.
+
+### Service Checks
+
+See [service_checks.json][8] for a list of service checks provided by this integration.
+
+## Troubleshooting
+
+Need help? Contact [Datadog support][9].
+
+
+[1]: https://traefik.io/
+[2]: https://app.datadoghq.com/account/settings/agent/latest
+[3]: https://docs.datadoghq.com/agent/kubernetes/integrations/
+[4]: https://github.com/DataDog/integrations-core/blob/master/traefik_mesh/datadog_checks/traefik_mesh/data/conf.yaml.example
+[5]: https://docs.datadoghq.com/agent/guide/agent-commands/#start-stop-and-restart-the-agent
+[6]: https://docs.datadoghq.com/agent/guide/agent-commands/#agent-status-and-information
+[7]: https://github.com/DataDog/integrations-core/blob/master/traefik_mesh/metadata.csv
+[8]: https://github.com/DataDog/integrations-core/blob/master/traefik_mesh/assets/service_checks.json
+[9]: https://docs.datadoghq.com/help/
+[10]: https://doc.traefik.io/traefik/observability/metrics/overview/
+[11]: https://docs.datadoghq.com/integrations/openmetrics/
+[12]: https://doc.traefik.io/traefik-mesh/api/
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+name: Traefik Mesh
+files:
+- name: traefik_mesh.yaml
+  options:
+  - template: init_config
+    options:
+    - template: init_config/openmetrics
+  - template: instances
+    options:
+    - template: instances/openmetrics
+      overrides:
+        openmetrics_endpoint.required: true
+        openmetrics_endpoint.value.example: http://<PROXY_ENDPOINT>:<METRICS_PORT>
+        openmetrics_endpoint.description: |
+          Endpoint exposing the Traefik Proxy Prometheus metrics. For more information, refer to:
+          https://doc.traefik.io/traefik/observability/metrics/prometheus/
+    - name: traefik_proxy_api_endpoint
+      required: false
+      description: URL of the Traefik Mesh proxy to query.
+      value:
+        example: http://<PROXY_ENDPOINT>:<API_PORT>
+        pattern: \w+
+        type: string
+    - name: traefik_controller_api_endpoint
+      required: false
+      description: URL of the Traefik Mesh controller to query.
+      value:
+        example: http://<CONTROLLER_ENDPOINT>:<API_PORT>
+        pattern: \w+
+        type: string