Commit 8aed42a

NouemanKHAL and iliakur authored
Add the Teleport Integration (#16877)
* initial integration scaffolding
* add initial setup for integration tests using docker-compose
* add first integration test
* critical test pass
* add test_connect_ok
* fix test_connect_exception and passing test_connect_ok
* add test to check common metrics collection
* add metadata.csv
* ddev test teleport -fs
* delete some comments
* wip: use openmetrics base class
* catch main super.check exception, passing -> first two tests
* move exception to test to unit tests
* add unit tests, setup mocks, fix integration tests
* assert common metrics in unit tests
* add metrics fixture
* add teleport_cache_stale_events mocks
* reporting common metrics
* rename 'version' tag to 'teleport_version'
* format
* remove docker-compose down nothing behavior
* update metadata.csv metric names format
* update unit and integration tests to match the new metric names
* fix implementation to pass the tests
* [PLINT-302] Report Teleport Proxy metrics (#17018)
* add unit tests for proxy metrics
* passing metrics that don't require mocks
* tests passing for grpc_client_* metrics
* separate unit tests by teleport metrics groups
* update teleport proxy metrics test to check all metrics
* update fixtures with missing proxy metrics
* update proxy metric names in the test
* collect proxy metrics in the check
* format
* ignore linting on a long line
* separate metrics maps into metrics.py per suggestion
* ddev validate ci --sync
* ddev validate config teleport -s
* ddev validate models teleport -s
* ddev validate labeler --sync
* remove arch suffix from docker-compose teleport image
* add E2E test
* fix linting
* [PLINT-303] Report Teleport Auth Service and Backends metrics (#17050)
* add test_auth_teleport_metrics test collect auth service metrics update fixtures with missing auth service metrics format add 'cluster_name_not_found' metric fix wrong metrics, grpc_server metrics belong to Auth service replace single quotes with double quotes + format change prefix for auth audit_log metrics add auth s3 backend metrics test collect auth s3 backend metrics update fixtures with auth s3 backend metrics
* add support for backend cache metrics
* add support for backend dynamoDB metrics
* add support for backend firestore metrics
* add support for backend GCP GCS metrics
* add support for backend ETCD metrics
* lint
* Refactor and cleanup Teleport Integration (#17084)
* use metrics_path fixture in tests
* use instance fixture in tests
* move metrics constants in tests to common.py
* handle check exception
* remove the match parameter when testing for exception
* add .common prefix to common instance metrics for filtering purposes later
* update integration tests
* fix linting
* fix e2e tests
* use the COMMON_METRICS const in the integration and e2e tests
* format
* use the INSTANCE const in the test_e2e.py
* update the config properties to align with the RFC suggestion
* lint
* make DEFAULT_DIAG_PORT class constant
* [PLINT-304] Report metrics for the Teleport SSH Service (#17111)
* use metrics_path fixture in tests
* use instance fixture in tests
* move metrics constants in tests to common.py
* handle check exception
* remove the match parameter when testing for exception
* add .common prefix to common instance metrics for filtering purposes later
* update integration tests
* fix linting
* fix e2e tests
* use the COMMON_METRICS const in the integration and e2e tests
* format
* use the INSTANCE const in the test_e2e.py
* update the config properties to align with the RFC suggestion
* lint
* make DEFAULT_DIAG_PORT class constant
* add tests for SSH service metrics
* collect SSH metrics, tests passing
* cleanup: move METRIC_MAP to metrics.py
* [PLINT-306] Report metrics for the Teleport Kubernetes Service (#17113)
* add tests for Kubernetes service metrics
* successfully collecting kubernetes client metrics
* add tests for Kubernetes service server metrics
* successfully collecting kubernetes server metrics
* [PLINT-305] Report metrics for the Teleport Database Service (#17114)
* add tests for Database service metrics
* successfully collecting database service metrics
* [PLINT-308] Report metrics for the Teleport Enhanced Session Recording / BPF (#17116)
* add tests for BPF metrics
* successfully collecting BPF metrics
* [PLINT-307] Report metrics for the Teleport internal Prometheus (#17117)
* add tests for Prometheus metrics
* successfully collecting Prometheus metrics
* [PLINT-326] Add `teleport_service` tag to Teleport metrics (#17216)
* update tests to check for 'teleport_service' tag for auth metrics
* update tests to check for 'teleport_service' tag for ssh metrics
* update tests to check for 'teleport_service' tag for proxy metrics
* update tests to check for 'teleport_service' tag for database metrics
* update tests to check for 'teleport_service' tag for kubernetes metrics
* update tests to check for 'teleport_service' tag for common teleport metrics
* add METRIC_MAP_BY_SERVICE
* add custom metric transformer to add 'teleport_service' tag
* remove unexisting metric type case
* remove useless check
* [PLINT-331] Update `metadata.csv` for the Teleport Integration (#17241)
* update metadata.csv
* fix metadata duplications, and invalid histogram types to their corresponding count type
* remove periods at the end of descriptions
* [PLINT-325] Update Teleport Configuration spec (#17261)
* add 'teleport_url' and 'diag_port' properties to the Configuration Spec
* ddev validate config -s
* ddev validate models
* sort metadata.csv
* ddev validate ci -s
* add DEFAULT_METRIC_LIMIT to fix the openmetrics validation
* update classifier_tags
* add description to the manifest
* fix tile description
* update manifest check for metrics
* [WIP] add dashboard
* add dashboard entry in the manifest.json
* delete monitors, logs, and saved_views entries from the manifest.json
* remove extra comma in manifest.json
* changelog
* ddev validate label --sync
* [PLINT-362] Standardize metric names (#17383)
* update description 'cluster' -> 'instance'
* change teleport.auth.audit_log.* prefix to teleport.audit_log.* to ease filtering audit log metrics
* standardize metric names in the tests
* update implementation
* update metadata.csv
* remove extra space in metadata.csv
* audit_log -> auth.audit_log
* Send Teleport service check as a metric (#17441)
* update description 'cluster' -> 'instance'
* send count metric instead of service check
* raise on exception
* format
* fix integration tests
* remove obsolete field version in docker-compose.yml
* fix msg arg to self.count and add teleport_status tag to health.up metric
* fix integration tests typo
* fix e2e tests
* log error message on exception
* lint
* add teleport.health.up to metadata.csv
* add assertion for unreachable health state metric tag
* apply suggestion
* Update metadata.csv with descriptions and units (#17456)
* regenerate metadata.csv
* add all missing descriptions
* add some units
* add bucket, sum, and count metrics for histogram metrics
* update units and sort
* fix manifest.json errors
* update README.md with prerequisites section
* delete dashboard
* Apply suggestions from code review
Co-authored-by: Ilia Kurenkov <ilia.kurenkov@datadoghq.com>

---------

Co-authored-by: Ilia Kurenkov <ilia.kurenkov@datadoghq.com>
1 parent 33d466e commit 8aed42a

40 files changed: +3268 / -2 lines

.codecov.yml

Lines changed: 9 additions & 0 deletions
@@ -570,6 +570,10 @@ coverage:
       target: 75
       flags:
       - teamcity
+    Teleport:
+      target: 75
+      flags:
+      - teleport
     Tekton:
       target: 75
       flags:
@@ -1392,6 +1396,11 @@ flags:
     paths:
     - teamcity/datadog_checks/teamcity
     - teamcity/tests
+  teleport:
+    carryforward: true
+    paths:
+    - teleport/datadog_checks/teleport
+    - teleport/tests
   tekton:
     carryforward: true
     paths:

.github/workflows/config/labeler.yml

Lines changed: 4 additions & 2 deletions
@@ -335,10 +335,10 @@ integration/oracle:
 - oracle/**/*
 integration/otel:
 - otel/**/*
-integration/pan_firewall:
-- pan_firewall/**/*
 integration/palo_alto_panorama:
 - palo_alto_panorama/**/*
+integration/pan_firewall:
+- pan_firewall/**/*
 integration/pdh_check:
 - pdh_check/**/*
 integration/pgbouncer:
@@ -447,6 +447,8 @@ integration/teamcity:
 - teamcity/**/*
 integration/tekton:
 - tekton/**/*
+integration/teleport:
+- teleport/**/*
 integration/temporal:
 - temporal/**/*
 integration/tenable:

.github/workflows/test-all.yml

Lines changed: 19 additions & 0 deletions
@@ -3203,6 +3203,25 @@ jobs:
       test-py3: ${{ inputs.test-py3 }}
       minimum-base-package: ${{ inputs.minimum-base-package }}
     secrets: inherit
+  je68b3b9:
+    uses: ./.github/workflows/test-target.yml
+    with:
+      job-name: Teleport
+      target: teleport
+      platform: linux
+      runner: '["ubuntu-22.04"]'
+      repo: "${{ inputs.repo }}"
+      python-version: "${{ inputs.python-version }}"
+      standard: ${{ inputs.standard }}
+      latest: ${{ inputs.latest }}
+      agent-image: "${{ inputs.agent-image }}"
+      agent-image-py2: "${{ inputs.agent-image-py2 }}"
+      agent-image-windows: "${{ inputs.agent-image-windows }}"
+      agent-image-windows-py2: "${{ inputs.agent-image-windows-py2 }}"
+      test-py2: ${{ inputs.test-py2 }}
+      test-py3: ${{ inputs.test-py3 }}
+      minimum-base-package: ${{ inputs.minimum-base-package }}
+    secrets: inherit
   j840fec7:
     uses: ./.github/workflows/test-target.yml
     with:

teleport/CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
# CHANGELOG - Teleport

<!-- towncrier release notes start -->

teleport/README.md

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
# Agent Check: Teleport

## Overview

This check monitors [Teleport][1] through the Datadog Agent.

## Setup

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the [Autodiscovery Integration Templates][3] for guidance on applying these instructions.

### Installation

The Teleport check is included in the [Datadog Agent][2] package.
No additional installation is needed on your server.

### Prerequisites

The Teleport check gathers Teleport's metrics and performance data using two distinct endpoints:
- The [Health endpoint](https://goteleport.com/docs/management/diagnostics/monitoring/#healthz) provides the overall health status of your Teleport instance.
- The [OpenMetrics endpoint](https://goteleport.com/docs/reference/metrics/#auth-service-and-backends) extracts metrics on the Teleport instance and the various services operating within that instance.

These endpoints are not enabled by default. To enable the diagnostic HTTP endpoints in your Teleport instance, see the Teleport [documentation](https://goteleport.com/docs/management/diagnostics/monitoring/#enable-health-monitoring).

### Configuration

1. Edit the `teleport.d/conf.yaml` file, in the `conf.d/` folder at the root of your Agent's configuration directory, to start collecting your Teleport performance data. See the [sample teleport.d/conf.yaml][4] for all available configuration options.

2. [Restart the Agent][5].

### Validation

[Run the Agent's status subcommand][6] and look for `teleport` under the Checks section.

## Data Collected

### Metrics

See [metadata.csv][7] for a list of metrics provided by this integration.

### Events

The Teleport integration does not include any events.

### Service Checks

The Teleport integration does not include any service checks.

See [service_checks.json][8] for a list of service checks provided by this integration.

## Troubleshooting

Need help? Contact [Datadog support][9].


[1]: **LINK_TO_INTEGRATION_SITE**
[2]: https://app.datadoghq.com/account/settings/agent/latest
[3]: https://docs.datadoghq.com/agent/kubernetes/integrations/
[4]: https://github.com/DataDog/integrations-core/blob/master/teleport/datadog_checks/teleport/data/conf.yaml.example
[5]: https://docs.datadoghq.com/agent/guide/agent-commands/#start-stop-and-restart-the-agent
[6]: https://docs.datadoghq.com/agent/guide/agent-commands/#agent-status-and-information
[7]: https://github.com/DataDog/integrations-core/blob/master/teleport/metadata.csv
[8]: https://github.com/DataDog/integrations-core/blob/master/teleport/assets/service_checks.json
[9]: https://docs.datadoghq.com/help/
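The Prerequisites section above depends on Teleport's diagnostics service being enabled. A quick way to sanity-check both endpoints before configuring the Agent is a short script like the following; this is a minimal sketch, not part of this commit, assuming the default diagnostics address `http://127.0.0.1:3000` and the `requests` library:

```python
# Hypothetical helper: probe the two diagnostic endpoints the Teleport check depends on.
import requests

DIAG_ADDR = "http://127.0.0.1:3000"  # assumed default teleport_url + diag_port

for path in ("/healthz", "/metrics"):
    url = DIAG_ADDR + path
    try:
        response = requests.get(url, timeout=5)
        response.raise_for_status()
        print(f"{url} -> {response.status_code} OK")
    except requests.RequestException as exc:
        print(f"{url} unreachable: {exc}")
```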
teleport/assets/configuration/spec.yaml

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
name: Teleport
files:
- name: teleport.yaml
  options:
  - template: init_config
    options:
    - template: init_config/default
  - template: instances
    options:
    - name: "teleport_url"
      required: true
      description: "The Teleport URL to connect to."
      value:
        type: string
        example: "http://127.0.0.1"
    - name: "diag_port"
      description: "The Teleport Diagnostic Port."
      value:
        type: integer
        example: 3000


    - template: instances/default
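For reference, these two options are what the check combines into its diagnostic base address (see `_parse_config` in `check.py` further down). A small illustration, using the example values above:

```python
# Illustration only: mirrors how the check derives its endpoints from the instance config.
teleport_url = "http://127.0.0.1"  # required option
diag_port = 3000                   # optional, defaults to 3000

diag_addr = "{}:{}".format(teleport_url, diag_port)
health_endpoint = f"{diag_addr}/healthz"        # polled for teleport.health.up
openmetrics_endpoint = f"{diag_addr}/metrics"   # scraped by OpenMetricsBaseCheckV2

print(health_endpoint)       # http://127.0.0.1:3000/healthz
print(openmetrics_endpoint)  # http://127.0.0.1:3000/metrics
```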

teleport/assets/service_checks.json

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
[]

teleport/changelog.d/16877.added

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
Add the Teleport Integration

teleport/datadog_checks/__init__.py

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
# (C) Datadog, Inc. 2024-present
# All rights reserved
# Licensed under a 3-clause BSD style license (see LICENSE)
__path__ = __import__('pkgutil').extend_path(__path__, __name__)  # type: ignore
teleport/datadog_checks/teleport/__about__.py

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
# (C) Datadog, Inc. 2024-present
# All rights reserved
# Licensed under a 3-clause BSD style license (see LICENSE)
__version__ = '0.0.1'
teleport/datadog_checks/teleport/__init__.py

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
# (C) Datadog, Inc. 2024-present
# All rights reserved
# Licensed under a 3-clause BSD style license (see LICENSE)
from .__about__ import __version__
from .check import TeleportCheck


__all__ = ['__version__', 'TeleportCheck']
teleport/datadog_checks/teleport/check.py

Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@
# (C) Datadog, Inc. 2024-present
# All rights reserved
# Licensed under a 3-clause BSD style license (see LICENSE)

from datadog_checks.base import OpenMetricsBaseCheckV2
from datadog_checks.base.checks.openmetrics.v2.transform import get_native_dynamic_transformer

from .metrics import METRIC_MAP, METRIC_MAP_BY_SERVICE


class TeleportCheck(OpenMetricsBaseCheckV2):
    __NAMESPACE__ = 'teleport'
    DEFAULT_METRIC_LIMIT = 0
    DEFAULT_DIAG_PORT = 3000

    def __init__(self, name, init_config, instances):
        super().__init__(name, init_config, instances)
        self.check_initializations.appendleft(self._parse_config)
        self.check_initializations.append(self._configure_additional_transformers)

    def check(self, _):
        try:
            health_endpoint = f"{self.diag_addr}/healthz"
            response = self.http.get(health_endpoint)
            response.raise_for_status()
            self.count("health.up", 1, tags=["teleport_status:ok"])
        except Exception as e:
            self.log.error(
                "Cannot connect to Teleport HTTP diagnostic health endpoint '%s': %s.\nPlease make sure to enable Teleport's diagnostic HTTP endpoints.",  # noqa: E501
                health_endpoint,
                str(e),
            )
            self.count("health.up", 0, tags=["teleport_status:unreachable"])
            raise

        super().check(_)

    def _parse_config(self):
        self.teleport_url = self.instance.get("teleport_url")
        self.diag_port = self.instance.get("diag_port", self.DEFAULT_DIAG_PORT)
        if self.teleport_url:
            self.diag_addr = "{}:{}".format(self.teleport_url, self.diag_port)
            self.instance.setdefault("openmetrics_endpoint", "{}/metrics".format(self.diag_addr))
            self.instance.setdefault("rename_labels", {'version': "teleport_version"})

    def _configure_additional_transformers(self):
        metric_transformer = self.scrapers[self.instance['openmetrics_endpoint']].metric_transformer
        metric_transformer.add_custom_transformer(r'.*', self.configure_transformer_teleport_metrics(), pattern=True)

    def configure_transformer_teleport_metrics(self):
        def transform(_metric, sample_data, _runtime_data):
            for sample, tags, hostname in sample_data:
                metric_name = _metric.name
                metric_type = _metric.type

                # ignore metrics we don't collect
                if metric_name not in METRIC_MAP:
                    continue

                # extract `teleport_service` tag
                service = METRIC_MAP_BY_SERVICE.get(metric_name, "teleport")
                tags = tags + [f"teleport_service:{service}"]

                # get mapped metric name
                new_metric_name = METRIC_MAP[metric_name]
                if isinstance(new_metric_name, dict) and "name" in new_metric_name:
                    new_metric_name = new_metric_name["name"]

                # send metric
                metric_transformer = self.scrapers[self.instance['openmetrics_endpoint']].metric_transformer

                if metric_type == "counter":
                    self.count(new_metric_name + ".count", sample.value, tags=tags, hostname=hostname)
                elif metric_type == "gauge":
                    self.gauge(new_metric_name, sample.value, tags=tags, hostname=hostname)
                else:
                    native_transformer = get_native_dynamic_transformer(
                        self, new_metric_name, None, metric_transformer.global_options
                    )

                    def add_tag_to_sample(sample, service):
                        [sample, tags, hostname] = sample
                        return [sample, tags + [f"teleport_service:{service}"], hostname]

                    modified_sample_data = (add_tag_to_sample(x, service) for x in sample_data)
                    native_transformer(_metric, modified_sample_data, _runtime_data)

        return transform
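The transformer above keys everything off `METRIC_MAP` and `METRIC_MAP_BY_SERVICE`, which live in `metrics.py` and are not shown in this view. A minimal sketch of the shapes the check expects, with hypothetical entries only; the real maps added in this commit cover far more metrics:

```python
# Hypothetical excerpt illustrating the data shapes consumed by the custom transformer.
METRIC_MAP = {
    # raw exposition name -> Datadog name (the "teleport." namespace is added by the check)
    "process_state": "process.state",
    "teleport_connected_resources": "connected_resources",
    # a dict value is also accepted, as long as it carries a "name" key
    "grpc_client_started_total": {"name": "proxy.grpc_client.started"},
}

METRIC_MAP_BY_SERVICE = {
    # raw exposition name -> value of the `teleport_service` tag
    "grpc_client_started_total": "proxy",
    "teleport_connected_resources": "auth",
    # anything missing here falls back to "teleport" (the common service)
}
```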
teleport/datadog_checks/teleport/config_models/__init__.py

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
# (C) Datadog, Inc. 2024-present
# All rights reserved
# Licensed under a 3-clause BSD style license (see LICENSE)

# This file is autogenerated.
# To change this file you should edit assets/configuration/spec.yaml and then run the following commands:
#     ddev -x validate config -s <INTEGRATION_NAME>
#     ddev -x validate models -s <INTEGRATION_NAME>

from .instance import InstanceConfig
from .shared import SharedConfig


class ConfigMixin:
    _config_model_instance: InstanceConfig
    _config_model_shared: SharedConfig

    @property
    def config(self) -> InstanceConfig:
        return self._config_model_instance

    @property
    def shared_config(self) -> SharedConfig:
        return self._config_model_shared
teleport/datadog_checks/teleport/config_models/defaults.py

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
# (C) Datadog, Inc. 2024-present
# All rights reserved
# Licensed under a 3-clause BSD style license (see LICENSE)

# This file is autogenerated.
# To change this file you should edit assets/configuration/spec.yaml and then run the following commands:
#     ddev -x validate config -s <INTEGRATION_NAME>
#     ddev -x validate models -s <INTEGRATION_NAME>


def instance_diag_port():
    return 3000


def instance_disable_generic_tags():
    return False


def instance_empty_default_hostname():
    return False


def instance_min_collection_interval():
    return 15
teleport/datadog_checks/teleport/config_models/instance.py

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
# (C) Datadog, Inc. 2024-present
# All rights reserved
# Licensed under a 3-clause BSD style license (see LICENSE)

# This file is autogenerated.
# To change this file you should edit assets/configuration/spec.yaml and then run the following commands:
#     ddev -x validate config -s <INTEGRATION_NAME>
#     ddev -x validate models -s <INTEGRATION_NAME>

from __future__ import annotations

from typing import Optional

from pydantic import BaseModel, ConfigDict, field_validator, model_validator

from datadog_checks.base.utils.functions import identity
from datadog_checks.base.utils.models import validation

from . import defaults, validators


class MetricPatterns(BaseModel):
    model_config = ConfigDict(
        arbitrary_types_allowed=True,
        frozen=True,
    )
    exclude: Optional[tuple[str, ...]] = None
    include: Optional[tuple[str, ...]] = None


class InstanceConfig(BaseModel):
    model_config = ConfigDict(
        validate_default=True,
        arbitrary_types_allowed=True,
        frozen=True,
    )
    diag_port: Optional[int] = None
    disable_generic_tags: Optional[bool] = None
    empty_default_hostname: Optional[bool] = None
    metric_patterns: Optional[MetricPatterns] = None
    min_collection_interval: Optional[float] = None
    service: Optional[str] = None
    tags: Optional[tuple[str, ...]] = None
    teleport_url: str

    @model_validator(mode='before')
    def _initial_validation(cls, values):
        return validation.core.initialize_config(getattr(validators, 'initialize_instance', identity)(values))

    @field_validator('*', mode='before')
    def _validate(cls, value, info):
        field = cls.model_fields[info.field_name]
        field_name = field.alias or info.field_name
        if field_name in info.context['configured_fields']:
            value = getattr(validators, f'instance_{info.field_name}', identity)(value, field=field)
        else:
            value = getattr(defaults, f'instance_{info.field_name}', lambda: value)()

        return validation.utils.make_immutable(value)

    @model_validator(mode='after')
    def _final_validation(cls, model):
        return validation.core.check_model(getattr(validators, 'check_instance', identity)(model))
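Putting the generated config model and the check together, here is a hedged, pytest-style sketch of the unreachable-health path reported by `check()`. It is a hypothetical test module, not one of this commit's test files, and it assumes the `aggregator` and `dd_run_check` fixtures from the datadog-checks-dev pytest plugin:

```python
# Hypothetical test sketch: exercises the teleport.health.up "unreachable" path
# without a running Teleport instance.
from unittest import mock

import pytest

from datadog_checks.teleport import TeleportCheck

INSTANCE = {"teleport_url": "http://127.0.0.1", "diag_port": 3000}


def test_health_up_unreachable(aggregator, dd_run_check):
    check = TeleportCheck("teleport", {}, [INSTANCE])

    # Simulate an unreachable diagnostics endpoint: the health GET fails, so the
    # check emits health.up with teleport_status:unreachable and re-raises.
    with mock.patch.object(check.http, "get", side_effect=Exception("connection refused")):
        with pytest.raises(Exception):
            dd_run_check(check)

    aggregator.assert_metric(
        "teleport.health.up", value=0, tags=["teleport_status:unreachable"]
    )
```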
