docs/setup_installation/admin/monitoring/export-metrics.md (+88 -33)
@@ -2,73 +2,126 @@
2
2
3
3
## Introduction
4
4
Hopsworks services produce metrics which are centrally gathered by [Prometheus](https://prometheus.io/) and visualized in [Grafana](../grafana).
5
-
Although the system is self-contained, it is possible to export these metrics to third-party services or another Prometheus instance.
5
+
Although the system is self-contained, another *federated* Prometheus instance can scrape these metrics, or Prometheus can push them directly to another system.
6
6
This is useful if you have a centralized monitoring system with already configured alerts.
7
7
8
8
## Prerequisites
9
-
In order to configure Prometheus to export metrics you need `root` SSH access to either Hopsworks or to the target server depending on the export method you choose below.
9
+
To configure Prometheus to export metrics, you need permission to change the configuration of the remote Prometheus instance.
10
10
11
11
## Exporting metrics
12
12
Prometheus can be configured to export metrics to another Prometheus instance (cross-service federation) or to a custom service which knows how to handle them.
13
13
14
14
### Prometheus federation
15
-
Prometheus servers can be federated to scale better or to just clone all metrics (cross-service federation). Prometheus federation is well [documented](https://prometheus.io/docs/prometheus/latest/federation/#cross-service-federation)
16
-
but there are some specificities to Hopsworks.
15
+
Prometheus servers can be federated to scale better or to just clone all metrics (cross-service federation).
17
16
18
17
In the guide below we assume **Prometheus A** is the service running in Hopsworks and **Prometheus B** is the server you want to clone metrics to.
19
18
20
19
#### Step 1
21
-
**Prometheus B** needs to be able to connect to TCP port `9089` of **Prometheus B** to scrape metrics. If you have any firewall (or Security Group) in place, allow ingress for that port.
20
+
**Prometheus B** needs to be able to connect to TCP port `9090` of **Prometheus A** to scrape metrics. If you have any firewall (or Security Group) in place, allow ingress for that port.
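Before changing any configuration, it can help to confirm that the port is actually reachable from the **Prometheus B** host. The sketch below is one possible check, not part of Hopsworks: `check_tcp_port` and the `PROMETHEUS_A_ADDRESS` variable are hypothetical names, and the check assumes `bash` and the coreutils `timeout` command are available.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port can be opened.
# Relies on bash's /dev/tcp pseudo-device and the coreutils `timeout` command.
check_tcp_port() {
  local host="$1" port="$2"
  # Give up after 5 seconds so a silently dropping firewall does not hang the check.
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Run on the Prometheus B host. PROMETHEUS_A_ADDRESS is a placeholder you set yourself.
if [ -n "${PROMETHEUS_A_ADDRESS:-}" ]; then
  if check_tcp_port "$PROMETHEUS_A_ADDRESS" 9090; then
    echo "port 9090 is reachable"
  else
    echo "port 9090 is NOT reachable; check firewall or Security Group rules"
  fi
fi
```

If the check fails, review the ingress rules before continuing with the next steps.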
22
21
23
22
#### Step 2
24
-
SSH into **Prometheus B** server, edit Prometheus configuration file and add the following under the `scrape_configs`
23
+
The next step is to expose **Prometheus A**, which runs inside the Hopsworks Kubernetes cluster. If **Prometheus B** already has direct access to **Prometheus A**, you can skip this step.
24
+
25
+
We will create a Kubernetes *Service* of type *LoadBalancer* to expose port `9090`.
26
+
27
+
!!! warning
    If you need to apply custom **annotations**, modify the manifest below.
    The example below assumes Hopsworks is **installed** in the *hopsworks* Namespace.
30
+
31
+
```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: prometheus-external
  namespace: hopsworks
  labels:
    app: prometheus
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/component: server
  ports:
    - protocol: TCP
      port: 9090
      targetPort: 9090
EOF
```
51
+
52
+
Next, find the external IP address of the newly created Service:
53
+
54
+
```bash
export NAMESPACE=hopsworks
kubectl -n "$NAMESPACE" get svc prometheus-external -ojsonpath='{.status.loadBalancer.ingress[0].ip}'
```
58
+
59
+
!!! warning
    It may take a few seconds until an IP address is assigned to the Service.
61
+
62
+
We will use this IP address when configuring **Prometheus B** in the next step.
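Because the address is assigned asynchronously, a small polling loop can save repeated manual lookups. The following is a minimal sketch under stated assumptions: `wait_for_lb_address` is a hypothetical helper name, and the fallback to the `hostname` field is included because some cloud load balancers report a DNS name instead of an IP address.

```shell
# Hypothetical helper: poll a LoadBalancer Service until it reports an address.
# Arguments: namespace, service name, maximum number of attempts (default 30).
wait_for_lb_address() {
  local namespace="$1" service="$2" attempts="${3:-30}" addr="" i=0
  while [ "$i" -lt "$attempts" ]; do
    # Print the IP if present; some providers populate `hostname` instead.
    addr=$(kubectl -n "$namespace" get svc "$service" \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}{.status.loadBalancer.ingress[0].hostname}')
    if [ -n "$addr" ]; then
      printf '%s\n' "$addr"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  # No address was assigned within the allowed number of attempts.
  return 1
}
```

For example, `IP_ADDRESS=$(wait_for_lb_address hopsworks prometheus-external)` stores the address once it becomes available, or fails after roughly a minute with the default settings.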
63
+
64
+
#### Step 3
65
+
Edit the configuration file of the **Prometheus B** server and append the following job under `scrape_configs`:
25
66
26
67
!!! note
27
-
Replace IP_ADDRESS with the actual address of Hopsworks server
68
+
    Replace IP_ADDRESS with the external IP address of the `prometheus-external` Service created above, or with the IP address of the Prometheus service if it is directly accessible.
69
+
    The snippet below assumes Hopsworks services run in the **hopsworks** Namespace.
28
70
29
71
```yaml
30
72
- job_name: 'federate'
31
-
scrape_interval: 15s
73
+
scrape_interval: 15s
32
74
33
-
honor_labels: true
34
-
metrics_path: '/federate'
75
+
honor_labels: true
76
+
metrics_path: '/federate'
35
77
36
-
params:
37
-
'match[]':
38
-
- '{job="airflow"}'
39
-
- '{job="pushgateway"}'
40
-
- '{job="hadoop"}'
41
-
- '{job="hopsworks"}'
78
+
params:
79
+
'match[]':
80
+
- '{namespace="hopsworks"}'
42
81
43
-
static_configs:
44
-
- targets:
45
-
- 'IP_ADDRESS:9089'
82
+
static_configs:
83
+
- targets:
84
+
- 'IP_ADDRESS:9090'
46
85
```
47
86
48
-
These are the basic labels gathered by Hopsworks.
49
-
50
-
* If your Hopsworks cluster runs **without** Kubernetes append `'{job="cadvisor"}'` to `match[]` list
87
+
The configuration above scrapes service metrics from the *hopsworks* Namespace. To additionally scrape *user application* metrics, append `'{job="pushgateway"}'` to the matchers, for example:
51
89
52
-
* If your Hopsworks cluster runs **with** Kubernetes append the following labels to `match[]`
53
-
* `'{job=~"knative.+"}'`
54
-
* `'{job="kubernetes-cadvisor"}'`
55
-
* `'{job="istio-envoy"}'`
56
-
* `'{job="kube-state-metrics"}'`
57
-
* `'{job="cadvisor"}'`
58
-
* `'{job="cadvisor"}'`
59
-
* `'{job="cadvisor"}'`
90
+
```yaml
params:
  'match[]':
    - '{namespace="hopsworks"}'
    - '{job="pushgateway"}'
```
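To see what a given set of matchers returns before **Prometheus B** starts scraping, the `/federate` endpoint can be queried by hand. The sketch below assumes `curl` is installed and that **Prometheus A** is reachable on port `9090`. `federate_probe` is a hypothetical helper name, not a Prometheus or Hopsworks command.

```shell
# Hypothetical helper: query the /federate endpoint of Prometheus A.
# Usage: federate_probe ADDRESS MATCHER [MATCHER...]
federate_probe() {
  local address="$1" matcher
  shift
  # Replace each matcher argument with a `--data-urlencode match[]=<matcher>` pair.
  # The loop list is expanded once up front, so rewriting "$@" inside is safe.
  for matcher in "$@"; do
    set -- "$@" --data-urlencode "match[]=${matcher}"
    shift
  done
  # --get sends the encoded pairs as query parameters instead of a POST body.
  curl --silent --show-error --fail --get "http://${address}:9090/federate" "$@"
}
```

For example, `federate_probe "$IP_ADDRESS" '{namespace="hopsworks"}' '{job="pushgateway"}' | head` prints the first few exposed samples. Empty output suggests the matchers select nothing.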
60
96
61
-
#### Step 3
62
-
Finally restart Prometheus service with `sudo systemctl restart prometheus`
97
+
Depending on your Prometheus setup, you might need to restart the **Prometheus B** service to pick up the new configuration.
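Where a full restart is undesirable, one alternative is to validate the file and then ask Prometheus to reload it. The following is a sketch under assumptions: `reload_prometheus_b` is a hypothetical helper, `promtool` must be installed next to **Prometheus B**, and the `/-/reload` endpoint only works when **Prometheus B** was started with `--web.enable-lifecycle`.

```shell
# Hypothetical helper: validate the configuration, then trigger a reload.
# Arguments: path to the configuration file, base URL of Prometheus B
# (defaults to http://localhost:9090, which is an assumption about your setup).
reload_prometheus_b() {
  local config_file="$1" base_url="${2:-http://localhost:9090}"
  # Abort before reloading if the configuration does not validate.
  promtool check config "$config_file" || return 1
  # The lifecycle API must be enabled on Prometheus B for this request to succeed.
  curl --silent --show-error --fail -X POST "${base_url}/-/reload"
}
```

If the lifecycle API is disabled in your deployment, restarting the service as described above remains the fallback.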
98
+
For more details on federation, see the Prometheus [documentation](https://prometheus.io/docs/prometheus/latest/federation/#cross-service-federation).
63
99
64
100
### Custom service
65
101
Prometheus can push metrics to another custom resource via HTTP. The custom service is responsible for handling the received metrics.
66
102
To push metrics with this method we use the `remote_write` configuration.
67
103
68
-
69
104
We will only give a sample configuration, as `remote_write` is extensively covered in the Prometheus [documentation](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write).
70
105
In the example below we push metrics to a custom service listening on port 9096 which transforms the metrics and forwards them.
71
106
107
+
#### Step 1
108
+
First, identify the name of the *ConfigMap* that stores the Prometheus configuration. This guide assumes Hopsworks runs in the `hopsworks` Namespace.
110
+
111
+
```bash
export NAMESPACE=hopsworks
configMapName=$(kubectl -n "${NAMESPACE}" get configmap -l app.kubernetes.io/name=prometheus,app.kubernetes.io/component=server -ojsonpath='{.items[0].metadata.name}')
```
115
+
116
+
#### Step 2
117
+
Using the name of the ConfigMap from Step 1, edit the configuration