|
2 | 2 |
|
3 | 3 | ## Introduction
|
4 | 4 | Hopsworks services produce metrics which are centrally gathered by [Prometheus](https://prometheus.io/) and visualized in [Grafana](../grafana).
|
5 |
| -Although the system is self-contained, it is possible to export these metrics to third-party services or another Prometheus instance. |
| 5 | +Although the system is self-contained, another *federated* Prometheus instance can scrape these metrics, or Prometheus can push them directly to another system. |
6 | 6 | This is useful if you have a centralized monitoring system with already configured alerts.
|
7 | 7 |
|
8 | 8 | ## Prerequisites
|
9 |
| -In order to configure Prometheus to export metrics you need `root` SSH access to either Hopsworks or to the target server depending on the export method you choose below. |
| 9 | +To configure Prometheus to export metrics you need permission to change the remote Prometheus configuration. |
10 | 10 |
|
11 | 11 | ## Exporting metrics
|
12 | 12 | Prometheus can be configured to export metrics to another Prometheus instance (cross-service federation) or to a custom service which knows how to handle them.
|
13 | 13 |
|
14 | 14 | ### Prometheus federation
|
15 |
| -Prometheus servers can be federated to scale better or to just clone all metrics (cross-service federation). Prometheus federation is well [documented](https://prometheus.io/docs/prometheus/latest/federation/#cross-service-federation) |
16 |
| -but there are some specificities to Hopsworks. |
| 15 | +Prometheus servers can be federated to scale better or to just clone all metrics (cross-service federation). |
17 | 16 |
|
18 | 17 | In the guide below we assume **Prometheus A** is the service running in Hopsworks and **Prometheus B** is the server you want to clone metrics to.
|
19 | 18 |
|
20 | 19 | #### Step 1
|
21 |
| -**Prometheus B** needs to be able to connect to TCP port `9089` of **Prometheus B** to scrape metrics. If you have any firewall (or Security Group) in place, allow ingress for that port. |
| 20 | +**Prometheus B** needs to be able to connect to TCP port `9090` of **Prometheus A** to scrape metrics. If you have any firewall (or Security Group) in place, allow ingress for that port. |
22 | 21 |
|
23 | 22 | #### Step 2
|
24 |
| -SSH into **Prometheus B** server, edit Prometheus configuration file and add the following under the `scrape_configs` |
| 23 | +The next step is to expose **Prometheus A**, which runs inside the Hopsworks Kubernetes cluster. If **Prometheus B** has direct access to **Prometheus A**, you can skip this step. |
| 24 | + |
| 25 | +We will create a Kubernetes *Service* of type *LoadBalancer* to expose port `9090`. |
| 26 | + |
| 27 | +!!! warning |
| 28 | +    If you need to apply custom **annotations**, modify the manifest below. |
| 29 | +    The example below assumes Hopsworks is **installed** in the *hopsworks* Namespace. |
| 30 | + |
| 31 | +```bash |
| 32 | +kubectl apply -f - <<EOF |
| 33 | +apiVersion: v1 |
| 34 | +kind: Service |
| 35 | +metadata: |
| 36 | +  name: prometheus-external |
| 37 | +  namespace: hopsworks |
| 38 | +  labels: |
| 39 | +    app: prometheus |
| 40 | +spec: |
| 41 | +  type: LoadBalancer |
| 42 | +  selector: |
| 43 | +    app.kubernetes.io/name: prometheus |
| 44 | +    app.kubernetes.io/component: server |
| 45 | +  ports: |
| 46 | +    - protocol: TCP |
| 47 | +      port: 9090 |
| 48 | +      targetPort: 9090 |
| 49 | +EOF |
| 50 | +``` |
| 51 | + |
| 52 | +Then we need to find the external IP address of the newly created Service: |
| 53 | + |
| 54 | +```bash |
| 55 | +export NAMESPACE=hopsworks |
| 56 | +kubectl -n $NAMESPACE get svc prometheus-external -ojsonpath='{.status.loadBalancer.ingress[0].ip}' |
| 57 | +``` |
| 58 | + |
| 59 | +!!! warning |
| 60 | +    It will take a few seconds until an IP address is assigned to the Service. |
| 61 | + |
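Because the address is not assigned immediately, the lookup can be wrapped in a small polling loop. The following is a minimal sketch, not part of the Hopsworks tooling: the function name `wait_for_lb_ip`, the retry count, and the delay are assumptions chosen for illustration, while the Service name and Namespace match the manifest above. Some cloud providers report a `hostname` instead of an `ip` in the Service status, in which case the JSONPath expression would need adjusting.

```bash
# Poll the Service status until an external IP appears, then print it.
# Arguments (all optional): namespace, service name, attempts, delay in seconds.
wait_for_lb_ip() {
  namespace="${1:-hopsworks}"
  service="${2:-prometheus-external}"
  attempts="${3:-30}"
  delay="${4:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # An empty result means the load balancer has no address yet.
    ip="$(kubectl -n "$namespace" get svc "$service" \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null || true)"
    if [ -n "$ip" ]; then
      printf '%s\n' "$ip"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "no external IP assigned to $service after $attempts attempts" >&2
  return 1
}
```

A possible invocation is `IP_ADDRESS="$(wait_for_lb_ip hopsworks prometheus-external)"`, which leaves the address in a shell variable for the next step.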
| 62 | +We will use this IP address in Step 3. |
| 63 | + |
| 64 | +#### Step 3 |
| 65 | +Edit the configuration file of the **Prometheus B** server and append the following job under `scrape_configs`: |
25 | 66 |
|
26 | 67 | !!! note
|
27 |
| - Replace IP_ADDRESS with the actual address of Hopsworks server |
| 68 | +    Replace IP_ADDRESS with the IP address from Step 2, or with the IP address of the Prometheus service if it is directly accessible. |
| 69 | +    The snippet below assumes Hopsworks services run in the **hopsworks** Namespace. |
28 | 70 |
|
29 | 71 | ```yaml
|
30 | 72 | - job_name: 'federate'
|
31 |
| - scrape_interval: 15s |
| 73 | +  scrape_interval: 15s |
32 | 74 |
|
33 |
| - honor_labels: true |
34 |
| - metrics_path: '/federate' |
| 75 | +  honor_labels: true |
| 76 | +  metrics_path: '/federate' |
35 | 77 |
|
36 |
| - params: |
37 |
| - 'match[]': |
38 |
| - - '{job="airflow"}' |
39 |
| - - '{job="pushgateway"}' |
40 |
| - - '{job="hadoop"}' |
41 |
| - - '{job="hopsworks"}' |
| 78 | +  params: |
| 79 | +    'match[]': |
| 80 | +      - '{namespace="hopsworks"}' |
42 | 81 |
|
43 |
| - static_configs: |
44 |
| - - targets: |
45 |
| - - 'IP_ADDRESS:9089' |
| 82 | +  static_configs: |
| 83 | +    - targets: |
| 84 | +      - 'IP_ADDRESS:9090' |
46 | 85 | ```
|
47 | 86 |
|
48 |
| -These are the basic labels gathered by Hopsworks. |
| 87 | +The configuration above scrapes service metrics in the *hopsworks* Namespace. To additionally |
| 88 | +scrape *user application* metrics, append `'{job="pushgateway"}'` to the matchers, for example: |
49 | 89 |
|
50 |
| -* If your Hopsworks cluster runs **without** Kubernetes append `'{job="cadvisor"}'` to `match[]` list |
51 |
| - |
52 |
| -* If your Hopsworks cluster runs **with** Kubernetes append the following labels to `match[]` |
53 |
| - * `'{job=~"knative.+"}'` |
54 |
| - * `'{job="kubernetes-cadvisor"}'` |
55 |
| - * `'{job="istio-envoy"}'` |
56 |
| - * `'{job="kube-state-metrics"}'` |
57 |
| - * `'{job="cadvisor"}'` |
58 |
| - * `'{job="cadvisor"}'` |
59 |
| - * `'{job="cadvisor"}'` |
| 90 | +```yaml |
| 91 | +  params: |
| 92 | +    'match[]': |
| 93 | +      - '{namespace="hopsworks"}' |
| 94 | +      - '{job="pushgateway"}' |
| 95 | +``` |
60 | 96 |
|
61 |
| -#### Step 3 |
62 |
| -Finally restart Prometheus service with `sudo systemctl restart prometheus` |
| 97 | +Depending on the Prometheus setup, you might need to restart the **Prometheus B** service to pick up the new configuration. |
| 98 | +For more details on federation, see the Prometheus [documentation](https://prometheus.io/docs/prometheus/latest/federation/#cross-service-federation). |
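Before relying on the new scrape job, it can help to query the `/federate` endpoint of **Prometheus A** by hand from the **Prometheus B** host. The sketch below is an illustration only: the helper name `federate_probe` is hypothetical, and it assumes `curl` is installed and that port `9090` is reachable as described in Step 1. The `match[]` parameter is the same matcher used in the scrape job.

```bash
# Request the series that a federating Prometheus would receive.
# $1: IP address of Prometheus A, $2: optional matcher (defaults to the
# namespace matcher used in the scrape job above).
federate_probe() {
  host="$1"
  matcher="$2"
  if [ -z "$matcher" ]; then
    matcher='{namespace="hopsworks"}'
  fi
  # -G sends the URL-encoded data as a query string on a GET request.
  curl -sS -G "http://${host}:9090/federate" \
    --data-urlencode "match[]=${matcher}"
}
```

For example, `federate_probe IP_ADDRESS | head` should print a few metric lines if the endpoint is reachable and the matcher selects any series. An empty response suggests the matcher selects nothing, while a connection error points back to the firewall rules from Step 1.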
63 | 99 |
|
64 | 100 | ### Custom service
|
65 | 101 | Prometheus can push metrics to another custom resource via HTTP. The custom service is responsible for handling the received metrics.
|
66 | 102 | To push metrics with this method we use the `remote_write` configuration.
|
67 | 103 |
|
68 |
| - |
69 | 104 | We will only give a sample configuration as `remote_write` is extensively documented in Prometheus [documentation](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write)
|
70 | 105 | In the example below we push metrics to a custom service listening on port 9096 which transforms the metrics and forwards them.
|
71 | 106 |
|
| 107 | +To configure Prometheus to push metrics to a remote HTTP service, add the following snippet to your Helm chart values file, changing the *url* accordingly. You can also tune the other configuration parameters to your needs. |
| 108 | + |
72 | 109 | ```yaml
|
73 |
| -remote_write: |
74 |
| - - url: "http://localhost:9096" |
75 |
| - queue_config: |
76 |
| - capacity: 10000 |
77 |
| - max_samples_per_send: 5000 |
78 |
| - batch_send_deadline: 60s |
| 110 | +prometheus: |
| 111 | +  prometheus: |
| 112 | +    server: |
| 113 | +      remoteWrite: |
| 114 | +        - url: "http://localhost:9096" |
| 115 | +          queue_config: |
| 116 | +            capacity: 10000 |
| 117 | +            max_samples_per_send: 5000 |
| 118 | +            batch_send_deadline: 60s |
79 | 119 | ```
|
| 120 | + |
| 121 | +If the section already exists, then append the `remoteWrite` section. |
| 122 | + |
| 123 | +Run `helm install` if this is the first time you install Hopsworks, or `helm upgrade` to apply the change to an existing cluster. |
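As a rough sketch of that last step, both cases can be handled by one command, since `helm upgrade --install` installs the release when it does not exist yet and upgrades it otherwise. The wrapper name `apply_values` is hypothetical, and the release name, chart reference, and values file path are placeholders that depend on how Hopsworks was installed in your environment.

```bash
# Apply a values file to a new or existing Helm release.
# $1: release name, $2: chart reference, $3: values file,
# $4: optional namespace (assumed to be "hopsworks" as elsewhere in this guide).
apply_values() {
  release="$1"
  chart="$2"
  values_file="$3"
  namespace="${4:-hopsworks}"
  helm upgrade --install "$release" "$chart" \
    --namespace "$namespace" \
    --values "$values_file"
}
```

A call would look like `apply_values RELEASE_NAME CHART_REF values.yaml`, with the first two arguments replaced by the values used for your installation.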