
Commit e9529ae

alpinetrtrieu and rtrieu authored
add docs for model serving (#19231)
* add docs for model serving
* add docs for model serving
* add docs for model serving
* add docs for model serving
* revert to original
* revert to original
* cluster utilization tab
* close cluster utilization tab
* nested tabs aren't supported
* nested tabs aren't supported
* comment out spark data
* list of metrics
* try two lists of metrics
* try two lists of metrics
* post david review
* assert contents of set
* remove unused files
* Update databricks/README.md

Co-authored-by: Rosa Trieu <107086888+rtrieu@users.noreply.github.com>
1 parent 826f684 commit e9529ae

File tree

2 files changed: +26 -5 lines changed


databricks/README.md

Lines changed: 26 additions & 5 deletions
@@ -1,8 +1,8 @@
 # Agent Check: Databricks
 
-<div class="alert alert-warning">
+<div class="alert alert-info">
 <a href="https://docs.datadoghq.com/data_jobs/">Data Jobs Monitoring</a> helps you observe, troubleshoot, and cost-optimize your Databricks jobs and clusters.<br/><br/>
-This page is limited to documentation for ingesting Databricks cluster utilization metrics and logs.
+This page is limited to documentation for ingesting Databricks model serving metrics and cluster utilization data.
 </div>
 
 ![Databricks default dashboard][21]
@@ -23,11 +23,27 @@ Model serving metrics provide insights into how your Databricks model serving i
 ## Setup
 
 ### Installation
+Gain insight into the health of your model serving infrastructure by following the [Model Serving Configuration](#model-serving-configuration) instructions.
 
-Monitor Databricks Spark applications with the [Datadog Spark integration][3]. Install the [Datadog Agent][4] on your clusters following the [configuration](#configuration) instructions for your appropriate cluster. After that, install the [Spark integration][23] on Datadog to autoinstall the Databricks Overview dashboard.
+Monitor Databricks Spark applications with the [Datadog Spark integration][3]. Install the [Datadog Agent][4] on your clusters following the [Spark Configuration](#spark-configuration) instructions for the appropriate cluster type.
 
 ### Configuration
+#### Model Serving Configuration
+1. In your Databricks workspace, click on your profile in the top right corner and go to **Settings**. Select **Developer** in the left sidebar. Next to **Access tokens**, click **Manage**.
+2. Click **Generate new token**, enter "Datadog Integration" in the **Comment** field, remove the default value in **Lifetime (days)**, and click **Generate**. Take note of your token; a sketch for verifying it follows this list.
 
+**Important:**
+* Make sure you delete the default value in **Lifetime (days)** so that the token doesn't expire and the integration doesn't break.
+* Ensure the account generating the token has [CAN VIEW access][30] for the Databricks jobs and clusters you want to monitor.
+
+As an alternative, follow the [official Databricks documentation][31] to generate an access token for a [service principal][31].
+
+3. In Datadog, open the Databricks integration tile.
+4. On the **Configure** tab, click **Add Databricks Workspace**.
+5. Enter a workspace name, your Databricks workspace URL, and the Databricks token you generated.
+6. In the **Select resources to set up collection** section, make sure **Metrics - Model Serving** is **Enabled**.
+
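Before pasting the token into the Datadog tile, it can help to confirm that it actually reaches the workspace API. The snippet below is a minimal sketch, not part of this commit: it assumes the standard Databricks REST endpoint for listing model serving endpoints (`/api/2.0/serving-endpoints`) and uses placeholder host and token values.

```shell
# Placeholder values -- substitute your workspace URL and the token from step 2.
export DATABRICKS_HOST="https://<your-workspace>.cloud.databricks.com"
export DATABRICKS_TOKEN="<paste-token-here>"

# List model serving endpoints; an HTTP 200 with a JSON body confirms the token
# is valid and can see the model serving API.
curl -sS -H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
  "${DATABRICKS_HOST}/api/2.0/serving-endpoints"
```

A 401 or 403 response here usually means the token was mistyped or the account lacks the access described in the **Important** notes above.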
+#### Spark Configuration
 Configure the Spark integration to monitor your Apache Spark cluster on Databricks and collect system and Spark metrics.
 
 Each script described below can be modified to suit your needs. For instance, you can:
@@ -452,8 +468,10 @@ chmod a+x /tmp/start_datadog.sh
 ## Data Collected
 
 ### Metrics
-
-See the [Spark integration documentation][8] for a list of metrics collected.
+#### Model Serving Metrics
+See [metadata.csv][29] for a list of metrics provided by this integration.
+#### Spark Metrics
+See the [Spark integration documentation][8] for a list of Spark metrics collected.
 
 ### Service Checks
 
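The hunk header above shows the tail of the install flow (`chmod a+x /tmp/start_datadog.sh`), the last step of the init scripts this README builds up. For orientation, here is a minimal sketch of that pattern, not the commit's actual script: it assumes `DD_API_KEY` is provided as a cluster environment variable or Databricks secret, and it uses Datadog's published Agent 7 install script.

```shell
#!/bin/bash
# Sketch of a Databricks cluster init script in the style of start_datadog.sh.
# Assumes DD_API_KEY is already set as a cluster environment variable or secret.
cat <<'EOF' > /tmp/start_datadog.sh
#!/bin/bash
# Install and start the Datadog Agent using the official install script.
DD_API_KEY="${DD_API_KEY}" DD_SITE="datadoghq.com" bash -c \
  "$(curl -L https://install.datadoghq.com/scripts/install_script_agent7.sh)"
EOF

chmod a+x /tmp/start_datadog.sh
/tmp/start_datadog.sh
```

The full scripts in the README go further (Spark check configuration, driver-only and all-nodes variants); this sketch only shows the install-and-launch skeleton.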
@@ -501,3 +519,6 @@ Additional helpful documentation, links, and articles:
 [26]: https://www.datadoghq.com/product/cloud-cost-management/
 [27]: https://www.datadoghq.com/product/log-management/
 [28]: https://docs.datadoghq.com/integrations/databricks/?tab=driveronly
+[29]: https://github.com/DataDog/integrations-core/blob/master/databricks/metadata.csv
+[30]: https://docs.databricks.com/en/security/auth-authz/access-control/index.html#job-acls
+[31]: https://docs.databricks.com/en/admin/users-groups/service-principals.html#what-is-a-service-principal
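Reference [29] added above points at metadata.csv, the commit's list of metrics provided by the integration. A quick way to skim those metric names from a shell, assuming the raw-file mirror of the [29] blob URL and the integrations-core convention that the metric name is the first CSV column:

```shell
# Fetch metadata.csv (raw mirror of the [29] URL) and print the metric names.
# Assumes the first CSV column is metric_name, per integrations-core convention.
curl -s "https://raw.githubusercontent.com/DataDog/integrations-core/master/databricks/metadata.csv" \
  | tail -n +2 \
  | cut -d, -f1
```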
(Second changed file: 138 KB, diff not rendered.)
