diff --git a/docs/reference/inference/delete-inference.asciidoc b/docs/reference/inference/delete-inference.asciidoc
index bee39bf9b9851..4fc4beaca6d8e 100644
--- a/docs/reference/inference/delete-inference.asciidoc
+++ b/docs/reference/inference/delete-inference.asciidoc
@@ -6,12 +6,9 @@ experimental[]

 Deletes an {infer} endpoint.

-IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
-{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI or
-Hugging Face. For built-in models and models uploaded through Eland, the {infer}
-APIs offer an alternative way to use and manage trained models. However, if you
-do not plan to use the {infer} APIs to use these models or if you want to use
-non-NLP models, use the <>.
+IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
+For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models.
+However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <>.

 [discrete]
diff --git a/docs/reference/inference/get-inference.asciidoc b/docs/reference/inference/get-inference.asciidoc
index c3fe841603bcc..d991729fe77c9 100644
--- a/docs/reference/inference/get-inference.asciidoc
+++ b/docs/reference/inference/get-inference.asciidoc
@@ -6,12 +6,9 @@ experimental[]

 Retrieves {infer} endpoint information.

-IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
-{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI or
-Hugging Face. For built-in models and models uploaded through Eland, the {infer}
-APIs offer an alternative way to use and manage trained models. However, if you
-do not plan to use the {infer} APIs to use these models or if you want to use
-non-NLP models, use the <>.
+IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
+For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models.
+However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <>.

 [discrete]
diff --git a/docs/reference/inference/inference-apis.asciidoc b/docs/reference/inference/inference-apis.asciidoc
index 88421e4f64cfd..e756831075027 100644
--- a/docs/reference/inference/inference-apis.asciidoc
+++ b/docs/reference/inference/inference-apis.asciidoc
@@ -54,3 +54,4 @@ include::service-google-vertex-ai.asciidoc[]
 include::service-hugging-face.asciidoc[]
 include::service-mistral.asciidoc[]
 include::service-openai.asciidoc[]
+include::service-watsonx-ai.asciidoc[]
diff --git a/docs/reference/inference/post-inference.asciidoc b/docs/reference/inference/post-inference.asciidoc
index 52131c0b10776..ce51abaff07f8 100644
--- a/docs/reference/inference/post-inference.asciidoc
+++ b/docs/reference/inference/post-inference.asciidoc
@@ -6,12 +6,9 @@ experimental[]

 Performs an inference task on an input text by using an {infer} endpoint.

-IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
-{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI or
-Hugging Face. For built-in models and models uploaded through Eland, the {infer}
-APIs offer an alternative way to use and manage trained models. However, if you
-do not plan to use the {infer} APIs to use these models or if you want to use
-non-NLP models, use the <>.
+IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
+For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models.
+However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <>.

 [discrete]
diff --git a/docs/reference/inference/put-inference.asciidoc b/docs/reference/inference/put-inference.asciidoc
index 96e127e741d56..6d6b61ffea771 100644
--- a/docs/reference/inference/put-inference.asciidoc
+++ b/docs/reference/inference/put-inference.asciidoc
@@ -8,13 +8,8 @@ Creates an {infer} endpoint to perform an {infer} task.

 [IMPORTANT]
 ====
-* The {infer} APIs enable you to use certain services, such as built-in
-{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Mistral,
-Azure OpenAI, Google AI Studio, Google Vertex AI, Anthropic or Hugging Face.
-* For built-in models and models uploaded through Eland, the {infer} APIs offer an
-alternative way to use and manage trained models. However, if you do not plan to
-use the {infer} APIs to use these models or if you want to use non-NLP models,
-use the <>.
+* The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Mistral, Azure OpenAI, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
+* For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <>.
 ====
@@ -71,6 +66,7 @@ Click the links to review the configuration details of the services:
 * <> (`text_embedding`)
 * <> (`text_embedding`)
 * <> (`completion`, `text_embedding`)
+* <> (`text_embedding`)

 The {es} and ELSER services run on a {ml} node in your {es} cluster. The rest of the services connect to external providers.
\ No newline at end of file
diff --git a/docs/reference/inference/service-watsonx-ai.asciidoc b/docs/reference/inference/service-watsonx-ai.asciidoc
new file mode 100644
index 0000000000000..597afc27fd0cf
--- /dev/null
+++ b/docs/reference/inference/service-watsonx-ai.asciidoc
@@ -0,0 +1,115 @@
+[[infer-service-watsonx-ai]]
+=== Watsonx {infer} service
+
+Creates an {infer} endpoint to perform an {infer} task with the `watsonxai` service.
+
+You need an https://cloud.ibm.com/docs/databases-for-elasticsearch?topic=databases-for-elasticsearch-provisioning&interface=api[IBM Cloud® Databases for Elasticsearch deployment] to use the `watsonxai` {infer} service.
+You can provision one through the https://cloud.ibm.com/databases/databases-for-elasticsearch/create[IBM catalog], the https://cloud.ibm.com/docs/databases-cli-plugin?topic=databases-cli-plugin-cdb-reference[Cloud Databases CLI plug-in], the https://cloud.ibm.com/apidocs/cloud-databases-api[Cloud Databases API], or https://registry.terraform.io/providers/IBM-Cloud/ibm/latest/docs/resources/database[Terraform].
+
+
+[discrete]
+[[infer-service-watsonx-ai-api-request]]
+==== {api-request-title}
+
+`PUT /_inference//`
+
+[discrete]
+[[infer-service-watsonx-ai-api-path-params]]
+==== {api-path-parms-title}
+
+``::
+(Required, string)
+include::inference-shared.asciidoc[tag=inference-id]
+
+``::
+(Required, string)
+include::inference-shared.asciidoc[tag=task-type]
++
+--
+Available task types:

+* `text_embedding`.
+--

+[discrete]
+[[infer-service-watsonx-ai-api-request-body]]
+==== {api-request-body-title}

+`service`::
+(Required, string)
+The type of service supported for the specified task type. In this case,
+`watsonxai`.

+`service_settings`::
+(Required, object)
+include::inference-shared.asciidoc[tag=service-settings]
++
+--
+These settings are specific to the `watsonxai` service.
+--

+`api_key`:::
+(Required, string)
+A valid API key for your Watsonx account.
+You can find your Watsonx API keys, or create a new one, on the https://cloud.ibm.com/iam/apikeys[API keys page].
++
+--
+include::inference-shared.asciidoc[tag=api-key-admonition]
+--

+`api_version`:::
+(Required, string)
+Version parameter that takes a version date in the format of `YYYY-MM-DD`.
+For the currently active version dates, refer to the https://cloud.ibm.com/apidocs/watsonx-ai#active-version-dates[documentation].

+`model_id`:::
+(Required, string)
+The name of the model to use for the {infer} task.
+Refer to the IBM Embedding Models section in the https://www.ibm.com/products/watsonx-ai/foundation-models[Watsonx documentation] for the list of available text embedding models.

+`url`:::
+(Required, string)
+The URL endpoint to use for the requests.

+`project_id`:::
+(Required, string)
+The name of the project to use for the {infer} task.

+`rate_limit`:::
+(Optional, object)
+By default, the `watsonxai` service sets the number of requests allowed per minute to `120`.
+This helps to minimize the number of rate limit errors returned from Watsonx.
+To modify this, set the `requests_per_minute` setting of this object in your service settings:
++
+--
+include::inference-shared.asciidoc[tag=request-per-minute-example]
+--


+[discrete]
+[[inference-example-watsonx-ai]]
+==== Watsonx AI service example

+The following example shows how to create an {infer} endpoint called `watsonx-embeddings` to perform a `text_embedding` task type.
+
+[source,console]
+------------------------------------------------------------
+PUT _inference/text_embedding/watsonx-embeddings
+{
+  "service": "watsonxai",
+  "service_settings": {
+      "api_key": "", <1>
+      "url": "", <2>
+      "model_id": "ibm/slate-30m-english-rtrvr",
+      "project_id": "", <3>
+      "api_version": "2024-03-14" <4>
+  }
+}
+
+------------------------------------------------------------
+// TEST[skip:TBD]
+<1> A valid Watsonx API key.
+You can find it on the https://cloud.ibm.com/iam/apikeys[API keys page of your account].
+<2> The {infer} endpoint URL you created on Watsonx.
+<3> The ID of your IBM Cloud project.
+<4> A valid API version parameter. You can find the active version dates https://cloud.ibm.com/apidocs/watsonx-ai#active-version-dates[here].
\ No newline at end of file
diff --git a/docs/reference/inference/update-inference.asciidoc b/docs/reference/inference/update-inference.asciidoc
index 166b002ea45f5..01a99d7f53062 100644
--- a/docs/reference/inference/update-inference.asciidoc
+++ b/docs/reference/inference/update-inference.asciidoc
@@ -6,7 +6,7 @@ experimental[]

 Updates an {infer} endpoint.

-IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI or Hugging Face.
+IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
 For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models.
 However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <>.
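A usage snippet could round out the new Watsonx page: once the `watsonx-embeddings` endpoint from the example above exists, it is called like any other {infer} endpoint via the perform-inference API. This is a sketch following the standard `POST _inference/<task_type>/<inference_id>` request shape; the input string is illustrative, not from a live deployment.

[source,console]
------------------------------------------------------------
POST _inference/text_embedding/watsonx-embeddings
{
  "input": "The sky above the port was the color of television tuned to a dead channel."
}
------------------------------------------------------------
// TEST[skip:TBD]

The response contains one embedding per input string, produced by the `ibm/slate-30m-english-rtrvr` model configured on the endpoint.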