diff --git a/docs/reference/inference/delete-inference.asciidoc b/docs/reference/inference/delete-inference.asciidoc
index bee39bf9b9851..4fc4beaca6d8e 100644
--- a/docs/reference/inference/delete-inference.asciidoc
+++ b/docs/reference/inference/delete-inference.asciidoc
@@ -6,12 +6,9 @@ experimental[]

 Deletes an {infer} endpoint.

-IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
-{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI or
-Hugging Face. For built-in models and models uploaded through Eland, the {infer}
-APIs offer an alternative way to use and manage trained models. However, if you
-do not plan to use the {infer} APIs to use these models or if you want to use
-non-NLP models, use the <>.
+IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
+For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models.
+However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <>.

 [discrete]
diff --git a/docs/reference/inference/get-inference.asciidoc b/docs/reference/inference/get-inference.asciidoc
index c3fe841603bcc..d991729fe77c9 100644
--- a/docs/reference/inference/get-inference.asciidoc
+++ b/docs/reference/inference/get-inference.asciidoc
@@ -6,12 +6,9 @@ experimental[]

 Retrieves {infer} endpoint information.

-IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
-{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI or
-Hugging Face. For built-in models and models uploaded through Eland, the {infer}
-APIs offer an alternative way to use and manage trained models. However, if you
-do not plan to use the {infer} APIs to use these models or if you want to use
-non-NLP models, use the <>.
+IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
+For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models.
+However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <>.

 [discrete]
diff --git a/docs/reference/inference/inference-apis.asciidoc b/docs/reference/inference/inference-apis.asciidoc
index 88421e4f64cfd..e756831075027 100644
--- a/docs/reference/inference/inference-apis.asciidoc
+++ b/docs/reference/inference/inference-apis.asciidoc
@@ -54,3 +54,4 @@ include::service-google-vertex-ai.asciidoc[]
 include::service-hugging-face.asciidoc[]
 include::service-mistral.asciidoc[]
 include::service-openai.asciidoc[]
+include::service-watsonx-ai.asciidoc[]
diff --git a/docs/reference/inference/post-inference.asciidoc b/docs/reference/inference/post-inference.asciidoc
index 52131c0b10776..ce51abaff07f8 100644
--- a/docs/reference/inference/post-inference.asciidoc
+++ b/docs/reference/inference/post-inference.asciidoc
@@ -6,12 +6,9 @@ experimental[]

 Performs an inference task on an input text by using an {infer} endpoint.

-IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
-{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI or
-Hugging Face. For built-in models and models uploaded through Eland, the {infer}
-APIs offer an alternative way to use and manage trained models. However, if you
-do not plan to use the {infer} APIs to use these models or if you want to use
-non-NLP models, use the <>.
+IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
+For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models.
+However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <>.

 [discrete]
diff --git a/docs/reference/inference/put-inference.asciidoc b/docs/reference/inference/put-inference.asciidoc
index 96e127e741d56..6d6b61ffea771 100644
--- a/docs/reference/inference/put-inference.asciidoc
+++ b/docs/reference/inference/put-inference.asciidoc
@@ -8,13 +8,8 @@ Creates an {infer} endpoint to perform an {infer} task.

 [IMPORTANT]
 ====
-* The {infer} APIs enable you to use certain services, such as built-in
-{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Mistral,
-Azure OpenAI, Google AI Studio, Google Vertex AI, Anthropic or Hugging Face.
-* For built-in models and models uploaded through Eland, the {infer} APIs offer an
-alternative way to use and manage trained models. However, if you do not plan to
-use the {infer} APIs to use these models or if you want to use non-NLP models,
-use the <>.
+* The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Mistral, Azure OpenAI, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
+* For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <>.
 ====
@@ -71,6 +66,7 @@ Click the links to review the configuration details of the services:
 * <> (`text_embedding`)
 * <> (`text_embedding`)
 * <> (`completion`, `text_embedding`)
+* <> (`text_embedding`)

 The {es} and ELSER services run on a {ml} node in your {es} cluster. The rest of the services connect to external providers.
\ No newline at end of file
diff --git a/docs/reference/inference/service-watsonx-ai.asciidoc b/docs/reference/inference/service-watsonx-ai.asciidoc
new file mode 100644
index 0000000000000..597afc27fd0cf
--- /dev/null
+++ b/docs/reference/inference/service-watsonx-ai.asciidoc
@@ -0,0 +1,115 @@
+[[infer-service-watsonx-ai]]
+=== Watsonx {infer} service
+
+Creates an {infer} endpoint to perform an {infer} task with the `watsonxai` service.
+
+You need an https://cloud.ibm.com/docs/databases-for-elasticsearch?topic=databases-for-elasticsearch-provisioning&interface=api[IBM Cloud® Databases for Elasticsearch deployment] to use the `watsonxai` {infer} service.
+You can provision one through the https://cloud.ibm.com/databases/databases-for-elasticsearch/create[IBM catalog], the https://cloud.ibm.com/docs/databases-cli-plugin?topic=databases-cli-plugin-cdb-reference[Cloud Databases CLI plug-in], the https://cloud.ibm.com/apidocs/cloud-databases-api[Cloud Databases API], or https://registry.terraform.io/providers/IBM-Cloud/ibm/latest/docs/resources/database[Terraform].
+
+
+[discrete]
+[[infer-service-watsonx-ai-api-request]]
+==== {api-request-title}
+
+`PUT /_inference//`
+
+[discrete]
+[[infer-service-watsonx-ai-api-path-params]]
+==== {api-path-parms-title}
+
+``::
+(Required, string)
+include::inference-shared.asciidoc[tag=inference-id]
+
+``::
+(Required, string)
+include::inference-shared.asciidoc[tag=task-type]
++
+--
+Available task types:

+* `text_embedding`.
+--

+[discrete]
+[[infer-service-watsonx-ai-api-request-body]]
+==== {api-request-body-title}

+`service`::
+(Required, string)
+The type of service supported for the specified task type. In this case,
+`watsonxai`.

+`service_settings`::
+(Required, object)
+include::inference-shared.asciidoc[tag=service-settings]
++
+--
+These settings are specific to the `watsonxai` service.
+--

+`api_key`:::
+(Required, string)
+A valid API key for your Watsonx account.
+You can find your Watsonx API keys, or create a new one, on the https://cloud.ibm.com/iam/apikeys[API keys page].
++
+--
+include::inference-shared.asciidoc[tag=api-key-admonition]
+--

+`api_version`:::
+(Required, string)
+Version parameter that takes a version date in the format of `YYYY-MM-DD`.
+For the currently active version dates, refer to the https://cloud.ibm.com/apidocs/watsonx-ai#active-version-dates[documentation].

+`model_id`:::
+(Required, string)
+The name of the model to use for the {infer} task.
+Refer to the IBM Embedding Models section in the https://www.ibm.com/products/watsonx-ai/foundation-models[Watsonx documentation] for the list of available text embedding models.

+`url`:::
+(Required, string)
+The URL endpoint to use for the requests.

+`project_id`:::
+(Required, string)
+The name of the project to use for the {infer} task.

+`rate_limit`:::
+(Optional, object)
+By default, the `watsonxai` service sets the number of requests allowed per minute to `120`.
+This helps to minimize the number of rate limit errors returned from Watsonx.
+To modify this, set the `requests_per_minute` setting of this object in your service settings:
++
+--
+include::inference-shared.asciidoc[tag=request-per-minute-example]
+--


+[discrete]
+[[inference-example-watsonx-ai]]
+==== Watsonx AI service example

+The following example shows how to create an {infer} endpoint called `watsonx-embeddings` to perform a `text_embedding` task type.
+
+[source,console]
+------------------------------------------------------------
+PUT _inference/text_embedding/watsonx-embeddings
+{
+  "service": "watsonxai",
+  "service_settings": {
+      "api_key": "", <1>
+      "url": "", <2>
+      "model_id": "ibm/slate-30m-english-rtrvr",
+      "project_id": "", <3>
+      "api_version": "2024-03-14" <4>
+  }
+}
+
+------------------------------------------------------------
+// TEST[skip:TBD]
+<1> A valid Watsonx API key.
+You can find it on the https://cloud.ibm.com/iam/apikeys[API keys page of your account].
+<2> The {infer} endpoint URL you created on Watsonx.
+<3> The ID of your IBM Cloud project.
+<4> A valid API version parameter. You can find the active version dates https://cloud.ibm.com/apidocs/watsonx-ai#active-version-dates[here].
\ No newline at end of file
diff --git a/docs/reference/inference/update-inference.asciidoc b/docs/reference/inference/update-inference.asciidoc
index 166b002ea45f5..01a99d7f53062 100644
--- a/docs/reference/inference/update-inference.asciidoc
+++ b/docs/reference/inference/update-inference.asciidoc
@@ -6,7 +6,7 @@ experimental[]

 Updates an {infer} endpoint.

-IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI or Hugging Face.
+IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
 For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models.
 However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <>.
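A usage snippet could round out the new Watsonx page: once the `watsonx-embeddings` endpoint from the example above exists, it is called like any other {infer} endpoint via the perform-inference API. This is a sketch following the standard `POST _inference/<task_type>/<inference_id>` request shape; the input string is illustrative, not from a live deployment.

[source,console]
------------------------------------------------------------
POST _inference/text_embedding/watsonx-embeddings
{
  "input": "The sky above the port was the color of television tuned to a dead channel."
}
------------------------------------------------------------
// TEST[skip:TBD]

The response contains one embedding per input string, produced by the `ibm/slate-30m-english-rtrvr` model configured on the endpoint.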