Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding configuring chunking settings notebook for blog #417

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,309 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "7a765629",
"metadata": {},
"source": [
"# Configuring Chunking Settings For Inference Endpoints\n",
"\n",
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/document-chunking/configuring-chunking-settings-for-inference-endpoints.ipynb)\n",
"\n",
"\n",
"Learn how to configure [chunking settings](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-apis.html#infer-chunking-config) for [Inference API](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-apis.html) endpoints."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to move and summarize some of the content of this wiki into an overview section to explain why users want to change chunking strategies, sizes, and overlap? Or does that not make a lot of sense since it'll distract from the overall goal of teaching people how to call the API with the new fields?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this notebook will be linked from the chunking blog which explains the motivation behind various chunking setting options, I think we can leave this as is to avoid duplicating the information too many times. We could consider adding in a link to the blog once it is posted but this will have to be a follow-up change.

]
},
{
"cell_type": "markdown",
"id": "f9101eb9",
"metadata": {},
"source": [
"# 🧰 Requirements\n",
"\n",
"For this example, you will need:\n",
"\n",
"- An Elastic deployment:\n",
" - We'll be using [Elastic serverless](https://www.elastic.co/docs/current/serverless) for this example (available with a [free trial](https://cloud.elastic.co/registration?utm_source=github&utm_content=elasticsearch-labs-notebook))\n",
"\n",
"- Elasticsearch 8.16 or above.\n",
"\n",
"- Python 3.7 or above."
]
},
{
"cell_type": "markdown",
"id": "4cd69cc0",
"metadata": {},
"source": [
"# Create Elastic Cloud deployment or serverless project\n",
"\n",
"If you don't have an Elastic Cloud deployment, sign up [here](https://cloud.elastic.co/registration?utm_source=github&utm_content=elasticsearch-labs-notebook) for a free trial."
]
},
{
"cell_type": "markdown",
"id": "f27dffbf",
"metadata": {},
"source": [
"# Install packages and connect with Elasticsearch Client\n",
"\n",
"To get started, we'll need to connect to our Elastic deployment using the Python client (version 8.12.0 or above).\n",
"Because we're using an Elastic Cloud deployment, we'll use the **Cloud ID** to identify our deployment.\n",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe change to "Because we're using an Elastic Serverless deployment, we'll use the Serverless Endpoint to identify our deployment."?

So that it matches the below code hosts=[ELASTIC_SERVERLESS_ENDPOINT]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meant to update this section when I switched to serverless endpoint. I'll update this.

Copy link
Member Author

@dan-rubinstein dan-rubinstein Mar 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spoke with @joseph.mcelroy and was asked to switch to using cloud ID instead of serverless endpoint. I'll update all of the endpoint information to be related to cloud ID.

"\n",
"First we need to `pip` install the following packages:\n",
"\n",
"- `elasticsearch`"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8c4b16bc",
"metadata": {},
"outputs": [],
"source": [
"!pip install elasticsearch"
]
},
{
"cell_type": "markdown",
"id": "41ef96b3",
"metadata": {},
"source": [
"Next, we need to import the modules we need. 🔐 NOTE: getpass enables us to securely prompt the user for credentials without echoing them to the terminal, or storing it in memory."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "690ff9af",
"metadata": {},
"outputs": [],
"source": [
"from elasticsearch import Elasticsearch\n",
"from getpass import getpass"
]
},
{
"cell_type": "markdown",
"id": "23fa2b6c",
"metadata": {},
"source": [
"Now we can instantiate the Python Elasticsearch client.\n",
"\n",
"First provide your API key and Serverless Endpoint.\n",
"Then create a `client` object that instantiates an instance of the `Elasticsearch` class."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "195cc597",
"metadata": {},
"outputs": [],
"source": [
"# https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud#finding-your-cloud-id\n",
"ELASTIC_SERVERLESS_ENDPOINT = getpass(\"Elastic serverless endpoint: \")\n",
"\n",
"# https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud#creating-an-api-key\n",
"ELASTIC_API_KEY = getpass(\"Elastic Api Key: \")\n",
"\n",
"# Create the client instance\n",
"client = Elasticsearch(\n",
" # For local development\n",
" # hosts=[\"http://localhost:9200\"],\n",
" hosts=[ELASTIC_SERVERLESS_ENDPOINT],\n",
" api_key=ELASTIC_API_KEY,\n",
" request_timeout=120,\n",
" max_retries=10,\n",
" retry_on_timeout=True,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "b1115ffb",
"metadata": {},
"source": [
"### Test the Client\n",
"Before you continue, confirm that the client has connected with this test."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cc0de5ea",
"metadata": {},
"outputs": [],
"source": [
"print(client.info())"
]
},
{
"cell_type": "markdown",
"id": "659c5890",
"metadata": {},
"source": [
"Refer to [the documentation](https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/connecting.html#connect-self-managed-new) to learn how to connect to a self-managed deployment.\n",
"\n",
"Read [this page](https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/connecting.html#connect-self-managed-new) to learn how to connect using API keys."
]
},
{
"cell_type": "markdown",
"id": "840d92f0",
"metadata": {},
"source": [
"<a name=\"create-the-inference-endpoint\"></a>\n",
"## Create the inference endpoint object\n",
"\n",
"Let's create the inference endpoint by using the [Create Inference API](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-inference-api.html#put-inference-api-desc).\n",
"\n",
"In this example, you'll be creating an inference endpoint for the [ELSER integration](https://www.elastic.co/guide/en/elasticsearch/reference/current/infer-service-elser.html) which will deploy Elastic's [ELSER model](https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-elser.html) within your cluster. Chunking settings are configurable for any inference endpoint with an embedding task type. A full list of available integrations can be found in the [Create Inference API](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-inference-api.html#put-inference-api-desc) documentation.\n",
"\n",
"To configure chunking settings, the request body must contain a `chunking_settings` map with a `strategy` value along with any required values for the selected chunking strategy. For this example, you'll be configuring chunking settings for a `sentence` strategy with a maximum chunk size of 25 words and 1 sentence overlap between chunks. For more information on available chunking strategies and their configurable values, see the [chunking strategies documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-apis.html#_chunking_strategies)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0d007737",
"metadata": {},
"outputs": [],
"source": [
"client.inference.put(\n",
" task_type=\"sparse_embedding\",\n",
" inference_id=\"my_elser_endpoint\",\n",
" body={\n",
" \"service\": \"elasticsearch\",\n",
" \"service_settings\": {\n",
" \"num_allocations\": 1,\n",
" \"num_threads\": 1,\n",
" \"model_id\": \".elser_model_2\",\n",
" },\n",
" \"chunking_settings\": {\n",
" \"strategy\": \"sentence\",\n",
" \"max_chunk_size\": 25,\n",
" \"sentence_overlap\": 1,\n",
" },\n",
" },\n",
")"
]
},
{
"cell_type": "markdown",
"id": "f01de885",
"metadata": {},
"source": [
"<a name=\"create-the-index\"></a>\n",
"## Create the index\n",
"\n",
"To see the chunking settings you've configured in action, you'll need to ingest a document into a semantic text field of an index. Let's create an index with a semantic text field linked to the inference endpoint created in the previous step."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0eed3e3b",
"metadata": {},
"outputs": [],
"source": [
"client.indices.create(\n",
" index=\"my_index\",\n",
" mappings={\n",
" \"properties\": {\n",
" \"infer_field\": {\n",
" \"type\": \"semantic_text\",\n",
" \"inference_id\": \"my_elser_endpoint\",\n",
" }\n",
" }\n",
" },\n",
")"
]
},
{
"cell_type": "markdown",
"id": "51ae72e4",
"metadata": {},
"source": [
"<a name=\"ingest-a-document\"></a>\n",
"## Ingest a document\n",
"\n",
"Now let's ingest a document into the index created in the previous step.\n",
"\n",
"Note: It may take some time Elasticsearch to allocate nodes to the ELSER model deployment that is started when creating the inference endpoint. You will need to wait until the deployment is allocated to a node before the request below can succeed."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b8ecaec0",
"metadata": {},
"outputs": [],
"source": [
"client.index(\n",
" index=\"my_index\",\n",
" document={\n",
" \"infer_field\": \"This is some sample document data. The data is being used to demonstrate the configurable chunking settings feature. The configured chunking settings will determine how this text is broken down into chunks to help increase inference accuracy.\"\n",
" },\n",
")"
]
},
{
"cell_type": "markdown",
"id": "ccc7ca3a",
"metadata": {},
"source": [
"<a name=\"view-the-chunks\"></a>\n",
"## View the chunks\n",
"\n",
"The generated chunks and their corresponding inference results can be seen stored in the document in the index under the key `chunks` within the `_inference_fields` metafield. The chunks are stored as a list of character offset values. Let's see the chunks generated when ingesting the documenting in the previous step."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "58dc9019",
"metadata": {},
"outputs": [],
"source": [
"client.search(\n",
" index=\"my_index\",\n",
" body={\"size\": 100, \"query\": {\"match_all\": {}}, \"fields\": [\"_inference_fields\"]},\n",
")"
]
},
{
"cell_type": "markdown",
"id": "193f5b8d",
"metadata": {},
"source": [
"<a name=\"conclusion\"></a>\n",
"## Conclusion\n",
"\n",
"You've now learned how to configure chunking settings for an inference endpoint! For more infomration about configurable chunking, see the [configuring chunking](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-apis.html#infer-chunking-config) documentation."
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.0"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Loading