Adding configuring chunking settings notebook for blog #417

dan-rubinstein · 2025-03-04T16:48:04Z

This PR adds a notebook to accompany the blog post on configuring chunking settings that I'm currently writing. A draft of the blog can be seen here.

Note: I ran through the steps both locally and with a serverless trial account, and they succeeded in both cases.

gitnotebooks · 2025-03-04T16:48:07Z

Found 1 changed notebook. Review the changes at https://app.gitnotebooks.com/elastic/elasticsearch-labs/pull/417

prwhelan · 2025-03-04T19:21:23Z

notebooks/document-chunking/configuring-chunking-settings-for-inference-endpoints.ipynb

+    "# Install packages and connect with Elasticsearch Client\n",
+    "\n",
+    "To get started, we'll need to connect to our Elastic deployment using the Python client (version 8.12.0 or above).\n",
+    "Because we're using an Elastic Cloud deployment, we'll use the **Cloud ID** to identify our deployment.\n",


Maybe change this to "Because we're using an Elastic Serverless deployment, we'll use the Serverless Endpoint to identify our deployment."?

That way it matches the code below: hosts=[ELASTIC_SERVERLESS_ENDPOINT]

I meant to update this section when I switched to the serverless endpoint. I'll fix it.

I spoke with @joseph.mcelroy and was asked to switch to the Cloud ID instead of the serverless endpoint. I'll update all of the endpoint information to refer to the Cloud ID.
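For reference, here is a minimal sketch of what a Cloud ID based connection cell could look like after the switch discussed above. It is not the notebook's actual code: the helper names `cloud_client_kwargs` and `connect` are hypothetical, and the placeholder values in the usage note are made up for illustration. The only client behavior it relies on is that the Python client's `Elasticsearch` constructor accepts `cloud_id` and `api_key` in place of `hosts`.

```python
def cloud_client_kwargs(cloud_id: str, api_key: str) -> dict:
    """Collect keyword arguments for an Elastic Cloud connection.

    Hypothetical helper for illustration; the notebook may simply pass
    these values to the client directly.
    """
    if not cloud_id:
        raise ValueError("cloud_id must not be empty")
    if not api_key:
        raise ValueError("api_key must not be empty")
    # `cloud_id` takes the place of `hosts=[...]`; the client derives the
    # deployment URL from it.
    return {"cloud_id": cloud_id, "api_key": api_key}


def connect(kwargs: dict):
    """Create the client. Requires `pip install elasticsearch`."""
    # Imported lazily so the helper above can be used without the package.
    from elasticsearch import Elasticsearch

    return Elasticsearch(**kwargs)
```

In a notebook cell this might be used as `client = connect(cloud_client_kwargs(ELASTIC_CLOUD_ID, ELASTIC_API_KEY))`, where both variable names are assumptions and would typically be read with `getpass` rather than hard-coded.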

prwhelan · 2025-03-04T19:50:28Z

notebooks/document-chunking/configuring-chunking-settings-for-inference-endpoints.ipynb

+    "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/document-chunking/configuring-chunking-settings-for-inference-endpoints.ipynb)\n",
+    "\n",
+    "\n",
+    "Learn how to configure [chunking settings](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-apis.html#infer-chunking-config) for [Inference API](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-apis.html) endpoints."


Do we want to move and summarize some of the content of this wiki into an overview section to explain why users want to change chunking strategies, sizes, and overlap? Or does that not make a lot of sense since it'll distract from the overall goal of teaching people how to call the API with the new fields?

Since this notebook will be linked from the chunking blog, which explains the motivation behind the various chunking settings, I think we can leave this as is to avoid duplicating the information. We could consider adding a link to the blog once it is posted, but that will have to be a follow-up change.
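To make the "call the API with the new fields" goal concrete for readers of this thread, here is a hedged sketch of how a request that creates an inference endpoint with `chunking_settings` could be assembled. The helper names, the endpoint ID `my-elser-endpoint`, and the ELSER `service_settings` values are assumptions for illustration, not taken from the notebook. The validation reflects my reading of the chunking settings docs (word overlap at most half of `max_chunk_size`, `sentence_overlap` of 0 or 1) and may not match the server's checks exactly.

```python
import json


def chunking_settings(strategy, max_chunk_size, overlap=None, sentence_overlap=None):
    """Build a `chunking_settings` object (hypothetical helper)."""
    if strategy == "word":
        if overlap is None:
            raise ValueError("the word strategy needs an overlap")
        # Assumed rule: overlap may not exceed half of max_chunk_size.
        if overlap * 2 > max_chunk_size:
            raise ValueError("overlap must be at most half of max_chunk_size")
        return {"strategy": "word", "max_chunk_size": max_chunk_size, "overlap": overlap}
    if strategy == "sentence":
        value = 1 if sentence_overlap is None else sentence_overlap
        # Assumed rule: sentence_overlap is either 0 or 1.
        if value not in (0, 1):
            raise ValueError("sentence_overlap must be 0 or 1")
        return {
            "strategy": "sentence",
            "max_chunk_size": max_chunk_size,
            "sentence_overlap": value,
        }
    raise ValueError(f"unknown strategy: {strategy!r}")


def build_endpoint_request(inference_id, task_type, service, service_settings, chunking):
    """Return the (path, body) pair for a PUT _inference request."""
    path = f"/_inference/{task_type}/{inference_id}"
    body = {
        "service": service,
        "service_settings": service_settings,
        "chunking_settings": chunking,
    }
    return path, body


def send(client, path, body):
    """Send the request with the Python client's low-level transport call."""
    return client.perform_request(
        "PUT",
        path,
        headers={"accept": "application/json", "content-type": "application/json"},
        body=body,
    )


# Illustrative values only; the model ID and allocation settings are assumptions.
path, body = build_endpoint_request(
    "my-elser-endpoint",
    "sparse_embedding",
    "elasticsearch",
    {"model_id": ".elser_model_2", "num_allocations": 1, "num_threads": 1},
    chunking_settings("sentence", 25, sentence_overlap=1),
)
print("PUT", path)
print(json.dumps(body, indent=2))
```

Building the body separately from sending it keeps the chunking fields easy to inspect before any request is made. `send` is only needed when a connected client is available.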

Adding configuring chunking settings notebook for blog

2f705c7

dan-rubinstein added the blog label Mar 4, 2025

Running pre-commit

cdf1b8c

prwhelan approved these changes Mar 4, 2025

Switching to using elastic cloud ID instead of serverless endpoint

3f3bb30

dan-rubinstein requested a review from prwhelan March 4, 2025 23:08
