diff --git a/bin/find-notebooks-to-test.sh b/bin/find-notebooks-to-test.sh
index 76840866..7e9513ce 100755
--- a/bin/find-notebooks-to-test.sh
+++ b/bin/find-notebooks-to-test.sh
@@ -4,6 +4,7 @@ EXEMPT_NOTEBOOKS=(
"notebooks/esql/esql-getting-started.ipynb"
"notebooks/search/07-inference.ipynb"
"notebooks/search/08-learning-to-rank.ipynb"
+ "notebooks/images/image-similarity.ipynb"
"notebooks/langchain/langchain-vector-store.ipynb"
"notebooks/langchain/self-query-retriever-examples/chatbot-example.ipynb"
"notebooks/langchain/self-query-retriever-examples/chatbot-with-bm25-only-example.ipynb"
diff --git a/notebooks/images/image-similarity.ipynb b/notebooks/images/image-similarity.ipynb
new file mode 100644
index 00000000..78be723f
--- /dev/null
+++ b/notebooks/images/image-similarity.ipynb
@@ -0,0 +1,715 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "CepGq3Kvtdxi"
+ },
+ "source": [
+ "# How to implement Image search using Elasticsearch"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "oMu1SW_TQQrU"
+ },
+ "source": [
+ "The workbook shows how to implement an Image search using Elasticsearch. You will index documents with image embeddings (generated or pre-generated) and then using NLP model be able to search using natural language description of the image.\n",
+ "\n",
+ "## Prerequisities\n",
+ "Before we begin, create an elastic cloud deployment and [autoscale](https://www.elastic.co/guide/en/cloud/current/ec-autoscaling.html) to have least one machine learning (ML) node with enough (4GB) memory. Also ensure that the Elasticsearch cluster is running. \n",
+ "\n",
+ "If you don't already have an Elastic deployment, you can sign up for a free [Elastic Cloud trial](https://cloud.elastic.co/registration?utm_source=github&utm_content=elasticsearch-labs-notebook)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "VFcdr8IDQE_H"
+ },
+ "source": [
+ "### Install Python requirements\n",
+ "Before you start you need to install all required Python dependencies."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "6WosfR55npKU",
+ "outputId": "033767ff-0eef-48cc-c9e7-efbf73c9cb67"
+ },
+ "outputs": [],
+ "source": [
+ "!pip install sentence-transformers eland elasticsearch transformers torch tqdm Pillow streamlit"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {
+ "id": "I0pRCbYMuMVn"
+ },
+ "outputs": [],
+ "source": [
+ "from elasticsearch import Elasticsearch\n",
+ "from elasticsearch.helpers import parallel_bulk\n",
+ "import requests\n",
+ "import os\n",
+ "import sys\n",
+ "\n",
+ "import zipfile\n",
+ "from tqdm.auto import tqdm\n",
+ "import pandas as pd\n",
+ "from PIL import Image\n",
+ "from sentence_transformers import SentenceTransformer\n",
+ "import urllib.request\n",
+ "\n",
+ "# import urllib.error\n",
+ "import json\n",
+ "from getpass import getpass"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "eIV5lAnVt9L7"
+ },
+ "source": [
+ "### Upload NLP model for querying\n",
+ "\n",
+ "Using the [`eland_import_hub_model`](https://www.elastic.co/guide/en/elasticsearch/client/eland/current/machine-learning.html#ml-nlp-pytorch) script, download and install the [clip-ViT-B-32-multilingual-v1](https://huggingface.co/sentence-transformers/clip-ViT-B-32-multilingual-v1) model, will transfer your search query into vector which will be used for the search over the set of images stored in Elasticsearch.\n",
+ "\n",
+ "To get your cloud id, go to [Elastic cloud](https://cloud.elastic.co) and `On the deployment overview page, copy down the Cloud ID.`\n",
+ "\n",
+ "To authenticate your request, You could use [API key](https://www.elastic.co/guide/en/kibana/current/api-keys.html#create-api-key). Alternatively, you can use your cloud deployment username and password."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud#finding-your-cloud-id\n",
+ "ELASTIC_CLOUD_ID = getpass(\"Elastic Cloud ID: \")\n",
+ "\n",
+ "# https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud#creating-an-api-key\n",
+ "ELASTIC_API_KEY = getpass(\"Elastic Api Key: \")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "tVhL9jBnuAAQ"
+ },
+ "outputs": [],
+ "source": [
+ "!eland_import_hub_model --cloud-id $ELASTIC_CLOUD_ID --hub-model-id sentence-transformers/clip-ViT-B-32-multilingual-v1 --task-type text_embedding --es-api-key $ELASTIC_API_KEY --start --clear-previous"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Klv3rywdUJBN"
+ },
+ "source": [
+ "### Connect to Elasticsearch cluster\n",
+ "Use your own cluster details `ELASTIC_CLOUD_ID`, `API_KEY`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "YwN8RmFY3FQI",
+ "outputId": "d0d0e31e-2ad2-46fe-ef8c-8c8bce7e1c48"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "ObjectApiResponse({'name': 'instance-0000000001', 'cluster_name': 'a72482be54904952ba46d53c3def7740', 'cluster_uuid': 'g8BE52TtT32pGBbRzP_oKA', 'version': {'number': '8.12.2', 'build_flavor': 'default', 'build_type': 'docker', 'build_hash': '48a287ab9497e852de30327444b0809e55d46466', 'build_date': '2024-02-19T10:04:32.774273190Z', 'build_snapshot': False, 'lucene_version': '9.9.2', 'minimum_wire_compatibility_version': '7.17.0', 'minimum_index_compatibility_version': '7.0.0'}, 'tagline': 'You Know, for Search'})"
+ ]
+ },
+ "execution_count": 9,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "es = Elasticsearch(\n",
+ " cloud_id=ELASTIC_CLOUD_ID,\n",
+ " # basic_auth=(ELASTIC_CLOUD_USER, ELASTIC_CLOUD_PASSWORD),\n",
+ " api_key=ELASTIC_API_KEY,\n",
+ " request_timeout=600,\n",
+ ")\n",
+ "\n",
+ "es.info() # should return cluster info"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "IW-GIlH2OxB4"
+ },
+ "source": [
+ "### Create Index and mappings for Images\n",
+ "Befor you can index documents into Elasticsearch, you need to create an Index with correct mappings."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {
+ "id": "xAkc1OVcOxy3"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Creating index images\n"
+ ]
+ },
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "/var/folders/b0/0h5fbhnd0tz563nl779m3jv80000gn/T/ipykernel_57417/1485784368.py:45: DeprecationWarning: Passing transport options in the API method is deprecated. Use 'Elasticsearch.options()' instead.\n",
+ " es.indices.create(\n"
+ ]
+ },
+ {
+ "data": {
+ "text/plain": [
+ "ObjectApiResponse({'acknowledged': True, 'shards_acknowledged': True, 'index': 'images'})"
+ ]
+ },
+ "execution_count": 10,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Destination Index name\n",
+ "INDEX_NAME = \"images\"\n",
+ "\n",
+ "# flag to check if index has to be deleted before creating\n",
+ "SHOULD_DELETE_INDEX = True\n",
+ "\n",
+ "INDEX_MAPPING = {\n",
+ " \"properties\": {\n",
+ " \"image_embedding\": {\n",
+ " \"type\": \"dense_vector\",\n",
+ " \"dims\": 512,\n",
+ " \"index\": True,\n",
+ " \"similarity\": \"cosine\",\n",
+ " },\n",
+ " \"photo_id\": {\"type\": \"keyword\"},\n",
+ " \"photo_image_url\": {\"type\": \"keyword\"},\n",
+ " \"ai_description\": {\"type\": \"text\"},\n",
+ " \"photo_description\": {\"type\": \"text\"},\n",
+ " \"photo_url\": {\"type\": \"keyword\"},\n",
+ " \"photographer_first_name\": {\"type\": \"keyword\"},\n",
+ " \"photographer_last_name\": {\"type\": \"keyword\"},\n",
+ " \"photographer_username\": {\"type\": \"keyword\"},\n",
+ " \"exif_camera_make\": {\"type\": \"keyword\"},\n",
+ " \"exif_camera_model\": {\"type\": \"keyword\"},\n",
+ " \"exif_iso\": {\"type\": \"integer\"},\n",
+ " }\n",
+ "}\n",
+ "\n",
+ "# Index settings\n",
+ "INDEX_SETTINGS = {\n",
+ " \"index\": {\n",
+ " \"number_of_replicas\": \"1\",\n",
+ " \"number_of_shards\": \"1\",\n",
+ " \"refresh_interval\": \"5s\",\n",
+ " }\n",
+ "}\n",
+ "\n",
+ "# check if we want to delete index before creating the index\n",
+ "if SHOULD_DELETE_INDEX:\n",
+ " if es.indices.exists(index=INDEX_NAME):\n",
+ " print(\"Deleting existing %s\" % INDEX_NAME)\n",
+ " es.indices.delete(index=INDEX_NAME, ignore=[400, 404])\n",
+ "\n",
+ "print(\"Creating index %s\" % INDEX_NAME)\n",
+ "es.indices.create(\n",
+ " index=INDEX_NAME, mappings=INDEX_MAPPING, settings=INDEX_SETTINGS, ignore=[400, 404]\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "NKE-j0kPUMn_"
+ },
+ "source": [
+ "### Get image dataset and embeddings\n",
+ "Download:\n",
+ "- The example image dataset is from [Unsplash](https://github.com/unsplash/datasets)\n",
+ "- The [Image embeddings](https://github.com/radoondas/flask-elastic-nlp/blob/main/embeddings/blogs/blogs-no-embeddings.json.zip) are pre-generated using CLIP model\n",
+ "\n",
+ "Then unzip both files."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "zFGaPDRR5mqT",
+ "outputId": "0114cdd6-a714-41ab-9b46-3013bd36698a"
+ },
+ "outputs": [],
+ "source": [
+ "!curl -L https://unsplash.com/data/lite/1.2.0 -o unsplash-research-dataset-lite-1.2.0.zip\n",
+ "!curl -L https://raw.githubusercontent.com/radoondas/flask-elastic-nlp/main/embeddings/images/image-embeddings.json.zip -o image-embeddings.json.zip"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "MBh4AQ8i7C0-",
+ "outputId": "17a50b7f-f052-4b72-daa8-0e8fc630326f"
+ },
+ "outputs": [],
+ "source": [
+ "# Unzip downloaded files\n",
+ "UNSPLASH_ZIP_FILE = \"unsplash-research-dataset-lite-1.2.0.zip\"\n",
+ "EMBEDDINGS_ZIP_FILE = \"image-embeddings.json.zip\"\n",
+ "\n",
+ "with zipfile.ZipFile(UNSPLASH_ZIP_FILE, \"r\") as zip_ref:\n",
+ " print(\"Extracting file \", UNSPLASH_ZIP_FILE, \".\")\n",
+ " zip_ref.extractall(\"data/unsplash/\")\n",
+ "\n",
+ "with zipfile.ZipFile(EMBEDDINGS_ZIP_FILE, \"r\") as zip_ref:\n",
+ " print(\"Extracting file \", EMBEDDINGS_ZIP_FILE, \".\")\n",
+ " zip_ref.extractall(\"data/embeddings/\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "qhZRdUyAQd-s"
+ },
+ "source": [
+ "# Import all pregenerated image embeddings\n",
+ "In this section you will import ~19k documents worth of pregenenerated image embeddings with metadata.\n",
+ "\n",
+ "The process downloads files with images information, merge them and index into Elasticsearch."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 20,
+ "metadata": {
+ "id": "32xrbSUXTODQ"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Indexed 1000 documents\n",
+ "Indexed 2000 documents\n",
+ "Indexed 3000 documents\n",
+ "Indexed 4000 documents\n",
+ "Indexed 5000 documents\n",
+ "Indexed 6000 documents\n",
+ "Indexed 7000 documents\n",
+ "Indexed 8000 documents\n",
+ "Indexed 9000 documents\n",
+ "Indexed 10000 documents\n",
+ "Indexed 11000 documents\n",
+ "Indexed 12000 documents\n",
+ "Indexed 13000 documents\n",
+ "Indexed 14000 documents\n",
+ "Indexed 15000 documents\n",
+ "Indexed 16000 documents\n",
+ "Indexed 17000 documents\n",
+ "Indexed 18000 documents\n",
+ "Indexed 19000 documents\n",
+ "Indexed 19833 image embeddings documents\n"
+ ]
+ }
+ ],
+ "source": [
+ "df_unsplash = pd.read_csv(\"data/unsplash/\" + \"photos.tsv000\", sep=\"\\t\", header=0)\n",
+ "\n",
+ "# follwing 8 lines are fix for inconsistent/incorrect data\n",
+ "df_unsplash[\"photo_description\"].fillna(\"\", inplace=True)\n",
+ "df_unsplash[\"ai_description\"].fillna(\"\", inplace=True)\n",
+ "df_unsplash[\"photographer_first_name\"].fillna(\"\", inplace=True)\n",
+ "df_unsplash[\"photographer_last_name\"].fillna(\"\", inplace=True)\n",
+ "df_unsplash[\"photographer_username\"].fillna(\"\", inplace=True)\n",
+ "df_unsplash[\"exif_camera_make\"].fillna(\"\", inplace=True)\n",
+ "df_unsplash[\"exif_camera_model\"].fillna(\"\", inplace=True)\n",
+ "df_unsplash[\"exif_iso\"].fillna(0, inplace=True)\n",
+ "## end of fix\n",
+ "\n",
+ "# read subset of columns from the original/downloaded dataset\n",
+ "df_unsplash_subset = df_unsplash[\n",
+ " [\n",
+ " \"photo_id\",\n",
+ " \"photo_url\",\n",
+ " \"photo_image_url\",\n",
+ " \"photo_description\",\n",
+ " \"ai_description\",\n",
+ " \"photographer_first_name\",\n",
+ " \"photographer_last_name\",\n",
+ " \"photographer_username\",\n",
+ " \"exif_camera_make\",\n",
+ " \"exif_camera_model\",\n",
+ " \"exif_iso\",\n",
+ " ]\n",
+ "]\n",
+ "\n",
+ "# read all pregenerated embeddings\n",
+ "df_embeddings = pd.read_json(\"data/embeddings/\" + \"image-embeddings.json\", lines=True)\n",
+ "\n",
+ "df_merged = pd.merge(df_unsplash_subset, df_embeddings, on=\"photo_id\", how=\"inner\")\n",
+ "\n",
+ "count = 0\n",
+ "for success, info in parallel_bulk(\n",
+ " client=es,\n",
+ " actions=gen_rows(df_merged),\n",
+ " thread_count=5,\n",
+ " chunk_size=1000,\n",
+ " index=INDEX_NAME,\n",
+ "):\n",
+ " if success:\n",
+ " count += 1\n",
+ " if count % 1000 == 0:\n",
+ " print(\"Indexed %s documents\" % str(count), flush=True)\n",
+ " sys.stdout.flush()\n",
+ " else:\n",
+ " print(\"Doc failed\", info)\n",
+ "\n",
+ "print(\"Indexed %s image embeddings documents\" % str(count), flush=True)\n",
+ "sys.stdout.flush()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "-_i2CIpSz9vw"
+ },
+ "source": [
+ "# Query the image dataset\n",
+ "The next step is to run a query to search for images. The example query searches for `\"model_text\": \"Valentine day flowers\"` using the model `sentence-transformers__clip-vit-b-32-multilingual-v1` that we uploaded to Elasticsearch earlier.\n",
+ "\n",
+ "The process is carried out with a single query, even though internaly it consists of two tasks. One is to tramsform your search text into a vector using the NLP model and the second task is to run the vector search over the image dataset.\n",
+ "\n",
+ "```\n",
+ "POST images/_search\n",
+ "{\n",
+ " \"knn\": {\n",
+ " \"field\": \"image_embedding\",\n",
+ " \"k\": 5,\n",
+ " \"num_candidates\": 10,\n",
+ " \"query_vector_builder\": {\n",
+ " \"text_embedding\": {\n",
+ " \"model_id\": \"sentence-transformers__clip-vit-b-32-multilingual-v1\",\n",
+ " \"model_text\": \"Valentine day flowers\"\n",
+ " }\n",
+ " }\n",
+ " },\n",
+ " \"fields\": [\n",
+ " \"photo_description\",\n",
+ " \"ai_description\",\n",
+ " \"photo_url\"\n",
+ " ],\n",
+ " \"_source\": false\n",
+ "}\n",
+ "```\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 375
+ },
+ "id": "wdicpvRlzmXG",
+ "outputId": "00550041-0aed-4f51-ccd3-18eb705ff7ed"
+ },
+ "outputs": [],
+ "source": [
+ "# Search queary\n",
+ "WHAT_ARE_YOU_LOOKING_FOR = \"Valentine day flowers\"\n",
+ "\n",
+ "source_fields = [\n",
+ " \"photo_description\",\n",
+ " \"ai_description\",\n",
+ " \"photo_url\",\n",
+ " \"photo_image_url\",\n",
+ " \"photographer_first_name\",\n",
+ " \"photographer_username\",\n",
+ " \"photographer_last_name\",\n",
+ " \"photo_id\",\n",
+ "]\n",
+ "query = {\n",
+ " \"field\": \"image_embedding\",\n",
+ " \"k\": 5,\n",
+ " \"num_candidates\": 100,\n",
+ " \"query_vector_builder\": {\n",
+ " \"text_embedding\": {\n",
+ " \"model_id\": \"sentence-transformers__clip-vit-b-32-multilingual-v1\",\n",
+ " \"model_text\": WHAT_ARE_YOU_LOOKING_FOR,\n",
+ " }\n",
+ " },\n",
+ "}\n",
+ "\n",
+ "response = es.search(index=INDEX_NAME, fields=source_fields, knn=query, source=False)\n",
+ "\n",
+ "print(response.body)\n",
+ "\n",
+ "# the code writes the response into a file for the streamlit UI used in the optional step.\n",
+ "with open(\"json_data.json\", \"w\") as outfile:\n",
+ " json.dump(response.body[\"hits\"][\"hits\"], outfile)\n",
+ "\n",
+ "# Use the `loads()` method to load the JSON data\n",
+ "dfr = json.loads(json.dumps(response.body[\"hits\"][\"hits\"]))\n",
+ "# Pass the generated JSON data into a pandas dataframe\n",
+ "dfr = pd.DataFrame(dfr)\n",
+ "# Print the data frame\n",
+ "dfr\n",
+ "\n",
+ "results = pd.json_normalize(json.loads(json.dumps(response.body[\"hits\"][\"hits\"])))\n",
+ "# results\n",
+ "results[\n",
+ " [\n",
+ " \"_id\",\n",
+ " \"_score\",\n",
+ " \"fields.photo_id\",\n",
+ " \"fields.photo_image_url\",\n",
+ " \"fields.photo_description\",\n",
+ " \"fields.photographer_first_name\",\n",
+ " \"fields.photographer_last_name\",\n",
+ " \"fields.ai_description\",\n",
+ " \"fields.photo_url\",\n",
+ " ]\n",
+ "]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Ry62sfHFHFi9"
+ },
+ "source": [
+ "# [Optional] Simple streamlit UI\n",
+ "In the following section, you will view the response in a simple UI for better visualisation.\n",
+ "\n",
+ "The query in the previous step did write down a file response `json_data.json` for the UI to load and visualise.\n",
+ "\n",
+ "Follow the steps below to see the results in a table."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "iUAbRqr8II-x"
+ },
+ "source": [
+ "### Install tunnel library"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "RGEmAt2DjtN7",
+ "outputId": "f6c37d54-7e09-4e59-fc21-8a3db4fa840d"
+ },
+ "outputs": [],
+ "source": [
+ "!npm install localtunnel"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "KUAfucnYITka"
+ },
+ "source": [
+ "### Create application"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "9Wb7GOWMXFnF",
+ "outputId": "6db23ef3-b25e-4f80-a3cb-6d08c1c78c16"
+ },
+ "outputs": [],
+ "source": [
+ "%%writefile app.py\n",
+ "\n",
+ "import streamlit as st\n",
+ "import json\n",
+ "import pandas as pd\n",
+ "\n",
+ "\n",
+ "def get_image_preview(image_url):\n",
+ " \"\"\"Returns an HTML
tag with preview of the image.\"\"\"\n",
+ " return f\"\"\"
\"\"\"\n",
+ "\n",
+ "\n",
+ "def get_url_link(photo_url):\n",
+ " \"\"\"Returns an HTML tag to the image page.\"\"\"\n",
+ " return f\"\"\" {photo_url} \"\"\"\n",
+ "\n",
+ "\n",
+ "def main():\n",
+ " \"\"\"Creates a Streamlit app with a table of images.\"\"\"\n",
+ " data = json.load(open(\"json_data.json\"))\n",
+ " table = []\n",
+ " for image in data:\n",
+ " image_url = image[\"fields\"][\"photo_image_url\"][0]\n",
+ " image_preview = get_image_preview(image_url)\n",
+ " photo_url = image[\"fields\"][\"photo_url\"][0]\n",
+ " photo_url_link = get_url_link(photo_url)\n",
+ " table.append([image_preview, image[\"fields\"][\"photo_id\"][0],\n",
+ " image[\"fields\"][\"photographer_first_name\"][0],\n",
+ " image[\"fields\"][\"photographer_last_name\"][0],\n",
+ " image[\"fields\"][\"photographer_username\"][0],\n",
+ " photo_url_link])\n",
+ "\n",
+ " st.write(pd.DataFrame(table, columns=[\"Image\", \"ID\", \"First Name\", \"Last Name\",\n",
+ " \"Photographer username\", \"Photo url\"]).to_html(escape = False),\n",
+ " unsafe_allow_html=True)\n",
+ "\n",
+ "\n",
+ "if __name__ == \"__main__\":\n",
+ " main()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "CjDhvbGhHuiz"
+ },
+ "source": [
+ "### Run app\n",
+ "Run the application and check your IP for the tunneling"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "851CeYi8jvuF",
+ "outputId": "46a64023-e990-4900-f482-5558237f08cc"
+ },
+ "outputs": [],
+ "source": [
+ "!streamlit run app.py &>/content/logs.txt & curl ipv4.icanhazip.com"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "4OuSLFHyHy5M"
+ },
+ "source": [
+ "### Create the tunnel\n",
+ "Run the tunnel and use the link below to connect to the tunnel.\n",
+ "\n",
+ "Use the IP from the previous step to connect to the application"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 38,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "inF7ceBmjyE3",
+ "outputId": "559ce180-3f0f-4475-c9a9-46dc91389276"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\u001b[K\u001b[?25hnpx: installed 22 in 2.186s\n",
+ "your url is: https://nine-facts-act.loca.lt\n",
+ "^C\n"
+ ]
+ }
+ ],
+ "source": [
+ "!npx localtunnel --port 8501"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "SbxbVzvQ7caR"
+ },
+ "source": [
+ "# Resources\n",
+ "\n",
+ "Blog: https://www.elastic.co/blog/implement-image-similarity-search-elastic\n",
+ "\n",
+ "GH : https://github.com/radoondas/flask-elastic-image-search\n"
+ ]
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "provenance": []
+ },
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.13"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}