source code: Add Multimodal RAG with Elasticsearch Gotham City tutorial #390

Merged
merged 25 commits into from
Feb 28, 2025
Changes from 23 commits
Commits
a843f13
Add Multimodal RAG with Elasticsearch Gotham City tutorial
salgado Feb 8, 2025
e47c5e7
Add Multimodal RAG with Elasticsearch Gotham City tutorial
salgado Feb 8, 2025
1557fb2
docs: add OpenAI API key setup instructions
salgado Feb 10, 2025
39674b2
docs: exclude licence
salgado Feb 10, 2025
d2b1b19
fix: fixed comments
salgado Feb 10, 2025
47d6240
docs: added env template
salgado Feb 10, 2025
3748cef
issues fixed 1st review
salgado Feb 14, 2025
fc0f06a
foo
codefromthecrypt Feb 25, 2025
4182f10
foo
codefromthecrypt Feb 25, 2025
76475fa
polish-and-docker
codefromthecrypt Feb 26, 2025
3244b2a
env-example
codefromthecrypt Feb 26, 2025
1217ef6
env-example
codefromthecrypt Feb 26, 2025
55904f1
fix glitch
codefromthecrypt Feb 26, 2025
e24ca5b
remove spurios log
codefromthecrypt Feb 26, 2025
fc2b80d
Add Jupyter notebook implementation of Multimodal RAG
salgado Feb 27, 2025
e381c57
Add Jupyter notebook implementation of Multimodal RAG
salgado Feb 27, 2025
312baa4
Add Jupyter notebook implementation of Multimodal RAG
salgado Feb 27, 2025
c90ba3d
Update documentation with simpler README and Docker setup guide
salgado Feb 27, 2025
36cd475
adding changes from review
JessicaGarson Feb 27, 2025
6866d48
Update 01-mmrag-blog-quick-start.ipynb
JessicaGarson Feb 27, 2025
d34ab33
remove wrong folder
salgado Feb 27, 2025
112c8fa
Remove Docker configuration files
salgado Feb 27, 2025
d7f2472
remove coker references
salgado Feb 27, 2025
a3abfb2
fixing first line notebook to test branch
salgado Feb 28, 2025
be1a03b
fixing first line notebook to main repo
salgado Feb 28, 2025
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
# Building a Multimodal RAG Pipeline with Elasticsearch: The Story of Gotham City

This repository contains the code for implementing a Multimodal Retrieval-Augmented Generation (RAG) system using Elasticsearch. The system processes and analyzes different types of evidence (images, audio, text, and depth maps) to solve a crime in Gotham City.

## Overview

The pipeline demonstrates how to:
- Generate unified embeddings for multiple modalities using ImageBind
- Store and search vectors efficiently in Elasticsearch
- Analyze evidence using GPT-4 to generate forensic reports

## Prerequisites

- Python 3.x
- Elasticsearch cluster (cloud or local)
- OpenAI API key - Set up an OpenAI account and create a [secret key](https://platform.openai.com/docs/quickstart)
- 8GB+ RAM
- GPU (optional but recommended)

## Code execution

We provide a Google Colab notebook that allows you to explore the entire pipeline interactively:
- [Open the Multimodal RAG Pipeline Notebook](notebook/01-mmrag-blog-quick-start.ipynb)
- This notebook includes step-by-step instructions and explanations for each stage of the pipeline


## Project Structure

```
├── README.md
├── requirements.txt
├── notebook/
│   └── 01-mmrag-blog-quick-start.ipynb   # Jupyter notebook execution
├── src/
│   ├── embedding_generator.py            # ImageBind wrapper
│   ├── elastic_manager.py                # Elasticsearch operations
│   └── llm_analyzer.py                   # GPT-4 integration
├── stages/
│   ├── 01-stage/                         # File organization
│   ├── 02-stage/                         # Embedding generation
│   ├── 03-stage/                         # Elasticsearch indexing/search
│   └── 04-stage/                         # Evidence analysis
└── data/                                 # Sample data
    ├── images/
    ├── audios/
    ├── texts/
    └── depths/
```

## Sample Data

The repository includes sample evidence files:
- Images: Crime scene photos and security camera footage
- Audio: Suspicious sound recordings
- Text: Mysterious notes and riddles
- Depth Maps: 3D scene captures
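
As a rough illustration of how the sample files could be gathered per modality, here is a minimal sketch. The folder names come from the project tree; the function name `collect_evidence` and the modality labels are assumptions made for this example, not code from the repository.

```python
from pathlib import Path

# Folder names follow the data/ layout in the project tree.
# The modality labels (keys) are illustrative, not taken from the repository.
MODALITY_DIRS = {
    "vision": "images",
    "audio": "audios",
    "text": "texts",
    "depth": "depths",
}


def collect_evidence(data_dir):
    """Return {modality: sorted list of file paths} for each modality folder that exists."""
    root = Path(data_dir)
    evidence = {}
    for modality, folder in MODALITY_DIRS.items():
        folder_path = root / folder
        if folder_path.is_dir():
            # Keep regular files only; nested directories are ignored in this sketch.
            evidence[modality] = sorted(p for p in folder_path.iterdir() if p.is_file())
    return evidence
```

Calling `collect_evidence("data")` from the repository root would then give one list of paths per modality, ready to be passed to an embedding step.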

## How It Works

1. **Evidence Collection**: Files are organized by modality in the `data/` directory
2. **Embedding Generation**: ImageBind converts each piece of evidence into a 1024-dimensional vector
3. **Vector Storage**: Elasticsearch stores embeddings with metadata for efficient retrieval
4. **Similarity Search**: New evidence is compared against the database using k-NN search
5. **Analysis**: GPT-4 analyzes the connections between evidence to identify suspects
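
Steps 3 and 4 can be sketched as two plain dictionaries: an index mapping with a `dense_vector` field and a k-NN search clause. This is a hedged sketch, not the repository's `elastic_manager.py`; the field names (`embedding`, `modality`, `description`), the cosine similarity setting, and the helper names are assumptions for illustration. Only the 1024 dimensions come from the text above.

```python
EMBEDDING_DIMS = 1024  # vector size stated in step 2


def build_index_mapping(dims=EMBEDDING_DIMS):
    """Mapping with one indexed dense_vector field plus simple metadata fields (assumed names)."""
    return {
        "properties": {
            "embedding": {
                "type": "dense_vector",
                "dims": dims,
                "index": True,
                "similarity": "cosine",  # assumption: cosine similarity over the embeddings
            },
            "modality": {"type": "keyword"},
            "description": {"type": "text"},
        }
    }


def build_knn_query(query_vector, k=3, num_candidates=100):
    """Build the knn clause for a search request against the 'embedding' field."""
    if len(query_vector) != EMBEDDING_DIMS:
        raise ValueError(f"expected {EMBEDDING_DIMS} dimensions, got {len(query_vector)}")
    if num_candidates < k:
        raise ValueError("num_candidates must be at least k")
    return {
        "field": "embedding",
        "query_vector": list(query_vector),
        "k": k,
        "num_candidates": num_candidates,
    }
```

With the official Python client (8.x), these dictionaries would typically be passed as `es.indices.create(index="evidence", mappings=build_index_mapping())` and `es.search(index="evidence", knn=build_knn_query(vector))`, where the index name `evidence` is a placeholder.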

Binary file not shown.
@@ -0,0 +1,8 @@
Why so serious?

The show has just begun and you're already running
While clowns are dancing and the city's stunning
In the abandoned theater, a surprise awaits
Come play with me before it's too late!

HAHAHAHAHA!
@@ -0,0 +1,15 @@
PRELIMINARY REPORT - GCPD
Date: 01/28/2025
Time: 22:30

Incident: Break-in and Vandalism
Location: Gotham Central Bank
Evidence Found:
- Playing cards scattered
- Smile graffiti on walls
- Suspicious audio recording
- Witnesses report maniacal laughter

Status: Under Investigation
Priority Level: MAXIMUM
Primary Suspect: Unknown (possible Joker involvement)
@@ -0,0 +1,16 @@
HAHAHA!

Dear Detective,

In a city of endless night, a new game unfolds
Where chaos reigns and fear takes hold
I left a gift at Gotham Central Bank
Time's ticking, your mind goes blank

The clues are there, scattered with care
Each laugh echoes everywhere
Midnight strikes, you won't catch me
In Gotham's heart, chaos runs free!

With a smile,
?
@@ -0,0 +1,5 @@
Incident Log:
1. Gotham Central Bank - 22:15 - Alarm triggered
2. Monarch Theater - 22:45 - Suspicious laughter reported
3. Abandoned Amusement Park - 23:00 - Strange lights
4. Ace Chemical Plant - 23:30 - Suspicious movement
@@ -0,0 +1,17 @@
# Make a copy of this file with the name .env and assign values to variables

# How you connect to Elasticsearch: change details to your instance
ELASTICSEARCH_URL=
ELASTICSEARCH_API_KEY=
# If not using API key, uncomment these and fill them in:
# ELASTICSEARCH_USER=elastic
# ELASTICSEARCH_PASSWORD=elastic

# OpenAI Configuration
OPENAI_API_KEY=

# Model Configuration

# Optional Configuration
# LOG_LEVEL=INFO
# DEBUG=False
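
The template above implies two ways to authenticate: an API key, or a user and password when no key is set. A minimal sketch of how those variables might be turned into client arguments follows. The helper name `elasticsearch_client_kwargs` is hypothetical, and the sketch assumes the variables are already in the process environment (for example, loaded from `.env` with a tool such as python-dotenv).

```python
import os


def elasticsearch_client_kwargs(env=None):
    """Build keyword arguments for the Elasticsearch client from the .env variable names."""
    env = os.environ if env is None else env

    url = env.get("ELASTICSEARCH_URL")
    if not url:
        raise ValueError("ELASTICSEARCH_URL is not set")

    kwargs = {"hosts": [url]}
    api_key = env.get("ELASTICSEARCH_API_KEY")
    if api_key:
        # Prefer the API key when it is provided.
        kwargs["api_key"] = api_key
    else:
        # Fall back to the commented-out user/password variables.
        user = env.get("ELASTICSEARCH_USER")
        password = env.get("ELASTICSEARCH_PASSWORD")
        if not (user and password):
            raise ValueError("set ELASTICSEARCH_API_KEY or ELASTICSEARCH_USER and ELASTICSEARCH_PASSWORD")
        kwargs["basic_auth"] = (user, password)
    return kwargs
```

The result could then be unpacked into the 8.x Python client, for example `Elasticsearch(**elasticsearch_client_kwargs())`.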