LLM Based Movie RecSys

Configuration

Create a .env file in the project's root directory to store your MongoDB and Milvus/Zilliz connection settings:

MONGO_URI
MILVUS_URI
MILVUS_TOKEN
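
As a rough illustration of how these three settings might be read, here is a minimal standard-library sketch. The helper names parse_env_text and load_settings are hypothetical and are not part of this repository. The scripts may load the file differently, for example through a package such as python-dotenv.

```python
import os

# The three keys listed above; everything else in this sketch is illustrative.
REQUIRED_KEYS = ("MONGO_URI", "MILVUS_URI", "MILVUS_TOKEN")


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so URIs containing "=" stay intact.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(env_text="", environ=None):
    """Merge .env text with environment variables and check required keys.

    Values from the environment take precedence over values from the file.
    """
    environ = os.environ if environ is None else environ
    merged = {**parse_env_text(env_text), **environ}
    missing = [key for key in REQUIRED_KEYS if not merged.get(key)]
    if missing:
        raise KeyError("Missing settings: " + ", ".join(missing))
    return {key: merged[key] for key in REQUIRED_KEYS}
```

A script could call load_settings(open(".env").read()) at startup, so that a missing value fails early with a clear message.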

Exporting the Movies Dataset

You can skip this step and use the included movies.json, which already contains the exported movies.

To export the movies from the sample_mflix sample dataset in your MongoDB database and save them as movies.json, run the db_export.py script. The script connects to your MongoDB instance, retrieves the movie documents, and writes them to a JSON file that is easy to share and analyze.
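
The export step could look roughly like the sketch below. It is not the contents of db_export.py. The function names, the database name sample_mflix, and the collection name movies are assumptions made for illustration. It relies on the third-party pymongo package, imported lazily so the JSON helper can be used on its own.

```python
import json


def write_movies(movies, path="movies.json"):
    """Write an iterable of movie documents to a JSON file and return the count."""
    records = list(movies)
    with open(path, "w", encoding="utf-8") as fh:
        # default=str turns values JSON cannot encode natively
        # (for example ObjectId or datetime) into strings.
        json.dump(records, fh, default=str, ensure_ascii=False)
    return len(records)


def export_movies(mongo_uri, db_name="sample_mflix",
                  collection_name="movies", path="movies.json"):
    """Fetch every document in the collection and save it as JSON.

    db_name and collection_name are assumed values, not taken from the script.
    """
    from pymongo import MongoClient  # third-party: pip install pymongo

    client = MongoClient(mongo_uri)
    try:
        cursor = client[db_name][collection_name].find({})
        return write_movies(cursor, path)
    finally:
        client.close()
```

Loading the whole cursor into memory is acceptable for a small sample dataset. A larger collection would be better written out in batches.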

Obtain Embeddings

To generate an embedding for each movie from its description, run the get_embeddings.py script. It uses the all-MiniLM-L6-v2 model from the Sentence Transformers library, which produces a 384-dimensional embedding for each movie description. The embedding is added to each movie record, and the dataset is saved as movies_with_embeddings.json.
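
The embedding step can be sketched as follows. This is an illustration under stated assumptions, not the actual get_embeddings.py. The field names plot and embedding and the function names are hypothetical. The model is loaded through the third-party sentence-transformers package, imported lazily.

```python
import json


def attach_embeddings(movies, vectors, field="embedding"):
    """Return copies of the movie dicts with one vector attached to each."""
    if len(movies) != len(vectors):
        raise ValueError("Expected exactly one vector per movie")
    # Convert to plain floats so the result can be serialized with json.
    return [{**movie, field: [float(x) for x in vector]}
            for movie, vector in zip(movies, vectors)]


def embed_movies(movies, text_field="plot", model_name="all-MiniLM-L6-v2"):
    """Encode each movie's description text and attach the result.

    text_field is an assumed name for the description field.
    """
    from sentence_transformers import SentenceTransformer  # third-party

    model = SentenceTransformer(model_name)
    texts = [str(movie.get(text_field) or "") for movie in movies]
    vectors = model.encode(texts)  # one 384-dimensional row per text
    return attach_embeddings(movies, vectors)


def save_movies(movies, path="movies_with_embeddings.json"):
    """Write the enriched movie list to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(movies, fh, default=str, ensure_ascii=False)
```

Encoding all descriptions in a single model.encode call lets the library batch the work internally, which is usually faster than encoding one description at a time.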

Populate Database

The populate_milvus_db.py script is designed to:

Connect to your Milvus/Zilliz database.
Create a collection named movies_collection to store movie data and their corresponding embeddings.
Index the collection for efficient vector similarity search.

This script sets up the database and prepares it for recommendation queries, so that movies with similar descriptions can be retrieved quickly.
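
The three steps above might be sketched as below, using the MilvusClient interface from the third-party pymilvus package. This is not the contents of populate_milvus_db.py. The row layout (vector and title fields), the cosine metric, the batch size, and the function names are all assumptions. Only the collection name and the 384-dimensional embedding size come from this README.

```python
EMBEDDING_DIM = 384  # output size of all-MiniLM-L6-v2


def to_rows(movies, dim=EMBEDDING_DIM):
    """Convert movie dicts into insertable rows, skipping unusable embeddings."""
    rows = []
    for movie in movies:
        vector = movie.get("embedding")
        if vector is None or len(vector) != dim:
            continue  # no embedding, or wrong dimensionality
        rows.append({"vector": list(vector),
                     "title": str(movie.get("title", ""))})
    return rows


def populate(uri, token, movies, collection_name="movies_collection",
             batch_size=500):
    """Create the collection if needed and insert the rows in batches."""
    from pymilvus import MilvusClient  # third-party: pip install pymilvus

    client = MilvusClient(uri=uri, token=token)
    if not client.has_collection(collection_name):
        # Quick setup: creates an "id" primary key and a "vector" field,
        # and builds a default index on the vector field.
        client.create_collection(
            collection_name=collection_name,
            dimension=EMBEDDING_DIM,
            metric_type="COSINE",  # assumed similarity metric
            auto_id=True,
        )
    rows = to_rows(movies)
    for start in range(0, len(rows), batch_size):
        client.insert(collection_name=collection_name,
                      data=rows[start:start + batch_size])
    return len(rows)
```

Checking the vector length before inserting keeps a single malformed record from failing the whole batch.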

anasserhussien/movie_llm_recsys

Folders and files

Latest commit

History

Repository files navigation

LLM Based Movie RecSys

Configuration

Exporting the Movies Dataset

Obtain Embeddings

Populate Database

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages