# Pathway RAG Application

This project is a **Retrieval-Augmented Generation (RAG)** application built on the **Pathway framework**. The application indexes documents from various sources (such as Google Drive and Microsoft SharePoint) and supports real-time search and question answering over them. It uses **Gemini** to embed the documents.

## Project Structure

Below is the project structure of the app:

### Files and Directories

- **app.py**: Main application file that runs the app logic and integrates Pathway modules.
- **config.yaml**: Configuration file that stores settings related to data sources and other parameters.
- **data/**: Directory containing PDF files that the app indexes and processes.
  - *PDF files*: These are example documents that the app can use for retrieval and answering queries.
- **Dockerfile**: Defines the Docker image for containerization, which is used for deployment.
- **README.md**: Documentation file explaining how to set up and run the application.
- **requirements.txt**: Lists all the dependencies required to run the app (e.g., Pathway, other Python libraries).
- **server.py**: Contains the server logic and defines the endpoints that accept POST requests.
- **static/styles.css**: Contains the CSS for styling the front-end HTML pages.
- **templates/**: Contains the HTML files used for rendering the front-end pages.
  - *index.html*: Main page of the app.
  - *logs.html*: A page to display logs or other app-related information.

## Key Features

1. **Real-Time Document Indexing**: The app indexes documents from sources like Google Drive and Microsoft SharePoint.
2. **Embedding with Gemini**: It uses the Gemini model to create embeddings for the documents.
3. **Retrieval-Augmented Generation (RAG)**: The app retrieves relevant documents and generates answers to user queries.
4. **Containerized Deployment**: The app can be containerized and deployed via Docker.
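
To make the retrieve-then-generate idea concrete, here is a minimal sketch in plain Python. It is not the code in `app.py`: the hand-written vectors stand in for Gemini embeddings, and the names `retrieve` and `build_prompt` are hypothetical helpers used only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, index, k=2):
    """Return the k passages whose vectors are most similar to the query vector.

    `index` is a list of (text, vector) pairs; in the real app the vectors
    would come from an embedding model rather than being written by hand.
    """
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question, passages):
    """Combine the retrieved passages and the question into one LLM prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Toy index with made-up 3-dimensional "embeddings".
toy_index = [
    ("Invoices are due in 30 days.", [1.0, 0.0, 0.0]),
    ("The office is in Berlin.", [0.0, 1.0, 0.0]),
    ("Late invoices incur a fee.", [0.9, 0.1, 0.0]),
]
passages = retrieve([1.0, 0.0, 0.0], toy_index, k=2)
prompt = build_prompt("When are invoices due?", passages)
```

The prompt built this way would then be sent to the language model, so the answer is grounded in the retrieved passages rather than in the model's memory alone.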

## How to Use

1. Clone the repository.
2. Install dependencies using the `requirements.txt` file:

   ```bash
   pip install -r requirements.txt
   ```
3. Build the Docker image, then run the container: `docker build -t pathway-rag-app .` followed by `docker run -p 5000:9000 pathway-rag-app`. The `-p 5000:9000` option publishes container port 9000 on port 5000 of your machine.
4. If Docker is not working, run the two scripts directly instead:
   `python3 app.py` and
   `python3 server.py`
5. Open http://localhost:5000 in your browser.
6. Interact with the app.
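
For the non-Docker route, one way to start both scripts from a single terminal is a small launcher. This is a sketch that assumes `app.py` and `server.py` are meant to run at the same time, which the steps above suggest but do not state. `launch` and `wait_all` are hypothetical helper names, not part of this repository.

```python
import subprocess
import sys


def launch(commands):
    """Start each command as a child process and return the Popen handles."""
    return [subprocess.Popen(cmd) for cmd in commands]


def wait_all(processes):
    """Block until every process has exited; return their exit codes in order."""
    return [proc.wait() for proc in processes]
```

A possible use is `procs = launch([[sys.executable, "app.py"], [sys.executable, "server.py"]])` followed by `wait_all(procs)`, with Ctrl+C stopping both scripts.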

## Configuration
1. Update the `config.yaml` file to specify the data sources (Google Drive, SharePoint, etc.).
2. Add your own Gemini API key to the `.env` file.
3. Modify `app.py` or `server.py` to adjust any specific logic for your use case.
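
As an illustration of step 2, the sketch below reads a key from a `.env` file using only the standard library. The variable name `GEMINI_API_KEY` is an assumption, since this README does not say which name `app.py` expects, and the project may well use a package such as `python-dotenv` instead of a hand-written parser.

```python
import os


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip surrounding whitespace and optional single or double quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_api_key(env_path=".env", name="GEMINI_API_KEY"):
    """Return the key from the process environment, else from the .env file.

    `name` is a hypothetical variable name chosen for this example.
    """
    if name in os.environ:
        return os.environ[name]
    with open(env_path, encoding="utf-8") as handle:
        values = parse_env_text(handle.read())
    if name not in values:
        raise KeyError(f"{name} not found in {env_path}")
    return values[name]
```

Keeping the key in `.env` rather than in `config.yaml` or the source code makes it easier to keep it out of version control.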

## API Endpoints
`POST /v1/pw_ai_answer`: Endpoint for submitting a question and receiving an answer based on indexed documents.
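
A request to this endpoint might look like the sketch below, which uses only the Python standard library. Two things here are assumptions rather than facts from this README: the JSON body uses a `prompt` field, and the base URL (host and port) depends on how you started the server. Check `server.py` for the actual request schema.

```python
import json
import urllib.request


def build_answer_request(base_url, question):
    """Build (but do not send) a POST request for /v1/pw_ai_answer."""
    # Assumption: the endpoint expects a JSON object with a "prompt" field.
    payload = json.dumps({"prompt": question}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/pw_ai_answer",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(base_url, question, timeout=30):
    """Send the question and return the decoded JSON response."""
    request = build_answer_request(base_url, question)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, a call such as `ask("http://localhost:5000", "What does the contract say about renewal?")` would return whatever JSON the server responds with. The port in that call is only an example.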

## Technologies Used
1. **Pathway Framework**: For real-time data processing and RAG capabilities.
2. **Gemini LLM**: For embedding and semantic search.
3. **Docker**: For containerization and deployment.
4. **Python**: Backend logic and server implementation.
5. **HTML/CSS**: Front-end rendering of the app.

## License
This project is licensed under the MIT License.

