Document Insights is an advanced document understanding system that performs three core tasks with state-of-the-art accuracy:
- 🟩 Checkbox-Text Pair Detection
- 🧠 Document Classification (OCR-Free)
- 📄 Document Parsing with LLMs
A custom YOLOv8-large model fine-tuned on 10,000+ diverse document images (scanned and digital), delivering outstanding precision.
| Model | F1-Score |
|---|---|
| Azure Form Recognizer | 0.72 |
| GPT-4 Vision | 0.63 |
| YOLO Checkbox Detector | 0.88 |
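As a quick sanity check once the weights are in place (see the setup steps below), a minimal inference sketch with the `ultralytics` package might look like this. The weight filename `model/checkbox_detector.pt` is an assumption; use whatever file you download:

```python
# Minimal sketch: run the checkbox detector on a single page.
# The weight filename below is hypothetical -- point it at the
# downloaded weight file placed in the model/ directory.
from ultralytics import YOLO

model = YOLO("model/checkbox_detector.pt")
results = model("sample_document.png")  # any scanned or digital page

# Print each detected box with its class name and confidence.
for box in results[0].boxes:
    cls_name = results[0].names[int(box.cls)]
    print(cls_name, box.xyxy.tolist(), float(box.conf))
```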
OCR-free classification using the DONUT model - fast, lightweight, and accurate.
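For reference, OCR-free classification with DONUT follows the standard Hugging Face pattern sketched below. The checkpoint and task prompt here are the public RVL-CDIP fine-tune; this repo's classifier may use a different fine-tuned checkpoint:

```python
# Generic DONUT classification sketch (no OCR step). Uses the public
# RVL-CDIP fine-tune; the repo's own checkpoint may differ.
import re
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

processor = DonutProcessor.from_pretrained("naver-clova-ix/donut-base-finetuned-rvlcdip")
model = VisionEncoderDecoderModel.from_pretrained("naver-clova-ix/donut-base-finetuned-rvlcdip")

image = Image.open("sample_document.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values

# DONUT is steered by a task prompt; <s_rvlcdip> selects classification.
decoder_input_ids = processor.tokenizer(
    "<s_rvlcdip>", add_special_tokens=False, return_tensors="pt"
).input_ids

outputs = model.generate(
    pixel_values,
    decoder_input_ids=decoder_input_ids,
    max_length=model.decoder.config.max_position_embeddings,
)

sequence = processor.batch_decode(outputs)[0]
sequence = sequence.replace(processor.tokenizer.eos_token, "").replace(
    processor.tokenizer.pad_token, ""
)
sequence = re.sub(r"<.*?>", "", sequence, count=1).strip()  # drop the task token
print(processor.token2json(sequence))  # e.g. {'class': 'invoice'}
```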
Flexible parsing options:
- ☁️ API-based (OpenAI, Claude)
- 💻 Local LLMs (`qwen2.5:14b` via Ollama)
- Clone the repository:

  ```bash
  git clone https://github.com/TatsuProject/document_insights_base_model
  cd document_insights_base_model
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Join our Discord Community to get access to the model weights.
- Create a `model/` directory at the root of this repository.
- Place the downloaded weight file inside the `model/` folder.
Weights for DONUT will be downloaded automatically from Hugging Face the first time the model is used.
Please ensure you have at least 10 GB of free disk space.
To use LLMs through an API, you need an API key and must include it in the script located at `doc_parser/get_llm_response_api.py`. You can also implement your own custom logic in this script as needed.
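As a rough illustration, the API-backed path in that script could be as simple as the OpenAI call below. The function name, model choice, and prompt wiring here are illustrative assumptions, not the repo's exact code:

```python
# Illustrative sketch only -- the actual doc_parser/get_llm_response_api.py
# in this repo may be structured differently.
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])  # or paste your key here


def get_llm_response_api(document_text: str) -> str:
    """Send extracted document text to the LLM and return the parsed result."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Parse the document into structured JSON."},
            {"role": "user", "content": document_text},
        ],
    )
    return response.choices[0].message.content
```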
To run document parsing without relying on external APIs, you can host LLMs locally using Ollama.
- Visit the official website: https://ollama.com
- Download and install Ollama for your OS (Linux, macOS, or Windows). On Linux, you can install it from the shell:

  ```bash
  # Install Ollama (Linux)
  curl -fsSL https://ollama.com/install.sh | sh
  ```

- Start the Ollama service:

  ```bash
  ollama serve
  ```
Run the following commands to download and serve Qwen 2.5 (14B) using Ollama:

```bash
ollama pull qwen2.5:14b
ollama run qwen2.5:14b
```
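Once the model is being served, the parser can reach it over Ollama's local REST API (default port 11434). A minimal sketch, with a placeholder prompt:

```python
# Minimal sketch: query the locally served Qwen model via Ollama's REST API.
import requests

response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen2.5:14b",
        "prompt": "Extract all key-value pairs from this invoice: ...",
        "stream": False,  # return one JSON object instead of a token stream
    },
)
print(response.json()["response"])
```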
1. Start the main app service:

   ```bash
   python app.py
   ```

2. Run a test on a document/image:

   ```bash
   python test_app.py
   ```
Make sure to update `test_app.py` with the correct image path.

You can set the `task_type` parameter to one of the following:

- `"checkbox"` – for checkbox-text detection
- `"doc-class"` – for document classification
- `"doc-parse"` – for document parsing using an LLM
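For example, if `test_app.py` sends requests to the app service, selecting a task might look roughly like the sketch below. The endpoint, port, and payload fields are assumptions; check `test_app.py` for the actual names:

```python
# Hypothetical sketch -- the real endpoint, port, and payload fields
# are defined by app.py / test_app.py in this repo.
import requests

payload = {
    "task_type": "doc-class",             # "checkbox", "doc-class", or "doc-parse"
    "image_path": "samples/invoice.png",  # update to your own document/image
}
result = requests.post("http://localhost:8000/process", json=payload)
print(result.json())
```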
- **For using the LLM via API:** A minimum of 16 GB of RAM is sufficient to interact with the LLM through the API.
- **For running the LLM locally:** You'll need at least 32 GB of RAM and 12 GB of GPU memory for optimal performance.