OCR Web Application with Hindi and English Text Extraction

This web application allows users to upload images containing text in both Hindi and English. The app extracts text using OCR and provides a keyword search functionality to search within the extracted text.

Features:

  • Upload an image (JPEG, PNG).
  • Extract text from images using Tesseract OCR.
  • Search for specific keywords in the extracted text.
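
The keyword search step can be sketched in plain Python. This is a minimal illustration, not the code in this repository: the function name `search_keyword` and the line-by-line matching are assumptions made for the example. It normalizes Unicode so that Hindi text typed in a different composed form still matches, and ignores case for English.

```python
import unicodedata


def _normalize(value: str) -> str:
    """Normalize to NFC and fold case so comparisons are consistent."""
    return unicodedata.normalize("NFC", value).casefold()


def search_keyword(text: str, keyword: str) -> list[str]:
    """Return the lines of `text` that contain `keyword` (hypothetical helper)."""
    needle = _normalize(keyword.strip())
    if not needle:
        # An empty query matches nothing rather than everything.
        return []
    return [line for line in text.splitlines() if needle in _normalize(line)]


# Stand-in for OCR output; real extracted text would come from Tesseract.
extracted = "Hello world\nनमस्ते दुनिया\nOCR demo text"
print(search_keyword(extracted, "WORLD"))   # ['Hello world']
print(search_keyword(extracted, "दुनिया"))    # ['नमस्ते दुनिया']
```

Returning whole lines keeps the surrounding context visible in the search results; a real app might highlight the match instead.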

How to Run Locally:

  1. Clone this repository: git clone

  2. Install the required Python packages: pip install -r requirements.txt

  3. Install Tesseract OCR:

  • On Ubuntu:
    sudo apt-get install tesseract-ocr
    
  • On Windows, download and install Tesseract.
  4. Install any required Python libraries that requirements.txt did not already install:

    pip install pytesseract
    pip install Pillow
    pip install streamlit
    pip install torch
    pip install transformers

  5. Run the application: streamlit run anu.py
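
After the setup steps, it can help to confirm that Tesseract is on the PATH and that Hindi language data is installed before launching the app. The sketch below is an assumption-laden example, not the contents of anu.py (which may work differently, given that torch and transformers are also listed): the helper names `installed_languages`, `lang_argument`, and `extract_text` are hypothetical, and it assumes the `hin` and `eng` traineddata files are available (on Ubuntu, Hindi data is typically provided by the `tesseract-ocr-hin` package).

```python
import shutil
import subprocess
from typing import Sequence


def installed_languages() -> list[str]:
    """Return Tesseract's installed language codes, or [] if the binary is missing."""
    if shutil.which("tesseract") is None:
        return []
    result = subprocess.run(
        ["tesseract", "--list-langs"], capture_output=True, text=True
    )
    # Some Tesseract versions print the list to stderr, so read both streams.
    lines = (result.stdout + result.stderr).splitlines()
    # Skip blank lines and the header ("List of available languages ...").
    return [ln.strip() for ln in lines if ln.strip() and not ln.startswith("List of")]


def lang_argument(codes: Sequence[str]) -> str:
    """Join language codes the way Tesseract expects, e.g. 'hin+eng'."""
    return "+".join(codes)


def extract_text(image_path: str, codes: Sequence[str] = ("hin", "eng")) -> str:
    """Run OCR on an image file with the given languages (hypothetical helper)."""
    # Imported lazily so the helpers above work even before the
    # OCR packages are installed.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image, lang=lang_argument(codes))


if __name__ == "__main__":
    langs = installed_languages()
    missing = [code for code in ("hin", "eng") if code not in langs]
    if missing:
        print("Missing Tesseract language data:", ", ".join(missing))
    else:
        print("Tesseract is ready for Hindi and English.")
```

Once the check passes, calling `extract_text("sample.png")` on an image of your own (the filename here is only a placeholder) returns the recognized text as a string.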

Screenshots

(Images not shown. Captions: English Text, Hindi Text, Extracted Text, Hindi Keyword, OCR App, Search Results, Words.)

License:

This project is licensed under the MIT License.