# SignLangNET

SignLangNET is a project aimed at interpreting sign language gestures using deep learning techniques. It utilizes the MediaPipe library for hand and body pose estimation and employs Long Short-Term Memory (LSTM) networks for sequence modeling.
- Overview
- Long Short-Term Memory (LSTM)
- Dependencies
- Installation
- Usage
- Contributing
- License
- Acknowledgements
## Overview

This project consists of three main components:

- **Data Collection**: Uses webcam input to collect sequences of hand and body pose keypoints corresponding to various sign language gestures. The `utils.py` file must be run first to capture the data. A total of 40 videos will be captured, each 30 frames long, and the captured data will be stored in the `DATA_PATH` folder with each frame of an action saved in `.npy` format (see the loading sketch after this list).
- **Model Training**: Trains an LSTM neural network to classify sign language gestures based on the collected keypoints. Run `model.py` to start training the model after collecting the data.
- **Real-time Gesture Interpretation**: Interprets sign language gestures in real time using the trained model. Run `detect.py` after training the model.
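For orientation, the sketch below shows one way the collected keypoints could be loaded back into arrays for training. The `MP_Data` folder name, the gesture labels, and the `DATA_PATH/<gesture>/<sequence>/<frame>.npy` layout are illustrative assumptions; check `utils.py` for the actual `DATA_PATH` value, gesture list, and directory structure.

```python
import os
import numpy as np

DATA_PATH = "MP_Data"            # assumed data folder; match the DATA_PATH used in utils.py
GESTURES = ["hello", "thanks"]   # placeholder gesture labels; use the ones defined in utils.py
NUM_SEQUENCES = 40               # recorded videos, as described above
SEQUENCE_LENGTH = 30             # frames per video

sequences, labels = [], []
for label_idx, gesture in enumerate(GESTURES):
    for seq in range(NUM_SEQUENCES):
        # Each frame of a sequence is stored as a separate .npy keypoint vector.
        window = [
            np.load(os.path.join(DATA_PATH, gesture, str(seq), f"{frame}.npy"))
            for frame in range(SEQUENCE_LENGTH)
        ]
        sequences.append(window)
        labels.append(label_idx)

X = np.array(sequences)   # shape: (num_videos, 30, keypoints_per_frame)
y = np.array(labels)
print(X.shape, y.shape)
```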
## Long Short-Term Memory (LSTM)

Long Short-Term Memory (LSTM) is a type of recurrent neural network (RNN) architecture designed to overcome the vanishing gradient problem and efficiently capture long-range dependencies in sequential data. LSTMs are well-suited for sequence modeling tasks like time series forecasting, natural language processing, and gesture recognition.
In an LSTM network, each LSTM unit maintains a cell state that can store information over long sequences, which allows it to remember important patterns and relationships in the data. Additionally, LSTMs have gating mechanisms (input gate, forget gate, output gate) that control the flow of information and gradients, enabling them to learn and retain information over many time steps.
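To make this concrete, a small stacked-LSTM sequence classifier could be defined in Keras as sketched below. The layer sizes, the 30-frame window, the 1662-value keypoint vector (a flattened MediaPipe Holistic frame), and the number of classes are illustrative assumptions, not necessarily the exact architecture used in `model.py`.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

SEQUENCE_LENGTH = 30   # frames per gesture clip
NUM_KEYPOINTS = 1662   # assumed length of one flattened MediaPipe Holistic keypoint frame
NUM_GESTURES = 3       # placeholder number of gesture classes

model = Sequential([
    # Stacked LSTM layers read the keypoint sequence frame by frame;
    # return_sequences=True passes the full sequence on to the next LSTM layer.
    LSTM(64, return_sequences=True, activation="relu",
         input_shape=(SEQUENCE_LENGTH, NUM_KEYPOINTS)),
    LSTM(128, return_sequences=True, activation="relu"),
    LSTM(64, return_sequences=False, activation="relu"),
    Dense(64, activation="relu"),
    Dense(32, activation="relu"),
    Dense(NUM_GESTURES, activation="softmax"),  # probability distribution over gesture classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["categorical_accuracy"])
model.summary()
```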
For more information on LSTMs, refer to the [LSTM Wikipedia page](https://en.wikipedia.org/wiki/Long_short-term_memory).
## Dependencies

- Python 3.x
- OpenCV
- NumPy
- Matplotlib
- MediaPipe
- TensorFlow
- Scikit-learn
## Installation

- Clone this repository:

  ```bash
  git clone https://github.com/surtecha/SignLangNET.git
  ```

- Install the dependencies:

  ```bash
  pip install -r requirements.txt
  ```

  **Note for Windows users:** If you are using a Windows machine, modify `requirements.txt` by replacing the TensorFlow packages with their Windows-specific versions.
## Usage

- Run `utils.py` to collect training data. Follow the on-screen instructions to perform gestures in front of the webcam. The program will exit automatically after recording all the gestures.
- Run `model.py` to train the LSTM model using the collected data.
- The trained model will be saved as `saved_model.h5`.
- Run `detect.py` to interpret sign language gestures in real time using the trained model (a sketch of such a loop is shown after this list).
- Perform gestures in front of the webcam, and the predicted gestures will be displayed on the screen.
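For orientation, a minimal real-time loop might look like the sketch below: it loads the saved model, keeps a rolling 30-frame window of MediaPipe keypoints, and predicts once the window is full. The gesture labels, the Holistic keypoint layout, and the display details are illustrative assumptions; `detect.py` is the authoritative implementation, and the keypoint vector must match the one used during training.

```python
import cv2
import mediapipe as mp
import numpy as np
from tensorflow.keras.models import load_model

GESTURES = ["hello", "thanks"]  # placeholder labels; use the gestures defined for training
SEQUENCE_LENGTH = 30            # must match the training window length

def extract_keypoints(results):
    # Flatten pose, face, and hand landmarks into one vector; zeros when a part is not detected.
    pose = np.array([[p.x, p.y, p.z, p.visibility] for p in results.pose_landmarks.landmark]).flatten() \
        if results.pose_landmarks else np.zeros(33 * 4)
    face = np.array([[p.x, p.y, p.z] for p in results.face_landmarks.landmark]).flatten() \
        if results.face_landmarks else np.zeros(468 * 3)
    lh = np.array([[p.x, p.y, p.z] for p in results.left_hand_landmarks.landmark]).flatten() \
        if results.left_hand_landmarks else np.zeros(21 * 3)
    rh = np.array([[p.x, p.y, p.z] for p in results.right_hand_landmarks.landmark]).flatten() \
        if results.right_hand_landmarks else np.zeros(21 * 3)
    return np.concatenate([pose, face, lh, rh])

model = load_model("saved_model.h5")
window = []
cap = cv2.VideoCapture(0)
with mp.solutions.holistic.Holistic(min_detection_confidence=0.5,
                                    min_tracking_confidence=0.5) as holistic:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        window.append(extract_keypoints(results))
        window = window[-SEQUENCE_LENGTH:]  # keep only the most recent 30 frames
        if len(window) == SEQUENCE_LENGTH:
            probs = model.predict(np.expand_dims(window, axis=0), verbose=0)[0]
            cv2.putText(frame, GESTURES[int(np.argmax(probs))], (10, 30),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        cv2.imshow("SignLangNET", frame)
        if cv2.waitKey(10) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()
```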
## Contributing

Contributions are welcome! If you find any issues or have suggestions for improvements, feel free to open an issue or create a pull request.
## License

This project is licensed under the MIT License.
## Acknowledgements

- This project utilizes the MediaPipe library developed by Google.
- Inspiration for this project comes from efforts to make technology more accessible for individuals with disabilities.