# AWS Face Recognition

A comprehensive AWS-based face recognition system that demonstrates:
- Video Splitting with AWS Lambda (using FFmpeg)
- Face Detection & Recognition with a pre-trained CNN (ResNet)
- Auto-scaling on EC2 based on SQS message queue depth
- Serverless Pipelines (S3 triggers, asynchronous Lambda invocations)
Note: This project contains subfolders for different parts (Project1, Project2, Project3).
## Table of Contents

- Overview
- Repository Structure
- Architecture
- Prerequisites
- Setup & Installation
- Usage
- Security Considerations
- Possible Improvements
- Acknowledgments
- License
- Contact
## Overview

This repository showcases an end-to-end face recognition pipeline using AWS:
- Input Bucket: Users upload `.mp4` videos to `<ASU_ID>-input`.
- Video-Splitting Lambda: Extracts frames from uploaded videos and stores them in `<ASU_ID>-stage-1`.
- Face-Recognition Lambda: Detects faces, classifies them, and writes the result to `<ASU_ID>-output`.
- Auto-Scaling App Tier: Launches or terminates EC2 instances based on SQS queue length.
- Web Tier: A Spring Boot REST API that interacts with SQS and S3.
Originally developed for a Cloud Computing course, but it’s a solid reference for real-world AWS patterns.
## Repository Structure

```
aws-face-recognition/
├── Project1/            # Basic AWS resource management (EC2, S3, SQS)
├── Project2/            # IaaS-based face recognition with auto-scaling
│   ├── src/
│   ├── pom.xml
│   └── ...
├── Project3/            # PaaS-based (Lambda) video splitting & face recognition
│   ├── video-splitting/
│   ├── face-recognition/
│   └── ...
├── .gitignore
├── README.md            # You're here!
└── Other scripts/data as needed
```
## Architecture

A high-level flow (PaaS version, Project 3):

```
User (video) -> [S3: <ASU_ID>-input] -> [Lambda: video-splitting] -> [S3: <ASU_ID>-stage-1] -> [Lambda: face-recognition] -> [S3: <ASU_ID>-output]
```
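The video-splitting stage can be sketched as a Python Lambda handler triggered by S3. This is a minimal illustration, not the repo's actual code: the `output_key` helper, the bucket-naming convention, and the single-frame FFmpeg flags are assumptions.

```python
import os
import subprocess


def output_key(input_key):
    """Map an uploaded video key to its frame key, e.g. test_00.mp4 -> test_00.jpg."""
    return os.path.splitext(input_key)[0] + ".jpg"


def handler(event, context):
    """Triggered by an S3 ObjectCreated event on the input bucket;
    writes one extracted frame to the stage-1 bucket."""
    import boto3  # imported lazily so the pure helper above has no AWS dependency

    record = event["Records"][0]["s3"]
    in_bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    s3 = boto3.client("s3")
    local_video = f"/tmp/{os.path.basename(key)}"
    local_frame = f"/tmp/{output_key(os.path.basename(key))}"
    s3.download_file(in_bucket, key, local_video)

    # -frames:v 1 stops FFmpeg after writing the first frame.
    subprocess.run(
        ["ffmpeg", "-y", "-i", local_video, "-frames:v", "1", local_frame],
        check=True,
    )

    out_bucket = in_bucket.replace("-input", "-stage-1")  # assumed naming convention
    s3.upload_file(local_frame, out_bucket, output_key(key))
```

The key-mapping helper is kept pure so the renaming logic can be unit-tested without AWS or FFmpeg installed.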
For the IaaS version (Project 2):

```
User -> Web Tier (Spring Boot) -> SQS -> [Auto-Scaling EC2 App Tier] -> S3
```
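At its core, the app-tier autoscaler maps SQS queue depth to a desired EC2 instance count. A hedged sketch of that policy follows; the one-message-per-instance ratio and the cap of 20 instances are illustrative assumptions, not values taken from the repo.

```python
def desired_instances(queue_depth, max_instances=20):
    """One app-tier instance per pending SQS message, up to a hard cap."""
    return min(queue_depth, max_instances)


def scaling_action(queue_depth, running, max_instances=20):
    """Positive -> launch that many instances, negative -> terminate,
    zero -> leave the fleet as-is."""
    return desired_instances(queue_depth, max_instances) - running
```

A controller loop would poll `ApproximateNumberOfMessages` on the queue, call `scaling_action`, and issue `RunInstances`/`TerminateInstances` accordingly.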
Each part uses AWS components:
- EC2 for the App Tier.
- S3 for input/output storage.
- SQS for message queue handling.
- Lambda for serverless tasks (video splitting & face recognition).
- IAM for access control (best done with roles rather than keys in code).
## Prerequisites

- AWS Account with S3, EC2, Lambda, SQS, and IAM permissions.
- Java 17 & Maven, for running Spring Boot or other Java-based code.
- Python 3.8+, for the face-recognition scripts, FFmpeg steps, etc.
- Docker, for building container images for Lambda.
## Setup & Installation

1. Clone this repository:

   ```bash
   git clone https://github.com/dvarshith/aws-face-recognition.git
   cd aws-face-recognition
   ```

2. Configure AWS credentials:
   - Recommended: use IAM roles for EC2 and Lambda so you don't store keys in code.
   - Or set environment variables locally (for testing):

     ```bash
     export AWS_ACCESS_KEY_ID=<YourAccessKey>
     export AWS_SECRET_ACCESS_KEY=<YourSecretKey>
     export AWS_DEFAULT_REGION=us-east-1
     ```

3. Build the Java projects:

   ```bash
   cd Project2
   mvn clean package
   ```

4. Deploy the Lambdas: for video-splitting and face-recognition (Project 3), either:
   - Upload the JARs to AWS Lambda via the console, or
   - Build Docker images with their respective Dockerfiles, push them to ECR, and create the Lambdas from those images.
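For the container-image route, a Lambda Dockerfile might look like the sketch below. This assumes the function is packaged as Python with a `handler.py` exposing a `handler` function; the file names and the bundled FFmpeg binary path are placeholders, so adjust them to match the actual code.

```dockerfile
FROM public.ecr.aws/lambda/python:3.8

# Static ffmpeg build for the video-splitting step (path is a placeholder).
COPY ffmpeg /usr/local/bin/ffmpeg

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY handler.py ${LAMBDA_TASK_ROOT}
CMD ["handler.handler"]
```

After `docker build`, tag the image, push it to an ECR repository, and point the Lambda's image URI at it.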
## Usage

Upload a Video (PaaS example):

1. Upload `test_00.mp4` to `<ASU_ID>-input`.
2. The video-splitting Lambda extracts one frame and saves it as `test_00.jpg` in `<ASU_ID>-stage-1`.
3. The face-recognition Lambda runs face detection and saves the recognized name in `test_00.txt` in `<ASU_ID>-output`.
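The face-recognition step follows the same key-mapping pattern as the splitter. A minimal sketch, assuming the `face_recognition` library (a pre-trained dlib ResNet, consistent with the CNN mentioned above) and a pickled encodings file; the file format and names are illustrative, not taken from the repo.

```python
import os


def result_key(frame_key):
    """Map a frame key to its result key, e.g. test_00.jpg -> test_00.txt."""
    return os.path.splitext(frame_key)[0] + ".txt"


def recognize(frame_path, encodings_path="encodings.pkl"):
    """Return the best-matching known name for the first face in the image."""
    import pickle
    import face_recognition  # heavy native deps loaded lazily

    with open(encodings_path, "rb") as f:
        known = pickle.load(f)  # assumed format: {"names": [...], "encodings": [...]}

    image = face_recognition.load_image_file(frame_path)
    encodings = face_recognition.face_encodings(image)
    if not encodings:
        return "unknown"

    # Smallest embedding distance to a known face wins.
    distances = face_recognition.face_distance(known["encodings"], encodings[0])
    return known["names"][int(distances.argmin())]
```

The Lambda handler would call `recognize` on the downloaded frame and write the returned name to `result_key(frame_key)` in the output bucket.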
Web Tier (IaaS example):

1. Run:

   ```bash
   java -jar target/Project2-0.0.1-SNAPSHOT.jar
   ```

2. Upload: `POST /` with form-data `inputFile=<image_or_video_file>`.
3. The API sends a message to SQS, triggers the app tier, and eventually returns the classification result.
## Security Considerations

- Remove hardcoded credentials: Do not store `AWS_ACCESS_KEY_ID` or `AWS_SECRET_ACCESS_KEY` in Java/Python files.
- Use IAM roles: Prefer instance profiles for EC2 and execution roles for Lambda.
- `.gitignore`: Make sure you aren't committing `.pem` files, `.log` files with private data, or large training datasets.
## Possible Improvements

- Add CI/CD (GitHub Actions or Jenkins) for automatic builds and deployments.
- More Granular IAM Policies to adhere to the principle of least privilege.
- Performance Tuning for your Lambda containers (memory, concurrency).
- CloudWatch Alarms & Metrics for deeper monitoring of queue length, CPU, memory usage, etc.
## Acknowledgments

- Dataset, test cases, etc. provided by the VISA Lab at Arizona State University: https://github.com/visa-lab/CSE546-Cloud-Computing/tree/main.
## License

This project is released under the MIT License. That means you're free to use, modify, and distribute the code, but you do so at your own risk.
## Contact

- Author: Varshith Dupati
- GitHub: [@dvarshith](https://github.com/dvarshith)
- Email: dvarshith942@gmail.com
- Issues: Please open an issue on this repo if you have questions or find bugs.