Modular computer vision implementations - A collection of production-grade vision systems spanning multiple domains.
Featured Projects • Installation • Quick Start • Contributing
- Project Organization
- Core Features
- Prerequisites
- Tech Stack
- Installation
- Quick Start
- Project Matrix
- Development Standards
- Contributing
- Documentation
- Benchmarks
- Versioning
- Authors
- Citation
- License
- Acknowledgments
graph TD
A[ML Vision Lab] --> B[projects]
A --> C[core]
A --> D[docs]
B --> E[food-classification]
B --> F[object-detection]
B --> G[medical-imaging]
B --> H[satellite-analysis]
C --> I[utils]
C --> J[models]
C --> K[pipelines]
D --> L[api]
D --> M[guides]
D --> N[architecture]
ml-vision-lab/
├── projects/                    # Individual vision projects
│   ├── food-classification/     # Food analysis system
│   ├── object-detection/        # Real-time detection
│   ├── medical-imaging/         # DICOM processing
│   └── satellite-analysis/      # Geospatial vision
├── core/                        # Shared vision components
│   ├── utils/                   # Common utilities
│   ├── models/                  # Base model architectures
│   └── pipelines/               # Processing workflows
└── docs/                        # Project documentation
mindmap
root((ML Vision Lab))
Cross-Project
Modular architecture
Shared pipelines
Hardware optimization
Standardized metrics
Project Types
Classification
Detection
Medical
Satellite
Optimization
GPU acceleration
TensorRT
Memory efficiency
Development
MLflow tracking
DVC versioning
CI/CD pipelines
Cross-Project Capabilities
- Modular project architecture
- Shared preprocessing pipelines
- Hardware-optimized inference
- Standardized evaluation metrics
- GPU-accelerated processing
- Production deployment examples
- Memory-efficient inference
- TensorRT integration
Project Types
- Image Classification
- Object Detection & Tracking
- Medical Imaging Analysis
- Satellite Imagery Processing
- Industrial Quality Inspection
- Python 3.11+
- CUDA 12.2+
- OpenCV 5.0+
- PyTorch 2.3+
- TensorFlow 2.15+
- NVIDIA GPU (Compute Capability 6.0+)
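The Python-side prerequisites above can be checked before installation with a short standard-library script. This is a minimal sketch and not part of the repository: the helper names `parse_version`, `meets_minimum`, and `report` are hypothetical, and the distribution names (`torch`, `tensorflow`, `opencv-python`) are the usual PyPI names, assumed here rather than read from requirements.txt. It does not check CUDA or GPU compute capability.

```python
import sys
from importlib import metadata

# Minimum versions copied from the prerequisites list above.
# The PyPI distribution names are assumptions; adjust them to match requirements.txt.
MINIMUMS = {"torch": "2.3", "tensorflow": "2.15", "opencv-python": "5.0"}


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '2.3.1+cu121' into (2, 3, 1), ignoring local and pre-release suffixes."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare versions numerically, so '2.10' counts as newer than '2.3'."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def report() -> None:
    """Print one status line for Python and for each package in MINIMUMS."""
    ok = sys.version_info >= (3, 11)
    print(f"python {sys.version.split()[0]}: {'ok' if ok else 'needs 3.11+'}")
    for name, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            print(f"{name}: not installed (needs {minimum}+)")
            continue
        status = "ok" if meets_minimum(installed, minimum) else f"needs {minimum}+"
        print(f"{name} {installed}: {status}")


if __name__ == "__main__":
    report()
```

Comparing parsed integer tuples rather than raw strings avoids the common mistake of ranking "2.10" below "2.3".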
graph TD
A[Tech Stack] --> B[Core Libraries]
A --> C[Project Libraries]
B --> D[PyTorch]
B --> E[TensorFlow]
B --> F[OpenCV]
B --> G[CUDA]
C --> H[MONAI]
C --> I[RasterIO]
C --> J[DeepSORT]
C --> K[MLflow]
Core Libraries
- PyTorch - Deep learning framework
- TensorFlow - Machine learning platform
- OpenCV - Computer vision operations
- CUDA - GPU acceleration
- TensorRT - Inference optimization
- NumPy - Numerical computing
- Pandas - Data manipulation
- Scikit-learn - Machine learning utilities
- Matplotlib - Visualization
- Plotly - Visualization
- Pillow - Image processing
Project-Specific Libraries
- MONAI - Medical imaging
- RasterIO - Geospatial analysis
- DeepSORT - Object tracking
- Albumentations - Image augmentation
- MLflow - Experiment tracking
- DVC - Data version control
# Clone repository
git clone https://github.com/BjornMelin/ml-vision-lab.git
cd ml-vision-lab
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # Linux/MacOS
# or
.venv\Scripts\activate # Windows
# Install core requirements
pip install -r requirements.txt
# Install project-specific requirements (optional)
pip install -r projects/food-classification/requirements.txt
Food Classification
from projects.food_classification import predict
result = predict("pizza.jpg")
print(f"Identified: {result.label} ({result.confidence:.1%})")
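The snippet above implies that `predict` returns an object exposing `label` and `confidence` attributes. The actual return type is not shown in this README, so the following is only a stand-in to illustrate that shape and the percentage formatting: the `Prediction` class, the `top_prediction` helper, and the scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Hypothetical stand-in for whatever `predict` returns."""

    label: str
    confidence: float  # probability in [0, 1]


def top_prediction(scores: dict[str, float]) -> Prediction:
    """Pick the highest-scoring class from a label -> probability mapping."""
    label = max(scores, key=scores.get)
    return Prediction(label=label, confidence=scores[label])


# Made-up scores, used only to exercise the formatting.
result = top_prediction({"pizza": 0.934, "lasagna": 0.041, "salad": 0.025})
print(f"Identified: {result.label} ({result.confidence:.1%})")
```

The `.1%` format specifier multiplies the value by 100 and appends a percent sign, so confidence is expected as a fraction rather than a percentage.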
Object Detection
from projects.object_detection import VideoAnalyzer
analyzer = VideoAnalyzer(model="yolov9")
analyzer.process_stream("input.mp4", output="results.mp4")
Project | Task | Models | Input Types |
---|---|---|---|
Food Classification | Image Classification | EfficientNetV2, ViT | JPEG/PNG |
Object Detection | Real-time Tracking | YOLOv9, DeepSORT | Video Streams |
Medical Imaging | DICOM Analysis | UNet3+, MONAI | CT/MRI Scans |
Satellite Analysis | Geospatial ML | ResNet50-ADE20K | GeoTIFF |
flowchart TD
A[Development] --> B[Code Quality]
A --> C[Testing]
A --> D[Documentation]
B --> E[Black]
B --> F[MyPy]
C --> G[PyTest]
C --> H[Coverage]
D --> I[Docstrings]
D --> J[Examples]
Code Quality
# Format all projects
black projects/
# Type checking
mypy projects/
# Run tests
pytest projects/ --cov
Project Structure Template
projects/new-project/
├── app/               # Application interface
├── engine/            # Core logic
├── models/            # Trained weights
├── tests/             # Unit tests
├── README.md          # Project docs
└── requirements.txt   # Local dependencies
Adding New Projects
- Create project folder in projects/
- Follow structure template
- Add cross-links to:
  - Core utilities (avoid duplication)
  - Related projects
- Submit PR with:
  - Black-formatted code
  - Google-style docstrings
  - Unit tests (≥80% coverage)
See CONTRIBUTING.md for full guidelines.
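The first two steps for adding a project (create the folder, follow the structure template) can be scripted. The sketch below is an assumption about how one might do that and is not a tool shipped with the repository; `scaffold_project` is a hypothetical name, and the directory and file names are taken from the template above.

```python
from pathlib import Path

# Directory and file names follow the project structure template.
TEMPLATE_DIRS = ("app", "engine", "models", "tests")
TEMPLATE_FILES = ("README.md", "requirements.txt")


def scaffold_project(name: str, root: Path = Path("projects")) -> Path:
    """Create <root>/<name>/ with the template layout.

    Raises FileExistsError if the project folder already exists,
    so an existing project is never overwritten.
    """
    project = root / name
    project.mkdir(parents=True, exist_ok=False)
    for directory in TEMPLATE_DIRS:
        (project / directory).mkdir()
    for filename in TEMPLATE_FILES:
        (project / filename).touch()  # empty placeholder files
    return project
```

Calling `scaffold_project("new-project")` from the repository root would create `projects/new-project/` with empty placeholders; cross-links, tests, and docstrings still have to be written by hand.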
graph LR
A[Input] --> B[Preprocessing]
B --> C[Inference]
C --> D[Postprocessing]
B --> E[GPU Pipeline]
C --> F[TensorRT]
D --> G[Batch Processing]
- GPU-accelerated preprocessing
- Batch processing optimization
- Memory-efficient inference
- TensorRT integration
- Multi-GPU support
- Mixed precision training
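The Input → Preprocessing → Inference → Postprocessing flow with batch processing can be illustrated in plain Python. This is a minimal sketch under stated assumptions: `batched` and `run_pipeline` are hypothetical helpers, and the three toy stages stand in for real GPU preprocessing, model inference, and result decoding, none of which are reproduced here.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator


def batched(items: Iterable, batch_size: int) -> Iterator[list]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while batch := list(islice(iterator, batch_size)):
        yield batch


def run_pipeline(
    inputs: Iterable,
    preprocess: Callable,
    infer: Callable,
    postprocess: Callable,
    batch_size: int = 8,
) -> Iterator:
    """Preprocess per item, run inference per batch, postprocess per output."""
    for batch in batched(inputs, batch_size):
        prepared = [preprocess(x) for x in batch]
        for output in infer(prepared):
            yield postprocess(output)


# Toy stages: scale a pixel value, threshold it, and name the result.
labels = list(
    run_pipeline(
        [0, 200, 255, 10, 130],
        preprocess=lambda value: value / 255,
        infer=lambda batch: [v > 0.5 for v in batch],
        postprocess=lambda flag: "bright" if flag else "dark",
        batch_size=2,
    )
)
print(labels)
```

Because `run_pipeline` is a generator, only one batch is held in memory at a time, which is one simple way to keep inference memory-efficient on long inputs such as video streams.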
Model | Task | Performance | Speed (FPS) |
---|---|---|---|
YOLOv8 | Detection | mAP: 52.3 | 120 |
Mask R-CNN | Segmentation | mAP: 47.8 | 45 |
DeepSORT | Tracking | MOTA: 76.5 | 80 |
Performance on standard datasets:
Task | Dataset | Model | GPU | FPS | Accuracy |
---|---|---|---|---|---|
Detection | COCO | YOLOv8 | A100 | 120 | mAP: 52.3 |
Segmentation | COCO | Mask R-CNN | V100 | 45 | mAP: 47.8 |
Tracking | MOT17 | DeepSORT | 3090 | 80 | MOTA: 76.5 |
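The FPS column can be read as an average per-frame time budget by taking its reciprocal. The helper below is a small illustrative sketch (`frame_latency_ms` is a hypothetical name); it assumes the FPS figures are throughput numbers, so with batched inference the end-to-end latency of a single frame may be higher than this average.

```python
def frame_latency_ms(fps: float) -> float:
    """Average per-frame time in milliseconds for a given throughput."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# FPS values copied from the benchmark table above.
for model, fps in (("YOLOv8", 120), ("Mask R-CNN", 45), ("DeepSORT", 80)):
    print(f"{model}: {frame_latency_ms(fps):.1f} ms/frame")
```

For comparison, a 30 FPS video stream allows about 33.3 ms per frame, so all three reported throughputs fit within that budget on the listed GPUs.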
We use SemVer for versioning. For available versions, see the tags on this repository.
Bjorn Melin
- GitHub: @BjornMelin
- LinkedIn: Bjorn Melin
@misc{melin2024mlvisionlab,
author = {Melin, Bjorn},
title = {ML Vision Lab: Production Computer Vision Implementations},
year = {2024},
publisher = {GitHub},
url = {https://github.com/BjornMelin/ml-vision-lab}
}
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenCV community
- YOLO authors and contributors
- Deep SORT implementation team
- Medical imaging community (MONAI)
- Satellite imagery processing teams
- TensorFlow and PyTorch teams
- NVIDIA for CUDA and TensorRT support
Made with 👁️ and ❤️ by Bjorn Melin