computer-vision

Here is a quick video overview of a computer vision task I have been working on. It includes object detection, image segmentation, and monocular depth estimation.

The idea started when a lecturer asked us to conceptualise and research an application that combines a language model with a computer vision model. After a little reading, I was shocked to learn that roughly 300 million people live with moderate to severe vision impairment and 36 million are completely blind. That led me to ask: what vision models are available to build situational understanding?

Video Output Link

Depth Estimation with "Intel/dpt-hybrid-midas"
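
For reference, here is a minimal sketch of how this checkpoint can be loaded through the Hugging Face `transformers` depth-estimation pipeline. It is not the exact code behind the video. The function names `estimate_depth` and `normalise` are hypothetical, and the sketch assumes `transformers`, `torch` and `Pillow` are installed. MiDaS-style models predict relative rather than metric depth, so the values are rescaled before comparing frames.

```python
def estimate_depth(image_path):
    """Return a PIL depth map for one image (downloads the model on first use)."""
    # Imported lazily so the helper below works without the heavy dependencies.
    from PIL import Image
    from transformers import pipeline

    estimator = pipeline("depth-estimation", model="Intel/dpt-hybrid-midas")
    result = estimator(Image.open(image_path))
    # "depth" is a PIL image; "predicted_depth" holds the raw tensor.
    return result["depth"]


def normalise(values):
    """Min-max scale raw (relative) depth values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat depth map carries no ordering information.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

Calling `estimate_depth("frame.jpg")` (a placeholder file name) returns a greyscale image that can be saved or overlaid on the original frame.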

Object Detection with Ultralytics YOLOv8-Nano
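
As a rough illustration of this step, the sketch below runs the pretrained `yolov8n.pt` weights through the `ultralytics` package and counts detections per class. The helper names `detect_labels` and `summarise` and the 0.5 confidence threshold are assumptions for the example, not values taken from the project.

```python
from collections import Counter


def detect_labels(image_path, min_conf=0.5):
    """Return the class name of every detection above min_conf."""
    # Imported lazily so summarise() can be used without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO("yolov8n.pt")  # nano weights, downloaded on first use
    result = model.predict(image_path, conf=min_conf)[0]
    # result.boxes.cls holds class indices; result.names maps index -> name.
    return [result.names[int(c)] for c in result.boxes.cls.tolist()]


def summarise(labels):
    """Count how many times each class label was detected."""
    return dict(Counter(labels))
```

A per-class count such as `summarise(detect_labels("frame.jpg"))` (placeholder file name) is a convenient, compact input for a language model that has to describe the scene.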

Image Segmentation with "nvidia/segformer-b0-finetuned-ade-512-512"
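
A minimal sketch of running this checkpoint through the `transformers` image-segmentation pipeline follows. The helper names `segment` and `mask_coverage` are hypothetical, and the coverage measure is just one possible way to summarise the masks; the project itself may process them differently.

```python
def segment(image_path):
    """Return a list of {"label", "mask", "score"} dicts, one per segment."""
    # Imported lazily so mask_coverage() works without the heavy dependencies.
    from PIL import Image
    from transformers import pipeline

    segmenter = pipeline(
        "image-segmentation",
        model="nvidia/segformer-b0-finetuned-ade-512-512",
    )
    return segmenter(Image.open(image_path))


def mask_coverage(pixels):
    """Fraction of non-zero pixels in a flat sequence of mask values."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(1 for p in pixels if p) / len(pixels)
```

Each returned mask is a PIL image, so `mask_coverage(seg["mask"].convert("L").tobytes())` gives the share of the frame covered by that segment, for example how much of the view is "floor" or "wall".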
