Testing Folders #79

Status: Open. Wants to merge 140 commits into base: main.
Commits (140)
a919c44
Update README.md
tul53850 Jan 31, 2024
cd6566a
Camera
tun71427 Feb 27, 2024
5f2c6e2
Add files via upload
kbarbarisi Mar 12, 2024
ec9d06e
initialization
tun71427 Mar 12, 2024
c0e6af9
Merge branch 'zhu' of https://github.com/Capstone-Projects-2024-Sprin…
tun71427 Mar 12, 2024
9a19c4e
Use MediaPipe's Hands module to process video frames and draw hand ke…
tun71427 Mar 12, 2024
b11bd9a
Simple initial prediction gestures
tun71427 Mar 14, 2024
505cfa3
Merge pull request #1 from Capstone-Projects-2024-Spring/zhu
LeeMamori Mar 14, 2024
bc324f6
update Kalman filter and bounding box and test feature move-left gesture
shibatachun Mar 18, 2024
fe9506c
update Kalman filter and bounding box and test feature move-left gesture
shibatachun Mar 18, 2024
bf312de
update Kalman filter and bounding box and test feature move-left gesture
shibatachun Mar 18, 2024
0939be6
deleted env
shibatachun Mar 18, 2024
7e5378d
update left test
shibatachun Mar 18, 2024
3f68ae2
Simple prediction gestures use initial kalman filter
tun71427 Mar 18, 2024
d49d275
Refactor kalman filter function to class
tun71427 Mar 18, 2024
364d3e0
Use Kalman filter to utilize the predicted positions for further proc…
tun71427 Mar 18, 2024
124eb4e
Simple gesture movement sensing
tun71427 Mar 19, 2024
e8a5262
Gesture movement sensing once per second
tun71427 Mar 19, 2024
17d565d
Fixed little bug and change time to 2 sencond
tun71427 Mar 19, 2024
f4e32da
Merge branch 'zhu' of https://github.com/Capstone-Projects-2024-Sprin…
tun71427 Mar 19, 2024
34793f2
Extension of demo
kbarbarisi Mar 19, 2024
b5efa1b
Added coordinates and bounding box
tun71427 Mar 19, 2024
7675d3d
Delete useless parameters
tun71427 Mar 19, 2024
e3dc601
Merge pull request #2 from Capstone-Projects-2024-Spring/zhu
tuk85473 Mar 19, 2024
9c50d52
UI code
tul53850 Mar 21, 2024
c3869c2
cam
tul53850 Mar 21, 2024
68bbe09
fix camera bug
tul53850 Mar 21, 2024
4a2ee86
Merge branch 'main' of https://github.com/Capstone-Projects-2024-Spri…
shibatachun Mar 25, 2024
6d79c7e
Using tkinter to setup the menu UI, and add the event for each button…
shibatachun Mar 25, 2024
ef5c6d3
Merge pull request #3 from Capstone-Projects-2024-Spring/yang
tul53850 Mar 25, 2024
38878cb
Update README.md
tul53850 Mar 25, 2024
3fd4c87
volume up and down
tul53850 Mar 26, 2024
3032e7c
Add a volume function to turn the volume up and down, as well as disp…
tul53850 Mar 26, 2024
c61d967
init
tun71427 Mar 26, 2024
5ab0da8
init
tun71427 Mar 26, 2024
fead16b
init
tun71427 Mar 26, 2024
18a077e
init
tun71427 Mar 26, 2024
7ea0469
init
tun71427 Mar 26, 2024
474da76
init
tun71427 Mar 26, 2024
cc4817f
init
tun71427 Mar 26, 2024
8a90075
init
tun71427 Mar 26, 2024
9288b59
init
tun71427 Mar 26, 2024
babb2cc
init
tun71427 Mar 26, 2024
4355f2c
init
tun71427 Mar 26, 2024
6776bf5
init
tun71427 Mar 26, 2024
1d892aa
dog audio
tun71427 Mar 26, 2024
8aa202d
dog image
tun71427 Mar 26, 2024
2467f6f
helpers function
tun71427 Mar 26, 2024
f5eb724
helpers function
tun71427 Mar 26, 2024
1d5e7e1
imagebind
tun71427 Mar 26, 2024
dbb8333
imagebind
tun71427 Mar 26, 2024
69bfb37
init
tun71427 Mar 26, 2024
7f78fd9
init
tun71427 Mar 26, 2024
afc754d
init
tun71427 Mar 26, 2024
fc25e8e
init
tun71427 Mar 26, 2024
fdd3e69
updated requires
tun71427 Mar 26, 2024
4018146
setup
tun71427 Mar 26, 2024
b693c92
init
tun71427 Mar 26, 2024
2e8740c
init
tun71427 Mar 26, 2024
9ac9f02
transformer function
tun71427 Mar 26, 2024
580b477
init
tun71427 Mar 27, 2024
274a54b
init
tun71427 Mar 27, 2024
60e14ea
init
tun71427 Mar 27, 2024
345f061
Merge pull request #4 from Capstone-Projects-2024-Spring/zhu
LeeMamori Mar 28, 2024
6ba66e1
Stop tracking imagebind_huge.pth
tun71427 Mar 28, 2024
b871727
Added MovementDetector class
tun71427 Mar 28, 2024
20422f1
Added MovementDetector logic to start_capture
tun71427 Mar 28, 2024
7e9c393
Add the launch music app function and automatically search if you ins…
shibatachun Mar 28, 2024
6329c7d
Make kalman_filter to dynamic storage, and max numbers of hands = 2
tun71427 Mar 28, 2024
58a2506
Dynamic storage movement detectors
tun71427 Mar 28, 2024
6b2e96a
Updated Kalman filter bounding box
tun71427 Mar 28, 2024
58c1ea9
inti
tun71427 Mar 28, 2024
a749bd1
Refactor name
tun71427 Mar 28, 2024
a7fedd1
Refactor name
tun71427 Mar 28, 2024
fb7cdf3
Use Imagebind
tun71427 Mar 28, 2024
e99d1e4
Updated format
tun71427 Mar 28, 2024
265bd2e
fixing
shibatachun Mar 28, 2024
a6ce7e2
fixed using official example model for recognition. But without draw …
shibatachun Mar 28, 2024
0b1cdc0
Change time window to 2 seconds
tun71427 Mar 28, 2024
fb16d8e
Added monitoring every two seconds
tun71427 Mar 28, 2024
10953a6
Added recording video
tun71427 Mar 28, 2024
2ee2f51
Added record photos
tun71427 Mar 28, 2024
e40a4d8
Fixed bug
tun71427 Mar 28, 2024
adbdf9c
Fixed a bug!
tun71427 Mar 28, 2024
0594dbf
Merge pull request #5 from Capstone-Projects-2024-Spring/zhu
LeeMamori Mar 28, 2024
79ca3d3
Fixed more bug!!
tun71427 Mar 28, 2024
b1a50b0
Change quit bottom from q to esc
tun71427 Mar 28, 2024
d2ffd47
Fixed moreeeeeeeeeeeeee buuuuuuuuuuuuuuug
tun71427 Mar 31, 2024
16bbaa8
Deleted useless variable
tun71427 Mar 31, 2024
e9e0b2c
Fixed moreeeeeeee bug!
tun71427 Mar 31, 2024
e40f3eb
Fixed more buggggggggg!
tun71427 Mar 31, 2024
5c27c1e
Fixed more buggggggggg!
tun71427 Mar 31, 2024
81314dd
Merge branch 'main' of https://github.com/Capstone-Projects-2024-Spri…
shibatachun Apr 1, 2024
ce3d055
resolve conflict
shibatachun Apr 1, 2024
a103deb
Merge pull request #7 from Capstone-Projects-2024-Spring/yang
tul53850 Apr 1, 2024
0929633
a couple of improvements to the UI
tul53850 Apr 1, 2024
cfe532b
A couple of UI improvements
tul53850 Apr 1, 2024
0400476
recovery the code by Kiana
shibatachun Apr 2, 2024
7cf6766
Merge pull request #8 from Capstone-Projects-2024-Spring/yang
LeeMamori Apr 2, 2024
31e5bd8
root.attributes error fixed
tuj47463 Apr 2, 2024
947c709
added code for simulating mouse to implement add to playlist feature
shibatachun Apr 2, 2024
f8626df
added code for simulating mouse to implement add to playlist feature
shibatachun Apr 2, 2024
25c8e55
Fix the resolution error, now the cursor will cover all the screen
shibatachun Apr 2, 2024
85df7aa
For volume control gesture, set the hotkey control spotify volume ins…
shibatachun Apr 2, 2024
27f6282
Display volume text on screen instead of terminal
tul53850 Apr 2, 2024
9d9de48
Merge branch 'main' into yang
tul53850 Apr 2, 2024
4953f55
Merge pull request #9 from Capstone-Projects-2024-Spring/yang
tul53850 Apr 2, 2024
1807385
Added initial imagebind model and analyze with imagebind
tun71427 Apr 2, 2024
4f94000
Ignore cleanup.py
tun71427 Apr 2, 2024
5348f4a
updated
tun71427 Apr 2, 2024
55c7b3a
cleanup useless picture and video
tun71427 Apr 2, 2024
0939237
Merge pull request #10 from Capstone-Projects-2024-Spring/zhu
tun71427 Apr 2, 2024
ddf0f3c
A test file for imagebind
tun71427 Apr 2, 2024
87f84b6
Updated more formatted filenames
tun71427 Apr 2, 2024
02187e7
ignore pycache
tun71427 Apr 2, 2024
a1e0f64
Merge pull request #11 from Capstone-Projects-2024-Spring/zhu
tun71427 Apr 2, 2024
bdfa655
Add hand detector and process detector
shibatachun Apr 3, 2024
cb2fe80
Gesture detection, modified from Camera.py, init setting, for self le…
shibatachun Apr 3, 2024
5f6b92c
Left Hold features
shibatachun Apr 3, 2024
59197d6
Mouse move related features
shibatachun Apr 3, 2024
be44ed4
Appkit for macOS still WIP
shibatachun Apr 3, 2024
4c56c04
Appkit for macOS still WIP
shibatachun Apr 3, 2024
2b5b0c9
Add halt (open palm) and play feature
shibatachun Apr 3, 2024
8dc5ab6
Merge remote-tracking branch 'origin/yang' into yang
shibatachun Apr 3, 2024
6f3c4e6
Add halt (open palm) and play feature
shibatachun Apr 3, 2024
66177d9
Merge pull request #12 from Capstone-Projects-2024-Spring/yang
LeeMamori Apr 4, 2024
791f493
Updated imagebind
tun71427 Apr 4, 2024
083f93d
Updated
tun71427 Apr 7, 2024
2dc5a88
Added print for highest probability for imagebind
tun71427 Apr 7, 2024
91d97ba
Updated
tun71427 Apr 7, 2024
8f7bcd6
Merged
tun71427 Apr 7, 2024
86d60c8
Added cleanup when exit
tun71427 Apr 7, 2024
93a62af
Recognize hand gestures for Stop and Display Camera
tuk85473 Apr 7, 2024
9f2b2c1
Merge branch 'main' into zhu
tun71427 Apr 8, 2024
8e2afda
Merge pull request #14 from Capstone-Projects-2024-Spring/zhu
tun71427 Apr 8, 2024
a65bf70
exitapp progress #1
tuj47463 Apr 9, 2024
8063afa
Merge pull request #13 from Capstone-Projects-2024-Spring/ashley
kbarbarisi Apr 9, 2024
83a810c
change output gesture test with pointer and pinky up
tul53850 Apr 9, 2024
04ef8ab
Created Folder for Testing and subfolders for types of testing
NathanMcCourt Apr 17, 2024
23b4de1
Slight update to loop recognition
NathanMcCourt Apr 23, 2024
Binary file added .assets/bird_audio.wav
Binary file added .assets/bird_image.jpg
Binary file added .assets/car_audio.wav
Binary file added .assets/car_image.jpg
Binary file added .assets/dog_audio.wav
Binary file added .assets/dog_image.jpg
745 changes: 745 additions & 0 deletions .github/Untitled21-2-2.ipynb

Large diffs are not rendered by default.

1,156 changes: 1,156 additions & 0 deletions .github/Untitled21-2.ipynb

Large diffs are not rendered by default.

6 changes: 6 additions & 0 deletions .gitignore
@@ -7,3 +7,9 @@
.env.development.local
.env.test.local
.env.production.local

# PTH files
.checkpoints/imagebind_huge.pth

# pycache
__pycache__/
8 changes: 8 additions & 0 deletions .idea/.gitignore
Some generated files are not rendered by default.
6 changes: 6 additions & 0 deletions .idea/inspectionProfiles/profiles_settings.xml
7 changes: 7 additions & 0 deletions .idea/misc.xml
8 changes: 8 additions & 0 deletions .idea/modules.xml
12 changes: 12 additions & 0 deletions .idea/project-waveease.iml
6 changes: 6 additions & 0 deletions .idea/vcs.xml
57 changes: 57 additions & 0 deletions HelloWorld.py
@@ -0,0 +1,57 @@
from imagebind import data
import torch
from imagebind.models import imagebind_model
from imagebind.models.imagebind_model import ModalityType

# A test file for imagebind

text_list=["A dog.", "A car", "A bird"]
image_paths=[".assets/dog_image.jpg", ".assets/car_image.jpg", ".assets/bird_image.jpg"]
audio_paths=[".assets/dog_audio.wav", ".assets/car_audio.wav", ".assets/bird_audio.wav"]

device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Instantiate model
model = imagebind_model.imagebind_huge(pretrained=True)
model.eval()
model.to(device)

# Load data
inputs = {
    ModalityType.TEXT: data.load_and_transform_text(text_list, device),
    ModalityType.VISION: data.load_and_transform_vision_data(image_paths, device),
    ModalityType.AUDIO: data.load_and_transform_audio_data(audio_paths, device),
}

with torch.no_grad():
    embeddings = model(inputs)

print(
    "Vision x Text: ",
    torch.softmax(embeddings[ModalityType.VISION] @ embeddings[ModalityType.TEXT].T, dim=-1),
)
print(
    "Audio x Text: ",
    torch.softmax(embeddings[ModalityType.AUDIO] @ embeddings[ModalityType.TEXT].T, dim=-1),
)
print(
    "Vision x Audio: ",
    torch.softmax(embeddings[ModalityType.VISION] @ embeddings[ModalityType.AUDIO].T, dim=-1),
)

# Expected output:
#
# Vision x Text:
# tensor([[9.9761e-01, 2.3694e-03, 1.8612e-05],
# [3.3836e-05, 9.9994e-01, 2.4118e-05],
# [4.7997e-05, 1.3496e-02, 9.8646e-01]])
#
# Audio x Text:
# tensor([[1., 0., 0.],
# [0., 1., 0.],
# [0., 0., 1.]])
#
# Vision x Audio:
# tensor([[0.8070, 0.1088, 0.0842],
# [0.1036, 0.7884, 0.1079],
# [0.0018, 0.0022, 0.9960]])
60 changes: 44 additions & 16 deletions README.md
@@ -1,6 +1,6 @@
<div align="center">

-# Project Name
+# WavEase - Python powered Gesture Recognition
[![Report Issue on Jira](https://img.shields.io/badge/Report%20Issues-Jira-0052CC?style=flat&logo=jira-software)](https://temple-cis-projects-in-cs.atlassian.net/jira/software/c/projects/DT/issues)
[![Deploy Docs](https://github.com/ApplebaumIan/tu-cis-4398-docs-template/actions/workflows/deploy.yml/badge.svg)](https://github.com/ApplebaumIan/tu-cis-4398-docs-template/actions/workflows/deploy.yml)
[![Documentation Website Link](https://img.shields.io/badge/-Documentation%20Website-brightgreen)](https://applebaumian.github.io/tu-cis-4398-docs-template/)
@@ -11,52 +11,80 @@

## Keywords

-Section #, as well as any words that quickly give your peers insights into the application like programming language, development platform, type of application, etc.
+Section 004, as well as any words that quickly give your peers insights into the application like programming language, development platform, type of application, etc.

## Project Abstract

-This document proposes a novel application of a text message (SMS or Email) read-out and hands-free call interacted between an Android Smartphone and an infotainment platform (headunit) in a car environment. When a phone receives an SMS or Email, the text message is transferred from the phone to the headunit through a Bluetooth connection. On the headunit, user can control which and when the received SMS or E-mail to be read out through the in-vehicle audio system. The user may press one button on the headunit to activate the hands-free feature to call back the SMS sender.
+This project would create an application that lets users perform hand gestures in front of a sensor, with each gesture mapped to a specific command. For example, a person could have a camera set up for gesture recognition, and the network could be integrated with smart devices to turn a device on or off. Say you just sit down on the couch to watch a movie, but you can’t find the remote. With a gesture recognition system, you can simply make a certain gesture at a camera connected to your TV and the device could turn on. The same could apply to lighting in the house. This project would be done using Python.

## High Level Requirement

-Describe the requirements – i.e., what the product does and how it does it from a user point of viewat a high level.
+The product works by capturing and interpreting physical movements of the user’s hands or body. Those movements are then translated into preset commands or actions. The high-level requirements include sensor data acquisition, data processing, a gesture recognition algorithm, and command generation. From a user’s point of view, you would perform a physical gesture, mapped to a command, in front of the sensor. At a high level, a gesture could be mapped to turning lights on or off. For this project we could start by just printing text to the screen to confirm it is working.
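The gesture-to-command translation described above can be sketched as a simple dispatch table. This is a minimal illustration only: the gesture labels and handler functions here are hypothetical placeholders, not names from the project's code.

```python
# Hypothetical handlers: the real actions would call into a smart-device
# API or, as the text suggests for a first step, just print to the screen.
def lights_on() -> str:
    return "lights on"

def lights_off() -> str:
    return "lights off"

# Preset mapping from recognized gesture labels to commands
COMMANDS = {
    "swipe_right": lights_on,
    "swipe_left": lights_off,
}

def dispatch(gesture: str) -> str:
    """Translate a recognized gesture label into its mapped action."""
    handler = COMMANDS.get(gesture)
    return handler() if handler else "unrecognized gesture"

print(dispatch("swipe_right"))  # lights on
```

Keeping the recognizer and the command map separate like this means new gestures can be added without touching the recognition code.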

## Conceptual Design

-Describe the initial design concept: Hardware/software architecture, programming language, operating system, etc.
+The conceptual design for this project is a laptop with a built-in camera to implement the gesture control system. The programming language would be Python, using the OpenCV, TensorFlow, and NumPy libraries. An open-source dataset for gesture recognition would be found, and the images would be preprocessed by resizing, normalizing, and converting them into a format suitable for model training. A model would be built using a CNN architecture and TensorFlow. Once the model is trained, it could be deployed with OpenCV to capture video frames from a camera, process them, and feed them into the trained model. From this step, the project could go multiple different ways. For a more advanced project, we could link it to smart devices, but to start we could print the intended action to the screen. We will begin by applying this software to an application like Spotify and possibly consider other applications like YouTube and Apple Music.
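A minimal sketch of the resize-and-normalize preprocessing step described above, assuming a hypothetical 64x64 model input size and [0, 1] pixel normalization. In the real pipeline OpenCV (`cv2.VideoCapture` and `cv2.resize`) would supply and resize the frames; plain NumPy striding stands in here only to keep the sketch self-contained.

```python
import numpy as np

def preprocess_frame(frame: np.ndarray, size: int = 64) -> np.ndarray:
    """Downsample and normalize a BGR frame into a model-ready batch of one."""
    h, w, _ = frame.shape
    # Nearest-neighbor downsample by index selection (cv2.resize in practice)
    rows = np.linspace(0, h - 1, size).astype(int)
    cols = np.linspace(0, w - 1, size).astype(int)
    resized = frame[np.ix_(rows, cols)]
    normalized = resized.astype(np.float32) / 255.0  # scale pixels to [0, 1]
    return normalized[np.newaxis, ...]               # add a batch dimension

# A zero-filled array stands in for a frame read from the camera
frame = np.zeros((480, 640, 3), dtype=np.uint8)
batch = preprocess_frame(frame)
print(batch.shape)  # (1, 64, 64, 3)
```

The resulting `(1, 64, 64, 3)` batch matches the shape a Keras/TensorFlow CNN would typically expect for a single image.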

## Background

-The background will contain a more detailed description of the product and a comparison to existing similar projects/products. A literature search should be conducted and the results listed. Proper citation of sources is required. If there are similar open-source products, you should state whether existing source will be used and to what extent. If there are similar closed-source/proprietary products, you should state how the proposed product will be similar and different.
+The idea for this product is to associate specific gestures with predefined commands. For example, a swipe gesture to the right can be associated with turning on the lights, while a swipe gesture to the left can be associated with turning them off. While there is no existing product that does exactly this, researchers at the University of Washington are close to achieving it. Their approach uses Wi-Fi signals, rather than cameras, to detect specific movements (Ma, 2013). This differs from the approach suggested earlier, which uses a laptop camera. A similar product is the Xbox Kinect, which uses cameras to recognize gestures and lets you interact with games on the Xbox (Palangetić, 2014). This is like our proposal because it uses a camera to capture images and lets a user interact with the device. However, it differs in that it does not connect to smart devices or let you control their features.

## Required Resources

-Discuss what you need to develop this project. This includes background information you will need to acquire, hardware resources, and software resources. If these are not part of the standard Computer Science Department lab resources, these must be identified early and discussed with the instructor.
+The required hardware for this project is a laptop with a working camera. Python libraries such as TensorFlow, NumPy, and OpenCV are needed to train the model and to capture images for inference once it is trained. It would be beneficial if the people working on this project had experience with computer vision, convolutional neural network architectures, and the API calls needed to connect to smart devices. While wireless networks would most likely be the preferred approach, experience with IoT devices and connections could also be a route for integrating smart devices.

## Collaborators

[//]: # ( readme: collaborators -start )
<table>
<tr>
<td align="center">
-<a href="https://github.com/ApplebaumIan">
-<img src="https://avatars.githubusercontent.com/u/9451941?v=4" width="100;" alt="ApplebaumIan"/>
+<a href="https://github.com/kbarbarisi">
+<img src="https://avatars.githubusercontent.com/u/73039627?v=4" width="100;" alt="Kianna"/>
<br />
-<sub><b>Ian Tyler Applebaum</b></sub>
+<sub><b>Kianna Barbarisi</b></sub>
</a>
</td>
<td align="center">
-<a href="https://github.com/leekd99">
-<img src="https://avatars.githubusercontent.com/u/32583417?v=4" width="100;" alt="leekd99"/>
+<a href="https://github.com/tul53850">
+<img src="https://avatars.githubusercontent.com/u/111989518?v=4" width="100;" alt="Jason"/>
<br />
-<sub><b>Kyle Dragon Lee</b></sub>
+<sub><b>Jason Hankins</b></sub>
</a>
</td>
<td align="center">
-<a href="https://github.com/thanhnguyen46">
-<img src="https://avatars.githubusercontent.com/u/60533187?v=4" width="100;" alt="thanhnguyen46"/>
+<a href="https://github.com/SarinaCurtis">
+<img src="https://avatars.githubusercontent.com/u/81874704?v=4" width="100;" alt="Sarina"/>
<br />
-<sub><b>Thanh Nguyen</b></sub>
+<sub><b>Sarina Curtis</b></sub>
</a>
</td>
<td align="center">
<a href="https://github.com/tun71427">
<img src="https://avatars.githubusercontent.com/u/123014326?v=4" width="100;" alt="Yuxuan"/>
<br />
<sub><b>Yuxuan Zhu</b></sub>
</a>
</td>
<td align="center">
<a href="https://github.com/LeeMamori">
<img src="https://avatars.githubusercontent.com/u/123014841?v=4" width="100;" alt="Yang"/>
<br />
<sub><b>Yang Li</b></sub>
</a>
</td>
<td align="center">
<a href="https://github.com/tuk85473">
<img src="https://avatars.githubusercontent.com/u/97626755?v=4" width="100;" alt="Ashley"/>
<br />
<sub><b>Ashley Jones</b></sub>
</a>
</td>
</tr>
Binary file added __pycache__/Camera.cpython-311.pyc
Binary file added __pycache__/Camera.cpython-38.pyc
Binary file added __pycache__/utile.cpython-311.pyc
Binary file added __pycache__/utile.cpython-38.pyc