20250219.md

Information

  • Time: 13:30 -
  • Attendees: Bob Zhang (Supervisor), Huang Yanzhen, Mai Jiajun

Discussion Summary

1. Better methodology of data gathering

  • Phone cases in common colours were chosen for the training data, which was recorded using a green screen.
  • Integrated a runtime parameter dictionary into the main loop. The dictionary includes a trigger that saves the current hand frames at run time; the trigger is fired by clicking the first side button on the mouse. This makes it possible to collect additional training data on the go.
  • Null/background examples are recorded as well.
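
The save trigger described above could be wired up roughly as follows. This is a minimal sketch, not the project's code: the key name `save_hand_frames`, the helpers `on_side_button` and `maybe_save`, and the in-memory `saved` list are all hypothetical, and the mouse listener that would call `on_side_button` is left out.

```python
# Hypothetical sketch of a runtime parameter dictionary with a one-shot save trigger.
runtime_params = {"save_hand_frames": False}
saved = []  # stands in for writing image files to disk


def on_side_button(params):
    """Called by a mouse listener (not shown) when the first side button is clicked."""
    params["save_hand_frames"] = True


def maybe_save(params, hand_frames):
    """Save the current hand frames once if the trigger is set, then reset it.

    Returns the number of frames that were saved.
    """
    if not params["save_hand_frames"]:
        return 0
    # Skip frames that are None, e.g. because detection failed for that hand.
    kept = [frame for frame in hand_frames if frame is not None]
    saved.extend(kept)
    params["save_hand_frames"] = False  # one click saves one set of frames
    return len(kept)
```

Resetting the flag inside the loop means a single click saves the frames of exactly one iteration, rather than every frame until the button is clicked again.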

2. Optimizations, Enhancements, and Bug Fixes

  • Added requirements.txt.
  • Optimized plotting and distinguished between the "application mean" and the "performance mean" for YOLO.
    • The application mean is the overall mean process time per frame, including frames where YOLO is not invoked. This measures the ACTUAL performance.
    • The performance mean is the mean process time per frame over only those frames where YOLO is invoked. This measures the THEORETICAL performance.
    • In a single-person scenario, the application mean is 0.007 and the performance mean is 0.018. The gap is large enough that the two need to be reported separately.

image.png
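
The two means defined above can be computed from a list of per-frame process times and a matching list of flags that say whether YOLO ran on each frame. The sketch below assumes that representation; the function name `process_time_means` and the sample numbers are illustrative only and are not taken from the project.

```python
def process_time_means(times, yolo_invoked):
    """Return (application_mean, performance_mean) for per-frame process times.

    times        -- process time of every frame
    yolo_invoked -- one boolean per frame, True if YOLO ran on that frame
    A mean is None when there are no frames to average over.
    """
    if len(times) != len(yolo_invoked):
        raise ValueError("times and yolo_invoked must have the same length")
    # Application mean: every frame counts.
    application_mean = sum(times) / len(times) if times else None
    # Performance mean: only the frames on which YOLO was actually invoked.
    yolo_times = [t for t, used in zip(times, yolo_invoked) if used]
    performance_mean = sum(yolo_times) / len(yolo_times) if yolo_times else None
    return application_mean, performance_mean


# Made-up sample: YOLO runs on every second frame.
app_mean, perf_mean = process_time_means(
    [0.002, 0.018, 0.002, 0.018], [False, True, False, True]
)
```

With this made-up sample the application mean comes out at roughly 0.010 and the performance mean at roughly 0.018, which shows how cheap non-YOLO frames pull the application mean down.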

  • Added a nice banner.

  • (Enhancement) Introduced the concept of a "primary hand": of a person's two hands, the one that is closer to their face. When the posture recognition model reports SUSPICIOUS, the primary hand is always inspected first.

  • (Enhancement) Optimized hand frame size calculation. When a pedestrian is too close to the camera, the hand frame will now remain a constant size.

  • (Bug Fix) Added abs to several distance calculations that involve subtraction, including the face frame crop. Previously the face frame crop length could be negative, causing unexpected behavior.

  • (Bug Fix) Added more error handling, mainly for None-type errors caused by corrupted RTMPose detections. In detectPhone and render_detect_rectangle, if the frame and frame_xyxy of the pedestrian's hand frames are None, the exception is caught instead of being propagated, and processing skips to the next frame. The goal is for the project to ignore unimportant errors rather than crash when it encounters them.

  • (Enhancement) Added a face announcing feature when pushing to remote.

    • The face announcing interval is defined as a value in the pkg_phone_detect package, and the running time_last_announce_face is maintained in the runtime_params dictionary above the main loop, so that faces are not announced too frequently.
    • A new function, announceFace, pushes the list of frames as JSON data to the websocket server. The frontend then decodes the list and pushes the new faces onto the stack in the displayed section.
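
The interval check for face announcements could look like the sketch below. It is not the project's announceFace: the name `announce_face`, the constant `FACE_ANNOUNCE_INTERVAL` and its value, the `{"faces": [...]}` payload shape, and the `send` callable (a stand-in for the websocket send) are all assumptions made for illustration. Only the `time_last_announce_face` key comes from the notes above.

```python
import json
import time

FACE_ANNOUNCE_INTERVAL = 5.0  # seconds; hypothetical value


def announce_face(faces, params, send, now=None, interval=FACE_ANNOUNCE_INTERVAL):
    """Send the faces as JSON unless the previous announcement was too recent.

    faces  -- JSON-serializable list (for example, encoded face crops)
    params -- runtime parameter dict holding "time_last_announce_face"
    send   -- callable that takes the JSON string, e.g. a websocket send
    Returns True if the faces were sent, False if the call was throttled.
    """
    now = time.time() if now is None else now
    if now - params["time_last_announce_face"] < interval:
        return False  # too soon after the last announcement
    send(json.dumps({"faces": faces}))
    params["time_last_announce_face"] = now
    return True
```

Passing `send` and `now` as arguments keeps the throttling logic testable without a running websocket server or real delays.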

3. License

To respect intellectual property rights, license declarations have been written for the open-source packages used and for this project itself. Open-source package licenses are stored in the root LICENSES directory, while the license of this project is stored in the root LICENSE.txt file.

  • Added the license for RTMPose from Open-MMLab (Apache-2.0), stored in LICENSES/Apache_Open-MMLab.
  • Added the license for YOLO from ultralytics (AGPL-3.0) in LICENSES/AGPL_ultralytics, along with a declaration of its usage and its copyright info in the main.py and README.md files (according to AGPL requirements).
    • Since this project only uses ultralytics' official Python package without modifying its source code, no source code for this package is provided in this project. However, according to the requirements of AGPL-3.0, the URLs of the source GitHub repository and of the source AGPL-3.0 license file are given in main.py and README.md.
  • Added an Apache-2.0 license for this project. The information about ultralytics under the AGPL-3.0 license is included at the start of the Apache-2.0 license.

The current file structure:

LICENSES
|- AGPL_ultralytics
|   |- LICENSE.txt
|- Apache_Open-MMLab
|   |- LICENSE.txt
|- LICENSE.txt
...
|- main.py
|- processing.py
|- README.md
|- .gitignore
...

Remaining Problems:

  1. Try to record some videos in which the person looking at the phone is swinging his/her other hand. In other words, make sure that the bounding box is wider than in the earlier recordings.

Next Meeting's Agenda

  • More about