
  • Time: 13:20 - 13:56
  • Attendees: Bob Zhang (Supervisor), Huang Yanzhen, Mai Jiajun

Discussion Summary

1. Frontend

1.1 Achievements

  • Deployed: An Ubuntu server on DigitalOcean.
    • Runs the NextJS frontend + NodeJS server in a Docker container.
  • Pushes Base64-encoded video frames through an exposed WebSocket URL.

1.2 Issues

  • Very high latency under large throughput; it can reach 12 seconds.
  • WebSocket may not be a good fit for video streaming.
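
One possible contributor to the throughput problem is the encoding itself: Base64 turns every 3 bytes of input into 4 ASCII characters, so each frame grows by about a third before it is sent. The sketch below illustrates this with a dummy payload. The 300,000-byte frame size and the helper name `base64_overhead` are made up for illustration and are not measurements from our deployment.

```python
import base64


def base64_overhead(raw: bytes) -> float:
    """Return encoded size divided by raw size for a Base64-encoded payload."""
    encoded = base64.b64encode(raw)
    return len(encoded) / len(raw)


# Hypothetical frame: 300,000 zero bytes standing in for one video frame.
frame = bytes(300_000)
ratio = base64_overhead(frame)
print(f"Base64 size ratio: {ratio:.3f}")
```

Sending frames as binary WebSocket messages would avoid this expansion, although the size overhead alone is unlikely to account for a 12-second delay.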

2. Data Gathering

2.1 Achievements

  • Completed several recordings for both datasets.
  • We showed some of the recorded videos, along with some tips on editing videos so that they work better as input.
  • We produce a .npy data file for each video directly.
    • All frames of a video share the same label, either "using" or "not using".
    • Remark: A video = various angles of the same variety of a label.
  • We dropped the idea of using directories to split "using" and "not using". Instead, we parse the file name, which follows the naming paradigm proposed in meeting notes 20240925, and use its third label to determine the class.
    • Starts with U: Using
    • Starts with N: Not using.
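
The label lookup could be sketched as below. The actual naming paradigm is defined in meeting notes 20240925 and is not reproduced here, so the underscore separator, the example file names, and the function name `label_from_filename` are all assumptions made for illustration.

```python
from pathlib import Path


def label_from_filename(filename: str, sep: str = "_") -> str:
    """Map a recording's file name to "using" or "not using".

    Assumption: fields are separated by `sep`, and the third field
    carries the label, starting with U (using) or N (not using).
    """
    fields = Path(filename).stem.split(sep)
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 fields in {filename!r}")
    third = fields[2]
    if third.startswith("U"):
        return "using"
    if third.startswith("N"):
        return "not using"
    raise ValueError(f"unrecognized label field {third!r} in {filename!r}")


# Hypothetical file names, not taken from the real dataset.
print(label_from_filename("20241004_01_U03.npy"))
print(label_from_filename("20241004_02_N01.npy"))
```

Raising an error on an unrecognized field, rather than defaulting to one class, keeps a misnamed file from silently ending up with the wrong label.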


2.2 Train with Unbalanced Training Data

2.2.1 Training Information

| Item | Content | Description |
| --- | --- | --- |
| Initial Learning Rate | 0.001 | Optimized by Adam |
| Optimizer | Adam | |
| Loss | Cross Entropy Loss | |
| Epoch | 100 | |
| Train/Test Ratio | 0.8/0.2 | |
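
As a rough illustration of the 0.8/0.2 split, the sketch below shuffles sample indices and cuts them at 80%. The helper name `split_indices` and the fixed seed are assumptions, and the sample count assumes that the 2907 positive and 650 negative samples reported under Training Results cover the whole dataset. The actual training script may split differently, for example per video rather than per sample.

```python
import random


def split_indices(n_samples: int, train_ratio: float = 0.8, seed: int = 0):
    """Shuffle indices 0..n_samples-1 and split them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    cut = int(n_samples * train_ratio)
    return indices[:cut], indices[cut:]


# Assumed total: 2907 positive + 650 negative samples.
train_idx, test_idx = split_indices(2907 + 650)
print(len(train_idx), len(test_idx))
```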

Training Results

| Training | Posture Estimation Model | Equipment (GPU) | Data Num | Train Accuracy | Test Accuracy | Loss |
| --- | --- | --- | --- | --- | --- | --- |
| Limited Training | RTMPose Medium | RTX-2080Ti (Workstation) | 2907 +, 650 - | 0.9445 | 0.9404 | 0.3766 |

These results look similar to those in meeting notes 20240920, but the two experiments differ.

  • The datasets are different. 20240920 uses an unplanned testing set captured with Xue Yulin, while 20241004 uses a well-planned testing set captured with Mai Jiajun, which can be accessed via the link above.
  • The training devices are different, and the number of samples per label also differs.

Generally speaking, compared to the previous results, the current ones are:

  • More descriptive, as we planned the recordings to cover different types of each video.
  • More robust, as we captured various angles for each sub-type of every label in the videos.

Data gathering is still in progress, so these results are provisional.

  • The data gathering schedule is not yet complete; some varieties of both labels have not been captured.
  • The amount of data per label is not balanced yet.
  • These results are presented only to give us a first look at how well our idea works.
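
To put the imbalance in numbers: with 2907 positive and 650 negative samples, a model that always predicts the majority label would already reach about 0.82 accuracy, assuming the test split has the same label proportions as the full set. The small calculation below (the function name `majority_baseline` is hypothetical) makes that baseline explicit, which is useful context when reading the 0.9404 test accuracy.

```python
def majority_baseline(n_pos: int, n_neg: int) -> float:
    """Accuracy obtained by always predicting the more frequent label."""
    return max(n_pos, n_neg) / (n_pos + n_neg)


n_pos, n_neg = 2907, 650  # counts from the Data Num column
baseline = majority_baseline(n_pos, n_neg)
print(f"positive:negative ratio = {n_pos / n_neg:.2f}")
print(f"majority-class baseline accuracy = {baseline:.4f}")
```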

Epoch-Loss Graph (100 Epochs)


Training Console


Agenda for Next Meeting

  • Work in progress: Gather more, and more varied, data to balance the number of using/not_using samples.
  • Apply the newly trained model to test the "observable accuracy".
    • This project aims to provide a real-time experience, so "observation" is important.
    • If the observable accuracy on abundant test data is not good enough, consider adjusting the target_list, i.e. which features to observe.