Information

Time: $13:20 - 14:00$
Attendees: Bob Zhang (Supervisor), Huang Yanzhen, Mai Jiajun

Discussion Summary

1. Final Dataset Supplementation

Cover boundary data. Supplement the dataset on some actions that were not observed in data collection.

2. Adjust Features

Previously, we use $C_{13}^{3}$ to produce angle combinations. We mistakenly ignored that the order of the angle points are important. The previous $286$ angles are not robust enough, as the points with a larger index tends to be the angle point of the three points in more combinations.

We use a new method to generate target datasets:

First, list all the $13$ corner points into set $A$.
Then, generate all the possible combinations of two edge points from the $13$ points, into set $E\subset A\times A$. ($C_{13}^2$)
After that, product the corner points into the edge points combinations: $A'=E\times A$
Lastly, remove angles that has a point in the edge points that is the same to the corner point.

Totally, we have $C_{13}^{1}C_{12}^{2}=858$ angles. Adding the scores, it will be $858\times 2=1716$ dimensions in the feature dataset.

Previously, we only have $286$ angles, yielding a $572$ dimensional data sample. Now, the initial convolutional layers grows from $2$ to $6$, with each channel having $286$ data, still.

2. Trade-Off Factor in Application

We introduced a theory of "Loss Function" from CISC3024 Pattern Recognition. The idea is that, different selection errors may have differently significant consequences, i.e., "losses" or "costs".

For instance, if we mis-classify a pedestrian who is not using a phone into "using", the utmost cost we could have is to get sued by him/her.
However, if we mis-classify a pedestrian who is using a phone, it is possible that he/she will get hit by a car and die, which is a way larger cost.

Therefore, to cover this issue, we introduced a bias factor during application. It is an adjustable factor that allows the user (i.e. the pathway chief monitor, or whatever you'd like to call it ;) ) to adjust how "casually" the model considers a person as "using".

This does not require to re-train the model, or have the model trained multiple times. We only work on the output of the model. As discussed before, the model has two outputs: The un-softmaxed 2-dimensional output.

$$ \mathbf{y}=\begin{pmatrix}y_{0} \ y_{1}\end{pmatrix}=f(\mathbf{x}), \ \mathbf{x}\in\mathbb{R}^{3}\times\mathbb{R}^1\times\mathbb{R}^{286}, \ \mathbf{y}\in\mathbb{R}^2 $$ The output will be then sigmoid-ed to suppress it into a possibility-like score in range $[0,1]$. $$ \mathbf{y}'=\begin{pmatrix}y_0'\y_1'\end{pmatrix}=\text{sigmoid}(\mathbf{y})\in\mathbb{R}^2 $$ The final output will be: $$ y^*=\begin{cases} \text{argmax}_{i\in{0,1}} \ y_i & y_1 < \theta \ y_1 & \text{otherwise} \end{cases} $$ where $\theta$ is the pre-defined threshold.

Moreover, this bias also worked as a work-around of the lack of data variance in the training data.

3. Performance Inspection

Posture Estimation Model	Equipment (GPU)	Data Num	Epoch Num	LR
RTMPose Medium	RTX-4060 Laptop	4118 +, 3858 -	650	5e-6

3.1 Training Performance

Under a larger data sample dimension, the overfitting point moved from around $950$ epochs to $650$ epochs. More useful information is discovered!

We can see that the gradient at the end of the training fluctuates, this means that the learning rate is quite large that it's jumping around a local minimum. However, it is still tolerable.

3.2 Testing Performance

Again, the model is well-fitted in the given data samples, yet it is overfitted in the actual posts in the real-world.

3.3 Inspection Performance

3.3.1 Regular

To begin with, this model performs well in the regular term.

We conclude that the convolution is useful, as it lets the model to learn the dependencies among different features. For an evident instance, the model had learned the dependencies between "lowering head" and "raising up hand."

https://drive.google.com/drive/u/0/folders/1oUhbQTxlxmzz8J78fJUmWMiYGS3mtNIR

Agenda for the next meeting

Victor Mai Jiajun gave a new structure of inputs, and try to train and test it.
get ready to train a phone-detector model with interface of YOLOv8 and dive into the next phase of the whole project.
cover more videos that not yet been fitted in model in the 1st phase.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

20241101.md

20241101.md

Information

Discussion Summary

1. Final Dataset Supplementation

2. Adjust Features

2. Trade-Off Factor in Application

3. Performance Inspection

3.1 Training Performance

3.2 Testing Performance

3.3 Inspection Performance

3.3.1 Regular

Agenda for the next meeting

Files

20241101.md

Latest commit

History

20241101.md

File metadata and controls

Information

Discussion Summary

1. Final Dataset Supplementation

2. Adjust Features

2. Trade-Off Factor in Application

3. Performance Inspection

3.1 Training Performance

3.2 Testing Performance

3.3 Inspection Performance

3.3.1 Regular

Agenda for the next meeting