Performance of num_head=1? #2

FacePoluke · 2025-01-02T08:59:16Z

Thank you for your excellent work. I noticed that your codebook head number is 4, which means downstream generation tasks need to output 4 tokens at once, potentially making training more challenging. I would like to know how much the performance would differ if the head number is 1.
Looking forward to your response.
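
To make the concern concrete, here is a minimal sketch of why num_head=4 yields 4 tokens per position. It assumes a product-quantization-style split, where the feature vector is cut into equal chunks and each chunk is matched against its own codebook. The helper names (`quantize_multi_head`, `make_codebooks`), the codebook size, and the random codebooks are hypothetical and only for illustration; this is not the OptVQ implementation.

```python
import random


def make_codebooks(num_head, codebook_size, dim, seed=0):
    """Build `num_head` random codebooks (hypothetical stand-ins for learned ones)."""
    rng = random.Random(seed)
    chunk = dim // num_head
    return [
        [[rng.gauss(0, 1) for _ in range(chunk)] for _ in range(codebook_size)]
        for _ in range(num_head)
    ]


def quantize_multi_head(vec, codebooks):
    """Split `vec` into one chunk per head and return one code index per chunk."""
    num_head = len(codebooks)
    assert len(vec) % num_head == 0, "feature dim must be divisible by num_head"
    chunk = len(vec) // num_head
    indices = []
    for h, codebook in enumerate(codebooks):
        part = vec[h * chunk:(h + 1) * chunk]
        # Pick the nearest code by squared Euclidean distance.
        best = min(
            range(len(codebook)),
            key=lambda k: sum((p - c) ** 2 for p, c in zip(part, codebook[k])),
        )
        indices.append(best)
    return indices


rng = random.Random(1)
feature = [rng.gauss(0, 1) for _ in range(8)]  # one spatial position, dim 8

tokens_4 = quantize_multi_head(feature, make_codebooks(4, 16, 8))
tokens_1 = quantize_multi_head(feature, make_codebooks(1, 16, 8))

# With 4 heads a downstream generator must predict 4 indices per position,
# with 1 head only a single index.
print(len(tokens_4), len(tokens_1))
```

Under this assumption, the sequence a downstream generator has to model grows linearly with the number of heads, which is the trade-off the question is about.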

zbr17 · 2025-01-02T11:31:51Z

Hello!

Thank you for your interest in our OptVQ project. Early in the project I experimented with num_head=1 and generally found its performance inferior to num_head=4. However, laboratory resources are limited: I am currently using an 8-GPU 4090 server, and with the latest version of the training code I only have results for num_head=4 and num_head=8. Please bear with me; I plan to add checkpoints and evaluation results for num_head=1 to the project in the future.

Best regards,
Borui Zhang

FacePoluke · 2025-01-03T01:53:56Z

Thanks for your reply. Looking forward to it.
