There are two problems with the current Qwen implementation.
First, `tie_word_embeddings` is an optional feature in the Qwen model, but the current implementation enables it unconditionally as the default.
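To illustrate what the flag should control (this is a minimal NumPy sketch of the weight-tying idea, not the actual KerasHub code; `TinyLM` and its attributes are hypothetical names): when tied, the output projection reuses the input embedding matrix transposed, and when untied, a separate `lm_head` weight is created.

```python
import numpy as np

class TinyLM:
    """Hypothetical sketch of what `tie_word_embeddings` controls."""

    def __init__(self, vocab_size, hidden_dim, tie_word_embeddings=True, seed=0):
        rng = np.random.default_rng(seed)
        # Input embedding table: (vocab_size, hidden_dim).
        self.embedding = rng.normal(size=(vocab_size, hidden_dim))
        self.tie_word_embeddings = tie_word_embeddings
        if not tie_word_embeddings:
            # Untied: an independent output projection is learned.
            self.lm_head = rng.normal(size=(vocab_size, hidden_dim))

    def logits(self, hidden):
        # hidden: (batch, hidden_dim) -> logits: (batch, vocab_size).
        # Tied models reuse the embedding matrix as the output weight.
        w = self.embedding if self.tie_word_embeddings else self.lm_head
        return hidden @ w.T

tied = TinyLM(vocab_size=8, hidden_dim=4, tie_word_embeddings=True)
untied = TinyLM(vocab_size=8, hidden_dim=4, tie_word_embeddings=False)
h = np.ones((1, 4))
print(tied.logits(h).shape)  # (1, 8)
```

The point of the issue is that this boolean should be read from the model config rather than assumed true for every checkpoint.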
Second, regarding the Qwen3 implementation: Qwen3 differs from Qwen2 in only one respect, the QK norm. When we add Qwen3, could we simply add a config option to the Qwen2 model?
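The QK-norm difference can be sketched as follows (a minimal NumPy illustration, assuming Qwen3's QK norm is an RMSNorm applied per head to the query and key projections before attention; the function and variable names here are made up for the example, not KerasHub API):

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    # RMSNorm over the last axis: scale so the root-mean-square is ~1.
    var = np.mean(x * x, axis=-1, keepdims=True)
    return x / np.sqrt(var + eps) * weight

rng = np.random.default_rng(0)
head_dim = 8
# q: (num_heads, seq_len, head_dim); k would be treated identically.
q = rng.normal(size=(2, 4, head_dim))
gamma = np.ones(head_dim)  # learnable RMSNorm scale, initialized to ones

q_qwen2 = q                   # Qwen2-style: queries used as-is
q_qwen3 = rms_norm(q, gamma)  # Qwen3-style: QK norm applied first
```

Since the rest of the block is identical, a single boolean such as a hypothetical `use_qk_norm` flag on the Qwen2 backbone could switch between the two behaviors instead of duplicating the whole model.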
I can help fix both problems. Do you think they should be handled by the keras-team, or should I submit a new PR?