
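After replacing RMSNorm with DyT, the loss does not change at all. The same thing happens when I train a model I built myself. Why is that? I have tuned the parameters too, so why are the results different from what the paper reports? Hahaha #2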


Closed
Culturenotes opened this issue Apr 4, 2025 · 1 comment

Comments

@Culturenotes

No description provided.

@mdy666
Owner

mdy666 commented Apr 5, 2025

I don't know either. It may be fairly sensitive to initialization.
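One way to probe the initialization hypothesis is to check how much of the tanh output saturates for a given starting `alpha`. The sketch below is a minimal, dependency-free illustration, assuming the usual DyT form `gamma * tanh(alpha * x) + beta`; it is not this repository's kernel. The sample activations, the 0.99 saturation threshold, and the two `alpha` values are hypothetical choices made only for the example.

```python
import math


def dyt(x, alpha, gamma=1.0, beta=0.0):
    """Elementwise Dynamic Tanh on a list of floats (assumed form)."""
    return [gamma * math.tanh(alpha * v) + beta for v in x]


def saturated_fraction(x, alpha, threshold=0.99):
    """Fraction of elements whose |tanh(alpha * x)| exceeds the threshold.

    Saturated elements pass almost no gradient back to x or alpha,
    which is one plausible reason a loss curve could stay flat.
    """
    hits = sum(1 for v in x if abs(math.tanh(alpha * v)) > threshold)
    return hits / len(x)


# Hypothetical pre-normalization activations with a wide range.
activations = [-40.0, -12.0, -3.0, 0.5, 4.0, 15.0, 35.0]

# Compare two hypothetical initial values of alpha.
for alpha in (0.5, 0.05):
    frac = saturated_fraction(activations, alpha)
    print(f"alpha={alpha}: saturated fraction = {frac:.2f}")
```

If most elements saturate at the chosen starting `alpha`, trying a smaller initial value (or checking the real activation scale in the model) would be a reasonable next experiment; this is a diagnostic idea, not a confirmed explanation for the behavior reported here.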

@mdy666 mdy666 closed this as completed Apr 18, 2025
