generated from datawhalechina/repo-template
Merge pull request #19 from gyfffffff/main
Update the PPT; refine documentation and code details
Showing 18 changed files with 5,075 additions and 563 deletions.
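@@ -1,2 +1,7 @@
# Summary

In this chapter, we covered the concept of large-model distillation, how it differs from traditional distillation, and the mainstream paradigms for distilling large models.
In the author's view, whether the approach is white-box or black-box, the idea that runs through all large-model distillation is that "the training data comes from the teacher" rather than from human or machine annotation.

Distillation is undoubtedly a low-cost, efficient way to improve a small model's capabilities. It can be called a "shortcut", and its original purpose is to deploy better models under limited resources.
As researchers committed to the long term, however, we cannot rely solely on distillation to "take the shortcut" when we want to improve model capability; we still need to start from first principles and explore technical routes that improve models at a fundamental level.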