[RFC]: Add w8a8 Quantization #453

Open
dingdingchaomian opened this issue Apr 1, 2025 · 3 comments
Labels
RFC Request For Comments

Comments

@dingdingchaomian
Contributor

Motivation.

Support inference with weights generated by the modelslim tool.

Proposed Change.

Modify the quantization-related implementation.

Feedback Period.

No response

CC List.

No response

Any Other Things.

No response

@Yikun changed the title from "[RFC]: 支持w8a8量化推理" (Support w8a8 quantized inference) to "[RFC]: Add w8a8 Quantization" on Apr 1, 2025
@Yikun added the RFC (Request For Comments) label on May 11, 2025
@wangxiyuan
Collaborator

Related issue and guide: #619

@learning-chip

Incorrect output from W8A8 model: #628 (comment)

@tingyiz97

Found a performance issue right here: #1015

5 participants