Add mx_fp8_bf16 kernel #1637
Conversation
nice! if CI is green - looks good! I think this should have at least one numerical test though. Can be a follow-up PR if needed.
Very cool!
Stacked PRs:
Add mx_fp8_bf16 kernel
Will flesh out more, but this moves the kernel over from here: https://github.com/drisspg/driss_torch/blob/2813322f0b0f9a0f0fc8d382090ad0aaecf3468a/src/mx_fp8_bf16.cu#L162
This does an fp8 x fp8 matmul with E8M0 scales and a group_size hard-coded to 32. The scale format is the same one cuBLASLt expects. I have created a PyTorch function that converts the [n_rows, n_cols//32] scales into that expected format:
https://github.com/drisspg/transformer_nuggets/blob/382cb0f19a5f615827174289b8ef552419d51fea/transformer_nuggets/mx/to_blocked.py#L11
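Roughly, the conversion pads the [n_rows, n_cols//32] scale matrix to whole 128x4 tiles and swizzles each tile into the chunked layout cuBLASLt consumes. Below is a minimal sketch of what that rearrangement can look like; the 128x4 tiling and 32x16 inner swizzle follow the cuBLASLt block-scaling-factor layout docs, but treat the linked to_blocked.py as the authoritative version.

```python
import torch


def ceil_div(a: int, b: int) -> int:
    return (a + b - 1) // b


def to_blocked_sketch(scales: torch.Tensor) -> torch.Tensor:
    """Rearrange [n_rows, n_cols//32] E8M0 scales into a blocked layout.

    Sketch only: pads to multiples of (128, 4), views the matrix as 128x4
    tiles, and swizzles each tile into a 32x16 chunk, per the cuBLASLt
    block-scaling-factor layout documentation.
    """
    rows, cols = scales.shape
    n_row_blocks = ceil_div(rows, 128)
    n_col_blocks = ceil_div(cols, 4)

    # Pad up to a whole number of (128, 4) tiles.
    padded = scales
    if (rows, cols) != (n_row_blocks * 128, n_col_blocks * 4):
        padded = scales.new_zeros(n_row_blocks * 128, n_col_blocks * 4)
        padded[:rows, :cols] = scales

    # View as (n_row_blocks, n_col_blocks, 128, 4) tiles...
    tiles = padded.view(n_row_blocks, 128, n_col_blocks, 4).permute(0, 2, 1, 3)
    # ...and swizzle each tile into a 32x16 chunk (four 32-row groups
    # interleaved with the 4 scale columns), then flatten.
    swizzled = tiles.reshape(-1, 4, 32, 4).transpose(1, 2).reshape(-1, 32, 16)
    return swizzled.flatten()
```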
This was surprisingly hard fought and would not have been possible without @albanD 😊
This allows PR #1625 to avoid any dependency on PyTorch core updates while we add the required dtypes and cuBLAS bindings in pytorch/pytorch#145562.
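For context, a hedged usage sketch of the new op (not the PR's actual test): the op name mx_fp8_bf16 comes from this PR, but the Python binding path, the e4m3 fp8 input dtype, storing E8M0 scales as raw uint8 exponent bytes, and the operand layouts are all assumptions on my part.

```python
import torch
import torchao  # assumption: the op is registered under torch.ops.torchao after this PR

M, K, N = 128, 256, 128
group_size = 32  # hard-coded in the kernel

# fp8 operands (any row-/column-major layout requirements are not shown here).
a = torch.randn(M, K, device="cuda", dtype=torch.bfloat16).to(torch.float8_e4m3fn)
b = torch.randn(N, K, device="cuda", dtype=torch.bfloat16).to(torch.float8_e4m3fn)

# One E8M0 scale per 32 elements along K, stored here as raw uint8 exponent bytes
# (127 == exponent 0, i.e. scale 1.0), rearranged with the blocked-layout sketch above.
a_scales = to_blocked_sketch(
    torch.full((M, K // group_size), 127, device="cuda", dtype=torch.uint8)
)
b_scales = to_blocked_sketch(
    torch.full((N, K // group_size), 127, device="cuda", dtype=torch.uint8)
)

# Assumed call signature: fp8 inputs + blocked E8M0 scales -> bf16 output of shape (M, N).
out = torch.ops.torchao.mx_fp8_bf16(a, b, a_scales, b_scales)
```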
Follow up
Config needs more tuning