[WIP] [Transform] Compress, decompress #333

Draft
wants to merge 6 commits into
base: kylesayrs/transform_permutations
from

kylesayrs
Contributor

@kylesayrs kylesayrs commented May 31, 2025

idea: submodule structure handles most serialization for us
let's not couple apply_transform_config with apply_quantization_config, otherwise we'd have potential conflicts with the QuantizationMixin

Somehow, we need to let the model_compressor know the q_config and t_config. In the case of q_config, it's actually built on the fly. That kinda works for q_config, since all the schemes are present (although you lose the config_group names). It wouldn't directly work for t_config, since the schemes are still transparent.

A simple solution would be to move towards a pattern where q_config (and, as a subfield, t_config) is attached as an attribute to the model directly, then grabbed by model_compressor. This seems to make sense; I don't see many downsides.
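A rough sketch of that pattern, using plain Python stand-ins rather than the real classes. `QuantConfig`, `TransformConfig`, `attach_config`, `ModelCompressorSketch`, and the attribute name `quantization_config` are all hypothetical names picked for illustration, not the library's API:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TransformConfig:
    # Hypothetical stand-in: maps a transform group name to its settings.
    config_groups: Dict[str, dict] = field(default_factory=dict)


@dataclass
class QuantConfig:
    # Hypothetical stand-in. Group names survive because the config is
    # stored as-is instead of being rebuilt from per-module schemes.
    config_groups: Dict[str, dict] = field(default_factory=dict)
    transform_config: Optional[TransformConfig] = None  # t_config as a subfield


class Model:
    """Placeholder for the real model object."""


ATTR_NAME = "quantization_config"  # assumed attribute name


def attach_config(model, q_config):
    # Called when the config is applied; stores it on the model itself.
    setattr(model, ATTR_NAME, q_config)


class ModelCompressorSketch:
    def __init__(self, q_config):
        self.q_config = q_config
        self.t_config = q_config.transform_config if q_config is not None else None

    @classmethod
    def from_model(cls, model):
        # Grab the attached config rather than reconstructing it on the fly.
        return cls(getattr(model, ATTR_NAME, None))


model = Model()
attach_config(
    model,
    QuantConfig(
        config_groups={"group_0": {"num_bits": 4}},
        transform_config=TransformConfig(
            config_groups={"hadamard_0": {"type": "hadamard"}}
        ),
    ),
)
compressor = ModelCompressorSketch.from_model(model)
```

With this shape, the compressor sees both configs through a single attribute lookup, and a model with nothing attached simply yields `None` for both.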

Need to decide if we want to keep the weight submodules in the compressed state. The issue is that, without saving them, there's no way to go from compressed to decompressed. However, saving them requires extra storage, and vLLM has to ignore those weights.

Let's not keep weight transforms, except when trainable. During decompression, let's add activation hooks (these will need to be added by quantization anyway).
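One way to picture the "keep only trainable weight transforms" rule is as a filter over a flat state dict. This is only an illustration under assumed naming: the `weight_transform` submodule names, the prefix lists, and `filter_compressed_state` are hypothetical, and the string values stand in for tensors:

```python
def filter_compressed_state(state_dict, weight_transform_prefixes, trainable_prefixes):
    """Drop weight-transform entries from state_dict unless they are trainable."""
    trainable = set(trainable_prefixes)
    kept = {}
    for name, value in state_dict.items():
        # Find the weight-transform submodule (if any) that owns this entry.
        owner = next(
            (p for p in weight_transform_prefixes if name.startswith(p + ".")),
            None,
        )
        if owner is not None and owner not in trainable:
            continue  # non-trainable weight transform: not saved
        kept[name] = value
    return kept


state = {
    "q_proj.weight": "packed",
    "q_proj.weight_transform.weight": "hadamard",
    "k_proj.weight": "packed",
    "k_proj.weight_transform.weight": "learned",
}
compressed = filter_compressed_state(
    state,
    weight_transform_prefixes=["q_proj.weight_transform", "k_proj.weight_transform"],
    trainable_prefixes=["k_proj.weight_transform"],
)
```

Here the fixed transform on `q_proj` is dropped, while the trainable one on `k_proj` is kept alongside the packed weights.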

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
@kylesayrs kylesayrs changed the base branch from main to kylesayrs/transform_permutations May 31, 2025 04:46
@kylesayrs kylesayrs changed the title [Transform] Apply, serialize, deserialize [WIP] [Transform] Apply, serialize, deserialize May 31, 2025
@kylesayrs kylesayrs changed the title [WIP] [Transform] Apply, serialize, deserialize [WIP] [Transform] Apply, compress, decompress May 31, 2025
@kylesayrs kylesayrs changed the title [WIP] [Transform] Apply, compress, decompress [WIP] [Transform] Compress, decompress May 31, 2025
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>