
[WIP] Fix quantization for adapter v2 #314

Closed · wants to merge 1 commit

Conversation

@rasbt (Contributor) commented May 22, 2023

Arg, I noticed that llm.int8() quantization breaks for adapter-v2-finetuned models, failing with the following error:

    F.linear(input, self.weight, self.bias) + self.adapter_bias
    RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::BFloat16 != signed char

I tried manually setting the dtype of the new v2 parameters (see this PR), but I still get the same issue. Is there perhaps something I should be doing with

    with fabric.device:
        torch.set_default_tensor_type(...)
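
For reference, here is a hedged reconstruction of the code path from the traceback above (only adapter_bias appears in the error; the function name and everything else is an assumption for illustration). It suggests why setting the dtype of the new v2 parameters alone may not fix the crash: with llm.int8() the layer's weight is stored as int8, so the bfloat16-vs-int8 mismatch happens inside F.linear itself, before adapter_bias is ever involved.

    import torch
    import torch.nn.functional as F

    # Hypothetical sketch of the patched adapter-v2 forward that fails above;
    # only `adapter_bias` is confirmed by the traceback, the rest is illustrative.
    def adapter_v2_linear_forward(self, input: torch.Tensor) -> torch.Tensor:
        # With llm.int8(), `self.weight` is stored as int8 (signed char), so
        # calling F.linear directly mixes a bfloat16 activation with an int8
        # weight and raises the RuntimeError shown above. Changing the dtype
        # of `adapter_bias` does not help, because the failure occurs inside
        # F.linear, before the bias is added.
        return F.linear(input, self.weight, self.bias) + self.adapter_bias

One possible direction (not necessarily what #323 does) would be to route the matmul through the quantized layer's own forward, which knows how to handle the int8 weight, and only add the adapter parameters, cast to the activation dtype, on top of its output.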

ArturK-85 referenced this pull request in ArturK-85/lit-parrot on May 23, 2023
@awaelchli (Contributor) commented

@rasbt This might be the fix we were looking for: #323 :)

@rasbt (Contributor, Author) commented May 25, 2023

Oh yes, this may be it! Will test it out and continue the discussion in the other PR!

@rasbt (Contributor, Author) commented Jun 2, 2023

We can probably close this because of #323
