
Error with own dataset on falcon-7b: RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::BFloat16 != signed char #223

Open · FedericoMontana opened this issue Jul 1, 2023 · 1 comment
Labels: bug, quantization

@FedericoMontana

Hi! I followed all the steps in the Falcon blog post, but with my own dataset. I get this error when I try to run inference:

python generate/adapter_v2.py --adapter_path workspace/out/adapter/falcon/lit_model_adapter_finetuned.pth --checkpoint_dir checkpoints/tiiuae/falcon-7b --quantize llm.int8 --prompt "What food do lamas eat?"

Loading model 'checkpoints/tiiuae/falcon-7b/lit_model.pth' with {'org': 'tiiuae', 'name': 'falcon-7b', 'block_size': 2048, 'vocab_size': 50254, 'padding_multiple': 512, 'padded_vocab_size': 65024, 'n_layer': 32, 'n_head': 71, 'n_embd': 4544, 'rotary_percentage': 1.0, 'parallel_residual': True, 'bias': False, 'n_query_groups': 1, 'shared_attention_norm': True, '_norm_class': 'LayerNorm', 'norm_eps': 1e-05, '_mlp_class': 'GptNeoxMLP', 'intermediate_size': 18176, 'adapter_prompt_length': 10, 'adapter_start_layer': 2}
bin /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cuda117.so
Time to instantiate model: 1.49 seconds.
Time to load the model weights: 27.58 seconds.
Traceback (most recent call last):
  File "/workspace/lit-gpt/generate/adapter_v2.py", line 137, in <module>
    CLI(main)
  File "/usr/local/lib/python3.10/dist-packages/jsonargparse/_cli.py", line 85, in CLI
    return _run_component(component, cfg_init)
  File "/usr/local/lib/python3.10/dist-packages/jsonargparse/_cli.py", line 147, in _run_component
    return component(**cfg)
  File "/workspace/lit-gpt/generate/adapter_v2.py", line 106, in main
    y = generate(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/workspace/lit-gpt/generate/base.py", line 66, in generate
    logits = model(x, max_seq_length, input_pos)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/lightning/fabric/wrappers.py", line 116, in forward
    output = self._forward_module(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/lit-gpt/lit_gpt/adapter.py", line 108, in forward
    x, self.kv_caches[i], self.adapter_kv_caches[i] = block(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/lit-gpt/lit_gpt/adapter.py", line 150, in forward
    h, new_kv_cache, new_adapter_kv_cache = self.attn(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/lit-gpt/lit_gpt/adapter.py", line 192, in forward
    qkv = self.attn(x)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/lit-gpt/lit_gpt/adapter_v2.py", line 35, in adapter_v2_new_forward
    return self.adapter_scale * (torch.nn.functional.linear(input, self.weight, self.bias) + self.adapter_bias)
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::BFloat16 != signed char
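The failing frame points at the cause: `adapter_v2_new_forward` calls `torch.nn.functional.linear` on the layer's raw `weight` tensor. With `--quantize llm.int8`, bitsandbytes stores that weight as int8 while the activations are bfloat16, and the matmul rejects the mixed dtypes. A minimal standalone sketch (hypothetical tensors, not lit-gpt code) reproduces the same error:

```python
import torch

# F.linear multiplies the activations directly against the raw weight tensor,
# so if the weight has been replaced by int8 storage (as llm.int8 quantization
# does), the dtypes no longer match and the matmul raises.
x = torch.randn(1, 8, dtype=torch.bfloat16)             # bf16 activations
w = torch.randint(-128, 128, (8, 8), dtype=torch.int8)  # int8 weight storage
torch.nn.functional.linear(x, w)
# RuntimeError: expected mat1 and mat2 to have the same dtype,
# but got: c10::BFloat16 != signed char
```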
aniketmaurya pushed a commit to aniketmaurya/install-lit-gpt that referenced this issue Jul 5, 2023
@journeadrien commented Jul 10, 2023:

I also have this issue. Looking at lit-llama, it seems to be the same as that issue. I don't think quantized inference with an adapter-fine-tuned model has been addressed yet in the lit-gpt repo.
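For reference, one direction a fix could take (a sketch under assumptions, not the repo's implementation): when adapter_v2 monkey-patches a linear layer's forward, it could stash and reuse the module's original forward, so a bitsandbytes `Linear8bitLt` still dispatches to its own int8 matmul path, which accepts bf16 activations. The helper name `patch_linear_for_adapter_v2` below is hypothetical:

```python
import torch

def adapter_v2_new_forward(self, input: torch.Tensor) -> torch.Tensor:
    # Route through the original forward stashed before patching. For a
    # bitsandbytes Linear8bitLt this runs the int8 matmul kernel instead of
    # calling F.linear on the raw int8 weight tensor.
    return self.adapter_scale * (self._orig_forward(input) + self.adapter_bias)

def patch_linear_for_adapter_v2(layer: torch.nn.Module) -> torch.nn.Module:
    # Hypothetical patching helper, modeled on lit-gpt's adapter_v2 setup.
    out_features = layer.weight.shape[0]
    layer.adapter_bias = torch.nn.Parameter(torch.zeros(out_features), requires_grad=False)
    layer.adapter_scale = torch.nn.Parameter(torch.ones(out_features), requires_grad=False)
    layer._orig_forward = layer.forward  # keep the quantization-aware path
    layer.forward = adapter_v2_new_forward.__get__(layer, layer.__class__)
    return layer
```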

@carmocca added the bug and quantization labels on Jul 12, 2023