Hi! I followed all the steps in the Falcon blog post, using my own dataset. I get this error when I try to run inference:
python generate/adapter_v2.py --adapter_path workspace/out/adapter/falcon/lit_model_adapter_finetuned.pth --checkpoint_dir checkpoints/tiiuae/falcon-7b --quantize llm.int8 --prompt "What food do lamas eat?"
Loading model 'checkpoints/tiiuae/falcon-7b/lit_model.pth' with {'org': 'tiiuae', 'name': 'falcon-7b', 'block_size': 2048, 'vocab_size': 50254, 'padding_multiple': 512, 'padded_vocab_size': 65024, 'n_layer': 32, 'n_head': 71, 'n_embd': 4544, 'rotary_percentage': 1.0, 'parallel_residual': True, 'bias': False, 'n_query_groups': 1, 'shared_attention_norm': True, '_norm_class': 'LayerNorm', 'norm_eps': 1e-05, '_mlp_class': 'GptNeoxMLP', 'intermediate_size': 18176, 'adapter_prompt_length': 10, 'adapter_start_layer': 2}
bin /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cuda117.so
Time to instantiate model: 1.49 seconds.
Time to load the model weights: 27.58 seconds.
Traceback (most recent call last):
  File "/workspace/lit-gpt/generate/adapter_v2.py", line 137, in <module>
    CLI(main)
  File "/usr/local/lib/python3.10/dist-packages/jsonargparse/_cli.py", line 85, in CLI
    return _run_component(component, cfg_init)
  File "/usr/local/lib/python3.10/dist-packages/jsonargparse/_cli.py", line 147, in _run_component
    return component(**cfg)
  File "/workspace/lit-gpt/generate/adapter_v2.py", line 106, in main
    y = generate(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/workspace/lit-gpt/generate/base.py", line 66, in generate
    logits = model(x, max_seq_length, input_pos)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/lightning/fabric/wrappers.py", line 116, in forward
    output = self._forward_module(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/lit-gpt/lit_gpt/adapter.py", line 108, in forward
    x, self.kv_caches[i], self.adapter_kv_caches[i] = block(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/lit-gpt/lit_gpt/adapter.py", line 150, in forward
    h, new_kv_cache, new_adapter_kv_cache = self.attn(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/lit-gpt/lit_gpt/adapter.py", line 192, in forward
    qkv = self.attn(x)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1505, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1514, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/lit-gpt/lit_gpt/adapter_v2.py", line 35, in adapter_v2_new_forward
    return self.adapter_scale * (torch.nn.functional.linear(input, self.weight, self.bias) + self.adapter_bias)
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::BFloat16 != signed char
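For context, the failing frame is `adapter_v2_new_forward` in `lit_gpt/adapter_v2.py`, which calls `torch.nn.functional.linear(input, self.weight, self.bias)` directly. With `--quantize llm.int8`, bitsandbytes replaces the linear layers so that `self.weight` is stored as int8 (`signed char`), while the activations are bfloat16, which is exactly the dtype mismatch in the error. Below is a minimal, untested sketch of one possible workaround, assuming the adapter patch is applied per-instance as in lit-gpt (so the class-level `forward`, e.g. `bnb.nn.Linear8bitLt.forward`, is still the quantization-aware one); this is not the repo's official fix:

```python
import torch

def adapter_v2_new_forward(self, input: torch.Tensor) -> torch.Tensor:
    # Dispatch to the class-level forward (nn.Linear or a bitsandbytes
    # Linear8bitLt), which knows how to matmul against int8 weights,
    # instead of calling F.linear on self.weight directly.
    base = self.__class__.forward(self, input)
    # Apply the Adapter v2 scale and bias on top of the base projection.
    return self.adapter_scale * (base + self.adapter_bias)
```

In the meantime, running `generate/adapter_v2.py` without the `--quantize` flag avoids the quantized weights entirely, at the cost of higher memory usage.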
aniketmaurya pushed a commit to aniketmaurya/install-lit-gpt that referenced this issue on Jul 5, 2023.
I also have this issue; looking at lit-llama, it appears to be the same as the corresponding issue there. I'm not sure quantized inference with an adapter fine-tuned model has been addressed yet in the lit-gpt repo.
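To confirm you are hitting the same path, a quick diagnostic (hypothetical; `model` is the instance built in `generate/adapter_v2.py` after the quantized weights are loaded) is to print the weight dtype of the adapter-patched layers. An `int8` dtype means the `F.linear` call above will fail exactly as in the traceback:

```python
# Hypothetical check: list adapter_v2-patched linears and their weight dtypes.
for name, module in model.named_modules():
    if hasattr(module, "adapter_scale"):  # marker added by the adapter_v2 patch
        print(name, module.weight.dtype)  # torch.int8 under --quantize llm.int8
```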