Error with shape #27

Open

manlenzzz opened this issue Apr 21, 2024 · 2 comments

@manlenzzz

RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.model.embed_tokens.weight: copying a param with shape torch.Size([32001, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
size mismatch for base_model.model.lm_head.weight: copying a param with shape torch.Size([32001, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).

I used the checkpoints saved from this training config:
python train_gsm8k.py \
    --model_name_or_path LoftQ/Llama-2-7b-hf-4bit-64rank \
    --learning_rate 3e-4 \
    --seed 11 \
    --expt_name gsm8k_llama2_7b_4bit_64rank_loftq \
    --output_dir exp_results/ \
    --num_train_epochs 6 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 8 \
    --evaluation_strategy "no" \
    --save_strategy "epoch" \
    --weight_decay 0.1 \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 10 \
    --do_train \
    --report_to tensorboard

@yxli2123
Owner

Hi @manlenzzz, this script works for me. Could you please provide more details about the error?

It looks like it is caused by the smart_tokenizer_and_embedding_resize function. Because the LLaMA tokenizer doesn't have a PAD token, we add one special token during training, which changes the embedding shape from [32000, 4096] to [32001, 4096]. I would suggest running smart_tokenizer_and_embedding_resize after loading the backbone and before loading the adapters. For example:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

model = AutoModelForCausalLM.from_pretrained("LoftQ/Llama-2-7b-hf-4bit-64rank")
tokenizer = AutoTokenizer.from_pretrained("LoftQ/Llama-2-7b-hf-4bit-64rank")
# build special_tokens_dict as in train_gsm8k.py (it adds the missing PAD token)
smart_tokenizer_and_embedding_resize(special_tokens_dict=special_tokens_dict, tokenizer=tokenizer, model=model)
model = PeftModel.from_pretrained(model, "path/to/your/adapter")
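
For reference, here is a minimal sketch of what such a resize helper typically does (modeled on the common Alpaca-style implementation; the exact code in train_gsm8k.py may differ slightly):

def smart_tokenizer_and_embedding_resize(special_tokens_dict, tokenizer, model):
    """Add missing special tokens (e.g. PAD) and resize the embeddings to match."""
    num_new_tokens = tokenizer.add_special_tokens(special_tokens_dict)
    model.resize_token_embeddings(len(tokenizer))  # e.g. [32000, 4096] -> [32001, 4096]

    if num_new_tokens > 0:
        input_embeddings = model.get_input_embeddings().weight.data
        output_embeddings = model.get_output_embeddings().weight.data

        # Initialize the new rows with the average of the existing embeddings.
        input_embeddings[-num_new_tokens:] = input_embeddings[:-num_new_tokens].mean(dim=0, keepdim=True)
        output_embeddings[-num_new_tokens:] = output_embeddings[:-num_new_tokens].mean(dim=0, keepdim=True)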

@manlenzzz
Author

Yes! You are right.

After I modified the code, it worked fine:
def evaluation(model_args, data_args):
    if model_args.full_precision:
        model = transformers.AutoModelForCausalLM.from_pretrained(
            model_args.model_name_or_path,
            low_cpu_mem_usage=True,
            torch_dtype=torch.bfloat16,
            token=model_args.token,
            device_map='cuda:1',
        )
    else:
        model = transformers.AutoModelForCausalLM.from_pretrained(
            model_args.model_name_or_path,
            low_cpu_mem_usage=True,
            torch_dtype=torch.bfloat16,
            token=model_args.token,
            device_map='cuda:1',
            quantization_config=transformers.BitsAndBytesConfig(
                load_in_4bit=True,
                bnb_4bit_compute_dtype=torch.bfloat16,
                bnb_4bit_use_double_quant=False,
                bnb_4bit_quant_type='nf4',
            ),
        )

    tokenizer = transformers.AutoTokenizer.from_pretrained(
        model_args.model_name_or_path,
        token=model_args.token,
        model_max_length=model_args.model_max_length,
        padding_side="left",
        use_fast=False,
    )

    # Add any missing special tokens (the LLaMA tokenizer has no PAD token by default).
    special_tokens_dict = dict()
    if tokenizer.pad_token is None:
        special_tokens_dict["pad_token"] = DEFAULT_PAD_TOKEN
    if tokenizer.eos_token is None:
        special_tokens_dict["eos_token"] = DEFAULT_EOS_TOKEN
    if tokenizer.bos_token is None:
        special_tokens_dict["bos_token"] = DEFAULT_BOS_TOKEN
    if tokenizer.unk_token is None:
        special_tokens_dict["unk_token"] = DEFAULT_UNK_TOKEN

    # Resize the embeddings BEFORE loading the adapter so the shapes match the checkpoint.
    smart_tokenizer_and_embedding_resize(
        special_tokens_dict=special_tokens_dict,
        tokenizer=tokenizer,
        model=model,
    )
    ##########################
    #       Peft Model       #
    ##########################
    if model_args.adapter_name_or_path is not None:
        model = PeftModel.from_pretrained(
            model,
            model_args.adapter_name_or_path,
            is_trainable=False,
            token=model_args.token,
        )
    else:
        model = PeftModel.from_pretrained(
            model,
            model_args.model_name_or_path,
            subfolder='gsm8k',
            is_trainable=False,
            token=model_args.token,
        )
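
For completeness, a minimal usage sketch (not the repo's actual evaluation loop; the prompts and generation settings below are illustrative) showing how the resized tokenizer and the loaded PEFT model can then run batched generation, which is where the left padding and the added PAD token matter:

prompts = [
    "Question: <a GSM8K question>\nAnswer:",
    "Question: <another GSM8K question>\nAnswer:",
]
inputs = tokenizer(prompts, return_tensors="pt", padding=True).to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=False,
        pad_token_id=tokenizer.pad_token_id,
    )
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))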

yxli2123 added a commit that referenced this issue Apr 21, 2024