Failed to reproduce the results #48

Open
a1exyu opened this issue Mar 1, 2025 · 2 comments

Comments

@a1exyu

a1exyu commented Mar 1, 2025

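Hello, I ran experiments on the 0.5B, 1.5B, and 3B models using the latest version of the repository, with LoRA fine-tuning on a single RTX 4090. The results I reproduced are far off: the model follows the output format very well, but accuracy is very low and the generated completions keep getting shorter.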

@a1exyu
Author

a1exyu commented Mar 1, 2025

Here are my training process and parameter settings:
Image

Image

Model parameters

model_name_or_path: /root/shared-nvme/model/Qwen2.5-3B/
model_revision: main
torch_dtype: bfloat16
lora_r: 64 # LoRA rank
lora_alpha: 32
attn_implementation: flash_attention_2
bf16: true
tf32: true
output_dir: ./output/Datawhale-R1-3B
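
For reference, standard LoRA scales the adapter update by `lora_alpha / r`. The sketch below works out that factor for the values above and, for comparison, for the values suggested by the `r32a64` suffix in `experiment_name`. It assumes the default (non-rsLoRA) scaling; the helper name `lora_scaling` is hypothetical and not part of the repository.

```python
def lora_scaling(lora_alpha: float, lora_r: int) -> float:
    """Scaling applied to the low-rank update in standard LoRA (alpha / r)."""
    if lora_r <= 0:
        raise ValueError("lora_r must be positive")
    return lora_alpha / lora_r


# Values as written in the config: lora_r = 64, lora_alpha = 32.
config_scaling = lora_scaling(lora_alpha=32, lora_r=64)

# Values as the "r32a64" suffix of experiment_name would read: r = 32, alpha = 64.
name_scaling = lora_scaling(lora_alpha=64, lora_r=32)

print(config_scaling, name_scaling)
```

The two readings differ by a factor of 4 in update scale, so it may be worth confirming which combination was actually run.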

Dataset parameters

dataset_id_or_path: ./data/

SwanLab training-run logging parameters

swanlab: true # whether to enable SwanLab
workspace: alexyu010120
project: r1-repoduction
experiment_name: qwen2.5-3B-lr:1e-5_beta:1e-3-r32a64

Training parameters

max_steps: 750 # maximum number of training steps
per_device_train_batch_size: 1
gradient_accumulation_steps: 4
gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
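learning_rate: 1.0e-5 # learning rate, set to 1e-5
lr_scheduler_type: cosine # learning-rate decay schedule
warmup_ratio: 0.03 # learning-rate warmup ratio (relative to the total number of steps); works well!
seed: 2025 # random seed, for reproducibility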
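
To make these settings concrete, here is a rough sketch of the effective batch size and the learning-rate curve they imply. It assumes linear warmup over `ceil(warmup_ratio * max_steps)` steps followed by cosine decay to zero, which is the usual shape for a `cosine` scheduler; the function `lr_at` and the constants are written out for illustration and are not taken from the repository.

```python
import math

MAX_STEPS = 750
WARMUP_RATIO = 0.03
PEAK_LR = 1.0e-5
PER_DEVICE_BATCH = 1
GRAD_ACCUM = 4
NUM_GPUS = 1  # single RTX 4090, per the report


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay (assumed shape)."""
    warmup = math.ceil(MAX_STEPS * WARMUP_RATIO)
    if step < warmup:
        return PEAK_LR * step / warmup
    progress = (step - warmup) / max(1, MAX_STEPS - warmup)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


# Batch items accumulated before each optimizer update.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_GPUS
warmup_steps = math.ceil(MAX_STEPS * WARMUP_RATIO)

print(effective_batch, warmup_steps)
print(lr_at(0), lr_at(warmup_steps), lr_at(MAX_STEPS))
```

Under these assumptions, each update sees only 4 batch items and the learning rate peaks after 23 steps, so the per-step gradient estimate is likely to be fairly noisy.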

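GRPO algorithm parameters

beta: 0.001 # KL penalty coefficient; tuned, see the explanation below
optim: adamw_8bit # optimizer; 8-bit for speed
max_prompt_length: 256 # maximum input prompt length; barely varies in this experiment
max_completion_length: 1024 # maximum completion length, including the reasoning chain of thought
num_generations: 4
use_vllm: true # enable vLLM to speed up generation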
vllm_gpu_memory_utilization: 0.4
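
As background for `num_generations`, GRPO scores each completion relative to the other completions sampled for the same prompt. The sketch below assumes the common formulation that subtracts the group mean and divides by the group standard deviation; the function name `group_advantages`, the `eps` value, and the use of the population standard deviation are illustrative assumptions, not the repository's or any library's exact code.

```python
from statistics import mean, pstdev


def group_advantages(rewards: list[float], eps: float = 1e-4) -> list[float]:
    """Normalize rewards within one prompt's group of sampled completions."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# One of four completions is correct: it gets a positive advantage, the rest negative.
mixed = group_advantages([1.0, 0.0, 0.0, 0.0])

# All four completions get the same reward: every advantage is zero.
uniform = group_advantages([0.0, 0.0, 0.0, 0.0])

print(mixed)
print(uniform)
```

With only 4 generations per prompt, groups in which every completion receives the same reward contribute no learning signal under this formulation, which may be one factor to check when accuracy stays low.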

Logging arguments

logging_strategy: steps
logging_steps: 1
save_strategy: "steps"
save_steps: 50 # save a checkpoint every this many steps

@anine09
Contributor

anine09 commented Mar 3, 2025

Maybe you could try the parameters from @Kedreamix for reference: #44 (comment)
