Commit 4640b53

Author: weikaiwen
Commit message: update documents
1 parent ce349ba, commit 4640b53

3 files changed: +5 -1 lines changed

docs/source/Instruction/命令行参数.md

Lines changed: 1 addition & 0 deletions
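@@ -351,6 +351,7 @@ Vera uses the three parameters `target_modules`, `target_regex`, and `modules_to_save`.
 - check_model: Checks whether local model files are corrupted or modified and reports it. Defaults to True. In an offline environment, set this to False.
 - 🔥create_checkpoint_symlink: Creates additional checkpoint symlinks, which makes it easier to write automated training scripts. The symlink paths for best_model and last_model are f'{output_dir}/best' and f'{output_dir}/last', respectively.
 - loss_type: Loss type. Defaults to None, which uses the model's built-in loss function.
+- channel_list: List of channels contained in the dataset. Defaults to None. Used together with `--loss_type channel_loss`; see [here](https://github.com/modelscope/ms-swift/blob/main/examples/train/plugins/channel_loss.sh).
 - 🔥packing: Whether to use sequence packing to improve computational efficiency. Defaults to False. Currently supported for `swift pt/sft`.
 - Note: When using packing, combine it with `--attn_impl flash_attn` and ensure "transformers>=4.44". See [this PR](https://github.com/huggingface/transformers/pull/31629) for details.
 - Supported multimodal models: see https://github.com/modelscope/ms-swift/blob/main/examples/train/packing/qwen2_5_vl.sh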

docs/source_en/Instruction/Command-line-parameters.md

Lines changed: 1 addition & 0 deletions
@@ -360,6 +360,7 @@ Training arguments include the [base arguments](#base-arguments), [Seq2SeqTraine
 - check_model: Check local model files for corruption or modification and give a prompt, default is True. If in an offline environment, please set to False.
 - 🔥create_checkpoint_symlink: Creates additional checkpoint symlinks to facilitate writing automated training scripts. The symlink paths for `best_model` and `last_model` are `f'{output_dir}/best'` and `f'{output_dir}/last'` respectively.
 - loss_type: Type of loss. Defaults to None, which uses the model's built-in loss function.
+- channel_list: List of channels included in the dataset. Defaults to None. Used in conjunction with `--loss_type channel_loss`. Refer to [this example](https://github.com/modelscope/ms-swift/blob/main/examples/train/plugins/channel_loss.sh) for more details.
 - 🔥packing: Whether to use sequence packing to improve computational efficiency. The default value is False. Currently supports `swift pt/sft`.
 - Note: When using packing, please combine it with `--attn_impl flash_attn` and ensure "transformers>=4.44". For details, see [this PR](https://github.com/huggingface/transformers/pull/31629).
 - Supported multimodal models reference: https://github.com/modelscope/ms-swift/blob/main/examples/train/packing/qwen2_5_vl.sh
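
As a rough illustration of why a trainer might need the list of channels, the sketch below averages per-sample losses separately for each channel. It is not taken from ms-swift: the function name `mean_loss_per_channel`, its arguments, and the assumption that channel loss means a per-channel mean are all hypothetical.

```python
from collections import defaultdict


def mean_loss_per_channel(losses, channels, channel_list=None):
    """Average per-sample losses separately for each channel.

    Illustrative only: a hypothetical helper, not the ms-swift implementation.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for loss, channel in zip(losses, channels):
        # Skip samples whose channel is not in the declared list.
        if channel_list is not None and channel not in channel_list:
            continue
        totals[channel] += loss
        counts[channel] += 1
    # Only channels that actually appeared in the batch are reported.
    return {channel: totals[channel] / counts[channel] for channel in totals}


per_channel = mean_loss_per_channel(
    losses=[0.5, 1.5, 2.0],
    channels=["chat", "chat", "math"],
    channel_list=["chat", "math", "code"],
)
print(per_channel)  # {'chat': 1.0, 'math': 2.0}
```

In this sketch a channel such as `code` that is declared but absent from the batch simply produces no entry.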

examples/train/plugins/channel_loss.sh

Lines changed: 3 additions & 1 deletion
@@ -1,4 +1,5 @@
 # use loss_type channel_loss
+# channel_list specifies the channels included in the dataset
 # data should have 'channel' field
 # eg.
 # {"channel": "chat",
@@ -27,4 +28,5 @@ swift sft \
     --system 'You are a helpful assistant.' \
     --warmup_ratio 0.05 \
     --dataloader_num_workers 4 \
-    --loss_type channel_loss
+    --loss_type channel_loss \
+    --channel_list 'chat' 'math' 'code'
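
Since the script expects every record to carry a `channel` field that matches `--channel_list`, a quick pre-flight check on the dataset may be useful. The following is a minimal sketch, assuming a JSONL dataset with one JSON object per line. The helper name `find_unlisted_channels` is hypothetical, and the sample records contain only the `channel` field because the rest of the record layout is not shown in the script.

```python
import json


def find_unlisted_channels(jsonl_lines, channel_list):
    """Return (line_number, channel) pairs whose channel is missing or not listed.

    Hypothetical helper for illustration; not part of ms-swift.
    """
    allowed = set(channel_list)
    problems = []
    for line_number, line in enumerate(jsonl_lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        channel = record.get("channel")  # None when the field is absent
        if channel not in allowed:
            problems.append((line_number, channel))
    return problems


# Sample records hold only the 'channel' field; real data would carry more.
sample = [
    '{"channel": "chat"}',
    '{"channel": "poetry"}',
    '{}',
]
problems = find_unlisted_channels(sample, ["chat", "math", "code"])
print(problems)  # [(2, 'poetry'), (3, None)]
```

To check a real file, the same function can be given an open file handle, for example `find_unlisted_channels(open(path, encoding="utf-8"), ["chat", "math", "code"])`, where `path` is the dataset location.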
