The mask_history parameter in detail

src/llamafactory/hparams/data_args.py, lines 54-57:

```python
mask_history: bool = field(
    default=False,
    metadata={"help": "Whether or not to mask the history and train on the last turn only."},
)
```
Default value: False

src/llamafactory/data/processor/supervised.py, lines 49-50:

```python
if self.data_args.mask_history:
    encoded_pairs = encoded_pairs[::-1]  # high priority for last turns
```
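The "high priority for last turns" comment suggests that the reversal matters when a token budget runs out before every turn has been processed. The sketch below illustrates that idea only. It assumes the loop stops once a budget (called `cutoff_len` here) is used up and that a turn simply takes whatever budget is left; `kept_turns` is a hypothetical helper, and the library's actual truncation logic may differ.

```python
def kept_turns(turn_lengths, cutoff_len, mask_history):
    """Return the 1-based numbers of the turns that receive any tokens.

    Simplified assumption: each turn takes min(its length, remaining budget).
    """
    order = list(enumerate(turn_lengths, start=1))
    if mask_history:
        order = order[::-1]  # visit the last turn first
    total, kept = 0, []
    for turn_no, length in order:
        if total >= cutoff_len:
            break  # budget exhausted, the remaining turns are dropped
        total += min(length, cutoff_len - total)
        kept.append(turn_no)
    return sorted(kept)


lengths = [40, 40, 40]  # token counts of three turns (made-up numbers)
print(kept_turns(lengths, cutoff_len=60, mask_history=False))  # [1, 2]
print(kept_turns(lengths, cutoff_len=60, mask_history=True))   # [2, 3]
```

Under these assumptions, forward iteration would drop the last turn, which is the only one that carries labels when mask_history is on; visiting the turns in reverse keeps it.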
src/llamafactory/data/processor/supervised.py, lines 70-73:

```python
if self.data_args.mask_history and turn_idx != 0:  # train on the last turn only
    target_label = [IGNORE_INDEX] * target_len
else:
    target_label = target_ids
```
- turn_idx == 0: the last turn (the pair list has been reversed), so the loss is computed normally
- turn_idx != 0: a history turn; target_label is set to IGNORE_INDEX, so no loss is computed

src/llamafactory/data/processor/supervised.py, lines 75-80:

```python
if self.data_args.mask_history:  # reversed sequences
    input_ids = source_ids + target_ids + input_ids
    labels = source_label + target_label + labels
else:
    input_ids += source_ids + target_ids
    labels += source_label + target_label
```
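- mask_history=True: each newly processed turn is prepended (source_ids + target_ids + input_ids). Because the turns are visited from last to first, the finished sequence ends up in chronological order again.
- mask_history=False: turns are appended in chronological order (input_ids += source_ids + target_ids)

Suppose we have a 3-turn dialogue: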
```text
Turn 1: User: "Hello" → Assistant: "Hello!"
Turn 2: User: "What's the weather like today?" → Assistant: "The weather is nice today."
Turn 3: User: "Is it a good day to go out?" → Assistant: "Yes, it is."
```
mask_history=False (default):

```text
input_ids: [turn 1 user+assistant] + [turn 2 user+assistant] + [turn 3 user+assistant]
labels:    [IGNORE...IGNORE, turn 1 assistant] + [IGNORE...IGNORE, turn 2 assistant] + [IGNORE...IGNORE, turn 3 assistant]
```
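mask_history=True:

```text
input_ids: [turn 1 user+assistant] + [turn 2 user+assistant] + [turn 3 user+assistant]  (same order as the default)
labels:    [IGNORE...IGNORE, IGNORE...IGNORE] + [IGNORE...IGNORE, IGNORE...IGNORE] + [IGNORE...IGNORE, turn 3 assistant]
```

Processing runs in reverse (3, 2, 1), but prepending each turn restores chronological order; only the turn 3 reply keeps its labels.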
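To check what the final sequences look like, here is a minimal sketch that replays the quoted loop on toy token ids. It is not the library's code: `build_example` is a hypothetical helper, truncation and any template-specific EOS handling are left out, and `IGNORE_INDEX = -100` is assumed as the ignored label value.

```python
IGNORE_INDEX = -100  # assumed value of the label that the loss ignores


def build_example(encoded_pairs, mask_history=False, train_on_prompt=False):
    """Toy replay of the label-building loop (no truncation)."""
    input_ids, labels = [], []
    if mask_history:
        encoded_pairs = encoded_pairs[::-1]  # process the last turn first
    for turn_idx, (source_ids, target_ids) in enumerate(encoded_pairs):
        # Prompt tokens are masked unless train_on_prompt is set.
        source_label = source_ids if train_on_prompt else [IGNORE_INDEX] * len(source_ids)
        if mask_history and turn_idx != 0:  # history turn: mask the reply too
            target_label = [IGNORE_INDEX] * len(target_ids)
        else:
            target_label = target_ids
        if mask_history:  # prepend, which undoes the reversal
            input_ids = source_ids + target_ids + input_ids
            labels = source_label + target_label + labels
        else:
            input_ids += source_ids + target_ids
            labels += source_label + target_label
    return input_ids, labels


# Three turns; each pair is (user token ids, assistant token ids).
pairs = [([1, 2], [3]), ([4, 5], [6]), ([7, 8], [9])]
ids_default, labels_default = build_example(pairs)
ids_masked, labels_masked = build_example(pairs, mask_history=True)
print(ids_masked)     # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(labels_masked)  # [-100, -100, -100, -100, -100, -100, -100, -100, 9]
```

Both settings produce the same input_ids in this sketch; only the labels differ, with the default keeping the replies 3, 6 and 9 and mask_history keeping only 9.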
src/llamafactory/hparams/data_args.py, lines 176-177:

```python
if self.mask_history and self.train_on_prompt:
    raise ValueError("`mask_history` is incompatible with `train_on_prompt`.")
```
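The check above can be tried in isolation with a small stand-in class. This is a sketch under assumptions: `ToyDataArguments` and `is_valid` are hypothetical names, and placing the check in `__post_init__` is a guess about where such validation would typically live in a dataclass, not something the quoted snippet confirms.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDataArguments:
    """Hypothetical stand-in holding only the two flags discussed here."""

    mask_history: bool = field(default=False)
    train_on_prompt: bool = field(default=False)

    def __post_init__(self):
        # Same condition and message as the quoted check.
        if self.mask_history and self.train_on_prompt:
            raise ValueError("`mask_history` is incompatible with `train_on_prompt`.")


def is_valid(**kwargs):
    """Return True if the flag combination is accepted."""
    try:
        ToyDataArguments(**kwargs)
        return True
    except ValueError:
        return False


print(is_valid(mask_history=True))                        # True
print(is_valid(mask_history=True, train_on_prompt=True))  # False
```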
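- mask_history cannot be combined with train_on_prompt=True (the two conflict)

Relationship to loss_mask

- mask_history: masks all history turns uniformly during data preprocessing
- loss_mask: fine-grained, per-message control over whether each message contributes to the loss

LLaMA-Factory does not currently support a message-level loss_mask, but mask_history achieves the "train on the last turn only" effect.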
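- mask_history=True: the loss is computed only on the last turn; history turns serve as context but are not trained on
- It is mutually exclusive with train_on_prompt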

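Author: Dong
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY-NC (Creative Commons Attribution-NonCommercial 4.0 International). You may freely repost and adapt them for non-commercial purposes, provided you credit the source and link to the original author.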