Conversation
- Reduce the evaluation and training batch sizes to save GPU memory
- Enable gradient checkpointing
- Fix a dtype conversion issue (add a `.float()` cast)
- Complete the unlearn-mode configuration
- Add error handling for dataset loading
- Fix `compute_loss` method signature compatibility
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 483337d0d5
```python
# Reshape delta and apply it
weight = proj.weight
delta_reshaped = delta[: weight.numel()].reshape(weight.shape)
weight.add_(delta_reshaped * self.edit_lr)
```
Emit a full-size weight update before reshaping delta
_init_edit_network() produces a vector of length hidden_size * 2, but here that vector is reshaped into proj.weight.shape, which is much larger for every real transformer MLP projection (for example, Llama/Qwen down_proj). In those cases delta[:weight.numel()].reshape(weight.shape) throws at runtime, so MENDEditor.edit() cannot complete on actual checkpoints and will immediately flip success to false on the first edit.
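One way to make the failure explicit (a hypothetical sketch, not the project's actual code): either size the edit network's output to `weight.numel()`, or fail fast with a shape guard before reshaping. The helper and names below are illustrative:

```python
import math

def apply_flat_delta(weight_shape, delta, edit_lr=1e-3):
    """Guard against silently truncating (or over-reading) a flat delta.

    `weight_shape` and `delta` stand in for proj.weight.shape and the
    editor's output vector; these names are illustrative, not from the repo.
    """
    needed = math.prod(weight_shape)
    if len(delta) != needed:
        raise ValueError(
            f"edit network emitted {len(delta)} values, "
            f"but the target weight needs {needed}"
        )
    # With matching sizes, the reshape-and-scale step is safe.
    return [d * edit_lr for d in delta]

# A Llama-style down_proj has hidden_size * intermediate_size entries
# (e.g. 4096 * 11008), so a vector of length 2 * hidden_size is far short
# and the guard above raises instead of corrupting the edit.
```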
```yaml
# LoRA parameters
method_args:
  r: 8
  lora_alpha: 16
  target_modules:
    - q_proj
    - v_proj
```
Nest inject experiment method_args under trainer
This override block sits at the config root, but load_trainer() only reads cfg.trainer.method_args (src/trainer/__init__.py:90-92). As a result, running experiment=inject/alpaca/default silently ignores the advertised LoRA overrides here and falls back to the defaults from configs/trainer/inject/LoRA.yaml; the same pattern also affects the sibling adalora and dora experiment templates.
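A sketch of the fix, assuming the usual Hydra-style experiment override layout (the key names come from the review; only the nesting changes):

```yaml
# experiment=inject/alpaca/default — nested under `trainer` so that
# load_trainer()'s cfg.trainer.method_args lookup actually sees it
trainer:
  method_args:
    r: 8
    lora_alpha: 16
    target_modules:
      - q_proj
      - v_proj
```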
```yaml
defaults:
  - edit_metrics/reliability
  - edit_metrics/generalization
  - edit_metrics/locality
  - edit_metrics/portability
```
Provide runtime inputs in the default edit eval config
This suite only pulls in the metric nodes, but src/eval.py populates edit_data and original_model only when the config defines edit_data(_path) and original_model_path. With the current eval=edit defaults, the evaluators run against an empty edit set, so reliability/generalization/portability all collapse to 0 while locality returns the optimistic fallback 1.0 from src/evals/edit.py:439-443; python src/eval.py eval=edit ... therefore writes misleading scores unless the user manually adds extra overrides.
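A hedged sketch of what the default `eval=edit` config could declare so the evaluators receive real inputs. The `edit_data_path` and `original_model_path` keys come from the review above; `???` is Hydra's mandatory-value marker, used here instead of inventing concrete paths:

```yaml
defaults:
  - edit_metrics/reliability
  - edit_metrics/generalization
  - edit_metrics/locality
  - edit_metrics/portability

# Without these, src/eval.py leaves the edit set empty and locality
# silently falls back to 1.0; force the user to supply them.
edit_data_path: ???        # path to the edit dataset
original_model_path: ???   # pre-edit checkpoint for locality/portability
```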