Links to research papers and resources corresponding to implemented features in this repository. Please feel free to fill in any missing references!
**OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics** is the technical report on OpenUnlearning, describing its design, features, and other details. It introduces:
- A meta-evaluation framework for benchmarking unlearning evaluations across 450+ open-source models.
- Results benchmarking 8 diverse unlearning methods in one place, using 10 evaluation metrics on TOFU.
### Unlearning Methods

| Method | Resource |
|---|---|
| GradAscent, GradDiff | Naive baselines used in many papers, including MUSE and TOFU |
| NPO | Paper📄, Code 🐙 |
| SimNPO | Paper📄, Code 🐙 |
| IdkDPO | TOFU (📄) |
| RMU | WMDP paper (🐙, 🌐), later used in G-effect (🐙) |
| UNDIAL | Paper📄, Code 🐙 |
| AltPO | Paper📄, Code 🐙 |
| SatImp | Paper📄, Code 🐙 |
| WGA (G-effect) | Paper📄, Code 🐙 |
| CE-U (Cross-Entropy unlearning) | Paper📄 |
| PDU | Paper 📄 |
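For intuition on the naive baselines in the table above: GradAscent simply maximizes the language-modeling loss on the forget set, while GradDiff additionally minimizes the loss on a retain set to preserve utility. A minimal NumPy sketch of the combined objective (the function names here are illustrative, not from this repository's API):

```python
import numpy as np

def token_nll(logits, targets):
    """Mean negative log-likelihood of target tokens under the given logits."""
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    logp = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return -logp[np.arange(len(targets)), targets].mean()

def graddiff_loss(forget_logits, forget_targets,
                  retain_logits, retain_targets, lam=1.0):
    """GradDiff objective: ascend the forget-set loss, descend the retain-set loss.

    GradAscent is the special case lam=0 (pure ascent on the forget set).
    """
    return (-token_nll(forget_logits, forget_targets)
            + lam * token_nll(retain_logits, retain_targets))
```

In practice the logits come from the model being unlearned, and the loss is fed to a standard optimizer; the sketch only shows how the two terms are combined.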
### Benchmarks

| Benchmark | Resource |
|---|---|
| TOFU | Paper📄 |
| MUSE | Paper📄 |
| WMDP | Paper📄 |
### Evaluation Metrics

| Metric | Resource |
|---|---|
| Verbatim Probability / ROUGE, simple QA-ROUGE | Naive metrics used in many papers, including MUSE and TOFU |
| Membership Inference Attacks (LOSS, ZLib, Reference, GradNorm, MinK, MinK++) | MIMIR (🐙), MUSE (📄) |
| PrivLeak | MUSE (📄) |
| Forget Quality, Truth Ratio, Model Utility | TOFU (📄) |
| Extraction Strength (ES) | Carlini et al., 2021 (📄), used for unlearning in Wang et al., 2025 (📄) |
| Exact Memorization (EM) | Tirumala et al., 2022 (📄), used for unlearning in Wang et al., 2025 (📄) |
| lm-evaluation-harness | Repository: 💻 |
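Two of the metrics in the table above are simple enough to sketch directly. Exact Memorization (EM) is the fraction of target tokens the model predicts correctly under greedy decoding, and the MinK attack scores membership by the average log-probability of the k% least likely tokens in a sequence. This is a hedged sketch with hypothetical function names; real implementations take logits/log-probs from the evaluated model:

```python
import numpy as np

def exact_memorization(logits, targets):
    """EM: fraction of positions where the greedy (argmax) prediction
    matches the ground-truth token."""
    preds = logits.argmax(axis=-1)
    return float((preds == targets).mean())

def min_k_score(token_logps, k=0.2):
    """MinK membership signal: mean log-probability of the k% lowest-probability
    tokens. Higher values suggest the sequence was seen in training."""
    n = max(1, int(k * len(token_logps)))
    lowest = np.sort(token_logps)[:n]
    return float(lowest.mean())
```

Both operate purely on per-token model outputs, which is why they are cheap to run alongside the heavier evaluation metrics listed above.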