🔗 Links and References

Links to research papers and resources corresponding to implemented features in this repository. Please feel free to fill in any missing references!

OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics, the technical report on OpenUnlearning, introduces:

  • The design, features, and other details of the framework.
  • A meta-evaluation framework for benchmarking unlearning evaluations across 450+ open-sourced models.
  • Results benchmarking 8 diverse unlearning methods in one place, using 10 evaluation metrics on TOFU.

📌 Table of Contents

  • 📗 Implemented Methods
  • 📘 Benchmarks
  • 📙 Evaluation Metrics
  • 🌐 Useful Links
📗 Implemented Methods

| Method | Resource |
|---|---|
| GradAscent, GradDiff | Naive baselines found in many papers, including MUSE, TOFU, etc. |
| NPO | Paper 📄, Code 🐙 |
| SimNPO | Paper 📄, Code 🐙 |
| IdkDPO | TOFU 📄 |
| RMU | WMDP paper (🐙, 🌐), later used in G-effect (🐙) |
| UNDIAL | Paper 📄, Code 🐙 |
| AltPO | Paper 📄, Code 🐙 |
| SatImp | Paper 📄, Code 🐙 |
| WGA (G-effect) | Paper 📄, Code 🐙 |
| CE-U (Cross-Entropy Unlearning) | Paper 📄 |
| PDU | Paper 📄 |
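To make the naive baselines in the table above concrete, here is a minimal, framework-agnostic sketch of a GradDiff-style objective: descend on the retain-set loss while ascending on the forget-set loss. This is a toy illustration in plain Python over per-example NLL values, not OpenUnlearning's actual API; the function name and the `alpha` weighting are illustrative assumptions.

```python
def grad_diff_loss(forget_nlls, retain_nlls, alpha=1.0):
    """GradDiff-style unlearning objective (illustrative sketch).

    Minimizing this value keeps the retain-set loss low while pushing
    the forget-set loss *up* (gradient ascent on the forget set).
    Setting alpha and dropping the retain term recovers plain GradAscent.
    """
    mean_forget = sum(forget_nlls) / len(forget_nlls)
    mean_retain = sum(retain_nlls) / len(retain_nlls)
    return mean_retain - alpha * mean_forget


# Toy values: the model still remembers the forget set (low NLL there),
# so the combined loss is relatively high and an optimizer would push
# the forget-set NLL upward.
loss = grad_diff_loss(forget_nlls=[0.5, 0.7], retain_nlls=[1.2, 1.0])
```

In a real implementation these NLLs would come from the model's token-level cross-entropy on forget and retain batches, and the returned scalar would be backpropagated.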

📘 Benchmarks

| Benchmark | Resource |
|---|---|
| TOFU | Paper 📄 |
| MUSE | Paper 📄 |
| WMDP | Paper 📄 |

📙 Evaluation Metrics

| Metric | Resource |
|---|---|
| Verbatim Probability / ROUGE, simple QA-ROUGE | Naive metrics found in many papers, including MUSE, TOFU, etc. |
| Membership Inference Attacks (LOSS, ZLib, Reference, GradNorm, MinK, MinK++) | MIMIR (🐙), MUSE (📄) |
| PrivLeak | MUSE (📄) |
| Forget Quality, Truth Ratio, Model Utility | TOFU (📄) |
| Extraction Strength (ES) | Carlini et al., 2021 (📄), used for unlearning in Wang et al., 2025 (📄) |
| Exact Memorization (EM) | Tirumala et al., 2022 (📄), used for unlearning in Wang et al., 2025 (📄) |
| lm-evaluation-harness | Repository 💻 |
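As a quick illustration of one membership-inference metric listed above, here is a minimal sketch of the Min-K% score: average the k fraction of lowest token log-probabilities in a sequence. Memorized (member) text tends to contain fewer very-unlikely tokens, so members score higher (closer to 0). The log-probability values below are made-up toys, and the function is an assumption for illustration, not the MIMIR or OpenUnlearning implementation.

```python
def min_k_score(token_logprobs, k=0.2):
    """Min-K% membership-inference score (illustrative sketch).

    Sorts per-token log-probabilities and averages the lowest k
    fraction; a higher (less negative) score suggests the sequence
    was seen in training.
    """
    n = max(1, int(len(token_logprobs) * k))
    lowest = sorted(token_logprobs)[:n]
    return sum(lowest) / n


# Toy sequences: a "member" sequence has uniformly likely tokens,
# a "non-member" sequence contains several very unlikely ones.
member = min_k_score([-0.1, -0.2, -0.3, -0.15, -0.25], k=0.4)
non_member = min_k_score([-2.0, -3.5, -0.3, -4.1, -0.25], k=0.4)
# member > non_member for these toy values
```

In practice the per-token log-probabilities come from the target model's forward pass, and the score is thresholded (or fed to an AUC computation) to decide membership.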

🌐 Useful Links

📚 Surveys

🐙 Other GitHub Repositories