Skip to content

IanYHWu/rc

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Reasoning Cache

This codebase accompanies our research paper: "Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL (2026)."

For inference-only (RC-decoding) code, please see inference (vLLM only).

For training code, please see training (verl required).

Artefacts

RCT-4B: HerrHruby/offline_acemath_rl_4b_inst_hard_with_dishsoap_16k_no_summ_curr_step_120

Stage I Training Set: HerrHruby/acemath_rl_4b_inst_hard

Stage II Training Set: HerrHruby/acemath_rl_4b_inst_hard_with_dishsoap

About

Public-facing codebase accompanying: "Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL"

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages