Reducing Belief Deviation in Reinforcement Learning for Active Reasoning of LLM Agents (ICLR 2026 Oral)
This repository contains the official implementation of the paper titled above.
The codebase is currently under preparation. We will make the full implementation publicly available by March 22, 2026.
Thanks for your patience and attention!