Reducing Belief Deviation in Reinforcement Learning for Active Reasoning of LLM Agents (ICLR 2026 Oral)

This repository contains the official implementation of $\mathbf{T^3}$, as described in the paper "Reducing Belief Deviation in Reinforcement Learning for Active Reasoning of LLM Agents" by Deyu Zou, Yongqiang Chen, Jianxiang Wang, Garry YANG, Mufei Li, Qing Da, James Cheng, Pan Li, and Yu Gong, which was selected for an oral presentation at ICLR 2026.

The codebase is currently under preparation. We will make the full implementation publicly available by March 22, 2026.

Thanks for your patience and attention!
