IDQL: Implicit Q-Learning as an Actor-Critic Method with Diffusion Policies
6m ago · 5 min read

Reinterpreting Implicit Q-Learning: Context and Core Thesis

At first glance the problem addressed here is familiar: offline reinforcement learning must cope with out-of-distribution actions and the brittleness they introduce. The paper reframes a wel...



























