Gwanwoo Song

Hello! I'm an MS/PhD student at Yonsei University, advised by Prof. Youngwoon Lee. My research interests include reinforcement learning and its applications to real-world problems. Currently, I'm working on offline reinforcement learning, with a focus on leveraging large-scale datasets.

Prior to this, I worked as a research intern at the LangAGI lab, conducting research on LLM-based web agents.

Email  /  CV  /  GitHub  /  Twitter

Publications

Chunk-Guided Q-Learning
Gwanwoo Song, Kwanyoung Park, Youngwoon Lee
Preprint
Project Page / Paper (Coming Soon)

We propose Chunk-Guided Q-Learning (CGQ), a novel single-step TD algorithm that mitigates bootstrapping error accumulation over long horizons. By regularizing a fine-grained single-step critic toward an action-chunked critic, CGQ reduces compounding errors while preserving precise value propagation. Our empirical results show that CGQ achieves strong performance on challenging long-horizon tasks, often outperforming both single-step and action-chunked methods.

Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation
Hyeongjoo Chae, Namyoung Kim, Kai Tzu-iunn Ong, Minju Kwak, Gwanwoo Song, Jihoon Kim, Seonghwan Kim, Dongha Lee, Jinyoung Yeo
Preprint
Paper / Code

We propose a World-model-augmented (WMA) web agent that enhances decision-making in long-horizon tasks by explicitly learning environment dynamics. To address the lack of world models in current LLMs, we introduce a transition-focused observation abstraction method that highlights key state changes using natural language. Our results demonstrate that WMA significantly improves policy selection without additional training, offering superior efficiency compared to tree-search-based agents.

Service

  • Reviewer: System-2 Reasoning Workshop, NeurIPS 2024

Design and source code from Jon Barron's website.