We study cross-task knowledge reuse in deep reinforcement learning using three complementary paradigms:
- Transfer Learning: pretrain on a source task, then adapt to a target task
- Meta-Learning: learn an initialization that adapts quickly (few-shot) to a task
- Continual Learning: learn tasks sequentially while reducing catastrophic forgetting
Environments used:
- Snake (custom)
- PuckWorld (custom)
- Pong (Atari: ALE/Pong-v5)
Plus a controlled sine-wave regression benchmark (for EWC + toy MAML sanity checks).
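This benchmark follows the standard few-shot sine regression setup: each task is `y = A * sin(x + phi)` with a task-specific amplitude and phase. A minimal sketch of such a task sampler (the sampling ranges are the common MAML-paper defaults, assumed here rather than taken from this repo):

```python
import numpy as np

def sample_sine_task(rng: np.random.Generator):
    """Sample one regression task: y = A * sin(x + phi)."""
    amplitude = rng.uniform(0.1, 5.0)  # assumed range, as in the MAML paper
    phase = rng.uniform(0.0, np.pi)

    def sample_batch(k: int):
        x = rng.uniform(-5.0, 5.0, size=(k, 1))
        y = amplitude * np.sin(x + phase)
        return x, y

    return sample_batch

# Usage: draw a 5-shot support set for one sampled task.
rng = np.random.default_rng(0)
task = sample_sine_task(rng)
x_support, y_support = task(5)
```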
Setup:
```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```
Gymnasium Atari typically needs ROM install/acceptance.
Recommended:
```bash
pip install "gymnasium[atari,accept-rom-license]"
```
Or AutoROM:
```bash
pip install "autorom[accept-rom-license]"
AutoROM --accept-license
```
Sanity check:
```bash
python -c "import gymnasium as gym; env=gym.make('ALE/Pong-v5'); env.reset(); print('Pong OK')"
```
Train PPO on one environment:
```bash
python -m bridging_tasks.transfer.train --env pong --timesteps 1000000
python -m bridging_tasks.transfer.train --env snake --timesteps 500000
python -m bridging_tasks.transfer.train --env puckworld --timesteps 500000
```
Outputs:
`outputs/transfer/<env>/<timestamp>/` containing:
- saved SB3 model (`.zip`)
- reward curve plot
- TensorBoard logs
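For the adaptation step, the usual Stable-Baselines3 pattern is to load the saved source-task policy and continue training on the target environment. A minimal sketch, assuming the saved `.zip` is a standard SB3 PPO checkpoint; the path, env id, and step count below are placeholders, and SB3 requires the target env's observation/action spaces to match the loaded policy:

```python
import gymnasium as gym
from stable_baselines3 import PPO

# Load the source-task checkpoint (placeholder path).
model = PPO.load("outputs/transfer/snake/<timestamp>/model.zip")

# Attach the target task. "PuckWorld-v0" is a placeholder id; the custom
# envs must be registered with Gymnasium, and observation/action spaces
# must match the loaded policy for SB3 to accept the swap.
model.set_env(gym.make("PuckWorld-v0"))

# Fine-tune without resetting the timestep counter, so logging continues
# from the pretraining run.
model.learn(total_timesteps=100_000, reset_num_timesteps=False)
model.save("outputs/transfer/puckworld_finetuned")
```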
The original project report frames meta-learning as MAML-style adaptation. In practice, implementing full second-order MAML in RL is expensive and brittle, so this repo includes a first-order meta-learning loop (Reptile-style) that still learns an initialization that adapts quickly.
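Concretely, the first-order loop adapts a clone of the shared initialization on a sampled task, then nudges the initialization toward the adapted weights. A minimal PyTorch sketch of one Reptile meta-step (the function names and task interface are illustrative, not this repo's API):

```python
import copy
import torch

def reptile_step(init_model, task_loss_fn, inner_steps=5,
                 inner_lr=1e-2, meta_lr=0.1):
    """One Reptile meta-update: adapt a clone on a task, then interpolate."""
    adapted = copy.deepcopy(init_model)
    opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)

    # Inner loop: a few gradient steps on the sampled task.
    for _ in range(inner_steps):
        loss = task_loss_fn(adapted)
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Outer update: move the initialization toward the adapted weights.
    with torch.no_grad():
        for p_init, p_adapt in zip(init_model.parameters(),
                                   adapted.parameters()):
            p_init += meta_lr * (p_adapt - p_init)
```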
Run:
```bash
python -m bridging_tasks.meta.train --iterations 200 --k_shots 5
```
Outputs:
`outputs/meta/<timestamp>/` (plots + init checkpoints)
Runs:
- sine transfer baselines (scratch / freeze / finetune)
- toy sine MAML
- EWC forgetting matrix (a minimal EWC sketch follows below)
```bash
python -m bridging_tasks.continual.run --all
```
Outputs:
`outputs/continual/<timestamp>/`
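For context, EWC reduces forgetting by penalizing changes to parameters that were important on earlier tasks, with importance taken from a diagonal Fisher estimate. A minimal PyTorch sketch of the penalty term (variable names are illustrative; the repo's implementation may differ):

```python
import torch

def ewc_penalty(model, fisher_diag, old_params, lam=1.0):
    """L_EWC = (lam / 2) * sum_i F_i * (theta_i - theta_i*)^2."""
    loss = torch.zeros((), device=next(model.parameters()).device)
    for name, p in model.named_parameters():
        loss = loss + (fisher_diag[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * loss

# When training on task B after task A (fisher_A and params_A recorded
# at the end of task A's training):
#   total_loss = task_b_loss + ewc_penalty(model, fisher_A, params_A, lam=100.0)
```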
MIT (See LICENSE).