Skip to content

EVIEHub/Rationality

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

22 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Rationality Measurement and Theory for Reinforcement Learning Agents

License: MIT

This repository contains empirical verification of our rationality measures and theoretical analysis. More details are in the following paper:

Kejiang Qian, Amos Storkey, Fengxiang He. Rationality Measurement and Theory for Reinforcement Learning Agents. arXiv

Empirically Testable Hypotheses

Our theory leads to the following hypotheses.

  • H1: Benefits of regularisations: layer normalisation (LN), $\ell_2$ regularisation (L2), and weight normalisation (WN), can penalise hypothesis complexity.

  • H2: Benefits of domain randomisation: improves robustness of reinforcement learning algorithms against distribution shifts across environments.

  • H3: Deficits of environment shifts: larger environment shifts lead to worse rationality.


Project Structure

Rationality/
├── src/
│   ├── env/                # Customised Taxi & CliffWalking environments
│   │   ├── taxi.py
│   │   └── cliffwalking.py
│   ├── model/              # DQN implementation
│   │   └── DQN.py
│   ├── utils/              # Logger & helper functions
│   ├── regularisers.py     # Regularisation modules
│   └── runners.py          # Training / evaluation pipeline
│
├── experiment/             # Reproduction scripts
│   ├── exp1_*_reg.sh
│   ├── exp2_*_domain_rand.sh
│   └── exp3_*_env_level.sh
│
└── train.py                # Main entry

Installation

conda create -n rationality python=3.10
conda activate rationality

pip install torch gym numpy pandas matplotlib

Quick Start

Train DQN on Taxi

python train.py \
  --env taxi \
  --episodes 2000 \
  --regulariser ln

Train on CliffWalking with domain randomisation

python train.py \
  --env cliffwalking \
  --eps_train 0.3

Experiments Reproduction

All results are available at Google Drive.

Exp1 – Regularisation

bash experiment/exp1_taxi_reg.sh
bash experiment/exp1_cliff_reg.sh

Exp2 – Domain Randomisation

bash experiment/exp2_taxi_domain_rand.sh
bash experiment/exp2_cliff_domain_rand.sh

Exp3 – Environment Level

bash experiment/exp3_taxi_env_level.sh
bash experiment/exp3_cliff_env_level.sh

Results will be saved to:

logs/{env}/{experiment}/

Citation

If you use this code in your research, please cite:

@article{qian2025rationality,
  title={Rationality Measurement and Theory for Reinforcement Learning Agents},
  author={Qian, Kejiang and Storkey, Amos and He, Fengxiang},
  year={2025}
}

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors