A multi-agent reinforcement learning experimental setup based on the Apple Deer environment.
- Python 3.7 or higher (Python 3.8+ recommended)
- See `requirements.txt` for package dependencies
Clone the repository to your local machine:

```shell
git clone https://github.com/tzkwkblab/AROB2025_Inoue.git
cd AROB2025_Inoue
```

Install the required packages using the following command:

```shell
cd AROB2025_Inoue
pip install -r requirements.txt
```

If you encounter an error related to Git LFS (Large File Storage) when installing PantheonRL, such as:

```
Error downloading object: ... This repository exceeded its LFS budget.
```

this is a known issue with the PantheonRL repository's Git LFS storage quota. To work around it, install the dependencies with the `GIT_LFS_SKIP_SMUDGE` environment variable set:

```shell
cd AROB2025_Inoue
GIT_LFS_SKIP_SMUDGE=1 pip install -r requirements.txt
```

This skips downloading Git LFS files during installation, which is sufficient for most use cases. The missing LFS files are typically needed only for specific datasets that are not required for the Apple Deer environment.
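As a quick sanity check after installation, you can verify that key packages are importable from Python. This is only a sketch: the package names listed below (`stable_baselines3`, `torch`) are assumptions, and the authoritative dependency list is `requirements.txt`.

```python
import importlib.util

def missing_packages(names):
    """Return the subset of package names Python cannot locate."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Hypothetical key packages; consult requirements.txt for the real list.
missing = missing_packages(["stable_baselines3", "torch"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All key packages found.")
```

If any package is reported missing, re-run the `pip install` step above.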
Important: Make sure you are in the `AROB2025_Inoue` directory when running the training scripts.
Execute `train_AROB.py` from the `AROB2025_Inoue` directory to start training in an environment that includes deer:

```shell
cd AROB2025_Inoue
python train_AROB.py
```

For training in an environment without deer, use `train_AROB_nodeer.py`:

```shell
cd AROB2025_Inoue
python train_AROB_nodeer.py
```

Note: If you get a `ModuleNotFoundError: No module named 'AROB2025_Inoue'` error, make sure you are running the script from inside the `AROB2025_Inoue` directory, not from the parent `AROB2025_Inoue` directory.

Note: The main difference is that `train_AROB_nodeer.py` uses the `apple_deer_nodeer` environment, which does not include deer.
Training runs for 30,000,000 steps (default setting). During training, results are saved to the following directories (relative to the parent directory of `AROB2025_Inoue`):

- Policy files: `policy/`
- Tensorboard logs: `tensorboard_log/`
To view the training progress, start Tensorboard from the parent directory of `AROB2025_Inoue`:

```shell
tensorboard --logdir=tensorboard_log/ --port=<PORT_NUMBER>
```

Replace `<PORT_NUMBER>` with an available port number (e.g., 6006). Access `http://localhost:<PORT_NUMBER>` in your browser to view the learning curves.
After training completes, policy files are saved in subdirectories under `policy/` (relative to the parent directory of `AROB2025_Inoue`). The exact path depends on the configuration in `train_AROB.py`:

- `policy_ego`: Ego agent's policy
- `policy_partner0`: Partner agent's policy
You can modify `train_AROB.py` to change the experimental settings. The following parameters can be configured:
Environment Settings:

- `x_size`, `y_size`: Environment size (e.g., 15x15)
- `max_cycles`: Maximum steps per episode (e.g., 300)
- `obs_range`: Observation range (e.g., 5)
- `tree_range`: Tree range (e.g., 1)

Agent Settings:

- `agents_dict`: Agent configuration (color, number, attack range, health, etc.)

Environment Objects:

- `deer_num`: Number of deer (e.g., 1)
- `nuts_num`: Number of nuts (e.g., 25)
- `apple_tree_num`: Number of apple trees (e.g., 1)
- `deer_health`: Deer health (e.g., 30)
- `apple_tree_health`: Apple tree health (e.g., 10)

Reward Settings:

- `apple_reward`: Reward for collecting apples (e.g., 100)
- `nut_reward`: Reward for collecting nuts (e.g., 1)
- `single_attack_reward`: Reward for a single attack (e.g., 5)
- `double_attack_reward`: Reward for a coordinated attack (e.g., 20)

Other Settings:

- `agent_respawn`: Enable agent respawn (e.g., True)
- `sequential_respawn`: Sequential respawn (e.g., True)
- `signal_visualization`: Signal visualization (e.g., False)
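For orientation, the tunable values above can be pictured as one settings mapping. This is only an illustrative sketch built from the example values listed above; the actual variable names and structure inside `train_AROB.py` may differ.

```python
# Illustrative sketch of the tunable settings, using the example values
# from this README; train_AROB.py may organize them differently.
settings = {
    # Environment
    "x_size": 15, "y_size": 15,   # environment size (15x15)
    "max_cycles": 300,            # maximum steps per episode
    "obs_range": 5,               # observation range
    "tree_range": 1,              # tree range
    # Environment objects
    "deer_num": 1,
    "nuts_num": 25,
    "apple_tree_num": 1,
    "deer_health": 30,
    "apple_tree_health": 10,
    # Rewards
    "apple_reward": 100,
    "nut_reward": 1,
    "single_attack_reward": 5,
    "double_attack_reward": 20,   # coordinated attack
    # Other
    "agent_respawn": True,
    "sequential_respawn": True,
    "signal_visualization": False,
}
```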
Training results are saved to the following directories (relative to the parent directory of `AROB2025_Inoue`). The exact subdirectory paths depend on the configuration in `train_AROB.py`:

- Policies: `policy/<experiment_name>/`
  - `policy_ego`: Ego agent's policy
  - `policy_partner0`: Partner agent's policy
- Tensorboard logs: `tensorboard_log/<experiment_name>/`
After training completes, you can load and use the policies as follows:

```python
from stable_baselines3 import PPO

# Load ego agent's policy (path is relative to parent directory of AROB2025_Inoue)
# Replace <experiment_name> with the actual experiment directory name
ego = PPO.load("policy/<experiment_name>/policy_ego")

# Load partner agent's policy
partner0 = PPO.load("policy/<experiment_name>/policy_partner0")
```

After training, you can test the trained policies and generate GIF animations using `test.py`:
```shell
cd AROB2025_Inoue
python test.py
```

This script will:

- Load trained policies from the `policy/` directory (relative to the parent directory of `AROB2025_Inoue`)
- Run test episodes with the trained agents
- Generate GIF animations showing agent behavior
- Save GIFs to the `GIF/` directory (relative to the parent directory of `AROB2025_Inoue`)
You can modify `test.py` to customize the test settings:

- `gif_num`: Number of GIFs to generate (default: 6)
- `num_steps`: Maximum number of steps per episode (default: 300)
- `policy_dir`: Directory containing saved policies (default: `policy/`)
- `gif_dir`: Directory to save GIF files (default: `GIF/`)
- Environment parameters: Same as the training script (e.g., `obs_range`, `x_size`, `y_size`, etc.)
Note: Make sure the policy files (`policy_ego`, `policy_partner0`, `policy_partner1`, etc.) exist in the `policy/` directory before running the test script.
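Before running `test.py`, you could verify that saved policies exist with a short helper. This is only a sketch: it assumes the layout described above (`policy/<experiment_name>/` containing the policy files) and that stable-baselines3 saved each policy as a `.zip` archive, which is its default save format.

```python
from pathlib import Path

def find_policies(policy_dir):
    """Return the names of saved policy archives (SB3 .zip files) under policy_dir."""
    return sorted(p.name for p in Path(policy_dir).rglob("*.zip"))

# Example usage (run from the parent directory of AROB2025_Inoue):
# print(find_policies("policy"))
```

An empty result means training has not produced any policies yet (or you are running from the wrong directory).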
```
AROB2025_Inoue/
├── apple_deer/              # Apple Deer environment implementation
│   ├── apple_deer.py        # Main environment class
│   ├── apple_deer_base.py
│   └── utils/               # Utility functions
├── apple_deer_v0.py         # Environment entry point
├── tensorboard_callback.py  # Tensorboard callback
├── train_AROB.py            # Training script (with deer)
├── train_AROB_nodeer.py     # Training script (without deer)
├── test.py                  # Test script for generating GIFs
├── requirements.txt         # Dependency list
└── README.md                # This file
```