Enacting/d4pg-pytorch

D4PG-pytorch

PyTorch implementation of Distributed Distributional Deterministic Policy Gradients (https://arxiv.org/abs/1804.08617).

[Figure: d4pg_arch]

The implementation was tested on environments from OpenAI Gym.

About

D4PG and D3PG implementations with the following features:

  • the learner, sampler, and agents run in separate processes
  • one or more exploiter agents act on the target network without action noise
  • the GPU is held only by the exploiters; all other exploration processes run on the CPU
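
The process layout in the list above can be illustrated with a minimal sketch. It is not the repository's code: it uses the standard-library multiprocessing module in place of PyTorch, a toy one-dimensional policy in place of the actor network, and hypothetical names (agent, sampler, learner, run_demo). It also assumes the Unix-only "fork" start method. Explorer agents add Gaussian noise to their actions, the exploiter does not, the sampler owns the replay buffer, and the learner only consumes batches.

```python
import multiprocessing as mp
import random


def agent(agent_id, noise_std, transitions, n_steps):
    """Collect transitions; noise_std == 0.0 makes this agent an exploiter."""
    rng = random.Random(agent_id)
    for _ in range(n_steps):
        state = rng.uniform(-1.0, 1.0)
        action = -state  # toy deterministic policy standing in for the actor
        if noise_std > 0.0:
            action += rng.gauss(0.0, noise_std)  # exploration noise
        action = max(-1.0, min(1.0, action))  # keep the action in bounds
        transitions.put((agent_id, state, action))
    transitions.put(None)  # sentinel: this agent has finished


def sampler(transitions, batches, n_agents, batch_size):
    """Store incoming transitions in a replay buffer and emit random batches."""
    buffer, finished = [], 0
    while finished < n_agents:
        item = transitions.get()
        if item is None:
            finished += 1
            continue
        buffer.append(item)
        if len(buffer) >= batch_size:
            batches.put(random.sample(buffer, batch_size))
    batches.put(None)  # sentinel: no more batches will arrive


def learner(batches, results):
    """Count batches; a real learner would update the networks here."""
    n_updates = 0
    while batches.get() is not None:
        n_updates += 1
    results.put(n_updates)


def run_demo(n_explorers=2, n_steps=8, batch_size=4, noise_std=0.3):
    """Start every role in its own process and return the learner's update count."""
    ctx = mp.get_context("fork")  # assumption: Unix-like OS
    transitions, batches, results = ctx.Queue(), ctx.Queue(), ctx.Queue()
    n_agents = n_explorers + 1  # explorers plus one exploiter
    procs = [
        ctx.Process(target=agent, args=(i, noise_std, transitions, n_steps))
        for i in range(n_explorers)
    ]
    # The exploiter is an agent with zero action noise.
    procs.append(ctx.Process(target=agent, args=(n_explorers, 0.0, transitions, n_steps)))
    procs.append(ctx.Process(target=sampler, args=(transitions, batches, n_agents, batch_size)))
    procs.append(ctx.Process(target=learner, args=(batches, results)))
    for p in procs:
        p.start()
    n_updates = results.get(timeout=60)  # read the result before joining
    for p in procs:
        p.join()
    return n_updates


if __name__ == "__main__":
    print(run_demo())
```

With the default arguments the three agents produce 24 transitions, and the sampler emits one batch per transition once the buffer holds at least 4 items, so the learner performs 21 updates. Routing all data through queues keeps the roles decoupled, which is what allows each process to be placed on a different device.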

The project was tested on Ubuntu 18.04 with a 4-core Intel i5 and an Nvidia GTX 1080 Ti.

Usage

Run python train.py --config configs/openai/d4pg/walker2d_d4pg.yml

Tests

python -m unittest discover

Results

Configs for reproducing the curves below can be found in the configs directory (number of parallel agents = 4).

OpenAI Mujoco

[Figure: d4pg_results2]

DMControl

[Figure: dmc_d4pg]

Reproduce

All results were obtained with the configs in the configs directory.

References
