This project deals with general artificial intelligence (in the spirit of general game playing). We train different models to play several simple games based on game-state data rather than visual input. The games do not need to be similar in terms of gameplay; they can be vastly different.
The project currently uses the following games, models and learning techniques:
- Alhambra (card board game)
- 2048 (simple sliding block puzzle)
- TORCS (car racing simulator)
- Mario (a well known arcade game)
- Evolution - evolving neural networks, namely multi-layer perceptrons (MLP) or echo-state networks (ESN), using:
  - Simple evolutionary algorithm
  - Evolution strategy
  - Differential evolution
- Reinforcement learning:
  - Deep Q-networks (ε-greedy policy)
  - DDPG (Deep Deterministic Policy Gradient)
The project contains three main directories:
- Controller: all the AI code
- Experiments: evaluated experiments, logs and graphs, including trained models (only small ones, to keep this repository at a reasonable size)
- Game-interfaces: interfaces on the side of the games, plus configuration files
- Download this repository
- Games Alhambra and 2048 are already included
- The game Mario can be found in a separate repository (link) and must be placed in the same directory as this general-ai project (otherwise you need to modify the `Controller/constants.py` file with the correct paths)
- The game TORCS is more complicated: you need to install it from the official website and consult its manual. Also, the `Game-interfaces/TORCS/install_directory.txt` file must contain your TORCS installation directory.
To start one of the already implemented models, look into the `Controller/controller.py` file. Then start the learning (using default parameters):

```python
game = "2048"
run_eva(game)
run_es(game)
run_de(game)
run_dqn(game)
run_ddpg(game)
```

To customize the parameters, simply head into the respective function, for example `run_eva(game)`:

```python
eva_parameters = EvolutionaryAlgorithmParameters(
    pop_size=50,
    cxpb=0.75,
    mut=("uniform", 0.1, 0.1),
    ngen=1000,
    game_batch_size=10,
    cxindpb=0.2,
    hof_size=0,
    elite=2,
    selection=("tournament", 3))

mlp = MLP(hidden_layers=[200, 200], activation="relu")
evolution = EvolutionaryAlgorithm(game, eva_parameters, mlp, logs_every=100, max_workers=4)
evolution.run()
```

The project provides a general interface for different AI architectures and games. First, let's take a look at customizing your own architecture / model.
Your class must extend the `Model` class, whose most important function is `evaluate(self, input, current_phase)`, which computes a 'forward' pass through your model for the specified input. You must also provide the function `get_number_of_parameters(self, game)` so that the architecture (e.g. evolutionary algorithm, evolution strategy, ...) knows the length of a single individual to evolve. The `get_new_instance(self, weights, game_config)` method initializes a new instance of your model, using the specified weights = parameters = single individual.
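As an illustration, a minimal custom model could be sketched as below. This is not taken from the project: the linear-policy weight layout is a hypothetical choice, and `get_number_of_parameters` reads sizes from a stored game configuration instead of the `game` argument. It only mirrors the three methods described above.

```python
import numpy as np

class LinearModel:  # in the project, this would extend the Model class
    """Hypothetical minimal model: one linear layer per game phase."""

    def __init__(self, weights=None, game_config=None):
        self.weights = weights
        self.game_config = game_config

    def get_number_of_parameters(self, game):
        # One weight matrix (inputs x outputs) per phase, flattened;
        # here read from the stored config rather than the `game` argument.
        cfg = self.game_config
        return sum(i * o for i, o in zip(cfg["input_sizes"], cfg["output_sizes"]))

    def get_new_instance(self, weights, game_config):
        # weights = parameters = a single individual from the evolution.
        return LinearModel(weights, game_config)

    def evaluate(self, input, current_phase):
        # Slice out this phase's weight matrix and apply it to the state.
        cfg = self.game_config
        sizes = [i * o for i, o in zip(cfg["input_sizes"], cfg["output_sizes"])]
        start = sum(sizes[:current_phase])
        i, o = cfg["input_sizes"][current_phase], cfg["output_sizes"][current_phase]
        w = np.array(self.weights[start:start + i * o]).reshape(i, o)
        return np.array(input).dot(w)
```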
Every game can be written in its own language; in general, it can be any executable subprocess. The communication between the games and the models relies on an interface written in `Controller/games` on the 'model' side (i.e. Python code) and in `Game-interfaces/<game>` on the side of the specific game. In general, the communication works as follows:
The AI reads the standard output of the game process and expects a string in JSON format which must contain the following:
- state: the current state of the game; an array of floats
- current_phase: the current phase of the game (games can have multiple phases; in some of the models we train a separate network for each phase); int
- score: the player's current score (in the last game step, this should contain the final score); an array (some of the games have multiple players, so this can contain a result for each player)
- reward: the current reward for the AI (used in reinforcement learning); float
- done: determines whether the game has come to an end; int (1 / 0)
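For instance, a single game-state message with these fields might look like the following (the values are purely illustrative, not taken from any of the games):

```python
import json

# A hypothetical game-state message, as the AI would read it
# from the game's standard output (one JSON string per step).
message = ('{"state": [0.0, 0.5, 1.0, 0.25], "current_phase": 0, '
           '"score": [128.0], "reward": 4.0, "done": 0}')

parsed = json.loads(message)
state = parsed["state"]        # array of floats fed to the model
done = bool(parsed["done"])    # 1/0 flag: has the game ended?
```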
The AI then performs an evaluation and writes the computed result back to the game process. The result is a simple string of floats separated by whitespace. This process repeats until the game comes to an end.
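Putting the two directions together, a driver loop over a game subprocess could be sketched like this. It is only an assumption-laden illustration of the protocol described above: the `command` line, the `model` object and the one-message-per-line framing are hypothetical, not the project's actual implementation.

```python
import json
import subprocess

def play_game(command, model):
    """Hypothetical driver: run a game subprocess and let a model play it."""
    game = subprocess.Popen(command, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, universal_newlines=True)
    scores = None
    while True:
        # Each line on stdout is assumed to be one JSON game-state message.
        msg = json.loads(game.stdout.readline())
        scores = msg["score"]
        if msg["done"]:
            break
        # Forward pass through the model; reply with whitespace-separated floats.
        result = model.evaluate(msg["state"], msg["current_phase"])
        game.stdin.write(" ".join(str(float(x)) for x in result) + "\n")
        game.stdin.flush()
    return scores
```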
In the case of 2048, there is no need to run any subprocess, because the game's code is in Python (included in the project) and the communication is direct.
Every game must also provide a configuration file (again a JSON structure) which must contain the following three variables: `game_phases` - an integer saying how many phases the game has; `input_sizes` - an array of integers saying how big the output from the game (i.e. the size of the game state) is for each phase; and `output_sizes` - saying how many outputs the AI should generate for each phase. All the games we use have a single input size across phases (even Alhambra, which has multiple phases, uses the same number of inputs in each).
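A configuration file for a hypothetical two-phase game could then look like the snippet below (field names follow the description above; the sizes are made up for illustration):

```python
import json

# Illustrative configuration for a hypothetical two-phase game:
# phase 0 emits 16 state floats and expects 4 outputs,
# phase 1 emits 16 state floats and expects 2 outputs.
config_text = """
{
    "game_phases": 2,
    "input_sizes": [16, 16],
    "output_sizes": [4, 2]
}
"""

config = json.loads(config_text)
# One entry per phase in both size arrays.
assert config["game_phases"] == len(config["input_sizes"]) == len(config["output_sizes"])
```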
On the 'Python side' of a game, you should extend the `Game` class. Take a look at some of the already implemented classes in the `Controller/games/` directory.
On the 'game side', there is basically no restriction, as long as the game satisfies the I/O communication interface.
Interfaces for every game used are provided, either here or in a separate repository.
- Python 3.5
- Numpy
- Scipy
- Matplotlib
- Sklearn
- Deap
- Gym
- TensorFlow (0.12.1)
- CUDA
- cuDNN
If you want to run all games, you'll need
- .NET Framework 4.5 (Alhambra)
- Java 8 (TORCS, Mario)
- The game 2048 is in Python (so no other language is needed)
- Most of the evolutionary experiments were run on Linux. All TensorFlow (reinforcement learning) experiments were run on Windows with a GTX 1070, as were all TORCS experiments.
- The default logging directory for new experiments is `Controller/logs`.
- Gym and Sklearn are used only for 'interface' purposes (with small changes, the project can run without these libraries).
- Learning should work without a GPU (if you install the proper tensorflow-cpu package, there is no need for CUDA and cuDNN).
- The Open Racing Car Simulator:
- Alhambra:
- Mario
- Original project
- Modified version with nice interface (and own fork here)
- 2048
- Continuous control with deep reinforcement learning [pdf]
- Playing Atari with deep reinforcement learning [pdf]
- Neural networks and deep learning [pdf]
- Adam: A method for stochastic optimization [pdf]
- The “echo state” approach to analysing and training recurrent neural networks – with an Erratum [pdf]
- The CMA evolution strategy: A tutorial [pdf]
- Evolution strategies as a scalable alternative to reinforcement learning [pdf]