MAgent is a research platform for many-agent reinforcement learning.
Unlike previous research platforms, which focus on reinforcement learning with a single agent or only a few agents,
MAgent aims to support reinforcement learning research that scales from hundreds to millions of agents.
MAgent supports Linux and OS X running Python 2.7 or Python 3.
We make no assumptions about the structure of your agents.
You can write rule-based algorithms or use deep learning frameworks.
Training the following tasks takes about 1 day on a GTX 1080 Ti card.
If out-of-memory errors occur, reduce infer_batch_size in the models.
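To see why a smaller infer_batch_size lowers peak memory, here is a minimal sketch of chunked inference. It illustrates the general idea only and is not MAgent's implementation: infer_in_batches and toy_policy are hypothetical names, and toy_policy stands in for a network forward pass.

```python
def infer_in_batches(policy_fn, observations, infer_batch_size):
    """Run policy_fn over observations in chunks of at most infer_batch_size.

    Only one chunk is passed to the policy at a time, so peak memory
    depends on the chunk size rather than on the total number of agents.
    """
    if infer_batch_size <= 0:
        raise ValueError("infer_batch_size must be positive")
    actions = []
    for start in range(0, len(observations), infer_batch_size):
        chunk = observations[start:start + infer_batch_size]
        actions.extend(policy_fn(chunk))
    return actions


def toy_policy(batch):
    # Hypothetical stand-in for a forward pass: for each observation,
    # pick the index of its largest value (first index wins on ties).
    return [max(range(len(obs)), key=obs.__getitem__) for obs in batch]


observations = [[0.1, 0.9], [0.7, 0.2], [0.3, 0.3], [0.0, 1.0], [0.5, 0.4]]
actions = infer_in_batches(toy_policy, observations, infer_batch_size=2)
print(actions)
```

The actions do not depend on the chunk size, so lowering it trades speed for memory without changing behavior.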
Note: Run the following examples from the root directory of this repo. Do not cd to examples/.
Train
These are the three examples shown in the video above.
Video files will be saved every 10 rounds. You can use render to watch them.
pursuit
python examples/train_pursuit.py --train
gathering
python examples/train_gather.py --train
battle
python examples/train_battle.py --train
Play
An interactive game to play with battle agents. You will act as a general and dispatch your soldiers.
battle game
python examples/show_battle_game.py
Baseline Algorithms
The baseline algorithms (parameter-sharing DQN, DRQN, and A2C) are implemented in TensorFlow and MXNet.
DQN performs best in our gridworld settings, where a large number of agents share parameters.
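As a rough illustration of parameter sharing, the sketch below has every agent read from and write to one value function. It is an assumption-laden toy, not the repo's baseline code: it swaps the deep network for a lookup table (tabular Q-learning rather than DQN), and SharedQTable and the one-step task are hypothetical.

```python
import random


class SharedQTable:
    """A tabular Q-function whose single set of values is used by every agent."""

    def __init__(self, n_actions, lr=0.1, gamma=0.9):
        self.n_actions = n_actions
        self.lr = lr
        self.gamma = gamma
        self.q = {}  # maps state -> list of action values

    def values(self, state):
        return self.q.setdefault(state, [0.0] * self.n_actions)

    def act(self, state, eps, rng):
        # Epsilon-greedy: explore with probability eps, otherwise act greedily.
        if rng.random() < eps:
            return rng.randrange(self.n_actions)
        vals = self.values(state)
        return max(range(self.n_actions), key=vals.__getitem__)

    def update(self, state, action, reward, next_state, done):
        # One-step Q-learning update toward the bootstrapped target.
        target = reward
        if not done:
            target += self.gamma * max(self.values(next_state))
        vals = self.values(state)
        vals[action] += self.lr * (target - vals[action])


rng = random.Random(0)
table = SharedQTable(n_actions=2)
n_agents = 100

# Hypothetical one-step task: reward 1 when the action equals the state.
for _ in range(50):
    for _agent in range(n_agents):
        state = rng.randrange(2)
        action = table.act(state, eps=0.2, rng=rng)
        reward = 1.0 if action == state else 0.0
        # Every agent updates the same table, so experience is pooled.
        table.update(state, action, reward, next_state=None, done=True)

print(table.act(0, 0.0, rng), table.act(1, 0.0, rng))
```

Because the agents are interchangeable, the number of learned values stays fixed however many agents are added, and each agent benefits from what the others have experienced.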
Acknowledgement
Many thanks to Tianqi Chen for the helpful suggestions.