Memory Gym Experiments
Memory Gym: Towards Endless Tasks to
Benchmark Memory Capabilities of Agents
Memory Gym features the environments Mortar Mayhem, Mystery Path, and Searing Spotlights. These environments benchmark an agent's ability to
- memorize events across long sequences,
- generalize,
- and be robust to noise.
Notably, these environments feature endless task variants: as the agent's policy improves, the task keeps going. The travel game "I packed my bag ..." inspired this dynamic concept, which makes it possible to examine levels of effectiveness rather than sample efficiency alone.
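The endless idea can be illustrated with a toy version of "I packed my bag ...". The following is a minimal sketch and not Memory Gym's API: the names `BoundedMemoryAgent` and `run_endless_episode` are hypothetical, and the agent is a stand-in with a fixed memory capacity rather than a trained policy. Each round appends one new item, the agent must recall the whole sequence, and the episode ends at the first mistake.

```python
import random
from collections import deque


class BoundedMemoryAgent:
    """Hypothetical stand-in agent that remembers only the last `capacity` items."""

    def __init__(self, capacity):
        self.memory = deque(maxlen=capacity)

    def observe(self, item):
        self.memory.append(item)

    def recall(self):
        return list(self.memory)


def run_endless_episode(agent, num_items=4, max_rounds=1000, seed=0):
    """Return the number of rounds completed before the first recall error."""
    rng = random.Random(seed)
    sequence = []
    for completed_rounds in range(max_rounds):
        # The task grows by one item per round, like adding an item to the bag.
        item = rng.randrange(num_items)
        sequence.append(item)
        agent.observe(item)
        # The agent has to reproduce the entire sequence so far.
        if agent.recall() != sequence:
            return completed_rounds
    # max_rounds is only a safety cap for this sketch.
    return max_rounds


for capacity in (5, 20, 80):
    rounds = run_endless_episode(BoundedMemoryAgent(capacity))
    print(f"capacity={capacity}: completed {rounds} rounds")
```

In this toy setting the score is the number of completed rounds, so a stronger memory yields a longer episode instead of merely reaching a fixed goal sooner.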
GitHub
Citation
@article{JMLR:v26:24-0043,
title={Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents},
author={Marco Pleines and Matthias Pallasch and Frank Zimmer and Mike Preuss},
journal={Journal of Machine Learning Research},
year={2025},
volume={26},
number={6},
pages={1--40},
url={https://www.jmlr.org/papers/v26/24-0043.html}
}
Interactive Result Videos of Trained Agent Behaviors
For each environment, we cherry-picked GRU and TrXL agent behaviors for visualization.
For every chosen agent, three episodes were rendered using level seeds not seen during training.
Clicking on distinct data points in line plots or attention scores will take you to the corresponding episode step.
Each of the upcoming result pages features various visualization tools.
- Videos
- Ground Truth
- Agent Observation
- Observation Reconstruction (optional)
- Ground Truth Estimation (optional)
- Action Distribution Bar Chart (red bars denote the chosen actions)
- Line Plots
- State-Value
- Entropy
- Top Attention Scores (only TrXL)
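As a reading aid for the Entropy line plot, the sketch below computes the Shannon entropy of a single discrete action distribution, such as the one shown in the action distribution bar chart. This page does not state how the plotted entropy is computed, so the function `action_entropy` and the choice of natural logarithm are assumptions made for illustration.

```python
import math


def action_entropy(probs):
    """Shannon entropy (in nats) of one discrete action distribution."""
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Normalize defensively and skip zero entries, since 0 * log(0) is taken as 0.
    return -sum((p / total) * math.log(p / total) for p in probs if p > 0)


uniform = [0.25, 0.25, 0.25, 0.25]   # undecided policy: maximal entropy
confident = [0.97, 0.01, 0.01, 0.01]  # near-deterministic policy: low entropy
print(action_entropy(uniform), action_entropy(confident))
```

Under this reading, high values mark steps where the agent spreads probability across many actions, and values near zero mark steps where it commits to one action.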