Discovering and Achieving Goals via World Models
Neural Information Processing Systems (NeurIPS), 2021
Unsupervised RL workshop at ICML, 2021. Oral Talk
Self-Supervised Learning workshop at ICML, 2021. Oral Talk


How can artificial agents learn to solve a wide range of tasks in complex visual environments in the absence of external supervision? We decompose this question into two problems: global exploration of the environment, and learning to reliably reach situations found during exploration. We introduce the Latent Explorer Achiever (LEXA), a unified solution to both that learns a world model from high-dimensional image inputs and uses it to train an explorer and an achiever policy on imagined trajectories. Unlike prior methods that explore by reaching previously visited states, the explorer plans through foresight to discover unseen, surprising states, which are then used as diverse targets for the achiever. After the unsupervised phase, LEXA solves tasks specified as goal images zero-shot, without any additional learning. We introduce a challenging benchmark spanning four standard robotic manipulation and locomotion domains with a total of over 40 test tasks. LEXA substantially outperforms previous approaches to unsupervised goal reaching, achieving goals that require interacting with multiple objects in sequence. Finally, to demonstrate the scalability and generality of LEXA, we train a single general agent across all four distinct environments.
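To make the two-policy structure concrete, here is a minimal, runnable sketch of the training loop the abstract describes. Everything in it (the 1-D "latent", the toy policies, the novelty stand-in) is a hypothetical simplification of our choosing, not the authors' TensorFlow implementation; only the control flow mirrors the method: collect real experience with the explorer, fit the world model on replayed data, and train both policies on trajectories imagined inside the model.

```python
# Toy sketch of the LEXA training loop; all components are stand-ins.
import random

class WorldModel:
    """Stands in for a latent dynamics model learned from images."""
    def encode(self, obs):
        return obs  # stand-in: identity "latent" for a 1-D toy state
    def imagine(self, start, policy, horizon=15):
        # Roll the policy forward inside the model (here, a toy chain).
        traj, z = [], start
        for _ in range(horizon):
            z = z + policy(z)
            traj.append(z)
        return traj
    def train(self, replay):
        pass  # would fit dynamics/reconstruction on replayed sequences

def explorer_policy(z):
    return random.uniform(-1.0, 1.0)   # trained to seek novelty in practice

def achiever_policy(z, goal):
    return 0.1 * (goal - z)            # moves the latent toward the goal

def novelty(z):
    return abs(z)                      # stand-in for model uncertainty

model, replay, obs = WorldModel(), [], 0.0
for step in range(1000):
    # 1. Collect real experience with the explorer.
    obs += explorer_policy(model.encode(obs))
    replay.append(obs)
    # 2. Train the world model on replayed experience.
    model.train(replay)
    # 3. Train the explorer in imagination to reach novel, surprising states.
    imagined = model.imagine(model.encode(obs), explorer_policy)
    explorer_return = sum(novelty(z) for z in imagined)
    # 4. Train the achiever in imagination on goals drawn from experience.
    goal = model.encode(random.choice(replay))
    imagined = model.imagine(model.encode(obs),
                             lambda z: achiever_policy(z, goal))
    achiever_return = -abs(imagined[-1] - goal)
```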
Discovering and Achieving Goals
LEXA explores the world and learns to reach arbitrary goal images purely from pixels, without any form of supervision. After the unsupervised interaction phase, LEXA solves complex tasks by reaching user-specified goal images.
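A companion sketch of the zero-shot phase, under the same toy assumptions as above (1-D latent, hypothetical encoder and environment, not the released code): the goal image is encoded once and the frozen achiever acts toward it, with no further learning.

```python
# Toy sketch of zero-shot goal-image reaching at test time.
def encode(image):
    return float(image)              # stand-in image encoder

def achiever(z, goal_z):
    return 0.2 * (goal_z - z)        # frozen goal-conditioned policy

class ToyEnv:
    def __init__(self):
        self.state = 0.0
    def reset(self):
        return self.state
    def step(self, action):
        self.state += action
        return self.state

env, goal_image = ToyEnv(), 1.0      # hypothetical user-provided goal
goal_z, obs = encode(goal_image), env.reset()
for _ in range(50):                  # act; no learning happens here
    obs = env.step(achiever(encode(obs), goal_z))
print(abs(encode(obs) - goal_z) < 1e-2)   # True: goal image reached
```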

Discovering Goals through Foresight
Prior work used retrospective exploration, searching the replay buffer for the most interesting previously visited states. Instead, we explore with foresight, querying the world model for the most novel states it can imagine. This lets LEXA explore beyond the frontier of visited states and significantly improves exploration performance.
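The contrast between the two schemes can be sketched in a few lines. The ensemble-disagreement novelty measure and the random-walk "imagination" below are toy stand-ins of our choosing; what the sketch shows is where each scheme looks for goals: retrospective picks the most novel state already in the buffer, while foresight picks the most novel state among imagined rollouts beyond it.

```python
# Toy contrast: retrospective vs. foresight goal discovery.
import random

def novelty(state, ensemble):
    # Disagreement of an ensemble of one-step models about a state.
    preds = [m(state) for m in ensemble]
    mean = sum(preds) / len(preds)
    return sum((p - mean) ** 2 for p in preds) / len(preds)

ensemble = [lambda s, k=k: s * k for k in (0.9, 1.0, 1.1)]
replay = [random.gauss(0.0, 1.0) for _ in range(100)]

# Retrospective: the most novel state already visited.
retro_goal = max(replay, key=lambda s: novelty(s, ensemble))

def imagine(start, steps=5):
    # Stand-in latent rollout beyond the visited frontier.
    states, s = [], start
    for _ in range(steps):
        s = s + random.gauss(0.0, 1.0)
        states.append(s)
    return states

# Foresight: the most novel state among imagined rollouts.
imagined = [s for start in random.sample(replay, 10)
            for s in imagine(start)]
foresight_goal = max(imagined, key=lambda s: novelty(s, ensemble))
```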
Goal Reaching Benchmark

Zero-shot evaluation across four domains: Kitchen, Bins, RoboYoga Walker, and RoboYoga Quadruped.
Source Code
Try our code for LEXA in TensorFlow V2, as well as our new goal-conditioned RL benchmark!
Previous Works
Sekar*, Rybkin*, Daniilidis, Abbeel, Hafner, Pathak. Planning to Explore via Self-Supervised World Models. In ICML 2020. [website]
Hafner, Lillicrap, Ba, Norouzi. Dream to Control: Learning Behaviors by Latent Imagination. In ICLR 2020. [website]
Pathak, Gandhi, Gupta. Self-Supervised Exploration via Disagreement. In ICML 2019. [website]
Citation
@inproceedings{lexa2021,
  title={Discovering and Achieving Goals via World Models},
  author={Mendonca, Russell and Rybkin, Oleh and
          Daniilidis, Kostas and Hafner, Danijar and Pathak, Deepak},
  booktitle={NeurIPS},
  year={2021}
}