MATLAB implementation of the paper *Temporal Logic Imitation: Learning Plan-Satisficing Motion Policies from Demonstrations* (CoRL 2022, Oral). Python implementation to come!
## Clone this repo

```
git clone https://github.com/yanweiw/tli.git
cd tli
```
## Install dependencies

Install MATLAB, and the first time around compile the third-party software by running

```
setup_code 1
```

If the compilation runs into issues (e.g., `Error using mex` when installing lightspeed), you will most likely need to update the link to the compilers on your system in `tli/ds-libraries/thirdparty/lightspeed/install_lightspeed.m`.
Once the compilation succeeds, run the following to initialize the MATLAB workspace every time MATLAB is restarted:

```
setup_code
```
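If `mex` cannot find a compiler, checking MATLAB's compiler configuration is a reasonable first step. The following are standard MATLAB commands plus the repo's setup script, not a guaranteed fix:

```matlab
% Show/select the C compiler that mex is configured to use
mex -setup C
% Then recompile the third-party libraries and reinitialize the workspace
setup_code 1
setup_code
```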
## Reproducing multi-modal replanning videos
Successful policy rollouts in the following toy example require going through the white, yellow, pink, and green regions (modes) consecutively. Leaving any colored region due to perturbations (an invariance failure, shown by streamlines flowing out of a colored region) requires restarting in the white region. An imitation policy can get stuck switching between modes if it is not modulated by encountered invariance failures, while modulating the mode-based policy with cutting planes prevents future invariance failures.
| LTL-DS without modulation | LTL-DS with modulation |
|---|---|
| ![]() | ![]() |
- Go to `dsltl.m` and run the very first section with `seed=0`, corresponding to the existing experiment in `experiments/scoop_seed_00`.
- Run the section `Load DS`, which loads the pre-learned DS. Six figures will show: the original demos, the original unsegmented demos overlaid with the corresponding learned DS, the demos after segmentation, and the individual DS corresponding to each segmented demo.
- Run the section `Simulation`. For the multi-modal nominal policy, set `opt_sim.if_mod=0`. For the multi-modal modulated policy, set `opt_sim.if_mod=1`. To perturb the policy, drag your mouse around. (The relevant toggles are sketched below.)
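For reference, a minimal sketch of the toggles the steps above refer to; the variable names come from the steps themselves, but the exact placement within `dsltl.m` is an assumption:

```matlab
% Near the top of dsltl.m: select the pre-recorded experiment
seed = 0;               % corresponds to experiments/scoop_seed_00

% In the Simulation section: choose the rollout variant
opt_sim.if_mod = 0;     % nominal multi-modal policy (no modulation)
% opt_sim.if_mod = 1;   % modulated policy: cutting planes block invariance failures
```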
## Generating new data
- Go to `dsltl.m`, set `seed` to a different number than the existing trials in the `experiments` folder, and update `do_erase_exp=1` (example settings are sketched below).
- Run the first two sections (up to `Draw 2D Dataset with GUI`) and a GUI will pop up. Drag your mouse to draw one or multiple demonstrations and click the `Store Data` button at the end.
- Run the sections `Segment Data` and `Run Learning`; two images will show up: the unsegmented data (now shifted such that all trajectories end in the same attractor) and the data segmented according to the AP regions. These are the processed data used for learning.
- Run the section `Load DS`, and the learning results for both the unsegmented and segmented data will appear.
- Run the section `Simulation`. For the multi-modal nominal policy, set `opt_sim.if_mod=0`. For the multi-modal modulated policy, set `opt_sim.if_mod=1`. To perturb the policy, drag your mouse around.
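For example, the settings for a fresh trial might look like this (the seed value `7` is just a hypothetical unused trial number):

```matlab
% Near the top of dsltl.m: start a fresh trial
seed = 7;            % hypothetical: any number not already used in experiments/
do_erase_exp = 1;    % erase/initialize the experiment folder for this seed
```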
## Reproducing single mode BC vs DS videos
Given successful demonstrations of reaching a goal location (in red), a learned BC policy does not guarantee goal-reaching, due to spurious attractors or diverging flows in regions without data coverage, while a learned DS policy guarantees goal-reaching due to its global stability at the attractor.
| BC policy | DS policy |
|---|---|
| ![]() | ![]() |
- To interact with a pre-trained BC/DS policy, run `bc_vs_ds('bc'/'ds')` in the MATLAB command window.
- To learn a DS from the newly generated data, follow the previous section.
- To learn a BC policy from the newly generated data, run `learn_bc_interactive.m` after setting the intended `experiment` folder.
- To interact with the newly learned BC/DS policy, run `bc_vs_ds('bc'/'ds', 'new_experiment_name')` (see the usage sketch below).
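Put together, the command-window calls look like this (`'my_new_trial'` is a hypothetical stand-in for whatever experiment folder name you used):

```matlab
bc_vs_ds('bc')                    % interact with the pre-trained BC policy
bc_vs_ds('ds')                    % interact with the pre-trained DS policy
bc_vs_ds('ds', 'my_new_trial')    % hypothetical: a newly learned experiment
```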
## Reproducing single mode analysis videos
| Boundary estimation by cutting planes |
|---|
| ![]() |
- To generate the video above, go to `single_mode.m`, set `seed` to 0 to locate the pre-trained experiment folder (sketched below), and run the first section.
- Skip the drawing, DS learning, and BC learning sections, and run `Load DS` and `Load BC` to load the pre-trained policies.
- Run `Simulate` and `Interactive Simulation` to interact with either the BC or DS policy.
- To generate new convex shapes and collect new data, run `Draw 2D Dataset with GUI`, `Load Drawn Data, Learn and Save DS`, and `Load Drawn Data, Learn and Save BC`.
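A minimal sketch of the relevant setting (the section names are those listed above; the placement within `single_mode.m` is an assumption):

```matlab
% Near the top of single_mode.m: point at the pre-trained experiment
seed = 0;   % locates the pre-trained single-mode experiment folder
% Then run, in order: the first section, Load DS, Load BC,
% and Simulate / Interactive Simulation to interact with the policies
```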
| DS rollouts under OOD perturbations | Modulated DS rollouts under OOD perturbations |
|---|---|
| ![]() | ![]() |
A DS learned from successful demonstrations (in black) exhibits only invariance failures (red) when perturbed to out-of-distribution states. A DS modulated by cutting planes learned from past invariance failures guarantees successful rollouts (blue) that are both goal-reaching (they arrive at the orange circle) and mode-invariant (they stay inside the mode boundary).
| BC rollouts under OOD perturbations | Modulated BC rollouts under OOD perturbations |
|---|---|
| ![]() | ![]() |
A BC policy learned from successful demonstrations (in black) exhibits both invariance and reachability failures (red) when perturbed to out-of-distribution states. A BC policy modulated by cutting planes is cured of invariance failures but still suffers from reachability failures (red), where the rollouts get stuck in spurious attractors.
- To generate the above analysis, run the section `Batch Simulate` in `single_mode.m` for the unmodulated results of BC vs DS (the analysis loop is sketched below).
- To generate the modulated results of BC vs DS, run the section `Batch Simulate with Modulation`.
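Conceptually, the batch analysis rolls a policy out from many perturbed start states and tallies the two failure types. A hedged sketch of that loop follows; it is not the repo's actual code, and every function name in it is a hypothetical placeholder:

```matlab
% Hypothetical sketch of a batch rollout analysis (not the repo's actual code)
n_rollouts = 100;
n_invariance = 0;     % rollouts that leave the mode boundary
n_reachability = 0;   % rollouts that never reach the attractor
for i = 1:n_rollouts
    x0   = sample_ood_state();      % hypothetical: perturbed initial state
    traj = rollout_policy(x0);      % hypothetical: integrate the DS/BC policy
    if leaves_mode_boundary(traj)
        n_invariance = n_invariance + 1;      % invariance failure (red)
    elseif ~reaches_attractor(traj)
        n_reachability = n_reachability + 1;  % reachability failure (red)
    end
end
fprintf('invariance: %d, reachability: %d\n', n_invariance, n_reachability);
```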
## Translating LTL formulae to a replannable automaton
- Install Spot.
- We show here how an LTL formula written in the Spot format can be translated into an operational automaton that replans a discrete mode sequence based on sensor measurements in Python.
- For the scooping task with mode transitions `a->b->c->d`, run the default formula with `python plan_from_ltl.py -s 0`. This program simulates perturbations at the discrete task level. Its output format, `--{curr_mode}_{sensor_measurement}->{next_mode}--`, shows how, given a sensor measurement (e.g., `100` indicates sensor `r` is `True`, sensor `s` is `False`, and sensor `t` is `False`), the planner advances from the current mode to the next mode (an example line is decoded below). Giving different random seeds with the option `-s` results in different perturbation patterns.
- To try different formulas, consider the options `-f`, `-i`, and `-p` as documented in the Python file.
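For instance, following the documented output format and the `a->b->c->d` mode sequence, a hypothetical line `--a_100->b--` would read: while in mode `a`, the planner received the measurement `100` (sensor `r` `True`, sensors `s` and `t` `False`) and advanced to mode `b`.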








