Synchrony and Synergy in Couple Dance
Synergy and Synchrony in Couple Dances
Vongani H. Maluleke1
Lea Müller1
Jathushan Rajasegaran1
Georgios Pavlakos2
Shiry Ginosar1 Angjoo Kanazawa1 Jitendra Malik1


Paper | Data | Code (coming soon)
To what extent does Bob’s behavior affect Alice’s behavior? We study this question in a couple’s dance, an example of full-body dyadic physical social interaction. We predict the full-body motion of a dancer, Alice (yellow), given her own past motion and the motion of her partner, Bob (blue).
Abstract
This paper asks to what extent social interaction influences one's behavior. We study this in the setting of couples swing dancing. We first consider a baseline in which we predict a dancer's future moves conditioned only on their past motion without regard to their partner. We then investigate the advantage of taking social information into account by conditioning also on the motion of their dancing partner. We focus our analysis on Swing, a dance genre with tight physical coupling for which we present an in-the-wild video dataset. We demonstrate that single-person future motion prediction in this context is challenging. Instead, we observe that prediction greatly benefits from considering the interaction partners' behavior, resulting in surprisingly compelling couple dance synthesis results. Our contributions are a demonstration of the advantages of socially conditioned future motion prediction and an in-the-wild, couple dance video dataset to enable future research in this direction.
Method: Synergy and Synchrony
TLDR: We predict the future motion of a dancer, Alice, given her past motion and the motion of her partner, Bob. We show that considering Bob's motion significantly improves the prediction of Alice's future motion. Given tokens generated by a motion VQ-VAE, we train a decoder-only transformer autoregressively to predict a set of next tokens in the motion sequence. We show that the dyadic model outperforms the unary model in terms of prediction quality and diversity.
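The next-token objective mentioned above can be illustrated with a small sketch. It assumes the standard cross-entropy (negative log-likelihood) loss commonly used for autoregressive token models; the page does not state the exact loss, and the function names `softmax` and `next_token_nll` are hypothetical.

```python
import math


def softmax(logits):
    """Turn raw scores over codebook indices into probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_nll(logits_per_step, targets):
    """Mean negative log-likelihood of the true next token at each step.

    logits_per_step[i] holds the model's scores for the token at step i + 1;
    targets[i] is the ground-truth codebook index at that step.
    Assumption: a plain cross-entropy objective, not confirmed by the source.
    """
    losses = [
        -math.log(softmax(logits)[target])
        for logits, target in zip(logits_per_step, targets)
    ]
    return sum(losses) / len(losses)


# A model that is uninformed over 4 codebook entries versus a confident one.
uniform_loss = next_token_nll([[0.0, 0.0, 0.0, 0.0]], [2])
confident_loss = next_token_nll([[0.0, 0.0, 5.0, 0.0]], [2])
```

With uniform scores the loss equals the log of the codebook size, and it shrinks as the model puts more mass on the correct index.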
Test-time autoregressive prediction in the unary (left) and dyadic (right) tasks. In the unary prediction task, we start from Alice's past motion up to time \( t_\pi \), and predict the next time step in each iteration. In the dyadic prediction task, we start from Alice and Bob's motion up to time \( t_\pi \). In each step, we predict Alice's next token from her past ground truth motion until \( t_\pi \), her predicted motion for \( t > t_\pi \), and Bob's past ground truth motion.
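The rollout described in this caption can be sketched as a simple loop. This is a minimal illustration, not the authors' code: `rollout` and `toy_predictor` are hypothetical names, each time step is reduced to a single integer token, and the toy predictor stands in for the trained transformer.

```python
from typing import Callable, List, Optional, Sequence


def rollout(
    predict_next: Callable[[Sequence[int], Optional[Sequence[int]]], int],
    alice_past: Sequence[int],
    steps: int,
    bob_motion: Optional[Sequence[int]] = None,
) -> List[int]:
    """Autoregressively predict `steps` future tokens for Alice.

    Unary task: bob_motion is None, so only Alice's history is used.
    Dyadic task: Bob's ground-truth tokens strictly before the step
    being predicted are passed as extra context.
    """
    alice = list(alice_past)  # ground truth up to t_pi
    t_pi = len(alice)
    for i in range(steps):
        t = t_pi + i  # index of the token being predicted
        bob_context = None if bob_motion is None else bob_motion[:t]
        # Alice's own predictions are fed back in on later iterations.
        alice.append(predict_next(alice, bob_context))
    return alice[t_pi:]


def toy_predictor(alice, bob):
    """Stand-in model: count upward alone, or mirror Bob's latest token."""
    if bob is None:
        return (alice[-1] + 1) % 8
    return bob[-1]


unary = rollout(toy_predictor, [0, 1, 2], steps=3)
dyadic = rollout(toy_predictor, [0, 1, 2], steps=3, bob_motion=[5, 6, 7, 3, 4, 2])
```

The only difference between the two tasks in this sketch is whether Bob's ground-truth history is supplied as context; Alice's future is always her own predictions.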

Illustration of the VQ-VAEs. We learn three separate codebooks, one for each body model parameter. An encoder, E·, maps the body parameter to the codebook, Ζ·. The decoder, D·, brings the codebook latent vectors back into body model parameter space. To obtain 3D meshes, we jointly pass the parameters through the body model function.
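The quantization step at the heart of a VQ-VAE can be sketched as a nearest-neighbour lookup in a codebook. The sketch below assumes the usual squared-Euclidean nearest-entry rule and omits the learned encoder and decoder entirely; the codebook values, latent dimensions, and the names `quantize`, `codebooks`, and `latents` are made up for illustration.

```python
def quantize(latent, codebook):
    """Return the index of the codebook vector closest to `latent`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda k: sq_dist(latent, codebook[k]))


# One toy codebook per body model parameter (sizes are illustrative only).
codebooks = {
    "body_pose": [[0.0, 0.0], [1.0, 1.0]],
    "orientation": [[0.0], [1.0], [2.0]],
    "translation": [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]],
}

# Hypothetical encoder outputs for a single time step.
latents = {
    "body_pose": [0.9, 0.8],
    "orientation": [1.4],
    "translation": [0.1, 0.0, 0.2],
}

# Each parameter is tokenized independently with its own codebook.
tokens = {name: quantize(latents[name], codebooks[name]) for name in codebooks}

# Decoding starts from the selected codebook vectors.
quantized = {name: codebooks[name][tokens[name]] for name in codebooks}
```

Keeping a separate codebook per parameter means each time step yields three discrete indices rather than one.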

Transformer training procedure in the dyadic case. The major part of our network is a transformer-decoder block with causal masking, such that Alice and Bob can only attend to their past motion. The inputs to our model are Alice and Bob's codebook indices for body pose, Θ, orientation, Φ, and translation, Γ. We embed the tokens into a latent space and add time, person, and parameter encodings. The final layer in our network generates probability scores over codebook indices, representing the likelihood of an index being the next motion token \( s_{t_\pi + 1} \).
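One way to picture the transformer input is as a flat sequence in which every token carries a time, person, and parameter index, combined with a causal mask. The sketch below is an assumption-laden illustration: the interleaving order, the helper names `build_sequence` and `causal_mask`, and the use of a standard lower-triangular mask are not taken from the source.

```python
PEOPLE = ("alice", "bob")
PARAMS = ("pose", "orientation", "translation")


def build_sequence(tokens):
    """Flatten per-person, per-parameter token streams into one sequence.

    tokens[person][param] is a list of codebook indices, one per time step.
    Each entry keeps the indices that would select its time, person, and
    parameter encodings. The ordering (time, then person, then parameter)
    is an assumption made for this sketch.
    """
    num_steps = len(tokens["alice"]["pose"])
    sequence = []
    for t in range(num_steps):
        for person_id, person in enumerate(PEOPLE):
            for param_id, param in enumerate(PARAMS):
                sequence.append({
                    "token": tokens[person][param][t],
                    "time": t,
                    "person": person_id,
                    "param": param_id,
                })
    return sequence


def causal_mask(length):
    """mask[i][j] is True when position i may attend to position j."""
    return [[j <= i for j in range(length)] for i in range(length)]


example = {
    "alice": {"pose": [4, 7], "orientation": [1, 1], "translation": [0, 2]},
    "bob": {"pose": [3, 3], "orientation": [2, 0], "translation": [5, 5]},
}
sequence = build_sequence(example)
mask = causal_mask(len(sequence))
```

In a full model, each entry's token, time, person, and parameter indices would look up learned embeddings that are summed before entering the transformer.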
Swing Dance Dataset
We present a dataset of couple dances in the Swing genre collected from YouTube. The data contains SMPL parameters (translation, global orientation, and body pose) for the people detected in the videos. We cleaned the data by removing videos in which only one person or more than two people were detected. The dataset contains 30 hours of footage from 680 unique videos, with the associated SMPL parameters generated by SLAHMR.
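The cleaning rule described above amounts to keeping only videos with exactly two detected people. A minimal sketch follows, assuming a hypothetical in-memory layout in which each video ID maps to a list of per-person tracks; the released file format and the field names used here are not specified on this page.

```python
def keep_couple_videos(detections):
    """Keep only videos in which exactly two people were detected.

    `detections` maps a video ID to a list of per-person tracks.
    This layout is an assumption made for illustration.
    """
    return {
        video_id: tracks
        for video_id, tracks in detections.items()
        if len(tracks) == 2
    }


# Toy records: each track stands in for one person's SMPL parameters.
detections = {
    "video_a": [{"person": 0}, {"person": 1}],                 # a couple: kept
    "video_b": [{"person": 0}],                                # one person: dropped
    "video_c": [{"person": 0}, {"person": 1}, {"person": 2}],  # three: dropped
}
cleaned = keep_couple_videos(detections)
```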
More Data Examples
Results
Dyadic Motion Prediction Results: Our method can predict realistic and plausible future full-body motion for Alice, conditioned on the past motion of both Alice and her dance partner, Bob.
Unary Motion Prediction Results: Our method can also predict Alice’s future motion conditioned on only Alice’s past motion.
More Results:
Citation
Acknowledgements
We thank Evonne Ng and members of the BAIR community for helpful discussions. This work was supported by BAIR/BDD sponsors, ONR MURI (N00014-21-1-2801), and the DARPA MCS program. This webpage template was borrowed from some colorful folks.