We quantize camera trajectories to sequences of tokens and adopt a GPT-based architecture to generate the tokens autoregressively. Learning trajectory and language jointly, CineGPT is capable of text-conditioned trajectory generation.
Code and model weights for CineGPT coming soon...
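To make the idea of turning a camera trajectory into a token sequence concrete, here is a minimal sketch. It is not the CineGPT tokenizer, which has not been released. It assumes each camera pose is a flat vector of 6 values already normalized to [-1, 1], and it uses plain uniform binning. The names `NUM_BINS`, `quantize`, `dequantize`, `trajectory_to_tokens`, and `tokens_to_trajectory` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

NUM_BINS = 256          # hypothetical vocabulary size per pose component
LOW, HIGH = -1.0, 1.0   # hypothetical normalized value range


def quantize(value: float) -> int:
    """Map a float in [LOW, HIGH] to a bin index in [0, NUM_BINS - 1]."""
    clipped = min(max(value, LOW), HIGH)
    scaled = (clipped - LOW) / (HIGH - LOW)  # now in [0, 1]
    # The upper bound HIGH would land on NUM_BINS, so clamp it into the last bin.
    return min(int(scaled * NUM_BINS), NUM_BINS - 1)


def dequantize(token: int) -> float:
    """Map a bin index back to the centre of its bin."""
    return LOW + (token + 0.5) * (HIGH - LOW) / NUM_BINS


def trajectory_to_tokens(trajectory: Sequence[Sequence[float]]) -> List[int]:
    """Flatten a list of poses into one token sequence, pose by pose."""
    return [quantize(v) for pose in trajectory for v in pose]


def tokens_to_trajectory(tokens: Sequence[int], pose_dim: int = 6) -> List[List[float]]:
    """Invert trajectory_to_tokens, up to the quantization error."""
    if len(tokens) % pose_dim != 0:
        raise ValueError("token count must be a multiple of pose_dim")
    values = [dequantize(t) for t in tokens]
    return [values[i:i + pose_dim] for i in range(0, len(values), pose_dim)]
```

A sequence produced this way is the kind of discrete input an autoregressive model can be trained on. The round trip through `tokens_to_trajectory` recovers each value to within half a bin width.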
Anchor Determinator
Given a prompt describing the image rendered from an anchor point, the anchor selector chooses the best matching input image. An anchor refinement procedure further fine-tunes the anchor position.
Code and model weights for Anchor Determinator coming soon...
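The selection step described above can be illustrated with a small sketch. It is not the released Anchor Determinator. It assumes the text prompt and every input image have already been embedded into a shared vector space by some joint text-image encoder (a CLIP-style model would be one option), and it picks the image whose embedding has the highest cosine similarity to the prompt. The function names `cosine_similarity` and `select_anchor` are hypothetical, and the refinement step is not covered.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_anchor(
    text_embedding: Sequence[float],
    image_embeddings: Sequence[Sequence[float]],
) -> int:
    """Return the index of the input image that best matches the prompt.

    Assumption: embeddings are precomputed by a shared text-image encoder.
    """
    if not image_embeddings:
        raise ValueError("at least one image embedding is required")
    scores = [cosine_similarity(text_embedding, e) for e in image_embeddings]
    # Index of the highest score; ties resolve to the earliest image.
    return max(range(len(scores)), key=scores.__getitem__)
```

The camera pose of the chosen image would then serve as the initial anchor that a refinement procedure adjusts.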
LLM Prompt
Through this prompt, we provide the LLM with detailed instructions and guidelines for tool usage to achieve the target. We include a template and examples for the LLM's responses. Check out the user-agent conversation example.
Citation
If you find ChatCam useful in your research, please consider citing:
@article{liu2024chatcam,
title={ChatCam: Empowering Camera Control through Conversational AI},
author={Liu, Xinhang and Tai, Yu-Wing and Tang, Chi-Keung},
journal={arXiv preprint arXiv:2409.17331},
year={2024}
}
About
This repository contains the implementation of the paper: "ChatCam: Empowering Camera Control through Conversational AI", NeurIPS 2024.