This repository contains the code for HAMSTER, a system for open-world robot manipulation using vision-language models.
This project depends on the following external repositories and resources:
- VILA Repository
  - Source: NVlabs/VILA
  - Required Commit: a5a380d6d09762d6f3fd0443aac6b475fba84f7e
  - Usage: Used as a base for the vision-language model implementation
  - Note: This is an external dependency and should be cloned separately
- Model Checkpoint
  - Source: yili18/Hamster_dev
  - Usage: Contains the trained model weights
  - Note: This is downloaded automatically during setup
.
├── server.py # Custom server implementation
├── setup_server.sh # Setup and launch script
├── gradio_server_example.py # Example Gradio interface
├── ip_eth0.txt # Stores the server IP address
└── VILA/ # External VILA repository (not included)
    └── ...
- Clone this repository:

      git clone <your-repo-url>
      cd <your-repo-name>

- Clone the VILA repository and check out the specific commit:

      git clone https://github.com/NVlabs/VILA.git
      cd VILA
      git checkout a5a380d6d09762d6f3fd0443aac6b475fba84f7e
      cd ..

- Set up the VILA environment:

      cd VILA
      ./environment_setup.sh vila
      conda activate vila
      cd ..

- Install additional packages for the Gradio interface:

      pip install gradio openai opencv-python matplotlib numpy

- Make sure you're in the VILA environment:

      conda activate vila

- Run the setup script to start the server:

      ./setup_server.sh

This will:
- Download the model checkpoint from Hugging Face
- Save the server IP address to ip_eth0.txt
- Set up the server with the correct configuration
- Start the server on port 8000
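
Because the setup depends on one exact VILA commit, it can help to confirm the checkout before launching the server. The following is a minimal sketch, not part of this repository: the helper names `commit_matches` and `vila_head` are hypothetical, and it assumes `git` is on the PATH and that VILA was cloned into a `VILA/` directory next to this README.

```python
import subprocess

# Commit pinned in the setup steps above.
PINNED_VILA_COMMIT = "a5a380d6d09762d6f3fd0443aac6b475fba84f7e"


def commit_matches(head: str, pinned: str = PINNED_VILA_COMMIT) -> bool:
    """Return True if `head` is the pinned hash or an abbreviation of it.

    Abbreviations shorter than 7 characters are rejected as too ambiguous.
    """
    head = head.strip().lower()
    return len(head) >= 7 and pinned.startswith(head)


def vila_head(repo_dir: str = "VILA") -> str:
    """Return the full hash of the commit checked out in `repo_dir`.

    Raises subprocess.CalledProcessError if `repo_dir` is not a git checkout.
    """
    result = subprocess.run(
        ["git", "-C", repo_dir, "rev-parse", "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling `commit_matches(vila_head())` from the repository root returns `True` only when the VILA checkout is at the pinned commit.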
- The server will be available on port 8000 at the IP address stored in ip_eth0.txt.
- Use the Gradio interface by running:

      python gradio_server_example.py

  The Gradio interface automatically reads the IP address from ip_eth0.txt to connect to the server.
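
A custom client can locate the server the same way. The sketch below is illustrative only and is not taken from gradio_server_example.py: it assumes ip_eth0.txt holds a single IP address as plain text and that the server speaks plain HTTP. The function names are hypothetical, and no endpoint path is shown because the source does not specify one.

```python
import ipaddress
from pathlib import Path

SERVER_PORT = 8000  # port the setup script starts the server on


def read_server_ip(path: str = "ip_eth0.txt") -> str:
    """Read and validate the IP address written by the setup script.

    Assumes the first non-empty line of the file is the address.
    Raises ValueError if the file is empty or the content is not an IP.
    """
    lines = [ln.strip() for ln in Path(path).read_text(encoding="utf-8").splitlines()]
    lines = [ln for ln in lines if ln]
    if not lines:
        raise ValueError(f"{path} is empty; has the setup script been run?")
    ipaddress.ip_address(lines[0])  # raises ValueError on malformed input
    return lines[0]


def server_base_url(ip: str, port: int = SERVER_PORT) -> str:
    """Build the base URL; IPv6 literals are wrapped in brackets."""
    host = f"[{ip}]" if ipaddress.ip_address(ip).version == 6 else ip
    return f"http://{host}:{port}"
```

For example, `server_base_url(read_server_ip())` yields the base address to pass to whatever HTTP client the application uses.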
- The VILA repository is an external dependency and should be kept separate from this repository
- Make sure to check out the specific commit (a5a380d6d09762d6f3fd0443aac6b475fba84f7e) of VILA
- Always use the VILA environment (conda activate vila) when running the server
- Model checkpoints are downloaded from Hugging Face and are not included in this repository
- Make sure you have sufficient disk space for the model checkpoint
- The server requires GPU support for optimal performance
- The server IP address is automatically detected and stored in ip_eth0.txt
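
Several of these notes can be checked programmatically before starting the server. This is a rough sketch under stated assumptions, not part of the repository: the function names are hypothetical, the disk-space threshold is left to the caller because the checkpoint size is not given here, the environment check relies on the `CONDA_DEFAULT_ENV` variable that conda sets on activation, and the GPU check only tests whether `nvidia-smi` is on the PATH.

```python
import os
import shutil
from typing import List, Mapping, Optional


def free_gib(path: str = ".") -> float:
    """Free space on the filesystem containing `path`, in GiB."""
    return shutil.disk_usage(path).free / 2**30


def in_conda_env(name: str = "vila", environ: Optional[Mapping[str, str]] = None) -> bool:
    """True if the active conda environment (per CONDA_DEFAULT_ENV) is `name`."""
    env = os.environ if environ is None else environ
    return env.get("CONDA_DEFAULT_ENV") == name


def nvidia_smi_available() -> bool:
    """Heuristic GPU check: is the nvidia-smi tool on the PATH?"""
    return shutil.which("nvidia-smi") is not None


def preflight(min_free_gib: float, path: str = ".") -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if free_gib(path) < min_free_gib:
        problems.append(f"less than {min_free_gib} GiB of free disk space at {path!r}")
    if not in_conda_env("vila"):
        problems.append("conda environment 'vila' is not active")
    if not nvidia_smi_available():
        problems.append("nvidia-smi not found; a GPU may not be available")
    return problems
```

A caller picks `min_free_gib` based on the checkpoint size reported on Hugging Face and prints whatever `preflight(...)` returns.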
[Your License Here]
- VILA: NVlabs/VILA (commit a5a380d6d09762d6f3fd0443aac6b475fba84f7e)
- Model weights: yili18/Hamster_dev