SwarmChat enables natural-language communication with robot swarms. The system integrates audio transcription, text processing, and safety checks with a live simulation environment that visualizes a swarm of agents executing behavior trees.
🚀 This project is funded by the European Union’s UTTER programme, in collaboration with the UTTER consortium.
- Audio Input Processing:
  - Record commands via a microphone.
  - Translate speech into English using the facebook/seamless-m4t-v2-large model.
  - Perform a safety check on the translated text before execution.
- Text Input Processing:
  - Enter text commands for swarm control.
  - Translate text using EuroLLM (EuroLLM-9B-Instruct).
  - Detect unsafe or inappropriate content with an integrated safety module.
- Safety Module:
  - Utilizes the Llama Guard model (Llama-Guard-3-8B) for safety classification.
  - Identifies unsafe content across predefined categories (e.g., violent crimes, privacy violations, hate speech).
  - Ensures commands comply with safety standards.
- Swarm Simulation:
  - Visualize a swarm of agents in a live simulation powered by the Violet simulator and Pygame.
  - Agents are controlled by behavior trees defined in an XML file (tree.xml), using the py_trees framework.
  - Real-time simulation updates are streamed via a Gradio web interface.
- Behavior Tree Generator:
  - Uses the Falcon3-10B-Instruct-BehaviorTree model to dynamically generate behavior trees in XML format.
  - Automatically extracts available behaviors from the SwarmAgent class and constructs a detailed prompt using a predefined XML template.
  - Generates and saves new behavior tree configurations (updating tree.xml) based on user-specified tasks.
- Integrated Interface:
  - A unified Gradio web interface for both audio and text inputs.
  - Live streaming of the simulation environment.
  - Seamless switching between different input modalities.
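
The safety check that gates both input paths can be sketched as follows. This is a minimal illustration, not the code in safety_module.py. It assumes the classifier replies in the format documented for Llama Guard 3: a first line of "safe" or "unsafe", followed for unsafe inputs by a comma-separated list of category codes such as S1. The names parse_guard_output and gate_command, and the shortened category table, are hypothetical.

```python
from typing import List, Tuple

# Subset of the Llama Guard 3 hazard categories (shortened for illustration).
CATEGORIES = {
    "S1": "Violent Crimes",
    "S7": "Privacy",
    "S10": "Hate",
}


def parse_guard_output(raw: str) -> Tuple[bool, List[str]]:
    """Return (is_safe, category_names) from a Llama Guard style verdict."""
    lines = [line.strip() for line in raw.strip().splitlines() if line.strip()]
    if not lines:
        # Fail closed: an empty verdict is treated as unsafe.
        return False, ["Unknown"]
    if lines[0].lower() == "safe":
        return True, []
    # Anything other than "safe" is treated as unsafe; codes sit on line two.
    codes = lines[1].split(",") if len(lines) > 1 else []
    names = [CATEGORIES.get(c.strip(), c.strip()) for c in codes if c.strip()]
    return False, names


def gate_command(command: str, raw_verdict: str) -> str:
    """Forward a command only when the verdict is safe (hypothetical helper)."""
    is_safe, categories = parse_guard_output(raw_verdict)
    if is_safe:
        return command
    raise ValueError("Command blocked: " + ", ".join(categories or ["unspecified"]))
```

Treating every verdict other than an explicit "safe" as unsafe is a deliberate choice in this sketch, so that a malformed model reply never lets a command through to the swarm.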
- Backend:
  - Python
  - Transformers (Hugging Face)
  - PyTorch
  - Pygame
  - Threading and Queue modules for simulation management
- Frontend:
  - Gradio for an interactive web-based interface.
- AI Models:
  - Speech Processing: Seamless-m4t-v2-large for audio transcription and translation.
  - Text Processing: EuroLLM-9B-Instruct for text translation.
  - Safety Classification: Llama-Guard-3-8B for content safety assessment.
  - Behavior Tree Generation: Falcon3-10B-Instruct-BehaviorTree for creating and updating behavior trees.
- Behavior Trees:
  - Agents use behavior trees, parsed from XML and built with py_trees, to dictate their actions within the simulation.
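
To illustrate how an XML-defined behavior tree can drive an agent, here is a minimal sketch that uses only the Python standard library instead of py_trees. The tag names (Sequence, Selector, Action, Condition) and the behavior names are assumptions made for this example; the actual schema of tree.xml is not shown in this README.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict

# Hypothetical tree; the real tree.xml schema may differ.
TREE_XML = """
<BehaviorTree>
  <Selector>
    <Sequence>
      <Condition name="is_obstacle_detected"/>
      <Action name="avoid_obstacle"/>
    </Sequence>
    <Action name="wander"/>
  </Selector>
</BehaviorTree>
"""


def tick(node: ET.Element, behaviors: Dict[str, Callable[[], bool]]) -> bool:
    """Evaluate one node; True means success, False means failure."""
    if node.tag == "Sequence":
        # all() short-circuits, so evaluation stops at the first failing child.
        return all(tick(child, behaviors) for child in node)
    if node.tag == "Selector":
        # any() short-circuits, so evaluation stops at the first succeeding child.
        return any(tick(child, behaviors) for child in node)
    if node.tag in ("Action", "Condition"):
        return behaviors[node.get("name")]()
    raise ValueError(f"Unknown node type: {node.tag}")


def run_tree(xml_text: str, behaviors: Dict[str, Callable[[], bool]]) -> bool:
    """Parse the XML and tick the single child of the root element."""
    root = ET.fromstring(xml_text.strip())
    return tick(root[0], behaviors)
```

Calling run_tree(TREE_XML, behaviors) with a dictionary that maps each behavior name to a zero-argument callable returning a bool performs one tick of the tree. This sketch only models success and failure; py_trees additionally has a RUNNING status for behaviors that span several ticks.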
- Clone the repository:
    git clone https://github.com/Inventors-Hub/SwarmChat.git
    cd SwarmChat
- Install dependencies:
    pip install -r requirements.txt
- Set up AI Models:
  - Place the EuroLLM model file (EuroLLM-9B-Instruct-Q4_K_M.gguf) at the path specified in text_processing.py.
  - Place the Llama Guard model file (llama-guard-3-8b-q4_k_m.gguf) at the path specified in safety_module.py.
  - Place the Falcon3 behavior tree model file (Falcon3-10B-Instruct-BehaviorTree-3-epochs-GGUF) at the path specified in bt_generator.py.
- Run the Application:
    python app.py
- Access the Interface:
  Open your browser and navigate to http://127.0.0.1:7860 to start using SwarmChat.
- app.py
  The main application integrates audio/text processing, behavior tree generation, and the live simulation. It sets up the Gradio interface, handles simulation streaming, and routes user inputs to the appropriate processing modules.
- speech_processing.py
  Implements audio transcription and translation using the facebook/seamless-m4t-v2-large model.
- text_processing.py
  Translates text commands using EuroLLM (EuroLLM-9B-Instruct).
- safety_module.py
  Utilizes Llama Guard to assess the safety of incoming commands, ensuring compliance with safety policies.
- bt_generator.py
  Dynamically generates behavior trees in XML format by extracting behaviors from the SwarmAgent class, constructing a prompt, and querying the Falcon3-10B-Instruct-BehaviorTree model. The generated XML is saved to tree.xml for simulation use.
- simulator_env.py
  Powers the simulation environment, manages agent behaviors using XML-defined behavior trees, and handles real-time simulation updates.
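
The behavior extraction and prompt construction described for bt_generator.py can be sketched with the standard inspect module. This is an assumption about how such a step might look, not the project's code: the SwarmAgent class below is a stand-in with made-up methods, and PROMPT_TEMPLATE is a hypothetical placeholder for the project's predefined XML template.

```python
import inspect

# Hypothetical template; the project's predefined XML template is not shown here.
PROMPT_TEMPLATE = """Task: {task}

Available behaviors:
{behaviors}

Respond with a behavior tree in XML only."""


class SwarmAgent:
    """Stand-in for the real SwarmAgent class (method names are made up)."""

    def wander(self) -> bool:
        """Action: move in a random direction."""
        return True

    def is_obstacle_detected(self) -> bool:
        """Condition: check whether an obstacle is nearby."""
        return False

    def _update_position(self) -> None:
        """Internal helper, excluded from the prompt."""


def extract_behaviors(cls: type) -> dict:
    """Map each public method name to the first line of its docstring."""
    behaviors = {}
    for name, func in inspect.getmembers(cls, predicate=inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private helpers and dunder methods
        doc = inspect.getdoc(func) or ""
        behaviors[name] = doc.splitlines()[0] if doc else ""
    return behaviors


def build_prompt(task: str, cls: type = SwarmAgent) -> str:
    """Fill the template with the task and the extracted behavior list."""
    lines = [f"- {name}: {doc}" for name, doc in extract_behaviors(cls).items()]
    return PROMPT_TEMPLATE.format(task=task, behaviors="\n".join(lines))
```

Deriving the behavior list from the class itself keeps the prompt in step with the code: a method added to the agent class shows up in the next generated prompt without editing the template.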
This work was funded by the European Union under the UTTER programme.
We gratefully acknowledge the support of the entire UTTER consortium.