SecVerifier is a multi-agent framework that automates the end-to-end reproduction of security vulnerabilities. It coordinates a set of specialized agents, each handling one stage of the reproduction pipeline.
Architecture
Note: SecVerifier is built on OpenHands@0.34.0.
SecVerifier employs a hierarchical multi-agent architecture:
ManagerAgent - The coordinator agent that orchestrates the full process
BuilderAgent - Specialized assistant responsible for fixing and optimizing build processes for code repositories
ExploiterAgent - Specialized assistant responsible for crafting proof-of-concept (PoC) exploits to reproduce vulnerabilities in code repositories
FixerAgent - Specialized assistant responsible for creating a unified patch file that fixes vulnerabilities in code repositories
The system uses agent delegation to pass control between the specialized agents. Runtime state is kept across each handoff, so every agent works in the same container environment.
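The delegation pattern described above can be illustrated with a small Python sketch. This is not SecVerifier's or OpenHands' actual code: the `Runtime` class, the `run` method, and the file names (`build.sh`, `poc`, `fix.patch`) are hypothetical stand-ins, chosen only to show one runtime object being handed from agent to agent.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Runtime:
    """Hypothetical stand-in for the shared container environment."""
    files: Dict[str, str] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)


class BuilderAgent:
    def run(self, runtime: Runtime) -> None:
        # Leaves a working build script behind for later agents.
        runtime.files["build.sh"] = "fixed build script"
        runtime.log.append("BuilderAgent")


class ExploiterAgent:
    def run(self, runtime: Runtime) -> None:
        # Depends on state written by the builder in the same runtime.
        if "build.sh" not in runtime.files:
            raise RuntimeError("no build script found in runtime")
        runtime.files["poc"] = "proof-of-concept input"
        runtime.log.append("ExploiterAgent")


class FixerAgent:
    def run(self, runtime: Runtime) -> None:
        # Depends on the PoC written by the exploiter.
        if "poc" not in runtime.files:
            raise RuntimeError("no PoC found in runtime")
        runtime.files["fix.patch"] = "unified diff"
        runtime.log.append("FixerAgent")


class ManagerAgent:
    """Coordinator: delegates to each agent in turn, reusing one runtime."""

    def __init__(self, delegates):
        self.delegates = delegates

    def run(self, runtime: Runtime) -> Runtime:
        for agent in self.delegates:
            agent.run(runtime)  # same object each time, so state persists
        return runtime


manager = ManagerAgent([BuilderAgent(), ExploiterAgent(), FixerAgent()])
runtime = manager.run(Runtime())
print(runtime.log)
print(sorted(runtime.files))
```

Because every delegate receives the same `Runtime` instance, the exploiter can rely on what the builder left behind, which is the property the preserved container state provides in the real system.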
Key Features
Coordinated Multi-Agent Workflow: Specialized agents work together in sequence
Maintained Runtime State: Container state is preserved across agent transitions
Vulnerability-Specific Inputs: Each agent receives vulnerability-specific information
Comprehensive Outputs: Detailed outputs from each phase of reproduction
Getting Started
Installation
# Clone the repository
git clone https://github.com/SEC-bench/SecVerifier.git
cd SecVerifier
# Install dependencies with Poetry
poetry install
Running SecVerifier
Usage: ./run_multi-agent.sh -m <max_iterations> [-l <llm_config>] [-c <condenser>] [-i <instance_id>] [-s <limit>] [-t <max_budget>] [-b <label>] [-w <workers>]
-m: Maximum iterations (default: 20)
-l: LLM config (default: llm.4o)
-c: Condenser (default: recent)
-i: Instance ID
-s: Limit the number of instances to process
-t: Maximum budget per task
-b: Label
-w: Number of workers (default: 1)
-h: Show this help message
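To drive the script from Python, for example in a batch job, the command line can be assembled from the options listed above. The helper below is a sketch rather than part of SecVerifier: `build_command` and its parameter names are hypothetical, and only the flags and defaults come from the usage text.

```python
import shlex
from typing import List, Optional


def build_command(
    max_iterations: int = 20,
    llm_config: str = "llm.4o",
    condenser: str = "recent",
    instance_id: Optional[str] = None,
    limit: Optional[int] = None,
    max_budget: Optional[float] = None,
    label: Optional[str] = None,
    workers: int = 1,
) -> List[str]:
    """Build an argv list for run_multi-agent.sh from the documented flags."""
    cmd = [
        "./run_multi-agent.sh",
        "-m", str(max_iterations),
        "-l", llm_config,
        "-c", condenser,
    ]
    # Flags without a documented default are added only when a value is given.
    optional = [("-i", instance_id), ("-s", limit), ("-t", max_budget), ("-b", label)]
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    cmd += ["-w", str(workers)]
    return cmd


cmd = build_command(max_iterations=30, limit=5, workers=2)
print(shlex.join(cmd))
# prints: ./run_multi-agent.sh -m 30 -l llm.4o -c recent -s 5 -w 2
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root. Passing a list rather than a single string avoids shell quoting problems with labels or instance IDs.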
Workflow
Initialization: The system loads vulnerability data from SEC-bench
Builder Phase: BuilderAgent validates and fixes build scripts
Exploiter Phase: ExploiterAgent improves helper scripts and creates PoC files
Fixer Phase: FixerAgent creates a unified patch file that fixes vulnerabilities in code repositories
Evaluation: The system evaluates the success of the reproduction process
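The phase ordering above can be sketched as a simple pipeline that records one result per phase and then evaluates them. Everything here is illustrative: the phase functions are placeholders, the instance record is hypothetical, and two behaviors are assumptions not stated in this README, namely that a failed phase stops the run and that success means all three phases succeeded.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PhaseResult:
    phase: str
    success: bool
    output: str


Phase = Callable[[Dict[str, str]], PhaseResult]


def builder_phase(instance: Dict[str, str]) -> PhaseResult:
    return PhaseResult("builder", True, f"build validated for {instance['id']}")


def exploiter_phase(instance: Dict[str, str]) -> PhaseResult:
    return PhaseResult("exploiter", True, f"PoC created for {instance['id']}")


def fixer_phase(instance: Dict[str, str]) -> PhaseResult:
    return PhaseResult("fixer", True, f"patch created for {instance['id']}")


PHASES: List[Phase] = [builder_phase, exploiter_phase, fixer_phase]


def reproduce(instance: Dict[str, str], phases: List[Phase] = PHASES) -> List[PhaseResult]:
    """Run the phases in order, keeping the output of each one."""
    results: List[PhaseResult] = []
    for phase in phases:
        result = phase(instance)
        results.append(result)
        if not result.success:
            break  # assumption: later phases need the earlier ones to succeed
    return results


def evaluate(results: List[PhaseResult], expected: int = len(PHASES)) -> bool:
    """Assumed success criterion: every phase ran and succeeded."""
    return len(results) == expected and all(r.success for r in results)


instance = {"id": "example-instance"}  # hypothetical record, not a real SEC-bench entry
results = reproduce(instance)
for r in results:
    print(r.phase, r.success, r.output)
print("reproduced:", evaluate(results))
```

Keeping a `PhaseResult` per phase mirrors the "detailed outputs from each phase" feature: a failed run still shows which phase stopped it and what that phase produced.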