Multi-Agent System for Vulnerability Reproduction

SecVerifier is a multi-agent framework that automates the end-to-end reproduction of security vulnerabilities. It coordinates a set of specialized agents, each of which handles one stage of the reproduction pipeline.

Architecture

Note: SecVerifier is built on OpenHands@0.34.0.

SecVerifier employs a hierarchical multi-agent architecture:

ManagerAgent - Coordinator agent that orchestrates the full process
BuilderAgent - Specialized assistant that fixes and optimizes the build process of a code repository
ExploiterAgent - Specialized assistant that crafts proof-of-concept (PoC) exploits to reproduce a vulnerability in a code repository
FixerAgent - Specialized assistant that creates a unified patch file to fix the vulnerability in a code repository

The system uses agent delegation to pass control between the specialized agents. Runtime state is kept across these handoffs, so every agent works in the same container environment.
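
The delegation pattern described above can be illustrated with a minimal sketch. It is not the SecVerifier or OpenHands implementation: the `Runtime` class, the `run` methods, and the file names (`build.sh`, `poc`, `fix.patch`) are hypothetical stand-ins chosen for illustration. The point is only that one runtime object is handed from agent to agent, so later agents see what earlier ones left behind.

```python
from dataclasses import dataclass, field


@dataclass
class Runtime:
    """Hypothetical stand-in for the shared container environment."""
    files: dict = field(default_factory=dict)
    log: list = field(default_factory=list)


class BuilderAgent:
    def run(self, runtime: Runtime) -> None:
        # Hypothetical artifact: a repaired build script.
        runtime.files["build.sh"] = "fixed build script"
        runtime.log.append("builder")


class ExploiterAgent:
    def run(self, runtime: Runtime) -> None:
        # Depends on state left in the runtime by the BuilderAgent.
        if "build.sh" not in runtime.files:
            raise RuntimeError("build artifacts missing from runtime")
        runtime.files["poc"] = "proof-of-concept input"
        runtime.log.append("exploiter")


class FixerAgent:
    def run(self, runtime: Runtime) -> None:
        # Depends on the PoC left in the runtime by the ExploiterAgent.
        if "poc" not in runtime.files:
            raise RuntimeError("PoC missing from runtime")
        runtime.files["fix.patch"] = "unified diff"
        runtime.log.append("fixer")


class ManagerAgent:
    """Coordinator that delegates to each specialized agent in sequence."""

    def __init__(self) -> None:
        self.delegates = [BuilderAgent(), ExploiterAgent(), FixerAgent()]

    def run(self, runtime: Runtime) -> Runtime:
        for agent in self.delegates:
            # The same runtime object is passed on, never recreated.
            agent.run(runtime)
        return runtime


runtime = ManagerAgent().run(Runtime())
print(runtime.log)
```

If the manager created a fresh `Runtime` for each delegate, the ExploiterAgent in this sketch would fail immediately, which is the situation the preserved runtime state is meant to avoid.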

Key Features

Coordinated Multi-Agent Workflow: Specialized agents work together in sequence
Maintained Runtime State: Container state is preserved across agent transitions
Vulnerability-Specific Inputs: Each agent receives vulnerability-specific information
Comprehensive Outputs: Detailed outputs from each phase of reproduction

Getting Started

Installation

# Clone the repository
git clone https://github.com/SEC-bench/SecVerifier.git
cd SecVerifier
# Install dependencies with Poetry
poetry install

Running SecVerifier

Usage: ./run_multi-agent.sh -m <max_iterations> [-l <llm_config>] [-c <condenser>] [-i <instance_id>] [-s <limit>] [-t <max_budget>] [-b <label>] [-w <num_workers>]
  -m: Maximum iterations (default: 20)
  -l: LLM config (default: llm.4o)
  -c: Condenser (default: recent)
  -i: Instance ID
  -s: Limit the number of instances to process
  -t: Maximum budget per task
  -b: Label
  -w: Number of workers (default: 1)
  -h: Show this help message
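
When the script is driven from other tooling, the options above can be assembled programmatically. The helper below is a small sketch, not part of SecVerifier: `build_command` and its parameter names are hypothetical, the instance ID `example-instance` is a placeholder, and only the flags and defaults listed in the usage text are used.

```python
import shlex


def build_command(max_iterations=20, llm_config="llm.4o", condenser="recent",
                  instance_id=None, limit=None, max_budget=None,
                  label=None, workers=1):
    """Return an argv list for run_multi-agent.sh (hypothetical helper)."""
    cmd = [
        "./run_multi-agent.sh",
        "-m", str(max_iterations),
        "-l", llm_config,
        "-c", condenser,
        "-w", str(workers),
    ]
    # Flags without a documented default are added only when a value is given.
    optional = [("-i", instance_id), ("-s", limit),
                ("-t", max_budget), ("-b", label)]
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


cmd = build_command(max_iterations=30, instance_id="example-instance", workers=2)
print(shlex.join(cmd))
# ./run_multi-agent.sh -m 30 -l llm.4o -c recent -w 2 -i example-instance
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.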

Workflow

Initialization: The system loads vulnerability data from SEC-bench
Builder Phase: BuilderAgent validates and fixes build scripts
Exploiter Phase: ExploiterAgent improves helper scripts and creates PoC files
Fixer Phase: FixerAgent creates a unified patch file that fixes vulnerabilities in code repositories
Evaluation: The system evaluates the success of the reproduction process
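
The five steps above can be sketched as an ordered pipeline that records the output of each phase and then evaluates the run. This is an illustrative outline only: the function names, the record fields, the result flags, and the evaluation rule (every phase flag must be true) are assumptions, not SecVerifier's actual data model or success criteria.

```python
def load_instance(instance_id):
    # Placeholder for loading a SEC-bench record; the fields are hypothetical.
    return {"instance_id": instance_id, "description": "example vulnerability"}


def builder_phase(instance):
    return {"build_ok": True}


def exploiter_phase(instance):
    return {"poc_created": True}


def fixer_phase(instance):
    return {"patch_created": True}


# Phases run in the order given in the workflow description.
PHASES = [
    ("builder", builder_phase),
    ("exploiter", exploiter_phase),
    ("fixer", fixer_phase),
]


def reproduce(instance_id):
    instance = load_instance(instance_id)            # Initialization
    outputs = {}
    for name, phase in PHASES:                       # Builder, Exploiter, Fixer
        outputs[name] = phase(instance)
    # Assumed evaluation rule: the run succeeds if every phase flag is true.
    success = all(all(result.values()) for result in outputs.values())
    return {"instance_id": instance_id, "phases": outputs, "success": success}


result = reproduce("example-instance")
print(result["success"], list(result["phases"]))
```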

