Can A Society of Generative Agents Simulate Human Behavior and Inform Public Health Policy? A Case Study on Vaccine Hesitancy.
This repo evaluates and analyzes VacSim, a multi-agent LLM framework for simulating health-related decision-making. Paper link: https://arxiv.org/abs/2503.09639
- Create and activate a conda environment (Python 3.10 recommended).
- Install dependencies: `pip install -r requirements.txt`
The main entry file of this repo is driver.py, which creates an EvalSuite object that conducts evaluations for the multi-agent system. Each EvalSuite (see the implementation in src/utils/eval_suite.py) runs the evaluation selected by the `eval_mode` argument:
- 0 -> conduct attitude tuning
- 1 -> incentive policy strength eval
- 2 -> community policy strength eval
- 3 -> mandate policy strength eval
- 4 -> news sanity eval
- 5 -> run simulation with different diseases (by replacing disease names)
- 6 -> compare simulation runs under the strong policy of each kind (incentive, community, mandate)
Each EvalSuite creates a DataParallelEngine or AsyncDataParallelEngine object, which runs parallel inference on vLLM servers or makes async API calls, depending on whether you run local or remote inference. Both DataParallelEngine and AsyncDataParallelEngine inherit from the Engine class, which specifies the behaviors of agents and outlines their routines at each simulation time step.
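To make that structure concrete, here is a minimal sketch of the hierarchy. Only the class names Engine, DataParallelEngine, and AsyncDataParallelEngine come from the repo; the method names, signatures, and bodies below are illustrative assumptions, not the actual interfaces:

```python
# Minimal sketch of the engine hierarchy described above. Only the class
# names come from the repo; methods and bodies are assumed for illustration.

class Engine:
    """Specifies agent behaviors and the routine at each simulation time step."""

    def __init__(self, agents, num_days):
        self.agents = agents
        self.num_days = num_days

    def run(self):
        # One routine per simulated day: each agent reads news, hears from
        # neighbors, and updates its vaccine attitude via an LLM call.
        for day in range(self.num_days):
            for agent in self.agents:
                response = self.infer(agent.build_prompt(day))  # assumed agent API
                agent.update_attitude(response)

    def infer(self, prompt):
        raise NotImplementedError  # subclasses decide how inference runs


class DataParallelEngine(Engine):
    """Local inference: dispatches prompts to parallel vLLM servers."""

    def infer(self, prompt):
        ...  # e.g., round-robin an HTTP request over the --ports list


class AsyncDataParallelEngine(Engine):
    """Remote inference: async API calls (OpenAI, Azure, Anthropic)."""

    def infer(self, prompt):
        ...  # e.g., schedule a coroutine against the provider client
```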
- (Recommended) If you run via a bash script, see the example in `example_run_4_servers.sh` for running parallel servers on 4 GPUs.
- If you use a single interactive GPU, start a server like this:
python -m vllm.entrypoints.openai.api_server \
--model $model --guided-decoding-backend lm-format-enforcer --max-model-len 6144 \
--tensor-parallel-size 1 --port $PORT
where you substitute your model name for `$model` and a port number for `$PORT`.
Example:
python -m vllm.entrypoints.openai.api_server \
--model meta-llama/Meta-Llama-3.1-8B-Instruct --guided-decoding-backend lm-format-enforcer --max-model-len 6144 \
--tensor-parallel-size 1 --port 49172
- Parallel: If you use interactive GPUs across N parallel processes, request N GPUs and open N sessions, then run the command above in each session with a distinct port. A programmatic alternative is sketched below.
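If you would rather launch the N servers from one script than open N sessions, a small Python launcher along these lines should work. This is a sketch: the model name and ports are the same examples used above, and pinning each process to its own GPU via CUDA_VISIBLE_DEVICES is an assumption about your setup:

```python
# Launch one vLLM OpenAI-compatible server per GPU. Equivalent to running
# the command above in N separate sessions; adjust MODEL and PORTS as needed.
import os
import subprocess

MODEL = "meta-llama/Meta-Llama-3.1-8B-Instruct"
PORTS = [49172, 55050, 60050, 60100]  # one port per GPU/process

procs = []
for gpu, port in enumerate(PORTS):
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))  # pin to one GPU
    procs.append(subprocess.Popen([
        "python", "-m", "vllm.entrypoints.openai.api_server",
        "--model", MODEL,
        "--guided-decoding-backend", "lm-format-enforcer",
        "--max-model-len", "6144",
        "--tensor-parallel-size", "1",
        "--port", str(port),
    ], env=env))

for p in procs:
    p.wait()  # keep the launcher alive while the servers run
```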
Once the server(s) are running, launch the simulation driver. Example command:
python src/driver.py 1 --warmup_days 5 --run_days 15 \
--news_path data/news/COVID-news-total-k=10000.pkl \
--network_str data/social_network-num=100-incl=neutral.pkl \
--profile_path data/profiles-num=100-incl=neutral.pkl \
--model_type meta-llama/Meta-Llama-3.1-8B-Instruct \
--disease COVID-19 \
--ports 49172 --temperature 0.7
This command runs the policy strength eval (the incentive policy) over five seeds by default. If you want to supply a different list of seeds, add the `--seed_list` argument.
If you use multiple server processes, include all the ports:
python src/driver.py 1 --warmup_days 5 --run_days 15 \
--news_path data/news/COVID-news-total-k=10000.pkl \
--network_str data/social_network-num=100-incl=neutral.pkl \
--profile_path data/profiles-num=100-incl=neutral.pkl \
--model_type meta-llama/Meta-Llama-3.1-8B-Instruct \
--disease COVID-19 \
--ports 49172 55050 60050 60100 --temperature 0.7
If you use closed-source models, we recommend providing your API keys as environment variables; a sketch of `init_client` follows this list. If you use:
- OpenAI models: create an env variable called `OPENAI_API_KEY` and modify the `init_client` method in `src/engines/async_engine`.
- OpenAI models hosted on Azure: create env variables called `AZURE_OPENAI_API_KEY` and `AZURE_OPENAI_ENDPOINT`, and specify the api_version in the `init_client` method in `src/engines/async_engine`.
- Anthropic models: create an env variable called `ANTHROPIC_API_KEY`.
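For reference, `init_client` might look roughly like the following once the environment variables are set. This is a sketch only: the `provider` parameter, the branching, and the api_version value are assumptions, not the actual code in `src/engines/async_engine`:

```python
# Illustrative init_client: reads provider credentials from environment
# variables. The signature and branching are assumptions about
# src/engines/async_engine, not its actual contents.
import os

def init_client(provider: str):
    if provider == "openai":
        from openai import AsyncOpenAI
        return AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    if provider == "azure":
        from openai import AsyncAzureOpenAI
        return AsyncAzureOpenAI(
            api_key=os.environ["AZURE_OPENAI_API_KEY"],
            azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
            api_version="2024-02-01",  # set the version your deployment expects
        )
    if provider == "anthropic":
        from anthropic import AsyncAnthropic
        return AsyncAnthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    raise ValueError(f"Unknown provider: {provider}")
```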
Caveat: for some cloud providers, you need to encode the model token limit in the model name. For example, instead of `--model_type gpt-4o`, you need `--model_type gpt-4o-0513-50ktokenperminute`. Keep this in mind when specifying `--model_type`.
The following lists commands and inputs for each kind of eval:
- Attitude tuning (command 0): Do not provide `--temperature`; provide `--temperature_list` instead, because tuning sweeps over a range of temperatures. Example (on four ports):
python src/driver.py 0 --warmup_days 5 --run_days 15 \
--news_path data/news/COVID-news-total-k=10000.pkl \
--network_str data/social_network-num=100-incl=neutral.pkl \
--profile_path data/profiles-num=100-incl=neutral.pkl \
--model_type meta-llama/Meta-Llama-3.1-8B-Instruct \
--disease COVID-19 \
--ports 49172 55050 60050 60100 --temperature_list 0.1 0.3 0.5 0.7 1.0 2.0
- Policy Strength Eval (commands 1, 2, 3): Provide `--temperature` (the temperature selected after attitude tuning). Example (on four ports):
python src/driver.py 1 --warmup_days 5 --run_days 15 \
--news_path data/news/COVID-news-total-k=10000.pkl \
--network_str data/social_network-num=100-incl=neutral.pkl \
--profile_path data/profiles-num=100-incl=neutral.pkl \
--model_type meta-llama/Meta-Llama-3.1-8B-Instruct \
--disease COVID-19 \
--ports 49172 55050 60050 60100 --temperature 0.7
- News Eval (command 4): Provide `--news_list` (paths to the positive and negative news files):
python src/driver.py 4 --warmup_days 5 --run_days 15 \
--news_list data/news/COVID-news-positive-k=5000.pkl data/news/COVID-news-negative-k=5000.pkl \
--network_str data/social_network-num=100-incl=neutral.pkl \
--profile_path data/profiles-num=100-incl=neutral.pkl \
--model_type meta-llama/Meta-Llama-3.1-8B-Instruct \
--disease COVID-19 \
--ports 49172 55050 60050 60100 --temperature 0.7
- Policy Comparison (command 6): Provide a temperature; the command is otherwise similar to the Policy Strength Eval.
python src/driver.py 6 --warmup_days 5 --run_days 15 \
--news_path data/news/COVID-news-total-k=10000.pkl \
--network_str data/social_network-num=100-incl=neutral.pkl \
--profile_path data/profiles-num=100-incl=neutral.pkl \
--model_type meta-llama/Meta-Llama-3.1-8B-Instruct \
--disease COVID-19 \
--ports 49172 55050 60050 60100 --temperature 0.7
The eval outputs will accumulate and can consume substantial storage in the long term, as they contain the output of every individual agent in the simulated population. If you need to save the output to a custom directory, specify the `--save_dir` argument; otherwise, the output directory is created in the current directory.
The output directory contains the following files:
- results: records the summary of all simulations run over the list of seeds, along with the individual run results.
- sim: records detailed information for every simulation run, including all agents' outputs and the trajectory of each individual run.
You can create more evaluations by modifying the eval function in the src/utils/eval_suite.py file. The policy, news, social network, and agent demographics provided to the engine can all be set through the corresponding parameters.
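As an illustration, extending the suite with a new eval mode might look like this. Only EvalSuite and its module path come from this README; `run_one_sim`, `seed_list`, and the keyword arguments are hypothetical names assumed for the sketch:

```python
# Hypothetical extension of EvalSuite with a custom eval mode. Only the
# EvalSuite class and its module path are from the repo; run_one_sim,
# seed_list, and the parameter names are assumed for illustration.
from utils.eval_suite import EvalSuite  # i.e., src/utils/eval_suite.py


class CustomEvalSuite(EvalSuite):
    def eval(self, eval_mode):
        if eval_mode == 7:  # a new, custom evaluation
            for seed in self.seed_list:
                self.run_one_sim(
                    seed=seed,
                    policy="incentive-strong",  # policy handed to the engine
                    news_path="data/news/COVID-news-total-k=10000.pkl",
                    network_path="data/social_network-num=100-incl=neutral.pkl",
                    profile_path="data/profiles-num=100-incl=neutral.pkl",
                )
        else:
            super().eval(eval_mode)  # fall back to the built-in modes 0-6
```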