On meta prompt optimization and coding agents
Abstract. Language models (LMs) encode vast amounts of human knowledge in a queryable form and are now widely deployed across domains that previously could not directly benefit from language. Yet major challenges remain: how to effectively extract information (e.g., where and how to prompt the model), and how to ensure alignment with human values and preferences (e.g., personalization, safety, undesirable behaviors). A path forward is to quantify the objective, perform (test-time) optimization over the prompt space to improve that objective, and ultimately fold the improvement back into the base model. This talk explores this in two domains:
1) Prompt optimization and adversarial attacks---in [AdvPrompter](https://arxiv.org/abs/2404.16873) (ICML 2025) we learn a *meta-prompting* model that rapidly predicts the optimal adversarial suffix for jailbreaking an LM, and in [AdvPrefix](https://arxiv.org/abs/2412.10321) (NeurIPS 2025) we show that variants of the jailbreaking optimization objective can significantly improve attack performance, and
2) Numerical code synthesis with agents---in [AlgoTune](https://arxiv.org/abs/2507.15887) (NeurIPS D&B 2025) we propose a benchmark suite of 154 numerical coding tasks along with a baseline coding agent to solve them. The agent iteratively optimizes the code (and its prompts) with the task's runtime as the objective.
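The inner loop of such a runtime-optimizing agent can be sketched minimally; this is an illustrative stand-in, not the AlgoTune harness itself, and the candidate solvers here are fixed rather than LM-generated:

```python
import timeit

def agent_inner_loop(task_input, reference, candidates, repeats=50):
    """Sketch of a runtime-as-objective loop (hypothetical interface):
    each candidate is a proposed solver; keep the fastest one whose
    output matches the reference solution."""
    best_fn, best_time = None, float("inf")
    for fn in candidates:
        if fn(task_input) != reference(task_input):  # correctness gate first
            continue
        t = timeit.timeit(lambda: fn(task_input), number=repeats)
        if t < best_time:
            best_fn, best_time = fn, t
    return best_fn, best_time

# Toy task: sum a list. A real agent would *generate* candidates; here
# we hand it a slow Python loop and the fast C-implemented builtin.
def slow_sum(xs):
    total = 0
    for x in xs:
        total += x
    return total

data = list(range(10_000))
best, t = agent_inner_loop(data, reference=sum, candidates=[slow_sum, sum])
```

A real benchmark run replaces the toy scorer with sandboxed execution and wall-clock measurement, but the shape of the loop (propose, verify, time, keep the best) is the same.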
We'll draw connections between these two seemingly separate settings and show how meta-learning and agentic test-time computation squeeze the most out of the base model, advancing both capabilities and alignment.
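The shared recipe across both settings, quantify an objective and search the prompt space at test time, can be sketched as a simple random search; the `score` function and candidate suffixes below are purely illustrative placeholders:

```python
import random

def optimize_prompt(base_prompt, score, candidates, steps=100, seed=0):
    """Random-search sketch of test-time prompt optimization:
    propose suffix edits and keep whichever maximizes the objective."""
    rng = random.Random(seed)
    best, best_score = base_prompt, score(base_prompt)
    for _ in range(steps):
        cand = base_prompt + " " + rng.choice(candidates)
        s = score(cand)
        if s > best_score:
            best, best_score = cand, s
    return best, best_score

# Toy objective: reward prompts containing the phrase "step-by-step".
toy_score = lambda p: p.count("step-by-step")
prompt, val = optimize_prompt(
    "Explain the proof.", toy_score,
    candidates=["Be concise.", "Think step-by-step.", "Use examples."],
)
```

In the adversarial setting the objective would be a jailbreak success score and the search a learned meta-prompting model rather than random sampling; in the agentic setting the objective is task runtime. The outer structure is identical.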