SWE-smith
Scaling Data for Software Engineering Agents
April 30, 2025
Creating training data for software engineering agents is difficult. Until now.
Introducing SWE-smith: Generate 100s to 1000s of task instances for any GitHub repository.
We've generated 50k+ task instances for 128 popular GitHub repositories, then trained our own LM for SWE-agent.
The result? SWE-agent-LM-32B achieves 40% pass@1 on SWE-bench Verified.
Now, we've open-sourced everything, and we're excited to see what you build with it!
Check out the tutorial below to generate 100 task instances for any GitHub repository in 10 minutes.
Click here for an extended discussion.
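For readers unfamiliar with the metric quoted above, pass@1 is the fraction of task instances resolved when the model gets a single attempt per instance. Here is a minimal sketch of that arithmetic. The function name `pass_at_1` and the outcome list are hypothetical and made up for illustration; they are not part of SWE-smith or SWE-bench, and the numbers are not real results.

```python
def pass_at_1(resolved_flags):
    """Return the fraction of task instances resolved on their single attempt.

    `resolved_flags` holds one boolean per task instance:
    True if the one submitted patch resolved it, False otherwise.
    """
    if not resolved_flags:
        raise ValueError("need at least one task instance")
    resolved_count = sum(1 for resolved in resolved_flags if resolved)
    return resolved_count / len(resolved_flags)


# Hypothetical outcomes for five task instances (illustrative only).
example_outcomes = [True, False, False, True, False]
example_score = pass_at_1(example_outcomes)
print(f"pass@1 = {example_score:.0%}")
```

With two of the five hypothetical instances resolved, the script reports a pass@1 of 40%.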
🔥 Excited about SWE-smith? Build with us!
> Create new bug generation techniques.
> Expand to non-Python repositories.
> Train better SWE-agents!
Read our documentation or code for more.
Authors
John Yang,
Kilian Lieret,
Carlos E. Jimenez,
Alexander Wettig,
Kabir Khandpur,
Yanzhe Zhang,
Binyuan Hui,
Ofir Press,
Ludwig Schmidt,
Diyi Yang
Affiliations
Stanford University,
Stanford SALT Lab,
Princeton Language & Intelligence,
Alibaba Qwen
Citation
@misc{yang2025swesmith,
title={SWE-smith: Scaling Data for Software Engineering Agents},
author={John Yang and Kilian Lieret and Carlos E. Jimenez and Alexander Wettig and Kabir Khandpur and Yanzhe Zhang and Binyuan Hui and Ofir Press and Ludwig Schmidt and Diyi Yang},
year={2025},
eprint={2504.21798},
archivePrefix={arXiv},
primaryClass={cs.SE},
url={https://arxiv.org/abs/2504.21798},
}