If you're interested in turning a GitHub repository into a SWE-gym, install the package from source.
Tip
SWE-smith requires Docker to create execution environments. SWE-smith was developed and tested on Ubuntu 22.04.4 LTS.
We do not plan to support Windows or macOS.
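A quick preflight check can catch a missing Docker install or an unsupported platform before any environments are built. The following is a minimal sketch, not part of SWE-smith: the function name `check_prerequisites` and its `system` and `which` parameters are hypothetical, and it only checks that a `docker` executable is on the `PATH`, not that the Docker daemon is running.

```python
import platform
import shutil


def check_prerequisites(system=None, which=shutil.which):
    """Return a list of problems; an empty list means the basics look fine.

    Hypothetical helper, not a SWE-smith API. `system` and `which` are
    injectable so the check can be exercised without touching the host.
    """
    system = platform.system() if system is None else system
    problems = []
    # The README states that only Linux (Ubuntu 22.04.4 LTS) was tested.
    if system != "Linux":
        problems.append(f"unsupported platform: {system}")
    # Only verifies that a docker executable is discoverable on PATH.
    if which("docker") is None:
        problems.append("docker executable not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("warning:", problem)
```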
You can then build a dataset for the repository by...
from swesmith.profiles import registry
from datasets import load_dataset

ds = load_dataset("SWE-bench/SWE-smith", split="train")  # Loads all 52k task instances
for task in ds:
    rp = registry.get_from_inst(task)  # Get the RepoProfile for the task
    container = rp.get_container(task)  # Returns pointer to a Docker container with the task initialized

    """TODO: Train!"""
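When iterating over tens of thousands of task instances, it can help to cap the number of tasks and to keep going when a single container fails to start. The sketch below wraps the loop above in a generator. `iter_task_containers`, `get_profile`, and `limit` are hypothetical names introduced for illustration; the only assumption about the library is the one the snippet above already shows, namely that the profile object has a `get_container(task)` method.

```python
from itertools import islice


def iter_task_containers(tasks, get_profile, limit=None):
    """Yield (task, container) pairs, skipping tasks whose setup raises.

    Hypothetical helper. `get_profile` stands in for a callable such as
    registry.get_from_inst; `limit=None` means iterate over every task.
    """
    for task in islice(tasks, limit):
        try:
            profile = get_profile(task)
            container = profile.get_container(task)
        except Exception as exc:  # broad on purpose: one bad task should not end the run
            print(f"skipping task: {exc}")
            continue
        yield task, container
```

With the real library this might be called as `iter_task_containers(ds, registry.get_from_inst, limit=10)` to try a handful of instances before committing to a full run.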
SWE-smith has been used to:
Fine-tune Qwen 2.5 Coder into SWE-agent-LM-32B (a +32% jump on SWE-bench Verified!) using SWE-agent [Tutorial]
Perform GRPO-style reinforcement learning using SkyRL
@misc{yang2025swesmith,
title={SWE-smith: Scaling Data for Software Engineering Agents},
author={John Yang and Kilian Lieret and Carlos E. Jimenez and Alexander Wettig and Kabir Khandpur and Yanzhe Zhang and Binyuan Hui and Ofir Press and Ludwig Schmidt and Diyi Yang},
year={2025},
eprint={2504.21798},
archivePrefix={arXiv},
primaryClass={cs.SE},
url={https://arxiv.org/abs/2504.21798},
}