This organization contains the source code for SWE-bench, a benchmark for evaluating AI systems on real-world GitHub issues.
Use the repositories in this organization to:
- Construct SWE-bench datasets and run local evaluation (SWE-bench/SWE-bench)
- Run evaluations automatically and quickly on the cloud (SWE-bench/sb-cli)
- Submit your predictions and evaluation results to be featured on the public leaderboard (SWE-bench/experiments)
Also check out these related organizations:
- SWE-bench-repos: Mirror clones of the repositories used for SWE-bench-style evaluations.
- SWE-agent: Solve GitHub issues automatically with a language-model-powered agent!