SWE-bench

Organization for maintaining the SWE-bench/agent projects

This organization contains the source code for SWE-bench, a benchmark for evaluating AI systems on real-world GitHub issues.

Use the repositories in this organization to...

Also check out related organizations

  • SWE-bench-repos: Mirror clones of repositories used for SWE-bench-style evaluations.
  • SWE-agent: Solve GitHub issue(s) automatically with a Language Model powered agent!

Pinned

  1. SWE-bench Public

    SWE-bench [Multimodal]: Can Language Models Resolve Real-world Github Issues?

    Python, 2.5k stars, 430 forks

  2. experiments Public

    Open-sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task.

    Shell, 150 stars, 152 forks

  3. sb-cli Public

    Run SWE-bench evaluations remotely

    Python, 6 stars

  4. swe-bench.github.io Public

    Landing page + leaderboard for the SWE-bench benchmark

    HTML, 2 stars, 4 forks

Repositories

Showing 6 of 6 repositories
  • swe-bench.github.io Public

    Landing page + leaderboard for the SWE-bench benchmark

    HTML, 2 stars, 4 forks, 1 open issue, 0 pull requests. Updated Feb 27, 2025
  • .github Public
    0 stars, 0 forks, 0 open issues, 0 pull requests. Updated Feb 25, 2025
  • experiments Public

    Open-sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task.

    Shell, 150 stars, 152 forks, 4 open issues, 17 pull requests. Updated Feb 25, 2025
  • sb-cli Public

    Run SWE-bench evaluations remotely

    Python, 6 stars, MIT license, 0 forks, 3 open issues, 0 pull requests. Updated Feb 25, 2025
  • SWE-bench Public

    SWE-bench [Multimodal]: Can Language Models Resolve Real-world Github Issues?

    Python, 2,541 stars, MIT license, 430 forks, 33 open issues, 6 pull requests. Updated Feb 24, 2025
  • humanevalfix-results Public

    Evaluation data + results for SWE-agent inference on HumanEvalFix task

    Jupyter Notebook, 0 stars, 0 forks, 0 open issues, 0 pull requests. Updated Jul 11, 2024
