Actions: EleutherAI/lm-evaluation-harness

Showing runs from all workflows
5,350 workflow runs

New healthcare benchmark: careqa
Unit Tests #4241: Pull request #2714 synchronize by PabloAgustin
February 20, 2025 11:26 7m 9s PabloAgustin:careqa
New healthcare benchmark: careqa
Tasks Modified #4269: Pull request #2714 synchronize by PabloAgustin
February 20, 2025 11:26 1m 58s PabloAgustin:careqa
New healthcare benchmark: careqa
Unit Tests #4240: Pull request #2714 synchronize by PabloAgustin
February 20, 2025 11:20 Action required PabloAgustin:careqa
New healthcare benchmark: careqa
Tasks Modified #4268: Pull request #2714 synchronize by PabloAgustin
February 20, 2025 11:20 Action required PabloAgustin:careqa
New healthcare benchmark: careqa
Unit Tests #4239: Pull request #2714 synchronize by PabloAgustin
February 20, 2025 11:15 Action required PabloAgustin:careqa
New healthcare benchmark: careqa
Tasks Modified #4267: Pull request #2714 synchronize by PabloAgustin
February 20, 2025 11:15 Action required PabloAgustin:careqa
Add Task (Financial mmlu ko)
Tasks Modified #4265: Pull request #2699 synchronize by choics2623
February 20, 2025 00:53 Action required choics2623:financial_mmlu_ko
Add Task (Financial mmlu ko)
Unit Tests #4237: Pull request #2699 synchronize by choics2623
February 20, 2025 00:53 Action required choics2623:financial_mmlu_ko
Logging
Unit Tests #4236: Pull request #2203 synchronize by baberabb
February 19, 2025 16:03 7m 23s logging-best-practices
Logging
Tasks Modified #4264: Pull request #2203 synchronize by baberabb
February 19, 2025 16:03 2m 19s logging-best-practices
Logging
Tasks Modified #4263: Pull request #2203 synchronize by baberabb
February 19, 2025 15:55 2m 12s logging-best-practices
Logging
Unit Tests #4235: Pull request #2203 synchronize by baberabb
February 19, 2025 15:55 6m 54s logging-best-practices
Logging
Unit Tests #4234: Pull request #2203 synchronize by baberabb
February 19, 2025 14:51 7m 20s logging-best-practices
Logging
Tasks Modified #4262: Pull request #2203 synchronize by baberabb
February 19, 2025 14:51 2m 35s logging-best-practices
New healthcare benchmark: careqa
Tasks Modified #4261: Pull request #2714 opened by PabloAgustin
February 19, 2025 11:49 2m 2s PabloAgustin:careqa
New healthcare benchmark: careqa
Unit Tests #4233: Pull request #2714 opened by PabloAgustin
February 19, 2025 11:49 7m 32s PabloAgustin:careqa
add o3-mini support
Unit Tests #4232: Pull request #2697 synchronize by HelloJocelynLu
February 19, 2025 06:02 7m 1s deepprinciple:o3_mini
add o3-mini support
Tasks Modified #4260: Pull request #2697 synchronize by HelloJocelynLu
February 19, 2025 06:02 14s deepprinciple:o3_mini
Add AIBE task and utilities
Unit Tests #4231: Pull request #2712 opened by parimalthakre01
February 18, 2025 17:10 Action required parimalthakre01:feature/aibe-task
Add AIBE task and utilities
Tasks Modified #4259: Pull request #2712 opened by parimalthakre01
February 18, 2025 17:10 Action required parimalthakre01:feature/aibe-task
add audio modality (qwen2 audio only)
Unit Tests #4230: Pull request #2689 synchronize by artemorloff
February 18, 2025 15:53 6m 54s artemorloff:multimodality_audio
add audio modality (qwen2 audio only)
Tasks Modified #4258: Pull request #2689 synchronize by artemorloff
February 18, 2025 15:53 2m 46s artemorloff:multimodality_audio
add audio modality (qwen2 audio only)
Tasks Modified #4257: Pull request #2689 synchronize by artemorloff
February 18, 2025 15:43 1m 58s artemorloff:multimodality_audio
add audio modality (qwen2 audio only)
Unit Tests #4229: Pull request #2689 synchronize by artemorloff
February 18, 2025 15:43 6m 41s artemorloff:multimodality_audio
fix vllm (#2708)
Tasks Modified #4254: Commit 52df63b pushed by baberabb
February 17, 2025 23:51 15s main