Actions: EleutherAI/lm-evaluation-harness

Showing runs from all workflows
4,259 workflow runs

Warn when splitting sequences
Tasks Modified #4707: Pull request #2929 opened by fxmarty-amd
April 25, 2025 13:40 Action required fxmarty-amd:warn-sequence-split
Warn when splitting sequences
Unit Tests #4679: Pull request #2929 opened by fxmarty-amd
April 25, 2025 13:40 Action required fxmarty-amd:warn-sequence-split
Resolve the inconsistency between the description of output_path and the actual logic
Tasks Modified #4706: Pull request #2928 synchronize by rangehow
April 25, 2025 04:25 Action required rangehow:main
Resolve the inconsistency between the description of output_path and the actual logic
Unit Tests #4678: Pull request #2928 synchronize by rangehow
April 25, 2025 04:25 Action required rangehow:main
Draft - Support ov models via genai
Unit Tests #4677: Pull request #1862 synchronize by sstrehlk
April 24, 2025 10:20 2m 25s sstrehlk:support-ov-models-via-genai
Added NorEval, a novel Norwegian benchmark
Unit Tests #4676: Pull request #2919 synchronize by vmkhlv
April 24, 2025 09:56 Action required vmkhlv:noreval
Added NorEval, a novel Norwegian benchmark
Tasks Modified #4704: Pull request #2919 synchronize by vmkhlv
April 24, 2025 09:56 Action required vmkhlv:noreval
Resolve the inconsistency between the description of output_path and the actual logic
Unit Tests #4675: Pull request #2928 opened by rangehow
April 24, 2025 05:11 Action required rangehow:main
Resolve the inconsistency between the description of output_path and the actual logic
Tasks Modified #4703: Pull request #2928 opened by rangehow
April 24, 2025 05:11 Action required rangehow:main
enable evaluation from yaml config file
Tasks Modified #4702: Pull request #2893 synchronize by artemorloff
April 23, 2025 19:06 1m 47s artemorloff:feature/eval_from_config
Added NorEval, a novel Norwegian benchmark
Unit Tests #4673: Pull request #2919 synchronize by vmkhlv
April 23, 2025 10:48 Action required vmkhlv:noreval
Added NorEval, a novel Norwegian benchmark
Tasks Modified #4701: Pull request #2919 synchronize by vmkhlv
April 23, 2025 10:48 Action required vmkhlv:noreval
enable evaluation from yaml config file
Tasks Modified #4698: Pull request #2893 synchronize by artemorloff
April 22, 2025 21:36 1m 20s artemorloff:feature/eval_from_config
enable evaluation from yaml config file
Tasks Modified #4697: Pull request #2893 synchronize by artemorloff
April 22, 2025 21:16 1m 24s artemorloff:feature/eval_from_config
Add simple Dockerfile and instructions
Tasks Modified #4696: Pull request #2837 synchronize by kiersten-stokes
April 22, 2025 18:25 11s kiersten-stokes:add-dockerfile
Add simple Dockerfile and instructions
Unit Tests #4668: Pull request #2837 synchronize by kiersten-stokes
April 22, 2025 18:25 4m 48s kiersten-stokes:add-dockerfile
Fix gsm8k task to enhance accuracy
Tasks Modified #4694: Pull request #2924 opened by hfadzxy
April 21, 2025 09:31 Action required hfadzxy:zxy_gsm8k
Fix gsm8k task to enhance accuracy
Unit Tests #4666: Pull request #2924 opened by hfadzxy
April 21, 2025 09:31 Action required hfadzxy:zxy_gsm8k
Added AIME Support
Unit Tests #4665: Pull request #2892 synchronize by Zephyr271828
April 20, 2025 17:22 Action required Zephyr271828:aime