Actions: huggingface/lighteval

Tests

1,898 workflow runs

Add swiss legal evals as new community tasks
Tests #2125: Pull request #389 synchronize by JoelNiklaus
February 5, 2025 13:24 Action required JoelNiklaus:add_swiss_legal_evals
Add GPQA for instruct models
Tests #2124: Pull request #534 synchronize by lewtun
February 5, 2025 13:17 38m 32s add-gpqa-generative
Add GPQA for instruct models
Tests #2123: Pull request #534 synchronize by lewtun
February 5, 2025 12:24 38m 46s add-gpqa-generative
Make BLEURT lazy
Tests #2122: Pull request #536 synchronize by hynky1999
February 5, 2025 12:09 41m 43s make_bleurt_lazy
Make BLEURT lazy
Tests #2121: Pull request #536 opened by hynky1999
February 5, 2025 12:08 40m 2s make_bleurt_lazy
Sync Math-verify (#535)
Tests #2120: Commit cb35bea pushed by hynky1999
February 5, 2025 11:34 41m 27s main
Add GPQA for instruct models
Tests #2119: Pull request #534 synchronize by lewtun
February 5, 2025 10:14 43m 14s add-gpqa-generative
Fix loading of vllm model from files
Tests #2118: Pull request #533 synchronize by NathanHB
February 5, 2025 09:15 37m 56s nathan-fix-vllm-from-file
Sync Math-verify
Tests #2117: Pull request #535 synchronize by hynky1999
February 5, 2025 00:47 39m 5s sync_math_verify
Sync Math-verify
Tests #2116: Pull request #535 synchronize by hynky1999
February 5, 2025 00:45 42m 22s sync_math_verify
Sync Math-verify
Tests #2115: Pull request #535 opened by hynky1999
February 5, 2025 00:26 44m 10s sync_math_verify
Add GPQA for instruct models
Tests #2114: Pull request #534 synchronize by lewtun
February 4, 2025 16:14 40m 37s add-gpqa-generative
Fix loading of vllm model from files
Tests #2113: Pull request #533 synchronize by NathanHB
February 4, 2025 15:10 38m 49s nathan-fix-vllm-from-file
Add GPQA for instruct models
Tests #2112: Pull request #534 synchronize by lewtun
February 4, 2025 14:57 39m 3s add-gpqa-generative
Add GPQA for instruct models
Tests #2111: Pull request #534 opened by lewtun
February 4, 2025 14:55 40m 49s add-gpqa-generative
Fix loading of vllm model from files
Tests #2110: Pull request #533 synchronize by NathanHB
February 4, 2025 14:07 39m 58s nathan-fix-vllm-from-file
Fix loading of vllm model from files
Tests #2109: Pull request #533 opened by NathanHB
February 4, 2025 14:05 40m 54s nathan-fix-vllm-from-file
Add custom task (bac-fr) for evaluation of models in french (#518)
Tests #2108: Commit d7a1f11 pushed by clefourrier
February 3, 2025 16:08 41m 20s main
Update french_evals.py
Tests #2107: Commit be7da17 pushed by clefourrier
February 3, 2025 12:13 39m 21s main
Add swiss legal evals as new community tasks
Tests #2106: Pull request #389 synchronize by JoelNiklaus
February 1, 2025 10:55 Action required JoelNiklaus:add_swiss_legal_evals
Add swiss legal evals as new community tasks
Tests #2105: Pull request #389 synchronize by JoelNiklaus
February 1, 2025 10:37 Action required JoelNiklaus:add_swiss_legal_evals
Multi node vLLM
Tests #2104: Pull request #530 synchronize by ncassereau
February 1, 2025 08:41 38m 56s ncassereau:multi_node_vllm
Add custom task (bac-fr) for evaluation of models in french
Tests #2103: Pull request #518 synchronize by mdiazmel
January 31, 2025 16:57 37m 50s mdiazmel:main
adds olympiad bench (#521)
Tests #2102: Commit d332207 pushed by NathanHB
January 31, 2025 14:20 39m 4s main
Multi node vLLM
Tests #2101: Pull request #530 opened by ncassereau
January 31, 2025 13:53 Action required ncassereau:multi_node_vllm