
plurigrid/properadness

 
 


Preparedness Evals

This repository contains the code for multiple Preparedness evals that use nanoeval and alcatraz.

System requirements

  1. Python 3.11 (3.12 is untested; 3.13 will break chz)
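A quick way to confirm the interpreter matches the requirement above before installing is sketched below. The helper name `check_python` and its status strings are illustrative assumptions, not part of this repository. Versions older than 3.11 are reported as "unknown" because this README does not mention them.

```python
import sys


def check_python(version_info=sys.version_info):
    """Classify an interpreter version against the stated requirement.

    Hypothetical helper: the categories mirror the README wording only.
    """
    major_minor = (version_info[0], version_info[1])
    if major_minor == (3, 11):
        return "ok"           # the required version
    if major_minor == (3, 12):
        return "untested"     # stated as untested
    if major_minor >= (3, 13):
        return "unsupported"  # stated to break chz
    return "unknown"          # older versions are not mentioned


if __name__ == "__main__":
    found = f"{sys.version_info[0]}.{sys.version_info[1]}"
    print(f"Python {found}: {check_python()}")
```

Running the script with the interpreter you plan to install into should print `ok` before you continue with the install step.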

Install pre-requisites

for proj in nanoeval alcatraz nanoeval_alcatraz; do
    pip install -e project/"$proj"
done

Evals

  • PaperBench
  • SWELancer (Forthcoming)
  • MLE-bench (Forthcoming)

About

Properads and Pluricategories

