Commit a02256f

docs: update links
1 parent 3ed7502 commit a02256f

1 file changed: +8 -8 lines changed

README.md

Lines changed: 8 additions & 8 deletions
@@ -18,16 +18,16 @@
 <p align="center">
 <a href="#-about">🌸About</a> •
 <a href="#-quick-start">🔥Quick Start</a> •
-<a href="#-llm-generated-code">💻LLM code</a> •
-<a href="#-failure-inspection">🔍Failure inspection</a> •
+<a href="#-failure-inspection">🔍Failure Inspection</a> •
 <a href="#-full-script">🚀Full Script</a> •
 <a href="#-result-analysis">📊Result Analysis</a> •
-<a href="#-known-issues">🐞Known issues</a> •
+<a href="#-llm-generated-code">💻LLM-generated Code</a> •
+<a href="#-known-issues">🐞Known Issues</a> •
 <a href="#-citation">📜Citation</a> •
 <a href="#-acknowledgement">🙏Acknowledgement</a>
 </p>
 
-## About
+## 🌸 About
 
 ### BigCodeBench
 
@@ -307,23 +307,23 @@ Here are some tips to speed up the evaluation:
 </div>
 </details>
 
-## Failure Inspection
+## 🔍 Failure Inspection
 
 You can inspect the failed samples by using the following command:
 
 ```bash
 bigcodebench.inspect --eval-results sample-sanitized-calibrated_eval_results.json --in-place
 ```
 
-## Full Script
+## 🚀 Full Script
 
 We provide a sample script to run the full pipeline:
 
 ```bash
 bash run.sh
 ```
 
-## Result Analysis
+## 📊 Result Analysis
 
 We provide a script to replicate the analysis like Elo Rating and Task Solve Rate, which helps you understand the performance of the models further.
 
@@ -340,7 +340,7 @@ python get_results.py
 We share pre-generated code samples from LLMs we have [evaluated](https://huggingface.co/spaces/bigcode/bigcodebench-leaderboard):
 * See the attachment of our [v0.1.5](https://github.com/bigcode-project/bigcodebench/releases/tag/v0.1.5). We include both `sanitized_samples.zip` and `sanitized_samples_calibrated.zip` for your convenience.
 
-## Known Issues
+## 🐞 Known Issues
 
 - [ ] Due to the flakes in the evaluation, the execution results may vary slightly (~0.2%) between runs. We are working on improving the evaluation stability.
 