We provide a sample script to run the full pipeline:
```bash
bash run.sh
```
## 📊 Result Analysis
We provide a script to replicate analyses such as Elo Rating and Task Solve Rate, which help you further understand model performance.
```bash
python get_results.py
```
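For intuition, the Elo Rating used in such leaderboards is computed from pairwise "battles" between models. The sketch below shows a minimal pairwise Elo update; the function names, K-factor, and match data are hypothetical illustrations, not the repository's actual implementation in `get_results.py`.

```python
# Minimal sketch of pairwise Elo updates for model comparison.
# All names and parameters here are illustrative assumptions.

def expected_score(r_a: float, r_b: float) -> float:
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update_elo(ratings: dict, winner: str, loser: str, k: float = 32) -> None:
    """Update both ratings in place after one head-to-head comparison."""
    e_win = expected_score(ratings[winner], ratings[loser])
    ratings[winner] += k * (1 - e_win)
    ratings[loser] -= k * (1 - e_win)

# Hypothetical battles: model_a wins three of four.
ratings = {"model_a": 1000.0, "model_b": 1000.0}
for w, l in [("model_a", "model_b")] * 3 + [("model_b", "model_a")]:
    update_elo(ratings, w, l)
print(ratings["model_a"] > ratings["model_b"])  # prints: True
```

Because each update transfers the same number of points from loser to winner, the total rating mass is conserved; only relative ordering is meaningful.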
We share pre-generated code samples from LLMs we have [evaluated](https://huggingface.co/spaces/bigcode/bigcodebench-leaderboard):
* See the attachment of our [v0.1.5](https://github.com/bigcode-project/bigcodebench/releases/tag/v0.1.5). We include both `sanitized_samples.zip` and `sanitized_samples_calibrated.zip` for your convenience.
## 🐞 Known Issues
- [ ] Due to flakiness in the evaluation, execution results may vary slightly (~0.2%) between runs. We are working on improving the evaluation stability.