
Update README.md
areyde authored Jun 6, 2024
1 parent 11bbc88 commit 621d2b5
Showing 1 changed file with 4 additions and 4 deletions.
module_summarization/README.md (8 changes: 4 additions & 4 deletions)
@@ -1,10 +1,10 @@
# 🏟️ Long Code Arena Baselines
## Module summarization

-This directory contains code for running baselines for the Module summarization task in the Long Code Arena benchmark.
+This directory contains the code for running baselines for the Module summarization task in the Long Code Arena benchmark.

-We provide implementation of baselines running inference via [OpenAI](https://platform.openai.com/docs/overview) and [Together.AI](https://www.together.ai/).
-We generate documentation based on an intent (one sentence description of documentation content), target documentation name and relevant code context.
+We provide the implementation of baselines running inference via [OpenAI](https://platform.openai.com/docs/overview) and [Together.AI](https://www.together.ai/).
+We generate documentation based on an intent (a one-sentence description of the documentation content), the target documentation name, and the relevant code context; a sketch of how these inputs might be combined follows below.

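For illustration, here is a minimal sketch of how a generation prompt could be assembled from those three inputs. The template, function name, and example values are assumptions for illustration, not the repository's actual prompt.

```python
# Hypothetical prompt assembly from the three task inputs; the exact
# template used by the baselines may differ.
def build_prompt(intent: str, doc_name: str, code_context: str) -> str:
    return (
        f"Write the documentation file `{doc_name}`.\n"
        f"Intent: {intent}\n\n"
        f"Relevant code:\n{code_context}\n"
    )

# Example usage with made-up inputs:
prompt = build_prompt(
    intent="Explains how to configure the training pipeline.",
    doc_name="training.md",
    code_context="def train(config): ...",
)
```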
# How-to

@@ -27,7 +27,7 @@ The script will generate predictions and put them into the `save_dir` directory

#### Metrics

-To compare predicted and ground truth metrics we introduce the new metric based on LLM as an assessor. Our approach involves feeding the LLM with relevant code and two versions of documentation: the ground truth and the model-generated text. To mitigate variance and potential ordering effects in model responses, we calculate the probability that the generated documentation is superior by averaging the results of two queries:
+To compare predicted and ground truth texts, we introduce a new metric based on an LLM as an assessor. Our approach involves feeding the LLM the relevant code and two versions of the documentation: the ground truth and the model-generated text. To mitigate variance and potential ordering effects in model responses, we calculate the probability that the generated documentation is superior by averaging the results of two queries:

```math
CompScore = \frac{ P(pred | LLM(code, pred, gold)) + P(pred | LLM(code, gold, pred))}{2}
```

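As a rough illustration, the metric could be computed along the following lines, assuming an OpenAI-style chat API with token log-probabilities. The prompt wording, the `gpt-4o` model choice, and the single-token A/B answer format are assumptions for this sketch, not the benchmark's actual implementation.

```python
# A minimal sketch of CompScore: query an assessor LLM twice with the two
# documents in swapped order and average the probability that the predicted
# documentation wins. Prompt and model are illustrative assumptions.
import math

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT = (
    "Here is the relevant code:\n{code}\n\n"
    "Documentation A:\n{doc_a}\n\n"
    "Documentation B:\n{doc_b}\n\n"
    "Which documentation describes the code better? "
    "Answer with a single letter: A or B."
)

def prob_first_better(code: str, doc_a: str, doc_b: str) -> float:
    """Probability that the assessor judges documentation A superior to B."""
    response = client.chat.completions.create(
        model="gpt-4o",  # hypothetical choice of assessor model
        messages=[{
            "role": "user",
            "content": PROMPT.format(code=code, doc_a=doc_a, doc_b=doc_b),
        }],
        max_tokens=1,
        logprobs=True,
        top_logprobs=5,
    )
    # Turn the log-probabilities of the first generated token into P("A").
    probs: dict[str, float] = {}
    for entry in response.choices[0].logprobs.content[0].top_logprobs:
        token = entry.token.strip()
        probs[token] = probs.get(token, 0.0) + math.exp(entry.logprob)
    p_a, p_b = probs.get("A", 0.0), probs.get("B", 0.0)
    return p_a / (p_a + p_b) if (p_a + p_b) > 0 else 0.5

def comp_score(code: str, pred: str, gold: str) -> float:
    """Average over both orderings to cancel positional bias."""
    first = prob_first_better(code, pred, gold)        # pred shown first
    second = 1.0 - prob_first_better(code, gold, pred)  # pred shown second
    return (first + second) / 2
```

Averaging the two orderings matters because LLM judges tend to favor whichever answer appears first; the symmetric formulation cancels that bias.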