What is the prompt format? #91
The OpenAI endpoint will receive the user prompts and output the response, as described in the code (https://github.com/bigcode-project/bigcodebench/blob/main/bigcodebench/gen/util/openai_request.py#L29). Regarding the prompt context, please refer to this block (https://github.com/bigcode-project/bigcodebench/blob/main/bigcodebench/provider/utility.py#L43-L60).
You should be able to run the evaluation without changing the current implementation. Passing a base_url arg should be enough. If you need any other customization, please check out this doc (https://github.com/bigcode-project/bigcodebench/blob/main/ADVANCED_USAGE.md).
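Roughly, the request is an ordinary OpenAI-compatible chat-completions call with the built prompt as a single user message; a minimal sketch (the model name, sampling parameters, and base_url below are placeholders, and openai_request.py is the authoritative reference):

```python
# Minimal sketch of the request shape sent to an OpenAI-compatible endpoint.
# All concrete values are placeholders; see bigcodebench/gen/util/openai_request.py
# for the exact call and parameters.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # your endpoint

prompt = "<final user prompt for one BigCodeBench task>"  # built as discussed below

response = client.chat.completions.create(
    model="my-model",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
    temperature=0.0,
    n=1,
)
completion = response.choices[0].message.content
```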
Thank you. The block in utility.py is a great start, but I'd like to understand it in terms of the dataset columns, i.e., the column mapping from the code to the dataset (https://huggingface.co/datasets/bigcode/bigcodebench-hard/viewer/default/v0.1.0_hf?row=0&views%5B%5D=v010_hf).
I'm assuming:
task_prompt -> instruct_prompt / complete_prompt
instruction_prefix -> ?
Also, I don't quite understand why there is a "response" variable. Could you please clarify that? Is that the expected output?
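To make the mapping concrete, here is roughly what I'm picturing. This is purely my assumption (column names taken from the dataset viewer, and I don't know where instruction_prefix comes from, hence the question):

```python
# Purely my assumption of how the dataset columns might feed the prompt builder,
# posted to make the mapping question above concrete.
from datasets import load_dataset

# Config/split names taken from the dataset viewer URL above.
ds = load_dataset("bigcode/bigcodebench-hard", split="v0.1.0_hf")
row = ds[0]

split = "instruct"  # or "complete"
task_prompt = row["instruct_prompt"] if split == "instruct" else row["complete_prompt"]

# Where does instruction_prefix come from: a fixed string in the harness, or a column?
instruction_prefix = "<some fixed prefix defined by the harness?>"

user_message = f"{instruction_prefix}\n{task_prompt.strip()}\n"  # rough guess at the composition
```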
instruction_prefix -> these two variables. The "response" variable is used for profiling, the same design as in EvalPlus. It won't be used by any model APIs.
@terryyz While evaluating, I'm getting issues with the eval (0.00 pass@1), and:
Is there a way I can debug this?
The pointer for instruction_prefix has been fixed.
I think you may be running this on Windows; you are supposed to run it inside a Linux env. You may find this helpful.
@terryyz Ok, but why is it giving 0 pass@1? To test it, I made my code retrieve the HF dataset (bcb_hard), and if the input matches one of the prompts in the dataset, it replies with the last code block in the input (because it says "write code starting with...") plus the canonical_solution from the dataset. When tested manually it is fine, but it fails here.
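Roughly, the endpoint logic is something like this (a simplified sketch, not the exact code I run; the split name is taken from the dataset viewer URL):

```python
# Simplified sketch of my mock endpoint: if the incoming prompt matches a task,
# reply with the last fenced code block of the prompt plus canonical_solution.
import re
from datasets import load_dataset

ds = load_dataset("bigcode/bigcodebench-hard", split="v0.1.0_hf")  # split name from the viewer URL
rows = list(ds)

FENCE = "`" * 3  # triple backtick, kept out of string literals so the snippet quotes cleanly

def respond(user_message: str) -> str:
    row = next((r for r in rows if r["instruct_prompt"] in user_message), None)
    if row is None:
        return ""  # prompt didn't match any task
    blocks = re.findall(FENCE + r"(?:python)?\n(.*?)" + FENCE, user_message, flags=re.DOTALL)
    last_block = blocks[-1] if blocks else ""
    return f"{FENCE}python\n{last_block}\n{row['canonical_solution']}\n{FENCE}"
```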
@terryyz No module named "pytesseract", "lxml", "sklearn", etc. I'm using Linux now (a Jupyter notebook on HF Spaces) and getting a low pass rate because of all these import errors. Is my model supposed to respond with a dynamically importing solution? But when I look at the sanitized prompts for o3-mini-high, I see it never does that; it just imports normally. So do I need to import them myself before evaluating, or is the model supposed to use subprocess.run('pip install module')?
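For example, is the intended fix just to pre-install them in the eval environment before running? Something like this (the package list is only the modules from my errors, not an exhaustive set):

```python
# Just illustrating what installing the missing modules myself would look like;
# the list below is only the packages from my errors, not everything that
# BigCodeBench tasks may import.
import subprocess
import sys

# note: "sklearn" is imported as sklearn but installed as scikit-learn
for pkg in ["pytesseract", "lxml", "scikit-learn"]:
    subprocess.run([sys.executable, "-m", "pip", "install", pkg], check=True)
```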
I have an OpenAI-compatible endpoint that I'm prepping for evaluation, and I want to know what the final request being sent is, e.g.:
(I'm looking at the HF dataset https://huggingface.co/datasets/bigcode/bigcodebench-hard/viewer/default/v0.1.0_hf?row=0&views%5B%5D=v010_hf)
Basically, what format should I expect, and what should I send back?