
Step 4: Evaluating Models with knn: Incorrect perplexity (ppl) #18

Open
Binn0 opened this issue Dec 4, 2024 · 2 comments

Comments


Binn0 commented Dec 4, 2024

When I reproduce Step 4 (Evaluating Models), the perplexity (PPL) I get from running kNN-LM is around 17. Could you explain why this might be the case? I would greatly appreciate a response.

Collaborator

urialon commented Dec 5, 2024 via email

Author

Binn0 commented Dec 5, 2024

Dear author,

Thank you very much for your response! I am using the neulab/gpt2-finetuned-wikitext103 model, and the dataset is WikiText-103. The index and vals files I am using are gpt2/index_gpt2_116988150_768.indexed and gpt2/dstore_gpt2_116988150_768_vals.npy, respectively, downloaded from https://knn-transformers.s3.amazonaws.com/index.html.

However, when using the --knn option, the perplexity (PPL) of GPT-2 is 17.34, which is significantly worse than the 12.57 you reported. Do you know what might be causing this discrepancy?
[screenshot: evaluation output showing PPL 17.34]
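For reference, my understanding is that the reported PPL depends heavily on the interpolation hyperparameters (the interpolation weight, the number of neighbors k, and the temperature over retrieval distances), so a mismatch in any of these could explain the gap. Below is a minimal sketch of the standard kNN-LM interpolation from Khandelwal et al. (2020); `knn_lm_probs` and its parameter names are my own for illustration, not the repo's actual code:

```python
import numpy as np

# Standard kNN-LM interpolation:
#   p(w | c) = lmbda * p_knn(w | c) + (1 - lmbda) * p_lm(w | c)
# where p_knn is a softmax over (negative) distances of the retrieved
# datastore entries, scattered onto the token ids stored with them.

def knn_lm_probs(p_lm, neg_dists, neighbor_ids, vocab_size,
                 lmbda=0.25, temp=1.0):
    """Interpolate LM probabilities with a distance-weighted kNN distribution.

    p_lm:         (vocab_size,) LM next-token probabilities
    neg_dists:    (k,) negative L2 distances of the retrieved keys
    neighbor_ids: (k,) token ids stored alongside the retrieved keys
    """
    # Softmax over negative distances gives one weight per neighbor.
    logits = np.asarray(neg_dists, dtype=np.float64) / temp
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()

    # Scatter-add neighbor weights into a vocabulary-sized distribution.
    p_knn = np.zeros(vocab_size)
    np.add.at(p_knn, np.asarray(neighbor_ids), weights)

    return lmbda * p_knn + (1.0 - lmbda) * p_lm

# Toy example: the LM is uniform over 4 tokens, but both retrieved
# neighbors agree on token 2, so its probability is boosted.
p_lm = np.full(4, 0.25)
p = knn_lm_probs(p_lm, neg_dists=[-1.0, -1.5], neighbor_ids=[2, 2],
                 vocab_size=4)
# p[2] = 0.25 * 1.0 + 0.75 * 0.25 = 0.4375
```

Since PPL is exp of the average negative log of these interpolated probabilities, even a small change in lmbda or temp can move it by a point or more.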

Another question: in your article, RetoMaton is compared using FoSS (fraction of saved searches), and according to your figure, a smaller FoSS value corresponds to a lower PPL and better performance. However, for kNN-LM there seems to be no FoSS-related hyperparameter in the code.
[screenshot: FoSS vs. PPL figure from the paper]

A reply would be greatly appreciated.
