
Fix HellaSwag #4982


Closed · wants to merge 1 commit

Conversation

ikawrakow
Contributor

HellaSwag is broken on current master (see #4980). It is related to KV cache handling.

Instead of trying to sort that out, I simply changed the code to evaluate the full sequence (context + ending together) for each of the 4 endings.
The performance hit is surprisingly low: 400 tasks run in 55 seconds with a 7B model (fp16, not quantized), versus 49 seconds before this change (where is the time going? the number of tokens being evaluated is at least twice as large). The HellaSwag score for LLaMA-v2 is now 77.00, as we had before, versus 53.00 on master.

I left the existing version commented out for now.
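
To make the scoring scheme concrete, here is a minimal sketch of evaluating context + ending for every candidate (this is not the actual llama.cpp implementation; `eval_logprobs` is a hypothetical hook standing in for a model forward pass, and the score here is the plain sum of ending-token log-probabilities):

```cpp
#include <cstddef>
#include <cmath>
#include <functional>
#include <vector>

// Hypothetical model hook (not the llama.cpp API): given a token sequence,
// returns a vector of the same length where element i is
// log P(tokens[i] | tokens[0..i-1]); element 0 is unused and set to 0.
using eval_fn = std::function<std::vector<double>(const std::vector<int> &)>;

// Pick the most likely ending: re-evaluate the full sequence
// (context + ending) for each candidate, with no KV-cache reuse across
// endings, and sum the log-probabilities of the ending tokens only.
size_t pick_ending(const std::vector<int> & context,
                   const std::vector<std::vector<int>> & endings,
                   const eval_fn & eval_logprobs) {
    size_t best = 0;
    double best_score = -INFINITY;
    for (size_t j = 0; j < endings.size(); ++j) {
        std::vector<int> full = context;                     // context + ending together
        full.insert(full.end(), endings[j].begin(), endings[j].end());
        const std::vector<double> lp = eval_logprobs(full);  // one log-prob per position
        double score = 0.0;
        for (size_t i = context.size(); i < full.size(); ++i) {
            score += lp[i];                                  // score only the ending tokens
        }
        if (score > best_score) {
            best_score = score;
            best = j;
        }
    }
    return best;
}
```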

@ikawrakow
Contributor Author

Closing in favor of #4981

ikawrakow closed this on Jan 16, 2024