
Broken generation with specific ngl values #3820

Closed
@staviq


While working on compression for copy/save state, I found a bug that turned out to be reproducible on current main (41aee4d).

It seems to be model independent, and no parameters other than -ngl seem to make a difference either.

The first symptom shows up in save-load-state, main and server when -ngl is set to exactly N-1, where N is the number of offloadable layers (29 for the model used in the perplexity runs below). The generated output looks like this:

 Hello there!###############################

The second symptom was found by accident while fiddling with save-load-state for the purpose of implementing compression. If -ngl is N or bigger (all layers offloaded), the problem above seems to disappear. However:
Not only does save-load-state fail because the generated text differs between the two runs,
but also, after some tokens have been sampled, llama_copy_state_data outputs a mostly empty array. I only noticed this because I tried to dump the state after generation and suddenly started getting a 99% compression ratio on that array, which turned out to be mostly zeroes.
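The "mostly zeroes" observation can be checked without a compressor in the loop. The following is a minimal sketch, not the code used in this issue: it assumes the state has already been copied into a Python `bytes` object by some binding or dumped to a file, and both helper names (`zero_fraction`, `compression_saving`) are hypothetical. The two buffers are synthetic stand-ins, not real llama.cpp state.

```python
import random
import zlib


def zero_fraction(buf: bytes) -> float:
    """Fraction of bytes in buf equal to 0x00 (0.0 for an empty buffer)."""
    if not buf:
        return 0.0
    return buf.count(0) / len(buf)


def compression_saving(buf: bytes, level: int = 6) -> float:
    """Share of the original size removed by zlib (0.99 means 99% smaller)."""
    if not buf:
        return 0.0
    return 1.0 - len(zlib.compress(buf, level)) / len(buf)


# Synthetic stand-ins (assumption): pseudo-random bytes for a "populated"
# state, and an all-zero buffer for the suspicious post-generation state.
rng = random.Random(0)
populated = bytes(rng.getrandbits(8) for _ in range(1 << 16))
suspect = bytes(1 << 16)

for name, buf in (("populated", populated), ("suspect", suspect)):
    print(name, round(zero_fraction(buf), 4), round(compression_saving(buf), 4))
```

A zero fraction close to 1.0 right after generation would point at the state copy itself rather than at the compression code.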

All -ngl values from 0 to N-2 work properly.

I have no way of testing on AMD, so I do not know whether it's Nvidia specific.

main.output.txt
main.log

As a sanity check, here are results for -ngl from 0 to N with the same model and parameters (except -ngl):

out.txt
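A sweep like the one above could be scripted roughly as follows. This is a sketch under assumptions, not the script that produced out.txt: the binary path, model path, prompt, token count and seed are placeholders, and `looks_broken` is a hypothetical heuristic that only detects the run of `#` characters shown earlier.

```python
import subprocess


def build_command(binary, model, ngl, prompt="Hello there!", n_predict=32, seed=1):
    """Command line for one run; only -ngl changes between runs."""
    return [binary, "-m", model, "-ngl", str(ngl),
            "-p", prompt, "-n", str(n_predict), "-s", str(seed)]


def looks_broken(text, min_run=8):
    """True if text contains at least min_run consecutive '#' characters."""
    return "#" * min_run in text


def sweep(binary, model, n_layers):
    """Run the binary once per -ngl value in 0..n_layers and flag bad output."""
    results = {}
    for ngl in range(n_layers + 1):
        proc = subprocess.run(build_command(binary, model, ngl),
                              capture_output=True, text=True, errors="replace")
        results[ngl] = looks_broken(proc.stdout)
    return results
```

For example, `sweep("./main", "model.gguf", 29)` (both paths hypothetical) would return a dict mapping each -ngl value to whether the `#` pattern appeared.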

Edit: Interestingly enough, perplexity looks fine:

-ngl N-2 (27/29)
[1]5.2069,[2]5.1932,[3]5.1802,[4]5.2837,[5]5.2742,[6]5.0776,
Final estimate: PPL = 5.0776 +/- 0.25768
-ngl N-1 (28/29)
[1]5.2069,[2]5.1932,[3]5.1802,[4]5.2837,[5]5.2742,[6]5.0776,
Final estimate: PPL = 5.0776 +/- 0.25768
-ngl N (29/29)
[1]5.2077,[2]5.1813,[3]5.1687,[4]5.2820,[5]5.2682,[6]5.0756,
Final estimate: PPL = 5.0756 +/- 0.25766
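To put a number on "looks fine", the per-chunk values can be compared directly. The sketch below parses the three lines exactly as printed above; the helper names `parse_chunks` and `max_abs_diff` are hypothetical and not part of any llama.cpp tool.

```python
import re


def parse_chunks(line):
    """Extract the running perplexity values from a '[i]value,' formatted line."""
    return [float(v) for v in re.findall(r"\[\d+\]([0-9.]+)", line)]


def max_abs_diff(a, b):
    """Largest absolute per-chunk difference between two runs."""
    if len(a) != len(b):
        raise ValueError("runs have different chunk counts")
    return max(abs(x - y) for x, y in zip(a, b))


# Values copied from the three runs above.
ngl_27 = parse_chunks("[1]5.2069,[2]5.1932,[3]5.1802,[4]5.2837,[5]5.2742,[6]5.0776,")
ngl_28 = parse_chunks("[1]5.2069,[2]5.1932,[3]5.1802,[4]5.2837,[5]5.2742,[6]5.0776,")
ngl_29 = parse_chunks("[1]5.2077,[2]5.1813,[3]5.1687,[4]5.2820,[5]5.2682,[6]5.0756,")

print(max_abs_diff(ngl_27, ngl_28))
print(max_abs_diff(ngl_27, ngl_29))
```

The N-2 and N-1 runs are identical chunk for chunk, while the N run differs from them by at most about 0.012, well inside the reported +/- 0.25768 error estimate.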

Labels: Nvidia GPU (issues specific to Nvidia GPUs), bug (something isn't working), generation quality (quality of model output)
