Description
What happened?
This has already been discussed to some extent in #7938.
<|assistant|> followed by a single space tokenizes as:
32001 -> '<|assistant|>'
259 -> '  '
Also, <|assistant|>\n tokenizes as:
32001 -> '<|assistant|>'
29871 -> ' '
13 -> '
'
What happens is that the single whitespace that follows a special token is turned into the double-whitespace token (259), because add_prefix_space is triggered in llama.cpp whenever a special token is encountered.
In the second example the template actually wants a \n directly after <|assistant|>, but this special handling sneaks a space (29871) in between.
Is this the intended (correct) behavior?
When running Phi-3 and asking for a generation after <|assistant|>, Phi-3 insists on responding with a whitespace token or a combined token that starts with a whitespace.
When add_prefix_space is disabled and a \n is added after <|assistant|>, the issue is resolved and Phi-3 responds right away with normal text.
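To make the report easier to follow, here is a minimal toy sketch in Python of the behavior described above. It is not llama.cpp's tokenizer: the function toy_tokenize, its add_prefix_space parameter, and the three-entry text vocabulary are hypothetical and only reuse the token IDs quoted in this issue. It assumes a space is prepended to any text fragment that directly follows a special token, and then matches the text greedily against the toy vocabulary.

```python
import re

# Token IDs copied from the outputs quoted in this issue.
SPECIAL = {"<|assistant|>": 32001}
# Toy text vocabulary (assumption): only the three pieces seen above.
TEXT_VOCAB = {"  ": 259, " ": 29871, "\n": 13}


def toy_tokenize(prompt, add_prefix_space=True):
    """Hypothetical model of the reported behavior, not llama.cpp code."""
    pattern = "(" + "|".join(re.escape(t) for t in SPECIAL) + ")"
    ids = []
    after_special = False
    for piece in re.split(pattern, prompt):
        if not piece:
            continue
        if piece in SPECIAL:
            ids.append(SPECIAL[piece])
            after_special = True
            continue
        # Assumed behavior: a space is prepended to the text fragment
        # that directly follows a special token.
        if add_prefix_space and after_special:
            piece = " " + piece
        after_special = False
        # Greedy longest-match against the toy vocabulary.
        i = 0
        while i < len(piece):
            for tok in sorted(TEXT_VOCAB, key=len, reverse=True):
                if piece.startswith(tok, i):
                    ids.append(TEXT_VOCAB[tok])
                    i += len(tok)
                    break
            else:
                raise ValueError(f"no toy token for {piece[i]!r}")
    return ids


print(toy_tokenize("<|assistant|> "))    # [32001, 259]
print(toy_tokenize("<|assistant|>\n"))   # [32001, 29871, 13]
print(toy_tokenize("<|assistant|>\n", add_prefix_space=False))  # [32001, 13]
```

Under these assumptions, the first two calls reproduce the token sequences shown earlier, and the last call shows what the template presumably intends: the newline directly after the special token, with no space in between.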
Name and Version
What operating system are you seeing the problem on?
Windows
Relevant log output
No response