Closed
Description
A new paper describes ANPD (Adaptive N-gram Parallel Decoding).
According to the paper, ANPD can speed up an LLM by 3x with no drop in generation quality.
The paper also lists several advantages of ANPD over the speculative decoding techniques that may already exist in llama.cpp.
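
To make the "no drop in generation quality" claim concrete, here is a minimal sketch of generic n-gram draft-and-verify decoding. It is not the paper's algorithm and not llama.cpp code: `toy_model`, `greedy_decode`, and `draft_and_verify` are hypothetical names, the "model" is a deterministic toy function, and the n-gram table is a plain dictionary. The point it illustrates is that every emitted token is still the one the model itself would pick, so the output should match plain greedy decoding while needing fewer verification rounds.

```python
from typing import Callable, Dict, List, Tuple

Token = int
NextFn = Callable[[List[Token]], Token]  # hypothetical stand-in for one greedy LLM step


def toy_model(context: List[Token]) -> Token:
    """Deterministic toy 'LLM': the next token depends on the last two tokens."""
    a = context[-1]
    b = context[-2] if len(context) > 1 else 0
    return (a + b) % 5


def greedy_decode(next_fn: NextFn, prompt: List[Token], n_new: int) -> List[Token]:
    """Baseline: one model call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(next_fn(out))
    return out


def draft_and_verify(next_fn: NextFn, prompt: List[Token], n_new: int,
                     n: int = 2, k: int = 4) -> Tuple[List[Token], int]:
    """Draft up to k tokens from an n-gram table, then keep only verified ones.

    Returns the generated sequence and the number of verification rounds.
    """
    out = list(prompt)
    table: Dict[Tuple[Token, ...], Token] = {}  # last n tokens -> token seen after them

    def learn(seq: List[Token], start: int) -> None:
        # Record every (n-gram -> next token) pair from position `start` onward.
        for i in range(max(start, n), len(seq)):
            table[tuple(seq[i - n:i])] = seq[i]

    learn(out, 0)
    target = len(prompt) + n_new
    rounds = 0
    while len(out) < target:
        # Draft: chain n-gram lookups until the table has no continuation.
        draft: List[Token] = []
        ctx = list(out)
        while len(draft) < k and len(ctx) >= n and tuple(ctx[-n:]) in table:
            tok = table[tuple(ctx[-n:])]
            draft.append(tok)
            ctx.append(tok)

        # Verify: a real model would score all draft positions in one batched
        # forward pass; this toy calls next_fn once per position instead.
        rounds += 1
        before = len(out)
        for tok in draft:
            if len(out) >= target:
                break
            expected = next_fn(out)
            out.append(expected)      # always emit the model's own choice
            if expected != tok:       # first mismatch ends this round
                break
        else:
            # Whole draft accepted (or it was empty): the pass yields one more token.
            if len(out) < target:
                out.append(next_fn(out))
        learn(out, before)            # adapt the table to the newly generated text
    return out, rounds
```

In this toy setup, calling `draft_and_verify(toy_model, [1, 1], 60)` should return the same tokens as `greedy_decode(toy_model, [1, 1], 60)`, with fewer rounds than generated tokens once the sequence starts repeating. Any real speedup depends on how often the drafts match and on the cost of batched verification, neither of which this sketch measures.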