Implement ANPD (3x speedup, lossless) #6813

Closed
@trudnorx

Description

A new paper describes ANPD (Adaptive N-gram Parallel Decoding).
According to the paper, ANPD can speed up an LLM by 3x with no drop in generation quality.
The paper also lists several advantages of ANPD over the speculative decoding techniques that may already be implemented in llama.cpp.
