Lack of documentation regarding RoPE scaling #2402

@maddes8cht

Documentation for the current RoPE scaling development is missing.
The only documentation of the RoPE parameters is these two lines in the --help output:

  --rope-freq-base N RoPE base frequency (default: 10000.0)
  --rope-freq-scale N RoPE frequency scaling factor (default: 1)

RoPE scaling is not mentioned in the primary readme.md. The new parameters are also absent from the readme pages of the "main" and "server" examples, from the pages now linked in the "docs" section of the primary readme, and from the new wiki pages.

So it is very hard for a normal user to even notice that they have missed something.
Once a user is ready to try it, the only explanation that seems to be available at the moment is PR #2054.

But finding out what reasonable values for scale and base actually are requires a lot of reading. The first concrete suggestion there is explicitly addressed to the bold:
"For the bold, try adding the following command line parameters to your favorite model: -c 16384 --rope-freq-base 80000 --rope-freq-scale 0.5"
What about the not-so-bold?
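
To make the two parameters in that suggestion a little more concrete, here is a minimal sketch of how a base frequency and a frequency scale enter the rotation angles in the usual RoPE formulation. It is an illustration under stated assumptions, not the llama.cpp implementation: the function name rope_angles and the head size of 128 are hypothetical, and whether llama.cpp applies the scale in exactly this way should be checked against PR #2054.

```python
import math

def rope_angles(pos, n_dims, freq_base=10000.0, freq_scale=1.0):
    """Rotation angle for each dimension pair at token position `pos`.

    Hypothetical helper: assumes angle_i = freq_scale * pos * freq_base^(-2i / n_dims),
    i.e. the scale acts as a multiplier on the position (linear scaling).
    """
    return [freq_scale * pos * freq_base ** (-2.0 * i / n_dims)
            for i in range(n_dims // 2)]

# With scale 0.5, position 4096 receives the same angles as position 2048 unscaled,
# so twice as many positions fit into the angle range seen during training.
unscaled = rope_angles(2048, 128)
scaled = rope_angles(4096, 128, freq_scale=0.5)
assert all(math.isclose(a, b) for a, b in zip(unscaled, scaled))

# A larger base lowers every frequency except the first one,
# so the angles at a given position never grow.
default_base = rope_angles(4096, 128, freq_base=10000.0)
larger_base = rope_angles(4096, 128, freq_base=80000.0)
assert all(l <= d for l, d in zip(larger_base, default_base))
```

Under these assumptions, the scale compresses positions uniformly while the base stretches the wavelengths of the higher dimension pairs. That would explain why the two parameters can be combined, but it does not say which combinations work well in practice.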

Over the course of the PR, numerous combinations of the base and scale parameters were discussed, and I also experimented with the recommended combinations.

But a reasonably clear description of which values are recommended, how they depend on each other, and perhaps how they relate to Llama 2 is not really to be found, and if it is, only with considerable search effort.
RoPE scaling is a clear extension of what Llama can do. Shouldn't there be some form of documentation for it?
