tts : implement mimi decoder #12636

Closed. ngxson wants to merge 10 commits.

ngxson
Collaborator

@ngxson ngxson commented Mar 29, 2025

llama.cpp/example/mimi

Related to #12392

This demonstrates running Kyutai's Mimi model via GGML.

TODO:

  • implement decode_frame
  • see how long generation goes
  • test with audio codes from Sesame
  • abstract the decode into a function decode(codes) that returns std::vector<float>

Quickstart

Convert the model to GGUF (no manual download is needed; the script automatically downloads the safetensors file):

python examples/tts/convert_mimi_to_gguf.py

# output file: kyutai-mimi.gguf

# optionally, use q8_0 quantization for higher speed
python examples/tts/convert_mimi_to_gguf.py --outtype q8_0

Then compile and run it:

cmake --build build -j --target llama-mimi

./build/bin/llama-mimi kyutai-mimi.gguf codes.txt

# output: output.wav

# alternatively, use "dummy1" to get a "hey hello there" sample output file
./build/bin/llama-mimi kyutai-mimi.gguf dummy1

Example of a codes file (one code per line):

1263
1597
1596
1477
1540
1720
1433
118
1066
1968
1096
232
418
566
1653
2010
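
For readers who want to produce or validate such a file, the "one code per line" format can be parsed with a few lines of C++. This is only a minimal sketch, not the parser used in this PR: `read_codes` and `read_codes_file` are hypothetical names, and the choice to skip blank lines and reject negative or non-numeric entries is an assumption. It does not check codes against the codebook size, since that is not stated here.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not from the PR): parse one integer code per line.
// Assumption: blank lines are skipped; any other line that is not a
// non-negative integer is reported as an error.
static std::vector<int32_t> read_codes(std::istream & in) {
    std::vector<int32_t> codes;
    std::string line;
    size_t lineno = 0;
    while (std::getline(in, line)) {
        lineno++;
        // Trim surrounding whitespace (this also handles "\r\n" line endings).
        const size_t first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos) {
            continue; // blank line
        }
        const size_t last = line.find_last_not_of(" \t\r");
        const std::string tok = line.substr(first, last - first + 1);

        size_t pos = 0;
        int value = 0;
        try {
            value = std::stoi(tok, &pos);
        } catch (const std::exception &) {
            pos = 0; // not a number, or out of range for int
        }
        // Reject trailing garbage ("12abc") and negative values.
        if (pos != tok.size() || value < 0) {
            throw std::runtime_error("invalid code on line " + std::to_string(lineno));
        }
        codes.push_back(static_cast<int32_t>(value));
    }
    return codes;
}

// Hypothetical convenience wrapper that opens a file such as codes.txt.
static std::vector<int32_t> read_codes_file(const std::string & path) {
    std::ifstream f(path);
    if (!f) {
        throw std::runtime_error("cannot open " + path);
    }
    return read_codes(f);
}
```

Taking a `std::istream` rather than a path keeps the parser easy to exercise with an in-memory `std::istringstream`, while `read_codes_file("codes.txt")` covers the on-disk case.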

github-actions bot added the examples and python (python script changes) labels Mar 29, 2025
@ngxson
Collaborator Author

ngxson commented Mar 30, 2025

Closing this PR and merging it with #12648.

@ngxson ngxson closed this Mar 30, 2025