`gguf-split`: add a default option to not include tensor data in the first shard

### Motivation

Make it possible to produce a split whose first shard is very small and contains primarily the metadata, so that it can be downloaded quickly and the download of the other shards can start without waiting for the first one to finish.

### Proposition
Add an option to not include tensor data in the first file. Maybe it should be enabled by default.
This should be well tested.
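
To make the intended layout concrete, here is a minimal sketch of how tensors could be assigned to shards when the first shard is kept metadata-only. It is not the `gguf-split` implementation: `plan_shards`, `Shard`, and the `no_tensor_first_shard` parameter are hypothetical names used only for illustration, and the sketch ignores GGUF details such as alignment and header size.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    tensors: list = field(default_factory=list)  # tensor names, in order
    size: int = 0                                # total tensor bytes in this shard


def plan_shards(tensor_sizes, max_shard_bytes, no_tensor_first_shard=True):
    """Assign (name, size) pairs to shards in order (hypothetical helper).

    When no_tensor_first_shard is set, shard 0 receives no tensor data,
    so it would hold only the metadata.
    """
    shards = [Shard()]
    for name, size in tensor_sizes:
        current = shards[-1]
        # Shard 0 is reserved for metadata: never place a tensor in it.
        first_is_reserved = no_tensor_first_shard and len(shards) == 1
        # Open a new shard when the current one would exceed the limit.
        is_full = bool(current.tensors) and current.size + size > max_shard_bytes
        if first_is_reserved or is_full:
            current = Shard()
            shards.append(current)
        current.tensors.append(name)
        current.size += size
    return shards
```

For example, `plan_shards([("a", 4), ("b", 4), ("c", 4)], max_shard_bytes=8)` yields three shards: an empty first shard, then `["a", "b"]`, then `["c"]`. With `no_tensor_first_shard=False` the same input yields two shards.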

`ggml_alloc` should not be called when there is no tensor data to allocate, as it will complain with `WARNING: Behavior may be unexpected when allocating 0 bytes for ggml_malloc!`

For example, we can add extra metadata to the first file that describes all tensors in the shards.
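
One possible shape for such a description is an index from tensor name to its shard and position. The sketch below is only an assumption about what that metadata might look like: `build_tensor_index` and the `shard`/`offset`/`size` keys are hypothetical, not part of the GGUF format, and offsets are computed without any alignment padding.

```python
def build_tensor_index(shards):
    """Describe every tensor across all shards (hypothetical layout).

    `shards` is a list of shards, each a list of (name, size_in_bytes) pairs.
    Returns a dict mapping tensor name to its shard number, its byte offset
    within that shard's tensor data, and its size.
    """
    index = {}
    for shard_no, shard in enumerate(shards):
        offset = 0  # running offset inside this shard, ignoring alignment
        for name, size in shard:
            index[name] = {"shard": shard_no, "offset": offset, "size": size}
            offset += size
    return index
```

With a metadata-only first shard, for instance `build_tensor_index([[], [("a", 4), ("b", 4)], [("c", 4)]])`, no entry points at shard 0, so a client that has read only the first file would already know which shard holds each tensor.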

#### References
- #6404
- #6135
- #6187
- #6192
- #6343
- https://github.com/ggerganov/llama.cpp/pull/6343#issuecomment-2034990690
- https://github.com/ggerganov/llama.cpp/pull/6343#issuecomment-2035011205
- https://github.com/huggingface/huggingface.js/issues/604


`gguf-split` add a default option to not include tensors data in first shard #6463
