gguf-split: add a default option to not include tensor data in the first shard #6463

Closed
@phymbert

Motivation

Be able to make a split where the first shard is very small and contains primarily the metadata, so that it can be downloaded quickly and the download of the other shards can then start without waiting long for the first to finish.

Proposition

Add an option to not include tensor data in the first file. Maybe it should be enabled by default.
This should be well tested.
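
A minimal sketch of the proposed behavior, assuming a split driven by a maximum tensor count per shard. The function `plan_shards` and its parameter `no_tensor_first_split` are hypothetical names chosen for illustration; they are not taken from the gguf-split code.

```python
from typing import List


def plan_shards(tensor_names: List[str], max_tensors: int,
                no_tensor_first_split: bool = True) -> List[List[str]]:
    """Assign tensors to shards; optionally keep the first shard tensor-free.

    Hypothetical helper: shard 0 always carries the metadata, and holds
    tensor data only when no_tensor_first_split is False.
    """
    if max_tensors < 1:
        raise ValueError("max_tensors must be at least 1")
    # Cut the tensor list into consecutive chunks of at most max_tensors.
    chunks = [tensor_names[i:i + max_tensors]
              for i in range(0, len(tensor_names), max_tensors)]
    if no_tensor_first_split:
        # Metadata-only first shard, followed by the tensor shards.
        return [[]] + chunks
    # First shard carries both the metadata and the first chunk of tensors.
    return chunks or [[]]


names = [f"blk.{i}.weight" for i in range(5)]
plan = plan_shards(names, max_tensors=2)
print(len(plan))  # 4: one metadata-only shard plus three tensor shards
print(plan[0])    # []
```

In this sketch the empty first shard is the case where a writer would need to skip tensor allocation entirely, which is what the note on ggml_alloc below is about.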

ggml_alloc should not be called when the first file holds no tensor data, as it will complain with "WARNING: Behavior may be unexpected when allocating 0 bytes for ggml_malloc!"

For example, we can add extra metadata in the first file that describes all the tensors in the shards.
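
One way to picture that extra metadata, as a sketch only: an index stored in the first file that maps each tensor name to the shard holding it. `build_tensor_index` and the index layout are assumptions made for illustration; the issue does not specify a format.

```python
from typing import Dict, List


def build_tensor_index(shards: List[List[str]]) -> Dict[str, int]:
    """Map each tensor name to the number of the shard that contains it."""
    index: Dict[str, int] = {}
    for shard_no, names in enumerate(shards):
        for name in names:
            if name in index:
                raise ValueError(f"tensor {name!r} appears in more than one shard")
            index[name] = shard_no
    return index


# Shard 0 is metadata-only here; the tensors live in shards 1 and 2.
shards = [[], ["token_embd.weight", "blk.0.attn_q.weight"], ["output.weight"]]
index = build_tensor_index(shards)
print(index["output.weight"])  # 2
```

With such an index, a loader that has only the first file could tell which shard to fetch for a given tensor.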

Labels: enhancement, good first issue, help wanted, split
