Description
Even though the foundation is set, it needs another push to actually make it work with different kinds of hashes.
Tasks
- remove hash-type specific methods from git-hash and replace them with parametric usage of git_hash::Kind
- all code assuming hashes of len 20 should receive this value as parameter instead. This is what git does for the old index and pack file formats.
- a way to pass --object-hash information to the gix CLI
- remove SHA1 mention from git-features feature toggles
- parameterize hash len when decoding non-blob objects (see this for an example)
- understand and implement pack idx V3.
  - see if git actually implements this, and maybe decide that gitoxide won't handle the transition period: a repository uses either one hash or the other.
- add new Sha256 enum variant, consider putting it behind a feature flag, and add a hasher for it as well.
- general tests for reading refs and objects of different len
- tests for writing and reading objects of different len, maybe even write a conversion program which transforms an entire repo and double-checks with git-fsck
- when cloning, check the unimplemented!() invocation to configure the repo for expecting a different hash
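To illustrate the tasks about no longer assuming a hash length of 20, here is a minimal sketch of decoding tree entries with the length taken from a hash kind. The `Kind` enum, its `len_in_bytes()` method, `Entry` and `decode_tree()` are hypothetical stand-ins written for this example, not the actual git-hash or gitoxide API.

```rust
/// Hypothetical stand-in for `git_hash::Kind`; the real type may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Sha1,
    Sha256,
}

impl Kind {
    /// Length of the binary (not hex) hash in bytes.
    pub const fn len_in_bytes(self) -> usize {
        match self {
            Kind::Sha1 => 20,
            Kind::Sha256 => 32,
        }
    }
}

/// One entry of a tree object, stored as `<mode> <name>\0<raw hash>`.
#[derive(Debug, PartialEq, Eq)]
pub struct Entry<'a> {
    pub mode: &'a [u8],
    pub name: &'a [u8],
    pub oid: &'a [u8],
}

/// Decode all entries of a tree, taking the hash length from `kind`
/// instead of assuming 20 bytes. Returns `None` on malformed input.
pub fn decode_tree(mut data: &[u8], kind: Kind) -> Option<Vec<Entry<'_>>> {
    let hash_len = kind.len_in_bytes();
    let mut entries = Vec::new();
    while !data.is_empty() {
        // The mode ends at the first space.
        let space = data.iter().position(|b| *b == b' ')?;
        let (mode, rest) = (&data[..space], &data[space + 1..]);
        // The name ends at the first NUL byte.
        let nul = rest.iter().position(|b| *b == 0)?;
        let (name, rest) = (&rest[..nul], &rest[nul + 1..]);
        // The raw hash follows; its length depends on the hash kind.
        if rest.len() < hash_len {
            return None;
        }
        let (oid, rest) = rest.split_at(hash_len);
        entries.push(Entry { mode, name, oid });
        data = rest;
    }
    Some(entries)
}
```

The same bytes decode differently depending on `kind`, which is why the length cannot be a constant baked into the parser once more than one hash is supported.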
Implementation ideas
- make sure that once Sha256 is added as an ObjectId variant, it's behind a feature toggle, so that builds which opt out of SHA256 support don't use more memory than needed. Maybe there are alternatives to this, too.
- One way to do that with approximately zero overhead would be to make such functions generic on the object ID, using a trait that has a method to get the type. Then object IDs with a known type return a constant from that method, and object IDs with a runtime dispatched type return the value of that enum.
  - @joshtriplett - taken verbatim as I'd barely be able to improve on it when paraphrasing. In short, have a trait for oid or allow efficient conversions to oid (it's just a slice, so that should work for specifically sized types as well, especially if these were provided by git-hash).
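The trait idea quoted above could look roughly like the following sketch. All names here (`ObjectIdLike`, `Sha1Id`, `AnyId`, `hex_len`, `to_hex`, and the local `Kind` enum) are made up for illustration and are not part of git-hash.

```rust
/// Hypothetical stand-in for `git_hash::Kind`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Sha1,
    Sha256,
}

impl Kind {
    pub const fn len_in_bytes(self) -> usize {
        match self {
            Kind::Sha1 => 20,
            Kind::Sha256 => 32,
        }
    }
}

/// Hypothetical trait: anything that can be viewed as an object id.
pub trait ObjectIdLike {
    /// The kind of hash this id was produced with.
    fn kind(&self) -> Kind;
    /// The raw hash bytes, similar to converting to a slice-like `oid`.
    fn as_bytes(&self) -> &[u8];
}

/// An id whose hash kind is known at compile time.
pub struct Sha1Id(pub [u8; 20]);

impl ObjectIdLike for Sha1Id {
    #[inline]
    fn kind(&self) -> Kind {
        // A constant, so the optimizer can fold it after monomorphization.
        Kind::Sha1
    }
    fn as_bytes(&self) -> &[u8] {
        &self.0[..]
    }
}

/// An id whose hash kind is only known at runtime.
pub enum AnyId {
    Sha1([u8; 20]),
    Sha256([u8; 32]),
}

impl ObjectIdLike for AnyId {
    fn kind(&self) -> Kind {
        match self {
            AnyId::Sha1(_) => Kind::Sha1,
            AnyId::Sha256(_) => Kind::Sha256,
        }
    }
    fn as_bytes(&self) -> &[u8] {
        match self {
            AnyId::Sha1(b) => &b[..],
            AnyId::Sha256(b) => &b[..],
        }
    }
}

/// Generic over the id type: works for both static and runtime-dispatched ids.
pub fn hex_len<I: ObjectIdLike>(id: &I) -> usize {
    id.kind().len_in_bytes() * 2
}

/// Lower-case hex representation of any id.
pub fn to_hex<I: ObjectIdLike>(id: &I) -> String {
    id.as_bytes().iter().map(|b| format!("{:02x}", b)).collect()
}
```

With this shape, a build that only ever uses `Sha1Id` never carries the larger enum around, while callers that need runtime dispatch use `AnyId` through the same generic functions.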
Notes
- find ways to use the existing highly-parallel pack traversal (along with integration of loose-objects) to build an inverse-ref table to quickly traverse objects bottom-up to change the hash used along with all references, while being fast. This ties into being able to build new packs quickly, ideally even with delta-compression. Most deltas have to be re-created as most objects actually change; existing deltas can only be re-used for blobs, as they contain no hashes and their content stays the same.
- The existing traversal can mutate data in the tree, which is enough to decode the object and keep direct references for later.
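As a rough sketch of the inverse-ref table idea, the code below builds a map from each object to the objects referring to it, and derives an order in which every object comes after everything it references, so new hashes can be computed bottom-up. It is single-threaded and uses plain integers as ids. `Id`, `inverse_refs()` and `bottom_up_order()` are hypothetical names, and nothing here reflects how the existing pack traversal is implemented.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical object id; a plain integer keeps the sketch small.
pub type Id = u32;

/// Build the inverse table: for every object, the objects that refer to it.
/// `refs` maps an object to the objects it references (e.g. tree -> blobs).
pub fn inverse_refs(refs: &HashMap<Id, Vec<Id>>) -> HashMap<Id, Vec<Id>> {
    let mut inverse: HashMap<Id, Vec<Id>> = HashMap::new();
    for (&parent, children) in refs {
        // Make sure objects nobody refers to (e.g. tip commits) are present, too.
        inverse.entry(parent).or_default();
        for &child in children {
            inverse.entry(child).or_default().push(parent);
        }
    }
    inverse
}

/// Return an order in which each object appears after all objects it references,
/// so that an object is only re-hashed once the new ids of its references are known.
pub fn bottom_up_order(refs: &HashMap<Id, Vec<Id>>) -> Vec<Id> {
    let inverse = inverse_refs(refs);
    // For each object, the number of references that are not yet processed.
    let mut pending: HashMap<Id, usize> = inverse
        .keys()
        .map(|&id| (id, refs.get(&id).map_or(0, Vec::len)))
        .collect();
    // Objects without references (blobs) can be processed right away.
    let mut ready: VecDeque<Id> = pending
        .iter()
        .filter(|(_, n)| **n == 0)
        .map(|(&id, _)| id)
        .collect();
    let mut order = Vec::with_capacity(pending.len());
    while let Some(id) = ready.pop_front() {
        order.push(id);
        for &referrer in &inverse[&id] {
            let n = pending.get_mut(&referrer).expect("referrer is a known object");
            *n -= 1;
            if *n == 0 {
                ready.push_back(referrer);
            }
        }
    }
    order
}
```

Objects that become ready at the same time are independent of each other, which is where parallel processing could be applied in a real implementation.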