improve archive index reading performance #1541

Merged
syphar merged 2 commits into rust-lang:master from faster-archive-index-read on Nov 7, 2021

Conversation

@syphar (Member) commented Oct 31, 2021

  • use a streaming parser to find a file in the local index
  • switch to memmap as reader

For the extreme case (stm32ral, 1.2 million files in the archive), index-read time drops from ~3.5s to ~~600ms~~ ~300ms.

This is the first time I've worked with custom serde deserializers, and I didn't find a way to:

  • only use 1 custom visitor and still pass the seed (the path to search) into the other visitor
  • actually stop reading data once the entry is found (this could improve performance even further).
    I'm happy to add any improvements :)

There are still possible improvements when downloading the index, but for local indexes we probably don't have much more we can optimize without changing the format.
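
For readers unfamiliar with the idea, here is a minimal sketch of a streaming lookup that stops as soon as the entry is found. It is not the code from this PR: it uses only the standard library rather than serde, and the line-based format (`path<TAB>start<TAB>end`), `FileRange`, and `find_in_index` are hypothetical names chosen for illustration. The real index format differs.

```rust
use std::io::{self, BufRead};

/// Byte range of one file inside the archive (hypothetical layout).
#[derive(Debug, PartialEq)]
struct FileRange {
    start: u64,
    end: u64,
}

/// Scan a line-based index (`path<TAB>start<TAB>end`, an assumed format)
/// for `wanted`. Entries are read one at a time into a reused buffer, so
/// the whole index is never materialized as a map, and the scan returns
/// at the first match instead of reading the rest of the input.
fn find_in_index<R: BufRead>(mut reader: R, wanted: &str) -> io::Result<Option<FileRange>> {
    let mut line = String::new();
    loop {
        line.clear();
        if reader.read_line(&mut line)? == 0 {
            return Ok(None); // end of index, no match
        }
        let mut fields = line.trim_end().split('\t');
        if fields.next() != Some(wanted) {
            continue; // not the entry we want, skip without parsing numbers
        }
        // Only the matching entry pays the cost of parsing its values.
        let mut next_num = || -> io::Result<u64> {
            fields
                .next()
                .and_then(|f| f.parse::<u64>().ok())
                .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "malformed index entry"))
        };
        let start = next_num()?;
        let end = next_num()?;
        return Ok(Some(FileRange { start, end }));
    }
}
```

Calling `find_in_index(bytes, "src/main.rs")` on any `BufRead` source (a byte slice, or a `BufReader` over a file) returns the matching range or `None`. With a serde-based format, the equivalent would presumably be a visitor that skips non-matching values, which is where the early-exit question above comes from.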

@syphar added the S-waiting-on-review label (Status: This pull request has been implemented and needs to be reviewed) Oct 31, 2021
@syphar (Member, Author) commented Oct 31, 2021

perhaps @Nemo157 has some improvement ideas on the deserializer?
(for now, without changing the format :) )

@syphar self-assigned this Nov 1, 2021
@syphar force-pushed the faster-archive-index-read branch 4 times, most recently from d873206 to 1eeccd1 November 1, 2021 17:55
* use a streaming parser to find a file in the local index
* switch to memmap as reader
* use zero-copy deserialization for the key, change key data type
@syphar force-pushed the faster-archive-index-read branch from 1eeccd1 to b42ba77 November 6, 2021 11:05
@jyn514 (Member) left a comment

This is awesome work, thanks for digging into this 💜 feel free to merge with or without fixing the nits :)

@syphar merged commit 75f7a1c into rust-lang:master Nov 7, 2021
@syphar deleted the faster-archive-index-read branch November 7, 2021 09:03
@syphar added the S-waiting-on-deploy label (This PR is ready to be merged, but is waiting for an admin to have time to deploy it) and removed the S-waiting-on-review label Nov 7, 2021
@syphar removed the S-waiting-on-deploy label Nov 30, 2021