Make File's specialisation of read_to_end use the length of the file to size the Vec it gets passed #27159
Performance optimisation for File::read_to_end. Currently, reallocation within the vector accounts for a very large percentage of the runtime when loading a large file. This is unnecessary: File lets us work out how much is left to read, and we can use that information to size the Vec in advance.
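A minimal sketch of the idea (not the actual PR code): query the file's length via metadata and reserve that much capacity before reading, so read_to_end never has to grow the Vec. The helper name read_to_end_with_hint and the temp-file setup are mine, for illustration only; the real specialisation also subtracts the current seek offset.

```rust
use std::fs::File;
use std::io::{Read, Write};

// Sketch: reserve capacity for the file's remaining bytes up front,
// then delegate to the generic read_to_end. Assumes we read from the
// start of the file (the PR itself accounts for the current offset).
fn read_to_end_with_hint(file: &mut File, buf: &mut Vec<u8>) -> std::io::Result<usize> {
    let size = file.metadata().map(|m| m.len() as usize).unwrap_or(0);
    buf.reserve(size);
    file.read_to_end(buf)
}

fn main() -> std::io::Result<()> {
    // Create a small temp file so the example is self-contained.
    let path = std::env::temp_dir().join("read_hint_demo.bin");
    File::create(&path)?.write_all(&[7u8; 4096])?;

    let mut f = File::open(&path)?;
    let mut buf = Vec::new();
    let n = read_to_end_with_hint(&mut f, &mut buf)?;
    assert_eq!(n, 4096);
    assert!(buf.capacity() >= 4096); // no mid-read reallocation needed
    println!("read {} bytes", n);

    std::fs::remove_file(&path)?;
    Ok(())
}
```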
Performance figures:
Large file (230MB+, initial Vec capacity 1024):
test sys_common::io::tests::bench_uninitialized_file ... bench: 283,419,418 ns/iter (+/- 33,257,419)
test sys_common::io::tests::bench_uninitialized_file_hint ... bench: 153,119,795 ns/iter (+/- 34,983,881)
Small files:
Since I was concerned about small files - specifically, that there might be overhead in working out the remaining size of the file - I also ran a test loading a large variety of 4KB files filled with random data, experimenting with different initial capacities for the Vec passed into read_to_end. In the worst case (where the Vec is exactly sized to the file), the new code is slightly slower, but otherwise it is generally faster. I'm not too concerned about this: if you've already worked out the exact size of the file, you might as well use read() instead.
Using Vec::new()
test sys_common::io::tests::bench_uninitialized_file_small ... bench: 9,811 ns/iter (+/- 1,346)
test sys_common::io::tests::bench_uninitialized_file_small_hint ... bench: 6,789 ns/iter (+/- 579)
Using Vec::with_capacity(1024)
test sys_common::io::tests::bench_uninitialized_file_small ... bench: 6,193 ns/iter (+/- 955)
test sys_common::io::tests::bench_uninitialized_file_small_hint ... bench: 5,896 ns/iter (+/- 525)
Using Vec::with_capacity(4096)
test sys_common::io::tests::bench_uninitialized_file_small ... bench: 5,123 ns/iter (+/- 2,129)
test sys_common::io::tests::bench_uninitialized_file_small_hint ... bench: 5,936 ns/iter (+/- 605)
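To illustrate the point about exactly-sized reads: when you already know the file's size, you can skip read_to_end entirely and fill a pre-sized buffer with read_exact. This is a hypothetical demo (the file path and setup are mine), sketching the alternative mentioned above.

```rust
use std::fs::File;
use std::io::{Read, Write};

fn main() -> std::io::Result<()> {
    // Self-contained setup: write a 4KB file to read back.
    let path = std::env::temp_dir().join("exact_read_demo.bin");
    File::create(&path)?.write_all(&[1u8; 4096])?;

    let mut f = File::open(&path)?;
    // We already know the exact size, so allocate once and read
    // directly into the buffer instead of using read_to_end.
    let len = f.metadata()?.len() as usize;
    let mut buf = vec![0u8; len];
    f.read_exact(&mut buf)?;

    assert_eq!(buf.len(), 4096);
    println!("read {} bytes exactly", buf.len());

    std::fs::remove_file(&path)?;
    Ok(())
}
```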
'Real' application
I tested loading the large file in a simple rust application, and the runtime of the app averaged ~200ms with this optimisation and ~350ms without.