Optimize Int to String conversions [code inside] #26578

Closed
arthurprs opened this issue Jun 25, 2015 · 1 comment

Labels
I-slow Issue: Problems and improvements with respect to performance of generated code.

Comments

@arthurprs
Contributor

As of today, Rust uses a naive implementation for converting integers to decimal strings in the integer Debug/Display implementations.

I spent some time crafting an optimized version for the most common case, printing in decimal. The difference should not be measurable in most programs, but it might give a minor speedup in some cases, such as the serde and rustc JSON serializers.

I wrote a well-optimized version here. Further optimizations are possible, but I tried to keep the code size small (which I think is important); beyond this point the returns diminish quickly.
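For readers without access to the linked code, here is a minimal sketch of one common way to speed up decimal formatting: a two-digit lookup table that halves the number of divisions. This is an illustration of the general idea, not necessarily what the linked implementation does. The names `DEC_DIGITS_LUT` and `format_u64` are hypothetical and chosen for this example only.

```rust
/// ASCII digits of 00..=99, two bytes per entry (hypothetical name).
const DEC_DIGITS_LUT: &[u8; 200] = b"0001020304050607080910111213141516171819\
2021222324252627282930313233343536373839\
4041424344454647484950515253545556575859\
6061626364656667686970717273747576777879\
8081828384858687888990919293949596979899";

/// Writes `n` in decimal into the tail of `buf` and returns the written part.
/// 20 bytes are enough for u64::MAX (18446744073709551615).
/// Illustrative sketch only; not the code linked in this issue.
fn format_u64(mut n: u64, buf: &mut [u8; 20]) -> &str {
    let mut pos = buf.len();

    // Peel off two digits per division instead of one.
    while n >= 100 {
        let rem = (n % 100) as usize;
        n /= 100;
        pos -= 2;
        buf[pos..pos + 2].copy_from_slice(&DEC_DIGITS_LUT[rem * 2..rem * 2 + 2]);
    }

    // At this point n < 100: either two digits or one digit remain.
    if n >= 10 {
        let d = n as usize * 2;
        pos -= 2;
        buf[pos..pos + 2].copy_from_slice(&DEC_DIGITS_LUT[d..d + 2]);
    } else {
        pos -= 1;
        buf[pos] = b'0' + n as u8;
    }

    // Only ASCII digits were written, so this conversion cannot fail.
    std::str::from_utf8(&buf[pos..]).expect("ASCII digits are valid UTF-8")
}
```

A caller supplies a stack buffer, for example `let mut buf = [0u8; 20];`, and uses the returned `&str` from `format_u64(12345, &mut buf)`. The table costs 200 bytes of read-only data, which is the kind of code-size trade-off mentioned above.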

Running with rustc 1.2.0-nightly (cffaf0e 2015-06-23) @ x64 Linux - Intel(R) Core(TM) i7-2670QM @ 2.20GHz (my notebook CPU)

test bench::skewed_h_new_u08    ... bench:      67,156 ns/iter (+/- 3,687)
test bench::skewed_h_new_u16    ... bench:     376,573 ns/iter (+/- 23,732)
test bench::skewed_h_new_u32    ... bench:   4,202,419 ns/iter (+/- 203,200)
test bench::skewed_h_new_u64    ... bench:   5,097,971 ns/iter (+/- 337,608)
test bench::skewed_h_stdlib_u08 ... bench:      69,270 ns/iter (+/- 3,321)
test bench::skewed_h_stdlib_u16 ... bench:     420,660 ns/iter (+/- 20,196)
test bench::skewed_h_stdlib_u32 ... bench:   5,451,519 ns/iter (+/- 417,856)
test bench::skewed_h_stdlib_u64 ... bench:   8,360,505 ns/iter (+/- 453,566)
test bench::skewed_l_new_u08    ... bench:      68,705 ns/iter (+/- 3,657)
test bench::skewed_l_new_u16    ... bench:     376,786 ns/iter (+/- 20,804)
test bench::skewed_l_new_u32    ... bench:   4,207,858 ns/iter (+/- 210,143)
test bench::skewed_l_new_u64    ... bench:   5,117,710 ns/iter (+/- 350,017)
test bench::skewed_l_stdlib_u08 ... bench:      68,252 ns/iter (+/- 4,251)
test bench::skewed_l_stdlib_u16 ... bench:     417,692 ns/iter (+/- 22,882)
test bench::skewed_l_stdlib_u32 ... bench:   5,383,473 ns/iter (+/- 322,479)
test bench::skewed_l_stdlib_u64 ... bench:   8,380,458 ns/iter (+/- 357,894)
test bench::skewed_m_new_u08    ... bench:      68,585 ns/iter (+/- 3,099)
test bench::skewed_m_new_u16    ... bench:     375,588 ns/iter (+/- 20,074)
test bench::skewed_m_new_u32    ... bench:   4,178,506 ns/iter (+/- 234,058)
test bench::skewed_m_new_u64    ... bench:   5,041,359 ns/iter (+/- 302,088)
test bench::skewed_m_stdlib_u08 ... bench:      67,975 ns/iter (+/- 3,570)
test bench::skewed_m_stdlib_u16 ... bench:     421,650 ns/iter (+/- 22,984)
test bench::skewed_m_stdlib_u32 ... bench:   5,422,042 ns/iter (+/- 330,263)
test bench::skewed_m_stdlib_u64 ... bench:   8,358,590 ns/iter (+/- 418,777)

Running with rustc 1.3.0-nightly (e5a28bc 2015-06-25) @ x86 Linux - Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz (EC2 c1.medium)

test bench::skewed_h_new_u08    ... bench:     128,062 ns/iter (+/- 624)
test bench::skewed_h_new_u16    ... bench:     701,687 ns/iter (+/- 2,595)
test bench::skewed_h_new_u32    ... bench:   8,013,071 ns/iter (+/- 86,295)
test bench::skewed_h_new_u64    ... bench:  20,619,636 ns/iter (+/- 244,472)
test bench::skewed_h_stdlib_u08 ... bench:     139,061 ns/iter (+/- 4,208)
test bench::skewed_h_stdlib_u16 ... bench:     840,872 ns/iter (+/- 8,870)
test bench::skewed_h_stdlib_u32 ... bench:  10,934,092 ns/iter (+/- 86,377)
test bench::skewed_h_stdlib_u64 ... bench:  62,690,245 ns/iter (+/- 4,648,790)
test bench::skewed_l_new_u08    ... bench:     128,245 ns/iter (+/- 1,491)
test bench::skewed_l_new_u16    ... bench:     702,062 ns/iter (+/- 13,180)
test bench::skewed_l_new_u32    ... bench:   8,021,507 ns/iter (+/- 325,452)
test bench::skewed_l_new_u64    ... bench:  20,596,010 ns/iter (+/- 962,453)
test bench::skewed_l_stdlib_u08 ... bench:     139,014 ns/iter (+/- 7,428)
test bench::skewed_l_stdlib_u16 ... bench:     840,780 ns/iter (+/- 16,955)
test bench::skewed_l_stdlib_u32 ... bench:  10,926,288 ns/iter (+/- 309,821)
test bench::skewed_l_stdlib_u64 ... bench:  62,649,913 ns/iter (+/- 1,106,527)
test bench::skewed_m_new_u08    ... bench:     128,949 ns/iter (+/- 16,267)
test bench::skewed_m_new_u16    ... bench:     706,043 ns/iter (+/- 73,190)
test bench::skewed_m_new_u32    ... bench:   8,001,205 ns/iter (+/- 219,644)
test bench::skewed_m_new_u64    ... bench:  20,569,162 ns/iter (+/- 430,049)
test bench::skewed_m_stdlib_u08 ... bench:     138,840 ns/iter (+/- 5,948)
test bench::skewed_m_stdlib_u16 ... bench:     840,655 ns/iter (+/- 9,596)
test bench::skewed_m_stdlib_u32 ... bench:  10,949,664 ns/iter (+/- 191,620)
test bench::skewed_m_stdlib_u64 ... bench:  62,858,086 ns/iter (+/- 1,316,625)

On modern x64 CPUs it is pretty much the same speed as the stdlib implementation for very small numbers, but it pulls ahead as the length of the decimal increases (roughly 1.3x for u32 and 1.6x for u64 in the x64 run above).

On slightly older CPUs (worse ALUs) or on x86 it is pretty much always faster, from about 1.1x for u8 up to about 3x for u64 in the x86 run above.

Is this something we want in the stdlib?


Follow-up to #26310
Tagging @llogiq because of previous interest
Tagging @lifthrasiir because of the awesome work on similar stuff

@steveklabnik steveklabnik added the I-slow Issue: Problems and improvements with respect to performance of generated code. label Jun 26, 2015
@arthurprs
Contributor Author

Will submit a PR later today.
