Closed
Description
On Linux x86_64:
touch foo.c
clang -Ofast -fpic -shared foo.c -o foo.so
objdump --disassemble foo.so
gives:
Disassembly of section .text:
00000000000004f0 <set_fast_math>:
4f0: 0f ae 5c 24 fc stmxcsr -0x4(%rsp)
4f5: 81 4c 24 fc 40 80 00 orl $0x8040,-0x4(%rsp)
4fc: 00
4fd: 0f ae 54 24 fc ldmxcsr -0x4(%rsp)
502: c3 retq
503: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
50a: 00 00 00
50d: 0f 1f 00 nopl (%rax)
This means that any thread which later loads the library will have the flush subnormals to zero (FTZ, bit 15, 0x8000) and subnormals are zero (DAZ, bit 6, 0x0040) flags set in its MXCSR register, even if the executable itself was not compiled with -ffast-math.
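One way to observe the effect from the host program is sketched below. It is an illustration rather than part of the original report: the names `ftz_daz_bits`, `subnormals_flushed` and `report_subnormal_mode` are hypothetical helpers. It assumes an x86 target with SSE for the `_mm_getcsr` read, and assumes this file itself is built without -ffast-math.

```c
#include <float.h>
#include <stdio.h>
#if defined(__SSE__)
#include <xmmintrin.h> /* _mm_getcsr */
#endif

#define MXCSR_DAZ 0x0040u /* bit 6: subnormal inputs are treated as zero */
#define MXCSR_FTZ 0x8000u /* bit 15: subnormal results are flushed to zero */

/* Keep only the FTZ and DAZ bits of an MXCSR value. */
unsigned ftz_daz_bits(unsigned csr)
{
    return csr & (MXCSR_FTZ | MXCSR_DAZ);
}

/* Behavioural check: halving the smallest normal double should give a
 * subnormal. Returns 1 if the result compares equal to zero instead.
 * The volatile qualifiers keep the division from being folded at
 * compile time. */
int subnormals_flushed(void)
{
    volatile double smallest_normal = DBL_MIN;
    volatile double half = smallest_normal / 2.0;
    return half == 0.0;
}

/* Print the state of the calling thread (hypothetical helper). */
void report_subnormal_mode(const char *label)
{
#if defined(__SSE__)
    unsigned bits = ftz_daz_bits(_mm_getcsr());
    printf("%s: FTZ=%d DAZ=%d\n", label,
           (bits & MXCSR_FTZ) != 0, (bits & MXCSR_DAZ) != 0);
#endif
    printf("%s: DBL_MIN/2 is %s\n", label,
           subnormals_flushed() ? "zero" : "subnormal");
}
```

Calling `report_subnormal_mode` from the same thread before and after `dlopen("./foo.so", RTLD_NOW)` should show whether loading the library changed the flags.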
Related discussion
- "Someone’s Been Messing With My Subnormals!" by Brendan Dolan-Gavitt (@moyix)
- "Beware of fast-math" by me (@simonbyrne)
- GCC issue #55522