-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Normalize a few more spin-wait loops #21586
Conversation
kouvel
commented
Dec 18, 2018
- Fixed a few more spin-waits to normalize the spin-wait duration between processors
- These spin-waits have so far not needed to be retuned to avoid unreasonably long spin-wait durations. They can be retuned as necessary in the future.
- Added a version of YieldProcessorNormalized() that normalizes based on spin-wait counts tuned for pre-Skylake processors for spin-wait loops that have not been retuned.
- Moved some files around to make YieldProcessorNormalized() and the like available in more places. Initialization is still only done in the VM. Uses outside the VM will use the defaults, where there would be no significant change from before.
@dotnet-bot test Windows_NT x64 Formatting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed a few more spin-waits to normalize the spin-wait duration between processors
Were these changes speculative, or they were cases where we saw problems and this fixed it?
src/vm/object.cpp
Outdated
@@ -107,7 +107,7 @@ INT32 Object::GetHashCodeEx() | |||
iter++; | |||
if ((iter % 1024) != 0 && g_SystemInfo.dwNumberOfProcessors > 1) | |||
{ | |||
YieldProcessor(); // indicate to the processor that we are spining | |||
YieldProcessorNormalized(); // indicate to the processor that we are spining |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: spining => spinning
if (n == 0) | ||
{ | ||
n = 1; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't this is better than having the while condition use > rather than != ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The value is currently unsigned
- Fixed a few more spin-waits to normalize the spin-wait duration between processors - These spin-waits have so far not needed to be retuned to avoid unreasonably long spin-wait durations. They can be retuned as necessary in the future. - Added a version of YieldProcessorNormalized() that normalizes based on spin-wait counts tuned for pre-Skylake processors for spin-wait loops that have not been retuned. - Moved some files around to make YieldProcessorNormalized() and the like available in more places. Initialization is still only done in the VM. Uses outside the VM will use the defaults, where there would be no significant change from before.
@kouvel - If I understand this right, YieldProcessorNormalizedForPreSkylakeCount is basically a marker that we have not thought about the yield count much, and thus arguably could be wrong. Ideally all of these uses go away eventually. In the mean time, we assume that the count should be divided by 8 to get the 'normalized' yield count. Also I am assuming that we want all YieldProcessor() calls gone (does this change complete that mission? If so we should make it prviate). It would be good if in general we put some good comments in yieldProcessorNormalzied.h, Conceptually what is going on is not hard (we want a yield that represents a certain amount of time) regardless of processor, and that Intel processors after Skylake have a 'yield' instruction that is much slower than processors before that. THis is hard to discover from code but easy to simply state in the YieldProcessorNormalized.h file. In particular we should have a comment on YieldProcessorNormalizedForPreSkylakeCount that tells people the reasoning behind the method and how it is to be used (and hopefully ultimately removed). But I don't have a problem with the actual delta. |
Yea I haven't seen many reports of issues in these spin-waits (maybe one minor case that I sort of recall). They are not necessarily short spin-waits, so there is potential for it to become an issue. Now that the yield normalization stuff has been in for a while and seems to be working as expected, I'm just normalizing the remaining spin-waits to avoid bad hardware-specific scaling. Retuning them can be done later if/when necessary, and after all of those cases are done the function would not be necessary anymore.
The only other uses are in the GC, where scaling is done once on a unit value that is used elsewhere, and the code has to be compatible with NetCore and NetFx. I have now made YieldProcessor() private outside the GC. Addressed other feedback as well. |
Normalize a few more spin-wait loops - Fixed a few more spin-waits to normalize the spin-wait duration between processors - These spin-waits have so far not needed to be retuned to avoid unreasonably long spin-wait durations. They can be retuned as necessary in the future. - Added a version of YieldProcessorNormalized() that normalizes based on spin-wait counts tuned for pre-Skylake processors for spin-wait loops that have not been retuned. - Moved some files around to make YieldProcessorNormalized() and the like available in more places. Initialization is still only done in the VM. Uses outside the VM will use the defaults, where there would be no significant change from before. - Made YieldProcessor() private outside of the GC and added System_YieldProcessor() for when the system-defined implementation is intended to be used Commit migrated from dotnet/coreclr@616fea5