Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Normalize a few more spin-wait loops #21586

Merged
merged 2 commits into from
Jan 11, 2019
Merged

Normalize a few more spin-wait loops #21586

merged 2 commits into from
Jan 11, 2019

Conversation

kouvel
Copy link

@kouvel kouvel commented Dec 18, 2018

  • Fixed a few more spin-waits to normalize the spin-wait duration between processors
  • These spin-waits have so far not needed to be retuned to avoid unreasonably long spin-wait durations. They can be retuned as necessary in the future.
    • Added a version of YieldProcessorNormalized() that normalizes based on spin-wait counts tuned for pre-Skylake processors for spin-wait loops that have not been retuned.
  • Moved some files around to make YieldProcessorNormalized() and the like available in more places. Initialization is still only done in the VM. Uses outside the VM will use the defaults, where there would be no significant change from before.

@kouvel kouvel added this to the 3.0 milestone Dec 18, 2018
@kouvel kouvel self-assigned this Dec 18, 2018
@kouvel kouvel requested review from vancem and janvorli December 18, 2018 22:18
@kouvel
Copy link
Author

kouvel commented Dec 18, 2018

@dotnet-bot test Windows_NT x64 Formatting

Copy link
Member

@stephentoub stephentoub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed a few more spin-waits to normalize the spin-wait duration between processors

Were these changes speculative, or they were cases where we saw problems and this fixed it?

@@ -107,7 +107,7 @@ INT32 Object::GetHashCodeEx()
iter++;
if ((iter % 1024) != 0 && g_SystemInfo.dwNumberOfProcessors > 1)
{
YieldProcessor(); // indicate to the processor that we are spining
YieldProcessorNormalized(); // indicate to the processor that we are spining
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: spining => spinning

if (n == 0)
{
n = 1;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't this is better than having the while condition use > rather than != ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value is currently unsigned

- Fixed a few more spin-waits to normalize the spin-wait duration between processors
- These spin-waits have so far not needed to be retuned to avoid unreasonably long spin-wait durations. They can be retuned as necessary in the future.
  - Added a version of YieldProcessorNormalized() that normalizes based on spin-wait counts tuned for pre-Skylake processors for spin-wait loops that have not been retuned.
- Moved some files around to make YieldProcessorNormalized() and the like available in more places. Initialization is still only done in the VM. Uses outside the VM will use the defaults, where there would be no significant change from before.
@vancem
Copy link

vancem commented Jan 4, 2019

@kouvel - If I understand this right, YieldProcessorNormalizedForPreSkylakeCount is basically a marker that we have not thought about the yield count much, and thus arguably could be wrong. Ideally all of these uses go away eventually. In the mean time, we assume that the count should be divided by 8 to get the 'normalized' yield count.

Also I am assuming that we want all YieldProcessor() calls gone (does this change complete that mission? If so we should make it prviate).

It would be good if in general we put some good comments in yieldProcessorNormalzied.h, Conceptually what is going on is not hard (we want a yield that represents a certain amount of time) regardless of processor, and that Intel processors after Skylake have a 'yield' instruction that is much slower than processors before that. THis is hard to discover from code but easy to simply state in the YieldProcessorNormalized.h file.

In particular we should have a comment on YieldProcessorNormalizedForPreSkylakeCount that tells people the reasoning behind the method and how it is to be used (and hopefully ultimately removed).

But I don't have a problem with the actual delta.

@kouvel
Copy link
Author

kouvel commented Jan 5, 2019

Were these changes speculative

If I understand this right, YieldProcessorNormalizedForPreSkylakeCount is basically a marker that we have not thought about the yield count much

Yea I haven't seen many reports of issues in these spin-waits (maybe one minor case that I sort of recall). They are not necessarily short spin-waits, so there is potential for it to become an issue. Now that the yield normalization stuff has been in for a while and seems to be working as expected, I'm just normalizing the remaining spin-waits to avoid bad hardware-specific scaling. Retuning them can be done later if/when necessary, and after all of those cases are done the function would not be necessary anymore.

Also I am assuming that we want all YieldProcessor() calls gone (does this change complete that mission? If so we should make it prviate).

The only other uses are in the GC, where scaling is done once on a unit value that is used elsewhere, and the code has to be compatible with NetCore and NetFx. I have now made YieldProcessor() private outside the GC.

Addressed other feedback as well.

@kouvel kouvel merged commit 616fea5 into dotnet:master Jan 11, 2019
@kouvel kouvel deleted the YpnFix branch January 11, 2019 01:51
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Normalize a few more spin-wait loops

- Fixed a few more spin-waits to normalize the spin-wait duration between processors
- These spin-waits have so far not needed to be retuned to avoid unreasonably long spin-wait durations. They can be retuned as necessary in the future.
  - Added a version of YieldProcessorNormalized() that normalizes based on spin-wait counts tuned for pre-Skylake processors for spin-wait loops that have not been retuned.
- Moved some files around to make YieldProcessorNormalized() and the like available in more places. Initialization is still only done in the VM. Uses outside the VM will use the defaults, where there would be no significant change from before.
- Made YieldProcessor() private outside of the GC and added System_YieldProcessor() for when the system-defined implementation is intended to be used

Commit migrated from dotnet/coreclr@616fea5
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants