-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Remove redundant zero-initialization of struct temps with GC fields. #13868
Conversation
Structs with GC pointer fields are fully zero-initialized in the prolog if compInitMem is true. Therefore, we don't need to insert zero-initialization for the result of newobj or for inlinee locals when they are structs with GC pointer fields and the basic bock is not in a loop.
Frameworks diffs:
|
@AndyAyersMS @briansull @dotnet/jit-contrib PTAL |
Does this also hold when we perform the recursive tail call optimization? |
src/jit/importer.cpp
Outdated
// Structs with GC pointer fields are fully zero-initialized in the prolog if compInitMem is | ||
// true. | ||
// Therefore, we don't need to insert zero-initialization here if this block is not in a loop. | ||
if (!info.compInitMem || !containsGCPtr || ((block->bbFlags & BBF_BACKWARD_JUMP) != 0) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can refactor this expression into its own method since your change uses it twice.
!info.compInitMem || ((block->bbFlags & BBF_BACKWARD_JUMP) != 0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@briansull I factored the expression into its own method. PTAL.
Yes, it does. In case of recursive calls we set BBF_BACKWARD_JUMP on the blocks that may form a loop. Line 6892 in b7cce5a
|
Looks like the regexdna failure in OSX10.12 x64 Checked Build and Test is https://github.com/dotnet/coreclr/issues/13779 and was fixed by #13857 a few hours ago. Rerunning the leg. |
@dotnet-bot test OSX10.12 x64 Checked Build and Test |
Fixes #13822 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a couple of questions for areas to investigate.
@@ -6845,7 +6845,7 @@ var_types Compiler::impImportCall(OPCODE opcode, | |||
exactContextHnd = callInfo->contextHandle; | |||
exactContextNeedsRuntimeLookup = callInfo->exactContextNeedsRuntimeLookup == TRUE; | |||
|
|||
// Recursive call is treaded as a loop to the begining of the method. | |||
// Recursive call is treated as a loop to the begining of the method. | |||
if (methHnd == info.compMethodHnd) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the right test for recursion? (see gtIsRecursiveCall
where we take care to properly handle sites inlinees).
Might be nice to ensure we have a test case with A->B->A call pattern, where B tailcalls A, A introduces a local struct, B gets inlined, and things will break if the local struct is not zeroed each time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turns out we can't rely on BBF_BACKWARD_JUMP to block the struct newobj initialization optimization when tail recursion elimination is possible. It may be set too late, after newobj importing already happened. I added a test case that was failing with my previous changes. To fix that I modified fgMorphRecursiveFastTailCallIntoLoop to initialize temp structs with GC pointers on each iteration.
The test for recursion here is not correct as you indicated. Since this change no longer depends on it, I'd like to fix it in a separate PR. It will affect impMakeDiscretionaryInlineObservations so will likely result in diffs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok by me to fix this up later.
src/jit/compiler.hpp
Outdated
bool Compiler::fgStructTempNeedsExplicitZeroInit(CORINFO_CLASS_HANDLE structType, BasicBlock* block) | ||
{ | ||
DWORD typeFlags = info.compCompHnd->getClassAttribs(structType); | ||
bool containsGCPtr = ((typeFlags & CORINFO_FLG_CONTAINS_GC_PTR) != 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you verify this is the right test? See for instance impNormStructType
where we also look at CORINFO_FLG_CONTAINS_STACK_PTR
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this was missing some structs. I modified the check to match what's used in codegen to decide whether to zero-initialize the struct in the prolog. The codesize improvement is slightly better now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Frameworks diffs:
Summary:
(Note: Lower is better)
Total bytes of diff: -16580 (-0.13 % of base)
diff is an improvement.
Total byte diff includes 0 bytes from reconciling methods
Base had 0 unique methods, 0 unique bytes
Diff had 0 unique methods, 0 unique bytes
Top file improvements by size (bytes):
-4874 : Microsoft.CodeAnalysis.CSharp.dasm (-0.22 % of base)
-4566 : Microsoft.CodeAnalysis.dasm (-0.58 % of base)
-2109 : Microsoft.CodeAnalysis.VisualBasic.dasm (-0.09 % of base)
-1705 : System.Collections.Immutable.dasm (-1.18 % of base)
-1119 : System.Reflection.Metadata.dasm (-1.45 % of base)
27 total files with size differences (27 improved, 0 regressed), 52 unchanged.
Top method regessions by size (bytes):
23 : Microsoft.CodeAnalysis.VisualBasic.dasm - Conversions:ClassifyQueryLambdaConversion(ref,ref,ref,byref):int
Top method improvements by size (bytes):
-2957 : Microsoft.CodeAnalysis.dasm - AttributeDescription:.cctor()
-1094 : System.Reflection.Metadata.dasm - MetadataReader:InitializeProjectedTypes()
-334 : System.Collections.Immutable.dasm - Enumerator:MoveNext():bool:this (19 methods)
-312 : Microsoft.CodeAnalysis.CSharp.dasm - OverloadResolution:GetEnumOperation(int,ref,ref,ref,ref):this
-308 : Microsoft.CodeAnalysis.dasm - <GetVerStrings>d__22:MoveNext():bool:this
452 total methods with size differences (451 improved, 1 regressed), 66047 unchanged.
…s with GC pointers. Modify fgStructTempNeedsExplicitZeroInit to check for structs with GC pointers more accurately. Add a test.
@AndyAyersMS @briansull PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates. Looks good.
Thanks, @erozenfeld. |
Structs with GC pointer fields are fully zero-initialized in the prolog if compInitMem is true.
Therefore, we don't need to insert zero-initialization for the result of newobj or for inlinee locals
when they are structs with GC pointer fields and the basic bock is not in a loop.