Changeset 215292 in webkit for trunk/Source/JavaScriptCore
- Timestamp: Apr 12, 2017, 2:22:14 PM (8 years ago)
- Location: trunk/Source/JavaScriptCore
- Files: 2 added, 2 deleted, 21 edited, 2 moved
trunk/Source/JavaScriptCore/CMakeLists.txt
r215103 r215292
     assembler/MacroAssemblerX86Common.cpp

+    b3/air/AirAllocateRegistersAndStackByLinearScan.cpp
     b3/air/AirAllocateRegistersByGraphColoring.cpp
-    b3/air/AirAllocateRegistersByLinearScan.cpp
     b3/air/AirAllocateStackByGraphColoring.cpp
     b3/air/AirArg.cpp
…
     b3/air/AirLowerEntrySwitch.cpp
     b3/air/AirLowerMacros.cpp
+    b3/air/AirLowerStackArgs.cpp
     b3/air/AirOptimizeBlockOrder.cpp
     b3/air/AirPadInterference.cpp
…
     b3/air/AirSimplifyCFG.cpp
     b3/air/AirSpecial.cpp
-    b3/air/AirSpillEverything.cpp
     b3/air/AirStackSlot.cpp
     b3/air/AirStackSlotKind.cpp
trunk/Source/JavaScriptCore/ChangeLog
r215272 r215292
+2017-04-12  Filip Pizlo  <[email protected]>
+
+        B3 -O1 should not allocateStackByGraphColoring
+        https://bugs.webkit.org/show_bug.cgi?id=170742
+
+        Reviewed by Keith Miller.
+
+        One of B3 -O1's longest running phases is allocateStackByGraphColoring. One approach to
+        this would be to make that phase cheaper. But it's weird that this phase reruns
+        liveness after register allocation already ran liveness. If only it could reuse the
+        liveness computed by register allocation then it would run a lot faster. At -O2, we do
+        not want this, since we run phases between register allocation and stack allocation,
+        and those phases are free to change the liveness of spill slots (in fact,
+        fixObviousSpills will both shorten and lengthen live ranges because of load and store
+        elimination, respectively). But at -O1, we don't really need to run any phases between
+        register and stack allocation.
+
+        This changes Air's backend in the following ways:
+
+        - Linear scan does stack allocation. This means that we don't need to run
+          allocateStackByGraphColoring at all. In reality, we reuse some of its innards, but
+          we don't run the expensive part of it (liveness->interference->coalescing->coloring).
+          This is a speed-up because we only run liveness once and reuse it for both register
+          and stack allocation.
+
+        - Phases that previously ran between register and stack allocation are taken care of,
+          each in its own special way:
+
+          -> handleCalleeSaves: this is now a utility function called by both
+             allocateStackByGraphColoring and allocateRegistersAndStackByLinearScan.
+
+          -> fixObviousSpills: we didn't run this at -O1, so nothing needs to be done.
+
+          -> lowerAfterRegAlloc: this needed to be able to run before stack allocation because
+             it could change register usage (vis a vis callee saves) and it could introduce
+             spill slots. I changed this phase to have a secondary mode for when it runs after
+             stack allocation.
+
+        - The part of allocateStackByGraphColoring that lowered stack addresses and took care
+          of the call arg area is now a separate phase called lowerStackArgs. We run this phase
+          regardless of optimization level. It's a cheap and general lowering.
+
+        This also removes spillEverything, because we never use that phase, we never test it,
+        and it got in the way in this refactoring.
+
+        This is a 21% speed-up on wasm -O1 compile times. This does not significantly change
+        -O1 throughput. We had already disabled allocateStack's most important optimization
+        (spill coalescing). This probably regresses average stack frame size, but I didn't
+        measure by how much. Stack frame size is really not that important. The algorithm in
+        allocateStackByGraphColoring is about much more than optimal frame size; it also
+        tries to avoid having to zero-extend 32-bit spills, it kills dead code, and of course
+        it coalesces.
+
+        * CMakeLists.txt:
+        * JavaScriptCore.xcodeproj/project.pbxproj:
+        * b3/B3Procedure.cpp:
+        (JSC::B3::Procedure::calleeSaveRegisterAtOffsetList):
+        (JSC::B3::Procedure::calleeSaveRegisters): Deleted.
+        * b3/B3Procedure.h:
+        * b3/B3StackmapGenerationParams.cpp:
+        (JSC::B3::StackmapGenerationParams::unavailableRegisters):
+        * b3/air/AirAllocateRegistersAndStackByLinearScan.cpp: Copied from Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp.
+        (JSC::B3::Air::allocateRegistersAndStackByLinearScan):
+        (JSC::B3::Air::allocateRegistersByLinearScan): Deleted.
+        * b3/air/AirAllocateRegistersAndStackByLinearScan.h: Copied from Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.h.
+        * b3/air/AirAllocateRegistersByLinearScan.cpp: Removed.
+        * b3/air/AirAllocateRegistersByLinearScan.h: Removed.
+        * b3/air/AirAllocateStackByGraphColoring.cpp:
+        (JSC::B3::Air::allocateEscapedStackSlots):
+        (JSC::B3::Air::updateFrameSizeBasedOnStackSlots):
+        (JSC::B3::Air::allocateStackByGraphColoring):
+        * b3/air/AirAllocateStackByGraphColoring.h:
+        * b3/air/AirArg.cpp:
+        (JSC::B3::Air::Arg::stackAddr):
+        * b3/air/AirArg.h:
+        (JSC::B3::Air::Arg::stackAddr): Deleted.
+        * b3/air/AirCode.cpp:
+        (JSC::B3::Air::Code::addStackSlot):
+        (JSC::B3::Air::Code::setCalleeSaveRegisterAtOffsetList):
+        (JSC::B3::Air::Code::calleeSaveRegisterAtOffsetList):
+        (JSC::B3::Air::Code::dump):
+        * b3/air/AirCode.h:
+        (JSC::B3::Air::Code::setStackIsAllocated):
+        (JSC::B3::Air::Code::stackIsAllocated):
+        (JSC::B3::Air::Code::calleeSaveRegisters):
+        * b3/air/AirGenerate.cpp:
+        (JSC::B3::Air::prepareForGeneration):
+        (JSC::B3::Air::generate):
+        * b3/air/AirHandleCalleeSaves.cpp:
+        (JSC::B3::Air::handleCalleeSaves):
+        * b3/air/AirHandleCalleeSaves.h:
+        * b3/air/AirLowerAfterRegAlloc.cpp:
+        (JSC::B3::Air::lowerAfterRegAlloc):
+        * b3/air/AirLowerStackArgs.cpp: Added.
+        (JSC::B3::Air::lowerStackArgs):
+        * b3/air/AirLowerStackArgs.h: Added.
+        * b3/testb3.cpp:
+        (JSC::B3::testPinRegisters):
+        * ftl/FTLCompile.cpp:
+        (JSC::FTL::compile):
+        * jit/RegisterAtOffsetList.h:
+        * wasm/WasmB3IRGenerator.cpp:
+        (JSC::Wasm::parseAndCompile):
+
 2017-04-12  Michael Saboff  <[email protected]>

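The phase ordering that the ChangeLog describes can be summarized in a small sketch. This is not the code from AirGenerate.cpp, which is not shown in this part of the changeset; `allocationPhases` is a hypothetical helper, the `optLevel < 2` test is taken from the header comment of the linear scan allocator, and the relative order of fixObviousSpills and lowerAfterRegAlloc at -O2 is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of WebKit): lists the allocation-related Air
// phases in the order the ChangeLog implies for a given optimization level.
std::vector<std::string> allocationPhases(unsigned optLevel)
{
    std::vector<std::string> phases;
    if (optLevel < 2) {
        // -O1: one phase allocates both registers and stack, so liveness is
        // computed only once.
        phases.push_back("allocateRegistersAndStackByLinearScan");
        // Runs in its secondary mode, after stack allocation.
        phases.push_back("lowerAfterRegAlloc");
    } else {
        // -O2: phases between register and stack allocation may change the
        // liveness of spill slots, so stack allocation recomputes it.
        phases.push_back("allocateRegistersByGraphColoring");
        phases.push_back("fixObviousSpills");   // order relative to the next phase is assumed
        phases.push_back("lowerAfterRegAlloc");
        phases.push_back("allocateStackByGraphColoring");
    }
    // Cheap, general lowering that runs regardless of optimization level.
    phases.push_back("lowerStackArgs");
    return phases;
}
```

Note that handleCalleeSaves does not appear as a separate entry because, per the ChangeLog, it is now a utility called from inside both stack allocation paths.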
trunk/Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj
r215265 r215292 187 187 0F2AC5661E8A0B770001EE3F /* AirFixSpillsAfterTerminals.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F2AC5641E8A0B760001EE3F /* AirFixSpillsAfterTerminals.cpp */; }; 188 188 0F2AC5671E8A0B790001EE3F /* AirFixSpillsAfterTerminals.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F2AC5651E8A0B760001EE3F /* AirFixSpillsAfterTerminals.h */; }; 189 0F2AC56A1E8A0BD30001EE3F /* AirAllocateRegisters ByLinearScan.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F2AC5681E8A0BD10001EE3F /* AirAllocateRegistersByLinearScan.cpp */; };190 0F2AC56B1E8A0BD50001EE3F /* AirAllocateRegisters ByLinearScan.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F2AC5691E8A0BD10001EE3F /* AirAllocateRegistersByLinearScan.h */; };189 0F2AC56A1E8A0BD30001EE3F /* AirAllocateRegistersAndStackByLinearScan.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F2AC5681E8A0BD10001EE3F /* AirAllocateRegistersAndStackByLinearScan.cpp */; }; 190 0F2AC56B1E8A0BD50001EE3F /* AirAllocateRegistersAndStackByLinearScan.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F2AC5691E8A0BD10001EE3F /* AirAllocateRegistersAndStackByLinearScan.h */; }; 191 191 0F2AC56E1E8D7B000001EE3F /* AirPhaseInsertionSet.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F2AC56C1E8D7AFF0001EE3F /* AirPhaseInsertionSet.cpp */; }; 192 192 0F2AC56F1E8D7B030001EE3F /* AirPhaseInsertionSet.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F2AC56D1E8D7AFF0001EE3F /* AirPhaseInsertionSet.h */; }; … … 456 456 0F5B4A331C84F0D600F1B17E /* SlowPathReturnType.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F5B4A321C84F0D600F1B17E /* SlowPathReturnType.h */; settings = {ATTRIBUTES = (Private, ); }; }; 457 457 0F5CF9811E96F17F00C18692 /* AirTmpMap.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F5CF9801E96F17D00C18692 /* AirTmpMap.h */; }; 458 0F5CF9841E9D537700C18692 /* AirLowerStackArgs.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F5CF9831E9D537500C18692 /* AirLowerStackArgs.h */; }; 459 
0F5CF9851E9D537A00C18692 /* AirLowerStackArgs.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F5CF9821E9D537500C18692 /* AirLowerStackArgs.cpp */; }; 458 460 0F5D085D1B8CF99D001143B4 /* DFGNodeOrigin.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F5D085C1B8CF99D001143B4 /* DFGNodeOrigin.cpp */; }; 459 461 0F5EF91E16878F7A003E5C25 /* JITThunks.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F5EF91B16878F78003E5C25 /* JITThunks.cpp */; }; … … 960 962 0FEC85871BDACDC70080FF74 /* AirSpecial.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0FEC85621BDACDC70080FF74 /* AirSpecial.cpp */; }; 961 963 0FEC85881BDACDC70080FF74 /* AirSpecial.h in Headers */ = {isa = PBXBuildFile; fileRef = 0FEC85631BDACDC70080FF74 /* AirSpecial.h */; }; 962 0FEC85891BDACDC70080FF74 /* AirSpillEverything.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0FEC85641BDACDC70080FF74 /* AirSpillEverything.cpp */; };963 0FEC858A1BDACDC70080FF74 /* AirSpillEverything.h in Headers */ = {isa = PBXBuildFile; fileRef = 0FEC85651BDACDC70080FF74 /* AirSpillEverything.h */; };964 964 0FEC858B1BDACDC70080FF74 /* AirStackSlot.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0FEC85661BDACDC70080FF74 /* AirStackSlot.cpp */; }; 965 965 0FEC858C1BDACDC70080FF74 /* AirStackSlot.h in Headers */ = {isa = PBXBuildFile; fileRef = 0FEC85671BDACDC70080FF74 /* AirStackSlot.h */; }; … … 2731 2731 0F2AC5641E8A0B760001EE3F /* AirFixSpillsAfterTerminals.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = AirFixSpillsAfterTerminals.cpp; path = b3/air/AirFixSpillsAfterTerminals.cpp; sourceTree = "<group>"; }; 2732 2732 0F2AC5651E8A0B760001EE3F /* AirFixSpillsAfterTerminals.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirFixSpillsAfterTerminals.h; path = b3/air/AirFixSpillsAfterTerminals.h; sourceTree = "<group>"; }; 2733 0F2AC5681E8A0BD10001EE3F /* AirAllocateRegisters ByLinearScan.cpp */ = {isa = PBXFileReference; 
fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = AirAllocateRegistersByLinearScan.cpp; path = b3/air/AirAllocateRegistersByLinearScan.cpp; sourceTree = "<group>"; };2734 0F2AC5691E8A0BD10001EE3F /* AirAllocateRegisters ByLinearScan.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirAllocateRegistersByLinearScan.h; path = b3/air/AirAllocateRegistersByLinearScan.h; sourceTree = "<group>"; };2733 0F2AC5681E8A0BD10001EE3F /* AirAllocateRegistersAndStackByLinearScan.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = AirAllocateRegistersAndStackByLinearScan.cpp; path = b3/air/AirAllocateRegistersAndStackByLinearScan.cpp; sourceTree = "<group>"; }; 2734 0F2AC5691E8A0BD10001EE3F /* AirAllocateRegistersAndStackByLinearScan.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirAllocateRegistersAndStackByLinearScan.h; path = b3/air/AirAllocateRegistersAndStackByLinearScan.h; sourceTree = "<group>"; }; 2735 2735 0F2AC56C1E8D7AFF0001EE3F /* AirPhaseInsertionSet.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = AirPhaseInsertionSet.cpp; path = b3/air/AirPhaseInsertionSet.cpp; sourceTree = "<group>"; }; 2736 2736 0F2AC56D1E8D7AFF0001EE3F /* AirPhaseInsertionSet.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirPhaseInsertionSet.h; path = b3/air/AirPhaseInsertionSet.h; sourceTree = "<group>"; }; … … 2995 2995 0F5B4A321C84F0D600F1B17E /* SlowPathReturnType.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = SlowPathReturnType.h; sourceTree = "<group>"; }; 2996 2996 0F5CF9801E96F17D00C18692 /* AirTmpMap.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirTmpMap.h; path = b3/air/AirTmpMap.h; sourceTree = "<group>"; }; 2997 0F5CF9821E9D537500C18692 /* 
AirLowerStackArgs.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = AirLowerStackArgs.cpp; path = b3/air/AirLowerStackArgs.cpp; sourceTree = "<group>"; }; 2998 0F5CF9831E9D537500C18692 /* AirLowerStackArgs.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirLowerStackArgs.h; path = b3/air/AirLowerStackArgs.h; sourceTree = "<group>"; }; 2997 2999 0F5D085C1B8CF99D001143B4 /* DFGNodeOrigin.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = DFGNodeOrigin.cpp; path = dfg/DFGNodeOrigin.cpp; sourceTree = "<group>"; }; 2998 3000 0F5EF91B16878F78003E5C25 /* JITThunks.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = JITThunks.cpp; sourceTree = "<group>"; }; … … 3512 3514 0FEC85621BDACDC70080FF74 /* AirSpecial.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = AirSpecial.cpp; path = b3/air/AirSpecial.cpp; sourceTree = "<group>"; }; 3513 3515 0FEC85631BDACDC70080FF74 /* AirSpecial.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirSpecial.h; path = b3/air/AirSpecial.h; sourceTree = "<group>"; }; 3514 0FEC85641BDACDC70080FF74 /* AirSpillEverything.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = AirSpillEverything.cpp; path = b3/air/AirSpillEverything.cpp; sourceTree = "<group>"; };3515 0FEC85651BDACDC70080FF74 /* AirSpillEverything.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirSpillEverything.h; path = b3/air/AirSpillEverything.h; sourceTree = "<group>"; };3516 3516 0FEC85661BDACDC70080FF74 /* AirStackSlot.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = AirStackSlot.cpp; path = b3/air/AirStackSlot.cpp; sourceTree = "<group>"; }; 3517 3517 
0FEC85671BDACDC70080FF74 /* AirStackSlot.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirStackSlot.h; path = b3/air/AirStackSlot.h; sourceTree = "<group>"; }; … … 5576 5576 isa = PBXGroup; 5577 5577 children = ( 5578 0F2AC5681E8A0BD10001EE3F /* AirAllocateRegistersAndStackByLinearScan.cpp */, 5579 0F2AC5691E8A0BD10001EE3F /* AirAllocateRegistersAndStackByLinearScan.h */, 5578 5580 7965C2141E5D799600B7591D /* AirAllocateRegistersByGraphColoring.cpp */, 5579 5581 7965C2151E5D799600B7591D /* AirAllocateRegistersByGraphColoring.h */, 5580 0F2AC5681E8A0BD10001EE3F /* AirAllocateRegistersByLinearScan.cpp */,5581 0F2AC5691E8A0BD10001EE3F /* AirAllocateRegistersByLinearScan.h */,5582 5582 0FEC85481BDACDC70080FF74 /* AirAllocateStackByGraphColoring.cpp */, 5583 5583 0FEC85491BDACDC70080FF74 /* AirAllocateStackByGraphColoring.h */, … … 5638 5638 0F6183271C45BF070072450B /* AirLowerMacros.cpp */, 5639 5639 0F6183281C45BF070072450B /* AirLowerMacros.h */, 5640 0F5CF9821E9D537500C18692 /* AirLowerStackArgs.cpp */, 5641 0F5CF9831E9D537500C18692 /* AirLowerStackArgs.h */, 5640 5642 264091FA1BE2FD4100684DB2 /* AirOpcode.opcodes */, 5641 5643 0FB3878C1BFBC44D00E3AB1E /* AirOptimizeBlockOrder.cpp */, … … 5655 5657 0FEC85621BDACDC70080FF74 /* AirSpecial.cpp */, 5656 5658 0FEC85631BDACDC70080FF74 /* AirSpecial.h */, 5657 0FEC85641BDACDC70080FF74 /* AirSpillEverything.cpp */,5658 0FEC85651BDACDC70080FF74 /* AirSpillEverything.h */,5659 5659 0FEC85661BDACDC70080FF74 /* AirStackSlot.cpp */, 5660 5660 0FEC85671BDACDC70080FF74 /* AirStackSlot.h */, … … 8136 8136 0F338DFE1BED51270013C88F /* AirSimplifyCFG.h in Headers */, 8137 8137 0FEC85881BDACDC70080FF74 /* AirSpecial.h in Headers */, 8138 0FEC858A1BDACDC70080FF74 /* AirSpillEverything.h in Headers */,8139 8138 0FEC858C1BDACDC70080FF74 /* AirStackSlot.h in Headers */, 8140 8139 0F2BBD9E1C5FF4050023EF23 /* AirStackSlotKind.h in Headers */, … … 8212 8211 0FEC85121BDACDAC0080FF74 /* 
B3ConstDoubleValue.h in Headers */, 8213 8212 43422A631C158E6D00E2EB98 /* B3ConstFloatValue.h in Headers */, 8214 0F2AC56B1E8A0BD50001EE3F /* AirAllocateRegisters ByLinearScan.h in Headers */,8213 0F2AC56B1E8A0BD50001EE3F /* AirAllocateRegistersAndStackByLinearScan.h in Headers */, 8215 8214 0FEC85B31BDED9570080FF74 /* B3ConstPtrValue.h in Headers */, 8216 8215 0F338DF61BE93D550013C88F /* B3ConstrainedValue.h in Headers */, … … 8975 8974 A50E4B6418809DD50068A46D /* JSGlobalObjectRuntimeAgent.h in Headers */, 8976 8975 0F2C63C41E69EF9400C13839 /* B3MemoryValueInlines.h in Headers */, 8976 0F5CF9841E9D537700C18692 /* AirLowerStackArgs.h in Headers */, 8977 8977 A503FA2A188F105900110F14 /* JSGlobalObjectScriptDebugServer.h in Headers */, 8978 8978 A513E5C0185BFACC007E95AD /* JSInjectedScriptHost.h in Headers */, … … 10022 10022 0F338DFD1BED51270013C88F /* AirSimplifyCFG.cpp in Sources */, 10023 10023 0FEC85871BDACDC70080FF74 /* AirSpecial.cpp in Sources */, 10024 0FEC85891BDACDC70080FF74 /* AirSpillEverything.cpp in Sources */,10025 10024 0FEC858B1BDACDC70080FF74 /* AirStackSlot.cpp in Sources */, 10026 10025 0F2BBD9D1C5FF4050023EF23 /* AirStackSlotKind.cpp in Sources */, … … 10243 10242 0F2BDC15151C5D4D00CD8910 /* DFGFixupPhase.cpp in Sources */, 10244 10243 0F20177F1DCADC3300EA5950 /* DFGFlowIndexing.cpp in Sources */, 10245 0F2AC56A1E8A0BD30001EE3F /* AirAllocateRegisters ByLinearScan.cpp in Sources */,10244 0F2AC56A1E8A0BD30001EE3F /* AirAllocateRegistersAndStackByLinearScan.cpp in Sources */, 10246 10245 0F9D339617FFC4E60073C2BC /* DFGFlushedAt.cpp in Sources */, 10247 10246 A7D89CF717A0B8CC00773AD8 /* DFGFlushFormat.cpp in Sources */, … … 10758 10757 8642C512151C083D0046D4EF /* RegExpMatchesArray.cpp in Sources */, 10759 10758 14280843107EC0930013E7B2 /* RegExpObject.cpp in Sources */, 10759 0F5CF9851E9D537A00C18692 /* AirLowerStackArgs.cpp in Sources */, 10760 10760 14280844107EC0930013E7B2 /* RegExpPrototype.cpp in Sources */, 10761 10761 
6540C7A01B82E1C3000F6B79 /* RegisterAtOffset.cpp in Sources */, -
trunk/Source/JavaScriptCore/b3/B3Procedure.cpp
r214901 r215292
 }

-const RegisterAtOffsetList& Procedure::calleeSaveRegisters() const
-{
-    return code().calleeSaveRegisters();
+RegisterAtOffsetList Procedure::calleeSaveRegisterAtOffsetList() const
+{
+    return code().calleeSaveRegisterAtOffsetList();
 }

trunk/Source/JavaScriptCore/b3/B3Procedure.h
r214901 r215292

     JS_EXPORT_PRIVATE unsigned frameSize() const;
-    JS_EXPORT_PRIVATE const RegisterAtOffsetList& calleeSaveRegisters() const;
+    JS_EXPORT_PRIVATE RegisterAtOffsetList calleeSaveRegisterAtOffsetList() const;

     PCToOriginMap& pcToOriginMap() { return m_pcToOriginMap; }
trunk/Source/JavaScriptCore/b3/B3StackmapGenerationParams.cpp
r214887 r215292

     RegisterSet unsavedCalleeSaves = RegisterSet::vmCalleeSaveRegisters();
-    for (const RegisterAtOffset& regAtOffset : m_context.code->calleeSaveRegisters())
-        unsavedCalleeSaves.clear(regAtOffset.reg());
+    unsavedCalleeSaves.exclude(m_context.code->calleeSaveRegisters());

     result.merge(unsavedCalleeSaves);
trunk/Source/JavaScriptCore/b3/air/AirAllocateRegistersAndStackByLinearScan.cpp
r215291 r215292

 #include "config.h"
-#include "AirAllocateRegistersByLinearScan.h"
+#include "AirAllocateRegistersAndStackByLinearScan.h"

 #if ENABLE(B3_JIT)

+#include "AirAllocateStackByGraphColoring.h"
 #include "AirArgInlines.h"
 #include "AirCode.h"
 #include "AirFixSpillsAfterTerminals.h"
+#include "AirHandleCalleeSaves.h"
 #include "AirPhaseInsertionSet.h"
 #include "AirInstInlines.h"
…
     bool isUnspillable { false };
     bool didBuildPossibleRegs { false };
+    unsigned spillIndex { 0 };
 };

…
     void run()
     {
+        padInterference(m_code);
         buildRegisterSet();
         buildIndices();
…
         }
         for (;;) {
-            prepareIntervals();
+            prepareIntervalsForScanForRegisters();
             m_didSpill = false;
             forEachBank(
                 [&] (Bank bank) {
-                    attemptScan(bank);
+                    attemptScanForRegisters(bank);
                 });
             if (!m_didSpill)
…
         insertSpillCode();
         assignRegisters();
+        fixSpillsAfterTerminals(m_code);
+
+        handleCalleeSaves(m_code);
+        allocateEscapedStackSlots(m_code);
+        prepareIntervalsForScanForStack();
+        scanForStack();
+        updateFrameSizeBasedOnStackSlots(m_code);
+        m_code.setStackIsAllocated(true);
     }

…
     void buildIntervals()
     {
+        TimingScope timingScope("LinearScan::buildIntervals");
         UnifiedTmpLiveness liveness(m_code);

…
                     dataLog(" ", tmp, ": ", m_map[tmp], "\n");
                 });
+            dataLog("Clobbers: ", listDump(m_clobbers), "\n");
         }
     }
…
     }

-    void prepareIntervals()
+    void prepareIntervalsForScanForRegisters()
+    {
+        prepareIntervals(
+            [&] (TmpData& data) -> bool {
+                if (data.spilled)
+                    return false;
+
+                data.assigned = Reg();
+                return true;
+            });
+    }
+
+    void prepareIntervalsForScanForStack()
+    {
+        prepareIntervals(
+            [&] (TmpData& data) -> bool {
+                return data.spilled;
+            });
+    }
+
+    template<typename SelectFunc>
+    void prepareIntervals(const SelectFunc& selectFunc)
     {
         m_tmps.resize(0);
…
             [&] (Tmp tmp) {
                 TmpData& data = m_map[tmp];
-                if (data.spilled)
+                if (!selectFunc(data))
                     return;

-                data.assigned = Reg();
                 m_tmps.append(tmp);
             });
…
             });

-        if (verbose()) {
+        if (verbose())
             dataLog("Tmps: ", listDump(m_tmps), "\n");
-            dataLog("Clobbers: ", listDump(m_clobbers), "\n");
-        }
     }

…
     }

-    void attemptScan(Bank bank)
+    void attemptScanForRegisters(Bank bank)
     {
         // This is modeled after LinearScanRegisterAllocation in Fig. 1 in
…
     }

+    void scanForStack()
+    {
+        // This is loosely modeled after LinearScanRegisterAllocation in Fig. 1 in
+        // http://dl.acm.org/citation.cfm?id=330250.
+
+        m_active.clear();
+        m_usedSpillSlots.clearAll();
+
+        for (Tmp& tmp : m_tmps) {
+            TmpData& entry = m_map[tmp];
+            if (!entry.spilled)
+                continue;
+
+            size_t index = entry.interval.begin();
+
+            // This is ExpireOldIntervals in Fig. 1.
+            while (!m_active.isEmpty()) {
+                Tmp tmp = m_active.first();
+                TmpData& entry = m_map[tmp];
+
+                bool expired = entry.interval.end() <= index;
+
+                if (!expired)
+                    break;
+
+                m_active.removeFirst();
+                m_usedSpillSlots.clear(entry.spillIndex);
+            }
+
+            entry.spillIndex = m_usedSpillSlots.findBit(0, false);
+            ptrdiff_t offset = -static_cast<ptrdiff_t>(m_code.frameSize()) - static_cast<ptrdiff_t>(entry.spillIndex) * 8 - 8;
+            if (verbose())
+                dataLog(" Assigning offset = ", offset, " to spill ", pointerDump(entry.spilled), " for ", tmp, "\n");
+            entry.spilled->setOffsetFromFP(offset);
+            m_usedSpillSlots.set(entry.spillIndex);
+            m_active.append(tmp);
+        }
+    }
+
     void insertSpillCode()
     {
…
     Deque<Tmp> m_active;
     RegisterSet m_activeRegs;
+    BitVector m_usedSpillSlots;
     bool m_didSpill { false };
 };

-void runLinearScan(Code& code)
+} // anonymous namespace
+
+void allocateRegistersAndStackByLinearScan(Code& code)
 {
+    PhaseScope phaseScope(code, "allocateRegistersAndStackByLinearScan");
     if (verbose())
         dataLog("Air before linear scan:\n", code);
…
 }

-} // anonymous namespace
-
-void allocateRegistersByLinearScan(Code& code)
-{
-    PhaseScope phaseScope(code, "allocateRegistersByLinearScan");
-    padInterference(code);
-    runLinearScan(code);
-    fixSpillsAfterTerminals(code);
-}
-
 } } } // namespace JSC::B3::Air

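The slot reuse done by scanForStack() can be illustrated outside of Air with a minimal standalone sketch. `Interval`, `SpillAssignment`, and `assignSpillSlots` are hypothetical names introduced here; the sketch assumes intervals are already sorted by their begin index, that every spill occupies 8 bytes, and it uses `std::vector<bool>` and `std::deque` as stand-ins for WTF's BitVector and Deque.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-ins for Air's per-Tmp interval data.
struct Interval {
    size_t begin;
    size_t end; // exclusive: a slot is free again once end <= the current begin
};

struct SpillAssignment {
    unsigned slotIndex;
    std::ptrdiff_t offsetFromFP;
};

// Assigns an 8-byte spill slot to each interval, reusing the lowest-numbered
// slot that is free. `intervals` must be sorted by begin.
std::vector<SpillAssignment> assignSpillSlots(const std::vector<Interval>& intervals, unsigned frameSize)
{
    std::vector<SpillAssignment> result(intervals.size());
    std::vector<bool> usedSlots;   // usedSlots[i] is true while slot i is occupied
    std::deque<size_t> active;     // indices of live intervals, in insertion order

    for (size_t i = 0; i < intervals.size(); ++i) {
        size_t index = intervals[i].begin;

        // Expire old intervals. As in the diff, only the front of the queue is
        // inspected, so expiration stops at the first interval that is still live.
        while (!active.empty()) {
            size_t oldest = active.front();
            if (intervals[oldest].end > index)
                break;
            active.pop_front();
            usedSlots[result[oldest].slotIndex] = false;
        }

        // Find the first free slot, growing the bit set if every slot is in use.
        unsigned slot = 0;
        while (slot < usedSlots.size() && usedSlots[slot])
            ++slot;
        if (slot == usedSlots.size())
            usedSlots.push_back(false);
        usedSlots[slot] = true;

        // Spill slots are laid out below whatever the frame already holds.
        std::ptrdiff_t offset = -static_cast<std::ptrdiff_t>(frameSize)
            - static_cast<std::ptrdiff_t>(slot) * 8 - 8;
        result[i] = SpillAssignment { slot, offset };
        active.push_back(i);
    }
    return result;
}
```

With intervals [0, 4), [2, 6), and [4, 8) and an existing frame size of 16, the third interval starts exactly where the first one ends, so it reuses slot 0.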
trunk/Source/JavaScriptCore/b3/air/AirAllocateRegistersAndStackByLinearScan.h
r215291 r215292
 // http://dl.acm.org/citation.cfm?id=330250
 //
-// This is not Air's primary register allocator. We use it only when running at optLevel<2. That's not
-// the default level. This register allocator is optimized primarily for running quickly. It's expected
-// that improvements to this register allocator should focus on improving its execution time without much
-// regard for the quality of generated code. If you want good code, use graph coloring.
+// This is not Air's primary register allocator. We use it only when running at optLevel<2.
+// That's not the default level. This register allocator is optimized primarily for running
+// quickly. It's expected that improvements to this register allocator should focus on improving
+// its execution time without much regard for the quality of generated code. If you want good
+// code, use graph coloring.
 //
 // For Air's primary register allocator, see AirAllocateRegistersByGraphColoring.h|cpp.
-void allocateRegistersByLinearScan(Code&);
+//
+// This also does stack allocation as an afterthought. It does not do any spill coalescing.
+void allocateRegistersAndStackByLinearScan(Code&);

 } } } // namespace JSC::B3::Air
trunk/Source/JavaScriptCore/b3/air/AirAllocateStackByGraphColoring.cpp
r215073 r215292
 #include "AirArgInlines.h"
 #include "AirCode.h"
-#include "AirInsertionSet.h"
+#include "AirHandleCalleeSaves.h"
 #include "AirInstInlines.h"
 #include "AirLiveness.h"
…
 };

-} // anonymous namespace
-
-void allocateStackByGraphColoring(Code& code)
-{
-    PhaseScope phaseScope(code, "allocateStackByGraphColoring");
-
+Vector<StackSlot*> allocateEscapedStackSlotsImpl(Code& code)
+{
     // Allocate all of the escaped slots in order. This is kind of a crazy algorithm to allow for
     // the possibility of stack slots being assigned frame offsets before we even get here.
-    ASSERT(!code.frameSize());
+    RELEASE_ASSERT(!code.frameSize());
     Vector<StackSlot*> assignedEscapedStackSlots;
     Vector<StackSlot*> escapedStackSlotsWorklist;
…
         assignedEscapedStackSlots.append(slot);
     }
+    return assignedEscapedStackSlots;
+}
+
+template<typename Collection>
+void updateFrameSizeBasedOnStackSlotsImpl(Code& code, const Collection& collection)
+{
+    unsigned frameSize = 0;
+    for (StackSlot* slot : collection)
+        frameSize = std::max(frameSize, static_cast<unsigned>(-slot->offsetFromFP()));
+    code.setFrameSize(WTF::roundUpToMultipleOf(stackAlignmentBytes(), frameSize));
+}
+
+} // anonymous namespace
+
+void allocateEscapedStackSlots(Code& code)
+{
+    updateFrameSizeBasedOnStackSlotsImpl(code, allocateEscapedStackSlotsImpl(code));
+}
+
+void updateFrameSizeBasedOnStackSlots(Code& code)
+{
+    updateFrameSizeBasedOnStackSlotsImpl(code, code.stackSlots());
+}
+
+void allocateStackByGraphColoring(Code& code)
+{
+    PhaseScope phaseScope(code, "allocateStackByGraphColoring");
+
+    handleCalleeSaves(code);
+
+    Vector<StackSlot*> assignedEscapedStackSlots = allocateEscapedStackSlotsImpl(code);

     // Now we handle the spill slots.
…
     }

-    // Figure out how much stack we're using for stack slots.
-    unsigned frameSizeForStackSlots = 0;
-    for (StackSlot* slot : code.stackSlots()) {
-        frameSizeForStackSlots = std::max(
-            frameSizeForStackSlots,
-            static_cast<unsigned>(-slot->offsetFromFP()));
-    }
-
-    frameSizeForStackSlots = WTF::roundUpToMultipleOf(stackAlignmentBytes(), frameSizeForStackSlots);
-
-    // Now we need to deduce how much argument area we need.
-    for (BasicBlock* block : code) {
-        for (Inst& inst : *block) {
-            for (Arg& arg : inst.args) {
-                if (arg.isCallArg()) {
-                    // For now, we assume that we use 8 bytes of the call arg. But that's not
-                    // such an awesome assumption.
-                    // FIXME: https://bugs.webkit.org/show_bug.cgi?id=150454
-                    ASSERT(arg.offset() >= 0);
-                    code.requestCallArgAreaSizeInBytes(arg.offset() + 8);
-                }
-            }
-        }
-    }
-
-    code.setFrameSize(frameSizeForStackSlots + code.callArgAreaSizeInBytes());
-
-    // Finally, transform the code to use Addr's instead of StackSlot's. This is a lossless
-    // transformation since we can search the StackSlots array to figure out which StackSlot any
-    // offset-from-FP refers to.
-
-    // FIXME: This may produce addresses that aren't valid if we end up with a ginormous stack frame.
-    // We would have to scavenge for temporaries if this happened. Fortunately, this case will be
-    // extremely rare so we can do crazy things when it arises.
-    // https://bugs.webkit.org/show_bug.cgi?id=152530
-
-    InsertionSet insertionSet(code);
-    for (BasicBlock* block : code) {
-        for (unsigned instIndex = 0; instIndex < block->size(); ++instIndex) {
-            Inst& inst = block->at(instIndex);
-            inst.forEachArg(
-                [&] (Arg& arg, Arg::Role role, Bank, Width width) {
-                    auto stackAddr = [&] (int32_t offset) -> Arg {
-                        return Arg::stackAddr(offset, code.frameSize(), width);
-                    };
-
-                    switch (arg.kind()) {
-                    case Arg::Stack: {
-                        StackSlot* slot = arg.stackSlot();
-                        if (Arg::isZDef(role)
-                            && slot->kind() == StackSlotKind::Spill
-                            && slot->byteSize() > bytes(width)) {
-                            // Currently we only handle this simple case because it's the only one
-                            // that arises: ZDef's are only 32-bit right now. So, when we hit these
-                            // assertions it means that we need to implement those other kinds of
-                            // zero fills.
-                            RELEASE_ASSERT(slot->byteSize() == 8);
-                            RELEASE_ASSERT(width == Width32);
-
-                            RELEASE_ASSERT(isValidForm(StoreZero32, Arg::Stack));
-                            insertionSet.insert(
-                                instIndex + 1, StoreZero32, inst.origin,
-                                stackAddr(arg.offset() + 4 + slot->offsetFromFP()));
-                        }
-                        arg = stackAddr(arg.offset() + slot->offsetFromFP());
-                        break;
-                    }
-                    case Arg::CallArg:
-                        arg = stackAddr(arg.offset() - code.frameSize());
-                        break;
-                    default:
-                        break;
-                    }
-                }
-            );
-        }
-        insertionSet.execute(block);
-    }
+    updateFrameSizeBasedOnStackSlots(code);
+    code.setStackIsAllocated(true);
 }

…
 #endif // ENABLE(B3_JIT)

-
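The frame-size computation in updateFrameSizeBasedOnStackSlotsImpl amounts to taking the deepest slot offset and rounding it up to the stack alignment. A minimal standalone sketch follows; `frameSizeForSlots` is a hypothetical name, slots are reduced to their FP-relative offsets, the local `roundUpToMultipleOf` only mimics the WTF helper of the same name, and the 16-byte default alignment is an assumption standing in for stackAlignmentBytes().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Local stand-in for WTF::roundUpToMultipleOf; divisor must be nonzero.
unsigned roundUpToMultipleOf(unsigned divisor, unsigned x)
{
    return (x + divisor - 1) / divisor * divisor;
}

// Hypothetical helper: offsets are negative distances from the frame pointer,
// so the frame must extend at least as far as the most negative offset.
unsigned frameSizeForSlots(const std::vector<std::ptrdiff_t>& offsetsFromFP, unsigned stackAlignment = 16)
{
    unsigned frameSize = 0;
    for (std::ptrdiff_t offset : offsetsFromFP)
        frameSize = std::max(frameSize, static_cast<unsigned>(-offset));
    return roundUpToMultipleOf(stackAlignment, frameSize);
}
```

For slots at offsets -8, -24, and -40, the deepest slot needs 40 bytes, which rounds up to 48 under the assumed 16-byte alignment. Unlike the removed code, this computation no longer adds the call argument area; per the ChangeLog that part moved to the lowerStackArgs phase.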
trunk/Source/JavaScriptCore/b3/air/AirAllocateStackByGraphColoring.h
r215073 r215292
 // This allocates StackSlots to places on the stack. It first allocates the pinned ones in index
 // order and then it allocates the rest using first fit. Takes the opportunity to kill dead
-// assignments to stack slots, since it knows which ones are live. Also fixes ZDefs to anonymous
-// stack slots. Also coalesces spill slots whenever possible.
+// assignments to stack slots, since it knows which ones are live. Also coalesces spill slots
+// whenever possible.
 //
 // This is meant to be an optimal stack allocator, focused on generating great code. It's not
…
 void allocateStackByGraphColoring(Code&);

+// These are utilities shared by this phase and allocateRegistersAndStackByLinearScan().
+void allocateEscapedStackSlots(Code&);
+void updateFrameSizeBasedOnStackSlots(Code&);
+
 } } } // namespace JSC::B3::Air

trunk/Source/JavaScriptCore/b3/air/AirArg.cpp
r214636 r215292
 namespace JSC { namespace B3 { namespace Air {

+Arg Arg::stackAddr(int32_t offsetFromFP, unsigned frameSize, Width width)
+{
+    Arg result = Arg::addr(Air::Tmp(GPRInfo::callFrameRegister), offsetFromFP);
+    if (!result.isValidForm(width)) {
+        result = Arg::addr(
+            Air::Tmp(MacroAssembler::stackPointerRegister),
+            offsetFromFP + frameSize);
+    }
+    return result;
+}
+
 bool Arg::isStackMemory() const
 {
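The fallback in Arg::stackAddr relies on the stack pointer sitting frameSize bytes below the frame pointer, so the same location can be addressed from either register. The sketch below models only that arithmetic. `Base`, `Addr`, and the `minOffset`/`maxOffset` range check are hypothetical; the range check stands in for Arg::isValidForm, whose real rules depend on the target and the access width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of an address: a base register plus a signed offset.
enum class Base { FramePointer, StackPointer };

struct Addr {
    Base base;
    int32_t offset;
};

// Prefers an FP-relative address. If the offset does not fit the (assumed)
// encodable range, rebases onto SP using SP = FP - frameSize, which gives
// FP + offsetFromFP == SP + (offsetFromFP + frameSize).
Addr stackAddr(int32_t offsetFromFP, unsigned frameSize, int32_t minOffset, int32_t maxOffset)
{
    Addr result { Base::FramePointer, offsetFromFP };
    bool isValid = offsetFromFP >= minOffset && offsetFromFP <= maxOffset;
    if (!isValid)
        result = Addr { Base::StackPointer, offsetFromFP + static_cast<int32_t>(frameSize) };
    return result;
}
```

As in the code above, the SP-relative form is not re-validated; the FIXME in the removed part of allocateStackByGraphColoring notes that very large frames may still produce invalid addresses.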
trunk/Source/JavaScriptCore/b3/air/AirArg.h
r214636 r215292
561 561          return result;
562 562      }
563
564          static Arg stackAddr(int32_t offsetFromFP, unsigned frameSize, Width width)
565          {
566              Arg result = Arg::addr(Air::Tmp(GPRInfo::callFrameRegister), offsetFromFP);
567              if (!result.isValidForm(width)) {
568                  result = Arg::addr(
569                      Air::Tmp(MacroAssembler::stackPointerRegister),
570                      offsetFromFP + frameSize);
571              }
572              return result;
573          }
    563
    564      static Arg stackAddr(int32_t offsetFromFP, unsigned frameSize, Width);
574 565
575 566      // If you don't pass a Width, this optimistically assumes that you're using the right width.
-
trunk/Source/JavaScriptCore/b3/air/AirCode.cpp
r214887 r215292
103 103  StackSlot* Code::addStackSlot(unsigned byteSize, StackSlotKind kind, B3::StackSlot* b3Slot)
104 104  {
105          return m_stackSlots.addNew(byteSize, kind, b3Slot);
    105      StackSlot* result = m_stackSlots.addNew(byteSize, kind, b3Slot);
    106      if (m_stackIsAllocated) {
    107          // FIXME: This is unnecessarily awful. Fortunately, it doesn't run often.
    108          unsigned extent = WTF::roundUpToMultipleOf(result->alignment(), frameSize() + byteSize);
    109          result->setOffsetFromFP(-static_cast<ptrdiff_t>(extent));
    110          setFrameSize(WTF::roundUpToMultipleOf(stackAlignmentBytes(), extent));
    111      }
    112      return result;
106 113  }
107 114
… …
137 144      }
138 145      return false;
    146  }
    147
    148  void Code::setCalleeSaveRegisterAtOffsetList(RegisterAtOffsetList&& registerAtOffsetList, StackSlot* slot)
    149  {
    150      m_uncorrectedCalleeSaveRegisterAtOffsetList = WTFMove(registerAtOffsetList);
    151      for (const RegisterAtOffset& registerAtOffset : m_uncorrectedCalleeSaveRegisterAtOffsetList)
    152          m_calleeSaveRegisters.set(registerAtOffset.reg());
    153      m_calleeSaveStackSlot = slot;
    154  }
    155
    156  RegisterAtOffsetList Code::calleeSaveRegisterAtOffsetList() const
    157  {
    158      RegisterAtOffsetList result = m_uncorrectedCalleeSaveRegisterAtOffsetList;
    159      if (StackSlot* slot = m_calleeSaveStackSlot) {
    160          ptrdiff_t offset = slot->byteSize() + slot->offsetFromFP();
    161          for (size_t i = result.size(); i--;) {
    162              result.at(i) = RegisterAtOffset(
    163                  result.at(i).reg(),
    164                  result.at(i).offset() + offset);
    165          }
    166      }
    167      return result;
139 168  }
140 169
… …
171 200              out.print(" ", deepDump(special), "\n");
172 201      }
173          if (m_frameSize)
174              out.print("Frame size: ", m_frameSize, "\n");
    202      if (m_frameSize || m_stackIsAllocated)
    203          out.print("Frame size: ", m_frameSize, m_stackIsAllocated ? " (Allocated)" : "", "\n");
175 204      if (m_callArgAreaSize)
176 205          out.print("Call arg area size: ", m_callArgAreaSize, "\n");
177          if (m_calleeSaveRegisters.size())
178              out.print("Callee saves: ", m_calleeSaveRegisters, "\n");
    206      RegisterAtOffsetList calleeSaveRegisters = this->calleeSaveRegisterAtOffsetList();
    207      if (calleeSaveRegisters.size())
    208          out.print("Callee saves: ", calleeSaveRegisters, "\n");
179 209  }
180 210
-
trunk/Source/JavaScriptCore/b3/air/AirCode.h
r215071 r215292
187 187          m_entrypointLabels = std::forward<Vector>(vector);
188 188      }
189
190          const RegisterAtOffsetList& calleeSaveRegisters() const { return m_calleeSaveRegisters; }
191          RegisterAtOffsetList& calleeSaveRegisters() { return m_calleeSaveRegisters; }
    189
    190      void setStackIsAllocated(bool value)
    191      {
    192          m_stackIsAllocated = value;
    193      }
    194
    195      bool stackIsAllocated() const { return m_stackIsAllocated; }
    196
    197      // This sets the callee save registers.
    198      void setCalleeSaveRegisterAtOffsetList(RegisterAtOffsetList&&, StackSlot*);
    199
    200      // This returns the correctly offset list of callee save registers.
    201      RegisterAtOffsetList calleeSaveRegisterAtOffsetList() const;
    202
    203      // This just tells you what the callee saves are.
    204      RegisterSet calleeSaveRegisters() const { return m_calleeSaveRegisters; }
192 205
193 206      // Recomputes predecessors and deletes unreachable blocks.
… …
330 343      unsigned m_frameSize { 0 };
331 344      unsigned m_callArgAreaSize { 0 };
332          RegisterAtOffsetList m_calleeSaveRegisters;
    345      bool m_stackIsAllocated { false };
    346      RegisterAtOffsetList m_uncorrectedCalleeSaveRegisterAtOffsetList;
    347      RegisterSet m_calleeSaveRegisters;
    348      StackSlot* m_calleeSaveStackSlot { nullptr };
333 349      Vector<FrequentedBlock> m_entrypoints; // This is empty until after lowerEntrySwitch().
334 350      Vector<CCallHelpers::Label> m_entrypointLabels; // This is empty until code generation.
-
trunk/Source/JavaScriptCore/b3/air/AirGenerate.cpp
r215073 r215292
 29  29  #if ENABLE(B3_JIT)
 30  30
     31  #include "AirAllocateRegistersAndStackByLinearScan.h"
 31  32  #include "AirAllocateRegistersByGraphColoring.h"
 32      #include "AirAllocateRegistersByLinearScan.h"
 33  33  #include "AirAllocateStackByGraphColoring.h"
 34  34  #include "AirCode.h"
… …
 37  37  #include "AirFixPartialRegisterStalls.h"
 38  38  #include "AirGenerationContext.h"
 39      #include "AirHandleCalleeSaves.h"
 40  39  #include "AirLogRegisterPressure.h"
 41  40  #include "AirLowerAfterRegAlloc.h"
 42  41  #include "AirLowerEntrySwitch.h"
 43  42  #include "AirLowerMacros.h"
     43  #include "AirLowerStackArgs.h"
 44  44  #include "AirOpcodeUtils.h"
 45  45  #include "AirOptimizeBlockOrder.h"
 46  46  #include "AirReportUsedRegisters.h"
 47  47  #include "AirSimplifyCFG.h"
 48      #include "AirSpillEverything.h"
 49  48  #include "AirValidate.h"
 50  49  #include "B3Common.h"
… …
 85  84      eliminateDeadCode(code);
 86  85
 87          // Register allocation for all the Tmps that do not have a corresponding machine register.
 88          // After this phase, every Tmp has a reg.
 89          //
 90          // For debugging, you can use spillEverything() to put everything to the stack between each Inst.
 91          if (Options::airSpillsEverything())
 92              spillEverything(code);
 93          else if (code.optLevel() >= 2)
     86      if (code.optLevel() <= 1) {
     87          // When we're compiling quickly, we do register and stack allocation in one linear scan
     88          // phase. It's fast because it computes liveness only once.
     89          allocateRegistersAndStackByLinearScan(code);
     90
     91          if (Options::logAirRegisterPressure()) {
     92              dataLog("Register pressure after register allocation:\n");
     93              logRegisterPressure(code);
     94          }
     95
     96          // We may still need to do post-allocation lowering. Doing it after both register and
     97          // stack allocation is less optimal, but it works fine.
     98          lowerAfterRegAlloc(code);
     99      } else {
    100          // NOTE: B3 -O2 generates code that runs 1.5x-2x faster than code generated by -O1.
    101          // Most of this performance benefit is from -O2's graph coloring register allocation
    102          // and stack allocation pipeline, which you see below.
    103
    104          // Register allocation for all the Tmps that do not have a corresponding machine
    105          // register. After this phase, every Tmp has a reg.
 94 106          allocateRegistersByGraphColoring(code);
 95          else
 96              allocateRegistersByLinearScan(code);
 97
 98          if (Options::logAirRegisterPressure()) {
 99              dataLog("Register pressure after register allocation:\n");
100              logRegisterPressure(code);
101          }
102
103          if (code.optLevel() >= 2) {
104              // This replaces uses of spill slots with registers or constants if possible. It does this by
105              // minimizing the amount that we perturb the already-chosen register allocation. It may extend
106              // the live ranges of registers though.
    107
    108          if (Options::logAirRegisterPressure()) {
    109              dataLog("Register pressure after register allocation:\n");
    110              logRegisterPressure(code);
    111          }
    112
    113          // This replaces uses of spill slots with registers or constants if possible. It
    114          // does this by minimizing the amount that we perturb the already-chosen register
    115          // allocation. It may extend the live ranges of registers though.
107 116          fixObviousSpills(code);
108          }
109
110          lowerAfterRegAlloc(code);
111
112          // Prior to this point the prologue and epilogue is implicit. This makes it explicit. It also
113          // does things like identify which callee-saves we're using and saves them.
114          handleCalleeSaves(code);
115
116          // This turns all Stack and CallArg Args into Addr args that use the frame pointer. It does
117          // this by first-fit allocating stack slots. It should be pretty darn close to optimal, so we
118          // shouldn't have to worry about this very much.
119          allocateStackByGraphColoring(code);
    117
    118          lowerAfterRegAlloc(code);
    119
    120          // This does first-fit allocation of stack slots using an interference graph plus a
    121          // bunch of other optimizations.
    122          allocateStackByGraphColoring(code);
    123      }
    124
    125      // This turns all Stack and CallArg Args into Addr args that use the frame pointer.
    126      lowerStackArgs(code);
120 127
121 128      // If we coalesced moves then we can unbreak critical edges. This is the main reason for this
… …
213 220                  jit.addPtr(CCallHelpers::TrustedImm32(-code.frameSize()), MacroAssembler::stackPointerRegister);
214 221
215                  for (const RegisterAtOffset& entry : code.calleeSaveRegisters()) {
    222              for (const RegisterAtOffset& entry : code.calleeSaveRegisterAtOffsetList()) {
216 223                  if (entry.reg().isGPR())
217 224                      jit.storePtr(entry.reg().gpr(), argFor(entry));
… …
250 257              auto start = jit.labelIgnoringWatchpoints();
251 258              if (code.frameSize()) {
252                      for (const RegisterAtOffset& entry : code.calleeSaveRegisters()) {
    259                  for (const RegisterAtOffset& entry : code.calleeSaveRegisterAtOffsetList()) {
253 260                      if (entry.reg().isGPR())
254 261                          jit.loadPtr(argFor(entry), entry.reg().gpr());
-
trunk/Source/JavaScriptCore/b3/air/AirHandleCalleeSaves.cpp
r207004 r215292
 1  1  /*
 2      * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
    2   * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
 3  3   *
 4  4   * Redistribution and use in source and binary forms, with or without
… …
31 31  #include "AirCode.h"
32 32  #include "AirInstInlines.h"
33     #include "AirPhaseScope.h"
34 33
35 34  namespace JSC { namespace B3 { namespace Air {
… …
37 36  void handleCalleeSaves(Code& code)
38 37  {
39         PhaseScope phaseScope(code, "handleCalleeSaves");
40
41 38      RegisterSet usedCalleeSaves;
42 39
… …
62 59          return;
63 60
64         code.calleeSaveRegisters() = RegisterAtOffsetList(usedCalleeSaves);
   61      RegisterAtOffsetList calleeSaveRegisters = RegisterAtOffsetList(usedCalleeSaves);
65 62
66 63      size_t byteSize = 0;
67         for (const RegisterAtOffset& entry : code.calleeSaveRegisters())
   64      for (const RegisterAtOffset& entry : calleeSaveRegisters)
68 65          byteSize = std::max(static_cast<size_t>(-entry.offset()), byteSize);
69 66
70         StackSlot* savesArea = code.addStackSlot(byteSize, StackSlotKind::Locked);
71         // This is a bit weird since we could have already pinned a different stack slot to this
72         // area. Also, our runtime does not require us to pin the saves area. Maybe we shouldn't pin it?
73         savesArea->setOffsetFromFP(-byteSize);
   67      code.setCalleeSaveRegisterAtOffsetList(
   68          WTFMove(calleeSaveRegisters),
   69          code.addStackSlot(byteSize, StackSlotKind::Locked));
74 70  }
75 71
-
trunk/Source/JavaScriptCore/b3/air/AirHandleCalleeSaves.h
r206525 r215292
 1  1  /*
 2      * Copyright (C) 2015 Apple Inc. All rights reserved.
    2   * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
 3  3   *
 4  4   * Redistribution and use in source and binary forms, with or without
… …
32 32  class Code;
33 33
34     // This phase identifies callee-save registers and adds code to save/restore them in the
35     // prologue/epilogue to the code. It's a mandatory phase.
   34  // This utility identifies callee-save registers and tells Code. It's called from phases that
   35  // do stack allocation. We don't do it at the end of register allocation because the real end
   36  // of register allocation is just before stack allocation.
36 37
37 38  // FIXME: It would be cool to make this more interactive with the Air client and also more
-
trunk/Source/JavaScriptCore/b3/air/AirLowerAfterRegAlloc.cpp
r214901 r215292
 76  76
 77  77      HashMap<Inst*, RegisterSet> usedRegisters;
 78
     78
 79  79      RegLiveness liveness(code);
 80  80      for (BasicBlock* block : code) {
… …
 97  97          }
 98  98      }
 99
     99
    100      std::array<std::array<StackSlot*, 2>, numBanks> slots;
    101      forEachBank(
    102          [&] (Bank bank) {
    103              for (unsigned i = 0; i < 2; ++i)
    104                  slots[bank][i] = nullptr;
    105          });
    106
    107      // If we run after stack allocation then we cannot use those callee saves that aren't in
    108      // the callee save list. Note that we are only run after stack allocation in -O1, so this
    109      // kind of slop is OK.
    110      RegisterSet blacklistedCalleeSaves;
    111      if (code.stackIsAllocated()) {
    112          blacklistedCalleeSaves = RegisterSet::calleeSaveRegisters();
    113          blacklistedCalleeSaves.exclude(code.calleeSaveRegisters());
    114      }
    115
100 116      auto getScratches = [&] (RegisterSet set, Bank bank) -> std::array<Arg, 2> {
101 117          std::array<Arg, 2> result;
… …
103 119              bool found = false;
104 120              for (Reg reg : code.regsInPriorityOrder(bank)) {
105                      if (!set.get(reg)) {
    121                  if (!set.get(reg) && !blacklistedCalleeSaves.get(reg)) {
106 122                      result[i] = Tmp(reg);
107 123                      set.set(reg);
… …
111 127              }
112 128              if (!found) {
113                      result[i] = Arg::stack(
114                          code.addStackSlot(
115                              bytes(conservativeWidth(bank)),
116                              StackSlotKind::Spill));
    129                  StackSlot*& slot = slots[bank][i];
    130                  if (!slot)
    131                      slot = code.addStackSlot(bytes(conservativeWidth(bank)), StackSlotKind::Spill);
    132                  result[i] = Arg::stack(slots[bank][i]);
117 133              }
118 134          }
-
trunk/Source/JavaScriptCore/b3/testb3.cpp
r215265 r215292
14103 14103              }
14104 14104          }
14105                for (const RegisterAtOffset& regAtOffset : proc.calleeSaveRegisters())
      14105          for (const RegisterAtOffset& regAtOffset : proc.calleeSaveRegisterAtOffsetList())
14106 14106              usesCSRs |= csrs.get(regAtOffset.reg());
14107 14107          CHECK_EQ(usesCSRs, !pin);
-
trunk/Source/JavaScriptCore/ftl/FTLCompile.cpp
r214571 r215292
78 78
79 79      std::unique_ptr<RegisterAtOffsetList> registerOffsets =
80             std::make_unique<RegisterAtOffsetList>(state.proc->calleeSaveRegisters());
   80          std::make_unique<RegisterAtOffsetList>(state.proc->calleeSaveRegisterAtOffsetList());
81 81      if (shouldDumpDisassembly())
82 82          dataLog("Unwind info for ", CodeBlockWithJITType(state.graph.m_codeBlock, JITCode::FTLJIT), ": ", *registerOffsets, "\n");
-
trunk/Source/JavaScriptCore/jit/RegisterAtOffsetList.h
r206525 r215292
39 39
40 40      RegisterAtOffsetList();
41         RegisterAtOffsetList(RegisterSet, OffsetBaseType = FramePointerBased);
   41      explicit RegisterAtOffsetList(RegisterSet, OffsetBaseType = FramePointerBased);
42 42
43 43      void dump(PrintStream&) const;
-
trunk/Source/JavaScriptCore/runtime/Options.h
r214709 r215292
399 399      v(bool, logB3PhaseTimes, false, Normal, nullptr) \
400 400      v(double, rareBlockPenalty, 0.001, Normal, nullptr) \
401          v(bool, airSpillsEverything, false, Normal, nullptr) \
402 401      v(bool, airLinearScanVerbose, false, Normal, nullptr) \
403 402      v(bool, airLinearScanSpillsEverything, false, Normal, nullptr) \
-
trunk/Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp
r215193 r215292
1329 1329          B3::generate(procedure, *compilationContext.wasmEntrypointJIT);
1330 1330          compilationContext.wasmEntrypointByproducts = procedure.releaseByproducts();
1331               result->wasmEntrypoint.calleeSaveRegisters = procedure.calleeSaveRegisters();
     1331          result->wasmEntrypoint.calleeSaveRegisters = procedure.calleeSaveRegisterAtOffsetList();
1332 1332      }
1333 1333
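
Reader's sketch (not part of the changeset): the late-allocation path that this patch adds to Code::addStackSlot() in AirCode.cpp is plain rounding arithmetic, and it can be followed in isolation. The block below restates only that arithmetic. SketchFrame, SketchSlot, appendSlotAfterAllocation and the local roundUpToMultipleOf are hypothetical stand-ins invented for illustration, and the 16-byte stack alignment used in the example values is an assumption rather than something the diff states.

```cpp
#include <cstddef>

// Hypothetical stand-ins for illustration only; these are not the Air classes.
struct SketchSlot {
    unsigned byteSize;            // size of the slot in bytes
    unsigned alignment;           // required alignment of the slot (> 0)
    std::ptrdiff_t offsetFromFP;  // filled in by appendSlotAfterAllocation()
};

struct SketchFrame {
    unsigned frameSize;       // bytes already reserved below the frame pointer
    unsigned stackAlignment;  // assumption: 16 in the example values below
};

// Rounds value up to the next multiple of divisor (divisor must be > 0).
inline unsigned roundUpToMultipleOf(unsigned divisor, unsigned value)
{
    return (value + divisor - 1) / divisor * divisor;
}

// Same three steps as the patched Code::addStackSlot() when the stack is
// already allocated: put the new slot below everything reserved so far,
// then grow the frame so that it covers the slot.
inline void appendSlotAfterAllocation(SketchFrame& frame, SketchSlot& slot)
{
    unsigned extent = roundUpToMultipleOf(slot.alignment, frame.frameSize + slot.byteSize);
    slot.offsetFromFP = -static_cast<std::ptrdiff_t>(extent);
    frame.frameSize = roundUpToMultipleOf(frame.stackAlignment, extent);
}
```

With a 32-byte frame and two 8-byte, 8-aligned slots, the first slot lands at FP-40 and the frame grows to 48 bytes; the second lands at FP-56 and the frame grows to 64 bytes. The 8 bytes of padding left after the first slot are not reused, which may be part of what the FIXME in the patch has in mind.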