Changeset 215292 in webkit for trunk/Source/JavaScriptCore


Timestamp:
Apr 12, 2017, 2:22:14 PM
Author:
[email protected]
Message:

B3 -O1 should not allocateStackByGraphColoring
https://bugs.webkit.org/show_bug.cgi?id=170742

Reviewed by Keith Miller.

One of B3 -O1's longest-running phases is allocateStackByGraphColoring. One way to fix
this would be to make that phase cheaper. But it's weird that this phase reruns
liveness after register allocation already ran liveness. If it could reuse the
liveness computed by register allocation, it would run a lot faster. At -O2, we do
not want this reuse, since we run phases between register allocation and stack allocation,
and those phases are free to change the liveness of spill slots (in fact,
fixObviousSpills will both shorten and lengthen live ranges because of load and store
elimination, respectively). But at -O1, we don't really need to run any phases between
register and stack allocation.

This changes Air's backend in the following ways:

  • Linear scan does stack allocation. This means that we don't need to run allocateStackByGraphColoring at all. In reality, we reuse some of its innards, but we don't run the expensive part of it (liveness->interference->coalescing->coloring). This is a speed-up because we only run liveness once and reuse it for both register and stack allocation.


  • Phases that previously ran between register and stack allocation are taken care of, each in its own special way:

      -> handleCalleeSaves: this is now a utility function called by both allocateStackByGraphColoring and allocateRegistersAndStackByLinearScan.

      -> fixObviousSpills: we didn't run this at -O1, so nothing needs to be done.

      -> lowerAfterRegAlloc: this needed to be able to run before stack allocation because it could change register usage (vis-à-vis callee saves) and it could introduce spill slots. I changed this phase to have a secondary mode for when it runs after stack allocation.


  • The part of allocateStackByGraphColoring that lowered stack addresses and took care of the call arg area is now a separate phase called lowerStackArgs. We run this phase regardless of optimization level. It's a cheap and general lowering.
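
The phase ordering described in the list above can be summarized with a small model. This is only a sketch: the phase names are taken from the message, but the `OptLevel` enum and the `backendPhases` function are hypothetical and are not part of Air, and the relative order of fixObviousSpills and lowerAfterRegAlloc at -O2 is an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the backend phase order; not the actual Air code.
enum class OptLevel { O1, O2 };

std::vector<std::string> backendPhases(OptLevel level)
{
    std::vector<std::string> phases;
    if (level == OptLevel::O1) {
        // One liveness computation feeds both register and stack allocation.
        // handleCalleeSaves is called from inside this phase as a utility.
        phases.push_back("allocateRegistersAndStackByLinearScan");
        // Runs in its secondary mode because the stack is already allocated.
        phases.push_back("lowerAfterRegAlloc");
    } else {
        phases.push_back("allocateRegistersByGraphColoring");
        // These two may change spill-slot liveness (assumed order).
        phases.push_back("fixObviousSpills");
        phases.push_back("lowerAfterRegAlloc");
        // Reruns liveness; also calls handleCalleeSaves as a utility.
        phases.push_back("allocateStackByGraphColoring");
    }
    // Cheap, general lowering that runs at every optimization level.
    phases.push_back("lowerStackArgs");
    return phases;
}
```

In this model `backendPhases(OptLevel::O1)` yields three phases and `backendPhases(OptLevel::O2)` yields five; both end with lowerStackArgs.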


This also removes spillEverything, because we never use that phase, we never test it,
and it got in the way in this refactoring.

This is a 21% speed-up on wasm -O1 compile times. This does not significantly change
-O1 throughput. We had already disabled allocateStack's most important optimization
(spill coalescing). This probably regresses average stack frame size, but I didn't
measure by how much. Stack frame size is really not that important. The algorithm in
allocateStackByGraphColoring is about much more than optimal frame size; it also
tries to avoid having to zero-extend 32-bit spills, it kills dead code, and of course
it coalesces.
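
To illustrate the idea of assigning spill slots by linear scan instead of graph coloring, here is a minimal, self-contained sketch. It is not the committed code: `SpillInterval` and `assignSpillSlots` are hypothetical names, intervals are treated as half-open ranges, and the 8-byte slot size and the offset formula mirror the `scanForStack` hunk in this changeset. Unlike that hunk, this sketch checks every active interval for expiry rather than only the front of the active list.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the live interval [begin, end) of a spilled tmp.
struct SpillInterval {
    std::size_t begin;
    std::size_t end;
    std::size_t slotIndex;
    std::ptrdiff_t offsetFromFP;
};

// Gives each interval the lowest-numbered slot not held by a live interval.
// frameSize is the number of bytes already reserved below the frame pointer.
void assignSpillSlots(std::vector<SpillInterval>& intervals, std::size_t frameSize)
{
    std::sort(intervals.begin(), intervals.end(),
        [] (const SpillInterval& a, const SpillInterval& b) { return a.begin < b.begin; });

    std::vector<bool> slotInUse;
    std::vector<std::size_t> active; // Indices into intervals.

    for (std::size_t i = 0; i < intervals.size(); ++i) {
        SpillInterval& current = intervals[i];

        // Expire every active interval that ended at or before current.begin,
        // releasing its slot for reuse.
        active.erase(std::remove_if(active.begin(), active.end(),
            [&] (std::size_t j) {
                if (intervals[j].end > current.begin)
                    return false;
                slotInUse[intervals[j].slotIndex] = false;
                return true;
            }), active.end());

        // Pick the first free slot, growing the table if all slots are busy.
        std::size_t slot = static_cast<std::size_t>(
            std::find(slotInUse.begin(), slotInUse.end(), false) - slotInUse.begin());
        if (slot == slotInUse.size())
            slotInUse.push_back(false);
        slotInUse[slot] = true;

        current.slotIndex = slot;
        current.offsetFromFP = -static_cast<std::ptrdiff_t>(frameSize)
            - static_cast<std::ptrdiff_t>(slot) * 8 - 8;
        active.push_back(i);
    }
}
```

Because the intervals are visited in order of their start points, two spills share a slot only when the earlier one has ended before the later one begins. No interference graph or coalescing is involved, which is the trade-off the message describes: cheaper allocation in exchange for a possibly larger frame.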

  • CMakeLists.txt:
  • JavaScriptCore.xcodeproj/project.pbxproj:
  • b3/B3Procedure.cpp:

(JSC::B3::Procedure::calleeSaveRegisterAtOffsetList):
(JSC::B3::Procedure::calleeSaveRegisters): Deleted.

  • b3/B3Procedure.h:
  • b3/B3StackmapGenerationParams.cpp:

(JSC::B3::StackmapGenerationParams::unavailableRegisters):

  • b3/air/AirAllocateRegistersAndStackByLinearScan.cpp: Copied from Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp.

(JSC::B3::Air::allocateRegistersAndStackByLinearScan):
(JSC::B3::Air::allocateRegistersByLinearScan): Deleted.

  • b3/air/AirAllocateRegistersAndStackByLinearScan.h: Copied from Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.h.
  • b3/air/AirAllocateRegistersByLinearScan.cpp: Removed.
  • b3/air/AirAllocateRegistersByLinearScan.h: Removed.
  • b3/air/AirAllocateStackByGraphColoring.cpp:

(JSC::B3::Air::allocateEscapedStackSlots):
(JSC::B3::Air::updateFrameSizeBasedOnStackSlots):
(JSC::B3::Air::allocateStackByGraphColoring):

  • b3/air/AirAllocateStackByGraphColoring.h:
  • b3/air/AirArg.cpp:

(JSC::B3::Air::Arg::stackAddr):

  • b3/air/AirArg.h:

(JSC::B3::Air::Arg::stackAddr): Deleted.

  • b3/air/AirCode.cpp:

(JSC::B3::Air::Code::addStackSlot):
(JSC::B3::Air::Code::setCalleeSaveRegisterAtOffsetList):
(JSC::B3::Air::Code::calleeSaveRegisterAtOffsetList):
(JSC::B3::Air::Code::dump):

  • b3/air/AirCode.h:

(JSC::B3::Air::Code::setStackIsAllocated):
(JSC::B3::Air::Code::stackIsAllocated):
(JSC::B3::Air::Code::calleeSaveRegisters):

  • b3/air/AirGenerate.cpp:

(JSC::B3::Air::prepareForGeneration):
(JSC::B3::Air::generate):

  • b3/air/AirHandleCalleeSaves.cpp:

(JSC::B3::Air::handleCalleeSaves):

  • b3/air/AirHandleCalleeSaves.h:
  • b3/air/AirLowerAfterRegAlloc.cpp:

(JSC::B3::Air::lowerAfterRegAlloc):

  • b3/air/AirLowerStackArgs.cpp: Added.

(JSC::B3::Air::lowerStackArgs):

  • b3/air/AirLowerStackArgs.h: Added.
  • b3/testb3.cpp:

(JSC::B3::testPinRegisters):

  • ftl/FTLCompile.cpp:

(JSC::FTL::compile):

  • jit/RegisterAtOffsetList.h:
  • wasm/WasmB3IRGenerator.cpp:

(JSC::Wasm::parseAndCompile):

Location:
trunk/Source/JavaScriptCore
Files:
2 added
2 deleted
21 edited
2 moved

  • trunk/Source/JavaScriptCore/CMakeLists.txt

    r215103 r215292  
     assembler/MacroAssemblerX86Common.cpp
 
+    b3/air/AirAllocateRegistersAndStackByLinearScan.cpp
     b3/air/AirAllocateRegistersByGraphColoring.cpp
-    b3/air/AirAllocateRegistersByLinearScan.cpp
     b3/air/AirAllocateStackByGraphColoring.cpp
     b3/air/AirArg.cpp
...
     b3/air/AirLowerEntrySwitch.cpp
     b3/air/AirLowerMacros.cpp
+    b3/air/AirLowerStackArgs.cpp
     b3/air/AirOptimizeBlockOrder.cpp
     b3/air/AirPadInterference.cpp
...
     b3/air/AirSimplifyCFG.cpp
     b3/air/AirSpecial.cpp
-    b3/air/AirSpillEverything.cpp
     b3/air/AirStackSlot.cpp
     b3/air/AirStackSlotKind.cpp
  • trunk/Source/JavaScriptCore/ChangeLog

    r215272 r215292  
+2017-04-12  Filip Pizlo  <[email protected]>
+
+        B3 -O1 should not allocateStackByGraphColoring
+        https://bugs.webkit.org/show_bug.cgi?id=170742
+
+        Reviewed by Keith Miller.
+
+        One of B3 -O1's longest running phases is allocateStackByGraphColoring. One approach to
+        this would be to make that phase cheaper. But it's weird that this phase reruns
+        liveness after register allocation already ran liveness. If only it could reuse the
+        liveness computed by register allocation then it would run a lot faster. At -O2, we do
+        not want this, since we run phases between register allocation and stack allocation,
+        and those phases are free to change the liveness of spill slots (in fact,
+        fixObviousSpills will both shorten and lengthen live ranges because of load and store
+        elimination, respectively). But at -O1, we don't really need to run any phases between
+        register and stack allocation.
+
+        This changes Air's backend in the following ways:
+
+        - Linear scan does stack allocation. This means that we don't need to run
+          allocateStackByGraphColoring at all. In reality, we reuse some of its innards, but
+          we don't run the expensive part of it (liveness->interference->coalescing->coloring).
+          This is a speed-up because we only run liveness once and reuse it for both register
+          and stack allocation.
+
+        - Phases that previously ran between register and stack allocation are taken care of,
+          each in its own special way:
+
+          -> handleCalleSaves: this is now a utility function called by both
+             allocateStackByGraphColoring and allocateRegistersAndStackByLinearScan.
+
+          -> fixObviousSpills: we didn't run this at -O1, so nothing needs to be done.
+
+          -> lowerAfterRegAlloc: this needed to be able to run before stack allocation because
+             it could change register usage (vis a vis callee saves) and it could introduce
+             spill slots. I changed this phase to have a secondary mode for when it runs after
+             stack allocation.
+
+        - The part of allocateStackByGraphColoring that lowered stack addresses and took care
+          of the call arg area is now a separate phase called lowerStackArgs. We run this phase
+          regardless of optimization level. It's a cheap and general lowering.
+
+        This also removes spillEverything, because we never use that phase, we never test it,
+        and it got in the way in this refactoring.
+
+        This is a 21% speed-up on wasm -O1 compile times. This does not significantly change
+        -O1 throughput. We had already disabled allocateStack's most important optimization
+        (spill coalescing). This probably regresses average stack frame size, but I didn't
+        measure by how much. Stack frame size is really not that important. The algorithm in
+        allocateStackByGraphColoring is about much more than optimal frame size; it also
+        tries to avoid having to zero-extend 32-bit spills, it kills dead code, and of course
+        it coalesces.
+
+        * CMakeLists.txt:
+        * JavaScriptCore.xcodeproj/project.pbxproj:
+        * b3/B3Procedure.cpp:
+        (JSC::B3::Procedure::calleeSaveRegisterAtOffsetList):
+        (JSC::B3::Procedure::calleeSaveRegisters): Deleted.
+        * b3/B3Procedure.h:
+        * b3/B3StackmapGenerationParams.cpp:
+        (JSC::B3::StackmapGenerationParams::unavailableRegisters):
+        * b3/air/AirAllocateRegistersAndStackByLinearScan.cpp: Copied from Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.cpp.
+        (JSC::B3::Air::allocateRegistersAndStackByLinearScan):
+        (JSC::B3::Air::allocateRegistersByLinearScan): Deleted.
+        * b3/air/AirAllocateRegistersAndStackByLinearScan.h: Copied from Source/JavaScriptCore/b3/air/AirAllocateRegistersByLinearScan.h.
+        * b3/air/AirAllocateRegistersByLinearScan.cpp: Removed.
+        * b3/air/AirAllocateRegistersByLinearScan.h: Removed.
+        * b3/air/AirAllocateStackByGraphColoring.cpp:
+        (JSC::B3::Air::allocateEscapedStackSlots):
+        (JSC::B3::Air::updateFrameSizeBasedOnStackSlots):
+        (JSC::B3::Air::allocateStackByGraphColoring):
+        * b3/air/AirAllocateStackByGraphColoring.h:
+        * b3/air/AirArg.cpp:
+        (JSC::B3::Air::Arg::stackAddr):
+        * b3/air/AirArg.h:
+        (JSC::B3::Air::Arg::stackAddr): Deleted.
+        * b3/air/AirCode.cpp:
+        (JSC::B3::Air::Code::addStackSlot):
+        (JSC::B3::Air::Code::setCalleeSaveRegisterAtOffsetList):
+        (JSC::B3::Air::Code::calleeSaveRegisterAtOffsetList):
+        (JSC::B3::Air::Code::dump):
+        * b3/air/AirCode.h:
+        (JSC::B3::Air::Code::setStackIsAllocated):
+        (JSC::B3::Air::Code::stackIsAllocated):
+        (JSC::B3::Air::Code::calleeSaveRegisters):
+        * b3/air/AirGenerate.cpp:
+        (JSC::B3::Air::prepareForGeneration):
+        (JSC::B3::Air::generate):
+        * b3/air/AirHandleCalleeSaves.cpp:
+        (JSC::B3::Air::handleCalleeSaves):
+        * b3/air/AirHandleCalleeSaves.h:
+        * b3/air/AirLowerAfterRegAlloc.cpp:
+        (JSC::B3::Air::lowerAfterRegAlloc):
+        * b3/air/AirLowerStackArgs.cpp: Added.
+        (JSC::B3::Air::lowerStackArgs):
+        * b3/air/AirLowerStackArgs.h: Added.
+        * b3/testb3.cpp:
+        (JSC::B3::testPinRegisters):
+        * ftl/FTLCompile.cpp:
+        (JSC::FTL::compile):
+        * jit/RegisterAtOffsetList.h:
+        * wasm/WasmB3IRGenerator.cpp:
+        (JSC::Wasm::parseAndCompile):
+
 2017-04-12  Michael Saboff  <[email protected]>
 
  • trunk/Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj

    r215265 r215292  
                 0F2AC5661E8A0B770001EE3F /* AirFixSpillsAfterTerminals.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F2AC5641E8A0B760001EE3F /* AirFixSpillsAfterTerminals.cpp */; };
                 0F2AC5671E8A0B790001EE3F /* AirFixSpillsAfterTerminals.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F2AC5651E8A0B760001EE3F /* AirFixSpillsAfterTerminals.h */; };
-                0F2AC56A1E8A0BD30001EE3F /* AirAllocateRegistersByLinearScan.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F2AC5681E8A0BD10001EE3F /* AirAllocateRegistersByLinearScan.cpp */; };
-                0F2AC56B1E8A0BD50001EE3F /* AirAllocateRegistersByLinearScan.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F2AC5691E8A0BD10001EE3F /* AirAllocateRegistersByLinearScan.h */; };
+                0F2AC56A1E8A0BD30001EE3F /* AirAllocateRegistersAndStackByLinearScan.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F2AC5681E8A0BD10001EE3F /* AirAllocateRegistersAndStackByLinearScan.cpp */; };
+                0F2AC56B1E8A0BD50001EE3F /* AirAllocateRegistersAndStackByLinearScan.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F2AC5691E8A0BD10001EE3F /* AirAllocateRegistersAndStackByLinearScan.h */; };
                 0F2AC56E1E8D7B000001EE3F /* AirPhaseInsertionSet.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F2AC56C1E8D7AFF0001EE3F /* AirPhaseInsertionSet.cpp */; };
                 0F2AC56F1E8D7B030001EE3F /* AirPhaseInsertionSet.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F2AC56D1E8D7AFF0001EE3F /* AirPhaseInsertionSet.h */; };
...
                 0F5B4A331C84F0D600F1B17E /* SlowPathReturnType.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F5B4A321C84F0D600F1B17E /* SlowPathReturnType.h */; settings = {ATTRIBUTES = (Private, ); }; };
                 0F5CF9811E96F17F00C18692 /* AirTmpMap.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F5CF9801E96F17D00C18692 /* AirTmpMap.h */; };
+                0F5CF9841E9D537700C18692 /* AirLowerStackArgs.h in Headers */ = {isa = PBXBuildFile; fileRef = 0F5CF9831E9D537500C18692 /* AirLowerStackArgs.h */; };
+                0F5CF9851E9D537A00C18692 /* AirLowerStackArgs.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F5CF9821E9D537500C18692 /* AirLowerStackArgs.cpp */; };
                 0F5D085D1B8CF99D001143B4 /* DFGNodeOrigin.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F5D085C1B8CF99D001143B4 /* DFGNodeOrigin.cpp */; };
                 0F5EF91E16878F7A003E5C25 /* JITThunks.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0F5EF91B16878F78003E5C25 /* JITThunks.cpp */; };
...
                 0FEC85871BDACDC70080FF74 /* AirSpecial.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0FEC85621BDACDC70080FF74 /* AirSpecial.cpp */; };
                 0FEC85881BDACDC70080FF74 /* AirSpecial.h in Headers */ = {isa = PBXBuildFile; fileRef = 0FEC85631BDACDC70080FF74 /* AirSpecial.h */; };
-                0FEC85891BDACDC70080FF74 /* AirSpillEverything.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0FEC85641BDACDC70080FF74 /* AirSpillEverything.cpp */; };
-                0FEC858A1BDACDC70080FF74 /* AirSpillEverything.h in Headers */ = {isa = PBXBuildFile; fileRef = 0FEC85651BDACDC70080FF74 /* AirSpillEverything.h */; };
                 0FEC858B1BDACDC70080FF74 /* AirStackSlot.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 0FEC85661BDACDC70080FF74 /* AirStackSlot.cpp */; };
                 0FEC858C1BDACDC70080FF74 /* AirStackSlot.h in Headers */ = {isa = PBXBuildFile; fileRef = 0FEC85671BDACDC70080FF74 /* AirStackSlot.h */; };
...
                 0F2AC5641E8A0B760001EE3F /* AirFixSpillsAfterTerminals.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = AirFixSpillsAfterTerminals.cpp; path = b3/air/AirFixSpillsAfterTerminals.cpp; sourceTree = "<group>"; };
                 0F2AC5651E8A0B760001EE3F /* AirFixSpillsAfterTerminals.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirFixSpillsAfterTerminals.h; path = b3/air/AirFixSpillsAfterTerminals.h; sourceTree = "<group>"; };
-                0F2AC5681E8A0BD10001EE3F /* AirAllocateRegistersByLinearScan.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = AirAllocateRegistersByLinearScan.cpp; path = b3/air/AirAllocateRegistersByLinearScan.cpp; sourceTree = "<group>"; };
-                0F2AC5691E8A0BD10001EE3F /* AirAllocateRegistersByLinearScan.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirAllocateRegistersByLinearScan.h; path = b3/air/AirAllocateRegistersByLinearScan.h; sourceTree = "<group>"; };
+                0F2AC5681E8A0BD10001EE3F /* AirAllocateRegistersAndStackByLinearScan.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = AirAllocateRegistersAndStackByLinearScan.cpp; path = b3/air/AirAllocateRegistersAndStackByLinearScan.cpp; sourceTree = "<group>"; };
+                0F2AC5691E8A0BD10001EE3F /* AirAllocateRegistersAndStackByLinearScan.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirAllocateRegistersAndStackByLinearScan.h; path = b3/air/AirAllocateRegistersAndStackByLinearScan.h; sourceTree = "<group>"; };
                 0F2AC56C1E8D7AFF0001EE3F /* AirPhaseInsertionSet.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = AirPhaseInsertionSet.cpp; path = b3/air/AirPhaseInsertionSet.cpp; sourceTree = "<group>"; };
                 0F2AC56D1E8D7AFF0001EE3F /* AirPhaseInsertionSet.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirPhaseInsertionSet.h; path = b3/air/AirPhaseInsertionSet.h; sourceTree = "<group>"; };
...
                 0F5B4A321C84F0D600F1B17E /* SlowPathReturnType.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = SlowPathReturnType.h; sourceTree = "<group>"; };
                 0F5CF9801E96F17D00C18692 /* AirTmpMap.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirTmpMap.h; path = b3/air/AirTmpMap.h; sourceTree = "<group>"; };
+                0F5CF9821E9D537500C18692 /* AirLowerStackArgs.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = AirLowerStackArgs.cpp; path = b3/air/AirLowerStackArgs.cpp; sourceTree = "<group>"; };
+                0F5CF9831E9D537500C18692 /* AirLowerStackArgs.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirLowerStackArgs.h; path = b3/air/AirLowerStackArgs.h; sourceTree = "<group>"; };
                 0F5D085C1B8CF99D001143B4 /* DFGNodeOrigin.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = DFGNodeOrigin.cpp; path = dfg/DFGNodeOrigin.cpp; sourceTree = "<group>"; };
                 0F5EF91B16878F78003E5C25 /* JITThunks.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = JITThunks.cpp; sourceTree = "<group>"; };
...
                 0FEC85621BDACDC70080FF74 /* AirSpecial.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = AirSpecial.cpp; path = b3/air/AirSpecial.cpp; sourceTree = "<group>"; };
                 0FEC85631BDACDC70080FF74 /* AirSpecial.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirSpecial.h; path = b3/air/AirSpecial.h; sourceTree = "<group>"; };
-                0FEC85641BDACDC70080FF74 /* AirSpillEverything.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = AirSpillEverything.cpp; path = b3/air/AirSpillEverything.cpp; sourceTree = "<group>"; };
-                0FEC85651BDACDC70080FF74 /* AirSpillEverything.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirSpillEverything.h; path = b3/air/AirSpillEverything.h; sourceTree = "<group>"; };
                 0FEC85661BDACDC70080FF74 /* AirStackSlot.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = AirStackSlot.cpp; path = b3/air/AirStackSlot.cpp; sourceTree = "<group>"; };
                 0FEC85671BDACDC70080FF74 /* AirStackSlot.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = AirStackSlot.h; path = b3/air/AirStackSlot.h; sourceTree = "<group>"; };
...
                         isa = PBXGroup;
                         children = (
+                                0F2AC5681E8A0BD10001EE3F /* AirAllocateRegistersAndStackByLinearScan.cpp */,
+                                0F2AC5691E8A0BD10001EE3F /* AirAllocateRegistersAndStackByLinearScan.h */,
                                 7965C2141E5D799600B7591D /* AirAllocateRegistersByGraphColoring.cpp */,
                                 7965C2151E5D799600B7591D /* AirAllocateRegistersByGraphColoring.h */,
-                                0F2AC5681E8A0BD10001EE3F /* AirAllocateRegistersByLinearScan.cpp */,
-                                0F2AC5691E8A0BD10001EE3F /* AirAllocateRegistersByLinearScan.h */,
                                 0FEC85481BDACDC70080FF74 /* AirAllocateStackByGraphColoring.cpp */,
                                 0FEC85491BDACDC70080FF74 /* AirAllocateStackByGraphColoring.h */,
...
                                 0F6183271C45BF070072450B /* AirLowerMacros.cpp */,
                                 0F6183281C45BF070072450B /* AirLowerMacros.h */,
+                                0F5CF9821E9D537500C18692 /* AirLowerStackArgs.cpp */,
+                                0F5CF9831E9D537500C18692 /* AirLowerStackArgs.h */,
                                 264091FA1BE2FD4100684DB2 /* AirOpcode.opcodes */,
                                 0FB3878C1BFBC44D00E3AB1E /* AirOptimizeBlockOrder.cpp */,
...
                                 0FEC85621BDACDC70080FF74 /* AirSpecial.cpp */,
                                 0FEC85631BDACDC70080FF74 /* AirSpecial.h */,
-                                0FEC85641BDACDC70080FF74 /* AirSpillEverything.cpp */,
-                                0FEC85651BDACDC70080FF74 /* AirSpillEverything.h */,
                                 0FEC85661BDACDC70080FF74 /* AirStackSlot.cpp */,
                                 0FEC85671BDACDC70080FF74 /* AirStackSlot.h */,
...
                                 0F338DFE1BED51270013C88F /* AirSimplifyCFG.h in Headers */,
                                 0FEC85881BDACDC70080FF74 /* AirSpecial.h in Headers */,
-                                0FEC858A1BDACDC70080FF74 /* AirSpillEverything.h in Headers */,
                                 0FEC858C1BDACDC70080FF74 /* AirStackSlot.h in Headers */,
                                 0F2BBD9E1C5FF4050023EF23 /* AirStackSlotKind.h in Headers */,
...
                                 0FEC85121BDACDAC0080FF74 /* B3ConstDoubleValue.h in Headers */,
                                 43422A631C158E6D00E2EB98 /* B3ConstFloatValue.h in Headers */,
-                                0F2AC56B1E8A0BD50001EE3F /* AirAllocateRegistersByLinearScan.h in Headers */,
+                                0F2AC56B1E8A0BD50001EE3F /* AirAllocateRegistersAndStackByLinearScan.h in Headers */,
                                 0FEC85B31BDED9570080FF74 /* B3ConstPtrValue.h in Headers */,
                                 0F338DF61BE93D550013C88F /* B3ConstrainedValue.h in Headers */,
...
                                 A50E4B6418809DD50068A46D /* JSGlobalObjectRuntimeAgent.h in Headers */,
                                 0F2C63C41E69EF9400C13839 /* B3MemoryValueInlines.h in Headers */,
+                                0F5CF9841E9D537700C18692 /* AirLowerStackArgs.h in Headers */,
                                 A503FA2A188F105900110F14 /* JSGlobalObjectScriptDebugServer.h in Headers */,
                                 A513E5C0185BFACC007E95AD /* JSInjectedScriptHost.h in Headers */,
...
                                 0F338DFD1BED51270013C88F /* AirSimplifyCFG.cpp in Sources */,
                                 0FEC85871BDACDC70080FF74 /* AirSpecial.cpp in Sources */,
-                                0FEC85891BDACDC70080FF74 /* AirSpillEverything.cpp in Sources */,
                                 0FEC858B1BDACDC70080FF74 /* AirStackSlot.cpp in Sources */,
                                 0F2BBD9D1C5FF4050023EF23 /* AirStackSlotKind.cpp in Sources */,
...
                                 0F2BDC15151C5D4D00CD8910 /* DFGFixupPhase.cpp in Sources */,
                                 0F20177F1DCADC3300EA5950 /* DFGFlowIndexing.cpp in Sources */,
-                                0F2AC56A1E8A0BD30001EE3F /* AirAllocateRegistersByLinearScan.cpp in Sources */,
+                                0F2AC56A1E8A0BD30001EE3F /* AirAllocateRegistersAndStackByLinearScan.cpp in Sources */,
                                 0F9D339617FFC4E60073C2BC /* DFGFlushedAt.cpp in Sources */,
                                 A7D89CF717A0B8CC00773AD8 /* DFGFlushFormat.cpp in Sources */,
...
                                 8642C512151C083D0046D4EF /* RegExpMatchesArray.cpp in Sources */,
                                 14280843107EC0930013E7B2 /* RegExpObject.cpp in Sources */,
+                                0F5CF9851E9D537A00C18692 /* AirLowerStackArgs.cpp in Sources */,
                                 14280844107EC0930013E7B2 /* RegExpPrototype.cpp in Sources */,
                                 6540C7A01B82E1C3000F6B79 /* RegisterAtOffset.cpp in Sources */,
  • trunk/Source/JavaScriptCore/b3/B3Procedure.cpp

    r214901 r215292  
 }
 
-const RegisterAtOffsetList& Procedure::calleeSaveRegisters() const
-{
-    return code().calleeSaveRegisters();
+RegisterAtOffsetList Procedure::calleeSaveRegisterAtOffsetList() const
+{
+    return code().calleeSaveRegisterAtOffsetList();
 }
 
  • trunk/Source/JavaScriptCore/b3/B3Procedure.h

    r214901 r215292  
 
     JS_EXPORT_PRIVATE unsigned frameSize() const;
-    JS_EXPORT_PRIVATE const RegisterAtOffsetList& calleeSaveRegisters() const;
+    JS_EXPORT_PRIVATE RegisterAtOffsetList calleeSaveRegisterAtOffsetList() const;
 
     PCToOriginMap& pcToOriginMap() { return m_pcToOriginMap; }
  • trunk/Source/JavaScriptCore/b3/B3StackmapGenerationParams.cpp

    r214887 r215292  
 
     RegisterSet unsavedCalleeSaves = RegisterSet::vmCalleeSaveRegisters();
-    for (const RegisterAtOffset& regAtOffset : m_context.code->calleeSaveRegisters())
-        unsavedCalleeSaves.clear(regAtOffset.reg());
+    unsavedCalleeSaves.exclude(m_context.code->calleeSaveRegisters());
 
     result.merge(unsavedCalleeSaves);
  • trunk/Source/JavaScriptCore/b3/air/AirAllocateRegistersAndStackByLinearScan.cpp

    r215291 r215292  
    2525
    2626#include "config.h"
    27 #include "AirAllocateRegistersByLinearScan.h"
     27#include "AirAllocateRegistersAndStackByLinearScan.h"
    2828
    2929#if ENABLE(B3_JIT)
    3030
     31#include "AirAllocateStackByGraphColoring.h"
    3132#include "AirArgInlines.h"
    3233#include "AirCode.h"
    3334#include "AirFixSpillsAfterTerminals.h"
     35#include "AirHandleCalleeSaves.h"
    3436#include "AirPhaseInsertionSet.h"
    3537#include "AirInstInlines.h"
     
    8587    bool isUnspillable { false };
    8688    bool didBuildPossibleRegs { false };
     89    unsigned spillIndex { 0 };
    8790};
    8891
     
    119122    void run()
    120123    {
     124        padInterference(m_code);
    121125        buildRegisterSet();
    122126        buildIndices();
     
    127131        }
    128132        for (;;) {
    129             prepareIntervals();
     133            prepareIntervalsForScanForRegisters();
    130134            m_didSpill = false;
    131135            forEachBank(
    132136                [&] (Bank bank) {
    133                     attemptScan(bank);
     137                    attemptScanForRegisters(bank);
    134138                });
    135139            if (!m_didSpill)
     
    139143        insertSpillCode();
    140144        assignRegisters();
     145        fixSpillsAfterTerminals(m_code);
     146
     147        handleCalleeSaves(m_code);
     148        allocateEscapedStackSlots(m_code);
     149        prepareIntervalsForScanForStack();
     150        scanForStack();
     151        updateFrameSizeBasedOnStackSlots(m_code);
     152        m_code.setStackIsAllocated(true);
    141153    }
    142154   
     
    187199    void buildIntervals()
    188200    {
     201        TimingScope timingScope("LinearScan::buildIntervals");
    189202        UnifiedTmpLiveness liveness(m_code);
    190203       
     
    267280                    dataLog("    ", tmp, ": ", m_map[tmp], "\n");
    268281                });
     282            dataLog("Clobbers: ", listDump(m_clobbers), "\n");
    269283        }
    270284    }
     
    289303    }
    290304   
    291     void prepareIntervals()
     305    void prepareIntervalsForScanForRegisters()
     306    {
     307        prepareIntervals(
     308            [&] (TmpData& data) -> bool {
     309                if (data.spilled)
     310                    return false;
     311               
     312                data.assigned = Reg();
     313                return true;
     314            });
     315    }
     316   
     317    void prepareIntervalsForScanForStack()
     318    {
     319        prepareIntervals(
     320            [&] (TmpData& data) -> bool {
     321                return data.spilled;
     322            });
     323    }
     324   
     325    template<typename SelectFunc>
     326    void prepareIntervals(const SelectFunc& selectFunc)
    292327    {
    293328        m_tmps.resize(0);
     
    296331            [&] (Tmp tmp) {
    297332                TmpData& data = m_map[tmp];
    298                 if (data.spilled)
     333                if (!selectFunc(data))
    299334                    return;
    300335               
    301                 data.assigned = Reg();
    302336                m_tmps.append(tmp);
    303337            });
     
    309343            });
    310344       
    311         if (verbose()) {
     345        if (verbose())
    312346            dataLog("Tmps: ", listDump(m_tmps), "\n");
    313             dataLog("Clobbers: ", listDump(m_clobbers), "\n");
    314         }
    315347    }
    316348   
     
    326358    }
    327359   
    328     void attemptScan(Bank bank)
     360    void attemptScanForRegisters(Bank bank)
    329361    {
    330362        // This is modeled after LinearScanRegisterAllocation in Fig. 1 in
     
    521553    }
    522554   
     555    void scanForStack()
     556    {
     557        // This is loosely modeled after LinearScanRegisterAllocation in Fig. 1 in
     558        // http://dl.acm.org/citation.cfm?id=330250.
     559
     560        m_active.clear();
     561        m_usedSpillSlots.clearAll();
     562       
     563        for (Tmp& tmp : m_tmps) {
     564            TmpData& entry = m_map[tmp];
     565            if (!entry.spilled)
     566                continue;
     567           
     568            size_t index = entry.interval.begin();
     569           
     570            // This is ExpireOldIntervals in Fig. 1.
     571            while (!m_active.isEmpty()) {
     572                Tmp tmp = m_active.first();
     573                TmpData& entry = m_map[tmp];
     574               
     575                bool expired = entry.interval.end() <= index;
     576               
     577                if (!expired)
     578                    break;
     579               
     580                m_active.removeFirst();
     581                m_usedSpillSlots.clear(entry.spillIndex);
     582            }
     583           
     584            entry.spillIndex = m_usedSpillSlots.findBit(0, false);
     585            ptrdiff_t offset = -static_cast<ptrdiff_t>(m_code.frameSize()) - static_cast<ptrdiff_t>(entry.spillIndex) * 8 - 8;
     586            if (verbose())
     587                dataLog("  Assigning offset = ", offset, " to spill ", pointerDump(entry.spilled), " for ", tmp, "\n");
     588            entry.spilled->setOffsetFromFP(offset);
     589            m_usedSpillSlots.set(entry.spillIndex);
     590            m_active.append(tmp);
     591        }
     592    }
     593   
    523594    void insertSpillCode()
    524595    {
     
    570641    Deque<Tmp> m_active;
    571642    RegisterSet m_activeRegs;
     643    BitVector m_usedSpillSlots;
    572644    bool m_didSpill { false };
    573645};
    574646
    575 void runLinearScan(Code& code)
     647} // anonymous namespace
     648
     649void allocateRegistersAndStackByLinearScan(Code& code)
    576650{
     651    PhaseScope phaseScope(code, "allocateRegistersAndStackByLinearScan");
    577652    if (verbose())
    578653        dataLog("Air before linear scan:\n", code);
     
    583658}
    584659
    585 } // anonymous namespace
    586 
    587 void allocateRegistersByLinearScan(Code& code)
    588 {
    589     PhaseScope phaseScope(code, "allocateRegistersByLinearScan");
    590     padInterference(code);
    591     runLinearScan(code);
    592     fixSpillsAfterTerminals(code);
    593 }
    594 
    595660} } } // namespace JSC::B3::Air
    596661
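The scanForStack() logic added in this file can be read in isolation as a small first-fit allocator over live intervals. The following is a minimal sketch under stated assumptions, not the WebKit code: `SpillInterval` and `assignSpillOffsets` are hypothetical names, `std::deque` and `std::vector<bool>` stand in for WTF's `Deque` and `BitVector`, and every spill is assumed to occupy one 8-byte slot below the already-allocated frame, mirroring the offset formula in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for the per-Tmp data used by the linear scan.
struct SpillInterval {
    std::size_t begin;            // first instruction index where the spill is live
    std::size_t end;              // one past the last index where it is live
    std::size_t slotIndex;        // filled in by assignSpillOffsets()
    std::ptrdiff_t offsetFromFP;  // filled in by assignSpillOffsets()
};

// Assigns each interval the lowest-numbered free 8-byte slot.
// Precondition (assumed here): intervals are sorted by their begin index.
inline void assignSpillOffsets(std::vector<SpillInterval>& intervals, std::size_t frameSize)
{
    std::deque<std::size_t> active;  // indices of intervals that currently hold a slot
    std::vector<bool> usedSlots;     // usedSlots[i] is true while slot i is taken

    for (std::size_t i = 0; i < intervals.size(); ++i) {
        SpillInterval& current = intervals[i];
        assert(!i || intervals[i - 1].begin <= current.begin);

        // Expire old intervals, looking only at the front of the queue,
        // as the loop in the diff does.
        while (!active.empty()) {
            SpillInterval& oldest = intervals[active.front()];
            if (oldest.end > current.begin)
                break;
            active.pop_front();
            usedSlots[oldest.slotIndex] = false;
        }

        // First fit: find the lowest clear bit, growing the bit set if needed.
        std::size_t slot = 0;
        while (slot < usedSlots.size() && usedSlots[slot])
            ++slot;
        if (slot == usedSlots.size())
            usedSlots.push_back(false);
        usedSlots[slot] = true;

        current.slotIndex = slot;
        current.offsetFromFP = -static_cast<std::ptrdiff_t>(frameSize)
            - static_cast<std::ptrdiff_t>(slot) * 8 - 8;
        active.push_back(i);
    }
}
```

With a 16-byte frame and intervals [0, 4), [1, 3), [5, 7), the first two overlap and receive slots 0 and 1, while the third starts after both have ended and reuses slot 0. Because only the front of the queue is checked, a short interval queued behind a long one keeps its slot until the long one expires. That costs some stack space, which fits the changeset's description of stack allocation as an afterthought at -O1.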
  • trunk/Source/JavaScriptCore/b3/air/AirAllocateRegistersAndStackByLinearScan.h

    r215291 r215292  
    3838// http://dl.acm.org/citation.cfm?id=330250
    3939//
    40 // This is not Air's primary register allocator. We use it only when running at optLevel<2. That's not
    41 // the default level. This register allocator is optimized primarily for running quickly. It's expected
    42 // that improvements to this register allocator should focus on improving its execution time without much
    43 // regard for the quality of generated code. If you want good code, use graph coloring.
     40// This is not Air's primary register allocator. We use it only when running at optLevel<2.
     41// That's not the default level. This register allocator is optimized primarily for running
     42// quickly. It's expected that improvements to this register allocator should focus on improving
     43// its execution time without much regard for the quality of generated code. If you want good
     44// code, use graph coloring.
    4445//
    4546// For Air's primary register allocator, see AirAllocateRegistersByGraphColoring.h|cpp.
    46 void allocateRegistersByLinearScan(Code&);
     47//
     48// This also does stack allocation as an afterthought. It does not do any spill coalescing.
     49void allocateRegistersAndStackByLinearScan(Code&);
    4750
    4851} } } // namespace JSC::B3::Air
  • trunk/Source/JavaScriptCore/b3/air/AirAllocateStackByGraphColoring.cpp

    r215073 r215292  
    3131#include "AirArgInlines.h"
    3232#include "AirCode.h"
    33 #include "AirInsertionSet.h"
     33#include "AirHandleCalleeSaves.h"
    3434#include "AirInstInlines.h"
    3535#include "AirLiveness.h"
     
    130130};
    131131
    132 } // anonymous namespace
    133 
    134 void allocateStackByGraphColoring(Code& code)
    135 {
    136     PhaseScope phaseScope(code, "allocateStackByGraphColoring");
    137 
     132Vector<StackSlot*> allocateEscapedStackSlotsImpl(Code& code)
     133{
    138134    // Allocate all of the escaped slots in order. This is kind of a crazy algorithm to allow for
    139135    // the possibility of stack slots being assigned frame offsets before we even get here.
    140     ASSERT(!code.frameSize());
     136    RELEASE_ASSERT(!code.frameSize());
    141137    Vector<StackSlot*> assignedEscapedStackSlots;
    142138    Vector<StackSlot*> escapedStackSlotsWorklist;
     
    159155        assignedEscapedStackSlots.append(slot);
    160156    }
     157    return assignedEscapedStackSlots;
     158}
     159
     160template<typename Collection>
     161void updateFrameSizeBasedOnStackSlotsImpl(Code& code, const Collection& collection)
     162{
     163    unsigned frameSize = 0;
     164    for (StackSlot* slot : collection)
     165        frameSize = std::max(frameSize, static_cast<unsigned>(-slot->offsetFromFP()));
     166    code.setFrameSize(WTF::roundUpToMultipleOf(stackAlignmentBytes(), frameSize));
     167}
     168
     169} // anonymous namespace
     170
     171void allocateEscapedStackSlots(Code& code)
     172{
     173    updateFrameSizeBasedOnStackSlotsImpl(code, allocateEscapedStackSlotsImpl(code));
     174}
     175
     176void updateFrameSizeBasedOnStackSlots(Code& code)
     177{
     178    updateFrameSizeBasedOnStackSlotsImpl(code, code.stackSlots());
     179}
     180
     181void allocateStackByGraphColoring(Code& code)
     182{
     183    PhaseScope phaseScope(code, "allocateStackByGraphColoring");
     184   
     185    handleCalleeSaves(code);
     186   
     187    Vector<StackSlot*> assignedEscapedStackSlots = allocateEscapedStackSlotsImpl(code);
    161188
    162189    // Now we handle the spill slots.
     
    378405    }
    379406
    380     // Figure out how much stack we're using for stack slots.
    381     unsigned frameSizeForStackSlots = 0;
    382     for (StackSlot* slot : code.stackSlots()) {
    383         frameSizeForStackSlots = std::max(
    384             frameSizeForStackSlots,
    385             static_cast<unsigned>(-slot->offsetFromFP()));
    386     }
    387 
    388     frameSizeForStackSlots = WTF::roundUpToMultipleOf(stackAlignmentBytes(), frameSizeForStackSlots);
    389 
    390     // Now we need to deduce how much argument area we need.
    391     for (BasicBlock* block : code) {
    392         for (Inst& inst : *block) {
    393             for (Arg& arg : inst.args) {
    394                 if (arg.isCallArg()) {
    395                     // For now, we assume that we use 8 bytes of the call arg. But that's not
    396                     // such an awesome assumption.
    397                     // FIXME: https://bugs.webkit.org/show_bug.cgi?id=150454
    398                     ASSERT(arg.offset() >= 0);
    399                     code.requestCallArgAreaSizeInBytes(arg.offset() + 8);
    400                 }
    401             }
    402         }
    403     }
    404 
    405     code.setFrameSize(frameSizeForStackSlots + code.callArgAreaSizeInBytes());
    406 
    407     // Finally, transform the code to use Addr's instead of StackSlot's. This is a lossless
    408     // transformation since we can search the StackSlots array to figure out which StackSlot any
    409     // offset-from-FP refers to.
    410 
    411     // FIXME: This may produce addresses that aren't valid if we end up with a ginormous stack frame.
    412     // We would have to scavenge for temporaries if this happened. Fortunately, this case will be
    413     // extremely rare so we can do crazy things when it arises.
    414     // https://bugs.webkit.org/show_bug.cgi?id=152530
    415 
    416     InsertionSet insertionSet(code);
    417     for (BasicBlock* block : code) {
    418         for (unsigned instIndex = 0; instIndex < block->size(); ++instIndex) {
    419             Inst& inst = block->at(instIndex);
    420             inst.forEachArg(
    421                 [&] (Arg& arg, Arg::Role role, Bank, Width width) {
    422                     auto stackAddr = [&] (int32_t offset) -> Arg {
    423                         return Arg::stackAddr(offset, code.frameSize(), width);
    424                     };
    425                    
    426                     switch (arg.kind()) {
    427                     case Arg::Stack: {
    428                         StackSlot* slot = arg.stackSlot();
    429                         if (Arg::isZDef(role)
    430                             && slot->kind() == StackSlotKind::Spill
    431                             && slot->byteSize() > bytes(width)) {
    432                             // Currently we only handle this simple case because it's the only one
    433                             // that arises: ZDef's are only 32-bit right now. So, when we hit these
    434                             // assertions it means that we need to implement those other kinds of
    435                             // zero fills.
    436                             RELEASE_ASSERT(slot->byteSize() == 8);
    437                             RELEASE_ASSERT(width == Width32);
    438 
    439                             RELEASE_ASSERT(isValidForm(StoreZero32, Arg::Stack));
    440                             insertionSet.insert(
    441                                 instIndex + 1, StoreZero32, inst.origin,
    442                                 stackAddr(arg.offset() + 4 + slot->offsetFromFP()));
    443                         }
    444                         arg = stackAddr(arg.offset() + slot->offsetFromFP());
    445                         break;
    446                     }
    447                     case Arg::CallArg:
    448                         arg = stackAddr(arg.offset() - code.frameSize());
    449                         break;
    450                     default:
    451                         break;
    452                     }
    453                 }
    454             );
    455         }
    456         insertionSet.execute(block);
    457     }
     407    updateFrameSizeBasedOnStackSlots(code);
     408    code.setStackIsAllocated(true);
    458409}
    459410
     
    462413#endif // ENABLE(B3_JIT)
    463414
    464 
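The new updateFrameSizeBasedOnStackSlotsImpl() helper reduces to "take the deepest slot and round up to the stack alignment". A minimal sketch of that arithmetic follows, assuming slots are represented only by their frame-pointer offsets. `frameSizeForOffsets` and the local `roundUpToMultipleOf` are hypothetical stand-ins for the Air and WTF functions, written here so the example is self-contained.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical local helper with the same argument order as the call in the diff.
inline unsigned roundUpToMultipleOf(unsigned divisor, unsigned x)
{
    assert(divisor > 0);
    return (x + divisor - 1) / divisor * divisor;
}

// Returns the smallest aligned frame size that covers every slot.
// Offsets are relative to the frame pointer and are expected to be <= 0.
inline unsigned frameSizeForOffsets(const std::vector<std::ptrdiff_t>& offsetsFromFP, unsigned stackAlignment)
{
    unsigned frameSize = 0;
    for (std::ptrdiff_t offset : offsetsFromFP) {
        assert(offset <= 0);
        frameSize = std::max(frameSize, static_cast<unsigned>(-offset));
    }
    return roundUpToMultipleOf(stackAlignment, frameSize);
}
```

For slots at offsets -8, -24 and -40 with a 16-byte stack alignment, the deepest slot is 40 bytes below the frame pointer, so the frame size rounds up to 48. With no slots at all the result is 0.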
  • trunk/Source/JavaScriptCore/b3/air/AirAllocateStackByGraphColoring.h

    r215073 r215292  
    3434// This allocates StackSlots to places on the stack. It first allocates the pinned ones in index
    3535// order and then it allocates the rest using first fit. Takes the opportunity to kill dead
    36 // assignments to stack slots, since it knows which ones are live. Also fixes ZDefs to anonymous
    37 // stack slots. Also coalesces spill slots whenever possible.
     36// assignments to stack slots, since it knows which ones are live. Also coalesces spill slots
     37// whenever possible.
    3838//
    3939// This is meant to be an optimal stack allocator, focused on generating great code. It's not
     
    4242void allocateStackByGraphColoring(Code&);
    4343
     44// These are utilities shared by this phase and allocateRegistersAndStackByLinearScan().
     45void allocateEscapedStackSlots(Code&);
     46void updateFrameSizeBasedOnStackSlots(Code&);
     47
    4448} } } // namespace JSC::B3::Air
    4549
  • trunk/Source/JavaScriptCore/b3/air/AirArg.cpp

    r214636 r215292  
    4242namespace JSC { namespace B3 { namespace Air {
    4343
     44Arg Arg::stackAddr(int32_t offsetFromFP, unsigned frameSize, Width width)
     45{
     46    Arg result = Arg::addr(Air::Tmp(GPRInfo::callFrameRegister), offsetFromFP);
     47    if (!result.isValidForm(width)) {
     48        result = Arg::addr(
     49            Air::Tmp(MacroAssembler::stackPointerRegister),
     50            offsetFromFP + frameSize);
     51    }
     52    return result;
     53}
     54
    4455bool Arg::isStackMemory() const
    4556{
  • trunk/Source/JavaScriptCore/b3/air/AirArg.h

    r214636 r215292  
    561561        return result;
    562562    }
    563 
    564     static Arg stackAddr(int32_t offsetFromFP, unsigned frameSize, Width width)
    565     {
    566         Arg result = Arg::addr(Air::Tmp(GPRInfo::callFrameRegister), offsetFromFP);
    567         if (!result.isValidForm(width)) {
    568             result = Arg::addr(
    569                 Air::Tmp(MacroAssembler::stackPointerRegister),
    570                 offsetFromFP + frameSize);
    571         }
    572         return result;
    573     }
     563   
     564    static Arg stackAddr(int32_t offsetFromFP, unsigned frameSize, Width);
    574565
    575566    // If you don't pass a Width, this optimistically assumes that you're using the right width.
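Arg::stackAddr(), which this changeset moves out of line, prefers a frame-pointer-relative address and falls back to a stack-pointer-relative one when the offset cannot be encoded. The sketch below illustrates that fallback under loud assumptions: `StackAddress`, `Base`, and `isEncodableOffset` are hypothetical, the -256..255 range is an arbitrary placeholder for whatever `isValidForm(width)` accepts on a real target, and the stack pointer is assumed to sit exactly `frameSize` bytes below the frame pointer.

```cpp
#include <cassert>
#include <cstdint>

enum class Base { FramePointer, StackPointer };

struct StackAddress {
    Base base;
    int32_t offset;
};

// Placeholder for the target's addressing-form check. The range is illustrative only.
inline bool isEncodableOffset(int32_t offset)
{
    return offset >= -256 && offset <= 255;
}

// Picks FP + offset when encodable, otherwise rewrites the address against SP.
// Assumes SP == FP - frameSize, so FP + offset == SP + (offset + frameSize).
inline StackAddress stackAddress(int32_t offsetFromFP, unsigned frameSize)
{
    if (isEncodableOffset(offsetFromFP))
        return { Base::FramePointer, offsetFromFP };
    assert(frameSize <= static_cast<unsigned>(INT32_MAX));
    return { Base::StackPointer, offsetFromFP + static_cast<int32_t>(frameSize) };
}
```

With a 512-byte frame, an offset of -16 stays frame-pointer-relative, while -400 is re-expressed as stack pointer plus 112. Both name the same memory location under the stated assumption.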
  • trunk/Source/JavaScriptCore/b3/air/AirCode.cpp

    r214887 r215292  
    103103StackSlot* Code::addStackSlot(unsigned byteSize, StackSlotKind kind, B3::StackSlot* b3Slot)
    104104{
    105     return m_stackSlots.addNew(byteSize, kind, b3Slot);
     105    StackSlot* result = m_stackSlots.addNew(byteSize, kind, b3Slot);
     106    if (m_stackIsAllocated) {
     107        // FIXME: This is unnecessarily awful. Fortunately, it doesn't run often.
     108        unsigned extent = WTF::roundUpToMultipleOf(result->alignment(), frameSize() + byteSize);
     109        result->setOffsetFromFP(-static_cast<ptrdiff_t>(extent));
     110        setFrameSize(WTF::roundUpToMultipleOf(stackAlignmentBytes(), extent));
     111    }
     112    return result;
    106113}
    107114
     
    137144    }
    138145    return false;
     146}
     147
     148void Code::setCalleeSaveRegisterAtOffsetList(RegisterAtOffsetList&& registerAtOffsetList, StackSlot* slot)
     149{
     150    m_uncorrectedCalleeSaveRegisterAtOffsetList = WTFMove(registerAtOffsetList);
     151    for (const RegisterAtOffset& registerAtOffset : m_uncorrectedCalleeSaveRegisterAtOffsetList)
     152        m_calleeSaveRegisters.set(registerAtOffset.reg());
     153    m_calleeSaveStackSlot = slot;
     154}
     155
     156RegisterAtOffsetList Code::calleeSaveRegisterAtOffsetList() const
     157{
     158    RegisterAtOffsetList result = m_uncorrectedCalleeSaveRegisterAtOffsetList;
     159    if (StackSlot* slot = m_calleeSaveStackSlot) {
     160        ptrdiff_t offset = slot->byteSize() + slot->offsetFromFP();
     161        for (size_t i = result.size(); i--;) {
     162            result.at(i) = RegisterAtOffset(
     163                result.at(i).reg(),
     164                result.at(i).offset() + offset);
     165        }
     166    }
     167    return result;
    139168}
    140169
     
    171200            out.print("    ", deepDump(special), "\n");
    172201    }
    173     if (m_frameSize)
    174         out.print("Frame size: ", m_frameSize, "\n");
     202    if (m_frameSize || m_stackIsAllocated)
     203        out.print("Frame size: ", m_frameSize, m_stackIsAllocated ? " (Allocated)" : "", "\n");
    175204    if (m_callArgAreaSize)
    176205        out.print("Call arg area size: ", m_callArgAreaSize, "\n");
    177     if (m_calleeSaveRegisters.size())
    178         out.print("Callee saves: ", m_calleeSaveRegisters, "\n");
     206    RegisterAtOffsetList calleeSaveRegisters = this->calleeSaveRegisterAtOffsetList();
     207    if (calleeSaveRegisters.size())
     208        out.print("Callee saves: ", calleeSaveRegisters, "\n");
    179209}
    180210
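Code::calleeSaveRegisterAtOffsetList() now shifts the stored, uncorrected offsets by wherever the callee-save stack slot ended up. The following sketch shows only that shift. `RegAtOffset` and `correctedCalleeSaves` are hypothetical names, registers are plain integers, and the uncorrected offsets are assumed to be negative and laid out as if the save area ended at the frame pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for RegisterAtOffset.
struct RegAtOffset {
    int reg;
    std::ptrdiff_t offset;  // relative to the frame pointer
};

// Shifts every entry so the save area lines up with the slot's final position.
// The shift is slot size plus slot offset, i.e. the address just past the slot.
inline std::vector<RegAtOffset> correctedCalleeSaves(
    std::vector<RegAtOffset> list, std::size_t slotByteSize, std::ptrdiff_t slotOffsetFromFP)
{
    assert(slotOffsetFromFP <= 0);
    std::ptrdiff_t shift = static_cast<std::ptrdiff_t>(slotByteSize) + slotOffsetFromFP;
    for (RegAtOffset& entry : list)
        entry.offset += shift;
    return list;
}
```

Two saved registers at uncorrected offsets -8 and -16, placed in a 16-byte slot at offset -48, end up at -40 and -48: the slot covers [-48, -32) and the entries fill it from the top down. If the slot sits directly below the frame pointer, at offset -16, the shift is zero and the list is unchanged.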
  • trunk/Source/JavaScriptCore/b3/air/AirCode.h

    r215071 r215292  
    187187        m_entrypointLabels = std::forward<Vector>(vector);
    188188    }
    189 
    190     const RegisterAtOffsetList& calleeSaveRegisters() const { return m_calleeSaveRegisters; }
    191     RegisterAtOffsetList& calleeSaveRegisters() { return m_calleeSaveRegisters; }
     189   
     190    void setStackIsAllocated(bool value)
     191    {
     192        m_stackIsAllocated = value;
     193    }
     194   
     195    bool stackIsAllocated() const { return m_stackIsAllocated; }
     196   
     197    // This sets the callee save registers.
     198    void setCalleeSaveRegisterAtOffsetList(RegisterAtOffsetList&&, StackSlot*);
     199
     200    // This returns the correctly offset list of callee save registers.
     201    RegisterAtOffsetList calleeSaveRegisterAtOffsetList() const;
     202   
     203    // This just tells you what the callee saves are.
     204    RegisterSet calleeSaveRegisters() const { return m_calleeSaveRegisters; }
    192205
    193206    // Recomputes predecessors and deletes unreachable blocks.
     
    330343    unsigned m_frameSize { 0 };
    331344    unsigned m_callArgAreaSize { 0 };
    332     RegisterAtOffsetList m_calleeSaveRegisters;
     345    bool m_stackIsAllocated { false };
     346    RegisterAtOffsetList m_uncorrectedCalleeSaveRegisterAtOffsetList;
     347    RegisterSet m_calleeSaveRegisters;
     348    StackSlot* m_calleeSaveStackSlot { nullptr };
    333349    Vector<FrequentedBlock> m_entrypoints; // This is empty until after lowerEntrySwitch().
    334350    Vector<CCallHelpers::Label> m_entrypointLabels; // This is empty until code generation.
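The m_stackIsAllocated flag lets Code::addStackSlot() place a slot immediately when it is called after stack allocation, by appending the slot below the current frame and growing the frame. A minimal sketch of that bookkeeping, assuming the frame is described by a single size: `addSlotAfterAllocation` and the local `roundUpToMultipleOf` are hypothetical names, and the alignment values are supplied by the caller rather than read from a slot object.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical local helper with the same argument order as the call in the diff.
inline unsigned roundUpToMultipleOf(unsigned divisor, unsigned x)
{
    assert(divisor > 0);
    return (x + divisor - 1) / divisor * divisor;
}

// Appends a slot below an already-sized frame.
// Returns the slot's offset from the frame pointer and updates frameSize in place.
inline std::ptrdiff_t addSlotAfterAllocation(
    unsigned& frameSize, unsigned byteSize, unsigned slotAlignment, unsigned stackAlignment)
{
    // The slot's far edge is the old frame plus the slot, aligned for the slot.
    unsigned extent = roundUpToMultipleOf(slotAlignment, frameSize + byteSize);
    // The frame grows to cover it, aligned for the stack.
    frameSize = roundUpToMultipleOf(stackAlignment, extent);
    return -static_cast<std::ptrdiff_t>(extent);
}
```

Starting from a 48-byte frame, adding an 8-byte slot yields offset -56 and a 64-byte frame. Adding a 4-byte slot afterwards yields offset -68 and an 80-byte frame. Each late slot starts below the previous aligned frame size and never reuses padding, which is in line with the FIXME in the diff calling this path "unnecessarily awful".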
  • trunk/Source/JavaScriptCore/b3/air/AirGenerate.cpp

    r215073 r215292  
    2929#if ENABLE(B3_JIT)
    3030
     31#include "AirAllocateRegistersAndStackByLinearScan.h"
    3132#include "AirAllocateRegistersByGraphColoring.h"
    32 #include "AirAllocateRegistersByLinearScan.h"
    3333#include "AirAllocateStackByGraphColoring.h"
    3434#include "AirCode.h"
     
    3737#include "AirFixPartialRegisterStalls.h"
    3838#include "AirGenerationContext.h"
    39 #include "AirHandleCalleeSaves.h"
    4039#include "AirLogRegisterPressure.h"
    4140#include "AirLowerAfterRegAlloc.h"
    4241#include "AirLowerEntrySwitch.h"
    4342#include "AirLowerMacros.h"
     43#include "AirLowerStackArgs.h"
    4444#include "AirOpcodeUtils.h"
    4545#include "AirOptimizeBlockOrder.h"
    4646#include "AirReportUsedRegisters.h"
    4747#include "AirSimplifyCFG.h"
    48 #include "AirSpillEverything.h"
    4948#include "AirValidate.h"
    5049#include "B3Common.h"
     
    8584    eliminateDeadCode(code);
    8685
    87     // Register allocation for all the Tmps that do not have a corresponding machine register.
    88     // After this phase, every Tmp has a reg.
    89     //
    90     // For debugging, you can use spillEverything() to put everything to the stack between each Inst.
    91     if (Options::airSpillsEverything())
    92         spillEverything(code);
    93     else if (code.optLevel() >= 2)
     86    if (code.optLevel() <= 1) {
     87        // When we're compiling quickly, we do register and stack allocation in one linear scan
     88        // phase. It's fast because it computes liveness only once.
     89        allocateRegistersAndStackByLinearScan(code);
     90       
     91        if (Options::logAirRegisterPressure()) {
     92            dataLog("Register pressure after register allocation:\n");
     93            logRegisterPressure(code);
     94        }
     95       
     96        // We may still need to do post-allocation lowering. Doing it after both register and
     97        // stack allocation is less optimal, but it works fine.
     98        lowerAfterRegAlloc(code);
     99    } else {
     100        // NOTE: B3 -O2 generates code that runs 1.5x-2x faster than code generated by -O1.
     101        // Most of this performance benefit is from -O2's graph coloring register allocation
     102        // and stack allocation pipeline, which you see below.
     103       
     104        // Register allocation for all the Tmps that do not have a corresponding machine
     105        // register. After this phase, every Tmp has a reg.
    94106        allocateRegistersByGraphColoring(code);
    95     else
    96         allocateRegistersByLinearScan(code);
    97 
    98     if (Options::logAirRegisterPressure()) {
    99         dataLog("Register pressure after register allocation:\n");
    100         logRegisterPressure(code);
    101     }
    102    
    103     if (code.optLevel() >= 2) {
    104         // This replaces uses of spill slots with registers or constants if possible. It does this by
    105         // minimizing the amount that we perturb the already-chosen register allocation. It may extend
    106         // the live ranges of registers though.
     107       
     108        if (Options::logAirRegisterPressure()) {
     109            dataLog("Register pressure after register allocation:\n");
     110            logRegisterPressure(code);
     111        }
     112       
     113        // This replaces uses of spill slots with registers or constants if possible. It
     114        // does this by minimizing the amount that we perturb the already-chosen register
     115        // allocation. It may extend the live ranges of registers though.
    107116        fixObviousSpills(code);
    108     }
    109 
    110     lowerAfterRegAlloc(code);
    111 
    112     // Prior to this point the prologue and epilogue is implicit. This makes it explicit. It also
    113     // does things like identify which callee-saves we're using and saves them.
    114     handleCalleeSaves(code);
    115    
    116     // This turns all Stack and CallArg Args into Addr args that use the frame pointer. It does
    117     // this by first-fit allocating stack slots. It should be pretty darn close to optimal, so we
    118     // shouldn't have to worry about this very much.
    119     allocateStackByGraphColoring(code);
     117       
     118        lowerAfterRegAlloc(code);
     119       
     120        // This does first-fit allocation of stack slots using an interference graph plus a
     121        // bunch of other optimizations.
     122        allocateStackByGraphColoring(code);
     123    }
     124   
     125    // This turns all Stack and CallArg Args into Addr args that use the frame pointer.
     126    lowerStackArgs(code);
    120127   
    121128    // If we coalesced moves then we can unbreak critical edges. This is the main reason for this
     
    213220                jit.addPtr(CCallHelpers::TrustedImm32(-code.frameSize()), MacroAssembler::stackPointerRegister);
    214221           
    215             for (const RegisterAtOffset& entry : code.calleeSaveRegisters()) {
     222            for (const RegisterAtOffset& entry : code.calleeSaveRegisterAtOffsetList()) {
    216223                if (entry.reg().isGPR())
    217224                    jit.storePtr(entry.reg().gpr(), argFor(entry));
     
    250257            auto start = jit.labelIgnoringWatchpoints();
    251258            if (code.frameSize()) {
    252                 for (const RegisterAtOffset& entry : code.calleeSaveRegisters()) {
     259                for (const RegisterAtOffset& entry : code.calleeSaveRegisterAtOffsetList()) {
    253260                    if (entry.reg().isGPR())
    254261                        jit.loadPtr(argFor(entry), entry.reg().gpr());
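The phase ordering that this file's diff establishes can be summarized as data. The sketch below lists the allocation-related phases for each optimization level as they appear in the new generate code. `allocationPhasesFor` is a hypothetical helper that only returns phase names. It does not run anything, and it omits the optional register-pressure logging and all phases before and after allocation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the allocation-related phase names, in order, for a given optLevel.
inline std::vector<std::string> allocationPhasesFor(unsigned optLevel)
{
    std::vector<std::string> phases;
    if (optLevel <= 1) {
        // Fast path: one linear scan does registers and stack, then lowering runs
        // with the stack already allocated.
        phases.push_back("allocateRegistersAndStackByLinearScan");
        phases.push_back("lowerAfterRegAlloc");
    } else {
        // Optimizing path: lowering and spill fix-ups run between the two allocators.
        phases.push_back("allocateRegistersByGraphColoring");
        phases.push_back("fixObviousSpills");
        phases.push_back("lowerAfterRegAlloc");
        phases.push_back("allocateStackByGraphColoring");
    }
    // Both paths finish by rewriting Stack and CallArg args into addresses.
    phases.push_back("lowerStackArgs");
    assert(phases.back() == "lowerStackArgs");
    return phases;
}
```

At optLevel 1 this gives three phases, against five at optLevel 2. In the fast path lowerAfterRegAlloc is the only phase that runs after stack allocation, which is why the changeset gives it a secondary mode.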
  • trunk/Source/JavaScriptCore/b3/air/AirHandleCalleeSaves.cpp

    r207004 r215292  
    11/*
    2  * Copyright (C) 2015-2016 Apple Inc. All rights reserved.
     2 * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
    33 *
    44 * Redistribution and use in source and binary forms, with or without
     
    3131#include "AirCode.h"
    3232#include "AirInstInlines.h"
    33 #include "AirPhaseScope.h"
    3433
    3534namespace JSC { namespace B3 { namespace Air {
     
    3736void handleCalleeSaves(Code& code)
    3837{
    39     PhaseScope phaseScope(code, "handleCalleeSaves");
    40 
    4138    RegisterSet usedCalleeSaves;
    4239
     
    6259        return;
    6360
    64     code.calleeSaveRegisters() = RegisterAtOffsetList(usedCalleeSaves);
     61    RegisterAtOffsetList calleeSaveRegisters = RegisterAtOffsetList(usedCalleeSaves);
    6562
    6663    size_t byteSize = 0;
    67     for (const RegisterAtOffset& entry : code.calleeSaveRegisters())
     64    for (const RegisterAtOffset& entry : calleeSaveRegisters)
    6865        byteSize = std::max(static_cast<size_t>(-entry.offset()), byteSize);
    6966
    70     StackSlot* savesArea = code.addStackSlot(byteSize, StackSlotKind::Locked);
    71     // This is a bit weird since we could have already pinned a different stack slot to this
    72     // area. Also, our runtime does not require us to pin the saves area. Maybe we shouldn't pin it?
    73     savesArea->setOffsetFromFP(-byteSize);
     67    code.setCalleeSaveRegisterAtOffsetList(
     68        WTFMove(calleeSaveRegisters),
     69        code.addStackSlot(byteSize, StackSlotKind::Locked));
    7470}
    7571
  • trunk/Source/JavaScriptCore/b3/air/AirHandleCalleeSaves.h

    r206525 r215292  
    11/*
    2  * Copyright (C) 2015 Apple Inc. All rights reserved.
     2 * Copyright (C) 2015-2017 Apple Inc. All rights reserved.
    33 *
    44 * Redistribution and use in source and binary forms, with or without
     
    3232class Code;
    3333
    34 // This phase identifies callee-save registers and adds code to save/restore them in the
    35 // prologue/epilogue to the code. It's a mandatory phase.
     34// This utility identifies callee-save registers and tells Code. It's called from phases that
     35// do stack allocation. We don't do it at the end of register allocation because the real end
     36// of register allocation is just before stack allocation.
    3637
    3738// FIXME: It would be cool to make this more interactive with the Air client and also more
  • trunk/Source/JavaScriptCore/b3/air/AirLowerAfterRegAlloc.cpp

    r214901 r215292  
    7676
    7777    HashMap<Inst*, RegisterSet> usedRegisters;
    78 
     78   
    7979    RegLiveness liveness(code);
    8080    for (BasicBlock* block : code) {
     
    9797        }
    9898    }
    99 
     99   
     100    std::array<std::array<StackSlot*, 2>, numBanks> slots;
     101    forEachBank(
     102        [&] (Bank bank) {
     103            for (unsigned i = 0; i < 2; ++i)
     104                slots[bank][i] = nullptr;
     105        });
     106
     107    // If we run after stack allocation then we cannot use those callee saves that aren't in
     108    // the callee save list. Note that we are only run after stack allocation in -O1, so this
     109    // kind of slop is OK.
     110    RegisterSet blacklistedCalleeSaves;
     111    if (code.stackIsAllocated()) {
     112        blacklistedCalleeSaves = RegisterSet::calleeSaveRegisters();
     113        blacklistedCalleeSaves.exclude(code.calleeSaveRegisters());
     114    }
     115   
    100116    auto getScratches = [&] (RegisterSet set, Bank bank) -> std::array<Arg, 2> {
    101117        std::array<Arg, 2> result;
     
    103119            bool found = false;
    104120            for (Reg reg : code.regsInPriorityOrder(bank)) {
    105                 if (!set.get(reg)) {
     121                if (!set.get(reg) && !blacklistedCalleeSaves.get(reg)) {
    106122                    result[i] = Tmp(reg);
    107123                    set.set(reg);
     
    111127            }
    112128            if (!found) {
    113                 result[i] = Arg::stack(
    114                     code.addStackSlot(
    115                         bytes(conservativeWidth(bank)),
    116                         StackSlotKind::Spill));
     129                StackSlot*& slot = slots[bank][i];
     130                if (!slot)
     131                    slot = code.addStackSlot(bytes(conservativeWidth(bank)), StackSlotKind::Spill);
     132                result[i] = Arg::stack(slots[bank][i]);
    117133            }
    118134        }
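The getScratches change does two things: it skips callee-save registers that were not recorded before the stack was allocated, and it reuses one lazily created stack slot per scratch position instead of adding a new slot on every call. Here is a minimal sketch of that selection for a single register bank. `ScratchPicker` and `Scratch` are hypothetical, registers and slots are plain integers, and `std::set<int>` stands in for `RegisterSet`.

```cpp
#include <array>
#include <cassert>
#include <set>
#include <vector>

// A scratch location: either a register number or a stack slot id.
struct Scratch {
    bool isRegister;
    int id;
};

class ScratchPicker {
public:
    ScratchPicker(const std::vector<int>& regsInPriorityOrder, const std::set<int>& blacklisted)
        : m_regs(regsInPriorityOrder)
        , m_blacklisted(blacklisted)
    {
    }

    // Picks two scratches, given the registers that are live at the instruction.
    std::array<Scratch, 2> pick(std::set<int> inUse)
    {
        std::array<Scratch, 2> result{};
        for (unsigned i = 0; i < 2; ++i) {
            bool found = false;
            for (int reg : m_regs) {
                if (!inUse.count(reg) && !m_blacklisted.count(reg)) {
                    result[i] = { true, reg };
                    inUse.insert(reg);  // do not hand out the same register twice
                    found = true;
                    break;
                }
            }
            if (!found) {
                // Create the slot for this position once, then keep reusing it.
                if (m_slots[i] < 0)
                    m_slots[i] = m_nextSlotId++;
                result[i] = { false, m_slots[i] };
            }
        }
        return result;
    }

    int slotsCreated() const { return m_nextSlotId; }

private:
    std::vector<int> m_regs;
    std::set<int> m_blacklisted;
    std::array<int, 2> m_slots{{ -1, -1 }};
    int m_nextSlotId = 0;
};
```

With registers 0, 1, 2 and register 2 blacklisted, a call with register 0 live returns register 1 plus a stack slot. Later calls with registers 0 and 1 live return two stack slots, and however many such calls follow, no more than two slots are ever created.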
  • trunk/Source/JavaScriptCore/b3/testb3.cpp

    r215265 r215292  
    1410314103            }
    1410414104        }
    14105         for (const RegisterAtOffset& regAtOffset : proc.calleeSaveRegisters())
     14105        for (const RegisterAtOffset& regAtOffset : proc.calleeSaveRegisterAtOffsetList())
    1410614106            usesCSRs |= csrs.get(regAtOffset.reg());
    1410714107        CHECK_EQ(usesCSRs, !pin);
  • trunk/Source/JavaScriptCore/ftl/FTLCompile.cpp

    r214571 r215292  
    7878   
    7979    std::unique_ptr<RegisterAtOffsetList> registerOffsets =
    80         std::make_unique<RegisterAtOffsetList>(state.proc->calleeSaveRegisters());
     80        std::make_unique<RegisterAtOffsetList>(state.proc->calleeSaveRegisterAtOffsetList());
    8181    if (shouldDumpDisassembly())
    8282        dataLog("Unwind info for ", CodeBlockWithJITType(state.graph.m_codeBlock, JITCode::FTLJIT), ": ", *registerOffsets, "\n");
  • trunk/Source/JavaScriptCore/jit/RegisterAtOffsetList.h

    r206525 r215292  
    3939
    4040    RegisterAtOffsetList();
    41     RegisterAtOffsetList(RegisterSet, OffsetBaseType = FramePointerBased);
     41    explicit RegisterAtOffsetList(RegisterSet, OffsetBaseType = FramePointerBased);
    4242
    4343    void dump(PrintStream&) const;
  • trunk/Source/JavaScriptCore/runtime/Options.h

    r214709 r215292  
    399399    v(bool, logB3PhaseTimes, false, Normal, nullptr) \
    400400    v(double, rareBlockPenalty, 0.001, Normal, nullptr) \
    401     v(bool, airSpillsEverything, false, Normal, nullptr) \
    402401    v(bool, airLinearScanVerbose, false, Normal, nullptr) \
    403402    v(bool, airLinearScanSpillsEverything, false, Normal, nullptr) \
  • trunk/Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp

    r215193 r215292  
    13291329        B3::generate(procedure, *compilationContext.wasmEntrypointJIT);
    13301330        compilationContext.wasmEntrypointByproducts = procedure.releaseByproducts();
    1331         result->wasmEntrypoint.calleeSaveRegisters = procedure.calleeSaveRegisters();
     1331        result->wasmEntrypoint.calleeSaveRegisters = procedure.calleeSaveRegisterAtOffsetList();
    13321332    }
    13331333