Timestamp:
Sep 7, 2017, 6:14:58 PM (8 years ago)
Author:
[email protected]
Message:

Use JIT probes for DFG OSR exit.
https://bugs.webkit.org/show_bug.cgi?id=175144
<rdar://problem/33437050>

Reviewed by Saam Barati.

This patch does the following:

  1. Replaces osrExitGenerationThunkGenerator() with osrExitThunkGenerator(). While osrExitGenerationThunkGenerator() generates a thunk that compiles a unique OSR offramp for each DFG OSR exit site, osrExitThunkGenerator() generates a thunk that just executes the OSR exit.

The osrExitThunkGenerator() generated thunk works by using a single JIT probe
to call OSRExit::executeOSRExit(). The JIT probe takes care of preserving
CPU registers, and providing the Probe::Stack mechanism for modifying the
stack frame.

OSRExit::executeOSRExit() replaces OSRExit::compileOSRExit() and
OSRExit::compileExit(). It is basically a re-write of those functions to
execute the OSR exit work instead of compiling code to execute the work.

As a result, we get the following savings:

  1. no more OSR exit ramp compilation time.
  2. no use of JIT executable memory for storing each unique OSR exit ramp.

On the negative side, we incur these costs:

  1. the OSRExit::executeOSRExit() ramp may be a little slower than the compiled version of the ramp. However, OSR exits are rare, so this small difference should not matter much. It is also offset by the savings from item 1 of the savings list above (no ramp compilation time).
  2. the Probe::Stack allocates 1K pages of memory for buffering stack modifications. The number of these pages depends on the span of stack memory that the OSR exit ramp reads from and writes to. Since the OSR exit ramp tends to only modify values in the current DFG frame and the current VMEntryRecord, the number of pages tends to only be 1 or 2.

Using the jsc tests as a workload, the vast majority of tests that do OSR
exits use 3 or fewer 1K pages (with the overwhelming majority using just 1 page).
A few pathological tests use up to 14 pages, and one particularly
bad test (function-apply-many-args.js) uses 513 pages.
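
To make the page-count reasoning above concrete, here is a minimal sketch of how the number of 1K pages follows from the span of addresses touched. PageSpanCounter and its members are hypothetical names used only for illustration; they are not part of JavaScriptCore, and the real Probe::Stack may track its pages differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical helper (not JavaScriptCore code): records which 1K pages a
// sequence of stack accesses would touch.
class PageSpanCounter {
public:
    static constexpr std::size_t pageSize = 1024; // the 1K page size from the text
    static constexpr std::uintptr_t pageMask = pageSize - 1;

    // Record an access of `size` bytes starting at `address`.
    void touch(std::uintptr_t address, std::size_t size)
    {
        assert(size > 0);
        std::uintptr_t first = address & ~pageMask;
        std::uintptr_t last = (address + size - 1) & ~pageMask;
        // Walk every page base from the first to the last page of the access.
        for (std::uintptr_t base = first; ; base += pageSize) {
            m_pages.insert(base);
            if (base == last)
                break;
        }
    }

    // Number of distinct pages touched so far.
    std::size_t pageCount() const { return m_pages.size(); }

private:
    std::set<std::uintptr_t> m_pages; // base addresses of touched pages
};
```

Accesses that stay within one frame usually land on one or two page bases, while a test that touches a very wide range of stack addresses would accumulate many entries.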

Similar to the old code, the OSR exit ramp still has 2 parts: a 1st part that is
executed only once to compute some values for the exit site that are used by
all exit operations from that site, and a 2nd part that executes the exit. The
1st part is guarded by a check of whether exit.exitState has already been
initialized. The computed values are cached in exit.exitState.
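
The two-part structure described above can be sketched as a lazily initialized per-site cache. This is only an illustration: ExitState, ExitSite, executeExit() and the counters below are hypothetical stand-ins, not the actual OSRExit or OSRExitState definitions.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the values computed once per exit site.
struct ExitState {
    int recoveryPlan; // placeholder for the cached per-site values
};

// Hypothetical stand-in for one exit site.
struct ExitSite {
    int siteIndex { 0 };
    std::unique_ptr<ExitState> exitState; // null until the first exit from this site
    unsigned initCount { 0 }; // illustration only: how often part 1 ran
    unsigned exitCount { 0 }; // illustration only: how often part 2 ran
};

// Part 1 runs only when exitState has not been initialized yet;
// part 2 runs on every exit and reuses the cached values.
inline int executeExit(ExitSite& exit)
{
    if (!exit.exitState) {
        exit.exitState = std::make_unique<ExitState>(ExitState { exit.siteIndex * 10 });
        ++exit.initCount;
    }
    assert(exit.exitState);
    ++exit.exitCount;
    return exit.exitState->recoveryPlan;
}
```

Calling executeExit() repeatedly on the same site pays the setup cost once and then only performs the per-exit work.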

Because the OSR exit thunk no longer compiles an OSR exit off-ramp, we no
longer need the facility to patch the site that jumps to the OSR exit ramp.
The DFG::JITCompiler has been modified to remove this patching code.

  2. Fixed the bottommost Probe::Context and Probe::Stack get/set methods to use std::memcpy to avoid strict aliasing issues.

Also optimized the implementation of Probe::Stack::physicalAddressFor().

  3. Miscellaneous convenience methods added to make the Probe::Context easier to use.
  4. Added a Probe::Frame class that makes it easier to get/set operands and arguments in a given frame using the deferred write properties of the Probe::Stack. Probe::Frame makes it easier to do some of the recovery work in the OSR exit ramp.
  5. Cloned or converted some functions needed by the OSR exit ramp. The original JIT versions of these functions are still left in place because they are still needed for FTL OSR exit. A FIXME comment has been added to remove them later. These functions include:

     DFGOSRExitCompilerCommon.cpp's handleExitCounts() ==> CodeBlock::updateOSRExitCounterAndCheckIfNeedToReoptimize()
     DFGOSRExitCompilerCommon.cpp's reifyInlinedCallFrames() ==> DFGOSRExit.cpp's reifyInlinedCallFrames()
     DFGOSRExitCompilerCommon.cpp's adjustAndJumpToTarget() ==> DFGOSRExit.cpp's adjustAndJumpToTarget()
     MethodOfGettingAValueProfile::emitReportValue() ==> MethodOfGettingAValueProfile::reportValue()
     DFGOperations.cpp's operationCreateDirectArgumentsDuringExit() ==> DFGOSRExit.cpp's createDirectArgumentsDuringExit()
     DFGOperations.cpp's operationCreateClonedArgumentsDuringExit() ==> DFGOSRExit.cpp's createClonedArgumentsDuringExit()

  • JavaScriptCore.xcodeproj/project.pbxproj:
  • assembler/MacroAssembler.cpp:

(JSC::stdFunctionCallback):

  • assembler/MacroAssemblerPrinter.cpp:

(JSC::Printer::printCallback):

  • assembler/ProbeContext.h:

(JSC::Probe::CPUState::gpr const):
(JSC::Probe::CPUState::spr const):
(JSC::Probe::Context::Context):
(JSC::Probe::Context::arg):
(JSC::Probe::Context::gpr):
(JSC::Probe::Context::spr):
(JSC::Probe::Context::fpr):
(JSC::Probe::Context::gprName):
(JSC::Probe::Context::sprName):
(JSC::Probe::Context::fprName):
(JSC::Probe::Context::gpr const):
(JSC::Probe::Context::spr const):
(JSC::Probe::Context::fpr const):
(JSC::Probe::Context::pc):
(JSC::Probe::Context::fp):
(JSC::Probe::Context::sp):
(JSC::Probe:: const): Deleted.

  • assembler/ProbeFrame.h: Added.

(JSC::Probe::Frame::Frame):
(JSC::Probe::Frame::getArgument):
(JSC::Probe::Frame::getOperand):
(JSC::Probe::Frame::get):
(JSC::Probe::Frame::setArgument):
(JSC::Probe::Frame::setOperand):
(JSC::Probe::Frame::set):

  • assembler/ProbeStack.cpp:

(JSC::Probe::Page::Page):

  • assembler/ProbeStack.h:

(JSC::Probe::Page::get):
(JSC::Probe::Page::set):
(JSC::Probe::Page::physicalAddressFor):
(JSC::Probe::Stack::lowWatermark):
(JSC::Probe::Stack::get):
(JSC::Probe::Stack::set):

  • bytecode/ArithProfile.cpp:
  • bytecode/ArithProfile.h:
  • bytecode/ArrayProfile.h:

(JSC::ArrayProfile::observeArrayMode):

  • bytecode/CodeBlock.cpp:

(JSC::CodeBlock::updateOSRExitCounterAndCheckIfNeedToReoptimize):

  • bytecode/CodeBlock.h:

(JSC::CodeBlock::addressOfOSRExitCounter): Deleted.

  • bytecode/ExecutionCounter.h:

(JSC::ExecutionCounter::hasCrossedThreshold const):
(JSC::ExecutionCounter::setNewThresholdForOSRExit):

  • bytecode/MethodOfGettingAValueProfile.cpp:

(JSC::MethodOfGettingAValueProfile::reportValue):

  • bytecode/MethodOfGettingAValueProfile.h:
  • dfg/DFGDriver.cpp:

(JSC::DFG::compileImpl):

  • dfg/DFGJITCode.cpp:

(JSC::DFG::JITCode::findPC): Deleted.

  • dfg/DFGJITCode.h:
  • dfg/DFGJITCompiler.cpp:

(JSC::DFG::JITCompiler::linkOSRExits):
(JSC::DFG::JITCompiler::link):

  • dfg/DFGOSRExit.cpp:

(JSC::DFG::jsValueFor):
(JSC::DFG::restoreCalleeSavesFor):
(JSC::DFG::saveCalleeSavesFor):
(JSC::DFG::restoreCalleeSavesFromVMEntryFrameCalleeSavesBuffer):
(JSC::DFG::copyCalleeSavesToVMEntryFrameCalleeSavesBuffer):
(JSC::DFG::saveOrCopyCalleeSavesFor):
(JSC::DFG::createDirectArgumentsDuringExit):
(JSC::DFG::createClonedArgumentsDuringExit):
(JSC::DFG::OSRExit::OSRExit):
(JSC::DFG::emitRestoreArguments):
(JSC::DFG::OSRExit::executeOSRExit):
(JSC::DFG::reifyInlinedCallFrames):
(JSC::DFG::adjustAndJumpToTarget):
(JSC::DFG::printOSRExit):
(JSC::DFG::OSRExit::setPatchableCodeOffset): Deleted.
(JSC::DFG::OSRExit::getPatchableCodeOffsetAsJump const): Deleted.
(JSC::DFG::OSRExit::codeLocationForRepatch const): Deleted.
(JSC::DFG::OSRExit::correctJump): Deleted.
(JSC::DFG::OSRExit::emitRestoreArguments): Deleted.
(JSC::DFG::OSRExit::compileOSRExit): Deleted.
(JSC::DFG::OSRExit::compileExit): Deleted.
(JSC::DFG::OSRExit::debugOperationPrintSpeculationFailure): Deleted.

  • dfg/DFGOSRExit.h:

(JSC::DFG::OSRExitState::OSRExitState):
(JSC::DFG::OSRExit::considerAddingAsFrequentExitSite):

  • dfg/DFGOSRExitCompilerCommon.cpp:
  • dfg/DFGOSRExitCompilerCommon.h:
  • dfg/DFGOperations.cpp:
  • dfg/DFGOperations.h:
  • dfg/DFGThunks.cpp:

(JSC::DFG::osrExitThunkGenerator):
(JSC::DFG::osrExitGenerationThunkGenerator): Deleted.

  • dfg/DFGThunks.h:
  • jit/AssemblyHelpers.cpp:

(JSC::AssemblyHelpers::debugCall): Deleted.

  • jit/AssemblyHelpers.h:
  • jit/JITOperations.cpp:
  • jit/JITOperations.h:
  • profiler/ProfilerOSRExit.h:

(JSC::Profiler::OSRExit::incCount):

  • runtime/JSCJSValue.h:
  • runtime/JSCJSValueInlines.h:
  • runtime/VM.h:
Location:
trunk/Source/JavaScriptCore/assembler
Files:
1 added
5 edited

  • trunk/Source/JavaScriptCore/assembler/MacroAssembler.cpp

    --- r220958
    +++ r221774
    @@ -39,5 +39,5 @@
     static void stdFunctionCallback(Probe::Context& context)
     {
    -    auto func = static_cast<const std::function<void(Probe::Context&)>*>(context.arg);
    +    auto func = context.arg<const std::function<void(Probe::Context&)>*>();
         (*func)(context);
     }
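
As an illustration of the call-site change above, the following sketch shows a typed accessor over an opaque void* argument. MiniContext and callStoredFunction() are hypothetical names; this is a simplified stand-in under that assumption, not the real Probe::Context.

```cpp
#include <cassert>
#include <functional>

// Simplified stand-in for a probe context (not the real Probe::Context).
class MiniContext {
public:
    explicit MiniContext(void* arg)
        : m_arg(arg)
    { }

    // Typed accessor with the same shape as the arg<T>() added in the diff:
    // the caller names the pointer type instead of casting at each call site.
    template<typename T>
    T arg() { return reinterpret_cast<T>(m_arg); }

private:
    void* m_arg; // opaque argument supplied when the probe was set up
};

// Mirrors the pattern of stdFunctionCallback(): recover the std::function
// from the opaque argument and invoke it with the context.
inline void callStoredFunction(MiniContext& context)
{
    auto* func = context.arg<const std::function<void(MiniContext&)>*>();
    assert(func);
    (*func)(context);
}
```

Keeping the raw pointer private and exposing only the typed accessor confines the cast to one place.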
  • trunk/Source/JavaScriptCore/assembler/MacroAssemblerPrinter.cpp

    --- r220958
    +++ r221774
    @@ -176,5 +176,5 @@
     {
         auto& out = WTF::dataFile();
    -    PrintRecordList& list = *reinterpret_cast<PrintRecordList*>(probeContext.arg);
    +    PrintRecordList& list = *probeContext.arg<PrintRecordList*>();
         for (size_t i = 0; i < list.size(); i++) {
             auto& record = list[i];
  • trunk/Source/JavaScriptCore/assembler/ProbeContext.h

    --- r220960
    +++ r221774
    @@ -46,12 +46,6 @@
         inline double& fpr(FPRegisterID);
     
    -    template<typename T, typename std::enable_if<std::is_integral<T>::value>::type* = nullptr>
    -    T gpr(RegisterID) const;
    -    template<typename T, typename std::enable_if<std::is_pointer<T>::value>::type* = nullptr>
    -    T gpr(RegisterID) const;
    -    template<typename T, typename std::enable_if<std::is_integral<T>::value>::type* = nullptr>
    -    T spr(SPRegisterID) const;
    -    template<typename T, typename std::enable_if<std::is_pointer<T>::value>::type* = nullptr>
    -    T spr(SPRegisterID) const;
    +    template<typename T> T gpr(RegisterID) const;
    +    template<typename T> T spr(SPRegisterID) const;
         template<typename T> T fpr(FPRegisterID) const;
     
    @@ -86,30 +80,22 @@
     }
     
    -template<typename T, typename std::enable_if<std::is_integral<T>::value>::type*>
    +template<typename T>
     T CPUState::gpr(RegisterID id) const
     {
         CPUState* cpu = const_cast<CPUState*>(this);
    -    return static_cast<T>(cpu->gpr(id));
    -}
    -
    -template<typename T, typename std::enable_if<std::is_pointer<T>::value>::type*>
    -T CPUState::gpr(RegisterID id) const
    -{
    -    CPUState* cpu = const_cast<CPUState*>(this);
    -    return reinterpret_cast<T>(cpu->gpr(id));
    -}
    -
    -template<typename T, typename std::enable_if<std::is_integral<T>::value>::type*>
    +    auto& from = cpu->gpr(id);
    +    typename std::remove_const<T>::type to { };
    +    std::memcpy(&to, &from, sizeof(to)); // Use std::memcpy to avoid strict aliasing issues.
    +    return to;
    +}
    +
    +template<typename T>
     T CPUState::spr(SPRegisterID id) const
     {
         CPUState* cpu = const_cast<CPUState*>(this);
    -    return static_cast<T>(cpu->spr(id));
    -}
    -
    -template<typename T, typename std::enable_if<std::is_pointer<T>::value>::type*>
    -T CPUState::spr(SPRegisterID id) const
    -{
    -    CPUState* cpu = const_cast<CPUState*>(this);
    -    return reinterpret_cast<T>(cpu->spr(id));
    +    auto& from = cpu->spr(id);
    +    typename std::remove_const<T>::type to { };
    +    std::memcpy(&to, &from, sizeof(to)); // Use std::memcpy to avoid strict aliasing issues.
    +    return to;
     }
     
    @@ -206,23 +192,29 @@
     
         Context(State* state)
    -        : m_state(state)
    -        , arg(state->arg)
    -        , cpu(state->cpu)
    +        : cpu(state->cpu)
    +        , m_state(state)
         { }
     
    -    uintptr_t& gpr(RegisterID id) { return m_state->cpu.gpr(id); }
    -    uintptr_t& spr(SPRegisterID id) { return m_state->cpu.spr(id); }
    -    double& fpr(FPRegisterID id) { return m_state->cpu.fpr(id); }
    -    const char* gprName(RegisterID id) { return m_state->cpu.gprName(id); }
    -    const char* sprName(SPRegisterID id) { return m_state->cpu.sprName(id); }
    -    const char* fprName(FPRegisterID id) { return m_state->cpu.fprName(id); }
    -
    -    void*& pc() { return m_state->cpu.pc(); }
    -    void*& fp() { return m_state->cpu.fp(); }
    -    void*& sp() { return m_state->cpu.sp(); }
    -
    -    template<typename T> T pc() { return m_state->cpu.pc<T>(); }
    -    template<typename T> T fp() { return m_state->cpu.fp<T>(); }
    -    template<typename T> T sp() { return m_state->cpu.sp<T>(); }
    +    template<typename T>
    +    T arg() { return reinterpret_cast<T>(m_state->arg); }
    +
    +    uintptr_t& gpr(RegisterID id) { return cpu.gpr(id); }
    +    uintptr_t& spr(SPRegisterID id) { return cpu.spr(id); }
    +    double& fpr(FPRegisterID id) { return cpu.fpr(id); }
    +    const char* gprName(RegisterID id) { return cpu.gprName(id); }
    +    const char* sprName(SPRegisterID id) { return cpu.sprName(id); }
    +    const char* fprName(FPRegisterID id) { return cpu.fprName(id); }
    +
    +    template<typename T> T gpr(RegisterID id) const { return cpu.gpr<T>(id); }
    +    template<typename T> T spr(SPRegisterID id) const { return cpu.spr<T>(id); }
    +    template<typename T> T fpr(FPRegisterID id) const { return cpu.fpr<T>(id); }
    +
    +    void*& pc() { return cpu.pc(); }
    +    void*& fp() { return cpu.fp(); }
    +    void*& sp() { return cpu.sp(); }
    +
    +    template<typename T> T pc() { return cpu.pc<T>(); }
    +    template<typename T> T fp() { return cpu.fp<T>(); }
    +    template<typename T> T sp() { return cpu.sp<T>(); }
     
         Stack& stack()
    @@ -235,11 +227,8 @@
         Stack* releaseStack() { return new Stack(WTFMove(m_stack)); }
     
    +    CPUState& cpu;
    +
     private:
         State* m_state;
    -public:
    -    void* arg;
    -    CPUState& cpu;
    -
    -private:
         Stack m_stack;
     
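
The memcpy-based register reads in the diff above follow a common pattern for reinterpreting the bits of a machine word without violating strict aliasing. A minimal standalone sketch, assuming a register slot is stored as a uintptr_t; readSlotAs() is a hypothetical name, not a JavaScriptCore function:

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: reinterpret the bits of a register-sized slot as T by
// copying bytes, rather than by casting the slot's address to a T*.
template<typename T>
T readSlotAs(const std::uintptr_t& slot)
{
    static_assert(sizeof(T) <= sizeof(std::uintptr_t), "T must fit in the slot");
    static_assert(std::is_trivially_copyable<T>::value, "T must be safe to copy bytewise");

    // Strip a top-level const so that the local can be written by memcpy.
    typename std::remove_const<T>::type to { };
    // Copies the first sizeof(T) bytes of the slot into `to`.
    std::memcpy(&to, &slot, sizeof(to));
    return to;
}
```

One template now serves both integral and pointer types, which is why the diff can drop the separate enable_if overloads that used static_cast and reinterpret_cast.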
  • trunk/Source/JavaScriptCore/assembler/ProbeStack.cpp

    --- r220960
    +++ r221774
    @@ -36,4 +36,5 @@
     Page::Page(void* baseAddress)
         : m_baseLogicalAddress(baseAddress)
    +    , m_physicalAddressOffset(reinterpret_cast<uint8_t*>(&m_buffer) - reinterpret_cast<uint8_t*>(baseAddress))
     {
         memcpy(&m_buffer, baseAddress, s_pageSize);
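
The constructor change above precomputes the distance between the buffered copy and the memory it shadows, so that translating an address becomes a single addition. The sketch below illustrates that idea with a hypothetical MiniPage class. It is not the real Probe::Page, and it does the arithmetic on uintptr_t values (an assumption of this sketch) rather than subtracting pointers into different objects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Simplified stand-in for a buffered page (not the real Probe::Page).
class MiniPage {
public:
    static constexpr std::size_t pageSize = 1024;

    explicit MiniPage(void* baseAddress)
        : m_baseLogicalAddress(baseAddress)
        // Computed once: buffer address minus logical base, modulo 2^N.
        , m_physicalAddressOffset(reinterpret_cast<std::uintptr_t>(&m_buffer[0])
            - reinterpret_cast<std::uintptr_t>(baseAddress))
    {
        std::memcpy(m_buffer, baseAddress, pageSize); // snapshot the real memory
    }

    // The cached offset is tied to this object's address, so forbid copies.
    MiniPage(const MiniPage&) = delete;
    MiniPage& operator=(const MiniPage&) = delete;

    // Translate an address in the shadowed range to its location in the buffer.
    void* physicalAddressFor(void* logicalAddress)
    {
        auto logical = reinterpret_cast<std::uintptr_t>(logicalAddress);
        assert(logical - reinterpret_cast<std::uintptr_t>(m_baseLogicalAddress) < pageSize);
        return reinterpret_cast<void*>(logical + m_physicalAddressOffset);
    }

private:
    void* m_baseLogicalAddress;
    std::uintptr_t m_physicalAddressOffset;
    alignas(std::max_align_t) std::uint8_t m_buffer[pageSize];
};
```

Compared with masking the address and adding the buffer base on every access, the lookup is reduced to one add at the cost of one extra field per page.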
  • trunk/Source/JavaScriptCore/assembler/ProbeStack.h

    --- r220960
    +++ r221774
    @@ -57,5 +57,13 @@
         T get(void* logicalAddress)
         {
    -        return *physicalAddressFor<T*>(logicalAddress);
    +        void* from = physicalAddressFor(logicalAddress);
    +        typename std::remove_const<T>::type to { };
    +        std::memcpy(&to, from, sizeof(to)); // Use std::memcpy to avoid strict aliasing issues.
    +        return to;
    +    }
    +    template<typename T>
    +    T get(void* logicalBaseAddress, ptrdiff_t offset)
    +    {
    +        return get<T>(reinterpret_cast<uint8_t*>(logicalBaseAddress) + offset);
         }
     
    @@ -64,5 +72,11 @@
         {
             m_dirtyBits |= dirtyBitFor(logicalAddress);
    -        *physicalAddressFor<T*>(logicalAddress) = value;
    +        void* to = physicalAddressFor(logicalAddress);
    +        std::memcpy(to, &value, sizeof(T)); // Use std::memcpy to avoid strict aliasing issues.
    +    }
    +    template<typename T>
    +    void set(void* logicalBaseAddress, ptrdiff_t offset, T value)
    +    {
    +        set<T>(reinterpret_cast<uint8_t*>(logicalBaseAddress) + offset, value);
         }
     
    @@ -81,10 +95,7 @@
         }
     
    -    template<typename T, typename = typename std::enable_if<std::is_pointer<T>::value>::type>
    -    T physicalAddressFor(void* logicalAddress)
    -    {
    -        uintptr_t offset = reinterpret_cast<uintptr_t>(logicalAddress) & s_pageMask;
    -        void* physicalAddress = reinterpret_cast<uint8_t*>(&m_buffer) + offset;
    -        return reinterpret_cast<T>(physicalAddress);
    +    void* physicalAddressFor(void* logicalAddress)
    +    {
    +        return reinterpret_cast<uint8_t*>(logicalAddress) + m_physicalAddressOffset;
         }
     
    @@ -93,4 +104,5 @@
         void* m_baseLogicalAddress { nullptr };
         uintptr_t m_dirtyBits { 0 };
    +    ptrdiff_t m_physicalAddressOffset;
     
         static constexpr size_t s_pageSize = 1024;
    @@ -121,38 +133,37 @@
         Stack(Stack&& other);
     
    -    void* lowWatermark() { return m_lowWatermark; }
    -
    -    template<typename T>
    -    typename std::enable_if<!std::is_same<double, typename std::remove_cv<T>::type>::value, T>::type get(void* address)
    -    {
    -        Page* page = pageFor(address);
    -        return page->get<T>(address);
    -    }
    -
    -    template<typename T, typename = typename std::enable_if<!std::is_same<double, typename std::remove_cv<T>::type>::value>::type>
    -    void set(void* address, T value)
    -    {
    -        Page* page = pageFor(address);
    -        page->set<T>(address, value);
    -
    +    void* lowWatermark()
    +    {
             // We use the chunkAddress for the low watermark because we'll be doing write backs
             // to the stack in increments of chunks. Hence, we'll treat the lowest address of
             // the chunk as the low watermark of any given set address.
    -        void* chunkAddress = Page::chunkAddressFor(address);
    -        if (chunkAddress < m_lowWatermark)
    -            m_lowWatermark = chunkAddress;
    -    }
    -
    -    template<typename T>
    -    typename std::enable_if<std::is_same<double, typename std::remove_cv<T>::type>::value, T>::type get(void* address)
    +        return Page::chunkAddressFor(m_lowWatermark);
    +    }
    +
    +    template<typename T>
    +    T get(void* address)
         {
             Page* page = pageFor(address);
    -        return bitwise_cast<double>(page->get<uint64_t>(address));
    -    }
    -
    -    template<typename T, typename = typename std::enable_if<std::is_same<double, typename std::remove_cv<T>::type>::value>::type>
    -    void set(void* address, double value)
    -    {
    -        set<uint64_t>(address, bitwise_cast<uint64_t>(value));
    +        return page->get<T>(address);
    +    }
    +    template<typename T>
    +    T get(void* logicalBaseAddress, ptrdiff_t offset)
    +    {
    +        return get<T>(reinterpret_cast<uint8_t*>(logicalBaseAddress) + offset);
    +    }
    +
    +    template<typename T>
    +    void set(void* address, T value)
    +    {
    +        Page* page = pageFor(address);
    +        page->set<T>(address, value);
    +
    +        if (address < m_lowWatermark)
    +            m_lowWatermark = address;
    +    }
    +    template<typename T>
    +    void set(void* logicalBaseAddress, ptrdiff_t offset, T value)
    +    {
    +        set<T>(reinterpret_cast<uint8_t*>(logicalBaseAddress) + offset, value);
         }
     
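
To illustrate the deferred-write behaviour that the get/set methods above rely on, here is a deliberately simplified sketch. MiniStack is a hypothetical class that buffers writes in a per-byte map instead of the 1K pages, dirty bits and chunked write-back used by the real Probe::Stack, and its flush() is an assumed stand-in for that write-back step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>

// Hypothetical miniature of a deferred-write stack buffer (not Probe::Stack).
class MiniStack {
public:
    // Read a T: start from real memory, then overlay any buffered bytes.
    template<typename T>
    T get(void* address)
    {
        unsigned char bytes[sizeof(T)];
        std::memcpy(bytes, address, sizeof(T));
        auto key = reinterpret_cast<std::uintptr_t>(address);
        for (std::size_t i = 0; i < sizeof(T); ++i) {
            auto it = m_pending.find(key + i);
            if (it != m_pending.end())
                bytes[i] = it->second; // a buffered write wins over real memory
        }
        T result;
        std::memcpy(&result, bytes, sizeof(T));
        return result;
    }

    // Buffer a write; real memory is untouched until flush().
    template<typename T>
    void set(void* address, T value)
    {
        unsigned char bytes[sizeof(T)];
        std::memcpy(bytes, &value, sizeof(T));
        auto key = reinterpret_cast<std::uintptr_t>(address);
        for (std::size_t i = 0; i < sizeof(T); ++i)
            m_pending[key + i] = bytes[i];
        if (!m_hasWrites || key < m_lowWatermark) {
            m_lowWatermark = key; // lowest address written so far
            m_hasWrites = true;
        }
    }

    // Base-plus-offset convenience overload, as added in the diff.
    template<typename T>
    void set(void* base, std::ptrdiff_t offset, T value)
    {
        set<T>(static_cast<unsigned char*>(base) + offset, value);
    }

    // Apply all buffered writes to real memory.
    void flush()
    {
        for (auto& entry : m_pending)
            *reinterpret_cast<unsigned char*>(entry.first) = entry.second;
        m_pending.clear();
    }

    std::uintptr_t lowWatermark() const
    {
        assert(m_hasWrites);
        return m_lowWatermark;
    }

private:
    std::map<std::uintptr_t, unsigned char> m_pending; // address -> buffered byte
    std::uintptr_t m_lowWatermark { 0 };
    bool m_hasWrites { false };
};
```

Because reads see buffered values before they reach memory, recovery code can stage a whole frame's worth of changes and commit them in one step; the low watermark records how far down the stack that commit has to reach.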