Timestamp:
Sep 3, 2013, 11:26:04 PM
Author:
[email protected]
Message:

The DFG should be able to tier-up and OSR enter into the FTL
https://bugs.webkit.org/show_bug.cgi?id=112838

Source/JavaScriptCore:

Reviewed by Mark Hahnenberg.

This adds the ability for the DFG to tier-up into the FTL. This works in both
of the expected tier-up modes:

Replacement: frequently called functions eventually have their entrypoint
replaced with one that goes into FTL-compiled code. Note, this will be a
slow-down for now since we don't yet have LLVM calling convention integration.

OSR entry: code stuck in hot loops gets OSR'd into the FTL from the DFG.

This means that if the DFG detects that a function is an FTL candidate, it
inserts execution counting code similar to the kind that the baseline JIT
would use. If the count trips in a loop header that is an OSR candidate
(that is, the loop is not inlined), we do OSR entry; otherwise we do replacement.
OSR almost always also implies future replacement.
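
The decision described above can be sketched roughly as follows. This is an illustration only, not the code in this patch: TierUpCounter, TierUpAction and tierUpCheck are hypothetical names, and the threshold used in the walk-through is arbitrary.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; none of this is JavaScriptCore API.
enum class TierUpAction { KeepRunning, OSREntry, Replacement };

struct TierUpCounter {
    int32_t count;
    int32_t threshold;
};

// Called at each check site (function entry or loop header).
// Bumps the counter and reports what a trigger at this site should do.
inline TierUpAction tierUpCheck(TierUpCounter& counter, bool atLoopHeader, bool loopIsInlined)
{
    if (++counter.count < counter.threshold)
        return TierUpAction::KeepRunning;

    // Only a loop header that does not belong to an inlined function is
    // treated as an OSR entry candidate; everything else means replacement.
    if (atLoopHeader && !loopIsInlined)
        return TierUpAction::OSREntry;
    return TierUpAction::Replacement;
}

// Small walk-through with an arbitrary threshold of 3.
inline void tierUpCheckDemo()
{
    TierUpCounter counter { 0, 3 };
    assert(tierUpCheck(counter, true, false) == TierUpAction::KeepRunning);  // count is 1
    assert(tierUpCheck(counter, true, false) == TierUpAction::KeepRunning);  // count is 2
    assert(tierUpCheck(counter, true, false) == TierUpAction::OSREntry);     // count is 3
    assert(tierUpCheck(counter, true, true) == TierUpAction::Replacement);   // inlined loop
    assert(tierUpCheck(counter, false, false) == TierUpAction::Replacement); // function entry
}
```

The sketch never resets the counter; in the patch itself the thresholds are adjusted by calls such as optimizeSoon() and optimizeAfterWarmUp().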

OSR entry into the FTL is really cool. It uses a specialized FTL compile of
the code, where early in the DFG pipeline we replace the original root block
with an OSR entrypoint block that jumps to the pre-header of the hot loop.
The OSR entrypoint loads all live state at the loop pre-header using loads
from a scratch buffer, which gets populated by the runtime's OSR entry
preparation code (FTL::prepareOSREntry()). This approach appears to work well
with all of our subsequent optimizations, including prediction propagation,
CFA, and LICM. LLVM seems happy with it, too. Best of all, it works naturally
with concurrent compilation: when we hit the tier-up trigger we spawn a
compilation plan at the bytecode index from which we triggered; once the
compilation finishes, the next trigger will try to enter at that bytecode
index. If it can't - for example because the code has moved on to another
loop - then we just try again. Loops that get hot enough for OSR entry (about
25,000 iterations) will probably still be running when a concurrent compile
finishes, so this doesn't appear to be a big problem.
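
A minimal sketch of the scratch-buffer hand-off and the "try again" behavior, under simplifying assumptions: EntryCompile, prepareEntrySketch and loadEntryLocalSketch are hypothetical stand-ins, and live values are modeled as plain 64-bit integers rather than JSValues. The real logic lives in FTL::prepareOSREntry() and the ExtractOSREntryLocal lowering.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an FTL-for-OSR-entry compile; not JavaScriptCore API.
struct EntryCompile {
    unsigned bytecodeIndex;            // loop header this compile can be entered at
    std::vector<int64_t> entryBuffer;  // scratch buffer the entrypoint block loads from
};

// Runtime side: copy the live values into the scratch buffer, but only if this
// compile was made for the loop we are currently in. Returns false when entry
// is impossible, in which case the caller keeps running and tries again later.
inline bool prepareEntrySketch(
    EntryCompile& compile, unsigned currentBytecodeIndex, const std::vector<int64_t>& liveValues)
{
    if (compile.bytecodeIndex != currentBytecodeIndex)
        return false; // the code has moved on to another loop
    if (liveValues.size() != compile.entryBuffer.size())
        return false; // the shape of the live state does not match
    compile.entryBuffer = liveValues;
    return true;
}

// Compiled side: the OSR entrypoint block loads each local from the buffer
// before jumping to the loop pre-header.
inline int64_t loadEntryLocalSketch(const EntryCompile& compile, std::size_t local)
{
    return compile.entryBuffer.at(local);
}

inline void osrEntrySketchDemo()
{
    EntryCompile compile { 42, std::vector<int64_t>(2, 0) };

    // Trigger fires in a different loop: no entry, buffer untouched.
    assert(!prepareEntrySketch(compile, 17, { 1, 2 }));
    assert(loadEntryLocalSketch(compile, 0) == 0);

    // Trigger fires at the loop the compile was made for: entry is prepared.
    assert(prepareEntrySketch(compile, 42, { 7, 9 }));
    assert(loadEntryLocalSketch(compile, 0) == 7);
    assert(loadEntryLocalSketch(compile, 1) == 9);
}
```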

This immediately gives us a 70% speed-up on imaging-gaussian-blur. We could
get a bigger speed-up by adding some more intelligence and tweaking LLVM to
compile code faster. Those things will happen eventually but this is a good
start. Probably this code will see more tuning as we get more coverage in the
FTL JIT, but I'll worry about that in future patches.

  • CMakeLists.txt:
  • GNUmakefile.list.am:
  • JavaScriptCore.xcodeproj/project.pbxproj:
  • Target.pri:
  • bytecode/CodeBlock.cpp:

(JSC::CodeBlock::CodeBlock):
(JSC::CodeBlock::hasOptimizedReplacement):
(JSC::CodeBlock::setOptimizationThresholdBasedOnCompilationResult):

  • bytecode/CodeBlock.h:
  • dfg/DFGAbstractInterpreterInlines.h:

(JSC::DFG::::executeEffects):

  • dfg/DFGByteCodeParser.cpp:

(JSC::DFG::ByteCodeParser::parseBlock):
(JSC::DFG::ByteCodeParser::parse):

  • dfg/DFGCFGSimplificationPhase.cpp:

(JSC::DFG::CFGSimplificationPhase::run):

  • dfg/DFGClobberize.h:

(JSC::DFG::clobberize):

  • dfg/DFGDriver.cpp:

(JSC::DFG::compileImpl):
(JSC::DFG::compile):

  • dfg/DFGDriver.h:
  • dfg/DFGFixupPhase.cpp:

(JSC::DFG::FixupPhase::fixupNode):

  • dfg/DFGGraph.cpp:

(JSC::DFG::Graph::dump):
(JSC::DFG::Graph::killBlockAndItsContents):
(JSC::DFG::Graph::killUnreachableBlocks):

  • dfg/DFGGraph.h:
  • dfg/DFGInPlaceAbstractState.cpp:

(JSC::DFG::InPlaceAbstractState::initialize):

  • dfg/DFGJITCode.cpp:

(JSC::DFG::JITCode::reconstruct):
(JSC::DFG::JITCode::checkIfOptimizationThresholdReached):
(JSC::DFG::JITCode::optimizeNextInvocation):
(JSC::DFG::JITCode::dontOptimizeAnytimeSoon):
(JSC::DFG::JITCode::optimizeAfterWarmUp):
(JSC::DFG::JITCode::optimizeSoon):
(JSC::DFG::JITCode::forceOptimizationSlowPathConcurrently):
(JSC::DFG::JITCode::setOptimizationThresholdBasedOnCompilationResult):

  • dfg/DFGJITCode.h:
  • dfg/DFGJITFinalizer.cpp:

(JSC::DFG::JITFinalizer::finalize):
(JSC::DFG::JITFinalizer::finalizeFunction):
(JSC::DFG::JITFinalizer::finalizeCommon):

  • dfg/DFGLoopPreHeaderCreationPhase.cpp:

(JSC::DFG::createPreHeader):
(JSC::DFG::LoopPreHeaderCreationPhase::run):

  • dfg/DFGLoopPreHeaderCreationPhase.h:
  • dfg/DFGNode.h:

(JSC::DFG::Node::hasUnlinkedLocal):
(JSC::DFG::Node::unlinkedLocal):

  • dfg/DFGNodeType.h:
  • dfg/DFGOSREntry.cpp:

(JSC::DFG::prepareOSREntry):

  • dfg/DFGOSREntrypointCreationPhase.cpp: Added.

(JSC::DFG::OSREntrypointCreationPhase::OSREntrypointCreationPhase):
(JSC::DFG::OSREntrypointCreationPhase::run):
(JSC::DFG::performOSREntrypointCreation):

  • dfg/DFGOSREntrypointCreationPhase.h: Added.
  • dfg/DFGOperations.cpp:
  • dfg/DFGOperations.h:
  • dfg/DFGPlan.cpp:

(JSC::DFG::Plan::Plan):
(JSC::DFG::Plan::compileInThread):
(JSC::DFG::Plan::compileInThreadImpl):

  • dfg/DFGPlan.h:
  • dfg/DFGPredictionInjectionPhase.cpp:

(JSC::DFG::PredictionInjectionPhase::run):

  • dfg/DFGPredictionPropagationPhase.cpp:

(JSC::DFG::PredictionPropagationPhase::propagate):

  • dfg/DFGSafeToExecute.h:

(JSC::DFG::safeToExecute):

  • dfg/DFGSpeculativeJIT32_64.cpp:

(JSC::DFG::SpeculativeJIT::compile):

  • dfg/DFGSpeculativeJIT64.cpp:

(JSC::DFG::SpeculativeJIT::compile):

  • dfg/DFGTierUpCheckInjectionPhase.cpp: Added.

(JSC::DFG::TierUpCheckInjectionPhase::TierUpCheckInjectionPhase):
(JSC::DFG::TierUpCheckInjectionPhase::run):
(JSC::DFG::performTierUpCheckInjection):

  • dfg/DFGTierUpCheckInjectionPhase.h: Added.
  • dfg/DFGToFTLDeferredCompilationCallback.cpp: Added.

(JSC::DFG::ToFTLDeferredCompilationCallback::ToFTLDeferredCompilationCallback):
(JSC::DFG::ToFTLDeferredCompilationCallback::~ToFTLDeferredCompilationCallback):
(JSC::DFG::ToFTLDeferredCompilationCallback::create):
(JSC::DFG::ToFTLDeferredCompilationCallback::compilationDidBecomeReadyAsynchronously):
(JSC::DFG::ToFTLDeferredCompilationCallback::compilationDidComplete):

  • dfg/DFGToFTLDeferredCompilationCallback.h: Added.
  • dfg/DFGToFTLForOSREntryDeferredCompilationCallback.cpp: Added.

(JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::ToFTLForOSREntryDeferredCompilationCallback):
(JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::~ToFTLForOSREntryDeferredCompilationCallback):
(JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::create):
(JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::compilationDidBecomeReadyAsynchronously):
(JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::compilationDidComplete):

  • dfg/DFGToFTLForOSREntryDeferredCompilationCallback.h: Added.
  • dfg/DFGWorklist.cpp:

(JSC::DFG::globalWorklist):

  • dfg/DFGWorklist.h:
  • ftl/FTLCapabilities.cpp:

(JSC::FTL::canCompile):

  • ftl/FTLCapabilities.h:
  • ftl/FTLForOSREntryJITCode.cpp: Added.

(JSC::FTL::ForOSREntryJITCode::ForOSREntryJITCode):
(JSC::FTL::ForOSREntryJITCode::~ForOSREntryJITCode):
(JSC::FTL::ForOSREntryJITCode::ftlForOSREntry):
(JSC::FTL::ForOSREntryJITCode::initializeEntryBuffer):

  • ftl/FTLForOSREntryJITCode.h: Added.

(JSC::FTL::ForOSREntryJITCode::entryBuffer):
(JSC::FTL::ForOSREntryJITCode::setBytecodeIndex):
(JSC::FTL::ForOSREntryJITCode::bytecodeIndex):
(JSC::FTL::ForOSREntryJITCode::countEntryFailure):
(JSC::FTL::ForOSREntryJITCode::entryFailureCount):

  • ftl/FTLJITFinalizer.cpp:

(JSC::FTL::JITFinalizer::finalizeFunction):

  • ftl/FTLLink.cpp:

(JSC::FTL::link):

  • ftl/FTLLowerDFGToLLVM.cpp:

(JSC::FTL::LowerDFGToLLVM::compileBlock):
(JSC::FTL::LowerDFGToLLVM::compileNode):
(JSC::FTL::LowerDFGToLLVM::compileExtractOSREntryLocal):
(JSC::FTL::LowerDFGToLLVM::compileGetLocal):
(JSC::FTL::LowerDFGToLLVM::addWeakReference):

  • ftl/FTLOSREntry.cpp: Added.

(JSC::FTL::prepareOSREntry):

  • ftl/FTLOSREntry.h: Added.
  • ftl/FTLOutput.h:

(JSC::FTL::Output::crashNonTerminal):
(JSC::FTL::Output::crash):

  • ftl/FTLState.cpp:

(JSC::FTL::State::State):

  • interpreter/Register.h:

(JSC::Register::unboxedDouble):

  • jit/JIT.cpp:

(JSC::JIT::emitEnterOptimizationCheck):

  • jit/JITCode.cpp:

(JSC::JITCode::ftlForOSREntry):

  • jit/JITCode.h:
  • jit/JITStubs.cpp:

(JSC::DEFINE_STUB_FUNCTION):

  • runtime/Executable.cpp:

(JSC::ScriptExecutable::newReplacementCodeBlockFor):

  • runtime/Options.h:
  • runtime/VM.cpp:

(JSC::VM::ensureWorklist):

  • runtime/VM.h:

LayoutTests:

Reviewed by Mark Hahnenberg.

Fix marsaglia to check the result instead of printing, and add a second
version that relies on OSR entry.

  • fast/js/regress/marsaglia-osr-entry-expected.txt: Added.
  • fast/js/regress/marsaglia-osr-entry.html: Added.
  • fast/js/regress/script-tests/marsaglia-osr-entry.js: Added.

(marsaglia):

  • fast/js/regress/script-tests/marsaglia.js:
File:
1 edited

  • trunk/Source/JavaScriptCore/dfg/DFGOperations.cpp

    r154935 r155023  
@@ -32,7 +32,13 @@
 #include "CommonSlowPaths.h"
 #include "CopiedSpaceInlines.h"
+#include "DFGDriver.h"
 #include "DFGOSRExit.h"
 #include "DFGRepatch.h"
 #include "DFGThunks.h"
+#include "DFGToFTLDeferredCompilationCallback.h"
+#include "DFGToFTLForOSREntryDeferredCompilationCallback.h"
+#include "DFGWorklist.h"
+#include "FTLForOSREntryJITCode.h"
+#include "FTLOSREntry.h"
 #include "HostCallReturnValue.h"
 #include "GetterSetter.h"
@@ -2015,4 +2021,185 @@
     codeBlock->reoptimize();
 }
+
+#if ENABLE(FTL_JIT)
+void DFG_OPERATION triggerTierUpNow(ExecState* exec)
+{
+    VM* vm = &exec->vm();
+    NativeCallFrameTracer tracer(vm, exec);
+    DeferGC deferGC(vm->heap);
+    CodeBlock* codeBlock = exec->codeBlock();
+
+    JITCode* jitCode = codeBlock->jitCode()->dfg();
+
+    if (Options::verboseOSR()) {
+        dataLog(
+            *codeBlock, ": Entered triggerTierUpNow with executeCounter = ",
+            jitCode->tierUpCounter, "\n");
+    }
+
+    if (codeBlock->baselineVersion()->m_didFailFTLCompilation) {
+        if (Options::verboseOSR())
+            dataLog("Deferring FTL-optimization of ", *codeBlock, " indefinitely because there was an FTL failure.\n");
+        jitCode->dontOptimizeAnytimeSoon(codeBlock);
+        return;
+    }
+
+    if (!jitCode->checkIfOptimizationThresholdReached(codeBlock)) {
+        if (Options::verboseOSR())
+            dataLog("Choosing not to FTL-optimize ", *codeBlock, " yet.\n");
+        return;
+    }
+
+    Worklist::State worklistState;
+    if (Worklist* worklist = vm->worklist.get()) {
+        worklistState = worklist->completeAllReadyPlansForVM(
+            *vm, CompilationKey(codeBlock->baselineVersion(), FTLMode));
+    } else
+        worklistState = Worklist::NotKnown;
+
+    if (worklistState == Worklist::Compiling) {
+        jitCode->setOptimizationThresholdBasedOnCompilationResult(
+            codeBlock, CompilationDeferred);
+        return;
+    }
+
+    if (codeBlock->hasOptimizedReplacement()) {
+        // That's great, we've compiled the code - next time we call this function,
+        // we'll enter that replacement.
+        jitCode->optimizeSoon(codeBlock);
+        return;
+    }
+
+    if (worklistState == Worklist::Compiled) {
+        // This means that we finished compiling, but failed somehow; in that case the
+        // thresholds will be set appropriately.
+        if (Options::verboseOSR())
+            dataLog("Code block ", *codeBlock, " was compiled but it doesn't have an optimized replacement.\n");
+        return;
+    }
+
+    // We need to compile the code.
+    compile(
+        *vm, codeBlock->newReplacement().get(), FTLMode, UINT_MAX, Operands<JSValue>(),
+        ToFTLDeferredCompilationCallback::create(codeBlock), vm->ensureWorklist());
+}
+
+char* DFG_OPERATION triggerOSREntryNow(
+    ExecState* exec, int32_t bytecodeIndex, int32_t streamIndex)
+{
+    VM* vm = &exec->vm();
+    NativeCallFrameTracer tracer(vm, exec);
+    DeferGC deferGC(vm->heap);
+    CodeBlock* codeBlock = exec->codeBlock();
+
+    JITCode* jitCode = codeBlock->jitCode()->dfg();
+
+    if (Options::verboseOSR()) {
+        dataLog(
+            *codeBlock, ": Entered triggerTierUpNow with executeCounter = ",
+            jitCode->tierUpCounter, "\n");
+    }
+
+    if (codeBlock->baselineVersion()->m_didFailFTLCompilation) {
+        if (Options::verboseOSR())
+            dataLog("Deferring FTL-optimization of ", *codeBlock, " indefinitely because there was an FTL failure.\n");
+        jitCode->dontOptimizeAnytimeSoon(codeBlock);
+        return 0;
+    }
+
+    if (!jitCode->checkIfOptimizationThresholdReached(codeBlock)) {
+        if (Options::verboseOSR())
+            dataLog("Choosing not to FTL-optimize ", *codeBlock, " yet.\n");
+        return 0;
+    }
+
+    Worklist::State worklistState;
+    if (Worklist* worklist = vm->worklist.get()) {
+        worklistState = worklist->completeAllReadyPlansForVM(
+            *vm, CompilationKey(codeBlock->baselineVersion(), FTLForOSREntryMode));
+    } else
+        worklistState = Worklist::NotKnown;
+
+    if (worklistState == Worklist::Compiling) {
+        ASSERT(!jitCode->osrEntryBlock);
+        jitCode->setOptimizationThresholdBasedOnCompilationResult(
+            codeBlock, CompilationDeferred);
+        return 0;
+    }
+
+    if (CodeBlock* entryBlock = jitCode->osrEntryBlock.get()) {
+        void* address = FTL::prepareOSREntry(
+            exec, codeBlock, entryBlock, bytecodeIndex, streamIndex);
+        if (address) {
+            jitCode->optimizeSoon(codeBlock);
+            return static_cast<char*>(address);
+        }
+
+        FTL::ForOSREntryJITCode* entryCode = entryBlock->jitCode()->ftlForOSREntry();
+        entryCode->countEntryFailure();
+        if (entryCode->entryFailureCount() <
+            Options::ftlOSREntryFailureCountForReoptimization()) {
+
+            jitCode->optimizeSoon(codeBlock);
+            return 0;
+        }
+
+        // OSR entry failed. Oh no! This implies that we need to retry. We retry
+        // without exponential backoff and we only do this for the entry code block.
+        jitCode->osrEntryBlock.clear();
+
+        jitCode->optimizeAfterWarmUp(codeBlock);
+        return 0;
+    }
+
+    if (worklistState == Worklist::Compiled) {
+        // This means that compilation failed and we already set the thresholds.
+        if (Options::verboseOSR())
+            dataLog("Code block ", *codeBlock, " was compiled but it doesn't have an optimized replacement.\n");
+        return 0;
+    }
+
+    // The first order of business is to trigger a for-entry compile.
+    Operands<JSValue> mustHandleValues;
+    jitCode->reconstruct(
+        exec, codeBlock, CodeOrigin(bytecodeIndex), streamIndex, mustHandleValues);
+    CompilationResult forEntryResult = DFG::compile(
+        *vm, codeBlock->newReplacement().get(), FTLForOSREntryMode, bytecodeIndex,
+        mustHandleValues, ToFTLForOSREntryDeferredCompilationCallback::create(codeBlock),
+        vm->ensureWorklist());
+
+    // But we also want to trigger a replacement compile. Of course, we don't want to
+    // trigger it if we don't need to. Note that this is kind of weird because we might
+    // have just finished an FTL compile and that compile failed or was invalidated.
+    // But this seems uncommon enough that we sort of don't care. It's certainly sound
+    // to fire off another compile right now so long as we're not already compiling and
+    // we don't already have an optimized replacement. Note, we don't do this for
+    // obviously bad cases like global code, where we know that there is a slim chance
+    // of this code being invoked ever again.
+    CompilationKey keyForReplacement(codeBlock->baselineVersion(), FTLMode);
+    if (codeBlock->codeType() != GlobalCode
+        && !codeBlock->hasOptimizedReplacement()
+        && (!vm->worklist.get()
+            || vm->worklist->compilationState(keyForReplacement) == Worklist::NotKnown)) {
+        compile(
+            *vm, codeBlock->newReplacement().get(), FTLMode, UINT_MAX, Operands<JSValue>(),
+            ToFTLDeferredCompilationCallback::create(codeBlock), vm->ensureWorklist());
+    }
+
+    if (forEntryResult != CompilationSuccessful)
+        return 0;
+
+    // It's possible that the for-entry compile already succeeded. In that case OSR
+    // entry will succeed unless we ran out of stack. It's not clear what we should do.
+    // We signal to try again after a while if that happens.
+    void* address = FTL::prepareOSREntry(
+        exec, codeBlock, jitCode->osrEntryBlock.get(), bytecodeIndex, streamIndex);
+    if (address)
+        jitCode->optimizeSoon(codeBlock);
+    else
+        jitCode->optimizeAfterWarmUp(codeBlock);
+    return static_cast<char*>(address);
+}
+#endif // ENABLE(FTL_JIT)
 
 } // extern "C"
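
The entry-failure handling in triggerOSREntryNow can be summarized with a stand-alone sketch. The names below (NextStep, EntryCodeSketch, afterEntryAttempt) are hypothetical, and the failure limit is passed in as a parameter here rather than read from Options::ftlOSREntryFailureCountForReoptimization(); the limits used in the walk-through are arbitrary.

```cpp
#include <cassert>

// Hypothetical names; a simplified model of the retry policy in the diff above.
enum class NextStep {
    EnterNow,         // prepareOSREntry succeeded: jump into the FTL code
    RetrySoon,        // entry failed, but keep the entry code block and try again soon
    DiscardAndWarmUp  // too many failures: drop the entry code block, wait for warm-up
};

struct EntryCodeSketch {
    unsigned failureCount = 0;
};

inline NextStep afterEntryAttempt(EntryCodeSketch& code, bool entrySucceeded, unsigned failureLimit)
{
    if (entrySucceeded)
        return NextStep::EnterNow;

    // Count the failure first, then compare against the limit with '<',
    // mirroring the order used in triggerOSREntryNow.
    ++code.failureCount;
    if (code.failureCount < failureLimit)
        return NextStep::RetrySoon;
    return NextStep::DiscardAndWarmUp;
}

inline void entryRetryDemo()
{
    EntryCodeSketch code;
    assert(afterEntryAttempt(code, false, 2) == NextStep::RetrySoon);        // 1 failure
    assert(afterEntryAttempt(code, false, 2) == NextStep::DiscardAndWarmUp); // 2 failures
    assert(afterEntryAttempt(code, true, 2) == NextStep::EnterNow);
}
```

In the patch, the discard step corresponds to jitCode->osrEntryBlock.clear() followed by optimizeAfterWarmUp(), so a fresh for-entry compile can be requested by a later trigger.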