Timestamp:
Sep 3, 2013, 11:26:04 PM
Author:
[email protected]
Message:

The DFG should be able to tier-up and OSR enter into the FTL
https://bugs.webkit.org/show_bug.cgi?id=112838

Source/JavaScriptCore:

Reviewed by Mark Hahnenberg.

This adds the ability for the DFG to tier-up into the FTL. This works in both
of the expected tier-up modes:

Replacement: frequently called functions eventually have their entrypoint
replaced with one that goes into FTL-compiled code. Note that this is a
slow-down for now, since we don't yet have LLVM calling convention integration.

OSR entry: code stuck in hot loops gets OSR'd into the FTL from the DFG.

This means that if the DFG detects that a function is an FTL candidate, it
inserts execution counting code similar to the kind that the baseline JIT
would use. If a loop count trips in a loop header that is an OSR candidate
(that is, the loop is not inlined), we do OSR; otherwise we do replacement.
OSR almost always also implies future replacement.
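
The trigger decision described above can be sketched roughly as follows. This
is a minimal illustration, not the code in this patch: TierUpCounter,
CheckSite, TierUpAction and onCheck are hypothetical names, and the counter is
assumed to start negative and count up toward zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; this is not the JavaScriptCore API.
enum class TierUpAction { KeepRunning, OSREnter, Replace };

struct TierUpCounter {
    int32_t count; // assumed to start negative and count up toward zero

    // Adds to the counter and reports whether the threshold was crossed.
    bool bump(int32_t increment)
    {
        count += increment;
        return count >= 0;
    }
};

// Describes the place where the inserted counting code runs.
struct CheckSite {
    bool isLoopHeader;
    bool isInlined;
};

// A loop header is an OSR candidate only if the loop is not inlined.
inline bool isOSRCandidate(const CheckSite& site)
{
    return site.isLoopHeader && !site.isInlined;
}

// Decides what to do when the counting code at a check site runs.
inline TierUpAction onCheck(TierUpCounter& counter, const CheckSite& site, int32_t increment)
{
    assert(increment > 0);
    if (!counter.bump(increment))
        return TierUpAction::KeepRunning;
    return isOSRCandidate(site) ? TierUpAction::OSREnter : TierUpAction::Replace;
}
```

In this sketch an inlined loop header that trips the counter falls through to
replacement, matching the "otherwise we do replacement" rule above.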

OSR entry into the FTL is really cool. It uses a specialized FTL compile of
the code, where early in the DFG pipeline we replace the original root block
with an OSR entrypoint block that jumps to the pre-header of the hot loop.
The OSR entrypoint loads all live state at the loop pre-header using loads
from a scratch buffer, which gets populated by the runtime's OSR entry
preparation code (FTL::prepareOSREntry()). This approach appears to work well
with all of our subsequent optimizations, including prediction propagation,
CFA, and LICM. LLVM seems happy with it, too. Best of all, it works naturally
with concurrent compilation: when we hit the tier-up trigger, we spawn a
compilation plan at the bytecode index from which we triggered; once the
compilation finishes, the next trigger will try to enter at that bytecode
index. If it can't (for example, because the code has moved on to another
loop), then we just try again. Loops that get hot enough for OSR entry (about
25,000 iterations) will probably still be running when a concurrent compile
finishes, so this doesn't appear to be a big problem.
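
The trigger-and-retry flow described above can be modeled with a small state
machine. This is a sketch under stated assumptions, not the FTL
implementation: OSREntryState, EntryCompile and TriggerOutcome are hypothetical
names, the scratch buffer is modeled as a plain vector, and "try again" is
assumed here to mean starting a fresh compile at the new bytecode index.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical model; none of these names come from JavaScriptCore.
struct EntryCompile {
    unsigned bytecodeIndex; // loop header the compile was specialized for
    bool ready;             // set once the concurrent compile has finished
};

enum class TriggerOutcome { StartedCompile, StillCompiling, Entered, Retried };

struct OSREntryState {
    std::optional<EntryCompile> compile;
    std::vector<long> scratchBuffer; // stands in for the OSR entry scratch buffer

    // Called each time the tier-up trigger fires at a loop header.
    TriggerOutcome onTrigger(unsigned bytecodeIndex, const std::vector<long>& liveValues)
    {
        if (!compile) {
            compile = EntryCompile{bytecodeIndex, false};
            return TriggerOutcome::StartedCompile;
        }
        if (!compile->ready)
            return TriggerOutcome::StillCompiling;
        if (compile->bytecodeIndex != bytecodeIndex) {
            // The code moved on to another loop. Assumption: retry by
            // starting a new compile at the index that triggered now.
            compile = EntryCompile{bytecodeIndex, false};
            return TriggerOutcome::Retried;
        }
        // Populate the buffer that the OSR entrypoint block loads from.
        scratchBuffer = liveValues;
        return TriggerOutcome::Entered;
    }

    // Called when the concurrent compile completes.
    void finishCompile()
    {
        assert(compile);
        compile->ready = true;
    }
};
```

The model makes the cost of a mismatch visible: a trigger at a different loop
header never enters, so entry only pays off when the same loop is still
running after the compile finishes.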

This immediately gives us a 70% speed-up on imaging-gaussian-blur. We could
get a bigger speed-up by adding some more intelligence and tweaking LLVM to
compile code faster. Those things will happen eventually but this is a good
start. Probably this code will see more tuning as we get more coverage in the
FTL JIT, but I'll worry about that in future patches.

  • CMakeLists.txt:
  • GNUmakefile.list.am:
  • JavaScriptCore.xcodeproj/project.pbxproj:
  • Target.pri:
  • bytecode/CodeBlock.cpp:

(JSC::CodeBlock::CodeBlock):
(JSC::CodeBlock::hasOptimizedReplacement):
(JSC::CodeBlock::setOptimizationThresholdBasedOnCompilationResult):

  • bytecode/CodeBlock.h:
  • dfg/DFGAbstractInterpreterInlines.h:

(JSC::DFG::::executeEffects):

  • dfg/DFGByteCodeParser.cpp:

(JSC::DFG::ByteCodeParser::parseBlock):
(JSC::DFG::ByteCodeParser::parse):

  • dfg/DFGCFGSimplificationPhase.cpp:

(JSC::DFG::CFGSimplificationPhase::run):

  • dfg/DFGClobberize.h:

(JSC::DFG::clobberize):

  • dfg/DFGDriver.cpp:

(JSC::DFG::compileImpl):
(JSC::DFG::compile):

  • dfg/DFGDriver.h:
  • dfg/DFGFixupPhase.cpp:

(JSC::DFG::FixupPhase::fixupNode):

  • dfg/DFGGraph.cpp:

(JSC::DFG::Graph::dump):
(JSC::DFG::Graph::killBlockAndItsContents):
(JSC::DFG::Graph::killUnreachableBlocks):

  • dfg/DFGGraph.h:
  • dfg/DFGInPlaceAbstractState.cpp:

(JSC::DFG::InPlaceAbstractState::initialize):

  • dfg/DFGJITCode.cpp:

(JSC::DFG::JITCode::reconstruct):
(JSC::DFG::JITCode::checkIfOptimizationThresholdReached):
(JSC::DFG::JITCode::optimizeNextInvocation):
(JSC::DFG::JITCode::dontOptimizeAnytimeSoon):
(JSC::DFG::JITCode::optimizeAfterWarmUp):
(JSC::DFG::JITCode::optimizeSoon):
(JSC::DFG::JITCode::forceOptimizationSlowPathConcurrently):
(JSC::DFG::JITCode::setOptimizationThresholdBasedOnCompilationResult):

  • dfg/DFGJITCode.h:
  • dfg/DFGJITFinalizer.cpp:

(JSC::DFG::JITFinalizer::finalize):
(JSC::DFG::JITFinalizer::finalizeFunction):
(JSC::DFG::JITFinalizer::finalizeCommon):

  • dfg/DFGLoopPreHeaderCreationPhase.cpp:

(JSC::DFG::createPreHeader):
(JSC::DFG::LoopPreHeaderCreationPhase::run):

  • dfg/DFGLoopPreHeaderCreationPhase.h:
  • dfg/DFGNode.h:

(JSC::DFG::Node::hasUnlinkedLocal):
(JSC::DFG::Node::unlinkedLocal):

  • dfg/DFGNodeType.h:
  • dfg/DFGOSREntry.cpp:

(JSC::DFG::prepareOSREntry):

  • dfg/DFGOSREntrypointCreationPhase.cpp: Added.

(JSC::DFG::OSREntrypointCreationPhase::OSREntrypointCreationPhase):
(JSC::DFG::OSREntrypointCreationPhase::run):
(JSC::DFG::performOSREntrypointCreation):

  • dfg/DFGOSREntrypointCreationPhase.h: Added.
  • dfg/DFGOperations.cpp:
  • dfg/DFGOperations.h:
  • dfg/DFGPlan.cpp:

(JSC::DFG::Plan::Plan):
(JSC::DFG::Plan::compileInThread):
(JSC::DFG::Plan::compileInThreadImpl):

  • dfg/DFGPlan.h:
  • dfg/DFGPredictionInjectionPhase.cpp:

(JSC::DFG::PredictionInjectionPhase::run):

  • dfg/DFGPredictionPropagationPhase.cpp:

(JSC::DFG::PredictionPropagationPhase::propagate):

  • dfg/DFGSafeToExecute.h:

(JSC::DFG::safeToExecute):

  • dfg/DFGSpeculativeJIT32_64.cpp:

(JSC::DFG::SpeculativeJIT::compile):

  • dfg/DFGSpeculativeJIT64.cpp:

(JSC::DFG::SpeculativeJIT::compile):

  • dfg/DFGTierUpCheckInjectionPhase.cpp: Added.

(JSC::DFG::TierUpCheckInjectionPhase::TierUpCheckInjectionPhase):
(JSC::DFG::TierUpCheckInjectionPhase::run):
(JSC::DFG::performTierUpCheckInjection):

  • dfg/DFGTierUpCheckInjectionPhase.h: Added.
  • dfg/DFGToFTLDeferredCompilationCallback.cpp: Added.

(JSC::DFG::ToFTLDeferredCompilationCallback::ToFTLDeferredCompilationCallback):
(JSC::DFG::ToFTLDeferredCompilationCallback::~ToFTLDeferredCompilationCallback):
(JSC::DFG::ToFTLDeferredCompilationCallback::create):
(JSC::DFG::ToFTLDeferredCompilationCallback::compilationDidBecomeReadyAsynchronously):
(JSC::DFG::ToFTLDeferredCompilationCallback::compilationDidComplete):

  • dfg/DFGToFTLDeferredCompilationCallback.h: Added.
  • dfg/DFGToFTLForOSREntryDeferredCompilationCallback.cpp: Added.

(JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::ToFTLForOSREntryDeferredCompilationCallback):
(JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::~ToFTLForOSREntryDeferredCompilationCallback):
(JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::create):
(JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::compilationDidBecomeReadyAsynchronously):
(JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::compilationDidComplete):

  • dfg/DFGToFTLForOSREntryDeferredCompilationCallback.h: Added.
  • dfg/DFGWorklist.cpp:

(JSC::DFG::globalWorklist):

  • dfg/DFGWorklist.h:
  • ftl/FTLCapabilities.cpp:

(JSC::FTL::canCompile):

  • ftl/FTLCapabilities.h:
  • ftl/FTLForOSREntryJITCode.cpp: Added.

(JSC::FTL::ForOSREntryJITCode::ForOSREntryJITCode):
(JSC::FTL::ForOSREntryJITCode::~ForOSREntryJITCode):
(JSC::FTL::ForOSREntryJITCode::ftlForOSREntry):
(JSC::FTL::ForOSREntryJITCode::initializeEntryBuffer):

  • ftl/FTLForOSREntryJITCode.h: Added.

(JSC::FTL::ForOSREntryJITCode::entryBuffer):
(JSC::FTL::ForOSREntryJITCode::setBytecodeIndex):
(JSC::FTL::ForOSREntryJITCode::bytecodeIndex):
(JSC::FTL::ForOSREntryJITCode::countEntryFailure):
(JSC::FTL::ForOSREntryJITCode::entryFailureCount):

  • ftl/FTLJITFinalizer.cpp:

(JSC::FTL::JITFinalizer::finalizeFunction):

  • ftl/FTLLink.cpp:

(JSC::FTL::link):

  • ftl/FTLLowerDFGToLLVM.cpp:

(JSC::FTL::LowerDFGToLLVM::compileBlock):
(JSC::FTL::LowerDFGToLLVM::compileNode):
(JSC::FTL::LowerDFGToLLVM::compileExtractOSREntryLocal):
(JSC::FTL::LowerDFGToLLVM::compileGetLocal):
(JSC::FTL::LowerDFGToLLVM::addWeakReference):

  • ftl/FTLOSREntry.cpp: Added.

(JSC::FTL::prepareOSREntry):

  • ftl/FTLOSREntry.h: Added.
  • ftl/FTLOutput.h:

(JSC::FTL::Output::crashNonTerminal):
(JSC::FTL::Output::crash):

  • ftl/FTLState.cpp:

(JSC::FTL::State::State):

  • interpreter/Register.h:

(JSC::Register::unboxedDouble):

  • jit/JIT.cpp:

(JSC::JIT::emitEnterOptimizationCheck):

  • jit/JITCode.cpp:

(JSC::JITCode::ftlForOSREntry):

  • jit/JITCode.h:
  • jit/JITStubs.cpp:

(JSC::DEFINE_STUB_FUNCTION):

  • runtime/Executable.cpp:

(JSC::ScriptExecutable::newReplacementCodeBlockFor):

  • runtime/Options.h:
  • runtime/VM.cpp:

(JSC::VM::ensureWorklist):

  • runtime/VM.h:

LayoutTests:

Reviewed by Mark Hahnenberg.

Fix marsaglia to check the result instead of printing, and add a second
version that relies on OSR entry.

  • fast/js/regress/marsaglia-osr-entry-expected.txt: Added.
  • fast/js/regress/marsaglia-osr-entry.html: Added.
  • fast/js/regress/script-tests/marsaglia-osr-entry.js: Added.

(marsaglia):

  • fast/js/regress/script-tests/marsaglia.js:
File:
1 edited

  • trunk/Source/JavaScriptCore/dfg/DFGJITCode.cpp

    r153216 r155023  
    #if ENABLE(DFG_JIT)

    #include "CodeBlock.h"

    namespace JSC { namespace DFG {

     
    }

    void JITCode::reconstruct(
        CodeBlock* codeBlock, CodeOrigin codeOrigin, unsigned streamIndex,
        Operands<ValueRecovery>& result)
    {
        variableEventStream.reconstruct(
            codeBlock, codeOrigin, minifiedDFG, streamIndex, result);
    }

    void JITCode::reconstruct(
        ExecState* exec, CodeBlock* codeBlock, CodeOrigin codeOrigin, unsigned streamIndex,
        Operands<JSValue>& result)
    {
        Operands<ValueRecovery> recoveries;
        reconstruct(codeBlock, codeOrigin, streamIndex, recoveries);

        result = Operands<JSValue>(OperandsLike, recoveries);
        for (size_t i = result.size(); i--;) {
            int operand = result.operandForIndex(i);

            if (operandIsArgument(operand)
                && !operandToArgument(operand)
                && codeBlock->codeType() == FunctionCode
                && codeBlock->specializationKind() == CodeForConstruct) {
                // Ugh. If we're in a constructor, the 'this' argument may hold garbage. It will
                // also never be used. It doesn't matter what we put into the value for this,
                // but it has to be an actual value that can be grokked by subsequent DFG passes,
                // so we sanitize it here by turning it into Undefined.
                result[i] = jsUndefined();
                continue;
            }

            ValueRecovery recovery = recoveries[i];
            JSValue value;
            switch (recovery.technique()) {
            case AlreadyInJSStack:
            case AlreadyInJSStackAsUnboxedCell:
            case AlreadyInJSStackAsUnboxedBoolean:
                value = exec->r(operand).jsValue();
                break;
            case AlreadyInJSStackAsUnboxedInt32:
                value = jsNumber(exec->r(operand).unboxedInt32());
                break;
            case AlreadyInJSStackAsUnboxedDouble:
                value = jsDoubleNumber(exec->r(operand).unboxedDouble());
                break;
            case Constant:
                value = recovery.constant();
                break;
            default:
                RELEASE_ASSERT_NOT_REACHED();
                break;
            }
            result[i] = value;
        }
    }

    #if ENABLE(FTL_JIT)
    bool JITCode::checkIfOptimizationThresholdReached(CodeBlock* codeBlock)
    {
        ASSERT(codeBlock->jitType() == JITCode::DFGJIT);
        return tierUpCounter.checkIfThresholdCrossedAndSet(codeBlock->baselineVersion());
    }

    void JITCode::optimizeNextInvocation(CodeBlock* codeBlock)
    {
        ASSERT(codeBlock->jitType() == JITCode::DFGJIT);
        if (Options::verboseOSR())
            dataLog(*codeBlock, ": FTL-optimizing next invocation.\n");
        tierUpCounter.setNewThreshold(0, codeBlock->baselineVersion());
    }

    void JITCode::dontOptimizeAnytimeSoon(CodeBlock* codeBlock)
    {
        ASSERT(codeBlock->jitType() == JITCode::DFGJIT);
        if (Options::verboseOSR())
            dataLog(*codeBlock, ": Not FTL-optimizing anytime soon.\n");
        tierUpCounter.deferIndefinitely();
    }

    void JITCode::optimizeAfterWarmUp(CodeBlock* codeBlock)
    {
        ASSERT(codeBlock->jitType() == JITCode::DFGJIT);
        if (Options::verboseOSR())
            dataLog(*codeBlock, ": FTL-optimizing after warm-up.\n");
        CodeBlock* baseline = codeBlock->baselineVersion();
        tierUpCounter.setNewThreshold(
            baseline->adjustedCounterValue(Options::thresholdForFTLOptimizeAfterWarmUp()),
            baseline);
    }

    void JITCode::optimizeSoon(CodeBlock* codeBlock)
    {
        ASSERT(codeBlock->jitType() == JITCode::DFGJIT);
        if (Options::verboseOSR())
            dataLog(*codeBlock, ": FTL-optimizing soon.\n");
        CodeBlock* baseline = codeBlock->baselineVersion();
        tierUpCounter.setNewThreshold(
            baseline->adjustedCounterValue(Options::thresholdForFTLOptimizeSoon()),
            baseline);
    }

    void JITCode::forceOptimizationSlowPathConcurrently(CodeBlock* codeBlock)
    {
        ASSERT(codeBlock->jitType() == JITCode::DFGJIT);
        if (Options::verboseOSR())
            dataLog(*codeBlock, ": Forcing slow path concurrently for FTL entry.\n");
        tierUpCounter.forceSlowPathConcurrently();
    }

    void JITCode::setOptimizationThresholdBasedOnCompilationResult(
        CodeBlock* codeBlock, CompilationResult result)
    {
        ASSERT(codeBlock->jitType() == JITCode::DFGJIT);
        switch (result) {
        case CompilationSuccessful:
            optimizeNextInvocation(codeBlock);
            return;
        case CompilationFailed:
            dontOptimizeAnytimeSoon(codeBlock);
            codeBlock->baselineVersion()->m_didFailFTLCompilation = true;
            return;
        case CompilationDeferred:
            optimizeAfterWarmUp(codeBlock);
            return;
        case CompilationInvalidated:
            // This is weird - it will only happen in cases when the DFG code block (i.e.
            // the code block that this JITCode belongs to) is also invalidated. So it
            // doesn't really matter what we do. But, we do the right thing anyway. Note
            // that us counting the reoptimization actually means that we might count it
            // twice. But that's generally OK. It's better to overcount reoptimizations
            // than it is to undercount them.
            codeBlock->baselineVersion()->countReoptimization();
            optimizeAfterWarmUp(codeBlock);
            return;
        }
        RELEASE_ASSERT_NOT_REACHED();
    }
    #endif // ENABLE(FTL_JIT)

    } } // namespace JSC::DFG
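
The per-operand switch in the second JITCode::reconstruct() overload above
turns a recorded recovery technique plus a raw stack slot into a usable value.
The following is a simplified, numbers-only sketch of that idea. Technique,
Recovery and recoverNumber are hypothetical names, and the slot layout (Int32
payload in the low 32 bits, double stored as raw IEEE bits) is an assumption
made for illustration, not a statement about JSC's Register layout.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical, numbers-only model of per-operand value recovery.
enum class Technique { UnboxedInt32InSlot, UnboxedDoubleInSlot, Constant };

struct Recovery {
    Technique technique;
    double constant; // only meaningful for Technique::Constant
};

// Interprets a 64-bit slot according to the recorded technique.
inline double recoverNumber(uint64_t slotBits, const Recovery& recovery)
{
    switch (recovery.technique) {
    case Technique::UnboxedInt32InSlot:
        // Assumption: the Int32 payload lives in the low 32 bits of the slot.
        return static_cast<int32_t>(static_cast<uint32_t>(slotBits));
    case Technique::UnboxedDoubleInSlot: {
        double value;
        static_assert(sizeof value == sizeof slotBits, "double must be 64 bits");
        std::memcpy(&value, &slotBits, sizeof value);
        return value;
    }
    case Technique::Constant:
        // Constants come from the recovery itself; the slot is not read.
        return recovery.constant;
    }
    assert(false && "unknown recovery technique");
    return 0;
}
```

As in the original switch, the constant case never reads the slot, which is
why a slot holding stale bits is harmless for that technique.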