Timestamp:
Sep 3, 2013, 11:26:04 PM
Author:
[email protected]
Message:

The DFG should be able to tier-up and OSR enter into the FTL
https://bugs.webkit.org/show_bug.cgi?id=112838

Source/JavaScriptCore:

Reviewed by Mark Hahnenberg.

This adds the ability for the DFG to tier-up into the FTL. This works in both
of the expected tier-up modes:

Replacement: frequently called functions eventually have their entrypoint
replaced with one that goes into FTL-compiled code. Note that this will be a
slow-down for now, since we don't yet have LLVM calling convention integration.

OSR entry: code stuck in hot loops gets OSR'd into the FTL from the DFG.

This means that if the DFG detects that a function is an FTL candidate, it
inserts execution counting code similar to the kind that the baseline JIT
would use. If a loop count trips in a loop header that is an OSR candidate
(that is, the loop is not inlined), we do OSR; otherwise we do replacement.
OSR almost always also implies future replacement.
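
The choice between the two modes can be sketched roughly as follows. This is
a minimal illustration, not the code in this patch: TierUpCounter,
TierUpAction and onCountingSite are hypothetical names, and the threshold is
an arbitrary parameter rather than the value JavaScriptCore uses.

```cpp
#include <cstdint>

// Hypothetical names throughout; this is not JavaScriptCore's API.
enum class TierUpAction { KeepRunning, OSREnter, Replace };

struct TierUpCounter {
    int32_t count = 0;
    int32_t threshold = 1000; // illustrative value only

    // Bumps the counter and reports whether the threshold has been reached.
    bool tick(int32_t weight = 1)
    {
        count += weight;
        return count >= threshold;
    }
};

// Called at a counting site (function entry or loop header).
// A loop header is treated as an OSR candidate only if the loop is not inlined.
inline TierUpAction onCountingSite(TierUpCounter& counter, bool atLoopHeader, bool loopIsInlined)
{
    if (!counter.tick())
        return TierUpAction::KeepRunning;
    if (atLoopHeader && !loopIsInlined)
        return TierUpAction::OSREnter;
    return TierUpAction::Replace;
}
```

In this sketch a tripped counter at an inlined loop header, or at function
entry, falls through to replacement, which matches the "otherwise" case above.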

OSR entry into the FTL is really cool. It uses a specialized FTL compile of
the code, where early in the DFG pipeline we replace the original root block
with an OSR entrypoint block that jumps to the pre-header of the hot loop.
The OSR entrypoint loads all live state at the loop pre-header using loads
from a scratch buffer, which gets populated by the runtime's OSR entry
preparation code (FTL::prepareOSREntry()). This approach appears to work well
with all of our subsequent optimizations, including prediction propagation,
CFA, and LICM. LLVM seems happy with it, too. Best of all, it works naturally
with concurrent compilation: when we hit the tier-up trigger we spawn a
compilation plan at the bytecode index from which we triggered; once the
compilation finishes, the next trigger will try to enter at that bytecode
index. If it can't (for example, because the code has moved on to another
loop), then we just try again. Loops that get hot enough for OSR entry (about
25,000 iterations) will probably still be running when a concurrent compile
finishes, so this doesn't appear to be a big problem.
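
The trigger behavior described above can be modeled as a small state machine.
The sketch below is an assumption-laden illustration, not the patch's code:
OSREntryState, TriggerResult and onOSRTrigger are hypothetical names, the
compile is represented by a flag instead of a real worklist plan, and what
"try again" does beyond recording the failure is left unspecified.

```cpp
#include <optional>

// Hypothetical model; names are illustrative and not JavaScriptCore's.
enum class TriggerResult { StartedCompile, StillCompiling, Entered, RetryLater };

struct OSREntryState {
    std::optional<unsigned> targetBytecodeIndex; // index the plan was spawned for
    bool compileFinished = false;                // set when the concurrent compile completes
    unsigned entryFailures = 0;                  // times entry was attempted but not possible
};

// Called each time the tier-up trigger fires at a loop header.
inline TriggerResult onOSRTrigger(OSREntryState& state, unsigned bytecodeIndex)
{
    if (!state.targetBytecodeIndex) {
        // Stands in for spawning a compilation plan at this bytecode index.
        state.targetBytecodeIndex = bytecodeIndex;
        return TriggerResult::StartedCompile;
    }
    if (!state.compileFinished)
        return TriggerResult::StillCompiling;
    if (*state.targetBytecodeIndex == bytecodeIndex)
        return TriggerResult::Entered;
    // The code has moved on to another loop; record the miss and try again later.
    ++state.entryFailures;
    return TriggerResult::RetryLater;
}
```

A later trigger at the original index still enters, which is why a
long-running hot loop tends to be caught once the compile finishes.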

This immediately gives us a 70% speed-up on imaging-gaussian-blur. We could
get a bigger speed-up by adding some more intelligence and tweaking LLVM to
compile code faster. Those things will happen eventually but this is a good
start. This code will probably see more tuning as we get more coverage in the
FTL JIT, but I'll worry about that in future patches.

  • CMakeLists.txt:
  • GNUmakefile.list.am:
  • JavaScriptCore.xcodeproj/project.pbxproj:
  • Target.pri:
  • bytecode/CodeBlock.cpp:

(JSC::CodeBlock::CodeBlock):
(JSC::CodeBlock::hasOptimizedReplacement):
(JSC::CodeBlock::setOptimizationThresholdBasedOnCompilationResult):

  • bytecode/CodeBlock.h:
  • dfg/DFGAbstractInterpreterInlines.h:

(JSC::DFG::::executeEffects):

  • dfg/DFGByteCodeParser.cpp:

(JSC::DFG::ByteCodeParser::parseBlock):
(JSC::DFG::ByteCodeParser::parse):

  • dfg/DFGCFGSimplificationPhase.cpp:

(JSC::DFG::CFGSimplificationPhase::run):

  • dfg/DFGClobberize.h:

(JSC::DFG::clobberize):

  • dfg/DFGDriver.cpp:

(JSC::DFG::compileImpl):
(JSC::DFG::compile):

  • dfg/DFGDriver.h:
  • dfg/DFGFixupPhase.cpp:

(JSC::DFG::FixupPhase::fixupNode):

  • dfg/DFGGraph.cpp:

(JSC::DFG::Graph::dump):
(JSC::DFG::Graph::killBlockAndItsContents):
(JSC::DFG::Graph::killUnreachableBlocks):

  • dfg/DFGGraph.h:
  • dfg/DFGInPlaceAbstractState.cpp:

(JSC::DFG::InPlaceAbstractState::initialize):

  • dfg/DFGJITCode.cpp:

(JSC::DFG::JITCode::reconstruct):
(JSC::DFG::JITCode::checkIfOptimizationThresholdReached):
(JSC::DFG::JITCode::optimizeNextInvocation):
(JSC::DFG::JITCode::dontOptimizeAnytimeSoon):
(JSC::DFG::JITCode::optimizeAfterWarmUp):
(JSC::DFG::JITCode::optimizeSoon):
(JSC::DFG::JITCode::forceOptimizationSlowPathConcurrently):
(JSC::DFG::JITCode::setOptimizationThresholdBasedOnCompilationResult):

  • dfg/DFGJITCode.h:
  • dfg/DFGJITFinalizer.cpp:

(JSC::DFG::JITFinalizer::finalize):
(JSC::DFG::JITFinalizer::finalizeFunction):
(JSC::DFG::JITFinalizer::finalizeCommon):

  • dfg/DFGLoopPreHeaderCreationPhase.cpp:

(JSC::DFG::createPreHeader):
(JSC::DFG::LoopPreHeaderCreationPhase::run):

  • dfg/DFGLoopPreHeaderCreationPhase.h:
  • dfg/DFGNode.h:

(JSC::DFG::Node::hasUnlinkedLocal):
(JSC::DFG::Node::unlinkedLocal):

  • dfg/DFGNodeType.h:
  • dfg/DFGOSREntry.cpp:

(JSC::DFG::prepareOSREntry):

  • dfg/DFGOSREntrypointCreationPhase.cpp: Added.

(JSC::DFG::OSREntrypointCreationPhase::OSREntrypointCreationPhase):
(JSC::DFG::OSREntrypointCreationPhase::run):
(JSC::DFG::performOSREntrypointCreation):

  • dfg/DFGOSREntrypointCreationPhase.h: Added.
  • dfg/DFGOperations.cpp:
  • dfg/DFGOperations.h:
  • dfg/DFGPlan.cpp:

(JSC::DFG::Plan::Plan):
(JSC::DFG::Plan::compileInThread):
(JSC::DFG::Plan::compileInThreadImpl):

  • dfg/DFGPlan.h:
  • dfg/DFGPredictionInjectionPhase.cpp:

(JSC::DFG::PredictionInjectionPhase::run):

  • dfg/DFGPredictionPropagationPhase.cpp:

(JSC::DFG::PredictionPropagationPhase::propagate):

  • dfg/DFGSafeToExecute.h:

(JSC::DFG::safeToExecute):

  • dfg/DFGSpeculativeJIT32_64.cpp:

(JSC::DFG::SpeculativeJIT::compile):

  • dfg/DFGSpeculativeJIT64.cpp:

(JSC::DFG::SpeculativeJIT::compile):

  • dfg/DFGTierUpCheckInjectionPhase.cpp: Added.

(JSC::DFG::TierUpCheckInjectionPhase::TierUpCheckInjectionPhase):
(JSC::DFG::TierUpCheckInjectionPhase::run):
(JSC::DFG::performTierUpCheckInjection):

  • dfg/DFGTierUpCheckInjectionPhase.h: Added.
  • dfg/DFGToFTLDeferredCompilationCallback.cpp: Added.

(JSC::DFG::ToFTLDeferredCompilationCallback::ToFTLDeferredCompilationCallback):
(JSC::DFG::ToFTLDeferredCompilationCallback::~ToFTLDeferredCompilationCallback):
(JSC::DFG::ToFTLDeferredCompilationCallback::create):
(JSC::DFG::ToFTLDeferredCompilationCallback::compilationDidBecomeReadyAsynchronously):
(JSC::DFG::ToFTLDeferredCompilationCallback::compilationDidComplete):

  • dfg/DFGToFTLDeferredCompilationCallback.h: Added.
  • dfg/DFGToFTLForOSREntryDeferredCompilationCallback.cpp: Added.

(JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::ToFTLForOSREntryDeferredCompilationCallback):
(JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::~ToFTLForOSREntryDeferredCompilationCallback):
(JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::create):
(JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::compilationDidBecomeReadyAsynchronously):
(JSC::DFG::ToFTLForOSREntryDeferredCompilationCallback::compilationDidComplete):

  • dfg/DFGToFTLForOSREntryDeferredCompilationCallback.h: Added.
  • dfg/DFGWorklist.cpp:

(JSC::DFG::globalWorklist):

  • dfg/DFGWorklist.h:
  • ftl/FTLCapabilities.cpp:

(JSC::FTL::canCompile):

  • ftl/FTLCapabilities.h:
  • ftl/FTLForOSREntryJITCode.cpp: Added.

(JSC::FTL::ForOSREntryJITCode::ForOSREntryJITCode):
(JSC::FTL::ForOSREntryJITCode::~ForOSREntryJITCode):
(JSC::FTL::ForOSREntryJITCode::ftlForOSREntry):
(JSC::FTL::ForOSREntryJITCode::initializeEntryBuffer):

  • ftl/FTLForOSREntryJITCode.h: Added.

(JSC::FTL::ForOSREntryJITCode::entryBuffer):
(JSC::FTL::ForOSREntryJITCode::setBytecodeIndex):
(JSC::FTL::ForOSREntryJITCode::bytecodeIndex):
(JSC::FTL::ForOSREntryJITCode::countEntryFailure):
(JSC::FTL::ForOSREntryJITCode::entryFailureCount):

  • ftl/FTLJITFinalizer.cpp:

(JSC::FTL::JITFinalizer::finalizeFunction):

  • ftl/FTLLink.cpp:

(JSC::FTL::link):

  • ftl/FTLLowerDFGToLLVM.cpp:

(JSC::FTL::LowerDFGToLLVM::compileBlock):
(JSC::FTL::LowerDFGToLLVM::compileNode):
(JSC::FTL::LowerDFGToLLVM::compileExtractOSREntryLocal):
(JSC::FTL::LowerDFGToLLVM::compileGetLocal):
(JSC::FTL::LowerDFGToLLVM::addWeakReference):

  • ftl/FTLOSREntry.cpp: Added.

(JSC::FTL::prepareOSREntry):

  • ftl/FTLOSREntry.h: Added.
  • ftl/FTLOutput.h:

(JSC::FTL::Output::crashNonTerminal):
(JSC::FTL::Output::crash):

  • ftl/FTLState.cpp:

(JSC::FTL::State::State):

  • interpreter/Register.h:

(JSC::Register::unboxedDouble):

  • jit/JIT.cpp:

(JSC::JIT::emitEnterOptimizationCheck):

  • jit/JITCode.cpp:

(JSC::JITCode::ftlForOSREntry):

  • jit/JITCode.h:
  • jit/JITStubs.cpp:

(JSC::DEFINE_STUB_FUNCTION):

  • runtime/Executable.cpp:

(JSC::ScriptExecutable::newReplacementCodeBlockFor):

  • runtime/Options.h:
  • runtime/VM.cpp:

(JSC::VM::ensureWorklist):

  • runtime/VM.h:

LayoutTests:

Reviewed by Mark Hahnenberg.

Fix marsaglia to check the result instead of printing, and add a second
version that relies on OSR entry.

  • fast/js/regress/marsaglia-osr-entry-expected.txt: Added.
  • fast/js/regress/marsaglia-osr-entry.html: Added.
  • fast/js/regress/script-tests/marsaglia-osr-entry.js: Added.

(marsaglia):

  • fast/js/regress/script-tests/marsaglia.js:
File:
1 edited

  • trunk/Source/JavaScriptCore/ftl/FTLCapabilities.cpp

--- trunk/Source/JavaScriptCore/ftl/FTLCapabilities.cpp (r153292)
+++ trunk/Source/JavaScriptCore/ftl/FTLCapabilities.cpp (r155023)
@@ -33,6 +33,9 @@
 using namespace DFG;
 
-inline bool canCompile(Node* node)
+inline CapabilityLevel canCompile(Node* node)
 {
+    // NOTE: If we ever have phantom arguments, we can compile them but we cannot
+    // OSR enter.
+
     switch (node->op()) {
     case JSConstant:
@@ -82,4 +85,6 @@
     case Phi:
     case Upsilon:
+    case ExtractOSREntryLocal:
+    case LoopHint:
         // These are OK.
         break;
@@ -91,5 +96,5 @@
             break;
         default:
-            return false;
+            return CannotCompile;
         }
         break;
@@ -97,5 +102,5 @@
         switch (node->arrayMode().type()) {
         case Array::ForceExit:
-            return true;
+            return CanCompileAndOSREnter;
         case Array::Int32:
         case Array::Double:
@@ -103,5 +108,5 @@
             break;
         default:
-            return false;
+            return CannotCompile;
         }
         switch (node->arrayMode().speculation()) {
@@ -110,5 +115,5 @@
             break;
         default:
-            return false;
+            return CannotCompile;
         }
         break;
@@ -117,5 +122,5 @@
         switch (node->arrayMode().type()) {
         case Array::ForceExit:
-            return true;
+            return CanCompileAndOSREnter;
         case Array::Int32:
         case Array::Double:
@@ -123,5 +128,5 @@
             break;
         default:
-            return false;
+            return CannotCompile;
         }
         break;
@@ -134,5 +139,5 @@
         if (node->isBinaryUseKind(ObjectUse))
             break;
-        return false;
+        return CannotCompile;
     case CompareLess:
     case CompareLessEq:
@@ -143,5 +148,5 @@
         if (node->isBinaryUseKind(NumberUse))
             break;
-        return false;
+        return CannotCompile;
     case Branch:
     case LogicalNot:
@@ -153,5 +158,5 @@
             break;
         default:
-            return false;
+            return CannotCompile;
         }
         break;
@@ -162,16 +167,24 @@
             break;
         default:
-            return false;
+            return CannotCompile;
         }
         break;
     default:
         // Don't know how to handle anything else.
-        return false;
+        return CannotCompile;
     }
-    return true;
+    return CanCompileAndOSREnter;
 }
 
-bool canCompile(Graph& graph)
+CapabilityLevel canCompile(Graph& graph)
 {
+    if (graph.m_codeBlock->codeType() != FunctionCode) {
+        if (verboseCompilationEnabled())
+            dataLog("FTL rejecting code block that doesn't belong to a function.\n");
+        return CannotCompile;
+    }
+
+    CapabilityLevel result = CanCompileAndOSREnter;
+
     for (BlockIndex blockIndex = graph.numBlocks(); blockIndex--;) {
         BasicBlock* block = graph.block(blockIndex);
@@ -211,17 +224,28 @@
                         graph.dump(WTF::dataFile(), "    ", node);
                     }
-                    return false;
+                    return CannotCompile;
                 }
             }
 
-            if (!canCompile(node)) {
+            switch (canCompile(node)) {
+            case CannotCompile:
                 if (verboseCompilationEnabled()) {
                     dataLog("FTL rejecting node:\n");
                     graph.dump(WTF::dataFile(), "    ", node);
                 }
-                return false;
+                return CannotCompile;
+
+            case CanCompile:
+                if (result == CanCompileAndOSREnter && verboseCompilationEnabled()) {
+                    dataLog("FTL disabling OSR entry because of node:\n");
+                    graph.dump(WTF::dataFile(), "    ", node);
+                }
+                result = CanCompile;
+                break;
+
+            case CanCompileAndOSREnter:
+                break;
             }
 
-            // We don't care if we can compile anything after a force-exit.
             if (node->op() == ForceOSRExit)
                 break;
@@ -229,5 +253,5 @@
     }
 
-    return true;
+    return result;
 }
 
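
The per-graph result in the diff above is effectively the weakest level
reported by any node. A standalone sketch of that fold follows; the three
enumerator names are taken from the diff, but their ordering, the combine()
helper, and the use of std::vector are assumptions made for illustration.

```cpp
#include <vector>

// Enumerator names mirror the diff; their declaration order here is an assumption.
enum CapabilityLevel { CannotCompile, CanCompile, CanCompileAndOSREnter };

// Hypothetical helper: folds per-node levels into one level for the whole graph.
inline CapabilityLevel combine(const std::vector<CapabilityLevel>& nodeLevels)
{
    CapabilityLevel result = CanCompileAndOSREnter;
    for (CapabilityLevel level : nodeLevels) {
        if (level == CannotCompile)
            return CannotCompile; // one unsupported node rejects the graph
        if (level == CanCompile)
            result = CanCompile;  // compilable, but OSR entry is disabled
    }
    return result;
}
```

Unlike the real canCompile(Graph&), this sketch does not stop scanning a block
at ForceOSRExit and does no logging.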