Air should support linear scan for optLevel<2
https://bugs.webkit.org/show_bug.cgi?id=170161
Reviewed by Saam Barati.
Source/JavaScriptCore:
This changes the default opt level of B3 to 2. It makes the other opt levels useful by adding a
new register allocator. This new linear scan allocator will produce significantly worse code.
But it will produce that code a lot faster than IRC or Briggs.
The opt levels are:
0: no optimizations, linear scan
1: some optimizations, linear scan
2: full optimizations, graph coloring (IRC or Briggs based on CPU)
What we used to call optLevel=1 is not called optLevel=2, or better yet,
optLevel=B3::defaultOptLevel(). We no longer have anything like the old optLevel=0 (which did no
optimizations but ran graph coloring).
allocateRegistersByLinearScan() faithfully implements Massimiliano Poletto and Vivek Sarkar's
famous algorithm. It uses the variant that handles clobbered registers by avoiding assigning
ranges to those registers if the range overlaps a clobber. It's engineered to allocate registers
very quickly and generate inefficient code without falling off a cliff.
The new optLevel=1 speeds up B3 by a factor of 2, and results in a 80% throughput regression.
Linear scan runs 4.7x faster than graph coloring on average.
- CMakeLists.txt:
- JavaScriptCore.xcodeproj/project.pbxproj:
- b3/B3BasicBlockUtils.h:
(JSC::B3::blocksInPreOrder):
(JSC::B3::blocksInPostOrder):
- b3/B3BlockWorklist.h:
- b3/B3CFG.h:
(JSC::B3::CFG::newMap):
(JSC::B3::defaultOptLevel):
- b3/B3Compile.h:
- b3/B3DuplicateTails.cpp:
- b3/B3EliminateCommonSubexpressions.cpp:
- b3/B3FixSSA.cpp:
(JSC::B3::demoteValues):
(JSC::B3::fixSSA):
- b3/B3FixSSA.h:
- b3/B3Generate.cpp:
(JSC::B3::prepareForGeneration):
(JSC::B3::generateToAir):
- b3/B3Generate.h:
- b3/B3HeapRange.cpp: Removed.
- b3/B3HeapRange.h:
(JSC::B3::HeapRange::HeapRange): Deleted.
(JSC::B3::HeapRange::top): Deleted.
(JSC::B3::HeapRange::operator==): Deleted.
(JSC::B3::HeapRange::operator!=): Deleted.
(JSC::B3::HeapRange::operator|): Deleted.
(JSC::B3::HeapRange::operator bool): Deleted.
(JSC::B3::HeapRange::begin): Deleted.
(JSC::B3::HeapRange::end): Deleted.
(JSC::B3::HeapRange::overlaps): Deleted.
- b3/B3LowerToAir.cpp:
- b3/B3MoveConstants.cpp:
- b3/B3PhiChildren.h:
- b3/B3Procedure.cpp:
(JSC::B3::Procedure::dump):
(JSC::B3::Procedure::deleteOrphans):
(JSC::B3::Procedure::setBlockOrderImpl):
- b3/B3ReduceDoubleToFloat.cpp:
- b3/B3ReduceStrength.cpp:
- b3/B3SSACalculator.h:
- b3/B3UseCounts.h:
- b3/air/AirAllocateRegistersByGraphColoring.cpp:
- b3/air/AirAllocateRegistersByLinearScan.cpp: Added.
(JSC::B3::Air::allocateRegistersByLinearScan):
- b3/air/AirAllocateRegistersByLinearScan.h: Added.
- b3/air/AirAllocateStack.cpp:
(JSC::B3::Air::allocateStack):
(WTF::printInternal):
(JSC::B3::Air::Arg::activeAt):
(JSC::B3::Air::Arg::timing):
(JSC::B3::Air::Arg::forEachPhase):
- b3/air/AirBasicBlock.h:
- b3/air/AirBlockWorklist.h:
- b3/air/AirCFG.h:
(JSC::B3::Air::CFG::newMap):
- b3/air/AirEliminateDeadCode.cpp:
(JSC::B3::Air::eliminateDeadCode):
- b3/air/AirFixObviousSpills.cpp:
- b3/air/AirFixPartialRegisterStalls.cpp:
(JSC::B3::Air::fixPartialRegisterStalls):
- b3/air/AirFixSpillsAfterTerminals.cpp: Added.
(JSC::B3::Air::fixSpillsAfterTerminals):
- b3/air/AirFixSpillsAfterTerminals.h: Added.
- b3/air/AirGenerate.cpp:
(JSC::B3::Air::prepareForGeneration):
(JSC::B3::Air::generate):
- b3/air/AirGenerate.h:
- b3/air/AirGenerationContext.h:
- b3/air/AirInsertionSet.h:
- b3/air/AirInst.cpp:
(JSC::B3::Air::Inst::needsPadding):
- b3/air/AirLowerAfterRegAlloc.cpp:
(JSC::B3::Air::lowerAfterRegAlloc):
- b3/air/AirLowerEntrySwitch.cpp:
(JSC::B3::Air::lowerEntrySwitch):
- b3/air/AirOpcode.opcodes:
- b3/air/AirPhaseInsertionSet.cpp: Added.
(JSC::B3::Air::PhaseInsertionSet::execute):
- b3/air/AirPhaseInsertionSet.h: Added.
(JSC::B3::Air::PhaseInsertion::PhaseInsertion):
(JSC::B3::Air::PhaseInsertion::phase):
(JSC::B3::Air::PhaseInsertion::operator<):
(JSC::B3::Air::PhaseInsertionSet::PhaseInsertionSet):
(JSC::B3::Air::PhaseInsertionSet::appendInsertion):
(JSC::B3::Air::PhaseInsertionSet::insertInst):
(JSC::B3::Air::PhaseInsertionSet::insert):
(JSC::B3::Air::RegLiveness::LocalCalc::LocalCalc):
- b3/air/AirSpillEverything.cpp:
(JSC::B3::Air::spillEverything):
- b3/air/AirTmp.cpp:
- b3/air/AirTmp.h:
(JSC::B3::Air::Tmp::tmpForIndex):
(JSC::B3::Air::Tmp::Indexed::Indexed):
(JSC::B3::Air::Tmp::Indexed::index):
(JSC::B3::Air::Tmp::AbsolutelyIndexed::AbsolutelyIndexed):
(JSC::B3::Air::Tmp::AbsolutelyIndexed::index):
(JSC::B3::Air::Tmp::indexed):
(JSC::B3::Air::Tmp::absolutelyIndexed):
(JSC::B3::Air::Tmp::tmpForAbsoluteIndex):
(JSC::B3::compile):
(JSC::B3::testMulLoadTwice):
(JSC::RegisterSet::add):
(JSC::RegisterSet::remove):
- runtime/Options.h:
- wasm/WasmB3IRGenerator.h:
Source/WTF:
This change introduces a new low-latency register allocator. It can allocate registers very
quickly by doing a relatively poor job. Implementing this algorithm required beefing up some of
our core algorithms.
- WTF.xcodeproj/project.pbxproj:
- wtf/CMakeLists.txt:
- wtf/Deque.h: Make it possible to do some basic priority queueing with this data structure.
(WTF::inlineCapacity>::removeAllMatching):
(WTF::inlineCapacity>::appendAndBubble):
(WTF::inlineCapacity>::takeLast):
- wtf/IndexKeyType.h: Added. This makes it possible to use IndexMap and IndexSet with value or pointer types. Previously they just worked with pointer types.
(WTF::IndexKeyType::index):
- wtf/IndexMap.h: Adopt IndexKeyType.
(WTF::IndexMap::operator[]):
(WTF::IndexMap::append):
- wtf/IndexSet.h: Adopt IndexKeyType.
(WTF::IndexSet::add):
(WTF::IndexSet::addAll):
(WTF::IndexSet::remove):
(WTF::IndexSet::contains):
(WTF::IndexSet::Iterable::iterator::operator*):
- wtf/Range.h: Added. This used to be B3::HeapRange. This generalizes that data structure to any kind of range stuff.
(WTF::Range::Range):
(WTF::Range::top):
(WTF::Range::operator==):
(WTF::Range::operator!=):
(WTF::Range::operator bool):
(WTF::Range::operator|):
(WTF::Range::operator|=):
(WTF::Range::begin):
(WTF::Range::end):
(WTF::Range::overlaps):
(WTF::Range::dump):
(WTF::RangeSet::add):
Tools:
This makes us run a bunch of JS tests at optLevel=1 to force testing of this new compiler
pipeline.
- Scripts/run-jsc-stress-tests: