Ignore:
Timestamp:
Apr 20, 2017, 10:55:44 AM (8 years ago)
Author:
[email protected]
Message:

Optimize SharedArrayBuffer in the DFG+FTL
https://bugs.webkit.org/show_bug.cgi?id=164108

Reviewed by Saam Barati.

JSTests:

Added a fairly comprehensive test of the intrinsics. This creates a function for each possible
combination of type and operation, and then first uses it nicely and then tries a bunch of
erroneous conditions like OOB.

  • stress/SharedArrayBuffer-opt.js: Added.

(string_appeared_here.switch):
(string_appeared_here.str):
(runAtomic):
(shouldFail):
(Symbol):
(string_appeared_here.a.of.arrays.m.of.atomics):

  • stress/SharedArrayBuffer.js:

Source/JavaScriptCore:

This adds atomics intrinsics to the DFG and wires them through to the DFG and FTL backends. This
was super easy in the FTL since B3 already has comprehensive atomic intrinsics, which are more
powerful than what we need right now. In the DFG backend, I went with an easy-to-write
implementation that just reduces everything to a weak CAS loop. It's very inefficient with
registers (it needs ~8) but it's the DFG backend, so it's not obvious how much we care.

To make the rare cases easy to handle, I refactored AtomicsObject.cpp so that the operations for
the slow paths can share code with the native functions.

This also fixes register handling in the X86 implementations of CAS, in the case that
expectedAndResult is not %rax. This also fixes the ARM64 implementation of branchWeakCAS.

I adapted the CascadeLock from WTF/benchmarks/ToyLocks.h as a microbenchmark of lock performance.
This benchmark performs 2.5x faster, in both the contended and uncontended case, thanks to this
change. It's still about 3x slower than native. I investigated this only a bit. I suspect that
the story will be different in asm.js code, which will get constant-folding of the typed array
backing store by virtue of how it uses lexically scoped variables as pointers to the heap arrays.
It's worth noting that the native lock I was comparing against, the very nicely-tuned
CascadeLock, is at the very high end of lock throughput under virtually all conditions
(uncontended, microcontended, held for a long time). I also compared to WTF::Lock and others, and
the only ones that performed better in this microbenchmark were spinlocks. I don't recommend
using those. So, when I say this is 3x slower than native, I really mean that it's 3x slower than
the fastest native lock that I have in my arsenal.

Also worth noting is that I experimented with exposing Atomics.yield(), which uses sched_yield,
as a way of testing if adding a yield loop to the JS cascadeLock would help. It does not help. I
did not investigate why.

  • assembler/AbstractMacroAssembler.h:

(JSC::AbstractMacroAssembler::JumpList::append):

  • assembler/CPU.h:

(JSC::is64Bit):
(JSC::is32Bit):

  • b3/B3Common.h:

(JSC::B3::is64Bit): Deleted.
(JSC::B3::is32Bit): Deleted.

  • b3/B3LowerToAir.cpp:

(JSC::B3::Air::LowerToAir::appendTrapping):
(JSC::B3::Air::LowerToAir::appendCAS):
(JSC::B3::Air::LowerToAir::appendGeneralAtomic):

  • dfg/DFGAbstractInterpreterInlines.h:

(JSC::DFG::AbstractInterpreter<AbstractStateType>::executeEffects):

  • dfg/DFGByteCodeParser.cpp:

(JSC::DFG::ByteCodeParser::handleIntrinsicCall):

  • dfg/DFGClobberize.h:

(JSC::DFG::clobberize):

  • dfg/DFGDoesGC.cpp:

(JSC::DFG::doesGC):

  • dfg/DFGFixupPhase.cpp:

(JSC::DFG::FixupPhase::fixupNode):

  • dfg/DFGNode.h:

(JSC::DFG::Node::hasHeapPrediction):
(JSC::DFG::Node::hasArrayMode):

  • dfg/DFGNodeType.h:

(JSC::DFG::isAtomicsIntrinsic):
(JSC::DFG::numExtraAtomicsArgs):

  • dfg/DFGPredictionPropagationPhase.cpp:
  • dfg/DFGSSALoweringPhase.cpp:

(JSC::DFG::SSALoweringPhase::handleNode):

  • dfg/DFGSafeToExecute.h:

(JSC::DFG::safeToExecute):

  • dfg/DFGSpeculativeJIT.cpp:

(JSC::DFG::SpeculativeJIT::loadFromIntTypedArray):
(JSC::DFG::SpeculativeJIT::setIntTypedArrayLoadResult):
(JSC::DFG::SpeculativeJIT::compileGetByValOnIntTypedArray):
(JSC::DFG::SpeculativeJIT::getIntTypedArrayStoreOperand):
(JSC::DFG::SpeculativeJIT::compilePutByValForIntTypedArray):

  • dfg/DFGSpeculativeJIT.h:

(JSC::DFG::SpeculativeJIT::callOperation):

  • dfg/DFGSpeculativeJIT32_64.cpp:

(JSC::DFG::SpeculativeJIT::compile):

  • dfg/DFGSpeculativeJIT64.cpp:

(JSC::DFG::SpeculativeJIT::compile):

  • ftl/FTLAbstractHeapRepository.cpp:

(JSC::FTL::AbstractHeapRepository::decorateFencedAccess):
(JSC::FTL::AbstractHeapRepository::computeRangesAndDecorateInstructions):

  • ftl/FTLAbstractHeapRepository.h:
  • ftl/FTLCapabilities.cpp:

(JSC::FTL::canCompile):

  • ftl/FTLLowerDFGToB3.cpp:

(JSC::FTL::DFG::LowerDFGToB3::compileNode):
(JSC::FTL::DFG::LowerDFGToB3::compileAtomicsReadModifyWrite):
(JSC::FTL::DFG::LowerDFGToB3::compileAtomicsIsLockFree):
(JSC::FTL::DFG::LowerDFGToB3::compileGetByVal):
(JSC::FTL::DFG::LowerDFGToB3::compilePutByVal):
(JSC::FTL::DFG::LowerDFGToB3::pointerIntoTypedArray):
(JSC::FTL::DFG::LowerDFGToB3::loadFromIntTypedArray):
(JSC::FTL::DFG::LowerDFGToB3::storeType):
(JSC::FTL::DFG::LowerDFGToB3::setIntTypedArrayLoadResult):
(JSC::FTL::DFG::LowerDFGToB3::getIntTypedArrayStoreOperand):
(JSC::FTL::DFG::LowerDFGToB3::vmCall):

  • ftl/FTLOutput.cpp:

(JSC::FTL::Output::store):
(JSC::FTL::Output::store32As8):
(JSC::FTL::Output::store32As16):
(JSC::FTL::Output::atomicXchgAdd):
(JSC::FTL::Output::atomicXchgAnd):
(JSC::FTL::Output::atomicXchgOr):
(JSC::FTL::Output::atomicXchgSub):
(JSC::FTL::Output::atomicXchgXor):
(JSC::FTL::Output::atomicXchg):
(JSC::FTL::Output::atomicStrongCAS):

  • ftl/FTLOutput.h:

(JSC::FTL::Output::store32):
(JSC::FTL::Output::store64):
(JSC::FTL::Output::storePtr):
(JSC::FTL::Output::storeFloat):
(JSC::FTL::Output::storeDouble):

  • jit/JITOperations.h:
  • runtime/AtomicsObject.cpp:

(JSC::atomicsFuncAdd):
(JSC::atomicsFuncAnd):
(JSC::atomicsFuncCompareExchange):
(JSC::atomicsFuncExchange):
(JSC::atomicsFuncIsLockFree):
(JSC::atomicsFuncLoad):
(JSC::atomicsFuncOr):
(JSC::atomicsFuncStore):
(JSC::atomicsFuncSub):
(JSC::atomicsFuncWait):
(JSC::atomicsFuncWake):
(JSC::atomicsFuncXor):
(JSC::operationAtomicsAdd):
(JSC::operationAtomicsAnd):
(JSC::operationAtomicsCompareExchange):
(JSC::operationAtomicsExchange):
(JSC::operationAtomicsIsLockFree):
(JSC::operationAtomicsLoad):
(JSC::operationAtomicsOr):
(JSC::operationAtomicsStore):
(JSC::operationAtomicsSub):
(JSC::operationAtomicsXor):

  • runtime/AtomicsObject.h:

Source/WTF:

Made small changes as part of benchmarking the JS versions of these locks.

  • benchmarks/LockSpeedTest.cpp:
  • benchmarks/ToyLocks.h:
  • wtf/Range.h:

(WTF::Range::dump):

LayoutTests:

Add a test of futex performance.

  • workers/sab/cascade_lock-worker.js: Added.

(onmessage):

  • workers/sab/cascade_lock.html: Added.
  • workers/sab/worker-resources.js:

(cascadeLockSlow):
(cascadeLock):
(cascadeUnlock):

File:
1 edited

Legend:

Unmodified
Added
Removed
  • trunk/Source/JavaScriptCore/assembler/MacroAssemblerARM64.h

    r214384 r215565  
    35093509    }
    35103510   
    3511     void atomicStrongCAS8(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
     3511    template<typename AddressType>
     3512    void atomicStrongCAS8(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, AddressType address, RegisterID result)
    35123513    {
    35133514        atomicStrongCAS<8>(cond, expectedAndResult, newValue, address, result);
    35143515    }
    35153516   
    3516     void atomicStrongCAS16(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
     3517    template<typename AddressType>
     3518    void atomicStrongCAS16(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, AddressType address, RegisterID result)
    35173519    {
    35183520        atomicStrongCAS<16>(cond, expectedAndResult, newValue, address, result);
    35193521    }
    35203522   
    3521     void atomicStrongCAS32(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
     3523    template<typename AddressType>
     3524    void atomicStrongCAS32(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, AddressType address, RegisterID result)
    35223525    {
    35233526        atomicStrongCAS<32>(cond, expectedAndResult, newValue, address, result);
    35243527    }
    35253528   
    3526     void atomicStrongCAS64(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
     3529    template<typename AddressType>
     3530    void atomicStrongCAS64(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, AddressType address, RegisterID result)
    35273531    {
    35283532        atomicStrongCAS<64>(cond, expectedAndResult, newValue, address, result);
    35293533    }
    35303534   
    3531     void atomicRelaxedStrongCAS8(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
     3535    template<typename AddressType>
     3536    void atomicRelaxedStrongCAS8(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, AddressType address, RegisterID result)
    35323537    {
    35333538        atomicRelaxedStrongCAS<8>(cond, expectedAndResult, newValue, address, result);
    35343539    }
    35353540   
    3536     void atomicRelaxedStrongCAS16(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
     3541    template<typename AddressType>
     3542    void atomicRelaxedStrongCAS16(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, AddressType address, RegisterID result)
    35373543    {
    35383544        atomicRelaxedStrongCAS<16>(cond, expectedAndResult, newValue, address, result);
    35393545    }
    35403546   
    3541     void atomicRelaxedStrongCAS32(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
     3547    template<typename AddressType>
     3548    void atomicRelaxedStrongCAS32(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, AddressType address, RegisterID result)
    35423549    {
    35433550        atomicRelaxedStrongCAS<32>(cond, expectedAndResult, newValue, address, result);
    35443551    }
    35453552   
    3546     void atomicRelaxedStrongCAS64(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
     3553    template<typename AddressType>
     3554    void atomicRelaxedStrongCAS64(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, AddressType address, RegisterID result)
    35473555    {
    35483556        atomicRelaxedStrongCAS<64>(cond, expectedAndResult, newValue, address, result);
    35493557    }
    35503558   
    3551     JumpList branchAtomicWeakCAS8(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
     3559    template<typename AddressType>
     3560    JumpList branchAtomicWeakCAS8(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, AddressType address)
    35523561    {
    35533562        return branchAtomicWeakCAS<8>(cond, expectedAndClobbered, newValue, address);
    35543563    }
    35553564   
    3556     JumpList branchAtomicWeakCAS16(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
     3565    template<typename AddressType>
     3566    JumpList branchAtomicWeakCAS16(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, AddressType address)
    35573567    {
    35583568        return branchAtomicWeakCAS<16>(cond, expectedAndClobbered, newValue, address);
    35593569    }
    35603570   
    3561     JumpList branchAtomicWeakCAS32(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
     3571    template<typename AddressType>
     3572    JumpList branchAtomicWeakCAS32(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, AddressType address)
    35623573    {
    35633574        return branchAtomicWeakCAS<32>(cond, expectedAndClobbered, newValue, address);
    35643575    }
    35653576   
    3566     JumpList branchAtomicWeakCAS64(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
     3577    template<typename AddressType>
     3578    JumpList branchAtomicWeakCAS64(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, AddressType address)
    35673579    {
    35683580        return branchAtomicWeakCAS<64>(cond, expectedAndClobbered, newValue, address);
    35693581    }
    35703582   
    3571     JumpList branchAtomicRelaxedWeakCAS8(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
     3583    template<typename AddressType>
     3584    JumpList branchAtomicRelaxedWeakCAS8(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, AddressType address)
    35723585    {
    35733586        return branchAtomicRelaxedWeakCAS<8>(cond, expectedAndClobbered, newValue, address);
    35743587    }
    35753588   
    3576     JumpList branchAtomicRelaxedWeakCAS16(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
     3589    template<typename AddressType>
     3590    JumpList branchAtomicRelaxedWeakCAS16(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, AddressType address)
    35773591    {
    35783592        return branchAtomicRelaxedWeakCAS<16>(cond, expectedAndClobbered, newValue, address);
    35793593    }
    35803594   
    3581     JumpList branchAtomicRelaxedWeakCAS32(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
     3595    template<typename AddressType>
     3596    JumpList branchAtomicRelaxedWeakCAS32(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, AddressType address)
    35823597    {
    35833598        return branchAtomicRelaxedWeakCAS<32>(cond, expectedAndClobbered, newValue, address);
    35843599    }
    35853600   
    3586     JumpList branchAtomicRelaxedWeakCAS64(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
     3601    template<typename AddressType>
     3602    JumpList branchAtomicRelaxedWeakCAS64(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, AddressType address)
    35873603    {
    35883604        return branchAtomicRelaxedWeakCAS<64>(cond, expectedAndClobbered, newValue, address);
     
    41734189    void storeCondRel(RegisterID src, RegisterID dest, RegisterID result)
    41744190    {
    4175         m_assembler.stlxr<datasize>(src, dest, result);
     4191        m_assembler.stlxr<datasize>(result, src, dest);
    41764192    }
    41774193   
     
    42144230    }
    42154231   
    4216     template<int datasize>
    4217     void atomicRelaxedStrongCAS(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, Address address, RegisterID result)
     4232    template<int datasize, typename AddressType>
     4233    void atomicRelaxedStrongCAS(StatusCondition cond, RegisterID expectedAndResult, RegisterID newValue, AddressType address, RegisterID result)
    42184234    {
    42194235        signExtend<datasize>(expectedAndResult, expectedAndResult);
     
    42384254    }
    42394255   
    4240     template<int datasize>
    4241     JumpList branchAtomicWeakCAS(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
     4256    template<int datasize, typename AddressType>
     4257    JumpList branchAtomicWeakCAS(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, AddressType address)
    42424258    {
    42434259        signExtend<datasize>(expectedAndClobbered, expectedAndClobbered);
     
    42664282    }
    42674283   
    4268     template<int datasize>
    4269     JumpList branchAtomicRelaxedWeakCAS(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, Address address)
     4284    template<int datasize, typename AddressType>
     4285    JumpList branchAtomicRelaxedWeakCAS(StatusCondition cond, RegisterID expectedAndClobbered, RegisterID newValue, AddressType address)
    42704286    {
    42714287        signExtend<datasize>(expectedAndClobbered, expectedAndClobbered);
     
    43044320    }
    43054321
     4322    // This uses both the memory and data temp, but only returns the memorty temp. So you can use the
     4323    // data temp after this finishes.
     4324    RegisterID extractSimpleAddress(BaseIndex address)
     4325    {
     4326        RegisterID result = getCachedMemoryTempRegisterIDAndInvalidate();
     4327        lshift64(address.index, TrustedImm32(address.scale), result);
     4328        add64(address.base, result);
     4329        add64(TrustedImm32(address.offset), result);
     4330        return result;
     4331    }
     4332
    43064333    Jump jumpAfterFloatingPointCompare(DoubleCondition cond)
    43074334    {
Note: See TracChangeset for help on using the changeset viewer.