Timestamp:
Jun 12, 2012, 7:06:50 PM
Author:
[email protected]
Message:

GC should be 1.7X faster
https://bugs.webkit.org/show_bug.cgi?id=88840

Reviewed by Oliver Hunt.

I profiled, and removed anything that showed up as a concurrency
bottleneck. Then, I added 3 threads to our max thread count, since we
can scale up to more threads now.

  • heap/BlockAllocator.cpp:

(JSC::BlockAllocator::BlockAllocator):
(JSC::BlockAllocator::~BlockAllocator):
(JSC::BlockAllocator::releaseFreeBlocks):
(JSC::BlockAllocator::waitForRelativeTimeWhileHoldingLock):
(JSC::BlockAllocator::waitForRelativeTime):
(JSC::BlockAllocator::blockFreeingThreadMain):

  • heap/BlockAllocator.h:

(BlockAllocator):
(JSC::BlockAllocator::allocate):
(JSC::BlockAllocator::deallocate): Use a spin lock for the common case
where we're just popping a linked list. (A pthread mutex would sleep our
thread even if the lock were only contended for a microsecond.)

Scope the lock to avoid holding it while allocating VM, since that's a
slow activity and it doesn't modify any of our data structures.

We still use a pthread mutex to handle our condition variable since we
have to, and it's not a hot path.
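
As a rough illustration of that pattern, here is a minimal sketch (not the actual BlockAllocator code; SpinLock, Block, and freeToOS are illustrative stand-ins): the spin lock is held only for the pointer swaps, and the slow OS-level work happens after the free list has been detached.

    #include <atomic>
    #include <mutex> // for std::lock_guard

    // Toy spin lock; the real code uses WTF's SpinLock / SpinLockHolder.
    class SpinLock {
    public:
        void lock() { while (m_flag.test_and_set(std::memory_order_acquire)) { } }
        void unlock() { m_flag.clear(std::memory_order_release); }
    private:
        std::atomic_flag m_flag = ATOMIC_FLAG_INIT;
    };

    struct Block { Block* next = nullptr; };

    class FreeListSketch {
    public:
        // Common case: pop or push a singly linked list under a spin lock.
        // The critical section is a few instructions, so spinning beats
        // sleeping the thread on a pthread mutex.
        Block* tryAllocate()
        {
            std::lock_guard<SpinLock> locker(m_lock);
            Block* block = m_freeList;
            if (block)
                m_freeList = block->next;
            return block;
        }

        void deallocate(Block* block)
        {
            std::lock_guard<SpinLock> locker(m_lock);
            block->next = m_freeList;
            m_freeList = block;
        }

        // Slow path: detach the whole list under the lock, then do the slow
        // work (returning memory to the OS) with the lock released.
        void releaseFreeBlocks()
        {
            Block* blocks;
            {
                std::lock_guard<SpinLock> locker(m_lock);
                blocks = m_freeList;
                m_freeList = nullptr;
            }
            while (blocks) {
                Block* next = blocks->next;
                freeToOS(blocks); // slow; deliberately done outside the lock
                blocks = next;
            }
        }

    private:
        static void freeToOS(Block*) { /* munmap/VirtualFree in a real allocator */ }

        SpinLock m_lock;
        Block* m_freeList = nullptr;
    };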

  • heap/CopiedSpace.cpp:

(JSC::CopiedSpace::CopiedSpace):
(JSC::CopiedSpace::doneFillingBlock):

  • heap/CopiedSpace.h:

(JSC::CopiedSpace::CopiedSpace): Use a spin lock for the to space lock,
since it just guards linked list and hash table manipulation.

  • heap/MarkStack.cpp:

(JSC::MarkStackSegmentAllocator::MarkStackSegmentAllocator):
(JSC::MarkStackSegmentAllocator::allocate):
(JSC::MarkStackSegmentAllocator::release):
(JSC::MarkStackSegmentAllocator::shrinkReserve): Use a spin lock, since
we're just managing a linked list.

(JSC::MarkStackArray::donateSomeCellsTo): Changed donation to be proportional
to our current stack size. This fixes cases where we used to donate too
much. Interestingly, donating too much was starving the donor (when it
ran out of work later) *and* the recipient (since it had to wait on a
long donation operation to complete before it could acquire the lock).

In the worst case, we're still guaranteed to donate N cells in roughly log N time.
(Each donation hands off about half of what remains, and whole segments move by
pointer splicing rather than by copying cells, so a stack of N cells is fully
handed off after roughly log N donations.)

This change also fixes cases where we used to donate too little, since
we would always keep a fixed minimum number of cells. In the worst case,
with N marking threads, we could have N large object graph roots in
our stack for the duration of GC, and scale to only 1 thread.

It's an interesting observation that a single object in the mark stack
might represent an arbitrarily large object graph -- and only the act
of marking can find out.

(JSC::MarkStackArray::stealSomeCellsFrom): Steal in proportion to idle
threads. Once again, this fixes cases where constants could cause us
to steal too much or too little.

(JSC::SlotVisitor::donateKnownParallel): Always wake up other threads
if they're idle. We can afford to do this because we're conservative
about when we donate.

(JSC::SlotVisitor::drainFromShared):
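
To make the rebalancing policy concrete, here is a simplified sketch with the segment machinery and locking stripped out (WorkStack and these free functions are illustrative stand-ins, not the real MarkStackArray API; the actual implementations are in the diff below):

    #include <cstddef>
    #include <deque>

    // Simplified work queue: a cell is just a pointer-sized unit of marking work.
    using Cell = void*;
    using WorkStack = std::deque<Cell>;

    // Donate roughly half of our cells to the shared stack. Donating in
    // proportion to what we have avoids both starving ourselves (donating
    // too much) and hoarding work (donating too little).
    void donateSomeCellsTo(WorkStack& mine, WorkStack& shared)
    {
        size_t cellsToDonate = mine.size() / 2;
        while (cellsToDonate--) {
            shared.push_back(mine.back());
            mine.pop_back();
        }
    }

    // Steal roughly 1/Nth of the shared stack, where N is the number of idle
    // threads, so each waking thread gets a fair share. idleThreadCount is
    // assumed to be at least 1.
    void stealSomeCellsFrom(WorkStack& mine, WorkStack& shared, size_t idleThreadCount)
    {
        size_t cellsToSteal = (shared.size() + idleThreadCount - 1) / idleThreadCount; // round up
        while (cellsToSteal-- && !shared.empty()) {
            mine.push_back(shared.back());
            shared.pop_back();
        }
    }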

  • heap/MarkStack.h:

(MarkStackSegmentAllocator):
(MarkStackArray):
(JSC):

  • heap/SlotVisitor.h: Merged the "should I donate?" decision into a
single function, for simplicity.

  • runtime/Options.cpp:

(minimumNumberOfScansBetweenRebalance): Reduced the delay before donation
a lot. We can afford to do this because, in the common case, donation is
a single branch that decides not to donate.

(cpusToUse): Use more CPUs, since we scale better now.
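
Continuing the simplified sketch above, the merged "should I donate?" check might look roughly like this (SharedState and donateIfProfitable are illustrative names, not the real SlotVisitor API). The point is that the common case is a couple of branches that bail out, which is why the rebalance interval can be reduced so aggressively:

    #include <condition_variable>
    #include <cstddef>
    #include <mutex>

    // Shared marking state; a stand-in for MarkStackThreadSharedData.
    struct SharedState {
        std::mutex markingLock;
        std::condition_variable condition;
        WorkStack sharedStack;       // from the sketch above
        size_t activeMarkers = 0;
        size_t totalMarkers = 0;
    };

    // Hypothetical fast path, run every few cells scanned. In the common
    // case it is just branches that decide not to donate.
    void donateIfProfitable(WorkStack& mine, SharedState& shared)
    {
        // Nothing worth sharing, or we are about to run dry ourselves.
        if (mine.size() < 2)
            return;

        // Shared work is already queued up; assume donating more is not profitable.
        if (!shared.sharedStack.empty())
            return;

        // If the lock is contended, assume another thread is already donating.
        std::unique_lock<std::mutex> locker(shared.markingLock, std::try_to_lock);
        if (!locker.owns_lock())
            return;

        donateSomeCellsTo(mine, shared.sharedStack);

        // Wake idle markers; we can afford to, because we rarely get this far.
        if (shared.activeMarkers < shared.totalMarkers)
            shared.condition.notify_all();
    }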

  • runtime/Options.h:

(Options): Removed now-unused variables.

File:
1 edited

  • trunk/Source/JavaScriptCore/heap/MarkStack.cpp

--- trunk/Source/JavaScriptCore/heap/MarkStack.cpp (r119633)
+++ trunk/Source/JavaScriptCore/heap/MarkStack.cpp (r120149)

@@ ... @@
     : m_nextFreeSegment(0)
 {
+    m_lock.Init();
 }

@@ ... @@
 {
     {
-        MutexLocker locker(m_lock);
+        SpinLockHolder locker(&m_lock);
         if (m_nextFreeSegment) {
             MarkStackSegment* result = m_nextFreeSegment;

@@ ... @@
 void MarkStackSegmentAllocator::release(MarkStackSegment* segment)
 {
-    MutexLocker locker(m_lock);
+    SpinLockHolder locker(&m_lock);
     segment->m_previous = m_nextFreeSegment;
     m_nextFreeSegment = segment;

@@ ... @@
     MarkStackSegment* segments;
     {
-        MutexLocker locker(m_lock);
+        SpinLockHolder locker(&m_lock);
         segments = m_nextFreeSegment;
         m_nextFreeSegment = 0;

@@ ... @@
 }

-bool MarkStackArray::donateSomeCellsTo(MarkStackArray& other)
-{
+void MarkStackArray::donateSomeCellsTo(MarkStackArray& other)
+{
+    // Try to donate about 1 / 2 of our cells. To reduce copying costs,
+    // we prefer donating whole segments over donating individual cells,
+    // even if this skews away from our 1 / 2 target.
+
     ASSERT(m_segmentCapacity == other.m_segmentCapacity);
+
+    size_t segmentsToDonate = (m_numberOfPreviousSegments + 2 - 1) / 2; // Round up to donate 1 / 1 previous segments.
+
+    if (!segmentsToDonate) {
+        size_t cellsToDonate = m_top / 2; // Round down to donate 0 / 1 cells.
+        while (cellsToDonate--) {
+            ASSERT(m_top);
+            other.append(removeLast());
+        }
+        return;
+    }
+
     validatePrevious();
     other.validatePrevious();
-
-    // Fast check: see if the other mark stack already has enough segments.
-    if (other.m_numberOfPreviousSegments + 1 >= Options::maximumNumberOfSharedSegments)
-        return false;
-
-    size_t numberOfCellsToKeep = Options::minimumNumberOfCellsToKeep;
-    ASSERT(m_top > numberOfCellsToKeep || m_topSegment->m_previous);
-
-    // Looks like we should donate! Give the other mark stack all of our
-    // previous segments, and then top it off.
+
     MarkStackSegment* previous = m_topSegment->m_previous;
-    while (previous) {
+    while (segmentsToDonate--) {
+        ASSERT(previous);
         ASSERT(m_numberOfPreviousSegments);

@@ ... @@
         other.m_numberOfPreviousSegments++;
     }
-    ASSERT(!m_numberOfPreviousSegments);
-    m_topSegment->m_previous = 0;
+    m_topSegment->m_previous = previous;
+
     validatePrevious();
     other.validatePrevious();
-
-    // Now top off. We want to keep at a minimum numberOfCellsToKeep, but if
-    // we really have a lot of work, we give up half.
-    if (m_top > numberOfCellsToKeep * 2)
-        numberOfCellsToKeep = m_top / 2;
-    while (m_top > numberOfCellsToKeep)
-        other.append(removeLast());
-
-    return true;
-}
-
-void MarkStackArray::stealSomeCellsFrom(MarkStackArray& other)
-{
+}
+
+void MarkStackArray::stealSomeCellsFrom(MarkStackArray& other, size_t idleThreadCount)
+{
+    // Try to steal 1 / Nth of the shared array, where N is the number of idle threads.
+    // To reduce copying costs, we prefer stealing a whole segment over stealing
+    // individual cells, even if this skews away from our 1 / N target.
+
     ASSERT(m_segmentCapacity == other.m_segmentCapacity);
     validatePrevious();

@@ ... @@
         return;
     }
-
-    // Otherwise drain 1/Nth of the shared array where N is the number of
-    // workers, or Options::minimumNumberOfCellsToKeep, whichever is bigger.
-    size_t numberOfCellsToSteal = std::max((size_t)Options::minimumNumberOfCellsToKeep, other.size() / Options::numberOfGCMarkers);
+
+    size_t numberOfCellsToSteal = (other.size() + idleThreadCount - 1) / idleThreadCount; // Round up to steal 1 / 1.
     while (numberOfCellsToSteal-- > 0 && other.canRemoveLast())
         append(other.removeLast());

@@ ... @@
 }

-void SlotVisitor::donateSlow()
-{
-    // Refuse to donate if shared has more entries than I do.
-    if (m_shared.m_sharedMarkStack.size() > m_stack.size())
-        return;
-    MutexLocker locker(m_shared.m_markingLock);
-    if (m_stack.donateSomeCellsTo(m_shared.m_sharedMarkStack)) {
-        // Only wake up threads if the shared stack is big enough; otherwise assume that
-        // it's more profitable for us to just scan this ourselves later.
-        if (m_shared.m_sharedMarkStack.size() >= Options::sharedStackWakeupThreshold)
-            m_shared.m_markingCondition.broadcast();
-    }
+void SlotVisitor::donateKnownParallel()
+{
+    // NOTE: Because we re-try often, we can afford to be conservative, and
+    // assume that donating is not profitable.
+
+    // Avoid locking when a thread reaches a dead end in the object graph.
+    if (m_stack.size() < 2)
+        return;
+
+    // If there's already some shared work queued up, be conservative and assume
+    // that donating more is not profitable.
+    if (m_shared.m_sharedMarkStack.size())
+        return;
+
+    // If we're contending on the lock, be conservative and assume that another
+    // thread is already donating.
+    MutexTryLocker locker(m_shared.m_markingLock);
+    if (!locker.locked())
+        return;
+
+    // Otherwise, assume that a thread will go idle soon, and donate.
+    m_stack.donateSomeCellsTo(m_shared.m_sharedMarkStack);
+
+    if (m_shared.m_numberOfActiveParallelMarkers < Options::numberOfGCMarkers)
+        m_shared.m_markingCondition.broadcast();
 }

@@ ... @@
             }

-            m_stack.stealSomeCellsFrom(m_shared.m_sharedMarkStack);
+            size_t idleThreadCount = Options::numberOfGCMarkers - m_shared.m_numberOfActiveParallelMarkers;
+            m_stack.stealSomeCellsFrom(m_shared.m_sharedMarkStack, idleThreadCount);
             m_shared.m_numberOfActiveParallelMarkers++;
         }