source: webkit/trunk/JavaScriptCore/wtf/FastMalloc.cpp@35691

Last change on this file since 35691 was 35691, checked in by [email protected], 17 years ago

JavaScriptCore:

  • JavaScriptCore part of <rdar://problem/6121636> Make fast*alloc() abort() on failure and add "try" variants that return NULL on failure.

Reviewed by Darin Adler.

  • JavaScriptCore.exp: Exported tryFastCalloc().
  • VM/RegisterFile.h: (KJS::RegisterFile::RegisterFile): Removed an ASSERT().
  • kjs/JSArray.cpp: (KJS::JSArray::putSlowCase): Changed to use tryFastRealloc(). (KJS::JSArray::increaseVectorLength): Ditto.
  • kjs/ustring.cpp: (KJS::allocChars): Changed to use tryFastMalloc(). (KJS::reallocChars): Changed to use tryFastRealloc().
  • wtf/FastMalloc.cpp:
    (WTF::fastZeroedMalloc): Removed null checking of fastMalloc()'s result and removed extra call to InvokeNewHook().
    (WTF::tryFastZeroedMalloc): Added. Uses tryFastMalloc().
    (WTF::tryFastMalloc): Renamed fastMalloc() to this.
    (WTF::fastMalloc): Added. This version abort()s if allocation fails.
    (WTF::tryFastCalloc): Renamed fastCalloc() to this.
    (WTF::fastCalloc): Added. This version abort()s if allocation fails.
    (WTF::tryFastRealloc): Renamed fastRealloc() to this.
    (WTF::fastRealloc): Added. This version abort()s if allocation fails.
    (WTF::do_malloc): Made this a function template. When the abortOnFailure template parameter is set, the function abort()s on failure to allocate. Otherwise, it sets errno to ENOMEM and returns zero.
    (WTF::TCMallocStats::fastMalloc): Defined to abort() on failure.
    (WTF::TCMallocStats::tryFastMalloc): Added. Does not abort() on failure.
    (WTF::TCMallocStats::fastCalloc): Defined to abort() on failure.
    (WTF::TCMallocStats::tryFastCalloc): Added. Does not abort() on failure.
    (WTF::TCMallocStats::fastRealloc): Defined to abort() on failure.
    (WTF::TCMallocStats::tryFastRealloc): Added. Does not abort() on failure.
    (A brief usage sketch of the new contract appears after this list.)
  • wtf/FastMalloc.h: Declared the "try" variants.
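For illustration, a minimal sketch of the new calling convention, assuming only the WTF declarations named above (bufferSize is a placeholder):

    // fastMalloc() never returns 0; it abort()s if the allocation fails,
    // so its result can be used without a null check.
    void* buffer = WTF::fastMalloc(bufferSize);
    WTF::fastFree(buffer);

    // The try* variants return 0 on failure, so the caller must check.
    if (void* tryBuffer = WTF::tryFastMalloc(bufferSize))
        WTF::fastFree(tryBuffer);
    // else: recover from the allocation failure instead of crashing.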

WebCore:

  • WebCore part of <rdar://problem/6121636> Make fast*alloc() abort() on failure and add "try" variants that return NULL on failure.

Reviewed by Darin Adler.

  • platform/Arena.cpp: (WebCore::ArenaAllocate): Removed null checking of fastMalloc()'s result.
  • platform/graphics/cg/ImageBufferCG.cpp: (WebCore::ImageBuffer::create): Changed to use tryFastCalloc().
  • Property svn:eol-style set to native
File size: 117.7 KB
Line 
1// Copyright (c) 2005, 2007, Google Inc.
2// All rights reserved.
3// Copyright (C) 2005, 2006, 2007, 2008 Apple Inc. All rights reserved.
4//
5// Redistribution and use in source and binary forms, with or without
6// modification, are permitted provided that the following conditions are
7// met:
8//
9// * Redistributions of source code must retain the above copyright
10// notice, this list of conditions and the following disclaimer.
11// * Redistributions in binary form must reproduce the above
12// copyright notice, this list of conditions and the following disclaimer
13// in the documentation and/or other materials provided with the
14// distribution.
15// * Neither the name of Google Inc. nor the names of its
16// contributors may be used to endorse or promote products derived from
17// this software without specific prior written permission.
18//
19// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
20// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
21// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
22// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
23// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
24// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
25// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
26// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
27// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
28// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
29// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
30
31// ---
32// Author: Sanjay Ghemawat <[email protected]>
33//
34// A malloc that uses a per-thread cache to satisfy small malloc requests.
35// (The time for malloc/free of a small object drops from 300 ns to 50 ns.)
36//
37// See doc/tcmalloc.html for a high-level
38// description of how this malloc works.
39//
40// SYNCHRONIZATION
41// 1. The thread-specific lists are accessed without acquiring any locks.
42// This is safe because each such list is only accessed by one thread.
43// 2. We have a lock per central free-list, and hold it while manipulating
44// the central free list for a particular size.
45// 3. The central page allocator is protected by "pageheap_lock".
46// 4. The pagemap (which maps from page-number to descriptor),
47// can be read without holding any locks, and written while holding
48// the "pageheap_lock".
49// 5. To improve performance, a subset of the information one can get
50// from the pagemap is cached in a data structure, pagemap_cache_,
51// that atomically reads and writes its entries. This cache can be
52// read and written without locking.
53//
54// This multi-threaded access to the pagemap is safe for fairly
55// subtle reasons. We basically assume that when an object X is
56// allocated by thread A and deallocated by thread B, there must
57// have been appropriate synchronization in the handoff of object
58// X from thread A to thread B. The same logic applies to pagemap_cache_.
59//
60// THE PAGEID-TO-SIZECLASS CACHE
61// Hot PageID-to-sizeclass mappings are held by pagemap_cache_. If this cache
62// returns 0 for a particular PageID then that means "no information," not that
63// the sizeclass is 0. The cache may have stale information for pages that do
64// not hold the beginning of any free()'able object. Staleness is eliminated
65// in Populate() for pages with sizeclass > 0 objects, and in do_malloc() and
66// do_memalign() for all other relevant pages.
67//
68// TODO: Bias reclamation to larger addresses
69// TODO: implement mallinfo/mallopt
70// TODO: Better testing
71//
72// 9/28/2003 (new page-level allocator replaces ptmalloc2):
73// * malloc/free of small objects goes from ~300 ns to ~50 ns.
74// * allocation of a reasonably complicated struct
75// goes from about 1100 ns to about 300 ns.
76
77#include "config.h"
78#include "FastMalloc.h"
79
80#include "Assertions.h"
81#if ENABLE(JSC_MULTIPLE_THREADS)
82#include <pthread.h>
83#endif
84
85#ifndef NO_TCMALLOC_SAMPLES
86#ifdef WTF_CHANGES
87#define NO_TCMALLOC_SAMPLES
88#endif
89#endif
90
91#if !defined(USE_SYSTEM_MALLOC) && defined(NDEBUG)
92#define FORCE_SYSTEM_MALLOC 0
93#else
94#define FORCE_SYSTEM_MALLOC 1
95#endif
96
97#define TCMALLOC_TRACK_DECOMMITED_SPANS (HAVE(VIRTUALALLOC))
98
99#ifndef NDEBUG
100namespace WTF {
101
102#if ENABLE(JSC_MULTIPLE_THREADS)
103static pthread_key_t isForbiddenKey;
104static pthread_once_t isForbiddenKeyOnce = PTHREAD_ONCE_INIT;
105static void initializeIsForbiddenKey()
106{
107 pthread_key_create(&isForbiddenKey, 0);
108}
109
110static bool isForbidden()
111{
112 pthread_once(&isForbiddenKeyOnce, initializeIsForbiddenKey);
113 return !!pthread_getspecific(isForbiddenKey);
114}
115
116void fastMallocForbid()
117{
118 pthread_once(&isForbiddenKeyOnce, initializeIsForbiddenKey);
119 pthread_setspecific(isForbiddenKey, &isForbiddenKey);
120}
121
122void fastMallocAllow()
123{
124 pthread_once(&isForbiddenKeyOnce, initializeIsForbiddenKey);
125 pthread_setspecific(isForbiddenKey, 0);
126}
127
128#else
129
130static bool staticIsForbidden;
131static bool isForbidden()
132{
133 return staticIsForbidden;
134}
135
136void fastMallocForbid()
137{
138 staticIsForbidden = true;
139}
140
141void fastMallocAllow()
142{
143 staticIsForbidden = false;
144}
145#endif // ENABLE(JSC_MULTIPLE_THREADS)
146
147} // namespace WTF
148#endif // NDEBUG
149
150#include <string.h>
151
152namespace WTF {
153
154void* fastZeroedMalloc(size_t n)
155{
156 void* result = fastMalloc(n);
157 memset(result, 0, n);
158 return result;
159}
160
161void* tryFastZeroedMalloc(size_t n)
162{
163 void* result = tryFastMalloc(n);
164 if (!result)
165 return 0;
166 memset(result, 0, n);
167 return result;
168}
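// A minimal caller sketch, illustrative only and excluded from the build with
// "#if 0" (the convention used elsewhere in this file): the try variant must be
// null-checked, while fastZeroedMalloc() may be used unconditionally.
#if 0
static bool exampleTryAllocation(size_t n)
{
    void* zeroed = tryFastZeroedMalloc(n);
    if (!zeroed)
        return false; // propagate the allocation failure to the caller
    fastFree(zeroed);
    return true;
}
#endif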
169
170} // namespace WTF
171
172#if FORCE_SYSTEM_MALLOC
173
174#include <stdlib.h>
175#if !PLATFORM(WIN_OS)
176 #include <pthread.h>
177#endif
178
179namespace WTF {
180
181void* tryFastMalloc(size_t n)
182{
183 ASSERT(!isForbidden());
184 return malloc(n);
185}
186
187void* fastMalloc(size_t n)
188{
189 ASSERT(!isForbidden());
190 void* result = malloc(n);
191 if (!result)
192 abort();
193 return result;
194}
195
196void* tryFastCalloc(size_t n_elements, size_t element_size)
197{
198 ASSERT(!isForbidden());
199 return calloc(n_elements, element_size);
200}
201
202void* fastCalloc(size_t n_elements, size_t element_size)
203{
204 ASSERT(!isForbidden());
205 void* result = calloc(n_elements, element_size);
206 if (!result)
207 abort();
208 return result;
209}
210
211void fastFree(void* p)
212{
213 ASSERT(!isForbidden());
214 free(p);
215}
216
217void* tryFastRealloc(void* p, size_t n)
218{
219 ASSERT(!isForbidden());
220 return realloc(p, n);
221}
222
223void* fastRealloc(void* p, size_t n)
224{
225 ASSERT(!isForbidden());
226 void* result = realloc(p, n);
227 if (!result)
228 abort();
229 return result;
230}
231
232void releaseFastMallocFreeMemory() { }
233
234} // namespace WTF
235
236#if PLATFORM(DARWIN)
237// This symbol is present in the JavaScriptCore exports file even when FastMalloc is disabled.
238// It will never be used in this case, so its type and value are less interesting than its presence.
239extern "C" const int jscore_fastmalloc_introspection = 0;
240#endif
241
242#else // FORCE_SYSTEM_MALLOC
243
244#if HAVE(STDINT_H)
245#include <stdint.h>
246#elif HAVE(INTTYPES_H)
247#include <inttypes.h>
248#else
249#include <sys/types.h>
250#endif
251
252#include "AlwaysInline.h"
253#include "Assertions.h"
254#include "TCPackedCache.h"
255#include "TCPageMap.h"
256#include "TCSpinLock.h"
257#include "TCSystemAlloc.h"
258#include <algorithm>
259#include <errno.h>
260#include <new>
261#include <pthread.h>
262#include <stdarg.h>
263#include <stddef.h>
264#include <stdio.h>
265#if COMPILER(MSVC)
266#ifndef WIN32_LEAN_AND_MEAN
267#define WIN32_LEAN_AND_MEAN
268#endif
269#include <windows.h>
270#endif
271
272#if WTF_CHANGES
273
274#if PLATFORM(DARWIN)
275#include "MallocZoneSupport.h"
276#include <wtf/HashSet.h>
277#endif
278
279#ifndef PRIuS
280#define PRIuS "zu"
281#endif
282
283// Calling pthread_getspecific through a global function pointer is faster than a normal
284// call to the function on Mac OS X, and it's used in performance-critical code. So we
285// use a function pointer. But that's not necessarily faster on other platforms, and we had
286// problems with this technique on Windows, so we'll do this only on Mac OS X.
287#if PLATFORM(DARWIN)
288static void* (*pthread_getspecific_function_pointer)(pthread_key_t) = pthread_getspecific;
289#define pthread_getspecific(key) pthread_getspecific_function_pointer(key)
290#endif
291
292#define DEFINE_VARIABLE(type, name, value, meaning) \
293 namespace FLAG__namespace_do_not_use_directly_use_DECLARE_##type##_instead { \
294 type FLAGS_##name(value); \
295 char FLAGS_no##name; \
296 } \
297 using FLAG__namespace_do_not_use_directly_use_DECLARE_##type##_instead::FLAGS_##name
298
299#define DEFINE_int64(name, value, meaning) \
300 DEFINE_VARIABLE(int64_t, name, value, meaning)
301
302#define DEFINE_double(name, value, meaning) \
303 DEFINE_VARIABLE(double, name, value, meaning)
304
305namespace WTF {
306
307#define malloc fastMalloc
308#define calloc fastCalloc
309#define free fastFree
310#define realloc fastRealloc
311
312#define MESSAGE LOG_ERROR
313#define CHECK_CONDITION ASSERT
314
315#if PLATFORM(DARWIN)
316class TCMalloc_PageHeap;
317class TCMalloc_ThreadCache;
318class TCMalloc_Central_FreeListPadded;
319
320class FastMallocZone {
321public:
322 static void init();
323
324 static kern_return_t enumerate(task_t, void*, unsigned typeMask, vm_address_t zoneAddress, memory_reader_t, vm_range_recorder_t);
325 static size_t goodSize(malloc_zone_t*, size_t size) { return size; }
326 static boolean_t check(malloc_zone_t*) { return true; }
327 static void print(malloc_zone_t*, boolean_t) { }
328 static void log(malloc_zone_t*, void*) { }
329 static void forceLock(malloc_zone_t*) { }
330 static void forceUnlock(malloc_zone_t*) { }
331 static void statistics(malloc_zone_t*, malloc_statistics_t* stats) { memset(stats, 0, sizeof(malloc_statistics_t)); }
332
333private:
334 FastMallocZone(TCMalloc_PageHeap*, TCMalloc_ThreadCache**, TCMalloc_Central_FreeListPadded*);
335 static size_t size(malloc_zone_t*, const void*);
336 static void* zoneMalloc(malloc_zone_t*, size_t);
337 static void* zoneCalloc(malloc_zone_t*, size_t numItems, size_t size);
338 static void zoneFree(malloc_zone_t*, void*);
339 static void* zoneRealloc(malloc_zone_t*, void*, size_t);
340 static void* zoneValloc(malloc_zone_t*, size_t) { LOG_ERROR("valloc is not supported"); return 0; }
341 static void zoneDestroy(malloc_zone_t*) { }
342
343 malloc_zone_t m_zone;
344 TCMalloc_PageHeap* m_pageHeap;
345 TCMalloc_ThreadCache** m_threadHeaps;
346 TCMalloc_Central_FreeListPadded* m_centralCaches;
347};
348
349#endif
350
351#endif
352
353#ifndef WTF_CHANGES
354// This #ifdef should almost never be set. Set NO_TCMALLOC_SAMPLES if
355// you're porting to a system where you really can't get a stacktrace.
356#ifdef NO_TCMALLOC_SAMPLES
357// We use #define so code compiles even if you #include stacktrace.h somehow.
358# define GetStackTrace(stack, depth, skip) (0)
359#else
360# include <google/stacktrace.h>
361#endif
362#endif
363
364// Even if we have support for thread-local storage in the compiler
365// and linker, the OS may not support it. We need to check that at
366// runtime. Right now, we have to keep a manual set of "bad" OSes.
367#if defined(HAVE_TLS)
368 static bool kernel_supports_tls = false; // be conservative
369 static inline bool KernelSupportsTLS() {
370 return kernel_supports_tls;
371 }
372# if !HAVE_DECL_UNAME // if too old for uname, probably too old for TLS
373 static void CheckIfKernelSupportsTLS() {
374 kernel_supports_tls = false;
375 }
376# else
377# include <sys/utsname.h> // DECL_UNAME checked for <sys/utsname.h> too
378 static void CheckIfKernelSupportsTLS() {
379 struct utsname buf;
380 if (uname(&buf) != 0) { // should be impossible
381 MESSAGE("uname failed assuming no TLS support (errno=%d)\n", errno);
382 kernel_supports_tls = false;
383 } else if (strcasecmp(buf.sysname, "linux") == 0) {
384 // The linux case: the first kernel to support TLS was 2.6.0
385 if (buf.release[0] < '2' && buf.release[1] == '.') // 0.x or 1.x
386 kernel_supports_tls = false;
387 else if (buf.release[0] == '2' && buf.release[1] == '.' &&
388 buf.release[2] >= '0' && buf.release[2] < '6' &&
389 buf.release[3] == '.') // 2.0 - 2.5
390 kernel_supports_tls = false;
391 else
392 kernel_supports_tls = true;
393 } else { // some other kernel, we'll be optimistic
394 kernel_supports_tls = true;
395 }
396 // TODO(csilvers): VLOG(1) the tls status once we support RAW_VLOG
397 }
398# endif // HAVE_DECL_UNAME
399#endif // HAVE_TLS
400
401// __THROW is defined in glibc systems. It means, counter-intuitively,
402// "This function will never throw an exception." It's an optional
403// optimization tool, but we may need to use it to match glibc prototypes.
404#ifndef __THROW // I guess we're not on a glibc system
405# define __THROW // __THROW is just an optimization, so ok to make it ""
406#endif
407
408//-------------------------------------------------------------------
409// Configuration
410//-------------------------------------------------------------------
411
412// Not all possible combinations of the following parameters make
413// sense. In particular, if kMaxSize increases, you may have to
414// increase kNumClasses as well.
415static const size_t kPageShift = 12;
416static const size_t kPageSize = 1 << kPageShift;
417static const size_t kMaxSize = 8u * kPageSize;
418static const size_t kAlignShift = 3;
419static const size_t kAlignment = 1 << kAlignShift;
420static const size_t kNumClasses = 68;
421
422// Allocates a big block of memory for the pagemap once we reach more than
423// 128MB
424static const size_t kPageMapBigAllocationThreshold = 128 << 20;
425
426// Minimum number of pages to fetch from system at a time. Must be
427// significantly bigger than kBlockSize to amortize system-call
428// overhead, and also to reduce external fragmentation. Also, we
429// should keep this value big because various incarnations of Linux
430// have small limits on the number of mmap() regions per
431// address-space.
432static const size_t kMinSystemAlloc = 1 << (20 - kPageShift);
433
434// Number of objects to move between a per-thread list and a central
435// list in one shot. We want this to be not too small so we can
436// amortize the lock overhead for accessing the central list. Making
437// it too big may temporarily cause unnecessary memory wastage in the
438// per-thread free list until the scavenger cleans up the list.
439static int num_objects_to_move[kNumClasses];
440
441// Maximum length we allow a per-thread free-list to have before we
442// move objects from it into the corresponding central free-list. We
443// want this big to avoid locking the central free-list too often. It
444// should not hurt to make this list somewhat big because the
445// scavenging code will shrink it down when its contents are not in use.
446static const int kMaxFreeListLength = 256;
447
448// Lower and upper bounds on the per-thread cache sizes
449static const size_t kMinThreadCacheSize = kMaxSize * 2;
450static const size_t kMaxThreadCacheSize = 2 << 20;
451
452// Default bound on the total amount of thread caches
453static const size_t kDefaultOverallThreadCacheSize = 16 << 20;
454
455// For all span-lengths < kMaxPages we keep an exact-size list.
456// REQUIRED: kMaxPages >= kMinSystemAlloc;
457static const size_t kMaxPages = kMinSystemAlloc;
458
459/* The smallest prime > 2^n */
460static int primes_list[] = {
461 // Small values might cause high rates of sampling
462 // and hence are commented out.
463 // 2, 5, 11, 17, 37, 67, 131, 257,
464 // 521, 1031, 2053, 4099, 8209, 16411,
465 32771, 65537, 131101, 262147, 524309, 1048583,
466 2097169, 4194319, 8388617, 16777259, 33554467 };
467
468// Twice the approximate gap between sampling actions.
469// I.e., we take one sample approximately once every
470// tcmalloc_sample_parameter/2
471// bytes of allocation, i.e., ~ once every 128KB.
472// Must be a prime number.
473#ifdef NO_TCMALLOC_SAMPLES
474DEFINE_int64(tcmalloc_sample_parameter, 0,
475 "Unused: code is compiled with NO_TCMALLOC_SAMPLES");
476static size_t sample_period = 0;
477#else
478DEFINE_int64(tcmalloc_sample_parameter, 262147,
479 "Twice the approximate gap between sampling actions."
480 " Must be a prime number. Otherwise will be rounded up to a "
481 " larger prime number");
482static size_t sample_period = 262147;
483#endif
484
485// Protects sample_period above
486static SpinLock sample_period_lock = SPINLOCK_INITIALIZER;
487
488// Parameters for controlling how fast memory is returned to the OS.
489
490DEFINE_double(tcmalloc_release_rate, 1,
491 "Rate at which we release unused memory to the system. "
492 "Zero means we never release memory back to the system. "
493 "Increase this flag to return memory faster; decrease it "
494 "to return memory slower. Reasonable rates are in the "
495 "range [0,10]");
496
497//-------------------------------------------------------------------
498// Mapping from size to size_class and vice versa
499//-------------------------------------------------------------------
500
501// Sizes <= 1024 have an alignment >= 8. So for such sizes we have an
502// array indexed by ceil(size/8). Sizes > 1024 have an alignment >= 128.
503// So for these larger sizes we have an array indexed by ceil(size/128).
504//
505// We flatten both logical arrays into one physical array and use
506// arithmetic to compute an appropriate index. The constants used by
507// ClassIndex() were selected to make the flattening work.
508//
509// Examples:
510// Size Expression Index
511// -------------------------------------------------------
512// 0 (0 + 7) / 8 0
513// 1 (1 + 7) / 8 1
514// ...
515// 1024 (1024 + 7) / 8 128
516// 1025 (1025 + 127 + (120<<7)) / 128 129
517// ...
518// 32768 (32768 + 127 + (120<<7)) / 128 376
519static const size_t kMaxSmallSize = 1024;
520static const int shift_amount[2] = { 3, 7 }; // For divides by 8 or 128
521static const int add_amount[2] = { 7, 127 + (120 << 7) };
522static unsigned char class_array[377];
523
524// Compute index of the class_array[] entry for a given size
525static inline int ClassIndex(size_t s) {
526 const int i = (s > kMaxSmallSize);
527 return static_cast<int>((s + add_amount[i]) >> shift_amount[i]);
528}
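// An illustrative sanity sketch, excluded from the build with "#if 0"; each
// expected index is taken directly from the worked examples in the comment
// above ClassIndex().
#if 0
static void CheckClassIndexExamples() {
    ASSERT(ClassIndex(0) == 0);
    ASSERT(ClassIndex(1) == 1);
    ASSERT(ClassIndex(1024) == 128);
    ASSERT(ClassIndex(1025) == 129);
    ASSERT(ClassIndex(32768) == 376);
}
#endif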
529
530// Mapping from size class to max size storable in that class
531static size_t class_to_size[kNumClasses];
532
533// Mapping from size class to number of pages to allocate at a time
534static size_t class_to_pages[kNumClasses];
535
536// TransferCache is used to cache transfers of num_objects_to_move[size_class]
537// back and forth between thread caches and the central cache for a given size
538// class.
539struct TCEntry {
540 void *head; // Head of chain of objects.
541 void *tail; // Tail of chain of objects.
542};
543// A central cache freelist can have anywhere from 0 to kNumTransferEntries
544// slots to put linked list chains into. To keep memory usage bounded, the total
545// number of TCEntries across size classes is fixed. Currently each size
546// class is initially given one TCEntry which also means that the maximum any
547// one class can have is kNumClasses.
548static const int kNumTransferEntries = kNumClasses;
549
550// Note: the following only works for "n"s that fit in 32-bits, but
551// that is fine since we only use it for small sizes.
552static inline int LgFloor(size_t n) {
553 int log = 0;
554 for (int i = 4; i >= 0; --i) {
555 int shift = (1 << i);
556 size_t x = n >> shift;
557 if (x != 0) {
558 n = x;
559 log += shift;
560 }
561 }
562 ASSERT(n == 1);
563 return log;
564}
565
566// Some very basic linked list functions for using void* as
567// storage.
568
569static inline void *SLL_Next(void *t) {
570 return *(reinterpret_cast<void**>(t));
571}
572
573static inline void SLL_SetNext(void *t, void *n) {
574 *(reinterpret_cast<void**>(t)) = n;
575}
576
577static inline void SLL_Push(void **list, void *element) {
578 SLL_SetNext(element, *list);
579 *list = element;
580}
581
582static inline void *SLL_Pop(void **list) {
583 void *result = *list;
584 *list = SLL_Next(*list);
585 return result;
586}
587
588
589// Remove N elements from a linked list to which head points. head will be
590// modified to point to the new head. start and end will point to the first
591// and last nodes of the range. Note that end will point to NULL after this
592// function is called.
593static inline void SLL_PopRange(void **head, int N, void **start, void **end) {
594 if (N == 0) {
595 *start = NULL;
596 *end = NULL;
597 return;
598 }
599
600 void *tmp = *head;
601 for (int i = 1; i < N; ++i) {
602 tmp = SLL_Next(tmp);
603 }
604
605 *start = *head;
606 *end = tmp;
607 *head = SLL_Next(tmp);
608 // Unlink range from list.
609 SLL_SetNext(tmp, NULL);
610}
611
612static inline void SLL_PushRange(void **head, void *start, void *end) {
613 if (!start) return;
614 SLL_SetNext(end, *head);
615 *head = start;
616}
617
618static inline size_t SLL_Size(void *head) {
619 int count = 0;
620 while (head) {
621 count++;
622 head = SLL_Next(head);
623 }
624 return count;
625}
626
627// Setup helper functions.
628
629static ALWAYS_INLINE size_t SizeClass(size_t size) {
630 return class_array[ClassIndex(size)];
631}
632
633// Get the byte-size for a specified class
634static ALWAYS_INLINE size_t ByteSizeForClass(size_t cl) {
635 return class_to_size[cl];
636}
637static int NumMoveSize(size_t size) {
638 if (size == 0) return 0;
639 // Use approx 64k transfers between thread and central caches.
640 int num = static_cast<int>(64.0 * 1024.0 / size);
641 if (num < 2) num = 2;
642 // Clamp well below kMaxFreeListLength to avoid ping pong between central
643 // and thread caches.
644 if (num > static_cast<int>(0.8 * kMaxFreeListLength))
645 num = static_cast<int>(0.8 * kMaxFreeListLength);
646
647 // Also, avoid bringing too many objects into small object free
648 // lists. There are lots of such lists, and if we allow each one to
649 // fetch too many at a time, we end up having to scavenge too often
650 // (especially when there are lots of threads and each thread gets a
651 // small allowance for its thread cache).
652 //
653 // TODO: Make thread cache free list sizes dynamic so that we do not
654 // have to equally divide a fixed resource amongst lots of threads.
655 if (num > 32) num = 32;
656
657 return num;
658}
659
660// Initialize the mapping arrays
661static void InitSizeClasses() {
662 // Do some sanity checking on add_amount[]/shift_amount[]/class_array[]
663 if (ClassIndex(0) < 0) {
664 MESSAGE("Invalid class index %d for size 0\n", ClassIndex(0));
665 abort();
666 }
667 if (static_cast<size_t>(ClassIndex(kMaxSize)) >= sizeof(class_array)) {
668 MESSAGE("Invalid class index %d for kMaxSize\n", ClassIndex(kMaxSize));
669 abort();
670 }
671
672 // Compute the size classes we want to use
673 size_t sc = 1; // Next size class to assign
674 unsigned char alignshift = kAlignShift;
675 int last_lg = -1;
676 for (size_t size = kAlignment; size <= kMaxSize; size += (1 << alignshift)) {
677 int lg = LgFloor(size);
678 if (lg > last_lg) {
679 // Increase alignment every so often.
680 //
681 // Since we double the alignment every time size doubles and
682 // size >= 128, this means that space wasted due to alignment is
683 // at most 16/128 i.e., 12.5%. Plus we cap the alignment at 256
684 // bytes, so the space wasted as a percentage starts falling for
685 // sizes > 2K.
686 if ((lg >= 7) && (alignshift < 8)) {
687 alignshift++;
688 }
689 last_lg = lg;
690 }
691
692 // Allocate enough pages so leftover is less than 1/8 of total.
693 // This bounds wasted space to at most 12.5%.
694 size_t psize = kPageSize;
695 while ((psize % size) > (psize >> 3)) {
696 psize += kPageSize;
697 }
698 const size_t my_pages = psize >> kPageShift;
699
700 if (sc > 1 && my_pages == class_to_pages[sc-1]) {
701 // See if we can merge this into the previous class without
702 // increasing the fragmentation of the previous class.
703 const size_t my_objects = (my_pages << kPageShift) / size;
704 const size_t prev_objects = (class_to_pages[sc-1] << kPageShift)
705 / class_to_size[sc-1];
706 if (my_objects == prev_objects) {
707 // Adjust last class to include this size
708 class_to_size[sc-1] = size;
709 continue;
710 }
711 }
712
713 // Add new class
714 class_to_pages[sc] = my_pages;
715 class_to_size[sc] = size;
716 sc++;
717 }
718 if (sc != kNumClasses) {
719 MESSAGE("wrong number of size classes: found %" PRIuS " instead of %d\n",
720 sc, int(kNumClasses));
721 abort();
722 }
723
724 // Initialize the mapping arrays
725 int next_size = 0;
726 for (unsigned char c = 1; c < kNumClasses; c++) {
727 const size_t max_size_in_class = class_to_size[c];
728 for (size_t s = next_size; s <= max_size_in_class; s += kAlignment) {
729 class_array[ClassIndex(s)] = c;
730 }
731 next_size = static_cast<int>(max_size_in_class + kAlignment);
732 }
733
734 // Double-check sizes just to be safe
735 for (size_t size = 0; size <= kMaxSize; size++) {
736 const size_t sc = SizeClass(size);
737 if (sc == 0) {
738 MESSAGE("Bad size class %" PRIuS " for %" PRIuS "\n", sc, size);
739 abort();
740 }
741 if (sc > 1 && size <= class_to_size[sc-1]) {
742 MESSAGE("Allocating unnecessarily large class %" PRIuS " for %" PRIuS
743 "\n", sc, size);
744 abort();
745 }
746 if (sc >= kNumClasses) {
747 MESSAGE("Bad size class %" PRIuS " for %" PRIuS "\n", sc, size);
748 abort();
749 }
750 const size_t s = class_to_size[sc];
751 if (size > s) {
752 MESSAGE("Bad size %" PRIuS " for %" PRIuS " (sc = %" PRIuS ")\n", s, size, sc);
753 abort();
754 }
755 if (s == 0) {
756 MESSAGE("Bad size %" PRIuS " for %" PRIuS " (sc = %" PRIuS ")\n", s, size, sc);
757 abort();
758 }
759 }
760
761 // Initialize the num_objects_to_move array.
762 for (size_t cl = 1; cl < kNumClasses; ++cl) {
763 num_objects_to_move[cl] = NumMoveSize(ByteSizeForClass(cl));
764 }
765
766#ifndef WTF_CHANGES
767 if (false) {
768 // Dump class sizes and maximum external wastage per size class
769 for (size_t cl = 1; cl < kNumClasses; ++cl) {
770 const int alloc_size = class_to_pages[cl] << kPageShift;
771 const int alloc_objs = alloc_size / class_to_size[cl];
772 const int min_used = (class_to_size[cl-1] + 1) * alloc_objs;
773 const int max_waste = alloc_size - min_used;
774 MESSAGE("SC %3d [ %8d .. %8d ] from %8d ; %2.0f%% maxwaste\n",
775 int(cl),
776 int(class_to_size[cl-1] + 1),
777 int(class_to_size[cl]),
778 int(class_to_pages[cl] << kPageShift),
779 max_waste * 100.0 / alloc_size
780 );
781 }
782 }
783#endif
784}
785
786// -------------------------------------------------------------------------
787// Simple allocator for objects of a specified type. External locking
788// is required before accessing one of these objects.
789// -------------------------------------------------------------------------
790
791// Metadata allocator -- keeps stats about how many bytes allocated
792static uint64_t metadata_system_bytes = 0;
793static void* MetaDataAlloc(size_t bytes) {
794 void* result = TCMalloc_SystemAlloc(bytes, 0);
795 if (result != NULL) {
796 metadata_system_bytes += bytes;
797 }
798 return result;
799}
800
801template <class T>
802class PageHeapAllocator {
803 private:
804 // How much to allocate from system at a time
805 static const size_t kAllocIncrement = 32 << 10;
806
807 // Aligned size of T
808 static const size_t kAlignedSize
809 = (((sizeof(T) + kAlignment - 1) / kAlignment) * kAlignment);
810
811 // Free area from which to carve new objects
812 char* free_area_;
813 size_t free_avail_;
814
815 // Free list of already carved objects
816 void* free_list_;
817
818 // Number of allocated but unfreed objects
819 int inuse_;
820
821 public:
822 void Init() {
823 ASSERT(kAlignedSize <= kAllocIncrement);
824 inuse_ = 0;
825 free_area_ = NULL;
826 free_avail_ = 0;
827 free_list_ = NULL;
828 }
829
830 T* New() {
831 // Consult free list
832 void* result;
833 if (free_list_ != NULL) {
834 result = free_list_;
835 free_list_ = *(reinterpret_cast<void**>(result));
836 } else {
837 if (free_avail_ < kAlignedSize) {
838 // Need more room
839 free_area_ = reinterpret_cast<char*>(MetaDataAlloc(kAllocIncrement));
840 if (free_area_ == NULL) abort();
841 free_avail_ = kAllocIncrement;
842 }
843 result = free_area_;
844 free_area_ += kAlignedSize;
845 free_avail_ -= kAlignedSize;
846 }
847 inuse_++;
848 return reinterpret_cast<T*>(result);
849 }
850
851 void Delete(T* p) {
852 *(reinterpret_cast<void**>(p)) = free_list_;
853 free_list_ = p;
854 inuse_--;
855 }
856
857 int inuse() const { return inuse_; }
858};
859
860// -------------------------------------------------------------------------
861// Span - a contiguous run of pages
862// -------------------------------------------------------------------------
863
864// Type that can hold a page number
865typedef uintptr_t PageID;
866
867// Type that can hold the length of a run of pages
868typedef uintptr_t Length;
869
870static const Length kMaxValidPages = (~static_cast<Length>(0)) >> kPageShift;
871
872// Convert byte size into pages. This won't overflow, but may return
873// an unreasonably large value if bytes is huge enough.
874static inline Length pages(size_t bytes) {
875 return (bytes >> kPageShift) +
876 ((bytes & (kPageSize - 1)) > 0 ? 1 : 0);
877}
878
879// Convert a user size into the number of bytes that will actually be
880// allocated
881static size_t AllocationSize(size_t bytes) {
882 if (bytes > kMaxSize) {
883 // Large object: we allocate an integral number of pages
884 ASSERT(bytes <= (kMaxValidPages << kPageShift));
885 return pages(bytes) << kPageShift;
886 } else {
887 // Small object: find the size class to which it belongs
888 return ByteSizeForClass(SizeClass(bytes));
889 }
890}
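// Illustrative only, excluded from the build with "#if 0": a request just past
// kMaxSize takes the large-object path and is rounded up to whole pages
// (9 pages of 4KB for kMaxSize + 1), rather than up to a size class.
#if 0
static void CheckLargeAllocationSizeExample() {
    ASSERT(AllocationSize(kMaxSize + 1) == 9 * kPageSize);
}
#endif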
891
892// Information kept for a span (a contiguous run of pages).
893struct Span {
894 PageID start; // Starting page number
895 Length length; // Number of pages in span
896 Span* next; // Used when in linked list
897 Span* prev; // Used when in linked list
898 void* objects; // Linked list of free objects
899 unsigned int free : 1; // Is the span free
900#ifndef NO_TCMALLOC_SAMPLES
901 unsigned int sample : 1; // Sampled object?
902#endif
903 unsigned int sizeclass : 8; // Size-class for small objects (or 0)
904 unsigned int refcount : 11; // Number of non-free objects
905 bool decommitted : 1;
906
907#undef SPAN_HISTORY
908#ifdef SPAN_HISTORY
909 // For debugging, we can keep a log of events per span
910 int nexthistory;
911 char history[64];
912 int value[64];
913#endif
914};
915
916#if TCMALLOC_TRACK_DECOMMITED_SPANS
917#define ASSERT_SPAN_COMMITTED(span) ASSERT(!span->decommitted)
918#else
919#define ASSERT_SPAN_COMMITTED(span)
920#endif
921
922#ifdef SPAN_HISTORY
923void Event(Span* span, char op, int v = 0) {
924 span->history[span->nexthistory] = op;
925 span->value[span->nexthistory] = v;
926 span->nexthistory++;
927 if (span->nexthistory == sizeof(span->history)) span->nexthistory = 0;
928}
929#else
930#define Event(s,o,v) ((void) 0)
931#endif
932
933// Allocator/deallocator for spans
934static PageHeapAllocator<Span> span_allocator;
935static Span* NewSpan(PageID p, Length len) {
936 Span* result = span_allocator.New();
937 memset(result, 0, sizeof(*result));
938 result->start = p;
939 result->length = len;
940#ifdef SPAN_HISTORY
941 result->nexthistory = 0;
942#endif
943 return result;
944}
945
946static inline void DeleteSpan(Span* span) {
947#ifndef NDEBUG
948 // In debug mode, trash the contents of deleted Spans
949 memset(span, 0x3f, sizeof(*span));
950#endif
951 span_allocator.Delete(span);
952}
953
954// -------------------------------------------------------------------------
955// Doubly linked list of spans.
956// -------------------------------------------------------------------------
957
958static inline void DLL_Init(Span* list) {
959 list->next = list;
960 list->prev = list;
961}
962
963static inline void DLL_Remove(Span* span) {
964 span->prev->next = span->next;
965 span->next->prev = span->prev;
966 span->prev = NULL;
967 span->next = NULL;
968}
969
970static ALWAYS_INLINE bool DLL_IsEmpty(const Span* list) {
971 return list->next == list;
972}
973
974#ifndef WTF_CHANGES
975static int DLL_Length(const Span* list) {
976 int result = 0;
977 for (Span* s = list->next; s != list; s = s->next) {
978 result++;
979 }
980 return result;
981}
982#endif
983
984#if 0 /* Not needed at the moment -- causes compiler warnings if not used */
985static void DLL_Print(const char* label, const Span* list) {
986 MESSAGE("%-10s %p:", label, list);
987 for (const Span* s = list->next; s != list; s = s->next) {
988 MESSAGE(" <%p,%u,%u>", s, s->start, s->length);
989 }
990 MESSAGE("\n");
991}
992#endif
993
994static inline void DLL_Prepend(Span* list, Span* span) {
995 ASSERT(span->next == NULL);
996 ASSERT(span->prev == NULL);
997 span->next = list->next;
998 span->prev = list;
999 list->next->prev = span;
1000 list->next = span;
1001}
1002
1003// -------------------------------------------------------------------------
1004// Stack traces kept for sampled allocations
1005// The following state is protected by pageheap_lock_.
1006// -------------------------------------------------------------------------
1007
1008// size/depth are made the same size as a pointer so that some generic
1009// code below can conveniently cast them back and forth to void*.
1010static const int kMaxStackDepth = 31;
1011struct StackTrace {
1012 uintptr_t size; // Size of object
1013 uintptr_t depth; // Number of PC values stored in array below
1014 void* stack[kMaxStackDepth];
1015};
1016static PageHeapAllocator<StackTrace> stacktrace_allocator;
1017static Span sampled_objects;
1018
1019// -------------------------------------------------------------------------
1020// Map from page-id to per-page data
1021// -------------------------------------------------------------------------
1022
1023// We use PageMap2<> for 32-bit and PageMap3<> for 64-bit machines.
1024// We also use a simple one-level cache for hot PageID-to-sizeclass mappings,
1025// because sometimes the sizeclass is all the information we need.
1026
1027// Selector class -- general selector uses 3-level map
1028template <int BITS> class MapSelector {
1029 public:
1030 typedef TCMalloc_PageMap3<BITS-kPageShift> Type;
1031 typedef PackedCache<BITS, uint64_t> CacheType;
1032};
1033
1034// A two-level map for 32-bit machines
1035template <> class MapSelector<32> {
1036 public:
1037 typedef TCMalloc_PageMap2<32-kPageShift> Type;
1038 typedef PackedCache<32-kPageShift, uint16_t> CacheType;
1039};
1040
1041// -------------------------------------------------------------------------
1042// Page-level allocator
1043// * Eager coalescing
1044//
1045// Heap for page-level allocation. We allow allocating and freeing a
1046// contiguous run of pages (called a "span").
1047// -------------------------------------------------------------------------
1048
1049class TCMalloc_PageHeap {
1050 public:
1051 void init();
1052
1053 // Allocate a run of "n" pages. Returns zero if out of memory.
1054 Span* New(Length n);
1055
1056 // Delete the span "[p, p+n-1]".
1057 // REQUIRES: span was returned by earlier call to New() and
1058 // has not yet been deleted.
1059 void Delete(Span* span);
1060
1061 // Mark an allocated span as being used for small objects of the
1062 // specified size-class.
1063 // REQUIRES: span was returned by an earlier call to New()
1064 // and has not yet been deleted.
1065 void RegisterSizeClass(Span* span, size_t sc);
1066
1067 // Split an allocated span into two spans: one of length "n" pages
1068 // followed by another span of length "span->length - n" pages.
1069 // Modifies "*span" to point to the first span of length "n" pages.
1070 // Returns a pointer to the second span.
1071 //
1072 // REQUIRES: "0 < n < span->length"
1073 // REQUIRES: !span->free
1074 // REQUIRES: span->sizeclass == 0
1075 Span* Split(Span* span, Length n);
1076
1077 // Return the descriptor for the specified page.
1078 inline Span* GetDescriptor(PageID p) const {
1079 return reinterpret_cast<Span*>(pagemap_.get(p));
1080 }
1081
1082#ifdef WTF_CHANGES
1083 inline Span* GetDescriptorEnsureSafe(PageID p)
1084 {
1085 pagemap_.Ensure(p, 1);
1086 return GetDescriptor(p);
1087 }
1088#endif
1089
1090 // Dump state to stderr
1091#ifndef WTF_CHANGES
1092 void Dump(TCMalloc_Printer* out);
1093#endif
1094
1095 // Return number of bytes allocated from system
1096 inline uint64_t SystemBytes() const { return system_bytes_; }
1097
1098 // Return number of free bytes in heap
1099 uint64_t FreeBytes() const {
1100 return (static_cast<uint64_t>(free_pages_) << kPageShift);
1101 }
1102
1103 bool Check();
1104 bool CheckList(Span* list, Length min_pages, Length max_pages);
1105
1106 // Release all pages on the free list for reuse by the OS:
1107 void ReleaseFreePages();
1108
1109 // Return 0 if we have no information, or else the correct sizeclass for p.
1110 // Reads and writes to pagemap_cache_ do not require locking.
1111 // The entries are 64 bits on 64-bit hardware and 16 bits on
1112 // 32-bit hardware, and we don't mind raciness as long as each read of
1113 // an entry yields a valid entry, not a partially updated entry.
1114 size_t GetSizeClassIfCached(PageID p) const {
1115 return pagemap_cache_.GetOrDefault(p, 0);
1116 }
1117 void CacheSizeClass(PageID p, size_t cl) const { pagemap_cache_.Put(p, cl); }
1118
1119 private:
1120 // Pick the appropriate map and cache types based on pointer size
1121 typedef MapSelector<8*sizeof(uintptr_t)>::Type PageMap;
1122 typedef MapSelector<8*sizeof(uintptr_t)>::CacheType PageMapCache;
1123 PageMap pagemap_;
1124 mutable PageMapCache pagemap_cache_;
1125
1126 // We segregate spans of a given size into two circular linked
1127 // lists: one for normal spans, and one for spans whose memory
1128 // has been returned to the system.
1129 struct SpanList {
1130 Span normal;
1131 Span returned;
1132 };
1133
1134 // List of free spans of length >= kMaxPages
1135 SpanList large_;
1136
1137 // Array mapping from span length to a doubly linked list of free spans
1138 SpanList free_[kMaxPages];
1139
1140 // Number of pages kept in free lists
1141 uintptr_t free_pages_;
1142
1143 // Bytes allocated from system
1144 uint64_t system_bytes_;
1145
1146 bool GrowHeap(Length n);
1147
1148 // REQUIRES span->length >= n
1149 // Remove span from its free list, and move any leftover part of
1150 // span into appropriate free lists. Also update "span" to have
1151 // length exactly "n" and mark it as non-free so it can be returned
1152 // to the client.
1153 //
1154 // "released" is true iff "span" was found on a "returned" list.
1155 void Carve(Span* span, Length n, bool released);
1156
1157 void RecordSpan(Span* span) {
1158 pagemap_.set(span->start, span);
1159 if (span->length > 1) {
1160 pagemap_.set(span->start + span->length - 1, span);
1161 }
1162 }
1163
1164 // Allocate a large span of length == n. If successful, returns a
1165 // span of exactly the specified length. Else, returns NULL.
1166 Span* AllocLarge(Length n);
1167
1168 // Incrementally release some memory to the system.
1169 // IncrementalScavenge(n) is called whenever n pages are freed.
1170 void IncrementalScavenge(Length n);
1171
1172 // Number of pages to deallocate before doing more scavenging
1173 int64_t scavenge_counter_;
1174
1175 // Index of last free list we scavenged
1176 size_t scavenge_index_;
1177
1178#if defined(WTF_CHANGES) && PLATFORM(DARWIN)
1179 friend class FastMallocZone;
1180#endif
1181};
1182
1183void TCMalloc_PageHeap::init()
1184{
1185 pagemap_.init(MetaDataAlloc);
1186 pagemap_cache_ = PageMapCache(0);
1187 free_pages_ = 0;
1188 system_bytes_ = 0;
1189 scavenge_counter_ = 0;
1190 // Start scavenging at kMaxPages list
1191 scavenge_index_ = kMaxPages-1;
1192 COMPILE_ASSERT(kNumClasses <= (1 << PageMapCache::kValuebits), valuebits);
1193 DLL_Init(&large_.normal);
1194 DLL_Init(&large_.returned);
1195 for (size_t i = 0; i < kMaxPages; i++) {
1196 DLL_Init(&free_[i].normal);
1197 DLL_Init(&free_[i].returned);
1198 }
1199}
1200
1201inline Span* TCMalloc_PageHeap::New(Length n) {
1202 ASSERT(Check());
1203 ASSERT(n > 0);
1204
1205 // Find first size >= n that has a non-empty list
1206 for (Length s = n; s < kMaxPages; s++) {
1207 Span* ll = NULL;
1208 bool released = false;
1209 if (!DLL_IsEmpty(&free_[s].normal)) {
1210 // Found normal span
1211 ll = &free_[s].normal;
1212 } else if (!DLL_IsEmpty(&free_[s].returned)) {
1213 // Found returned span; reallocate it
1214 ll = &free_[s].returned;
1215 released = true;
1216 } else {
1217 // Keep looking in larger classes
1218 continue;
1219 }
1220
1221 Span* result = ll->next;
1222 Carve(result, n, released);
1223#if TCMALLOC_TRACK_DECOMMITED_SPANS
1224 if (result->decommitted) {
1225 TCMalloc_SystemCommit(reinterpret_cast<void*>(result->start << kPageShift), static_cast<size_t>(n << kPageShift));
1226 result->decommitted = false;
1227 }
1228#endif
1229 ASSERT(Check());
1230 free_pages_ -= n;
1231 return result;
1232 }
1233
1234 Span* result = AllocLarge(n);
1235 if (result != NULL) {
1236 ASSERT_SPAN_COMMITTED(result);
1237 return result;
1238 }
1239
1240 // Grow the heap and try again
1241 if (!GrowHeap(n)) {
1242 ASSERT(Check());
1243 return NULL;
1244 }
1245
1246 return AllocLarge(n);
1247}
1248
1249Span* TCMalloc_PageHeap::AllocLarge(Length n) {
1250 // Find the best span (closest to n in size).
1251 // The following loops implement address-ordered best-fit.
1252 bool from_released = false;
1253 Span *best = NULL;
1254
1255 // Search through normal list
1256 for (Span* span = large_.normal.next;
1257 span != &large_.normal;
1258 span = span->next) {
1259 if (span->length >= n) {
1260 if ((best == NULL)
1261 || (span->length < best->length)
1262 || ((span->length == best->length) && (span->start < best->start))) {
1263 best = span;
1264 from_released = false;
1265 }
1266 }
1267 }
1268
1269 // Search through released list in case it has a better fit
1270 for (Span* span = large_.returned.next;
1271 span != &large_.returned;
1272 span = span->next) {
1273 if (span->length >= n) {
1274 if ((best == NULL)
1275 || (span->length < best->length)
1276 || ((span->length == best->length) && (span->start < best->start))) {
1277 best = span;
1278 from_released = true;
1279 }
1280 }
1281 }
1282
1283 if (best != NULL) {
1284 Carve(best, n, from_released);
1285#if TCMALLOC_TRACK_DECOMMITED_SPANS
1286 if (best->decommitted) {
1287 TCMalloc_SystemCommit(reinterpret_cast<void*>(best->start << kPageShift), static_cast<size_t>(n << kPageShift));
1288 best->decommitted = false;
1289 }
1290#endif
1291 ASSERT(Check());
1292 free_pages_ -= n;
1293 return best;
1294 }
1295 return NULL;
1296}
1297
1298Span* TCMalloc_PageHeap::Split(Span* span, Length n) {
1299 ASSERT(0 < n);
1300 ASSERT(n < span->length);
1301 ASSERT(!span->free);
1302 ASSERT(span->sizeclass == 0);
1303 Event(span, 'T', n);
1304
1305 const Length extra = span->length - n;
1306 Span* leftover = NewSpan(span->start + n, extra);
1307 Event(leftover, 'U', extra);
1308 RecordSpan(leftover);
1309 pagemap_.set(span->start + n - 1, span); // Update map from pageid to span
1310 span->length = n;
1311
1312 return leftover;
1313}
1314
1315#if !TCMALLOC_TRACK_DECOMMITED_SPANS
1316static ALWAYS_INLINE void propagateDecommittedState(Span*, Span*) { }
1317#else
1318static ALWAYS_INLINE void propagateDecommittedState(Span* destination, Span* source)
1319{
1320 destination->decommitted = source->decommitted;
1321}
1322#endif
1323
1324inline void TCMalloc_PageHeap::Carve(Span* span, Length n, bool released) {
1325 ASSERT(n > 0);
1326 DLL_Remove(span);
1327 span->free = 0;
1328 Event(span, 'A', n);
1329
1330 const int extra = static_cast<int>(span->length - n);
1331 ASSERT(extra >= 0);
1332 if (extra > 0) {
1333 Span* leftover = NewSpan(span->start + n, extra);
1334 leftover->free = 1;
1335 propagateDecommittedState(leftover, span);
1336 Event(leftover, 'S', extra);
1337 RecordSpan(leftover);
1338
1339 // Place leftover span on appropriate free list
1340 SpanList* listpair = (static_cast<size_t>(extra) < kMaxPages) ? &free_[extra] : &large_;
1341 Span* dst = released ? &listpair->returned : &listpair->normal;
1342 DLL_Prepend(dst, leftover);
1343
1344 span->length = n;
1345 pagemap_.set(span->start + n - 1, span);
1346 }
1347}
1348
1349#if !TCMALLOC_TRACK_DECOMMITED_SPANS
1350static ALWAYS_INLINE void mergeDecommittedStates(Span*, Span*) { }
1351#else
1352static ALWAYS_INLINE void mergeDecommittedStates(Span* destination, Span* other)
1353{
1354 if (other->decommitted)
1355 destination->decommitted = true;
1356}
1357#endif
1358
1359inline void TCMalloc_PageHeap::Delete(Span* span) {
1360 ASSERT(Check());
1361 ASSERT(!span->free);
1362 ASSERT(span->length > 0);
1363 ASSERT(GetDescriptor(span->start) == span);
1364 ASSERT(GetDescriptor(span->start + span->length - 1) == span);
1365 span->sizeclass = 0;
1366#ifndef NO_TCMALLOC_SAMPLES
1367 span->sample = 0;
1368#endif
1369
1370 // Coalesce -- we guarantee that "p" != 0, so no bounds checking
1371 // necessary. We do not bother resetting the stale pagemap
1372 // entries for the pieces we are merging together because we only
1373 // care about the pagemap entries for the boundaries.
1374 //
1375 // Note that the spans we merge into "span" may come out of
1376 // a "returned" list. For simplicity, we move these into the
1377 // "normal" list of the appropriate size class.
1378 const PageID p = span->start;
1379 const Length n = span->length;
1380 Span* prev = GetDescriptor(p-1);
1381 if (prev != NULL && prev->free) {
1382 // Merge preceding span into this span
1383 ASSERT(prev->start + prev->length == p);
1384 const Length len = prev->length;
1385 mergeDecommittedStates(span, prev);
1386 DLL_Remove(prev);
1387 DeleteSpan(prev);
1388 span->start -= len;
1389 span->length += len;
1390 pagemap_.set(span->start, span);
1391 Event(span, 'L', len);
1392 }
1393 Span* next = GetDescriptor(p+n);
1394 if (next != NULL && next->free) {
1395 // Merge next span into this span
1396 ASSERT(next->start == p+n);
1397 const Length len = next->length;
1398 mergeDecommittedStates(span, next);
1399 DLL_Remove(next);
1400 DeleteSpan(next);
1401 span->length += len;
1402 pagemap_.set(span->start + span->length - 1, span);
1403 Event(span, 'R', len);
1404 }
1405
1406 Event(span, 'D', span->length);
1407 span->free = 1;
1408 if (span->length < kMaxPages) {
1409 DLL_Prepend(&free_[span->length].normal, span);
1410 } else {
1411 DLL_Prepend(&large_.normal, span);
1412 }
1413 free_pages_ += n;
1414
1415 IncrementalScavenge(n);
1416 ASSERT(Check());
1417}
1418
1419void TCMalloc_PageHeap::IncrementalScavenge(Length n) {
1420 // Fast path; not yet time to release memory
1421 scavenge_counter_ -= n;
1422 if (scavenge_counter_ >= 0) return; // Not yet time to scavenge
1423
1424 // If there is nothing to release, wait for this many pages to be freed before
1425 // scavenging again. With 4K pages, this comes to 1MB of memory.
1426 static const size_t kDefaultReleaseDelay = 1 << 8;
1427
1428 // Find index of free list to scavenge
1429 size_t index = scavenge_index_ + 1;
1430 for (size_t i = 0; i < kMaxPages+1; i++) {
1431 if (index > kMaxPages) index = 0;
1432 SpanList* slist = (index == kMaxPages) ? &large_ : &free_[index];
1433 if (!DLL_IsEmpty(&slist->normal)) {
1434 // Release the last span on the normal portion of this list
1435 Span* s = slist->normal.prev;
1436 DLL_Remove(s);
1437 TCMalloc_SystemRelease(reinterpret_cast<void*>(s->start << kPageShift),
1438 static_cast<size_t>(s->length << kPageShift));
1439#if TCMALLOC_TRACK_DECOMMITED_SPANS
1440 s->decommitted = true;
1441#endif
1442 DLL_Prepend(&slist->returned, s);
1443
1444 scavenge_counter_ = std::max<size_t>(64UL, std::min<size_t>(kDefaultReleaseDelay, kDefaultReleaseDelay - (free_pages_ / kDefaultReleaseDelay)));
1445
1446 if (index == kMaxPages && !DLL_IsEmpty(&slist->normal))
1447 scavenge_index_ = index - 1;
1448 else
1449 scavenge_index_ = index;
1450 return;
1451 }
1452 index++;
1453 }
1454
1455 // Nothing to scavenge, delay for a while
1456 scavenge_counter_ = kDefaultReleaseDelay;
1457}
1458
1459void TCMalloc_PageHeap::RegisterSizeClass(Span* span, size_t sc) {
1460 // Associate span object with all interior pages as well
1461 ASSERT(!span->free);
1462 ASSERT(GetDescriptor(span->start) == span);
1463 ASSERT(GetDescriptor(span->start+span->length-1) == span);
1464 Event(span, 'C', sc);
1465 span->sizeclass = static_cast<unsigned int>(sc);
1466 for (Length i = 1; i < span->length-1; i++) {
1467 pagemap_.set(span->start+i, span);
1468 }
1469}
1470
1471#ifndef WTF_CHANGES
1472static double PagesToMB(uint64_t pages) {
1473 return (pages << kPageShift) / 1048576.0;
1474}
1475
1476void TCMalloc_PageHeap::Dump(TCMalloc_Printer* out) {
1477 int nonempty_sizes = 0;
1478 for (int s = 0; s < kMaxPages; s++) {
1479 if (!DLL_IsEmpty(&free_[s].normal) || !DLL_IsEmpty(&free_[s].returned)) {
1480 nonempty_sizes++;
1481 }
1482 }
1483 out->printf("------------------------------------------------\n");
1484 out->printf("PageHeap: %d sizes; %6.1f MB free\n",
1485 nonempty_sizes, PagesToMB(free_pages_));
1486 out->printf("------------------------------------------------\n");
1487 uint64_t total_normal = 0;
1488 uint64_t total_returned = 0;
1489 for (int s = 0; s < kMaxPages; s++) {
1490 const int n_length = DLL_Length(&free_[s].normal);
1491 const int r_length = DLL_Length(&free_[s].returned);
1492 if (n_length + r_length > 0) {
1493 uint64_t n_pages = s * n_length;
1494 uint64_t r_pages = s * r_length;
1495 total_normal += n_pages;
1496 total_returned += r_pages;
1497 out->printf("%6u pages * %6u spans ~ %6.1f MB; %6.1f MB cum"
1498 "; unmapped: %6.1f MB; %6.1f MB cum\n",
1499 s,
1500 (n_length + r_length),
1501 PagesToMB(n_pages + r_pages),
1502 PagesToMB(total_normal + total_returned),
1503 PagesToMB(r_pages),
1504 PagesToMB(total_returned));
1505 }
1506 }
1507
1508 uint64_t n_pages = 0;
1509 uint64_t r_pages = 0;
1510 int n_spans = 0;
1511 int r_spans = 0;
1512 out->printf("Normal large spans:\n");
1513 for (Span* s = large_.normal.next; s != &large_.normal; s = s->next) {
1514 out->printf(" [ %6" PRIuS " pages ] %6.1f MB\n",
1515 s->length, PagesToMB(s->length));
1516 n_pages += s->length;
1517 n_spans++;
1518 }
1519 out->printf("Unmapped large spans:\n");
1520 for (Span* s = large_.returned.next; s != &large_.returned; s = s->next) {
1521 out->printf(" [ %6" PRIuS " pages ] %6.1f MB\n",
1522 s->length, PagesToMB(s->length));
1523 r_pages += s->length;
1524 r_spans++;
1525 }
1526 total_normal += n_pages;
1527 total_returned += r_pages;
1528 out->printf(">255 large * %6u spans ~ %6.1f MB; %6.1f MB cum"
1529 "; unmapped: %6.1f MB; %6.1f MB cum\n",
1530 (n_spans + r_spans),
1531 PagesToMB(n_pages + r_pages),
1532 PagesToMB(total_normal + total_returned),
1533 PagesToMB(r_pages),
1534 PagesToMB(total_returned));
1535}
1536#endif
1537
1538bool TCMalloc_PageHeap::GrowHeap(Length n) {
1539 ASSERT(kMaxPages >= kMinSystemAlloc);
1540 if (n > kMaxValidPages) return false;
1541 Length ask = (n>kMinSystemAlloc) ? n : static_cast<Length>(kMinSystemAlloc);
1542 size_t actual_size;
1543 void* ptr = TCMalloc_SystemAlloc(ask << kPageShift, &actual_size, kPageSize);
1544 if (ptr == NULL) {
1545 if (n < ask) {
1546 // Try growing just "n" pages
1547 ask = n;
1548 ptr = TCMalloc_SystemAlloc(ask << kPageShift, &actual_size, kPageSize);
1549 }
1550 if (ptr == NULL) return false;
1551 }
1552 ask = actual_size >> kPageShift;
1553
1554 uint64_t old_system_bytes = system_bytes_;
1555 system_bytes_ += (ask << kPageShift);
1556 const PageID p = reinterpret_cast<uintptr_t>(ptr) >> kPageShift;
1557 ASSERT(p > 0);
1558
1559 // If we already have a lot of pages allocated, just preallocate a bunch of
1560 // memory for the page map. This prevents fragmentation by pagemap metadata
1561 // when a program keeps allocating and freeing large blocks.
1562
1563 if (old_system_bytes < kPageMapBigAllocationThreshold
1564 && system_bytes_ >= kPageMapBigAllocationThreshold) {
1565 pagemap_.PreallocateMoreMemory();
1566 }
1567
1568 // Make sure pagemap_ has entries for all of the new pages.
1569 // Plus ensure one before and one after so coalescing code
1570 // does not need bounds-checking.
1571 if (pagemap_.Ensure(p-1, ask+2)) {
1572 // Pretend the new area is allocated and then Delete() it to
1573 // cause any necessary coalescing to occur.
1574 //
1575 // We do not adjust free_pages_ here since Delete() will do it for us.
1576 Span* span = NewSpan(p, ask);
1577 RecordSpan(span);
1578 Delete(span);
1579 ASSERT(Check());
1580 return true;
1581 } else {
1582 // We could not allocate memory within "pagemap_"
1583 // TODO: Once we can return memory to the system, return the new span
1584 return false;
1585 }
1586}
1587
1588bool TCMalloc_PageHeap::Check() {
1589 ASSERT(free_[0].normal.next == &free_[0].normal);
1590 ASSERT(free_[0].returned.next == &free_[0].returned);
1591 CheckList(&large_.normal, kMaxPages, 1000000000);
1592 CheckList(&large_.returned, kMaxPages, 1000000000);
1593 for (Length s = 1; s < kMaxPages; s++) {
1594 CheckList(&free_[s].normal, s, s);
1595 CheckList(&free_[s].returned, s, s);
1596 }
1597 return true;
1598}
1599
1600#if ASSERT_DISABLED
1601bool TCMalloc_PageHeap::CheckList(Span*, Length, Length) {
1602 return true;
1603}
1604#else
1605bool TCMalloc_PageHeap::CheckList(Span* list, Length min_pages, Length max_pages) {
1606 for (Span* s = list->next; s != list; s = s->next) {
1607 CHECK_CONDITION(s->free);
1608 CHECK_CONDITION(s->length >= min_pages);
1609 CHECK_CONDITION(s->length <= max_pages);
1610 CHECK_CONDITION(GetDescriptor(s->start) == s);
1611 CHECK_CONDITION(GetDescriptor(s->start+s->length-1) == s);
1612 }
1613 return true;
1614}
1615#endif
1616
1617static void ReleaseFreeList(Span* list, Span* returned) {
1618 // Walk backwards through list so that when we push these
1619 // spans on the "returned" list, we preserve the order.
1620 while (!DLL_IsEmpty(list)) {
1621 Span* s = list->prev;
1622 DLL_Remove(s);
1623 DLL_Prepend(returned, s);
1624 TCMalloc_SystemRelease(reinterpret_cast<void*>(s->start << kPageShift),
1625 static_cast<size_t>(s->length << kPageShift));
1626 }
1627}
1628
1629void TCMalloc_PageHeap::ReleaseFreePages() {
1630 for (Length s = 0; s < kMaxPages; s++) {
1631 ReleaseFreeList(&free_[s].normal, &free_[s].returned);
1632 }
1633 ReleaseFreeList(&large_.normal, &large_.returned);
1634 ASSERT(Check());
1635}
1636
1637//-------------------------------------------------------------------
1638// Free list
1639//-------------------------------------------------------------------
1640
1641class TCMalloc_ThreadCache_FreeList {
1642 private:
1643 void* list_; // Linked list of nodes
1644 uint16_t length_; // Current length
1645 uint16_t lowater_; // Low water mark for list length
1646
1647 public:
1648 void Init() {
1649 list_ = NULL;
1650 length_ = 0;
1651 lowater_ = 0;
1652 }
1653
1654 // Return current length of list
1655 int length() const {
1656 return length_;
1657 }
1658
1659 // Is list empty?
1660 bool empty() const {
1661 return list_ == NULL;
1662 }
1663
1664 // Low-water mark management
1665 int lowwatermark() const { return lowater_; }
1666 void clear_lowwatermark() { lowater_ = length_; }
1667
1668 ALWAYS_INLINE void Push(void* ptr) {
1669 SLL_Push(&list_, ptr);
1670 length_++;
1671 }
1672
1673 void PushRange(int N, void *start, void *end) {
1674 SLL_PushRange(&list_, start, end);
1675 length_ = length_ + static_cast<uint16_t>(N);
1676 }
1677
1678 void PopRange(int N, void **start, void **end) {
1679 SLL_PopRange(&list_, N, start, end);
1680 ASSERT(length_ >= N);
1681 length_ = length_ - static_cast<uint16_t>(N);
1682 if (length_ < lowater_) lowater_ = length_;
1683 }
1684
1685 ALWAYS_INLINE void* Pop() {
1686 ASSERT(list_ != NULL);
1687 length_--;
1688 if (length_ < lowater_) lowater_ = length_;
1689 return SLL_Pop(&list_);
1690 }
1691
1692#ifdef WTF_CHANGES
1693 template <class Finder, class Reader>
1694 void enumerateFreeObjects(Finder& finder, const Reader& reader)
1695 {
1696 for (void* nextObject = list_; nextObject; nextObject = *reader(reinterpret_cast<void**>(nextObject)))
1697 finder.visit(nextObject);
1698 }
1699#endif
1700};
1701
1702//-------------------------------------------------------------------
1703// Data kept per thread
1704//-------------------------------------------------------------------
1705
1706class TCMalloc_ThreadCache {
1707 private:
1708 typedef TCMalloc_ThreadCache_FreeList FreeList;
1709#if COMPILER(MSVC)
1710 typedef DWORD ThreadIdentifier;
1711#else
1712 typedef pthread_t ThreadIdentifier;
1713#endif
1714
1715 size_t size_; // Combined size of data
1716 ThreadIdentifier tid_; // Which thread owns it
1717 bool in_setspecific_; // Called pthread_setspecific?
1718 FreeList list_[kNumClasses]; // Array indexed by size-class
1719
1720 // We sample allocations, biased by the size of the allocation
1721 uint32_t rnd_; // Cheap random number generator
1722 size_t bytes_until_sample_; // Bytes until we sample next
1723
1724 // Allocate a new heap. REQUIRES: pageheap_lock is held.
1725 static inline TCMalloc_ThreadCache* NewHeap(ThreadIdentifier tid);
1726
1727 // Use only as pthread thread-specific destructor function.
1728 static void DestroyThreadCache(void* ptr);
1729 public:
1730 // All ThreadCache objects are kept in a linked list (for stats collection)
1731 TCMalloc_ThreadCache* next_;
1732 TCMalloc_ThreadCache* prev_;
1733
1734 void Init(ThreadIdentifier tid);
1735 void Cleanup();
1736
1737 // Accessors (mostly just for printing stats)
1738 int freelist_length(size_t cl) const { return list_[cl].length(); }
1739
1740 // Total byte size in cache
1741 size_t Size() const { return size_; }
1742
1743 void* Allocate(size_t size);
1744 void Deallocate(void* ptr, size_t size_class);
1745
1746 void FetchFromCentralCache(size_t cl, size_t allocationSize);
1747 void ReleaseToCentralCache(size_t cl, int N);
1748 void Scavenge();
1749 void Print() const;
1750
1751 // Record allocation of "k" bytes. Return true iff allocation
1752 // should be sampled
1753 bool SampleAllocation(size_t k);
1754
1755 // Pick next sampling point
1756 void PickNextSample(size_t k);
1757
1758 static void InitModule();
1759 static void InitTSD();
1760 static TCMalloc_ThreadCache* GetThreadHeap();
1761 static TCMalloc_ThreadCache* GetCache();
1762 static TCMalloc_ThreadCache* GetCacheIfPresent();
1763 static TCMalloc_ThreadCache* CreateCacheIfNecessary();
1764 static void DeleteCache(TCMalloc_ThreadCache* heap);
1765 static void BecomeIdle();
1766 static void RecomputeThreadCacheSize();
1767
1768#ifdef WTF_CHANGES
1769 template <class Finder, class Reader>
1770 void enumerateFreeObjects(Finder& finder, const Reader& reader)
1771 {
1772 for (unsigned sizeClass = 0; sizeClass < kNumClasses; sizeClass++)
1773 list_[sizeClass].enumerateFreeObjects(finder, reader);
1774 }
1775#endif
1776};
1777
1778//-------------------------------------------------------------------
1779// Data kept per size-class in central cache
1780//-------------------------------------------------------------------
1781
1782class TCMalloc_Central_FreeList {
1783 public:
1784 void Init(size_t cl);
1785
1786 // These methods all do internal locking.
1787
1788 // Insert the specified range into the central freelist. N is the number of
1789 // elements in the range.
1790 void InsertRange(void *start, void *end, int N);
1791
1792 // Stores the actual number of fetched elements into *N.
1793 void RemoveRange(void **start, void **end, int *N);
1794
1795 // Returns the number of free objects in cache.
1796 size_t length() {
1797 SpinLockHolder h(&lock_);
1798 return counter_;
1799 }
1800
1801 // Returns the number of free objects in the transfer cache.
1802 int tc_length() {
1803 SpinLockHolder h(&lock_);
1804 return used_slots_ * num_objects_to_move[size_class_];
1805 }
1806
1807#ifdef WTF_CHANGES
1808 template <class Finder, class Reader>
1809 void enumerateFreeObjects(Finder& finder, const Reader& reader)
1810 {
1811 for (Span* span = reader(empty_.next); span && span != &empty_; span = (span->next ? reader(span->next) : 0))
1812 ASSERT(!span->objects);
1813
1814 ASSERT(!nonempty_.objects);
1815 for (Span* span = reader(nonempty_.next); span && span != &nonempty_; span = (span->next ? reader(span->next) : 0)) {
1816 for (void* nextObject = span->objects; nextObject; nextObject = *reader(reinterpret_cast<void**>(nextObject)))
1817 finder.visit(nextObject);
1818 }
1819 }
1820#endif
1821
1822 private:
1823 // REQUIRES: lock_ is held
1824 // Remove object from cache and return.
1825 // Return NULL if no free entries in cache.
1826 void* FetchFromSpans();
1827
1828 // REQUIRES: lock_ is held
1829 // Remove object from cache and return. Fetches
1830 // from pageheap if cache is empty. Only returns
1831 // NULL on allocation failure.
1832 void* FetchFromSpansSafe();
1833
1834 // REQUIRES: lock_ is held
1835 // Release a linked list of objects to spans.
1836 // May temporarily release lock_.
1837 void ReleaseListToSpans(void *start);
1838
1839 // REQUIRES: lock_ is held
1840 // Release an object to spans.
1841 // May temporarily release lock_.
1842 void ReleaseToSpans(void* object);
1843
1844 // REQUIRES: lock_ is held
1845 // Populate cache by fetching from the page heap.
1846 // May temporarily release lock_.
1847 void Populate();
1848
1849 // REQUIRES: lock is held.
1850 // Tries to make room for a TCEntry. If the cache is full it will try to
1851 // expand it at the cost of some other cache size. Return false if there is
1852 // no space.
1853 bool MakeCacheSpace();
1854
1855 // REQUIRES: lock_ for locked_size_class is held.
1856 // Picks a "random" size class to steal a TCEntry slot from. In reality it
1857 // just iterates over the sizeclasses but does so without taking a lock.
1858 // Returns true on success.
1859 // May temporarily lock a "random" size class.
1860 static bool EvictRandomSizeClass(size_t locked_size_class, bool force);
1861
1862 // REQUIRES: lock_ is *not* held.
1863 // Tries to shrink the cache. If force is true it will release objects to
1864 // spans if that allows it to shrink the cache. Returns false if it failed
1865 // to shrink the cache. Decrements cache_size_ on success.
1866 // May temporarily take lock_. If it takes lock_, the locked_size_class
1867 // lock is released to keep the thread from holding two size class locks
1868 // concurrently, which could lead to a deadlock.
1869 bool ShrinkCache(int locked_size_class, bool force);
1870
1871 // This lock protects all the data members. cached_entries and cache_size_
1872 // may be looked at without holding the lock.
1873 SpinLock lock_;
1874
1875 // We keep linked lists of empty and non-empty spans.
1876 size_t size_class_; // My size class
1877 Span empty_; // Dummy header for list of empty spans
1878 Span nonempty_; // Dummy header for list of non-empty spans
1879 size_t counter_; // Number of free objects in cache entry
1880
1881 // Here we reserve space for TCEntry cache slots. Since one size class can
1882 // end up getting all of the TCEntry quota in the system, we just preallocate
1883 // a sufficient number of entries here.
1884 TCEntry tc_slots_[kNumTransferEntries];
1885
1886 // Number of currently used cached entries in tc_slots_. This variable is
1887 // updated under a lock but can be read without one.
1888 int32_t used_slots_;
1889 // The current number of slots for this size class. This is an
1890 // adaptive value that is increased if there is lots of traffic
1891 // on a given size class.
1892 int32_t cache_size_;
1893};
1894
1895// Pad each CentralCache object to multiple of 64 bytes
1896class TCMalloc_Central_FreeListPadded : public TCMalloc_Central_FreeList {
1897 private:
1898 char pad_[(64 - (sizeof(TCMalloc_Central_FreeList) % 64)) % 64];
1899};
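// Worked example of the padding arithmetic above (illustrative commentary,
// not from the original source): if sizeof(TCMalloc_Central_FreeList)
// happened to be 200 bytes, then 200 % 64 == 8, pad_ would be
// (64 - 8) % 64 == 56 bytes, and the padded object would occupy 256 bytes,
// a multiple of 64. The outer "% 64" keeps the pad at 0 (rather than 64)
// when the size is already a multiple of 64.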
1900
1901//-------------------------------------------------------------------
1902// Global variables
1903//-------------------------------------------------------------------
1904
1905// Central cache -- a collection of free-lists, one per size-class.
1906// We have a separate lock per free-list to reduce contention.
1907static TCMalloc_Central_FreeListPadded central_cache[kNumClasses];
1908
1909// Page-level allocator
1910static SpinLock pageheap_lock = SPINLOCK_INITIALIZER;
1911static void* pageheap_memory[(sizeof(TCMalloc_PageHeap) + sizeof(void*) - 1) / sizeof(void*)];
1912static bool phinited = false;
1913
1914 // Avoid an extra level of indirection by making "pageheap" just an alias
1915// of pageheap_memory.
1916typedef union {
1917 void* m_memory;
1918 TCMalloc_PageHeap* m_pageHeap;
1919} PageHeapUnion;
1920
1921static inline TCMalloc_PageHeap* getPageHeap()
1922{
1923 PageHeapUnion u = { &pageheap_memory[0] };
1924 return u.m_pageHeap;
1925}
1926
1927#define pageheap getPageHeap()
1928
1929// If TLS is available, we also store a copy
1930// of the per-thread object in a __thread variable
1931// since __thread variables are faster to read
1932// than pthread_getspecific(). We still need
1933// pthread_setspecific() because __thread
1934// variables provide no way to run cleanup
1935// code when a thread is destroyed.
1936#ifdef HAVE_TLS
1937static __thread TCMalloc_ThreadCache *threadlocal_heap;
1938#endif
1939// Thread-specific key. Initialization here is somewhat tricky
1940// because some Linux startup code invokes malloc() before it
1941 // is in a good enough state to handle pthread_key_create().
1942// Therefore, we use TSD keys only after tsd_inited is set to true.
1943// Until then, we use a slow path to get the heap object.
1944static bool tsd_inited = false;
1945static pthread_key_t heap_key;
1946#if COMPILER(MSVC)
1947DWORD tlsIndex = TLS_OUT_OF_INDEXES;
1948#endif
1949
1950static ALWAYS_INLINE void setThreadHeap(TCMalloc_ThreadCache* heap)
1951{
1952 // Still do pthread_setspecific() when using MSVC fast TLS, to
1953 // benefit from the delete callback.
1954 pthread_setspecific(heap_key, heap);
1955#if COMPILER(MSVC)
1956 TlsSetValue(tlsIndex, heap);
1957#endif
1958}
1959
1960// Allocator for thread heaps
1961static PageHeapAllocator<TCMalloc_ThreadCache> threadheap_allocator;
1962
1963// Linked list of heap objects. Protected by pageheap_lock.
1964static TCMalloc_ThreadCache* thread_heaps = NULL;
1965static int thread_heap_count = 0;
1966
1967// Overall thread cache size. Protected by pageheap_lock.
1968static size_t overall_thread_cache_size = kDefaultOverallThreadCacheSize;
1969
1970// Global per-thread cache size. Writes are protected by
1971// pageheap_lock. Reads are done without any locking, which should be
1972// fine as long as size_t can be written atomically and we don't place
1973// invariants between this variable and other pieces of state.
1974static volatile size_t per_thread_cache_size = kMaxThreadCacheSize;
1975
1976//-------------------------------------------------------------------
1977// Central cache implementation
1978//-------------------------------------------------------------------
1979
1980void TCMalloc_Central_FreeList::Init(size_t cl) {
1981 lock_.Init();
1982 size_class_ = cl;
1983 DLL_Init(&empty_);
1984 DLL_Init(&nonempty_);
1985 counter_ = 0;
1986
1987 cache_size_ = 1;
1988 used_slots_ = 0;
1989 ASSERT(cache_size_ <= kNumTransferEntries);
1990}
1991
1992void TCMalloc_Central_FreeList::ReleaseListToSpans(void* start) {
1993 while (start) {
1994 void *next = SLL_Next(start);
1995 ReleaseToSpans(start);
1996 start = next;
1997 }
1998}
1999
2000ALWAYS_INLINE void TCMalloc_Central_FreeList::ReleaseToSpans(void* object) {
2001 const PageID p = reinterpret_cast<uintptr_t>(object) >> kPageShift;
2002 Span* span = pageheap->GetDescriptor(p);
2003 ASSERT(span != NULL);
2004 ASSERT(span->refcount > 0);
2005
2006 // If span is empty, move it to non-empty list
2007 if (span->objects == NULL) {
2008 DLL_Remove(span);
2009 DLL_Prepend(&nonempty_, span);
2010 Event(span, 'N', 0);
2011 }
2012
2013 // The following check is expensive, so it is disabled by default
2014 if (false) {
2015 // Check that object does not occur in list
2016 int got = 0;
2017 for (void* p = span->objects; p != NULL; p = *((void**) p)) {
2018 ASSERT(p != object);
2019 got++;
2020 }
2021 ASSERT(got + span->refcount ==
2022 (span->length<<kPageShift)/ByteSizeForClass(span->sizeclass));
2023 }
2024
2025 counter_++;
2026 span->refcount--;
2027 if (span->refcount == 0) {
2028 Event(span, '#', 0);
2029 counter_ -= (span->length<<kPageShift) / ByteSizeForClass(span->sizeclass);
2030 DLL_Remove(span);
2031
2032 // Release central list lock while operating on pageheap
2033 lock_.Unlock();
2034 {
2035 SpinLockHolder h(&pageheap_lock);
2036 pageheap->Delete(span);
2037 }
2038 lock_.Lock();
2039 } else {
2040 *(reinterpret_cast<void**>(object)) = span->objects;
2041 span->objects = object;
2042 }
2043}
2044
2045ALWAYS_INLINE bool TCMalloc_Central_FreeList::EvictRandomSizeClass(
2046 size_t locked_size_class, bool force) {
2047 static int race_counter = 0;
2048 int t = race_counter++; // Updated without a lock, but who cares.
2049 if (t >= static_cast<int>(kNumClasses)) {
2050 while (t >= static_cast<int>(kNumClasses)) {
2051 t -= kNumClasses;
2052 }
2053 race_counter = t;
2054 }
2055 ASSERT(t >= 0);
2056 ASSERT(t < static_cast<int>(kNumClasses));
2057 if (t == static_cast<int>(locked_size_class)) return false;
2058 return central_cache[t].ShrinkCache(static_cast<int>(locked_size_class), force);
2059}
2060
2061bool TCMalloc_Central_FreeList::MakeCacheSpace() {
2062 // Is there room in the cache?
2063 if (used_slots_ < cache_size_) return true;
2064 // Check whether we can expand this cache.
2065 if (cache_size_ == kNumTransferEntries) return false;
2066 // Ok, we'll try to grab an entry from some other size class.
2067 if (EvictRandomSizeClass(size_class_, false) ||
2068 EvictRandomSizeClass(size_class_, true)) {
2069 // Succeeded in evicting, we're going to make our cache larger.
2070 cache_size_++;
2071 return true;
2072 }
2073 return false;
2074}
2075
2076
2077namespace {
2078class LockInverter {
2079 private:
2080 SpinLock *held_, *temp_;
2081 public:
2082 inline explicit LockInverter(SpinLock* held, SpinLock *temp)
2083 : held_(held), temp_(temp) { held_->Unlock(); temp_->Lock(); }
2084 inline ~LockInverter() { temp_->Unlock(); held_->Lock(); }
2085};
2086}
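// Usage note for LockInverter (added commentary, not from the original
// source): constructing LockInverter(&held, &temp) unlocks "held" and then
// locks "temp" for the enclosing scope; the destructor undoes this in
// reverse order. ShrinkCache() below relies on it so that this thread never
// holds two size class locks at the same time.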
2087
2088bool TCMalloc_Central_FreeList::ShrinkCache(int locked_size_class, bool force) {
2089 // Start with a quick check without taking a lock.
2090 if (cache_size_ == 0) return false;
2091 // We don't evict from a full cache unless we are 'forcing'.
2092 if (force == false && used_slots_ == cache_size_) return false;
2093
2094 // Grab lock, but first release the other lock held by this thread. We use
2095 // the lock inverter to ensure that we never hold two size class locks
2096 // concurrently. That can create a deadlock because there is no well
2097 // defined nesting order.
2098 LockInverter li(&central_cache[locked_size_class].lock_, &lock_);
2099 ASSERT(used_slots_ <= cache_size_);
2100 ASSERT(0 <= cache_size_);
2101 if (cache_size_ == 0) return false;
2102 if (used_slots_ == cache_size_) {
2103 if (force == false) return false;
2104 // ReleaseListToSpans releases the lock, so we have to make all the
2105 // updates to the central list before calling it.
2106 cache_size_--;
2107 used_slots_--;
2108 ReleaseListToSpans(tc_slots_[used_slots_].head);
2109 return true;
2110 }
2111 cache_size_--;
2112 return true;
2113}
2114
2115void TCMalloc_Central_FreeList::InsertRange(void *start, void *end, int N) {
2116 SpinLockHolder h(&lock_);
2117 if (N == num_objects_to_move[size_class_] &&
2118 MakeCacheSpace()) {
2119 int slot = used_slots_++;
2120 ASSERT(slot >=0);
2121 ASSERT(slot < kNumTransferEntries);
2122 TCEntry *entry = &tc_slots_[slot];
2123 entry->head = start;
2124 entry->tail = end;
2125 return;
2126 }
2127 ReleaseListToSpans(start);
2128}
2129
2130void TCMalloc_Central_FreeList::RemoveRange(void **start, void **end, int *N) {
2131 int num = *N;
2132 ASSERT(num > 0);
2133
2134 SpinLockHolder h(&lock_);
2135 if (num == num_objects_to_move[size_class_] && used_slots_ > 0) {
2136 int slot = --used_slots_;
2137 ASSERT(slot >= 0);
2138 TCEntry *entry = &tc_slots_[slot];
2139 *start = entry->head;
2140 *end = entry->tail;
2141 return;
2142 }
2143
2144 // TODO: Prefetch multiple TCEntries?
2145 void *tail = FetchFromSpansSafe();
2146 if (!tail) {
2147 // We are completely out of memory.
2148 *start = *end = NULL;
2149 *N = 0;
2150 return;
2151 }
2152
2153 SLL_SetNext(tail, NULL);
2154 void *head = tail;
2155 int count = 1;
2156 while (count < num) {
2157 void *t = FetchFromSpans();
2158 if (!t) break;
2159 SLL_Push(&head, t);
2160 count++;
2161 }
2162 *start = head;
2163 *end = tail;
2164 *N = count;
2165}
2166
2167
2168void* TCMalloc_Central_FreeList::FetchFromSpansSafe() {
2169 void *t = FetchFromSpans();
2170 if (!t) {
2171 Populate();
2172 t = FetchFromSpans();
2173 }
2174 return t;
2175}
2176
2177void* TCMalloc_Central_FreeList::FetchFromSpans() {
2178 if (DLL_IsEmpty(&nonempty_)) return NULL;
2179 Span* span = nonempty_.next;
2180
2181 ASSERT(span->objects != NULL);
2182 ASSERT_SPAN_COMMITTED(span);
2183 span->refcount++;
2184 void* result = span->objects;
2185 span->objects = *(reinterpret_cast<void**>(result));
2186 if (span->objects == NULL) {
2187 // Move to empty list
2188 DLL_Remove(span);
2189 DLL_Prepend(&empty_, span);
2190 Event(span, 'E', 0);
2191 }
2192 counter_--;
2193 return result;
2194}
2195
2196// Fetch memory from the system and add to the central cache freelist.
2197ALWAYS_INLINE void TCMalloc_Central_FreeList::Populate() {
2198 // Release central list lock while operating on pageheap
2199 lock_.Unlock();
2200 const size_t npages = class_to_pages[size_class_];
2201
2202 Span* span;
2203 {
2204 SpinLockHolder h(&pageheap_lock);
2205 span = pageheap->New(npages);
2206 if (span) pageheap->RegisterSizeClass(span, size_class_);
2207 }
2208 if (span == NULL) {
2209 MESSAGE("allocation failed: %d\n", errno);
2210 lock_.Lock();
2211 return;
2212 }
2213 ASSERT_SPAN_COMMITTED(span);
2214 ASSERT(span->length == npages);
2215 // Cache sizeclass info eagerly. Locking is not necessary.
2216 // (Instead of being eager, we could just replace any stale info
2217 // about this span, but that seems to be no better in practice.)
2218 for (size_t i = 0; i < npages; i++) {
2219 pageheap->CacheSizeClass(span->start + i, size_class_);
2220 }
2221
2222 // Split the block into pieces and add to the free-list
2223 // TODO: coloring of objects to avoid cache conflicts?
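// Worked example (illustrative; assumes 4K pages): for a one-page span and a
// 512-byte size class, the loop below threads 8 objects onto span->objects,
// with each object's first word pointing at the next object and the last one
// terminated with NULL.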
2224 void** tail = &span->objects;
2225 char* ptr = reinterpret_cast<char*>(span->start << kPageShift);
2226 char* limit = ptr + (npages << kPageShift);
2227 const size_t size = ByteSizeForClass(size_class_);
2228 int num = 0;
2229 char* nptr;
2230 while ((nptr = ptr + size) <= limit) {
2231 *tail = ptr;
2232 tail = reinterpret_cast<void**>(ptr);
2233 ptr = nptr;
2234 num++;
2235 }
2236 ASSERT(ptr <= limit);
2237 *tail = NULL;
2238 span->refcount = 0; // No sub-object in use yet
2239
2240 // Add span to list of non-empty spans
2241 lock_.Lock();
2242 DLL_Prepend(&nonempty_, span);
2243 counter_ += num;
2244}
2245
2246//-------------------------------------------------------------------
2247// TCMalloc_ThreadCache implementation
2248//-------------------------------------------------------------------
2249
2250inline bool TCMalloc_ThreadCache::SampleAllocation(size_t k) {
2251 if (bytes_until_sample_ < k) {
2252 PickNextSample(k);
2253 return true;
2254 } else {
2255 bytes_until_sample_ -= k;
2256 return false;
2257 }
2258}
2259
2260void TCMalloc_ThreadCache::Init(ThreadIdentifier tid) {
2261 size_ = 0;
2262 next_ = NULL;
2263 prev_ = NULL;
2264 tid_ = tid;
2265 in_setspecific_ = false;
2266 for (size_t cl = 0; cl < kNumClasses; ++cl) {
2267 list_[cl].Init();
2268 }
2269
2270 // Initialize RNG -- run it for a bit to get to good values
2271 bytes_until_sample_ = 0;
2272 rnd_ = static_cast<uint32_t>(reinterpret_cast<uintptr_t>(this));
2273 for (int i = 0; i < 100; i++) {
2274 PickNextSample(static_cast<size_t>(FLAGS_tcmalloc_sample_parameter * 2));
2275 }
2276}
2277
2278void TCMalloc_ThreadCache::Cleanup() {
2279 // Put unused memory back into central cache
2280 for (size_t cl = 0; cl < kNumClasses; ++cl) {
2281 if (list_[cl].length() > 0) {
2282 ReleaseToCentralCache(cl, list_[cl].length());
2283 }
2284 }
2285}
2286
2287ALWAYS_INLINE void* TCMalloc_ThreadCache::Allocate(size_t size) {
2288 ASSERT(size <= kMaxSize);
2289 const size_t cl = SizeClass(size);
2290 FreeList* list = &list_[cl];
2291 size_t allocationSize = ByteSizeForClass(cl);
2292 if (list->empty()) {
2293 FetchFromCentralCache(cl, allocationSize);
2294 if (list->empty()) return NULL;
2295 }
2296 size_ -= allocationSize;
2297 return list->Pop();
2298}
2299
2300inline void TCMalloc_ThreadCache::Deallocate(void* ptr, size_t cl) {
2301 size_ += ByteSizeForClass(cl);
2302 FreeList* list = &list_[cl];
2303 list->Push(ptr);
2304 // If enough data is free, put back into central cache
2305 if (list->length() > kMaxFreeListLength) {
2306 ReleaseToCentralCache(cl, num_objects_to_move[cl]);
2307 }
2308 if (size_ >= per_thread_cache_size) Scavenge();
2309}
2310
2311// Remove some objects of class "cl" from central cache and add to thread heap
2312ALWAYS_INLINE void TCMalloc_ThreadCache::FetchFromCentralCache(size_t cl, size_t allocationSize) {
2313 int fetch_count = num_objects_to_move[cl];
2314 void *start, *end;
2315 central_cache[cl].RemoveRange(&start, &end, &fetch_count);
2316 list_[cl].PushRange(fetch_count, start, end);
2317 size_ += allocationSize * fetch_count;
2318}
2319
2320// Remove some objects of class "cl" from thread heap and add to central cache
2321inline void TCMalloc_ThreadCache::ReleaseToCentralCache(size_t cl, int N) {
2322 ASSERT(N > 0);
2323 FreeList* src = &list_[cl];
2324 if (N > src->length()) N = src->length();
2325 size_ -= N*ByteSizeForClass(cl);
2326
2327 // We return prepackaged chains of the correct size to the central cache.
2328 // TODO: Use the same format internally in the thread caches?
2329 int batch_size = num_objects_to_move[cl];
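// Worked example (illustrative): if N == 70 and batch_size == 32, the loop
// below hands back two full chains of 32 objects, and the final PopRange()
// returns the remaining 6 as a short chain.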
2330 while (N > batch_size) {
2331 void *tail, *head;
2332 src->PopRange(batch_size, &head, &tail);
2333 central_cache[cl].InsertRange(head, tail, batch_size);
2334 N -= batch_size;
2335 }
2336 void *tail, *head;
2337 src->PopRange(N, &head, &tail);
2338 central_cache[cl].InsertRange(head, tail, N);
2339}
2340
2341// Release idle memory to the central cache
2342inline void TCMalloc_ThreadCache::Scavenge() {
2343 // If the low-water mark for the free list is L, it means we would
2344 // not have had to allocate anything from the central cache even if
2345 // we had reduced the free list size by L. We aim to get closer to
2346 // that situation by dropping L/2 nodes from the free list. This
2347 // may not release much memory, but if so we will call scavenge again
2348 // pretty soon and the low-water marks will be high on that call.
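// Worked example (illustrative): if a size class's free list never dropped
// below 8 entries since the last scavenge, its lowwatermark() is 8 and the
// loop below releases 8/2 == 4 of its objects back to the central cache.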
2349 //int64 start = CycleClock::Now();
2350
2351 for (size_t cl = 0; cl < kNumClasses; cl++) {
2352 FreeList* list = &list_[cl];
2353 const int lowmark = list->lowwatermark();
2354 if (lowmark > 0) {
2355 const int drop = (lowmark > 1) ? lowmark/2 : 1;
2356 ReleaseToCentralCache(cl, drop);
2357 }
2358 list->clear_lowwatermark();
2359 }
2360
2361 //int64 finish = CycleClock::Now();
2362 //CycleTimer ct;
2363 //MESSAGE("GC: %.0f ns\n", ct.CyclesToUsec(finish-start)*1000.0);
2364}
2365
2366void TCMalloc_ThreadCache::PickNextSample(size_t k) {
2367 // Make next "random" number
2368 // x^32+x^22+x^2+x^1+1 is a primitive polynomial for random numbers
2369 static const uint32_t kPoly = (1 << 22) | (1 << 2) | (1 << 1) | (1 << 0);
2370 uint32_t r = rnd_;
2371 rnd_ = (r << 1) ^ ((static_cast<int32_t>(r) >> 31) & kPoly);
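// Added note (not from the original source): this is one step of a Galois
// LFSR. When the top bit of r is clear, the arithmetic shift contributes 0
// and the update is just r << 1; when it is set, the shift yields all ones
// on the usual two's-complement implementations and the taps in kPoly are
// XORed into the shifted value.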
2372
2373 // Next point is "rnd_ % (sample_period)". I.e., average
2374 // increment is "sample_period/2".
2375 const int flag_value = static_cast<int>(FLAGS_tcmalloc_sample_parameter);
2376 static int last_flag_value = -1;
2377
2378 if (flag_value != last_flag_value) {
2379 SpinLockHolder h(&sample_period_lock);
2380 int i;
2381 for (i = 0; i < (static_cast<int>(sizeof(primes_list)/sizeof(primes_list[0])) - 1); i++) {
2382 if (primes_list[i] >= flag_value) {
2383 break;
2384 }
2385 }
2386 sample_period = primes_list[i];
2387 last_flag_value = flag_value;
2388 }
2389
2390 bytes_until_sample_ += rnd_ % sample_period;
2391
2392 if (k > (static_cast<size_t>(-1) >> 2)) {
2393 // If the user has asked for a huge allocation then it is possible
2394 // for the code below to loop infinitely. Just return (note that
2395 // this throws off the sampling accuracy somewhat, but a user who
2396 // is allocating more than 1G of memory at a time can live with a
2397 // minor inaccuracy in profiling of small allocations, and also
2398 // would rather not wait for the loop below to terminate).
2399 return;
2400 }
2401
2402 while (bytes_until_sample_ < k) {
2403 // Increase bytes_until_sample_ by enough average sampling periods
2404 // (sample_period >> 1) to allow us to sample past the current
2405 // allocation.
2406 bytes_until_sample_ += (sample_period >> 1);
2407 }
2408
2409 bytes_until_sample_ -= k;
2410}
2411
2412void TCMalloc_ThreadCache::InitModule() {
2413 // There is a slight potential race here because of the double-checked
2414 // locking idiom. However, as long as the program does a small
2415 // allocation before switching to multi-threaded mode, we will be
2416 // fine. We increase the chances of doing such a small allocation
2417 // by doing one in the constructor of the module_enter_exit_hook
2418 // object declared below.
2419 SpinLockHolder h(&pageheap_lock);
2420 if (!phinited) {
2421#ifdef WTF_CHANGES
2422 InitTSD();
2423#endif
2424 InitSizeClasses();
2425 threadheap_allocator.Init();
2426 span_allocator.Init();
2427 span_allocator.New(); // Reduce cache conflicts
2428 span_allocator.New(); // Reduce cache conflicts
2429 stacktrace_allocator.Init();
2430 DLL_Init(&sampled_objects);
2431 for (size_t i = 0; i < kNumClasses; ++i) {
2432 central_cache[i].Init(i);
2433 }
2434 pageheap->init();
2435 phinited = 1;
2436#if defined(WTF_CHANGES) && PLATFORM(DARWIN)
2437 FastMallocZone::init();
2438#endif
2439 }
2440}
2441
2442inline TCMalloc_ThreadCache* TCMalloc_ThreadCache::NewHeap(ThreadIdentifier tid) {
2443 // Create the heap and add it to the linked list
2444 TCMalloc_ThreadCache *heap = threadheap_allocator.New();
2445 heap->Init(tid);
2446 heap->next_ = thread_heaps;
2447 heap->prev_ = NULL;
2448 if (thread_heaps != NULL) thread_heaps->prev_ = heap;
2449 thread_heaps = heap;
2450 thread_heap_count++;
2451 RecomputeThreadCacheSize();
2452 return heap;
2453}
2454
2455inline TCMalloc_ThreadCache* TCMalloc_ThreadCache::GetThreadHeap() {
2456#ifdef HAVE_TLS
2457 // __thread is faster, but only when the kernel supports it
2458 if (KernelSupportsTLS())
2459 return threadlocal_heap;
2460#elif COMPILER(MSVC)
2461 return static_cast<TCMalloc_ThreadCache*>(TlsGetValue(tlsIndex));
2462#else
2463 return static_cast<TCMalloc_ThreadCache*>(pthread_getspecific(heap_key));
2464#endif
2465}
2466
2467inline TCMalloc_ThreadCache* TCMalloc_ThreadCache::GetCache() {
2468 TCMalloc_ThreadCache* ptr = NULL;
2469 if (!tsd_inited) {
2470 InitModule();
2471 } else {
2472 ptr = GetThreadHeap();
2473 }
2474 if (ptr == NULL) ptr = CreateCacheIfNecessary();
2475 return ptr;
2476}
2477
2478// In deletion paths, we do not try to create a thread-cache. This is
2479// because we may be in the thread destruction code and may have
2480// already cleaned up the cache for this thread.
2481inline TCMalloc_ThreadCache* TCMalloc_ThreadCache::GetCacheIfPresent() {
2482 if (!tsd_inited) return NULL;
2483 void* const p = GetThreadHeap();
2484 return reinterpret_cast<TCMalloc_ThreadCache*>(p);
2485}
2486
2487void TCMalloc_ThreadCache::InitTSD() {
2488 ASSERT(!tsd_inited);
2489 pthread_key_create(&heap_key, DestroyThreadCache);
2490#if COMPILER(MSVC)
2491 tlsIndex = TlsAlloc();
2492#endif
2493 tsd_inited = true;
2494
2495#if !COMPILER(MSVC)
2496 // We may have used a fake pthread_t for the main thread. Fix it.
2497 pthread_t zero;
2498 memset(&zero, 0, sizeof(zero));
2499#endif
2500#ifndef WTF_CHANGES
2501 SpinLockHolder h(&pageheap_lock);
2502#else
2503 ASSERT(pageheap_lock.IsHeld());
2504#endif
2505 for (TCMalloc_ThreadCache* h = thread_heaps; h != NULL; h = h->next_) {
2506#if COMPILER(MSVC)
2507 if (h->tid_ == 0) {
2508 h->tid_ = GetCurrentThreadId();
2509 }
2510#else
2511 if (pthread_equal(h->tid_, zero)) {
2512 h->tid_ = pthread_self();
2513 }
2514#endif
2515 }
2516}
2517
2518TCMalloc_ThreadCache* TCMalloc_ThreadCache::CreateCacheIfNecessary() {
2519 // Initialize per-thread data if necessary
2520 TCMalloc_ThreadCache* heap = NULL;
2521 {
2522 SpinLockHolder h(&pageheap_lock);
2523
2524#if COMPILER(MSVC)
2525 DWORD me;
2526 if (!tsd_inited) {
2527 me = 0;
2528 } else {
2529 me = GetCurrentThreadId();
2530 }
2531#else
2532 // Early on in glibc's life, we cannot even call pthread_self()
2533 pthread_t me;
2534 if (!tsd_inited) {
2535 memset(&me, 0, sizeof(me));
2536 } else {
2537 me = pthread_self();
2538 }
2539#endif
2540
2541 // This may be a recursive malloc call from pthread_setspecific().
2542 // In that case, the heap for this thread has already been created
2543 // and added to the linked list. So we search for that first.
2544 for (TCMalloc_ThreadCache* h = thread_heaps; h != NULL; h = h->next_) {
2545#if COMPILER(MSVC)
2546 if (h->tid_ == me) {
2547#else
2548 if (pthread_equal(h->tid_, me)) {
2549#endif
2550 heap = h;
2551 break;
2552 }
2553 }
2554
2555 if (heap == NULL) heap = NewHeap(me);
2556 }
2557
2558 // We call pthread_setspecific() outside the lock because it may
2559 // call malloc() recursively. The recursive call will never get
2560 // here again because it will find the already allocated heap in the
2561 // linked list of heaps.
2562 if (!heap->in_setspecific_ && tsd_inited) {
2563 heap->in_setspecific_ = true;
2564 setThreadHeap(heap);
2565 }
2566 return heap;
2567}
2568
2569void TCMalloc_ThreadCache::BecomeIdle() {
2570 if (!tsd_inited) return; // No caches yet
2571 TCMalloc_ThreadCache* heap = GetThreadHeap();
2572 if (heap == NULL) return; // No thread cache to remove
2573 if (heap->in_setspecific_) return; // Do not disturb the active caller
2574
2575 heap->in_setspecific_ = true;
2576 pthread_setspecific(heap_key, NULL);
2577#ifdef HAVE_TLS
2578 // Also update the copy in __thread
2579 threadlocal_heap = NULL;
2580#endif
2581 heap->in_setspecific_ = false;
2582 if (GetThreadHeap() == heap) {
2583 // Somehow heap got reinstated by a recursive call to malloc
2584 // from pthread_setspecific. We give up in this case.
2585 return;
2586 }
2587
2588 // We can now get rid of the heap
2589 DeleteCache(heap);
2590}
2591
2592void TCMalloc_ThreadCache::DestroyThreadCache(void* ptr) {
2593 // Note that "ptr" cannot be NULL since pthread promises not
2594 // to invoke the destructor on NULL values, but for safety,
2595 // we check anyway.
2596 if (ptr == NULL) return;
2597#ifdef HAVE_TLS
2598 // Prevent fast path of GetThreadHeap() from returning heap.
2599 threadlocal_heap = NULL;
2600#endif
2601 DeleteCache(reinterpret_cast<TCMalloc_ThreadCache*>(ptr));
2602}
2603
2604void TCMalloc_ThreadCache::DeleteCache(TCMalloc_ThreadCache* heap) {
2605 // Remove all memory from heap
2606 heap->Cleanup();
2607
2608 // Remove from linked list
2609 SpinLockHolder h(&pageheap_lock);
2610 if (heap->next_ != NULL) heap->next_->prev_ = heap->prev_;
2611 if (heap->prev_ != NULL) heap->prev_->next_ = heap->next_;
2612 if (thread_heaps == heap) thread_heaps = heap->next_;
2613 thread_heap_count--;
2614 RecomputeThreadCacheSize();
2615
2616 threadheap_allocator.Delete(heap);
2617}
2618
2619void TCMalloc_ThreadCache::RecomputeThreadCacheSize() {
2620 // Divide available space across threads
2621 int n = thread_heap_count > 0 ? thread_heap_count : 1;
2622 size_t space = overall_thread_cache_size / n;
2623
2624 // Limit to allowed range
2625 if (space < kMinThreadCacheSize) space = kMinThreadCacheSize;
2626 if (space > kMaxThreadCacheSize) space = kMaxThreadCacheSize;
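// Example (illustrative figures, not from the original source): with
// overall_thread_cache_size == 16MB and 8 thread heaps, each thread would
// get a 2MB budget, subject to the clamp above.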
2627
2628 per_thread_cache_size = space;
2629}
2630
2631void TCMalloc_ThreadCache::Print() const {
2632 for (size_t cl = 0; cl < kNumClasses; ++cl) {
2633 MESSAGE(" %5" PRIuS " : %4d len; %4d lo\n",
2634 ByteSizeForClass(cl),
2635 list_[cl].length(),
2636 list_[cl].lowwatermark());
2637 }
2638}
2639
2640// Extract interesting stats
2641struct TCMallocStats {
2642 uint64_t system_bytes; // Bytes alloced from system
2643 uint64_t thread_bytes; // Bytes in thread caches
2644 uint64_t central_bytes; // Bytes in central cache
2645 uint64_t transfer_bytes; // Bytes in central transfer cache
2646 uint64_t pageheap_bytes; // Bytes in page heap
2647 uint64_t metadata_bytes; // Bytes alloced for metadata
2648};
2649
2650#ifndef WTF_CHANGES
2651// Get stats into "r". Also get per-size-class counts if class_count != NULL
2652static void ExtractStats(TCMallocStats* r, uint64_t* class_count) {
2653 r->central_bytes = 0;
2654 r->transfer_bytes = 0;
2655 for (int cl = 0; cl < kNumClasses; ++cl) {
2656 const int length = central_cache[cl].length();
2657 const int tc_length = central_cache[cl].tc_length();
2658 r->central_bytes += static_cast<uint64_t>(ByteSizeForClass(cl)) * length;
2659 r->transfer_bytes +=
2660 static_cast<uint64_t>(ByteSizeForClass(cl)) * tc_length;
2661 if (class_count) class_count[cl] = length + tc_length;
2662 }
2663
2664 // Add stats from per-thread heaps
2665 r->thread_bytes = 0;
2666 { // scope
2667 SpinLockHolder h(&pageheap_lock);
2668 for (TCMalloc_ThreadCache* h = thread_heaps; h != NULL; h = h->next_) {
2669 r->thread_bytes += h->Size();
2670 if (class_count) {
2671 for (size_t cl = 0; cl < kNumClasses; ++cl) {
2672 class_count[cl] += h->freelist_length(cl);
2673 }
2674 }
2675 }
2676 }
2677
2678 { //scope
2679 SpinLockHolder h(&pageheap_lock);
2680 r->system_bytes = pageheap->SystemBytes();
2681 r->metadata_bytes = metadata_system_bytes;
2682 r->pageheap_bytes = pageheap->FreeBytes();
2683 }
2684}
2685#endif
2686
2687#ifndef WTF_CHANGES
2688// WRITE stats to "out"
2689static void DumpStats(TCMalloc_Printer* out, int level) {
2690 TCMallocStats stats;
2691 uint64_t class_count[kNumClasses];
2692 ExtractStats(&stats, (level >= 2 ? class_count : NULL));
2693
2694 if (level >= 2) {
2695 out->printf("------------------------------------------------\n");
2696 uint64_t cumulative = 0;
2697 for (int cl = 0; cl < kNumClasses; ++cl) {
2698 if (class_count[cl] > 0) {
2699 uint64_t class_bytes = class_count[cl] * ByteSizeForClass(cl);
2700 cumulative += class_bytes;
2701 out->printf("class %3d [ %8" PRIuS " bytes ] : "
2702 "%8" PRIu64 " objs; %5.1f MB; %5.1f cum MB\n",
2703 cl, ByteSizeForClass(cl),
2704 class_count[cl],
2705 class_bytes / 1048576.0,
2706 cumulative / 1048576.0);
2707 }
2708 }
2709
2710 SpinLockHolder h(&pageheap_lock);
2711 pageheap->Dump(out);
2712 }
2713
2714 const uint64_t bytes_in_use = stats.system_bytes
2715 - stats.pageheap_bytes
2716 - stats.central_bytes
2717 - stats.transfer_bytes
2718 - stats.thread_bytes;
2719
2720 out->printf("------------------------------------------------\n"
2721 "MALLOC: %12" PRIu64 " Heap size\n"
2722 "MALLOC: %12" PRIu64 " Bytes in use by application\n"
2723 "MALLOC: %12" PRIu64 " Bytes free in page heap\n"
2724 "MALLOC: %12" PRIu64 " Bytes free in central cache\n"
2725 "MALLOC: %12" PRIu64 " Bytes free in transfer cache\n"
2726 "MALLOC: %12" PRIu64 " Bytes free in thread caches\n"
2727 "MALLOC: %12" PRIu64 " Spans in use\n"
2728 "MALLOC: %12" PRIu64 " Thread heaps in use\n"
2729 "MALLOC: %12" PRIu64 " Metadata allocated\n"
2730 "------------------------------------------------\n",
2731 stats.system_bytes,
2732 bytes_in_use,
2733 stats.pageheap_bytes,
2734 stats.central_bytes,
2735 stats.transfer_bytes,
2736 stats.thread_bytes,
2737 uint64_t(span_allocator.inuse()),
2738 uint64_t(threadheap_allocator.inuse()),
2739 stats.metadata_bytes);
2740}
2741
2742static void PrintStats(int level) {
2743 const int kBufferSize = 16 << 10;
2744 char* buffer = new char[kBufferSize];
2745 TCMalloc_Printer printer(buffer, kBufferSize);
2746 DumpStats(&printer, level);
2747 write(STDERR_FILENO, buffer, strlen(buffer));
2748 delete[] buffer;
2749}
2750
2751static void** DumpStackTraces() {
2752 // Count how much space we need
2753 int needed_slots = 0;
2754 {
2755 SpinLockHolder h(&pageheap_lock);
2756 for (Span* s = sampled_objects.next; s != &sampled_objects; s = s->next) {
2757 StackTrace* stack = reinterpret_cast<StackTrace*>(s->objects);
2758 needed_slots += 3 + stack->depth;
2759 }
2760 needed_slots += 100; // Slop in case sample grows
2761 needed_slots += needed_slots/8; // An extra 12.5% slop
2762 }
2763
2764 void** result = new void*[needed_slots];
2765 if (result == NULL) {
2766 MESSAGE("tcmalloc: could not allocate %d slots for stack traces\n",
2767 needed_slots);
2768 return NULL;
2769 }
2770
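// Record layout (added commentary, derived from the loop below): each sampled
// object contributes a count of 1, its sampled allocation size, its stack
// depth, and then "depth" program counter values; the array is terminated by
// a single 0 entry.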
2771 SpinLockHolder h(&pageheap_lock);
2772 int used_slots = 0;
2773 for (Span* s = sampled_objects.next; s != &sampled_objects; s = s->next) {
2774 ASSERT(used_slots < needed_slots); // Need to leave room for terminator
2775 StackTrace* stack = reinterpret_cast<StackTrace*>(s->objects);
2776 if (used_slots + 3 + stack->depth >= needed_slots) {
2777 // No more room
2778 break;
2779 }
2780
2781 result[used_slots+0] = reinterpret_cast<void*>(static_cast<uintptr_t>(1));
2782 result[used_slots+1] = reinterpret_cast<void*>(stack->size);
2783 result[used_slots+2] = reinterpret_cast<void*>(stack->depth);
2784 for (int d = 0; d < stack->depth; d++) {
2785 result[used_slots+3+d] = stack->stack[d];
2786 }
2787 used_slots += 3 + stack->depth;
2788 }
2789 result[used_slots] = reinterpret_cast<void*>(static_cast<uintptr_t>(0));
2790 return result;
2791}
2792#endif
2793
2794#ifndef WTF_CHANGES
2795
2796// TCMalloc's support for extra malloc interfaces
2797class TCMallocImplementation : public MallocExtension {
2798 public:
2799 virtual void GetStats(char* buffer, int buffer_length) {
2800 ASSERT(buffer_length > 0);
2801 TCMalloc_Printer printer(buffer, buffer_length);
2802
2803 // Print level one stats unless lots of space is available
2804 if (buffer_length < 10000) {
2805 DumpStats(&printer, 1);
2806 } else {
2807 DumpStats(&printer, 2);
2808 }
2809 }
2810
2811 virtual void** ReadStackTraces() {
2812 return DumpStackTraces();
2813 }
2814
2815 virtual bool GetNumericProperty(const char* name, size_t* value) {
2816 ASSERT(name != NULL);
2817
2818 if (strcmp(name, "generic.current_allocated_bytes") == 0) {
2819 TCMallocStats stats;
2820 ExtractStats(&stats, NULL);
2821 *value = stats.system_bytes
2822 - stats.thread_bytes
2823 - stats.central_bytes
2824 - stats.pageheap_bytes;
2825 return true;
2826 }
2827
2828 if (strcmp(name, "generic.heap_size") == 0) {
2829 TCMallocStats stats;
2830 ExtractStats(&stats, NULL);
2831 *value = stats.system_bytes;
2832 return true;
2833 }
2834
2835 if (strcmp(name, "tcmalloc.slack_bytes") == 0) {
2836 // We assume that bytes in the page heap are not fragmented too
2837 // badly, and are therefore available for allocation.
2838 SpinLockHolder l(&pageheap_lock);
2839 *value = pageheap->FreeBytes();
2840 return true;
2841 }
2842
2843 if (strcmp(name, "tcmalloc.max_total_thread_cache_bytes") == 0) {
2844 SpinLockHolder l(&pageheap_lock);
2845 *value = overall_thread_cache_size;
2846 return true;
2847 }
2848
2849 if (strcmp(name, "tcmalloc.current_total_thread_cache_bytes") == 0) {
2850 TCMallocStats stats;
2851 ExtractStats(&stats, NULL);
2852 *value = stats.thread_bytes;
2853 return true;
2854 }
2855
2856 return false;
2857 }
2858
2859 virtual bool SetNumericProperty(const char* name, size_t value) {
2860 ASSERT(name != NULL);
2861
2862 if (strcmp(name, "tcmalloc.max_total_thread_cache_bytes") == 0) {
2863 // Clip the value to a reasonable range
2864 if (value < kMinThreadCacheSize) value = kMinThreadCacheSize;
2865 if (value > (1<<30)) value = (1<<30); // Limit to 1GB
2866
2867 SpinLockHolder l(&pageheap_lock);
2868 overall_thread_cache_size = static_cast<size_t>(value);
2869 TCMalloc_ThreadCache::RecomputeThreadCacheSize();
2870 return true;
2871 }
2872
2873 return false;
2874 }
2875
2876 virtual void MarkThreadIdle() {
2877 TCMalloc_ThreadCache::BecomeIdle();
2878 }
2879
2880 virtual void ReleaseFreeMemory() {
2881 SpinLockHolder h(&pageheap_lock);
2882 pageheap->ReleaseFreePages();
2883 }
2884};
2885#endif
2886
2887// The constructor allocates an object to ensure that initialization
2888// runs before main(), and therefore we do not have a chance to become
2889// multi-threaded before initialization. We also create the TSD key
2890// here. Presumably by the time this constructor runs, glibc is in
2891// good enough shape to handle pthread_key_create().
2892//
2893// The constructor also takes the opportunity to tell STL to use
2894// tcmalloc. We want to do this early, before construct time, so
2895// all user STL allocations go through tcmalloc (which works really
2896// well for STL).
2897//
2898// The destructor prints stats when the program exits.
2899class TCMallocGuard {
2900 public:
2901
2902 TCMallocGuard() {
2903#ifdef HAVE_TLS // this is true if the cc/ld/libc combo support TLS
2904 // Check whether the kernel also supports TLS (needs to happen at runtime)
2905 CheckIfKernelSupportsTLS();
2906#endif
2907#ifndef WTF_CHANGES
2908#ifdef WIN32 // patch the windows VirtualAlloc, etc.
2909 PatchWindowsFunctions(); // defined in windows/patch_functions.cc
2910#endif
2911#endif
2912 free(malloc(1));
2913 TCMalloc_ThreadCache::InitTSD();
2914 free(malloc(1));
2915#ifndef WTF_CHANGES
2916 MallocExtension::Register(new TCMallocImplementation);
2917#endif
2918 }
2919
2920#ifndef WTF_CHANGES
2921 ~TCMallocGuard() {
2922 const char* env = getenv("MALLOCSTATS");
2923 if (env != NULL) {
2924 int level = atoi(env);
2925 if (level < 1) level = 1;
2926 PrintStats(level);
2927 }
2928#ifdef WIN32
2929 UnpatchWindowsFunctions();
2930#endif
2931 }
2932#endif
2933};
2934
2935#ifndef WTF_CHANGES
2936static TCMallocGuard module_enter_exit_hook;
2937#endif
2938
2939
2940//-------------------------------------------------------------------
2941// Helpers for the exported routines below
2942//-------------------------------------------------------------------
2943
2944#ifndef WTF_CHANGES
2945
2946static Span* DoSampledAllocation(size_t size) {
2947
2948 // Grab the stack trace outside the heap lock
2949 StackTrace tmp;
2950 tmp.depth = GetStackTrace(tmp.stack, kMaxStackDepth, 1);
2951 tmp.size = size;
2952
2953 SpinLockHolder h(&pageheap_lock);
2954 // Allocate span
2955 Span *span = pageheap->New(pages(size == 0 ? 1 : size));
2956 if (span == NULL) {
2957 return NULL;
2958 }
2959
2960 // Allocate stack trace
2961 StackTrace *stack = stacktrace_allocator.New();
2962 if (stack == NULL) {
2963 // Sampling failed because of lack of memory
2964 return span;
2965 }
2966
2967 *stack = tmp;
2968 span->sample = 1;
2969 span->objects = stack;
2970 DLL_Prepend(&sampled_objects, span);
2971
2972 return span;
2973}
2974#endif
2975
2976static inline bool CheckCachedSizeClass(void *ptr) {
2977 PageID p = reinterpret_cast<uintptr_t>(ptr) >> kPageShift;
2978 size_t cached_value = pageheap->GetSizeClassIfCached(p);
2979 return cached_value == 0 ||
2980 cached_value == pageheap->GetDescriptor(p)->sizeclass;
2981}
2982
2983static inline void* CheckedMallocResult(void *result)
2984{
2985 ASSERT(result == 0 || CheckCachedSizeClass(result));
2986 return result;
2987}
2988
2989static inline void* SpanToMallocResult(Span *span) {
2990 ASSERT_SPAN_COMMITTED(span);
2991 pageheap->CacheSizeClass(span->start, 0);
2992 return
2993 CheckedMallocResult(reinterpret_cast<void*>(span->start << kPageShift));
2994}
2995
2996#ifdef WTF_CHANGES
2997template <bool abortOnFailure>
2998#endif
2999static ALWAYS_INLINE void* do_malloc(size_t size) {
3000 void* ret = NULL;
3001
3002#ifdef WTF_CHANGES
3003 ASSERT(!isForbidden());
3004#endif
3005
3006 // The following call forces module initialization
3007 TCMalloc_ThreadCache* heap = TCMalloc_ThreadCache::GetCache();
3008#ifndef WTF_CHANGES
3009 if ((FLAGS_tcmalloc_sample_parameter > 0) && heap->SampleAllocation(size)) {
3010 Span* span = DoSampledAllocation(size);
3011 if (span != NULL) {
3012 ret = SpanToMallocResult(span);
3013 }
3014 } else
3015#endif
3016 if (size > kMaxSize) {
3017 // Use page-level allocator
3018 SpinLockHolder h(&pageheap_lock);
3019 Span* span = pageheap->New(pages(size));
3020 if (span != NULL) {
3021 ret = SpanToMallocResult(span);
3022 }
3023 } else {
3024 // The common case, and also the simplest. This just pops the
3025 // size-appropriate freelist, after replenishing it if it's empty.
3026 ret = CheckedMallocResult(heap->Allocate(size));
3027 }
3028 if (!ret) {
3029#ifdef WTF_CHANGES
3030 if (abortOnFailure) // This branch should be optimized out by the compiler.
3031 abort();
3032#else
3033 errno = ENOMEM;
3034#endif
3035 }
3036 return ret;
3037}
3038
3039static ALWAYS_INLINE void do_free(void* ptr) {
3040 if (ptr == NULL) return;
3041 ASSERT(pageheap != NULL); // Should not call free() before malloc()
3042 const PageID p = reinterpret_cast<uintptr_t>(ptr) >> kPageShift;
3043 Span* span = NULL;
3044 size_t cl = pageheap->GetSizeClassIfCached(p);
3045
3046 if (cl == 0) {
3047 span = pageheap->GetDescriptor(p);
3048 cl = span->sizeclass;
3049 pageheap->CacheSizeClass(p, cl);
3050 }
3051 if (cl != 0) {
3052#ifndef NO_TCMALLOC_SAMPLES
3053 ASSERT(!pageheap->GetDescriptor(p)->sample);
3054#endif
3055 TCMalloc_ThreadCache* heap = TCMalloc_ThreadCache::GetCacheIfPresent();
3056 if (heap != NULL) {
3057 heap->Deallocate(ptr, cl);
3058 } else {
3059 // Delete directly into central cache
3060 SLL_SetNext(ptr, NULL);
3061 central_cache[cl].InsertRange(ptr, ptr, 1);
3062 }
3063 } else {
3064 SpinLockHolder h(&pageheap_lock);
3065 ASSERT(reinterpret_cast<uintptr_t>(ptr) % kPageSize == 0);
3066 ASSERT(span != NULL && span->start == p);
3067#ifndef NO_TCMALLOC_SAMPLES
3068 if (span->sample) {
3069 DLL_Remove(span);
3070 stacktrace_allocator.Delete(reinterpret_cast<StackTrace*>(span->objects));
3071 span->objects = NULL;
3072 }
3073#endif
3074 pageheap->Delete(span);
3075 }
3076}
3077
3078#ifndef WTF_CHANGES
3079// For use by exported routines below that want specific alignments
3080//
3081// Note: this code can be slow, and can significantly fragment memory.
3082// The expectation is that memalign/posix_memalign/valloc/pvalloc will
3083// not be invoked very often. This requirement simplifies our
3084// implementation and allows us to tune for expected allocation
3085// patterns.
3086static void* do_memalign(size_t align, size_t size) {
3087 ASSERT((align & (align - 1)) == 0);
3088 ASSERT(align > 0);
3089 if (pageheap == NULL) TCMalloc_ThreadCache::InitModule();
3090
3091 // Allocate at least one byte to avoid boundary conditions below
3092 if (size == 0) size = 1;
3093
3094 if (size <= kMaxSize && align < kPageSize) {
3095 // Search through acceptable size classes looking for one with
3096 // enough alignment. This depends on the fact that
3097 // InitSizeClasses() currently produces several size classes that
3098 // are aligned at powers of two. We will waste time and space if
3099 // we miss in the size class array, but that is deemed acceptable
3100 // since memalign() should be used rarely.
3101 size_t cl = SizeClass(size);
3102 while (cl < kNumClasses && ((class_to_size[cl] & (align - 1)) != 0)) {
3103 cl++;
3104 }
3105 if (cl < kNumClasses) {
3106 TCMalloc_ThreadCache* heap = TCMalloc_ThreadCache::GetCache();
3107 return CheckedMallocResult(heap->Allocate(class_to_size[cl]));
3108 }
3109 }
3110
3111 // We will allocate directly from the page heap
3112 SpinLockHolder h(&pageheap_lock);
3113
3114 if (align <= kPageSize) {
3115 // Any page-level allocation will be fine
3116 // TODO: We could put the rest of this page in the appropriate
3117 // TODO: cache but it does not seem worth it.
3118 Span* span = pageheap->New(pages(size));
3119 return span == NULL ? NULL : SpanToMallocResult(span);
3120 }
3121
3122 // Allocate extra pages and carve off an aligned portion
3123 const Length alloc = pages(size + align);
3124 Span* span = pageheap->New(alloc);
3125 if (span == NULL) return NULL;
3126
3127 // Skip starting portion so that we end up aligned
3128 Length skip = 0;
3129 while ((((span->start+skip) << kPageShift) & (align - 1)) != 0) {
3130 skip++;
3131 }
3132 ASSERT(skip < alloc);
3133 if (skip > 0) {
3134 Span* rest = pageheap->Split(span, skip);
3135 pageheap->Delete(span);
3136 span = rest;
3137 }
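// Worked example (illustrative; assumes 4K pages): a request for 8K with 16K
// alignment allocates pages(8K + 16K) == 6 pages. If the span starts 4K past
// a 16K boundary, skip == 3, so the three leading pages were just split off
// and freed, and the trailer split below returns everything beyond the
// pages(8K) == 2 pages actually needed.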
3138
3139 // Skip trailing portion that we do not need to return
3140 const Length needed = pages(size);
3141 ASSERT(span->length >= needed);
3142 if (span->length > needed) {
3143 Span* trailer = pageheap->Split(span, needed);
3144 pageheap->Delete(trailer);
3145 }
3146 return SpanToMallocResult(span);
3147}
3148#endif
3149
3150// Helpers for use by exported routines below:
3151
3152#ifndef WTF_CHANGES
3153static inline void do_malloc_stats() {
3154 PrintStats(1);
3155}
3156#endif
3157
3158static inline int do_mallopt(int, int) {
3159 return 1; // Indicates error
3160}
3161
3162#ifdef HAVE_STRUCT_MALLINFO // mallinfo isn't defined on freebsd, for instance
3163static inline struct mallinfo do_mallinfo() {
3164 TCMallocStats stats;
3165 ExtractStats(&stats, NULL);
3166
3167 // Just some of the fields are filled in.
3168 struct mallinfo info;
3169 memset(&info, 0, sizeof(info));
3170
3171 // Unfortunately, the struct contains "int" fields, so some of the
3172 // size values will be truncated.
3173 info.arena = static_cast<int>(stats.system_bytes);
3174 info.fsmblks = static_cast<int>(stats.thread_bytes
3175 + stats.central_bytes
3176 + stats.transfer_bytes);
3177 info.fordblks = static_cast<int>(stats.pageheap_bytes);
3178 info.uordblks = static_cast<int>(stats.system_bytes
3179 - stats.thread_bytes
3180 - stats.central_bytes
3181 - stats.transfer_bytes
3182 - stats.pageheap_bytes);
3183
3184 return info;
3185}
3186#endif
3187
3188//-------------------------------------------------------------------
3189// Exported routines
3190//-------------------------------------------------------------------
3191
3192// CAVEAT: The code structure below ensures that MallocHook methods are always
3193// called from the stack frame of the invoked allocation function.
3194// heap-checker.cc depends on this to start a stack trace from
3195// the call to the (de)allocation function.
3196
3197#ifndef WTF_CHANGES
3198extern "C"
3199#else
3200#define do_malloc do_malloc<abortOnFailure>
3201
3202template <bool abortOnFailure>
3203void* malloc(size_t);
3204
3205void* fastMalloc(size_t size)
3206{
3207 return malloc<true>(size);
3208}
3209
3210void* tryFastMalloc(size_t size)
3211{
3212 return malloc<false>(size);
3213}
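// Caller-side sketch (illustrative, not part of this file): fastMalloc()
// abort()s on failure, so its result needs no null check, whereas
// tryFastMalloc() returns NULL on failure and callers must check it, e.g.:
//
//   void* buffer = tryFastMalloc(bufferSize);
//   if (!buffer)
//       return false; // report the failure instead of crashing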
3214
3215template <bool abortOnFailure>
3216ALWAYS_INLINE
3217#endif
3218void* malloc(size_t size) {
3219 void* result = do_malloc(size);
3220#ifndef WTF_CHANGES
3221 MallocHook::InvokeNewHook(result, size);
3222#endif
3223 return result;
3224}
3225
3226#ifndef WTF_CHANGES
3227extern "C"
3228#endif
3229void free(void* ptr) {
3230#ifndef WTF_CHANGES
3231 MallocHook::InvokeDeleteHook(ptr);
3232#endif
3233 do_free(ptr);
3234}
3235
3236#ifndef WTF_CHANGES
3237extern "C"
3238#else
3239template <bool abortOnFailure>
3240void* calloc(size_t, size_t);
3241
3242void* fastCalloc(size_t n, size_t elem_size)
3243{
3244 return calloc<true>(n, elem_size);
3245}
3246
3247void* tryFastCalloc(size_t n, size_t elem_size)
3248{
3249 return calloc<false>(n, elem_size);
3250}
3251
3252template <bool abortOnFailure>
3253ALWAYS_INLINE
3254#endif
3255void* calloc(size_t n, size_t elem_size) {
3256 const size_t totalBytes = n * elem_size;
3257
3258 // Protect against overflow
3259 if (n > 1 && elem_size && (totalBytes / elem_size) != n)
3260 return 0;
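// Overflow example (illustrative): with a 32-bit size_t, n == 0x10000 and
// elem_size == 0x10001 make n * elem_size wrap around to 0x10000; the
// division check above then yields 0 != n, so the request is rejected.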
3261
3262 void* result = do_malloc(totalBytes);
3263 if (result != NULL) {
3264 memset(result, 0, totalBytes);
3265 }
3266#ifndef WTF_CHANGES
3267 MallocHook::InvokeNewHook(result, totalBytes);
3268#endif
3269 return result;
3270}
3271
3272#ifndef WTF_CHANGES
3273extern "C"
3274#endif
3275void cfree(void* ptr) {
3276#ifndef WTF_CHANGES
3277 MallocHook::InvokeDeleteHook(ptr);
3278#endif
3279 do_free(ptr);
3280}
3281
3282#ifndef WTF_CHANGES
3283extern "C"
3284#else
3285template <bool abortOnFailure>
3286void* realloc(void*, size_t);
3287
3288void* fastRealloc(void* old_ptr, size_t new_size)
3289{
3290 return realloc<true>(old_ptr, new_size);
3291}
3292
3293void* tryFastRealloc(void* old_ptr, size_t new_size)
3294{
3295 return realloc<false>(old_ptr, new_size);
3296}
3297
3298template <bool abortOnFailure>
3299ALWAYS_INLINE
3300#endif
3301void* realloc(void* old_ptr, size_t new_size) {
3302 if (old_ptr == NULL) {
3303 void* result = do_malloc(new_size);
3304#ifndef WTF_CHANGES
3305 MallocHook::InvokeNewHook(result, new_size);
3306#endif
3307 return result;
3308 }
3309 if (new_size == 0) {
3310#ifndef WTF_CHANGES
3311 MallocHook::InvokeDeleteHook(old_ptr);
3312#endif
3313 free(old_ptr);
3314 return NULL;
3315 }
3316
3317 // Get the size of the old entry
3318 const PageID p = reinterpret_cast<uintptr_t>(old_ptr) >> kPageShift;
3319 size_t cl = pageheap->GetSizeClassIfCached(p);
3320 Span *span = NULL;
3321 size_t old_size;
3322 if (cl == 0) {
3323 span = pageheap->GetDescriptor(p);
3324 cl = span->sizeclass;
3325 pageheap->CacheSizeClass(p, cl);
3326 }
3327 if (cl != 0) {
3328 old_size = ByteSizeForClass(cl);
3329 } else {
3330 ASSERT(span != NULL);
3331 old_size = span->length << kPageShift;
3332 }
3333
3334 // Reallocate if the new size is larger than the old size,
3335 // or if the new size is significantly smaller than the old size.
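// Example (illustrative size-class figures): shrinking a 4096-byte block to a
// request whose AllocationSize() is, say, 128 bytes triggers a real
// reallocation, while shrinking it to a request that still rounds up to 4096
// bytes keeps the existing block.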
3336 if ((new_size > old_size) || (AllocationSize(new_size) < old_size)) {
3337 // Need to reallocate
3338 void* new_ptr = do_malloc(new_size);
3339 if (new_ptr == NULL) {
3340 return NULL;
3341 }
3342#ifndef WTF_CHANGES
3343 MallocHook::InvokeNewHook(new_ptr, new_size);
3344#endif
3345 memcpy(new_ptr, old_ptr, ((old_size < new_size) ? old_size : new_size));
3346#ifndef WTF_CHANGES
3347 MallocHook::InvokeDeleteHook(old_ptr);
3348#endif
3349 // We could use a variant of do_free() that leverages the fact
3350 // that we already know the sizeclass of old_ptr. The benefit
3351 // would be small, so don't bother.
3352 do_free(old_ptr);
3353 return new_ptr;
3354 } else {
3355 return old_ptr;
3356 }
3357}
3358
3359#ifdef WTF_CHANGES
3360#undef do_malloc
3361#else
3362
3363static SpinLock set_new_handler_lock = SPINLOCK_INITIALIZER;
3364
3365static inline void* cpp_alloc(size_t size, bool nothrow) {
3366 for (;;) {
3367 void* p = do_malloc(size);
3368#ifdef PREANSINEW
3369 return p;
3370#else
3371 if (p == NULL) { // allocation failed
3372 // Get the current new handler. NB: this function is not
3373 // thread-safe. We make a feeble stab at making it so here, but
3374 // this lock only protects against tcmalloc interfering with
3375 // itself, not with other libraries calling set_new_handler.
3376 std::new_handler nh;
3377 {
3378 SpinLockHolder h(&set_new_handler_lock);
3379 nh = std::set_new_handler(0);
3380 (void) std::set_new_handler(nh);
3381 }
3382 // If no new_handler is established, the allocation failed.
3383 if (!nh) {
3384 if (nothrow) return 0;
3385 throw std::bad_alloc();
3386 }
3387 // Otherwise, try the new_handler. If it returns, retry the
3388 // allocation. If it throws std::bad_alloc, fail the allocation.
3389 // If it throws something else, don't interfere.
3390 try {
3391 (*nh)();
3392 } catch (const std::bad_alloc&) {
3393 if (!nothrow) throw;
3394 return p;
3395 }
3396 } else { // allocation success
3397 return p;
3398 }
3399#endif
3400 }
3401}
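
// A minimal sketch of the std::new_handler contract that cpp_alloc() drives (hypothetical
// client code, never compiled): a handler either releases memory and returns, in which
// case the loop above retries do_malloc(), or throws std::bad_alloc / terminates.
#if 0
static char* emergencyReserve = new char[64 * 1024];

static void releaseReserveHandler()
{
    if (emergencyReserve) {
        delete [] emergencyReserve;
        emergencyReserve = 0;       // returning lets cpp_alloc() retry the allocation
    } else
        throw std::bad_alloc();     // nothing left to give back; fail the allocation
}

// Installed once at startup, e.g.: std::set_new_handler(releaseReserveHandler);
#endif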
3402
3403void* operator new(size_t size) {
3404 void* p = cpp_alloc(size, false);
3405 // We keep this next instruction out of cpp_alloc for a reason: when
3406 // it's in, and new just calls cpp_alloc, the optimizer may fold the
3407 // new call into cpp_alloc, which messes up our whole section-based
3408 // stacktracing (see ATTRIBUTE_SECTION, above). This ensures cpp_alloc
3409 // isn't the last thing this fn calls, and prevents the folding.
3410 MallocHook::InvokeNewHook(p, size);
3411 return p;
3412}
3413
3414void* operator new(size_t size, const std::nothrow_t&) __THROW {
3415 void* p = cpp_alloc(size, true);
3416 MallocHook::InvokeNewHook(p, size);
3417 return p;
3418}
3419
3420void operator delete(void* p) __THROW {
3421 MallocHook::InvokeDeleteHook(p);
3422 do_free(p);
3423}
3424
3425void operator delete(void* p, const std::nothrow_t&) __THROW {
3426 MallocHook::InvokeDeleteHook(p);
3427 do_free(p);
3428}
3429
3430void* operator new[](size_t size) {
3431 void* p = cpp_alloc(size, false);
3432 // We keep this next instruction out of cpp_alloc for a reason: when
3433 // it's in, and new just calls cpp_alloc, the optimizer may fold the
3434 // new call into cpp_alloc, which messes up our whole section-based
3435 // stacktracing (see ATTRIBUTE_SECTION, above). This ensures cpp_alloc
3436 // isn't the last thing this fn calls, and prevents the folding.
3437 MallocHook::InvokeNewHook(p, size);
3438 return p;
3439}
3440
3441void* operator new[](size_t size, const std::nothrow_t&) __THROW {
3442 void* p = cpp_alloc(size, true);
3443 MallocHook::InvokeNewHook(p, size);
3444 return p;
3445}
3446
3447void operator delete[](void* p) __THROW {
3448 MallocHook::InvokeDeleteHook(p);
3449 do_free(p);
3450}
3451
3452void operator delete[](void* p, const std::nothrow_t&) __THROW {
3453 MallocHook::InvokeDeleteHook(p);
3454 do_free(p);
3455}
3456
3457extern "C" void* memalign(size_t align, size_t size) __THROW {
3458 void* result = do_memalign(align, size);
3459 MallocHook::InvokeNewHook(result, size);
3460 return result;
3461}
3462
3463extern "C" int posix_memalign(void** result_ptr, size_t align, size_t size)
3464 __THROW {
3465 if (((align % sizeof(void*)) != 0) ||
3466 ((align & (align - 1)) != 0) ||
3467 (align == 0)) {
3468 return EINVAL;
3469 }
3470
3471 void* result = do_memalign(align, size);
3472 MallocHook::InvokeNewHook(result, size);
3473 if (result == NULL) {
3474 return ENOMEM;
3475 } else {
3476 *result_ptr = result;
3477 return 0;
3478 }
3479}
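
// The validation above enforces POSIX's requirements on "align": it must be a multiple of
// sizeof(void*) and a power of two. The (align & (align - 1)) expression is the usual
// power-of-two test: for align == 24, 24 & 23 == 16, so the call fails with EINVAL; for
// align == 16, 16 & 15 == 0 and (on common 4- or 8-byte-pointer targets)
// 16 % sizeof(void*) == 0, so the request proceeds to do_memalign().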
3480
3481static size_t pagesize = 0;
3482
3483extern "C" void* valloc(size_t size) __THROW {
3484 // Allocate page-aligned object of length >= size bytes
3485 if (pagesize == 0) pagesize = getpagesize();
3486 void* result = do_memalign(pagesize, size);
3487 MallocHook::InvokeNewHook(result, size);
3488 return result;
3489}
3490
3491extern "C" void* pvalloc(size_t size) __THROW {
3492 // Round up size to a multiple of pagesize
3493 if (pagesize == 0) pagesize = getpagesize();
3494 size = (size + pagesize - 1) & ~(pagesize - 1);
3495 void* result = do_memalign(pagesize, size);
3496 MallocHook::InvokeNewHook(result, size);
3497 return result;
3498}
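
// Worked example of the rounding above, assuming a 4096-byte page: size = 5000 gives
// (5000 + 4095) & ~4095 == 9095 & ~4095 == 8192, i.e. the request is padded to two whole
// pages before being handed to do_memalign(). The mask trick only works because pagesize
// is a power of two.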
3499
3500extern "C" void malloc_stats(void) {
3501 do_malloc_stats();
3502}
3503
3504extern "C" int mallopt(int cmd, int value) {
3505 return do_mallopt(cmd, value);
3506}
3507
3508#ifdef HAVE_STRUCT_MALLINFO
3509extern "C" struct mallinfo mallinfo(void) {
3510 return do_mallinfo();
3511}
3512#endif
3513
3514//-------------------------------------------------------------------
3515// Some library routines on RedHat 9 allocate memory using malloc()
3516// and free it using __libc_free() (or vice-versa). Since we provide
3517// our own implementations of malloc/free, we need to make sure that
3518// the __libc_XXX variants (defined as part of glibc) also point to
3519// the same implementations.
3520//-------------------------------------------------------------------
3521
3522#if defined(__GLIBC__)
3523extern "C" {
3524# if defined(__GNUC__) && !defined(__MACH__) && defined(HAVE___ATTRIBUTE__)
3525 // Potentially faster variants that use the gcc alias extension.
3526 // Mach-O (Darwin) does not support weak aliases, hence the __MACH__ check.
3527# define ALIAS(x) __attribute__ ((weak, alias (x)))
3528 void* __libc_malloc(size_t size) ALIAS("malloc");
3529 void __libc_free(void* ptr) ALIAS("free");
3530 void* __libc_realloc(void* ptr, size_t size) ALIAS("realloc");
3531 void* __libc_calloc(size_t n, size_t size) ALIAS("calloc");
3532 void __libc_cfree(void* ptr) ALIAS("cfree");
3533 void* __libc_memalign(size_t align, size_t s) ALIAS("memalign");
3534 void* __libc_valloc(size_t size) ALIAS("valloc");
3535 void* __libc_pvalloc(size_t size) ALIAS("pvalloc");
3536 int __posix_memalign(void** r, size_t a, size_t s) ALIAS("posix_memalign");
3537# undef ALIAS
3538# else /* not __GNUC__ */
3539 // Portable wrappers
3540 void* __libc_malloc(size_t size) { return malloc(size); }
3541 void __libc_free(void* ptr) { free(ptr); }
3542 void* __libc_realloc(void* ptr, size_t size) { return realloc(ptr, size); }
3543 void* __libc_calloc(size_t n, size_t size) { return calloc(n, size); }
3544 void __libc_cfree(void* ptr) { cfree(ptr); }
3545 void* __libc_memalign(size_t align, size_t s) { return memalign(align, s); }
3546 void* __libc_valloc(size_t size) { return valloc(size); }
3547 void* __libc_pvalloc(size_t size) { return pvalloc(size); }
3548 int __posix_memalign(void** r, size_t a, size_t s) {
3549 return posix_memalign(r, a, s);
3550 }
3551# endif /* __GNUC__ */
3552}
3553#endif /* __GLIBC__ */
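
// For reference, the ALIAS() macro above expands to a weak alias, e.g.
//   void* __libc_malloc(size_t size) __attribute__ ((weak, alias ("malloc")));
// so glibc-internal callers that go through __libc_malloc() resolve to the malloc()
// defined in this file without the extra call frame of the portable wrappers.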
3554
3555// Override __libc_memalign in libc on Linux boxes specially.
3556// glibc has a bug that causes it to (very rarely) allocate
3557// with __libc_memalign() yet deallocate with free(), and the
3558// definitions above don't catch that case.
3559// This function is an exception to the rule of calling a MallocHook method
3560// from the stack frame of the allocation function;
3561// the heap-checker handles this special case explicitly.
3562static void *MemalignOverride(size_t align, size_t size, const void *caller)
3563 __THROW {
3564 void* result = do_memalign(align, size);
3565 MallocHook::InvokeNewHook(result, size);
3566 return result;
3567}
3568void *(*__memalign_hook)(size_t, size_t, const void *) = MemalignOverride;
3569
3570#endif
3571
3572#if defined(WTF_CHANGES) && PLATFORM(DARWIN)
3573
3574class FreeObjectFinder {
3575 const RemoteMemoryReader& m_reader;
3576 HashSet<void*> m_freeObjects;
3577
3578public:
3579 FreeObjectFinder(const RemoteMemoryReader& reader) : m_reader(reader) { }
3580
3581 void visit(void* ptr) { m_freeObjects.add(ptr); }
3582 bool isFreeObject(void* ptr) const { return m_freeObjects.contains(ptr); }
3583 size_t freeObjectCount() const { return m_freeObjects.size(); }
3584
3585 void findFreeObjects(TCMalloc_ThreadCache* threadCache)
3586 {
3587 for (; threadCache; threadCache = (threadCache->next_ ? m_reader(threadCache->next_) : 0))
3588 threadCache->enumerateFreeObjects(*this, m_reader);
3589 }
3590
3591 void findFreeObjects(TCMalloc_Central_FreeListPadded* centralFreeList, size_t numSizes)
3592 {
3593 for (unsigned i = 0; i < numSizes; i++)
3594 centralFreeList[i].enumerateFreeObjects(*this, m_reader);
3595 }
3596};
3597
3598class PageMapFreeObjectFinder {
3599 const RemoteMemoryReader& m_reader;
3600 FreeObjectFinder& m_freeObjectFinder;
3601
3602public:
3603 PageMapFreeObjectFinder(const RemoteMemoryReader& reader, FreeObjectFinder& freeObjectFinder)
3604 : m_reader(reader)
3605 , m_freeObjectFinder(freeObjectFinder)
3606 { }
3607
3608 int visit(void* ptr) const
3609 {
3610 if (!ptr)
3611 return 1;
3612
3613 Span* span = m_reader(reinterpret_cast<Span*>(ptr));
3614 if (span->free) {
3615 void* ptr = reinterpret_cast<void*>(span->start << kPageShift);
3616 m_freeObjectFinder.visit(ptr);
3617 } else if (span->sizeclass) {
3618 // Walk the free list of the small-object span, keeping track of each object seen
3619 for (void* nextObject = span->objects; nextObject; nextObject = *m_reader(reinterpret_cast<void**>(nextObject)))
3620 m_freeObjectFinder.visit(nextObject);
3621 }
3622 return span->length;
3623 }
3624};
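
// Note on the free-list walk above: in a small-object span the first word of every free
// object stores the pointer to the next free object, which is why visit() advances by
// reading through nextObject via the remote reader. The span's length in pages is the
// return value so that (presumably) the page-map visitor can skip the pages this span covers.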
3625
3626class PageMapMemoryUsageRecorder {
3627 task_t m_task;
3628 void* m_context;
3629 unsigned m_typeMask;
3630 vm_range_recorder_t* m_recorder;
3631 const RemoteMemoryReader& m_reader;
3632 const FreeObjectFinder& m_freeObjectFinder;
3633 mutable HashSet<void*> m_seenPointers;
3634
3635public:
3636 PageMapMemoryUsageRecorder(task_t task, void* context, unsigned typeMask, vm_range_recorder_t* recorder, const RemoteMemoryReader& reader, const FreeObjectFinder& freeObjectFinder)
3637 : m_task(task)
3638 , m_context(context)
3639 , m_typeMask(typeMask)
3640 , m_recorder(recorder)
3641 , m_reader(reader)
3642 , m_freeObjectFinder(freeObjectFinder)
3643 { }
3644
3645 int visit(void* ptr) const
3646 {
3647 if (!ptr)
3648 return 1;
3649
3650 Span* span = m_reader(reinterpret_cast<Span*>(ptr));
3651 if (m_seenPointers.contains(ptr))
3652 return span->length;
3653 m_seenPointers.add(ptr);
3654
3655 // Mark the memory used for the Span itself as an administrative region
3656 vm_range_t ptrRange = { reinterpret_cast<vm_address_t>(ptr), sizeof(Span) };
3657 if (m_typeMask & (MALLOC_PTR_REGION_RANGE_TYPE | MALLOC_ADMIN_REGION_RANGE_TYPE))
3658 (*m_recorder)(m_task, m_context, MALLOC_ADMIN_REGION_RANGE_TYPE, &ptrRange, 1);
3659
3660 ptrRange.address = span->start << kPageShift;
3661 ptrRange.size = span->length * kPageSize;
3662
3663 // Mark the memory region the span represents as candidates for containing pointers
3664 if (m_typeMask & (MALLOC_PTR_REGION_RANGE_TYPE | MALLOC_ADMIN_REGION_RANGE_TYPE))
3665 (*m_recorder)(m_task, m_context, MALLOC_PTR_REGION_RANGE_TYPE, &ptrRange, 1);
3666
3667 if (!span->free && (m_typeMask & MALLOC_PTR_IN_USE_RANGE_TYPE)) {
3668 // If it's an allocated large object span, mark it as in use
3669 if (span->sizeclass == 0 && !m_freeObjectFinder.isFreeObject(reinterpret_cast<void*>(ptrRange.address)))
3670 (*m_recorder)(m_task, m_context, MALLOC_PTR_IN_USE_RANGE_TYPE, &ptrRange, 1);
3671 else if (span->sizeclass) {
3672 const size_t byteSize = ByteSizeForClass(span->sizeclass);
3673 unsigned totalObjects = (span->length << kPageShift) / byteSize;
3674 ASSERT(span->refcount <= totalObjects);
3675 char* ptr = reinterpret_cast<char*>(span->start << kPageShift);
3676
3677 // Mark each allocated small object within the span as in use
3678 for (unsigned i = 0; i < totalObjects; i++) {
3679 char* thisObject = ptr + (i * byteSize);
3680 if (m_freeObjectFinder.isFreeObject(thisObject))
3681 continue;
3682
3683 vm_range_t objectRange = { reinterpret_cast<vm_address_t>(thisObject), byteSize };
3684 (*m_recorder)(m_task, m_context, MALLOC_PTR_IN_USE_RANGE_TYPE, &objectRange, 1);
3685 }
3686 }
3687 }
3688
3689 return span->length;
3690 }
3691};
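
// Summary of the ranges reported above: the Span structure itself is recorded as an
// administrative region, the pages the span covers are recorded as a pointer region, and
// individual allocations (the whole span for large objects, each non-free object for
// small-object spans) are recorded as in-use ranges for the requesting tool.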
3692
3693kern_return_t FastMallocZone::enumerate(task_t task, void* context, unsigned typeMask, vm_address_t zoneAddress, memory_reader_t reader, vm_range_recorder_t recorder)
3694{
3695 RemoteMemoryReader memoryReader(task, reader);
3696
3697 InitSizeClasses();
3698
3699 FastMallocZone* mzone = memoryReader(reinterpret_cast<FastMallocZone*>(zoneAddress));
3700 TCMalloc_PageHeap* pageHeap = memoryReader(mzone->m_pageHeap);
3701 TCMalloc_ThreadCache** threadHeapsPointer = memoryReader(mzone->m_threadHeaps);
3702 TCMalloc_ThreadCache* threadHeaps = memoryReader(*threadHeapsPointer);
3703
3704 TCMalloc_Central_FreeListPadded* centralCaches = memoryReader(mzone->m_centralCaches, sizeof(TCMalloc_Central_FreeListPadded) * kNumClasses);
3705
3706 FreeObjectFinder finder(memoryReader);
3707 finder.findFreeObjects(threadHeaps);
3708 finder.findFreeObjects(centralCaches, kNumClasses);
3709
3710 TCMalloc_PageHeap::PageMap* pageMap = &pageHeap->pagemap_;
3711 PageMapFreeObjectFinder pageMapFinder(memoryReader, finder);
3712 pageMap->visit(pageMapFinder, memoryReader);
3713
3714 PageMapMemoryUsageRecorder usageRecorder(task, context, typeMask, recorder, memoryReader, finder);
3715 pageMap->visit(usageRecorder, memoryReader);
3716
3717 return 0;
3718}
3719
3720size_t FastMallocZone::size(malloc_zone_t*, const void*)
3721{
3722 return 0;
3723}
3724
3725void* FastMallocZone::zoneMalloc(malloc_zone_t*, size_t)
3726{
3727 return 0;
3728}
3729
3730void* FastMallocZone::zoneCalloc(malloc_zone_t*, size_t, size_t)
3731{
3732 return 0;
3733}
3734
3735void FastMallocZone::zoneFree(malloc_zone_t*, void* ptr)
3736{
3737 // Due to <rdar://problem/5671357> zoneFree may be called by the system's free() even if the pointer
3738 // is not in this zone. When this happens, the pointer being freed was not allocated by any
3739 // zone, so we need to print a useful error for the application developer.
3740 malloc_printf("*** error for object %p: pointer being freed was not allocated\n", ptr);
3741}
3742
3743void* FastMallocZone::zoneRealloc(malloc_zone_t*, void*, size_t)
3744{
3745 return 0;
3746}
3747
3748
3749#undef malloc
3750#undef free
3751#undef realloc
3752#undef calloc
3753
3754extern "C" {
3755malloc_introspection_t jscore_fastmalloc_introspection = { &FastMallocZone::enumerate, &FastMallocZone::goodSize, &FastMallocZone::check, &FastMallocZone::print,
3756 &FastMallocZone::log, &FastMallocZone::forceLock, &FastMallocZone::forceUnlock, &FastMallocZone::statistics };
3757}
3758
3759FastMallocZone::FastMallocZone(TCMalloc_PageHeap* pageHeap, TCMalloc_ThreadCache** threadHeaps, TCMalloc_Central_FreeListPadded* centralCaches)
3760 : m_pageHeap(pageHeap)
3761 , m_threadHeaps(threadHeaps)
3762 , m_centralCaches(centralCaches)
3763{
3764 memset(&m_zone, 0, sizeof(m_zone));
3765 m_zone.zone_name = "JavaScriptCore FastMalloc";
3766 m_zone.size = &FastMallocZone::size;
3767 m_zone.malloc = &FastMallocZone::zoneMalloc;
3768 m_zone.calloc = &FastMallocZone::zoneCalloc;
3769 m_zone.realloc = &FastMallocZone::zoneRealloc;
3770 m_zone.free = &FastMallocZone::zoneFree;
3771 m_zone.valloc = &FastMallocZone::zoneValloc;
3772 m_zone.destroy = &FastMallocZone::zoneDestroy;
3773 m_zone.introspect = &jscore_fastmalloc_introspection;
3774 malloc_zone_register(&m_zone);
3775}
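
// Registering the zone above is what makes FastMalloc allocations visible to Darwin's
// malloc debugging machinery; the callbacks installed in jscore_fastmalloc_introspection
// (enumerate, statistics, etc.) are the entry points that tools such as "leaks" and "heap"
// use to walk this zone's memory.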
3776
3777
3778void FastMallocZone::init()
3779{
3780 static FastMallocZone zone(pageheap, &thread_heaps, static_cast<TCMalloc_Central_FreeListPadded*>(central_cache));
3781}
3782
3783#endif
3784
3785void releaseFastMallocFreeMemory()
3786{
3787 SpinLockHolder h(&pageheap_lock);
3788 pageheap->ReleaseFreePages();
3789}
3790
3791#if WTF_CHANGES
3792} // namespace WTF
3793#endif
3794
3795#endif // FORCE_SYSTEM_MALLOC