Source File: arena.go
Belonging Package: runtime
// Copyright 2022 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

// Implementation of (safe) user arenas.
//
// This file contains the implementation of user arenas wherein Go values can
// be manually allocated and freed in bulk. The act of manually freeing memory,
// potentially before a GC cycle, means that a garbage collection cycle can be
// delayed, improving efficiency by reducing GC cycle frequency. There are other
// potential efficiency benefits, such as improved locality and access to a more
// efficient allocation strategy.
//
// What makes the arenas here safe is that once they are freed, accessing the
// arena's memory will cause an explicit program fault, and the arena's address
// space will not be reused until no more pointers into it are found. There's one
// exception to this: if an arena allocated memory that isn't exhausted, it's placed
// back into a pool for reuse. This means that a crash is not always guaranteed.
//
// While this may seem unsafe, it still prevents memory corruption, and is in fact
// necessary in order to make new(T) a valid implementation of arenas. Such a property
// is desirable to allow for a trivial implementation. (It also avoids complexities
// that arise from synchronization with the GC when trying to set the arena chunks to
// fault while the GC is active.)
//
// The implementation works in layers. At the bottom, arenas are managed in chunks.
// Each chunk must be a multiple of the heap arena size, or the heap arena size must
// be divisible by the arena chunks. The address space for each chunk, and each
// corresponding heapArena for that address space, are eternally reserved for use as
// arena chunks. That is, they can never be used for the general heap. Each chunk
// is also represented by a single mspan, and is modeled as a single large heap
// allocation. It must be, because each chunk contains ordinary Go values that may
// point into the heap, so it must be scanned just like any other object. Any
// pointer into a chunk will therefore always cause the whole chunk to be scanned
// while its corresponding arena is still live.
//
// Chunks may be allocated either from new memory mapped by the OS on our behalf,
// or by reusing old freed chunks. When chunks are freed, their underlying memory
// is returned to the OS, set to fault on access, and may not be reused until the
// program doesn't point into the chunk anymore (the code refers to this state as
// "quarantined"), a property checked by the GC.
//
// The sweeper handles moving chunks out of this quarantine state to be ready for
// reuse. When the chunk is placed into the quarantine state, its corresponding
// span is marked as noscan so that the GC doesn't try to scan memory that would
// cause a fault.
//
// At the next layer are the user arenas themselves. They consist of a single
// active chunk which new Go values are bump-allocated into and a list of chunks
// that were exhausted when allocating into the arena. Once the arena is freed,
// it frees all full chunks it references, and places the active one onto a reuse
// list for a future arena to use. Each arena keeps its list of referenced chunks
// explicitly live until it is freed.
// Each user arena also maps to an object which has a finalizer attached that
// ensures the arena's chunks are all freed even if the arena itself is never
// explicitly freed.
//
// Pointer-ful memory is bump-allocated from low addresses to high addresses in each
// chunk, while pointer-free memory is bump-allocated from high addresses to low
// addresses. The reason for this is to take advantage of a GC optimization wherein
// the GC will stop scanning an object when there are no more pointers in it, which
// also allows us to elide clearing the heap bitmap for pointer-free Go values
// allocated into arenas.
//
// Note that arenas are not safe to use concurrently.
//
// In summary, there are 2 resources: arenas, and arena chunks. They exist in the
// following lifecycle:
//
// (1) A new arena is created via newArena.
// (2) Chunks are allocated to hold memory allocated into the arena with new or slice.
//     (a) Chunks are first allocated from the reuse list of partially-used chunks.
//     (b) If there are no such chunks, then chunks on the ready list are taken.
//     (c) Failing all the above, memory for a new chunk is mapped.
// (3) The arena is freed, or all references to it are dropped, triggering its finalizer.
//     (a) If the GC is not active, exhausted chunks are set to fault and placed on a
//         quarantine list.
//     (b) If the GC is active, exhausted chunks are placed on a fault list and will
//         go through step (a) at a later point in time.
//     (c) Any remaining partially-used chunk is placed on a reuse list.
// (4) Once no more pointers are found into quarantined arena chunks, the sweeper
//     takes these chunks out of quarantine and places them on the ready list.

package runtime

import (
	"internal/abi"
	"internal/goarch"
	"internal/runtime/atomic"
	"internal/runtime/math"
	"internal/runtime/sys"
	"unsafe"
)

// Functions starting with arena_ are meant to be exported to downstream users
// of arenas. They should wrap these functions in a higher-level API.
//
// The underlying arena and its resources are managed through an opaque unsafe.Pointer.

// arena_newArena is a wrapper around newUserArena.
//
//go:linkname arena_newArena arena.runtime_arena_newArena
func arena_newArena() unsafe.Pointer {
	return unsafe.Pointer(newUserArena())
}

// arena_arena_New is a wrapper around (*userArena).new, except that typ
// is an any (must be a *_type, still) and typ must be a type descriptor
// for a pointer to the type to actually be allocated, i.e. pass a *T
// to allocate a T. This is necessary because this function returns a *T.
//
//go:linkname arena_arena_New arena.runtime_arena_arena_New
func arena_arena_New(arena unsafe.Pointer, typ any) any {
	t := (*_type)(efaceOf(&typ).data)
	if t.Kind_&abi.KindMask != abi.Pointer {
		throw("arena_New: non-pointer type")
	}
	te := (*ptrtype)(unsafe.Pointer(t)).Elem
	x := ((*userArena)(arena)).new(te)
	var result any
	e := efaceOf(&result)
	e._type = t
	e.data = x
	return result
}

// arena_arena_Slice is a wrapper around (*userArena).slice.
//
//go:linkname arena_arena_Slice arena.runtime_arena_arena_Slice
func arena_arena_Slice(arena unsafe.Pointer, slice any, cap int) {
	((*userArena)(arena)).slice(slice, cap)
}

// arena_arena_Free is a wrapper around (*userArena).free.
//
//go:linkname arena_arena_Free arena.runtime_arena_arena_Free
func arena_arena_Free(arena unsafe.Pointer) {
	((*userArena)(arena)).free()
}
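// Illustrative sketch of the downstream side of the linknamed hooks above,
// assuming the general shape of the experimental arena package: the wrapper
// declares body-less functions that the runtime pushes these symbols onto and
// layers a typed API on top. The identifiers and the use of reflect below are
// assumptions for illustration only; the real package uses internal helpers.
//
//	package arena
//
//	import (
//		"reflect"
//		"unsafe"
//	)
//
//	//go:linkname runtime_arena_newArena
//	func runtime_arena_newArena() unsafe.Pointer
//
//	//go:linkname runtime_arena_arena_New
//	func runtime_arena_arena_New(arena unsafe.Pointer, typ any) any
//
//	//go:linkname runtime_arena_arena_Free
//	func runtime_arena_arena_Free(arena unsafe.Pointer)
//
//	type Arena struct{ a unsafe.Pointer }
//
//	func NewArena() *Arena { return &Arena{a: runtime_arena_newArena()} }
//
//	func (a *Arena) Free() { runtime_arena_arena_Free(a.a); a.a = nil }
//
//	// New passes a *T type descriptor, per the contract on arena_arena_New,
//	// and asserts the returned any back to *T.
//	func New[T any](a *Arena) *T {
//		return runtime_arena_arena_New(a.a, reflect.TypeOf((*T)(nil))).(*T)
//	}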
// arena_heapify takes a value that lives in an arena and makes a copy
// of it on the heap. Values that don't live in an arena are returned unmodified.
//
//go:linkname arena_heapify arena.runtime_arena_heapify
func arena_heapify(s any) any {
	var v unsafe.Pointer
	e := efaceOf(&s)
	t := e._type
	switch t.Kind_ & abi.KindMask {
	case abi.String:
		v = stringStructOf((*string)(e.data)).str
	case abi.Slice:
		v = (*slice)(e.data).array
	case abi.Pointer:
		v = e.data
	default:
		panic("arena: Clone only supports pointers, slices, and strings")
	}
	span := spanOf(uintptr(v))
	if span == nil || !span.isUserArenaChunk {
		// Not stored in a user arena chunk.
		return s
	}
	// Heap-allocate storage for a copy.
	var x any
	switch t.Kind_ & abi.KindMask {
	case abi.String:
		s1 := s.(string)
		s2, b := rawstring(len(s1))
		copy(b, s1)
		x = s2
	case abi.Slice:
		len := (*slice)(e.data).len
		et := (*slicetype)(unsafe.Pointer(t)).Elem
		sl := new(slice)
		*sl = slice{makeslicecopy(et, len, len, (*slice)(e.data).array), len, len}
		xe := efaceOf(&x)
		xe._type = t
		xe.data = unsafe.Pointer(sl)
	case abi.Pointer:
		et := (*ptrtype)(unsafe.Pointer(t)).Elem
		e2 := newobject(et)
		typedmemmove(et, e2, e.data)
		xe := efaceOf(&x)
		xe._type = t
		xe.data = e2
	}
	return x
}

const (
	// userArenaChunkBytes is the size of a user arena chunk.
	userArenaChunkBytesMax = 8 << 20
	userArenaChunkBytes    = uintptr(int64(userArenaChunkBytesMax-heapArenaBytes)&(int64(userArenaChunkBytesMax-heapArenaBytes)>>63) + heapArenaBytes) // min(userArenaChunkBytesMax, heapArenaBytes)

	// userArenaChunkPages is the number of pages a user arena chunk uses.
	userArenaChunkPages = userArenaChunkBytes / pageSize

	// userArenaChunkMaxAllocBytes is the maximum size of an object that can
	// be allocated from an arena. This number is chosen to cap worst-case
	// fragmentation of user arenas to 25%. Larger allocations are redirected
	// to the heap.
	userArenaChunkMaxAllocBytes = userArenaChunkBytes / 4
)

func init() {
	if userArenaChunkPages*pageSize != userArenaChunkBytes {
		throw("user arena chunk size is not a multiple of the page size")
	}
	if userArenaChunkBytes%physPageSize != 0 {
		throw("user arena chunk size is not a multiple of the physical page size")
	}
	if userArenaChunkBytes < heapArenaBytes {
		if heapArenaBytes%userArenaChunkBytes != 0 {
			throw("user arena chunk size is smaller than a heap arena, but doesn't divide it")
		}
	} else {
		if userArenaChunkBytes%heapArenaBytes != 0 {
			throw("user arena chunk size is larger than a heap arena, but not a multiple")
		}
	}
	lockInit(&userArenaState.lock, lockRankUserArenaState)
}

// userArenaChunkReserveBytes returns the amount of additional bytes to reserve for
// heap metadata.
func userArenaChunkReserveBytes() uintptr {
	// In the allocation headers experiment, we reserve the end of the chunk for
	// a pointer/scalar bitmap. We also reserve space for a dummy _type that
	// refers to the bitmap. The PtrBytes field of the dummy _type indicates how
	// many of those bits are valid.
	return userArenaChunkBytes/goarch.PtrSize/8 + unsafe.Sizeof(_type{})
}
type userArena struct {
	// fullList is a list of full chunks that don't have enough free memory left,
	// and that we'll free once this user arena is freed.
	//
	// Can't use mSpanList here because it's not-in-heap.
	fullList *mspan

	// active is the user arena chunk we're currently allocating into.
	active *mspan

	// refs is a set of references to the arena chunks so that they're kept alive.
	//
	// The last reference in the list always refers to active, while the rest of
	// them correspond to fullList. Specifically, the head of fullList is the
	// second-to-last one, fullList.next is the third-to-last, and so on.
	//
	// In other words, every time a new chunk becomes active, it's appended to this
	// list.
	refs []unsafe.Pointer

	// defunct is true if free has been called on this arena.
	//
	// This is just a best-effort way to discover a concurrent allocation
	// and free. Also used to detect a double-free.
	defunct atomic.Bool
}

// newUserArena creates a new userArena ready to be used.
func newUserArena() *userArena {
	a := new(userArena)
	SetFinalizer(a, func(a *userArena) {
		// If arena handle is dropped without being freed, then call
		// free on the arena, so the arena chunks are never reclaimed
		// by the garbage collector.
		a.free()
	})
	a.refill()
	return a
}

// new allocates a new object of the provided type into the arena, and returns
// its pointer.
//
// This operation is not safe to call concurrently with other operations on the
// same arena.
func (a *userArena) new(typ *_type) unsafe.Pointer {
	return a.alloc(typ, -1)
}

// slice allocates a new slice backing store. slice must be a pointer to a slice
// (i.e. *[]T), because userArenaSlice will update the slice directly.
//
// cap determines the capacity of the slice backing store and must be non-negative.
//
// This operation is not safe to call concurrently with other operations on the
// same arena.
func (a *userArena) slice(sl any, cap int) {
	if cap < 0 {
		panic("userArena.slice: negative cap")
	}
	i := efaceOf(&sl)
	typ := i._type
	if typ.Kind_&abi.KindMask != abi.Pointer {
		panic("slice result of non-ptr type")
	}
	typ = (*ptrtype)(unsafe.Pointer(typ)).Elem
	if typ.Kind_&abi.KindMask != abi.Slice {
		panic("slice of non-ptr-to-slice type")
	}
	typ = (*slicetype)(unsafe.Pointer(typ)).Elem
	// typ is now the element type of the slice we want to allocate.
	*((*slice)(i.data)) = slice{a.alloc(typ, cap), cap, cap}
}

// free returns the userArena's chunks back to mheap and marks it as defunct.
//
// Must be called at most once for any given arena.
//
// This operation is not safe to call concurrently with other operations on the
// same arena.
func (a *userArena) free() {
	// Check for a double-free.
	if a.defunct.Load() {
		panic("arena double free")
	}

	// Mark ourselves as defunct.
	a.defunct.Store(true)
	SetFinalizer(a, nil)

	// Free all the full arenas.
	//
	// The refs on this list are in reverse order from the second-to-last.
	s := a.fullList
	i := len(a.refs) - 2
	for s != nil {
		a.fullList = s.next
		s.next = nil
		freeUserArenaChunk(s, a.refs[i])
		s = a.fullList
		i--
	}
	if a.fullList != nil || i >= 0 {
		// There's still something left on the full list, or we
		// failed to actually iterate over the entire refs list.
		throw("full list doesn't match refs list in length")
	}

	// Put the active chunk onto the reuse list.
	//
	// Note that active's reference is always the last reference in refs.
	s = a.active
	if s != nil {
		if raceenabled || msanenabled || asanenabled {
			// Don't reuse arenas with sanitizers enabled. We want to catch
			// any use-after-free errors aggressively.
			freeUserArenaChunk(s, a.refs[len(a.refs)-1])
		} else {
			lock(&userArenaState.lock)
			userArenaState.reuse = append(userArenaState.reuse, liveUserArenaChunk{s, a.refs[len(a.refs)-1]})
			unlock(&userArenaState.lock)
		}
	}
	// nil out a.active so that a race with freeing will more likely cause a crash.
	a.active = nil
	a.refs = nil
}
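// Worked example of the bookkeeping invariant above (illustrative): suppose an
// arena has filled two chunks and is allocating into a third. Then:
//
//	refs     = [c1, c2, c3]   // appended in the order the chunks became active
//	active   = c3             // refs[len(refs)-1]
//	fullList = c2 -> c1       // most recently filled chunk at the head
//
// free walks fullList from its head while walking refs backwards from index
// len(refs)-2, so each full chunk is freed together with the reference that
// keeps it alive: (c2, refs[1]), then (c1, refs[0]). The active chunk c3 is
// paired with refs[2] and goes onto the reuse list instead.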
// alloc reserves space in the current chunk or calls refill and reserves space
// in a new chunk. If cap is negative, the type will be taken literally, otherwise
// it will be considered as an element type for a slice backing store with capacity
// cap.
func (a *userArena) alloc(typ *_type, cap int) unsafe.Pointer {
	s := a.active
	var x unsafe.Pointer
	for {
		x = s.userArenaNextFree(typ, cap)
		if x != nil {
			break
		}
		s = a.refill()
	}
	return x
}

// refill inserts the current arena chunk onto the full list and obtains a new
// one, either from the partial list or allocating a new one, both from mheap.
func (a *userArena) refill() *mspan {
	// If there's an active chunk, assume it's full.
	s := a.active
	if s != nil {
		if s.userArenaChunkFree.size() > userArenaChunkMaxAllocBytes {
			// It's difficult to tell when we're actually out of memory
			// in a chunk because the allocation that failed may still leave
			// some free space available. However, that amount of free space
			// should never exceed the maximum allocation size.
			throw("wasted too much memory in an arena chunk")
		}
		s.next = a.fullList
		a.fullList = s
		a.active = nil
		s = nil
	}

	var x unsafe.Pointer

	// Check the partially-used list.
	lock(&userArenaState.lock)
	if len(userArenaState.reuse) > 0 {
		// Pick off the last arena chunk from the list.
		n := len(userArenaState.reuse) - 1
		x = userArenaState.reuse[n].x
		s = userArenaState.reuse[n].mspan
		userArenaState.reuse[n].x = nil
		userArenaState.reuse[n].mspan = nil
		userArenaState.reuse = userArenaState.reuse[:n]
	}
	unlock(&userArenaState.lock)
	if s == nil {
		// Allocate a new one.
		x, s = newUserArenaChunk()
		if s == nil {
			throw("out of memory")
		}
	}
	a.refs = append(a.refs, x)
	a.active = s
	return s
}

type liveUserArenaChunk struct {
	*mspan // Must represent a user arena chunk.

	// Reference to mspan.base() to keep the chunk alive.
	x unsafe.Pointer
}

var userArenaState struct {
	lock mutex

	// reuse contains a list of partially-used and already-live
	// user arena chunks that can be quickly reused for another
	// arena.
	//
	// Protected by lock.
	reuse []liveUserArenaChunk

	// fault contains full user arena chunks that need to be faulted.
	//
	// Protected by lock.
	fault []liveUserArenaChunk
}
// userArenaNextFree reserves space in the user arena for an item of the specified
// type. If cap is not -1, this is for an array of cap elements of type t.
func (s *mspan) userArenaNextFree(typ *_type, cap int) unsafe.Pointer {
	size := typ.Size_
	if cap > 0 {
		if size > ^uintptr(0)/uintptr(cap) {
			// Overflow.
			throw("out of memory")
		}
		size *= uintptr(cap)
	}
	if size == 0 || cap == 0 {
		return unsafe.Pointer(&zerobase)
	}
	if size > userArenaChunkMaxAllocBytes {
		// Redirect allocations that don't fit into a chunk well directly
		// from the heap.
		if cap >= 0 {
			return newarray(typ, cap)
		}
		return newobject(typ)
	}

	// Prevent preemption as we set up the space for a new object.
	//
	// Act like we're allocating.
	mp := acquirem()
	if mp.mallocing != 0 {
		throw("malloc deadlock")
	}
	if mp.gsignal == getg() {
		throw("malloc during signal")
	}
	mp.mallocing = 1

	var ptr unsafe.Pointer
	if !typ.Pointers() {
		// Allocate pointer-less objects from the tail end of the chunk.
		v, ok := s.userArenaChunkFree.takeFromBack(size, typ.Align_)
		if ok {
			ptr = unsafe.Pointer(v)
		}
	} else {
		v, ok := s.userArenaChunkFree.takeFromFront(size, typ.Align_)
		if ok {
			ptr = unsafe.Pointer(v)
		}
	}
	if ptr == nil {
		// Failed to allocate.
		mp.mallocing = 0
		releasem(mp)
		return nil
	}
	if s.needzero != 0 {
		throw("arena chunk needs zeroing, but should already be zeroed")
	}
	// Set up heap bitmap and do extra accounting.
	if typ.Pointers() {
		if cap >= 0 {
			userArenaHeapBitsSetSliceType(typ, cap, ptr, s)
		} else {
			userArenaHeapBitsSetType(typ, ptr, s)
		}
		c := getMCache(mp)
		if c == nil {
			throw("mallocgc called without a P or outside bootstrapping")
		}
		if cap > 0 {
			c.scanAlloc += size - (typ.Size_ - typ.PtrBytes)
		} else {
			c.scanAlloc += typ.PtrBytes
		}
	}

	// Ensure that the stores above that initialize x to
	// type-safe memory and set the heap bits occur before
	// the caller can make ptr observable to the garbage
	// collector. Otherwise, on weakly ordered machines,
	// the garbage collector could follow a pointer to x,
	// but see uninitialized memory or stale heap bits.
	publicationBarrier()

	mp.mallocing = 0
	releasem(mp)

	return ptr
}
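// Worked example of the dual-ended bump allocation above (illustrative
// addresses): for a chunk whose remaining free range is [0x100000, 0x180000),
// allocating a 64-byte struct containing pointers takes [0x100000, 0x100040)
// from the front, while a 4 KiB pointer-free byte slice backing store takes
// [0x17f000, 0x180000) from the back. The free range shrinks from both ends
// toward the middle, so pointer words stay in a dense prefix of the chunk:
// the GC can stop scanning once it passes them, and bitmap writes can be
// skipped entirely for the pointer-free tail.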
// userArenaHeapBitsSetSliceType is the equivalent of heapBitsSetType but for
// Go slice backing store values allocated in a user arena chunk. It sets up the
// heap bitmap for n consecutive values with type typ allocated at address ptr.
func userArenaHeapBitsSetSliceType(typ *_type, n int, ptr unsafe.Pointer, s *mspan) {
	mem, overflow := math.MulUintptr(typ.Size_, uintptr(n))
	if overflow || n < 0 || mem > maxAlloc {
		panic(plainError("runtime: allocation size out of range"))
	}
	for i := 0; i < n; i++ {
		userArenaHeapBitsSetType(typ, add(ptr, uintptr(i)*typ.Size_), s)
	}
}

// userArenaHeapBitsSetType is the equivalent of heapSetType but for
// non-slice-backing-store Go values allocated in a user arena chunk. It
// sets up the type metadata for the value with type typ allocated at address ptr.
// base is the base address of the arena chunk.
func userArenaHeapBitsSetType(typ *_type, ptr unsafe.Pointer, s *mspan) {
	base := s.base()
	h := s.writeUserArenaHeapBits(uintptr(ptr))

	p := getGCMask(typ) // start of 1-bit pointer mask

	nb := typ.PtrBytes / goarch.PtrSize
	for i := uintptr(0); i < nb; i += ptrBits {
		k := nb - i
		if k > ptrBits {
			k = ptrBits
		}
		// N.B. On big endian platforms we byte swap the data that we
		// read from GCData, which is always stored in little-endian order
		// by the compiler. writeUserArenaHeapBits handles data in
		// a platform-ordered way for efficiency, but stores back the
		// data in little endian order, since we expose the bitmap through
		// a dummy type.
		h = h.write(s, readUintptr(addb(p, i/8)), k)
	}
	// Note: we call pad here to ensure we emit explicit 0 bits
	// for the pointerless tail of the object. This ensures that
	// there's only a single noMorePtrs mark for the next object
	// to clear. We don't need to do this to clear stale noMorePtrs
	// markers from previous uses because arena chunk pointer bitmaps
	// are always fully cleared when reused.
	h = h.pad(s, typ.Size_-typ.PtrBytes)
	h.flush(s, uintptr(ptr), typ.Size_)

	// Update the PtrBytes value in the type information. After this
	// point, the GC will observe the new bitmap.
	s.largeType.PtrBytes = uintptr(ptr) - base + typ.PtrBytes

	// Double-check that the bitmap was written out correctly.
	const doubleCheck = false
	if doubleCheck {
		doubleCheckHeapPointersInterior(uintptr(ptr), uintptr(ptr), typ.Size_, typ.Size_, typ, &s.largeType, s)
	}
}

type writeUserArenaHeapBits struct {
	offset uintptr // offset in span that the low bit of mask represents the pointer state of.
	mask   uintptr // some pointer bits starting at the address addr.
	valid  uintptr // number of bits in buf that are valid (including low)
	low    uintptr // number of low-order bits to not overwrite
}

func (s *mspan) writeUserArenaHeapBits(addr uintptr) (h writeUserArenaHeapBits) {
	offset := addr - s.base()

	// We start writing bits maybe in the middle of a heap bitmap word.
	// Remember how many bits into the word we started, so we can be sure
	// not to overwrite the previous bits.
	h.low = offset / goarch.PtrSize % ptrBits

	// Round down to the heap word that starts the bitmap word.
	h.offset = offset - h.low*goarch.PtrSize

	// We don't have any bits yet.
	h.mask = 0
	h.valid = h.low

	return
}

// write appends the pointerness of the next valid pointer slots
// using the low valid bits of bits. 1=pointer, 0=scalar.
func (h writeUserArenaHeapBits) write(s *mspan, bits, valid uintptr) writeUserArenaHeapBits {
	if h.valid+valid <= ptrBits {
		// Fast path - just accumulate the bits.
		h.mask |= bits << h.valid
		h.valid += valid
		return h
	}
	// Too many bits to fit in this word. Write the current word
	// out and move on to the next word.

	data := h.mask | bits<<h.valid       // mask for this word
	h.mask = bits >> (ptrBits - h.valid) // leftover for next word
	h.valid += valid - ptrBits           // have h.valid+valid bits, writing ptrBits of them

	// Flush mask to the memory bitmap.
	idx := h.offset / (ptrBits * goarch.PtrSize)
	m := uintptr(1)<<h.low - 1
	bitmap := s.heapBits()
	bitmap[idx] = bswapIfBigEndian(bswapIfBigEndian(bitmap[idx])&m | data)
	// Note: no synchronization required for this write because
	// the allocator has exclusive access to the page, and the bitmap
	// entries are all for a single page. Also, visibility of these
	// writes is guaranteed by the publication barrier in mallocgc.

	// Move to next word of bitmap.
	h.offset += ptrBits * goarch.PtrSize
	h.low = 0
	return h
}

// pad adds padding of size bytes.
func (h writeUserArenaHeapBits) pad(s *mspan, size uintptr) writeUserArenaHeapBits {
	if size == 0 {
		return h
	}
	words := size / goarch.PtrSize
	for words > ptrBits {
		h = h.write(s, 0, ptrBits)
		words -= ptrBits
	}
	return h.write(s, 0, words)
}

// flush writes out the bits that have been written, and adds zeros as needed
// to cover the full object [addr, addr+size).
func (h writeUserArenaHeapBits) flush(s *mspan, addr, size uintptr) {
	offset := addr - s.base()

	// zeros counts the number of bits needed to represent the object minus the
	// number of bits we've already written. This is the number of 0 bits
	// that need to be added.
	zeros := (offset+size-h.offset)/goarch.PtrSize - h.valid

	// Add zero bits up to the bitmap word boundary.
	if zeros > 0 {
		z := ptrBits - h.valid
		if z > zeros {
			z = zeros
		}
		h.valid += z
		zeros -= z
	}

	// Find the word in the bitmap that we're going to write.
	bitmap := s.heapBits()
	idx := h.offset / (ptrBits * goarch.PtrSize)

	// Write remaining bits.
	if h.valid != h.low {
		m := uintptr(1)<<h.low - 1      // don't clear existing bits below "low"
		m |= ^(uintptr(1)<<h.valid - 1) // don't clear existing bits above "valid"
		bitmap[idx] = bswapIfBigEndian(bswapIfBigEndian(bitmap[idx])&m | h.mask)
	}
	if zeros == 0 {
		return
	}

	// Advance to next bitmap word.
	h.offset += ptrBits * goarch.PtrSize

	// Continue on writing zeros for the rest of the object.
	// For standard use of the ptr bits this is not required, as
	// the bits are read from the beginning of the object. Some uses,
	// like noscan spans, oblets, bulk write barriers, and cgocheck, might
	// start mid-object, so these writes are still required.
	for {
		// Write zero bits.
		idx := h.offset / (ptrBits * goarch.PtrSize)
		if zeros < ptrBits {
			bitmap[idx] = bswapIfBigEndian(bswapIfBigEndian(bitmap[idx]) &^ (uintptr(1)<<zeros - 1))
			break
		} else if zeros == ptrBits {
			bitmap[idx] = 0
			break
		} else {
			bitmap[idx] = 0
			zeros -= ptrBits
		}
		h.offset += ptrBits * goarch.PtrSize
	}
}

// bswapIfBigEndian swaps the byte order of the uintptr on goarch.BigEndian platforms,
// and leaves it alone elsewhere.
func bswapIfBigEndian(x uintptr) uintptr {
	if goarch.BigEndian {
		if goarch.PtrSize == 8 {
			return uintptr(sys.Bswap64(uint64(x)))
		}
		return uintptr(sys.Bswap32(uint32(x)))
	}
	return x
}
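// Worked example for the bitmap writer above (illustrative; 64-bit platform,
// ptrBits = 64). Take a hypothetical value of type struct{ p *int; n uintptr; q *int }:
// Size_ = 24 bytes (3 words), PtrBytes = 24, 1-bit pointer mask = 0b101.
// If it is allocated at the very start of a chunk, writeUserArenaHeapBits
// yields h.low = 0 and h.offset = 0, and then:
//
//	h = h.write(s, 0b101, 3) // fast path: h.mask = 0b101, h.valid = 3
//	h = h.pad(s, 0)          // Size_-PtrBytes == 0, so nothing to pad
//	h.flush(s, ptr, 24)      // stores bits 0..2 of the first bitmap word as 1,0,1
//
// A later value at a higher offset starts with h.low > 0, and the mask m in
// flush ensures those lower, previously written bits are preserved.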
// newUserArenaChunk allocates a user arena chunk, which maps to a single
// heap arena and single span. Returns a pointer to the base of the chunk
// (this is really important: we need to keep the chunk alive) and the span.
func newUserArenaChunk() (unsafe.Pointer, *mspan) {
	if gcphase == _GCmarktermination {
		throw("newUserArenaChunk called with gcphase == _GCmarktermination")
	}

	// Deduct assist credit. Because user arena chunks are modeled as one
	// giant heap object which counts toward heapLive, we're obligated to
	// assist the GC proportionally (and it's worth noting that the arena
	// does represent additional work for the GC, but we also have no idea
	// what that looks like until we actually allocate things into the
	// arena).
	deductAssistCredit(userArenaChunkBytes)

	// Set mp.mallocing to keep from being preempted by GC.
	mp := acquirem()
	if mp.mallocing != 0 {
		throw("malloc deadlock")
	}
	if mp.gsignal == getg() {
		throw("malloc during signal")
	}
	mp.mallocing = 1

	// Allocate a new user arena.
	var span *mspan
	systemstack(func() {
		span = mheap_.allocUserArenaChunk()
	})
	if span == nil {
		throw("out of memory")
	}
	x := unsafe.Pointer(span.base())

	// Allocate black during GC.
	// All slots hold nil so no scanning is needed.
	// This may be racing with GC so do it atomically if there can be
	// a race marking the bit.
	if gcphase != _GCoff {
		gcmarknewobject(span, span.base())
	}

	if raceenabled {
		// TODO(mknyszek): Track individual objects.
		racemalloc(unsafe.Pointer(span.base()), span.elemsize)
	}

	if msanenabled {
		// TODO(mknyszek): Track individual objects.
		msanmalloc(unsafe.Pointer(span.base()), span.elemsize)
	}

	if asanenabled {
		// TODO(mknyszek): Track individual objects.

		// N.B. span.elemsize includes a redzone already.
		rzStart := span.base() + span.elemsize
		asanpoison(unsafe.Pointer(rzStart), span.limit-rzStart)
		asanunpoison(unsafe.Pointer(span.base()), span.elemsize)
	}

	if rate := MemProfileRate; rate > 0 {
		c := getMCache(mp)
		if c == nil {
			throw("newUserArenaChunk called without a P or outside bootstrapping")
		}
		// Note cache c only valid while m acquired; see #47302
		if rate != 1 && int64(userArenaChunkBytes) < c.nextSample {
			c.nextSample -= int64(userArenaChunkBytes)
		} else {
			profilealloc(mp, unsafe.Pointer(span.base()), userArenaChunkBytes)
		}
	}
	mp.mallocing = 0
	releasem(mp)

	// Again, because this chunk counts toward heapLive, potentially trigger a GC.
	if t := (gcTrigger{kind: gcTriggerHeap}); t.test() {
		gcStart(t)
	}

	if debug.malloc {
		if inittrace.active && inittrace.id == getg().goid {
			// Init functions are executed sequentially in a single goroutine.
			inittrace.bytes += uint64(userArenaChunkBytes)
		}
	}

	// Double-check it's aligned to the physical page size. Based on the current
	// implementation this is trivially true, but it need not be in the future.
	// However, if it's not aligned to the physical page size then we can't properly
	// set it to fault later.
	if uintptr(x)%physPageSize != 0 {
		throw("user arena chunk is not aligned to the physical page size")
	}

	return x, span
}

// isUnusedUserArenaChunk indicates that the arena chunk has been set to fault
// and doesn't contain any scannable memory anymore. However, it might still be
// mSpanInUse as it sits on the quarantine list, since it needs to be swept.
//
// This is not safe to execute unless the caller has ownership of the mspan or
// the world is stopped (preemption is prevented while the relevant state changes).
//
// This is really only meant to be used by accounting tests in the runtime to
// distinguish when a span shouldn't be counted (since mSpanInUse might not be
// enough).
func (s *mspan) isUnusedUserArenaChunk() bool {
	return s.isUserArenaChunk && s.spanclass == makeSpanClass(0, true)
}
// setUserArenaChunkToFault sets the address space for the user arena chunk to fault
// and releases any underlying memory resources.
//
// Must be in a non-preemptible state to ensure the consistency of statistics
// exported to MemStats.
func (s *mspan) setUserArenaChunkToFault() {
	if !s.isUserArenaChunk {
		throw("invalid span in heapArena for user arena")
	}
	if s.npages*pageSize != userArenaChunkBytes {
		throw("span on userArena.faultList has invalid size")
	}

	// Update the span class to be noscan. What we want to happen is that
	// any pointer into the span keeps it from getting recycled, so we want
	// the mark bit to get set, but we're about to set the address space to fault,
	// so we have to prevent the GC from scanning this memory.
	//
	// It's OK to set it here because (1) a GC isn't in progress, so the scanning code
	// won't make a bad decision, (2) we're currently non-preemptible and in the runtime,
	// so a GC is blocked from starting. We might race with sweeping, which could
	// put it on the "wrong" sweep list, but really don't care because the chunk is
	// treated as a large object span and there's no meaningful difference between scan
	// and noscan large objects in the sweeper. The STW at the start of the GC acts as a
	// barrier for this update.
	s.spanclass = makeSpanClass(0, true)

	// Actually set the arena chunk to fault, so we'll get dangling pointer errors.
	// sysFault currently uses a method on each OS that forces it to evacuate all
	// memory backing the chunk.
	sysFault(unsafe.Pointer(s.base()), s.npages*pageSize)

	// Everything on the list is counted as in-use, however sysFault transitions to
	// Reserved, not Prepared, so we skip updating heapFree or heapReleased and just
	// remove the memory from the total altogether; it's just address space now.
	gcController.heapInUse.add(-int64(s.npages * pageSize))

	// Count this as a free of an object right now as opposed to when
	// the span gets off the quarantine list. The main reason is so that the
	// amount of bytes allocated doesn't exceed how much is counted as
	// "mapped ready," which could cause a deadlock in the pacer.
	gcController.totalFree.Add(int64(s.elemsize))

	// Update consistent stats to match.
	//
	// We're non-preemptible, so it's safe to update consistent stats (our P
	// won't change out from under us).
	stats := memstats.heapStats.acquire()
	atomic.Xaddint64(&stats.committed, -int64(s.npages*pageSize))
	atomic.Xaddint64(&stats.inHeap, -int64(s.npages*pageSize))
	atomic.Xadd64(&stats.largeFreeCount, 1)
	atomic.Xadd64(&stats.largeFree, int64(s.elemsize))
	memstats.heapStats.release()

	// This counts as a free, so update heapLive.
	gcController.update(-int64(s.elemsize), 0)

	// Mark it as free for the race detector.
	if raceenabled {
		racefree(unsafe.Pointer(s.base()), s.elemsize)
	}

	systemstack(func() {
		// Add the user arena to the quarantine list.
		lock(&mheap_.lock)
		mheap_.userArena.quarantineList.insert(s)
		unlock(&mheap_.lock)
	})
}

// inUserArenaChunk returns true if p points to a user arena chunk.
func inUserArenaChunk(p uintptr) bool {
	s := spanOf(p)
	if s == nil {
		return false
	}
	return s.isUserArenaChunk
}
// freeUserArenaChunk releases the user arena represented by s back to the runtime.
//
// x must be a live pointer within s.
//
// The runtime will set the user arena to fault once it's safe (the GC is no longer running)
// and then once the user arena is no longer referenced by the application, will allow it to
// be reused.
func freeUserArenaChunk(s *mspan, x unsafe.Pointer) {
	if !s.isUserArenaChunk {
		throw("span is not for a user arena")
	}
	if s.npages*pageSize != userArenaChunkBytes {
		throw("invalid user arena span size")
	}

	// Mark the region as free to various sanitizers immediately instead
	// of handling them at sweep time.
	if raceenabled {
		racefree(unsafe.Pointer(s.base()), s.elemsize)
	}
	if msanenabled {
		msanfree(unsafe.Pointer(s.base()), s.elemsize)
	}
	if asanenabled {
		asanpoison(unsafe.Pointer(s.base()), s.elemsize)
	}
	if valgrindenabled {
		valgrindFree(unsafe.Pointer(s.base()))
	}

	// Make ourselves non-preemptible as we manipulate state and statistics.
	//
	// Also required by setUserArenaChunkToFault.
	mp := acquirem()

	// We can only set user arenas to fault if we're in the _GCoff phase.
	if gcphase == _GCoff {
		lock(&userArenaState.lock)
		faultList := userArenaState.fault
		userArenaState.fault = nil
		unlock(&userArenaState.lock)

		s.setUserArenaChunkToFault()
		for _, lc := range faultList {
			lc.mspan.setUserArenaChunkToFault()
		}

		// Until the chunks are set to fault, keep them alive via the fault list.
		KeepAlive(x)
		KeepAlive(faultList)
	} else {
		// Put the user arena on the fault list.
		lock(&userArenaState.lock)
		userArenaState.fault = append(userArenaState.fault, liveUserArenaChunk{s, x})
		unlock(&userArenaState.lock)
	}
	releasem(mp)
}
// allocUserArenaChunk attempts to reuse a free user arena chunk represented
// as a span.
//
// Must be in a non-preemptible state to ensure the consistency of statistics
// exported to MemStats.
//
// Acquires the heap lock. Must run on the system stack for that reason.
//
//go:systemstack
func (h *mheap) allocUserArenaChunk() *mspan {
	var s *mspan
	var base uintptr

	// First check the free list.
	lock(&h.lock)
	if !h.userArena.readyList.isEmpty() {
		s = h.userArena.readyList.first
		h.userArena.readyList.remove(s)
		base = s.base()
	} else {
		// Free list was empty, so allocate a new arena.
		hintList := &h.userArena.arenaHints
		if raceenabled {
			// In race mode just use the regular heap hints. We might fragment
			// the address space, but the race detector requires that the heap
			// is mapped contiguously.
			hintList = &h.arenaHints
		}
		v, size := h.sysAlloc(userArenaChunkBytes, hintList, &mheap_.userArenaArenas)
		if size%userArenaChunkBytes != 0 {
			throw("sysAlloc size is not divisible by userArenaChunkBytes")
		}
		if size > userArenaChunkBytes {
			// We got more than we asked for. This can happen if
			// heapArenaSize > userArenaChunkSize, or if sysAlloc just returns
			// some extra as a result of trying to find an aligned region.
			//
			// Divide it up and put it on the ready list.
			for i := userArenaChunkBytes; i < size; i += userArenaChunkBytes {
				s := h.allocMSpanLocked()
				s.init(uintptr(v)+i, userArenaChunkPages)
				h.userArena.readyList.insertBack(s)
			}
			size = userArenaChunkBytes
		}
		base = uintptr(v)
		if base == 0 {
			// Out of memory.
			unlock(&h.lock)
			return nil
		}
		s = h.allocMSpanLocked()
	}
	unlock(&h.lock)

	// sysAlloc returns Reserved address space, and any span we're
	// reusing is set to fault (so, also Reserved), so transition
	// it to Prepared and then Ready.
	//
	// Unlike (*mheap).grow, just map in everything that we
	// asked for. We're likely going to use it all.
	sysMap(unsafe.Pointer(base), userArenaChunkBytes, &gcController.heapReleased, "user arena chunk")
	sysUsed(unsafe.Pointer(base), userArenaChunkBytes, userArenaChunkBytes)

	// Model the user arena as a heap span for a large object.
	spc := makeSpanClass(0, false)
	h.initSpan(s, spanAllocHeap, spc, base, userArenaChunkPages)
	s.isUserArenaChunk = true
	s.elemsize -= userArenaChunkReserveBytes()
	s.freeindex = 1
	s.allocCount = 1

	// Adjust s.limit down to the object-containing part of the span.
	//
	// This is just to create a slightly tighter bound on the limit.
	// It's totally OK if the garbage collector, in particular
	// conservative scanning, can temporarily observe an inflated
	// limit. It will simply mark the whole chunk or just skip it
	// since we're in the mark phase anyway.
	s.limit = s.base() + s.elemsize

	// Adjust size to include redzone.
	if asanenabled {
		s.elemsize -= redZoneSize(s.elemsize)
	}

	// Account for this new arena chunk memory.
	gcController.heapInUse.add(int64(userArenaChunkBytes))
	gcController.heapReleased.add(-int64(userArenaChunkBytes))

	stats := memstats.heapStats.acquire()
	atomic.Xaddint64(&stats.inHeap, int64(userArenaChunkBytes))
	atomic.Xaddint64(&stats.committed, int64(userArenaChunkBytes))

	// Model the arena as a single large malloc.
	atomic.Xadd64(&stats.largeAlloc, int64(s.elemsize))
	atomic.Xadd64(&stats.largeAllocCount, 1)
	memstats.heapStats.release()

	// Count the alloc in inconsistent, internal stats.
	gcController.totalAlloc.Add(int64(s.elemsize))

	// Update heapLive.
	gcController.update(int64(s.elemsize), 0)

	// This must clear the entire heap bitmap so that it's safe
	// to allocate noscan data without writing anything out.
	s.initHeapBits()

	// Clear the span preemptively. It's an arena chunk, so let's assume
	// everything is going to be used.
	//
	// This also seems to make a massive difference as to whether or
	// not Linux decides to back this memory with transparent huge
	// pages. There's latency involved in this zeroing, but the hugepage
	// gains are almost always worth it. Note: it's important that we
	// clear even if it's freshly mapped and we know there's no point
	// to zeroing as *that* is the critical signal to use huge pages.
	memclrNoHeapPointers(unsafe.Pointer(s.base()), s.elemsize)
	s.needzero = 0

	s.freeIndexForScan = 1

	// Set up the range for allocation.
	s.userArenaChunkFree = makeAddrRange(base, base+s.elemsize)

	// Put the large span in the mcentral swept list so that it's
	// visible to the background sweeper.
	h.central[spc].mcentral.fullSwept(h.sweepgen).push(s)

	// Set up an allocation header. Avoid write barriers here because this type
	// is not a real type, and it exists in an invalid location.
	*(*uintptr)(unsafe.Pointer(&s.largeType)) = uintptr(unsafe.Pointer(s.limit))
	*(*uintptr)(unsafe.Pointer(&s.largeType.GCData)) = s.limit + unsafe.Sizeof(_type{})
	s.largeType.PtrBytes = 0
	s.largeType.Size_ = s.elemsize

	return s
}
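// End-to-end usage sketch (illustrative): with the user arena experiment
// enabled (GOEXPERIMENT=arenas), application code exercises the lifecycle
// described at the top of this file through the arena package rather than
// these runtime internals. MyStruct is a placeholder type, and the arena
// package API shown here is experimental and may change or be unavailable
// in a given release. Roughly:
//
//	a := arena.NewArena()                // newUserArena + initial refill
//	p := arena.New[MyStruct](a)          // (*userArena).new via arena_arena_New
//	s := arena.MakeSlice[int](a, 0, 128) // (*userArena).slice via arena_arena_Slice
//	h := arena.Clone(p)                  // arena_heapify: copy back to the regular heap
//	a.Free()                             // (*userArena).free; chunks are quarantined or reused
//	_, _ = s, h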