// Copyright 2023 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

//go:build goexperiment.allocheaders

// Garbage collector: type and heap bitmaps.
//
// Stack, data, and bss bitmaps
//
// Stack frames and global variables in the data and bss sections are
// described by bitmaps with 1 bit per pointer-sized word. A "1" bit
// means the word is a live pointer to be visited by the GC (referred to
// as "pointer"). A "0" bit means the word should be ignored by GC
// (referred to as "scalar", though it could be a dead pointer value).
//
// Heap bitmaps
//
// The heap bitmap comprises 1 bit for each pointer-sized word in the heap,
// recording whether a pointer is stored in that word or not. This bitmap
// is stored at the end of a span for small objects and is unrolled at
// runtime from type metadata for all larger objects. Objects without
// pointers have neither a bitmap nor associated type metadata.
//
// Bits in all cases correspond to words in little-endian order.
//
// For small objects, if s is the mspan for the span starting at "start",
// then s.heapBits() returns a slice containing the bitmap for the whole span.
// That is, s.heapBits()[0] holds the goarch.PtrSize*8 bits for the first
// goarch.PtrSize*8 words from "start" through "start+63*ptrSize" in the span.
// On a related note, small objects are always small enough that their bitmap
// fits in goarch.PtrSize*8 bits, so writing out bitmap data takes two bitmap
// writes at most (because object boundaries don't generally lie on
// s.heapBits()[i] boundaries).
//
// For larger objects, if t is the type for the object starting at "start",
// within some span whose mspan is s, then the bitmap at t.GCData is "tiled"
// from "start" through "start+s.elemsize".
// Specifically, the first bit of t.GCData corresponds to the word at "start",
// the second to the word after "start", and so on up to t.PtrBytes. At t.PtrBytes,
// we skip to "start+t.Size_" and begin again from there. This process is
// repeated until we hit "start+s.elemsize".
// This tiling algorithm supports array data, since the type always refers to
// the element type of the array. Single objects are considered the same as
// single-element arrays.
// The tiling algorithm may scan data past the end of the compiler-recognized
// object, but any unused data within the allocation slot (i.e. within s.elemsize)
// is zeroed, so the GC just observes nil pointers.
// Note that this "tiled" bitmap isn't stored anywhere; it is generated on-the-fly.
//
// For objects without their own span, the type metadata is stored in the first
// word before the object at the beginning of the allocation slot. For objects
// with their own span, the type metadata is stored in the mspan.
//
// The bitmap for small unallocated objects in scannable spans is not maintained
// (can be junk).

package runtime

import (
	"internal/abi"
	"internal/goarch"
	"runtime/internal/sys"
	"unsafe"
)

const (
	// A malloc header is functionally a single type pointer, but
	// we need to use 8 here to ensure 8-byte alignment of allocations
	// on 32-bit platforms. It's wasteful, but a lot of code relies on
	// 8-byte alignment for 8-byte atomics.
	mallocHeaderSize = 8

	// The minimum object size that has a malloc header, exclusive.
	//
	// This value controls overheads from the malloc header.
	// The minimum size is bound by writeHeapBitsSmall, which assumes that the
	// pointer bitmap for objects of a size smaller than this doesn't cross
	// more than one pointer-word boundary. This sets an upper bound on this
	// value at the number of bits in a uintptr, multiplied by the pointer
	// size in bytes.
	//
	// We choose a value here that has a natural cutover point in terms of memory
	// overheads. This value just happens to be the maximum possible value this
	// can be.
	//
	// A span with heap bits in it will have 128 bytes of heap bits on 64-bit
	// platforms, and 256 bytes of heap bits on 32-bit platforms. The first size
	// class where malloc headers match this overhead for 64-bit platforms is
	// 512 bytes (8 KiB / 512 bytes * 8 bytes-per-header = 128 bytes of overhead).
	// On 32-bit platforms, this same point is the 256 byte size class
	// (8 KiB / 256 bytes * 8 bytes-per-header = 256 bytes of overhead).
	//
	// Guaranteed to be exactly at a size class boundary. The reason this value is
	// an exclusive minimum is subtle. Suppose we're allocating a 504-byte object
	// and it's rounded up to 512 bytes for the size class. If minSizeForMallocHeader
	// is 512 and an inclusive minimum, then comparing the two values against
	// minSizeForMallocHeader would produce different results. In other words, the
	// comparison would not be invariant to size-class rounding. Eschewing this
	// property means a more complex check or possibly storing additional state to
	// determine whether a span has malloc headers.
	minSizeForMallocHeader = goarch.PtrSize * ptrBits
)
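// For example, on 64-bit platforms goarch.PtrSize*ptrBits = 8*64 = 512, so:
//
//   - A 496-byte scannable allocation rounds up to the 512-byte size class.
//     Since 512 <= minSizeForMallocHeader, it has no malloc header; its pointer
//     bits live at the end of its span, where an 8 KiB span's bitmap costs
//     (8192/8 words) * 1 bit = 128 bytes, shared by every object in the span.
//   - A 520-byte scannable allocation exceeds minSizeForMallocHeader, so an
//     8-byte malloc header (a type pointer) occupies the first word of its
//     allocation slot and its pointer bitmap is produced on the fly from that
//     type's GCData.
//
// The sizes 496 and 520 are arbitrary values chosen for this illustration.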
// heapBitsInSpan returns true if the size of an object implies its ptr/scalar
// data is stored at the end of the span, and is accessible via span.heapBits.
//
// Note: this works for both rounded-up sizes (span.elemsize) and unrounded
// type sizes because minSizeForMallocHeader is guaranteed to be at a size
// class boundary.
//
//go:nosplit
func heapBitsInSpan(userSize uintptr) bool {
	// N.B. minSizeForMallocHeader is an exclusive minimum so that this function is
	// invariant under size-class rounding on its input.
	return userSize <= minSizeForMallocHeader
}

// heapArenaPtrScalar contains the per-heapArena pointer/scalar metadata for the GC.
type heapArenaPtrScalar struct {
	// N.B. This is no longer necessary with allocation headers.
}

// typePointers is an iterator over the pointers in a heap object.
//
// Iteration through this type implements the tiling algorithm described at the
// top of this file.
type typePointers struct {
	// elem is the address of the current array element of type typ being iterated over.
	// Objects that are not arrays are treated as single-element arrays, in which case
	// this value does not change.
	elem uintptr

	// addr is the address the iterator is currently working from and describes
	// the address of the first word referenced by mask.
	addr uintptr

	// mask is a bitmask where each bit corresponds to pointer-words after addr.
	// Bit 0 is the pointer-word at addr, Bit 1 is the next word, and so on.
	// If a bit is 1, then there is a pointer at that word.
	// nextFast and next mask out bits in this mask as their pointers are processed.
	mask uintptr

	// typ is a pointer to the type information for the heap object's type.
	// This may be nil if the object is in a span where heapBitsInSpan(span.elemsize) is true.
	typ *_type
}
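// tiledPointerOffsets is an illustrative sketch of the tiling algorithm
// described at the top of this file; it is not used by the runtime, and the
// function and parameter names are invented for the example. Given a type's
// pointer bitmap (ptrMask, one bit per pointer-sized word in little-endian
// bit order), the type's size and PtrBytes, and the allocation slot size
// (elemsize), it returns the offset of every word in the slot that the GC
// would treat as a pointer.
func tiledPointerOffsets(ptrMask []byte, size, ptrBytes, elemsize uintptr) []uintptr {
	var offsets []uintptr
	// Tile the bitmap over each element of the (possibly single-element) array.
	for elem := uintptr(0); elem < elemsize; elem += size {
		// Only the first ptrBytes bytes of an element can contain pointers.
		for off := uintptr(0); off < ptrBytes; off += goarch.PtrSize {
			if elem+off >= elemsize {
				break
			}
			word := off / goarch.PtrSize
			if ptrMask[word/8]>>(word%8)&1 != 0 {
				offsets = append(offsets, elem+off)
			}
		}
	}
	return offsets
}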
// typePointersOf returns an iterator over all heap pointers in the range [addr, addr+size).
//
// addr and addr+size must be in the range [span.base(), span.limit).
//
// Note: addr+size must be passed as the limit argument to the iterator's next method on
// each iteration. This slightly awkward API is to allow typePointers to be destructured
// by the compiler.
//
// nosplit because it is used during write barriers and must not be preempted.
//
//go:nosplit
func (span *mspan) typePointersOf(addr, size uintptr) typePointers {
	base := span.objBase(addr)
	tp := span.typePointersOfUnchecked(base)
	if base == addr && size == span.elemsize {
		return tp
	}
	return tp.fastForward(addr-tp.addr, addr+size)
}

// typePointersOfUnchecked is like typePointersOf, but assumes addr is the base
// of an allocation slot in a span (the start of the object if no header, the
// header otherwise). It returns an iterator that generates all pointers
// in the range [addr, addr+span.elemsize).
//
// nosplit because it is used during write barriers and must not be preempted.
//
//go:nosplit
func (span *mspan) typePointersOfUnchecked(addr uintptr) typePointers {
	const doubleCheck = false
	if doubleCheck && span.objBase(addr) != addr {
		print("runtime: addr=", addr, " base=", span.objBase(addr), "\n")
		throw("typePointersOfUnchecked consisting of non-base-address for object")
	}

	spc := span.spanclass
	if spc.noscan() {
		return typePointers{}
	}
	if heapBitsInSpan(span.elemsize) {
		// Handle header-less objects.
		return typePointers{elem: addr, addr: addr, mask: span.heapBitsSmallForAddr(addr)}
	}

	// All of these objects have a header.
	var typ *_type
	if spc.sizeclass() != 0 {
		// Pull the allocation header from the first word of the object.
		typ = *(**_type)(unsafe.Pointer(addr))
		addr += mallocHeaderSize
	} else {
		typ = span.largeType
	}
	gcdata := typ.GCData
	return typePointers{elem: addr, addr: addr, mask: readUintptr(gcdata), typ: typ}
}

// typePointersOfType is like typePointersOf, but assumes addr points to one or more
// contiguous instances of the provided type. The provided type must not be nil and
// it must not have its type metadata encoded as a gcprog.
//
// It returns an iterator that tiles typ.GCData starting from addr. It's the caller's
// responsibility to limit iteration.
//
// nosplit because its callers are nosplit and require all their callees to be nosplit.
//
//go:nosplit
func (span *mspan) typePointersOfType(typ *abi.Type, addr uintptr) typePointers {
	const doubleCheck = false
	if doubleCheck && (typ == nil || typ.Kind_&kindGCProg != 0) {
		throw("bad type passed to typePointersOfType")
	}
	if span.spanclass.noscan() {
		return typePointers{}
	}
	// Since we have the type, pretend we have a header.
	gcdata := typ.GCData
	return typePointers{elem: addr, addr: addr, mask: readUintptr(gcdata), typ: typ}
}
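// countObjectPointers is a hypothetical debugging helper, included only to
// illustrate how the iterators above are consumed; it is not part of the
// runtime. It walks every pointer word of the object containing addr using
// next, the same pattern the doubleCheckHeapPointers code later in this file
// uses. addr must point into an allocated object in a scannable span.
func countObjectPointers(span *mspan, addr uintptr) int {
	base := span.objBase(addr)
	limit := base + span.elemsize
	tp := span.typePointersOfUnchecked(base)
	n := 0
	for {
		var p uintptr
		if tp, p = tp.next(limit); p == 0 {
			break
		}
		// p is the address of a word that holds a pointer.
		n++
	}
	return n
}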
// nextFast is the fast path of next. nextFast is written to be inlineable and,
// as the name implies, fast.
//
// Callers that are performance-critical should iterate using the following
// pattern:
//
//	for {
//		var addr uintptr
//		if tp, addr = tp.nextFast(); addr == 0 {
//			if tp, addr = tp.next(limit); addr == 0 {
//				break
//			}
//		}
//		// Use addr.
//		...
//	}
//
// nosplit because it is used during write barriers and must not be preempted.
//
//go:nosplit
func (tp typePointers) nextFast() (typePointers, uintptr) {
	// TESTQ/JEQ
	if tp.mask == 0 {
		return tp, 0
	}
	// BSFQ
	var i int
	if goarch.PtrSize == 8 {
		i = sys.TrailingZeros64(uint64(tp.mask))
	} else {
		i = sys.TrailingZeros32(uint32(tp.mask))
	}
	// BTCQ
	tp.mask ^= uintptr(1) << (i & (ptrBits - 1))
	// LEAQ (XX)(XX*8)
	return tp, tp.addr + uintptr(i)*goarch.PtrSize
}

// next advances the pointers iterator, returning the updated iterator and
// the address of the next pointer.
//
// limit must be the same each time it is passed to next.
//
// nosplit because it is used during write barriers and must not be preempted.
//
//go:nosplit
func (tp typePointers) next(limit uintptr) (typePointers, uintptr) {
	for {
		if tp.mask != 0 {
			return tp.nextFast()
		}

		// Stop if we don't actually have type information.
		if tp.typ == nil {
			return typePointers{}, 0
		}

		// Advance to the next element if necessary.
		if tp.addr+goarch.PtrSize*ptrBits >= tp.elem+tp.typ.PtrBytes {
			tp.elem += tp.typ.Size_
			tp.addr = tp.elem
		} else {
			tp.addr += ptrBits * goarch.PtrSize
		}

		// Check if we've exceeded the limit with the last update.
		if tp.addr >= limit {
			return typePointers{}, 0
		}

		// Grab more bits and try again.
		tp.mask = readUintptr(addb(tp.typ.GCData, (tp.addr-tp.elem)/goarch.PtrSize/8))
		if tp.addr+goarch.PtrSize*ptrBits > limit {
			bits := (tp.addr + goarch.PtrSize*ptrBits - limit) / goarch.PtrSize
			tp.mask &^= ((1 << (bits)) - 1) << (ptrBits - bits)
		}
	}
}

// fastForward moves the iterator forward by n bytes. n must be a multiple
// of goarch.PtrSize. limit must be the same limit passed to next for this
// iterator.
//
// nosplit because it is used during write barriers and must not be preempted.
//
//go:nosplit
func (tp typePointers) fastForward(n, limit uintptr) typePointers {
	// Basic bounds check.
	target := tp.addr + n
	if target >= limit {
		return typePointers{}
	}
	if tp.typ == nil {
		// Handle small objects.
		// Clear any bits before the target address.
		tp.mask &^= (1 << ((target - tp.addr) / goarch.PtrSize)) - 1
		// Clear any bits past the limit.
		if tp.addr+goarch.PtrSize*ptrBits > limit {
			bits := (tp.addr + goarch.PtrSize*ptrBits - limit) / goarch.PtrSize
			tp.mask &^= ((1 << (bits)) - 1) << (ptrBits - bits)
		}
		return tp
	}

	// Move up elem and addr.
	// Offsets within an element are always at a ptrBits*goarch.PtrSize boundary.
	if n >= tp.typ.Size_ {
		// elem needs to be moved to the element containing
		// tp.addr + n.
		oldelem := tp.elem
		tp.elem += (tp.addr - tp.elem + n) / tp.typ.Size_ * tp.typ.Size_
		tp.addr = tp.elem + alignDown(n-(tp.elem-oldelem), ptrBits*goarch.PtrSize)
	} else {
		tp.addr += alignDown(n, ptrBits*goarch.PtrSize)
	}

	if tp.addr-tp.elem >= tp.typ.PtrBytes {
		// We're starting in the non-pointer area of an array.
		// Move up to the next element.
		tp.elem += tp.typ.Size_
		tp.addr = tp.elem
		tp.mask = readUintptr(tp.typ.GCData)

		// We may have exceeded the limit after this. Bail just like next does.
		if tp.addr >= limit {
			return typePointers{}
		}
	} else {
		// Grab the mask, but then clear any bits before the target address and any
		// bits over the limit.
.mask = readUintptr(addb(.typ.GCData, (.addr-.elem)/goarch.PtrSize/8)) .mask &^= (1 << (( - .addr) / goarch.PtrSize)) - 1 }if .addr+goarch.PtrSize*ptrBits > { := (.addr + goarch.PtrSize*ptrBits - ) / goarch.PtrSize .mask &^= ((1 << ()) - 1) << (ptrBits - ) }return}// objBase returns the base pointer for the object containing addr in span.//// Assumes that addr points into a valid part of span (span.base() <= addr < span.limit).////go:nosplitfunc ( *mspan) ( uintptr) uintptr {return .base() + .objIndex()*.elemsize}// bulkBarrierPreWrite executes a write barrier// for every pointer slot in the memory range [src, src+size),// using pointer/scalar information from [dst, dst+size).// This executes the write barriers necessary before a memmove.// src, dst, and size must be pointer-aligned.// The range [dst, dst+size) must lie within a single object.// It does not perform the actual writes.//// As a special case, src == 0 indicates that this is being used for a// memclr. bulkBarrierPreWrite will pass 0 for the src of each write// barrier.//// Callers should call bulkBarrierPreWrite immediately before// calling memmove(dst, src, size). This function is marked nosplit// to avoid being preempted; the GC must not stop the goroutine// between the memmove and the execution of the barriers.// The caller is also responsible for cgo pointer checks if this// may be writing Go pointers into non-Go memory.//// Pointer data is not maintained for allocations containing// no pointers at all; any caller of bulkBarrierPreWrite must first// make sure the underlying allocation contains pointers, usually// by checking typ.PtrBytes.//// The typ argument is the type of the space at src and dst (and the// element type if src and dst refer to arrays) and it is optional.// If typ is nil, the barrier will still behave as expected and typ// is used purely as an optimization. However, it must be used with// care.//// If typ is not nil, then src and dst must point to one or more values// of type typ. The caller must ensure that the ranges [src, src+size)// and [dst, dst+size) refer to one or more whole values of type src and// dst (leaving off the pointerless tail of the space is OK). If this// precondition is not followed, this function will fail to scan the// right pointers.//// When in doubt, pass nil for typ. That is safe and will always work.//// Callers must perform cgo checks if goexperiment.CgoCheck2.////go:nosplitfunc bulkBarrierPreWrite(, , uintptr, *abi.Type) {if (||)&(goarch.PtrSize-1) != 0 {throw("bulkBarrierPreWrite: unaligned arguments") }if !writeBarrier.enabled {return } := spanOf()if == nil {// If dst is a global, use the data or BSS bitmaps to // execute write barriers.for , := rangeactiveModules() {if .data <= && < .edata {bulkBarrierBitmap(, , , -.data, .gcdatamask.bytedata)return } }for , := rangeactiveModules() {if .bss <= && < .ebss {bulkBarrierBitmap(, , , -.bss, .gcbssmask.bytedata)return } }return } elseif .state.get() != mSpanInUse || < .base() || .limit <= {// dst was heap memory at some point, but isn't now. // It can't be a global. It must be either our stack, // or in the case of direct channel sends, it could be // another stack. Either way, no need for barriers. 
// This will also catch if dst is in a freed span, // though that should never have.return } := &getg().m.p.ptr().wbBuf// Double-check that the bitmaps generated in the two possible paths match.const = falseif {doubleCheckTypePointersOfType(, , , ) }vartypePointersif != nil && .Kind_&kindGCProg == 0 { = .typePointersOfType(, ) } else { = .typePointersOf(, ) }if == 0 {for {varuintptrif , = .next( + ); == 0 {break } := (*uintptr)(unsafe.Pointer()) := .get1() [0] = * } } else {for {varuintptrif , = .next( + ); == 0 {break } := (*uintptr)(unsafe.Pointer()) := (*uintptr)(unsafe.Pointer( + ( - ))) := .get2() [0] = * [1] = * } }}// bulkBarrierPreWriteSrcOnly is like bulkBarrierPreWrite but// does not execute write barriers for [dst, dst+size).//// In addition to the requirements of bulkBarrierPreWrite// callers need to ensure [dst, dst+size) is zeroed.//// This is used for special cases where e.g. dst was just// created and zeroed with malloc.//// The type of the space can be provided purely as an optimization.// See bulkBarrierPreWrite's comment for more details -- use this// optimization with great care.////go:nosplitfunc bulkBarrierPreWriteSrcOnly(, , uintptr, *abi.Type) {if (||)&(goarch.PtrSize-1) != 0 {throw("bulkBarrierPreWrite: unaligned arguments") }if !writeBarrier.enabled {return } := &getg().m.p.ptr().wbBuf := spanOf()// Double-check that the bitmaps generated in the two possible paths match.const = falseif {doubleCheckTypePointersOfType(, , , ) }vartypePointersif != nil && .Kind_&kindGCProg == 0 { = .typePointersOfType(, ) } else { = .typePointersOf(, ) }for {varuintptrif , = .next( + ); == 0 {break } := (*uintptr)(unsafe.Pointer( - + )) := .get1() [0] = * }}// initHeapBits initializes the heap bitmap for a span.//// TODO(mknyszek): This should set the heap bits for single pointer// allocations eagerly to avoid calling heapSetType at allocation time,// just to write one bit.func ( *mspan) ( bool) {if (!.spanclass.noscan() && heapBitsInSpan(.elemsize)) || .isUserArenaChunk { := .heapBits()for := range { [] = 0 } }}// bswapIfBigEndian swaps the byte order of the uintptr on goarch.BigEndian platforms,// and leaves it alone elsewhere.func bswapIfBigEndian( uintptr) uintptr {ifgoarch.BigEndian {ifgoarch.PtrSize == 8 {returnuintptr(sys.Bswap64(uint64())) }returnuintptr(sys.Bswap32(uint32())) }return}type writeUserArenaHeapBits struct { offset uintptr// offset in span that the low bit of mask represents the pointer state of. mask uintptr// some pointer bits starting at the address addr. valid uintptr// number of bits in buf that are valid (including low) low uintptr// number of low-order bits to not overwrite}func ( *mspan) ( uintptr) ( writeUserArenaHeapBits) { := - .base()// We start writing bits maybe in the middle of a heap bitmap word. // Remember how many bits into the word we started, so we can be sure // not to overwrite the previous bits. .low = / goarch.PtrSize % ptrBits// round down to heap word that starts the bitmap word. .offset = - .low*goarch.PtrSize// We don't have any bits yet. .mask = 0 .valid = .lowreturn}// write appends the pointerness of the next valid pointer slots// using the low valid bits of bits. 1=pointer, 0=scalar.func ( writeUserArenaHeapBits) ( *mspan, , uintptr) writeUserArenaHeapBits {if .valid+ <= ptrBits {// Fast path - just accumulate the bits. .mask |= << .valid .valid += return }// Too many bits to fit in this word. Write the current word // out and move on to the next word. 
:= .mask | <<.valid// mask for this word .mask = >> (ptrBits - .valid) // leftover for next word .valid += - ptrBits// have h.valid+valid bits, writing ptrBits of them// Flush mask to the memory bitmap. := .offset / (ptrBits * goarch.PtrSize) := uintptr(1)<<.low - 1 := .heapBits() [] = bswapIfBigEndian(bswapIfBigEndian([])& | )// Note: no synchronization required for this write because // the allocator has exclusive access to the page, and the bitmap // entries are all for a single page. Also, visibility of these // writes is guaranteed by the publication barrier in mallocgc.// Move to next word of bitmap. .offset += ptrBits * goarch.PtrSize .low = 0return}// Add padding of size bytes.func ( writeUserArenaHeapBits) ( *mspan, uintptr) writeUserArenaHeapBits {if == 0 {return } := / goarch.PtrSizefor > ptrBits { = .write(, 0, ptrBits) -= ptrBits }return .write(, 0, )}// Flush the bits that have been written, and add zeros as needed// to cover the full object [addr, addr+size).func ( writeUserArenaHeapBits) ( *mspan, , uintptr) { := - .base()// zeros counts the number of bits needed to represent the object minus the // number of bits we've already written. This is the number of 0 bits // that need to be added. := (+-.offset)/goarch.PtrSize - .valid// Add zero bits up to the bitmap word boundaryif > 0 { := ptrBits - .validif > { = } .valid += -= }// Find word in bitmap that we're going to write. := .heapBits() := .offset / (ptrBits * goarch.PtrSize)// Write remaining bits.if .valid != .low { := uintptr(1)<<.low - 1// don't clear existing bits below "low" |= ^(uintptr(1)<<.valid - 1) // don't clear existing bits above "valid" [] = bswapIfBigEndian(bswapIfBigEndian([])& | .mask) }if == 0 {return }// Advance to next bitmap word. .offset += ptrBits * goarch.PtrSize// Continue on writing zeros for the rest of the object. // For standard use of the ptr bits this is not required, as // the bits are read from the beginning of the object. Some uses, // like noscan spans, oblets, bulk write barriers, and cgocheck, might // start mid-object, so these writes are still required.for {// Write zero bits. := .offset / (ptrBits * goarch.PtrSize)if < ptrBits { [] = bswapIfBigEndian(bswapIfBigEndian([]) &^ (uintptr(1)<< - 1))break } elseif == ptrBits { [] = 0break } else { [] = 0 -= ptrBits } .offset += ptrBits * goarch.PtrSize }}// heapBits returns the heap ptr/scalar bits stored at the end of the span for// small object spans and heap arena spans.//// Note that the uintptr of each element means something different for small object// spans and for heap arena spans. Small object spans are easy: they're never interpreted// as anything but uintptr, so they're immune to differences in endianness. However, the// heapBits for user arena spans is exposed through a dummy type descriptor, so the byte// ordering needs to match the same byte ordering the compiler would emit. The compiler always// emits the bitmap data in little endian byte ordering, so on big endian platforms these// uintptrs will have their byte orders swapped from what they normally would be.//// heapBitsInSpan(span.elemsize) or span.isUserArenaChunk must be true.////go:nosplitfunc ( *mspan) () []uintptr {const = falseif && !.isUserArenaChunk {if .spanclass.noscan() {throw("heapBits called for noscan") }if .elemsize > minSizeForMallocHeader {throw("heapBits called for span class that should have a malloc header") } }// Find the bitmap at the end of the span. // // Nearly every span with heap bits is exactly one page in size. 
Arenas are the only exception.if .npages == 1 {// This will be inlined and constant-folded down.returnheapBitsSlice(.base(), pageSize) }returnheapBitsSlice(.base(), .npages*pageSize)}// Helper for constructing a slice for the span's heap bits.////go:nosplitfunc heapBitsSlice(, uintptr) []uintptr { := / goarch.PtrSize / 8 := int( / goarch.PtrSize)varnotInHeapSlice = notInHeapSlice{(*notInHeap)(unsafe.Pointer( + - )), , }return *(*[]uintptr)(unsafe.Pointer(&))}// heapBitsSmallForAddr loads the heap bits for the object stored at addr from span.heapBits.//// addr must be the base pointer of an object in the span. heapBitsInSpan(span.elemsize)// must be true.////go:nosplitfunc ( *mspan) ( uintptr) uintptr { := .npages * pageSize := / goarch.PtrSize / 8 := (*byte)(unsafe.Pointer(.base() + - ))// These objects are always small enough that their bitmaps // fit in a single word, so just load the word or two we need. // // Mirrors mspan.writeHeapBitsSmall. // // We should be using heapBits(), but unfortunately it introduces // both bounds checks panics and throw which causes us to exceed // the nosplit limit in quite a few cases. := ( - .base()) / goarch.PtrSize / ptrBits := ( - .base()) / goarch.PtrSize % ptrBits := .elemsize / goarch.PtrSize := (*uintptr)(unsafe.Pointer(addb(, goarch.PtrSize*(+0)))) := (*uintptr)(unsafe.Pointer(addb(, goarch.PtrSize*(+1))))varuintptrif + > ptrBits {// Two reads. := ptrBits - := - = * >> |= (* & ((1 << ) - 1)) << } else {// One read. = (* >> ) & ((1 << ) - 1) }return}// writeHeapBitsSmall writes the heap bits for small objects whose ptr/scalar data is// stored as a bitmap at the end of the span.//// Assumes dataSize is <= ptrBits*goarch.PtrSize. x must be a pointer into the span.// heapBitsInSpan(dataSize) must be true. dataSize must be >= typ.Size_.////go:nosplitfunc ( *mspan) (, uintptr, *_type) ( uintptr) {// The objects here are always really small, so a single load is sufficient. := readUintptr(.GCData)// Create repetitions of the bitmap if we have a small array. := .elemsize / goarch.PtrSize = .PtrBytes := switch .Size_ {casegoarch.PtrSize: = (1 << ( / goarch.PtrSize)) - 1default:for := .Size_; < ; += .Size_ { |= << ( / goarch.PtrSize) += .Size_ } }// Since we're never writing more than one uintptr's worth of bits, we're either going // to do one or two writes. := .heapBits() := ( - .base()) / goarch.PtrSize := / ptrBits := % ptrBitsif + > ptrBits {// Two writes. := ptrBits - := - [+0] = [+0]&(^uintptr(0)>>) | ( << ) [+1] = [+1]&^((1<<)-1) | ( >> ) } else {// One write. [] = ([] &^ (((1 << ) - 1) << )) | ( << ) }const = falseif { := .heapBitsSmallForAddr()if != {print("runtime: x=", hex(), " i=", , " j=", , " bits=", , "\n")print("runtime: dataSize=", , " typ.Size_=", .Size_, " typ.PtrBytes=", .PtrBytes, "\n")print("runtime: src0=", hex(), " src=", hex(), " srcRead=", hex(), "\n")throw("bad pointer bits written for small object") } }return}// For !goexperiment.AllocHeaders.func heapBitsSetType(, , uintptr, *_type) {}// heapSetType records that the new allocation [x, x+size)// holds in [x, x+dataSize) one or more values of type typ.// (The number of values is given by dataSize / typ.Size.)// If dataSize < size, the fragment [x+dataSize, x+size) is// recorded as non-pointer data.// It is known that the type has pointers somewhere;// malloc does not call heapSetType when there are no pointers.//// There can be read-write races between heapSetType and things// that read the heap metadata like scanobject. 
However, since// heapSetType is only used for objects that have not yet been// made reachable, readers will ignore bits being modified by this// function. This does mean this function cannot transiently modify// shared memory that belongs to neighboring objects. Also, on weakly-ordered// machines, callers must execute a store/store (publication) barrier// between calling this function and making the object reachable.func heapSetType(, uintptr, *_type, **_type, *mspan) ( uintptr) {const = false := if == nil {if && (!heapBitsInSpan() || !heapBitsInSpan(.elemsize)) {throw("tried to write heap bits, but no heap bits in span") }// Handle the case where we have no malloc header. = .writeHeapBitsSmall(, , ) } else {if .Kind_&kindGCProg != 0 {// Allocate space to unroll the gcprog. This space will consist of // a dummy _type value and the unrolled gcprog. The dummy _type will // refer to the bitmap, and the mspan will refer to the dummy _type.if .spanclass.sizeclass() != 0 {throw("GCProg for type that isn't large") } := alignUp(unsafe.Sizeof(_type{}), goarch.PtrSize) := += alignUp(.PtrBytes/goarch.PtrSize/8, goarch.PtrSize) := alignUp(, pageSize) / pageSizevar *mspansystemstack(func() { = mheap_.allocManual(, spanAllocPtrScalarBits)memclrNoHeapPointers(unsafe.Pointer(.base()), .npages*pageSize) })// Write a dummy _type in the new space. // // We only need to write size, PtrBytes, and GCData, since that's all // the GC cares about. = (*_type)(unsafe.Pointer(.base())) .Size_ = .Size_ .PtrBytes = .PtrBytes .GCData = (*byte)(add(unsafe.Pointer(.base()), )) .TFlag = abi.TFlagUnrolledBitmap// Expand the GC program into space reserved at the end of the new span.runGCProg(addb(.GCData, 4), .GCData) }// Write out the header. * = = .elemsize }if {doubleCheckHeapPointers(, , , , )// To exercise the less common path more often, generate // a random interior pointer and make sure iterating from // that point works correctly too. := .elemsizeif == nil { = } := alignUp(uintptr(cheaprand())%, goarch.PtrSize) := - if == 0 { -= goarch.PtrSize += goarch.PtrSize } := + -= alignDown(uintptr(cheaprand())%, goarch.PtrSize)if == 0 { = goarch.PtrSize }// Round up the type to the size of the type. = ( + .Size_ - 1) / .Size_ * .Size_if + > + { = + - }doubleCheckHeapPointersInterior(, , , , , , ) }return}func doubleCheckHeapPointers(, uintptr, *_type, **_type, *mspan) {// Check that scanning the full object works. := .typePointersOfUnchecked(.objBase()) := .elemsizeif == nil { = } := falsefor := uintptr(0); < ; += goarch.PtrSize {// Compute the pointer bit we want at offset i. := falseif < .elemsize { := % .Size_if < .PtrBytes { := / goarch.PtrSize = *addb(.GCData, /8)>>(%8)&1 != 0 } }if {varuintptr , = .next( + .elemsize)if == 0 {println("runtime: found bad iterator") }if != + {print("runtime: addr=", hex(), " x+i=", hex(+), "\n") = true } } }if ! 
{varuintptr , = .next( + .elemsize)if == 0 {return }println("runtime: extra pointer:", hex()) }print("runtime: hasHeader=", != nil, " typ.Size_=", .Size_, " hasGCProg=", .Kind_&kindGCProg != 0, "\n")print("runtime: x=", hex(), " dataSize=", , " elemsize=", .elemsize, "\n")print("runtime: typ=", unsafe.Pointer(), " typ.PtrBytes=", .PtrBytes, "\n")print("runtime: limit=", hex(+.elemsize), "\n") = .typePointersOfUnchecked()dumpTypePointers()for {varuintptrif , = .next( + .elemsize); == 0 {println("runtime: would've stopped here")dumpTypePointers()break }print("runtime: addr=", hex(), "\n")dumpTypePointers() }throw("heapSetType: pointer entry not correct")}func doubleCheckHeapPointersInterior(, , , uintptr, *_type, **_type, *mspan) { := falseif < {print("runtime: interior=", hex(), " x=", hex(), "\n")throw("found bad interior pointer") } := - := .typePointersOf(, )for := ; < +; += goarch.PtrSize {// Compute the pointer bit we want at offset i. := falseif < .elemsize { := % .Size_if < .PtrBytes { := / goarch.PtrSize = *addb(.GCData, /8)>>(%8)&1 != 0 } }if {varuintptr , = .next( + )if == 0 {println("runtime: found bad iterator") = true }if != + {print("runtime: addr=", hex(), " x+i=", hex(+), "\n") = true } } }if ! {varuintptr , = .next( + )if == 0 {return }println("runtime: extra pointer:", hex()) }print("runtime: hasHeader=", != nil, " typ.Size_=", .Size_, "\n")print("runtime: x=", hex(), " dataSize=", , " elemsize=", .elemsize, " interior=", hex(), " size=", , "\n")print("runtime: limit=", hex(+), "\n") = .typePointersOf(, )dumpTypePointers()for {varuintptrif , = .next( + ); == 0 {println("runtime: would've stopped here")dumpTypePointers()break }print("runtime: addr=", hex(), "\n")dumpTypePointers() }print("runtime: want: ")for := ; < +; += goarch.PtrSize {// Compute the pointer bit we want at offset i. := falseif < { := % .Size_if < .PtrBytes { := / goarch.PtrSize = *addb(.GCData, /8)>>(%8)&1 != 0 } }if {print("1") } else {print("0") } }println()throw("heapSetType: pointer entry not correct")}//go:nosplitfunc doubleCheckTypePointersOfType( *mspan, *_type, , uintptr) {if == nil || .Kind_&kindGCProg != 0 {return }if .Kind_&kindMask == kindInterface {// Interfaces are unfortunately inconsistently handled // when it comes to the type pointer, so it's easy to // produce a lot of false positives here.return } := .typePointersOfType(, ) := .typePointersOf(, ) := falsefor {var , uintptr , = .next( + ) , = .next( + )if != { = truebreak }if == 0 {break } }if { := .typePointersOfType(, ) := .typePointersOf(, )print("runtime: addr=", hex(), " size=", , "\n")print("runtime: type=", toRType().string(), "\n")dumpTypePointers()dumpTypePointers()for {var , uintptr , = .next( + ) , = .next( + )print("runtime: ", hex(), " ", hex(), "\n")if == 0 && == 0 {break } }throw("mismatch between typePointersOfType and typePointersOf") }}func dumpTypePointers( typePointers) {print("runtime: tp.elem=", hex(.elem), " tp.typ=", unsafe.Pointer(.typ), "\n")print("runtime: tp.addr=", hex(.addr), " tp.mask=")for := uintptr(0); < ptrBits; ++ {if .mask&(uintptr(1)<<) != 0 {print("1") } else {print("0") } }println()}// Testing.// Returns GC type info for the pointer stored in ep for testing.// If ep points to the stack, only static live information will be returned// (i.e. 
not for objects which are only dynamically live stack objects).func getgcmask( any) ( []byte) { := *efaceOf(&) := .data := ._typevar *_typeif .Kind_&kindMask != kindPtr {throw("bad argument to getgcmask: expected type to be a pointer to the value type whose mask is being queried") } = (*ptrtype)(unsafe.Pointer()).Elem// data or bssfor , := rangeactiveModules() {// dataif .data <= uintptr() && uintptr() < .edata { := .gcdatamask.bytedata := .Size_ = make([]byte, /goarch.PtrSize)for := uintptr(0); < ; += goarch.PtrSize { := (uintptr() + - .data) / goarch.PtrSize [/goarch.PtrSize] = (*addb(, /8) >> ( % 8)) & 1 }return }// bssif .bss <= uintptr() && uintptr() < .ebss { := .gcbssmask.bytedata := .Size_ = make([]byte, /goarch.PtrSize)for := uintptr(0); < ; += goarch.PtrSize { := (uintptr() + - .bss) / goarch.PtrSize [/goarch.PtrSize] = (*addb(, /8) >> ( % 8)) & 1 }return } }// heapif , , := findObject(uintptr(), 0, 0); != 0 {if .spanclass.noscan() {returnnil } := + .elemsize// Move the base up to the iterator's start, because // we want to hide evidence of a malloc header from the // caller. := .typePointersOfUnchecked() = .addr// Unroll the full bitmap the GC would actually observe. := make([]byte, (-)/goarch.PtrSize)for {varuintptrif , = .next(); == 0 {break } [(-)/goarch.PtrSize] = 1 }// Double-check that every part of the ptr/scalar we're not // showing the caller is zeroed. This keeps us honest that // that information is actually irrelevant.for := ; < .elemsize; ++ {if *(*byte)(unsafe.Pointer()) != 0 {throw("found non-zeroed tail of allocation") } }// Callers (and a check we're about to run) expects this mask // to end at the last pointer.forlen() > 0 && [len()-1] == 0 { = [:len()-1] }if .Kind_&kindGCProg == 0 {// Unroll again, but this time from the type information. := make([]byte, (-)/goarch.PtrSize) = .typePointersOfType(, )for {varuintptrif , = .next(); == 0 {break } [(-)/goarch.PtrSize] = 1 }// Validate that the prefix of maskFromType is equal to // maskFromHeap. maskFromType may contain more pointers than // maskFromHeap produces because maskFromHeap may be able to // get exact type information for certain classes of objects. // With maskFromType, we're always just tiling the type bitmap // through to the elemsize. // // It's OK if maskFromType has pointers in elemsize that extend // past the actual populated space; we checked above that all // that space is zeroed, so just the GC will just see nil pointers. := falsefor := range {if [] != [] { = truebreak } }if {print("runtime: heap mask=")for , := range {print() }println()print("runtime: type mask=")for , := range {print() }println()print("runtime: type=", toRType().string(), "\n")throw("found two different masks from two different methods") } }// Select the heap mask to return. We may not have a type mask. = // Make sure we keep ep alive. 
We may have stopped referencing // ep's data pointer sometime before this point and it's possible // for that memory to get freed.KeepAlive()return }// stackif := getg(); .m.curg.stack.lo <= uintptr() && uintptr() < .m.curg.stack.hi { := falsevarunwinderfor .initAt(.m.curg.sched.pc, .m.curg.sched.sp, 0, .m.curg, 0); .valid(); .next() {if .frame.sp <= uintptr() && uintptr() < .frame.varp { = truebreak } }if { , , := .frame.getStackMap(false)if .n == 0 {return } := uintptr(.n) * goarch.PtrSize := (*ptrtype)(unsafe.Pointer()).Elem.Size_ = make([]byte, /goarch.PtrSize)for := uintptr(0); < ; += goarch.PtrSize { := (uintptr() + - .frame.varp + ) / goarch.PtrSize [/goarch.PtrSize] = .ptrbit() } }return }// otherwise, not something the GC knows about. // possibly read-only data, like malloc(0). // must not have pointersreturn}// userArenaHeapBitsSetType is the equivalent of heapSetType but for// non-slice-backing-store Go values allocated in a user arena chunk. It// sets up the type metadata for the value with type typ allocated at address ptr.// base is the base address of the arena chunk.func userArenaHeapBitsSetType( *_type, unsafe.Pointer, *mspan) { := .base() := .writeUserArenaHeapBits(uintptr()) := .GCData// start of 1-bit pointer mask (or GC program)varuintptrif .Kind_&kindGCProg != 0 {// Expand gc program, using the object itself for storage. = runGCProg(addb(, 4), (*byte)()) = (*byte)() } := .PtrBytes / goarch.PtrSizefor := uintptr(0); < ; += ptrBits { := - if > ptrBits { = ptrBits }// N.B. On big endian platforms we byte swap the data that we // read from GCData, which is always stored in little-endian order // by the compiler. writeUserArenaHeapBits handles data in // a platform-ordered way for efficiency, but stores back the // data in little endian order, since we expose the bitmap through // a dummy type. = .write(, readUintptr(addb(, /8)), ) }// Note: we call pad here to ensure we emit explicit 0 bits // for the pointerless tail of the object. This ensures that // there's only a single noMorePtrs mark for the next object // to clear. We don't need to do this to clear stale noMorePtrs // markers from previous uses because arena chunk pointer bitmaps // are always fully cleared when reused. = .pad(, .Size_-.PtrBytes) .flush(, uintptr(), .Size_)if .Kind_&kindGCProg != 0 {// Zero out temporary ptrmask buffer inside object.memclrNoHeapPointers(, (+7)/8) }// Update the PtrBytes value in the type information. After this // point, the GC will observe the new bitmap. .largeType.PtrBytes = uintptr() - + .PtrBytes// Double-check that the bitmap was written out correctly.const = falseif {doubleCheckHeapPointersInterior(uintptr(), uintptr(), .Size_, .Size_, , &.largeType, ) }}// For !goexperiment.AllocHeaders, to pass TestIntendedInlining.func writeHeapBitsForAddr() {panic("not implemented")}// For !goexperiment.AllocHeaders.type heapBits struct {}// For !goexperiment.AllocHeaders.////go:nosplitfunc heapBitsForAddr(, uintptr) heapBits {panic("not implemented")}// For !goexperiment.AllocHeaders.////go:nosplitfunc ( heapBits) () (heapBits, uintptr) {panic("not implemented")}// For !goexperiment.AllocHeaders.////go:nosplitfunc ( heapBits) () (heapBits, uintptr) {panic("not implemented")}
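// The sketch below is hypothetical and exists only to illustrate the calling
// contract documented on bulkBarrierPreWrite earlier in this file: the write
// barriers are executed immediately before the memmove, and passing
// typ.PtrBytes as the size is allowed because the pointerless tail may be
// left off. The real code path with this shape is typedmemmove in
// mbarrier.go; the name typedBulkCopySketch is invented for this example.
func typedBulkCopySketch(typ *_type, dst, src unsafe.Pointer) {
	if writeBarrier.enabled && typ.PtrBytes != 0 {
		// Pre-write barriers for every pointer slot in [dst, dst+typ.PtrBytes),
		// using the old values at dst and the new values at src.
		bulkBarrierPreWrite(uintptr(dst), uintptr(src), typ.PtrBytes, typ)
	}
	// Only after the barriers have run may the bytes actually move.
	memmove(dst, src, typ.Size_)
}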