// Copyright 2009 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

package xml

import (
	"bytes"
	"encoding"
	"errors"
	"fmt"
	"reflect"
	"runtime"
	"strconv"
	"strings"
)

// BUG(rsc): Mapping between XML elements and data structures is inherently flawed:
// an XML element is an order-dependent collection of anonymous
// values, while a data structure is an order-independent collection
// of named values.
// See [encoding/json] for a textual representation more suitable
// to data structures.

// Unmarshal parses the XML-encoded data and stores the result in
// the value pointed to by v, which must be an arbitrary struct,
// slice, or string. Well-formed data that does not fit into v is
// discarded.
//
// Because Unmarshal uses the reflect package, it can only assign
// to exported (upper case) fields. Unmarshal uses a case-sensitive
// comparison to match XML element names to tag values and struct
// field names.
//
// Unmarshal maps an XML element to a struct using the following rules.
// In the rules, the tag of a field refers to the value associated with the
// key 'xml' in the struct field's tag (see the example above).
//
//   - If the struct has a field of type []byte or string with tag
//     ",innerxml", Unmarshal accumulates the raw XML nested inside the
//     element in that field. The rest of the rules still apply.
//
//   - If the struct has a field named XMLName of type Name,
//     Unmarshal records the element name in that field.
//
//   - If the XMLName field has an associated tag of the form
//     "name" or "namespace-URL name", the XML element must have
//     the given name (and, optionally, name space) or else Unmarshal
//     returns an error.
//
//   - If the XML element has an attribute whose name matches a
//     struct field name with an associated tag containing ",attr" or
//     the explicit name in a struct field tag of the form "name,attr",
//     Unmarshal records the attribute value in that field.
//
//   - If the XML element has an attribute not handled by the previous
//     rule and the struct has a field with an associated tag containing
//     ",any,attr", Unmarshal records the attribute value in the first
//     such field.
//
//   - If the XML element contains character data, that data is
//     accumulated in the first struct field that has tag ",chardata".
//     The struct field may have type []byte or string.
//     If there is no such field, the character data is discarded.
//
//   - If the XML element contains comments, they are accumulated in
//     the first struct field that has tag ",comment". The struct
//     field may have type []byte or string. If there is no such
//     field, the comments are discarded.
//
//   - If the XML element contains a sub-element whose name matches
//     the prefix of a tag formatted as "a" or "a>b>c", unmarshal
//     will descend into the XML structure looking for elements with the
//     given names, and will map the innermost elements to that struct
//     field. A tag starting with ">" is equivalent to one starting
//     with the field name followed by ">".
//
//   - If the XML element contains a sub-element whose name matches
//     a struct field's XMLName tag and the struct field has no
//     explicit name tag as per the previous rule, unmarshal maps
//     the sub-element to that struct field.
//
//   - If the XML element contains a sub-element whose name matches a
//     field without any mode flags (",attr", ",chardata", etc), Unmarshal
//     maps the sub-element to that struct field.
//
//   - If the XML element contains a sub-element that hasn't matched any
//     of the above rules and the struct has a field with tag ",any",
//     unmarshal maps the sub-element to that struct field.
//
//   - An anonymous struct field is handled as if the fields of its
//     value were part of the outer struct.
//
//   - A struct field with tag "-" is never unmarshaled into.
//
// If Unmarshal encounters a field type that implements the Unmarshaler
// interface, Unmarshal calls its UnmarshalXML method to produce the value from
// the XML element. Otherwise, if the value implements
// [encoding.TextUnmarshaler], Unmarshal calls that value's UnmarshalText method.
//
// Unmarshal maps an XML element to a string or []byte by saving the
// concatenation of that element's character data in the string or
// []byte. The saved []byte is never nil.
//
// Unmarshal maps an attribute value to a string or []byte by saving
// the value in the string or slice.
//
// Unmarshal maps an attribute value to an [Attr] by saving the attribute,
// including its name, in the Attr.
//
// Unmarshal maps an XML element or attribute value to a slice by
// extending the length of the slice and mapping the element or attribute
// to the newly created value.
//
// Unmarshal maps an XML element or attribute value to a bool by
// setting it to the boolean value represented by the string. Whitespace
// is trimmed and ignored.
//
// Unmarshal maps an XML element or attribute value to an integer or
// floating-point field by setting the field to the result of
// interpreting the string value in decimal. There is no check for
// overflow. Whitespace is trimmed and ignored.
//
// Unmarshal maps an XML element to a Name by recording the element
// name.
//
// Unmarshal maps an XML element to a pointer by setting the pointer
// to a freshly allocated value and then mapping the element to that value.
//
// A missing element or empty attribute value will be unmarshaled as a zero value.
// If the field is a slice, a zero value will be appended to the field. Otherwise, the
// field will be set to its zero value.
func Unmarshal(data []byte, v any) error {
	return NewDecoder(bytes.NewReader(data)).Decode(v)
}

// Decode works like [Unmarshal], except it reads the decoder
// stream to find the start element.
func (d *Decoder) Decode(v any) error {
	return d.DecodeElement(v, nil)
}

// DecodeElement works like [Unmarshal] except that it takes
// a pointer to the start XML element to decode into v.
// It is useful when a client reads some raw XML tokens itself
// but also wants to defer to [Unmarshal] for some elements.
func (d *Decoder) DecodeElement(v any, start *StartElement) error {
	val := reflect.ValueOf(v)
	if val.Kind() != reflect.Pointer {
		return errors.New("non-pointer passed to Unmarshal")
	}

	if val.IsNil() {
		return errors.New("nil pointer passed to Unmarshal")
	}
	return d.unmarshal(val.Elem(), start, 0)
}

// An UnmarshalError represents an error in the unmarshaling process.
type UnmarshalError string

func (e UnmarshalError) Error() string { return string(e) }

// Unmarshaler is the interface implemented by objects that can unmarshal
// an XML element description of themselves.
//
// UnmarshalXML decodes a single XML element
// beginning with the given start element.
// If it returns an error, the outer call to Unmarshal stops and
// returns that error.
// UnmarshalXML must consume exactly one XML element.
// One common implementation strategy is to unmarshal into
// a separate value with a layout matching the expected XML
// using d.DecodeElement, and then to copy the data from
// that value into the receiver.
// Another common strategy is to use d.Token to process the
// XML object one token at a time.
// UnmarshalXML may not use d.RawToken.
type Unmarshaler interface {
	UnmarshalXML(d *Decoder, start StartElement) error
}

// UnmarshalerAttr is the interface implemented by objects that can unmarshal
// an XML attribute description of themselves.
//
// UnmarshalXMLAttr decodes a single XML attribute.
// If it returns an error, the outer call to [Unmarshal] stops and
// returns that error.
// UnmarshalXMLAttr is used only for struct fields with the
// "attr" option in the field tag.
type UnmarshalerAttr interface {
	UnmarshalXMLAttr(attr Attr) error
}

// receiverType returns the receiver type to use in an expression like "%s.MethodName".
func receiverType(val any) string {
	t := reflect.TypeOf(val)
	if t.Name() != "" {
		return t.String()
	}
	return "(" + t.String() + ")"
}

// unmarshalInterface unmarshals a single XML element into val.
// start is the opening tag of the element.
func (d *Decoder) unmarshalInterface(val Unmarshaler, start *StartElement) error {
	// Record that decoder must stop at end tag corresponding to start.
	d.pushEOF()

	d.unmarshalDepth++
	err := val.UnmarshalXML(d, *start)
	d.unmarshalDepth--
	if err != nil {
		d.popEOF()
		return err
	}

	if !d.popEOF() {
		return fmt.Errorf("xml: %s.UnmarshalXML did not consume entire <%s> element", receiverType(val), start.Name.Local)
	}

	return nil
}

// unmarshalTextInterface unmarshals a single XML element into val.
// The chardata contained in the element (but not its children)
// is passed to the text unmarshaler.
func (d *Decoder) unmarshalTextInterface(val encoding.TextUnmarshaler) error {
	var buf []byte
	depth := 1
	for depth > 0 {
		t, err := d.Token()
		if err != nil {
			return err
		}
		switch t := t.(type) {
		case CharData:
			if depth == 1 {
				buf = append(buf, t...)
			}
		case StartElement:
			depth++
		case EndElement:
			depth--
		}
	}
	return val.UnmarshalText(buf)
}

// unmarshalAttr unmarshals a single XML attribute into val.
func (d *Decoder) unmarshalAttr(val reflect.Value, attr Attr) error {
	if val.Kind() == reflect.Pointer {
		if val.IsNil() {
			val.Set(reflect.New(val.Type().Elem()))
		}
		val = val.Elem()
	}
	if val.CanInterface() && val.Type().Implements(unmarshalerAttrType) {
		// This is an unmarshaler with a non-pointer receiver,
		// so it's likely to be incorrect, but we do what we're told.
		return val.Interface().(UnmarshalerAttr).UnmarshalXMLAttr(attr)
	}
	if val.CanAddr() {
		pv := val.Addr()
		if pv.CanInterface() && pv.Type().Implements(unmarshalerAttrType) {
			return pv.Interface().(UnmarshalerAttr).UnmarshalXMLAttr(attr)
		}
	}

	// Not an UnmarshalerAttr; try encoding.TextUnmarshaler.
	if val.CanInterface() && val.Type().Implements(textUnmarshalerType) {
		// This is an unmarshaler with a non-pointer receiver,
		// so it's likely to be incorrect, but we do what we're told.
		return val.Interface().(encoding.TextUnmarshaler).UnmarshalText([]byte(attr.Value))
	}
	if val.CanAddr() {
		pv := val.Addr()
		if pv.CanInterface() && pv.Type().Implements(textUnmarshalerType) {
			return pv.Interface().(encoding.TextUnmarshaler).UnmarshalText([]byte(attr.Value))
		}
	}

	if val.Type().Kind() == reflect.Slice && val.Type().Elem().Kind() != reflect.Uint8 {
		// Slice of element values.
		// Grow slice.
		n := val.Len()
		val.Grow(1)
		val.SetLen(n + 1)

		// Recur to read element into slice.
		if err := d.unmarshalAttr(val.Index(n), attr); err != nil {
			val.SetLen(n)
			return err
		}
		return nil
	}

	if val.Type() == attrType {
		val.Set(reflect.ValueOf(attr))
		return nil
	}

	return copyValue(val, []byte(attr.Value))
}

var (
	attrType            = reflect.TypeFor[Attr]()
	unmarshalerType     = reflect.TypeFor[Unmarshaler]()
	unmarshalerAttrType = reflect.TypeFor[UnmarshalerAttr]()
	textUnmarshalerType = reflect.TypeFor[encoding.TextUnmarshaler]()
)

const (
	maxUnmarshalDepth     = 10000
	maxUnmarshalDepthWasm = 5000 // go.dev/issue/56498
)

var errUnmarshalDepth = errors.New("exceeded max depth")

// Unmarshal a single XML element into val.
func (d *Decoder) unmarshal(val reflect.Value, start *StartElement, depth int) error {
	if depth >= maxUnmarshalDepth || runtime.GOARCH == "wasm" && depth >= maxUnmarshalDepthWasm {
		return errUnmarshalDepth
	}
	// Find start element if we need it.
	if start == nil {
		for {
			tok, err := d.Token()
			if err != nil {
				return err
			}
			if t, ok := tok.(StartElement); ok {
				start = &t
				break
			}
		}
	}

	// Load value from interface, but only if the result will be
	// usefully addressable.
	if val.Kind() == reflect.Interface && !val.IsNil() {
		e := val.Elem()
		if e.Kind() == reflect.Pointer && !e.IsNil() {
			val = e
		}
	}

	if val.Kind() == reflect.Pointer {
		if val.IsNil() {
			val.Set(reflect.New(val.Type().Elem()))
		}
		val = val.Elem()
	}

	if val.CanInterface() && val.Type().Implements(unmarshalerType) {
		// This is an unmarshaler with a non-pointer receiver,
		// so it's likely to be incorrect, but we do what we're told.
		return d.unmarshalInterface(val.Interface().(Unmarshaler), start)
	}

	if val.CanAddr() {
		pv := val.Addr()
		if pv.CanInterface() && pv.Type().Implements(unmarshalerType) {
			return d.unmarshalInterface(pv.Interface().(Unmarshaler), start)
		}
	}

	if val.CanInterface() && val.Type().Implements(textUnmarshalerType) {
		return d.unmarshalTextInterface(val.Interface().(encoding.TextUnmarshaler))
	}

	if val.CanAddr() {
		pv := val.Addr()
		if pv.CanInterface() && pv.Type().Implements(textUnmarshalerType) {
			return d.unmarshalTextInterface(pv.Interface().(encoding.TextUnmarshaler))
		}
	}

	var (
		data         []byte
		saveData     reflect.Value
		comment      []byte
		saveComment  reflect.Value
		saveXML      reflect.Value
		saveXMLIndex int
		saveXMLData  []byte
		saveAny      reflect.Value
		sv           reflect.Value
		tinfo        *typeInfo
		err          error
	)

	switch v := val; v.Kind() {
	default:
		return errors.New("unknown type " + v.Type().String())

	case reflect.Interface:
		// TODO: For now, simply ignore the field. In the near
		//       future we may choose to unmarshal the start
		//       element on it, if not nil.
		return d.Skip()

	case reflect.Slice:
		typ := v.Type()
		if typ.Elem().Kind() == reflect.Uint8 {
			// []byte
			saveData = v
			break
		}

		// Slice of element values.
		// Grow slice.
		n := v.Len()
		v.Grow(1)
		v.SetLen(n + 1)

		// Recur to read element into slice.
		if err := d.unmarshal(v.Index(n), start, depth+1); err != nil {
			v.SetLen(n)
			return err
		}
		return nil

	case reflect.Bool, reflect.Float32, reflect.Float64, reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64, reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64, reflect.Uintptr, reflect.String:
		saveData = v

	case reflect.Struct:
		typ := v.Type()
		if typ == nameType {
			v.Set(reflect.ValueOf(start.Name))
			break
		}

		sv = v
		tinfo, err = getTypeInfo(typ)
		if err != nil {
			return err
		}

		// Validate and assign element name.
		if tinfo.xmlname != nil {
			finfo := tinfo.xmlname
			if finfo.name != "" && finfo.name != start.Name.Local {
				return UnmarshalError("expected element type <" + finfo.name + "> but have <" + start.Name.Local + ">")
			}
			if finfo.xmlns != "" && finfo.xmlns != start.Name.Space {
				e := "expected element <" + finfo.name + "> in name space " + finfo.xmlns + " but have "
				if start.Name.Space == "" {
					e += "no name space"
				} else {
					e += start.Name.Space
				}
				return UnmarshalError(e)
			}
			fv := finfo.value(sv, initNilPointers)
			if _, ok := fv.Interface().(Name); ok {
				fv.Set(reflect.ValueOf(start.Name))
			}
		}

		// Assign attributes.
		for _, a := range start.Attr {
			handled := false
			any := -1
			for i := range tinfo.fields {
				finfo := &tinfo.fields[i]
				switch finfo.flags & fMode {
				case fAttr:
					strv := finfo.value(sv, initNilPointers)
					if a.Name.Local == finfo.name && (finfo.xmlns == "" || finfo.xmlns == a.Name.Space) {
						if err := d.unmarshalAttr(strv, a); err != nil {
							return err
						}
						handled = true
					}

				case fAny | fAttr:
					if any == -1 {
						any = i
					}
				}
			}
			if !handled && any >= 0 {
				finfo := &tinfo.fields[any]
				strv := finfo.value(sv, initNilPointers)
				if err := d.unmarshalAttr(strv, a); err != nil {
					return err
				}
			}
		}

		// Determine whether we need to save character data or comments.
		for i := range tinfo.fields {
			finfo := &tinfo.fields[i]
			switch finfo.flags & fMode {
			case fCDATA, fCharData:
				if !saveData.IsValid() {
					saveData = finfo.value(sv, initNilPointers)
				}

			case fComment:
				if !saveComment.IsValid() {
					saveComment = finfo.value(sv, initNilPointers)
				}

			case fAny, fAny | fElement:
				if !saveAny.IsValid() {
					saveAny = finfo.value(sv, initNilPointers)
				}

			case fInnerXML:
				if !saveXML.IsValid() {
					saveXML = finfo.value(sv, initNilPointers)
					if d.saved == nil {
						saveXMLIndex = 0
						d.saved = new(bytes.Buffer)
					} else {
						saveXMLIndex = d.savedOffset()
					}
				}
			}
		}
	}

	// Find end element.
	// Process sub-elements along the way.
Loop:
	for {
		var savedOffset int
		if saveXML.IsValid() {
			savedOffset = d.savedOffset()
		}
		tok, err := d.Token()
		if err != nil {
			return err
		}
		switch t := tok.(type) {
		case StartElement:
			consumed := false
			if sv.IsValid() {
				// unmarshalPath can call unmarshal, so we need to pass the depth through so that
				// we can continue to enforce the maximum recursion limit.
				consumed, err = d.unmarshalPath(tinfo, sv, nil, &t, depth)
				if err != nil {
					return err
				}
				if !consumed && saveAny.IsValid() {
					consumed = true
					if err := d.unmarshal(saveAny, &t, depth+1); err != nil {
						return err
					}
				}
			}
			if !consumed {
				if err := d.Skip(); err != nil {
					return err
				}
			}

		case EndElement:
			if saveXML.IsValid() {
				saveXMLData = d.saved.Bytes()[saveXMLIndex:savedOffset]
				if saveXMLIndex == 0 {
					d.saved = nil
				}
			}
			break Loop

		case CharData:
			if saveData.IsValid() {
				data = append(data, t...)
			}

		case Comment:
			if saveComment.IsValid() {
				comment = append(comment, t...)
			}
		}
	}

	if saveData.IsValid() && saveData.CanInterface() && saveData.Type().Implements(textUnmarshalerType) {
		if err := saveData.Interface().(encoding.TextUnmarshaler).UnmarshalText(data); err != nil {
			return err
		}
		saveData = reflect.Value{}
	}

	if saveData.IsValid() && saveData.CanAddr() {
		pv := saveData.Addr()
		if pv.CanInterface() && pv.Type().Implements(textUnmarshalerType) {
			if err := pv.Interface().(encoding.TextUnmarshaler).UnmarshalText(data); err != nil {
				return err
			}
			saveData = reflect.Value{}
		}
	}

	if err := copyValue(saveData, data); err != nil {
		return err
	}

	switch t := saveComment; t.Kind() {
	case reflect.String:
		t.SetString(string(comment))
	case reflect.Slice:
		t.Set(reflect.ValueOf(comment))
	}

	switch t := saveXML; t.Kind() {
	case reflect.String:
		t.SetString(string(saveXMLData))
	case reflect.Slice:
		if t.Type().Elem().Kind() == reflect.Uint8 {
			t.Set(reflect.ValueOf(saveXMLData))
		}
	}

	return nil
}

func copyValue(dst reflect.Value, src []byte) (err error) {
	dst0 := dst

	if dst.Kind() == reflect.Pointer {
		if dst.IsNil() {
			dst.Set(reflect.New(dst.Type().Elem()))
		}
		dst = dst.Elem()
	}

	// Save accumulated data.
	switch dst.Kind() {
	case reflect.Invalid:
		// Probably a comment.
	default:
		return errors.New("cannot unmarshal into " + dst0.Type().String())
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
		if len(src) == 0 {
			dst.SetInt(0)
			return nil
		}
		itmp, err := strconv.ParseInt(strings.TrimSpace(string(src)), 10, dst.Type().Bits())
		if err != nil {
			return err
		}
		dst.SetInt(itmp)
	case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64, reflect.Uintptr:
		if len(src) == 0 {
			dst.SetUint(0)
			return nil
		}
		utmp, err := strconv.ParseUint(strings.TrimSpace(string(src)), 10, dst.Type().Bits())
		if err != nil {
			return err
		}
		dst.SetUint(utmp)
	case reflect.Float32, reflect.Float64:
		if len(src) == 0 {
			dst.SetFloat(0)
			return nil
		}
		ftmp, err := strconv.ParseFloat(strings.TrimSpace(string(src)), dst.Type().Bits())
		if err != nil {
			return err
		}
		dst.SetFloat(ftmp)
	case reflect.Bool:
		if len(src) == 0 {
			dst.SetBool(false)
			return nil
		}
		value, err := strconv.ParseBool(strings.TrimSpace(string(src)))
		if err != nil {
			return err
		}
		dst.SetBool(value)
	case reflect.String:
		dst.SetString(string(src))
	case reflect.Slice:
		if len(src) == 0 {
			// non-nil to flag presence
			src = []byte{}
		}
		dst.SetBytes(src)
	}
	return nil
}

// unmarshalPath walks down an XML structure looking for wanted
// paths, and calls unmarshal on them.
// The consumed result tells whether XML elements have been consumed
// from the Decoder until start's matching end element, or if it's
// still untouched because start is uninteresting for sv's fields.
func (d *Decoder) unmarshalPath(tinfo *typeInfo, sv reflect.Value, parents []string, start *StartElement, depth int) (consumed bool, err error) {
	recurse := false
Loop:
	for i := range tinfo.fields {
		finfo := &tinfo.fields[i]
		if finfo.flags&fElement == 0 || len(finfo.parents) < len(parents) || finfo.xmlns != "" && finfo.xmlns != start.Name.Space {
			continue
		}
		for j := range parents {
			if parents[j] != finfo.parents[j] {
				continue Loop
			}
		}
		if len(finfo.parents) == len(parents) && finfo.name == start.Name.Local {
			// It's a perfect match, unmarshal the field.
			return true, d.unmarshal(finfo.value(sv, initNilPointers), start, depth+1)
		}
		if len(finfo.parents) > len(parents) && finfo.parents[len(parents)] == start.Name.Local {
			// It's a prefix for the field. Break and recurse
			// since it's not ok for one field path to be itself
			// the prefix for another field path.
			recurse = true

			// We can reuse the same slice as long as we
			// don't try to append to it.
			parents = finfo.parents[:len(parents)+1]
			break
		}
	}
	if !recurse {
		// We have no business with this element.
		return false, nil
	}
	// The element is not a perfect match for any field, but one
	// or more fields have the path to this element as a parent
	// prefix. Recurse and attempt to match these.
	for {
		var tok Token
		tok, err = d.Token()
		if err != nil {
			return true, err
		}
		switch t := tok.(type) {
		case StartElement:
			// the recursion depth of unmarshalPath is limited to the path length specified
			// by the struct field tag, so we don't increment the depth here.
			consumed2, err := d.unmarshalPath(tinfo, sv, parents, &t, depth)
			if err != nil {
				return true, err
			}
			if !consumed2 {
				if err := d.Skip(); err != nil {
					return true, err
				}
			}
		case EndElement:
			return true, nil
		}
	}
}

// Skip reads tokens until it has consumed the end element
// matching the most recent start element already consumed,
// skipping nested structures.
// It returns nil if it finds an end element matching the start
// element; otherwise it returns an error describing the problem.
func (d *Decoder) Skip() error {
	var depth int64
	for {
		tok, err := d.Token()
		if err != nil {
			return err
		}
		switch tok.(type) {
		case StartElement:
			depth++
		case EndElement:
			if depth == 0 {
				return nil
			}
			depth--
		}
	}
}
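The mapping rules in the Unmarshal documentation are easier to follow with a small caller-side example. The sketch below is not part of this file: the `Person` type, the `decodePerson` helper, and the sample document are hypothetical, invented only to exercise the `XMLName`, `"name,attr"`, `"a>b"`, `",comment"`, and `",any,attr"` rules through the public `encoding/xml` API.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Person is a hypothetical type used only to illustrate the tag rules.
type Person struct {
	XMLName xml.Name   `xml:"person"`        // element must be named <person>
	ID      int        `xml:"id,attr"`       // attribute parsed as a decimal integer
	Name    string     `xml:"name"`          // direct child element
	Emails  []string   `xml:"contact>email"` // nested path; one slice entry per match
	Note    string     `xml:",comment"`      // accumulated comment text
	Extra   []xml.Attr `xml:",any,attr"`     // attributes not matched by another field
}

// sample is an invented document. <unknown> matches no field and is discarded.
const sample = `<person id="42" role="admin">
  <name>Ada</name>
  <contact>
    <email>ada@example.com</email>
    <email>ada@example.org</email>
  </contact>
  <!-- sample record -->
  <unknown>ignored</unknown>
</person>`

// decodePerson is a hypothetical helper that wraps xml.Unmarshal.
func decodePerson(data string) (Person, error) {
	var p Person
	err := xml.Unmarshal([]byte(data), &p)
	return p, err
}

func main() {
	p, err := decodePerson(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("id=%d name=%q emails=%q note=%q\n", p.ID, p.Name, p.Emails, p.Note)
	for _, a := range p.Extra {
		fmt.Printf("extra attribute %s=%q\n", a.Name.Local, a.Value)
	}

	// The XMLName tag makes a mismatched root element an error.
	if _, err := decodePerson(`<animal id="1"/>`); err != nil {
		fmt.Println("mismatch:", err)
	}
}
```

Passing `&p` matters here: DecodeElement rejects non-pointer and nil-pointer arguments before any XML is read.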