package norm
Import Path
vendor/golang.org/x/text/unicode/norm (on go.dev)
Dependency Relation
imports 6 packages, and imported by one package
Involved Source Files
composition.go
forminfo.go
input.go
iter.go
normalize.go
Package norm contains types and functions for normalizing Unicode strings.
readwriter.go
tables15.0.0.go
transform.go
trie.go
Package-Level Type Names (total 3)
A Form denotes a canonical representation of Unicode code points.
The Unicode-defined normalization and equivalence forms are:
NFC Unicode Normalization Form C
NFD Unicode Normalization Form D
NFKC Unicode Normalization Form KC
NFKD Unicode Normalization Form KD
For a Form f, this documentation uses the notation f(x) to mean
the bytes or string x converted to the given form.
A position n in x is called a boundary if conversion to the form can
proceed independently on both sides:
f(x) == append(f(x[0:n]), f(x[n:])...)
References: https://unicode.org/reports/tr15/ and
https://unicode.org/notes/tn5/.
Append returns f(append(out, b...)).
The buffer out must be nil, empty, or equal to f(out).
AppendString returns f(append(out, []byte(s)...)).
The buffer out must be nil, empty, or equal to f(out).
Bytes returns f(b). It may return b itself if f(b) == b.
FirstBoundary returns the position i of the first boundary in b
or -1 if b contains no boundary.
FirstBoundaryInString returns the position i of the first boundary in s
or -1 if s contains no boundary.
IsNormal returns true if b == f(b).
IsNormalString returns true if s == f(s).
LastBoundary returns the position i of the last boundary in b
or -1 if b contains no boundary.
NextBoundary reports the index of the boundary between the first and next
segment in b or -1 if atEOF is false and there are not enough bytes to
determine this boundary.
NextBoundaryInString reports the index of the boundary between the first and
next segment in s or -1 if atEOF is false and there are not enough bytes to
determine this boundary.
Properties returns properties for the first rune in s.
PropertiesString returns properties for the first rune in s.
QuickSpan returns a boundary n such that b[0:n] == f(b[0:n]).
It is not guaranteed to return the largest such n.
QuickSpanString returns a boundary n such that s[0:n] == f(s[0:n]).
It is not guaranteed to return the largest such n.
Reader returns a new reader that implements Read
by reading data from r and returning f(data).
Reset implements the Reset method of the transform.Transformer interface.
Span implements transform.SpanningTransformer. It returns a boundary n such
that b[0:n] == f(b[0:n]). It is not guaranteed to return the largest such n.
SpanString returns a boundary n such that s[0:n] == f(s[0:n]).
It is not guaranteed to return the largest such n.
String returns f(s).
Transform implements the Transform method of the transform.Transformer
interface. It may need to write segments of up to MaxSegmentSize at once.
Callers should either handle ErrShortDst by growing dst, or make dst at least
MaxTransformChunkSize bytes, to guarantee progress.
Writer returns a new writer that implements Write(b)
by writing f(b) to w. The returned writer may use an
internal buffer to maintain state across Write calls.
Calling its Close method writes any buffered data to w.
Form : vendor/golang.org/x/text/transform.SpanningTransformer
Form : vendor/golang.org/x/text/transform.Transformer
func (*Iter).Init(f Form, src []byte)
func (*Iter).InitString(f Form, src string)
const NFC
const NFD
const NFKC
const NFKD
An Iter iterates over a string or byte slice, while normalizing it
to a given Form.
Done returns true if there is no more input to process.
Init initializes i to iterate over src after normalizing it to Form f.
InitString initializes i to iterate over src after normalizing it to Form f.
Next returns f(i.input[i.Pos():n]), where n is a boundary of i.input.
For any input a and b for which f(a) == f(b), subsequent calls
to Next will return the same segments.
Modifying runes are grouped together with the preceding starter, if such a starter exists.
Although not guaranteed, n will typically be the smallest possible n.
Pos returns the byte position at which the next call to Next will commence processing.
Seek sets the segment to be returned by the next call to Next to start
at position p. It is the responsibility of the caller to set p to the
start of a segment.
*Iter : io.Seeker
Properties provides access to normalization properties of a rune.
BoundaryAfter returns true if subsequent runes cannot combine with or otherwise
interact with this or previous runes.
BoundaryBefore returns true if this rune starts a new segment and
cannot combine with any rune on the left.
CCC returns the canonical combining class of the underlying rune.
Decomposition returns the decomposition for the underlying rune
or nil if there is none.
LeadCCC returns the CCC of the first rune in the decomposition.
If there is no decomposition, LeadCCC equals CCC.
Size returns the length of UTF-8 encoding of the rune.
TrailCCC returns the CCC of the last rune in the decomposition.
If there is no decomposition, TrailCCC equals CCC.
func Form.Properties(s []byte) Properties
func Form.PropertiesString(s string) Properties
Package-Level Constants (total 8)
GraphemeJoiner is inserted after maxNonStarters non-starter runes.
MaxSegmentSize is the maximum size of a byte buffer needed to consider any
sequence of starter and non-starter runes for the purpose of normalization.
MaxTransformChunkSize indicates the maximum number of bytes that Transform
may need to write atomically for any Form. Making a destination buffer at
least this size ensures that Transform can always make progress and that
the user does not need to grow the buffer on an ErrShortDst.
Version is the Unicode edition from which the tables are derived.