package graphemes

Import Path
	github.com/clipperhouse/uax29/v2/graphemes (on go.dev)

Dependency Relation
	imports 3 packages, and imported by one package

Involved Source Files
	iterator.go
		Package graphemes implements Unicode grapheme cluster boundaries:
		https://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries
	splitfunc.go
	trie.go

Package-Level Type Names (total 2)
	Type Parameters:
		T: iterators.Stringish
	Iterator *iterators.Iterator[T]
		End returns the byte position after the current token in the original data.
		Next advances the iterator to the next token. It returns false when there are no remaining tokens or an error occurred.
		Reset resets the iterator to the beginning of the data.
		SetText sets the text for the iterator to operate on, and resets all state.
		Split sets the SplitFunc for the Iterator.
		Start returns the byte position of the current token in the original data.
		Value returns the current token.
		Iterator : github.com/apache/arrow-go/v18/arrow/compute/exec.ArrayIter[bool]
		func FromBytes(b []byte) Iterator[[]byte]
		func FromString(s string) Iterator[string]
	Scanner *bufio.Scanner
		Buffer controls memory allocation by the Scanner. It sets the initial buffer to use when scanning and the maximum size of buffer that may be allocated during scanning. The contents of the buffer are ignored. The maximum token size must be less than the larger of max and cap(buf). If max <= cap(buf), [Scanner.Scan] will use this buffer only and do no allocation. By default, [Scanner.Scan] uses an internal buffer and sets the maximum token size to [MaxScanTokenSize]. Buffer panics if it is called after scanning has started.
		Bytes returns the most recent token generated by a call to [Scanner.Scan]. The underlying array may point to data that will be overwritten by a subsequent call to Scan. It does no allocation.
		Err returns the first non-EOF error that was encountered by the [Scanner].
		Scan advances the [Scanner] to the next token, which will then be available through the [Scanner.Bytes] or [Scanner.Text] method. It returns false when there are no more tokens, either by reaching the end of the input or an error. After Scan returns false, the [Scanner.Err] method will return any error that occurred during scanning, except that if it was [io.EOF], [Scanner.Err] will return nil. Scan panics if the split function returns too many empty tokens without advancing the input. This is a common error mode for scanners.
		Split sets the split function for the [Scanner]. The default split function is [ScanLines]. Split panics if it is called after scanning has started.
		Text returns the most recent token generated by a call to [Scanner.Scan] as a newly allocated string holding its bytes.
		Scanner : github.com/apache/arrow-go/v18/internal/hashing.ByteSlice
		func FromReader(r io.Reader) *Scanner

Package-Level Functions (total 3)
	FromBytes returns an iterator for the grapheme clusters in the input bytes. Iterate while Next() is true, and access the grapheme via Value().
	FromReader returns a Scanner, to split graphemes per https://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries. It embeds a [bufio.Scanner], so you can use its methods. Iterate through graphemes by calling Scan() until false, then check Err().
	FromString returns an iterator for the grapheme clusters in the input string. Iterate while Next() is true, and access the grapheme via Value().
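The iterator contract described above (loop while Next() is true, read the token with Value(), locate it with Start() and End()) can be illustrated with a small stand-in. This is only a sketch under stated assumptions: runeIter is a hypothetical type, not the package's Iterator, and it yields one code point per token instead of one grapheme cluster, so that the example needs nothing beyond the standard library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeIter is a hypothetical stand-in, not the package's Iterator. It mirrors
// the method names Next, Value, Start, End and Reset, but each token is a
// single code point rather than a grapheme cluster.
type runeIter struct {
	data       string
	start, end int
}

// Next advances to the next token and reports whether one was found.
func (it *runeIter) Next() bool {
	if it.end >= len(it.data) {
		return false
	}
	_, size := utf8.DecodeRuneInString(it.data[it.end:])
	it.start, it.end = it.end, it.end+size
	return true
}

// Value returns the current token.
func (it *runeIter) Value() string { return it.data[it.start:it.end] }

// Start returns the byte offset of the current token in the original data.
func (it *runeIter) Start() int { return it.start }

// End returns the byte offset just past the current token.
func (it *runeIter) End() int { return it.end }

// Reset rewinds the iterator to the beginning of the data.
func (it *runeIter) Reset() { it.start, it.end = 0, 0 }

func main() {
	text := "aé!"
	it := &runeIter{data: text}
	for it.Next() {
		// Start and End are byte offsets, so they slice the original text
		// back to exactly the token that Value returns.
		same := text[it.Start():it.End()] == it.Value()
		fmt.Printf("%q [%d:%d] %v\n", it.Value(), it.Start(), it.End(), same)
	}
}
```

With the real package the loop would presumably have the same shape, with the iterator obtained from FromString or FromBytes instead of being constructed by hand.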

Package-Level Variables (only one)
	SplitFunc is a bufio.SplitFunc implementation of Unicode grapheme cluster segmentation, for use with bufio.Scanner. See https://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries.