package syntax
Import Path
github.com/dlclark/regexp2/syntax (on go.dev)
Dependency Relation
imports 11 packages, and imported by one package
Involved Source Files
charclass.go
code.go
escape.go
parser.go
prefix.go
replacerdata.go
tree.go
writer.go
Package-Level Type Names (total 11)
anchorDescription returns a human-readable description of the anchors
AnchorLoc : expvar.Var
AnchorLoc : fmt.Stringer
const AnchorBeginning
BmPrefix precomputes the Boyer-Moore
tables for fast string scanning. These tables allow
you to scan for the first occurrence of a string within
a large body of text without examining every character.
The performance of the heuristic depends on the actual
string and the text being searched, but usually, the longer
the string that is being searched for, the fewer characters
need to be examined.
Dump returns the contents of the filter as a human readable string
When a regex is anchored, we can do a quick IsMatch test instead of a Scan
Scan uses the Boyer-Moore algorithm to find the first occurrence
of the specified string within text, beginning at index, and
constrained within beglimit and endlimit.
The direction and case-sensitivity of the match is determined
by the arguments to the RegexBoyerMoore constructor.
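The skip-table idea described for BmPrefix can be sketched in a few lines. The following is not the package's implementation: it is a minimal Boyer-Moore-Horspool variant (bad-character table only, left to right, case-sensitive), and the names buildShift and horspoolIndex are hypothetical. It shows why a longer pattern tends to let the scan skip more of the text.

```go
package main

import "fmt"

// buildShift precomputes the bad-character table: for every rune of the
// pattern except the last, how far the window may slide when that rune is
// aligned with the window's final position.
func buildShift(pattern []rune) map[rune]int {
	shift := make(map[rune]int, len(pattern))
	for i := 0; i < len(pattern)-1; i++ {
		shift[pattern[i]] = len(pattern) - 1 - i
	}
	return shift
}

// horspoolIndex returns the rune index of the first occurrence of pattern in
// text at or after start, or -1 if there is none.
func horspoolIndex(text, pattern []rune, start int) int {
	if start < 0 {
		start = 0
	}
	n, m := len(text), len(pattern)
	if m == 0 {
		if start <= n {
			return start
		}
		return -1
	}
	shift := buildShift(pattern)
	for pos := start; pos+m <= n; {
		// Compare the window against the pattern from right to left.
		j := m - 1
		for j >= 0 && text[pos+j] == pattern[j] {
			j--
		}
		if j < 0 {
			return pos
		}
		// Slide by the table entry for the rune under the window's last
		// position; runes absent from the table allow a full-length skip.
		if s, ok := shift[text[pos+m-1]]; ok {
			pos += s
		} else {
			pos += m
		}
	}
	return -1
}

func main() {
	text := []rune("the quick brown fox jumps over the lazy dog")
	fmt.Println(horspoolIndex(text, []rune("lazy"), 0))
	fmt.Println(horspoolIndex(text, []rune("the"), 1))
	fmt.Println(horspoolIndex(text, []rune("cat"), 0))
}
```

A rune that does not occur in the pattern moves the window forward by the full pattern length, which is where the savings over a character-by-character scan come from.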
(*BmPrefix) String() string
*BmPrefix : expvar.Var
*BmPrefix : fmt.Stringer
CharSet combines start-end rune ranges and unicode categories representing a set of characters
CharIn returns true if the rune is in our character set (either ranges or categories).
It handles negations and subtracted sub-charsets.
Copy makes a deep copy to prevent accidental mutation of a set
( CharSet) HasSubtraction() bool
( CharSet) IsEmpty() bool
( CharSet) IsMergeable() bool
( CharSet) IsNegated() bool
( CharSet) IsSingleton() bool
( CharSet) IsSingletonInverse() bool
SingletonChar will return the char from the first range without validation.
It assumes you have checked for IsSingleton or IsSingletonInverse and will panic given bad input
gets a human-readable description for a set string
CharSet : expvar.Var
CharSet : fmt.Stringer
func CharSet.Copy() CharSet
// the set of zero-length start anchors (RegexFCD.Bol, etc)
// the fixed prefix string as a Boyer-Moore machine (may be null)
// mapping of user group numbers -> impl group slots
// number of impl group slots
// the code
// the set of candidate first characters (may be null)
// true if right to left
// character set table
// string table
// how many instructions use backtracking
(*Code) Dump() string
OpcodeDescription is a human-readable string of the specific offset
func Write(tree *RegexTree) (*Code, error)
An Error describes a failure to parse a regular expression
and gives the offending expression.
Args []interface{}
Code ErrorCode
Expr string
(*Error) Error() string
*Error : error
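As an illustration of how the three fields of Error could combine into a message, here is a minimal sketch. The types parseError and errorCode are hypothetical stand-ins, and the exact message layout is an assumption; only the field names and the two message templates are taken from this page.

```go
package main

import (
	"errors"
	"fmt"
)

// errorCode is a hypothetical stand-in for ErrorCode: a message template
// that may contain formatting verbs.
type errorCode string

const (
	errMissingParen     errorCode = "missing closing )"
	errUndefinedBackRef errorCode = "reference to undefined group number %v"
)

// parseError is a hypothetical stand-in carrying the same three fields
// as Error: the code, the offending expression, and optional arguments.
type parseError struct {
	Code errorCode
	Expr string
	Args []interface{}
}

// Error fills the template with Args (if any) and appends the expression.
// The wording of the surrounding text is an assumption.
func (e *parseError) Error() string {
	msg := string(e.Code)
	if len(e.Args) > 0 {
		msg = fmt.Sprintf(msg, e.Args...)
	}
	return "error parsing regexp: " + msg + " in `" + e.Expr + "`"
}

// check is a toy validator used only to produce an error value.
func check(expr string) error {
	if expr == `(a)\2` {
		return &parseError{Code: errUndefinedBackRef, Expr: expr, Args: []interface{}{2}}
	}
	return nil
}

func main() {
	err := check(`(a)\2`)
	var pe *parseError
	if errors.As(err, &pe) {
		// Callers can branch on the code and still print the full message.
		fmt.Println(pe.Code == errUndefinedBackRef)
		fmt.Println(pe)
	}
}
```

Keeping the code separate from the formatted text lets callers compare against a constant instead of matching on message strings.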
An ErrorCode describes a failure to parse a regular expression.
( ErrorCode) String() string
ErrorCode : expvar.Var
ErrorCode : fmt.Stringer
const ErrInternalError
const Onerep
func NewReplacerData(rep string, caps map[int]int, capsize int, capnames map[string]int, op RegexOptions) (*ReplacerData, error)
func Parse(re string, op RegexOptions) (*RegexTree, error)
const IgnoreCase
Package-Level Functions (total 8)
CharDescription produces a human-readable description for a single character.
func IsECMAWordChar(r rune) bool
According to UTS#18 Unicode Regular Expressions (http://www.unicode.org/reports/tr18/)
RL 1.4 Simple Word Boundaries The class of <word_character> includes all Alphabetic
values from the Unicode character database, from UnicodeData.txt [UData], plus the U+200C
ZERO WIDTH NON-JOINER and U+200D ZERO WIDTH JOINER.
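The word-character rule quoted above can be approximated with the standard unicode package. This is a sketch, not the package's code: the function name isWordRune is hypothetical, and standing in for "Alphabetic" with the general categories L, Mn, Nd, and Pc is an assumption made for the example.

```go
package main

import (
	"fmt"
	"unicode"
)

// isWordRune reports whether r counts as a word character under the
// approximation used here: letters (L), nonspacing marks (Mn), decimal
// digits (Nd), connector punctuation (Pc), plus the two zero-width
// joiner code points named in UTS#18 RL 1.4.
func isWordRune(r rune) bool {
	return unicode.In(r, unicode.L, unicode.Mn, unicode.Nd, unicode.Pc) ||
		r == '\u200C' || // ZERO WIDTH NON-JOINER
		r == '\u200D' // ZERO WIDTH JOINER
}

func main() {
	for _, r := range []rune{'a', 'é', '5', '_', '-', ' ', '\u200D'} {
		fmt.Printf("%U %v\n", r, isWordRune(r))
	}
}
```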
NewReplacerData will populate a reusable replacer data struct based on the given replacement string
and the capture group data from a regexp
Parse converts a regex string into a parse tree
Package-Level Variables (total 18)
var DigitClass func() *CharSet
var ECMAAnyClass func() *CharSet
var ECMADigitClass func() *CharSet
var ECMASpaceClass func() *CharSet
var ECMAWordClass func() *CharSet
ErrReplacementError is a general error during parsing the replacement text
var NotDigitClass func() *CharSet
var NotECMADigitClass func() *CharSet
var NotECMASpaceClass func() *CharSet
var NotECMAWordClass func() *CharSet
var NotRE2SpaceClass func() *CharSet
var NotSpaceClass func() *CharSet
var NotWordClass func() *CharSet
var RE2SpaceClass func() *CharSet
var SpaceClass func() *CharSet
Package-Level Constants (total 113)
where the regex can be pegged
const Back = 128 // bit to indicate that we're backtracking.
const Back2 = 256 // bit to indicate that we're backtracking on a second branch.
const Backjump = 35 // zap back to saved state
const Beginning = 18 // \A
const Bol = 14 // ^
const Boundary = 16 // \b
const Branchcount = 28 // back jump,limit branch++ if zero<=c<limit
const Branchmark = 24 // back jump branch first for loop
const Capturemark = 32 // back group define group
const Ci = 512 // bit to indicate that we're case-insensitive.
const Compiled = 8 // "c"
const Debug = 128 // "d"
const E = 1 // should be escaped
const ECMABoundary = 41 // \b
const ECMAScript = 256 // "e"
const End = 21 // \Z
const EndZ = 20 // \Z
const Eol = 15 // $
const ErrAlternationCantCapture = "alternation conditions do not capture and cannot be named"
const ErrAlternationCantHaveComment = "alternation conditions cannot be comments"
const ErrBadClassInCharRange = "cannot include class \\%v in character range"
const ErrCapNumNotZero = "capture number cannot be zero"
const ErrCaptureGroupOutOfRange = "capture group number out of range"
const ErrConditionalExpression = "illegal conditional (?(...)) expression"
const ErrIllegalEndEscape = "illegal \\ at end of pattern"
const ErrIncompleteSlashP = "incomplete \\p{X} character escape"
internal issue
const ErrInvalidCharRange = "invalid character class range"
const ErrInvalidGroupName = "invalid group name: group names must begin with a word character and...
const ErrInvalidHex = "hex values may not be larger than 0x10FFFF"
const ErrInvalidRepeatOp = "invalid nested repetition operator"
const ErrInvalidRepeatSize = "invalid repeat count"
const ErrInvalidUTF8 = "invalid UTF-8"
const ErrMalformedNameRef = "malformed \\k<...> named back reference"
const ErrMalformedReference = "(?(%v) ) malformed"
const ErrMalformedSlashP = "malformed \\p{X} character escape"
const ErrMissingBrace = "missing closing }"
const ErrMissingControl = "missing control character"
const ErrMissingParen = "missing closing )"
const ErrMissingRepeatArgument = "missing argument to repetition operator"
const ErrReversedCharRange = "[%c-%c] range in reverse order"
const ErrSubtractionMustBeLast = "a subtraction must be the last element in a character class"
const ErrTooFewHex = "insufficient hexadecimal digits"
const ErrTooManyAlternates = "too many | in (?()|)"
const ErrUndefinedBackRef = "reference to undefined group number %v"
const ErrUndefinedNameRef = "reference to undefined group name %v"
const ErrUndefinedReference = "(?(%v) ) reference to undefined group"
const ErrUnexpectedParen = "unexpected )"
const ErrUnknownSlashP = "unknown unicode category, script, or property '%v'"
const ErrUnrecognizedControl = "unrecognized control character"
const ErrUnrecognizedEscape = "unrecognized escape sequence \\%v"
const ErrUnrecognizedGrouping = "unrecognized grouping construct: (%v"
const ErrUnterminatedBracket = "unterminated [] set"
Parser errors
const ExplicitCapture = 4 // "n"
const Forejump = 36 // zap backtracking state
const Getmark = 33 // back recall position
const Goto = 38 // jump just go
const IgnoreCase RegexOptions = 1 // "i"
const IgnorePatternWhitespace = 32 // "x"
const Lazybranch = 23 // back jump straight first
const Lazybranchcount = 29 // back jump,limit same, but straight first
const Lazybranchmark = 25 // back jump straight first for loop
const LowercaseAdd = 1 // Add arg.
const LowercaseBad = 3 // Bitwise and with 1 and add original.
const LowercaseBor = 2 // Bitwise or with 1.
const LowercaseSet = 0 // Set to arg.
const Mask = 63 // Mask to get unmodified ordinary operator
MaxPrefixSize is the largest number of runes we'll use for a Boyer-Moore prefix
const Multi = 12 // lef string abcd
const Multiline = 2 // "m"
const Nonboundary = 17 // \B
const NonECMABoundary = 42 // \B
const Nothing = 22 // Reject!
const Notone = 10 // lef char [^a]
const Notonelazy = 7 // lef,back char,min,max .{,n}?
const Notoneloop = 4 // lef,back char,min,max .{,n}
const Notonerep = 1 // lef,back char,min,max .{n}
const Nullcount = 26 // back val set counter, null mark
const Nullmark = 30 // back save position
const One = 9 // lef char a
const Onelazy = 6 // lef,back char,min,max a {,n}?
const Oneloop = 3 // lef,back char,min,max a {,n}
const Prune = 39 // prune it baby
const RE2 = 512 // RE2 compat mode
const Ref = 13 // lef group \#
const RightToLeft = 64 // "r"
const Rtl = 64 // bit to indicate that we're reverse scanning.
const S = 4 // ordinary stopper
const Set = 11 // lef set [a-z\s] \w \s \d
const Setcount = 27 // back val set counter, make mark
const Setjump = 34 // back save backtrack state
const Setlazy = 8 // lef,back set,min,max [\d]{,n}?
const Setloop = 5 // lef,back set,min,max [\d]{,n}
const Setmark = 31 // back save position
const Setrep = 2 // lef,back set,min,max [\d]{n}
const Singleline = 16 // "s"
const Start = 19 // \G
const Stop = 40 // done!
const Testref = 37 // backtrack if ref undefined
const Unicode = 1024 // "u"
const X = 2 // whitespace
const Z = 3 // ScanBlank stopper
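The opcode constants above pair a base operator (values up to Mask = 63) with modifier bits (Rtl, Back, Back2, Ci). A minimal sketch of how such an instruction word could be split apart follows. The constants are local copies of the values listed on this page, the helper describe is hypothetical, and the particular combinations used in main are illustrative rather than taken from the package.

```go
package main

import (
	"fmt"
	"strings"
)

// Local copies of the values listed above.
const (
	mask  = 63  // Mask: keeps the unmodified operator
	rtl   = 64  // Rtl: reverse scanning
	back  = 128 // Back: backtracking
	back2 = 256 // Back2: backtracking on a second branch
	ci    = 512 // Ci: case-insensitive
)

// describe splits an instruction word into its base operator number and
// the names of the modifier bits that are set.
func describe(op int) string {
	var mods []string
	if op&rtl != 0 {
		mods = append(mods, "Rtl")
	}
	if op&back != 0 {
		mods = append(mods, "Back")
	}
	if op&back2 != 0 {
		mods = append(mods, "Back2")
	}
	if op&ci != 0 {
		mods = append(mods, "Ci")
	}
	return fmt.Sprintf("op=%d mods=[%s]", op&mask, strings.Join(mods, ","))
}

func main() {
	const one = 9         // One: match a single char
	const lazybranch = 23 // Lazybranch
	fmt.Println(describe(one | ci | rtl))
	fmt.Println(describe(lazybranch | back))
}
```

Because every base operator fits in the low six bits, masking with 63 recovers it regardless of which modifier bits are set.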
The pages are generated with Golds v0.8.2. (GOOS=linux GOARCH=amd64)