RE2 regular expression syntax reference
-------------------------------------

Single characters:
.              any character, possibly including newline (s=true)
[xyz]          character class
[^xyz]         negated character class
\d             Perl character class
\D             negated Perl character class
[[:alpha:]]    ASCII character class
[[:^alpha:]]   negated ASCII character class
\pN            Unicode character class (one-letter name)
\p{Greek}      Unicode character class
\PN            negated Unicode character class (one-letter name)
\P{Greek}      negated Unicode character class
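
A small illustration of the bracket and dot entries above, sketched with Python's `re` module as a stand-in. Python's engine is not RE2; the sketch only assumes the two agree on these particular constructs, and the sample strings are made up.

```python
import re

# [xyz] matches one listed character; [^xyz] matches one character NOT listed.
vowels = re.findall(r"[aeiou]", "regex")
non_vowels = re.findall(r"[^aeiou]", "regex")

# "." skips newlines by default and matches them once the s flag is set.
dot_default = re.findall(r".", "a\nb")
dot_with_s = re.findall(r"(?s).", "a\nb")

print(vowels)       # ['e', 'e']
print(non_vowels)   # ['r', 'g', 'x']
print(dot_default)  # ['a', 'b']
print(dot_with_s)   # ['a', '\n', 'b']
```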

Composites:
xy     «x» followed by «y»
x|y    «x» or «y» (prefer «x»)
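
The "prefer «x»" note on alternation is easiest to see when both branches can match at the same position. A minimal sketch, again using Python's `re` as a stand-in that is assumed to share this leftmost-first behaviour; the patterns are illustrative only.

```python
import re

# xy: «x» followed by «y», in that order.
concat_ok = re.fullmatch(r"ab", "ab") is not None
concat_wrong_order = re.fullmatch(r"ab", "ba") is not None

# x|y: the left alternative wins when both could match at the same position.
left_first = re.match(r"a|ab", "ab").group()
reordered = re.match(r"ab|a", "ab").group()

print(concat_ok, concat_wrong_order)  # True False
print(left_first, reordered)          # a ab
```

Reordering the branches changes the result even though both patterns describe the same set of strings.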

Repetitions:
x*        zero or more «x», prefer more
x+        one or more «x», prefer more
x?        zero or one «x», prefer one
x{n,m}    «n» or «n»+1 or ... or «m» «x», prefer more
x{n,}     «n» or more «x», prefer more
x{n}      exactly «n» «x»
x*?       zero or more «x», prefer fewer
x+?       one or more «x», prefer fewer
x??       zero or one «x», prefer zero
x{n,m}?   «n» or «n»+1 or ... or «m» «x», prefer fewer
x{n,}?    «n» or more «x», prefer fewer
x{n}?     exactly «n» «x»
x{}       (== x*) NOT SUPPORTED vim
x{-}      (== x*?) NOT SUPPORTED vim
x{-n}     (== x{n}?) NOT SUPPORTED vim
x=        (== x?) NOT SUPPORTED vim
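
To make "prefer more" and "prefer fewer" concrete, the sketch below runs each supported repetition against a fixed input of five «a» characters. It uses Python's `re` as a stand-in (not RE2 itself), on the assumption that these operators behave the same way; `TEXT` and `first_match` are names invented for the example.

```python
import re

TEXT = "aaaaa"

def first_match(pattern):
    """Return the text that `pattern` matches at the start of TEXT."""
    return re.match(pattern, TEXT).group()

GREEDY = [r"a*", r"a+", r"a?", r"a{2,4}", r"a{2,}", r"a{3}"]

for pattern in GREEDY:
    lazy = pattern + "?"  # appending ? flips the preference to "fewer"
    print(f"{pattern:8} -> {first_match(pattern)!r:9} {lazy:8} -> {first_match(lazy)!r}")
```

With this input, `a{2,4}` takes four characters and `a{2,4}?` takes two, while `a{3}` and `a{3}?` both take exactly three.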

Possessive repetitions:
x*+       zero or more «x», possessive NOT SUPPORTED
x++       one or more «x», possessive NOT SUPPORTED
x?+       zero or one «x», possessive NOT SUPPORTED
x{n,m}+   «n» or ... or «m» «x», possessive NOT SUPPORTED
x{n,}+    «n» or more «x», possessive NOT SUPPORTED
x{n}+     exactly «n» «x», possessive NOT SUPPORTED

Grouping:
(re)           numbered capturing group
(?P<name>re)   named & numbered capturing group
(?<name>re)    named & numbered capturing group NOT SUPPORTED
(?'name're)    named & numbered capturing group NOT SUPPORTED
(?:re)         non-capturing group
(?flags)       set flags within current group; non-capturing
(?flags:re)    set flags during re; non-capturing
(?#text)       comment NOT SUPPORTED
(?|x|y|z)      branch numbering reset NOT SUPPORTED
(?>re)         possessive match of «re» NOT SUPPORTED
re@>           possessive match of «re» NOT SUPPORTED vim
%(re)          non-capturing group NOT SUPPORTED vim
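
The three supported group forms can be seen side by side in one pattern. This is a sketch using Python's `re`, which happens to share the `(re)`, `(?P<name>re)` and `(?:re)` spellings; the date pattern and the name `year` are assumptions made up for the example.

```python
import re

# One named group, one non-capturing group, one plain numbered group.
DATE = re.compile(r"(?P<year>\d{4})-(?:\d{2})-(\d{2})")
m = DATE.fullmatch("2024-12-24")

by_name = m.group("year")  # named groups can be read by name ...
by_number = m.group(1)     # ... and are numbered as well
all_groups = m.groups()    # (?:...) does not add a group

print(by_name, by_number, all_groups)  # 2024 2024 ('2024', '24')
```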

Flags:
i    case-insensitive (default false)
m    multi-line mode: «^» and «$» match begin/end line in addition to begin/end text (default false)
s    let «.» match «\n» (default false)
U    ungreedy: swap meaning of «x*» and «x*?», «x+» and «x+?», etc (default false)

Flag syntax is «xyz» (set) or «-xyz» (clear) or «xy-z» (set «xy», clear «z»).
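
The set and clear forms of the flag syntax can be tried out as below. This is a hedged sketch with Python's `re`, which accepts the same `(?i)`, `(?i:re)`, `(?-i:re)` and `(?is-m:re)` spellings for the `i`, `m` and `s` flags; it has no `U` flag, so that one is not shown. The helper `matches` and the test strings are invented for the example.

```python
import re

def matches(pattern, text):
    """True if `pattern` matches the whole of `text`."""
    return re.fullmatch(pattern, text) is not None

results = {
    "i set for the whole pattern": matches(r"(?i)abc", "ABC"),
    "i set only inside the group": matches(r"(?i:a)b", "Ab"),
    "outside the group is still case-sensitive": matches(r"(?i:a)b", "AB"),
    "i cleared inside the group": matches(r"(?i)a(?-i:b)", "AB"),
    "set i and s, clear m": matches(r"(?is-m:a.b)", "A\nB"),
}

# m flag: ^ and $ also match at the start and end of each line.
lines = re.findall(r"(?m)^\w+$", "one\ntwo")

for label, ok in results.items():
    print(f"{label}: {ok}")
print(lines)  # ['one', 'two']
```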

Empty strings:
^         at beginning of text or line («m»=true)
$         at end of text (like «\z» not «\Z») or line («m»=true)
\A        at beginning of text
\b        at word boundary («\w» on one side and «\W», «\A», or «\z» on the other)
\B        not a word boundary
\G        at beginning of subtext being searched NOT SUPPORTED pcre
\G        at end of last match NOT SUPPORTED perl
\Z        at end of text, or before newline at end of text NOT SUPPORTED
\z        at end of text
(?=re)    before text matching «re» NOT SUPPORTED
(?!re)    before text not matching «re» NOT SUPPORTED
(?<=re)   after text matching «re» NOT SUPPORTED
(?<!re)   after text not matching «re» NOT SUPPORTED
re&       before text matching «re» NOT SUPPORTED vim
re@=      before text matching «re» NOT SUPPORTED vim
re@!      before text not matching «re» NOT SUPPORTED vim
re@<=     after text matching «re» NOT SUPPORTED vim
re@<!     after text not matching «re» NOT SUPPORTED vim
\zs       sets start of match (= \K) NOT SUPPORTED vim
\ze       sets end of match NOT SUPPORTED vim
\%^       beginning of file NOT SUPPORTED vim
\%$       end of file NOT SUPPORTED vim
\%V       on screen NOT SUPPORTED vim
\%#       cursor position NOT SUPPORTED vim
\%'m      mark «m» position NOT SUPPORTED vim
\%23l     in line 23 NOT SUPPORTED vim
\%23c     in column 23 NOT SUPPORTED vim
\%23v     in virtual column 23 NOT SUPPORTED vim
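
A sketch of the supported start-of-text and word-boundary assertions, using Python's `re` as a stand-in. `re.ASCII` is passed so that «\w» is ASCII-only, matching the definition of «\b» above. The end-of-text assertions are left out on purpose: Python's `$` without the `m` flag also matches before a trailing newline, and Python spells the end-of-text assertion differently, so they would not illustrate the table faithfully. The sample strings are made up.

```python
import re

TEXT = "cat concat cat."

# \b: «\w» on one side, non-word character or text edge on the other.
whole_words = [m.start() for m in re.finditer(r"\bcat\b", TEXT, re.ASCII)]
# \B: a position that is NOT a word boundary (here: inside "concat").
inside_word = [m.start() for m in re.finditer(r"\Bcat", TEXT, re.ASCII)]

LINES = "one\ntwo"
caret_default = re.findall(r"^\w+", LINES)        # beginning of text only
caret_multiline = re.findall(r"(?m)^\w+", LINES)  # beginning of every line
text_start = re.findall(r"(?m)\A\w+", LINES)      # \A is unaffected by m

print(whole_words, inside_word)                    # [0, 11] [7]
print(caret_default, caret_multiline, text_start)  # ['one'] ['one', 'two'] ['one']
```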

Escape sequences:
\a            bell (== \007)
\f            form feed (== \014)
\t            horizontal tab (== \011)
\n            newline (== \012)
\r            carriage return (== \015)
\v            vertical tab character (== \013)
\*            literal «*», for any punctuation character «*»
\123          octal character code (up to three digits)
\x7F          hex character code (exactly two digits)
\x{10FFFF}    hex character code
\C            match a single byte even in UTF-8 mode
\Q...\E       literal text «...» even if «...» has punctuation

\1            backreference NOT SUPPORTED
\b            backspace NOT SUPPORTED (use «\010»)
\cK           control char ^K NOT SUPPORTED (use «\001» etc)
\e            escape NOT SUPPORTED (use «\033»)
\g1           backreference NOT SUPPORTED
\g{1}         backreference NOT SUPPORTED
\g{+1}        backreference NOT SUPPORTED
\g{-1}        backreference NOT SUPPORTED
\g{name}      named backreference NOT SUPPORTED
\g<name>      subroutine call NOT SUPPORTED
\g'name'      subroutine call NOT SUPPORTED
\k<name>      named backreference NOT SUPPORTED
\k'name'      named backreference NOT SUPPORTED
\lX           lowercase «X» NOT SUPPORTED
\ux           uppercase «x» NOT SUPPORTED
\L...\E       lowercase text «...» NOT SUPPORTED
\K            reset beginning of «$0» NOT SUPPORTED
\N{name}      named Unicode character NOT SUPPORTED
\R            line break NOT SUPPORTED
\U...\E       upper case text «...» NOT SUPPORTED
\X            extended Unicode sequence NOT SUPPORTED

\%d123        decimal character 123 NOT SUPPORTED vim
\%xFF         hex character FF NOT SUPPORTED vim
\%o123        octal character 123 NOT SUPPORTED vim
\%u1234       Unicode character 0x1234 NOT SUPPORTED vim
\%U12345678   Unicode character 0x12345678 NOT SUPPORTED vim
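
The octal values listed for the control escapes can be checked mechanically. The sketch below does so with Python's `re`, which is assumed to agree with the table for these escapes and for three-digit octal and two-digit hex codes. Python has no «\Q...\E» or «\x{...}», so `re.escape` is used as the nearest Python-side equivalent of quoting literal text; that substitution is an assumption of this example, not part of the syntax above.

```python
import re

# Escape -> octal code, copied from the table above.
OCTAL = {r"\a": "007", r"\f": "014", r"\t": "011",
         r"\n": "012", r"\r": "015", r"\v": "013"}

# Each escape should match exactly the character with the listed code.
escape_ok = {
    esc: re.fullmatch(esc, chr(int(code, 8))) is not None
    for esc, code in OCTAL.items()
}

punct_ok = re.fullmatch(r"\*\+\?", "*+?") is not None  # escaped punctuation is literal
codes_ok = re.fullmatch(r"\101\x42", "AB") is not None  # octal 101 = "A", hex 42 = "B"

# Stand-in for \Q...\E: quote a whole string so its punctuation is literal.
literal = re.escape("a.b*c")
quoted_exact = re.fullmatch(literal, "a.b*c") is not None
quoted_other = re.fullmatch(literal, "aXbbc") is not None
unquoted_other = re.fullmatch("a.b*c", "aXbbc") is not None

print(escape_ok)
print(punct_ok, codes_ok, quoted_exact, quoted_other, unquoted_other)
```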

Character class elements:
x         single character
A-Z       character range (inclusive)
\d        Perl character class
[:foo:]   ASCII character class «foo»
\p{Foo}   Unicode character class «Foo»
\pF       Unicode character class «F» (one-letter name)

Named character classes as character class elements:
[\d]          digits (== \d)
[^\d]         not digits (== \D)
[\D]          not digits (== \D)
[^\D]         not not digits (== \d)
[[:name:]]    named ASCII class inside character class (== [:name:])
[^[:name:]]   named ASCII class inside negated character class (== [:^name:])
[\p{Name}]    named Unicode property inside character class (== \p{Name})
[^\p{Name}]   named Unicode property inside negated character class (== \P{Name})

Perl character classes:
\d   digits (== [0-9])
\D   not digits (== [^0-9])
\s   whitespace (== [\t\n\f\r ])
\S   not whitespace (== [^\t\n\f\r ])
\w   word characters (== [0-9A-Za-z_])
\W   not word characters (== [^0-9A-Za-z_])

\h   horizontal space NOT SUPPORTED
\H   not horizontal space NOT SUPPORTED
\v   vertical space NOT SUPPORTED
\V   not vertical space NOT SUPPORTED
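
The "==" equivalences for the Perl classes can be checked by enumerating the ASCII range. The sketch uses Python's `re` with `re.ASCII` as a stand-in; under that flag «\d» and «\w» are assumed to coincide with the table. Python's «\s» is a known exception, since it also accepts vertical tab, and the code reports that difference rather than hiding it. `members` and `EQUIVALENTS` are names invented for the example.

```python
import re

ASCII_CHARS = [chr(i) for i in range(128)]

def members(pattern):
    """Set of ASCII characters matched by `pattern` (ASCII-only class semantics)."""
    compiled = re.compile(pattern, re.ASCII)
    return {c for c in ASCII_CHARS if compiled.fullmatch(c)}

# Left side and right side should match exactly the same characters.
EQUIVALENTS = {
    r"\d": r"[0-9]",
    r"\D": r"[^0-9]",
    r"\w": r"[0-9A-Za-z_]",
    r"\W": r"[^0-9A-Za-z_]",
    r"[^\d]": r"\D",
    r"[^\D]": r"\d",
}
agree = {left: members(left) == members(right) for left, right in EQUIVALENTS.items()}

# The table's whitespace set versus what Python's own \s accepts.
table_space = members(r"[\t\n\f\r ]")
python_only = members(r"\s") - table_space

print(agree)
print(sorted(map(ord, python_only)))  # [11], i.e. vertical tab
```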

ASCII character classes:
[:alnum:]    alphanumeric (== [0-9A-Za-z])
[:alpha:]    alphabetic (== [A-Za-z])
[:ascii:]    ASCII (== [\x00-\x7F])
[:blank:]    blank (== [\t ])
[:cntrl:]    control (== [\x00-\x1F\x7F])
[:digit:]    digits (== [0-9])
[:graph:]    graphical (== [!-~] == [A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~])
[:lower:]    lower case (== [a-z])
[:print:]    printable (== [ -~] == [ [:graph:]])
[:punct:]    punctuation (== [!-/:-@[-`{-~])
[:space:]    whitespace (== [\t\n\v\f\r ])
[:upper:]    upper case (== [A-Z])
[:word:]     word characters (== [0-9A-Za-z_])
[:xdigit:]   hex digit (== [0-9A-Fa-f])
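
The bracket expansions in this table can be cross-checked against Python's `string` constants. Python's `re` does not accept the `[:name:]` spelling, so the sketch copies each expansion from the table by hand into a dictionary; `POSIX` and `members` are hypothetical names for this example, and the «[» in the punctuation range is backslash-escaped only to keep Python's parser unambiguous.

```python
import re
import string

# Class name -> body of the equivalent bracket expression, from the table above.
POSIX = {
    "alnum": r"0-9A-Za-z",
    "alpha": r"A-Za-z",
    "ascii": r"\x00-\x7F",
    "blank": r"\t ",
    "cntrl": r"\x00-\x1F\x7F",
    "digit": r"0-9",
    "graph": r"!-~",
    "lower": r"a-z",
    "print": r" -~",
    "punct": r"!-/:-@\[-`{-~",
    "space": r"\t\n\v\f\r ",
    "upper": r"A-Z",
    "word": r"0-9A-Za-z_",
    "xdigit": r"0-9A-Fa-f",
}

def members(name):
    """Set of ASCII characters belonging to the named class."""
    pattern = re.compile("[" + POSIX[name] + "]")
    return {chr(i) for i in range(128) if pattern.fullmatch(chr(i))}

for name in POSIX:
    print(f"[:{name}:] has {len(members(name))} characters")
```

For instance, `members("punct")` equals `set(string.punctuation)`, and `members("print")` is `members("graph")` plus the space character, as the table states.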

Unicode character class names--general category:
C     other
Cc    control
Cf    format
Cn    unassigned code points NOT SUPPORTED
Co    private use
Cs    surrogate
L     letter
LC    cased letter NOT SUPPORTED
L&    cased letter NOT SUPPORTED
Ll    lowercase letter
Lm    modifier letter
Lo    other letter
Lt    titlecase letter
Lu    uppercase letter
M     mark
Mc    spacing mark
Me    enclosing mark
Mn    non-spacing mark
N     number
Nd    decimal number
Nl    letter number
No    other number
P     punctuation
Pc    connector punctuation
Pd    dash punctuation
Pe    close punctuation
Pf    final punctuation
Pi    initial punctuation
Po    other punctuation
Ps    open punctuation
S     symbol
Sc    currency symbol
Sk    modifier symbol
Sm    math symbol
So    other symbol
Z     separator
Zl    line separator
Zp    paragraph separator
Zs    space separator
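
The one-letter names are the first letter of the two-letter categories, so «N» covers «Nd», «Nl» and «No». Python's `re` has no «\p{...}» syntax, so the sketch below looks the category up with the standard `unicodedata` module instead of using a regular expression. `in_category` is a hypothetical helper written for this example; it knows nothing about which names RE2 itself accepts.

```python
import unicodedata

def in_category(ch, name):
    """True if `ch` is in general category `name`, e.g. "N" or "Nd".

    Hypothetical helper: a one-letter name matches every two-letter
    category that starts with it.
    """
    return unicodedata.category(ch).startswith(name)

SAMPLES = ["5", "A", "a", "_", "-", "(", ")", "+", " "]

for ch in SAMPLES:
    print(repr(ch), unicodedata.category(ch))
```

With this helper, `in_category("5", "N")` and `in_category("5", "Nd")` are both true, while `in_category("5", "L")` is false.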

Unicode character class names--scripts:
Arabic                   Arabic
Armenian                 Armenian
Balinese                 Balinese
Bamum                    Bamum
Batak                    Batak
Bengali                  Bengali
Bopomofo                 Bopomofo
Brahmi                   Brahmi
Braille                  Braille
Buginese                 Buginese
Buhid                    Buhid
Canadian_Aboriginal      Canadian Aboriginal
Carian                   Carian
Chakma                   Chakma
Cham                     Cham
Cherokee                 Cherokee
Common                   characters not specific to one script
Coptic                   Coptic
Cuneiform                Cuneiform
Cypriot                  Cypriot
Cyrillic                 Cyrillic
Deseret                  Deseret
Devanagari               Devanagari
Egyptian_Hieroglyphs     Egyptian Hieroglyphs
Ethiopic                 Ethiopic
Georgian                 Georgian
Glagolitic               Glagolitic
Gothic                   Gothic
Greek                    Greek
Gujarati                 Gujarati
Gurmukhi                 Gurmukhi
Han                      Han
Hangul                   Hangul
Hanunoo                  Hanunoo
Hebrew                   Hebrew
Hiragana                 Hiragana
Imperial_Aramaic         Imperial Aramaic
Inherited                inherit script from previous character
Inscriptional_Pahlavi    Inscriptional Pahlavi
Inscriptional_Parthian   Inscriptional Parthian
Javanese                 Javanese
Kaithi                   Kaithi
Kannada                  Kannada
Katakana                 Katakana
Kayah_Li                 Kayah Li
Kharoshthi               Kharoshthi
Khmer                    Khmer
Lao                      Lao
Latin                    Latin
Lepcha                   Lepcha
Limbu                    Limbu
Linear_B                 Linear B
Lycian                   Lycian
Lydian                   Lydian
Malayalam                Malayalam
Mandaic                  Mandaic
Meetei_Mayek             Meetei Mayek
Meroitic_Cursive         Meroitic Cursive
Meroitic_Hieroglyphs     Meroitic Hieroglyphs
Miao                     Miao
Mongolian                Mongolian
Myanmar                  Myanmar
New_Tai_Lue              New Tai Lue (aka Simplified Tai Lue)
Nko                      Nko
Ogham                    Ogham
Ol_Chiki                 Ol Chiki
Old_Italic               Old Italic
Old_Persian              Old Persian
Old_South_Arabian        Old South Arabian
Old_Turkic               Old Turkic
Oriya                    Oriya
Osmanya                  Osmanya
Phags_Pa                 'Phags Pa
Phoenician               Phoenician
Rejang                   Rejang
Runic                    Runic
Saurashtra               Saurashtra
Sharada                  Sharada
Shavian                  Shavian
Sinhala                  Sinhala
Sora_Sompeng             Sora Sompeng
Sundanese                Sundanese
Syloti_Nagri             Syloti Nagri
Syriac                   Syriac
Tagalog                  Tagalog
Tagbanwa                 Tagbanwa
Tai_Le                   Tai Le
Tai_Tham                 Tai Tham
Tai_Viet                 Tai Viet
Takri                    Takri
Tamil                    Tamil
Telugu                   Telugu
Thaana                   Thaana
Thai                     Thai
Tibetan                  Tibetan
Tifinagh                 Tifinagh
Ugaritic                 Ugaritic
Vai                      Vai
Yi                       Yi

Vim character classes:
\i    identifier character NOT SUPPORTED vim
\I    «\i» except digits NOT SUPPORTED vim
\k    keyword character NOT SUPPORTED vim
\K    «\k» except digits NOT SUPPORTED vim
\f    file name character NOT SUPPORTED vim
\F    «\f» except digits NOT SUPPORTED vim
\p    printable character NOT SUPPORTED vim
\P    «\p» except digits NOT SUPPORTED vim
\s    whitespace character (== [ \t]) NOT SUPPORTED vim
\S    non-white space character (== [^ \t]) NOT SUPPORTED vim
\d    digits (== [0-9]) vim
\D    not «\d» vim
\x    hex digits (== [0-9A-Fa-f]) NOT SUPPORTED vim
\X    not «\x» NOT SUPPORTED vim
\o    octal digits (== [0-7]) NOT SUPPORTED vim
\O    not «\o» NOT SUPPORTED vim
\w    word character vim
\W    not «\w» vim
\h    head of word character NOT SUPPORTED vim
\H    not «\h» NOT SUPPORTED vim
\a    alphabetic NOT SUPPORTED vim
\A    not «\a» NOT SUPPORTED vim
\l    lowercase NOT SUPPORTED vim
\L    not lowercase NOT SUPPORTED vim
\u    uppercase NOT SUPPORTED vim
\U    not uppercase NOT SUPPORTED vim
\_x   «\x» plus newline, for any «x» NOT SUPPORTED vim

Vim flags:
\c   ignore case NOT SUPPORTED vim
\C   match case NOT SUPPORTED vim
\m   magic NOT SUPPORTED vim
\M   nomagic NOT SUPPORTED vim
\v   verymagic NOT SUPPORTED vim
\V   verynomagic NOT SUPPORTED vim
\Z   ignore differences in Unicode combining characters NOT SUPPORTED vim

Magic:
(?{code})             arbitrary Perl code NOT SUPPORTED perl
(??{code})            postponed arbitrary Perl code NOT SUPPORTED perl
(?n)                  recursive call to regexp capturing group «n» NOT SUPPORTED
(?+n)                 recursive call to relative group «+n» NOT SUPPORTED
(?-n)                 recursive call to relative group «-n» NOT SUPPORTED
(?C)                  PCRE callout NOT SUPPORTED pcre
(?R)                  recursive call to entire regexp (== (?0)) NOT SUPPORTED
(?&name)              recursive call to named group NOT SUPPORTED
(?P=name)             named backreference NOT SUPPORTED
(?P>name)             recursive call to named group NOT SUPPORTED
(?(cond)true|false)   conditional branch NOT SUPPORTED
(?(cond)true)         conditional branch NOT SUPPORTED
(*ACCEPT)             make regexps more like Prolog NOT SUPPORTED
(*COMMIT)             NOT SUPPORTED
(*F)                  NOT SUPPORTED
(*FAIL)               NOT SUPPORTED
(*MARK)               NOT SUPPORTED
(*PRUNE)              NOT SUPPORTED
(*SKIP)               NOT SUPPORTED
(*THEN)               NOT SUPPORTED
(*ANY)                set newline convention NOT SUPPORTED
(*ANYCRLF)            NOT SUPPORTED
(*CR)                 NOT SUPPORTED
(*CRLF)               NOT SUPPORTED
(*LF)                 NOT SUPPORTED
(*BSR_ANYCRLF)        set \R convention NOT SUPPORTED pcre
(*BSR_UNICODE)        NOT SUPPORTED pcre