ladybird/Userland/Libraries/LibRegex
Timothy Flynn 562d4e497b LibRegex: Treat pattern string characters as unsigned
For example, consider the following pattern:

    new RegExp('\ud834\udf06', 'u')

With this pattern, the regex parser should insert the UTF-8 encoded
bytes 0xf0, 0x9d, 0x8c, and 0x86. However, because these characters are
currently treated as plain char, which is signed here, they hold
negative values since they are all > 0x7f. When these characters are
then cast to u64, sign extension fills the upper bits with ones. The
result is that these bytes are inserted as 0xfffffffffffffff0,
0xffffffffffffff9d, etc.

Fortunately, there are only a few places where we insert bytecode with
the raw characters. In these places, be sure to treat the bytes as u8
before they are cast to u64.
2021-08-20 19:16:33 +02:00
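
The sign-extension problem described in the commit message can be reproduced in isolation. The following is a minimal sketch, not the actual LibRegex change: the helper names `widen_signed` and `widen_unsigned` are hypothetical, the `u8`/`u64` aliases stand in for the AK types, and `signed char` is used explicitly because the signedness of plain `char` is implementation-defined.

```cpp
#include <cstdint>

// Stand-ins for the AK integer aliases (assumption: same widths).
using u8 = std::uint8_t;
using u64 = std::uint64_t;

// Hypothetical helper: widen a signed character straight to u64.
// A negative value is sign-extended, so the upper 56 bits become ones.
constexpr u64 widen_signed(signed char ch)
{
    return static_cast<u64>(ch);
}

// Hypothetical helper: go through u8 first, so only the low 8 bits
// survive and the upper 56 bits of the result are zero.
constexpr u64 widen_unsigned(signed char ch)
{
    return static_cast<u64>(static_cast<u8>(ch));
}

// -16 has the bit pattern 0xf0 and -99 has the bit pattern 0x9d, the
// first two UTF-8 bytes of U+1D306 from the example pattern.
static_assert(widen_signed(-16) == 0xfffffffffffffff0ULL, "sign-extended");
static_assert(widen_signed(-99) == 0xffffffffffffff9dULL, "sign-extended");
static_assert(widen_unsigned(-16) == 0xf0ULL, "low byte only");
static_assert(widen_unsigned(-99) == 0x9dULL, "low byte only");
```

The `static_assert` lines check both behaviours at compile time. The intermediate cast to `u8` is what the commit message means by treating the bytes as u8 before they are cast to u64.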
C LibRegex+LibC: Make re_nsub available to the user 2021-07-13 07:04:06 +02:00
CMakeLists.txt LibRegex+LibUnicode: Begin implementing Unicode property escapes 2021-07-30 21:26:31 +01:00
Forward.h Everything: Move to SPDX license identifiers in all files. 2021-04-22 11:22:27 +02:00
Regex.h Everything: Move to SPDX license identifiers in all files. 2021-04-22 11:22:27 +02:00
RegexByteCode.cpp LibRegex: Ensure the GoBack operation decrements the code unit index 2021-08-18 09:47:09 +04:30
RegexByteCode.h LibRegex: Implement and use a REPEAT operation for bytecode repetition 2021-08-15 11:43:45 +01:00
RegexDebug.h LibRegex: Switch to east-const style 2021-07-23 21:19:21 +04:30
RegexError.h LibRegex+LibUnicode: Begin implementing Unicode property escapes 2021-07-30 21:26:31 +01:00
RegexLexer.cpp LibRegex: Convert regex::Lexer to inherit from GenericLexer 2021-08-19 23:49:25 +02:00
RegexLexer.h LibRegex: Convert regex::Lexer to inherit from GenericLexer 2021-08-19 23:49:25 +02:00
RegexMatch.h LibRegex+LibJS: Change capture group names from a String to a FlyString 2021-08-19 23:49:25 +02:00
RegexMatcher.cpp LibRegex: Implement and use a REPEAT operation for bytecode repetition 2021-08-15 11:43:45 +01:00
RegexMatcher.h AK: Move FormatParser definition from header to implementation file 2021-08-19 23:49:25 +02:00
RegexOptions.h LibRegex: Allow RegexOptions to be declared at compile time 2021-07-30 21:26:31 +01:00
RegexParser.cpp LibRegex: Treat pattern string characters as unsigned 2021-08-20 19:16:33 +02:00
RegexParser.h LibRegex: Treat pattern string characters as unsigned 2021-08-20 19:16:33 +02:00