ladybird/Tests
Andreas Kling a3e82eaad3 AK: Introduce the new String, replacement for DeprecatedString
DeprecatedString (formerly String) has been with us since the start,
and it has served us well. However, it has a number of shortcomings
that I'd like to address.

Some of these issues are hard if not impossible to solve incrementally
inside of DeprecatedString, so instead of doing that, let's build a new
String class and then incrementally move over to it instead.

Problems in DeprecatedString:

- It assumes string allocation never fails. This makes it impossible
  to use in allocation-sensitive contexts, and is the reason we had to
  ban DeprecatedString from the kernel entirely.

- The awkward null state. DeprecatedString can be null. It's different
  from the empty state, although null strings are considered empty.
  All code is immediately nicer when using Optional<DeprecatedString>
  but DeprecatedString came before Optional, which is how we ended up
  like this.

- The encoding of the underlying data is ambiguous. For the most part,
  we use it as if it's always UTF-8, but there have been cases where
  we pass around strings in other encodings (e.g. ISO 8859-1).

- operator[] and length() are used to iterate over DeprecatedString one
  byte at a time. This is done all over the codebase, and will *not*
  give the right results unless the string is all ASCII.

How we solve these issues in the new String:

- Functions that may allocate now return ErrorOr<String> so that ENOMEM
  errors can be passed to the caller.

- String has no null state. Use Optional<String> when needed.

- String is always UTF-8. This is validated when constructing a String.
  We may need to add a bypass for this in the future, for cases where
  you have a known-good string, but for now: validate all the things!

- There is no operator[] or length(). You can get the underlying data
  with bytes(), but for iterating over code points, you should be using
  a UTF-8 iterator.
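
To illustrate why validation and code point iteration matter, here is a
minimal sketch in standard C++ (not AK code). The helper decode_utf8() is a
hypothetical name invented for this example, and std::optional stands in for
the error propagation that ErrorOr<String> provides. It rejects malformed
input up front and yields code points rather than bytes.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical helper for illustration only; this is not part of AK.
// Decodes UTF-8 bytes into code points, or returns an empty optional
// if the input is not valid UTF-8.
std::optional<std::vector<char32_t>> decode_utf8(std::string_view bytes)
{
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
    static constexpr char32_t min_for_length[] = { 0, 0, 0x80, 0x800, 0x10000 };

    std::vector<char32_t> code_points;
    std::size_t i = 0;
    while (i < bytes.size()) {
        auto lead = static_cast<unsigned char>(bytes[i]);
        std::size_t length = 0;
        char32_t value = 0;

        if (lead < 0x80) {
            length = 1;
            value = lead;
        } else if ((lead & 0xE0) == 0xC0) {
            length = 2;
            value = lead & 0x1F;
        } else if ((lead & 0xF0) == 0xE0) {
            length = 3;
            value = lead & 0x0F;
        } else if ((lead & 0xF8) == 0xF0) {
            length = 4;
            value = lead & 0x07;
        } else {
            return std::nullopt; // Stray continuation byte or invalid lead byte.
        }

        if (i + length > bytes.size())
            return std::nullopt; // Sequence is truncated.

        for (std::size_t j = 1; j < length; ++j) {
            auto continuation = static_cast<unsigned char>(bytes[i + j]);
            if ((continuation & 0xC0) != 0x80)
                return std::nullopt;
            value = (value << 6) | (continuation & 0x3F);
        }

        // Reject overlong encodings, UTF-16 surrogates and out-of-range values.
        if (value < min_for_length[length])
            return std::nullopt;
        if (value >= 0xD800 && value <= 0xDFFF)
            return std::nullopt;
        if (value > 0x10FFFF)
            return std::nullopt;

        code_points.push_back(value);
        i += length;
    }
    return code_points;
}
```

With this sketch, the 6-byte input "h\xC3\xA9llo" decodes to 5 code points,
while the single byte "\xE9" (an ISO 8859-1 "é") is rejected. Indexing that
same data byte by byte would split the two-byte sequence in half.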

Furthermore, it has two nifty new features:

- String implements a small string optimization (SSO) for strings that
  can fit entirely within a pointer. This means up to 3 bytes on 32-bit
  platforms, and 7 bytes on 64-bit platforms. Such small strings will
  not be heap-allocated.

- String can create substrings without making a deep copy of the
  substring. Instead, the superstring gets +1 refcount from the
  substring, and it acts like a view into the superstring. To make
  substrings like this, use the substring_with_shared_superstring() API.
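
The two features above can be sketched in standard C++ as follows. This is
not the AK implementation: SharedSubstring, owner_count() and
max_short_string_bytes are names invented for this example, and the
assumption that one byte of the pointer-sized slot is reserved for
bookkeeping is mine, chosen only because it matches the 3 and 7 byte limits
quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <utility>

// Assumption: one byte of the pointer-sized slot is used for bookkeeping,
// leaving 3 bytes on 32-bit platforms and 7 bytes on 64-bit platforms.
constexpr std::size_t max_short_string_bytes = sizeof(void*) - 1;

// Hypothetical class for illustration only; this is not AK::String.
// A substring shares ownership of the superstring's storage instead of
// copying the characters.
class SharedSubstring {
public:
    explicit SharedSubstring(std::string text)
        : m_storage(std::make_shared<std::string>(std::move(text)))
        , m_start(0)
        , m_length(m_storage->size())
    {
    }

    // Returns a view-like substring that keeps the superstring alive.
    SharedSubstring substring(std::size_t start, std::size_t length) const
    {
        assert(start <= m_length && length <= m_length - start);
        return SharedSubstring(m_storage, m_start + start, length);
    }

    // Note: the returned view is not guaranteed to be null-terminated.
    std::string_view view() const
    {
        return std::string_view(*m_storage).substr(m_start, m_length);
    }

    // Number of strings currently sharing the underlying storage.
    long owner_count() const { return m_storage.use_count(); }

private:
    SharedSubstring(std::shared_ptr<std::string const> storage, std::size_t start, std::size_t length)
        : m_storage(std::move(storage))
        , m_start(start)
        , m_length(length)
    {
    }

    std::shared_ptr<std::string const> m_storage;
    std::size_t m_start;
    std::size_t m_length;
};
```

Taking substring(0, 5) of "hello, world" yields a view whose data is
followed by ',' rather than a null byte, which is why shared-superstring
substrings and guaranteed null termination do not mix.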

One caveat:

- String does not guarantee that the underlying data is null-terminated
  like DeprecatedString does today. While this was nifty in a handful of
  places where we were calling C functions, it did stand in the way of
  shared-superstring substrings.
2022-12-06 15:21:26 +01:00
AK AK: Introduce the new String, replacement for DeprecatedString 2022-12-06 15:21:26 +01:00
Kernel AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
LibAudio AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
LibC AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
LibCompress AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
LibCore AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
LibCpp Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
LibCrypto AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
LibEDID LibEDID: Fix handling extension maps 2022-01-24 19:29:06 +00:00
LibELF AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
LibGfx AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
LibGL AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
LibIMAP Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
LibJS Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
LibLocale AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
LibMarkdown AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
LibPDF AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
LibRegex Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
LibSQL Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
LibTest LibTest: Add EXPECT_NO_CRASH 2021-12-19 14:22:06 -08:00
LibTextCodec Everywhere: Add sv suffix to strings relying on StringView(char const*) 2022-07-12 23:11:35 +02:00
LibThreading LibC: Remove the LibPthread interface target 2022-07-19 11:00:35 +01:00
LibTimeZone LibTimeZone+LibJS: Update to TZDB version 2022e 2022-10-18 16:01:44 +02:00
LibTLS AK+Everywhere: Rename String to DeprecatedString 2022-12-06 08:54:33 +01:00
LibTTF LibGfx: Move TTF files from TrueTypeFont/ to Font/TrueType/ 2022-04-09 23:48:18 +02:00
LibUnicode LibUnicode: Update code point ideographic replacements for Unicode 15 2022-10-07 18:17:40 +01:00
LibVideo LibVideo: Read Matroska lazily so that large files can start quickly 2022-11-25 23:28:39 +01:00
LibWasm Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
LibWeb Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
LibXML Everywhere: Add sv suffix to strings relying on StringView(char const*) 2022-07-12 23:11:35 +02:00
Spreadsheet Everywhere: Rename to_{string => deprecated_string}() where applicable 2022-12-06 08:54:33 +01:00
UserspaceEmulator Everywhere: Mark dependencies of most targets as PRIVATE 2022-11-01 14:49:09 +00:00
CMakeLists.txt LibVideo: Add test to ensure that a VP9 WebM file will decode 2022-10-09 20:32:40 -06:00