mirror of https://github.com/wez/wezterm.git synced 2024-12-22 12:51:31 +03:00

Add wezterm-bidi crate

In order to support RTL/BIDI, wezterm needs a bidi implementation.  I
don't think a well-conforming Rust implementation exists today; the
implementations I found didn't pass 100% of the conformance tests.

So I decided to port "bidiref", the reference implementation of the UBA
described in http://unicode.org/reports/tr9/ to Rust.

This implementation focuses on conformance: no special measures have
been taken to optimize it so far. My priority was to ensure that all of
the approximately 780,000 test cases in the Unicode 14 data pass.  With
the tests passing 100%, future performance improvements can be made
with confidence.

The API isn't completely designed/fully baked.  Until I get to hooking
it up to wezterm's shaper, I'm not 100% sure exactly what I'll need.
There's a good discussion on API in
https://github.com/open-i18n/rust-unic/issues/273 that suggests omitting
"legacy" operations such as reordering. I suspect that wezterm may
actually need that function to support monospace text layout in some
terminal scenarios, but regardless: reordering is part of the
conformance test suite so it remains a part of the API.

That said: the API does model the major operations as separate
phases, so you should be able to pay for just what you use:

* Resolving the embedding levels from a paragraph
* Returning paragraph runs of those levels (and their directions)
* Returning the whitespace-level-reset runs for a line-slice within the
  paragraph
* Returning the reordered indices + levels for a line-slice within the
  paragraph.

refs: https://github.com/wez/wezterm/issues/784
refs: https://github.com/kas-gui/kas-text/issues/20
Wez Furlong 2022-01-21 08:42:44 -07:00
parent 31b09f840a
commit 601a85e12b
19 changed files with 602651 additions and 0 deletions
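
The phases listed in the commit message can be illustrated with a small, self-contained sketch. This is not the wezterm-bidi API: `level_runs`, `reorder_indices`, and `Direction` are hypothetical names chosen for illustration, and the input is assumed to be a slice of already-resolved embedding levels for one line. The reordering follows rule L2 of UAX #9 (reverse every contiguous sequence at a given level or higher, from the highest level down to the lowest odd level); the whitespace level reset of rule L1 is assumed to have been applied already.

```rust
/// Direction implied by an embedding level: even levels are
/// left-to-right, odd levels are right-to-left (UAX #9).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Direction {
    LeftToRight,
    RightToLeft,
}

/// Hypothetical helper: group resolved levels into
/// (start, end_exclusive, level, direction) runs.
fn level_runs(levels: &[u8]) -> Vec<(usize, usize, u8, Direction)> {
    let mut runs = Vec::new();
    let mut start = 0;
    for i in 1..=levels.len() {
        // Close the current run at the end of input or when the level changes.
        if i == levels.len() || levels[i] != levels[start] {
            let level = levels[start];
            let dir = if level % 2 == 0 {
                Direction::LeftToRight
            } else {
                Direction::RightToLeft
            };
            runs.push((start, i, level, dir));
            start = i;
        }
    }
    runs
}

/// Hypothetical helper: return the visual order of a line as a list of
/// logical indices, applying rule L2 of UAX #9 to the given levels.
fn reorder_indices(levels: &[u8]) -> Vec<usize> {
    let mut order: Vec<usize> = (0..levels.len()).collect();
    let max = match levels.iter().copied().max() {
        Some(m) => m,
        None => return order, // empty line
    };
    let min_odd = match levels.iter().copied().filter(|l| l % 2 == 1).min() {
        Some(m) => m,
        None => return order, // no right-to-left content: nothing to reverse
    };

    let mut level = max;
    // min_odd is at least 1, so `level -= 1` cannot underflow.
    while level >= min_odd {
        let mut i = 0;
        while i < order.len() {
            if levels[order[i]] >= level {
                // Reverse the maximal sequence at this level or higher.
                let start = i;
                while i < order.len() && levels[order[i]] >= level {
                    i += 1;
                }
                order[start..i].reverse();
            } else {
                i += 1;
            }
        }
        level -= 1;
    }
    order
}
```

For example, the levels `[0, 0, 1, 1, 1, 0]` form three runs (LTR, RTL, LTR) and the visual order keeps the outer characters in place while reversing the three characters in the middle.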

Cargo.lock generated

@@ -1496,6 +1496,13 @@ dependencies = [
"thread_local",
]
[[package]]
name = "generate-bidi"
version = "0.1.0"
dependencies = [
"anyhow",
]
[[package]]
name = "generic-array"
version = "0.12.4"
@@ -4652,6 +4659,15 @@ dependencies = [
"wezterm-term",
]
[[package]]
name = "wezterm-bidi"
version = "0.1.0"
dependencies = [
"k9",
"log",
"pretty_env_logger",
]
[[package]]
name = "wezterm-client"
version = "0.1.0"


@@ -1,5 +1,7 @@
[workspace]
members = [
"bidi",
"bidi/generate",
"strip-ansi-escapes",
"wezterm",
"wezterm-gui",

bidi/Cargo.toml Normal file

@@ -0,0 +1,13 @@
[package]
name = "wezterm-bidi"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
log = "0.4"
[dev-dependencies]
k9 = "0.11.0"
pretty_env_logger = "0.4"

bidi/LICENSE.md Normal file

@@ -0,0 +1,59 @@
Copyright (c) 2022-Present Wez Furlong
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Portions of this crate were derived from the bidiref reference implementation
of the UBA, and several data files are included that have the following
copyright and license:
COPYRIGHT AND PERMISSION NOTICE
Copyright © 1991-2022 Unicode, Inc. All rights reserved.
Distributed under the Terms of Use in https://www.unicode.org/license.txt
which is reproduced below:
Permission is hereby granted, free of charge, to any person obtaining
a copy of the Unicode data files and any associated documentation
(the "Data Files") or Unicode software and any associated documentation
(the "Software") to deal in the Data Files or Software
without restriction, including without limitation the rights to use,
copy, modify, merge, publish, distribute, and/or sell copies of
the Data Files or Software, and to permit persons to whom the Data Files
or Software are furnished to do so, provided that either
(a) this copyright and permission notice appear with all copies
of the Data Files or Software, or
(b) this copyright and permission notice appear in associated
Documentation.
THE DATA FILES AND SOFTWARE ARE PROVIDED "AS IS", WITHOUT WARRANTY OF
ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE
WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT OF THIRD PARTY RIGHTS.
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS
NOTICE BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL
DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE,
DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
PERFORMANCE OF THE DATA FILES OR SOFTWARE.
Except as contained in this notice, the name of a copyright holder
shall not be used in advertising or otherwise to promote the sale,
use or other dealings in these Data Files or Software without prior
written authorization of the copyright holder.

bidi/README.md Normal file

@@ -0,0 +1,28 @@
# wezterm-bidi - a pure Rust bidi implementation
This crate provides an implementation of *The Unicode Bidirectional
Algorithm (UBA)* in Rust.
This crate was developed for use in wezterm but does not depend on
other code in wezterm.
The focus for this crate is conformance.
## Status
This crate resolves embedding levels and can reorder line ranges.
The implementation is conformant with 100% of the BidiTest.txt and
BidiCharacterTest.txt test cases (approx 780,000 test cases).
## License
MIT compatible License
Copyright © 2022-Present Wez Furlong.
Portions of the code in this crate were derived from the bidiref reference
implementation of the UBA which is:
Copyright © 1991-2022 Unicode, Inc. All rights reserved.
See [LICENSE.md](LICENSE.md) for the full text of the license.

bidi/data/BidiBrackets.txt Normal file

@@ -0,0 +1,193 @@
# BidiBrackets-14.0.0.txt
# Date: 2021-06-30, 23:59:00 GMT [AG, LI, KW]
# © 2021 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use, see https://www.unicode.org/terms_of_use.html
#
# Unicode Character Database
# For documentation, see https://www.unicode.org/reports/tr44/
#
# Bidi_Paired_Bracket and Bidi_Paired_Bracket_Type Properties
#
# This file is a normative contributory data file in the Unicode
# Character Database.
#
# Bidi_Paired_Bracket is a normative property of type Miscellaneous,
# which establishes a mapping between characters that are treated as
# bracket pairs by the Unicode Bidirectional Algorithm.
#
# Bidi_Paired_Bracket_Type is a normative property of type Enumeration,
# which classifies characters into opening and closing paired brackets
# for the purposes of the Unicode Bidirectional Algorithm.
#
# This file lists the set of code points with Bidi_Paired_Bracket_Type
# property values Open and Close. The set is derived from the character
# properties General_Category (gc), Bidi_Class (bc), Bidi_Mirrored (Bidi_M),
# and Bidi_Mirroring_Glyph (bmg), as follows: two characters, A and B,
# form a bracket pair if A has gc=Ps and B has gc=Pe, both have bc=ON and
# Bidi_M=Y, and bmg of A is B. Bidi_Paired_Bracket (bpb) maps A to B and
# vice versa, and their Bidi_Paired_Bracket_Type (bpt) property values are
# Open (o) and Close (c), respectively.
#
# The brackets with ticks U+298D LEFT SQUARE BRACKET WITH TICK IN TOP CORNER
# through U+2990 RIGHT SQUARE BRACKET WITH TICK IN TOP CORNER are paired the
# same way their glyphs form mirror pairs, according to their bmg property
# values. They are not paired on the basis of a diagonal or antidiagonal
# matching of the corner ticks inferred from code point order.
#
# For legacy reasons, the characters U+FD3E ORNATE LEFT PARENTHESIS and
# U+FD3F ORNATE RIGHT PARENTHESIS do not mirror in bidirectional display
# and therefore do not form a bracket pair.
#
# The Unicode property value stability policy guarantees that characters
# which have bpt=o or bpt=c also have bc=ON and Bidi_M=Y. As a result, an
# implementation can optimize the lookup of the Bidi_Paired_Bracket_Type
# property values Open and Close by restricting the processing to characters
# with bc=ON.
#
# The format of the file is three fields separated by a semicolon.
# Field 0: Unicode code point value, represented as a hexadecimal value
# Field 1: Bidi_Paired_Bracket property value, a code point value or <none>
# Field 2: Bidi_Paired_Bracket_Type property value, one of the following:
# o Open
# c Close
# n None
# The names of the characters in field 0 are given in comments at the end
# of each line.
#
# For information on bidirectional paired brackets, see UAX #9: Unicode
# Bidirectional Algorithm, at https://www.unicode.org/reports/tr9/
#
# This file was originally created by Andrew Glass and Laurentiu Iancu
# for Unicode 6.3.
0028; 0029; o # LEFT PARENTHESIS
0029; 0028; c # RIGHT PARENTHESIS
005B; 005D; o # LEFT SQUARE BRACKET
005D; 005B; c # RIGHT SQUARE BRACKET
007B; 007D; o # LEFT CURLY BRACKET
007D; 007B; c # RIGHT CURLY BRACKET
0F3A; 0F3B; o # TIBETAN MARK GUG RTAGS GYON
0F3B; 0F3A; c # TIBETAN MARK GUG RTAGS GYAS
0F3C; 0F3D; o # TIBETAN MARK ANG KHANG GYON
0F3D; 0F3C; c # TIBETAN MARK ANG KHANG GYAS
169B; 169C; o # OGHAM FEATHER MARK
169C; 169B; c # OGHAM REVERSED FEATHER MARK
2045; 2046; o # LEFT SQUARE BRACKET WITH QUILL
2046; 2045; c # RIGHT SQUARE BRACKET WITH QUILL
207D; 207E; o # SUPERSCRIPT LEFT PARENTHESIS
207E; 207D; c # SUPERSCRIPT RIGHT PARENTHESIS
208D; 208E; o # SUBSCRIPT LEFT PARENTHESIS
208E; 208D; c # SUBSCRIPT RIGHT PARENTHESIS
2308; 2309; o # LEFT CEILING
2309; 2308; c # RIGHT CEILING
230A; 230B; o # LEFT FLOOR
230B; 230A; c # RIGHT FLOOR
2329; 232A; o # LEFT-POINTING ANGLE BRACKET
232A; 2329; c # RIGHT-POINTING ANGLE BRACKET
2768; 2769; o # MEDIUM LEFT PARENTHESIS ORNAMENT
2769; 2768; c # MEDIUM RIGHT PARENTHESIS ORNAMENT
276A; 276B; o # MEDIUM FLATTENED LEFT PARENTHESIS ORNAMENT
276B; 276A; c # MEDIUM FLATTENED RIGHT PARENTHESIS ORNAMENT
276C; 276D; o # MEDIUM LEFT-POINTING ANGLE BRACKET ORNAMENT
276D; 276C; c # MEDIUM RIGHT-POINTING ANGLE BRACKET ORNAMENT
276E; 276F; o # HEAVY LEFT-POINTING ANGLE QUOTATION MARK ORNAMENT
276F; 276E; c # HEAVY RIGHT-POINTING ANGLE QUOTATION MARK ORNAMENT
2770; 2771; o # HEAVY LEFT-POINTING ANGLE BRACKET ORNAMENT
2771; 2770; c # HEAVY RIGHT-POINTING ANGLE BRACKET ORNAMENT
2772; 2773; o # LIGHT LEFT TORTOISE SHELL BRACKET ORNAMENT
2773; 2772; c # LIGHT RIGHT TORTOISE SHELL BRACKET ORNAMENT
2774; 2775; o # MEDIUM LEFT CURLY BRACKET ORNAMENT
2775; 2774; c # MEDIUM RIGHT CURLY BRACKET ORNAMENT
27C5; 27C6; o # LEFT S-SHAPED BAG DELIMITER
27C6; 27C5; c # RIGHT S-SHAPED BAG DELIMITER
27E6; 27E7; o # MATHEMATICAL LEFT WHITE SQUARE BRACKET
27E7; 27E6; c # MATHEMATICAL RIGHT WHITE SQUARE BRACKET
27E8; 27E9; o # MATHEMATICAL LEFT ANGLE BRACKET
27E9; 27E8; c # MATHEMATICAL RIGHT ANGLE BRACKET
27EA; 27EB; o # MATHEMATICAL LEFT DOUBLE ANGLE BRACKET
27EB; 27EA; c # MATHEMATICAL RIGHT DOUBLE ANGLE BRACKET
27EC; 27ED; o # MATHEMATICAL LEFT WHITE TORTOISE SHELL BRACKET
27ED; 27EC; c # MATHEMATICAL RIGHT WHITE TORTOISE SHELL BRACKET
27EE; 27EF; o # MATHEMATICAL LEFT FLATTENED PARENTHESIS
27EF; 27EE; c # MATHEMATICAL RIGHT FLATTENED PARENTHESIS
2983; 2984; o # LEFT WHITE CURLY BRACKET
2984; 2983; c # RIGHT WHITE CURLY BRACKET
2985; 2986; o # LEFT WHITE PARENTHESIS
2986; 2985; c # RIGHT WHITE PARENTHESIS
2987; 2988; o # Z NOTATION LEFT IMAGE BRACKET
2988; 2987; c # Z NOTATION RIGHT IMAGE BRACKET
2989; 298A; o # Z NOTATION LEFT BINDING BRACKET
298A; 2989; c # Z NOTATION RIGHT BINDING BRACKET
298B; 298C; o # LEFT SQUARE BRACKET WITH UNDERBAR
298C; 298B; c # RIGHT SQUARE BRACKET WITH UNDERBAR
298D; 2990; o # LEFT SQUARE BRACKET WITH TICK IN TOP CORNER
298E; 298F; c # RIGHT SQUARE BRACKET WITH TICK IN BOTTOM CORNER
298F; 298E; o # LEFT SQUARE BRACKET WITH TICK IN BOTTOM CORNER
2990; 298D; c # RIGHT SQUARE BRACKET WITH TICK IN TOP CORNER
2991; 2992; o # LEFT ANGLE BRACKET WITH DOT
2992; 2991; c # RIGHT ANGLE BRACKET WITH DOT
2993; 2994; o # LEFT ARC LESS-THAN BRACKET
2994; 2993; c # RIGHT ARC GREATER-THAN BRACKET
2995; 2996; o # DOUBLE LEFT ARC GREATER-THAN BRACKET
2996; 2995; c # DOUBLE RIGHT ARC LESS-THAN BRACKET
2997; 2998; o # LEFT BLACK TORTOISE SHELL BRACKET
2998; 2997; c # RIGHT BLACK TORTOISE SHELL BRACKET
29D8; 29D9; o # LEFT WIGGLY FENCE
29D9; 29D8; c # RIGHT WIGGLY FENCE
29DA; 29DB; o # LEFT DOUBLE WIGGLY FENCE
29DB; 29DA; c # RIGHT DOUBLE WIGGLY FENCE
29FC; 29FD; o # LEFT-POINTING CURVED ANGLE BRACKET
29FD; 29FC; c # RIGHT-POINTING CURVED ANGLE BRACKET
2E22; 2E23; o # TOP LEFT HALF BRACKET
2E23; 2E22; c # TOP RIGHT HALF BRACKET
2E24; 2E25; o # BOTTOM LEFT HALF BRACKET
2E25; 2E24; c # BOTTOM RIGHT HALF BRACKET
2E26; 2E27; o # LEFT SIDEWAYS U BRACKET
2E27; 2E26; c # RIGHT SIDEWAYS U BRACKET
2E28; 2E29; o # LEFT DOUBLE PARENTHESIS
2E29; 2E28; c # RIGHT DOUBLE PARENTHESIS
2E55; 2E56; o # LEFT SQUARE BRACKET WITH STROKE
2E56; 2E55; c # RIGHT SQUARE BRACKET WITH STROKE
2E57; 2E58; o # LEFT SQUARE BRACKET WITH DOUBLE STROKE
2E58; 2E57; c # RIGHT SQUARE BRACKET WITH DOUBLE STROKE
2E59; 2E5A; o # TOP HALF LEFT PARENTHESIS
2E5A; 2E59; c # TOP HALF RIGHT PARENTHESIS
2E5B; 2E5C; o # BOTTOM HALF LEFT PARENTHESIS
2E5C; 2E5B; c # BOTTOM HALF RIGHT PARENTHESIS
3008; 3009; o # LEFT ANGLE BRACKET
3009; 3008; c # RIGHT ANGLE BRACKET
300A; 300B; o # LEFT DOUBLE ANGLE BRACKET
300B; 300A; c # RIGHT DOUBLE ANGLE BRACKET
300C; 300D; o # LEFT CORNER BRACKET
300D; 300C; c # RIGHT CORNER BRACKET
300E; 300F; o # LEFT WHITE CORNER BRACKET
300F; 300E; c # RIGHT WHITE CORNER BRACKET
3010; 3011; o # LEFT BLACK LENTICULAR BRACKET
3011; 3010; c # RIGHT BLACK LENTICULAR BRACKET
3014; 3015; o # LEFT TORTOISE SHELL BRACKET
3015; 3014; c # RIGHT TORTOISE SHELL BRACKET
3016; 3017; o # LEFT WHITE LENTICULAR BRACKET
3017; 3016; c # RIGHT WHITE LENTICULAR BRACKET
3018; 3019; o # LEFT WHITE TORTOISE SHELL BRACKET
3019; 3018; c # RIGHT WHITE TORTOISE SHELL BRACKET
301A; 301B; o # LEFT WHITE SQUARE BRACKET
301B; 301A; c # RIGHT WHITE SQUARE BRACKET
FE59; FE5A; o # SMALL LEFT PARENTHESIS
FE5A; FE59; c # SMALL RIGHT PARENTHESIS
FE5B; FE5C; o # SMALL LEFT CURLY BRACKET
FE5C; FE5B; c # SMALL RIGHT CURLY BRACKET
FE5D; FE5E; o # SMALL LEFT TORTOISE SHELL BRACKET
FE5E; FE5D; c # SMALL RIGHT TORTOISE SHELL BRACKET
FF08; FF09; o # FULLWIDTH LEFT PARENTHESIS
FF09; FF08; c # FULLWIDTH RIGHT PARENTHESIS
FF3B; FF3D; o # FULLWIDTH LEFT SQUARE BRACKET
FF3D; FF3B; c # FULLWIDTH RIGHT SQUARE BRACKET
FF5B; FF5D; o # FULLWIDTH LEFT CURLY BRACKET
FF5D; FF5B; c # FULLWIDTH RIGHT CURLY BRACKET
FF5F; FF60; o # FULLWIDTH LEFT WHITE PARENTHESIS
FF60; FF5F; c # FULLWIDTH RIGHT WHITE PARENTHESIS
FF62; FF63; o # HALFWIDTH LEFT CORNER BRACKET
FF63; FF62; c # HALFWIDTH RIGHT CORNER BRACKET
# EOF
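
The three-field format documented in the header of BidiBrackets.txt (code point; paired bracket or `<none>`; type `o`, `c`, or `n`, followed by a `#` comment) is simple to parse. The sketch below shows one possible way to read a single line. It is not the code of the `generate-bidi` tool added in this commit, and `BracketType`, `BracketEntry`, `parse_code_point`, and `parse_bracket_line` are hypothetical names used only for illustration.

```rust
/// Bidi_Paired_Bracket_Type values as listed in the file header.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BracketType {
    Open,
    Close,
    None,
}

/// One parsed data line (hypothetical type, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BracketEntry {
    code_point: char,
    paired: Option<char>,
    kind: BracketType,
}

/// Parse a hexadecimal code point field such as "0028".
fn parse_code_point(field: &str) -> Option<char> {
    u32::from_str_radix(field.trim(), 16)
        .ok()
        .and_then(char::from_u32)
}

/// Parse one line of BidiBrackets.txt.
/// Returns None for comment-only lines, blank lines, and malformed lines.
fn parse_bracket_line(line: &str) -> Option<BracketEntry> {
    // Everything after '#' is a comment (the character name).
    let data = line.split('#').next()?.trim();
    if data.is_empty() {
        return None;
    }

    let mut fields = data.split(';').map(str::trim);
    let code_point = parse_code_point(fields.next()?)?;

    // Field 1 is either a code point or the literal "<none>".
    let paired_field = fields.next()?;
    let paired = if paired_field == "<none>" {
        None
    } else {
        Some(parse_code_point(paired_field)?)
    };

    let kind = match fields.next()? {
        "o" => BracketType::Open,
        "c" => BracketType::Close,
        "n" => BracketType::None,
        _ => return None,
    };

    Some(BracketEntry {
        code_point,
        paired,
        kind,
    })
}
```

Applied to the first data line of the file, `parse_bracket_line("0028; 0029; o # LEFT PARENTHESIS")` yields an entry mapping `(` to `)` with type `Open`; header lines and `# EOF` yield `None`.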

File diff suppressed because it is too large

bidi/data/BidiMirroring.txt Normal file

@@ -0,0 +1,633 @@
# BidiMirroring-14.0.0.txt
# Date: 2021-08-08, 22:55:00 GMT [KW, RP]
# © 2021 Unicode®, Inc.
# For terms of use, see https://www.unicode.org/terms_of_use.html
#
# Unicode Character Database
# For documentation, see https://www.unicode.org/reports/tr44/
#
# Bidi_Mirroring_Glyph Property
#
# This file is an informative contributory data file in the
# Unicode Character Database.
#
# This data file lists characters that have the Bidi_Mirrored=Yes property
# value, for which there is another Unicode character that typically has a glyph
# that is the mirror image of the original character's glyph.
#
# The repertoire covered by the file is Unicode 14.0.0.
#
# The file contains a list of lines with mappings from one code point
# to another one for character-based mirroring.
# Note that for "real" mirroring, a rendering engine needs to select
# appropriate alternative glyphs, and that many Unicode characters do not
# have a mirror-image Unicode character.
#
# Each mapping line contains two fields, separated by a semicolon (';').
# Each of the two fields contains a code point represented as a
# variable-length hexadecimal value with 4 to 6 digits.
# A comment indicates where the characters are "BEST FIT" mirroring.
#
# Code points for which Bidi_Mirrored=Yes, but for which no appropriate
# characters exist with mirrored glyphs, are
# listed as comments at the end of the file.
#
# Formally, the default value of the Bidi_Mirroring_Glyph property
# for each code point is <none>, unless a mapping to
# some other character is specified in this data file. When a code
# point has the default value for the Bidi_Mirroring_Glyph property,
# that means that no other character exists whose glyph is suitable
# for character-based mirroring.
#
# For information on bidi mirroring, see UAX #9: Unicode Bidirectional Algorithm,
# at https://www.unicode.org/reports/tr9/
#
# This file was originally created by Markus Scherer.
# Extended for Unicode 3.2, 4.0, 4.1, 5.0, 5.1, 5.2, and 6.0 by Ken Whistler,
# and for subsequent versions by Ken Whistler, Laurentiu Iancu, and Roozbeh Pournader.
#
# Historical and Compatibility Information:
#
# The OpenType Mirroring Pairs List (OMPL) is frozen to match the
# Unicode 5.1 version of the Bidi_Mirroring_Glyph property (2008).
# See https://www.microsoft.com/typography/otspec/ompl.txt
#
# The Unicode 6.1 version of the Bidi_Mirroring_Glyph property (2011)
# added one mirroring pair: 27CB <--> 27CD.
#
# The Unicode 11.0 version of the Bidi_Mirroring_Glyph property (2018)
# underwent a substantial revision, to formally recognize all of the
# exact mirroring pairs and "BEST FIT" mirroring pairs that had been
# added after the freezing of the OMPL list. As a result, starting
# with Unicode 11.0, the bmg mapping values more accurately reflect
# the current status of glyphs for Bidi_Mirrored characters in
# the Unicode Standard, but this listing now extends significantly
# beyond the frozen OMPL list. Implementers should be aware of this
# intentional distinction.
#
# ############################################################
#
# Property: Bidi_Mirroring_Glyph
#
# @missing: 0000..10FFFF; <none>
0028; 0029 # LEFT PARENTHESIS
0029; 0028 # RIGHT PARENTHESIS
003C; 003E # LESS-THAN SIGN
003E; 003C # GREATER-THAN SIGN
005B; 005D # LEFT SQUARE BRACKET
005D; 005B # RIGHT SQUARE BRACKET
007B; 007D # LEFT CURLY BRACKET
007D; 007B # RIGHT CURLY BRACKET
00AB; 00BB # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
00BB; 00AB # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
0F3A; 0F3B # TIBETAN MARK GUG RTAGS GYON
0F3B; 0F3A # TIBETAN MARK GUG RTAGS GYAS
0F3C; 0F3D # TIBETAN MARK ANG KHANG GYON
0F3D; 0F3C # TIBETAN MARK ANG KHANG GYAS
169B; 169C # OGHAM FEATHER MARK
169C; 169B # OGHAM REVERSED FEATHER MARK
2039; 203A # SINGLE LEFT-POINTING ANGLE QUOTATION MARK
203A; 2039 # SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
2045; 2046 # LEFT SQUARE BRACKET WITH QUILL
2046; 2045 # RIGHT SQUARE BRACKET WITH QUILL
207D; 207E # SUPERSCRIPT LEFT PARENTHESIS
207E; 207D # SUPERSCRIPT RIGHT PARENTHESIS
208D; 208E # SUBSCRIPT LEFT PARENTHESIS
208E; 208D # SUBSCRIPT RIGHT PARENTHESIS
2208; 220B # ELEMENT OF
2209; 220C # [BEST FIT] NOT AN ELEMENT OF
220A; 220D # SMALL ELEMENT OF
220B; 2208 # CONTAINS AS MEMBER
220C; 2209 # [BEST FIT] DOES NOT CONTAIN AS MEMBER
220D; 220A # SMALL CONTAINS AS MEMBER
2215; 29F5 # DIVISION SLASH
221F; 2BFE # RIGHT ANGLE
2220; 29A3 # ANGLE
2221; 299B # MEASURED ANGLE
2222; 29A0 # SPHERICAL ANGLE
2224; 2AEE # DOES NOT DIVIDE
223C; 223D # TILDE OPERATOR
223D; 223C # REVERSED TILDE
2243; 22CD # ASYMPTOTICALLY EQUAL TO
2245; 224C # APPROXIMATELY EQUAL TO
224C; 2245 # ALL EQUAL TO
2252; 2253 # APPROXIMATELY EQUAL TO OR THE IMAGE OF
2253; 2252 # IMAGE OF OR APPROXIMATELY EQUAL TO
2254; 2255 # COLON EQUALS
2255; 2254 # EQUALS COLON
2264; 2265 # LESS-THAN OR EQUAL TO
2265; 2264 # GREATER-THAN OR EQUAL TO
2266; 2267 # LESS-THAN OVER EQUAL TO
2267; 2266 # GREATER-THAN OVER EQUAL TO
2268; 2269 # [BEST FIT] LESS-THAN BUT NOT EQUAL TO
2269; 2268 # [BEST FIT] GREATER-THAN BUT NOT EQUAL TO
226A; 226B # MUCH LESS-THAN
226B; 226A # MUCH GREATER-THAN
226E; 226F # [BEST FIT] NOT LESS-THAN
226F; 226E # [BEST FIT] NOT GREATER-THAN
2270; 2271 # [BEST FIT] NEITHER LESS-THAN NOR EQUAL TO
2271; 2270 # [BEST FIT] NEITHER GREATER-THAN NOR EQUAL TO
2272; 2273 # [BEST FIT] LESS-THAN OR EQUIVALENT TO
2273; 2272 # [BEST FIT] GREATER-THAN OR EQUIVALENT TO
2274; 2275 # [BEST FIT] NEITHER LESS-THAN NOR EQUIVALENT TO
2275; 2274 # [BEST FIT] NEITHER GREATER-THAN NOR EQUIVALENT TO
2276; 2277 # LESS-THAN OR GREATER-THAN
2277; 2276 # GREATER-THAN OR LESS-THAN
2278; 2279 # [BEST FIT] NEITHER LESS-THAN NOR GREATER-THAN
2279; 2278 # [BEST FIT] NEITHER GREATER-THAN NOR LESS-THAN
227A; 227B # PRECEDES
227B; 227A # SUCCEEDS
227C; 227D # PRECEDES OR EQUAL TO
227D; 227C # SUCCEEDS OR EQUAL TO
227E; 227F # [BEST FIT] PRECEDES OR EQUIVALENT TO
227F; 227E # [BEST FIT] SUCCEEDS OR EQUIVALENT TO
2280; 2281 # [BEST FIT] DOES NOT PRECEDE
2281; 2280 # [BEST FIT] DOES NOT SUCCEED
2282; 2283 # SUBSET OF
2283; 2282 # SUPERSET OF
2284; 2285 # [BEST FIT] NOT A SUBSET OF
2285; 2284 # [BEST FIT] NOT A SUPERSET OF
2286; 2287 # SUBSET OF OR EQUAL TO
2287; 2286 # SUPERSET OF OR EQUAL TO
2288; 2289 # [BEST FIT] NEITHER A SUBSET OF NOR EQUAL TO
2289; 2288 # [BEST FIT] NEITHER A SUPERSET OF NOR EQUAL TO
228A; 228B # [BEST FIT] SUBSET OF WITH NOT EQUAL TO
228B; 228A # [BEST FIT] SUPERSET OF WITH NOT EQUAL TO
228F; 2290 # SQUARE IMAGE OF
2290; 228F # SQUARE ORIGINAL OF
2291; 2292 # SQUARE IMAGE OF OR EQUAL TO
2292; 2291 # SQUARE ORIGINAL OF OR EQUAL TO
2298; 29B8 # CIRCLED DIVISION SLASH
22A2; 22A3 # RIGHT TACK
22A3; 22A2 # LEFT TACK
22A6; 2ADE # ASSERTION
22A8; 2AE4 # TRUE
22A9; 2AE3 # FORCES
22AB; 2AE5 # DOUBLE VERTICAL BAR DOUBLE RIGHT TURNSTILE
22B0; 22B1 # PRECEDES UNDER RELATION
22B1; 22B0 # SUCCEEDS UNDER RELATION
22B2; 22B3 # NORMAL SUBGROUP OF
22B3; 22B2 # CONTAINS AS NORMAL SUBGROUP
22B4; 22B5 # NORMAL SUBGROUP OF OR EQUAL TO
22B5; 22B4 # CONTAINS AS NORMAL SUBGROUP OR EQUAL TO
22B6; 22B7 # ORIGINAL OF
22B7; 22B6 # IMAGE OF
22B8; 27DC # MULTIMAP
22C9; 22CA # LEFT NORMAL FACTOR SEMIDIRECT PRODUCT
22CA; 22C9 # RIGHT NORMAL FACTOR SEMIDIRECT PRODUCT
22CB; 22CC # LEFT SEMIDIRECT PRODUCT
22CC; 22CB # RIGHT SEMIDIRECT PRODUCT
22CD; 2243 # REVERSED TILDE EQUALS
22D0; 22D1 # DOUBLE SUBSET
22D1; 22D0 # DOUBLE SUPERSET
22D6; 22D7 # LESS-THAN WITH DOT
22D7; 22D6 # GREATER-THAN WITH DOT
22D8; 22D9 # VERY MUCH LESS-THAN
22D9; 22D8 # VERY MUCH GREATER-THAN
22DA; 22DB # LESS-THAN EQUAL TO OR GREATER-THAN
22DB; 22DA # GREATER-THAN EQUAL TO OR LESS-THAN
22DC; 22DD # EQUAL TO OR LESS-THAN
22DD; 22DC # EQUAL TO OR GREATER-THAN
22DE; 22DF # EQUAL TO OR PRECEDES
22DF; 22DE # EQUAL TO OR SUCCEEDS
22E0; 22E1 # [BEST FIT] DOES NOT PRECEDE OR EQUAL
22E1; 22E0 # [BEST FIT] DOES NOT SUCCEED OR EQUAL
22E2; 22E3 # [BEST FIT] NOT SQUARE IMAGE OF OR EQUAL TO
22E3; 22E2 # [BEST FIT] NOT SQUARE ORIGINAL OF OR EQUAL TO
22E4; 22E5 # [BEST FIT] SQUARE IMAGE OF OR NOT EQUAL TO
22E5; 22E4 # [BEST FIT] SQUARE ORIGINAL OF OR NOT EQUAL TO
22E6; 22E7 # [BEST FIT] LESS-THAN BUT NOT EQUIVALENT TO
22E7; 22E6 # [BEST FIT] GREATER-THAN BUT NOT EQUIVALENT TO
22E8; 22E9 # [BEST FIT] PRECEDES BUT NOT EQUIVALENT TO
22E9; 22E8 # [BEST FIT] SUCCEEDS BUT NOT EQUIVALENT TO
22EA; 22EB # [BEST FIT] NOT NORMAL SUBGROUP OF
22EB; 22EA # [BEST FIT] DOES NOT CONTAIN AS NORMAL SUBGROUP
22EC; 22ED # [BEST FIT] NOT NORMAL SUBGROUP OF OR EQUAL TO
22ED; 22EC # [BEST FIT] DOES NOT CONTAIN AS NORMAL SUBGROUP OR EQUAL
22F0; 22F1 # UP RIGHT DIAGONAL ELLIPSIS
22F1; 22F0 # DOWN RIGHT DIAGONAL ELLIPSIS
22F2; 22FA # ELEMENT OF WITH LONG HORIZONTAL STROKE
22F3; 22FB # ELEMENT OF WITH VERTICAL BAR AT END OF HORIZONTAL STROKE
22F4; 22FC # SMALL ELEMENT OF WITH VERTICAL BAR AT END OF HORIZONTAL STROKE
22F6; 22FD # ELEMENT OF WITH OVERBAR
22F7; 22FE # SMALL ELEMENT OF WITH OVERBAR
22FA; 22F2 # CONTAINS WITH LONG HORIZONTAL STROKE
22FB; 22F3 # CONTAINS WITH VERTICAL BAR AT END OF HORIZONTAL STROKE
22FC; 22F4 # SMALL CONTAINS WITH VERTICAL BAR AT END OF HORIZONTAL STROKE
22FD; 22F6 # CONTAINS WITH OVERBAR
22FE; 22F7 # SMALL CONTAINS WITH OVERBAR
2308; 2309 # LEFT CEILING
2309; 2308 # RIGHT CEILING
230A; 230B # LEFT FLOOR
230B; 230A # RIGHT FLOOR
2329; 232A # LEFT-POINTING ANGLE BRACKET
232A; 2329 # RIGHT-POINTING ANGLE BRACKET
2768; 2769 # MEDIUM LEFT PARENTHESIS ORNAMENT
2769; 2768 # MEDIUM RIGHT PARENTHESIS ORNAMENT
276A; 276B # MEDIUM FLATTENED LEFT PARENTHESIS ORNAMENT
276B; 276A # MEDIUM FLATTENED RIGHT PARENTHESIS ORNAMENT
276C; 276D # MEDIUM LEFT-POINTING ANGLE BRACKET ORNAMENT
276D; 276C # MEDIUM RIGHT-POINTING ANGLE BRACKET ORNAMENT
276E; 276F # HEAVY LEFT-POINTING ANGLE QUOTATION MARK ORNAMENT
276F; 276E # HEAVY RIGHT-POINTING ANGLE QUOTATION MARK ORNAMENT
2770; 2771 # HEAVY LEFT-POINTING ANGLE BRACKET ORNAMENT
2771; 2770 # HEAVY RIGHT-POINTING ANGLE BRACKET ORNAMENT
2772; 2773 # LIGHT LEFT TORTOISE SHELL BRACKET ORNAMENT
2773; 2772 # LIGHT RIGHT TORTOISE SHELL BRACKET ORNAMENT
2774; 2775 # MEDIUM LEFT CURLY BRACKET ORNAMENT
2775; 2774 # MEDIUM RIGHT CURLY BRACKET ORNAMENT
27C3; 27C4 # OPEN SUBSET
27C4; 27C3 # OPEN SUPERSET
27C5; 27C6 # LEFT S-SHAPED BAG DELIMITER
27C6; 27C5 # RIGHT S-SHAPED BAG DELIMITER
27C8; 27C9 # REVERSE SOLIDUS PRECEDING SUBSET
27C9; 27C8 # SUPERSET PRECEDING SOLIDUS
27CB; 27CD # MATHEMATICAL RISING DIAGONAL
27CD; 27CB # MATHEMATICAL FALLING DIAGONAL
27D5; 27D6 # LEFT OUTER JOIN
27D6; 27D5 # RIGHT OUTER JOIN
27DC; 22B8 # LEFT MULTIMAP
27DD; 27DE # LONG RIGHT TACK
27DE; 27DD # LONG LEFT TACK
27E2; 27E3 # WHITE CONCAVE-SIDED DIAMOND WITH LEFTWARDS TICK
27E3; 27E2 # WHITE CONCAVE-SIDED DIAMOND WITH RIGHTWARDS TICK
27E4; 27E5 # WHITE SQUARE WITH LEFTWARDS TICK
27E5; 27E4 # WHITE SQUARE WITH RIGHTWARDS TICK
27E6; 27E7 # MATHEMATICAL LEFT WHITE SQUARE BRACKET
27E7; 27E6 # MATHEMATICAL RIGHT WHITE SQUARE BRACKET
27E8; 27E9 # MATHEMATICAL LEFT ANGLE BRACKET
27E9; 27E8 # MATHEMATICAL RIGHT ANGLE BRACKET
27EA; 27EB # MATHEMATICAL LEFT DOUBLE ANGLE BRACKET
27EB; 27EA # MATHEMATICAL RIGHT DOUBLE ANGLE BRACKET
27EC; 27ED # MATHEMATICAL LEFT WHITE TORTOISE SHELL BRACKET
27ED; 27EC # MATHEMATICAL RIGHT WHITE TORTOISE SHELL BRACKET
27EE; 27EF # MATHEMATICAL LEFT FLATTENED PARENTHESIS
27EF; 27EE # MATHEMATICAL RIGHT FLATTENED PARENTHESIS
2983; 2984 # LEFT WHITE CURLY BRACKET
2984; 2983 # RIGHT WHITE CURLY BRACKET
2985; 2986 # LEFT WHITE PARENTHESIS
2986; 2985 # RIGHT WHITE PARENTHESIS
2987; 2988 # Z NOTATION LEFT IMAGE BRACKET
2988; 2987 # Z NOTATION RIGHT IMAGE BRACKET
2989; 298A # Z NOTATION LEFT BINDING BRACKET
298A; 2989 # Z NOTATION RIGHT BINDING BRACKET
298B; 298C # LEFT SQUARE BRACKET WITH UNDERBAR
298C; 298B # RIGHT SQUARE BRACKET WITH UNDERBAR
298D; 2990 # LEFT SQUARE BRACKET WITH TICK IN TOP CORNER
298E; 298F # RIGHT SQUARE BRACKET WITH TICK IN BOTTOM CORNER
298F; 298E # LEFT SQUARE BRACKET WITH TICK IN BOTTOM CORNER
2990; 298D # RIGHT SQUARE BRACKET WITH TICK IN TOP CORNER
2991; 2992 # LEFT ANGLE BRACKET WITH DOT
2992; 2991 # RIGHT ANGLE BRACKET WITH DOT
2993; 2994 # LEFT ARC LESS-THAN BRACKET
2994; 2993 # RIGHT ARC GREATER-THAN BRACKET
2995; 2996 # DOUBLE LEFT ARC GREATER-THAN BRACKET
2996; 2995 # DOUBLE RIGHT ARC LESS-THAN BRACKET
2997; 2998 # LEFT BLACK TORTOISE SHELL BRACKET
2998; 2997 # RIGHT BLACK TORTOISE SHELL BRACKET
299B; 2221 # MEASURED ANGLE OPENING LEFT
29A0; 2222 # SPHERICAL ANGLE OPENING LEFT
29A3; 2220 # REVERSED ANGLE
29A4; 29A5 # ANGLE WITH UNDERBAR
29A5; 29A4 # REVERSED ANGLE WITH UNDERBAR
29A8; 29A9 # MEASURED ANGLE WITH OPEN ARM ENDING IN ARROW POINTING UP AND RIGHT
29A9; 29A8 # MEASURED ANGLE WITH OPEN ARM ENDING IN ARROW POINTING UP AND LEFT
29AA; 29AB # MEASURED ANGLE WITH OPEN ARM ENDING IN ARROW POINTING DOWN AND RIGHT
29AB; 29AA # MEASURED ANGLE WITH OPEN ARM ENDING IN ARROW POINTING DOWN AND LEFT
29AC; 29AD # MEASURED ANGLE WITH OPEN ARM ENDING IN ARROW POINTING RIGHT AND UP
29AD; 29AC # MEASURED ANGLE WITH OPEN ARM ENDING IN ARROW POINTING LEFT AND UP
29AE; 29AF # MEASURED ANGLE WITH OPEN ARM ENDING IN ARROW POINTING RIGHT AND DOWN
29AF; 29AE # MEASURED ANGLE WITH OPEN ARM ENDING IN ARROW POINTING LEFT AND DOWN
29B8; 2298 # CIRCLED REVERSE SOLIDUS
29C0; 29C1 # CIRCLED LESS-THAN
29C1; 29C0 # CIRCLED GREATER-THAN
29C4; 29C5 # SQUARED RISING DIAGONAL SLASH
29C5; 29C4 # SQUARED FALLING DIAGONAL SLASH
29CF; 29D0 # LEFT TRIANGLE BESIDE VERTICAL BAR
29D0; 29CF # VERTICAL BAR BESIDE RIGHT TRIANGLE
29D1; 29D2 # BOWTIE WITH LEFT HALF BLACK
29D2; 29D1 # BOWTIE WITH RIGHT HALF BLACK
29D4; 29D5 # TIMES WITH LEFT HALF BLACK
29D5; 29D4 # TIMES WITH RIGHT HALF BLACK
29D8; 29D9 # LEFT WIGGLY FENCE
29D9; 29D8 # RIGHT WIGGLY FENCE
29DA; 29DB # LEFT DOUBLE WIGGLY FENCE
29DB; 29DA # RIGHT DOUBLE WIGGLY FENCE
29E8; 29E9 # DOWN-POINTING TRIANGLE WITH LEFT HALF BLACK
29E9; 29E8 # DOWN-POINTING TRIANGLE WITH RIGHT HALF BLACK
29F5; 2215 # REVERSE SOLIDUS OPERATOR
29F8; 29F9 # BIG SOLIDUS
29F9; 29F8 # BIG REVERSE SOLIDUS
29FC; 29FD # LEFT-POINTING CURVED ANGLE BRACKET
29FD; 29FC # RIGHT-POINTING CURVED ANGLE BRACKET
2A2B; 2A2C # MINUS SIGN WITH FALLING DOTS
2A2C; 2A2B # MINUS SIGN WITH RISING DOTS
2A2D; 2A2E # PLUS SIGN IN LEFT HALF CIRCLE
2A2E; 2A2D # PLUS SIGN IN RIGHT HALF CIRCLE
2A34; 2A35 # MULTIPLICATION SIGN IN LEFT HALF CIRCLE
2A35; 2A34 # MULTIPLICATION SIGN IN RIGHT HALF CIRCLE
2A3C; 2A3D # INTERIOR PRODUCT
2A3D; 2A3C # RIGHTHAND INTERIOR PRODUCT
2A64; 2A65 # Z NOTATION DOMAIN ANTIRESTRICTION
2A65; 2A64 # Z NOTATION RANGE ANTIRESTRICTION
2A79; 2A7A # LESS-THAN WITH CIRCLE INSIDE
2A7A; 2A79 # GREATER-THAN WITH CIRCLE INSIDE
2A7B; 2A7C # [BEST FIT] LESS-THAN WITH QUESTION MARK ABOVE
2A7C; 2A7B # [BEST FIT] GREATER-THAN WITH QUESTION MARK ABOVE
2A7D; 2A7E # LESS-THAN OR SLANTED EQUAL TO
2A7E; 2A7D # GREATER-THAN OR SLANTED EQUAL TO
2A7F; 2A80 # LESS-THAN OR SLANTED EQUAL TO WITH DOT INSIDE
2A80; 2A7F # GREATER-THAN OR SLANTED EQUAL TO WITH DOT INSIDE
2A81; 2A82 # LESS-THAN OR SLANTED EQUAL TO WITH DOT ABOVE
2A82; 2A81 # GREATER-THAN OR SLANTED EQUAL TO WITH DOT ABOVE
2A83; 2A84 # LESS-THAN OR SLANTED EQUAL TO WITH DOT ABOVE RIGHT
2A84; 2A83 # GREATER-THAN OR SLANTED EQUAL TO WITH DOT ABOVE LEFT
2A85; 2A86 # [BEST FIT] LESS-THAN OR APPROXIMATE
2A86; 2A85 # [BEST FIT] GREATER-THAN OR APPROXIMATE
2A87; 2A88 # [BEST FIT] LESS-THAN AND SINGLE-LINE NOT EQUAL TO
2A88; 2A87 # [BEST FIT] GREATER-THAN AND SINGLE-LINE NOT EQUAL TO
2A89; 2A8A # [BEST FIT] LESS-THAN AND NOT APPROXIMATE
2A8A; 2A89 # [BEST FIT] GREATER-THAN AND NOT APPROXIMATE
2A8B; 2A8C # LESS-THAN ABOVE DOUBLE-LINE EQUAL ABOVE GREATER-THAN
2A8C; 2A8B # GREATER-THAN ABOVE DOUBLE-LINE EQUAL ABOVE LESS-THAN
2A8D; 2A8E # [BEST FIT] LESS-THAN ABOVE SIMILAR OR EQUAL
2A8E; 2A8D # [BEST FIT] GREATER-THAN ABOVE SIMILAR OR EQUAL
2A8F; 2A90 # [BEST FIT] LESS-THAN ABOVE SIMILAR ABOVE GREATER-THAN
2A90; 2A8F # [BEST FIT] GREATER-THAN ABOVE SIMILAR ABOVE LESS-THAN
2A91; 2A92 # LESS-THAN ABOVE GREATER-THAN ABOVE DOUBLE-LINE EQUAL
2A92; 2A91 # GREATER-THAN ABOVE LESS-THAN ABOVE DOUBLE-LINE EQUAL
2A93; 2A94 # LESS-THAN ABOVE SLANTED EQUAL ABOVE GREATER-THAN ABOVE SLANTED EQUAL
2A94; 2A93 # GREATER-THAN ABOVE SLANTED EQUAL ABOVE LESS-THAN ABOVE SLANTED EQUAL
2A95; 2A96 # SLANTED EQUAL TO OR LESS-THAN
2A96; 2A95 # SLANTED EQUAL TO OR GREATER-THAN
2A97; 2A98 # SLANTED EQUAL TO OR LESS-THAN WITH DOT INSIDE
2A98; 2A97 # SLANTED EQUAL TO OR GREATER-THAN WITH DOT INSIDE
2A99; 2A9A # DOUBLE-LINE EQUAL TO OR LESS-THAN
2A9A; 2A99 # DOUBLE-LINE EQUAL TO OR GREATER-THAN
2A9B; 2A9C # DOUBLE-LINE SLANTED EQUAL TO OR LESS-THAN
2A9C; 2A9B # DOUBLE-LINE SLANTED EQUAL TO OR GREATER-THAN
2A9D; 2A9E # [BEST FIT] SIMILAR OR LESS-THAN
2A9E; 2A9D # [BEST FIT] SIMILAR OR GREATER-THAN
2A9F; 2AA0 # [BEST FIT] SIMILAR ABOVE LESS-THAN ABOVE EQUALS SIGN
2AA0; 2A9F # [BEST FIT] SIMILAR ABOVE GREATER-THAN ABOVE EQUALS SIGN
2AA1; 2AA2 # DOUBLE NESTED LESS-THAN
2AA2; 2AA1 # DOUBLE NESTED GREATER-THAN
2AA6; 2AA7 # LESS-THAN CLOSED BY CURVE
2AA7; 2AA6 # GREATER-THAN CLOSED BY CURVE
2AA8; 2AA9 # LESS-THAN CLOSED BY CURVE ABOVE SLANTED EQUAL
2AA9; 2AA8 # GREATER-THAN CLOSED BY CURVE ABOVE SLANTED EQUAL
2AAA; 2AAB # SMALLER THAN
2AAB; 2AAA # LARGER THAN
2AAC; 2AAD # SMALLER THAN OR EQUAL TO
2AAD; 2AAC # LARGER THAN OR EQUAL TO
2AAF; 2AB0 # PRECEDES ABOVE SINGLE-LINE EQUALS SIGN
2AB0; 2AAF # SUCCEEDS ABOVE SINGLE-LINE EQUALS SIGN
2AB1; 2AB2 # [BEST FIT] PRECEDES ABOVE SINGLE-LINE NOT EQUAL TO
2AB2; 2AB1 # [BEST FIT] SUCCEEDS ABOVE SINGLE-LINE NOT EQUAL TO
2AB3; 2AB4 # PRECEDES ABOVE EQUALS SIGN
2AB4; 2AB3 # SUCCEEDS ABOVE EQUALS SIGN
2AB5; 2AB6 # [BEST FIT] PRECEDES ABOVE NOT EQUAL TO
2AB6; 2AB5 # [BEST FIT] SUCCEEDS ABOVE NOT EQUAL TO
2AB7; 2AB8 # [BEST FIT] PRECEDES ABOVE ALMOST EQUAL TO
2AB8; 2AB7 # [BEST FIT] SUCCEEDS ABOVE ALMOST EQUAL TO
2AB9; 2ABA # [BEST FIT] PRECEDES ABOVE NOT ALMOST EQUAL TO
2ABA; 2AB9 # [BEST FIT] SUCCEEDS ABOVE NOT ALMOST EQUAL TO
2ABB; 2ABC # DOUBLE PRECEDES
2ABC; 2ABB # DOUBLE SUCCEEDS
2ABD; 2ABE # SUBSET WITH DOT
2ABE; 2ABD # SUPERSET WITH DOT
2ABF; 2AC0 # SUBSET WITH PLUS SIGN BELOW
2AC0; 2ABF # SUPERSET WITH PLUS SIGN BELOW
2AC1; 2AC2 # SUBSET WITH MULTIPLICATION SIGN BELOW
2AC2; 2AC1 # SUPERSET WITH MULTIPLICATION SIGN BELOW
2AC3; 2AC4 # SUBSET OF OR EQUAL TO WITH DOT ABOVE
2AC4; 2AC3 # SUPERSET OF OR EQUAL TO WITH DOT ABOVE
2AC5; 2AC6 # SUBSET OF ABOVE EQUALS SIGN
2AC6; 2AC5 # SUPERSET OF ABOVE EQUALS SIGN
2AC7; 2AC8 # [BEST FIT] SUBSET OF ABOVE TILDE OPERATOR
2AC8; 2AC7 # [BEST FIT] SUPERSET OF ABOVE TILDE OPERATOR
2AC9; 2ACA # [BEST FIT] SUBSET OF ABOVE ALMOST EQUAL TO
2ACA; 2AC9 # [BEST FIT] SUPERSET OF ABOVE ALMOST EQUAL TO
2ACB; 2ACC # [BEST FIT] SUBSET OF ABOVE NOT EQUAL TO
2ACC; 2ACB # [BEST FIT] SUPERSET OF ABOVE NOT EQUAL TO
2ACD; 2ACE # SQUARE LEFT OPEN BOX OPERATOR
2ACE; 2ACD # SQUARE RIGHT OPEN BOX OPERATOR
2ACF; 2AD0 # CLOSED SUBSET
2AD0; 2ACF # CLOSED SUPERSET
2AD1; 2AD2 # CLOSED SUBSET OR EQUAL TO
2AD2; 2AD1 # CLOSED SUPERSET OR EQUAL TO
2AD3; 2AD4 # SUBSET ABOVE SUPERSET
2AD4; 2AD3 # SUPERSET ABOVE SUBSET
2AD5; 2AD6 # SUBSET ABOVE SUBSET
2AD6; 2AD5 # SUPERSET ABOVE SUPERSET
2ADE; 22A6 # SHORT LEFT TACK
2AE3; 22A9 # DOUBLE VERTICAL BAR LEFT TURNSTILE
2AE4; 22A8 # VERTICAL BAR DOUBLE LEFT TURNSTILE
2AE5; 22AB # DOUBLE VERTICAL BAR DOUBLE LEFT TURNSTILE
2AEC; 2AED # DOUBLE STROKE NOT SIGN
2AED; 2AEC # REVERSED DOUBLE STROKE NOT SIGN
2AEE; 2224 # DOES NOT DIVIDE WITH REVERSED NEGATION SLASH
2AF7; 2AF8 # TRIPLE NESTED LESS-THAN
2AF8; 2AF7 # TRIPLE NESTED GREATER-THAN
2AF9; 2AFA # DOUBLE-LINE SLANTED LESS-THAN OR EQUAL TO
2AFA; 2AF9 # DOUBLE-LINE SLANTED GREATER-THAN OR EQUAL TO
2BFE; 221F # REVERSED RIGHT ANGLE
2E02; 2E03 # LEFT SUBSTITUTION BRACKET
2E03; 2E02 # RIGHT SUBSTITUTION BRACKET
2E04; 2E05 # LEFT DOTTED SUBSTITUTION BRACKET
2E05; 2E04 # RIGHT DOTTED SUBSTITUTION BRACKET
2E09; 2E0A # LEFT TRANSPOSITION BRACKET
2E0A; 2E09 # RIGHT TRANSPOSITION BRACKET
2E0C; 2E0D # LEFT RAISED OMISSION BRACKET
2E0D; 2E0C # RIGHT RAISED OMISSION BRACKET
2E1C; 2E1D # LEFT LOW PARAPHRASE BRACKET
2E1D; 2E1C # RIGHT LOW PARAPHRASE BRACKET
2E20; 2E21 # LEFT VERTICAL BAR WITH QUILL
2E21; 2E20 # RIGHT VERTICAL BAR WITH QUILL
2E22; 2E23 # TOP LEFT HALF BRACKET
2E23; 2E22 # TOP RIGHT HALF BRACKET
2E24; 2E25 # BOTTOM LEFT HALF BRACKET
2E25; 2E24 # BOTTOM RIGHT HALF BRACKET
2E26; 2E27 # LEFT SIDEWAYS U BRACKET
2E27; 2E26 # RIGHT SIDEWAYS U BRACKET
2E28; 2E29 # LEFT DOUBLE PARENTHESIS
2E29; 2E28 # RIGHT DOUBLE PARENTHESIS
2E55; 2E56 # LEFT SQUARE BRACKET WITH STROKE
2E56; 2E55 # RIGHT SQUARE BRACKET WITH STROKE
2E57; 2E58 # LEFT SQUARE BRACKET WITH DOUBLE STROKE
2E58; 2E57 # RIGHT SQUARE BRACKET WITH DOUBLE STROKE
2E59; 2E5A # TOP HALF LEFT PARENTHESIS
2E5A; 2E59 # TOP HALF RIGHT PARENTHESIS
2E5B; 2E5C # BOTTOM HALF LEFT PARENTHESIS
2E5C; 2E5B # BOTTOM HALF RIGHT PARENTHESIS
3008; 3009 # LEFT ANGLE BRACKET
3009; 3008 # RIGHT ANGLE BRACKET
300A; 300B # LEFT DOUBLE ANGLE BRACKET
300B; 300A # RIGHT DOUBLE ANGLE BRACKET
300C; 300D # [BEST FIT] LEFT CORNER BRACKET
300D; 300C # [BEST FIT] RIGHT CORNER BRACKET
300E; 300F # [BEST FIT] LEFT WHITE CORNER BRACKET
300F; 300E # [BEST FIT] RIGHT WHITE CORNER BRACKET
3010; 3011 # LEFT BLACK LENTICULAR BRACKET
3011; 3010 # RIGHT BLACK LENTICULAR BRACKET
3014; 3015 # LEFT TORTOISE SHELL BRACKET
3015; 3014 # RIGHT TORTOISE SHELL BRACKET
3016; 3017 # LEFT WHITE LENTICULAR BRACKET
3017; 3016 # RIGHT WHITE LENTICULAR BRACKET
3018; 3019 # LEFT WHITE TORTOISE SHELL BRACKET
3019; 3018 # RIGHT WHITE TORTOISE SHELL BRACKET
301A; 301B # LEFT WHITE SQUARE BRACKET
301B; 301A # RIGHT WHITE SQUARE BRACKET
FE59; FE5A # SMALL LEFT PARENTHESIS
FE5A; FE59 # SMALL RIGHT PARENTHESIS
FE5B; FE5C # SMALL LEFT CURLY BRACKET
FE5C; FE5B # SMALL RIGHT CURLY BRACKET
FE5D; FE5E # SMALL LEFT TORTOISE SHELL BRACKET
FE5E; FE5D # SMALL RIGHT TORTOISE SHELL BRACKET
FE64; FE65 # SMALL LESS-THAN SIGN
FE65; FE64 # SMALL GREATER-THAN SIGN
FF08; FF09 # FULLWIDTH LEFT PARENTHESIS
FF09; FF08 # FULLWIDTH RIGHT PARENTHESIS
FF1C; FF1E # FULLWIDTH LESS-THAN SIGN
FF1E; FF1C # FULLWIDTH GREATER-THAN SIGN
FF3B; FF3D # FULLWIDTH LEFT SQUARE BRACKET
FF3D; FF3B # FULLWIDTH RIGHT SQUARE BRACKET
FF5B; FF5D # FULLWIDTH LEFT CURLY BRACKET
FF5D; FF5B # FULLWIDTH RIGHT CURLY BRACKET
FF5F; FF60 # FULLWIDTH LEFT WHITE PARENTHESIS
FF60; FF5F # FULLWIDTH RIGHT WHITE PARENTHESIS
FF62; FF63 # [BEST FIT] HALFWIDTH LEFT CORNER BRACKET
FF63; FF62 # [BEST FIT] HALFWIDTH RIGHT CORNER BRACKET
# The following characters have no appropriate mirroring character.
# For these characters it is up to the rendering system
# to provide mirrored glyphs.
# 2140; DOUBLE-STRUCK N-ARY SUMMATION
# 2201; COMPLEMENT
# 2202; PARTIAL DIFFERENTIAL
# 2203; THERE EXISTS
# 2204; THERE DOES NOT EXIST
# 2211; N-ARY SUMMATION
# 2216; SET MINUS
# 221A; SQUARE ROOT
# 221B; CUBE ROOT
# 221C; FOURTH ROOT
# 221D; PROPORTIONAL TO
# 2226; NOT PARALLEL TO
# 222B; INTEGRAL
# 222C; DOUBLE INTEGRAL
# 222D; TRIPLE INTEGRAL
# 222E; CONTOUR INTEGRAL
# 222F; SURFACE INTEGRAL
# 2230; VOLUME INTEGRAL
# 2231; CLOCKWISE INTEGRAL
# 2232; CLOCKWISE CONTOUR INTEGRAL
# 2233; ANTICLOCKWISE CONTOUR INTEGRAL
# 2239; EXCESS
# 223B; HOMOTHETIC
# 223E; INVERTED LAZY S
# 223F; SINE WAVE
# 2240; WREATH PRODUCT
# 2241; NOT TILDE
# 2242; MINUS TILDE
# 2244; NOT ASYMPTOTICALLY EQUAL TO
# 2246; APPROXIMATELY BUT NOT ACTUALLY EQUAL TO
# 2247; NEITHER APPROXIMATELY NOR ACTUALLY EQUAL TO
# 2248; ALMOST EQUAL TO
# 2249; NOT ALMOST EQUAL TO
# 224A; ALMOST EQUAL OR EQUAL TO
# 224B; TRIPLE TILDE
# 225F; QUESTIONED EQUAL TO
# 2260; NOT EQUAL TO
# 2262; NOT IDENTICAL TO
# 228C; MULTISET
# 22A7; MODELS
# 22AA; TRIPLE VERTICAL BAR RIGHT TURNSTILE
# 22AC; DOES NOT PROVE
# 22AD; NOT TRUE
# 22AE; DOES NOT FORCE
# 22AF; NEGATED DOUBLE VERTICAL BAR DOUBLE RIGHT TURNSTILE
# 22BE; RIGHT ANGLE WITH ARC
# 22BF; RIGHT TRIANGLE
# 22F5; ELEMENT OF WITH DOT ABOVE
# 22F8; ELEMENT OF WITH UNDERBAR
# 22F9; ELEMENT OF WITH TWO HORIZONTAL STROKES
# 22FF; Z NOTATION BAG MEMBERSHIP
# 2320; TOP HALF INTEGRAL
# 2321; BOTTOM HALF INTEGRAL
# 27C0; THREE DIMENSIONAL ANGLE
# 27CC; LONG DIVISION
# 27D3; LOWER RIGHT CORNER WITH DOT
# 27D4; UPPER LEFT CORNER WITH DOT
# 299C; RIGHT ANGLE VARIANT WITH SQUARE
# 299D; MEASURED RIGHT ANGLE WITH DOT
# 299E; ANGLE WITH S INSIDE
# 299F; ACUTE ANGLE
# 29A2; TURNED ANGLE
# 29A6; OBLIQUE ANGLE OPENING UP
# 29A7; OBLIQUE ANGLE OPENING DOWN
# 29C2; CIRCLE WITH SMALL CIRCLE TO THE RIGHT
# 29C3; CIRCLE WITH TWO HORIZONTAL STROKES TO THE RIGHT
# 29C9; TWO JOINED SQUARES
# 29CE; RIGHT TRIANGLE ABOVE LEFT TRIANGLE
# 29DC; INCOMPLETE INFINITY
# 29E1; INCREASES AS
# 29E3; EQUALS SIGN AND SLANTED PARALLEL
# 29E4; EQUALS SIGN AND SLANTED PARALLEL WITH TILDE ABOVE
# 29E5; IDENTICAL TO AND SLANTED PARALLEL
# 29F4; RULE-DELAYED
# 29F6; SOLIDUS WITH OVERBAR
# 29F7; REVERSE SOLIDUS WITH HORIZONTAL STROKE
# 2A0A; MODULO TWO SUM
# 2A0B; SUMMATION WITH INTEGRAL
# 2A0C; QUADRUPLE INTEGRAL OPERATOR
# 2A0D; FINITE PART INTEGRAL
# 2A0E; INTEGRAL WITH DOUBLE STROKE
# 2A0F; INTEGRAL AVERAGE WITH SLASH
# 2A10; CIRCULATION FUNCTION
# 2A11; ANTICLOCKWISE INTEGRATION
# 2A12; LINE INTEGRATION WITH RECTANGULAR PATH AROUND POLE
# 2A13; LINE INTEGRATION WITH SEMICIRCULAR PATH AROUND POLE
# 2A14; LINE INTEGRATION NOT INCLUDING THE POLE
# 2A15; INTEGRAL AROUND A POINT OPERATOR
# 2A16; QUATERNION INTEGRAL OPERATOR
# 2A17; INTEGRAL WITH LEFTWARDS ARROW WITH HOOK
# 2A18; INTEGRAL WITH TIMES SIGN
# 2A19; INTEGRAL WITH INTERSECTION
# 2A1A; INTEGRAL WITH UNION
# 2A1B; INTEGRAL WITH OVERBAR
# 2A1C; INTEGRAL WITH UNDERBAR
# 2A1E; LARGE LEFT TRIANGLE OPERATOR
# 2A1F; Z NOTATION SCHEMA COMPOSITION
# 2A20; Z NOTATION SCHEMA PIPING
# 2A21; Z NOTATION SCHEMA PROJECTION
# 2A24; PLUS SIGN WITH TILDE ABOVE
# 2A26; PLUS SIGN WITH TILDE BELOW
# 2A29; MINUS SIGN WITH COMMA ABOVE
# 2A3E; Z NOTATION RELATIONAL COMPOSITION
# 2A57; SLOPING LARGE OR
# 2A58; SLOPING LARGE AND
# 2A6A; TILDE OPERATOR WITH DOT ABOVE
# 2A6B; TILDE OPERATOR WITH RISING DOTS
# 2A6C; SIMILAR MINUS SIMILAR
# 2A6D; CONGRUENT WITH DOT ABOVE
# 2A6F; ALMOST EQUAL TO WITH CIRCUMFLEX ACCENT
# 2A70; APPROXIMATELY EQUAL OR EQUAL TO
# 2A73; EQUALS SIGN ABOVE TILDE OPERATOR
# 2A74; DOUBLE COLON EQUAL
# 2AA3; DOUBLE NESTED LESS-THAN WITH UNDERBAR
# 2ADC; FORKING
# 2AE2; VERTICAL BAR TRIPLE RIGHT TURNSTILE
# 2AE6; LONG DASH FROM LEFT MEMBER OF DOUBLE VERTICAL
# 2AF3; PARALLEL WITH TILDE OPERATOR
# 2AFB; TRIPLE SOLIDUS BINARY RELATION
# 2AFD; DOUBLE SOLIDUS OPERATOR
# 1D6DB; MATHEMATICAL BOLD PARTIAL DIFFERENTIAL
# 1D715; MATHEMATICAL ITALIC PARTIAL DIFFERENTIAL
# 1D74F; MATHEMATICAL BOLD ITALIC PARTIAL DIFFERENTIAL
# 1D789; MATHEMATICAL SANS-SERIF BOLD PARTIAL DIFFERENTIAL
# 1D7C3; MATHEMATICAL SANS-SERIF BOLD ITALIC PARTIAL DIFFERENTIAL
# EOF
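
The mirroring data above uses the form `CODE; MIRROR # NAME`, with hexadecimal code points and `#` starting a comment. As a minimal sketch of reading one such line (the function name `parse_mirroring_line` is hypothetical and is not part of this commit):

```rust
/// Parse one line of BidiMirroring.txt into (code point, mirrored code point).
/// Returns None for blank lines, comment-only lines, and malformed entries.
/// Hypothetical helper for illustration; not part of wezterm-bidi.
fn parse_mirroring_line(line: &str) -> Option<(char, char)> {
    // Everything after the first '#' is a comment.
    let data = line.split('#').next()?.trim();
    if data.is_empty() {
        return None;
    }
    // The remaining text is two ';'-separated hexadecimal code points.
    let mut fields = data.split(';');
    let from = u32::from_str_radix(fields.next()?.trim(), 16).ok()?;
    let to = u32::from_str_radix(fields.next()?.trim(), 16).ok()?;
    Some((char::from_u32(from)?, char::from_u32(to)?))
}
```

The commented-out entries at the end of the file (characters with no mirroring character) are skipped by the same comment rule, so a renderer would have to handle them separately.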

497589
bidi/data/BidiTest.txt Normal file

File diff suppressed because it is too large Load Diff

51
bidi/examples/shaping.rs Normal file

@ -0,0 +1,51 @@
use wezterm_bidi::{BidiContext, Direction};

fn main() {
    // The UBA is strongly coupled with codepoints and indices into the
    // original text, and that fans out to our API here.
    //
    // paragraph is a Vec<char>.
    let paragraph = vec!['א', 'ב', 'ג', 'a', 'b', 'c'];

    let mut context = BidiContext::new();

    // Leave it to the algorithm to determine the paragraph direction.
    // If you have some higher level understanding or override for the
    // direction, you can set `direction` accordingly.
    let direction: Option<Direction> = None;

    // Resolve the embedding levels for our paragraph.
    context.resolve_paragraph(&paragraph, direction);

    /// In order to lay out the text, we need to feed information to a shaper.
    /// For the purposes of this example, we're sketching out a stub shaper
    /// interface here, which is essentially compatible with e.g. HarfBuzz's
    /// buffer data type.
    struct ShaperBuffer {}

    impl ShaperBuffer {
        pub fn add_codepoint(&mut self, codepoint: char) {
            let _ = codepoint;
            // could call hb_buffer_add_codepoints() here
        }

        pub fn set_direction(&mut self, direction: Direction) {
            let _ = direction;
            // could call hb_buffer_set_direction() here
        }

        pub fn reset(&mut self) {}

        pub fn shape(&mut self) {}
    }

    let mut buffer = ShaperBuffer {};

    // Shape each directional run separately, feeding the shaper the
    // codepoints of the run along with the run's direction.
    for run in context.runs() {
        buffer.reset();
        buffer.set_direction(run.direction);
        for idx in run.indices() {
            buffer.add_codepoint(paragraph[idx]);
        }
        buffer.shape();

        // Now it is up to you to use the information from the shaper
        // to decide whether and how the paragraph should be wrapped
        // into lines.
    }
}
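
The example above iterates over directional runs. As a rough illustration of what a run is, the sketch below groups a slice of already-resolved embedding levels into maximal runs of equal level. It is self-contained and does not use the wezterm-bidi API; the names `Dir`, `LevelRun` and `level_runs` are hypothetical, and the levels are assumed to have been resolved elsewhere.

```rust
use std::ops::Range;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Dir {
    LeftToRight,
    RightToLeft,
}

/// A maximal span of consecutive characters that share one embedding level.
#[derive(Debug, Clone, PartialEq, Eq)]
struct LevelRun {
    range: Range<usize>,
    level: u8,
    dir: Dir,
}

/// Split `levels` (one resolved level per character) into runs.
fn level_runs(levels: &[u8]) -> Vec<LevelRun> {
    let mut runs = Vec::new();
    let mut start = 0;
    for idx in 1..=levels.len() {
        // Close the current run at the end of input or when the level changes.
        if idx == levels.len() || levels[idx] != levels[start] {
            let level = levels[start];
            // Odd levels are right-to-left, even levels are left-to-right.
            let dir = if level % 2 == 1 {
                Dir::RightToLeft
            } else {
                Dir::LeftToRight
            };
            runs.push(LevelRun { range: start..idx, level, dir });
            start = idx;
        }
    }
    runs
}
```

For the paragraph in the example, three Hebrew letters followed by `abc` with an auto-detected right-to-left paragraph, the UBA rules would be expected to yield levels `[1, 1, 1, 2, 2, 2]`, which this sketch splits into one right-to-left run and one left-to-right run.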

10
bidi/generate/Cargo.toml Normal file

@ -0,0 +1,10 @@
[package]
name = "generate-bidi"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
anyhow = "1.0"

228
bidi/generate/src/main.rs Normal file

@ -0,0 +1,228 @@
use anyhow::Context;
use std::io::Write;

/// Parse a hexadecimal code point such as `2A2E`.
fn parse_codepoint(s: &str) -> anyhow::Result<u32> {
    u32::from_str_radix(s.trim(), 16).with_context(|| s.to_string())
}

/// Generate bidi/src/bidi_class.rs from bidi/data/DerivedBidiClass.txt.
fn gen_class() -> anyhow::Result<()> {
    let data = std::fs::read_to_string("bidi/data/DerivedBidiClass.txt")
        .context("bidi/data/DerivedBidiClass.txt")?;

    struct Entry {
        start: u32,
        end: u32,
        bidi_class: String,
        comment: String,
    }

    impl Entry {
        /// Parse a line of the form `START[..END] ; CLASS # comment`.
        /// Returns `Ok(None)` for blank lines and comment lines.
        fn parse(line: &str) -> anyhow::Result<Option<Self>> {
            let line = line.trim();
            if line.starts_with('#') || line.is_empty() {
                return Ok(None);
            }
            let fields: Vec<&str> = line.split(';').collect();
            // The first field is either a single code point or a range.
            let range_fields: Vec<&str> = fields[0].trim().split("..").collect();
            let start: u32 = parse_codepoint(range_fields[0])?;
            let end = if let Some(end) = range_fields.get(1) {
                parse_codepoint(end)?
            } else {
                start
            };
            // The second field holds the class, followed by a comment.
            let fields: Vec<&str> = fields[1].split('#').collect();
            let bidi_class = fields[0].trim().to_string();
            let comment = fields[1].trim().to_string();
            Ok(Some(Entry {
                start,
                end,
                bidi_class,
                comment,
            }))
        }
    }

    let mut entries = vec![];
    for line in data.lines() {
        if let Some(entry) = Entry::parse(line)? {
            entries.push(entry);
        }
    }
    // Sort by starting code point so the generated table is ordered.
    entries.sort_by_key(|e| e.start);

    let mut f =
        std::fs::File::create("bidi/src/bidi_class.rs").context("bidi/src/bidi_class.rs")?;
    writeln!(
        f,
        "//! Generated from bidi/data/DerivedBidiClass.txt by bidi/generate/src/main.rs"
    )?;
    writeln!(
        f,
r"
#[derive(Clone, Copy, Debug, Hash, Eq, PartialEq)]
#[repr(u8)]
pub enum BidiClass {{
ArabicLetter,
ArabicNumber,
BoundaryNeutral,
CommonSeparator,
EuropeanNumber,
EuropeanSeparator,
EuropeanTerminator,
FirstStrongIsolate,
LeftToRight,
LeftToRightEmbedding,
LeftToRightIsolate,
LeftToRightOverride,
NonspacingMark,
OtherNeutral,
ParagraphSeparator,
PopDirectionalFormat,
PopDirectionalIsolate,
RightToLeft,
RightToLeftEmbedding,
RightToLeftIsolate,
RightToLeftOverride,
SegmentSeparator,
WhiteSpace,
}}
"
    )?;
    writeln!(
        f,
        "pub const BIDI_CLASS: &'static [(char, char, BidiClass)] = &["
    )?;
    for entry in entries.into_iter() {
        writeln!(
            f,
            " ('{}', '{}', {}), // {}",
            char::from_u32(entry.start).unwrap().escape_unicode(),
            char::from_u32(entry.end).unwrap().escape_unicode(),
            match entry.bidi_class.as_str() {
                "AL" => "BidiClass::ArabicLetter",
                "AN" => "BidiClass::ArabicNumber",
                "BN" => "BidiClass::BoundaryNeutral",
                "CS" => "BidiClass::CommonSeparator",
                "EN" => "BidiClass::EuropeanNumber",
                "ES" => "BidiClass::EuropeanSeparator",
                "ET" => "BidiClass::EuropeanTerminator",
                "FSI" => "BidiClass::FirstStrongIsolate",
                "L" => "BidiClass::LeftToRight",
                "LRO" => "BidiClass::LeftToRightOverride",
                "LRE" => "BidiClass::LeftToRightEmbedding",
                "LRI" => "BidiClass::LeftToRightIsolate",
                "NSM" => "BidiClass::NonspacingMark",
                "ON" => "BidiClass::OtherNeutral",
                "B" => "BidiClass::ParagraphSeparator",
                "PDF" => "BidiClass::PopDirectionalFormat",
                "PDI" => "BidiClass::PopDirectionalIsolate",
                "R" => "BidiClass::RightToLeft",
                "RLE" => "BidiClass::RightToLeftEmbedding",
                "RLI" => "BidiClass::RightToLeftIsolate",
                "RLO" => "BidiClass::RightToLeftOverride",
                "S" => "BidiClass::SegmentSeparator",
                "WS" => "BidiClass::WhiteSpace",
                bad => panic!("invalid BidiClass {}", bad),
            },
            entry.comment
        )?;
    }
    writeln!(f, "];")?;
    Ok(())
}
/// Generate bidi/src/bidi_brackets.rs from bidi/data/BidiBrackets.txt.
fn gen_brackets() -> anyhow::Result<()> {
    let data = std::fs::read_to_string("bidi/data/BidiBrackets.txt")
        .context("bidi/data/BidiBrackets.txt")?;

    struct Entry {
        code_point: u32,
        bidi_paired_bracket: u32,
        bidi_paired_bracket_type: char,
        comment: String,
    }

    impl Entry {
        /// Parse a line of the form `CODE; PAIRED; TYPE # comment`,
        /// where TYPE is `o` (open) or `c` (close).
        /// Returns `Ok(None)` for blank lines and comment lines.
        fn parse(line: &str) -> anyhow::Result<Option<Self>> {
            let line = line.trim();
            if line.starts_with('#') || line.is_empty() {
                return Ok(None);
            }
            let fields: Vec<&str> = line.split(';').collect();
            let code_point: u32 = parse_codepoint(fields[0])?;
            let bidi_paired_bracket: u32 = parse_codepoint(fields[1])?;
            // The third field holds the bracket type, followed by a comment.
            let fields: Vec<&str> = fields[2].split('#').collect();
            let bidi_paired_bracket_type: char = fields[0]
                .trim()
                .parse()
                .with_context(|| fields[0].to_string())?;
            let comment = fields[1].trim().to_string();
            Ok(Some(Entry {
                code_point,
                bidi_paired_bracket,
                bidi_paired_bracket_type,
                comment,
            }))
        }
    }

    let mut entries = vec![];
    for line in data.lines() {
        if let Some(entry) = Entry::parse(line)? {
            entries.push(entry);
        }
    }
    // Sort by code point so the generated table is ordered.
    entries.sort_by_key(|e| e.code_point);

    let mut f =
        std::fs::File::create("bidi/src/bidi_brackets.rs").context("bidi/src/bidi_brackets.rs")?;
    writeln!(
        f,
        "//! Generated from bidi/data/BidiBrackets.txt by bidi/generate/src/main.rs"
    )?;
    writeln!(
        f,
        "#[derive(Debug, Clone, Copy, PartialEq, Eq)] #[repr(u8)] pub enum BracketType {{ Open, Close }}"
    )?;
    writeln!(
        f,
        "pub const BIDI_BRACKETS: &'static [(char, char, BracketType)] = &["
    )?;
    for entry in entries.into_iter() {
        writeln!(
            f,
            " ('{}', '{}', {}), // {}",
            char::from_u32(entry.code_point).unwrap().escape_unicode(),
            char::from_u32(entry.bidi_paired_bracket)
                .unwrap()
                .escape_unicode(),
            match entry.bidi_paired_bracket_type {
                'o' => "BracketType::Open",
                'c' => "BracketType::Close",
                bad => panic!("invalid BracketType {}", bad),
            },
            entry.comment
        )?;
    }
    writeln!(f, "];")?;
    Ok(())
}

fn main() -> anyhow::Result<()> {
    gen_brackets().context("gen_brackets")?;
    gen_class().context("gen_class")?;
    Ok(())
}
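
Because the generator sorts its entries by starting code point, the emitted `(start, end, class)` table lends itself to a binary search. The following is a minimal sketch of such a lookup over a small hand-written sample table. The names `SAMPLE` and `lookup` are hypothetical, the enum is cut down to three variants, and this is not necessarily how the crate itself performs the lookup.

```rust
use std::cmp::Ordering;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum BidiClass {
    EuropeanNumber,
    LeftToRight,
    RightToLeft,
}

/// A tiny stand-in for the generated table: inclusive ranges, sorted by start.
const SAMPLE: &[(char, char, BidiClass)] = &[
    ('0', '9', BidiClass::EuropeanNumber),
    ('A', 'Z', BidiClass::LeftToRight),
    ('a', 'z', BidiClass::LeftToRight),
    ('\u{5d0}', '\u{5ea}', BidiClass::RightToLeft),
];

/// Find the class of `c` by binary search over sorted, non-overlapping ranges.
fn lookup(table: &[(char, char, BidiClass)], c: char) -> Option<BidiClass> {
    table
        .binary_search_by(|&(start, end, _)| {
            if c < start {
                // This range lies entirely above the target.
                Ordering::Greater
            } else if c > end {
                // This range lies entirely below the target.
                Ordering::Less
            } else {
                Ordering::Equal
            }
        })
        .ok()
        .map(|idx| table[idx].2)
}
```

A caller would still need to decide what to do for code points that are absent from the table; this sketch simply returns `None`.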

137
bidi/src/bidi_brackets.rs Normal file

@ -0,0 +1,137 @@
//! Generated from bidi/data/BidiBrackets.txt by bidi/generate/src/main.rs
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum BracketType {
Open,
Close,
}
pub const BIDI_BRACKETS: &'static [(char, char, BracketType)] = &[
('\u{28}', '\u{29}', BracketType::Open), // LEFT PARENTHESIS
('\u{29}', '\u{28}', BracketType::Close), // RIGHT PARENTHESIS
('\u{5b}', '\u{5d}', BracketType::Open), // LEFT SQUARE BRACKET
('\u{5d}', '\u{5b}', BracketType::Close), // RIGHT SQUARE BRACKET
('\u{7b}', '\u{7d}', BracketType::Open), // LEFT CURLY BRACKET
('\u{7d}', '\u{7b}', BracketType::Close), // RIGHT CURLY BRACKET
('\u{f3a}', '\u{f3b}', BracketType::Open), // TIBETAN MARK GUG RTAGS GYON
('\u{f3b}', '\u{f3a}', BracketType::Close), // TIBETAN MARK GUG RTAGS GYAS
('\u{f3c}', '\u{f3d}', BracketType::Open), // TIBETAN MARK ANG KHANG GYON
('\u{f3d}', '\u{f3c}', BracketType::Close), // TIBETAN MARK ANG KHANG GYAS
('\u{169b}', '\u{169c}', BracketType::Open), // OGHAM FEATHER MARK
('\u{169c}', '\u{169b}', BracketType::Close), // OGHAM REVERSED FEATHER MARK
('\u{2045}', '\u{2046}', BracketType::Open), // LEFT SQUARE BRACKET WITH QUILL
('\u{2046}', '\u{2045}', BracketType::Close), // RIGHT SQUARE BRACKET WITH QUILL
('\u{207d}', '\u{207e}', BracketType::Open), // SUPERSCRIPT LEFT PARENTHESIS
('\u{207e}', '\u{207d}', BracketType::Close), // SUPERSCRIPT RIGHT PARENTHESIS
('\u{208d}', '\u{208e}', BracketType::Open), // SUBSCRIPT LEFT PARENTHESIS
('\u{208e}', '\u{208d}', BracketType::Close), // SUBSCRIPT RIGHT PARENTHESIS
('\u{2308}', '\u{2309}', BracketType::Open), // LEFT CEILING
('\u{2309}', '\u{2308}', BracketType::Close), // RIGHT CEILING
('\u{230a}', '\u{230b}', BracketType::Open), // LEFT FLOOR
('\u{230b}', '\u{230a}', BracketType::Close), // RIGHT FLOOR
('\u{2329}', '\u{232a}', BracketType::Open), // LEFT-POINTING ANGLE BRACKET
('\u{232a}', '\u{2329}', BracketType::Close), // RIGHT-POINTING ANGLE BRACKET
('\u{2768}', '\u{2769}', BracketType::Open), // MEDIUM LEFT PARENTHESIS ORNAMENT
('\u{2769}', '\u{2768}', BracketType::Close), // MEDIUM RIGHT PARENTHESIS ORNAMENT
('\u{276a}', '\u{276b}', BracketType::Open), // MEDIUM FLATTENED LEFT PARENTHESIS ORNAMENT
('\u{276b}', '\u{276a}', BracketType::Close), // MEDIUM FLATTENED RIGHT PARENTHESIS ORNAMENT
('\u{276c}', '\u{276d}', BracketType::Open), // MEDIUM LEFT-POINTING ANGLE BRACKET ORNAMENT
('\u{276d}', '\u{276c}', BracketType::Close), // MEDIUM RIGHT-POINTING ANGLE BRACKET ORNAMENT
('\u{276e}', '\u{276f}', BracketType::Open), // HEAVY LEFT-POINTING ANGLE QUOTATION MARK ORNAMENT
('\u{276f}', '\u{276e}', BracketType::Close), // HEAVY RIGHT-POINTING ANGLE QUOTATION MARK ORNAMENT
('\u{2770}', '\u{2771}', BracketType::Open), // HEAVY LEFT-POINTING ANGLE BRACKET ORNAMENT
('\u{2771}', '\u{2770}', BracketType::Close), // HEAVY RIGHT-POINTING ANGLE BRACKET ORNAMENT
('\u{2772}', '\u{2773}', BracketType::Open), // LIGHT LEFT TORTOISE SHELL BRACKET ORNAMENT
('\u{2773}', '\u{2772}', BracketType::Close), // LIGHT RIGHT TORTOISE SHELL BRACKET ORNAMENT
('\u{2774}', '\u{2775}', BracketType::Open), // MEDIUM LEFT CURLY BRACKET ORNAMENT
('\u{2775}', '\u{2774}', BracketType::Close), // MEDIUM RIGHT CURLY BRACKET ORNAMENT
('\u{27c5}', '\u{27c6}', BracketType::Open), // LEFT S-SHAPED BAG DELIMITER
('\u{27c6}', '\u{27c5}', BracketType::Close), // RIGHT S-SHAPED BAG DELIMITER
('\u{27e6}', '\u{27e7}', BracketType::Open), // MATHEMATICAL LEFT WHITE SQUARE BRACKET
('\u{27e7}', '\u{27e6}', BracketType::Close), // MATHEMATICAL RIGHT WHITE SQUARE BRACKET
('\u{27e8}', '\u{27e9}', BracketType::Open), // MATHEMATICAL LEFT ANGLE BRACKET
('\u{27e9}', '\u{27e8}', BracketType::Close), // MATHEMATICAL RIGHT ANGLE BRACKET
('\u{27ea}', '\u{27eb}', BracketType::Open), // MATHEMATICAL LEFT DOUBLE ANGLE BRACKET
('\u{27eb}', '\u{27ea}', BracketType::Close), // MATHEMATICAL RIGHT DOUBLE ANGLE BRACKET
('\u{27ec}', '\u{27ed}', BracketType::Open), // MATHEMATICAL LEFT WHITE TORTOISE SHELL BRACKET
('\u{27ed}', '\u{27ec}', BracketType::Close), // MATHEMATICAL RIGHT WHITE TORTOISE SHELL BRACKET
('\u{27ee}', '\u{27ef}', BracketType::Open), // MATHEMATICAL LEFT FLATTENED PARENTHESIS
('\u{27ef}', '\u{27ee}', BracketType::Close), // MATHEMATICAL RIGHT FLATTENED PARENTHESIS
('\u{2983}', '\u{2984}', BracketType::Open), // LEFT WHITE CURLY BRACKET
('\u{2984}', '\u{2983}', BracketType::Close), // RIGHT WHITE CURLY BRACKET
('\u{2985}', '\u{2986}', BracketType::Open), // LEFT WHITE PARENTHESIS
('\u{2986}', '\u{2985}', BracketType::Close), // RIGHT WHITE PARENTHESIS
('\u{2987}', '\u{2988}', BracketType::Open), // Z NOTATION LEFT IMAGE BRACKET
('\u{2988}', '\u{2987}', BracketType::Close), // Z NOTATION RIGHT IMAGE BRACKET
('\u{2989}', '\u{298a}', BracketType::Open), // Z NOTATION LEFT BINDING BRACKET
('\u{298a}', '\u{2989}', BracketType::Close), // Z NOTATION RIGHT BINDING BRACKET
('\u{298b}', '\u{298c}', BracketType::Open), // LEFT SQUARE BRACKET WITH UNDERBAR
('\u{298c}', '\u{298b}', BracketType::Close), // RIGHT SQUARE BRACKET WITH UNDERBAR
('\u{298d}', '\u{2990}', BracketType::Open), // LEFT SQUARE BRACKET WITH TICK IN TOP CORNER
('\u{298e}', '\u{298f}', BracketType::Close), // RIGHT SQUARE BRACKET WITH TICK IN BOTTOM CORNER
('\u{298f}', '\u{298e}', BracketType::Open), // LEFT SQUARE BRACKET WITH TICK IN BOTTOM CORNER
('\u{2990}', '\u{298d}', BracketType::Close), // RIGHT SQUARE BRACKET WITH TICK IN TOP CORNER
('\u{2991}', '\u{2992}', BracketType::Open), // LEFT ANGLE BRACKET WITH DOT
('\u{2992}', '\u{2991}', BracketType::Close), // RIGHT ANGLE BRACKET WITH DOT
('\u{2993}', '\u{2994}', BracketType::Open), // LEFT ARC LESS-THAN BRACKET
('\u{2994}', '\u{2993}', BracketType::Close), // RIGHT ARC GREATER-THAN BRACKET
('\u{2995}', '\u{2996}', BracketType::Open), // DOUBLE LEFT ARC GREATER-THAN BRACKET
('\u{2996}', '\u{2995}', BracketType::Close), // DOUBLE RIGHT ARC LESS-THAN BRACKET
('\u{2997}', '\u{2998}', BracketType::Open), // LEFT BLACK TORTOISE SHELL BRACKET
('\u{2998}', '\u{2997}', BracketType::Close), // RIGHT BLACK TORTOISE SHELL BRACKET
('\u{29d8}', '\u{29d9}', BracketType::Open), // LEFT WIGGLY FENCE
('\u{29d9}', '\u{29d8}', BracketType::Close), // RIGHT WIGGLY FENCE
('\u{29da}', '\u{29db}', BracketType::Open), // LEFT DOUBLE WIGGLY FENCE
('\u{29db}', '\u{29da}', BracketType::Close), // RIGHT DOUBLE WIGGLY FENCE
('\u{29fc}', '\u{29fd}', BracketType::Open), // LEFT-POINTING CURVED ANGLE BRACKET
('\u{29fd}', '\u{29fc}', BracketType::Close), // RIGHT-POINTING CURVED ANGLE BRACKET
('\u{2e22}', '\u{2e23}', BracketType::Open), // TOP LEFT HALF BRACKET
('\u{2e23}', '\u{2e22}', BracketType::Close), // TOP RIGHT HALF BRACKET
('\u{2e24}', '\u{2e25}', BracketType::Open), // BOTTOM LEFT HALF BRACKET
('\u{2e25}', '\u{2e24}', BracketType::Close), // BOTTOM RIGHT HALF BRACKET
('\u{2e26}', '\u{2e27}', BracketType::Open), // LEFT SIDEWAYS U BRACKET
('\u{2e27}', '\u{2e26}', BracketType::Close), // RIGHT SIDEWAYS U BRACKET
('\u{2e28}', '\u{2e29}', BracketType::Open), // LEFT DOUBLE PARENTHESIS
('\u{2e29}', '\u{2e28}', BracketType::Close), // RIGHT DOUBLE PARENTHESIS
('\u{2e55}', '\u{2e56}', BracketType::Open), // LEFT SQUARE BRACKET WITH STROKE
('\u{2e56}', '\u{2e55}', BracketType::Close), // RIGHT SQUARE BRACKET WITH STROKE
('\u{2e57}', '\u{2e58}', BracketType::Open), // LEFT SQUARE BRACKET WITH DOUBLE STROKE
('\u{2e58}', '\u{2e57}', BracketType::Close), // RIGHT SQUARE BRACKET WITH DOUBLE STROKE
('\u{2e59}', '\u{2e5a}', BracketType::Open), // TOP HALF LEFT PARENTHESIS
('\u{2e5a}', '\u{2e59}', BracketType::Close), // TOP HALF RIGHT PARENTHESIS
('\u{2e5b}', '\u{2e5c}', BracketType::Open), // BOTTOM HALF LEFT PARENTHESIS
('\u{2e5c}', '\u{2e5b}', BracketType::Close), // BOTTOM HALF RIGHT PARENTHESIS
('\u{3008}', '\u{3009}', BracketType::Open), // LEFT ANGLE BRACKET
('\u{3009}', '\u{3008}', BracketType::Close), // RIGHT ANGLE BRACKET
('\u{300a}', '\u{300b}', BracketType::Open), // LEFT DOUBLE ANGLE BRACKET
('\u{300b}', '\u{300a}', BracketType::Close), // RIGHT DOUBLE ANGLE BRACKET
('\u{300c}', '\u{300d}', BracketType::Open), // LEFT CORNER BRACKET
('\u{300d}', '\u{300c}', BracketType::Close), // RIGHT CORNER BRACKET
('\u{300e}', '\u{300f}', BracketType::Open), // LEFT WHITE CORNER BRACKET
('\u{300f}', '\u{300e}', BracketType::Close), // RIGHT WHITE CORNER BRACKET
('\u{3010}', '\u{3011}', BracketType::Open), // LEFT BLACK LENTICULAR BRACKET
('\u{3011}', '\u{3010}', BracketType::Close), // RIGHT BLACK LENTICULAR BRACKET
('\u{3014}', '\u{3015}', BracketType::Open), // LEFT TORTOISE SHELL BRACKET
('\u{3015}', '\u{3014}', BracketType::Close), // RIGHT TORTOISE SHELL BRACKET
('\u{3016}', '\u{3017}', BracketType::Open), // LEFT WHITE LENTICULAR BRACKET
('\u{3017}', '\u{3016}', BracketType::Close), // RIGHT WHITE LENTICULAR BRACKET
('\u{3018}', '\u{3019}', BracketType::Open), // LEFT WHITE TORTOISE SHELL BRACKET
('\u{3019}', '\u{3018}', BracketType::Close), // RIGHT WHITE TORTOISE SHELL BRACKET
('\u{301a}', '\u{301b}', BracketType::Open), // LEFT WHITE SQUARE BRACKET
('\u{301b}', '\u{301a}', BracketType::Close), // RIGHT WHITE SQUARE BRACKET
('\u{fe59}', '\u{fe5a}', BracketType::Open), // SMALL LEFT PARENTHESIS
('\u{fe5a}', '\u{fe59}', BracketType::Close), // SMALL RIGHT PARENTHESIS
('\u{fe5b}', '\u{fe5c}', BracketType::Open), // SMALL LEFT CURLY BRACKET
('\u{fe5c}', '\u{fe5b}', BracketType::Close), // SMALL RIGHT CURLY BRACKET
('\u{fe5d}', '\u{fe5e}', BracketType::Open), // SMALL LEFT TORTOISE SHELL BRACKET
('\u{fe5e}', '\u{fe5d}', BracketType::Close), // SMALL RIGHT TORTOISE SHELL BRACKET
('\u{ff08}', '\u{ff09}', BracketType::Open), // FULLWIDTH LEFT PARENTHESIS
('\u{ff09}', '\u{ff08}', BracketType::Close), // FULLWIDTH RIGHT PARENTHESIS
('\u{ff3b}', '\u{ff3d}', BracketType::Open), // FULLWIDTH LEFT SQUARE BRACKET
('\u{ff3d}', '\u{ff3b}', BracketType::Close), // FULLWIDTH RIGHT SQUARE BRACKET
('\u{ff5b}', '\u{ff5d}', BracketType::Open), // FULLWIDTH LEFT CURLY BRACKET
('\u{ff5d}', '\u{ff5b}', BracketType::Close), // FULLWIDTH RIGHT CURLY BRACKET
('\u{ff5f}', '\u{ff60}', BracketType::Open), // FULLWIDTH LEFT WHITE PARENTHESIS
('\u{ff60}', '\u{ff5f}', BracketType::Close), // FULLWIDTH RIGHT WHITE PARENTHESIS
('\u{ff62}', '\u{ff63}', BracketType::Open), // HALFWIDTH LEFT CORNER BRACKET
('\u{ff63}', '\u{ff62}', BracketType::Close), // HALFWIDTH RIGHT CORNER BRACKET
];
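
A table of this shape is what the bracket-pairing step of the UBA (definition BD16 in UAX #9) consumes. The sketch below shows the stack-based pairing idea over a four-entry sample table. It is a simplification under stated assumptions: it ignores canonical equivalence between brackets, does not check that a bracket's bidi class is still neutral, and simply stops scanning when the 63-entry stack is full (consult UAX #9 for the exact overflow behaviour). The names `SAMPLE_BRACKETS`, `lookup_bracket` and `bracket_pairs` are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BracketType {
    Open,
    Close,
}

/// A tiny stand-in for the generated table, sorted by code point.
const SAMPLE_BRACKETS: &[(char, char, BracketType)] = &[
    ('(', ')', BracketType::Open),
    (')', '(', BracketType::Close),
    ('[', ']', BracketType::Open),
    (']', '[', BracketType::Close),
];

/// BD16 uses a fixed-size stack of 63 elements.
const MAX_PAIRING_DEPTH: usize = 63;

/// Return the paired bracket and type for `c`, if `c` is a bracket.
fn lookup_bracket(c: char) -> Option<(char, BracketType)> {
    SAMPLE_BRACKETS
        .binary_search_by_key(&c, |&(cp, _, _)| cp)
        .ok()
        .map(|idx| (SAMPLE_BRACKETS[idx].1, SAMPLE_BRACKETS[idx].2))
}

/// Return (open index, close index) pairs, sorted by the opening position.
fn bracket_pairs(text: &[char]) -> Vec<(usize, usize)> {
    // Each stack entry is (expected closing bracket, position of the opener).
    let mut stack: Vec<(char, usize)> = Vec::new();
    let mut pairs = Vec::new();
    for (idx, &c) in text.iter().enumerate() {
        match lookup_bracket(c) {
            Some((closer, BracketType::Open)) => {
                if stack.len() >= MAX_PAIRING_DEPTH {
                    break; // simplification: stop scanning on overflow
                }
                stack.push((closer, idx));
            }
            Some((_, BracketType::Close)) => {
                // Search from the top of the stack for a matching opener.
                if let Some(pos) = stack.iter().rposition(|&(closer, _)| closer == c) {
                    pairs.push((stack[pos].1, idx));
                    // Pop the matched opener and everything above it.
                    stack.truncate(pos);
                }
            }
            None => {}
        }
    }
    pairs.sort();
    pairs
}
```

Note that an unmatched closing bracket is skipped without disturbing the stack, while a successful match discards any openers nested inside it that were never closed.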

2348
bidi/src/bidi_class.rs Normal file

File diff suppressed because it is too large Load Diff

32
bidi/src/direction.rs Normal file

@ -0,0 +1,32 @@
use crate::bidi_class::BidiClass;

#[derive(Debug, Clone, Copy, Eq, PartialEq)]
pub enum Direction {
    LeftToRight,
    RightToLeft,
}

impl Direction {
    /// Odd embedding levels are right-to-left; all others are left-to-right.
    pub fn with_level(level: i8) -> Self {
        if level % 2 == 1 {
            Self::RightToLeft
        } else {
            Self::LeftToRight
        }
    }

    pub fn opposite(self) -> Self {
        if self == Direction::LeftToRight {
            Direction::RightToLeft
        } else {
            Direction::LeftToRight
        }
    }

    pub fn as_bidi_class(self) -> BidiClass {
        match self {
            Self::RightToLeft => BidiClass::RightToLeft,
            Self::LeftToRight => BidiClass::LeftToRight,
        }
    }
}

58
bidi/src/level.rs Normal file
View File

@ -0,0 +1,58 @@
use crate::bidi_class::BidiClass;
use crate::direction::Direction;
use crate::NO_LEVEL;

/// Maximum stack depth; UBA guarantees that it will never increase
/// in later versions of the spec.
pub const MAX_DEPTH: usize = 125;

#[derive(Default, Clone, Copy, Debug, Eq, PartialEq, Ord, PartialOrd, Hash)]
pub struct Level(pub(crate) i8);

impl Level {
    pub fn direction(self) -> Direction {
        Direction::with_level(self.0)
    }

    /// Odd levels map to RightToLeft; all others map to LeftToRight.
    pub fn as_bidi_class(self) -> BidiClass {
        if self.0 % 2 == 1 {
            BidiClass::RightToLeft
        } else {
            BidiClass::LeftToRight
        }
    }

    /// Returns true if this level is the `NO_LEVEL` sentinel.
    pub fn removed_by_x9(self) -> bool {
        self.0 == NO_LEVEL
    }

    pub fn max(self, other: Level) -> Level {
        Level(self.0.max(other.0))
    }

    /// The smallest even level greater than this one, or None if that
    /// would exceed MAX_DEPTH.
    pub(crate) fn least_greater_even(self) -> Option<Level> {
        let level = if self.0 % 2 == 0 {
            self.0 + 2
        } else {
            self.0 + 1
        };
        if level as usize > MAX_DEPTH {
            None
        } else {
            Some(Self(level))
        }
    }

    /// The smallest odd level greater than this one, or None if that
    /// would exceed MAX_DEPTH.
    pub(crate) fn least_greater_odd(self) -> Option<Level> {
        let level = if self.0 % 2 == 1 {
            self.0 + 2
        } else {
            self.0 + 1
        };
        if level as usize > MAX_DEPTH {
            None
        } else {
            Some(Self(level))
        }
    }
}
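
To make the even/odd stepping concrete, here is a standalone sketch of the same arithmetic on plain `i8` values rather than the `Level` type. It assumes non-negative input levels and uses checked addition; `MAX_LEVEL` and the free functions are hypothetical names for illustration only.

```rust
/// Highest explicit embedding level permitted (mirrors MAX_DEPTH = 125).
const MAX_LEVEL: i8 = 125;

/// Smallest even level strictly greater than `level`, if within the limit.
/// Assumes `level` is non-negative.
fn least_greater_even(level: i8) -> Option<i8> {
    let step: i8 = if level % 2 == 0 { 2 } else { 1 };
    level.checked_add(step).filter(|&next| next <= MAX_LEVEL)
}

/// Smallest odd level strictly greater than `level`, if within the limit.
/// Assumes `level` is non-negative.
fn least_greater_odd(level: i8) -> Option<i8> {
    let step: i8 = if level % 2 == 0 { 1 } else { 2 };
    level.checked_add(step).filter(|&next| next <= MAX_LEVEL)
}
```

For instance, starting from level 0, a right-to-left embedding moves to level 1 and a left-to-right embedding moves to level 2; from level 124, an odd step still fits (125) while an even step does not (126).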

78
bidi/src/level_stack.rs Normal file

@ -0,0 +1,78 @@
use crate::bidi_class::BidiClass;
use crate::level::{Level, MAX_DEPTH};

#[derive(Debug, Copy, Clone, PartialEq, Eq)]
pub(crate) enum Override {
    Neutral,
    LTR,
    RTL,
}

/// An implementation of the stack/STATUSSTACKELEMENT from bidiref
#[derive(Debug)]
pub(crate) struct LevelStack {
    embedding_level: [Level; MAX_DEPTH],
    override_status: [Override; MAX_DEPTH],
    isolate_status: [bool; MAX_DEPTH],
    /// Current index into the stack arrays above
    depth: usize,
}

impl LevelStack {
    pub fn new() -> Self {
        Self {
            embedding_level: [Level::default(); MAX_DEPTH],
            override_status: [Override::Neutral; MAX_DEPTH],
            isolate_status: [false; MAX_DEPTH],
            depth: 0,
        }
    }

    pub fn depth(&self) -> usize {
        self.depth
    }

    /// Push a new entry. The push is silently ignored if the stack is full.
    pub fn push(&mut self, level: Level, override_status: Override, isolate_status: bool) {
        let depth = self.depth;
        if depth >= MAX_DEPTH {
            return;
        }
        log::trace!(
            "pushing level={:?} override={:?} isolate={} at depth={}",
            level,
            override_status,
            isolate_status,
            depth
        );
        self.embedding_level[depth] = level;
        self.override_status[depth] = override_status;
        self.isolate_status[depth] = isolate_status;
        self.depth += 1;
    }

    /// Pop the top entry. Does nothing if the stack is empty.
    pub fn pop(&mut self) {
        if self.depth > 0 {
            self.depth -= 1;
        }
    }

    /// Embedding level of the top entry. Panics if the stack is empty.
    pub fn embedding_level(&self) -> Level {
        self.embedding_level[self.depth - 1]
    }

    /// Override status of the top entry. Panics if the stack is empty.
    pub fn override_status(&self) -> Override {
        self.override_status[self.depth - 1]
    }

    /// Replace `bc` with the strong class demanded by the current
    /// override status, if any.
    pub fn apply_override(&self, bc: &mut BidiClass) {
        match self.override_status() {
            Override::LTR => *bc = BidiClass::LeftToRight,
            Override::RTL => *bc = BidiClass::RightToLeft,
            Override::Neutral => {}
        }
    }

    /// Isolate status of the top entry. Panics if the stack is empty.
    pub fn isolate_status(&self) -> bool {
        self.isolate_status[self.depth - 1]
    }
}
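
The accessors above index with `self.depth - 1`, so they rely on the caller never querying an empty stack. As one possible alternative shape, offered as a sketch rather than a description of the crate, a `Vec`-backed stack can return `Option` from its accessors instead. The names `Status` and `StatusStack` are hypothetical, and the capacity simply mirrors `MAX_DEPTH` from this file.

```rust
/// Capacity limit, mirroring MAX_DEPTH in level.rs.
const MAX_DEPTH: usize = 125;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Override {
    Neutral,
    Ltr,
    Rtl,
}

/// One directional status entry: level, override and isolate flag.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Status {
    level: i8,
    override_status: Override,
    isolate: bool,
}

#[derive(Debug, Default)]
struct StatusStack {
    entries: Vec<Status>,
}

impl StatusStack {
    /// Push an entry; returns false (and pushes nothing) when full.
    fn push(&mut self, status: Status) -> bool {
        if self.entries.len() >= MAX_DEPTH {
            return false;
        }
        self.entries.push(status);
        true
    }

    /// Remove and return the top entry, or None when empty.
    fn pop(&mut self) -> Option<Status> {
        self.entries.pop()
    }

    /// Copy of the top entry, or None when empty.
    fn top(&self) -> Option<Status> {
        self.entries.last().copied()
    }

    fn depth(&self) -> usize {
        self.entries.len()
    }
}
```

The trade-off is a heap allocation and an `Option` at each call site in exchange for never panicking on an empty stack; the fixed-array layout in the commit stays closer to the bidiref original.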

2189
bidi/src/lib.rs Normal file

File diff suppressed because it is too large Load Diff