Commit Graph

562 Commits

Author SHA1 Message Date
Sarah Hoffmann
9864b191b1 fix various typos 2022-07-31 17:10:35 +02:00
Sarah Hoffmann
51b6d16dc6 overhaul the token analysis interface
The functional split between the two functions is now that the
first one creates the ID that is used in the word table and
the second one creates the variants. There no longer is a
requirement that the ID is the normalized version. We might
later reintroduce the requirement that a normalized version be available
but it doesn't necessarily need to be through the ID.

The function that creates the ID now gets the full PlaceName. That way
it might take into account attributes that were set by the sanitizers.

Finally rename both functions to something more sane.
2022-07-29 15:14:11 +02:00
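The split described in this commit can be illustrated with a minimal sketch. The class and function names (`SimpleAnalyzer`, `make_id`, `make_variants`), the `PlaceName` fields and the `analyzer` attribute are all hypothetical stand-ins, since the commit message does not state the final names; only the division of work (one function produces the word-table ID from the full PlaceName, the other produces the variants) comes from the message.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PlaceName:
    """Stand-in for the real PlaceName; the fields are assumptions."""
    name: str
    kind: str = 'name'
    attr: Dict[str, str] = field(default_factory=dict)


class SimpleAnalyzer:
    """Hypothetical analyzer showing the two-function split."""

    def make_id(self, name: PlaceName) -> str:
        # Receives the full PlaceName, so attributes set by sanitizers
        # can influence the ID that ends up in the word table.
        base = ' '.join(name.name.lower().split())
        lang = name.attr.get('analyzer')
        return f'{base}@{lang}' if lang else base

    def make_variants(self, word_id: str) -> List[str]:
        # Derives the spelling variants from the ID alone.
        base = word_id.split('@', 1)[0]
        variants = [base]
        if base.endswith(' street'):
            variants.append(base[:-len('street')] + 'st')
        return variants
```

Note that the ID here is not simply the normalized name, which matches the relaxed requirement mentioned in the message.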
Sarah Hoffmann
34d27ed45c move PlaceName into the generic data module 2022-07-29 11:42:20 +02:00
Sarah Hoffmann
094100bbf6 harmonize spelling
Stick with the American spelling of Analyze.
2022-07-29 10:52:01 +02:00
Sarah Hoffmann
c8873d34af harmonize interface of token analysis module
The configure() function now receives a Transliterator object instead
of the ICU rules. This harmonizes the parameters with the create
function.
2022-07-29 10:43:07 +02:00
Sarah Hoffmann
f0d640961a add documentation for custom token analysis 2022-07-29 09:41:28 +02:00
Sarah Hoffmann
3746befd88 add documentation for sanitizer interface
Also switches mkdocstrings to 0.18 with the rather unfortunate
consequence that now mkdocstrings-python-legacy is needed as well.
2022-07-28 22:00:29 +02:00
Sarah Hoffmann
d819036daa add support for external token analysis modules 2022-07-25 16:27:22 +02:00
Sarah Hoffmann
6d41046b15 add support for external sanitizer modules 2022-07-25 16:10:19 +02:00
Sarah Hoffmann
7b7203c149 add function for loading plugin modules
Loads modules for configurable code like tokenizers, sanitizers, etc.
Supports internal modules, external libraries and code from the
project directory.
2022-07-25 16:10:10 +02:00
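A loader covering the three sources named in this commit might look roughly like the sketch below. The function name `load_plugin_module`, its parameters and the rule that a name ending in `.py` refers to a file in the project directory are assumptions made for illustration, not the project's actual interface.

```python
import importlib
import importlib.util
from pathlib import Path
from types import ModuleType
from typing import Optional


def load_plugin_module(name: str, internal_package: str,
                       project_dir: Optional[Path] = None) -> ModuleType:
    """Resolve `name` to a module (hypothetical helper)."""
    # 1. Assumed convention: a name ending in '.py' is a file that
    #    lives in the project directory.
    if name.endswith('.py'):
        if project_dir is None:
            raise ValueError('project directory required for file plugins')
        path = project_dir / name
        if not path.is_file():
            raise FileNotFoundError(path)
        spec = importlib.util.spec_from_file_location(path.stem, path)
        if spec is None or spec.loader is None:
            raise ImportError(f'cannot load {path}')
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        return module

    # 2. Try a module shipped inside the application's own package.
    try:
        return importlib.import_module(f'{internal_package}.{name}')
    except ModuleNotFoundError:
        pass

    # 3. Fall back to an installed external library.
    return importlib.import_module(name)
```

In this sketch internal modules take precedence over external libraries of the same name; a real implementation may choose a different order or report import errors raised inside an internal module more carefully.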
Kian-Meng Ang
f5e52e748f docs: fix typos 2022-07-20 22:05:31 +08:00
Sarah Hoffmann
5aad105c73 add explicit cast for fetchone 2022-07-18 10:18:51 +02:00
Sarah Hoffmann
83054af46f remove typing_extensions requirement
The typing_extensions package is only necessary now when running mypy.
It won't be used at runtime anymore.
2022-07-18 09:55:58 +02:00
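One common way to keep a typing-only package out of the runtime path is to import it under `typing.TYPE_CHECKING` and substitute a plain stand-in otherwise. The sketch below shows that pattern; whether the commit uses exactly this approach is an assumption, and the `StatusRow` class is purely illustrative.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only evaluated by type checkers such as mypy.
    from typing_extensions import TypedDict
else:
    # At runtime a plain dict is enough; typing_extensions is never imported.
    TypedDict = dict


class StatusRow(TypedDict):
    """Illustrative record type; the field names are assumptions."""
    date: str
    indexed: bool


row = StatusRow(date='2022-07-18', indexed=True)
```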
Sarah Hoffmann
a849f3c9ec add type annotations for command line functions 2022-07-18 09:55:54 +02:00
Sarah Hoffmann
25d854dc5c add type annotations for Tiger import function 2022-07-18 09:54:29 +02:00
Sarah Hoffmann
9963261d8d add type annotations to special phrase importer 2022-07-18 09:54:29 +02:00
Sarah Hoffmann
459ab3bbdc add type annotations to database check functions 2022-07-18 09:54:29 +02:00
Sarah Hoffmann
a21d4d3ac4 add type annotations for database import functions 2022-07-18 09:54:29 +02:00
Sarah Hoffmann
4da1f0da6f add type annotations for migrations 2022-07-18 09:54:29 +02:00
Sarah Hoffmann
17bbe2637a add type annotations to tool functions 2022-07-18 09:54:27 +02:00
Sarah Hoffmann
6c6bbe5747 add type annotations for ICU tokenizer 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
18b16e06ca add type annotations for legacy tokenizer 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
e37cfc64d2 add type annotations to ICU tokenizer helper modules 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
d35e3c25b6 add type annotations for token analysis
No annotations for ICU types yet.
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
62eedbb8f6 add type hints for sanitizers 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
5617bffe2f add type annotations for indexer 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
8adab2c6ca add typing information for postcode formatter 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
d0c44431d0 add typing information for place_info and country_info 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
282a61ce51 add typing information for utils submodule 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
7a1d22ff15 type annotations for non-blocking DB connection 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
0dff71a410 add type annotations for SQL preprocessor 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
26f30bff28 add type annotation to DB utils
As a cursor is needed as type, make this a public type.
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
e6775e713c add typing information to DB properties 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
69f9122bef add typing annotations for DB status module
Requires TypedDict, which is only available from Python 3.8. Therefore
require typing_extensions to make the functions available for
earlier Python versions.
2022-07-18 09:47:57 +02:00
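The compatibility approach described in this commit can be sketched as a version-dependent import. The `StatusInfo` type and its fields are hypothetical and are not taken from the DB status module itself.

```python
import sys

if sys.version_info >= (3, 8):
    from typing import TypedDict
else:
    # Backport for Python versions that lack typing.TypedDict.
    from typing_extensions import TypedDict


class StatusInfo(TypedDict):
    """Illustrative status record; the field names are assumptions."""
    status: int
    message: str


def make_status(status: int, message: str) -> StatusInfo:
    # A TypedDict value is an ordinary dict at runtime.
    return {'status': status, 'message': message}
```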
Sarah Hoffmann
845c43137a add type annotations to freeze functions 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
aaf2b6032e fix uses of config.get_path() to expect None 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
c4928c646d define type for environment dictionaries 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
f12fe54d2b restrict return type more 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
fc03c0266a add type annotations to exec_utils 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
681aad7e0d avoid issues with Python < 3.9 and linting 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
f22fa992f7 move complex typing annotations to extra file 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
992e6f72cf type annotations for DB utils 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
e6ee3c772c type annotations for DB connection 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
95ed95c616 add type annotations to config module 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
bf36f33e79 add type annotations for version.py 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
9b636fdc10 mypy: minimal annotations to enable a clean run 2022-07-18 09:47:57 +02:00
Sarah Hoffmann
4b12d52ef5 convert admin --analyse-indexing to new indexing method
A proper run of indexing requires the place information from the
analyzer. Add the pre-processing of place data, so the right
information is handed into the update function.
2022-07-07 16:20:08 +02:00
Sarah Hoffmann
856925d19b remove analyze() from PlaceInfo class
The function creates circular dependencies.
2022-07-07 12:06:58 +02:00
Sarah Hoffmann
cbbcbb1fd7 move country_info into data submodule 2022-07-06 11:08:36 +02:00
Sarah Hoffmann
bce93d60bd move PlaceInfo into data submodule
This data structure is shared between indexer and tokenizer.
2022-07-06 10:54:47 +02:00