Sarah Hoffmann
9963261d8d
add type annotations to special phrase importer
2022-07-18 09:54:29 +02:00
Sarah Hoffmann
18b16e06ca
add type annotations for legacy tokenizer
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
e37cfc64d2
add type annotations to ICU tokenizer helper modules
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
d0c44431d0
add typing information for place_info and country_info
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
bce93d60bd
move PlaceInfo into data submodule
This data structure is shared between indexer and tokenizer.
2022-07-06 10:54:47 +02:00
Sarah Hoffmann
344a2bfc1a
add new command for cleaning word tokens
Just pulls outdated housenumbers for the moment.
2022-01-20 20:05:15 +01:00
Sarah Hoffmann
c3788d765e
add consistent SPDX copyright headers
2022-01-03 16:23:58 +01:00
Sarah Hoffmann
7beccb7997
remove unnecessary pass statements
2021-12-02 15:54:24 +01:00
Sarah Hoffmann
e8e2502e2f
make word recount a tokenizer-specific function
2021-10-19 11:21:16 +02:00
Sarah Hoffmann
2a94bfc703
fix argument description for check_database
2021-10-07 09:49:13 +02:00
Sarah Hoffmann
16daa57e47
unify ICUNameProcessorRules and ICURuleLoader
There is no need for the additional layer of indirection that
the ICUNameProcessorRules class adds. The ICURuleLoader can
fill the database properties directly.
2021-10-01 12:27:24 +02:00
Sarah Hoffmann
5e5addcdbf
fix typo
2021-09-29 14:16:09 +02:00
Sarah Hoffmann
231250f2eb
add wrapper class for place data passed to tokenizer
This is mostly for convenience and documentation purposes.
2021-09-29 11:54:07 +02:00
Sarah Hoffmann
90b40fc3e6
define formal public Python interface for tokenizer
This introduces an abstract class for the Tokenizer/Analyzer
for documentation purposes.
2021-08-16 11:41:54 +02:00