Nominatim/nominatim/tokenizer
Sarah Hoffmann 52847b61a3 extend ICU config to accommodate multiple analysers
Adds parsing of multiple variant lists from the configuration.
Every entry except one must have a unique 'id' parameter to
distinguish the entries. The entry without id is considered
the default. Currently only the list without an id is used
for analysis.
2021-10-04 16:40:28 +02:00
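The 'id' rule described in the commit message above can be illustrated with a minimal sketch. This is not the code from icu_rule_loader.py: the function name `index_analyzers`, the 'variants' key, and the sample rule strings are assumptions made for illustration. Only the 'id' parameter and the "entry without id is the default" rule come from the commit message.

```python
def index_analyzers(entries):
    """Index analyzer entries by their 'id' (hypothetical helper).

    The single entry without an 'id' is stored under the key None and
    treated as the default. Duplicate ids, or a second entry without
    an id, are rejected.
    """
    analyzers = {}
    for entry in entries:
        name = entry.get('id')  # None marks the default entry
        if name in analyzers:
            label = 'without id' if name is None else f"with id '{name}'"
            raise ValueError(f"More than one analyzer entry {label}.")
        # 'variants' is an assumed key name for the entry's variant list
        analyzers[name] = entry.get('variants', [])
    return analyzers


# Made-up sample configuration: one default entry and one named entry.
config = [
    {'variants': ['~strasse -> str']},
    {'id': 'de', 'variants': ['~weg -> wg']},
]
analyzers = index_analyzers(config)
default_variants = analyzers[None]  # the only list used for analysis so far
```

Keying the default entry on None keeps the uniqueness check for named entries and the "at most one default" check in a single lookup.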
sanitizers add unit tests for new sanitizer functions 2021-10-01 12:27:24 +02:00
__init__.py introduce tokenizer modules 2021-04-30 11:29:57 +02:00
base.py unify ICUNameProcessorRules and ICURuleLoader 2021-10-01 12:27:24 +02:00
factory.py unify ICUNameProcessorRules and ICURuleLoader 2021-10-01 12:27:24 +02:00
icu_name_processor.py unify ICUNameProcessorRules and ICURuleLoader 2021-10-01 12:27:24 +02:00
icu_rule_loader.py extend ICU config to accommodate multiple analysers 2021-10-04 16:40:28 +02:00
icu_tokenizer.py introduce sanitizer step before token analysis 2021-10-01 12:27:24 +02:00
icu_variants.py unify ICUNameProcessorRules and ICURuleLoader 2021-10-01 12:27:24 +02:00
legacy_tokenizer.py unify ICUNameProcessorRules and ICURuleLoader 2021-10-01 12:27:24 +02:00
place_sanitizer.py introduce sanitizer step before token analysis 2021-10-01 12:27:24 +02:00