Sarah Hoffmann
929a13d4cd
remove comma as name separator
...
Commas are mostly used as part of a name, not to
separate multiple names.
See also #2950.
2023-01-22 22:29:36 +01:00
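The effect of this change can be illustrated with a minimal sketch. The helper `split_names` and its `delimiters` parameter are hypothetical names chosen for illustration, not the project's actual code; the only point shown is that a semicolon still separates names while a comma stays part of the name.

```python
def split_names(value, delimiters=";"):
    """Split a raw name value into individual names.

    Hypothetical helper for illustration only. The comma is deliberately
    not a delimiter, so it remains part of the name.
    """
    names = [value]
    for delim in delimiters:
        # Split every fragment collected so far on the current delimiter.
        names = [part for name in names for part in name.split(delim)]
    # Drop surrounding whitespace and empty fragments.
    return [name.strip() for name in names if name.strip()]


print(split_names("Washington, D.C."))
print(split_names("Main Street;High Street"))
```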
Sarah Hoffmann
56f0d678e3
exclude names ending in :wikipedia from indexing
...
The wikipedia prefix is used for referencing a wikipedia article
for the given tag, not the object, so it is not useful for search.
2023-01-21 11:16:08 +01:00
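A minimal sketch of such a filter, assuming the name tags are available as a plain dictionary. The function `searchable_names` is a hypothetical helper for illustration and does not reflect how the import style implements the exclusion.

```python
def searchable_names(tags):
    """Return only the name tags that should be indexed.

    Hypothetical helper: keys ending in ':wikipedia' point to an article
    about the tag, not the object, so they are left out.
    """
    return {key: value for key, value in tags.items()
            if not key.endswith(":wikipedia")}


tags = {
    "name": "Example Street",
    "name:etymology:wikipedia": "en:Example",
}
print(searchable_names(tags))
```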
Sarah Hoffmann
610af95ed1
remove old import styles
2022-12-23 19:29:07 +01:00
Sarah Hoffmann
200eae3bc0
add tests for examples in lua style documentation
...
And fix all the errors the tests have found.
2022-12-23 17:35:28 +01:00
Sarah Hoffmann
9321e425a4
add documentation for flex style
...
Includes minor adaptations to bring the code in line with the
documentation.
2022-12-23 11:10:40 +01:00
Sarah Hoffmann
2ca83efc36
flex: add other default styles
2022-12-18 10:10:58 +01:00
Sarah Hoffmann
06796745ff
flex: hide compiled matchers
2022-12-18 10:10:58 +01:00
Sarah Hoffmann
093d531509
flex: switch to functions for substyles
...
This gives us a bit more flexibility about the implementation
in the future.
2022-12-18 10:10:58 +01:00
Sarah Hoffmann
a915815e4d
explicit export for functions in flex-base
2022-12-18 10:10:58 +01:00
Sarah Hoffmann
de3c28104c
flex: add combining clean function
2022-12-18 10:10:58 +01:00
Sarah Hoffmann
d9d13a6204
flex: simplify name handling
2022-12-18 10:10:58 +01:00
Sarah Hoffmann
d1f5820711
flex: simplify address configuration
2022-12-18 10:10:58 +01:00
Sarah Hoffmann
7592f8f189
update osm2pgsql (flex not building index)
2022-12-18 10:10:58 +01:00
Sarah Hoffmann
6f51c1ba33
remove code that disables processing of forward dependencies
2022-12-11 19:35:58 +01:00
Sarah Hoffmann
0e186835b9
contract duplicate spaces in transliteration string
...
There are some pathological cases where an isolated letter may
be deleted because it is in itself meaningless. If this happens in
the middle of a sentence, then the transliteration contains two
consecutive spaces. Add a final rule to fix this.
See #2909.
2022-12-02 10:15:02 +01:00
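The problem and the fix can be sketched as follows. This uses a Python regular expression as a stand-in for the final transliteration rule; the function name `contract_spaces` is an assumption made for illustration, and the real rule lives in the tokenizer configuration.

```python
import re


def contract_spaces(text):
    """Replace any run of two or more spaces with a single space.

    Stand-in for the final transliteration rule; not the project's code.
    """
    return re.sub(r" {2,}", " ", text)


# Deleting an isolated letter in the middle leaves two consecutive spaces.
print(repr(contract_spaces("rue  martin")))
```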
Sarah Hoffmann
41e8bddaa9
remove BDD test for tiger:county
...
We no longer rely on the import to strip the tag.
2022-11-23 10:37:27 +01:00
Sarah Hoffmann
fd3dec8efe
add sanitizer for TIGER tags
...
Currently only takes over cleaning the tiger:county data. This was
done by the import until now.
2022-11-23 10:37:27 +01:00
Sarah Hoffmann
b6ff697ff0
add experimental option for enabling forward dependencies
2022-11-21 14:48:00 +01:00
Sarah Hoffmann
d63d7cb9a8
remove dependent territories from country list
...
Removes territories of US, France, Australia and Netherlands from the
country list. These territories have their own country code (which is
why they are in the list in the first place) but are mapped as part of
the admin_level 2 relations for the respective parent countries.
Therefore they never had any places attached. In practical terms, the
change only affects the number of tables created.
2022-11-15 11:37:30 +01:00
Sarah Hoffmann
63a9bc94f7
fix country handling in flex style
...
If the country tag does not match a 2-letter code, it needs to
be dropped.
2022-11-10 15:52:13 +01:00
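The rule can be sketched in a few lines. The flex style itself is written in Lua; this Python version only illustrates the check. The function `clean_country` and the tag key `addr:country` are assumptions chosen for the example.

```python
import re

# Exactly two ASCII letters, e.g. "de" or "US".
TWO_LETTER_CODE = re.compile(r"[A-Za-z]{2}")


def clean_country(tags, key="addr:country"):
    """Return a copy of tags without the country tag if it is not a 2-letter code.

    Hypothetical helper for illustration; the key name is an assumption.
    """
    cleaned = dict(tags)
    value = cleaned.get(key)
    if value is not None and not TWO_LETTER_CODE.fullmatch(value):
        del cleaned[key]
    return cleaned


print(clean_country({"addr:country": "Germany", "addr:city": "Berlin"}))
print(clean_country({"addr:country": "de", "addr:city": "Berlin"}))
```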
Sarah Hoffmann
3683cf7ddc
optimise tag match function
2022-11-10 09:38:25 +01:00
Sarah Hoffmann
51ed55cc32
initial flex import scripts
...
Only implements the extratags style for the moment. Tests pass
for the same behaviour as the gazetteer output. Updates still need
to be done.
2022-11-10 09:37:38 +01:00
Sarah Hoffmann
536f08f33a
ignore 5+ postcodes in the US for now
...
Hierarchical postcodes need a different treatment.
2022-06-24 19:24:22 +02:00
Sarah Hoffmann
e86db3001f
fix postcode pattern for Mozambique
...
Optional groups are not implemented yet.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
ca7b46511d
introduce and use analyzer for postcodes
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
18864afa8a
postcodes: introduce a default pattern for countries without postcodes
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
9cf700e85d
add postcodes for most of the remaining countries
...
Now includes all postcodes that have optional parts.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
9172696324
postcodes: add support for optional spaces
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
49626ba709
add postcode formats with optional country code
...
If the country code is not part of the mandatory output, the
country code filter will do the correct handling.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
28ab2f6048
add postcodes patterns without optional spaces
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
6e0014e138
add postcode patterns for numeric postcodes
...
Adds patterns for countries that have simple numeric-only postcodes.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
8080625747
remove postcodes from countries that don't have them
...
The postcodes will only be removed as a 'computed postcode'; they
are still searchable for the given object.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
21fb501699
add info about countries without a postcode
2022-06-23 23:42:31 +02:00
bgo-eiu
04644102f2
added additional languages for Pakistan in country settings
2022-06-16 06:26:44 -04:00
Sarah Hoffmann
8a67ddcb2b
remove county nodes in Canada from addresses
...
Canada has complete coverage for administrative boundaries on
county level. Removing the county nodes from the addresses avoids errors
due to the widespread doubling of place nodes for city counties.
2022-05-18 10:19:05 +02:00
Sarah Hoffmann
4002bee0c1
make ICU the default tokenizer
2022-05-10 12:02:50 +02:00
Sarah Hoffmann
9d468f6da0
support arbitrary prefixes in country name list
...
This means we can now get rid of the last special cases for names.
2022-05-09 11:55:26 +02:00
Sarah Hoffmann
3a8ddf736e
move country names into separate include files
2022-05-09 11:55:26 +02:00
Sarah Hoffmann
63dc4b39bc
ICU: better letter identification in normalization
...
The Letter class does not include non-spacing marks that can also
have a consonant or vowel meaning, especially in Indian languages.
Use the alnum property instead which includes them all. Also
include the vowel-canceling Virama, which is not a letter by itself
but changes the transliteration.
2022-04-28 18:23:17 +02:00
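The gap in the Letter class can be checked with Python's standard `unicodedata` module. This is only an illustration of the Unicode general categories involved, using Devanagari sample characters picked for the example; it does not reproduce the ICU rules, and Python's own `str.isalnum` is not equivalent to the ICU alnum property.

```python
import unicodedata

# Sample Devanagari characters chosen for illustration.
SAMPLES = {
    "KA (consonant)": "\u0915",
    "VOWEL SIGN U": "\u0941",
    "VIRAMA": "\u094d",
}


def is_letter_class(char):
    """True if the character is in the Unicode Letter (L*) general category."""
    return unicodedata.category(char).startswith("L")


for label, char in SAMPLES.items():
    # The vowel sign and the virama are marks (M*), so a Letter-only
    # filter would drop them even though they change the reading.
    print(label, unicodedata.category(char), is_letter_class(char))
```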
Sarah Hoffmann
fd4ab3f262
Merge pull request #2629 from tareqpi/country-names-yaml-configuration
...
Move default country names into yaml configuration
2022-04-04 09:04:25 +02:00
Tareq Al-Ahdal
7bb7ed468a
fix storing of escape sequences in database
2022-03-24 13:18:44 +08:00
Sarah Hoffmann
42f0282f14
remove special case for operator names
...
The OSM data has been sufficiently cleaned up by now that
the operator no longer needs to be considered a name tag.
Use 'brand' as the searchable alternative.
2022-03-18 10:48:53 +01:00
Tareq Al-Ahdal
456d439e97
Reformatting of country keys
2022-03-18 02:23:11 +08:00
Tareq Al-Ahdal
165d17f7f7
reintroduce 'name:' prefix to country name keys
2022-03-13 18:58:27 +08:00
Tareq Al-Ahdal
8b6652a40b
move default country names into yaml configuration
2022-03-12 15:17:01 +08:00
Marc Tobias
1fcc9717bb
documentation: clarify osm2pgsql isn't in project directory by default
2022-03-10 14:16:12 +01:00
Sarah Hoffmann
f03a05f6bb
add new analyser for housenumbers
...
This analyser makes spaces optional.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
855909b4e9
add 'healthcare' as main tag
...
Given that the tag is most of the time duplicated by an amenity
tag which is already imported, only import it as a fallback when
there is no name.
Fixes #2609.
2022-02-21 11:52:17 +01:00
Sarah Hoffmann
610f2cc254
sanitizer: move helpers into a configuration class
2022-02-07 10:48:00 +01:00
Sarah Hoffmann
a79a3210e6
implement is-a-name option for housenumbers
2022-02-07 09:27:11 +01:00