Sarah Hoffmann
996026e5ed
provide full URL in more field
...
This is a regression against the PHP version.
Fixes #3138.
2023-08-06 17:50:02 +02:00
Sarah Hoffmann
afdbdb02a1
do not lookup by address vector when only few tokens are available
...
Names of countries and states are exceedingly rare in the word count
but are very frequent in the address. A short name has the danger
of producing too many results.
2023-08-02 09:25:47 +02:00
Sarah Hoffmann
252fe42612
Merge pull request #3122 from miku0/sanitizer-final
...
Adds sanitizer for Japanese addresses to correspond to block address
2023-08-01 10:38:58 +02:00
miku0
67e1c7dc72
Moved KANJI_MAP to icu-rules
2023-07-31 11:57:49 +00:00
miku0
4d61cc87cf
Add the test of reconbine_place
2023-07-31 02:39:56 +00:00
Sarah Hoffmann
e523da9e12
reintroduce file logging for Python frontend
2023-07-30 19:58:00 +02:00
miku0
67706cec4e
add @fail-legacy
2023-07-27 07:33:53 +00:00
Sarah Hoffmann
9448c5e16f
add tests for new arm and export Python functions
2023-07-26 11:09:52 +02:00
miku0
0722495434
add japanese sanitizer
2023-07-26 07:54:58 +00:00
Sarah Hoffmann
d545c6d73c
mostly remove php-cgi requirement
...
This is now only needed for BDD tests against the php API.
2023-07-26 00:10:11 +02:00
Sarah Hoffmann
f69fea4210
remove now unused run_api_script function
2023-07-25 22:45:29 +02:00
Sarah Hoffmann
4cd0a4ced4
remove now unused run_legacy_script()
2023-07-25 21:39:23 +02:00
Sarah Hoffmann
0804cc0cff
port export function to Python
...
Some of the parameters have been removed as they don't make sense
anymore.
2023-07-25 21:39:23 +02:00
Sarah Hoffmann
faeee7528f
move warm script to python code
2023-07-25 21:39:23 +02:00
Sarah Hoffmann
66ecb56cea
add tests for new endpoints
2023-07-25 10:57:19 +02:00
Sarah Hoffmann
927d2cc824
do not split names from typed phrases
...
When phrases are typed, they should only contain exactly one term.
2023-07-17 20:09:08 +02:00
Sarah Hoffmann
cc45930ef9
avoid lookup via partials on frequent words
...
Drops expensive searches via partials on terms like 'rue de'.
See #2979.
2023-07-06 12:16:57 +02:00
Sarah Hoffmann
ce17b0eeca
Merge pull request #3101 from lonvia/custom-geometry-type
...
Improve use of SQLAlchemy statement cache with search queries
2023-07-03 11:03:26 +02:00
Sarah Hoffmann
82216ebf8b
always run function update on migrations
...
This means that we can have migrations which require nothing but
an update of the functions.
2023-07-01 20:18:59 +02:00
Sarah Hoffmann
9f6f12cfeb
move search to bind parameters
2023-07-01 18:03:07 +02:00
Sarah Hoffmann
a873f260cf
fix merging of linked names into unnamed boundaries
...
The NULL value of the boundaries' name field was erasing all
content when used in SQL operations.
2023-06-30 22:14:11 +02:00
Sarah Hoffmann
d7a3039c2a
also switch legacy tokenizer to new street/place choice behaviour
2023-06-30 17:03:17 +02:00
Sarah Hoffmann
645ea5a057
use information from tokenizer to determine street vs. place address
...
So far the SQL logic used the information from the address field
to determine if an address is attached to a street or place.
This changes the logic to use the information provided in the
token_info. This allows sanitizers to enforce a certain parenting
without changing the visible address information.
2023-06-30 11:08:25 +02:00
Sarah Hoffmann
2755ebe883
Merge pull request #3094 from lonvia/fix-failing-bdd-tests
...
Add BDD tests against Python frontend to CI
2023-06-22 22:28:31 +02:00
Sarah Hoffmann
2d05ff0190
slightly adapt postcode tests
2023-06-22 16:51:59 +02:00
Sarah Hoffmann
0d338fa4c0
bdd: fix faking HTTP headers for python web frameworks
2023-06-22 14:00:33 +02:00
mtmail
15a66e7b7d
Merge branch 'osm-search:master' into check-database-on-frozen-database
2023-06-22 12:14:55 +02:00
Marc Tobias
2337cc653b
check-database on frozen db shouldn't recommend indexing
2023-06-21 17:47:57 +02:00
Sarah Hoffmann
9bc5be837b
remove useless check
...
Found by new mypy version.
2023-06-21 11:56:39 +02:00
Sarah Hoffmann
b79d5494f9
remove support for sanic framework
...
There is no performance gain over falcon or starlette but the special
structure of sanic makes it hard to have exchangeable code.
2023-06-21 10:53:57 +02:00
Sarah Hoffmann
36df56b093
fix header name for browser languages
2023-06-20 11:56:43 +02:00
Sarah Hoffmann
d0a1e8e311
tweak postcode search
...
Give a preference to left-right reading, i.e. <postcode>,<address>
prefers a postcode search while <address>,<postcode> rather does
an address search.
Also exclude non-addressables, countries and states from results when a
postcode is contained in the query.
2023-06-20 11:56:43 +02:00
Sarah Hoffmann
1f83efa8f2
Merge pull request #3086 from lonvia/close-connection-on-replication
...
Close database connections while waiting for the next update cycle
2023-06-19 15:48:00 +02:00
Sarah Hoffmann
6f3339cc49
close DB connection when waiting for next update cycle
2023-06-19 12:02:51 +02:00
Sarah Hoffmann
771be0e056
do not fail php script generation when curly braces are present
...
Fixes #3084.
2023-06-19 11:23:30 +02:00
Sarah Hoffmann
41bf162306
remove tests for old PHP cli commands
2023-05-26 17:36:05 +02:00
Sarah Hoffmann
8f299838f7
fix various failing BDD tests
2023-05-26 15:08:48 +02:00
Sarah Hoffmann
146a0b29c0
add support for search by housenumber
2023-05-26 14:10:57 +02:00
Sarah Hoffmann
371a780ef4
add server fronting for search endpoint
...
This also implements some of the quirks of free-text search of the
V1 API, in particular, search for categories and coordinates.
2023-05-26 11:40:45 +02:00
Sarah Hoffmann
0608cf1476
switch CLI search command to python implementation
2023-05-24 22:54:54 +02:00
Sarah Hoffmann
f335e78d1e
make localisation of results explicit
...
Localisation was previously done as part of the formatting but might
also be useful on its own when working with the results directly.
2023-05-24 18:12:34 +02:00
Sarah Hoffmann
dcfb228c9a
add API functions for search functions
...
Search is now split into three functions: for free-text search,
for structured search and for search by category. Note that the
free-text search does not have as many hidden features like
coordinate search. Use the search parameters for that.
2023-05-24 18:05:43 +02:00
Sarah Hoffmann
dc99bbb0af
implement actual database searches
2023-05-24 13:52:31 +02:00
Sarah Hoffmann
c42273a4db
implement search builder
2023-05-23 11:23:44 +02:00
Sarah Hoffmann
3bf489cd7c
implement token assignment
2023-05-22 15:49:03 +02:00
Sarah Hoffmann
d8240f9ee4
add query analyser for legacy tokenizer
2023-05-22 11:07:14 +02:00
Sarah Hoffmann
2448cf2a14
add factory for query analyzer
2023-05-22 09:23:19 +02:00
Sarah Hoffmann
004883bdb1
query analyzer for ICU tokenizer
2023-05-22 08:46:19 +02:00
Sarah Hoffmann
ff66595f7a
add data structure for tokenized query
2023-05-21 09:30:57 +02:00
Sarah Hoffmann
d9d8b9c526
add tests for parameter converter
2023-05-18 18:09:07 +02:00
Sarah Hoffmann
bef5cea48e
switch API parameters to keyword arguments
...
This switches the input parameters for API calls to a generic
keyword argument catch-all which is then loaded into a dataclass
where the parameters are checked and forwarded to the internal
function.
The dataclass gives more flexibility with the parameters and makes
it easier to reuse common parameters for the different API calls.
2023-05-18 17:42:23 +02:00
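The keyword-argument pattern described in this commit can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class name `LookupDetails`, its fields, the `from_kwargs` helper and the `search` function are all hypothetical.

```python
import dataclasses
from typing import Any, Dict, Optional


@dataclasses.dataclass
class LookupDetails:
    """Hypothetical parameter set that several API calls could share."""
    address_details: bool = False
    max_results: int = 10
    language: Optional[str] = None

    def __post_init__(self) -> None:
        # Parameters are checked once, at the point where they are loaded.
        if self.max_results < 1:
            raise ValueError("max_results must be at least 1")

    @classmethod
    def from_kwargs(cls, kwargs: Dict[str, Any]) -> "LookupDetails":
        # Assumption for this sketch: unknown keywords are rejected.
        known = {f.name for f in dataclasses.fields(cls)}
        unknown = set(kwargs) - known
        if unknown:
            raise TypeError(f"unknown parameters: {sorted(unknown)}")
        return cls(**kwargs)


def search(query: str, **params: Any) -> LookupDetails:
    # The public call only has a catch-all; the dataclass does the checking.
    details = LookupDetails.from_kwargs(params)
    # A real implementation would now hand `details` to an internal function.
    return details
```

Calling `search("berlin", max_results=3)` yields a checked `LookupDetails` object, and adding a new common parameter only requires a new dataclass field.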
Marc Tobias
e5f332bd71
when adding Tiger data, check first if database is in frozen state
2023-05-08 14:35:30 +02:00
Sarah Hoffmann
5751686fdc
Merge pull request #3006 from biswajit-k/generalize-filter
...
generalize filter function for sanitizers
2023-04-11 19:20:08 +02:00
Sarah Hoffmann
60c1301fca
fix a number of corner cases with interpolation splitting
...
Snapping a line to a point before splitting was meant to ensure
that the split point is really on the line. However, ST_Snap() does
not always behave well for this case. It may shorten the interpolation
line in some cases with the result that two housenumber points
suddenly fall on the same point. It might also shorten the line down
to a single point which then makes ST_Split() crash.
Switch to a combination of ST_LineLocatePoint and ST_LineSubstring
instead, which guarantees to keep the original geometry. Explicitly
handle the corner cases, where the split point falls on the beginning
or end of the line.
2023-04-06 16:54:00 +02:00
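The locate-then-substring idea from this commit can be illustrated outside the database. The sketch below is a pure-Python analogue on planar coordinates, not the SQL used in the commit; `locate_point`, `substring` and `split_line` are hypothetical helpers that only mimic what ST_LineLocatePoint and ST_LineSubstring do.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def _interp(a: Point, b: Point, t: float) -> Point:
    # Written so that t=0 yields exactly a and t=1 yields exactly b.
    return (a[0] * (1 - t) + b[0] * t, a[1] * (1 - t) + b[1] * t)


def locate_point(line: List[Point], pt: Point) -> float:
    """Fraction (0..1) along `line` of the position closest to `pt`."""
    total = sum(math.dist(a, b) for a, b in zip(line, line[1:]))
    best_dist, best_pos, walked = math.inf, 0.0, 0.0
    for a, b in zip(line, line[1:]):
        seg_len = math.dist(a, b)
        t = 0.0
        if seg_len > 0:
            # Project pt onto the segment and clamp to its end points.
            t = ((pt[0] - a[0]) * (b[0] - a[0])
                 + (pt[1] - a[1]) * (b[1] - a[1])) / seg_len ** 2
            t = min(1.0, max(0.0, t))
        dist = math.dist(pt, _interp(a, b, t))
        if dist < best_dist:
            best_dist, best_pos = dist, walked + t * seg_len
        walked += seg_len
    return best_pos / total if total > 0 else 0.0


def substring(line: List[Point], start: float, end: float) -> List[Point]:
    """Part of `line` between the fractions `start` and `end`."""
    total = sum(math.dist(a, b) for a, b in zip(line, line[1:]))
    lo, hi = start * total, end * total
    out: List[Point] = []
    walked = 0.0
    for a, b in zip(line, line[1:]):
        seg_len = math.dist(a, b)
        seg_start, seg_end = walked, walked + seg_len
        walked = seg_end
        if seg_len == 0 or seg_end < lo or seg_start > hi:
            continue
        for pos in (max(lo, seg_start), min(hi, seg_end)):
            p = _interp(a, b, (pos - seg_start) / seg_len)
            if not out or out[-1] != p:
                out.append(p)
    return out


def split_line(line: List[Point], pt: Point) -> List[List[Point]]:
    frac = locate_point(line, pt)
    # Corner cases: the split point falls on the beginning or end of the
    # line, so there is nothing to cut and the line is returned whole.
    if frac <= 0.0 or frac >= 1.0:
        return [line]
    return [substring(line, 0.0, frac), substring(line, frac, 1.0)]
```

Because the two parts are cut from the original vertices, the line itself is never moved or shortened, which is the property the commit message relies on.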
Sarah Hoffmann
1dce2b98b4
switch CLI lookup command to Python implementation
2023-04-03 14:40:41 +02:00
Sarah Hoffmann
86c4897c9b
add lookup call to server glue
2023-04-03 14:40:41 +02:00
Sarah Hoffmann
2237603677
add tests for new lookup API
2023-04-03 14:40:41 +02:00
Sarah Hoffmann
6e81596609
rename lookup() API to details and add lookup call
...
The initial plan to serve /details and /lookup endpoints from
the same API call turned out to be impractical, so the API now
also has separate functions for both.
2023-04-03 14:40:41 +02:00
Sarah Hoffmann
ed9cd9f0e5
bdd: disable detail tests searching by place ID
...
Place IDs are not stable and cannot be used in tests.
2023-04-03 10:07:06 +02:00
Sarah Hoffmann
7d30dbebc5
flex style: reinstate postcode boundaries
...
Postcode boundaries don't have a name, so need to be imported
unconditionally.
2023-04-03 09:17:50 +02:00
biswajit-k
8f03c80ce8
generalize filter for sanitizers
2023-04-01 19:24:09 +05:30
Sarah Hoffmann
683a3cb3ec
call osm2pgsql postprocessing flush_deleted_places() when adding data
2023-03-31 18:05:07 +02:00
Sarah Hoffmann
1feac2069b
add BDD tests for new layers parameter
2023-03-30 09:54:55 +02:00
Sarah Hoffmann
26ee6b6dde
python reverse: add support for point geometries in interpolations
2023-03-29 17:21:33 +02:00
Sarah Hoffmann
6c67a4b500
switch reverse CLI command to Python implementation
2023-03-26 18:09:33 +02:00
Sarah Hoffmann
86b43dc605
make sure PHP and Python reverse code does the same
...
The only allowable difference is precision of coordinates. Python uses
a precision of 7 digits where possible, which corresponds to the
precision of OSM data.
Also fixes some smaller bugs found by the BDD tests.
2023-03-26 16:21:43 +02:00
Sarah Hoffmann
35b52c4656
add output formatters for ReverseResults
...
These formatters are written in a way that they can be reused for
search results later.
2023-03-25 15:45:03 +01:00
Sarah Hoffmann
2f54732500
python: implement reverse lookup function
...
The implementation follows for the most part the PHP code but introduces an
additional layer parameter with which the kind of places to be returned
can be restricted. This replaces the hard-coded exclusion lists.
2023-03-23 22:38:37 +01:00
Sarah Hoffmann
1facfd019b
api: generalize error handling
...
Return a consistent error response which takes into account the chosen
content type. Also adds tests for V1 server glue.
2023-03-23 10:16:50 +01:00
Sarah Hoffmann
00e3a752c9
split SearchResult type
...
Use adapted types for the different result types. This makes it
easier to have adapted output formatting and means there are only
result fields that are filled.
2023-03-23 10:16:50 +01:00
Sarah Hoffmann
81430bd3bd
bdd: be more fuzzy with coordinate comparisons
2023-03-09 22:37:45 +01:00
Sarah Hoffmann
93203f355a
avoid recent Python dialect
2023-03-09 20:57:43 +01:00
Sarah Hoffmann
3f2296e3ea
bdd: extend reverse API tests for format checks
...
Reorganise the API reverse tests and extend the checks for the
output format, testing for all expected fields.
2023-03-09 20:20:50 +01:00
Sarah Hoffmann
2b7eb4906a
bdd: add tests for valid debug output
2023-03-09 20:10:51 +01:00
Sarah Hoffmann
db1aa4d02e
bdd: replace old formatting strings
2023-03-09 19:49:55 +01:00
Sarah Hoffmann
ad88d7a3e0
bdd: more format checks for reverse XML
2023-03-09 19:40:24 +01:00
Sarah Hoffmann
e42c1c9c7a
bdd: new step variant 'result contains in field'
...
This replaces the + notation for recursing into result dictionaries.
2023-03-09 19:31:21 +01:00
Sarah Hoffmann
556bb2386d
bdd: factor out computation of result to-check lists
2023-03-09 18:01:45 +01:00
Sarah Hoffmann
1e58cef174
bdd: replace property_list construct with standard check functions
2023-03-09 17:56:28 +01:00
Sarah Hoffmann
01010e443f
bdd: remove special case for osm_type field
...
The fuzzy field check may hide formatting errors. Use 'osm' when
only caring about the content.
2023-03-09 17:44:34 +01:00
Sarah Hoffmann
da0a7a765e
bdd: reorganise field comparisons
...
Move comparison of Field values from assert_field() into a
comparator class. Replace BadRowValueAssert with a simpler
check_row() function.
2023-03-09 17:05:05 +01:00
Sarah Hoffmann
9769a0dcdb
bdd: use new check_for_attributes() function also in steps
2023-03-09 16:44:07 +01:00
Sarah Hoffmann
fbff4fa218
bdd: fully check correctness of geojson and geocodejson
...
Parse code now checks presence of all required fields and exports
all fields for inspection.
2023-03-09 16:36:46 +01:00
Sarah Hoffmann
d17ec56e54
bdd: remove OrderedDict
...
Dicts are guaranteed to keep insertion order since Python 3.7, making
use of OrderedDict moot.
2023-03-09 16:08:39 +01:00
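A short illustration of the language guarantee this commit relies on. The field names are made up for the example; the remaining behavioural difference shown at the end is that `OrderedDict` equality is order-sensitive while plain `dict` equality is not.

```python
from collections import OrderedDict

# Plain dicts keep insertion order (guaranteed since Python 3.7).
fields = {}
fields["osm_type"] = "N"
fields["osm_id"] = 1
fields["name"] = "example"
keys_in_order = list(fields)

# One difference that remains: equality between two OrderedDicts
# takes order into account, equality between plain dicts does not.
plain_equal = {"a": 1, "b": 2} == {"b": 2, "a": 1}
ordered_equal = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)
```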
biswajit-k
ca149fb796
Adds sanitizer for preventing certain tags from entering the search index based on parameters
...
fix: pylint error
added docs for delete tags sanitizer
fixed typos in docs and code comments
fix: python typechecking error
fixed rank address type
Revert "fixed typos in docs and code comments"
This reverts commit 6839eea755a87f557895f30524fb5c03dd983d60.
added default parameters and refactored code
added test for all parameters
2023-03-09 14:18:39 +05:30
Sarah Hoffmann
412ead5f2d
adapt PHP tests for debug output
2023-02-20 16:23:28 +01:00
Sarah Hoffmann
d574ceb598
restrict place rank inheritance to address items
...
Place tags must have no influence on street- or POI-level
objects.
2023-02-17 16:25:26 +01:00
Sarah Hoffmann
ee0c5e24bb
add a WKB decoder for the Point class
...
This allows returning point geometries from the database and makes
the SQL a bit simpler.
2023-02-16 17:29:56 +01:00
Sarah Hoffmann
8557105c40
add debug output for unit tests
...
This uses the debug output facility meant for pretty HTML output
to give us debugging output for the unit tests.
2023-02-14 11:57:37 +01:00
Sarah Hoffmann
42c3754dcd
add tests for details result formatting and trim results
...
Values that are None are no longer included in the output to save
a bit of bandwidth.
2023-02-04 21:22:22 +01:00
Sarah Hoffmann
b742200442
expand details BDD tests
...
There are now minor differences in the output between PHP and
Python versions, so introduce specific tests.
2023-02-04 21:22:22 +01:00
Sarah Hoffmann
3ac70f7cc2
implement details endpoint in Python servers
2023-02-04 21:22:22 +01:00
Sarah Hoffmann
104722a56a
switch details cli command to new Python implementation
2023-02-04 21:22:22 +01:00
Sarah Hoffmann
1924beeb20
add lookup of postcode data
2023-02-04 21:22:22 +01:00
Sarah Hoffmann
70f6f9a711
add lookup of tiger data
2023-02-04 21:22:22 +01:00
Sarah Hoffmann
f1ceefe9a6
add lookup of address interpolations
2023-02-04 21:22:22 +01:00
Sarah Hoffmann
189f74a40d
add unit tests for lookup function
2023-02-04 21:22:22 +01:00
Sarah Hoffmann
370c9b38c0
improve scaffolding for API unit tests
...
Use the static table definition to create the test database.
Add helper function to simplify filling the tables.
2023-02-04 21:22:22 +01:00
Sarah Hoffmann
16b6484c65
add property cache for API
...
This caches results from querying nominatim_properties.
2023-01-30 09:36:17 +01:00
Sarah Hoffmann
77bec1261e
add streaming json writer for JSON output
2023-01-25 15:05:33 +01:00
Sarah Hoffmann
8f4426fbc8
reorganize code around result formatting
...
Code is now organized by api version. So formatting has moved to
the api.v1 module. Instead of holding a separate ResultFormatter
object per result format, simply move the functions to the
formatter collector and hand in the requested format as a parameter.
Thus reorganized, the api.v1 module can export three simple functions
for result formatting which in turn makes the code that uses
the formatters much simpler.
2023-01-24 17:20:51 +01:00
Sarah Hoffmann
32c1e59622
reorganize api submodule
...
Use a directory for the submodule where the __init__ file contains
the public API. This makes it easier to separate public interface
from the internal implementation.
2023-01-24 13:28:04 +01:00
Sarah Hoffmann
3cc357bffa
Merge pull request #2955 from lonvia/fix-importance-refresh
...
Fix importance recalculation
2023-01-23 09:07:43 +01:00
Sarah Hoffmann
ce9ed993c8
fix importance recalculation
...
The signature of the compute_importance() function has changed.
2023-01-22 22:32:16 +01:00
Sarah Hoffmann
929a13d4cd
remove comma as name separator
...
Commas are most of the time used as a part of a name, not to
separate multiple names.
See also #2950.
2023-01-22 22:29:36 +01:00
Sarah Hoffmann
5f4e98e0d9
update Makefile in test directory
2023-01-09 20:49:33 +01:00
Sarah Hoffmann
a72e2ecb3f
update dependencies for Actions
2023-01-03 10:03:00 +01:00
Sarah Hoffmann
0c47558729
convert version to named tuple
...
Also return the new NominatimVersion rather than a string in the
status result.
2023-01-03 10:03:00 +01:00
Sarah Hoffmann
93b9288c30
fix error message for non-existing database
2023-01-03 10:03:00 +01:00
Sarah Hoffmann
9d31a67116
add unit tests for new Python API
2023-01-03 10:03:00 +01:00
Sarah Hoffmann
7219ee6532
extend BDD API tests to query via Python frameworks
...
A new config option ENGINE allows choosing between php and any of the
supported Python engines.
2023-01-03 10:03:00 +01:00
Sarah Hoffmann
018ef5bd53
bdd: recreate project directory for every run
2022-12-23 18:36:41 +01:00
Sarah Hoffmann
200eae3bc0
add tests for examples in lua style documentation
...
And fix all the errors the tests have found.
2022-12-23 17:35:28 +01:00
Sarah Hoffmann
89a34e7508
adapt tests for new lua styles
2022-12-19 17:32:28 +01:00
Sarah Hoffmann
a915815e4d
explicit export for functions in flex-base
2022-12-18 10:10:58 +01:00
Sarah Hoffmann
2231401483
clean up uses of cli.nominatim()
...
They should not hand in data paths anymore.
2022-11-27 15:27:04 +01:00
Sarah Hoffmann
2abe9e6fd9
use data paths from new nominatim.paths
2022-11-27 12:15:41 +01:00
Sarah Hoffmann
0ed60d29cb
remove NOMINATIM_NOMINATIM_TOOL variable
...
This was used by the old PHP scripts to call the Python tool.
With the scripts now gone, the variable can be removed.
2022-11-26 16:40:20 +01:00
Sarah Hoffmann
41e8bddaa9
remove BDD test for tiger:county
...
We no longer rely on the import to strip the tag.
2022-11-23 10:37:27 +01:00
Sarah Hoffmann
fd3dec8efe
add sanitizer for TIGER tags
...
Currently only takes over cleaning the tiger:county data. This was
done by the import until now.
2022-11-23 10:37:27 +01:00
Sarah Hoffmann
c9ff7d2130
drop illegal values for addr:interpolation on update
2022-11-18 17:26:56 +01:00
Sarah Hoffmann
52456230cc
Merge pull request #2887 from lonvia/lookup-linked-places
...
Add support for lookup of linked places
2022-11-17 13:35:53 +01:00
Sarah Hoffmann
c4b13f2b7f
add support for lookup of linked places
2022-11-16 21:34:45 +01:00
Sarah Hoffmann
4f05a03d13
handle associatedStreet relations with multiple streets
...
When an associatedStreet relation has multiple street members,
always take the closest one. Avoid geometry operations for
the frequent case that there is only one street.
2022-11-16 17:25:51 +01:00
Sarah Hoffmann
93ada250f7
bdd: add tests for osm2pgsql update of postcode nodes
2022-11-14 17:27:04 +01:00
Sarah Hoffmann
d8e3ba3b54
bdd: add osm2pgsql tests for updating interpolations
2022-11-14 16:57:31 +01:00
Sarah Hoffmann
a46348da38
bdd: test placex content when updating with osm2pgsql
2022-11-14 14:48:44 +01:00
Sarah Hoffmann
36cf0eb922
reorganize handling of place type changes
...
Always replace existing entries in place, never delete them because
a direct delete will cause conflicts.
2022-11-14 13:57:26 +01:00
Sarah Hoffmann
63a9bc94f7
fix country handling in flex style
...
If the country tag does not match a 2-letter code, it needs to
be dropped.
2022-11-10 15:52:13 +01:00
Sarah Hoffmann
2dafc4cf4f
remove tests that differ between lua and gazetteer versions
2022-11-10 15:51:55 +01:00
Sarah Hoffmann
68d09f9cad
node locations must be stable for osm2pgsql update tests
2022-11-10 11:11:45 +01:00
Sarah Hoffmann
b98d3d3f00
bdd: extend osm2pgsql update tests
...
Now also checks for correct indexing state of placex table.
2022-11-10 09:38:25 +01:00
Sarah Hoffmann
2fac507453
change updates to handle delete/insert workflow
...
This makes Nominatim compatible with osm2pgsql's default update
modus operandi of deleting and reinserting data. Deletes are diverted
into a TODO table instead of executing them. When data is reinserted,
the corresponding entry in the TODO table is deleted. After updates are
finished, the remaining entries in the TODO table are executed, doing
the same work as the delete trigger did before.
The new behaviour also works against the gazetteer output with its
insert-only mechanism.
2022-11-10 09:38:23 +01:00
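The deferred-delete workflow described above can be modelled in a few lines. This is a toy in-memory sketch, not the SQL triggers from the commit: the `PlaceStore` class and its attributes are hypothetical, and the method name `flush_deleted_places` is merely borrowed from the postprocessing step mentioned elsewhere in this log.

```python
from typing import Dict, Set


class PlaceStore:
    """Toy stand-in for the place table plus the TODO table."""

    def __init__(self) -> None:
        self.places: Dict[int, dict] = {}   # osm_id -> tags
        self.to_delete: Set[int] = set()    # pending deletes ("TODO table")

    def delete(self, osm_id: int) -> None:
        # Do not delete right away, only remember the request.
        if osm_id in self.places:
            self.to_delete.add(osm_id)

    def insert(self, osm_id: int, tags: dict) -> None:
        # A re-insert cancels the pending delete; the entry is replaced in place.
        self.to_delete.discard(osm_id)
        self.places[osm_id] = tags

    def flush_deleted_places(self) -> None:
        # After the update run: whatever is still pending really was deleted.
        for osm_id in self.to_delete:
            del self.places[osm_id]
        self.to_delete.clear()


store = PlaceStore()
store.insert(1, {"name": "old"})
store.insert(2, {"name": "gone"})

# One update cycle: object 1 is modified (delete + reinsert), object 2 is removed.
store.delete(1)
store.insert(1, {"name": "new"})
store.delete(2)
store.flush_deleted_places()
```

With an insert-only input, `delete()` is never called, so the flush is a no-op and the same code path still works.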
Sarah Hoffmann
51ed55cc32
initial flex import scripts
...
Only implements the extratags style for the moment. Tests pass
for the same behaviour as the gazetteer output. Updates still need
to be done.
2022-11-10 09:37:38 +01:00
Sarah Hoffmann
de2a3bd5f8
bdd tests: make import style configurable
...
The switch is for development. Tests are not guaranteed to still
work when run with anything but the 'extratags' style.
2022-11-10 09:37:38 +01:00
Sarah Hoffmann
981e9700be
add osm2pgsql gazetteer tests
...
This ports the gazetteer tests from osm2pgsql to BDD tests.
2022-11-10 09:37:38 +01:00
Sarah Hoffmann
5f6dcd36ed
fix flaky API test
...
The search 'landstr' produces many duplicates so that with
some bad luck 4 or fewer results may appear. Disable deduplication
to make it more predictable.
2022-10-05 15:16:14 +02:00
Sarah Hoffmann
5877b69d51
do not run unit test when postgis_raster is not available
2022-10-01 11:01:49 +02:00
Sarah Hoffmann
5ec2c1b712
adapt unit tests to changed function names
2022-10-01 11:01:49 +02:00
Sarah Hoffmann
0a73ed7d64
add secondary importance to API BDD tests
...
Also fixes a path issue during API test DB creation that could
never possibly have worked.
2022-10-01 11:01:49 +02:00
Tareq Al-Ahdal
0ab0f0ea44
Integrated OSM views into importance computation
2022-10-01 11:01:49 +02:00
Tareq Al-Ahdal
ac467c7a2d
Enhanced the implementation of OSM views GeoTIFF import functionality
2022-10-01 11:01:49 +02:00
Tareq Al-Ahdal
c85b74497b
Initial implementation of GeoTIFF import functionality
2022-10-01 11:01:49 +02:00
Sarah Hoffmann
f4d3ae6f70
consolidate indexes over geometry_sectors
...
The indexes over geometry_sectors are mainly used for ordering
the places which need indexing. That means they function effectively
as a TODO list. Consolidate them so that they always only contain
the places which are still to do. Also add the appropriate index
for the boundary indexing phase.
2022-09-21 10:38:58 +02:00
Sarah Hoffmann
dddfa3a075
ignore irrelevant extra tags on address interpolations
...
When deciding if an address interpolation has address information, only
look for addr:street and addr:place. If they are not there go looking
for the address on the address nodes. Ignores irrelevant tags like
addr:inclusion.
Fixes #2797.
2022-08-13 14:07:06 +02:00
Sarah Hoffmann
487e81fe3c
more invalidations when boundary changes rank
...
When a boundary or place changes its address rank, all places where
it participates as address need to be potentially reindexed.
Also use the computed rank when testing place nodes against
boundaries. Boundaries are computed earlier.
Fixes #2794.
2022-08-12 09:48:46 +02:00
Sarah Hoffmann
51b6d16dc6
overhaul the token analysis interface
...
The functional split between the two functions is now that the
first one creates the ID that is used in the word table and
the second one creates the variants. There no longer is a
requirement that the ID is the normalized version. We might
later reintroduce the requirement that a normalized version be available
but it doesn't necessarily need to be through the ID.
The function that creates the ID now gets the full PlaceName. That way
it might take into account attributes that were set by the sanitizers.
Finally rename both functions to something more sane.
2022-07-29 15:14:11 +02:00
Sarah Hoffmann
c8873d34af
harmonize interface of token analysis module
...
The configure() function now receives a Transliterator object instead
of the ICU rules. This harmonizes the parameters with the create
function.
2022-07-29 10:43:07 +02:00
Sarah Hoffmann
6d41046b15
add support for external sanitizer modules
2022-07-25 16:10:19 +02:00
Sarah Hoffmann
7b7203c149
add function for loading plugin modules
...
Loads modules for configurable code like tokenizers, sanitizers, etc.
Supports internal modules, external libraries and code from the
project directory.
2022-07-25 16:10:10 +02:00