Sarah Hoffmann
9448c5e16f
add tests for new warm and export Python functions
2023-07-26 11:09:52 +02:00
miku0
0722495434
add Japanese sanitizer
2023-07-26 07:54:58 +00:00
Sarah Hoffmann
d545c6d73c
mostly remove php-cgi requirement
...
This is now only needed for BDD tests against the php API.
2023-07-26 00:10:11 +02:00
Sarah Hoffmann
f69fea4210
remove now unused run_api_script function
2023-07-25 22:45:29 +02:00
Sarah Hoffmann
4cd0a4ced4
remove now unused run_legacy_script()
2023-07-25 21:39:23 +02:00
Sarah Hoffmann
0804cc0cff
port export function to Python
...
Some of the parameters have been removed as they don't make sense
anymore.
2023-07-25 21:39:23 +02:00
Sarah Hoffmann
faeee7528f
move warm script to python code
2023-07-25 21:39:23 +02:00
Sarah Hoffmann
66ecb56cea
add tests for new endpoints
2023-07-25 10:57:19 +02:00
Sarah Hoffmann
927d2cc824
do not split names from typed phrases
...
When phrases are typed, they should contain exactly one term.
2023-07-17 20:09:08 +02:00
Sarah Hoffmann
cc45930ef9
avoid lookup via partials on frequent words
...
Drops expensive searches via partials on terms like 'rue de'.
See #2979 .
2023-07-06 12:16:57 +02:00
Sarah Hoffmann
82216ebf8b
always run function update on migrations
...
This means that we can have migrations which require nothing but
an update of the functions.
2023-07-01 20:18:59 +02:00
Sarah Hoffmann
9f6f12cfeb
move search to bind parameters
2023-07-01 18:03:07 +02:00
Sarah Hoffmann
d7a3039c2a
also switch legacy tokenizer to new street/place choice behaviour
2023-06-30 17:03:17 +02:00
Sarah Hoffmann
645ea5a057
use information from tokenizer to determine street vs. place address
...
So far the SQL logic used the information from the address field
to determine if an address is attached to a street or place.
This changes the logic to use the information provided in the
token_info. This allows sanitizers to enforce a certain parenting
without changing the visible address information.
2023-06-30 11:08:25 +02:00
mtmail
15a66e7b7d
Merge branch 'osm-search:master' into check-database-on-frozen-database
2023-06-22 12:14:55 +02:00
Marc Tobias
2337cc653b
check-database on frozen db shouldn't recommend indexing
2023-06-21 17:47:57 +02:00
Sarah Hoffmann
9bc5be837b
remove useless check
...
Found by new mypy version.
2023-06-21 11:56:39 +02:00
Sarah Hoffmann
b79d5494f9
remove support for sanic framework
...
There is no performance gain over falcon or starlette but the special
structure of sanic makes it hard to have exchangeable code.
2023-06-21 10:53:57 +02:00
Sarah Hoffmann
36df56b093
fix header name for browser languages
2023-06-20 11:56:43 +02:00
Sarah Hoffmann
d0a1e8e311
tweak postcode search
...
Give a preference to left-right reading, i.e. <postcode>,<address>
prefers a postcode search while <address>,<postcode> prefers
an address search.
Also exclude non-addressables, countries and states from results when a
postcode is contained in the query.
2023-06-20 11:56:43 +02:00
Sarah Hoffmann
1f83efa8f2
Merge pull request #3086 from lonvia/close-connection-on-replication
...
Close database connections while waiting for the next update cycle
2023-06-19 15:48:00 +02:00
Sarah Hoffmann
6f3339cc49
close DB connection when waiting for next update cycle
2023-06-19 12:02:51 +02:00
Sarah Hoffmann
771be0e056
do not fail php script generation when curly braces are present
...
Fixes #3084 .
2023-06-19 11:23:30 +02:00
Sarah Hoffmann
41bf162306
remove tests for old PHP cli commands
2023-05-26 17:36:05 +02:00
Sarah Hoffmann
146a0b29c0
add support for search by housenumber
2023-05-26 14:10:57 +02:00
Sarah Hoffmann
371a780ef4
add server fronting for search endpoint
...
This also implements some of the quirks of free-text search of the
V1 API, in particular, search for categories and coordinates.
2023-05-26 11:40:45 +02:00
Sarah Hoffmann
0608cf1476
switch CLI search command to python implementation
2023-05-24 22:54:54 +02:00
Sarah Hoffmann
f335e78d1e
make localisation of results explicit
...
Localisation was previously done as part of the formatting but might
also be useful on its own when working with the results directly.
2023-05-24 18:12:34 +02:00
Sarah Hoffmann
dcfb228c9a
add API functions for search functions
...
Search is now split into three functions: one for free-text search,
one for structured search and one for search by category. Note that the
free-text search does not have as many hidden features, such as
coordinate search. Use the search parameters for that.
2023-05-24 18:05:43 +02:00
Sarah Hoffmann
dc99bbb0af
implement actual database searches
2023-05-24 13:52:31 +02:00
Sarah Hoffmann
c42273a4db
implement search builder
2023-05-23 11:23:44 +02:00
Sarah Hoffmann
3bf489cd7c
implement token assignment
2023-05-22 15:49:03 +02:00
Sarah Hoffmann
d8240f9ee4
add query analyser for legacy tokenizer
2023-05-22 11:07:14 +02:00
Sarah Hoffmann
2448cf2a14
add factory for query analyzer
2023-05-22 09:23:19 +02:00
Sarah Hoffmann
004883bdb1
query analyzer for ICU tokenizer
2023-05-22 08:46:19 +02:00
Sarah Hoffmann
ff66595f7a
add data structure for tokenized query
2023-05-21 09:30:57 +02:00
Sarah Hoffmann
d9d8b9c526
add tests for parameter converter
2023-05-18 18:09:07 +02:00
Sarah Hoffmann
bef5cea48e
switch API parameters to keyword arguments
...
This switches the input parameters for API calls to a generic
keyword argument catch-all which is then loaded into a dataclass
where the parameters are checked and forwarded to the internal
function.
The dataclass gives more flexibility with the parameters and makes
it easier to reuse common parameters for the different API calls.
2023-05-18 17:42:23 +02:00
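The keyword-argument pattern described in the commit above can be sketched roughly as follows. This is only an illustration under assumptions: `LookupDetails`, its fields, `from_kwargs()` and `details()` are hypothetical names chosen for the example and are not taken from the Nominatim code.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict


@dataclass
class LookupDetails:
    """Hypothetical parameter set shared by several API calls."""
    address_details: bool = False
    keywords: bool = False
    geometry_simplification: float = 0.0

    def __post_init__(self) -> None:
        # Parameters are checked once, in a central place.
        if self.geometry_simplification < 0:
            raise ValueError("geometry_simplification must not be negative")

    @classmethod
    def from_kwargs(cls, kwargs: Dict[str, Any]) -> "LookupDetails":
        # Keep only the keys this dataclass knows and ignore the rest.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in kwargs.items() if k in known})


def details(place_id: int, **params: Any) -> LookupDetails:
    # Generic catch-all: the keyword arguments are loaded into the
    # dataclass, which would then be forwarded to an internal function.
    return LookupDetails.from_kwargs(params)
```

A call such as `details(1, keywords=True)` yields a checked parameter object; other API calls could reuse the same dataclass for their common parameters.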
Marc Tobias
e5f332bd71
when adding Tiger data, check first if database is in frozen state
2023-05-08 14:35:30 +02:00
Sarah Hoffmann
5751686fdc
Merge pull request #3006 from biswajit-k/generalize-filter
...
generalize filter function for sanitizers
2023-04-11 19:20:08 +02:00
Sarah Hoffmann
1dce2b98b4
switch CLI lookup command to Python implementation
2023-04-03 14:40:41 +02:00
Sarah Hoffmann
86c4897c9b
add lookup call to server glue
2023-04-03 14:40:41 +02:00
Sarah Hoffmann
2237603677
add tests for new lookup API
2023-04-03 14:40:41 +02:00
Sarah Hoffmann
6e81596609
rename lookup() API to details and add lookup call
...
The initial plan to serve /details and /lookup endpoints from
the same API call turned out to be impractical, so the API now
also has separate functions for both.
2023-04-03 14:40:41 +02:00
biswajit-k
8f03c80ce8
generalize filter for sanitizers
2023-04-01 19:24:09 +05:30
Sarah Hoffmann
683a3cb3ec
call osm2pgsql postprocessing flush_deleted_places() when adding data
2023-03-31 18:05:07 +02:00
Sarah Hoffmann
26ee6b6dde
python reverse: add support for point geometries in interpolations
2023-03-29 17:21:33 +02:00
Sarah Hoffmann
6c67a4b500
switch reverse CLI command to Python implementation
2023-03-26 18:09:33 +02:00
Sarah Hoffmann
86b43dc605
make sure PHP and Python reverse code do the same
...
The only allowable difference is precision of coordinates. Python uses
a precision of 7 digits where possible, which corresponds to the
precision of OSM data.
Also fixes some smaller bugs found by the BDD tests.
2023-03-26 16:21:43 +02:00
Sarah Hoffmann
35b52c4656
add output formatters for ReverseResults
...
These formatters are written in a way that they can be reused for
search results later.
2023-03-25 15:45:03 +01:00
Sarah Hoffmann
2f54732500
python: implement reverse lookup function
...
The implementation mostly follows the PHP code but introduces an
additional layer parameter with which the kind of places to be returned
can be restricted. This replaces the hard-coded exclusion lists.
2023-03-23 22:38:37 +01:00
Sarah Hoffmann
1facfd019b
api: generalize error handling
...
Return a consistent error response which takes into account the chosen
content type. Also adds tests for V1 server glue.
2023-03-23 10:16:50 +01:00
Sarah Hoffmann
00e3a752c9
split SearchResult type
...
Use adapted types for the different result types. This makes it
easier to have adapted output formatting and means there are only
result fields that are filled.
2023-03-23 10:16:50 +01:00
biswajit-k
ca149fb796
Adds sanitizer for preventing certain tags from entering the search index based on parameters
...
fix: pylint error
added docs for delete tags sanitizer
fixed typos in docs and code comments
fix: python typechecking error
fixed rank address type
Revert "fixed typos in docs and code comments"
This reverts commit 6839eea755a87f557895f30524fb5c03dd983d60.
added default parameters and refactored code
added test for all parameters
2023-03-09 14:18:39 +05:30
Sarah Hoffmann
ee0c5e24bb
add a WKB decoder for the Point class
...
This allows returning point geometries from the database and makes
the SQL a bit simpler.
2023-02-16 17:29:56 +01:00
Sarah Hoffmann
8557105c40
add debug output for unit tests
...
This uses the debug output facility meant for pretty HTML output
to give us debugging output for the unit tests.
2023-02-14 11:57:37 +01:00
Sarah Hoffmann
42c3754dcd
add tests for details result formatting and trim results
...
Values that are None are no longer included in the output to save
a bit of bandwidth.
2023-02-04 21:22:22 +01:00
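The trimming of None values mentioned in the commit above could look roughly like this. It is a minimal sketch: `trim_none()` and `format_result()` are hypothetical helpers, not functions from the project.

```python
import json
from typing import Any, Dict


def trim_none(result: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `result` without keys whose value is None."""
    return {key: value for key, value in result.items() if value is not None}


def format_result(result: Dict[str, Any]) -> str:
    # Falsy values such as 0 or '' are kept; only None is dropped.
    return json.dumps(trim_none(result))
```

Testing with `is not None` rather than truthiness matters here, because fields holding 0 or an empty string still carry information.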
Sarah Hoffmann
104722a56a
switch details cli command to new Python implementation
2023-02-04 21:22:22 +01:00
Sarah Hoffmann
1924beeb20
add lookup of postcode data
2023-02-04 21:22:22 +01:00
Sarah Hoffmann
70f6f9a711
add lookup of tiger data
2023-02-04 21:22:22 +01:00
Sarah Hoffmann
f1ceefe9a6
add lookup of address interpolations
2023-02-04 21:22:22 +01:00
Sarah Hoffmann
189f74a40d
add unit tests for lookup function
2023-02-04 21:22:22 +01:00
Sarah Hoffmann
370c9b38c0
improve scaffolding for API unit tests
...
Use the static table definition to create the test database.
Add helper function to simplify filling the tables.
2023-02-04 21:22:22 +01:00
Sarah Hoffmann
16b6484c65
add property cache for API
...
This caches results from querying nominatim_properties.
2023-01-30 09:36:17 +01:00
Sarah Hoffmann
77bec1261e
add streaming json writer for JSON output
2023-01-25 15:05:33 +01:00
Sarah Hoffmann
8f4426fbc8
reorganize code around result formatting
...
Code is now organized by api version. So formatting has moved to
the api.v1 module. Instead of holding a separate ResultFormatter
object per result format, simply move the functions to the
formatter collector and hand in the requested format as a parameter.
Thus reorganized, the api.v1 module can export three simple functions
for result formatting which in turn makes the code that uses
the formatters much simpler.
2023-01-24 17:20:51 +01:00
Sarah Hoffmann
32c1e59622
reorganize api submodule
...
Use a directory for the submodule where the __init__ file contains
the public API. This makes it easier to separate public interface
from the internal implementation.
2023-01-24 13:28:04 +01:00
Sarah Hoffmann
ce9ed993c8
fix importance recalculation
...
The signature of the compute_importance() function has changed.
2023-01-22 22:32:16 +01:00
Sarah Hoffmann
0c47558729
convert version to named tuple
...
Also return the new NominatimVersion rather than a string in the
status result.
2023-01-03 10:03:00 +01:00
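As a rough illustration of why a named tuple is convenient for versions, the sketch below uses a hypothetical `Version` class; the field names and the version numbers are assumptions for the example and do not reproduce the actual NominatimVersion definition.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """Hypothetical stand-in for a version named tuple."""
    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


# Tuples compare field by field, so 4.2.0 sorts before 4.10.0,
# which plain string comparison would get wrong.
older = Version(4, 2, 0)
newer = Version(4, 10, 0)
```

Callers can still obtain the string form with `str(older)` where a textual version is needed.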
Sarah Hoffmann
93b9288c30
fix error message for non-existing database
2023-01-03 10:03:00 +01:00
Sarah Hoffmann
9d31a67116
add unit tests for new Python API
2023-01-03 10:03:00 +01:00
Sarah Hoffmann
89a34e7508
adapt tests for new lua styles
2022-12-19 17:32:28 +01:00
Sarah Hoffmann
2231401483
clean up uses of cli.nominatim()
...
They should not hand in data paths anymore.
2022-11-27 15:27:04 +01:00
Sarah Hoffmann
2abe9e6fd9
use data paths from new nominatim.paths
2022-11-27 12:15:41 +01:00
Sarah Hoffmann
fd3dec8efe
add sanitizer for TIGER tags
...
Currently only takes over cleaning the tiger:county data. This was
done by the import until now.
2022-11-23 10:37:27 +01:00
Sarah Hoffmann
5877b69d51
do not run unit test when postgis_raster is not available
2022-10-01 11:01:49 +02:00
Sarah Hoffmann
5ec2c1b712
adapt unit tests to changed function names
2022-10-01 11:01:49 +02:00
Tareq Al-Ahdal
0ab0f0ea44
Integrated OSM views into importance computation
2022-10-01 11:01:49 +02:00
Tareq Al-Ahdal
ac467c7a2d
Enhanced the implementation of OSM views GeoTIFF import functionality
2022-10-01 11:01:49 +02:00
Tareq Al-Ahdal
c85b74497b
Initial implementation of GeoTIFF import functionality
2022-10-01 11:01:49 +02:00
Sarah Hoffmann
f4d3ae6f70
consolidate indexes over geometry_sectors
...
The indexes over geometry_sectors are mainly used for ordering
the places which need indexing. That means they function effectively
as a TODO list. Consolidate them so that they always only contain
the places which are still to do. Also add the appropriate index
for the boundary indexing phase.
2022-09-21 10:38:58 +02:00
Sarah Hoffmann
51b6d16dc6
overhaul the token analysis interface
...
The functional split between the two functions is now that the
first one creates the ID that is used in the word table and
the second one creates the variants. There no longer is a
requirement that the ID is the normalized version. We might
later reintroduce the requirement that a normalized version be available
but it doesn't necessarily need to be through the ID.
The function that creates the ID now gets the full PlaceName. That way
it might take into account attributes that were set by the sanitizers.
Finally rename both functions to something more sane.
2022-07-29 15:14:11 +02:00
Sarah Hoffmann
c8873d34af
harmonize interface of token analysis module
...
The configure() function now receives a Transliterator object instead
of the ICU rules. This harmonizes the parameters with the create
function.
2022-07-29 10:43:07 +02:00
Sarah Hoffmann
6d41046b15
add support for external sanitizer modules
2022-07-25 16:10:19 +02:00
Sarah Hoffmann
7b7203c149
add function for loading plugin modules
...
Loads modules for configurable code like tokenizers, sanitizers, etc.
Supports internal modules, external libraries and code from the
project directory.
2022-07-25 16:10:10 +02:00
Kian-Meng Ang
f5e52e748f
docs: fix typos
2022-07-20 22:05:31 +08:00
Sarah Hoffmann
9963261d8d
add type annotations to special phrase importer
2022-07-18 09:54:29 +02:00
Sarah Hoffmann
62eedbb8f6
add type hints for sanitizers
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
aaf2b6032e
fix uses of config.get_path() to expect None
2022-07-18 09:47:57 +02:00
Sarah Hoffmann
b1903f0fbf
Merge pull request #2761 from lonvia/repair-index-analysis
...
Repair `admin --analyse-indexing`
2022-07-18 09:38:08 +02:00
marc tobias
c70ca7f57b
In tests for PHP 8 disable Just-in-time, it conflicts with tools that determine coverage
2022-07-09 22:03:48 +02:00
Sarah Hoffmann
4b12d52ef5
convert admin --analyse-indexing to new indexing method
...
A proper run of indexing requires the place information from the
analyzer. Add the pre-processing of place data, so the right
information is handed into the update function.
2022-07-07 16:20:08 +02:00
Sarah Hoffmann
cbbcbb1fd7
move country_info into data submodule
2022-07-06 11:08:36 +02:00
Sarah Hoffmann
bce93d60bd
move PlaceInfo into data submodule
...
This data structure is shared between indexer and tokenizer.
2022-07-06 10:54:47 +02:00
Sarah Hoffmann
69e51aebab
test: avoid column names with upper-case letters
...
This may cause problems when the column names get quoted.
2022-07-05 09:12:55 +02:00
Sarah Hoffmann
612d34930b
handle postcodes properly on word table updates
...
update_postcodes_from_db() needs to do the full postcode treatment
in order to derive the correct word table entries.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
7b6ec4fc6c
add tests for discarding bad postcodes
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
80ea13437d
move postcode matcher into a separate file
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
4885fdf0f9
add class for online centroid computation
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
18864afa8a
postcodes: introduce a default pattern for countries without postcodes
2022-06-23 23:42:31 +02:00