Commit Graph

968 Commits

Author SHA1 Message Date
Sarah Hoffmann
ace84ed0e3 use address counts for improving index lookup 2024-03-18 11:25:48 +01:00
Sarah Hoffmann
ff3230a7f3 add penalty for single words that look like stop words 2024-03-18 11:25:48 +01:00
Sarah Hoffmann
07b7fd1dbb add address counts to tokens 2024-03-18 11:25:48 +01:00
Sarah Hoffmann
bb5de9b955 extend word statistics to address index
Word frequency in names is not sufficient to interpolate word
frequency in the address because names of towns, states etc. are
much more frequently used than, say, street names.
2024-03-18 11:25:48 +01:00
Sarah Hoffmann
9c48726691 add geometry details for postcode area output 2024-03-12 13:51:29 +01:00
Sarah Hoffmann
6e688a0113 postcodes: exclude seen places later
The seen list will only have the postcode area when available but
we want the postcode point excluded as well if the area has been seen.
2024-03-11 15:18:57 +01:00
Sarah Hoffmann
dc7cfd1708 look for postcode areas when finding something in the postcode table 2024-03-11 14:48:24 +01:00
Sarah Hoffmann
e5a5f02666 prepare release 4.4.0 2024-03-07 11:43:01 +01:00
Sarah Hoffmann
e929693cae
Merge pull request #3356 from lonvia/use-date-from-osm2pgsql-prop
Use import date from osm2pgsql property table if available
2024-03-05 15:32:16 +01:00
Sarah Hoffmann
ae7c584e28 use import date from osm2pgsql property table if available 2024-03-05 11:33:32 +01:00
marc tobias
b7eea4d53a Github Actions: add codespell linter, warn only 2024-03-04 00:22:24 +01:00
Sarah Hoffmann
9fa73cfb15 improve display name for postcodes
Don't add the postcode again in the list of address details and
make sure that the result proper always comes before anything else
independently of the address rank.
2024-02-28 16:50:40 +01:00
Sarah Hoffmann
247065ff6f
Merge pull request #3342 from mtmail/tyops
Correct some typos
2024-02-28 14:25:16 +01:00
Sarah Hoffmann
1879cf902c
Merge pull request #3346 from lonvia/reduce-artificial-importance
Reduce default importance
2024-02-28 14:21:46 +01:00
Sarah Hoffmann
36b1660121 add support for new middle table format of osm2pgsql
Functions are adapted according to the format detected from the
osm2pgsql property table.
2024-02-27 18:18:19 +01:00
Sarah Hoffmann
c6d40d4bf4 reduce importance when computed from search rank 2024-02-27 10:15:54 +01:00
Sarah Hoffmann
a4f2e6a893 do not send outdated parameters to osm2pgsql flex 2024-02-27 10:15:36 +01:00
Sarah Hoffmann
dc1baaa0af prefer min() function over if construct
Fixes a linter complaint.
2024-02-27 09:26:50 +01:00
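
As an illustration of the kind of change this commit message describes, the sketch below contrasts an if construct with the equivalent call to the built-in min(). The function names are hypothetical and are not taken from the repository.

```python
def clamp_with_if(value: int, limit: int) -> int:
    # The pattern a linter typically flags: a manual upper bound.
    if value > limit:
        value = limit
    return value


def clamp_with_min(value: int, limit: int) -> int:
    # Same behaviour, expressed with the built-in min().
    return min(value, limit)
```

Both functions return the same value for every input; the second form is simply shorter.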
marc tobias
7205491b84 Correct some typos 2024-02-26 18:13:30 +01:00
Sarah Hoffmann
4aba36c5ac API debug: properly escape non-highlighted code 2024-02-19 18:39:01 +01:00
Sarah Hoffmann
05fad607ff make Python frontend default and PHP optional 2024-02-19 18:39:01 +01:00
Sarah Hoffmann
b2d3f0a8b3 remove unnecessary nested group in CLI import command 2024-02-16 11:32:50 +01:00
Sarah Hoffmann
4ce13f5c1f prefilter bad results before adding details and reranking
Move the first cutting of the result list before reranking
by result match. This means that results with significantly
less importance are removed early, regardless of how well
they match the original query.

Fixes #3266.
2024-02-06 20:29:48 +01:00
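
The ordering described in this commit can be sketched roughly as follows. This is an illustration only: the Result class, the match_score field and the max_gap threshold are assumptions made for the example, not names or values from the project.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    name: str
    importance: float
    match_score: float  # hypothetical: higher means a closer match to the query


def prefilter(results: List[Result], max_gap: float = 0.5) -> List[Result]:
    # Drop results whose importance is far below the best one
    # (max_gap is an assumed threshold for this sketch).
    if not results:
        return []
    best = max(r.importance for r in results)
    return [r for r in results if r.importance >= best - max_gap]


def rerank(results: List[Result]) -> List[Result]:
    # Order the remaining results by how well they match the query.
    return sorted(results, key=lambda r: r.match_score, reverse=True)


candidates = [
    Result("Paris, France", importance=0.9, match_score=0.6),
    Result("Paris, Texas", importance=0.5, match_score=0.7),
    Result("Cafe Paris", importance=0.1, match_score=1.0),
]
ranked = rerank(prefilter(candidates))
```

Because the cut happens first, the low-importance candidate is removed even though it has the best match score.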
Sarah Hoffmann
bc51378aee properly grant rights to read-only user when switching out word table 2024-02-06 17:30:01 +01:00
Sarah Hoffmann
81eed0680c recreate word table when refreshing counts
The counting touches a large part of the word table, leaving
bloated tables and indexes. Thus recreate the table instead and
swap it in.
2024-02-04 21:35:10 +01:00
Sarah Hoffmann
33c0f249b1 avoid LookupAny with address and too many name tokens
The index for nameaddress_vector has grown so large that PostgreSQL
will resort to a sequential scan if there are too many items
in the LookupAny list.
2024-01-29 16:52:14 +01:00
Sarah Hoffmann
76eadc562c print any collected debug output when returning a timeout error 2024-01-28 22:30:34 +01:00
Sarah Hoffmann
f07f8530a8 housenumber-only searches cannot be combined with qualifiers 2024-01-28 19:03:11 +01:00
Sarah Hoffmann
103800a732 adjust rankings for housenumber-only searches
A normal address search with housenumber will use name rankings for
the street name. This is slightly different from the weighting for
address parts. Use the same ranking for the first part of the
address for housenumber-only searches to make sure that penalties
remain comparable.
2024-01-28 19:03:11 +01:00
Sarah Hoffmann
f9ba7a465a always add a penalty for name + address search fallback
If there already was a search by full names, the search is likely
a repetition that yields the same results, only running slower.
2024-01-28 19:03:11 +01:00
Sarah Hoffmann
fed46240d5 disallow category tokens in the middle of a query string
This already worked for left-to-right readings and now is also
implemented for right-to-left reading. A qualifier must always be
before or after the name.
2024-01-28 19:03:11 +01:00
Sarah Hoffmann
2703442fd2 protect against very frequent bad partials 2024-01-28 19:03:11 +01:00
Sarah Hoffmann
2813bf18e6 avoid duplicates in the list of partial tokens for a query
This messes with the estimates for expected results.
2024-01-28 19:03:11 +01:00
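
One common way to remove duplicates from a token list while keeping the original order is shown below. It is a minimal sketch; the function name and the use of plain integers as token ids are assumptions for the example.

```python
from typing import Iterable, List


def unique_tokens(tokens: Iterable[int]) -> List[int]:
    # dict preserves insertion order, so the first occurrence of each
    # token is kept and later repeats are dropped.
    return list(dict.fromkeys(tokens))


partials = unique_tokens([7, 3, 7, 9, 3])
```

With duplicates removed, each token contributes only once to any later estimate of the expected number of results.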
Sarah Hoffmann
b3a2b3d484 catch special async timeout error in servers
In Python <= 3.10 this is not yet the same as TimeoutError.

Fixes #3303.
2024-01-27 20:57:23 +01:00
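
The difference mentioned here can be handled by catching both exception classes. The sketch below assumes a hypothetical slow_query coroutine standing in for a database lookup; it is not the server code from the repository.

```python
import asyncio


async def slow_query() -> str:
    # Hypothetical stand-in for a long-running database lookup.
    await asyncio.sleep(1)
    return "done"


async def handle_request(timeout: float) -> str:
    try:
        return await asyncio.wait_for(slow_query(), timeout=timeout)
    except (asyncio.TimeoutError, TimeoutError):
        # On Python <= 3.10 asyncio.TimeoutError is a separate class;
        # from Python 3.11 on it is an alias of the built-in TimeoutError.
        return "timeout"


result = asyncio.run(handle_request(0.01))
```

Listing both classes in the except clause is harmless on newer Python versions, where the two names refer to the same class.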
Sarah Hoffmann
e0ca2ce6ec interpret stand-alone special terms always as near term
Fixes #3298.
2024-01-16 17:19:21 +01:00
Sarah Hoffmann
28f7e51279 add country code to words to be rematched 2024-01-08 12:23:23 +01:00
Sarah Hoffmann
b2afe3ce3e when a country is in the results, restrict further searches to places
A country search result usually comes with a very high importance.
As a result only other very well known places will show up together
with country results and that means only places with lower address
ranks. Name searches for country names tend to yield a lot of POI
results because the country name is part of the name
(think "embassy of Sweden"). By excluding POIs from further searches,
the search is sped up quite a bit.
2024-01-07 17:29:12 +01:00
Sarah Hoffmann
7337898b84 dump params in log view 2024-01-07 15:37:53 +01:00
Sarah Hoffmann
4305160c91 prioritize country searches when penalty is equal 2024-01-07 15:28:37 +01:00
Sarah Hoffmann
dc52d0954e
Merge pull request #3238 from mtmail/check-database-for-version-match
admin --check-database also checks database vs nominatim version
2024-01-07 15:24:00 +01:00
Sarah Hoffmann
d3a575319f
Merge pull request #3289 from lonvia/viewbox-and-housenumbers
Do not restrict by viewbox when housenumber or postcode is available
2024-01-07 15:23:14 +01:00
Sarah Hoffmann
2592bf1954
Merge pull request #3290 from lonvia/near-vs-quaifier-words
Do not run near queries on qualifier words
2024-01-07 15:23:00 +01:00
Sarah Hoffmann
474d4230b8 fix timezone handling for timestamps from the database
SQLite is not timezone-aware, so make sure to convert to UTC
before inserting any data.
2024-01-07 11:37:40 +01:00
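
A minimal sketch of the conversion described above, using only the Python standard library. The table and column names are invented for the example and the storage format is an assumption.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def to_utc_text(ts: datetime) -> str:
    # SQLite stores the text as given, so normalise to UTC first.
    if ts.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


conn = sqlite3.connect(":memory:")
# Hypothetical table used only for this illustration.
conn.execute("CREATE TABLE import_status (lastimportdate TEXT)")

local = datetime(2024, 1, 7, 11, 37, 40, tzinfo=timezone(timedelta(hours=1)))
conn.execute("INSERT INTO import_status VALUES (?)", (to_utc_text(local),))
stored = conn.execute("SELECT lastimportdate FROM import_status").fetchone()[0]
```

Here the timestamp with a +01:00 offset is stored as the UTC wall-clock time, one hour earlier.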
Sarah Hoffmann
10a5424a71 do not run near queries on qualifier words
There is too much potential for confusion (e.g. 'Rio Grande' read
as 'river near Grande') for too little gain. Use near phrases
instead.
2024-01-07 11:33:11 +01:00
Sarah Hoffmann
7eb04f67e2 do not restrict by viewbox when housenumber or postcode is available
Fixes #3274.
2024-01-07 11:29:26 +01:00
Marc Tobias
1d7e078a2c check-database also checks database vs nominatim version 2024-01-06 20:56:56 +01:00
Sarah Hoffmann
8e90fa3395 avoid closure variables in lambda statements
There is a bug in SQLAlchemy that assigns the wrong value to bind
parameters from closure variables when reusing lambda statements
that are later extended with other non-lambda expressions.

Thus either avoid lambda statements with closure variables or extending
them with non-lambda expressions.
2024-01-05 17:49:28 +01:00
Sarah Hoffmann
02af0a2c87 use correct SQLAlchemy pool for asynchronous connections
See https://github.com/sqlalchemy/sqlalchemy/issues/8771
2024-01-02 16:15:44 +01:00
Sarah Hoffmann
fa4e5513d1 API: avoid engine disposal on startup 2024-01-02 16:10:30 +01:00
Sarah Hoffmann
93afe5a7c3 update typing for latest changes in SQLAlchemy 2023-12-29 20:55:33 +01:00