212 Commits

Author SHA1 Message Date
Sarah Hoffmann
38a99856c0 Rework word set computation
Switch from a recursive algorithm for computing the word sets
to an iterative one that benefits from caching intermediate
results. This considerably reduces the amount of memory needed,
so that the depth restriction can be dropped. To ensure that
the number of word sets remains manageable, only sets up to
a certain length are accepted and only a certain number of
total word sets. If word sets need to be dropped, we drop
the ones with more words per word set first.

To further reduce the number of potential word sets, the valid
tokens are looked up first and then only word sets containing
valid tokens are computed.

Fixes #1403, #1404 and #654.
2019-06-29 18:22:31 +02:00
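
The iterative approach described in the commit above can be illustrated with a minimal Python sketch. This is not the project's code: the function name `compute_word_sets`, the parameters `max_words_per_set` and `max_sets`, and their default values are all assumptions chosen for illustration. It builds word sets position by position, reuses the sets already computed for shorter prefixes, only considers phrases found in a set of valid tokens, and drops the sets with more words first when the cap is reached.

```python
def compute_word_sets(words, valid_tokens, max_words_per_set=7, max_sets=100):
    """Split `words` into consecutive phrases that all appear in `valid_tokens`.

    Hypothetical sketch: names and limits are illustrative assumptions.
    """
    n = len(words)
    # sets_ending_at[i] caches every accepted split of words[:i].
    # Position 0 holds the single empty split that all others extend.
    sets_ending_at = [[] for _ in range(n + 1)]
    sets_ending_at[0] = [[]]

    for end in range(1, n + 1):
        candidates = []
        for start in range(end):
            token = " ".join(words[start:end])
            # Only phrases known to be valid tokens can extend a word set.
            if token not in valid_tokens:
                continue
            for prefix in sets_ending_at[start]:
                if len(prefix) + 1 <= max_words_per_set:
                    candidates.append(prefix + [token])
        # Prefer sets with fewer words; the stable sort keeps ties in order.
        candidates.sort(key=len)
        # Enforce the cap on the total number of sets kept per position.
        sets_ending_at[end] = candidates[:max_sets]

    return sets_ending_at[n]


if __name__ == "__main__":
    tokens = {"new", "york", "city", "new york", "new york city"}
    for word_set in compute_word_sets(["new", "york", "city"], tokens):
        print(word_set)
```

Because each position only reads the cached lists of earlier positions, no recursion depth limit is needed in this sketch; the two caps bound the work instead.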
marc tobias
890d415e1f Nominatim::DB support input variables, custom error messages 2019-03-10 16:56:36 +01:00
marc tobias
d4b633bfc5 replace database abstraction DB with PDO 2019-03-09 00:18:15 +01:00
Sarah Hoffmann
7d74bf781c correctly discard partially matching duplicates
The same result may be found with different result ranks
in the same search loop when housenumber or postcode are
part of the name or address. In this case we need to keep
the result with the lower result rank.

Fixes #1264.
2019-01-03 21:49:50 +01:00
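
The rule stated in the commit above (keep the result with the lower result rank when the same place is found twice) can be sketched as follows. This is a simplified illustration, not the project's implementation: the function name `merge_results` and the representation of results as `(place_id, rank)` pairs are assumptions.

```python
def merge_results(results):
    """Collapse duplicate place IDs, keeping the lowest rank seen for each.

    `results` is an iterable of (place_id, rank) pairs; this shape is a
    hypothetical stand-in for whatever the search loop actually yields.
    """
    best = {}
    for place_id, rank in results:
        # Keep the first occurrence, or replace it when a lower rank appears.
        if place_id not in best or rank < best[place_id]:
            best[place_id] = rank
    return best


if __name__ == "__main__":
    found = [(1, 2), (2, 0), (1, 1), (2, 3)]
    print(merge_results(found))
```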
Sarah Hoffmann
9908c93d4c Add result ranking for missing housenumber and postcode
Fixes #988.
2018-11-17 00:00:01 +01:00
Sarah Hoffmann
5a772a5770 Don't add viewbox weight when no viewbox is given
Fixes #1068.
2018-07-20 23:29:36 +02:00
Sarah Hoffmann
25baaf530d unify address details lookup
Introduces a new AddressDetails class which is responsible
for address lookups. It always saves the complete result
and then allows filtering through the different access
functions. Remove special handling in Geocode() and use
the lookup through PlaceLookup() there as well.
2018-07-10 23:54:35 +02:00
Sarah Hoffmann
320d488627 move ClassTypes into own namespace
Also adds some convenience functions for lookups.
2018-07-09 23:20:46 +02:00
Sarah Hoffmann
80a6751c51 make phpcs happy 2018-07-06 22:06:05 +02:00
Sarah Hoffmann
01d5ecb86b use already existing address field in geocodejson 2018-07-06 21:58:41 +02:00
Marc Tobias Metten
7a964efb3a search/reverse/lookup with geojson,geocodejson output 2018-05-29 17:20:34 +02:00
Sarah Hoffmann
f29c7bf910 introduce classes for token list and token types 2018-05-14 23:04:15 +02:00
Sarah Hoffmann
115792d1db replace word frequency hash
The word frequency hash was only used to determine if the
name of a SearchDescription is rare. Do this already when
building the SearchDescription (when the word frequency
is still available) and get rid of the extra hash.
2018-05-06 22:35:31 +02:00
Marc Tobias Metten
329948e685 fix -undefined offset- error 2018-03-27 03:00:07 +02:00
Sarah Hoffmann
2c42bda9ce nicer formatting for Geocode debug output 2018-03-25 22:28:18 +02:00
marc tobias
27bc8d4f7b replace PHP sizeof() with either count() or empty() 2018-03-22 12:36:24 +01:00
Sarah Hoffmann
df008d99f5 do not allow importance to become 0
Importance is weighed against a viewbox factor which disappears
when the importance is 0.

Fixes #930.
2018-03-01 22:37:45 +01:00
Sarah Hoffmann
3505417e3f
Merge pull request #905 from mtmail/illinois-li-case-insensitive
make sure Illinois,Alabama,Louisiana state code special handling is case insensitive
2018-02-10 15:50:42 +01:00
marc tobias
e428019170 typo in error message 2018-02-08 18:02:19 +01:00
Marc Tobias Metten
315713ff9a make sure Illinois,Alabama,Louisiana state code special handling is case insensitive 2018-02-07 00:48:18 +01:00
Sarah Hoffmann
13469e1576 convert remaining http links and shorten copyright URL 2018-01-11 23:05:28 +01:00
Sarah Hoffmann
6c1977b448 replace double-quoting with single quotes where applicable 2017-10-26 21:40:33 +02:00
Sarah Hoffmann
919b1b42fa fix uninitialised rank variable when regrouping searches 2017-10-24 23:17:47 +02:00
Sarah Hoffmann
760807c5e0 revert use of global penalty for a search direction
Adding a penalty to a search description because there
is a term at the beginning which looks like a country
turned out to be a bad idea as there are too many
abbreviations around that match against frequently
matched words.
2017-10-24 22:42:29 +02:00
Sarah Hoffmann
282c6777ee use PlaceLookup::loadParamArray in search and lookup 2017-10-23 23:30:53 +02:00
Sarah Hoffmann
1a4506f6ab use PlaceLookup in search 2017-10-23 23:30:53 +02:00
Sarah Hoffmann
1424e8e29b use Result class in reverse geocoding
Also simplifies the reverse algorithm slightly by no longer
having an additional distance lookup.
2017-10-23 23:30:53 +02:00
Sarah Hoffmann
42f079c355 introduce Result class in Geocode and SearchDescription 2017-10-23 23:30:53 +02:00
Sarah Hoffmann
cdf8c67898 fix CodeSniffer offences 2017-10-13 23:11:09 +02:00
Sarah Hoffmann
00265af528 move word recheck into token collection
Drop tokens for special and postcode searches already when
collecting them for ValidTokens when they cannot be found
in the normalized query.
2017-10-13 23:04:12 +02:00
Sarah Hoffmann
77b76ae51b simplify cross-check of country tokens
Drop country tokens that do not match the country code list
early. Remove in turn the special country code check for
structured phrases. It is sufficient to do this during
word list building.
2017-10-13 22:23:39 +02:00
Sarah Hoffmann
9ef2370a2a remove unused $aPossibleMainWordIDs array 2017-10-13 21:34:13 +02:00
Sarah Hoffmann
77abe882ab take frequency scores from token description
No need to hand them in separately.
2017-10-12 22:59:07 +02:00
Sarah Hoffmann
023f94b066 convert phrase array to class 2017-10-12 22:37:44 +02:00
Sarah Hoffmann
3da4c9c384 Sort results for near searches by proximity
If a reference coordinate is given, results really should be
sorted by distance to this point ignoring importance completely.

Fixes #796.
2017-10-10 23:03:28 +02:00
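
The behaviour described in the commit above (sorting by distance to a reference point and ignoring importance) can be sketched in a few lines. This is an illustration under stated assumptions, not the project's query: the function name `sort_by_proximity` and the dictionary keys `lon`, `lat`, and `importance` are hypothetical, and plain Euclidean distance on coordinates stands in for whatever distance measure the database actually uses.

```python
import math


def sort_by_proximity(results, near):
    """Order results by distance to `near`, a (lon, lat) pair.

    Importance is deliberately not part of the sort key. Euclidean
    distance on raw coordinates is a simplifying assumption.
    """
    near_lon, near_lat = near
    return sorted(
        results,
        key=lambda r: math.hypot(r["lon"] - near_lon, r["lat"] - near_lat),
    )


if __name__ == "__main__":
    places = [
        {"name": "far but important", "lon": 10.0, "lat": 10.0, "importance": 0.9},
        {"name": "close but minor", "lon": 1.0, "lat": 1.0, "importance": 0.1},
    ]
    for place in sort_by_proximity(places, near=(0.0, 0.0)):
        print(place["name"])
```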
Sarah Hoffmann
c02bf4986f coding style and some documentation 2017-10-09 23:13:04 +02:00
Sarah Hoffmann
9a5d5d9aec move complete search query code into SearchDescription 2017-10-09 22:55:50 +02:00
Sarah Hoffmann
55629a4891 move country list to SearchContext 2017-10-08 23:33:54 +02:00
Sarah Hoffmann
907133a38c move excluded place list to SearchContext 2017-10-08 23:15:06 +02:00
Sarah Hoffmann
86c0858130 move viewbox sql to new SearchContext 2017-10-08 22:44:01 +02:00
Sarah Hoffmann
30511fd3ab replace NearPoint with a more generic context object
The NearPoint is actually common to all SearchDescriptions
and there is other context data as well, like viewbox, that
needs to be available to the search object but is common.
2017-10-08 21:23:31 +02:00
Sarah Hoffmann
8e0ffde3e0 fix CodeSniffer violations 2017-10-08 17:00:59 +02:00
Sarah Hoffmann
795153b213 fix more syntax issues 2017-10-08 16:42:04 +02:00
Sarah Hoffmann
75e35f3832 fix syntax errors from introduction of SearchDescription 2017-10-08 15:26:14 +02:00
Sarah Hoffmann
16268f92cc convert getGroupedSearches to SearchDescription class 2017-10-08 12:57:22 +02:00
Sarah Hoffmann
d72c863353 add function to convert array to SQL 2017-10-08 10:06:17 +02:00
Sarah Hoffmann
96b6a1a418 use SearchDescription class in query loop 2017-10-08 09:54:12 +02:00
Sarah Hoffmann
0067555c38 move initial search setup to new class type 2017-10-07 12:24:21 +02:00
Sarah Hoffmann
c563c2bfec drop searches with excluded country codes earlier 2017-10-07 12:23:46 +02:00
Sarah Hoffmann
266153f218 remove code for dropping address terms
This code has been inactive for quite a while and is a suboptimal
solution. We need to be much more selective in what gets dropped.
2017-10-07 11:53:33 +02:00