Commit Graph

768 Commits

Author SHA1 Message Date
Sarah Hoffmann
d78f0ba804 port replication initialisation to Python 2021-01-26 22:50:54 +01:00
Sarah Hoffmann
5b46fcad8e convert functon creation to python
The new functions always creates normal and partitioned functions.
Also adds specialised connection and cursor classes for adding
frequently used helper functions.
2021-01-26 22:50:54 +01:00
Sarah Hoffmann
94fa7162be port address level computation to Python
Also adds simple tests for correct table creation.
2021-01-26 22:50:54 +01:00
Sarah Hoffmann
e6c2842b66 move update code for postcode and word count to Python
Adds also tests for the new function to execute a SQL script.
2021-01-26 22:50:54 +01:00
Sarah Hoffmann
504922ffbe remove old nominatim.py in favour of 'nominatim index'
The PHP scripts need to know the position of the nominatim
tool in order to call it. This is handed in as environment
variable, so it can be set by the Python script.
2021-01-18 15:43:27 +01:00
Sarah Hoffmann
340e7f7210 bdd: complete coverage for API tests
Also removes some functions that are no longer used and
fixes debug output where the tests found an issue.
2021-01-17 16:12:06 +01:00
Sarah Hoffmann
f9c43137c9 remove unused output formatting functions 2021-01-16 17:39:49 +01:00
Sarah Hoffmann
9348fc5e15 move dotenv parsing to installed PHP scripts
This means that the php-symfony-dotenv library is now only needed
when using the legacy scripts. This includes the BDD tests which
currently still rely on the PHP utils.
2021-01-14 18:06:22 +01:00
Sarah Hoffmann
0847964a27 avoid accessing constants when getting data from env
When a setting can be read from the environment variable, avoid
accessing the internal defaults. This way the scripts can be
accessed directly in \lib as long as the environment is set up
correctly with full defaults.
2021-01-14 09:37:04 +01:00
Sarah Hoffmann
bc09d7aedb fix access to environment variable 2021-01-14 09:29:43 +01:00
Sarah Hoffmann
04690ad8c4 implement warming in new cli tool
Adds infrastructure for calling the legacy PHP scripts. As the
CONST_* values cannot be set from the python script, hand the values
in via secret environment variables instead. These are all
temporary hacks for the transition phase to python code.
2021-01-13 18:25:15 +01:00
Sarah Hoffmann
ec636111ba warm.php needs constant setup for queries
Warming is done using the query classes and therefore the same
copy-over from dotenv settings to CONST_* parameters is needed
as for query.php.
2021-01-13 18:12:53 +01:00
Sarah Hoffmann
e467b956ff set CONST_LibDir directly from the source scripts
Now that the source scripts have been moved to \lib, they
can determine the position of the PHP library relative to
themselves.
2021-01-13 17:00:38 +01:00
Sarah Hoffmann
ff5a237200 move PHP utilities into the lib directory
These are not called directly as programs but used in a library
fashion by the installed utilities. So the library directory
is a better place.
2021-01-13 14:44:45 +01:00
Sarah Hoffmann
5e989b9296 configure osm2pgsql and module location via cmake
The default location of osm2pgsql and the postgresql module
is decided at compile/installation time and is not necessarily
in the project directory.

With this change it is now possible to have a project directory
that is completely separate from the build directory.
2021-01-04 11:37:56 +01:00
Sarah Hoffmann
867baab3d1 make phpcs happy 2020-12-19 14:33:04 +01:00
Sarah Hoffmann
433017b990 move creation of website scripts to setup script
Instead of creating the website wrapper scripts with cmake,
they are now created when --setup-website is called. The
setup of the configuration constants is directly embedded
into the scripts. This means we can get rid of the separate
settings-frontend.php. More importantly however, it means
that it is now possible to set up multiple website directories
from the same build directory.
2020-12-19 14:33:04 +01:00
Sarah Hoffmann
d97aed8741 adapt tests to new dotenv environment
DB tests now can simply set the environment to change configuration
variables. API tests still rely on a configuration file.

Also, query.php needs to set up the CONST_* variables to work with
the query scripts. That is a tiny bit messy and duplicates code
but this part will need to be reworked later.
2020-12-19 14:33:04 +01:00
Sarah Hoffmann
06d89e1d47 fix various typos 2020-12-19 14:33:04 +01:00
Sarah Hoffmann
992d3faac8 switch all utils to initialising dotenv 2020-12-19 14:33:04 +01:00
Sarah Hoffmann
0947b61808 switch remaining settings to dotenv format
CONST_Search_AreaPolygons and CONST_Search_ReversePlanForAll have
been removed completely.
2020-12-19 14:33:04 +01:00
Sarah Hoffmann
15a1666f8a replace database settings with dotenv variant
As we can't refer to the project root dir in the module path, the
module path may now also be a relative directory which is then
taken as being relative to the project root path.

Moves the checkModulePresence() function into the Setup class, so
that it can work on the computed absolute module path.
2020-12-19 14:33:04 +01:00
Sarah Hoffmann
25bdd7c6d9 introduce dotenv parsing for setup.php
This adds the notion of a project directory. This is the directory
that holds all necessary files for one specific installation of
Nominatim. Dotenv looks for an .env file in this directory and
adds it to the global environment together with the defaults from
Nominatim's data directory.

Add's symfony's dotenv library as a new dependency.
2020-12-19 14:33:04 +01:00
Sarah Hoffmann
ac116980ac make HTTP proxy setup explicit
The setup relies on the project configuration which we want to
explicitly set up in later steps. Therefore proxy setup needs to
be done explicitly as well. There is the added bonus that the
setup is done only for the utils which try to call outside.
2020-12-19 14:33:04 +01:00
Sarah Hoffmann
b5480f6e36 reorganise path settings in config
CONST_BasePath is split into separate configuration variables
for binaries, libraries and data. These variables as well as
the installation path are now set in the executable directly and
no longer configurable via project settings.

This is the first step towards an installable software. The
executables should know per installation where to find their
necessary data to execute. Project configuration needs to be
restricted to settings that really concern the specific Nominatim
installation.
2020-12-19 14:33:04 +01:00
Sarah Hoffmann
ea844db847 do not classify housenumbers as rare
House numbers are highly redundant, so don't even attempt to
do it as a rare name search. Greatly improves speed of such
queries.
2020-12-08 17:25:15 +01:00
Sarah Hoffmann
0334915067 update osm2pgsql
Needs now an explicit switch to avoid propagating changes from
nodes to ways to relations.
2020-12-02 18:27:18 +01:00
Sarah Hoffmann
dc53288e6b
Merge pull request #2085 from lonvia/add-address-rank-to-xml-output
add address rank to XML output
2020-12-01 19:59:29 +01:00
Sarah Hoffmann
cf9b248f29 add address rank to XML output
The address rank is much more interesting than the search rank
these days because it tells something about the kind of object.

Reverse did have neither rank, so add both for consistency.
2020-12-01 17:54:53 +01:00
Sarah Hoffmann
df12954312 fix use of term count in partial terms
Term count for partial words is one less than the actual number
of words. Take that into account when adding to the search rank.

Fixes #2081.
2020-12-01 17:21:01 +01:00
Sarah Hoffmann
c5d98effc0
Merge pull request #2074 from lonvia/add-housenumber-to-unknown-places
Improve finding addresses that have their own search_name entry because of unknown addr:* parts
2020-11-25 16:57:09 +01:00
Sarah Hoffmann
57f0d55c2e make phpcs happy 2020-11-25 16:14:31 +01:00
Sarah Hoffmann
3cf763475f do not use artificial housenumbers as names
If they are artificial they cannot have a search_name entry.
2020-11-25 16:11:32 +01:00
Sarah Hoffmann
0f87da017f improve handling of multi-word partials in SearchDescription
Multi-word partial terms had an undue advantage over separate partial
terms because they only need to pay the penalty once. This changes
the behaviour by setting the penalty according to the number of
words in the token. This should get rid of search interpretations
with low chance of matching.

This also fixes handling of exact term matching. We now match against
all exact terms of the query, not just a couple of them collected
while building the interpretations.

Also adds a penalty to very short postcodes.
2020-11-25 12:07:04 +01:00
Sarah Hoffmann
22800d7d59 Search housenumbers with unknown address parts by housenumber term
House numbers need special handling because they may appear after
the street term. That means we canot just use them as the main name
for searches where the address has its own search term entries.
Doing this right now, we are able to find '40, Main St, Town' but not
'Main St 40, Town'.

This switches to using the housenumber token as the name term instead.
House number tokens can get special handling when building the search
query that covers the case where they come after the street.

The main disadvantage is that this once more increases the numbers
of possible search interpretation of which we have already too many.

no penalty for housenumber searches
2020-11-25 11:36:10 +01:00
Sarah Hoffmann
ba8accf4bb make phpcs happy 2020-10-29 11:36:16 +01:00
Sarah Hoffmann
b81894d3d5 remove now unused settings related to website
There are two places where the website URL is still used:
for icons, replace the URL with a link to the icon repository
of the UI repo. The more URL now builds the link from the
server info.
2020-10-29 11:13:32 +01:00
Sarah Hoffmann
c0d21d0bd3 remove HTML output and UI elements 2020-10-29 11:13:04 +01:00
Sarah Hoffmann
5872b81232 use highest admin boundary for duplicated ones 2020-10-28 10:49:26 +01:00
Sarah Hoffmann
788ba6d985 adjust secondary order when no addressimportance available
In cases of countries and remote places without an address
it is possible that 'addressimportance' comes back empty.
Adjust the 'foundorder' to the places importance instead
in such cases.

Fixes #2023.
2020-10-22 10:20:16 +02:00
Sarah Hoffmann
b012e15245 readd boundary:administrative to class importance 2020-10-22 10:16:04 +02:00
Sarah Hoffmann
e115de47fc use Debug class for formatting reverse debug output 2020-10-12 17:12:03 +02:00
Sarah Hoffmann
7d2b6879c8 restrict postcode searches to postcode in first token
In structured queries we should only assume that it is
a postcode search when only the postcode and optionally
the country is given. If any other term is present, it
is better to avoid the search for postcode as it yields
too many bad searches. Given that the terms in a structured
query are ordered, this means that the postcode must be
the first token just like in the unstructured query.

Fixes #1988.
2020-10-06 14:08:31 +02:00
Sarah Hoffmann
1db8f7e353 add descriptive term for address rank 24
With that term we have terms for all ranks, so that no generic
'administrative' term will show up in the address details anymore.
2020-09-25 16:02:17 +02:00
Sarah Hoffmann
f8a5f2964f remove dead code
The SQL query has moved into the addTokensFromDB() funtion.
2020-09-21 10:39:14 +02:00
Sarah Hoffmann
576ee5aaab use same label for all types of postcode in address 2020-09-18 16:17:30 +02:00
marc tobias
7ac22e9227 starting PHP 5.4 get_magic_quotes_gpc() returns false, no need to check 2020-09-14 00:45:22 +02:00
Sarah Hoffmann
770754ae2c place lookup: filter places that have no details
In rare cases search_name might have entries for places for
which we do not return details, in particular for linkees.
Need to remove those entries in the result list before returning
the details.

Fixes #1932.
2020-08-27 09:33:21 +02:00
Sarah Hoffmann
72ee1abc90 ensure that ordering by importance is stable
The initial search results retrieved from the database already come
preordered, either by importnace or by distance. We want to keep
that order if all other things are equal.
2020-08-26 17:42:43 +02:00
Sarah Hoffmann
9e1909643c PlaceLookup should return results in input order 2020-08-26 17:15:11 +02:00