Sarah Hoffmann
4342b28882
switch special phrases to new word table format
2021-07-28 11:31:47 +02:00
Sarah Hoffmann
5394b1fa1b
switch postcode tokens to new word table layout
2021-07-28 11:31:47 +02:00
Sarah Hoffmann
5ab0a63fd6
switch housenumber tokens to new word table layout
2021-07-28 11:31:47 +02:00
Sarah Hoffmann
1618aba5f2
switch country name tokens to new word table layout
2021-07-28 11:31:47 +02:00
Sarah Hoffmann
8377528952
new word table layout for icu tokenizer
...
The table now directly reflects the different token types.
Extra information is saved in a json structure that may be
dynamically extended in the future without affecting the
table layout.
2021-07-28 11:31:47 +02:00
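The layout described in this commit can be illustrated roughly as follows. This is only a sketch: the `WordRow` class, its column names and the type code `"S"` are assumptions made for illustration and are not taken from the actual table definition. It shows how extra information kept in a JSON structure can grow without changing the set of columns.

```python
import json
from dataclasses import dataclass, field, fields
from typing import Any, Dict, Optional


@dataclass
class WordRow:
    """Hypothetical stand-in for one row of the word table."""
    word_id: int
    word_token: str
    type: str                      # token type code (values here are made up)
    word: Optional[str] = None
    info: Dict[str, Any] = field(default_factory=dict)

    def info_json(self) -> str:
        """Serialise the extra information as it might be stored in a JSON column."""
        return json.dumps(self.info, sort_keys=True)


columns_before = [f.name for f in fields(WordRow)]

special = WordRow(1, "pub", "S", word="pub",
                  info={"class": "amenity", "type": "pub"})
# Extending the extra information only touches the JSON payload.
special.info["op"] = "in"

columns_after = [f.name for f in fields(WordRow)]
print(special.info_json())
print(columns_before == columns_after)
```

The point of the design is that a new attribute becomes a new key in `info` rather than a new column, so existing rows and queries on the fixed columns are unaffected.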
Sarah Hoffmann
34dcf02dee
fix typos in tokenizer docs
2021-07-28 11:28:49 +02:00
Sarah Hoffmann
5d7d7f15d9
Merge pull request #2401 from lonvia/port-add-data-to-python
...
Port add-data functions from PHP to Python
2021-07-26 12:38:56 +02:00
Sarah Hoffmann
0c023fb4d2
adapt cli tests to Python port for add-data
2021-07-26 10:41:37 +02:00
Sarah Hoffmann
1bd068d42d
remove unused update script
2021-07-26 10:41:37 +02:00
Sarah Hoffmann
e42349c963
replace add-data function with native Python code
2021-07-26 10:41:37 +02:00
Sarah Hoffmann
878835e4bd
move add-data subcommand into a separate file
2021-07-25 18:14:12 +02:00
Sarah Hoffmann
8096a1d67f
fix parameters for TokenWord creation
2021-07-20 10:21:40 +02:00
Sarah Hoffmann
e16c5d5f70
Merge pull request #2397 from lonvia/increase-minimum-required-versions
...
Increase minimum required PostgreSQL version to 9.5
2021-07-19 14:28:02 +02:00
Sarah Hoffmann
2c8242c8df
remove special code for pre-9.5 PostgreSQL
...
9.5 is now the minimum requirement.
2021-07-19 10:24:57 +02:00
Sarah Hoffmann
e7d6f89aca
increase minimum version for PostgreSQL to 9.5
...
This is the minimum version we can test with the CI.
Version 9.5 also provides complete support for jsonb.
2021-07-19 10:21:19 +02:00
Sarah Hoffmann
379f5db516
require Python 3.6 also in CMakeFile
...
This had been forgotten when increasing the minimum Python version.
2021-07-19 10:14:14 +02:00
Sarah Hoffmann
ee32315378
Merge pull request #2396 from lonvia/partial-word-token
...
Reorganise code that builds the SearchDescription
2021-07-19 09:42:37 +02:00
Sarah Hoffmann
cca912af4e
make all Token members private
2021-07-18 22:54:55 +02:00
Sarah Hoffmann
86ea077092
merge marking rare name with adding name token
...
Only name tokens can be rare, so this should be the same
function.
2021-07-18 16:52:37 +02:00
Sarah Hoffmann
5d6aabc457
add documentation for public interface of SearchDescription
2021-07-18 16:10:42 +02:00
Sarah Hoffmann
b14ce959d9
factor out check if a token fits current search
...
Saves allocating an empty array.
2021-07-17 22:01:35 +02:00
Sarah Hoffmann
a48ebd9b47
move SearchDescription building into tokens
...
Moving the logic for extending the SearchDescription into the
token classes splits up the code and makes it more readable.
More importantly: it allows tokenizers to define custom token
classes in the future.
2021-07-17 20:24:33 +02:00
Sarah Hoffmann
3cd85eaaf1
remove Token from explicit input for SearchDescription extension
...
The token string is only required by the PartialToken type, so
it can simply save the token string internally. No need to pass
it to every type.
Also moves the check for multi-word partials to the token loader
code in the tokenizer. Multi-word partials can only happen with
the legacy tokenizer and when the database was loaded with an
older version of Nominatim. No need to keep the check for
everybody.
2021-07-17 18:18:31 +02:00
Sarah Hoffmann
ec3f6c9c42
factor out query position
...
Moves token and phrase position and phrase type into a separate
class that is handed in when assembling the search description.
This drastically reduces the number of parameters for the function
to extend the search descriptions and gives us more flexibility
in the future for more complex positional analysis.
2021-07-15 14:12:59 +02:00
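The refactoring described in this commit follows the general "parameter object" pattern, which might look like the sketch below. The names `QueryPosition`, `describe_extension` and all fields are hypothetical and chosen for illustration; they are not the project's actual classes or signatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryPosition:
    """Hypothetical bundle of positional state for one token."""
    phrase: int        # index of the phrase within the query
    token: int         # index of the token within the phrase
    phrase_type: str   # type of the phrase, '' when untyped

    def is_first_token(self) -> bool:
        return self.token == 0

    def is_first_phrase(self) -> bool:
        return self.phrase == 0


def describe_extension(word: str, position: QueryPosition) -> str:
    """Take one position object instead of three separate parameters."""
    where = "start" if position.is_first_token() else "inside"
    kind = position.phrase_type or "untyped"
    return f"{word}: {where} of phrase {position.phrase} ({kind})"


print(describe_extension("berlin", QueryPosition(phrase=1, token=0, phrase_type="city")))
```

Further positional checks can then be added as methods on the position class without touching the signature of every function that receives it.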
Sarah Hoffmann
143ff14466
remove special status of partial tokens
...
Full-word tokens are no longer marked by a space at the
beginning of the token. Use the new Partial token category
instead. This removes a couple of special cases that we don't
really need.
The word table still has the space for compatibility reasons,
so the tokenizer code needs to get rid of it when loading the
tokens.
2021-07-14 22:17:17 +02:00
Sarah Hoffmann
6070c3d1d5
introduce a separate token type for partials
...
This means that the leading space can be removed as a partial
word indicator.
2021-07-13 16:57:12 +02:00
Sarah Hoffmann
bc8b2d4ae0
Merge pull request #2393 from lonvia/fix-flake8-issues
...
Fix flake8 issues
2021-07-13 16:46:12 +02:00
Sarah Hoffmann
14f777da18
use psycopg's SQL quoting where possible
...
Use the SQL formatting supplied with psycopg whenever the
query needs to be put together from snippets.
2021-07-12 22:05:22 +02:00
Sarah Hoffmann
6f6681ce67
add helper function for execute_values
...
Make psycopg2's convenience function accessible through
the cursor.
2021-07-12 21:08:20 +02:00
Sarah Hoffmann
06602b4ec0
provide wrapper function for DROP TABLE
...
Use psycopg2 formatting to ensure correct quoting.
2021-07-12 20:32:46 +02:00
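To illustrate why such a wrapper is useful, here is a minimal sketch of identifier quoting for a DROP TABLE statement. It deliberately does not use psycopg2 (which ships an `sql` module for composing queries) so that it runs without a database; the helpers `quote_ident` and `drop_table_sql` are hypothetical and hand-roll the standard SQL rule of doubling embedded double quotes.

```python
def quote_ident(name: str) -> str:
    """Quote an SQL identifier by wrapping it in double quotes and
    doubling any double quote inside it (hand-rolled for illustration)."""
    return '"' + name.replace('"', '""') + '"'


def drop_table_sql(name: str, if_exists: bool = True) -> str:
    """Build a DROP TABLE statement with a safely quoted table name."""
    clause = "IF EXISTS " if if_exists else ""
    return f"DROP TABLE {clause}{quote_ident(name)}"


print(drop_table_sql("word"))
print(drop_table_sql('odd"name', if_exists=False))
```

In real code the library's own composition helpers are preferable to hand-written quoting, which is the approach the commit message describes.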
Sarah Hoffmann
cf98cff2a1
more formatting fixes
...
Found by flake8.
2021-07-12 17:45:42 +02:00
Sarah Hoffmann
b4fec57b6d
Merge pull request #2391 from lonvia/fix-sonar-issues
...
Fix bugs and code smells found by Sonarqube
2021-07-12 17:14:59 +02:00
Sarah Hoffmann
f8b5a63de3
factor out connection reset code
2021-07-12 14:58:44 +02:00
Sarah Hoffmann
568316f07c
simplify analyse function
2021-07-12 14:47:50 +02:00
Sarah Hoffmann
daa597b300
split up variant computation for better readability
2021-07-12 14:43:50 +02:00
Sarah Hoffmann
47adb2a3fc
reorganise process_place function
...
Move address processing into its own function as it is
rather extensive.
2021-07-12 11:57:55 +02:00
Sarah Hoffmann
fff0012249
simplify website setup code
...
Use format strings and move variable quoting code into extra
function.
2021-07-12 11:41:05 +02:00
Sarah Hoffmann
d5a1883b62
avoid repeated patterns for table name
2021-07-12 11:33:09 +02:00
Sarah Hoffmann
a08ef43e40
simplify if statements
2021-07-12 11:28:47 +02:00
Sarah Hoffmann
bc5e15996a
convert single case switch to if statement
2021-07-12 11:28:47 +02:00
Sarah Hoffmann
128ca800cd
avoid local variable assignment
2021-07-11 23:22:53 +02:00
Sarah Hoffmann
000d133af6
fix more missing braces on one-liners
2021-07-11 23:22:53 +02:00
Sarah Hoffmann
1e40d65aa9
remove dead code
2021-07-11 23:22:53 +02:00
Sarah Hoffmann
bffbe68ec3
do not intermix params with and without default
2021-07-11 23:22:53 +02:00
Sarah Hoffmann
58b10074ad
directly return data in function
...
The temporary variable is not necessary.
2021-07-11 19:24:04 +02:00
Sarah Hoffmann
d933ead2b5
remove unnecessarily nested ifs
...
Found by Sonarqube.
2021-07-11 19:11:37 +02:00
Sarah Hoffmann
1cdc30c5e8
remove unused functions
...
The functions were necessary for the transitory code
to Python and are no longer used.
2021-07-11 19:10:04 +02:00
Sarah Hoffmann
3661f7a321
avoid multiple returns of same value
...
Found by Sonarqube.
2021-07-11 18:23:42 +02:00
Sarah Hoffmann
27af9b102c
always use brackets on if statements
...
This adds brackets around all one-line if statements that did
not have them yet.
2021-07-10 17:04:46 +02:00
Sarah Hoffmann
500c61685b
remove unused variables
...
As reported by Sonarqube.
2021-07-09 16:36:42 +02:00