Sarah Hoffmann
02f6afa51b
always ignore multi term partials in search
...
Partial terms should only ever consist of one word. Ignore
any others; they are leftovers from inefficient word index
builds.
2021-05-23 22:13:03 +02:00
Sarah Hoffmann
10143e0ac7
Merge pull request #2342 from lonvia/icu-tokenizer-ci
...
Add BDD tests with icu tokenizer to CI runs
2021-05-22 10:36:35 +02:00
Sarah Hoffmann
8f3429939f
CI: run BDD tests with legacy_icu tokenizer
2021-05-21 23:18:45 +02:00
Sarah Hoffmann
00094c43d1
enable Tiger BDD API test for legacy_icu
2021-05-21 22:39:56 +02:00
Sarah Hoffmann
8bf15fa691
Merge pull request #2341 from lonvia/cleanup-python-tests
...
Cleanup and linting of python tests
2021-05-20 17:30:30 +02:00
Sarah Hoffmann
63dc503b8d
Merge pull request #2337 from mogita/fix/invalid-query-string
...
fix: add the missing question mark
2021-05-20 10:26:23 +02:00
Sarah Hoffmann
430c316e45
test: fix linting errors
2021-05-19 23:07:39 +02:00
Sarah Hoffmann
01f5a9ff84
test: more use of table_factory
2021-05-19 17:37:03 +02:00
Sarah Hoffmann
af52eed0dd
test: avoid use of tempfile module
...
Use the tmp_path fixture instead which provides automatic
cleanup.
2021-05-19 16:43:26 +02:00
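A minimal sketch of the pattern this commit describes, assuming a pytest test suite. `tmp_path` is pytest's built-in fixture; the helper `write_config` and the file name `settings.env` are hypothetical and only serve as something to test.

```python
from pathlib import Path


def write_config(directory: Path, content: str) -> Path:
    """Hypothetical helper under test: write a settings file, return its path."""
    target = directory / "settings.env"
    target.write_text(content, encoding="utf-8")
    return target


def test_write_config(tmp_path):
    # tmp_path is a per-test pathlib.Path directory provided by pytest.
    # No manual tempfile handling or cleanup code is needed in the test.
    target = write_config(tmp_path, "KEY=value\n")
    assert target.parent == tmp_path
    assert target.read_text(encoding="utf-8") == "KEY=value\n"
```

Because the fixture hands over a `pathlib.Path`, the test body needs neither `tempfile` nor explicit teardown.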
Sarah Hoffmann
f93d0fa957
test: use src_dir fixture instead of self-computed paths
2021-05-19 16:03:54 +02:00
Sarah Hoffmann
c06a1d007a
test: replace raw execute() with fixture code where possible
2021-05-19 12:11:04 +02:00
Sarah Hoffmann
65bd749918
test: use table_rows() and execute_values() where possible
...
Some uses of scalar() could also be replaced with convenience
functions from the word table mock.
2021-05-19 10:51:10 +02:00
Sarah Hoffmann
510eb53f53
test: move Testingcursor into separate class
...
Also adds more convenience functions: counting with a where
statement and a wrapper to execute_values().
2021-05-19 10:30:36 +02:00
mogita
507543a482
fix: add the missing question mark
2021-05-19 13:35:15 +08:00
Sarah Hoffmann
16bb007135
Merge pull request #2336 from lonvia/do-not-mask-error-when-loading-tokenizer
...
Do not hide errors when importing tokenizer
2021-05-18 23:00:10 +02:00
Sarah Hoffmann
1ffb6bd5d0
Merge pull request #2321 from AntoJvlt/csv-import-special-phrases
...
CSV import for special phrases and loader refactoring
2021-05-18 22:58:25 +02:00
AntoJvlt
799a4c9ab6
Documentation update and small code fixes
2021-05-18 22:35:21 +02:00
Sarah Hoffmann
b2722650d4
do not hide errors when importing tokenizer
...
Explicitly check for the tokenizer source file to check that
the name is correct. We can't use the import error for that
because it hides other import errors like a missing
library.
Fixes #2327.
2021-05-18 16:28:21 +02:00
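The idea in this commit can be sketched roughly as follows. This is not the project's actual code: the function name `load_tokenizer_module`, its parameters, and the `<name>_tokenizer.py` file layout are assumptions made for illustration.

```python
import importlib
from pathlib import Path


def load_tokenizer_module(name: str, package: str, source_dir: Path):
    """Import `<package>.<name>_tokenizer`, reporting a wrong name separately.

    All names here are hypothetical. The point is the order of checks.
    """
    source_file = source_dir / f"{name}_tokenizer.py"
    if not source_file.is_file():
        # A misspelled tokenizer name is detected from the file system,
        # not from a caught ImportError.
        raise ValueError(f"Tokenizer '{name}' not found.")

    # Deliberately no try/except: an ImportError raised from here comes
    # from inside the module (e.g. a missing library) and must stay visible.
    return importlib.import_module(f"{package}.{name}_tokenizer")
```

With this split, a typo in the tokenizer name and a missing dependency produce two distinct errors instead of one misleading message.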
Sarah Hoffmann
54b06d7abc
Merge pull request #2332 from lonvia/fix-keyword-details
...
Always use object type for details keywords
2021-05-18 11:30:58 +02:00
Sarah Hoffmann
fef1bbb1a7
always use object type for details keywords
...
When name and address are empty, the keywords field in the response
of the details API would be an array because that is what PHP's
json_encode defaults to with an empty array(). This default can only
be changed globally per json_encode call and that might cause
unintended collateral damage. Work around the issue by making
name and address an empty array instead of keywords.
Fixes #2329.
2021-05-17 16:36:32 +02:00
AntoJvlt
3206bf59df
Resolve conflicts
2021-05-17 13:52:35 +02:00
AntoJvlt
a33f2c0f5b
Special phrases documentation updated
2021-05-17 13:25:16 +02:00
AntoJvlt
8b8dfc46eb
Added --no-replace option for special phrases import and added corresponding tests
2021-05-17 13:25:06 +02:00
AntoJvlt
06aab389ed
Code cleaning and SPLoader deleted
2021-05-16 16:59:12 +02:00
AntoJvlt
fb0ebb5bf0
Add tests for the new SPWikiLoader and SPCsvLoader
2021-05-16 16:10:06 +02:00
Sarah Hoffmann
925726222f
Merge pull request #2323 from darkshredder/disable-search-reverse-only
...
Feat: Disabled search API for --reverse-only imports
2021-05-14 10:40:22 +02:00
Sarah Hoffmann
550e7edb64
Merge pull request #2328 from lonvia/convert-tiger-to-csv
...
Switch external Tiger data to CSV format
2021-05-14 09:58:50 +02:00
Sarah Hoffmann
2992dea5c8
install default settings for legacy_icu tokenizer
2021-05-14 09:44:10 +02:00
Sarah Hoffmann
e76e4bd964
adapt documentation to use Tiger CSV dump
2021-05-14 00:02:50 +02:00
Sarah Hoffmann
7d621389ee
adapt tests to new TIGER CSV format
2021-05-14 00:02:50 +02:00
Sarah Hoffmann
35efe3b41c
use tokenizer during Tiger data import
...
This also changes the required import format to CSV.
2021-05-14 00:02:50 +02:00
Darkshredder
e5ffc59cd5
feat: Added reverse-only-search validation
2021-05-14 02:36:21 +05:30
Sarah Hoffmann
d7f9d2bde9
Merge pull request #2326 from lonvia/wokerpool-for-tiger-data
...
Use WorkerPool when importing Tiger data
2021-05-13 22:09:56 +02:00
Sarah Hoffmann
5feece64c1
use WorkerPool for Tiger data import
...
Requires adding an option to ignore SQL errors.
2021-05-13 20:36:50 +02:00
Sarah Hoffmann
b9a09129fa
move WorkerPool into db module
...
The pool is independent of the indexer and may also be used
by other parts of the software.
2021-05-13 17:11:17 +02:00
Sarah Hoffmann
96e6bbe3a1
Merge pull request #2325 from lonvia/do-not-precompute-postcodes
...
Do not preload postcodes in the legacy tokenizer
2021-05-13 17:00:29 +02:00
Frederik Ramm
fe39185894
Add array_key_last function for PHP <7.3
...
This patch adds an array_key_last function if it doesn't exist yet. Fixes #2316. It has been tested on PHP 7.2.24 but not on PHP 7.3.
2021-05-13 16:42:22 +02:00
Sarah Hoffmann
fc860787dd
do not preload postcodes
...
This is too expensive for updates.
2021-05-13 16:14:12 +02:00
Sarah Hoffmann
63e35574d4
Merge pull request #2324 from lonvia/generic-external-postcodes
...
Rework postcode handling and generalised external postcode support
2021-05-13 14:52:19 +02:00
Sarah Hoffmann
db2dbf15f7
fix token_info migration
...
A bad indent meant that only one table received the new column.
2021-05-13 14:31:41 +02:00
Sarah Hoffmann
f5977dac75
ignore invalid coordinates in external postcodes
2021-05-13 14:15:42 +02:00
Sarah Hoffmann
8f2746fe24
ignore entries without country code
2021-05-13 14:15:42 +02:00
Sarah Hoffmann
41b9bc9984
add documentation for external postcode feature
2021-05-13 14:15:42 +02:00
Sarah Hoffmann
1ccd4360b4
correctly handle removing all postcodes for country
2021-05-13 14:15:42 +02:00
Sarah Hoffmann
bf864b2c54
index postcodes after refreshing
2021-05-13 14:15:42 +02:00
Sarah Hoffmann
4abaf71234
add and extend tests for new postcode handling
2021-05-13 14:15:42 +02:00
Sarah Hoffmann
a4aba23a83
move filling of postcode table to python
...
The Python code now takes care of reading postcodes from placex,
enhancing them with potentially existing external postcodes and
updating location_postcodes accordingly. The initial setup and
updates use exactly the same function.
External postcode handling has been generalized. External postcodes
for any country are now accepted. The format of the external postcode
file has changed. We now expect CSV, potentially gzipped. The
postcodes are no longer saved in the database.
2021-05-13 14:15:42 +02:00
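As a rough illustration of reading an external postcode file that is CSV and potentially gzipped, the following sketch may help. It is not the project's implementation: the function name `read_postcodes` and the column names `postcode`, `lat` and `lon` are assumptions.

```python
import csv
import gzip
from pathlib import Path


def read_postcodes(path: Path):
    """Yield (postcode, lat, lon) tuples from a CSV file, gzipped or plain.

    Column names are assumed for this sketch. Rows with missing or
    invalid coordinates are skipped.
    """
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8", newline="") as fd:
        for row in csv.DictReader(fd):
            try:
                postcode = row["postcode"]
                lat, lon = float(row["lat"]), float(row["lon"])
            except (KeyError, TypeError, ValueError):
                continue  # missing column or non-numeric coordinate
            if not postcode:
                continue
            # The range check also rejects NaN, since comparisons with NaN fail.
            if not (-90 <= lat <= 90 and -180 <= lon <= 180):
                continue
            yield postcode, lat, lon
```

Choosing the opener from the file suffix keeps a single code path for both plain and compressed input.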
Sarah Hoffmann
cae0cf3546
Merge pull request #2322 from mtmail/type-label-already-lowercased
...
typelabel value is already lowercased
2021-05-12 20:25:22 +02:00
marc tobias
38f9e18afb
typelabel value is already lowercased
2021-05-12 19:16:51 +02:00
AntoJvlt
9d83da830f
Introduction of SPCsvLoader to load special phrases from a csv file
2021-05-10 23:26:39 +02:00