Merge pull request #2427 from lonvia/remove-us-states-special-casing

Move US state hack into legacy tokenizer
Sarah Hoffmann 2021-08-17 21:55:32 +02:00 committed by GitHub
commit 656c1291b1
GPG Key ID: 4AEE18F83AFDEB23
4 changed files with 51 additions and 9 deletions

@@ -506,13 +506,6 @@ class Geocode
userError('Query string is not UTF-8 encoded.');
}
-// Conflicts between US state abreviations and various words for 'the' in different languages
-if (isset($this->aLangPrefOrder['name:en'])) {
-$sQuery = preg_replace('/(^|,)\s*il\s*(,|$)/i', '\1illinois\2', $sQuery);
-$sQuery = preg_replace('/(^|,)\s*al\s*(,|$)/i', '\1alabama\2', $sQuery);
-$sQuery = preg_replace('/(^|,)\s*la\s*(,|$)/i', '\1louisiana\2', $sQuery);
-}
// Do we have anything that looks like a lat/lon pair?
$sQuery = $oCtx->setNearPointFromQuery($sQuery);

@@ -9,7 +9,8 @@ namespace Nominatim;
*/
class Phrase
{
-// Complete phrase as a string.
+// Complete phrase as a string (guaranteed to have no leading or trailing
+// spaces).
private $sPhrase;
// Element type for structured searches.
private $sPhraseType;

@@ -87,6 +87,23 @@ class Tokenizer
$sNormQuery .= ','.$this->normalizeString($oPhrase->getPhrase());
$sSQL .= 'make_standard_name(:' .$iPhrase.') as p'.$iPhrase.',';
$aParams[':'.$iPhrase] = $oPhrase->getPhrase();
+// Conflicts between US state abbreviations and various words
+// for 'the' in different languages
+switch (strtolower($oPhrase->getPhrase())) {
+case 'il':
+$aParams[':'.$iPhrase] = 'illinois';
+break;
+case 'al':
+$aParams[':'.$iPhrase] = 'alabama';
+break;
+case 'la':
+$aParams[':'.$iPhrase] = 'louisiana';
+break;
+default:
+$aParams[':'.$iPhrase] = $oPhrase->getPhrase();
+break;
+}
}
$sSQL = substr($sSQL, 0, -1);

@@ -61,7 +61,7 @@ Feature: Searching of simple objects
| osm |
| N20 |
-Scenario: when the housenumber is missing the stret is still returned
+Scenario: when the housenumber is missing the street is still returned
Given the grid
| 1 | | 2 |
Given the places
@@ -72,3 +72,34 @@ Feature: Searching of simple objects
Then results contain
| osm |
| W1 |
+Scenario Outline: Special cased american states will be found
+Given the grid
+| 1 | | 2 |
+| | 10 | |
+| 4 | | 3 |
+Given the places
+| osm | class | type | admin | name | name+ref | geometry |
+| R1 | boundary | administrative | 4 | <state> | <ref> | (1,2,3,4,1) |
+Given the places
+| osm | class | type | name | geometry |
+| N2 | place | town | <city> | 10 |
+| N3 | place | city | <city> | country:ca |
+When importing
+And sending search query "<city>, <state>"
+Then results contain
+| osm |
+| N2 |
+When sending search query "<city>, <ref>"
+| accept-language |
+| en |
+Then results contain
+| osm |
+| N2 |
+Examples:
+| city | state | ref |
+| Chicago | Illinois | IL |
+| Auburn | Alabama | AL |
+| New Orleans | Louisiana | LA |
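
Note on the behaviour change: the removed code in Geocode rewrote the raw query string with regular expressions, and only when 'name:en' was among the language preferences. The new switch in the legacy tokenizer works per phrase, relies on the phrase already being trimmed, and has no language check. A minimal standalone sketch of that per-phrase lookup follows. The helper name expandStateAbbreviation and the comma splitting are assumptions made for illustration only, not Nominatim code.

```php
<?php
// Hypothetical helper, not part of Nominatim: mirrors the switch statement
// added to the legacy tokenizer in this commit as a lookup table.
function expandStateAbbreviation(string $sPhrase): string
{
    // The three abbreviations special-cased in the diff.
    $aStates = array(
        'il' => 'illinois',
        'al' => 'alabama',
        'la' => 'louisiana',
    );
    $sKey = strtolower($sPhrase);
    // Only a phrase consisting of nothing but the abbreviation is replaced;
    // anything else is passed through unchanged.
    return $aStates[$sKey] ?? $sPhrase;
}

// Assumption: phrases are the comma-separated, trimmed parts of the query.
$aPhrases = array_map('trim', explode(',', 'Chicago, IL'));
$aExpanded = array_map('expandStateAbbreviation', $aPhrases);
echo implode(', ', $aExpanded), "\n"; // prints "Chicago, illinois"
```

Because the comparison is made against the whole phrase, a phrase such as "Il Duomo" is left alone, which matches what the old anchored regular expressions did for comma-delimited input.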