Time/EN: Tighten up handling of split times like "five ten"

Summary:
While debugging an attempt to extend our handling of spelled-out
times, I realized that we are too aggressive in parsing times like
"five ten": we parse "five nine" as possibly meaning "5:09", which
isn't something an English speaker would say (or rather, if they did,
they would more likely mean "five (to) six" or something similar).
The rule now requires the second number to be between 10 and 59.
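
To illustrate the effect of the tightened range, here is a minimal, self-contained sketch. It does not use Duckling's types: `inclusiveBetween`, `splitTime`, `oldRule`, and `newRule` are hypothetical names, the 0 to 23 hour range is an assumption, and the minute bounds are assumed to be inclusive, mirroring how `isIntegerBetween 1 59` and `isIntegerBetween 10 59` read in the diff.

```haskell
-- Hypothetical stand-in for the "<hour-of-day> <integer>" rule.
-- None of these names exist in Duckling; they only model the range check.

-- | True when n lies in the inclusive range [lo, hi].
inclusiveBetween :: Int -> Int -> Int -> Bool
inclusiveBetween lo hi n = lo <= n && n <= hi

-- | Read "<hour> <integer>" as a split time, accepting the integer as
-- minutes only when it is at least minLo (and at most 59).
splitTime :: Int -> Int -> Int -> Maybe (Int, Int)
splitTime minLo hour minute
  | inclusiveBetween 0 23 hour && inclusiveBetween minLo 59 minute =
      Just (hour, minute)
  | otherwise = Nothing

-- | Behaviour before this change: minutes 1..59 are accepted.
oldRule :: Int -> Int -> Maybe (Int, Int)
oldRule = splitTime 1

-- | Behaviour after this change: minutes 10..59 are accepted.
newRule :: Int -> Int -> Maybe (Int, Int)
newRule = splitTime 10
```

Under these assumptions, `oldRule 5 9` yields `Just (5,9)` while `newRule 5 9` yields `Nothing`; "five ten" is still accepted, since `newRule 5 10` yields `Just (5,10)`.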

Reviewed By: chessai

Differential Revision: D27848429

fbshipit-source-id: 34d783332fd60359ad9b6e7862367453bc93a1d1
Steven Troxler 2021-04-20 05:22:46 -07:00 committed by Facebook GitHub Bot
parent a250e60cbb
commit 35532b0b7c
3 changed files with 3 additions and 1 deletions

@@ -7,6 +7,7 @@
### Rulesets
* EN (English)
* Time: Allow latent match for \<part-of-day\> \<latent-time-of-day\>
+* Time: Avoid parsing phrases like 'two five' as times
* RU (Russian)
* Duration: Diminutives for minutes and hours

@@ -112,6 +112,7 @@ negativeCorpus = (testContext, testOptions, examples)
, "A4 A5"
, "palm"
, "Martin Luther King' day"
+, "two three"
]
latentCorpus :: Corpus

@@ -766,7 +766,7 @@ ruleHONumeral = Rule
{ name = "<hour-of-day> <integer>"
, pattern =
[ Predicate isAnHourOfDay
-, Predicate $ isIntegerBetween 1 59
+, Predicate $ isIntegerBetween 10 59
]
, prod = \tokens -> case tokens of
(Token Time TimeData{TTime.form = Just (TTime.TimeOfDay (Just hours) is12H)