Summary:
In theory, years can be negative. In practice this is rare, and when it does occur the year is usually written with the B.C. suffix rather than a minus sign.
Update `ruleYearLatent` so that it no longer accepts negative years, in all languages.
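As a rough illustration of the intent (not the actual `ruleYearLatent` code), the check amounts to a range predicate on the bare integer. The names `isLatentYear`, `minLatentYear` and `maxLatentYear`, and the bounds themselves, are assumptions made for this sketch.

```haskell
-- Hypothetical bounds; the commit does not state the range Duckling uses.
minLatentYear, maxLatentYear :: Int
minLatentYear = 0
maxLatentYear = 9999

-- | True when a bare integer may be read as a latent year.
-- Negative values are rejected outright.
isLatentYear :: Int -> Bool
isLatentYear n = n >= minLatentYear && n <= maxLatentYear

main :: IO ()
main = mapM_ (print . isLatentYear) [-2021, 0, 2021]
```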
Reviewed By: codyohl
Differential Revision: D33895871
fbshipit-source-id: 818000104da825aab91be7fa2a72704aa350a91a
Summary:
Some changes were originally suggested by me during the review of
https://github.com/facebook/duckling/pull/474.
Others are new.
1. "Day after tomorrow/before yesterday"
2. Ordinals in the form of number+suffix like "8th of March"
3. Tuesdays require a special preposition.
4. Support "Yo" (U+0451) in "fourth" and "during daylight".
5. Support special preposition for "next week".
6. Support "one before last" adjective for time grains.
7. Proper suffixes for "quarter" grain.
8. Support "at midnight".
9. Support alternative flag for "afternoon".
Changes in Ordinal and TimeGrain are all driven by the new examples in the
corpus for Time.
There are also a couple of bugfixes:
1. A hidden Latin "e" was present in an otherwise Cyrillic regex.
2. Wrong order of options in a regex separated with "|" prevented some matches.
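To illustrate the second bugfix: assuming a leftmost-first regex engine, an alternation commits to the first option that matches, so a short option listed before a longer one that it prefixes can win. The sketch below mimics that with plain prefix matching. `firstAlternative` is a hypothetical helper and the Latin strings are placeholders, not the Cyrillic patterns from the rules.

```haskell
import Data.List (isPrefixOf)

-- | Leftmost-first alternation: return the first alternative that is a
-- prefix of the input, the way "a|b|c" is tried option by option.
firstAlternative :: [String] -> String -> Maybe String
firstAlternative alts input =
  case filter (`isPrefixOf` input) alts of
    (a : _) -> Just a
    []      -> Nothing

main :: IO ()
main = do
  -- Short option first: the longer option is never tried.
  print (firstAlternative ["ab", "abc"] "abc")
  -- Long option first: the full input is matched.
  print (firstAlternative ["abc", "ab"] "abc")
```

Listing the longer option first is the usual remedy when one option is a prefix of another.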
Reviewed By: haoxuany
Differential Revision: D32311714
fbshipit-source-id: 084f6c3893eb5bfd767c267f558b910c6854eb59
Summary:
Add the most common rules for Japanese time dimension.
Pull Request resolved: https://github.com/facebook/duckling/pull/646
Reviewed By: stroxler
Differential Revision: D30675005
Pulled By: chessai
fbshipit-source-id: 917aa98b5cfe0c73d207b1f51b80d8e17a1c7e6a
Summary: An interval regex was overzealous and matching too much, so `hh:mm:ss` was getting parsed as an interval instead of a time.
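For reference, a minimal sketch of the intended reading: all three fields of `hh:mm:ss` belong to a single time of day. This is not the regex from the rule; `parseHhMmSs`, `field` and `splitOn` are hypothetical helpers written against `base` only.

```haskell
import Data.Char (isDigit)
import Text.Read (readMaybe)

-- | Split a string on a separator character.
splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn sep rest

-- | Parse a two-digit field and check it lies below an exclusive bound.
field :: Int -> String -> Maybe Int
field bound s
  | length s == 2 && all isDigit s = do
      n <- readMaybe s
      if n < bound then Just n else Nothing
  | otherwise = Nothing

-- | Read "hh:mm:ss" as one time of day, never as an interval.
parseHhMmSs :: String -> Maybe (Int, Int, Int)
parseHhMmSs s = case splitOn ':' s of
  [h, m, sec] -> (,,) <$> field 24 h <*> field 60 m <*> field 60 sec
  _           -> Nothing

main :: IO ()
main = print (parseHhMmSs "15:23:24")
```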
Reviewed By: patapizza
Differential Revision: D30608223
fbshipit-source-id: b24c18146070f15ada80b9401e67f0c0aefef7d8
Summary:
Some time recognition improvements for Catalan:
- morning should be a time range recognised until noon
- "dema" can also be used for tomorrow (besides "demà")
- "se" alone should not be understood as September
Pull Request resolved: https://github.com/facebook/duckling/pull/639
Reviewed By: stroxler
Differential Revision: D30312076
Pulled By: chessai
fbshipit-source-id: 1a42bbd7eecc4f5690145ee9cadb8eccae8edd08
Summary: Initialise Time for CA (Catalan) language
Reviewed By: stroxler
Differential Revision: D28455273
Pulled By: chessai
fbshipit-source-id: be9a4d61692ba4fb32986e161e9fdd6d25a357dc
Summary:
I don't think abbreviating pattern to ptn is a win, the abbreviation
isn't good enough to be unambiguous (I found this because I was
trying to figure out what "ptn" meant), and `pattern` isn't that
long a name.
Reviewed By: chessai
Differential Revision: D28462776
fbshipit-source-id: bd685720b198fed791d84c00732eb1873b37528b
Summary:
Solutions were:
- use targeted and qualified imports to avoid pulling in the universe of Duckling.Types
- use long-form descriptive names in a few places
- shuffle a let clause to just define an output instead of a local func
Also got rid of another lint error suggesting to use a section instead of flip;
the module is now lint-warning free.
Reviewed By: chessai
Differential Revision: D28462775
fbshipit-source-id: 1e2855756b22cb62db0d94334a7e063aa728b7bf
Summary:
Using rules of thumb:
- use unabbreviated names for arguments to top-level functions where there's
a clash (e.g. lots of t -> time transformations)
- use abbreviated names for nested local functions (so e.g. t, d to avoid
a clash with the `day` top-level function)
Reviewed By: chessai
Differential Revision: D28462777
fbshipit-source-id: 8795d038b2c3a65b60f0d2d9091b7c56cc8a5ff7
Summary: Currently, Duckling will accept "The first christmas of next month" in this rule, which is nonsensical. This reduces the scope of the times the rule recognises, thereby limiting us to a set of more sensible resolutions.
Reviewed By: stroxler
Differential Revision: D27861417
fbshipit-source-id: 3f19700af7298a6238c59f5de0598168d4b4a3c4
Summary:
While I was working on fixing #604, I came across the rules
`ruleMilitarySpelledOutAMPM(2)`, which were actually capturing
some of my test phrases and confusing me.
This commit removes them because
- they aren't needed: the existing latent spelled-out hour + minute rules plus
the "(in the )?(am/pm)" rules together give the same behavior
- they are confusingly named - these aren't military times at all, they are
spelled-out civilian times
Reviewed By: haoxuany
Differential Revision: D27848485
fbshipit-source-id: ba1ed16ec22b5139b0b500b44dc91adb1b5e3d82
Summary:
It's common to use dashes when spelling out times longhand,
e.g. "five-thirty am", but Duckling wasn't handling this at all.
This commit adds rules for times spelled out with dashes. The
rules explicitly forbid the second of the two times from including
digits via a negative match. This is because
- it wouldn't be at all idiomatic to write five-26 or five-oh-6
- allowing that pattern clashes with time range parsing, e.g.
"9-10 am" should parse as a time range, not as "9:10 am"
Reviewed By: chessai
Differential Revision: D27848428
fbshipit-source-id: dfe8b98cb38119a16db2a19db47fd3128783e617
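The dashed-time behaviour described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the Duckling rules: the word tables are deliberately tiny, and `parseDashedTime`, `hourWords` and `minuteWords` are hypothetical names. The explicit digit check stands in for the negative match.

```haskell
import Data.Char (isDigit, toLower)

-- Hypothetical, deliberately small vocabularies for illustration.
hourWords :: [(String, Int)]
hourWords =
  zip (words "one two three four five six seven eight nine ten eleven twelve") [1 ..]

minuteWords :: [(String, Int)]
minuteWords =
  [("ten", 10), ("fifteen", 15), ("twenty", 20), ("thirty", 30), ("forty", 40), ("fifty", 50)]

-- | Parse "<hour>-<minutes>" where both parts are spelled out.
parseDashedTime :: String -> Maybe (Int, Int)
parseDashedTime s = case break (== '-') (map toLower s) of
  (h, '-' : m)
    | any isDigit m -> Nothing -- negative match: no digits after the dash
    | otherwise     -> (,) <$> lookup h hourWords <*> lookup m minuteWords
  _ -> Nothing

main :: IO ()
main = mapM_ (print . parseDashedTime) ["five-thirty", "five-26", "9-10"]
```

Rejecting digits after the dash leaves "9-10" free to be picked up by time range parsing.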
Summary:
While debugging an attempt to extend our handling of spelled-out
times, I realized that we are being too aggressive in our parsing of
times like "five ten", because we'll parse "five nine" as possibly
meaning "5:09", which isn't something an English speaker would say
(or rather, if they did, it's more likely they mean "five (to) six"
or something similar).
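One way to picture the tightening, as a sketch only: accept a spelled-out `<hour> <minutes>` pair when the minutes are at least 10. The threshold and the name `spelledOutTime` are assumptions for illustration; the commit message does not state the exact bound used in the rule.

```haskell
-- | Combine an hour and a minute count that were both spelled out as bare
-- numbers. Assumption: minutes below 10 are rejected, because "five nine"
-- is not how English speakers say 5:09.
spelledOutTime :: Int -> Int -> Maybe (Int, Int)
spelledOutTime h m
  | h >= 1 && h <= 12 && m >= 10 && m <= 59 = Just (h, m)
  | otherwise                               = Nothing

main :: IO ()
main = do
  print (spelledOutTime 5 10) -- "five ten"
  print (spelledOutTime 5 9)  -- "five nine"
```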
Reviewed By: chessai
Differential Revision: D27848429
fbshipit-source-id: 34d783332fd60359ad9b6e7862367453bc93a1d1
Summary:
This commit gets rid of all the easy-to-fix lint warnings on time helper modules:
- replacing unnecessary `.` with `$`
- Flipping a lambda in a map to an infix operation
- Use `ts` for a list of times, not `series` which produces a pretty confusing naming collision
There are still quite a lot of lint errors related to name masking, which would be challenging to fix without us coming to an agreement about naming conventions.
But at least in my editor, name-masking errors are a lot less visually noisy than other errors (they only highlight the one name) so I don't mind them as much when skimming the code.
Reviewed By: chessai
Differential Revision: D27842198
fbshipit-source-id: 9091e5349657243b61d7ee169d0d06dd2122ac17
Summary:
This fixes #592 in a very conservative way. `ruleIntersect` does not detect
"tonight 815" and "tonight eight fifteen" as it does "tonight 8:15" because it
explicitly forbids the second part of the intersection from being latent,
unless it is a year.
I don't think it's a good idea to remove the restriction on latent inputs in
`ruleIntersect`, so instead I just made a new rule specifically for the
intersection of `<part-of-day> <time-of-day>`.
It also seems to me that there's a lot of room for this to be too aggressive,
for example if I say "tonight 500 people will laugh" the "tonight" and "500"
aren't really linked. So, I set the rule to be latent; this may be too conservative
to be useful though (do client libraries usually allow latent results?).
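A minimal sketch of the idea, assuming a simplified model rather than Duckling's token types: a part of day is paired with a compact `hmm`/`hhmm` number, and the result is always flagged latent so that callers must opt in. `Resolved`, `parseCompact` and `partOfDayAt` are hypothetical names.

```haskell
import Data.Char (isDigit)
import Text.Read (readMaybe)

-- | Simplified stand-in for a resolved time; not Duckling's TimeData.
data Resolved = Resolved
  { partOfDay :: String
  , hour      :: Int
  , minute    :: Int
  , latent    :: Bool
  } deriving (Eq, Show)

-- | Read a compact "hmm" or "hhmm" string such as "815" or "2015".
parseCompact :: String -> Maybe (Int, Int)
parseCompact s
  | length s `elem` [3, 4] && all isDigit s = do
      let (h, m) = splitAt (length s - 2) s
      hh <- readMaybe h
      mm <- readMaybe m
      if hh < 24 && mm < 60 then Just (hh, mm) else Nothing
  | otherwise = Nothing

-- | Intersect a part of day with a compact time; the result stays latent.
partOfDayAt :: String -> String -> Maybe Resolved
partOfDayAt pod s = do
  (h, m) <- parseCompact s
  Just (Resolved pod h m True)

main :: IO ()
main = print (partOfDayAt "tonight" "815")
```

Under this model "tonight 500" also resolves, which is why the result is kept latent.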
Reviewed By: chessai
Differential Revision: D27842596
fbshipit-source-id: 36ac59e31c632d4864241bce291147a46d52f780
Summary:
The facebook internal linters prefer us to avoid
excessive point-free style and extra $ where we could
instead move existing brackets.
Making those style tweaks for Time/EN/Rules.hs because
I was looking at the file as part of
Reviewed By: chessai
Differential Revision: D27108042
fbshipit-source-id: 7c8e76578476ea14d655131943e693c5159b12d2
Summary:
I was looking at adding support for "next week" constructions in Spanish to
close https://github.com/facebook/duckling/issues/553 (which it appears has
already been handled), when I noticed that the equivalent logic for English
has been split into two separate examples: "coming week" isn't in the same
example as other equivalent constructs like "upcoming week" and "next week".
This diff combines them, which I think is clearer and takes fewer lines of code.
Reviewed By: chessai
Differential Revision: D26892322
fbshipit-source-id: 68ca4644759198fc79d963ae080495c3f2d4a923
Summary: Due to the exploit in T85548324, factor the input to get a smaller parse tree. The existing rule parses tail-recursively, whereas this one uses `ruleIntersect`, which is still bad but slightly better.
Differential Revision: D26657170
fbshipit-source-id: fe3a738073b4d30ae401521bb692f4a4bba48d96
Summary:
There are some clashes between time formats `hhmm` and date formats `ddmm`. For example, depending on context, `22.10` can mean the clock time 22:10 (ten past ten in the evening) or the twenty-second of October. In general it's correct to interpret this as a clock time, as Duckling currently does.
But there are some cases not currently covered by Duckling where the date reading is unambiguous, e.g. `12.03.2018` and `27.11`. These are included here (in addition to midnight `24:00`, which was also missing).
#### Changes:
- A bug in the `ruleDdmm` regex meant that dates in the format `dd/mm` where `mm > 9` were not parsed
- `ruleYyyymmdd` now also parses dots and forward slashes, i.e. `2012.05.14` and `2012/05/14`
- New rule `rule2400` parses `24:00` and `24.00` (I elected not to include it in `ruleMidnighteodendOfDay` as it has grain minute rather than day)
- New rule `ruleDmm` parses `1/10`, `9.12` etc
- New rule `ruleDDm` parses `10/3`, `11.1` etc
- New rule `ruleDdDotMm` parses `25.02`, `31.10` etc
- `ruleDdmmyyyy` now also parses dots, i.e. `03.10.1983`
- New tests
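The separator handling for the full-date formats above can be sketched without regexes. This is an illustration under assumptions, not the rule code: `parseDdmmyyyy` and `parseYyyymmdd` are hypothetical stand-ins for the rules of similar name, accepting `-` for the year-first form is an assumption, and mixed separators are not rejected here.

```haskell
import Data.Char (isDigit)
import Text.Read (readMaybe)

-- | Split on any of the accepted separator characters.
splitOnAny :: String -> String -> [String]
splitOnAny seps s = case break (`elem` seps) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOnAny seps rest

number :: String -> Maybe Int
number s
  | not (null s) && all isDigit s = readMaybe s
  | otherwise                     = Nothing

-- | Loose range check only; month lengths are not validated.
mkDate :: Int -> Int -> Int -> Maybe (Int, Int, Int)
mkDate y m d
  | m >= 1 && m <= 12 && d >= 1 && d <= 31 = Just (y, m, d)
  | otherwise                              = Nothing

-- | "03.10.1983" or "03/10/1983" as (year, month, day).
parseDdmmyyyy :: String -> Maybe (Int, Int, Int)
parseDdmmyyyy s = case splitOnAny "./" s of
  [d, m, y] | length y == 4 -> do
    dd <- number d
    mm <- number m
    yy <- number y
    mkDate yy mm dd
  _ -> Nothing

-- | "2012.05.14", "2012/05/14" or "2012-05-14" as (year, month, day).
parseYyyymmdd :: String -> Maybe (Int, Int, Int)
parseYyyymmdd s = case splitOnAny "./-" s of
  [y, m, d] | length y == 4 -> do
    yy <- number y
    mm <- number m
    dd <- number d
    mkDate yy mm dd
  _ -> Nothing

main :: IO ()
main = do
  print (parseDdmmyyyy "03.10.1983")
  print (parseYyyymmdd "2012/05/14")
```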
Pull Request resolved: https://github.com/facebook/duckling/pull/395
Reviewed By: patapizza
Differential Revision: D26193069
Pulled By: chessai
fbshipit-source-id: cf711807fa1d40be2303f2426d74ded40c2e23b3
Summary: Adds a new rule that parses year intervals such as "1960 - 1961". See inline comments for heuristics.
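A rough sketch of what such a rule has to decide. The heuristics here (both sides are four-digit years, and the start precedes the end) are assumptions made for illustration; the actual ones live in the inline comments of the rule. `parseYearInterval` is a hypothetical name.

```haskell
import Data.Char (isDigit, isSpace)
import Text.Read (readMaybe)

-- | Parse "<year> - <year>" into (start, end).
-- Assumed heuristics: four-digit years only, and start < end.
parseYearInterval :: String -> Maybe (Int, Int)
parseYearInterval s = case break (== '-') (filter (not . isSpace) s) of
  (a, '-' : b)
    | isYear a && isYear b -> do
        from <- readMaybe a
        to   <- readMaybe b
        if from < to then Just (from, to) else Nothing
  _ -> Nothing
  where
    isYear t = length t == 4 && all isDigit t

main :: IO ()
main = mapM_ (print . parseYearInterval) ["1960 - 1961", "1961-1960", "9-10"]
```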
Reviewed By: patapizza
Differential Revision: D25840835
fbshipit-source-id: 851a5b1c78440cbf065bf9f20a05c78d4967ea3c
Summary: Adds a rule for 'the day after tomorrow' in Romanian. Regenerates classifiers.
Reviewed By: girifb
Differential Revision: D26155042
fbshipit-source-id: 80005ab94a10f9fbf242c9a712bd040e4f6bc477
Summary:
2nd set of changes from pull request https://github.com/facebook/duckling/issues/516
Supporting Cantonese and more common expressions in Chinese.
Adding rules file for Duration/ZH.
Pull Request resolved: https://github.com/facebook/duckling/pull/523
Reviewed By: haoxuany
Differential Revision: D23428901
Pulled By: chessai
fbshipit-source-id: 6d04c97b63bac966eb61d77cab2f08f7543dbbf0
Summary:
* "at the moment" is considered identical to "now".
* "ASAP" is considered identical to "from now"
Pull Request resolved: https://github.com/facebook/duckling/pull/405
Reviewed By: patapizza
Differential Revision: D26009483
Pulled By: chessai
fbshipit-source-id: addf4c509e69d413cae279601c64f72710eba11f
Summary:
Improves the recognition of German time approximation expressions and fixes an error in the `<time-of-day> approximately` rule.
Pull Request resolved: https://github.com/facebook/duckling/pull/435
Reviewed By: patapizza
Differential Revision: D24934281
Pulled By: chessai
fbshipit-source-id: 641bcb6a7e5c26e66c735fe13bccae9b7a8909ae
Summary:
Adds a frequently used German duration that was missing and fixes a small typo in an existing one.
Pull Request resolved: https://github.com/facebook/duckling/pull/509
Reviewed By: patapizza
Differential Revision: D24690104
Pulled By: chessai
fbshipit-source-id: b49a7a636abf5b92f2fe7c0d5b2ca2fe64acbaa2