Summary: Currently, Duckling will accept "The first christmas of next month" in this rule, which is nonsensical. This reduces the scope of the times the rule recognises, thereby limiting us to a set of more sensible resolutions.
Reviewed By: stroxler
Differential Revision: D27861417
fbshipit-source-id: 3f19700af7298a6238c59f5de0598168d4b4a3c4
Summary:
Support numerals like "forty-five (45)". Commonly seen in legal documents.
There are no classifiers to regenerate.
Resolves#216
Reviewed By: stroxler
Differential Revision: D28305725
fbshipit-source-id: b9b4e160f630ce3cf462fcf9f2e575738463c313
Summary:
My attention was brought to this issue by the linter complaining about `read`.
The linter is just as happy with `error`, but I do think it's better to fail
with a usable error message here than with whatever error `read` gives us.
Reviewed By: chessai
Differential Revision: D28213406
fbshipit-source-id: f101d0515ee64978480bdbb873ff72d80d124969
Summary:
I think they all fell into one of two categories:
- names colliding with field names, but where there was already an existing
pattern (e.g. d for dimension, v for value) and we just had to be consistent
- cases that were best fixed by turning NamedFieldPuns on, which I did
Reviewed By: chessai
Differential Revision: D28213245
fbshipit-source-id: 18fbd61771e12da11ce03b98b74af51d1e837787
Summary:
When tracing the code from Debug downward, the unnecessary rename
of an argument from `sentence` to `input` creates a context switch. Let's
use the same name throughout.
Reviewed By: chessai
Differential Revision: D28213244
fbshipit-source-id: 22476d958312e5c60cd32ff1e3d0d460cf0c8c79
Summary:
In both `Api.hs` and `Debug.hs` I noticed that I was staring at the code
longer than necessary to figure out what a lambda with a destructure and
pattern match were doing. Moving it to a function in `where` named
`isRelevantDimension` makes skimming easier.
Reviewed By: chessai
Differential Revision: D28213243
fbshipit-source-id: 344f464dcac7297009c35b19373eef67e0eb9540
Summary:
Easy style fixes for ExampleMain.hs, Debug.hs, Api.hs, Core.hs
Most of these are just lint fixes, but I also made a few not-just-lint changes
to conform to some elements of our style guide that I agree with:
- if the type signature doesn't fit on one line, then put one type per line
with nothing on the first line, so that all types are vertically aligned - makes
for a quick skim
- try to avoid mixing same-line function args with hanging function args: hang
all arguments or none at all to get a more outline-like feel, again better for
skimming
I was actually able to eliminate all errors for most of these modules - the name
collisions I usually give up on were manageable by hiding + easy variable renames
Reviewed By: chessai
Differential Revision: D28213246
fbshipit-source-id: 1f77d56f2ff8dccfd5f3b534f087c07047b92885
Summary:
While I was working on fixing #604, I came across the rules
`ruleMilitarySpelledOutAMPM(2)`, which were actually capturing
some of my test phrases and confusing me.
This commit removes them because
- they aren't needed: the existing latent spelled-out hour + minute rules plus
the "(in the )?(am/pm)" rules together give the same behavior
- they are confusingly named - these aren't military times at all, they are
spelled-out civilian times
Reviewed By: haoxuany
Differential Revision: D27848485
fbshipit-source-id: ba1ed16ec22b5139b0b500b44dc91adb1b5e3d82
Summary:
This commit extend Spanish-language support for concatenations
of the form "<higher-order-of-magnitude> <lower>", e.g.
"doscientos tres" (203) or "cuatro mil ventiuno" (4022) to work
not just for hundreds but also for thousands and millions.
Reviewed By: chessai
Differential Revision: D27858133
fbshipit-source-id: 5c6b227ae7dad9009cd636e7ea49c209480c931a
Summary:
This commit adds two things to Spanish numeral support:
- support for millions
- support, via hooking into the `isMultipliable` logic used by EN, for
composing counts of 2-999 with either "mil" or "millones", which is
the standard way to say things like "tres mil" = 3000
Reviewed By: chessai
Differential Revision: D27858135
fbshipit-source-id: 980e95bd989f818c5ceaa2bb6c87fe81d3e08366
Summary:
This diff refactors our handling of "<hundreds> 0..99" numbers
to be more flexible by replacing `ruleNumeralthreePartHundreds`
with
- a rule for two-part hundreds like "dos cientos" (which is technically
incorrect grammar - doscientos is correct - but probably worth keeping) based
on a notion of multipliability like that used in EN rules
- a rule stating that we can compose hundreds with 0..99 additively
The resulting rules are more flexible, and they correctly parse not only
gramatically iffy phrases like "dos cientos tres", but also grammatically
correct phrases like "doscientos tres". This fixes#380.
Reviewed By: chessai
Differential Revision: D27858136
fbshipit-source-id: 4a918d84d93ac074f83f6947a8f80cfd11145115
Summary:
Whle looking into fixing https://github.com/facebook/duckling/issues/380
I was having a bit of trouble navigating the existing rules and guessing
what is / is not supported.
This diff refactors the Numeral/ES code to be easier to navigate:
- rename all the `ruleNumeral{1,2,3,4,5,6}` rules to be descriptive
- changes the order to be themed from small to large numbers, and
make sure the order of defines matches the order of rules at the end
of the module
- use [20 .. 90] instead of manually specifying the same list out-of-order
Reviewed By: chessai
Differential Revision: D27858134
fbshipit-source-id: b13983d75b36bb4e2b387ef06fe61066d81ae19a
Summary:
It's common to use dashes when spelling out times longhand,
e.g. "five-thirty am", but Duckling wasn't handling this at all.
This commit adds rules for times spelled out with dashes. The
rules explicitly forbid the second of the two times from including
digits via a negative match. This is because
- it wouldn't be at all idomatic to write five-26 or five-oh-6
- allowing that pattern clashes with time range parsing, e.g.
"9-10 am" should parse as a time range, not as "9:10 am"
Reviewed By: chessai
Differential Revision: D27848428
fbshipit-source-id: dfe8b98cb38119a16db2a19db47fd3128783e617
Summary:
This commit fixes#111, which was an open issue that any non-integer
multiple of any unit of time was being converted to seconds.
My solution is to write a recursive function `Duration.Helpers.inCoarsestGrain`
which, given a grain `g` and double value `v` finds the coarses grain `g'` such that
`v * g` - rounded to the nearest seconds - has integral units.
We call this function only in the case of non-integer multiples, and we start our
search from the given grain because nothing coarser would make sense. The code could
actually be slightly more efficient if we started at the next-smallest grain, but
in the interest of clarity I think this is probably better.
Reviewed By: chessai
Differential Revision: D27891439
fbshipit-source-id: b048310963eb71337fd91ab4ef3c840134a76e73
Summary:
While debugging an attempt to extend our handling of spelled-out
times, I realized that we are being too aggressive in our parsing of
times like "five ten", because we'll parse "five nine" as possibly
meaning "5:09", which isn't something an English speaker would say
(or rather if they did, it's more likely they mean "five (to) six"
or something similar.
Reviewed By: chessai
Differential Revision: D27848429
fbshipit-source-id: 34d783332fd60359ad9b6e7862367453bc93a1d1
Summary:
This commit gets rid of all the easy-to-fix lint warnings on time helper modules:
- replacing unnecessary `.` with `$`
- Flipping a lambda in a map to an infix operation
- Use `ts` for a list of times, not `series` which produces a pretty confusing naming collision
There are still quite a lot of lint errors related to name masking, which would be challenging to fix without us coming to an agreement about naming conventions.
But at least in my editor, name-masking errors are a lot less visually noisy than other errors (they only highlight the one name) so I don't mind them as much when skimming the code.
Reviewed By: chessai
Differential Revision: D27842198
fbshipit-source-id: 9091e5349657243b61d7ee169d0d06dd2122ac17
Summary:
The Dutch numeral and ordinal rules contained a spelling mistake: [‘achttien’](https://en.wiktionary.org/wiki/achttien) (18) was spelt as ‘achtien’ (note the single *t*).
This PR fixes this spelling mistake everywhere in the code base.
Pull Request resolved: https://github.com/facebook/duckling/pull/602
Reviewed By: patapizza
Differential Revision: D27859001
Pulled By: chessai
fbshipit-source-id: 8da0d8b099d49cf6207d4066cee1fc7da68a418e
Summary:
While looking at bugfixes for some ES and IT numeral parsing, I was
looking at the Helpers.hs module and figured I'd silence some lint
errors. There are still some name-masking lint issues, but all others
have been fixed.
Reviewed By: chessai
Differential Revision: D27854164
fbshipit-source-id: 5be8d924b033a55608c455074df1c80c8c0019be
Summary:
This fixes#592 in a very conservative way: the reason why `ruleIntersect` does
not detect "tonight 815" and "tonight eight fifteen" as it does "tonight 8:15"
is because it explicitly forbids the second part of the intersection from being
latent, unless it is a year.
I don't think it's a good idea to remove the restriction on latent inputs in
`ruleIntersect`, so instead I just made a new rule specifically for the
intersection of `<part-of-day> <time-of-day>`.
It also seems to me that there's a lot of room for this to be too aggressive,
for example if I say "tonight 500 people will laugh" the "tonight" and "500"
aren't really linked. So, I set the rule to be latent; this may be too conservative
to be useful though (do client libraries usually allow latent results?).
Reviewed By: chessai
Differential Revision: D27842596
fbshipit-source-id: 36ac59e31c632d4864241bce291147a46d52f780
Summary: Adds Catalan language and Numeral rules for it
Reviewed By: haoxuany
Differential Revision: D26518604
Pulled By: chessai
fbshipit-source-id: e6b4b0ceb9b7931d086c732dd03fb5cbbe062d5b
Summary:
These are popular variants/abbreviations of Egyptian pounds.
All these forms are documented on wikipedia (https://en.wikipedia.org/wiki/Egyptian_pound)
Pull Request resolved: https://github.com/facebook/duckling/pull/590
Reviewed By: haoxuany
Differential Revision: D27598249
Pulled By: chessai
fbshipit-source-id: 42ae9115b1def48c58e50a6deb624c3407c029f3
Summary:
Convert `This` to `Seal` in order to make this example working.
Pull Request resolved: https://github.com/facebook/duckling/pull/585
Reviewed By: girifb
Differential Revision: D27235685
Pulled By: chessai
fbshipit-source-id: 71a712a622b5d9d10f7842276a2b8f60f962477e
Summary:
By default, Facebook Haskell code has a lint error banning
-1s, because at one point it was used as a default value in some
API handlers and it's better to use Option + Nothing for this use case.
But in Duckling - particularly in the Time module - it's quite
common that we actually want to work with -1 as a meaningful integer
value, involving some kind of offset.
Reviewed By: chessai
Differential Revision: D27118984
fbshipit-source-id: 5fe2200e8005a20855d7fdd3a8eb2ad33291edc8
Summary:
The facebook internal linters prefer us to avoid
excessive point-free style and extra $ where we could
instead move existing brackets.
Making those style tweaks for Time/EN/Rules.hs because
I was looking at the file as part of
Reviewed By: chessai
Differential Revision: D27108042
fbshipit-source-id: 7c8e76578476ea14d655131943e693c5159b12d2
Summary:
On some linux systems, such as on NixOS, /usr/share/zoneinfo does not
exist. What does exist in its place is /etc/zoneinfo. So, we should try to load
that if /usr/share/zoneinfo does not exist.
Pull Request resolved: https://github.com/facebook/duckling/pull/582
Reviewed By: girifb
Differential Revision: D27086925
Pulled By: chessai
fbshipit-source-id: f4a38822be9888d57034f67a6f7abd17d56d38b8
Summary:
I was testing an unrelated change (which doesn't change
classifier scores) and reran classifiers just to be safe, I noticed
that the scores changed.
This diff updates them.
Reviewed By: chessai
Differential Revision: D26892970
fbshipit-source-id: c7da3e3b7d01955f98b287a3ff4e7c1ff2837c7f
Summary:
I was looking at adding support for "next week" constructions in Spanish to
close https://github.com/facebook/duckling/issues/553 (which it appears has
already been handled), when I noticed that the equivalent logic for English
has been split into two separate examples: "coming week" isn't in the same
example as other equivalent constructs like "upcoming week" and "next week".
This diff combines them, which I think is clearer and fewer lines of code
Reviewed By: chessai
Differential Revision: D26892322
fbshipit-source-id: 68ca4644759198fc79d963ae080495c3f2d4a923
Summary: due to exploit in T85548324, factoring the input to get a smaller parse tree (the existing one parses tail recursively, whereas this one uses ruleIntersect which is still bad, but slightly better).
Differential Revision: D26657170
fbshipit-source-id: fe3a738073b4d30ae401521bb692f4a4bba48d96
Summary: There are a handful of more spelling for russian numbers [20, 30 .. 90] that we aren't handling. Additionally, we optimise for recall over precision by allowing some invalid spellings that could be understandable typos.
Reviewed By: patapizza
Differential Revision: D26285711
Pulled By: chessai
fbshipit-source-id: fd8a8f373d228a526e79b22326eff48bb966310d