Summary:
Some time recognition improvements for Catalan:
- morning should be a time range recognised until noon
- "dema" can also be used for tomorrow (besides "demà")
- "se" alone should not be understood as September
Pull Request resolved: https://github.com/facebook/duckling/pull/639
Reviewed By: stroxler
Differential Revision: D30312076
Pulled By: chessai
fbshipit-source-id: 1a42bbd7eecc4f5690145ee9cadb8eccae8edd08
Summary:
I'm using Duckling in my project, but I noticed that quantities in kg weren't being detected correctly, though other entities such as numeral/volume were all working as expected. Investigating further, I noticed that this was because the Portuguese Quantity entity was only configured to detect cups and pounds, never grams. And even for cups/pounds, products weren't being detected correctly.
So I've just adapted the rules from English Quantity to work in Portuguese as well, while keeping the cups/pounds too. It's all working as expected now and it's backwards compatible.
Pull Request resolved: https://github.com/facebook/duckling/pull/631
Reviewed By: stroxler
Differential Revision: D29701339
Pulled By: chessai
fbshipit-source-id: fca08a14c50844d418f101b885ca54554d993f58
Summary:
Add ruleLeadingDotNumeral, which parses "punto 2" and "coma 2" as 0.2, and allow "coma" in ruleNumeralDotNumeral.
Also extend ruleNumeralsPrefixWithNegativeOrMinus to include 'negativo' prefixes.
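As a rough illustration of the leading-dot case, the sketch below turns the digits that follow "punto" or "coma" into a fraction. The name `fractionFromDigits` is hypothetical and this is not the rule's actual implementation; it works on the digit string so that leading zeros survive.

```haskell
import Data.Char (isDigit)

-- Hypothetical helper, not part of Duckling: interpret the digits that
-- follow a leading "punto"/"coma" as the fractional part of a number.
-- Returns Nothing for empty or non-digit input.
fractionFromDigits :: String -> Maybe Double
fractionFromDigits digits
  | null digits || not (all isDigit digits) = Nothing
  | otherwise =
      Just (fromIntegral (read digits :: Integer) / 10 ^ length digits)
```

With this sketch, `fractionFromDigits "2"` gives `Just 0.2`.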
Reviewed By: stroxler
Differential Revision: D29405886
fbshipit-source-id: eb43f6f72374430af414e0d29009b98df2115a31
Summary: Initialise Time for CA (Catalan) language
Reviewed By: stroxler
Differential Revision: D28455273
Pulled By: chessai
fbshipit-source-id: be9a4d61692ba4fb32986e161e9fdd6d25a357dc
Summary:
I don't think abbreviating `pattern` to `ptn` is a win: the abbreviation
isn't good enough to be unambiguous (I found this because I was
trying to figure out what "ptn" meant), and `pattern` isn't that
long a name.
Reviewed By: chessai
Differential Revision: D28462776
fbshipit-source-id: bd685720b198fed791d84c00732eb1873b37528b
Summary:
Solutions were:
- use targeted and qualified imports to avoid pulling in the universe of Duckling.Types
- use long-form descriptive names in a few places
- shuffle a let clause to just define an output instead of a local func
Also got rid of another lint error suggesting to use a section instead of flip;
the module is now lint-warning free.
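A minimal sketch of the targeted and qualified import style described above, using only `base` modules. The function `find` is a made-up example, not code from this module; it shows how a local name can coexist with a library module that exports the same name.

```haskell
-- Qualified import: every use is prefixed, so nothing leaks into scope.
import qualified Data.List as List
-- Targeted import: only the one name that is actually needed.
import Data.Maybe (fromMaybe)

-- Hypothetical local function. Uses of `find` would be ambiguous if
-- Data.List were imported unqualified, since it also exports `find`.
find :: String -> [(String, Int)] -> Int
find key table = fromMaybe 0 (List.lookup key table)
```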
Reviewed By: chessai
Differential Revision: D28462775
fbshipit-source-id: 1e2855756b22cb62db0d94334a7e063aa728b7bf
Summary:
Using rules of thumb:
- use unabbreviated names for arguments to top-level functions where there's
a clash (e.g. lots of t -> time transformations)
- use abbreviated names for nested local functions (so e.g. t, d to avoid
a clash with the `day` top-level function)
Reviewed By: chessai
Differential Revision: D28462777
fbshipit-source-id: 8795d038b2c3a65b60f0d2d9091b7c56cc8a5ff7
Summary:
In my opinion putting `Candidate` into the core `Types.hs`
is a mistake - it's used exclusively in the ranking stage, so cluttering
the core tokenizing and recursive parsing / value resolution logic in
`Duckling.Types` with this irrelevant datatype makes things less clear
than if we keep it in the `Ranking` modules.
Reviewed By: chessai
Differential Revision: D28462902
fbshipit-source-id: cd4bb88c4a16945265e8f21c8808b06ae3383559
Summary: There are different implementations of isRangeValid that work well for different languages, thus it makes sense to facilitate having different implementations based on the language.
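One way to picture a per-language range check is the sketch below. Everything in it is an assumption made for illustration: the `Lang` tags, the helper `wordBoundaryCheck`, the name `isRangeValidFor`, and in particular the choice to accept every range for `ZH`. It is not the signature or logic used in Duckling.

```haskell
import Data.Char (isAlphaNum)

-- Hypothetical language tags for the sketch.
data Lang = EN | ZH
  deriving (Eq, Show)

-- Accept a match spanning [start, end) only if it is not glued to
-- alphanumeric characters on either side.
wordBoundaryCheck :: String -> Int -> Int -> Bool
wordBoundaryCheck s start end =
  inBounds && boundaryBefore && boundaryAfter
  where
    inBounds = 0 <= start && start <= end && end <= length s
    boundaryBefore = start == 0 || not (isAlphaNum (s !! (start - 1)))
    boundaryAfter = end == length s || not (isAlphaNum (s !! end))

-- Pick the check by language. Assumption for illustration: a language
-- written without spaces accepts any range, others need word boundaries.
isRangeValidFor :: Lang -> String -> Int -> Int -> Bool
isRangeValidFor ZH = \_ _ _ -> True
isRangeValidFor _ = wordBoundaryCheck
```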
Reviewed By: patapizza
Differential Revision: D28362777
fbshipit-source-id: 5f2991d54af3095c8e95cf534e2dd3b4a34dee3a
Summary:
In ES (Spanish), decimals can be expressed by `<number> con <number>`, where the whole part is to the left and the decimal part is to the right.
Resolves #615
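The arithmetic behind `<number> con <number>` can be sketched as follows. The helper `conDecimal` is hypothetical and only shows how the two parts combine; it assumes both parts are already parsed as non-negative integers and is not the rule's actual code.

```haskell
-- Hypothetical helper, not Duckling's implementation: combine the whole
-- part and the decimal part of "<number> con <number>".
conDecimal :: Integer -> Integer -> Maybe Double
conDecimal whole decimals
  | whole < 0 || decimals < 0 = Nothing
  | otherwise = Just (fromIntegral whole + fromIntegral decimals / scale)
  where
    -- 10 raised to the number of digits in the decimal part.
    scale = 10 ^ length (show decimals)
```

Under these assumptions, "tres con cinco" corresponds to `conDecimal 3 5`, which gives `Just 3.5`.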
Reviewed By: stroxler
Differential Revision: D28449722
fbshipit-source-id: caa0fb52f72f94c4a4cc456a46c25fa5f3b9b625
Summary:
Make the code reflect the call graph, which looks roughly like this:
```
parseAndResolve
  runDuckling
  resolveNode
  parseString
    saturateParseString
    parseString1
      matchFirst
        ... low level stuff
      matchFirstAnywhere
        ... low level stuff
```
I found the existing order pretty hard to untangle when I was writing some architecture notes on this module. I think the new ordering will help.
Reviewed By: chessai
Differential Revision: D28441933
fbshipit-source-id: 07c722aa6d4038baa7f14fec84660ecc2736ed2e
Summary: Some classifiers were a bit out of date. They needed regenerating.
Reviewed By: girifb
Differential Revision: D28399234
fbshipit-source-id: 2780dbe5478a5386a2b6062dec8696736b3ce723
Summary:
I spent a surprising amount of time trying to figure out what
this comment was referring to because it wasn't at all clear to me
that it meant a comment in another file. Making it more specific.
Reviewed By: chessai
Differential Revision: D28411103
fbshipit-source-id: 26cd29b47367a7e0d865f616f289fef570544c39
Summary:
When documenting `Types.hs` last week I got confused about what the Bool
represented here. This follows up on a suggestion to add a doc comment.
Reviewed By: chessai
Differential Revision: D28412103
fbshipit-source-id: 01af1f0831fc3e49d4b7f5bb9a4e89c5897b3d25
Summary: Just replace `.` with `$`, also tweaked the spacing a bit for skimmability
Reviewed By: chessai
Differential Revision: D28411898
fbshipit-source-id: d18b9ef5db99b82d150231080c89f812f709f409
Summary: Use targeted imports to avoid clash on `node` variable name
Reviewed By: chessai
Differential Revision: D28411902
fbshipit-source-id: 4a81e35a6aa601015685ccab3f571e919e9025c8
Summary: Initialise Temperature for CA (Catalan) language
Reviewed By: stroxler
Differential Revision: D28298423
Pulled By: chessai
fbshipit-source-id: 73b87d002196b6b707388e9f83f42591510f40eb
Summary: Currently, Duckling will accept "The first christmas of next month" in this rule, which is nonsensical. This reduces the scope of the times the rule recognises, thereby limiting us to a set of more sensible resolutions.
Reviewed By: stroxler
Differential Revision: D27861417
fbshipit-source-id: 3f19700af7298a6238c59f5de0598168d4b4a3c4
Summary:
Support numerals like "forty-five (45)". Commonly seen in legal documents.
There are no classifiers to regenerate.
Resolves #216
Reviewed By: stroxler
Differential Revision: D28305725
fbshipit-source-id: b9b4e160f630ce3cf462fcf9f2e575738463c313
Summary:
My attention was brought to this issue by the linter complaining about `read`.
The linter is just as happy with `error`, but I do think it's better to fail
with a usable error message here than with whatever error `read` gives us.
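The trade-off can be sketched like this, assuming the value being parsed is an integer; the function name `parseIntOrFail` and the message text are made up for illustration. `readMaybe` from `Text.Read` lets the failure case carry a message that names the offending input.

```haskell
import Text.Read (readMaybe)

-- Hypothetical example: fail with a descriptive message instead of the
-- generic error that a bare `read` would raise on bad input.
parseIntOrFail :: String -> Int
parseIntOrFail s = case readMaybe s of
  Just n -> n
  Nothing -> error ("parseIntOrFail: not an integer: " ++ show s)
```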
Reviewed By: chessai
Differential Revision: D28213406
fbshipit-source-id: f101d0515ee64978480bdbb873ff72d80d124969
Summary:
I think they all fell into one of two categories:
- names colliding with field names, but where there was already an existing
pattern (e.g. d for dimension, v for value) and we just had to be consistent
- cases that were best fixed by turning NamedFieldPuns on, which I did
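For the second category, a small sketch of what `NamedFieldPuns` allows. The `Entity` record and `describe` function are hypothetical and do not come from the Duckling sources.

```haskell
{-# LANGUAGE NamedFieldPuns #-}

-- Hypothetical record for illustration only.
data Entity = Entity
  { dimension :: String
  , value :: Int
  }

-- The pun `Entity{dimension, value}` binds local variables that share
-- the field names, so no extra names need to be invented.
describe :: Entity -> String
describe Entity{dimension, value} = dimension ++ "=" ++ show value
```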
Reviewed By: chessai
Differential Revision: D28213245
fbshipit-source-id: 18fbd61771e12da11ce03b98b74af51d1e837787
Summary:
When tracing the code from Debug downward, the unnecessary rename
of an argument from `sentence` to `input` creates a context switch. Let's
use the same name throughout.
Reviewed By: chessai
Differential Revision: D28213244
fbshipit-source-id: 22476d958312e5c60cd32ff1e3d0d460cf0c8c79
Summary:
In both `Api.hs` and `Debug.hs` I noticed that I was staring at the code
longer than necessary to figure out what a lambda with a destructure and
pattern match was doing. Moving it to a function in `where` named
`isRelevantDimension` makes skimming easier.
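The shape of that refactor, sketched with stand-in types. `Dimension`, `Parsed`, `keepRelevant`, and the rule that an empty target list keeps everything are assumptions for illustration, not the definitions in `Api.hs` or `Debug.hs`.

```haskell
-- Hypothetical stand-ins for the real types.
data Dimension = Numeral | Time | Quantity
  deriving (Eq, Show)

data Parsed = Parsed Dimension String

-- The predicate lives in `where` under a descriptive name instead of
-- being an inline lambda that destructures and pattern matches.
keepRelevant :: [Dimension] -> [Parsed] -> [Parsed]
keepRelevant targets = filter isRelevantDimension
  where
    -- Assumption: an empty target list means "keep everything".
    isRelevantDimension (Parsed dimension _) =
      null targets || dimension `elem` targets

-- Small accessor used to inspect results.
bodies :: [Parsed] -> [String]
bodies parsed = [body | Parsed _ body <- parsed]
```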
Reviewed By: chessai
Differential Revision: D28213243
fbshipit-source-id: 344f464dcac7297009c35b19373eef67e0eb9540
Summary:
Easy style fixes for ExampleMain.hs, Debug.hs, Api.hs, Core.hs
Most of these are just lint fixes, but I also made a few not-just-lint changes
to conform to some elements of our style guide that I agree with:
- if the type signature doesn't fit on one line, then put one type per line
with nothing on the first line, so that all types are vertically aligned - makes
for a quick skim
- try to avoid mixing same-line function args with hanging function args: hang
all arguments or none at all to get a more outline-like feel, again better for
skimming
I was actually able to eliminate all errors for most of these modules - the name
collisions I usually give up on were manageable by hiding + easy variable renames
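The two layout conventions above might look like this in practice. `clampTo` and `clamped` are invented for the example and are not taken from any of the listed modules.

```haskell
-- One type per line with nothing on the first line, so that all the
-- types line up vertically.
clampTo
  :: Int  -- lower bound
  -> Int  -- upper bound
  -> Int  -- value to clamp
  -> Int
clampTo lower upper x = max lower (min upper x)

-- Hang all arguments rather than mixing same-line and hanging ones.
clamped :: Int
clamped =
  clampTo
    0
    10
    42
```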
Reviewed By: chessai
Differential Revision: D28213246
fbshipit-source-id: 1f77d56f2ff8dccfd5f3b534f087c07047b92885
Summary:
While I was working on fixing #604, I came across the rules
`ruleMilitarySpelledOutAMPM(2)`, which were actually capturing
some of my test phrases and confusing me.
This commit removes them because
- they aren't needed: the existing latent spelled-out hour + minute rules plus
the "(in the )?(am/pm)" rules together give the same behavior
- they are confusingly named - these aren't military times at all, they are
spelled-out civilian times
Reviewed By: haoxuany
Differential Revision: D27848485
fbshipit-source-id: ba1ed16ec22b5139b0b500b44dc91adb1b5e3d82
Summary:
This commit extends Spanish-language support for concatenations
of the form "<higher-order-of-magnitude> <lower>", e.g.
"doscientos tres" (203) or "cuatro mil veintiuno" (4021), to work
not just for hundreds but also for thousands and millions.
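One possible way to express the concatenation condition is sketched below. The helpers `trailingMagnitude` and `concatNumerals` are hypothetical, and the validity test (the lower part must fit below the trailing zeros of the higher part) is an assumption for illustration rather than the rule Duckling applies.

```haskell
-- Largest power of ten that divides n (1 for non-positive input).
trailingMagnitude :: Integer -> Integer
trailingMagnitude n
  | n <= 0 = 1
  | otherwise = go 1
  where
    go p
      | n `mod` (p * 10) == 0 = go (p * 10)
      | otherwise = p

-- Hypothetical sketch: add a lower-order numeral to a higher-order one
-- only when the higher one ends in at least two zeros and the lower
-- one fits entirely below them.
concatNumerals :: Integer -> Integer -> Maybe Integer
concatNumerals higher lower
  | magnitude >= 100 && lower >= 0 && lower < magnitude =
      Just (higher + lower)
  | otherwise = Nothing
  where
    magnitude = trailingMagnitude higher
```

In this sketch `concatNumerals 200 3` gives `Just 203` and `concatNumerals 4000 21` gives `Just 4021`, while `concatNumerals 203 4` is rejected.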
Reviewed By: chessai
Differential Revision: D27858133
fbshipit-source-id: 5c6b227ae7dad9009cd636e7ea49c209480c931a
Summary:
This commit adds two things to Spanish numeral support:
- support for millions
- support, via hooking into the `isMultipliable` logic used by EN, for
composing counts of 2-999 with either "mil" or "millones", which is
the standard way to say things like "tres mil" = 3000
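A minimal sketch of the multiplicative composition, assuming a simplified numeral record with a `multipliable` flag. The `Numeral` type and `composeWithUnit` are hypothetical names; they only mirror the idea of the `isMultipliable` logic and are not its actual definition.

```haskell
-- Hypothetical, simplified numeral token.
data Numeral = Numeral
  { numeralValue :: Integer
  , multipliable :: Bool  -- True for units such as "mil" or "millones"
  }
  deriving (Eq, Show)

-- Compose a count of 2-999 with a multipliable unit, e.g. 3 and 1000.
composeWithUnit :: Numeral -> Numeral -> Maybe Numeral
composeWithUnit count unit
  | multipliable unit && countInRange =
      Just (Numeral (numeralValue count * numeralValue unit) False)
  | otherwise = Nothing
  where
    countInRange = numeralValue count >= 2 && numeralValue count <= 999
```

Here "tres mil" corresponds to `composeWithUnit (Numeral 3 False) (Numeral 1000 True)`, which yields a numeral with value 3000.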
Reviewed By: chessai
Differential Revision: D27858135
fbshipit-source-id: 980e95bd989f818c5ceaa2bb6c87fe81d3e08366
Summary:
This diff refactors our handling of "<hundreds> 0..99" numbers
to be more flexible by replacing `ruleNumeralthreePartHundreds`
with
- a rule for two-part hundreds like "dos cientos" (which is technically
incorrect grammar - doscientos is correct - but probably worth keeping) based
on a notion of multipliability like that used in EN rules
- a rule stating that we can compose hundreds with 0..99 additively
The resulting rules are more flexible, and they correctly parse not only
grammatically iffy phrases like "dos cientos tres", but also grammatically
correct phrases like "doscientos tres". This fixes #380.
Reviewed By: chessai
Differential Revision: D27858136
fbshipit-source-id: 4a918d84d93ac074f83f6947a8f80cfd11145115