Commit Graph

740 Commits

Author SHA1 Message Date
Filipe Pereira
d8888e2ff8 Ca time improvements (#639)
Summary:
Some time recognition improvements for Catalan:
- morning should be a time range recognised until noon
- "dema" can also be used for tomorrow (besides "demà")
- "se" alone should not be understood as September

Pull Request resolved: https://github.com/facebook/duckling/pull/639

Reviewed By: stroxler

Differential Revision: D30312076

Pulled By: chessai

fbshipit-source-id: 1a42bbd7eecc4f5690145ee9cadb8eccae8edd08
2021-08-16 10:46:40 -07:00
chessai
32eb5db8c2 update lts resolver (#641)
Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/641

Reviewed By: haoxuany

Differential Revision: D30260899

Pulled By: chessai

fbshipit-source-id: 3e6d3d8aa84caac8ba6b845d79a577618f12515f
2021-08-11 19:02:53 -07:00
Dubovinszky Péter
0354f27ef4 Time/HU: extend dates (#462)
Summary:
Extend Hungarian dates with new cases

Pull Request resolved: https://github.com/facebook/duckling/pull/462

Differential Revision: D25573636

Pulled By: chessai

fbshipit-source-id: 251698cf9f5126162ad4fbf1489dcbc4c12541ed
2021-08-11 13:31:28 -07:00
Filipe Pereira
57dab83ad3 PT time improvements 2 (#636)
Summary:
Fixed rules for PT time expressions like "amanhã à noite", "dia 17", "dia 15 às 18"

Pull Request resolved: https://github.com/facebook/duckling/pull/636

Reviewed By: stroxler

Differential Revision: D30138416

Pulled By: chessai

fbshipit-source-id: 5265d44e7ddce5eee8cd7266df9254389a10b139
2021-08-05 13:47:41 -07:00
Filipe Pereira
fc7950a68f ES time improvements (#634)
Summary:
New rules for ES time expressions like "3 Marzo", "Marzo 3".

Pull Request resolved: https://github.com/facebook/duckling/pull/634

Reviewed By: girifb

Differential Revision: D30110631

Pulled By: chessai

fbshipit-source-id: e6add868535522d243ccf1dab2443e6cd3f7f8b2
2021-08-05 10:48:47 -07:00
Filipe Pereira
a6499228af FR time improvement (#635)
Summary:
Fixed recognition for month "Juil" (abbreviation of "Juillet")

Pull Request resolved: https://github.com/facebook/duckling/pull/635

Reviewed By: stroxler

Differential Revision: D30115291

Pulled By: chessai

fbshipit-source-id: e04d6e7952f85f4ca061540a3967908bcd4f1ebd
2021-08-04 17:31:08 -07:00
Filipe Pereira
fe4f77bdc0 PT time improvements (#633)
Summary:
New rules for PT time expressions like "5 Maio", "Maio 5", "5 Maio 2022".

Pull Request resolved: https://github.com/facebook/duckling/pull/633

Reviewed By: stroxler

Differential Revision: D30114330

Pulled By: chessai

fbshipit-source-id: f56418d95efa1d7488957b8b8083daec3193949b
2021-08-04 17:31:07 -07:00
Maíra Bello
328e59ebc4 Quantity/PT: Extend quantity to include grams in Portuguese (#631)
Summary:
I'm using Duckling in my project but I noticed that quantities in kg weren't being detected correctly, though other entities such as numeral/volume were all working as expected. Investigating more I noticed that this was just because in Portuguese the Quantity entity was only configured to detect cups and pounds, never grams. And even for cups/pounds, products weren't being detected correctly.

So I've just adapted the rules from English Quantity to work in Portuguese as well, while keeping the cups/pounds too. It's all working as expected now and it's backwards compatible.

Pull Request resolved: https://github.com/facebook/duckling/pull/631

Reviewed By: stroxler

Differential Revision: D29701339

Pulled By: chessai

fbshipit-source-id: fca08a14c50844d418f101b885ca54554d993f58
2021-07-19 15:48:04 -07:00
Daniel Cartwright
b10e1d6a78 ES Numeral - Add ruleLeadingDotNumeral and improve ruleNumeralDotNumeral
Summary:
add ruleLeadingDotNumeral which parses "punto 2" and "coma 2" as 0.2, and allow "coma" in ruleNumeralDotNumeral.

Also extend ruleNumeralsPrefixWithNegativeOrMinus to include 'negativo' prefixes
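
As a rough illustration of the behaviour described above, here is a minimal Python sketch (not Duckling's Haskell rule). It assumes digit-only operands, and the name `parse_spoken_decimal` is hypothetical:

```python
import re

# Hypothetical sketch, not Duckling's Haskell rule: digits only, with an
# optional whole part before the "punto"/"coma" separator.
DECIMAL = re.compile(r"^(?:(\d+)\s+)?(?:punto|coma)\s+(\d+)$", re.IGNORECASE)


def parse_spoken_decimal(text):
    """Return the float for inputs like "punto 2" or "3 coma 25", else None."""
    match = DECIMAL.match(text.strip())
    if match is None:
        return None
    whole, fraction = match.groups()
    # A missing whole part is read as 0, so "punto 2" becomes 0.2.
    return int(whole or 0) + int(fraction) / 10 ** len(fraction)
```

With this sketch, `parse_spoken_decimal("punto 2")` and `parse_spoken_decimal("coma 2")` both give 0.2.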

Reviewed By: stroxler

Differential Revision: D29405886

fbshipit-source-id: eb43f6f72374430af414e0d29009b98df2115a31
2021-07-19 13:18:09 -07:00
Amr Keleg
79ac8f63f9 Add isArabic rule (#577)
Summary:
Fixes https://github.com/facebook/duckling/issues/437, fixes https://github.com/facebook/duckling/issues/571

Pull Request resolved: https://github.com/facebook/duckling/pull/577

Reviewed By: stroxler

Differential Revision: D29664126

Pulled By: chessai

fbshipit-source-id: b6365699231527b0869322c798e32a21328f1071
2021-07-12 13:37:23 -07:00
Daniel Cartwright
ed291c2a3a ES (Spanish) Time - add rule for 'next <day-of-week>'
Summary: Resolves #623. Add rule for 'proximo <day-of-week>'

Reviewed By: stroxler

Differential Revision: D29002419

fbshipit-source-id: 7d5fb04b66fe068ae2906b63ede44009e80f1a3c
2021-06-09 20:33:12 -07:00
Damien Gallet
28e38679a7 Time.FR > add rule for years in twentieth century (#357)
Summary:
In Time.FR, add support for birthdates like "15 juin 72"
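
One way to picture the rule is the Python sketch below (not the actual Haskell rule). It assumes every two-digit year maps to 19xx, which the commit title suggests but the summary does not spell out. `parse_birthdate` and the month table are illustrative names:

```python
import re
from datetime import date

# French month names; the helper below is a hypothetical illustration.
MONTHS = {
    "janvier": 1, "février": 2, "mars": 3, "avril": 4, "mai": 5, "juin": 6,
    "juillet": 7, "août": 8, "septembre": 9, "octobre": 10, "novembre": 11,
    "décembre": 12,
}
PATTERN = re.compile(r"^(\d{1,2})\s+(\w+)\s+(\d{2})$")


def parse_birthdate(text):
    """Parse "<day> <month> <yy>", reading yy as a twentieth-century year."""
    match = PATTERN.match(text.strip().lower())
    if match is None or match.group(2) not in MONTHS:
        return None
    day, month, year = int(match.group(1)), MONTHS[match.group(2)], int(match.group(3))
    try:
        # Assumption: "72" always means 1972, never 2072.
        return date(1900 + year, month, day)
    except ValueError:
        # Impossible calendar dates such as "31 juin 72" are rejected.
        return None
```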

Pull Request resolved: https://github.com/facebook/duckling/pull/357

Reviewed By: patapizza

Differential Revision: D26193322

Pulled By: chessai

fbshipit-source-id: d22efea81aad31af8baa2f7f9afdaf1a75c0dc10
2021-06-04 13:04:12 -07:00
Daniel Cartwright
8cb77a43c7 Add custom isRangeValid implementation for ZH
Summary: Fixes #313

Reviewed By: stroxler

Differential Revision: D28364035

fbshipit-source-id: 7fe3dba75410d217747a0d7a6f7df611ac26ec70
2021-06-04 12:48:32 -07:00
evjava
4878820294 Russian(RU) numeral and ordinal improvements (#374)
Summary:
- added non-typo variant for 11 (одиннадцать)
- added variants for grammatical cases

Pull Request resolved: https://github.com/facebook/duckling/pull/374

Test Plan:
```
:test Duckling.Numeral.RU.Tests
:test Duckling.Ordinal.RU.Tests
```

Reviewed By: stroxler

Differential Revision: D20332223

Pulled By: chessai

fbshipit-source-id: be1c6f6477af56418b69da21f5219ba27b50d0a1
2021-06-04 12:18:31 -07:00
tuantvk
25b39f4a8b Time/VI: update rule Sunday (#611)
Summary:
In Vietnamese, Sunday is "chủ nhật" or "Chúa nhật" (Catholic usage).

Pull Request resolved: https://github.com/facebook/duckling/pull/611

Reviewed By: haoxuany

Differential Revision: D28399277

Pulled By: chessai

fbshipit-source-id: 26aa7c76cf1f8b8c2ba32e049f7f470a140e3d92
2021-06-04 12:18:27 -07:00
chessai
99e1dce9c4 restrict dimensions to only those specified (#625)
Summary:
Resolves https://github.com/facebook/duckling/issues/624

Before patch (specifying quantity and numeral, but time still shows up):
```
❯ curl -XPOST http://0.0.0.0:8000/parse --data 'locale=en_US&text="June 21 and 3 cups of sugar"&dims="[\"quantity\",\"numeral\"]"' | jq
[
  {
    "body": "June 21",
    "start": 1,
    "value": {
      "values": [
        {
          "value": "2021-06-21T00:00:00.000-07:00",
          "grain": "day",
          "type": "value"
        },
        {
          "value": "2022-06-21T00:00:00.000-07:00",
          "grain": "day",
          "type": "value"
        },
        {
          "value": "2023-06-21T00:00:00.000-07:00",
          "grain": "day",
          "type": "value"
        }
      ],
      "value": "2021-06-21T00:00:00.000-07:00",
      "grain": "day",
      "type": "value"
    },
    "end": 8,
    "dim": "time",
    "latent": false
  },
  {
    "body": "3 cups of sugar",
    "start": 13,
    "value": {
      "value": 3,
      "type": "value",
      "product": "sugar",
      "unit": "cup"
    },
    "end": 28,
    "dim": "quantity",
    "latent": false
  }
]
```

After patch (time no longer shows up):
```
❯ curl -XPOST http://0.0.0.0:8000/parse --data 'locale=en_US&text="June 21 and 3 cups of sugar"&dims="[\"quantity\",\"numeral\"]"' | jq
[
  {
    "body": "3 cups of sugar",
    "start": 13,
    "value": {
      "value": 3,
      "type": "value",
      "product": "sugar",
      "unit": "cup"
    },
    "end": 28,
    "dim": "quantity",
    "latent": false
  }
]
```
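
The effect of the patch can be pictured as a filter over the parsed entities. The Python sketch below is only an illustration, not the server's Haskell code. `restrict_dimensions` is a hypothetical name, and treating an empty `dims` list as "all dimensions" is an assumption:

```python
def restrict_dimensions(entities, dims):
    """Keep only entities whose "dim" is one of the requested dimensions."""
    if not dims:
        # Assumption: no requested dimensions means "do not filter".
        return list(entities)
    wanted = set(dims)
    return [entity for entity in entities if entity["dim"] in wanted]


# Trimmed-down versions of the two entities from the "before patch" output.
parsed = [
    {"body": "June 21", "dim": "time"},
    {"body": "3 cups of sugar", "dim": "quantity"},
]
kept = restrict_dimensions(parsed, ["quantity", "numeral"])
```

Here `kept` holds only the "3 cups of sugar" entity, matching the "after patch" output.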

Pull Request resolved: https://github.com/facebook/duckling/pull/625

Reviewed By: stroxler

Differential Revision: D28851759

Pulled By: chessai

fbshipit-source-id: d3b3f33092c7e60bf29886939488ed562a213c35
2021-06-03 10:33:42 -07:00
chessai
878beb7aa1 fix hanging ci (#626)
Summary:
resolves https://github.com/facebook/duckling/issues/622

Pull Request resolved: https://github.com/facebook/duckling/pull/626

Reviewed By: stroxler

Differential Revision: D28842781

Pulled By: chessai

fbshipit-source-id: 3210bbf9eb76f21c90af86f6abdeac566fc86415
2021-06-02 15:32:58 -07:00
leandro.guisandez@pgconocimiento.com
5d8d99bbf4 Init
Summary: Initialise Time for CA (Catalan) language

Reviewed By: stroxler

Differential Revision: D28455273

Pulled By: chessai

fbshipit-source-id: be9a4d61692ba4fb32986e161e9fdd6d25a357dc
2021-05-18 13:50:19 -07:00
Steven Troxler
3d2f1939ef s/ptn/pattern
Summary:
I don't think abbreviating pattern to ptn is a win: the abbreviation
isn't good enough to be unambiguous (I found this because I was
trying to figure out what "ptn" meant), and `pattern` isn't that
long a name.

Reviewed By: chessai

Differential Revision: D28462776

fbshipit-source-id: bd685720b198fed791d84c00732eb1873b37528b
2021-05-18 11:50:19 -07:00
Steven Troxler
dbc5c91263 Remove all name clashes in Time/Helpers.hs
Summary:
Solutions were:
 - use targeted and qualified imports to avoid pulling in the universe of Duckling.Types
 - use long-form descriptive names in a few places
 - shuffle a let clause to just define an output instead of a local func

Also got rid of another lint error suggesting to use a section instead of flip;
the module is now lint-warning free.

Reviewed By: chessai

Differential Revision: D28462775

fbshipit-source-id: 1e2855756b22cb62db0d94334a7e063aa728b7bf
2021-05-18 11:50:18 -07:00
Steven Troxler
dd1ae664cc Fix name collisions in Time/Types.hs
Summary:
Using rules of thumb:
 - use unabbreviated names for arguments to top-level functions where there's
   a clash (e.g. lots of t -> time transformations)
 - use abbreviated names for nested local functions (so e.g. t, d to avoid
   a clash with the `day` top-level function)

Reviewed By: chessai

Differential Revision: D28462777

fbshipit-source-id: 8795d038b2c3a65b60f0d2d9091b7c56cc8a5ff7
2021-05-18 11:50:18 -07:00
Steven Troxler
81ab073acf Move Candidate to Ranking/Types.hs
Summary:
In my opinion putting `Candidate` into the core `Types.hs`
is a mistake - it's used exclusively in the ranking stage, so cluttering
the core tokenizing and recursive parsing / value resolution logic in
`Duckling.Types` with this irrelevant datatype makes things less clear
than if we keep it in the `Ranking` modules.

Reviewed By: chessai

Differential Revision: D28462902

fbshipit-source-id: cd4bb88c4a16945265e8f21c8808b06ae3383559
2021-05-18 11:50:17 -07:00
Daniel Cartwright
69d951220e Make isRangeValid take Lang as input
Summary: There are different implementations of isRangeValid that work well for different languages, thus it makes sense to facilitate having different implementations based on the language.

Reviewed By: patapizza

Differential Revision: D28362777

fbshipit-source-id: 5f2991d54af3095c8e95cf534e2dd3b4a34dee3a
2021-05-17 13:18:11 -07:00
Daniel Cartwright
7762af850a ES Numeral con
Summary:
In ES (Spanish), decimals can be expressed by `<number> con <number>`, where the whole part is to the left and the decimal part is to the right.

Resolves #615

Reviewed By: stroxler

Differential Revision: D28449722

fbshipit-source-id: caa0fb52f72f94c4a4cc456a46c25fa5f3b9b625
2021-05-14 13:50:22 -07:00
Steven Troxler
323a7df023 Rearrange Engine.hs to top-down ordering
Summary:
Make the code reflect the call graph, which looks roughly like this:
```
parseAndResolve
  runDuckling
  resolveNode
  parseString
    saturateParseString
    parseString1
      matchFirst
         ... low level stuff
      matchFirstAnywhere
         ... low level stuff
```

I found the existing order pretty hard to untangle when I was writing some architecture notes on this module; I think the new ordering will help.

Reviewed By: chessai

Differential Revision: D28441933

fbshipit-source-id: 07c722aa6d4038baa7f14fec84660ecc2736ed2e
2021-05-14 11:50:03 -07:00
Daniel Cartwright
13513d30a5 Regenerate classifiers
Summary: Some classifiers were a bit out of date. They needed regenerating.

Reviewed By: girifb

Differential Revision: D28399234

fbshipit-source-id: 2780dbe5478a5386a2b6062dec8696736b3ce723
2021-05-13 14:02:35 -07:00
Steven Troxler
9151f9e1ab Specify where the note on regex + text lives
Summary:
I spent a surprising amount of time trying to figure out what
this comment was referring to because it wasn't at all clear to me
that it meant a comment in another file. Making it more specific.

Reviewed By: chessai

Differential Revision: D28411103

fbshipit-source-id: 26cd29b47367a7e0d865f616f289fef570544c39
2021-05-13 11:18:11 -07:00
Steven Troxler
fcdd8047a3 Add haddock comments to Candidate
Summary:
When documenting `Types.hs` last week I got confused about what the Bool
represented here, following up on a suggestion to add a doc comment

Reviewed By: chessai

Differential Revision: D28412103

fbshipit-source-id: 01af1f0831fc3e49d4b7f5bb9a4e89c5897b3d25
2021-05-13 11:18:10 -07:00
Steven Troxler
d6587dafbb Fix excessive-free-point-style lint errors on Rank.hs
Summary: Just replace `.` with `$`, also tweaked the spacing a bit for skimmability

Reviewed By: chessai

Differential Revision: D28411898

fbshipit-source-id: d18b9ef5db99b82d150231080c89f812f709f409
2021-05-13 11:18:09 -07:00
Steven Troxler
3eafced0fa Get rid of name clash warnings in Extraction.hs
Summary: Use targeted imports to avoid clash on `node` variable name

Reviewed By: chessai

Differential Revision: D28411902

fbshipit-source-id: 4a81e35a6aa601015685ccab3f571e919e9025c8
2021-05-13 11:03:19 -07:00
leandro.guisandez@pgconocimiento.com
173d8c235f Init
Summary: Initialise Duration for CA (Catalan) language

Reviewed By: stroxler

Differential Revision: D28299352

Pulled By: chessai

fbshipit-source-id: d1d4dd186b9fbf018c83a0df4d752b29da20f04d
2021-05-12 17:33:54 -07:00
kcnhk1@gmail.com
0a2ae7d895 Add rulePrecision2
Summary: about <volume>

Reviewed By: haoxuany

Differential Revision: D28390051

Pulled By: chessai

fbshipit-source-id: a357bbd15ab77578fda477eae6303158824458da
2021-05-12 16:53:37 -07:00
leandro.guisandez@pgconocimiento.com
f10c4db112 Init
Summary: Initialise TimeGrain for CA (Catalan) language

Reviewed By: stroxler

Differential Revision: D28299113

Pulled By: chessai

fbshipit-source-id: ffcd4043554123f5de2d279ef2660db83eb9f475
2021-05-12 16:33:51 -07:00
leandro.guisandez@pgconocimiento.com
0e3d0604a2 Init
Summary: Initialise Volume for CA (Catalan) language

Reviewed By: stroxler

Differential Revision: D28298901

Pulled By: chessai

fbshipit-source-id: 72fd95062393b8b780e521b56b097b66e2263aef
2021-05-12 14:54:05 -07:00
leandro.guisandez@pgconocimiento.com
219e5600d6 Init
Summary: Initialise Temperature for CA (Catalan) language

Reviewed By: stroxler

Differential Revision: D28298423

Pulled By: chessai

fbshipit-source-id: 73b87d002196b6b707388e9f83f42591510f40eb
2021-05-12 14:02:54 -07:00
kcnhk1@gmail.com
ff342868d7 Add rulePrecision
Summary: about <distance> rule

Reviewed By: haoxuany

Differential Revision: D28389599

Pulled By: chessai

fbshipit-source-id: 237f6f8ed605ba7d22f40cd338e637ed99565e28
2021-05-12 13:32:39 -07:00
leandro.guisandez@pgconocimiento.com
1322cd69ec Init
Summary: Initialises Distance for CA (Catalan) language

Reviewed By: stroxler

Differential Revision: D28297521

Pulled By: chessai

fbshipit-source-id: eb8641568f5981ea6e2d481c305e36cdb683dcfb
2021-05-12 12:50:13 -07:00
leandro.guisandez@pgconocimiento.com
213d1f12a5 Init
Summary: Initialise AmountOfMoney for CA (Catalan) language

Reviewed By: stroxler

Differential Revision: D28296090

Pulled By: chessai

fbshipit-source-id: f02305a762ee3cbf357bbb0a65eef614d3d828c9
2021-05-12 11:32:18 -07:00
Daniel Cartwright
3e28d42a29 Nth Time of Time rules: Make it less permissive
Summary: Currently, Duckling will accept "The first christmas of next month" in this rule, which is nonsensical. This reduces the scope of the times the rule recognises, thereby limiting us to a set of more sensible resolutions.

Reviewed By: stroxler

Differential Revision: D27861417

fbshipit-source-id: 3f19700af7298a6238c59f5de0598168d4b4a3c4
2021-05-11 11:32:14 -07:00
chessai
ccdf27ad1d FR: add nth <time> of <time> rules (#596)
Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/596

Reviewed By: stroxler

Differential Revision: D27722743

Pulled By: chessai

fbshipit-source-id: a9136fef2a26e87269bca8212ae07d3d7fe04977
2021-05-11 11:32:13 -07:00
Daniel Cartwright
59cb9e0879 Support legalese numerals
Summary:
Support numerals like "forty-five (45)". Commonly seen in legal documents.

There are no classifiers to regenerate.

Resolves #216
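
A naive Python sketch of the idea follows; it is not the actual Haskell rule. It handles spelled-out values up to 99 only. Requiring the spelled-out and parenthesised values to agree is an assumption made for the example, and all names are hypothetical:

```python
import re

UNITS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6,
    "seven": 7, "eight": 8, "nine": 9, "ten": 10, "eleven": 11,
    "twelve": 12, "thirteen": 13, "fourteen": 14, "fifteen": 15,
    "sixteen": 16, "seventeen": 17, "eighteen": 18, "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
LEGALESE = re.compile(r"^([A-Za-z\s-]+?)\s*\((\d+)\)$")


def words_to_int(words):
    """Naively sum tens and units, e.g. "forty-five" gives 45."""
    total = 0
    for part in re.split(r"[\s-]+", words.strip().lower()):
        if part in UNITS:
            total += UNITS[part]
        elif part in TENS:
            total += TENS[part]
        else:
            return None
    return total


def parse_legalese(text):
    """Return the number for "forty-five (45)" style input, else None."""
    match = LEGALESE.match(text.strip())
    if match is None:
        return None
    digits = int(match.group(2))
    # Assumption: the two spellings must agree for the phrase to parse.
    return digits if words_to_int(match.group(1)) == digits else None
```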

Reviewed By: stroxler

Differential Revision: D28305725

fbshipit-source-id: b9b4e160f630ce3cf462fcf9f2e575738463c313
2021-05-10 10:33:35 -07:00
Steven Troxler
0efbfe5988 Give a usable error if we fail to parse reftime parameter
Summary:
My attention was brought to this issue by the linter complaining about `read`.

The linter is just as happy with `error`, but I do think it's better to fail
with a usable error message here than with whatever error `read` gives us.
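
The same idea in a small Python sketch (the change itself is in Haskell). The milliseconds-since-epoch format and the message wording are assumptions for illustration, and `parse_reftime` is a hypothetical name:

```python
from datetime import datetime, timezone


def parse_reftime(raw):
    """Parse a reftime parameter, failing with a message that names it."""
    try:
        # Assumption: reftime is an integer count of milliseconds since epoch.
        millis = int(raw)
    except ValueError:
        raise ValueError(
            f"could not parse reftime {raw!r}: expected milliseconds since epoch"
        ) from None
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
```

A caller passing `"tomorrow"` then sees an error naming the offending parameter and value instead of a generic conversion failure.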

Reviewed By: chessai

Differential Revision: D28213406

fbshipit-source-id: f101d0515ee64978480bdbb873ff72d80d124969
2021-05-07 06:20:03 -07:00
Steven Troxler
4b44e969c9 Fix all name collisions on the main Types.hs
Summary:
I think they all fell into one of two categories:
- names colliding with field names, but where there was already an existing
  pattern (e.g. d for dimension, v for value) and we just had to be consistent
- cases that were best fixed by turning NamedFieldPuns on, which I did

Reviewed By: chessai

Differential Revision: D28213245

fbshipit-source-id: 18fbd61771e12da11ce03b98b74af51d1e837787
2021-05-07 06:20:03 -07:00
Steven Troxler
ce3614fedd In Debug.hs, s/sentence/input/g
Summary:
When tracing the code from Debug downward, the unnecessary rename
of an argument from `sentence` to `input` creates a context switch. Let's
use the same name throughout.

Reviewed By: chessai

Differential Revision: D28213244

fbshipit-source-id: 22476d958312e5c60cd32ff1e3d0d460cf0c8c79
2021-05-06 08:54:57 -07:00
Steven Troxler
a88b70feb7 Filter with a self-describing function in where
Summary:
In both `Api.hs` and `Debug.hs` I noticed that I was staring at the code
longer than necessary to figure out what a lambda with a destructure and
pattern match were doing. Moving it to a function in `where` named
`isRelevantDimension` makes skimming easier.

Reviewed By: chessai

Differential Revision: D28213243

fbshipit-source-id: 344f464dcac7297009c35b19373eef67e0eb9540
2021-05-06 08:54:57 -07:00
Steven Troxler
eba5d0a825 Simple style fixes for outer layers around Engine.hs
Summary:
Easy style fixes for ExampleMain.hs, Debug.hs, Api.hs, Core.hs

Most of these are just lint fixes, but I also made a few not-just-lint changes
to conform to some elements of our style guide that I agree with:
- if the type signature doesn't fit on one line, then put one type per line
  with nothing on the first line, so that all types are vertically aligned - makes
  for a quick skim
- try to avoid mixing same-line function args with hanging function args: hang
  all arguments or none at all to get a more outline-like feel, again better for
  skimming

I was actually able to eliminate all errors for most of these modules - the name
collisions I usually give up on were manageable by hiding + easy variable renames

Reviewed By: chessai

Differential Revision: D28213246

fbshipit-source-id: 1f77d56f2ff8dccfd5f3b534f087c07047b92885
2021-05-06 08:54:56 -07:00
Steven Troxler
0e13d28b4d Time/EN: Get rid of unnecessary rules
Summary:
While I was working on fixing #604, I came across the rules
`ruleMilitarySpelledOutAMPM(2)`, which were actually capturing
some of my test phrases and confusing me.

This commit removes them because
- they aren't needed: the existing latent spelled-out hour + minute rules plus
  the "(in the )?(am/pm)" rules together give the same behavior
- they are confusingly named - these aren't military times at all, they are
  spelled-out civilian times

Reviewed By: haoxuany

Differential Revision: D27848485

fbshipit-source-id: ba1ed16ec22b5139b0b500b44dc91adb1b5e3d82
2021-04-26 06:17:44 -07:00
Steven Troxler
c44c73fe04 Numeral/ES: Add support for additive concatenations
Summary:
This commit extends Spanish-language support for concatenations
of the form "<higher-order-of-magnitude> <lower>", e.g.
"doscientos tres" (203) or "cuatro mil veintiuno" (4021), to work
not just for hundreds but also for thousands and millions.

Reviewed By: chessai

Differential Revision: D27858133

fbshipit-source-id: 5c6b227ae7dad9009cd636e7ea49c209480c931a
2021-04-23 09:48:07 -07:00
Steven Troxler
888da76215 Numeral/ES: Add support for 1M, and multiples of 1K/1M
Summary:
This commit adds two things to Spanish numeral support:
- support for millions
- support, via hooking into the `isMultipliable` logic used by EN, for
  composing counts of 2-999 with either "mil" or "millones", which is
  the standard way to say things like "tres mil" = 3000

Reviewed By: chessai

Differential Revision: D27858135

fbshipit-source-id: 980e95bd989f818c5ceaa2bb6c87fe81d3e08366
2021-04-23 09:48:06 -07:00
Steven Troxler
15bba9eba9 Numeral/ES: Refactor hundreds handling to fix bug
Summary:
This diff refactors our handling of "<hundreds> 0..99" numbers
to be more flexible by replacing `ruleNumeralthreePartHundreds`
with
- a rule for two-part hundreds like "dos cientos" (which is technically
  incorrect grammar - doscientos is correct - but probably worth keeping) based
  on a notion of multipliability like that used in EN rules
- a rule stating that we can compose hundreds with 0..99 additively

The resulting rules are more flexible, and they correctly parse not only
grammatically iffy phrases like "dos cientos tres", but also grammatically
correct phrases like "doscientos tres". This fixes #380.
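
The two rules described above can be sketched in Python as follows; this is not the Haskell implementation. The word tables are deliberately tiny and `parse_hundreds` is a hypothetical name:

```python
# Tiny illustrative vocabularies, far from complete.
SMALL = {"uno": 1, "dos": 2, "tres": 3, "cuatro": 4, "cinco": 5, "veintiuno": 21}
HUNDREDS = {"cien": 100, "ciento": 100, "doscientos": 200, "trescientos": 300}


def parse_hundreds(text):
    """Parse "<hundreds> [0..99]", accepting one- or two-part hundreds."""
    tokens = text.lower().split()
    if len(tokens) >= 2 and tokens[1] == "cientos" and 2 <= SMALL.get(tokens[0], 0) <= 9:
        # Multiplicative rule: "dos cientos" is 2 * 100.
        total, rest = SMALL[tokens[0]] * 100, tokens[2:]
    elif tokens and tokens[0] in HUNDREDS:
        total, rest = HUNDREDS[tokens[0]], tokens[1:]
    else:
        return None
    if not rest:
        return total
    if len(rest) == 1 and rest[0] in SMALL:
        # Additive rule: hundreds compose with a value in 0..99.
        return total + SMALL[rest[0]]
    return None
```

Both `parse_hundreds("dos cientos tres")` and `parse_hundreds("doscientos tres")` give 203 under this sketch.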

Reviewed By: chessai

Differential Revision: D27858136

fbshipit-source-id: 4a918d84d93ac074f83f6947a8f80cfd11145115
2021-04-23 09:48:06 -07:00