742 Commits

Author SHA1 Message Date
Daniel Cartwright
84175d61d6 Specify that you need to install PCRE headers on linux as well
Summary: Some users have been confused by this in the past.

Reviewed By: stroxler

Differential Revision: D30685469

fbshipit-source-id: a4c190a4a2bc0c02cd9711422934bde27f9ec116
2021-08-31 17:20:43 -07:00
Daniel Cartwright
72f45e8e2c Recognise hh:mm:ss as not an interval
Summary: An interval regex was overzealous and matching too much, so `hh:mm:ss` was getting parsed as an interval instead of a time.
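
The commit does not show the regex itself, so the following is only a sketch of this kind of over-matching, using hypothetical patterns that are not Duckling's rules. A loose pattern that accepts any colon-separated pair also fires inside `hh:mm:ss`, while lookarounds that reject a neighbouring digit or colon leave the clock time alone:

```python
import re

# Hypothetical patterns for illustration; these are not Duckling's rules.
# LOOSE accepts any "a:b" pair, even when it is part of a longer run.
LOOSE = re.compile(r"(\d{1,2}):(\d{1,2})")
# STRICT refuses to match when the pair is embedded in an a:b:c run.
STRICT = re.compile(r"(?<![\d:])(\d{1,2}):(\d{1,2})(?![\d:])")

def matches_pair(pattern, text):
    """Return True if the pattern finds a colon-separated pair in text."""
    return pattern.search(text) is not None
```

With these assumed patterns, `matches_pair(LOOSE, "12:30:45")` is true because the loose pattern grabs `12:30`, whereas `matches_pair(STRICT, "12:30:45")` is false.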

Reviewed By: patapizza

Differential Revision: D30608223

fbshipit-source-id: b24c18146070f15ada80b9401e67f0c0aefef7d8
2021-08-27 12:23:43 -07:00
Filipe Pereira
d8888e2ff8 Ca time improvements (#639)
Summary:
Some time recognition improvements for Catalan:
- morning should be a time range recognised until noon
- "dema" can also be used for tomorrow (besides "demà")
- "se" alone should not be understood as September

Pull Request resolved: https://github.com/facebook/duckling/pull/639

Reviewed By: stroxler

Differential Revision: D30312076

Pulled By: chessai

fbshipit-source-id: 1a42bbd7eecc4f5690145ee9cadb8eccae8edd08
2021-08-16 10:46:40 -07:00
chessai
32eb5db8c2 update lts resolver (#641)
Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/641

Reviewed By: haoxuany

Differential Revision: D30260899

Pulled By: chessai

fbshipit-source-id: 3e6d3d8aa84caac8ba6b845d79a577618f12515f
2021-08-11 19:02:53 -07:00
Dubovinszky Péter
0354f27ef4 Time/HU: extend dates (#462)
Summary:
Extend Hungarian dates with new cases

Pull Request resolved: https://github.com/facebook/duckling/pull/462

Differential Revision: D25573636

Pulled By: chessai

fbshipit-source-id: 251698cf9f5126162ad4fbf1489dcbc4c12541ed
2021-08-11 13:31:28 -07:00
Filipe Pereira
57dab83ad3 PT time improvements 2 (#636)
Summary:
Fixed rules for PT time expressions like "amanhã à noite", "dia 17", "dia 15 às 18"

Pull Request resolved: https://github.com/facebook/duckling/pull/636

Reviewed By: stroxler

Differential Revision: D30138416

Pulled By: chessai

fbshipit-source-id: 5265d44e7ddce5eee8cd7266df9254389a10b139
2021-08-05 13:47:41 -07:00
Filipe Pereira
fc7950a68f ES time improvements (#634)
Summary:
New rules for ES time expressions like "3 Marzo", "Marzo 3".

Pull Request resolved: https://github.com/facebook/duckling/pull/634

Reviewed By: girifb

Differential Revision: D30110631

Pulled By: chessai

fbshipit-source-id: e6add868535522d243ccf1dab2443e6cd3f7f8b2
2021-08-05 10:48:47 -07:00
Filipe Pereira
a6499228af FR time improvement (#635)
Summary:
Fixed recognition for month "Juil" (abbreviation of "Juillet")

Pull Request resolved: https://github.com/facebook/duckling/pull/635

Reviewed By: stroxler

Differential Revision: D30115291

Pulled By: chessai

fbshipit-source-id: e04d6e7952f85f4ca061540a3967908bcd4f1ebd
2021-08-04 17:31:08 -07:00
Filipe Pereira
fe4f77bdc0 PT time improvements (#633)
Summary:
New rules for PT time expressions like "5 Maio", "Maio 5", "5 Maio 2022".

Pull Request resolved: https://github.com/facebook/duckling/pull/633

Reviewed By: stroxler

Differential Revision: D30114330

Pulled By: chessai

fbshipit-source-id: f56418d95efa1d7488957b8b8083daec3193949b
2021-08-04 17:31:07 -07:00
Maíra Bello
328e59ebc4 Quantity/PT: Extend quantity to include grams in portuguese (#631)
Summary:
I'm using Duckling in my project but I noticed that quantities in kg weren't being detected correctly, though other entities such as numeral/volume were all working as expected. Investigating more I noticed that this was just because in Portuguese the Quantity entity was only configured to detect cups and pounds, never grams. And even for cups/pounds, products weren't being detected correctly.

So I've just adapted the rules from English Quantity to work in Portuguese as well, while keeping the cups/pounds too. It's all working as expected now and it's backwards compatible.

Pull Request resolved: https://github.com/facebook/duckling/pull/631

Reviewed By: stroxler

Differential Revision: D29701339

Pulled By: chessai

fbshipit-source-id: fca08a14c50844d418f101b885ca54554d993f58
2021-07-19 15:48:04 -07:00
Daniel Cartwright
b10e1d6a78 ES Numeral - Add ruleLeadingDotNumeral and improve ruleNumeralDotNumeral
Summary:
Add ruleLeadingDotNumeral, which parses "punto 2" and "coma 2" as 0.2, and allow "coma" in ruleNumeralDotNumeral.

Also extend ruleNumeralsPrefixWithNegativeOrMinus to include 'negativo' prefixes.
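
As a rough illustration of the leading-dot behaviour (a sketch only: the regex and the function name `parse_leading_dot` are hypothetical, it handles digits rather than spelled-out numbers, and treating "punto 25" as 0.25 is an assumption):

```python
import re

# Hypothetical stand-in for the behaviour described; not Duckling's rule.
LEADING_DOT = re.compile(r"^(?:punto|coma)\s+(\d+)$", re.IGNORECASE)

def parse_leading_dot(text):
    """Parse 'punto 2' or 'coma 2' as 0.2; return None if there is no match."""
    match = LEADING_DOT.match(text.strip())
    if match is None:
        return None
    # Build the decimal from its digits so no float rounding is introduced.
    return float("0." + match.group(1))
```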

Reviewed By: stroxler

Differential Revision: D29405886

fbshipit-source-id: eb43f6f72374430af414e0d29009b98df2115a31
2021-07-19 13:18:09 -07:00
Amr Keleg
79ac8f63f9 Add isArabic rule (#577)
Summary:
Fixes https://github.com/facebook/duckling/issues/437, fixes https://github.com/facebook/duckling/issues/571

Pull Request resolved: https://github.com/facebook/duckling/pull/577

Reviewed By: stroxler

Differential Revision: D29664126

Pulled By: chessai

fbshipit-source-id: b6365699231527b0869322c798e32a21328f1071
2021-07-12 13:37:23 -07:00
Daniel Cartwright
ed291c2a3a ES (Spanish) Time - add rule for 'next <day-of-week>'
Summary: Resolves #623. Add rule for 'proximo <day-of-week>'

Reviewed By: stroxler

Differential Revision: D29002419

fbshipit-source-id: 7d5fb04b66fe068ae2906b63ede44009e80f1a3c
2021-06-09 20:33:12 -07:00
Damien Gallet
28e38679a7 Time.FR > add rule for years in twentieth century (#357)
Summary:
In Time.FR, add support for birthdates like "15 juin 72"
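
A minimal sketch of the idea, assuming that every two-digit year maps to 19xx as the title suggests. The month table, the pattern and the name `parse_short_year_date` are hypothetical and are not taken from Duckling:

```python
import re

# Hypothetical month table and pattern; not Duckling's actual rule.
MONTHS_FR = {
    "janvier": 1, "février": 2, "mars": 3, "avril": 4,
    "mai": 5, "juin": 6, "juillet": 7, "août": 8,
    "septembre": 9, "octobre": 10, "novembre": 11, "décembre": 12,
}
DATE_FR = re.compile(r"^(\d{1,2})\s+(\w+)\s+(\d{2})$")

def parse_short_year_date(text):
    """Return (year, month, day) for inputs like '15 juin 72', else None."""
    match = DATE_FR.match(text.strip().lower())
    if match is None:
        return None
    day, month_name, yy = match.groups()
    month = MONTHS_FR.get(month_name)
    if month is None:
        return None
    # Assumption: a two-digit year always means the twentieth century.
    return (1900 + int(yy), month, int(day))
```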

Pull Request resolved: https://github.com/facebook/duckling/pull/357

Reviewed By: patapizza

Differential Revision: D26193322

Pulled By: chessai

fbshipit-source-id: d22efea81aad31af8baa2f7f9afdaf1a75c0dc10
2021-06-04 13:04:12 -07:00
Daniel Cartwright
8cb77a43c7 Add custom isRangeValid implementation for ZH
Summary: Fixes #313

Reviewed By: stroxler

Differential Revision: D28364035

fbshipit-source-id: 7fe3dba75410d217747a0d7a6f7df611ac26ec70
2021-06-04 12:48:32 -07:00
evjava
4878820294 Russian(RU) numeral and ordinal improvements (#374)
Summary:
- added non-typo variant for 11 (одиннадцать)
- added variants for grammatical cases

Pull Request resolved: https://github.com/facebook/duckling/pull/374

Test Plan:
```
:test Duckling.Numeral.RU.Tests
:test Duckling.Ordinal.RU.Tests
```

Reviewed By: stroxler

Differential Revision: D20332223

Pulled By: chessai

fbshipit-source-id: be1c6f6477af56418b69da21f5219ba27b50d0a1
2021-06-04 12:18:31 -07:00
tuantvk
25b39f4a8b Time/VI: update rule Sunday (#611)
Summary:
In Vietnamese, Sunday is "chủ nhật" or, in Catholic usage, "Chúa nhật".

Pull Request resolved: https://github.com/facebook/duckling/pull/611

Reviewed By: haoxuany

Differential Revision: D28399277

Pulled By: chessai

fbshipit-source-id: 26aa7c76cf1f8b8c2ba32e049f7f470a140e3d92
2021-06-04 12:18:27 -07:00
chessai
99e1dce9c4 restrict dimensions to only those specified (#625)
Summary:
Resolves https://github.com/facebook/duckling/issues/624

Before patch (specifying quantity and numeral, but time still shows up):
```
❯ curl -XPOST http://0.0.0.0:8000/parse --data 'locale=en_US&text="June 21 and 3 cups of sugar"&dims="[\"quantity\",\"numeral\"]"' | jq
[
  {
    "body": "June 21",
    "start": 1,
    "value": {
      "values": [
        {
          "value": "2021-06-21T00:00:00.000-07:00",
          "grain": "day",
          "type": "value"
        },
        {
          "value": "2022-06-21T00:00:00.000-07:00",
          "grain": "day",
          "type": "value"
        },
        {
          "value": "2023-06-21T00:00:00.000-07:00",
          "grain": "day",
          "type": "value"
        }
      ],
      "value": "2021-06-21T00:00:00.000-07:00",
      "grain": "day",
      "type": "value"
    },
    "end": 8,
    "dim": "time",
    "latent": false
  },
  {
    "body": "3 cups of sugar",
    "start": 13,
    "value": {
      "value": 3,
      "type": "value",
      "product": "sugar",
      "unit": "cup"
    },
    "end": 28,
    "dim": "quantity",
    "latent": false
  }
]
```

After patch (time no longer shows up):
```
❯ curl -XPOST http://0.0.0.0:8000/parse --data 'locale=en_US&text="June 21 and 3 cups of sugar"&dims="[\"quantity\",\"numeral\"]"' | jq
[
  {
    "body": "3 cups of sugar",
    "start": 13,
    "value": {
      "value": 3,
      "type": "value",
      "product": "sugar",
      "unit": "cup"
    },
    "end": 28,
    "dim": "quantity",
    "latent": false
  }
]
```
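
The effect of the patch can be sketched as a filter over the parsed entities. This is only an illustration of the observable behaviour on the JSON above; the helper `restrict_to_dims` is hypothetical and says nothing about how the server implements the restriction:

```python
import json

def restrict_to_dims(entities, dims):
    """Keep only the entities whose "dim" is one of the requested dims."""
    wanted = set(dims)
    return [entity for entity in entities if entity["dim"] in wanted]

# Trimmed-down versions of the two entities from the "before" response.
entities = [
    {"body": "June 21", "dim": "time"},
    {"body": "3 cups of sugar", "dim": "quantity"},
]
requested = json.loads('["quantity", "numeral"]')
kept = restrict_to_dims(entities, requested)
```

Here `kept` holds only the quantity entity, matching the "after" response.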

Pull Request resolved: https://github.com/facebook/duckling/pull/625

Reviewed By: stroxler

Differential Revision: D28851759

Pulled By: chessai

fbshipit-source-id: d3b3f33092c7e60bf29886939488ed562a213c35
2021-06-03 10:33:42 -07:00
chessai
878beb7aa1 fix hanging ci (#626)
Summary:
resolves https://github.com/facebook/duckling/issues/622

Pull Request resolved: https://github.com/facebook/duckling/pull/626

Reviewed By: stroxler

Differential Revision: D28842781

Pulled By: chessai

fbshipit-source-id: 3210bbf9eb76f21c90af86f6abdeac566fc86415
2021-06-02 15:32:58 -07:00
leandro.guisandez@pgconocimiento.com
5d8d99bbf4 Init
Summary: Initialise Time for CA (Catalan) language

Reviewed By: stroxler

Differential Revision: D28455273

Pulled By: chessai

fbshipit-source-id: be9a4d61692ba4fb32986e161e9fdd6d25a357dc
2021-05-18 13:50:19 -07:00
Steven Troxler
3d2f1939ef s/ptn/pattern
Summary:
I don't think abbreviating pattern to ptn is a win: the abbreviation
isn't good enough to be unambiguous (I found this because I was
trying to figure out what "ptn" meant), and `pattern` isn't that
long a name.

Reviewed By: chessai

Differential Revision: D28462776

fbshipit-source-id: bd685720b198fed791d84c00732eb1873b37528b
2021-05-18 11:50:19 -07:00
Steven Troxler
dbc5c91263 Remove all name clashes in Time/Helpers.hs
Summary:
Solutions were:
 - use targeted and qualified imports to avoid pulling in the universe of Duckling.Types
 - use long-form descriptive names in a few places
 - shuffle a let clause to just define an output instead of a local func

Also got rid of another lint error suggesting to use a section instead of flip;
the module is now lint-warning free.

Reviewed By: chessai

Differential Revision: D28462775

fbshipit-source-id: 1e2855756b22cb62db0d94334a7e063aa728b7bf
2021-05-18 11:50:18 -07:00
Steven Troxler
dd1ae664cc Fix name collisions in Time/Types.hs
Summary:
Using rules of thumb:
 - use unabbreviated names for arguments to top-level functions where there's
   a clash (e.g. lots of t -> time transformations)
 - use abbreviated names for nested local functions (so e.g. t, d to avoid
   a clash with the `day` top-level function)

Reviewed By: chessai

Differential Revision: D28462777

fbshipit-source-id: 8795d038b2c3a65b60f0d2d9091b7c56cc8a5ff7
2021-05-18 11:50:18 -07:00
Steven Troxler
81ab073acf Move Candidate to Ranking/Types.hs
Summary:
In my opinion putting `Candidate` into the core `Types.hs`
is a mistake - it's used exclusively in the ranking stage, so cluttering
the core tokenizing and recursive parsing / value resolution logic in
`Duckling.Types` with this irrelevant datatype makes things less clear
than if we keep it in the `Ranking` modules.

Reviewed By: chessai

Differential Revision: D28462902

fbshipit-source-id: cd4bb88c4a16945265e8f21c8808b06ae3383559
2021-05-18 11:50:17 -07:00
Daniel Cartwright
69d951220e Make isRangeValid take Lang as input
Summary: There are different implementations of isRangeValid that work well for different languages, thus it makes sense to facilitate having different implementations based on the language.

Reviewed By: patapizza

Differential Revision: D28362777

fbshipit-source-id: 5f2991d54af3095c8e95cf534e2dd3b4a34dee3a
2021-05-17 13:18:11 -07:00
Daniel Cartwright
7762af850a ES Numeral con
Summary:
In ES (Spanish), decimals can be expressed by `<number> con <number>`, where the whole part is to the left and the decimal part is to the right.
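
A small sketch of the `<number> con <number>` reading, assuming digit-only input. The regex and the name `parse_con_decimal` are hypothetical; the real rule works on Duckling's numeral tokens rather than on raw text:

```python
import re

# Hypothetical pattern for illustration; not Duckling's rule.
CON = re.compile(r"^(\d+)\s+con\s+(\d+)$", re.IGNORECASE)

def parse_con_decimal(text):
    """Read '<whole> con <fraction>' as a decimal, e.g. '3 con 5' as 3.5."""
    match = CON.match(text.strip())
    if match is None:
        return None
    whole, fraction = match.groups()
    # The whole part sits left of the point, the decimal part right of it.
    return float(whole + "." + fraction)
```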

Resolves #615

Reviewed By: stroxler

Differential Revision: D28449722

fbshipit-source-id: caa0fb52f72f94c4a4cc456a46c25fa5f3b9b625
2021-05-14 13:50:22 -07:00
Steven Troxler
323a7df023 Rearrange Engine.hs to top-down ordering
Summary:
Make the code reflect the call graph, which looks roughly like this:
```
parseAndResolve
  runDuckling
  resolveNode
  parseString
    saturateParseString
    parseString1
      matchFirst
         ... low level stuff
      matchFirstAnywhere
         ... low level stuff
```

I found the existing order pretty hard to untangle when I was writing some architecture notes on this module. I think the new ordering will help.

Reviewed By: chessai

Differential Revision: D28441933

fbshipit-source-id: 07c722aa6d4038baa7f14fec84660ecc2736ed2e
2021-05-14 11:50:03 -07:00
Daniel Cartwright
13513d30a5 Regenerate classifiers
Summary: Some classifiers were a bit out of date. They needed regenerating.

Reviewed By: girifb

Differential Revision: D28399234

fbshipit-source-id: 2780dbe5478a5386a2b6062dec8696736b3ce723
2021-05-13 14:02:35 -07:00
Steven Troxler
9151f9e1ab Specify where the note on regex + text lives
Summary:
I spent a surprising amount of time trying to figure out what
this comment was referring to because it wasn't at all clear to me
that it meant a comment in another file. This makes it more specific.

Reviewed By: chessai

Differential Revision: D28411103

fbshipit-source-id: 26cd29b47367a7e0d865f616f289fef570544c39
2021-05-13 11:18:11 -07:00
Steven Troxler
fcdd8047a3 Add haddock comments to Candidate
Summary:
When documenting `Types.hs` last week I got confused about what the Bool
represented here. This follows up on a suggestion to add a doc comment.

Reviewed By: chessai

Differential Revision: D28412103

fbshipit-source-id: 01af1f0831fc3e49d4b7f5bb9a4e89c5897b3d25
2021-05-13 11:18:10 -07:00
Steven Troxler
d6587dafbb Fix excessive-free-point-style lint errors on Rank.hs
Summary: Just replace `.` with `$`, also tweaked the spacing a bit for skimmability

Reviewed By: chessai

Differential Revision: D28411898

fbshipit-source-id: d18b9ef5db99b82d150231080c89f812f709f409
2021-05-13 11:18:09 -07:00
Steven Troxler
3eafced0fa Get rid of name clash warnings in Extraction.hs
Summary: Use targeted imports to avoid clash on `node` variable name

Reviewed By: chessai

Differential Revision: D28411902

fbshipit-source-id: 4a81e35a6aa601015685ccab3f571e919e9025c8
2021-05-13 11:03:19 -07:00
leandro.guisandez@pgconocimiento.com
173d8c235f Init
Summary: Initialise Duration for CA (Catalan) language

Reviewed By: stroxler

Differential Revision: D28299352

Pulled By: chessai

fbshipit-source-id: d1d4dd186b9fbf018c83a0df4d752b29da20f04d
2021-05-12 17:33:54 -07:00
kcnhk1@gmail.com
0a2ae7d895 Add rulePrecision2
Summary: about <volume>

Reviewed By: haoxuany

Differential Revision: D28390051

Pulled By: chessai

fbshipit-source-id: a357bbd15ab77578fda477eae6303158824458da
2021-05-12 16:53:37 -07:00
leandro.guisandez@pgconocimiento.com
f10c4db112 Init
Summary: Initialise TimeGrain for CA (Catalan) language

Reviewed By: stroxler

Differential Revision: D28299113

Pulled By: chessai

fbshipit-source-id: ffcd4043554123f5de2d279ef2660db83eb9f475
2021-05-12 16:33:51 -07:00
leandro.guisandez@pgconocimiento.com
0e3d0604a2 Init
Summary: Initialise Volume for CA (Catalan) language

Reviewed By: stroxler

Differential Revision: D28298901

Pulled By: chessai

fbshipit-source-id: 72fd95062393b8b780e521b56b097b66e2263aef
2021-05-12 14:54:05 -07:00
leandro.guisandez@pgconocimiento.com
219e5600d6 Init
Summary: Initialise Temperature for CA (Catalan) language

Reviewed By: stroxler

Differential Revision: D28298423

Pulled By: chessai

fbshipit-source-id: 73b87d002196b6b707388e9f83f42591510f40eb
2021-05-12 14:02:54 -07:00
kcnhk1@gmail.com
ff342868d7 Add rulePrecision
Summary: about <distance> rule

Reviewed By: haoxuany

Differential Revision: D28389599

Pulled By: chessai

fbshipit-source-id: 237f6f8ed605ba7d22f40cd338e637ed99565e28
2021-05-12 13:32:39 -07:00
leandro.guisandez@pgconocimiento.com
1322cd69ec Init
Summary: Initialise Distance for CA (Catalan) language

Reviewed By: stroxler

Differential Revision: D28297521

Pulled By: chessai

fbshipit-source-id: eb8641568f5981ea6e2d481c305e36cdb683dcfb
2021-05-12 12:50:13 -07:00
leandro.guisandez@pgconocimiento.com
213d1f12a5 Init
Summary: Initialise AmountOfMoney for CA (Catalan) language

Reviewed By: stroxler

Differential Revision: D28296090

Pulled By: chessai

fbshipit-source-id: f02305a762ee3cbf357bbb0a65eef614d3d828c9
2021-05-12 11:32:18 -07:00
Daniel Cartwright
3e28d42a29 Nth Time of Time rules: Make it less permissive
Summary: Currently, Duckling will accept "The first christmas of next month" in this rule, which is nonsensical. This reduces the scope of the times the rule recognises, thereby limiting us to a set of more sensible resolutions.

Reviewed By: stroxler

Differential Revision: D27861417

fbshipit-source-id: 3f19700af7298a6238c59f5de0598168d4b4a3c4
2021-05-11 11:32:14 -07:00
chessai
ccdf27ad1d FR: add nth <time> of <time> rules (#596)
Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/596

Reviewed By: stroxler

Differential Revision: D27722743

Pulled By: chessai

fbshipit-source-id: a9136fef2a26e87269bca8212ae07d3d7fe04977
2021-05-11 11:32:13 -07:00
Daniel Cartwright
59cb9e0879 Support legalese numerals
Summary:
Support numerals like "forty-five (45)". Commonly seen in legal documents.
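
One way to picture the pattern is shown below. It is a sketch under stated assumptions: the tiny word tables, the regex and the function names are hypothetical, and requiring the spelled-out and parenthesised values to agree is an assumption, not something the commit states:

```python
import re

# Hypothetical, deliberately tiny word tables for the sketch.
UNITS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
         "six": 6, "seven": 7, "eight": 8, "nine": 9}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}
LEGALESE = re.compile(r"^([a-z]+(?:-[a-z]+)?)\s*\((\d+)\)$", re.IGNORECASE)

def words_to_int(words):
    """Handle forms like 'five', 'forty' and 'forty-five' only."""
    parts = words.lower().split("-")
    if len(parts) == 1:
        return UNITS.get(parts[0], TENS.get(parts[0]))
    tens, unit = TENS.get(parts[0]), UNITS.get(parts[1])
    if tens is None or unit is None:
        return None
    return tens + unit

def parse_legalese(text):
    """Return the value of 'forty-five (45)' when both halves agree."""
    match = LEGALESE.match(text.strip())
    if match is None:
        return None
    spelled = words_to_int(match.group(1))
    digits = int(match.group(2))
    return digits if spelled == digits else None
```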

There are no classifiers to regenerate.

Resolves #216

Reviewed By: stroxler

Differential Revision: D28305725

fbshipit-source-id: b9b4e160f630ce3cf462fcf9f2e575738463c313
2021-05-10 10:33:35 -07:00
Steven Troxler
0efbfe5988 Give a usable error if we fail to parse reftime parameter
Summary:
My attention was brought to this issue by the linter complaining about `read`.

The linter is just as happy with `error`, but I do think it's better to fail
with a usable error message here than with whatever error `read` gives us.

Reviewed By: chessai

Differential Revision: D28213406

fbshipit-source-id: f101d0515ee64978480bdbb873ff72d80d124969
2021-05-07 06:20:03 -07:00
Steven Troxler
4b44e969c9 Fix all name collisions on the main Types.hs
Summary:
I think they all fell into one of two categories:
- names colliding with field names, but where there was already an existing
  pattern (e.g. d for dimension, v for value) and we just had to be consistent
- cases that were best fixed by turning NamedFieldPuns on, which I did

Reviewed By: chessai

Differential Revision: D28213245

fbshipit-source-id: 18fbd61771e12da11ce03b98b74af51d1e837787
2021-05-07 06:20:03 -07:00
Steven Troxler
ce3614fedd In Debug.hs, s/sentence/input/g
Summary:
When tracing the code from Debug downward, the unnecessary rename
of an argument from `sentence` to `input` creates a context switch. Let's
use the same name throughout.

Reviewed By: chessai

Differential Revision: D28213244

fbshipit-source-id: 22476d958312e5c60cd32ff1e3d0d460cf0c8c79
2021-05-06 08:54:57 -07:00
Steven Troxler
a88b70feb7 Filter with a self-describing function in where
Summary:
In both `Api.hs` and `Debug.hs` I noticed that I was staring at the code
longer than necessary to figure out what a lambda with a destructure and
pattern match was doing. Moving it to a function in `where` named
`isRelevantDimension` makes skimming easier.

Reviewed By: chessai

Differential Revision: D28213243

fbshipit-source-id: 344f464dcac7297009c35b19373eef67e0eb9540
2021-05-06 08:54:57 -07:00
Steven Troxler
eba5d0a825 Simple style fixes for outer layers around Engine.hs
Summary:
Easy style fixes for ExampleMain.hs, Debug.hs, Api.hs, Core.hs

Most of these are just lint fixes, but I also made a few not-just-lint changes
to conform to some elements of our style guide that I agree with:
- if the type signature doesn't fit on one line, then put one type per line
  with nothing on the first line, so that all types are vertically aligned - makes
  for a quick skim
- try to avoid mixing same-line function args with hanging function args: hang
  all arguments or none at all to get a more outline-like feel, again better for
  skimming

I was actually able to eliminate all errors for most of these modules - the name
collisions I usually give up on were manageable by hiding + easy variable renames

Reviewed By: chessai

Differential Revision: D28213246

fbshipit-source-id: 1f77d56f2ff8dccfd5f3b534f087c07047b92885
2021-05-06 08:54:56 -07:00
Steven Troxler
0e13d28b4d Time/EN: Get rid of unnecessary rules
Summary:
While I was working on fixing #604, I came across the rules
`ruleMilitarySpelledOutAMPM(2)`, which were actually capturing
some of my test phrases and confusing me.

This commit removes them because
- they aren't needed: the existing latent spelled-out hour + minute rules plus
  the "(in the )?(am/pm)" rules together give the same behavior
- they are confusingly named - these aren't military times at all, they are
  spelled-out civilian times

Reviewed By: haoxuany

Differential Revision: D27848485

fbshipit-source-id: ba1ed16ec22b5139b0b500b44dc91adb1b5e3d82
2021-04-26 06:17:44 -07:00
Steven Troxler
c44c73fe04 Numeral/ES: Add support for additive concatenations
Summary:
This commit extends Spanish-language support for concatenations
of the form "<higher-order-of-magnitude> <lower>", e.g.
"doscientos tres" (203) or "cuatro mil veintiuno" (4021), to work
not just for hundreds but also for thousands and millions.
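
The additive step can be sketched on plain integers. The function `add_if_lower` and its validity check are hypothetical and only illustrate the "<higher-order-of-magnitude> <lower>" shape; they are not Duckling's rule:

```python
def add_if_lower(higher, lower):
    """Combine e.g. 200 and 3 into 203.

    Returns None when `lower` would overlap the non-zero digits of
    `higher`, so 200 followed by 300 is rejected.
    """
    if higher <= 0 or lower <= 0:
        return None
    # Find the place value of the lowest non-zero digit of `higher`.
    place = 1
    while higher % (place * 10) == 0:
        place *= 10
    return higher + lower if lower < place else None
```

Under this assumed check, `add_if_lower(200, 3)` gives 203 and `add_if_lower(5000, 40)` gives 5040, while `add_if_lower(200, 300)` is rejected.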

Reviewed By: chessai

Differential Revision: D27858133

fbshipit-source-id: 5c6b227ae7dad9009cd636e7ea49c209480c931a
2021-04-23 09:48:07 -07:00