Commit Graph

613 Commits

Author SHA1 Message Date
chessai
cdeefe1d4d ghc88x compat (#550)
Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/550

Reviewed By: haoxuany

Differential Revision: D24844625

Pulled By: chessai

fbshipit-source-id: 52dcf5f9488386f7f407535e876bff1207823fe0
2020-11-12 13:47:46 -08:00
Dmitri Osipov
e7264b55c9 adds frequent durations in German (#509)
Summary:
Found a missing frequent duration in German and a small typo in the existing one.

Pull Request resolved: https://github.com/facebook/duckling/pull/509

Reviewed By: patapizza

Differential Revision: D24690104

Pulled By: chessai

fbshipit-source-id: b49a7a636abf5b92f2fe7c0d5b2ca2fe64acbaa2
2020-11-09 11:18:35 -08:00
Daniel Cartwright
eb043d7018 Quantity rules for Spanish (ES)
Summary:
Spanish (ES) will now have all the same quantity rules as English (EN) (which I think is the most-supported language), plus more.

This includes the following:
* bowls - (bol(es)?|tazón(es)?|cuencos?|platos? (soperos?)|(hondos?)) (EN does not currently have this)
* cups - (tazas?)
* dishes - (platos?|fuentes?) (EN does not currently have this)
* grams - (((m(ili)?)|(k(ilo)?))?g(ramo)?s?)
* ounces - ((onzas?)|oz)
* pints - (pintas?) (EN does not currently have this)
* pounds - ((lb|libra)s?)
* quarts - (cuartos? de galón) (EN does not currently have this)
* tablespoons - (cucharadas? (grande)?) (EN does not currently have this)
* teaspoons - (cucharaditas?) (EN does not currently have this)

Reviewed By: patapizza

Differential Revision: D24628214

fbshipit-source-id: 2e8d500661f30fa0928cb7d3f21470afc01e2285
2020-11-09 11:18:35 -08:00
Tpt
888b1cba35 Dockerfile: debugs the build and uses Debian Buster everywhere (#539)
Summary:
The Dockerfile build part did not copy the Duckling implementation into the container, making the build fail.

I also harmonized the target Debian to Buster, which is the one currently hidden behind `haskell:8`.

Pull Request resolved: https://github.com/facebook/duckling/pull/539

Reviewed By: patapizza

Differential Revision: D24688839

Pulled By: chessai

fbshipit-source-id: 0ffcc4d28a599b7edad668730117828d26e116ad
2020-11-02 13:33:00 -08:00
Victor Pothin
bfc75849d2 Adds new rules of accentuation of the Portuguese (#531)
Summary:
Keeps accents consistent: "quinquagésimo" no longer takes a "Ü".

Pull Request resolved: https://github.com/facebook/duckling/pull/531

Reviewed By: patapizza

Differential Revision: D23770703

Pulled By: chessai

fbshipit-source-id: f8a34c02028faf9f51eca6a016b5bad988a83f04
2020-11-02 12:17:57 -08:00
Daniel Cartwright
01b812b69c Update dependencies/CI
Summary:
This PR accomplishes several things:

- removes dist-newstyle (local build artifacts should not be checked in)
- extends the .gitignore to include many common build artifacts/editor artifacts
- allow more modern dependencies (upper bounds of many were out of date by one or two years' worth of releases)
- upgrade stack lts (9.2 -> 14.2) to GHC 8.6.5
- regenerate .travis.yml using the now-standard haskell-ci (many haskell core libraries use this), instead of the outdated script that was maintained by hvr; as a precursor to this, the tested-with versions were updated

Reviewed By: patapizza

Differential Revision: D24623967

fbshipit-source-id: 838fe571df0b8d44106349659ce8ce8ab82f0bc6
2020-10-29 11:02:49 -07:00
Josef Svenningsson
7889f396f3 Remove dependency on Data.Some (#533)
Summary:
Pull Request resolved: https://github.com/facebook/duckling/pull/533

In recent versions of Data.Some the constructor `This` has been renamed to `Some`. This has become rather problematic for us to migrate, so we're just going to remove the dependency. The meat of this diff is adding the type `Seal` to `Duckling.Types`. That type replaces `Some`.

Reviewed By: pepeiborra

Differential Revision: D23929459

fbshipit-source-id: 8ff4146ecba4f1119a17899961b2d877547f6e4f
2020-09-28 01:33:01 -07:00
Julien Odent
7ba9ea8aeb Time/EN: Fix empty group match
Summary: sad_palpatine

Differential Revision: D23718913

fbshipit-source-id: 363bf9a43d8d1cd77405882bc70a7fa1a1de2dbe
2020-09-15 17:22:00 -07:00
Julien Odent
ef2b1b1b0e Time/FR: Some speed up
Summary: Guarding against grains, shortening regexes.

Reviewed By: jtliao

Differential Revision: D23387716

fbshipit-source-id: de84d0efa79c4ae10bd9fbf14e82a724fee1a1f2
2020-08-28 09:48:15 -07:00
Arjan Scherpenisse
df2ada617a NL/Duration: Add "anderhalf uur" (#502)
Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/502

Reviewed By: patapizza

Differential Revision: D22260625

Pulled By: haoxuany

fbshipit-source-id: bf44fdab7def19f6dd0e0ef7763c112a3b024396
2020-08-05 15:34:05 -07:00
Julien Odent
3d5e1c3bad Time/DE: Don't parse "so"
Summary:
"so" is an adverb in German: https://github.com/wit-ai/wit/issues/1860
It's also a short form for "Sonntag" (Sunday); making the dot mandatory.

Reviewed By: haoxuany

Differential Revision: D22900791

fbshipit-source-id: 8dc873f79a21ca2add074f9c664e84fae56f1e67
2020-08-03 12:34:49 -07:00
Julien Odent
4846641456
Merge pull request #515 from patapizza/fixup-T70792907-master
Re-sync with internal repository
2020-07-30 15:20:39 -07:00
Julien Odent
6370f3e6f1 Re-sync with internal repository 2020-07-30 14:10:45 -07:00
James Addison
5e8277e105 Export default module name 'Main' from within TestMain.hs file (#512)
Summary:
**Summary**

**Current**
`stack test` fails with an error "output was redirected with -o, but no output will be generated
because there is no Main module"

**Expected**
`stack test` should run tests to completion

The cause here seems to be that the [`main-is` flag](a88e0669f7/duckling.cabal (L851)) supplies the *filename* in which to begin tests, but expects to find a *module* named `Main` there by default.

Two fixes are possible, either:

- [Add a ghc-options flag](https://github.com/facebook/duckling/issues/505#issue-650474748) to specify a module name; confusingly the flag name is also `main-is`
- Use the default `Main` module name within TestMain.hs

(the approach taken here is the latter, since this avoids duplicating use of flags named `main-is` in slightly different contexts)

**References**
- https://github.com/facebook/duckling/issues/505
- https://github.com/haskell/cabal/issues/4315

**Version Info**
```sh
$ stack --version
1.9.3.1 x86_64
Compiled with:
- Cabal-2.4.0.1
# <remainder of output omitted>
```

Resolves https://github.com/facebook/duckling/issues/505

Pull Request resolved: https://github.com/facebook/duckling/pull/512

Reviewed By: girifb

Differential Revision: D22799888

Pulled By: patapizza

fbshipit-source-id: 2c0808790e6671e6bc3c9b1f322e57b8dc32a8cc
2020-07-30 11:20:11 -07:00
Bing Yuan
a88e0669f7 Fixed the rule for parsing "coming <time cycle>"
Summary: Currently the term "coming" is being treated the same way as "this" or "current". The expected treatment should be the same as the term "next".

Reviewed By: chinmay87

Differential Revision: D22435156

fbshipit-source-id: b0b20d8a38014267fb7d037b685ce126f602bda7
2020-07-17 13:17:18 -07:00
Bing Yuan
5af4d617ba Fixed a problem in parsing multi-word timestamp for ES
Summary:
Current:
"seis cero cinco pm" [dimension Time] -> "cero cinco pm" or "5 pm"
here the term "seis" was dropped because it was treated as "6" in "Numeral" dimension.

Expected:
"seis cero cinco pm" -> "6:05 pm"

The root cause was that the rule "<hour-of-day> <integer> (as relative minutes)" dropped the first term "hour-of-day" if it was parsed as a latent token.

Reviewed By: chinmay87

Differential Revision: D22553028

fbshipit-source-id: abc92bb369c23d2b3084641eab2a2dabb87dbc66
2020-07-17 11:38:43 -07:00
Bing Yuan
780bd0aac5 Fixed the problem parsing "next <day-of-week>"
Summary:
If the current time is: 07/07/2020 (tuesday),
Current:
"next saturday" -> 07/11/2020
Expected:
"next saturday" -> 07/18/2020

According to
Quora (https://www.quora.com/When-is-this-Monday-and-next-Monday-Are-they-the-same#:~:text='Next%20Monday'%20is%20Monday%20of,the%20first%20Monday%20after%20today.),

the term "next saturday" means the first Saturday in the week after the current (this) week, regardless of the current day of the week.
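
A minimal sketch of this reading of "next <day-of-week>", written in Python rather than Duckling's Haskell. It assumes weeks start on Monday, and `next_day_of_week` is a hypothetical helper, not a Duckling function.

```python
from datetime import date, timedelta

def next_day_of_week(today: date, weekday: int) -> date:
    """Return the given weekday (Monday=0 .. Sunday=6) in the week after today's week."""
    # Assumption: a week runs Monday through Sunday.
    start_of_this_week = today - timedelta(days=today.weekday())
    start_of_next_week = start_of_this_week + timedelta(days=7)
    return start_of_next_week + timedelta(days=weekday)

# 07/07/2020 is a Tuesday; "next saturday" resolves to 07/18/2020, not 07/11/2020.
print(next_day_of_week(date(2020, 7, 7), 5))
```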

Reviewed By: haoxuany

Differential Revision: D22420499

fbshipit-source-id: c2bd28b9fda78ff3cb0418a50c3b302be350b02d
2020-07-15 14:47:41 -07:00
Bing Yuan
9c1ab0de69 Tweak the rule for parsing "tomorrow" in ES
Summary:
There are two rules for parsing "manana" (dimension: Time): one is resolved to "morning"; while the other is resolved to "tomorrow". And the first (or "morning") rule resolves to a LATENT result; while the second (or "tomorrow") rule resolves to a NON-LATENT result.

If Duckling is called with the "latent" option turned off, the "tomorrow" rule prevails. However, if Duckling is invoked with the "latent" option turned on, the "morning" rule is preferred.

The solution (for now) is to steer the classifier towards the "tomorrow" rule by adding a large number of (identical) examples for the "tomorrow" rule.

Reviewed By: chinmay87

Differential Revision: D22425277

fbshipit-source-id: 2f139eec0c38b9b5227f27d9f09f6264e7cf86cd
2020-07-15 12:08:20 -07:00
Bing Yuan
82e976b77d Added support for parsing year composed of multiple ES words
Summary:
The root cause is the lack of support for the composition of numerals in ES.

For example, "mil novecientos noventa" is parsed as 3 individual numbers: 1000, 900 and 90 respectively. Instead, the expected result is a single numeral value that is the sum of the aforementioned three numbers. The same expectation extends to compositions with an arbitrary number of numeral values.
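
The expected composition can be sketched as follows. This is an illustrative Python snippet, not the Duckling rule: `WORD_VALUES` and `compose_numeral` are hypothetical names, the table covers only the words in the example, and plain summation only models additive sequences like this one.

```python
# Hypothetical lookup table covering only the words used in the example.
WORD_VALUES = {"mil": 1000, "novecientos": 900, "noventa": 90}

def compose_numeral(phrase: str) -> int:
    """Sum the values of consecutive numeral words into a single numeral."""
    return sum(WORD_VALUES[word] for word in phrase.split())

print(compose_numeral("mil novecientos noventa"))  # 1990
```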

Reviewed By: chinmay87

Differential Revision: D22192034

fbshipit-source-id: 476489145b83297b82d88f3451020c867e2d08aa
2020-07-06 17:02:59 -07:00
Bing Yuan
857aa16d06 added support to parse ordinal day-of-week
Summary:
Current:
"first monday of last month" -> the date of first monday starting from current time. Note here the term "last month" is dropped

Expected:
"first monday of last month" -> the date of first monday of previous month.

Reviewed By: chinmay87

Differential Revision: D22300243

fbshipit-source-id: 16622860c52ec2ce9c7a7bcd6094192255aa5a0b
2020-07-06 15:39:57 -07:00
Bing Yuan
c7aed76c5a added new rule to handle ES phrase for next week (#497)
Summary:
Current:
"siguiente semana" -> [] // empty result

Expected:
"siguiente semana" -> "next week"
Pull Request resolved: https://github.com/facebook/duckling/pull/497

Test Plan: haxlsh> H.io $ debug (makeLocale ES Nothing) "siguiente semana" [This Time]

Reviewed By: chinmay87

Differential Revision: D22054455

Pulled By: yuanbing

fbshipit-source-id: 576e96a49eebace9b5baa382efac2e266e651d8e
2020-07-06 12:50:45 -07:00
Bing Yuan
44007b76d3 Add support for spelled out time of
Summary:
Current:
"twelve zero three" -> 12:00pm

Expected:
"twelve zero three" -> 12:03pm

The root cause was that Duckling did not support this kind of timestamp pattern: the minutes "three" were spelled as "zero three", which Duckling failed to understand.

Reviewed By: chinmay87

Differential Revision: D22313140

fbshipit-source-id: 9e481a142a16b94c61b1770e7f8be036497419f8
2020-07-06 12:17:25 -07:00
Bing Yuan
a78aacfc50 Updated the rule to parse "last <day-of-week> of <time>"
Summary:
current:
last friday in october -> the date of Friday of previous week
expected:
last friday in october -> the date of the last Friday of October
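
A small sketch of the expected resolution, in Python for illustration only. `last_weekday_of_month` is a hypothetical helper, and the year is passed explicitly, whereas Duckling would derive it from the reference time.

```python
import calendar
from datetime import date, timedelta

def last_weekday_of_month(year: int, month: int, weekday: int) -> date:
    """Return the last occurrence of weekday (Monday=0 .. Sunday=6) in the month."""
    last_day = date(year, month, calendar.monthrange(year, month)[1])
    # Step back from the end of the month to the requested weekday.
    offset = (last_day.weekday() - weekday) % 7
    return last_day - timedelta(days=offset)

# Last Friday of October 2020.
print(last_weekday_of_month(2020, 10, 4))  # 2020-10-30
```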

Reviewed By: chinmay87

Differential Revision: D22201326

fbshipit-source-id: 1983c1b9c24aa356977af7def42d5ba07c7f08be
2020-06-25 16:04:17 -07:00
Bing Yuan
36a3d2011f Added new rule to parse ES phrase for time of day (in the afternoon) (#496)
Summary:
Current:
"seis dos de la tarde" -> "dos de la tarde" or 2pm; note
that the term "seis" is dropped.

Expected:
"seis dos de la tarde" -> "seis dos de la tarde"
or 6:02pm
Pull Request resolved: https://github.com/facebook/duckling/pull/496

Test Plan: H.io $ debug (makeLocale ES Nothing) "seis dos de la tarde" [This Time]

Reviewed By: chinmay87

Differential Revision: D22054328

Pulled By: yuanbing

fbshipit-source-id: 1ecb05885fc506176cc04768aa158279c7e7fd4f
2020-06-25 15:07:32 -07:00
Bing Yuan
eb9ddcbd95 Fixed a problem in parsing ES timestamp
Summary:
There are two types of ES phrases for timestamp to support:

1. "para las seis cero dos pm"
2. "para las 6 0 2 pm"

The solution is to:
1. add a new rule to parse a two-digit number between 1 and 9 (inclusive);
2. modify the regex pattern to support an additional optional "para" in front of "las".

Reviewed By: chinmay87

Differential Revision: D22218800

fbshipit-source-id: 58f692beb6f10834c0ab639b31bf239bf4a1970e
2020-06-25 12:49:39 -07:00
Bing Yuan
1ad3a8514e added new rule to parse phrase in the pattern "xxx minutes to <hour-of-day>" (#500)
Summary:
Current:
20 minutes to 2pm tomorrow -> 20 minutes (dimension: Time)

Expected:
20 minutes to 2pm tomorrow -> 1:40pm of next day (dimension: Time)
Pull Request resolved: https://github.com/facebook/duckling/pull/500

Reviewed By: chinmay87

Differential Revision: D22200580

Pulled By: yuanbing

fbshipit-source-id: e47e5b5aaf4e3644c7032096caa75672a8543087
2020-06-25 11:21:29 -07:00
Bing Yuan
e570acd2f9 Added new rule support composite duration phrase in ES (#498)
Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/498

Test Plan:
In haxlsh:
H.io $ debug (makeLocale ES Nothing) "dos hora y treinta y cinco minutos" [This Duration]

Reviewed By: chinmay87

Differential Revision: D22054695

Pulled By: yuanbing

fbshipit-source-id: b4486141bf7ccb0e538e40ce40fadd7daef374a8
2020-06-25 09:47:32 -07:00
Bing Yuan
7b2def024e support "noon" phrase in ES
Summary:
This fix is to add support to parse alternative phrase, in ES, for "noon".
Currently the supported ES phrase for "noon" is "mediodia", the alternative form is "medio<whitespace*>dia".
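
As a rough illustration of the alternative form, the pattern can be written as a regex with optional whitespace. This is a Python sketch, not the regex used in the Duckling rule, which may also account for accents; `NOON_PATTERN` is a hypothetical name.

```python
import re

# "medio" followed by zero or more whitespace characters, then "dia".
NOON_PATTERN = re.compile(r"medio\s*dia")

for phrase in ["mediodia", "medio dia", "medio   dia"]:
    print(phrase, bool(NOON_PATTERN.fullmatch(phrase)))
```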

Reviewed By: chinmay87

Differential Revision: D22188049

fbshipit-source-id: 798b83be75798f3b0d695a0f01a65dc84af98e22
2020-06-24 16:36:05 -07:00
Bing Yuan
dddb4adf23 Updated the rule to parse ordinal day of month in ES (#495)
Summary:
the rule is updated to conform with natural expression of "ordinal day of month".
Pull Request resolved: https://github.com/facebook/duckling/pull/495

Differential Revision: D22054297

Pulled By: yuanbing

fbshipit-source-id: d9d8e00311d4d3121685ab5b09f6c1f52f3077c9
2020-06-24 11:47:22 -07:00
Bing Yuan
195a9d7aa1 Added new rule to support ES phrase for "next week". (#493)
Summary:
Please note that the major diff with the
existing rule for next week is that the new
phrase doesn't have the leading "la" or anything with
similar meaning.
Pull Request resolved: https://github.com/facebook/duckling/pull/493

Test Plan: Imported from GitHub, without a Test Plan: line.

Reviewed By: patapizza

Differential Revision: D21981169

Pulled By: yuanbing

fbshipit-source-id: 7478d1262c3a4599d359b485b28a547ad5f44b76
2020-06-24 11:02:24 -07:00
Bing Yuan
8cf3fdb581 Fix a problem with parsing ES time phrase
Summary:
The root cause was an error in parsing ES numeral values [1-9] that are spelled with two words instead of one.

For example, "cero dos" should be parsed as "dos". Currently it is parsed as two numeral values: 0 and 2.

Reviewed By: chinmay87

Differential Revision: D22162804

fbshipit-source-id: 949956935a21e742f6788e7afa788ff728dd9a8d
2020-06-22 12:03:15 -07:00
Bing Yuan
097b9260d5 Added new rules to parse phrases for upcoming weeks. (#491)
Summary:
the new rules could parse phrases in the form of
xxx upcoming weeks
upcoming xxx weeks
Pull Request resolved: https://github.com/facebook/duckling/pull/491

Test Plan: Imported from GitHub, without a Test Plan: line.

Differential Revision: D21959647

Pulled By: chinmay87

fbshipit-source-id: a062a8c7a6c2e23b921b1099b886fa589c69c454
2020-06-17 14:32:59 -07:00
Cody Ohlsen
474ae1b851 Duckling probabilistic layer bug fix
Summary:
while computing a score used to rank in Duckling, it currently sums up the log likelihoods learned during training. While ranking, the goal is to find the (same span) parse candidate which is _more_ likely to lead to a *correct* parse. However, the old logic was summing up the "more confident of the two classes" log likelihood. From what I understand this is the part which feels wrong.

I created an example of two rules:
#1. a rule where the classifier learns that the rule is very confidently NOT the correct parse.
- okdata (positive class) is very low confidence (high negative number prior)
- kodata (negative class) is very high confidence (low negative number prior)

#2. a rule where the classifier is confident that it is the correct parse, but not Very Confident.
- okdata (positive class) is high confidence (nonzero, but low negative number prior)
- kodata (negative class) is very low confidence (high negative number prior)

these two rules match the same regex, thus the same span. While duckling parses it, it turns out, that rule #1 ranks higher than rule #2. The reason why is because #1 is MORE confident that it is the INCORRECT (does not contribute to) parse than rule #2. Does this make sense?

to solve this problem, I changed the ranking score estimation to use only the positive class scores (okdata). In the example above, it fixes it so rule #2 would end up ranking higher because the positive class confidence is higher than #1's positive class confidence.

Would really love some deeper input from Duckling experts. I re-learned haskell and learned haxl to craft a small example here, and I am very new to Duckling (just started reading the ranking code on Friday). I know Duckling is battle-tested but I also don't believe that means a bug can't exist. And further, this specific bug may not happen a whole lot for 2 reasons:
- there are not a lot of rules which end up higher negative confidence than positive (requires enough negative corpus examples over positive ones)
- ranking uses span width first, and only when the spans are equivalent does the score based ranking come into play. So it requires that 2 rules match the same span before any actual score calculation even matters.
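
The two-rule example above can be reproduced with a toy model. This is a Python sketch with made-up log likelihoods, not Duckling's ranking code: it scores a single rule rather than summing over a parse tree, and `old_score` / `new_score` are hypothetical names for the behavior before and after the fix.

```python
# Made-up log likelihoods (closer to 0 means more confident).
rule_1 = {"okdata": -9.0, "kodata": -0.1}  # confidently NOT the correct parse
rule_2 = {"okdata": -0.5, "kodata": -8.0}  # fairly confident it IS the correct parse

def old_score(rule: dict) -> float:
    """Old behavior: use the more confident of the two classes."""
    return max(rule["okdata"], rule["kodata"])

def new_score(rule: dict) -> float:
    """Fixed behavior: use only the positive class."""
    return rule["okdata"]

print(old_score(rule_1) > old_score(rule_2))  # True: rule #1 wrongly ranks higher
print(new_score(rule_2) > new_score(rule_1))  # True: rule #2 now ranks higher
```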

Reviewed By: patapizza

Differential Revision: D22009276

fbshipit-source-id: 13491689d39d810da526fa4bb8b6e526d4cafd35
2020-06-12 16:06:11 -07:00
Julien Odent
326fc25737 ES/Duration: Add Copyright header to tests file
Summary: as title

Reviewed By: girifb

Differential Revision: D21998107

fbshipit-source-id: 7c1c91db9a1ebf29d702930570341dc3b6b0ce65
2020-06-11 10:17:44 -07:00
byuan
558b38c1cb Fixed the problem with parsing fractional hour phrase that contains "quarter" or "quarters" (#485)
Summary:
Current:
if a fractional hour expression describes the fraction with a term like "quarter" or "quarters", Duckling cannot recognize it correctly.
Expected:
Duckling should be able to identify this kind of expression and parse it correctly.
Fix:
Add a new rule to parse fractional hour patterns that contain a keyword like "quarter" or "quarters".
Pull Request resolved: https://github.com/facebook/duckling/pull/485

Test Plan: Imported from GitHub, without a Test Plan: line.

Reviewed By: haoxuany

Differential Revision: D21850804

Pulled By: chinmay87

fbshipit-source-id: 818b7b3f37e3f8a6d1a7d579db19fb2cfb2763f4
2020-06-10 12:19:28 -07:00
Bing Yuan
220c0f2d7d Added support for parsing new ES duration phrases like half hour, quarter of hour. (#489)
Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/489

Differential Revision: D21959268

Pulled By: chinmay87

fbshipit-source-id: 2b785b44da5437c7b27af098daef551139dad990
2020-06-09 15:16:38 -07:00
byuan
0cfd8102da Fixed problem with parsing fractional (with decimal) minutes (#484)
Summary:
Current behavior
sentence with pattern "xxx.yyy minutes" parsed as yyy minutes.

Expected behavior:
xxx.yyy minutes = 60*xxx+0.yyy*60 seconds

For example:
"15.5 minutes" = 60*15 + 0.5*60 = 930 seconds
Pull Request resolved: https://github.com/facebook/duckling/pull/484
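
The expected conversion above can be checked with a few lines of Python. This is an illustrative sketch, not the Duckling rule; `minutes_to_seconds` is a hypothetical helper that takes the number as text.

```python
from decimal import Decimal

def minutes_to_seconds(text: str) -> int:
    """Convert a decimal number of minutes, given as text, into whole seconds."""
    minutes = Decimal(text)
    whole = int(minutes)        # the xxx part
    fraction = minutes - whole  # the 0.yyy part
    return int(60 * whole + fraction * 60)

print(minutes_to_seconds("15.5"))  # 930
```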

Reviewed By: haoxuany

Differential Revision: D21850782

Pulled By: chinmay87

fbshipit-source-id: c007901d4dd6476e5e383a13892ecff9b2191fff
2020-06-09 14:51:09 -07:00
Bing Yuan
33aa18dca8 Added new rule for "midday" (#490)
Summary:
added new EN rule to parse the phrases that contain "midday".
Pull Request resolved: https://github.com/facebook/duckling/pull/490

Differential Revision: D21959562

Pulled By: chinmay87

fbshipit-source-id: f9ab45aecd551e8959d00b0025ed38b616ed6b14
2020-06-09 14:51:08 -07:00
byuan
596bf62888 Fixed a problem with parsing "day of month" that contains "dia" in it (#487)
Summary:
Current:

"el dia nueve" -> "9pm" of current day

Expected:
"el dia nueve" -> 9th of current or next month

Fix:

added new ES rule to handle the pattern like "el dia  <day of month>"
Pull Request resolved: https://github.com/facebook/duckling/pull/487

Reviewed By: girifb

Differential Revision: D21850807

Pulled By: chinmay87

fbshipit-source-id: d8edd81273c7e5f700b440ccc8c7e7bded679051
2020-06-09 14:51:08 -07:00
byuan
4cfe88ead1 Fixed a problem with parsing fractional time phrase for hours and minutes. (#483)
Summary:
Current behavior:
"an hour and 45 minutes" -> parsed as "1 hour" [dimension: "Duration"]
"a minute and 30 seconds" ->parsed as "1 minute" [dimension: "Duration"]

Expected behavior:
"an hour and 45 minutes" -> "105 minutes" with dimension as "Duration"
"a minute and 30 seconds" -> "90 seconds" with dimension as "Duration"

The fix:

adding new rule to handle this duration composition
pattern. (<some duration> and <some other duration>)
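
One way to picture this composition is to convert both durations to seconds, add them, and express the total in the finer of the two grains. This is a Python sketch under that assumption; `SECONDS_PER` and `compose_durations` are hypothetical names, not Duckling identifiers.

```python
# Seconds per grain, for the grains used in the examples above.
SECONDS_PER = {"second": 1, "minute": 60, "hour": 3600}

def compose_durations(first: tuple, second: tuple) -> tuple:
    """Combine two (value, grain) durations into one, using the finer grain."""
    finer = min(first[1], second[1], key=SECONDS_PER.get)
    total_seconds = first[0] * SECONDS_PER[first[1]] + second[0] * SECONDS_PER[second[1]]
    return (total_seconds // SECONDS_PER[finer], finer)

print(compose_durations((1, "hour"), (45, "minute")))    # (105, 'minute')
print(compose_durations((1, "minute"), (30, "second")))  # (90, 'second')
```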
Pull Request resolved: https://github.com/facebook/duckling/pull/483

Reviewed By: haoxuany

Differential Revision: D21850773

Pulled By: chinmay87

fbshipit-source-id: 62eb6859e0ce2b88cf8ae48d836a1a6a1ac8705d
2020-06-05 13:01:48 -07:00
byuan
1dac46a8ce Time/es: Make "n horas" latent. (#478)
Summary:
1. ~~Fixed broken build due to the problem with main test entry point;~~
2. Fixed the ambiguous results caused by mishandling the
ranking rules for parsing frames in ES. For example, "una hora" can
be interpreted either as "Duration" or as "1pm" in the "Time" dimension.
The expected result should be in the "Duration" dimension.
3. ~~ignore stack lock file~~
Pull Request resolved: https://github.com/facebook/duckling/pull/478

Test Plan:
```
:test Endpoint.Duckling.Tests --hide-successes
[1003 of 1003] Endpoint.Duckling.Tests (Duckling.Api changed)
Ok, two modules loaded.

All 357 tests passed (79.69s)
```

```
haxlsh> H.io $ debug (makeLocale ES Nothing) "de una horas" [This Time, This Duration]
<integer> <unit-of-duration> (una horas)
-- number (0..15) (una)
-- -- regex (una)
-- hora (grain) (horas)
-- -- regex (horas)
[Entity {dim = "duration", body = "una horas", value = RVal Duration (DurationData {value = 1, grain = Hour}), start = 3, end = 12, latent = False, enode = Node {nodeRange = Range 3 12, token = Token Duration (DurationData {value = 1, grain = Hour}), children = [Node {nodeRange = Range 3 6, token = Token Numeral (NumeralData {value = 1.0, grain = Nothing, multipliable = False, okForAnyTime = True}), children = [Node {nodeRange = Range 3 6, token = Token RegexMatch (GroupMatch ["una","","a","","",""]), children = [], rule = Nothing}], rule = Just "number (0..15)"},Node {nodeRange = Range 7 12, token = Token TimeGrain Hour, children = [Node {nodeRange = Range 7 12, token = Token RegexMatch (GroupMatch ["ora"]), children = [], rule = Nothing}], rule = Just "hora (grain)"}], rule = Just "<integer> <unit-of-duration>"}}]
it :: [Entity]
```

Reviewed By: fascpt

Differential Revision: D21770015

Pulled By: chinmay87

fbshipit-source-id: 3056fcf656140c9d65b70b5c604a286ea2c307b2
2020-05-29 11:09:46 -07:00
Pranas Kiziela
a93cae1c02 Improve Docker build (#341)
Summary:
* Reduces size of final image from 5GB to 130MB
* Builds any checkout (not locked to the master)
* Doesn't run stack on CMD (executes static build of Duckling instead)
Pull Request resolved: https://github.com/facebook/duckling/pull/341

Reviewed By: chinmay87

Differential Revision: D21083018

Pulled By: patapizza

fbshipit-source-id: d909158f20f5b8da5b0248a25103b850797bc3a3
2020-04-17 08:22:43 -07:00
Steven Troxler
4bfe50eed0 AmountOfMoney/EN: Make ruleIntervalMax, ruleIntervalMin symmetric
Summary:
When I was working on some related diffs, I noticed that there were some
asymmetries between the regexes for ruleIntervalMax and ruleIntervalMin:
 - we had no support for "at most", even though we did have "at least"
 - we had no support for "not? less than"
 - the ordering of the different constructions didn't match

This is a minor tweak to make things match better

Reviewed By: patapizza

Differential Revision: D20484594

fbshipit-source-id: c3c54a9cc1b83402e42634b7a98a1a3b8cc5e09c
2020-04-07 10:32:29 -07:00
Cameron Sheikholeslami
9b69b1fc96 Fix to dockerfile so PCRE regex works. (#467)
Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/467

Reviewed By: chinmay87

Differential Revision: D20700248

Pulled By: patapizza

fbshipit-source-id: 17f933106c6f18fcd93b73f42af458220d93b6cf
2020-03-30 09:48:07 -07:00
Chinmay Deshmukh
d91a2dd4c0 Time/es: Fix ruleYearLatent
Summary: Fix `ruleYearLatent` to be the same as the one in `en`. We don't want to match numerals that could have been hours.

Reviewed By: patapizza

Differential Revision: D20683975

fbshipit-source-id: cdef9b1b5f8a21dc5e207ed2a7afcad84c56a596
2020-03-27 15:07:22 -07:00
Julien Odent
7166286a6f Add type=value to JSON response for Email, PhoneNumber and Url
Summary: For consistency.

Reviewed By: jtliao

Differential Revision: D20524369

fbshipit-source-id: 44031667adccab9bca7b3b6d42c80878bb96ccae
2020-03-18 17:04:42 -07:00
Steven Troxler
5d6750208a Duration: Rename timesOneAndAHalf to nPlusOneHalf
Summary:
When I first skimmed our rules for "half an hour" vs "an hour and a half"
I actually thought there might be a bug, because `timesOneAndAHalf`
sounds like it's actually multiplying by `1.5`.

There's no bug, the implementation is entirely correct, but it does
not multiply by 1.5, it adds .5 to any integer value at the given grain.
This diff renames the function to be more descriptive.
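
The distinction is easy to see numerically. The Python sketch below only contrasts the two readings of the old name; it is not the Haskell function, and it ignores grains entirely.

```python
def n_plus_one_half(n: int) -> float:
    """What the function actually computes: n and a half."""
    return n + 0.5

def times_one_and_a_half(n: int) -> float:
    """What the old name suggested: n multiplied by 1.5."""
    return n * 1.5

# The two agree only for n = 1 ("an hour and a half").
for n in (1, 2, 3):
    print(n, n_plus_one_half(n), times_one_and_a_half(n))
```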

Handy trick for doing this kind of refactor without IDE tooling:
```
find duckling/Duckling/Duration/ -name 'Rules.hs'| xargs sed -i 's/timesOneAndAHalf/nPlusOneHalf/g'
```

Reviewed By: haoxuany

Differential Revision: D20456966

fbshipit-source-id: 35020685f091a41618b30b7e5f95dbfa48509b88
2020-03-16 12:17:00 -07:00
Ciaran O'Reilly
04d0db6efa Numeral/EN: Fixes ambiguous parses when both ruleNegative and ruleMultiply apply (#406)
Summary:
I noticed two ambiguous parses would occur when both ruleNegative and ruleMultiply would apply.

For example: "minus three million two hundred thousand"

```
*Duckling.Debug> debug (makeLocale EN Nothing) "minus three million two hundred thousand" [This Numeral]
compose by multiplication (minus three million two hundred thousand)
-- negative numbers (minus three million two hundred)
-- -- regex (minus)
-- -- intersect 2 numbers (three million two hundred)
-- -- -- compose by multiplication (three million)
-- -- -- -- integer (0..19) (three)
-- -- -- -- -- regex (three)
-- -- -- -- powers of tens (million)
-- -- -- -- -- regex (million)
-- -- -- compose by multiplication (two hundred)
-- -- -- -- integer (0..19) (two)
-- -- -- -- -- regex (two)
-- -- -- -- powers of tens (hundred)
-- -- -- -- -- regex (hundred)
-- powers of tens (thousand)
-- -- regex (thousand)
negative numbers (minus three million two hundred thousand)
-- regex (minus)
-- intersect 2 numbers (three million two hundred thousand)
-- -- compose by multiplication (three million)
-- -- -- integer (0..19) (three)
-- -- -- -- regex (three)
-- -- -- powers of tens (million)
-- -- -- -- regex (million)
-- -- compose by multiplication (two hundred thousand)
-- -- -- compose by multiplication (two hundred)
-- -- -- -- integer (0..19) (two)
-- -- -- -- -- regex (two)
-- -- -- -- powers of tens (hundred)
-- -- -- -- -- regex (hundred)
-- -- -- powers of tens (thousand)
-- -- -- -- regex (thousand)
```

This PR fixes this ambiguity and Duckling will only return the second (correct) parse.
Pull Request resolved: https://github.com/facebook/duckling/pull/406

Test Plan:
regen'd classifiers (no-op)

  :test Duckling.Tests

Imported from GitHub, without a `Test Plan:` line.

Reviewed By: chinmay87, girifb

Differential Revision: D20303354

Pulled By: patapizza

fbshipit-source-id: 280b0e33b7c944f9d87a7c23afda2f6a843e28a4
2020-03-13 11:40:02 -07:00
Steven Troxler
e3114c08f5 AmountOfMoney/ES: Add support for intervals
Summary:
This change applies roughly the same rules for supporting intervals
in Spanish AmountOfMoney that we support in English: intervals using
`entre _ e _` / `de _ a _` / `_ - _` with either money in both slots
or a number in the first slot and money in the second.

My Spanish is okay but not great - I'm confident these rules are good and
cover the most likely phrases, but there's probably room to add more coverage.

Reviewed By: patapizza

Differential Revision: D20425979

fbshipit-source-id: deb17fc331e1aa192d91dd47bc7f3864a246f0be
2020-03-13 11:21:45 -07:00
Ashwini Challa
2f38255cf8 Enabling TimeGrain (#460)
Summary:
Pull Request resolved: https://github.com/facebook/duckling/pull/460

Exposing the TimeGrain feature

Reviewed By: patapizza

Differential Revision: D20250270

fbshipit-source-id: 726f85eebd95ae31d911ebd9a43428d549aba877
2020-03-05 12:50:20 -08:00