254 Commits

Author SHA1 Message Date
Julien Odent
7364955b35 Add lang to Quickstart
Summary: Fixes #77.

Reviewed By: JonCoens

Differential Revision: D5574087

fbshipit-source-id: 46ccf86
2017-08-07 10:04:26 -07:00
Lily Li
9801c99eeb Quantity: add weight dimensions
Summary:
Fixes #75.

I created a list of regex expressions and their corresponding units. I then mapped over them to create two rules for each expression: one for numeral quantities and one for single (a/an) quantities.
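The approach described above can be sketched as follows. This is a minimal Python illustration, not Duckling's Haskell code: the patterns, unit names, and the helpers `make_rules` and `parse_quantity` are all hypothetical.

```python
import re

# Hypothetical (pattern, unit) pairs; the real list lives in Duckling's Haskell rules.
UNIT_PATTERNS = [
    (r"(kilograms?|kgs?)", "kilogram"),
    (r"(grams?|g)", "gram"),
    (r"(pounds?|lbs?)", "pound"),
    (r"(ounces?|oz)", "ounce"),
]


def make_rules(pattern, unit):
    """Build the two rules for one unit: '<number> <unit>' and 'a/an <unit>'."""
    numeral_re = re.compile(r"(\d+(?:\.\d+)?)\s*" + pattern + r"\b", re.IGNORECASE)
    single_re = re.compile(r"\ban?\s+" + pattern + r"\b", re.IGNORECASE)

    def numeral_rule(text):
        m = numeral_re.search(text)
        return {"value": float(m.group(1)), "unit": unit} if m else None

    def single_rule(text):
        m = single_re.search(text)
        return {"value": 1.0, "unit": unit} if m else None

    return [numeral_rule, single_rule]


# Map over the list so that every unit contributes exactly two rules.
RULES = [rule for pattern, unit in UNIT_PATTERNS for rule in make_rules(pattern, unit)]


def parse_quantity(text):
    """Return the result of the first rule that matches, or None."""
    for rule in RULES:
        result = rule(text)
        if result is not None:
            return result
    return None
```

Adding a unit then only requires one more entry in the list; both rule variants are generated from it.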

Reviewed By: patapizza

Differential Revision: D5532950

fbshipit-source-id: 63e35bd
2017-08-01 13:49:26 -07:00
Julien Odent
a84eb62180 Time: Fix nthDOWOfMonth
Summary:
Fixes #65.
* fixes US holidays
* Black Friday is actually the first day after Thanksgiving day (not necessarily the fourth Friday of November)
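
The Black Friday point can be checked with a small date calculation. The sketch below is plain Python, not Duckling's `nthDOWOfMonth`; the function names are illustrative, and it assumes US Thanksgiving falls on the fourth Thursday of November.

```python
from datetime import date, timedelta

THURSDAY, FRIDAY = 3, 4  # datetime convention: Monday is 0


def nth_dow_of_month(year, month, weekday, n):
    """Return the n-th (1-based) occurrence of a weekday in the given month."""
    first = date(year, month, 1)
    offset = (weekday - first.weekday()) % 7  # days until the first such weekday
    return first + timedelta(days=offset + 7 * (n - 1))


def thanksgiving(year):
    """Fourth Thursday of November."""
    return nth_dow_of_month(year, 11, THURSDAY, 4)


def black_friday(year):
    """The day after Thanksgiving."""
    return thanksgiving(year) + timedelta(days=1)
```

In 2013, November started on a Friday, so the fourth Friday was November 22 while Thanksgiving was November 28 and Black Friday November 29.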

Reviewed By: JonCoens

Differential Revision: D5533906

fbshipit-source-id: 1824cba
2017-08-01 09:04:31 -07:00
Julien Odent
ef461c3133 Time/FR: don't parse 'a un'
Summary:
In French, the form "at hh" is not valid (it requires an hour indicator).
This fixes false positives such as in "John a un rendez-vous."

Fixes https://github.com/wit-ai/wit/issues/666.

Reviewed By: JonCoens

Differential Revision: D5530713

fbshipit-source-id: ecee1e5
2017-08-01 08:49:41 -07:00
Julien Odent
8710b9d59b Time/EN: for duration from time
Summary: Fixes #57.

Reviewed By: blandinw

Differential Revision: D5495075

fbshipit-source-id: 018fd29
2017-07-26 11:49:44 -07:00
Julien Odent
7a6c2597af Time: don't shift duration on <duration> before/after <time> (but now)
Summary:
* `Duration` before/after `Time` now resolves with the lowest grain
* "now" has an undefined grain `NoGrain`, as depending on the context it might mean different things, as opposed to "right now"

Before:
`day after tomorrow` -> `day` grain
`1 day after tomorrow` -> `hour` grain

Given that the reference date/time is `2013-02-12T04:30:00`.
`one year from now` -> `2014-02-01T00:00:00` with `month` grain.
`one year from today` -> `2014-02-01T00:00:00` with `month` grain.

After:
`day after tomorrow` -> `day` grain
`1 day after tomorrow` -> `day` grain
`one year from now` -> `2014-02-12T04:30:00` with `month` grain (remains the same).
`one year from today` -> `2014-02-12T00:00:00` with `day` grain.

For other `Time` entities involving `Duration`, such as "in + `Duration`", the behavior remains the same: shift to the lower grain (the intent is not precise).
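
A rough Python sketch of the "keep the finer grain, don't shift" idea for `Duration` after `Time`. It is an assumption-laden illustration only: the grain list, `finer`, and `shift` are hypothetical names, and the `NoGrain` handling for "now" is not modeled.

```python
from datetime import datetime, timedelta

# Grains from finest to coarsest; an illustrative subset, not Duckling's full list.
GRAINS = ["second", "minute", "hour", "day", "month", "year"]


def finer(a, b):
    """Return whichever of the two grains is finer."""
    return min(a, b, key=GRAINS.index)


def shift(value, grain, amount, duration_grain):
    """Apply '<amount> <duration_grain> after <value>' without rounding the result."""
    if duration_grain == "day":
        shifted = value + timedelta(days=amount)
    elif duration_grain == "year":
        # Naive year arithmetic; raises for Feb 29 when the target year is not a leap year.
        shifted = value.replace(year=value.year + amount)
    else:
        raise ValueError("only 'day' and 'year' are handled in this sketch")
    return shifted, finer(grain, duration_grain)


today = datetime(2013, 2, 12)  # the reference 2013-02-12T04:30:00 truncated to its day
tomorrow = today + timedelta(days=1)
```

Under these assumptions, `shift(today, "day", 1, "year")` yields `2014-02-12T00:00:00` with `day` grain, in line with the `one year from today` example, and `shift(tomorrow, "day", 1, "day")` keeps the `day` grain.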

Reviewed By: l5t, blandinw

Differential Revision: D5467164

fbshipit-source-id: b63b6a4
2017-07-26 11:49:44 -07:00
Bartosz Nitka
ed834ee182 Bump to 0.1.2.0
Summary: Prepare for new hackage release.

Reviewed By: patapizza

Differential Revision: D5488945

fbshipit-source-id: ef6dedf
2017-07-25 07:49:23 -07:00
Bartosz Nitka
2e89c3b33d Test with GHC 8.2.1
Summary: GHC 8.2.1 has been released.

Reviewed By: JonCoens

Differential Revision: D5478409

fbshipit-source-id: e065ed3
2017-07-24 09:50:05 -07:00
Bartosz Nitka
c208ae01bc Remove redundant version bounds
Summary:
Repeating version bounds on executables that depend on
`duckling` library is needless bureaucracy.

Reviewed By: JonCoens

Differential Revision: D5478388

fbshipit-source-id: e18f1b8
2017-07-24 09:50:05 -07:00
chiralcarbon
ce0e9e4f50 Add MM/YYYY and MM/YY
Summary:
MM/YY is a common format for dates in India, the UK, and other parts of the world. Added test cases in `Time/EN/corpus.hs`; however, this conflicts with one of the original examples (2/15 is now output as Feb. 2015 and not the 15th of February).
Closes https://github.com/facebookincubator/duckling/pull/59
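
To illustrate the ambiguity, here is a minimal Python sketch of a month/year reading. It is not the rule added in this commit: the regex, the `parse_month_year` helper, and the choice to expand two-digit years into the 2000s are assumptions made for the example.

```python
import re

# Month 1-12, then a slash, then a four-digit or two-digit year.
MONTH_YEAR = re.compile(r"\b(0?[1-9]|1[0-2])/(\d{4}|\d{2})\b")


def parse_month_year(text, century=2000):
    """Return (year, month) for the first MM/YYYY or MM/YY match, else None."""
    m = MONTH_YEAR.search(text)
    if m is None:
        return None
    month, year = int(m.group(1)), int(m.group(2))
    if year < 100:
        year += century  # assumption: two-digit years belong to the given century
    return (year, month)
```

With this reading, `2/15` comes out as February 2015, which is exactly the interpretation that competes with "15th of February".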

Reviewed By: niteria

Differential Revision: D5455881

Pulled By: patapizza

fbshipit-source-id: 23b73a5
2017-07-20 11:20:02 -07:00
Julien Odent
a55629d541 Fix #53
Summary: Added a rule to handle "from <month> dd-dd" to fix #53.

Reviewed By: blandinw

Differential Revision: D5329214

fbshipit-source-id: a5f746d
2017-06-27 12:49:23 -07:00
Julien Odent
bfb6ba0387 Numeral flag for Time patterns
Summary:
Today things like `at single`, `at a few`, `at a couple of` would return a `Time`.
Discussed with blandinw to do this very explicit hack right now until other use cases show up.

Reviewed By: niteria

Differential Revision: D5325369

fbshipit-source-id: aec0402
2017-06-27 07:34:21 -07:00
Andrew Farmer
068e23db13 Prepare for DuplicateRecordFields
Summary:
The one restriction on using DuplicateRecordFields is that record
selectors have to be imported under their constructor, instead of as
top-level functions. Do this for si_sigma so D5242707 passes the compat
check.

Reviewed By: watashi

Differential Revision: D5326634

fbshipit-source-id: 74ec0dd
2017-06-26 22:34:21 -07:00
Jim Regan
8af3ae5d8a also match old-style -adh ending
Summary: Closes https://github.com/facebookincubator/duckling/pull/47

Reviewed By: niteria

Differential Revision: D5318111

Pulled By: patapizza

fbshipit-source-id: 963450b
2017-06-26 09:19:20 -07:00
André
3ec5390ac2 numerals between 100 and 999 in Portuguese fixed
Summary:
I fixed some bugs I found in Portuguese. This is my first attempt to contribute, so let me know if there's anything I could do better next time! Thanks! Awesome project!
Closes https://github.com/facebookincubator/duckling/pull/56

Differential Revision: D5318968

Pulled By: patapizza

fbshipit-source-id: 94ff30f
2017-06-26 08:19:28 -07:00
Julien Odent
45822009e1 Time: weekend helper
Summary: `weekend` production helper.

Reviewed By: blandinw

Differential Revision: D5302012

fbshipit-source-id: bb1f234
2017-06-23 07:04:27 -07:00
chao pan
efc1f36494 fix issue #50
Summary: Closes https://github.com/facebookincubator/duckling/pull/52

Reviewed By: blandinw

Differential Revision: D5296543

Pulled By: patapizza

fbshipit-source-id: 041844a
2017-06-23 07:04:27 -07:00
Sebastian Mika
291bd28873 Various smaller DE time improvements
Summary:
This PR contains various smaller but - at least on my data - important performance improvements for matching of German time and time range expressions.

I evaluated this on approx 11.000 time and time range expressions taken from emails (rather formal business travel requests) that have been manually annotated with the "true" time. Comparing this branch to the current master (`d6f8dd`) I get e.g. approx. 80% of the duckling results within +/- 1h of the true value (hours are the smallest grain in my data), vs. only 70% in the master. Other indicators I checked (time/range confusion, other thresholds, failures to find anything in the first place, etc.) were all improved as well.

**Changes**:

* [significant performance plus] added a rule `ruleDateDateInterval` that handles variations of "13.-15.10." correctly. Here the common case is that "13." refers to "13.10." and not "13.CURRENTMONTH". I didn't see an obvious way to fix that in the `<datetime> - <datetime>` rule.
* [significant performance plus] In `ruleMmdd` (which matches expressions like "13.03." in German), I made the last dot optional. At least in less formal text this is quite common to be forgotten. Also here and in `ruleDateDateInterval` I changed the order of the terms in the regular expression matching the month to prefer matching e.g. "10" over matching "1"+"no dot".
* [minor] treat "14/15Uhr" the same as "14-15Uhr"
* [minor] Extended "bis" to also match "bis zum" and "auf den" (e.g. in "von Montag bis zum Freitag" or "von Dienstag auf Mittwoch")
* [minor] Changed `hh:mm` matching to also get the rather esoteric expression "17h00" - should do no harm.
Closes https://github.com/facebookincubator/duckling/pull/54
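
The interaction between the optional trailing dot and the order of the month alternatives can be reproduced with a short Python sketch. The patterns below are simplified stand-ins, not the ones in `ruleMmdd`, and the sketch assumes an engine that, like Python's `re`, tries alternatives from left to right.

```python
import re

# Mandatory trailing dot: the alternation order does not matter yet,
# because a wrong first choice fails on the dot and the engine backtracks.
STRICT = re.compile(r"(\d{1,2})\.(0?[1-9]|1[0-2])\.")

# Optional trailing dot with the short alternative first: "1" is accepted too early.
SHORT_FIRST = re.compile(r"(\d{1,2})\.(0?[1-9]|1[0-2])\.?")

# Optional trailing dot with the two-digit months first: "10" wins over "1".
LONG_FIRST = re.compile(r"(\d{1,2})\.(1[0-2]|0?[1-9])\.?")


def day_month(pattern, text):
    """Return (day, month) from the first match, or None."""
    m = pattern.search(text)
    return (int(m.group(1)), int(m.group(2))) if m else None
```

Here `day_month(SHORT_FIRST, "13.10")` reads the month as 1, while `day_month(LONG_FIRST, "13.10")` reads it as 10, which is why reordering goes hand in hand with making the dot optional.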

Reviewed By: blandinw

Differential Revision: D5301815

Pulled By: patapizza

fbshipit-source-id: 8766caf
2017-06-23 07:04:27 -07:00
Bartosz Nitka
f48b536b1e Add example modules for benchmarking and profiling
Summary:
This adds two new targets:
* `:duckling-expensive` - meant to have inputs for which running Duckling is expensive
* `:duckling-request-sample` - meant to have a random sample of inputs

The reason to have 2 of them is that they measure different things.
`:duckling-expensive` is correlated with failures,
`:duckling-request-sample` is correlated with cost.

I intend to add basic instruction on how to use them for
benchmarking/profiling soon.

Reviewed By: patapizza

Differential Revision: D5301554

fbshipit-source-id: f73fd85
2017-06-22 09:19:18 -07:00
Bartosz Nitka
65f8ec170a Don't double build on travis
Summary:
We build `duckling` again to test the source distribution.
It's a low-signal check that's too costly at the moment.

Reviewed By: patapizza

Differential Revision: D5301108

fbshipit-source-id: cc651a4
2017-06-22 06:34:53 -07:00
tarung-ml
d6f8ddc064 Support dates of type mm-dd
Summary:
E.g. "New York from 10-6 to 10-22" currently extracts HH-MM. Instead, it should extract mm-dd, i.e. October 6th to October 22nd.
Closes https://github.com/facebookincubator/duckling/pull/48

Reviewed By: niteria

Differential Revision: D5292473

Pulled By: patapizza

fbshipit-source-id: 04f1a4b
2017-06-22 04:04:26 -07:00
Adrien Menella
3e37bd0f71 Add rules for FR Time
Summary:
Add / modify rules to support:
  * `début|milieu|fin de matinée` (`early|mid|late morning`)
  * `début|milieu|fin de après-midi` (`early|mid|late afternoon`)
  * `début|milieu|fin de journée` (`early|mid|late day`)
  * `début|fin de soirée` (`early|late evening`)
  * `début|fin de mois` (`early|late month`)
  * `début|fin d'année` (`early|late year`)
  * `plus tard` (`later`)
  * `plus tard que ...` (`later than ...`)
  * `plus tard <time>` (`later than <time>`)
  * `plus tard <part-of-day>` (`later than <part-of-day>`)

Also add corpus examples for FR Time.
Closes https://github.com/facebookincubator/duckling/pull/49

Reviewed By: blandinw

Differential Revision: D5292472

Pulled By: patapizza

fbshipit-source-id: 2f40d29
2017-06-21 15:04:27 -07:00
Julien Odent
213f94dda7 Time: don't parse 'this (past )?one'
Summary:
'one' is a latent time of day.
Restricting a couple of rules to accept non-latent time tokens.

Reviewed By: blandinw

Differential Revision: D5293972

fbshipit-source-id: 07cdb9b
2017-06-21 15:04:27 -07:00
Daniel Rodríguez
36808e6086 HashMap lookups for large regexes.
Summary:
Transform large case matches into HashMap lookups.

Add an extra example for a rule set that wasn't tested before.
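
The idea of replacing a large case match with a map lookup can be sketched in Python as follows. The word list and the `parse_number_word` helper are made up for illustration; Duckling's own code is Haskell and uses `HashMap`.

```python
import re

# One table holds every word and its value; a dict lookup replaces a long
# chain of per-word branches.
WORD_TO_VALUE = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}

# The regex is generated from the same table, so the two cannot drift apart.
NUMBER_WORD = re.compile(r"\b(" + "|".join(WORD_TO_VALUE) + r")\b", re.IGNORECASE)


def parse_number_word(text):
    """Return the value of the first number word in the text, or None."""
    m = NUMBER_WORD.search(text)
    if m is None:
        return None
    return WORD_TO_VALUE.get(m.group(1).lower())
```

The lookup stays constant-time on average as the table grows, whereas a chain of comparisons scales with the number of words.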

Reviewed By: patapizza

Differential Revision: D5253349

fbshipit-source-id: 303dbca
2017-06-19 11:34:18 -07:00
Julien Odent
4a1f78a9f7 bump to 0.1.1.0
Summary: Prepare for new hackage release.

Reviewed By: niteria

Differential Revision: D5267168

fbshipit-source-id: ac51db8
2017-06-16 13:04:29 -07:00
Şeref R.Ayar
b0574191ac Support early/mid/late <month> #35
Summary: Closes https://github.com/facebookincubator/duckling/pull/43

Reviewed By: blandinw

Differential Revision: D5264974

Pulled By: patapizza

fbshipit-source-id: 0330f2b
2017-06-16 12:04:29 -07:00
Julien Odent
3ec2228eac Setup logs
Summary: Fixes #41

Reviewed By: niteria

Differential Revision: D5242794

fbshipit-source-id: cd53bd6
2017-06-14 02:04:26 -07:00
Julien Odent
b943111b4f Intervals for AmountOfMoney
Summary:
* Supports "a" + currency (e.g. "a dollar")
* Supports intervals (e.g. "10-20 dollars")
* Supports open intervals (e.g. "above 3 dollars", "less than 3 dollars")
* Follows `Time` format

Reviewed By: blandinw

Differential Revision: D5233766

fbshipit-source-id: 57cb6a8
2017-06-13 16:34:21 -07:00
Şeref R.Ayar
8711df5047 change json response #12
Summary:
not sure about this. Maybe I need some guidance.
Closes https://github.com/facebookincubator/duckling/pull/42

Reviewed By: blandinw

Differential Revision: D5228520

Pulled By: patapizza

fbshipit-source-id: 4f99cc5
2017-06-12 15:19:22 -07:00
hongwui
c6c1330ed5 Add myr
Summary:
Add MYR (Malaysian currency) to AmountOfMoney in EN.
Closes https://github.com/facebookincubator/duckling/pull/40

Reviewed By: blandinw

Differential Revision: D5218026

Pulled By: patapizza

fbshipit-source-id: 5fd179e
2017-06-09 19:49:16 -07:00
Julien Odent
cd3f3dd2f4 Time/EN: afternoonish
Summary: Apparently it's a thing.

Reviewed By: blandinw

Differential Revision: D5199823

fbshipit-source-id: d2ed2aa
2017-06-09 09:34:20 -07:00
Julien Odent
486ab645fc Quantity to accept any product for English
Summary:
* accepts any word besides meat/sugar
* allow for grams

Reviewed By: blandinw

Differential Revision: D5197073

fbshipit-source-id: f58aa54
2017-06-09 09:34:20 -07:00
Anand Bhaskar
b8277411e7 Refactor rule 'number.number hours'
Summary: Created a helper for the rule to reuse across languages.

Reviewed By: patapizza

Differential Revision: D5189741

fbshipit-source-id: 7b4dcd4
2017-06-06 09:34:22 -07:00
Heejin (macbook)
9a49b7652b KO/Time: Fix typo
Summary:
Found a common mistake.
Closes https://github.com/facebookincubator/duckling/pull/38

Reviewed By: niteria

Differential Revision: D5182189

Pulled By: patapizza

fbshipit-source-id: 182a325
2017-06-05 08:49:21 -07:00
Şeref R.Ayar
ba26ca7e91 Volume for TR
Summary: Closes https://github.com/facebookincubator/duckling/pull/34

Reviewed By: niteria

Differential Revision: D5168380

Pulled By: patapizza

fbshipit-source-id: 31d0a11
2017-06-02 12:49:20 -07:00
Julien Odent
4a1741528c Expose fromZonedTime
Summary: See #37

Reviewed By: JonCoens

Differential Revision: D5172801

fbshipit-source-id: bf63303
2017-06-02 11:19:30 -07:00
Şeref R.Ayar
b69874cd9f Duration for TR
Summary: Closes https://github.com/facebookincubator/duckling/pull/32

Reviewed By: niteria

Differential Revision: D5150778

Pulled By: patapizza

fbshipit-source-id: d156b0a
2017-05-31 02:19:40 -07:00
chao pan
292a94128f Generalize mm/dd rules to accept spaces like '05 / 27', '05/ 27'
Summary:
The existing "mm/dd" rules only accept formats like "05/27". However, in practice there might be extra spaces, as in "05 / 27" or "05/ 27". This pull request tweaks the regex to accept the extra spaces.
Closes https://github.com/facebookincubator/duckling/pull/31
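
A minimal Python sketch of the kind of tweak described, assuming simplified month and day patterns rather than the actual Duckling regex:

```python
import re

MONTH = r"(0?[1-9]|1[0-2])"
DAY = r"(3[01]|[12]\d|0?[1-9])"

# Before: the slash must touch both numbers.
STRICT = re.compile(r"\b" + MONTH + r"/" + DAY + r"\b")

# After: optional whitespace is allowed on either side of the slash.
LENIENT = re.compile(r"\b" + MONTH + r"\s*/\s*" + DAY + r"\b")


def month_day(pattern, text):
    """Return (month, day) from the first match, or None."""
    m = pattern.search(text)
    return (int(m.group(1)), int(m.group(2))) if m else None
```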

Reviewed By: niteria

Differential Revision: D5147118

Pulled By: patapizza

fbshipit-source-id: f6a5069
2017-05-30 09:49:55 -07:00
serefayar
92a3e16886 Temperature for TR
Summary: Closes https://github.com/facebookincubator/duckling/pull/30

Reviewed By: niteria

Differential Revision: D5147114

Pulled By: patapizza

fbshipit-source-id: 804f623
2017-05-30 09:34:17 -07:00
kkpoon
001ff291fc Add Cantonese support to date time into ZH
Summary:
Add [Cantonese](https://en.wikipedia.org/wiki/Cantonese) (the official spoken language used in Hong Kong) support to date time

- updated Duration ZH corpus
- updated Time ZH rules and corpus
- updated TimeGrain ZH rules
Closes https://github.com/facebookincubator/duckling/pull/24

Reviewed By: patapizza

Differential Revision: D5143947

Pulled By: niteria

fbshipit-source-id: 9107d05
2017-05-30 07:49:19 -07:00
Mohankumar Dhayalan
21c9b8ed7a HashMap lookups for large regexes
Summary: Added Hashmap lookups for Regex for Numeral/ID

Reviewed By: patapizza

Differential Revision: D5128492

fbshipit-source-id: 5ab928b
2017-05-25 11:04:18 -07:00
Şeref R.Ayar
69ce841710 Comma as decimal mark for Numeral TR
Summary: Closes https://github.com/facebookincubator/duckling/pull/28

Differential Revision: D5120967

Pulled By: patapizza

fbshipit-source-id: 41a5e4b
2017-05-24 09:04:17 -07:00
Şeref R.Ayar
6de7c2142b Distance for TR
Summary: Closes https://github.com/facebookincubator/duckling/pull/26

Reviewed By: niteria

Differential Revision: D5112142

Pulled By: patapizza

fbshipit-source-id: d71f654
2017-05-23 10:49:18 -07:00
rfranek@email.cz
b64f72eb19 updated Distance for CS
Summary: Closes https://github.com/facebookincubator/duckling/pull/25

Reviewed By: niteria

Differential Revision: D5111890

Pulled By: patapizza

fbshipit-source-id: 2b69c0c
2017-05-23 09:04:22 -07:00
Eli Sadoff
bde1e07928 Correct cents to pence for GBP
Summary:
A pound (GBP) is divided into 100 pence (or p), not cents.
Closes https://github.com/facebookincubator/duckling/pull/3

Reviewed By: niteria

Differential Revision: D5102940

Pulled By: patapizza

fbshipit-source-id: 1462440
2017-05-23 08:04:21 -07:00
Magnus Burton
9a21c53a7b Updated Swedish time
Summary:
Added `vid` to allow for sentences like `vid kl. 12` (-> `by 12 o'clock`)
Closes https://github.com/facebookincubator/duckling/pull/22

Differential Revision: D5105205

Pulled By: patapizza

fbshipit-source-id: 405575d
2017-05-22 12:49:28 -07:00
Julien Odent
4ed89d29de DE/Time: Convert unicode char to hexadecimal
Summary: I missed that one.

Reviewed By: niteria

Differential Revision: D5079134

fbshipit-source-id: 2eacbea
2017-05-18 08:04:45 -07:00
Ramtin Seraj
9b9f837e94 Adding dockerfile for the http server
Summary: Closes https://github.com/facebookincubator/duckling/pull/20

Differential Revision: D5078813

Pulled By: patapizza

fbshipit-source-id: a8f95ff
2017-05-17 10:19:44 -07:00
Sebastian Mika
39cb76024b Smaller improvements to DE about/before/after
Summary:
* In DE `frühestens` and `spätestens` act implicitly as `nach` and `vor` (after and before) on times and may also appear after the time

* The rule `ruleTimeofdayTimeofdayInterval` does match `9Uhr-10` but not the
way more common expression `9-10Uhr`; added the same rule with the
second time as non-latent; actually I am not sure whether the original
rule makes sense at all

* Simple extension of `intersect by ,` to THE formal way in DE to express
a date (i.e. `Freitag, der 13.03.2013`)

General remark: I used UTF-8 characters, although I saw that the other rules and examples use escaped hex encoding for e.g. German umlauts. If there is any reason to do that (it is not very readable), I will of course change that.
Closes https://github.com/facebookincubator/duckling/pull/19

Reviewed By: niteria

Differential Revision: D5070052

Pulled By: patapizza

fbshipit-source-id: 990ad08
2017-05-17 10:19:44 -07:00
Sebastian Mika
b00e5faeac Fix DE numerical ordinal matching
Summary:
The numerical ordinal matching rule in DE is too broad. An ordinal like "1." may not be preceded or followed by numbers.

* Added negative lookbehind - avoids matching the first "1." in "1.1" as an ordinal.
* Added negative lookahead - avoids matching the second "1." in "1.1." as an ordinal.
Closes https://github.com/facebookincubator/duckling/pull/18
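
The effect of combining the two assertions can be seen in a small Python sketch. The patterns are simplified assumptions, not the regex from the DE rule; `ordinals` is a hypothetical helper.

```python
import re

# Too broad: any digits followed by a dot count as an ordinal.
NAIVE = re.compile(r"(\d+)\.")

# Guarded: the digits must not follow a digit or a dot (lookbehind),
# and the dot must not be followed by a digit (lookahead).
GUARDED = re.compile(r"(?<![\d.])(\d+)\.(?!\d)")


def ordinals(pattern, text):
    """Return every ordinal value the pattern finds in the text."""
    return [int(n) for n in pattern.findall(text)]
```

With both guards in place, neither "1." in "1.1." is treated as an ordinal, while a standalone "13." as in "am 13. März" still is.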

Reviewed By: patapizza

Differential Revision: D5069200

Pulled By: niteria

fbshipit-source-id: 0583076
2017-05-16 09:49:21 -07:00