183 Commits

Author SHA1 Message Date
Yao Xiao
434e88b511 Consolidate days of week and months
Summary:
Remove hardcoded named-days and named-months, and
replace them with ruleDaysOfWeek and ruleMonths.

Reviewed By: patapizza

Differential Revision: D5742209

fbshipit-source-id: 339fc0a
2017-09-01 13:49:36 -07:00
Kevin Doherty
c41b71c665 Consolidate days of the week and months for GA
Summary:
The Gaelic ruleset has 12 separate rules for months, and 7 for days. This
change replaces those with a list of months/days and a single function
to create a list of rules from those. This is the same approach currently used in the English ruleset.
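
The data-plus-generator approach described in this summary can be sketched as follows. This is a minimal, self-contained illustration, not Duckling's code: the `Rule` record, its field names, `mkRules`, and the English patterns are all assumptions made for the example.

```haskell
-- Simplified stand-in for a rule; Duckling's real Rule type is richer.
data Rule = Rule
  { ruleName    :: String  -- label such as "Monday"
  , rulePattern :: String  -- regex source, kept as plain text here
  , ruleIndex   :: Int     -- 1-based position (day of week or month)
  } deriving (Eq, Show)

-- Data only: one (name, pattern) pair per entry. Patterns are illustrative.
daysOfWeek :: [(String, String)]
daysOfWeek =
  [ ("Monday",    "mon(day)?")
  , ("Tuesday",   "tue(s(day)?)?")
  , ("Wednesday", "wed(nesday)?")
  , ("Thursday",  "thu(r(s(day)?)?)?")
  , ("Friday",    "fri(day)?")
  , ("Saturday",  "sat(urday)?")
  , ("Sunday",    "sun(day)?")
  ]

months :: [(String, String)]
months =
  [ ("January", "jan(uary)?"), ("February", "feb(ruary)?")
  , ("March", "mar(ch)?"), ("April", "apr(il)?")
  , ("May", "may"), ("June", "june?")
  , ("July", "july?"), ("August", "aug(ust)?")
  , ("September", "sep(t(ember)?)?"), ("October", "oct(ober)?")
  , ("November", "nov(ember)?"), ("December", "dec(ember)?")
  ]

-- Single generator (hypothetical name): number the entries from 1
-- and build one rule per entry.
mkRules :: [(String, String)] -> [Rule]
mkRules entries = zipWith go entries [1 ..]
  where
    go (name, pat) i = Rule name pat i

ruleDaysOfWeek, ruleMonths :: [Rule]
ruleDaysOfWeek = mkRules daysOfWeek
ruleMonths     = mkRules months
```

With this shape, adding or correcting a surface form means editing one table entry rather than one of nineteen near-identical rule definitions.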

Reviewed By: patapizza

Differential Revision: D5756222

fbshipit-source-id: ac4bc42
2017-09-01 12:34:29 -07:00
Henry Swanson
b62be42077 Consolidate other times in DE ruleset
Summary:
Combined each of seasons, instants, and holidays into a data list and a
function to generate the list of Rules.

*Instants = today, tomorrow, now, end of year, etc.

Reviewed By: patapizza

Differential Revision: D5730896

fbshipit-source-id: 23170e7
2017-08-29 16:34:30 -07:00
Henry Swanson
96432e5b7d Consolidate days of the week and months for DE
Summary:
The German ruleset has 12 separate rules for months, and 7 for days. This
change replaces those with a list of months/days and a single function
to create a list of rules from those. This is the same approach currently used in the English ruleset.

Reviewed By: patapizza

Differential Revision: D5728656

fbshipit-source-id: 8590f4a
2017-08-29 14:49:39 -07:00
Caren Thomas
5d9b774b9d Consolidate days of week and months for Swedish rules
Summary: Consolidated all the days of week rules into one rule, and did the same for all the month rules.

Reviewed By: patapizza

Differential Revision: D5721202

fbshipit-source-id: 2b4a56f
2017-08-28 16:04:35 -07:00
Dana Thomas
0d387c0775 Consolidating days of week and names of months into separate rules for French Duckling
Summary: Consolidated the previous days-of-week and month-name rules in the French Duckling file into only 2 rules. This allows for more concise, up-to-date code.

Reviewed By: patapizza

Differential Revision: D5710056

fbshipit-source-id: 816ef88
2017-08-28 11:34:27 -07:00
Julien Odent
b5e646d8f6 New language instructions
Summary: Updated `README.md` with instructions on how to add a new language in Duckling.

Reviewed By: JonCoens

Differential Revision: D5712523

fbshipit-source-id: 6f5eda0
2017-08-25 17:34:26 -07:00
Andrew Shields
7df488fbe2 reuse ruleDaysOfWeek and ruleMonths in Danish rules
Summary: Changed Danish time rules to use ruleDaysOfWeek and ruleMonths.

Reviewed By: patapizza

Differential Revision: D5709782

fbshipit-source-id: aa03065
2017-08-25 17:04:29 -07:00
Fredrik Wallén
9b5b7bc6ce Corrected ordinals for Swedish and updated tests accordingly
Summary:
There are problems in the ordinal recognition for Swedish. The most severe one is that all the numbers above 15 are actually Danish, not Swedish. Apart from that, digits and digits followed by a dot are considered ordinals.

This pull request fixes this and also adds support for ordinals up to 100. The structure of the code is similar to that of the ordinal recognition in English. The tests are also updated: both the ordinal tests and the time tests where incorrect ordinals were used.
Closes https://github.com/facebookincubator/duckling/pull/86

Reviewed By: JonCoens

Differential Revision: D5698145

Pulled By: patapizza

fbshipit-source-id: c31d7bc
2017-08-25 17:04:29 -07:00
Margaret Li
b3d10dbf05 consolidated rules for days of week/months in duckling ZH
Reviewed By: patapizza

Differential Revision: D5702961

fbshipit-source-id: 49906d2
2017-08-25 16:34:48 -07:00
Julien Odent
9f856cec48 Time/PT: don't parse 'um' alone
Summary: In Portuguese, "um" means the numeral "one" and the article "a".

Reviewed By: bfiss

Differential Revision: D5703396

fbshipit-source-id: 92ed04f
2017-08-24 21:49:36 -07:00
Matt Lim
faa91d026b Consolidate days of week and months, for Vietnamese rules
Summary:
Remove hardcoded named-days and named-months, and
replace them with ruleDaysOfWeek and ruleMonths.

Reviewed By: patapizza

Differential Revision: D5695475

fbshipit-source-id: d30557f
2017-08-24 10:06:18 -07:00
Julien Odent
2f28e4e33d Time/PL: Don't parse ordinals without context
Summary: "pierwszy" and "drugiej" shouldn't parse as hours without context (e.g. at/until x).

Reviewed By: blandinw

Differential Revision: D5694804

fbshipit-source-id: 40e3eb7
2017-08-24 08:34:33 -07:00
dubovinszky
60565c15aa HU Time, TimeGrain
Summary: Closes https://github.com/facebookincubator/duckling/pull/83

Reviewed By: blandinw

Differential Revision: D5681515

Pulled By: patapizza

fbshipit-source-id: 918d0a4
2017-08-22 19:34:33 -07:00
Joseph Button
ff76927956 Consolidate days of week and months
Summary: Consolidating the rules for months and days of the week in Italian following the pattern seen in English.

Reviewed By: patapizza

Differential Revision: D5665259

fbshipit-source-id: 45d6c3b
2017-08-22 15:04:27 -07:00
Atyansh Jaiswal
4e96a15c15 Refactored weekend rules to use the weekend helper for all languages
Summary: This is a simple refactor that uses the weekend helper for all languages

Reviewed By: patapizza

Differential Revision: D5677330

fbshipit-source-id: 9984539
2017-08-22 10:34:24 -07:00
Julien Odent
004995b595 Don't allow matches in the middle of words
Summary:
We don't allow matches adjacent to a character of the same class.
We were treating uppercase and lowercase characters differently.
"jon Friday" wouldn't match "on" but "Jon Friday" would.

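The adjacency check described above can be sketched as follows, assuming a match is given as a half-open index span into the input. The names `CharClass`, `charClass`, and `isValidSpan` are hypothetical and the code is an illustration of the idea, not Duckling's implementation. The key point is that upper- and lowercase letters fall into one class.

```haskell
import Data.Char (isAlpha, isDigit)

-- Letters of either case share one class; any other character is its own class.
data CharClass = Letter | Digit | Other Char
  deriving (Eq, Show)

charClass :: Char -> CharClass
charClass c
  | isAlpha c = Letter
  | isDigit c = Digit
  | otherwise = Other c

-- | Is the half-open span [start, end) an acceptable place for a match?
-- It is rejected when the character just outside an edge has the same
-- class as the character just inside that edge.
isValidSpan :: String -> Int -> Int -> Bool
isValidSpan s start end
  | start < 0 || start >= end || end > n = False
  | otherwise = leftOk && rightOk
  where
    n = length s
    differs i j = charClass (s !! i) /= charClass (s !! j)
    leftOk  = start == 0 || differs (start - 1) start
    rightOk = end == n || differs (end - 1) end
```

Under this sketch, `isValidSpan "jon Friday" 1 3` and `isValidSpan "Jon Friday" 1 3` are both `False`, because "on" is preceded by a letter in either case, while `isValidSpan "on Friday" 0 2` is `True`.
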
Reviewed By: blandinw

Differential Revision: D5653681

fbshipit-source-id: be67358
2017-08-17 15:49:26 -07:00
Maury Turay
4a7aacae2f Consolidate days of week and months for Romanian rules
Reviewed By: patapizza

Differential Revision: D5645993

fbshipit-source-id: d2b69a1
2017-08-17 15:34:19 -07:00
Satya Bodduluri
da41db3766 Added ruleIntervalDDDDMonth to EN
Summary: Added ruleIntervalDDDDMonth to EN to handle cases such as "23rd to 26th Oct" and "1-8 september"

Reviewed By: patapizza

Differential Revision: D5637280

fbshipit-source-id: a1fdcd2
2017-08-16 14:34:26 -07:00
Daniel Kantor
5cad4359e2 Added HU Ordinals
Summary: Closes https://github.com/facebookincubator/duckling/pull/82

Reviewed By: JonCoens

Differential Revision: D5631927

Pulled By: patapizza

fbshipit-source-id: d68b238
2017-08-16 11:19:24 -07:00
Jesse Hellemn
98b58647b1 Adapting Spanish rules to handle named days and months in the same rules
Summary: Moved all named days to the same rule, moved all named months to the same rule. Kept same regexes, just consolidated them.

Reviewed By: patapizza

Differential Revision: D5637061

fbshipit-source-id: e08ecf9
2017-08-16 09:34:24 -07:00
Atyansh Jaiswal
e7431739ec Fixed Ordinal parsing for format "August 27th-30th"
Summary: Changed ruleIntervalMonthDDDD to use the ordinal predicate instead of ugly regex

Reviewed By: patapizza

Differential Revision: D5628188

fbshipit-source-id: 1dbe195
2017-08-15 10:34:37 -07:00
Veselin Stoyanov
e9b1c8932a Added AmountOfMoney dimension to Bulgarian language
Summary:
- Added AmountOfMoney dimension to Bulgarian language
Closes https://github.com/facebookincubator/duckling/pull/80

Reviewed By: JonCoens

Differential Revision: D5606699

Pulled By: patapizza

fbshipit-source-id: c18f5d4
2017-08-14 09:34:36 -07:00
Hiten Parmar
be113689ac Add support for parsing day intervals beginning with "from", e.g. "from 10 to 16 August"
Summary: Added EN rule "ruleIntervalFromDDDDMonth" to support "from 10 to 16 August". Used "isDOMValue" helper rather than regex.

Reviewed By: patapizza

Differential Revision: D5610623

fbshipit-source-id: 00a5208
2017-08-12 02:19:23 -07:00
Julien Odent
4c348b1b9d Try -j1 to fix Travis
Summary:
The Travis build fails.
Trying https://github.com/haskell/cabal/issues/2546 to see if that helps.

Reviewed By: niteria

Differential Revision: D5612089

fbshipit-source-id: d4df127
2017-08-11 10:49:23 -07:00
dubovinszky
24d3f19976 HU Setup + Numeral
Summary:
- Setup Hungarian (HU) language
- Added Numeral Dimension
Closes https://github.com/facebookincubator/duckling/pull/79

Reviewed By: blandinw

Differential Revision: D5595812

Pulled By: patapizza

fbshipit-source-id: 5959938
2017-08-09 17:49:56 -07:00
Veselin Stoyanov
5d03b45af9 Setup Bulgarian language and Numeral Dimension
Summary:
- Setup Bulgarian (BG) language
- Added Numeral Dimension
Closes https://github.com/facebookincubator/duckling/pull/78

Reviewed By: niteria

Differential Revision: D5575513

Pulled By: patapizza

fbshipit-source-id: e566155
2017-08-09 08:19:24 -07:00
Julien Odent
9037126937 [Duckling] Time/DE: don't parse 'nächste 5'
Summary: Fixes https://github.com/wit-ai/wit/issues/694.

Reviewed By: niteria

Differential Revision: D5590023

fbshipit-source-id: 6356615
2017-08-09 06:49:26 -07:00
Julien Odent
61800297c8 Time/PL: don't parse 'nie' as Time
Summary: 'nie' means 'no' in Polish, and isn't a common abbreviation for 'niedziela' (Sunday).

Reviewed By: blandinw

Differential Revision: D5587036

fbshipit-source-id: bfda7fc
2017-08-08 16:49:58 -07:00
Julien Odent
7364955b35 Add lang to Quickstart
Summary: Fixes #77.

Reviewed By: JonCoens

Differential Revision: D5574087

fbshipit-source-id: 46ccf86
2017-08-07 10:04:26 -07:00
Lily Li
9801c99eeb Quantity: add weight dimensions
Summary:
Fixes #75.

I created a list of regex expressions and their corresponding units. I then mapped over them to create two rules for each expression: one for numeral quantities and one for single (a/an) quantities.
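
The mapping described here, two rules per list entry, might look roughly like the sketch below. The `Unit` and `Rule` types, the field names, and the regexes are all hypothetical and chosen only to show the shape of the approach.

```haskell
-- Hypothetical types for illustration only.
data Unit = Gram | Pound | Ounce
  deriving (Eq, Show)

data Rule = Rule
  { ruleName    :: String
  , rulePattern :: [String]      -- sequence of pattern items, as plain text
  , ruleUnit    :: Unit
  , fixedValue  :: Maybe Double  -- Just 1 for "a pound"; Nothing when a numeral supplies it
  } deriving (Eq, Show)

-- (label, regex source, unit); the regexes are illustrative.
weights :: [(String, String, Unit)]
weights =
  [ ("grams",  "g(ram)?s?",    Gram)
  , ("pounds", "(lb|pound)s?", Pound)
  , ("ounces", "(oz|ounces?)", Ounce)
  ]

-- Two rules per entry: "<numeral> <unit>" and "a/an <unit>".
rulesFor :: (String, String, Unit) -> [Rule]
rulesFor (label, pat, unit) =
  [ Rule ("<numeral> " ++ label) ["<numeral>", pat] unit Nothing
  , Rule ("a " ++ label)         ["an?", pat]       unit (Just 1)
  ]

ruleWeights :: [Rule]
ruleWeights = concatMap rulesFor weights
```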

Reviewed By: patapizza

Differential Revision: D5532950

fbshipit-source-id: 63e35bd
2017-08-01 13:49:26 -07:00
Julien Odent
a84eb62180 Time: Fix nthDOWOfMonth
Summary:
Fixes #65.
* fixes US holidays
* Black Friday is actually the first day after Thanksgiving Day (not necessarily the fourth Friday of November)
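
The difference between the two definitions of Black Friday can be checked with a small sketch built on the `time` package. The helper `nthWeekdayOfMonth` and its signature are assumptions made for this example, not Duckling's `nthDOWOfMonth`. Thanksgiving is taken to be the fourth Thursday of November.

```haskell
import Data.Time.Calendar (Day, addDays, fromGregorian)
import Data.Time.Calendar.WeekDate (toWeekDate)

-- | The n-th occurrence (n >= 1) of a weekday in a given month.
-- Weekdays are numbered 1 = Monday .. 7 = Sunday, as in 'toWeekDate'.
nthWeekdayOfMonth :: Int -> Int -> Integer -> Int -> Day
nthWeekdayOfMonth n dow year month = addDays offset firstOfMonth
  where
    firstOfMonth = fromGregorian year month 1
    (_, _, firstDow) = toWeekDate firstOfMonth
    -- Days from the 1st to the first requested weekday (0..6).
    toFirst = (dow - firstDow) `mod` 7
    offset = fromIntegral (toFirst + 7 * (n - 1))

-- Fourth Thursday of November.
thanksgiving :: Integer -> Day
thanksgiving year = nthWeekdayOfMonth 4 4 year 11

-- The day after Thanksgiving.
blackFriday :: Integer -> Day
blackFriday = addDays 1 . thanksgiving

-- The other definition, which is not always the same day.
fourthFridayOfNovember :: Integer -> Day
fourthFridayOfNovember year = nthWeekdayOfMonth 4 5 year 11
```

For 2013, where November 1st falls on a Friday, `blackFriday 2013` is November 29th while `fourthFridayOfNovember 2013` is November 22nd. For 2017 both give November 24th.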

Reviewed By: JonCoens

Differential Revision: D5533906

fbshipit-source-id: 1824cba
2017-08-01 09:04:31 -07:00
Julien Odent
ef461c3133 Time/FR: don't parse 'a un'
Summary:
In French, the form "at hh" is not valid (it requires an hour indicator).
This fixes false positives such as in "John a un rendez-vous."

Fixes https://github.com/wit-ai/wit/issues/666.

Reviewed By: JonCoens

Differential Revision: D5530713

fbshipit-source-id: ecee1e5
2017-08-01 08:49:41 -07:00
Julien Odent
8710b9d59b Time/EN: for duration from time
Summary: Fixes #57.

Reviewed By: blandinw

Differential Revision: D5495075

fbshipit-source-id: 018fd29
2017-07-26 11:49:44 -07:00
Julien Odent
7a6c2597af Time: don't shift duration on <duration> before/after <time> (but now)
Summary:
* `Duration` before/after `Time` now resolves with the lowest grain
* "now" has an undefined grain `NoGrain`, as depending on the context it might mean different things, as opposed to "right now"

Before:
`day after tomorrow` -> `day` grain
`1 day after tomorrow` -> `hour` grain

Given that the reference date/time is `2013-02-12T04:30:00`.
`one year from now` -> `2014-02-01T00:00:00` with `month` grain.
`one year from today` -> `2014-02-01T00:00:00` with `month` grain.

After:
`day after tomorrow` -> `day` grain
`1 day after tomorrow` -> `day` grain
`one year from now` -> `2014-02-12T04:30:00` with `month` grain (remains the same).
`one year from today` -> `2014-02-12T00:00:00` with `day` grain.

For other `Time` entities involving `Duration`, such as "in + `Duration`", the behavior remains the same: shift to the lower grain (the intent is not precise).
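
One reading of the grain behavior that is consistent with the four "After" examples is sketched below. It is a reconstruction from those examples only, not the actual implementation: the `Grain` constructors beyond `NoGrain`, the `lower` table, and the function names are assumptions.

```haskell
-- Grains ordered from finest to coarsest; NoGrain is handled separately.
data Grain = NoGrain | Second | Minute | Hour | Day | Week | Month | Quarter | Year
  deriving (Eq, Ord, Show)

-- | Next finer grain, used when the intent is not precise (assumed table).
lower :: Grain -> Grain
lower Year    = Month
lower Quarter = Month
lower Month   = Day
lower Week    = Day
lower Day     = Hour
lower Hour    = Minute
lower Minute  = Second
lower Second  = Second
lower NoGrain = Second

-- | Old behavior as described under "Before": always shift to the lower grain.
oldGrain :: Grain -> Grain -> Grain
oldGrain durationGrain _timeGrain = lower durationGrain

-- | New behavior: take the finest of the two grains, unless the time has
-- no defined grain (as with "now"), in which case fall back to shifting.
newGrain :: Grain -> Grain -> Grain
newGrain durationGrain NoGrain   = lower durationGrain
newGrain durationGrain timeGrain = min durationGrain timeGrain
```

With these definitions, `newGrain Day Day` ("1 day after tomorrow") is `Day`, `newGrain Year Day` ("one year from today") is `Day`, and `newGrain Year NoGrain` ("one year from now") is `Month`, while `oldGrain Day Day` is `Hour`.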

Reviewed By: l5t, blandinw

Differential Revision: D5467164

fbshipit-source-id: b63b6a4
2017-07-26 11:49:44 -07:00
Bartosz Nitka
ed834ee182 Bump to 0.1.2.0
Summary: Prepare for new hackage release.

Reviewed By: patapizza

Differential Revision: D5488945

fbshipit-source-id: ef6dedf
2017-07-25 07:49:23 -07:00
Bartosz Nitka
2e89c3b33d Test with GHC 8.2.1
Summary: GHC 8.2.1 has been released.

Reviewed By: JonCoens

Differential Revision: D5478409

fbshipit-source-id: e065ed3
2017-07-24 09:50:05 -07:00
Bartosz Nitka
c208ae01bc Remove redundant version bounds
Summary:
Repeating version bounds on executables that depend on
`duckling` library is needless bureaucracy.

Reviewed By: JonCoens

Differential Revision: D5478388

fbshipit-source-id: e18f1b8
2017-07-24 09:50:05 -07:00
chiralcarbon
ce0e9e4f50 Add MM/YYYY and MM/YY
Summary:
MM/YY is a common date format in India, the UK, and other parts of the world. I have added test cases in `Time/EN/corpus.hs`; however, this conflicts with one of the original cases (2/15 is now output as Feb. 2015 and not the 15th of February).
Closes https://github.com/facebookincubator/duckling/pull/59

Reviewed By: niteria

Differential Revision: D5455881

Pulled By: patapizza

fbshipit-source-id: 23b73a5
2017-07-20 11:20:02 -07:00
Julien Odent
a55629d541 Fix #53
Summary: Added a rule to handle "from <month> dd-dd" to fix #53.

Reviewed By: blandinw

Differential Revision: D5329214

fbshipit-source-id: a5f746d
2017-06-27 12:49:23 -07:00
Julien Odent
bfb6ba0387 Numeral flag for Time patterns
Summary:
Today things like `at single`, `at a few`, `at a couple of` would return a `Time`.
Discussed with blandinw to do this very explicit hack right now until other use cases show up.

Reviewed By: niteria

Differential Revision: D5325369

fbshipit-source-id: aec0402
2017-06-27 07:34:21 -07:00
Andrew Farmer
068e23db13 Prepare for DuplicateRecordFields
Summary:
The one restriction on using DuplicateRecordFields is that record
selectors have to be imported under their constructor, instead of as
top-level functions. Do this for si_sigma so D5242707 passes the compat
check.

Reviewed By: watashi

Differential Revision: D5326634

fbshipit-source-id: 74ec0dd
2017-06-26 22:34:21 -07:00
Jim Regan
8af3ae5d8a also match old-style -adh ending
Summary: Closes https://github.com/facebookincubator/duckling/pull/47

Reviewed By: niteria

Differential Revision: D5318111

Pulled By: patapizza

fbshipit-source-id: 963450b
2017-06-26 09:19:20 -07:00
André
3ec5390ac2 numerals between 100 and 999 in Portuguese fixed
Summary:
I fixed some bugs I found in Portuguese. This is my first attempt to contribute, so let me know if there's anything I could do better next time! Thanks! Awesome project!
Closes https://github.com/facebookincubator/duckling/pull/56

Differential Revision: D5318968

Pulled By: patapizza

fbshipit-source-id: 94ff30f
2017-06-26 08:19:28 -07:00
Julien Odent
45822009e1 Time: weekend helper
Summary: `weekend` production helper.

Reviewed By: blandinw

Differential Revision: D5302012

fbshipit-source-id: bb1f234
2017-06-23 07:04:27 -07:00
chao pan
efc1f36494 fix issue #50
Summary: Closes https://github.com/facebookincubator/duckling/pull/52

Reviewed By: blandinw

Differential Revision: D5296543

Pulled By: patapizza

fbshipit-source-id: 041844a
2017-06-23 07:04:27 -07:00
Sebastian Mika
291bd28873 Various smaller DE time improvements
Summary:
This PR contains various smaller but - at least on my data - important performance improvements for matching of German time and time range expressions.

I evaluated this on approx. 11,000 time and time range expressions taken from emails (rather formal business travel requests) that have been manually annotated with the "true" time. Comparing this branch to the current master (`d6f8dd`) I get e.g. approx. 80% of the duckling results within +/- 1h of the true value (hours are the smallest grain in my data), vs. only 70% in the master. Other indicators I checked (time/range confusion, other thresholds, failures to find anything in the first place, etc.) were all improved as well.

**Changes**:

* [significant performance plus] added a rule `ruleDateDateInterval` that handles variations of "13.-15.10." correctly. Here the common case is that "13." refers to "13.10." and not "13.CURRENTMONTH". I didn't see an obvious way to fix that in the `<datetime> - <datetime>` rule.
* [significant performance plus] In `ruleMmdd` (which matches expressions like "13.03." in German), I made the last dot optional. At least in less formal text, it is quite commonly forgotten. Also, here and in `ruleDateDateInterval`, I changed the order of the terms in the regular expression matching the month to prefer matching e.g. "10" over matching "1"+"no dot".
* [minor] treat "14/15Uhr" the same as "14-15Uhr"
* [minor] Extended "bis" to also match "bis zum" and "auf den" (e.g. in "von Montag bis zum Freitag" or "von Dienstag auf Mittwoch")
* [minor] Changed `hh:mm` matching to also get the rather esoteric expression "17h00" - should do no harm.
Closes https://github.com/facebookincubator/duckling/pull/54

Reviewed By: blandinw

Differential Revision: D5301815

Pulled By: patapizza

fbshipit-source-id: 8766caf
2017-06-23 07:04:27 -07:00
Bartosz Nitka
f48b536b1e Add example modules for benchmarking and profiling
Summary:
This adds two new targets:
* `:duckling-expensive` - meant to have inputs for which running Duckling is expensive
* `:duckling-request-sample` - meant to have a random sample of inputs

The reason to have 2 of them is that they measure different things.
`:duckling-expensive` is correlated with failures,
`:duckling-request-sample` is correlated with cost.

I intend to add basic instruction on how to use them for
benchmarking/profiling soon.

Reviewed By: patapizza

Differential Revision: D5301554

fbshipit-source-id: f73fd85
2017-06-22 09:19:18 -07:00
Bartosz Nitka
65f8ec170a Don't double build on travis
Summary:
We build `duckling` again to test the source distribution.
It's a low-signal check that's too costly at the moment.

Reviewed By: patapizza

Differential Revision: D5301108

fbshipit-source-id: cc651a4
2017-06-22 06:34:53 -07:00
tarung-ml
d6f8ddc064 Support dates of type mm-dd
Summary:
e.g. "New York from 10-6 to 10-22" currently extracts HH-MM. Instead, it should extract mm-dd, i.e. October 6th to October 22nd.
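
The month-day reading of such an expression can be sketched as follows. This is a minimal illustration in plain Haskell, not the rule added by the pull request: `parseMonthDay` and `parseInterval` are hypothetical helpers, and the day check is a coarse 1..31 range that ignores month lengths.

```haskell
import Data.Char (isDigit)

-- | Parse "mm-dd" into (month, day), rejecting out-of-range values.
parseMonthDay :: String -> Maybe (Int, Int)
parseMonthDay s =
  case break (== '-') s of
    (m, '-' : d)
      | valid m && valid d
      , let month = read m
      , let day = read d
      , month >= 1 && month <= 12
      , day >= 1 && day <= 31 -> Just (month, day)
    _ -> Nothing
  where
    -- One or two digits, nothing else.
    valid t = not (null t) && length t <= 2 && all isDigit t

-- | Parse "<mm-dd> to <mm-dd>" into a pair of (month, day) values.
parseInterval :: String -> Maybe ((Int, Int), (Int, Int))
parseInterval s =
  case words s of
    [from, "to", to] -> (,) <$> parseMonthDay from <*> parseMonthDay to
    _                -> Nothing
```

Here `parseInterval "10-6 to 10-22"` yields `Just ((10,6),(10,22))`, and `parseMonthDay "13-40"` yields `Nothing` because 13 is not a valid month.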
Closes https://github.com/facebookincubator/duckling/pull/48

Reviewed By: niteria

Differential Revision: D5292473

Pulled By: patapizza

fbshipit-source-id: 04f1a4b
2017-06-22 04:04:26 -07:00