Commit Graph

53 Commits

Author SHA1 Message Date
Igor Drozdov
5d2c5c78ba Added Distance and Volume Dimensions for Russian language
Summary:
- Added Distance Dimension for Russian language (RU)
- Added Volume Dimension for Russian language (RU)
- Extended `Duckling.Distance.Types.Unit` type definition by adding `Millimetre` representation
Closes https://github.com/facebook/duckling/pull/101

Reviewed By: JonCoens

Differential Revision: D6254070

Pulled By: patapizza

fbshipit-source-id: 579f7a259f76ff1c23ccfe2371afea385eb56aa1
2017-11-08 11:00:31 -08:00
Panagiotis Vekris
e8937e1cd6 Support for Greek durations
Summary: Adding support for Greek time grains and durations.

Reviewed By: patapizza

Differential Revision: D6249955

fbshipit-source-id: 1c69e26
2017-11-06 18:49:36 -08:00
Julien Odent
ba46d592cd Prevent double negatives + cleanup
Summary:
* Prevent double negatives (resulting from `ruleNegative` applying twice and from engine tokenizer)
* Hashmap lookup for tens
* cleanup

Reviewed By: blandinw

Differential Revision: D6221107

fbshipit-source-id: 42e401d
2017-11-03 00:31:22 -07:00
Panagiotis Vekris
fda8c7c759 Support Greek numerals
Summary:
- Setup Greek language (EL)
- Added Greek Numerals

Reviewed By: patapizza

Differential Revision: D6217873

fbshipit-source-id: 379170f
2017-11-02 17:16:18 -07:00
Julien Odent
f0a0c1e6b8 Time/EN: don't parse "this in 2 minutes" + fix thanksgiving in EN locales
Summary:
* add flag for this/next/last time
* fix Thanksgiving in EN locales
* `analyzedRangeTest` helper with `rangeTests` for `Time/EN`

Reviewed By: blandinw

Differential Revision: D6191209

fbshipit-source-id: 6eaa117
2017-10-31 12:34:21 -07:00
Sam Pepose
251028e691 Uses correct date format for all EN locales
Summary: The date format changes between EN locales (https://en.wikipedia.org/wiki/Date_format_by_country). This diff fixes how dates are handled in each locale.

Reviewed By: patapizza

Differential Revision: D6156147

fbshipit-source-id: 22f296c
2017-10-26 08:49:21 -07:00
Julien Odent
980c0f279e Time: don't parse at + phone number
Summary: Fixes #95.

Reviewed By: blandinw

Differential Revision: D6129893

fbshipit-source-id: e863021
2017-10-23 16:34:34 -07:00
Matthijs Mullender
1ade1935b2 Support Dutch dates and times
Summary: [Duckling][Time][NL] Support Dutch dates and times

Reviewed By: patapizza

Differential Revision: D6090294

fbshipit-source-id: 54b8729
2017-10-19 14:04:38 -07:00
Abdallatif Sulaiman
18cd2210ac Add Duration Dimension to Arabic Language
Summary: Closes https://github.com/facebookincubator/duckling/pull/94

Reviewed By: blandinw

Differential Revision: D6078221

Pulled By: patapizza

fbshipit-source-id: b653b24
2017-10-17 16:04:28 -07:00
Julien Odent
b2de97800f Numeral/FR: allow space as a thousand separator
Summary: Fixes #91.

Reviewed By: blandinw

Differential Revision: D6079821

fbshipit-source-id: f3160c1
2017-10-17 12:20:32 -07:00
Julien Odent
0e950620b8 Email: restrict domain extensions to letters when spelling out
Summary: We would parse things like "tonight at 6.40" as an email.

Reviewed By: blandinw

Differential Revision: D6066926

fbshipit-source-id: d18a8c6
2017-10-16 12:19:22 -07:00
Julien Odent
1ab5f447d2 en_CA + fix Canadian Thanksgiving
Summary:
* `en_CA` locale
* In Canada, Thanksgiving Day is the second Monday of October.
* Black Friday is the same as in the US.
* However, Canada observes both DDMM and MMDD formats. Deferred to later; falls back to the US format for now.
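
The "second Monday of October" rule above can be sketched as follows. This is only an illustration built on the `time` package (it assumes `time` >= 1.9 for `dayOfWeek`), not Duckling's actual rule definition; `nthWeekdayOfMonth` and `canadianThanksgiving` are hypothetical names.

```haskell
import Data.Time.Calendar
  ( Day, DayOfWeek (Monday), addDays, dayOfWeek, fromGregorian )

-- | Hypothetical helper: the n-th (1-based) occurrence of a weekday
-- in the given year and month.
nthWeekdayOfMonth :: Int -> DayOfWeek -> Integer -> Int -> Day
nthWeekdayOfMonth n wd year month =
  addDays (fromIntegral (offset + 7 * (n - 1))) first
  where
    first = fromGregorian year month 1
    -- Days from the 1st of the month to the first occurrence of 'wd'.
    offset = (fromEnum wd - fromEnum (dayOfWeek first)) `mod` 7

-- | Hypothetical name: Canadian Thanksgiving, the second Monday of October.
canadianThanksgiving :: Integer -> Day
canadianThanksgiving year = nthWeekdayOfMonth 2 Monday year 10

main :: IO ()
main = print (canadianThanksgiving 2017) -- prints 2017-10-09
```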

Reviewed By: blandinw

Differential Revision: D6058909

fbshipit-source-id: 3d4e05e
2017-10-16 10:04:43 -07:00
Julien Odent
fb1dcaa138 Chinese locales + fix TW National Day
Summary:
* Moving `ruleNationalDay` from `ZH` rules to specific locales: `zh_CN`, `zh_HK`, `zh_MO`
* Fixed National Day for `zh_TW`.

Reviewed By: blandinw

Differential Revision: D6057565

fbshipit-source-id: 8f9f2ab
2017-10-13 17:04:43 -07:00
Matthijs Mullender
33a08bb76b Support Dutch Durations
Summary:
This change adds support for durations in Dutch/Netherlands (NL)
Implemented: TimeGrain/NL, Durations/NL

Reviewed By: patapizza

Differential Revision: D6049404

fbshipit-source-id: 3621cdb
2017-10-13 12:49:30 -07:00
Julien Odent
ab0ad0256e Locales support
Summary:
* Locales support for the library, following `<Lang>_<Region>` with ISO 639-1 code for `<Lang>` and ISO 3166-1 alpha-2 code for `<Region>` (#33)
* `Locale` opaque type (composite of `Lang` and `Region`) with `makeLocale` smart constructor to only allow valid `(Lang, Region)` combinations
* API: `Context`'s `lang` parameter has been replaced by `locale`, with optional `Region` and backward compatibility.
* `Rules/<Lang>.hs` exposes
  - `langRules`: cross-locale rules for `<Lang>`, from `<Dimension>/<Lang>/Rules.hs`
  - `localeRules`: locale-specific rules, from `<Dimension>/<Lang>/<Region>/Rules.hs`
  - `defaultRules`: `langRules` + specific rules from select locales to ensure backward-compatibility
* Corpus, tests & classifiers
  - 1 classifier per locale, with default classifier (`<Lang>_XX`) when no locale provided (backward-compatible)
  - Default classifiers are built on existing corpus
  - Locale classifiers are built on
  - `<Dimension>/<Lang>/Corpus.hs` exposes a common `corpus` to all locales of `<Lang>`
  - `<Dimension>/<Lang>/<Region>/Corpus.hs` exposes `allExamples`: a list of examples specific to the locale (following `<Dimension>/<Lang>/<Region>/Rules.hs`).
  - Locale classifiers use the language corpus extended with the locale examples as training set.
  - Locale examples need to use the same `Context` (i.e. reference time) as the language corpus.
  - For backward compatibility, `<Dimension>/<Lang>/Corpus.hs` can also expose `defaultCorpus`, which is `corpus` augmented with specific examples. This is controlled by `getDefaultCorpusForLang` in `Duckling.Ranking.Generate`.
  - Tests run against each classifier to make sure runtime works as expected.
* MM/DD (en_US) vs DD/MM (en_GB) example to illustrate
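
The smart-constructor idea described above can be sketched roughly as follows. This is an assumption-laden illustration, not the library's code: the `Lang` and `Region` values, the `allowedRegions` table, and the fallback to a region-less locale are all chosen for the example. In a real module, `Locale` would be exported without its constructor so that `makeLocale` is the only way to build one.

```haskell
-- Illustrative subsets only; the actual language and region lists differ.
data Lang = EN | ZH deriving (Eq, Show)
data Region = CA | CN | GB | HK | MO | TW | US deriving (Eq, Show)

-- A locale is a language plus an optional region.
data Locale = Locale Lang (Maybe Region) deriving (Eq, Show)

-- Hypothetical table of valid (Lang, Region) combinations.
allowedRegions :: Lang -> [Region]
allowedRegions EN = [CA, GB, US]
allowedRegions ZH = [CN, HK, MO, TW]

-- Smart constructor: keeps the region only when it is valid for the
-- language, and otherwise falls back to the language-only locale
-- (an assumption made for this sketch).
makeLocale :: Lang -> Maybe Region -> Locale
makeLocale lang (Just region)
  | region `elem` allowedRegions lang = Locale lang (Just region)
makeLocale lang _ = Locale lang Nothing

main :: IO ()
main = do
  print (makeLocale EN (Just GB)) -- Locale EN (Just GB)
  print (makeLocale EN (Just TW)) -- Locale EN Nothing
  print (makeLocale ZH Nothing)   -- Locale ZH Nothing
```

Falling back instead of returning `Maybe Locale` keeps callers that only pass a language working unchanged, which matches the backward-compatibility goal stated in the summary.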

Reviewed By: JonCoens, blandinw

Differential Revision: D6038096

fbshipit-source-id: f29c28d
2017-10-13 08:34:21 -07:00
Julien Odent
1eea25049f Time/RO: don't parse 'sa' as Saturday
Summary: In Romanian, `sa` is fairly common: hai sa ne vedem (let's see), hai sa mergem (let's go).

Reviewed By: blandinw

Differential Revision: D5801345

fbshipit-source-id: db677e4
2017-09-09 11:04:57 -07:00
Julien Odent
b954380937 ES/Ordinal: Fixes + tests
Summary:
* fixes '1st' variants (e.g. primeros, primera)
* fixes accents

Reviewed By: JonCoens

Differential Revision: D5772079

fbshipit-source-id: 6a09d79
2017-09-06 10:19:31 -07:00
Stepan Parunashvili
6f774abe38 georgian numeral support
Summary: Introducing Georgian (KA), and the very beginnings of numeral support

Reviewed By: patapizza

Differential Revision: D5757952

fbshipit-source-id: 89d05f8
2017-09-05 12:19:29 -07:00
dubovinszky
60565c15aa HU Time, TimeGrain
Summary: Closes https://github.com/facebookincubator/duckling/pull/83

Reviewed By: blandinw

Differential Revision: D5681515

Pulled By: patapizza

fbshipit-source-id: 918d0a4
2017-08-22 19:34:33 -07:00
Daniel Kantor
5cad4359e2 Added HU Ordinals
Summary: Closes https://github.com/facebookincubator/duckling/pull/82

Reviewed By: JonCoens

Differential Revision: D5631927

Pulled By: patapizza

fbshipit-source-id: d68b238
2017-08-16 11:19:24 -07:00
Veselin Stoyanov
e9b1c8932a Added AmountOfMoney dimension to Bulgarian language
Summary:
- Added AmountOfMoney dimension to Bulgarian language
Closes https://github.com/facebookincubator/duckling/pull/80

Reviewed By: JonCoens

Differential Revision: D5606699

Pulled By: patapizza

fbshipit-source-id: c18f5d4
2017-08-14 09:34:36 -07:00
dubovinszky
24d3f19976 HU Setup + Numeral
Summary:
- Setup Hungarian (HU) language
- Added Numeral Dimension
Closes https://github.com/facebookincubator/duckling/pull/79

Reviewed By: blandinw

Differential Revision: D5595812

Pulled By: patapizza

fbshipit-source-id: 5959938
2017-08-09 17:49:56 -07:00
Veselin Stoyanov
5d03b45af9 Setup Bulgarian language and Numeral Dimension
Summary:
- Setup Bulgarian (BG) language
- Added Numeral Dimension
Closes https://github.com/facebookincubator/duckling/pull/78

Reviewed By: niteria

Differential Revision: D5575513

Pulled By: patapizza

fbshipit-source-id: e566155
2017-08-09 08:19:24 -07:00
Julien Odent
61800297c8 Time/PL: don't parse 'nie' as Time
Summary: 'nie' means 'no' in Polish, and isn't a common abbreviation for 'niedziela' (Sunday).

Reviewed By: blandinw

Differential Revision: D5587036

fbshipit-source-id: bfda7fc
2017-08-08 16:49:58 -07:00
Julien Odent
ef461c3133 Time/FR: don't parse 'a un'
Summary:
In French, the form "at hh" is not valid (it requires an hour indicator).
This fixes false positives such as in "John a un rendez-vous."

Fixes https://github.com/wit-ai/wit/issues/666.

Reviewed By: JonCoens

Differential Revision: D5530713

fbshipit-source-id: ecee1e5
2017-08-01 08:49:41 -07:00
Şeref R.Ayar
8711df5047 change json response #12
Summary:
not sure about this. Maybe I need some guidance.
Closes https://github.com/facebookincubator/duckling/pull/42

Reviewed By: blandinw

Differential Revision: D5228520

Pulled By: patapizza

fbshipit-source-id: 4f99cc5
2017-06-12 15:19:22 -07:00
Şeref R.Ayar
ba26ca7e91 Volume for TR
Summary: Closes https://github.com/facebookincubator/duckling/pull/34

Reviewed By: niteria

Differential Revision: D5168380

Pulled By: patapizza

fbshipit-source-id: 31d0a11
2017-06-02 12:49:20 -07:00
Şeref R.Ayar
b69874cd9f Duration for TR
Summary: Closes https://github.com/facebookincubator/duckling/pull/32

Reviewed By: niteria

Differential Revision: D5150778

Pulled By: patapizza

fbshipit-source-id: d156b0a
2017-05-31 02:19:40 -07:00
serefayar
92a3e16886 Temperature for TR
Summary: Closes https://github.com/facebookincubator/duckling/pull/30

Reviewed By: niteria

Differential Revision: D5147114

Pulled By: patapizza

fbshipit-source-id: 804f623
2017-05-30 09:34:17 -07:00
Şeref R.Ayar
6de7c2142b Distance for TR
Summary: Closes https://github.com/facebookincubator/duckling/pull/26

Reviewed By: niteria

Differential Revision: D5112142

Pulled By: patapizza

fbshipit-source-id: d71f654
2017-05-23 10:49:18 -07:00
Sebastian Mika
b00e5faeac Fix DE numerical ordinal matching
Summary:
The numerical ordinal matching rule in DE is too broad. An ordinal like "1." may not be preceded or followed by numbers.

* Added negative lookbehind - avoids matching the first "1." in "1.1" as an ordinal.
* Added negative lookahead - avoids matching the second "1." in "1.1." as an ordinal
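
The boundary conditions above can be illustrated without a regex engine. The sketch below is a hand-rolled scan using only `base`, not the lookbehind/lookahead pattern from the actual rule; `ordinals` is a hypothetical name, and "preceded by a number" is assumed to mean that the text right before the digits ends in `<digit>.`.

```haskell
import Data.Char (isDigit)

-- | Hypothetical helper: candidate numeric ordinals "<digits>." that are
-- neither followed by a digit nor preceded by "<digit>.".
ordinals :: String -> [String]
ordinals = go ""
  where
    -- 'seen' holds the consumed input, most recent character first.
    go _ [] = []
    go seen input@(c : rest)
      | isDigit c =
          let (digits, after) = span isDigit input
              seen' = reverse digits ++ seen
           in case after of
                ('.' : more)
                  | not (followedByDigit more) && not (precededByNumber seen) ->
                      (digits ++ ".") : go ('.' : seen') more
                _ -> go seen' after
      | otherwise = go (c : seen) rest

    -- Plays the role of the negative lookahead.
    followedByDigit (d : _) = isDigit d
    followedByDigit [] = False

    -- Plays the role of the negative lookbehind: consumed input ends in "<digit>.".
    precededByNumber ('.' : d : _) = isDigit d
    precededByNumber _ = False

main :: IO ()
main = do
  print (ordinals "am 1. Mai") -- ["1."]
  print (ordinals "1.1")       -- []
  print (ordinals "1.1.")      -- []
```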
Closes https://github.com/facebookincubator/duckling/pull/18

Reviewed By: patapizza

Differential Revision: D5069200

Pulled By: niteria

fbshipit-source-id: 0583076
2017-05-16 09:49:21 -07:00
rfranek@email.cz
325fa69304 added Distance for CZ language
Summary:
Added first Czech language file
Closes https://github.com/facebookincubator/duckling/pull/16

Reviewed By: niteria

Differential Revision: D5044499

Pulled By: patapizza

fbshipit-source-id: c736a35
2017-05-12 08:19:20 -07:00
Julien Odent
37829902b7 CS: Setup + basic Numeral
Summary:
* Setup for Czech
* Basic `Numeral` (0-10 integers + digits) from http://www.omniglot.com/language/numbers/czech.htm

Reviewed By: JonCoens

Differential Revision: D5044775

fbshipit-source-id: b5cd9d2
2017-05-11 09:49:27 -07:00
Matteo
e11014dc4b Volume for IT lang
Summary:
I notice that there are several missing dimensions for the IT language: this patch is for the Volume dimension

Regards
Matteo
Closes https://github.com/facebookincubator/duckling/pull/4

Reviewed By: JonCoens

Differential Revision: D4986389

Pulled By: patapizza

fbshipit-source-id: 314d33e
2017-05-02 11:19:14 -07:00
Julien Odent
d3d3703015 HE: Time
Summary:
Time dimension for Hebrew.
Commented out the failing tests that actually also fail in Clojure.

Reviewed By: JonCoens

Differential Revision: D4970308

fbshipit-source-id: b455142
2017-04-28 10:04:35 -07:00
Julien Odent
ab2c89df4f IT: Temperature
Summary: Temperature dimension for Italian.

Reviewed By: JonCoens

Differential Revision: D4970338

fbshipit-source-id: 024802e
2017-04-28 10:04:35 -07:00
Bartosz Nitka
74936df848 Make matching anywhere vs at pos obvious
Summary:
This change refactors the Engine to use a different
code path for when we're calling `lookupItem` to find
a first token `Node` matching the rule and a different
one for subsequent ones.

This division lets us get better invariants and more importantly
do full text regexp matches only when necessary.

This should be particularly useful for longer texts.

Reviewed By: patapizza

Differential Revision: D4953918

fbshipit-source-id: e3a69ad
2017-04-28 09:19:20 -07:00
Julien Odent
9269727617 PT: Bring latest changes
Summary: * PhoneNumber: support for "ramal" as extension keyword

Reviewed By: niteria

Differential Revision: D4959209

fbshipit-source-id: cd12c1f
2017-04-28 08:04:22 -07:00
Julien Odent
3f40625339 Temperature for Croatian
Summary: Temperature dimension for Croatian

Reviewed By: niteria

Differential Revision: D4958590

fbshipit-source-id: fe6c2e4
2017-04-28 08:04:22 -07:00
Julien Odent
3cc3266e28 Quantity for Croatian
Summary: Quantity dimension for Croatian.

Reviewed By: niteria

Differential Revision: D4958501

fbshipit-source-id: b90c8f6
2017-04-28 08:04:22 -07:00
Julien Odent
0372f4f3da Volume for Croatian
Summary: Volume dimension for Croatian

Reviewed By: niteria

Differential Revision: D4957186

fbshipit-source-id: 63012ad
2017-04-28 08:04:22 -07:00
Julien Odent
0aa4aa56bb Distance for Croatian
Summary: Distance dimension for Croatian.

Reviewed By: niteria

Differential Revision: D4957067

fbshipit-source-id: 232ce30
2017-04-28 08:04:21 -07:00
Julien Odent
35b9101c48 VI: Time
Summary:
* Time dimension for Vietnamese.
* Expose `debugContext`.

Reviewed By: niteria

Differential Revision: D4963594

fbshipit-source-id: 2373735
2017-04-28 08:04:21 -07:00
Julien Odent
3314ddc7a4 VI: Ordinal
Summary: Ordinal for Vietnamese.

Reviewed By: niteria

Differential Revision: D4959285

fbshipit-source-id: 7212cc9
2017-04-28 08:04:21 -07:00
Julien Odent
0370c452f1 Time
Summary: Time dimension for Croatian.

Reviewed By: niteria

Differential Revision: D4954399

fbshipit-source-id: 906c4a6
2017-04-26 09:19:27 -07:00
Julien Odent
b32696f8eb AmountOfMoney
Summary: AmountOfMoney dimension for Croatian.

Reviewed By: niteria

Differential Revision: D4947584

fbshipit-source-id: a20670a
2017-04-26 09:19:27 -07:00
Julien Odent
0f98a42b03 Ordinal
Summary: Ordinal dimension for Croatian.

Reviewed By: niteria

Differential Revision: D4947244

fbshipit-source-id: 54bda8f
2017-04-26 09:19:27 -07:00
Julien Odent
840deda7dd Setup + Numeral
Summary: Setup + Numeral dimension for Croatian.

Reviewed By: niteria

Differential Revision: D4946964

fbshipit-source-id: 204429b
2017-04-26 09:19:26 -07:00
Julien Odent
f5f4889770 Ordinal
Summary: Ordinal dimension for Hebrew.

Reviewed By: niteria

Differential Revision: D4930162

fbshipit-source-id: 02545ae
2017-04-24 06:49:40 -07:00
Julien Odent
bd96d3dd95 Setup + Numeral
Summary: Setup for Hebrew + Numeral dimension

Reviewed By: niteria

Differential Revision: D4930041

fbshipit-source-id: 965132b
2017-04-24 06:49:40 -07:00