Commit Graph

142 Commits

Author SHA1 Message Date
RIAN DOUGLAS
884904b5ca Implement handling of "grand" for EN_?? locales
Summary: Add implementation of "a grand" and "<num> grand" for AU, BZ, CA, GB, IE, IN, JM, NZ, PH, TT, ZA locales. Some resolve to the local currency (AU, IN), others resolve to Dollar (NZ, PH).
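As a rough illustration of the locale split described above, here is a minimal sketch. The `Region` and `Currency` types, `grandCurrency`, and `grand` are hypothetical names invented for this example and are not Duckling's API; only the four regions whose behavior the summary spells out are modeled.

```haskell
-- Hypothetical, simplified types; Duckling's real types differ.
data Region = AU | IN | NZ | PH
  deriving (Eq, Show)

data Currency = AUD | INR | Dollar
  deriving (Eq, Show)

-- Currency chosen for "grand" in the regions the summary mentions:
-- AU and IN use their local currency, NZ and PH use the generic Dollar.
grandCurrency :: Region -> Currency
grandCurrency AU = AUD
grandCurrency IN = INR
grandCurrency NZ = Dollar
grandCurrency PH = Dollar

-- "a grand" (no count) is one thousand; "<num> grand" is num thousands.
grand :: Region -> Maybe Double -> (Double, Currency)
grand region count = (maybe 1 id count * 1000, grandCurrency region)
```

For example, `grand AU Nothing` models "a grand" in EN_AU and `grand PH (Just 5)` models "5 grand" in EN_PH.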

Reviewed By: patapizza

Differential Revision: D7943186

fbshipit-source-id: c71ab462fa9df0ee65223ee82dc2c98457a4e13b
2018-05-17 15:45:27 -07:00
Anshuman Chhabra
8237ef4503 Added Duration support to Hindi (HI)
Summary:
Hello!

I have added __Hindi (HI)__ support for the __Duration__ dimension in ```Duckling/Duration/Rules.hs```. I have also written tests in ```Duckling/Duration/Corpus.hs```; everything passes and is verified (`253 tests passed`).

Cheers!
Closes https://github.com/facebook/duckling/pull/194

Reviewed By: patapizza

Differential Revision: D7999571

Pulled By: chinmay87

fbshipit-source-id: deb5d60ba7f7ecbc2aa4d97ce5fb96e9bbe63b3d
2018-05-17 14:46:21 -07:00
RIAN DOUGLAS
963aa6b465 Implement handling of "grand" for CA and GB
Summary: Add implementation of "a grand" and "<num> grand" for EN_CA and EN_GB locales

Reviewed By: patapizza

Differential Revision: D7916643

fbshipit-source-id: 0cd55f17ec522c0334f48436a8a8cc19e0560b0b
2018-05-09 14:46:16 -07:00
Ziyang Liu
c84166d708 Add Quantity/ZH
Summary:
Fixes #181.
Closes https://github.com/facebook/duckling/pull/182

Reviewed By: patapizza

Differential Revision: D7891288

Pulled By: haoxuany

fbshipit-source-id: faa505dbf149c728545a92579cb2e92a89cd5cb8
2018-05-07 09:45:32 -07:00
RIAN DOUGLAS
f328821bbc Add support for using 'grand' to refer to 1,000's of currency in EN
Summary: Added support for "<number> grand" and "a grand" to resolve to thousands of the currency amount

Reviewed By: haoxuany

Differential Revision: D7688809

fbshipit-source-id: 72a81c7a1c48329f85c9f525b72b00c479a9edb0
2018-05-03 14:45:25 -07:00
Ziyang Liu
a3b35880e5 Change value in Entity to typed value instead of JSON
Summary:
Modified `Entity` to use the new `ResolvedVal` data type. Other changes follow naturally. Related issues: https://github.com/facebook/duckling/issues/121 and https://github.com/facebook/duckling/issues/172

Now one can pattern match on the output value, for instance:

```
{-# LANGUAGE GADTs #-}

import Data.Text
import Duckling.Core
import Duckling.Testing.Types
import qualified Duckling.PhoneNumber.Types as PN

parsePhoneNumber :: Text -> Text
parsePhoneNumber input =
  case value entity of
    (RVal PhoneNumber (PN.PhoneNumberValue v)) -> v
    where
    (entity:_) = parse input testContext testOptions [This PhoneNumber]
```

Reviewed By: patapizza

Differential Revision: D7502020

fbshipit-source-id: 76ba7b315cfd0d2c61ff95c855b7c95efc0a401c
2018-04-20 14:18:47 -07:00
Aaron Yue
4df76289bc Implement AmountOfMoney
Summary: implement AmountOfMoney rules and corpus for ZH

Reviewed By: patapizza

Differential Revision: D7508507

fbshipit-source-id: 3591b399a9880c5278587979c6576720343cc123
2018-04-17 16:45:32 -07:00
Aaron Yue
4ef255e577 Generalize and expand digit specifier usage for hanzi
Summary:
Generalize parsing of Chinese digit specifiers (十百千万亿), and add hanzi tests.
These digit specifiers can be parsed as (<num><speci>)<num>,
by using the multiplier value <num><speci> and a connect function that adds them together
(two cases: skipped digits, which require a 零 in between, and digits in consecutive positions).
Note that 个 is technically a digit specifier,
but in Chinese it is never used directly as a numeral specifier, only as a counter.
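
The two connect cases can be sketched as follows. This is a simplified stand-alone model, not the rules from this commit: it only covers 十, 百 and 千 (values below 10000), and `digit`, `specifier`, `groups`, `next` and `hanzi` are names made up for the example.

```haskell
-- Values of the plain digits 一..九.
digit :: Char -> Maybe Int
digit c = lookup c (zip "一二三四五六七八九" [1 ..])

-- Multipliers of the specifiers handled by this sketch.
specifier :: Char -> Maybe Int
specifier c = lookup c [('十', 10), ('百', 100), ('千', 1000)]

-- Parse a hanzi numeral below 10000 by adding up <num><speci> groups.
hanzi :: String -> Maybe Int
hanzi "" = Nothing
hanzi s = groups (< 10000) s

-- Parse the next group; `ok` says which multipliers are allowed here.
groups :: (Int -> Bool) -> String -> Maybe Int
groups _ [] = Just 0
groups ok [d]
  | ok 1, Just n <- digit d = Just n
groups ok (d : s : rest)
  | Just n <- digit d, Just m <- specifier s, ok m = (n * m +) <$> next m rest
groups ok ('十' : rest)
  | ok 10 = (10 +) <$> next 10 rest -- bare 十, as in 十五
groups _ _ = Nothing

-- Connect what follows a group with multiplier m.
next :: Int -> String -> Maybe Int
next _ [] = Just 0
next m ('零' : rest)
  | not (null rest) = groups (< m `div` 10) rest -- skipped positions need 零
next m rest = groups (== m `div` 10) rest -- consecutive position
```

With these definitions `hanzi "三百二十一"` uses only the consecutive case, `hanzi "三百零五"` uses the 零 case, and `hanzi "三百五"` is rejected because a position is skipped without 零.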

Reviewed By: zliu41

Differential Revision: D7424249

fbshipit-source-id: 20a85a7df1f908ee9879e92b904178fa26a9a5e5
2018-04-12 10:00:33 -07:00
Souvik Ghosh
ced73dcdcb AmountOfMoney NL Support
Summary:
Hello,
I have added new directory for AmountOfMoney NL Support
Added two files
- Duckling/AmountOfMoney/NL/Corpus.hs
- Duckling/AmountOfMoney/NL/Rules.hs

Updated File
- Duckling/Rules/NL.hs

Added test cases
- tests/Duckling/AmountOfMoney/NL/Tests.hs

Updated Test file
- tests/Duckling/AmountOfMoney/Tests.hs

Updated
duckling.cabal

Thanks for the review and the latest merge. Looking forward

Regards
Closes https://github.com/facebook/duckling/pull/173

Reviewed By: JonCoens

Differential Revision: D7592192

Pulled By: patapizza

fbshipit-source-id: 5895c29bf7f1033e4ffd791d5915a16d230e9375
2018-04-11 17:15:30 -07:00
Julien Odent
120f569ed0 Time/EN: Fix "in" + year
Summary: Now that years are latent, let's absorb the "in" and make them not latent.

Reviewed By: zliu41

Differential Revision: D7587599

fbshipit-source-id: 61a19ac389244df491591d78c28f0301f9124439
2018-04-11 11:00:30 -07:00
Aaron Yue
878bcb9277 Show input for ambiguous parse failures
Summary: show the input that an ambiguous parse is failing at

Reviewed By: patapizza

Differential Revision: D7502191

fbshipit-source-id: 9f0fbf8301413d9007236ba5b6af1f4b41c20269
2018-04-04 10:30:32 -07:00
Aaron Yue
3629fbd503 Prompt ambiguous parses in corpus tests
Summary:
During ranking, due to how candidates are ordered, it is entirely possible for multiple correct candidates
to have the exact same rank (equal range and exactly equal score). In this case `analyze` returns all of them, which gets
misinterpreted as having multiple tokens in the output rather than multiple solutions. This change checks for this case and gives the
correct prompt for ambiguous parses.
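
A minimal sketch of the tie check, under the assumption that a candidate can be reduced to a range, a score and a resolved value. `Candidate`, `Outcome` and `classify` are hypothetical names for this example, not the corpus test helpers changed here.

```haskell
-- Hypothetical candidate: covered range, score, and resolved value.
data Candidate = Candidate
  { range :: (Int, Int)
  , score :: Double
  , resolved :: String
  } deriving (Eq, Show)

data Outcome = NoParse | Unique Candidate | Ambiguous [Candidate]
  deriving (Eq, Show)

-- Winners are the candidates tied on (range length, score).
-- More than one winner means an ambiguous parse, not several tokens.
classify :: [Candidate] -> Outcome
classify [] = NoParse
classify cs =
  case filter ((== best) . key) cs of
    [c] -> Unique c
    tied -> Ambiguous tied
  where
    key c = (snd (range c) - fst (range c), score c)
    best = maximum (map key cs)
```

Comparing scores with exact equality is deliberate in this sketch, since the summary describes ties as having exactly equal scores.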

Reviewed By: patapizza

Differential Revision: D7489391

fbshipit-source-id: b66947e37bddb3ac6273843dd79b559aff9d0083
2018-04-03 17:00:27 -07:00
Ziyang Liu
22d11c055b Add ZH distance support
Summary: Add ZH distance support

Reviewed By: patapizza

Differential Revision: D7341977

fbshipit-source-id: 301d90515c492da9fa4c4faf5bcc3351eacbed1b
2018-03-21 12:15:28 -07:00
hend
f70b4ebe07 add distance to SV
Summary: Closes https://github.com/facebook/duckling/pull/168

Reviewed By: chinmay87

Differential Revision: D7322559

Pulled By: patapizza

fbshipit-source-id: 3d4b86068c05a7a82442bc7c182b80b5cf75ec90
2018-03-21 11:15:25 -07:00
yasen-yankov
21c3b32e4d Added TimeGrain, Duration, and Ordinal
Summary: Closes https://github.com/facebook/duckling/pull/164

Reviewed By: chinmay87

Differential Revision: D7280110

Pulled By: patapizza

fbshipit-source-id: d98ddd900fe83f06b28afd39ea3311f42716288c
2018-03-19 18:00:29 -07:00
Giri Anantharaman
519c9519a3 Support Tamil numerals
Summary:
* Setup Tamil (TA) language
* Added Numeral Dimension

Reviewed By: patapizza

Differential Revision: D7323636

fbshipit-source-id: 4b1a42197ff4799880cded9ce86b8d7fae1507bc
2018-03-19 16:45:36 -07:00
Chinmay Deshmukh
5ac990bbe2 Return latent entities
Summary: Add an option to return latent time entities. This can be used when one is pretty certain that the input contains a datetime.
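
A minimal sketch of what such an option amounts to, assuming entities carry a latent flag. The `Entity` and `Options` records below are simplified stand-ins written for this example, and `keep` is a hypothetical helper.

```haskell
-- Simplified stand-ins for illustration only.
data Entity = Entity
  { body :: String
  , latent :: Bool
  } deriving (Eq, Show)

newtype Options = Options { withLatent :: Bool }

-- Latent entities are dropped unless the caller opts in.
keep :: Options -> [Entity] -> [Entity]
keep opts = filter (\e -> withLatent opts || not (latent e))
```

With the option off, a bare "2018" marked latent would be filtered out; with it on, the same entity is returned.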

Reviewed By: patapizza

Differential Revision: D7254245

fbshipit-source-id: e9e0503cace2691804056fcebdc18fd9090fb181
2018-03-19 14:45:27 -07:00
yasen-yankov
1493d44465 Added BG distance
Summary: Closes https://github.com/facebook/duckling/pull/162

Reviewed By: chinmay87

Differential Revision: D7239243

Pulled By: patapizza

fbshipit-source-id: a5518219a67fa46bb06e97eee3dfd07ab683162f
2018-03-13 15:45:32 -07:00
Ezgi Çiçek
2254034e62 Add Node to Entity to pass along the parse tree information
Summary:
- add `Node` field to `Entity`
- ignore `Node` field of `Entity` for toJSON for now (will be fixed later)
- change `Debug.hs` so that we print the respective Entity's Node
- add wildcard for the new Node field in `parseTest` of `Api/Tests.hs`

Reviewed By: patapizza

Differential Revision: D7174696

fbshipit-source-id: 240e4c53b72323b500ac58a74f873ce247bb3387
2018-03-07 10:45:28 -08:00
Panagiotis Vekris
169df6b46b ruleSkipHundreds only accepts alphabetic numerals
Summary:
Before, duckling would parse `"1 2 2"` as the three digit number 122 through
`ruleSkipHundreds`. This, however, allowed the string `"Pay Kiran1 10eur"` to be
parsed as "110 EUR", which was reported in https://github.com/facebook/duckling/issues/159.

This change only accepts alphabetic numerals to pass through `ruleSkipHundreds`
(for example `"one twenty two"`), which presumably was the original intention of
this rule. This fixes the above issue, without any change in Corpus.
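
The restriction can be sketched like this. It is a simplified model and not the actual `ruleSkipHundreds`: the `Numeral` record, `isAlphabetic` and `skipHundreds` are hypothetical, and the rule is reduced to a hundreds digit followed by a two-digit value.

```haskell
import Data.Char (isDigit)

-- Hypothetical numeral token: surface text and parsed value.
data Numeral = Numeral
  { surface :: String
  , val :: Int
  }

-- A numeral is alphabetic when its surface text contains no digits.
isAlphabetic :: Numeral -> Bool
isAlphabetic = not . any isDigit . surface

-- "one twenty two" -> 122, but "1" followed by "10" is not composed.
skipHundreds :: Numeral -> Numeral -> Maybe Int
skipHundreds h r
  | isAlphabetic h && isAlphabetic r
  , val h >= 1, val h <= 9
  , val r >= 10, val r <= 99 = Just (val h * 100 + val r)
  | otherwise = Nothing
```

Here `skipHundreds (Numeral "1" 1) (Numeral "10" 10)` is rejected, which corresponds to the `"Kiran1 10eur"` case from the issue.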

Reviewed By: patapizza

Differential Revision: D7151934

fbshipit-source-id: 024a7a0b6a53beb3a0d42d4bb7f542ce3b05726b
2018-03-06 17:00:29 -08:00
Julien Odent
aed5b8f779 Numeral: don't compose negative numbers, fix double negatives
Summary:
* added `isMultipliable` helper and used that in patterns along with `isPositive`
* fixed double negatives in most languages
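
A rough sketch of how such guards prevent bad compositions. The predicates below share their names with the helpers mentioned above but are this example's own simplified definitions over a hypothetical `Numeral` record; the real helpers operate on Duckling tokens and may define positivity differently.

```haskell
-- Hypothetical numeral: a value plus a flag for multipliers like "hundred".
data Numeral = Numeral
  { value :: Double
  , multipliable :: Bool
  } deriving (Eq, Show)

-- This sketch treats strictly positive values as positive (an assumption).
isPositive :: Numeral -> Bool
isPositive = (> 0) . value

isMultipliable :: Numeral -> Bool
isMultipliable = multipliable

-- "minus <n>" only applies to positive input, so it cannot apply twice.
negative :: Numeral -> Maybe Numeral
negative n
  | isPositive n = Just (n { value = negate (value n) })
  | otherwise = Nothing

-- "<n> hundred": never compose a negative number with a multiplier.
multiply :: Numeral -> Numeral -> Maybe Numeral
multiply n m
  | isPositive n && isMultipliable m = Just (Numeral (value n * value m) False)
  | otherwise = Nothing
```

Chaining `negative` twice yields `Nothing`, which is the double-negative case.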

Reviewed By: niteria

Differential Revision: D7034982

fbshipit-source-id: a0bb67056d3107167830ece0c34d761c5563c5a7
2018-02-21 12:02:45 -08:00
bidhan-a
43079e7113 Setup Nepali (NE) and add Numeral dimension
Summary:
- Setup Nepali (NE) language
- Add basic Numeral dimension
Closes https://github.com/facebook/duckling/pull/156

Reviewed By: JonCoens

Differential Revision: D6965558

Pulled By: patapizza

fbshipit-source-id: f46c9b104d4345f20bd0cf53f8c9c8754855f314
2018-02-13 07:45:31 -08:00
Ashwini Reddy Challa
c8501d3e85 Added HI Ordinals
Summary: Added ordinal support for Hindi

Reviewed By: patapizza

Differential Revision: D6944156

fbshipit-source-id: eb5da698e5cccde9a1cc31adf7bc433b89e07454
2018-02-09 13:15:30 -08:00
Abdallatif Sulaiman
0c395f7a1b fix Valentine's Day regex in arabic
Summary: Closes https://github.com/facebook/duckling/pull/146

Reviewed By: adelnobel

Differential Revision: D6834246

Pulled By: patapizza

fbshipit-source-id: ba90be16e877f9a39162820f1071ce7a6d6da6d7
2018-02-05 14:30:34 -08:00
Julien Odent
b00fa512af AmountOfMoney: fixes
Summary:
* don't recursively compose cents
* don't allow decreasing ranges
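
The second fix can be pictured with a small guard on range construction. `Money` and `interval` are hypothetical names for this sketch, and requiring both ends to share a currency is an extra assumption of the example.

```haskell
-- Hypothetical amount with a currency code.
data Money = Money
  { amount :: Double
  , currency :: String
  } deriving (Eq, Show)

-- Build a range only when it increases, e.g. "5 to 10 dollars".
interval :: Money -> Money -> Maybe (Money, Money)
interval from to
  | currency from == currency to && amount from < amount to = Just (from, to)
  | otherwise = Nothing
```

Under this guard "10 to 5 dollars" produces no range entity.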

Reviewed By: blandinw

Differential Revision: D6849132

fbshipit-source-id: ed6ca30388642c21e677a628971747a4fb3dfbef
2018-01-30 14:45:36 -08:00
Julien Odent
da36ab8a80 Time: fix empty values for past time
Summary: "yesterday" would resolve to an entity without any `values`.

Reviewed By: JonCoens

Differential Revision: D6697432

fbshipit-source-id: 7b15727f92703842a2995210fdeb99c00be74bc3
2018-01-10 15:30:32 -08:00
Abdallatif Sulaiman
89822776c6 Added Amount of Money Dimension to Arabic language
Summary: Closes https://github.com/facebook/duckling/pull/134

Reviewed By: JonCoens

Differential Revision: D6649920

Pulled By: patapizza

fbshipit-source-id: 9c647c84f5ae4f3dc26cb0c7aa74abb097ea001a
2018-01-02 14:00:51 -08:00
Abdallatif Sulaiman
2d726d3837 Added Quantity Dimension to Arabic language
Summary: Closes https://github.com/facebook/duckling/pull/127

Reviewed By: panagosg7

Differential Revision: D6616298

Pulled By: patapizza

fbshipit-source-id: dc774ff1e870bbd083a9cca8ee6f75db852afce9
2017-12-20 18:00:43 -08:00
Abdallatif Sulaiman
c056a0b46a Added Temperature Dimension to Arabic language
Summary: Closes https://github.com/facebook/duckling/pull/126

Reviewed By: panagosg7

Differential Revision: D6614995

Pulled By: patapizza

fbshipit-source-id: 35ff142a3fc8a498d9abbb80d28dc9be6cdcbc4d
2017-12-20 18:00:43 -08:00
Abdallatif Sulaiman
380457db8f Added Volume Dimension to Arabic language
Summary: Closes https://github.com/facebook/duckling/pull/125

Reviewed By: panagosg7

Differential Revision: D6614574

Pulled By: patapizza

fbshipit-source-id: 054ed04e0cc1cf79be31340ebb8b7fea3bc67f57
2017-12-20 18:00:43 -08:00
Abdallatif Sulaiman
1393098bcc Added Time Dimension to Arabic
Summary:
Hi, in this pr:
* Added time dimension to Arabic language, thanks to Hussein-Dahir & Yazeed-Obaid for writing time corpus.
* Fixed some bugs in numeral & ordinals and added more test cases for them.

Also, I don't really understand why we use classifiers in the time dimension.
Closes https://github.com/facebook/duckling/pull/123

Reviewed By: blandinw

Differential Revision: D6583313

Pulled By: patapizza

fbshipit-source-id: f7acdef0c032d7b7fd7d224832fdaf484d2df825
2017-12-19 14:30:42 -08:00
Newinfinite007
c133bad24a Hindi Language Numeral Dimension(minimalistic model). Tests passed.
Summary: Closes https://github.com/facebook/duckling/pull/119

Reviewed By: JonCoens

Differential Revision: D6597628

Pulled By: patapizza

fbshipit-source-id: 8bac0f686d6cecc38d9998e37042fe48f73530dc
2017-12-19 13:15:30 -08:00
Zahar Shimanchik
2274e40369 Added AmountOfMoney Dimension to Russian language
Summary: Closes https://github.com/facebook/duckling/pull/120

Reviewed By: patapizza

Differential Revision: D6520302

Pulled By: panagosg7

fbshipit-source-id: 2c9ac6e15ca3ee90b2bf5911ee62835966fcacd1
2017-12-12 12:15:30 -08:00
Julien Odent
6df3b26707 Numeral: common rule + supporting hindu-arabic numerals for Burmese
Summary:
* `ruleIntegerNumeric` was used in all languages but Burmese.
* Hindu-Arabic numerals seem to be slowly making their way into Burmese (e.g. recent car plates)
* Moving the rule in `Duckling/Numeral/Common.hs`

Reviewed By: blandinw

Differential Revision: D6498349

fbshipit-source-id: e868dc9960f18f0781e4aa98a0dfcd14969537c9
2017-12-06 16:00:28 -08:00
Panagiotis Vekris
12a726aee7 Support for Greek times and dates
Summary:
This adds support for greek times.

There are still some issues with expressions of the form:
```
9:30 - 11:00 την πέμπτη
```
Where `11:00 την πέμπτη` is parsed first (as 11:00 on Thu), instead of prioritizing `9:30 - 11:00` as the training data suggests. These tests are for the moment excluded from the corpus.

Reviewed By: patapizza

Differential Revision: D6376271

fbshipit-source-id: 2f31e058fb88386429070e3b51cd33f93b9c5936
2017-12-04 16:45:40 -08:00
igor-drozdov
29d776dee5 Added TimeGrain and Duration Dimensions to Russian language
Summary:
- Added Duration dimension to Russian language
- Added TimeGrain dimension to Russian language
- Refactored isNatural and isNaturalWith out of Duration helpers into Numeral helpers
- Implemented <integer> and a half rule for Russian Numeral
- Changed the type of inSeconds to polymorphic one
Closes https://github.com/facebook/duckling/pull/105

Reviewed By: blandinw

Differential Revision: D6312604

Pulled By: patapizza

fbshipit-source-id: 9ae237b4beb6915ff8da013230457937d8e56733
2017-11-15 10:45:24 -08:00
Igor Drozdov
f6492b5da0 Added Quantity dimension to Russian language
Summary: Closes https://github.com/facebook/duckling/pull/106

Reviewed By: blandinw

Differential Revision: D6312605

Pulled By: patapizza

fbshipit-source-id: 69ec673f95ec8a2d86ec207a6d75cd8ebfcdb4f6
2017-11-14 21:00:28 -08:00
Julien Odent
fb10a6e6ba Time/EN: Parse spelled out times + AM/PM
Summary:
When using speech recognition, we might see things like "six thirty six a m" or
"ten thirty p m".
Also fixed the argument order of `timeOfDayAMPM` to be more idiomatic.

Reviewed By: blandinw

Differential Revision: D6316542

fbshipit-source-id: 0008c049040219b3a1dd80d9e4661ba8a246fa7f
2017-11-14 13:30:26 -08:00
Panagiotis Vekris
536f2844e3 Support Greek ordinals
Summary: Adding support for Greek ordinals

Reviewed By: patapizza

Differential Revision: D6263781

fbshipit-source-id: ff339ee51e4e8ad6b0c8f3fa75f5652391dbe48e
2017-11-08 11:00:31 -08:00
Igor Drozdov
5d2c5c78ba Added Distance and Volume Dimensions for Russian language
Summary:
- Added Distance Dimension for Russian language (RU)
- Added Volume Dimension for Russian language (RU)
- Extended `Duckling.Distance.Types.Unit` type definition by adding `Millimetre` representation
Closes https://github.com/facebook/duckling/pull/101

Reviewed By: JonCoens

Differential Revision: D6254070

Pulled By: patapizza

fbshipit-source-id: 579f7a259f76ff1c23ccfe2371afea385eb56aa1
2017-11-08 11:00:31 -08:00
Panagiotis Vekris
e8937e1cd6 Support for Greek durations
Summary: Adding support for Greek time grains and durations.

Reviewed By: patapizza

Differential Revision: D6249955

fbshipit-source-id: 1c69e26
2017-11-06 18:49:36 -08:00
Julien Odent
ba46d592cd Prevent double negatives + cleanup
Summary:
* Prevent double negatives (resulting from `ruleNegative` applying twice and from engine tokenizer)
* Hashmap lookup for tens
* cleanup

Reviewed By: blandinw

Differential Revision: D6221107

fbshipit-source-id: 42e401d
2017-11-03 00:31:22 -07:00
Panagiotis Vekris
fda8c7c759 Support Greek numerals
Summary:
- Setup Greek language (EL)
- Added Greek Numerals

Reviewed By: patapizza

Differential Revision: D6217873

fbshipit-source-id: 379170f
2017-11-02 17:16:18 -07:00
Julien Odent
f0a0c1e6b8 Time/EN: don't parse "this in 2 minutes" + fix thanksgiving in EN locales
Summary:
* add flag for this/next/last time
* fix thanksgiving in EN locales
* `analyzedRangeTest` helper with `rangeTests` for `Time/EN`

Reviewed By: blandinw

Differential Revision: D6191209

fbshipit-source-id: 6eaa117
2017-10-31 12:34:21 -07:00
Sam Pepose
251028e691 Uses correct date format for all EN locales
Summary: The date format changes between EN locales (https://en.wikipedia.org/wiki/Date_format_by_country). This diff fixes how dates are handled in each locale.
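
To make the difference concrete, here is a minimal sketch of reading a slash-separated date under two regions. `Region` and `readSlashDate` are hypothetical names, only US (month first) and GB (day first) are modeled, and validation is limited to coarse bounds.

```haskell
data Region = US | GB
  deriving (Eq, Show)

-- Interpret "a/b" as (month, day) according to the region's convention.
readSlashDate :: Region -> Int -> Int -> Maybe (Int, Int)
readSlashDate region a b = validate ordered
  where
    ordered = case region of
      US -> (a, b) -- MM/DD
      GB -> (b, a) -- DD/MM
    validate (m, d)
      | m >= 1 && m <= 12 && d >= 1 && d <= 31 = Just (m, d)
      | otherwise = Nothing
```

So "3/4" is March 4 for `US` and April 3 for `GB`, while "13/1" is only a valid date for `GB`.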

Reviewed By: patapizza

Differential Revision: D6156147

fbshipit-source-id: 22f296c
2017-10-26 08:49:21 -07:00
Julien Odent
980c0f279e Time: don't parse at + phone number
Summary: Fixes #95.

Reviewed By: blandinw

Differential Revision: D6129893

fbshipit-source-id: e863021
2017-10-23 16:34:34 -07:00
Matthijs Mullender
1ade1935b2 Support Dutch dates and times
Summary: [Duckling][Time][NL] Support Dutch dates and times

Reviewed By: patapizza

Differential Revision: D6090294

fbshipit-source-id: 54b8729
2017-10-19 14:04:38 -07:00
Abdallatif Sulaiman
18cd2210ac Add Duration Dimension to Arabic Language
Summary: Closes https://github.com/facebookincubator/duckling/pull/94

Reviewed By: blandinw

Differential Revision: D6078221

Pulled By: patapizza

fbshipit-source-id: b653b24
2017-10-17 16:04:28 -07:00
Julien Odent
b2de97800f Numeral/FR: allow space as a thousand separator
Summary: Fixes #91.

Reviewed By: blandinw

Differential Revision: D6079821

fbshipit-source-id: f3160c1
2017-10-17 12:20:32 -07:00
Julien Odent
0e950620b8 Email: restrict domain extensions to letters when spelling out
Summary: We would parse things like "tonight at 6.40".
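
A minimal sketch of the restriction, assuming a spelled-out email has the shape "<local> at <domain>.<extension>". `spelledEmail` is a hypothetical function built on plain string handling rather than the regexes used in the rules.

```haskell
import Data.Char (isAlpha)

-- Accept "<local> at <host>" only when the host ends in a
-- letters-only extension, so "tonight at 6.40" is not an email.
spelledEmail :: String -> Maybe String
spelledEmail input =
  case words input of
    [local, "at", host] ->
      case break (== '.') (reverse host) of
        (revExt, '.' : revDomain)
          | not (null revExt), all isAlpha revExt, not (null revDomain) ->
              Just (local ++ "@" ++ host)
        _ -> Nothing
    _ -> Nothing
```

With this check `spelledEmail "john at example.com"` succeeds and `spelledEmail "tonight at 6.40"` does not.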

Reviewed By: blandinw

Differential Revision: D6066926

fbshipit-source-id: d18a8c6
2017-10-16 12:19:22 -07:00
Julien Odent
1ab5f447d2 en_CA + fix Canadian Thanksgiving
Summary:
* `en_CA` locale
* In Canada, Thanksgiving Day is the second Monday of October.
* Black Friday is the same as the US.
* However Canada observes both DDMM and MMDD formats. Defer to later, falling back to US.
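
The holiday rule itself is simple date arithmetic. The sketch below computes it with a self-contained day-of-week formula (Sakamoto's method); the function names are hypothetical, and the US rule (fourth Thursday of November) is included only for contrast.

```haskell
-- Day of week for a Gregorian date, 0 = Sunday .. 6 = Saturday.
dayOfWeek :: Int -> Int -> Int -> Int
dayOfWeek year month day =
  (y + y `div` 4 - y `div` 100 + y `div` 400 + offsets !! (month - 1) + day) `mod` 7
  where
    y = if month < 3 then year - 1 else year
    offsets = [0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4]

-- Day of the month of the n-th given weekday in a month.
nthWeekday :: Int -> Int -> Int -> Int -> Int
nthWeekday n weekday year month =
  1 + (weekday - dayOfWeek year month 1) `mod` 7 + 7 * (n - 1)

-- Second Monday of October, as (month, day).
canadianThanksgiving :: Int -> (Int, Int)
canadianThanksgiving year = (10, nthWeekday 2 1 year 10)

-- Fourth Thursday of November, as (month, day).
usThanksgiving :: Int -> (Int, Int)
usThanksgiving year = (11, nthWeekday 4 4 year 11)
```

For 2017 this gives October 9 for Canada and November 23 for the US.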

Reviewed By: blandinw

Differential Revision: D6058909

fbshipit-source-id: 3d4e05e
2017-10-16 10:04:43 -07:00
Julien Odent
fb1dcaa138 Chinese locales + fix TW National Day
Summary:
* Moving `ruleNationalDay` from `ZH` rules to specific locales: `zh_CN`, `zh_HK`, `zh_MO`
* Fixed National Day for `zh_TW`.

Reviewed By: blandinw

Differential Revision: D6057565

fbshipit-source-id: 8f9f2ab
2017-10-13 17:04:43 -07:00
Matthijs Mullender
33a08bb76b Support Dutch Durations
Summary:
This change adds support for durations in Dutch/Netherlands (NL)
Implemented: TimeGrain/NL, Durations/NL

Reviewed By: patapizza

Differential Revision: D6049404

fbshipit-source-id: 3621cdb
2017-10-13 12:49:30 -07:00
Julien Odent
ab0ad0256e Locales support
Summary:
* Locales support for the library, following `<Lang>_<Region>` with ISO 639-1 code for `<Lang>` and ISO 3166-1 alpha-2 code for `<Region>` (#33)
* `Locale` opaque type (composite of `Lang` and `Region`) with `makeLocale` smart constructor to only allow valid `(Lang, Region)` combinations
* API: `Context`'s `lang` parameter has been replaced by `locale`, with optional `Region` and backward compatibility.
*  `Rules/<Lang>.hs` exposes
  - `langRules`: cross-locale rules for `<Lang>`, from `<Dimension>/<Lang>/Rules.hs`
  - `localeRules`: locale-specific rules, from `<Dimension>/<Lang>/<Region>/Rules.hs`
  - `defaultRules`: `langRules` + specific rules from select locales to ensure backward-compatibility
* Corpus, tests & classifiers
  - 1 classifier per locale, with default classifier (`<Lang>_XX`) when no locale provided (backward-compatible)
  - Default classifiers are built on existing corpus
  - Locale classifiers are built on
  - `<Dimension>/<Lang>/Corpus.hs` exposes a common `corpus` to all locales of `<Lang>`
  - `<Dimension>/<Lang>/<Region>/Corpus.hs` exposes `allExamples`: a list of examples specific to the locale (following `<Dimension>/<Lang>/<Region>/Rules.hs`).
  - Locale classifiers use the language corpus extended with the locale examples as training set.
  - Locale examples need to use the same `Context` (i.e. reference time) as the language corpus.
  - For backward compatibility, `<Dimension>/<Lang>/Corpus.hs` can expose also `defaultCorpus`, which is `corpus` augmented with specific examples. This is controlled by `getDefaultCorpusForLang` in `Duckling.Ranking.Generate`.
  - Tests run against each classifier to make sure runtime works as expected.
* MM/DD (en_US) vs DD/MM (en_GB) example to illustrate
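
The smart-constructor idea can be sketched as below. This is a toy model with a hypothetical table of allowed regions; falling back to a language-only locale for an invalid pair is this sketch's choice, and in a real module the `Locale` constructor would be left out of the export list to keep the type opaque.

```haskell
data Lang = EN | FR
  deriving (Eq, Show)

data Region = GB | US
  deriving (Eq, Show)

-- Language plus an optional region.
data Locale = Locale Lang (Maybe Region)
  deriving (Eq, Show)

-- Hypothetical table of valid regions per language.
allowedRegions :: Lang -> [Region]
allowedRegions EN = [GB, US]
allowedRegions FR = []

-- Only valid (Lang, Region) pairs keep their region.
makeLocale :: Lang -> Maybe Region -> Locale
makeLocale lang (Just region)
  | region `elem` allowedRegions lang = Locale lang (Just region)
makeLocale lang _ = Locale lang Nothing
```

`makeLocale EN (Just GB)` models `en_GB`, while `makeLocale FR (Just GB)` degrades to plain `FR` in this sketch.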

Reviewed By: JonCoens, blandinw

Differential Revision: D6038096

fbshipit-source-id: f29c28d
2017-10-13 08:34:21 -07:00
Julien Odent
1eea25049f Time/RO: don't parse 'sa' as Saturday
Summary: In Romanian, `sa` is fairly common: hai sa ne vedem (let's see), hai sa mergem (let's go).

Reviewed By: blandinw

Differential Revision: D5801345

fbshipit-source-id: db677e4
2017-09-09 11:04:57 -07:00
Julien Odent
b954380937 ES/Ordinal: Fixes + tests
Summary:
* fixes '1st' variants (e.g. primeros, primera)
* fixes accents

Reviewed By: JonCoens

Differential Revision: D5772079

fbshipit-source-id: 6a09d79
2017-09-06 10:19:31 -07:00
Stepan Parunashvili
6f774abe38 georgian numeral support
Summary: Introducing Georgian (KA), and the very beginnings of numeral support

Reviewed By: patapizza

Differential Revision: D5757952

fbshipit-source-id: 89d05f8
2017-09-05 12:19:29 -07:00
dubovinszky
60565c15aa HU Time, TimeGrain
Summary: Closes https://github.com/facebookincubator/duckling/pull/83

Reviewed By: blandinw

Differential Revision: D5681515

Pulled By: patapizza

fbshipit-source-id: 918d0a4
2017-08-22 19:34:33 -07:00
Daniel Kantor
5cad4359e2 Added HU Ordinals
Summary: Closes https://github.com/facebookincubator/duckling/pull/82

Reviewed By: JonCoens

Differential Revision: D5631927

Pulled By: patapizza

fbshipit-source-id: d68b238
2017-08-16 11:19:24 -07:00
Veselin Stoyanov
e9b1c8932a Added AmountOfMoney dimension to Bulgarian language
Summary:
- Added AmountOfMoney dimension to Bulgarian language
Closes https://github.com/facebookincubator/duckling/pull/80

Reviewed By: JonCoens

Differential Revision: D5606699

Pulled By: patapizza

fbshipit-source-id: c18f5d4
2017-08-14 09:34:36 -07:00
dubovinszky
24d3f19976 HU Setup + Numeral
Summary:
- Setup Hungarian (HU) language
- Added Numeral Dimension
Closes https://github.com/facebookincubator/duckling/pull/79

Reviewed By: blandinw

Differential Revision: D5595812

Pulled By: patapizza

fbshipit-source-id: 5959938
2017-08-09 17:49:56 -07:00
Veselin Stoyanov
5d03b45af9 Setup Bulgarian language and Numeral Dimension
Summary:
- Setup Bulgarian (BG) language
- Added Numeral Dimension
Closes https://github.com/facebookincubator/duckling/pull/78

Reviewed By: niteria

Differential Revision: D5575513

Pulled By: patapizza

fbshipit-source-id: e566155
2017-08-09 08:19:24 -07:00
Julien Odent
61800297c8 Time/PL: don't parse 'nie' as Time
Summary: 'nie' means 'no' in Polish, and isn't a common abbreviation for 'niedziela' (Sunday).

Reviewed By: blandinw

Differential Revision: D5587036

fbshipit-source-id: bfda7fc
2017-08-08 16:49:58 -07:00
Julien Odent
ef461c3133 Time/FR: don't parse 'a un'
Summary:
In French, the form "at hh" is not valid (it requires an hour indicator).
This fixes false positives such as in "John a un rendez-vous."

Fixes https://github.com/wit-ai/wit/issues/666.

Reviewed By: JonCoens

Differential Revision: D5530713

fbshipit-source-id: ecee1e5
2017-08-01 08:49:41 -07:00
Şeref R.Ayar
8711df5047 change json response #12
Summary:
not sure about this. Maybe I need some guidance.
Closes https://github.com/facebookincubator/duckling/pull/42

Reviewed By: blandinw

Differential Revision: D5228520

Pulled By: patapizza

fbshipit-source-id: 4f99cc5
2017-06-12 15:19:22 -07:00
Şeref R.Ayar
ba26ca7e91 Volume for TR
Summary: Closes https://github.com/facebookincubator/duckling/pull/34

Reviewed By: niteria

Differential Revision: D5168380

Pulled By: patapizza

fbshipit-source-id: 31d0a11
2017-06-02 12:49:20 -07:00
Şeref R.Ayar
b69874cd9f Duration for TR
Summary: Closes https://github.com/facebookincubator/duckling/pull/32

Reviewed By: niteria

Differential Revision: D5150778

Pulled By: patapizza

fbshipit-source-id: d156b0a
2017-05-31 02:19:40 -07:00
serefayar
92a3e16886 Temperature for TR
Summary: Closes https://github.com/facebookincubator/duckling/pull/30

Reviewed By: niteria

Differential Revision: D5147114

Pulled By: patapizza

fbshipit-source-id: 804f623
2017-05-30 09:34:17 -07:00
Şeref R.Ayar
6de7c2142b Distance for TR
Summary: Closes https://github.com/facebookincubator/duckling/pull/26

Reviewed By: niteria

Differential Revision: D5112142

Pulled By: patapizza

fbshipit-source-id: d71f654
2017-05-23 10:49:18 -07:00
Sebastian Mika
b00e5faeac Fix DE numerical ordinal matching
Summary:
The numerical ordinal matching rule in DE is too broad. An ordinal like "1." may not be preceded or followed by numbers.

* Added negative lookbehind - avoids matching the first "1." in "1.1" as an ordinal.
* Added negative lookahead - avoids matching the second "1." in "1.1." as an ordinal.
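
The effect of the two lookarounds can be approximated without a regex engine. The sketch below is an assumption about the intended behavior, not the pattern from this pull request: a digit run followed by "." counts as an ordinal only if the run is not directly preceded by "." and the "." is not directly followed by a digit. `ordinals` is a hypothetical name.

```haskell
import Data.Char (isDigit)

-- Collect numeric ordinals such as "1." from the input.
ordinals :: String -> [Int]
ordinals = go ' '
  where
    -- `prev` is the character just before the current position.
    go _ [] = []
    go prev s@(c : cs)
      | isDigit c =
          let (digits, rest) = span isDigit s
           in case rest of
                ('.' : after)
                  | prev /= '.' && not (startsWithDigit after) ->
                      read digits : go '.' after
                _ -> go (last digits) rest
      | otherwise = go c cs
    startsWithDigit = any isDigit . take 1
```

Under these assumptions "am 1. Mai" yields one ordinal, while "1.1" and "1.1." yield none.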
Closes https://github.com/facebookincubator/duckling/pull/18

Reviewed By: patapizza

Differential Revision: D5069200

Pulled By: niteria

fbshipit-source-id: 0583076
2017-05-16 09:49:21 -07:00
rfranek@email.cz
325fa69304 added Distance for CZ language
Summary:
Added first Czech language file
Closes https://github.com/facebookincubator/duckling/pull/16

Reviewed By: niteria

Differential Revision: D5044499

Pulled By: patapizza

fbshipit-source-id: c736a35
2017-05-12 08:19:20 -07:00
Julien Odent
37829902b7 CS: Setup + basic Numeral
Summary:
* Setup for Czech
* Basic `Numeral` (0-10 integers + digits) from http://www.omniglot.com/language/numbers/czech.htm

Reviewed By: JonCoens

Differential Revision: D5044775

fbshipit-source-id: b5cd9d2
2017-05-11 09:49:27 -07:00
Matteo
e11014dc4b Volume for IT lang
Summary:
I notice that there are several missing dimensions for the IT language: this patch is for the Volume dimension

Regards
Matteo
Closes https://github.com/facebookincubator/duckling/pull/4

Reviewed By: JonCoens

Differential Revision: D4986389

Pulled By: patapizza

fbshipit-source-id: 314d33e
2017-05-02 11:19:14 -07:00
Julien Odent
d3d3703015 HE: Time
Summary:
Time dimension for Hebrew.
Commented out the failing tests that actually also fail in Clojure.

Reviewed By: JonCoens

Differential Revision: D4970308

fbshipit-source-id: b455142
2017-04-28 10:04:35 -07:00
Julien Odent
ab2c89df4f IT: Temperature
Summary: Temperature dimension for Italian.

Reviewed By: JonCoens

Differential Revision: D4970338

fbshipit-source-id: 024802e
2017-04-28 10:04:35 -07:00
Bartosz Nitka
74936df848 Make matching anywhere vs at pos obvious
Summary:
This change refactors the Engine to use a different
code path for when we're calling `lookupItem` to find
a first token `Node` matching the rule and a different
one for subsequent ones.

This division lets us get better invariants and more importantly
do full text regexp matches only when necessary.

This should be particularly useful for longer texts.
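
The distinction between the two code paths can be illustrated with plain strings standing in for regexes. `matchAnywhere` and `matchAt` are hypothetical names for this sketch and do not correspond to functions in the Engine.

```haskell
import Data.List (isPrefixOf, tails)

-- First item of a rule: scan the whole text for every start position.
matchAnywhere :: String -> String -> [Int]
matchAnywhere needle text =
  [i | (i, suffix) <- zip [0 ..] (tails text), needle `isPrefixOf` suffix]

-- Subsequent items: only check at the position where the previous
-- match ended, without rescanning the text.
matchAt :: String -> Int -> String -> Bool
matchAt needle pos text = needle `isPrefixOf` drop pos text
```

The second form does a bounded amount of work per call, which is why separating it out matters more as the input grows.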

Reviewed By: patapizza

Differential Revision: D4953918

fbshipit-source-id: e3a69ad
2017-04-28 09:19:20 -07:00
Julien Odent
9269727617 PT: Bring latest changes
Summary: * PhoneNumber: support for "ramal" as extension keyword

Reviewed By: niteria

Differential Revision: D4959209

fbshipit-source-id: cd12c1f
2017-04-28 08:04:22 -07:00
Julien Odent
3f40625339 Temperature for Croatian
Summary: Temperature dimension for Croatian

Reviewed By: niteria

Differential Revision: D4958590

fbshipit-source-id: fe6c2e4
2017-04-28 08:04:22 -07:00
Julien Odent
3cc3266e28 Quantity for Croatian
Summary: Quantity dimension for Croatian.

Reviewed By: niteria

Differential Revision: D4958501

fbshipit-source-id: b90c8f6
2017-04-28 08:04:22 -07:00
Julien Odent
0372f4f3da Volume for Croatian
Summary: Volume dimension for Croatian

Reviewed By: niteria

Differential Revision: D4957186

fbshipit-source-id: 63012ad
2017-04-28 08:04:22 -07:00
Julien Odent
0aa4aa56bb Distance for Croatian
Summary: Distance dimension for Croatian.

Reviewed By: niteria

Differential Revision: D4957067

fbshipit-source-id: 232ce30
2017-04-28 08:04:21 -07:00
Julien Odent
35b9101c48 VI: Time
Summary:
* Time dimension for Vietnamese.
* Expose `debugContext`.

Reviewed By: niteria

Differential Revision: D4963594

fbshipit-source-id: 2373735
2017-04-28 08:04:21 -07:00
Julien Odent
3314ddc7a4 VI: Ordinal
Summary: Ordinal for Vietnamese.

Reviewed By: niteria

Differential Revision: D4959285

fbshipit-source-id: 7212cc9
2017-04-28 08:04:21 -07:00
Julien Odent
0370c452f1 Time
Summary: Time dimension for Croatian.

Reviewed By: niteria

Differential Revision: D4954399

fbshipit-source-id: 906c4a6
2017-04-26 09:19:27 -07:00
Julien Odent
b32696f8eb AmountOfMoney
Summary: AmountOfMoney dimension for Croatian.

Reviewed By: niteria

Differential Revision: D4947584

fbshipit-source-id: a20670a
2017-04-26 09:19:27 -07:00
Julien Odent
0f98a42b03 Ordinal
Summary: Ordinal dimension for Croatian.

Reviewed By: niteria

Differential Revision: D4947244

fbshipit-source-id: 54bda8f
2017-04-26 09:19:27 -07:00
Julien Odent
840deda7dd Setup + Numeral
Summary: Setup + Numeral dimension for Croatian.

Reviewed By: niteria

Differential Revision: D4946964

fbshipit-source-id: 204429b
2017-04-26 09:19:26 -07:00
Julien Odent
f5f4889770 Ordinal
Summary: Ordinal dimension for Hebrew.

Reviewed By: niteria

Differential Revision: D4930162

fbshipit-source-id: 02545ae
2017-04-24 06:49:40 -07:00
Julien Odent
bd96d3dd95 Setup + Numeral
Summary: Setup for Hebrew + Numeral dimension

Reviewed By: niteria

Differential Revision: D4930041

fbshipit-source-id: 965132b
2017-04-24 06:49:40 -07:00
Bartosz Nitka
879b103ca3 Fix indexing problems with new regexp matcher
Summary:
My change had a couple of problems:
* utf8 character width logic was completely wrong for characters that need 3 or 4 bytes
* `Array.listArray (start, end)` produces an array where `end` is a valid index
* because of ^ the `arraySize` logic also has to change
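
For reference, the UTF-8 width rule can be sketched as follows. `utf8Width` and `byteOffsets` are hypothetical helpers written for this example, not the functions fixed in this commit.

```haskell
import Data.Char (ord)

-- Number of bytes a character occupies in UTF-8.
utf8Width :: Char -> Int
utf8Width c
  | n < 0x80 = 1
  | n < 0x800 = 2
  | n < 0x10000 = 3
  | otherwise = 4
  where
    n = ord c

-- Byte offset at which each character starts, followed by the total size.
-- Note: an array holding n such entries needs bounds (0, n - 1), because
-- the upper bound passed to listArray is itself a valid index.
byteOffsets :: String -> [Int]
byteOffsets = scanl (+) 0 . map utf8Width
```

For "aé十" the characters start at byte offsets 0, 1 and 3, and the string is 6 bytes long.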

Reviewed By: watashi, darshankapashi

Differential Revision: D4894355

fbshipit-source-id: 8d07dfd
2017-04-14 15:49:17 -07:00
Bartosz Nitka
e37bb7c186 Duckling monad for Engine
Summary:
This converts the code to monadic style, so that
we can in the future:
* stop threading the `Document` parameter everywhere
* keep some state, like regexp match cache (I've already checked that it makes a substantial difference)

There should be no difference in performance or behavior
at this point.
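
One way the first goal could look is a reader-style monad that carries the document implicitly. This is a hypothetical sketch, not the type introduced by this commit: `Engine`, `askDocument`, `tokenCount` and `summary` are invented names, and a plain `String` stands in for `Document`.

```haskell
-- A computation that can read the document without taking it as an argument.
newtype Engine a = Engine { runEngine :: String -> a }

instance Functor Engine where
  fmap f (Engine g) = Engine (f . g)

instance Applicative Engine where
  pure = Engine . const
  Engine f <*> Engine g = Engine (\doc -> f doc (g doc))

instance Monad Engine where
  Engine g >>= k = Engine (\doc -> runEngine (k (g doc)) doc)

-- Access the document from inside the monad.
askDocument :: Engine String
askDocument = Engine id

tokenCount :: Engine Int
tokenCount = length . words <$> askDocument

-- Helpers compose without threading the document by hand.
summary :: Engine (Int, Int)
summary = do
  n <- tokenCount
  doc <- askDocument
  pure (n, length doc)
```

Running `runEngine summary "six thirty a m"` supplies the document once at the top level.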

Reviewed By: patapizza

Differential Revision: D4778808

fbshipit-source-id: a167ed8
2017-03-31 14:19:40 -07:00
Bartosz Nitka
bd94622f64 Move tests to tests and exes to exe
Summary:
This works around https://github.com/haskell/cabal/issues/4350
If we don't do this files get compiled multiple times
and cabal is unhappy.

Reviewed By: patapizza

Differential Revision: D4782749

fbshipit-source-id: 5bbe425
2017-03-27 16:04:24 -07:00