128 Commits

Author SHA1 Message Date
Julien Odent
b2de97800f Numeral/FR: allow space as a thousand separator
Summary: Fixes #91.

Reviewed By: blandinw

Differential Revision: D6079821

fbshipit-source-id: f3160c1
2017-10-17 12:20:32 -07:00
Matthijs Mullender
33a08bb76b Support Dutch Durations
Summary:
This change adds support for durations in Dutch/Netherlands (NL)
Implemented: TimeGrain/NL, Durations/NL

Reviewed By: patapizza

Differential Revision: D6049404

fbshipit-source-id: 3621cdb
2017-10-13 12:49:30 -07:00
Julien Odent
ab0ad0256e Locales support
Summary:
* Locales support for the library, following `<Lang>_<Region>` with ISO 639-1 code for `<Lang>` and ISO 3166-1 alpha-2 code for `<Region>` (#33)
* `Locale` opaque type (composite of `Lang` and `Region`) with `makeLocale` smart constructor to only allow valid `(Lang, Region)` combinations
* API: `Context`'s `lang` parameter has been replaced by `locale`, with optional `Region` and backward compatibility.
* `Rules/<Lang>.hs` exposes
  - `langRules`: cross-locale rules for `<Lang>`, from `<Dimension>/<Lang>/Rules.hs`
  - `localeRules`: locale-specific rules, from `<Dimension>/<Lang>/<Region>/Rules.hs`
  - `defaultRules`: `langRules` + specific rules from select locales to ensure backward-compatibility
* Corpus, tests & classifiers
  - 1 classifier per locale, with default classifier (`<Lang>_XX`) when no locale provided (backward-compatible)
  - Default classifiers are built on existing corpus
  - Locale classifiers are built on
  - `<Dimension>/<Lang>/Corpus.hs` exposes a common `corpus` to all locales of `<Lang>`
  - `<Dimension>/<Lang>/<Region>/Corpus.hs` exposes `allExamples`: a list of examples specific to the locale (following `<Dimension>/<Lang>/<Region>/Rules.hs`).
  - Locale classifiers use the language corpus extended with the locale examples as training set.
  - Locale examples need to use the same `Context` (i.e. reference time) as the language corpus.
  - For backward compatibility, `<Dimension>/<Lang>/Corpus.hs` can also expose `defaultCorpus`, which is `corpus` augmented with specific examples. This is controlled by `getDefaultCorpusForLang` in `Duckling.Ranking.Generate`.
  - Tests run against each classifier to make sure runtime works as expected.
* MM/DD (en_US) vs DD/MM (en_GB) example to illustrate
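
A minimal sketch of the smart-constructor idea described above, not the library's actual code. The trimmed-down `Lang` and `Region` types, the `allowedRegions` table, the `Lang -> Maybe Region -> Locale` signature, and the choice to fall back to the region-less locale on an invalid pair are all assumptions made for illustration.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical, trimmed-down enumerations (assumption: the real ones are larger).
data Lang = EN | FR deriving (Eq, Ord, Show)
data Region = GB | US deriving (Eq, Ord, Show)

-- In a real module the `Locale` constructor would not be exported,
-- so callers could only build values through `makeLocale`.
data Locale = Locale Lang (Maybe Region) deriving (Eq)

-- Render as <Lang>_<Region>, with XX standing in for "no region".
instance Show Locale where
  show (Locale lang Nothing) = show lang ++ "_XX"
  show (Locale lang (Just region)) = show lang ++ "_" ++ show region

-- Assumed table of valid (Lang, Region) combinations.
allowedRegions :: Map.Map Lang (Set.Set Region)
allowedRegions = Map.fromList
  [ (EN, Set.fromList [GB, US])
  ]

-- Smart constructor: keeps the region only if it is valid for the language,
-- otherwise falls back to the language-only locale (assumed behavior).
makeLocale :: Lang -> Maybe Region -> Locale
makeLocale lang Nothing = Locale lang Nothing
makeLocale lang (Just region)
  | region `Set.member` valid = Locale lang (Just region)
  | otherwise = Locale lang Nothing
  where
    valid = Map.findWithDefault Set.empty lang allowedRegions
```

Because invalid pairs collapse to the language-only locale in this sketch, a caller that passes no region, or an unsupported one, ends up on the same default path, which is one way to keep older `lang`-only call sites working.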

Reviewed By: JonCoens, blandinw

Differential Revision: D6038096

fbshipit-source-id: f29c28d
2017-10-13 08:34:21 -07:00
Ian Stewart-Binks
2b566eeac0 Numeral/JA: HashMap lookups for large regexes
Summary: Replaced pattern matching with HashMap lookups. Also removed `ruleInteger17` and moved its regex to `ruleInteger`.

Reviewed By: patapizza

Differential Revision: D5812629

fbshipit-source-id: f0c1a06
2017-10-06 10:34:33 -07:00
Julien Odent
83ea150d94 Convert back escaped characters in rules
Summary:
We noticed that using UTF-8 characters directly in regexes works.
Hence converting back the escaped characters for readability and maintenance.

Reviewed By: blandinw

Differential Revision: D5787146

fbshipit-source-id: e5a4b9a
2017-09-07 12:49:33 -07:00
Stepan Parunashvili
6f774abe38 georgian numeral support
Summary: Introducing Georgian (KA), and the very beginnings of numeral support

Reviewed By: patapizza

Differential Revision: D5757952

fbshipit-source-id: 89d05f8
2017-09-05 12:19:29 -07:00
dubovinszky
24d3f19976 HU Setup + Numeral
Summary:
- Setup Hungarian (HU) language
- Added Numeral Dimension
Closes https://github.com/facebookincubator/duckling/pull/79

Reviewed By: blandinw

Differential Revision: D5595812

Pulled By: patapizza

fbshipit-source-id: 5959938
2017-08-09 17:49:56 -07:00
Veselin Stoyanov
5d03b45af9 Setup Bulgarian language and Numeral Dimension
Summary:
- Setup Bulgarian (BG) language
- Added Numeral Dimension
Closes https://github.com/facebookincubator/duckling/pull/78

Reviewed By: niteria

Differential Revision: D5575513

Pulled By: patapizza

fbshipit-source-id: e566155
2017-08-09 08:19:24 -07:00
Julien Odent
bfb6ba0387 Numeral flag for Time patterns
Summary:
Today things like `at single`, `at a few`, `at a couple of` would return a `Time`.
Discussed with blandinw to do this very explicit hack right now until other use cases show up.

Reviewed By: niteria

Differential Revision: D5325369

fbshipit-source-id: aec0402
2017-06-27 07:34:21 -07:00
Andrew Farmer
068e23db13 Prepare for DuplicateRecordFields
Summary:
The one restriction on using DuplicateRecordFields is that record
selectors have to be imported under their constructor, instead of as
top-level functions. Do this for si_sigma so D5242707 passes the compat
check.

Reviewed By: watashi

Differential Revision: D5326634

fbshipit-source-id: 74ec0dd
2017-06-26 22:34:21 -07:00
André
3ec5390ac2 numerals between 100 and 999 in Portuguese fixed
Summary:
I fixed some bugs I found in Portuguese. This is my first attempt to contribute so let me know if there's anything I could do better next time! thanks! awesome project!
Closes https://github.com/facebookincubator/duckling/pull/56

Differential Revision: D5318968

Pulled By: patapizza

fbshipit-source-id: 94ff30f
2017-06-26 08:19:28 -07:00
Daniel Rodríguez
36808e6086 HashMap lookups for large regexes.
Summary:
Transform large case matches into HashMap lookups.

Add an extra example for a rule set that wasn't tested before.
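
The transformation can be sketched roughly as follows. This is an illustration only: the function and table names are hypothetical, the word list is a toy one, and `Data.Map.Strict` from `containers` is swapped in for the HashMap so the snippet needs nothing beyond GHC's bundled packages. The shape of the change is the same with `Data.HashMap.Strict`.

```haskell
import Data.Char (toLower)
import qualified Data.Map.Strict as Map

-- Before: one `case` alternative per matched word.
wordToIntCase :: String -> Maybe Integer
wordToIntCase w = case map toLower w of
  "zero"  -> Just 0
  "one"   -> Just 1
  "two"   -> Just 2
  "three" -> Just 3
  _       -> Nothing

-- After: the same data as a lookup table (hypothetical name).
zeroToThreeMap :: Map.Map String Integer
zeroToThreeMap = Map.fromList
  [ ("zero", 0)
  , ("one", 1)
  , ("two", 2)
  , ("three", 3)
  ]

-- The rule body shrinks to a single lookup on the lowercased match.
wordToInt :: String -> Maybe Integer
wordToInt w = Map.lookup (map toLower w) zeroToThreeMap
```

Keeping the words in a table also makes it easy to extend the vocabulary without touching the rule logic.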

Reviewed By: patapizza

Differential Revision: D5253349

fbshipit-source-id: 303dbca
2017-06-19 11:34:18 -07:00
Anand Bhaskar
b8277411e7 Refactor rule 'number.number hours'
Summary: Created a helper for the rule to reuse across languages.

Reviewed By: patapizza

Differential Revision: D5189741

fbshipit-source-id: 7b4dcd4
2017-06-06 09:34:22 -07:00
Mohankumar Dhayalan
21c9b8ed7a HashMap lookups for large regexes
Summary: Added HashMap lookups for large regexes in Numeral/ID

Reviewed By: patapizza

Differential Revision: D5128492

fbshipit-source-id: 5ab928b
2017-05-25 11:04:18 -07:00
Şeref R.Ayar
69ce841710 Comma as decimal mark for Numeral TR
Summary: Closes https://github.com/facebookincubator/duckling/pull/28

Differential Revision: D5120967

Pulled By: patapizza

fbshipit-source-id: 41a5e4b
2017-05-24 09:04:17 -07:00
Julien Odent
3b64603d81 Don't allow 0 as fraction denominator
Summary:
"1/0" was returning "null" -- this is not a valid fraction.
Now "1/0" returns 2 Numerals.
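
A minimal sketch of the guard, assuming a standalone helper rather than the library's rule machinery. `parseFraction` is a hypothetical name, and it works on a plain `String` instead of regex match groups.

```haskell
import Text.Read (readMaybe)

-- Parse "<numerator>/<denominator>" into a value.
-- Returns Nothing when the input is not a fraction or the denominator is 0,
-- so a caller never produces a fraction for input such as "1/0".
parseFraction :: String -> Maybe Double
parseFraction s = case break (== '/') s of
  (n, '/' : d) -> do
    num <- readMaybe n :: Maybe Integer
    den <- readMaybe d :: Maybe Integer
    if den == 0
      then Nothing
      else Just (fromIntegral num / fromIntegral den)
  _ -> Nothing
```

With the fraction reading rejected, the two integers on either side of the slash remain available to other rules as separate matches.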

Reviewed By: niteria

Differential Revision: D5037579

fbshipit-source-id: 70fa4c9
2017-05-11 12:04:20 -07:00
Julien Odent
37829902b7 CS: Setup + basic Numeral
Summary:
* Setup for Czech
* Basic `Numeral` (0-10 integers + digits) from http://www.omniglot.com/language/numbers/czech.htm

Reviewed By: JonCoens

Differential Revision: D5044775

fbshipit-source-id: b5cd9d2
2017-05-11 09:49:27 -07:00
Matt Schultz
ff9b54ad43 Added English fractional Numeral rule (ex: "3/4", "1/2", "5/7")
Summary:
Also added real-world test to English `Quantity` corpus ("3/4 cup", as a culinary example)
Closes https://github.com/facebookincubator/duckling/pull/14

Reviewed By: patapizza

Differential Revision: D5035990

Pulled By: niteria

fbshipit-source-id: c1b8f65
2017-05-10 07:04:16 -07:00
Julien Odent
5ba2c9e9a1 NB: Bringing latest changes
Summary:
* Numeral: fixed "hundre" (not "hundred")
* Numeral: added "tretti", "søtti"
* Time: updated last times to support "sist"
* Time: christmas days

Reviewed By: niteria

Differential Revision: D4958919

fbshipit-source-id: e4eecf5
2017-04-28 08:04:22 -07:00
Julien Odent
840deda7dd Setup + Numeral
Summary: Setup + Numeral dimension for Croatian.

Reviewed By: niteria

Differential Revision: D4946964

fbshipit-source-id: 204429b
2017-04-26 09:19:26 -07:00
Julien Odent
bd96d3dd95 Setup + Numeral
Summary: Setup for Hebrew + Numeral dimension

Reviewed By: niteria

Differential Revision: D4930041

fbshipit-source-id: 965132b
2017-04-24 06:49:40 -07:00
Kevin Cros
62bc5a317b Using hashmap look up instead of 'case of'
Summary: Updating regexes with HashMap lookups.

Reviewed By: patapizza

Differential Revision: D4848178

fbshipit-source-id: 4d5ded8
2017-04-11 11:04:20 -07:00
ADAM LIU
928139569c Refactor of Duckling.Numeral.TR to hashmap lookup
Summary: Update of TR Rules hashmap

Reviewed By: patapizza

Differential Revision: D4860819

fbshipit-source-id: 6f5a722
2017-04-11 09:34:23 -07:00
ADAM LIU
572ff95adf Update RU Rules HashMap lookups update
Summary: Update of RU Rules hashmap

Reviewed By: patapizza

Differential Revision: D4840947

fbshipit-source-id: 00cb679
2017-04-06 15:49:17 -07:00
Amelia Wilson
70ef9b1bbe using hashmap lookups
Summary: converting large regex lookups to hashmap lookups in Duckling/Numeral/FR/Rules.hs and Duckling/Ordinal/FR/Rules.hs

Reviewed By: patapizza

Differential Revision: D4836336

fbshipit-source-id: 2241a3a
2017-04-05 12:20:10 -07:00
Bartosz Nitka
bd94622f64 Move tests to tests and exes to exe
Summary:
This works around https://github.com/haskell/cabal/issues/4350
If we don't do this, files get compiled multiple times
and cabal is unhappy.

Reviewed By: patapizza

Differential Revision: D4782749

fbshipit-source-id: 5bbe425
2017-03-27 16:04:24 -07:00
Christian Bell
02e74cacd6 HashMap lookups for large regexes
Summary: Use HashMaps to speed up string pattern matching for UK (Ukrainian).

Reviewed By: patapizza

Differential Revision: D4747195

fbshipit-source-id: e582dba
2017-03-22 08:49:17 -07:00
Julien Odent
54c9448fba Rename Number to Numeral
Summary: For consistency with the dimension name.

Reviewed By: JonCoens

Differential Revision: D4722216

fbshipit-source-id: 82c56d3
2017-03-16 13:49:16 -07:00