Summary:
* Closes PRs #115, #116.
* `Numeral`: Moves `ruleFractions` to common
* `Numeral/PT`: fixes dot in regex
* `Time`: Moves `isNumeralSafeToUse` to `isIntegerBetween` (more robust and cleaner)
* `Time/EN`: refactored `ruleCycleLastNextN` to `ruleDurationLastNext` (so it also accounts for non-safe Numerals, like "in a few days")
Reviewed By: blandinw
Differential Revision: D6502958
fbshipit-source-id: 6d9c08a23dab88f441aadade08d663c8e0da415d
Summary:
* `ruleIntegerNumeric` was used in all languages but Burmese.
* it seems like the hindu-arabic numerals are slowly getting in Burmese (e.g. recent car plates)
* Moving the rule in `Duckling/Numeral/Common.hs`
Reviewed By: blandinw
Differential Revision: D6498349
fbshipit-source-id: e868dc9960f18f0781e4aa98a0dfcd14969537c9
Summary:
When I write "dois mil e duzentos" the result should be 2200, but duckling recognize the numbers separated and give the result:
`[{"dim":"number","body":"dois","value":{"value":2,"type":"value"},"start":0,"end":4},{"dim":"number","body":"mil","value":{"value":1000,"type":"value"},"start":5,"end":8},{"dim":"time","body":"mil","value":{"values":[],"value":"1000-01-01T00:00:00.000-07:53","grain":"year","type":"value"},"start":5,"end":8},{"dim":"number","body":"duzentos","value":{"value":200,"type":"value"},"start":11,"end":19}]`
Now with this commit, duckling gives the correct result:
`[{"dim":"number","body":"dois mil e duzentos","value":{"value":2200,"type":"value"},"start":0,"end":19}]`
Closes https://github.com/facebook/duckling/pull/117
Reviewed By: blandinw
Differential Revision: D6477925
Pulled By: patapizza
fbshipit-source-id: 26ab503cc8def739c51ceb5bae7546016ba65ad6
Summary:
* Locales support for the library, following `<Lang>_<Region>` with ISO 639-1 code for `<Lang>` and ISO 3166-1 alpha-2 code for `<Region>` (#33)
* `Locale` opaque type (composite of `Lang` and `Region`) with `makeLocale` smart constructor to only allow valid `(Lang, Region)` combinations
* API: `Context`'s `lang` parameter has been replaced by `locale`, with optional `Region` and backward compatibility.
* `Rules/<Lang>.hs` exposes
- `langRules`: cross-locale rules for `<Lang>`, from `<Dimension>/<Lang>/Rules.hs`
- `localeRules`: locale-specific rules, from `<Dimension>/<Lang>/<Region>/Rules.hs`
- `defaultRules`: `langRules` + specific rules from select locales to ensure backward-compatibility
* Corpus, tests & classifiers
- 1 classifier per locale, with default classifier (`<Lang>_XX`) when no locale provided (backward-compatible)
- Default classifiers are built on existing corpus
- Locale classifiers are built on
- `<Dimension>/<Lang>/Corpus.hs` exposes a common `corpus` to all locales of `<Lang>`
- `<Dimension>/<Lang>/<Region>/Corpus.hs` exposes `allExamples`: a list of examples specific to the locale (following `<Dimension>/<Lang>/<Region>/Rules.hs`).
- Locale classifiers use the language corpus extended with the locale examples as training set.
- Locale examples need to use the same `Context` (i.e. reference time) as the language corpus.
- For backward compatibility, `<Dimension>/<Lang>/Corpus.hs` can expose also `defaultCorpus`, which is `corpus` augmented with specific examples. This is controlled by `getDefaultCorpusForLang` in `Duckling.Ranking.Generate`.
- Tests run against each classifier to make sure runtime works as expected.
* MM/DD (en_US) vs DD/MM (en_GB) example to illustrate
Reviewed By: JonCoens, blandinw
Differential Revision: D6038096
fbshipit-source-id: f29c28d
Summary:
We noticed that using UTF-8 characters directly in regexes work.
Hence converting back the escaped characters for readability and maintenance.
Reviewed By: blandinw
Differential Revision: D5787146
fbshipit-source-id: e5a4b9a
Summary:
I fixed some bugs I found in Portuguese. This is my first attempt to contribute so let me know if there's any thing I could do better next time! thanks! awesome project!
Closes https://github.com/facebookincubator/duckling/pull/56
Differential Revision: D5318968
Pulled By: patapizza
fbshipit-source-id: 94ff30f
Summary:
This works around https://github.com/haskell/cabal/issues/4350
If we don't do this files get compiled multiple times
and cabal is unhappy.
Reviewed By: patapizza
Differential Revision: D4782749
fbshipit-source-id: 5bbe425