Summary:
This diff implements volume intervals in Duckling. It follows the AmountofMoney and Quantity modules in doing so.
I had to make one crucial choice here -- defining the core Volume data type -- and whether the attribute "Unit" was optional (i.e. "Maybe") or not. Like in Quantity and unlike in AmountOfMoney, I made it optional, so that latent volumes can be supported down the line in the codebase.
I also wrote the codebase to be more modular, such that future developers only need to add regular expressions rather than functions for any language. For instance, developers can simply define a fraction (e.g. "eighth") at the start of the file, and new rules will be generated automatically; rather than requiring the developer to create an entirely new rule, as previously. The only (partial) exceptions were in the Arabic and Russian Rules files, where the language structure is more difficult and so I cannot fully implement this. Developers for those two languages may need to write new rules, as before.
Reviewed By: patapizza
Differential Revision: D9043117
fbshipit-source-id: f08de4f167596b5b32d12a79268b8ab92c099b22
Summary: Add an option to return latent time entities. This can be used when one is pretty certain that the input contains a datetime.
Reviewed By: patapizza
Differential Revision: D7254245
fbshipit-source-id: e9e0503cace2691804056fcebdc18fd9090fb181
Summary:
* Locales support for the library, following `<Lang>_<Region>` with ISO 639-1 code for `<Lang>` and ISO 3166-1 alpha-2 code for `<Region>` (#33)
* `Locale` opaque type (composite of `Lang` and `Region`) with `makeLocale` smart constructor to only allow valid `(Lang, Region)` combinations
* API: `Context`'s `lang` parameter has been replaced by `locale`, with optional `Region` and backward compatibility.
* `Rules/<Lang>.hs` exposes
- `langRules`: cross-locale rules for `<Lang>`, from `<Dimension>/<Lang>/Rules.hs`
- `localeRules`: locale-specific rules, from `<Dimension>/<Lang>/<Region>/Rules.hs`
- `defaultRules`: `langRules` + specific rules from select locales to ensure backward-compatibility
* Corpus, tests & classifiers
- 1 classifier per locale, with default classifier (`<Lang>_XX`) when no locale provided (backward-compatible)
- Default classifiers are built on existing corpus
- Locale classifiers are built on
- `<Dimension>/<Lang>/Corpus.hs` exposes a common `corpus` to all locales of `<Lang>`
- `<Dimension>/<Lang>/<Region>/Corpus.hs` exposes `allExamples`: a list of examples specific to the locale (following `<Dimension>/<Lang>/<Region>/Rules.hs`).
- Locale classifiers use the language corpus extended with the locale examples as training set.
- Locale examples need to use the same `Context` (i.e. reference time) as the language corpus.
- For backward compatibility, `<Dimension>/<Lang>/Corpus.hs` can expose also `defaultCorpus`, which is `corpus` augmented with specific examples. This is controlled by `getDefaultCorpusForLang` in `Duckling.Ranking.Generate`.
- Tests run against each classifier to make sure runtime works as expected.
* MM/DD (en_US) vs DD/MM (en_GB) example to illustrate
Reviewed By: JonCoens, blandinw
Differential Revision: D6038096
fbshipit-source-id: f29c28d
Summary:
We noticed that using UTF-8 characters directly in regexes work.
Hence converting back the escaped characters for readability and maintenance.
Reviewed By: blandinw
Differential Revision: D5787146
fbshipit-source-id: e5a4b9a
Summary:
This works around https://github.com/haskell/cabal/issues/4350
If we don't do this files get compiled multiple times
and cabal is unhappy.
Reviewed By: patapizza
Differential Revision: D4782749
fbshipit-source-id: 5bbe425
Summary:
No need to reinvent the wheel when `dependent-sum` has what we need. I re-export `Some(..)` from `Duckling.Dimensions.Types` to cut down on import bloat.
Instead of a `Read` instance I created a `fromName` function.
Reviewed By: zilberstein
Differential Revision: D4710014
fbshipit-source-id: 1d4e86d