Summary:
* Moving `ruleNationalDay` from `ZH` rules to specific locales: `zh_CN`, `zh_HK`, `zh_MO`
* Fixed National Day for `zh_TW`.
Reviewed By: blandinw
Differential Revision: D6057565
fbshipit-source-id: 8f9f2ab
Summary:
- Fixes#89. "<number> <number>" would parse as Time if in the right range.
- Applied same rule for all languages. Note that for Italian and Polish, I updated "<hour> <minute>" tests to be in the form "at <hour> <minute>".
- Replaced `liftM2` with more generic `and|or . sequence [f1, f2, ...]`.
Reviewed By: blandinw
Differential Revision: D5992879
fbshipit-source-id: 5409ffb
Summary:
We noticed that using UTF-8 characters directly in regexes work.
Hence converting back the escaped characters for readability and maintenance.
Reviewed By: blandinw
Differential Revision: D5787146
fbshipit-source-id: e5a4b9a
Summary: This is a simple refactor that uses the weekend helper for all languages
Reviewed By: patapizza
Differential Revision: D5677330
fbshipit-source-id: 9984539
Summary:
Add the [Cantonese](https://en.wikipedia.org/wiki/Cantonese) (the official spoken language used in Hong Kong) support to date time
- updated Duration ZH corpus
- updated Time ZH rules and corpus
- updated TimeGrain ZH rules
Closes https://github.com/facebookincubator/duckling/pull/24
Reviewed By: patapizza
Differential Revision: D5143947
Pulled By: niteria
fbshipit-source-id: 9107d05
Summary:
This is analogous to
[Duckling] Don't produce trivially empty Tokens
but that change did that for intersect, this one
deals with interval.
Reviewed By: patapizza
Differential Revision: D5039215
fbshipit-source-id: 95bd821
Summary:
`intersectMB` was a name used for the purpose of migrating.
This is the last part of the migration.
Reviewed By: patapizza
Differential Revision: D4906098
fbshipit-source-id: a70af78
Summary:
This continues the work from:
"[Duckling] Don't produce trivially empty Tokens"
All the Rules should use intervalMB from now on.
Reviewed By: patapizza
Differential Revision: D4906072
fbshipit-source-id: 277b961
Summary:
We can detect certain kinds of contradictions sooner,
producing a token with an unresolvable Predicate is wasteful.
For a text like:
```
"Demain apres midi 14h 15 h 16h vendredi 14 a 15h"
```
it could produce 7000 tokens with empty predicates.
After this change it produces none and we get a 4x improvement in
time and 6x improvement in allocations.
Note I only covered `ruleIntersect*` here. I need to do this for
other instances as well.
Reviewed By: JonCoens
Differential Revision: D4871078
fbshipit-source-id: 9f0e7ad
Summary:
This is a very common pattern (>1k occurrences).
Replacing it with something shorter makes the rules a bit less
boilerplate-y.
Feel free to bikeshed the name, I can easily redo the codemod.
Reviewed By: patapizza
Differential Revision: D4848864
fbshipit-source-id: 7baeee3
Summary:
This makes the code easier to read.
I'm not attached to naming, but this is
standard terminology from topology.
Reviewed By: JonCoens, patapizza
Differential Revision: D4848740
fbshipit-source-id: 79c2c20
Summary: `DNumber` is a terrible name and was only there because legacy. `Numeral` makes more sense for this dimension, so let's use that instead.
Reviewed By: patapizza
Differential Revision: D4707167
fbshipit-source-id: cd78aa3