651 Commits

Author SHA1 Message Date
kcnhk1@gmail.com
722cc838ff Volume - extend interval support
Reviewed By: haoxuany

Differential Revision: D26255089

Pulled By: chessai

fbshipit-source-id: e4bdb0aa3c1be55dff0a5577155a3d0469d6762d
2021-02-04 12:02:32 -08:00
kcnhk1@gmail.com
67c1dbe94f AmountOfMoney - extend interval support
Reviewed By: haoxuany

Differential Revision: D26254863

Pulled By: chessai

fbshipit-source-id: dfc06f9831de2d50c11d252429c4fb9b8c1eb13a
2021-02-04 11:19:19 -08:00
kcnhk1@gmail.com
b6da3929ce Extend distance rules
Summary:
Add rules:
- one meter and <dist>
- <dist> meters and <dist>

Reviewed By: girifb

Differential Revision: D26191350

Pulled By: chessai

fbshipit-source-id: 52c85c94647e98fba866c24d3386eea988f7f58c
2021-02-03 15:01:39 -08:00
kcnhk1@gmail.com
776b1ec64d extend AmountOfMoney rules
Summary:
Add rules:
- `hkd` as HKD, and related rules (prefix and suffix)
- dollar and <amount-of-money> rule
- dollar and a half rule
- intersection for <amount-of-money> and `a half`

Changed:
- dime and dollar rules now have improved coverage

Reviewed By: girifb

Differential Revision: D26191724

Pulled By: chessai

fbshipit-source-id: bf63b6eaa751fb96dcf341fa2b66db06a6eeca79
2021-02-03 14:05:30 -08:00
Daniel Cartwright
041a81ad1a Use System.FilePath.Posix
Summary: Results in no change on Linux/macOS, but this is necessary on Windows to prevent paths from being botched

Reviewed By: girifb

Differential Revision: D25893201

fbshipit-source-id: ca79dd8a766aecf27562044865d9bc258a4e8d11
2021-02-03 13:31:34 -08:00
Amr Keleg
e673ba5e84 Quantity/EN: Support k.g k.g. (#570)
Summary:
Kilogram units written with dots (`k.g`, `k.g.`) used to be extracted as a Numeral
instead of a Quantity.

Pull Request resolved: https://github.com/facebook/duckling/pull/570

Reviewed By: patapizza

Differential Revision: D26199687

Pulled By: chessai

fbshipit-source-id: 65e39f20296946d5762d7180b12878f4e66ea701
2021-02-03 12:46:27 -08:00
kcnhk1@gmail.com
496842d16a Extend numeral rules
Summary:
- Extend fraction rule
- add mixed fraction rules
- add prefix of 10/100/10_000 rules

Reviewed By: girifb

Differential Revision: D26191175

Pulled By: chessai

fbshipit-source-id: c2f6b74602e1b8061e0c556721ad8e36821fdb5c
2021-02-03 11:19:33 -08:00
jfulse
788f63eeac Parse more date formats in Norwegian (#395)
Summary:
There are some clashes between time formats `hhmm` and date formats `ddmm`. For example, depending on context, `22.10` can mean the clock time ten past ten or the twenty-second of October. In general it's correct to interpret this as clock time, as Duckling currently does.

But there are some cases not currently covered by Duckling where we have more unambiguous dates, e.g. `12.03.2018` and `27.11`. These are included here (in addition to midnight `24:00` which was also missing).

#### Changes:

- Bug in `ruleDdmm` regex meant that dates in the format `dd/mm` where `mm > 9` were not parsed
- `ruleYyyymmdd` now also parses dots and forward slashes, i.e. `2012.05.14` and `2012/05/14`
- New rule `rule2400` parses `24:00` and `24.00` (I elected not to include it in `ruleMidnighteodendOfDay` as it has grain minute rather than day)
- New rule `ruleDmm` parses `1/10`, `9.12` etc
- New rule `ruleDDm` parses `10/3`, `11.1` etc
- New rule `ruleDdDotMm` parses `25.02`, `31.10` etc
- `ruleDdmmyyyy` now also parses dots, i.e. `03.10.1983`
- New tests
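
As a rough illustration of the clash described above, here is a minimal Python sketch. The three regexes and the `readings` helper are hypothetical and written only for this example; they are not Duckling's actual rules, which are implemented in Haskell.

```python
import re

# Hypothetical patterns for illustration only; not Duckling's real rules.
DDMMYYYY = re.compile(r"^(3[01]|[12]\d|0?[1-9])[./](1[0-2]|0?[1-9])[./](\d{4})$")
DDMM = re.compile(r"^(3[01]|[12]\d|0?[1-9])[./](1[0-2]|0?[1-9])$")
HHMM = re.compile(r"^(2[0-3]|[01]?\d)[.:]([0-5]\d)$")


def readings(text):
    """Return the set of plausible readings ("date", "time") for a numeric string."""
    found = set()
    if DDMMYYYY.match(text) or DDMM.match(text):
        found.add("date")
    if HHMM.match(text):
        found.add("time")
    return found


print(readings("22.10"))       # both readings apply: genuinely ambiguous
print(readings("27.11"))       # 27 is not a valid hour, so only "date"
print(readings("12.03.2018"))  # a four-digit year rules out clock time
print(readings("24:00"))       # matches neither pattern in this sketch
```

In this sketch `24:00` matches neither the date nor the ordinary clock-time pattern, which is one way to see why a dedicated rule for it is useful.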

Pull Request resolved: https://github.com/facebook/duckling/pull/395

Reviewed By: patapizza

Differential Revision: D26193069

Pulled By: chessai

fbshipit-source-id: cf711807fa1d40be2303f2426d74ded40c2e23b3
2021-02-02 23:18:48 -08:00
Maxime Biais
16708d9572 Minor Volume.FR improvement: add "Centilitre" type (#354)
Summary:
Minor Volume.FR improvement: add "Centilitre" type. This is useful for recipe parsing.

Pull Request resolved: https://github.com/facebook/duckling/pull/354

Reviewed By: patapizza

Differential Revision: D26193246

Pulled By: chessai

fbshipit-source-id: ddd551e062b8efeff1e786e30e35815c0c29a34c
2021-02-01 22:48:34 -08:00
kcnhk1@gmail.com
61e06c3aa6 Add initial support for volumes in Chinese
Reviewed By: girifb

Differential Revision: D26183123

Pulled By: chessai

fbshipit-source-id: 1acd27d5172cfb5bccbeb1576700e2c60a8e3907
2021-02-01 16:05:42 -08:00
Igor Kuzmenko
9993911e3b Adds UAH currency Type and examples to EN and RU Corpus (#433)
Summary:
This PR adds UAH currency Type and examples to EN and RU Corpus

Pull Request resolved: https://github.com/facebook/duckling/pull/433

Reviewed By: girifb

Differential Revision: D25102990

Pulled By: chessai

fbshipit-source-id: ed40e8dfcf145a65c7e6d87158da0efacb32e256
2021-02-01 14:32:24 -08:00
Daniel Cartwright
7193caafb9 parse latent year intervals
Summary: adds a new rule that parses year intervals such as "1960 - 1961". see inline comments for heuristics.

Reviewed By: patapizza

Differential Revision: D25840835

fbshipit-source-id: 851a5b1c78440cbf065bf9f20a05c78d4967ea3c
2021-01-29 16:33:56 -08:00
Daniel Cartwright
33f0c17ee2 implement 'the day after tomorrow' in Romanian
Summary: adds a rule for 'the day after tomorrow' in Romanian. regenerates classifiers.

Reviewed By: girifb

Differential Revision: D26155042

fbshipit-source-id: 80005ab94a10f9fbf242c9a712bd040e4f6bc477
2021-01-29 14:49:13 -08:00
Marcin Armatys
d5fac5f14e Polish(PL) - Support for seventy, eighty, ninety (#417)
Summary:
Support for Polish equivalents of seventy, eighty, ninety.

Pull Request resolved: https://github.com/facebook/duckling/pull/417

Reviewed By: patapizza

Differential Revision: D26130642

Pulled By: chessai

fbshipit-source-id: 4a0be944dcd0a9dea155caae145cf4a38537753f
2021-01-29 11:47:36 -08:00
Nour Shalabi
6346cfe926 Add Arabic rule for a week ago (#379)
Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/379

Reviewed By: patapizza

Differential Revision: D26149123

Pulled By: chessai

fbshipit-source-id: 5f0bca88fc1b64da5d93fcf715996d58a972fda2
2021-01-29 11:32:32 -08:00
Tobias Wochinger
97636f525e skip logfile creation if no logging (#377)
Summary:
**Motivation**
Currently the log files and the log directory for the server are always created, even if the logging is disabled. If duckling is used on OpenShift the file creation leads to errors if no volume mount is defined.

**Proposed Change**:
Only create log files / log directory if the logging is enabled.

Pull Request resolved: https://github.com/facebook/duckling/pull/377

Reviewed By: patapizza

Differential Revision: D26148878

Pulled By: chessai

fbshipit-source-id: f8e2b1a38586121d854a4826c322b4b859cc9c6b
2021-01-29 11:32:32 -08:00
Arjan Scherpenisse
d095b05060 NL/Duration: Support composite durations (#503)
Summary:
E.g. "1 uur en drie kwartier", "1 dag 4 uur", etc.

Pull Request resolved: https://github.com/facebook/duckling/pull/503

Reviewed By: patapizza

Differential Revision: D22260615

Pulled By: chessai

fbshipit-source-id: 40689f7630b4d5bab498df730528ce6bf768fa89
2021-01-27 11:18:10 -08:00
kckckcng
a82684e723 Time&Duration/ZH: support Cantonese and more common expressions (#516-2) (#523)
Summary:
2nd set of changes from pull request https://github.com/facebook/duckling/issues/516

Supporting Cantonese and more common expressions in Chinese.
Adding rules file for Duration/ZH.

Pull Request resolved: https://github.com/facebook/duckling/pull/523

Reviewed By: haoxuany

Differential Revision: D23428901

Pulled By: chessai

fbshipit-source-id: 6d04c97b63bac966eb61d77cab2f08f7543dbbf0
2021-01-26 15:17:45 -08:00
michaelmarien
28ddc3bff7 NL/amount-of-money (#504)
Summary:
Currently values like 1000.000 (in Dutch . is thousand separator) are not recognised, as the ruleDecimalWithThousandsSeparator requires the decimal part (e.g. 1000.000,34) to be present. This PR adds some data and changes the ruleDecimalWithThousandsSeparator to make the decimal part optional.
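
The effect of making the decimal part optional can be sketched as follows. Both regexes and the `parse_dutch_number` helper are hypothetical stand-ins written for this example, not the actual pattern used by `ruleDecimalWithThousandsSeparator`.

```python
import re

# Hypothetical regexes for illustration; not Duckling's actual pattern.
# Dutch convention: "." groups thousands, "," introduces the decimal part.
REQUIRED_DECIMAL = re.compile(r"\d{1,3}(?:\.\d{3})+,\d+")
OPTIONAL_DECIMAL = re.compile(r"\d{1,3}(?:\.\d{3})+(?:,\d+)?")


def parse_dutch_number(text):
    """Parse a Dutch-formatted number, or return None if the text does not match."""
    if not OPTIONAL_DECIMAL.fullmatch(text):
        return None
    # Drop the thousands separators, then turn the decimal comma into a point.
    return float(text.replace(".", "").replace(",", "."))


print(REQUIRED_DECIMAL.fullmatch("1.000.000") is None)  # True: decimal part was mandatory
print(parse_dutch_number("1.000.000"))                   # 1000000.0
print(parse_dutch_number("1.000.000,34"))                # 1000000.34
```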

Pull Request resolved: https://github.com/facebook/duckling/pull/504

Reviewed By: patapizza, girifb

Differential Revision: D26078885

Pulled By: chessai

fbshipit-source-id: b1679c713e1d17a168d34a3cc556b6c36a571d75
2021-01-26 12:33:14 -08:00
kckckcng
f2798021b6 Numeral/ZH: support more common expressions (#516-1) (#522)
Summary:
1st set of changes from pull request https://github.com/facebook/duckling/issues/516

Supporting more common expressions, such as fraction, half, dozen, in Chinese.

Pull Request resolved: https://github.com/facebook/duckling/pull/522

Reviewed By: patapizza

Differential Revision: D23428893

Pulled By: chessai

fbshipit-source-id: 3454ac70a4bfff90dc282560916a0fae9969f521
2021-01-21 21:17:54 -08:00
Sam Coope
e9e5507820 Add ASAP, at the moment to EN time (#405)
Summary:
* "at the moment" is considered identical to "now".
* "ASAP" is considered identical to "from now"

Pull Request resolved: https://github.com/facebook/duckling/pull/405

Reviewed By: patapizza

Differential Revision: D26009483

Pulled By: chessai

fbshipit-source-id: addf4c509e69d413cae279601c64f72710eba11f
2021-01-21 20:47:40 -08:00
Daniel Cartwright
1ba1aedeba Correct CDT TimeZone offset
Summary: CDT is UTC -5. (-5 hours) * (60 minutes/hour) = -300 minutes. 540 was probably a copy/paste error.
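
The arithmetic can be checked with a few lines of Python. The helper name `utc_offset_minutes` is made up for this example and does not appear in Duckling.

```python
def utc_offset_minutes(hours):
    """Convert a whole-hour UTC offset into minutes."""
    return hours * 60


# CDT is five hours behind UTC.
cdt_offset = utc_offset_minutes(-5)
print(cdt_offset)  # -300

# For comparison, 540 minutes corresponds to a 9-hour offset, not 5.
print(540 // 60)   # 9
```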

Reviewed By: girifb

Differential Revision: D25877623

fbshipit-source-id: de4f84f2564cbb154aec95eee63c458c64f8a85f
2021-01-12 14:02:52 -08:00
chessai
40cdb88982 Add CreditCardNumber to common dimensions (#563)
Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/563

Reviewed By: girifb

Differential Revision: D25624047

Pulled By: chessai

fbshipit-source-id: b50cf34f4a28bfcbd4a0ca3479debc5a5c118b5e
2021-01-05 13:18:19 -08:00
Wojtek Przechodzeń
10eee56f10 Time/PL - new rules (#538)
Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/538

Reviewed By: haoxuany

Differential Revision: D24640854

Pulled By: chessai

fbshipit-source-id: 51eb0d530b143511f79992a91ca8f465b7860b6e
2020-12-16 13:47:49 -08:00
chaitu9701
28cb5ebd2a Adding Numerical Dimension support for Telugu language (#470)
Summary:
This pull request is to add support for Telugu language (Numerical Dimension) to Duckling

Pull Request resolved: https://github.com/facebook/duckling/pull/470

Differential Revision: D25546700

Pulled By: chessai

fbshipit-source-id: 1d88ee27da8a577a4a79ff31be8cb55ed6444c4e
2020-12-15 17:48:03 -08:00
Amit Manchanda
724325b02f add: support for quarter to, quarter past and half in HI (#423)
Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/423

Reviewed By: girifb

Differential Revision: D25573001

Pulled By: chessai

fbshipit-source-id: 5474f108e968bdfb53ebc2518b46f28befdeba89
2020-12-15 17:02:28 -08:00
chessai
a319da07b2 ExampleMain: fix build failure (#560)
Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/560

Reviewed By: patapizza

Differential Revision: D25564850

Pulled By: chessai

fbshipit-source-id: 631f96a3ed71b9d7707560ff6bfe7596feee2305
2020-12-15 11:48:04 -08:00
Amr Keleg
703ff13210 Add a new Arabic locale (EG) (#554)
Summary:
Egyptian Arabic is a mostly spoken dialect of Arabic that is used in everyday communication.
This PR adds a new locale to Arabic to support the differences between Modern Standard Arabic (MSA) and Egyptian Arabic (EG).
I have mainly depended on the different locales of Spanish that are supported by Duckling to create the new Egyptian Arabic locale.
New modifications are added to the `Numeral` dimension since I didn't spot differences in other dimensions.

Pull Request resolved: https://github.com/facebook/duckling/pull/554

Reviewed By: patapizza

Differential Revision: D25543502

Pulled By: chessai

fbshipit-source-id: 4cbb7be78a52071c8681380077f0b4dc033a60de
2020-12-15 11:33:40 -08:00
Daniel Cartwright
181037e469 Support abbreviation of Crore and Lakh
Summary:
Crore (1e7) and Lakh (1e5) are both commonly used to describe an amount of Indian currency. Common abbreviations are "Cr" (Crore) and "lkh", "L", "lac" (lakh).

Additionally, common spellings of "crore" include "karor" and "koti"
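
A minimal sketch of how such abbreviations could map to multipliers is shown below. The `MULTIPLIERS` table, the `PATTERN` regex, and the `expand` helper are assumptions made for illustration, built only from the spellings listed above; they do not reflect how Duckling implements the rule.

```python
import re

# Hypothetical lookup table built from the spellings and abbreviations named above.
MULTIPLIERS = {
    "crore": 10**7, "cr": 10**7, "karor": 10**7, "koti": 10**7,
    "lakh": 10**5, "lkh": 10**5, "l": 10**5, "lac": 10**5,
}

# A number followed by a single alphabetic unit word.
PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([a-z]+)\s*$", re.IGNORECASE)


def expand(text):
    """Expand e.g. "5 Cr" into a plain number, or return None if unrecognised."""
    match = PATTERN.match(text)
    if match is None:
        return None
    unit = match.group(2).lower()
    if unit not in MULTIPLIERS:
        return None
    return float(match.group(1)) * MULTIPLIERS[unit]


print(expand("5 Cr"))      # 50000000.0
print(expand("2.5 lakh"))  # 250000.0
print(expand("3 L"))       # 300000.0
```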

Reviewed By: patapizza

Differential Revision: D25550546

fbshipit-source-id: 0c1479d9027431cb0d1182b5117eabca6f939cb2
2020-12-15 11:18:05 -08:00
moozzyk
c33249b4dd Fix typo in PL Duration Rules (#426)
Summary:
'miej' in Polish is the imperative form of the verb 'mieć' (to have). "mniej więcej" means "more or less" and it was the intention here.

Pull Request resolved: https://github.com/facebook/duckling/pull/426

Reviewed By: patapizza, girifb

Differential Revision: D25546380

Pulled By: chessai

fbshipit-source-id: 1047b83109cab917f1f4dbe87b667f8ccd2fb92d
2020-12-14 16:32:05 -08:00
Daniel Cartwright
17f11135f2 Document how to pass dimensions to the example application
Summary: External users are repeatedly confused by lack of results from the duckling example executable. We should just go through all dimensions for the duckling call in the example app.

Reviewed By: patapizza

Differential Revision: D25468199

fbshipit-source-id: 6cf56b130d4d0aa3181f098d6a7c9a133bfa85ff
2020-12-14 15:02:37 -08:00
chessai
12b1db3794 GitHub CI over Travis (#555)
Summary:
Facebook is migrating away from Travis CI to GitHub Actions.

Pull Request resolved: https://github.com/facebook/duckling/pull/555

Reviewed By: patapizza

Differential Revision: D25228779

Pulled By: chessai

fbshipit-source-id: a392b93e5a7b02d1f47b477b6c459901d3171e05
2020-12-10 11:04:14 -08:00
Hernan Barijhoff
f053b14676 ES/Ordinal: Fixes "tercero" pattern regex (#477)
Summary:
Missing "tercer" regex in rule

Pull Request resolved: https://github.com/facebook/duckling/pull/477

Reviewed By: patapizza

Differential Revision: D24934794

Pulled By: chessai

fbshipit-source-id: a51f6fe3187749885784bfaacfee09cf26a8df6d
2020-11-19 13:48:43 -08:00
Christoph Flick
d0a6f8114c Improve german time approximation (#435)
Summary:
Improves the recognition of German time approximation language and removes a single error in the rule of <time-of-day> approximately.

Pull Request resolved: https://github.com/facebook/duckling/pull/435

Reviewed By: patapizza

Differential Revision: D24934281

Pulled By: chessai

fbshipit-source-id: 641bcb6a7e5c26e66c735fe13bccae9b7a8909ae
2020-11-19 13:48:42 -08:00
Sajjad Heydari
700118644c FA Setup (#520)
Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/520

Reviewed By: patapizza

Differential Revision: D25072459

Pulled By: chessai

fbshipit-source-id: 5db72eda36fe166a452b2345cab75fb1508b192b
2020-11-19 12:20:00 -08:00
Harisankar H
11595b7377 Support for more Hindi numbers (#552)
Summary:
Add support for additional Hindi numbers like 300, 81, 150, 1000, 1520. These are not supported in the current master version.

Pull Request resolved: https://github.com/facebook/duckling/pull/552

Reviewed By: ashwinp-fb, girifb

Differential Revision: D25072230

Pulled By: chessai

fbshipit-source-id: 35277a2349384bcf44a20e74852113f5c010e618
2020-11-18 17:04:29 -08:00
Daniel Cartwright
58cf66589f make duckling time not treat 0:xx and 12:xx ambiguously
Reviewed By: haoxuany

Differential Revision: D24929661

fbshipit-source-id: 3858d14ef1655f079daa33d2b159e8cb918a70ac
2020-11-12 14:19:04 -08:00
chessai
b23de34b46 fix common windows build issue (#549)
Summary:
* use regex-pcre-builtin by default on windows
* update cabal version to 2.2 to support leading commas
    - requires the very first line in cabal file be the
      cabal-version line
    - BSD3 is not BSD-3-Clause (don't ask me why)

resolves https://github.com/facebook/duckling/issues/547

Pull Request resolved: https://github.com/facebook/duckling/pull/549

Reviewed By: haoxuany

Differential Revision: D24838317

Pulled By: chessai

fbshipit-source-id: 376eb30a94ab88420915b868dffddb252fd08e76
2020-11-12 14:04:45 -08:00
chessai
cdeefe1d4d ghc88x compat (#550)
Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/550

Reviewed By: haoxuany

Differential Revision: D24844625

Pulled By: chessai

fbshipit-source-id: 52dcf5f9488386f7f407535e876bff1207823fe0
2020-11-12 13:47:46 -08:00
Dmitri Osipov
e7264b55c9 adds frequent durations in German (#509)
Summary:
Adds a missing frequent duration in German and fixes a small typo in an existing one.

Pull Request resolved: https://github.com/facebook/duckling/pull/509

Reviewed By: patapizza

Differential Revision: D24690104

Pulled By: chessai

fbshipit-source-id: b49a7a636abf5b92f2fe7c0d5b2ca2fe64acbaa2
2020-11-09 11:18:35 -08:00
Daniel Cartwright
eb043d7018 Quantity rules for Spanish (ES)
Summary:
Spanish (ES) will now have all the same quantity rules as English (EN) (which I think is the most-supported language), plus more.

This includes the following:
* bowls - (bol(es)?|tazón(es)?|cuencos?|platos? (soperos?)|(hondos?)) (EN does not currently have this)
* cups - (tazas?)
* dishes - (platos?|fuentes?) (EN does not currently have this)
* grams - (((m(ili)?)|(k(ilo)?))?g(ramo)?s?)
* ounces - ((onzas?)|oz)
* pints - (pintas?) (EN does not currently have this)
* pounds - ((lb|libra)s?)
* quarts - (cuartos? de galón) (EN does not currently have this)
* tablespoons - (cucharadas? (grande)?) (EN does not currently have this)
* teaspoons - (cucharaditas?) (EN does not currently have this)

Reviewed By: patapizza

Differential Revision: D24628214

fbshipit-source-id: 2e8d500661f30fa0928cb7d3f21470afc01e2285
2020-11-09 11:18:35 -08:00
Tpt
888b1cba35 Dockerfile: debugs the build and uses Debian Buster everywhere (#539)
Summary:
The Dockerfile build part did not copy the Duckling implementation into the container, making the build fail.

I also harmonized the target Debian to Buster, which is the one currently hidden behind `haskell:8`.

Pull Request resolved: https://github.com/facebook/duckling/pull/539

Reviewed By: patapizza

Differential Revision: D24688839

Pulled By: chessai

fbshipit-source-id: 0ffcc4d28a599b7edad668730117828d26e116ad
2020-11-02 13:33:00 -08:00
Victor Pothin
bfc75849d2 Adds new rules of accentuation of the Portuguese (#531)
Summary:
Keeps accents consistent: "quinquagésimo" no longer uses "Ü".

Pull Request resolved: https://github.com/facebook/duckling/pull/531

Reviewed By: patapizza

Differential Revision: D23770703

Pulled By: chessai

fbshipit-source-id: f8a34c02028faf9f51eca6a016b5bad988a83f04
2020-11-02 12:17:57 -08:00
Daniel Cartwright
01b812b69c Update dependencies/CI
Summary:
This PR accomplishes several things:

- removes dist-newstyle (local build artifacts should not be checked in)
- extends the .gitignore to include many common build artifacts/editor artifacts
- allow more modern dependencies (upper bounds of many were out of date by one or two years' worth of releases)
- upgrade stack lts (9.2 -> 14.2) to GHC 8.6.5
- regenerate .travis.yml using the now-standard haskell-ci (many haskell core libraries use this), instead of the outdated script that was maintained by hvr; as a precursor to this, the tested-with versions were updated

Reviewed By: patapizza

Differential Revision: D24623967

fbshipit-source-id: 838fe571df0b8d44106349659ce8ce8ab82f0bc6
2020-10-29 11:02:49 -07:00
Josef Svenningsson
7889f396f3 Remove dependency on Data.Some (#533)
Summary:
Pull Request resolved: https://github.com/facebook/duckling/pull/533

In recent versions of Data.Some, the constructor `This` has been renamed to `Some`. Migrating has become rather problematic for us, so we're just going to remove the dependency. The meat of this diff is adding the type `Seal` to `Duckling.Types`. That type replaces `Some`.

Reviewed By: pepeiborra

Differential Revision: D23929459

fbshipit-source-id: 8ff4146ecba4f1119a17899961b2d877547f6e4f
2020-09-28 01:33:01 -07:00
Julien Odent
7ba9ea8aeb Time/EN: Fix empty group match
Summary: sad_palpatine

Differential Revision: D23718913

fbshipit-source-id: 363bf9a43d8d1cd77405882bc70a7fa1a1de2dbe
2020-09-15 17:22:00 -07:00
Julien Odent
ef2b1b1b0e Time/FR: Some speed up
Summary: Guarding against grains, shortening regexes.

Reviewed By: jtliao

Differential Revision: D23387716

fbshipit-source-id: de84d0efa79c4ae10bd9fbf14e82a724fee1a1f2
2020-08-28 09:48:15 -07:00
Arjan Scherpenisse
df2ada617a NL/Duration: Add "anderhalf uur" (#502)
Summary: Pull Request resolved: https://github.com/facebook/duckling/pull/502

Reviewed By: patapizza

Differential Revision: D22260625

Pulled By: haoxuany

fbshipit-source-id: bf44fdab7def19f6dd0e0ef7763c112a3b024396
2020-08-05 15:34:05 -07:00
Julien Odent
3d5e1c3bad Time/DE: Don't parse "so"
Summary:
"so" is an adverb in German: https://github.com/wit-ai/wit/issues/1860
It's also a short form for "Sonntag" (Sunday); making the dot mandatory.
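
The effect of making the dot mandatory can be sketched with two hypothetical regexes. `OLD` and `NEW` are written for this example and are not the patterns in Duckling's German time rules.

```python
import re

# Hypothetical before/after patterns for illustration only.
OLD = re.compile(r"\b(sonntag|so\.?)(?!\w)", re.IGNORECASE)  # dot optional
NEW = re.compile(r"\b(sonntag|so\.)", re.IGNORECASE)         # dot mandatory

sentence = "das ist so gut"
print(OLD.search(sentence) is not None)  # True: the adverb "so" is misread as Sunday
print(NEW.search(sentence) is not None)  # False: no dot, so no match

print(NEW.search("So. um 10 Uhr") is not None)  # True: abbreviated form still works
print(NEW.search("am Sonntag") is not None)     # True: full weekday name still works
```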

Reviewed By: haoxuany

Differential Revision: D22900791

fbshipit-source-id: 8dc873f79a21ca2add074f9c664e84fae56f1e67
2020-08-03 12:34:49 -07:00
Julien Odent
4846641456
Merge pull request #515 from patapizza/fixup-T70792907-master
Re-sync with internal repository
2020-07-30 15:20:39 -07:00