![Duckling Logo](https://github.com/facebookincubator/duckling/raw/master/logo.png)
# Duckling
Duckling is a Haskell library that parses text into structured data.
```
"the first Tuesday of October"
=> {"value":"2017-10-03T00:00:00.000-07:00","grain":"day"}
```
## Requirements
A Haskell environment is required. We recommend using
[stack](https://haskell-lang.org/get-started).
## Quickstart
To compile and run the binary:
```
$ stack build
$ stack exec duckling-example-exe
```
The first time you run it, it will download all required packages.
See `exe/ExampleMain.hs` for an example of how to integrate Duckling into your
project.
## Supported dimensions
Duckling supports many languages, but most don't support all dimensions yet
(we need your help!).
| Dimension | Example input | Example value output
| --------- | ------------- | --------------------
| `Distance` | "6 miles" | `{"value":6,"type":"value","unit":"mile"}`
| `Duration` | "3 mins" | `{"value":3,"minute":3,"unit":"minute","normalized":{"value":180,"unit":"second"}}`
| `Email` | "duckling-team@fb.com" | `{"value":"duckling-team@fb.com"}`
| `AmountOfMoney` | "42€" | `{"value":42,"type":"value","unit":"EUR"}`
| `Numeral` | "eighty eight" | `{"value":88,"type":"value"}`
| `Ordinal` | "33rd" | `{"value":33,"type":"value"}`
| `PhoneNumber` | "+1 (650) 123-4567" | `{"value":"(+1) 6501234567"}`
| `Quantity` | "3 cups of sugar" | `{"value":3,"type":"value","product":"sugar","unit":"cup"}`
| `Temperature` | "80F" | `{"value":80,"type":"value","unit":"fahrenheit"}`
| `Time` | "today at 9am" | `{"values":[{"value":"2016-12-14T09:00:00.000-08:00","grain":"hour","type":"value"}],"value":"2016-12-14T09:00:00.000-08:00","grain":"hour","type":"value"}`
| `Url` | "https://api.wit.ai/message?q=hi" | `{"value":"https://api.wit.ai/message?q=hi","domain":"api.wit.ai"}`
| `Volume` | "4 gallons" | `{"value":4,"type":"value","unit":"gallon"}`
## Extending Duckling
To regenerate the classifiers and run the test suite:
```
$ stack build :duckling-regen-exe && stack exec duckling-regen-exe && stack test
```
It's important to regenerate the classifiers after updating the code and before
running the test suite.
To extend Duckling's support for a dimension in a given language, you typically
need to update two files:
* `Duckling/<dimension>/<language>/Rules.hs`
* `Duckling/<dimension>/<language>/Corpus.hs`
Rules have a name, a pattern and a production.
Patterns are used to perform character-level matching (regexes on input) and
concept-level matching (predicates on tokens).
Productions are arbitrary functions that take a list of tokens and return a new
token.
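
To make the name/pattern/production split concrete, here is a minimal, self-contained sketch of a toy rule engine. It is not Duckling's actual code: the `Token`, `PatternItem`, `Rule`, `applyRule`, and `dozenRule` names are all hypothetical, and a case-insensitive literal match stands in for a real regex so the example needs nothing beyond `base`.

```haskell
module Main where

import Data.Char (toLower)

-- Toy token type (hypothetical): a recognized number or a raw word.
data Token
  = Number Int
  | Word String
  deriving (Eq, Show)

-- A pattern item inspects one token.
data PatternItem
  = Literal String             -- stand-in for a regex on the input text
  | Predicate (Token -> Bool)  -- concept-level match on a token

-- A rule bundles a name, a pattern, and a production.
data Rule = Rule
  { ruleName    :: String
  , rulePattern :: [PatternItem]
  , ruleProd    :: [Token] -> Maybe Token
  }

-- Does a single pattern item accept a single token?
matches :: PatternItem -> Token -> Bool
matches (Literal s) (Word w) = map toLower s == map toLower w
matches (Literal _) _        = False
matches (Predicate p) t      = p t

-- Run the production only if every pattern item accepts its token.
applyRule :: Rule -> [Token] -> Maybe Token
applyRule rule toks
  | length toks == length (rulePattern rule)
  , and (zipWith matches (rulePattern rule) toks) = ruleProd rule toks
  | otherwise = Nothing

isNumber :: Token -> Bool
isNumber (Number _) = True
isNumber _          = False

-- Example rule: "<number> dozen" produces the number times twelve.
dozenRule :: Rule
dozenRule = Rule
  { ruleName    = "<number> dozen"
  , rulePattern = [Predicate isNumber, Literal "dozen"]
  , ruleProd    = \toks -> case toks of
      (Number n : _) -> Just (Number (n * 12))
      _              -> Nothing
  }

main :: IO ()
main = do
  print (applyRule dozenRule [Number 2, Word "dozen"])    -- Just (Number 24)
  print (applyRule dozenRule [Word "two", Word "dozen"])  -- Nothing
```

The production returns a `Maybe` in this sketch so that a rule can decline to produce a token even when its pattern matched; the real rule definitions live in the `Rules.hs` files listed above.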
The corpus is a list of examples that should parse; the negative corpus is a
list of examples that shouldn't. The reference time for the corpus is Tuesday
Feb 12, 2013 at 4:30am.
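
As a rough illustration of how a corpus and a negative corpus can be checked against a parser, consider the sketch below. Everything in it is an assumption made for illustration: `toyParse` is a hypothetical stand-in for a real parser, and the data layout does not mirror the structure of Duckling's `Corpus.hs` files.

```haskell
module Main where

-- Hypothetical stand-in parser: recognizes only a few number words.
toyParse :: String -> Maybe Int
toyParse s = lookup s [("one", 1), ("two", 2), ("three", 3)]

-- Corpus: inputs paired with the value they should parse to.
corpus :: [(String, Int)]
corpus = [("one", 1), ("two", 2), ("three", 3)]

-- Negative corpus: inputs that should not parse at all.
negativeCorpus :: [String]
negativeCorpus = ["won", "to", "tree"]

-- Corpus inputs whose parse result differs from the expected value.
corpusFailures :: (String -> Maybe Int) -> [(String, Int)] -> [String]
corpusFailures parser examples =
  [ input | (input, expected) <- examples, parser input /= Just expected ]

-- Negative-corpus inputs that unexpectedly produced a parse.
negativeFailures :: (String -> Maybe Int) -> [String] -> [String]
negativeFailures parser inputs =
  [ input | input <- inputs, parser input /= Nothing ]

main :: IO ()
main = do
  putStrLn ("corpus failures: " ++ show (corpusFailures toyParse corpus))
  putStrLn ("negative corpus failures: "
            ++ show (negativeFailures toyParse negativeCorpus))
```

Both checks return the offending inputs rather than a single boolean, which makes it easier to see which example broke after a rule change.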
`Duckling.Debug` provides a few debugging tools:
```
> :l Duckling.Debug
> debug EN "in two minutes" [This Time]
in|within|after <duration> (in two minutes)
-- regex (in)
-- <integer> <unit-of-duration> (two minutes)
-- -- integer (0..19) (two)
-- -- -- regex (two)
-- -- minute (grain) (minutes)
-- -- -- regex (minutes)
[Entity {dim = "time", body = "in two minutes", value = "{\"values\":[{\"value\":\"2013-02-12T04:32:00.000-02:00\",\"grain\":\"second\",\"type\":\"value\"}],\"value\":\"2013-02-12T04:32:00.000-02:00\",\"grain\":\"second\",\"type\":\"value\"}", start = 0, end = 14}]
```
## License
Duckling is BSD-licensed. We also provide an additional patent grant.