duckling

mirror of https://github.com/facebook/duckling.git synced 2024-12-01 08:19:36 +03:00

Author	SHA1	Message	Date
Bartosz Nitka	b108ab260f	Allocate less in lookupRegexp Summary: Contrary to my intuitions this part is the lion's share of allocations in `lookupRegexp`. I'd have expected `Text` operations to dwarf it. It's a bit dubious that we build such big lists that it matters; perhaps in the future we can explore limiting the number of matches considered. Reviewed By: patapizza Differential Revision: D4745711 fbshipit-source-id: ebdc1aa	2017-03-21 09:19:18 -07:00
Bartosz Nitka	56a039eef1	Optimize isRangeValid Summary: `isRangeValid` was doing lots of random indexing inside a Text. Since we already have a convenient O(1), indexable `Vector Char` we can just use it instead. Reviewed By: patapizza Differential Revision: D4744297 fbshipit-source-id: b23011b	2017-03-21 08:49:16 -07:00
Bartosz Nitka	58bf36b9f4	Optimize isAdjacent Summary: `isAdjacent` was doing a ton of useless copies and redundant work. By pre-computing a `firstNonAdjacent` table we can answer every `isAdjacent` query in `O(1)` time and (almost?) no allocations. It may be a symptom of algorithmic problems, but we shouldn't make it more expensive than it needs to be. Reviewed By: patapizza Differential Revision: D4744172 fbshipit-source-id: dd70be2	2017-03-21 07:34:24 -07:00
Bartosz Nitka	26b1327bcd	Make Document type abstract Summary: This will let me do smarter things on document construction, like precomputing where all the whitespace is so that I can answer `isAdjacent` in O(1) time. If I'm measuring things right my next diff will cut down allocations 4x on problematic inputs. Reviewed By: patapizza Differential Revision: D4742664 fbshipit-source-id: 7e14e25	2017-03-20 20:49:24 -07:00
Bartosz Nitka	09acefbcf5	Make Show Dimension "law-abiding" Summary: `Show` should print things close to source level representation. I wanted to generate some tests from inputs that cause problems and there was no way to get source level representation of Dimension. Reviewed By: patapizza Differential Revision: D4723711 fbshipit-source-id: fff658d	2017-03-16 16:34:16 -07:00
Julien Odent	e76cee3a6d	Rename Finance to AmountOfMoney Summary: Because it makes more sense. Reviewed By: JonCoens Differential Revision: D4721646 fbshipit-source-id: 449bfb4	2017-03-16 14:49:44 -07:00
Julien Odent	54c9448fba	Rename Number to Numeral Summary: For consistency with the dimension name. Reviewed By: JonCoens Differential Revision: D4722216 fbshipit-source-id: 82c56d3	2017-03-16 13:49:16 -07:00
Julien Odent	33fa98734a	Fix 'no dia 20' Summary: * 'no dia 20' (on the 20) * Unifying two rules into one, with a day grain See https://github.com/wit-ai/wit/issues/388 Reviewed By: blandinw Differential Revision: D4715780 fbshipit-source-id: e990954	2017-03-15 13:49:17 -07:00
Julien Odent	1c98c0308c	Fix Some in README Summary: #accept2ship Reviewed By: niteria Differential Revision: D4715804 fbshipit-source-id: d53ca9a	2017-03-15 13:19:36 -07:00
Jonathan Coens	41800a3171	Move onto dependent-sum instead of custom local data Some Summary: No need to reinvent the wheel when `dependent-sum` has what we need. I re-export `Some(..)` from `Duckling.Dimensions.Types` to cut down on import bloat. Instead of a `Read` instance I created a `fromName` function. Reviewed By: zilberstein Differential Revision: D4710014 fbshipit-source-id: 1d4e86d	2017-03-15 10:34:17 -07:00
Bartosz Nitka	d23ae54ab9	.gitignore .stack-work Summary: stack creates this directory, we should prevent it from being committed. Reviewed By: JonCoens Differential Revision: D4713790 fbshipit-source-id: 34b723d	2017-03-15 10:04:30 -07:00
Bartosz Nitka	1a251d8e42	Use HashMap.lookupDefault Summary: This is a small stylistic improvement. Reviewed By: patapizza Differential Revision: D4713463 fbshipit-source-id: 47720d3	2017-03-15 08:19:11 -07:00
Julien Odent	1edf62f347	Adding logo Summary: happy_duck Reviewed By: niteria Differential Revision: D4713395 fbshipit-source-id: dd1c141	2017-03-15 08:04:31 -07:00
Julien Odent	ea80ab07d3	Update maintainer email Summary: . Reviewed By: niteria Differential Revision: D4713313 fbshipit-source-id: 4fbeabb	2017-03-15 07:49:12 -07:00
Julien Odent	cc016bb178	Refactoring + return domain Summary: * Simplified `Url` to only keep track of what we need (we can change back later) * Normalize domain: remove subdomains like `www`, `www2` and lower case * Return the full domain in the JSON value field * Updated offensive url example Reviewed By: JonCoens Differential Revision: D4705403 fbshipit-source-id: e5d11ee	2017-03-14 13:49:20 -07:00
Jonathan Coens	1b91b70c58	codemod DNumber to Numeral Summary: `DNumber` is a terrible name and was only there because legacy. `Numeral` makes more sense for this dimension, so let's use that instead. Reviewed By: patapizza Differential Revision: D4707167 fbshipit-source-id: cd78aa3	2017-03-14 13:34:11 -07:00
Bartosz Nitka	ec39c21593	Make the regexp less dangerous Summary: The current regexp matches sequences of numbers of unbounded length with lots of backtracking. Since phone numbers are shorter than X=20 characters we can put a bound on every currently unbounded match. Additionally we can use groups that don't capture, to avoid marshalling data that we won't need. Reviewed By: JonCoens Differential Revision: D4706862 fbshipit-source-id: 39ca9bb	2017-03-14 12:19:12 -07:00
Julien Odent	2f4ecfba08	Update README Summary: Doc to extend existing dimension/language support Reviewed By: JonCoens Differential Revision: D4706035 fbshipit-source-id: a8ecca4	2017-03-14 11:34:11 -07:00
Julien Odent	483ad4a191	OverloadedStrings for Debug Summary: #accept2ship Reviewed By: niteria Differential Revision: D4705625 fbshipit-source-id: 1245858	2017-03-14 08:34:11 -07:00
Bartosz Nitka	28d53fce30	Remove ruleIntersect2 Summary: It is no longer necessary after D4676812 and D4698788. `"I have 9 am 12 pm 1 pm 2pm 4 pm 3 pm on Saturday"` now works in less than a second, it used to be 10s. The test suite also got 3s faster. Reviewed By: patapizza Differential Revision: D4701890 fbshipit-source-id: 107a55f	2017-03-14 05:04:12 -07:00
Zejun Wu	3001604548	Clean redundant parentheses to test landcastle Summary: Clean redundant parentheses to test landcastle opt-out-review Differential Revision: D4703203 verified-sandcastle fbshipit-source-id: def175d	2017-03-13 18:19:24 -07:00
Bartosz Nitka	003604dce7	Optimize simple time predicates Summary: This is the next step for: https://fb.facebook.com/groups/527352907463243/permalink/600056483526218/ This: * changes the time language to be able to track contradictions (`EmptyPredicate`) * changes the time language to be able to collect non-contradicting pieces, like month and hour and unify them * provides an efficient way to convert those pieces into (past,future) time series * adds AMPM predicate runner - there's a bit of overlap with is12H, but it basically works * changes a test case that was wrong before * regenerates classifiers, I'm not sure why they changed exactly Before: ``` res <- H.io $ let sentence = "10am thurs 4.30 thurs 12pm sat" in (debugTokens sentence $ analyze sentence (testContext {lang = EN}) HashSet.empty) (15.50 secs, 6,171,188,928 bytes) res <- H.io $ let sentence = "I have 9 am 12 pm 1 pm 2pm 4 pm 3 pm on Saturday" in (debugTokens sentence $ analyze sentence (testContext {lang = EN}) HashSet.empty) (110.82 secs, 44,031,569,512 bytes) ``` After: ``` res <- H.io $ let sentence = "10am thurs 4.30 thurs 12pm sat" in (debugTokens sentence $ analyze sentence (testContext {lang = EN}) HashSet.empty) (1.24 secs, 703,020,912 bytes) res <- H.io $ let sentence = "I have 9 am 12 pm 1 pm 2pm 4 pm 3 pm on Saturday" in (debugTokens sentence $ analyze sentence (testContext {lang = EN}) HashSet.empty) (9.51 secs, 5,891,109,592 bytes) ``` Reviewed By: JonCoens Differential Revision: D4676812 fbshipit-source-id: 9810203	2017-03-13 17:04:10 -07:00
Julien Odent	fd80953407	Adding Feb tomorrow Summary: . Reviewed By: niteria Differential Revision: D4700059 fbshipit-source-id: 3d63aa4	2017-03-13 14:04:22 -07:00
Julien Odent	2e50aa5ea0	Fix 'tomorrow July' + IT fixes Summary: * we weren't checking the right reference time in `takeNth` and `takeN` * fixing resulting failing tests for `IT` * `analyzedNTest` to check that input results in `n` parsed tokens Reviewed By: niteria Differential Revision: D4698788 fbshipit-source-id: 2cd4762	2017-03-13 12:04:17 -07:00
Bartosz Nitka	5f6c4fcec3	Make the license field more precise Summary: `cabal` is spewing this (it still successfully loads): ``` Warning: 'license: BSD' is not a recognised license. The known licenses are: GPL, GPL-2, GPL-3, LGPL, LGPL-2.1, LGPL-3, AGPL, AGPL-3, BSD2, BSD3, MIT, ISC, MPL-2.0, Apache, Apache-2.0, PublicDomain, AllRightsReserved, OtherLicense ``` Looking at the LICENSE file we have in the repo and the wikipedia page: https://en.wikipedia.org/wiki/BSD_licenses, it looks like we're using BSD3. Reviewed By: patapizza Differential Revision: D4697670 fbshipit-source-id: 6c80078	2017-03-13 06:04:10 -07:00
Julien Odent	161889c3e6	README.md + updating cabal Summary: * basic `README.md` * updated `duckling.cabal` Reviewed By: JonCoens Differential Revision: D4691967 fbshipit-source-id: 0a5cdf7	2017-03-10 15:04:23 -08:00
Julien Odent	d5690f5e5e	CONTRIBUTING.md Summary: https://our.intern.facebook.com/intern/dex/open-source/open-source-licenses/#a-contributing-template Adapted https://github.com/facebook/bistro/blob/master/CONTRIBUTING.md for `Our Development Process`. Test-driven workflow. Reviewed By: JonCoens Differential Revision: D4691472 fbshipit-source-id: d296c77	2017-03-10 14:49:18 -08:00
Julien Odent	ab06262291	Strip off TODO/FIXME Summary: as the title says Differential Revision: D4682120 fbshipit-source-id: 3f66286	2017-03-10 12:04:16 -08:00
Julien Odent	69aeff3a71	Fix st build Summary: `RebindableSyntax` looks for `fromString` in scope. Reviewed By: JonCoens Differential Revision: D4675221 fbshipit-source-id: d7ff49d	2017-03-09 10:49:26 -08:00
FBShipIt	3f8e52e70a	Initial commit fbshipit-source-id: 301a10f448e9623aa1c953544f42de562909e192	2017-03-08 10:33:56 -08:00

30 Commits
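
The "Optimize isAdjacent" commit above describes pre-computing a `firstNonAdjacent` table so that each `isAdjacent` query takes O(1) time. A minimal sketch of that idea follows. It is not the code from the commit: the names `buildTable` and `isSeparator`, the choice of space and tab as separators, and the use of a boxed `Data.Array` over a plain `String` are all assumptions made for illustration.

```haskell
import Data.Array (Array, bounds, listArray, (!))

-- Hypothetical separator set: which characters may sit between two
-- "adjacent" tokens. Assumed here to be space and tab.
isSeparator :: Char -> Bool
isSeparator c = c == ' ' || c == '\t'

-- table ! i is the smallest index j >= i whose character is not a
-- separator, or the input length if only separators remain.
-- Built in a single right-to-left pass over the input.
buildTable :: String -> Array Int Int
buildTable s = listArray (0, n - 1) (snd (foldr step (n, []) (zip [0 ..] s)))
  where
    n = length s
    step (i, c) (next, acc)
      | isSeparator c = (next, next : acc)
      | otherwise     = (i, i : acc)

-- True when every character in the half-open range [a, b) is a separator.
-- After the table is built, this is one bounds check and one array lookup.
isAdjacent :: Array Int Int -> Int -> Int -> Bool
isAdjacent table a b = b >= a && (a > hi || table ! a >= b)
  where
    (_, hi) = bounds table
```

For the input `"3  pm"`, `isAdjacent (buildTable "3  pm") 1 3` holds because positions 1 and 2 are both spaces, while `isAdjacent (buildTable "3  pm") 0 3` does not because position 0 holds `'3'`.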