duckling

mirror of https://github.com/facebook/duckling.git synced 2024-12-24 12:42:53 +03:00

Author	SHA1	Message	Date
Julien Odent	d3d3703015	HE: Time Summary: Time dimension for Hebrew. Commented out the failing tests that actually also fail in Clojure. Reviewed By: JonCoens Differential Revision: D4970308 fbshipit-source-id: b455142	2017-04-28 10:04:35 -07:00
Julien Odent	ab2c89df4f	IT: Temperature Summary: Temperature dimension for Italian. Reviewed By: JonCoens Differential Revision: D4970338 fbshipit-source-id: 024802e	2017-04-28 10:04:35 -07:00
Bartosz Nitka	74936df848	Make matching anywhere vs at pos obvious Summary: This change refactors the Engine to use a different code path for when we're calling `lookupItem` to find a first token `Node` matching the rule and a different one for subsequent ones. This division lets us get better invariants and more importantly do full text regexp matches only when necessary. This should be particularly useful for longer texts. Reviewed By: patapizza Differential Revision: D4953918 fbshipit-source-id: e3a69ad	2017-04-28 09:19:20 -07:00
Julien Odent	9269727617	PT: Bring latest changes Summary: * PhoneNumber: support for "ramal" as extension keyword Reviewed By: niteria Differential Revision: D4959209 fbshipit-source-id: cd12c1f	2017-04-28 08:04:22 -07:00
Julien Odent	5ba2c9e9a1	NB: Bringing latest changes Summary: * Numeral: fixed "hundre" (not "hundred") * Numeral: added "tretti", "søtti" * Time: updated last times to support "sist" * Time: christmas days Reviewed By: niteria Differential Revision: D4958919 fbshipit-source-id: e4eecf5	2017-04-28 08:04:22 -07:00
Julien Odent	2182d94edb	Bring latest updates for ID Summary: * added one example in `AmountOfMoney` Reviewed By: niteria Differential Revision: D4958635 fbshipit-source-id: c70ce7c	2017-04-28 08:04:22 -07:00
Julien Odent	3f40625339	Temperature for Croatian Summary: Temperature dimension for Croatian Reviewed By: niteria Differential Revision: D4958590 fbshipit-source-id: fe6c2e4	2017-04-28 08:04:22 -07:00
Julien Odent	3cc3266e28	Quantity for Croatian Summary: Quantity dimension for Croatian. Reviewed By: niteria Differential Revision: D4958501 fbshipit-source-id: b90c8f6	2017-04-28 08:04:22 -07:00
Julien Odent	0372f4f3da	Volume for Croatian Summary: Volume dimension for Croatian Reviewed By: niteria Differential Revision: D4957186 fbshipit-source-id: 63012ad	2017-04-28 08:04:22 -07:00
Julien Odent	0aa4aa56bb	Distance for Croatian Summary: Distance dimension for Croatian. Reviewed By: niteria Differential Revision: D4957067 fbshipit-source-id: 232ce30	2017-04-28 08:04:21 -07:00
Julien Odent	35b9101c48	VI: Time Summary: * Time dimension for Vietnamese. * Expose `debugContext`. Reviewed By: niteria Differential Revision: D4963594 fbshipit-source-id: 2373735	2017-04-28 08:04:21 -07:00
Julien Odent	e4d4531877	VI: Duration Summary: Duration dimension for Vietnamese. This only uses the common rule. Reviewed By: niteria Differential Revision: D4962329 fbshipit-source-id: 9273245	2017-04-28 08:04:21 -07:00
Julien Odent	432ff51bd0	VI: TimeGrain Summary: TimeGrain dimension for Vietnamese. Reviewed By: niteria Differential Revision: D4959399 fbshipit-source-id: e053413	2017-04-28 08:04:21 -07:00
Julien Odent	3314ddc7a4	VI: Ordinal Summary: Ordinal for Vietnamese. Reviewed By: niteria Differential Revision: D4959285 fbshipit-source-id: 7212cc9	2017-04-28 08:04:21 -07:00
Julien Odent	0370c452f1	Time Summary: Time dimension for Croatian. Reviewed By: niteria Differential Revision: D4954399 fbshipit-source-id: 906c4a6	2017-04-26 09:19:27 -07:00
Julien Odent	2d0594576f	Duration Summary: Duration dimension for Croatian. Reviewed By: niteria Differential Revision: D4947983 fbshipit-source-id: 8e55a7e	2017-04-26 09:19:27 -07:00
Julien Odent	1c15d0bbb2	TimeGrain Summary: TimeGrain dimension for Croatian. Reviewed By: niteria Differential Revision: D4947837 fbshipit-source-id: b86d256	2017-04-26 09:19:27 -07:00
Julien Odent	b32696f8eb	AmountOfMoney Summary: AmountOfMoney dimension for Croatian. Reviewed By: niteria Differential Revision: D4947584 fbshipit-source-id: a20670a	2017-04-26 09:19:27 -07:00
Julien Odent	0f98a42b03	Ordinal Summary: Ordinal dimension for Croatian. Reviewed By: niteria Differential Revision: D4947244 fbshipit-source-id: 54bda8f	2017-04-26 09:19:27 -07:00
Julien Odent	840deda7dd	Setup + Numeral Summary: Setup + Numeral dimension for Croatian. Reviewed By: niteria Differential Revision: D4946964 fbshipit-source-id: 204429b	2017-04-26 09:19:26 -07:00
Bartosz Nitka	c70cf6d38d	Move Duckling.Stash to Duckling.Types.Stash Summary: This is for consistency with Duckling.Types.Document Reviewed By: patapizza Differential Revision: D4948569 fbshipit-source-id: 459565a	2017-04-25 16:49:18 -07:00
Bartosz Nitka	8db73688d7	Move Document and helpers to a fresh module Summary: Document had its internal details leaked over 2 files. This consolidates it. It took a long time to make this perf neutral (now it's even a tiny win), for reasons I don't completely understand. The INLINE pragma on byteStringFromPos I semi-understand, but I also had to move isRangeValid to Document and that's a bit of a mystery. Reviewed By: patapizza Differential Revision: D4948449 fbshipit-source-id: ffb251a	2017-04-25 16:49:18 -07:00
Bartosz Nitka	924516103b	Revert Duckling part of 'clean up unused imports' Summary: it doesn't take .cabal into account Reviewed By: patapizza Differential Revision: D4938400 fbshipit-source-id: 8bc99a5	2017-04-24 07:34:27 -07:00
Julien Odent	dbe9e73541	Duration Summary: Duration dimension for Hebrew. Reviewed By: niteria Differential Revision: D4930403 fbshipit-source-id: 690db8f	2017-04-24 06:49:40 -07:00
Julien Odent	efa38401b5	TimeGrain Summary: TimeGrain dimension for Hebrew. Reviewed By: niteria Differential Revision: D4930294 fbshipit-source-id: 9c0f0da	2017-04-24 06:49:40 -07:00
Julien Odent	f5f4889770	Ordinal Summary: Ordinal dimension for Hebrew. Reviewed By: niteria Differential Revision: D4930162 fbshipit-source-id: 02545ae	2017-04-24 06:49:40 -07:00
Julien Odent	bd96d3dd95	Setup + Numeral Summary: Setup for Hebrew + Numeral dimension Reviewed By: niteria Differential Revision: D4930041 fbshipit-source-id: 965132b	2017-04-24 06:49:40 -07:00
Bartosz Nitka	b26aa7d84d	clean up unused imports Summary: This diff was generated by running `hsclimps` PLEASE TAKE ONE OF THE FOLLOWING ACTIONS AS SOON AS POSSIBLE: 1) Select Accept and Ship to land this change 2) If you have issues with this diff, request changes 3) If you are no longer the owner, add reviewers and update the `.context` file with the appropriate owner NOTE: If the diff is unable to land because of a merge conflict I will automatically update it for you. #accept2ship Reviewed By: niteria Differential Revision: D4937839 fbshipit-source-id: bb3d330	2017-04-24 05:19:24 -07:00
Bartosz Nitka	7f7cc70d72	Make first pass more obvious Summary: Separating out the first pass lets us avoid repeated filtering and makes the structure of the algorithm a bit more clear. Previously `Stash.null` was used as a test for being part of the first pass or not, but that is a bit indirect. Encoding the algorithm structure (the state automaton) as function calls lets us make additional assumptions. It also has a nice side effect of costs being attributed to first/subsequent passes in the profile. I also prepend to `matches` because it's likely to be bigger. Reviewed By: patapizza Differential Revision: D4922195 fbshipit-source-id: 0aec79f	2017-04-20 11:49:15 -07:00
Bartosz Nitka	878f85b9e1	Codemod intersectMB to intersect Summary: `intersectMB` was a name used for the purpose of migrating. This is the last part of the migration. Reviewed By: patapizza Differential Revision: D4906098 fbshipit-source-id: a70af78	2017-04-18 10:19:20 -07:00
Bartosz Nitka	fe39a55a4c	Use intervalMB instead of interval Summary: This continues the work from: "[Duckling] Don't produce trivially empty Tokens" All the Rules should use intervalMB from now on. Reviewed By: patapizza Differential Revision: D4906072 fbshipit-source-id: 277b961	2017-04-18 10:19:20 -07:00
Bartosz Nitka	a91e787bb7	Derive Eq, Show for TimeIntervalType Summary: This is always useful to have. Reviewed By: patapizza Differential Revision: D4864208 fbshipit-source-id: b879893	2017-04-18 08:19:20 -07:00
Bartosz Nitka	879b103ca3	Fix indexing problems with new regexp matcher Summary: My change had a couple of problems: * utf8 character width logic was completely wrong for characters that need 3 or 4 bytes * `Array.listArray (start, end)` produces an array where `end` is a valid index * because of ^ the `arraySize` logic also has to change Reviewed By: watashi, darshankapashi Differential Revision: D4894355 fbshipit-source-id: 8d07dfd	2017-04-14 15:49:17 -07:00
Bartosz Nitka	e7aeef5436	Avoid allocations and encoding in regexp matching Summary: The rationale is explained in a new Note. Reviewed By: patapizza Differential Revision: D4884104 fbshipit-source-id: 81f36ee	2017-04-14 12:19:21 -07:00
Bartosz Nitka	3d18cf5ea9	Don't produce trivially empty Tokens Summary: We can detect certain kinds of contradictions sooner; producing a token with an unresolvable Predicate is wasteful. For a text like: ``` "Demain apres midi 14h 15 h 16h vendredi 14 a 15h" ``` it could produce 7000 tokens with empty predicates. After this change it produces none and we get a 4x improvement in time and a 6x improvement in allocations. Note that I only covered `ruleIntersect*` here. I need to do this for other instances as well. Reviewed By: JonCoens Differential Revision: D4871078 fbshipit-source-id: 9f0e7ad	2017-04-11 16:35:05 -07:00
Kevin Cros	62bc5a317b	Using hashmap look up instead of 'case of' Summary: Updating regex with hashmap look ups. Reviewed By: patapizza Differential Revision: D4848178 fbshipit-source-id: 4d5ded8	2017-04-11 11:04:20 -07:00
ADAM LIU	928139569c	Refactor of Duckling.Numeral.TR to hashmap lookup Summary: Update of TR Rules hashmap Reviewed By: patapizza Differential Revision: D4860819 fbshipit-source-id: 6f5a722	2017-04-11 09:34:23 -07:00
Bartosz Nitka	f7b3f2ed73	Detect interval contradictions sooner Summary: So far contradictions from intersection only propagated through intersection. This change makes it so that it also propagates through intervals and lets intervals also generate contradictions. Reviewed By: patapizza Differential Revision: D4864160 fbshipit-source-id: 8348267	2017-04-10 16:35:27 -07:00
Bartosz Nitka	1cf8496967	tt helper for returning Time Tokens Summary: This is a very common pattern (>1k occurrences). Replacing it with something shorter makes the rules a bit less boilerplate-y. Feel free to bikeshed the name, I can easily redo the codemod. Reviewed By: patapizza Differential Revision: D4848864 fbshipit-source-id: 7baeee3	2017-04-10 12:34:43 -07:00
Bartosz Nitka	f46539ced2	Type for Closed/Open intervals Summary: This makes the code easier to read. I'm not attached to naming, but this is standard terminology from topology. Reviewed By: JonCoens, patapizza Differential Revision: D4848740 fbshipit-source-id: 79c2c20	2017-04-07 12:19:17 -07:00
Jonathan Coens	b3ca32104d	Simple example HTTP server Summary: Runs a `snap` server to return the supported targets as well as do parsing. It's a bit kludgy, but gets the job done. Reviewed By: patapizza Differential Revision: D4813197 fbshipit-source-id: 0fa165b	2017-04-06 17:04:48 -07:00
ADAM LIU	572ff95adf	Update RU Rules HashMap lookups update Summary: Update of RU Rules hashmap Reviewed By: patapizza Differential Revision: D4840947 fbshipit-source-id: 00cb679	2017-04-06 15:49:17 -07:00
Bartosz Nitka	78ecaa3728	Derive NFData for Entity Summary: This makes benchmarking easier. Reviewed By: JonCoens Differential Revision: D4846839 fbshipit-source-id: 9cc8dfa	2017-04-06 15:34:43 -07:00
Bartosz Nitka	290ca48e25	Fix 4:23am returning 5:23am Summary: This is the easiest way to fix it, but talking offline with Julien, we may need to revisit. It basically gets rid of time series where we were producing intervals that are not a multiple of the grain. Reviewed By: patapizza Differential Revision: D4841759 fbshipit-source-id: 1c4742a	2017-04-06 11:04:16 -07:00
Amelia Wilson	70ef9b1bbe	using hashmap lookups Summary: converting large regex lookups to hashmap lookups in Duckling/Numeral/FR/Rules.hs and Duckling/Ordinal/FR/Rules.hs Reviewed By: patapizza Differential Revision: D4836336 fbshipit-source-id: 2241a3a	2017-04-05 12:20:10 -07:00
Jonathan Coens	7c47431ce5	Upgrade to stackage 8.8 Summary: Just a little bounds bump Reviewed By: patapizza Differential Revision: D4835536 fbshipit-source-id: d51fbb8	2017-04-05 11:19:31 -07:00
Jonathan Coens	e2da9bc7fb	Upgrade to stackage 8.6 Summary: Moves to the 8.6 resolver, updates package limits, and fixes errors due to upgrade. Reviewed By: patapizza Differential Revision: D4810924 fbshipit-source-id: c8a64a9	2017-04-04 15:19:41 -07:00
Bartosz Nitka	e37bb7c186	Duckling monad for Engine Summary: This converts the code to monadic style, so that we can in the future: * stop threading the `Document` parameter everywhere * keep some state, like regexp match cache (I've already checked that it makes a substantial difference) There should be no difference in performance or behavior at this point. Reviewed By: patapizza Differential Revision: D4778808 fbshipit-source-id: a167ed8	2017-03-31 14:19:40 -07:00
Julien Odent	78228dea83	Update email Summary: Setup the correct email. Reviewed By: JonCoens Differential Revision: D4806876 fbshipit-source-id: a52f9f8	2017-03-30 16:20:08 -07:00
Bartosz Nitka	a1917a53f3	Make sure regen is rebuilt Summary: `stack exe/RegenMain.hs` uses runghc which is a tool we don't test with often. Making sure the executable is rebuilt and using it should be enough. Reviewed By: patapizza Differential Revision: D4783844 fbshipit-source-id: 459dbc4	2017-03-28 07:49:19 -07:00


83 Commits