446 Commits

Author SHA1 Message Date
Jerin Philip
ca9aa64926 Switch to work with ssplit-cpp both pcre2 and pcrecpp 2021-02-18 11:07:31 +00:00
Abhishek Aggarwal
b75e72e65d Added more explanation for FILES_TO_PACKAGE in README 2021-02-18 10:42:06 +01:00
Jerin Philip
d249dcbfaa Build doc updated with wasm-branch compatible command 2021-02-17 21:15:35 +00:00
Jerin Philip
b9d081dd45 Temporary: Updating marian-dev to wasm branch 2021-02-17 19:51:57 +00:00
Abhishek Aggarwal
9feebe5cb2 Allow using relative paths for packaging files
- PACKAGE_DIR cmake option can now accept relative paths
2021-02-17 20:06:04 +01:00
Jerin Philip
d72343567c BatchTranslator doesn't do thread_, residue from merge removed 2021-02-17 16:41:04 +00:00
Jerin Philip
70b57ee3e7 Redundant parser include fixed 2021-02-17 16:38:47 +00:00
Jerin Philip
7b10c35483 Hard abort if multithread path launched without multithread-support 2021-02-17 13:50:42 +00:00
Jerin Philip
47b9db0c45 Documentation formatting/syntax fix 2021-02-17 13:35:10 +00:00
Jerin Philip
72848ba0f6 Fixes UEdin builds after wasm-integration merge
A bug which crept in during manual merge is now fixed. PCItem -> Batch
on a PCQueue.

docs/marian-integration.md provides instructions to compile successfully
for multithread.
2021-02-17 13:28:58 +00:00
Abhishek Aggarwal
b86f8a7dc2 Improved README
- Clears up the spaghetti of model packaging
- Usage instructions
- Formatting changes
2021-02-17 14:21:51 +01:00
Jerin Philip
d005f73cb9 Reverting changes to PCQueue 2021-02-17 13:10:39 +00:00
Jerin Philip
10dcb8f548 Merge remote-tracking branch 'origin/wasm-integration' into jp/absorb-batch-translator
Merging wasm-integration. Single thread codepath seems functional.
Multithreading is broken.
2021-02-17 13:08:58 +00:00
Jerin Philip
44a44fa156 CMake build with submodule recursive clones 2021-02-17 11:48:00 +00:00
Jerin Philip
c205c82585 Updates to README with option changes 2021-02-17 01:12:30 +00:00
Jerin Philip
fba44bec8f Improving Batcher error message with new option names 2021-02-17 01:05:20 +00:00
Jerin Philip
69201ba44c Unify options with marian
Service specific options are renamed to align with marian-option naming
as follows:

1. max-input-sentence-tokens -> max-length-break (marian has a
   max-length-crop; this is the same, except that it breaks the input
   into multiple sentences rather than truncating/cropping it).
2. max-input-tokens -> mini-batch-words.
2021-02-17 00:54:30 +00:00
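The difference between crop and break semantics described in the commit above can be illustrated with a minimal sketch. The names `Tokens`, `cropTokens`, and `breakTokens` are hypothetical and are not taken from marian or from this repository; tokens are modelled as plain integers for simplicity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tokenized sentence (not a marian type).
using Tokens = std::vector<int>;

// Crop semantics: keep the first maxLength tokens and drop the rest.
Tokens cropTokens(const Tokens& tokens, std::size_t maxLength) {
  std::size_t n = std::min(tokens.size(), maxLength);
  return Tokens(tokens.begin(), tokens.begin() + n);
}

// Break semantics: split into consecutive pieces of at most maxLength
// tokens each, so no token is discarded.
std::vector<Tokens> breakTokens(const Tokens& tokens, std::size_t maxLength) {
  assert(maxLength > 0);  // a zero limit would never make progress
  std::vector<Tokens> pieces;
  for (std::size_t begin = 0; begin < tokens.size(); begin += maxLength) {
    std::size_t end = std::min(tokens.size(), begin + maxLength);
    pieces.emplace_back(tokens.begin() + begin, tokens.begin() + end);
  }
  return pieces;
}
```

With five tokens and a limit of two, `cropTokens` keeps only the first two tokens, while `breakTokens` returns three pieces of sizes 2, 2, and 1.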
Jerin Philip
0296a38cd4 Bunch of integers on containers to size_ts 2021-02-17 00:45:19 +00:00
Jerin Philip
d7556bc168 SentenceRanges: Class to work with string_views
Adds SentenceRanges in sentence_ranges.{h,cpp} and propagates use of the
class into the rest of the pipeline.

SentenceRanges, previously a vector<vector<...>>, is now a single flat
vector<string_view>. Annotations marking sentence boundaries are
additionally stored in the class, enabling sentence string_view access
through methods.
2021-02-17 00:31:44 +00:00
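The flattened layout described in the commit above might look roughly like the sketch below. This is not the class from sentence_ranges.{h,cpp}: the name `FlatSentenceRanges`, its members, and its methods are hypothetical. It assumes that all word views of a sentence point into the same underlying text buffer, in order.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical sketch of a flat sentence store; not the repository's class.
class FlatSentenceRanges {
 public:
  // Append the word views of one sentence and record where it ends.
  void addSentence(const std::vector<std::string_view>& words) {
    flat_.insert(flat_.end(), words.begin(), words.end());
    sentenceEnds_.push_back(flat_.size());
  }

  std::size_t numSentences() const { return sentenceEnds_.size(); }

  // View from the start of the sentence's first word to the end of its
  // last word. Assumes both words view the same underlying buffer.
  std::string_view sentence(std::size_t i) const {
    assert(i < sentenceEnds_.size());
    std::size_t begin = (i == 0) ? 0 : sentenceEnds_[i - 1];
    std::size_t end = sentenceEnds_[i];
    if (begin == end) return {};  // sentence with no words
    const char* first = flat_[begin].data();
    const std::string_view& last = flat_[end - 1];
    return std::string_view(
        first, static_cast<std::size_t>(last.data() + last.size() - first));
  }

 private:
  std::vector<std::string_view> flat_;     // all words, all sentences
  std::vector<std::size_t> sentenceEnds_;  // one-past-last word index per sentence
};
```

Storing a single vector plus boundary indices avoids one heap allocation per sentence, at the cost of computing each sentence's extent on access.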
Jerin Philip
9c907ea605 another int to size_t 2021-02-16 20:04:30 +00:00
Jerin Philip
4c8b655ac5 Batch cleanup
Moves Batch into batch.{h,cpp}.

- Id_ no longer used due to overflow concerns. (#27)
- size_t for places where signed integer is not preferred.
- Adjustments to response.{h,cpp}
2021-02-16 19:46:40 +00:00
Jerin Philip
65e7406970 Comments and lazy stuff to response 2021-02-16 17:00:53 +00:00
Motin
d907400a80 Updated test page to use the model structure from bergamot-models repo 2021-02-16 17:00:45 +02:00
Motin
b1e72ce75e Updated instructions on how to get all relevant models in place for the upcoming release 2021-02-16 15:46:15 +02:00
Abhishek Aggarwal
921c2eedf8 Updated config for min inference time
- This combination gives min inference time (~ 200 WPS)
   on local machine
2021-02-16 14:39:42 +01:00
Jerin Philip
d5a5e75451 Renaming variables; Enhancing documentation 2021-02-15 20:21:10 +00:00
Abhishek Aggarwal
c5c5339489 Re-enable simd shuffle pattern for intgemm compilation 2021-02-15 17:18:59 +01:00
Abhishek Aggarwal
3607523c24 Enabled COMPILE_WITHOUT_EXCEPTIONS for marian submodule 2021-02-15 16:54:50 +01:00
Abhishek Aggarwal
0374ac4696 Updated marian submodule
- Includes try/catch free builds
- Has ASSERTION=0 and DISABLE_EXCEPTION_CATCHING=1 for wasm builds
2021-02-15 16:36:26 +01:00
Jerin Philip
ca6ca154b9 Changing fn name from enqueue to produceTo(pcqueue) 2021-02-15 15:22:31 +00:00
abhi-agg
fc3ab33277
Merge pull request #26 from motin/wasm-integration
Turn off assertions and disable exception catching for wasm builds
2021-02-15 14:21:18 +01:00
Motin
9a5cf30bbb Revert "Enabled simd shuffle pattern for intgemm compilation"
This reverts commit 3dd7a60b35.
2021-02-15 15:03:00 +02:00
Motin
9a5ae9568e Turn off assertions and disable exception catching for wasm builds 2021-02-15 14:24:59 +02:00
Motin
91e45cb4f0 Prepend shortlist path with / 2021-02-15 13:59:49 +02:00
Abhishek Aggarwal
3dd7a60b35 Enabled simd shuffle pattern for intgemm compilation
- WORMHOLE cmake option is set to ON when compiling for WASM
- WASM module might not run on Chrome
2021-02-15 12:58:18 +01:00
Motin
64d57d8aa0 Use yaml for modelConfig on test page 2021-02-15 13:50:59 +02:00
Motin
7d6346d3b0 Add model config used in pr6 benchmarks 2021-02-15 13:35:22 +02:00
Motin
f3ff1d29ae Make modelConfig an object instead of string (less likelihood of typos) 2021-02-15 13:30:46 +02:00
Motin
fcc998ffa4 Add 10 lines of esen benchmark sentences to test page 2021-02-15 13:30:07 +02:00
Motin
1e94d78c4d Formatting 2021-02-15 13:19:39 +02:00
Motin
da56501c4f Finally found the original typo that made it appear as if loading the model in the test page was faster than elsewhere - the lexical shortlist was not being included at the right place in the model config 2021-02-15 13:10:10 +02:00
Motin
70bdcd4365 Fix typo from when fixing typo 2021-02-15 12:54:32 +02:00
Motin
dbdcdab115 Avoid use of unsafe eval in glue code 2021-02-15 11:59:03 +02:00
Motin
77f39545f3 Add time it takes to arrive to preRun to test page 2021-02-15 11:30:45 +02:00
Motin
49ad6514ae Add reproducible docker-based builds + let test page use these by default 2021-02-15 11:27:47 +02:00
Motin
7030fa0157 Ignore test page bundled artifacts 2021-02-15 11:25:13 +02:00
Motin
e50dd0909f Ignore contents in models directory 2021-02-15 11:23:08 +02:00
Motin
53e0b9fc5c Fix typo in lexical shortlist argument on test page 2021-02-15 11:22:23 +02:00
Motin
a33b3a3bb5 Add instructions on how to assemble and package the set of files expected by the test page 2021-02-15 11:21:36 +02:00
Motin
28c0ab2e04 Tweak words per second metric in the test page log 2021-02-15 10:38:36 +02:00