Commit Graph

446 Commits

Author SHA1 Message Date
Motin
d3969bcd2d Add support for translating multiple sentences on the test page + report words per second metric in the log 2021-02-15 10:34:57 +02:00
Motin
26ea5bba7a Some cleanup 2021-02-15 10:26:04 +02:00
Motin
f7c86518cf Update test page package-lock.json 2021-02-15 10:04:49 +02:00
Motin
d27a96fc53 Updated wasm readme 2021-02-15 10:04:15 +02:00
Jerin Philip
45a8309c69 Missed translation_result -> response rename 2021-02-14 22:28:08 +00:00
Jerin Philip
be455a3da1 Straightening multithreading in translator workers
BatchTranslators are now held in Service. Threads are separate, and
constructed via lambdas. Retaining BatchTranslator class and member
function (Probably a matter of taste I guess).

This should eliminate complaints in (#10), hopefully.
2021-02-14 22:14:01 +00:00
Jerin Philip
370e9e2fb6 {translation_result -> response}.h; propagates; 2021-02-14 20:37:46 +00:00
Jerin Philip
0fc6105df4 No more two TranslationResults (sort-of)
To avoid confusion, this commit renames
marian::bergamot::TranslationResult -> marian::bergamot::Response.
Usages of marian::bergamot::TranslationResults are updated across the
source to be consistent with the change and get source back working.
2021-02-14 20:27:53 +00:00
Jerin Philip
5bd4a1a3c0 Refactor: marian-TranslationResult and associated
marian-TranslationResult has more guards in place. Switching to a
construction on demand model for sentenceMappings. These changes
propagate to bergamot translation results.

Integration broke with the change in marian's internals, which are
updated accordingly to get back functionality.

Changes revealed a few bugs, which are fixed:

- ConfigParser already discovered in wasm-integration
  (a06530e92b).
- Lambda captures and undefined values in DeviceId
2021-02-14 20:05:02 +00:00
Andre Natal
0dbc8612c2 Adding missing bergamot-httpserver.js 2021-02-14 09:55:29 -08:00
Jerin Philip
ecc91c51e3 BatchTranslator* -> unique_ptr<BatchTranslator> 2021-02-14 13:23:46 +00:00
Jerin Philip
47323d21b9 Getting rid of unused variables in Batch 2021-02-14 13:05:05 +00:00
Andre Natal
1e413f71cd Including a more elaborate test page and a node webserver that sets the proper CORS headers and wasm MIME type 2021-02-13 18:23:25 -08:00
Jerin Philip
e585a9e786 Sanitizing Batch construction
Batch Ids cannot be set by outside classes to values < 0.

Batch.Id_ =
    -1 : Poison, for use in PCQueue
     0 : Default constructed, invalid batch.
    >0 : Legit batch.

Book-keeping for batch metrics (maxLength, numTokens, etc) and logging
are now moved to Batch. Batch is now a class instead of a struct with
accessors controlling how members can be modified to suit above.
2021-02-13 16:31:30 +00:00
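
The id convention described in e585a9e786 can be sketched roughly as below. This is a minimal illustration, not the project's actual Batch class: the member and method names (setId, addSentence, poison, isPoison, isValid) are hypothetical, and the setter rejects all non-positive ids, which is an assumption slightly stricter than the commit's wording. Only maxLength and numTokens are taken from the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical sketch of a Batch with guarded ids; names are illustrative.
class Batch {
 public:
  Batch() = default;  // id_ == 0: default constructed, invalid batch

  // The only way to obtain the poison value (-1), e.g. to signal
  // consumers of a producer-consumer queue to stop.
  static Batch poison() {
    Batch b;
    b.id_ = -1;
    return b;
  }

  // Outside callers may only assign legitimate ids (> 0).
  // Assumption: 0 is rejected here too, since it marks an invalid batch.
  void setId(int id) {
    if (id <= 0) throw std::invalid_argument("batch id must be > 0");
    id_ = id;
  }

  // Book-keeping for batch metrics lives inside Batch.
  void addSentence(std::size_t tokens) {
    ++size_;
    numTokens_ += tokens;
    maxLength_ = std::max(maxLength_, tokens);
  }

  int id() const { return id_; }
  bool isPoison() const { return id_ == -1; }
  bool isValid() const { return id_ > 0; }
  std::size_t size() const { return size_; }
  std::size_t numTokens() const { return numTokens_; }
  std::size_t maxLength() const { return maxLength_; }

 private:
  int id_ = 0;
  std::size_t size_ = 0;
  std::size_t numTokens_ = 0;
  std::size_t maxLength_ = 0;
};
```

Keeping id_ private means the three states (poison, invalid, legit) can only be reached through the constructor, poison(), or the checked setter.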
Jerin Philip
73a56a8f4f Refactoring batching-mechanisms into Batcher
Guided by the objective of moving the batching mechanism and the queueing
of requests to generate batches into a different thread. This commit is in
preparation for this functionality.

First, PCItem from the looks of it is *Batch*. Renamed to reflect the
same. Fingers crossed, hopefully no naming conflicts with marian.
BatchTranslator translates a "Batch" now, instead of
vector<RequestSentence>. Additional data members are setup at Batch to
enable development.

Workflows previously in Service, but more adequate in Batcher are now
moved, preparing to move Batcher/enqueuing of a request into a new
thread making it non-blocking. This will allow service to queue requests
into the batcher thread and exit, without waiting until the full-request
is queued.

Batcher now has a path with and without pcqueue.
2021-02-13 15:48:23 +00:00
Jerin Philip
77a600b637 Removing join() (#10) 2021-02-13 14:19:10 +00:00
Jerin Philip
f1d9f67b56 single-threaded run with --cpu-threads 0 (#10) 2021-02-13 11:43:25 +00:00
Jerin Philip
4764f11e95 Move BatchTranslator::thread_ to Service (#10)
Service now holds an std::vector<std::thread> instead of
BatchTranslators.
2021-02-13 10:55:07 +00:00
Andre Natal
47db65972c Update README.md 2021-02-12 17:18:57 -08:00
Andre Natal
a97bf7b504 Update README.md 2021-02-12 17:00:12 -08:00
Andre Natal
3a53a68444 Update README.md
Updating `--recursive` in the wasm instructions too
2021-02-12 15:41:17 -08:00
Kenneth Heafield
f43dc33b03 Merge pull request #20 from browsermt/andrenatal-patch-1
Update README.md
2021-02-12 23:26:54 +00:00
Andre Natal
9108d9f0b3 Update README.md
Add `--recursive` to `git clone` instructions
2021-02-12 15:25:40 -08:00
Jerin Philip
38e8b3cd6d Updates: marian-dev, ssplit for marian-decoder-new
Updates marian-dev and ssplit submodules to point to the upstream
commits which implement the following:

 - marian-dev: encodeWithByteRanges(...) to get source token byte-ranges
 - ssplit: Has a trivial sentencesplitter functionality implemented, and
   now is faster to benchmark with marian-decoder.

This enables a marian-decoder replacement written through ssplit in this
source to be benchmarked constantly with existing marian-decoder.

Nits: Removes logging introduced for multiple workers, and respective
log statements.
2021-02-12 14:23:24 +00:00
Abhishek Aggarwal
3b7673bf15 Updated marian-dev submodule
- This fixes the issue of sentencepiece not being able to checkout
   properly
2021-02-12 14:38:16 +01:00
Abhishek Aggarwal
28dcf55b41 Improved cmake to use wasm compilation flags across project 2021-02-12 11:36:33 +01:00
Abhishek Aggarwal
ff95e37f89 Improved cmake option PACKAGE_DIR 2021-02-11 23:52:37 +01:00
Abhishek Aggarwal
e12647076c Updated README with wasm build and use instructions 2021-02-11 23:36:42 +01:00
Abhishek Aggarwal
de501e8f96 Added JS binding files and cmake infrastructure to build them
- Added "wasm" folder
 - Contains README file as well
2021-02-11 23:36:29 +01:00
Abhishek Aggarwal
74b06d863e Add wasm folder to compile JS bindings 2021-02-11 19:09:30 +01:00
Abhishek Aggarwal
7b80003a5f Added code to generate proper JS bindings of translator
- COMPILE_WASM cmake option sets WASM_BINDINGS compile
   definition that enables code for generating proper JS
   bindings
2021-02-11 16:59:07 +01:00
Abhishek Aggarwal
23a9527824 Source code changes to compile the project without threads
- Set COMPILE_THREAD_VARIANT cmake option to ON to compile
   multithreaded variant of the project
2021-02-11 16:57:14 +01:00
Abhishek Aggarwal
a06530e92b Fixed a bug in TranslationModel class
- Using bergamot-translator as a library fails at run time because
   necessary parser options are not set
2021-02-11 16:14:03 +01:00
Abhishek Aggarwal
79c445ae3a cmake compile option changes for wasm builds
- Make WASM builds successful with marian decoder
  - Setting COMPILE_WASM to ON requires importing some
    compile definitions from marian
2021-02-11 15:57:26 +01:00
Abhishek Aggarwal
9b896507e3 cmake compile option changes
- Make native builds successful with marian decoder
 - COMPILE_DECODER_ONLY flag requires importing some
   compile definitions from marian
2021-02-11 15:53:38 +01:00
Abhishek Aggarwal
838547e4d5 Set cmake options of marian properly for this project 2021-02-11 15:42:18 +01:00
Abhishek Aggarwal
b73d4f4cc2 Set cmake option to compile marian library only
- Set COMPILE_LIBRARY_ONLY to ON for marian library
2021-02-11 15:37:38 +01:00
Abhishek Aggarwal
9747d9ba83 Add cmake option to compile project on WASM
- Set cmake option COMPILE_WASM to ON to compile the project
   on WASM
2021-02-11 15:34:27 +01:00
Abhishek Aggarwal
a2d3269344 Updated ssplit submodule 2021-02-10 11:27:16 +01:00
Abhishek Aggarwal
584700ce91 Changed translate() API from non-blocking to blocking
- Can be changed back to non-blocking once blocking API
   becomes integrable via WASM port in browser
2021-02-10 11:15:16 +01:00
Abhishek Aggarwal
5683168a8d Updated ssplit submodule to a different repository
- Added abhi-agg/ssplit-cpp
 - Added its wasm branch in bergamot-translator
 - Native builds of bergamot-translator are successful
   -- Sentence splitting is NOT WORKING
   -- Only translation is working
2021-02-10 10:33:01 +01:00
Abhishek Aggarwal
47b4bae268 Changed encodePreservingSource -> encodeWithByteRanges
- This change happened because marian submodule changed
   this name

 - Native builds are working fine
   -- bergamot-translator-app output is consistent
2021-02-09 15:37:29 +01:00
Abhishek Aggarwal
9a54d2116c Updated marian-dev submodule
- Switch to "wasm" branch of browsermt/marian-dev
2021-02-08 13:46:59 +01:00
Jerin Philip
2929077324 Reordering git submodule update before includes 2021-02-02 14:41:26 +00:00
Jerin Philip
548c8880ff CMake updates submodules 2021-02-02 14:39:19 +00:00
Jerin Philip
e76a602dc7 Removing config file printing 2021-01-28 21:44:05 +00:00
Jerin Philip
9a17f365c6 Fix for garbled output through cli.
Requirement for string_view is the original source string be transferred
all the way from input to service to back to TranslationResult. This
constraint was violated in several places by means of existence of a
copy-constructor. The issue is fixed by deleting copy and assignment
constructors in marian::bergamot::TranslationResult and
UnifiedAPI::TranslationResult, which demonstrated a few occurrences of
the same. Replaced the same with move semantics.  In addition, future is
set and get using move semantics at the moment.  Default
move-constructor didn't seem to be working, so they're made explicit for
TranslationResults.

This commit additionally packs a few deletions and improvements made to
improve structure (textops.cpp, batcher.cpp) along the process of
inspecting and fixing the garbled outputs. They are chosen to be kept,
in the interest of time, against a prettified atomic commit engineering.

Combinations of the following commits in jp/string-view-bug
[acfc92 78a588 12d91b 00a277 919e2f 9d3a46 b7e39b 18f67b bf667c]
2021-01-26 21:18:15 +00:00
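
The string_view constraint described in 9a17f365c6 can be illustrated with a small sketch. OwnedText and its members are hypothetical names, not the project's TranslationResult. It also swaps in a different technique from the one the commit describes: the source string is held behind a std::unique_ptr, so the compiler-generated move only transfers a pointer and the views stay valid. Moving a std::string directly may relocate short strings stored inline, which is one possible (unconfirmed) reason the commit needed explicit move constructors.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical type for illustration only.
class OwnedText {
 public:
  explicit OwnedText(std::string source)
      : source_(std::make_unique<const std::string>(std::move(source))) {}

  // A copy would duplicate the string while the copied views kept pointing
  // at the original buffer, so copying is disabled.
  OwnedText(const OwnedText&) = delete;
  OwnedText& operator=(const OwnedText&) = delete;

  // Moving transfers ownership of the heap-allocated string; the bytes do
  // not move, so existing views remain valid.
  OwnedText(OwnedText&&) noexcept = default;
  OwnedText& operator=(OwnedText&&) noexcept = default;

  // Record a view over bytes [begin, begin + length) of the source.
  void addSentence(std::size_t begin, std::size_t length) {
    sentences_.push_back(std::string_view(*source_).substr(begin, length));
  }

  const std::string& source() const { return *source_; }
  const std::vector<std::string_view>& sentences() const { return sentences_; }

 private:
  std::unique_ptr<const std::string> source_;
  std::vector<std::string_view> sentences_;
};
```

With this layout a result can be handed through a future or returned by value using std::move without the views going stale, while an accidental copy fails to compile instead of producing garbled output.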
Abhishek Aggarwal
0d16b1957f Improved main.cpp file
- Print original and translated text
 - Just add 2 vector entries for texts
2021-01-26 14:49:28 +01:00
Abhishek Aggarwal
b49f2c1af3 Cleanup TranslationModelConfiguration to std::string change in API
- Provide yaml formatted string as model configuration
 - Remove redundant files
2021-01-26 11:13:41 +01:00
Abhishek Aggarwal
026f1af887 Removed redundant lines from CMakeFile 2021-01-26 10:46:35 +01:00