435 Commits

Author SHA1 Message Date
Jerin Philip
330840338c
Including WASM documentation in sphinx build toc (#176) 2021-06-01 12:39:28 +01:00
Jerin Philip
ceaf21a532
Deploy generated documentation only if browsermt (#179) 2021-06-01 11:00:53 +01:00
Jerin Philip
5d3ec9c0a9
Single executable (#175)
* Collapsing executables

* Adding new test executable

* Deleting old executable sources

* Updating brt to operate with modes

* cli-framework -> cli

* Updating workflows to check for bergamot instead of bergamot-translator-app

* Adding documentation

* Making fn pure virtual

* Shuffling apps into app namespace, alongside class documentation

* Include app folder in documentation

* BRT update service-cli -> native

* parser.h: service-cli -> native

* Updates to marian-integration.md

* Cleanup: Remove templates, interface proper

* change 4 to 2 cores for build instructions

* service-cli -> native

* Commenting the string constructor explanation

* Not doing halfway interface / inheritance

* Nick hates state, let's try this one

* Revert "Nick hates state, let's try this one"

This reverts commit e56db9f474.

* class -> struct before trying std::function stuff

* oop -> functional?

* Hints on what is happening

* app::ftable -> app::REGISTRY

* We have if-else and functions now.

And we won't have test apps.

* Doc linking to usage examples in brt

* Remove unordered_map

* Documentation updates

* Fix warning
2021-05-31 14:44:59 +01:00
Jerin Philip
eb579ed26f
Updating marian dev RelwithDebInfo -> Release (#178)
* Updating marian dev RelwithDebInfo -> Release

* Updating submodule to point to master
2021-05-27 10:51:53 +01:00
Qianqian Zhu
8bec1b7b6b
Fix failures when loading text shortlist (#154) 2021-05-25 12:05:16 +01:00
Jerin Philip
576afae6b3
Adding documentation action (#168)
Adds a GitHub workflow that builds documentation from sources with doxygen and sphinx on push to the main branch or on push of any semantic version tag. The built documentation is deployed to https://github.com/browsermt/docs@gh-pages, which is rendered at https://browser.mt/docs/<suffix>, where <suffix> is 'main' or a tag vM.m.p corresponding to a semantic version.

On pull requests, artifacts are uploaded for reviewers to inspect if need be.
2021-05-25 11:10:56 +01:00
Jerin Philip
22a1b9113e
Remove O(N^2) reallocation (#171) 2021-05-22 00:04:49 +01:00
Jerin Philip
f1253720a8
Bumping BRT for hotfixes (#169)
* Bumping BRT for hotfixes

* updating brt to point to main
2021-05-20 12:49:44 +01:00
Nikolay Bogoychev
4f8050be64 Update tests 2021-05-20 11:03:10 +01:00
Motin
4b177d57e4
GitHub action to push browsermt/main branch to mozilla/bergamot-translator every hour (#160)
* Create push-browsermt-main-to-mozilla-main.yml

* Update .github/workflows/push-browsermt-main-to-mozilla-main.yml

Co-authored-by: Graeme <graemenail@gmail.com>

* Tweaks

* Fix yaml syntax

* Parametrized the workflow based on @jerinphilip's example

Co-authored-by: Graeme <graemenail@gmail.com>
2021-05-20 09:33:58 +03:00
Motin
0f8f8e026a
Pin emsdk version to the same one used in Circle CI (#165) 2021-05-20 08:59:30 +03:00
Jerin Philip
9dcf6ab665
Adding clang-format and updating existing sources to adhere (#151)
* Adding a first version of clang-format

* Adding run-clang-format.py

* Adding coding styles to workflow

* Fix indentation on coding-styles workflow

* run-clang-format.'py'

* -style -> --style in python

* Updating ColumnLimit: 120

* Format update with clang-format

* Revert "Format update with clang-format"

This reverts commit 5340b19eae.

* Apply update after sync

* Removing a few empty lines

* Removing one more empty line

* Removing empty in workflow file

* Updating README with coding style instructions

* clang-format-* provided in this repository doc update

Co-authored-by: Nikolay Bogoychev <nheart@gmail.com>
2021-05-19 21:50:21 +01:00
Qianqian Zhu
7ad8d0a04d
initialise MemoryBundle members (#167) 2021-05-19 20:11:20 +01:00
Kenneth Heafield
89bd47342b
Use binary lexical shortlist in documentation (#152)
* Use binary lexical shortlist in documentation

* MKL/AppleAccelerate note

Co-authored-by: Nikolay Bogoychev <nheart@gmail.com>
Co-authored-by: Jerin Philip <jphilip@ed.ac.uk>
2021-05-19 10:44:32 +01:00
Kenneth Heafield
b25f223fe4
Rewriting batching for threadsafety (#155)
This does make the batcher a critical section across job submission and
cleaving though.  If that becomes a problem, we should go back to
incoming and outgoing queues with a batcher thread.

Also removes blocking mode from native compiles.

Note that translateMultiple no longer guarantees great batching.  Guess
we could lease the mutex from ThreadsafeBatcher and create a session.

There is the risk that one sentence comes in at a time and each thread
grabs one sentence at a time instead of better batching.  Not sure what
to do about that other than some sort of Nagle algorithm.

Due to non-deterministic batching, even with one thread, the regression
tests will go haywire.
2021-05-18 16:11:14 +01:00
Jerin Philip
269edc7ce5
Collapsing TranslationRequest -> ResponseOptions (#139) 2021-05-18 14:25:25 +01:00
abhi-agg
8b621de358
Merge pull request #159 from mozilla/main
Merge histories across bergamot-translator forks
2021-05-18 13:53:26 +02:00
abhi-agg
813e81c10c
Merge branch 'main' into main 2021-05-18 13:53:12 +02:00
Nikolay Bogoychev
10131c731a
Marian submodule with unified loading (#157) 2021-05-18 12:45:22 +01:00
Motin
1c40cc8289
Merge branch 'main' into main 2021-05-18 13:44:08 +03:00
Abhishek Aggarwal
7a973df74d Corrected the version number
- To be in sync with versioning in mozilla/bergamot-translator repo
2021-05-18 12:17:56 +02:00
Abhishek Aggarwal
b73714e222 Merge remote-tracking branch 'upstream/main' into main
- Sync with upstream (https://github.com/browsermt/bergamot-translator)
2021-05-18 08:48:41 +02:00
Abhishek Aggarwal
067076fbc1 Bumped version to 0.3.0
- This brings the version info in sync with the various releases
   of extension
2021-05-17 19:34:58 +02:00
Abhishek Aggarwal
0ad583cc34 Generate project version file for native builds
- The header file exposes a function that provides version information
   for native binaries
2021-05-17 19:34:58 +02:00
Abhishek Aggarwal
2e5880d3d4 Modified wasm cmake file to include version information in built artifacts 2021-05-17 19:34:58 +02:00
Abhishek Aggarwal
c44868e1fd Import GetVersionFromFile cmake file in root level CMakeLists.txt 2021-05-17 19:34:58 +02:00
Abhishek Aggarwal
c1ef6f2bcb Added cmake file to compute version information
- Reads BERGAMOT_VERSION file for generating various strings
   for versioning
2021-05-17 19:34:58 +02:00
Kenneth Heafield
3e70587672
Rewrite annotation class to remove corner cases (#135) 2021-05-17 16:42:18 +01:00
Qianqian Zhu
5bd1fc6b83
Refactor vocabs in Service (#143)
Co-authored-by: Nikolay Bogoychev <nheart@gmail.com>
2021-05-17 13:09:03 +01:00
Jerin Philip
77424a3df1
Enabling ccache on github builds for Ubuntu (#95)
* CI Changes to add tiny regression tests

* Adding an inspect cache step

* Removing ccache, pursue in another

* Incorporating Nick's changes through submodule merge

* Submodule now points to master

* Restoring ccache enabled workflow file

* Restoring ccache enabled CMakeLists

* cache -> ccache typo fix

* Moving CCACHE setup to GitHub runner file

* Find also uses CCACHE dir

* Updating CMakeLists not to override env

* Cache compiler binary's contents

* Changing a few names to trigger new build; Testing cache looks fun

* USE_CCACHE=on, -L for inspection

* Adding a ccache_cmd, but will only use in next commit

* Using ccache_cmd

* Removing "

* Adding compiler hash script

* Bunch of absolute paths

* GITHUB_WORKSPACE typo

* Nah, I'll keep -L and trigger another build

* Trying something with compiler hash on cache key backup as well

* builtin, bash it seems

* Empty commit #1

* Move ccache stats to after compile

* Reshuffling ccache vars

* No comments

* Updates to Github output set syntax

* Empty Commit 1

* Empty Commit 2

* Empty commit 3

* /bin/bash -> bash; ccache_cmd for consistency

* Adding ccache -s before and after build

* Adding comments to compiler-hash script

* Let's build cached and non-cached variants together for comparison

* Fixing quotes, /bin/bash -> bash

* Minor var/env adjustment

* Adding ccache -z before the job

* Reverting CMakeLists.txt without CCACHE

* Switching to CMAKE_LANG_COMPILER_LAUNCHER instead of CMakeLists.txt rule

* 5G -> 1G cache size

* 1G -> 2G; Hyperparameter tuning
2021-05-17 11:42:47 +01:00
Qianqian Zhu
6c7e6156ab
Bundle AlignedMemory inputs with MemoryBundle (#147) 2021-05-13 13:18:08 +01:00
Abhishek Aggarwal
6c063c607e Updated CMakeLists.txt to remove packaging steps for wasm compilation
- Removed PACKAGE_DIR cmake option
 - Removed Workerfs, FORCE_FILESYSTEM=1 in wasm builds
   -- File system support is not needed any more (since model,
     shortlist and vocabs are being passed as bytes now)
2021-05-12 16:23:09 +02:00
Abhishek Aggarwal
0189500160 Updated README to remove packaging steps for wasm compilation
- We don't need to package model, shortlist or vocab files into wasm
   binary at build time
2021-05-12 16:23:09 +02:00
Abhishek Aggarwal
e0b9bad058 Updated wasm README to update for passing vocabs as bytes
- Updated Using JS APIs section to pass vocabs as bytes
2021-05-12 16:23:09 +02:00
Abhishek Aggarwal
8a6c7b44a3 Avoid packaging vocab files into wasm binary in CI builds
- We don't need to package vocab files into wasm binary any more
   as a sync with upstream enabled passing vocabs as bytes
2021-05-12 09:55:49 +02:00
Abhishek Aggarwal
451ab047ff Merge remote-tracking branch 'upstream/main' into main 2021-05-12 08:53:25 +02:00
Abhishek Aggarwal
d7cb859ab7 Refactoring TranslationModelBindings class
- typedef AlignedMemory for code readability

 - Added documentation for one of the binding functions
2021-05-12 07:32:42 +02:00
Abhishek Aggarwal
5025285e5c Updated wasm test page to pass vocabulary files as bytes 2021-05-12 07:32:42 +02:00
Abhishek Aggarwal
9f78985e45 JS bindings for vocabularies as bytes 2021-05-12 07:32:42 +02:00
Abhishek Aggarwal
331216e017 Enable Debugging information in wasm module builds
- Added "-g2" flag during linking step
2021-05-11 18:50:55 +02:00
Abhishek Aggarwal
ce576c27f1 Export "addOnPreMain" function from wasm module
- This is required in the extension while using wasm module in a worker environment
2021-05-11 18:50:55 +02:00
Kenneth Heafield
ce01de939d
Change USE_WASM_COMPATIBLE_SOURCE=OFF by default on native, force on for WASM (#138)
* Change WASM_COMPATIBLE_SOURCE=OFF by default

The default was WASM_COMPATIBLE_SOURCE=ON COMPILE_WASM=OFF which is a
testing configuration, not a sensible default for native or wasm.

* Always USE_WASM_COMPATIBLE_SOURCE with COMPILE_WASM

* Set CMP0077 to fix variable handling
2021-05-10 12:28:37 +02:00
Jerin Philip
354e7ac6be
Remove unused types TokenRanges, SentenceTokenRanges, UPtr (#137) 2021-05-09 13:42:57 +01:00
Nikolay Bogoychev
87adb5d60a Target master of ssplit-cpp 2021-05-07 18:41:08 +01:00
Jerin Philip
bef12765ad
Minor rename: sentence_ranges -> annotation (#134) 2021-05-07 18:38:27 +01:00
Nikolay Bogoychev
21c1cae472
Update ssplit submodule, removing absl (#132)
* Update ssplit submodule, removing absl

* Fix ssplit variables

* Update ssplit branch

* Fix emscripten compilation

* Update tests
2021-05-07 17:58:58 +01:00
Qianqian Zhu
5b02008a97
Enable vocabs pass as byte arrays (#122)
* first attempt to enable vocabs pass as byte arrays

* pass vocabs bytes as AlignedMemory

* add vocabIndices to avoid double loading

* small fix on parameter names and documentation

* fix windows build plus tiny update on documentation

* update marian-dev submodule

* move validate model bytearray in BatchTranslator

* small refactors on validateBinaryModel()

* switch vocab memories to std::vector<marian::Ptr<AlignedMemory>>

* update marian-dev submodule

* replace marian::Ptr to std::shared_ptr for vocab memories

* add note for vocab memories
2021-05-07 14:54:48 +01:00
Jerin Philip
b86c76b004
Faithful to source-structure translation (#115)
* First draft of faithful translation

* Comments explaining pre and post

* Comments on response_builder

* Updating bergamot-translator-tests with new outputs

* Cosmetic changes in response target text construction

* Replacing &(x[0]) -> x.data() to avoid illegal indices

* Removing nullptr given both branches init pointer with legal values

* pre, post -> gap(i) addressing review comments

Functions which were pre and post before are subsumed by gap(i), and the
algorithm in ResponseBuilder adjusted to fix.

`x = nullptr` is back, should be harmless.

* Updating brt with paragraph outputs

* Bumping brt with updated outputs, buffer text at begin as well

* Bumping BRT with sync after bytearray collapse merge

* Pointing BRT to main after merge

Co-authored-by: Nikolay Bogoychev <nheart@gmail.com>
2021-05-06 16:19:27 +01:00
Jerin Philip
bc2e4eee5c
Making bytearray a commandline switch (#127)
* Adding bytearray option

* collapse intermediate for bytearray apps

* Removing service-cli-bytearray

* Removing the bergamot bytearray app

* Bumping updates to brt collapsing apps

* Reasonable defaults and hard check when cmd enabled

* Update documentation for flags

* Bump brt with MKL check and skip

* Bumping BRT with MKL_FOUND instead of USE_MKL

* Bumping BRT with no mkl enforce

* Bumping BRT with ssse3 output

* Let's try disabling OpenBLAS

* Trying to disable apple accelerate

* Using WASM compatible BLAS can enable intgemm

* Adding a CMake -L to see what exactly is the diff

* Revert "Let's try disabling OpenBLAS"

This reverts commit 9a6b9bc53b.

* Revert "Using WASM compatible BLAS can enable intgemm"

This reverts commit 936a592e18.

* Restricting mac tests through tags and on GitHub CI

* Using only check-bytearray

* Bumping BRT with change of default behaviour
2021-05-06 00:26:03 +01:00
Kenneth Heafield
c61b2bdd10
Fix busy loop in windows (#131)
* Fix busy loop in windows

* Nick wants the while loop gone

* Fix continue leftover

Co-authored-by: Nikolay Bogoychev <nheart@gmail.com>
2021-05-06 00:21:50 +01:00