Cross-platform C++ library focused on optimized machine translation on consumer-grade devices.

Bergamot Translator

Bergamot Translator provides a unified API for neural machine translation built on the Marian NMT framework. It follows the goals of the Bergamot project, which focuses on improving client-side machine translation in a web browser.

Build Instructions

Build Natively

git clone --recursive https://github.com/browsermt/bergamot-translator
cd bergamot-translator
mkdir build
cd build
cmake ../
make -j
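
GNU make places no limit on parallel jobs when -j is given without a number, which can exhaust memory on a large C++ build. A minimal sketch of picking a bounded job count instead; the helper name job_count is hypothetical and not part of this repository:

```shell
#!/bin/sh
# Hypothetical helper: print a job count for make.
# Uses nproc where available (Linux), sysctl on macOS/BSD, and falls back to 2.
job_count() {
  if command -v nproc >/dev/null 2>&1; then
    nproc
  elif command -v sysctl >/dev/null 2>&1 && sysctl -n hw.ncpu >/dev/null 2>&1; then
    sysctl -n hw.ncpu
  else
    echo 2
  fi
}

# Usage, from inside the build directory:
#   make -j"$(job_count)"
```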

Build WASM

To compile to WASM, first download and install Emscripten using the following instructions:

  1. Get the latest sdk: git clone https://github.com/emscripten-core/emsdk.git
  2. Enter the cloned directory: cd emsdk
  3. Install the latest sdk tools: ./emsdk install latest
  4. Activate the latest sdk tools: ./emsdk activate latest
  5. Activate path variables: source ./emsdk_env.sh
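
Before moving on, it may help to confirm that the Emscripten tools used in the next steps are actually on the PATH of the current shell. A small sketch; the function name missing_tools is an assumption made for illustration:

```shell
#!/bin/sh
# Hypothetical helper: report every named tool that is not on PATH.
# Returns 0 when all tools are found, 1 otherwise.
missing_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage, in the same shell that sourced emsdk_env.sh:
#   missing_tools emcc emcmake emmake && emcc --version
```

The environment set by emsdk_env.sh applies only to the shell that sourced it, so a new terminal needs step 5 again.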

After the successful installation of Emscripten, perform these steps:

git clone --recursive https://github.com/browsermt/bergamot-translator
cd bergamot-translator
git checkout wasm-integration
git submodule update --recursive
mkdir build-wasm
cd build-wasm
emcmake cmake -DCOMPILE_WASM=on ../
emmake make -j

The build should generate the artefacts (.js and .wasm files) in the wasm folder inside the build directory ("build-wasm" in this case).
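
A quick way to check that the build produced output is to count the .js and .wasm files in that folder. This is only a sketch: count_artifacts is a hypothetical helper, and the exact artefact names are not assumed here.

```shell
#!/bin/sh
# Hypothetical helper: count .js and .wasm files directly inside a directory.
count_artifacts() {
  find "$1" -maxdepth 1 -type f \( -name '*.js' -o -name '*.wasm' \) | wc -l | tr -d ' '
}

# Usage, from the repository root after the build:
#   count_artifacts build-wasm/wasm
```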

Download the models from https://github.com/mozilla-applied-ml/bergamot-models, and place the ones you want to package in a folder called models.

The build can also package files into the wasm binary (i.e. preload them in Emscripten's virtual file system) using the cmake option PACKAGE_DIR. The command below packages all the files in the directory passed as PACKAGE_DIR (in this case, your models) into the wasm binary.

emcmake cmake -DCOMPILE_WASM=on -DPACKAGE_DIR=/repo/models ../

Files packaged this way are preloaded in the root of the virtual file system.
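
To preview where packaged files would end up, the sketch below prints each file under a directory as a path relative to a file system root. It assumes the sub-directory layout of the packaged folder is preserved, which this README does not state explicitly; virtual_paths is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print the path each file under DIR would have if the
# contents of DIR were placed at the root of a file system (DIR/a/b -> /a/b).
virtual_paths() {
  (cd "$1" && find . -type f | sed 's|^\./|/|' | sort)
}

# Usage:
#   virtual_paths /repo/models
```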

To package the set of files expected by the test page:

mkdir models
git clone https://github.com/motin/bergamot-models
cp -r bergamot-models/* models
gunzip models/*/*
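
The gunzip glob above only reaches files two levels below models, so it can be worth checking that nothing compressed is left behind before packaging. A minimal sketch with a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: list any files under DIR that still end in .gz.
# Empty output means everything was decompressed.
leftover_gz() {
  find "$1" -type f -name '*.gz'
}

# Usage:
#   leftover_gz models
```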

After Editing Files:

emmake make -j

After Adding/Removing Files:

emcmake cmake -DCOMPILE_WASM=on ../
emmake make -j

Using Native version

The build generates a library that can be integrated into any project. All the public header files are in the src folder. A short example of how to use the API is provided in app/main.cpp.

Using WASM version

Please follow the README inside the wasm folder of this repository, which demonstrates how to use the translator in JavaScript.