Cross-platform C++ library focusing on optimized machine translation on consumer-grade devices.

Bergamot Translator

Bergamot Translator provides a unified API for neural machine translation built on the Marian NMT framework, in accordance with the Bergamot project, which focuses on improving client-side machine translation in a web browser.

Build Instructions

Build Natively

git clone --recursive https://github.com/browsermt/bergamot-translator
cd bergamot-translator
mkdir build
cd build
cmake ../
make -j
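
As written, make -j with no number places no limit on the number of parallel jobs, which can exhaust memory on a large C++ build. A minimal sketch of one way to bound it, assuming nproc or getconf is available on the build machine; the helper name jobs_for_make is hypothetical and not part of this repository:

```shell
# Hypothetical helper (not part of bergamot-translator): print a job count
# for make based on the number of online CPUs.
jobs_for_make() {
  if command -v nproc >/dev/null 2>&1; then
    nproc                                   # GNU coreutils
  elif getconf _NPROCESSORS_ONLN >/dev/null 2>&1; then
    getconf _NPROCESSORS_ONLN               # fallback, e.g. on macOS
  else
    echo 2                                  # conservative default (assumption)
  fi
}
```

It could then be used in place of the last step as make -j"$(jobs_for_make)".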

Build WASM

To compile to WASM, first download and install Emscripten using the following instructions:

  1. Get the latest sdk: git clone https://github.com/emscripten-core/emsdk.git
  2. Enter the cloned directory: cd emsdk
  3. Install the latest sdk tools: ./emsdk install latest
  4. Activate the latest sdk tools: ./emsdk activate latest
  5. Activate path variables: source ./emsdk_env.sh

After the successful installation of Emscripten, perform these steps:

git clone --recursive https://github.com/browsermt/bergamot-translator
cd bergamot-translator
git checkout wasm-integration
git submodule update --recursive
mkdir build-wasm
cd build-wasm
emcmake cmake -DCOMPILE_WASM=on ../
emmake make -j

This should generate the artefacts (.js and .wasm files) in the wasm folder inside the build directory ("build-wasm" in this case).
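
A quick way to confirm the build produced its output is to list the .js and .wasm files in that folder. The sketch below assumes the layout described above (a wasm folder inside the build directory); the function name list_artefacts is hypothetical:

```shell
# Hypothetical helper: list the .js and .wasm files directly inside
# <build-dir>/wasm. The build directory defaults to "build-wasm".
list_artefacts() {
  local build_dir="${1:-build-wasm}"
  find "$build_dir/wasm" -maxdepth 1 -type f \
    \( -name '*.js' -o -name '*.wasm' \) | sort
}
```

Run from the repository root, list_artefacts build-wasm prints one path per artefact; empty output suggests the build did not complete.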

Download the models from https://github.com/mozilla-applied-ml/bergamot-models, and place the ones you want to package in a folder called models.

The build also allows packaging files into the wasm binary (i.e. preloading them in Emscripten's virtual file system) using the cmake option PACKAGE_DIR. The compile command below packages all the files in the directory given by PACKAGE_DIR (in this case, your models) into the wasm binary.

emcmake cmake -DCOMPILE_WASM=on -DPACKAGE_DIR=/repo/models ../

Files packaged this way are preloaded in the root of the virtual file system.

To package the set of files expected by the test page:

git clone https://github.com/browsermt/students
cd students/esen/
./download-models.sh
cp esen.student.tiny11/lex.s2t ../../models/lex.esen.s2t
cp esen.student.tiny11/model.npz ../../models/model.esen.npz
cp esen.student.tiny11/vocab.esen.spm ../../models/vocab.esen.spm
cd -
cd students/enes/
./download-models.sh
cp enes.student.tiny11/lex.s2t ../../models/lex.enes.s2t
cp enes.student.tiny11/model.npz ../../models/model.enes.npz
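
The copy steps above assume the models folder already exists and leave five files in it. A minimal sketch for checking that nothing was missed before packaging; the function name check_models is hypothetical, and the file list is taken only from the cp commands above:

```shell
# Hypothetical helper (not part of the repository): report which of the
# files copied in the steps above are missing from a models directory.
# Returns 0 when all five are present, 1 otherwise.
check_models() {
  local dir="$1" missing=0 f
  for f in lex.esen.s2t model.esen.npz vocab.esen.spm \
           lex.enes.s2t model.enes.npz; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $f"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, check_models models && echo "models directory is complete" could be run before invoking emcmake with PACKAGE_DIR.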

After Editing Files:

emmake make -j

After Adding/Removing Files:

emcmake cmake -DCOMPILE_WASM=on ../
emmake make -j

Using the Native version

The build generates a library that can be integrated into any project. All the public header files are in the src folder. A short example of how to use the API is provided in app/main.cpp.

Using the WASM version

Please follow the README inside the wasm folder of this repository, which demonstrates how to use the translator in JavaScript.