Cross-platform C++ library focused on optimized machine translation on consumer-grade devices.

Bergamot Translator

Bergamot Translator provides a unified API for neural machine translation built on the Marian NMT framework. It is developed as part of the Bergamot project, which focuses on improving client-side machine translation in a web browser.

Build Instructions

Build Natively

Create a folder for the build artifacts (build-native in this case) and compile:

mkdir build-native
cd build-native
cmake ../
make -j2
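
The `-j2` above is a conservative default. As a minimal sketch, the job count could instead be derived from the machine. `detect_jobs` is a hypothetical helper name, not part of this repository; it assumes `nproc` or `getconf` is available and falls back to 2 otherwise.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): choose a parallel job count for make.
detect_jobs() {
    # nproc exists on most Linux systems; getconf _NPROCESSORS_ONLN covers macOS and the BSDs.
    # If neither works, fall back to the value used in the instructions above.
    n=$(nproc 2>/dev/null) || n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=2
    echo "$n"
}

# Usage from the repository root (commented out so the snippet has no side effects):
# mkdir -p build-native && cd build-native
# cmake ../ && make -j"$(detect_jobs)"
```

A higher job count shortens the build but raises peak memory use, so a fixed small value may still be the safer choice on constrained machines.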

Build WASM

Prerequisite

Building for WASM requires the Emscripten toolchain. It can be downloaded and installed with the following steps:

  • Get the latest sdk: git clone https://github.com/emscripten-core/emsdk.git
  • Enter the cloned directory: cd emsdk
  • Install the sdk: ./emsdk install 3.1.8
  • Activate the sdk: ./emsdk activate 3.1.8
  • Activate path variables: source ./emsdk_env.sh
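
A build started in a shell where `emsdk_env.sh` was never sourced fails because the Emscripten tools are not on the PATH. The sketch below shows one way to check for that up front. `require_tool` is a hypothetical helper, not part of this repository or of emsdk; `emcmake` and `emmake` are the Emscripten wrappers used in the Compile steps.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): fail early if a tool is missing from PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "error: '$1' not found on PATH; did you run 'source ./emsdk_env.sh'?" >&2
    return 1
}

# Usage before a WASM build (commented out; it needs an activated emsdk):
# require_tool emcmake && require_tool emmake || exit 1
```

The path variables set by `source ./emsdk_env.sh` only apply to the current shell, so the check is worth repeating in every new terminal session.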

Compile

To build a version that translates faster on the Firefox Nightly browser, follow these instructions:

  1. Create a folder for the build artifacts (build-wasm in this case) and compile:

    mkdir build-wasm
    cd build-wasm
    emcmake cmake -DCOMPILE_WASM=on ../
    emmake make -j2
    

    The wasm artifacts (.js and .wasm files) will be available in the build directory ("build-wasm" in this case).

  2. Enable SIMD Wormhole via the Wasm instantiation API in the generated artifacts:

    bash ../wasm/patch-artifacts-enable-wormhole.sh
    
  3. Patch the generated artifacts to import the GEMM library from a separate wasm module:

    bash ../wasm/patch-artifacts-import-gemm-module.sh
    

To build a version that runs on all browsers (including Firefox Nightly) but translates more slowly, follow these instructions:

  1. Create a folder for the build artifacts (build-wasm in this case) and compile:

    mkdir build-wasm
    cd build-wasm
    emcmake cmake -DCOMPILE_WASM=on -DWORMHOLE=off ../
    emmake make -j2
    
  2. Patch the generated artifacts to import the GEMM library from a separate wasm module:

    bash ../wasm/patch-artifacts-import-gemm-module.sh
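
The two variants above differ only in the `-DWORMHOLE=off` flag and in whether the wormhole patch script is run. As a sketch, the flag choice could be wrapped in a small function. `wasm_cmake_flags` and the variant names `fast` and `portable` are hypothetical, introduced here for illustration; the flags themselves are the ones shown in the commands above.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): print the cmake flags for a WASM build variant.
# "fast" is the wormhole build from the first set of steps; "portable" adds -DWORMHOLE=off.
wasm_cmake_flags() {
    case "$1" in
        fast)     echo "-DCOMPILE_WASM=on" ;;
        portable) echo "-DCOMPILE_WASM=on -DWORMHOLE=off" ;;
        *)        echo "usage: wasm_cmake_flags fast|portable" >&2; return 1 ;;
    esac
}

# Usage inside build-wasm (commented out; it needs an activated emsdk).
# The unquoted substitution is intentional so the flags split into separate arguments:
# emcmake cmake $(wasm_cmake_flags portable) ../ && emmake make -j2
```

Only the `fast` variant should be followed by `patch-artifacts-enable-wormhole.sh`; both variants still need `patch-artifacts-import-gemm-module.sh`.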
    

Recompiling

As long as you have not updated any submodule, just repeat the Compile steps.
If you have updated a submodule, run the following command in the repository root folder before the Compile steps:

git submodule update --init --recursive
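
Whether the update is needed can be read from `git submodule status`, which prefixes a line with `-` for an uninitialized submodule, `+` when the checked-out commit differs from the one recorded in the index, and `U` for merge conflicts. The sketch below wraps that check; `submodules_in_sync` is a hypothetical helper, not part of this repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): read `git submodule status` output on stdin
# and succeed only if no line starts with '-', '+' or 'U'.
submodules_in_sync() {
    ! grep -q '^[-+U]'
}

# Usage from the repository root (commented out; it needs a git checkout):
# git submodule status --recursive | submodules_in_sync \
#     || git submodule update --init --recursive
```

Running the update unconditionally is also harmless; the check only saves time when nothing has changed.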

How to use

Using Native version

The build generates a library that can be integrated into any project. All the public header files are in the src folder.
A short example of how to use the API is provided in app/bergamot.cpp.

Using WASM version

Please follow the README inside the wasm folder of this repository, which demonstrates how to use the translator in JavaScript.