Cross platform C++ library focusing on optimized machine translation on the consumer-grade device.
Go to file
Jerin Philip c0f311a8c0
Batteries included python package (#310)
Imports python bindings and associated sources incubated in
https://github.com/jerinphilip/lemonade to bergamot-translator. Adds
 a pybind11 dependency for python bindings.

Following the import, the python build is integrated into the existing 
CMake based build system here. There is a command-line application 
provided through python which provides the ability to fetch and prepare 
models from model-repositories (like browsermt/students or OPUS).

Wheels built for a few common operating systems are provided via GitHub
releases through automated actions configured to run at tagged semantic
versions and pushes to main.

The documentation for python is also integrated into our existing
documentation setup. Previous documentation GitHub action is now
configured to run behind python builds in Ubuntu 18.04 Python3.7,
in order to pick up the packaged as a wheel bergamot module and the
sphinx documentation using the python module.

Formatting checks of black, isort with profile black and a pytype type
checker is configured for the python component residing in this repository.
2022-01-26 20:33:43 +00:00
.circleci CI: Circle CI config script update (#287) 2021-12-21 23:58:13 +01:00
.github/workflows Batteries included python package (#310) 2022-01-26 20:33:43 +00:00
3rd_party Batteries included python package (#310) 2022-01-26 20:33:43 +00:00
app Streamline memory-bundle loads (#307) 2022-01-19 16:36:48 +00:00
bergamot-translator-tests@e49686b7ca First class pivot translation capability (#236) 2022-01-17 13:44:23 +00:00
bindings Batteries included python package (#310) 2022-01-26 20:33:43 +00:00
cmake CMake fixes: Generate project.h in binary dir, fix GetVersionFromFile for use as submodule. (#193) 2021-06-09 10:12:00 +01:00
doc Batteries included python package (#310) 2022-01-26 20:33:43 +00:00
examples Streamline memory-bundle loads (#307) 2022-01-19 16:36:48 +00:00
patches Batteries included python package (#310) 2022-01-26 20:33:43 +00:00
src Add API to trigger fast shutdown of AsyncService (#297) 2022-01-21 13:14:57 +00:00
vcpkg-override/ports/pcre2 Fixes windows workflow for PCRE2 (#260) 2021-11-05 20:48:28 +00:00
wasm Disabled importing optimized gemm module (#282) 2021-12-17 17:39:43 +01:00
.clang-format Adding clang-format and updating existing sources to adhere (#151) 2021-05-19 21:50:21 +01:00
.clang-format-ignore Adding clang-format and updating existing sources to adhere (#151) 2021-05-19 21:50:21 +01:00
.clang-tidy Add a clang-tidy run (#214) 2021-08-13 16:26:44 +01:00
.gitignore Treat most HTML elements as word-breaking (#286) 2022-01-16 10:26:40 +00:00
.gitmodules Batteries included python package (#310) 2022-01-26 20:33:43 +00:00
BERGAMOT_VERSION Corrected the version number 2021-05-18 12:17:56 +02:00
build-wasm.sh Import matrix-multiply from a separate wasm module (#232) 2021-10-27 11:54:39 +02:00
CMakeLists.txt Batteries included python package (#310) 2022-01-26 20:33:43 +00:00
Doxyfile.in QualityEstimation: Preliminary Implementation (#197) 2021-09-16 16:28:40 +01:00
LICENSE Initial commit 2020-10-19 13:49:38 +02:00
MANIFEST.in Batteries included python package (#310) 2022-01-26 20:33:43 +00:00
README.md Fix badge to point to this repo instead mozilla's (#261) 2021-11-15 08:14:21 +00:00
run-clang-format.py Adding clang-format and updating existing sources to adhere (#151) 2021-05-19 21:50:21 +01:00
setup.py Batteries included python package (#310) 2022-01-26 20:33:43 +00:00

Bergamot Translator

CircleCI badge

Bergamot translator provides a unified API for (Marian NMT framework based) neural machine translation functionality in accordance with the Bergamot project that focuses on improving client-side machine translation in a web browser.

Build Instructions

Build Natively

Create a folder where you want to build all the artifacts (build-native in this case) and compile

mkdir build-native
cd build-native
cmake ../
make -j2

Build WASM

Prerequisite

Building on wasm requires Emscripten toolchain. It can be downloaded and installed using following instructions:

  • Get the latest sdk: git clone https://github.com/emscripten-core/emsdk.git
  • Enter the cloned directory: cd emsdk
  • Install the lastest sdk tools: ./emsdk install 2.0.9
  • Activate the latest sdk tools: ./emsdk activate 2.0.9
  • Activate path variables: source ./emsdk_env.sh

Compile

To build a version that translates with higher speeds on Firefox Nightly browser, follow these instructions:

  1. Create a folder where you want to build all the artifacts (build-wasm in this case) and compile

    mkdir build-wasm
    cd build-wasm
    emcmake cmake -DCOMPILE_WASM=on ../
    emmake make -j2
    

    The wasm artifacts (.js and .wasm files) will be available in the build directory ("build-wasm" in this case).

  2. Enable SIMD Wormhole via Wasm instantiation API in generated artifacts

    bash ../wasm/patch-artifacts-enable-wormhole.sh
    
  3. Patch generated artifacts to import GEMM library from a separate wasm module

    bash ../wasm/patch-artifacts-import-gemm-module.sh
    

To build a version that runs on all browsers (including Firefox Nightly) but translates slowly, follow these instructions:

  1. Create a folder where you want to build all the artifacts (build-wasm in this case) and compile

    mkdir build-wasm
    cd build-wasm
    emcmake cmake -DCOMPILE_WASM=on -DWORMHOLE=off ../
    emmake make -j2
    
  2. Patch generated artifacts to import GEMM library from a separate wasm module

    bash ../wasm/patch-artifacts-import-gemm-module.sh
    

Recompiling

As long as you don't update any submodule, just follow Compile steps.
If you update a submodule, execute following command in repository root folder before executing Compile steps.

git submodule update --init --recursive

How to use

Using Native version

The builds generate library that can be integrated to any project. All the public header files are specified in src folder.
A short example of how to use the APIs is provided in app/main.cpp file.

Using WASM version

Please follow the README inside the wasm folder of this repository that demonstrates how to use the translator in JavaScript.