# Bergamot Translator
Bergamot Translator provides a unified API for neural machine translation based on the [Marian NMT](https://marian-nmt.github.io/) framework, in accordance with the [Bergamot](https://browser.mt/) project, which focuses on improving client-side machine translation in a web browser.
## Build Instructions
### Build Natively
1. Clone the repository using these instructions:
```bash
git clone https://github.com/browsermt/bergamot-translator
cd bergamot-translator
```
2. Compile

Create a folder for the build artifacts (`build-native` in this case) and compile inside it:
```bash
mkdir build-native
cd build-native
cmake ../
make -j
```
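Note that `make -j` without a number places no limit on the number of parallel jobs, which can exhaust memory on smaller machines. A minimal sketch of bounding the job count to the number of available cores follows; the `build_jobs` helper and the `JOBS` variable are illustrative names and not part of this repository.

```bash
# Hypothetical helper: report the number of online CPU cores,
# falling back to 1 if neither nproc nor getconf is available.
build_jobs() {
  n="$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
  echo "${n:-1}"
}

JOBS="$(build_jobs)"

# Print the bounded make invocation to use inside the build folder.
echo "make -j${JOBS}"
```

The printed command can be used in place of `make -j` in the build folder.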
### Build WASM
#### Compiling for the first time
1. Download and install Emscripten using the following instructions:
* Get the latest sdk: `git clone https://github.com/emscripten-core/emsdk.git`
* Enter the cloned directory: `cd emsdk`
* Install the latest sdk tools: `./emsdk install latest`
* Activate the latest sdk tools: `./emsdk activate latest`
* Activate path variables: `source ./emsdk_env.sh`
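
Since the path variables only apply to the shell in which `emsdk_env.sh` was sourced, it can be worth confirming that the Emscripten tools used in the later steps are visible before compiling. The following is a minimal sketch of such a check; `missing_tools` is a hypothetical helper and not part of the emsdk or of this repository.

```bash
# Hypothetical helper: print each given command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# emcc, emcmake and emmake are the Emscripten commands used by the build steps.
missing="$(missing_tools emcc emcmake emmake)"
if [ -z "$missing" ]; then
  echo "Emscripten tools found on PATH"
else
  echo "Not found on PATH (try 'source ./emsdk_env.sh' again):" $missing
fi
```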
2. Clone the repository using these instructions:
```bash
git clone https://github.com/browsermt/bergamot-translator
cd bergamot-translator
```
3. Download files (only required if you want to perform inference using the build artifacts)

The vocabulary files downloaded here are packaged into the wasm binary, which is required only if you want to perform inference.
The compilation commands will preload these files in Emscripten's virtual file system.

If you want to package Bergamot project specific files, please follow these instructions:
```bash
# Shallow-clone the main branch of the models repository
git clone --depth 1 --branch main --single-branch https://github.com/mozilla-applied-ml/bergamot-models
# Copy the production models into a local models folder
mkdir models
cp -rf bergamot-models/prod/* models
# Decompress the downloaded files
gunzip models/*/*
# Delete the model* and lex* files so that only the remaining files are packaged
find models \( -type f -name "model*" -or -type f -name "lex*" \) -delete
```
4. Compile
1. Create a folder in which to build all the artifacts (`build-wasm` in this case)
```bash
mkdir build-wasm
cd build-wasm
```
2. Compile the artifacts
* If you want to package files into the wasm binary, execute the following commands (replace `FILES_TO_PACKAGE` with the
directory containing all the files to be packaged):
```bash
emcmake cmake -DCOMPILE_WASM=on -DPACKAGE_DIR=FILES_TO_PACKAGE ../
emmake make -j
```
For example, to package the Bergamot project specific files (downloaded in step 3 above),
replace `FILES_TO_PACKAGE` with `../models`.
* If you don't want to package any files into the wasm binary, execute the following commands:
```bash
emcmake cmake -DCOMPILE_WASM=on ../
emmake make -j
```
The wasm artifacts (`.js` and `.wasm` files) will be available in the build directory (`build-wasm` in this case).

3. Enable SIMD Wormhole via the Wasm instantiation API in the generated artifacts
```bash
bash ../wasm/patch-artifacts-enable-wormhole.sh
```
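Because the patch script operates on the generated artifacts, it can help to confirm that the build actually produced both a `.js` and a `.wasm` file. The following is a minimal sketch of such a check, assuming it is run from inside the build folder; `has_wasm_artifacts` and `BUILD_DIR` are hypothetical names and not part of this repository.

```bash
# Hypothetical helper: succeed only if the given directory contains
# at least one .js file and at least one .wasm file.
has_wasm_artifacts() {
  dir="$1"
  ls "$dir"/*.js >/dev/null 2>&1 && ls "$dir"/*.wasm >/dev/null 2>&1
}

# Defaults to the current directory (assumed to be the build folder).
BUILD_DIR="${BUILD_DIR:-.}"
if has_wasm_artifacts "$BUILD_DIR"; then
  echo "Found .js and .wasm artifacts in $BUILD_DIR"
else
  echo "No complete set of .js and .wasm artifacts in $BUILD_DIR"
fi
```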
#### Recompiling
As long as you don't update any submodule, just follow the steps in `4.ii` and `4.iii` to recompile.\
If you update a submodule, execute the following command before following the steps in `4.ii` and `4.iii`:
```bash
git submodule update --init --recursive
```
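If it is unclear whether a submodule has changed, the output of `git submodule status` can be inspected: git prefixes a line with `-` when the submodule is not initialized and with `+` when the checked-out commit differs from the one recorded in the containing repository. The sketch below wraps that check, under the assumption that this is what "updating a submodule" means here; `submodules_stale` is a hypothetical helper and not part of this repository.

```bash
# Hypothetical helper: reads `git submodule status` output on stdin and
# succeeds if any line starts with "-" (uninitialized) or "+" (different commit).
submodules_stale() {
  grep -q '^[-+]'
}

if git submodule status --recursive 2>/dev/null | submodules_stale; then
  echo "Submodules out of date: run 'git submodule update --init --recursive'"
else
  echo "No out-of-date submodules detected"
fi
```

Run this from the root of the repository; outside a git checkout the status output is empty and nothing is detected.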
## How to use
### Using Native version
The builds generate a library that can be integrated into any project. All the public header files are located in the `src` folder.\
A short example of how to use the APIs is provided in the `app/main.cpp` file.
### Using WASM version
Please follow the `README` inside the `wasm` folder of this repository, which demonstrates how to use the translator in JavaScript.