Merge remote-tracking branch 'origin/wasm-integration' into jp/absorb-batch-translator

2024-09-11 05:35:33 +03:00 · 2021-02-22 16:33:46 +00:00 · 2021-02-22 16:33:46 +00:00 · fd9e79a817
commit fd9e79a817
parent fbff7389d1 51f702ea6c
7 changed files with 88 additions and 189 deletions
--- a/README.md
+++ b/README.md
@@ -16,64 +16,78 @@ make -j
 ```

 ### Build WASM
+#### Compiling for the first time

-To compile WASM, first download and Install Emscripten using following instructions:
+1. Download and install Emscripten using the following instructions:
+    * Get the latest sdk: `git clone https://github.com/emscripten-core/emsdk.git`
+    * Enter the cloned directory: `cd emsdk`
+    * Install the latest sdk tools: `./emsdk install latest`
+    * Activate the latest sdk tools: `./emsdk activate latest`
+    * Activate path variables: `source ./emsdk_env.sh`

-1. Get the latest sdk: `git clone https://github.com/emscripten-core/emsdk.git`
-2. Enter the cloned directory: `cd emsdk`
-3. Install the lastest sdk tools: `./emsdk install latest`
-4. Activate the latest sdk tools: `./emsdk activate latest`
-5. Activate path variables: `source ./emsdk_env.sh`
+2. Clone the repository and checkout the appropriate branch using these instructions:
+    ```bash
+    git clone https://github.com/browsermt/bergamot-translator
+    cd bergamot-translator
+    git checkout -b wasm-integration origin/wasm-integration
+    git submodule update --init --recursive
+    ```

-After the successful installation of Emscripten, perform these steps:
+3. Download files (only required if you want to package files in wasm binary)

+    This step is only required if you want to package files (e.g. models, vocabularies, etc.)
+    into the wasm binary. If you don't, skip this step.
+
+    The build preloads the files in Emscripten’s virtual file system.
+
+    If you want to package bergamot project specific models, please follow these instructions:
+    ```bash
+    mkdir models
+    git clone https://github.com/mozilla-applied-ml/bergamot-models
+    cp -rf bergamot-models/* models
+    gunzip models/*/*
+    ```
+
+4. Compile
+    1. Create a folder where you want to build all the artefacts (`build-wasm` in this case)
+        ```bash
+        mkdir build-wasm
+        cd build-wasm
+        ```
+
+    2. Compile the artefacts
+        * If you want to package files into the wasm binary, execute the following commands (replace `FILES_TO_PACKAGE` with the path of the
+        directory containing the files to be packaged into the wasm binary):
+
+            ```bash
+            emcmake cmake -DCOMPILE_WASM=on -DPACKAGE_DIR=FILES_TO_PACKAGE ../
+            emmake make -j
+            ```
+            For example, to package the bergamot project specific models (downloaded in step 3 above),
+            replace `FILES_TO_PACKAGE` with `../models`.
+
+        * If you don't want to package any files into the wasm binary, execute the following commands:
+            ```bash
+            emcmake cmake -DCOMPILE_WASM=on ../
+            emmake make -j
+            ```
+
+    The artefacts (.js and .wasm files) will be available in the `wasm` folder of the build directory ("build-wasm" in this case).
+
+#### Recompiling
+As long as you don't update any submodule, just follow the steps in `4.ii` to recompile.\
+If you update a submodule, execute the following command before the steps in `4.ii`:
 ```bash
-git clone --recursive https://github.com/browsermt/bergamot-translator
-cd bergamot-translator
-git checkout wasm-integration
-git submodule update --recursive
-mkdir build-wasm
-cd build-wasm
-emcmake cmake -DCOMPILE_WASM=on ../
-emmake make -j
+git submodule update --init --recursive
 ```

-It should generate the artefacts (.js and .wasm files) in `wasm` folder inside build directory ("build-wasm" in this case).

-Download the models from `https://github.com/mozilla-applied-ml/bergamot-models`, and place all the desired ones to package in a folder called `models`.
-
-The build also allows packaging files into wasm binary (i.e. preloading in Emscripten’s virtual file system) using cmake
-option `PACKAGE_DIR`. The compile command below packages all the files in PATH directory (in these case, your models) into wasm binary.
-```bash
-emcmake cmake -DCOMPILE_WASM=on -DPACKAGE_DIR=/repo/models ../
-```
-Files packaged this way are preloaded in the root of the virtual file system.
-
-To package the set of files expected by the test page:
-
-```bash
-mkdir models
-git clone https://github.com/motin/bergamot-models
-cp -r bergamot-models/* models
-gunzip models/*/*
-```
-
-After Editing Files:
-
-```bash
-emmake make -j
-```
-
-After Adding/Removing Files:
-
-```bash
-emcmake cmake -DCOMPILE_WASM=on ../
-emmake make -j
-```
+## How to use

 ### Using Native version

-The builds generate library that can be integrated to any project. All the public header files are specified in `src` folder. A short example of how to use the APIs is provided in `app/main.cpp` file
+The builds generate a library that can be integrated into any project. All the public header files are in the `src` folder.\
+A short example of how to use the APIs is provided in the `app/main.cpp` file.

 ### Using WASM version

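The two configure variants in step `4.ii` of the new README differ only in the `-DPACKAGE_DIR` option. As a minimal sketch (the helper name `wasm_configure_cmd` is hypothetical and not part of the repository), the choice can be expressed as a small shell function that prints the command to run from inside the build directory:

```bash
# Hypothetical helper, not part of the repository: print the configure
# command from step 4.ii of the README. With a directory argument the
# -DPACKAGE_DIR option is added; without one it is left out.
wasm_configure_cmd() {
  wasm_package_dir="${1:-}"
  if [ -n "$wasm_package_dir" ]; then
    echo "emcmake cmake -DCOMPILE_WASM=on -DPACKAGE_DIR=$wasm_package_dir ../"
  else
    echo "emcmake cmake -DCOMPILE_WASM=on ../"
  fi
}
```

For example, `wasm_configure_cmd ../models` prints the packaging variant shown in the README; either variant is followed by `emmake make -j`.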
--- a/docker/Makefile
+++ b/docker/Makefile
@@ -1,55 +0,0 @@
-# -*- mode: makefile-gmake; indent-tabs-mode: true; tab-width: 4 -*-
-SHELL   		= bash
-PWD     		= $(shell pwd)
-WASM_IMAGE	    = local/bergamot-translator-build-wasm
-
-all: wasm-image compile-wasm
-
-# Build the Docker image for WASM builds
-wasm-image:
-	docker build -t local/bergamot-translator-build-wasm ./wasm/
-
-# Commands for compilation:
-cmake_cmd  = cmake
-
-wasm_cmake_cmd = ${cmake_cmd}
-wasm_cmake_cmd += -DCOMPILE_WASM=on
-wasm_cmake_cmd += -DProtobuf_INCLUDE_DIR=/usr/opt/protobuf-wasm-lib/dist/include
-wasm_cmake_cmd += -DProtobuf_LIBRARY=/usr/opt/protobuf-wasm-lib/dist/lib/libprotobuf.a
-wasm_cmake_cmd += -DPACKAGE_DIR=/repo/models
-
-make_cmd  = make
-#make_cmd += VERBOSE=1
-
-# ... and running things on Docker
-docker_mounts  = ${PWD}/..:/repo
-docker_mounts += ${HOME}/.ccache:/.ccache
-run_on_docker  = docker run --rm
-run_on_docker += $(addprefix -v, ${docker_mounts})
-run_on_docker += ${INTERACTIVE_DOCKER_SESSION}
-
-${HOME}/.ccache:
-	mkdir -p $@
-
-# Remove the bergamot-translator WASM build dir, forcing a clean compilation attempt
-clean-wasm: BUILD_DIR = /repo/build-wasm-docker
-clean-wasm: ${HOME}/.ccache
-	${run_on_docker} ${WASM_IMAGE} bash -c '(rm -rf ${BUILD_DIR} || true)'
-
-# Compile bergamot-translator to WASM
-compile-wasm: BUILD_DIR = /repo/build-wasm-docker
-compile-wasm: ${HOME}/.ccache
-	${run_on_docker} ${WASM_IMAGE} bash -c 'mkdir -p ${BUILD_DIR} && \
-cd ${BUILD_DIR} && \
-(emcmake ${wasm_cmake_cmd} .. && \
-(emmake ${make_cmd}) || \
-rm CMakeCache.txt)'
-
-# Start interactive shells for development / debugging purposes
-native-shell: INTERACTIVE_DOCKER_SESSION = -it
-native-shell:
-	${run_on_docker} ${NATIVE_IMAGE} bash
-
-wasm-shell: INTERACTIVE_DOCKER_SESSION = -it
-wasm-shell:
-	${run_on_docker} ${WASM_IMAGE} bash
--- a/docker/README.md
+++ b/docker/README.md
@@ -1,27 +0,0 @@
-## WASM
-
-Prepare docker image for WASM compilation:
-
-```bash
-make wasm-image
-```
-
-Compile to wasm:
-
-```bash
-make compile-wasm
-```
-
-## Debugging
-
-Remove the marian-decoder build dir, forcing the next compilation attempt to start from scratch:
-
-```bash
-make clean-wasm
-```
-
-Enter a docker container shell for manually running commands:
-
-```bash
-make wasm-shell
-```
--- a/docker/wasm/Dockerfile
+++ b/docker/wasm/Dockerfile
@@ -1,36 +0,0 @@
-FROM emscripten/emsdk:2.0.9
-
-# Install specific version of CMake
-WORKDIR /usr
-RUN wget https://github.com/Kitware/CMake/releases/download/v3.17.2/cmake-3.17.2-Linux-x86_64.tar.gz -qO-\
-    | tar xzf - --strip-components 1
-
-# Install Python and Java (needed for Closure Compiler minification)
-RUN apt-get update \
-    && apt-get install -y \
-    python3 \
-    default-jre
-
-# Deps to compile protobuf from source + the protoc binary which we need natively
-RUN apt-get update -y  && apt-get --no-install-recommends -y install \
-    protobuf-compiler \
-    autoconf \
-    autotools-dev \
-    automake \
-    autogen \
-    libtool && ln -s /usr/bin/libtoolize /usr/bin/libtool \
-    && mkdir -p /usr/opt \
-    && cd /usr/opt \
-    && git clone https://github.com/menduz/protobuf-wasm-lib
-
-RUN cd /usr/opt/protobuf-wasm-lib \
-    && /bin/bash -c "BRANCH=v3.6.1 ./prepare.sh"
-RUN cd /usr/opt/protobuf-wasm-lib/protobuf \
-    && bash -x ../build.sh
-RUN cp /usr/bin/protoc /usr/opt/protobuf-wasm-lib/dist/bin/protoc
-
-RUN apt-get --no-install-recommends -y install \
-    libprotobuf-dev
-
-# Necessary for benchmarking
-RUN pip3 install sacrebleu
--- a/wasm/CMakeLists.txt
+++ b/wasm/CMakeLists.txt
@@ -16,7 +16,8 @@ target_compile_options(bergamot-translator-worker PRIVATE ${WASM_COMPILE_FLAGS})

 set(LINKER_FLAGS "--bind -s ASSERTIONS=0 -s DISABLE_EXCEPTION_CATCHING=1 -s FORCE_FILESYSTEM=1 -s ALLOW_MEMORY_GROWTH=1 -s NO_DYNAMIC_EXECUTION=1")
 if (NOT PACKAGE_DIR STREQUAL "")
-  set(LINKER_FLAGS "${LINKER_FLAGS} --preload-file ${PACKAGE_DIR}@/")
+  get_filename_component(REALPATH_PACKAGE_DIR ${PACKAGE_DIR} REALPATH BASE_DIR ${CMAKE_BINARY_DIR})
+  set(LINKER_FLAGS "${LINKER_FLAGS} --preload-file ${REALPATH_PACKAGE_DIR}@/")
 endif()

 set_target_properties(bergamot-translator-worker PROPERTIES
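The `get_filename_component(... REALPATH BASE_DIR ${CMAKE_BINARY_DIR})` call in the hunk above is what lets the README pass a relative path such as `../models`: the path is interpreted against the build directory and turned into an absolute one before it reaches `--preload-file`. A rough shell analogue of that resolution, assuming a hypothetical helper named `resolve_package_dir` (unlike the CMake call, this sketch requires the directory to exist):

```bash
# Hypothetical helper illustrating the path resolution; it is not the CMake
# implementation. A relative package dir is resolved against the base dir,
# an absolute one is used as-is, and symlinks are resolved by `pwd -P`.
resolve_package_dir() {
  (
    unset CDPATH                      # keep `cd` from searching other roots
    cd "$1" && cd "$2" && pwd -P      # $1 = base (build) dir, $2 = package dir
  )
}
```

Called as `resolve_package_dir /path/to/build-wasm ../models`, it prints the absolute path of the sibling `models` directory, which is the value the `-DPACKAGE_DIR=../models` example in the README would resolve to.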
--- a/wasm/README.md
+++ b/wasm/README.md
@@ -1,13 +1,14 @@
 ## Using Bergamot Translator in JavaScript
 The example file `bergamot.html` in the folder `test_page` demonstrates how to use the bergamot translator in JavaScript via a `<script>` tag.
-This example assumes that files were packaged in wasm binary.

-A brief summary is here though:
+Please note that everything below assumes that the [bergamot project specific model files](https://github.com/mozilla-applied-ml/bergamot-models) were packaged into the wasm binary (using the compile instructions given in the top-level README).
+
+### Using JS APIs

 ```js
 // The model configuration as YAML formatted string. For available configuration options, please check: https://marian-nmt.github.io/docs/cmd/marian-decoder/
 // This example captures the most relevant options: model file, vocabulary files and shortlist file
-const modelConfig = "{\"models\":[\"/model.npz\"],\"vocabs\":[\"/vocab.esen.spm\",\"/vocab.esen.spm\"],\"shortlist\":[\"/lex.s2t\"],\"beam-size\":1}";
+const modelConfig = "{\"models\":[\"/esen/model.esen.npz\"],\"vocabs\":[\"/esen/vocab.esen.spm\",\"/esen/vocab.esen.spm\"],\"shortlist\":[\"/esen/lex.esen.s2t\"],\"beam-size\":1}";

 // Instantiate the TranslationModel
 const model = new Module.TranslationModel(modelConfig);
@@ -33,29 +34,30 @@ request.delete();
 input.delete();
 ```

-You can also see everything in action by following the next steps:
+### Demo (see everything in action)
+
 * Start the test webserver (ensure you have the latest nodejs installed)
-```bash
-cd test_page
-bash start_server.sh
-```
+    ```bash
+    cd test_page
+    bash start_server.sh
+    ```
 * Open any of the browsers below
    * Firefox Nightly +87: make sure the following prefs are on (about:config)
-    ```
-    dom.postMessage.sharedArrayBuffer.bypassCOOP_COEP.insecure.enabled = true
-    javascript.options.wasm_simd = true
-    javascript.options.wasm_simd_wormhole = true
-    ```
+        ```
+        dom.postMessage.sharedArrayBuffer.bypassCOOP_COEP.insecure.enabled = true
+        javascript.options.wasm_simd = true
+        javascript.options.wasm_simd_wormhole = true
+        ```

    * Chrome Canary +90: start with the following argument
-    ```
-    --js-flags="--experimental-wasm-simd"
-    ```
+        ```
+        --js-flags="--experimental-wasm-simd"
+        ```

 * Browse to the following page:
-```
-http://localhost:8000/bergamot.html
-```
+    ```
+    http://localhost:8000/bergamot.html
+    ```

 * Run some translations:
    * Choose a model and press `Load Model`
--- a/wasm/test_page/start_server.sh
+++ b/wasm/test_page/start_server.sh
@@ -1,8 +1,8 @@
 #!/bin/bash

-cp ../../build-wasm-docker/wasm/bergamot-translator-worker.data .
-cp ../../build-wasm-docker/wasm/bergamot-translator-worker.js .
-cp ../../build-wasm-docker/wasm/bergamot-translator-worker.wasm .
-cp ../../build-wasm-docker/wasm/bergamot-translator-worker.worker.js .
+cp ../../build-wasm/wasm/bergamot-translator-worker.data .
+cp ../../build-wasm/wasm/bergamot-translator-worker.js .
+cp ../../build-wasm/wasm/bergamot-translator-worker.wasm .
+cp ../../build-wasm/wasm/bergamot-translator-worker.worker.js .
 npm install
 node bergamot-httpserver.js
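After this change `start_server.sh` copies four artefacts from `../../build-wasm/wasm`, and the plain `cp` calls carry on even if one of them is absent. A minimal sketch of a stricter variant, assuming a hypothetical helper `copy_artefacts` that is not part of the repository, checks all four files first and copies only when they are all present:

```bash
# Hypothetical helper, not part of the repository: copy the four artefacts
# that start_server.sh expects from a build directory into a target
# directory. Nothing is copied unless all four files exist.
copy_artefacts() {
  artefact_src="$1"
  artefact_dest="$2"
  for artefact_ext in data js wasm worker.js; do
    if [ ! -f "$artefact_src/bergamot-translator-worker.$artefact_ext" ]; then
      echo "missing artefact: $artefact_src/bergamot-translator-worker.$artefact_ext" >&2
      return 1
    fi
  done
  for artefact_ext in data js wasm worker.js; do
    cp "$artefact_src/bergamot-translator-worker.$artefact_ext" "$artefact_dest/" || return 1
  done
}
```

From `wasm/test_page` this would be called as `copy_artefacts ../../build-wasm/wasm .`. The `.data` file is normally produced only when files are preloaded (that is, when `PACKAGE_DIR` was set at configure time), so a build without packaged files would presumably fail this check.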