mirror of https://github.com/marian-nmt/marian.git synced 2024-11-03 20:13:47 +03:00

Add Triton Marian backend running in AzureML Inference Environment (#749 )

2020-11-04 14:29:36 -08:00

Triton-AML

Triton-AML is a custom Triton backend that runs Marian in the AzureML Inference Environment. It is one implementation of a Triton Backend Shared Library.

The backend is compiled against a static library of Marian built from one specific version of Marian.

Layout:

marian_backend: Triton Marian backend source code
src: Modified Marian source code and CMakeLists.txt
Dockerfile: Used for compiling the backend with the static library of Marian
build.sh: A simple shell script that runs the Dockerfile build to produce libtriton_marian.so

Usage

Run ./build.sh to get the Triton Marian backend shared library.

All users can place libtriton_marian.so in either of the following locations:

<model_repository>/<model_name>/<version_directory>/libtriton_marian.so
<model_repository>/<model_name>/libtriton_marian.so

AzureML Inference team members can instead place it in the following location in the aml-triton base image:

<backend_directory>/marian/libtriton_marian.so

Here <backend_directory> defaults to /opt/tritonserver/backends.
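The per-version placement described above can be sketched as a small shell helper. This is only an illustration: the function name install_backend and its argument order are hypothetical and are not part of this repository, and the sketch assumes libtriton_marian.so has already been produced by ./build.sh.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): copy the built backend
# into <model_repository>/<model_name>/<version_directory>/.
# Usage: install_backend <path/to/libtriton_marian.so> <model_repository> <model_name> <version_directory>
install_backend() {
    lib="$1"
    repo="$2"
    model="$3"
    version="$4"
    dest="$repo/$model/$version"
    # Create the version directory if it does not exist yet, then copy the library.
    mkdir -p "$dest" && cp "$lib" "$dest/libtriton_marian.so"
}
```

For example, `install_backend ./libtriton_marian.so /models nlxseq2seq 1` would copy the library to /models/nlxseq2seq/1/libtriton_marian.so, where /models and the version number 1 are placeholder values.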

Make changes

To compile against another version of Marian, replace the RUN git checkout youki/quantize-embedding line in the Dockerfile with a checkout of the version you want. Then replace the old CMakeLists.txt with the one from that version, add src/cmarian.cpp to it, and make any further changes needed so that it still builds a static library of Marian.
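The first step, swapping the checked-out Marian revision, can be sketched as follows. The function name set_marian_ref is hypothetical, and the sketch assumes the Dockerfile contains a line that begins with RUN git checkout, as quoted above. The CMakeLists.txt changes still have to be made by hand.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): point the Dockerfile's
# "RUN git checkout ..." line at a different Marian branch, tag, or commit.
# Usage: set_marian_ref <path/to/Dockerfile> <git-ref>
# Assumption: the ref contains no '|' or '&' characters, which are special to sed here.
set_marian_ref() {
    dockerfile="$1"
    ref="$2"
    # Write to a temporary file first; in-place editing with sed -i is not portable.
    sed "s|^RUN git checkout .*|RUN git checkout $ref|" "$dockerfile" > "$dockerfile.tmp" &&
        mv "$dockerfile.tmp" "$dockerfile"
}
```

For example, `set_marian_ref ./Dockerfile my-branch` rewrites the checkout line to RUN git checkout my-branch, where my-branch is a placeholder for the revision you want.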

Limitation

For now, the backend is only used for the nlxseq2seq model. Some values are hard-coded in the ModelState::SetMarianConfigPath function, so that function must be changed before other models can be run with Marian.
