
Triton-AML

Triton-AML is a custom Triton backend that runs Marian in the AzureML Inference Environment. It is one implementation of a Triton backend shared library.

This backend is compiled against a static library built from a specific version of Marian.

Layout:

  • marian_backend: Triton Marian backend source code
  • src: Modified Marian source code and CMakeLists.txt
  • Dockerfile: Used for compiling the backend with the static library of Marian
  • build.sh: A simple shell script that runs the Dockerfile build to produce libtriton_marian.so

Usage

Run ./build.sh to get the Triton Marian backend shared library.

All users can place libtriton_marian.so in either of the following locations:

  • <model_repository>/<model_name>/<version_directory>/libtriton_marian.so
  • <model_repository>/<model_name>/libtriton_marian.so

AzureML Inference team members can instead place it in the following location in the aml-triton base image:

  • <backend_directory>/marian/libtriton_marian.so

Here <backend_directory> defaults to /opt/tritonserver/backends.
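The model-repository placement described above can be sketched as a small shell helper. This is a minimal illustration, not part of the repository: the function name install_backend, the model name my_model, and the version 1 are hypothetical, and the demo copies an empty placeholder file in a temporary directory rather than a real build output.

```shell
#!/usr/bin/env bash
set -euo pipefail

# install_backend LIB MODEL_REPO MODEL_NAME [VERSION]  (hypothetical helper)
# Copies the backend library to <model_repository>/<model_name>/ or,
# when VERSION is given, to <model_repository>/<model_name>/<version>/.
install_backend() {
  local lib="$1" repo="$2" model="$3" version="${4:-}"
  local dest="$repo/$model"
  if [ -n "$version" ]; then
    dest="$dest/$version"
  fi
  mkdir -p "$dest"
  cp "$lib" "$dest/libtriton_marian.so"
  # Print the final location so callers can capture it.
  printf '%s\n' "$dest/libtriton_marian.so"
}

# Demo in a throwaway directory; the empty file stands in for the real library.
demo_dir="$(mktemp -d)"
: > "$demo_dir/libtriton_marian.so"
installed="$(install_backend "$demo_dir/libtriton_marian.so" "$demo_dir/models" my_model 1)"
echo "installed: $installed"
```

Omitting the fourth argument places the library directly under the model directory, which corresponds to the second location in the list.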

Make changes

To compile against another version of Marian, replace the branch name in the RUN git checkout youki/quantize-embedding line of the Dockerfile. Then copy the new version's CMakeLists.txt over the old one, add src/cmarian.cpp to it, and make any further changes needed so that it still builds a static library of Marian.
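The first step, pointing the Dockerfile at a different Marian revision, might be scripted roughly as follows. This is a sketch under assumptions: the helper set_marian_ref and the ref my-branch are hypothetical, the demo runs on a two-line stand-in file rather than the real Dockerfile, and the CMakeLists.txt changes still have to be made by hand.

```shell
#!/usr/bin/env bash
set -euo pipefail

# set_marian_ref DOCKERFILE REF  (hypothetical helper)
# Rewrites the "RUN git checkout ..." line so the image builds Marian at REF.
set_marian_ref() {
  local dockerfile="$1" ref="$2" tmp
  tmp="$(mktemp)"
  # '|' is the sed delimiter so refs containing '/' need no escaping;
  # refs containing '|' or '&' are not handled by this sketch.
  sed "s|^RUN git checkout .*|RUN git checkout ${ref}|" "$dockerfile" > "$tmp"
  # Write back in place so the original file's permissions are kept.
  cat "$tmp" > "$dockerfile"
  rm -f "$tmp"
}

# Demo on a stand-in file, not the real Dockerfile.
sample="$(mktemp)"
printf '%s\n' '# stand-in for the real Dockerfile' \
  'RUN git checkout youki/quantize-embedding' > "$sample"
set_marian_ref "$sample" my-branch
cat "$sample"
```

Going through a temporary file avoids sed -i, whose syntax differs between GNU and BSD sed.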

Limitation

For now, the backend is only used with the nlxseq2seq model. Some values are hard-coded in the ModelState::SetMarianConfigPath function, so that code must be changed before other models can be run with Marian.