* Change USE_WASM_COMPATIBLE_SOURCE to OFF by default
The default was USE_WASM_COMPATIBLE_SOURCE=ON COMPILE_WASM=OFF, which is a
testing configuration, not a sensible default for native or wasm.
* Always USE_WASM_COMPATIBLE_SOURCE with COMPILE_WASM
* Set CMP0077 to fix variable handling
* first attempt to enable passing vocabs as byte arrays
* pass vocabs bytes as AlignedMemory
* add vocabIndices to avoid double loading
* small fix on parameter names and documentation
* fix windows build plus tiny update on documentation
* update marian-dev submodule
* move validate model bytearray in BatchTranslator
* small refactors on validateBinaryModel()
* switch vocab memories to std::vector<marian::Ptr<AlignedMemory>>
* update marian-dev submodule
* replace marian::Ptr to std::shared_ptr for vocab memories
* add note for vocab memories
* Update marian-dev to the newest mac version
* Attempt windows workflow
* force workflow rerun
* Separate id
* Attempt 3 at github action
* Marian dev submodule now compiles with apple clang
* Updated ssplit version to something more recent
* Attempt to fix compile on wasm
* Do not compile subproject tests
* Fix emscripten compilation on Mac
* 99% on the way to windows compile
* Try with a different generator
* Build release not debug
* Revert CMakeLists.txt hacks
* Fix sse2 compilation failure
* MSVC settings for WIN32
* Add nodefaultlib LIBCMT
* Do not compile ssplit.cpp as it contains sys/mman.h
* Revert ab56b9aa4f
* Update paths
* Set the build type to release if not set previously
* Attempt to build release with the windows workflow
* Attempt 5 at Visual Studio release build
* Attempt 6 at getting release build on MSVC generator
* The windows build is debug at the moment...
* fix ssplit for ubuntu 16.04
* Fix compilation with clang
* Compile on ubuntu16.04
* Explain what is going on
* Updated ssplit and workflow
* Updated marian-dev submodule
- cmake changes required after the submodule update
* Added workflows for building custom marian on mac and ubuntu
* Renamed cmake option
- Renamed USE_WASM_COMPATIBLE_SOURCES to USE_WASM_COMPATIBLE_SOURCE
- Use proper compile definitions
* Switch to wasm branch for this example
* Load marian model from a byte array
* Sanitise executable names
* Change marian branch
* Update marian branch that loads binary models
* Example of loading model as a byte array
* Add the byte array loading files
* Die on misaligned memory
* Remove the unused argument
* Allow loading without a ptr parameter so that we don't break emc workflow
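
The byte-array loading items above ("Load marian model from a byte array", "Die on misaligned memory") can be illustrated with a minimal sketch of an alignment check on a caller-supplied buffer. This is not the code from these commits: `kAssumedAlignment`, `isAligned` and `requireAligned` are hypothetical names, and the 64-byte requirement is an assumption chosen for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Assumed alignment requirement (hypothetical; the real value is set by the loader).
constexpr std::size_t kAssumedAlignment = 64;

// Returns true when ptr is a multiple of alignment.
// alignment must be a non-zero power of two.
inline bool isAligned(const void* ptr, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

// Hypothetical helper: refuses to continue when the buffer is null or misaligned,
// and otherwise returns the pointer unchanged so it can be handed to a loader.
inline const void* requireAligned(const void* ptr,
                                  std::size_t alignment = kAssumedAlignment) {
  if (ptr == nullptr) {
    throw std::invalid_argument("model buffer is null");
  }
  if (!isAligned(ptr, alignment)) {
    throw std::runtime_error("model buffer is misaligned");
  }
  return ptr;
}
```

Failing early here keeps a misaligned buffer from reaching code that assumes aligned loads; the sketch throws instead of aborting so that the check is easy to exercise in a test.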
Updates marian-dev and ssplit submodules to point to the upstream
commits which implement the following:
- marian-dev: encodeWithByteRanges(...) to get source token byte-ranges
- ssplit: has trivial sentence-splitter functionality implemented, and
is now faster to benchmark with marian-decoder.
This enables a marian-decoder replacement written through ssplit in this
source to be benchmarked constantly with existing marian-decoder.
Nits: Removes logging introduced for multiple workers, and respective
log statements.
- Added abhi-agg/ssplit-cpp
- Added its wasm branch in bergamot-translator
- Native builds of bergamot-translator are successful
-- Sentence splitting is NOT WORKING
-- Only translation is working
Enables Mac and Ubuntu CPU only builds through GitHub CI. CI scripts are
copied from marian-dev with necessary changes.
3rd-party/marian-dev is modified to meet C++17 requirements, with
changes to half_float.
CMakeLists have been modified with the necessary includes to add
browsermt/mts@nuke files to the bergamot-translator library. In
addition, this adds the ssplit dependency and its corresponding includes.
Intel MKL fails on compilation, unable to find libraries. To solve this,
3rd_party/CMakeLists.txt is modified with @ug's fixes to propagate
variables (EXT_LIBS, etc) at a library level.
Modifications to SentencePiece are necessary to provide token-level
string_views. This commit changes marian to an alternate branch which
has the feature incorporated.
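
As a rough illustration of what token-level string_views over the source text look like, the sketch below turns byte ranges into `std::string_view`s without copying. It assumes half-open `[begin, end)` byte offsets; `ByteRange` and `viewsFromRanges` are hypothetical names and do not reflect the actual marian or SentencePiece interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical type: half-open byte range [begin, end) into the source text.
struct ByteRange {
  std::size_t begin;
  std::size_t end;
};

// Builds one non-owning view per range. The views point into `source`,
// so `source` must outlive the returned vector.
inline std::vector<std::string_view> viewsFromRanges(
    std::string_view source, const std::vector<ByteRange>& ranges) {
  std::vector<std::string_view> tokens;
  tokens.reserve(ranges.size());
  for (const ByteRange& r : ranges) {
    // Each range must be well-formed and lie inside the source text.
    assert(r.begin <= r.end && r.end <= source.size());
    tokens.push_back(source.substr(r.begin, r.end - r.begin));
  }
  return tokens;
}
```

Because the views do not own their data, the underlying source buffer has to stay alive for as long as the tokens are in use.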