* Fixes reductions into scalars for <= 32 input elements. Only affects reductions where 0 is not the identity
* Update CHANGELOG.md
* Adds space before "?"
* Adds a comment explaining the increased margin for reduction tests. Adds a comment on the axis argument to the reduce functions. Adds more tests for small reduction operators
A few updates to Azure Pipelines:
* Adding CPU-only and GPU-only builds on Ubuntu
* Compiling Marian statically in some of the Ubuntu builds
* Ubuntu build with minimum supported versions of CMake (3.5.1), gcc (5.5), CUDA (10.0 due to GCC 5.5), no MKL
* Compiling marian-server with Boost 1.72 on Windows builds
* Minor clean up
- Add installation targets (enabled by GENERATE_MARIAN_INSTALL_TARGETS; default: OFF to preserve CMake 3.5.1 compatibility)
- Add COMPILE_LIBRARY_ONLY option (default: OFF) to exclude in-source executables from the build
- Compiler warning flags are no longer exported as part of the public link interface; they are applied only privately when building Marian itself
- Always set CPUINFO_BUILD_TOOLS=OFF when building fbgemm, not just for MSVC builds
Related work items: #108034
This PR enables final post-processing of a full transformer stack for correct prenorm behavior.
See issues #715 and #699.
List of changes:
- Add final post-processing in the encoder and decoder if requested with `--transformer-postprocess-top`. Accepts combinations of `d`, `n`, `a`; using `a` adds a skip connection from the bottom of the stack.
- Add `--task transformer-base-prenorm` and `--task transformer-big-prenorm`, which correspond to `--task transformer-base --transformer-preprocess n --transformer-postprocess da --transformer-postprocess-top n`.
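The postprocess string is a small operation pipeline. As a hypothetical sketch (Marian itself implements this in C++; the helper names below are illustrative only), composing such a string could look like:

```python
# Illustrative sketch of composing a postprocess spec such as "da" or "n":
#   d = dropout, n = layer normalization, a = add skip connection (residual).

def layer_norm(x, eps=1e-6):
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / (var + eps) ** 0.5 for v in x]

def post_process(x, skip, ops, dropout=lambda v: v):
    # Apply the single-letter ops left to right.
    for op in ops:
        if op == "d":    # dropout (identity function here, for clarity)
            x = [dropout(v) for v in x]
        elif op == "a":  # add skip connection
            x = [xi + si for xi, si in zip(x, skip)]
        elif op == "n":  # layer normalization
            x = layer_norm(x)
    return x
```

Under this reading, `da` applies dropout and then adds the residual, while a final `n` from `--transformer-postprocess-top n` layer-normalizes the top of the stack, giving prenorm models a properly normalized output.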
A few improvements to Azure Pipelines:
- Disabling build on Ubuntu 20.04 due to [issues with FBGEMM and GCC 9+](https://github.com/marian-nmt/marian-dev/issues/709)
- Replacing Invoke-WebRequest with wget.exe
- Cleaning up environment variables
Adds the `--output-omit-bias` option, which allows training an output layer without a bias vector. This is expected to be useful for `--output-approx-knn` during decoding, as the LSH-based k-NN search then approximates exactly the correct top-K values for decoding; otherwise the bias adds a shift. In first experiments, omitting the output bias does not seem to cause any performance loss.
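The interaction with the bias can be illustrated with a tiny, hypothetical example (not Marian code): an inner-product k-NN search ranks vocabulary rows by `W[i] . x` alone, while the true decoder scores are `W[i] . x + b[i]`, so a non-uniform bias can reorder the top-K.

```python
# Hypothetical illustration: a non-uniform output bias can change the top-1
# row relative to a pure inner-product (LSH-style) nearest-neighbour search.

def top1(scores):
    return max(range(len(scores)), key=lambda i: scores[i])

W = [[1.0, 0.1],   # vocabulary row 0
     [0.9, 0.1]]   # vocabulary row 1
b = [0.0, 0.5]     # non-uniform bias
x = [1.0, 1.0]     # decoder state

dot = [sum(wi * xi for wi, xi in zip(row, x)) for row in W]  # what k-NN ranks
true_scores = [d + bi for d, bi in zip(dot, b)]              # what decoding needs

# top1(dot) == 0 but top1(true_scores) == 1: the rankings disagree.
```

With the bias omitted, the two rankings coincide by construction, which is why the bias-free output layer makes the LSH search exact for top-K selection.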
* Return exit code 15 (SIGTERM) after SIGTERM.
When marian receives signal SIGTERM and exits gracefully (save model & exit),
it should then exit with a non-zero exit code, to signal to any parent process
that it did not exit "naturally".
* Added explanatory comment about exiting marian_train with non-zero status after SIGTERM.
* Bug fix: better handling of SIGTERM for graceful shutdown during training.
Prior to this bug fix, BatchGenerator::fetchBatches, which runs in a separate
thread, would ignore SIGTERM during training. (Training uses a custom signal
handler for SIGTERM, which simply sets a global flag to enable graceful
shutdown, i.e., saving the models and the current state of training before
shutting down.)
The changes in this commit also facilitate custom handling of other signals in the
future by providing a general signal handler (setSignalFlag) for all signals with a
signal number below 32 and a generic function (getSignalFlag(sig)) for checking
such flags.
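The mechanism can be sketched as follows. This is a Python sketch of the idea only (Marian itself is C++); it reuses the setSignalFlag/getSignalFlag names from above, with one flag bit per signal number below 32:

```python
# Sketch of the flag-based signal handling described above (not Marian code).
import signal

_signal_flags = 0  # one bit per signal number below 32

def setSignalFlag(signo, frame=None):
    # Generic handler: only record that signal `signo` arrived.
    global _signal_flags
    _signal_flags |= (1 << signo)

def getSignalFlag(signo):
    # Check whether signal `signo` has been received.
    return bool(_signal_flags & (1 << signo))

# Install the generic handler for SIGTERM. The training loop (and the batch
# fetching thread) then polls getSignalFlag(signal.SIGTERM) between steps,
# saves the model and training state, and exits with code 15 (the SIGTERM
# signal number) so the parent process can tell the exit was not "natural".
signal.signal(signal.SIGTERM, setSignalFlag)
```

Doing only a flag update inside the handler keeps it async-signal-safe; all actual shutdown work happens in normal program flow.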
This PR adds initial Azure Pipelines with builds on Ubuntu (with CUDA) and macOS (CPU-only).
The scripts for installing CUDA are already in the public master.