259 Commits

Author SHA1 Message Date
Roman Grundkiewicz
e025bfb07c Merged PR 20070: Run regression tests in Azure Pipelines
The changes proposed in this pull request:
* Added regression testing with internal models into Azure Pipelines on both Windows and Ubuntu
* Created https://machinetranslation.visualstudio.com/Marian/_git/marian-prod-tests (more tests will be added over time)
* Made regression test outputs (all `.log`, `.out`, `.diff` files) available for inspection as a downloadable artifact.
* Made `--build-info` option available in CMake-based Windows builds

Warning: I tried to handle multiple cases, but some regression tests may occasionally fail, especially tests using avx2 or avx512 models, because their outputs are system/CPU dependent. I think it's better to merge this now, monitor the stability of the tests, and add expected-output variations where necessary, improving the coverage and stability of the regression tests over time.
2021-08-06 08:02:18 +00:00
Rohit Jain
4ff2ef189e Merged PR 19761: Expose SPM Interface from Marian
This PR adds interfaces in Marian to allow it to handle segmentation duties.

Related work items: #121418
2021-07-30 03:28:00 +00:00
Nikolay Bogoychev
379212b75c
Enable compute86 where supported (#863)
* Enable compute86 where supported
2021-05-04 12:36:10 +01:00
Marcin Junczys-Dowmunt
be65065623
Allow choosing fine-grained CPU intrinsics as CMake options (#849)
* allow choosing fine-grained CPU intrinsics as CMake options
* inform user that e.g. -DCOMPILE_AVX2=off will be ignored with -march=native if there is compiler support
2021-04-09 09:02:34 -07:00
rhenry-nv
fddd0e0661
Adds better Affine support for GPUs when using CUDA 11. Introduces a new bias addition kernel for CUDA < 11 (#778)
Co-authored-by: Marcin Junczys-Dowmunt <marcinjd@microsoft.com>
2021-04-08 21:46:27 -07:00
Marcin Junczys-Dowmunt
bfa6180033 Revert "remove TC_MALLOC from optional dependencies (#840)"
This reverts commit 096c48e51c.
2021-04-08 07:30:38 +00:00
Marcin Junczys-Dowmunt
096c48e51c
remove TC_MALLOC from optional dependencies (#840)
There seems to be no benefit from TC_MALLOC any more, hence removing.
2021-03-22 08:02:04 -07:00
Roman Grundkiewicz
a1aaa32c6a Merged PR 18201: Install Boost in Azure pipelines
Installing Boost manually in all workflows, because it has been recently removed from Azure/GitHub hosted runners. This should fix recent failures of Marian CI builds.
2021-03-17 17:34:09 +00:00
Martin Junczys-Dowmunt
5aeea4e066 Merged PR 17430: Refactors MPI interfaces and adds different types of gradient exchanges
* Refactors MPI-related code
* Adds node-local updates with occasional inter-node updates
* Decouples batch-reading across nodes
2021-02-08 05:27:49 +00:00
Young Jin Kim
b2d2a5d457 Merged PR 17525: cmake fix for GENERATE_MARIAN_INSTALL_TARGETS
cmake fix for GENERATE_MARIAN_INSTALL_TARGETS
2021-02-04 05:04:48 +00:00
Martin Junczys-Dowmunt
ba91b391e2 Merged PR 17337: fp16 support for training
This PR refactors the training graph groups and optimizers to enable and simplify things for fp16 support.

Deprecates old unused graph groups and fixes a couple of MPI issues.
2021-01-28 16:15:44 +00:00
Nikolay Bogoychev
600f5cbdec
Integrate intgemm into marian (#595)
Adds intgemm as a module for Marian. Intgemm is @kpu's 8/16-bit GEMM library with support for architectures from SSE2 to AVX512VNNI.
Removes outdated integer code related to the --optimize option.

Co-authored-by: Kenneth Heafield <github@kheafield.com>
Co-authored-by: Kenneth Heafield <kpu@users.noreply.github.com>
Co-authored-by: Ulrich Germann <ugermann@inf.ed.ac.uk>
Co-authored-by: Marcin Junczys-Dowmunt <marcinjd@microsoft.com>
Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>
2021-01-24 16:02:30 -08:00
Tommy MacWilliam
f59008276c
Add support for Apple Accelerate (#769)
* support for Apple Accelerate
* add a CMake flag to use Apple Accelerate as the BLAS library.
* rename USE_ACCELERATE to USE_APPLE_ACCELERATE
* add comment with more info on Accelerate
* link to the Apple documentation on Accelerate.
2021-01-07 10:41:55 +00:00
Marcin Junczys-Dowmunt
bbdccd1c92
Update sentencepiece to newest version (#753)
- Updates sentencepiece to the newest version (removes the dependency on protobuf)
- Enables SentencePiece compilation by default, since there is no dependency on protobuf anymore.
2020-11-10 07:01:34 -08:00
Martin Junczys-Dowmunt
3a028f215c Merged PR 16144: Merge Cross-Entropy with Label-Smoothing operation
* Compute label-smoothing within the cross-entropy node, should result in faster training.
2020-11-01 18:54:24 +00:00
Roman Grundkiewicz
ae866af035 Merged PR 15561: Properly compile FBGEMM in CMake MSVC build
This fixes compilation of FBGEMM on Windows using CMake:
1. Compiling FBGEMM and cpuinfo statically
2. Forcing USE_STATIC_LIBS if USE_FBGEMM is set
2020-09-25 15:50:56 +00:00
Aaron Burke
5c45a37fcc Merged PR 14474: CMake build fixes for QuickSAND
- Add installation targets (enabled by GENERATE_MARIAN_INSTALL_TARGETS; default: OFF to preserve CMake 3.5.1 compatibility)
- Add COMPILE_LIBRARY_ONLY option (default: OFF) to exclude in-source executables from the build
- Compiler warning flags are no longer exported as part of the public link interface, only when building privately
- Always set CPUINFO_BUILD_TOOLS=OFF when building fbgemm, not just for MSVC builds

Related work items: #108034
2020-09-10 01:33:44 +00:00
Martin Junczys-Dowmunt
e3916b3d08 Merged PR 15233: Sync internal master with public master
Regular sync of public and internal master.
2020-09-07 19:37:41 +00:00
Roman Grundkiewicz
080d75ad59 Merged PR 14402: Sync with public marian-dev master 1.9.31
This simply pulls the recent updates from the public repo.
2020-07-28 22:19:40 +00:00
Roman Grundkiewicz
71dccf343e Merged PR 14262: Update MSVC CMake build and instructions
This PR updates the Windows build via CMake and the build instructions. With https://github.com/marian-nmt/marian-dev/pull/676, this should be fully workable, including CUDA, FBGEMM, SentencePiece, unit tests, marian-server.

List of changes:
- Fixing compilation of marian-server on Windows via CMake
- Updating vs/CheckDeps.bat
    - zlib no longer needs to be installed as it is included in 3rd_party
    - Installing Boost 1.72 since newer versions are not supported
    - Installing minimal required Boost components in CheckDeps.bat
    - Installing protobuf in CheckDeps.bat
- Updating CMakeSettings.json
- Updating vs/README.md
- Development notes extracted to vs/NOTES.md

I did not update and test with CUDA, because I do not have a machine for that, but AFAIK it works properly.
2020-07-25 20:57:17 +00:00
Frank Seide
435aa9505e Merged PR 14334: full ONNX conversion script
This PR adds a full ONNX conversion script that exports a Marian model and wraps it in a greedy-search implemented in ONNX.
2020-07-24 17:23:05 +00:00
Martin Junczys-Dowmunt
c3fb60cbcd Merged PR 13476: Add LASER reimplementation and code for embedding sentences
This reimplements the LASER encoder from:
```
Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond
Mikel Artetxe, Holger Schwenk
https://arxiv.org/abs/1812.10464
```

and adds functionality to embed sentences with any Marian encoder, not only LASER. It also includes some early attempts to train a transformer model with an encoder-decoder bottleneck. This is quite early code, so some code duplication is to be expected. Nevertheless, it's functional and I would like to have it in master, as we will slowly put it into production in various places. I will make the code "nicer" as we go along.
2020-06-24 01:54:27 +00:00
Marcin Junczys-Dowmunt
50ce630449 small fix to cmake file 2020-06-19 10:28:45 -07:00
Martin Junczys-Dowmunt
0dc318e993 Merged PR 13052: Fix FAISS on Windows
* adds the correct definitions for FINTEGER for Linux and Windows
* moves things around a bit in the CMakeLists.txt files to keep things more local
* fixes a few more warnings in 3rd party code
2020-05-24 02:22:13 +00:00
Martin Junczys-Dowmunt
5a439245b1 Merged PR 12923: LSH indexing to replace short list
This PR adds the FAISS LSH index to Marian as a CPU-side optimization in the output layer of transformer models via KNN search.

* It allows replacing the potentially harmful short-list with an ML-free approximation of the final matrix multiply. As the number of neighbors k and the size of the chosen hash in bits increase, the approximation becomes more accurate, but also slower. For a model dimension of 512, the sweet spot seems to be around k=100-150 and nbits=1024-1536.
* Enable during CPU-side decoding via `--output-approx-knn k nbits`
* Adds a `lambda` node that allows creating custom new nodes, most useful for CPU use right now.
2020-05-21 23:20:52 +00:00
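As a rough illustration of the approach described in PR 12923 above, the following Python sketch uses random-hyperplane hashing to pick `k` candidate output rows by Hamming distance and then computes exact scores only for those rows. Everything here is an assumption for illustration: the function names are hypothetical, the hashing scheme is a generic sign-of-projection LSH, and this is neither the FAISS API nor Marian's implementation.

```python
import random


def make_hyperplanes(dim, nbits, seed=0):
    # One random Gaussian hyperplane per hash bit (hypothetical helper).
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(nbits)]


def lsh_code(vec, planes):
    # Pack the sign of each projection into an integer bit code.
    code = 0
    for plane in planes:
        dot = sum(v * p for v, p in zip(vec, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def approx_top_scores(query, rows, planes, k):
    # Select the k rows whose codes are closest to the query's code in
    # Hamming distance, then compute exact dot products only for those.
    # A real system would precompute the row codes once, not per query.
    q_code = lsh_code(query, planes)
    codes = [lsh_code(r, planes) for r in rows]
    nearest = sorted(
        range(len(rows)),
        key=lambda i: bin(q_code ^ codes[i]).count("1"),
    )[:k]
    return {i: sum(a * b for a, b in zip(query, rows[i])) for i in nearest}


# Example: three "output embedding" rows, hashed with 16 bits.
planes = make_hyperplanes(dim=3, nbits=16)
rows = [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]
scores = approx_top_scores([1.0, 0.0, 0.0], rows, planes, k=2)
```

Larger `k` and `nbits` make the candidate set more likely to contain the truly highest-scoring rows, at the cost of more hashing and more exact dot products, which matches the accuracy/speed trade-off the commit message describes.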
Frank Seide
77a420740c Merged PR 12958: ONNX support
This branch adds functionality to export ONNX models (with limitations).
2020-05-21 05:51:18 +00:00
Martin Junczys-Dowmunt
f1be95fce4 Merged PR 11929: Move around code to make later comparison with FP16 code easier
This does not introduce any new functionality, just moves code around, so that future PRs are easier to compare. Moving old GraphGroup code to training/deprecated. Once it is clear there is nothing in there that's worth saving, this will be deleted.

Replace -Ofast with -O3 and make sure -ffinite-math-only is turned off.
2020-03-14 00:07:37 +00:00
Roman Grundkiewicz
aad22c9d09
Add option for printing CMake cached variables (#583)
* Add option --build-info
2020-03-10 10:29:50 -07:00
Roman Grundkiewicz
00d2e999e3
Add support for compiling on Mac (and clang) (#598)
* Compile marian on mac and clang. Two linker errors left

* macOS has a different definition for unsigned long

* Find OpenBLAS on mac

* Fix a typo in the BLAS detection

* Simplify and add comments

* Refactor cpu allocation code. Do not fallback to malloc

* Fix compilation warning on gcc

* Refactor memory allocation

* Make things compile with clang-8 with fewer warnings.

* Eliminate clang warnings when compiling examples and when compiling without MKL

* added USE_MKL option to compile without MKL for debugging even when MKL is installed

* fixed issues with compiling examples with clang

* Fix compile errors with clang in src/tests.

* Fix missing whitespace in error message in src/tests/sqlite.cpp.

* Responding to Frank Seide's code review.

* Eliminate clang warnings when compiling with -DUSE_FBGEMM=on.

* Fix compilation on gcc 8

* Get Marian to compile with Clang-10.

* Fix Clang-8 warnings when compiling with marian-server

* Add more comments and explicit unsigned long long for windows

* Pull in fbgemm that supports mac

* Fix warning flags order in CMakeLists.txt

Co-authored-by: Kenneth Heafield <kpu@users.noreply.github.com>
Co-authored-by: Ulrich Germann <ulrich.germann@gmail.com>
Co-authored-by: Roman Grundkiewicz <romang@amu.edu.pl>
2020-03-05 21:08:23 +00:00
Martin Junczys-Dowmunt
164d26cc36 Merged PR 10999: Splitting up add_all.h into *.h, *.cu and *.inc
Splitting the header file into a header and *.cu files comes at the price of having to include specializations for combinations of types, as for element.inc and add.inc. No code changes otherwise.

Add CMake options to disable specific compute capabilities.

When run with `make -j16` this compiles in about 6 minutes instead of 7 minutes. Selecting only SM70 during compilation brings down the time to 3 minutes.
2020-01-06 19:14:00 +00:00
Marcin Junczys-Dowmunt
bab02e3b84 Merge branch 'cblas' 2019-12-13 13:12:07 -08:00
Kenneth Heafield
734a8791dd Use __AVX__ compiler define instead of custom and broken NO_AVX flag. Fixes #559 (#564)
* Change to __AVX__ macro defined by compiler
* Source files should not be executable
2019-12-05 14:17:54 -08:00
Nikolay Bogoychev
34e99da49b Fix compilation on CPUs that don't support AVX and some white space i… (#561)
* Fix compilation on CPUs that don't support AVX and some white space issues
* Change from ifdef to ifndef
2019-12-05 08:12:17 -08:00
Marcin Junczys-Dowmunt
26859d2e15 Merge with ms-internal master 2019-11-25 19:18:03 -08:00
Martin Junczys-Dowmunt
93b7ed80fe Merged PR 10553: Fix multiple problems in reduce kernels that occurred during back-prop
This fixes a number of bugs in our GPU reduce kernels that would manifest mainly for larger matrices and during back-prop. We also drop support for CUDA 8.0 to take advantage of new GPU primitives introduced by NVIDIA in CUDA 9.0.
2019-11-26 01:48:07 +00:00
Marcin Junczys-Dowmunt
d394641275
Merge pull request #546 from alvations/marian-from-origin
Added ssse4.2 support
2019-11-21 21:28:32 -08:00
Ulrich Germann
66b95a5afa Sort custom cmake options alphabetically (#547) 2019-11-11 09:33:13 +00:00
alvations
59011c8129 added ssse4.2 support 2019-11-07 07:09:40 +08:00
Ulrich Germann
7b36b329ef Use ccache for faster compilation if available. (#525)
* Use ccache only when requested via cmake -DUSE_CCACHE=on
* Add link to https://ccache.dev in comment about using ccache.
* Issue success / missing ccache message when ccache is requested during the CCACHE run
* Issue cmake warning instead of cmake status message when use of ccache is requested but ccache cannot be found.
2019-10-30 17:29:27 +00:00
Marcin Junczys-Dowmunt
bf10d36d1e disable FBGEMM by default 2019-10-29 07:29:20 -07:00
Marcin Junczys-Dowmunt
a0e472294a remove compatibility check for gcc 4.9 and below, as we cannot compile on 4.9 anyway 2019-10-28 13:57:43 -07:00
Marcin Junczys-Dowmunt
0a89e8f168 fix compilation for various gcc and cuda combinations 2019-10-28 13:18:07 -07:00
Marcin Junczys-Dowmunt
1ab2484978 Merge branch 'master' into mjd/syncWithPublic 2019-10-26 15:17:14 -07:00
Marcin Junczys-Dowmunt
1174cecbd6 merge with public master 2019-10-25 22:24:59 -07:00
Marcin Junczys-Dowmunt
ed5f5866c2 bye bye boost 2019-10-25 17:02:16 -07:00
Ulrich Germann
d1ed2c2506 More helpful error message when CUDA libraries cannot be found. 2019-10-23 18:54:34 +01:00
Marcin Junczys-Dowmunt
cd18ee0396 safeguard cmake against rogue doxygen 2019-10-08 15:54:59 -07:00
Marcin Junczys-Dowmunt
834905411e address code review comments 2019-10-07 15:49:46 -07:00
Marcin Junczys-Dowmunt
4d10ae9212 small changes to cmake 2019-09-17 13:46:52 -07:00
Marcin Junczys-Dowmunt
16e6fb9fae small clean up, move fbgemm pointer 2019-09-17 11:30:46 -07:00