4397 Commits

Author SHA1 Message Date
Marcin Junczys-Dowmunt
adba021a5e bump version 2020-03-12 20:49:31 -07:00
Martin Junczys-Dowmunt
69d6f02711 Merged PR 11998: Lazy init for cuda handles (cusparse and cublas)
This does a lazy init of the two cuda handles that we are using on the GPU. When initialized eagerly, cusparse will consume about 250MB of CPU RAM and about 75MB of GPU RAM. The handles should only be created when actually needed.
2020-03-13 03:22:40 +00:00
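The lazy-initialization idea in this commit can be sketched roughly as below. This is a minimal illustration and not Marian's actual code: `FakeHandle` and `LazyHandle` are hypothetical names, and the fake handle stands in for a cuSPARSE or cuBLAS handle so that the sketch runs without CUDA.

```cpp
#include <memory>

// Hypothetical stand-in for an expensive library handle (e.g. cuSPARSE).
// It counts how many handles have been constructed.
struct FakeHandle {
  static int& created() { static int n = 0; return n; }
  FakeHandle() { ++created(); }
};

// Creates the wrapped handle on first use instead of at construction time.
// Note: this simple version is not thread-safe.
template <typename Handle>
class LazyHandle {
public:
  Handle& get() {
    if(!handle_)
      handle_ = std::make_unique<Handle>();  // pay the cost only here
    return *handle_;
  }

  bool initialized() const { return handle_ != nullptr; }

private:
  std::unique_ptr<Handle> handle_;
};
```

A code path that never calls `get()` never pays the memory cost of the handle, which is the saving the commit message describes.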
Marcin Junczys-Dowmunt
3c7a88f4e9 update changelog and version 2020-03-10 11:34:07 -07:00
Marcin Junczys-Dowmunt
4b23fe76ff update to marian-dev 2020-03-10 11:32:14 -07:00
Marcin Junczys-Dowmunt
8640031437 resolve merge conflicts 2020-03-10 11:16:43 -07:00
Roman Grundkiewicz
aad22c9d09
Add option for printing CMake cached variables (#583)
* Add option --build-info
2020-03-10 10:29:50 -07:00
Martin Junczys-Dowmunt
cf7f0321f8 Merged PR 11920: Compare external master against internal master
Compare external master against internal master. Just double checking.
2020-03-10 00:29:55 +00:00
Marcin Junczys-Dowmunt
f4ea8239c4 sync with internal branch 2020-03-06 20:54:40 -08:00
Marcin Junczys-Dowmunt
bec7e029b1 bump version 2020-03-06 20:46:11 -08:00
Marcin Junczys-Dowmunt
9f29403627 version and changelog 2020-03-06 20:32:30 -08:00
Martin Junczys-Dowmunt
015a218c99 Merged PR 11312: Guard scheduler against circular references
This is the first PR in a series to get my graph-group* clean-up into master. End-goal is to have clean label-based updates and fp16 in master.

This small change guards against circular references in the scheduler. For large models the freeing order is wrong, which can prevent GPU memory from being freed at the correct time, leaving no memory for the actual model. The weak references solve that.
2020-03-07 04:29:50 +00:00
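How a weak reference breaks an ownership cycle can be sketched as follows. The `Trainer` and `Scheduler` types here are hypothetical and only illustrate the pattern; they are not Marian's classes.

```cpp
#include <memory>

struct Trainer;  // hypothetical owner type, defined below

// Hypothetical scheduler that needs to refer back to its owner.
struct Scheduler {
  // A weak reference does not keep the Trainer alive, so no cycle forms.
  std::weak_ptr<Trainer> trainer;
};

struct Trainer {
  static int& alive() { static int n = 0; return n; }
  Trainer() { ++alive(); }
  ~Trainer() { --alive(); }

  std::shared_ptr<Scheduler> scheduler;  // the Trainer owns its scheduler
};

// Builds the pair, then drops the only owning pointer to the Trainer.
// Returns the number of Trainer objects still alive afterwards.
inline int aliveAfterRelease() {
  auto trainer = std::make_shared<Trainer>();
  trainer->scheduler = std::make_shared<Scheduler>();
  trainer->scheduler->trainer = trainer;  // non-owning back-reference
  trainer.reset();                        // Trainer is destroyed here
  return Trainer::alive();
}
```

If `Scheduler::trainer` were a `std::shared_ptr` instead, the two objects would keep each other alive after `reset()` and their memory would not be released at that point.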
Martin Junczys-Dowmunt
45b83b20f2 Merged PR 11895: Use lowest() for INVALID_PATH_SCORE
* use lowest() as INVALID_PATH_SCORE instead of -9999, which caused problems with very long sequences
* add a number of aborts in relation to invalid path scores during beam search
2020-03-07 03:59:10 +00:00
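Why a fixed sentinel such as -9999 breaks down for very long sequences can be illustrated with a small sketch. The constant names and `pathScore` are hypothetical, and the per-token log-probability is an arbitrary value chosen for the example.

```cpp
#include <limits>

// Old-style sentinel: a long hypothesis can legitimately score below it.
constexpr float kOldInvalidPathScore = -9999.f;

// Sentinel based on lowest(): no finite float compares below it.
constexpr float kInvalidPathScore = std::numeric_limits<float>::lowest();

// Sums a constant per-token log-probability over a sequence (illustrative).
inline float pathScore(int length, float logProbPerToken) {
  float score = 0.f;
  for(int i = 0; i < length; ++i)
    score += logProbPerToken;
  return score;
}
```

With 5000 tokens at -2.5 each, the accumulated score falls below -9999, so a valid hypothesis would rank below the old "invalid" marker, while it still compares above `lowest()`.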
Roman Grundkiewicz
00d2e999e3
Add support for compiling on Mac (and clang) (#598)
* Compile marian on mac and clang. Two linker errors left

* MacOS has a different definition for unsigned long

* Find OpenBLAS on mac

* Fix a typo in the BLAS detection

* Simplify and add comments

* Refactor cpu allocation code. Do not fallback to malloc

* Fix compilation warning on gcc

* Refactor memory allocation

* Make things compile with clang-8 with fewer warnings.

* Eliminate clang warnings when compiling examples and when compiling without MKL

* added USE_MKL option to compile without MKL for debugging even when MKL is installed

* fixed issues with compiling examples with clang

* Fix compile errors with clang in src/tests.

* Fix missing whitespace in error message in src/tests/sqlite.cpp.

* Responding to Frank Seide's code review.

* Eliminate clang warnings when compiling with -DUSE_FBGEMM=on.

* Fix compilation on gcc 8

* Get Marian to compile with Clang-10.

* Fix Clang-8 warnings when compiling with marian-server

* Add more comments and explicit unsigned long long for windows

* Pull in fbgemm that supports mac

* Fix warning flags order in CMakeLists.txt

Co-authored-by: Kenneth Heafield <kpu@users.noreply.github.com>
Co-authored-by: Ulrich Germann <ulrich.germann@gmail.com>
Co-authored-by: Roman Grundkiewicz <romang@amu.edu.pl>
2020-03-05 21:08:23 +00:00
Roman Grundkiewicz
67b055fe4a Update the submodule regression-tests 2020-02-26 14:13:47 +00:00
Pinzhen Chen (Patrick)
03fbf31592
Combine two for-loops in nth_element.cpp on CPU (#601) 2020-02-26 13:10:56 +00:00
Marcin Junczys-Dowmunt
24df8f1cb8 Revert "Merge pull request #599 from marian-nmt/nccl-unused-function"
This reverts commit 63272d1bc4, reversing
changes made to 64a67d5fc8.
2020-02-15 13:47:18 -08:00
Kenneth Heafield
63272d1bc4
Merge pull request #599 from marian-nmt/nccl-unused-function
clang: Disable unused-function warnings for 3rd-party NCCL library
2020-02-15 21:25:45 +00:00
Frank Seide
e09f7136ce Merged PR 11566: removed overstuff and understuff features
removed overstuff and understuff features
2020-02-15 07:38:16 +00:00
Kenneth Heafield
0873311d48 Gate warning on clang 2020-02-11 20:19:50 +00:00
Kenneth Heafield
a2a567c538 clang: Disable unused-function warnings for 3rd-party NCCL library 2020-02-11 20:10:50 +00:00
Nikolay Bogoychev
64a67d5fc8
Remove unused variables (#571) 2020-02-11 11:36:13 -08:00
Ulrich Germann
bb44c2a8b9
Remove printout of variable to std::cerr (#596) 2020-02-11 11:29:20 +01:00
Martin Junczys-Dowmunt
1044f7f587 Merged PR 11434: Fixes empty line handling with factored segmenter
Fixes empty line handling with factored segmenter. In my previous PR where I fixed general empty line handling I misunderstood the relation between WordIndex and factors and did an incorrect inverse look-up of the word index of EOS. This should be fixed now for FS; there should be no change when not using FS.
2020-02-07 01:22:21 +00:00
Roman Grundkiewicz
990aeb5daf
Add a warning message for an incorrect use of output sampling (#585)
Add a warning message if sampling with beam-size > 1 or ensembles
2020-01-31 14:22:20 +00:00
Roman Grundkiewicz
533604024b
Update script exporting embeddings to support tied embeddings (#569) 2020-01-29 13:19:21 -08:00
Roman Grundkiewicz
22ad592a1d
Add error message about missing --no-restore-corpus (#584) 2020-01-29 13:11:36 -08:00
Roman Grundkiewicz
ad28f99ac0
Add --valid-reset-stalled (#587) 2020-01-29 13:08:45 -08:00
Roman Grundkiewicz
a43ccd6c28
Mention sentence cropping for --valid-max-length (#588) 2020-01-29 13:06:58 -08:00
Ulrich Germann
7f4c730f1b
Merge pull request #591 from marian-nmt/ug-const-diligence
Make Vocab const in beam_search.h
2020-01-29 17:45:29 +00:00
Ulrich Germann
7228698b06 Make Vocab const in beam search. Remove some trailing whitespace. 2020-01-29 16:25:56 +00:00
Ulrich Germann
cfdde151a1 Merge branch 'master' into ug-const-diligence 2020-01-29 16:23:35 +00:00
Martin Junczys-Dowmunt
b3a23108b4 Merged PR 11188: Handle empty inputs with batch purging
The previous mechanism to remove empty inputs does not play well with batch purging (removal of finished sentences). Now we reuse the batch purging mechanism to get rid of empty inputs by forcing EOS for all beam entries of a batch entry for the corresponding source batch entry. The purging then takes care of the rest. We set the probability to log(1) = 0.
2020-01-17 21:52:33 +00:00
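Forcing EOS for a beam entry might look roughly like the sketch below. The function `forceEos` is hypothetical, and masking every other token with the lowest representable score is an assumption made for illustration; the commit message only states that EOS receives log(1) = 0.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Overwrites the log-probabilities of one beam entry so that EOS is the
// only viable continuation: log p(EOS) = log(1) = 0, and every other
// token gets the lowest representable score (an assumption of this sketch).
inline void forceEos(std::vector<float>& logProbs, std::size_t eosId) {
  for(std::size_t i = 0; i < logProbs.size(); ++i)
    logProbs[i] = (i == eosId) ? 0.f : std::numeric_limits<float>::lowest();
}
```

Once every beam entry of an empty input has emitted EOS, that sentence counts as finished and the existing purging step can remove it from the batch.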
Marcin Junczys-Dowmunt
b822cd4d12 move regression test pointer 2020-01-11 14:00:32 -08:00
Marcin Junczys-Dowmunt
703fcf4347 update regression test pointer 2020-01-11 13:46:55 -08:00
Marcin Junczys-Dowmunt
1f7a63d3ff bump version 2020-01-11 12:31:34 -08:00
Martin Junczys-Dowmunt
af02867fb1 Merged PR 11103: Clear cache for RNN object between batches
* Clears cache for RNN object in transformer, otherwise stale tensor might be kept around.
* Add missing `hash()` and `equal` functions everywhere.
* Fixes bug from deployment test.
2020-01-11 20:29:43 +00:00
Young Jin Kim
0fab6ea850 Merged PR 10709: Disable a warning in FBGEMM code. This issue only appears in debug build.
Disable a warning in FBGEMM code. This issue only appears in debug build.
2020-01-11 01:29:02 +00:00
Martin Junczys-Dowmunt
164d26cc36 Merged PR 10999: Splitting up add_all.h into *.h, *.cu and *.inc
Splitting up header file into header and *.cu, comes with the price of having to include specializations for combinations of types as for element.inc and add.inc. No code changes otherwise.

Add CMake options to disable specific compute capabilities.

When run with `make -j16` this compiles in about 6 minutes instead of 7 minutes. Selecting only SM70 during compilation brings down the time to 3 minutes.
2020-01-06 19:14:00 +00:00
Martin Junczys-Dowmunt
88d9980589 Merged PR 10996: A number of smaller changes and clean-up
* Downgrade NCCL to 2.3.7 as 2.4.2 is buggy (hangs with larger models)
* Actually enable gradient-checkpointing, previous option was inactive
* Clean-up training-only options that should not be displayed for decoder and scorer
* Re-enable conversion to FP16 if element types are compatible (belong to the same type class)
* A few typos and more verbose log messages.
2020-01-05 23:16:13 +00:00
Roman Grundkiewicz
24f062cd27 Add option to print word-level scores (#501)
* Add printing word level scores

* Add option --no-spm-decode

* Fix precision for word-level scores

* Fix getting the no-spm-decode option

* Update CHANGELOG

* Add comments and refactor

* Print word-level scores next to other scores in an n-best list

* Remove --word-scores from marian-scorer

* Add --no-spm-decode only if compiled with SentencePiece

* Add comments

* Printing word scores before model scores in n-best lists

* Update VERSION

Co-authored-by: Marcin Junczys-Dowmunt <Marcin.JunczysDowmunt@microsoft.com>
2020-01-03 19:10:21 -08:00
Marcin Junczys-Dowmunt
2bd986d8a7 update version and changelog 2019-12-23 12:10:30 -08:00
Martin Junczys-Dowmunt
0dc1ef11d3 Merged PR 10797: Differentiate packed8 type by layout
For the FBGEMM-based int8 implementation, the packed matrix (model) can differ depending on the available AVX instruction sets. This PR splits the packed8 format into two separate data formats (packed8avx2, packed8avx512), which allows any packed model to be generated on any machine.
* Added packed8avx2, packed8avx512 types, removed packed8 type
* Added blocking factors to the fbgemm interface based on the pack type for pack function and gemm functions.
2019-12-23 20:04:13 +00:00
Frank Seide
f882f27c09 Merged PR 10692: new factor conditioning, inline fixing suppression, nan suppression
This adds two new features related to factored vocabs:
* a new conditioning mechanism that mimics a mini transformer layer between the emitted lemma and the factors. This affects only factored vocabs, and must be enabled explicitly. Change is in `generic.cpp`.
* in case of inline phrase-fixing, cross-attention is no longer allowed to look into the source sequence. This only affects inputs with `|is` factors or `<IOPEN>` tags. Change is in `states.h`.
* Adam optimizer now skips update if the gradient contains a NaN. Does not affect existing configs unless they produce NaNs. Change is in `optimizers.cpp`.
* reverts to old `LayerNorm` routine. *TODO*: Is this change still needed?

Additional changes:
* new method `locate()` for accessing batch data with array coordinates
* new overloads for `constant_like()` from vector directly (most used case)
* rvalue-ref version of `fromVector()`
2019-12-20 01:46:37 +00:00
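Skipping an optimizer step when the gradient contains a NaN can be sketched as below. This is a minimal CPU illustration with a hypothetical `Adam` struct and `hasNaN` helper; it is not the code in `optimizers.cpp`, and the hyperparameter defaults are the commonly used ones, not values taken from Marian.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Returns true if any gradient entry is NaN.
inline bool hasNaN(const std::vector<float>& grad) {
  for(float g : grad)
    if(std::isnan(g))
      return true;
  return false;
}

// Hypothetical minimal Adam optimizer state.
struct Adam {
  float lr = 0.001f, beta1 = 0.9f, beta2 = 0.999f, eps = 1e-8f;
  std::size_t t = 0;        // number of updates actually applied
  std::vector<float> m, v;  // first and second moment estimates

  // Applies one Adam step. Returns false and leaves the parameters and
  // the optimizer state untouched if the gradient contains a NaN.
  bool update(std::vector<float>& params, const std::vector<float>& grad) {
    if(hasNaN(grad))
      return false;
    if(m.empty()) {
      m.assign(params.size(), 0.f);
      v.assign(params.size(), 0.f);
    }
    ++t;
    const float c1 = 1.f - std::pow(beta1, static_cast<float>(t));
    const float c2 = 1.f - std::pow(beta2, static_cast<float>(t));
    for(std::size_t i = 0; i < params.size(); ++i) {
      m[i] = beta1 * m[i] + (1.f - beta1) * grad[i];
      v[i] = beta2 * v[i] + (1.f - beta2) * grad[i] * grad[i];
      params[i] -= lr * (m[i] / c1) / (std::sqrt(v[i] / c2) + eps);
    }
    return true;
  }
};
```

Checking before touching `m`, `v`, or `t` keeps a single bad batch from contaminating the moment estimates, which would otherwise propagate the NaN into every later update.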
Marcin Junczys-Dowmunt
bab02e3b84 Merge branch 'cblas' 2019-12-13 13:12:07 -08:00
Marcin Junczys-Dowmunt
eba7aed344 increase version 2019-12-13 11:09:02 -08:00
Martin Junczys-Dowmunt
e0500b20b8 Merged PR 10827: Sequential unlikelihood training and fixed gather operation
This implements Sequential Unlikelihood Training from https://arxiv.org/abs/1908.04319
* implementation as expensive multi-op, special node in-progress.
* fixed gather operator to work in batched cases
2019-12-13 18:55:36 +00:00
Marcin Junczys-Dowmunt
5be8558c35 amend changelog and bump version 2019-12-12 19:17:16 -08:00
Ulrich Germann
2b14d490d1 Allow file name templated valid-translation-output files (#549)
This allows us to preserve validation output at each validation point.
Template parameters are explained in the help message.
2019-12-09 15:05:44 +00:00
Roman Grundkiewicz
eb5f97240e
Fix word weighting with max length cropping (#562)
Fix word weighting with max length cropping
2019-12-06 13:27:48 +00:00
Roman Grundkiewicz
c343ced9e1 Add lexical shortlists to marian-server (#560)
* Add lexical shortlists to marian-server

* Update CHANGELOG
2019-12-05 19:41:36 -08:00