marian

mirror of https://github.com/marian-nmt/marian.git synced 2024-09-19 02:37:14 +03:00

Author	SHA1	Message	Date
Marcin Junczys-Dowmunt	be65065623	Allow choosing fine-grained CPU intrinsics as CMake options (#849) * allow choosing fine-grained CPU intrinsics as CMake options * inform user that e.g. -DCOMPILE_AVX2=off will be ignored with -march=native if there is compiler support	2021-04-09 09:02:34 -07:00
rhenry-nv	fddd0e0661	Adds better Affine support for GPUs when using CUDA 11. Introduces a new bias addition kernel for CUDA < 11 (#778) Co-authored-by: Marcin Junczys-Dowmunt <marcinjd@microsoft.com>	2021-04-08 21:46:27 -07:00
Marcin Junczys-Dowmunt	bfa6180033	Revert "remove TC_MALLOC from optional dependencies (#840)" This reverts commit `096c48e51c`.	2021-04-08 07:30:38 +00:00
Martin Junczys-Dowmunt	7d1f941242	Merged PR 18309: Cleaner suppression of unwanted output words This PR adds cleaner suppression of unwanted output words. We identified a situation where SPM with byte-fallback can generate random bytes with output-sampling. This is particularly harmful when such a random byte happens to be a newline symbol. Here we suppress newlines in the output unless explicitly wanted.	2021-03-26 16:17:12 +00:00
Nikolay Bogoychev	ffd997e360	Properly copy the entire vector in the int16_t case (#845) Fixes #842 #843 #844	2021-03-23 14:32:01 -07:00
Young Jin Kim	b36d0bbbab	Fix FBGEMM build with gcc 9.3+ (#836)	2021-03-22 11:13:40 -07:00
Marcin Junczys-Dowmunt	0394d2cdbe	Display decoder speed statistics with --stat-freq N (#841) Display decoder time statistics if requested	2021-03-22 08:58:04 -07:00
Marcin Junczys-Dowmunt	096c48e51c	remove TC_MALLOC from optional dependencies (#840) There seems to be no benefit from TC_MALLOC any more, hence removing.	2021-03-22 08:02:04 -07:00
Nikolay Bogoychev	d780082973	Fix model loading on architectures where size_t is 32bits (#825) * fix model loading on architectures where size_t is 32bit * Update the changelog Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>	2021-03-19 15:56:17 +00:00
Marcin Junczys-Dowmunt	a11418c17c	Add simple unit tests for binary files (#826) * unit tests for binary file operations * adjust changelog * Set file_ in TemporaryFile for MSVC Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>	2021-03-19 13:17:17 +00:00
Roman Grundkiewicz	db2a5e4d66	Fix broken links to MNIST data sets (#838)	2021-03-19 13:16:10 +00:00
Marcin Junczys-Dowmunt	272096c1d1	sync public and internal master	2021-03-18 03:41:24 +00:00
Marcin Junczys-Dowmunt	8f73923d31	increase version and update changelog	2021-03-18 03:34:44 +00:00
Graeme	2352475705	Fix missing float template specialisation for elem::Plus (#822) * Fix missing float template specialisation for elem::Plus * Update CHANGELOG.md	2021-03-12 11:58:34 +00:00
Kenneth Heafield	af3aa314d0	Fix OMP compilation (#824) * Fix omp variable names	2021-03-02 17:26:49 -08:00
Marcin Junczys-Dowmunt	55a7047f8a	merge with internal master	2021-03-02 05:15:41 +00:00
Roman Grundkiewicz	8155d232db	Update CHANGELOG and VERSION	2021-02-28 09:08:50 +00:00
Graeme	ac71ee8518	Add graph operations documentation (#801) * Doxygen structure for expression graph operators * Document arithmetic expression operations * Document comparison expression operations * Document exp/log and trig operations * Add missing implementation for cos/tan * Document expression manipulation operations * Document misc math operations * Overview of operators * Document activation functions * Document element-wise min/max * Document debugging/checkpoint operators * Document topk/argmin/argmax operations * Document index-based operations * Document reduction operations * Document lambda expression operators * Document product operations * Document softmax, cross-entropy, unlikelihood operations * Document dropout operations * Document scalar product and weighted average operations * Document layer normalization, highway and pooling operations * Document shift expression operator * Extra details on rules for adding specializations to .inc files * Add SinNodeOp example for specialization documentation * Additional details in tensor operator documentation * Remove brief command from doxygen comments * Prefer @ style doxygen functions to \ * Document n-ary function macros * Enable .cu and .inc files in documentation * Add a comment about ONNX mapping * Remove empty lines in doxygen * Update CHANGELOG Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>	2021-02-28 08:38:32 +00:00
Marcin Junczys-Dowmunt	8ecc8b653f	FindMKL modified to find installed mkl in system paths	2021-02-24 05:19:34 +00:00
Roman Grundkiewicz	f88ded2ba8	Add documentation platform based on Sphinx+Doxygen+Breathe+Exhale (#803) * Setup Sphinx+Doxygen+Breathe+Exhale * Remove links to modindex and search pages * Enable Doxygen autobrief	2021-02-23 16:25:30 +00:00
Rihards Krišlauks	b29c50f9b7	Update simple-websocket-server to the latest version (#799) This adds support for boost 1.75	2021-02-22 13:01:03 +00:00
Martin Junczys-Dowmunt	5aeea4e066	Merged PR 17430: Refactors MPI interfaces and adds different types of gradient exchanges * Refactors MPI-related code * Adds node-local updates with occasional inter-node updates * decouples batch-reading across nodes	2021-02-08 05:27:49 +00:00
Marcin Junczys-Dowmunt	6f6d484665	increase version to 1.10.0	2021-02-06 15:35:16 -08:00
Nikolay Bogoychev	600f5cbdec	Integrate intgemm into marian (#595) Adds intgemm as a module for Marian. Intgemm is @kpu's 8/16 bit gemm library with support for architectures from SSE2 to AVX512VNNI. Removes outdated integer code related to the --optimize option. Co-authored-by: Kenneth Heafield <github@kheafield.com> Co-authored-by: Kenneth Heafield <kpu@users.noreply.github.com> Co-authored-by: Ulrich Germann <ugermann@inf.ed.ac.uk> Co-authored-by: Marcin Junczys-Dowmunt <marcinjd@microsoft.com> Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>	2021-01-24 16:02:30 -08:00
Qianqian Zhu	737f43014a	Fix to resolve run time failures for FastOpt enabled WASM builds (#779) * copy changes from commit 4df92f2 * add comments for better understanding * restore the newline at the end of file and add these changes to changelog.md	2021-01-07 13:12:36 +00:00
Tommy MacWilliam	f59008276c	Add support for Apple Accelerate (#769) * support for Apple Accelerate * add a CMake flag to use Apple Accelerate as the BLAS library. * rename USE_ACCELERATE to USE_APPLE_ACCELERATE * add comment with more info on Accelerate * link to the Apple documentation on Accelerate.	2021-01-07 10:41:55 +00:00
Graeme	49994a0460	Prefer if/else in place of try/catch in FastOpt makeScalar (#774)	2021-01-06 11:30:22 +00:00
Martin Junczys-Dowmunt	9dad84ae9b	Merged PR 16337: Update sentencepiece to new version This updates the SentencePiece version in Marian to a much more recent revision. Due to that there is no dependency on Protobuf anymore.	2020-11-11 00:38:37 +00:00
Martin Junczys-Dowmunt	b90229d8ee	Merged PR 16294: Change stopping criterion for mini-batch-fit binary search This PR changes the stopping criterion for mini-batch-fit binary search to better maximize batch size.	2020-11-10 17:08:47 +00:00
Martin Junczys-Dowmunt	d5e773f937	Merged PR 15925: Training embedder separation with margin losses * This PR adds training of embedding spaces with better separation based on https://arxiv.org/abs/2007.01852 * We can now train with in-batch negative examples or a handful of hand-constructed negative examples provided in a tsv-file.	2020-11-07 17:46:39 +00:00
Marcin Junczys-Dowmunt	3d233ec592	Merge branch 'master' into pmaster	2020-11-04 14:37:25 -08:00
afaji	e274ac76b2	Quantized model finetuning (#690) * enable quantized training	2020-11-04 14:25:40 -08:00
Martin Junczys-Dowmunt	f3e4cbf705	Merged PR 16219: Allow to set epoch display width Allow to set the display width of the fractional part of a logical epoch.	2020-11-04 17:31:59 +00:00
Marcin Junczys-Dowmunt	fabbe20309	merge with internal	2020-11-02 09:05:15 -08:00
Martin Junczys-Dowmunt	ca777c86b7	Merged PR 16168: Add redefinition of logical epoch Adds e.g. --logical-epoch 1Gt (or other units) that alters the way the epoch counter is displayed. The actual underlying counter in the form of data passes is not changed. This is essentially a logging change that will now display the epoch as a fractional multiple of the chosen unit. Example for `--logical-epoch 100Mt`: ``` [2020-11-02 04:14:16] Ep. 4.8602 : Up. 16755 : Sen. 1,088,000 : Cost 1.17630422 * 1,993,304 @ 31,985 after 486,015,051 : Time 61.36s : 32483.55 words/s [2020-11-02 04:15:18] Ep. 4.8803 : Up. 16825 : Sen. 1,162,648 : Cost 1.17474616 * 2,009,996 @ 37,740 after 488,025,047 : Time 61.88s : 32480.17 words/s [2020-11-02 04:16:19] Ep. 4.9002 : Up. 16893 : Sen. 1,235,200 : Cost 1.17799997 * 1,990,844 @ 26,173 after 490,015,891 : Time 60.47s : 32920.16 words/s ```	2020-11-02 17:03:41 +00:00
Martin Junczys-Dowmunt	1b908a82a9	Merged PR 16162: ChrF validation metric This PR adds ChrF as a validation metric. This follows the implementation from SacreBLEU.	2020-11-02 15:47:46 +00:00
Martin Junczys-Dowmunt	3a028f215c	Merged PR 16144: Merge Cross-Entropy with Label-Smoothing operation * Compute label-smoothing within the cross-entropy node, should result in faster training.	2020-11-01 18:54:24 +00:00
Martin Junczys-Dowmunt	160b36cec8	Merged PR 15896: Add --after N option to supersede --after-batches and --after-epochs Replace `--after-batches N` and `--after-epochs N` with `--after Nu/Ne`, which allows specifying updates, epochs, or target labels with units, e.g.: * `--after 30Gt` or `--after 50ku` or `--after 10e` * Can also combine multiple criteria: `--after 30Gt,50ku,10e`, in which case training stops at whichever criterion is hit first Changes default `cost-type` from `ce-mean` to `ce-sum` and turns `display-label-counts` on by default.	2020-10-29 20:16:19 +00:00
rhenry-nv	595fba4145	Fixes bug for certain reductions (#746) * Fixes reductions into scalars for <= 32 input elements. Only affects reductions where 0 is not the identity * Update CHANGELOG.md * Adds space before "?" * Adds comment explaining increase in margin for reduction tests. Adds axis comment to argument to reduce functions. Adds more tests for small reduction operators	2020-10-26 12:26:41 -07:00
Martin Junczys-Dowmunt	91ad534c65	Merged PR 15320: Sync internal and external master Updates internal master to external master. Changes: * Correct behavior for Pre-Norm transformer * Small changes to CMake files	2020-09-19 16:22:08 +00:00
Marcin Junczys-Dowmunt	951ecfe932	Enable final stack post-processing for transformer for correct prenorm behavior (#719) This PR enables final post-processing of a full transformer stack for correct prenorm behavior. See issues #715 and #699. List of changes: Add final post-processing in encoder and decoder if requested with --transformer-postprocess-top. Can take combinations of d, n, a. Using a will add a skip connection from the bottom of the stack. Add --task transformer-base-prenorm and --task transformer-big-prenorm which correspond to --task transformer-base --transformer-preprocess n --transformer-postprocess da --transformer-postprocess-top n.	2020-09-09 08:06:20 -07:00
Martin Junczys-Dowmunt	e3916b3d08	Merged PR 15233: Sync internal master with public master Regular sync of public and internal master.	2020-09-07 19:37:41 +00:00
Roman Grundkiewicz	452f9c79e6	Print message that marian-server is listening on port X after it is accepting connections (#705) * Print 'server is listening' after it is accepting connections; fix #701 * Minor code formatting	2020-09-03 11:47:10 +01:00
Nikolay Bogoychev	4d9d15649e	Enable compute75 when using cuda10 (#698) * Enable compute75 when using cuda10 or newer and disable compute <50 when using CUDA11 * Re-enable deprecated architectures with CUDA11	2020-09-01 08:56:24 -07:00
Ulrich Germann	044b416af5	Improved handling of SIGTERM (#660) * Return exit code 15 (SIGTERM) after SIGTERM. When marian receives signal SIGTERM and exits gracefully (save model & exit), it should then exit with a non-zero exit code, to signal to any parent process that it did not exit "naturally". * Added explanatory comment about exiting marian_train with non-zero status after SIGTERM. * Bug fix: better handling of SIGTERM for graceful shutdown during training. Prior to this bug fix, BatchGenerator::fetchBatches, which runs in a separate thread, would ignore SIGTERM during training (training uses a custom signal handler for SIGTERM, which simply sets a global flag, to enable graceful shutdown, i.e., save models and current state of training before shutting down). The changes in this commit also facilitate custom handling of other signals in the future by providing a general signal handler for all signals with a signal number below 32 (setSignalFlag) and a generic flag checking function (getSignalFlag(sig)) for checking such flags.	2020-08-31 21:13:41 -07:00
Roman Grundkiewicz	3aed9143d9	Add --word-scores to marian-scorer (#638) * Add --word-scores to scorer * Update CHANGELOG * Use single RescorerLoss class * Update VERSION	2020-08-04 16:15:49 +01:00
Kenneth Heafield	c944633dd2	Optimize CPU LayerNormalization 6x with -ffast-math and lifting branches (#689) * Optimize LayerNormalization with -ffast-math and lifting branches * LayerNormalization changelog gprof: Now 1.65 30.11 0.57 void marian::cpu::LayerNormalizationImpl<1, 1, true>(float, float const, float const, float const, float, int, int) Baseline 9.08 22.31 3.49 marian::cpu::LayerNormalization(IntrusivePtr<marian::TensorBase>, IntrusivePtr<marian::TensorBase>, IntrusivePtr<marian::TensorBase>, IntrusivePtr<marian::TensorBase>, float) That's 3.49 seconds to 0.57 seconds * LayerNormalization: longer comments, @frankseide-style if	2020-08-01 20:41:56 -07:00
Ulrich Germann	dade12a1b2	Update CHANGELOG.md Capitalize "Internal", add period at end.	2020-07-30 19:56:49 +01:00
Ulrich Germann	2987d1eb1f	Update CHANGELOG.md Fixed typo.	2020-07-30 19:56:49 +01:00
Ulrich Germann	ad91e54a24	Updated CHANGELOG.md.	2020-07-30 19:56:49 +01:00

169 Commits