marian

mirror of https://github.com/marian-nmt/marian.git synced 2024-09-11 06:15:56 +03:00

Author	SHA1	Message	Date
Marcin Junczys-Dowmunt	4d3702c4ec	Merged PR 25950: Add missing defaults for concatenated factors This PR adds missing default values for concatenated factors.	2022-10-06 05:53:16 +00:00
Marcin Junczys-Dowmunt	1e92cff93d	Merged PR 25919: Sync with public master - no review required Sync with public master, checking compilation, regression tests etc.	2022-10-04 00:42:52 +00:00
Marcin Junczys-Dowmunt	2c55cdb3c0	Merged PR 25889: Fixes bad memory access problem in hashing Fix bad memory access problem in hashing by using the graph allocator	2022-09-29 19:01:49 +00:00
Marcin Junczys-Dowmunt	2cd3055d76	Merged PR 25836: Check via hashing if re-syncing in local mode is required * This adds GPU-side hashing to tensors (a hash based on murmurhash3) * The hash is used to check if parameters across nodes have diverged, if yes, resync all parameters and optimizer shards. Before it would resync every N (100 or 200) updates. Now this can be skipped if nothing diverged.	2022-09-27 18:40:53 +00:00
Marcin Junczys-Dowmunt	7d2045a907	Merged PR 25686: Loading checkpoints from main node only via MPI Enables loading of model checkpoints from main node only via MPI. Until now the checkpoint needed to be present in the same location on all nodes. That could be done either via writing to a shared filesystem (problematic due to bad syncing) or by manual copying to the same local location, e.g. /tmp on each node (while writing only happened to one main location). Now, marian can resume training from only one location on the main node. The remaining nodes do not need to have access. E.g. local /tmp on the main node can be used, or race conditions on shared storage are avoided. Also avoids creating files for logging on more than one node. This is a bit wonky, done via environment variable lookup.	2022-09-21 20:39:54 +00:00
Marcin Junczys-Dowmunt	f3e1efe731	merge with internal master	2022-05-26 06:28:06 -07:00
Marcin Junczys-Dowmunt	1a74358277	Merged PR 23429: Small fixes around fp16 training and batch fitting This PR introduces small fixes around fp16 training and batch fitting: * Multi-loss casts type to first loss-type before accumulation (aborted before due to missing cast) * Throw `ShapeSizeException` if total expanded shape size exceeds numeric capacity of the maximum int value (2^31-1) * During mini-batch-fitting, catch `ShapeSizeException` and use another sizing hint. Aborts outside mini-batch-fitting. * Negative `--workspace -N` value allocates workspace as total available GPU memory minus N megabytes.	2022-04-11 20:19:58 +00:00
Marcin Junczys-Dowmunt	d5c7372a67	Merged PR 23407: Fix incorrect/missing gradient accumulation for affine biases This PR fixes incorrect/missing gradient accumulation with delay > 1 or large effective batch size of biases of affine operations.	2022-04-08 16:00:04 +00:00
Artur Nowakowski	23c36ec1a3	Fixed fp16 training/inference with factors-combine concat (#926 )	2022-03-22 10:07:41 +00:00
Marcin Junczys-Dowmunt	16bfa0c913	Merged PR 23094: Adapt --cost-scaling to more stable setting This PR sets default parameters for cost-scaling to 8.f 10000 1.f 8.f, i.e. when scaling scale by 8 and do not try to automatically scale up or down. This seems more stable than variable cost-scaling with larger numbers that was the default before.	2022-03-16 14:44:17 +00:00
Marcin Junczys-Dowmunt	310d2f42f6	Merged PR 22939: Fix case augmentation with multi-threaded reading This PR fixes case augmentation with multi-threaded reading. The solution is to not look at iterator::pos_ in lazy processing, rather pass it as an argument to the lazy function.	2022-03-07 16:57:32 +00:00
Graeme Nail	601c9ac980	Detect fortran_order in npz (#911 ) * Fix fortran_order parsing * Abort on non row-major NPZ entries * Update CHANGELOG * Update VERSION Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>	2022-02-15 13:22:49 +00:00
Nikolay Bogoychev	8a9580b329	update the intgemm version to upstream (#924 ) Some data types got upper cased, that's why there is a larger diff than expected Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>	2022-02-15 11:18:29 +00:00
Marcin Junczys-Dowmunt	b0275e7754	merge with internal master	2022-02-11 06:03:16 -08:00
Marcin Junczys-Dowmunt	4b51dcbd06	Merged PR 22524: Optimize guided alignment training speed via sparse alignments - part 1 This replaces dense alignment storage and training with a sparse representation. Training speed with guided alignment matches now nearly normal training speed, regaining about 25% speed. This is no. 1 of 2 PRs. The next one will introduce a new guided-alignment training scheme with better alignment accuracy.	2022-02-11 13:50:47 +00:00
Marcin Junczys-Dowmunt	3b21ff39c5	update VERSION and CHANGELOG	2022-02-10 08:35:49 -08:00
Marcin Junczys-Dowmunt	b3feecc82b	Merged PR 22483: Make C++17 the official standard for Marian Make C++17 the official standard for Marian	2022-02-10 16:34:23 +00:00
Graeme Nail	4d44627f26	PyYaml safe_load instead of load (#913 ) * pyyaml safe_load instead of load * Update CHANGELOG	2022-02-10 11:20:27 +00:00
Marcin Junczys-Dowmunt	f00d062189	update VERSION and CHANGELOG - Release 1.11.0	2022-02-08 08:40:33 -08:00
Marcin Junczys-Dowmunt	3cf9e83bac	resolve conflicts	2022-02-06 12:33:58 -08:00
Graeme Nail	b29cc07a95	Scorer model loading (#860 ) * Add MMAP as an option * Use io::isBin * Allow getYamlFromModel from an Item vector * ScorerWrapper can now load on to a graph from Item vector The interface IEncoderDecoder can now call graph loads directly from an Item Vector. * Translator loads model before creating scorers Scorers are created from an Item vector * Replace model-config try-catch with check using IsNull * Prefer empty vs size * load by items should be pure virtual * Stepwise forward load to encdec * nematus can load from items * amun can load from items * loadItems in TranslateService * Remove logging * Remove by filename scorer functions * Replace by filename createScorer * Explicitly provide default value for get model-mmap * CLI option for model-mmap only for translation and CPU compile * Ensure model-mmap option is CPU only * Remove move on temporary object * Reinstate log messages for model loading in Amun / Nematus * Add log messages for model loading in scorers Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>	2022-01-18 12:58:52 +00:00
Nikolay Bogoychev	e26e5b6faf	Use apple accelerate on MacOs by default (#897 )	2021-12-16 15:07:34 +00:00
Nikolay Bogoychev	e8a1a2530f	Fix AVX2+ detection on Mac (#895 ) MacOS is weird and its CPU flags are separated in two separate fields returned by the sysctl interface. To get around this, we need to test both of them, so here goes Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>	2021-12-07 17:47:33 +00:00
Marcin Junczys-Dowmunt	bbc673c50f	update CHANGELOG and VERSION	2021-11-24 18:42:14 -08:00
Nikolay Bogoychev	ab6b826083	Add GCC 11 support (#888 ) * Add GCC 11 support Some C++ Standard Library headers have been changed to no longer include other headers that they do need to depend on. As such, C++ programs that used standard library components without including the right headers will no longer compile. The following headers are used less widely in libstdc++ and may need to be included explicitly when compiled with GCC 11: <limits> (for std::numeric_limits) <memory> (for std::unique_ptr, std::shared_ptr etc.) <utility> (for std::pair, std::tuple_size, std::index_sequence etc.) <thread> (for members of namespace std::this_thread.) Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>	2021-11-23 10:13:29 +00:00
Nikolay Bogoychev	1adf80b7c9	Task alias validation during training mode (#886 ) * Attempt to validate task alias * Validate allowed options for --task alias * Update comment in aliases.cpp * Show allowed values for alias Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>	2021-11-22 19:19:58 +00:00
David Meikle	3b4e943cda	Added pragma to ignore unused-private-field error on elementType_ which failed in macOS (#872 ) Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>	2021-11-22 12:22:06 +00:00
Kenneth Heafield	4dd30b5065	Factor concatenation improvements and documentation (#748 ) * concatenation combining option added when embedding using factors * crossMask not used by default * added an option to better clarify when choosing factor predictor options * fixed bug when choosing re-embedding option and not setting embedding size * avoid unnecessary string copy * Check in factors documentation * Fix duplication in merge * Self-referential repository * change --factors-predictor to --lemma-dependency. Default behaviour changed. * factor related options are now stored with the model * Update doc/factors.md * add backward compatibility for the target factors * Move backward compatibility checks for factors to happen after the model.npz config is loaded * Add explicit error msg if using concat on target * Update func comments. Fix spaces * Add Marian version requirement * delete experimental code Co-authored-by: Pedro Coelho <pedrodiascoelho97@gmail.com> Co-authored-by: Pedro Coelho <pedro.coelho@unbabel.com> Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>	2021-09-08 14:02:21 +01:00
Rohit Jain	056c4bef5b	Merged PR 19860: Case augmented data, if not using factored vocab must not set guided alignments This change allows marking SentenceTuples as 'altered', if they were generated or modified by data augmentation internally in such a way so as to impact processing. In particular, for such sentence tuples, we do not want to try setting guided alignments if the externally provided guided alignments might no longer be correct after that alteration.	2021-07-17 23:03:16 +00:00
Martin Junczys-Dowmunt	8e88071ae8	Merged PR 19842: Adapt LSH to work with Leaf Small changes to make the LSH work with Leaf server and QuickSand.	2021-07-16 20:04:16 +00:00
Qianqian Zhu	42f0b8b74b	Binary shortlist (#856 ) Co-authored-by: Kenneth Heafield <github@kheafield.com>	2021-07-10 22:56:58 -07:00
Marcin Junczys-Dowmunt	3a478fc47d	update version and changelog	2021-07-09 13:46:18 -07:00
Martin Junczys-Dowmunt	fc0f41f24a	Merged PR 19597: Enable mpi wrapper to use size larger than MAX_INT Enable mpi wrapper to use size larger than MAX_INT.	2021-06-28 23:15:23 +00:00
Roman Grundkiewicz	6e87f16e48	Merged PR 18763: Fix adding new validation metrics with --valid-reset-stalled This fixes a bug that's been discovered recently by checking if a validator exists before resetting its stalled validations. Regression test for it is in: https://github.com/marian-nmt/marian-regression-tests/pull/80	2021-05-26 06:12:33 +00:00
Marcin Junczys-Dowmunt	3133a9b27b	resolve conflict	2021-05-24 11:19:20 -07:00
Marcin Junczys-Dowmunt	84a20f65a1	Merge branch 'master' into pmaster	2021-05-24 11:17:53 -07:00
Marcin Junczys-Dowmunt	8b818b7c07	Avoid Ampere misalignment issue	2021-05-17 13:25:13 -07:00
Nikolay Bogoychev	379212b75c	Enable compute86 where supported (#863 ) * Enable compute86 where supported	2021-05-04 12:36:10 +01:00
Kenneth Heafield	36b4b69d7b	Remove unused memoized_ variable (#852 )	2021-04-28 13:28:50 +01:00
Roman Grundkiewicz	49e379bba5	Merged PR 18612: Early stopping on first, all, or any validation metrics Adds `--early-stopping-on first\|all\|any` allowing to decide if early stopping should take into account only first, all, or any validation metrics. Feature request: https://github.com/marian-nmt/marian-dev/issues/850 Regression tests: https://github.com/marian-nmt/marian-regression-tests/pull/79	2021-04-26 11:51:43 +00:00
Marcin Junczys-Dowmunt	309bd748ab	Merge branch 'master' of github.com:marian-nmt/marian-dev into pmaster	2021-04-21 05:13:58 +00:00
Marcin Junczys-Dowmunt	3e51ff3872	fix depth-scaling in FFN	2021-04-20 15:50:53 +00:00
Kenneth Heafield	bb6092da2b	Compute tensor size using integers (#851 )	2021-04-14 08:48:51 -07:00
Marcin Junczys-Dowmunt	ed29048004	Merge branch 'master' of vs-ssh.visualstudio.com:v3/machinetranslation/Marian/marian-dev	2021-04-11 04:29:46 +00:00
Marcin Junczys-Dowmunt	ea55722372	Merge branch 'pmaster'	2021-04-11 04:29:17 +00:00
huangjq0617	a7c3a0b2ef	fix beam_search ABORT when enable openmp and OMP_NUM_THREADS > 1 (#767 )	2021-04-10 21:28:04 -07:00
Martin Junczys-Dowmunt	caddad90cd	Merged PR 18505: RMSNorm on GPU Support for RMSNorm as drop-in replace for LayerNorm from _Biao Zhang; Rico Sennrich (2019). Root Mean Square Layer Normalization_. Enabled in Transformer model via `--transformer-postprocess dar` instead of `dan`.	2021-04-10 15:28:38 +00:00
Marcin Junczys-Dowmunt	6435c6f1ce	synced with public master	2021-04-09 16:12:34 +00:00
Marcin Junczys-Dowmunt	be65065623	Allow to choose fine-grained CPU intrinsics on as CMake options (#849 ) * allow to choose fine-grained CPU intrinsics on as CMake options * inform user that e.g. -DCOMPILE_AVX2=off will be ignored with -march=native if there is compiler support	2021-04-09 09:02:34 -07:00
rhenry-nv	fddd0e0661	Adds better Affine support for GPUs when using CUDA 11. Introduces a new bias addition kernel for CUDA < 11 (#778 ) Co-authored-by: Marcin Junczys-Dowmunt <marcinjd@microsoft.com>	2021-04-08 21:46:27 -07:00


218 Commits