Added `--valid-reset-all`, which works like `--valid-reset-stalled` but also resets the last best saved validation metrics; this is useful when the validation sets change for continued training.
Added new regression test: https://github.com/marian-nmt/marian-regression-tests/pull/89
* This adds GPU-side hashing of tensors (a hash based on MurmurHash3).
* The hash is used to check whether parameters have diverged across nodes; if so, all parameters and optimizer shards are resynced. Previously a resync happened every N (100 or 200) updates; now it can be skipped when nothing has diverged.
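A minimal sketch of this divergence check, assuming a hypothetical `hashTensor` helper (here a trivial CPU stand-in for the actual GPU-side MurmurHash3 kernel); this is not Marian's API:

```cpp
// Sketch only, not Marian's actual code: detect parameter divergence across
// MPI ranks via a cheap hash and resync only when the hashes disagree.
#include <cstddef>
#include <cstdint>
#include <mpi.h>

// Stand-in for the GPU-side MurmurHash3-based tensor hash (hypothetical name);
// a trivial CPU hash is used here so the sketch is self-contained.
uint64_t hashTensor(const float* data, size_t size) {
  uint64_t h = 1469598103934665603ull;  // FNV-1a style placeholder
  const unsigned char* p = reinterpret_cast<const unsigned char*>(data);
  for (size_t i = 0; i < size * sizeof(float); ++i)
    h = (h ^ p[i]) * 1099511628211ull;
  return h;
}

void resyncParametersAndOptimizerShards() { /* re-broadcast parameters and shards */ }

void maybeResync(const float* params, size_t size, MPI_Comm comm) {
  uint64_t localHash = hashTensor(params, size);
  uint64_t minHash = 0, maxHash = 0;
  // If all ranks hold identical parameters, the min and max of the hashes agree.
  MPI_Allreduce(&localHash, &minHash, 1, MPI_UINT64_T, MPI_MIN, comm);
  MPI_Allreduce(&localHash, &maxHash, 1, MPI_UINT64_T, MPI_MAX, comm);
  if (minHash != maxHash)
    resyncParametersAndOptimizerShards();  // pay for the expensive resync only when needed
}
```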
Enables loading of model checkpoints via MPI from the main node only.
Until now, the checkpoint needed to be present in the same location on all nodes. That required either writing to a shared filesystem (problematic due to unreliable syncing) or manually copying the checkpoint to the same local location on each node, e.g. /tmp (while writing only happened to one main location).
Now Marian can resume training from a checkpoint that exists only in one location on the main node; the remaining nodes do not need access to it. For example, the local /tmp on the main node can be used, and race conditions on shared storage are avoided.
This also avoids creating log files on more than one node. That part is a bit wonky, as it is done via an environment-variable lookup.
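A minimal sketch of the broadcast-from-the-main-node pattern described above, using plain MPI calls; names and structure are illustrative, not Marian's actual checkpoint-loading code:

```cpp
// Sketch only: rank 0 reads the checkpoint from its local disk and broadcasts
// the raw bytes to all other ranks, which never touch the filesystem.
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>
#include <mpi.h>

std::vector<char> loadCheckpointOnMainNode(const std::string& path, MPI_Comm comm) {
  int rank = 0;
  MPI_Comm_rank(comm, &rank);

  std::vector<char> bytes;
  uint64_t size = 0;
  if (rank == 0) {  // only the main node needs access to the file
    std::ifstream in(path, std::ios::binary);
    bytes.assign(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
    size = bytes.size();
  }
  MPI_Bcast(&size, 1, MPI_UINT64_T, 0, comm);  // tell everyone how much to expect
  bytes.resize(size);
  // Note: a real implementation would broadcast in chunks for files > 2 GB,
  // since the MPI count argument is an int.
  MPI_Bcast(bytes.data(), (int)size, MPI_BYTE, 0, comm);
  return bytes;  // each rank can now deserialize the model/optimizer state from memory
}
```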
This PR introduces small fixes around fp16 training and batch fitting:
* Multi-loss casts to the type of the first loss before accumulation (previously this aborted due to a missing cast).
* Throws `ShapeSizeException` if the total expanded shape size exceeds the numeric capacity of the maximum int value (2^31-1).
* During mini-batch fitting, `ShapeSizeException` is caught and another sizing hint is used (see the sketch after this list); outside mini-batch fitting it aborts.
* A negative `--workspace -N` value allocates the workspace as the total available GPU memory minus N megabytes.
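A rough sketch of the overflow guard and how mini-batch fitting can react to it; the exception name follows the description above, the rest of the code is hypothetical:

```cpp
// Sketch only: compute the number of elements of an expanded shape in 64 bit
// and throw if it no longer fits into a signed 32-bit int.
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <string>
#include <vector>

class ShapeSizeException : public std::runtime_error {
public:
  explicit ShapeSizeException(int64_t size)
    : std::runtime_error("expanded shape size " + std::to_string(size) + " exceeds 2^31-1") {}
};

int64_t checkedElements(const std::vector<int>& dims) {
  int64_t total = 1;
  for (int d : dims)
    total *= d;  // accumulate in 64 bit to avoid silent overflow
  if (total > std::numeric_limits<int32_t>::max())
    throw ShapeSizeException(total);
  return total;
}

// During mini-batch fitting one would catch the exception and retry smaller:
// try { checkedElements(dims); } catch (const ShapeSizeException&) { /* shrink hint, retry */ }
```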
This PR sets the default parameters for cost-scaling to `8.f 10000 1.f 8.f`, i.e. when scaling, scale by 8 and do not try to scale up or down automatically. This seems more stable than the variable cost-scaling with larger factors that was the default before.
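For context, fixed cost-scaling (loss scaling) in low-precision training works roughly as in the toy example below; this is a generic illustration, not Marian's implementation:

```cpp
// Toy example only: fixed cost-scaling by 8. The loss is scaled up before the
// backward pass so small gradients do not underflow in fp16, and the gradients
// are scaled back down before the optimizer update.
#include <cstdio>

int main() {
  const float costScale = 8.f;                   // fixed factor, no automatic up/down scaling
  float w = 0.5f, x = 2.f, target = 3.f, lr = 0.1f;

  float pred = w * x;                            // forward pass of a one-weight "model"
  float loss = 0.5f * (pred - target) * (pred - target);
  float scaledLoss = loss * costScale;           // scale the cost before backprop
  (void)scaledLoss;

  float grad = (pred - target) * x * costScale;  // gradient of the *scaled* loss w.r.t. w
  grad /= costScale;                             // unscale before the optimizer step
  w -= lr * grad;
  std::printf("updated weight: %.4f\n", w);
  return 0;
}
```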
This PR fixes case augmentation with multi-threaded reading. The solution is to not read `iterator::pos_` during lazy processing, but to pass it as an argument to the lazy function instead.
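A minimal illustration of that pattern with hypothetical names: the buggy variant reads the shared, mutable `pos_` member when the lazy function finally runs, while the fixed variant copies the position into the closure when the function is created:

```cpp
// Sketch only: capture the position by value when building the lazy callback
// instead of reading a shared, mutable member when the callback finally runs.
#include <cstddef>
#include <functional>

struct Iterator {
  size_t pos_ = 0;

  // Buggy variant: the lambda reads pos_ when it is *called*, which may have
  // changed if other threads advanced the iterator in the meantime.
  std::function<int()> lazyBad() {
    return [this]() { return process(pos_); };
  }

  // Fixed variant: the position is copied into the closure at creation time,
  // so later changes to the iterator cannot affect the lazy computation.
  std::function<int()> lazyGood() {
    size_t pos = pos_;
    return [this, pos]() { return process(pos); };
  }

  int process(size_t pos) { return (int)pos; }  // stand-in for case augmentation
};
```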
This replaces dense alignment storage and training with a sparse representation. Training speed with guided alignment now nearly matches normal training speed, regaining about 25% of the speed.
This is the first of two PRs; the next one will introduce a new guided-alignment training scheme with better alignment accuracy.
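As an illustration of the representation change (not Marian's actual data structures), a sparse alignment stores only the aligned (source, target) pairs instead of a mostly-zero source-length x target-length matrix:

```cpp
// Sketch only: dense vs. sparse storage of a word alignment.
#include <cstddef>
#include <vector>

struct AlignmentPoint {
  size_t srcPos;
  size_t tgtPos;
  float  prob;  // alignment weight, 1.f for hard alignments
};

// Dense: srcLen * tgtLen floats, almost all of them zero.
using DenseAlignment = std::vector<float>;

// Sparse: one entry per aligned pair, typically O(tgtLen) entries.
using SparseAlignment = std::vector<AlignmentPoint>;

SparseAlignment toSparse(const DenseAlignment& dense, size_t srcLen, size_t tgtLen) {
  SparseAlignment sparse;
  for (size_t s = 0; s < srcLen; ++s)
    for (size_t t = 0; t < tgtLen; ++t)
      if (dense[s * tgtLen + t] > 0.f)
        sparse.push_back({s, t, dense[s * tgtLen + t]});
  return sparse;
}
```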
* Add GCC 11 support
Some C++ Standard Library headers have been changed to no longer include other headers that they do not need to depend on. As a result, C++ programs that used standard library components without including the right headers will no longer compile.
The following headers are used less widely in libstdc++ and may need to be included explicitly when compiled with GCC 11:
* `<limits>` (for `std::numeric_limits`)
* `<memory>` (for `std::unique_ptr`, `std::shared_ptr`, etc.)
* `<utility>` (for `std::pair`, `std::tuple_size`, `std::index_sequence`, etc.)
* `<thread>` (for members of namespace `std::this_thread`)
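For example, code that previously relied on `<limits>` being pulled in transitively needs the explicit include under GCC 11:

```cpp
// With GCC 11 this needs the explicit include; previously <limits> was often
// dragged in transitively by other standard headers.
#include <limits>   // for std::numeric_limits

int maxInt() {
  return std::numeric_limits<int>::max();
}
```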
Support for RMSNorm as a drop-in replacement for LayerNorm from _Biao Zhang; Rico Sennrich (2019). Root Mean Square Layer Normalization_. Enabled in the Transformer model via `--transformer-postprocess dar` instead of `dan`.
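For reference, RMSNorm as defined in the cited paper rescales the input by its root mean square and a learned gain, dropping the mean-centering and bias of LayerNorm:

```latex
% RMSNorm (Zhang & Sennrich, 2019): no mean subtraction, no bias,
% only rescaling by the root mean square and a learned gain g.
\mathrm{RMS}(x) = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}, \qquad
\mathrm{RMSNorm}(x)_i = \frac{x_i}{\mathrm{RMS}(x)}\, g_i
```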
This PR adds cleaner suppression of unwanted output words. We identified a situation where SPM with byte-fallback can generate random bytes during output sampling.
That is particularly harmful when such a random byte happens to be a newline symbol. Here we suppress newlines in the output unless they are explicitly requested.