marian

mirror of https://github.com/marian-nmt/marian.git synced 2024-11-04 14:04:24 +03:00

Author	SHA1	Message	Date
Roman Grundkiewicz	b6581c4c44	Merged PR 26667: Update examples submodule to fix vulnerability issues Updating examples submodule using [protobuf 3.20.2](https://github.com/marian-nmt/marian-examples/pull/29) to fix recent [vulnerability issues](https://machinetranslation.visualstudio.com/MachineTranslation/_componentGovernance/mtmain/alert/8035094?typeId=14698327&pipelinesTrackingFilter=0). Related work items: #134319	2022-11-23 19:16:44 +00:00
Roman Grundkiewicz	c79dc80a2f	Merged PR 26617: Update regression-tests & fix CI pipelines Update regression-tests & fix CI pipelines	2022-11-20 13:31:10 +00:00
Roman Grundkiewicz	be1ee3fa94	Merged PR 26318: Fix incorrect envvar name in Azure Pipeline Fix incorrect environment variable name for SAS token in Windows tests	2022-11-01 10:07:40 +00:00
Roman Grundkiewicz	a6de1b781c	Merged PR 26271: Update CI pipeline triggers Updates to the CI triggers: - Stop running parallel CI runs, i.e. if a pipeline is running, it must finish before new runs are started. - Exclude paths to files that are not related or critical to the codebase - Download MKL from a mirror hosting server	2022-11-01 06:26:56 +00:00
Marcin Junczys-Dowmunt	4d3702c4ec	Merged PR 25950: Add missing defaults for concatenated factors This PR adds missing default values for concatenated factors.	2022-10-06 05:53:16 +00:00
Marcin Junczys-Dowmunt	1e92cff93d	Merged PR 25919: Sync with public master - no review required Sync with public master, checking compilation, regression tests etc.	2022-10-04 00:42:52 +00:00
Marcin Junczys-Dowmunt	2c55cdb3c0	Merged PR 25889: Fixes bad memory access problem in hashing Fix bad memory access problem in hashing by using the graph allocator	2022-09-29 19:01:49 +00:00
Marcin Junczys-Dowmunt	2cd3055d76	Merged PR 25836: Check via hashing if re-syncing in local mode is required * This adds GPU-side hashing to tensors (a hash based on MurmurHash3) * The hash is used to check if parameters across nodes have diverged; if so, all parameters and optimizer shards are resynced. Previously, it would resync every N (100 or 200) updates. Now this can be skipped if nothing diverged.	2022-09-27 18:40:53 +00:00
Marcin Junczys-Dowmunt	1f2929d528	Merged PR 25733: Fused inplace ReLU and Dropout in transformer FFN layer * First attempt at fused inplace ReLU and Dropout in transformer FFN layer * Adds optional output projection to SSRU. For large FFN blocks and dropout, this gives about a 20-25% speed improvement during training.	2022-09-26 20:17:33 +00:00
Marcin Junczys-Dowmunt	cfc33f5498	only use tcmalloc_minimal	2022-09-22 15:11:33 -07:00
Marcin Junczys-Dowmunt	7d2045a907	Merged PR 25686: Loading checkpoints from main node only via MPI Enables loading of model checkpoints from main node only via MPI. Until now the checkpoint needed to be present in the same location on all nodes. That could be done either via writing to a shared filesystem (problematic due to bad syncing) or by manual copying to the same local location, e.g. /tmp on each node (while writing only happened to one main location). Now, marian can resume training from only one location on the main node. The remaining nodes do not need to have access. E.g. local /tmp on the main node can be used, or race conditions on shared storage are avoided. Also avoids creating files for logging on more than one node. This is a bit wonky, done via environment variable lookup.	2022-09-21 20:39:54 +00:00
Marcin Junczys-Dowmunt	76964791ad	Merged PR 23767: More principled sampling and force-decoding This PR adds correct force-decoding and more principled sampling, both should now work for ensembles, batches and with beam search.	2022-09-16 22:53:08 +00:00
Roman Grundkiewicz	e13053a6f2	Merged PR 25698: Install Python 3.8 on GPU pool Python >= 3.8 is required for numpy >= 1.22, which is the minimum version without vulnerability issues.	2022-09-16 09:30:10 +00:00
Roman Grundkiewicz	6f7766f837	Merged PR 25465: Choose top checkpoints from train.log for averaging Added `--from-log logfile N metric asc|desc` option to `average.py`, which selects top N checkpoint paths from the provided train.log file according to the selected metric. Last 3 arguments to this option are optional. If the last argument is omitted, "asc" is assumed for perplexity and "desc" for other metrics.	2022-09-15 06:19:18 +00:00
Roman Grundkiewicz	a47912d9f1	Merged PR 25518: Upgrade Azure Pipelines to macos-12 macos-10.15 will become unsupported in December 2022. Changes: * Upgrade Azure DevOps to macos-12 * Pull https://github.com/marian-nmt/sentencepiece/pull/14 * Fix clang 13 errors as in https://github.com/marian-nmt/marian-dev/pull/939	2022-09-15 06:18:42 +00:00
Roman Grundkiewicz	5d466bc367	Merged PR 25507: Upgrade Azure Pipelines to ubuntu-20.04 Ubuntu-18.04 will not be supported after October 2022.	2022-09-02 05:55:20 +00:00
Alex Muzio	a90950ea25	Merged PR 25154: Add model shapes flag to model_info.py script Add model shapes flag to model_info.py script through `--matrix-shapes` flag This will print something like: ``` ... encoder_l6_ffn_W1 (1024, 4096) encoder_l6_ffn_W2 (4096, 1024) encoder_l6_ffn_b1 (1, 4096) encoder_l6_ffn_b2 (1, 1024) encoder_l6_ffn_ffn_ln_bias (1, 1024) encoder_l6_ffn_ffn_ln_scale (1, 1024) encoder_l6_self_Wk (1024, 1024) encoder_l6_self_Wo (1024, 1024) encoder_l6_self_Wo_ln_bias (1, 1024) encoder_l6_self_Wo_ln_scale (1, 1024) encoder_l6_self_Wq (1024, 1024) encoder_l6_self_Wv (1024, 1024) encoder_l6_self_bk (1, 1024) encoder_l6_self_bo (1, 1024) encoder_l6_self_bq (1, 1024) encoder_l6_self_bv (1, 1024) special:model.yml (1264,) ```	2022-08-10 22:23:47 +00:00
Roman Grundkiewicz	c5081df93f	Merged PR 24111: Remove external reference to Docker images The reference to docker.io triggers a security warning (https://eng.ms/docs/more/containers-secure-supply-chain) that makes our pipelines flash orange, which hides the real status of regression testing. This PR simply replaces the external reference with an internal mirror (https://eng.ms/docs/more/containers-secure-supply-chain/approved-images).	2022-05-31 15:31:39 +00:00
Marcin Junczys-Dowmunt	042ed8f2e2	Merged PR 24072: Revert changes to transformer caching This PR reverts changes to transformer caching (public PR https://github.com/marian-nmt/marian-dev/pull/881) It seems to cause catastrophic memory leaks or incorrect de-allocation during decoding.	2022-05-30 07:27:15 +00:00
Marcin Junczys-Dowmunt	f3e1efe731	merge with internal master	2022-05-26 06:28:06 -07:00
Graeme Nail	95720ae19f	Update NVIDIA CUDA signing key for CI; fix for building docs (#932) * Update NVIDIA CUDA signing key for CI * Constrain Jinja2 to build docs	2022-05-18 11:11:28 +01:00
Roman Grundkiewicz	704a323142	Merged PR 22799: Running regression tests on Azure Pipelines This PR adds an Azure Pipeline for running regression tests on an Azure Hosted GPU Pool. It currently runs on Ubuntu 18.04, GCC 8, CUDA 11.1, a single Nvidia M60 GPU device (Maxwell). The pipeline needs to be started manually: go to "Pipelines", then "Marian GPU Pool", click "Run pipeline", select the branch, click "Run".	2022-05-13 07:30:36 +00:00
Roman Grundkiewicz	e0e3287a3b	Merged PR 23840: Update CUDA installation script for Ubuntu Updates CUDA deb/key fetching https://developer.nvidia.com/blog/updating-the-cuda-linux-gpg-repository-key/	2022-05-12 16:23:58 +00:00
Marcin Junczys-Dowmunt	e4f3d0f740	add fallback option for sampling, for back-compat	2022-05-09 13:28:28 -07:00
Marcin Junczys-Dowmunt	1a74358277	Merged PR 23429: Small fixes around fp16 training and batch fitting This PR introduces small fixes around fp16 training and batch fitting: * Multi-loss casts type to first loss-type before accumulation (aborted before due to missing cast) * Throw `ShapeSizeException` if total expanded shape size exceeds numeric capacity of the maximum int value (2^31-1) * During mini-batch-fitting, catch `ShapeSizeException` and use another sizing hint. Aborts outside mini-batch-fitting. * Negative `--workspace -N` value allocates workspace as total available GPU memory minus N megabytes.	2022-04-11 20:19:58 +00:00
Roman Grundkiewicz	1e4e1014ed	Merged PR 23415: Set Windows image back to windows-2019 This should resolve latest issues with Windows checks.	2022-04-08 17:15:56 +00:00
Marcin Junczys-Dowmunt	d5c7372a67	Merged PR 23407: Fix incorrect/missing gradient accumulation for affine biases This PR fixes incorrect/missing gradient accumulation for biases of affine operations with delay > 1 or a large effective batch size.	2022-04-08 16:00:04 +00:00
Artur Nowakowski	23c36ec1a3	Fixed fp16 training/inference with factors-combine concat (#926)	2022-03-22 10:07:41 +00:00
dependabot[bot]	78bef7aeba	Bump src/3rd_party/sentencepiece from `c307b87` to `5312a30` (#927) Bumps [src/3rd_party/sentencepiece](https://github.com/marian-nmt/sentencepiece) from `c307b87` to `5312a30`. - [Release notes](https://github.com/marian-nmt/sentencepiece/releases) - [Commits](`c307b874de...5312a306c4`) --- updated-dependencies: - dependency-name: src/3rd_party/sentencepiece dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-03-22 10:06:11 +00:00
dependabot[bot]	75a7a1dfd2	Bump regression-tests from `88e6382` to `4fa9ff5` (#929) Bumps [regression-tests](https://github.com/marian-nmt/marian-regression-tests) from `88e6382` to `4fa9ff5`. - [Release notes](https://github.com/marian-nmt/marian-regression-tests/releases) - [Commits](`88e6382241...4fa9ff55af`) --- updated-dependencies: - dependency-name: regression-tests dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-03-22 08:40:11 +00:00
dependabot[bot]	c809843f14	Bump examples from `6d5921c` to `29f4f7c` (#928) Bumps [examples](https://github.com/marian-nmt/marian-examples) from `6d5921c` to `29f4f7c`. - [Release notes](https://github.com/marian-nmt/marian-examples/releases) - [Commits](`6d5921cc7d...29f4f7c380`) --- updated-dependencies: - dependency-name: examples dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-03-22 08:38:30 +00:00
Marcin Junczys-Dowmunt	16bfa0c913	Merged PR 23094: Adapt --cost-scaling to more stable setting This PR sets default parameters for cost-scaling to 8.f 10000 1.f 8.f, i.e. when scaling, scale by 8 and do not try to automatically scale up or down. This seems more stable than variable cost-scaling with larger numbers that was the default before.	2022-03-16 14:44:17 +00:00
Marcin Junczys-Dowmunt	310d2f42f6	Merged PR 22939: Fix case augmentation with multi-threaded reading This PR fixes case augmentation with multi-threaded reading. The solution is to not look at iterator::pos_ in lazy processing, rather pass it as an argument to the lazy function.	2022-03-07 16:57:32 +00:00
Marcin Junczys-Dowmunt	adaaf087e4	better error message	2022-02-16 13:20:48 -08:00
Graeme Nail	601c9ac980	Detect fortran_order in npz (#911) * Fix fortran_order parsing * Abort on non row-major NPZ entries * Update CHANGELOG * Update VERSION Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>	2022-02-15 13:22:49 +00:00
dependabot[bot]	58c4576e5d	Bump regression-tests from `da95717` to `88e6382` (#923) Bumps [regression-tests](https://github.com/marian-nmt/marian-regression-tests) from `da95717` to `88e6382`. - [Release notes](https://github.com/marian-nmt/marian-regression-tests/releases) - [Commits](`da95717d41...88e6382241`) --- updated-dependencies: - dependency-name: regression-tests dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-02-15 11:21:14 +00:00
Nikolay Bogoychev	8a9580b329	update the intgemm version to upstream (#924) Some data types got upper cased, that's why there is a larger diff than expected Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>	2022-02-15 11:18:29 +00:00
Marcin Junczys-Dowmunt	b8bf086b10	move regression-tests pointer	2022-02-11 06:04:38 -08:00
Marcin Junczys-Dowmunt	b0275e7754	merge with internal master	2022-02-11 06:03:16 -08:00
Marcin Junczys-Dowmunt	4b51dcbd06	Merged PR 22524: Optimize guided alignment training speed via sparse alignments - part 1 This replaces dense alignment storage and training with a sparse representation. Training speed with guided alignment now nearly matches normal training speed, regaining about 25% speed. This is no. 1 of 2 PRs. The next one will introduce a new guided-alignment training scheme with better alignment accuracy.	2022-02-11 13:50:47 +00:00
Marcin Junczys-Dowmunt	3b21ff39c5	update VERSION and CHANGELOG	2022-02-10 08:35:49 -08:00
Marcin Junczys-Dowmunt	b3feecc82b	Merged PR 22483: Make C++17 the official standard for Marian Make C++17 the official standard for Marian	2022-02-10 16:34:23 +00:00
Marcin Junczys-Dowmunt	e6dbacb310	Merged PR 22490: Faster LSH top-k for CPU This PR replaces the top-k search from FAISS on the CPU with a more specialized version for discrete distances in sub-linear time.	2022-02-10 16:30:21 +00:00
dependabot[bot]	8fd553e582	Bump examples from `6d5921c` to `0ca966e` (#919) Bumps [examples](https://github.com/marian-nmt/marian-examples) from `6d5921c` to `0ca966e`. - [Release notes](https://github.com/marian-nmt/marian-examples/releases) - [Commits](`6d5921cc7d...0ca966eadd`) --- updated-dependencies: - dependency-name: examples dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-02-10 14:03:37 +00:00
Roman Grundkiewicz	17e55f5a7d	Update VERSION	2022-02-10 11:20:47 +00:00
Graeme Nail	4d44627f26	PyYaml safe_load instead of load (#913) * pyyaml safe_load instead of load * Update CHANGELOG	2022-02-10 11:20:27 +00:00
dependabot[bot]	a492bc57d2	Bump regression-tests from `0716f4e` to `f7971b7` (#918) Bumps [regression-tests](https://github.com/marian-nmt/marian-regression-tests) from `0716f4e` to `f7971b7`. - [Release notes](https://github.com/marian-nmt/marian-regression-tests/releases) - [Commits](`0716f4e012...f7971b790a`) --- updated-dependencies: - dependency-name: regression-tests dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-02-10 10:28:04 +00:00
Roman Grundkiewicz	73f1899307	Add dependabot for git submodules (#916)	2022-02-10 10:25:08 +00:00
Roman Grundkiewicz	b97645846a	Update release workflow (#915) * Add CUDA 11.x to Windows installation script * Update release.yml workflow	2022-02-09 18:56:56 +00:00
Graeme Nail	bcf29b8cd2	Update acknowledgements (#914)	2022-02-09 17:05:48 +00:00
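
The message for PR 25836 in the log above describes hashing parameter tensors and resyncing only when nodes have diverged. The following is a minimal sketch of that idea in Python, not Marian's implementation: it uses SHA-256 from the standard library as a stand-in for the GPU-side MurmurHash3-based hash, and `tensor_hash`, `needs_resync` and the plain lists standing in for per-node parameters are hypothetical names chosen for illustration.

```python
import hashlib
import struct


def tensor_hash(values):
    """Hash a list of floats via their packed little-endian float32 bytes.

    SHA-256 is an assumption made for this sketch; the commit message
    mentions a MurmurHash3-based hash computed on the GPU.
    """
    packed = struct.pack(f"<{len(values)}f", *values)
    return hashlib.sha256(packed).hexdigest()


def needs_resync(per_node_params):
    """Return True if any node's parameters hash differently from node 0's."""
    hashes = [tensor_hash(params) for params in per_node_params]
    return any(h != hashes[0] for h in hashes[1:])


if __name__ == "__main__":
    in_sync = [[0.5, -1.25, 3.0], [0.5, -1.25, 3.0]]
    diverged = [[0.5, -1.25, 3.0], [0.5, -1.25, 3.5]]
    print(needs_resync(in_sync))   # False: identical parameters, skip resync
    print(needs_resync(diverged))  # True: one value differs, resync needed
```

Comparing one short digest per node is much cheaper than exchanging full parameter sets, which is presumably why the periodic resync can be skipped when the hashes agree.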


4905 Commits
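
Two of the behaviours listed under PR 23429 in the log above are simple enough to sketch: the negative `--workspace -N` rule and the shape size check against 2^31-1. The Python below is an illustration under stated assumptions, not Marian's C++ code. The function names are hypothetical, the Python `ShapeSizeException` class only mirrors the exception named in the commit message, and treating a megabyte as 2^20 bytes is an assumption.

```python
import math

MB = 1024 * 1024          # assumption: one megabyte = 2**20 bytes
INT_MAX = 2**31 - 1       # limit quoted in the commit message


class ShapeSizeException(Exception):
    """Python stand-in for the exception named in the commit message."""


def workspace_bytes(workspace_mb, available_bytes):
    """Resolve a --workspace value (in MB) to a byte count.

    A non-negative value is taken as the workspace size itself; a negative
    value -N means "available memory minus N megabytes".
    """
    if workspace_mb >= 0:
        return workspace_mb * MB
    remaining = available_bytes + workspace_mb * MB
    if remaining <= 0:
        raise ValueError("reserved memory exceeds available memory")
    return remaining


def checked_size(shape):
    """Return the number of elements, raising if it exceeds INT_MAX."""
    total = math.prod(shape)
    if total > INT_MAX:
        raise ShapeSizeException(f"shape {shape} has {total} elements")
    return total


if __name__ == "__main__":
    print(workspace_bytes(-2, 10 * MB) // MB)  # 8
    print(checked_size([2, 3, 4]))             # 24
```

According to the same commit message, mini-batch fitting catches the exception and tries another sizing hint, while outside mini-batch fitting it aborts.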