Commit Graph

4870 Commits

Author SHA1 Message Date
dependabot[bot]
58c4576e5d
Bump regression-tests from da95717 to 88e6382 (#923)
Bumps [regression-tests](https://github.com/marian-nmt/marian-regression-tests) from `da95717` to `88e6382`.
- [Release notes](https://github.com/marian-nmt/marian-regression-tests/releases)
- [Commits](da95717d41...88e6382241)

---
updated-dependencies:
- dependency-name: regression-tests
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-02-15 11:21:14 +00:00
Nikolay Bogoychev
8a9580b329
update the intgemm version to upstream (#924)
Some data types got upper cased, that's why there is a larger diff than expected

Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>
2022-02-15 11:18:29 +00:00
Marcin Junczys-Dowmunt
b8bf086b10 move regression-tests pointer 2022-02-11 06:04:38 -08:00
Marcin Junczys-Dowmunt
b0275e7754 merge with internal master 2022-02-11 06:03:16 -08:00
Marcin Junczys-Dowmunt
4b51dcbd06 Merged PR 22524: Optimize guided alignment training speed via sparse alignments - part 1
This replaces dense alignment storage and training with a sparse representation. Training speed with guided alignment now nearly matches normal training speed, regaining about 25% speed.

This is no. 1 of 2 PRs. The next one will introduce a new guided-alignment training scheme with better alignment accuracy.
2022-02-11 13:50:47 +00:00
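As an illustration of the dense-versus-sparse distinction described in PR 22524, here is a minimal sketch, not Marian's actual implementation. It assumes an alignment can be viewed as a source-by-target matrix that is mostly zeros; the function names and the (src, tgt, weight) triple layout are hypothetical.

```python
from typing import List, Tuple

# Hypothetical layout: one (source index, target index, weight) triple per alignment point.
AlignmentPoint = Tuple[int, int, float]


def dense_to_sparse(dense: List[List[float]]) -> List[AlignmentPoint]:
    """Keep only the non-zero cells of a dense source-by-target alignment matrix."""
    return [
        (s, t, w)
        for s, row in enumerate(dense)
        for t, w in enumerate(row)
        if w != 0.0
    ]


def sparse_to_dense(points: List[AlignmentPoint], src_len: int, tgt_len: int) -> List[List[float]]:
    """Rebuild the dense matrix from the triples; every unlisted cell is zero."""
    dense = [[0.0] * tgt_len for _ in range(src_len)]
    for s, t, w in points:
        dense[s][t] = w
    return dense


# A 4x3 alignment with three aligned word pairs: 12 dense cells, 3 sparse triples.
dense = [
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
]
sparse = dense_to_sparse(dense)
```

In this toy layout, storage and any per-point computation scale with the number of alignment points rather than with source length times target length.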
Marcin Junczys-Dowmunt
3b21ff39c5 update VERSION and CHANGELOG 2022-02-10 08:35:49 -08:00
Marcin Junczys-Dowmunt
b3feecc82b Merged PR 22483: Make C++17 the official standard for Marian
Make C++17 the official standard for Marian
2022-02-10 16:34:23 +00:00
Marcin Junczys-Dowmunt
e6dbacb310 Merged PR 22490: Faster LSH top-k for CPU
This PR replaces the top-k search from FAISS on the CPU with a more specialized version for discrete distances in sub-linear time.
2022-02-10 16:30:21 +00:00
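To illustrate why discrete distances allow a specialized top-k search, the sketch below buckets candidates by integer Hamming distance instead of running a comparison-based selection. It is not the algorithm from PR 22490 and does not reproduce its sub-linear bound: this version scans every code once. The function names and the representation of hash codes as Python integers are assumptions.

```python
from typing import List, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded hash codes."""
    return bin(a ^ b).count("1")


def top_k_hamming(query: int, codes: List[int], k: int, nbits: int) -> List[Tuple[int, int]]:
    """Return up to k (index, distance) pairs with the smallest Hamming distance.

    Distances can only take the values 0..nbits, so candidates are placed in
    nbits + 1 buckets and read back in order. Codes are assumed to fit in nbits.
    """
    buckets: List[List[int]] = [[] for _ in range(nbits + 1)]
    for idx, code in enumerate(codes):
        buckets[hamming(query, code)].append(idx)

    result: List[Tuple[int, int]] = []
    for dist, bucket in enumerate(buckets):
        for idx in bucket:
            if len(result) == k:
                return result
            result.append((idx, dist))
    return result


codes = [0b1111, 0b0000, 0b0001, 0b0111]
nearest = top_k_hamming(0b0000, codes, k=2, nbits=4)
```

Ties within a bucket are returned in index order, which keeps the result deterministic.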
dependabot[bot]
8fd553e582
Bump examples from 6d5921c to 0ca966e (#919)
Bumps [examples](https://github.com/marian-nmt/marian-examples) from `6d5921c` to `0ca966e`.
- [Release notes](https://github.com/marian-nmt/marian-examples/releases)
- [Commits](6d5921cc7d...0ca966eadd)

---
updated-dependencies:
- dependency-name: examples
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-02-10 14:03:37 +00:00
Roman Grundkiewicz
17e55f5a7d
Update VERSION 2022-02-10 11:20:47 +00:00
Graeme Nail
4d44627f26
PyYaml safe_load instead of load (#913)
* pyyaml safe_load instead of load
* Update CHANGELOG
2022-02-10 11:20:27 +00:00
dependabot[bot]
a492bc57d2
Bump regression-tests from 0716f4e to f7971b7 (#918)
Bumps [regression-tests](https://github.com/marian-nmt/marian-regression-tests) from `0716f4e` to `f7971b7`.
- [Release notes](https://github.com/marian-nmt/marian-regression-tests/releases)
- [Commits](0716f4e012...f7971b790a)

---
updated-dependencies:
- dependency-name: regression-tests
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-02-10 10:28:04 +00:00
Roman Grundkiewicz
73f1899307
Add dependabot for git submodules (#916) 2022-02-10 10:25:08 +00:00
Roman Grundkiewicz
b97645846a
Update release workflow (#915)
* Add CUDA 11.x to Windows installation script
* Update release.yml workflow
2022-02-09 18:56:56 +00:00
Graeme Nail
bcf29b8cd2
Update acknowledgements (#914) 2022-02-09 17:05:48 +00:00
Marcin Junczys-Dowmunt
f00d062189 update VERSION and CHANGELOG - Release 1.11.0 2022-02-08 08:40:33 -08:00
Graeme Nail
8e659bb5c0
Document Structure (#910)
* Add architectural outline
* Update index
2022-02-08 10:58:09 +00:00
Marcin Junczys-Dowmunt
05ba9e4c31
add -DDETERMINISTIC=ON/OFF flag (#912)
* Add -DDETERMINISTIC=ON/OFF flag to CMake
* Use -DDETERMINISTIC=on in GitHub/Azure workflows

Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>
2022-02-08 10:57:20 +00:00
Marcin Junczys-Dowmunt
a365bb5ce9 fix server behaviour 2022-02-07 08:09:54 -08:00
Marcin Junczys-Dowmunt
aafe8fb5ca update regression tests pointer 2022-02-07 02:36:20 -08:00
Marcin Junczys-Dowmunt
3cf9e83bac resolve conflicts 2022-02-06 12:33:58 -08:00
Marcin Junczys-Dowmunt
8da539e835 merged with master 2022-02-06 12:00:48 -08:00
Roman Grundkiewicz
266b931daa
Update list of contributors (#906) 2022-01-30 20:11:38 +00:00
Roman Grundkiewicz
07c39c7d76
Cherry-picked cleaning/refactoring patches (#905)
Cherry-picked updates from pull request #457

Co-authored-by: Mateusz Chudyk <mateuszchudyk@gmail.com>
2022-01-28 14:16:41 +00:00
Qianqian Zhu
71b5454b9e
Layer documentation (#892)
* More examples for MLP layers and docs about RNN layers
* Docs about embedding layer and more doxygen code docs
* Add layer and factors docs into index.rst
* Update layer documentation
* Fix typos

Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>
Co-authored-by: Graeme Nail <graemenail.work@gmail.com>
2022-01-26 15:17:38 +00:00
Roman Grundkiewicz
3b458b044e
Update VERSION 2022-01-24 15:28:37 +00:00
Graeme Nail
894a07ad5b
Improve checks on transformer cache (#881)
* Fix caching in transformer attention
* Move hash specialization
* Swap comments to doxygen
* Include string header
2022-01-24 15:28:13 +00:00
Roman Grundkiewicz
b64e258bda
Update VERSION 2022-01-18 12:59:37 +00:00
Graeme Nail
b29cc07a95
Scorer model loading (#860)
* Add MMAP as an option
* Use io::isBin
* Allow getYamlFromModel from an Item vector
* ScorerWrapper can now load on to a graph from Item vector
The interface IEncoderDecoder can now call graph loads directly from an
Item Vector.
* Translator loads model before creating scorers
Scorers are created from an Item vector
* Replace model-config try-catch with check using IsNull
* Prefer empty vs size
* load by items should be pure virtual
* Stepwise forward load to encdec
* nematus can load from items
* amun can load from items
* loadItems in TranslateService
* Remove logging
* Remove by filename scorer functions
* Replace by filename createScorer
* Explicitly provide default value for get model-mmap
* CLI option for model-mmap only for translation and CPU compile
* Ensure model-mmap option is CPU only
* Remove move on temporary object
* Reinstate log messages for model loading in Amun / Nematus
* Add log messages for model loading in scorers

Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>
2022-01-18 12:58:52 +00:00
Roman Grundkiewicz
c84599d08a
Update VERSION 2021-12-16 15:07:55 +00:00
Nikolay Bogoychev
e26e5b6faf
Use Apple Accelerate on macOS by default (#897) 2021-12-16 15:07:34 +00:00
Nikolay Bogoychev
e8a1a2530f
Fix AVX2+ detection on Mac (#895)
macOS is weird: its CPU flags are split across two separate fields returned by the sysctl interface. To get around this, we need to test both of them.

Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>
2021-12-07 17:47:33 +00:00
Qianqian Zhu
cd9afea8d3
Documentation about how to write code documentation (#891)
* add initial guidelines of code documentation
* fix math formula not displayed in Sphinx
* remove @name tags which cannot be extracted by exhale and cause function signature errors
* fix markdown ref warning and update markdown parser in sphinx
* more about doxygen: add Doxygen commands and math formulas
* move code doc guide to a new .rst file
* add formula image
* Set myst-parser version appropriate for the requested sphinx version
* Update documentation on how to write Doxygen comments
* Add new section to the documentation index
* Sphinx 2.4.4 requires myst-parser 0.14
* complete code doc guide and small fixes on reStructuredText formats
* More about reStructuredText
* Update badges on the documentation frontpage

Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>
2021-12-07 15:10:46 +00:00
Marcin Junczys-Dowmunt
e8ea37cd5b Merged PR 21648: Allow for dynamic gradient scaling to fade out after N updates
Allow for dynamic gradient scaling to fade out after N updates
2021-12-06 23:20:44 +00:00
Graeme Nail
c64cb2990e
Constrain version of mistune to before v2 in GitHub CI Documentation builds (#894) 2021-12-06 14:06:14 +00:00
Marcin Junczys-Dowmunt
bbc673c50f update CHANGELOG and VERSION 2021-11-24 18:42:14 -08:00
Marcin Junczys-Dowmunt
8b8d1b11e2 Merged PR 21553: Parallelize data reading for training
This parallelizes data reading. On very fast GPUs and with small models, training can be starved by slow batch creation. Use --data-threads 8 or more; the default is currently 1 for backward compatibility.
2021-11-25 02:33:49 +00:00
Nikolay Bogoychev
ab6b826083
Add GCC 11 support (#888)
* Add GCC 11 support

Some C++ Standard Library headers have been changed to no longer include other headers that they do not need to depend on. As such, C++ programs that used standard library components without including the right headers will no longer compile.
The following headers are used less widely in libstdc++ and may need to be included explicitly when compiled with GCC 11:

<limits> (for std::numeric_limits)
<memory> (for std::unique_ptr, std::shared_ptr etc.)
<utility> (for std::pair, std::tuple_size, std::index_sequence etc.)
<thread> (for members of namespace std::this_thread.)

Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>
2021-11-23 10:13:29 +00:00
Nikolay Bogoychev
1adf80b7c9
Task alias validation during training mode (#886)
* Attempt to validate task alias
* Validate allowed options for --task alias
* Update comment in aliases.cpp
* Show allowed values for alias

Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>
2021-11-22 19:19:58 +00:00
Roman Grundkiewicz
3d15cd3d20 Update submodule regression-tests 2021-11-22 06:41:16 -08:00
David Meikle
3b4e943cda
Added pragma to ignore unused-private-field error on elementType_ which failed in macOS (#872)
Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>
2021-11-22 12:22:06 +00:00
Marcin Junczys-Dowmunt
c85d060848 Merged PR 20729: Add top-k sampling
This adds Top-K sampling to Marian and extends the --output-sampling option to take arguments
2021-11-22 03:32:54 +00:00
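For readers unfamiliar with the technique named in PR 20729, here is a minimal sketch of generic top-k sampling over a single vector of logits. It is not Marian's implementation and says nothing about the argument syntax of --output-sampling; the function name, the temperature parameter, and the rng parameter are assumptions made for illustration.

```python
import math
import random
from typing import List


def top_k_sample(logits: List[float], k: int, temperature: float = 1.0, rng=random) -> int:
    """Sample one token index from the k highest-scoring entries of `logits`.

    The k best logits are renormalized with a softmax (after dividing by
    `temperature`), and one index is drawn from that restricted distribution.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Indices of the k largest logits, best first.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]

    # Numerically stable softmax weights over the surviving candidates only.
    scaled = [logits[i] / temperature for i in top]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]

    return rng.choices(top, weights=weights, k=1)[0]


# With k=2 only indices 1 and 3 (logits 2.0 and 1.5) can ever be drawn.
example_logits = [0.1, 2.0, -1.0, 1.5]
token = top_k_sample(example_logits, k=2, rng=random.Random(0))
```

Setting k=1 reduces this to greedy selection, while setting k to the vocabulary size gives plain sampling from the full softmax.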
Roman Grundkiewicz
2bdfbd3f02
Update badges in README.md 2021-11-21 17:06:01 +00:00
Marcin Junczys-Dowmunt
1404201926 Merged PR 21151: Cleaning up fp16 behavior
This PR improves clipping and pruning behavior of NaNs and Infs during fp16 training, ultimately avoiding the underflow problems that we were facing so far.
2021-10-26 20:25:39 +00:00
Roman Grundkiewicz
7f06f3c5d2 Merged PR 21166: Keep building on macOS-10.15
Marian does not compile on macOS 11.6, so the build has stopped working due to an upgrade from macOS-10.15 to macOS 11.6 in Azure Pipelines: https://github.com/actions/virtual-environments/issues/4060
This PR explicitly sets macOS 10.15 in the workflow.
2021-10-26 11:20:41 +00:00
Hieu Hoang
2d79ad02bb Merged PR 20933: beam & batch works for non-factored models 2021-10-13 20:20:14 +00:00
Roman Grundkiewicz
12a1bfaf6f
Remove Ubuntu 16.04 from GitHub workflows (#879)
* Add --allow-unauthenticated when installing CUDA
* Remove workflow with Ubuntu 16.04
2021-10-11 16:59:52 +01:00
Marcin Junczys-Dowmunt
03fe175876 Merged PR 20879: Adjustable ffn width and depth in transformer decoder 2021-09-28 17:19:07 +00:00
Marcin Junczys-Dowmunt
d796a3c3b7 Merged PR 20839: Do not ignore ignoreEOS for spm decoding
With final space this eliminates trailing whitespace caused by appending EOS
2021-09-28 17:17:12 +00:00
Roman Grundkiewicz
aa58ba8e23 Merged PR 20593: Fix and update Azure pipelines
- Add `--allow-unauthenticated` to `apt` when installing CUDA on Ubuntu
- Remove the `ubuntu-16.04` image from Azure pipelines, which will become unavailable after September 20
2021-09-20 13:14:24 +00:00