Mirror of https://github.com/marian-nmt/marian.git, synced 2024-09-17 09:47:34 +03:00
Update CHANGELOG

This commit is contained in:
parent a4ef9b52a6
commit facd045bf6

CHANGELOG.md (20 lines changed)
@@ -17,11 +17,17 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 - Refactored sync sgd for easier communication and integration with NCCL
 - Smaller memory-overhead for sync-sgd
 - NCCL integration (version 2.2.13)
-- New binary format for saving/load of models, can be used with *.bin extension (can be memory mapped)
-- Memory-mapping of graphs for inference with ExpressionGraph::mmap(const void* ptr) function. (assumes *.bin model is mapped or in buffer)
-- Added SRU (--dec-cell sru) and ReLU (--dec-cell relu) cells to inventory of RNN cells.
-- RNN auto-regression layers in transformer (--transformer-decoder-autreg rnn), work with gru, lstm, tanh, relu, sru cells.
-- Recurrently stacked layers in transformer (--transformer-tied-layers 1 1 1 2 2 2 means 6 layers with 1-3 and 4-6 tied parameters, two groups of parameters)
+- New binary format for saving/load of models, can be used with _*.bin_
+  extension (can be memory mapped)
+- Memory-mapping of graphs for inference with `ExpressionGraph::mmap(const void*
+  ptr)` function. (assumes _*.bin_ model is mapped or in buffer)
+- Added SRU (--dec-cell sru) and ReLU (--dec-cell relu) cells to inventory of
+  RNN cells.
+- RNN auto-regression layers in transformer (`--transformer-decoder-autreg
+  rnn`), work with gru, lstm, tanh, relu, sru cells.
+- Recurrently stacked layers in transformer (`--transformer-tied-layers 1 1 1 2
+  2 2` means 6 layers with 1-3 and 4-6 tied parameters, two groups of
+  parameters)
 
 ### Fixed
 
@@ -32,6 +38,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
   future models to not train properly.
 - Bug in mini-batch-fit, tied embeddings would result in identical embeddings in fake
   source and target batch. Caused under-estimation of memory usage and re-allocation.
+- Seamless training continuation with exponential smoothing
 
 ## [1.5.0] - 2018-06-17
 
@@ -165,7 +172,8 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 
 ### Changed
 - Changed shape organization to follow numpy.
-- Changed option `--moving-average` to `--exponential-smoothing` and inverted formula to `s_t = (1 - \alpha) * s_{t-1} + \alpha * x_t`, `\alpha` is now
+- Changed option `--moving-average` to `--exponential-smoothing` and inverted
+  formula to `s_t = (1 - \alpha) * s_{t-1} + \alpha * x_t`, `\alpha` is now
   `1-e4` by default
 - Got rid of thrust for compile-time mathematical expressions
 - Changed boolean option `--normalize` to `--normalize [arg=1] (=0)`. New