mirror of https://github.com/marian-nmt/marian.git
synced 2024-09-17 09:47:34 +03:00

Update CHANGELOG

parent bbc817dc86
commit b359aa9500
CHANGELOG.md | 35
@@ -7,13 +7,16 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.

 ## [Unreleased]

 ### Added
+- Returning hard alignments by scorer

 ## [1.6.0] - 2018-08-08

 ### Added

 - Faster training (20-30%) by optimizing gradient propagation of biases
-- Returning Moses-style hard alignments during decoding single models, ensembles and n-best
-  lists
+- Returning Moses-style hard alignments during decoding single models,
+  ensembles and n-best lists
+- Hard alignment extraction strategy taking source words that have the
+  attention value greater than the threshold
 - Refactored sync sgd for easier communication and integration with NCCL
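The threshold-based extraction strategy named in the hunk above can be sketched as follows. This is a minimal illustration, not Marian's actual implementation; the function name and the flat attention-vector layout are hypothetical, keeping only the idea stated in the changelog entry (take source words whose attention value exceeds the threshold):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: given the attention distribution over source positions
// for one target word, return the source positions whose attention weight
// exceeds the threshold (a hard alignment for that target word).
std::vector<std::size_t> hardAlign(const std::vector<float>& attention,
                                   float threshold) {
  std::vector<std::size_t> srcPositions;
  for(std::size_t i = 0; i < attention.size(); ++i)
    if(attention[i] > threshold)
      srcPositions.push_back(i);
  return srcPositions;
}
```

For example, with attention weights {0.05, 0.7, 0.2, 0.05} and threshold 0.1, positions 1 and 2 would be selected.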
@@ -24,23 +27,24 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 - Memory-mapping of graphs for inference with `ExpressionGraph::mmap(const void*
   ptr)` function. (assumes _*.bin_ model is mapped or in buffer)
 - Added SRU (--dec-cell sru) and ReLU (--dec-cell relu) cells to inventory of
-  RNN cells.
+  RNN cells
 - RNN auto-regression layers in transformer (`--transformer-decoder-autoreg
-  rnn`), work with gru, lstm, tanh, relu, sru cells.
+  rnn`), work with gru, lstm, tanh, relu, sru cells
 - Recurrently stacked layers in transformer (`--transformer-tied-layers 1 1 1 2
   2 2` means 6 layers with 1-3 and 4-6 tied parameters, two groups of
   parameters)
+- Seamless training continuation with exponential smoothing

 ### Fixed

-- A couple of bugs in "selection" (transpose, shift, cols, rows) operators during
-  back-prop for a very specific case: one of the operators is the first operator after
-  a branch, in that case gradient propagation might be interrupted. This did not affect
-  any of the existing models as such a case was not present, but might have caused
-  future models to not train properly.
-- Bug in mini-batch-fit, tied embeddings would result in identical embeddings in fake
-  source and target batch. Caused under-estimation of memory usage and re-allocation.
-- Seamless training continuation with exponential smoothing
+- A couple of bugs in "selection" (transpose, shift, cols, rows) operators
+  during back-prop for a very specific case: one of the operators is the first
+  operator after a branch, in that case gradient propagation might be
+  interrupted. This did not affect any of the existing models as such a case
+  was not present, but might have caused future models to not train properly
+- Bug in mini-batch-fit, tied embeddings would result in identical embeddings
+  in fake source and target batch. Caused under-estimation of memory usage and
+  re-allocation

 ## [1.5.0] - 2018-06-17
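The `--transformer-tied-layers 1 1 1 2 2 2` example in the hunk above can be read as a mapping from layer index to parameter group. A minimal sketch of that reading, with a hypothetical helper name (not Marian code):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: the i-th entry of --transformer-tied-layers names the
// parameter group that layer i reuses, so "1 1 1 2 2 2" gives 6 layers whose
// parameters collapse onto 2 distinct groups (layers 1-3 tied, layers 4-6 tied).
std::size_t distinctGroups(const std::vector<int>& tiedLayers) {
  std::vector<int> groups;
  for(int g : tiedLayers)
    if(std::find(groups.begin(), groups.end(), g) == groups.end())
      groups.push_back(g);  // record each group id the first time it appears
  return groups.size();
}
```

So six configured layers allocate only two layers' worth of parameters, which is the point of the recurrently stacked (tied) setup.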
@@ -90,7 +94,8 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 ## [1.2.1] - 2018-01-19

 ### Fixed
-- Use valid-mini-batch size during validation with "translation" instead of mini-batch
+- Use valid-mini-batch size during validation with "translation" instead of
+  mini-batch
 - Normalize gradients with multi-gpu synchronous SGD
 - Fix divergence between saved models and validated models in asynchronous SGD
@@ -122,7 +127,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 - Fixed ensembling with language model and batched decoding
 - Fixed attention reduction kernel with large matrices (added missing
   `syncthreads()`), which should fix stability with large batches and beam-size
-  during batched decoding.
+  during batched decoding

 ## [1.1.1] - 2017-11-30
@@ -187,7 +192,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 - Renamed option name `--dynamic-batching` to `--mini-batch-fit`
 - Unified cross-entropy-based validation, supports now perplexity and other CE
 - Changed `--normalize (bool)` to `--normalize (float)arg`, allow to change
-  length normalization weight as `score / pow(length, arg)`.
+  length normalization weight as `score / pow(length, arg)`

 ### Removed
 - Temporarily removed gradient dropping (`--drop-rate X`) until refactoring.
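The length normalization formula quoted in the last hunk, `score / pow(length, arg)`, is simple enough to state directly. A minimal sketch (the function name is hypothetical; only the formula comes from the changelog):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Sketch of length normalization with --normalize (float)arg: the hypothesis
// score is divided by pow(length, arg). arg = 1.0 divides by the raw length
// (the old boolean behaviour); arg = 0.0 leaves the score unchanged.
double normalizedScore(double score, std::size_t length, double arg) {
  return score / std::pow(static_cast<double>(length), arg);
}
```

Making the exponent a float lets the weight of the length penalty be tuned continuously instead of only switched on or off.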
Loading…
Reference in New Issue
Block a user