# Changelog
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).
## [Unreleased]
### Added
- Added CONTRIBUTING.md
## [1.0.0] - 2017-10-15
### Added
- New "transformer" model based on [Attention Is All You
Need](https://arxiv.org/abs/1706.03762)
- Options specific to the transformer model
- Linear learning rate warmup with and without initial value
- Cyclic learning rate warmup
- More options for learning rate decay, including optimizer history reset and
repeated warmup
- Continuous inverse square root decay of the learning rate
(`--lr-decay-inv-sqrt`), based on the number of updates
- Exposed optimizer parameters (e.g. momentum for Adam)
- Version of deep RNN-based models compatible with Nematus (`--type nematus`)
- Synchronous SGD training for multi-GPU setups (enable with `--sync-sgd`)
- Dynamic construction of complex models with different encoders and decoders,
currently only available through the C++ API
- Option `--quiet` to suppress output to stderr
- Option to choose between variants of the optimization criterion: mean
cross-entropy, perplexity, cross-entropy sum
- In-process translation for validation, using the same memory as training
- Label Smoothing
- Added CHANGELOG.md
- Swish activation function as the default for the transformer model
(https://arxiv.org/pdf/1710.05941.pdf)
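
The warmup and decay entries above can be illustrated with a minimal Python sketch of one possible schedule: linear warmup from an optional initial value, combined with inverse square root decay based on the number of updates. This is an illustration under stated assumptions, not Marian's implementation; the function name, parameter names and default values are all hypothetical, and the exact formula Marian uses may differ.

```python
import math


def learning_rate(step, base_lr=3e-4, warmup_steps=16000,
                  decay_start=16000, start_lr=0.0):
    """Hypothetical schedule: linear warmup plus inverse square root decay.

    All names and defaults are illustrative assumptions, not Marian options.
    """
    # Linear warmup: move from start_lr to base_lr over warmup_steps updates.
    if warmup_steps > 0 and step < warmup_steps:
        lr = start_lr + (base_lr - start_lr) * step / warmup_steps
    else:
        lr = base_lr

    # Inverse square root decay: the factor stays at 1 until decay_start
    # updates, then shrinks proportionally to 1 / sqrt(step).
    if decay_start > 0:
        lr *= math.sqrt(decay_start / max(decay_start, step))
    return lr


if __name__ == "__main__":
    for step in (0, 8000, 16000, 64000):
        print(step, learning_rate(step))
```

With these assumed defaults the rate rises linearly for the first 16000 updates, peaks at `base_lr`, and has halved again by update 64000.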
### Changed
- Renamed "s2s" binary to "marian-decoder"
- Renamed "rescorer" binary to "marian-scorer"
- Renamed "server" binary to "marian-server"
- Renamed option `--dynamic-batching` to `--mini-batch-fit`
- Unified cross-entropy-based validation; it now supports perplexity and other
CE-based metrics
- Changed `--normalize (bool)` to `--normalize (float)arg`, which allows changing
the length normalization weight; scores are computed as
`score / pow(length, arg)`
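
The length normalization formula in the last entry, `score / pow(length, arg)`, can be sketched in a few lines of Python. This only illustrates the formula as quoted; the function name and the example scores are hypothetical and are not taken from the Marian code base.

```python
def normalized_score(score, length, arg):
    """Apply length normalization as score / pow(length, arg).

    `score` is assumed to be a summed log-probability and `length` the
    number of target tokens; both interpretations are assumptions here.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    return score / pow(length, arg)


if __name__ == "__main__":
    # Two hypothetical hypotheses: (summed log-probability, length).
    short, long_ = (-4.0, 4), (-6.0, 8)
    for arg in (0.0, 1.0):
        print(arg, normalized_score(*short, arg), normalized_score(*long_, arg))
```

With `arg` set to 0 the raw score is returned unchanged, so the shorter hypothesis wins; with `arg` set to 1 the score becomes a per-token average and the longer hypothesis ranks higher in this example.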
### Removed
- Temporarily removed gradient dropping (`--drop-rate X`) pending refactoring