Add note about BPE-Dropout during training

Ivan Provilkov 2021-02-15 11:59:34 +03:00 committed by GitHub
parent 234923ed53
commit fa326d431c

@@ -94,6 +94,8 @@ On top of the basic BPE implementation, this repository supports:
Doing this on the training corpus can improve the quality of the final system; at test time, apply BPE without dropout.
In order to obtain reproducible results, the argument `--seed` can be used to set the random seed.
**Note:** In the original paper, the authors applied BPE-Dropout to each new batch separately, so the same sentence is segmented differently across batches. To obtain multiple segmentations for the same sentence with this repository, you can copy the training corpus several times before applying BPE with dropout (see the example below this list).
- support for glossaries:
use the argument `--glossaries` for `subword-nmt apply-bpe` to provide a list of words and/or regular expressions
that should always be passed to the output without subword segmentation
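As an illustration of the `--glossaries` option described above, here is a minimal sketch; the file names `codes.bpe`, `test.txt`, and `test.bpe`, as well as the glossary entries themselves, are hypothetical placeholders.

```sh
# Hypothetical glossary entries: a literal word and a regular expression.
# Matching tokens are passed to the output without subword segmentation.
subword-nmt apply-bpe -c codes.bpe --glossaries "WikiLeaks" "[0-9]+" < test.txt > test.bpe
```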
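Following up on the BPE-Dropout note above, the sketch below shows one way the corpus could be copied and segmented; the file names (`train.txt`, `codes.bpe`, etc.), the number of copies, and the dropout rate are illustrative assumptions, not a prescribed recipe.

```sh
# Concatenate the training corpus a few times (here: 3 copies, an arbitrary choice)
# so that each sentence is segmented more than once under dropout.
cat train.txt train.txt train.txt > train.x3.txt

# Apply BPE with dropout to the copied corpus; fix the seed for reproducibility.
# Copies of the same sentence still receive different segmentations, because the
# random decisions continue along the stream.
subword-nmt apply-bpe -c codes.bpe --dropout 0.1 --seed 42 < train.x3.txt > train.x3.bpe

# At test time, apply BPE without dropout.
subword-nmt apply-bpe -c codes.bpe < test.txt > test.bpe
```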