diff --git a/README.md b/README.md
index 20f5693..3690de7 100644
--- a/README.md
+++ b/README.md
@@ -93,6 +93,8 @@ On top of the basic BPE implementation, this repository supports:
   use the argument `--dropout 0.1` for `subword-nmt apply-bpe` to randomly drop out possible merges.
   Doing this on the training corpus can improve quality of the final system; at test time, use BPE without dropout.
   In order to obtain reproducible results, argument `--seed` can be used to set the random seed.
+  
+  **Note:** In the original paper, the authors used BPE-Dropout on each new batch separately. You can copy the training corpus several times to get similar behavior to obtain multiple segmentations for the same sentence.
 
 - support for glossaries:
   use the argument `--glossaries` for `subword-nmt apply-bpe` to provide a list of words and/or regular expressions