Update README.md

Taku Kudo 2018-06-09 00:34:58 +09:00 committed by GitHub
parent a929b63458
commit a574ce183c


@@ -262,7 +262,7 @@ If you want to assign another special tokens, please see [Use custom symbols](do
The usage is basically the same as that of ```subword-nmt```. Assuming that L1 and L2 are the two languages (source/target languages), train the shared spm model, and get the resulting vocabulary for each:
```
-% cat {train_file}.L1 {train_file}.L2 | shuffle > trian
+% cat {train_file}.L1 {train_file}.L2 | shuffle > train
% spm_train --input=train --model_prefix=spm --vocab_size=8000
% spm_encode --model=spm.model --generate_vocabulary < {train_file}.L1 > {vocab_file}.L1
% spm_encode --model=spm.model --generate_vocabulary < {train_file}.L2 > {vocab_file}.L2