Merge pull request #845 from chris-ha458/patch-1

Update sentencepiece_python_module_example.ipynb
Taku Kudo 2023-04-09 17:13:58 +09:00 committed by GitHub
commit 6c9fd791cf

@@ -948,7 +948,7 @@
"source": [
"## Randomizing training data\n",
"\n",
-"Sentencepiece loads all the lines of training data into memory to train the model. However, larger training data increases the training time and memory usage, though they are liner to the training data. When **--input_sentence_size=<SIZE>** is specified, Sentencepiece randomly samples <SIZE> lines from the whole training data. **--shuffle_input_sentence=false** disables the random shuffle and takes the first <SIZE> lines."
+"Sentencepiece loads all the lines of training data into memory to train the model. However, larger training data increases the training time and memory usage, though they are linear to the training data. When **--input_sentence_size=<SIZE>** is specified, Sentencepiece randomly samples <SIZE> lines from the whole training data. **--shuffle_input_sentence=false** disables the random shuffle and takes the first <SIZE> lines."
]
},
{