fixing minor typos in the API.md

Dr. Christoph Mittendorf 2024-05-23 21:46:38 +02:00 committed by GitHub
parent 58b550871d
commit d36fa98e96


@@ -15,7 +15,7 @@ if (!status.ok()) {
}
// You can also load a serialized model from std::string.
-// const std::stirng str = // Load blob contents from a file.
+// const std::string str = // Load blob contents from a file.
// auto status = processor.LoadFromSerializedProto(str);
```
@@ -64,7 +64,7 @@ processor.SampleEncode("This is a test.", &pieces, -1, 0.2);
std::vector<int> ids;
processor.SampleEncode("This is a test.", &ids, -1, 0.2);
```
-SampleEncode has two sampling parameters, `nbest_size` and `alpha`, which correspond to `l` and `alpha` in the [original paper](https://arxiv.org/abs/1804.10959). When `nbest_size` is -1, one segmentation is sampled from all hypothesis with forward-filtering and backward sampling algorithm.
+SampleEncode has two sampling parameters, `nbest_size` and `alpha`, which correspond to `l` and `alpha` in the [original paper](https://arxiv.org/abs/1804.10959). When `nbest_size` is -1, one segmentation is sampled from all hypotheses with forward-filtering and backward sampling algorithm.
## Training
Calls `SentencePieceTrainer::Train` function to train sentencepiece model. You can pass the same parameters of [spm_train](https://github.com/google/sentencepiece#train-sentencepiece-model) as a single string.
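
Since `Train` takes its parameters as one flag string, it can help to assemble that string from key/value pairs. The sketch below is only an illustration: `BuildTrainArgs` is a hypothetical helper that is not part of sentencepiece, and the flag names in the usage note (`input`, `model_prefix`, `vocab_size`) are assumed to match the usual `spm_train` options.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (NOT part of the sentencepiece API): joins
// flag/value pairs into a single "--key=value --key=value" string,
// the form in which spm_train-style parameters are passed.
std::string BuildTrainArgs(
    const std::vector<std::pair<std::string, std::string>>& flags) {
  std::string args;
  for (const auto& flag : flags) {
    if (!args.empty()) args += ' ';  // separate flags with one space
    args += "--" + flag.first + "=" + flag.second;
  }
  return args;
}
```

The resulting string would then be handed to the trainer, for example `sentencepiece::SentencePieceTrainer::Train(BuildTrainArgs({{"input", "corpus.txt"}, {"model_prefix", "m"}, {"vocab_size", "1000"}}))`, where `corpus.txt` is a placeholder file name. Values containing spaces are not quoted by this sketch.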