Merge pull request #1015 from Cassini-chris/patch-3

fixing minor typos in the API.md
Taku Kudo 2024-05-24 16:09:18 +09:00 committed by GitHub
commit b35273478e

@@ -15,7 +15,7 @@ if (!status.ok()) {
}
// You can also load a serialized model from std::string.
-// const std::stirng str = // Load blob contents from a file.
+// const std::string str = // Load blob contents from a file.
// auto status = processor.LoadFromSerializedProto(str);
```
@@ -64,7 +64,7 @@ processor.SampleEncode("This is a test.", &pieces, -1, 0.2);
std::vector<int> ids;
processor.SampleEncode("This is a test.", &ids, -1, 0.2);
```
-SampleEncode has two sampling parameters, `nbest_size` and `alpha`, which correspond to `l` and `alpha` in the [original paper](https://arxiv.org/abs/1804.10959). When `nbest_size` is -1, one segmentation is sampled from all hypothesis with forward-filtering and backward sampling algorithm.
+SampleEncode has two sampling parameters, `nbest_size` and `alpha`, which correspond to `l` and `alpha` in the [original paper](https://arxiv.org/abs/1804.10959). When `nbest_size` is -1, one segmentation is sampled from all hypotheses with forward-filtering and backward sampling algorithm.
## Training
Calls `SentencePieceTrainer::Train` function to train sentencepiece model. You can pass the same parameters of [spm_train](https://github.com/google/sentencepiece#train-sentencepiece-model) as a single string.
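
As a side note on the `SampleEncode` hunk above: the sketch below illustrates how `nbest_size` and `alpha` can interact when `nbest_size` is a finite positive number. It assumes the formulation from the linked paper, where each of the `l` best segmentations is drawn with probability proportional to `P(x_i)^alpha`. `SamplingDistribution` is a hypothetical helper written for illustration, not part of the SentencePiece API, and the log-probabilities passed to it are made-up inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a SentencePiece function).
// Takes the log-probabilities of the n-best segmentations and returns
// the distribution P(x_i)^alpha / sum_j P(x_j)^alpha over them.
// Precondition: log_probs must not be empty.
std::vector<double> SamplingDistribution(const std::vector<double>& log_probs,
                                         double alpha) {
  assert(!log_probs.empty());
  // Subtract the maximum before exponentiating for numerical stability;
  // the shift cancels out in the normalization.
  const double max_lp = *std::max_element(log_probs.begin(), log_probs.end());
  std::vector<double> weights(log_probs.size());
  double total = 0.0;
  for (std::size_t i = 0; i < log_probs.size(); ++i) {
    weights[i] = std::exp(alpha * (log_probs[i] - max_lp));
    total += weights[i];
  }
  for (double& w : weights) w /= total;
  return weights;
}
```

Under this assumption, `alpha = 0` makes every candidate equally likely, while larger values of `alpha` concentrate the mass on the highest-scoring segmentation. The result could be fed to `std::discrete_distribution` to draw one candidate index.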