builtin_pb | add pretokenization_delimiter options. Initialize seed pieces more accurately. | 2023-04-10 02:11:37 +00:00 |
bpe_model_test.cc | Fixed windows build failure | 2020-05-10 02:01:28 +09:00 |
bpe_model_trainer_test.cc | Use absl::flags | 2020-06-01 00:53:07 +09:00 |
bpe_model_trainer.cc | Fix bugs in the handling of duplicated bigrams | 2023-04-24 07:25:10 +00:00 |
bpe_model_trainer.h | Sync internal to github. DP related features are added. | 2022-05-25 14:03:45 +09:00 |
bpe_model.cc | Added ImmutableSentencePiece class | 2022-06-20 00:55:46 +09:00 |
bpe_model.h | clear description for alpha of BPE-dropout | 2020-09-04 15:58:39 +02:00 |
builder_test.cc | Use absl::flags | 2020-06-01 00:53:07 +09:00 |
builder.cc | Uses absl::string_view as much as possible | 2022-06-15 01:29:55 +09:00 |
builder.h | Uses absl::string_view as much as possible | 2022-06-15 01:29:55 +09:00 |
char_model_test.cc | Initial release of 0.19. Merged internal sentencepiece. | 2020-05-08 01:06:50 +09:00 |
char_model_trainer_test.cc | Use absl::flags | 2020-06-01 00:53:07 +09:00 |
char_model_trainer.cc | merges internal changes to github | 2020-10-13 13:02:56 +09:00 |
char_model_trainer.h | Port absl::flat_hash_map | 2020-06-02 01:56:48 +09:00 |
char_model.cc | stop normalization for user_defined_symbols | 2018-11-08 17:26:14 +09:00 |
char_model.h | Port absl::flat_hash_map | 2020-06-02 01:56:48 +09:00 |
CMakeLists.txt | Fixes build test errors in big-endian machines | 2023-05-14 09:08:39 +00:00 |
common.h | Fixes build test errors in big-endian machines | 2023-05-14 09:08:39 +00:00 |
compile_charsmap_main.cc | added ShutdownLibrary function to uninitialize global variables | 2022-08-20 23:34:37 +09:00 |
error.cc | added ShutdownLibrary function to uninitialize global variables | 2022-08-20 23:34:37 +09:00 |
filesystem_test.cc | Use absl::flags | 2020-06-01 00:53:07 +09:00 |
filesystem.cc | Initial release of 0.19. Merged internal sentencepiece. | 2020-05-08 01:06:50 +09:00 |
filesystem.h | Initial release of 0.19. Merged internal sentencepiece. | 2020-05-08 01:06:50 +09:00 |
freelist_test.cc | Sync internal to github. DP related features are added. | 2022-05-25 14:03:45 +09:00 |
freelist.h | Sync internal to github. DP related features are added. | 2022-05-25 14:03:45 +09:00 |
init_test.cc | change the type of input_sentence_size from int32 to uint64 | 2021-01-08 16:20:57 +09:00 |
init.h | Fixes include path when using external protobuf | 2023-04-10 10:15:46 +00:00 |
model_factory_test.cc | Initialize repository | 2017-03-07 19:43:50 +09:00 |
model_factory.cc | Initial release of 0.19. Merged internal sentencepiece. | 2020-05-08 01:06:50 +09:00 |
model_factory.h | Port absl::flat_hash_map | 2020-06-02 01:56:48 +09:00 |
model_interface_test.cc | Added ImmutableSentencePiece class | 2022-06-20 00:55:46 +09:00 |
model_interface.cc | sync from internal | 2021-06-16 19:04:14 +09:00 |
model_interface.h | Added ImmutableSentencePiece class | 2022-06-20 00:55:46 +09:00 |
normalization_rule.h | sync from internal | 2021-06-16 19:04:14 +09:00 |
normalizer_test.cc | sync from internal | 2021-06-16 19:04:14 +09:00 |
normalizer.cc | Fixes build test errors in big-endian machines | 2023-05-14 09:08:39 +00:00 |
normalizer.h | Sync internal to github. DP related features are added. | 2022-05-25 14:03:45 +09:00 |
pretokenizer_for_training_test.cc | add pretokenization_delimiter options. Initialize seed pieces more accurately. | 2023-04-10 02:11:37 +00:00 |
pretokenizer_for_training.cc | add pretokenization_delimiter options. Initialize seed pieces more accurately. | 2023-04-10 02:11:37 +00:00 |
pretokenizer_for_training.h | add pretokenization_delimiter options. Initialize seed pieces more accurately. | 2023-04-10 02:11:37 +00:00 |
sentencepiece_model.proto | add pretokenization_delimiter options. Initialize seed pieces more accurately. | 2023-04-10 02:11:37 +00:00 |
sentencepiece_processor_test.cc | Adds more unittests | 2022-08-03 02:49:38 +09:00 |
sentencepiece_processor.cc | Fixed test failure. | 2022-08-03 17:20:01 +09:00 |
sentencepiece_processor.h | Fixed test failure. | 2022-08-03 17:20:01 +09:00 |
sentencepiece_trainer_test.cc | Port absl::flat_hash_map | 2020-06-02 01:56:48 +09:00 |
sentencepiece_trainer.cc | fixed link error | 2021-06-17 01:56:17 +09:00 |
sentencepiece_trainer.h | Uses absl::string_view as much as possible | 2022-06-15 01:29:55 +09:00 |
sentencepiece.proto | Initial release of 0.19. Merged internal sentencepiece. | 2020-05-08 01:06:50 +09:00 |
spec_parser.h | add pretokenization_delimiter options. Initialize seed pieces more accurately. | 2023-04-10 02:11:37 +00:00 |
spm_decode_main.cc | added ShutdownLibrary function to uninitialize global variables | 2022-08-20 23:34:37 +09:00 |
spm_encode_main.cc | added ShutdownLibrary function to uninitialize global variables | 2022-08-20 23:34:37 +09:00 |
spm_export_vocab_main.cc | added ShutdownLibrary function to uninitialize global variables | 2022-08-20 23:34:37 +09:00 |
spm_normalize_main.cc | added ShutdownLibrary function to uninitialize global variables | 2022-08-20 23:34:37 +09:00 |
spm_train_main.cc | add pretokenization_delimiter options. Initialize seed pieces more accurately. | 2023-04-10 02:11:37 +00:00 |
test_main.cc | added ShutdownLibrary function to uninitialize global variables | 2022-08-20 23:34:37 +09:00 |
testharness.cc | support to build spm with external absl | 2021-01-08 11:33:31 +09:00 |
testharness.h | support to build spm with external absl | 2021-01-08 11:33:31 +09:00 |
trainer_factory_test.cc | Initial release of 0.19. Merged internal sentencepiece. | 2020-05-08 01:06:50 +09:00 |
trainer_factory.cc | Initial release of 0.19. Merged internal sentencepiece. | 2020-05-08 01:06:50 +09:00 |
trainer_factory.h | Port absl::flat_hash_map | 2020-06-02 01:56:48 +09:00 |
trainer_interface_test.cc | support pretokenization in BPE mode. | 2023-04-11 06:48:08 +00:00 |
trainer_interface.cc | increases the max number of threads | 2023-04-30 17:37:15 +00:00 |
trainer_interface.h | Sync internal to github. DP related features are added. | 2022-05-25 14:03:45 +09:00 |
unicode_script_map.h | Port absl::flat_hash_map | 2020-06-02 01:56:48 +09:00 |
unicode_script_test.cc | Initial release of 0.19. Merged internal sentencepiece. | 2020-05-08 01:06:50 +09:00 |
unicode_script.cc | Port absl::flat_hash_map | 2020-06-02 01:56:48 +09:00 |
unicode_script.h | fix address sanitizers on clang problem | 2021-12-21 19:52:47 +08:00 |
unigram_model_test.cc | Fix the test error on windows | 2023-04-28 06:20:50 +00:00 |
unigram_model_trainer_test.cc | Fixes build test errors in big-endian machines | 2023-05-14 09:08:39 +00:00 |
unigram_model_trainer.cc | Fix the ULM training bugs | 2023-04-27 17:32:57 +00:00 |
unigram_model_trainer.h | Fix the ULM training bugs | 2023-04-27 17:32:57 +00:00 |
unigram_model.cc | Fix the ULM training bugs | 2023-04-27 17:32:57 +00:00 |
unigram_model.h | Added ImmutableSentencePiece class | 2022-06-20 00:55:46 +09:00 |
util_test.cc | Use absl::flags | 2020-06-01 00:53:07 +09:00 |
util.cc | make the error message more descriptive. null terminate string in Utf8ToWide | 2023-04-03 02:24:52 +00:00 |
util.h | fixes IS_BIGENDIAN macro places | 2023-04-10 02:28:20 +00:00 |
word_model_test.cc | Port absl::flat_hash_map | 2020-06-02 01:56:48 +09:00 |
word_model_trainer_test.cc | Use absl::flags | 2020-06-01 00:53:07 +09:00 |
word_model_trainer.cc | merges internal changes to github | 2020-10-13 13:02:56 +09:00 |
word_model_trainer.h | Port absl::flat_hash_map | 2020-06-02 01:56:48 +09:00 |
word_model.cc | Initial release of 0.19. Merged internal sentencepiece. | 2020-05-08 01:06:50 +09:00 |
word_model.h | Port absl::flat_hash_map | 2020-06-02 01:56:48 +09:00 |