Taku Kudo
238fd2cc43
Merge pull request #1005 from Cassini-chris/patch-1
...
Fixing issues with the normalizer.cc (typo, type safety, cast func)
2024-05-06 17:02:27 +09:00
Taku Kudo
0da168ca45
Merge pull request #1004 from google/dependabot/pip/dot-github/workflows/requirements/build-time-deps-0dd82dcf5c
...
Bump setuptools from 69.2.0 to 69.5.1 in /.github/workflows/requirements in the build-time-deps group
2024-05-05 12:30:39 +09:00
Taku Kudo
0575107765
Merge pull request #1002 from mcognetta/small_changes
...
Add missing output formats to spm_encode flag documentation
2024-05-05 12:29:54 +09:00
Dr. Christoph Mittendorf
7425587c6a
Fixing issues with the normalizer.cc
...
1x fixed typo
1x Changed uint32 to uint32_t for consistency and type safety.
1x The original code uses const_cast<char*> before reinterpreting the pointer as uint32*. Without const_cast, the compiler treats blob.data() as a pointer to constant data (data() returns a const char* when called on a const string or string view).
2024-05-04 10:42:03 +02:00
dependabot[bot]
34042dc854
Bump setuptools
...
Bumps the build-time-deps group in /.github/workflows/requirements with 1 update: [setuptools](https://github.com/pypa/setuptools).
Updates `setuptools` from 69.2.0 to 69.5.1
- [Release notes](https://github.com/pypa/setuptools/releases)
- [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst)
- [Commits](https://github.com/pypa/setuptools/compare/v69.2.0...v69.5.1)
---
updated-dependencies:
- dependency-name: setuptools
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: build-time-deps
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-05-01 15:51:12 +00:00
Marco Cognetta
530f556b82
update spm_encode docs for bpe dropout
2024-05-01 14:40:27 +09:00
Taku Kudo
7dcb541451
Merge pull request #998 from google/dependabot/pip/dot-github/workflows/requirements/idna-3.7
...
Bump idna from 3.6 to 3.7 in /.github/workflows/requirements
2024-04-15 01:26:05 +09:00
dependabot[bot]
3fcf4d36cb
Bump idna from 3.6 to 3.7 in /.github/workflows/requirements
...
Bumps [idna](https://github.com/kjd/idna) from 3.6 to 3.7.
- [Release notes](https://github.com/kjd/idna/releases)
- [Changelog](https://github.com/kjd/idna/blob/master/HISTORY.rst)
- [Commits](https://github.com/kjd/idna/compare/v3.6...v3.7)
---
updated-dependencies:
- dependency-name: idna
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-04-12 04:24:28 +00:00
Taku Kudo
7dc9a76ec7
Merge pull request #995 from google/dependabot/pip/dot-github/workflows/requirements/build-time-deps-6d60a12ad5
...
Bump the build-time-deps group in /.github/workflows/requirements with 3 updates
2024-04-05 22:12:34 +09:00
dependabot[bot]
404882df3f
Bump the build-time-deps group
...
Bumps the build-time-deps group in /.github/workflows/requirements with 3 updates: [cibuildwheel](https://github.com/pypa/cibuildwheel), [wheel](https://github.com/pypa/wheel) and [setuptools](https://github.com/pypa/setuptools).
Updates `cibuildwheel` from 2.16.5 to 2.17.0
- [Release notes](https://github.com/pypa/cibuildwheel/releases)
- [Changelog](https://github.com/pypa/cibuildwheel/blob/main/docs/changelog.md)
- [Commits](https://github.com/pypa/cibuildwheel/compare/v2.16.5...v2.17)
Updates `wheel` from 0.42.0 to 0.43.0
- [Release notes](https://github.com/pypa/wheel/releases)
- [Changelog](https://github.com/pypa/wheel/blob/main/docs/news.rst)
- [Commits](https://github.com/pypa/wheel/compare/0.42.0...0.43.0)
Updates `setuptools` from 69.1.1 to 69.2.0
- [Release notes](https://github.com/pypa/setuptools/releases)
- [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst)
- [Commits](https://github.com/pypa/setuptools/compare/v69.1.1...v69.2.0)
---
updated-dependencies:
- dependency-name: cibuildwheel
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: build-time-deps
- dependency-name: wheel
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: build-time-deps
- dependency-name: setuptools
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: build-time-deps
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-04-01 15:29:14 +00:00
Taku Kudo
4d6a1f4106
Merge pull request #985 from google/dependabot/pip/dot-github/workflows/requirements/build-time-deps-0b1593cb10
...
Bump the build-time-deps group in /.github/workflows/requirements with 3 updates
2024-03-03 09:53:10 +09:00
dependabot[bot]
1d7ce29586
Bump the build-time-deps group
...
Bumps the build-time-deps group in /.github/workflows/requirements with 3 updates: [twine](https://github.com/pypa/twine), [pip](https://github.com/pypa/pip) and [setuptools](https://github.com/pypa/setuptools).
Updates `twine` from 4.0.2 to 5.0.0
- [Release notes](https://github.com/pypa/twine/releases)
- [Changelog](https://github.com/pypa/twine/blob/main/docs/changelog.rst)
- [Commits](https://github.com/pypa/twine/compare/4.0.2...5.0.0)
Updates `pip` from 23.3.2 to 24.0
- [Changelog](https://github.com/pypa/pip/blob/main/NEWS.rst)
- [Commits](https://github.com/pypa/pip/compare/23.3.2...24.0)
Updates `setuptools` from 69.0.3 to 69.1.1
- [Release notes](https://github.com/pypa/setuptools/releases)
- [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst)
- [Commits](https://github.com/pypa/setuptools/compare/v69.0.3...v69.1.1)
---
updated-dependencies:
- dependency-name: twine
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: build-time-deps
- dependency-name: pip
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: build-time-deps
- dependency-name: setuptools
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: build-time-deps
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-03-01 15:09:24 +00:00
Taku Kudo
725952d8b8
makes the return value of --help the same as in the official abseil library
2024-02-26 13:01:58 +00:00
Taku Kudo
52a7f156a4
increment version v0.2.1
2024-02-26 05:56:28 +00:00
Taku Kudo
9082653296
use ::testing::TempDir/SrcDir
2024-02-26 05:30:23 +00:00
Taku Kudo
3b2ea62d20
fix build error
2024-02-25 16:19:08 +00:00
Taku Kudo
0ba506938c
add nfc, nfd normalization tsv files
2024-02-25 15:47:08 +00:00
Taku Kudo
a216bd01d1
Merge pull request #981 from google/dependabot/pip/dot-github/workflows/requirements/cryptography-42.0.4
...
Bump cryptography from 42.0.2 to 42.0.4 in /.github/workflows/requirements
2024-02-22 16:56:11 +09:00
dependabot[bot]
c7b4cd5019
Bump cryptography in /.github/workflows/requirements
...
Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.2 to 42.0.4.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/42.0.2...42.0.4)
---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-02-21 20:45:34 +00:00
Taku Kudo
1d91514435
Merge pull request #979 from h-vetinari/libdir
...
move setting of default CMAKE_INSTALL_{BIN,INCLUDE,LIB}DIR before first use
2024-02-21 14:26:56 +09:00
H. Vetinari
b2863fc8d3
also change CMAKE_INSTALL_INCDIR->CMAKE_INSTALL_INCLUDEDIR in src/CMakeLists.txt
2024-02-21 12:08:41 +11:00
H. Vetinari
270120812e
unify spelling of CMAKE_INSTALL_INCLUDEDIR
...
Following GNUInstallDirs defaults, see also CMake docs:
https://cmake.org/cmake/help/latest/command/install.html
2024-02-20 21:20:08 +11:00
H. Vetinari
26f9f58806
move setting of default CMAKE_INSTALL_{BIN,INCLUDE,LIB}DIR before first use
2024-02-20 21:13:20 +11:00
Taku Kudo
17d7580d64
suppress warnings in testharness
2024-02-19 08:06:52 +00:00
Taku Kudo
4a3cd1cbaf
Merge pull request #975 from google/dependabot/pip/dot-github/workflows/requirements/cryptography-42.0.2
...
Bump cryptography from 42.0.0 to 42.0.2 in /.github/workflows/requirements
2024-02-19 13:52:25 +09:00
dependabot[bot]
670d2e7ee0
Bump cryptography in /.github/workflows/requirements
...
Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.0 to 42.0.2.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/42.0.0...42.0.2)
---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-02-17 00:55:36 +00:00
Taku Kudo
2b8772ae8f
Merge pull request #974 from xunkai55/patch-1
...
Fix a typo in api.md
2024-02-14 21:50:54 +09:00
Xunkai
ffd8e9efb8
Fix a typo in api.md
2024-02-12 18:57:30 +08:00
Taku Kudo
03243af616
Merge pull request #970 from google/dependabot/pip/dot-github/workflows/requirements/build-time-deps-bd99d7bc59
...
Bump the build-time-deps group in /.github/workflows/requirements with 1 update
2024-02-06 18:12:26 +09:00
Taku Kudo
d0fe40512e
Merge pull request #972 from google/dependabot/pip/dot-github/workflows/requirements/cryptography-42.0.0
...
Bump cryptography from 41.0.7 to 42.0.0 in /.github/workflows/requirements
2024-02-06 18:12:06 +09:00
dependabot[bot]
a8a618fb66
Bump cryptography in /.github/workflows/requirements
...
Bumps [cryptography](https://github.com/pyca/cryptography) from 41.0.7 to 42.0.0.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/41.0.7...42.0.0)
---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-02-06 03:24:50 +00:00
dependabot[bot]
bbbe548a20
Bump the build-time-deps group
...
Bumps the build-time-deps group in /.github/workflows/requirements with 1 update: [cibuildwheel](https://github.com/pypa/cibuildwheel).
Updates `cibuildwheel` from 2.16.2 to 2.16.5
- [Release notes](https://github.com/pypa/cibuildwheel/releases)
- [Changelog](https://github.com/pypa/cibuildwheel/blob/main/docs/changelog.md)
- [Commits](https://github.com/pypa/cibuildwheel/compare/v2.16.2...v2.16.5)
---
updated-dependencies:
- dependency-name: cibuildwheel
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: build-time-deps
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-02-01 16:00:54 +00:00
Taku Kudo
53de76561c
allows loading precomputed seed sentencepieces for unigram from a file.
2024-01-28 16:17:08 +00:00
Taku Kudo
0fe7add363
fixed crash bug in unigram model training
2024-01-28 05:19:00 +00:00
Taku Kudo
41c4b7f080
returns unicode character offsets in normalize method
2024-01-22 07:19:04 +00:00
Taku Kudo
6b468a0e01
support bytes output in decode method
2024-01-20 08:16:17 +00:00
Taku Kudo
7b9ee4c93e
Merge pull request #962 from Halmoni100/external-absl-2
...
Additional external absl fixes
2024-01-18 15:31:43 +09:00
Christopher Hong
0ea22c03e2
Fix build for external absl
2024-01-17 19:33:51 -05:00
Christopher Hong
a34fb4018c
Add idempotency to external absl mod
2024-01-17 19:32:48 -05:00
Taku Kudo
4ce471c002
Update cross_build.yml
...
runs apt-get update to update the local index
2024-01-17 01:02:45 +09:00
Taku Kudo
de1747bbd4
added functionality to override normalizer spec
2024-01-16 04:06:05 +00:00
Taku Kudo
0018af1f31
better external abseil and protobuf support
2024-01-16 02:15:46 +00:00
Taku Kudo
acf8ebe61f
build universal osx binary
2024-01-14 01:51:01 +00:00
Taku Kudo
ed76ecc478
add more advanced SentencePieceNormalizer class
2024-01-13 17:19:50 +00:00
Taku Kudo
f5c736302c
remove absl/random and absl/memory, add absl::btree_map
2024-01-07 10:48:48 +00:00
Taku Kudo
adf9e81b63
move SharedBitGen to random namespace
2024-01-06 15:56:51 +00:00
Taku Kudo
49afc4c6cc
Merge pull request #959 from google/revert-957-dependabot/github_actions/github-actions-bcafe21e81
...
Revert "Bump the github-actions group with 2 updates"
2024-01-07 00:24:40 +09:00
Taku Kudo
fb490c58c2
Revert "Bump the github-actions group with 2 updates"
2024-01-06 22:55:18 +09:00
Taku Kudo
06eee09847
Added Normalization API
2024-01-04 09:04:20 +00:00
Taku Kudo
e7b5260e4a
Merge pull request #955 from pnacht/pinned-pip
...
Hash-pin Python dependencies in CI/CD release workflows
2024-01-03 12:29:39 +09:00