Commit Graph

951 Commits

Author SHA1 Message Date
Taku Kudo
238fd2cc43
Merge pull request #1005 from Cassini-chris/patch-1
Fixing issues with the normalizer.cc (typo, type safety, cast function)
2024-05-06 17:02:27 +09:00
Taku Kudo
0da168ca45
Merge pull request #1004 from google/dependabot/pip/dot-github/workflows/requirements/build-time-deps-0dd82dcf5c
Bump setuptools from 69.2.0 to 69.5.1 in /.github/workflows/requirements in the build-time-deps group
2024-05-05 12:30:39 +09:00
Taku Kudo
0575107765
Merge pull request #1002 from mcognetta/small_changes
Add missing output formats to spm_encode flag documentation
2024-05-05 12:29:54 +09:00
Dr. Christoph Mittendorf
7425587c6a
Fixing issues with the normalizer.cc
1x fixed typo
1x Changed uint32 to uint32_t for consistency and type safety.
1x The original code uses const_cast<char*> before reinterpreting the pointer as uint32*. Without const_cast, the compiler treats blob.data() as a pointer to constant data (since the std::string is typically constant).
2024-05-04 10:42:03 +02:00
dependabot[bot]
34042dc854
Bump setuptools
Bumps the build-time-deps group in /.github/workflows/requirements with 1 update: [setuptools](https://github.com/pypa/setuptools).


Updates `setuptools` from 69.2.0 to 69.5.1
- [Release notes](https://github.com/pypa/setuptools/releases)
- [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst)
- [Commits](https://github.com/pypa/setuptools/compare/v69.2.0...v69.5.1)

---
updated-dependencies:
- dependency-name: setuptools
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: build-time-deps
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-05-01 15:51:12 +00:00
Marco Cognetta
530f556b82 update spm_encode docs for bpe dropout 2024-05-01 14:40:27 +09:00
Taku Kudo
7dcb541451
Merge pull request #998 from google/dependabot/pip/dot-github/workflows/requirements/idna-3.7
Bump idna from 3.6 to 3.7 in /.github/workflows/requirements
2024-04-15 01:26:05 +09:00
dependabot[bot]
3fcf4d36cb
Bump idna from 3.6 to 3.7 in /.github/workflows/requirements
Bumps [idna](https://github.com/kjd/idna) from 3.6 to 3.7.
- [Release notes](https://github.com/kjd/idna/releases)
- [Changelog](https://github.com/kjd/idna/blob/master/HISTORY.rst)
- [Commits](https://github.com/kjd/idna/compare/v3.6...v3.7)

---
updated-dependencies:
- dependency-name: idna
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-04-12 04:24:28 +00:00
Taku Kudo
7dc9a76ec7
Merge pull request #995 from google/dependabot/pip/dot-github/workflows/requirements/build-time-deps-6d60a12ad5
Bump the build-time-deps group in /.github/workflows/requirements with 3 updates
2024-04-05 22:12:34 +09:00
dependabot[bot]
404882df3f
Bump the build-time-deps group
Bumps the build-time-deps group in /.github/workflows/requirements with 3 updates: [cibuildwheel](https://github.com/pypa/cibuildwheel), [wheel](https://github.com/pypa/wheel) and [setuptools](https://github.com/pypa/setuptools).


Updates `cibuildwheel` from 2.16.5 to 2.17.0
- [Release notes](https://github.com/pypa/cibuildwheel/releases)
- [Changelog](https://github.com/pypa/cibuildwheel/blob/main/docs/changelog.md)
- [Commits](https://github.com/pypa/cibuildwheel/compare/v2.16.5...v2.17)

Updates `wheel` from 0.42.0 to 0.43.0
- [Release notes](https://github.com/pypa/wheel/releases)
- [Changelog](https://github.com/pypa/wheel/blob/main/docs/news.rst)
- [Commits](https://github.com/pypa/wheel/compare/0.42.0...0.43.0)

Updates `setuptools` from 69.1.1 to 69.2.0
- [Release notes](https://github.com/pypa/setuptools/releases)
- [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst)
- [Commits](https://github.com/pypa/setuptools/compare/v69.1.1...v69.2.0)

---
updated-dependencies:
- dependency-name: cibuildwheel
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: build-time-deps
- dependency-name: wheel
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: build-time-deps
- dependency-name: setuptools
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: build-time-deps
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-04-01 15:29:14 +00:00
Taku Kudo
4d6a1f4106
Merge pull request #985 from google/dependabot/pip/dot-github/workflows/requirements/build-time-deps-0b1593cb10
Bump the build-time-deps group in /.github/workflows/requirements with 3 updates
2024-03-03 09:53:10 +09:00
dependabot[bot]
1d7ce29586
Bump the build-time-deps group
Bumps the build-time-deps group in /.github/workflows/requirements with 3 updates: [twine](https://github.com/pypa/twine), [pip](https://github.com/pypa/pip) and [setuptools](https://github.com/pypa/setuptools).


Updates `twine` from 4.0.2 to 5.0.0
- [Release notes](https://github.com/pypa/twine/releases)
- [Changelog](https://github.com/pypa/twine/blob/main/docs/changelog.rst)
- [Commits](https://github.com/pypa/twine/compare/4.0.2...5.0.0)

Updates `pip` from 23.3.2 to 24.0
- [Changelog](https://github.com/pypa/pip/blob/main/NEWS.rst)
- [Commits](https://github.com/pypa/pip/compare/23.3.2...24.0)

Updates `setuptools` from 69.0.3 to 69.1.1
- [Release notes](https://github.com/pypa/setuptools/releases)
- [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst)
- [Commits](https://github.com/pypa/setuptools/compare/v69.0.3...v69.1.1)

---
updated-dependencies:
- dependency-name: twine
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: build-time-deps
- dependency-name: pip
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: build-time-deps
- dependency-name: setuptools
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: build-time-deps
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-03-01 15:09:24 +00:00
Taku Kudo
725952d8b8 makes the return value of --help the same as the official abseil library 2024-02-26 13:01:58 +00:00
Taku Kudo
52a7f156a4 increment version v0.2.1 2024-02-26 05:56:28 +00:00
Taku Kudo
9082653296 use ::testing::TempDir/SrcDir 2024-02-26 05:30:23 +00:00
Taku Kudo
3b2ea62d20 fix build error 2024-02-25 16:19:08 +00:00
Taku Kudo
0ba506938c add nfc, nfd normalization tsv files 2024-02-25 15:47:08 +00:00
Taku Kudo
a216bd01d1
Merge pull request #981 from google/dependabot/pip/dot-github/workflows/requirements/cryptography-42.0.4
Bump cryptography from 42.0.2 to 42.0.4 in /.github/workflows/requirements
2024-02-22 16:56:11 +09:00
dependabot[bot]
c7b4cd5019
Bump cryptography in /.github/workflows/requirements
Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.2 to 42.0.4.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/42.0.2...42.0.4)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-02-21 20:45:34 +00:00
Taku Kudo
1d91514435
Merge pull request #979 from h-vetinari/libdir
move setting of default CMAKE_INSTALL_{BIN,INCLUDE,LIB}DIR before first use
2024-02-21 14:26:56 +09:00
H. Vetinari
b2863fc8d3 also change CMAKE_INSTALL_INCDIR->CMAKE_INSTALL_INCLUDEDIR in src/CMakeLists.txt 2024-02-21 12:08:41 +11:00
H. Vetinari
270120812e unify spelling of CMAKE_INSTALL_INCLUDEDIR
Following GNUInstallDirs defaults, see also CMake docs:
https://cmake.org/cmake/help/latest/command/install.html
2024-02-20 21:20:08 +11:00
H. Vetinari
26f9f58806 move setting of default CMAKE_INSTALL_{BIN,INCLUDE,LIB}DIR before first use 2024-02-20 21:13:20 +11:00
Taku Kudo
17d7580d64 suppress warnings in testharness 2024-02-19 08:06:52 +00:00
Taku Kudo
4a3cd1cbaf
Merge pull request #975 from google/dependabot/pip/dot-github/workflows/requirements/cryptography-42.0.2
Bump cryptography from 42.0.0 to 42.0.2 in /.github/workflows/requirements
2024-02-19 13:52:25 +09:00
dependabot[bot]
670d2e7ee0
Bump cryptography in /.github/workflows/requirements
Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.0 to 42.0.2.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/42.0.0...42.0.2)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-02-17 00:55:36 +00:00
Taku Kudo
2b8772ae8f
Merge pull request #974 from xunkai55/patch-1
Fix a typo in api.md
2024-02-14 21:50:54 +09:00
Xunkai
ffd8e9efb8
Fix a typo in api.md 2024-02-12 18:57:30 +08:00
Taku Kudo
03243af616
Merge pull request #970 from google/dependabot/pip/dot-github/workflows/requirements/build-time-deps-bd99d7bc59
Bump the build-time-deps group in /.github/workflows/requirements with 1 update
2024-02-06 18:12:26 +09:00
Taku Kudo
d0fe40512e
Merge pull request #972 from google/dependabot/pip/dot-github/workflows/requirements/cryptography-42.0.0
Bump cryptography from 41.0.7 to 42.0.0 in /.github/workflows/requirements
2024-02-06 18:12:06 +09:00
dependabot[bot]
a8a618fb66
Bump cryptography in /.github/workflows/requirements
Bumps [cryptography](https://github.com/pyca/cryptography) from 41.0.7 to 42.0.0.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/41.0.7...42.0.0)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-02-06 03:24:50 +00:00
dependabot[bot]
bbbe548a20
Bump the build-time-deps group
Bumps the build-time-deps group in /.github/workflows/requirements with 1 update: [cibuildwheel](https://github.com/pypa/cibuildwheel).


Updates `cibuildwheel` from 2.16.2 to 2.16.5
- [Release notes](https://github.com/pypa/cibuildwheel/releases)
- [Changelog](https://github.com/pypa/cibuildwheel/blob/main/docs/changelog.md)
- [Commits](https://github.com/pypa/cibuildwheel/compare/v2.16.2...v2.16.5)

---
updated-dependencies:
- dependency-name: cibuildwheel
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: build-time-deps
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-02-01 16:00:54 +00:00
Taku Kudo
53de76561c allows loading precomputed seed sentencepieces for unigram from a file. 2024-01-28 16:17:08 +00:00
Taku Kudo
0fe7add363 fixed crash bug in unigram model training 2024-01-28 05:19:00 +00:00
Taku Kudo
41c4b7f080 returns unicode character offsets in normalize method 2024-01-22 07:19:04 +00:00
Taku Kudo
6b468a0e01 support bytes output in decode method 2024-01-20 08:16:17 +00:00
Taku Kudo
7b9ee4c93e
Merge pull request #962 from Halmoni100/external-absl-2
Additional external absl fixes
2024-01-18 15:31:43 +09:00
Christopher Hong
0ea22c03e2 Fix build for external absl 2024-01-17 19:33:51 -05:00
Christopher Hong
a34fb4018c Add idempotency to external absl mod 2024-01-17 19:32:48 -05:00
Taku Kudo
4ce471c002
Update cross_build.yml
runs apt-get update to update the local index
2024-01-17 01:02:45 +09:00
Taku Kudo
de1747bbd4 added functionality to override normalizer spec 2024-01-16 04:06:05 +00:00
Taku Kudo
0018af1f31 better external abseil and protobuf support 2024-01-16 02:15:46 +00:00
Taku Kudo
acf8ebe61f build universal osx binary 2024-01-14 01:51:01 +00:00
Taku Kudo
ed76ecc478 add more advanced SentencePieceNormalizer class 2024-01-13 17:19:50 +00:00
Taku Kudo
f5c736302c remove absl/random and absl/memory, add absl::btree_map 2024-01-07 10:48:48 +00:00
Taku Kudo
adf9e81b63 move SharedBitGen to random namespace 2024-01-06 15:56:51 +00:00
Taku Kudo
49afc4c6cc
Merge pull request #959 from google/revert-957-dependabot/github_actions/github-actions-bcafe21e81
Revert "Bump the github-actions group with 2 updates"
2024-01-07 00:24:40 +09:00
Taku Kudo
fb490c58c2
Revert "Bump the github-actions group with 2 updates" 2024-01-06 22:55:18 +09:00
Taku Kudo
06eee09847 Added Normalization API 2024-01-04 09:04:20 +00:00
Taku Kudo
e7b5260e4a
Merge pull request #955 from pnacht/pinned-pip
Hash-pin Python dependencies in CI/CD release workflows
2024-01-03 12:29:39 +09:00