Commit Graph

446 Commits

Author SHA1 Message Date
XapaJIaMnu
9271618ebb Update submodule 2024-05-12 09:51:02 +01:00
Yo'av Moshe
34acd8d982
fix downloading of models in the python binding (#472)
Models come in files named like `csen.student.base.v1.cd5418ba6a412fc7.tar.gz`, but the directories they create when extracted are named like `csen.student.base`. We therefore need to remove not just the extension but everything following and including the 3rd period.
2024-04-19 23:17:45 +01:00
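The naming rule described in the commit above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the message, not the code from the binding itself; the function name `extracted_dir_name` is hypothetical.

```python
def extracted_dir_name(archive_name: str) -> str:
    """Return the directory name an archive is expected to unpack into.

    Hypothetical helper: keeps the first three dot-separated fields and
    drops the third period together with everything after it.
    """
    return ".".join(archive_name.split(".")[:3])


# Stripping only the ".tar.gz" extension would leave the version and hash in place.
print(extracted_dir_name("csen.student.base.v1.cd5418ba6a412fc7.tar.gz"))
# prints: csen.student.base
```

Splitting on periods rather than stripping a fixed suffix is what removes the version and hash fields as well as the extension.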
Kirandevraj
5261614dfd
model url update in example script (#470) 2024-03-23 20:21:46 +00:00
XapaJIaMnu
983331bbc9 More pedantic spm 2023-12-19 18:41:18 +00:00
Kenneth Heafield
0367ae07a7 Fix MKL key URL 2023-12-07 12:10:50 -05:00
Kenneth Heafield
7774029d0d clang: marian-dev with newer fbgemm 2023-12-07 11:03:33 -05:00
Kenneth Heafield
73182d4c58 Pull in marian-dev with fixed CI and clang 2023-12-07 10:21:45 -05:00
dependabot[bot]
321be8ae04
Bump 3rd_party/marian-dev from 780df27 to 11c6ae7 (#466)
Bumps [3rd_party/marian-dev](https://github.com/browsermt/marian-dev) from `780df27` to `11c6ae7`.
- [Commits](780df2708e...11c6ae7c46)

---
updated-dependencies:
- dependency-name: 3rd_party/marian-dev
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-20 08:10:18 +01:00
dependabot[bot]
0b069acce6
Bump 3rd_party/marian-dev from 300a50f to 780df27 (#464)
Bumps [3rd_party/marian-dev](https://github.com/browsermt/marian-dev) from `300a50f` to `780df27`.
- [Commits](300a50f425...780df2708e)

---
updated-dependencies:
- dependency-name: 3rd_party/marian-dev
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-11 08:20:47 +01:00
Greg Tatum
db3826266d
Report the wasm size on builds (#460) 2023-08-17 07:55:49 +01:00
Greg Tatum
62770bb067
Generate a compile_commands.json by default with cmake (#461) 2023-08-16 16:14:56 +01:00
Greg Tatum
47024ec7a3
Add more things to the gitignore that are not being ignored (#462) 2023-08-16 15:35:26 +01:00
Nikolay Bogoychev
534ed37a3d
Remove wormhole references (#459)
* Remove wormhole references

* Remove more references to the WORMHOLE

* Update marian to wormhole removed marian

* Whoops

---------

Co-authored-by: Jelmer van der Linde <jelmer@ikhoefgeen.nl>
2023-08-14 15:22:54 +01:00
dependabot[bot]
ca954670aa
Bump 3rd_party/marian-dev from aa0221e to 8dbde0f (#458)
Bumps [3rd_party/marian-dev](https://github.com/browsermt/marian-dev) from `aa0221e` to `8dbde0f`.
- [Commits](aa0221e687...8dbde0fd8e)

---
updated-dependencies:
- dependency-name: 3rd_party/marian-dev
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-11 15:04:27 +01:00
dependabot[bot]
2bdc493df3
Bump 3rd_party/ssplit-cpp from ad2c5a5 to a311f98 (#456)
Bumps [3rd_party/ssplit-cpp](https://github.com/browsermt/ssplit-cpp) from `ad2c5a5` to `a311f98`.
- [Commits](ad2c5a52a5...a311f9865a)

---
updated-dependencies:
- dependency-name: 3rd_party/ssplit-cpp
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-08 10:37:24 +03:00
Graeme Nail
4b0da8d434
Enables model ensembles (#450)
* Enables model ensembles

Adds the ability to use ensembles of models. This supports ensembles of
binary- or npz-format models, as well as mixtures of both.

When all models in the ensembles are of binary format, the load from
memory path is used. Otherwise, they are loaded via the file system.
Enable log-level debug for output related to this.

* Fix formatting

* Fix WASM bindings for MemoryBundle

For now, this does not support ensembles.

* Remove shared_ptr wrapping the AlignedMemory of models.

* Fix formatting
2023-08-01 19:35:11 +01:00
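The load-path rule from the ensembles commit above (memory loading only when every model is binary, otherwise the file system) might be sketched as follows. This is a simplified Python illustration, not the C++ implementation; the function name and the assumption that binary models are recognised by a `.bin` suffix are both hypothetical.

```python
from pathlib import Path
from typing import Sequence


def choose_load_path(model_files: Sequence[str]) -> str:
    """Pick how an ensemble would be loaded, following the rule in the commit message.

    Assumption for illustration: binary models end in ".bin"; anything else
    (for example ".npz") forces loading via the file system.
    """
    if not model_files:
        raise ValueError("an ensemble needs at least one model")
    all_binary = all(Path(f).suffix == ".bin" for f in model_files)
    return "memory" if all_binary else "filesystem"


print(choose_load_path(["a.bin", "b.bin"]))  # prints: memory
print(choose_load_path(["a.bin", "b.npz"]))  # prints: filesystem
```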
dependabot[bot]
8011f9c849
Bump bergamot-translator-tests from 7984d14 to a04432d (#455)
Bumps [bergamot-translator-tests](https://github.com/browsermt/bergamot-translator-tests) from `7984d14` to `a04432d`.
- [Commits](7984d140ae...a04432d792)

---
updated-dependencies:
- dependency-name: bergamot-translator-tests
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-31 15:54:53 +01:00
Graeme Nail
cbfa839eef
Fix CI (#454)
* Use ubuntu-latest, macos-latest in GitHub Actions for cibuildwheel

* Update deprecated ubuntu-18.04 to ubuntu-latest for docs in GH actions
2023-07-31 15:54:42 +01:00
Graeme Nail
becb6e2cda
Fix Python formatting (Black) (#453) 2023-07-31 15:27:24 +01:00
dependabot[bot]
e333208cb9
Bump 3rd_party/marian-dev from 6a6bbb6 to aa0221e (#452)
Bumps [3rd_party/marian-dev](https://github.com/browsermt/marian-dev) from `6a6bbb6` to `aa0221e`.
- [Commits](6a6bbb6278...aa0221e687)

---
updated-dependencies:
- dependency-name: 3rd_party/marian-dev
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-31 15:26:44 +01:00
XapaJIaMnu
eaa2562fe0 Sentencepiece windows compilation 2023-07-13 00:14:13 +01:00
XapaJIaMnu
ada8c39224 Fix compilation on newer gcc 2023-06-06 17:04:49 +01:00
dependabot[bot]
b3d36bca90
Bump 3rd_party/marian-dev from 8ceb051 to bb65f47 (#447)
Bumps [3rd_party/marian-dev](https://github.com/browsermt/marian-dev) from `8ceb051` to `bb65f47`.
- [Commits](8ceb051b7f...bb65f473d5)

---
updated-dependencies:
- dependency-name: 3rd_party/marian-dev
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-10 16:07:24 +01:00
Nikolay Bogoychev
3c2a667f9b
Try harder to install gperftools 2023-05-04 12:06:20 +01:00
Nikolay Bogoychev
fceb713b27
Update workflows 2023-05-04 11:16:07 +01:00
dependabot[bot]
eb0fe1b583
Bump 3rd_party/marian-dev from 69e27d2 to 8ceb051 (#446)
Bumps [3rd_party/marian-dev](https://github.com/browsermt/marian-dev) from `69e27d2` to `8ceb051`.
- [Release notes](https://github.com/browsermt/marian-dev/releases)
- [Commits](69e27d2984...8ceb051b7f)

---
updated-dependencies:
- dependency-name: 3rd_party/marian-dev
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-04 10:55:15 +01:00
Kenneth Heafield
82c276a15c Fix path to example program 2023-03-01 18:30:38 +00:00
Nikolay Bogoychev
1ba7461a36 Fix compilation on x86 2023-01-19 10:06:57 +00:00
Jelmer
8d5f877596
More portable WASM demo (#437)
* Replace most of the wasm demo page with code from the firefox extension

This code should be more generic and copy/pastable into other projects. Maybe one day it will be an npm package?

* Fix Ukrainian model support

* Add quality estimation output

Automatically enabled when the model(s) support it

* Little "Translating…" indicator

* Don't make Safari fail on something tiny

* Rewire lots of async state to be able to predictably know when the translator is working or not

Previously so much was lazy loaded that it was not easy to catch lack of SIMD support. Now I can just enable the interface only after it has properly loaded.

* No need for a two-stage setup for the worker. Just promise to call `initialize()`!

* More (correct) types and comments for code

* Keyboard shortcuts for input area for bold, italic and underline.

Enough to demo mark-up translation

* Fix `delete()`

* Move javascript glue code into its own npm package

* Add nodejs support and test to package

* More stand-alone build command

…for now, not really used by anything I think

* Ignore build packages

* Use local filesystem for build so it is automatically cached

* fix overflow on demo page

But this might break the mobile demo? I'll have to check into that

* Bring back integrity check, except for NodeJS for now

* Make `build` part of `prepare` so we always make sure we build a complete package

* Move worker code into its own folder

This way I can mark it as a commonjs module which will help nodejs treat the files the same as WebWorkers do right now. Firefox doesn't implement `{type: 'module'}` yet for WebWorkers.

* Add README

* Fix paths

* Add npm publish automation

* Make sure webpack ignores node compatibility code

* Add missing webpack:ignore around a worker

* Default to getting models from S3

* Separate "loading" and "translating" indicators

* Bump npm package version

* Add credits

* Don't block on the worker loading

* Not just Mozilla, but Bergamot!

* Make individual translation requests cancelable

* Swap button turns vertically when in skyscraper mode

* Make it easier to debug errors from inside the worker

* Don't bork on deleting a failed worker

* Don't bork on calling translate() with a failed worker

* Handle compilation error with more grace

* `contenteditable=true` seems to work better with some browser extensions

Looking at you, Vimium!

* Clean up abort promise

* Bump npm package version

* Remove `workerUrl` option in favour of better webpack support

With that option it was hard for Webpack to figure out dependencies, and it did not enter my worker script for rewriting. With the hardcoded url it does, and with a bit of `new webpack.DefinePlugin({'typeof self': JSON.stringify('object')}),` we can have webpack remove node-specific code on build!

* Bump version

Minor API change hehe

Co-authored-by: Nikolay Bogoychev <nheart@gmail.com>
2023-01-18 19:41:39 +00:00
Jelmer
2834f046dc
Expand the node-test.js example code with documentation (#434)
* Expand the node-test.js example code with documentation

Is there a better way to document code than by providing an annotated & working example of it? Just listing all the exposed methods feels like giving people a box of bricks and expecting them to build a house with it.

* Use @Jerin's feedback to simplify node-test.js explanations

* Use native `console.assert` instead

See #426 for an explanation

* Fix comment

Co-authored-by: Nikolay Bogoychev <nheart@gmail.com>
2023-01-18 19:09:47 +00:00
Nikolay Bogoychev
7d24908959 Apply security update and formatting 2023-01-18 16:46:07 +00:00
Nikolay Bogoychev
6f2659fe59
Arm updated (#443)
* ARM Support using ruy and simd_utils

* Adding ARM build on GitHub CI

* Add workflow and successful build

ssplit-cpp modified to get cross compiled android on GitHub CI working.

* Client side fixes for int8 no shift on ARM [python]

* Revert "Client side fixes for int8 no shift on ARM [python]"

This reverts commit 020af05a8b.

* moving int8shift no-op inside the library

* Bump 3rd-party/marian-dev

* update the marian branch test

* arm backend works

* Latest and greatest clang-format

Co-authored-by: Jerin Philip <jerinphilip@live.in>
2023-01-18 16:31:36 +00:00
dependabot[bot]
620c8b00ec
Bump qs and express in /wasm/test_page (#444)
Bumps [qs](https://github.com/ljharb/qs) to 6.11.0 and updates ancestor dependency [express](https://github.com/expressjs/express). These dependencies need to be updated together.


Updates `qs` from 6.7.0 to 6.11.0
- [Release notes](https://github.com/ljharb/qs/releases)
- [Changelog](https://github.com/ljharb/qs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/ljharb/qs/compare/v6.7.0...v6.11.0)

Updates `express` from 4.17.1 to 4.18.2
- [Release notes](https://github.com/expressjs/express/releases)
- [Changelog](https://github.com/expressjs/express/blob/master/History.md)
- [Commits](https://github.com/expressjs/express/compare/4.17.1...4.18.2)

---
updated-dependencies:
- dependency-name: qs
  dependency-type: indirect
- dependency-name: express
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-01-18 15:30:58 +00:00
Nikolay Bogoychev
6cefc4302d Latest and greatest clang-format 2023-01-18 12:48:53 +00:00
Nikolay Bogoychev
21eff44513
try to update coding_styles workflow 2023-01-17 16:53:38 +00:00
Nikolay Bogoychev
06c31af0fe
update download path 2023-01-17 16:43:19 +00:00
Graeme Nail
7f79128900
MacOS Wheels (#432)
* Remove trailing whitespace
* Additional MacOS wheels: Wheels for python 3.6 to 3.10 with a 
   minimum target of MacOS 10.9
* Install bergamot package from wheel directory
* Remove no-index as we need dependencies
2022-06-29 22:46:24 +01:00
Jerin Philip
84c761bacd
Python: Work offline if models are available (#431)
First check whether models.json has already been downloaded; if it has, use it.
If not, fall back to attempting to fetch it from the network.

Fixes: #430
2022-06-25 18:32:50 +01:00
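The local-first lookup described in the commit above could look roughly like this. It is a minimal sketch under stated assumptions, not the package's actual code: `load_inventory`, the injectable `fetch` callable, and the choice to save the fetched copy for later runs are all hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def load_inventory(cache_file: Path, fetch: Callable[[], str]) -> dict:
    """Use a previously downloaded models.json if present, else fetch it.

    `fetch` is a hypothetical callable returning the JSON text from the network.
    """
    if cache_file.exists():
        # Offline path: no network access is attempted.
        return json.loads(cache_file.read_text(encoding="utf-8"))
    text = fetch()
    # Assumption: keep a copy so later runs can work offline.
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(text, encoding="utf-8")
    return json.loads(text)


# Demo with a fake fetcher so nothing touches the network.
calls = []


def fake_fetch() -> str:
    calls.append(1)
    return '{"models": []}'


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "models.json"
    first = load_inventory(cache, fake_fetch)   # cache miss: fetches and stores
    second = load_inventory(cache, fake_fetch)  # cache hit: fetcher is not called
```

Passing the fetcher in as a parameter keeps the offline check testable without a real download.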
Jerin Philip
3ef85e12be
Python package: pyyaml >= 5.1 (#429)
Fixes issue on Colab which says vanilla YAML install (3.x) does not have
yaml.FullLoader (https://stackoverflow.com/a/55553392/4565794).

Fix a broken link for presentation in PyPI.
2022-06-24 08:57:39 +01:00
Jerin Philip
05a8778497
Bump version to 0.4.5 (#427) 2022-06-21 17:49:07 +01:00
Jerin Philip
8771078177
Basic HTML property testing for WebAssembly (#425)
Import
https://gist.github.com/jelmervdl/a4c8b6b92ad88a885e1cbd51c6ad4902 and
attach it to CI. NodeJS-14 fails when trying to use the WebAssembly
binary, so we use a separately set up node-16. This paves the way for more
complicated testing of the WebAssembly bindings in the future.
2022-06-21 14:07:17 +01:00
Jerin Philip
61d2c35dbd
Set up python packaging for pypi distribution (#424)
The old GitHub CI jobs that explicitly used Ubuntu and MacOS to build wheels
have been removed in favour of the more portable pypa specified builds. These
wheels should work just as well across a wider range of distributions.

pybind11:CMakeLists.txt requires Development.Module instead of
Development.* to avoid Embed from getting in the way of manylinux
builds.

manylinux_x86_64 builds are added for cp3.6 - 3.10. The linux build
uses an old image via docker. Since the docker images can use a
shared ccache folder, builds are quite fast on warm starts.

ccache usage in setup.py is now triggered by an environment variable.
This allows builds not to fail if ccache is not present.

On tag pushes corresponding to versions, CI is configured to deliver
built wheels to PyPI, reading from repository secrets.

Improves setup.py including documentation and some formatting, and
additional links to source.

Fixes: #315
2022-06-20 14:35:29 +01:00
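The opt-in ccache behaviour mentioned in the commit above might be wired up along these lines. The commit does not name the environment variable, so `USE_CCACHE` and the helper `ccache_cmake_args` are hypothetical. The `CMAKE_<LANG>_COMPILER_LAUNCHER` variables are standard CMake options, but whether setup.py uses them is an assumption.

```python
import os
from typing import List, Mapping


def ccache_cmake_args(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return extra CMake arguments that enable ccache only when opted in.

    Hypothetical sketch: USE_CCACHE is an assumed variable name. Without it,
    no ccache flags are added, so builds on machines lacking ccache still work.
    """
    if env.get("USE_CCACHE") != "1":
        return []
    return [
        "-DCMAKE_C_COMPILER_LAUNCHER=ccache",
        "-DCMAKE_CXX_COMPILER_LAUNCHER=ccache",
    ]


print(ccache_cmake_args({}))                   # prints: []
print(ccache_cmake_args({"USE_CCACHE": "1"}))
```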
dependabot[bot]
ad781656fe
Bump 3rd_party/marian-dev from 199201e to e88c1aa (#416)
Bumps [3rd_party/marian-dev](https://github.com/browsermt/marian-dev) from `199201e` to `e88c1aa`.
- [Release notes](https://github.com/browsermt/marian-dev/releases)
- [Commits](199201eb89...e88c1aa5d5)

---
updated-dependencies:
- dependency-name: 3rd_party/marian-dev
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-18 16:17:53 +01:00
Abhishek Aggarwal
5ae1b1ebb3
Bump version to 0.4.4 (#415) 2022-04-28 16:24:13 +02:00
Abhishek Aggarwal
e34420647d
Upgrade emsdk to 3.1.8 (#414)
* Rework WASM compilation options

Necessary to work with newer versions of emscripten that are more picky about which option goes to the compiler, and which to the linker. Also took the opportunity to remove the need for the patching of the bergamot-translation-worker.js file, this can now easily be done through supported apis. Furthermore, I tried to downsize the generated javascript and wasm code a bit.

Initial estimates show that bergamot-translator compiled with emscripten 3.0.0 runs at about 3x the speed of 2.0.9 (when using embedded intgemm). Speed-up when using mozIntGemm is less dramatic.

* Updated marian-dev submodule
* Revert changes specific to patching external gemm modules for wasm
* Better Compilation and Link flags

 - Added "-O3" optimization flag for linking as well
 - "-g2" only for release and debug builds
 - "-g1" for release builds
 - Replaced deprecated "--bind" flag with "-lembind"
 - Removed redundant link flag

* Upgraded emsdk to 3.1.8
* Enclosed EXPORTED_FUNCTIONS values in a list
* Fixed the remaining 2.0.9 reference in circle ci build script
* Updated README

Co-authored-by: Jelmer van der Linde <jelmer@ikhoefgeen.nl>
2022-04-20 00:39:32 +01:00
Jerin Philip
98af5945c5
Update and fix windows CI (#410)
* Use a more vanilla windows workflow from translateLocally, remove the
complicated lukka/*. Also removes port overrides in the overall upgrade.
* Disable vcpkg binary caching
* Remove PCRE library hacks after upstream ssplit improvements
2022-04-15 08:56:31 +01:00
dependabot[bot]
f18a8835fa
Bump 3rd_party/ssplit-cpp from a08d6bc to 49fde6d (#408)
Bumps [3rd_party/ssplit-cpp](https://github.com/browsermt/ssplit-cpp) from `a08d6bc` to `49fde6d`.
- [Release notes](https://github.com/browsermt/ssplit-cpp/releases)
- [Commits](a08d6bce20...49fde6df7e)

---
updated-dependencies:
- dependency-name: 3rd_party/ssplit-cpp
  dependency-type: direct:production
...
2022-04-14 11:25:51 +01:00
Jelmer
df5db52513
Fix call to isspace (#396)
Documentation is explicit about only calling it with unsigned char, and Windows runtime is checking this.
2022-03-31 12:12:33 +01:00
dependabot[bot]
7d51d109f7
Bump bergamot-translator-tests from d03a9d3 to 7984d14 (#394)
Bumps [bergamot-translator-tests](https://github.com/browsermt/bergamot-translator-tests) from `d03a9d3` to `7984d14`.
- [Release notes](https://github.com/browsermt/bergamot-translator-tests/releases)
- [Commits](d03a9d316d...7984d140ae)

---
updated-dependencies:
- dependency-name: bergamot-translator-tests
  dependency-type: direct:production
...
2022-03-30 09:41:15 +01:00
Abhishek Aggarwal
d2e3a82622
Bump version to 0.4.3 (#392) 2022-03-28 18:03:43 +02:00