* Replace most of the wasm demo page with code from the firefox extension
This code should be more generic and copy/pastable into other projects. Maybe one day it will be an npm package?
* Fix Ukrainian model support
* Add quality estimation output
Automatically enabled when the model(s) support it
* Little "Translating…" indicator
* Don't make Safari fail on something tiny
* Rewire lots of async state to be able to predictably know when the translator is working or not
Previously so much was lazy loaded that it was not easy to catch lack of SIMD support. Now I can just enable the interface only after it has properly loaded.
* No need for a two-stage setup for the worker. Just promise to call `initialize()`!
* More (correct) types and comments for code
* Keyboard shortcuts for input area for bold, italic and underline.
Enough to demo mark-up translation
* Fix `delete()`
* Move javascript glue code into its own npm package
* Add nodejs support and test to package
* More stand-alone build command
…for now, not really used by anything I think
* Ignore build packages
* Use local filesystem for build so it is automatically cached
* fix overflow on demo page
But this might break the mobile demo? I'll have to check into that
* Bring back integrity check, except for NodeJS for now
* Make `build` part of `prepare` so we always make sure we build a complete package
* Move worker code into its own folder
This way I can mark it as a CommonJS module, which makes Node.js treat the files the same way WebWorkers do right now. Firefox doesn't implement `{type: 'module'}` yet for WebWorkers.
* Add README
* Fix paths
* Add npm publish automation
* Make sure webpack ignores node compatibility code
* Add missing webpack:ignore around a worker
* Default to getting models from S3
* Separate "loading" and "translating" indicators
* Bump npm package version
* Add credits
* Don't block on the worker loading
* Not just Mozilla, but Bergamot!
* Make individual translation requests cancelable
* Swap button turns vertically when in skyscraper mode
* Make it easier to debug errors from inside the worker
* Don't bork on deleting a failed worker
* Don't bork on calling translate() with a failed worker
* Handle compilation error with more grace
* `contenteditable=true` seems to work better with some browser extensions
Looking at you, Vimium!
* Clean up abort promise
* Bump npm package version
* Remove `workerUrl` option in favour of better webpack support
With that option it was hard for Webpack to figure out dependencies, and it did not enter my worker script for rewriting. With the hardcoded url it does, and with a bit of `new webpack.DefinePlugin({'typeof self': JSON.stringify('object')}),` we can have webpack remove node-specific code on build!
* Bump version
Minor API change hehe
Co-authored-by: Nikolay Bogoychev <nheart@gmail.com>
* Expand the node-test.js example code with documentation
Is there a better way to document code than by providing an annotated & working example of it? Just listing all the exposed methods feels like giving people a box of bricks and expecting them to build a house with it.
* Use @Jerin's feedback to simplify node-test.js explanations
* Use native `console.assert` instead
See #426 for an explanation
* Fix comment
Co-authored-by: Nikolay Bogoychev <nheart@gmail.com>
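One way to picture the "make individual translation requests cancelable" item above is a queue in which each pending request gets its own cancel handle. This is only a minimal sketch: `PendingRequests`, `enqueue`, and `take` are hypothetical names invented for illustration and are not the package's actual API.

```javascript
// Hypothetical queue of pending translation requests. All names here are
// illustrative assumptions, not taken from the actual npm package.
class PendingRequests {
  constructor() {
    this.nextId = 0;
    this.queue = [];
  }

  // Queue a request and hand back a handle that cancels just this one.
  enqueue(text) {
    const request = { id: this.nextId++, text, cancelled: false };
    this.queue.push(request);
    return {
      id: request.id,
      cancel: () => { request.cancelled = true; },
    };
  }

  // Return the next request that has not been cancelled, or null if none.
  take() {
    while (this.queue.length > 0) {
      const request = this.queue.shift();
      if (!request.cancelled) return request;
    }
    return null;
  }
}

const pending = new PendingRequests();
const first = pending.enqueue('Hello');
const second = pending.enqueue('World');
first.cancel();

// The cancelled 'Hello' request is skipped when work is handed out.
const dispatched = pending.take();
console.log(dispatched.text);
```

Cancelling only flips a flag, so other queued requests are unaffected and the worker never sees the cancelled text.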
* ARM Support using ruy and simd_utils
* Adding ARM build on GitHub CI
* Add workflow and successful build
ssplit-cpp modified to get the cross-compiled Android build working on GitHub CI.
* Client side fixes for int8 no shift on ARM [python]
* Revert "Client side fixes for int8 no shift on ARM [python]"
This reverts commit 020af05a8b.
* moving int8shift no-op inside the library
* Bump 3rd-party/marian-dev
* update the marian branch test
* arm backend works
* Latest and greatest clang-format
Co-authored-by: Jerin Philip <jerinphilip@live.in>
* Remove trailing whitespace
* Additional MacOS wheels: Wheels for python 3.6 to 3.10 with a
minimum target of MacOS 10.9
* Install bergamot package from wheel directory
* Remove no-index as we need dependencies
Fixes an issue on Colab where the vanilla YAML install (3.x) does not have
yaml.FullLoader (https://stackoverflow.com/a/55553392/4565794).
Fix a broken link for presentation in PyPI.
Import
https://gist.github.com/jelmervdl/a4c8b6b92ad88a885e1cbd51c6ad4902 and
attach it to CI. NodeJS-14 fails when trying to use the WebAssembly
binary, so we use an independently set-up node-16. This paves the way for
more complicated testing of WebAssembly bindings in the future.
The old GitHub CI workflows that explicitly used Ubuntu and MacOS to build
wheels have been removed in favour of the more portable pypa-specified builds.
These wheels should work just as well across a wider range of distributions.
pybind11:CMakeLists.txt requires Development.Module instead of
Development.* to avoid Embed from getting in the way of manylinux
builds.
manylinux_x86_64 builds are added for cp3.6 - 3.10. The linux build
uses an old image via docker. Since the docker images are able to use a
shared ccache folder, it builds quite fast on warm starts.
ccache usage in setup.py is now triggered by an environment variable.
This allows builds not to fail if ccache is not present.
On tag pushes corresponding to versions, CI is configured to deliver
built wheels to PyPI, reading from repository secrets.
Improves setup.py including documentation and some formatting, and
additional links to source.
Fixes: #315
* Rework WASM compilation options
Necessary to work with newer versions of emscripten that are more picky about which option goes to the compiler, and which to the linker. Also took the opportunity to remove the need for patching the bergamot-translation-worker.js file; this can now easily be done through supported APIs. Furthermore, I tried to downsize the generated javascript and wasm code a bit.
Initial estimates show that bergamot-translator compiled with emscripten 3.0.0 runs at about 3x the speed of 2.0.9 (when using embedded intgemm). Speed-up when using mozIntGemm is less dramatic.
* Updated marian-dev submodule
* Revert changes specific to patching external gemm modules for wasm
* Better Compilation and Link flags
- Added "-O3" optimization flag for linking as well
- "-g2" only for release and debug builds
- "-g1" for release builds
- Replaced deprecated "--bind" flag with "-lembind"
- Removed redundant link flag
* Upgraded emsdk to 3.1.8
* Enclosed EXPORTED_FUNCTIONS values in a list
* Fixed the remaining 2.0.9 reference in circle ci build script
* Updated README
Co-authored-by: Jelmer van der Linde <jelmer@ikhoefgeen.nl>
* Use a more vanilla windows workflow from translateLocally, remove the
complicated lukka/*. Also removes port overrides in the overall upgrade.
* Disable vcpkg binary caching
* Remove PCRE library hacks after upstream ssplit improvements
Fixes the docs workflow, which is failing since pip started picking up Jinja 3.20.
We only need >=2.3; this pins it to 3.0.3, the last version with which builds were successful.
Quality scores for HTML translation exposed as <font
x-bergamot-sentence-score=""> and <font x-bergamot-word-score=""> tags
in the HTML output. While this increases the size of the HTML returned,
the resulting rendered HTML can easily be styled to show the scores.
With Javascript or CSS, developers can easily have some interface based
on these extra attributes.
Also includes updates to the test page to show a proof-of-concept
demonstration.
Fixes: #355
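As a rough illustration of how a consumer might read these attributes, the sketch below pulls word scores out of an HTML string. The sample markup, the numeric score values, the threshold, and the helper name `extractWordScores` are all assumptions made for the example; only the tag and attribute names come from the description above.

```javascript
// Collect { word, score } pairs from <font x-bergamot-word-score="..."> tags.
// Assumes the attribute holds a plain number and the tag wraps plain text.
function extractWordScores(html) {
  const pattern = /<font\s+x-bergamot-word-score="([^"]*)"\s*>([^<]*)<\/font>/g;
  const results = [];
  for (const match of html.matchAll(pattern)) {
    results.push({ word: match[2], score: Number.parseFloat(match[1]) });
  }
  return results;
}

// Made-up sample output; real scores and nesting may differ.
const sample =
  '<font x-bergamot-sentence-score="-0.2">' +
  '<font x-bergamot-word-score="-0.1">Hallo</font> ' +
  '<font x-bergamot-word-score="-1.5">Welt</font></font>';

// Hypothetical threshold below which a word is flagged for the reader.
const lowConfidence = extractWordScores(sample).filter((entry) => entry.score < -1.0);
console.log(lowConfidence.map((entry) => entry.word));
```

In a browser, `document.querySelectorAll('[x-bergamot-word-score]')` would be the more robust way to find the same elements; the regular expression is used here only to keep the sketch runnable without a DOM.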
Deprecates cacheEnabled parameter to be replaced with cacheSize=0.
Python bindings, Documentation in comments and tests updated to reflect
this change.
Exposes the fields corresponding to cache via embind as a value object.
The equivalent object-based syntax in worker.js allows propagation
from JS.
Fixes: #351
See also: mozilla/firefox-translations#96
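For callers still passing the deprecated flag, a small shim along these lines could translate old-style options into the new form. This is a sketch under assumptions: `normalizeCacheOptions` is a hypothetical helper, and the shape of the options object is inferred from the parameter names mentioned above rather than from the actual bindings.

```javascript
// Map the deprecated `cacheEnabled` flag onto `cacheSize`.
// Hypothetical helper; the options shape is assumed for illustration.
function normalizeCacheOptions(options) {
  const { cacheEnabled, ...rest } = options;
  if (cacheEnabled === false) {
    // A disabled cache is now expressed as a cache of size zero.
    return { ...rest, cacheSize: 0 };
  }
  // Flag absent or true: drop it and leave any explicit cacheSize untouched.
  return rest;
}

const legacy = normalizeCacheOptions({ cacheEnabled: false, cacheSize: 2000 });
const modern = normalizeCacheOptions({ cacheEnabled: true, cacheSize: 2000 });
console.log(legacy, modern);
```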
- Prefer spreading markup over a full word.
- Ignore certain tags whose contents are unlikely to be meant for translation,
such as `<code>` and `<samp>`.
- Never treat `<wbr>` as a space.
- Allow for inconsistent cases in tag names.
- Fix bug where void elements were inserted multiple times.
- Better handling of whitespace around punctuation.
- Ignore parsing `<noscript>` to be compatible with Firefox.
- Improvements to documentation and readability of `HTML` and `Scanner`
classes.
Fixes: #313, #339
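Three of the rules above (case-insensitive tag names, skipping `<code>`/`<samp>`, and `<wbr>` never counting as a space) can be sketched as a tiny classifier. The actual `HTML` and `Scanner` classes are C++; this JavaScript version, its `classifyTag` name, and its return labels are assumptions made purely for illustration, and the ignore list only contains the two tags named above.

```javascript
// Tags whose contents are left untranslated. Only the two examples named
// in the list above are included; the real list may be longer.
const IGNORED_TAGS = new Set(['code', 'samp']);

// Classify a tag name, tolerating inconsistent casing such as <CODE> or <Wbr>.
function classifyTag(tagName) {
  const name = tagName.toLowerCase();
  if (IGNORED_TAGS.has(name)) return 'skip-translation';
  if (name === 'wbr') return 'no-space'; // a word-break hint, never a space
  return 'translate';
}

console.log(classifyTag('CODE'), classifyTag('Wbr'), classifyTag('p'));
```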
* Create github release via circleci only for mozilla fork
- The extension uses mozilla fork for translator artifacts
-- Hence create github release via circleci only when
running in mozilla fork
* Small refactoring in ci script
Hide `cache-mutex-buckets` from the user. Now configured to be equal to number
of workers. Python bindings which had exposed these are modified to reflect
the API change. `std::optional` enabled on cache, constructed only if enabled.
Pointers used are replaced with an equivalent `std::optional`.
Fixes: #317
Fixes memory leak
ifdef for -fno-exceptions including clang-cl
Move spacing back to intgemm upstream
Co-authored-by: Jerin Philip <jerin.philip@research.iiit.ac.in>
Changes signature of BlockingService::{translate,pivot}Multiple
functions to take per input options, so a mix of HTML and plaintext
can be sent from the extension. Templating over testing is adjusted
to allow for continuous evaluations by modifying the test code.
Updates WebAssembly bindings to reflect the change in signature
and the javascript test-page to work with the new bindings.
This change lacks an accompanying test specific to the mixed HTML
and plaintext inputs.
Fixes: #345
See also: mozilla/firefox-translations#94
Co-authored-by: Jelmer van der Linde <jelmer@ikhoefgeen.nl>
Changes `ABORT` on non `.bin` model to an additional check for a `.npz`
extension. If `.bin`, the fast load path is activated by returning `AlignedMemory`.
Otherwise, the return of empty `AlignedMemory` causes fallback to
filesystem-based loads.
BRT: A test is checked in that verifies translation using `.npz` is approximately
similar to that of default CLI translation, to ensure stability going ahead.
Previously, we only supported loading `.bin` models via a fast mmap
path. While we had the underlying capability to load non-`.bin` models, this
was not exposed, in order to encourage fast loads. Loading `.npz` models is
helpful for quick debugging and broader coverage of available models, which will
enhance user experience in translateLocally and the python bindings.
Fixes: #341
See also: XapaJIaMnu/translateLocally#89
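The extension check described above can be read as a three-way decision. The real logic lives in C++ and returns `AlignedMemory`; the JavaScript below is only a sketch of that decision, with `chooseLoadPath` and its return labels invented for illustration, and with the assumption that any extension other than `.bin` or `.npz` is still rejected.

```javascript
// Decide how a model file would be loaded, based on its extension.
// Hypothetical helper mirroring the description above, not the C++ code.
function chooseLoadPath(modelPath) {
  if (modelPath.endsWith('.bin')) {
    return 'aligned-memory'; // fast path: contents handed over as aligned memory
  }
  if (modelPath.endsWith('.npz')) {
    return 'filesystem'; // empty aligned memory, so loading falls back to the filesystem
  }
  // Assumption: anything else is still treated as an error.
  throw new Error(`Unsupported model extension: ${modelPath}`);
}

console.log(chooseLoadPath('model.bin'), chooseLoadPath('model.npz'));
```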
This reverts commit 62ff781ed4.
Sorry, I should have realized Jerin was only amending python and
that this therefore didn't break WASM.
Apologies to Jerin on this.