* Change WASM_COMPATIBLE_SOURCE=OFF by default
The default was WASM_COMPATIBLE_SOURCE=ON COMPILE_WASM=OFF which is a
testing configuration, not a sensible default for native or wasm.
* Always USE_WASM_COMPATIBLE_SOURCE with COMPILE_WASM
* Set CMP0077 to fix variable handling
* Update marian-dev to the newest mac version
* Attempt windows workflow
* force workflow rerun
* Separate id
* Attempt 3 at github action
* Marian dev submodule now compiles with apple clang
* Updated ssplit version to something more recent
* Attempt to fix compile on wasm
* Do not compile subproject tests
* Fix emscripten compilation on Mac
* 99% on the way to windows compile
* Try with a different generator
* Build release not debug
* Revert CMakeLists.txt hacks
* Fix sse2 compilation failure
* MSVC settings for WIN32
* Add nodefaultlib LIBCMT
* Do not compile ssplit.cpp as it contains sys/mman.h
* Revert ab56b9aa4f
* Update paths
* Set the build type to release if not set previously
* Attempt to build release with the windows workflow
* Attempt 5 at VS studio release build
* Attempt 6 at getting release build on MSVC generator
* The windows build is debug at the moment...
* fix ssplit for ubuntu 16.04
* Fix compilation with clang
* Compile on ubuntu16.04
* Explain what is going on
* Updated ssplit and workflow
* Updated marian-dev submodule
- cmake changes required after the submodule update
* Added workflows for building custom marian on mac and ubuntu
* Renamed cmake option
- Renamed USE_WASM_COMPATIBLE_SOURCES to USE_WASM_COMPATIBLE_SOURCE
- Use proper compile definitions
* Draft adjustments to API
* Adjustments to docs
* Let's call the word + sentence ranges annotations
* Editing confusing comment on size()
* Fixing compilation for template adjustments for SentenceRanges
* string_view template hacks
This commit shifts AnnotatedBlob into a templated type and gets the
troubled part to compile. All to manage absl::string_view and
std::string_view.
Objective: marian::bergamot stays C++ 11 to pluck and put in marian
code, bergamot-translator somehow flexes C++17. Simplify development in
one place.
* Fixing the wiring: Gets source to build
Runtime errors exist, but AnnotatedBlobs are consistent.
* Bugfix: Matching old-state after factoring AnnotatedBlob in
* Removing vocabs_ from Response.
(For the umpteenth time).
* Alignment API ready in marian::bergamot::Response
* Wiring alignments up to TranslationResult
* Adjustment to get alignments; bergamot-translator-app has alignments available
* Accessing words instead of Ids
This code sets up access of word string_views from annotations instead
of printing Ids. However, there is a segfault. This is likely due to
targetRanges not being set, pending from
https://github.com/browsermt/bergamot-translator/issues/25.
Could also be a rogue EOS token which we're filtering for in string_view
annotations, but not so in alignments.
* Switching to browsermt/marian-dev@jp/decode-string-view for targetTokenRanges
* Target word byte range annotations available
Issues corresponding to #25 should be resolved. There is still a
segfault. Could be due to EOS. Pending investigation.
* Bugfix: Tokens for alignments are now through.
Was not EOS.
* browsermt/marian-dev@master
ByteRange changes work downstream and have been merged to master.
Updating submodule to point to master.
* Style and documentation enhancements: response.cpp
* Style and documentation enhancements: TranslationResult.h
* Descriptions for SentenceRanges templating
* Switching to marian-dev@wasm-sync
* AnnotatedBlob can be copy-ctord/copy-assigned
* TranslationResult: Empty ctor + WASM Bindings
Allows empty construction of TranslationResult. Using this empty
constructor, WASM bindings are adjusted. Unsure of the results, maybe
@abhi-agg can test.
* Cosmetic: SentenceRangesT -> Annotation
- SentenceRangesT is renamed to AnnotationT;
- Further comments to explain heavily templated files.
* Response: Cleaning up unused members and adding docs
* Adding quality scores - attempt
* Stub QualityScores
This adjustment adds the capability to get "scores", which should
indicate how confident the model is in each word (at least relative to
the rest of a target sentence). This enables writing the code forward
for TranslationResult, and provides an example quality score people can
be pointed at.
- These are not between [0,1] yet.
- In addition, guards to check out-of-bounds access have been placed so
illegal accesses are caught early on during development.
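A minimal sketch of the guarded access described above, not the code in this commit. The class name QualityScores, its members, and the use of std::out_of_range are assumptions made for illustration; the stored values are left unnormalised, matching the note that scores are not in [0,1] yet.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stub: holds one raw score per target word, grouped by sentence.
// Scores are stored as given and are NOT normalised to [0,1].
class QualityScores {
public:
  // Append the per-word scores of one target sentence.
  void addSentence(const std::vector<float> &wordScores) {
    scores_.push_back(wordScores);
  }

  std::size_t numSentences() const { return scores_.size(); }

  // Guarded accessor: an illegal index throws instead of reading out of
  // bounds, so mistakes surface early during development.
  float wordScore(std::size_t sentenceIdx, std::size_t wordIdx) const {
    if (sentenceIdx >= scores_.size()) {
      throw std::out_of_range("sentence index out of bounds");
    }
    const std::vector<float> &sentence = scores_[sentenceIdx];
    if (wordIdx >= sentence.size()) {
      throw std::out_of_range("word index out of bounds");
    }
    return sentence[wordIdx];
  }

private:
  std::vector<std::vector<float>> scores_;
};
```

Throwing keeps the guard active in release builds; an assert-based guard would be an equally reasonable choice if the checks should vanish outside development.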
* Removing token debug statements
* Reworking Annotation without templates
https://github.com/mozilla/bergamot-translator/issues/8 provides
ByteRanges.
- This ByteRange data-type is used in Annotation and converted
to marian::string_view(=absl::string_view) on demand.
- Since Annotation[using ByteRange] is not bound to anything else, it
can be unit tested. A unit test is added (originally to test
independently for integration after).
- Annotation with ByteRange is now propagated across marian::bergamot
and functionality matched to how it was previously working.
This eliminates the string-view conversion and template code.
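The idea can be sketched roughly as follows. This is an illustrative assumption, not the layout used in this commit: the member names, the half-open [begin, end) convention, and the helper wordText are all hypothetical. The sketch also returns a std::string copy to stay within plain C++11, where the real code produces a marian::string_view.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical ByteRange: half-open byte interval [begin, end) into a text.
struct ByteRange {
  std::size_t begin;
  std::size_t end;
  std::size_t size() const { return end - begin; }
};

// Hypothetical Annotation: stores only ByteRanges, grouped by sentence.
// It holds no pointer into the text, so it can be unit tested on its own.
class Annotation {
public:
  void addSentence(const std::vector<ByteRange> &words) {
    sentences_.push_back(words);
  }
  std::size_t numSentences() const { return sentences_.size(); }
  std::size_t numWords(std::size_t sentenceIdx) const {
    return sentences_.at(sentenceIdx).size();
  }
  const ByteRange &word(std::size_t sentenceIdx, std::size_t wordIdx) const {
    return sentences_.at(sentenceIdx).at(wordIdx);
  }

private:
  std::vector<std::vector<ByteRange>> sentences_;
};

// Convert a stored range to text on demand. A copy is returned here;
// a string_view over `text` would avoid the copy.
inline std::string wordText(const std::string &text, const Annotation &annotation,
                            std::size_t sentenceIdx, std::size_t wordIdx) {
  const ByteRange &range = annotation.word(sentenceIdx, wordIdx);
  assert(range.begin <= range.end && range.end <= text.size());
  return text.substr(range.begin, range.size());
}
```

For the text "hello world" annotated with the ranges [0, 5) and [6, 11), wordText(text, annotation, 0, 1) yields "world". Because the annotation holds offsets rather than views, it remains valid if the text is copied or moved.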
* Nit: Removing std::endl flushes
* Bring TranslationResult and Response closer
Helps https://github.com/browsermt/bergamot-translator/issues/53.
In preparation, the data-export types for Quality and Alignment are
pushed down to Response from TranslationResult and computed during
construction. This brings TranslationResult closer to Response, paving
the way to avoid having two TranslationResults.
histories_ only remain for marian-decoder replacement usage, which can
be removed in a separate PR.
* Clean up hacks originally added for a unit-test to compile
* Moving Annotation functions to cpp and documenting header file
* Shifting alignments, qualityScore testing capability into main-mts
* Restore Unified API files to previous state
* Adaptations to fix Response with Quality, Alignments to connect to old Unified API
* Missing reset on TranslationResultBindings
* Cleaning up Response documentation to reflect newer code
* Minor adjustments to get build back after main sync
* Marian seems to make available Catch somehow
* Disable COMPILE_BERGAMOT_TESTS for WASM
* Add COMPILE_BERGAMOT_TESTS as a CMakeDependent option
* Use the COMPILE_TESTS flag instead to skip macos.yml
* Trigger unit-tests on GitHub runners for Annotation
* Reordering enable_testing() to before inclusion of test directory
* doc constructs required to operate with alignments
Adds doxygen-compatible documentation for Response,
AnnotatedBlob, Annotation, ByteRange.
Incorporates doxygen compatible documentation for
* Updates ByteRange consistent with general C++
Also little documentation enhancements in the process.
* Updating marian-dev@9337105
* Copy-paste documentation because lazy
* Turn off autoformat and manually edit to fix style changes
* AnnotatedBlob -> AnnotatedText; blob -> text
* text.text in test app renamed
* text of text -> blob of text in places of documentation
- USE_WASM_COMPATIBLE_MARIAN=off will start using vanilla Marian
i.e. with full threading support, with exceptions, with MKL
- Changed the relevant documentation
CMakeLists have been modified with the necessary includes to add
browsermt/mts@nuke files to the bergamot-translator library. In
addition, adds the ssplit dependency, corresponding includes.
Intel MKL fails on compilation, unable to find libraries. To solve this
3rd_party/CMakeLists.txt is modified with @ug's fixes to propagate
variables (EXT_LIBS, etc) at a library level.
- Contains classes for the API specification (doc/Unified_API.md)
- Things to be changed/decided later:
Use of std::string_view to represent ranges
Adding Alignment information
Basic Setters and Getters for some of the classes