bergamot-translator

mirror of https://github.com/browsermt/bergamot-translator.git synced 2024-09-17 16:47:18 +03:00

Author	SHA1	Message	Date
Abhishek Aggarwal	9747d9ba83	Add cmake option to compile project on WASM - Set cmake option COMPILE_WASM to ON to compile the project on WASM	2021-02-11 15:34:27 +01:00
Abhishek Aggarwal	a2d3269344	Updated ssplit submodule	2021-02-10 11:27:16 +01:00
Abhishek Aggarwal	584700ce91	Changed translate() API from non-blocking to blocking - Can be changed back to non-blocking once blocking API becomes integrable via WASM port in browser	2021-02-10 11:15:16 +01:00
Abhishek Aggarwal	5683168a8d	Updated ssplit submodule to a different repository - Added abhi-agg/ssplit-cpp - Added its wasm branch in bergamot-translator - Native builds of bergamot-translator are successful -- Sentence splitting is NOT WORKING -- Only translation is working	2021-02-10 10:33:01 +01:00
Abhishek Aggarwal	47b4bae268	Changed encodePreservingSource -> encodeWithByteRanges - This change happened because marian submodule changed this name - Native builds are working fine -- bergamot-translator-app output is consistent	2021-02-09 15:37:29 +01:00
Abhishek Aggarwal	9a54d2116c	Updated marian-dev submodule - Switch to "wasm" branch of browsermt/marian-dev	2021-02-08 13:46:59 +01:00
Jerin Philip	2929077324	Reordering git submodule update before includes	2021-02-02 14:41:26 +00:00
Jerin Philip	548c8880ff	CMake updates submodules	2021-02-02 14:39:19 +00:00
Jerin Philip	e76a602dc7	Removing config file printing	2021-01-28 21:44:05 +00:00
Jerin Philip	9a17f365c6	Fix for garbled output through cli. Requirement for string_view is that the original source string be transferred all the way from input to service and back to TranslationResult. This constraint was violated in several places by the existence of a copy-constructor. The issue is fixed by deleting copy and assignment constructors in marian::bergamot::TranslationResult and UnifiedAPI::TranslationResult, which revealed a few occurrences of the same. Replaced the same with move semantics. In addition, future is set and get using move semantics at the moment. Default move-constructor didn't seem to be working, so they're made explicit for TranslationResults. This commit additionally packs a few deletions and improvements made to improve structure (textops.cpp, batcher.cpp) along the process of inspecting and fixing the garbled outputs. They are chosen to be kept, in the interest of time, against a prettified atomic commit engineering. Combinations of the following commits in jp/string-view-bug [acfc92 78a588 12d91b 00a277 919e2f 9d3a46 b7e39b 18f67b bf667c]	2021-01-26 21:18:15 +00:00
Abhishek Aggarwal	0d16b1957f	Improved main.cpp file - Print original and translated text - Just add 2 vector entries for texts	2021-01-26 14:49:28 +01:00
Abhishek Aggarwal	b49f2c1af3	Cleanup TranslationModelConfiguration to std::string change in API - Provide yaml formatted string as model configuration - Remove redundant files	2021-01-26 11:13:41 +01:00
Abhishek Aggarwal	026f1af887	Removed redundant lines from CMakeFile	2021-01-26 10:46:35 +01:00
Jerin Philip	08a7358c3d	Integrating marian-translator through API Using std::string for config. Now capable of launching marian translator through API interface. There's a sketchy workaround to convert a string config to marian::Options, with an added note.	2021-01-25 22:11:38 +00:00
Jerin Philip	69adc7af77	Changing code-style to clang-format-google	2021-01-24 21:46:47 +00:00
Jerin Philip	cd025e9f65	CI scripts: master -> main	2021-01-23 14:39:08 +00:00
Jerin Philip	7e2eb02e18	CI and Associated Changes Enables Mac and Ubuntu CPU only builds through GitHub CI. CI scripts are copied from marian-dev with necessary changes. 3rd-party/marian-dev is modified to meet C++17 requirements modifying for half_float.	2021-01-23 13:34:04 +00:00
Jerin Philip	988e76baf9	Removing Exception to fix Apple compile	2021-01-22 15:13:30 +00:00
Abhishek Aggarwal	1c3b656852	Removed a redundant directory inclusion in CMakeFile	2021-01-22 15:53:19 +01:00
Abhishek Aggarwal	c8fc004452	Improved 3rd party header inclusion and library linking	2021-01-22 15:47:36 +01:00
Jerin Philip	3b6b9cd2bf	Updating README.md with instructions to run service-cli	2021-01-22 11:51:49 +00:00
Jerin Philip	e75bd7eb57	Adding vim temporary files to .gitignore	2021-01-22 11:31:20 +00:00
Jerin Philip	37143933a1	CMakeLists improvements Only the bergamot-translator library should be linked to main target Any other library (marian ${MARIAN_CUDA_LIB} ${EXT_LIBS} ssplit pcrecpp.a pcre.a) should be linked to bergamot-translator target inside src/translator folder.	2021-01-22 11:29:32 +00:00
Jerin Philip	80125e2789	Removing unused variable in batch_translator	2021-01-21 14:54:30 +00:00
Jerin Philip	12e7e2c650	Fixing compile error, need tests, CI	2021-01-21 14:54:09 +00:00
Jerin Philip	9b18bd9ffc	MTranslationResult, more comments	2021-01-21 02:03:47 +00:00
Jerin Philip	ea1a628cd2	Neaten TextProcessor, add a bit of docs. - Truncating long sentences into those of a specified length for faster processing is now a separate function, for improved readability. - Changes doing push_back -> emplace_back at places to avoid copy. - query_to_segments is renamed as process. - Comments are added in an attempt to bring some sanity.	2021-01-21 01:31:29 +00:00
Jerin Philip	4640ae4091	Fixes copying around vocabs Vocabs were earlier loaded in each thread and copied several times. Modified this to be loaded only once in Service and reference used consistently later on. This change makes Tokenizer as a class rather moot, as there's only one private member and a function. Moved this into TextProcessor. SentenceSplitter, however remains a separate class. utils.{h,cpp} had only a single loadVocabularies function, which is at the moment required only in Service. Making loadVocabularies a function inside Service and getting rid of utils.*.	2021-01-21 00:29:53 +00:00
Jerin Philip	d6ec007df9	TranslationResult Docs Removed Alignments, too many questions and no concrete answers. Better off removing unused code. History is kept for now, for internal use.	2021-01-20 21:58:13 +00:00
Jerin Philip	caa03e1d9f	Removing unused timer.h	2021-01-20 21:21:43 +00:00
Jerin Philip	54a6c6ce80	Moving main (mts) to app/ Commit modifies the example test-code main-mts into the app folder, updating CMakeLists accordingly.	2021-01-20 21:18:20 +00:00
Jerin Philip	d3c707f735	Enhancing service.h further	2021-01-20 21:11:27 +00:00
Jerin Philip	b3f1905a12	Adding documentation and example to service.h	2021-01-20 20:56:50 +00:00
Jerin Philip	b25b2276e3	Undoing LineSplitter, reverting SentenceSplitter. A faster linesplitter added for benchmarks is removed in favour of @ug's ssplit-cpp. NOTE: ssplit-cpp's regex based implementation is slow for one-line parses, which ideally needs to be improved in upstream ssplit-cpp to trivially reduce to a faster newline character based split.	2021-01-20 20:11:07 +00:00
Jerin Philip	bde9094728	Updating CMakeLists to build main CMakeLists have been modified with the necessary includes to add browsermt/mts@nuke files to the bergamot-translator library. In addition, adds the ssplit dependency, corresponding includes. Intel MKL fails on compilation, unable to find libraries. To solve this 3rd_party/CMakeLists.txt is modified with @ug's fixes to propagate variables (EXT_LIBS, etc) at a library level.	2021-01-20 19:52:34 +00:00
Jerin Philip	d786f2554e	Bumping marian with sentencepiece capable fork Modifications to SentencePiece are necessary to provide token level string_views. This commit changes marian to an alternate branch which has the feature incorporated.	2021-01-20 19:14:40 +00:00
Jerin Philip	601bd52716	Import sources from mts adaptation This first commit imports files from mts which was repurposed for bergamot translator from https://github.com/browsermt/mts/tree/nuke.	2021-01-20 19:08:46 +00:00
abhi-agg	0200843ed7	Merge pull request #7 from browsermt/application Updated README and Added a simple Application	2020-12-11 14:44:55 +01:00
abhi-agg	fd897dc4ec	Merge pull request #6 from browsermt/api Use marian::Options class internally for configuration options	2020-11-26 10:11:54 +01:00
Abhishek Aggarwal	f8c9a6b0cc	Added an application showing usage of bergamot translator - 'app' folder contains the application - The application uses dummy requests and responses for now	2020-11-16 15:44:02 +01:00
Abhishek Aggarwal	9478a54628	Improved 3rd party header inclusion - Inclusion now contains explicit names of the 3rd party libraries	2020-11-16 15:14:50 +01:00
Abhishek Aggarwal	cd505c9286	Updated README with 'Build' and 'Use' instructions	2020-11-16 13:09:42 +01:00
Abhishek Aggarwal	ce7312cfd4	Added basic skeleton for Adaptor class - The class adapts the TranslationModelConfiguration to marian::Options - Returns a dummy marian::Options for now	2020-11-12 11:17:34 +01:00
Abhishek Aggarwal	59c940090b	Use marian::Options class internally for configuration options - Marian uses Options class everywhere as configuration options - Owing to this project's heavy dependency on Marian: -- Made the internal implementation files of the project work with marian::Options instead of TranslationModelConfiguration -- An Adaptor class to adapt TranslationModelConfiguration to marian::Options will be added in following commit	2020-11-12 11:04:19 +01:00
abhi-agg	2c1515313e	Merge pull request #5 from browsermt/api Separated the public includes of the project from implementation	2020-11-11 19:44:50 +01:00
Abhishek Aggarwal	210c5a466a	Separated the public includes of the project from implementation - All interfaces are present in ROOT/src	2020-11-11 17:52:27 +01:00
abhi-agg	77abbfa9c7	Merge pull request #4 from browsermt/api Compile marian submodule in the project	2020-11-11 17:25:48 +01:00
Abhishek Aggarwal	358d76871f	Small change: Added New line endings	2020-11-11 17:18:12 +01:00
Abhishek Aggarwal	36911d39d5	Link marian library in the project	2020-11-11 16:24:50 +01:00
Abhishek Aggarwal	a220f915fc	Compile marian submodule in the project - marian compiles successfully and is ready to be used in the project	2020-11-11 16:19:54 +01:00

209 Commits