Commit Graph

213 Commits

Author SHA1 Message Date
Marcin Junczys-Dowmunt
0394d2cdbe
Display decoder speed statistics with --stat-freq N (#841)
Display decoder time statistics if requested
2021-03-22 08:58:04 -07:00
Marcin Junczys-Dowmunt
096c48e51c
remove TC_MALLOC from optional dependencies (#840)
There seems to be no benefit from TC_MALLOC any more, hence removing.
2021-03-22 08:02:04 -07:00
Nikolay Bogoychev
d780082973
Fix model loading on architectures where size_t is 32bits (#825)
* fix model loading on architectures where size_t is 32bit
* Update the changelog

Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>
2021-03-19 15:56:17 +00:00
Marcin Junczys-Dowmunt
a11418c17c
Add simple unit tests for binary files (#826)
* unit tests for binary file operations
* adjust changelog
* Set file_ in TemporaryFile for MSVC

Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>
2021-03-19 13:17:17 +00:00
Roman Grundkiewicz
db2a5e4d66
Fix broken links to MNIST data sets (#838) 2021-03-19 13:16:10 +00:00
Marcin Junczys-Dowmunt
272096c1d1 sync public and internal master 2021-03-18 03:41:24 +00:00
Marcin Junczys-Dowmunt
8f73923d31 increase version and update changelog 2021-03-18 03:34:44 +00:00
Graeme
2352475705
Fix missing float template specialisation for elem::Plus (#822)
* Fix missing float template specialisation for elem::Plus
* Update CHANGELOG.md
2021-03-12 11:58:34 +00:00
Kenneth Heafield
af3aa314d0
Fix OMP compilation (#824)
* Fix omp variable names
2021-03-02 17:26:49 -08:00
Marcin Junczys-Dowmunt
55a7047f8a merge with internal master 2021-03-02 05:15:41 +00:00
Roman Grundkiewicz
8155d232db Update CHANGELOG and VERSION 2021-02-28 09:08:50 +00:00
Graeme
ac71ee8518
Add graph operations documentation (#801)
* Doxygen structure for expression graph operators
* Document arithmetic expression operations
* Document comparison expression operations
* Document exp/log and trig operations
* Add missing implementation for cos/tan
* Document expression manipulation operations
* Document misc math operations
* Overview of operators
* Document activation functions
* Document element-wise min/max
* Document debugging/checkpoint operators
* Document topk/argmin/argmax operations
* Document index-based operations
* Document reduction operations
* Document lambda expression operators
* Document product operations
* Document softmax, cross-entropy, unlikelihood operations
* Document dropout operations
* Document scalar product and weighted average operations
* Document layer normalization, highway and pooling operations
* Document shift expression operator
* Extra details on rules for adding specializations to .inc files
* Add SinNodeOp example for specialization documentation
* Additional details in tensor operator documentation
* Remove brief command from doxygen comments
* Prefer @ style doxygen functions to \
* Document n-ary function macros
* Enable .cu and .inc files in documentation
* Add a comment about ONNX mapping
* Remove empty lines in doxygen
* Update CHANGELOG

Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>
2021-02-28 08:38:32 +00:00
Marcin Junczys-Dowmunt
8ecc8b653f FindMKL modified to find installed mkl in system paths 2021-02-24 05:19:34 +00:00
Roman Grundkiewicz
f88ded2ba8
Add documentation platform based on Sphinx+Doxygen+Breathe+Exhale (#803)
* Setup Sphinx+Doxygen+Breathe+Exhale
* Remove links to modindex and search pages
* Enable Doxygen autobrief
2021-02-23 16:25:30 +00:00
Rihards Krišlauks
b29c50f9b7
Update simple-websocket-server to the latest version (#799)
This adds support for boost 1.75
2021-02-22 13:01:03 +00:00
Martin Junczys-Dowmunt
5aeea4e066 Merged PR 17430: Refactors MPI interfaces and adds different types of gradient exchanges
* Refactors MPI-related code
* Adds node-local updates with occasional inter-node updates
* decouples batch-reading across nodes
2021-02-08 05:27:49 +00:00
Marcin Junczys-Dowmunt
6f6d484665 increase version to 1.10.0 2021-02-06 15:35:16 -08:00
Nikolay Bogoychev
600f5cbdec
Integrate intgemm into marian (#595)
Adds intgemm as a module for Marian. Intgemm is @kpu's 8/16-bit GEMM library with support for architectures from SSE2 to AVX512VNNI.
Removes outdated integer code, related to the --optimize option

Co-authored-by: Kenneth Heafield <github@kheafield.com>
Co-authored-by: Kenneth Heafield <kpu@users.noreply.github.com>
Co-authored-by: Ulrich Germann <ugermann@inf.ed.ac.uk>
Co-authored-by: Marcin Junczys-Dowmunt <marcinjd@microsoft.com>
Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>
2021-01-24 16:02:30 -08:00
Qianqian Zhu
737f43014a
Fix to resolve run time failures for FastOpt enabled WASM builds (#779)
* copy changes from commit 4df92f2
* add comments for better understanding
* restore the newline at the end of file and add these changes to changelog.md
2021-01-07 13:12:36 +00:00
Tommy MacWilliam
f59008276c
Add support for Apple Accelerate (#769)
* support for Apple Accelerate
* add a CMake flag to use Apple Accelerate as the BLAS library.
* rename USE_ACCELERATE to USE_APPLE_ACCELERATE
* add comment with more info on Accelerate
* link to the Apple documentation on Accelerate.
2021-01-07 10:41:55 +00:00
Graeme
49994a0460
Prefer if/else in place of try/catch in FastOpt makeScalar (#774) 2021-01-06 11:30:22 +00:00
Martin Junczys-Dowmunt
9dad84ae9b Merged PR 16337: Update sentencepiece to new version
This updates the SentencePiece version in Marian to a much more recent revision. Due to that there is no dependency on Protobuf anymore.
2020-11-11 00:38:37 +00:00
Martin Junczys-Dowmunt
b90229d8ee Merged PR 16294: Change stopping criterion for mini-batch-fit binary search
This PR changes the stopping criterion for mini-batch-fit binary search to better maximize batch size.
2020-11-10 17:08:47 +00:00
Martin Junczys-Dowmunt
d5e773f937 Merged PR 15925: Training embedder separation with margin losses
* This PR adds training of embedding spaces with better separation based on https://arxiv.org/abs/2007.01852
* We can now train with in-batch negative examples or a handful of hand-constructed negative examples provided in a tsv-file.
2020-11-07 17:46:39 +00:00
Marcin Junczys-Dowmunt
3d233ec592 Merge branch 'master' into pmaster 2020-11-04 14:37:25 -08:00
afaji
e274ac76b2
Quantized model finetuning (#690)
* enable quantized training
2020-11-04 14:25:40 -08:00
Martin Junczys-Dowmunt
f3e4cbf705 Merged PR 16219: Allow to set epoch display width
Allows setting the display width of the fractional part of a logical epoch.
2020-11-04 17:31:59 +00:00
Marcin Junczys-Dowmunt
fabbe20309 merge with internal 2020-11-02 09:05:15 -08:00
Martin Junczys-Dowmunt
ca777c86b7 Merged PR 16168: Add redefinition of logical epoch
Adds e.g. --logical-epoch 1Gt (or other units) that alters the way the epoch counter is displayed. The actual underlying counter in the form of data passes is not changed. This is essentially a logging change that will now display the epoch as a fractional multiple of the chosen unit.

Example for `--logical-epoch 100Mt`:
```
[2020-11-02 04:14:16] Ep. 4.8602 : Up. 16755 : Sen. 1,088,000 : Cost 1.17630422 * 1,993,304 @ 31,985 after 486,015,051 : Time 61.36s : 32483.55 words/s
[2020-11-02 04:15:18] Ep. 4.8803 : Up. 16825 : Sen. 1,162,648 : Cost 1.17474616 * 2,009,996 @ 37,740 after 488,025,047 : Time 61.88s : 32480.17 words/s
[2020-11-02 04:16:19] Ep. 4.9002 : Up. 16893 : Sen. 1,235,200 : Cost 1.17799997 * 1,990,844 @ 26,173 after 490,015,891 : Time 60.47s : 32920.16 words/s
```
2020-11-02 17:03:41 +00:00
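As a rough illustration of the logical-epoch display in commit ca777c86b7, the sketch below divides a running label count by the chosen unit and prints the quotient with four fractional digits. `parseLabelUnit` and `formatLogicalEpoch` are hypothetical names, not Marian functions. The sketch assumes the displayed value is the `after ...` label total divided by the unit, which is consistent with the three log lines in the commit message.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical helper: turn a unit such as "100Mt" into a number of target labels.
// Only the k/M/G prefixes and the trailing 't' are handled in this sketch.
inline uint64_t parseLabelUnit(const std::string& spec) {
  std::size_t i = 0;
  uint64_t n = 0;
  while (i < spec.size() && std::isdigit(static_cast<unsigned char>(spec[i])))
    n = n * 10 + static_cast<uint64_t>(spec[i++] - '0');
  if (i == 0 || spec.back() != 't')
    throw std::invalid_argument("expected e.g. 100Mt, got: " + spec);
  const std::string prefix = spec.substr(i, spec.size() - 1 - i);
  if (prefix.empty()) return n;
  if (prefix == "k") return n * 1000ULL;
  if (prefix == "M") return n * 1000000ULL;
  if (prefix == "G") return n * 1000000000ULL;
  throw std::invalid_argument("unknown prefix in: " + spec);
}

// Hypothetical helper: express the labels seen so far as a fractional multiple of the unit.
inline std::string formatLogicalEpoch(uint64_t labelsSeen, uint64_t unit, int width = 4) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%.*f", width,
                static_cast<double>(labelsSeen) / static_cast<double>(unit));
  return std::string(buf);
}
```

With a unit of `100Mt`, the label total 486,015,051 from the first log line comes out as 4.8602, matching the `Ep.` field shown there.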
Martin Junczys-Dowmunt
1b908a82a9 Merged PR 16162: ChrF validation metric
This PR adds ChrF as a validation metric. This follows the implementation from SacreBLEU.
2020-11-02 15:47:46 +00:00
Martin Junczys-Dowmunt
3a028f215c Merged PR 16144: Merge Cross-Entropy with Label-Smoothing operation
* Compute label-smoothing within the cross-entropy node, should result in faster training.
2020-11-01 18:54:24 +00:00
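To make the idea behind commit 3a028f215c concrete, the sketch below computes cross-entropy and label smoothing in one pass over the logits. It uses the textbook formulation, in which the target distribution is a mix of the one-hot label and a uniform distribution. Whether Marian's node uses exactly this definition is an assumption, and `smoothedCrossEntropy` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Loss = (1 - eps) * (-log p[label]) + eps * (-(1/V) * sum_i log p[i]),
// evaluated from raw logits with a numerically stable log-sum-exp.
inline float smoothedCrossEntropy(const std::vector<float>& logits,
                                  std::size_t label, float eps) {
  const float maxLogit = *std::max_element(logits.begin(), logits.end());
  float sumExp = 0.f;
  float sumLogits = 0.f;
  for (float x : logits) {
    sumExp += std::exp(x - maxLogit);
    sumLogits += x;
  }
  const float logZ = maxLogit + std::log(sumExp);          // log of the softmax normaliser
  const float nll = logZ - logits[label];                   // -log p[label]
  const float meanNll = logZ - sumLogits / static_cast<float>(logits.size());
  return (1.f - eps) * nll + eps * meanNll;
}
```

Both terms share the same normaliser `logZ`, which is why folding the smoothing term into the cross-entropy computation avoids a second pass over the vocabulary dimension.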
Martin Junczys-Dowmunt
160b36cec8 Merged PR 15896: Add --after N option to supersede --after-batches and --after-epochs
Replaces `--after-batches N` and `--after-epochs N` with `--after Nu/Ne`, which allows specifying updates, epochs, or target labels with units, e.g.:
* `--after 30Gt` or `--after 50ku` or `--after 10e`
* Multiple criteria can be combined, e.g. `--after 30Gt,50ku,10e`; training stops as soon as any one of them is reached

Changes default `cost-type` from `ce-mean` to `ce-sum` and turns `display-label-counts` on by default.
2020-10-29 20:16:19 +00:00
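The combined stopping criteria from commit 160b36cec8 can be sketched as a small parser plus a check that fires on the first criterion reached. Everything here is illustrative: `StopCriterion`, `parseAfter`, `Progress` and `shouldStop` are hypothetical names, and the unit letters (`t` for target labels, `u` for updates, `e` for epochs) and the `k`/`M`/`G` prefixes are inferred from the examples in the commit message.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

struct StopCriterion {
  char unit;       // 't' = target labels, 'u' = updates, 'e' = epochs (assumed meanings)
  uint64_t limit;  // threshold with the k/M/G prefix already applied
};

// Parse a single item such as "30Gt", "50ku" or "10e".
inline StopCriterion parseCriterion(const std::string& item) {
  std::size_t i = 0;
  uint64_t n = 0;
  while (i < item.size() && std::isdigit(static_cast<unsigned char>(item[i])))
    n = n * 10 + static_cast<uint64_t>(item[i++] - '0');
  if (i == 0 || i == item.size())
    throw std::invalid_argument("malformed criterion: " + item);
  uint64_t scale = 1;
  if (item.size() - i == 2) {  // a prefix precedes the unit letter
    switch (item[i]) {
      case 'k': scale = 1000ULL; break;
      case 'M': scale = 1000000ULL; break;
      case 'G': scale = 1000000000ULL; break;
      default: throw std::invalid_argument("unknown prefix in: " + item);
    }
    ++i;
  }
  const char unit = item[i];
  if (i + 1 != item.size() || (unit != 't' && unit != 'u' && unit != 'e'))
    throw std::invalid_argument("unknown unit in: " + item);
  return StopCriterion{unit, n * scale};
}

// Split a comma-separated list such as "30Gt,50ku,10e".
inline std::vector<StopCriterion> parseAfter(const std::string& spec) {
  std::vector<StopCriterion> out;
  std::istringstream in(spec);
  std::string item;
  while (std::getline(in, item, ','))
    out.push_back(parseCriterion(item));
  return out;
}

struct Progress {
  uint64_t labels;
  uint64_t updates;
  uint64_t epochs;
};

// Training stops as soon as any single criterion has been reached.
inline bool shouldStop(const std::vector<StopCriterion>& criteria, const Progress& p) {
  for (const auto& c : criteria) {
    const uint64_t seen = c.unit == 't' ? p.labels : c.unit == 'u' ? p.updates : p.epochs;
    if (seen >= c.limit) return true;
  }
  return false;
}
```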
rhenry-nv
595fba4145
Fixes bug for certain reductions (#746)
* Fixes reductions into scalars for <= 32 input elements. Only affects reductions where 0 is not the identity
* Update CHANGELOG.md
* Adds space before "?"
* Adds comment explaining increase in margin for reduction tests. Adds axis comment to argument to reduce functions. Adds more tests for small reduction operators
2020-10-26 12:26:41 -07:00
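The remark in commit 595fba4145 that the bug only affects reductions "where 0 is not the identity" can be illustrated on the CPU. This is not the GPU kernel that was fixed, only a sketch of why the starting value matters. `reduceAll`, `maxOf` and `productOf` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Generic fold: the accumulator must start at the identity element of `op`.
template <typename T, typename Op>
T reduceAll(const std::vector<T>& xs, T identity, Op op) {
  T acc = identity;
  for (const T& x : xs) acc = op(acc, x);
  return acc;
}

// Max reduction with an explicit starting value.
inline float maxOf(const std::vector<float>& xs, float start) {
  return reduceAll(xs, start, [](float a, float b) { return std::max(a, b); });
}

// Product reduction with an explicit starting value.
inline float productOf(const std::vector<float>& xs, float start) {
  return reduceAll(xs, start, [](float a, float b) { return a * b; });
}
```

Starting a sum at 0 is harmless, but starting a max over negative values or a product at 0 silently yields 0. The correct starting values are the lowest representable float for max and 1 for product.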
Martin Junczys-Dowmunt
91ad534c65 Merged PR 15320: Sync internal and external master
Updates internal master to external master.

Changes:
* Correct behavior for Pre-Norm transformer
* Small changes to CMake files
2020-09-19 16:22:08 +00:00
Marcin Junczys-Dowmunt
951ecfe932
Enable final stack post-processing for transformer for correct prenorm behavior (#719)
This PR enables final post-processing of a full transformer stack for correct prenorm behavior.
See issues #715 and #699.

List of changes:

* Add final post-processing in encoder and decoder if requested with `--transformer-postprocess-top`. Can take combinations of `d`, `n`, `a`. Using `a` will add a skip connection from the bottom of the stack.
* Add `--task transformer-base-prenorm` and `--task transformer-big-prenorm` which correspond to `--task transformer-base --transformer-preprocess n --transformer-postprocess da --transformer-postprocess-top n`.
2020-09-09 08:06:20 -07:00
Martin Junczys-Dowmunt
e3916b3d08 Merged PR 15233: Sync internal master with public master
Regular sync of public and internal master.
2020-09-07 19:37:41 +00:00
Roman Grundkiewicz
452f9c79e6
Print message that marian-server is listening on port X after it is accepting connections (#705)
* Print 'server is listening' after it is accepting connections; fix #701
* Minor code formatting
2020-09-03 11:47:10 +01:00
Nikolay Bogoychev
4d9d15649e
Enable compute75 when using cuda10 (#698)
* Enable compute75 when using cuda10 or newer and disable compute <50 when using CUDA11
* Re-enable deprecated architectures with CUDA11
2020-09-01 08:56:24 -07:00
Ulrich Germann
044b416af5
Improved handling of SIGTERM (#660)
* Return exit code 15 (SIGTERM) after SIGTERM.
When marian receives signal SIGTERM and exits gracefully (save model & exit),
it should then exit with a non-zero exit code, to signal to any parent process
that it did not exit "naturally".
* Added explanatory comment about exiting marian_train with non-zero status after SIGTERM.
* Bug fix: better handling of SIGTERM for graceful shutdown during training.
Prior to this bug fix, BatchGenerator::fetchBatches, which runs in a separate
thread, would ignore SIGTERM during training. Training uses a custom signal handler
for SIGTERM, which simply sets a global flag to enable graceful shutdown (i.e.,
saving models and the current state of training before shutting down).

The changes in this commit also facilitate custom handling of other signals in the
future by providing a general signal handler for all signals with a signal number
below 32 (setSignalFlag) and a generic flag checking function (getSignalFlag(sig))
for checking such flags.
2020-08-31 21:13:41 -07:00
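The flag-based handling described in commit 044b416af5 might look roughly like the sketch below. The names `setSignalFlag` and `getSignalFlag` come from the commit message, but their bodies here are assumptions, and `fetchUntilSignalled` is a hypothetical stand-in for a worker loop such as the one in `BatchGenerator::fetchBatches`.

```cpp
#include <cassert>
#include <csignal>

// One flag per signal number below 32, as the commit message describes.
constexpr int kMaxSignal = 32;
static volatile std::sig_atomic_t gSignalFlags[kMaxSignal];  // zero-initialised

// Handler: only records that the signal arrived, which is safe inside a signal handler.
void setSignalFlag(int sig) {
  if (sig > 0 && sig < kMaxSignal) gSignalFlags[sig] = 1;
}

// Polled from worker loops to decide whether to wind down.
bool getSignalFlag(int sig) {
  return sig > 0 && sig < kMaxSignal && gSignalFlags[sig] != 0;
}

// Hypothetical worker loop: stops fetching as soon as SIGTERM has been flagged,
// so the caller can save its state and exit with a non-zero status.
int fetchUntilSignalled(int maxBatches) {
  int fetched = 0;
  while (fetched < maxBatches && !getSignalFlag(SIGTERM)) ++fetched;
  return fetched;
}
```

A program would install the handler once with `std::signal(SIGTERM, setSignalFlag)`. Any thread that loops for a long time then needs its own `getSignalFlag(SIGTERM)` check, which is the part the commit describes as previously missing.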
Roman Grundkiewicz
3aed9143d9
Add --word-scores to marian-scorer (#638)
* Add --word-scores to scorer
* Update CHANGELOG
* Use single RescorerLoss class
* Update VERSION
2020-08-04 16:15:49 +01:00
Kenneth Heafield
c944633dd2
Optimize CPU LayerNormalization 6x with -ffast-math and lifting branches (#689)
* Optimize LayerNormalization with -ffast-math and lifting branches
* LayerNormalization changelog

gprof:
Now
  1.65     30.11     0.57                             void marian::cpu::LayerNormalizationImpl<1, 1, true>(float*, float const*, float const*, float const*,       float, int, int)
Baseline
  9.08     22.31     3.49                             marian::cpu::LayerNormalization(IntrusivePtr<marian::TensorBase>, IntrusivePtr<marian::TensorBase>,          IntrusivePtr<marian::TensorBase>, IntrusivePtr<marian::TensorBase>, float)

That's 3.49 seconds down to 0.57 seconds.

* LayerNormalization: longer comments, @frankseide-style if
2020-08-01 20:41:56 -07:00
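"Lifting branches" in commit c944633dd2 presumably refers to moving per-element runtime checks out of the inner loop, for instance by turning them into template parameters, as the `LayerNormalizationImpl<1, 1, true>` signature in the gprof output suggests. The sketch below shows that pattern for an optional bias. The names `layerNormRow` and `layerNormSketch`, the single `hasBeta` parameter and the placement of `eps` inside the square root are all assumptions, not Marian's actual code.

```cpp
#include <cassert>
#include <cmath>

// hasBeta is a compile-time constant, so the compiler can drop the bias branch
// from the inner loop entirely in the <false> instantiation.
template <bool hasBeta>
void layerNormRow(float* out, const float* in, const float* gamma,
                  const float* beta, float eps, int cols) {
  float mean = 0.f;
  for (int i = 0; i < cols; ++i) mean += in[i];
  mean /= static_cast<float>(cols);

  float var = 0.f;
  for (int i = 0; i < cols; ++i) {
    const float d = in[i] - mean;
    var += d * d;
  }
  const float invStd = 1.f / std::sqrt(var / static_cast<float>(cols) + eps);

  for (int i = 0; i < cols; ++i) {
    float y = gamma[i] * (in[i] - mean) * invStd;
    if (hasBeta) y += beta[i];
    out[i] = y;
  }
}

// The runtime check on beta happens once here, not once per element.
inline void layerNormSketch(float* out, const float* in, const float* gamma,
                            const float* beta, float eps, int rows, int cols) {
  if (beta) {
    for (int r = 0; r < rows; ++r)
      layerNormRow<true>(out + r * cols, in + r * cols, gamma, beta, eps, cols);
  } else {
    for (int r = 0; r < rows; ++r)
      layerNormRow<false>(out + r * cols, in + r * cols, gamma, nullptr, eps, cols);
  }
}
```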
Ulrich Germann
dade12a1b2 Update CHANGELOG.md
Capitalize "Internal", add period at end.
2020-07-30 19:56:49 +01:00
Ulrich Germann
2987d1eb1f Update CHANGELOG.md
Fixed typo.
2020-07-30 19:56:49 +01:00
Ulrich Germann
ad91e54a24 Updated CHANGELOG.md. 2020-07-30 19:56:49 +01:00
Ulrich Germann
24a36f54da Default 'none' for option 'shuffle' in BatchGenerator. 2020-07-30 19:56:49 +01:00
Roman Grundkiewicz
23de8db346
Fix compilation without BLAS installed (#679) 2020-07-26 20:19:05 +01:00
Roman Grundkiewicz
3e5abb1b0e
Fix quiet-translation in marian-server (#643) 2020-07-26 20:17:50 +01:00
Roman Grundkiewicz
41de4f3d30
Support tab-separated inputs in marian-server (#649)
* Enable multi-source input in marian-server
* Add function converting multi-line tab-separated textual input

Co-authored-by: Tomasz Dwojak <t.dwojak@amu.edu.pl>
2020-07-26 17:39:12 +01:00
Roman Grundkiewicz
9d4cc7b13d
Fix providing vector-like options using the equals sign (#648) 2020-07-26 17:35:30 +01:00
Roman Grundkiewicz
44757877df
Add initial GitHub workflows (#676)
* Add GitHub workflows

* Workflows with CMake compilation on Windows

* Ubuntu workflow with Boost

* Ignore warnings from Boost

* Compile unit tests on Windows

* Disable cpuinfo tools if compiled with ninja

* Use a separate CMakeSettings.json for CI

* Disable CMake debugs

* Fix unit tests compilation with Ninja Release

* Use FBGEMM in Windows workflow; add comments

* Fix C4706 warning

* Update CHANGELOG

* Run Windows build on pull requests

* Compile SentencePiece statically in Windows workflow

* Add GitHub workflow on MacOS

* Address review comments

* Disable C4702 globally, not only in Debug

* Update CHANGELOG and workflows names

* Update VERSION
2020-07-26 14:43:23 +01:00
Ulrich Germann
b28905a228
Fix bug in finding path to ./git/log/HEAD if Marian is a submodule. (#644)
* Fix bug in finding path to ./git/log/HEAD if Marian is a submodule.
* Build build_info.cpp in build dir, not source dir.
2020-07-21 11:32:08 +01:00
Roman Grundkiewicz
f496a42155 Update CHANGELOG and VERSION 2020-07-17 18:42:26 +01:00
Roman Grundkiewicz
6bf6325058
Updated: fix Windows MSVC builds (using CMake) (#677)
* fix Windows MSVC builds
* dealing with MSVC warnings

Co-authored-by: Rob Berlang <robberlang@tutanota.com>
2020-07-17 11:27:17 +01:00
Marcin Junczys-Dowmunt
7fb5747d6d Update version and changelog 2020-05-21 20:10:33 -07:00
Martin Junczys-Dowmunt
be7b299fe9 Merged PR 12874: Add topk operator and other small changes in preparation of LSH-based short-list replacement
* Add tuple nodes via views and trickery
* Add `topk` operator, currently unused outside unit tests
* Add `abs` operator, currently unused outside unit tests
* Change return type of `Node::allocate()` to `void`. This used to return the number of allocated elements, but isn't really used anywhere. To avoid future confusion of elements and bytes, removed for now.
2020-05-15 04:25:29 +00:00
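For readers unfamiliar with the operation added in commit be7b299fe9, a top-k selection returns the k largest values together with their positions. The sketch below works on a plain `std::vector` rather than on Marian's expression graph. `topkSketch` is a hypothetical name, and returning a (values, indices) pair is an assumption loosely based on the "tuple nodes" mentioned in the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Returns the k largest values (descending) and their original indices.
inline std::pair<std::vector<float>, std::vector<std::size_t>>
topkSketch(const std::vector<float>& xs, std::size_t k) {
  k = std::min(k, xs.size());
  std::vector<std::size_t> idx(xs.size());
  std::iota(idx.begin(), idx.end(), std::size_t{0});
  // Only the first k positions need to be fully ordered.
  std::partial_sort(idx.begin(), idx.begin() + static_cast<std::ptrdiff_t>(k), idx.end(),
                    [&xs](std::size_t a, std::size_t b) { return xs[a] > xs[b]; });
  idx.resize(k);
  std::vector<float> vals;
  vals.reserve(k);
  for (std::size_t i : idx) vals.push_back(xs[i]);
  return std::make_pair(vals, idx);
}
```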
Marcin Junczys-Dowmunt
d26ce0dfe1 update version 2020-05-14 08:00:41 -07:00
Nikolay Bogoychev
c1facda6a2
Batched gemm (#633)
* Use cblas_sgemm_batch when available
* Merge with master, add comments and describe contribution
2020-05-14 07:55:27 -07:00
Roman Grundkiewicz
f2347a827f
Update Simple-WebSocket-Server and move it to submodules (#639)
* Fix server build with current boost, move simple-websocket-server to submodule
* Change submodule to marian-nmt/Simple-WebSocket-Server
* Update submodule simple-websocket-server

Co-authored-by: Gleb Tv <glebtv@gmail.com>
2020-04-27 10:34:10 +01:00
Kenneth Heafield
3c0c1e133b
python3 shebang from #620 (#621)
* python3 shebang from #620
* Add changelog entry for python3 change
2020-04-16 11:15:42 +01:00
Marcin Junczys-Dowmunt
5e21a28230 update changelog and version 2020-04-13 17:31:06 -07:00
Roman Grundkiewicz
cbb29908f4
Support relative paths in shortlist and sqlite options (#612)
* Refactorize processPaths
* Fix relative paths for shortlist and sqlite options
* Rename InterpolateEnvVars to interpolateEnvVars
* Update CHANGELOG
2020-04-12 18:56:11 +01:00
Marcin Junczys-Dowmunt
d5936084ad
Fix 0 * nan behavior due to using -O3 instead of -Ofast (#630)
* fix 0 * nan behavior in concatenation
* bump patch
* change epsilon to margin
2020-04-11 09:45:57 -07:00
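The issue behind commit d5936084ad comes down to IEEE 754 arithmetic: `0 * NaN` is `NaN`, so masking a value by multiplying with zero only hides a NaN if the compiler's fast-math mode is allowed to simplify the product. The sketch below contrasts multiplication with an explicit select. Both function names are hypothetical, and the select is just one way to avoid depending on compiler flags, not necessarily what the commit did.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Masking by multiplication: a NaN input survives a zero mask under strict IEEE rules.
inline float maskByMultiply(float x, float mask) {
  return x * mask;
}

// Masking by selection: a zero mask always yields exactly 0, whatever x contains.
inline float maskBySelect(float x, float mask) {
  return mask != 0.f ? x * mask : 0.f;
}
```

When compiled without fast-math options, `maskByMultiply(NaN, 0)` is NaN while `maskBySelect(NaN, 0)` is 0.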
Roman Grundkiewicz
e6f82f5bd1 Fix TSV training with mini-batch-fit after the last merge 2020-04-11 16:04:20 +01:00
Marcin Junczys-Dowmunt
2248a65b40 resolve merge conflicts 2020-04-10 13:49:20 -07:00
Roman Grundkiewicz
696bb44918
Support tab-separated inputs (#617)
* Add basic support for TSV inputs
* Fix mini-batch-fit for TSV inputs
* Abort if shuffling data from stdin
* Fix terminating training with data from STDIN
* Allow creating vocabs from TSV files
* Add comments; clean creation of vocabs from TSV files
* Guess --tsv-size based on the model type
* Add shortcut for STDIN inputs
* Rename --tsv-size to --tsv-fields
* Allow only one 'stdin' in --train-sets
* Properly create separate vocabularies from a TSV file
* Clearer logging message
* Add error message for wrong number of valid sets if --tsv is used
* Use --no-shuffle instead of --shuffle in the error message
* Fix continuing training from STDIN
* Update CHANGELOG
* Support both 'stdin' and '-'
* Guess --tsv-fields from dim-vocabs if special:model.yml available
* Update error messages
* Move variable outside the loop
* Refactorize utils::splitTsv; add unit tests
* Support '-' as stdin; refactorize; add comments
* Abort if excessive field(s) in the TSV input
* Add a TODO on passing one vocab with fully-tied embeddings
* Remove the unit test with excessive tab-separated fields
2020-04-10 13:01:56 -07:00
Marcin Junczys-Dowmunt
a5a5c62d4a bump version 2020-03-14 09:53:54 -07:00
Marcin Junczys-Dowmunt
adba021a5e bump version 2020-03-12 20:49:31 -07:00
Marcin Junczys-Dowmunt
3c7a88f4e9 update changelog and version 2020-03-10 11:34:07 -07:00
Roman Grundkiewicz
aad22c9d09
Add option for printing CMake cached variables (#583)
* Add option --build-info
2020-03-10 10:29:50 -07:00
Marcin Junczys-Dowmunt
f4ea8239c4 sync with internal branch 2020-03-06 20:54:40 -08:00
Marcin Junczys-Dowmunt
bec7e029b1 bump version 2020-03-06 20:46:11 -08:00
Marcin Junczys-Dowmunt
9f29403627 version and changelog 2020-03-06 20:32:30 -08:00
Roman Grundkiewicz
ad28f99ac0
Add --valid-reset-stalled (#587) 2020-01-29 13:08:45 -08:00
Martin Junczys-Dowmunt
b3a23108b4 Merged PR 11188: Handle empty inputs with batch purging
The previous mechanism to remove empty inputs does not play well with batch purging (removal of finished sentences). Now we reuse the batch purging mechanism to get rid of empty inputs by forcing EOS for all beam entries of a batch entry for the corresponding source batch entry. The purging then takes care of the rest. We set the probability to log(1) = 0.
2020-01-17 21:52:33 +00:00
Marcin Junczys-Dowmunt
1f7a63d3ff bump version 2020-01-11 12:31:34 -08:00
Martin Junczys-Dowmunt
164d26cc36 Merged PR 10999: Splitting up add_all.h into *.h, *.cu and *.inc
Splitting up header file into header and *.cu, comes with the price of having to include specializations for combinations of types as for element.inc and add.inc. No code changes otherwise.

Add CMake options to disable specific compute capabilities.

When run with `make -j16` this compiles in about 6 minutes instead of 7 minutes. Selecting only SM70 during compilation brings down the time to 3 minutes.
2020-01-06 19:14:00 +00:00
Martin Junczys-Dowmunt
88d9980589 Merged PR 10996: A number of smaller changes and clean-up
* Downgrade NCCL to 2.3.7 as 2.4.2 is buggy (hangs with larger models)
* Actually enable gradient-checkpointing, previous option was inactive
* Clean-up training-only options that should not be displayed for decoder and scorer
* Re-enable conversion to FP16 if element types are compatible (belong to the same type class)
* A few typos and more verbose log messages.
2020-01-05 23:16:13 +00:00
Roman Grundkiewicz
24f062cd27 Add option to print word-level scores (#501)
* Add printing word level scores

* Add option --no-spm-decode

* Fix precision for word-level scores

* Fix getting the no-spm-decode option

* Update CHANGELOG

* Add comments and refactor

* Print word-level scores next to other scores in an n-best list

* Remove --word-scores from marian-scorer

* Add --no-spm-decode only if compiled with SentencePiece

* Add comments

* Printing word scores before model scores in n-best lists

* Update VERSION

Co-authored-by: Marcin Junczys-Dowmunt <Marcin.JunczysDowmunt@microsoft.com>
2020-01-03 19:10:21 -08:00
Marcin Junczys-Dowmunt
2bd986d8a7 update version and changelog 2019-12-23 12:10:30 -08:00
Martin Junczys-Dowmunt
e0500b20b8 Merged PR 10827: Sequential unlikelihood training and fixed gather operation
This implements Sequential Unlikelihood Training from https://arxiv.org/abs/1908.04319
* implementation as expensive multi-op, special node in-progress.
* fixed gather operator to work in batched cases
2019-12-13 18:55:36 +00:00
Marcin Junczys-Dowmunt
5be8558c35 amend changelog and bump version 2019-12-12 19:17:16 -08:00
Roman Grundkiewicz
c343ced9e1 Add lexical shortlists to marian-server (#560)
* Add lexical shortlists to marian-server

* Update CHANGELOG
2019-12-05 19:41:36 -08:00
Marcin Junczys-Dowmunt
6224cb16e2 bump patch 2019-12-05 10:44:57 -08:00
Marcin Junczys-Dowmunt
82da7d5219 do not warn for a number of warnings for cpp files from src/3rd_party/.. 2019-12-03 20:32:50 -08:00
Marcin Junczys-Dowmunt
0197b89b43 bump version 2019-11-26 21:24:22 -08:00
Marcin Junczys-Dowmunt
f07042b9c3 bump version based on PRs 2019-11-25 19:20:03 -08:00
Marcin Junczys-Dowmunt
9e090e3472 bump version 2019-11-25 17:49:34 -08:00
Martin Junczys-Dowmunt
93b7ed80fe Merged PR 10553: Fix multiple problems in reduce kernels that occurred during back-prop
This fixes a number of bugs in our GPU reduce-kernels that would manifest mainly for larger matrices and during back-prop. We also drop support for CUDA 8.0 to be able to take advantage of new GPU primitives introduced by NVidia in CUDA 9.0.
2019-11-26 01:48:07 +00:00
Martin Junczys-Dowmunt
189d89e1dd Merged PR 10333: Batch-pruning in beam search
This PR introduces batch-purging in Marian, i.e. whenever a virtual beam becomes inactive (empty) the entire batch entry that corresponds to that beam can be removed from the encoder and decoder neural states. The CPU-side beam search keeps track of the hypotheses as before, but needs to perform mappings between original and shifted batch indices.
2019-11-12 05:13:15 +00:00
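The index bookkeeping mentioned in commit 189d89e1dd can be pictured as two lookup tables built from a per-entry "still active" flag. The sketch below is only an illustration of that mapping. `BatchIndexMap` and `purgeInactive` are hypothetical names, and Marian's actual data structures may look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct BatchIndexMap {
  std::vector<std::size_t> shiftedToOriginal;     // row in the shrunken batch -> original entry
  std::vector<std::ptrdiff_t> originalToShifted;  // original entry -> shrunken row, -1 if purged
};

// Build both mappings after removing every inactive batch entry.
inline BatchIndexMap purgeInactive(const std::vector<bool>& active) {
  BatchIndexMap map;
  map.originalToShifted.assign(active.size(), -1);
  for (std::size_t orig = 0; orig < active.size(); ++orig) {
    if (!active[orig]) continue;
    map.originalToShifted[orig] = static_cast<std::ptrdiff_t>(map.shiftedToOriginal.size());
    map.shiftedToOriginal.push_back(orig);
  }
  return map;
}
```

With five entries of which the second and fourth have finished, the shrunken batch has three rows that map back to original entries 0, 2 and 4.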
Martin Junczys-Dowmunt
9353f065f8 Merged PR 10373: Replace IntrusivePtr with std::unique_ptr in FastOpt
In FastOpt we do not want to use locking during access, but that makes reference counting not thread-safe. We now use std::unique_ptr to const objects or const references everywhere. This fixes random segfaults with multi-GPU training. @TODO: clean up option merging to make options generally immutable.
2019-11-11 22:04:19 +00:00
Martin Junczys-Dowmunt
5dfd8a7026 Merged PR 10383: Align items while saving at 256-byte boundary
Align items while saving at 256-byte boundary. Maintains binary-compatibility.
2019-11-11 18:56:12 +00:00
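Aligning saved items at a 256-byte boundary, as in commit 5dfd8a7026, amounts to rounding each item's file offset up to the next multiple of 256 and writing padding bytes in between. The arithmetic is sketched below. `alignUp` and `paddingFor` are hypothetical names, and how Marian stores the padding in its binary format is not shown in the commit message.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kAlignment = 256;

// Round an offset up to the next multiple of `alignment`.
constexpr uint64_t alignUp(uint64_t offset, uint64_t alignment = kAlignment) {
  return (offset + alignment - 1) / alignment * alignment;
}

// Number of padding bytes to write before the next item.
constexpr uint64_t paddingFor(uint64_t offset, uint64_t alignment = kAlignment) {
  return alignUp(offset, alignment) - offset;
}
```

For example, an item that would otherwise start at byte 300 gets 212 bytes of padding and starts at byte 512.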
Martin Junczys-Dowmunt
233281cc26 Merged PR 10304: Remove naked pointers, add binary read mode
* Remove naked pointers in file_stream.{h,cpp}
* Add binary read mode
2019-11-05 23:18:45 +00:00
Marcin Junczys-Dowmunt
7ba804b20b bump patch version, update changelog 2019-11-05 11:46:43 -08:00
Marcin Junczys-Dowmunt
1abd12520d bump version once more 2019-11-01 16:37:12 -07:00
Marcin Junczys-Dowmunt
54fba7868e update changelog and version 2019-11-01 16:33:29 -07:00
Marcin Junczys-Dowmunt
398ed0c8d2 update version and changelog 2019-10-27 15:02:18 -07:00
Marcin Junczys-Dowmunt
55e3bcfd74 bump patch and changelog 2019-10-26 15:19:01 -07:00
Marcin Junczys-Dowmunt
68f9d90bfa fix changelog 2019-10-26 09:04:18 -07:00
Marcin Junczys-Dowmunt
e375905404 update changelog and version 2019-10-26 08:53:57 -07:00
Marcin Junczys-Dowmunt
ca37cb067a update version and changelog 2019-09-11 14:28:14 -07:00