4905 Commits

Author SHA1 Message Date
Roman Grundkiewicz
a05124176d Merged PR 18531: Install GCC in Azure pipelines
This fixes Azure pipelines after recent changes in Azure-hosted runners removing GCC 8 and older on some Ubuntu images. GCC is now installed explicitly via `apt-get`.
2021-04-09 18:44:11 +00:00
Marcin Junczys-Dowmunt
6435c6f1ce synced with public master 2021-04-09 16:12:34 +00:00
Marcin Junczys-Dowmunt
fdf9fe7d4a
Update VERSION 2021-04-09 09:03:39 -07:00
Marcin Junczys-Dowmunt
be65065623
Allow choosing fine-grained CPU intrinsics as CMake options (#849)
* allow choosing fine-grained CPU intrinsics as CMake options
* inform user that e.g. -DCOMPILE_AVX2=off will be ignored with -march=native if there is compiler support
2021-04-09 09:02:34 -07:00
Marcin Junczys-Dowmunt
a17ee300f4
Create VERSION 2021-04-08 21:48:01 -07:00
rhenry-nv
fddd0e0661
Adds better Affine support for GPUs when using CUDA 11. Introduces a new bias addition kernel for CUDA < 11 (#778)
Co-authored-by: Marcin Junczys-Dowmunt <marcinjd@microsoft.com>
2021-04-08 21:46:27 -07:00
Roman Grundkiewicz
0223ce90b1
Fix Ubuntu GitHub checks (#848)
* Change ubuntu-latest to ubuntu-18.04
* Install gcc/g++
2021-04-08 18:41:15 +01:00
Marcin Junczys-Dowmunt
bfa6180033 Revert "remove TC_MALLOC from optional dependencies (#840)"
This reverts commit 096c48e51c.
2021-04-08 07:30:38 +00:00
Rohit Jain
4408e88a94 Merged PR 18366: Fix generation of special control characters for default vocabulary
This PR extends the --allow-special feature to default vocabulary items as well. If the default vocabulary is provided with symbols ostensibly generated from the SentencePiece Byte Fallback mechanism, we suppress the control characters from that list.
2021-03-30 21:43:06 +00:00
Roman Grundkiewicz
c29cc83dc4 Update submodule examples 2021-03-30 08:58:11 +00:00
Martin Junczys-Dowmunt
7d1f941242 Merged PR 18309: Cleaner suppression of unwanted output words
This PR adds cleaner suppression of unwanted output words. We identified a situation where SPM with byte-fallback can generate random bytes with output-sampling.

That is particularly harmful when one of those random bytes happens to be a newline symbol. Here we suppress newline in output unless explicitly wanted.
2021-03-26 16:17:12 +00:00
Marcin Junczys-Dowmunt
08bb158974 Merge branch 'pmaster' 2021-03-23 21:59:51 +00:00
Nikolay Bogoychev
ffd997e360
Properly copy the entire vector in the int16_t case (#845)
Fixes #842 #843 #844
2021-03-23 14:32:01 -07:00
Hieu Hoang
64707fa484 Revert "start lsh shortlist"
This reverts commit 415769fb2f.
2021-03-23 01:22:45 +00:00
Hieu Hoang
415769fb2f start lsh shortlist 2021-03-23 01:19:16 +00:00
Young Jin Kim
b36d0bbbab
Fix FBGEMM build with gcc 9.3+ (#836) 2021-03-22 11:13:40 -07:00
Marcin Junczys-Dowmunt
0394d2cdbe
Display decoder speed statistics with --stat-freq N (#841)
Display decoder time statistics if requested
2021-03-22 08:58:04 -07:00
Marcin Junczys-Dowmunt
096c48e51c
remove TC_MALLOC from optional dependencies (#840)
There seems to be no benefit from TC_MALLOC any more, hence removing.
2021-03-22 08:02:04 -07:00
Marcin Junczys-Dowmunt
9e36c73fa9 Merge branch 'master' into pmaster 2021-03-21 22:35:57 +00:00
Roman Grundkiewicz
c89efbe919
Update VERSION 2021-03-19 15:56:37 +00:00
Nikolay Bogoychev
d780082973
Fix model loading on architectures where size_t is 32 bits (#825)
* fix model loading on architectures where size_t is 32-bit
* Update the changelog

Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>
2021-03-19 15:56:17 +00:00
Roman Grundkiewicz
c724837ab3
Update VERSION 2021-03-19 13:20:31 +00:00
Marcin Junczys-Dowmunt
a11418c17c
Add simple unit tests for binary files (#826)
* unit tests for binary file operations
* adjust changelog
* Set file_ in TemporaryFile for MSVC

Co-authored-by: Roman Grundkiewicz <rgrundkiewicz@gmail.com>
2021-03-19 13:17:17 +00:00
Roman Grundkiewicz
db2a5e4d66
Fix broken links to MNIST data sets (#838) 2021-03-19 13:16:10 +00:00
Roman Grundkiewicz
326b9400e9 Merged PR 18232: Update VS CMake builds and scripts
This PR updates Windows build using Visual Studio CMake compilation with Ninja. It does not affect standard VS compilation or Windows builds on Azure/GitHub CI.

List of changes:
- Fixed syntax in the script installing dependencies via vcpkg.
- Removed installation of Protobuf (already included as a submodule) and Boost 1.72 (the previous solution no longer works with new vcpkg).
- Disabled compilation of marian-server in the default setting due to Boost issues.
- Disabled compilation of NCCL in the default setting due to an error (see comment in the code).
- Updated vs/README.
2021-03-19 08:27:34 +00:00
Roman Grundkiewicz
571634a2c0
Update GitHub workflows with Ubuntu+CUDA (#837) 2021-03-18 12:05:32 +00:00
Marcin Junczys-Dowmunt
272096c1d1 sync public and internal master 2021-03-18 03:41:24 +00:00
Marcin Junczys-Dowmunt
8f73923d31 increase version and update changelog 2021-03-18 03:34:44 +00:00
Martin Junczys-Dowmunt
e08c52a8df Merged PR 18185: Support for Microsoft legacy binary shortlist
Adds support for Microsoft-internal binary shortlist format.
2021-03-18 03:33:13 +00:00
Roman Grundkiewicz
a1aaa32c6a Merged PR 18201: Install Boost in Azure pipelines
Installing Boost manually in all workflows, because it has been recently removed from Azure/GitHub hosted runners. This should fix recent failures of Marian CI builds.
2021-03-17 17:34:09 +00:00
Roman Grundkiewicz
77c3e356a4
Install Boost in GitHub workflows (#834)
* Install Boost in workflows
* Compile CUDA release for all GPU archs
* Convert line endings to LF
2021-03-17 13:21:47 +00:00
Roman Grundkiewicz
bb92b817dd
Update VERSION 2021-03-12 11:58:53 +00:00
Graeme
2352475705
Fix missing float template specialisation for elem::Plus (#822)
* Fix missing float template specialisation for elem::Plus
* Update CHANGELOG.md
2021-03-12 11:58:34 +00:00
Graeme
db399d749c
Fix fallback to default paths in MNIST example (#821)
If --train-sets or --valid-sets are not provided, the fallback to the
hard-coded paths does not occur. This commit requires that these
entities have a non-empty value.
2021-03-12 09:42:43 +00:00
Hieu Hoang
2a7425ddd9 Merged PR 17975: refactor before lsh optimization
Moves large components into their own files and implementation into .cpp files so the code is easier to read. No code change.
2021-03-11 17:02:31 +00:00
Graeme
f74d055d20
Generate documentation artifact with GitHub action (#819)
* Generate documentation artifact with GitHub action
* Add comment about the contents of the api-docs artifact
2021-03-10 12:50:58 +00:00
Roman Grundkiewicz
73182b4aae Update regression-tests 2021-03-09 02:39:34 -08:00
Roman Grundkiewicz
cd018e8d04 Update formatting 2021-03-08 03:09:03 -08:00
Hieu Hoang
ba19663784 clang-format -i 2021-03-05 22:54:05 -07:00
Hieu Hoang
55f4216552 add .h 2021-03-05 06:12:28 +00:00
Hieu Hoang
7c1cb8462a add logits.cpp 2021-03-04 07:59:54 +00:00
Hieu Hoang
085c8a7a98 more code from .h -> .cpp 2021-03-04 07:57:10 +00:00
Hieu Hoang
b88c3fcb71 costs.cpp 2021-03-04 04:35:00 +00:00
Hieu Hoang
42406cc715 move logits to its own file 2021-03-04 04:23:35 +00:00
Hieu Hoang
f7266886f0 move logits to its own file 2021-03-04 04:18:19 +00:00
Hieu Hoang
ca47eabca5 move output to its own file 2021-03-04 03:24:25 +00:00
Hieu Hoang
0d8372c590 move embedding to its own file 2021-03-04 02:46:19 +00:00
Hieu Hoang
96ed0baf5a chmod -x 2021-03-04 02:24:37 +00:00
Roman Grundkiewicz
db771e09bd
Update VERSION 2021-03-03 10:21:06 +00:00
Kenneth Heafield
af3aa314d0
Fix OMP compilation (#824)
* Fix omp variable names
2021-03-02 17:26:49 -08:00