Moses, the machine translation system
Jeroen Vermeulen b2d821a141 Unify tokenize() into util, and unit-test it.
The duplicate definition works fine in environments where the inline
definition becomes a weak symbol in the object file, but if it gets
generated as a regular definition, the duplicate definition causes link
problems.

In most call sites the return value could easily be made const, which
gives both the reader and the compiler a bit more certainty about the code's
intentions.  In theory this may help performance, but it's mainly for clarity.
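
As a rough illustration of the two points above, here is a minimal sketch of a single `inline` definition that could live in one shared header, plus a call site that binds the result as `const`. The names `tokenize_sketch` and `count_tokens` are hypothetical, and the splitting rule (spaces and tabs only) is an assumption; this is not the code from the commit.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a shared tokenize(); not the Moses source.
// Declared inline so that one header can be included from many
// translation units without producing duplicate regular definitions.
// Assumption: tokens are separated by spaces and tabs only.
inline std::vector<std::string> tokenize_sketch(const char input[])
{
  std::vector<std::string> tokens;
  std::string current;
  for (std::size_t i = 0; input[i] != '\0'; ++i) {
    const bool is_space = (input[i] == ' ' || input[i] == '\t');
    if (!is_space) {
      current += input[i];          // extend the token being built
    } else if (!current.empty()) {
      tokens.push_back(current);    // whitespace ends the current token
      current.clear();
    }
  }
  if (!current.empty()) tokens.push_back(current);  // trailing token
  return tokens;
}

// Hypothetical call site: the result is only read, so it is bound as
// const to make that intention explicit to reader and compiler.
inline std::size_t count_tokens(const char line[])
{
  const std::vector<std::string> tokens = tokenize_sketch(line);
  return tokens.size();
}
```

Keeping the one definition in a header and marking it `inline` is what lets every includer share it; a second non-inline copy elsewhere is the kind of duplicate that can fail at link time.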

The comments are based on reverse-engineering, and the unit tests are based
on the comments.  It's possible that some of what's in there is not essential,
in which case, don't feel bad about changing it!

I left a third identical definition in place, though I updated it with my
changes to avoid creeping divergence, and noted the duplication in a comment.
It would be nice to get rid of this definition as well, but it'd introduce
headers from the main Moses tree into biconcor, which may be against policy.
2015-04-22 09:59:05 +07:00
biconcor Unify tokenize() into util, and unit-test it. 2015-04-22 09:59:05 +07:00
contrib Replace deprecated bcopy() with memcpy(). 2015-04-16 19:19:34 +07:00
cruise-control changes for cruise control 2013-05-10 16:34:57 +01:00
defer delete External FF. FF framework changes too fast to be able to keep this up-to-date 2015-03-31 19:45:59 +04:00
doc Replaced content by pointer to online documentation. 2014-08-04 17:20:33 +01:00
jam-files Check for MinGW using __MINGW32__, not MINGW. 2015-03-23 22:35:03 +07:00
lm Less error-like complaint when substituting fallback discounts / Matthias Huck 2015-03-31 21:51:38 -04:00
mert Testing of Viterbi decoding on hypergraph. 2015-04-17 12:29:41 +01:00
mingw windows gui code 2014-01-24 11:12:03 +00:00
misc Modernize "C" includes in misc. 2015-03-28 20:18:39 +07:00
moses Unify tokenize() into util, and unit-test it. 2015-04-22 09:59:05 +07:00
moses-cmd 1. Lifetime of tasks in ThreadPool is now managed via shared pointers. 2015-03-21 16:12:52 +00:00
OnDiskPt delete typedefs for UINT32 and UINT64. MSVC now has uint32_t and uint64_t /Ken 2015-03-25 00:55:39 +00:00
phrase-extract Unify tokenize() into util, and unit-test it. 2015-04-22 09:59:05 +07:00
regression-testing run-test-{extract|mert|misc|scorer}.perl now log the command line executed for the specific test. 2015-02-09 23:00:04 +00:00
scripts duplicated functionality with ems/support/lmplz-wrapper.perl 2015-04-21 17:54:34 +04:00
search Modernize "C" includes in search. 2015-03-28 19:48:20 +07:00
symal create file stream and delete it at the end if user has specified a file for input & output 2015-04-16 00:50:54 +04:00
util Cross-platform tempfile implementation. 2015-04-18 00:21:18 +07:00
vw Using boost for prefix/suffix checks /Jeroen Vermeulen 2015-02-05 16:23:47 +00:00
.gitignore Update .gitignore. 2014-11-14 14:28:33 +00:00
.gitmodules Arrow pipeline submodules now use https protocol. 2013-07-24 11:52:14 +01:00
bjam Fix cd error when running bjam from non-top 2013-03-19 11:17:17 +00:00
BUILD-INSTRUCTIONS.txt test 2014-12-22 15:55:27 +05:30
Jamroot delete External FF. FF framework changes too fast to be able to keep this up-to-date 2015-03-31 19:45:59 +04:00