mjdenkowski
28f4e3b1a7
Compatibility: mmsapt and factored translation with generation models
2015-09-02 01:55:06 -04:00
Ulrich Germann
e8f010b9af
Removed ORLM.
2015-08-17 18:11:04 +01:00
Ulrich Germann
8b3f2d4338
Bye-bye, PhraseDictionaryDynSuffixArray.
2015-08-17 15:35:35 +01:00
MosesAdmin
8af06a6f0d
daily automatic beautifier
2015-08-12 00:01:03 +01:00
Hieu Hoang
4a3363479e
remove namespace pollution from old dynamic suffix array and randlm
2015-08-11 12:44:42 +04:00
Ulrich Germann
1dcd077806
More namespace fixes.
2015-08-10 15:14:44 +01:00
Ulrich Germann
0e2dc56360
Namespace cleanup
2015-08-10 11:03:31 +01:00
Ulrich Germann
03463facd7
Cleanup.
2015-08-10 10:14:28 +01:00
Ulrich Germann
16c637b8a5
Back to intrusive pointer for btfix in Mmsapt. Shared pointer causes segfaults. Needs further investigation.
2015-08-10 09:32:26 +01:00
Ulrich Germann
7d1987121f
Minor update of declaration of binread().
2015-08-10 09:30:53 +01:00
Ulrich Germann
c40082f94c
Bug fix: restored passing of server information to bias client.
2015-08-10 00:09:56 +01:00
Ulrich Germann
a68b77c300
Minor fix in logging of interaction with bias server.
2015-08-10 00:08:44 +01:00
Ulrich Germann
6b084a0587
clang can't handle boost intrusive pointers, it seems.
2015-08-09 22:49:35 +01:00
Ulrich Germann
9702084641
Deleted unused code.
2015-08-09 22:48:26 +01:00
Ulrich Germann
f20c4cbbc0
Namespace refactoring in mmsapt.
2015-08-09 20:51:21 +01:00
Ulrich Germann
fcd0c17af3
Choice of map type in pstats.
2015-08-07 19:16:27 +01:00
Ulrich Germann
883c34aee9
Merge branch 'master' of http://github.com/moses-smt/mosesdecoder into mmt-dev
...
Conflicts:
moses/SearchNormalBatch.cpp
moses/TranslationModel/UG/mm/ug_bitext.h
moses/TranslationModel/UG/mm/ug_typedefs.h
moses/TranslationModel/UG/mmsapt.cpp
moses/TranslationModel/UG/mmsapt.h
2015-08-07 14:14:19 +01:00
Ulrich Germann
a07eb65118
sptr<> -> SPTR<> in preparation for merge with legacy master
2015-08-07 13:35:02 +01:00
U-DESKTOP-ONHNTIV\hieuh
8fce5c783d
cygwin
2015-08-05 19:57:41 +01:00
Hieu Hoang
d94c2d210b
#define name clashed with internal class name when using clang
2015-08-02 06:33:32 +04:00
Ulrich Germann
51bc36d131
Report bias server errors.
2015-07-31 23:21:56 +00:00
Ulrich Germann
fd93df3b8f
Added virtual destructor.
2015-07-31 23:21:01 +00:00
Ulrich Germann
362f26d5eb
Logging in constructor of http_client. Member function to return server error message.
2015-07-31 23:19:03 +00:00
Ulrich Germann
3439de6341
Bug fix in uri_encode. Logging of lookup in http_client constructor.
2015-07-31 23:18:08 +00:00
Ulrich Germann
ecfc8d8b1a
Logging in query_bias_server().
2015-07-31 17:06:31 +01:00
Ulrich Germann
6ab2d4d3eb
Removed ranking stuff prior to merge with master.
2015-07-28 20:15:14 +01:00
Ulrich Germann
451016891e
No ranked sampling for the time being.
2015-07-28 19:35:27 +01:00
Ulrich Germann
d67723fd29
Merge branch 'master' of http://github.com/moses-smt/mosesdecoder into ranked-sampling
...
Conflicts:
moses/TargetPhrase.cpp
moses/TargetPhrase.h
2015-07-28 14:29:49 +01:00
Ulrich Germann
d1cb249a7f
Removed building of cooccurrence table from mmlex-build.
2015-07-28 14:24:06 +01:00
Ulrich Germann
2faa9e6fe4
Multi-threaded sorting when building suffix array.
2015-07-28 14:23:23 +01:00
Ulrich Germann
70a1c88614
New dummy bias that always returns 1.
...
Purpose: to keep track of phrase counts per document. If no bias is given,
no per-document counts are stored.
2015-07-26 21:23:13 +01:00
Ulrich Germann
f26e2008ca
work in progress
2015-07-26 21:21:19 +01:00
Ulrich Germann
a1652b4a97
Fix typo.
2015-07-26 21:20:40 +01:00
Ulrich Germann
8da4804631
Initial check-in.
2015-07-23 00:13:19 +01:00
Ulrich Germann
8e393c79ab
Logging of priming time in ranked sampling.
2015-07-23 00:12:34 +01:00
Ulrich Germann
f1bde0af05
Map instead of vector for bias map in SamplingBias.
2015-07-23 00:12:00 +01:00
Ulrich Germann
09d0909e0f
Trying to make sampling more efficient for large document collections underlying the sampling phrase table.
2015-07-23 00:10:52 +01:00
Ulrich Germann
053037816b
Increased verbosity threshold for logging document map.
2015-07-23 00:08:33 +01:00
Ulrich Germann
5aaa8fcbfa
1. Fixed concurrency issue in context handling.
2. Added phrase table feature function PScoreLengthRatio.
2015-07-21 15:36:28 +01:00
Ulrich Germann
506e02bdec
Added utility function len_from_pid().
2015-07-21 15:32:47 +01:00
Ulrich Germann
80db5487bc
Fixed typo in comment.
2015-07-21 15:31:46 +01:00
Philipp Koehn
b4b30cff7a
fix some compile time warnings about unsigned / signed int
2015-07-20 11:45:23 -04:00
Ulrich Germann
ad4fdc59c2
Re-enabled actual caching (get always returned NULL). Moved point of locking in release() in an attempt to battle segfaults.
2015-07-18 19:02:47 +01:00
Ulrich Germann
e94007c7f4
Mmsapt can now handle factorized phrase tables with more than one factor.
2015-07-13 17:51:44 +01:00
Ulrich Germann
cc5f128944
Allow 'ranked' as alias for sampling method 'rank'.
2015-07-11 00:24:20 +01:00
Ulrich Germann
03e19dd915
Commented out m_rnd_denom. Not used.
2015-07-07 20:16:41 +01:00
Ulrich Germann
8bdbfe583f
1. Added initialization of pstats cache on ContextForQuery.
2. Code cleanup: removed obsolete code.
2015-07-07 00:12:56 +01:00
Ulrich Germann
47eb0bd41e
Added seeding of random generator to produce the same results across repeated runs of the decoder.
2015-07-07 00:12:20 +01:00
Ulrich Germann
4dd2ea3117
Added random sampling to BitextSampler.
2015-07-05 13:08:57 +01:00
Ulrich Germann
e1f31666c3
Fixes to make things compile after merging with branch mmt-dev.
2015-07-03 17:20:27 +01:00
Ulrich Germann
98c03dc047
Merge branch 'mmt-dev' into ranked-sampling.
...
Conflicts:
moses/TranslationModel/UG/mm/ug_bitext.h
moses/TranslationModel/UG/mm/ug_bitext_jstats.cc
moses/TranslationModel/UG/mm/ug_bitext_jstats.h
moses/TranslationModel/UG/mm/ug_bitext_pstats.cc
moses/TranslationModel/UG/mm/ug_sampling_bias.cc
moses/TranslationModel/UG/mm/ug_sampling_bias.h
2015-07-03 15:30:33 +01:00
Ulrich Germann
9dae3eb785
Code cleanup.
2015-07-03 15:11:21 +01:00
Ulrich Germann
f78bb4a6e9
Bigger K-best list to accommodate phrase extraction failure.
2015-07-02 23:56:53 +01:00
Ulrich Germann
1c25b29ebb
Show from which documents phrase translations were collected.
2015-07-02 23:55:14 +01:00
Ulrich Germann
64ec34df5d
Proper indentation with spaces (no tabs).
2015-07-02 23:49:00 +01:00
Ulrich Germann
b05ca8cb80
Fixes to make code compile on various versions of gcc.
2015-07-02 18:06:55 +01:00
Ulrich Germann
e94921dc44
Removal of 'using namespace ...' from several header files.
2015-07-02 01:32:34 +01:00
Ulrich Germann
61067b4fa5
Merge branch 'FF_ttptr' of http://github.com/moses-smt/mosesdecoder
2015-07-01 13:25:26 +01:00
Ulrich Germann
41a11dfe8a
Allow ports other than 80 as the server ports for the context bias server.
2015-06-25 18:37:28 +01:00
XapaJIaMnu
fcc9bb1e60
when using the suffix array PT, set the ttask in the targetPhrase
2015-06-25 16:52:14 +01:00
XapaJIaMnu
a3ecd9f2a7
Revert "Break everything by trying to add ttasksptr to TargetPhrase" and try an easier approach
...
This reverts commit afdc1b480e.
2015-06-25 15:47:39 +01:00
XapaJIaMnu
afdc1b480e
Break everything by trying to add ttasksptr to TargetPhrase
2015-06-25 15:47:17 +01:00
Ulrich Germann
22cc22064c
Changed implementation of indocs (to keep track of which documents phrases come from) from vector to map.
2015-06-25 15:43:13 +01:00
XapaJIaMnu
47a488767e
Enable the bias weights to be (re)set by the server.
2015-06-25 13:12:33 +01:00
Ulrich Germann
78b2810cfe
Allow context server to use ports other than 80.
2015-06-24 18:09:22 +01:00
XapaJIaMnu
5a0168a6fa
forgot to negate a condition
2015-06-23 17:27:01 +01:00
XapaJIaMnu
e50926abf6
Enable the Suffix array to get context_weights from command line
2015-06-23 16:58:58 +01:00
MosesAdmin
e57ca5ec34
daily automatic beautifier
2015-06-22 00:00:43 +01:00
Marcin Junczys-Dowmunt
58f0187e8b
Merge branch 'master' of github.com:moses-smt/mosesdecoder
2015-06-21 19:24:53 +02:00
Marcin Junczys-Dowmunt
6151003c13
Remove C++11 oddities
2015-06-21 19:24:43 +02:00
Hieu Hoang
0f943dd9c1
clang compile errors
2015-06-21 21:16:12 +04:00
Marcin Junczys-Dowmunt
1bd10e104c
workaround/cleaning for weird copy-constructor behaviour with C++11
2015-06-21 18:27:56 +02:00
Ulrich Germann
65bd46df65
Added feature with cumulative bias.
2015-06-19 21:50:01 +01:00
Phil Williams
90470e878d
Fix some C++11-related compilation errors (clang)
2015-06-19 15:58:14 +01:00
Ulrich Germann
a627fd3cc6
Bug fix: set_bias_for_ranking needs to lock.
2015-06-15 14:22:32 +01:00
Ulrich Germann
9d46c5efa1
Rearrangement of members to match initialization order.
2015-06-15 14:20:45 +01:00
Ulrich Germann
4582e43473
Merge branch 'master' into ranked-sampling
2015-06-08 15:45:04 +01:00
Ulrich Germann
bca0f651da
Merge branch 'master' of https://github.com/moses-smt/mosesdecoder
2015-06-08 15:44:37 +01:00
Ulrich Germann
3a5acb56cc
Added some logging messages.
2015-06-08 15:32:53 +01:00
Ulrich Germann
b2a3bd280e
Allow intrusive pointers to const objects.
2015-06-08 14:10:48 +01:00
Ulrich Germann
2f125eddc3
Bug fix. Readability.
2015-06-08 14:08:35 +01:00
Ulrich Germann
5e2e63f678
Integration of ranked sampling.
2015-06-08 14:06:54 +01:00
Ulrich Germann
5dc9d68d2d
Initial check-in.
2015-06-08 14:05:41 +01:00
Ulrich Germann
78b0aab65b
Work in progress.
2015-06-08 14:04:19 +01:00
Ulrich Germann
36c3f9dda8
Work in progress. Bug fix (release pstats in destructor!). Various other changes.
2015-06-08 14:03:20 +01:00
Ulrich Germann
d34b107b91
Initial check-in.
2015-06-08 14:00:31 +01:00
Ulrich Germann
69f15d0c5a
New member function wait that won't return until sampling is done.
2015-06-08 14:00:17 +01:00
Ulrich Germann
ac99ec519f
Have SentenceBias keep track of document ids.
2015-06-08 13:53:39 +01:00
Ulrich Germann
ff97627e30
Update to emacs variables at top.
2015-06-08 13:52:34 +01:00
Ulrich Germann
4fcb9b98f7
Keeping track of cumulative bias scores.
2015-06-08 13:51:54 +01:00
Ulrich Germann
f1de677530
SentenceBias now has access to mapping from sentence IDs to document IDs.
2015-06-08 13:50:37 +01:00
Ulrich Germann
3c767fc333
New field to store cumulative bias scores.
2015-06-08 13:48:40 +01:00
Ulrich Germann
e8a4a9b10a
New member function to expose mapping from sentence IDs to document ids.
2015-06-08 13:45:51 +01:00
Ulrich Germann
d5b0ec7562
Initial check-in.
2015-06-08 12:20:25 +01:00
Ulrich Germann
7a57ce4dc2
Missing #pragma once.
2015-06-06 13:22:04 +01:00
Ulrich Germann
56dee1d4ac
Bug fixes: missing #include and const declaration of find_trg_phr-bounds().
2015-06-06 13:21:33 +01:00
Ulrich Germann
d4234847cd
Added #include.
2015-06-05 22:51:58 +01:00
Ulrich Germann
f87f123366
Added member function find_trg_phrase_bound(PhraseExtractionRecord& rec) to Bitext class.
2015-06-05 22:50:17 +01:00
Ulrich Germann
8ae2894107
Initial check-in.
2015-06-05 22:29:26 +01:00
Ulrich Germann
53752f70a7
Added member function find_trg_phr_bound(PhraseExtractionRecord& rec).
2015-06-05 22:28:02 +01:00
Ulrich Germann
c7fffab82c
Bug fixes.
2015-06-05 22:27:10 +01:00
Ulrich Germann
8a547ea82f
Added missing #include.
2015-06-05 22:25:49 +01:00
Ulrich Germann
704432cf0f
Bug fixes.
2015-06-05 22:25:13 +01:00
Ulrich Germann
623eb7bb77
Instantiation of btfix via boost::intrusive_ptr in Mmsapt.
...
This is in preparation for distinct bitext samplers which need to
ensure the lifetime of the bitext while sampling.
2015-06-05 21:15:47 +01:00
Ulrich Germann
e8ee56876e
Initial check-in.
2015-06-05 17:24:53 +01:00
Ulrich Germann
8f4b2afe26
#include a few more things.
2015-06-05 16:30:07 +01:00
Ulrich Germann
1b4b3a5103
Mmsapt: btfix now instantiated via intrusive pointer
...
... to prevent deletion while Mmsapt is live.
2015-06-05 16:27:49 +01:00
Ulrich Germann
47fa99b61b
Added member function size() to LRU_Cache.
2015-06-05 16:26:47 +01:00
Ulrich Germann
243a6a8b3b
Added #define for intrusive pointer.
2015-06-05 16:23:00 +01:00
Ulrich Germann
576c743aee
Simplified #include.
2015-06-05 16:22:03 +01:00
Ulrich Germann
5cb1d95e09
Added member function for retrieving nbest list items without sorting.
2015-06-05 16:21:09 +01:00
Ulrich Germann
5a56a5b496
Added target for forced relinking only (no forced recompilation); temporarily disabled tcmalloc.
2015-06-05 16:20:08 +01:00
Ulrich Germann
83fa1b6a88
Initial check-in.
2015-06-03 12:59:32 +01:00
Ulrich Germann
0afe139810
Initial check-in.
2015-06-03 12:55:58 +01:00
Ulrich Germann
debdd21899
Optional initialization of SentenceBias.
2015-06-03 12:53:38 +01:00
Ulrich Germann
f024eede74
Added ca() as short replacement for approxOccurrenceCount() to tsa_tree_iterator.
2015-06-03 12:51:44 +01:00
Hieu Hoang
3ea5faead8
codelite
2015-06-02 21:44:58 +04:00
Jeroen Vermeulen
35cf55d4d2
Trailing spaces.
2015-06-02 15:03:18 +07:00
Ulrich Germann
d62d2dc95f
Bug fix.
2015-06-01 23:10:50 +01:00
Ulrich Germann
aa4eed93d5
Bug fix related to getting rid of 'using namespace std;'.
2015-06-01 18:55:40 +01:00
Ulrich Germann
cc800742b1
Updated Makefile for local compiles.
2015-06-01 18:26:27 +01:00
Ulrich Germann
99896cfd2c
Untangling bitext class from Moses dependencies, so that the class can be used
independently of Moses again.
2015-06-01 18:25:04 +01:00
Ulrich Germann
349163f3fd
Bug fix and in-line code documentation.
2015-06-01 18:21:52 +01:00
Ulrich Germann
25f98a446e
Bug fix in building imTtrack directly from input stream.
2015-06-01 18:19:34 +01:00
Ulrich Germann
c82ee9a4e9
Bug fix.
2015-05-24 16:44:41 +01:00
Ulrich Germann
da052b7f2b
Removed dependency on libcurlpp, as it was difficult to link that statically.
2015-05-24 16:05:14 +01:00
Ulrich Germann
dcb8e5d3e0
Preparation for allowing context-aware decoding.
2015-05-19 02:35:39 +01:00
Hieu Hoang
39139e7a64
beautify.
2015-05-15 18:09:38 +01:00
Marcin Junczys-Dowmunt
7652ab9118
quick fix for out-of-bound alignment points
2015-05-15 09:12:51 +02:00
Jeroen Vermeulen
0859e9a844
Remove trailing whitespace from C++ files.
2015-05-13 17:05:43 +07:00
Jeroen Vermeulen
1364a7d599
Fix typo in mmap call.
...
The case where !m_fixed passed m_map_size to mmap(), but the "else"
clause passed map_size. In replacing mmap() with the portable wrapper,
I accidentally changed that to be m_map_size as well.
Besides fixing that, I'm changing the name of the variable to be more
clearly distinguishable from m_map_size.
2015-05-12 09:58:47 +07:00
Ulrich Germann
7da7ce52da
Added context buffering in IOWrapper for context-sensitive decoding.
...
Unfortunately, this seems to slow things down quite a bit.
2015-05-11 00:34:24 +01:00
Ulrich Germann
db5ccff364
Tweaks to logging for biased sampling.
2015-05-11 00:33:21 +01:00
Ulrich Germann
1778238d73
Logging of latency of bias lookup via server.
2015-05-11 00:32:20 +01:00
Ulrich Germann
8a174beb44
Additional check for document map if document bias is requested.
2015-05-11 00:30:32 +01:00
Nicola Bertoldi
90a982e579
merge remote into local
2015-05-04 09:42:44 +02:00
Nicola Bertoldi
c4f04670c2
made ProbingPT constructor compliant with PhraseDictionary signature
2015-05-04 09:25:50 +02:00
Hieu Hoang
cc8c6b7b10
beautify
2015-05-02 11:45:24 +01:00
Jeroen Vermeulen
eca5824100
Remove trailing whitespace in C++ files.
2015-04-30 12:05:11 +07:00
Ulrich Germann
324b1a9b56
Merge branch 'master' of https://github.com/moses-smt/mosesdecoder
2015-04-29 20:20:54 +01:00
Ulrich Germann
e4f5c69109
One step closer to eliminating the requirement to provide num-features=... in the config file.
...
Some FF (Mmsapt, LexicalReordering, many single-value FF) provide this number during "registration";
when missing, a default weight vector of uniform 1.0 is automatically generated. This eliminates the
need for the user to figure out what the exact number of features is for each FF, which can get complicated,
e.g. in the case of Mmsapt/PhraseDictionaryBitextSampling.
2015-04-29 20:16:52 +01:00
Ulrich Germann
c76f1c338d
Uninitialized variable.
2015-04-29 20:16:43 +01:00
Jeroen Vermeulen
616b589da3
Fix a bunch of compiler warnings.
...
Warnings are useful, but only if there are few!
2015-04-29 21:18:51 +07:00
Ulrich Germann
315610c02a
Merge branch 'master' of https://github.com/moses-smt/mosesdecoder
2015-04-27 16:39:40 +01:00
Ulrich Germann
37bb1de9ed
Unused variable.
2015-04-27 16:30:59 +01:00
Ulrich Germann
fbf8b1f8b8
Code design debizarrification: Indexes of feature functions into the dense vector of all feature
values are now stored on the feature function instead of in a global map that is a static
member of ScoreComponentCollection.
2015-04-26 16:46:36 +01:00
Ulrich Germann
e63561ae7f
Unused variable.
2015-04-26 15:41:32 +01:00
Hieu Hoang
41529227b2
boost unique lock
2015-04-26 18:11:11 +04:00
Ulrich Germann
bafe60c3a1
Make sure things work when curl-based biasing is disabled.
2015-04-26 03:14:40 +01:00
Ulrich Germann
0d72cdd72c
Merge branch 'master' of https://github.com/moses-smt/mosesdecoder into mmt-dev
...
Conflicts:
moses/Syntax/F2S/Manager-inl.h
moses/TranslationModel/UG/mmsapt.cpp
2015-04-26 02:12:16 +01:00
Jeroen Vermeulen
8ac91c8d97
Fix unqualified call to rand_excl().
...
The call needed to be made explicitly to util::rand_excl(). Sorry.
2015-04-24 00:22:25 +07:00
Jeroen Vermeulen
38d790cac0
Add cross-platform randomizer module.
...
The code uses two mechanisms for generating random numbers: srand()/rand(),
which is not thread-safe, and srandom()/random(), which is POSIX-specific.
Here I add a util/random.cc module that centralizes these calls, and unifies
some common usage patterns. If the implementation is not good enough, we can
now change it in a single place.
To keep things simple, this uses the portable srand()/rand() but protects them
with a lock to avoid concurrency problems.
The hard part was to keep the regression tests passing: they rely on fixed
sequences of random numbers, so a small code change could break them very
thoroughly. Util::rand(), for wide types like size_t, calls std::rand() not
once but twice. This behaviour was generalized into util::wide_rand() and
friends.
2015-04-23 23:46:04 +07:00
Jeroen Vermeulen
02d1d9a4af
Don't work around missing popen() in MinGW.
...
Windows does not have popen()/pclose(), so FileHandler.cpp #define's them to
_popen()/_pclose(). But MinGW has similar macros built into <cstdio>, leading
to warnings. So skip the workaround on MinGW.
2015-04-22 11:24:32 +07:00
Jeroen Vermeulen
32722ab5b1
Support tokenize(const std::string &) as well.
...
Convenience wrapper: the actual function takes a const char[], but many of
the call sites want to pass a string and have to call its c_str() first.
2015-04-22 10:35:18 +07:00
Jeroen Vermeulen
b2d821a141
Unify tokenize() into util, and unit-test it.
...
The duplicate definition works fine in environments where the inline
definition becomes a weak symbol in the object file, but if it gets
generated as a regular definition, the duplicate definition causes link
problems.
In most call sites the return value could easily be made const, which
gives both the reader and the compiler a bit more certainty about the code's
intentions. In theory this may help performance, but it's mainly for clarity.
The comments are based on reverse-engineering, and the unit tests are based
on the comments. It's possible that some of what's in there is not essential,
in which case, don't feel bad about changing it!
I left a third identical definition in place, though I updated it with my
changes to avoid creeping divergence, and noted the duplication in a comment.
It would be nice to get rid of this definition as well, but it'd introduce
headers from the main Moses tree into biconcor, which may be against policy.
2015-04-22 09:59:05 +07:00
Ulrich Germann
2c0851099b
Work on integrating hierarchical lexicalized reordering models with sampled phrase tables.
2015-04-21 17:48:48 +01:00
Ulrich Germann
0d13edae24
Added entry for bitext-find.
2015-04-21 17:47:39 +01:00
Ulrich Germann
9a9e43ea2c
Initial check-in: search utility for bi-concordancing.
2015-04-21 17:47:09 +01:00
Ulrich Germann
e7246686bf
New constructor.
2015-04-21 17:46:12 +01:00
Ulrich Germann
1791f47bfb
mmBitext now maintains a vector of document names.
2015-04-21 17:43:51 +01:00
Ulrich Germann
8a921f5dc9
Initial check-in.
2015-04-21 17:41:33 +01:00
Ulrich Germann
adc80953e4
Minor edits for better readability.
2015-04-21 17:40:31 +01:00
Ulrich Germann
70f83e5be9
Additions for writing out alignments in yawat format (for kwipc).
2015-04-21 17:39:06 +01:00
Ulrich Germann
f98de4dc83
Merge branch 'master' of https://github.com/moses-smt/mosesdecoder
2015-04-18 18:04:20 +01:00
Ulrich Germann
28d9e55379
Bug fix.
2015-04-18 16:53:57 +01:00
Ulrich Germann
e028eb7847
A single output factor in Mmsapt can now be specified externally. (Before: hard-coded to 0.)
2015-04-17 23:12:32 +01:00
Jeroen Vermeulen
d56f317f2e
New helper classes: temp_dir & temp_file.
...
I'm adding these because boost::filesystem::unique_path introduces
encoding issues: on Windows the path is in wchar_t, breaking use of
those strings in various places! Encoding the strings is just too
much work.
It's still possible that the current temp_file implementation won't
build on Windows (it uses POSIX mkstemp() and close()) but that can
be fixed underneath the API.
2015-04-17 22:57:55 +07:00
Jeroen Vermeulen
1e3e445e3f
Use cross-platform mmap() wrapper in CompactPT.
...
The MmapAllocator header made use of sys/mman.h and mmap(), which are
Unix-specific. But util has a wrapper which also works on Windows.
This also fixes the error handling: when mmap() failed, the old code would
return an invalid (but non-NULL!) pointer — leading to a crash. The wrapper
will throw an exception with a helpful error message.
2015-04-17 18:53:46 +07:00
Jeroen Vermeulen
464615a0c3
Fix some clang++ warnings.
...
Compiling with clang++ at the default warning/error levels produces
some interesting warnings. Here's a pair of fixes for the simplest
instances:
moses/TranslationModel/RuleTable/PhraseDictionaryFuzzyMatch.cpp:133:7:
warning: comparison of array 'path' equal to a null pointer is always
false [-Wtautological-pointer-compare]
if (path == NULL) {
^~~~ ~~~~
(The code unnecessarily checks that an automatic variable has a
non-null address).
moses/TranslationModel/DynSAInclude/onlineRLM.h:305:20:
warning: unsequenced modification and access to 'den_val' [-Wunsequenced]
if(((den_val = query(&ngram[len - num_fnd], num_fnd - 1)) > 0) &&
^
(The code tries to cram too much into an "if" condition.)
2015-04-07 22:58:17 +07:00
Ulrich Germann
e110e7df6b
Bug fix.
2015-04-05 16:18:09 +01:00
Ulrich Germann
3e2f878576
Merge branch 'master' into mmt-dev
...
Conflicts:
Jamroot
moses/TranslationModel/UG/mmsapt.h
2015-04-05 15:51:50 +01:00
Ulrich Germann
46e31a285c
- Code refactoring for Bitext class.
- Bug fixes and conceptual improvements in biased sampling. The sampling now
tries to stick to the bias, even when an unsuitable corpus dominates
the occurrences.
2015-04-05 14:29:00 +01:00
Ulrich Germann
05c4e382ff
Better logging during biased sampling in Mmsapt.
2015-04-03 21:12:44 +01:00
Ulrich Germann
b6c887b370
Minor bug fix in logging biased sampling for phrase lookup.
2015-04-03 20:18:55 +01:00
Ulrich Germann
93ce2423df
1. A context string for biased sampling in Mmsapt can now be provided on the
command line with --context-string. Not available in server mode yet.
2. Numerous bug fixes related to biased sampling.
3. Biased sampling now checks that the sampling sticks to the bias. If
the distribution of samples deviates too much from the bias, samples
whose selection would push the sample distribution even further from the bias
are not considered, even if that means that fewer samples are chosen in total.
2015-04-03 16:16:52 +01:00
Jeroen Vermeulen
ebc0930500
Replace use of tmpnam with boost::filesystem.
...
Silences a few annoying warnings from gcc: "tmpnam is dangerous" (and
the suggestion to use mkstemp instead).
2015-04-02 10:42:06 +07:00
XapaJIaMnu
29a729c99b
Remove old obsolete probingPT tests
2015-04-01 16:58:21 +01:00
Ulrich Germann
a9dbced81d
Bug fix.
2015-03-30 02:56:49 +01:00
Ulrich Germann
fcbfc5a535
Feature functions and the constructors of TranslationOptionCollections
now have access to the current translation task.
This was done to allow context-sensitive processing (if provided by the FF).
2015-03-30 01:20:17 +01:00
Ulrich Germann
79cd40d2c4
Disabled temporarily. Needs to be adapted to API changes in Mmsapt.
2015-03-29 23:58:17 +01:00
Ulrich Germann
2899645992
Cleanup.
2015-03-29 23:57:14 +01:00
Ulrich Germann
3541838a46
Included TargetPhraseCollectionCache.* in fakelib mmsapt.
2015-03-29 23:55:47 +01:00
Ulrich Germann
1525f1ea62
Cleanup.
2015-03-29 23:44:06 +01:00
Ulrich Germann
529a766da7
Initial check-in.
2015-03-29 23:43:50 +01:00
Jeroen Vermeulen
b124d99330
Use boost::filesystem for "rm -rf".
...
Replaces a system() call (which was a portability problem) and fixes,
en passant, a warning about its return value being ignored.
2015-03-29 18:33:58 +07:00
Jeroen Vermeulen
789a2e2bc3
Fix some compile warnings (gcc 4.9.2).
...
Mostly signed/unsigned comparisons and reordered member
initializations; also a few unused variables.
There are more, but if I chip away at them for a while, who knows, it
may catch on and warnings may eventually become socially stigmatizing.
:)
2015-03-29 18:10:51 +07:00
Ulrich Germann
1b23edf62f
Cache for the N most recently used TargetPhraseCollections. Refactored out of mmsapt.h.
2015-03-28 14:41:08 +00:00
Jeroen Vermeulen
a9c8f44896
Modernize "C" includes in moses.
...
This is one of those little chores in managing a long-lived C++
project: standard C headers like stdio.h and math.h now have their own
place in the C++ standard as resp. cstdio, cmath, and so on. In this
branch the #include names are updated for the moses/ subdirectory; more
branches to follow.
C++11 adds cstdint, but to support compilation with the previous
standard, that change is left for later.
2015-03-28 20:09:03 +07:00
Hieu Hoang
1064aaacbe
delete typedefs for UINT32 and UINT64. MSVC now has uint32_t and uint64_t /Ken
2015-03-25 00:55:39 +00:00
Ulrich Germann
8ca11d941d
1. Lifetime of tasks in ThreadPool is now managed via shared pointers.
2. Code cleanup in IOWrapper and a bit elsewhere.
2015-03-21 16:12:52 +00:00
Ulrich Germann
ee4e396a4d
Removed pointer to TranslationTask in InputTypes again. Not the right place to store this information.
2015-03-21 15:29:37 +00:00
Ulrich Germann
dcffbb5f4d
Made LRModel::ReorderingType an enumerated type.
2015-03-16 00:24:11 +00:00
Ulrich Germann
085c88cc7b
Eliminated sources of some compiler warnings (unused variables; signed/unsigned comparisons).
2015-03-15 22:45:01 +00:00
Ulrich Germann
ad805c133b
Instances of InputType (and derived classes) now know which TranslationTask (if any) created them.
...
This is a first step towards providing phrase tables etc. access to context information etc.
associated with specific translation tasks.
2015-03-15 20:38:31 +00:00
Ulrich Germann
2a66a55c85
Added document map (maps from sentences to document ids) to Bitext class.
...
Minor overhaul to the bias regime, which allows specifying bias by document
name (as provided in the document map) rather than by sentence in the static
parallel corpus.
2015-03-15 13:32:09 +00:00
Ulrich Germann
51824355f9
Sampling now keeps track of counts for hierarchical lexicalized reordering.
2015-03-10 10:41:41 +00:00
Ulrich Germann
524376fad4
Code cleanup.
2015-03-09 00:34:47 +00:00
Hieu Hoang
32de075022
beautify
2015-02-19 12:27:23 +00:00
Ulrich Germann
ccf44f39fb
Code cleanup and reorganization. A few classes have been renamed to shorter names.
2015-02-15 01:45:22 +00:00
Hieu Hoang
755bd609f5
Using boost for prefix/suffix checks /Jeroen Vermeulen
2015-02-06 15:52:25 +00:00