Ulrich Germann
98c03dc047
Merge branch 'mmt-dev' into ranked-sampling.
...
Conflicts:
moses/TranslationModel/UG/mm/ug_bitext.h
moses/TranslationModel/UG/mm/ug_bitext_jstats.cc
moses/TranslationModel/UG/mm/ug_bitext_jstats.h
moses/TranslationModel/UG/mm/ug_bitext_pstats.cc
moses/TranslationModel/UG/mm/ug_sampling_bias.cc
moses/TranslationModel/UG/mm/ug_sampling_bias.h
2015-07-03 15:30:33 +01:00
Ulrich Germann
9dae3eb785
Code cleanup.
2015-07-03 15:11:21 +01:00
Ulrich Germann
f78bb4a6e9
Bigger K-best list to accommodate phrase extraction failure.
2015-07-02 23:56:53 +01:00
Ulrich Germann
1c25b29ebb
Show from which documents phrase translations were collected.
2015-07-02 23:55:14 +01:00
Ulrich Germann
64ec34df5d
Proper indentation with spaces (no tabs).
2015-07-02 23:49:00 +01:00
Ulrich Germann
b05ca8cb80
Fixes to make code compile on various versions of gcc.
2015-07-02 18:06:55 +01:00
Ulrich Germann
e94921dc44
Removal of 'using namespace ...' from several header files.
2015-07-02 01:32:34 +01:00
Ulrich Germann
61067b4fa5
Merge branch 'FF_ttptr' of http://github.com/moses-smt/mosesdecoder
2015-07-01 13:25:26 +01:00
Ulrich Germann
41a11dfe8a
Allow ports other than 80 as the server ports for the context bias server.
2015-06-25 18:37:28 +01:00
XapaJIaMnu
fcc9bb1e60
when using the suffix array PT, set the ttask in the targetPhrase
2015-06-25 16:52:14 +01:00
XapaJIaMnu
a3ecd9f2a7
Revert "Break everything by trying to add ttasksptr to TargetPhrase" and try an easier approach
...
This reverts commit afdc1b480e.
2015-06-25 15:47:39 +01:00
XapaJIaMnu
afdc1b480e
Break everything by trying to add ttasksptr to TargetPhrase
2015-06-25 15:47:17 +01:00
Ulrich Germann
22cc22064c
Changed implementation of indocs (to keep track of which documents phrases come from) from vector to map.
2015-06-25 15:43:13 +01:00
XapaJIaMnu
47a488767e
Enable the bias weights to be (re)set by the server.
2015-06-25 13:12:33 +01:00
Ulrich Germann
78b2810cfe
Allow context server to use ports other than 80.
2015-06-24 18:09:22 +01:00
XapaJIaMnu
5a0168a6fa
forgot to negate a condition
2015-06-23 17:27:01 +01:00
XapaJIaMnu
e50926abf6
Enable the Suffix array to get context_weights from command line
2015-06-23 16:58:58 +01:00
MosesAdmin
e57ca5ec34
daily automatic beautifier
2015-06-22 00:00:43 +01:00
Marcin Junczys-Dowmunt
58f0187e8b
Merge branch 'master' of github.com:moses-smt/mosesdecoder
2015-06-21 19:24:53 +02:00
Marcin Junczys-Dowmunt
6151003c13
Remove C++11 oddities
2015-06-21 19:24:43 +02:00
Hieu Hoang
0f943dd9c1
clang compile errors
2015-06-21 21:16:12 +04:00
Marcin Junczys-Dowmunt
1bd10e104c
workaround/cleaning for weird copy-constructor behaviour with C++11
2015-06-21 18:27:56 +02:00
Ulrich Germann
65bd46df65
Added feature with cumulative bias.
2015-06-19 21:50:01 +01:00
Phil Williams
90470e878d
Fix some C++11-related compilation errors (clang)
2015-06-19 15:58:14 +01:00
Ulrich Germann
a627fd3cc6
Bug fix: set_bias_for_ranking needs to lock.
2015-06-15 14:22:32 +01:00
Ulrich Germann
9d46c5efa1
Rearrangement of members to match initialization order.
2015-06-15 14:20:45 +01:00
Ulrich Germann
4582e43473
Merge branch 'master' into ranked-sampling
2015-06-08 15:45:04 +01:00
Ulrich Germann
bca0f651da
Merge branch 'master' of https://github.com/moses-smt/mosesdecoder
2015-06-08 15:44:37 +01:00
Ulrich Germann
3a5acb56cc
Added some logging messages.
2015-06-08 15:32:53 +01:00
Ulrich Germann
b2a3bd280e
Allow intrusive pointers to const objects.
2015-06-08 14:10:48 +01:00
Ulrich Germann
2f125eddc3
Bug fix. Readability.
2015-06-08 14:08:35 +01:00
Ulrich Germann
5e2e63f678
Integration of ranked sampling.
2015-06-08 14:06:54 +01:00
Ulrich Germann
5dc9d68d2d
Initial check-in.
2015-06-08 14:05:41 +01:00
Ulrich Germann
78b0aab65b
Work in progress.
2015-06-08 14:04:19 +01:00
Ulrich Germann
36c3f9dda8
Work in progress. Bug fix (release pstats in destructor!). Various other changes.
2015-06-08 14:03:20 +01:00
Ulrich Germann
d34b107b91
Initial check-in.
2015-06-08 14:00:31 +01:00
Ulrich Germann
69f15d0c5a
New member function wait that won't return until sampling is done.
2015-06-08 14:00:17 +01:00
Ulrich Germann
ac99ec519f
Have SentenceBias keep track of document ids.
2015-06-08 13:53:39 +01:00
Ulrich Germann
ff97627e30
Update to emacs variables at top.
2015-06-08 13:52:34 +01:00
Ulrich Germann
4fcb9b98f7
Keeping track of cumulative bias scores.
2015-06-08 13:51:54 +01:00
Ulrich Germann
f1de677530
SentenceBias now has access to mapping from sentence IDs to document IDs.
2015-06-08 13:50:37 +01:00
Ulrich Germann
3c767fc333
New field to store cumulative bias scores.
2015-06-08 13:48:40 +01:00
Ulrich Germann
e8a4a9b10a
New member function to expose mapping from sentence IDs to document ids.
2015-06-08 13:45:51 +01:00
Ulrich Germann
d5b0ec7562
Initial check-in.
2015-06-08 12:20:25 +01:00
Ulrich Germann
7a57ce4dc2
Missing #pragma once.
2015-06-06 13:22:04 +01:00
Ulrich Germann
56dee1d4ac
Bug fixes: missing #include and const declaration of find_trg_phr_bounds().
2015-06-06 13:21:33 +01:00
Ulrich Germann
d4234847cd
Added #include.
2015-06-05 22:51:58 +01:00
Ulrich Germann
f87f123366
Added member function find_trg_phrase_bound(PhraseExtractionRecord& rec) to Bitext class.
2015-06-05 22:50:17 +01:00
Ulrich Germann
8ae2894107
Initial check-in.
2015-06-05 22:29:26 +01:00
Ulrich Germann
53752f70a7
Added member function find_trg_phr_bound(PhraseExtractionRecord& rec).
2015-06-05 22:28:02 +01:00
Ulrich Germann
c7fffab82c
Bug fixes.
2015-06-05 22:27:10 +01:00
Ulrich Germann
8a547ea82f
Added missing #include.
2015-06-05 22:25:49 +01:00
Ulrich Germann
704432cf0f
Bug fixes.
2015-06-05 22:25:13 +01:00
Ulrich Germann
623eb7bb77
Instantiation of btfix via boost::intrusive_ptr in Mmsapt.
...
This is in preparation for distinct bitext samplers which need to
ensure the lifetime of the bitext while sampling.
2015-06-05 21:15:47 +01:00
Ulrich Germann
e8ee56876e
Initial check-in.
2015-06-05 17:24:53 +01:00
Ulrich Germann
8f4b2afe26
#include a few more things.
2015-06-05 16:30:07 +01:00
Ulrich Germann
1b4b3a5103
Mmsapt: btfix now instantiated via intrusive pointer to prevent deletion while Mmsapt is live.
2015-06-05 16:27:49 +01:00
Ulrich Germann
47fa99b61b
Added member function size() to LRU_Cache.
2015-06-05 16:26:47 +01:00
Ulrich Germann
243a6a8b3b
Added #define for intrusive pointer.
2015-06-05 16:23:00 +01:00
Ulrich Germann
576c743aee
Simplified #include.
2015-06-05 16:22:03 +01:00
Ulrich Germann
5cb1d95e09
Added member function for retrieving nbest list items without sorting.
2015-06-05 16:21:09 +01:00
Ulrich Germann
5a56a5b496
Added target for forced relinking only (no forced recompilation); temporarily disabled tcmalloc.
2015-06-05 16:20:08 +01:00
Ulrich Germann
83fa1b6a88
Initial check-in.
2015-06-03 12:59:32 +01:00
Ulrich Germann
0afe139810
Initial check-in.
2015-06-03 12:55:58 +01:00
Ulrich Germann
debdd21899
Optional initialization of SentenceBias.
2015-06-03 12:53:38 +01:00
Ulrich Germann
f024eede74
Added ca() as short replacement for approxOccurrenceCount() to tsa_tree_iterator.
2015-06-03 12:51:44 +01:00
Hieu Hoang
3ea5faead8
codelite
2015-06-02 21:44:58 +04:00
Jeroen Vermeulen
35cf55d4d2
Trailing spaces.
2015-06-02 15:03:18 +07:00
Ulrich Germann
d62d2dc95f
Bug fix.
2015-06-01 23:10:50 +01:00
Ulrich Germann
aa4eed93d5
Bug fix related to getting rid of 'using namespace std;'.
2015-06-01 18:55:40 +01:00
Ulrich Germann
cc800742b1
Updated Makefile for local compiles.
2015-06-01 18:26:27 +01:00
Ulrich Germann
99896cfd2c
Untangling bitext class from Moses dependencies, so that the class can be used
independently of Moses again.
2015-06-01 18:25:04 +01:00
Ulrich Germann
349163f3fd
Bug fix and in-line code documentation.
2015-06-01 18:21:52 +01:00
Ulrich Germann
25f98a446e
Bug fix in building imTtrack directly from input stream.
2015-06-01 18:19:34 +01:00
Ulrich Germann
c82ee9a4e9
Bug fix.
2015-05-24 16:44:41 +01:00
Ulrich Germann
da052b7f2b
Removed dependency on libcurlpp, as it was difficult to link that statically.
2015-05-24 16:05:14 +01:00
Ulrich Germann
dcb8e5d3e0
Preparation for allowing context-aware decoding.
2015-05-19 02:35:39 +01:00
Hieu Hoang
39139e7a64
beautify.
2015-05-15 18:09:38 +01:00
Marcin Junczys-Dowmunt
7652ab9118
quick fix for out-of-bound alignment points
2015-05-15 09:12:51 +02:00
Jeroen Vermeulen
0859e9a844
Remove trailing whitespace from C++ files.
2015-05-13 17:05:43 +07:00
Jeroen Vermeulen
1364a7d599
Fix typo in mmap call.
...
The case where !m_fixed passed m_map_size to mmap(), but the "else"
clause passed map_size. In replacing mmap() with the portable wrapper,
I accidentally changed that to be m_map_size as well.
Besides fixing that, I'm changing the name of the variable to be more
clearly distinguishable from m_map_size.
2015-05-12 09:58:47 +07:00
Ulrich Germann
7da7ce52da
Added context buffering in IOWrapper for context-sensitive decoding.
...
Unfortunately, this seems to slow things down quite a bit.
2015-05-11 00:34:24 +01:00
Ulrich Germann
db5ccff364
Tweaks to logging for biased sampling.
2015-05-11 00:33:21 +01:00
Ulrich Germann
1778238d73
Logging of latency of bias lookup via server.
2015-05-11 00:32:20 +01:00
Ulrich Germann
8a174beb44
Additional check for document map if document bias is requested.
2015-05-11 00:30:32 +01:00
Nicola Bertoldi
90a982e579
merge remote into local
2015-05-04 09:42:44 +02:00
Nicola Bertoldi
c4f04670c2
made ProbingPT constructor compliant with PhraseDictionary signature
2015-05-04 09:25:50 +02:00
Hieu Hoang
cc8c6b7b10
beautify
2015-05-02 11:45:24 +01:00
Jeroen Vermeulen
eca5824100
Remove trailing whitespace in C++ files.
2015-04-30 12:05:11 +07:00
Ulrich Germann
324b1a9b56
Merge branch 'master' of https://github.com/moses-smt/mosesdecoder
2015-04-29 20:20:54 +01:00
Ulrich Germann
e4f5c69109
One step closer to eliminating the requirement to provide num-features=... in the config file.
...
Some FFs (Mmsapt, LexicalReordering, many single-value FFs) provide this number during "registration";
when missing, a default weight vector of uniform 1.0 is automatically generated. This eliminates the
need for the user to figure out what the exact number of features is for each FF, which can get complicated,
e.g. in the case of Mmsapt/PhraseDictionaryBitextSampling.
2015-04-29 20:16:52 +01:00
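The fallback described in commit e4f5c69109 above can be sketched roughly as follows. This is an illustration only, not the Moses implementation: the namespace `sketch`, the function `weights_or_default`, and its parameters are hypothetical names, and the real registration logic may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {  // hypothetical namespace, not part of Moses

// Return the configured weights if the user supplied any; otherwise build a
// uniform default vector of 1.0 with one entry per feature that the feature
// function reported when it registered itself.
inline std::vector<float> weights_or_default(
    const std::vector<float> &configured, std::size_t num_features) {
  if (!configured.empty()) {
    return configured;  // explicit weights take precedence
  }
  return std::vector<float>(num_features, 1.0f);
}

}  // namespace sketch
```

With a helper like this, a feature function that reports three features and has no configured weights would receive the vector {1.0, 1.0, 1.0}.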
Ulrich Germann
c76f1c338d
Uninitialized variable.
2015-04-29 20:16:43 +01:00
Jeroen Vermeulen
616b589da3
Fix a bunch of compiler warnings.
...
Warnings are useful, but only if there are few!
2015-04-29 21:18:51 +07:00
Ulrich Germann
315610c02a
Merge branch 'master' of https://github.com/moses-smt/mosesdecoder
2015-04-27 16:39:40 +01:00
Ulrich Germann
37bb1de9ed
Unused variable.
2015-04-27 16:30:59 +01:00
Ulrich Germann
fbf8b1f8b8
Code design debizarrification: Indexes of feature functions into the dense vector of all feature
values are now stored on the feature function instead of in a global map that is a static
member of ScoreComponentCollection.
2015-04-26 16:46:36 +01:00
Ulrich Germann
e63561ae7f
Unused variable.
2015-04-26 15:41:32 +01:00
Hieu Hoang
41529227b2
boost unique lock
2015-04-26 18:11:11 +04:00
Ulrich Germann
bafe60c3a1
Make sure things work when curl-based biasing is disabled.
2015-04-26 03:14:40 +01:00
Ulrich Germann
0d72cdd72c
Merge branch 'master' of https://github.com/moses-smt/mosesdecoder into mmt-dev
...
Conflicts:
moses/Syntax/F2S/Manager-inl.h
moses/TranslationModel/UG/mmsapt.cpp
2015-04-26 02:12:16 +01:00
Jeroen Vermeulen
8ac91c8d97
Fix unqualified call to rand_excl().
...
The call needed to be made explicitly to util::rand_excl(). Sorry.
2015-04-24 00:22:25 +07:00
Jeroen Vermeulen
38d790cac0
Add cross-platform randomizer module.
...
The code uses two mechanisms for generating random numbers: srand()/rand(),
which is not thread-safe, and srandom()/random(), which is POSIX-specific.
Here I add a util/random.cc module that centralizes these calls, and unifies
some common usage patterns. If the implementation is not good enough, we can
now change it in a single place.
To keep things simple, this uses the portable srand()/rand() but protects them
with a lock to avoid concurrency problems.
The hard part was to keep the regression tests passing: they rely on fixed
sequences of random numbers, so a small code change could break them very
thoroughly. Util::rand(), for wide types like size_t, calls std::rand() not
once but twice. This behaviour was generalized into util::wide_rand() and
friends.
2015-04-23 23:46:04 +07:00
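The approach described in commit 38d790cac0 above (srand()/rand() behind a lock, plus a "wide" variant that draws twice) might look roughly like the sketch below. It is not the actual util/random.cc: the namespace `sketch`, the names `rand_init` and `rand_int`, the use of C++11 `std::mutex`, and the way the two draws are combined in `wide_rand` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <mutex>

namespace sketch {  // hypothetical namespace, not the real util module

// One global lock serializes every call into the C library generator,
// which is not guaranteed to be thread-safe.
inline std::mutex &rand_lock() {
  static std::mutex m;
  return m;
}

// Seed the shared generator.
inline void rand_init(unsigned int seed) {
  std::lock_guard<std::mutex> guard(rand_lock());
  std::srand(seed);
}

// Lock-protected replacement for a bare std::rand() call.
inline int rand_int() {
  std::lock_guard<std::mutex> guard(rand_lock());
  return std::rand();
}

// "Wide" draw: call std::rand() twice under a single lock and combine the
// two results, so the value can exceed RAND_MAX. The combination used here
// (high draw scaled by RAND_MAX + 1, plus low draw) is an assumption.
inline unsigned long long wide_rand() {
  std::lock_guard<std::mutex> guard(rand_lock());
  const unsigned long long hi = static_cast<unsigned long long>(std::rand());
  const unsigned long long lo = static_cast<unsigned long long>(std::rand());
  return hi * (static_cast<unsigned long long>(RAND_MAX) + 1ULL) + lo;
}

}  // namespace sketch
```

Because every draw goes through the same seeded generator, reseeding with the same value reproduces the same sequence, which is the property the regression tests mentioned in the commit message depend on.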
Jeroen Vermeulen
02d1d9a4af
Don't work around missing popen() in MinGW.
...
Windows does not have popen()/pclose(), so FileHandler.cpp #define's them to
_popen()/_pclose(). But MinGW has similar macros built into <cstdio>, leading
to warnings. So skip the workaround on MinGW.
2015-04-22 11:24:32 +07:00
Jeroen Vermeulen
32722ab5b1
Support tokenize(const std::string &) as well.
...
Convenience wrapper: the actual function takes a const char[], but many of
the call sites want to pass a string and have to call its c_str() first.
2015-04-22 10:35:18 +07:00
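The convenience wrapper from commit 32722ab5b1 above can be illustrated with a small overload pair. This is a sketch under assumptions: the namespace `sketch` is hypothetical, and the splitting rule shown (spaces and tabs, empty tokens skipped) is chosen for the example and may not match the real tokenize().

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {  // hypothetical namespace, not the real util module

// Core function: split a C string on spaces and tabs, skipping empty tokens.
inline std::vector<std::string> tokenize(const char *input) {
  std::vector<std::string> tokens;
  std::string current;
  for (const char *p = input; *p != '\0'; ++p) {
    if (*p == ' ' || *p == '\t') {
      if (!current.empty()) {
        tokens.push_back(current);
        current.clear();
      }
    } else {
      current.push_back(*p);
    }
  }
  if (!current.empty()) tokens.push_back(current);
  return tokens;
}

// Convenience overload: callers holding a std::string no longer need to
// call c_str() themselves.
inline std::vector<std::string> tokenize(const std::string &input) {
  return tokenize(input.c_str());
}

}  // namespace sketch
```

A call site can then write `const std::vector<std::string> words = sketch::tokenize(line);` for a `std::string line`, keeping the result const as the related tokenize() commit suggests.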
Jeroen Vermeulen
b2d821a141
Unify tokenize() into util, and unit-test it.
...
The duplicate definition works fine in environments where the inline
definition becomes a weak symbol in the object file, but if it gets
generated as a regular definition, the duplicate definition causes link
problems.
In most call sites the return value could easily be made const, which
gives both the reader and the compiler a bit more certainty about the code's
intentions. In theory this may help performance, but it's mainly for clarity.
The comments are based on reverse-engineering, and the unit tests are based
on the comments. It's possible that some of what's in there is not essential,
in which case, don't feel bad about changing it!
I left a third identical definition in place, though I updated it with my
changes to avoid creeping divergence, and noted the duplication in a comment.
It would be nice to get rid of this definition as well, but it'd introduce
headers from the main Moses tree into biconcor, which may be against policy.
2015-04-22 09:59:05 +07:00
Ulrich Germann
2c0851099b
Work on integrating hierarchical lexicalized reordering models with sampled phrase tables.
2015-04-21 17:48:48 +01:00
Ulrich Germann
0d13edae24
Added entry for bitext-find.
2015-04-21 17:47:39 +01:00
Ulrich Germann
9a9e43ea2c
Initial check-in: search utility for bi-concordancing.
2015-04-21 17:47:09 +01:00
Ulrich Germann
e7246686bf
New constructor.
2015-04-21 17:46:12 +01:00
Ulrich Germann
1791f47bfb
mmBitext now maintains a vector of document names.
2015-04-21 17:43:51 +01:00
Ulrich Germann
8a921f5dc9
Initial check-in.
2015-04-21 17:41:33 +01:00
Ulrich Germann
adc80953e4
Minor edits for better readability.
2015-04-21 17:40:31 +01:00
Ulrich Germann
70f83e5be9
Additions for writing out alignments in yawat format (for kwipc).
2015-04-21 17:39:06 +01:00
Ulrich Germann
f98de4dc83
Merge branch 'master' of https://github.com/moses-smt/mosesdecoder
2015-04-18 18:04:20 +01:00
Ulrich Germann
28d9e55379
Bug fix.
2015-04-18 16:53:57 +01:00
Ulrich Germann
e028eb7847
A single output factor in Mmsapt can now be specified externally. (Before: hard-coded to 0.)
2015-04-17 23:12:32 +01:00
Jeroen Vermeulen
d56f317f2e
New helper classes: temp_dir & temp_file.
...
I'm adding these because boost::filesystem::unique_path introduces
encoding issues: on Windows the path is in wchar_t, breaking use of
those strings in various places! Encoding the strings is just too
much work.
It's still possible that the current temp_file implementation won't
build on Windows (it uses POSIX mkstemp() and close()) but that can
be fixed underneath the API.
2015-04-17 22:57:55 +07:00
Jeroen Vermeulen
1e3e445e3f
Use cross-platform mmap() wrapper in CompactPT.
...
The MmapAllocator header made use of sys/mman.h and mmap(), which are
Unix-specific. But util has a wrapper which also works on Windows.
This also fixes the error handling: when mmap() failed, the old code would
return an invalid (but non-NULL!) pointer — leading to a crash. The wrapper
will throw an exception with a helpful error message.
2015-04-17 18:53:46 +07:00
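The error-handling point in commit 1e3e445e3f above comes down to the fact that POSIX mmap() signals failure with MAP_FAILED, which is not NULL, so a NULL check lets the bad pointer through. The sketch below shows a throwing wrapper in that spirit. It is POSIX-only and is not the util wrapper the commit refers to: the namespace `sketch` and the function `map_read_or_throw` are hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>
#include <sys/mman.h>

namespace sketch {  // hypothetical namespace, not the real util module

// Map `size` bytes of the file behind `fd` read-only, starting at offset 0.
// On failure mmap() returns MAP_FAILED (not NULL), so compare against that
// and turn the failure into an exception carrying the errno description.
inline void *map_read_or_throw(int fd, std::size_t size) {
  void *p = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
  if (p == MAP_FAILED) {
    throw std::runtime_error(std::string("mmap failed: ") +
                             std::strerror(errno));
  }
  return p;
}

}  // namespace sketch
```

Passing an invalid descriptor such as -1 makes the wrapper throw instead of handing back a pointer that would crash on first use.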
Jeroen Vermeulen
464615a0c3
Fix some clang++ warnings.
...
Compiling with clang++ at the default warning/error levels produces
some interesting warnings. Here's a pair of fixes for the simplest
instances:
moses/TranslationModel/RuleTable/PhraseDictionaryFuzzyMatch.cpp:133:7:
warning: comparison of array 'path' equal to a null pointer is always
false [-Wtautological-pointer-compare]
if (path == NULL) {
^~~~ ~~~~
(The code unnecessarily checks that an automatic variable has a
non-null address).
moses/TranslationModel/DynSAInclude/onlineRLM.h:305:20:
warning: unsequenced modification and access to 'den_val' [-Wunsequenced]
if(((den_val = query(&ngram[len - num_fnd], num_fnd - 1)) > 0) &&
^
(The code tries to cram too much into an "if" condition.)
2015-04-07 22:58:17 +07:00
Ulrich Germann
e110e7df6b
Bug fix.
2015-04-05 16:18:09 +01:00
Ulrich Germann
3e2f878576
Merge branch 'master' into mmt-dev
...
Conflicts:
Jamroot
moses/TranslationModel/UG/mmsapt.h
2015-04-05 15:51:50 +01:00
Ulrich Germann
46e31a285c
- Code refactoring for Bitext class.
- Bug fixes and conceptual improvements in biased sampling. The sampling now
tries to stick to the bias, even when an unsuitable corpus dominates
the occurrences.
2015-04-05 14:29:00 +01:00
Ulrich Germann
05c4e382ff
Better logging during biased sampling in Mmsapt.
2015-04-03 21:12:44 +01:00
Ulrich Germann
b6c887b370
Minor bug fix in logging biased sampling for phrase lookup.
2015-04-03 20:18:55 +01:00
Ulrich Germann
93ce2423df
1. A context string for biased sampling in Mmsapt can now be provided on the
command line with --context-string. Not available in server mode yet.
2. Numerous bug fixes related to biased sampling.
3. Biased sampling now checks that the sampling sticks to the bias. If
the distribution of samples deviates too much from the bias, samples
whose selection would push the sample distribution even further from the bias
are not considered, even if that means that fewer samples are chosen in total.
2015-04-03 16:16:52 +01:00
Jeroen Vermeulen
ebc0930500
Replace use of tmpnam with boost::filesystem.
...
Silences a few annoying warnings from gcc: "tmpnam is dangerous" (and
the suggestion to use mkstemp instead).
2015-04-02 10:42:06 +07:00
XapaJIaMnu
29a729c99b
Remove old obsolete probingPT tests
2015-04-01 16:58:21 +01:00
Ulrich Germann
a9dbced81d
Bug fix.
2015-03-30 02:56:49 +01:00
Ulrich Germann
fcbfc5a535
Feature functions and the constructors of TranslationOptionCollections
now have access to the current translation task.
This was done to allow context-sensitive processing (if provided by the FF).
2015-03-30 01:20:17 +01:00
Ulrich Germann
79cd40d2c4
Disabled temporarily. Needs to be adapted to API changes in Mmsapt.
2015-03-29 23:58:17 +01:00
Ulrich Germann
2899645992
Cleanup.
2015-03-29 23:57:14 +01:00
Ulrich Germann
3541838a46
Included TargetPhraseCollectionCache.* in fakelib mmsapt.
2015-03-29 23:55:47 +01:00
Ulrich Germann
1525f1ea62
Cleanup.
2015-03-29 23:44:06 +01:00
Ulrich Germann
529a766da7
Initial check-in.
2015-03-29 23:43:50 +01:00
Jeroen Vermeulen
b124d99330
Use boost::filesystem for "rm -rf".
...
Replaces a system() call (which was a portability problem) and fixes,
en passant, a warning about its return value being ignored.
2015-03-29 18:33:58 +07:00
Jeroen Vermeulen
789a2e2bc3
Fix some compile warnings (gcc 4.9.2).
...
Mostly signed/unsigned comparisons and reordered member
initializations; also a few unused variables.
There are more, but if I chip away at them for a while, who knows, it
may catch on and warnings may eventually become socially stigmatizing.
:)
2015-03-29 18:10:51 +07:00
Ulrich Germann
1b23edf62f
Cache for the N most recently used TargetPhraseCollections. Refactored out of mmsapt.h.
2015-03-28 14:41:08 +00:00
Jeroen Vermeulen
a9c8f44896
Modernize "C" includes in moses.
...
This is one of those little chores in managing a long-lived C++
project: standard C headers like stdio.h and math.h now have their own
place in the C++ standard as resp. cstdio, cmath, and so on. In this
branch the #include names are updated for the moses/ subdirectory; more
branches to follow.
C++11 adds cstdint, but to support compilation with the previous
standard, that change is left for later.
2015-03-28 20:09:03 +07:00
Hieu Hoang
1064aaacbe
delete typedefs for UINT32 and UINT64. MSVC now has uint32_t and uint64_t /Ken
2015-03-25 00:55:39 +00:00
Ulrich Germann
8ca11d941d
1. Lifetime of tasks in ThreadPool is now managed via shared pointers.
2. Code cleanup in IOWrapper and a bit elsewhere.
2015-03-21 16:12:52 +00:00
Ulrich Germann
ee4e396a4d
Removed pointer to TranslationTask in InputTypes again. Not the right place to store this information.
2015-03-21 15:29:37 +00:00
Ulrich Germann
dcffbb5f4d
Made LRModel::ReorderingType an enumerated type.
2015-03-16 00:24:11 +00:00
Ulrich Germann
085c88cc7b
Eliminated sources of some compiler warnings (unused variables; signed/unsigned comparisons).
2015-03-15 22:45:01 +00:00
Ulrich Germann
ad805c133b
Instances of InputType (and derived classes) now know which TranslationTask (if any) created them.
...
This is a first step towards providing phrase tables etc. access to context information etc.
associated with specific translation tasks.
2015-03-15 20:38:31 +00:00
Ulrich Germann
2a66a55c85
Added document map (maps from sentences to document ids) to Bitext class.
...
Minor overhaul to the bias regime, which allows specifying the bias by document
name (as provided in the document map) rather than by sentence in the static
parallel corpus.
2015-03-15 13:32:09 +00:00
Ulrich Germann
51824355f9
Sampling now keeps track of counts for hierarchical lexicalized reordering.
2015-03-10 10:41:41 +00:00
Ulrich Germann
524376fad4
Code cleanup.
2015-03-09 00:34:47 +00:00
Hieu Hoang
32de075022
beautify
2015-02-19 12:27:23 +00:00
Ulrich Germann
ccf44f39fb
Code cleanup and reorganization. A few classes have been renamed to shorter names.
2015-02-15 01:45:22 +00:00
Hieu Hoang
755bd609f5
Using boost for prefix/suffix checks /Jeroen Vermeulen
2015-02-06 15:52:25 +00:00
Hieu Hoang
70e8eb54ce
Using boost for prefix/suffix checks /Jeroen Vermeulen
2015-02-05 16:23:47 +00:00
Marcin Junczys-Dowmunt
4140756fdf
Add missing check for empty range while flushing
2015-01-22 22:18:19 +01:00
Marcin Junczys-Dowmunt
7d9013a85b
Work-around for temporary translation option collection size during phrase table binarization
2015-01-19 23:15:08 +01:00
Marcin Junczys-Dowmunt
fbcf2dcb56
Fixed thread-safety
2015-01-19 21:56:04 +01:00
Marcin Junczys-Dowmunt
82c603213a
Thread-safety and constness
2015-01-18 23:58:28 +01:00
Marcin Junczys-Dowmunt
16ffc2c978
Added new VW feature and exception to Simple9
2015-01-18 23:26:32 +01:00
Hieu Hoang
6d61db28fa
use astyle 2.01. It's on Edinburgh server and doesn't screw up enum
2015-01-14 19:21:11 +00:00
Hieu Hoang
05ead45e71
beautify
2015-01-14 11:07:42 +00:00
Phil Williams
e5ebf30664
Fix a few warnings.
2015-01-13 21:13:55 +00:00
Hieu Hoang
be0ab92d16
delete oov pt
2015-01-09 22:32:08 +00:00
Hieu Hoang
e195bdf6d9
Merge branch 'master' of github.com:moses-smt/mosesdecoder
2015-01-08 02:37:01 +04:00
Rico Sennrich
7123d1cc80
eliminate spurious copy / memory leak
2015-01-07 18:42:20 +00:00
Hieu Hoang
ff7fbd55ee
add oovpt
2015-01-07 15:33:42 +04:00
Hieu Hoang
99b4b63c0c
change signature of GetChartRuleCollection()
2015-01-07 12:59:08 +04:00
Hieu Hoang
b9bef2fc44
add oovpt
2015-01-07 12:18:09 +04:00
Hieu Hoang
3b3f11365d
delete UserMessage. Too difficult to police
2015-01-07 10:01:10 +04:00
Hieu Hoang
1e0a2835bf
add oovpt
2015-01-04 19:10:48 +05:30
XapaJIaMnu
d0807c45f2
Fixed crash in probingPT when probability is precisely 0
2014-12-23 15:21:06 +00:00
Nicola Bertoldi
d0cddf0f2d
Merge branch 'master' of https://github.com/moses-smt/mosesdecoder
2014-12-16 17:35:47 +01:00
Nikolay Bogoychev
d0f4402e86
Fix incorrect hashing in ProbingPT
2014-12-16 11:15:12 +00:00
Nicola Bertoldi
4e77665d30
better handling of cache-based models with inconsistent parameters
2014-12-15 17:42:41 +01:00
Nicola Bertoldi
e4eb201c52
merged master into dynamic-models and solved conflicts
2014-12-13 12:52:47 +01:00
Nicola Bertoldi
cea2d9d8bb
beautify
2014-12-09 12:39:37 +01:00
Hieu Hoang
8c6310bf4c
Merge branch 'master' of github.com:moses-smt/mosesdecoder
2014-12-05 23:26:24 +00:00
Matthias Huck
bfeb7d641f
log output
2014-12-05 22:31:54 +00:00
Hieu Hoang
4b10c59bea
add OutputSearchGraphHypergraph() to API framework. Move m_source to BaseManager
2014-12-05 21:33:59 +00:00
Rico Sennrich
56921cae3b
small simplification of recursive CYK+
...
(following Chris Dyer's suggestion and Phil's refactoring in S2T decoder)
2014-12-01 11:05:17 +00:00
Ulrich Germann
7aa4d5d8d5
Merge branch 'master' of https://github.com/moses-smt/mosesdecoder
...
Conflicts:
moses-cmd/simulate-pe.cc
2014-11-20 17:55:51 +00:00
XapaJIaMnu
52c520c042
Resolve merge conflicts
2014-11-20 15:50:32 +00:00
Ulrich Germann
bda7ace530
Minor changes due to changes in the Moses API. Removed from list of standard programs to be compiled and installed. May need some work to get it working again.
2014-11-16 16:31:12 +00:00
XapaJIaMnu
4bea830188
doesn't work
2014-11-13 15:50:05 +00:00
Hieu Hoang
e1092c0dad
merge
2014-11-07 14:35:36 +00:00
Laura Kieras
ecae85e9a8
mm2dTable now opens its data file read-only, using mapped_file_source, so that we don't need write permissions on the file
2014-11-04 16:30:46 -05:00
Ulrich Germann
07202c544c
Added ptable-describe-features to list features used by PhraseDictionaryBitextSampling.
2014-10-25 12:06:38 -07:00
Ulrich Germann
44215b79c0
Added ptable-describe-features to list features used by PhraseDictionaryBitextSampling.
2014-10-25 12:06:24 -07:00
Ulrich Germann
53ef6c5c38
Added demo program for use of suffix arrays.
2014-10-23 11:11:28 -07:00
Barry Haddow
562cf7e007
Merge branch 'master' of github.com:moses-smt/mosesdecoder
2014-10-21 15:11:22 +01:00
Hieu Hoang
cce818015d
Merge ../mosesdecoder into merge-cmd
2014-10-10 15:50:12 +01:00
Phil Williams
05ecc914c2
Fix a few more compiler warnings (from Clang mostly).
2014-10-10 15:47:53 +01:00
Phil Williams
ee57e59f2b
Fix a few compiler warnings (from Clang mostly).
2014-10-10 14:22:53 +01:00
Hieu Hoang
1743f7eeb2
Merge ../mosesdecoder into merge-cmd
2014-10-08 17:55:07 +01:00
Ulrich Germann
576931b088
Mmsapt now adds word alignment info to target phrases.
2014-10-07 18:08:31 +01:00
Hieu Hoang
33ed15ef19
move misc common functions into moses/
2014-09-30 14:22:38 +01:00
Barry Haddow
091948bff0
Improved debug
2014-09-18 17:03:19 +01:00
Ulrich Germann
1d834e2b48
Fixed bug with respect to adding check option to Mmsapt::Load().
2014-09-10 18:51:20 +02:00
Ulrich Germann
a58c7ceb18
Fixed issues with ambiguity in typedef of uint64_t (conflict between boost typedef and stdint typedef).
2014-09-10 12:07:57 +02:00
Ulrich Germann
31578d4915
Finished code for bias loading from Mmsapt config file.
2014-09-09 18:07:26 +01:00
Ulrich Germann
cda94c7d85
Fix in biased sampling. Started code on loading and using bias in Mmsapt.
2014-09-09 17:45:48 +01:00
Ulrich Germann
f86fa65a6f
Added utility count-ptable-features to count features in Mmsapt given a moses.ini config line.
2014-09-08 16:56:45 +01:00
Ulrich Germann
db6e5de641
Added initial code for utility to count features of PhraseDictionaryBitextSampling.
2014-09-08 11:03:05 +01:00
Ulrich Germann
5571ec91c6
Code cleanup.
2014-09-08 09:26:09 +01:00
Ulrich Germann
a86d49fc88
Added bias to bitext sampling.
2014-09-08 09:26:08 +01:00
Ulrich Germann
cef6460981
Initial check-in.
2014-09-08 09:26:08 +01:00
Ulrich Germann
a87a9ff207
Moved class PhrasePair back to ug_bitext.
...
Moved function expand() from mmsapt.cc to ug_bitext.h.
Added new lookup function to class Bitext.
Bug fixes related to inverse lookup in class Bitext.
2014-09-08 09:26:08 +01:00
Ulrich Germann
b588df77f0
Bug fix related to threading.
2014-09-08 09:26:08 +01:00
Ulrich Germann
2405293aaa
Fiddling around with the code. Not for production.
2014-09-08 09:26:08 +01:00
Ulrich Germann
90c91ae9bb
Added fakelib stringdist.
2014-09-08 09:26:08 +01:00
Ulrich Germann
9af3a61678
Added try-align2.
2014-09-08 09:26:08 +01:00
Ulrich Germann
a028fec7af
Work in progress.
2014-09-08 09:26:08 +01:00
Michael Denkowski
3304030a4e
Merge branch 'master' of https://github.com/moses-smt/mosesdecoder
2014-09-04 11:19:32 -04:00
Michael Denkowski
6c33bc99dc
Option to add TM-specific word and phrase counts
2014-09-04 11:17:42 -04:00
Michael Denkowski
756bcf0f15
Option to add TM-specific word and phrase counts
2014-09-04 01:49:26 -04:00
Rico Sennrich
2a46e8ccea
parse chart compression for faster CYK+ parsing with syntax systems.
2014-09-01 18:16:22 +01:00
Michael Denkowski
1c45d780d4
all-restrict mode for MultiModel (restrict to phrases in first model)
2014-08-26 13:43:23 -04:00
Hieu Hoang
97e5a30d3a
compiles with clang on osx
2014-08-25 18:07:42 +01:00
Michael Denkowski
da0ed4df81
tunable=false option for mmsapt
2014-08-18 19:22:50 -04:00
Michael Denkowski
93e99be108
Mode to pass through "all" scores in MultiModel
2014-08-18 17:57:05 -04:00
Nicola Bertoldi
77e9e91b08
minor fixes
2014-08-18 19:13:51 +02:00
Hieu Hoang
00a338d576
clang only function
2014-08-14 16:44:20 +01:00
Hieu Hoang
303387f9ac
compiles with clang on osx
2014-08-14 16:17:21 +01:00
Hieu Hoang
fcbd64b3ac
eclipse
2014-08-14 14:04:25 +01:00
Hieu Hoang
2bbaf69409
Merge branch 'master' into bo-safe
2014-08-13 18:52:14 +01:00
Hieu Hoang
94c44c03d5
merge
2014-08-13 18:03:05 +01:00
Hieu Hoang
18c1c4a132
method rename
2014-08-08 18:11:30 +01:00
Hieu Hoang
efa5befb16
method rename
2014-08-08 15:59:34 +01:00
Ulrich Germann
95b04d2558
Merge branch 'master' of https://github.com/moses-smt/mosesdecoder
2014-08-05 21:28:06 +01:00
Ulrich Germann
5480499309
Fixed (?) problem with multiple identical extractable target phrases per source phrase occurrence.
2014-08-05 21:26:29 +01:00
Michael Denkowski
13942b77ab
Add alias PhraseDictionaryBitextSampling
2014-08-05 14:47:07 -04:00
Ulrich Germann
f32a313a05
Mmsapt now uses timespec on Linux, timeval on MacOS for time stamps.
2014-08-05 02:22:20 +01:00
Hieu Hoang
11471de9b8
mac osx
2014-08-04 18:50:10 +01:00
Ulrich Germann
c269abb083
Added num_read_write.cc to fakelib mm.
2014-08-04 17:52:08 +01:00
Ulrich Germann
9fad5d3eb0
Eliminated dependence on endian.h and related byte swapping on big-endian machines.
2014-08-04 17:52:08 +01:00
Hieu Hoang
3f29ed10f1
Merge branch 'master' of github.com:moses-smt/mosesdecoder
2014-08-05 11:00:01 +01:00
Hieu Hoang
84d6b25802
TargetPhrase to have pointer to the phrase table that creates it
2014-08-05 10:59:48 +01:00
Hieu Hoang
f447a23067
TargetPhrase to have pointer to the phrase table that creates it
2014-08-05 10:26:42 +01:00
Hieu Hoang
e863592f40
TargetPhrase to have pointer to the phrase table that creates it
2014-08-04 19:28:04 +01:00
Hieu Hoang
abe68be588
initialise m_container
2014-08-04 15:59:32 +01:00
Hieu Hoang
3f3912772d
initialise m_container
2014-08-04 15:46:40 +01:00
Hieu Hoang
5f90ccdb13
initialise m_container
2014-08-04 15:20:22 +01:00
Marcin Junczys-Dowmunt
5c9017c632
Forgot to add SetFeaturesToApply
2014-08-03 19:44:43 +02:00
Marcin Junczys-Dowmunt
ff6ed8cd21
Fixed segfault for features depending on factors not in phrase table (i.e. added by generation models)
2014-08-03 18:03:42 +02:00
Hieu Hoang
688bf4c061
each target phrase knows what decode graph created it
2014-08-02 17:15:01 +01:00
hieu
5741ef2635
compile error in gcc 4.4
2014-07-30 18:01:51 +01:00
Ulrich Germann
f9d167345a
Changed feature and parameter names for Mmsapt / PhraseDictionaryBitextSampling as requested by PK.
2014-07-29 13:57:00 +01:00
Ulrich Germann
6a1beb770d
Cleanup work to get rid of compiler warnings.
2014-07-29 13:51:44 +01:00
Nicola Bertoldi
1063012892
added a flag to disable the decaying in the cache
2014-07-22 11:25:03 +02:00
Nicola Bertoldi
02bf6d5d5e
fixes for file loading and precomputation of ascores
2014-07-22 09:45:41 +02:00
Hieu Hoang
b10760f428
delete PhraseTableImplementation. Old enum
2014-07-18 20:36:53 +01:00
Hieu Hoang
1347b153ee
compiles with c++11. Used by oxlm
2014-07-17 23:13:06 +01:00
Ulrich Germann
f06b145735
Merge branch 'master' of https://github.com/moses-smt/mosesdecoder
2014-07-10 17:24:42 +01:00