Rico Sennrich
1f435340f0
faster pruning in chart decoding
2014-03-26 11:23:59 +00:00
Rico Sennrich
fb16df8c00
typo in last commit
2014-03-21 11:59:41 +00:00
Rico Sennrich
45630a5851
various optimizations to make the CYK+ parser several times faster and use less memory.
...
speed-up of decoding depends on how much time is spent in parser:
10-50% speed-up for string-to-tree systems observed (more on long sentences and with high max-chart-span).
if you only use hiero or string-to-tree models (but none with source syntax), use compile-option --unlabelled-source for (small) efficiency gains.
2014-03-21 11:12:24 +00:00
Phil Williams
04dbd3c7aa
moses_chart: more efficient scope-3 parsing if sentence length < max-chart-span
2014-03-14 08:49:09 +00:00
Ulrich Germann
c02fbf7664
Completely rewritten. Now multi-threaded.
2014-03-11 13:57:42 +00:00
Ulrich Germann
fdc504d47a
Changes on main branch files while I was working on dynamic phrase tables.
2014-03-10 14:08:00 +00:00
Ulrich Germann
aa8ba7d9a7
Put alignment functionality into a separate class. Not working yet --- work in progress!
2014-03-10 12:03:27 +00:00
Ulrich Germann
ff4ce426e7
Made scorer in PScoreLex public for development purposes. Reset default number of workers to 20.
2014-03-10 12:02:05 +00:00
Ulrich Germann
f7ee316e12
Added initialization of wlex21 and COOCraw during loading.
2014-03-10 11:59:58 +00:00
Ulrich Germann
aad5d67947
Added option to also count raw cooccurrences.
2014-03-10 11:58:46 +00:00
Ulrich Germann
9cf86f6191
Added class Alignment as a friend and wlex21 and COOCraw for development purposes while working on word alignment issues.
2014-03-10 11:57:40 +00:00
Ulrich Germann
9159729ad0
Made internal table COOC public for development purposes.
2014-03-10 11:56:22 +00:00
Ulrich Germann
81ed9937e1
Routine check-in.
2014-03-05 11:53:05 +00:00
Ulrich Germann
2b19b71095
Routine check-in.
2014-03-04 15:51:59 +00:00
Ulrich Germann
6c37b8d252
Routine check-in.
2014-03-03 12:13:41 +00:00
Ulrich Germann
2b181ee691
Fixed Mmsapt constructor.
2014-02-25 03:10:16 +00:00
Ulrich Germann
4c003edb0d
Fixed #include-s.
2014-02-25 03:09:19 +00:00
Ulrich Germann
a8d66cd68d
Removed Mmsapt constructor with both descriptor and config line.
2014-02-22 00:27:07 +00:00
Ulrich Germann
817e3695e0
Fixed some include paths.
2014-02-22 00:25:58 +00:00
Ulrich Germann
1252700c44
Removed constructor with both description and config line.
2014-02-22 00:25:02 +00:00
Ulrich Germann
4b95c3a906
Merge branch 'dynamic-phrase-tables' of ssh://thor//home/germann/git/mosesdecoder into dynamic-phrase-tables
...
due to resetting the location of the remote repository.
2014-02-21 01:09:38 +00:00
Ulrich Germann
ac238ef2d7
Changed construction from a given token sequence to allow partial matches.
2014-02-20 23:56:11 +00:00
Ulrich Germann
8afe62145b
Minor fix to make the compiler stop complaining about an unused typedef.
2014-02-20 23:54:15 +00:00
Ulrich Germann
e1d07e7475
Added pid2str conversion method to convert phrase ids to their strings.
2014-02-20 23:53:15 +00:00
Ulrich Germann
9536cf49e9
Phrase look-up now also gathers phrase orientation info (work in progress).
2014-02-20 23:51:17 +00:00
Ulrich Germann
6c66b9c631
Added Jamfile to produce try-align
2014-02-20 23:50:07 +00:00
Ulrich Germann
683635ce25
Minor fix to make the compiler stop complaining about unused variables.
2014-02-20 23:48:56 +00:00
Ulrich Germann
061b861639
Small test program for phrase-based alignment via mmsapt.
2014-02-20 23:29:37 +00:00
Ulrich Germann
c259e10b23
Various changes.
2014-02-20 23:28:01 +00:00
Ulrich Germann
9bcc315644
Added phrase-based word alignment to mmsapt (work in progress!).
2014-02-20 23:25:36 +00:00
Hieu Hoang
50cadc754f
use boost::unordered_map for CacheColl. Marginally faster
2014-02-11 03:43:58 +00:00
Ulrich Germann
a6ce081e15
Minor changes.
2014-02-08 18:25:46 +00:00
Ulrich Germann
594272ce05
Changed function count_tokens so that it can be run without passing a filter explicitly.
2014-02-08 18:06:11 +00:00
Ulrich Germann
9899364c46
Added implicit add-1 smoothing.
2014-02-08 18:03:18 +00:00
Ulrich Germann
40fbe226e4
Added private members numSent and numWords.
2014-02-08 18:02:03 +00:00
Ulrich Germann
66822b279b
Added append function to grow imTtracks dynamically in a thread-safe fashion.
2014-02-08 18:00:27 +00:00
Ulrich Germann
9f317f4849
Minor fix.
2014-02-08 17:58:05 +00:00
Ulrich Germann
5f8ae20d01
Added dynamically updatable corpus; updated or added query functions.
2014-02-08 17:56:48 +00:00
Ulrich Germann
784654c831
Initial check-in.
2014-02-08 17:50:26 +00:00
Ulrich Germann
584626a767
Added a few programs.
2014-02-08 17:49:28 +00:00
Ulrich Germann
5c131f196c
Minor changes.
2014-02-08 17:22:57 +00:00
Ulrich Germann
4fb00ea6fd
Minor fixes.
2014-02-08 16:55:05 +00:00
Ulrich Germann
0702926dff
Added special copy constructor that adds new sentences to the new imTSA.
2014-02-08 16:53:15 +00:00
Ulrich Germann
e81e1772f8
Added capability to add sentence pairs to imBitext. Various minor fixes.
2014-02-08 16:48:39 +00:00
Hieu Hoang
39858ce1ff
leak
2014-01-21 18:07:12 +00:00
Hieu Hoang
fcadf4511a
leak
2014-01-21 17:11:16 +00:00
Rico Sennrich
742e59f1e0
minor optimization (minimize performance impact of SoftMatchingFeature code if disabled)
2014-01-17 11:57:52 +00:00
Rico Sennrich
ed25bb2b99
soft matching of target-side nonterminals
2014-01-16 18:34:33 +00:00
jiejiang
5f1217d793
merged upstream with origin for mingw
2014-01-15 18:16:56 +00:00
Rico Sennrich
df30085bbe
pass regtest with C++11 and gcc 4.7
2014-01-15 09:27:20 +00:00