Philipp Koehn
f69c1dab02
more efficient default recaser training
2015-02-04 09:18:09 +00:00
Phil Williams
6b9da6c585
filter-rule-table: merge changes from t2s branch (still WIP)
2015-02-03 11:33:10 +00:00
Matthias Huck
9987beb453
SoftSourceSyntacticConstraintsFeature: Now for both non-terminals (as before) _and_ terminals.
Also added score components based on relative frequency.
(TODO: logprobs right now; are plain probabilities better?)
2015-01-23 18:41:18 +00:00
Matthias Huck
b50c197313
forgot to check this in some time ago
2015-01-20 21:41:41 +00:00
Matthias Huck
a6c09e57d0
domain features in GHKM extraction
2015-01-20 21:36:55 +00:00
Hieu Hoang
b50b3164fa
beautify
2015-01-15 11:18:39 +00:00
Hieu Hoang
6289b39fd8
update extract-mixed-syntax
2015-01-15 09:53:57 +00:00
Hieu Hoang
6d61db28fa
use astyle 2.01. It's on Edinburgh server and doesn't screw up enum
2015-01-14 19:21:11 +00:00
Hieu Hoang
05ead45e71
beautify
2015-01-14 11:07:42 +00:00
Matthias Huck
168118d252
PhraseOrientationFeature efficiency improvement
2015-01-09 14:03:18 +00:00
Phil Williams
7cc75a0fa1
score-stsg: add --TreeScore option
2014-12-30 18:57:23 +00:00
Philipp Koehn
831f947874
long overdue feature: do not produce very low scoring translation table entries that are never used and just gum up the works
2014-12-21 01:14:42 +00:00
Nicola Bertoldi
e4eb201c52
merged master into dynamic-models and solved conflicts
2014-12-13 12:52:47 +01:00
Phil Williams
b9a382aa78
Add filter-rule-table
This will eventually replace filter-rule-table.py. At the moment
it can only filter rule tables where the source-side is a STSG
fragment and when the test sentences have parse trees.
2014-12-07 14:56:48 +00:00
Phil Williams
60e56efc6b
phrase-extract: add syntax-common sub-library
And remove some (near-)duplicate code from pcfg-common and score-stsg.
2014-12-07 14:27:51 +00:00
Phil Williams
a2708b8431
relax-parse: fix hang
SyntaxTree::Parse() would enter a *very* long loop due to an uninitialized
member variable.
2014-12-07 12:56:41 +00:00
Hieu Hoang
4b10c59bea
add OutputSearchGraphHypergraph() to API framework. Move m_source to BaseManager
2014-12-05 21:33:59 +00:00
Matthias Huck
7a299de66b
avoid necessity of masking "{{" in the data
2014-12-04 15:54:05 +00:00
Matthias Huck
24a8a6a511
PhraseOrientationFeature
2014-12-03 20:04:26 +00:00
Hieu Hoang
49a2ff1faa
Merge branch 'merge-cmd'
2014-12-02 19:09:34 +00:00
Hieu Hoang
ba7afba9f6
move n-best code for phrase-based from IOWrapper to ChartManager
2014-12-02 17:40:53 +00:00
Phil Williams
f84f159247
Add score-stsg, a program for scoring STSG extract files
2014-12-02 17:10:20 +00:00
Phil Williams
ef1262a17f
extract-ghkm: change STSG output format.
2014-11-21 15:46:12 +00:00
Phil Williams
c46fb10ec7
extract-ghkm: add --STSG option
2014-11-21 11:30:29 +00:00
Matthias Huck
0fd987a8c6
avoid necessity of masking "{{" in the data
2014-11-12 18:28:59 +00:00
Phil Williams
a5d803ee14
extract-ghkm: add -T2S option
2014-11-12 14:03:24 +00:00
Rico Sennrich
ae8b9cbfef
glue grammar: alignment for <s> and </s>
2014-11-04 14:05:13 +00:00
Phil Williams
05ecc914c2
Fix a few more compiler warnings (from Clang mostly).
2014-10-10 15:47:53 +01:00
Paul Baltescu
8f74ecd8f3
Fix OxLM.
2014-10-08 22:08:42 +01:00
Matthias Huck
5ac6c42508
PhraseOrientationFeature: bugfixes
2014-09-13 00:20:17 +01:00
Matthias Huck
0cf0d595d3
GHKM glue grammar: Orientation phrase property
2014-09-12 17:30:03 +01:00
Matthias Huck
63316960a1
GHKM glue grammar: print word alignment links for <s> and </s>, SSTART and SEND in internal tree structure
2014-09-12 17:18:31 +01:00
Matthias Huck
1523f3315d
PhraseOrientationFeature for chart-based decoding: a first simple version, with lots of log output
2014-09-12 13:51:04 +01:00
Hieu Hoang
0b41879a3a
argument --NonTermConsecSourceMixedSyntax
2014-09-03 02:33:50 +01:00
Hieu Hoang
b0ee7f68e2
argument --NonTermConsecSourceMixedSyntax
2014-09-03 02:19:49 +01:00
Hieu Hoang
2d73f6f803
jamfile error
2014-09-01 18:02:42 +01:00
Hieu Hoang
1aa5c4fa35
change size of syntactic non-term
2014-08-30 07:44:53 +01:00
Hieu Hoang
f19781e05b
add Jamfile
2014-08-29 16:26:28 +01:00
Hieu Hoang
e6438e378f
Add option to sort chart translation option after EvaluateWithSourceContext
2014-08-29 16:24:49 +01:00
Hieu Hoang
379da960d1
eclipse
2014-08-29 16:21:27 +01:00
Hieu Hoang
26741c6bca
Roll out mixed syntax
2014-08-29 15:56:01 +01:00
Hieu Hoang
4b8d29d18d
Roll out mixed syntax
2014-08-29 15:33:35 +01:00
Hieu Hoang
73f1d259a1
Roll out mixed syntax
2014-08-29 15:31:47 +01:00
Hieu Hoang
049a9a9ea7
space between ||| and {{
2014-08-28 13:22:48 +01:00
Hieu Hoang
794b946783
space between ||| and {{
2014-08-28 13:20:05 +01:00
Matthias Huck
33992f9af5
uninitialized variables and double include
2014-08-08 16:27:17 +01:00
Paul Baltescu
d75c4e1ae5
OxLM integration.
2014-08-08 01:18:05 +01:00
Matthias Huck
c27cbf55ea
source labels: integration into EMS
2014-08-07 21:02:51 +01:00
Ulrich Germann
df3fb4ac5c
Merge branch 'master' of https://github.com/moses-smt/mosesdecoder
Conflicts:
doc/Mmsapt.howto
2014-08-04 17:26:15 +01:00
Ulrich Germann
2711360ce7
Added missing #include.
2014-08-04 17:19:58 +01:00
Barry Haddow
65b3e0b96e
Missing include
2014-08-01 11:13:34 +01:00
Hieu Hoang
8d7871125f
delete extract-ordering. Not part of the core functionality
2014-07-30 13:16:40 +01:00
Matthias Huck
7b02017da1
use std::numeric_limits
2014-07-28 19:49:43 +01:00
Matthias Huck
3a5dee12e8
implementation of phrase orientation in GHKM extraction
(...but a corresponding feature function for the chart-based decoder has not been written yet)
2014-07-28 18:27:12 +01:00
Matthias Huck
19a5ef4a1a
relax-parse: use cin.peek()
Hope this eliminates some weird behavior
2014-07-17 20:19:28 +01:00
Hieu Hoang
d7cbef5cbe
minor format change in consolidate
2014-06-25 07:04:11 -04:00
Hieu Hoang
ab3ed27f20
Merge ../mosesdecoder into hieu
2014-06-13 17:04:52 +01:00
Hieu Hoang
32b4eb5168
add NonTermContext property
2014-06-13 17:04:41 +01:00
Hieu Hoang
dddadc4c81
Merge branch 'hieu' of github.com:hieuhoang/mosesdecoder into hieu
2014-06-13 10:35:43 +01:00
Hieu Hoang
768ac1c6a8
add SpanLength to score
2014-06-12 13:26:01 +01:00
Matthias Huck
d0e92da734
GHKM extraction can add a source labels phrase property
2014-06-11 19:27:18 +01:00
Hieu Hoang
af659446bd
Merge branch 'hieu' of github.com:hieuhoang/mosesdecoder into hieu
2014-06-11 04:35:33 +01:00
Hieu Hoang
92089f9726
ignore 0 span. Don't bomb out
2014-06-11 04:35:09 +01:00
Hieu Hoang
0178e5237e
don't output SpanLength property if empty
2014-06-09 09:05:31 +01:00
Hieu Hoang
7c5208e6d6
Merge ../mosesdecoder into hieu
2014-06-08 17:18:16 +01:00
Hieu Hoang
29d83d94b1
delete any mention of SAFE_GETLINE so it doesn't reappear
2014-06-08 17:18:07 +01:00
Hieu Hoang
3c6a31128d
Merge ../mosesdecoder into hieu
2014-06-08 17:07:41 +01:00
Hieu Hoang
1b667e3e24
delete any mention of SAFE_GETLINE so it doesn't reappear
2014-06-08 17:07:12 +01:00
Hieu Hoang
cb94a3181b
use standard c++ getline instead of old Moses SAFE_GETLINE
2014-06-08 16:23:14 +01:00
Hieu Hoang
23ba0de224
use standard c++ getline instead of old Moses SAFE_GETLINE
2014-06-08 15:41:27 +01:00
Hieu Hoang
d979b24314
use standard c++ getline instead of old Moses SAFE_GETLINE
2014-06-08 14:06:33 +01:00
Hieu Hoang
45ed0a5b1f
Merge ../mosesdecoder into hieu
2014-06-08 13:22:34 +01:00
Hieu Hoang
f58c7fc831
use standard c++ getline instead of old Moses SAFE_GETLINE
2014-06-08 13:17:23 +01:00
Nicola Bertoldi
4d75c889f1
merged master into dynamic-models
2014-06-08 09:39:37 +02:00
Hieu Hoang
9f2e3a4194
add SpanLength property
2014-06-03 17:26:21 +01:00
Hieu Hoang
fcf9e4b51c
Merge ../mosesdecoder into hieu
2014-06-03 17:11:16 +01:00
Hieu Hoang
3ae671fc7c
||| separator after counts
2014-06-03 17:10:09 +01:00
Hieu Hoang
fe1fdb7980
Merge ../mosesdecoder into hieu
2014-06-03 14:40:19 +01:00
Hieu Hoang
23e9083514
erroneous assert
2014-06-03 14:40:00 +01:00
Hieu Hoang
280f02cd1a
Merge ../mosesdecoder into hieu
2014-06-03 14:15:54 +01:00
Hieu Hoang
8ba078c1eb
erroneous assert
2014-06-03 14:06:40 +01:00
Hieu Hoang
ea1fb296fe
add span length to training
2014-05-31 21:39:47 +01:00
Nicola Bertoldi
0ca98837db
beautify
2014-05-19 15:35:33 +02:00
Nicola Bertoldi
20b3e8929e
beautify
2014-05-19 15:35:08 +02:00
Nicola Bertoldi
20381cbf89
merged master into dynamic-models and solved conflicts
2014-04-28 19:18:38 +02:00
Rico Sennrich
c8682e9420
target-syntax: use SoftMatchingFeature to assign non-terminal to unknown words
2014-03-24 14:57:24 +00:00
Rico Sennrich
ba52fa163b
use &#124; as default escape sequence for "|" (for consistency with tokenizer.perl)
2014-03-21 19:19:03 +00:00
Hieu Hoang
0e308e41ca
recommit Rico's change to score format
2014-03-13 18:30:24 +00:00
Ulrich Germann
a7c85780ee
Merge branch 'master' into dynamic-phrase-tables
Conflicts:
phrase-extract/score-main.cpp
2014-03-10 14:25:45 +00:00
Ulrich Germann
fdc504d47a
Changes on main branch files while I was working on dynamic phrase tables.
2014-03-10 14:08:00 +00:00
Rico Sennrich
01bc3c111e
swap position of alignment and scores in phrase table halves (before consolidate step).
ensures that multiple hierarchical rules with same source/target phrase, but different alignment, are sorted correctly
2014-03-02 16:55:42 +00:00
Ulrich Germann
ef2ef881a4
Merge branch 'dynamic-phrase-tables' of file:///home/germann/git/mosesdecoder into dynamic-phrase-tables
2014-02-21 01:04:02 +00:00
Ulrich Germann
e089c7463d
Fixed code formatting.
2014-02-08 16:03:50 +00:00
Matthias Huck
e40fabfad5
fixed compile errors in debug mode
2014-02-06 19:46:32 +00:00
Matthias Huck
65811a0325
tree fragments: tiny issues with the extraction pipeline
2014-02-03 18:13:10 +00:00
Matthias Huck
86ee3e15a4
new version of the score tool which is now capable of dealing with additional properties in an appropriate manner
2014-01-29 18:37:42 +00:00
Hieu Hoang
4c009e31e8
Merge branch 'master' of https://github.com/moses-smt/mosesdecoder into hieu
2014-01-20 17:08:02 +00:00
Rico Sennrich
bc0cac59be
unescape "|" (for compatibility with escape-special-chars scripts)
2014-01-18 12:23:21 +00:00
Nicola Bertoldi
4b072f2097
merge master into this branch
2014-01-17 14:04:15 +01:00
Rico Sennrich
c1d8f6e267
Revert "testing the waters for C++11 adoption"
This reverts commit d2d508184e.
There are problems with gcc 4.5, and apparently different problems with new boost versions; sticking with C++03 for the time being.
2014-01-15 16:16:11 +00:00
Nicola Bertoldi
572728074d
removed useless files
2014-01-15 16:52:25 +01:00
Nicola Bertoldi
e452a13062
beautify
2014-01-15 16:49:57 +01:00
Nicola Bertoldi
47bece6eac
code cleanup; fixings to others' code/test
2014-01-15 16:16:37 +01:00
Rico Sennrich
d2d508184e
testing the waters for C++11 adoption
2014-01-14 17:01:46 +00:00
Nicola Bertoldi
50970b2b59
merge master into this branch
2014-01-14 08:50:18 +01:00
Hieu Hoang
584af0d015
add support for --MinPhraseLength
2014-01-06 18:03:38 +00:00
Hieu Hoang
35faa887e8
add support for --MinPhraseLength
2014-01-06 17:34:04 +00:00
Hieu Hoang
abe0155f81
ordering extract in same format as my own
2014-01-06 17:21:39 +00:00
Hieu Hoang
ac5d6676f2
ordering extract in same format as my own
2014-01-06 17:04:10 +00:00
Hieu Hoang
d4d4e27511
only output ordering extract
2014-01-06 16:31:21 +00:00
Hieu Hoang
2fb99f07bb
only output ordering extract
2014-01-06 13:31:47 +00:00
Hieu Hoang
63f6ea8fa7
eclipse
2014-01-06 11:55:22 +00:00
Hieu Hoang
b3a712baa0
output reordering only
2013-12-18 18:40:23 +00:00
Hieu Hoang
7d497abf41
minor verbose in consolidate-main.cpp
2013-12-06 11:46:19 +00:00
Hieu Hoang
4f6f127486
Merge pull request #53 from pengli09/master
Fix the bug in phrase-extract/extract-main.cpp: the authors forgot to change three variable names
2013-11-20 03:04:41 -08:00
Peng Li
f53825c71e
Fix the bug in phrase-extract/extract-main.cpp: the authors forgot to change inBottomRight/outBottomRight to inBottomLeft/outBottomLeft in the second loops in getOrientPhraseModel() and getOrientHierModel()
2013-11-20 16:22:15 +08:00
Hieu Hoang
ccf9662748
Merge branch 'master' of ../mosesdecoder
2013-11-15 14:03:05 +00:00
Phil Williams
6bee77e207
extract-ghkm: use square brackets for glue rule internal tree structure
2013-11-12 15:49:49 +00:00
Hieu Hoang
477314cda4
Merge branch 'master' of github.com:hieuhoang/mosesdecoder
2013-11-12 12:26:35 +00:00
Hieu Hoang
24f95297fc
compiles with clang
2013-10-31 12:46:41 +00:00
Hieu Hoang
125e9a8569
add debug argument
2013-10-05 10:48:01 +01:00
Hieu Hoang
902741681a
reverse 7d3de78500
2013-10-04 21:27:53 +01:00
Hieu Hoang
7d3de78500
minor error with placeholder
2013-10-04 19:29:16 +01:00
Phil Williams
d6aa123d03
score: write sparse features to third field.
2013-09-29 18:58:20 +01:00
Phil Williams
2a28d1a73e
Merge branch 'master' into GHKMStruct
Conflicts:
moses-chart-cmd/IOWrapper.cpp
moses-chart-cmd/IOWrapper.h
moses/FF/Factory.cpp
moses/Parameter.cpp
moses/StaticData.h
phrase-extract/extract-ghkm/ScfgRuleWriter.cpp
phrase-extract/score-main.cpp
2013-09-29 15:27:09 +01:00
Phil Williams
20b96fd0a7
Oops, fix e497dc485...
2013-09-29 15:23:37 +01:00
Phil Williams
e497dc4857
Remove NT length code missed in commit cdd9df19...
2013-09-29 15:09:14 +01:00
Hieu Hoang
31ce9b510e
beautify
2013-09-27 09:35:24 +01:00
Phil Williams
940591a1a3
extract-ghkm: allow trailing whitespace in alignment file
Thanks to Matt Post for reporting the problem.
2013-09-26 15:49:08 +01:00
Phil Williams
29c1089283
consolidate: don't assume input contains key-value field
2013-09-24 09:45:49 +01:00
Phil Williams
74ed066569
consolidate: expect key-value pairs in 7th field, not 6th
2013-09-20 15:50:03 +01:00
Phil Williams
23488e1adb
extract-ghkm: use square brackets for --TreeFragments
Use square brackets instead of round brackets for internal tree
structure. This avoids the need for additional escaping since
square brackets are already escaped in Moses.
Also: tweak code style to match the rest of the source file, and
output less whitespace to make the extract files (marginally)
smaller.
2013-09-20 14:57:40 +01:00
Phil Williams
ab863d1f16
consolidate: write key-value field to rule table
2013-09-20 09:42:13 +01:00
Hieu Hoang
98bb4fa1c7
placeholders work in extract
2013-09-19 12:24:57 +02:00
Hieu Hoang
a40d9082cd
more placeholder code and 'NO BEST TRANSLATION' to stderr for pb
2013-09-18 23:47:50 +02:00
Matthias Huck
a6d172e0f1
command line option for extract-ghkm: --TreeFragments
2013-09-16 20:06:02 +01:00
maria nadejde
7cc284a743
comment
2013-09-14 10:50:33 +02:00
maria nadejde
df86f0e78b
Merge branch 'GHKMStruct' of github.com:moses-smt/mosesdecoder into GHKMStruct
2013-09-14 10:46:17 +02:00
maria nadejde
5f37a545b1
fixed sparse feature output
2013-09-14 10:44:35 +02:00
Phil Williams
296eb6804a
Merge master
2013-09-13 22:32:45 +01:00
Phil Williams
cdd9df19d2
Remove --OutputNTLengths from extract-rules, etc.
The option isn't used in master and the output is compatible with the
current rule table format. If anyone wants this in master it should
probably be fixed in the span-length branch then merged.
2013-09-13 22:16:42 +01:00
maria nadejde
bf5c32df6c
stuff that probably doesn't work
2013-09-13 19:43:04 +02:00
Matthias Huck
643fa18805
Merge branch 'GHKMStruct' of github.com:moses-smt/mosesdecoder into GHKMStruct
2013-09-13 17:13:20 +02:00
Matthias Huck
c39bed60c0
Tree fragments in GHKM glue rules;
output of LHS tag in tree fragments for UNKs;
GHKMParse info is now denoted as Tree info
2013-09-13 17:10:21 +02:00
maria nadejde
fad57a60a7
comment for Equal implementation
2013-09-13 16:13:36 +02:00
maria nadejde
5615a11766
sparse feature weight file
2013-09-13 16:06:48 +02:00
maria nadejde
bff123635e
added Dense and Sparse feature to scorer
2013-09-13 12:45:46 +02:00
maria nadejde
43a9323d0f
add feature files
2013-09-12 18:46:40 +02:00
maria nadejde
67b873b67d
mock feature
2013-09-12 18:40:08 +02:00
Matthias Huck
96d14555fc
GHKM tree output during extraction: modified extract-ghkm and score tools
2013-09-11 16:46:37 +02:00