Phil Williams
ab863d1f16
consolidate: write key-value field to rule table
2013-09-20 09:42:13 +01:00
Hieu Hoang
98bb4fa1c7
placeholders work in extract
2013-09-19 12:24:57 +02:00
Hieu Hoang
a40d9082cd
more placeholder code and 'NO BEST TRANSLATION' to stderr for pb
2013-09-18 23:47:50 +02:00
Matthias Huck
a6d172e0f1
command line option for extract-ghkm: --TreeFragments
2013-09-16 20:06:02 +01:00
maria nadejde
7cc284a743
comment
2013-09-14 10:50:33 +02:00
maria nadejde
df86f0e78b
Merge branch 'GHKMStruct' of github.com:moses-smt/mosesdecoder into GHKMStruct
2013-09-14 10:46:17 +02:00
maria nadejde
5f37a545b1
fixed sparse feature output
2013-09-14 10:44:35 +02:00
Phil Williams
296eb6804a
Merge master
2013-09-13 22:32:45 +01:00
Phil Williams
cdd9df19d2
Remove --OutputNTLengths from extract-rules, etc.
...
The option isn't used in master and the output is compatible with the
current rule table format. If anyone wants this in master it should
probably be fixed in the span-length branch then merged.
2013-09-13 22:16:42 +01:00
maria nadejde
bf5c32df6c
stuff that probably doesn't work
2013-09-13 19:43:04 +02:00
Matthias Huck
643fa18805
Merge branch 'GHKMStruct' of github.com:moses-smt/mosesdecoder into GHKMStruct
2013-09-13 17:13:20 +02:00
Matthias Huck
c39bed60c0
Tree fragments in GHKM glue rules;
...
output of LHS tag in tree fragments for UNKs;
GHKMParse info is now denoted as Tree info
2013-09-13 17:10:21 +02:00
maria nadejde
fad57a60a7
comment for Equal implementation
2013-09-13 16:13:36 +02:00
maria nadejde
5615a11766
sparse feature weight file
2013-09-13 16:06:48 +02:00
maria nadejde
bff123635e
added Dense and Sparse feature to scorer
2013-09-13 12:45:46 +02:00
maria nadejde
43a9323d0f
add feature files
2013-09-12 18:46:40 +02:00
maria nadejde
67b873b67d
mock feature
2013-09-12 18:40:08 +02:00
Matthias Huck
96d14555fc
GHKM tree output during extraction: modified extract-ghkm and score tools
2013-09-11 16:46:37 +02:00
Matthias Huck
004c44faf1
prototype GHKM tree output from extract-ghkm (still flawed)
2013-09-10 15:41:26 +02:00
Rico Sennrich
b421f7c9b0
refactoring to minimize overhead from flexibility score code (if off)
2013-09-07 23:04:40 +02:00
Rico Sennrich
7138056b8f
flexibility scores
2013-09-07 23:04:01 +02:00
Nicola Bertoldi
614d7a0376
beautify
2013-08-11 23:43:26 +02:00
Hieu Hoang
77872f7521
beautify
2013-07-30 15:04:37 +01:00
Hieu Hoang
9cdcf713a6
phrase penalty now has its own ff. No longer in the phrase table
2013-07-29 12:55:44 +01:00
Hieu Hoang
9e8402dedd
add placeholder support to extract
2013-07-26 15:46:15 +01:00
Hieu Hoang
e3917f911b
add placeholder support to extract
2013-07-26 15:44:29 +01:00
Hieu Hoang
2ba7a372e8
add placeholder support to extract
2013-07-26 14:12:27 +01:00
Hieu Hoang
4fde5f7ea2
eclipse file for extract-rules
2013-07-26 12:27:55 +01:00
Phil Williams
f0b603e6b5
extract-ghkm: write glue grammars for all sentence offsets
...
extract-parallel now merges separate glue grammars, so remove
previous workaround.
2013-07-25 13:53:32 +01:00
Phil Williams
b5584fdecf
extract-ghkm: workaround for extract-parallel issue
...
Don't write glue grammar or unknown word label files unless the sentence
offset is 0. This prevents multiple instances of extract-ghkm writing
to the same two files when extract-parallel is used.
TODO Better solutions might be:
1. modify extract-parallel so that it only configures one instance of
extract-ghkm to write the glue / unknown-lhs files (like the current
workaround, this assumes file chunks are representative of the whole)
2. add multithreading support directly to extract-ghkm
3. write distinct output files for each extract-ghkm instance and
combine them on completion
2013-07-23 14:55:16 +01:00
Hieu Hoang
310b26f989
beautify
2013-07-08 20:52:14 +01:00
Hieu Hoang
3eba5782c2
beautify
2013-07-08 20:25:47 +01:00
Hieu Hoang
dc33fa3d3d
redo parsing of feature function parameters
2013-06-20 12:50:41 +01:00
Hieu Hoang
abe6bb7c22
refactor parsing of feature function args
2013-06-10 18:11:55 +01:00
Hieu Hoang
6249432407
beautify
2013-05-29 18:16:15 +01:00
phikoehn
41da5b2760
Merge branch 'master' of git://github.com/moses-smt/mosesdecoder
2013-05-12 08:16:22 +01:00
Rico Sennrich
ce5311c076
fix undefined behaviour in rule extract (thanks Barry)
2013-05-07 10:50:19 +02:00
Rico Sennrich
a52f0a8c4d
avoid costly copy operation in extract-rules
...
(noticeable speed-up with large number of non-terminals:
2x speed-up in benchmark with target syntax and --MaxNonTerm 5)
2013-05-03 10:48:14 +02:00
phikoehn
d19a28ae21
Merge branch 'master' of git://github.com/moses-smt/mosesdecoder
2013-05-01 19:22:00 +01:00
phikoehn
cd8915647b
support for Chris Dyer's fast-align; bug fix with sparse word translations feature; threshold pruning in filter
2013-05-01 19:20:05 +01:00
Rico Sennrich
4e87a012d0
fix two bugs with relax-parse:
...
- size of sentence was not calculated correctly
(instead, number of positions at which a subtree starts was used)
- code entered an infinite loop sometimes; added break condition
2013-04-25 17:27:50 +02:00
phikoehn
5ba153806b
fixed Kneser-Ney phrase probability smoothing bug reported by Česlav Przywara <ceslav@przywara.cz>
2013-03-13 17:52:24 +00:00
Barry Haddow
5f1be3217b
bugfix: format of extract file for instance weighting
2013-03-07 21:40:43 +00:00
Rico Sennrich
e3ea93acb7
speed up rule extraction by factor 2 (by rewriting rule consolidation to have linear instead of quadratic complexity)
2013-02-06 13:10:38 +01:00
Barry Haddow
8db90fd2ac
instance weighting for lex reordering
2013-01-10 19:46:19 +00:00
Barry Haddow
2e8bad22e4
lex reordering scoring uses FilePiece/StringPiece
2013-01-09 17:38:48 +00:00
Barry Haddow
861792bfc5
extract can read an instance weights file.
...
Still have to parallelise.
2012-12-21 15:39:25 +00:00
Phil Williams
139148bc8f
extract-ghkm and friends: don't unescape special characters
...
Don't unescape special characters when reading XML parse trees in
extract-ghkm, extract-rules, and relax-parse.
2012-12-17 20:08:02 +00:00
Phil Williams
0ca5b8932a
extract-ghkm: tweak label collection for unknown words
...
Produce a better label set when unary rule elimination is enabled.
2012-12-17 19:43:42 +00:00
Phil Williams
fb8d20a22f
extract-ghkm: --UnknownWordMinRelFreq, --UnknownWordUniform
2012-12-17 19:02:30 +00:00
Hieu Hoang
d0cf8f47db
order of lexical probability has flipped
2012-11-22 17:37:36 +00:00
Barry Haddow
a90e1861c0
Alignments on by default for phrase-based
2012-11-15 12:35:43 +00:00
Hieu Hoang
f96b33de83
only include moses root when compiling
2012-11-14 13:43:04 +00:00
Hieu Hoang
5e3ef23cef
move moses/src/* to moses/
2012-11-12 19:56:18 +00:00
Kenneth Heafield
7d692496c3
More little jamfile changes
2012-11-12 16:57:56 +00:00
Kenneth Heafield
d74b784ad2
And pcfg-common too...
2012-11-12 16:53:42 +00:00
Kenneth Heafield
ddd3cc1d8a
Fix extract-ghkm compilation
2012-11-12 16:50:46 +00:00
Kenneth Heafield
62d37fa2b6
Refactor phrase-extract/Jamfile
2012-11-12 14:17:48 +00:00
Hieu Hoang
e75522b602
rename functions
2012-11-09 18:55:01 -05:00
Hieu Hoang
db29cc50d0
xcode
2012-11-09 18:13:20 -05:00
Kenneth Heafield
cd00219fa4
Missing dependency
2012-11-04 16:26:47 -05:00
Barry Haddow
62fa6d6f28
Feature function interface for use in scoring
2012-11-02 23:30:51 +00:00
Barry Haddow
d1d5fe4036
Remove -SentenceId (since we have -IncludeSentenceId now)
2012-10-22 22:03:43 +01:00
Barry Haddow
848aafb644
Merge remote branch 'github/master' into miramerge
...
Conflicts:
moses/src/AlignmentInfo.cpp
moses/src/AlignmentInfo.h
moses/src/ChartHypothesis.cpp
moses/src/ChartTrellisNode.cpp
moses/src/LM/Implementation.cpp
moses/src/LM/Ken.cpp
moses/src/TargetPhrase.cpp
moses/src/TargetPhrase.h
2012-10-08 17:54:59 +01:00
Phil Williams
0851a4d113
extract-ghkm: add --SentenceOffset option
...
This should behave the same as the --SentenceOffset option for
extract-rules. The extract-parallel.perl script expects the rule
extractor to have this option.
2012-10-03 20:04:09 +01:00
Barry Haddow
0a950ee9f4
Merge remote branch 'github/master' into miramerge
...
Compiles, but not tested. Had to disable relent filter. Strangely, it seems to contain the
whole of moses-cmd.
Conflicts:
Jamroot
OnDiskPt/TargetPhrase.cpp
moses-cmd/src/Main.cpp
moses/src/AlignmentInfo.cpp
moses/src/AlignmentInfo.h
moses/src/ChartTranslationOptionCollection.cpp
moses/src/ChartTranslationOptionCollection.h
moses/src/GenerationDictionary.cpp
moses/src/Jamfile
moses/src/Parameter.cpp
moses/src/PhraseDictionary.cpp
moses/src/StaticData.cpp
moses/src/StaticData.h
moses/src/TargetPhrase.h
moses/src/TranslationSystem.cpp
moses/src/TranslationSystem.h
moses/src/Word.cpp
phrase-extract/score.cpp
regression-testing/Jamfile
scripts/ems/experiment.meta
scripts/ems/experiment.perl
scripts/training/train-model.perl
2012-09-26 22:49:33 +01:00
phikoehn
28e8832a15
bug fix domain features
2012-09-25 01:22:09 +01:00
Kenneth Heafield
0cddf8a58b
Fix compilation without threads
2012-09-21 23:11:59 +01:00
Eva Hasler
21938e4d94
initialize correct variable (includeSentenceIdFlag)
2012-09-12 20:02:57 +01:00
phikoehn
5d9859ba0e
merge issues
2012-09-03 07:27:41 +01:00
phikoehn
e072a7f9a7
merge issues
2012-09-03 07:24:07 +01:00
phikoehn
0e783dc529
bug fix to enable pruned search graph output by default
2012-09-03 07:23:32 +01:00
phikoehn
d99f97297f
merges
2012-09-03 07:21:47 +01:00
Hieu Hoang
c639cdbb38
binary hiero reordering feature. Integrated into train-model.perl and experiment.perl. In the 2nd to last position in phrase table, just in front of 2.718
2012-08-28 17:01:08 +01:00
Hieu Hoang
33c03edfbb
binary hiero reordering feature. Implementation of 1 described in nist 2012. 1 if non-term is reordered wrt other words or non-terms. 0 otherwise
2012-08-25 00:47:57 +01:00
Hieu Hoang
fa56d7861f
word alignment info for hiero grammar
2012-08-24 19:11:35 +01:00
Hieu Hoang
69fc00faf9
singleton feature in phrase table. Like similar feature in Adam's suffix array, as implemented in cdec
2012-08-24 00:54:05 +01:00
Hieu Hoang
5dbb0e66ce
option to produce rules that have boundary <s> & </s> words. Like Chris Dyer's extraction
2012-08-23 19:40:09 +01:00
phikoehn
4a1a995878
a lot of changes
2012-08-18 23:48:26 +01:00
phikoehn
366ab93f8a
a lot of changes
2012-08-18 23:47:05 +01:00
Hieu Hoang
aaa2432851
get rid of threading
2012-07-31 23:34:53 +01:00
Hieu Hoang
c778113ba7
get rid of threading
2012-07-31 22:03:11 +01:00
Hieu Hoang
dd4cf4523b
get rid of threading
2012-07-31 21:49:38 +01:00
Hieu Hoang
3b65a8c626
get rid of threading
2012-07-31 19:40:43 +01:00
Hieu Hoang
3302301c6d
compile error in statistics program
2012-07-31 02:32:58 +01:00
Hieu Hoang
a1ab8e354a
cleanup of variables. Need to delete temporary files
2012-07-31 02:21:48 +01:00
Hieu Hoang
7ae76dfe75
multi-threaded extract program. Thanks to Rohit Gupta
2012-07-18 12:46:59 +01:00
Barry Haddow
2b4e61d826
Merge branch 'trunk' into miramerge
...
Compiles, not tested.
Conflicts:
Jamroot
OnDiskPt/PhraseNode.h
OnDiskPt/TargetPhrase.cpp
OnDiskPt/TargetPhrase.h
OnDiskPt/TargetPhraseCollection.cpp
mert/BleuScorer.cpp
mert/Data.cpp
mert/FeatureData.cpp
moses-chart-cmd/src/Main.cpp
moses/src/AlignmentInfo.h
moses/src/ChartManager.cpp
moses/src/LM/Ken.cpp
moses/src/LM/Ken.h
moses/src/LMList.h
moses/src/LexicalReordering.h
moses/src/PhraseDictionaryTree.h
moses/src/ScoreIndexManager.h
moses/src/StaticData.h
moses/src/TargetPhrase.h
moses/src/Word.cpp
scripts/ems/experiment.meta
scripts/ems/experiment.perl
scripts/training/train-model.perl
2012-07-17 13:36:50 +01:00
Hieu Hoang
15b95cd042
use consistent alignment info for lexical probabilities for both forward and inverse scoring
2012-07-12 16:45:43 +01:00
Eva Hasler
027a20730e
merge Jamfiles
2012-07-04 11:49:07 +01:00
phikoehn
ff79f9f054
fix conflict
...
Merge branch 'master' of git://github.com/moses-smt/mosesdecoder
Conflicts:
scripts/ems/experiment.perl
2012-07-03 00:05:13 +01:00
phikoehn
ce65a47f0d
count bin feature
2012-07-03 00:00:21 +01:00
Hieu Hoang
75e038f4cf
create namespace for all classes
2012-07-02 17:05:11 +01:00
Hieu Hoang
121e258e84
namespace all classes in mert directory
2012-06-30 21:39:10 +01:00
Hieu Hoang
1cf1c2e515
create namespace for all classes in phrase-extract
2012-06-30 16:56:53 +01:00
Hieu Hoang
ef9db932aa
add namespace to phrase-extract
2012-06-30 15:43:47 +01:00
Hieu Hoang
a5ca652a76
move c++ code out of /script/ to /
2012-05-31 17:58:10 +01:00
Hieu Hoang
4eef94b121
move c++ code out of /script/ to /
2012-05-31 17:24:06 +01:00