Commit Graph

52 Commits

Author SHA1 Message Date
Matthias Huck
1659d6b4c8 Option for target constituent constrained phrase extraction. TargetConstituentAdjacencyFeature. 2016-02-12 17:46:57 +00:00
MosesAdmin
4825b9e08a daily automatic beautifier 2016-01-10 00:00:35 +00:00
Matthias Huck
1d3feba8d0 preparing extraction of Hiero soft syntactic preferences (target syntax) 2016-01-09 23:02:31 +00:00
Matthias Huck
9fd0486815 score-main: Seems like the list container is causing substantial efficiency issues.
Phrase scoring apparently takes hours longer in some cases. Switch back to vector.
2015-08-29 04:48:09 +01:00
MosesAdmin
b64af59af6 daily automatic beautifier 2015-07-25 00:00:40 +01:00
Matthias Huck
472529ade8 Moses::Scan too inefficient 2015-07-24 20:43:29 +01:00
Matthias Huck
9e31bced9a MinCount parameter in score-main 2015-07-24 19:42:15 +01:00
Jeroen Vermeulen
0ca2bcb28d End line after printing progress dots to stderr. 2015-07-16 15:51:16 +07:00
MosesAdmin
5696a59ae4 daily automatic beautifier 2015-06-04 13:41:46 +01:00
Jeroen Vermeulen
0981d23705 Lint-fixing binge. 2015-06-02 16:02:39 +07:00
Hieu Hoang
cc8c6b7b10 beautify 2015-05-02 11:45:24 +01:00
Jeroen Vermeulen
09c982c1de Remove bad initialization.
Setting lastLine[0] when lastLine is empty probably doesn't do anything, but
in C++11 it is definitely undefined. The value wasn't used anyway!
2015-05-01 18:42:04 +07:00
Matthias Huck
633e7be8f0 integer overflows in Good-Turing discounting 2015-03-30 17:42:55 +01:00
Hieu Hoang
ad73919979 merge with private branch 2015-03-10 15:28:45 +00:00
Matthias Huck
559077f6f8 some moderate modifications in phrase-extract/score-main.cpp
(e.g., use Moses::Scan<>() rather than atof()/atoi())
2015-03-09 18:49:32 +00:00
Matthias Huck
aa077ab66c GHKM extraction / consolidate: write most frequent POS sequence from property to factor (for usage with a POS LM) 2015-03-05 22:25:32 +00:00
Matthias Huck
638e9c3f60 POS property: map tags to indices in consolidate 2015-03-04 22:48:34 +00:00
Matthias Huck
06e87d851e GHKM: extract POS phrase property (from preterminals in the syntactic parse tree) 2015-03-04 21:40:56 +00:00
Hieu Hoang
70e8eb54ce Using boost for prefix/suffix checks /Jeroen Vermeulen 2015-02-05 16:23:47 +00:00
Matthias Huck
9987beb453 SoftSourceSyntacticConstraintsFeature: Now for both non-terminals (as before) _and_ terminals.
Also added score components based on relative frequency.
(TODO: logprobs right now; are plain probabilities better?)
2015-01-23 18:41:18 +00:00
Matthias Huck
b50c197313 forgot to check this in some time ago 2015-01-20 21:41:41 +00:00
Matthias Huck
a6c09e57d0 domain features in GHKM extraction 2015-01-20 21:36:55 +00:00
Hieu Hoang
05ead45e71 beautify 2015-01-14 11:07:42 +00:00
Matthias Huck
7a299de66b avoid necessity of masking "{{" in the data 2014-12-04 15:54:05 +00:00
Matthias Huck
0fd987a8c6 avoid necessity of masking "{{" in the data 2014-11-12 18:28:59 +00:00
Matthias Huck
3a5dee12e8 implementation of phrase orientation in GHKM extraction
(...but a corresponding feature function for the chart-based decoder has not been written yet)
2014-07-28 18:27:12 +01:00
Hieu Hoang
32b4eb5168 add NonTermContext property 2014-06-13 17:04:41 +01:00
Hieu Hoang
768ac1c6a8 add SpanLength to score 2014-06-12 13:26:01 +01:00
Matthias Huck
d0e92da734 GHKM extraction can add a source labels phrase property 2014-06-11 19:27:18 +01:00
Hieu Hoang
1b667e3e24 delete any mention of SAFE_GETLINE so it doesn't reappear 2014-06-08 17:07:12 +01:00
Hieu Hoang
d979b24314 use standard c++ getline instead of old Moses SAFE_GETLINE 2014-06-08 14:06:33 +01:00
Hieu Hoang
3ae671fc7c ||| separator after counts 2014-06-03 17:10:09 +01:00
Hieu Hoang
0e308e41ca recommit Rico's change to score format 2014-03-13 18:30:24 +00:00
Ulrich Germann
a7c85780ee Merge branch 'master' into dynamic-phrase-tables
Conflicts:
	phrase-extract/score-main.cpp
2014-03-10 14:25:45 +00:00
Ulrich Germann
fdc504d47a Changes on main branch files while I was working on dynamic phrase tables. 2014-03-10 14:08:00 +00:00
Rico Sennrich
01bc3c111e swap position of alignment and scores in phrase table halves (before consolidate step).
ensures that multiple hierarchical rules with same source/target phrase, but different alignment, are sorted correctly
2014-03-02 16:55:42 +00:00
Matthias Huck
e40fabfad5 fixed compile errors in debug mode 2014-02-06 19:46:32 +00:00
Matthias Huck
86ee3e15a4 new version of the score tool
which is now capable of dealing with additional properties in an appropriate manner
2014-01-29 18:37:42 +00:00
Phil Williams
d6aa123d03 score: write sparse features to third field. 2013-09-29 18:58:20 +01:00
Phil Williams
2a28d1a73e Merge branch 'master' into GHKMStruct
Conflicts:
	moses-chart-cmd/IOWrapper.cpp
	moses-chart-cmd/IOWrapper.h
	moses/FF/Factory.cpp
	moses/Parameter.cpp
	moses/StaticData.h
	phrase-extract/extract-ghkm/ScfgRuleWriter.cpp
	phrase-extract/score-main.cpp
2013-09-29 15:27:09 +01:00
maria nadejde
7cc284a743 comment 2013-09-14 10:50:33 +02:00
maria nadejde
df86f0e78b Merge branch 'GHKMStruct' of github.com:moses-smt/mosesdecoder into GHKMStruct 2013-09-14 10:46:17 +02:00
maria nadejde
5f37a545b1 fixed sparse feature output 2013-09-14 10:44:35 +02:00
Phil Williams
296eb6804a Merge master 2013-09-13 22:32:45 +01:00
Phil Williams
cdd9df19d2 Remove --OutputNTLengths from extract-rules, etc.
The option isn't used in master and the output is compatible with the
current rule table format.  If anyone wants this in master it should
probably be fixed in the span-length branch then merged.
2013-09-13 22:16:42 +01:00
maria nadejde
bf5c32df6c stuff that probably doesn't work 2013-09-13 19:43:04 +02:00
Matthias Huck
c39bed60c0 Tree fragments in GHKM glue rules;
output of LHS tag in tree fragments for UNKs;
GHKMParse info is now denoted as Tree info
2013-09-13 17:10:21 +02:00
maria nadejde
67b873b67d mock feature 2013-09-12 18:40:08 +02:00
Matthias Huck
96d14555fc GHKM tree output during extraction: modified extract-ghkm and score tools 2013-09-11 16:46:37 +02:00
Hieu Hoang
6249432407 beautify 2013-05-29 18:16:15 +01:00