Commit Graph

14 Commits

Author SHA1 Message Date
Matthias Huck
1659d6b4c8 Option for target constituent constrained phrase extraction. TargetConstituentAdjacencyFeature. 2016-02-12 17:46:57 +00:00
Matthias Huck
a5a4401fe9 TargetPreferencesPhraseProperty 2016-01-11 20:14:28 +00:00
Nicola Bertoldi
e4eb201c52 merged master into dynamic-models and solved conflicts 2014-12-13 12:52:47 +01:00
Matthias Huck
c27cbf55ea source labels: integration into EMS 2014-08-07 21:02:51 +01:00
Hieu Hoang
a402523ef5 calculate baseline score without optimisation 2014-07-11 16:26:48 +01:00
Hieu Hoang
5c57702664 merge 2014-06-13 17:08:22 +01:00
Matthias Huck
a5467d89c4 Minor modification of the phrase properties framework.
Properties can save memory by not storing the value string.
2014-06-13 16:37:13 +01:00
Hieu Hoang
eb78782c5d merge with master 2014-06-13 10:35:35 +01:00
Matthias Huck
9a7e568760 SourceLabelsPhraseProperty 2014-06-11 21:08:22 +01:00
Matthias Huck
e693a27e4e A simple phrase property class to access the three phrase count values.
The counts are usually not needed during decoding and are not loaded
from the phrase table. This is just a workaround that can make them
available to features which have a use for them.

If you need access to the counts, copy the two marginal counts and the
joint count into an additional information property with key "Counts",
e.g. using awk:

$ zcat phrase-table.gz | awk -F' \|\|\| '  '{printf("%s {{Counts %s}}\n",$0,$5);}' | gzip -c > phrase-table.withCountsPP.gz

CountsPhraseProperty reads them from the phrase table and provides
methods GetSourceMarginal(), GetTargetMarginal(), GetJointCount().
2014-06-11 20:02:31 +01:00
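
For readers who want to check what the awk one-liner in the commit message above produces, the same step can be mirrored in a few lines of Python. This is only a sketch under stated assumptions and is not part of the repository: the function names `append_counts_property` and `read_counts` are made up for illustration, the sample phrase table entry is invented, and the three values are returned in whatever order the counts field already has.

```python
import re

FIELD_SEP = " ||| "  # phrase table field delimiter, as in the awk -F pattern
COUNTS_RE = re.compile(r"\{\{Counts ([^}]*)\}\}")


def append_counts_property(line):
    """Mirror the awk command: append field 5 as a {{Counts ...}} property.

    Hypothetical helper for illustration only.
    """
    line = line.rstrip("\n")
    fields = line.split(FIELD_SEP)
    # awk yields an empty string for a missing field; do the same here.
    counts = fields[4] if len(fields) > 4 else ""
    return line + " {{Counts " + counts + "}}"


def read_counts(line):
    """Extract the numbers stored in the {{Counts ...}} property of a line.

    Hypothetical helper; returns the values in the order they appear.
    """
    match = COUNTS_RE.search(line)
    if match is None:
        raise ValueError("no {{Counts ...}} property found")
    return [float(token) for token in match.group(1).split()]


# Invented example entry: source, target, scores, alignment, counts.
entry = "das haus ||| the house ||| 0.5 0.4 0.6 0.3 ||| 0-0 1-1 ||| 10 12 8"
augmented = append_counts_property(entry)
print(augmented)
print(read_counts(augmented))
```

The sketch keeps the original line untouched and only appends to it, which is what the `printf("%s {{Counts %s}}\n",$0,$5)` call in the commit message does.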
Nicola Bertoldi
1fe4eb0528 beautify 2014-06-08 09:44:59 +02:00
Hieu Hoang
4a3ac7411d span length 2014-06-04 16:52:57 +01:00
Matthias Huck
d921d23f7d comment 2014-05-19 22:09:27 +01:00
Matthias Huck
1740478238 Framework for additional phrase properties in decoding.
Derive your property class from PhraseProperty. Do any expensive string
processing of the property value in there, not in the feature
implementation, and provide methods to access the information in
appropriate data formats. The property value string will thus have to
be processed only once (on loading) rather than each time the respective
phrase is applied and your feature needs to access the property value.
2014-05-19 21:54:08 +01:00
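
The design rule in the commit message above (parse the property value once on loading, then let features read typed data) can be illustrated with a small Python sketch. It is not the repository's C++ code: the `PhraseProperty` base class here is a simplified stand-in, and `KeyValueProperty`, its `key=value` string format, `parse_calls`, and `feature_score` are all hypothetical names introduced for the example.

```python
class PhraseProperty:
    """Simplified stand-in for a property base class (assumption, not the real API)."""

    def __init__(self, value):
        # The value string is handed over exactly once, at load time.
        self.process_value(value)

    def process_value(self, value):
        # Default behaviour: keep the raw string. Subclasses override this.
        self.value = value


class KeyValueProperty(PhraseProperty):
    """Hypothetical property whose value looks like 'np=0.5 vp=0.25'."""

    parse_calls = 0  # instrumentation for this demo only

    def process_value(self, value):
        KeyValueProperty.parse_calls += 1
        # Do the expensive string processing here and store typed data.
        self._table = {}
        for token in value.split():
            key, _, number = token.partition("=")
            self._table[key] = float(number)

    def get(self, key, default=0.0):
        # Cheap typed accessor for feature code; no string parsing involved.
        return self._table.get(key, default)


def feature_score(prop, key):
    """Hypothetical feature: called every time the phrase is applied."""
    return prop.get(key)


prop = KeyValueProperty("np=0.5 vp=0.25")              # parsed once, on loading
total = sum(feature_score(prop, "np") for _ in range(1000))
print(KeyValueProperty.parse_calls, total)
```

However often the feature is evaluated, the string is split and converted only in `process_value`, which is the saving the commit message describes.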