Matthias Huck
dda3f1867c
Hiero phrase orientation
2016-01-06 18:52:14 +00:00
Matthias Huck
bd3f573452
Hiero phrase orientation
2015-12-10 12:56:37 +00:00
Phil Williams
3c86649e34
Fix bug in STSG rule scoring
2015-09-25 12:11:23 +01:00
Hieu Hoang
e0d2af268c
eclipse
2015-08-11 13:10:38 +04:00
Phil Williams
e7228ec9fb
extract-ghkm: minor refactoring
2015-07-06 14:41:34 +01:00
Phil Williams
44372d7787
extract-ghkm: fix a couple of exception-related issues
2015-07-06 12:05:41 +01:00
Phil Williams
c6a3d8e54a
Ongoing moses/phrase-extract refactoring
2015-06-04 16:54:31 +01:00
MosesAdmin
5696a59ae4
daily automatic beautifier
2015-06-04 13:41:46 +01:00
Phil Williams
ed321791a7
Ongoing moses/phrase-extract refactoring
2015-06-03 11:10:45 +01:00
Phil Williams
2e21f051f2
Ongoing moses/phrase-extract refactoring
2015-06-03 10:05:36 +01:00
Phil Williams
2f04d4a56e
Ongoing moses/phrase-extract refactoring
2015-06-02 15:23:41 +01:00
Phil Williams
0c61970ac7
Ongoing moses/phrase-extract refactoring
2015-06-02 13:56:03 +01:00
Phil Williams
d3fb4a8002
Ongoing moses/phrase-extract refactoring
2015-06-02 10:16:42 +01:00
Phil Williams
8a9505d72f
Ongoing moses/phrase-extract refactoring
2015-06-01 16:54:12 +01:00
Phil Williams
f37415a259
Ongoing moses/phrase-extract refactoring
2015-06-01 16:40:35 +01:00
Phil Williams
f61091e38d
Ongoing moses/phrase-extract refactoring
2015-06-01 14:23:25 +01:00
Phil Williams
c754aef37a
Oops. Fix compile error.
2015-06-01 08:45:04 +01:00
Phil Williams
985e7bbfc3
Ongoing moses/phrase-extract refactoring
2015-05-29 20:57:25 +01:00
Phil Williams
2f735998ca
Rename MosesTraining::SyntaxTree to MosesTraining::SyntaxNodeCollection
This is the first step in a small-scale refactoring effort that will touch a
lot of the syntax-related code in moses/phrase-extract. The end goals are:
- a storage mechanism for general attribute/value pairs in XML-style
  tree / lattice input. E.g. the "pcfg-score" and "semantic-role"
  attributes in:
    <tree label="PRP" pcfg-score="1.0" semantic-role="AGENT"> I </tree>
- consolidation of the various near-duplicate Tree / XmlTreeParser classes
  that have accumulated over the years (my fault)
- general de-crufting
2015-05-29 18:46:02 +01:00
Jeroen Vermeulen
a25193cc5d
Fix a lot of lint, mostly trailing whitespace.
This is lint reported by the new lint-checking functionality in beautify.py.
(We can change to a different lint checker if we have a better one, but it
would probably still flag these same problems.)
Lint checking can help a lot, but only if we get the lint under control.
2015-05-17 20:04:04 +07:00
Hieu Hoang
cc8c6b7b10
beautify
2015-05-02 11:45:24 +01:00
Jeroen Vermeulen
eca5824100
Remove trailing whitespace in C++ files.
2015-04-30 12:05:11 +07:00
Jeroen Vermeulen
32722ab5b1
Support tokenize(const std::string &) as well.
Convenience wrapper: the actual function takes a const char[], but many of
the call sites want to pass a string and have to call its c_str() first.
2015-04-22 10:35:18 +07:00
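The wrapper pattern described in this commit can be sketched roughly as follows. This is only an illustration: the tokenizer body is a hypothetical whitespace splitter written for this example, not the actual implementation in util, and the return type is an assumption.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the real tokenizer: splits the input on
// spaces and tabs and returns the non-empty pieces.
inline std::vector<std::string> tokenize(const char *input)
{
  const char *delims = " \t";
  std::vector<std::string> tokens;
  const char *p = input;
  while (*p) {
    p += std::strspn(p, delims);                      // skip delimiters
    const std::size_t len = std::strcspn(p, delims);  // length of next token
    if (len > 0) tokens.push_back(std::string(p, len));
    p += len;
  }
  return tokens;
}

// Convenience overload: call sites holding a std::string no longer need
// to call c_str() themselves; this simply forwards to the version above.
inline std::vector<std::string> tokenize(const std::string &input)
{
  return tokenize(input.c_str());
}
```

With both overloads in place, a call site can pass either a string literal or a std::string, and the result can be bound to a const vector as the neighbouring commit suggests.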
Jeroen Vermeulen
b2d821a141
Unify tokenize() into util, and unit-test it.
The duplicate definition works fine in environments where the inline
definition becomes a weak symbol in the object file, but if it gets
generated as a regular definition, the duplicate definition causes link
problems.
In most call sites the return value could easily be made const, which
gives both the reader and the compiler a bit more certainty about the code's
intentions. In theory this may help performance, but it's mainly for clarity.
The comments are based on reverse-engineering, and the unit tests are based
on the comments. It's possible that some of what's in there is not essential,
in which case, don't feel bad about changing it!
I left a third identical definition in place, though I updated it with my
changes to avoid creeping divergence, and noted the duplication in a comment.
It would be nice to get rid of this definition as well, but it'd introduce
headers from the main Moses tree into biconcor, which may be against policy.
2015-04-22 09:59:05 +07:00
Matthias Huck
534a894c0b
glue rules with stripped BitPar labels
2015-03-10 22:02:21 +00:00
Matthias Huck
01bed83cf9
GHKM extraction: option to strip non-terminal labels from BitPar syntactic parses right during extraction (i.e., remove any suffix starting with a hyphen from the label)
2015-03-10 21:25:32 +00:00
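The label stripping described in this commit (dropping any suffix that starts with a hyphen) might look something like the sketch below. The function name StripLabelSuffix is hypothetical and does not come from the Moses sources. Leaving labels that begin with a hyphen untouched is an assumption made here so that a label never becomes empty.

```cpp
#include <string>

// Hypothetical helper: returns the label with everything from the first
// hyphen onwards removed, e.g. "NP-SB" becomes "NP".
inline std::string StripLabelSuffix(const std::string &label)
{
  const std::string::size_type pos = label.find('-');
  // Assumption: keep the label as-is when there is no hyphen, or when the
  // hyphen is the first character (stripping would leave an empty label).
  if (pos == std::string::npos || pos == 0) {
    return label;
  }
  return label.substr(0, pos);
}
```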
Matthias Huck
25f5470216
GHKM: write target parts-of-speech as a factor
2015-03-09 21:54:03 +00:00
Matthias Huck
99b8f65fb1
GHKM: POS factor in glue rules: target side only
2015-03-06 16:47:44 +00:00
Matthias Huck
aa077ab66c
GHKM extraction / consolidate: write most frequent POS sequence from property to factor (for usage with a POS LM)
2015-03-05 22:25:32 +00:00
Matthias Huck
773a16b5fd
POS property in glue rules
2015-03-04 23:05:45 +00:00
Matthias Huck
06e87d851e
GHKM: extract POS phrase property (from preterminals in the syntactic parse tree)
2015-03-04 21:40:56 +00:00
Matthias Huck
9987beb453
SoftSourceSyntacticConstraintsFeature: Now for both non-terminals (as before) _and_ terminals.
Also added score components based on relative frequency.
(TODO: logprobs right now; are plain probabilities better?)
2015-01-23 18:41:18 +00:00
Matthias Huck
a6c09e57d0
domain features in GHKM extraction
2015-01-20 21:36:55 +00:00
Hieu Hoang
6d61db28fa
use astyle 2.01. It's on Edinburgh server and doesn't screw up enum
2015-01-14 19:21:11 +00:00
Hieu Hoang
05ead45e71
beautify
2015-01-14 11:07:42 +00:00
Matthias Huck
168118d252
PhraseOrientationFeature efficiency improvement
2015-01-09 14:03:18 +00:00
Nicola Bertoldi
e4eb201c52
merged master into dynamic-models and solved conflicts
2014-12-13 12:52:47 +01:00
Matthias Huck
24a8a6a511
PhraseOrientationFeature
2014-12-03 20:04:26 +00:00
Hieu Hoang
ba7afba9f6
move n-best code for phrase-based from IOWrapper to ChartManager
2014-12-02 17:40:53 +00:00
Phil Williams
ef1262a17f
extract-ghkm: change STSG output format.
2014-11-21 15:46:12 +00:00
Phil Williams
c46fb10ec7
extract-ghkm: add --STSG option
2014-11-21 11:30:29 +00:00
Matthias Huck
0fd987a8c6
avoid necessity of masking "{{" in the data
2014-11-12 18:28:59 +00:00
Phil Williams
a5d803ee14
extract-ghkm: add -T2S option
2014-11-12 14:03:24 +00:00
Phil Williams
05ecc914c2
Fix a few more compiler warnings (from Clang mostly).
2014-10-10 15:47:53 +01:00
Matthias Huck
5ac6c42508
PhraseOrientationFeature: bugfixes
2014-09-13 00:20:17 +01:00
Matthias Huck
0cf0d595d3
GHKM glue grammar: Orientation phrase property
2014-09-12 17:30:03 +01:00
Matthias Huck
63316960a1
GHKM glue grammar: print word alignment links for <s> and </s>,
SSTART and SEND in internal tree structure
2014-09-12 17:18:31 +01:00
Matthias Huck
1523f3315d
PhraseOrientationFeature for chart-based decoding: a first simple version,
with lots of log output
2014-09-12 13:51:04 +01:00
Matthias Huck
33992f9af5
uninitialized variables and double include
2014-08-08 16:27:17 +01:00
Paul Baltescu
d75c4e1ae5
OxLM integration.
2014-08-08 01:18:05 +01:00