Use square brackets instead of round brackets for internal tree
structure. This avoids the need for additional escaping since
square brackets are already escaped in Moses.
Also: tweak code style to match the rest of the source file, and
output less whitespace to make the extract files (marginally)
smaller.
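
As an illustration of the bracket change only, here is a minimal Python sketch of a square-bracket serializer with single-space separators. The nested-tuple tree representation and the `format_tree` name are assumptions made for this example; they are not the extract-ghkm data structures or its exact output format.

```python
# Hypothetical sketch: a node is either a string (leaf token) or a
# (label, children) tuple. This is not the extract-ghkm representation.
def format_tree(node):
    """Serialize a tree as [label child ...] using square brackets."""
    if isinstance(node, str):
        return node  # leaf: emit the token as-is
    label, children = node
    # Join with single spaces so no extra whitespace is written.
    return "[" + " ".join([label] + [format_tree(c) for c in children]) + "]"


tree = ("NP", [("DT", ["the"]), ("NN", ["house"])])
print(format_tree(tree))  # prints: [NP [DT the] [NN house]]
```

Because only square brackets are produced, the serializer needs no escaping step beyond the one already applied to the tokens.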
Don't write the glue grammar or unknown word label files unless the
sentence offset is 0. This prevents multiple instances of extract-ghkm
from writing to the same two files when extract-parallel is used.
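
The guard described above could look roughly like the following Python sketch. The function name, its parameters, and the one-label-per-line file content are placeholders chosen for illustration; the real extract-ghkm code is C++ and writes its own formats.

```python
# Hypothetical sketch of the workaround: only the instance whose chunk
# starts at sentence 0 writes the two shared side files.
def write_side_files(sentence_offset, glue_path, unknown_path, labels):
    """Return True if the files were written, False if skipped."""
    if sentence_offset != 0:
        # Another instance (the one with offset 0) owns these files.
        return False
    for path in (glue_path, unknown_path):
        with open(path, "w") as out:
            # Placeholder content: one label per line, not the real format.
            for label in sorted(labels):
                out.write(label + "\n")
    return True
```

As the TODO below notes, this assumes the first chunk is representative of the whole corpus, since labels seen only in later chunks are never written.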
TODO: Better solutions might be:
1. modify extract-parallel so that it only configures one instance of
   extract-ghkm to write the glue / unknown-lhs files (like the current
   workaround, this assumes file chunks are representative of the whole)
2. add multithreading support directly to extract-ghkm
3. write distinct output files for each extract-ghkm instance and
   combine them on completion
The sentence offset option should behave the same as the
--SentenceOffset option for extract-rules. The extract-parallel.perl
script expects the rule extractor to have this option.
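
For readers unfamiliar with the option, the sketch below shows one plausible reading of it in Python: the offset is the number of the first sentence in the chunk, so sentence numbers stay consistent across chunks. Only the flag name `--SentenceOffset` comes from the text above; the helper names and the numbering behaviour are assumptions, not a description of the actual extract-ghkm or extract-rules code.

```python
import argparse


def parse_args(argv):
    """Parse a --SentenceOffset flag, defaulting to 0 for a single run."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--SentenceOffset", type=int, default=0)
    return parser.parse_args(argv)


def numbered(sentences, offset):
    """Pair each sentence with its global number (assumed semantics)."""
    return [(offset + i, s) for i, s in enumerate(sentences)]


args = parse_args(["--SentenceOffset", "100"])
print(numbered(["first", "second"], args.SentenceOffset))
# prints: [(100, 'first'), (101, 'second')]
```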