# Python interface to Moses
The idea is to expose some of Moses' internals to Python (inspired by pycdec).
## What's been interfaced?
* Binary tables:
    * `Moses::PhraseDictionaryTree`
    * `OnDiskPt::OnDiskWrapper`
## Building
1. Build the Python extension.

   First compile Moses with `link=shared`:

       ./bjam --libdir=path link=shared

   Then build the extension (if you passed `--libdir=path` above, pass `--moses-lib=path` below):

       python setup.py build_ext -i [--with-cmph] [--moses-lib=PATH] [--cython] [--max-factors=NUM] [--max-kenlm-order=NUM]

Use `--cython` only if you want to recompile the `.pyx` files. They ship already compiled, so you do not need Cython installed.
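
After an in-place build (`-i`), one quick sanity check is to try importing the compiled module. The helper below is a minimal sketch and not part of Moses: `extension_available` is a hypothetical name, and `moses.dictree` is the module that the example code in this README imports from.

```python
import importlib


def extension_available(module_name="moses.dictree"):
    """Return True if module_name can be imported, False otherwise.

    Hypothetical helper for illustration; not part of Moses.
    """
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Run this from the directory where `build_ext -i` placed the module.
    print("moses.dictree importable: %s" % extension_available())
```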
## Example
### Getting a phrase table

    cd examples
    export LC_ALL=C
    cat phrase-table.txt | sort | ../../../bin/processPhraseTable -ttable 0 0 - -nscores 5 -alignment-info -out phrase-table

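
Setting `LC_ALL=C` makes `sort` compare lines byte by byte instead of applying locale collation rules. The sketch below reproduces that ordering in Python, assuming UTF-8 text. `c_locale_sort` is a hypothetical helper, not part of Moses, and the sample lines are made up for illustration.

```python
def c_locale_sort(lines):
    """Sort text lines by their UTF-8 bytes, as `LC_ALL=C sort` does."""
    return sorted(lines, key=lambda line: line.encode("utf-8"))


if __name__ == "__main__":
    # Made-up phrase-table lines, for illustration only.
    sample = [
        "essa casa ||| that house ||| 0.5",
        "Casa ||| House ||| 0.5",
        "casa ||| house ||| 0.5",
    ]
    for line in c_locale_sort(sample):
        print(line)
```

Note that in byte order uppercase ASCII letters sort before lowercase ones, which a locale-aware sort would typically not do.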
### Getting a rule table

    cd examples
    ../../../bin/CreateOnDiskPt 0 0 5 20 2 rule-table.txt rule-table

### Querying
1. Phrase-based

       echo "casa" | python example.py examples/phrase-table 5 20
       echo "essa casa" | python example.py examples/phrase-table 5 20

2. Hierarchical

       echo "i [X]" | python example.py examples/rule-table 5 20
       echo "have [X]" | python example.py examples/rule-table 5 20
       echo "[X][X] do not [X][X] [X]" | python example.py examples/rule-table 5 20

### Code
```python
from moses.dictree import load  # load picks the implementation by checking which files are available
import sys

if len(sys.argv) != 4:
    print("Usage: %s table nscores tlimit < query > result" % (sys.argv[0]))
    sys.exit(0)

path = sys.argv[1]
nscores = int(sys.argv[2])
tlimit = int(sys.argv[3])

table = load(path, nscores, tlimit)

# each line read from stdin is one query
for line in sys.stdin:
    f = line.strip()
    result = table.query(f)
    # you could simply print the matches
    # print('\n'.join([' ||| '.join((f, str(e))) for e in result]))
    # or you can use their attributes
    print(result.source)
    for e in result:
        if e.lhs:
            # hierarchical rule: the entry has a left-hand side
            print('\t%s -> %s ||| %s ||| %s' % (e.lhs,
                                                ' '.join(e.rhs),
                                                e.scores,
                                                e.alignment))
        else:
            # phrase pair: no left-hand side
            print('\t%s ||| %s ||| %s' % (' '.join(e.rhs),
                                          e.scores,
                                          e.alignment))
```
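
The formatting logic in the loop above can be factored into a function, which makes it possible to exercise it without a compiled table. This is a sketch under assumptions: `Entry` is a hypothetical stand-in that only mirrors the attributes the example reads (`lhs`, `rhs`, `scores`, `alignment`), not the class returned by `table.query`, and the sample values are made up.

```python
from collections import namedtuple

# Hypothetical stand-in for a query result entry; it only mirrors the
# attributes used in the example above.
Entry = namedtuple("Entry", ["lhs", "rhs", "scores", "alignment"])


def format_entry(e):
    """Format one entry the same way the example prints it."""
    if e.lhs:
        # hierarchical rule: include the left-hand side
        return "\t%s -> %s ||| %s ||| %s" % (
            e.lhs, " ".join(e.rhs), e.scores, e.alignment)
    # phrase pair: no left-hand side
    return "\t%s ||| %s ||| %s" % (" ".join(e.rhs), e.scores, e.alignment)


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(format_entry(Entry(None, ["house"], [0.5], "0-0")))
    print(format_entry(Entry("X", ["the", "house"], [0.5], "0-1")))
```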
## Changing the code
If you change the Cython sources (the `.pyx` files), you have to recompile them.

1. Compile the Cython code:

       python setup.py build_ext -i --cython