mosesdecoder/contrib/python/README.md

97 lines
2.6 KiB
Markdown
Raw Normal View History

# Python interface to Moses
The idea is to have some of Moses' internals exposed to Python (inspired on pycdec).
## What's been interfaced?
2012-11-15 16:02:50 +04:00
* Binary tables:
Moses::PhraseDictionaryTree
OnDiskPt::OnDiskWrapper
## Building
2012-10-03 22:04:48 +04:00
1. Build the python extension:
You need to compile Moses with link=shared
2012-10-03 22:04:48 +04:00
./bjam --libdir=path link=shared
2012-10-03 22:04:48 +04:00
Then you can build the extension (in case you used --libdir=path above, use --moses-lib=path below)
2012-11-13 14:43:36 +04:00
python setup.py build_ext -i [--with-cmph] [--moses-lib=PATH] [--cython] [--max-factors=NUM] [--max-kenlm-order=NUM]
Use `--cython` if you want to re-compile the pyx files, note that they already come compiled so that you don't need to have Cython installed
## Example
### Getting a phrase table
cd examples
export LC_ALL=C
cat phrase-table.txt | sort | ../../../bin/processPhraseTable -ttable 0 0 - -nscores 5 -alignment-info -out phrase-table
### Getting a rule table
cd examples
../../../bin/CreateOnDiskPt 0 0 5 20 2 rule-table.txt rule-table
### Querying
1. Phrase-based
echo "casa" | python example.py examples/phrase-table 5 20
echo "essa casa" | python example.py examples/phrase-table 5 20
2. Hierarchical
echo "i [X]" | python example.py examples/rule-table 5 20
echo "have [X]" | python example.py examples/rule-table 5 20
echo "[X][X] do not [X][X] [X]" | python example.py examples/rule-table 5 20
### Code
```python
2012-11-15 16:02:50 +04:00
from moses.dictree import load # load abstracts away the choice of implementation by checking the available files
import sys
if len(sys.argv) != 4:
print "Usage: %s table nscores tlimit < query > result" % (sys.argv[0])
sys.exit(0)
path = sys.argv[1]
nscores = int(sys.argv[2])
tlimit = int(sys.argv[3])
table = load(path, nscores, tlimit)
for line in sys.stdin:
f = line.strip()
result = table.query(f)
# you could simply print the matches
# print '\n'.join([' ||| '.join((f, str(e))) for e in matches])
2012-11-15 16:02:50 +04:00
# or you can use their attributes
print result.source
for e in result:
if e.lhs:
print '\t%s -> %s ||| %s ||| %s' % (e.lhs,
' '.join(e.rhs),
e.scores,
e.alignment)
else:
print '\t%s ||| %s ||| %s' % (' '.join(e.rhs),
e.scores,
e.alignment)
```
## Changing the code
If you want to add your changes you are going to have to recompile the cython code.
1. Compile the cython code using Cython 0.17.1 or above
2012-11-13 14:43:36 +04:00
python setup.py build_ext -i --cython