
1490 Commits

Author SHA1 Message Date
hieuhoang1972
928d771085 create namespace
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1897 1f5c12ca-751b-0410-a591-d2e778427230
2008-10-08 23:51:26 +00:00
redpony
76090cdda0 more normalization of feature names
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1896 1f5c12ca-751b-0410-a591-d2e778427230
2008-10-05 19:34:49 +00:00
redpony
624e87a08f add more logging
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1895 1f5c12ca-751b-0410-a591-d2e778427230
2008-10-05 18:12:42 +00:00
redpony
849538f73b make feature names the same.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1894 1f5c12ca-751b-0410-a591-d2e778427230
2008-10-05 16:37:42 +00:00
hieuhoang1972
1ea1d4f9b1 always with unix line endings
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1893 1f5c12ca-751b-0410-a591-d2e778427230
2008-09-29 19:34:54 +00:00
hieuhoang1972
68a2461cb3 vs build
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1892 1f5c12ca-751b-0410-a591-d2e778427230
2008-09-24 17:04:29 +00:00
redpony
a81390c5bb fix a couple names
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1891 1f5c12ca-751b-0410-a591-d2e778427230
2008-09-24 17:03:07 +00:00
redpony
232dc9889c enable moses to accept a file that lists feature name and weight pairs.
enable moses to export its search graph as a phrase lattice serialized in a Google protocol buffer. This requires protoc (http://code.google.com/p/protobuf/) and is disabled by default.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1890 1f5c12ca-751b-0410-a591-d2e778427230
2008-09-24 16:48:23 +00:00
redpony
bb0ade93f7 a little refactoring in preparation for yet another way to export the search lattice.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1889 1f5c12ca-751b-0410-a591-d2e778427230
2008-09-23 19:39:56 +00:00
hieuhoang1972
8ba603c39e visual studio build
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1888 1f5c12ca-751b-0410-a591-d2e778427230
2008-09-14 01:31:02 +00:00
nicolabertoldi
9cbde412e2 support for creating binary Phrase Tables including word-to-word alignments
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1887 1f5c12ca-751b-0410-a591-d2e778427230
2008-09-12 18:19:41 +00:00
nicolabertoldi
dd6c36640b Support for printing out word-to-word alignments (besides phrase-to-phrase alignments)
as contained in the phrase table.
If PT contains word-to-word alignments between source and target phrases,
Moses can optionally output them in the nbest and in the log file (if verbose).
W2w alignments from source to target and from target to source can differ,
if they differ in the PT.

Detailed documentation will be added in the Moses webpages very soon.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1886 1f5c12ca-751b-0410-a591-d2e778427230
2008-09-12 18:09:06 +00:00
nicolabertoldi
e376f9f994 mv some Timer functions into the .cpp file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1885 1f5c12ca-751b-0410-a591-d2e778427230
2008-09-12 15:31:46 +00:00
bhaddow
cd28f119c6 mert tests
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1881 1f5c12ca-751b-0410-a591-d2e778427230
2008-09-01 17:18:51 +00:00
jdschroeder
ea5ddd4d82 fixed nasty out-of-bounds array read in WordsBitmap, simplified (fixed?) lattice checks.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1879 1f5c12ca-751b-0410-a591-d2e778427230
2008-08-26 17:04:43 +00:00
jdschroeder
78534c1518 made all zcat calls through ZCAT variable.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1875 1f5c12ca-751b-0410-a591-d2e778427230
2008-08-15 16:15:51 +00:00
mfederico
842f5842a2 Integrated handling of oov penalty into irstlm library.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1874 1f5c12ca-751b-0410-a591-d2e778427230
2008-08-05 11:56:32 +00:00
mfederico
7a0e5811ba Fixed bug with dub option.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1873 1f5c12ca-751b-0410-a591-d2e778427230
2008-08-05 11:44:50 +00:00
hieuhoang1972
5dee5d04aa rename IOStream to IOWrapper.
move vs.net solution file to root folder

git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1872 1f5c12ca-751b-0410-a591-d2e778427230
2008-08-05 00:24:45 +00:00
maurocettolo
90e3107ef4 just commented a print on stderr
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1871 1f5c12ca-751b-0410-a591-d2e778427230
2008-08-04 16:55:21 +00:00
mfederico
9e61900fad Fixed bug concerning the handling of the OOV penalty with IRSTLM.
The penalty for out-of-vocabulary words is now specified
by the parameter

-lmodel-dub: dictionary upper bounds of language models

For instance, if you set lmodel-dub to 1000000 (1M) and your actual
vocabulary is, say, 200000 (200K), then the LM probability of the
OOV word class is divided by 800000 (800K), i.e. 1M-200K.

You have to make sure that lmodel-dub is always larger than the size of
the LM dictionary.



git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1870 1f5c12ca-751b-0410-a591-d2e778427230
2008-08-04 13:06:52 +00:00
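The lmodel-dub arithmetic described in commit 9e61900fad can be sketched as follows. This is only an illustration of the division stated in the commit message, not the IRSTLM or Moses code: the function name `per_word_oov_prob` and the OOV class probability of 0.01 are hypothetical, chosen for the example.

```python
import math


def per_word_oov_prob(oov_class_prob, dub, vocab_size):
    """Divide the OOV class probability by the number of words
    assumed to lie outside the LM dictionary (dub - vocab_size)."""
    if dub <= vocab_size:
        # The commit message requires lmodel-dub to exceed the dictionary size.
        raise ValueError("lmodel-dub must be larger than the LM dictionary size")
    return oov_class_prob / (dub - vocab_size)


# Values from the commit message: dub = 1M, vocabulary = 200K.
# The class probability 0.01 is an assumed value for illustration only.
p = per_word_oov_prob(0.01, dub=1_000_000, vocab_size=200_000)
print(p)              # per-word OOV probability
print(math.log10(p))  # the same value as a log10 score
```

With these numbers the class probability is divided by 800000, matching the 1M-200K example in the message.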
jdschroeder
7a2ebedc20 minor bugfixes and error checking
- added -rootdir option to enhanced-mert
- fixed float regex in score-nbest.py and mert-moses.pl
- allow for extra weights in constructing ini in mert-moses.pl
- additional NFS bug checks in mert-moses.pl



git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1869 1f5c12ca-751b-0410-a591-d2e778427230
2008-08-01 10:24:01 +00:00
hieuhoang1972
dd9691d28c abort on any attempt to get a substring of a confusion network. returning an empty string just screws things up
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1868 1f5c12ca-751b-0410-a591-d2e778427230
2008-07-30 16:29:51 +00:00
hieuhoang1972
03bd63e312 get rid of md5
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1867 1f5c12ca-751b-0410-a591-d2e778427230
2008-07-28 10:10:07 +00:00
bojar
6a087d59c4 removed SCRIPTS_ROOTDIR from this 'my' declaration, it was obscuring previous
declaration!
lines wrapped


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1865 1f5c12ca-751b-0410-a591-d2e778427230
2008-07-14 16:24:18 +00:00
bojar
2afe9e0357 avoid coredump files in parallel moses (usually just kills NFS for a while),
debug on a smaller scale, if needed


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1864 1f5c12ca-751b-0410-a591-d2e778427230
2008-07-14 13:58:42 +00:00
hieuhoang1972
620b0c34cc abort if internal LM asked to do more than trigram
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1863 1f5c12ca-751b-0410-a591-d2e778427230
2008-07-11 14:55:06 +00:00
bojar
c20e682f18 Avoid NFS race condition:
explicitly remove old cmert output files (relying on them being correctly
  replaced by a 'mv' in the shell script submitted to SGE by qsubwrapper
  occasionally reveals a race condition in NFS => weights seem unchanged =>
  mert finishes too early)


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1862 1f5c12ca-751b-0410-a591-d2e778427230
2008-07-10 11:47:55 +00:00
saintamh
9d106392e6 Bugfix
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1861 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-30 09:18:14 +00:00
bhaddow
83f234cf17 Implementation of Cer et al. MERT regularisation. Use with an argument
such as --scconfig regtype:min,regwin:3 in extractor and mert. Only tested
on a toy example so far.


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1860 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-24 19:27:18 +00:00
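A rough sketch of what a setting like `regtype:min,regwin:3` from commit 83f234cf17 might do. Everything here is an assumption made for illustration: the helper names `parse_scconfig` and `smooth_min` are hypothetical, the window is assumed to extend `regwin` positions to each side, and the actual extractor and mert code may differ.

```python
def parse_scconfig(text):
    """Split a 'key:value,key:value' string into a dict of strings."""
    config = {}
    for item in text.split(","):
        key, sep, value = item.partition(":")
        if not sep:
            raise ValueError(f"expected key:value, got {item!r}")
        config[key.strip()] = value.strip()
    return config


def smooth_min(scores, window):
    """Replace each score by the minimum over its neighbours within
    `window` positions on either side (assumed window semantics)."""
    n = len(scores)
    return [
        min(scores[max(0, i - window): min(n, i + window + 1)])
        for i in range(n)
    ]


config = parse_scconfig("regtype:min,regwin:3")
if config["regtype"] == "min":
    # Hypothetical scores along one search direction.
    raw = [0.21, 0.25, 0.30, 0.12, 0.28, 0.29, 0.31]
    print(smooth_min(raw, int(config["regwin"])))
```

Taking the minimum over a window penalises narrow peaks in the score, which is the intuition behind this style of regularisation.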
hieuhoang1972
6ddde13dca fixed constraint format bug
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1859 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-20 20:24:16 +00:00
dowobeha
3e1c6c39ff Fixed constraint decoding - there may be a bug in Util::Tokenize
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1858 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-20 15:45:06 +00:00
hieuhoang1972
81c7e5118b must provide line no to constraint file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1857 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-19 22:47:54 +00:00
dowobeha
67c8bdd328 Constraint decoding works, but not for cube pruning.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1856 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-19 22:40:51 +00:00
hieuhoang1972
52c2843e6c perl regexpr bug, submitted by German Sanchis Trilles
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1855 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-19 21:57:29 +00:00
dowobeha
7a4b1fb699 Added preliminary code for constraint decoding
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1854 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-18 23:14:09 +00:00
redpony
02067b2be0 fix some nasty edge cases in lattice decoding that arise when decoding lattices with complex topologies. also fix lattice regression tests. work done jointly with j schroeder.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1853 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-17 17:32:19 +00:00
chardmeier
1f38032a66 Fixed n-best list generation with cube pruning.
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1852 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-14 10:57:33 +00:00
hieuhoang1972
4f84e3ccfe don't run 2 tests which don't pass
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1850 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-11 13:42:23 +00:00
hieuhoang1972
47e4803f97 move cube pruning moses lib to trunk
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1849 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-11 13:22:26 +00:00
hieuhoang1972
4f642808f1 move cube pruning moses lib to trunk
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1848 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-11 10:52:57 +00:00
hieuhoang1972
6615fe0302 delete old moses lib
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1847 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-11 10:52:01 +00:00
bhaddow
4195b70247 First cut of new mert outer loop
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1842 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-10 09:07:20 +00:00
nicolabertoldi
e94834012d added facilities to read and write score statistics in binary format
moved facilities for feature names in FeatureData object


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1824 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-05 17:03:54 +00:00
nicolabertoldi
8e96e68476 overall change of a variable name: array_ instead of array2_
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1823 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-05 11:42:00 +00:00
nicolabertoldi
930e67c5e3 fixed another bug related to the handling of feature names
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1822 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-05 11:30:37 +00:00
nicolabertoldi
44d7e0e0f7 fixed a bug related to the handling of feature names
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1821 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-05 11:26:54 +00:00
bhaddow
37cf805139 Fix bug in output of scorestats to text file
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1820 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-05 11:08:29 +00:00
nicolabertoldi
281bf610b8 added binary read/load facility for feature data
added names of features in the header
added methods to access the features by name


git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1819 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-05 07:23:34 +00:00
hieuhoang1972
1b44c7c445 most popular alignment outputted, finally
git-svn-id: https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk@1818 1f5c12ca-751b-0410-a591-d2e778427230
2008-06-04 14:49:56 +00:00