mosesdecoder/scripts/generic/extract-factors.pl
Jeroen Vermeulen a25193cc5d Fix a lot of lint, mostly trailing whitespace.
This is lint reported by the new lint-checking functionality in beautify.py.
(We can change to a different lint checker if we have a better one, but it
would probably still flag these same problems.)

Lint checking can help a lot, but only if we get the lint under control.
2015-05-17 20:04:04 +07:00


#!/usr/bin/env perl
# $Id$
# extract-factors.pl: extract only the desired factors from a factored corpus
# usage: extract-factors.pl corpusfile factor-index factor-index ... > outfile
# Factor indices start at 0.
# Indices beyond the number of factors in a token are ignored.
# Factors are printed in corpus order, regardless of the order of the indices given.
use warnings;
use strict;
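my ($filename, @factors) = @ARGV;
die "usage: $0 corpusfile factor-index factor-index ... > outfile\n" unless defined $filename;
# Set of requested factor indices, for constant-time lookup.
my %indices = map { $_ => 1 } @factors;
# Three-argument open with a lexical filehandle, so odd characters in the filename are not read as a mode.
open(my $in, '<', $filename) or die "couldn't open '$filename' for read: $!\n";
while (my $line = <$in>)
{
    chomp $line;    # chomp, not chop: strip only a trailing newline, never a real character
    # For each whitespace-separated token, keep only the requested '|'-separated factors.
    # split(' ', ...) skips leading whitespace, so no empty first token is produced.
    print join(' ', map { my $i = 0; join('|', grep { $indices{$i++} } split(/\|/, $_)) } split(' ', $line)) . "\n";
}
close($in);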
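
The per-token filtering done by the one-line print statement above can be easier to follow when split into named steps. The following is a minimal sketch, not part of extract-factors.pl: the helper names select_factors and extract_line and the sample line are hypothetical, chosen only for illustration.

```perl
use warnings;
use strict;

# Hypothetical helper (not in extract-factors.pl): keep only those factors of
# one token whose 0-based positions appear in @$wanted.
sub select_factors {
    my ($token, $wanted) = @_;
    my %keep = map { $_ => 1 } @$wanted;
    my $i = 0;
    # grep walks the factors left to right, so corpus order is preserved.
    return join('|', grep { $keep{$i++} } split(/\|/, $token));
}

# Hypothetical helper: apply select_factors to every whitespace-separated token.
sub extract_line {
    my ($line, $wanted) = @_;
    return join(' ', map { select_factors($_, $wanted) } split(' ', $line));
}

# Made-up sample line with three factors per token: surface|lemma|POS.
my $sample = "houses|house|NNS are|be|VBP";
print extract_line($sample, [0, 2]), "\n";    # houses|NNS are|VBP
print extract_line($sample, [2, 0]), "\n";    # houses|NNS are|VBP (index order does not matter)
print extract_line($sample, [1, 7]), "\n";    # house be (index 7 is out of range and ignored)
```

The second call shows that the order of the indices has no effect, since the factors are filtered in place and never reordered. Emitting factors in the requested order would need a different approach, such as indexing into the split list directly.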