167 Commits

Author SHA1 Message Date
Hieu Hoang
fb233b50e0 init moses2 2015-10-25 11:47:47 +00:00
Hieu Hoang
a71cfeb3db init moses2 2015-10-24 02:14:52 +01:00
Hieu Hoang
013a2b092c allocate size of class template fn 2015-10-23 21:13:57 +01:00
Hieu Hoang
29295daf9c init moses2 2015-10-23 20:53:36 +01:00
Hieu Hoang
c80fdfde22 unit test error. Change in StringStream constructor 2015-10-16 19:03:22 +00:00
Hieu Hoang
dd2bf5d9f3 make util::StringStream more like std::stringstream 2015-10-16 19:03:22 +00:00
Hieu Hoang
b6231e8c73 make util::StringStream more like std::stringstream 2015-10-16 19:03:22 +00:00
Hieu Hoang
dcea021cd7 use util::StringStream 2015-10-16 19:03:22 +00:00
Kenneth Heafield
758b270447 Change integer handling to coercion by default 2015-10-02 18:32:16 +01:00
Kenneth Heafield
9e2be5fdd6 KenLM 31a3b75bc87f0b3160f15adfc54d2fde529f341a trying to fix some stringstream stuff 2015-10-02 18:14:19 +01:00
Hieu Hoang
fe7a817519 comparison error for pointers on OSX clang
eg. check boost::lexical_cast<std::string>(value) == result has failed [0x0 != 0]

Added operator<<(long value) for FakeOStream. Compile error on clang
2015-10-01 16:28:41 +01:00
Kenneth Heafield
ebf31f3e8d Missing ifdef / Matt Post 2015-09-30 17:52:21 +01:00
Kenneth Heafield
ea8e19f286 KenLM a590a3a4dadf516a1cff28c8f1c06aa89766f519 including StringStream
TODO: kill istream
2015-09-29 16:58:02 +01:00
Ulrich Germann
58d9d88a9f Merge branch 'master' of http://github.com/moses-smt/mosesdecoder 2015-09-14 09:58:30 +01:00
Kenneth Heafield
5732129bd4 Update kenlm 2015-09-10 16:04:09 +01:00
Ulrich Germann
8c0555346f Missing #include. 2015-09-04 00:00:53 +01:00
Kenneth Heafield
09ecd071f9 KenLM 2a3e8fae3633c890cb3b342d461f9130c8e343fa excluding unfinished interpolation directory 2015-08-27 10:55:52 +01:00
Ulrich Germann
515862ee1c Reformatting for readability. 2015-07-02 01:31:11 +01:00
Jeroen Vermeulen
924710f53e On MinGW use Windows _chsize_t, not ftruncate.
This works around a problem when building against MinGW and then running
the resulting Windows binary on WINE.  (Perverse, I know.)  For some
reason the ftruncate() to 0 bytes succeeds, but the subsequent one to a
larger size fails.  Even if the size is just 1 byte.

This happened where GenericModel::InitializeFromARPA called
BinaryFormat::SetupJustVocab, which called MapZeroedWrite, which calls
ResizeOrThrow twice; the second one failed.
2015-06-12 15:11:57 +07:00
Kenneth Heafield
0d54286d3f Require __SSE2__ for i386 to use SSE2 2015-06-11 14:43:10 -04:00
Ulrich Germann
044bfd0b1a util/probing_hash_table_benchmark_main.cc wouldn't compile with boost v.1.46.1. 2015-05-20 23:46:01 +01:00
Jeroen Vermeulen
ef5a17b2f9 Fix some new compile problems.
* file_piece.cc used isnan() instead of std::isnan().
* Fdstream.h used close() but Windows doesn't have unistd.h.

Fixed Fdstream.h by using util::scoped_fd.  Thanks Ken.
2015-05-20 11:40:11 +07:00
Kenneth Heafield
a70d37e46f KenLM 7408730be415db9b650560a8b2bd3e4e3af49ec9.
unistd.hh is dead.
2015-05-19 15:27:30 -04:00
Jeroen Vermeulen
eca5824100 Remove trailing whitespace in C++ files. 2015-04-30 12:05:11 +07:00
Ulrich Germann
4390bcdffd boost/thread/lock_guard.hpp not found with Boost v1.46. 2015-04-26 03:13:19 +01:00
Jeroen Vermeulen
38d790cac0 Add cross-platform randomizer module.
The code uses two mechanisms for generating random numbers: srand()/rand(),
which is not thread-safe, and srandom()/random(), which is POSIX-specific.

Here I add a util/random.cc module that centralizes these calls, and unifies
some common usage patterns.  If the implementation is not good enough, we can
now change it in a single place.

To keep things simple, this uses the portable srand()/rand() but protects them
with a lock to avoid concurrency problems.

The hard part was to keep the regression tests passing: they rely on fixed
sequences of random numbers, so a small code change could break them very
thoroughly.  util::rand(), for wide types like size_t, calls std::rand() not
once but twice.  This behaviour was generalized into util::wide_rand() and
friends.
2015-04-23 23:46:04 +07:00
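
A rough sketch of the "call std::rand() twice" idea for wide types. The name wide_rand_sketch and the way the two draws are combined are assumptions; the commit message only states that two calls are made, not how they are merged. Locking is left out here.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical: build one wide value from two consecutive std::rand() draws.
// The first draw supplies the high part, the second the low part.
// The combining formula is an assumption, not taken from the commit.
inline std::size_t wide_rand_sketch() {
  const std::size_t high = static_cast<std::size_t>(std::rand());
  const std::size_t low = static_cast<std::size_t>(std::rand());
  // Unsigned arithmetic wraps on overflow, so this is well defined
  // even where size_t is narrow.
  return high * (static_cast<std::size_t>(RAND_MAX) + 1) + low;
}
```

Because each call consumes exactly two values from the generator, the sequence seen by later callers stays fixed for a given seed, which is the property the regression tests depend on.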
Jeroen Vermeulen
75bfb75882 Thread-safe, platform-agnostic randomizer.
Some places in mert use srandom()/random(), but these are POSIX-specific.
The standard alternative, srand()/rand(), is not thread-safe.  This module
wraps srand()/rand() in mutexes (very short-lived, so should not cost much)
so that it relies on just Boost and the C standard library, not on a Unix-like
environment.

This may reduce the width of the random numbers on some platforms: it goes
from "long int" to just "int".  If that is a problem, we may have to use
Boost's randomizer utilities, or eventually, the C++ ones.
2015-04-22 20:43:29 +07:00
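
The mutex-wrapping approach described above might look roughly like this. It is a sketch under assumptions: the names in namespace sketch are hypothetical, and std::mutex (C++11) stands in for the Boost mutex the commit refers to.

```cpp
#include <cstdlib>
#include <mutex>

namespace sketch {

// One lock shared by seeding and drawing, held only for the call itself.
std::mutex rand_lock;

// Hypothetical wrapper around std::srand().
void rand_init(unsigned int seed) {
  std::lock_guard<std::mutex> guard(rand_lock);
  std::srand(seed);
}

// Hypothetical wrapper around std::rand(); returns "int", which is the
// narrowing from "long int" that the commit message warns about.
int rand_int() {
  std::lock_guard<std::mutex> guard(rand_lock);
  return std::rand();
}

}  // namespace sketch
```

The lock serializes access to the generator's hidden state; it does not make the order of draws across threads deterministic.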
Jeroen Vermeulen
32722ab5b1 Support tokenize(const std::string &) as well.
Convenience wrapper: the actual function takes a const char[], but many of
the call sites want to pass a string and have to call its c_str() first.
2015-04-22 10:35:18 +07:00
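
A convenience overload of this kind can be sketched as below. The function name tokenize_sketch, its return type, and the space-splitting behaviour are all assumptions for illustration; only the pattern (a std::string overload forwarding to the const char[] version) comes from the commit message.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the underlying function: split on spaces.
std::vector<std::string> tokenize_sketch(const char input[]) {
  std::vector<std::string> tokens;
  const char *p = input;
  while (*p) {
    while (*p == ' ') ++p;            // skip separators
    const char *start = p;
    while (*p && *p != ' ') ++p;      // scan one token
    if (p != start) tokens.push_back(std::string(start, p));
  }
  return tokens;
}

// The convenience wrapper: callers pass a std::string directly
// instead of calling c_str() at every call site.
inline std::vector<std::string> tokenize_sketch(const std::string &input) {
  return tokenize_sketch(input.c_str());
}
```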
Jeroen Vermeulen
10a0a7b05a Add new files.
Oops.  Forgot these in my previous commit.  Sorry!
2015-04-22 10:18:02 +07:00
Jeroen Vermeulen
abfdb61bc9 Cross-platform tempfile implementation.
This makes temp_file and temp_dir work both on POSIX-like platforms and on
Windows.

It also fixes a bug where the temporary files/directories were created in
the current working directory, instead of in the system's standard
location for temporary files.  Unfortunately the Windows and POSIX code
diverge quite a bit on that point.
2015-04-18 00:21:18 +07:00
Jeroen Vermeulen
d56f317f2e New helper classes: temp_dir & temp_file.
I'm adding these because boost::filesystem::unique_path introduces
encoding issues: on Windows the path is in wchar_t, breaking use of
those strings in various places!  Encoding the strings is just too
much work.

It's still possible that the current temp_file implementation won't
build on Windows (it uses POSIX mkstemp() and close()) but that can
be fixed underneath the API.
2015-04-17 22:57:55 +07:00
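
A POSIX-only sketch of a temp_file-style helper built on mkstemp() and close(), as this commit message describes. The class name temp_file_sketch, the TMPDIR lookup with a /tmp fallback, and the file-name pattern are assumptions, not the actual implementation.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>
#include <unistd.h>

// Hypothetical RAII helper: creates an empty temporary file on
// construction and deletes it on destruction.
class temp_file_sketch {
 public:
  temp_file_sketch() {
    const char *dir = std::getenv("TMPDIR");  // assumption: honour TMPDIR
    const std::string pattern =
        std::string(dir ? dir : "/tmp") + "/sketch.XXXXXX";
    // mkstemp() rewrites the pattern in place, so it needs a mutable buffer.
    std::vector<char> buf(pattern.begin(), pattern.end());
    buf.push_back('\0');
    const int fd = mkstemp(&buf[0]);
    if (fd == -1) throw std::runtime_error("mkstemp failed");
    close(fd);
    path_ = &buf[0];
  }

  ~temp_file_sketch() { unlink(path_.c_str()); }

  const std::string &path() const { return path_; }

 private:
  // Non-copyable: two owners would both try to delete the file.
  temp_file_sketch(const temp_file_sketch &);
  temp_file_sketch &operator=(const temp_file_sketch &);

  std::string path_;
};
```

Keeping the path as a plain std::string is the point made in the message: callers never see a wchar_t path.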
Jeroen Vermeulen
1e3e445e3f Use cross-platform mmap() wrapper in CompactPT.
The MmapAllocator header made use of sys/mman.h and mmap(), which are
Unix-specific.  But util has a wrapper which also works on Windows.

This also fixes the error handling: when mmap() failed, the old code would
return an invalid (but non-NULL!) pointer — leading to a crash.  The wrapper
will throw an exception with a helpful error message.
2015-04-17 18:53:46 +07:00
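
The error-handling point can be illustrated with a small POSIX-only sketch. mmap() signals failure with MAP_FAILED, which is not NULL, so a NULL check alone misses it. The function name map_or_throw_sketch is hypothetical and this is not the util wrapper itself, which according to the message also covers Windows.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>
#include <sys/mman.h>

// Hypothetical wrapper: map memory or throw with the system error text.
void *map_or_throw_sketch(std::size_t size, int prot, int flags, int fd,
                          off_t offset) {
  void *ret = mmap(NULL, size, prot, flags, fd, offset);
  // Compare against MAP_FAILED, not NULL.
  if (ret == MAP_FAILED) {
    throw std::runtime_error(std::string("mmap failed: ") +
                             std::strerror(errno));
  }
  return ret;
}
```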
Jeroen Vermeulen
8a3ae2fd5c Portability and include fixes.
Add <cstdlib> include for srand()/rand(), and <unistd.h> for open() etc.
Include <unistd.h> on Windows if using MinGW.  Disable MeteorScorer on
Windows, since it doesn't have fork() and pipe().
2015-04-10 12:54:34 +07:00
Jeroen Vermeulen
88e90957a1 Modernize "C" includes in util.
This is one of those little chores in managing a long-lived C++
project: standard C headers like stdio.h and math.h now have their own
place in the C++ standard as cstdio and cmath respectively, and so on.  In this
branch the #include names are updated for the util/ subdirectory; more
branches to follow.

C++11 adds cstdint, but to support compilation with the previous
standard, that change is left for later.
2015-03-28 19:37:48 +07:00
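
For illustration, the include change amounts to the following. The function names are hypothetical and chosen only to exercise the two headers.

```cpp
#include <cmath>   // previously <math.h>
#include <cstdio>  // previously <stdio.h>

// With the C++ headers, the library names are reached through std::.
double hypotenuse_sketch(double a, double b) {
  return std::sqrt(a * a + b * b);
}

void print_hypotenuse_sketch(double a, double b) {
  std::printf("%.3f\n", hypotenuse_sketch(a, b));
}
```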
Kenneth Heafield
8b323abbca KenLM 240ea65a021574261a38d45eb68143f26ad177e5 2015-03-25 10:40:21 -04:00
Kenneth Heafield
769c19d10c KenLM a6d57501dcac95a31719a8628f6cbd288f6741e2 including Marcin's fixed pruning 2015-01-22 11:42:46 -05:00
Ulrich Germann
7aa4d5d8d5 Merge branch 'master' of https://github.com/moses-smt/mosesdecoder
Conflicts:
	moses-cmd/simulate-pe.cc
2014-11-20 17:55:51 +00:00
Kenneth Heafield
36da8d1e0c KenLM 370f97fa549f02e162a3a0f17bf3ad6cce2c3813 2014-10-08 08:42:53 -04:00
Ulrich Germann
a58c7ceb18 Fixed issues with ambiguity in typedef of uint64_t (conflict between boost typedef and stdint typedef). 2014-09-10 12:07:57 +02:00
Kenneth Heafield
62b476cd45 Fix fd leak noticed by Barry Haddow 2014-07-22 11:43:20 +08:00
Kenneth Heafield
c83c5a3ee6 D'oh, forgot to copy util 2014-07-19 06:54:01 +08:00
Kenneth Heafield
6d9173ba72 KenLM f81d02792087a837ea17e6ce2b33f9b7aaecca68 should fix segfault with ArrayTrie 2014-06-04 16:03:39 -07:00
Kenneth Heafield
dd03f9fb69 KenLM 5a7efd8fe1db88ee0a9f7e9479b24ac3ca348221 with Hieu's patch to exception.hh 2014-06-02 10:29:40 -07:00
Kenneth Heafield
8ae4d153c8 Consolidated, consistent rt target fixes single-threaded build 2014-03-11 11:29:40 -07:00
Kenneth Heafield
e1d8f5c2ae Do not compile pcqueue_test for single-threaded builds 2014-03-11 11:11:24 -07:00
Kenneth Heafield
29f02c597f Fix progress bar for compressed files 2014-01-30 15:55:25 -08:00
Kenneth Heafield
df8b179b7a Update read_compressed to support concatenated gzip 2014-01-30 09:03:01 -08:00
Kenneth Heafield
ffd62e994e Fix C++11 compilation error / Chris Dyer 2014-01-27 22:25:43 -08:00
Kenneth Heafield
14e02978fc KenLM 5cc905bc2d214efa7de2db56a9a672b749a95591
Avoid unspecified behavior of mmap when a file is resized reported by Christian Hardmeier
Fixes for Mavericks and a workaround for Boost's broken semaphore
Clean clang compile (of kenlm)

Merged some of 744376b3fb but also undid some of it because it was just masking a fundamental problem with pread rather than working around Windows limitations
2014-01-27 16:51:35 -08:00
Rico Sennrich
9e177cb472 SyntaxConstraintFeature (without any actual constraints; useful to build/output syntax tree from GHKM tree fragments) 2014-01-16 18:45:26 +00:00