| Author | Commit | Message | Date |
|---|---|---|---|
| Rico Sennrich | 2d5a3ecdbc | remove subword marker at end-of-line | 2017-04-07 15:13:26 +02:00 |
| Rico Sennrich | fb526f1b00 | rename --is-dict to --dict-input | 2017-02-27 15:57:11 +00:00 |
| Martin Boyanov | f37902dec6 | Allow passing in a word-count file instead of iterating through the whole dataset | 2017-02-25 14:17:56 +02:00 |
| Rico Sennrich | 4c54e1df2e | make max deterministic by using symbol pair as secondary sort key | 2017-02-22 13:58:21 +00:00 |
| Rico Sennrich | 669255833f | acknowledgements | 2017-02-20 10:54:15 +00:00 |
| Rico Sennrich | 9f23b0171a | consistently use UTF-8 across python versions and environment variables | 2017-02-10 11:11:45 +00:00 |
| Rico Sennrich | 6a953fd54a | Merge branch 'unicode' | 2017-02-10 11:07:08 +00:00 |
| Rico Sennrich | c82604aa57 | consistent, cross-version unicode handling | 2017-01-10 14:52:42 +00:00 |
| Rico Sennrich | 269f18593e | Merge pull request #10 from rmeertens/master; using python3 print function | 2016-11-08 09:09:06 +00:00 |
| roland | d796a78a17 | using python3 print function | 2016-11-08 10:00:31 +01:00 |
| Rico Sennrich | e68bd0582f | frequency threshold for learning; Closes #8 | 2016-10-17 16:36:33 +01:00 |
| Rico Sennrich | ec5c7b009c | comments/whitespace | 2016-09-06 14:09:41 +01:00 |
| aagohary | 4c3a3b3176 | fixed the encoding issue with applying bpe with non-utf8 locale | 2016-09-06 13:43:32 +01:00 |
| Rico Sennrich | 3004836285 | update reference | 2016-06-01 14:49:14 +01:00 |
| Rico Sennrich | 5d2d3758ad | break condition for toy example | 2016-03-03 16:39:34 +00:00 |
| Rico Sennrich | d0c78f57c8 | add toy implementation of BPE as documentation | 2016-03-03 11:17:36 +00:00 |
| Rico Sennrich | 962c445819 | x2 speedup on Python 2.X (use PyPy for best speed) | 2016-02-15 10:44:28 +00:00 |
| Rico Sennrich | 3380810b2e | escape backslashes in replacement. fixes #5. | 2016-01-29 10:53:31 +00:00 |
| Rico Sennrich | e4c38f2f30 | Merge pull request #4 from He-Ro/fix-short-references; Fixes #3 | 2016-01-29 10:02:28 +00:00 |
| Hendrik Rosendahl | 19634c38c2 | Fixes #3; Add test before division to check if total reference count is greater than zero | 2016-01-29 10:23:00 +01:00 |
| Rico Sennrich | 8cb41a4c39 | command line option for verbosity | 2015-12-07 11:25:57 +00:00 |
| Rico Sennrich | 84e7411928 | implementation of chrF (for evaluation) | 2015-11-27 14:48:15 +00:00 |
| Rico Sennrich | d822ce6744 | Merge pull request #2 from kyunghyuncho/master; strip line before writing | 2015-11-26 09:42:59 +00:00 |
| Kyunghyun Cho | 54e52cf3e3 | removed debug | 2015-11-25 19:01:39 -05:00 |
| Kyunghyun Cho | 81947f9907 | correct strip | 2015-11-25 19:01:04 -05:00 |
| Kyunghyun Cho | b1e99d9829 | use strip | 2015-11-25 18:43:44 -05:00 |
| Rico Sennrich | 3028cc660d | allow use of apply_bpe as a class | 2015-11-24 12:14:25 +00:00 |
| Rico Sennrich | 15f43f2afe | fix rare problem with pruned statistics; if the pruned stats are empty, we need to go back to full statistics | 2015-10-29 16:45:00 +00:00 |
| Rico Sennrich | 83b1847647 | initial commit | 2015-09-01 11:48:49 +01:00 |