Commit Graph

6 Commits

Author SHA1 Message Date
mwells
f55d4d1230 merge diffbot-testing 2014-04-09 20:10:30 -07:00
Matt Wells
9cb99f7621 Merge branch 'diffbot' into diffbot-testing
Conflicts:
	Spider.cpp
2013-12-16 11:06:11 -08:00
Matt Wells
16e91375f4 bring in changes from live beta from ~/github.
limit spiders to 50, not 500, to prevent OOM.
resume killed merges that had num files shrunk even
if down to one file. show collnum in spider queue.
remove back-to-back whitespace, and make all space
a ' ' for getting the doc checksum for deduping.
2013-12-12 12:58:58 -08:00
Matt Wells
02bf6ab3cc new crawlbot api. not backwards compatible any more. 2013-09-17 10:25:54 -07:00
mwells
ca2a024d04 fixed up thread/spider log msgs.
fixed core from calling fprintf in
alarm signal missed quickpoll handler.
2013-08-29 21:15:42 -06:00
Matt Wells
f6e560c1f4 Initial file population. 2013-08-02 13:12:24 -07:00