10 Commits

Author SHA1 Message Date
Matt
09de59f026 do not store cblock, etc. tags into tagdb to save
disk space. added tagdb file cache for better performance,
less disk accesses. will help reduce disk load.
put file cache sizes in master controls and if they change
then update the cache size dynamically.
2015-09-10 12:46:00 -06:00
Matt Wells
a5a9820441 ignore tagdb tag rec bad recsize core.
do not save conf if crawlbot and not host id 0 and cored in
mem function, otherwise it just hangs and gb can't restart.
2015-08-23 09:40:11 -07:00
Matt
cad1d3d076 added support for sitelinks.txt file
2015-01-31 15:18:06 -07:00
mwells
87285ba3cd use gbmemcpy not memcpy so we can get profiler working again
since memcpy can't be interrupted and backtrace() called.
2015-01-13 12:25:42 -07:00
Matt
96b8197ad3 now it compiles with -m32
2014-11-10 14:45:11 -08:00
Matt Wells
e7dd8f7956 replace long long with int64_t
2014-10-30 13:36:39 -06:00
Matt Wells
27e8e810d2 use collnum instead of coll string.
more stable since resetting collections
keeps string the same but changes the collnum.
2014-03-06 15:48:11 -08:00
Matt Wells
7cd746f567 fix msge0 msg0 overload in sockets table
when all diffbot replies timed out at once
and released thousands of spiders.
2014-01-22 20:34:55 -08:00
mwells
82494baa89 move CollectionRec stuff into Collectiondb files
for simplicity.
2013-12-10 15:28:04 -08:00
Matt Wells
f6e560c1f4 Initial file population.
2013-08-02 13:12:24 -07:00