Commit Graph

18 Commits

Author SHA1 Message Date
Matt
a1ed368d82 bring back max mem control into master controls.
it's useful to limit per process mem usage to prevent
oom killer because we can't save if we get killed.
overhaul diskpagecache to just use rdbcache. much simpler
and faster, but disabled for now until debugged more.
reduce min files to merge for crawlbot collections so
they stay more tightly merged to conserve fds and mem.
improved logDebugDisk msgs.
overhauled File.cpp fd pool. now it is way faster and
doesn't use any extra mem. much simpler too. although
could be sped up a little by using a linked list, but
probably is not significant enough to warrant doing right now.
increase mem ptr table from 3M to 8M slots. should really make
dynamic though. fix core from null msg20s[0]->m_r.
only call attemptMergeAll once every 60 seconds really.
do not attempt merge if already merging.
2015-08-14 12:58:54 -06:00
Matt Wells
840ca3fea1 fix rdbmap reduce mem thing 2015-08-08 15:43:09 -07:00
Matt Wells
c2bf461d27 call reduceMemFootprint() after writing rdb map
to save mem immediately rather than on restart of gb
2015-08-08 11:23:14 -07:00
Matt Wells
0d1acb09bc try to fix tree if corruption detected when dumping to disk 2015-07-14 22:27:43 -06:00
Matt Wells
b8049aae58 added isfakeip url filter expression to help
speed up bulk jobs
2015-06-17 13:59:13 -07:00
Matt Wells
43130f3a8d exit if corruption detected at startup 2015-06-17 10:17:36 -07:00
Matt Wells
68d04b239f auto move dat/map files we can't regen map for
to trash subdir. later: try to repair them better.
2015-06-17 06:55:49 -07:00
Matt Wells
b8f1cf9298 added a quickpoll to spider.cpp.
reduce diffbot max spiders per ip from 7 to 1 to fix
collection starvation at least temporarily until the
proper fix is deployed.
2015-06-15 11:46:51 -07:00
mwells
87285ba3cd use gbmemcpy not memcpy so we can get profiler working again
since memcpy can't be interrupted and backtrace() called.
2015-01-13 12:25:42 -07:00
Matt
adcef39376 Merge branch 'diffbot-testing' into diffbot-matt
Conflicts:
	Collectiondb.cpp
	Collectiondb.h
	Conf.cpp
	Conf.h
	Msg39.cpp
	PageEvents.cpp
	PageResults.cpp
	PageTurk.cpp
	Pages.cpp
	Parms.cpp
	Posdb.cpp
	Proxy.cpp
	Query.cpp
	Query.h
	RdbBase.cpp
	RdbMap.cpp
	Repair.cpp
	Repair.h
	SafeBuf.cpp
	Spider.cpp
	Tagdb.cpp
	TopTree.cpp
	XmlDoc.cpp
	main.cpp
2014-11-20 16:53:07 -08:00
Matt
4e8a42e024 text replacements for bad int32_t substitutions 2014-11-17 18:24:38 -08:00
Matt
931a1c4bc6 good checkpoint. quite a few fixes. 2014-11-17 18:13:36 -08:00
Matt
96b8197ad3 now it compiles with -m32 2014-11-10 14:45:11 -08:00
Matt Wells
444ed14cde reduce mem usage in rdbmap. useful
for when there are thousands of tiny collections.
2014-11-07 08:49:08 -08:00
Matt Wells
e7dd8f7956 replace long long with int64_t 2014-10-30 13:36:39 -06:00
mwells
ea72059bd5 fix core from keys out of order when dumping
when keys are equal. should fix a core i saw.
2014-09-18 12:33:56 -06:00
Matt Wells
4e803210ee tons of changes from live github on neo.
lots of core fixes.
took out ppthtml powerpoint convert, it hangs.
dynamic rdbmap to save memory per coll.
fixed disk page cache logic and brought it
back.
2014-01-17 21:01:43 -08:00
Matt Wells
f6e560c1f4 Initial file population. 2013-08-02 13:12:24 -07:00