Commit Graph

11 Commits

Author SHA1 Message Date
Dmitry Smirnov
b1ace63607 codespell: spelling corrections 2021-05-06 01:52:55 +10:00
Matt
a1ed368d82 bring back max mem control into master controls.
it's useful to limit per process mem usage to prevent
oom killer because we can't save if we get killed.
overhaul diskpagecache to just use rdbcache. much simpler
and faster, but disabled for now until debugged more.
reduce min files to merge for crawlbot collections so
they stay more tightly merged to conserve fds and mem.
improved logDebugDisk msgs.
overhauled File.cpp fd pool. now it is way faster and
doesn't use any extra mem. much simpler too. although
could be sped up a little by using a linked list, but
probably is not significant enough to warrant doing right now.
increase mem ptr table from 3M to 8M slots. should really make
dynamic though. fix core from null msg20s[0]->m_r.
only call attemptMergeAll once every 60 seconds really.
do not attempt merge if already merging.
2015-08-14 12:58:54 -06:00
mwells
87285ba3cd use gbmemcpy not memcpy so we can get profiler working again
since memcpy can't be interrupted and backtrace() called.
2015-01-13 12:25:42 -07:00
Matt
4e8a42e024 text replacements for bad int32_t substitutions 2014-11-17 18:24:38 -08:00
Matt
96b8197ad3 now it compiles with -m32 2014-11-10 14:45:11 -08:00
Matt Wells
e7dd8f7956 replace long long with int64_t 2014-10-30 13:36:39 -06:00
Matt Wells
bc78b21dc6 for json docs only give them a single
xmlnode in the Xml.cpp class. hopefully
will not get "malformed sections" error
anymore. i think that was a result of the
json having html tags in it and making
unnested html structures which the
sections class did not like.
TODO: probably do this for CT_TEXT etc.
as well.
2014-01-25 08:17:38 -08:00
mwells
7d3cc672c8 use ./gb blaster -u <fileofurls> to just inject urls,
but use -i to also add the outlinks to spiderdb.
2013-08-19 16:33:27 -06:00
mwells
95a020574c set spiderlinks=1 when doing
./gb blaster -i <fileofurls> to
index/inject a file of urls so that
we add the outlinks to spiderdb. this will
slow things down a little since we will have
to do a dns lookup of the subdomain of each
outlink, unless it is cached.
2013-08-19 16:15:58 -06:00
mwells
2c83b96ba4 Added support for 'gb blaster -i <fileofurls> <maxThreads>' to
inject/index a file of urls. Committing older work for
compare.html that shows differences between gigablast and solr,
but has a lot of blanks.
2013-08-19 13:26:46 -06:00
Matt Wells
f6e560c1f4 Initial file population. 2013-08-02 13:12:24 -07:00