Commit Graph

16 Commits

Author SHA1 Message Date
Matt
09de59f026 do not store cblock, etc. tags into tagdb to save
disk space. added tagdb file cache for better performance,
fewer disk accesses. will help reduce disk load.
put file cache sizes in master controls and if they change
then update the cache size dynamically.
2015-09-10 12:46:00 -06:00
Matt
83be5d7d46 fix links parser so it harvests outlinks from rss feeds'
<link> tags. it was doing this before, now it is doing it again.
2015-03-12 17:35:47 -07:00
Matt
c6fd5571d2 if "links":[ is specified in diffbot reply then crawlbot
will parse out those links as if they were on the page.
2015-03-10 14:36:44 -07:00
mwells
3c92fd6916 fix Inlink accessor function core.
don't use off_urlBuf, etc. any more; just
use size_urlBuf, etc. now for better backwards
compatibility.
2014-12-02 10:26:10 -07:00
Matt Wells
320eb66237 fixed time_t in LinkInfo class for 64 bit conversion 2014-11-27 20:55:18 -08:00
Matt
c5989f4c4c fix new simplified Inlinks code some more 2014-11-18 17:10:48 -08:00
Matt
2977845375 simplify Inlinks class in LinkInfo.cpp.
fix some more 64-bit related cores.
2014-11-18 16:50:31 -08:00
Matt
931a1c4bc6 good checkpoint. quite a few fixes. 2014-11-17 18:13:36 -08:00
Matt
96b8197ad3 now it compiles with -m32 2014-11-10 14:45:11 -08:00
Matt Wells
e7dd8f7956 replace long long with int64_t 2014-10-30 13:36:39 -06:00
Matt Wells
b13f3d24d7 replaced unsigned long long with uint64_t 2014-10-30 13:30:39 -06:00
Matt Wells
94a55bf9a6 fixes for new link info code so it doesn't
bottleneck. got EFENCE_SIZE working so we
can use efence on large allocs only so we don't
go oom using it. might help finding some of
the out of bounds writing going on.
2014-02-25 10:55:05 -08:00
Matt Wells
72f1312652 new linkdb code compiling. 2014-02-20 17:27:28 -08:00
Matt Wells
9820f14066 checkpoint 2014-02-20 14:54:21 -08:00
Matt Wells
74c2742ced fix mem leak of LinkInfo.
fixed json output from injecting url.
2013-10-16 17:17:28 -07:00
Matt Wells
f6e560c1f4 Initial file population. 2013-08-02 13:12:24 -07:00