Matt Wells
9c26b85c2f
fixed contenthash32 logic for json objects.
fixed hashing of numbers/bools for json objects.
added m_dupCache to reduce spiderrequests added to spiderdb.
do not add urls to waitingtree if ufn is obviously filtered/banned.
do not spider spiderrequest from doledb if maxoutperip would
be violated.
2014-02-05 13:22:03 -08:00
Matt Wells
e0a15194e1
fix json double decoding issue. no more
partial decodes, json parser stores
fully decoded string into separate buf.
2013-11-22 14:16:14 -08:00
Matt Wells
fbcd6b8afd
display json objects that are not in arrays
in csv. show csv header. how to deal
with heterogeneous object lists?
index spiderdate: for gbsortby:spiderdate.
added gbrevsortby: support.
2013-11-12 13:51:52 -08:00
Matt Wells
a288217e9f
a few bug fixes
2013-10-17 18:59:00 -07:00
Matt Wells
9d6c3626d8
json indexing/hashing updates.
2013-10-16 15:41:12 -07:00
mwells
a562c65627
another code checkpoint. new json api
for crawlbot. new url filters for crawlbot.
2013-10-14 16:10:48 -06:00
mwells
0de777d80d
parser fixes
2013-10-11 17:35:12 -06:00
mwells
6d5643e185
json parsing
2013-10-11 16:14:26 -06:00