Commit Graph

32 Commits

Author SHA1 Message Date
Zak Betz
7b507a70ef Set value length to 0 for something that does not return a string value
in Json.cpp.
Fix the '-' -> '_' when indexing generic fields.
Add a StackBuf macro, which is a SafeBuf initialized with a small
stack buffer for use in a local scope.
2015-06-30 14:09:57 -06:00
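The '-' -> '_' fix in the commit above concerns how generic field names are rewritten at index time. The actual change is not shown in this log, so the following is only a minimal sketch of that kind of normalization. The function name `normalizeFieldName` is hypothetical and not taken from the repository.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper (not the code from the commit): returns a copy of a
// field name with every '-' replaced by '_', so that "og-title" and
// "og_title" end up under the same indexed name.
std::string normalizeFieldName(std::string name) {
    std::replace(name.begin(), name.end(), '-', '_');
    return name;
}
```

For example, `normalizeFieldName("og-title")` yields `og_title`, while a name without hyphens is returned unchanged.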
Matt
497131d359 fix gbssdocid bug better 2015-04-13 14:33:57 -06:00
Matt
3e5218c54c fix gbssDocId:123456789, et al, query. will only work for docs indexed
after applying this fix.
2015-04-13 14:13:16 -06:00
Matt
9f836dbf75 fix corruption of s_vbuf (gb version) in the hosts table. 2015-04-13 11:13:44 -06:00
Matt Wells
64bae224e0 fix core on the GI 2015-04-08 16:05:32 -06:00
Matt Wells
e346a14a47 added logic to retry diffbot reply on connection reset,
connection timed out or gateway timed out (http status 504)
msgs.  added logic to detect truncated json (missing final })
and not print it. also, at index time, we set a diffbot missing
curly error to g_errno so the whole url can be retried later.
2015-03-09 20:54:34 -07:00
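The two checks described in the commit above (retry on transient failures, and detect json that is missing its final curly brace) might look roughly like the sketch below. Both function names and their parameters are assumptions made for illustration. The real logic in the repository may differ, for instance in how it handles trailing whitespace.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical check: treat a reply as truncated when its last
// non-whitespace character is not '}'. An empty reply also counts
// as truncated.
bool looksTruncated(const std::string& json) {
    std::size_t end = json.size();
    while (end > 0 && std::isspace(static_cast<unsigned char>(json[end - 1]))) {
        --end;
    }
    return end == 0 || json[end - 1] != '}';
}

// Hypothetical retry decision: retry on connection reset, connection
// timed out, or gateway timed out (http status 504).
bool shouldRetry(int httpStatus, bool connectionReset, bool connectionTimedOut) {
    return connectionReset || connectionTimedOut || httpStatus == 504;
}
```

A caller could skip printing any reply for which `looksTruncated` returns true and queue the url for a later attempt.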
Matt Wells
79879976fa try to fix a couple cores. one when parsing
bad json. the other in reclaiming doledb tree mem.
2015-03-08 08:56:10 -07:00
Matt
adcef39376 Merge branch 'diffbot-testing' into diffbot-matt
Conflicts:
	Collectiondb.cpp
	Collectiondb.h
	Conf.cpp
	Conf.h
	Msg39.cpp
	PageEvents.cpp
	PageResults.cpp
	PageTurk.cpp
	Pages.cpp
	Parms.cpp
	Posdb.cpp
	Proxy.cpp
	Query.cpp
	Query.h
	RdbBase.cpp
	RdbMap.cpp
	Repair.cpp
	Repair.h
	SafeBuf.cpp
	Spider.cpp
	Tagdb.cpp
	TopTree.cpp
	XmlDoc.cpp
	main.cpp
2014-11-20 16:53:07 -08:00
Matt
4c19453ea9 working with -m32 for basic testing.
compiles for 64-bit.
2014-11-12 11:38:37 -08:00
Matt
96b8197ad3 now it compiles with -m32 2014-11-10 14:45:11 -08:00
Matt Wells
cc9dfc6e45 parser was not capturing negative sign so
gbmin: and gbmax: and gbminint: etc. were not
working for negative numbers. should work now.
2014-10-31 13:13:27 -07:00
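The bug described in the commit above is a number parser that skips the leading '-'. A minimal sketch of a parser that does capture the sign is given below. `parseSignedInt` is a hypothetical name, and overflow handling is left out to keep the example short.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not the repository's parser): parses an optional
// '-' followed by one or more digits from the start of s. Returns false
// if no digits are found. Overflow is not checked in this sketch.
bool parseSignedInt(const std::string& s, long long& out) {
    std::size_t i = 0;
    bool negative = false;
    if (i < s.size() && s[i] == '-') {
        negative = true;  // capture the sign instead of skipping it
        ++i;
    }
    if (i >= s.size() || !std::isdigit(static_cast<unsigned char>(s[i]))) {
        return false;
    }
    long long value = 0;
    for (; i < s.size() && std::isdigit(static_cast<unsigned char>(s[i])); ++i) {
        value = value * 10 + (s[i] - '0');
    }
    out = negative ? -value : value;
    return true;
}
```

With the sign captured, the numeric part of a term such as `gbmin:` followed by `-42` parses to -42 instead of 42 or a parse failure.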
mwells
85c41b2211 multiple core fixes 2014-09-22 07:07:40 -07:00
Matt Wells
5980d48471 fixed json parser bug 2014-09-11 13:40:07 -07:00
mwells
d74df80f50 fix json parser core 2014-09-10 06:58:18 -07:00
Matt Wells
406fec356a fix json parser core from bad json. 2014-06-16 06:56:16 -07:00
Matt Wells
d2cc117d82 fix oops 2014-05-16 18:47:52 -07:00
Matt Wells
526be98ec8 fix core scenario when diffbot reply that was injected
using &diffbotreply= contains the http mime.
2014-05-16 18:46:39 -07:00
Matt Wells
c5ae5ca4b5 v3 support for tokenized diffbot replies
using the "objects" array in the json.
2014-05-12 16:13:24 -07:00
Matt Wells
402377d2e6 fix bug of gbmin, gbmax etc. not working.
floats were being rounded down to ints
in most cases it seems. so .9 -> 0 etc.
2014-03-26 11:56:06 -07:00
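The ".9 -> 0" symptom in the commit above is what happens when a float is converted to an int before it is stored or compared. The sketch below contrasts the two paths. Both function names are hypothetical and say nothing about where the conversion happened in the actual code.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical lossy path: the value is parsed as a double and then cast
// to int, which drops the fractional part (truncation toward zero).
int parseAsInt(const std::string& s) {
    return static_cast<int>(std::strtod(s.c_str(), nullptr));
}

// Hypothetical fixed path: the value stays a double, so the fractional
// part is kept for range comparisons.
double parseAsFloat(const std::string& s) {
    return std::strtod(s.c_str(), nullptr);
}
```

Here `parseAsInt(".9")` gives 0, while `parseAsFloat(".9")` keeps the value near 0.9, so a range filter such as gbmin would compare against the intended number.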
Matt Wells
9c26b85c2f fixed contenthash32 logic for json objects.
fixed hashing of numbers/bools for json objects.
added m_dupCache to reduce spiderrequests added to spiderdb.
do not add urls to waitingtree if ufn is obviously filtered/banned.
do not spider spiderrequest from doledb if maxoutperip would
be violated.
2014-02-05 13:22:03 -08:00
Matt Wells
4e803210ee tons of changes from live github on neo.
lots of core fixes.
took out ppthtml powerpoint convert, it hangs.
dynamic rdbmap to save memory per coll.
fixed disk page cache logic and brought it
back.
2014-01-17 21:01:43 -08:00
mwells
76bb3d05e1 clean up logging so i can see what's going on 2013-12-10 16:41:30 -08:00
Matt Wells
e0a15194e1 fix json double decoding issue. no more
partial decodes, json parser stores
fully decoded string into separate buf.
2013-11-22 14:16:14 -08:00
Matt Wells
35d22bd9aa fix json parser 2013-11-19 09:44:42 -08:00
Matt Wells
6495dfd86e try to fix json parser overflow error. needs
testing. tried to stop the round num from incrementing
for small jobs, which i think is due to server overload.
should be fixed properly at some point. for now just made the
wait time 30 secs instead of 10 in Spider.cpp.
2013-11-15 11:30:16 -08:00
Matt Wells
fbcd6b8afd display json objects that are not in arrays
in csv. show csv header. how to deal
with heterogeneous object lists?
index spiderdate: for gbsortby:spiderdate.
added gbrevsortby: support.
2013-11-12 13:51:52 -08:00
Matt Wells
a288217e9f a few bug fixes 2013-10-17 18:59:00 -07:00
Matt Wells
f5e5b0f5d3 fix crawlbot bugs 2013-10-16 12:12:22 -07:00
mwells
a562c65627 another code checkpoint. new json api
for crawlbot. new url filters for crawlbot.
2013-10-14 16:10:48 -06:00
mwells
5a7d70f7b2 code checkpoint 2013-10-14 13:00:05 -06:00
mwells
0de777d80d parser fixes 2013-10-11 17:35:12 -06:00
mwells
6d5643e185 json parsing 2013-10-11 16:14:26 -06:00