Zak Betz
7b507a70ef
Set value length to 0 for something that does not return a string value
in Json.cpp.
Fix the '-' -> '_' when indexing generic fields.
Add a StackBuf macro which is a SafeBuf initialized with a small
stack buffer for use in a local scope.
2015-06-30 14:09:57 -06:00
Matt
497131d359
fix gbssdocid bug better
2015-04-13 14:33:57 -06:00
Matt
3e5218c54c
fix gbssDocId:123456789, et al, query. will only work for docs indexed
after applying this fix.
2015-04-13 14:13:16 -06:00
Matt
9f836dbf75
fix corruption of s_vbuf (gb version) in the hosts table.
2015-04-13 11:13:44 -06:00
Matt Wells
64bae224e0
fix core on the GI
2015-04-08 16:05:32 -06:00
Matt Wells
e346a14a47
added logic to retry diffbot reply on connection reset,
connection timed out or gateway timed out (http status 504)
msgs. added logic to detect truncated json (missing final })
and not print it. also, at index time, we set a diffbot missing
curly error to g_errno so the whole url can be retried later.
2015-03-09 20:54:34 -07:00
Matt Wells
79879976fa
try to fix a couple cores. one when parsing
bad json. the other in reclaiming doledb tree mem.
2015-03-08 08:56:10 -07:00
Matt
adcef39376
Merge branch 'diffbot-testing' into diffbot-matt
Conflicts:
Collectiondb.cpp
Collectiondb.h
Conf.cpp
Conf.h
Msg39.cpp
PageEvents.cpp
PageResults.cpp
PageTurk.cpp
Pages.cpp
Parms.cpp
Posdb.cpp
Proxy.cpp
Query.cpp
Query.h
RdbBase.cpp
RdbMap.cpp
Repair.cpp
Repair.h
SafeBuf.cpp
Spider.cpp
Tagdb.cpp
TopTree.cpp
XmlDoc.cpp
main.cpp
2014-11-20 16:53:07 -08:00
Matt
4c19453ea9
working with -m32 for basic testing.
compiles for 64-bit.
2014-11-12 11:38:37 -08:00
Matt
96b8197ad3
now it compiles with -m32
2014-11-10 14:45:11 -08:00
Matt Wells
cc9dfc6e45
parser was not capturing negative sign so
gbmin: and gbmax: and gbminint: etc. were not
working for negative numbers. should work now.
2014-10-31 13:13:27 -07:00
mwells
85c41b2211
multiple core fixes
2014-09-22 07:07:40 -07:00
Matt Wells
5980d48471
fixed json parser bug
2014-09-11 13:40:07 -07:00
mwells
d74df80f50
fix json parser core
2014-09-10 06:58:18 -07:00
Matt Wells
406fec356a
fix json parser core from bad json.
2014-06-16 06:56:16 -07:00
Matt Wells
d2cc117d82
fix oops
2014-05-16 18:47:52 -07:00
Matt Wells
526be98ec8
fix core scenario when diffbot reply that was injected
using &diffbotreply= contains the http mime.
2014-05-16 18:46:39 -07:00
Matt Wells
c5ae5ca4b5
v3 support for tokenized diffbot replies
using the "objects" array in the json.
2014-05-12 16:13:24 -07:00
Matt Wells
402377d2e6
fix bug of gbmin, gbmax etc. not working.
floats were being rounded down to ints
in most cases it seems. so .9 -> 0 etc.
2014-03-26 11:56:06 -07:00
Matt Wells
9c26b85c2f
fixed contenthash32 logic for json objects.
fixed hashing of numbers/bools for json objects.
added m_dupCache to reduce spiderrequests added to spiderdb.
do not add urls to waitingtree if ufn is obviously filtered/banned.
do not spider spiderrequest from doledb if maxoutperip would
be violated.
2014-02-05 13:22:03 -08:00
Matt Wells
4e803210ee
tons of changes from live github on neo.
lots of core fixes.
took out ppthtml powerpoint convert, it hangs.
dynamic rdbmap to save memory per coll.
fixed disk page cache logic and brought it
back.
2014-01-17 21:01:43 -08:00
mwells
76bb3d05e1
clean up logging so i can see what's going on
2013-12-10 16:41:30 -08:00
Matt Wells
e0a15194e1
fix json double decoding issue. no more
partial decodes, json parser stores
fully decoded string into separate buf.
2013-11-22 14:16:14 -08:00
Matt Wells
35d22bd9aa
fix json parser
2013-11-19 09:44:42 -08:00
Matt Wells
6495dfd86e
try to fix json parser overflow error. needs
testing. tried to fix round num from incrementing
for little job because i think server overload.
should be fixed right some time. just made wait
time 30 secs instead of 10 in Spider.cpp.
2013-11-15 11:30:16 -08:00
Matt Wells
fbcd6b8afd
display json objects that are not in arrays
in csv. show csv header. how to deal
with heterogeneous object lists?
index spiderdate: for gbsortby:spiderdate.
added gbrevsortby: support.
2013-11-12 13:51:52 -08:00
Matt Wells
a288217e9f
a few bug fixes
2013-10-17 18:59:00 -07:00
Matt Wells
f5e5b0f5d3
fix crawlbot bugs
2013-10-16 12:12:22 -07:00
mwells
a562c65627
another code checkpoint. new json api
for crawlbot. new url filters for crawlbot.
2013-10-14 16:10:48 -06:00
mwells
5a7d70f7b2
code checkpoint
2013-10-14 13:00:05 -06:00
mwells
0de777d80d
parser fixes
2013-10-11 17:35:12 -06:00
mwells
6d5643e185
json parsing
2013-10-11 16:14:26 -06:00