31 Commits

Author SHA1 Message Date
mwells
6434e5cc04 Merge branch 'testing' into diffbot-matt
Conflicts:
	Errno.cpp
	Errno.h
	Parms.h
2014-07-07 09:49:59 -07:00
mwells
5bae438169 get 'gb qa' working somewhat. need to have
more quick and robust smoketesting.
2014-07-04 17:15:22 -07:00
mwells
2ddd7d7366 finally got http tunnel logic working.
2014-07-01 16:28:15 -06:00
mwells
2f8b0694fd more http tunnel fixes
2014-07-01 15:43:20 -06:00
mwells
5de927f385 some fixes for http proxy tunnel
2014-07-01 15:18:18 -06:00
mwells
92799ef393 add support for tunnelling https fetch
through an http proxy using CONNECT
directive. needs more debugging.
2014-07-01 10:43:52 -06:00
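For readers unfamiliar with the CONNECT approach named in the commit above, here is a minimal C++ sketch of the two pieces involved: the request a client sends to the proxy, and the check on the proxy's status line before the TLS handshake starts. The function names `buildConnectRequest` and `tunnelEstablished` are hypothetical and are not taken from the gb source; only the CONNECT request format itself is standard HTTP.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the gb source): builds the request an
// HTTP client sends to a proxy to open a tunnel to host:port.
std::string buildConnectRequest(const std::string& host, int port) {
    const std::string target = host + ":" + std::to_string(port);
    std::string req;
    req += "CONNECT " + target + " HTTP/1.1\r\n";
    req += "Host: " + target + "\r\n";
    req += "\r\n";  // blank line ends the request headers
    return req;
}

// Hypothetical helper: returns true if the proxy's reply starts with a
// status line carrying a 2xx code, e.g. "HTTP/1.1 200 Connection established".
// After a 2xx reply the socket carries raw bytes, so TLS can begin.
bool tunnelEstablished(const std::string& reply) {
    if (reply.compare(0, 5, "HTTP/") != 0) return false;
    const std::size_t sp = reply.find(' ');
    if (sp == std::string::npos || sp + 3 >= reply.size()) return false;
    return reply[sp + 1] == '2' &&
           std::isdigit(static_cast<unsigned char>(reply[sp + 2])) &&
           std::isdigit(static_cast<unsigned char>(reply[sp + 3]));
}
```

A caller would write the request to the proxy socket, read until the end of the reply headers, and only then hand the socket to the SSL layer. Proxy authentication and partial reads are left out of this sketch.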
mwells
e6ba1d123b minor msg update
2014-06-21 07:50:35 -07:00
mwells
628fe2336f make code compile cleaner.
2014-06-07 14:11:12 -07:00
Matt Wells
f293045ace more help/documentation updates
2014-04-10 22:41:20 -07:00
mwells
5ee79a4c2f daemonize on ./gb 0 etc.
2014-04-06 15:57:38 -07:00
Matt Wells
cd6069e5a6 send single space to socket if not streaming
and search results still not ready after 10 seconds.
send it every 10 seconds to prevent client from closing socket.
sped up all downloads, json and csv, but not doing "fuzzy"
deduping of search results, but just deduping on page
content hash. added TcpSocket::m_numDestroys to ensure we
do not send heartbeat on a socket that was closed and
re-opened for another client.
2014-02-13 08:45:13 -08:00
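The heartbeat guard described in the commit above can be sketched roughly as follows. Only the name `TcpSocket::m_numDestroys` comes from the commit message; every other field, the `Heartbeat` struct, and the three functions are hypothetical stand-ins, and the real class is certainly more involved. The idea is that a socket slot can be reused for a new client, so the timer remembers the destroy count it saw when it was armed and does nothing if the count has changed.

```cpp
#include <string>

// Simplified stand-in for a reusable socket slot. Only m_numDestroys is
// named in the commit message; the other members are assumptions.
struct TcpSocket {
    int m_numDestroys = 0;      // incremented each time the slot is torn down
    bool m_streaming = false;   // true when replying in streaming mode
    bool m_replyReady = false;  // true once search results are available
    std::string m_sent;         // stands in for bytes written to the client
};

// Hypothetical timer state: remembers which "generation" of the slot
// the heartbeat was armed for.
struct Heartbeat {
    TcpSocket* sock;
    int numDestroysAtStart;
};

Heartbeat armHeartbeat(TcpSocket& s) {
    return {&s, s.m_numDestroys};
}

// Tear down the slot so it can be reused by another client.
void destroySocket(TcpSocket& s) {
    s.m_numDestroys++;
    s.m_streaming = false;
    s.m_replyReady = false;
    s.m_sent.clear();
}

// Intended to run every 10 seconds. Returns true if a space was sent.
bool onHeartbeatTick(const Heartbeat& hb) {
    TcpSocket& s = *hb.sock;
    // Slot was closed (and possibly re-opened for someone else): do nothing.
    if (s.m_numDestroys != hb.numDestroysAtStart) return false;
    // No keep-alive needed when streaming or when results are ready.
    if (s.m_streaming || s.m_replyReady) return false;
    s.m_sent += ' ';
    return true;
}
```

Comparing a counter rather than a pointer is the point of the design: the pointer to a reused slot stays valid and identical, so only a generation count can tell the old client from the new one.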
Matt Wells
8d534b8ed8 many more fixes for streaming mode
2014-02-06 18:21:22 -08:00
Matt Wells
874311ae52 fixes for streaming mode.
2014-02-06 16:28:42 -08:00
Matt Wells
845611ae1b &stream=1 stream mode fixes.
2014-02-06 15:23:53 -08:00
Matt Wells
392d043bd8 undo canonical deduping.
added dump round stats when uploading
json files.
2014-01-31 14:53:49 -08:00
Matt Wells
4f7b00c6ce fix core on broken pipe when calling
sendChunk() and socket in streaming mode.
2014-01-23 11:34:49 -08:00
Matt Wells
3ec44c5b35 fix streaming mode for sending back json
downloads/dumps.
2014-01-17 18:28:17 -08:00
Matt Wells
16f8af0d57 added awesome streaming mode support
to tcpserver.cpp for sending back
json objects as we get them from shards.
and as we get them in small pieces so we
don't go oom. made that code much simpler
and more reliable in the long run.
2014-01-17 16:26:17 -08:00
Matt Wells
5e4b5a112c Merge branch 'master' into diffbot
Conflicts:
	PageResults.cpp
	Threads.cpp
	XmlDoc.cpp
	XmlDoc.h
2013-12-07 11:34:26 -07:00
Matt Wells
5da41cd113 fix a couple different cores.
2013-11-24 19:46:44 -07:00
Matt Wells
c669f8c138 fix file descriptor leak in Dir class.
try to fix core from Thread getting SIGALRM.
try to set NOFILES to 1024 at startup in case
more are allowed.
2013-11-19 13:41:56 -08:00
Matt Wells
25dd764dac Merge branch 'master' into diffbot
Conflicts:
	Makefile
	PageResults.cpp
2013-11-18 16:59:33 -08:00
Matt Wells
5e30728a3a new graphic icons. minor clean ups.
2013-11-15 14:47:05 -07:00
Matt Wells
fc17521697 Merge branch 'master' into diffbot
Conflicts:
	Hostdb.cpp
	Makefile
	PageResults.cpp
	PageRoot.cpp
	Pages.cpp
	Rdb.cpp
	SearchInput.cpp
	SearchInput.h
	Spider.cpp
	Spider.h
	XmlDoc.cpp
2013-10-16 14:28:42 -07:00
mwells
b60bdcc038 documentation updates. fixed sd=0.
2013-10-13 14:24:41 -07:00
mwells
e71266e2db fix data downloading for large files
2013-09-30 13:48:37 -06:00
mwells
9bf8bf7712 add spider reply even on g_errno now with an error
code of EINTERNAL error in the spider reply.
no longer just sit on the lock. this was blocking
an entire ip when just lock sitting for 3 hrs.
and only do read rate timeouts if there was at least
one byte read. this was causing diffbot reply to
read rate timeout after just 60 seconds even though
its timeout was specified as 90 seconds.
2013-09-29 09:22:20 -06:00
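The read-rate change in the commit above can be illustrated with a small sketch. It assumes a 60-second read-rate window (inferred from the "after just 60 seconds" remark) and a caller-supplied overall timeout such as 90 seconds. The struct, the constant, and the function name are all hypothetical and do not reflect the actual gb code.

```cpp
// Hypothetical snapshot of one in-progress fetch.
struct FetchTimer {
    int secsSinceStart;     // time since the request was sent
    int secsSinceLastRead;  // time since bytes last arrived
    long long bytesRead;    // total reply bytes received so far
    int timeoutSecs;        // caller-specified overall timeout, e.g. 90
};

// Assumed read-rate window, based on the 60 seconds mentioned in the commit.
const int READ_RATE_WINDOW_SECS = 60;

// Returns true if the fetch should be abandoned.
bool shouldTimeOut(const FetchTimer& t) {
    // The overall timeout always applies.
    if (t.secsSinceStart >= t.timeoutSecs) return true;
    // The read-rate timeout applies only once at least one byte has been
    // read, so a slow-to-start reply still gets its full timeout.
    if (t.bytesRead > 0 && t.secsSinceLastRead >= READ_RATE_WINDOW_SECS) return true;
    return false;
}
```

With this rule a reply that has sent nothing after 60 seconds keeps waiting until the 90-second limit, while a reply that started and then stalled for 60 seconds is still cut off.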
Matt Wells
78a334198b Merge branch 'master' into diffbot
2013-09-16 09:05:37 -07:00
mwells
01c2a6d381 we already include our own 32-bit
libssl.a and libcrypto.a so we can ensure
stability. so we have to include the header
files as well really.
2013-09-15 18:25:49 -06:00
Matt Wells
5dc7bd2ab4 integrate diffbot from svn back into git.
2013-09-13 09:23:18 -07:00
Matt Wells
f6e560c1f4 Initial file population.
2013-08-02 13:12:24 -07:00