61 Commits

Author SHA1 Message Date
Matt
ff969d92bb can inject a single doc now 2015-05-03 21:14:28 -07:00
Matt
3e1cc9a450 fix bug of parms being set seemingly at random. 2015-02-03 17:52:44 -08:00
Matt
c15bd53e52 added support for supplying basic proxy authorization
to spider proxies. username:password@1.2.3.4:80
2015-02-02 13:23:38 -08:00
mwells
87285ba3cd use gbmemcpy not memcpy so we can get profiler working again
since memcpy can't be interrupted and backtrace() called.
2015-01-13 12:25:42 -07:00
Matt
feed7d5b3c pthread_t pid_t compatibility fixes 2014-12-10 13:15:26 -08:00
Matt
adcef39376 Merge branch 'diffbot-testing' into diffbot-matt
Conflicts:
	Collectiondb.cpp
	Collectiondb.h
	Conf.cpp
	Conf.h
	Msg39.cpp
	PageEvents.cpp
	PageResults.cpp
	PageTurk.cpp
	Pages.cpp
	Parms.cpp
	Posdb.cpp
	Proxy.cpp
	Query.cpp
	Query.h
	RdbBase.cpp
	RdbMap.cpp
	Repair.cpp
	Repair.h
	SafeBuf.cpp
	Spider.cpp
	Tagdb.cpp
	TopTree.cpp
	XmlDoc.cpp
	main.cpp
2014-11-20 16:53:07 -08:00
Matt
4e8a42e024 text replacements for bad int32_t substitutions 2014-11-17 18:24:38 -08:00
Matt
931a1c4bc6 good checkpoint. quite a few fixes. 2014-11-17 18:13:36 -08:00
Matt
69ef3c14ef fixes for repair/rebuild functionality.
more to come.
2014-11-13 13:04:28 -08:00
Matt
96b8197ad3 now it compiles with -m32 2014-11-10 14:45:11 -08:00
Matt Wells
e7dd8f7956 replace long long with int64_t 2014-10-30 13:36:39 -06:00
Matt Wells
162e89b2d5 return error if client tries to use https for squid proxy
right now.
2014-10-10 07:47:23 -07:00
Matt Wells
7df7fbe721 support the CONNECT for gb squid proxy 2014-10-02 12:36:43 -07:00
mwells
46290fa52f new password systems. individual collection passwords/accessIps. 2014-09-28 18:59:49 -07:00
mwells
2ccc4626dc more support for cloud initiative 2014-08-31 21:55:27 -07:00
mwells
7f622bd416 fixes for cloud support. 2014-08-31 16:23:11 -07:00
mwells
caee238c46 fixes to make it easier to compile on mac os x. 2014-08-28 12:55:02 -07:00
mwells
177dbeb23d Merge branch 'testing' of github.com:gigablast/open-source-search-engine into testing 2014-08-06 16:00:50 -07:00
mwells
6a28250e94 get qa test working after nyt bug fix 2014-08-06 16:00:25 -07:00
mwells
470c487be4 get search filters actually working 2014-08-06 08:12:05 -07:00
mwells
947be58f10 Merge branch 'diffbot-testing' into testing
Conflicts:
	HttpRequest.cpp
	Msg13.cpp
	XmlDoc.cpp
2014-08-05 17:19:53 -07:00
mwells
cc1ceaaac2 fix nyt.com cookie redir bug.
fixed bug when POSTing injection request with multipart/form-data.
2014-08-05 17:04:11 -07:00
mwells
d7b67f21e7 return error if we get CONNECT requests. we don't
handle those because we can't cache them or inject
the sectiondb voting info into their tags because they
are encrypted from us.
2014-07-09 11:06:46 -07:00
mwells
0f9409235e some cleanups 2014-07-09 10:41:38 -07:00
mwells
d9ae010371 shard gbfacetstr:gbxpathsitehash123456 terms by termid for speed.
got them working again multicasting a msg 0x39 to the appropriate shard.
set special msg39request flag for better performance for those guys.
2014-07-07 12:32:27 -07:00
mwells
6434e5cc04 Merge branch 'testing' into diffbot-matt
Conflicts:
	Errno.cpp
	Errno.h
	Parms.h
2014-07-07 09:49:59 -07:00
mwells
4059c84074 api updates 2014-07-05 12:47:10 -07:00
mwells
29d170631a more api updates 2014-07-05 12:36:01 -07:00
mwells
2ddd7d7366 finally got http tunnel logic working. 2014-07-01 16:28:15 -06:00
mwells
5de927f385 some fixes for http proxy tunnel 2014-07-01 15:18:18 -06:00
Matt Wells
2137e150e7 Merge branch 'testing' into diffbot-matt
Conflicts:
	Collectiondb.cpp
	Make.depend
	Parms.cpp
2014-06-27 17:17:14 -07:00
mwells
4e3e4fd0d0 yay! get multidoc flatfile injection working. 2014-06-15 14:57:38 -07:00
mwells
5c0b371dc9 Merge branch 'testing' into diffbot-matt
Conflicts:
	Collectiondb.cpp
	HttpServer.cpp
	Make.depend
	Parms.cpp
	Parms.h
2014-06-13 11:00:09 -07:00
mwells
308f2d07f7 fixes for section info injection into squid proxied responses 2014-06-13 10:48:59 -07:00
mwells
3cf3cddc5c beginning of total parm overhaul.
new injection parms, just need to engage them.
2014-06-12 21:27:06 -07:00
mwells
b71ea7f7c6 fixes for squid proxy simulator 2014-06-09 14:31:48 -07:00
mwells
7d452a766c completed squid proxy simulation code 2014-06-09 12:42:05 -07:00
mwells
0fd85b788b halfway done coding up proxy (squid) support into gb 2014-06-06 17:27:18 -07:00
Matt Wells
f8073b5adc Merge branch 'diffbot-testing' into diffbot-matt 2014-06-03 14:59:31 -07:00
mwells
806cf79b73 spider proxy updates 2014-06-02 13:18:18 -07:00
Daniel Steinberg
1fae88b739 check version less than 99 2014-05-28 10:30:26 -07:00
Daniel Steinberg
c06f9fde36 gigablast now has a notion of version based on the request 2014-05-27 20:11:12 -07:00
Matt Wells
953b7c558d parm updates 2014-02-10 21:45:03 -07:00
Matt Wells
17fff243f9 add connectips back. call them adminIps this time.
if your ip is on the list then you have admin
access. cookie tokens will come later/soon.
2014-02-03 20:47:48 -07:00
Matt Wells
3a6f0d81e3 fix a few cores. assume any ip that matches
the c-block of any host in hosts.conf file is
"local". clarified specs in admin.html.
2014-01-10 18:34:47 -07:00
Matt Wells
ed79b67d2e core dump fixes 2013-12-08 15:36:23 -07:00
Matt Wells
c3517ee019 Merge branch 'diffbot' of github.com:gigablast/open-source-search-engine into diffbot
Conflicts:
	Spider.cpp
2013-11-22 17:37:42 -08:00
Matt Wells
f4de986c7e test to make sure diffbot reply contains
"url":" field. try to find out why some diffbot
replies are truncated.
2013-11-21 12:37:08 -08:00
Matt Wells
3e4db4f1bc show all crawl details in url webhook
notification in the post body.
2013-11-07 13:59:43 -08:00
Matt Wells
20052e34fe made webhook return the crawl name
and status as X- fields in the mime.
2013-10-28 22:03:10 -07:00