open-source-search-engine

mirror of https://github.com/gigablast/open-source-search-engine.git synced 2024-10-04 12:17:35 +03:00

Author	SHA1	Message	Date
Matt Wells	9b5e3016df	fix hosts.conf	2013-12-26 09:34:35 -08:00
Matt Wells	7624a3db0a	if url is manually added and it is simplifiedredirect then re-add with the same manually added bit set in the new spider request, otherwise seed url might not get spidered since it might not match the regex.	2013-12-26 08:58:56 -08:00
Matt Wells	6cc69106c2	fix hosts.conf	2013-12-23 10:30:45 -08:00
mwells	76bb3d05e1	clean up logging so i can see what's going on	2013-12-10 16:41:30 -08:00
Matt Wells	263bb8dfbc	fix oops	2013-11-05 14:32:56 -08:00
Matt Wells	2b904e9563	include firstip in the spider url lock, not just uh48, because using fake ips results in having the same url crawled twice since it is from a different "firstip" so we should include "firstip" in the lock as well to prevent a double round increment. see comment in Spider.cpp to this effect.	2013-11-05 14:31:05 -08:00
Matt Wells	b22f8d5d19	minor msg update	2013-10-29 15:26:32 -07:00
Matt Wells	54c50c1f3a	added "retrictDomain" parm which defaults to 1. will restrict spidered urls to same domain as seed urls.	2013-10-29 09:31:57 -07:00
Matt Wells	fb7096dc5d	num-mirrors: updates	2013-10-24 14:59:35 -07:00
Matt Wells	f65a2fd625	support num-mirrors: instead of index-splits: directive.	2013-10-24 14:32:56 -07:00
Matt Wells	91b8921b9e	have to use different ports if multiple gb instances/processes on same server.	2013-10-02 16:12:17 -07:00
mwells	c03e862b99	use a better version of hosts.conf where we specify the working directory for each host entry. then we can use the exact same hosts.conf file for each gb instance rather than having to change the single "working-dir:" directive for each instance, in the case where they each have a different working directory.	2013-10-02 13:11:58 -06:00
mwells	e9297df240	listen on DNS port 5998 not 6000. 6000 seemed to cause issues on a particular install for some reason.	2013-08-19 15:02:27 -06:00
mwells	be7aab78b7	Fixed bugs with running a proxy. Added more comments into hosts.conf.	2013-08-08 14:41:38 -06:00
Matt Wells	f6e560c1f4	Initial file population.	2013-08-02 13:12:24 -07:00

15 Commits