Commit Graph

32 Commits

Author SHA1 Message Date
Matt Wells
e351d2a6f1 get searching on token working 2014-03-06 17:01:41 -08:00
Matt Wells
27e8e810d2 use collnum instead of coll string.
more stable since resetting collections
keeps string the same but changes the collnum.
2014-03-06 15:48:11 -08:00
Matt Wells
d74f748e93 search all collections under a token if "&token" is
given but not "&c=..."
2014-03-06 11:00:43 -08:00
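Note on d74f748e93: the fallback described in this message can be sketched as below. This is only an illustration under stated assumptions, not the repository's code: the `Collection` struct, its fields, and `selectCollections` are hypothetical names. Only the behaviour (use `&c=` when given, otherwise search every collection under `&token`) comes from the commit message.

```cpp
#include <string>
#include <vector>

// Hypothetical record; field names are assumptions for illustration.
struct Collection {
    std::string name;   // what "&c=..." would name
    std::string token;  // owner token, as in "&token=..."
    int collnum;        // numeric collection id
};

// Return the ids of the collections to search.
// If "c" is non-empty it wins; otherwise every collection whose
// token matches is selected. With neither given, nothing is selected.
std::vector<int> selectCollections(const std::vector<Collection> &all,
                                   const std::string &c,
                                   const std::string &token) {
    std::vector<int> out;
    for (const Collection &rec : all) {
        if (!c.empty()) {
            if (rec.name == c) out.push_back(rec.collnum);
        } else if (!token.empty() && rec.token == token) {
            out.push_back(rec.collnum);
        }
    }
    return out;
}
```

Returning numeric ids rather than names is a choice made for this sketch only.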
Matt Wells
25cf0efdbf first compiled stab at multi collection searching. 2014-03-06 10:45:13 -08:00
mwells
82494baa89 move CollectionRec stuff into Collectiondb files
for simplicity.
2013-12-10 15:28:04 -08:00
Matt Wells
ed79b67d2e core dump fixes 2013-12-08 15:36:23 -07:00
Matt Wells
5e4b5a112c Merge branch 'master' into diffbot
Conflicts:
	PageResults.cpp
	Threads.cpp
	XmlDoc.cpp
	XmlDoc.h
2013-12-07 11:34:26 -07:00
Matt Wells
c50ef1954f show admin controls on serps if ip is local.
fixed up the "reindex" page for deleting/reindexing
search results for a given query.
2013-12-06 09:48:30 -07:00
Matt Wells
43e40208b8 Merge branch 'master' into diffbot
Conflicts:
	SafeBuf.cpp
	SafeBuf.h
	SearchInput.cpp
	XmlDoc.cpp
2013-11-20 15:51:58 -08:00
mwells
5baf6a95d4 handle a bunch of oom conditions that
caused core. found using oom tester.
2013-11-20 10:14:02 -07:00
Matt Wells
25dd764dac Merge branch 'master' into diffbot
Conflicts:
	Makefile
	PageResults.cpp
2013-11-18 16:59:33 -08:00
Matt Wells
7d3b52fb3a if intersect thread takes forever
was causing msg5 reads to block forever
and spider round was getting incremented.
fixed a few bugs around that issue.
2013-11-18 16:20:30 -08:00
Matt Wells
dfab4ee13d fixed bugs with advanced.html advanced search page.
made stats graph only show last 5 minutes of stats.
tends to make the graph look more continuous.
do not use ajax to fetch the search results unless
this is running in matt wells' datacenter. it is
only an anti bot scraping measure and unnecessarily
complicates things for others.
2013-11-17 14:58:47 -07:00
Matt Wells
df28c4e0c2 search results in csv format.
remove serps per page limit if custom crawl.
2013-11-12 16:33:45 -08:00
Matt Wells
e395628d5a use &format=0 1 or 2 for html/xml/json now.
use &icc=1 to get dump of json objects in serps.
2013-11-08 18:00:30 -08:00
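Note on e395628d5a: reading the message as a positional mapping (0 = html, 1 = xml, 2 = json), parsing the `&format=` value might look like the sketch below. `ResultFormat` and `parseFormat` are hypothetical names, and falling back to HTML for any other value is an assumption, not something the commit states.

```cpp
#include <string>

// Hypothetical enum; the values follow the order in the commit message.
enum class ResultFormat { Html = 0, Xml = 1, Json = 2 };

// Map the raw "&format=" value to a format.
// Assumption: unknown or missing values fall back to HTML.
ResultFormat parseFormat(const std::string &value) {
    if (value == "1") return ResultFormat::Xml;
    if (value == "2") return ResultFormat::Json;
    return ResultFormat::Html;
}
```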
Matt Wells
fc17521697 Merge branch 'master' into diffbot
Conflicts:
	Hostdb.cpp
	Makefile
	PageResults.cpp
	PageRoot.cpp
	Pages.cpp
	Rdb.cpp
	SearchInput.cpp
	SearchInput.h
	Spider.cpp
	Spider.h
	XmlDoc.cpp
2013-10-16 14:28:42 -07:00
mwells
d41d5554da fix dmoz search. 2013-10-13 16:00:44 -07:00
mwells
2c7bc9031f documentation updates. 2013-10-13 13:15:31 -07:00
mwells
7ba9994804 many dmoz fixes. but still more we need to do.
isn't printing subcategories right now.
2013-10-08 23:55:11 -07:00
mwells
6c2c9f7774 trying to bring back dmoz integration. 2013-10-02 22:34:21 -06:00
Matt Wells
c77453348f Merge branch 'master' into diffbot
Conflicts:
	SearchInput.cpp
	XmlDoc.cpp
2013-09-18 09:23:48 -07:00
mwells
d6815f2c9d if family filter enabled (&ff=1) then
prepend "gbadult:0 |" to the query to
restrict to non-adult pages.
2013-09-18 00:11:55 -06:00
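Note on d6815f2c9d: the query rewrite described here can be pictured roughly as follows. This is a minimal sketch, not the repository's code: `applyFamilyFilter` and its signature are hypothetical. Only the `&ff=1` flag and the `gbadult:0 |` prefix come from the commit message; the single space between the prefix and the original query is an assumption.

```cpp
#include <string>

// Hypothetical helper: when the family filter is enabled (&ff=1),
// prepend the restriction term so only non-adult pages match.
std::string applyFamilyFilter(const std::string &query, bool familyFilter) {
    if (!familyFilter) return query;
    // Separator space after '|' is an assumption.
    return "gbadult:0 | " + query;
}
```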
Matt Wells
98caa3225a fix query prepend logic for json searches 2013-09-17 17:16:39 -07:00
Matt Wells
c16fe8601b more crawlbot api fixes 2013-09-17 15:32:28 -07:00
Matt Wells
4c11265a98 more updates to crawlbot api 2013-09-16 13:59:11 -07:00
Matt Wells
78a334198b Merge branch 'master' into diffbot 2013-09-16 09:05:37 -07:00
Matt Wells
928dc36a03 get "&site=abc.com+xyz.com"... working to restrict
search results to specified sites. tested a little.
2013-09-15 20:16:48 -07:00
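Note on 928dc36a03: one way to turn a `&site=abc.com+xyz.com` value into a list of sites is sketched below. `splitSites` is a hypothetical helper, not a function from the repository. It accepts both `+` and a space as separators, on the assumption that the value may or may not have been URL-decoded already.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split a "&site=" value into individual sites.
// Empty pieces (e.g. from "a.com++b.com") are skipped.
std::vector<std::string> splitSites(const std::string &value) {
    std::vector<std::string> sites;
    std::string current;
    for (char ch : value) {
        if (ch == '+' || ch == ' ') {
            if (!current.empty()) sites.push_back(current);
            current.clear();
        } else {
            current += ch;
        }
    }
    if (!current.empty()) sites.push_back(current);
    return sites;
}
```

A search result would then be kept only if its site appears in the returned list.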
mwells
b684414e16 almost done adding support for whitelists.
i.e. list of sites to restrict search results to,
for instance.
2013-09-15 15:15:56 -06:00
Matt Wells
a412c798bf Merge branch 'master' into diffbot
Conflicts:
	PageResults.cpp
2013-09-13 09:24:28 -07:00
Matt Wells
5dc7bd2ab4 integrate diffbot from svn back into git. 2013-09-13 09:23:18 -07:00
mwells
aaf333c46c try to get family filter (&ff=1) working again
to filter out adult search results.
2013-09-01 18:22:38 -06:00
Matt Wells
f6e560c1f4 Initial file population. 2013-08-02 13:12:24 -07:00