Commit Graph

11 Commits

Author SHA1 Message Date
mwells
d5805733e5 more api updates 2014-07-13 09:35:44 -07:00
mwells
5f26918910 lots of bug fixes. more qa fixes. 2014-07-11 08:00:30 -07:00
Daniel Steinberg
79b2d4859b printCrawlDetailsInJson signature without version 2014-05-28 10:41:32 -07:00
Daniel Steinberg
c06f9fde36 gigablast now has a notion of version based on the request 2014-05-27 20:11:12 -07:00
Matt Wells
f420bd2769 checkpoint 2014-02-09 15:09:48 -07:00
Matt Wells
0e4d96b3f8 added "seeds" to json reply. store seed urls
(and dedup them) in collrec. fixed some respidering
issues. any time we re-enter url filters
then rebuild the waiting tree.
2013-10-21 17:35:14 -07:00
mwells
3fecb3eb1f got email and url notification code compiling.
when crawl hits a limit we do notifications.
2013-10-01 15:14:39 -06:00
mwells
20952eedbe customizable api list in url filters 2013-09-30 09:18:22 -06:00
mwells
9730e5f3ef fix lost spiders from updating crawl info.
fix maxspidersperip limitation not being obeyed.
removed fakedb.
only add "0" time waiting tree keys to waiting tree.
only scanSpiderdb() will change their times to
a future time or add them to doledb directly.
confirmLockAcquisition() will not add to waitingtree
if max spiders per ip limit would be exceeded.
an incoming spider reply will trigger the add to
waiting tree with a time of "0".
2013-09-28 13:12:33 -06:00
mwells
eb3f657411 fixed distributed support for adding/deleting/resetting
collections. now need to specify collection name
like &addcoll=mycoll when adding a coll.
2013-09-27 10:49:24 -06:00
mwells
fd081478de fix crawlbot to work on a distributed network
as far as adding/deleting/resetting colls
and updating parms. ideally we'd have a Colldb
Rdb where each key was a parm. that would make
syncing easier if a host went down, then it would
get the negative/positive colldb parm keys later.
so it could sync up on all your operations as long
as all your operations are expressed in terms of adding and deleting
database key/value pairs.
2013-09-26 22:41:05 -06:00