Matt Wells
a76f4c6974
just POST a full request for webhook now
so we can do application/json content type
2013-11-07 14:20:15 -08:00
Matt Wells
3e4db4f1bc
show all crawl details in url webhook
notification in the post body.
2013-11-07 13:59:43 -08:00
Matt Wells
adf4d258ae
better crawl status reporting.
allow for _ in coll names.
2013-10-30 10:00:46 -07:00
Matt Wells
20052e34fe
made webhook return the crawl name
and status as X- fields in the mime.
2013-10-28 22:03:10 -07:00
Matt Wells
a5a7ab2434
added spider status msg to json output
to indicate if spider has hit a limit.
no longer disable spiders in xmldoc.cpp
when a crawl/process limit is hit. just
check for limit when spidering urls in
spider.cpp and if it is hit set
CollectionRec::m_spiderStatus[Msg] and
send email from there.
Added maxCrawlRounds parm.
2013-10-23 11:40:30 -07:00
Matt Wells
0e4d96b3f8
added "seeds" to json reply. store seed urls
(and dedup them) in collrec. fixed some respidering
issues. any time we re-enter url filters
then rebuild the waiting tree.
2013-10-21 17:35:14 -07:00
Matt Wells
b589b17e63
fix collection resetting.
2013-10-18 15:21:00 -07:00
Matt Wells
a288217e9f
a few bug fixes
2013-10-17 18:59:00 -07:00
mwells
ea859ef685
added 'gb emailmandrill' for testing.
got it working. it posts json, not url encoded.
2013-10-09 17:35:51 -06:00
mwells
c1c5c4e3d0
send notifications if no urls available
for immediate spidering.
2013-10-09 15:24:35 -06:00
Matt Wells
283ec2f6b4
email and webhook alerts when spider runs out of urls
to spider.
2013-10-09 11:42:56 -07:00
Matt Wells
3702a05d64
add sendEmailThroughMandrill() to send
through mail chimp http api.
2013-10-08 18:01:38 -07:00
Matt Wells
fe97e08281
move from groups to shards. got rid of annoying
groupid bit mask thing.
2013-10-04 16:18:56 -07:00
mwells
259ec08e09
email hook now works but you have to
supply the IP address of your sendmail
server and it has to allow email
forwarding from host #0's IP. specify
the sendmail server's IP in the Master
Controls.
2013-10-02 09:36:44 -06:00
mwells
45941e4b2f
fix notification system.
2013-10-01 17:30:06 -06:00
mwells
3fecb3eb1f
got email and url notification code compiling.
when crawl hits a limit we do notifications.
2013-10-01 15:14:39 -06:00
Matt Wells
a412c798bf
Merge branch 'master' into diffbot
Conflicts:
PageResults.cpp
2013-09-13 09:24:28 -07:00
mwells
34b6d3e74a
fixed some cores. brought in fixes from
old repo.
2013-09-08 16:16:13 -06:00
Matt Wells
94e6492916
removed MAX_COLL_RECS so we can have unlimited
collections, really limited only by sizeof(collnum_t) now,
which is 16 bits (15 usable bits, since it is signed).
can always expand this so we can have more than 32k collections.
2013-08-30 16:20:38 -07:00
Matt Wells
f6e560c1f4
Initial file population.
2013-08-02 13:12:24 -07:00