mwells
|
05400a0c25
|
updated spider code documentation.
|
2013-09-20 11:19:24 -07:00 |
|
Matt Wells
|
fbd62cecba
|
updated compilation instructions. need
to apt-get install gcc-multilib.
|
2013-09-20 10:06:01 -07:00 |
|
Matt Wells
|
bcc55dc46b
|
fixed a couple bugs. Added more documentation
into Spider.h.
|
2013-09-19 18:21:52 -07:00 |
|
Matt Wells
|
47465f6d90
|
more fixes. trying to fix spiders to
spider multiple urls from same ip...
|
2013-09-19 11:13:40 -07:00 |
|
Matt Wells
|
a3ea867305
|
update crawlbot api.
|
2013-09-18 17:13:36 -07:00 |
|
Matt Wells
|
022caeec04
|
use -diffbotxyz%li as a more unique appendage.
show token on crawlbot page.
|
2013-09-18 17:05:41 -07:00 |
|
Matt Wells
|
29f5c5d644
|
added isonsamesubdomain and isonsamedomain
|
2013-09-18 16:45:37 -07:00 |
|
Matt Wells
|
8de246d9c4
|
only show urls being spidered from your coll
|
2013-09-18 16:29:47 -07:00 |
|
Matt Wells
|
3bdd28ab1d
|
fix spider bug
|
2013-09-18 16:17:08 -07:00 |
|
Matt Wells
|
7fdbd0f66a
|
delete spider coll when deleting coll
|
2013-09-18 15:36:30 -07:00 |
|
Matt Wells
|
f90d20f4dd
|
diffbot api integration updates
|
2013-09-18 15:07:47 -07:00 |
|
Matt Wells
|
70ff54ce03
|
hide the parms that might scare users away
in the url filters.
|
2013-09-18 14:27:59 -07:00 |
|
Matt Wells
|
6af02119a1
|
use cookies to display url filters table.
|
2013-09-18 13:50:55 -07:00 |
|
Matt Wells
|
04b0a08ef9
|
propagate showtable=1 when submitting url filters table
|
2013-09-18 12:38:05 -07:00 |
|
Matt Wells
|
924d1320a2
|
fix bugs inserting and deleting rows
using TYPE_SAFEBUF parms.
|
2013-09-18 12:35:01 -07:00 |
|
Matt Wells
|
c1bcebb7bb
|
url filter documentation update.
|
2013-09-18 12:00:29 -07:00 |
|
Matt Wells
|
459a7e98fb
|
add diffbot dropdown to url filters table
|
2013-09-18 11:24:16 -07:00 |
|
Matt Wells
|
487d3f0a0e
|
fix url filters bugs.
|
2013-09-18 11:02:09 -07:00 |
|
Matt Wells
|
39d9760e5d
|
added ismedia url filter to
cover all the jpg,gif,mpeg,css rules.
|
2013-09-18 09:40:59 -07:00 |
|
Matt Wells
|
c77453348f
|
Merge branch 'master' into diffbot
Conflicts:
SearchInput.cpp
XmlDoc.cpp
|
2013-09-18 09:23:48 -07:00 |
|
mwells
|
d6815f2c9d
|
if family filter enabled (&ff=1) then
prepend "gbadult:0 |" to the query to
restrict to non-adult pages.
|
2013-09-18 00:11:55 -06:00 |
|
mwells
|
a0032e0eb7
|
added another log statement for when
debugging the adult content detector.
we err on the side of caution for the most part.
|
2013-09-18 00:06:21 -06:00 |
|
mwells
|
119a4c0c22
|
fix adult content detector
|
2013-09-17 23:53:17 -06:00 |
|
mwells
|
5ec3803312
|
fix core in hashing gbisadult:[0|1] term.
|
2013-09-17 23:27:31 -06:00 |
|
Matt Wells
|
3005f904c7
|
index gbisadult:1 if adult content
gbisadult:0 if not.
|
2013-09-17 22:05:47 -07:00 |
|
Matt Wells
|
10fcfb6987
|
minor updates
|
2013-09-17 17:32:49 -07:00 |
|
Matt Wells
|
b8590d7df9
|
do not show json pages if searching pages.
|
2013-09-17 17:23:58 -07:00 |
|
Matt Wells
|
7fa4138d1c
|
fix Next 10 link
|
2013-09-17 17:19:41 -07:00 |
|
Matt Wells
|
98caa3225a
|
fix query prepend logic for json searches
|
2013-09-17 17:16:39 -07:00 |
|
Matt Wells
|
017a0febef
|
fix api dropdown selection.
|
2013-09-17 16:38:56 -07:00 |
|
Matt Wells
|
5e3b727eb5
|
crawlbot api fixes.
|
2013-09-17 16:30:57 -07:00 |
|
Matt Wells
|
b38d54cef9
|
save crawlinfo as binary so it's easier
to not miss anything.
|
2013-09-17 16:07:59 -07:00 |
|
Matt Wells
|
2beff7f7d8
|
crawlbot api updates
|
2013-09-17 15:59:50 -07:00 |
|
Matt Wells
|
e50da4d012
|
crawlbot api fixes
|
2013-09-17 15:47:44 -07:00 |
|
Matt Wells
|
c16fe8601b
|
more crawlbot api fixes
|
2013-09-17 15:32:28 -07:00 |
|
Matt Wells
|
e7151e6cc6
|
fix bug with spiders not coming on.
|
2013-09-17 14:35:48 -07:00 |
|
Matt Wells
|
c81f700bf0
|
get reset collection kinda working.
|
2013-09-17 14:13:44 -07:00 |
|
Matt Wells
|
4321f02e4e
|
trying to get reset collection working
|
2013-09-17 12:21:09 -07:00 |
|
Matt Wells
|
fff8b80969
|
get collection delete working
|
2013-09-17 11:27:31 -07:00 |
|
Matt Wells
|
63973cf9c0
|
get "add new collection" working.
|
2013-09-17 10:43:23 -07:00 |
|
Matt Wells
|
02bf6ab3cc
|
new crawlbot api. not backwards compatible any more.
|
2013-09-17 10:25:54 -07:00 |
|
mwells
|
f34a7f44ab
|
compiler flag fix for xmldoc.o
|
2013-09-16 22:35:16 -06:00 |
|
mwells
|
afd1b3a9a2
|
added Diffbot.h
|
2013-09-16 21:42:48 -06:00 |
|
Matt Wells
|
fc692202ba
|
fix integration of url filters into crawlbot page
|
2013-09-16 16:27:48 -07:00 |
|
Matt Wells
|
e7ed9254d4
|
formatting...
|
2013-09-16 15:33:45 -07:00 |
|
Matt Wells
|
1a780d1f4a
|
pretty up a little
|
2013-09-16 15:18:55 -07:00 |
|
Matt Wells
|
a034604cef
|
clean up to remove g_conf.m_useDiffbot
|
2013-09-16 15:00:43 -07:00 |
|
Matt Wells
|
cb9969ad22
|
fix token bug
|
2013-09-16 14:38:29 -07:00 |
|
Matt Wells
|
3dfba4de69
|
doc updates
|
2013-09-16 14:29:01 -07:00 |
|
Matt Wells
|
4c11265a98
|
more updates to crawlbot api
|
2013-09-16 13:59:11 -07:00 |
|