Matt Wells
1fb1e2af7e
fixed form input. fixed page parser submission.
...
added ability to dump out termlist from posdb
like type:json (with a colon in it) to try to debug
msft seeing html in csv output.
2014-01-29 14:10:08 -08:00
Matt Wells
a9909e189f
fix delete collection api
2014-01-27 15:28:26 -08:00
Matt Wells
df063dbdf2
fix a core
2014-01-22 22:26:50 -08:00
Matt Wells
33c5d9c07f
a lot of times rdb tree has invalid collection
...
numbers in it so fix our counting algo in case
the collection rec no longer exists!
2014-01-21 19:01:44 -08:00
Matt Wells
9354d06493
menu updates.
2014-01-21 13:01:37 -08:00
Matt Wells
8d5e1cb547
added url download support
2014-01-20 23:17:04 -08:00
Matt Wells
089d7f34a0
more spiderdb spider request fixes
2014-01-19 18:00:56 -08:00
Matt Wells
970d5b2488
formatting
2014-01-19 16:40:22 -08:00
Matt Wells
fa0e3f784f
formatting
2014-01-19 15:06:02 -08:00
Matt Wells
99de2188e1
formatting
2014-01-19 13:21:58 -08:00
Matt Wells
ca816492b5
doc links
2014-01-19 12:01:32 -08:00
Matt Wells
471599e9e7
formatting
2014-01-19 10:44:19 -08:00
Matt Wells
fe3a879758
formatting changes
2014-01-19 00:38:02 -08:00
Matt Wells
4606e88721
code cleanups.
...
xmldoc::injectDoc(), and it'll
add a SpiderRequest as well.
better collectiondb init code.
2014-01-18 21:19:26 -08:00
Matt Wells
f3000e2763
set m_needsSave in collectionrec when parms updated
2014-01-18 12:51:10 -08:00
Matt Wells
9c1f6197eb
added indexbody control so i can
...
turn it off for my special json
global index.
2014-01-18 10:04:33 -08:00
Matt Wells
2faba0efd1
fix repeat rounds sticking bug
...
by adding PF_REBUILDURLFILTERS flag to
spiderroundastarttime parm
2014-01-17 17:17:10 -08:00
Matt Wells
4b27b22949
got rebalancing working right
2014-01-15 17:40:17 -08:00
Matt Wells
883487889d
make gb install only have 10 outstanding connections per ip
...
since ssh seems to close connections if you have more
than 12 out.
2014-01-15 14:41:30 -08:00
Matt Wells
d091c7e959
fix hostsinagreement bug
2014-01-14 11:24:32 -08:00
Matt Wells
cb5b4af271
show reason spiders are not going above
...
the spider queue page.
2014-01-11 21:40:45 -08:00
Matt Wells
9da106e7ca
added emergency msg box on all admin pages
2014-01-11 20:35:13 -08:00
Matt Wells
eed606601e
added emergency msg box on all admin pages
2014-01-11 20:14:44 -08:00
Matt Wells
6de7abf6ba
display fixes.
...
./gb installgb and ./gb installgb2 now install 'gb'
if 'gb.new' is not present.
2014-01-11 17:16:20 -08:00
Matt Wells
f64b53bfb3
almost done with rebalancing code
2014-01-10 14:12:58 -08:00
Matt Wells
8943106389
minor print updates
2014-01-09 21:23:51 -08:00
Matt Wells
1d6ba52dcd
list collections in sidebar.
2014-01-09 21:13:41 -08:00
Matt Wells
645360b730
parm simplifications
2014-01-09 19:00:21 -08:00
Matt Wells
501f49c81b
gui and parm updates. simplifications.
2014-01-09 17:29:18 -08:00
Matt Wells
4d7fa1eea9
pretty up url filters table
2014-01-09 13:34:43 -08:00
Matt Wells
70f8c416de
allow collections to be added when no colls exist.
...
fixed gb start2 etc. to be sequential.
2014-01-09 13:07:16 -08:00
Matt Wells
0615acff17
zero out url filters checkboxes on submit
2013-12-16 11:03:40 -08:00
Matt Wells
a13114605a
more parm overhaul fixes
2013-12-12 12:44:54 -08:00
mwells
82494baa89
move CollectionRec stuff into Collectiondb files
...
for simplicity.
2013-12-10 15:28:04 -08:00
mwells
f2d5661965
parmdb overhaul. support collection add/del
...
sync when host comes back online. use udp not tcp.
host #0 can now handle a new incoming request while
a parm change is currently outstanding.
all missed "command" parms will be received when a dead host
comes back online, too, like a tight merge for instance.
does not use msg4, uses msg3e and msg3f for syncing and
sending parms.
2013-12-10 13:09:55 -08:00
Matt Wells
dd3b49faa9
collection name hell
2013-12-08 16:44:37 -07:00
Matt Wells
df28c4e0c2
search results in csv format.
...
remove serps per page limit if custom crawl.
2013-11-12 16:33:45 -08:00
Matt Wells
22f9e9355d
/v2/bulk api fixes
2013-10-22 18:51:09 -07:00
Matt Wells
fc17521697
Merge branch 'master' into diffbot
...
Conflicts:
Hostdb.cpp
Makefile
PageResults.cpp
PageRoot.cpp
Pages.cpp
Rdb.cpp
SearchInput.cpp
SearchInput.h
Spider.cpp
Spider.h
XmlDoc.cpp
2013-10-16 14:28:42 -07:00
mwells
7ba9994804
many dmoz fixes. but still more we need to do.
...
isn't printing subcategories right now.
2013-10-08 23:55:11 -07:00
Matt Wells
c0f1330d70
Merge branch 'master' into diffbot
...
Conflicts:
HttpServer.cpp
Makefile
PageGet.cpp
Pages.h
SafeBuf.h
2013-09-28 13:13:12 -07:00
mwells
5884951190
only do certain things if running
...
on a machine in matt wells datacenter.
like fan switching based on temps,
or printing seo links. made seo functions
weak overridable placeholder stubs so if
seo.o is linked in it will override.
include seo.o object if seo.cpp file exists
for automatic seo module building and linking.
2013-09-28 13:43:56 -06:00
mwells
7cdb3d6f9c
fix infinite loop from json parsing and
...
fix some core dumps.
2013-09-27 17:52:36 -06:00
mwells
e7377d72ab
fix robots.txt switch. fix collection rec saving.
...
require collname explicitly for injecturl urldata.
2013-09-27 11:39:23 -06:00
mwells
eb3f657411
fixed distributed support for adding/deleting/resetting
...
collections. now need to specify collection name
like &addcoll=mycoll when adding a coll.
2013-09-27 10:49:24 -06:00
mwells
fd081478de
fix crawlbot to work on a distributed network
...
as far as adding/deleting/resetting colls
and updating parms. ideally we'd have a Colldb
Rdb where each key was a parm. that would make
syncing easier if a host went down, then it would
get the negative/positive colldb parm keys later.
so it could sync up on all your operations as long
as all your operations are expressed in terms of adding
and deleting database key/value pairs.
2013-09-26 22:41:05 -06:00
Matt Wells
4c11265a98
more updates to crawlbot api
2013-09-16 13:59:11 -07:00
Matt Wells
a412c798bf
Merge branch 'master' into diffbot
...
Conflicts:
PageResults.cpp
2013-09-13 09:24:28 -07:00
Matt Wells
5dc7bd2ab4
integrate diffbot from svn back into git.
2013-09-13 09:23:18 -07:00
mwells
34b6d3e74a
fixed some cores. brought in fixes from
...
old repo.
2013-09-08 16:16:13 -06:00