1486 Commits

Author SHA1 Message Date
Matt Wells
bc5b126f2a Merge branch 'diffbot' 2014-05-28 09:15:48 -07:00
Matt Wells
662a8a33d0 emergency core fix 2014-05-28 09:29:54 -07:00
mwells
b3dcca6356 added make master-rpm 2014-05-28 07:48:02 -07:00
mwells
9b985fc233 Merge branch 'testing' 2014-05-28 07:36:45 -07:00
mwells
17e1fbc16c fix getPitPosLL() error causing lang detection to screw up. 2014-05-28 07:35:05 -07:00
Daniel Steinberg
7448e8a1ff don't use "expand" for mode= requests or non-analyze requests 2014-05-26 20:38:44 -07:00
mwells
da328a8d2f turn off spider reply indexing by default until we stop indexing simple words in url 2014-05-26 13:22:43 -06:00
mwells
068a299339 update documentation 2014-05-26 13:09:57 -06:00
Matt Wells
2d4fb483b2 disambiguate error msg 2014-05-26 10:46:10 -07:00
mwells
8149e99965 added developer.html warning msg. 2014-05-26 11:41:12 -06:00
mwells
58a2c04e30 more admin.html updates. 2014-05-26 11:39:26 -06:00
mwells
cea69c35cb make sure all subsections of admin.html have a last updated time or a warning if documentation is old. 2014-05-26 11:31:54 -06:00
mwells
c89c1f1471 Merge branch 'master' into testing (Conflicts: html/admin.html) 2014-05-26 11:25:33 -06:00
mwells
db1703d500 admin.html updates 2014-05-26 11:24:18 -06:00
mwells
f54c5192f2 updated admin.html to remove some stuff not needed. 2014-05-26 10:57:39 -06:00
Matt Wells
8946517b7c minor admin.html update 2014-05-26 10:30:48 -04:00
mwells
fe536cf31f minor updates to admin.html 2014-05-26 07:14:32 -07:00
Matt Wells
5ecd486f48 update admin.html 2014-05-25 22:28:05 -04:00
Matt Wells
b201333549 Merge branch 'master' into testing 2014-05-25 22:13:45 -04:00
Matt Wells
8ad18d2cd3 make it so we don't need --nodeps with rpm -ivh (rpm install) to install pkg. 2014-05-25 22:08:46 -04:00
Matt Wells
2e7f32b01a fix getcwd2() so it works on red hat. defaults to /var/gigablast/data0/gb if cmd is "gb" and the "gb" binary is not in the current working directory. 2014-05-25 20:53:49 -04:00
mwells
d0df3da508 added dotemacs 2014-05-25 07:54:30 -07:00
Matt Wells
b0f9227bbc path fixes for gb startup 2014-05-25 10:28:13 -04:00
Matt Wells
98c2e7a8b6 redhat build updates on fedora 2014-05-25 09:58:07 -04:00
Matt Wells
3fe1d3f184 updates to compile cleanly on redhat. 2014-05-24 23:58:12 -04:00
mwells
b33959191b rpmbuild updates 2014-05-24 07:16:17 -07:00
Matt Wells
8234aaed23 put lastspidertimeutc back in because we need it for debugging. 2014-05-23 09:43:46 -07:00
Matt Wells
e3b6f6b74e a second fix for crawls saying they're done and then resuming. it seems to happen when we turn spiders off then back on again. so hack that. 2014-05-23 07:29:18 -07:00
mwells
562b3eafda more spec file fixes. use relative symlinks 2014-05-22 21:57:46 -07:00
mwells
5c55517fe6 more rpm build fixes 2014-05-22 21:01:30 -07:00
mwells
ddec6353ed rpm updates 2014-05-22 19:24:33 -07:00
mwells
a783c9155b add spec file to build rpm. 2014-05-22 19:06:09 -07:00
mwells
b2e9cfcc1b minor make install changes 2014-05-22 18:46:38 -07:00
Matt Wells
1f4dc2df97 fix bug in spider scan of spiderdb for unique firstips 2014-05-22 13:08:01 -07:00
Matt Wells
68fcffb2da speed up scan of spiderdb to repopulate waiting tree by jumping over last firstip. 2014-05-22 12:20:03 -07:00
Matt Wells
e9c4c9bb9a fix possible loss of data when doing reads on especially doledb. 2014-05-22 11:06:56 -07:00
Matt Wells
1660805f66 more useful logging for debugging 2014-05-22 10:36:44 -07:00
Matt Wells
32735677d2 wait 45 seconds before ending round, not 30 to try to fix some issues... 2014-05-22 08:32:19 -07:00
Matt Wells
935cc72e19 Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing 2014-05-21 13:55:29 -07:00
Matt Wells
b8886c399c show start/end job times on pagecrawlbot. 2014-05-21 13:55:01 -07:00
Matt Wells
61fc015014 fix potential diffbot injection bug 2014-05-21 12:21:29 -07:00
Matt Wells
b0c87b355c log update 2014-05-21 10:09:50 -07:00
Matt Wells
45df139ccb update logging 2014-05-21 10:05:49 -07:00
Matt Wells
7ad9058f77 when doing a query reindex on a json child url we need to add the spider request of the original parent url and make sure it does not get "EDOCUNCHANGED" error. then the possibly new json child objects won't get indexed. 2014-05-21 05:43:53 -07:00
Matt Wells
34afc7c7cf Merge branch 'diffbot-dan' into diffbot-testing 2014-05-21 05:30:56 -07:00
Daniel Steinberg
e39dffadcf use "expand" option when calling Diffbot 2014-05-20 22:00:46 -07:00
Matt Wells
4b587f168b fix bug of not including empty responses when &icc=1 2014-05-20 21:07:21 -07:00
Matt Wells
c729b51ae5 fixed exact # search results hit count when using min/max/sort operators. 2014-05-20 13:45:00 -07:00
Matt Wells
6664faa792 fix printing back-to-back commas when showing results in json with &icc=1. 2014-05-20 13:23:29 -07:00
mwells
ffc4036840 update admin.html 2014-05-19 06:22:34 -07:00