1856 Commits

Author SHA1 Message Date
Matt Wells
51a2d7d123 oops fix 2014-06-18 16:49:18 -07:00
Matt Wells
721cdec30c fix bug related to re-adding spider requests
for diffbot object parent urls for a query reindex.
url is no longer a docid.
2014-06-18 16:46:26 -07:00
Matt Wells
6b512f1379 Merge branch 'master' into testing 2014-06-18 09:17:13 -07:00
Matt Wells
f1ec530eef critical bug fixes 2014-06-18 09:16:28 -07:00
Matt Wells
d730f087c2 fix more injection bugs 2014-06-18 07:05:55 -07:00
mwells
ad42739e3e nothing 2014-06-18 08:09:02 -06:00
Matt Wells
772bd02e8d Merge branch 'testing' of git@github.com:gigablast/open-source-search-engine into testing
Conflicts:
	Makefile
2014-06-18 06:51:08 -07:00
Matt Wells
82be2ba28a fix injection bugs 2014-06-18 06:22:19 -07:00
mwells
91df090d1d nothing 2014-06-18 06:37:06 -06:00
Matt Wells
8bbdc2b48a fix another core 2014-06-18 05:23:48 -07:00
mwells
cd33553bf2 nothing 2014-06-18 06:10:58 -06:00
Matt Wells
9e2b1532d9 quick fix 2014-06-18 05:06:23 -07:00
Matt Wells
1bef36c03c emergency bug fixes 2014-06-18 05:04:45 -07:00
Matt Wells
b6264c6765 fix ulimit and antiword bugs 2014-06-18 04:06:20 -07:00
Matt Wells
e36d9d1f3a turn off dup removal for all download queries now,
not just bulk jobs. it is confusing ppl too much
2014-06-17 18:50:42 -07:00
mwells
4d9bc7dc08 update 2014-06-17 17:25:27 -06:00
mwells
c314e61968 make sectiondb stats just a special case of facets 2014-06-17 16:39:02 -06:00
mwells
b2e9c4e631 package bldg updates 2014-06-16 21:50:32 -06:00
mwells
584af942d4 Merge branch 'testing' into diffbot-matt
Conflicts:
	Collectiondb.cpp
	Make.depend
	Parms.cpp
2014-06-16 20:42:28 -07:00
mwells
d71922168e facetize the sectiondb stuff 2014-06-16 20:40:35 -07:00
Matt Wells
514f919b11 license fix 2014-06-16 13:52:51 -07:00
Matt Wells
29a6851100 push old fixes 2014-06-16 12:53:14 -07:00
Matt Wells
549f8eb5bc fix bug in hosts.conf when expanding working dir. 2014-06-16 11:32:10 -07:00
Matt Wells
9be18173db fix compiler bug 2014-06-16 11:10:38 -07:00
Matt Wells
153a5ab11d Merge branch 'diffbot-testing' into testing 2014-06-16 07:08:36 -07:00
Matt Wells
3b8741a7cb trying to prevent some cores. 2014-06-16 07:03:51 -07:00
Matt Wells
406fec356a fix json parser core from bad json. 2014-06-16 06:56:16 -07:00
Matt Wells
3c1f89e052 move parm order around in gb.conf 2014-06-16 06:07:11 -07:00
mwells
9244e172d0 fix calls to antiword and pdftohtml etc. 2014-06-15 17:44:52 -07:00
mwells
928456b102 fix merge 2014-06-15 15:05:28 -07:00
mwells
2be7c78f2f Merge branch 'diffbot-testing' into testing
Conflicts:
	Collectiondb.cpp
	Parms.cpp
2014-06-15 15:02:17 -07:00
mwells
4e3e4fd0d0 yay! get multidoc flatfile injection working. 2014-06-15 14:57:38 -07:00
mwells
6f813ed7a5 fixed/added support for multi doc (flatfile) injection 2014-06-15 09:54:08 -07:00
mwells
b2923acaf1 added support for using delimiter with injections so
one injected file can contain multiple documents.
2014-06-15 09:10:00 -07:00
mwells
7506d66d4a fixes for page inject 2014-06-15 08:26:27 -07:00
mwells
c3bbcb9f92 parm-itize page reindex 2014-06-15 07:56:27 -07:00
Matt Wells
2841569376 fix &roundStart=1 again to force a spider round
for non-repeat collections.
2014-06-13 12:11:17 -07:00
mwells
8e241297f2 integrated parm updates 2014-06-13 11:07:01 -07:00
mwells
5c0b371dc9 Merge branch 'testing' into diffbot-matt
Conflicts:
	Collectiondb.cpp
	HttpServer.cpp
	Make.depend
	Parms.cpp
	Parms.h
2014-06-13 11:00:09 -07:00
mwells
308f2d07f7 fixes for section info injection into squid proxied responses 2014-06-13 10:48:59 -07:00
mwells
993804e6ab api table updates 2014-06-13 09:37:53 -07:00
mwells
a123cad0d2 more page api updates 2014-06-13 08:58:55 -07:00
mwells
3cf3cddc5c beginning of total parm overhaul.
new injection parms, just need to engage them.
2014-06-12 21:27:06 -07:00
mwells
df8b9bd01a more fixes for section markup proxy 2014-06-12 15:28:03 -07:00
mwells
20c4ac4205 got it marking up html now with sectiondb stats.
seems to work ok.
2014-06-12 14:42:08 -07:00
mwells
ea90e7f755 more fixes for sectiondb markup code 2014-06-12 13:05:45 -07:00
Matt Wells
76f1987785 fix roundstart bug 2014-06-12 07:52:30 -07:00
mwells
a425e181e7 api table parm cleanups. more to come 2014-06-11 20:24:36 -07:00
mwells
7ebaf531b0 inject url page was breaking rendering the new parms. 2014-06-11 19:50:07 -07:00
Matt Wells
ab7717d065 now use &roundStart=0 to trigger the next crawl round.
now assume all crawl jobs are "repeat", but those that
have repeat of "0" just assume 10 year frequency, 3652.5 days.
that way the &roundStart=0 will do another round of crawling
for them as well.
2014-06-11 18:45:58 -07:00
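
The repeat handling described in commit ab7717d065 can be sketched roughly as follows. This is only an illustration of the rule stated in the commit message, not code from the repository: the function names, the `force` flag (standing in for a round requested through the roundStart parameter), and the day-based units are all assumptions.

```python
# Hypothetical sketch; none of these names come from the Gigablast source.

# Frequency assumed for jobs whose repeat is 0, as quoted in the commit message.
TEN_YEARS_IN_DAYS = 3652.5


def effective_repeat_days(repeat_days: float) -> float:
    """Treat every crawl job as repeating; a repeat of 0 means roughly 10 years."""
    return TEN_YEARS_IN_DAYS if repeat_days == 0 else repeat_days


def should_start_round(days_since_round_start: float,
                       repeat_days: float,
                       force: bool = False) -> bool:
    """Decide whether a new crawl round should begin.

    `force` is an assumed stand-in for a request made via the roundStart
    parameter; when set, a new round starts regardless of the schedule.
    """
    if force:
        return True
    return days_since_round_start >= effective_repeat_days(repeat_days)
```

Because a repeat of 0 maps to a very long but finite frequency, such a job goes through the same scheduling path as any other, so a forced round applies to it too.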