2509 Commits

Author SHA1 Message Date
Matt Wells
d19ee6ceea Merge branch 'diffbot' into diffbot-testing
Conflicts:
	Collectiondb.h
2014-12-11 08:40:55 -08:00
Matt Wells
7d67f104fb emergency fixes 2014-12-11 08:39:26 -08:00
Matt
27df9a4276 link text extraction fixes 2014-12-11 06:52:14 -08:00
mwells
4f71a95da5 reinstantiate linkdb min files to merge parm. 2014-12-11 07:20:15 -07:00
Matt
e43365fc70 more pthread_t pid_t fixes 2014-12-10 14:06:17 -08:00
Matt
08169a5562 makefile minor change 2014-12-10 13:29:59 -08:00
Matt
feed7d5b3c pthread_t pid_t compatibility fixes 2014-12-10 13:15:26 -08:00
Matt
619e980a97 try to fix pthread_t pid_t issues on Threads.cpp 2014-12-10 12:13:25 -08:00
Matt
329f004e74 compiler updates 2014-12-10 12:09:04 -08:00
Matt
d8ba619df3 makefile updates 2014-12-10 11:53:46 -08:00
Matt
1de49af4e6 add query term info into json output as well 2014-12-10 11:27:30 -08:00
Matt
44eddd63e8 fix signed/unsigned bug 2014-12-10 11:04:37 -08:00
Matt Wells
6b2e714964 try to fix core when generating statsdb graph 2014-12-10 11:01:50 -08:00
Matt Wells
c96b24f39d try to fix core on #0 and #16 from empty query.
if empty query or n<=0 and &stream=1 then fix bug
that was not sending back the reply properly.
basically, disable streaming if msg40 would not block.
upped MAX_SHARDS from 128 to 1024. should not take up
any more mem really or slow things down.
2014-12-10 10:44:01 -08:00
Matt Wells
febb1d4658 print pretty floats in the facets menu,
whether printing a single float or a range
of floats.
2014-12-09 17:17:12 -08:00
Matt Wells
720517c2f5 fix facet range lists 2014-12-09 16:51:14 -08:00
Matt Wells
d0bed16be5 fix typo in syntax.html page 2014-12-09 14:15:00 -08:00
Matt Wells
b218bc403d fix atotime1() output on json "date": field to
restrict to 32-bit min/max for time_t's that are beyond 32 bits.
so we truncate to min/max. later: add another termlist to add
more date coverage. would be useful for searching for big numeric
ranges, too, more than 32-bits.
2014-12-09 13:40:34 -08:00
Matt
dfce03eca8 fix printing of "next 10" link 2014-12-08 09:55:16 -08:00
Matt
0460335861 more permission system updates 2014-12-08 09:49:17 -08:00
Matt
2670cfd2f0 Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing 2014-12-08 09:48:51 -08:00
Matt
59c4db704a made pwd/ip security text areas less rows.
send empty page in Parms.cpp if trying to
access admin page and no pwd/ip and
cloud user support not enabled.
2014-12-08 09:47:38 -08:00
Matt Wells
5a844068fe fix cores in top tree with last commit. this one
speeds things up greatly. don't scan scoreinfo buf
for every docid we add to top tree if scoreinfobuf
has plenty of space. later we'll have to be more
clever about removing things from scoreinfobuf
if it comes down to that.
2014-12-08 09:29:21 -08:00
Matt Wells
4fbb2443b5 Revert "Revert "emergency fix so ppl can download large # objects in json""
This reverts commit aaa5b34126.
2014-12-08 07:40:44 -08:00
Matt Wells
aaa5b34126 Revert "emergency fix so ppl can download large # objects in json"
This reverts commit c692b54bfd.
2014-12-08 07:36:31 -08:00
Matt Wells
c692b54bfd emergency fix so ppl can download large # objects in json 2014-12-08 07:13:00 -08:00
mwells
2c5f6daca2 bring back posdb min files to merge again
so we can set high when quickly building index
2014-12-07 08:14:47 -07:00
Matt Wells
f3195c7eda make gb start do keepalive start again 2014-12-06 13:26:45 -08:00
Matt Wells
048fbfe60f fix gb start cmd 2014-12-06 13:19:13 -08:00
Matt Wells
b51d19a88c Merge branch 'diffbot-testing' into diffbot 2014-12-06 13:09:55 -08:00
Matt Wells
559ef067c5 fix core from langid too big
in pageresults.cpp
2014-12-06 13:09:30 -08:00
Matt Wells
840ca9b091 Merge branch 'diffbot-testing' into diffbot 2014-12-06 11:16:30 -08:00
Matt Wells
b38cc19a40 dont print query term info unless header bit set 2014-12-06 09:36:57 -08:00
Matt
41c8817bdb fixed summary initialization error
of the flags buffer.
fixed term freq algo. use exact term freq
for qatest123. made Summary.o -O3 again.
fix gbsystem() to disable both timers.
2014-12-06 10:14:48 -07:00
Matt
01d61d5427 remove type long, replace with int32_t 2014-12-05 08:55:22 -07:00
Matt
76c32bb741 indentation cleanups 2014-12-05 08:54:27 -07:00
Gigablast
2f43eb828d Merge pull request #35 from emmanuelcharon/diffbot-testing
modified hopcount computation for custom crawls
2014-12-05 08:53:22 -07:00
mwells
7c57283b88 fix tld lang url filter. was being reset. 2014-12-04 14:34:08 -07:00
mwells
090c18f59d show how long it took in html serps 2014-12-04 14:22:40 -07:00
mwells
2ccb4f5c69 show spider req/repl sizes added to spiderdb in logIt().
make 'make' by itself work on 32-bit archs again.
2014-12-04 13:57:42 -07:00
Matt Wells
2021919d8c if its a diffbot crawl/bulk job then do not use
linkdb to save disk space.
2014-12-04 13:25:10 -07:00
Matt Wells
d825e64d3b on bad hint offset do not core, just return corrupt data errno. 2014-12-04 12:15:58 -08:00
Matt Wells
fd33997716 print query info in json, too, not just xml 2014-12-04 13:03:18 -07:00
Matt Wells
dc306858cc nomenclature change 2014-12-04 11:02:54 -07:00
Matt Wells
0331363893 show language query synonym terms came from
in the xml/json feed.
2014-12-04 10:57:01 -07:00
Matt Wells
832392887c do not spam the logs with spider request corrupt count msgs.
but store a count for them now in coll rec.
2014-12-04 10:00:13 -07:00
mwells
a7462ed1f4 fix injection stuff 2014-12-04 09:29:17 -07:00
mwells
8157c5be14 added 'gb dstart' cmd 2014-12-03 19:02:08 -07:00
Matt Wells
4894bf51ce fix core 2014-12-03 14:08:18 -08:00
mwells
ca8194d9b0 when rebuilding posdb do not rebuild for spiderdb 2014-12-03 11:32:22 -07:00