Matt Wells
d19ee6ceea
Merge branch 'diffbot' into diffbot-testing
Conflicts:
Collectiondb.h
2014-12-11 08:40:55 -08:00
Matt Wells
7d67f104fb
emergency fixes
2014-12-11 08:39:26 -08:00
Matt
27df9a4276
link text extraction fixes
2014-12-11 06:52:14 -08:00
mwells
4f71a95da5
reinstate linkdb min files to merge parm.
2014-12-11 07:20:15 -07:00
Matt
e43365fc70
more pthread_t pid_t fixes
2014-12-10 14:06:17 -08:00
Matt
08169a5562
makefile minor change
2014-12-10 13:29:59 -08:00
Matt
feed7d5b3c
pthread_t pid_t compatibility fixes
2014-12-10 13:15:26 -08:00
Matt
619e980a97
try to fix pthread_t pid_t issues on Threads.cpp
2014-12-10 12:13:25 -08:00
Matt
329f004e74
compiler updates
2014-12-10 12:09:04 -08:00
Matt
d8ba619df3
makefile updates
2014-12-10 11:53:46 -08:00
Matt
1de49af4e6
add query term info into json output as well
2014-12-10 11:27:30 -08:00
Matt
44eddd63e8
fix signed/unsigned bug
2014-12-10 11:04:37 -08:00
Matt Wells
6b2e714964
try to fix core when generating statsdb graph
2014-12-10 11:01:50 -08:00
Matt Wells
c96b24f39d
try to fix core on #0 and #16 from empty query.
if empty query or n<=0 and &stream=1 then fix bug
that was not sending back the reply properly.
basically, disable streaming if msg40 would not block.
upped MAX_SHARDS from 128 to 1024. should not take up
any more mem really or slow things down.
2014-12-10 10:44:01 -08:00
Matt Wells
febb1d4658
print pretty floats in the facets menu,
whether printing a single float or a range
of floats.
2014-12-09 17:17:12 -08:00
Matt Wells
720517c2f5
fix facet range lists
2014-12-09 16:51:14 -08:00
Matt Wells
d0bed16be5
fix typo in syntax.html page
2014-12-09 14:15:00 -08:00
Matt Wells
b218bc403d
fix atotime1() output on json "date": field to
restrict to 32-bit min/max for time_t's that are beyond 32 bits.
so we truncate to min/max. later: add another termlist to add
more date coverage. would be useful for searching for big numeric
ranges, too, more than 32-bits.
2014-12-09 13:40:34 -08:00
Matt
dfce03eca8
fix printing of "next 10" link
2014-12-08 09:55:16 -08:00
Matt
0460335861
more permission system updates
2014-12-08 09:49:17 -08:00
Matt
2670cfd2f0
Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing
2014-12-08 09:48:51 -08:00
Matt
59c4db704a
gave pwd/ip security text areas fewer rows.
send empty page in Parms.cpp if trying to
access admin page and no pwd/ip and
cloud user support not enabled.
2014-12-08 09:47:38 -08:00
Matt Wells
5a844068fe
fix cores in top tree with last commit. this one
speeds things up greatly. don't scan scoreinfo buf
for every docid we add to top tree if scoreinfobuf
has plenty of space. later we'll have to be more
clever about removing things from scoreinfobuf
if it comes down to that.
2014-12-08 09:29:21 -08:00
Matt Wells
4fbb2443b5
Revert "Revert "emergency fix so ppl can download large # objects in json""
This reverts commit aaa5b34126.
2014-12-08 07:40:44 -08:00
Matt Wells
aaa5b34126
Revert "emergency fix so ppl can download large # objects in json"
This reverts commit c692b54bfd.
2014-12-08 07:36:31 -08:00
Matt Wells
c692b54bfd
emergency fix so ppl can download large # objects in json
2014-12-08 07:13:00 -08:00
mwells
2c5f6daca2
bring back posdb min files to merge again
so we can set it high when quickly building the index
2014-12-07 08:14:47 -07:00
Matt Wells
f3195c7eda
make gb start do keepalive start again
2014-12-06 13:26:45 -08:00
Matt Wells
048fbfe60f
fix gb start cmd
2014-12-06 13:19:13 -08:00
Matt Wells
b51d19a88c
Merge branch 'diffbot-testing' into diffbot
2014-12-06 13:09:55 -08:00
Matt Wells
559ef067c5
fix core from langid too big
in pageresults.cpp
2014-12-06 13:09:30 -08:00
Matt Wells
840ca9b091
Merge branch 'diffbot-testing' into diffbot
2014-12-06 11:16:30 -08:00
Matt Wells
b38cc19a40
don't print query term info unless header bit set
2014-12-06 09:36:57 -08:00
Matt
41c8817bdb
fixed summary initialization error
of the flags buffer.
fixed term freq algo. use exact term freq
for qatest123. made Summary.o -O3 again.
fix gbsystem() to disable both timers.
2014-12-06 10:14:48 -07:00
Matt
01d61d5427
remove type long, replace with int32_t
2014-12-05 08:55:22 -07:00
Matt
76c32bb741
indentation cleanups
2014-12-05 08:54:27 -07:00
Gigablast
2f43eb828d
Merge pull request #35 from emmanuelcharon/diffbot-testing
modified hopcount computation for custom crawls
2014-12-05 08:53:22 -07:00
mwells
7c57283b88
fix tld lang url filter. was being reset.
2014-12-04 14:34:08 -07:00
mwells
090c18f59d
show how long it took in html serps
2014-12-04 14:22:40 -07:00
mwells
2ccb4f5c69
show spider req/repl sizes added to spiderdb in logIt().
make 'make' by itself work on 32-bit archs again.
2014-12-04 13:57:42 -07:00
Matt Wells
2021919d8c
if it's a diffbot crawl/bulk job then do not use
linkdb to save disk space.
2014-12-04 13:25:10 -07:00
Matt Wells
d825e64d3b
on bad hint offset do not core, just return corrupt data errno.
2014-12-04 12:15:58 -08:00
Matt Wells
fd33997716
print query info in json, too, not just xml
2014-12-04 13:03:18 -07:00
Matt Wells
dc306858cc
nomenclature change
2014-12-04 11:02:54 -07:00
Matt Wells
0331363893
show language query synonym terms came from
in the xml/json feed.
2014-12-04 10:57:01 -07:00
Matt Wells
832392887c
do not spam the logs with spider request corrupt count msgs.
but store a count for them now in coll rec.
2014-12-04 10:00:13 -07:00
mwells
a7462ed1f4
fix injection stuff
2014-12-04 09:29:17 -07:00
mwells
8157c5be14
added 'gb dstart' cmd
2014-12-03 19:02:08 -07:00
Matt Wells
4894bf51ce
fix core
2014-12-03 14:08:18 -08:00
mwells
ca8194d9b0
when rebuilding posdb do not rebuild for spiderdb
2014-12-03 11:32:22 -07:00