Matt
f4ca6d8cd4
try domain only urls with www. when looking up
in sitelinks.txt
2015-01-31 15:33:37 -07:00
Matt
cad1d3d076
added support for sitelinks.txt file
2015-01-31 15:18:06 -07:00
Matt
1ef3932b32
use ./gb dump z main 0 -1 1 to generate sitelinks.txt
2015-01-25 18:45:40 -07:00
mwells
87285ba3cd
use gbmemcpy not memcpy so we can get profiler working again
since memcpy can't be interrupted and backtrace() called.
2015-01-13 12:25:42 -07:00
Matt Wells
7d67f104fb
emergency fixes
2014-12-11 08:39:26 -08:00
Matt
0460335861
more permission system updates
2014-12-08 09:49:17 -08:00
Matt
adcef39376
Merge branch 'diffbot-testing' into diffbot-matt
...
Conflicts:
Collectiondb.cpp
Collectiondb.h
Conf.cpp
Conf.h
Msg39.cpp
PageEvents.cpp
PageResults.cpp
PageTurk.cpp
Pages.cpp
Parms.cpp
Posdb.cpp
Proxy.cpp
Query.cpp
Query.h
RdbBase.cpp
RdbMap.cpp
Repair.cpp
Repair.h
SafeBuf.cpp
Spider.cpp
Tagdb.cpp
TopTree.cpp
XmlDoc.cpp
main.cpp
2014-11-20 16:53:07 -08:00
Matt
4e8a42e024
text replacements for bad int32_t substitutions
2014-11-17 18:24:38 -08:00
Matt
931a1c4bc6
good checkpoint. quite a few fixes.
2014-11-17 18:13:36 -08:00
Matt
69ef3c14ef
fixes for repair/rebuild functionality.
more to come.
2014-11-13 13:04:28 -08:00
Matt
4c19453ea9
working with -m32 for basic testing.
compiles for 64-bit.
2014-11-12 11:38:37 -08:00
Matt
96b8197ad3
now it compiles with -m32
2014-11-10 14:45:11 -08:00
Matt Wells
e7dd8f7956
replace long long with int64_t
2014-10-30 13:36:39 -06:00
Matt Wells
789bb73dd3
when docid is banned do not print json/xml
...
cruft in the serps. was causing json
parsing errors.
2014-09-19 07:27:33 -07:00
mwells
7f622bd416
fixes for cloud support.
2014-08-31 16:23:11 -07:00
Matt Wells
5d3fd80063
make it so we can dump tagdb to a wget-table
list of urls to re-add tags to another tagdb.
2014-08-23 07:29:40 -07:00
Matt Wells
d6434191d1
nomenclature changes to reduce collisions.
...
name collection 'qatest123' for doing smoke tests,
not 'test'.
2014-03-31 15:02:17 -07:00
Matt Wells
edbd61b0c5
thread fixes. if pthread_create fails then
keep thread queue and just return. will try to
relaunch later. do not count delete keys towards
shard rebalance count.
2014-03-15 20:07:02 -07:00
Matt Wells
bd4484db3c
Merge branch 'testing' into diffbot-testing
2014-03-10 12:08:23 -07:00
Matt Wells
e351d2a6f1
get searching on token working
2014-03-06 17:01:41 -08:00
Matt Wells
27e8e810d2
use collnum instead of coll string.
more stable since resetting collections
keeps string the same but changes the collnum.
2014-03-06 15:48:11 -08:00
Matt Wells
c9ef525338
code checkpoint
2014-02-09 12:55:45 -07:00
Matt Wells
b6c3ecc20e
more formatting
2014-01-19 11:56:36 -08:00
Matt Wells
fe3a879758
formatting changes
2014-01-19 00:38:02 -08:00
Matt Wells
4606e88721
code cleanups.
...
xmldoc::injectDoc(), and it'll
add a SpiderRequest as well.
better collectiondb init code.
2014-01-18 21:19:26 -08:00
Matt Wells
8c4ac3c514
Merge branch 'master' into diffbot
2014-01-17 20:17:40 -08:00
Matt Wells
dde05446f5
sharding fixes for 3+ stripes.
2014-01-16 11:20:12 -07:00
Matt Wells
8a49e87a61
got code with shard rebalancing compiling.
now we store a "sharded by termid" bit in posdb
key for checksums, etc keys that are not sharded
by docid. save having to do disk seeks on every
host in the cluster to do a dup check, etc.
2014-01-11 16:08:42 -08:00
Matt Wells
f64b53bfb3
almost done with rebalancing code
2014-01-10 14:12:58 -08:00
Matt Wells
1b5057ad42
log cleanups mostly.
...
took out disk page cache,
kinda buggy... need to fix at some point.
2013-12-18 10:57:18 -08:00
mwells
76bb3d05e1
clean up logging so i can see what's going on
2013-12-10 16:41:30 -08:00
Matt Wells
9f1d79b124
check for null collrec
2013-12-02 10:13:19 -08:00
Matt Wells
fe97e08281
move from groups to shards. got rid of annoying
groupid bit mask thing.
2013-10-04 16:18:56 -07:00
Matt Wells
f6e560c1f4
Initial file population.
2013-08-02 13:12:24 -07:00