Nov 20 2017 -- A distributed open-source search engine and spider/crawler written in C/C++ for Linux on Intel/AMD. From gigablast dot com, which offers binaries for download. See the README.md file at the very bottom of this page for instructions.
antiword-dir Initial file population. 2013-08-02 13:12:24 -07:00
diffbot-widget widget updates 2014-04-21 09:21:28 -07:00
html faq.html updates 2014-09-23 21:34:21 -07:00
openssl we already include our own 32-bit 2013-09-15 18:25:49 -06:00
ucdata Initial file population. 2013-08-02 13:12:24 -07:00
Abbreviations.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Abbreviations.h change a couple of possible reserved names in C++ 2013-08-28 22:59:01 -06:00
Accessdb.cpp fix addColl() logic for collectionless rdbs 2013-10-16 14:38:09 -07:00
Accessdb.h Initial file population. 2013-08-02 13:12:24 -07:00
Address.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
Address.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
addtest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Ads.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Ads.h move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
AdultBit.cpp Initial file population. 2013-08-02 13:12:24 -07:00
AdultBit.h Initial file population. 2013-08-02 13:12:24 -07:00
animate.cpp Initial file population. 2013-08-02 13:12:24 -07:00
antiword fix ulimit and antiword bugs 2014-06-18 04:06:20 -07:00
AutoBan.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
AutoBan.h Initial file population. 2013-08-02 13:12:24 -07:00
badcattable.dat Initial file population. 2013-08-02 13:12:24 -07:00
BigFile.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
BigFile.h make code compile cleaner. 2014-06-07 14:11:12 -07:00
Bits.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Bits.h Initial file population. 2013-08-02 13:12:24 -07:00
blaster2.cpp more core fixes. more stability. 2014-07-16 12:52:51 -07:00
Blaster.cpp for json docs only give them a single 2014-01-25 08:17:38 -08:00
Blaster.h use ./gb blaster -u <fileofurls> to just inject urls, 2013-08-19 16:33:27 -06:00
bmptopnm Initial file population. 2013-08-02 13:12:24 -07:00
Cachedb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Cachedb.h Initial file population. 2013-08-02 13:12:24 -07:00
camsort.cpp Initial file population. 2013-08-02 13:12:24 -07:00
catcountry.dat Initial file population. 2013-08-02 13:12:24 -07:00
Catdb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Catdb.h move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
Categories.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
Categories.h documentation updates. fixed sd=0. 2013-10-13 14:24:41 -07:00
CatRec.cpp fix a couple catdb generation bugs. 2013-10-12 20:33:04 -07:00
CatRec.h Initial file population. 2013-08-02 13:12:24 -07:00
changelog version update 2014-09-26 08:52:09 -06:00
character-sets Initial file population. 2013-08-02 13:12:24 -07:00
check_unicode.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Clusterdb.cpp thread fixes. if pthread_create fails then 2014-03-15 20:07:02 -07:00
Clusterdb.h Initial file population. 2013-08-02 13:12:24 -07:00
Collectiondb.cpp import fixes 2014-09-25 20:48:34 -07:00
Collectiondb.h added floater coll override switch. 2014-09-26 21:28:04 -07:00
Conf.cpp save conf files safely to disk so we don't 2014-07-29 10:02:43 -07:00
Conf.h more support for cloud initiative 2014-08-31 21:55:27 -07:00
control.deb package bldg updates 2014-06-16 21:50:32 -06:00
convert.cpp Initial file population. 2013-08-02 13:12:24 -07:00
copyright.head package bldg updates 2014-06-16 21:50:32 -06:00
copyright.tail package bldg updates 2014-06-16 21:50:32 -06:00
CountryCode.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
CountryCode.h fix pagecrawlbot.cpp to support &c=token-name. 2014-01-22 23:40:38 -08:00
create_ucd_tables.cpp Initial file population. 2013-08-02 13:12:24 -07:00
DailyMerge.cpp Fixed some bugs. 2013-08-09 08:52:15 -07:00
DailyMerge.h move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
DataFeed.cpp Initial file population. 2013-08-02 13:12:24 -07:00
DataFeed.h move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
Datedb.cpp more core stability fixes. prevent core dumps 2014-07-16 12:07:39 -07:00
Datedb.h more core stability fixes. prevent core dumps 2014-07-16 12:07:39 -07:00
Dates.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
Dates.h fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
Diff.cpp Fixed some bugs. 2013-08-09 08:52:15 -07:00
Diff.h Initial file population. 2013-08-02 13:12:24 -07:00
Dir.cpp use gbsystem() not system() so it can turn off alarms 2014-09-11 05:01:55 -07:00
Dir.h fix file descriptor leak in Dir class. 2013-11-19 13:41:56 -08:00
DiskPageCache.cpp hacked up to debug why we're not getting 2014-08-27 10:37:03 -07:00
DiskPageCache.h tons of changes from live github on neo. 2014-01-17 21:01:43 -08:00
dlstubs.c Initial file population. 2013-08-02 13:12:24 -07:00
dmozparse.cpp fix dmoz building. 2014-07-05 22:20:15 -07:00
Dns.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Dns.h fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
DnsProtocol.h fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
dnstest.cpp nomenclature changes to reduce collissions. 2014-03-31 15:02:17 -07:00
Domains.cpp move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
Domains.h Initial file population. 2013-08-02 13:12:24 -07:00
dumpcore.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Entities.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Entities.h Initial file population. 2013-08-02 13:12:24 -07:00
Errno.cpp rename admin.html to faq.html etc. file juggling. 2014-08-31 09:51:21 -07:00
Errno.h Merge branch 'testing' into diffbot-matt 2014-07-10 10:06:55 -07:00
errnotest.cpp errno test update 2013-11-19 00:10:10 -07:00
Events.h Initial file population. 2013-08-02 13:12:24 -07:00
Facebook.cpp fix addColl() logic for collectionless rdbs 2013-10-16 14:38:09 -07:00
Facebook.h fixes for page inject 2014-06-15 08:26:27 -07:00
fastIndexTable.cpp Initial file population. 2013-08-02 13:12:24 -07:00
fctypes.cpp Merge branch 'testing' into diffbot-testing 2014-08-29 11:23:13 -07:00
fctypes.h fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
File.cpp cygwin fixes 2014-09-26 23:04:16 -07:00
File.h tons of changes from live github on neo. 2014-01-17 21:01:43 -08:00
filterquerylogs.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Flags.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Flags.h Initial file population. 2013-08-02 13:12:24 -07:00
gb-1.0.spec make it so we don't need --nodeps with 2014-05-25 22:08:46 -04:00
gb-include.h compiler cleanups for cygwin compile 2014-06-07 14:20:04 -07:00
gb.deb.rules if netpbm pkg already installed use it. 2014-07-06 09:54:28 -07:00
gb.pem so we have spider https sites add 2013-10-13 00:15:39 -07:00
gbfilter.cpp Initial file population. 2013-08-02 13:12:24 -07:00
gbtitletest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
geneaology.cpp Initial file population. 2013-08-02 13:12:24 -07:00
generateSuperMergeCode.cpp Initial file population. 2013-08-02 13:12:24 -07:00
geo_ip_table.cpp Initial file population. 2013-08-02 13:12:24 -07:00
geo_ip_table.h Initial file population. 2013-08-02 13:12:24 -07:00
GeoIP_internal.h Initial file population. 2013-08-02 13:12:24 -07:00
GeoIP.c Initial file population. 2013-08-02 13:12:24 -07:00
GeoIP.h Initial file population. 2013-08-02 13:12:24 -07:00
GeoIPCity.c Initial file population. 2013-08-02 13:12:24 -07:00
GeoIPCity.h Initial file population. 2013-08-02 13:12:24 -07:00
getsample.cpp Initial file population. 2013-08-02 13:12:24 -07:00
giftopnm Initial file population. 2013-08-02 13:12:24 -07:00
hash.cpp undo hashtab change. too much overhead. 2014-09-27 08:39:22 -07:00
hash.h get "&site=abc.com+xyz.com"... working to restrict 2013-09-15 20:16:48 -07:00
HashTable.cpp fix core from last push. 2013-12-09 14:21:46 -07:00
HashTable.h fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
HashTableT.cpp Initial file population. 2013-08-02 13:12:24 -07:00
HashTableT.h Initial file population. 2013-08-02 13:12:24 -07:00
HashTableX.cpp fix floater bug from reading hashtable off disk. 2014-09-26 15:30:42 -07:00
HashTableX.h fix core from too many facet strs 2014-09-21 09:26:13 -07:00
hashtest2.cpp Initial file population. 2013-08-02 13:12:24 -07:00
hashtest3.cpp Initial file population. 2013-08-02 13:12:24 -07:00
hashtest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Highlight.cpp Merge branch 'master' into diffbot 2013-12-07 11:34:26 -07:00
Highlight.h trying to fix json decoding bug. 2013-10-24 17:55:01 -07:00
Hostdb.cpp use gbsystem() not system() so it can turn off alarms 2014-09-11 05:01:55 -07:00
Hostdb.h fix core 2014-09-19 14:00:57 -06:00
hosts.cpp Initial file population. 2013-08-02 13:12:24 -07:00
HttpMime.cpp fix time/date core 2014-09-22 07:00:10 -07:00
HttpMime.h Merge branch 'testing' into diffbot-matt 2014-06-13 11:00:09 -07:00
HttpRequest.cpp more support for cloud initiative 2014-08-31 21:55:27 -07:00
HttpRequest.h fixes for cloud support. 2014-08-31 16:23:11 -07:00
HttpServer.cpp print chrome on other pages 2014-09-23 20:59:48 -07:00
HttpServer.h added http server compression (gzip) stats. 2014-09-26 11:06:38 -07:00
iana_charset.cpp merge diffbot-testing 2014-04-09 20:10:30 -07:00
iana_charset.h merge diffbot-testing 2014-04-09 20:10:30 -07:00
iconv.h Initial file population. 2013-08-02 13:12:24 -07:00
Images.cpp multiple core fixes 2014-09-22 07:07:40 -07:00
Images.h support og:image images. allow user to 2014-07-04 15:33:27 -07:00
Indexdb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Indexdb.h retry if too man docids deduped when &stream=1 2014-05-01 17:07:31 -07:00
IndexList.cpp Initial file population. 2013-08-02 13:12:24 -07:00
IndexList.h Initial file population. 2013-08-02 13:12:24 -07:00
IndexReadInfo.cpp Initial file population. 2013-08-02 13:12:24 -07:00
IndexReadInfo.h Initial file population. 2013-08-02 13:12:24 -07:00
IndexTable2.cpp move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
IndexTable2.h Initial file population. 2013-08-02 13:12:24 -07:00
IndexTable.cpp Initial file population. 2013-08-02 13:12:24 -07:00
IndexTable.h Initial file population. 2013-08-02 13:12:24 -07:00
init.gb.conf minor make install changes 2014-05-22 18:46:38 -07:00
injectme3 added injectme3 file and documentation into compare.html 2013-08-17 11:02:26 -06:00
injector.cpp Initial file population. 2013-08-02 13:12:24 -07:00
iostream.h Initial file population. 2013-08-02 13:12:24 -07:00
ip.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
ip.h Initial file population. 2013-08-02 13:12:24 -07:00
ipconfig.cpp fixed some cores. brought in fixes from 2013-09-08 16:16:13 -06:00
Iso8859.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Iso8859.h Initial file population. 2013-08-02 13:12:24 -07:00
jointest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
jpegtopnm Initial file population. 2013-08-02 13:12:24 -07:00
Json.cpp multiple core fixes 2014-09-22 07:07:40 -07:00
Json.h v3 support for tokenized diffbot replies 2014-05-12 16:13:24 -07:00
keepalive.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Lang.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
Lang.h when user searches for a word without the 2014-06-01 09:37:00 -07:00
LangList.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
LangList.h Initial file population. 2013-08-02 13:12:24 -07:00
Language.cpp use gbsystem() not system() so it can turn off alarms 2014-09-11 05:01:55 -07:00
Language.h Initial file population. 2013-08-02 13:12:24 -07:00
LanguageIdentifier.cpp more minor bug fixes. 2014-08-28 18:11:07 -07:00
LanguageIdentifier.h Initial file population. 2013-08-02 13:12:24 -07:00
LanguagePages.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
LanguagePages.h Initial file population. 2013-08-02 13:12:24 -07:00
libc.a Initial file population. 2013-08-02 13:12:24 -07:00
libcrypto.a turn off hearbeats when compiling openssl libs 2014-04-22 16:39:40 -07:00
libgcc.a Initial file population. 2013-08-02 13:12:24 -07:00
libiconv.a Initial file population. 2013-08-02 13:12:24 -07:00
libiconv.la Initial file population. 2013-08-02 13:12:24 -07:00
libjpeg.so.62 thumbnail generation support back in. 2014-04-24 10:13:45 -07:00
libm.a Initial file population. 2013-08-02 13:12:24 -07:00
libnetpbm.so.10 thumbnail generation support back in. 2014-04-24 10:13:45 -07:00
libpng12.so.0 thumbnail generation support back in. 2014-04-24 10:13:45 -07:00
libpthread.a Initial file population. 2013-08-02 13:12:24 -07:00
libssl.a turn off hearbeats when compiling openssl libs 2014-04-22 16:39:40 -07:00
libstdc++.a Initial file population. 2013-08-02 13:12:24 -07:00
libtiff.so.4 thumbnail generation support back in. 2014-04-24 10:13:45 -07:00
libz.a Initial file population. 2013-08-02 13:12:24 -07:00
libz.so.1 thumbnail generation support back in. 2014-04-24 10:13:45 -07:00
LICENSE license fix 2014-06-16 13:52:51 -07:00
Linkdb.cpp Merge branch 'master' into testing 2014-09-20 08:26:38 -06:00
Linkdb.h fixes for new link info code so it doesn't 2014-02-25 10:55:05 -08:00
LinkedList.h Initial file population. 2013-08-02 13:12:24 -07:00
linkspam.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
linkspam.h Initial file population. 2013-08-02 13:12:24 -07:00
Log.cpp try to fix msg22 based cores 2014-05-14 07:46:32 -07:00
Log.h keep thumbnail gen msgs in the log file 2014-07-04 08:34:42 -07:00
Loop.cpp sigalrm fixes 2014-09-12 02:42:00 -07:00
Loop.h more fixes for inner loop code 2014-09-11 05:56:47 -07:00
looptest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
main.cpp various bug fixes. more qa tests. 2014-09-24 20:03:16 -07:00
Make.depend force gb to recompile version every time 2014-09-19 12:23:40 -07:00
Makefile cygwin fixes 2014-09-26 23:04:16 -07:00
malloc.c Initial file population. 2013-08-02 13:12:24 -07:00
matches2.cpp dirty word detector revisions. we need 2013-10-16 20:19:49 -07:00
matches2.h renamed matches.h and matches.cpp to 2013-10-01 07:58:24 -07:00
Matches.cpp qa test fixes 2014-07-15 10:06:33 -07:00
Matches.h Initial file population. 2013-08-02 13:12:24 -07:00
Mem.cpp fix printf compiler warnings 2014-08-28 13:23:46 -07:00
Mem.h fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
membustest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
MemPool.cpp Initial file population. 2013-08-02 13:12:24 -07:00
MemPool.h Initial file population. 2013-08-02 13:12:24 -07:00
MemPoolTree.cpp Initial file population. 2013-08-02 13:12:24 -07:00
MemPoolTree.h Initial file population. 2013-08-02 13:12:24 -07:00
memtest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
mergetest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
MetaContainer.cpp Initial file population. 2013-08-02 13:12:24 -07:00
MetaContainer.h Initial file population. 2013-08-02 13:12:24 -07:00
Mime.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Mime.h Initial file population. 2013-08-02 13:12:24 -07:00
mixfile.cpp Initial file population. 2013-08-02 13:12:24 -07:00
mmseg.h Initial file population. 2013-08-02 13:12:24 -07:00
monitor.cpp nomenclature changes to reduce collissions. 2014-03-31 15:02:17 -07:00
Monitordb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Monitordb.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg0.cpp retry if too man docids deduped when &stream=1 2014-05-01 17:07:31 -07:00
Msg0.h retry if too man docids deduped when &stream=1 2014-05-01 17:07:31 -07:00
Msg1.cpp ignore ENOCOLLREC msgs in handleRequest1() in Msg1.cpp. 2014-07-14 12:21:32 -07:00
Msg1.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg1f.cpp if logging to stderr then return err when trying to 2014-07-05 14:16:33 -07:00
Msg1f.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg2.cpp fix section stats display bugs 2014-07-10 15:55:18 -07:00
Msg2.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg2a.cpp Merge branch 'master' into diffbot 2013-10-16 14:28:42 -07:00
Msg2a.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg2b.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msg2b.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg3.cpp minor print fix 2014-04-26 13:41:08 -07:00
Msg3.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg3a.cpp facet text lookup fixes. 2014-07-29 19:32:27 -07:00
Msg3a.h shard gbfacetstr:gbxpathsitehash123456 terms by termid for speed. 2014-07-07 12:32:27 -07:00
Msg3e.cpp fix infinite loop from json parsing and 2013-09-27 17:52:36 -06:00
Msg3e.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg4.cpp get qa tests working again. 2014-09-23 17:48:40 -07:00
Msg4.h upped MAX_SPIDERS from 100 to 300. 2014-09-03 07:25:40 -07:00
Msg5.cpp fix inifinite loop when rebalancing. 2014-09-11 12:11:34 -07:00
Msg5.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg6b.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msg6b.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg8b.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
Msg8b.h Merge branch 'master' into diffbot 2013-10-16 14:28:42 -07:00
Msg9b.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg9b.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg13.cpp fix floater bug from reading hashtable off disk. 2014-09-26 15:30:42 -07:00
Msg13.h fix floater bug from reading hashtable off disk. 2014-09-26 15:30:42 -07:00
Msg17.cpp first compiled stab at multi collection searching. 2014-03-06 10:45:13 -08:00
Msg17.h first compiled stab at multi collection searching. 2014-03-06 10:45:13 -08:00
Msg20.cpp inject docs that come through our squid proxy 2014-07-09 12:25:23 -07:00
Msg20.h fixes for cloud support. 2014-08-31 16:23:11 -07:00
Msg22.cpp fix 2014-09-20 07:59:41 -06:00
Msg22.h try to fix msg22 core some more 2014-05-14 08:16:47 -07:00
Msg24.cpp new Make.depend. 2013-08-09 17:13:45 -06:00
Msg28.cpp fix core from (broad)casting valueless cgi field. 2013-10-03 14:51:59 -07:00
Msg28.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg30.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msg30.h move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
Msg35.cpp move from groups to shards. got rid of annoying 2013-10-04 16:18:56 -07:00
Msg35.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg36.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg36.h retry if too man docids deduped when &stream=1 2014-05-01 17:07:31 -07:00
Msg37.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg37.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg39.cpp facet text lookup fixes. 2014-07-29 19:32:27 -07:00
Msg39.h added langw and langwieght to control weight 2014-09-21 18:47:30 -07:00
Msg40.cpp get qa tests working again. 2014-09-23 17:48:40 -07:00
Msg40.h support facet ranges now like 2014-09-04 20:41:37 -07:00
Msg40Cache.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msg40Cache.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg42.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msg42.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg51.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg51.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msgaa.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msgaa.h Initial file population. 2013-08-02 13:12:24 -07:00
MsgC.cpp move from groups to shards. got rid of annoying 2013-10-04 16:18:56 -07:00
MsgC.h Initial file population. 2013-08-02 13:12:24 -07:00
Msge0.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msge0.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msge1.cpp fixed qa tests when doing it over multi-host cluster 2014-09-16 10:25:45 -06:00
Msge1.h Initial file population. 2013-08-02 13:12:24 -07:00
Multicast.cpp fix not shutting down bug 2014-07-16 13:00:16 -07:00
Multicast.h fix data import function some more. added qa test. 2014-09-24 12:40:39 -07:00
mysynonyms.txt Initial file population. 2013-08-02 13:12:24 -07:00
numwords.cpp Initial file population. 2013-08-02 13:12:24 -07:00
PageAddColl.cpp gigabot advice updates 2014-09-06 21:05:11 -07:00
PageAddUrl.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
PageBasic.cpp print chrome on other pages 2014-09-23 20:59:48 -07:00
PageCatdb.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
PageCrawlBot.cpp Merge branch 'diffbot-testing' into testing 2014-07-28 14:37:44 -07:00
PageCrawlBot.h more api updates 2014-07-13 09:35:44 -07:00
PageDirectory.cpp fix dmoz building. 2014-07-05 22:20:15 -07:00
PageEvents.cpp fixes for cloud support. 2014-08-31 16:23:11 -07:00
PageGet.cpp fixes for cloud support. 2014-08-31 16:23:11 -07:00
PageHosts.cpp do not space break version in hosts table 2014-09-19 12:29:48 -07:00
PageIndexdb.cpp fixes for cloud support. 2014-08-31 16:23:11 -07:00
PageInject.cpp import fixes 2014-09-25 20:48:34 -07:00
PageInject.h fix data import function some more. added qa test. 2014-09-24 12:40:39 -07:00
PageLogView.cpp new printadmintop functionality. 2014-02-07 23:08:04 -07:00
PageNetTest.cpp move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
PageNetTest.h Initial file population. 2013-08-02 13:12:24 -07:00
PageOverview.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
PageParser.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
PageParser.h more fixes for new boolean logic. 2014-03-13 13:09:33 -07:00
PagePerf.cpp took out pagecount table. just hafta scan 2014-01-19 20:34:38 -08:00
PageReindex.cpp added the query reindex smoke test. 2014-09-25 17:44:35 -07:00
PageReindex.h fix query reindex some more 2014-03-11 14:46:49 -07:00
PageResults.cpp Merge branch 'diffbot-testing' into testing 2014-09-26 14:25:08 -07:00
PageResults.h get html head and tail working again now. 2014-06-21 21:07:38 -07:00
PageRoot.cpp donations link 2014-09-26 08:12:32 -07:00
Pages.cpp added status count to collection nav bar key 2014-09-26 11:13:25 -07:00
Pages.h various bug fixes. more qa tests. 2014-09-24 20:03:16 -07:00
PageSockets.cpp put dropped requests in bold red 2014-09-04 11:01:49 -07:00
PageSpam.cpp Initial file population. 2013-08-02 13:12:24 -07:00
PageStats.cpp added http server compression (gzip) stats. 2014-09-26 11:06:38 -07:00
PageStatsdb.cpp fix core 2014-09-17 16:58:24 -06:00
PageSubmit.cpp Initial file population. 2013-08-02 13:12:24 -07:00
PageThesaurus.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
PageThreads.cpp formatting fixes 2014-01-19 00:57:20 -08:00
PageTitledb.cpp fixes for cloud support. 2014-08-31 16:23:11 -07:00
PageTurk.cpp fixes for cloud support. 2014-08-31 16:23:11 -07:00
PageTurk.h Initial file population. 2013-08-02 13:12:24 -07:00
Parms.cpp added floater coll override switch. 2014-09-26 21:28:04 -07:00
Parms.h get qa tests working again. 2014-09-23 17:48:40 -07:00
parse_iana_charsets.pl move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
pdftohtml try new pdftohtml binary 2014-09-26 08:02:17 -07:00
Phrases.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Phrases.h Initial file population. 2013-08-02 13:12:24 -07:00
PingServer.cpp force gb to recompile version every time 2014-09-19 12:23:40 -07:00
PingServer.h added emergency msg box on all admin pages 2014-01-11 20:14:44 -08:00
Placedb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Placedb.h Initial file population. 2013-08-02 13:12:24 -07:00
pngtopnm Initial file population. 2013-08-02 13:12:24 -07:00
pnmscale Initial file population. 2013-08-02 13:12:24 -07:00
Pops.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Pops.h Initial file population. 2013-08-02 13:12:24 -07:00
porter.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Pos.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Pos.h Initial file population. 2013-08-02 13:12:24 -07:00
Posdb.cpp added the query reindex smoke test. 2014-09-25 17:44:35 -07:00
Posdb.h various bug fixes. more qa tests. 2014-09-24 20:03:16 -07:00
postalCodes.txt Initial file population. 2013-08-02 13:12:24 -07:00
PostQueryRerank.cpp beginning of total parm overhaul. 2014-06-12 21:27:06 -07:00
PostQueryRerank.h Initial file population. 2013-08-02 13:12:24 -07:00
ppmtojpeg Initial file population. 2013-08-02 13:12:24 -07:00
Process.cpp get qa tests working again. 2014-09-23 17:48:40 -07:00
Process.h stage 1 import tool 2014-09-20 16:58:12 -07:00
Profiler.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
Profiler.h Initial file population. 2013-08-02 13:12:24 -07:00
Proxy.cpp fixes for cloud support. 2014-08-31 16:23:11 -07:00
Proxy.h Initial file population. 2013-08-02 13:12:24 -07:00
pstotext Initial file population. 2013-08-02 13:12:24 -07:00
qa.cpp added the query reindex smoke test. 2014-09-25 17:44:35 -07:00
QAClient.cpp Initial file population. 2013-08-02 13:12:24 -07:00
QAClient.h Initial file population. 2013-08-02 13:12:24 -07:00
quarantine.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Query.cpp added the query reindex smoke test. 2014-09-25 17:44:35 -07:00
Query.h added the query reindex smoke test. 2014-09-25 17:44:35 -07:00
Rdb.cpp use gbsystem() not system() so it can turn off alarms 2014-09-11 05:01:55 -07:00
Rdb.h fix # docs and recs bug. 2014-08-28 07:45:43 -07:00
RdbBase.cpp fix inifinite loop when rebalancing. 2014-09-11 12:11:34 -07:00
RdbBase.h fixed nasty bug of resetting RdbBases for 2014-06-09 10:16:29 -07:00
RdbBuckets.cpp log msg cleanups 2014-05-11 21:55:44 -07:00
RdbBuckets.h Initial file population. 2013-08-02 13:12:24 -07:00
RdbCache.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
RdbCache.h removed MAX_COLL_RECS so we can have unlimited 2013-08-30 16:20:38 -07:00
RdbDump.cpp fix dmoz building. 2014-07-05 22:20:15 -07:00
RdbDump.h fix dmoz building. 2014-07-05 22:20:15 -07:00
RdbList.cpp fix data corruption detection and repair bug. 2014-05-01 10:38:00 -07:00
RdbList.h checkpoint for faster spider code. 2014-02-04 16:15:27 -08:00
RdbMap.cpp fix core from keys out of order when dumping 2014-09-18 12:33:56 -06:00
RdbMap.h tons of changes from live github on neo. 2014-01-17 21:01:43 -08:00
RdbMem.cpp track down some nasty cores. fix 2013-10-29 16:37:14 -07:00
RdbMem.h now we can reset collection mid stream 2013-10-18 17:49:36 -07:00
RdbMerge.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
RdbMerge.h if coll is deleted or reset in a middle of a dump 2013-12-25 17:12:09 -08:00
RdbScan.cpp Initial file population. 2013-08-02 13:12:24 -07:00
RdbScan.h Initial file population. 2013-08-02 13:12:24 -07:00
rdbtest2.cpp Initial file population. 2013-08-02 13:12:24 -07:00
rdbtest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
RdbTree.cpp Merge branch 'diffbot-testing' into testing 2014-07-22 10:47:33 -07:00
RdbTree.h fix annoying rdbtree pos/neg key counting issue 2014-01-11 18:04:28 -08:00
README.md documentation updates 2014-09-12 03:09:06 -07:00
readRec.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Rebalance.cpp tuning the rebalance loop 2014-03-15 14:56:11 -07:00
Rebalance.h tight merge during rebalance to save 2014-03-14 23:37:30 -07:00
reindex2.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Repair.cpp try to start indexing spider replies 2014-05-09 11:18:24 -07:00
Repair.h Initial file population. 2013-08-02 13:12:24 -07:00
RequestTable.cpp Initial file population. 2013-08-02 13:12:24 -07:00
RequestTable.h Initial file population. 2013-08-02 13:12:24 -07:00
rescue.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Revdb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Revdb.h Initial file population. 2013-08-02 13:12:24 -07:00
rmbots.cpp Initial file population. 2013-08-02 13:12:24 -07:00
S99gb added S99gb for loading at boot. 2014-06-23 07:32:38 -06:00
SafeBuf.cpp fixes for cloud support. 2014-08-31 16:23:11 -07:00
SafeBuf.h fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
SafeList.h Initial file population. 2013-08-02 13:12:24 -07:00
Sanity.h Initial file population. 2013-08-02 13:12:24 -07:00
Scores.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Scores.h Initial file population. 2013-08-02 13:12:24 -07:00
Scraper.cpp take out datedb. no longer used. we store 2014-01-09 13:39:28 -08:00
Scraper.h misc/various bug fixes. 2014-08-28 18:07:22 -07:00
SearchInput.cpp fixes for cloud support. 2014-08-31 16:23:11 -07:00
SearchInput.h added langw and langwieght to control weight 2014-09-21 18:47:30 -07:00
Sections.cpp qa fixes 2014-08-02 09:07:33 -07:00
Sections.h do not hash redundant xpaths that have the same inner sentence/alnum 2014-07-09 17:16:01 -07:00
seektest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
seo.h Initial file population. 2013-08-02 13:12:24 -07:00
SiteGetter.cpp nomenclature changes to reduce collissions. 2014-03-31 15:02:17 -07:00
SiteGetter.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
sleepandlog.cpp Initial file population. 2013-08-02 13:12:24 -07:00
sort.cpp Initial file population. 2013-08-02 13:12:24 -07:00
sort.h Initial file population. 2013-08-02 13:12:24 -07:00
Speller.cpp work on make install. 2014-05-11 12:48:56 -07:00
Speller.h Initial file population. 2013-08-02 13:12:24 -07:00
Spider.cpp added the query reindex smoke test. 2014-09-25 17:44:35 -07:00
Spider.h raised MAX_SPIDERS from 100 to 300. watch out for oom though. 2014-09-03 07:26:17 -07:00
SpiderProxy.cpp added floater coll override switch. 2014-09-26 21:28:04 -07:00
SpiderProxy.h got new floater/proxy logic compiling. 2014-06-06 15:11:51 -07:00
Stats.cpp comment out unused code. make thread cleanups 2014-09-03 09:48:43 -07:00
Stats.h comment out unused code. make thread cleanups 2014-09-03 09:48:43 -07:00
Statsdb.cpp fix graph window some 2014-09-17 06:23:49 -07:00
Statsdb.h fix potential problem of tons of points in 2013-10-14 22:52:29 -07:00
StopWords.cpp update common word list 2013-12-01 15:19:33 -07:00
StopWords.h Initial file population. 2013-08-02 13:12:24 -07:00
streambuf.h Initial file population. 2013-08-02 13:12:24 -07:00
Strings.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Strings.h Initial file population. 2013-08-02 13:12:24 -07:00
Summary.cpp fix default summary m_displayLen bug 2014-07-04 10:55:46 -07:00
Summary.h get summary "ns" parm and collectionrec 2014-07-03 07:29:44 -07:00
superMergeTest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
supported_charsets.cpp Initial file population. 2013-08-02 13:12:24 -07:00
supported_charsets.txt Initial file population. 2013-08-02 13:12:24 -07:00
Syncdb.cpp Merge branch 'diffbot-testing' into diffbot-matt 2014-06-09 12:42:54 -07:00
Syncdb.h Initial file population. 2013-08-02 13:12:24 -07:00
Synonyms.cpp syn fix for 'sports' when lang is unknown. we default 2014-09-06 10:49:22 -07:00
Synonyms.h fix stack smash core. 2014-06-01 10:42:49 -07:00
Tagdb.cpp when docid is banned do not print json/xml 2014-09-19 07:27:33 -07:00
Tagdb.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
TcpServer.cpp more select polling fixes 2014-09-11 07:16:39 -07:00
TcpServer.h finally got http tunnel logic working. 2014-07-01 16:28:15 -06:00
TcpSocket.h add support for tunnelling https fetch 2014-07-01 10:43:52 -06:00
test2.cpp Initial file population. 2013-08-02 13:12:24 -07:00
test_convert.cpp Initial file population. 2013-08-02 13:12:24 -07:00
test_hash.cpp Initial file population. 2013-08-02 13:12:24 -07:00
test_norm.cpp Initial file population. 2013-08-02 13:12:24 -07:00
test_parser2.cpp Initial file population. 2013-08-02 13:12:24 -07:00
test_parser.cpp Initial file population. 2013-08-02 13:12:24 -07:00
test_unicode.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Test.cpp fix annoying bug when adding new parms. 2014-06-10 12:29:50 -07:00
Test.h Initial file population. 2013-08-02 13:12:24 -07:00
testfloats.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Tfndb.cpp code cleanups. 2014-01-18 21:19:26 -08:00
Tfndb.h code cleanups. 2014-01-18 21:19:26 -08:00
Thesaurus.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Thesaurus.h Initial file population. 2013-08-02 13:12:24 -07:00
Threads.cpp turn off images for qa tests. 2014-09-10 14:13:39 -07:00
Threads.h comment out unused code. make thread cleanups 2014-09-03 09:48:43 -07:00
threadtest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
thunder.cpp Initial file population. 2013-08-02 13:12:24 -07:00
tifftopnm Initial file population. 2013-08-02 13:12:24 -07:00
Timedb.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Timedb.h move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
Timer.h Initial file population. 2013-08-02 13:12:24 -07:00
Title.cpp qa test fixes 2014-07-15 10:06:33 -07:00
Title.h qa test fixes 2014-07-15 10:06:33 -07:00
Titledb.cpp thread fixes. if pthread_create fails then 2014-03-15 20:07:02 -07:00
Titledb.h code cleanups. 2014-01-18 21:19:26 -08:00
TopTree.cpp index numbers as integers too, not just floats 2014-02-06 20:57:54 -08:00
TopTree.h more fixes for new boolean logic. 2014-03-13 13:09:33 -07:00
treetest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
TuringTest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
TuringTest.h Initial file population. 2013-08-02 13:12:24 -07:00
Turkdb.cpp Initial file population. 2013-08-02 13:12:24 -07:00
types.h fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
UCNormalizer.cpp code cleanups. 2014-01-18 21:19:26 -08:00
UCNormalizer.h Initial file population. 2013-08-02 13:12:24 -07:00
UCPropTable.cpp Initial file population. 2013-08-02 13:12:24 -07:00
UCPropTable.h Initial file population. 2013-08-02 13:12:24 -07:00
UCWordIterator.cpp Initial file population. 2013-08-02 13:12:24 -07:00
UCWordIterator.h Initial file population. 2013-08-02 13:12:24 -07:00
UdpProtocol.h Initial file population. 2013-08-02 13:12:24 -07:00
UdpServer.cpp fix core from too many facet strs 2014-09-21 09:26:13 -07:00
UdpServer.h rebalancer working pretty well now 2014-01-15 19:08:47 -08:00
UdpSlot.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
UdpSlot.h hacked up to debug why we're not getting 2014-08-27 10:37:03 -07:00
udptest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Unicode.cpp when user searches for a word without the 2014-06-01 09:37:00 -07:00
Unicode.h new import code copiling. now needs runtime testing and 2014-09-20 20:12:28 -07:00
UnicodeProperties.cpp code cleanups. 2014-01-18 21:19:26 -08:00
UnicodeProperties.h Initial file population. 2013-08-02 13:12:24 -07:00
unifiedDict.txt Initial file population. 2013-08-02 13:12:24 -07:00
uniq2.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Url.cpp fix nyt.com cookie redir bug. 2014-08-05 17:04:11 -07:00
Url.h Initial file population. 2013-08-02 13:12:24 -07:00
urlinfo.cpp do not add crazy urls into spiderdb 2014-09-20 08:26:22 -06:00
Users.cpp redhat build updates on fedora 2014-05-25 09:58:07 -04:00
Users.h Initial file population. 2013-08-02 13:12:24 -07:00
ValidPointer.cpp Initial file population. 2013-08-02 13:12:24 -07:00
ValidPointer.h Initial file population. 2013-08-02 13:12:24 -07:00
Vector.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Vector.h Initial file population. 2013-08-02 13:12:24 -07:00
Version.cpp force gb to recompile version every time 2014-09-19 12:23:40 -07:00
Version.h makefile updates 2014-09-19 13:51:08 -06:00
Weights.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Weights.h Initial file population. 2013-08-02 13:12:24 -07:00
Wiki.cpp log msg cleanups 2014-05-11 21:55:44 -07:00
Wiki.h Initial file population. 2013-08-02 13:12:24 -07:00
wikititles.txt.part1 Initial file population. 2013-08-02 13:12:24 -07:00
wikititles.txt.part2 Initial file population. 2013-08-02 13:12:24 -07:00
wiktionary-buf.txt when user searches for a word without the 2014-06-01 09:37:00 -07:00
wiktionary-lang.txt when user searches for a word without the 2014-06-01 09:37:00 -07:00
wiktionary-syns.dat when user searches for a word without the 2014-06-01 09:37:00 -07:00
Wiktionary.cpp fix compiler bug 2014-06-16 11:10:38 -07:00
Wiktionary.h Initial file population. 2013-08-02 13:12:24 -07:00
Words.cpp fixed bugs in sort by prices, etc. 2013-11-11 18:58:45 -08:00
Words.h Initial file population. 2013-08-02 13:12:24 -07:00
Xml.cpp fix <script> tags that immediately end in </script> or 2014-07-14 17:24:20 -07:00
Xml.h for json docs only give them a single 2014-01-25 08:17:38 -08:00
XmlDoc.cpp added floater coll override switch. 2014-09-26 21:28:04 -07:00
XmlDoc.h print chrome on other pages 2014-09-23 20:59:48 -07:00
XmlNode.cpp Merge branch 'diffbot-testing' into testing 2014-07-14 18:10:13 -07:00
XmlNode.h fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
zconf.h Initial file population. 2013-08-02 13:12:24 -07:00
zlib.h Initial file population. 2013-08-02 13:12:24 -07:00

open-source-search-engine

An open source web and enterprise search engine, as can be seen at http://www.gigablast.com/ .

RUNNING GIGABLAST

See html/faq.html for all administrative documentation including the quick start instructions.

Alternatively, visit http://www.gigablast.com/admin.html

See html/compare.html for a comparison of Gigablast to SOLR. The comparison is very sparse right now, but it does include some useful commands.

CODE ARCHITECTURE

See html/developer.html for all code documentation.

Alternatively, visit http://www.gigablast.com/developer.html

CONTACT

Contact me at mattdwells@hotmail.com for feature requests or help in general. I will work for free for good use cases.