Nov 20 2017 -- A distributed open source search engine and spider/crawler written in C/C++ for Linux on Intel/AMD. From gigablast dot com, which has binaries for download. See the README.md file at the very bottom of this page for instructions.
Go to file
mwells 19feb94f29 imported new Loop.cpp using select() from
testing branch so ppl can use it on ubuntu 14.04
2014-09-02 22:22:39 -07:00
antiword-dir Initial file population. 2013-08-02 13:12:24 -07:00
diffbot-widget widget updates 2014-04-21 09:21:28 -07:00
html updated dotgdbinit 2014-08-27 15:04:11 -07:00
openssl we already include our own 32-bit 2013-09-15 18:25:49 -06:00
ucdata Initial file population. 2013-08-02 13:12:24 -07:00
Abbreviations.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Abbreviations.h change a couple of possible reserved names in C++ 2013-08-28 22:59:01 -06:00
Accessdb.cpp fix addColl() logic for collectionless rdbs 2013-10-16 14:38:09 -07:00
Accessdb.h Initial file population. 2013-08-02 13:12:24 -07:00
Address.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
Address.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
addtest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Ads.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Ads.h move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
AdultBit.cpp Initial file population. 2013-08-02 13:12:24 -07:00
AdultBit.h Initial file population. 2013-08-02 13:12:24 -07:00
animate.cpp Initial file population. 2013-08-02 13:12:24 -07:00
antiword fix ulimit and antiword bugs 2014-06-18 04:06:20 -07:00
AutoBan.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
AutoBan.h Initial file population. 2013-08-02 13:12:24 -07:00
badcattable.dat Initial file population. 2013-08-02 13:12:24 -07:00
BigFile.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
BigFile.h make code compile cleaner. 2014-06-07 14:11:12 -07:00
Bits.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Bits.h Initial file population. 2013-08-02 13:12:24 -07:00
blaster2.cpp more core fixes. more stability. 2014-07-16 12:52:51 -07:00
Blaster.cpp for json docs only give them a single 2014-01-25 08:17:38 -08:00
Blaster.h use ./gb blaster -u <fileofurls> to just inject urls, 2013-08-19 16:33:27 -06:00
bmptopnm Initial file population. 2013-08-02 13:12:24 -07:00
Cachedb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Cachedb.h Initial file population. 2013-08-02 13:12:24 -07:00
camsort.cpp Initial file population. 2013-08-02 13:12:24 -07:00
catcountry.dat Initial file population. 2013-08-02 13:12:24 -07:00
Catdb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Catdb.h move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
Categories.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
Categories.h documentation updates. fixed sd=0. 2013-10-13 14:24:41 -07:00
CatRec.cpp fix a couple catdb generation bugs. 2013-10-12 20:33:04 -07:00
CatRec.h Initial file population. 2013-08-02 13:12:24 -07:00
changelog makefile updates 2014-08-27 22:09:21 -06:00
character-sets Initial file population. 2013-08-02 13:12:24 -07:00
check_unicode.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Clusterdb.cpp thread fixes. if pthread_create fails then 2014-03-15 20:07:02 -07:00
Clusterdb.h Initial file population. 2013-08-02 13:12:24 -07:00
Collectiondb.cpp Merge branch 'diffbot-testing' into testing 2014-08-15 17:05:22 -07:00
Collectiondb.h Merge branch 'diffbot-testing' into testing 2014-07-15 19:30:34 -07:00
Conf.cpp save conf files safely to disk so we don't 2014-07-29 10:02:43 -07:00
Conf.h Merge branch 'diffbot-testing' into testing 2014-08-15 17:05:22 -07:00
control.deb package bldg updates 2014-06-16 21:50:32 -06:00
convert.cpp Initial file population. 2013-08-02 13:12:24 -07:00
copyright.head package bldg updates 2014-06-16 21:50:32 -06:00
copyright.tail package bldg updates 2014-06-16 21:50:32 -06:00
CountryCode.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
CountryCode.h fix pagecrawlbot.cpp to support &c=token-name. 2014-01-22 23:40:38 -08:00
create_ucd_tables.cpp Initial file population. 2013-08-02 13:12:24 -07:00
DailyMerge.cpp Fixed some bugs. 2013-08-09 08:52:15 -07:00
DailyMerge.h move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
DataFeed.cpp Initial file population. 2013-08-02 13:12:24 -07:00
DataFeed.h move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
Datedb.cpp more core stability fixes. prevent core dumps 2014-07-16 12:07:39 -07:00
Datedb.h more core stability fixes. prevent core dumps 2014-07-16 12:07:39 -07:00
Dates.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
Dates.h fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
Diff.cpp Fixed some bugs. 2013-08-09 08:52:15 -07:00
Diff.h Initial file population. 2013-08-02 13:12:24 -07:00
Dir.cpp more parmdb fixes 2013-12-16 15:39:24 -08:00
Dir.h fix file descriptor leak in Dir class. 2013-11-19 13:41:56 -08:00
DiskPageCache.cpp hacked up to debug why we're not getting 2014-08-27 10:37:03 -07:00
DiskPageCache.h tons of changes from live github on neo. 2014-01-17 21:01:43 -08:00
dlstubs.c Initial file population. 2013-08-02 13:12:24 -07:00
dmozparse.cpp fix dmoz building. 2014-07-05 22:20:15 -07:00
Dns.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Dns.h fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
DnsProtocol.h fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
dnstest.cpp nomenclature changes to reduce collissions. 2014-03-31 15:02:17 -07:00
Domains.cpp move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
Domains.h Initial file population. 2013-08-02 13:12:24 -07:00
dumpcore.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Entities.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Entities.h Initial file population. 2013-08-02 13:12:24 -07:00
Errno.cpp Merge branch 'testing' into diffbot-matt 2014-07-10 10:06:55 -07:00
Errno.h Merge branch 'testing' into diffbot-matt 2014-07-10 10:06:55 -07:00
errnotest.cpp errno test update 2013-11-19 00:10:10 -07:00
Events.h Initial file population. 2013-08-02 13:12:24 -07:00
Facebook.cpp fix addColl() logic for collectionless rdbs 2013-10-16 14:38:09 -07:00
Facebook.h fixes for page inject 2014-06-15 08:26:27 -07:00
fastIndexTable.cpp Initial file population. 2013-08-02 13:12:24 -07:00
fctypes.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
fctypes.h fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
File.cpp File::set() fix for //'s 2014-06-08 15:24:30 -07:00
File.h tons of changes from live github on neo. 2014-01-17 21:01:43 -08:00
filterquerylogs.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Flags.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Flags.h Initial file population. 2013-08-02 13:12:24 -07:00
gb-1.0.spec make it so we don't need --nodeps with 2014-05-25 22:08:46 -04:00
gb-include.h compiler cleanups for cygwin compile 2014-06-07 14:20:04 -07:00
gb.deb.rules if netpbm pkg already installed use it. 2014-07-06 09:54:28 -07:00
gb.pem so we have spider https sites add 2013-10-13 00:15:39 -07:00
gbfilter.cpp Initial file population. 2013-08-02 13:12:24 -07:00
gbtitletest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
geneaology.cpp Initial file population. 2013-08-02 13:12:24 -07:00
generateSuperMergeCode.cpp Initial file population. 2013-08-02 13:12:24 -07:00
geo_ip_table.cpp Initial file population. 2013-08-02 13:12:24 -07:00
geo_ip_table.h Initial file population. 2013-08-02 13:12:24 -07:00
GeoIP_internal.h Initial file population. 2013-08-02 13:12:24 -07:00
GeoIP.c Initial file population. 2013-08-02 13:12:24 -07:00
GeoIP.h Initial file population. 2013-08-02 13:12:24 -07:00
GeoIPCity.c Initial file population. 2013-08-02 13:12:24 -07:00
GeoIPCity.h Initial file population. 2013-08-02 13:12:24 -07:00
getsample.cpp Initial file population. 2013-08-02 13:12:24 -07:00
giftopnm Initial file population. 2013-08-02 13:12:24 -07:00
hash.cpp Initial file population. 2013-08-02 13:12:24 -07:00
hash.h get "&site=abc.com+xyz.com"... working to restrict 2013-09-15 20:16:48 -07:00
HashTable.cpp fix core from last push. 2013-12-09 14:21:46 -07:00
HashTable.h fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
HashTableT.cpp Initial file population. 2013-08-02 13:12:24 -07:00
HashTableT.h Initial file population. 2013-08-02 13:12:24 -07:00
HashTableX.cpp quite a few fixes to the quota system, cleanups etc. 2014-01-18 16:23:13 -08:00
HashTableX.h new facet crap compiling now. 2014-06-20 12:28:50 -07:00
hashtest2.cpp Initial file population. 2013-08-02 13:12:24 -07:00
hashtest3.cpp Initial file population. 2013-08-02 13:12:24 -07:00
hashtest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Highlight.cpp Merge branch 'master' into diffbot 2013-12-07 11:34:26 -07:00
Highlight.h trying to fix json decoding bug. 2013-10-24 17:55:01 -07:00
Hostdb.cpp shard gbfacetstr:gbxpathsitehash123456 terms by termid for speed. 2014-07-07 12:32:27 -07:00
Hostdb.h shard gbfacetstr:gbxpathsitehash123456 terms by termid for speed. 2014-07-07 12:32:27 -07:00
hosts.cpp Initial file population. 2013-08-02 13:12:24 -07:00
HttpMime.cpp Merge branch 'diffbot-testing' into testing 2014-08-05 17:19:53 -07:00
HttpMime.h Merge branch 'testing' into diffbot-matt 2014-06-13 11:00:09 -07:00
HttpRequest.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
HttpRequest.h get search filters actually working 2014-08-06 08:12:05 -07:00
HttpServer.cpp Merge branch 'diffbot-testing' into testing 2014-08-05 17:19:53 -07:00
HttpServer.h Merge branch 'testing' into diffbot-matt 2014-07-10 10:06:55 -07:00
iana_charset.cpp merge diffbot-testing 2014-04-09 20:10:30 -07:00
iana_charset.h merge diffbot-testing 2014-04-09 20:10:30 -07:00
iconv.h Initial file population. 2013-08-02 13:12:24 -07:00
Images.cpp Merge branch 'diffbot-testing' into testing 2014-08-05 17:19:53 -07:00
Images.h support og:image images. allow user to 2014-07-04 15:33:27 -07:00
Indexdb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Indexdb.h retry if too man docids deduped when &stream=1 2014-05-01 17:07:31 -07:00
IndexList.cpp Initial file population. 2013-08-02 13:12:24 -07:00
IndexList.h Initial file population. 2013-08-02 13:12:24 -07:00
IndexReadInfo.cpp Initial file population. 2013-08-02 13:12:24 -07:00
IndexReadInfo.h Initial file population. 2013-08-02 13:12:24 -07:00
IndexTable2.cpp move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
IndexTable2.h Initial file population. 2013-08-02 13:12:24 -07:00
IndexTable.cpp Initial file population. 2013-08-02 13:12:24 -07:00
IndexTable.h Initial file population. 2013-08-02 13:12:24 -07:00
init.gb.conf minor make install changes 2014-05-22 18:46:38 -07:00
injectme3 added injectme3 file and documentation into compare.html 2013-08-17 11:02:26 -06:00
injector.cpp Initial file population. 2013-08-02 13:12:24 -07:00
iostream.h Initial file population. 2013-08-02 13:12:24 -07:00
ip.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
ip.h Initial file population. 2013-08-02 13:12:24 -07:00
ipconfig.cpp fixed some cores. brought in fixes from 2013-09-08 16:16:13 -06:00
Iso8859.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Iso8859.h Initial file population. 2013-08-02 13:12:24 -07:00
jointest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
jpegtopnm Initial file population. 2013-08-02 13:12:24 -07:00
Json.cpp fix json parser core from bad json. 2014-06-16 06:56:16 -07:00
Json.h v3 support for tokenized diffbot replies 2014-05-12 16:13:24 -07:00
keepalive.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Lang.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
Lang.h when user searches for a word without the 2014-06-01 09:37:00 -07:00
LangList.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
LangList.h Initial file population. 2013-08-02 13:12:24 -07:00
Language.cpp move from groups to shards. got rid of annoying 2013-10-04 16:18:56 -07:00
Language.h Initial file population. 2013-08-02 13:12:24 -07:00
LanguageIdentifier.cpp more minor bug fixes. 2014-08-28 18:11:07 -07:00
LanguageIdentifier.h Initial file population. 2013-08-02 13:12:24 -07:00
LanguagePages.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
LanguagePages.h Initial file population. 2013-08-02 13:12:24 -07:00
libc.a Initial file population. 2013-08-02 13:12:24 -07:00
libcrypto.a turn off hearbeats when compiling openssl libs 2014-04-22 16:39:40 -07:00
libgcc.a Initial file population. 2013-08-02 13:12:24 -07:00
libiconv.a Initial file population. 2013-08-02 13:12:24 -07:00
libiconv.la Initial file population. 2013-08-02 13:12:24 -07:00
libjpeg.so.62 thumbnail generation support back in. 2014-04-24 10:13:45 -07:00
libm.a Initial file population. 2013-08-02 13:12:24 -07:00
libnetpbm.so.10 thumbnail generation support back in. 2014-04-24 10:13:45 -07:00
libpng12.so.0 thumbnail generation support back in. 2014-04-24 10:13:45 -07:00
libpthread.a Initial file population. 2013-08-02 13:12:24 -07:00
libssl.a turn off hearbeats when compiling openssl libs 2014-04-22 16:39:40 -07:00
libstdc++.a Initial file population. 2013-08-02 13:12:24 -07:00
libtiff.so.4 thumbnail generation support back in. 2014-04-24 10:13:45 -07:00
libz.a Initial file population. 2013-08-02 13:12:24 -07:00
libz.so.1 thumbnail generation support back in. 2014-04-24 10:13:45 -07:00
LICENSE license fix 2014-06-16 13:52:51 -07:00
Linkdb.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
Linkdb.h fixes for new link info code so it doesn't 2014-02-25 10:55:05 -08:00
LinkedList.h Initial file population. 2013-08-02 13:12:24 -07:00
linkspam.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
linkspam.h Initial file population. 2013-08-02 13:12:24 -07:00
Log.cpp try to fix msg22 based cores 2014-05-14 07:46:32 -07:00
Log.h keep thumbnail gen msgs in the log file 2014-07-04 08:34:42 -07:00
Loop.cpp imported new Loop.cpp using select() from 2014-09-02 22:22:39 -07:00
Loop.h updates to compile cleanly on redhat. 2014-05-24 23:58:12 -04:00
looptest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
main.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
Make.depend Merge branch 'testing' into diffbot-matt 2014-06-27 17:17:14 -07:00
Makefile makefile updates 2014-08-27 22:09:21 -06:00
malloc.c Initial file population. 2013-08-02 13:12:24 -07:00
matches2.cpp dirty word detector revisions. we need 2013-10-16 20:19:49 -07:00
matches2.h renamed matches.h and matches.cpp to 2013-10-01 07:58:24 -07:00
Matches.cpp qa test fixes 2014-07-15 10:06:33 -07:00
Matches.h Initial file population. 2013-08-02 13:12:24 -07:00
Mem.cpp fix printf compiler warnings 2014-08-28 13:23:46 -07:00
Mem.h fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
membustest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
MemPool.cpp Initial file population. 2013-08-02 13:12:24 -07:00
MemPool.h Initial file population. 2013-08-02 13:12:24 -07:00
MemPoolTree.cpp Initial file population. 2013-08-02 13:12:24 -07:00
MemPoolTree.h Initial file population. 2013-08-02 13:12:24 -07:00
memtest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
mergetest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
MetaContainer.cpp Initial file population. 2013-08-02 13:12:24 -07:00
MetaContainer.h Initial file population. 2013-08-02 13:12:24 -07:00
Mime.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Mime.h Initial file population. 2013-08-02 13:12:24 -07:00
mixfile.cpp Initial file population. 2013-08-02 13:12:24 -07:00
mmseg.h Initial file population. 2013-08-02 13:12:24 -07:00
monitor.cpp nomenclature changes to reduce collissions. 2014-03-31 15:02:17 -07:00
Monitordb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Monitordb.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg0.cpp retry if too man docids deduped when &stream=1 2014-05-01 17:07:31 -07:00
Msg0.h retry if too man docids deduped when &stream=1 2014-05-01 17:07:31 -07:00
Msg1.cpp ignore ENOCOLLREC msgs in handleRequest1() in Msg1.cpp. 2014-07-14 12:21:32 -07:00
Msg1.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg1f.cpp if logging to stderr then return err when trying to 2014-07-05 14:16:33 -07:00
Msg1f.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg2.cpp fix section stats display bugs 2014-07-10 15:55:18 -07:00
Msg2.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg2a.cpp Merge branch 'master' into diffbot 2013-10-16 14:28:42 -07:00
Msg2a.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg2b.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msg2b.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg3.cpp minor print fix 2014-04-26 13:41:08 -07:00
Msg3.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg3a.cpp facet text lookup fixes. 2014-07-29 19:32:27 -07:00
Msg3a.h shard gbfacetstr:gbxpathsitehash123456 terms by termid for speed. 2014-07-07 12:32:27 -07:00
Msg3e.cpp fix infinite loop from json parsing and 2013-09-27 17:52:36 -06:00
Msg3e.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg4.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
Msg4.h code checkpoint 2014-02-09 12:38:40 -07:00
Msg5.cpp overhauled the main loop. (BIGLOOP) in Loop.cpp. 2014-08-27 14:07:13 -07:00
Msg5.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg6b.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msg6b.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg8b.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
Msg8b.h Merge branch 'master' into diffbot 2013-10-16 14:28:42 -07:00
Msg9b.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg9b.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg13.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
Msg13.h more nyt.com bug fixes 2014-08-07 10:26:30 -07:00
Msg17.cpp first compiled stab at multi collection searching. 2014-03-06 10:45:13 -08:00
Msg17.h first compiled stab at multi collection searching. 2014-03-06 10:45:13 -08:00
Msg20.cpp inject docs that come through our squid proxy 2014-07-09 12:25:23 -07:00
Msg20.h fix facet/xpath lookup stuff. 2014-07-30 10:41:21 -07:00
Msg22.cpp fixed nasty bug of resetting RdbBases for 2014-06-09 10:16:29 -07:00
Msg22.h try to fix msg22 core some more 2014-05-14 08:16:47 -07:00
Msg24.cpp new Make.depend. 2013-08-09 17:13:45 -06:00
Msg28.cpp fix core from (broad)casting valueless cgi field. 2013-10-03 14:51:59 -07:00
Msg28.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg30.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msg30.h move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
Msg35.cpp move from groups to shards. got rid of annoying 2013-10-04 16:18:56 -07:00
Msg35.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg36.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg36.h retry if too man docids deduped when &stream=1 2014-05-01 17:07:31 -07:00
Msg37.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg37.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg39.cpp facet text lookup fixes. 2014-07-29 19:32:27 -07:00
Msg39.h Merge branch 'testing' into diffbot-matt 2014-07-08 09:58:54 -07:00
Msg40.cpp fix gbequalfloat:1.23 term 2014-08-07 15:33:37 -07:00
Msg40.h facet text lookup fixes. 2014-07-29 19:32:27 -07:00
Msg40Cache.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msg40Cache.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg42.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msg42.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg51.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg51.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msgaa.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msgaa.h Initial file population. 2013-08-02 13:12:24 -07:00
MsgC.cpp move from groups to shards. got rid of annoying 2013-10-04 16:18:56 -07:00
MsgC.h Initial file population. 2013-08-02 13:12:24 -07:00
Msge0.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msge0.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msge1.cpp nomenclature changes to reduce collissions. 2014-03-31 15:02:17 -07:00
Msge1.h Initial file population. 2013-08-02 13:12:24 -07:00
Multicast.cpp fix not shutting down bug 2014-07-16 13:00:16 -07:00
Multicast.h shard gbfacetstr:gbxpathsitehash123456 terms by termid for speed. 2014-07-07 12:32:27 -07:00
mysynonyms.txt Initial file population. 2013-08-02 13:12:24 -07:00
numwords.cpp Initial file population. 2013-08-02 13:12:24 -07:00
PageAddColl.cpp added a clone coll page. for after creation cloning. 2014-07-09 07:54:29 -07:00
PageAddUrl.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
PageBasic.cpp show latest spider status msgs in page basic status. 2014-07-29 07:14:54 -07:00
PageCatdb.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
PageCrawlBot.cpp Merge branch 'diffbot-testing' into testing 2014-07-28 14:37:44 -07:00
PageCrawlBot.h more api updates 2014-07-13 09:35:44 -07:00
PageDirectory.cpp fix dmoz building. 2014-07-05 22:20:15 -07:00
PageEvents.cpp fixes for page inject 2014-06-15 08:26:27 -07:00
PageGet.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
PageHosts.cpp overhauled the main loop. (BIGLOOP) in Loop.cpp. 2014-08-27 14:07:13 -07:00
PageIndexdb.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
PageInject.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
PageInject.h inject docs that come through our squid proxy 2014-07-09 12:25:23 -07:00
PageLogView.cpp new printadmintop functionality. 2014-02-07 23:08:04 -07:00
PageNetTest.cpp move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
PageNetTest.h Initial file population. 2013-08-02 13:12:24 -07:00
PageOverview.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
PageParser.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
PageParser.h more fixes for new boolean logic. 2014-03-13 13:09:33 -07:00
PagePerf.cpp took out pagecount table. just hafta scan 2014-01-19 20:34:38 -08:00
PageReindex.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
PageReindex.h fix query reindex some more 2014-03-11 14:46:49 -07:00
PageResults.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
PageResults.h get html head and tail working again now. 2014-06-21 21:07:38 -07:00
PageRoot.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
Pages.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
Pages.h gui updates 2014-08-03 10:42:45 -07:00
PageSockets.cpp change bad master link to admin link 2014-07-22 10:42:30 -07:00
PageSpam.cpp Initial file population. 2013-08-02 13:12:24 -07:00
PageStats.cpp show process IDs on the stats page now. 2014-08-27 14:10:50 -07:00
PageStatsdb.cpp more api updates 2014-07-05 10:16:21 -07:00
PageSubmit.cpp Initial file population. 2013-08-02 13:12:24 -07:00
PageThesaurus.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
PageThreads.cpp formatting fixes 2014-01-19 00:57:20 -08:00
PageTitledb.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
PageTurk.cpp fixes for page inject 2014-06-15 08:26:27 -07:00
PageTurk.h Initial file population. 2013-08-02 13:12:24 -07:00
Parms.cpp updates for query help table 2014-08-27 23:10:27 -06:00
Parms.h fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
parse_iana_charsets.pl move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
pdftohtml use statically compiled pdftohtml 2014-04-24 08:31:52 -07:00
Phrases.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Phrases.h Initial file population. 2013-08-02 13:12:24 -07:00
PingServer.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
PingServer.h added emergency msg box on all admin pages 2014-01-11 20:14:44 -08:00
Placedb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Placedb.h Initial file population. 2013-08-02 13:12:24 -07:00
pngtopnm Initial file population. 2013-08-02 13:12:24 -07:00
pnmscale Initial file population. 2013-08-02 13:12:24 -07:00
Pops.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Pops.h Initial file population. 2013-08-02 13:12:24 -07:00
porter.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Pos.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Pos.h Initial file population. 2013-08-02 13:12:24 -07:00
Posdb.cpp added gbfieldmatch: operator for exactly matching 2014-08-25 13:57:55 -07:00
Posdb.h lookup facet values to get their text representations. 2014-07-29 16:17:18 -07:00
postalCodes.txt Initial file population. 2013-08-02 13:12:24 -07:00
PostQueryRerank.cpp beginning of total parm overhaul. 2014-06-12 21:27:06 -07:00
PostQueryRerank.h Initial file population. 2013-08-02 13:12:24 -07:00
ppmtojpeg Initial file population. 2013-08-02 13:12:24 -07:00
Process.cpp fix bug of not running df -ka to get disk usage 2014-08-28 09:49:47 -07:00
Process.h work on make install. 2014-05-11 12:48:56 -07:00
Profiler.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
Profiler.h Initial file population. 2013-08-02 13:12:24 -07:00
Proxy.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
Proxy.h Initial file population. 2013-08-02 13:12:24 -07:00
pstotext Initial file population. 2013-08-02 13:12:24 -07:00
qa.cpp added quickpolls. 2014-08-28 19:45:25 -07:00
QAClient.cpp Initial file population. 2013-08-02 13:12:24 -07:00
QAClient.h Initial file population. 2013-08-02 13:12:24 -07:00
quarantine.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Query.cpp awesome updates to help.html page 2014-08-27 22:21:30 -07:00
Query.h updates for query help table 2014-08-27 23:10:27 -06:00
Rdb.cpp fix # docs and recs bug. 2014-08-28 07:45:43 -07:00
Rdb.h fix # docs and recs bug. 2014-08-28 07:45:43 -07:00
RdbBase.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
RdbBase.h fixed nasty bug of resetting RdbBases for 2014-06-09 10:16:29 -07:00
RdbBuckets.cpp log msg cleanups 2014-05-11 21:55:44 -07:00
RdbBuckets.h Initial file population. 2013-08-02 13:12:24 -07:00
RdbCache.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
RdbCache.h removed MAX_COLL_RECS so we can have unlimited 2013-08-30 16:20:38 -07:00
RdbDump.cpp fix dmoz building. 2014-07-05 22:20:15 -07:00
RdbDump.h fix dmoz building. 2014-07-05 22:20:15 -07:00
RdbList.cpp fix data corruption detection and repair bug. 2014-05-01 10:38:00 -07:00
RdbList.h checkpoint for faster spider code. 2014-02-04 16:15:27 -08:00
RdbMap.cpp tons of changes from live github on neo. 2014-01-17 21:01:43 -08:00
RdbMap.h tons of changes from live github on neo. 2014-01-17 21:01:43 -08:00
RdbMem.cpp track down some nasty cores. fix 2013-10-29 16:37:14 -07:00
RdbMem.h now we can reset collection mid stream 2013-10-18 17:49:36 -07:00
RdbMerge.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
RdbMerge.h if coll is deleted or reset in a middle of a dump 2013-12-25 17:12:09 -08:00
RdbScan.cpp Initial file population. 2013-08-02 13:12:24 -07:00
RdbScan.h Initial file population. 2013-08-02 13:12:24 -07:00
rdbtest2.cpp Initial file population. 2013-08-02 13:12:24 -07:00
rdbtest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
RdbTree.cpp Merge branch 'diffbot-testing' into testing 2014-07-22 10:47:33 -07:00
RdbTree.h fix annoying rdbtree pos/neg key counting issue 2014-01-11 18:04:28 -08:00
README.md Update README.md 2013-11-16 20:14:06 -08:00
readRec.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Rebalance.cpp tuning the rebalance loop 2014-03-15 14:56:11 -07:00
Rebalance.h tight merge during rebalance to save 2014-03-14 23:37:30 -07:00
reindex2.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Repair.cpp try to start indexing spider replies 2014-05-09 11:18:24 -07:00
Repair.h Initial file population. 2013-08-02 13:12:24 -07:00
RequestTable.cpp Initial file population. 2013-08-02 13:12:24 -07:00
RequestTable.h Initial file population. 2013-08-02 13:12:24 -07:00
rescue.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Revdb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Revdb.h Initial file population. 2013-08-02 13:12:24 -07:00
rmbots.cpp Initial file population. 2013-08-02 13:12:24 -07:00
S99gb added S99gb for loading at boot. 2014-06-23 07:32:38 -06:00
SafeBuf.cpp fix bug causing qa json facet test to fail 2014-07-30 15:36:08 -07:00
SafeBuf.h fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
SafeList.h Initial file population. 2013-08-02 13:12:24 -07:00
Sanity.h Initial file population. 2013-08-02 13:12:24 -07:00
Scores.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Scores.h Initial file population. 2013-08-02 13:12:24 -07:00
Scraper.cpp take out datedb. no longer used. we store 2014-01-09 13:39:28 -08:00
Scraper.h misc/various bug fixes. 2014-08-28 18:07:22 -07:00
SearchInput.cpp more search filter bar updates 2014-08-06 22:05:30 -07:00
SearchInput.h search filter bar updates 2014-08-06 20:43:48 -07:00
Sections.cpp qa fixes 2014-08-02 09:07:33 -07:00
Sections.h do not hash redundant xpaths that have the same inner sentence/alnum 2014-07-09 17:16:01 -07:00
seektest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
seo.h Initial file population. 2013-08-02 13:12:24 -07:00
SiteGetter.cpp nomenclature changes to reduce collissions. 2014-03-31 15:02:17 -07:00
SiteGetter.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
sleepandlog.cpp Initial file population. 2013-08-02 13:12:24 -07:00
sort.cpp Initial file population. 2013-08-02 13:12:24 -07:00
sort.h Initial file population. 2013-08-02 13:12:24 -07:00
Speller.cpp work on make install. 2014-05-11 12:48:56 -07:00
Speller.h Initial file population. 2013-08-02 13:12:24 -07:00
Spider.cpp misc/various bug fixes. 2014-08-28 18:07:22 -07:00
Spider.h Merge branch 'diffbot-testing' into testing 2014-07-22 10:47:33 -07:00
SpiderProxy.cpp more core stability fixes. prevent core dumps 2014-07-16 12:07:39 -07:00
SpiderProxy.h got new floater/proxy logic compiling. 2014-06-06 15:11:51 -07:00
Stats.cpp fix performance graph. 2014-08-27 15:28:30 -07:00
Stats.h hacked up to debug why we're not getting 2014-08-27 10:37:03 -07:00
Statsdb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Statsdb.h fix potential problem of tons of points in 2013-10-14 22:52:29 -07:00
StopWords.cpp update common word list 2013-12-01 15:19:33 -07:00
StopWords.h Initial file population. 2013-08-02 13:12:24 -07:00
streambuf.h Initial file population. 2013-08-02 13:12:24 -07:00
Strings.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Strings.h Initial file population. 2013-08-02 13:12:24 -07:00
Summary.cpp fix default summary m_displayLen bug 2014-07-04 10:55:46 -07:00
Summary.h get summary "ns" parm and collectionrec 2014-07-03 07:29:44 -07:00
superMergeTest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
supported_charsets.cpp Initial file population. 2013-08-02 13:12:24 -07:00
supported_charsets.txt Initial file population. 2013-08-02 13:12:24 -07:00
Syncdb.cpp Merge branch 'diffbot-testing' into diffbot-matt 2014-06-09 12:42:54 -07:00
Syncdb.h Initial file population. 2013-08-02 13:12:24 -07:00
Synonyms.cpp Merge branch 'master' into testing 2014-08-01 14:04:39 -07:00
Synonyms.h fix stack smash core. 2014-06-01 10:42:49 -07:00
Tagdb.cpp make it so we can dump tagdb to a wget-table 2014-08-23 07:29:40 -07:00
Tagdb.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
TcpServer.cpp Merge branch 'diffbot-testing' into testing 2014-08-07 15:15:08 -07:00
TcpServer.h finally got http tunnel logic working. 2014-07-01 16:28:15 -06:00
TcpSocket.h add support for tunnelling https fetch 2014-07-01 10:43:52 -06:00
test2.cpp Initial file population. 2013-08-02 13:12:24 -07:00
test_convert.cpp Initial file population. 2013-08-02 13:12:24 -07:00
test_hash.cpp Initial file population. 2013-08-02 13:12:24 -07:00
test_norm.cpp Initial file population. 2013-08-02 13:12:24 -07:00
test_parser2.cpp Initial file population. 2013-08-02 13:12:24 -07:00
test_parser.cpp Initial file population. 2013-08-02 13:12:24 -07:00
test_unicode.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Test.cpp fix annoying bug when adding new parms. 2014-06-10 12:29:50 -07:00
Test.h Initial file population. 2013-08-02 13:12:24 -07:00
testfloats.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Tfndb.cpp code cleanups. 2014-01-18 21:19:26 -08:00
Tfndb.h code cleanups. 2014-01-18 21:19:26 -08:00
Thesaurus.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Thesaurus.h Initial file population. 2013-08-02 13:12:24 -07:00
Threads.cpp hacked up to debug why we're not getting 2014-08-27 10:37:03 -07:00
Threads.h logging fixes 2014-04-10 23:04:00 -07:00
threadtest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
thunder.cpp Initial file population. 2013-08-02 13:12:24 -07:00
tifftopnm Initial file population. 2013-08-02 13:12:24 -07:00
Timedb.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Timedb.h move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
Timer.h Initial file population. 2013-08-02 13:12:24 -07:00
Title.cpp qa test fixes 2014-07-15 10:06:33 -07:00
Title.h qa test fixes 2014-07-15 10:06:33 -07:00
Titledb.cpp thread fixes. if pthread_create fails then 2014-03-15 20:07:02 -07:00
Titledb.h code cleanups. 2014-01-18 21:19:26 -08:00
TopTree.cpp index numbers as integers too, not just floats 2014-02-06 20:57:54 -08:00
TopTree.h more fixes for new boolean logic. 2014-03-13 13:09:33 -07:00
treetest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
TuringTest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
TuringTest.h Initial file population. 2013-08-02 13:12:24 -07:00
Turkdb.cpp Initial file population. 2013-08-02 13:12:24 -07:00
types.h fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
UCNormalizer.cpp code cleanups. 2014-01-18 21:19:26 -08:00
UCNormalizer.h Initial file population. 2013-08-02 13:12:24 -07:00
UCPropTable.cpp Initial file population. 2013-08-02 13:12:24 -07:00
UCPropTable.h Initial file population. 2013-08-02 13:12:24 -07:00
UCWordIterator.cpp Initial file population. 2013-08-02 13:12:24 -07:00
UCWordIterator.h Initial file population. 2013-08-02 13:12:24 -07:00
UdpProtocol.h Initial file population. 2013-08-02 13:12:24 -07:00
UdpServer.cpp hacked up to debug why we're not getting 2014-08-27 10:37:03 -07:00
UdpServer.h rebalancer working pretty well now 2014-01-15 19:08:47 -08:00
UdpSlot.cpp fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
UdpSlot.h hacked up to debug why we're not getting 2014-08-27 10:37:03 -07:00
udptest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Unicode.cpp when user searches for a word without the 2014-06-01 09:37:00 -07:00
Unicode.h add support for stripping accent marks from greek letters. 2014-05-30 20:09:37 -07:00
UnicodeProperties.cpp code cleanups. 2014-01-18 21:19:26 -08:00
UnicodeProperties.h Initial file population. 2013-08-02 13:12:24 -07:00
unifiedDict.txt Initial file population. 2013-08-02 13:12:24 -07:00
uniq2.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Url.cpp fix nyt.com cookie redir bug. 2014-08-05 17:04:11 -07:00
Url.h Initial file population. 2013-08-02 13:12:24 -07:00
urlinfo.cpp fixed data corruption bug. m_finalCrawlDelay 2013-11-27 14:18:15 -08:00
Users.cpp redhat build updates on fedora 2014-05-25 09:58:07 -04:00
Users.h Initial file population. 2013-08-02 13:12:24 -07:00
ValidPointer.cpp Initial file population. 2013-08-02 13:12:24 -07:00
ValidPointer.h Initial file population. 2013-08-02 13:12:24 -07:00
Vector.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Vector.h Initial file population. 2013-08-02 13:12:24 -07:00
Weights.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Weights.h Initial file population. 2013-08-02 13:12:24 -07:00
Wiki.cpp log msg cleanups 2014-05-11 21:55:44 -07:00
Wiki.h Initial file population. 2013-08-02 13:12:24 -07:00
wikititles.txt.part1 Initial file population. 2013-08-02 13:12:24 -07:00
wikititles.txt.part2 Initial file population. 2013-08-02 13:12:24 -07:00
wiktionary-buf.txt when user searches for a word without the 2014-06-01 09:37:00 -07:00
wiktionary-lang.txt when user searches for a word without the 2014-06-01 09:37:00 -07:00
wiktionary-syns.dat when user searches for a word without the 2014-06-01 09:37:00 -07:00
Wiktionary.cpp fix compiler bug 2014-06-16 11:10:38 -07:00
Wiktionary.h Initial file population. 2013-08-02 13:12:24 -07:00
Words.cpp fixed bugs in sort by prices, etc. 2013-11-11 18:58:45 -08:00
Words.h Initial file population. 2013-08-02 13:12:24 -07:00
Xml.cpp fix <script> tags that immediately end in </script> or 2014-07-14 17:24:20 -07:00
Xml.h for json docs only give them a single 2014-01-25 08:17:38 -08:00
XmlDoc.cpp added quickpolls. 2014-08-28 19:45:25 -07:00
XmlDoc.h Merge branch 'diffbot-testing' into testing 2014-08-07 15:15:08 -07:00
XmlNode.cpp Merge branch 'diffbot-testing' into testing 2014-07-14 18:10:13 -07:00
XmlNode.h fixes to make easier to compile on max os x. 2014-08-28 12:55:02 -07:00
zconf.h Initial file population. 2013-08-02 13:12:24 -07:00
zlib.h Initial file population. 2013-08-02 13:12:24 -07:00

open-source-search-engine

An open source web and enterprise search engine. As can be seen on http://www.gigablast.com/ .

RUNNING GIGABLAST

See html/admin.html for all administrative documentation including the quick start instructions.

Alternatively, visit http://www.gigablast.com/admin.html

See html/compare.html for a comparison of Gigablast to SOLR. Although this is very sparse right now, it does include some useful commands.

CODE ARCHITECTURE

See html/developer.html for all code documentation.

Alternatively, visit http://www.gigablast.com/developer.html

CONTACT

Contact me for feature requests or help in general. I will work for free for good use cases. mattdwells@hotmail.com.