Nov 20 2017 -- A distributed open source search engine and spider/crawler written in C/C++ for Linux on Intel/AMD. From gigablast dot com, which has binaries for download. See the README.md file at the very bottom of this page for instructions.
Go to file
Matt Wells 99bd9319fd temp hack to reduce network comm
between trinity and neo
2014-03-20 15:42:34 -07:00
antiword-dir Initial file population. 2013-08-02 13:12:24 -07:00
coll.main.0 fix a few minor bugs. 2014-03-16 10:34:58 -07:00
html parm updates 2014-02-10 21:45:03 -07:00
openssl we already include our own 32-bit 2013-09-15 18:25:49 -06:00
ucdata Initial file population. 2013-08-02 13:12:24 -07:00
Abbreviations.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Abbreviations.h change a couple of possible reserved names in C++ 2013-08-28 22:59:01 -06:00
Accessdb.cpp fix addColl() logic for collectionless rdbs 2013-10-16 14:38:09 -07:00
Accessdb.h Initial file population. 2013-08-02 13:12:24 -07:00
Address.cpp Merge branch 'testing' into diffbot-testing 2014-03-10 12:08:23 -07:00
Address.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
addtest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Ads.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Ads.h move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
AdultBit.cpp Initial file population. 2013-08-02 13:12:24 -07:00
AdultBit.h Initial file population. 2013-08-02 13:12:24 -07:00
animate.cpp Initial file population. 2013-08-02 13:12:24 -07:00
antiword Initial file population. 2013-08-02 13:12:24 -07:00
AutoBan.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
AutoBan.h Initial file population. 2013-08-02 13:12:24 -07:00
badcattable.dat Initial file population. 2013-08-02 13:12:24 -07:00
BigFile.cpp thread fixes. if pthread_create fails then 2014-03-15 20:07:02 -07:00
BigFile.h forgot to push the .h files 2013-12-07 22:12:48 -07:00
Bits.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Bits.h Initial file population. 2013-08-02 13:12:24 -07:00
blaster.cpp fixed bugs with advanced.html advanced search page. 2013-11-17 14:58:47 -07:00
Blaster.cpp for json docs only give them a single 2014-01-25 08:17:38 -08:00
Blaster.h use ./gb blaster -u <fileofurls> to just inject urls, 2013-08-19 16:33:27 -06:00
bmptopnm Initial file population. 2013-08-02 13:12:24 -07:00
Cachedb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Cachedb.h Initial file population. 2013-08-02 13:12:24 -07:00
camsort.cpp Initial file population. 2013-08-02 13:12:24 -07:00
catcountry.dat Initial file population. 2013-08-02 13:12:24 -07:00
Catdb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Catdb.h move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
Categories.cpp documentation updates. fixed sd=0. 2013-10-13 14:24:41 -07:00
Categories.h documentation updates. fixed sd=0. 2013-10-13 14:24:41 -07:00
CatRec.cpp fix a couple catdb generation bugs. 2013-10-12 20:33:04 -07:00
CatRec.h Initial file population. 2013-08-02 13:12:24 -07:00
character-sets Initial file population. 2013-08-02 13:12:24 -07:00
check_unicode.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Clusterdb.cpp thread fixes. if pthread_create fails then 2014-03-15 20:07:02 -07:00
Clusterdb.h Initial file population. 2013-08-02 13:12:24 -07:00
Collectiondb.cpp fix a few cores from previous code updates 2014-03-11 09:36:33 -07:00
Collectiondb.h Merge branch 'diffbot' into testing 2014-03-08 09:38:44 -07:00
Conf.cpp Merge branch 'diffbot' into testing 2014-03-08 09:38:44 -07:00
Conf.h Merge branch 'diffbot' into testing 2014-03-08 09:38:44 -07:00
convert.cpp Initial file population. 2013-08-02 13:12:24 -07:00
CountryCode.cpp fix pagecrawlbot.cpp to support &c=token-name. 2014-01-22 23:40:38 -08:00
CountryCode.h fix pagecrawlbot.cpp to support &c=token-name. 2014-01-22 23:40:38 -08:00
create_ucd_tables.cpp Initial file population. 2013-08-02 13:12:24 -07:00
DailyMerge.cpp Fixed some bugs. 2013-08-09 08:52:15 -07:00
DailyMerge.h move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
DataFeed.cpp Initial file population. 2013-08-02 13:12:24 -07:00
DataFeed.h move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
Datedb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Datedb.h move from groups to shards. got rid of annoying 2013-10-04 16:18:56 -07:00
Dates.cpp get new global preemptive cache 2014-01-05 11:51:09 -08:00
Dates.h Initial file population. 2013-08-02 13:12:24 -07:00
Diff.cpp Fixed some bugs. 2013-08-09 08:52:15 -07:00
Diff.h Initial file population. 2013-08-02 13:12:24 -07:00
Dir.cpp more parmdb fixes 2013-12-16 15:39:24 -08:00
Dir.h fix file descriptor leak in Dir class. 2013-11-19 13:41:56 -08:00
DiskPageCache.cpp thread fixes. if pthread_create fails then 2014-03-15 20:07:02 -07:00
DiskPageCache.h tons of changes from live github on neo. 2014-01-17 21:01:43 -08:00
dlstubs.c Initial file population. 2013-08-02 13:12:24 -07:00
dmozparse.cpp add support for noindex meta tag. 2013-10-12 22:50:23 -07:00
Dns.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Dns.h Initial file population. 2013-08-02 13:12:24 -07:00
DnsProtocol.h Initial file population. 2013-08-02 13:12:24 -07:00
dnstest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Domains.cpp move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
Domains.h Initial file population. 2013-08-02 13:12:24 -07:00
dumpcore.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Entities.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Entities.h Initial file population. 2013-08-02 13:12:24 -07:00
Errno.cpp Defect #2099: throw an error a crawl request was made with a name that already existed for bulk request (or the other way around) 2014-03-11 16:21:58 -07:00
Errno.h Defect #2099: throw an error a crawl request was made with a name that already existed for bulk request (or the other way around) 2014-03-11 16:21:58 -07:00
errnotest.cpp errno test update 2013-11-19 00:10:10 -07:00
Events.h Initial file population. 2013-08-02 13:12:24 -07:00
Facebook.cpp fix addColl() logic for collectionless rdbs 2013-10-16 14:38:09 -07:00
Facebook.h new graphic icons. minor clean ups. 2013-11-15 14:47:05 -07:00
fastIndexTable.cpp Initial file population. 2013-08-02 13:12:24 -07:00
fctypes.cpp for json docs only give them a single 2014-01-25 08:17:38 -08:00
fctypes.h got code with shard rebalancing compiling. 2014-01-11 16:08:42 -08:00
File.cpp log cleanups mostly. 2013-12-18 10:57:18 -08:00
File.h tons of changes from live github on neo. 2014-01-17 21:01:43 -08:00
filterquerylogs.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Flags.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Flags.h Initial file population. 2013-08-02 13:12:24 -07:00
gb-include.h Initial file population. 2013-08-02 13:12:24 -07:00
gb.conf thread fixes. if pthread_create fails then 2014-03-15 20:07:02 -07:00
gb.pem so we have spider https sites add 2013-10-13 00:15:39 -07:00
gbfilter.cpp Initial file population. 2013-08-02 13:12:24 -07:00
gbtitletest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
geneaology.cpp Initial file population. 2013-08-02 13:12:24 -07:00
generateSuperMergeCode.cpp Initial file population. 2013-08-02 13:12:24 -07:00
geo_ip_table.cpp Initial file population. 2013-08-02 13:12:24 -07:00
geo_ip_table.h Initial file population. 2013-08-02 13:12:24 -07:00
GeoIP_internal.h Initial file population. 2013-08-02 13:12:24 -07:00
GeoIP.c Initial file population. 2013-08-02 13:12:24 -07:00
GeoIP.h Initial file population. 2013-08-02 13:12:24 -07:00
GeoIPCity.c Initial file population. 2013-08-02 13:12:24 -07:00
GeoIPCity.h Initial file population. 2013-08-02 13:12:24 -07:00
getsample.cpp Initial file population. 2013-08-02 13:12:24 -07:00
giftopnm Initial file population. 2013-08-02 13:12:24 -07:00
hash.cpp Initial file population. 2013-08-02 13:12:24 -07:00
hash.h get "&site=abc.com+xyz.com"... working to restrict 2013-09-15 20:16:48 -07:00
HashTable.cpp fix core from last push. 2013-12-09 14:21:46 -07:00
HashTable.h mem labelling fixes. 2013-12-09 14:05:02 -07:00
HashTableT.cpp Initial file population. 2013-08-02 13:12:24 -07:00
HashTableT.h Initial file population. 2013-08-02 13:12:24 -07:00
HashTableX.cpp quite a few fixes to the quota system, cleanups etc. 2014-01-18 16:23:13 -08:00
HashTableX.h only skip checking to spider a url of its 2014-03-03 13:22:27 -08:00
hashtest2.cpp Initial file population. 2013-08-02 13:12:24 -07:00
hashtest3.cpp Initial file population. 2013-08-02 13:12:24 -07:00
hashtest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Highlight.cpp Merge branch 'master' into diffbot 2013-12-07 11:34:26 -07:00
Highlight.h trying to fix json decoding bug. 2013-10-24 17:55:01 -07:00
Hostdb.cpp take out tfndb, unused. fix core 2014-02-26 01:07:13 -08:00
Hostdb.h added host disk usage redbox and stats. 2014-02-12 09:47:44 -07:00
hosts.conf fix hosts.conf 2013-12-26 09:34:35 -08:00
hosts.cpp Initial file population. 2013-08-02 13:12:24 -07:00
HttpMime.cpp mem labelling fixes. 2013-12-09 14:05:02 -07:00
HttpMime.h update the dirty word list. but we still 2013-10-15 01:01:19 -07:00
HttpRequest.cpp parm updates 2014-02-10 21:45:03 -07:00
HttpRequest.h parm updates 2014-02-10 21:45:03 -07:00
HttpServer.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
HttpServer.h use &format=0 1 or 2 for html/xml/json now. 2013-11-08 18:00:30 -08:00
iana_charset.cpp Merge branch 'diffbot' into diffbot-testing 2013-12-16 11:06:11 -08:00
iana_charset.h Merge branch 'diffbot' into diffbot-testing 2013-12-16 11:06:11 -08:00
iconv.h Initial file population. 2013-08-02 13:12:24 -07:00
Images.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Images.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Indexdb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Indexdb.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
IndexList.cpp Initial file population. 2013-08-02 13:12:24 -07:00
IndexList.h Initial file population. 2013-08-02 13:12:24 -07:00
IndexReadInfo.cpp Initial file population. 2013-08-02 13:12:24 -07:00
IndexReadInfo.h Initial file population. 2013-08-02 13:12:24 -07:00
IndexTable2.cpp move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
IndexTable2.h Initial file population. 2013-08-02 13:12:24 -07:00
IndexTable.cpp Initial file population. 2013-08-02 13:12:24 -07:00
IndexTable.h Initial file population. 2013-08-02 13:12:24 -07:00
injectme3 added injectme3 file and documentation into compare.html 2013-08-17 11:02:26 -06:00
injector.cpp Initial file population. 2013-08-02 13:12:24 -07:00
iostream.h Initial file population. 2013-08-02 13:12:24 -07:00
ip.cpp fix old bug. 2014-01-10 18:52:47 -07:00
ip.h Initial file population. 2013-08-02 13:12:24 -07:00
ipconfig.cpp fixed some cores. brought in fixes from 2013-09-08 16:16:13 -06:00
Iso8859.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Iso8859.h Initial file population. 2013-08-02 13:12:24 -07:00
jointest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
jpegtopnm Initial file population. 2013-08-02 13:12:24 -07:00
Json.cpp fixed contenthash32 logic for json objects. 2014-02-05 13:22:03 -08:00
Json.h fixed contenthash32 logic for json objects. 2014-02-05 13:22:03 -08:00
keepalive.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Lang.cpp comment updates 2013-10-15 23:13:50 -07:00
Lang.h Initial file population. 2013-08-02 13:12:24 -07:00
LangList.cpp code cleanups. 2014-01-18 21:19:26 -08:00
LangList.h Initial file population. 2013-08-02 13:12:24 -07:00
Language.cpp move from groups to shards. got rid of annoying 2013-10-04 16:18:56 -07:00
Language.h Initial file population. 2013-08-02 13:12:24 -07:00
LanguageIdentifier.cpp Initial file population. 2013-08-02 13:12:24 -07:00
LanguageIdentifier.h Initial file population. 2013-08-02 13:12:24 -07:00
LanguagePages.cpp Initial file population. 2013-08-02 13:12:24 -07:00
LanguagePages.h Initial file population. 2013-08-02 13:12:24 -07:00
libc.a Initial file population. 2013-08-02 13:12:24 -07:00
libcrypto.a Initial file population. 2013-08-02 13:12:24 -07:00
libgcc.a Initial file population. 2013-08-02 13:12:24 -07:00
libiconv.a Initial file population. 2013-08-02 13:12:24 -07:00
libiconv.la Initial file population. 2013-08-02 13:12:24 -07:00
libm.a Initial file population. 2013-08-02 13:12:24 -07:00
libpthread.a Initial file population. 2013-08-02 13:12:24 -07:00
libssl.a Initial file population. 2013-08-02 13:12:24 -07:00
libstdc++.a Initial file population. 2013-08-02 13:12:24 -07:00
libz.a Initial file population. 2013-08-02 13:12:24 -07:00
LICENSE code checkpoint. time slicing, faster spider code 2014-02-04 17:34:43 -08:00
Linkdb.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
Linkdb.h fixes for new link info code so it doesn't 2014-02-25 10:55:05 -08:00
LinkedList.h Initial file population. 2013-08-02 13:12:24 -07:00
linkspam.cpp renamed matches.h and matches.cpp to 2013-10-01 07:58:24 -07:00
linkspam.h Initial file population. 2013-08-02 13:12:24 -07:00
Log.cpp forgot to unlock thread lock 2013-12-15 10:43:34 -07:00
Log.h Initial file population. 2013-08-02 13:12:24 -07:00
Loop.cpp tons of changes from live github on neo. 2014-01-17 21:01:43 -08:00
Loop.h Initial file population. 2013-08-02 13:12:24 -07:00
looptest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
main.cpp thread fixes. if pthread_create fails then 2014-03-15 20:07:02 -07:00
Make.depend Merge branch 'diffbot' into testing 2014-03-08 09:38:44 -07:00
Makefile Merge branch 'diffbot' into testing 2014-03-08 09:38:44 -07:00
malloc.c Initial file population. 2013-08-02 13:12:24 -07:00
matches2.cpp dirty word detector revisions. we need 2013-10-16 20:19:49 -07:00
matches2.h renamed matches.h and matches.cpp to 2013-10-01 07:58:24 -07:00
Matches.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Matches.h Initial file population. 2013-08-02 13:12:24 -07:00
Mem.cpp Merge branch 'diffbot' into testing 2014-03-08 09:38:44 -07:00
Mem.h fix typo 2013-09-08 19:51:57 -07:00
membustest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
MemPool.cpp Initial file population. 2013-08-02 13:12:24 -07:00
MemPool.h Initial file population. 2013-08-02 13:12:24 -07:00
MemPoolTree.cpp Initial file population. 2013-08-02 13:12:24 -07:00
MemPoolTree.h Initial file population. 2013-08-02 13:12:24 -07:00
memtest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
mergetest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
MetaContainer.cpp Initial file population. 2013-08-02 13:12:24 -07:00
MetaContainer.h Initial file population. 2013-08-02 13:12:24 -07:00
Mime.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Mime.h Initial file population. 2013-08-02 13:12:24 -07:00
mixfile.cpp Initial file population. 2013-08-02 13:12:24 -07:00
mmseg.h Initial file population. 2013-08-02 13:12:24 -07:00
monitor.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Monitordb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Monitordb.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg0.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg0.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg1.cpp get searching on token working 2014-03-06 17:01:41 -08:00
Msg1.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg1f.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msg1f.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg2.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg2.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg2a.cpp Merge branch 'master' into diffbot 2013-10-16 14:28:42 -07:00
Msg2a.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg2b.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msg2b.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg3.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg3.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg3a.cpp get searching on token working 2014-03-06 17:01:41 -08:00
Msg3a.h get searching on token working 2014-03-06 17:01:41 -08:00
Msg3e.cpp fix infinite loop from json parsing and 2013-09-27 17:52:36 -06:00
Msg3e.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg4.cpp Merge branch 'diffbot' into testing 2014-03-08 09:38:44 -07:00
Msg4.h code checkpoint 2014-02-09 12:38:40 -07:00
Msg5.cpp tuning the rebalance loop 2014-03-15 14:56:11 -07:00
Msg5.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg6b.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msg6b.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg8b.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg8b.h Merge branch 'master' into diffbot 2013-10-16 14:28:42 -07:00
Msg9b.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg9b.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg13.cpp get searching on token working 2014-03-06 17:01:41 -08:00
Msg13.h measure crawl delay by default from 2013-11-26 14:07:28 -08:00
Msg17.cpp first compiled stab at multi collection searching. 2014-03-06 10:45:13 -08:00
Msg17.h first compiled stab at multi collection searching. 2014-03-06 10:45:13 -08:00
Msg20.cpp first compiled stab at multi collection searching. 2014-03-06 10:45:13 -08:00
Msg20.h get searching on token working 2014-03-06 17:01:41 -08:00
Msg22.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg22.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg24.cpp new Make.depend. 2013-08-09 17:13:45 -06:00
Msg28.cpp fix core from (broad)casting valueless cgi field. 2013-10-03 14:51:59 -07:00
Msg28.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg30.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msg30.h move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
Msg35.cpp move from groups to shards. got rid of annoying 2013-10-04 16:18:56 -07:00
Msg35.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg36.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg36.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg37.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg37.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg39.cpp fix query reindex some more 2014-03-11 14:46:49 -07:00
Msg39.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg40.cpp fix core &token= core 2014-03-13 07:57:06 -07:00
Msg40.h get searching on token working 2014-03-06 17:01:41 -08:00
Msg40Cache.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msg40Cache.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg42.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msg42.h Initial file population. 2013-08-02 13:12:24 -07:00
Msg51.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msg51.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msgaa.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msgaa.h Initial file population. 2013-08-02 13:12:24 -07:00
MsgC.cpp move from groups to shards. got rid of annoying 2013-10-04 16:18:56 -07:00
MsgC.h Initial file population. 2013-08-02 13:12:24 -07:00
Msge0.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msge0.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Msge1.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Msge1.h Initial file population. 2013-08-02 13:12:24 -07:00
Multicast.cpp a lot of times rdb tree has invalid collection 2014-01-21 19:01:44 -08:00
Multicast.h move from groups to shards. got rid of annoying 2013-10-04 16:18:56 -07:00
mysynonyms.txt Initial file population. 2013-08-02 13:12:24 -07:00
numwords.cpp Initial file population. 2013-08-02 13:12:24 -07:00
PageAddColl.cpp more spiderdb spider request fixes 2014-01-19 18:00:56 -08:00
PageAddUrl.cpp parm updates 2014-02-10 21:45:03 -07:00
PageBasic.cpp update example site lists 2014-03-09 20:35:45 -07:00
PageCatdb.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
PageCrawlBot.cpp Merge branch 'diffbot-dan' into diffbot-testing 2014-03-11 17:02:59 -07:00
PageCrawlBot.h checkpoint 2014-02-09 15:09:48 -07:00
PageDirectory.cpp code checkpoint. time slicing, faster spider code 2014-02-04 17:34:43 -08:00
PageEvents.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
PageGet.cpp fix security system to actually work now 2014-02-12 00:06:00 -07:00
PageHosts.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
PageIndexdb.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
PageInject.cpp update example site lists 2014-03-09 20:35:45 -07:00
PageInject.h code cleanups. 2014-01-18 21:19:26 -08:00
PageLogView.cpp new printadmintop functionality. 2014-02-07 23:08:04 -07:00
PageNetTest.cpp move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
PageNetTest.h Initial file population. 2013-08-02 13:12:24 -07:00
PageOverview.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
PageParser.cpp more formatting 2014-01-19 11:56:36 -08:00
PageParser.h Initial file population. 2013-08-02 13:12:24 -07:00
PagePerf.cpp took out pagecount table. just hafta scan 2014-01-19 20:34:38 -08:00
PageReindex.cpp fix query reindex some more 2014-03-11 14:46:49 -07:00
PageReindex.h fix query reindex some more 2014-03-11 14:46:49 -07:00
PageResults.cpp Merge branch 'testing' into diffbot-testing 2014-03-10 12:08:23 -07:00
PageResults.h specify &header=1 explicitly to get json serp header 2014-03-05 07:41:59 -08:00
PageRoot.cpp fix add url core 2014-03-12 08:28:42 -07:00
Pages.cpp update example site lists 2014-03-09 20:35:45 -07:00
Pages.h update example site lists 2014-03-09 20:35:45 -07:00
PageSockets.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
PageSpam.cpp Initial file population. 2013-08-02 13:12:24 -07:00
PageStats.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
PageStatsdb.cpp formatting 2014-01-19 12:37:37 -08:00
PageSubmit.cpp Initial file population. 2013-08-02 13:12:24 -07:00
PageThesaurus.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
PageThreads.cpp formatting fixes 2014-01-19 00:57:20 -08:00
PageTitledb.cpp fix security system to actually work now 2014-02-12 00:06:00 -07:00
PageTurk.cpp Initial file population. 2013-08-02 13:12:24 -07:00
PageTurk.h Initial file population. 2013-08-02 13:12:24 -07:00
Parms.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
Parms.h show user's ip in master ips description 2014-02-16 21:56:31 -07:00
parse_iana_charsets.pl move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
pdftohtml use the "onsite" keyword in your url filters 2013-09-06 09:37:17 -06:00
Phrases.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Phrases.h Initial file population. 2013-08-02 13:12:24 -07:00
PingServer.cpp Merge branch 'diffbot' into testing 2014-03-08 09:38:44 -07:00
PingServer.h added emergency msg box on all admin pages 2014-01-11 20:14:44 -08:00
Placedb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Placedb.h Initial file population. 2013-08-02 13:12:24 -07:00
pngtopnm Initial file population. 2013-08-02 13:12:24 -07:00
pnmscale Initial file population. 2013-08-02 13:12:24 -07:00
Pops.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Pops.h Initial file population. 2013-08-02 13:12:24 -07:00
porter.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Pos.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Pos.h Initial file population. 2013-08-02 13:12:24 -07:00
Posdb.cpp thread fixes. if pthread_create fails then 2014-03-15 20:07:02 -07:00
Posdb.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
postalCodes.txt Initial file population. 2013-08-02 13:12:24 -07:00
PostQueryRerank.cpp handle a bunch of oom conditions that 2013-11-20 10:14:02 -07:00
PostQueryRerank.h Initial file population. 2013-08-02 13:12:24 -07:00
ppmtojpeg Initial file population. 2013-08-02 13:12:24 -07:00
Process.cpp tuning the rebalance loop 2014-03-15 14:56:11 -07:00
Process.h tuning the rebalance loop 2014-03-15 14:56:11 -07:00
Profiler.cpp formatting fixes 2014-01-19 00:57:20 -08:00
Profiler.h Initial file population. 2013-08-02 13:12:24 -07:00
Proxy.cpp fix security system to actually work now 2014-02-12 00:06:00 -07:00
Proxy.h Initial file population. 2013-08-02 13:12:24 -07:00
pstotext Initial file population. 2013-08-02 13:12:24 -07:00
QAClient.cpp Initial file population. 2013-08-02 13:12:24 -07:00
QAClient.h Initial file population. 2013-08-02 13:12:24 -07:00
quarantine.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Query.cpp fix bug in gbminint. 2014-02-06 21:36:47 -08:00
Query.h index numbers as integers too, not just floats 2014-02-06 20:57:54 -08:00
Rdb.cpp fix core 2014-03-18 11:12:50 -07:00
Rdb.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
RdbBase.cpp thread fixes. if pthread_create fails then 2014-03-15 20:07:02 -07:00
RdbBase.h tuning the rebalance loop 2014-03-15 14:56:11 -07:00
RdbBuckets.cpp Initial file population. 2013-08-02 13:12:24 -07:00
RdbBuckets.h Initial file population. 2013-08-02 13:12:24 -07:00
RdbCache.cpp fix some cores. use olddoc contenthash 2014-02-07 18:28:09 -08:00
RdbCache.h removed MAX_COLL_RECS so we can have unlimited 2013-08-30 16:20:38 -07:00
RdbDump.cpp tons of changes from live github on neo. 2014-01-17 21:01:43 -08:00
RdbDump.h if coll is deleted or reset in a middle of a dump 2013-12-25 17:12:09 -08:00
RdbList.cpp tuning the rebalance loop 2014-03-15 14:56:11 -07:00
RdbList.h checkpoint for faster spider code. 2014-02-04 16:15:27 -08:00
RdbMap.cpp tons of changes from live github on neo. 2014-01-17 21:01:43 -08:00
RdbMap.h tons of changes from live github on neo. 2014-01-17 21:01:43 -08:00
RdbMem.cpp track down some nasty cores. fix 2013-10-29 16:37:14 -07:00
RdbMem.h now we can reset collection mid stream 2013-10-18 17:49:36 -07:00
RdbMerge.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
RdbMerge.h if coll is deleted or reset in a middle of a dump 2013-12-25 17:12:09 -08:00
RdbScan.cpp Initial file population. 2013-08-02 13:12:24 -07:00
RdbScan.h Initial file population. 2013-08-02 13:12:24 -07:00
rdbtest2.cpp Initial file population. 2013-08-02 13:12:24 -07:00
rdbtest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
RdbTree.cpp get streaming time sliced results working 2014-02-06 14:25:44 -08:00
RdbTree.h fix annoying rdbtree pos/neg key counting issue 2014-01-11 18:04:28 -08:00
README.md Update README.md 2013-11-16 20:14:06 -08:00
readRec.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Rebalance.cpp tuning the rebalance loop 2014-03-15 14:56:11 -07:00
Rebalance.h tight merge during rebalance to save 2014-03-14 23:37:30 -07:00
reindex2.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Repair.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
Repair.h Initial file population. 2013-08-02 13:12:24 -07:00
RequestTable.cpp Initial file population. 2013-08-02 13:12:24 -07:00
RequestTable.h Initial file population. 2013-08-02 13:12:24 -07:00
rescue.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Revdb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Revdb.h Initial file population. 2013-08-02 13:12:24 -07:00
rmbots.cpp Initial file population. 2013-08-02 13:12:24 -07:00
SafeBuf.cpp index numbers as integers too, not just floats 2014-02-06 20:57:54 -08:00
SafeBuf.h index numbers as integers too, not just floats 2014-02-06 20:57:54 -08:00
SafeList.h Initial file population. 2013-08-02 13:12:24 -07:00
Sanity.h Initial file population. 2013-08-02 13:12:24 -07:00
Scores.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Scores.h Initial file population. 2013-08-02 13:12:24 -07:00
Scraper.cpp take out datedb. no longer used. we store 2014-01-09 13:39:28 -08:00
Scraper.h Initial file population. 2013-08-02 13:12:24 -07:00
SearchInput.cpp get searching on token working 2014-03-06 17:01:41 -08:00
SearchInput.h Merge branch 'testing' into diffbot-testing 2014-03-10 12:08:23 -07:00
Sections.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Sections.h get new global preemptive cache 2014-01-05 11:51:09 -08:00
seektest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
seo.h Initial file population. 2013-08-02 13:12:24 -07:00
SiteGetter.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
SiteGetter.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
sleepandlog.cpp Initial file population. 2013-08-02 13:12:24 -07:00
sort.cpp Initial file population. 2013-08-02 13:12:24 -07:00
sort.h Initial file population. 2013-08-02 13:12:24 -07:00
Speller.cpp clean up logging so i can see what's going on 2013-12-10 16:41:30 -08:00
Speller.h Initial file population. 2013-08-02 13:12:24 -07:00
Spider.cpp temp hack to reduce network comm 2014-03-20 15:42:34 -07:00
Spider.h more new site list api fixes 2014-03-09 18:15:57 -07:00
Stats.cpp formatting 2014-01-19 12:37:37 -08:00
Stats.h more formatting 2014-01-19 01:09:38 -08:00
Statsdb.cpp use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
Statsdb.h fix potential problem of tons of points in 2013-10-14 22:52:29 -07:00
StopWords.cpp update common word list 2013-12-01 15:19:33 -07:00
StopWords.h Initial file population. 2013-08-02 13:12:24 -07:00
streambuf.h Initial file population. 2013-08-02 13:12:24 -07:00
Strings.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Strings.h Initial file population. 2013-08-02 13:12:24 -07:00
Summary.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Summary.h renamed matches.h and matches.cpp to 2013-10-01 07:58:24 -07:00
superMergeTest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
supported_charsets.cpp Initial file population. 2013-08-02 13:12:24 -07:00
supported_charsets.txt Initial file population. 2013-08-02 13:12:24 -07:00
Syncdb.cpp minor fix 2014-03-07 08:07:09 -08:00
Syncdb.h Initial file population. 2013-08-02 13:12:24 -07:00
Synonyms.cpp now we index all numbers that have field names 2013-11-08 16:16:13 -08:00
Synonyms.h now we index all numbers that have field names 2013-11-08 16:16:13 -08:00
Tagdb.cpp thread fixes. if pthread_create fails then 2014-03-15 20:07:02 -07:00
Tagdb.h use collnum instead of coll string. 2014-03-06 15:48:11 -08:00
TcpServer.cpp send single space to socket if not streaming 2014-02-13 08:45:13 -08:00
TcpServer.h fixes for streaming mode. 2014-02-06 16:28:42 -08:00
TcpSocket.h send single space to socket if not streaming 2014-02-13 08:45:13 -08:00
test2.cpp Initial file population. 2013-08-02 13:12:24 -07:00
test_convert.cpp Initial file population. 2013-08-02 13:12:24 -07:00
test_hash.cpp Initial file population. 2013-08-02 13:12:24 -07:00
test_norm.cpp Initial file population. 2013-08-02 13:12:24 -07:00
test_parser2.cpp Initial file population. 2013-08-02 13:12:24 -07:00
test_parser.cpp Initial file population. 2013-08-02 13:12:24 -07:00
test_unicode.cpp Initial file population. 2013-08-02 13:12:24 -07:00
test.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Test.cpp fix a few minor bugs. 2014-03-16 10:34:58 -07:00
Test.h Initial file population. 2013-08-02 13:12:24 -07:00
testfloats.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Tfndb.cpp code cleanups. 2014-01-18 21:19:26 -08:00
Tfndb.h code cleanups. 2014-01-18 21:19:26 -08:00
Thesaurus.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Thesaurus.h Initial file population. 2013-08-02 13:12:24 -07:00
Threads.cpp thread fixes. if pthread_create fails then 2014-03-15 20:07:02 -07:00
Threads.h thread fixes. if pthread_create fails then 2014-03-15 20:07:02 -07:00
threadtest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
thunder.cpp Initial file population. 2013-08-02 13:12:24 -07:00
tifftopnm Initial file population. 2013-08-02 13:12:24 -07:00
Timedb.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Timedb.h move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
Timer.h Initial file population. 2013-08-02 13:12:24 -07:00
Title.cpp fix core from getting title of json object 2014-02-28 08:18:09 -08:00
Title.h fix core from getting title of json object 2014-02-28 08:18:09 -08:00
Titledb.cpp thread fixes. if pthread_create fails then 2014-03-15 20:07:02 -07:00
Titledb.h code cleanups. 2014-01-18 21:19:26 -08:00
TopTree.cpp index numbers as integers too, not just floats 2014-02-06 20:57:54 -08:00
TopTree.h index numbers as integers too, not just floats 2014-02-06 20:57:54 -08:00
treetest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
TuringTest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
TuringTest.h Initial file population. 2013-08-02 13:12:24 -07:00
Turkdb.cpp Initial file population. 2013-08-02 13:12:24 -07:00
types.h fix compiler warning in types.h. 2013-09-08 20:00:52 -06:00
UCNormalizer.cpp code cleanups. 2014-01-18 21:19:26 -08:00
UCNormalizer.h Initial file population. 2013-08-02 13:12:24 -07:00
UCPropTable.cpp Initial file population. 2013-08-02 13:12:24 -07:00
UCPropTable.h Initial file population. 2013-08-02 13:12:24 -07:00
UCWordIterator.cpp Initial file population. 2013-08-02 13:12:24 -07:00
UCWordIterator.h Initial file population. 2013-08-02 13:12:24 -07:00
UdpProtocol.h Initial file population. 2013-08-02 13:12:24 -07:00
UdpServer.cpp Merge branch 'diffbot' into testing 2014-03-08 09:38:44 -07:00
UdpServer.h rebalancer working pretty well now 2014-01-15 19:08:47 -08:00
UdpSlot.cpp Initial file population. 2013-08-02 13:12:24 -07:00
UdpSlot.h Initial file population. 2013-08-02 13:12:24 -07:00
udptest.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Unicode.cpp code cleanups. 2014-01-18 21:19:26 -08:00
Unicode.h move CollectionRec stuff into Collectiondb files 2013-12-10 15:28:04 -08:00
UnicodeProperties.cpp code cleanups. 2014-01-18 21:19:26 -08:00
UnicodeProperties.h Initial file population. 2013-08-02 13:12:24 -07:00
unifiedDict.txt Initial file population. 2013-08-02 13:12:24 -07:00
uniq2.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Url.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Url.h Initial file population. 2013-08-02 13:12:24 -07:00
urlinfo.cpp fixed data corruption bug. m_finalCrawlDelay 2013-11-27 14:18:15 -08:00
Users.cpp code cleanups. 2014-01-18 21:19:26 -08:00
Users.h Initial file population. 2013-08-02 13:12:24 -07:00
ValidPointer.cpp Initial file population. 2013-08-02 13:12:24 -07:00
ValidPointer.h Initial file population. 2013-08-02 13:12:24 -07:00
Vector.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Vector.h Initial file population. 2013-08-02 13:12:24 -07:00
Weights.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Weights.h Initial file population. 2013-08-02 13:12:24 -07:00
Wiki.cpp Initial file population. 2013-08-02 13:12:24 -07:00
Wiki.h Initial file population. 2013-08-02 13:12:24 -07:00
wikititles.txt.part1 Initial file population. 2013-08-02 13:12:24 -07:00
wikititles.txt.part2 Initial file population. 2013-08-02 13:12:24 -07:00
wiktionary-buf.txt Initial file population. 2013-08-02 13:12:24 -07:00
wiktionary-lang.txt Initial file population. 2013-08-02 13:12:24 -07:00
wiktionary-syns.dat Initial file population. 2013-08-02 13:12:24 -07:00
Wiktionary.cpp Merge branch 'master' into diffbot 2013-11-20 15:51:58 -08:00
Wiktionary.h Initial file population. 2013-08-02 13:12:24 -07:00
Words.cpp fixed bugs in sort by prices, etc. 2013-11-11 18:58:45 -08:00
Words.h Initial file population. 2013-08-02 13:12:24 -07:00
Xml.cpp for json docs only give them a single 2014-01-25 08:17:38 -08:00
Xml.h for json docs only give them a single 2014-01-25 08:17:38 -08:00
XmlDoc.cpp Merge branch 'diffbot' into diffbot-testing 2014-03-10 20:26:30 -07:00
XmlDoc.h new linkdb code compiling. 2014-02-20 17:27:28 -08:00
XmlNode.cpp fixed cdata parsing issue 2013-12-19 16:04:53 -08:00
XmlNode.h Initial file population. 2013-08-02 13:12:24 -07:00
zconf.h Initial file population. 2013-08-02 13:12:24 -07:00
zlib.h Initial file population. 2013-08-02 13:12:24 -07:00

open-source-search-engine

An open source web and enterprise search engine. As can be seen on http://www.gigablast.com/ .

RUNNING GIGABLAST

See html/admin.html for all administrative documentation including the quick start instructions.

Alternatively, visit http://www.gigablast.com/admin.html

See html/compare.html for a comparison of Gigablast to SOLR. Although this is very sparse right now, it does include some useful commands.

CODE ARCHITECTURE

See html/developer.html for all code documentation.

Alternatively, visit http://www.gigablast.com/developer.html

CONTACT

Contact me for feature requests or help in general. I will work for free for good use cases. mattdwells@hotmail.com.