antiword-dir
Initial file population.
2013-08-02 13:12:24 -07:00
coll.main.0
added "retrictDomain" parm which defaults to 1.
2013-10-29 09:31:57 -07:00
html
Merge branch 'master' into diffbot
2013-10-25 12:32:02 -07:00
openssl
we already include our own 32-bit
2013-09-15 18:25:49 -06:00
ucdata
Initial file population.
2013-08-02 13:12:24 -07:00
Abbreviations.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Abbreviations.h
change a couple of possible reserved names in C++
2013-08-28 22:59:01 -06:00
Accessdb.cpp
fix addColl() logic for collectionless rdbs
2013-10-16 14:38:09 -07:00
Accessdb.h
Initial file population.
2013-08-02 13:12:24 -07:00
Address.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Address.h
change a couple of possible reserved names in C++
2013-08-28 22:59:01 -06:00
addtest.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Ads.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Ads.h
Initial file population.
2013-08-02 13:12:24 -07:00
AdultBit.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
AdultBit.h
Initial file population.
2013-08-02 13:12:24 -07:00
animate.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
antiword
Initial file population.
2013-08-02 13:12:24 -07:00
AutoBan.cpp
various fixes.
2013-09-16 10:16:49 -07:00
AutoBan.h
Initial file population.
2013-08-02 13:12:24 -07:00
badcattable.dat
Initial file population.
2013-08-02 13:12:24 -07:00
BigFile.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
BigFile.h
Initial file population.
2013-08-02 13:12:24 -07:00
Bits.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Bits.h
Initial file population.
2013-08-02 13:12:24 -07:00
blaster.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Blaster.cpp
use ./gb blaster -u <fileofurls> to just inject urls,
2013-08-19 16:33:27 -06:00
Blaster.h
use ./gb blaster -u <fileofurls> to just inject urls,
2013-08-19 16:33:27 -06:00
bmptopnm
Initial file population.
2013-08-02 13:12:24 -07:00
Cachedb.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Cachedb.h
Initial file population.
2013-08-02 13:12:24 -07:00
camsort.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
catcountry.dat
Initial file population.
2013-08-02 13:12:24 -07:00
Catdb.cpp
fix addColl() logic for collectionless rdbs
2013-10-16 14:38:09 -07:00
Catdb.h
Initial file population.
2013-08-02 13:12:24 -07:00
Categories.cpp
documentation updates. fixed sd=0.
2013-10-13 14:24:41 -07:00
Categories.h
documentation updates. fixed sd=0.
2013-10-13 14:24:41 -07:00
CatRec.cpp
fix a couple catdb generation bugs.
2013-10-12 20:33:04 -07:00
CatRec.h
Initial file population.
2013-08-02 13:12:24 -07:00
character-sets
Initial file population.
2013-08-02 13:12:24 -07:00
check_unicode.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Clusterdb.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Clusterdb.h
Initial file population.
2013-08-02 13:12:24 -07:00
Collectiondb.cpp
reset spiderstatus
2013-10-30 13:49:31 -07:00
Collectiondb.h
would block when deleting or resetting
2013-10-30 13:12:46 -07:00
CollectionRec.cpp
better crawl status reporting.
2013-10-30 10:00:46 -07:00
CollectionRec.h
better crawl status reporting.
2013-10-30 10:00:46 -07:00
Conf.cpp
Merge branch 'master' into diffbot
2013-09-28 13:13:12 -07:00
Conf.h
Merge branch 'master' into diffbot
2013-10-16 14:28:42 -07:00
convert.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
CountryCode.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
CountryCode.h
Initial file population.
2013-08-02 13:12:24 -07:00
create_ucd_tables.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
DailyMerge.cpp
Fixed some bugs.
2013-08-09 08:52:15 -07:00
DailyMerge.h
Initial file population.
2013-08-02 13:12:24 -07:00
DataFeed.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
DataFeed.h
Initial file population.
2013-08-02 13:12:24 -07:00
Datedb.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Datedb.h
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Dates.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Dates.h
Initial file population.
2013-08-02 13:12:24 -07:00
Diff.cpp
Fixed some bugs.
2013-08-09 08:52:15 -07:00
Diff.h
Initial file population.
2013-08-02 13:12:24 -07:00
Dir.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Dir.h
Initial file population.
2013-08-02 13:12:24 -07:00
DiskPageCache.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
DiskPageCache.h
Initial file population.
2013-08-02 13:12:24 -07:00
dlstubs.c
Initial file population.
2013-08-02 13:12:24 -07:00
dmozparse.cpp
add support for noindex meta tag.
2013-10-12 22:50:23 -07:00
Dns.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Dns.h
Initial file population.
2013-08-02 13:12:24 -07:00
DnsProtocol.h
Initial file population.
2013-08-02 13:12:24 -07:00
dnstest.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Domains.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Domains.h
Initial file population.
2013-08-02 13:12:24 -07:00
dumpcore.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Entities.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Entities.h
Initial file population.
2013-08-02 13:12:24 -07:00
Errno.cpp
add spider reply even on g_errno now with an error
2013-09-29 09:22:20 -06:00
Errno.h
add spider reply even on g_errno now with an error
2013-09-29 09:22:20 -06:00
Events.h
Initial file population.
2013-08-02 13:12:24 -07:00
Facebook.cpp
fix addColl() logic for collectionless rdbs
2013-10-16 14:38:09 -07:00
Facebook.h
Initial file population.
2013-08-02 13:12:24 -07:00
fastIndexTable.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
fctypes.cpp
fix core from calling a gettime related
2013-09-06 15:39:53 -06:00
fctypes.h
Initial file population.
2013-08-02 13:12:24 -07:00
File.cpp
couple fixes to makefile etc.
2013-09-28 16:37:39 -06:00
File.h
Initial file population.
2013-08-02 13:12:24 -07:00
filterquerylogs.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Flags.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Flags.h
Initial file population.
2013-08-02 13:12:24 -07:00
gb-include.h
Initial file population.
2013-08-02 13:12:24 -07:00
gb.conf
fix respider frequency bug.
2013-10-21 15:06:23 -07:00
gb.pem
so we have spider https sites add
2013-10-13 00:15:39 -07:00
gbfilter.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
gbtitletest.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
geneaology.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
generateSuperMergeCode.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
geo_ip_table.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
geo_ip_table.h
Initial file population.
2013-08-02 13:12:24 -07:00
GeoIP_internal.h
Initial file population.
2013-08-02 13:12:24 -07:00
GeoIP.c
Initial file population.
2013-08-02 13:12:24 -07:00
GeoIP.h
Initial file population.
2013-08-02 13:12:24 -07:00
GeoIPCity.c
Initial file population.
2013-08-02 13:12:24 -07:00
GeoIPCity.h
Initial file population.
2013-08-02 13:12:24 -07:00
getsample.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
giftopnm
Initial file population.
2013-08-02 13:12:24 -07:00
hash.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
hash.h
get "&site=abc.com+xyz.com"... working to restrict
2013-09-15 20:16:48 -07:00
HashTable.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
HashTable.h
Initial file population.
2013-08-02 13:12:24 -07:00
HashTableT.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
HashTableT.h
Initial file population.
2013-08-02 13:12:24 -07:00
HashTableX.cpp
spider speedups and fixes.
2013-09-25 11:58:03 -06:00
HashTableX.h
spider speedups and fixes.
2013-09-25 11:58:03 -06:00
hashtest2.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
hashtest3.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
hashtest.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Highlight.cpp
trying to fix json decoding bug.
2013-10-24 17:55:01 -07:00
Highlight.h
trying to fix json decoding bug.
2013-10-24 17:55:01 -07:00
Hostdb.cpp
num-mirrors: updates
2013-10-24 14:59:35 -07:00
Hostdb.h
fix another bug from shard change.
2013-10-04 16:49:50 -07:00
hosts.conf
minor msg update
2013-10-29 15:26:32 -07:00
hosts.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
HttpMime.cpp
Merge branch 'master' into diffbot
2013-10-16 14:28:42 -07:00
HttpMime.h
update the dirty word list. but we still
2013-10-15 01:01:19 -07:00
HttpRequest.cpp
made webhook return the crawl name
2013-10-28 22:03:10 -07:00
HttpRequest.h
Merge branch 'master' into diffbot
2013-09-28 13:13:12 -07:00
HttpServer.cpp
/v2/bulk api fixes
2013-10-22 18:51:09 -07:00
HttpServer.h
add sendEmailThroughMandrill() to send
2013-10-08 18:01:38 -07:00
iana_charset.cpp
new crawlbot api. not backwards compatible any more.
2013-09-17 10:25:54 -07:00
iana_charset.h
new crawlbot api. not backwards compatible any more.
2013-09-17 10:25:54 -07:00
iconv.h
Initial file population.
2013-08-02 13:12:24 -07:00
Images.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Images.h
Initial file population.
2013-08-02 13:12:24 -07:00
Indexdb.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Indexdb.h
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
IndexList.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
IndexList.h
Initial file population.
2013-08-02 13:12:24 -07:00
IndexReadInfo.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
IndexReadInfo.h
Initial file population.
2013-08-02 13:12:24 -07:00
IndexTable2.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
IndexTable2.h
Initial file population.
2013-08-02 13:12:24 -07:00
IndexTable.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
IndexTable.h
Initial file population.
2013-08-02 13:12:24 -07:00
injectme3
added injectme3 file and documentation into compare.html
2013-08-17 11:02:26 -06:00
injector.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
iostream.h
Initial file population.
2013-08-02 13:12:24 -07:00
ip.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
ip.h
Initial file population.
2013-08-02 13:12:24 -07:00
ipconfig.cpp
fixed some cores. brought in fixes from
2013-09-08 16:16:13 -06:00
Iso8859.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Iso8859.h
Initial file population.
2013-08-02 13:12:24 -07:00
jointest.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
jpegtopnm
Initial file population.
2013-08-02 13:12:24 -07:00
Json.cpp
a few bug fixes
2013-10-17 18:59:00 -07:00
Json.h
a few bug fixes
2013-10-17 18:59:00 -07:00
keepalive.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Lang.cpp
comment updates
2013-10-15 23:13:50 -07:00
Lang.h
Initial file population.
2013-08-02 13:12:24 -07:00
LangList.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
LangList.h
Initial file population.
2013-08-02 13:12:24 -07:00
Language.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Language.h
Initial file population.
2013-08-02 13:12:24 -07:00
LanguageIdentifier.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
LanguageIdentifier.h
Initial file population.
2013-08-02 13:12:24 -07:00
LanguagePages.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
LanguagePages.h
Initial file population.
2013-08-02 13:12:24 -07:00
libc.a
Initial file population.
2013-08-02 13:12:24 -07:00
libcrypto.a
Initial file population.
2013-08-02 13:12:24 -07:00
libgcc.a
Initial file population.
2013-08-02 13:12:24 -07:00
libiconv.a
Initial file population.
2013-08-02 13:12:24 -07:00
libiconv.la
Initial file population.
2013-08-02 13:12:24 -07:00
libm.a
Initial file population.
2013-08-02 13:12:24 -07:00
libpthread.a
Initial file population.
2013-08-02 13:12:24 -07:00
libssl.a
Initial file population.
2013-08-02 13:12:24 -07:00
libstdc++.a
Initial file population.
2013-08-02 13:12:24 -07:00
libz.a
Initial file population.
2013-08-02 13:12:24 -07:00
LICENSE
exclude events and seo functionality.
2013-09-08 17:07:42 -06:00
Linkdb.cpp
added X-referring-url: X-anchor-text: and
2013-10-31 11:44:09 -07:00
Linkdb.h
fix mem leak of LinkInfo.
2013-10-16 17:17:28 -07:00
LinkedList.h
Initial file population.
2013-08-02 13:12:24 -07:00
linkspam.cpp
renamed matches.h and matches.cpp to
2013-10-01 07:58:24 -07:00
linkspam.h
Initial file population.
2013-08-02 13:12:24 -07:00
Log.cpp
fixed up thread/spider log msgs.
2013-08-29 21:15:42 -06:00
Log.h
Initial file population.
2013-08-02 13:12:24 -07:00
Loop.cpp
cleanup warnings in log.
2013-09-13 14:37:35 -07:00
Loop.h
Initial file population.
2013-08-02 13:12:24 -07:00
looptest.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
main.cpp
would block when deleting or resetting
2013-10-30 13:12:46 -07:00
Make.depend
fix collection resetting.
2013-10-18 15:21:00 -07:00
Makefile
added X-referring-url: X-anchor-text: and
2013-10-31 11:44:09 -07:00
malloc.c
Initial file population.
2013-08-02 13:12:24 -07:00
matches2.cpp
dirty word detector revisions. we need
2013-10-16 20:19:49 -07:00
matches2.h
renamed matches.h and matches.cpp to
2013-10-01 07:58:24 -07:00
Matches.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Matches.h
Initial file population.
2013-08-02 13:12:24 -07:00
Mem.cpp
fix crawl round end detection etc.
2013-10-23 15:53:59 -07:00
Mem.h
fix typo
2013-09-08 19:51:57 -07:00
membustest.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
MemPool.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
MemPool.h
Initial file population.
2013-08-02 13:12:24 -07:00
MemPoolTree.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
MemPoolTree.h
Initial file population.
2013-08-02 13:12:24 -07:00
memtest.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
mergetest.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
MetaContainer.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
MetaContainer.h
Initial file population.
2013-08-02 13:12:24 -07:00
Mime.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Mime.h
Initial file population.
2013-08-02 13:12:24 -07:00
mixfile.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
mmseg.h
Initial file population.
2013-08-02 13:12:24 -07:00
monitor.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Monitordb.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Monitordb.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg0.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Msg0.h
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Msg1.cpp
Merge branch 'master' into diffbot
2013-10-16 14:28:42 -07:00
Msg1.h
log fixes for debugging. try to
2013-10-02 22:37:20 -06:00
Msg1f.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Msg1f.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg2.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Msg2.h
new &sites=xyz.com+abc.com+... functionality compiles ok.
2013-09-15 18:14:32 -06:00
Msg2a.cpp
Merge branch 'master' into diffbot
2013-10-16 14:28:42 -07:00
Msg2a.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg2b.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Msg2b.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg3.cpp
get "&site=abc.com+xyz.com"... working to restrict
2013-09-15 20:16:48 -07:00
Msg3.h
almost done adding support for whitelists.
2013-09-15 15:15:56 -06:00
Msg3a.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Msg3a.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg3e.cpp
fix infinite loop from json parsing and
2013-09-27 17:52:36 -06:00
Msg3e.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg4.cpp
fix issue of losing data destined for
2013-10-30 15:48:31 -07:00
Msg4.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg5.cpp
try to fix core from spiderdb scan coming back to
2013-10-29 16:51:21 -07:00
Msg5.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg6b.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Msg6b.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg8b.cpp
fixes when crawling on distributed 2x2
2013-10-25 14:54:24 -07:00
Msg8b.h
Merge branch 'master' into diffbot
2013-10-16 14:28:42 -07:00
Msg9b.cpp
fix a couple catdb generation bugs.
2013-10-12 20:33:04 -07:00
Msg9b.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg13.cpp
fix core when getting new spider reply
2013-10-04 20:44:29 -07:00
Msg13.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg17.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Msg17.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg20.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Msg20.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg22.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Msg22.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg24.cpp
new Make.depend.
2013-08-09 17:13:45 -06:00
Msg28.cpp
fix core from (broad)casting valueless cgi field.
2013-10-03 14:51:59 -07:00
Msg28.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg30.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Msg30.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg35.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Msg35.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg36.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Msg36.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg37.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Msg37.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg39.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Msg39.h
new &sites=xyz.com+abc.com+... functionality compiles ok.
2013-09-15 18:14:32 -06:00
Msg40.cpp
trying to bring back dmoz integration.
2013-10-02 22:34:21 -06:00
Msg40.h
trying to bring back dmoz integration.
2013-10-02 22:34:21 -06:00
Msg40Cache.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Msg40Cache.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg42.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Msg42.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msg51.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Msg51.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msgaa.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Msgaa.h
Initial file population.
2013-08-02 13:12:24 -07:00
MsgC.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
MsgC.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msge0.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Msge0.h
Initial file population.
2013-08-02 13:12:24 -07:00
Msge1.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Msge1.h
Initial file population.
2013-08-02 13:12:24 -07:00
Multicast.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Multicast.h
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
mysynonyms.txt
Initial file population.
2013-08-02 13:12:24 -07:00
numwords.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
PageAddColl.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
PageAddUrl.cpp
Merge branch 'master' into diffbot
2013-10-16 14:28:42 -07:00
PageCatdb.cpp
trying to bring back dmoz integration.
2013-10-02 22:34:21 -06:00
PageCrawlBot.cpp
nomenclature download->crawl
2013-10-30 16:14:30 -07:00
PageCrawlBot.h
added "seeds" to json reply. store seed urls
2013-10-21 17:35:14 -07:00
PageDirectory.cpp
fix dup bug.
2013-10-13 16:06:38 -07:00
PageEvents.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
PageGet.cpp
trying to fix json decoding bug.
2013-10-24 17:55:01 -07:00
PageHosts.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
PageIndexdb.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
PageInject.cpp
crawlbot fixes.
2013-10-15 16:31:59 -07:00
PageInject.h
crawlbot fixes.
2013-10-15 16:31:59 -07:00
PageLogin.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
PageLogView.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
PageNetTest.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
PageNetTest.h
Initial file population.
2013-08-02 13:12:24 -07:00
PageOverview.cpp
Merge branch 'master' into diffbot
2013-10-16 14:28:42 -07:00
PageParser.cpp
added X-referring-url: X-anchor-text: and
2013-10-31 11:44:09 -07:00
PageParser.h
Initial file population.
2013-08-02 13:12:24 -07:00
PagePerf.cpp
half way done fixing performance graph.
2013-10-13 22:02:21 -07:00
PageReindex.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
PageReindex.h
Initial file population.
2013-08-02 13:12:24 -07:00
PageResults.cpp
trying to fix json decoding bug.
2013-10-24 17:55:01 -07:00
PageResults.h
added searchbox for dmoz pages/sites.
2013-10-13 15:45:12 -07:00
PageRoot.cpp
Merge branch 'master' into diffbot
2013-10-16 14:28:42 -07:00
Pages.cpp
/v2/bulk api fixes
2013-10-22 18:51:09 -07:00
Pages.h
got email and url notification code compiling.
2013-10-01 15:14:39 -06:00
PageSockets.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
PageSpam.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
PageStats.cpp
Merge branch 'master' into diffbot
2013-10-16 14:28:42 -07:00
PageStatsdb.cpp
fix potential problem of tons of points in
2013-10-14 22:52:29 -07:00
PageSubmit.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
PageThesaurus.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
PageThreads.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
PageTitledb.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
PageTurk.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
PageTurk.h
Initial file population.
2013-08-02 13:12:24 -07:00
Parms.cpp
added "retrictDomain" parm which defaults to 1.
2013-10-29 09:31:57 -07:00
Parms.h
code checkpoint
2013-10-14 13:00:05 -06:00
parse_iana_charsets.pl
Initial file population.
2013-08-02 13:12:24 -07:00
pdftohtml
use the "onsite" keyword in your url filters
2013-09-06 09:37:17 -06:00
Phrases.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Phrases.h
Initial file population.
2013-08-02 13:12:24 -07:00
PingServer.cpp
better crawl status reporting.
2013-10-30 10:00:46 -07:00
PingServer.h
better crawl status reporting.
2013-10-30 10:00:46 -07:00
Placedb.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Placedb.h
Initial file population.
2013-08-02 13:12:24 -07:00
pngtopnm
Initial file population.
2013-08-02 13:12:24 -07:00
pnmscale
Initial file population.
2013-08-02 13:12:24 -07:00
Pops.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Pops.h
Initial file population.
2013-08-02 13:12:24 -07:00
porter.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Pos.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Pos.h
Initial file population.
2013-08-02 13:12:24 -07:00
Posdb.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Posdb.h
speed up whitelist hashtable like 20x
2013-09-15 21:10:53 -07:00
postalCodes.txt
Initial file population.
2013-08-02 13:12:24 -07:00
PostQueryRerank.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
PostQueryRerank.h
Initial file population.
2013-08-02 13:12:24 -07:00
ppmtojpeg
Initial file population.
2013-08-02 13:12:24 -07:00
ppthtml
Initial file population.
2013-08-02 13:12:24 -07:00
Process.cpp
would block when deleting or resetting
2013-10-30 13:12:46 -07:00
Process.h
would block when deleting or resetting
2013-10-30 13:12:46 -07:00
Profiler.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Profiler.h
Initial file population.
2013-08-02 13:12:24 -07:00
Proxy.cpp
Merge branch 'master' into diffbot
2013-10-16 14:28:42 -07:00
Proxy.h
Initial file population.
2013-08-02 13:12:24 -07:00
pstotext
Initial file population.
2013-08-02 13:12:24 -07:00
QAClient.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
QAClient.h
Initial file population.
2013-08-02 13:12:24 -07:00
quarantine.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Query.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Query.h
Initial file population.
2013-08-02 13:12:24 -07:00
Rdb.cpp
fix bug when we nuke a collnum
2013-10-30 12:27:08 -07:00
Rdb.h
fix bug when we nuke a collnum
2013-10-30 12:27:08 -07:00
RdbBase.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
RdbBase.h
Initial file population.
2013-08-02 13:12:24 -07:00
RdbBuckets.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
RdbBuckets.h
Initial file population.
2013-08-02 13:12:24 -07:00
RdbCache.cpp
now we can reset collection mid stream
2013-10-18 17:49:36 -07:00
RdbCache.h
removed MAX_COLL_RECS so we can have unlimited
2013-08-30 16:20:38 -07:00
RdbDump.cpp
Merge branch 'master' into diffbot
2013-10-25 12:32:02 -07:00
RdbDump.h
Initial file population.
2013-08-02 13:12:24 -07:00
RdbList.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
RdbList.h
Initial file population.
2013-08-02 13:12:24 -07:00
RdbMap.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
RdbMap.h
Initial file population.
2013-08-02 13:12:24 -07:00
RdbMem.cpp
track down some nasty cores. fix
2013-10-29 16:37:14 -07:00
RdbMem.h
now we can reset collection mid stream
2013-10-18 17:49:36 -07:00
RdbMerge.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
RdbMerge.h
Initial file population.
2013-08-02 13:12:24 -07:00
RdbScan.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
RdbScan.h
Initial file population.
2013-08-02 13:12:24 -07:00
rdbtest2.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
rdbtest.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
RdbTree.cpp
track down some nasty cores. fix
2013-10-29 16:37:14 -07:00
RdbTree.h
fix a couple collection related bugs
2013-10-21 11:38:33 -07:00
README.md
updated README.md to reference compare.html
2013-08-19 17:20:30 -06:00
readRec.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
reindex2.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Repair.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Repair.h
Initial file population.
2013-08-02 13:12:24 -07:00
RequestTable.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
RequestTable.h
Initial file population.
2013-08-02 13:12:24 -07:00
rescue.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Revdb.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Revdb.h
Initial file population.
2013-08-02 13:12:24 -07:00
rmbots.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
SafeBuf.cpp
fix that json RE-encoding bug
2013-10-24 18:09:35 -07:00
SafeBuf.h
Merge branch 'master' into diffbot
2013-10-16 14:28:42 -07:00
SafeList.h
Initial file population.
2013-08-02 13:12:24 -07:00
Sanity.h
Initial file population.
2013-08-02 13:12:24 -07:00
Scores.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Scores.h
Initial file population.
2013-08-02 13:12:24 -07:00
Scraper.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Scraper.h
Initial file population.
2013-08-02 13:12:24 -07:00
SearchInput.cpp
Merge branch 'master' into diffbot
2013-10-16 14:28:42 -07:00
SearchInput.h
Merge branch 'master' into diffbot
2013-10-16 14:28:42 -07:00
Sections.cpp
Merge branch 'master' into diffbot
2013-10-16 14:28:42 -07:00
Sections.h
make sections grow dynamically so we do not
2013-10-06 11:04:10 -06:00
seektest.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
seo.h
Initial file population.
2013-08-02 13:12:24 -07:00
SiteGetter.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
SiteGetter.h
Initial file population.
2013-08-02 13:12:24 -07:00
sleepandlog.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
sort.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
sort.h
Initial file population.
2013-08-02 13:12:24 -07:00
Speller.cpp
Merge branch 'master' into diffbot
2013-10-16 14:28:42 -07:00
Speller.h
Initial file population.
2013-08-02 13:12:24 -07:00
Spider.cpp
added X-referring-url: X-anchor-text: and
2013-10-31 11:44:09 -07:00
Spider.h
better crawl status reporting.
2013-10-30 10:00:46 -07:00
Stats.cpp
start using html div graph for
2013-10-14 20:35:45 -07:00
Stats.h
remove old libplotter references
2013-10-13 23:48:07 -07:00
Statsdb.cpp
Merge branch 'master' into diffbot
2013-10-16 14:28:42 -07:00
Statsdb.h
fix potential problem of tons of points in
2013-10-14 22:52:29 -07:00
StopWords.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
StopWords.h
Initial file population.
2013-08-02 13:12:24 -07:00
streambuf.h
Initial file population.
2013-08-02 13:12:24 -07:00
Strings.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Strings.h
Initial file population.
2013-08-02 13:12:24 -07:00
Summary.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Summary.h
renamed matches.h and matches.cpp to
2013-10-01 07:58:24 -07:00
superMergeTest.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
supported_charsets.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
supported_charsets.txt
Initial file population.
2013-08-02 13:12:24 -07:00
Syncdb.cpp
fix addColl() logic for collectionless rdbs
2013-10-16 14:38:09 -07:00
Syncdb.h
Initial file population.
2013-08-02 13:12:24 -07:00
Synonyms.cpp
fix core from hashtablex::set() not getting
2013-09-15 21:15:58 -07:00
Synonyms.h
Initial file population.
2013-08-02 13:12:24 -07:00
Tagdb.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Tagdb.h
Initial file population.
2013-08-02 13:12:24 -07:00
TcpServer.cpp
Merge branch 'master' into diffbot
2013-10-16 14:28:42 -07:00
TcpServer.h
Initial file population.
2013-08-02 13:12:24 -07:00
TcpSocket.h
integrate diffbot from svn back into git.
2013-09-13 09:23:18 -07:00
test2.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
test_convert.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
test_hash.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
test_norm.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
test_parser2.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
test_parser.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
test_unicode.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
test.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Test.cpp
deal with if callback is null
2013-10-30 13:18:19 -07:00
Test.h
Initial file population.
2013-08-02 13:12:24 -07:00
testfloats.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Tfndb.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Tfndb.h
Initial file population.
2013-08-02 13:12:24 -07:00
Thesaurus.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Thesaurus.h
Initial file population.
2013-08-02 13:12:24 -07:00
Threads.cpp
cleanup warnings in log.
2013-09-13 14:37:35 -07:00
Threads.h
when using pthreads block SIGIO
2013-08-21 15:01:26 -06:00
threadtest.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
thunder.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
tifftopnm
Initial file population.
2013-08-02 13:12:24 -07:00
Timedb.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Timedb.h
Initial file population.
2013-08-02 13:12:24 -07:00
Timer.h
Initial file population.
2013-08-02 13:12:24 -07:00
Title.cpp
Merge branch 'master' into diffbot
2013-09-16 09:05:37 -07:00
Title.h
Initial file population.
2013-08-02 13:12:24 -07:00
Titledb.cpp
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
Titledb.h
move from groups to shards. got rid of annoying
2013-10-04 16:18:56 -07:00
TopTree.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
TopTree.h
Initial file population.
2013-08-02 13:12:24 -07:00
treetest.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
TuringTest.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
TuringTest.h
Initial file population.
2013-08-02 13:12:24 -07:00
Turkdb.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
types.h
fix compiler warning in types.h.
2013-09-08 20:00:52 -06:00
UCNormalizer.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
UCNormalizer.h
Initial file population.
2013-08-02 13:12:24 -07:00
UCPropTable.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
UCPropTable.h
Initial file population.
2013-08-02 13:12:24 -07:00
UCWordIterator.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
UCWordIterator.h
Initial file population.
2013-08-02 13:12:24 -07:00
UdpProtocol.h
Initial file population.
2013-08-02 13:12:24 -07:00
UdpServer.cpp
fix core from trying to get the time
2013-09-01 12:55:22 -06:00
UdpServer.h
Initial file population.
2013-08-02 13:12:24 -07:00
UdpSlot.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
UdpSlot.h
Initial file population.
2013-08-02 13:12:24 -07:00
udptest.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Unicode.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Unicode.h
Initial file population.
2013-08-02 13:12:24 -07:00
UnicodeProperties.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
UnicodeProperties.h
Initial file population.
2013-08-02 13:12:24 -07:00
unifiedDict.txt
Initial file population.
2013-08-02 13:12:24 -07:00
uniq2.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Url.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Url.h
Initial file population.
2013-08-02 13:12:24 -07:00
urlinfo.cpp
just ignore all urls with # (hashtag) in them
2013-10-03 23:33:55 -06:00
Users.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Users.h
Initial file population.
2013-08-02 13:12:24 -07:00
ValidPointer.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
ValidPointer.h
Initial file population.
2013-08-02 13:12:24 -07:00
Vector.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Vector.h
Initial file population.
2013-08-02 13:12:24 -07:00
Weights.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Weights.h
Initial file population.
2013-08-02 13:12:24 -07:00
Wiki.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Wiki.h
Initial file population.
2013-08-02 13:12:24 -07:00
wikititles.txt.part1
Initial file population.
2013-08-02 13:12:24 -07:00
wikititles.txt.part2
Initial file population.
2013-08-02 13:12:24 -07:00
wiktionary-buf.txt
Initial file population.
2013-08-02 13:12:24 -07:00
wiktionary-lang.txt
Initial file population.
2013-08-02 13:12:24 -07:00
wiktionary-syns.dat
Initial file population.
2013-08-02 13:12:24 -07:00
Wiktionary.cpp
remove debug point.
2013-10-20 10:25:26 -07:00
Wiktionary.h
Initial file population.
2013-08-02 13:12:24 -07:00
Words.cpp
speed up whitelist hashtable like 20x
2013-09-15 21:10:53 -07:00
Words.h
Initial file population.
2013-08-02 13:12:24 -07:00
Xml.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
Xml.h
Initial file population.
2013-08-02 13:12:24 -07:00
XmlDoc.cpp
added X-referring-url: X-anchor-text: and
2013-10-31 11:44:09 -07:00
XmlDoc.h
just selecting a url to crawl should
2013-10-28 22:38:15 -07:00
XmlNode.cpp
Initial file population.
2013-08-02 13:12:24 -07:00
XmlNode.h
Initial file population.
2013-08-02 13:12:24 -07:00
zconf.h
Initial file population.
2013-08-02 13:12:24 -07:00
zlib.h
Initial file population.
2013-08-02 13:12:24 -07:00