Commit Graph

4 Commits

Author SHA1 Message Date
Matt Wells
5ee2be8fcf fixed data corruption bug. m_finalCrawlDelay was being stored in xmldoc titlerec header. 2013-11-27 14:18:15 -08:00
mwells
f562e6da9a just ignore all urls with # (hashtag) in them from the dmoz dump. we were truncating http://twitter.com/#!/ronpaul to http://twitter.com/ and when looking up the catids of twitter.com got that ronpaul url. so that's bad. people should respect the hashtag. 2013-10-03 23:33:55 -06:00
David Sparks
0783c7395e Copied these global vars from main.cpp to fix compilation error 2013-08-04 22:37:01 -07:00
Matt Wells
f6e560c1f4 Initial file population. 2013-08-02 13:12:24 -07:00