Commit Graph

242 Commits

Author SHA1 Message Date
mwells
72dc660598 Merge branch 'testing' into diffbot-matt
Conflicts:
	Collectiondb.cpp
	HttpRequest.h
	PageBasic.cpp
	coll.main.0/coll.conf
2014-04-09 11:18:39 -07:00
mwells
be99155986 more updates 2014-04-09 11:03:31 -07:00
mwells
5ee79a4c2f daemonize on ./gb 0 etc. 2014-04-06 15:57:38 -07:00
mwells
b0dbf833a7 fix sitelist update logic. 2014-04-05 18:26:00 -07:00
mwells
ac5cf7971b more misc updates. 2014-04-05 18:09:04 -07:00
mwells
61b4ec4ca6 added some qa testing logic. qa.cpp. 2014-04-05 11:33:42 -07:00
Matt Wells
d6434191d1 nomenclature changes to reduce collisions.
name collection 'qatest123' for doing smoke tests,
not 'test'.
2014-03-31 15:02:17 -07:00
Matt Wells
98a10d4936 Merge branch 'testing' into diffbot-testing 2014-03-20 15:50:49 -07:00
Matt Wells
6e23d37e47 Merge branch 'diffbot' into diffbot-testing 2014-03-17 17:27:28 -07:00
Matt Wells
4abf56a75d cleanups 2014-03-16 18:06:22 -07:00
Matt Wells
5057fdaf14 aesthetic cleanups 2014-03-16 17:12:04 -07:00
Matt Wells
acd05aa740 fix a few minor bugs.
/master/->/admin/ and crawl type mismatch.
2014-03-16 10:34:58 -07:00
mwells
7812f5c746 more bool fixes. still needs a little more work 2014-03-13 13:54:23 -07:00
Matt Wells
018258bcaa Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing 2014-03-12 20:55:21 -07:00
Matt Wells
fbd1bcd349 initial attempt at new boolean query logic.
supports unlimited # of boolean query terms.
already docid phased by the existing phasing logic
but could be phased more to save more mem and speed up
a little more.
2014-03-12 20:53:44 -07:00
Matt Wells
312438a32b Merge branch 'diffbot-dan' into diffbot-testing 2014-03-11 17:02:59 -07:00
Daniel Steinberg
2331b4673d Defect #2099: throw an error if a crawl request was made with a name that already existed for a bulk request (or the other way around) 2014-03-11 16:21:58 -07:00
Matt Wells
662b6d4b32 doc updates 2014-03-09 20:43:49 -07:00
Matt Wells
90ff2c2a25 update example site lists 2014-03-09 20:35:45 -07:00
Matt Wells
8aa0662a27 Merge branch 'diffbot' into testing
Conflicts:

	Make.depend
	PageResults.cpp
	Parms.cpp
	Spider.cpp
	Spider.h
	gb.conf
2014-03-08 09:38:44 -07:00
Matt Wells
14817df7a9 new site patterns api stuff 2014-03-08 09:23:32 -07:00
Matt Wells
451a092378 fix core from changing parms while evaluating
a url.
2014-03-06 07:47:43 -08:00
Matt Wells
ab9f2b33c1 definition updates 2014-03-04 08:37:39 -07:00
Matt Wells
5f3aa24805 took out restrictDomain logic. now we always
only follow links on the same domain as the seed
UNLESS a url crawl pattern or a url crawl regex
was specified.
2014-02-27 19:53:17 -08:00
Matt Wells
ae2aed7066 try to fix a few cores from deleting collections.
try to spider urls again if user changes
certain crawling parms, like regex, patterns, etc.
2014-02-18 09:44:15 -08:00
Matt Wells
9c9d5fff98 print out content type in caps with maroon bg
in serps. use empty site patterns to mean no
restriction, not "*" anymore for simplicity.
2014-02-16 22:47:02 -07:00
Matt Wells
0b5cd6d3f9 more parm fixes 2014-02-16 22:18:39 -07:00
Matt Wells
48315f6dc3 parm fixes 2014-02-16 22:13:27 -07:00
Matt Wells
725b6189a7 show user's ip in master ips description
so they can add it to the list easily.
2014-02-16 21:56:31 -07:00
Matt Wells
ce652462b0 add color coded circles to coll nav bar.
disk usage red box.
2014-02-16 19:59:53 -07:00
Matt Wells
32526a9b25 more checksum fixes for json. fixes for
repair/rebuild procedure.
2014-02-16 10:46:41 -08:00
Matt Wells
cd6069e5a6 send single space to socket if not streaming
and search results still not ready after 10 seconds.
send it every 10 seconds to prevent client from closing socket.
sped up all downloads, json and csv, by not doing "fuzzy"
deduping of search results, and just deduping on page
content hash. added TcpSocket::m_numDestroys to ensure we
do not send heartbeat on a socket that was closed and
re-opened for another client.
2014-02-13 08:45:13 -08:00
Matt Wells
68a14de031 security admin fixes 2014-02-12 00:36:09 -07:00
Matt Wells
3b0a571cea fix security system to actually work now 2014-02-12 00:06:00 -07:00
Matt Wells
609a344a57 fix counting bug in array parms 2014-02-11 22:28:04 -07:00
Matt Wells
9a76ff2531 minor parm updates 2014-02-11 20:50:36 -07:00
Matt Wells
c9be18615c more parm saving fixes 2014-02-10 22:04:22 -07:00
Matt Wells
2efbb602df fix saving parms bug 2014-02-10 21:52:29 -07:00
Matt Wells
953b7c558d parm updates 2014-02-10 21:45:03 -07:00
Matt Wells
69fa6662bc EDOCUNCHANGED fixes for diffbot 2014-02-10 16:23:39 -08:00
Matt Wells
debd9089e8 better logging msg when updating parm. 2014-02-10 11:29:24 -08:00
Matt Wells
c041d47a0c html formatting updates 2014-02-10 00:15:04 -07:00
Matt Wells
b309d84245 html updates 2014-02-09 23:19:43 -07:00
Matt Wells
9f0d2ad82e parm updates 2014-02-09 23:05:36 -07:00
Matt Wells
cdf2550136 more parm fixes 2014-02-09 22:51:16 -07:00
Matt Wells
c2c3fe993c parm fixes for basic pages 2014-02-09 22:25:08 -07:00
Matt Wells
d2b473e554 checkpoint 2014-02-09 19:09:44 -07:00
Matt Wells
ecdd167d9b code checkpoint 2014-02-09 16:41:43 -07:00
Matt Wells
f420bd2769 checkpoint 2014-02-09 15:09:48 -07:00
Matt Wells
156b50240a code checkpoint 2014-02-08 16:24:33 -07:00