3217 Commits

Author SHA1 Message Date
Matt
c1ec4dedbb fix for bad query formation.
text:""foo bar""
2015-08-02 11:34:55 -06:00
Kevin Truong
37591be421 Merge branch 'diffbot-testing' of https://github.com/gigablast/open-source-search-engine 2015-07-31 18:12:56 -07:00
Kevin Truong
b6207ec344 Fixes #3012. Allow facet ranges to work on negative numbers. 2015-07-31 18:11:37 -07:00
Matt
18d1a787bb fix core dump from meta data in title rec
that was just a \0 from injecting content that way
2015-07-31 18:42:21 -06:00
Matt Wells
e18fca88f4 Merge branch 'diffbot' into diffbot-testing 2015-07-31 08:56:47 -07:00
Matt Wells
85c7fbae70 fix infinite loop bug from EBADRBDID 2015-07-31 08:56:26 -07:00
Matt
5af61ff59a fix core from boolean queries 2015-07-30 10:21:30 -06:00
Matt
72768c093d Merge branch 'diffbot-sam' of github.com:gigablast/open-source-search-engine into diffbot-sam 2015-07-23 17:24:41 -06:00
sam
86946392d0 reverted stepping. Useless 2015-07-23 10:53:59 -07:00
Matt
da41d53575 Merge branch 'diffbot-testing' into diffbot-sam 2015-07-23 09:27:00 -06:00
Matt Wells
e165b5d668 speed up bool queries 2015-07-22 13:00:45 -07:00
Matt
090e1b35d5 fix score info reporting for new bool query
min score based on # of query terms contained.
2015-07-20 14:37:37 -06:00
Matt
69c791e5aa for now at least do not use siterank for ranking
boolean search results.
2015-07-20 11:50:31 -06:00
Matt
1c93a88d82 use the # of matched terms as the score of a doc
when doing a boolean query. later: use proximity
scoring for non-field query terms.
2015-07-20 11:09:56 -06:00
Matt
ff7639e323 do not get synonyms for boolean operators.
just skip synonyms if ignoreWord is set at all.
2015-07-19 13:07:05 -06:00
Matt
646bc91c59 fix more possible unicode errors 2015-07-19 12:05:09 -06:00
Matt
b9fc583cae fix core 2015-07-18 18:01:11 -06:00
Matt
16fd428887 fix more cores from the dynamic query size changes.
add how many query terms we truncated in the json/xml replies.
document those fields as well.
2015-07-18 14:15:47 -06:00
Matt Wells
dab0726fac typo fix 2015-07-17 10:43:38 -06:00
Matt
5e7a06229c print special message if no seeds were able to be crawled. 2015-07-17 08:42:01 -06:00
Matt
7e526863d7 do not include 'diffbot uri' in urls.csv. should
not have been there.
2015-07-16 10:11:04 -06:00
Matt
0d3cfc2796 single words in quotes - keep them in quotes so
we do not get synonym forms
2015-07-15 09:58:25 -06:00
Matt
f1b0bd0149 quick fix for tree sanity checker 2015-07-15 09:46:27 -06:00
Matt Wells
0d1acb09bc try to fix tree if corruption detected when dumping to disk 2015-07-14 22:27:43 -06:00
sam
b0a6e590d6 treat estimatedDate like date 2015-07-14 17:18:16 -07:00
Matt
8048517463 gbss fix 2015-07-14 18:17:52 -06:00
sam
016fa88b29 treat estimatedDate like date 2015-07-14 17:17:21 -07:00
Matt
9946b4b4be add gbssDiffbotType and gbssIsSeedUrl:1 to spider status docs. 2015-07-14 17:59:50 -06:00
Matt
fa38d97ec4 Merge branch 'diffbot' into diffbot-testing 2015-07-14 11:45:05 -06:00
Matt
f173b41e92 additional log info 2015-07-14 11:44:14 -06:00
Matt
baff94875d fix another core from dynamic query sizing 2015-07-14 09:23:44 -06:00
Matt
c8cf0e5440 fix some mem leaks from allowing really big queries.
added a max query term control to search controls to
limit users doing really big queries. but default it
very high to 1M.
2015-07-13 23:17:53 -06:00
sam
f3d35b557f should solve defect #3002 2015-07-13 18:08:25 -07:00
Matt
fc4b4db425 fix core related to increasing max query length 2015-07-13 19:00:47 -06:00
Matt
c3a0f21600 nomenclature changes 2015-07-13 18:42:13 -06:00
Matt Wells
1ba57f9278 fix pesky memory leak finally 2015-07-13 17:47:34 -06:00
Matt Wells
c03594034d bump up some limits for extraordinarily long queries 2015-07-13 17:43:28 -06:00
Matt
34ec49e804 get mike's super long query working 2015-07-13 14:59:44 -06:00
Matt
0e009fa6bc fix cores from dynamic # query terms fix 2015-07-10 20:49:40 -06:00
Matt
f088e734f6 allow up to 3000 query terms. really we can allow
much more since we are mostly dynamically allocating,
only a few smaller arrays use the 3000 on the stack.
2015-07-10 19:02:30 -06:00
Matt
5d57862046 do not core on gigabits overflow issue 2015-07-10 11:00:16 -06:00
Matt
a15d2470f5 Merge branch 'testing' into diffbot-testing 2015-07-08 13:48:58 -06:00
Matt
581f287113 api doc update for facets 2015-07-08 13:48:16 -06:00
Matt
3395ee8111 fix core in sections 2015-07-08 08:15:30 -06:00
sam
97f2052d63 remove the debug log 2015-07-06 18:05:35 -07:00
Kevin Truong
bcd53016e9 Fixes #2947. Fixed a bug with counting facet 'totalDocsWithField' and 'totalDocsWithFieldAndValue' 2015-07-06 17:42:47 -07:00
sam
6745c72232 implemented stepping 2015-07-06 17:28:17 -07:00
Matt
adae6689e6 fix add url from root page. fix core from corruption 2015-07-04 21:52:11 -06:00
Matt
815bd7ce0a quite a few bug fixes. 2015-07-02 17:42:05 -06:00
Matt
1966f36c00 fix clock candidate bug 2015-07-01 20:34:39 -06:00