464 Commits

Author SHA1 Message Date
mwells
1e47e32384 Merge branch 'diffbot-testing' into testing
Conflicts:
	XmlDoc.cpp
2014-08-01 14:06:25 -07:00
mwells
146e45db56 try to fix some redirect issues 2014-07-31 10:34:03 -07:00
mwells
257f225232 more facet fixes 2014-07-30 17:00:33 -07:00
mwells
c4174a0ca6 fix bug causing qa json facet test to fail 2014-07-30 15:36:08 -07:00
mwells
2d6fd06c27 fix facets over json docs. added json
docs through add url and json queries to qa tests.
2014-07-30 14:50:33 -07:00
mwells
dff04eff45 fix facet/xpath lookup stuff. 2014-07-30 10:41:21 -07:00
mwells
546d726487 pretty ups 2014-07-30 09:59:25 -07:00
mwells
312b39c059 lookup facet values to get their text representations. 2014-07-29 16:17:18 -07:00
mwells
958343957a nothing 2014-07-29 10:44:23 -07:00
mwells
3cc54b72cc qa updates 2014-07-28 19:15:31 -07:00
mwells
2a094accff add qa page 2014-07-25 17:39:29 -07:00
Matt Wells
066ccc831c fix core. lower 0x39 outstanding to 10 2014-07-23 07:30:04 -07:00
Matt Wells
b00c776f73 fix bugs for injecting with &sections=1.
follow simpler redirs if injecting.
2014-07-22 15:14:42 -07:00
Matt Wells
d2b1196a85 Merge branch 'diffbot-testing' into testing 2014-07-22 10:47:33 -07:00
Matt Wells
c5d72e0e18 do not do any simplified redirects for custom
crawls, just like bulk jobs.
2014-07-22 07:14:55 -07:00
Matt Wells
47b46a202c computing link text is slow for some reason.
needs to be looked at, but take out for now
for custom crawls to speed things up.
2014-07-19 11:12:33 -07:00
Matt Wells
d0bc187a77 more core fixes. more stability. 2014-07-16 12:52:51 -07:00
Matt Wells
dc7a78687c fix long-standing core when getting linkinfo
from a collection that got nuked.
2014-07-16 10:40:12 -07:00
mwells
6e345227a8 qa test fixes 2014-07-15 10:06:33 -07:00
mwells
15756ec94a Merge branch 'diffbot-testing' into testing 2014-07-14 18:10:13 -07:00
mwells
a72c5dae51 fix <script> tags that immediately end in </script> or
never end but hit another <script> or a </gbiframe> tag.
2014-07-14 17:24:20 -07:00
mwells
6078d36dcc qa test fixes 2014-07-14 12:44:32 -07:00
mwells
4adb57f98e prepare for release 1.2 2014-07-12 17:58:36 -06:00
mwells
5f26918910 lots of bug fixes. more qa fixes. 2014-07-11 08:00:30 -07:00
Matt Wells
0ecc7933d6 qa test for squid/sections 2014-07-10 16:28:24 -07:00
Matt Wells
9969644d23 fix section stats display bugs 2014-07-10 15:55:18 -07:00
mwells
950352d781 do not hash redundant xpaths that have the same inner sentence/alnum
html as their children tags. waste of index space.
2014-07-09 17:16:01 -07:00
mwells
b231bc8042 incorporate total # of docs with that xpathsitehash
into the tag attr. so using the MxDy should be good
enough to determine if something is chrome or not.
2014-07-09 16:47:47 -07:00
mwells
50c64f9369 fix printing of getInlineSectionVotingBuf() to be more accurate 2014-07-09 15:44:41 -07:00
mwells
05fcef9651 more vote infusion and squid proxy fixes. 2014-07-09 14:57:58 -07:00
mwells
d4218e01d7 inject docs that come through our squid proxy 2014-07-09 12:25:23 -07:00
mwells
5ae476f34e print facets for each search result 2014-07-08 19:38:54 -07:00
mwells
1af75c5d88 send back facet field/value pairs in msg20reply 2014-07-08 14:22:55 -07:00
mwells
a4273a1269 section voting markup updates 2014-07-08 11:14:45 -07:00
mwells
e658ebc8f6 fix up sections page some more. useful
for debugging sections stuff.
2014-07-08 10:31:42 -07:00
mwells
842d72b5db Merge branch 'testing' into diffbot-matt 2014-07-08 09:58:54 -07:00
mwells
d7cc290a1f added a few new search parms that can be used
to override collection defaults.
hide all clustered results.
max title len.
max summary excerpt/line width.
2014-07-08 07:01:51 -07:00
mwells
a7bddbcc0b return up to the first 3 h1 tags when &geth1tag=1
is specified for an xml or json feed.
2014-07-07 21:01:07 -07:00
Matt Wells
445896e04c fix query reindex core 2014-07-07 19:11:01 -07:00
mwells
c8567f8a24 sectioning stuff working halfway decent.
still need to do docid-based stats perhaps.
need to scroll to section hash when clicking
the 'sections' link.
2014-07-07 16:46:38 -07:00
mwells
d9ae010371 shard gbfacetstr:gbxpathsitehash123456 terms by termid for speed.
got them working again multicasting a msg 0x39 to the appropriate shard.
set special msg39request flag for better performance for those guys.
2014-07-07 12:32:27 -07:00
mwells
6434e5cc04 Merge branch 'testing' into diffbot-matt
Conflicts:
	Errno.cpp
	Errno.h
	Parms.h
2014-07-07 09:49:59 -07:00
mwells
e22641997a fix geth1tag some more.
fixed bad comment tag detection. was losing
a good deal of some pages because of that.
2014-07-07 08:20:21 -07:00
mwells
dc6c97c59c basic qa tests running 2014-07-06 18:53:05 -07:00
mwells
aeae6bb1a5 qa test updates 2014-07-06 15:04:21 -07:00
mwells
81a89f5975 added support for &geth1tag=1 for xml feeds. 2014-07-05 16:08:48 -07:00
mwells
10bf6c3d35 fix bug in summary display 2014-07-04 15:42:02 -07:00
mwells
94d1b4e90c support og:image images. allow user to
enter thumbnail max width/height.
fix summary printing. was off a little.
2014-07-04 15:33:27 -07:00
Matt Wells
dff47bf5cd remove spam checker to make debugging easier 2014-07-03 13:17:28 -07:00
Matt Wells
d2996bad3a ease up on max redirects limit. was too low. 2014-07-03 13:09:09 -07:00