mwells
2b37f56e4c
Merge branch 'diffbot-matt' into testing
2014-05-10 07:56:45 -07:00
mwells
38a79888b6
Merge branch 'diffbot-testing' into testing
2014-05-10 07:49:29 -07:00
mwells
ed816b2c11
a few bug fixes
2014-05-10 07:48:23 -07:00
mwells
6b92a1f3d4
Merge branch 'master' into testing
2014-05-10 06:43:27 -07:00
mwells
f19014cc6c
fixed missing /
2014-05-10 06:39:36 -07:00
Matt Wells
e70f760d87
use gbstatus: and gbstatusmsg: field operators
2014-05-09 18:10:38 -07:00
Matt Wells
b1cd0cac86
indexing spider replies now working.
use type:status to see them or
gbstatus:success or gbstatus:tcp or gbstatus:0.
2014-05-09 18:07:38 -07:00
Matt Wells
941c8f1892
now added CT_STATUS type results into serps.
one for each spider reply we add so we can query
spider replies. using url: or type:status etc.
2014-05-09 13:52:12 -07:00
Matt Wells
eb49094343
try to start indexing spider replies
as regular search results in the index so
you can query on those. get histograms of
spider status msgs, etc. ability to turn
that and images on/off.
2014-05-09 11:18:24 -07:00
mwells
6048ae849b
added support for spidering a particular language
with higher priority.
2014-05-09 10:03:24 -06:00
Matt Wells
305340e4ff
minor update
2014-05-07 16:33:16 -07:00
Matt Wells
a2c47750fa
temp disable wpid logic. do indexing of err pages first.
2014-05-07 16:26:57 -07:00
Matt Wells
3bf52f0f2d
if "wpid" is supplied try to update sitelist
for that wpid. hopefully we can get the wp admin
tools to send a /search?wpid=xxxx&sites=xyz.com request so
we can start spidering those sites before they even see the
widget. also it is simpler than trying to update m_siteListBuf
each time someone does a query since those can be hundreds
a second.
2014-05-07 16:10:26 -07:00
Matt Wells
01a6ae1166
take html column out of csv
2014-05-07 13:28:20 -07:00
Matt Wells
dd5f35b06d
added icon
2014-05-07 13:06:56 -07:00
Matt Wells
dd3ab38e55
formatting updates for widget
2014-05-07 13:06:42 -07:00
Matt Wells
cebaffe76b
show embed code for html or php below widget
2014-05-07 12:56:06 -07:00
Matt Wells
09ab2a8d15
fix jerky scrollbar thing
2014-05-06 15:11:57 -07:00
Matt Wells
54e50f5c72
minor updates
2014-05-06 14:53:01 -07:00
Matt Wells
0bface8c9c
fix focus issues with widget qbox
2014-05-06 14:38:51 -07:00
Matt Wells
4135a8c0a3
fix thumbnail g_errno bug. make thumbnails
bigger. 250x250.
fix query logic for widget.
2014-05-06 14:05:33 -07:00
Matt Wells
ea0c3abcc6
fix widget query box to do ajax
2014-05-06 13:45:53 -07:00
Matt Wells
285c7e298e
fixes for widget
2014-05-06 11:33:00 -07:00
Matt Wells
2f331d55e5
widget updates
2014-05-06 10:47:57 -07:00
Matt Wells
0daced51df
Merge branch 'diffbot-testing' into diffbot-matt
2014-05-02 14:34:04 -07:00
Matt Wells
4d059109f1
work on infinite scrolling better updating
so user doesn't lose scrollbar position
2014-05-02 14:33:05 -07:00
Matt Wells
9f67eb7699
Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing
2014-05-02 12:20:57 -07:00
Matt Wells
5ff56e3863
fix problem when too many docids are deduped
and we don't have enough to show. re-merge the shard
termlists to try to get more...
2014-05-02 12:24:38 -07:00
Matt Wells
980b67ce7c
remove temp hack in there
2014-05-02 09:58:15 -07:00
Matt Wells
494db4ede8
Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing
2014-05-01 17:07:51 -07:00
Matt Wells
a503cb35a2
fix gb installconf
2014-05-01 17:09:15 -07:00
Matt Wells
1d766826ae
retry if too many docids deduped when &stream=1
2014-05-01 17:07:31 -07:00
Matt Wells
060f7da967
fix data corruption detection and repair bug.
do not core on corrupt http reply missing \0.
just set the g_errno to ECORRUPTDATA.
give more informative corruption log msgs.
2014-05-01 10:38:00 -07:00
Matt Wells
3fcb146462
minor updates
2014-04-30 13:44:19 -07:00
mwells
ff121a76d9
fix formatting bugs
2014-04-30 14:17:39 -06:00
mwells
81369b786c
make trash dir for image thumbs automatically
2014-04-29 17:01:48 -06:00
Matt Wells
f05ab3698b
fix thumbnail scaling.
2014-04-28 14:30:29 -07:00
Matt Wells
066a01cba6
Merge branch 'diffbot-testing' into diffbot-matt
2014-04-28 14:15:02 -07:00
Matt Wells
a2c6527ada
put thumbnail in proper proportion.
other formatting fixes.
2014-04-28 14:14:18 -07:00
Matt Wells
e21e0a404c
fixed bug for product title extraction.
titledb-saved.dat tree loop corruption bug.
no main coll bug.
put the ajax widget on spider status page so you can
see spider going in realtime. will give customers
a good idea of the spider moving along.
more widget fixes, to use new base64 thumbs, etc.
2014-04-28 13:30:24 -07:00
Matt Wells
de4a0a13a8
more thumbnail generation updates
2014-04-27 11:05:30 -07:00
Matt Wells
65493fcdec
minor print fix
2014-04-26 13:41:08 -07:00
Matt Wells
20a2729827
added jobCreationTimeUTC and jobCompletionTimeUTC
to json api
2014-04-25 14:12:18 -07:00
Matt Wells
5c0d646133
fix invalid json when doing &s=1
2014-04-25 13:46:20 -07:00
Matt Wells
f3c06ced57
try to fix core from deleting coll
2014-04-25 11:52:17 -07:00
Matt Wells
82726879a2
support base64 generated thumbnails in serps.
2014-04-24 14:04:57 -07:00
Matt Wells
08058d4f69
Merge branch 'master' into diffbot-matt
2014-04-24 10:14:53 -07:00
Matt Wells
efc16d2b21
Merge branch 'diffbot-testing' into diffbot-matt
2014-04-24 10:14:49 -07:00
Matt Wells
9edd5c8264
thumbnail generation support back in.
2014-04-24 10:13:45 -07:00
Matt Wells
45e2506598
use statically compiled pdftohtml
2014-04-24 08:31:52 -07:00