Matt
80991c943f
complete merge of ia code into testing.
make indexing warcs/arcs a switch in spider controls.
2015-11-09 12:46:06 -07:00
Matt
0f453d5cdf
Merge branch 'ia-zak' into testing
2015-10-07 10:02:38 -06:00
Matt
16db36252c
Merge branch 'diffbot-testing' into testing
2015-10-07 10:02:06 -06:00
Matt
df1c7f6e0f
update qa.cpp syntax test to do &n=100
for gbssStatusCode:0 query
2015-10-05 17:31:35 -06:00
Zak Betz
c947252fee
Add gbcapturedate to individual doc's metadata when injecting warcs.
2015-10-04 01:53:54 -06:00
Matt
cb4bbe8892
Merge branch 'ia' into ia-zak
2015-09-30 07:58:31 -06:00
Matt
d4c677170f
index metadata on EDOCUNCHANGED errors, and append new meta data
to XmlDoc::ptr_metadata.
2015-09-30 07:57:40 -06:00
Matt
100888d691
fix file/dir creation permissions bugs
2015-09-21 12:44:41 -06:00
Matt
74cde33a3a
just use the user's umask val for all file/dir creation
2015-09-21 11:33:38 -06:00
Matt
ce7b06fc4d
all files made are now group writable.
if you don't like that then you can make
a special group and set the directory just
group writable for that group using chmod g+s <dir>.
2015-09-21 11:19:34 -06:00
Matt
c803e0906e
fix </script> tag detection stuff again.
2015-08-31 14:06:44 -06:00
Zak Betz
e252dfb088
Add docs per second stat.
Fix auto update on statsdb graph.
Add Stat toggles for statsdb graph.
Add a unit test for indexing an array in metadata.
2015-08-22 12:05:20 -06:00
Zak Betz
a7ae510e31
Fix string faceting display for json metadata.
Add unit test for faceted metadata.
2015-07-06 23:05:18 -06:00
Zak Betz
87fcda0f93
Fix atotime5 to parse ISO8601.
Fix qa test for warcs and arcs.
Fix inject script.
2015-07-06 00:51:18 -06:00
Zak Betz
7b507a70ef
Set value length to 0 for something that does not return a string value
in Json.cpp.
Fix the '-' -> '_' when indexing generic fields.
Add a StackBuf macro which is a Safebuf initialized with a small
stack buffer for use in a local scope.
2015-06-30 14:09:57 -06:00
Zak Betz
9ca0223cf1
Translate metadata field names with dashes to _.
Add unit tests for searching for certain types of metadata.
2015-06-17 23:36:31 -06:00
Zak Betz
32987e76ee
Add json metadata field to page inject.
Fix memory leak when spidering warc files.
Add script to inject warcs from internet archives search results.
2015-06-14 20:58:41 -06:00
Zak Betz
e399a8b0aa
Add qa test for arc and warc files. Change XmlDoc to use timeaxis url
when creating the titlerec key instead of the firsturl.
2015-05-21 15:19:33 -06:00
Zak Betz
36037c23a1
Add a test for useTimeAxis.
2015-05-12 15:18:38 -06:00
Matt
697b8307b2
fix qa test to make it easier to see the real diffs
2015-04-30 19:38:27 -07:00
Matt
e2eba10068
qa test fix
2015-04-28 13:48:29 -07:00
Matt
f26c9d609b
one more qa test fix for spider status docs
2015-04-01 12:47:32 -06:00
Matt
5e46262cb2
more fixes for qa'ing of new spider status docs
2015-04-01 12:03:17 -06:00
Matt
10a31783bb
fixes to pass internal qa tests in light
of gbss (spider status doc) changes and other things.
had to make xmldoc.o -O2 instead of -O3 to fix strange bug.
2015-04-01 11:20:36 -06:00
Matt
6b293f17e6
now show "totalDocsWithField" for each facet, so we know
how many docs had that field, with any particular value,
so we can do tf/idf type things.
2015-04-01 09:16:42 -06:00
Matt
8e72d6e4cc
fix a couple critical xml parsing bugs. fixes
parsing of rss feeds better and xml in general.
fixed qa tests to ignore collection list when doing diff.
2015-03-10 19:13:21 -07:00
Matt
e8e5f9e005
qa test fixes
2015-03-05 07:45:28 -08:00
Matt
856823e862
fix qa test some.
2015-02-19 20:18:30 -07:00
Matt
ef99aabf4d
try to fix qainject1 core in qa.cpp
2015-02-17 20:17:59 -07:00
Matt
dce8d9f930
fix qa bug of not resetting s_i.
fix tcpserver.cpp bug of destroying a streaming
socket after what is really not the final write.
2015-02-17 20:10:13 -07:00
Matt
c0332d4381
fix qa
2015-01-31 18:42:31 -07:00
Matt
72b6546ed9
fix some smoke tests
2015-01-22 15:53:04 -07:00
Matt
faaaf3cb89
smoke test for query fix
2015-01-22 14:56:51 -07:00
Matt
e178c67f4b
do not core on qa test fail
2014-12-17 16:31:37 -08:00
Matt
27db9d57a1
added undeletable posdb key test to qainject1().
caught an undeletable rec and fixed that in xmldoc.cpp.
2014-12-16 13:29:04 -08:00
Matt
578cde9d9d
fix sections.cpp to not set root title section
to tagid TAG_TITLE.
2014-12-11 19:54:33 -08:00
Matt
b89f071f7c
quite a few bug fixes from adding the new query
syntax qa test.
2014-12-11 18:24:28 -08:00
Matt
0460335861
more permission system updates
2014-12-08 09:49:17 -08:00
Matt
41c8817bdb
fixed summary initialization error
of the flags buffer.
fixed term freq algo. use exact term freq
for qatest123. made Summary.o -O3 again.
fix gbsystem() to disable both timers.
2014-12-06 10:14:48 -07:00
Matt
5b92b5f6d5
now term freqs are almost exact for qatest123.
sometimes an off by 1 bug. we should really call
msg5 to get the list w/o thread and get a truly
exact term freq for qatest123 for consistency.
that would be in Posdb.cpp::getTermFreq()
2014-11-25 15:54:15 -07:00
Matt
266d97608a
fix a few more 64-bit conversion cores
2014-11-20 16:12:18 -08:00
Matt
4a0554c76f
more 64bit fixes
2014-11-14 17:30:32 -08:00
Matt
4c19453ea9
working with -m32 for basic testing.
compiles for 64-bit.
2014-11-12 11:38:37 -08:00
Matt
96b8197ad3
now it compiles with -m32
2014-11-10 14:45:11 -08:00
Matt Wells
b13f3d24d7
replaced unsigned long long with uint64_t
2014-10-30 13:30:39 -06:00
mwells
ce56fb93ab
fix qa test so we can roll out proxy code.
2014-09-30 15:40:02 -07:00
mwells
a8c5d6a46e
fix gbfacetstr: operator for xml docs
2014-09-28 12:09:04 -07:00
mwells
7d3bcd7672
1 spider out at a time for qa test consistency
2014-09-28 11:00:31 -07:00
mwells
7a0f9fe370
fix support for indexing xml docs.
no longer use hacks gbxmltitle and gbxmllinks.
no longer convert html entities for xml docs using hacks
since we have XmlDoc::hashXmlFields() function.
added qaxml() qa test to test xml doc indexing and searching.
ignore <?xml> tag when generating xml tag compound name.
2014-09-28 10:43:41 -07:00
mwells
0267e865b8
minor fixes
2014-09-27 17:01:16 -07:00