mirror of https://github.com/harelba/q.git synced 2024-10-03 22:39:52 +03:00

q benchmark (#241)

This commit is contained in:
Harel Ben-Attia 2020-09-19 12:56:06 +03:00 committed by GitHub
parent 865f591a10
commit 9b492b829a
GPG Key ID: 4AEE18F83AFDEB23
18 changed files with 835 additions and 4 deletions

3
.gitignore vendored

@ -12,3 +12,6 @@ packages
.idea/
dist/windows/
generated-site/
benchmark_data.tar.gz
_benchmark_data/
q.egg-info/

18
VERSION_BUMP.md Normal file

@ -0,0 +1,18 @@
# Version bump
Currently, there are some manual steps needed in order to release a new version:
* Make sure that you're in a branch
* Change the version in the following three files: `bin/q.py`, `setup.py` and `do-manual-release.sh` and commit them to the branch
* Merge that branch into master
* Add a tag of the release version
* `git push --tags origin master`
* Create a release in GitHub with the tag you've just created
Pushing to master will trigger a build/release, and will push the artifacts to the new release as assets.
The reason for this is related to limitations in the way that pyci uploads the binaries to github.
#
TBD - Continue with the flow of wrapping the artifacts with rpm/deb, copying the files to packages-for-q, and updating the web site.

View File

@ -33,7 +33,7 @@ from __future__ import print_function
from collections import OrderedDict
q_version = '2.0.17'
q_version = '2.0.18'
__all__ = [ 'QTextAsData' ]

View File

@ -2,7 +2,7 @@
set -e
VERSION=2.0.17
VERSION=2.0.18
if [[ "$TRAVIS_BRANCH" != "master" ]]
then

View File

@ -1,2 +1,3 @@
six==1.11.0
flake8==3.6.0
setuptools<45.0.0

View File

@ -2,7 +2,7 @@
from setuptools import setup
q_version = '2.0.17'
q_version = '2.0.18'
setup(
name='q',

159
test/BENCHMARK.md Normal file

@ -0,0 +1,159 @@
NOTE: *Please don't use or publish this benchmark data yet. See below for details*
# Overview
This is just a preliminary benchmark, originally created for validating performance optimizations and suggestions from users, and for analyzing q's move to python3. After writing it, I thought it might be interesting to test its speed against textql and octosql as well.
The results I'm getting are somewhat surprising, to the point of me questioning them a bit, so it would be great to validate them further before finalizing the benchmark results.
The most surprising results are as follows:
* python3 vs python2 - A huge improvement (for large files, execution times with python 3 are around 40% of the times for python 2)
* python3 vs textql (written in golang) - It seems that textql becomes slower than the python3 q version as the data size grows (both rows and columns)
I would love to validate these results by having other people run the benchmark as well and send me their results.
If you're interested, follow the instructions and run the benchmark on your machine. After the benchmark is finished, send me the final results file, along with some details about your hardware, and I'll add it to the spreadsheet. <harelba@gmail.com>
I've tried to make running the benchmark as seamless as possible, but there obviously might be errors/issues. Please contact me if you encounter any issue, or just open a ticket.
# Benchmark
This is an initial version of the benchmark, along with some results. The following is compared:
* q running on multiple python versions
* textql 2.0.3
* octosql v0.3.0
The specific python versions which are being tested are specified in `benchmark-config.sh`.
This is by no means a scientific benchmark, and it only focuses on the data loading time which is the only significant factor for comparison (e.g. the query itself is a very simple count query). Also, it does not try to provide any usability comparison between q and textql/octosql, an interesting topic on its own.
## Methodology
The idea was to measure how sensitive the run time is to the row count and to the column count.
* Row counts: 1,10,100,1000,10000,100000,1000000
* Column counts: 1,5,10,20,50,100
* Iterations for each combination: 10
File sizes:
* 1M rows by 100 columns - 976MB (~1GB) - Largest file
* 1M rows by 50 columns - 477MB
The benchmark executes simple `select count(*) from <file>` queries for each combination, calculating the mean and stddev of each set of iterations. The stddev is used in order to measure the validity of the results.
The graphs below only compare the means of the results. The standard deviations are written into the google sheet itself, and can be viewed there if needed.
Instructions on how to run the benchmark are at the bottom section of this document, after the results section.
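The methodology above can be sketched roughly as follows. This is only an illustration and not the actual `run-benchmark` script: the helper names (`time_command`, `combinations`) are hypothetical, the use of the sample standard deviation is an assumption, and the command being timed is a harmless stand-in so that the sketch runs on any machine.

```python
import statistics
import subprocess
import sys
import time

# Parameters taken from the methodology section.
ROW_COUNTS = [1, 10, 100, 1000, 10000, 100000, 1000000]
COLUMN_COUNTS = [1, 5, 10, 20, 50, 100]
ITERATIONS = 10


def time_command(cmd, iterations=ITERATIONS):
    """Run cmd several times and return (mean, stddev) of the wall-clock seconds."""
    durations = []
    for _ in range(iterations):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    # Assumption: sample standard deviation; the real script may use another estimator.
    return statistics.mean(durations), statistics.stdev(durations)


def combinations():
    """All (rows, columns) pairs, grouped by column count like the result files."""
    return [(rows, cols) for cols in COLUMN_COUNTS for rows in ROW_COUNTS]


if __name__ == "__main__":
    # Stand-in for the real query command (hypothetical placeholder).
    cmd = [sys.executable, "-c", "pass"]
    mean, stddev = time_command(cmd, iterations=3)
    print(f"{len(combinations())} combinations per tool")
    print(f"mean={mean:.4f}s stddev={stddev:.4f}s")
```

In a real run, the stand-in command would be replaced by the tool invocation that executes the count query against the generated file for each combination.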
## Hardware
OSX Catalina on a 15" Macbook Pro from Mid 2015, with 16GB of RAM, and an internal Flash Drive of 256GB.
## Results
(Results are automatically updated from the baseline tab in the google spreadsheet).
Detailed results below.
Summary:
* All python 3 versions (3.6/3.7/3.8) provide similar results across all scales.
* python 3.x provides significantly better results than python2. Improvement grows as the file size grows (20% improvement for small files, up to ~70% improvement for the largest file)
* textql seems to provide faster results than q (py3) for smaller files, up to around 30MB of data. As the size grows further, it becomes slower than q, taking up to 80% more time (74 seconds vs 41 seconds) for the largest file
* octosql is significantly slower than both q and textql, even for small files with a low number of rows and columns
### Data for 1M rows
#### Run time durations for 1M rows and different column counts:
| rows | columns | File Size | python 2.7 | python 3.6 | python 3.7 | python 3.8 | textql | octosql |
|:-------: |:-------: |:---------: |:----------: |:----------: |:----------: |:----------: |:------: |:-------: |
| 1000000 | 1 | 17M | 5.15 | 4.24 | 4.08 | 3.98 | 2.90 | 49.95 |
| 1000000 | 5 | 37M | 10.68 | 5.37 | 5.26 | 5.14 | 5.88 | 54.69 |
| 1000000 | 10 | 89M | 17.56 | 7.25 | 7.15 | 7.01 | 9.69 | 65.32 |
| 1000000 | 20 | 192M | 30.28 | 10.96 | 10.78 | 10.64 | 17.34 | 83.94 |
| 1000000 | 50 | 477M | 71.56 | 21.98 | 21.59 | 21.70 | 38.57 | 158.26 |
| 1000000 | 100 | 986M | 131.86 | 41.71 | 40.82 | 41.02 | 74.62 | 289.58 |
#### Comparison between python 3.x and python 2 run times (1M rows):
(>100% is slower than q-py2, <100% is faster than q-py2)
| rows | columns | file size | q-py2 runtime | q-py3.6 vs q-py2 runtime | q-py3.7 vs q-py2 runtime | q-py3.8 vs q-py2 runtime |
|:-------: |:-------: |:---------: |:-------------: |:------------------------: |:------------------------: |:------------------------: |
| 1000000 | 1 | 17M | 100.00% | 82.34% | 79.34% | 77.36% |
| 1000000 | 5 | 37M | 100.00% | 50.25% | 49.22% | 48.08% |
| 1000000 | 10 | 89M | 100.00% | 41.30% | 40.69% | 39.93% |
| 1000000 | 20 | 192M | 100.00% | 36.18% | 35.59% | 35.14% |
| 1000000 | 50 | 477M | 100.00% | 30.71% | 30.17% | 30.32% |
| 1000000 | 100 | 986M | 100.00% | 31.63% | 30.96% | 31.11% |
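As a worked example of how the percentages in this table relate to the run times, here is a small sketch. The numbers are the rounded means from the 1M rows table, so the output can differ from the table in the last digit (the spreadsheet presumably works on the unrounded means). The function name `relative_runtime` is just for illustration.

```python
# Mean run times in seconds for 1M rows, copied from the run time table.
# columns: (python 2.7, python 3.6, python 3.7, python 3.8)
RUNTIMES = {
    1: (5.15, 4.24, 4.08, 3.98),
    5: (10.68, 5.37, 5.26, 5.14),
    10: (17.56, 7.25, 7.15, 7.01),
    20: (30.28, 10.96, 10.78, 10.64),
    50: (71.56, 21.98, 21.59, 21.70),
    100: (131.86, 41.71, 40.82, 41.02),
}


def relative_runtime(candidate, baseline):
    """Candidate run time as a percentage of the baseline run time."""
    return 100.0 * candidate / baseline


for columns, (py27, *py3_times) in RUNTIMES.items():
    ratios = ", ".join(f"{relative_runtime(t, py27):.2f}%" for t in py3_times)
    print(f"{columns:>3} columns: {ratios}")
```

A value of around 31% for 100 columns means that python 3 needs roughly a third of the python 2 run time, i.e. an improvement of close to 70%.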
#### textql and octosql comparison against q-py3 run time (1M rows):
(>100% is slower than q-py3, <100% is faster than q-py3)
| rows | columns | file size | avg q-py3 runtime | textql vs q-py3 runtime | octosql vs q-py3 runtime |
|:-------: |:-------: |:---------: |:-----------------: |:-----------------------: |:------------------------: |
| 1000000 | 1 | 17M | 100.00% | 70.67% | 1217.76% |
| 1000000 | 5 | 37M | 100.00% | 111.86% | 1040.70% |
| 1000000 | 10 | 89M | 100.00% | 135.80% | 915.28% |
| 1000000 | 20 | 192M | 100.00% | 160.67% | 777.92% |
| 1000000 | 50 | 477M | 100.00% | 177.26% | 727.40% |
| 1000000 | 100 | 986M | 100.00% | 181.19% | 703.15% |
### Sensitivity to column count
Based on the largest file size of 1,000,000 rows.
![Sensitivity to column count](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=1585602598&format=image)
### Sensitivity to line count (per column count)
#### 1 Column Table
![1 column table](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=1119350798&format=image)
#### 5 Column Table
![5 column table](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=599223098&format=image)
#### 10 Column Table
![10 column table](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=82695414&format=image)
#### 20 Column Table
![20 column table](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=1573199483&format=image)
#### 50 Column Table
![50 column table](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=448568670&format=image)
#### 100 Column Table
![100 column table](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=2101488258&format=image)
## Running the benchmark
Please note that the initial run generates large files, so you'd need more than 3GB of free space available. All the generated files reside in the `_benchmark_data/` folder.
Part of the preparation flow will download the benchmark data as needed.
### Preparations
* Prerequisites:
* pyenv installed
* pyenv-virtualenv installed
* [`textql`](https://github.com/dinedal/textql#install)
* [`octosql`](https://github.com/cube2222/octosql#installation)
Run `./prepare-benchmark-env`
### Execution
Run `./run-benchmark <benchmark-id>`.
Benchmark output files will be written to `./benchmark-results/<q-executable>/<benchmark-id>/`.
* `benchmark-id` is the id you want to give the benchmark.
* `q-executable` is the name of the q executable being used for the benchmark. If none has been provided through Q_EXECUTABLE, then the value will be the last commit hash. Note that there is no checking of whether the working tree is clean.
The summary of the benchmark will be written to `./benchmark-results/<benchmark-id>/summary.benchmark-results`
By default, the benchmark will use the source python files inside the project. If you want to run it on one of the standalone binary executables, then set Q_EXECUTABLE to the full path of the q binary.
For anyone helping with running the benchmark, don't use this parameter for now, just test against a clean checkout of the code using `./run-benchmark <benchmark-id>`.
## Benchmark Development info
### Running against the standalone binary
* `./run-benchmark` can accept a second parameter with the q executable. If it gets this parameter, it will use this path for running q. This provides a way to test the standalone q binaries in the new packaging format. When this parameter does not exist, the benchmark is executed directly from the source code.
### Updating the benchmark markdown document file
The results should reside in the following [google sheet](https://docs.google.com/spreadsheets/d/1Ljr8YIJwUQ5F4wr6ATga5Aajpu1CvQp1pe52KGrLkbY/edit?usp=sharing).
Add a new tab to the google sheet, and paste the content of `summary.benchmark-results` into the new sheet.
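If you'd like to sanity check a summary file locally before pasting it, something along the following lines could work. It is only a sketch: `parse_summary` is a hypothetical helper, and it assumes the layout of the committed results files, i.e. whitespace-separated values with repeated header rows made of `lines columns <tool>_mean <tool>_stddev` groups. The sample is trimmed to two tools, with values copied from the committed results.

```python
def parse_summary(text):
    """Return {(tool, lines, columns): (mean, stddev)} from summary text."""
    results = {}
    tools = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "lines":
            # Header row: each group of 4 tokens is
            # lines, columns, <tool>_mean, <tool>_stddev.
            tools = [name[: -len("_mean")] for name in tokens[2::4]]
            continue
        for i, tool in enumerate(tools):
            lines, columns, mean, stddev = tokens[4 * i : 4 * i + 4]
            results[(tool, int(lines), int(columns))] = (float(mean), float(stddev))
    return results


# Trimmed sample (two tools only), values taken from the committed results.
SAMPLE = """\
lines columns q-benchmark-3.8.5_mean q-benchmark-3.8.5_stddev lines columns textql_2.0.3_mean textql_2.0.3_stddev
1000000 1 3.982502889633179 0.045292138461945054 1000000 1 2.89862303734 0.022182722976
1000000 5 5.13688440322876 0.03649575898109647 1000000 5 5.87811236382 0.0304332294491
"""

if __name__ == "__main__":
    parsed = parse_summary(SAMPLE)
    for (tool, lines, columns), (mean, stddev) in sorted(parsed.items()):
        # A high stddev relative to the mean hints at a noisy measurement.
        print(f"{tool} {lines}x{columns}: mean={mean:.2f}s stddev/mean={stddev / mean:.1%}")
```

Looking at the ratio between the standard deviation and the mean is one simple way to spot combinations that might be worth re-running.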

3
test/benchmark-config.sh Normal file

@ -0,0 +1,3 @@
#!/bin/bash
BENCHMARK_PYTHON_VERSIONS=(2.7.18 3.6.4 3.7.9 3.8.5)

View File

@ -0,0 +1,48 @@
lines columns octosql_v0.3.0_mean octosql_v0.3.0_stddev
1 1 0.582091641426 0.0235290239617
10 1 0.596219730377 0.0320124029461
100 1 0.575977492332 0.0199296245316
1000 1 0.56785056591 0.00846389017466
10000 1 1.1466334343 0.00760108698846
100000 1 5.49565172195 0.131791932977
1000000 1 49.9513648033 0.443430523063
lines columns octosql_v0.3.0_mean octosql_v0.3.0_stddev
1 5 0.582160949707 0.0274409391571
10 5 0.57046456337 0.0199413000359
100 5 0.585747480392 0.0372543971623
1000 5 0.572268772125 0.00384300349763
10000 5 1.15530762672 0.0117990775856
100000 5 6.10629923344 0.146711842919
1000000 5 54.6851765394 0.315486399525
lines columns octosql_v0.3.0_mean octosql_v0.3.0_stddev
1 10 0.586222410202 0.0232479065914
10 10 0.59000480175 0.0186508192447
100 10 0.581873703003 0.0331332482772
1000 10 0.569027900696 0.0103675493106
10000 10 1.40067322254 0.00583352224401
100000 10 7.30705575943 0.0165839217599
1000000 10 65.3242264032 0.512552576414
lines columns octosql_v0.3.0_mean octosql_v0.3.0_stddev
1 20 0.571048212051 0.0166919396871
10 20 0.594776701927 0.0368900941023
100 20 0.561370825768 0.00907051791451
1000 20 0.577527880669 0.00983965108957
10000 20 1.90710241795 0.00757011452155
100000 20 9.8267291069 0.127844155326
1000000 20 83.9448960066 0.46121344046
lines columns octosql_v0.3.0_mean octosql_v0.3.0_stddev
1 50 0.572030115128 0.0253648479103
10 50 0.56993534565 0.0230474303306
100 50 0.563336873055 0.00964411866903
1000 50 0.826378440857 0.00941629472813
10000 50 3.27872717381 0.126592845956
100000 50 17.890055728 0.116794666005
1000000 50 158.262442636 0.826290454446
lines columns octosql_v0.3.0_mean octosql_v0.3.0_stddev
1 100 0.569358110428 0.0279801762531
10 100 0.580981063843 0.0272341107532
100 100 0.559471726418 0.00668155858429
1000 100 1.08161640167 0.00698594638512
10000 100 5.67823712826 0.0123398407167
100000 100 32.2797194242 0.315508270241
1000000 100 289.582628798 0.929455236817

View File

@ -0,0 +1,48 @@
lines columns q-benchmark-2.7.18_mean q-benchmark-2.7.18_stddev
1 1 0.106449890137 0.002010027753
10 1 0.106737875938 0.00224112203891
100 1 0.107839012146 0.00102954061006
1000 1 0.113026666641 0.00147361890226
10000 1 0.160376381874 0.00569766179806
100000 1 0.608236479759 0.00604026519608
1000000 1 5.14807910919 0.0584474028762
lines columns q-benchmark-2.7.18_mean q-benchmark-2.7.18_stddev
1 5 0.106719517708 0.00236752032369
10 5 0.107823801041 0.00238873169438
100 5 0.109785079956 0.0013047675259
1000 5 0.120395207405 0.00207224422629
10000 5 0.21783041954 0.00522254475716
100000 5 1.17115747929 0.0221394865225
1000000 5 10.6830974817 0.339822977934
lines columns q-benchmark-2.7.18_mean q-benchmark-2.7.18_stddev
1 10 0.104981088638 0.00166552032929
10 10 0.108320140839 0.00204034349199
100 10 0.112528729439 0.00168376477305
1000 10 0.13019015789 0.00253773120965
10000 10 0.284891676903 0.00384009140782
100000 10 1.84725661278 0.00860738744089
1000000 10 17.5610994339 0.228322442172
lines columns q-benchmark-2.7.18_mean q-benchmark-2.7.18_stddev
1 20 0.106477689743 0.00254429925697
10 20 0.108580899239 0.00173704653824
100 20 0.118750286102 0.00247623639866
1000 20 0.146431708336 0.00249685551944
10000 20 0.419492387772 0.00248210434668
100000 20 3.15847921371 0.0550301268026
1000000 20 30.279082489 0.124978814506
lines columns q-benchmark-2.7.18_mean q-benchmark-2.7.18_stddev
1 50 0.105411934853 0.00171651054128
10 50 0.109102797508 0.00111620290512
100 50 0.135682177544 0.00196166766665
1000 50 0.198261427879 0.00396172489054
10000 50 0.821499919891 0.0111642692132
100000 50 7.05980975628 0.121182371277
1000000 50 71.5645889759 5.02009516291
lines columns q-benchmark-2.7.18_mean q-benchmark-2.7.18_stddev
1 100 0.10662381649 0.00193146624495
10 100 0.110662698746 0.00171461379583
100 100 0.163547992706 0.00166570196628
1000 100 0.280023741722 0.00337543024145
10000 100 1.46053376198 0.0221691284465
100000 100 13.2369835854 0.309375896258
1000000 100 131.864977288 1.22415449691

View File

@ -0,0 +1,48 @@
lines columns q-benchmark-3.6.4_mean q-benchmark-3.6.4_stddev
1 1 0.10342762470245362 0.0017673875851759295
10 1 0.10239293575286865 0.0012505611685910795
100 1 0.10317318439483643 0.0010581783881541751
1000 1 0.10687050819396973 0.0014050135772919004
10000 1 0.1447664737701416 0.001841256227287192
100000 1 0.5162809371948243 0.006962985088492867
1000000 1 4.238853335380554 0.04834401143632507
lines columns q-benchmark-3.6.4_mean q-benchmark-3.6.4_stddev
1 5 0.10211825370788574 0.0022568191323651568
10 5 0.1025341272354126 0.0016446470901070106
100 5 0.1053577184677124 0.0015298114223855884
1000 5 0.10980842113494874 0.002536098780902228
10000 5 0.1590113162994385 0.003123074098301634
100000 5 0.6348223447799682 0.0082691507829872
1000000 5 5.368562030792236 0.11628913334105236
lines columns q-benchmark-3.6.4_mean q-benchmark-3.6.4_stddev
1 10 0.10251858234405517 0.0015963869535345293
10 10 0.10278875827789306 0.0009920577082124496
100 10 0.10715732574462891 0.002033320000941064
1000 10 0.11389360427856446 0.0023603847702423973
10000 10 0.17806434631347656 0.001114054252191835
100000 10 0.8252989768981933 0.0037080843359275904
1000000 10 7.252838873863221 0.029052130546213153
lines columns q-benchmark-3.6.4_mean q-benchmark-3.6.4_stddev
1 20 0.10367965698242188 0.003661761341842434
10 20 0.10489590167999267 0.001977141196109372
100 20 0.11108210086822509 0.0014801173497056886
1000 20 0.12110791206359864 0.001648524669420912
10000 20 0.2178968906402588 0.0019298316207276716
100000 20 1.1962245225906372 0.010541407803235559
1000000 20 10.956057572364807 0.12677108174061705
lines columns q-benchmark-3.6.4_mean q-benchmark-3.6.4_stddev
1 50 0.10458300113677979 0.0016367630302744722
10 50 0.10616152286529541 0.002345135740908088
100 50 0.12375867366790771 0.00238414904864133
1000 50 0.14462883472442628 0.0022428030896492978
10000 50 0.34488487243652344 0.004867441221052092
100000 50 2.3394312858581543 0.02263239858944125
1000000 50 21.979821610450745 0.09080404939303836
lines columns q-benchmark-3.6.4_mean q-benchmark-3.6.4_stddev
1 100 0.10372309684753418 0.0010299126833031144
10 100 0.10784556865692138 0.0016557634029464607
100 100 0.14526791572570802 0.0028194506905186724
1000 100 0.18315494060516357 0.0023585311962114673
10000 100 0.5586131334304809 0.004808492789681402
100000 100 4.287398314476013 0.00957500108409644
1000000 100 41.706851434707644 0.4161526076289425

View File

@ -0,0 +1,48 @@
lines columns q-benchmark-3.7.9_mean q-benchmark-3.7.9_stddev
1 1 0.08099310398101807 0.001417385651688644
10 1 0.0822291374206543 0.0014809900020001858
100 1 0.08169686794281006 0.002108157069167563
1000 1 0.08690853118896484 0.0012595326919263487
10000 1 0.12215542793273926 0.0020152625320395434
100000 1 0.4825761795043945 0.0050418000028856335
1000000 1 4.084399747848511 0.027731958079814215
lines columns q-benchmark-3.7.9_mean q-benchmark-3.7.9_stddev
1 5 0.0817826271057129 0.002665533758836163
10 5 0.08261749744415284 0.0019205430658525572
100 5 0.08472237586975098 0.002571239449841039
1000 5 0.08973510265350342 0.002323797583077552
10000 5 0.13746986389160157 0.001964971666036654
100000 5 0.60649254322052 0.007131635266871318
1000000 5 5.2585612535476685 0.05661789407928516
lines columns q-benchmark-3.7.9_mean q-benchmark-3.7.9_stddev
1 10 0.08112843036651611 0.002251300165899426
10 10 0.08175232410430908 0.0014557171018568637
100 10 0.08572309017181397 0.0019643550214810675
1000 10 0.09268453121185302 0.001816414236580489
10000 10 0.15538835525512695 0.0024978076091814994
100000 10 0.7879442930221557 0.009412516078916211
1000000 10 7.146207928657532 0.06659760176757985
lines columns q-benchmark-3.7.9_mean q-benchmark-3.7.9_stddev
1 20 0.08142082691192627 0.001304584466639188
10 20 0.08197519779205323 0.0014842098503865223
100 20 0.08949971199035645 0.0009937446141285785
1000 20 0.09955930709838867 0.0013978961740806384
10000 20 0.1966566801071167 0.0028489273218240147
100000 20 1.1518636226654053 0.006410720031542237
1000000 20 10.776052689552307 0.04739925571001746
lines columns q-benchmark-3.7.9_mean q-benchmark-3.7.9_stddev
1 50 0.08237688541412354 0.0016494314799953837
10 50 0.08519520759582519 0.002610550182895596
100 50 0.10423583984375 0.0018808335751867933
1000 50 0.12195603847503662 0.0023611894043373983
10000 50 0.3163540124893188 0.002761333651520998
100000 50 2.237372374534607 0.009955353920396077
1000000 50 21.59097549915314 0.081188190530421
lines columns q-benchmark-3.7.9_mean q-benchmark-3.7.9_stddev
1 100 0.08336784839630126 0.0013840724401561887
10 100 0.0864112138748169 0.0017946939354350697
100 100 0.12199611663818359 0.0013003743156634682
1000 100 0.15871686935424806 0.0035993681064501234
10000 100 0.5243751525878906 0.004370273273595629
100000 100 4.175828623771667 0.016127303710583043
1000000 100 40.82292411327362 0.12328165162380703

View File

@ -0,0 +1,48 @@
lines columns q-benchmark-3.8.5_mean q-benchmark-3.8.5_stddev
1 1 0.10138180255889892 0.0017947074090971444
10 1 0.10056869983673096 0.003442371291904885
100 1 0.10126984119415283 0.0016392348107127808
1000 1 0.10484635829925537 0.0019743937339163262
10000 1 0.1400548219680786 0.0024523366133394117
100000 1 0.4901275157928467 0.003970374711691596
1000000 1 3.982502889633179 0.045292138461945054
lines columns q-benchmark-3.8.5_mean q-benchmark-3.8.5_stddev
1 5 0.09946837425231933 0.0018876161478998787
10 5 0.099178147315979 0.0014194733014858227
100 5 0.10171806812286377 0.0017580984705406846
1000 5 0.10602672100067138 0.002000261880840017
10000 5 0.15207929611206056 0.0015802680033212048
100000 5 0.609218978881836 0.006150144273259608
1000000 5 5.13688440322876 0.03649575898109647
lines columns q-benchmark-3.8.5_mean q-benchmark-3.8.5_stddev
1 10 0.09925477504730225 0.002168389758635997
10 10 0.09943633079528809 0.0016154501074880502
100 10 0.10376312732696533 0.0017275485891005433
1000 10 0.11087138652801513 0.0016934328033239559
10000 10 0.17246220111846924 0.0023824485659318527
100000 10 0.7999232530593872 0.003442975393506892
1000000 10 7.012071299552917 0.059217904448851263
lines columns q-benchmark-3.8.5_mean q-benchmark-3.8.5_stddev
1 20 0.10027089118957519 0.0020291529595204906
10 20 0.10038816928863525 0.001957086760826999
100 20 0.10723590850830078 0.0013833918448622436
1000 20 0.11735000610351562 0.0020318895390750882
10000 20 0.21264209747314453 0.00482341642419078
100000 20 1.1567201137542724 0.002987096441878969
1000000 20 10.640758633613586 0.06116581724028616
lines columns q-benchmark-3.8.5_mean q-benchmark-3.8.5_stddev
1 50 0.10066506862640381 0.002051307639276982
10 50 0.10588631629943848 0.0035835389655972105
100 50 0.11841504573822022 0.001608174845404568
1000 50 0.14032282829284667 0.002640027148889162
10000 50 0.33160474300384524 0.0027796660009712947
100000 50 2.258401036262512 0.011041280982383895
1000000 50 21.70080256462097 0.15897944629180621
lines columns q-benchmark-3.8.5_mean q-benchmark-3.8.5_stddev
1 100 0.10147004127502442 0.0021285682695135768
10 100 0.10471885204315186 0.001248479289219899
100 100 0.13894760608673096 0.002307980025026551
1000 100 0.17586205005645753 0.0023822296091426
10000 100 0.5414002418518067 0.0036291866664635458
100000 100 4.222555088996887 0.08562968951916528
1000000 100 41.021552324295044 0.16033566363076862

View File

@ -0,0 +1,48 @@
lines columns q-benchmark-2.7.18_mean q-benchmark-2.7.18_stddev lines columns q-benchmark-3.6.4_mean q-benchmark-3.6.4_stddev lines columns q-benchmark-3.7.9_mean q-benchmark-3.7.9_stddev lines columns q-benchmark-3.8.5_mean q-benchmark-3.8.5_stddev lines columns textql_2.0.3_mean textql_2.0.3_stddev lines columns octosql_v0.3.0_mean octosql_v0.3.0_stddev
1 1 0.106449890137 0.002010027753 1 1 0.10342762470245362 0.0017673875851759295 1 1 0.08099310398101807 0.001417385651688644 1 1 0.10138180255889892 0.0017947074090971444 1 1 0.0196103572845 0.00207355214257 1 1 0.582091641426 0.0235290239617
10 1 0.106737875938 0.00224112203891 10 1 0.10239293575286865 0.0012505611685910795 10 1 0.0822291374206543 0.0014809900020001858 10 1 0.10056869983673096 0.003442371291904885 10 1 0.0186784029007 0.000970810220668 10 1 0.596219730377 0.0320124029461
100 1 0.107839012146 0.00102954061006 100 1 0.10317318439483643 0.0010581783881541751 100 1 0.08169686794281006 0.002108157069167563 100 1 0.10126984119415283 0.0016392348107127808 100 1 0.019472026825 0.00181951524514 100 1 0.575977492332 0.0199296245316
1000 1 0.113026666641 0.00147361890226 1000 1 0.10687050819396973 0.0014050135772919004 1000 1 0.08690853118896484 0.0012595326919263487 1000 1 0.10484635829925537 0.0019743937339163262 1000 1 0.022180891037 0.00116649968967 1000 1 0.56785056591 0.00846389017466
10000 1 0.160376381874 0.00569766179806 10000 1 0.1447664737701416 0.001841256227287192 10000 1 0.12215542793273926 0.0020152625320395434 10000 1 0.1400548219680786 0.0024523366133394117 10000 1 0.051066827774 0.0018168767618 10000 1 1.1466334343 0.00760108698846
100000 1 0.608236479759 0.00604026519608 100000 1 0.5162809371948243 0.006962985088492867 100000 1 0.4825761795043945 0.0050418000028856335 100000 1 0.4901275157928467 0.003970374711691596 100000 1 0.307463979721 0.00246268029188 100000 1 5.49565172195 0.131791932977
1000000 1 5.14807910919 0.0584474028762 1000000 1 4.238853335380554 0.04834401143632507 1000000 1 4.084399747848511 0.027731958079814215 1000000 1 3.982502889633179 0.045292138461945054 1000000 1 2.89862303734 0.022182722976 1000000 1 49.9513648033 0.443430523063
lines columns q-benchmark-2.7.18_mean q-benchmark-2.7.18_stddev lines columns q-benchmark-3.6.4_mean q-benchmark-3.6.4_stddev lines columns q-benchmark-3.7.9_mean q-benchmark-3.7.9_stddev lines columns q-benchmark-3.8.5_mean q-benchmark-3.8.5_stddev lines columns textql_2.0.3_mean textql_2.0.3_stddev lines columns octosql_v0.3.0_mean octosql_v0.3.0_stddev
1 5 0.106719517708 0.00236752032369 1 5 0.10211825370788574 0.0022568191323651568 1 5 0.0817826271057129 0.002665533758836163 1 5 0.09946837425231933 0.0018876161478998787 1 5 0.0195286750793 0.0017840569109 1 5 0.582160949707 0.0274409391571
10 5 0.107823801041 0.00238873169438 10 5 0.1025341272354126 0.0016446470901070106 10 5 0.08261749744415284 0.0019205430658525572 10 5 0.099178147315979 0.0014194733014858227 10 5 0.0183676958084 0.000925251595491 10 5 0.57046456337 0.0199413000359
100 5 0.109785079956 0.0013047675259 100 5 0.1053577184677124 0.0015298114223855884 100 5 0.08472237586975098 0.002571239449841039 100 5 0.10171806812286377 0.0017580984705406846 100 5 0.0199447393417 0.000907007099218 100 5 0.585747480392 0.0372543971623
1000 5 0.120395207405 0.00207224422629 1000 5 0.10980842113494874 0.002536098780902228 1000 5 0.08973510265350342 0.002323797583077552 1000 5 0.10602672100067138 0.002000261880840017 1000 5 0.0263328790665 0.00165486505938 1000 5 0.572268772125 0.00384300349763
10000 5 0.21783041954 0.00522254475716 10000 5 0.1590113162994385 0.003123074098301634 10000 5 0.13746986389160157 0.001964971666036654 10000 5 0.15207929611206056 0.0015802680033212048 10000 5 0.0826982736588 0.00152451583229 10000 5 1.15530762672 0.0117990775856
100000 5 1.17115747929 0.0221394865225 100000 5 0.6348223447799682 0.0082691507829872 100000 5 0.60649254322052 0.007131635266871318 100000 5 0.609218978881836 0.006150144273259608 100000 5 0.60660867691 0.00395761320274 100000 5 6.10629923344 0.146711842919
1000000 5 10.6830974817 0.339822977934 1000000 5 5.368562030792236 0.11628913334105236 1000000 5 5.2585612535476685 0.05661789407928516 1000000 5 5.13688440322876 0.03649575898109647 1000000 5 5.87811236382 0.0304332294491 1000000 5 54.6851765394 0.315486399525
lines columns q-benchmark-2.7.18_mean q-benchmark-2.7.18_stddev lines columns q-benchmark-3.6.4_mean q-benchmark-3.6.4_stddev lines columns q-benchmark-3.7.9_mean q-benchmark-3.7.9_stddev lines columns q-benchmark-3.8.5_mean q-benchmark-3.8.5_stddev lines columns textql_2.0.3_mean textql_2.0.3_stddev lines columns octosql_v0.3.0_mean octosql_v0.3.0_stddev
1 10 0.104981088638 0.00166552032929 1 10 0.10251858234405517 0.0015963869535345293 1 10 0.08112843036651611 0.002251300165899426 1 10 0.09925477504730225 0.002168389758635997 1 10 0.0191783189774 0.00107718516178 1 10 0.586222410202 0.0232479065914
10 10 0.108320140839 0.00204034349199 10 10 0.10278875827789306 0.0009920577082124496 10 10 0.08175232410430908 0.0014557171018568637 10 10 0.09943633079528809 0.0016154501074880502 10 10 0.0185215950012 0.000840353961363 10 10 0.59000480175 0.0186508192447
100 10 0.112528729439 0.00168376477305 100 10 0.10715732574462891 0.002033320000941064 100 10 0.08572309017181397 0.0019643550214810675 100 10 0.10376312732696533 0.0017275485891005433 100 10 0.0209223031998 0.00164494657684 100 10 0.581873703003 0.0331332482772
1000 10 0.13019015789 0.00253773120965 1000 10 0.11389360427856446 0.0023603847702423973 1000 10 0.09268453121185302 0.001816414236580489 1000 10 0.11087138652801513 0.0016934328033239559 1000 10 0.0309282779694 0.00110848590345 1000 10 0.569027900696 0.0103675493106
10000 10 0.284891676903 0.00384009140782 10000 10 0.17806434631347656 0.001114054252191835 10000 10 0.15538835525512695 0.0024978076091814994 10000 10 0.17246220111846924 0.0023824485659318527 10000 10 0.121016025543 0.00105071105139 10000 10 1.40067322254 0.00583352224401
100000 10 1.84725661278 0.00860738744089 100000 10 0.8252989768981933 0.0037080843359275904 100000 10 0.7879442930221557 0.009412516078916211 100000 10 0.7999232530593872 0.003442975393506892 100000 10 0.987622976303 0.00699348302979 100000 10 7.30705575943 0.0165839217599
1000000 10 17.5610994339 0.228322442172 1000000 10 7.252838873863221 0.029052130546213153 1000000 10 7.146207928657532 0.06659760176757985 1000000 10 7.012071299552917 0.059217904448851263 1000000 10 9.69240145683 0.0354453778052 1000000 10 65.3242264032 0.512552576414
lines columns q-benchmark-2.7.18_mean q-benchmark-2.7.18_stddev lines columns q-benchmark-3.6.4_mean q-benchmark-3.6.4_stddev lines columns q-benchmark-3.7.9_mean q-benchmark-3.7.9_stddev lines columns q-benchmark-3.8.5_mean q-benchmark-3.8.5_stddev lines columns textql_2.0.3_mean textql_2.0.3_stddev lines columns octosql_v0.3.0_mean octosql_v0.3.0_stddev
1 20 0.106477689743 0.00254429925697 1 20 0.10367965698242188 0.003661761341842434 1 20 0.08142082691192627 0.001304584466639188 1 20 0.10027089118957519 0.0020291529595204906 1 20 0.0202306985855 0.00159619251952 1 20 0.571048212051 0.0166919396871
10 20 0.108580899239 0.00173704653824 10 20 0.10489590167999267 0.001977141196109372 10 20 0.08197519779205323 0.0014842098503865223 10 20 0.10038816928863525 0.001957086760826999 10 20 0.0187650680542 0.000845692486156 10 20 0.594776701927 0.0368900941023
100 20 0.118750286102 0.00247623639866 100 20 0.11108210086822509 0.0014801173497056886 100 20 0.08949971199035645 0.0009937446141285785 100 20 0.10723590850830078 0.0013833918448622436 100 20 0.0211876153946 0.000993808448942 100 20 0.561370825768 0.00907051791451
1000 20 0.146431708336 0.00249685551944 1000 20 0.12110791206359864 0.001648524669420912 1000 20 0.09955930709838867 0.0013978961740806384 1000 20 0.11735000610351562 0.0020318895390750882 1000 20 0.0404737234116 0.00122415059261 1000 20 0.577527880669 0.00983965108957
10000 20 0.419492387772 0.00248210434668 10000 20 0.2178968906402588 0.0019298316207276716 10000 20 0.1966566801071167 0.0028489273218240147 10000 20 0.21264209747314453 0.00482341642419078 10000 20 0.197762489319 0.00198188642677 10000 20 1.90710241795 0.00757011452155
100000 20 3.15847921371 0.0550301268026 100000 20 1.1962245225906372 0.010541407803235559 100000 20 1.1518636226654053 0.006410720031542237 100000 20 1.1567201137542724 0.002987096441878969 100000 20 1.75432097912 0.00692372147543 100000 20 9.8267291069 0.127844155326
1000000 20 30.279082489 0.124978814506 1000000 20 10.956057572364807 0.12677108174061705 1000000 20 10.776052689552307 0.04739925571001746 1000000 20 10.640758633613586 0.06116581724028616 1000000 20 17.3383012295 0.0410164637448 1000000 20 83.9448960066 0.46121344046
lines columns q-benchmark-2.7.18_mean q-benchmark-2.7.18_stddev lines columns q-benchmark-3.6.4_mean q-benchmark-3.6.4_stddev lines columns q-benchmark-3.7.9_mean q-benchmark-3.7.9_stddev lines columns q-benchmark-3.8.5_mean q-benchmark-3.8.5_stddev lines columns textql_2.0.3_mean textql_2.0.3_stddev lines columns octosql_v0.3.0_mean octosql_v0.3.0_stddev
1 50 0.105411934853 0.00171651054128 1 50 0.10458300113677979 0.0016367630302744722 1 50 0.08237688541412354 0.0016494314799953837 1 50 0.10066506862640381 0.002051307639276982 1 50 0.0205577373505 0.00133922342068 1 50 0.572030115128 0.0253648479103
10 50 0.109102797508 0.00111620290512 10 50 0.10616152286529541 0.002345135740908088 10 50 0.08519520759582519 0.002610550182895596 10 50 0.10588631629943848 0.0035835389655972105 10 50 0.0195438146591 0.000791630611893 10 50 0.56993534565 0.0230474303306
100 50 0.135682177544 0.00196166766665 100 50 0.12375867366790771 0.00238414904864133 100 50 0.10423583984375 0.0018808335751867933 100 50 0.11841504573822022 0.001608174845404568 100 50 0.0246078014374 0.00108949795701 100 50 0.563336873055 0.00964411866903
1000 50 0.198261427879 0.00396172489054 1000 50 0.14462883472442628 0.0022428030896492978 1000 50 0.12195603847503662 0.0023611894043373983 1000 50 0.14032282829284667 0.002640027148889162 1000 50 0.063302564621 0.00058195987294 1000 50 0.826378440857 0.00941629472813
10000 50 0.821499919891 0.0111642692132 10000 50 0.34488487243652344 0.004867441221052092 10000 50 0.3163540124893188 0.002761333651520998 10000 50 0.33160474300384524 0.0027796660009712947 10000 50 0.410061001778 0.00294901155085 10000 50 3.27872717381 0.126592845956
100000 50 7.05980975628 0.121182371277 100000 50 2.3394312858581543 0.02263239858944125 100000 50 2.237372374534607 0.009955353920396077 100000 50 2.258401036262512 0.011041280982383895 100000 50 3.87797718048 0.0123467913678 100000 50 17.890055728 0.116794666005
1000000 50 71.5645889759 5.02009516291 1000000 50 21.979821610450745 0.09080404939303836 1000000 50 21.59097549915314 0.081188190530421 1000000 50 21.70080256462097 0.15897944629180621 1000000 50 38.5674883366 0.0602820291386 1000000 50 158.262442636 0.826290454446
lines columns q-benchmark-2.7.18_mean q-benchmark-2.7.18_stddev lines columns q-benchmark-3.6.4_mean q-benchmark-3.6.4_stddev lines columns q-benchmark-3.7.9_mean q-benchmark-3.7.9_stddev lines columns q-benchmark-3.8.5_mean q-benchmark-3.8.5_stddev lines columns textql_2.0.3_mean textql_2.0.3_stddev lines columns octosql_v0.3.0_mean octosql_v0.3.0_stddev
1 100 0.10662381649 0.00193146624495 1 100 0.10372309684753418 0.0010299126833031144 1 100 0.08336784839630126 0.0013840724401561887 1 100 0.10147004127502442 0.0021285682695135768 1 100 0.0216581106186 0.00103280947157 1 100 0.569358110428 0.0279801762531
10 100 0.110662698746 0.00171461379583 10 100 0.10784556865692138 0.0016557634029464607 10 100 0.0864112138748169 0.0017946939354350697 10 100 0.10471885204315186 0.001248479289219899 10 100 0.021723818779 0.000920429257416 10 100 0.580981063843 0.0272341107532
100 100 0.163547992706 0.00166570196628 100 100 0.14526791572570802 0.0028194506905186724 100 100 0.12199611663818359 0.0013003743156634682 100 100 0.13894760608673096 0.002307980025026551 100 100 0.0299471855164 0.00130217326679 100 100 0.559471726418 0.00668155858429
1000 100 0.280023741722 0.00337543024145 1000 100 0.18315494060516357 0.0023585311962114673 1000 100 0.15871686935424806 0.0035993681064501234 1000 100 0.17586205005645753 0.0023822296091426 1000 100 0.0996923923492 0.00155352212734 1000 100 1.08161640167 0.00698594638512
10000 100 1.46053376198 0.0221691284465 10000 100 0.5586131334304809 0.004808492789681402 10000 100 0.5243751525878906 0.004370273273595629 10000 100 0.5414002418518067 0.0036291866664635458 10000 100 0.767001605034 0.00328944029633 10000 100 5.67823712826 0.0123398407167
100000 100 13.2369835854 0.309375896258 100000 100 4.287398314476013 0.00957500108409644 100000 100 4.175828623771667 0.016127303710583043 100000 100 4.222555088996887 0.08562968951916528 100000 100 7.46734063625 0.0262039846119 100000 100 32.2797194242 0.315508270241
1000000 100 131.864977288 1.22415449691 1000000 100 41.706851434707644 0.4161526076289425 1000000 100 40.82292411327362 0.12328165162380703 1000000 100 41.021552324295044 0.16033566363076862 1000000 100 74.6216712952 0.0994037504394 1000000 100 289.582628798 0.929455236817


@ -0,0 +1,48 @@
lines columns textql_2.0.3_mean textql_2.0.3_stddev
1 1 0.0196103572845 0.00207355214257
10 1 0.0186784029007 0.000970810220668
100 1 0.019472026825 0.00181951524514
1000 1 0.022180891037 0.00116649968967
10000 1 0.051066827774 0.0018168767618
100000 1 0.307463979721 0.00246268029188
1000000 1 2.89862303734 0.022182722976
lines columns textql_2.0.3_mean textql_2.0.3_stddev
1 5 0.0195286750793 0.0017840569109
10 5 0.0183676958084 0.000925251595491
100 5 0.0199447393417 0.000907007099218
1000 5 0.0263328790665 0.00165486505938
10000 5 0.0826982736588 0.00152451583229
100000 5 0.60660867691 0.00395761320274
1000000 5 5.87811236382 0.0304332294491
lines columns textql_2.0.3_mean textql_2.0.3_stddev
1 10 0.0191783189774 0.00107718516178
10 10 0.0185215950012 0.000840353961363
100 10 0.0209223031998 0.00164494657684
1000 10 0.0309282779694 0.00110848590345
10000 10 0.121016025543 0.00105071105139
100000 10 0.987622976303 0.00699348302979
1000000 10 9.69240145683 0.0354453778052
lines columns textql_2.0.3_mean textql_2.0.3_stddev
1 20 0.0202306985855 0.00159619251952
10 20 0.0187650680542 0.000845692486156
100 20 0.0211876153946 0.000993808448942
1000 20 0.0404737234116 0.00122415059261
10000 20 0.197762489319 0.00198188642677
100000 20 1.75432097912 0.00692372147543
1000000 20 17.3383012295 0.0410164637448
lines columns textql_2.0.3_mean textql_2.0.3_stddev
1 50 0.0205577373505 0.00133922342068
10 50 0.0195438146591 0.000791630611893
100 50 0.0246078014374 0.00108949795701
1000 50 0.063302564621 0.00058195987294
10000 50 0.410061001778 0.00294901155085
100000 50 3.87797718048 0.0123467913678
1000000 50 38.5674883366 0.0602820291386
lines columns textql_2.0.3_mean textql_2.0.3_stddev
1 100 0.0216581106186 0.00103280947157
10 100 0.021723818779 0.000920429257416
100 100 0.0299471855164 0.00130217326679
1000 100 0.0996923923492 0.00155352212734
10000 100 0.767001605034 0.00328944029633
100000 100 7.46734063625 0.0262039846119
1000000 100 74.6216712952 0.0994037504394

44
test/prepare-benchmark-env Executable file

@ -0,0 +1,44 @@
#!/bin/bash
set -e
eval "$(pyenv init -)"
eval "$(pyenv virtualenv-init -)"
source benchmark-config.sh
if [ ! -f ./benchmark_data.tar.gz ];
then
echo benchmark data not found. downloading it
curl "https://s3.amazonaws.com/harelba-q-public/benchmark_data.tar.gz" -o ./benchmark_data.tar.gz
else
echo no need to download benchmark data
fi
if [ ! -d ./_benchmark_data ];
then
echo extracting benchmark data
tar xvfz benchmark_data.tar.gz
echo benchmark data is ready
else
echo no need to extract benchmark data
fi
for ver in "${BENCHMARK_PYTHON_VERSIONS[@]}"
do
echo installing $ver
pyenv install -s $ver
venv_name=q-benchmark-$ver
echo create venv $venv_name
pyenv virtualenv -f $ver $venv_name
echo activate venv $venv_name
pyenv activate $venv_name
pyenv version
echo installing requirements $venv_name
pip install -r ../requirements.txt
echo deactivating $venv_name
pyenv deactivate
done

77
test/run-benchmark Executable file

@ -0,0 +1,77 @@
#!/bin/bash
# Usage: ./run-benchmark <benchmark-id> [<q-executable>]
set -e
get_abs_filename() {
# $1 : relative filename
echo "$(cd "$(dirname "$1")" && pwd)/$(basename "$1")"
}
eval "$(pyenv init -)"
eval "$(pyenv virtualenv-init -)"
if [ "x$1" == "x" ];
then
echo Benchmark id must be provided as a parameter
exit 1
fi
Q_BENCHMARK_ID=$1
if [ "x$2" == "x" ];
then
EFFECTIVE_Q_EXECUTABLE="source-files-$(git rev-parse HEAD)"
else
ABS_Q_EXECUTABLE="$(get_abs_filename "$2")"
export Q_EXECUTABLE=$ABS_Q_EXECUTABLE
if [ ! -f "$ABS_Q_EXECUTABLE" ]
then
echo "q executable must exist ($ABS_Q_EXECUTABLE)"
exit 1
fi
EFFECTIVE_Q_EXECUTABLE="${ABS_Q_EXECUTABLE//\//__}"
fi
echo "Q executable to use is $EFFECTIVE_Q_EXECUTABLE"
# Must be provided to the benchmark code so it knows where to write the results to
export Q_BENCHMARK_RESULTS_FOLDER="./benchmark-results/${EFFECTIVE_Q_EXECUTABLE}/${Q_BENCHMARK_ID}/"
echo Benchmark results folder is $Q_BENCHMARK_RESULTS_FOLDER
mkdir -p $Q_BENCHMARK_RESULTS_FOLDER
source benchmark-config.sh
ALL_FILES=()
for ver in "${BENCHMARK_PYTHON_VERSIONS[@]}"
do
venv_name=q-benchmark-$ver
echo activating $venv_name
pyenv activate $venv_name
echo "==== testing inside $venv_name ==="
./test-all BenchmarkTests.test_q_matrix -v
RESULT_FILE="${Q_BENCHMARK_RESULTS_FOLDER}/$venv_name.benchmark-results"
echo "==== Done. Results are in $RESULT_FILE"
ALL_FILES[${#ALL_FILES[@]}]="$RESULT_FILE"
echo "Deactivating"
pyenv deactivate
done
echo "==== testing textql ==="
./test-all BenchmarkTests.test_textql_matrix -v
RESULT_FILE="textql*.benchmark-results"
ALL_FILES[${#ALL_FILES[@]}]="${Q_BENCHMARK_RESULTS_FOLDER}/$RESULT_FILE"
echo "Done. Results are in textql.benchmark-results"
echo "==== testing octosql ==="
./test-all BenchmarkTests.test_octosql_matrix -v
RESULT_FILE="octosql*.benchmark-results"
ALL_FILES[${#ALL_FILES[@]}]="${Q_BENCHMARK_RESULTS_FOLDER}/$RESULT_FILE"
echo "Done. Results are in octosql.benchmark-results"
summary_file="$Q_BENCHMARK_RESULTS_FOLDER/summary.benchmark-results"
rm -vf "$summary_file"
# ALL_FILES is deliberately left unquoted so that the textql*/octosql* globs expand
paste ${ALL_FILES[*]} > "$summary_file"
echo "Done. final results file is $summary_file"


@ -10,6 +10,7 @@
# in order to test the resulting binary executables as well, instead of just executing the q python source code.
#
from __future__ import print_function
import unittest
import random
import json
@ -24,7 +25,7 @@ import pprint
import six
from six.moves import range
import codecs
import itertools
sys.path.append(os.path.join(os.path.abspath(os.path.dirname(sys.argv[0])),'..','bin'))
from q import QTextAsData,QOutput,QOutputPrinter,QInputParams
@ -2599,6 +2600,195 @@ class BasicModuleTests(AbstractQTestCase):
self.assertTrue(table_structure.materialized_files['my_data'].filename,'my_data')
self.assertTrue(table_structure.materialized_files['my_data'].is_stdin)

class BenchmarkAttemptResults(object):
    def __init__(self, attempt, lines, columns, duration,return_code):
        self.attempt = attempt
        self.lines = lines
        self.columns = columns
        self.duration = duration
        self.return_code = return_code

    def __str__(self):
        return "{}".format(self.__dict__)
    __repr__ = __str__

class BenchmarkResults(object):
    def __init__(self, lines, columns, attempt_results, mean, stddev):
        self.lines = lines
        self.columns = columns
        self.attempt_results = attempt_results
        self.mean = mean
        self.stddev = stddev

    def __str__(self):
        return "{}".format(self.__dict__)
    __repr__ = __str__

class BenchmarkTests(AbstractQTestCase):

    BENCHMARK_DIR = './_benchmark_data'

    def _ensure_benchmark_data_dir_exists(self):
        try:
            os.mkdir(BenchmarkTests.BENCHMARK_DIR)
        except Exception as e:
            pass

    def _create_benchmark_file_if_needed(self):
        self._ensure_benchmark_data_dir_exists()

        if os.path.exists('{}/benchmark-file.csv'.format(BenchmarkTests.BENCHMARK_DIR)):
            return

        g = GzipFile('unit-file.csv.gz')
        d = g.read().decode('utf-8')
        f = open('{}/benchmark-file.csv'.format(BenchmarkTests.BENCHMARK_DIR), 'w')
        for i in range(100):
            f.write(d)
        f.close()

    def _prepare_test_file(self, lines, columns):
        filename = '{}/_benchmark_data__lines_{}_columns_{}.csv'.format(BenchmarkTests.BENCHMARK_DIR,lines, columns)

        if os.path.exists(filename):
            return filename

        c = ['c{}'.format(x + 1) for x in range(columns)]

        # write a header line
        ff = open(filename,'w')
        ff.write(",".join(c))
        ff.write('\n')
        ff.close()

        # Use a single format string: previously format() bound only to the last
        # literal of the concatenation, leaving the 'head -{} {}' placeholders unfilled
        r, o, e = run_command('head -{} {}/benchmark-file.csv | {} -d , "select {} from -" >> {}'.format(lines, BenchmarkTests.BENCHMARK_DIR, Q_EXECUTABLE, ','.join(c), filename))
        self.assertEqual(r, 0)
        return filename

    def _decide_result(self,attempt_results):

        failed = list(filter(lambda a: a.return_code != 0,attempt_results))

        if len(failed) == 0:
            mean = sum([x.duration for x in attempt_results]) / len(attempt_results)
            sum_squared = sum([(x.duration - mean)**2 for x in attempt_results])
            ddof = 0
            pvar = sum_squared / (len(attempt_results) - ddof)
            stddev = pvar ** 0.5
        else:
            mean = None
            stddev = None

        return BenchmarkResults(
            attempt_results[0].lines,
            attempt_results[0].columns,
            attempt_results,
            mean,
            stddev
        )

    def _perform_test_performance_matrix(self,name,generate_cmd_function):
        results = []

        benchmark_results_folder = os.environ.get("Q_BENCHMARK_RESULTS_FOLDER",'')
        if benchmark_results_folder == "":
            raise Exception("Q_BENCHMARK_RESULTS_FOLDER must be provided as an environment variable")

        self._create_benchmark_file_if_needed()
        for columns in [1, 5, 10, 20, 50, 100]:
            for lines in [1, 10, 100, 1000, 10000, 100000, 1000000]:
                attempt_results = []
                for attempt in range(10):
                    filename = self._prepare_test_file(lines, columns)
                    if DEBUG:
                        print("Testing {}".format(filename))
                    t0 = time.time()
                    r, o, e = run_command(generate_cmd_function(filename,lines,columns))
                    duration = time.time() - t0
                    attempt_result = BenchmarkAttemptResults(attempt, lines, columns, duration, r)
                    attempt_results += [attempt_result]
                    if DEBUG:
                        print("Results: {}".format(attempt_result.__dict__))
                final_result = self._decide_result(attempt_results)
                results += [final_result]

        series_fields = [six.u('lines'),six.u('columns')]
        value_fields = [six.u('mean'),six.u('stddev')]
        all_fields = series_fields + value_fields

        output_filename = '{}/{}.benchmark-results'.format(benchmark_results_folder,name)
        output_file = open(output_filename,'w')
        for columns,g in itertools.groupby(sorted(results,key=lambda x:x.columns),key=lambda x:x.columns):
            x = six.u("\t").join(series_fields + [six.u('{}_{}').format(name, f) for f in value_fields])
            print(x,file = output_file)
            for result in g:
                print(six.u("\t").join(map(str,[getattr(result,f) for f in all_fields])),file=output_file)
        output_file.close()

        print("results have been written to : {}".format(output_filename))
        if DEBUG:
            print("RESULTS FOR {}".format(name))
            print(open(output_filename,'r').read())

    def test_q_matrix(self):
        venv = os.path.basename(os.environ.get('VIRTUAL_ENV') or 'unknown-virtual-env')

        def generate_q_cmd(data_filename,line_count,column_count):
            if column_count == 1:
                additional_params = '-c 1'
            else:
                additional_params = ''
            return '{} -d , {} "select count(*) from {}"'.format(Q_EXECUTABLE,additional_params, data_filename)
        self._perform_test_performance_matrix(venv,generate_q_cmd)

    def _get_textql_version(self):
        r,o,e = run_command("textql --version")
        if r != 0:
            raise Exception("Could not find textql")
        if len(e) != 0:
            raise Exception("Errors while getting textql version")
        return o[0]

    def _get_octosql_version(self):
        r,o,e = run_command("octosql --version")
        if r != 0:
            raise Exception("Could not find octosql")
        if len(e) != 0:
            raise Exception("Errors while getting octosql version")
        import re
        # raw string, so that \. is not treated as an invalid string escape
        version = re.findall(r'v[0-9]+\.[0-9]+\.[0-9]+',o[0])[0]
        return version

    def test_textql_matrix(self):
        def generate_textql_cmd(data_filename,line_count,column_count):
            return 'textql -dlm , -sql "select count(*)" {}'.format(data_filename)

        name = 'textql_%s' % self._get_textql_version()
        self._perform_test_performance_matrix(name,generate_textql_cmd)

    def test_octosql_matrix(self):
        config_fn = self.random_tmp_filename('octosql', 'config')
        def generate_octosql_cmd(data_filename,line_count,column_count):
            j = """
dataSources:
  - name: bmdata
    type: csv
    config:
      path: "{}"
      headerRow: false
      batchSize: 10000
""".format(data_filename)[1:]
            f = open(config_fn,'w')
            f.write(j)
            f.close()
            return 'octosql -c {} -o batch-csv "select count(*) from bmdata a"'.format(config_fn)

        name = 'octosql_%s' % self._get_octosql_version()
        self._perform_test_performance_matrix(name,generate_octosql_cmd)

def suite():
tl = unittest.TestLoader()
basic_stuff = tl.loadTestsFromTestCase(BasicTests)