hyperfine/scripts/welch_ttest.py

#!/usr/bin/env python

"""This script performs Welch's t-test on a JSON export file with two
benchmark results to test whether or not the mean run times of the two
benchmarks differ."""

import argparse
import json
import sys
from scipy import stats

parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument("file", help="JSON file with two benchmark results")
args = parser.parse_args()

with open(args.file) as f:
    results = json.load(f)["results"]

if len(results) != 2:
    print("The input file has to contain exactly two benchmarks")
    sys.exit(1)

a, b = [x["command"] for x in results[:2]]
X, Y = [x["times"] for x in results[:2]]

print("Command 1: {}".format(a))
print("Command 2: {}\n".format(b))

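# Welch's t-test: compares the two means without assuming equal variances
t, p = stats.ttest_ind(X, Y, equal_var=False)
th = 0.05  # significance threshold
dispose = p < th  # True if the null hypothesis of equal means is rejected
print("t = {:.3}, p = {:.3}".format(t, p))
print()

if dispose:
    print("There is a difference between the two benchmarks (p < {}).".format(th))
else:
    # A large p-value does not show that the benchmarks are the same,
    # only that no significant difference was detected.
    print(
        "No significant difference between the two benchmarks (p >= {}).".format(th)
    )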