Wren is a "small, fast, class-based, concurrent scripting language", originally designed by Bob Nystrom (who you might recognise as the author of Game Programming Patterns and Crafting Interpreters). It's a really fun language to study: the implementation is compact and easily readable, and although class-based languages aren't considered very hip these days, there's a real elegance to its design. I saw that Wren's performance page hadn't been updated for a very long time and, especially given the recent upstream interpreter performance work on Python, was interested in seeing how performance on these microbenchmarks has changed. Hence this quick post to share some new numbers.
To cut to the chase, here are the results I get running the same set of benchmarks across a collection of Python, Ruby, and Lua versions (those available in current Arch Linux).
Method Call:
Runner | Time |
---|---|
wren0.4 | |
luajit2.1 -joff | |
ruby2.7 | |
ruby3.0 | |
lua5.4 | |
lua5.3 | |
python3.11 | |
lua5.2 | |
mruby | |
python3.10 | |
Delta Blue:
Runner | Time |
---|---|
wren0.4 | |
python3.11 | |
python3.10 | |
Binary Trees:
Runner | Time |
---|---|
luajit2.1 -joff | |
ruby2.7 | |
ruby3.0 | |
python3.11 | |
lua5.4 | |
wren0.4 | |
mruby | |
python3.10 | |
lua5.3 | |
lua5.2 | |
Recursive Fibonacci:
Runner | Time |
---|---|
luajit2.1 -joff | |
lua5.4 | |
ruby2.7 | |
ruby3.0 | |
lua5.3 | |
lua5.2 | |
wren0.4 | |
python3.11 | |
mruby | |
python3.10 | |
I've used essentially the same presentation and methodology as in the original benchmark, partly to save time pondering the optimal approach, partly so I can redirect any critiques to the original author (sorry Bob!). Benchmarks do not measure interpreter startup time, and each benchmark is run ten times with the median used (thermal throttling could potentially mean this isn't the best methodology, but changing the number of test repetitions to e.g. 1000 seems to have little effect).
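As a small illustration of why taking the median of repeated runs is a reasonable choice, here is a minimal sketch using made-up timings (the numbers are hypothetical, not measurements from this post). A single slow outlier, as you might get from thermal throttling, drags the mean up noticeably but leaves the median essentially unchanged.

```python
import statistics

# Hypothetical timings in seconds for ten runs; the last one is a slow outlier.
times = [0.101, 0.099, 0.100, 0.102, 0.098, 0.100, 0.101, 0.099, 0.100, 0.250]

median_time = statistics.median(times)
mean_time = statistics.mean(times)

print(f"median: {median_time:.3f}s")
print(f"mean:   {mean_time:.3f}s")
```

With an even number of samples, `statistics.median` returns the average of the two middle values.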
The tests were run on a machine with an AMD Ryzen 9 5950X processor. wren 0.4 as of commit c2a75f1 was used as well as the following Arch Linux packages:
The Python 3.10 and 3.11 packages were compiled with the same GCC version (12.2.1 according to python -VV), though this won't necessarily be true for all other packages (e.g. the lua52 and lua53 packages are several years old so will have been built with an older GCC).
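If you'd rather check the compiler from within Python than read the python -VV output, the standard library's `platform` module exposes the same information. This is just a quick sketch; the exact format of the compiler string varies between builds.

```python
import platform

# platform.python_compiler() returns a string identifying the compiler
# that was used to build the running interpreter.
compiler = platform.python_compiler()
version = platform.python_version()

print(f"Python {version} built with {compiler}")
```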
I've submitted a pull request to update the Wren performance page.
The following results are copied from the Wren performance page (archive.org link) for ease of comparison. They were run on a MacBook Pro 2.3GHz Intel Core i7 with Lua 5.2.3, LuaJIT 2.0.2, Python 2.7.5, Python 3.3.4, ruby 2.0.0p247.
Method Call:
Runner | Time |
---|---|
wren2015 | |
luajit2.0 -joff | |
ruby2.0 | |
lua5.2 | |
python3.3 | |
python2.7 | |
DeltaBlue:
Runner | Time |
---|---|
wren2015 | |
python3.3 | |
python2.7 | |
Binary Trees:
Runner | Time |
---|---|
luajit2.0 -joff | |
wren2015 | |
ruby2.0 | |
python2.7 | |
python3.3 | |
lua5.2 | |
Recursive Fibonacci:
Runner | Time |
---|---|
luajit2.0 -joff | |
wren2015 | |
ruby2.0 | |
lua5.2 | |
python2.7 | |
python3.3 | |
A few takeaways:
Health warning: this is incredibly quick and dirty (especially the repeated switching between the python packages to allow testing both 3.10 and 3.11):
#!/usr/bin/env python3
# Copyright Muxup contributors.
# Distributed under the terms of the MIT license, see LICENSE for details.
# SPDX-License-Identifier: MIT
import statistics
import subprocess
out = open("out.md", "w", encoding="utf-8")


def run_single_bench(bench_name, bench_file, runner_name):
    bench_file = "./test/benchmark/" + bench_file
    if runner_name == "lua5.2":
        bench_file += ".lua"
        cmdline = ["lua5.2", bench_file]
    elif runner_name == "lua5.3":
        bench_file += ".lua"
        cmdline = ["lua5.3", bench_file]
    elif runner_name == "lua5.4":
        bench_file += ".lua"
        cmdline = ["lua5.4", bench_file]
    elif runner_name == "luajit2.1 -joff":
        bench_file += ".lua"
        cmdline = ["luajit", "-joff", bench_file]
    elif runner_name == "mruby":
        bench_file += ".rb"
        cmdline = ["mruby", bench_file]
    elif runner_name == "python3.10":
        bench_file += ".py"
        subprocess.run(
            [
                "sudo",
                "pacman",
                "-U",
                "--noconfirm",
                "/var/cache/pacman/pkg/python-3.10.10-1-x86_64.pkg.tar.zst",
            ],
            check=True,
        )
        cmdline = ["python", bench_file]
    elif runner_name == "python3.11":
        bench_file += ".py"
        subprocess.run(
            [
                "sudo",
                "pacman",
                "-U",
                "--noconfirm",
                "/var/cache/pacman/pkg/python-3.11.3-1-x86_64.pkg.tar.zst",
            ],
            check=True,
        )
        cmdline = ["python", bench_file]
    elif runner_name == "ruby2.7":
        bench_file += ".rb"
        cmdline = ["ruby-2.7", bench_file]
    elif runner_name == "ruby3.0":
        bench_file += ".rb"
        cmdline = ["ruby", bench_file]
    elif runner_name == "wren0.4":
        bench_file += ".wren"
        cmdline = ["./bin/wren_test", bench_file]
    else:
        raise SystemExit("Unrecognised runner")

    times = []
    for _ in range(10):
        bench_out = subprocess.run(
            cmdline, capture_output=True, check=True, encoding="utf-8"
        ).stdout
        times.append(float(bench_out.split(": ")[-1].strip()))
    return statistics.median(times)


def do_bench(name, file_base, runners):
    results = {}
    for runner in runners:
        results[runner] = run_single_bench(name, file_base, runner)
    results = dict(sorted(results.items(), key=lambda kv: kv[1]))
    longest_result = max(results.values())
    out.write(f"**{name}**:\n")
    out.write('<table class="chart">\n')
    for runner, result in results.items():
        percent = round((result / longest_result) * 100)
        out.write(
            f"""\
<tr>
<th>{runner}</th><td><div class="chart-bar" style="width: {percent}%;">{result:.3f}s </div></td>
</tr>\n"""
        )
    out.write("</table>\n\n")


all_runners = [
    "lua5.2",
    "lua5.3",
    "lua5.4",
    "luajit2.1 -joff",
    "mruby",
    "python3.10",
    "python3.11",
    "ruby2.7",
    "ruby3.0",
    "wren0.4",
]
do_bench("Method Call", "method_call", all_runners)
do_bench("Delta Blue", "delta_blue", ["python3.10", "python3.11", "wren0.4"])
do_bench("Binary Trees", "binary_trees", all_runners)
do_bench("Recursive Fibonacci", "fib", all_runners)
print("Output written to out.md")
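The two least obvious steps in the script are pulling the timing out of the benchmark output and scaling the chart bars, so here is a standalone sketch of just those parts. The helper names `parse_elapsed` and `bar_widths` are hypothetical (they don't appear in the script), and the sample output assumes the benchmark's last line has the form "elapsed: <seconds>", which is what taking the text after the final ": " relies on.

```python
def parse_elapsed(bench_output: str) -> float:
    """Return the number that follows the last ': ' in the benchmark output."""
    return float(bench_output.split(": ")[-1].strip())


def bar_widths(results: dict[str, float]) -> dict[str, int]:
    """Scale each time to a percentage of the slowest one, fastest first."""
    slowest = max(results.values())
    ordered = sorted(results.items(), key=lambda kv: kv[1])
    return {runner: round(result / slowest * 100) for runner, result in ordered}


# Hypothetical benchmark output and timings, for illustration only.
sample_output = "5000050000\nelapsed: 0.25\n"
elapsed = parse_elapsed(sample_output)
print(elapsed)

widths = bar_widths({"a": 0.5, "b": 2.0, "c": 1.0})
print(widths)
```

The slowest runner always gets a full-width bar, so the bars are only comparable within a single benchmark, not across benchmarks.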