Benchmarking in Ruby
Tests are necessary for successful refactorings. After all, the system has to behave as it did before.
Refactorings can also influence performance in certain circumstances. After a successful architecture or code improvement, performance surprises are undesirable.
Benchmarks supply the necessary data. The measured numbers provide a safety net from the performance perspective. Benchmarks also help to locate performance issues. Whether a presumed performance issue has to be fixed should likewise depend on benchmark results.
Data helps to make the right decisions.
For instance, a common argument against extracting logic into a separate method is that it is suspected to cost performance.
The cost of moving a temporary variable (years_since_birthday) into a Query Method is measured using the following module AgeCalculator:
# age_calculator.rb
require 'date'

module AgeCalculator
  ADULT_AGE = 18

  def self.adult?(birthday)
    years_since_birthday = Date.today.year - birthday.year
    years_since_birthday >= ADULT_AGE
  end
end
The current state is measured in a benchmark script:
# benchmarks.rb
require 'benchmark'
require 'date'
require_relative 'age_calculator'

Benchmark.bm do |bm|
  bm.report('adult?') do
    AgeCalculator.adult? Date.new(2000)
  end
end
Everything inside a report block is measured. Benchmark.bm allows several benchmarks to be compared with each other.
After running the script:
$ ruby benchmarks.rb
numbers like the following are possible:
             user     system      total        real
adult?   0.000000   0.000000   0.000000 (  0.000034)
The numbers correspond to the typical UNIX timing results:
- user: the CPU time spent running user code (the adult? implementation)
- system: the CPU time spent running kernel code (the layers below Ruby)
- total: the total CPU time, composed of user and system
- real: the elapsed time, including time taken by other processes running on the system (as if someone were measuring with a stopwatch)
Unfortunately, the numbers are almost worthless because the total time is so small. Besides, a before/after comparison is needed to reach a reliable conclusion.
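The four columns described above can also be read programmatically. The following is a minimal sketch using Benchmark.measure, which returns a Benchmark::Tms object instead of printing a row. The workload inside the block is arbitrary and only serves as an illustration:

```ruby
require 'benchmark'

# Benchmark.measure returns a Benchmark::Tms object with the individual timings.
tms = Benchmark.measure do
  100_000.times { |i| i.to_s } # arbitrary workload, for illustration only
end

puts format('user:   %.6f', tms.utime) # CPU time spent in user code
puts format('system: %.6f', tms.stime) # CPU time spent in kernel code
puts format('total:  %.6f', tms.total) # user + system (plus child processes)
puts format('real:   %.6f', tms.real)  # elapsed wall-clock time
```

Reading the values this way makes it possible to store or compare them in code rather than copying them from the console.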
Benchmark comparisons
An alternative implementation (Query Method) is introduced for comparison:
# age_calculator.rb
require 'date'

module AgeCalculator
  ADULT_AGE = 18

  def self.original_adult?(birthday)
    years_since_birthday = Date.today.year - birthday.year
    years_since_birthday >= ADULT_AGE
  end

  def self.alternative_adult?(birthday)
    years_since_birthday(birthday) >= ADULT_AGE
  end

  def self.years_since_birthday(birthday)
    Date.today.year - birthday.year
  end
  # A bare `private` does not affect methods defined with `def self.`
  private_class_method :years_since_birthday
end
Both implementations are compared with each other in the benchmark script:
# benchmarks.rb
require 'benchmark'
require 'date'
require_relative 'age_calculator'

birthday = Date.new(2000)

Benchmark.bm do |bm|
  bm.report('original') do
    AgeCalculator.original_adult? birthday
  end
  bm.report('alternative') do
    AgeCalculator.alternative_adult? birthday
  end
end
with an outcome like:
                  user     system      total        real
original      0.000000   0.000000   0.000000 (  0.000055)
alternative   0.000000   0.000000   0.000000 (  0.000011)
The quality of the numbers (user and system) is still low. Apart from that, the alternative approach was not expected to be 5 times faster. Obviously, the benchmark is not realistic.
Realistic benchmarks
The Ruby garbage collector, memory allocations and caching mechanisms also influence the benchmark.
Those effects can be reduced with Benchmark.bmbm, which runs the code twice, once as a rehearsal and once for the actual measurement:
# benchmarks.rb
require 'benchmark'
require 'date'
require_relative 'age_calculator'

birthday = Date.new(2000)

Benchmark.bmbm do |bm|
  bm.report('original') do
    AgeCalculator.original_adult? birthday
  end
  bm.report('alternative') do
    AgeCalculator.alternative_adult? birthday
  end
end
and the second cycle produces a much more realistic result:
Rehearsal -----------------------------------------------
original      0.000000   0.000000   0.000000 (  0.000036)
alternative   0.000000   0.000000   0.000000 (  0.000009)
-------------------------------------- total: 0.000000sec

                  user     system      total        real
original      0.000000   0.000000   0.000000 (  0.000027)
alternative   0.000000   0.000000   0.000000 (  0.000027)
However, both implementations still seem to take approximately 0 seconds.
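The rehearsal idea can be approximated by hand to make the mechanism visible. The sketch below is a simplification and not the library's implementation. The helper measure_with_rehearsal is a hypothetical name introduced only for this illustration, and the age check is inlined so that the example runs on its own:

```ruby
require 'benchmark'
require 'date'

# Hypothetical helper for illustration, not part of the Benchmark library.
def measure_with_rehearsal(&block)
  block.call                 # rehearsal run: warms up caches and allocates memory
  GC.start                   # collect garbage before the measured run
  Benchmark.realtime(&block) # elapsed seconds of the second, measured run
end

birthday = Date.new(2000)
seconds = measure_with_rehearsal { Date.today.year - birthday.year >= 18 }
puts format('%.6f s', seconds)
```

A single measured call is still too short to yield meaningful numbers, which matches the observation above.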
Benchmarks for load and scaling
It makes sense to run the code many times for load tests and for benchmarking small code segments. The resulting average value smooths out variances and peaks.
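To see how much variance an average hides, several batches can be timed separately. The following sketch uses Benchmark.realtime. The batch count, the number of calls per batch and the workload are assumptions chosen only for illustration:

```ruby
require 'benchmark'

batches = 5                # assumed value, for illustration only
calls_per_batch = 100_000  # assumed value, for illustration only

# Time each batch separately so that outliers stay visible.
timings = Array.new(batches) do
  Benchmark.realtime { calls_per_batch.times { |i| i + 1 } }
end

mean = timings.sum / timings.size
puts format('min: %.6f  mean: %.6f  max: %.6f', timings.min, mean, timings.max)
```

If min and max lie far apart, the mean alone would give a misleading picture of the code's behavior.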
The function scale_benchmark runs the passed block a million times:
# benchmarks.rb
require 'benchmark'
require 'date'
require_relative 'age_calculator'

def scale_benchmark(&block)
  1_000_000.times(&block)
end

birthday = Date.new(2000)

Benchmark.bm do |bm|
  bm.report('original') do
    scale_benchmark { AgeCalculator.original_adult? birthday }
  end
  bm.report('alternative') do
    scale_benchmark { AgeCalculator.alternative_adult? birthday }
  end
end
The outcome matches the expectations:
                  user     system      total        real
original      1.380000   0.580000   1.960000 (  1.958297)
alternative   1.410000   0.580000   1.990000 (  1.990055)
Indeed, moving logic into a Query Method costs performance. The difference over 1,000,000 cycles is 0.03 seconds on the test system (Intel , 8 GB RAM).
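To put the difference into perspective, the cost per call can be derived from the totals above. This is a small worked calculation; the variable names are chosen only for illustration:

```ruby
# Totals taken from the benchmark output above (seconds per 1,000,000 calls).
original_total    = 1.96
alternative_total = 1.99
calls             = 1_000_000

difference        = alternative_total - original_total
overhead_per_call = difference / calls      # seconds per single call
relative_overhead = difference / original_total

puts format('%.1f ns per call', overhead_per_call * 1e9)
puts format('%.1f %% slower', relative_overhead * 100)
```

On these figures the extracted method adds about 30 nanoseconds per call, roughly 1.5 % of the measured run time.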
But whoever suffers from performance problems of that magnitude actually has a different problem.