Benchmark vs. question mark

Benchmarking in Ruby

Tests are necessary for successful refactorings. After all, the system has to behave as it did before.
Refactorings can also affect performance under certain circumstances. After a successful architecture or code improvement, performance surprises are undesirable.
Benchmarks supply the necessary data. The measured numbers provide safety from the performance perspective. Benchmarks also help to locate performance issues. Whether a presumed performance issue has to be fixed or not should also depend on benchmark results.
Data helps to make the right decisions.
For instance, there is an argument against extracting logic into a separate method: it is suspected to cost performance.
The cost of moving a temporary variable (years_since_birthday) into a Query Method is to be measured using the following module AgeCalculator:

# age_calculator.rb
require 'date'

module AgeCalculator
  ADULT_AGE = 18

  def self.adult? birthday
    years_since_birthday = Date.today.year - birthday.year
    years_since_birthday >= ADULT_AGE
  end
end

The actual state is measured in a benchmark script:

# benchmarks.rb
require 'benchmark'
require_relative 'age_calculator'

Benchmark.bm do |bm|
  bm.report('adult?') do
    AgeCalculator.adult? Date.new(2000)
  end
end

Everything inside a report block is measured. Several benchmarks can be compared with each other using Benchmark.bm.
After running the script:

$ ruby benchmarks.rb

numbers like the following are possible:

        user       system     total        real
adult?  0.000000   0.000000   0.000000 (  0.000034)

The numbers correspond to the typical UNIX timing results:

  1. user: the CPU time spent running the user code (the adult? implementation),
  2. system: the CPU time spent running kernel code (the layers below Ruby),
  3. total: the total CPU time, composed of user and system,
  4. real: the elapsed time, including time taken by other processes running on the system (as if someone were measuring with a stopwatch).
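
The four values can also be read programmatically. A minimal sketch, assuming Benchmark.measure from the standard library, which returns a Benchmark::Tms object; the workload (summing integers) is only a hypothetical stand-in for the adult? call:

```ruby
require 'benchmark'

# Hypothetical workload, large enough to register some CPU time.
tms = Benchmark.measure { (1..200_000).reduce(:+) }

puts "user:   #{tms.utime}"  # CPU time spent in user code
puts "system: #{tms.stime}"  # CPU time spent in kernel code
puts "total:  #{tms.total}"  # user + system (plus child process times)
puts "real:   #{tms.real}"   # elapsed wall-clock time
```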

Unfortunately, the numbers are almost worthless because the total time is so small. Besides, an exact conclusion requires a before/after comparison.

Benchmark comparisons

An alternative implementation (Query Method) is introduced for comparison:

# age_calculator.rb
require 'date'

module AgeCalculator
  ADULT_AGE = 18

  def self.original_adult? birthday
    years_since_birthday = Date.today.year - birthday.year
    years_since_birthday >= ADULT_AGE
  end

  def self.alternative_adult? birthday
    years_since_birthday(birthday) >= ADULT_AGE
  end

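  def self.years_since_birthday birthday
    Date.today.year - birthday.year
  end

  # `private` has no effect on `def self.` methods; hide it explicitly.
  private_class_method :years_since_birthday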
end

Both implementations are compared with each other in the benchmark script:

# benchmarks.rb
require 'benchmark'
require_relative 'age_calculator'

birthday = Date.new(2000)

Benchmark.bm do |bm|
  bm.report('original') do
    AgeCalculator.original_adult? birthday
  end

  bm.report('alternative') do
    AgeCalculator.alternative_adult? birthday
  end
end

with a result like:

             user       system     total         real
original     0.000000   0.000000   0.000000 (  0.000055)
alternative  0.000000   0.000000   0.000000 (  0.000011)

The quality of the numbers (user and system) is still low. Apart from that, the alternative approach was not expected to be 5 times faster. Obviously, the benchmark is not realistic.
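
How noisy a single measurement is can be seen by timing the very same call several times. A minimal sketch, assuming Benchmark.realtime from the standard library; the inline copy of AgeCalculator and the run count of 5 are only there to keep the example self-contained:

```ruby
require 'benchmark'
require 'date'

# Copy of the original implementation so the sketch runs on its own.
module AgeCalculator
  ADULT_AGE = 18

  def self.original_adult?(birthday)
    Date.today.year - birthday.year >= ADULT_AGE
  end
end

birthday = Date.new(2000)

# Time the very same call five times in a row.
timings = Array.new(5) do
  Benchmark.realtime { AgeCalculator.original_adult?(birthday) }
end

timings.each_with_index do |seconds, run|
  puts format('run %d: %.6f s', run + 1, seconds)
end
```

The values differ from run to run and from machine to machine, and the first run is often the slowest because it pays for one-time setup costs.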

Realistic benchmarks

The Ruby garbage collector, memory allocations and caching mechanisms also influence the benchmark.
Those effects can be reduced with Benchmark.bmbm. The code is then run twice, first as a rehearsal and then for the actual measurement:

# benchmarks.rb
require 'benchmark'
require_relative 'age_calculator'

birthday = Date.new(2000)

Benchmark.bmbm do |bm|
  bm.report('original') do
    AgeCalculator.original_adult? birthday
  end

  bm.report('alternative') do
    AgeCalculator.alternative_adult? birthday
  end
end

and the second cycle produces a far more realistic result:

Rehearsal -----------------------------------------------
original      0.000000   0.000000   0.000000 (  0.000036)
alternative   0.000000   0.000000   0.000000 (  0.000009)
-------------------------------------- total: 0.000000sec

              user       system     total         real
original      0.000000   0.000000   0.000000 (  0.000027)
alternative   0.000000   0.000000   0.000000 (  0.000027)

However, both implementations still seem to take approximately 0 seconds.
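
What the rehearsal does can be approximated by hand. This is a rough sketch, not Benchmark's actual implementation: the block is run once as a warm-up, then the garbage collector is triggered with GC.start before the timed run. The helper name rehearsed_realtime and the stand-in workload are hypothetical:

```ruby
require 'benchmark'

# Hypothetical helper: warm up once, collect garbage, then measure.
def rehearsed_realtime(&block)
  block.call                  # rehearsal run, result discarded
  GC.start                    # begin the timed run with a freshly collected heap
  Benchmark.realtime(&block)  # elapsed wall-clock seconds of the second run
end

seconds = rehearsed_realtime { Array.new(1_000) { |i| i.to_s } }
puts format('%.6f s', seconds)
```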

Benchmarks for load and scaling

It makes sense to run the code many times for load tests and when benchmarking small code segments. The resulting average smooths out variance and peaks.
The function scale_benchmark runs the passed block a million times:

# benchmarks.rb
require 'benchmark'
require_relative 'age_calculator'

def scale_benchmark &block
  1_000_000.times(&block)
end

birthday = Date.new(2000)

Benchmark.bm do |bm|
  bm.report('original') do
    scale_benchmark { AgeCalculator.original_adult? birthday }
  end

  bm.report('alternative') do
    scale_benchmark { AgeCalculator.alternative_adult? birthday }
  end
end

The outcome matches the expectations:

original      1.380000   0.580000   1.960000 (  1.958297)
alternative   1.410000   0.580000   1.990000 (  1.990055)

Indeed, moving logic into a Query Method costs performance. The difference for 1,000,000 cycles is 0.03 seconds on the test system (Intel , 8 GB RAM).
But anyone who actually suffers from a performance difference of this size has a different problem altogether.
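
To put the difference into perspective, the totals from the result above can be broken down to a single call. A small calculation, using only the measured numbers:

```ruby
# Totals (user + system) taken from the benchmark output above.
original_total    = 1.96
alternative_total = 1.99
iterations        = 1_000_000

# Extra cost of the Query Method per call, in nanoseconds.
overhead_ns = (alternative_total - original_total) / iterations * 1e9

puts format('%.0f ns per call', overhead_ns)
```

On the test system the extra method call therefore costs roughly 30 nanoseconds.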