My current Perl Projects - Benchmarks - BigBench

While rewriting Math::BigInt and Math::BigFloat, I often need to benchmark different versions to see whether things got better or worse, and how they compare against the original version.

Problems

Unfortunately, use Benchmark; only lets you benchmark one version of a module at a time, and it does not let you specify the contents of the empty loop. Also, the results aren't rounded or formatted, and there is no good comparison table. cmpthese does produce a table, but again it can handle only one version of a module.
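
For reference, a plain Benchmark comparison looks roughly like the sketch below. Both snippets run inside one process and therefore against whichever single Math::BigInt that process has loaded. The labels and the iteration count are made up for illustration.

```perl
#!/usr/bin/perl -w
use strict;
use Benchmark qw(cmpthese);
use Math::BigInt;

# Both subs share the one Math::BigInt loaded by this process, so two
# versions of the module cannot be put side by side in the same table.
# The count of 20000 iterations is an arbitrary choice for this sketch.
my $rows = cmpthese(20000, {
    'new(1)'    => sub { my $x = Math::BigInt->new(1) },
    'new(1e10)' => sub { my $x = Math::BigInt->new('1' . '0' x 10) },
});
```

cmpthese prints its table and also returns the rows as an array reference, but every column is another snippet, never another version of the module.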

Solution

So, on a dark, stormy, rainy evening in late November I sat down and wrote a script that lets you define benchmark templates and a list of things to benchmark.

The script then runs all the benchmarks, collects the data and formats it into pretty ASCII art tables - you can tell that I am fond of these ;oP See below for some sample output.

Status

The basics are more or less done. There are some more things I will put in over the next few days (because I need them); please read the TODO to find out what. Apart from these missing features, there is almost no documentation and there are no test cases. This will be fixed in the next release.

The current version is v0.06 [2001-12-01]

Examples

Here is an example file that defines what to benchmark:
# Short definition file for fast tests

# BigInt/BigFloat:
# You should use $class->new(); instead of Math::BigInt->new() etc, so that
# the benchmarks are independent of the actual class used.

# Actual definitions:

group=new#1#Big integer new
1#1###$x = $class->new(1)
2#1e10###$x = $class->new('1'.'0' x 10)

group=new specials#0#Big integer new (special values)
0#NaN###$x = $class->new('abcdefg')
As you can see, it lets you group things together. The group-id and op-id are optional; if you set them to 0, they will be filled in automatically. For a complete description of the format see perldoc bb or the comments in the definition file bigint.def.
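
To make the format more concrete, here is a minimal sketch of how such a file could be parsed. This is not the code from bb. It assumes that fields are separated by '#', that the two empty fields of an op line can be ignored, and that an id of 0 is replaced by the next free number. The name parse_definitions is made up for this example.

```perl
#!/usr/bin/perl -w
use strict;

# Parse definition text into a list of groups, each holding its ops.
# Assumptions (not confirmed by bb): '#' separates the fields, the third
# and fourth field of an op line are ignored, and an id of 0 means
# "use the next free number".
sub parse_definitions {
    my ($text) = @_;
    my @groups;
    my ($next_group, $next_op) = (1, 1);
    for my $line (split /\n/, $text) {
        # skip comment lines and blank lines
        next if $line =~ /^\s*#/ or $line !~ /\S/;
        if ($line =~ /^group=(.*)/) {
            my $rest = $1;
            my ($name, $id, $desc) = split /#/, $rest, 3;
            $id ||= $next_group;            # 0 or missing: number automatically
            $next_group = $id + 1;
            push @groups, { id => $id, name => $name, desc => $desc, ops => [] };
            next;
        }
        die "op outside of a group: $line\n" unless @groups;
        # the limit of 5 keeps any '#' inside the code field intact
        my ($id, $name, undef, undef, $code) = split /#/, $line, 5;
        $id ||= $next_op;
        $next_op = $id + 1;
        push @{ $groups[-1]{ops} }, { id => $id, name => $name, code => $code };
    }
    return \@groups;
}

my $groups = parse_definitions(<<'END');
# Short definition file for fast tests
group=new#1#Big integer new
1#1###$x = $class->new(1)
2#1e10###$x = $class->new('1'.'0' x 10)

group=new specials#0#Big integer new (special values)
0#NaN###$x = $class->new('abcdefg')
END

for my $g (@$groups) {
    print "group $g->{id} ('$g->{name}'):\n";
    print "  $_->{id}  $_->{name}  =>  $_->{code}\n" for @{ $g->{ops} };
}
```

With the sample definitions this yields 3 ops in 2 groups, and the ids given as 0 become group 2 and op 3, which matches the numbering in the sample output further down.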

Here is an example file that is used as a template to benchmark the ops that were defined above:
#!/usr/bin/perl -w

# bigbench template file.
$| = 1;
use lib '##path##/Math-BigInt-1.48/lib';
use Math::BigInt;
use Math::BigFloat;
use Benchmark;
my $class = 'Math::BigInt';
my $class_float = 'Math::BigFloat';

# output header and test for correct version

my $need = 'Math::BigInt v1.48 lib => Math::BigInt::Calc v0.17';
my $v = "Math::BigInt v$Math::BigInt::VERSION ";
if (Math::BigInt->can('_core_lib'))
  {
  $v .= "lib => ". Math::BigInt->_core_lib();
  $v .= ' v' . eval '$'.Math::BigInt->_core_lib().'::VERSION';
  }
print "$v\n";

die "Cannot load '$need', got '$v'\n" unless $v eq $need;

# actual benchmarking code will be appended
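
How a template and an op end up in one runnable script can be sketched as follows. The ##path## placeholder is taken from the template above. Everything else is an assumption: the helper name build_script, the library path in the usage example, and in particular the appended timethis() call, which is only a guess at what bb appends.

```perl
#!/usr/bin/perl -w
use strict;

# Build a complete benchmark script from a template and the code of one op.
# The appended timing code is a guess for illustration, not taken from bb.
sub build_script {
    my ($template, $libpath, $code, $seconds) = @_;
    # fill in the library path given on the command line
    (my $script = $template) =~ s/##path##/$libpath/g;
    # a negative count makes timethis() run for at least that many CPU seconds
    $script .= "\nmy \$x;\ntimethis(-$seconds, sub { $code });\n";
    return $script;
}

my $template = <<'END';
use lib '##path##/Math-BigInt-1.48/lib';
use Math::BigInt;
use Benchmark;
my $class = 'Math::BigInt';
END

# '/home/me/lib' is a hypothetical path
print build_script($template, '/home/me/lib', '$x = $class->new(1)', 2);
```

Writing each generated script to a temporary file and running it in its own perl process is what makes it possible to load a different module version per template.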

Output

Help screen

./bb --help produces:
BigBench v0.05  (c) Copyright by Tels 2001.  Have fun!

Usage  : ./bb [options]
Options: --help              print this screen and exit
         --base=number       print relative summary based on number
         --code=sourcecode   bench code snippet and ignore definitons
         --definitons=file   from where to read benchmark definitions
         --duration=seconds  run each op for at least this time
         --templates=path    path to templates to be used
         --path=libpath      path to libraries used by templates
         --simulate=sr       simulate results by using srand(sr)
         --skew=factor       scale reported numbers by factor
         --nosummary         don't print summary
         --nounlink          don't unlink temporary files (for debug)
         --terse             more terse summary (less lines)
         --tight             more tight summary (smaller spacing)

Options may be abbreviated, their case does not matter.

Examples: ./bb --def=math.def --terse --skew=2.1        # better printable?
          ./bb --def=str.def --inc=math --duration=5    # really fine-grained
          ./bb --def=some.def --nosummary               # detailed
          ./bb --def=some.def --terse --base=100        # simulate perlbench
          ./bb --code='"ababba" =~ /a+/;'               # only this

Sat Dec  1 10:35:25 2001 All done. Enjoy!

Output with summary

The following output was generated with ./bb --templates=latest --def=short.def:
BigBench v0.05  (c) Copyright by Tels 2001.  Have fun!

Sat Dec  1 10:51:02 2001 Reading templates from 'latest/'...done.
 Got 2 templates.
Sat Dec  1 10:51:02 2001 Reading definitions from short.def...done.
 Got 3 ops in 2 groups.

Each op will run for at least 2 seconds.
Results are scaled by factor 1 and rounded to 3 digits.
Time to complete benchmark is approximately 18 seconds.

Running 'v0.01':
 Benchmarking group 1 ('new'):
      1  1                  35100 ops/s
      2  1e10               34000 ops/s
 Average:                   34500 ops/s
 Benchmarking group 2 ('new specials'):
      3  NaN                65300 ops/s
 Average:                   65300 ops/s

Running 'v1.48':
 Benchmarking group 1 ('new'):
      1  1                  14900 ops/s
      2  1e10               14300 ops/s
 Average:                   14600 ops/s
 Benchmarking group 2 ('new specials'):
      3  NaN                33500 ops/s
 Average:                   33500 ops/s

Sat Dec  1 10:51:21 2001 Numbers are absolute ops/s, scaled by factor 1.

               |  v0.01  v1.48
 --------------+---------------
  1            |  35100  14900
  1e10         |  34000  14300
 new:          |  34500  14600
 ..............|...............
  NaN          |  65300  33500
 new specials: |  65300  33500
 ..............|...............

Sat Dec  1 10:51:21 2001 All done. Enjoy!
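
A summary table of this kind takes only a few lines of Perl. The following is a simplified sketch, not the formatting code from bb: the name format_table is made up, and group separators and multi-line headers are left out.

```perl
#!/usr/bin/perl -w
use strict;
use List::Util qw(max);

# Format a header list and rows of [ label, value, value, ... ] as a table.
sub format_table {
    my ($heads, @rows) = @_;
    my $lw = max map { length $_->[0] } @rows;      # width of the label column
    my $cw = max map { length $_ } @$heads,         # width of the value columns
                 map { @$_[1 .. $#$_] } @rows;
    my $line = sub {
        my ($label, @cells) = @_;
        sprintf "%-${lw}s |%s\n", $label,
            join '', map { sprintf "  %${cw}s", $_ } @cells;
    };
    my $out = $line->('', @$heads);
    $out .= '-' x $lw . '-+' . '-' x ((2 + $cw) * @$heads) . "\n";
    $out .= $line->(@$_) for @rows;
    return $out;
}

print format_table(
    [ 'v0.01', 'v1.48' ],
    [ '1',    35100, 14900 ],
    [ '1e10', 34000, 14300 ],
    [ 'new:', 34500, 14600 ],
);
```

All value columns share the width of the widest cell, so the numbers line up on the right as they do in the output above.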

Output with terse summary

The following was generated by ./bb --terse --def=short.def --temp=latest:
BigBench v0.05  (c) Copyright by Tels 2001.  Have fun!

Sat Dec  1 10:52:02 2001 Reading templates from 'latest/'...done.
 Got 2 templates.
Sat Dec  1 10:52:02 2001 Reading definitions from short.def...done.
 Got 3 ops in 2 groups.

Each op will run for at least 2 seconds.
Results are scaled by factor 1 and rounded to 3 digits.
Time to complete benchmark is approximately 18 seconds.

Running 'v0.01':
[snip some lines]

Sat Dec  1 10:52:21 2001 Numbers are absolute ops/s, scaled by factor 1.

               |  v0.01  v1.48
 --------------+---------------
 new:          |  34400  14600
 new specials: |  65800  33500

Sat Dec  1 10:52:21 2001 All done. Enjoy!

Output with tight summary

The following was generated by ./bb --tight --def=short.def --templ=latest:
BigBench v0.05  (c) Copyright by Tels 2001.  Have fun!

Sat Dec  1 10:48:03 2001 Reading templates from 'latest/'...done.
 Got 2 templates.
Sat Dec  1 10:48:03 2001 Reading definitions from short.def...done.
 Got 3 ops in 2 groups.

Each op will run for at least 2 seconds.
Results are scaled by factor 1 and rounded to 3 digits.
Time to complete benchmark is approximately 18 seconds.

Running 'v0.01':
[snip some lines]

Sat Dec  1 10:48:22 2001 Numbers are absolute ops/s, scaled by factor 1.

               | v0.01 v1.48
 --------------+-------------
  1            | 35000 15100
  1e10         | 34200 14200
 new:          | 34600 14600
 ..............|.............
  NaN          | 64900 33300
 new specials: | 64900 33300
 ..............|.............

Sat Dec  1 10:48:22 2001 All done. Enjoy!

Output with relative numbers in summary

The following was generated by ./bb --base=1000 --def=short.def --templ=bigint:
BigBench v0.05  (c) Copyright by Tels 2001.  Have fun!

Sat Dec  1 10:54:11 2001 Reading templates from 'bigint/'...done.
 Got 8 templates.
Sat Dec  1 10:54:11 2001 Reading definitions from short.def...done.
 Got 3 ops in 2 groups.

Each op will run for at least 2 seconds.
Results are scaled by factor 1 and rounded to 3 digits.
Time to complete benchmark is approximately 72 seconds.

Running 'v0.01':
 Benchmarking group 1 ('new'):
      1  1                  35300 ops/s
      2  1e10               34000 ops/s
 Average:                   34600 ops/s
 Benchmarking group 2 ('new specials'):
      3  NaN                64600 ops/s
 Average:                   64600 ops/s

Running 'v1.33':
 Benchmarking group 1 ('new'):
      1  1                  13000 ops/s
      2  1e10               12200 ops/s
 Average:                   12600 ops/s
 Benchmarking group 2 ('new specials'):
      3  NaN                39700 ops/s
 Average:                   39700 ops/s

Running 'v1.39':
 Benchmarking group 1 ('new'):
      1  1                  14900 ops/s
      2  1e10               14400 ops/s
 Average:                   14700 ops/s
 Benchmarking group 2 ('new specials'):
      3  NaN                34100 ops/s
 Average:                   34100 ops/s

Running 'v1.40':
 Benchmarking group 1 ('new'):
      1  1                  14900 ops/s
      2  1e10               14300 ops/s
 Average:                   14600 ops/s
 Benchmarking group 2 ('new specials'):
      3  NaN                34100 ops/s
 Average:                   34100 ops/s

Running 'v1.45':
 Benchmarking group 1 ('new'):
      1  1                  14900 ops/s
      2  1e10               14200 ops/s
 Average:                   14500 ops/s
 Benchmarking group 2 ('new specials'):
      3  NaN                33700 ops/s
 Average:                   33700 ops/s

Running 'v1.47':
 Benchmarking group 1 ('new'):
      1  1                  14700 ops/s
      2  1e10               14100 ops/s
 Average:                   14400 ops/s
 Benchmarking group 2 ('new specials'):
      3  NaN                33400 ops/s
 Average:                   33400 ops/s

Running 'v1.48':
 Benchmarking group 1 ('new'):
      1  1                  14900 ops/s
      2  1e10               14200 ops/s
 Average:                   14500 ops/s
 Benchmarking group 2 ('new specials'):
      3  NaN                33800 ops/s
 Average:                   33800 ops/s

Running 'v1.48_Pari_v1.05':
 Benchmarking group 1 ('new'):
      1  1                  13500 ops/s
      2  1e10               13000 ops/s
 Average:                   13300 ops/s
 Benchmarking group 2 ('new specials'):
      3  NaN                28700 ops/s
 Average:                   28700 ops/s

Sat Dec  1 10:55:26 2001 Numbers are relative to v0.01, 1000 denotes 100%.

               |  v1.33  v1.39  v1.40  v1.45  v1.47  v1.48  v1.48
               |                                             Pari
               |                                            v1.05
 --------------+--------------------------------------------------
  1            |    368    422    422    422    416    422    382
  1e10         |    359    424    421    418    415    418    382
 new:          |    364    425    422    419    416    419    384
 ..............|..................................................
  NaN          |    614    528    528    522    517    523    444
 new specials: |    615    528    528    522    517    523    444
 ..............|..................................................

Sat Dec  1 10:55:26 2001 All done. Enjoy!
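
The relative numbers follow from the absolute ones by simple arithmetic. Here is a minimal sketch with made-up helper names (round_sig, relative), assuming that a result is divided by the result of the first template, multiplied by the --base value and rounded to 3 digits:

```perl
#!/usr/bin/perl -w
use strict;

# Round $n to $digits significant digits.
sub round_sig {
    my ($n, $digits) = @_;
    return 0 + sprintf('%.' . ($digits - 1) . 'e', $n);
}

# Express $value relative to $reference, where $base denotes 100%.
sub relative {
    my ($value, $reference, $base) = @_;
    return round_sig($value / $reference * $base, 3);
}

# op '1': v1.33 (13000 ops/s) relative to v0.01 (35300 ops/s)
print relative(13000, 35300, 1000), "\n";   # prints 368
```

Recomputing the table from the already rounded ops/s figures can be off by one in the last digit, for example 615 instead of 614 for NaN under v1.33, presumably because bb works with the unrounded measurements.
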
I hope you like it; if so, drop me a note.


Last update: 2001-11-29