Round-trip integration testing

(originally posted 2012-04-03)

Any system worth its salt is going to need some pretty intricate testing to verify it does all the wonderful things it’s designed to do. But hand-crafting test cases is tedious, brittle, and prone to masking your presuppositions about what will work and what might break. Auto-generating data takes care of a lot of the grunt-work, but how do you figure out what to look for at the end if you’re not explicitly defining the input and the expected output?

Well, one obvious approach is to programmatically obtain the output based on the input. But then you’re basically repeating the implementation of the code being tested. At best, the two implementations were arrived at independently and there’s something useful in the test. But if the requirements change over time (and they will), the test and code will both have to change similarly, and what’s the likelihood that you can maintain clean-room separation between the two during this evolution? You’re doomed to coupling the two together.

That’s why I shoot for what I’m calling “round trip” integration tests. The key idea is for the test to perform the operation in reverse, so you can put your test data through both directions and make sure it looks the same when it comes back. Depending on the operations in question, it may make sense to start with an expected output, derive from it the input needed to produce it, then verify that the operation does in fact produce it. Conversely, you may start with input, perform the operation, then transform the result back to compare with the original input. If the operation is in any way destructive (that is, it discards information the reverse step would need), then the latter option probably isn’t available to you. So let’s look at a tiny, somewhat contrived, example of the former.
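The latter direction doesn’t get an example in this post, so here is a minimal sketch of it for contrast. The encode and decode subs are hypothetical stand-ins (plain join and split on a comma), chosen only because they undo each other for comma-free data. They are not part of the grouping example.

```perl
use strict;
use warnings;

# Hypothetical operation under test and its inverse (illustration only).
sub encode { join ',', @_ }
sub decode { split /,/, $_[0], -1 }

# Random input: between 1 and 10 integers, each from 1 to 100.
my @input = map { int(rand(100)) + 1 } 1 .. int(rand(10)) + 1;

# Forward through the operation, then back again.
my @round_trip = decode(encode(@input));

# Compare element by element with the original input.
die 'Length mismatch' unless @round_trip == @input;
for my $i (0 .. $#input) {
  die "Mismatch at index $i" unless $round_trip[$i] == $input[$i];
}
print "round trip ok\n";
```

The grouping example that follows takes the other route, starting from the expected output.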

Let’s say I wanted to test a grouping function like Ruby’s Enumerable#group_by, but in Perl. I can generate some sample output:

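# Random integer from 1 to 100
sub r100 { int(rand(100)) + 1 }

# Returns $length distinct random integers, in random order
sub sequence {
  my ($length) = @_;
  my %key;
  my @sequence;
  while (@sequence < $length) {
    my $num = r100;
    next if $key{$num};
    push @sequence, $num;
    $key{$num} = 1;
  }
  return @sequence;
}

# Each distinct key maps to its own list of distinct values
my %grouped = map {$_ => [sequence(r100)]} sequence(r100);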

Now we need some input that should produce that output. It’s nice to interleave the keys randomly, but the values for any given key must stay in the same relative order, since that’s the order the grouping will reproduce:

my @input;
while (my @keys = keys %grouped) {
  my $key = $keys[rand(@keys)];
  push @input, [$key, shift @{$grouped{$key}}];
  delete $grouped{$key} unless @{$grouped{$key}};
}

So let’s write some code to perform the collation:

sub group {
  my %grouped;
  foreach my $item (@_) {
    my ($key, $value) = @$item;
    push @{$grouped{$key}}, $value;
  }
  return %grouped;
}

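Now all we have left is to string it all together and compare. Your average test framework will do this comparison for you, but let’s just brute force it here. Smartmatching two hashes only checks that they have the same keys, so the values get compared array by array in a loop. (The ~~ operator has been experimental since Perl 5.18 and deprecated since 5.38, so recent Perls will warn about it.)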

my %result = group @input;
die 'Key match failure' unless %result ~~ %grouped;
foreach my $key (keys %grouped) {
  die 'Grouped value failure' unless @{$result{$key}} ~~ @{$grouped{$key}};
}
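For comparison, here is a minimal sketch of letting a test framework do the comparison, using is_deeply from the core Test::More module. The small hand-written fixture (%expected and @input) is an assumption made for illustration, standing in for the generated data.

```perl
use strict;
use warnings;
use Test::More;

# Same grouping sub as in the post.
sub group {
  my %grouped;
  foreach my $item (@_) {
    my ($key, $value) = @$item;
    push @{ $grouped{$key} }, $value;
  }
  return %grouped;
}

# Hypothetical hand-written fixture in place of the generated data.
my %expected = (a => [1, 2, 3], b => [4, 5]);
my @input    = ([a => 1], [b => 4], [a => 2], [a => 3], [b => 5]);

my %result = group(@input);

# is_deeply compares keys and nested values in one call and
# reports where the two structures first differ.
is_deeply(\%result, \%expected, 'grouped output matches expected');

done_testing();
```

This covers both the key check and the per-key value check in one assertion, without relying on smartmatch.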

Stringing it all together, I found a flaw somewhere in the round trip: the ‘Key match failure’ got tripped. It turned out that the transformation from %grouped to @input was destructive (the loop shifts values off and deletes keys until nothing is left), so I needed to operate on a copy. A shallow copy made it past the key test but died on the ‘Grouped value failure’, because the copied hash still pointed at the same arrays, which the loop emptied. So here it is all strung together, with a deep copy, error free:

use strict;

sub r100 { int(rand(100)) + 1 }
sub sequence {
  my ($length) = @_;
  my %key;
  my @sequence;
  while (@sequence < $length) {
    my $num = r100;
    next if $key{$num};
    push @sequence, $num;
    $key{$num} = 1;
  }
  return @sequence;
}
my %grouped = map {$_ => [sequence(r100)]} sequence(r100);

# Deep-copy %grouped: same keys, but each value is a fresh array
my %tmp;
@tmp{keys %grouped} = map {[@$_]} values %grouped;
my @input;
while (my @keys = keys %tmp) {
  my $key = $keys[rand(@keys)];
  push @input, [$key, shift @{$tmp{$key}}];
  delete $tmp{$key} unless @{$tmp{$key}};
}

sub group {
  my %grouped;
  foreach my $item (@_) {
    my ($key, $value) = @$item;
    push @{$grouped{$key}}, $value;
  }
  return %grouped;
}

my %result = group @input;

die 'Key match failure' unless %result ~~ %grouped;
foreach my $key (keys %grouped) {
  die 'Grouped value failure' unless @{$result{$key}} ~~ @{$grouped{$key}};
}
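
The difference between the shallow and deep copies mentioned above is easy to see in isolation. This is a minimal sketch with a hypothetical two-key fixture in place of the generated %grouped:

```perl
use strict;
use warnings;

# Hypothetical fixture standing in for the generated %grouped.
my %grouped = (a => [1, 2, 3], b => [4, 5]);

# Shallow copy: a new hash, but its values are the same array references.
my %shallow = %grouped;

# Deep copy (one level down): each value is a freshly built array.
my %deep;
@deep{ keys %grouped } = map { [@$_] } values %grouped;

shift @{ $deep{a} };     # leaves $grouped{a} alone
print "after deep shift:    @{ $grouped{a} }\n";    # 1 2 3

shift @{ $shallow{a} };  # also removes from $grouped{a}
print "after shallow shift: @{ $grouped{a} }\n";    # 2 3
```

Copying one level is enough here because the values are flat arrays. For more deeply nested data, dclone from the core Storable module is one way to copy the whole structure.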