On paperboys, newsagents and exceptions

A few days ago I asked whether an exception should be thrown when the customer cannot pay the paper-boy. The responses (thank you!) ranged widely across the spectrum between Yes and various shades of No, and are well worth taking a few minutes to read before we go any further. What follows is my take on the code…

TL;DR — replace the exception by giving the paper-boy a coping strategy

Let’s think about the real-world (domain) model for a minute:

Firstly, I note that I have been that customer on many occasions. I’m frequently not at home when the paper-boy or the Avon lady or the window-cleaner comes around (honest!). I would expect that non-payment is statistically likely to occur on pretty much every collection round.

Now, presumably the paper-boy is on his round, calling on the newsagent’s customers. He may be only collecting money, or he may also be delivering papers. As he goes from house to house everything runs smoothly, in that every customer so far has paid their bill; he has a bulging bag of money, and has marked ‘Paid’ against every customer on the first page of his little Accounts book. But the next customer doesn’t have the funds to pay! The paper-boy hasn’t been told what to do, and wasn’t expecting this eventuality at all. He panics and runs screaming back to the newsagent.

The paper-boy runs into the arms of the newsagent, who sees that someone couldn’t pay. Thankfully the boy still has the money he has collected thus far. But he also has hold of the last customer’s wallet. And in his panic he is so incoherent that he can’t say which customer couldn’t pay [note that the exception has no constructor parameters]. At this point we have to make some assumptions about where the newsagent might be standing:

If he is back at the shop, then the paper-boy has run so far that his round must now be assumed to be abandoned; the newspapers (and the bills) of any remaining customers will need to be dealt with on another round. Alternatively, the newsagent might be accompanying the boy on his round, telling him which house to visit next; in this case he can simply calm the boy down, deal with the matter himself, and the pair can then proceed with the round.

Neither of these alternatives seems entirely satisfactory. In the first case, it seems unreasonable that neither the boy nor the newsagent would learn how to cope with non-payment (which must surely happen regularly). Even if the boy wasn’t told about the possibility before his first ever round, he wouldn’t continue behaving that way forever. And no sensible newsagent would be happy to have to re-start the boy’s rounds after each non-payment.

But in the second case, why have a dog and bark yourself? Why is the newsagent accompanying the boy, unless this is a training exercise? Again, not a scenario that is likely to be oft repeated.

I assume therefore that, in the real world, the boy is sent on the round because the newsagent wants to stay and mind the store. The newsagent wants to boy to collect his customers’ dues (and perhaps deliver papers too), without the newsagent having to break off his work (and potentially finish the round himself, or send another boy to do that later). It’s likely therefore that the boy will have been given instructions such as “If anyone isn’t at home, or can’t pay, just make a note in the book. We’ll see them next week, or send an invoice, but don’t you worry about that for now.” That is, the newsagent gives the boy a coping strategy. The boy’s responsibility is to mark ‘Paid’ or ‘Not paid’ against each customer, and to collect money where possible. The newsagent will reconcile these notes against his accounts later in the day and decide what to do, both with the money and with the non-payers.

So, back to the code. Throwing this exception forces some other part of our code (presumably the Newsagent) to cope with it; the catch clause will be an example of a “type 7” conditional in my little catalogue. I don’t believe it models the domain at all faithfully. Alternatively, we could avoid the exception if the paper-boy returned a status code. But that seems to me to be pretty much the same as the scenario in which the newsagent accompanies the boy on his round (and it leaves the Newsagent having to implement a “type 1” conditional too).

Instead, I agree with Paul D’Ambra: I think we should give the Paperboy a coping strategy: Equip him with code to call when the customer cannot pay. And in the interests of symmetry we might go further, having the boy update the accounts directly as he goes.

There are numerous ways to express this in code. For example, we could implement this using the Observer pattern, either with a directly coupled listener:

class Paperboy {
  private int collected_amount;
  private Newsagent newsagent;

  public void collect_money(Customer customer, int due_amount) {
    int amount_received = customer.payWhatYouCanAfford(due_amount);
    collected_amount += amount_received;
    newsagent.customerPaid(amount_received, customer);
  }
}

Or with a decoupled event notification mechanism:

class Paperboy
  def collect_money(customer, due_amount)
    amount_received = customer.pay_what_you_can_afford(due_amount)
    @collected_amount += amount_received
    EventBus.announce :customerPaid, amount: amount_received, customer: customer
  end
end

Another conditional avoided. (And a domain more faithfully modelled, in my opinion.)

Eliminate many conditionals with this one weird trick

Recently I attempted to classify the conditionals in software according to where in the code they originate. The first category of these was: Checking a value returned to me from code I own. A large proportion of these can be eliminated quite simply.

Imagine we want to begin reading a file at the last saved read position, which may be null if we haven’t read anything yet:

var readOffset = fileReader.GetSavedReadPosition();
if (readOffset == null)
  readOffset = new ReadPosition(fileReader, 0);

This code violates almost every rule in the book:

  • There is duplication, because both the caller and the callee have special code for the “no saved position” branch.
  • There is Connascence of Meaning, because both need a common understanding of what the special null value means.
  • We are violating the Tell, don’t Ask principle because we are making a decision on the basis of a value returned from a call to a different object.

All in all, this code has problems — and yet I see code like this everywhere I look. So, what to do?

Let’s look at this code from the Connascence point of view. The problem is the null value representing the special “not saved” case: both the caller and the callee have to agree to use null to mean that (hence “Connascence of Meaning”). Now, connascence becomes stronger with increasing distance. Our example smells strongly because the two connascent code fragments are in different classes; thus we could weaken it if we can bring the endpoints closer together.

Currently, the fileReader decides what to return when it has no saved read position, and the client code has to cope with that decision. What if, instead, the client code decides what it would like to get back in that case; what if it could simply tell the method what to return if it can’t do the thing we’re asking:

var readOffset = fileReader.GetSavedReadPosition(0);

Now the connascence has disappeared, and the conditional has disappeared with it. It’s as simple as that.

Many frameworks and APIs offer this kind of default return value parameter. For example, the standard Ruby Hash class provides this feature via the fetch method.

But what if there’s no sensible default value that we can pass into the called method? For example, consider the following case:

var username = auth.GetPrincipal();
if (username == null)
  throw new UsernameNotFound();
auth.SetPassword(username, password);

We still have Connascence of Meaning due to the null return case; and we don’t want to push the error handling (or whatever) down into the authentication object, because different callers may want to handle the situation differently. But we can take a leaf from the default parameter book, and have the authentication object execute code for us:

var throwIfNotFound = () => throw new UsernameNotFound();
var username = auth.GetPrincipal(throwIfNotFound);
auth.SetPassword(username, password);

Again, the null value has disappeared, and so has the Connascence and the conditional. For the sake of symmetry we can often also dispense with the return value altogether and pass in both code branches to the callee:

auth.WithPrincipal(
  (username) => auth.SetPassword(username, password),
  () => throw new UsernameNotFound());

lambda1

Pretty much every modern language supports this kind of code now:

  • Functional languages allow us to pass in functions to do the job; likewise Javascript;
  • C and C++ let us pass function pointers;
  • C# lets us use lambdas via Linq; Ruby lets us pass in lambdas, and even C++ (thanks @mmeijeri) and Java 8 now allow lambdas.

Some even let us use named parameters to make the options a little clearer; for example, we might write the above example in Ruby thus:

auth.withPrincipal(
  if_known: lambda {|username| auth.set_pass(username, passwd)},
  if_unknown: lambda {raise UsernameNotFound.new})

It’s as simple as that. By adopting these approaches it is possible to simplify a lot of code, effectively removing the connascence and the duplicated conditionals from the caller and the callee.

Where do conditionals come from?

Given that we want to reduce the number of conditional branches in our code, I wonder whether it is possible to catalogue all of the reasons they exist, so that we might then in turn list a few ways to eliminate each category.

So to that end I’ve listed out those origins of conditionals that I could think of. Is this a fool’s errand? Possibly, but let’s give it a try anyway. Please comment if you can spot any that I have missed, and let’s see if we can grow a “complete” list…

  1. Checking a value returned to me from code I own
    This is the essence of the Null Check smell, and more generally Connascence of Meaning. For example, see this (intentionally bad) code from William C. Wake’s Refactoring Workbook:

    public class Report {
      public static void report(Writer out, List<Machine> machines, Robot robot) throws IOException
      {
        //...
    
        out.write("Robot");
        if (robot.location() != null)
          out.write(" location=" + robot.location().name());
    
        if (robot.bin() != null)
          out.write(" bin=" + robot.bin());
    
        //...
      }
    }
    
  2. Checking a value returned to me from code I do not own
    For example, in Rails projects we often see code of this kind:

    if User.exists?(:user_id => current_user.id)
      # ...
    else
      # ...
    end
    
  3. Checking a parameter passed to me by code I own
    Whenever I write a method that accepts a Boolean parameter, I open myself up to the Control Couple smell. Here’s an example from Piotr Solnica’s blog:

    def say(sentence, loud = false)
      if loud
        puts sentence.upcase
      else
        puts sentence
      end
    end
    
  4. Checking a parameter passed to me by code I do not own
    Here’s an example from the jQuery documentation:

    <script>
    var xTriggered = 0;
    $( "#target" ).keypress(function( event ) {
      if ( event.which == 13 ) {
         event.preventDefault();
      }
      xTriggered++;
      var msg = "Handler for .keypress() called " + xTriggered + " time(s).";
      $.print( msg, "html" );
      $.print( event );
    });
    </script>
    
  5. Checking my own state or attributes
    Here’s an (intentionally bad) example from Martin Fowler’s Refactoring:

    class Employee {
      private int _type;
      static final int ENGINEER = 0;
      static final int SALESMAN = 1;
      static final int MANAGER = 2;
    
      Employee (int type) {
        _type = type;
      }
    
      int payAmount() {
        switch (_type) {
          case ENGINEER:
            return _monthlySalary;
          case SALESMAN:
            return _monthlySalary + _commission;
          case MANAGER:
            return _monthlySalary + _bonus;
          default:
            throw new RuntimeException("Incorrect Employee");
        }
      }
    }
    
  6. Checking a value I set previously in this method
    For example:

    public String format(List<String> items) {
      StringBuffer s = new StringBuffer();
      boolean first = true;
      for (String item : items) {
        if (first)
          first = false;
        else
          s.append(", ");
        s.append(item);
      }
      return s.toString();
    }
    
  7. Checking the type of an exception thrown by code I own
    Here’s an example from the PHP language documentation:

    <?php
    function inverse($x) {
        if (!$x) {
            throw new Exception('Division by zero.');
        }
        return 1/$x;
    }
    
    try {
        echo inverse(5) . "\n";
        echo inverse(0) . "\n";
    } catch (Exception $e) {
        echo 'Caught exception: ',  $e->getMessage(), "\n";
    }
    
    // Continue execution
    echo "Hello World\n";
    ?>
    
  8. Checking the type of an exception thrown by code I do not own
    We often find code such as this in Rails controllers:

    def delete
      schedule_id = params[:scheduleId]
      begin
        Schedules.delete(schedule_id)
      rescue ActiveRecord::RecordNotFound
        render :json => "record not found"
      rescue ActiveRecord::ActiveRecordError
        # handle other ActiveRecord errors
      rescue # StandardError
        # handle most other errors
      rescue Exception
        # handle everything else
      end
      render :json => "ok"
    end
    

Is this list anything like complete? Can it ever be?

I wonder if a useful next step might be to write down some general strategies for removing each of these kinds of conditonal…?

Conditionals on the edge

As you know, I have a thing about conditional code. Most conditionals are duplicates, and the only “genuine” conditionals are at the system boundaries, where they test external state and input information. But I discovered recently that, even at the edges, not every conditional is necessary…

Here’s (a drastically simplified version of) some code I wrote a while back. (It’s from a Rails app, but the message in this post applies to any technology and any kind of system input.)

class Users::RegistrationsController < Devise::RegistrationsController
  def create
    user = create_new_user_account
    if params[:invitation_token]
      sign_in_as(user)
      redirect_to Invitation.find(params[:invitation_token]).shared_content
    else
      redirect_to new_user_session_path
    end
  end
end

This route handles incoming user registrations, and it has a conditional branch to cater for the case in which the new user was invited, via email, by an existing user. If the new user arrived by clicking a link in an email we do one set of things; whereas we do a different set of things if the user got here via the ‘Register now’ button on the home page. Perfectly reasonable, I thought. Sure, it’s conditional code; but it’s sitting on the system boundary and therefore it’s fine, yes? No.

This conditional check represents duplication. Somewhere else in my application already knew, at some time in the recent past, which route should be taken through this method when it was finally invoked. When the app sent that email invitation, it was in a sense delegating to the recipient the task of supplying user account details. The conditional branch that handles the invitation is a callback from across a technology boundary, and the inviting code already knew to expect it.

The duplication also has a nasty side-effect: high complexity. This method is an order of magnitude harder to understand and to test(-drive) because of the conditional construct. (And you’re reading a version from which I’ve elided all manner of error handling.) Oh, and by forcing two routes into the same method I also made it violate the Single Responsibility Principle.

Happily the duplication is really easy to remove. The code that sends the email invitation knows which path it wants the registration code to take, so I just need to create a new input route for “invitation acceptance” and change the email invitation to link to it. As a result I’ll have two controller methods, each with simpler logic than their predecessor:

class Users::RegistrationsController < Devise::RegistrationsController
  def create
    user = create_new_user_account
    redirect_to new_user_session_path
  end
end

 

class InvitationAcceptancesController < Devise::RegistrationsController
  def create
    user = create_new_user_account
    sign_in_as(user)
    redirect_to Invitation.find(params[:invitation_token]).shared_content
    end
  end
end

So I have now removed the duplication, and the controller code is simpler as a result. These controller methods are also much easier to test(-drive) than the earlier code. And as a final bonus, I can design the user interface to this route with more knowledge, maybe adding some details of the invitation so that the incoming potential user feels a little more at home. In short, now that each code path has its own method (and class) I can pay it more attention, and it will flourish as a result.

Now, having refactored this code, I take away a few points for the future:

  • Where appropriate and possible, model system outputs as delegations into a space that your code can’t reach or control; and model the resulting future input as a response.
  • Where a system input is due to the system having earlier delegated a responsibility across its boundary, make sure the return route is differentiated from all others.
  • Differentiated incoming routes have code that is easier to read and simpler to test.
  • When an input route has a single use, the user experience can be tailored and enriched.
  • Not every conditional at the system boundary tests external state.

the anti-if campaign

I’ve just signed up to the Anti-If Campaign (link via Dave Nicolette). This is a movement to promote the use of strong object-oriented software design principles, by asking us to think twice before we allow any conditional branches into our code. This isn’t dogma, just a hook that can help direct our refactoring.

Long-time readers of this blog will know that I believe most conditionals to be duplication — see the following old posts for examples of my arguments here:

I’ve placed the Anti-If Campaign’s badge on my sidebar to show that I look hard at every conditional, and I invite you to do the same!

if…

A couple of years ago I was working on a project in which a significant portion (say 30%) of the codebase was legacy code. By which I mean that it was written by a chap who had just left the team. He was a ‘clever’ programmer – his code did the job, and the rest of us were afraid to touch it. The coding style (this was a Java project) was intensely algorithmic. It involved a small number of huge classes, each method of which took parameters to configure their behaviour. No-one could touch it because the inter-relationships between the (large) pieces were too opaque, and none of us was brave enough to even begin the process of refactoring towards a more ‘object-oriented’ solution. (One thing that would now ring alarm bells with me: the code failed Mike Feathers’ test, in that it was well-nigh impossible to write unit tests for any piece of the code. In fact, most of the classes couldn’t be instantiated outside of this sub-system’s highly complex nutrient soup.)
Continue reading

avoid boolean parameters

You are tempted to create a method with a boolean argument. The caller will set the argument to true or false depending on which code path he wishes the called method to take. This is an extremely convenient way to add a small degree of configurability to an algorithm, and means that both paths through the called method can share setup and cleanup code.

class MyClass

    void dosomething(boolean followPathA) {
        // setup
        if (followPathA)
            // pathA
        else
            // pathB
        // cleanup
    }

However, the use of a boolean argument reduces readability. For example, in a method such as
workspace.createFile(name, true), what does “false” mean? It is much more intention-revealing to call a method such as workspace.createTempFile(name). Furthermore, boolean arguments always create duplication, either of code or of responsibility, and often of a very insidious nature. This is because the called method contains a runtime test to decide between two different code paths, despite thet fact that the calling method already knows which branch it requires! The duplication not only wastes a little runtime, it also spreads knowledge and responsibility further than is necessary. The resulting code is more closely coupled than the design it represents. These effects are magnified exponentially each time the boolean argument is passed on in a further method call.

Therefore avoid the need for a boolean argument by creating two versions of the method, in which the meaning of the boolean is reflected in the names of the new methods. For example, replace

    File createFile(String name, boolean istemp)

by

    File createFile(String name)
    File createTempFile(String name)

If the resulting new methods have similar logic, extract the duplicate code into shared methods that can be called by each.

(Here’s a fun story on a related theme.)