nosewheelie

Technology, mountain biking, politics & music.


Simplifying JSON Parsing Using FunctionalKit


Introduction

At MoGeneration we write a lot of iPhone clients that integrate with back end web services. Thankfully, most of these expose their data as JSON, which is quite easy to parse into a corresponding NSDictionary using json-framework. However, dealing with errors and non-existent values, and then turning the parsed NSDictionary into domain model instances, can be tricky. It often involves writing a lot of repetitive code and manual error handling. We can, however, do a lot better.

This post will show you the basics of improving the way JSON is parsed by using some simple techniques from FunctionalKit. There’s lots of in-depth explanation going on here; look to the code samples if you want a quick summary.

JSON

So to begin with, here’s the JSON. It comes from a current project for the guys over at Perkler. This is the result of looking up the current user’s likes (stuff they’re interested in knowing about):

{
  "meta":{
    "action":"getLikes",
    "output":"json",
    "search":false,
    "location": {
      "geo_id":"1999",
      "geo_location":"Victoria, Australia",
      "geo_latitude":"-36.558800",
      "geo_longitude":"145.468994",
      "geo_altitude":"0.000000",
      "geo_country_code":"AU",
      "geo_administrative_area":"Victoria",
      "geo_locality":"",
      "geo_thoroughfare":"",
      "geo_postalcode":"",
      "geo_accuracy":"2",
      "score":"0"
    }
  },
  "likes":["bikes","coffee","girls","haskell"]
}

For this post we’ll ignore the metadata; we’re mainly concerned with the likes array.

Ground Rules

Let’s begin by setting up some ground rules. There’s a bunch of error conditions we need to handle:

  • We don’t know if the results are valid JSON, our parser returns nil on error;
  • The likes array may not be there;
  • The likes array may be nil;
  • The likes array may be empty;

There are other issues such as the underlying HTTP transport failing, but we’re a little further along the chain here, so we’re not worrying about that (though, these techniques work just as well there).

Baseline

Back in the bad old days we’d do a bunch of nil checks, and only proceed if things weren’t nil. We also return nil because people in Objective-C like that sort of thing.

Here’s the top level code that takes a string, parses it and invokes our parser to turn it into an array of PLLikes.

- (NSArray *)getLikes:(NSString *)jsonEncodedResults {
    NSDictionary *jsonDecodedResults = [jsonEncodedResults JSONValue];
    // Hand the decoded dictionary (nil on a parse error) to the parser and return its result.
    return [likesParser parseGetLikesResults:jsonDecodedResults];
}

And here’s our parser implementation, with its myriad nil checks. We’re letting it do all the checking (including handling the nil that JSONValue returns on a parse error).

@implementation GetLikesParser
 
- (NSArray *)parseGetLikesResults:(NSDictionary *)results {
    if (results == nil) {
        return nil;
    } else {
        NSArray *likes = [results objectForKey:@"likes"];
        if (likes == nil) {
            return nil;
        } else {
            NSMutableArray *convertedlikes = 
                  [NSMutableArray arrayWithCapacity:[likes count]];
            for (NSString *like in likes) {
                [convertedlikes addObject:[PLLike value:like]];
            }
            return convertedlikes;
        }
    }
}
 
@end

Isn’t that nice!

Introducing Option

We’ve established our baseline, now what’s wrong with this & how can we make it better?

To start with, we have a bunch of nil checks and the code doing the work is buried under a bunch of layers, obscuring what it actually does. Let’s start to make it better by introducing Option. Option is a way of denoting that we either have a value, or, we don’t, in other words an optional value. If we have a value we say we have a Some with the value in it, and if we have nothing, we have a None. Simple hey? These are represented in FunctionalKit using FKOption.

Let’s rewrite the above example to use Option. To make it easy to follow, we’ll do a direct translation.

@implementation GetLikesParser
 
- (FKOption *)parseGetLikesResults:(NSDictionary *)results {
    FKOption *maybeResults = [FKOption fromNil:results];
    if (maybeResults.isNone) {
        return [FKOption none];
    } else {
        FKOption *maybeLikes = [FKOption fromNil:[maybeResults.some objectForKey:@"likes"]];
        if (maybeLikes.isNone) {
            return [FKOption none];
        } else {
            NSMutableArray *convertedlikes = 
                  [NSMutableArray arrayWithCapacity:[maybeLikes.some count]];
            for (NSString *like in maybeLikes.some) {
                [convertedlikes addObject:[PLLike value:like]];
            }
            return [FKOption some:convertedlikes];
        }
    }
}
 
@end

Let’s have some commentary on the above example.

To create an option from a potentially nil value, we use the constructor fromNil:. If the value is nil we’ll get None otherwise we’ll get a Some containing the value.

We prefix the variable name with “maybe”. This is not required; it’s just something that I like to do, as it denotes that the value “may be” the thing we want, or it may not be.

To pull the value out of an optional value, we call the some property (or message it if you like). To construct a Some with a value we know is non-nil, we call the some: constructor.

We’re also returning an optional value, so the code in our top level method needs to handle this too (we’ve not shown that here). Notice also that, as Objective-C doesn’t support parametric polymorphism, we’ve lost some degree of compiler safety: we no longer know at compile time what the FKOption holds. It’s really an FKOption[NSArray[PLLike]], but we can’t enforce that.

But is this really any better? As we’ve done a literal translation, we still have a bunch of checks that really aren’t much better than what we had. They’re nil checks in a different form.

Don’t fret, we can do better.

Staying inside the Option

What all these nil check equivalents (option.isNone) are really doing is just making sure we only continue executing while we have a non-nil value, or a Some in this case. What we want is: If we have a None, return that, however if we have a Some do something with the value in that Some. We can apply this rule at each level of our checking.

At this point, we’ll also pull the function out that does the work. This will make it simpler to figure out the actual core of our problem (that our checks were obscuring) as well as giving us a nice hook to create a function from.

Let’s have a crack at that code.

@implementation GetLikesParser
 
- (FKOption *)parseGetLikesResults:(NSDictionary *)results {
    FKOption *maybeResults = [FKOption fromNil:results];
    FKOption *maybeLikes = [maybeResults bind:functionTS(self, pullOutLikes:)];
    return [maybeLikes map:functionTS(self, parseLikes:)];
}
 
- (FKOption *)pullOutLikes:(NSDictionary *)results {
    return [FKOption fromNil:[results objectForKey:@"likes"]];
}
 
- (NSArray *)parseLikes:(NSArray *)likes {
    NSMutableArray *convertedlikes = [NSMutableArray arrayWithCapacity:[likes count]];
    for (NSString *like in likes) {
        [convertedlikes addObject:[PLLike value:like]];
    }
    return convertedlikes;
}
 
@end

How it all works

Before we discuss what this code actually does, let’s have a look at it more closely. Nowhere have we actually explicitly pulled the value out of the Option. We’ve left it inside the Option; it takes care of safely unpacking the value (if it’s there) and providing it to our functions!

So what is this magic map: function doing? If you’ve used Ruby or a functional language, you’ve probably used this before. Cocoa has a similar concept in NSArray’s -(void)makeObjectsPerformSelector:(SEL)aSelector. The important thing to note is that Option is a container class for other values, just like an array is. In fact, you can think of Option as an array that will either contain zero or one element, but no more.

Mapping across a container class is the same as iterating over it using a conventional for loop, with the benefit that the container class takes care of the iteration, as calling code we’re never exposed to it.

Here’s how map: works. For each time around the loop, the element at that point (at that index if you like) is pulled out of the container and provided to a function. The function transforms the element from its current value into another one. Take as an example an array of numbers. We could map over this array turning each number into a string. The result of mapping across this array of numbers is an array of strings. Notice also that because we’re just iterating over the container, if the container is empty we don’t start the iteration so never invoke the function (this is important!).

To abstract this a little, you start with a container of some type, and end up with a container of another type. An important concept to note is that the container type never changes; start with an array, end up with an array.

If you’re up for a little bit more of an interlude, the function that map: takes has a type as follows: f :: a ➝ b. This means that it will take something of type a and return something of type b, where a and b can be any types at runtime, for example a function that converts a number to a string. So using our abstraction from above, if we had a container of type C (say an NSArray) that contained as (say NSNumbers) and a function that turned as into bs (NSNumbers to NSStrings), then we can get a C containing bs. Looking at the types again, map: looks like this: map :: C a ➝ (a ➝ b) ➝ C b. Phew…
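To make those types a little more concrete, here is a minimal sketch in Java (the language used later on this page) rather than Objective-C. It is only an illustration of map :: C a ➝ (a ➝ b) ➝ C b with C fixed to List; the class and method names (MapSketch, map, numbersToStrings) are made up for this example and are not part of FunctionalKit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class MapSketch {
    // map :: C a -> (a -> b) -> C b, with the container C fixed to List.
    static <A, B> List<B> map(List<A> as, Function<A, B> f) {
        List<B> bs = new ArrayList<>(as.size());
        for (A a : as) {
            bs.add(f.apply(a)); // runs once per element, so never for an empty list
        }
        return bs;
    }

    // The example from the text: a list of numbers becomes a list of strings.
    static List<String> numbersToStrings(List<Integer> numbers) {
        return map(numbers, n -> Integer.toString(n));
    }
}
```

The caller never sees the loop, and the container type going in (a List) is the container type coming out.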

Let’s get back to our Option again.

If we consider Option to be a container class (which it is) similar to an array, then if we map over an empty Option – a None – then all we get back is an empty Option – a None. However if we map over a non-empty Option – a Some – then we get back a non-empty Option – a Some. This is the magic behind how our code can deal with the presence or absence of nils; if we have nothing, the functions never get called, and we get back another (empty) Option.
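The same behaviour can be sketched in Java, using java.util.Optional as a stand-in for FKOption (an assumption made purely for illustration; the two are different libraries). The class name OptionMapSketch and the invocations counter are hypothetical and exist only to show that the function is never called for an empty option.

```java
import java.util.Optional;

final class OptionMapSketch {
    // Counts how many times the mapping function actually ran.
    static int invocations = 0;

    // Maps over the option: a Some yields a Some, a None yields a None.
    static Optional<Integer> lengthOf(Optional<String> maybeText) {
        return maybeText.map(text -> {
            invocations++;
            return text.length();
        });
    }
}
```

Calling lengthOf with a present value runs the function once and wraps the result; calling it with an empty value returns an empty value and leaves the counter untouched.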

This style of mapping can be chained together for as many options as we like, if we don’t have anything nothing happens, if we do, we process it and keep going. If at any point in the chain we have a nil the chain effectively stops processing. However we only write one set of code to do this, so we’re basically pretending that errors don’t happen, but if they do, they’re handled effectively.

You will also have noticed the other magic function we’re using, bind:. I’ve not talked about it in detail as it is very similar to map:. The only difference being that whereas map: takes a function from a ➝ b, bind: takes a function from a ➝ C b and produces a flattened C b, that is: bind :: C a ➝ (a ➝ C b) ➝ C b. This allows us to safely handle a potential nil coming out of an NSDictionary (for the @"likes" key) and still chain together our processing.
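As a rough Java analogue of bind:, java.util.Optional calls the same operation flatMap. The sketch below assumes a plain Map in place of the NSDictionary and uses hypothetical names (BindSketch, maybeGet, likesFrom); it is not how FunctionalKit implements it, just the same shape.

```java
import java.util.Map;
import java.util.Optional;

final class BindSketch {
    // a -> C b: the lookup itself may come back empty.
    static Optional<Object> maybeGet(Map<String, Object> dictionary, String key) {
        return Optional.ofNullable(dictionary.get(key));
    }

    // bind :: C a -> (a -> C b) -> C b. Using map here would give a nested
    // Optional<Optional<Object>>; flatMap flattens it to a single Optional.
    static Optional<Object> likesFrom(Map<String, Object> results) {
        return Optional.ofNullable(results).flatMap(r -> maybeGet(r, "likes"));
    }
}
```

A null dictionary and a missing key both come out as the same empty value, so the caller has one case to handle instead of two.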

Back to the code

So now we've had that nice little chat, what is the code doing?

We have three basic blocks of work going on: 1) mapping over options, 2) pulling out likes from an NSDictionary and 3) turning string likes into PLLikes. We've pulled these last two blocks of work out into methods by themselves. We might normally do this to clean up the code when refactoring, but in this case it allows us to call the methods as the functions passed to a map:. Remember the interlude above?

FunctionalKit provides a number of ways to create functions. functionTS is a macro that constructs a new FKFunction from a target object and a selector (there are longhand, non-macro ways to do this also). Sending a selector to a target object is fairly routine in Objective-C; all FKFunction does is wrap the target and selector up into a convenient package and allow us to pass it into map:. Once Objective-C gets closures the need to pull out these methods will be removed, eliminating the need to use these macros in maps, binds, etc.

Simplifying

Now that we know what mapping is, that last function is crying out to be a map! Let's see this in action.

@implementation GetLikesParser
 
- (FKOption *)parseGetLikesResults:(NSDictionary *)results {
    FKOption *maybeResults = [FKOption fromNil:results];
    FKOption *maybeLikes = [maybeResults bind:functionTS(self, pullOutLikes:)];
    return [maybeLikes map:functionTS(self, parseLikes:)];
}
 
- (FKOption *)pullOutLikes:(NSDictionary *)results {
    return [FKOption fromNil:[results objectForKey:@"likes"]];
}
 
- (NSArray *)parseLikes:(NSArray *)likes {
    return [likes map:functionTS(self, parseLike:)];
}
 
- (PLLike *)parseLike:(NSString *)like {
    return [PLLike value:like];
}
 
@end

Woah... that's lots of little functions doing not much (wouldn't you love a closure?). Luckily we can go even further as we can use the function macros on class methods, allowing us to remove that last function.

@implementation GetLikesParser
 
- (FKOption *)parseGetLikesResults:(NSDictionary *)results {
    FKOption *maybeResults = [FKOption fromNil:results];
    FKOption *maybeLikes = [maybeResults bind:functionTS(self, pullOutLikes:)];
    return [maybeLikes map:functionTS(self, parseLikes:)];
}
 
- (FKOption *)pullOutLikes:(NSDictionary *)results {
    return [FKOption fromNil:[results objectForKey:@"likes"]];
}
 
- (NSArray *)parseLikes:(NSArray *)likes {
    return [likes map:functionTS([PLLike classAsId], value:)];
}
 
@end

Looking better, but there's another function we can remove; all pullOutLikes: does is provide a nil-safe accessor for our dictionary. FunctionalKit provides a nil-safe extension on NSDictionary: -(FKOption*)maybeObjectForKey:(id)key. Let's try that, shall we?

@implementation GetLikesParser
 
- (FKOption *)parseGetLikesResults:(NSDictionary *)results {
    FKOption *maybeResults = [FKOption fromNil:results];
    FKOption *maybeLikes = [maybeResults bind:functionTS(self, pullOutLikes:)];
    return [maybeLikes map:functionTS(self, parseLikes:)];
}
 
- (FKOption *)pullOutLikes:(NSDictionary *)results {
    return [results maybeObjectForKey:@"likes"];
}
 
- (NSArray *)parseLikes:(NSArray *)likes {
    return [likes map:functionTS([PLLike classAsId], value:)];
}
 
@end

OK, that's a little nicer, but we've still got a function that does not much at all. Let's try another macro, functionSA, which creates a function from a selector and its argument. In our example we can pass it directly to our option's bind: method.

@implementation GetLikesParser
 
- (FKOption *)parseGetLikesResults:(NSDictionary *)results {
    FKOption *maybeResults = [FKOption fromNil:results];
    FKOption *maybeLikes = [maybeResults bind:functionSA(maybeObjectForKey:, @"likes")];
    return [maybeLikes map:functionTS(self, parseLikes:)];
}
 
- (NSArray *)parseLikes:(NSArray *)likes {
    return [likes map:functionTS([PLLike classAsId], value:)];
}
 
@end

All right, we're getting close.

Some heavy lifting

So our code is now nil-safe, but it's still a little verbose for our liking. Can we remove any more of these little functions we've created? Turns out we can.

We have a function already to turn an NSString representation of a like into a PLLike representation; functionTS([PLLike classAsId], value:). We see it in use when mapping over our array of likes. Wouldn't it be nice to have a way to turn that function on an individual like into a function on an array of likes? Thankfully such a thing exists. The process of taking a function on a single element of a container class and being able to apply it to the container class itself is called lifting a function.
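Lifting is easy to sketch in Java. The example below is a hand-rolled illustration, not FunctionalKit's liftFunction:; the names LiftSketch, lift and parseLikes are invented here, and Like is a hypothetical stand-in for PLLike.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class LiftSketch {
    // Hypothetical stand-in for the post's PLLike.
    static final class Like {
        final String value;
        Like(String value) { this.value = value; }
    }

    // Lifts f :: a -> b into a function List a -> List b.
    static <A, B> Function<List<A>, List<B>> lift(Function<A, B> f) {
        return as -> {
            List<B> bs = new ArrayList<>(as.size());
            for (A a : as) {
                bs.add(f.apply(a));
            }
            return bs;
        };
    }

    // The single-element parser (Like::new) becomes a whole-list parser.
    static List<Like> parseLikes(List<String> likes) {
        return LiftSketch.<String, Like>lift(Like::new).apply(likes);
    }
}
```

The point is that the per-element function is written once and the list version is derived from it, so there is no separate looping method to maintain.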

Let's rewrite the example using a lift to turn our parsing function on a single like into a parsing function on an array of likes.

@implementation GetLikesParser
 
- (FKOption *)parseGetLikesResults:(NSDictionary *)results {
    FKOption *maybeResults = [FKOption fromNil:results];
    FKOption *maybeLikes = [maybeResults bind:functionSA(maybeObjectForKey:, @"likes")];
    return [maybeLikes map:[NSArray liftFunction:functionTS([PLLike classAsId], value:)]];
}
 
@end

We've now been able to remove yet another little function clouding up our code. Objective-C noise notwithstanding, our code is closer to the core of what we're trying to achieve: if we have results, pull the likes out; if we have likes, turn each one into a PLLike.

Let's go one step further and inline the maybeResults variable.

@implementation GetLikesParser
 
- (FKOption *)parseGetLikesResults:(NSDictionary *)results {
    FKOption *maybeLikes = [[FKOption fromNil:results] bind:functionSA(maybeObjectForKey:, @"likes")];
    return [maybeLikes map:[NSArray liftFunction:functionTS([PLLike classAsId], value:)]];
}
 
@end

Compare this completed version to the one we started with; it's one hell of a lot nicer, isn't it? Which one would you prefer?

Conclusion

The process that we've followed may seem a little convoluted and foreign, but we've really just applied simple rules at each step of the way. Once you get used to functional techniques these patterns become easier to spot and their application easier to handle. Granted, understanding this kind of code does take some time, but the benefits of doing so are massive: as we've seen, code melts away, and what's left is clearer and has fewer places for bugs to hide.

Another major benefit is that each of these little chunks of logic can be viewed in complete isolation from the others, allowing you to easily reason about the behaviour of each one, and then about the whole. The code is now also closer to the actual semantics of what we're trying to achieve; we have no for loop boilerplate clouding it.

If you're developing iPhone apps, or even for Mac OS X, give FunctionalKit a go, and get in touch if you're interested in contributing.

Written by Tom Adams

April 1st, 2009 at 2:18 pm

Error handling with Either, or, Why Either rocks!


Instinct, like most xUnit-like frameworks, provides the ability to run methods and have the status of those methods reported. In xUnit frameworks these methods are called tests; in Instinct they’re called specifications.

Specifications are ordinary instance methods that are marked in a way (naming convention or annotation) that tells Instinct to run them. Each specification has a lifecycle associated with it, where both the creator of the method (the developer specifying code) and the framework itself perform pre- and post-specification steps (Instinct tries to take away and simplify a lot of the drudgery involved in traditional testing).

Specifications have the following lifecycle (the default implementation can be overridden):

  1. The context class for the specification is created.
  2. The mockery is reset, all mocks now contain no expectations.
  3. Specification actors are auto-wired.
  4. Before specification methods are run.
  5. The specification is run.
  6. After specification methods are run.
  7. Mock expectations are verified.

Any step of this lifecycle can throw exceptions causing the specification to fail. For example a before specification method may throw a NullPointerException, or a specification may pass while a mock used in it may not have an expectation met.

The framework needs flexibility in choosing which parts of the lifecycle to run, which parts are important when executing the specification, what failures constitute stopping the run of a specification, etc.

Here’s the code we’re starting with:

private SpecificationResult runSpecification(final SpecificationMethod specificationMethod) {
  final long startTime = clock.getCurrentTime();
  try {
    final Class<?> contextClass = specificationMethod.getContextClass();
    final Object instance = invokeConstructor(contextClass);
    runSpecificationLifecycle(instance, specificationMethod);
    return createSpecResult(specificationMethod, SPECIFICATION_SUCCESS, startTime);
  } catch (Throwable exceptionThrown) {
    final SpecificationRunStatus status = new SpecificationRunFailureStatus(exceptionSanitiser.sanitise(exceptionThrown));
    return createSpecResult(specificationMethod, status, startTime);
  }
}

private void runSpecificationLifecycle(final Object contextInstance, final SpecificationMethod specificationMethod) {
  Mocker.reset();
  actorAutoWirer.autoWireFields(contextInstance);
  try {
    runMethods(contextInstance, specificationMethod.getBeforeSpecificationMethods());
    runSpecificationMethod(contextInstance, specificationMethod);
  } finally {
    try {
      runMethods(contextInstance, specificationMethod.getAfterSpecificationMethods());
    } finally {
      Mocker.verify();
    }
  }
}

This implementation of the specification runner is overly simplistic. It runs everything within a large try-catch block, which means there’s no way to tell which part of the specification failed (before, spec, after, etc.). It also cannot collect up errors, so if an error occurs in a specification and a mock fails to verify, only the verification error is propagated. These are currently two of the highest priority user-reported issues on Instinct.

Here’s my first attempt at isolating which part of the specification failed. Each of the constants passed to fail defines the location of the failure.

public SpecificationResult run(final SpecificationMethod specificationMethod) {
  try {
    final Class<?> contextClass = specificationMethod.getContextClass();
    final Object instance = invokeConstructor(contextClass);
    Mocker.reset();
    try {
      actorAutoWirer.autoWireFields(instance);
      try {
        runMethods(instance, specificationMethod.getBeforeSpecificationMethods());
        try {
          runSpecificationMethod(instance, specificationMethod);
          return result(specificationMethod, SPECIFICATION_SUCCESS);
        } catch (Throwable t) {
          return fail(specificationMethod, t, SPECIFICATION);
        } finally {
          try {
            try {
              runMethods(instance, specificationMethod.getAfterSpecificationMethods());
            } catch (Throwable t) {
              return fail(specificationMethod, t, AFTER_SPECIFICATION);
            }
          } finally {
            try {
              Mocker.verify();
            } catch (Throwable t) {
              return fail(specificationMethod, t, MOCK_VERIFICATION);
            }
          }
        }
      } catch (Throwable t) {
        return fail(specificationMethod, t, BEFORE_SPECIFICATION);
      }
    } catch (Throwable t) {
      return fail(specificationMethod, t, AUTO_WIRING);
    }
  } catch (Throwable t) {
      return fail(specificationMethod, t, CLASS_INITIALISATION);
  }
}

Obviously this is very ugly, and it’s also hard to reason about. But as we now have the location of the failure, we can make decisions as to whether we fail the specification or not, so we’ve solved our first issue. We haven’t made our second task any easier, though: we aren’t generally able to keep processing (we still validate mocks in the above code upon specification failure) and we don’t collect all the errors that occur.

And at about this time enters Either (in Scala):

The Either type represents a value of one of two possible types (a disjoint union). The data constructors; Left and Right represent the two possible values. The Either type is often used as an alternative to Option where Left represents failure (by convention) and Right is akin to Some.

Either can be used in place of conventional exception handling in Java, or, to wrap APIs that use conventional exception handling (a more thorough treatment of this issue is given in Lazy Error Handling in Java, Part 3: Throwing Away Throws). Here’s an example of the latter, using both Either and Option (discussed later).

public Either<Throwable, List<Field>> wireActors(final Object contextInstance) {
  try {
    return right(actorAutoWirer.autoWireFields(contextInstance));
  } catch (Throwable t) {
    return left(t);
  }
}

...

public Option<Throwable> verifyMocks() {
  try {
    Mocker.verify();
    return none();
  } catch (Throwable t) {
    return some(t);
  }
}

At a high level, the good thing about using Either is that your methods no longer lie. They don’t declare that they’ll return an Int and then, maybe, throw an exception instead; they come right out and say it: I’ll return either an exception or an Int. This is akin to conventional checked exceptions in Java (which Scala does away with), where a checked exception is used to represent a recoverable failure (enforced by the compiler) and an unchecked exception to represent an unrecoverable failure (not compiler enforced). Scala takes the correct approach here: it uses unchecked exceptions to represent the bottom value in non-terminating functions, and Either to represent recoverable failure.

Either is also much more flexible than exceptions: you can map across it, convert it into an option, add it to a container, and generally treat it like any other data structure [1].

So armed with this new knowledge, here’s the new specification lifecycle broken out from the runner itself (note that there are eleven steps in the lifecycle, including validation, but only these are exposed).
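To show what “just a data structure” means here, below is a minimal hand-rolled Either in Java. It is a sketch for illustration only: SimpleEither and its methods are invented names, and this is neither Scala’s nor Functional Java’s Either, both of which offer far more.

```java
import java.util.Optional;
import java.util.function.Function;

// A minimal Either: holds a left (failure) or a right (success), never both.
final class SimpleEither<L, R> {
    private final L left;
    private final R right;
    private final boolean isRight;

    private SimpleEither(L left, R right, boolean isRight) {
        this.left = left;
        this.right = right;
        this.isRight = isRight;
    }

    static <L, R> SimpleEither<L, R> left(L value) { return new SimpleEither<>(value, null, false); }
    static <L, R> SimpleEither<L, R> right(R value) { return new SimpleEither<>(null, value, true); }

    boolean isLeft() { return !isRight; }

    // Transforms the right value; a left passes through untouched.
    <T> SimpleEither<L, T> map(Function<R, T> f) {
        return isRight ? SimpleEither.<L, T>right(f.apply(right)) : SimpleEither.<L, T>left(left);
    }

    // Forgets the error, keeping only the success (assumes a non-null right value).
    Optional<R> toOptional() {
        return isRight ? Optional.of(right) : Optional.<R>empty();
    }

    // Wraps an exception-throwing API: the signature states both outcomes.
    static SimpleEither<Exception, Integer> parseInt(String text) {
        try {
            return right(Integer.parseInt(text));
        } catch (NumberFormatException e) {
            return left(e);
        }
    }
}
```

With this in place, SimpleEither.parseInt("42").map(n -> n + 1) carries on with the number, while the same chain on unparseable input simply carries the exception along on the left, with no try-catch at the call site.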

interface SpecificationLifecycle {
  <T> Either<Throwable, ContextClass> createContext(Class<T> contextClass);
  Option<Throwable> resetMockery();
  Either<Throwable, List<Field>> wireActors(Object contextInstance);
  Option<Throwable> runBeforeSpecificationMethods(
      Object contextInstance, List<LifecycleMethod> beforeSpecificationMethods);
  Option<Throwable> runSpecification(
      Object contextInstance, SpecificationMethod specificationMethod);
  Option<Throwable> runAfterSpecificationMethods(
      Object contextInstance, List<LifecycleMethod> afterSpecificationMethods);
  Option<Throwable> verifyMocks();
}

Now we need to make use of this in the specification runner, one step of which is determining the overall result, from the sequence of steps. Here’s my first attempt at this, using Functional Java’s Either to represent the result of each of the steps.

public <T extends Throwable> Either<List<T>, SpecificationResult> determineLifecycleResult(
    final Either<T, Unit> createContextResult,
    final Either<T, Unit> resetMockeryResult,
    final Either<T, Unit> wireActorsResult,
    final Either<T, Unit> runBeforeSpecificationMethodsResult,
    final Either<T, SpecificationResult> runSpecificationResult,
    final Either<T, Unit> runAfterSpecificationMethodsResult,
    final Either<T, Unit> verifyMocksResult) {
  List<T> errors = List.nil();
  if (createContextResult.isLeft()) {
    errors = errors.cons(createContextResult.left().value());
  }
  if (resetMockeryResult.isLeft()) {
    errors = errors.cons(resetMockeryResult.left().value());
  }
  if (wireActorsResult.isLeft()) {
    errors = errors.cons(wireActorsResult.left().value());
  }
  if (runBeforeSpecificationMethodsResult.isLeft()) {
    errors = errors.cons(runBeforeSpecificationMethodsResult.left().value());
  }
  if (runSpecificationResult.isLeft()) {
    errors = errors.cons(runSpecificationResult.left().value());
  }
  if (runAfterSpecificationMethodsResult.isLeft()) {
    errors = errors.cons(runAfterSpecificationMethodsResult.left().value());
  }
  if (verifyMocksResult.isLeft()) {
    errors = errors.cons(verifyMocksResult.left().value());
  }
  return errors.isNotEmpty() ? Either.<List<T>, SpecificationResult>left(errors)
      : Either.<List<T>, SpecificationResult>right(runSpecificationResult.right().value());
}

All those ifs are a bit ugly (what happens when we have more?), and we’ve got a mutable list. Surely we can do better? We’ve spotted a pattern here, and we could clean this up by folding across a list of results, pulling out the left of each Either. However, Either does this for us, using Either.lefts() (it performs the fold for you).

Here’s the next cut, making use of a list of results and Either.lefts():

public <T extends Throwable> Either<List<T>, Unit> determineLifecycleResult(
    final List<Either<T, Unit>> allResults, final Either<T, Unit> specificationResult) {
  final List<T> errors = lefts(allResults);
  return errors.isEmpty() ?
      Either.<List<T>, Unit>right(specificationResult.right().value()) :
      Either.<List<T>, Unit>left(errors);
}

So what’s this doing? It takes a list of results and goes through each of the lefts (the errors) returning them as a list. As Either is a disjunction (we’ll have an error or a result, but not both), if any of the results contain an error on the left, our list will be non-empty, meaning our specification failed to run. In this case we return the errors on the left. If we have no errors (i.e. the list is empty) we return the real result on the right.
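For the curious, the fold that lefts performs can be sketched by hand. The Java below is an illustration under stated assumptions: the nested Either is a tiny made-up stand-in (not Functional Java’s class), and it uses java.util.List rather than Functional Java’s List.

```java
import java.util.ArrayList;
import java.util.List;

final class LeftsSketch {
    // Minimal stand-in for an Either: exactly one side is meaningful.
    static final class Either<A, B> {
        final A left;   // set only for a left
        final B right;  // set only for a right
        final boolean isLeft;

        private Either(A left, B right, boolean isLeft) {
            this.left = left;
            this.right = right;
            this.isLeft = isLeft;
        }

        static <A, B> Either<A, B> left(A a) { return new Either<>(a, null, true); }
        static <A, B> Either<A, B> right(B b) { return new Either<>(null, b, false); }
    }

    // Walks the results, keeping only the left (error) values, in order.
    static <A, B> List<A> lefts(List<Either<A, B>> results) {
        List<A> errors = new ArrayList<>();
        for (Either<A, B> result : results) {
            if (result.isLeft) {
                errors.add(result.left);
            }
        }
        return errors;
    }
}
```

An empty list coming back means every step succeeded, which is exactly the check determineLifecycleResult makes.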

This code can be simplified further by using Option instead of Either. Option would allow us to place any exception into the some data constructor, while the Unit we’re placing into Either becomes the none (we’re used to thinking of void as nothing in Java anyway). The only hassle comes if we want to treat the Option as an Either (say in the lefts call above); in that case we’d need to lift the Option into an Either.

Option<Throwable> option = ...
Either<Throwable, Unit> either = option.toEither(unit()).swap();

Option also allows us to pull each some out of a list of Options, in a similar way to how we pulled the lefts out of a list of Eithers.

List<Option<Throwable>> results = ...
List<Throwable> errors = somes(results);
Option<Throwable> overall = errors.isEmpty() ?
    Option.<Throwable>none() :
    some((Throwable) new AggregatingException(errors));

Given that we’ve now decoupled the lifecycle from the runner and we now have a better way of handling errors, here’s the pattern of the new runner code:

private SpecificationResult runLifecycle(final long startTime,
    final SpecificationLifecycle lifecycle, final SpecificationMethod specificationMethod) {
  ...
  List<Option<Throwable>> lifecycleStepErrors = nil();
  final Either<Throwable, ContextClass> createContextResult =
      lifecycle.createContext(specificationMethod.getContextClass());
  lifecycleStepErrors = lifecycleStepErrors.cons(createContextResult.left().toOption());
  if (createContextResult.isLeft()) {
    return fail(...);
  } else {
    final ContextClass contextClass = createContextResult.right().value();
    ...
    lifecycleStepErrors = lifecycleStepErrors.cons(contextValidationResult.left().toOption());
    if (contextValidationResult.isSome()) {
      return fail(...);
    } else {
      ...
      lifecycleStepErrors = lifecycleStepErrors.cons(...left().toOption());
      if (...isSome()) {
        return fail(...);
      } else {
        ...
        if (...isSome()) {
          return fail(...);
        } else {
          ...
          if (...isSome()) {
            return fail(...);
          } else {
            ...
            if (...isSome()) {
              return fail(...);
            } else {
              ...
              return determineResult(..., lifecycleStepErrors);
            }
          }
        }
      }
    }
  }
}

See the pattern there? Let’s see it in slow motion. Assume each of the lifecycle results is called a, b, c, etc.

if (a.isLeft()) {
  return fail();
} else {
  if (b.isLeft()) {
    return fail();
  } else {
    if (c.isLeft()) {
      return fail();
    } else {
      ...
    }
  }
}

What we’re doing is binding through each lifecycle result: if we get an error, we fail fast; if we don’t, we execute the next step. There’s some other muck going on here too: we’re destructively updating the list of errors (lifecycleStepErrors), and the last few steps (run the specification, run after methods, verify mocks) are always executed, regardless of whether any fail. So how do we clean the code up? We anonymously bind through Either on the right, and sequence through the rest accumulating errors. What???

Here’s a simple example that contains eleven steps representative of running a specification. For the first eight (a through h), each step’s predecessor must succeed (i.e. we have at most one error). For the last three (i through k), we execute all of them regardless of whether they fail and accumulate the errors. We make use of the new Validation class in Functional Java (in version 2.9) to perform the last three steps (full source; this example has been further refined in the trunk).

class X {
  // The first sequence of steps...
  Either<Throwable, Unit> a;
  Either<Throwable, Unit> b;
  Either<Throwable, Unit> c;
  Either<Throwable, Unit> d;
  Either<Throwable, Unit> e;
  Either<Throwable, Unit> f;
  Either<Throwable, Unit> g;
  Either<Throwable, Unit> h;
  // The second sequence of steps...
  Either<Throwable, Unit> i;
  Either<Throwable, Unit> j;
  Either<Throwable, Unit> k;

  // Execute the first sequence of steps, fail on the first error.
  Either<Throwable, Unit> t1() {
    return a.right()
        .sequence(b).right()
        .sequence(c).right()
        .sequence(d).right()
        .sequence(e).right()
        .sequence(f).right()
        .sequence(g).right()
        .sequence(h);
  }

  // Execute the second sequence of steps, accumulate the errors.
  Option<NonEmptyList<Throwable>> t2() {
    return validation(t1()).nel().accumulate(
        Semigroup.<Throwable>nonEmptyListSemigroup(),
        Validation.<Throwable, Unit>validation(i).nel(),
        Validation.<Throwable, Unit>validation(j).nel(),
        Validation.<Throwable, Unit>validation(k).nel());
  }
}

Each of the fields in the above represents the result of executing a step in the specification lifecycle (including validation, which is beyond the SpecificationLifecycle itself); t1 represents the first eight steps and t2 the last three. t1 sequences through (anonymous bind) the result of each step, failing if any individual step fails. t2 executes [2] each step, continuing execution of the remaining steps if any step fails, and accumulates the errors.
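To make the anonymous bind concrete, here’s a minimal sketch in plain Java. It assumes a stripped-down, right-biased stand-in for Either; the Outcome and FailFastSketch names are hypothetical and invented for this illustration, they are not Functional Java’s API (whose Either is far richer).

```java
// Hypothetical, minimal stand-in for Either<Throwable, A>; for illustration only.
final class Outcome<A> {
    private final Throwable error; // non-null means "left" (failure)
    private final A value;         // meaningful only when error is null

    private Outcome(Throwable error, A value) {
        this.error = error;
        this.value = value;
    }

    static <A> Outcome<A> left(Throwable error) { return new Outcome<>(error, null); }
    static <A> Outcome<A> right(A value) { return new Outcome<>(null, value); }

    boolean isLeft() { return error != null; }
    Throwable error() { return error; }
    A value() { return value; }

    // Anonymous bind on the right: discard our own value, keep our error
    // if we have one, otherwise yield the next result unchanged.
    <B> Outcome<B> sequence(Outcome<B> next) {
        return isLeft() ? Outcome.<B>left(error) : next;
    }
}

final class FailFastSketch {
    // Equivalent to the nested if-else chain: the first failure wins.
    static Outcome<String> run(Outcome<String> a, Outcome<String> b, Outcome<String> c) {
        return a.sequence(b).sequence(c);
    }
}
```

As with the fields in X, every result here has already been computed by the time it is sequenced; the chain only decides which error (if any) is reported.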

Remember that this is what t1 looked like originally:

if (a.isLeft()) {
  return fail()
} else {
  if (b.isLeft()) {
    return fail()
  } else {
    if (c.isLeft()) {
      return fail()
    } else {
      ...
    }
  }
}

Some simpler examples may make the binding clearer; consider Scala’s Option (used here for brevity). We can bind through Option using orElse:

scala> Some(7).orElse(Some(8))
res0: Option[Int] = Some(7)

Here we execute Some(7); if that fails (i.e. returns None), we execute Some(8). As we see, the result is Some(7). Let’s take a failure case:

scala> None.orElse(Some(8))
res1: Option[Int] = Some(8)

We execute None; if that fails (i.e. returns None), which it does, we execute Some(8). As we see, the result is Some(8).
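For Java readers, roughly the same behaviour can be sketched with java.util.Optional, assuming Java 9 or later for Optional.or. OrElseSketch and firstOr are hypothetical names used only for this illustration. Note the direction: like orElse, this moves on to the next value when the first one is empty, whereas t1 moves on to the next step when the previous one succeeded.

```java
import java.util.Optional;

final class OrElseSketch {
    // Returns the first option if it holds a value, otherwise the fallback.
    static Optional<Integer> firstOr(Optional<Integer> first, Optional<Integer> fallback) {
        return first.or(() -> fallback);
    }
}
```

Calling firstOr(Optional.of(7), Optional.of(8)) yields an Optional holding 7, and firstOr(Optional.empty(), Optional.of(8)) yields one holding 8, mirroring the two Scala sessions.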

Taking it back to our simple Java example, we evaluate the result of step a [2]; if it fails, we return the failure, and if it succeeds, we evaluate step b, and so on. This is the same logic we saw in the nested if-else blocks earlier. If any of the first eight steps fail, we get back exactly one error (from t1); if any of the last three steps fail, we get back at most three errors (from t2).
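The accumulating half can be sketched in plain Java as well. This is not how Functional Java’s Validation is implemented; it is a simplified illustration that assumes each step is a Supplier returning an optional error, and AccumulateSketch and runAll are hypothetical names. Wrapping each step in a Supplier also defers its execution until it is asked for, which is the kind of emulated laziness footnote 2 alludes to.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

final class AccumulateSketch {
    // Runs every step, even after an earlier one has failed,
    // and collects the errors in the order the steps ran.
    @SafeVarargs
    static List<Throwable> runAll(Supplier<Optional<Throwable>>... steps) {
        List<Throwable> errors = new ArrayList<>();
        for (Supplier<Optional<Throwable>> step : steps) {
            step.get().ifPresent(errors::add);
        }
        return errors;
    }
}
```

With three steps the returned list has between zero and three entries, matching the "at most three errors" we expect from the last three lifecycle steps.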

If we apply this pattern to our specification runner code, we get the following:

private SpecificationResult runLifecycle(final long startTime, final SpecificationLifecycle lifecycle,
    final SpecificationMethod specificationMethod) {
  final Either<Throwable, ContextClass> createContext = lifecycle.createContext(specificationMethod.getContextClass());
  if (createContext.isLeft()) {
    return fail(startTime, specificationMethod, createContext.left().value(), false);
  } else {
    final ContextClass contextClass = createContext.right().value();
    final Either<Throwable, Unit> validation = validateSpecification(contextClass, specificationMethod);
    if (validation.isLeft()) {
      return fail(startTime, specificationMethod, validation.left().value(), false);
    } else {
      return runSpecification(startTime, lifecycle, contextClass, specificationMethod);
    }
  }
}

That looks a bit better, but where’s the complexity gone? OK, here it is…

private SpecificationResult runSpecification(final long startTime, final SpecificationLifecycle lifecycle, final ContextClass contextClass,
    final SpecificationMethod specificationMethod) {
  final Object contextInstance = constructorInvoker.invokeNullaryConstructor(contextClass.getType());
  final Validation<Throwable, Unit> preSpecificationSteps =
      validate(resetMocks().f(lifecycle)).sequence(validation(wireActors().f(lifecycle, contextInstance)))
          .sequence(validate(befores().f(lifecycle, contextInstance, contextClass.getBeforeSpecificationMethods())));
  if (preSpecificationSteps.isFail()) {
    return fail(startTime, specificationMethod, preSpecificationSteps.fail(), Option.<Throwable>none());
  } else {
    final Option<Throwable> specification = specification().f(lifecycle, contextInstance, specificationMethod);
    final Option<NonEmptyList<Throwable>> result = preSpecificationSteps.nel().accumulate(throwables(), validate(specification).nel(),
        validate(afters().f(lifecycle, contextInstance, contextClass.getAfterSpecificationMethods())).nel(),
        validate(verifyMocks().f(lifecycle)).nel());
    if (result.isSome()) {
      return fail(startTime, specificationMethod, result.some().toList(), specification);
    } else {
      return success(startTime, specificationMethod);
    }
  }
}

Here are the complete old and new versions of the code if you’re so inclined…

This code combined with the extracted lifecycle class is functionally equivalent to the first snippet of code I presented above. It may look verbose (it would be much simpler in Scala for example), but an interesting thing came out of it; it made explicit a bunch of places where I wasn’t handling exceptions correctly. It forced me to make a decision as to what to do in each case, so I got a much finer grained exception handling mechanism. Of course, I could get the same using try-catch (arguably more verbose), and I can choose to ignore left results (errors) if I want. The other thing it highlights is Java’s woeful generics implementation [3].
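As a rough illustration of what "making the decision explicit" means, here’s a sketch of turning a step that may throw into a value the caller has to inspect. It is not taken from the lifecycle code; CaptureSketch and attempt are hypothetical names, and java.util.Optional stands in for Functional Java’s Option.

```java
import java.util.Optional;

final class CaptureSketch {
    // Runs the step and returns whatever it threw as a value,
    // instead of letting the exception propagate to the caller.
    static Optional<Throwable> attempt(Runnable step) {
        try {
            step.run();
            return Optional.empty();
        } catch (Throwable t) { // deliberately broad, mirroring the Throwable used throughout
            return Optional.of(t);
        }
    }
}
```

The caller now receives an Optional<Throwable> and must choose whether to fail fast on it, accumulate it, or ignore it.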

When I started down this path to error handling I had two objectives (two reported issues to resolve): to allow the runner of a specification to know which parts failed (this gives the flexibility to allow before specs to error and not be reported as expected exceptions), and to return all the errors resulting from running a specification. I didn’t initially intend to go down this path; however, after talking in the office, I decided that there was a better way to handle this than nested try-catch blocks. The resulting code is smaller (even using Java), simpler and much more flexible than the traditional Java method of exception handling. A win all round. There are some downsides, however: firstly, the verbosity of Java’s typing leads to a mess of Either<Throwable, Unit>, and secondly, this method of error handling will be foreign to a lot of Java developers today.

Epilogue

The concept of sequencing while accumulating errors has been generalised in Functional Java (from version 2.9) as validation; here is an example of it in action. Scalaz contains a similar concept, though it uses applicative functors over higher kinds (something which Java’s type system does not support); here’s a small example of its use.

Footnotes

  1. There’s no reason why you couldn’t catch an exception and wrap it up in a list, or your own data structure instead of using Either, but most of the work is already done for us (i.e. useful functions across Either have been defined) and it’s a useful convention.
  2. Java is strict, so we don’t get the benefit of lazy evaluation in this case, but could emulate it with a function.
  3. The problems I’ve encountered mainly have to do with covariance in type parameters and differences between the way methods returning types are treated vs. local variables (see the bottom of StandardSpecificationLifecycle for details).

Written by Tom Adams

August 6th, 2008 at 2:27 pm