To Dream of Magick

Dreamer Shaper Seeker Maker

Exceptions and Errors, Part 1

Posted on Sat Jan 5 00:00:00 UTC 2013 by Savanni D'Gerinel

Today I bring to you part one of an infinite-part series on exception handling and error handling. This is, I feel, a subject that has not been well explained in any tutorial, and so I have spent a great deal of time bashing my head against the computer... almost to the point of giving up my career in software engineering completely.

Yes, it really is that nasty.

Anyway, I am going to use a rather non-intuitive distinction, as detailed on this page of the Haskell wiki:

  • error -- a situation that arises only as a result of a bug in the program and cannot be fixed except by fixing the program.
  • exception -- an uncommon situation but one which may occur as part of the normal running of a program and should be handled gracefully.

An error is something like "divide by zero" (sometimes), "array index out of bounds", "dereferencing a null pointer", or "assertion failed". An exception could be "parse error", "file not found", or "network not responding". Exceptions don't always have to be handled, as it would be totally acceptable for a short-running command line application to die due to "file not found", but the point is that they can be handled. That way "file not found" might kill a command-line application, but a GUI application can catch that exception and prompt the user.

Errors, on the other hand, should generally kill the application as fast as possible because they indicate that the application has reached an unexpected state and cannot be trusted even to clean itself up.

Now, that page also mentions that some Haskell texts, including the libraries, confuse these terms. So, as on that page, I use these distinctions whenever I am coding in Haskell:

  • Error -- triggered by a call to error, assert, or undefined
  • Exception -- triggered by a call to throwError or related functions. Really, anything called throw is throwing an Exception.
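As a rough illustration of the two categories, here is a small sketch. The names firstItem, loadSettings, and describeLoad are made up for this example, and it uses only base (ioError and tryIOError) rather than the throwError family that the rest of this post works with.

```haskell
import System.IO.Error (tryIOError)

-- An Error in this post's sense: reaching the first clause means the
-- caller has a bug, so the function bails out through `error`.
firstItem :: [a] -> a
firstItem []      = error "firstItem: called with an empty list"
firstItem (x : _) = x

-- An Exception: a failure that can happen in a correct program.
-- This hypothetical loader always fails, to keep the example small.
loadSettings :: FilePath -> IO String
loadSettings path = ioError (userError ("cannot load " ++ path))

-- The exception can be caught and turned into an ordinary value.
describeLoad :: FilePath -> IO String
describeLoad path = do
  result <- tryIOError (loadSettings path)
  return (either (\err -> "recovered from: " ++ show err) id result)
```

Calling describeLoad gives back a message instead of killing the program, while firstItem [] is the kind of failure that should be fixed in the code rather than caught.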

Stage 1 -- catching and handling an IO Exception

This part is both very basic and frequently ignored because it is so ill-defined. For today, we will simply catch and print out an IO exception, though more advanced work would have us responding to different exception types. But, I have not done that advanced work yet.

Consider the following:

res <- readFile "a.txt"

This function reads all of the contents of file "a.txt", if "a.txt" exists. If it does not exist, readFile instead throws an exception and res does not get a value. But that is okay because as written, all execution of your program stops.

Instead, however, let us print a bigger, louder, more obvious error message.

let block = "=================================================="
res <- catchError (readFile "a.txt") (\err -> return $ block ++ "\n" ++ (show err) ++ "\n" ++ block)
putStrLn res

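==================================================
a.txt: openFile: does not exist (No such file or directory)
==================================================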

catchError has this type:

catchError :: MonadError e m => m a -> (e -> m a) -> m a
catchError action handler = ...
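To make that general type a little more concrete: when m is IO and e is IOError, the shape becomes IO a -> (IOError -> IO a) -> IO a. The sketch below uses catchIOError from System.IO.Error in base, which has exactly that type, as a stand-in for catchError. The helper name readOrDefault is hypothetical.

```haskell
import System.IO.Error (catchIOError)

-- catchIOError :: IO a -> (IOError -> IO a) -> IO a
-- The action and the handler must produce the same result type,
-- so the handler here supplies a fallback String.
readOrDefault :: FilePath -> String -> IO String
readOrDefault path fallback =
  catchIOError (readFile path) (\_err -> return fallback)
```

The important constraint is that the handler has to return the same type as the action, which is what makes the "default string" case easy and everything else a bit harder.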

How to work with this is not entirely obvious. Sure, it is very easy if you just want a default string to return when the action fails, but if you want a more descriptive exception type then you have to work a little harder. As we will now do. Technically, this is going to look a little redundant, but it points the way.

res <- catchError (readFile "a.txt" >>= return . Right) (\err -> return $ Left $ (show err))
putStrLn $ either (\err -> "Exception detected: " ++ err) (\val -> val) res

Exception detected: a.txt: openFile: does not exist (No such file or directory)

Visually, this has gained us little. However, we have driven a wedge into the exception handling system and captured the exception. From here, we can slowly widen the gap and ultimately do anything to recover from the exception. For this block of code, catchError effectively has this specialized type:

catchError :: IO (Either String String) -> (IOError -> IO (Either String String)) -> IO (Either String String)

Now we make a tiny little tweak:

res <- catchError (readFile "a.txt" >>= return . Right) (return . Left)

And the type has changed again:

catchError :: IO (Either IOError String) -> (IOError -> IO (Either IOError String)) -> IO (Either IOError String)

res now contains an Either, and the Left form contains an IOError. You can begin using the IO classification functions to really figure out what kind of exception you are dealing with and recovering appropriately:

case res of
    Left ioErr -> if isDoesNotExistError ioErr
                    then putStrLn "It doesn't exist"
                    else if isAlreadyInUseError ioErr
                        then ...

Of course, that is a clumsy way of handling it. We will look at a more elegant way in the future after I've had to develop it myself.
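Until then, here is a self-contained sketch of the classification step. It assumes tryIOError from System.IO.Error (which yields the same Either IOError value as the catchError trick above), and the names describeIOError and summarize are hypothetical. The messages are placeholders.

```haskell
import System.IO.Error
  ( isAlreadyInUseError, isDoesNotExistError, isPermissionError, tryIOError )

-- Turn an IOError into a short description using the classification
-- predicates from System.IO.Error.
describeIOError :: IOError -> String
describeIOError err
  | isDoesNotExistError err = "It doesn't exist"
  | isAlreadyInUseError err = "It is already in use"
  | isPermissionError err   = "Permission denied"
  | otherwise               = "Unclassified IO exception: " ++ show err

-- Read a file and report either what went wrong or how much was read.
summarize :: FilePath -> IO String
summarize path = do
  res <- tryIOError (readFile path)
  return $ case res of
    Left ioErr     -> describeIOError ioErr
    Right contents -> "Read " ++ show (length contents) ++ " characters"
```

Guards keep the chain of checks flat, though each new kind of exception still needs its own line.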

Stage 2 -- Mushing together more than one error type

I'm doing a lot of work with MongoDB these days. When building a database layer, I usually put a lot of knowledge specific to the data model into that layer: objects link together correctly, an object I'm inserting does not have a field that duplicates the corresponding field of another object in the database, the fields of the object meet certain criteria. So, my database layer has to contend with exceptions common to all databases (IO exceptions) as well as exceptions unique to my data model.

But, instead of going into that complication (today), I will take a simplified form that has to do with a Fibonacci sequence. I first proposed this as a question on Stack Overflow. Since I ultimately answered it, I feel totally comfortable including it here, along with additional explanation.

For some unknown reason, you need to write a function that will read in a file and verify that the file contains a proper Fibonacci sequence. The file will contain a single line of indeterminate length, and that line will contain a series of numbers separated by spaces.

Instantly, a number of exceptional conditions come to mind. Perhaps the file does not exist, or file permissions make it unreadable. Perhaps the file is in the wrong format. Or perhaps there is an invalid value in the sequence (technically, this is not an exception but simply the result of the function. I'm classifying it as an exception, anyway, so that I can pass back the invalid value). Let's see some code:

import System.IO
import System.IO.Error
import Control.Monad.Error

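data FibException = FileUnreadable IOError
                  | FormatError String
                  | InvalidValue Integer
                  | Unknown String
                  deriving Show

-- With Control.Monad.Error, ErrorT and throwError need an Error instance
-- for the exception type; the Unknown constructor serves that purpose.
instance Error FibException where
    noMsg = Unknown "No additional information"
    strMsg = Unknown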

Next, our verification function. I'm including the entire thing, but the type signature is the most important part:

verifySequence :: String -> (Integer, Integer) -> Either FibException ()
verifySequence "" _ = return ()
verifySequence s (prev1, prev2) =
    case (reads :: ReadS Integer) s of
        -- nothing parseable at the front of the remaining input
        [] -> throwError $ FormatError s
        [(val, rest)] -> case (prev1, prev2, val) of
            -- the first 1 after the leading zero starts the sequence
            (0, 0, 1) -> verifySequence rest (0, 1)
            _ | prev1 + prev2 /= val -> throwError $ InvalidValue val
              | otherwise -> verifySequence rest (prev2, val)
        -- reads should not return more than one parse for an Integer,
        -- but this keeps the case expression total
        _ -> throwError $ FormatError s

verifySequence can cover everything except the FileUnreadable exception. The function is totally pure, and if the input string is a valid Fibonacci sequence, it simply returns Right ().
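To see the pure results without touching a file, here is a compact restatement that needs only base. It is a sketch, not the function above: checkSequence is a hypothetical name, it writes Left directly instead of calling throwError, and it treats any unparseable input as a FormatError.

```haskell
data FibException = FileUnreadable IOError
                  | FormatError String
                  | InvalidValue Integer
                  | Unknown String
                  deriving Show

-- Same checks as verifySequence, with Left in place of throwError.
checkSequence :: String -> (Integer, Integer) -> Either FibException ()
checkSequence "" _ = Right ()
checkSequence s (p1, p2) =
  case reads s :: [(Integer, String)] of
    [(val, rest)]
      | (p1, p2, val) == (0, 0, 1) -> checkSequence rest (0, 1)
      | p1 + p2 /= val             -> Left (InvalidValue val)
      | otherwise                  -> checkSequence rest (p2, val)
    _ -> Left (FormatError s)

-- Three sample inputs, each started from the (0, 0) seed.
examples :: [Either FibException ()]
examples = map (\s -> checkSequence s (0, 0))
  [ "0 1 1 2 3 5 8"   -- Right ()
  , "0 1 1 2 17"      -- Left (InvalidValue 17)
  , "0 1 1 a 21"      -- Left (FormatError " a 21")
  ]
```

Note that the FormatError carries whatever was left of the input at the point where parsing stopped, not the whole line.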

The next step takes a little leap. At the moment I cannot explain how to make the leap, but I assure you that I will write another tutorial as soon as I can.

We want to blend the exception of a pure operation and the exception of an IO operation. Basically, this makes it all an IO operation, but I am going to give the entire operation a name. This calls for a monad transformer:

type FibIOMonad = ErrorT FibException IO

Remember, this declaration is equivalent, once the type is fully applied, to

type FibIOMonad a = ErrorT FibException IO a
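It may help to see what such a transformer is underneath: ErrorT e m a wraps a value of type m (Either e a). The sketch below is a simplified home-made stand-in, not the library's code. MiniErrorT, throwMini, liftMini, and demo are all hypothetical names, and the real ErrorT has more machinery than this.

```haskell
-- A simplified stand-in for ErrorT: an action in m that yields an Either.
newtype MiniErrorT e m a = MiniErrorT { runMiniErrorT :: m (Either e a) }

instance Functor m => Functor (MiniErrorT e m) where
  fmap f (MiniErrorT action) = MiniErrorT (fmap (fmap f) action)

instance Monad m => Applicative (MiniErrorT e m) where
  pure = MiniErrorT . return . Right
  MiniErrorT mf <*> MiniErrorT mx = MiniErrorT $ do
    ef <- mf
    case ef of
      Left e  -> return (Left e)
      Right f -> fmap (fmap f) mx

instance Monad m => Monad (MiniErrorT e m) where
  MiniErrorT action >>= k = MiniErrorT $ do
    result <- action
    case result of
      Left e  -> return (Left e)        -- short-circuit: skip the rest
      Right a -> runMiniErrorT (k a)

-- Raise an exception inside the transformer.
throwMini :: Monad m => e -> MiniErrorT e m a
throwMini = MiniErrorT . return . Left

-- Lift an action from the underlying monad (the role liftIO plays for IO).
liftMini :: Functor m => m a -> MiniErrorT e m a
liftMini = MiniErrorT . fmap Right

-- Everything after throwMini is skipped, including the putStrLn.
demo :: MiniErrorT String IO Int
demo = do
  x <- liftMini (return 1)
  _ <- throwMini "stopped here"
  liftMini (putStrLn "never printed")
  return (x + 1)
```

Running runMiniErrorT demo in GHCi hands back the Left value and never reaches the putStrLn, because the bind operator stops at the first Left.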

The function I want to write will read in a specified Fibonacci file, check the data, and, once run with runErrorT, return Right () if it succeeds.

--verifyFibFile :: FilePath -> ErrorT FibException IO ()
verifyFibFile :: FilePath -> FibIOMonad ()
verifyFibFile path = do

The first operation that I need to do is to read in the file. I need to catch the IOErrors that may be thrown when I read it in and then translate those into FibExceptions. Basically, catchError needs to take on this type:

catchError :: IO (Either FibException String) -> (IOError -> IO (Either FibException String)) -> IO (Either FibException String)
catchError (readFile path >>= return . Right) {- IO (Either a String) -}
           (return . Left . FileUnreadable)   {- IO (Either FibException b) -}
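The same conversion can be sketched with only base, assuming tryIOError from System.IO.Error in place of catchError. The helper name readFibFile is hypothetical, and the data type is repeated so that the block stands alone.

```haskell
import System.IO.Error (tryIOError)

data FibException = FileUnreadable IOError
                  | FormatError String
                  | InvalidValue Integer
                  | Unknown String
                  deriving Show

-- tryIOError gives IO (Either IOError String); rewrapping the Left side
-- turns that into IO (Either FibException String).
readFibFile :: FilePath -> IO (Either FibException String)
readFibFile path = do
  result <- tryIOError (readFile path)
  return (either (Left . FileUnreadable) Right result)
```

Either way, the IOError ends up wrapped in FileUnreadable and the caller only ever sees a FibException.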

Next, because I am calling this operation inside a function of the FibIOMonad context, and not the IO context, I need to lift it:

    contents <- liftIO $ catchError (readFile path >>= return . Right)
                                    (return . Left . FileUnreadable)

Next, working with the contents is pretty straightforward:

    case contents of
        Left err -> throwError err
        Right sequenceStr' -> case (verifySequence sequenceStr' (0, 0)) of
            Right res -> return res
            Left err -> throwError err

Putting it all together, verifyFibFile looks like this:

verifyFibFile :: FilePath -> FibIOMonad ()
verifyFibFile path = do
    contents <- liftIO $ catchError (readFile path >>= return . Right) (return . Left . FileUnreadable)
    case contents of
        Right sequenceStr' -> case (verifySequence sequenceStr' (0, 0)) of
            Right res -> return res
            Left err -> throwError err
        Left err -> throwError err

I do not like how I have to unroll the failing results just to rethrow them, but I have not figured out how to avoid that. On the upside, I can show you two different ways of repeatedly calling this function in order to get two different general effects.

First, totally within the FibIOMonad context, you can write this function:

runTest :: FibIOMonad ()
runTest = do
    res <- verifyFibFile "goodfib.txt"
    liftIO $ putStrLn "goodfib.txt"

    res <- verifyFibFile "invalidValue.txt"
    liftIO $ putStrLn "invalidValue.txt"

    res <- verifyFibFile "formatError.txt"
    liftIO $ putStrLn "formatError.txt"

You call this function from the IO context (i.e., the GHCI repl) like this:

*Main> runErrorT runTest
goodfib.txt
Left (InvalidValue 17)

The file invalidValue.txt has a 17 inserted in there, and 17 is decidedly not a Fibonacci number. formatError.txt has an "a" inserted into it, but formatError.txt never gets checked at all. Instead, given that verifyFibFile failed, the entire rest of the FibIOMonad operation falls through and runTest simply returns the Left value it got from checking "invalidValue.txt".

The other way is repeatedly calling verifyFibFile from within the IO context:

main = do
    res <- runErrorT $ verifyFibFile "goodfib.txt"
    printResult "goodfib.txt" res

    res <- runErrorT $ verifyFibFile "invalidValue.txt"
    printResult "invalidValue.txt" res

    res <- runErrorT $ verifyFibFile "formatError.txt"
    printResult "formatError.txt" res

    res <- runErrorT $ verifyFibFile "nonextantFile.txt"
    printResult "nonextantFile.txt" res

    where printResult filename (Right ()) = putStrLn ("file passes: " ++ filename)
          printResult filename (Left err) = putStrLn $ (filename ++ " " ++ show err)

If you run this, the results are a bit different:

*Main> main
file passes: goodfib.txt
invalidValue.txt InvalidValue 17
formatError.txt FormatError " a 21"
nonextantFile.txt FileUnreadable nonextantFile.txt: openFile: does not exist (No such file or directory)

Basically, each call to verifyFibFile happens within a separate context, so the failure in one call does not cause the entire IO context to fail, and thus you can check each file separately.
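The same contrast shows up with plain Either, with no IO involved. In this sketch, check is a made-up validation and the names allInOne and oneByOne are hypothetical. mapM runs every check inside one shared Either context, while map keeps each result separate.

```haskell
-- A made-up check: negative numbers are rejected.
check :: Int -> Either String Int
check n
  | n < 0     = Left ("negative: " ++ show n)
  | otherwise = Right n

-- One shared context: the first Left aborts everything after it,
-- which is how runTest behaves.
allInOne :: [Int] -> Either String [Int]
allInOne = mapM check

-- Separate contexts: every input gets its own result,
-- which is how main behaves.
oneByOne :: [Int] -> [Either String Int]
oneByOne = map check
```

For the input [1, -2, 3, -4], allInOne stops at Left "negative: -2", while oneByOne reports all four results, two of them failures.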

My remaining question is pretty simple:

How can I restructure verifyFibFile to be as flat as runTest?

If anyone has an answer, I would love to know.

Exceptions and Errors, Part 1 by Savanni D'Gerinel is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. You can link to it, copy it, redistribute it, and modify it, but don't sell it or the modifications and don't take my name from it.