Resource Management in Haskell

Posted on January 4, 2016

This is a post about Haskell. However, I should point out that I am not a professional Haskell programmer and have never had the opportunity to use it in production. I do enjoy using Haskell for side projects, though. So, if you find that I made a mistake, please leave a comment.

I do most of my everyday programming in C++ or Python and this will be reflected in this article. I will discuss resource management in Haskell from the perspective of a C++ programmer.

I will show a number of code examples in this article. If you want to follow along, you can find the source code of all examples here.

Resource Management in C++

First I would like to review resource management in C++. The recommended way to manage resources in modern C++ is the RAII idiom (Resource Acquisition Is Initialization). RAII might not carry the best name, but it is a pattern that allows for code that is concise, exception safe, and free of resource leaks. It exploits the fact that objects are destroyed at the end of their lifetime, and that the lifetime of an object local to a scope ends when we leave that scope, regardless of whether this happens due to an exception or in the normal course of the program. If we encapsulate resource acquisition and release within a dedicated class and clearly define the ownership, then the compiler will do the rest for us.

Before we look at an example I would like to clarify what I mean by a resource. Within this article it shall be something that has to be acquired in some way before we can use it, and has to be released once we don’t need it any longer. If we forget to release the resource then we have a leak in our application. Examples include allocated memory, file-handles, and locks on mutexes.

All of the above are difficult to observe. Therefore, we will emulate a more verbose resource that notifies us whenever it is acquired or released.

#include <exception>
#include <iostream>
#include <string>
#include <utility>

class Resource {
  public:
    Resource (std::string name) : name_(std::move(name)) {
        std::cout << "Acquired " << name_ << "\n";
    }
    ~Resource () {
        std::cout << "Released " << name_ << "\n";
    }

  private:
    std::string name_;
};

This defines the class Resource. Its constructor takes a string parameter, a name by which we want to refer to the particular instance. Both the constructor and the destructor write the name to stdout so that we can trace the resource’s lifetime. The following code demonstrates how to use this class.

int main() {
    try {
        Resource a("A");
        Resource b("B");
        // do something with a and b
        throw std::exception();
        // do some more with a and b
    } catch (const std::exception &) {
        std::cout << "Oops\n";
    }
}

Here we acquire two resources, "A" and "B", and then do some work with them. However, something goes wrong in the middle of it and we end up throwing an exception. The program produces the following output.

Acquired A
Acquired B
Released B
Released A
Oops

As we can see, both resources are released before we handle the exception. The compiler inserted all the necessary exception handling code for us. The exception is passed on after the resources are released.

Resource Management in Haskell

For comparison, we will now look at how resource management can be done in Haskell. If we go through the documentation of the module System.IO in the Haskell package base, we find functions such as openFile and hClose, which can be used for manual resource handling, much like fopen and fclose in C. However, we also find functions such as withFile, which ensure that the resource will be released once we no longer need it, much like the C++ resource class above.
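As a small illustration of the difference, the following sketch writes a line to a file in both styles. The file name "example.txt" and the helper names writeManual and writeScoped are made up for this illustration; only openFile, hClose, withFile, and hPutStrLn come from System.IO.

```haskell
import System.IO

-- Manual handling: if hPutStrLn throws, hClose is never reached.
writeManual :: FilePath -> String -> IO ()
writeManual path line = do
  h <- openFile path WriteMode
  hPutStrLn h line
  hClose h

-- Scoped handling: withFile closes the handle on normal exit
-- and when an exception is thrown.
writeScoped :: FilePath -> String -> IO ()
writeScoped path line = withFile path WriteMode $ \h -> hPutStrLn h line

main :: IO ()
main = do
  writeManual "example.txt" "written manually"
  writeScoped "example.txt" "written with withFile"
  readFile "example.txt" >>= putStr
```

Both helpers do the same thing as long as nothing goes wrong. They only differ in what happens to the handle when an exception occurs.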

For the same reason as before we will first define a Haskell version of our emulated verbose resource and the corresponding pair of open and close functions.

import Control.Exception
import Data.Typeable (Typeable)

data Resource = Resource String

acquireResource :: String -> IO Resource
acquireResource name = do
  putStrLn $ "Acquired " ++ name
  return $ Resource name

releaseResource :: Resource -> IO ()
releaseResource (Resource name) = putStrLn $ "Released " ++ name

To demonstrate exception handling we define the following exception type.

data ResourceException = ResourceException deriving (Show, Typeable)
instance Exception ResourceException

Our newly defined resource can be used in the following way.

main :: IO ()
main = handle (\ResourceException -> putStrLn "Oops") $ do
  a <- acquireResource "A"
  b <- acquireResource "B"
  -- do something with a and b
  throwIO ResourceException
  -- do some more with a and b
  releaseResource b
  releaseResource a

However, when we look at the output we find that there is a problem.

Acquired A
Acquired B
Oops

The resources have not been released at all. To fix this problem we have to define our own withResource function. We need a function that first acquires the resource, then performs some user defined action on it, and makes sure that the resource will be released in the end regardless of whether an exception was thrown or not. Luckily, Haskell already provides us with the function bracket in the module Control.Exception which does exactly what we want. Its first argument defines how to acquire the resource, its second argument how to release the resource, and the third argument how to use the resource. With its help we can define withResource.

withResource :: String -> (Resource -> IO r) -> IO r
withResource name = bracket (acquireResource name) releaseResource

An exception safe version of the above example can be implemented as follows.

main :: IO ()
main = handle (\ResourceException -> putStrLn "Oops") $
  withResource "A" $ \a ->
    withResource "B" $ \b -> do
      -- do something with a and b
      throwIO ResourceException
      -- do some more with a and b

And it produces the expected output.

Acquired A
Acquired B
Released B
Released A
Oops

This fixed the problem and made our code exception safe. However, we have to break out of do-notation for the resource management, and the names that we bind our resources to are hidden in the syntactic noise of lambda definitions. This might be just a question of aesthetics. However, if we add a few more resources and some productive code, this could quickly become difficult to read.
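To give an impression of how the nesting grows, here is a sketch with three resources. The helpers describe and useThree are hypothetical and only stand in for some productive code. The rest repeats the definitions from above so that the snippet is self-contained.

```haskell
import Control.Exception (bracket)

data Resource = Resource String

acquireResource :: String -> IO Resource
acquireResource name = do
  putStrLn $ "Acquired " ++ name
  return $ Resource name

releaseResource :: Resource -> IO ()
releaseResource (Resource name) = putStrLn $ "Released " ++ name

withResource :: String -> (Resource -> IO r) -> IO r
withResource name = bracket (acquireResource name) releaseResource

-- Hypothetical helper: some "productive" code that needs all three resources.
describe :: Resource -> Resource -> Resource -> String
describe (Resource a) (Resource b) (Resource c) =
  "Using " ++ a ++ ", " ++ b ++ ", " ++ c

-- Every additional resource adds a lambda and another level of indentation.
useThree :: IO String
useThree =
  withResource "A" $ \a ->
    withResource "B" $ \b ->
      withResource "C" $ \c -> do
        let msg = describe a b c
        putStrLn msg
        return msg

main :: IO ()
main = do
  _ <- useThree
  return ()
```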

Being still relatively new to Haskell, I started to wonder whether there was some way to define a scope-monad where we could acquire a resource within the same block of do notation, but at the same time ensure that the resource will be released in the end in an exception safe way. Naively, I started out with a state-monad that held a list of resources that would automatically be released in the end. However, I quickly found that the implementation became very complicated when taking exception-safety, nested scopes, and stacks of monad-transformers into account. Furthermore, we would be giving up on the full wealth of with... functions that Haskell already has to offer.

I took a step back and had a closer look at these with... functions. In the end they all boil down to the following signature: (a -> IO r) -> IO r, i.e. functions that take a function a -> IO r, apply it to the handled resource a, and finally return the result of the computation IO r. In other words, the with... functions don’t return the resource as a result, but rather pass it on to another function. This pattern is called continuation-passing style and it can be represented by a monad. The transformers package contains a monad transformer for the continuation monad, and its constructor’s signature matches that of the with... functions perfectly: ContT :: ((a -> m r) -> m r) -> ContT r m a. With it we can transform our previous example into the following form.

main :: IO ()
main = handle (\ResourceException -> putStrLn "Oops") $ evalContT $ do
  a <- ContT $ withResource "A"
  b <- ContT $ withResource "B"
  -- do something with a and b
  liftIO $ throwIO ResourceException
  -- do some more with a and b

In the continuation monad we can build up a chain of continuations. However, in the end we will have to decide on a final continuation and return a result. In our case the final continuation should just return the result of whatever we were doing in the IO monad. The function evalContT with the signature ContT r m r -> m r does that for us. It passes return as the final continuation and performs the whole computation. Thanks to the ContT monad transformer we can safely handle our resources within the same do block. Note that we have to lift IO actions into the continuation monad using liftIO.

The above code produces the expected output, the same as that of the previous example. That is good news. We found that the scope-monad that we were looking for is just a special case of the continuation monad. We get all the benefits of a monad, such as do notation, and, best of all, we didn’t actually have to do anything. All the required functionality was already there.
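To convince ourselves that the do block really is just the nested version in disguise, the following sketch runs both variants and compares their traces. The function withLogged is a hypothetical stand-in for withResource that records events in an IORef instead of printing them. The names nested, flat, and traceOf are also made up for this illustration.

```haskell
import Control.Exception (bracket_)
import Control.Monad.Trans.Cont (ContT (..), evalContT)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Hypothetical stand-in for withResource that records events in a log
-- instead of printing them, so that both variants can be compared.
withLogged :: IORef [String] -> String -> (String -> IO r) -> IO r
withLogged logRef name use =
  bracket_ (record ("Acquired " ++ name)) (record ("Released " ++ name)) (use name)
  where
    record msg = modifyIORef logRef (++ [msg])

-- Explicit nesting of with... functions.
nested :: IORef [String] -> IO String
nested logRef =
  withLogged logRef "A" $ \a ->
    withLogged logRef "B" $ \b ->
      return (a ++ b)

-- The same computation in the continuation monad.
flat :: IORef [String] -> IO String
flat logRef = evalContT $ do
  a <- ContT $ withLogged logRef "A"
  b <- ContT $ withLogged logRef "B"
  return (a ++ b)

-- Runs an action against a fresh log and returns its result and the log.
traceOf :: (IORef [String] -> IO a) -> IO (a, [String])
traceOf action = do
  logRef <- newIORef []
  r <- action logRef
  events <- readIORef logRef
  return (r, events)

main :: IO ()
main = do
  t1 <- traceOf nested
  t2 <- traceOf flat
  print (t1 == t2)
  mapM_ putStrLn (snd t2)
```

Both variants acquire A, then B, and release them in reverse order, so the comparison prints True.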

One Step Further

In C++ the RAII idiom has been used for more than just resource management. One example is scope guards, which define actions that should be performed at the very end of the scope. Andrei Alexandrescu has defined and later refined the notion for C++. For this article we will distinguish three cases: scopeExit, scopeFail, and scopeSuccess. As it turns out, all of these are actually very easy to implement in the scope-monad.

We will start with scopeExit, which is supposed to always execute the given action at the end of the scope. Haskell already has a function for this task. It is called finally and is part of the module Control.Exception. However, its signature IO a -> IO b -> IO a is not compatible with ContT. Therefore, we have to wrap it in the following way.

scopeExit :: IO a -> ContT r IO ()
scopeExit action = ContT $ \f -> f () `finally` action

This means that the result of scopeExit expects a continuation with the signature () -> IO r. In other words, scopeExit has nothing to pass on to its continuation.

Next we want to implement scopeFail. It is supposed to only execute the given action if an exception was thrown. As before we can implement it using a library function, namely onException. Its signature is identical to that of finally and we have to wrap it in the same way.

scopeFail :: IO a -> ContT r IO ()
scopeFail action = ContT $ \f -> f () `onException` action

Finally, scopeSuccess is a little harder to implement as there is no library function in Haskell which already does what we want. It is supposed to only execute the given action if no error occurred. However, if we look up the implementation of finally, we find that we only have to modify it slightly.

scopeSuccess :: IO a -> ContT r IO ()
scopeSuccess action = ContT $ \f -> do
  mask $ \restore -> do
    r <- restore (f ())
    _ <- action
    return r

Here we use the function mask, which takes a function in the IO monad as its argument and masks all asynchronous exceptions during its execution. However, it passes another function to its argument that can be used to restore the outer masking state. In the above example this means that f () could be interrupted by an asynchronous exception, but action will be executed with asynchronous exceptions masked, i.e. a throwing thread will be blocked until asynchronous exceptions are unmasked again. If an exception is thrown during the execution of f (), then action will never be executed.
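For comparison, this is roughly what finally looks like. To the best of my knowledge the sketch mirrors the definition in base, but it is written under the hypothetical name finally' so that it does not clash with the library function. The helper countSequelRuns is made up for this illustration. Compared to scopeSuccess, the only difference is the onException handler that also runs the sequel in the failure case.

```haskell
import Control.Exception (ErrorCall (..), mask, onException, throwIO, try)
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Assumed to mirror how base defines finally; renamed to avoid a clash.
finally' :: IO a -> IO b -> IO a
action `finally'` sequel =
  mask $ \restore -> do
    r <- restore action `onException` sequel  -- run sequel if action throws
    _ <- sequel                               -- run sequel if action succeeds
    return r

-- Counts how often the sequel runs: once after a successful action
-- and once after a failing one.
countSequelRuns :: IO Int
countSequelRuns = do
  counter <- newIORef 0
  let bump = modifyIORef counter (+ 1)
  return () `finally'` bump
  _ <- try (throwIO (ErrorCall "boom") `finally'` bump) :: IO (Either ErrorCall ())
  readIORef counter

main :: IO ()
main = countSequelRuns >>= print
```

Dropping the onException part leaves exactly the body of scopeSuccess.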

The following code demonstrates how these three functions can be used.

demo :: Bool -> IO ()
demo throw = evalContT $ do
  scopeExit $ putStrLn "Leaving scope"
  scopeFail $ putStrLn "Scope failed"
  scopeSuccess $ putStrLn "Scope succeeded"
  liftIO $ putStrLn "Inside scope"
  when throw $ liftIO $ throwIO ResourceException
  liftIO $ putStrLn "Did we just throw?"

main :: IO ()
main = do
  handle (\ResourceException -> putStrLn "Oops") $ demo True
  putStrLn $ replicate 50 '-'
  handle (\ResourceException -> putStrLn "Oops") $ demo False

The function demo takes a Boolean parameter that determines whether an exception will be thrown. It then establishes three scope guards, does some IO, and optionally throws an exception. The program’s output looks as follows.

Inside scope
Scope failed
Leaving scope
Oops
--------------------------------------------------
Inside scope
Did we just throw?
Scope succeeded
Leaving scope

In the first case an exception was thrown and scopeFail and scopeExit executed their actions. In the second case no exception was thrown and scopeSuccess and scopeExit executed their actions. This means that we achieved our goal and implemented all three scope guards in Haskell. Note that it is possible for exceptions to be thrown within the scope guards. If, in the above example, an exception is thrown within scopeSuccess’s action, then scopeFail will also execute its action. This means that the order in which scope guards are defined matters. Also note that scopeSuccess is only necessary if you want to make sure that the embedded action will not be interrupted by asynchronous exceptions. Otherwise, you could just place the embedded action at the very end of the do notation block.
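The effect of the ordering can be made visible with a small experiment. In the following sketch the success guard itself throws, and we record whether the failure guard notices. The helpers run, record, failOutside, and failInside are hypothetical names for this illustration, and the log in an IORef replaces the output on stdout.

```haskell
import Control.Exception (ErrorCall (..), mask, onException, throwIO, try)
import Control.Monad.Trans.Cont (ContT (..), evalContT)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

scopeFail :: IO a -> ContT r IO ()
scopeFail action = ContT $ \f -> f () `onException` action

scopeSuccess :: IO a -> ContT r IO ()
scopeSuccess action = ContT $ \f ->
  mask $ \restore -> do
    r <- restore (f ())
    _ <- action
    return r

record :: IORef [String] -> String -> IO ()
record logRef msg = modifyIORef logRef (++ [msg])

-- Runs a scope, swallows the exception thrown by the success guard,
-- and returns the recorded events.
run :: (IORef [String] -> ContT () IO ()) -> IO [String]
run scope = do
  logRef <- newIORef []
  _ <- try (evalContT (scope logRef)) :: IO (Either ErrorCall ())
  readIORef logRef

-- scopeFail is established first, so it encloses scopeSuccess.
failOutside :: IORef [String] -> ContT () IO ()
failOutside logRef = do
  scopeFail $ record logRef "Scope failed"
  scopeSuccess $ throwIO (ErrorCall "success guard failed")

-- scopeSuccess is established first, so scopeFail does not see its exception.
failInside :: IORef [String] -> ContT () IO ()
failInside logRef = do
  scopeSuccess $ throwIO (ErrorCall "success guard failed")
  scopeFail $ record logRef "Scope failed"

main :: IO ()
main = do
  run failOutside >>= print
  run failInside >>= print
```

The first run records "Scope failed", the second records nothing, because a guard only observes what happens in the part of the scope that follows it.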

Practical Application

Finally, I would like to give a practical example of what we have learned. Suppose we are given the task to write a program that copies a large file and prints its progress in percent on standard output so that it can be piped into a tool such as dialog or Zenity. In order to do this we will need a number of resources. First, we will need file-handles to the source and destination files, and second, we will need a buffer to read into and write from. This makes it a good example to show off the scope-monad that we defined above.

First we include all of the required modules.

import Control.Monad (unless)
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Cont (ContT (..), evalContT)
import Foreign.Marshal.Alloc (allocaBytes)
import System.IO

We define the buffer size as a global constant. In this case we will use 1 MiB.

bufferSize :: Int
bufferSize = 1024 * 1024

Since we want to print the progress in percent we need a way to calculate these numbers.

percentOf :: Integral a => a -> a -> a
percentOf _ 0 = 100  -- an empty file is trivially complete; avoids division by zero
percentOf part whole = (part * 100) `div` whole

Finally, we define the main program.

main :: IO ()
main = evalContT $ do
  infile <- ContT $ withBinaryFile "infile" ReadMode
  outfile <- ContT $ withBinaryFile "outfile" WriteMode
  buffer <- ContT $ allocaBytes bufferSize
  liftIO $ hSetBuffering infile NoBuffering
  liftIO $ hSetBuffering outfile NoBuffering
  fileSize <- liftIO $ hFileSize infile
  let copy progress = do
        print $ progress `percentOf` fileSize
        bytesRead <- hGetBuf infile buffer bufferSize
        hPutBuf outfile buffer bytesRead
        unless (bytesRead == 0) $ copy (progress + fromIntegral bytesRead)
  liftIO $ copy 0

We first open the input and output files in binary mode. Then we allocate memory for the buffer. All these resources are acquired in the continuation monad and will be released in the end even if an exception occurs. Next we deactivate buffering on both file handles. We don’t need it, since we already define our own buffer. Then we measure the size of the input file which we need to calculate the progress. Finally we start copying in a loop. Each iteration prints the current progress in percent and then copies data from the input file to the output file. We leave the loop when the end of the input file is reached and hGetBuf does not read any further bytes of data. The full code is available here.

Conclusion

With this I will conclude this already rather lengthy article. We compared resource management in C++ and in Haskell and found a monad in which we could embed the resource handling in an exception safe way. In the end, surprisingly little work was required to achieve this goal. Due to the use of the continuation monad transformer we can keep using all the existing with... functions. Furthermore, the solution extends beyond the IO monad. ContT is a general monad transformer and we could use it to define a scope in any other monad. A somewhat contrived example would be actions on a stack that is represented by a State [Int] monad. The source code for such an example is available here.
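A minimal sketch of what such a scope over State [Int] could look like is shown below. It is not the linked example, and the names withPushed and scoped are made up for this illustration. Since there are no exceptions in State, the scope only guarantees that every push is matched by a pop.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Cont (ContT (..), evalContT)
import Control.Monad.Trans.State (State, get, modify, runState)

-- Hypothetical with-style function: pushes a value, runs the continuation,
-- and pops the value again afterwards.
withPushed :: Int -> (() -> State [Int] r) -> State [Int] r
withPushed x use = do
  modify (x :)
  r <- use ()
  modify (drop 1)
  return r

-- Inside the scope the stack holds both values; afterwards it is empty again.
scoped :: State [Int] [Int]
scoped = evalContT $ do
  ContT $ withPushed 1
  ContT $ withPushed 2
  lift get

main :: IO ()
main = print (runState scoped [])
```

The result is the stack as seen inside the scope, [2,1], paired with the final stack, which is empty again.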

I was surprised that I could not find any mention of the continuation monad being used in this way anywhere on the internet, or in its documentation. I will readily admit that before I found this application I was never quite sure what practical use it might have. I hope that I was able to demonstrate a practical use case for the continuation monad, and I hope it can be of use to others. Let me point out that I am not suggesting using the continuation monad in every case where you need to handle resources. However, if you find that nested with... functions stack up to an uncomfortable depth, then the continuation monad might be a good solution.

Any criticism, suggestion, or any other form of feedback will be appreciated. So, please feel invited to leave a comment below. Thanks for reading!

Update

It was pointed out to me that the package managed provides a specialized monad for resource management based on the continuation monad. Furthermore, the package resourcet implements a monad for resource management with additional features such as controlled release of a resource at any point.
