Resource Management in Haskell

Posted on January 4, 2016

This is a post about Haskell. However, I would like to point out that I am not a professional Haskell programmer. I have never had the opportunity to use it in production, but I do enjoy using Haskell for side projects. So, if you find that I made a mistake, please leave a comment.

I do most of my everyday programming in C++ or Python, and this will be reflected in this article. I will discuss resource management in Haskell from the perspective of a C++ programmer.

I will show a number of code examples in this article. If you want to follow along you can find the source code of all examples here.

Resource Management in C++

First I would like to review resource management in C++. The recommended way to manage resources in modern C++ is the RAII idiom (Resource Acquisition Is Initialization). RAII might not carry the best name, but it is a pattern that allows for code that is concise, exception safe, and free of resource leaks. It exploits the fact that objects are destroyed at the end of their lifetime, and that the lifetime of an object local to a scope ends when we leave that scope, regardless of whether this happens due to an exception or the normal course of the program. If we encapsulate resource acquisition and release within a dedicated class and clearly define the ownership, then the compiler will do the rest for us.

Before we look at an example I would like to clarify what I mean by a resource. Within this article it shall be something that has to be acquired in some way before we can use it, and has to be released once we no longer need it. If we forget to release the resource then we have a leak in our application. Examples include allocated memory, file handles, and locks on mutexes.

All of the above are difficult to observe. Therefore, we will emulate a more verbose resource that notifies us whenever it is acquired or released.

#include <exception>
#include <iostream>
#include <string>
#include <utility>

class Resource {
  public:
    Resource (std::string name) : name_(std::move(name)) {
        std::cout << "Acquired " << name_ << "\n";
    }
    ~Resource () {
        std::cout << "Released " << name_ << "\n";
    }

  private:
    std::string name_;
};

This defines the class Resource. Its constructor takes a string parameter, a name by which we want to refer to the particular instance. Both the constructor and the destructor write the name to stdout so that we can trace the resource’s lifetime. The following code demonstrates how to use this class.

int main() {
    try {
        Resource a("A");
        Resource b("B");
        // do something with a and b
        throw std::exception();
        // do some more with a and b
    } catch (const std::exception &) {
        std::cout << "Oops\n";
    }
}

Here we acquire two resources, "A" and "B", and then do some work with them. However, something goes wrong in the middle of it and we end up throwing an exception. The program produces the following output.

Acquired A
Acquired B
Released B
Released A
Oops

As we can see, both resources are released, in reverse order of acquisition, before we handle the exception. The compiler inserted all the necessary exception handling code for us. The exception is passed on after the resources are released.

Resource Management in Haskell

In comparison, we will now look at how resource management can be done in Haskell. If we go through the documentation of the module System.IO in the Haskell package base we find functions such as openFile and hClose, which can be used for manual resource handling, much like fopen and fclose in C. However, we also find functions such as withFile, which ensure that the resource will be released once we no longer need it, much like the C++ resource class above.

For the same reason as before we will first define a Haskell version of our emulated verbose resource and the corresponding pair of acquire and release functions.

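-- Imports needed by the snippets in this section
import Control.Exception (Exception, bracket, handle, throwIO)
import Data.Typeable (Typeable)

data Resource = Resource String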

acquireResource :: String -> IO Resource
acquireResource name = do
  putStrLn $ "Acquired " ++ name
  return $ Resource name

releaseResource :: Resource -> IO ()
releaseResource (Resource name) = putStrLn $ "Released " ++ name

To demonstrate exception handling we define the following exception type.

data ResourceException = ResourceException deriving (Show, Typeable)
instance Exception ResourceException

Our newly defined resource can be used in the following way.

main :: IO ()
main = handle (\ResourceException -> putStrLn "Oops") $ do
  a <- acquireResource "A"
  b <- acquireResource "B"
  -- do something with a and b
  throwIO ResourceException
  -- do some more with a and b
  releaseResource b
  releaseResource a

However, when we look at the output we find that there is a problem.

Acquired A
Acquired B
Oops

The resources have not been released at all: throwIO aborts the do block, so the two releaseResource calls are never reached. To fix this problem we have to define our own withResource function. We need a function that first acquires the resource, then performs some user-defined action on it, and makes sure that the resource will be released in the end regardless of whether an exception was thrown or not. Luckily, Haskell already provides us with the function bracket in the module Control.Exception, which does exactly what we want. Its first argument defines how to acquire the resource, its second argument how to release the resource, and its third argument how to use the resource. With its help we can define withResource.

withResource :: String -> (Resource -> IO r) -> IO r
withResource name = bracket (acquireResource name) releaseResource
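
For readers wondering what bracket does under the hood, here is a minimal sketch modeled on its definition in Control.Exception. The names bracketSketch and runSketch and the IORef-based log are only for illustration and are not part of the original code; in real programs, bracket itself should be used.

```haskell
import Control.Exception (ErrorCall (..), mask, onException, throwIO, try)
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Illustrative re-implementation that mirrors the structure of 'bracket'.
bracketSketch :: IO a -> (a -> IO b) -> (a -> IO c) -> IO c
bracketSketch acquire release use = mask $ \restore -> do
  resource <- acquire -- runs with asynchronous exceptions masked
  -- If 'use' throws, release the resource and let the exception propagate.
  result <- restore (use resource) `onException` release resource
  _ <- release resource -- normal exit path
  return result

-- Records events in a list instead of printing, so the order can be inspected.
runSketch :: IO [String]
runSketch = do
  logRef <- newIORef []
  let say msg = modifyIORef logRef (msg :)
  outcome <- try $ bracketSketch
    (say "Acquired A")
    (\_ -> say "Released A")
    (\_ -> say "Using A" >> throwIO (ErrorCall "boom"))
  case outcome of
    Left (ErrorCall _) -> say "Oops"
    Right () -> say "No exception"
  reverse <$> readIORef logRef

main :: IO ()
main = runSketch >>= mapM_ putStrLn
```

Running it lists the acquisition, the use, the release, and only then the handler's "Oops", which is the same ordering we are after with withResource.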

An exception-safe version of our main function can be implemented as follows.

main :: IO ()
main = handle (\ResourceException -> putStrLn "Oops") $
  withResource "A" $ \a ->
    withResource "B" $ \b -> do
      -- do something with a and b
      throwIO ResourceException
      -- do some more with a and b

And it produces the expected output.

Acquired A
Acquired B
Released B
Released A
Oops

This fixed the problem and made our code exception safe. However, we have to break out of do notation for the resource management, and the names that we bind our resources to are hidden in the syntactic noise of lambda definitions. This might be just a question of aesthetics; however, if we add a few more resources and some productive code, this could quickly become difficult to read.

Being still relatively new to Haskell, I started to wonder whether there was some way to define a scope monad where we could acquire a resource within the same block of do notation but at the same time ensure that the resource will be released in the end in an exception-safe way. Naively, I started out with a state monad that held a list of resources that would automatically be released in the end. However, I quickly found that the implementation became very complicated when taking exception safety, nested scopes, and stacks of monad transformers into account. Furthermore, we would be giving up on the full wealth of with... functions that Haskell already has to offer.

I took a step back and had a closer look at these with... functions. In the end they all boil down to the signature (a -> IO r) -> IO r, i.e. functions that take a function a -> IO r, apply it to the managed resource a, and finally return the result of the computation IO r. In other words, the with... functions don’t return the resource as a result, but rather pass it on to another function. This pattern is called continuation-passing style and it can be represented by a monad. The transformers package contains a monad transformer for the continuation monad, and its constructor’s signature matches that of the with... functions perfectly: ContT :: ((a -> m r) -> m r) -> ContT r m a. With it we can transform our previous example into the following form.

main :: IO ()
main = handle (\ResourceException -> putStrLn "Oops") $ evalContT $ do
  a <- ContT $ withResource "A"
  b <- ContT $ withResource "B"
  -- do something with a and b
  liftIO $ throwIO ResourceException
  -- do some more with a and b

In the continuation monad we can build up a chain of continuations. However, in the end we have to decide on a final continuation and return a result. In our case the final continuation should just return the result of whatever we were doing in the IO monad. The function evalContT with the signature Monad m => ContT r m r -> m r does that for us. It passes return as the final continuation and performs the whole computation. Thanks to the ContT monad transformer we can safely handle our resources within the same do block. Note that we have to lift IO actions into the continuation monad using liftIO.

The above code produces the expected output, the same as that of the previous example. That is good news. We found that the scope monad that we were looking for is just a special case of the continuation monad. We get all the benefits of a monad, such as do notation, and, best of all, we didn’t actually have to implement anything. All the required functionality was already there.
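
To make the correspondence between the nested version and the ContT version explicit, the following sketch runs both and records what happens. The helper withLogged is a hypothetical stand-in for withResource that writes to an IORef instead of stdout, and nested, flat, and runWith are likewise names made up for this illustration.

```haskell
import Control.Exception (bracket_)
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Cont (ContT (..), evalContT)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Hypothetical stand-in for withResource that logs to an IORef.
withLogged :: IORef [String] -> String -> (String -> IO r) -> IO r
withLogged logRef name use =
  bracket_ (record ("Acquired " ++ name)) (record ("Released " ++ name)) (use name)
  where
    record msg = modifyIORef logRef (msg :)

-- Explicit nesting of continuations.
nested :: IORef [String] -> IO ()
nested logRef =
  withLogged logRef "A" $ \a ->
    withLogged logRef "B" $ \b ->
      modifyIORef logRef (("Using " ++ a ++ " and " ++ b) :)

-- The same computation in ContT; evalContT supplies 'return' as the final continuation.
flat :: IORef [String] -> IO ()
flat logRef = evalContT $ do
  a <- ContT $ withLogged logRef "A"
  b <- ContT $ withLogged logRef "B"
  liftIO $ modifyIORef logRef (("Using " ++ a ++ " and " ++ b) :)

-- Runs an action against a fresh log and returns the events in order.
runWith :: (IORef [String] -> IO ()) -> IO [String]
runWith action = do
  logRef <- newIORef []
  action logRef
  reverse <$> readIORef logRef

main :: IO ()
main = do
  runWith nested >>= mapM_ putStrLn
  runWith flat >>= mapM_ putStrLn
```

Both variants record the same sequence of events: each bind in the do block simply nests the rest of the block inside the corresponding with... call.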

One Step Further

In C++ the RAII idiom has been used for more than just resource management. One example is scope guards, which define actions that should be performed at the very end of the scope. Andrei Alexandrescu has defined and later refined the notion for C++. For this article we will distinguish three cases: scopeExit, scopeFail, and scopeSuccess. As it turns out, all of these are actually very easy to implement in the scope monad.

We will start with scopeExit, which is supposed to always execute the given action at the end of the scope. Haskell already has a function for this task. It is called finally and is part of the module Control.Exception. However, its signature IO a -> IO b -> IO a is not compatible with ContT. Therefore, we have to wrap it in the following way.

scopeExit :: IO a -> ContT r IO ()
scopeExit action = ContT $ \f -> f () `finally` action

This means that the result of scopeExit expects a continuation with the signature () -> IO r. In other words, scopeExit has nothing to pass on to its continuation.
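
Because each guard wraps the remainder of the do block as its continuation, several guards run in reverse order of their definition, just like destructors in C++. The sketch below illustrates this; exitOrder and its IORef log are hypothetical helpers added only for the demonstration.

```haskell
import Control.Exception (finally)
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Cont (ContT (..), evalContT)
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Same definition as in the article.
scopeExit :: IO a -> ContT r IO ()
scopeExit action = ContT $ \f -> f () `finally` action

-- Establishes two guards and returns the order in which things happened.
exitOrder :: IO [String]
exitOrder = do
  logRef <- newIORef []
  let record msg = modifyIORef logRef (msg :)
  evalContT $ do
    scopeExit $ record "first guard"
    scopeExit $ record "second guard"
    liftIO $ record "body"
  reverse <$> readIORef logRef

main :: IO ()
main = exitOrder >>= mapM_ putStrLn
```

The body runs first, then the second guard, and the first guard runs last.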

Next we want to implement scopeFail. It is supposed to execute the given action only if an exception was thrown. As before we can implement it using a library function, namely onException. Its signature is identical to that of finally and we have to wrap it in the same way.

scopeFail :: IO a -> ContT r IO ()
scopeFail action = ContT $ \f -> f () `onException` action

Finally, scopeSuccess is a little harder to implement as there is no library function in Haskell which already does what we want. It is supposed to execute the given action only if no error occurred. However, if we look up the implementation of finally we find that we only have to modify it slightly.

scopeSuccess :: IO a -> ContT r IO ()
scopeSuccess action = ContT $ \f -> do
  mask $ \restore -> do
    r <- restore (f ())
    _ <- action
    return r

Here we use the function mask, which takes a function in the IO monad as its argument and masks all asynchronous exceptions during its execution. However, it passes another function to its argument that can be used to restore the outer masking state. In the above example that means that f () could be interrupted by an asynchronous exception, but action will be executed with asynchronous exceptions masked, i.e. a throwing thread will be blocked until asynchronous exceptions are unmasked again. If an exception is thrown during the execution of f () then action will never be executed.
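
The effect of mask and restore can be observed directly with getMaskingState from Control.Exception. The following sketch, in which maskingStates is a name made up for this illustration, reports the masking state outside mask, inside mask, and inside restore.

```haskell
import Control.Exception (MaskingState (..), getMaskingState, mask)

-- Reports the masking state outside 'mask', inside 'mask', and inside 'restore'.
maskingStates :: IO (MaskingState, MaskingState, MaskingState)
maskingStates = do
  outer <- getMaskingState
  (inner, restored) <- mask $ \restore -> do
    inner <- getMaskingState
    restored <- restore getMaskingState -- back to the state of the caller
    return (inner, restored)
  return (outer, inner, restored)

main :: IO ()
main = maskingStates >>= print
```

When started from an unmasked thread, as the main thread of a program normally is, this should report Unmasked, MaskedInterruptible, and Unmasked again. The middle state is worth noting: under mask, operations that block are still interruptible, so an action that blocks inside scopeSuccess can in principle still receive an asynchronous exception.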

The following code demonstrates how these three functions can be used.

demo :: Bool -> IO ()
demo throw = evalContT $ do
  scopeExit $ putStrLn "Leaving scope"
  scopeFail $ putStrLn "Scope failed"
  scopeSuccess $ putStrLn "Scope succeeded"
  liftIO $ putStrLn "Inside scope"
  when throw $ liftIO $ throwIO ResourceException
  liftIO $ putStrLn "Did we just throw?"

main :: IO ()
main = do
  handle (\ResourceException -> putStrLn "Oops") $ demo True
  putStrLn $ replicate 50 '-'
  handle (\ResourceException -> putStrLn "Oops") $ demo False

The function demo takes a Boolean parameter that determines whether an exception will be thrown. It then establishes three scope guards, does some IO, and optionally throws an exception. The program’s output looks as follows.

Inside scope
Scope failed
Leaving scope
Oops
--------------------------------------------------
Inside scope
Did we just throw?
Scope succeeded
Leaving scope

In the first case an exception was thrown and scopeFail and scopeExit executed their actions. In the second case no exception was thrown and scopeSuccess and scopeExit executed their actions. This means that we achieved our goal and implemented all three scope guards in Haskell. Note that it is possible for exceptions to be thrown within the scope guards. If, in the above example, an exception is thrown within scopeSuccess’s action, then scopeFail will also execute its action, because scopeFail was established earlier and therefore wraps scopeSuccess. This means that the order in which scope guards are defined matters. Also note that scopeSuccess is only necessary if you want to make sure that the embedded action will not be interrupted by asynchronous exceptions. Otherwise, you could just place the embedded action at the very end of the do notation block.
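
The point about ordering can be checked with a small experiment. In the sketch below, which reuses the article's scopeFail and scopeSuccess, the scopeSuccess action itself throws. guardOrder and its IORef log are hypothetical helpers for this illustration, and ErrorCall merely stands in for any exception type.

```haskell
import Control.Exception (ErrorCall (..), mask, onException, throwIO, try)
import Control.Monad.Trans.Cont (ContT (..), evalContT)
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Same definitions as in the article.
scopeFail :: IO a -> ContT r IO ()
scopeFail action = ContT $ \f -> f () `onException` action

scopeSuccess :: IO a -> ContT r IO ()
scopeSuccess action = ContT $ \f -> mask $ \restore -> do
  r <- restore (f ())
  _ <- action
  return r

-- A scopeSuccess action that throws, surrounded by two scopeFail guards.
guardOrder :: IO [String]
guardOrder = do
  logRef <- newIORef []
  let record msg = modifyIORef logRef (msg :)
  outcome <- try $ evalContT $ do
    scopeFail $ record "outer scopeFail ran"
    scopeSuccess $ record "scopeSuccess ran" >> throwIO (ErrorCall "guard failed")
    scopeFail $ record "inner scopeFail ran"
    return ()
  case outcome of
    Left (ErrorCall msg) -> record ("caught: " ++ msg)
    Right () -> record "no exception"
  reverse <$> readIORef logRef

main :: IO ()
main = guardOrder >>= mapM_ putStrLn
```

Only the guard defined before scopeSuccess reacts to the failure. The inner scopeFail has already completed successfully by the time the scopeSuccess action runs, so it stays silent.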

Practical Application

Finally, I would like to give a practical example of what we have learned. Suppose we are given the task to write a program that copies a large file and prints its progress in percent on standard output so that it can be piped into a tool such as dialog or Zenity. In order to do this we will need a number of resources. First, we will need file handles to the source and destination files, and second, we will need a buffer to read into and write from. This makes it a good example to show off the scope monad that we defined above.

First we import all of the required modules.

import Control.Monad (unless)
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Cont (ContT (..), evalContT)
import Foreign.Marshal.Alloc (allocaBytes)
import System.IO

We define the buffer size as a global constant. In this case we will use 1 MiB.

bufferSize :: Int
bufferSize = 1024 * 1024

Since we want to print the progress in percent we need a way to calculate these numbers.

percentOf :: Integral a => a -> a -> a
percentOf _ 0 = 100 -- an empty file counts as fully copied; avoids division by zero
percentOf part total = (part * 100) `div` total

Finally, we define the main program.

main :: IO ()
main = evalContT $ do
  infile <- ContT $ withBinaryFile "infile" ReadMode
  outfile <- ContT $ withBinaryFile "outfile" WriteMode
  buffer <- ContT $ allocaBytes bufferSize
  liftIO $ hSetBuffering infile NoBuffering
  liftIO $ hSetBuffering outfile NoBuffering
  fileSize <- liftIO $ hFileSize infile
  let copy progress = do
        print $ progress `percentOf` fileSize
        bytesRead <- hGetBuf infile buffer bufferSize
        hPutBuf outfile buffer bytesRead
        unless (bytesRead == 0) $ copy (progress + fromIntegral bytesRead)
  liftIO $ copy 0

We first open the input and output files in binary mode. Then we allocate memory for the buffer. All these resources are acquired in the continuation monad and will be released in the end even if an exception occurs. Next we deactivate buffering on both file handles. We don’t need it, since we already manage our own buffer. Then we determine the size of the input file, which we need to calculate the progress. Finally we start copying in a loop. Each iteration prints the current progress in percent and then copies data from the input file to the output file. We leave the loop when the end of the input file is reached and hGetBuf does not read any further bytes of data. The full code is available here.

Conclusion

With this I will conclude this already rather lengthy article. We compared resource management in C++ and in Haskell and found a monad in which we could embed the resource handling in an exception-safe way. In the end surprisingly little work was required to achieve this goal. Due to the use of the continuation monad transformer we can keep using all the existing with... functions. Furthermore, the solution extends beyond the IO monad. ContT is a general monad transformer and we could use it to define a scope in any other monad. A somewhat contrived example would be actions on a stack that is represented by a State [Int] monad. The source code for such an example is available here.
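
As a rough illustration of that last idea, and not a copy of the linked code, a scope over State [Int] could look like the following sketch. The helpers withPushed and stackDemo are hypothetical names invented here. Note that State has no exceptions, so this only demonstrates the scoping, not exception safety.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Cont (ContT (..), evalContT)
import Control.Monad.Trans.State (State, get, modify, runState)

-- Hypothetical scope helper: push a value for the rest of the scope, pop it afterwards.
withPushed :: Int -> ContT r (State [Int]) ()
withPushed x = ContT $ \rest -> do
  modify (x :)
  result <- rest ()
  modify (drop 1)
  return result

-- First component: the stack as seen inside the scope.
-- Second component: the stack after the scope has been left.
stackDemo :: ([Int], [Int])
stackDemo = runState scoped []
  where
    scoped = evalContT $ do
      withPushed 1
      withPushed 2
      lift get

main :: IO ()
main = print stackDemo
```

Inside the scope the stack holds both pushed values, and once evalContT returns it is empty again.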

I was surprised that I could not find any mention of the continuation monad being used in this way anywhere on the internet, or in its documentation. I will readily admit that before I found this application I was never quite sure what practical use it might have. I hope that I was able to demonstrate a practical use case for the continuation monad, and I hope it can be of use to others. Let me point out that I am not suggesting using the continuation monad in every case where you need to handle resources. However, if you find that nested with... functions stack up to an uncomfortable depth, then the continuation monad might be a good solution.

Any criticism, suggestion, or other form of feedback will be appreciated. So, please feel free to leave a comment below. Thanks for reading!

Update

It was pointed out to me that the package managed provides a specialized monad for resource management based on the continuation monad. Furthermore, the package resourcet implements a monad for resource management with additional features such as controlled release of a resource at any point.