Mutual Recursion in Final Encoding

Posted on May 28, 2016

Not long ago I was in­tro­duced to final en­coding of em­bedded do­main spe­cific lan­guages (ED­SLs) in Haskell thanks to a coding puzzle over at Code­Wars. While solving that puzzle I started won­dering whether it was pos­sible to de­fine mu­tu­ally re­cur­sive func­tions in an em­bedded lan­guage in final en­cod­ing. In this ar­ticle we will see that it is in­deed pos­sible and de­velop a generic and type-safe im­ple­men­ta­tion.

A brief dis­claimer: Many of the con­cepts that are pre­sented in this ar­ticle I only learned about very re­cently my­self. If you find any er­rors or in­ac­cu­ra­cies then please let me know.

You can find a source-code repos­i­tory with ex­ample codes here.

Ini­tial En­coding

An area in which Haskell re­ally shines is the im­ple­men­ta­tion of em­bedded do­main spe­cific lan­guages. The most straight-for­ward way of im­ple­menting such a lan­guage is to de­fine a data-type that can rep­re­sent the ab­stract syntax tree (AST) of ex­pres­sions in the em­bedded lan­guage.

Con­sider a very simple ex­ample lan­guage that al­lows to ex­press ad­di­tion and nega­tion of in­te­gers. We could rep­re­sent the AST as fol­lows.

data Expr = Lit Int        -- literal value
          | Neg Expr       -- negation
          | Add Expr Expr  -- addition

A simple ex­pres­sion such as 1 + (2 - 3) would then be rep­re­sented by the fol­lowing value.

exprI = Add (Lit 1) (Add (Lit 2) (Neg (Lit 3)))

We can then write an in­ter­preter which eval­u­ates the ex­pres­sion and turns it into a Haskell in­te­ger.

eval :: Expr -> Int
eval (Lit n) = n
eval (Neg x) = negate (eval x)
eval (Add a b) = (eval a) + (eval b)

Lo and be­hold, if we apply it to our simple ex­ample ex­pres­sion then we get the ex­pected an­swer.

>>> eval exprI
0

Having a value level rep­re­sen­ta­tion of an ex­pres­sion al­lows us to de­fine dif­ferent in­ter­preters for it. E.g. a pretty-printer that gen­er­ates a tex­tual rep­re­sen­ta­tion, or we could com­pile the ex­pres­sion and run it on a GPU (of course only sen­sible for much larger com­pu­ta­tions).

This kind of en­coding of an em­bedded lan­guage is called ini­tial en­coding. Un­for­tu­nately, there are a few down­sides to this ap­proach. In short, ex­tending the lan­guage with new ca­pa­bil­i­ties (e.g. mul­ti­pli­ca­tion) will re­quire a change of the data type Expr and will thereby in­val­i­date all code that op­er­ates on that type. At the very least, every module will have to be re­com­piled. Fur­ther­more, large ex­pres­sions will be rep­re­sented by large nested data struc­tures. That might cause per­for­mance prob­lems.

Final En­coding

A dif­ferent way of en­coding em­bedded lan­guages is the so called final en­coding. If you’re not fa­miliar with that ap­proach then I strongly rec­om­mend reading Oleg Kise­ly­ov’s lec­ture notes (PDF) on the topic. Ad­di­tion­ally, it’s well worth it to go through the ac­com­pa­nying ex­ample codes. The em­bedded lan­guage that we de­fine below is based on some of these ex­ample codes.

Here, I will only give a brief sum­mary of the con­cept. In con­trast to ini­tial en­cod­ing, final en­coding does not rep­re­sent ex­pres­sions in the em­bedded lan­guage as values of a ded­i­cated data-type. In­stead it makes rather clever use of Haskell’s type-classes. The syntax of the em­bedded lan­guage is de­fined through func­tions in type-classes. Ex­pres­sions in the em­bedded lan­guage are then formed by ap­pli­ca­tion of these func­tions. In­ter­preters are written as in­stances of the type-classes

To give an ex­am­ple: The simple arith­metic lan­guage from above could be de­fined as fol­lows.

class AddLang repr where
    int :: Int -> repr Int
    neg :: repr Int -> repr Int
    add :: repr Int -> repr Int -> repr Int

This means that the lan­guage sup­ports three prim­i­tives int, neg, add. The ex­ample ex­pres­sion from above would then be written as fol­lows.

exprF = add (int 1) (add (int 2) (neg (int 3)))

To eval­uate that ex­pres­sion we will pro­vide a cor­re­sponding in­ter­preter. An in­ter­preter is im­ple­mented as an in­stance of the class that de­fines the lan­guage.

newtype Eval a = Eval { eval :: a }

instance AddLang Eval where
    int = Eval
    neg x = Eval $ negate (eval x)
    add a b = Eval $ (eval a) + (eval b)

Here we de­fine a rep­re­sen­ta­tion Eval which is just a newtype-wrapper around any Haskell value. To con­struct an int ex­pres­sion from a Haskell in­teger we just apply the data-con­structor Eval. To negate an ex­pres­sion we first eval­uate it, apply Haskell’s negate func­tion, and then pack it up again. Sim­i­larly, add ap­plies Haskell’s (+) on a’s and b’s eval­u­a­tions and packs the re­sult. Since newtype-wrap­pers don’t incur any run­time over­head, eval­u­ating an ex­pres­sion of the em­bedded lan­guage is just as ef­fi­cient as eval­u­a­tion of an equiv­a­lent Haskell ex­pres­sion.

The ex­pres­sion still yields the same re­sult.

>>> eval exprF
0

The above im­ple­men­ta­tion can be made more con­cise if we no­tice that Eval is an Applicative.

-- We could use the DeriveFunctor extension instead
instance Functor Eval where
    fmap f x = Eval $ f (eval x)

instance Applicative Eval where
    pure = Eval
    f <*> x = Eval $ (eval f) (eval x)

instance AddLang repr where
    int = pure
    neg = liftA negate
    add = liftA2 (+)

Ex­ten­sions of the em­bedded lan­guage can be pro­vided simply by adding new type-classes. E.g. for mul­ti­pli­ca­tion.

class MulLang repr where
    mul :: repr Int -> repr Int -> repr Int

In order to eval­uate ex­pres­sions that men­tion mul we will also have to ex­tend the in­ter­preter.

instance MulLang Eval where
    mul = liftA2 (*)

But, that this doesn’t in­val­i­date any of the pre­vi­ously ex­isting code. In final en­coding the lan­guage is easily ex­ten­si­ble.

Defining the Lan­guage

For the re­mainder of this ar­ticle we will as­sume the fol­lowing lan­guage de­f­i­n­i­tion and then ex­pand upon that. It is a form of simply typed lambda cal­culus with some simple arith­metic, Boolean logic, and branching on top of it. For con­ve­nience we will also add sub­trac­tion as a prim­i­tive to the lan­guage.

class AddLang repr where
    int :: Int -> repr Int
    neg :: repr Int -> repr Int
    add :: repr Int -> repr Int -> repr Int
    sub :: repr Int -> repr Int -> repr Int

class MulLang repr where
    mul :: repr Int -> repr Int -> repr Int

class BoolLang repr where
    bool :: Bool -> repr Bool
    not :: repr Bool -> repr Bool
    and :: repr Bool -> repr Bool -> repr Bool
    or :: repr Bool -> repr Bool -> repr Bool

class EqLang repr where
    eq :: Eq a => repr a -> repr a -> repr Bool
    neq :: Eq a => repr a -> repr a -> repr Bool

class IfLang repr where
    if_ :: repr Bool -> repr a -> repr a -> repr a

class LambdaLang repr where
    lam :: (repr a -> repr b) -> repr (a -> b)
    ap :: repr (a -> b) -> repr a -> repr b

The full im­ple­men­ta­tion of the eval­u­a­tion in­ter­preter is avail­able with the code ex­am­ples to this ar­ti­cle. Here I will only show the im­ple­men­ta­tion of IfLang and LambdaLang as those are the only ones that differ sig­nif­i­cantly from the pre­vious AddLang im­ple­men­ta­tion.

instance IfLang Eval where
    if_ cond then_ else_ = Eval $ if (eval cond)
                                    then (eval then_)
                                    else (eval else_)

instance LambdaLang Eval where
    lam f = Eval $ eval . f . Eval
    ap = (<*>)

Note, that we are placing the burden of type-checking and name-binding on the host lan­guage (Haskell). Haskell’s type-checker pro­hibits us from ac­ci­den­tally adding a Boolean value to an in­te­ger. It is also im­pos­sible to write a func­tion that refers to a non-ex­is­tent ar­gu­ment.

As a simple ex­ample we will con­sider a func­tion that takes a number \(n\) and checks whether that number ful­fills the equa­tion \(n^2 = 5^2\).

-- type can be inferred (constraints omitted)
fun :: repr (Int -> Bool)
fun = lam (\n -> (n `mul` n) `eq` (int 5 `mul` int 5))

We can then apply that func­tion to a value and see the re­sult.

>>> eval $ ap fun (int 4)
False

>>> eval $ ap fun (int 5)
True

In it­self that might not seem so ex­it­ing, as we could just as well have im­ple­mented that func­tion in Haskell right away. How­ever, we are again free to pro­vide other in­ter­preters for the lan­guage. E.g. a pretty-printer, or a GPGPU com­piler. In the ex­ample codes you can find a simple pretty-printer. It turns the func­tion fun from above into the fol­lowing tex­tual rep­re­sen­ta­tion.

>>> putStrLn $ pretty fun
(\x0 -> ((x0 * x0) == (5 * 5)))

Re­cur­sion and the Fixed-Point Com­bi­nator

So far, the em­bedded lan­guage al­lows to de­fine func­tions and do a bit of arith­metic, and logic. Next, we want to be able to ex­press re­cur­sive func­tions. We will con­sider the fac­to­rial func­tion. In Haskell we could de­fine that func­tion as fol­lows.

facH n = if n == 0 then 1 else n * facH (n-1)

In this de­f­i­n­i­tion we use Haskell’s re­cur­sive let-bind­ings. The em­bedded lan­guage does not allow any let-bind­ings so far. There­fore, we have to take a dif­ferent ap­proach. First, we will rewrite the fac­to­rial func­tion in open re­cur­sive form.

open_fac self n = if n == 0 then 1 else n * self (n-1)

Then, we can apply the fixed-point com­bi­nator to close it and ob­tain the fac­to­rial func­tion again.

fixH :: (a -> a) -> a
fixH f = f (fixH f)

facH2 = fixH open_fac

The Fixed-Point Com­bi­nator in the Em­bedded Lan­guage

First, we pro­vide the fixed-point com­bi­nator as a prim­i­tive in the em­bedded lan­guage.

class FixLang repr where
    fix :: (repr a -> repr a) -> repr a

instance FixLang Eval where
    fix f = f (fix f)

With it we can now im­ple­ment the fac­to­rial func­tion within the em­bedded lan­guage.

fac = fix ( \self -> lam ( \n ->
              if_ (n `eq` (int 0))
                  (int 1)
                  (n `mul` (ap self (n `sub` (int 1))))
            )
          ) 
>>> eval $ ap fac (int 0)
1
>>> eval $ ap fac (int 3)
6
>>> eval $ ap fac (int 5)
120
>>> putStrLn $ pretty fac
fix (\self0 -> (\x1 -> (if (x1 == 0) then 1 else (x1 * (self0 (x1 - 1))))))

Great! We have a working fac­to­rial func­tion.

Mu­tual Re­cur­sion

Now, let us con­sider mu­tu­ally re­cur­sive func­tions. A simple ex­ample are the func­tions even and odd which de­ter­mine whether a given pos­i­tive number is even or odd. A mu­tu­ally re­cur­sive im­ple­men­ta­tion in Haskell could look like this.

even n = n == 0 || odd (n-1)
odd  n = n /= 0 && even (n-1)

We use that zero is an even number and there­fore, if the number is zero then it cannot be odd. Fur­ther­more, if n is even, then its pre­de­cessor n-1 is odd and vice-versa. These func­tions are only in­di­rectly re­cur­sive: The func­tion odd only calls it­self in­di­rectly through even. How­ever, it is not pos­sible to de­fine ei­ther of those func­tions without the other (in that way). Haskell al­lows mu­tu­ally re­cur­sive let-bind­ings. The em­bedded lan­guage does not, yet.

As be­fore we will first con­sider an open re­cur­sion form of the above func­tions. A first, naïve ap­proach might look as fol­lows.

-- type can be inferred
evodH1 :: (Int -> Bool, Int -> Bool) -> (Int -> Bool, Int -> Bool)
evodH1 (ev, od) = ( \n -> n == 0 || od (n-1)
                  , \n -> n /= 0 && ev (n-1) )

Here, evodH1 is a func­tion that takes a pair of func­tions ev and od as the first ar­gu­ment. It then re­turns a pair of the func­tions even and odd. Looking at its type-sig­na­ture we see that we can apply the fixed-point com­bi­nator to this func­tion.

(ev1,od1) = fixH evodH1

If we ac­tu­ally try to use any of those two func­tions then we will need to bring a lot of pa­tience along. They will both loop for­ever.

If we in­sert the de­f­i­n­i­tion for fixH this quickly be­comes ob­vi­ous.

fixH f = f (fixH f)

(ev,od) = fixH evodH1
        = evodH1 (fixH evodH1)
        = evodH1 (evodH1 (fixH evod1))
        = ...

The issue is that we never ac­tu­ally reach a point where we can split up the pair. For that we would need to fully eval­uate the in­fi­nite tower of evodH1’s first. To be more pre­cise it is nei­ther ev, nor od it­self that is looping for­ever. In­stead, it is the lazy pat­tern binding (ev,od). We need to find a better im­ple­men­ta­tion of evodH1, one that al­lows to split the pair im­me­di­ately.

evodH2 :: ( (Int -> Bool, Int -> Bool) -> (Int -> Bool)
          , (Int -> Bool, Int -> Bool) -> (Int -> Bool) )
evodH2 = ( \(ev,od) -> \n -> n == 0 || od (n-1)
         , \(ev,od) -> \n -> n /= 0 && ev (n-1) )

Much bet­ter. Un­for­tu­nately, we can no longer apply the pre­vi­ously de­fined fixed-point com­bi­na­tor. We need to de­fine a new com­bi­nator for a pair of mu­tu­ally re­cur­sive func­tions.

fixH2 :: ( (a, b) -> a
         , (a, b) -> b )
      -> (a, b)
fixH2 fs@(a, b) = (a (fixH2 fs), b (fixH2 fs))

If we apply this com­bi­nator to evodH2 we will get a pair of working func­tions.

>>> let (ev,od) = fixH2 evodH2
>>> ev 4
True
>>> ev 3
False
>>> od 3
True
>>> od 2
False

Very nice, we were able to ex­press even and odd in open re­cur­sive form and close it through a mu­tual re­cur­sion fixed-point com­bi­na­tor. But, how would we gen­er­alize this so­lu­tion to an ar­bi­trary number of mu­tu­ally re­cur­sive func­tions? One so­lu­tion that one can find on­line re­lies on lists in­stead of tu­ples.

fixL :: [[a] -> a] -> [a]
fixL fs = map (\f -> f (fixL fs)) fs

evodL = [ \[ev,od] n -> n == 0 || od (n-1)
        , \[ev,od] n -> n /= 0 && ev (n-1) ]

That ap­proach would work for even and odd. How­ever, there are a number of down­sides. First, it’s un­safe. There is nothing pre­venting us from ac­cessing a (non-ex­is­tent) third el­e­ment of the list which is taken as the first ar­gu­ment. Sec­ond, lists are ho­mo­ge­neous, i.e. all el­e­ments must have the same type. There is no (generic) way to find the fixed-point of mu­tu­ally re­cur­sive func­tions that have dif­ferent types.

Con­sider the fol­lowing con­trived ex­am­ple.

anInt = if aBool then 1 else 0
aBool = anInt == 0

This is a pair of mu­tu­ally re­cur­sive ex­pres­sions that both have dif­ferent types. The first one is an in­te­ger, the second a Boolean value. Of course both of them don’t ter­mi­nate. But, we could not ex­press them in open re­cur­sion form and close them with fixL be­cause we cannot con­struct a ho­mo­ge­neous list of an in­teger and a Boolean value.

Het­ero­ge­neous Lists

In­stead of having a fixed-point com­bi­nator that takes a ho­mo­ge­neous list, or a pair of val­ues, we need one that takes a het­ero­ge­neous col­lec­tion. In this ar­ticle we will use a het­ero­ge­neous list for that pur­pose.

The HList package by Oleg Kise­lyov et. al. is a good re­source on how to im­ple­ment het­ero­ge­neous lists in Haskell. They de­scribe two dif­ferent ways that are rel­e­vant for our use case.

Gen­er­al­ized Al­ge­braic Data Types

One uses gen­er­al­ized al­ge­braic data types (GADTs) and looks as fol­lows.

data HListG xs where
    HNilG :: HListG '[]
    HConsG :: x -> HListG xs -> HListG (x ': xs)

This uses the DataKinds, and TypeOperators ex­ten­sions to ex­press lists on the type level. The apos­trophe in '[] and x ': xs lifts those sym­bols from the value-level to the type-level such that e.g. Char ': '[Bool] would be the type-level list '[Char, Bool]. Of course it also re­quires the GADTs ex­ten­sion. With this im­ple­men­ta­tion we could form a het­ero­ge­neous list like so.

-- type can be inferred
listG :: HListG '[Char, Bool]
listG = HConsG 'a' (HConsG True HNilG)

We can de­fine re­cur­sive func­tions on such lists as fol­lows. E.g. to cal­cu­late the length of a given list.

-- type cannot be inferred
hLengthG :: HListG xs -> Int
hLengthG HNilG = 0
hLengthG (HConsG _ xs) = 1 + hLengthG xs

The down­side of using GADTs to rep­re­sent het­ero­ge­neous lists is that they don’t allow lazy pat­tern bind­ings on the one hand, and on the other hand strongly limit the ex­tent of type in­fer­ence wher­ever they ap­pear. This is dis­cussed in some more de­tail in this Stack­Over­flow thread. As an ex­am­ple, the type of the fol­lowing ex­pres­sion cannot be in­ferred.

-- type cannot be inferred
mysteriousG :: HListG '[Char, Bool] -> (String, Bool)
mysteriousG = \(HConsG c (HConsG b HNilG)) -> (c:"", not b)

This is par­tic­u­larly in­con­ve­nient for our use case. In the end we will need to apply our mu­tual re­cur­sion fixed-point com­bi­nator to a list of func­tions that each of them take a het­ero­ge­neous list as an ar­gu­ment. Having to pro­vide type sig­na­tures for every set of mu­tu­ally re­cur­sive func­tions that we want to de­fine would not make for a good user ex­pe­ri­ence.

Data Fam­i­lies

An­other op­tion is to use data fam­i­lies. They re­quire the TypeFamilies ex­ten­sion and offer a more func­tional ap­proach to de­fine data types. Below we de­fine HListF as a type-level func­tion that ex­pects a list of types and re­turns a type.

data family HListF :: [*] -> *
data instance HListF '[] = HNilF
data instance HListF (x ': xs) = HConsF x (HListF xs)

We pat­tern match on the empty list, in which case the cor­re­sponding data-con­structor is HNilF. Then we pat­tern match on a list of at least one el­e­ment, in which case the cor­re­sponding data-con­structor is HConsF.

Con­structing such a het­ero­ge­neous list works the same way as with the GADT ver­sion.

-- type can be inferred
listF :: HListF '[Char, Bool]
listF = HConsF 'a' (HConsF True HNilF)

Re­cur­sion on the other hand is no longer as easy as with the GADT ver­sion. E.g. the above ver­sion of hLengthG would not com­pile for HListF. In­stead we have to im­ple­ment such func­tions through type-classes.

class HLengthF xs where
    hLengthF :: HListF xs -> Int

instance HLengthF '[] where
    hLengthF HNilF = 0

instance HLengthF xs => HLengthF (x ': xs) where
    hLengthF (HConsF x xs) = 1 + hLengthF xs

On the other hand, type in­fer­ence works much better on the data family ver­sion. The type of the ex­ample from be­fore can now be in­ferred.

-- type can be inferred
mysteriousF :: HListF '[Char, Bool] -> (String, Bool)
mysteriousF = \(HConsF c (HConsF b HNilF)) -> (c:"", not b)

There­fore, when choosing how to im­ple­ment het­ero­ge­neous lists we have to de­cide on the fol­lowing trade-off. Do we want more pow­erful type-in­fer­ence on ex­pres­sions dealing with het­ero­ge­neous lists, or do we want re­cur­sive func­tions on them without the need for type-classes. As we will see we will ac­tu­ally need both. For­tu­nately, not at the same time.

Het­ero­ge­neous Lists of Values in the Em­bedded Lan­guage

Let’s first con­sider the kind of het­ero­ge­neous list that we ex­pect the func­tions in open re­cur­sion form to take. It would be very in­con­ve­nient to have to add ex­plicit type sig­na­tures when­ever we de­fine mu­tu­ally re­cur­sive func­tions. There­fore, we will side with the data family ver­sion. There is, how­ever, one more fea­ture that we need to add. We ex­pect that every el­e­ment of such a list is a value in the em­bedded lan­guage. I.e. every el­e­ment must be of the form repr a. To en­sure that we im­ple­ment het­ero­ge­neous lists in the fol­lowing way.

data family HList :: (* -> *) -> [*] -> *
data instance HList repr '[] = HNil
data instance HList repr (x ': xs) = repr x :. HList repr xs
infixr 5 :.

As its first ar­gu­ment the data family ex­pects a higher kinded type. That will be repr. Note, how we only apply repr to x on the right-hand side of the equal sign. That means that repr will not be re­peated in the type level list. I.e. the fol­lowing rep­re­sents a list con­taining a repr Int, a repr Bool, and a repr (Int -> Bool).

-- type can be inferred (constraints omitted)
threeThings :: HList repr '[Int, Bool, (Int -> Bool)]
threeThings = int 4 :. bool True :. lam (\n -> n `eq` int 3) :. HNil

Mu­tual Re­cur­sion Fixed-Point Com­bi­nator

The next ques­tion is how to ac­tu­ally use this data type for the mu­tual re­cur­sion fixed-point com­bi­na­tor. Com­paring to the ho­mo­ge­neous list im­ple­men­ta­tion from be­fore we ex­pect the mu­tual re­cur­sion fixed-point com­bi­nator to (con­cep­tion­ally) have the fol­lowing type.

fix :: HList' repr xs -> HList repr ys

Each el­e­ment of the ar­gu­ment list should be of the form HList repr ys -> repr y and all those y to­gether should form ys. The type Hlist' is yet to be de­ter­mined.

Un­for­tu­nately, this form would make it quite dif­fi­cult to use mu­tu­ally re­cur­sive func­tions in the em­bedded lan­guage. As it stands the lan­guage has no sup­port for het­ero­ge­neous lists it­self. I.e. we couldn’t ac­tu­ally un­pack the re­sult of fix within the em­bedded lan­guage. We could ei­ther take a de­tour through Haskell-land to un­pack the re­sults of fix and then dis­perse them over ex­pres­sions in the em­bedded lan­guage, or we could add het­ero­ge­neous lists to the lan­guage.

We will choose a third op­tion: In­stead of im­ple­menting a mu­tual re­cur­sion fixed-point com­bi­nator as a prim­i­tive we will ac­tu­ally do the same thing that Haskell does and pro­vide mu­tu­ally re­cur­sive let-bind­ings.

Let-Bind­ings

So, what is a let-bind­ing? In Haskell we all know what it is.

let answer = 42
 in "The answer to life, the universe, and everything is " ++ show answer

What that re­ally means is that we bind some ex­pres­sion (42) to a name (answer) and then make that name avail­able in an­other ex­pres­sion. We’ve ac­tu­ally seen that be­fore. It is the same as defining a func­tion of a pa­ra­meter (making the name avail­able) and then calling it im­me­di­ately with an ar­gu­ment (binding some ex­pres­sion to that name). A simple non-re­cur­sive let-binding could look as fol­lows.

class Let1Lang repr where
    let1 :: repr a -> (repr a -> repr b) -> repr b

instance Let1Lang Eval where
    let1 x e = e x

A mu­tu­ally re­cur­sive let-binding would then (con­cep­tu­ally) look like so.

class LetLang repr where
    let_ :: HList' repr xs -> (HList repr ys -> repr r) -> repr r

Where, again, each el­e­ment of the first ar­gu­ment list should be of the form HList repr ys -> repr y and all those y to­gether should form ys. And, again, the type HList' is yet to be de­ter­mined. It is clear that HList' will have to be some form of het­ero­ge­neous list. So, could we use a type sim­ilar to HList for that pur­pose?

There are a number of com­pli­ca­tions: First, the men­tioned con­straints on the ar­gu­ments need to be en­forced in some way. Sec­ond, re­member one of the re­stric­tions of the data family im­ple­men­ta­tion: To it­erate over el­e­ments of a data family im­ple­mented HList by re­cur­sion we would have to de­fine a spe­cific type-class just for that pur­pose. But, the func­tion let_ is al­ready within a type-class. Par­tic­u­larly, within one that doesn’t (and should­n’t) allow any fur­ther con­straints on xs or ys within an in­stance de­f­i­n­i­tion.

To ad­dress both those is­sues we will de­fine a new, ded­i­cated het­ero­ge­neous list type for let-bind­ings. This time we use a GADT to en­able re­cur­sion within an im­ple­men­ta­tion of let_. Fur­ther­more, we will de­fine it in such a way that the type con­straints on xs and ys are guar­an­teed by the list’s type it­self.

data LetBindings' repr ys zs where
    End  :: LetBindings' repr ys '[] 
    (:&) :: (HList repr ys -> f z)
         -> LetBindings' repr ys zs
         -> LetBindings' repr ys (z ': zs)
infixr 5 :&

type LetBindings repr ys = LetBindings' repr ys ys

The type LetBindings' (with an apos­tro­phe) takes a higher kinded type as its first ar­gu­ment (repr) and then ex­pects two type-lists ys, and zs. It is a list sim­ilar to HList in that it has a ter­mi­nator End, and a cons (:&). How­ever, the two type-lists allow us to en­force the re­quired con­straints on its el­e­ments: The first one (ys) is meant to hold the com­plete list of fixed-point types at all times. The second pa­ra­meter is meant to be built up with every el­e­ment that we add to the list. If we prepend an el­e­ment of type HList repr '[Int, Bool] -> repr Bool then the type Bool will be prepended to the list zs.

The magic hap­pens within the type-alias LetBindings (without apos­tro­phe) which en­forces that both type-lists, ys and zs, are iden­ti­cal. This way it is en­sured that the whole list of open re­cur­sive func­tions will re­sult in a list of fixed-points that has the same type as the list that every el­e­ment in the list ex­pects as a pa­ra­me­ter. It is pro­hib­ited by the type-system to con­struct an in­con­sis­tent list of let-bind­ings. E.g. we could not con­struct the fol­low­ing.

wrong = check
    ( (\(anInt :. aBool :. HNil) -> anInt `eq` (int 0))
   :& (\(anInt :. aBool :. HNil) -> if_ aBool (int 1) (int 0))
   :& End )
  where
    -- used to enforce that ys == zs
    check :: LetBindings repr ys -> LetBindings repr ys
    check = id

Looking at what each of those func­tions re­turns we would ex­pect the type-list zs to be '[Bool, Int]. Looking at the ar­gu­ment lists and how these ar­gu­ments are used we would ex­pect the type-list ys to be '[Int, Bool]. The lists don’t match and it would be im­pos­sible to ex­press wrong’s type through the type-alias LetBindings. To en­force con­sis­tent let-bind­ings we de­fine let_ in terms of this type-alias.

class LetLang repr where
    let_ :: LetBindings repr ys -> (HList repr ys -> repr r) -> repr r

Fi­nally, we im­ple­ment mu­tu­ally-re­cur­sive let-bind­ings for the eval­u­a­tor. An im­ple­men­ta­tion for the pretty-printer is avail­able with the ex­ample codes.

instance LetLang Eval where
    let_ xs e = e (fx xs xs) where
        fx :: LetBindings Eval ys
           -> LetBindings' Eval ys ys'
           -> HList Eval ys'
        fx xs End = HNil
        fx xs (x :& xs') = x (fx xs xs) :. fx xs xs'

This im­ple­men­ta­tion first finds all the fixed-points of the let-bind­ings and then passes the re­sults to the ex­pres­sion that was given as the second ar­gu­ment. The fixed-points are de­ter­mined by the mu­tual re­cur­sion fixed-point com­bi­nator fx. That func­tion it­er­ates over all mu­tu­ally re­cur­sive func­tions given in open re­cur­sion form, calls them on the re­sult of fx, and then sticks them into the list of re­sults. It might re­quire some staring at it, but this func­tion’s de­f­i­n­i­tion is con­cep­tu­ally iden­tical to that of the ho­mo­ge­neous list fixed-point com­bi­nator fixL. Note, that we have to pass the com­plete LetBindings into the fixed-point com­bi­nator as well. Oth­er­wise, the com­piler cannot infer that ys in fx’s sig­na­ture is the same as the ys in let_’s (im­plicit) sig­na­ture.

Using Mu­tu­ally Re­cur­sive Let-Bind­ings

Fi­nally, with mu­tu­ally re­cur­sive let-bind­ings im­ple­mented in the em­bedded lan­guage we can demon­strate a few use-cases.

We can im­ple­ment the fac­to­rial func­tion.

fac = let_ ( (\(self :. HNil) ->
               lam ( \n -> if_ (n `eq` int 0)
                               (int 1)
                               (n `mul` (ap self (n `sub` int 1)))) )
          :& End )
        ( \(fac :. HNil) -> fac )
>>> eval $ ap fac (int 4)
24
>>> putStrLn $ pretty fac
(let self0 = (\x0 -> if (x0 == 0) then 1 else (self0 (x0 - 1))); in self0)

We can im­ple­ment the even and odd func­tions from be­fore and use them to check that no number is even and odd at the same time.

false = let_ ( (\(ev :. od :. HNil) ->
                 lam (\n -> (n `eq` int 0) `or`
                            ap od (n `sub` int 1)))
            :& (\(ev :. od :. HNil) ->
                 lam (\n -> (n `neq` int 0) `and`
                            ap ev (n `sub` int 1)))
            :& End )
          ( \(ev :. od :. HNil) ->
            lam (\n -> ap ev n `and` ap od n) )
>>> eval $ ap false (int 4)
False
>>> eval $ ap false (int 3)
False

We can ac­tu­ally turn full circle and im­ple­ment the fixed-point com­bi­nator within the em­bedded lan­guage.

fixE = let_ ( \(fix :. HNil) ->
              lam (\f -> ap f (ap fix f))
           :& End )
         ( \(fix :. HNil) -> fix )
>>> putStrLn $ pretty fixE
(let self0 = (\x1 -> (x1 (self0 x1))); in self0)

Which we can then use to im­ple­ment the fac­to­rial func­tion yet again.

fac2 = ap fixE $ lam (\fac ->
    lam (\n ->
      if_ (n `eq` (int 0))
          (int 1)
          (n `mul` ap fac (n `sub` int 1))
    ))
>>> eval $ ap fac2 (int 4)
24

Con­clu­sion and Out­look

In this ar­ticle we found that it is in­deed pos­sible to im­ple­ment mu­tu­ally re­cur­sive let-bind­ings for an em­bedded lan­guage in final en­coding in a type-safe and generic way. The given im­ple­men­ta­tion can handle an ar­bi­trary number of bind­ings of dif­fering types.

On a per­sonal note, I can say that during this little project I was con­fronted with many as­pects of Haskell that were new to me. I was strug­gling quite a bit with the subtle dif­fer­ences be­tween GADTs and data fam­i­lies. It is very sur­prising to see that while they both allow to im­ple­ment type-safe het­ero­ge­neous lists, they imply very dif­ferent re­stric­tions on the re­sulting types.

For the par­tic­ular use case here it would have been great to have a means of con­structing a het­ero­ge­neous list that does not hinder type-in­fer­ence but at the same time al­lows for re­cur­sive func­tions to it­erate over el­e­ments without the need for type-classes. If that is al­ready pos­sible in Haskell, to­day. Please, let me know as I’m not aware of it.

One ex­ten­sion that I would love to add would be to write a vari­adic ver­sion of let_. The im­ple­men­ta­tion above re­quires a lot of boil­er­plate on the user’s side in terms of dif­ferent list con­struc­tors and ter­mi­na­tors. It would be much nicer if the fol­lowing in­ter­face was avail­able in a generic way. My (failed) at­tempt at that is avail­able with the ex­ample codes.

false = vlet ( \ev od ->
               lam (\n -> (n `eq` int 0) `or`
                          ap od (n `sub` int 1)) )
             ( \ev od ->
               lam (\n -> (n `neq` int 0) `and`
                          ap ev (n `sub` int 1)) )
             
       `in_` ( \ev od ->
               lam (\n -> ap ev n `and` ap od n) )

In any case, I hope you en­joyed this ar­ti­cle. If you have any com­ments, sug­ges­tions, or cri­tique please leave a com­ment be­low.

Thank you for read­ing.