> {-# LANGUAGE TypeOperators, MultiParamTypeClasses, FlexibleInstances, FlexibleContexts, UndecidableInstances, IncoherentInstances #-} > module Code20140128 where * Data types a la carte Based on: Data Types a la Carte Wouter Swierstra Journal of Functional Programming 18 (4): 423-436, 2008. Defining a recursive data type is straightforward in Haskell. Defining functions on the data type (by recursion) is also straightforward. But what happens to these functions if we want to extend the data type by new constructors? This question has been posed by Phil Wadler as the "Expression Problem": "The goal is to define a data type by cases, where one can add new cases to the data type and new functions over the data type, without recompiling existing code, and while retaining static type safety." Data types a la carte offers one solution. ** Definition > data Exp0 = Int0 Int | Add0 Exp0 Exp0 A functor that describes the structure of Exp0. > data ExpF x = IntF Int | AddF x x > instance Functor ExpF where > fmap f (IntF i) = IntF i > fmap f (AddF x1 x2) = AddF (f x1) (f x2) | f :: * -> * > data Fix f = InF (f (Fix f)) A description of an operation on one level of f: an f-algebra > type Algebra f a = f a -> a > evalExpAlg :: Algebra ExpF Int > evalExpAlg (IntF i) = i > evalExpAlg (AddF i1 i2) = i1 + i2 A generic recursion operator > foldFix :: Functor f => Algebra f a -> Fix f -> a > foldFix alg (InF x) = alg (fmap (foldFix alg) x) x :: f (Fix f) alg :: f a -> a fmap (foldFix alg) :: f (Fix f) -> f a A type "isomorphic" to Exp0: > type Exp1 = Fix ExpF > eval0 :: Exp1 -> Int > eval0 = foldFix evalExpAlg Using the injection InF is tedious: use smart constructors instead > int0 :: Int -> Exp1 > int0 i = InF (IntF i) > add0 :: Exp1 -> Exp1 -> Exp1 > add0 e1 e2 = InF (AddF e1 e2) Another example: lists of integers > data ListF x = NilF | ConsF Int x > instance Functor ListF where > fmap f NilF = NilF > fmap f (ConsF i x) = ConsF i (f x) > type List0 = Fix ListF > nil0 = InF NilF > cons0 i x = InF (ConsF i x) Examples of ListF algebras > lengthAlg :: Algebra ListF Int > lengthAlg NilF = 0 > lengthAlg (ConsF _ x) = 1 + x > sumAlg :: Algebra ListF Int > sumAlg NilF = 0 > sumAlg (ConsF i x) = i + x As an example, consider a data type representing abstract syntax and some functions on that syntax, e.g., a pretty printer, an interpreter, ... Instead of defining this data type directly, we take a deviation to make the definition of the type extensible. > data Mu f = In (f (Mu f)) This datatype is parameterized over a functor f that defines the signature of the constructors. Mu itself just provides the recursion. Now we can define a range of data types just be providing different functors. > data Val e = Val Int > type ConstExpr = Mu Val > data Add e = Add e e > type AddExpr = Mu Add To combine the two functors, we need to take their "coproduct" on the level of functors: > infixr :+: > data (f :+: g) e = Inl (f e) | Inr (g e) > type AddConstExpr = Mu (Val :+: Add) > instance (Functor f, Functor g) => Functor (f :+: g) where > fmap h (Inl fe) = Inl (fmap h fe) > fmap h (Inr ge) = Inr (fmap h ge) What would an example expression look like? ** Evaluation First, we need to make explicit that Val and Add are functors and that the coproduct of two functors is again a functor. > instance Functor Val where > fmap f (Val i) = (Val i) > instance Functor Add where > fmap f (Add x y) = Add (f x) (f y) > instance (Functor f, Functor g) => (Functor (f :+: g)) where > fmap h (Inl e) = Inl (fmap h e) > fmap h (Inr e) = Inr (fmap h e) Given a functor, we may define a generic fold operator on Mu. > foldMu :: Functor f => (f a -> a) -> Mu f -> a > foldMu alg (In x) = alg (fmap (foldMu alg) x) alg :: f a -> a In x :: Mu f x :: f (Mu f) fmap (foldMu alg) x :: f (Mu f) -> f a Side remark: If F is an endofunctor, an arrow of type F A -> A is called an F-algebra. F-algebras also form a category and many recursive data types may be described by an initial F-algebra for suitable F. Here are some specific examples. > class Functor f => Eval f where > evalAlgebra :: f Int -> Int > instance Eval Val where > evalAlgebra (Val i) = i > instance Eval Add where > evalAlgebra (Add x y) = x + y What about the combined case? > instance (Eval f, Eval g) => Eval ((:+:) f g) where > evalAlgebra (Inl x) = evalAlgebra x > evalAlgebra (Inr y) = evalAlgebra y Now we can put together an eval function. > eval :: Eval f => Mu f -> Int > eval e = foldMu evalAlgebra e Does it work? ** Injection Writing In (Inl (Val 17)) In (Inr (Add (In (Inl (Val 17))) (In (Inl (Val 4))))) is quite unreadable. How about using smart constructors like the following? > val1 :: Int -> Mu Val > val1 x = In (Val x) > add1 :: Mu Add -> Mu Add -> Mu Add > add1 x y = In (Add x y) Unfortunately, they do not combine in a data type like Mu (Val :+: Add)! We need to provide these constructors with a more general type. In particular, the constructor should work regardless how deeply nested the functor occurs in a nest of :+: applications. Hence, we define a subtype relation that locates a functor in such a nest and that generates the correct sequence of Inl and Inr constructors on the way. To express this relation requires an extended notion of a type class with multiple parameters. Such a class describes an n-ary relation on types, where a standard type class only describes a unary relation (a predicate). > class sub :<: super where > inj :: sub a -> super a At the functor itself, no injection is needed. (*) this instance declaration is illegal by default because its form may give rise to ambiguities. > instance f :<: f where > inj = id Otherwise, we may find the primitive functor either on the left or on the right. > instance f :<: (f :+: g) where > inj = Inl > instance (f :<: g) => f :<: (h :+: g) where > inj = Inr . inj This definition is not really declarative. It requires that primitive functors occur directly in the left alternative of a coproduct or nested in the right alternative. Supporting nesting in both alternatives would require Haskell to search exhaustively for suitable instances. However, the Haskell type checked only checks the first applicable alternative. With this class, suitable smart constructors are definable. > inject :: (g :<: f) => g (Mu f) -> Mu f > inject = In . inj > val :: (Val :<: f) => Int -> Mu f > val x = inject (Val x) > add :: (Add :<: f) => Mu f -> Mu f -> Mu f > add x y = inject (Add x y) These constructors improve readability a lot. Hack: make eval instance for ListF > instance Eval ListF where > evalAlgebra (NilF) = 0 > evalAlgebra (ConsF _ x) = 1+x ** Extension Now for the lackmus test. Let's add new features to the AST. > data Mul e = Mul e e > instance Functor Mul where > fmap f (Mul x y) = Mul (f x) (f y) > instance Eval Mul where > evalAlgebra (Mul x y) = x * y > mul :: (Mul :<: f) => Mu f -> Mu f -> Mu f > mul x y = inject (Mul x y) ** Extension II Adding new functions may be done either by defining more algebras and using fold or by using *open-ended recursion* directly. A pretty printer transforms expressions into strings. > class Render f where > render :: Render g => f (Mu g) -> String Why can't we simply use Mu f for the recursive use? Given that render is defined, the pretty printer boils down to: > pretty :: Render f => Mu f -> String > pretty (In x) = render x It remains to define instances of Render for each functor. > instance Render Val where > render (Val i) = show i > instance Render Add where > render (Add x y) = pretty x ++ " + " ++ pretty y > instance Render Mul where > render (Mul x y) = pretty x ++ " * " ++ pretty y > instance (Render f, Render g) => Render (f :+: g) where > render (Inl x) = render x > render (Inr y) = render y ** An application to free monads For any functor f, this data structure is a monad, the *free monad* for f: > data Term f a > = Pure a > | Impure (f (Term f a)) It distinguishes a pure value from an impure effect described by f. > instance Functor f => Functor (Term f) where > fmap h (Pure a) = Pure (h a) > fmap h (Impure t) = Impure (fmap (fmap h) t) h :: a -> b result :: Term f a -> Term f b t :: f (Term f a) > instance Functor f => Monad (Term f) where > return x = Pure x > Pure x >>= h = h x > Impure t >>= h = Impure (fmap (>>= h) t) Most monads are not free, for instance, the list and state monads are not. But some well-known monads may be constructed as free monads: > data Zero > data One a = One > data Const e a = Const e > instance Functor One where > fmap f One = One ??? The trick of this construction is that (the functor) Term maps functors to monads. Furthermore, it is left-adjoint to the forgetful functor from monads to functors and thus preserves coproducts. Hence, coproducts of functors (that we just constructed in the previous sections) map to coproducts of monads! Thus, the syntax defined in this way is directly monadic. As an example, consider a calcular with three actions on a memory cell. RECALL: obtain the current value from memory. INCREMENT: increment the value in memory by a given amount. CLEAR: reset memory to zero. > data Incr t = Incr Int t > data Recall t = Recall (Int -> t) > instance Functor Incr where > fmap f (Incr x t) = Incr x (f t) > instance Functor Recall where > fmap f (Recall h) = Recall (f . h) To write terms using this operators, we define smart constructors. > injectT :: (g :<: f) => g (Term f a) -> Term f a > injectT = Impure . inj > incr :: (Incr :<: f) => Int -> Term f () > incr x = injectT (Incr x (Pure ())) > recall :: (Recall :<: f) => Term f Int > recall = injectT (Recall Pure) Now, we can use these operators in a monad. > tick1 :: Term (Recall :+: Incr) Int > tick1 = do y <- recall > incr 1 > return y > tick2 :: (Functor f, Recall :<: f, Incr :<: f) => Term f Int > tick2 = do y <- recall > incr 1 > return y > tick3 :: Term (Incr :+: Recall) Int > tick3 = do y <- recall > incr 1 > return y We write functions over terms with a suitable fold operator. > foldTerm :: Functor f => (a -> b) -> (f b -> b) -> Term f a -> b > foldTerm f alg (Pure a) = f a > foldTerm f alg (Impure t) = alg (fmap (foldTerm f alg) t) To run terms with Incr and Recall, we need to define a data type for the memory and a state monad to interpret terms in. > newtype Mem = Mem Int > deriving Show So that run :: (...) => Term f a -> Mem -> (a, Mem) > class Functor f => Run f where > runAlgebra :: f (Mem -> (a, Mem)) -> (Mem -> (a, Mem)) > instance Run Incr where > runAlgebra (Incr k r) (Mem i) = r (Mem (i+k)) > instance Run Recall where > runAlgebra (Recall r) (Mem i) = r i (Mem i) > instance (Run f, Run g) => Run (f :+: g) where > runAlgebra (Inl r) = runAlgebra r > runAlgebra (Inr r) = runAlgebra r > run :: Run f => Term f a -> Mem -> (a, Mem) > run = foldTerm (,) runAlgebra What did we gain?