| CARVIEW |
Here are the typing rules, written in the usual natural deduction style:
It turns out that these rules are not only enough to provide typechecking but also type inference if you feed them into Prolog, though that’s far from a coincidence. Type inference for STLC is possible, and Prolog was the result of decades of research into finding proofs (in this case a typing – a proof that a given term has a given type) for problems stated in this style. It’s even less of a coincidence because STLC admits principal typings – so all proofs that a term has a given type are equivalent.
Rewriting the Rules
Suppose we deliberately ignore the existance of Prolog (but not of unification, as it’ll turn out) – how can we write the rules to make it more obvious what to do?
Well, the good news is that they’re already syntax-directed. Given a particular term, we can always see which rule to try applying based on its outermost structure. That makes it fairly easy to do typechecking working through a term, though there’s a hidden condition buried in there. Let’s tease that out and see if it helps:
The main difference is in the App rule. The red equality condition wasn’t there before, and aside from it the blue variables only appear above the line once. When we dispatch the conditions in black, we don’t have to worry about them interacting directly, we only have to worry about the red condition. If you’re only checking in a known context then the condition can be shown by a syntactic equality check. The principal typings property should be pretty clear at this point – there’s barely a decision to be made that shows up in a completed proof tree.
Checking for a Checker
Time to code those rules up, then. Starting with datatypes for the syntax:
data Term ty = Var TermIdentifier | App (Term ty) (Term ty) | Lam TermIdentifier ty (Term ty) data ObjType fix = Prim TypeIdentifier | Fun fix fix deriving Show
Term is parametric in the type of type annotations to make it easier to handle type inference. The types use a two-level fixpoint encoding as described here to help reusing the code a little lower down. To build an entire type we use a type-level fixpoint operator, and construct like so:
Fun (Fix $ Prim "Integer") (Fix $ Fun (Fix $ Prim "Integer") (Fix $ Prim "Integer") )
Or equivalently:
(Fix $ Prim "Integer") `Fun` (Fix $ (Fix $ Prim "Integer") `Fun` (Fix $ Prim "Integer"))
This represents Integer -> (Integer -> Integer), and has the type ObjType (Fix ObjType).
Given that, here’s a typechecker:
type OType = ObjType (Fix ObjType) type OEnvironment = Map TermIdentifier OType check :: OEnvironment -> Term OType -> OType check env (Var i) = case lookup i env of Nothing -> error $ "Unbound variable " ++ i Just v -> v check env (App f p) = let t_f = check env f t_p = check env p in case t_f of Fun (Fix t_p') (Fix r) | t_p == t_p' -> r | otherwise -> error "Parameter mismatch" _ -> error "Applied a non-function" check env (Lam i ty t) = let r = check (insert i ty env) t in Fun (Fix ty) (Fix r)
Aside from the Fix constructors, this is entirely standard 1 – so at least my rewrite of the typechecking rules works!
Inference
As for inference, it turns out this more or less solves it too – all we have to do is treat the red condition as a constraint and the syntax-directed rules as a constraint functional program. This can be encoded into the language from the earlier article, but it’s neater not to. If the term checks then there’s a best solution that’s unique up to metavariable naming. This lets us hang on to principal typings, using the typing rules and metavariable assignments 2. We also gain the means to deal with unknown contexts, by inferring the necessary bindings.
Here’s a type inferrer for known contexts:
type MType = MetaType ObjType type MEnvironment = Map TermIdentifier MType infer :: MEnvironment -> Term (MType) -> ConstraintMonad MType infer env (Var i) = case lookup i env of Nothing -> throwError $ "Unbound variable " ++ i Just v -> return v infer env (App f p) = do t_f <- infer env f t_p <- infer env p t_r <- newVar t_f `eqConstraint` O (t_p `Fun` t_r) return t_r infer env (Lam i ty b) = do t_b <- infer (insert i ty env) b return $ O $ ty `Fun` t_b
Here we’ve got error handling pushed into the ConstraintMonad, but no major wins for it in a toy implementation. Constraint handling makes the inferrer itself as easy to write as the typechecker though. The O constructor, taken from the Unification module linked to below, is used to lift object-level terms to the constraint system’s meta level. The inferrer returns meta-level terms – hence Os in the bottom line and in the call to eqConstraint.
Reading for Context
It deviates a little from a straightforward encoding of the typing rules, but it’s also common to use a ReaderT to handle the context like so:
inferReader :: MEnvironment -> Term MType -> ConstraintMonad MType inferReader env t = runReaderT (infer' t) env where infer' (Var i) = do env <- ask case lookup i env of Nothing -> throwError $ "Unbound variable " ++ i Just v -> return v infer' (App f p) = do t_f <- infer' f t_p <- infer' p t_r <- lift newVar lift $ t_f `eqConstraint` (O (t_p `Fun` t_r)) return t_r infer' (Lam i ty b) = do t_b <- local (insert i ty) (infer' b) return $ O $ ty `Fun` t_b
At this scale the Reader doesn’t look like that big a saving at first glance. Most of the context/environment passing is gone, but having to retrieve it to do lookups feels just as bad because it adds a line (much harder to mess up though!). It does lead onto an idea I’m going to save for the next post though – if we’re handling the context in the monad, can we treat it as part of the constraint system and if so can we figure out anything useful from it?
Trekking Through the Tarball
Here’s a quick rundown of what’s in the source that accompanies this post:
- Unification.hs: A slightly simplified, slower version of the unification framework from Generic Unification via Two-Level Types and Parameterized Modules, including some minimal support for error reporting.
- STLCLang.hs: Datatypes for STLC’s syntax, a helper class (
IsType) for building polymorphic type values that work at object or meta level, and a pretty printer for type values. - STLCChecker.hs: The typecheckers from above, plus the constraint monad the inferrers run in. Use
runSubstPrettyto invoke the checkers and pretty print the result. - STLCTerms.hs: A toy environment and some test terms to check.
Links
- Accompanying source
- What Are Principal Typings and What Are They Good For?
- The Essence of Principal Typings
-
The
lookupcall is legit due to the structural rules which are left implicit
-
Traditionally, an unconstrained metavariable becomes a type variable, which has much the same role as a primitive type in STLC

data Term = Term !Label ![Term]
I handled the meta level in a manner that isn’t particularly typeful:
data Term = Term !Label ![Term] | Metavar Integer
That’s okay for a quick and dirty solver, but there are better options if you want a bit more safety!
quick note: I leave the !s out in all these representations because the one in ![Term] doesn’t actually do that much good (it only achieves head-strictness) so I figured it’s easier to stick to the fast and loose approach
Just use two types
data ObjectTerm = ObjectTerm Label [ObjectTerm] data MetaTerm = MetaTerm Label [MetaTerm] | Metavar Integer liftToMeta :: ObjectTerm -> MetaTerm liftToMeta (ObjectTerm l cs) = MetaTerm l (map liftToMeta cs) concretise :: MetaTerm -> Maybe ObjectTerm concretise (MetaTerm l cs) = return . ObjectTerm l =<< mapM concretise cs concretise Metavar{} = Nothing
This one’s obvious enough. Given only Haskell 2010 type classes to abstract across the equivalents of the Term constructor this can be rather painful to work with because your polymorphic constructor is no good for pattern matching. With view patterns or the upcoming pattern synonyms things become more comfortable though.
data TermView children = Term Label [children] | NotTerm class Term t where term :: Label -> [t] -> t matchTerm :: t -> TermView t instance Term ObjectTerm where term = ObjectTerm matchTerm (ObjectTerm l cs) = Term l cs instance Term MetaTerm where term = MetaTerm matchTerm (MetaTerm l cs) = Term l cs matchTerm _ = NotTerm add :: (Term l, Term r) => l -> r -> r add (matchTerm -> Term "Z" []) r = r add (matchTerm -> Term "S" [n]) r = term "S" $ [add n r]
This example only uses view patterns, but pattern synonyms would allow patterns that’re indistinguishable from ordinary ones. Note the type of add though (equivalent to the inferred one) – it can lead to unexpected annotation requirements.
GADTs
One of the great things about GADTs is that you can use phantom types to mark out what might otherwise be subtypes of the GADT – and pattern-matching to perform the equivalent of a safe cast. The bad news is that they don’t play quite so nicely with the rose tree representation: you can’t prove anything without a separate leaf constructor, which makes a bit of a mess of the original code.
data Concrete data Meta data Term a where Leaf :: Label -> Term a Branch :: Label -> [Term a] -> Term a Metavar :: Integer -> Term Meta concretise :: Term a -> Maybe (Term Concrete) concretise (Leaf l) = return $ Leaf l concretise (Branch l cs) = return . Branch l =<< mapM concretise cs concretise Metavar{} = Nothing
The concretise function attempts to prove that a Term can be Concrete via pattern-matching – Metavar is always Term Meta, which effectively taints the entire term it appears in. The concretise function still ends up rebuilding all the Branch nodes though.
An alternative type for concretise is Term a -> Maybe (Term b), and if we’re comfortable with rank-n types we can use (forall a. Term a) as an equivalent to Term Concrete that’s readily usable as Term Meta as well. This means we already get liftToMeta for free if we keep types polymorphic until we care, but failing that it’s just a tree traversal rebuilding the same term at a different type.
Quick and Dirty Optimisation
The version of concretise above rebuilds an entire term just to change its type, but GHC already has the same underlying representations for them. I don’t know a good way of carrying coercions to Concrete or a type variable around, but we do know something the compiler doesn’t – and this means we can cheat.
import Unsafe.Coerce concretise :: Term a -> Maybe (Term Concrete) concretise l@Leaf{} = return $ unsafeCoerce l concretise b@(Branch l cs) = return $ unsafeCoerce b =<< mapM_ concretise cs concretise Metavar{} = Nothing
Two level types
There’s an alternative approach using two mutually recursive datatypes, which when generalised are parametric in each other. One type handles the per-node structure of a term, while the other handles the recursive structure – and via polymorphism, any additional structure that doesn’t belong to object terms.
data Term wrapper = Term Label [wrapper] newtype Fix a = Fix { unfix :: (a (Fix a)) } data Meta object = O (object (Meta object)) | Metavar Integer liftToMeta :: Fix Term -> Meta Term liftToMeta (Fix (Term l cs)) = O $ Term l (map liftToMeta cs) concretise :: Meta Term -> Maybe (Fix Term) concretise (O (Term l cs)) = return . Fix . Term l =<< mapM concretise cs concretise Metavar{} = Nothing
Fix is a type-level fixpoint operator – so it provides recursive structure in types polymorphic in their own children, and nothing else. The O constructor in Meta has much the same structure, but Meta also handles metavariables. As Fix and Meta have completely different internal representations, liftToMeta and concretise both require rebuilding the entire term again.
This approach has the major advantage that it starts to enable a degree of generic programming – we can reuse Fix and Meta for a variety of different object node types. The classic example is Tim Sheard‘s functional pearl on generic unification, which is implemented in the unification-fd package.
It may also be appropriate to use type classes etc much as in the initial example – possibly using one class generalising Term and a second one generalising Fix and Meta, such that an application of the Fix/Meta generalisation is also a generalisation of Term. I haven’t tried it so far though, and it’s possible that the increased annotation burden isn’t worth it.
Chaining Types and the Expression Problem
Having gone as far as two level types, we can try a refactoring:
data ObjectTermNode fix = Term Label [fix] data MetaTermNode chain fix = O (chain fix) | Metavar Integer newtype Fix a = Fix { unfix :: (a (Fix a)) } type ObjectTerm = Fix ObjectTermNode type MetaTerm = Fix (MetaTermNode ObjectTermNode)
Now the node info for metaterms is also separated from the recursive structure. This generalises to as many different node types as we like, though the collection of constructors used to pass up the chain will grow – clearly this cries out for a type class to match each node type. I’m hoping that with ConstraintKinds someone’ll figure out a good way to write instances that generate the chaining behaviour with only a linear instance count too, but as I’m sticking with the Haskell Platform at present I’m not able to have a go myself yet.
This solution has the advantange of generalising to the Expression Problem: we can add new node types (with corresponding type classes) and new functions reasonably orthogonally. Similarly, other solutions to the Expression Problem are solutions to the problem discussed in this post.
]]>
Finding and Fixing the Mistake
The centrepiece of the article is an embedded language, which I’d initially written as just an applicative functor and quickly decided needed more power so I added a bind-like construct as well. The functor at its heart only operates on a subcategory of that of Haskell values – defined by a typeclass called Embeddable:
class Embeddable a where subst :: Assignment -> a -> a instance Embeddable () where subst _ v = v instance Embeddable Term where subst = substTerm instance (Embeddable a, Embeddable b) => Embeddable (a -> b) where subst = const id
So we’ve got two ground types in the category – () and Term – and arbitrary-order functions on them. What we haven’t got is the target of the functor, Embeddable a => Code a – which means that while the Bind constructor types fine and so does its implementations, we can’t write the join function that provides the difference between an applicative functor and a monad in Haskell. We can pass around higher-order Haskell functions, but our embedded language is strictly first-order!
Oops? We can fix this by adding a new instance:
instance Embeddable a => Embeddable (Code a) where subst _ c = c
What made this mistake easy to make, of course, was not having a Monad-like typeclass with a defaulted join I was writing an instance for. That said, I did discover something new to me: I can have a "first-order-only monad" with the freedom of binding structure that comes with it. If I’m willing to invoke the relevant GHC flags, it can even play along with do notation.
What Did We Have Before?
This new structure is interesting – >>= still doesn’t admit self-analysis, but if I just want to know that there’s no passing around of computations happening in an EDSL then it’s a lot more pleasant to use than an Arrow. The first piece of good news is that there’s already somewhere for it to fit in on HackageDB – the RMonad package. Of course, this implies that not every RMonad is a monad.
For something with more theory developed, I believe it’s a Haskell encoding of a Relative Monad. A typeclass for them in Haskell using ConstraintKinds can be found in this draft paper: Subcategories in Haskell.
]]>Some basic definitions
Constraint programming isn’t a paradigm itself, but it can be viewed as a "paradigm transformer" – there are such things as Constraint Logic Programming, Constraint Functional Programming and Constraint Imperative Programming. What distinguishes them from their underlying paradigms is the ability to work with constraints on some kind of data, usually to either find a solution to those constraints or check whether such a solution can exist. Time to define a few terms (warning: I may be using them a little idiosyncratically or insufficiently generally!)
- Constraint Domain: a specific setting for constraint problems. A type or types of data that may be constrained and the constraints (possibly: primitive constraints) that are allowed on them.
- Constraint: a predicate that may or may not be satisfied by the values it constrains.
- Constraint Problem: a specific problem within a constraint domain, with specific constraints.
- Constraint Space: an abstract space defined by the variables in a constraint problem. For example, given three real numbers x, y and z and inequalities on them, the canonical constraint space for them is
. Inequalities ‘cut’ the space in half, one side satisfiable and the other unsatisfiable – how’s that for intuition? In some constraint domains, there may not be an easily-envisageable constraint space in that the set or universe of potential solutions may not be particularly spatial, or there may not be a space that’s particularly intuitive to work with. Depending on your background we’ll be dealing with one of those in this article.
There are further concepts to come, but I’m going to leave those until they come up – time for a worked example.
Behold my domain!
Okay, first let’s pick a constraint domain. I’m going to go for a classic: equality on the Herbrand universe of terms. Logic programming fans will recognise this as the implicit constraint domain of conventional logic programming ala Prolog. A datatype:
data Term = Term !Label ![Term] deriving Show
Note the strictness: we only want to deal with finite terms. Label can be anything that provides an infinite alphabet, equality and is a member of Show. Now we need terms with variables so we can start talking about non-trivial constraints. There are more typeful ways of doing this, but I’m going to go with modifying Term to this:
data Term = Term !Label ![Term] | Metavar Integer deriving Show
Why Metavar? Well, the language of constraints-on-Terms-with-variables is a metalanguage for talking about our original Terms – so those variables are metavariables, not variables of the Term language per se. This is also why there are more typeful ways to handle the problem: the code above doesn’t have a type of concrete Terms without metavariables in them any more. It’ll do for today though. If we have variables, we’re going to need some substitution machinery for them:
type Assignment = Map Integer Term substTerm a (Term l cs) = Term l $ map (substTerm a) cs substTerm a m@(Metavar i) = case lookup i a of Nothing -> m Just t -> substTerm a t
Astute readers will notice that an assignment is equivalent to a special case of a set of equality constraints where one side is a lone metavariable. It’s important that we don’t allow circular references into our Assignments.
It’s all Greek to the machine
So now we’ve got that much, let’s build ourselves an EDSL! You may ask where the constraints are: hold that thought for now. Instead we’re going to aim for an initial implementation that solves all constraints as it goes ala Prolog. I’d also like the language to automatically substitute in solutions as it goes so that when you run it you get out a concrete solution straight away. That means we need this typeclass and the following instances:
class Embeddable a where subst :: Assignment -> a -> a instance Embeddable () where subst _ v = v instance Embeddable Term where subst = substTerm instance (Embeddable a, Embeddable b) => Embeddable (a -> b) where subst = const id
Why Embeddable? These are the Haskell values we’re going to allow embedded into our own language, itself embedded back into Haskell – think Applicative‘s pure or Monad‘s return. Most of the EDSLs we think to grace with the name are actually mutually embedded with their host language to some extent – it’s just so convenient to steal from! I’m a fan of deep embeddings when I’m explaining and/or working on the semantics of an EDSL, so syntax next:
data Code a where Pure :: Embeddable a => a -> Code a Ap :: Embeddable b => Code (a -> b) -> Code a -> Code b Bind :: (Embeddable a, Embeddable b) => Code a -> (a -> Code b) -> Code b NewVar :: Code Term Equate :: Term -> Term -> Code Term
If you didn’t see this coming, we’re effectively building an applicative functor and a monad – only on a subcategory of haskell’s defined by Embeddable, instead of on the entire of it. Why do I have both Ap and Bind, analogues of <*> and >>=? Mostly historical accident, but I also wanted to highlight a design decision – different levels of power are appropriate to different constraint problems, as we’ll see with one implementation below. NewVar is used to generate fresh metavariables as we can’t do it automatically, whereas Equate is used to assert equality constraints. At this point we’re ready for multiple implementations of the language’s semantics. Here’s one for starters:
runOnFly :: Assignment -> Integer -> Code a -> Either UnificationError (a, Assignment, Integer) runOnFly a v (Pure t) = return (subst a t, a, v) runOnFly a v (Ap f p) = do (rf, a', v') <- runOnFly a v f (rp, a'', v'') <- runOnFly a' v' p return (subst a'' (rf rp), a'', v'') runOnFly a v (Bind p f) = do (rp, a', v') <- runOnFly a v p runOnFly a' v' (f $ subst a' rp) runOnFly a v NewVar = return (Metavar v, a, v+1) runOnFly a v (Equate l r) = case unify a l r of Left e -> Left e Right (u,a') -> return (u, a', v)
The parameters a and v are the current Assignment containing the solution-to-date during runtime and the first free metavariable counting from 0, both of which are handled as if they’re state. The implementation uses the Either monad, as it’s entirely possible for a new constraint to be unsatisfiable given the current Assignment and make the problem the program describes impossible to solve. Arguably I should’ve newtyped up a couple of StateTs on top of Either – if only so I can’t mess up the state handling! NewVar‘s case uses simple addition, whereas Equate calls a unification algorithm on the two Terms in the context of the current Assignment. Unification’s a standard way to solve this kind of constraint, but it’s pretty tedious so I won’t list the code here – see the bottom for a source file including it as well as everything else in this article. Oh yes, I almost missed it in amongst detailing the obvious: check out the call to subst in the Pure, Ap and Bind cases! This is what we gave up our Monad and Applicative instances for, but it’s kinda neat, huh? GHC 7.4’s new ConstraintKinds feature gives us a good way to combine typeclass-defined subcategories with instances of other typeclasses – hopefully we’ll see appropriate generalisations of Functor and the rest standardised on Hackage soon so we can perform such tricks freely.
The decomposing remains of a problem?
There’s another possible set of semantics for this language though: rather than solving constraints as we go, we could just use the language for constraint generation and let the user pass the resulting problem to satisfiability checkers, solvers and other analyses as appropriate. It’s a lot easier to "debug" the satisfiability of a constraint problem without having to deal with the program that generated it as well, so this is an idea with genuine merit. We’re going to have to talk about constraints explicitly in the code now:
data Constraint = Term := Term deriving Show
I’m going to be lazy and use lists as a representation for constraint sets. We don’t have to use sets at all though – if we treat equality constraints as predicates that are true or false for specific concrete assignments to all the metavariables, we can also treat the entire constraint problem as one big constraint – the conjunction of all the constraints that would otherwise form the constraint set. To go a step further, constraint domains are (or correspond to) intentionally-limited logics! This is a theme I’ll be using in later posts. Code time!
runGathering :: Integer -> Code a -> (a, Integer, [Constraint]) runGathering v (Pure t) = (t, v, []) runGathering v (Ap f p) = let (rf, v', csf) = runGathering v f (rp, v'', csp) = runGathering v' p in (rf rp, v'', csf ++ csp) runGathering v (Bind p f) = let (rp, v', cs) = runGathering v p (rf, v'', cs') = runGathering v' (f rp) in (rf, v'', cs ++ cs') runGathering v (NewVar) = (Metavar v, v + 1, []) runGathering v (Equate l r) = (l, v, [l := r])
This time around, I should’ve used a StateT for the metavariable counter and a WriterT to collect Constraints in. This isn’t just a new operational behaviour for the same language, arguably it’s a different language. Why? Because Bind and Ap see different results than they would with runOnFly – they don’t get to observe the results of solving.
Decision time!
Prolog fans will be wondering something by now. Our runOnFly implementation might smell a little bit familiar, but it can’t branch – where’s the disjunction? Well, there’s no reason we can’t have it and still be doing constraint programming, so here goes:
data Code a where ... Or :: Embeddable a => Code a -> Code a -> Code a ... runOnFly a v (Or l r) = case (runOnFly a v l) of r@Right{} -> r Left{} -> runOnFly a v r
Nothing special here – except that runGathering can’t handle Or because it doesn’t do any solving to branch on. This suggests a new language feature and a third language implementation – why not leave when to solve up to the program?
data Code a where ... Solve :: Code () ... data ConstraintSet = CS {assumptions :: Assignment, constraints :: [Constraint]} deriving Show emptyConstraintSet = CS Data.Map.empty [] solveConstraints :: ConstraintSet -> Either UnificationError ConstraintSet solveConstraints (CS a cs) = solveConstraints' a cs where solveConstraints' a [] = return (CS a []) solveConstraints' a ((l := r):cs) = do (u, a') <- unify a l r solveConstraints' a' cs
What’s with ConstraintSet? Well, it’s partitioned into an Assignment of solved "constraints" and a list of unsolved ones. When we call solveConstraints, it attempts to ‘move’ the unsolved ones into the solved set by unifying terms and seeing how that changes the Assignment. Feel free to complain at my use of general recursion!
run :: ConstraintSet -> Integer -> Code a -> Either UnificationError (a, ConstraintSet, Integer) run cs v (Pure t) = return (subst (assumptions cs) t, cs, v) run cs v (Ap f p) = do (rf, cs', v') <- run cs v f (rp, cs'', v'') <- run cs' v' p return (subst (assumptions cs'') (rf rp), cs'', v'') run cs v (Bind p f) = do (rp, cs', v') <- run cs v p run cs' v' (f $ subst (assumptions cs') rp) run cs v NewVar = return (Metavar v, cs, v+1) run cs v (Equate l r) = let cs' = CS (assumptions cs) (l := r : constraints cs) in return (subst (assumptions cs) l, cs', v) run cs v (Or l r) = case (run cs v l) of result@Right{} -> result Left{} -> run cs v r run cs v Solve = do cs' <- solveConstraints cs return ((), cs', v)
Bet you’re wishing I was using monad transformers by now, huh? Now we pass along a ConstraintSet in a state-like fashion. The constraints half of it behaves in a State-like fashion throughout – this is a design decision, we could have made Solve take a computation as a parameter and only solved the constraints generated inside it instead. In fact, either way round you can find use cases that the decision breaks. I’m not going to implement further here, but clearly there are cases where you might want to both pass constraint problems around first-class (which we get for free) and access the current constraint problem/set (another command for our language).
A little test
Finally, we’d better find some problems to solve, hadn’t we?
{- Test code - let's do a little bit of boring Peano arithmetic! No logic programming here, the constraints are just pattern-matching. Feel free to write it the other way if you can stomach the awful syntax. -} solve t = t `Bind` \result -> (Solve `Bind` (const $ Pure result)) {- Peano numbers, matching, addition -} succ t = Term "S" [t] zero = Term "Z" [] integerToNatTerm :: Integer -> Term integerToNatTerm i | i > 0 = succ (integerToNatTerm (i - 1)) | i == 0 = zero | otherwise = error "integerToNatTerm: negative parameter" natTermToInteger :: Term -> Integer natTermToInteger (Term "Z" []) = 0 natTermToInteger (Term "S" [n]) = 1 + natTermToInteger n natTermToInteger _ = error "natTermToInteger: parameter isn't a peano natural" matchZero t = solve $ Equate t zero matchSucc t = NewVar `Bind` (\n -> solve (Equate t (succ n)) `Bind` (const $ Pure n) ) add l r = (matchZero l `Bind` (const $ Pure r)) `Or` (matchSucc l `Bind` \l' -> add l' (succ r) )
All that code just to do a little lousy addition?! I’m afraid so. Evidence that it works:
*Herbrand> let t = add (integerToNatTerm 3) (integerToNatTerm 4) *Herbrand> run emptyConstraintSet 0 (solve t) Right (Term "S" [Term "S" [Term "S" [Term "S" [Term "S" [Term "S" [Term "S" [Term "Z" []]]]] ]]], CS {assumptions = fromList [(0,Term "S" [Term "S" [Term "Z" []]]), (1,Term "S" [Term "Z" []]), (2,Term "Z" [])], constraints = []}, 3)
You can thank me for the manually added layout later. Feel free to feed the result term into natTermToInteger to confirm that it is indeed 7. If we’d written our addition in a Prolog-like style, asserting a relation rather than solving a function, that would have arguably embedded something interesting: a non-primitive constraint. Admittedly not a particularly strong argument given that the constraint system itself not only doesn’t have an entity matching the addition but also doesn’t have the power to encode addition without the externally-imposed branching of our program. But it’s a nice idea for other constraint systems, no?
Closing up
So that’s constraint programming for you – you take a metametalanguage to write programs about problems in a metalanguage about the data you actually care about, leaving you with a 200 line source file that can tell you 3+4=7! Which makes Haskell a metametametalanguage because we used it as a host – but don’t worry about that, it got infinitely meta the moment GHC existed. Incidentally, if that’s the first time this article made your head asplode then just take a breather – the world around us is far more complicated, I’m just being a little facetious about pointing it out and you’re doing just fine. Thankfully all this metaness doesn’t actually leave us with nothing but our heads forming a recursive knot. Constraint problems are genuine serious business – type systems might be my application, but many will’ve been solved in manufacturing pretty much everything you buy. And perhaps when you did your budget to buy it with too. Send more money?
So what did we learn? We learned about constraint domains (of which there are many and I covered exactly one), constraint problems, constraint spaces and constraints themselves. We learned that a problem may or may not be satisfiable. We didn’t really talk enough about solving them, but we know it’s about finding values for all those variables. We also learned that constraints can be treated as little logics – or not so little. Most importantly we learned that all this stuff exists and (check the links) has been worked on heavily – that there’s something worth learning about! Hopefully this’s been worth your time – catch you around.
Links&References
- Herbrand.hs – Haskell source file for everything in this article, plus all the bits I skipped
- Monadic Constraint Programming package. I haven’t tried this myself, but I was impressed by the talk at Anglohaskell – well worth a look, and perhaps a quick search for the related papers.
- Wikipedia articles:
- Concepts, Techniques, and Models of Computer Programming by Peter Van Roy and Seif Haridi is an excellent book in general and has a good introduction to constraint programming. I don’t entirely agree with their notion of programming paradigms (it rules out half of why I’m interested in pure functional languages for example), but it’s certainly an advancement.
- The Haskell implementation Helium is designed for beginners and aims to produce better error messages. The team’s research can be found here, and takes a constraint-based approach.
- The HM(X) type system extends Hindley-Milner type inference by parameterising it on a constraint domain – examples include dimension analysis for physics and extensible records. There’s a chapter on it in ATTAPL, with an extended version available here.
I’ve not been too active with such things the last few years due to quite a collection of personal circumstances, but I’ve definitely got things to say if only I can get round to them. I don’t have a lot of in-person contact prodding me about such things, so a little online cheerleading would be a big help – hopefully I’ll have more up soon.
]]>

