parsonsmatt.org https://www.parsonsmatt.org

The Subtle Footgun of TVar (Map _ _)

<blockquote> How coarse-grained STM containers can livelock under load </blockquote>
<ul> <li>Edit history: <ul> <li>12-19: <a href="https://discourse.haskell.org/t/the-subtle-footgun-of-tvar-map/13429/3">@teofilC on the Haskell Discourse</a> pointed out a case where <code class="language-plaintext highlighter-rouge">TVar (Map _ _)</code> may be appropriate - I’ve modified the post to incorporate this.</li> </ul> </li> </ul>
Software Transactional Memory (STM) is one of Haskell’s crown jewels. The promise is easy, lock-free concurrency with guaranteed transactional semantics and great performance. Used correctly, you get all of these benefits. However, concurrency is fundamentally difficult, and STM has a failure mode: livelock. Livelock is when the program is repeatedly retrying transactions - working furiously and making no progress. Livelock is subtle and extremely difficult to diagnose. The symptom of livelock is often simply “performance is really bad and seems to get worse with more concurrency.” Avoiding livelock proactively will pay massive dividends in the performance of your concurrent systems, as well as in the time spent diagnosing and fixing them later on.
<h1 id="avoiding-livelock-on-shared-containers">Avoiding Livelock on Shared Containers</h1>
A common cause of livelock is fortunately discoverable with simple text search:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>uhoh :: TVar (Map k (TVar v)) -> STM ()
oops :: TVar (HashMap k v) -> STM ()
ohno :: TVar (Set v) -> STM ()
</code></pre></div></div>
A <code class="language-plaintext highlighter-rouge">TVar</code> to a container data structure, like <code class="language-plaintext highlighter-rouge">Map k v</code> or <code class="language-plaintext highlighter-rouge">Set a</code> or <code class="language-plaintext highlighter-rouge">[a]</code> or <code class="language-plaintext highlighter-rouge">Seq a</code>, is a common source of livelock. Any change to the <code class="language-plaintext highlighter-rouge">Map</code> invalidates any transaction that is reading from that <code class="language-plaintext highlighter-rouge">Map</code>, even if it is reading at a totally different key or value. If the <code class="language-plaintext highlighter-rouge">Map</code> is being concurrently updated, you are nearly guaranteed to run into performance problems. Fortunately, you can easily avoid this pattern, thanks to the <code class="language-plaintext highlighter-rouge">stm-containers</code> library.
<h1 id="stm-containers"><code class="language-plaintext highlighter-rouge">stm-containers</code></h1>
The <a href="https://nikita-volkov.github.io/stm-containers/"><code class="language-plaintext highlighter-rouge">stm-containers</code></a> package was released over ten years ago.
<code class="language-plaintext highlighter-rouge">stm-containers</code> allows you to have a structure similar to a <code class="language-plaintext highlighter-rouge">TVar (Map k v)</code> or <code class="language-plaintext highlighter-rouge">TVar (Map k (TVar v))</code> with the added benefit that it actually scales up with increased concurrency, avoiding the dreaded livelock. Modifications to the map impact a significantly smaller portion of the map structure, so you’re way less likely to run into livelock from concurrent updates. Switching is really easy. The API is mostly the same as the <code class="language-plaintext highlighter-rouge">containers</code> library, but modifications are effectful in <code class="language-plaintext highlighter-rouge">STM ()</code> instead of returning the new data structure.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>StmMap.lookup :: (Hashable k) => k -> StmMap.Map k v -> STM (Maybe v)
HashMap.lookup :: (Hashable k) => k -> HashMap k v -> Maybe v

StmMap.insert
    :: (Hashable k)
    => v -- note that the value is first, not the key
    -> k
    -> StmMap.Map k v
    -> STM ()
HashMap.insert :: (Hashable k) => k -> v -> HashMap k v -> HashMap k v
</code></pre></div></div>
Here’s an example of using <code class="language-plaintext highlighter-rouge">stm-containers</code> in practice, compared to a <code class="language-plaintext highlighter-rouge">TVar Map</code>.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Control.Concurrent.STM (STM, TVar, readTVar, writeTVar)
import qualified Data.Map as Map
import qualified StmContainers.Map as StmMap

-- old: every call reads and rewrites the whole Map behind one TVar
doStuff :: TVar (Map.Map String Int) -> String -> STM Int
doStuff tmap str = do
    map <- readTVar tmap
    let mval = Map.lookup str map
    let newVal = maybe 0 (+1) mval
    writeTVar tmap (Map.insert str newVal map)
    pure newVal

-- new: only the part of the structure holding this key is touched
doStuff :: StmMap.Map String Int -> String -> STM Int
doStuff tmap str = do
    mval <- StmMap.lookup str tmap
    let newVal = maybe 0 (+1) mval
    StmMap.insert newVal str tmap
    pure newVal
</code></pre></div></div>
The library also offers an interesting <code class="language-plaintext highlighter-rouge">focus</code> function, which allows you to combine operations at a single key into a single operation. If you’re sharing a <code class="language-plaintext highlighter-rouge">Map</code> or <code class="language-plaintext highlighter-rouge">Set</code>-like container across many threads, then you almost certainly want to just use <code class="language-plaintext highlighter-rouge">stm-containers</code>. If you stop reading here, and merely use <code class="language-plaintext highlighter-rouge">stm-containers</code> by default, then you’ll avoid a lot of pain without having to invest much time or energy. <code class="language-plaintext highlighter-rouge">stm-containers</code> is faster and safer than a <code class="language-plaintext highlighter-rouge">TVar (Map k v)</code>. <a href="https://github.com/MercuryTechnologies/hs-temporal-sdk/pull/279">In this PR to the <code class="language-plaintext highlighter-rouge">hs-temporal-sdk</code> library</a>, I mechanically translated a few <code class="language-plaintext highlighter-rouge">TVar (HashMap _ _)</code> to <code class="language-plaintext highlighter-rouge">stm-containers</code> datatypes.
We were observing degraded performance with increased concurrency, in some cases causing an ordinary 10 minute test suite run to time out after 50 minutes. Switching from <code class="language-plaintext highlighter-rouge">TVar (HashMap _ _)</code> to <code class="language-plaintext highlighter-rouge">stm-containers</code> completely fixed the problem - performance remained consistent with increased concurrency.
<h1 id="when-should-i-not-use-stm-containers">When should I not use <code class="language-plaintext highlighter-rouge">stm-containers</code>?</h1>
By default, you should use <code class="language-plaintext highlighter-rouge">stm-containers</code>. It is very fast, simple to use, and has no downsides (aside from the downsides inherent to <code class="language-plaintext highlighter-rouge">STM</code> vs <code class="language-plaintext highlighter-rouge">MVar</code> or <code class="language-plaintext highlighter-rouge">IORef</code> based shared references). A <code class="language-plaintext highlighter-rouge">TVar (Map k v)</code> isn’t always going to be dangerous. And that’s part of the problem - there are some places where <code class="language-plaintext highlighter-rouge">TVar (Map k v)</code> won’t be slow or break. These rely on non-local assumptions about your code structure - you cannot enforce this easily without writing complicated wrappers that enforce encapsulation around how the shared reference is used. Writing code that works great with a <code class="language-plaintext highlighter-rouge">TVar (Map k v)</code> in the small is quite easy, but guaranteeing that code won’t break as it scales is challenging. There are many situations where going from <code class="language-plaintext highlighter-rouge">TVar (Map k v)</code> to <code class="language-plaintext highlighter-rouge">StmMap.Map k v</code> will make a huge improvement.
There is one known situation where an <code class="language-plaintext highlighter-rouge">StmMap.Map k v</code> may run into a significant performance problem - if you need to get a snapshot of the <code class="language-plaintext highlighter-rouge">Map</code> atomically, <code class="language-plaintext highlighter-rouge">StmMap.Map k v</code> requires reading <code class="language-plaintext highlighter-rouge">O(n log n)</code> <code class="language-plaintext highlighter-rouge">TVar</code>s, and <a href="https://gitlab.haskell.org/ghc/ghc/-/issues/24410">GHC currently is quadratic in the count of <code class="language-plaintext highlighter-rouge">TVar</code>s in a transaction</a>. So if you find yourself doing <code class="language-plaintext highlighter-rouge">listT :: StmMap.Map k v -> ListT STM (k, v)</code>, then you may want to consider an alternative structure.
<h1 id="what-about-ioref-map-k-v">What about <code class="language-plaintext highlighter-rouge">IORef (Map k v)</code>?</h1>
An <code class="language-plaintext highlighter-rouge">IORef (Map k v)</code> eliminates the possibility of livelock. However, the <code class="language-plaintext highlighter-rouge">IORef</code> structure is not suitable for many writers. To ensure a consistent view of the <code class="language-plaintext highlighter-rouge">Map</code>, you need to use <code class="language-plaintext highlighter-rouge">atomicModifyIORef</code>. While a thread is doing an <code class="language-plaintext highlighter-rouge">atomicModifyIORef</code>, all other writes to the reference are blocked. This means that an <code class="language-plaintext highlighter-rouge">IORef (Map k v)</code> is suitable if there are relatively few writes to the <code class="language-plaintext highlighter-rouge">IORef</code>, and the <code class="language-plaintext highlighter-rouge">IORef</code> write is mostly a complete replacement.
If you have many writers doing small writes, then threads will queue up and block on updating the entire <code class="language-plaintext highlighter-rouge">Map</code>. In briefer code snippets, <ul> <li>Good: <code class="language-plaintext highlighter-rouge">atomicModifyIORef' ref (\oldMap -> (Map.union newMap oldMap, ()))</code></li> <li>Bad: <code class="language-plaintext highlighter-rouge">atomicModifyIORef' ref (\oldMap -> (Map.insert k v oldMap, ()))</code></li> </ul> <code class="language-plaintext highlighter-rouge">atomicModifyIORef</code> also requires the updating action to be pure. If you need the modification to be effectful, and have a consistent view of the data, then you need to use either <code class="language-plaintext highlighter-rouge">stm-containers</code> or an <code class="language-plaintext highlighter-rouge">MVar</code>. <a href="https://github.com/parsonsmatt/prometheus-haskell/pull/1">In this PR to the <code class="language-plaintext highlighter-rouge">prometheus-haskell</code></a> library, I demonstrate that an <code class="language-plaintext highlighter-rouge">IORef (Map k v)</code> has significantly worse performance than an <code class="language-plaintext highlighter-rouge">stm-containers</code> <code class="language-plaintext highlighter-rouge">Map k v</code> as concurrency scales up. This change made a big difference to the performance of our metric collection code. <code class="language-plaintext highlighter-rouge">IORef (Map k v)</code> is not suitable there because there are many frequent writes that are only concerned with a single <code class="language-plaintext highlighter-rouge">k</code>, and reading is done non-atomically - the value of <code class="language-plaintext highlighter-rouge">k0</code> changing does not impact the value of <code class="language-plaintext highlighter-rouge">k1</code>.
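To make the "Good" case above concrete, here is a minimal sketch of a batched write to an <code class="language-plaintext highlighter-rouge">IORef (Map k v)</code>. The helper name <code class="language-plaintext highlighter-rouge">mergeBatch</code> and the sample data are my own assumptions for illustration; only <code class="language-plaintext highlighter-rouge">atomicModifyIORef'</code> from <code class="language-plaintext highlighter-rouge">base</code> and <code class="language-plaintext highlighter-rouge">Map.union</code> from <code class="language-plaintext highlighter-rouge">containers</code> are real library calls.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import qualified Data.Map.Strict as Map

-- Hypothetical helper (not from the post): merge a whole batch in one
-- atomic step. Map.union is left-biased, so entries in the batch win
-- over entries already in the map.
mergeBatch :: Ord k => IORef (Map.Map k v) -> Map.Map k v -> IO ()
mergeBatch ref batch =
  atomicModifyIORef' ref (\oldMap -> (Map.union batch oldMap, ()))

main :: IO ()
main = do
  ref <- newIORef (Map.fromList [("a", 1), ("b", 2)])
  -- One write for the whole batch instead of one write per key.
  mergeBatch ref (Map.fromList [("b", 20), ("c", 30 :: Int)])
  readIORef ref >>= print . Map.toList
  -- prints [("a",1),("b",20),("c",30)]
```

The point of the sketch is the shape of the write, not the helper itself: writers touch the reference rarely and replace a lot at once, rather than contending on it once per key.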
<h1 id="what-about-mvar-map-k-v">What about <code class="language-plaintext highlighter-rouge">MVar (Map k v)</code>?</h1>
An <code class="language-plaintext highlighter-rouge">MVar (Map k v)</code> avoids the problem of livelock, but introduces another potential problem: deadlock. An <code class="language-plaintext highlighter-rouge">MVar</code> is a locking concurrency mechanism, where you can block other threads by calling <code class="language-plaintext highlighter-rouge">takeMVar</code> and making the <code class="language-plaintext highlighter-rouge">MVar</code> empty. This allows updating threads to lock the value, update the <code class="language-plaintext highlighter-rouge">MVar</code>, and unlock the value. However, if your code fails to fill up an <code class="language-plaintext highlighter-rouge">MVar</code>, or if two threads are waiting on mutually held <code class="language-plaintext highlighter-rouge">MVar</code>s, then your code cannot make progress, and will be deadlocked. One advantage an <code class="language-plaintext highlighter-rouge">MVar</code> has over a <code class="language-plaintext highlighter-rouge">TVar</code> is fairness. Threads that attempt to read or take from an <code class="language-plaintext highlighter-rouge">MVar</code> are enqueued, and guaranteed to operate in a first-in-first-out manner. <code class="language-plaintext highlighter-rouge">TVar</code>, on the other hand, will start every thread at once, and the first one to complete wins while everyone else must retry. If you have multiple <code class="language-plaintext highlighter-rouge">MVar</code> variables that you want to coordinate on, then you are increasing the risk of deadlock.
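As a rough illustration of the lock, update, unlock cycle described above, here is a small sketch using <code class="language-plaintext highlighter-rouge">modifyMVar_</code> from <code class="language-plaintext highlighter-rouge">Control.Concurrent.MVar</code>. The <code class="language-plaintext highlighter-rouge">recordHit</code> helper and the key names are hypothetical. Unlike the pure function required by <code class="language-plaintext highlighter-rouge">atomicModifyIORef</code>, the update here runs in <code class="language-plaintext highlighter-rouge">IO</code> while the lock is held.

```haskell
import Control.Concurrent.MVar (MVar, modifyMVar_, newMVar, readMVar)
import qualified Data.Map.Strict as Map

-- Hypothetical helper (not from the post): bump a per-key counter and
-- log the new value while holding the lock. modifyMVar_ takes the MVar,
-- runs the action, and puts the result back; if the action throws, the
-- original contents are restored, so the MVar is not left empty.
recordHit :: MVar (Map.Map String Int) -> String -> IO ()
recordHit var key =
  modifyMVar_ var $ \m -> do
    let m' = Map.insertWith (+) key 1 m
    putStrLn ("hits for " ++ key ++ ": " ++ show (m' Map.! key))
    pure m'

main :: IO ()
main = do
  var <- newMVar Map.empty
  recordHit var "home"
  recordHit var "home"
  readMVar var >>= print . Map.toList
  -- final line printed: [("home",2)]
```

Using <code class="language-plaintext highlighter-rouge">modifyMVar_</code> rather than a hand-written <code class="language-plaintext highlighter-rouge">takeMVar</code>/<code class="language-plaintext highlighter-rouge">putMVar</code> pair guards against the "fails to fill up an <code class="language-plaintext highlighter-rouge">MVar</code>" failure for a single variable. It does nothing for the case of two threads waiting on each other's <code class="language-plaintext highlighter-rouge">MVar</code>s.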
<h1 id="when-is-tvar-map-k-v-safe">When is <code class="language-plaintext highlighter-rouge">TVar (Map k v)</code> safe?</h1> The fundamental problem with <code class="language-plaintext highlighter-rouge">TVar (Map k v)</code> is that any write to the <code class="language-plaintext highlighter-rouge">TVar</code> will invalidate and retry every transaction that reads from it. For this variable to be safe, you need a single thread performing updates, and other threads are only doing reads. Additionally, those updates should ideally be relatively rare - if the thread is constantly updating single keys in the map, that will invalidate transactions frequently. Instead, you’ll want to batch updates to the <code class="language-plaintext highlighter-rouge">Map</code> and perform a whole replacement at once. However, this is exactly the same limitation as <code class="language-plaintext highlighter-rouge">IORef (Map k v)</code> or an <code class="language-plaintext highlighter-rouge">MVar (Map k v)</code>. If you only have a single <code class="language-plaintext highlighter-rouge">TVar (Map k v)</code> involved in your <code class="language-plaintext highlighter-rouge">STM</code> transactions, then you can simply switch to an <code class="language-plaintext highlighter-rouge">IORef</code> or <code class="language-plaintext highlighter-rouge">MVar</code> and enjoy increased performance. If you have multiple <code class="language-plaintext highlighter-rouge">TVar</code> in your transaction, then you are at risk of livelock, and should use <code class="language-plaintext highlighter-rouge">stm-containers</code>. 
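The single-writer, batched-replacement discipline described above might look something like the following sketch. The names <code class="language-plaintext highlighter-rouge">publishBatch</code> and <code class="language-plaintext highlighter-rouge">lookupKey</code> are assumptions of mine, and nothing in the types stops a second thread from calling the writer function - that part is still a convention you have to uphold yourself.

```haskell
import Control.Concurrent.STM
  (STM, TVar, atomically, newTVarIO, readTVar, writeTVar)
import qualified Data.Map.Strict as Map

-- Hypothetical writer-side helper: apply a whole batch of updates in a
-- single transaction, so readers are invalidated once per batch rather
-- than once per key. Intended to be called from one thread only.
publishBatch :: Ord k => TVar (Map.Map k v) -> [(k, v)] -> STM ()
publishBatch var updates = do
  old <- readTVar var
  writeTVar var (Map.union (Map.fromList updates) old)

-- Hypothetical reader-side helper: a read-only lookup.
lookupKey :: Ord k => TVar (Map.Map k v) -> k -> STM (Maybe v)
lookupKey var k = Map.lookup k <$> readTVar var

main :: IO ()
main = do
  var <- newTVarIO (Map.fromList [("a", 1 :: Int)])
  atomically (publishBatch var [("b", 2), ("c", 3)])
  atomically (lookupKey var "c") >>= print
  -- prints Just 3
```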
If you really need <code class="language-plaintext highlighter-rouge">TVar</code> for <code class="language-plaintext highlighter-rouge">STM</code> transactionality, then I would highly recommend wrapping the <code class="language-plaintext highlighter-rouge">TVar (Map k v)</code> in a <code class="language-plaintext highlighter-rouge">newtype</code> that forbids write operations, and only expose the underlying <code class="language-plaintext highlighter-rouge">TVar</code> in a manner that allows for a single thread to do infrequent writes. This will avoid contention and livelock. <h1 id="when-is-tvar-map-k-tvar-v-safe">When is <code class="language-plaintext highlighter-rouge">TVar (Map k (TVar v))</code> safe?</h1> <code class="language-plaintext highlighter-rouge">TVar (Map k (TVar v))</code> is slightly better than the above. With this, any write to the top-level <code class="language-plaintext highlighter-rouge">TVar</code> will invalidate the transaction, but writes to the value <code class="language-plaintext highlighter-rouge">TVar</code>s will only cause contention on other transactions that are attempting to write to it. For this to be safe, you must have a single thread that updates the <code class="language-plaintext highlighter-rouge">Map</code> structure (ie adding or removing keys). Many threads can write on the map values and only experience the usual contention on a single variable. However, this runs into the same problem as above: if you can safely refactor this to <code class="language-plaintext highlighter-rouge">IORef (Map k (TVar v))</code> (or <code class="language-plaintext highlighter-rouge">MVar</code>), then you are safe, otherwise, you are at risk of livelock and you should switch to <code class="language-plaintext highlighter-rouge">stm-containers</code>. <h1 id="the-semantics-of-a-reference-type">The Semantics of a Reference Type</h1> I love arguing semantics. What else are we going to argue about, syntax? 
If I say <code class="language-plaintext highlighter-rouge">IORef a</code>, I mean: <blockquote> This is a reference to an <code class="language-plaintext highlighter-rouge">a</code>. The reference may be read or written to from any thread. The reference is very fast for these operations. However, the only operation for atomic consistency is <code class="language-plaintext highlighter-rouge">atomicModifyIORef</code>. So do not expect to be able to do transactions or blocking with this! </blockquote> We have a very fast reference with very few guarantees. This is suitable for sharing information among many threads, where updates don’t happen very often, and updates are fast and pure. Meanwhile, if I say <code class="language-plaintext highlighter-rouge">MVar a</code>, I mean: <blockquote> This is a reference to an <code class="language-plaintext highlighter-rouge">a</code>, which may be full or empty. The reference may be read or written to from any thread. If the reference is empty, then other threads will queue up and wait for it to be full. This means I can implement control, atomic updates, and fair access. However, it also means that I can experience deadlock and blocking performance. </blockquote> Now, when writing code against an <code class="language-plaintext highlighter-rouge">MVar</code>, we have to be more careful: we can cause deadlock. But we also have significantly more power over how the data is accessed and updated. Particularly fairness and coordination are powerful advantages for the <code class="language-plaintext highlighter-rouge">MVar</code>. However, transactions with multiple <code class="language-plaintext highlighter-rouge">MVar</code> are difficult to do correctly. 
Where <code class="language-plaintext highlighter-rouge">IORef</code>’s <code class="language-plaintext highlighter-rouge">atomicModifyIORef</code> requires a pure computation to avoid data races, you can <code class="language-plaintext highlighter-rouge">takeMVar</code> on an <code class="language-plaintext highlighter-rouge">MVar</code> to do an update - this allows you to do effects while computing your new value, and guarantees that consumers receive a consistent view of the value inside the <code class="language-plaintext highlighter-rouge">MVar</code>. If I say <code class="language-plaintext highlighter-rouge">TVar a</code>, I mean: <blockquote> This is a reference to an <code class="language-plaintext highlighter-rouge">a</code>. The reference may be read or written to from any thread. I want transactionality around the entire structure of <code class="language-plaintext highlighter-rouge">a</code> - that is, if anyone touches any part of the structure of <code class="language-plaintext highlighter-rouge">a</code>, I want my whole transaction to be invalidated. </blockquote> This last bit is actually a very strong claim! Are you sure you want transactionality around the entire <code class="language-plaintext highlighter-rouge">a</code>? For a <code class="language-plaintext highlighter-rouge">TVar Int</code>, the answer is yes. But for a <code class="language-plaintext highlighter-rouge">TVar (Map k v)</code>, your transaction probably is only concerned with specific parts of the <code class="language-plaintext highlighter-rouge">Map</code>.
<code class="language-plaintext highlighter-rouge">stm-containers</code> uses a strategy similar to <code class="language-plaintext highlighter-rouge">TChan</code> and <code class="language-plaintext highlighter-rouge">TQueue</code> - rather than a <code class="language-plaintext highlighter-rouge">TVar</code> containing a recursive data structure, the actual recursive steps are themselves <code class="language-plaintext highlighter-rouge">TVar</code>s. This allows modifications to the data structure to only invalidate transactions that are actually relevant. Consider a SQL transaction. If you have a query which selects a row from a table, you want to have some sort of locking to ensure that the row remains the same throughout the transaction. Locking the entire table would be disastrous for performance - no one could do any work on the table until your transaction was complete! Locking the entire row can also be quite bad for performance if the row is large. But if you only select a handful of columns, ideally only those columns are locked by the query. <code class="language-plaintext highlighter-rouge">STM</code> is a wonderful mechanism for concurrency, but it isn’t foolproof. We are still responsible for selecting the right data structures for good performance. The core issue here is coarse-grained transactional state. Finer transactionality gives us better performance.

Wed, 17 Dec 2025 00:00:00 +0000 https://www.parsonsmatt.org/2025/12/17/the_subtle_footgun_of_tvar_(map____).html

Making My Life Harder with GADTs

Lucas Escot wrote a good blog post titled <a href="https://acatalepsie.fr/posts/making-my-life-easier-with-gadts.html">“Making My Life Easier with GADTs”</a>, which contains a demonstration of GADTs that made his life easier. He posted the article to <a href="https://www.reddit.com/r/haskell/comments/1i6f48k/making_my_life_easier_with_gadts/">reddit</a>.
I’m going to trust that - for his requirements and anticipated program evolution - the solution is a good one for him, and that it actually made his life easier. However, there’s one point in his post that I take issue with: <blockquote> Dependent types and assimilated type-level features get a bad rep. They are often misrepresented as a futile toy for “galaxy-brain people”, providing no benefit to the regular programmer. I think this opinion stems from a severe misconception about the presumed complexity of dependent type systems. </blockquote> I am often arguing against complexity in Haskell codebases. While Lucas’s prediction about “misconceptions” may be true for others, it is not true for me. I have worked extensively with Haskell’s most advanced features in large scale codebases. I’ve studied <a href="https://www.cis.upenn.edu/~bcpierce/tapl/">“Types and Programming Languages,”</a> <a href="https://www.manning.com/books/type-driven-development-with-idris">the Idris book</a>, <a href="https://anggtwu.net/tmp/nederpelt_geuvers__type_theory_and_formal_proof_an_introduction.pdf">“Type Theory and Formal Proof”</a>, and many other resources on advanced type systems. I don’t say this to indicate that I’m some kind of genius or authority, just that I’m not a rube who’s looking up on the Blub Paradox. My argument for simplicity comes from the hard experience of having to rip these advanced features out, and the pleasant discovery that simpler alternatives are usually nicer in every respect. So how about GADTs? Do they make my life easier? Here, I’ll reproduce <a href="https://www.reddit.com/r/haskell/comments/1i6f48k/making_my_life_easier_with_gadts/m8ddgn7/">the comment I left on reddit</a>: <hr /> <blockquote> They are often misrepresented as a futile toy for “galaxy-brain people”, providing no benefit to the regular programmer. I think this opinion stems from a severe misconception about the presumed complexity of dependent type systems. 
</blockquote> This opinion - in my case at least - stems from having seen people code themselves into a corner with fancy type features where a simpler feature would have worked just as well. In this case, the “simplest solution” is to have two entirely separate datatypes, as the blog post initially starts with. These datatypes, after all, represent different things - a typed environment and an untyped environment. Why mix the concerns? What pain or requirement is solved by having one more complicated datatype when two datatypes work pretty damn well? <blockquote> I could indeed keep typed environments completely separate. Different datatypes, different information. But this would lead to a lot of code duplication. Given that the compilation logic will mostly be mostly identical for these two targets, I don’t want to be responsible for the burden of keeping both implementations in sync. </blockquote> Code duplication can be a real concern. In this case, we have code that is not precisely duplicated, but simply similar - we want compilation logic to work for both untyped and typed logics, and only take typing information into account. When we want code to work over multiple possible types, we have two options: parametric polymorphism and ad-hoc polymorphism. With parametric polymorphism, the solution looks like this:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data GlobalEnv a = GlobalEnv [(Name, GlobalDecl a)]

data GlobalDecl a
    = DataDecl (DataBody a)
    | FunDecl (FunBody a)
    | TypeDecl a

data DataBody a = DataBody
    { indConstructors :: [ConstructorBody a]
    }

data ConstructorBody a = ConstructorBody
    { ctorName :: Name
    , ctorArgs :: Int
    , ctorType :: a
    }

data FunBody a = FunBody
    { funBody :: LamBox.Term
    , funType :: a
    }
</code></pre></div></div>
This is actually very similar to the GADT approach, because we’re threading a type variable through the system.
For untyped, we can write <code class="language-plaintext highlighter-rouge">GlobalDecl ()</code>, and for typed, we can write <code class="language-plaintext highlighter-rouge">GlobalDecl LamBox.Type</code>. Functions which can work on either untyped or typed would have <code class="language-plaintext highlighter-rouge">GlobalDecl a -> _</code> as their input, and functions which require a representation can specify it directly. This would look very similar to the GADT approach: in practice, replace <code class="language-plaintext highlighter-rouge">GlobalDecl Typed</code> with <code class="language-plaintext highlighter-rouge">GlobalDecl Type</code> and <code class="language-plaintext highlighter-rouge">GlobalDecl Untyped</code> with <code class="language-plaintext highlighter-rouge">GlobalDecl ()</code> and you’re good. (or, heck, <code class="language-plaintext highlighter-rouge">data Untyped = Untyped</code> and the change is even smaller). This representation is much easier to work with. You can <code class="language-plaintext highlighter-rouge">deriving stock (Show, Eq, Ord)</code>. You can <code class="language-plaintext highlighter-rouge">$(deriveJSON ''GlobalEnv)</code>. You can delete several language extensions. It’s also more flexible: you can use <code class="language-plaintext highlighter-rouge">Maybe Type</code> to represent partially typed programs (or programs with type inference). You can use <code class="language-plaintext highlighter-rouge">Either TypeError Type</code> to represent full ASTs with type errors. 
You can <code class="language-plaintext highlighter-rouge">deriving stock (Functor, Foldable, Traversable)</code> to get access to <code class="language-plaintext highlighter-rouge">fmap</code> (change the type with a function) and <code class="language-plaintext highlighter-rouge">toList</code> (collect all the types in the AST) and <code class="language-plaintext highlighter-rouge">traverse</code> (change each type effectfully, combining results). When we choose GADTs here, we pay significant implementation complexity costs, and we give up flexibility. What is the benefit? Well, the entire benefit is that we’ve given up flexibility. With the parametric polymorphism approach, we can put anything in for that type variable <code class="language-plaintext highlighter-rouge">a</code>. The GADT prevents us from writing <code class="language-plaintext highlighter-rouge">TypeDecl ()</code> and it forbids you from having anything other than <code class="language-plaintext highlighter-rouge">Some (type :: Type)</code> or <code class="language-plaintext highlighter-rouge">None</code> in the fields. This restriction is what I mean by ‘coding into a corner’. Let’s say you get a new requirement to support partially typed programs. If you want to stick with the GADT approach, then you need to change <code class="language-plaintext highlighter-rouge">data Typing = Typed | Untyped | PartiallyTyped</code> and modify all the <code class="language-plaintext highlighter-rouge">WhenTyped</code> machinery - <code class="language-plaintext highlighter-rouge">Optional :: Maybe a -> WhenTyped PartiallyTyped a</code>. Likewise, if you want to implement inference or type-checking, you need another constructor on <code class="language-plaintext highlighter-rouge">Typing</code> and another on <code class="language-plaintext highlighter-rouge">WhenTyped</code> - <code class="language-plaintext highlighter-rouge">... | TypeChecking</code> and <code class="language-plaintext highlighter-rouge">Checking :: Either TypeError a -> WhenTyped TypeChecking a</code>. But wait - now our <code class="language-plaintext highlighter-rouge">TypeAliasDecl</code> has become overly strict!
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data GlobalDecl :: Typing -> Type where
    FunDecl :: FunBody t -> GlobalDecl t
    DataDecl :: DataBody t -> GlobalDecl t
    TypeAliasDecl :: TypeAliasBody -> GlobalDecl Typed
</code></pre></div></div>
We actually want <code class="language-plaintext highlighter-rouge">TypeAliasDecl</code> to work with any of <code class="language-plaintext highlighter-rouge">PartiallyTyped</code>, <code class="language-plaintext highlighter-rouge">Typed</code>, or <code class="language-plaintext highlighter-rouge">TypeChecking</code>. Can we make this work? Yes, with a type class constraint:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class IsTypedIsh (t :: Typing)

instance IsTypedIsh Typed
instance IsTypedIsh PartiallyTyped
instance IsTypedIsh TypeChecking
instance (Unsatisfiable msg) => IsTypedIsh Untyped

data GlobalDecl :: Typing -> Type where
    FunDecl :: FunBody t -> GlobalDecl t
    DataDecl :: DataBody t -> GlobalDecl t
    TypeAliasDecl :: (IsTypedIsh t) => TypeAliasBody -> GlobalDecl t
</code></pre></div></div>
But, uh oh, we also want to write functions that can operate in many of these states. We can extend <code class="language-plaintext highlighter-rouge">IsTypedIsh</code> with a function witness <code class="language-plaintext highlighter-rouge">witnessTypedish :: WhenTyped t Type -> Type</code>, but that also doesn’t quite work - the <code class="language-plaintext highlighter-rouge">t</code> actually determines the output type.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class IsTypedIsh (t :: Typing) where type TypedIshPayload t isTypedIshWitness :: WhenTyped t Type -> TypedIshPayload t instance IsTypedIsh Typed where type TypedIshPayload Typed = Type isTypedIshWitness (Some a) = a instance IsTypedIsh PartiallyTyped where type TypedIshPayload PartiallyTyped = Maybe Type isTypedIshWitness (Optional a) = a instance IsTypedIsh TypeChecking where type TypedIshPayload TypeChecking = Either TypeError Type isTypedIshWitness (Checking a) = a instance (Unsatisfiable msg) => IsTypedIsh Untyped </code></pre></div></div> Now, this does let us write code like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>inputHasTypeSorta :: (IsTypedIsh t) => GlobalDecl t -> _ </code></pre></div></div> but actually working with this becomes a bit obnoxious. You see, without knowing <code class="language-plaintext highlighter-rouge">t</code>, you can’t know the result of <code class="language-plaintext highlighter-rouge">isTypedIshWitness</code>, so you end up needing to say things like <code class="language-plaintext highlighter-rouge">(IsTypedIsh t, TypedIshPayload t ~ f Type, Foldable f) => ...</code> to cover the <code class="language-plaintext highlighter-rouge">Maybe</code> and <code class="language-plaintext highlighter-rouge">Either</code> cases - and this only lets you fold the result. But now you’re working with the infelicities of type classes (inherently open) and sum types (inherently closed) and the way that GHC tries to unify these two things with type class dispatch. Whew. Meanwhile, in parametric polymorphism land, we get almost all of the above for free. If we want to write code that covers multiple possible cases, then we can use much simpler type class programming. 
Consider how easy it is to write this function and type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>beginTypeChecking :: GlobalDecl () -> GlobalDecl (Maybe (Either TypeError Type)) beginTypeChecking = fmap (\() -> Nothing) </code></pre></div></div> And now consider what you need to do to make the GADT program work out like this. Tue, 21 Jan 2025 00:00:00 +0000 https://www.parsonsmatt.org/2025/01/21/making_my_life_harder_with_gadts.html https://www.parsonsmatt.org/2025/01/21/making_my_life_harder_with_gadts.html Persistent Models are Views The Haskell <code class="language-plaintext highlighter-rouge">persistent</code> library provides a <code class="language-plaintext highlighter-rouge">QuasiQuoter</code> syntax for defining a Haskell datatype, along with code to convert it into a database table. However, there’s a bit of a subtlety here. <a href="https://hackage.haskell.org/package/persistent-2.14.6.0/docs/Database-Persist-Quasi.html">Here is the documentation for the syntax on the <code class="language-plaintext highlighter-rouge">QuasiQuoter</code></a>. I’ll refer to it throughout this blog post. The conventional use of this library is to define a bunch of models, each of which represents a complete table. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkPersist sqlSettings [persistLowerCase| User sql="users" name Text birthday Day Organization name Text primaryUser UserId |] </code></pre></div></div> However, a very natural thing to do is add <code class="language-plaintext highlighter-rouge">created</code> and <code class="language-plaintext highlighter-rouge">updated</code> timestamps. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkPersist sqlSettings [persistLowerCase| User sql="users" name Text birthday Day createdAt UTCTime default=now() updatedAt UTCTime default=now() Organization name Text primaryUser UserId createdAt UTCTime default=now() updatedAt UTCTime default=now() |] </code></pre></div></div> The intention is that the database supplies these values, but the Haskell code requires you to provide them. This means that your <code class="language-plaintext highlighter-rouge">insert</code>s are annoying. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fakeUTCTime :: UTCTime fakeUTCTime = UTCTime (fromGregorian 1 1 1) 0 foo :: SqlPersistT IO () foo = do insert_ User { userName = "Matt Parsons" , userBirthday = fromGregorian 1988 09 29 , userCreatedAt = fakeUTCTime , userUpdatedAt = fakeUTCTime } </code></pre></div></div> You have to provide a <code class="language-plaintext highlighter-rouge">fakeUTCTime</code> value. The database will immediately throw it away and not use it. Wouldn’t it be better to not need to do this? <h1 id="liberate-your-models">Liberate Your Models</h1> As is often the case in Haskell, the problem can be nicely ameliorated by providing more types. Let’s consider separating our concerns, and representing a <code class="language-plaintext highlighter-rouge">User</code> twice: once as a type faithful to the shape of the database table, and another as a default way to insert it. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkPersist sqlSettings [persistLowerCase| User sql="users" name Text birthday Day createdAt UTCTime default=now() updatedAt UTCTime default=now() InsertUser sql="users" name Text birthday Day createdAt UTCTime default=now() MigrationOnly updatedAt UTCTime default=now() MigrationOnly |] foo :: SqlPersistT IO () foo = do insert_ InsertUser { insertUserName = "Matt Parsons" , insertUserBirthday = fromGregorian 1988 09 29 } </code></pre></div></div> Now, we have a variant of our type which does not have timestamps, and we can use this to insert a value into the database. The database supplies the value we need. There are two tricks going on here: <ol> <li><code class="language-plaintext highlighter-rouge">sql="users"</code></li> <li><code class="language-plaintext highlighter-rouge">MigrationOnly</code></li> </ol> The <code class="language-plaintext highlighter-rouge">sql=</code> in <code class="language-plaintext highlighter-rouge">persistent</code> typically means “Use this name in the <code class="language-plaintext highlighter-rouge">sql</code> representation of this.” For a table, this tells <code class="language-plaintext highlighter-rouge">persistent</code> that the table name for our type is <code class="language-plaintext highlighter-rouge">users</code>. And - we have two Haskell models that reference <code class="language-plaintext highlighter-rouge">users</code>! Then, <code class="language-plaintext highlighter-rouge">MigrationOnly</code> is a signal to <code class="language-plaintext highlighter-rouge">persistent</code> that the field should not be present in the generated Haskell code. 
So <code class="language-plaintext highlighter-rouge">InsertUser</code> will not have Haskell code for <code class="language-plaintext highlighter-rouge">createdAt</code> or <code class="language-plaintext highlighter-rouge">updatedAt</code>, but <code class="language-plaintext highlighter-rouge">persistent</code> will still expect the database to have the right shape. <h1 id="decouple-your-models">Decouple Your Models</h1> This has application beyond providing a more convenient interface for inserting default columns. You can actually decouple your tables from each other and have business logic that relies on a subset of the database. For example, let’s look at some code that needs to know about <code class="language-plaintext highlighter-rouge">Organization</code>s, but that does not care at all about <code class="language-plaintext highlighter-rouge">User</code>s. The view of the <code class="language-plaintext highlighter-rouge">Organization</code> table that this code needs looks like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkPersist sqlSettings [persistLowerCase| MyOrganization sql="organization" !no-migrate name Text |] </code></pre></div></div> Note that we don’t actually reference the <code class="language-plaintext highlighter-rouge">UserId</code> type, which means we don’t need the <code class="language-plaintext highlighter-rouge">User</code> model in scope. This allows us to decouple this logic from the whole <code class="language-plaintext highlighter-rouge">User</code> notion, or anything else that <code class="language-plaintext highlighter-rouge">Organization</code> depends on that is irrelevant to the code that <code class="language-plaintext highlighter-rouge">MyOrganization</code> is useful for. This uses another <code class="language-plaintext highlighter-rouge">persistent</code> feature: <code class="language-plaintext highlighter-rouge">!no-migrate</code>. 
When we write this, we tell <code class="language-plaintext highlighter-rouge">persistent</code> not to include this model in our migrations. As long as the database table indicated by <code class="language-plaintext highlighter-rouge">sql="organization"</code> is compatible with what we have here for the operations we do on it, we’re fine - and if we’re just reading, then we’re totally fine! Unfortunately, <code class="language-plaintext highlighter-rouge">persistent</code> does not offer a means of blocking insert, so this can do unsafe things. Wed, 07 Feb 2024 00:00:00 +0000 https://www.parsonsmatt.org/2024/02/07/persistent_models_are_views.html https://www.parsonsmatt.org/2024/02/07/persistent_models_are_views.html The Meaning of Monad in MonadTrans At work, someone noticed that they got a compiler warning for a derived instance of <code class="language-plaintext highlighter-rouge">MonadTrans</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype FooT m a = FooT { unFooT :: StateT Int m a } deriving newtype (Functor, Applicative, Monad, MonadTrans) </code></pre></div></div> GHC complained about a redundant <code class="language-plaintext highlighter-rouge">Monad</code> constraint. After passing <code class="language-plaintext highlighter-rouge">-ddump-deriv</code>, I saw that GHC was pasting in basically this instance: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance MonadTrans FooT where lift :: Monad m => m a -> FooT m a lift = coerce (lift :: m a -> StateT Int m a) </code></pre></div></div> The problem was that the <code class="language-plaintext highlighter-rouge">Monad m</code> constraint there is redundant - we’re not actually using it. However, it’s a mirror for the definition of the class method. 
In <code class="language-plaintext highlighter-rouge">transformers < 0.6</code>, the definition of <code class="language-plaintext highlighter-rouge">MonadTrans</code> class looked like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class MonadTrans t where lift :: Monad m => m a -> t m a </code></pre></div></div> In <code class="language-plaintext highlighter-rouge">transformers-0.6</code>, a quantified superclass constraint was added to <code class="language-plaintext highlighter-rouge">MonadTrans</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class (forall m. Monad m => Monad (t m)) => MonadTrans t where lift :: Monad m => m a -> t m a </code></pre></div></div> I’m having a bit of semantic satiation with the word <code class="language-plaintext highlighter-rouge">Monad</code>, which isn’t an unfamiliar phenomenon for a Haskell developer. However, while explaining this behavior, I found it to be a very subtle distinction in what these constraints fundamentally mean. <h1 id="what-is-a-constraint">What is a Constraint?</h1> A <code class="language-plaintext highlighter-rouge">Constraint</code> is a thing in Haskell that GHC needs to solve in order to make your code work. Solving a constraint is similar to proving a proposition in constructive logic - GHC needs to find evidence that the claim holds, in the form of a type class instance. When we write: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foo :: Num a => a -> a foo x = x * 2 </code></pre></div></div> We’re saying: <blockquote> I have a polymorphic function <code class="language-plaintext highlighter-rouge">foo</code> which can operate on types, if those types are instances of the class <code class="language-plaintext highlighter-rouge">Num</code>. </blockquote> If is the big thing here - it’s a way of making a conditional expression. 
For a totally polymorphic function, like <code class="language-plaintext highlighter-rouge">id :: a -> a</code>, there are no conditions. You can call it with any type you want. But a conditional polymorphic function expresses some requirements, or constraints, upon the input. If you ask for constraints you don’t need, then you can get a warning by enabling <code class="language-plaintext highlighter-rouge">-Wredundant-constraints</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>woops :: (Bounded a, Num a) => a -> a woops x = x + 5 </code></pre></div></div> GHC will happily warn us that we don’t actually use the <code class="language-plaintext highlighter-rouge">Bounded a</code> constraint, and it’s redundant. We should delete it. Indeed, there are many <code class="language-plaintext highlighter-rouge">Num</code> that aren’t <code class="language-plaintext highlighter-rouge">Bounded</code>, and by requiring <code class="language-plaintext highlighter-rouge">Bounded</code>, we are reducing the potential types we could call this function on for no reason. <h1 id="constraints-liberate">Constraints Liberate</h1> (from <a href="https://www.youtube.com/watch?v=GqmsQeSzMdw">Constraints Liberate, Liberties Constrain</a>) A constraint is a perspective on what is happening - it is the perspective of the caller of a function. It’s almost like I see a function type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>someCoolFunction :: _ => a -> b </code></pre></div></div> And think - “Ah hah! 
I can call this at any type <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">b</code> that I want!” Only to find that there’s a bunch of constraints on <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">b</code>, and now I am constrained in the types I can call this function at. However, a constraint feels very different from the implementer of a function. Let’s look at the classic identity function: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>id :: a -> a id a = a </code></pre></div></div> As an implementer, I have quite a few constraints! Indeed, I can’t really do anything here. I can write equivalent functions, or much slower versions of this function, or I can escape hatch with unsafe behavior - but my options are really pretty limited. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>id' :: a -> a id' a = repeat a !! 1000 id'' :: a -> a id'' a = let y = a in y id''' :: a -> a id''' a = iterate id' a !! 1000 </code></pre></div></div> However, a <code class="language-plaintext highlighter-rouge">Constraint</code> means that I now have some extra power. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foo :: Num a => a -> a </code></pre></div></div> With this signature, I now have access to the <code class="language-plaintext highlighter-rouge">Num</code> type class methods, as well as any other function that is polymorphic over <code class="language-plaintext highlighter-rouge">Num</code>. The constraint is a liberty - I have gained the power to do stuff with the input. 
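To make the implementer's side of that trade concrete, here is a minimal sketch. The names <code class="language-plaintext highlighter-rouge">unconstrained</code>, <code class="language-plaintext highlighter-rouge">double</code>, <code class="language-plaintext highlighter-rouge">sumOfSquares</code>, and <code class="language-plaintext highlighter-rouge">examples</code> are made up for illustration and are not from the post; the point is only that each extra constraint unlocks more operations in the function body.

```haskell
-- Illustrative names only; none of these definitions come from the post.

-- With no constraint, the implementer can do little more than
-- hand the argument back.
unconstrained :: a -> a
unconstrained x = x

-- With `Num a`, the implementer gains (+), (*), and numeric literals
-- (via fromInteger).
double :: Num a => a -> a
double x = x * 2

-- Any other function that is polymorphic over `Num` is available too,
-- such as `sum` from the Prelude.
sumOfSquares :: Num a => [a] -> a
sumOfSquares = sum . map (\x -> x * x)

-- The caller pays for that power: only types with a `Num` instance
-- are accepted at the call site.
examples :: (Int, Double, Int)
examples = (double 21, double 1.5, sumOfSquares [1, 2, 3])
```

Trying <code class="language-plaintext highlighter-rouge">double "hello"</code> would be rejected, because <code class="language-plaintext highlighter-rouge">String</code> has no <code class="language-plaintext highlighter-rouge">Num</code> instance. That is the caller's half of the same bargain.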
<h1 id="the-two-monads">The Two Monads</h1> Back to <code class="language-plaintext highlighter-rouge">MonadTrans</code> - <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class (forall m. Monad m => Monad (t m)) => MonadTrans t where lift :: Monad m => m a -> t m a </code></pre></div></div> <h2 id="method-constraint">Method Constraint</h2> Let’s talk about that <code class="language-plaintext highlighter-rouge">lift</code> constraint. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> lift :: Monad m => m a -> t m a </code></pre></div></div> This constraint means that the input to <code class="language-plaintext highlighter-rouge">lift</code> must prove that it is a <code class="language-plaintext highlighter-rouge">Monad</code>. This means that, as implementers of <code class="language-plaintext highlighter-rouge">lift</code>, we can use the methods on <code class="language-plaintext highlighter-rouge">Monad</code> in order to make <code class="language-plaintext highlighter-rouge">lift</code> work out. We often don’t need it - consider these instances. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype IdentityT m a = IdentityT (m a) instance MonadTrans IdentityT where lift action = IdentityT action </code></pre></div></div> <code class="language-plaintext highlighter-rouge">IdentityT</code> can use the constructor directly, and does not require any methods at all to work with <code class="language-plaintext highlighter-rouge">action</code>. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype ReaderT r m a = ReaderT (r -> m a) instance MonadTrans (ReaderT r) where lift action = ReaderT $ \_ -> action </code></pre></div></div> <code class="language-plaintext highlighter-rouge">ReaderT</code> uses the constructor and throws away the <code class="language-plaintext highlighter-rouge">r</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype ExceptT e m a = ExceptT (m (Either e a)) instance MonadTrans (ExceptT e) where lift action = ExceptT $ fmap Right action </code></pre></div></div> Ah, now we’re using the <code class="language-plaintext highlighter-rouge">Functor</code> method <code class="language-plaintext highlighter-rouge">fmap</code> in order to make the inner action fit. We’re given an <code class="language-plaintext highlighter-rouge">action :: m a</code>, and we need an <code class="language-plaintext highlighter-rouge">m (Either e a)</code>. And we’ve got <code class="language-plaintext highlighter-rouge">Right :: a -> Either e a</code> and <code class="language-plaintext highlighter-rouge">fmap</code> to make it work. We are allowed to call <code class="language-plaintext highlighter-rouge">fmap</code> here because <code class="language-plaintext highlighter-rouge">Monad</code> implies <code class="language-plaintext highlighter-rouge">Applicative</code> implies <code class="language-plaintext highlighter-rouge">Functor</code>, and we’ve been given the <code class="language-plaintext highlighter-rouge">Monad m</code> constraint as part of the method. <h2 id="superclass-constraint">Superclass Constraint</h2> Let’s talk about the quantified superclass constraint (wow, what a fancy phrase). <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> (forall m. 
Monad m => Monad (t m)) </code></pre></div></div> This superclass constraint means that the type <code class="language-plaintext highlighter-rouge">t m</code> is a <code class="language-plaintext highlighter-rouge">Monad</code> if <code class="language-plaintext highlighter-rouge">m</code> is a <code class="language-plaintext highlighter-rouge">Monad</code>, and this is true for all <code class="language-plaintext highlighter-rouge">m</code>, not just a particular one. Prior to this, if you wanted to write a <code class="language-plaintext highlighter-rouge">do</code> block that was arbitrary in a monad transformer, you’d have to write: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ohno :: (Monad (t m), Monad m, MonadTrans t) => m a -> t m a ohno action = do lift action lift action </code></pre></div></div> What’s more annoying is that, if you had a few different underlying type parameters, you’d need to request the <code class="language-plaintext highlighter-rouge">Monad (t m)</code> for each one - <code class="language-plaintext highlighter-rouge">Monad (t m), Monad (t n), Monad (t f)</code>. Boring and redundant. Obviously, if <code class="language-plaintext highlighter-rouge">m</code> is a <code class="language-plaintext highlighter-rouge">Monad</code>, and <code class="language-plaintext highlighter-rouge">t</code> is a <code class="language-plaintext highlighter-rouge">MonadTrans</code>former, then <code class="language-plaintext highlighter-rouge">t m</code> must also be a <code class="language-plaintext highlighter-rouge">Monad</code> - otherwise it’s not a valid <code class="language-plaintext highlighter-rouge">MonadTransformer</code>! So this <code class="language-plaintext highlighter-rouge">Constraint</code> is slightly different. 
The first perspective on <code class="language-plaintext highlighter-rouge">Constraint</code> is the user of a function: <blockquote> I am constrained in the types I can use a function with </blockquote> The second perspective on <code class="language-plaintext highlighter-rouge">Constraint</code> is the implementer of a function: <blockquote> By constraining my inputs, I gain knowledge and power over them </blockquote> But this superclass constraint is a bit different. It doesn’t seem to be about requiring anything from our users. It also doesn’t seem to be about allowing more power for implementers. Instead, it’s a form of evidence propagation. We’re saying: <blockquote> GHC, if you know that <code class="language-plaintext highlighter-rouge">m</code> is a <code class="language-plaintext highlighter-rouge">Monad</code>, then you may also infer that <code class="language-plaintext highlighter-rouge">t m</code> is a <code class="language-plaintext highlighter-rouge">Monad</code>. </blockquote> Type classes form a compositional tool for logic programming. Constraints like these are a conditional proposition that allow GHC to see more options for solving and satisfying problems. <h1 id="the-koan-of-constraint">The Koan of <code class="language-plaintext highlighter-rouge">Constraint</code></h1> Four blind monks touch a <code class="language-plaintext highlighter-rouge">Constraint</code>, trying to identify what it is. The first exclaims “This is a tool for limiting people.” The second laughs and says, “No, this is a tool for empowering people!” The third shakes his head solemnly and retorts, “No, this is a tool for clarifying wisdom.” The fourth says “I cannot satisfy this.” Thu, 10 Aug 2023 00:00:00 +0000 https://www.parsonsmatt.org/2023/08/10/the_meaning_of_monad_in_monadtrans.html https://www.parsonsmatt.org/2023/08/10/the_meaning_of_monad_in_monadtrans.html Yamaha vs NS Design Electric Cellos I’ve been learning cello again recently. 
The omnipresent advice for novices is to rent a cello from a reputable violin shop, which is great - my cello would be $2,700 new, but I’m renting it for $50/mo. And I can tell I’ll outgrow it, and I don’t know how to properly evaluate an acoustic cello for great suitability yet. I’ve got a practice mute, but it’s still rather loud - I can’t practice at night or early in the morning without waking someone up in my house. This led me to look at electric cellos for practice. There are essentially three options: <ul> <li>Shitty Cecilio electrics</li> <li>Yamaha silent cellos</li> <li>NS Design electric cellos</li> </ul> I had the opportunity to visit <a href="https://lutherstrings.com/">Luther Strings</a> this past weekend to try the Yamaha and NS Design cellos. <h1 id="avoid-cecilio">Avoid Cecilio</h1> First of all, the prices on the Yamaha and NS Design are high - you may be tempted to shop elsewhere for your first instrument. The Cecilio seems like a good deal at $530, but it isn’t. A friend of mine purchased one in college (budgets, hey). The electronics are awful - bad sound, noisy, connections are inconsistent and cut out. I lined it with tin foil and resoldered most of the joints, which helped. It wouldn’t hold tune, so I installed geared bass tuners. The bridge needs to be shorter and the fingerboard needs to be planed, which I can’t reasonably DIY and the luthiers want $100s to do it. The strings are uncomfortable, too, and they don’t sound good. It’ll cost about as much to get the instrument in usable condition as it does to buy it. You could start on this, but you’d quickly be dismayed at the difficulty in pressing the strings down, the noise from the electronics, and the impossibility of drawing a good tone from it. You’re far better off saving more money to buy a nicer one, or seeing if anyone nearby can rent one of these - a string shop that carries electrics may be willing to do that. 
<h1 id="yamaha">Yamaha</h1> Yamaha makes a range of silent practice cellos. They happen to be electric, but the clear intended purpose is for silent practice. They have body shapes that make it feel like a traditional cello - there’s a shoulder where you’d expect to go into thumb position, a frame for your legs, and a metal bit that rests on your chest. The electronics are all active, and have an A/C adapter - they have headphone outputs and reverb built-in. They also have a hookup for an auxiliary input, which would allow you to plug in an MP3 player and listen to a backing track while you play. The cheapest option is the <a target="_blank" href="https://www.amazon.com/Yamaha-SVC50SK-Silent-Compact-Outfit/dp/B002PMKZ1K/ref=sr_1_3?keywords=yamaha+svc&link_code=qs&qid=1690815317&sr=8-3&ufe=app_do%253Aamzn1.fos.17f26c18-b61b-4ce9-8a28-de351f41cffb&_encoding=UTF8&tag=productionhas-20&linkCode=ur2&linkId=ed5a623248b1bf0138563cd0a3d8e457&camp=1789&creative=9325">Yamaha SVC-50</a> at $2,100. <h1 id="ns-design">NS Design</h1> NS Design makes a range of electric cellos. They happen to be quiet, but the clear intended purpose is for playing electric cello. The shape is highly non-traditional - the neck is unobstructed throughout, and there’s nothing but a stick on the cello itself. You can play with an included adjustable tripod to find the right position, or you can purchase a frame/end-pin mount, or a strap system for standing upright. On the low end, the electronics are simple - a passive pickup, a switch for arco or pizzicato playing, a volume knob, and a tone knob - much like you’d find on an electric guitar or bass. 
The cheapest option is the <a target="_blank" href="https://www.amazon.com/NS-Design-WAV4-Electric-Cello/dp/B06XS3NH1Q/ref=sr_1_1?crid=2DJ6PD8NMC0U9&keywords=ns+design+wav4+cello&qid=1690816029&sprefix=ns+design+wav4+cell%252Caps%252C112&sr=8-1&ufe=app_do%253Aamzn1.fos.765d4786-5719-48b9-b588-eab9385652d5&_encoding=UTF8&tag=productionhas-20&linkCode=ur2&linkId=9bb93546242bc26a3fc5d4e53d8ddaaa&camp=1789&creative=9325">NS Design WAV4</a> at $1,259 (though <a href="https://www.sharmusic.com/products/ns-design-wav-cello-black?_pos=2&_sid=6b29ba9cb&_ss=r">Shar Music has one for $999</a>). While this is significantly cheaper than the Yamaha, you’ll probably want the <a href="https://thinkns.com/product/cello-end-pin-stand/">cello endpin stand at $385</a>, and maybe the <a href="https://thinkns.com/product/frame-strap-system/">frame strap system at $259</a>, or the <a href="https://thinkns.com/product/cello-thumbstop-neck-heel-rest/">thumbstop for $116</a>. The cost is now $1,759, and you still can’t plug headphones directly into the instrument - for that, you need to go up to the CR series, which are around $4,000 (but have active electronics and a headphone jack). <h1 id="comparison">Comparison</h1> These are two pretty different items. However, they have a lot of overlap - and so deciding between them may be a challenge. <h2 id="cost">Cost</h2> The NS Design WAV4 has a much lower entry price - at $999 from Shar, it’s a fantastic deal, and even the $1,259 from other retailers is good. However, the endpin stand is expensive and necessary to get the mobility and feel of a traditional cello, which would be an important purchase if the intent is to practice for traditional cello. The WAV4 also only has passive electronics and an instrument output - you need an amplifier of some sort to drive headphones. The Yamaha is $2,100, but it is fully loaded. You buy it, and you’ll be able to practice cello silently with a backing track right now. You don’t need anything else. 
Once you’ve bought the accessories you need to make these equivalent, the cost ends up around the same. This is defrayed a bit if you already have some of the amplifier/headphone gear for the NS cello from playing electric guitar. <h2 id="playing-vs-traditional-cello">Playing vs Traditional Cello</h2> The Yamaha silent cello feels like a traditional cello. The chest stop feels right, and the two leg frames feel right. You can move the cello with your knees to get different string angles and to aid expressive playing. The shoulder of the cello indicates when you’d need to shift to thumb position on a traditional cello. Unfortunately, I found the fingerboard to be oddly shaped, and the strings were a bit high. It was far better than the Cecilio I’ve played, but I do still feel like a proper setup from a luthier would help dramatically. I’d say it feels like a $1,000 cello with $1,000 of electronics. The NS Design doesn’t feel like a traditional cello at all. On the stock tripod, there’s no chest or leg support. The instrument sits in a fixed position. This doesn’t feel natural coming from an acoustic cello. I didn’t try the endpin stand, as the shop didn’t have it. The lack of shoulder means you have no indication when you’d need to shift into thumb position. There’s a single brass dot on the back of the neck that’s roughly where fourth position is. The Yamaha clearly feels more like a traditional cello. <h2 id="playing-as-an-electric-cello">Playing as an Electric Cello</h2> Yamaha’s intent is to allow you to pretend you are playing a real acoustic cello. Yamaha makes a comparable guitar - it’s their <a href="https://amzn.to/3DzkWqG">SLG200S</a> silent acoustic guitar. The Yamaha silent cello doesn’t really feel like an instrument I’d want to play on. I definitely wouldn’t want to write music for it, or perform with it. It’s for practice, and it lives in the shadow of the acoustic cello. 
You never want to play the Yamaha - you’d rather be playing your acoustic - but sometimes, hey, this is good enough, and better than nothing. NS Design can sacrifice a lot of the engineering and design that the Yamaha needs for this in order to produce an instrument that stands alone. In the same way that a Fender Stratocaster or Gibson Les Paul are different instruments than an acoustic guitar, the NS Cello isn’t trying to be something that it isn’t. The tripod is sturdy and would work fine on stage. The frame strap allows you to stand and walk while playing - something the Yamaha simply cannot do, not even with <a href="https://www.cellostrap.com/">The Block Strap</a>, since the instrument is designed differently. The lack of a shoulder bout means you don’t have an indicator on when to shift - but that really just opens up the range of the upper end of the cello significantly. It’s just like how a classical guitar often has the shoulder bout at the 12th fret, but electric guitars can go up to 24 frets. The NS Design cello simply feels much better to play. It’s easier to play double stops and sound notes. The neck has a nice feel to it. I can easily imagine playing this instrument for itself - writing music, performing with it, recording with it, etc. <h2 id="sound-quality">Sound Quality</h2> Electronics absolutely dominate the tone of an electric instrument. The Yamaha cello is slightly noisy. You can hear a static hiss in the background. There’s no tone knob or onboard EQ, so the sound you get is what you get. In my opinion, it doesn’t sound great. However, it is a complete practice solution - and if you’re just practicing, the richness of tone doesn’t really matter. You can hear your technique just fine. Now, I did not try the WAV4. I tried a 5 string CR, with active electronics - volume, switch, and an active EQ knob for bass and treble. The NS Design sounds great. 
The arco/pizzicato switch yields some interesting tone combinations - while everything sounds great on the arco setting, the pizzicato setting brings out much more sustain, similar to a guitar. The electronics are quiet - no detectable noise. I found it easy to produce a tone that sounded good - you wouldn’t confuse it for an acoustic cello, but you wouldn’t think it sounded bad at all. The WAV4 has the same pickup as the CR, but the electronics are passive. This means you only get a treble roll-off knob and a volume knob, plus the arco/pizzicato switch. I tend to prefer passive electronics anyway - both the Yamaha and NS Design cellos ran out of battery power during my trial, and produced some gnarly bad tone. In my opinion, the NS Design wins here. <h2 id="overall-fitfinishfeel">Overall fit/finish/feel</h2> The NS Design CR cello feels fantastic. The craftsmanship is superb. Everything is sturdy and feels great. I think it looks nice, too - a quality wood finish and a pattern of dots on the fingerboard provide some visual interest for what is otherwise just a stick. The Yamaha feels a bit like a toy. The aesthetics are pretty bare - it’s a black shape with some stuff sticking out. The tuners are kinda cheap looking. I didn’t get a great feeling for it. <h1 id="conclusion">Conclusion</h1> Well, the Yamaha is clearly a better instrument for practicing traditional acoustic cello. No question about it. If all you care about is an all-in-one travel cello and the ability to practice without bothering anyone, then the Yamaha is the winner. However, the NS Design has a lot of compelling points in its favor. It’s an instrument in and of itself. It’s not trying to be an acoustic cello but quiet, and this allows it to have many advantages over the Yamaha. The sound is better than the Yamaha. In many ways, playing the NS Cello is even easier than an acoustic cello, and certainly much better feeling than the Yamaha. Overall, I feel compelled by the advantages of the NS cello. 
I’ll need to invest in the endpin stand, and I may try to fashion a detachable shoulder bout so I know when to practice thumb position. I’ve been intending on getting a music studio going, which would satisfy the headphone amp problem, and I could also use any electric guitar amp. I may regret the decision and decide that I don’t actually care about the electric cello features, and what I really did want was just a practice cello. But we’ll see! Mon, 31 Jul 2023 00:00:00 +0000 https://www.parsonsmatt.org/2023/07/31/yamaha_vs_ns_design_electric_cellos.html https://www.parsonsmatt.org/2023/07/31/yamaha_vs_ns_design_electric_cellos.html Working with Haskell CallStack GHC Haskell provides a type <a href="https://www.stackage.org/haddock/lts-20.20/base-4.16.4.0/GHC-Exception.html#t:CallStack"><code class="language-plaintext highlighter-rouge">CallStack</code></a> with some magic built in properties. Notably, there’s a constraint you can write - <code class="language-plaintext highlighter-rouge">HasCallStack</code> - that GHC will automagically figure out for you. Whenever you put that constraint on a top-level function, it will figure out the line and column, and either create a fresh <code class="language-plaintext highlighter-rouge">CallStack</code> for you, or it will append the source location to the pre-existing <code class="language-plaintext highlighter-rouge">CallStack</code> in scope. <h1 id="getting-the-current-callstack">Getting the current <code class="language-plaintext highlighter-rouge">CallStack</code></h1> To grab the current <code class="language-plaintext highlighter-rouge">CallStack</code>, you’ll write <code class="language-plaintext highlighter-rouge">callStack</code> - a value-level term that summons a <code class="language-plaintext highlighter-rouge">CallStack</code> from GHC’s magic. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import GHC.Stack emptyCallStack :: IO () emptyCallStack = putStrLn $ show callStack </code></pre></div></div> If we evaluate this in a compiled executable, then GHC will print out <code class="language-plaintext highlighter-rouge">[]</code> - a <code class="language-plaintext highlighter-rouge">CallStack</code> list with no entries! This isn’t much use. Let’s add a <code class="language-plaintext highlighter-rouge">HasCallStack</code> constraint. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>giveCallStack :: HasCallStack => IO () giveCallStack = putStrLn $ show callStack </code></pre></div></div> Running this in our test binary gives us the following entry, lightly formatted: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[ ( "giveCallStack" , SrcLoc { srcLocPackage = "main" , srcLocModule = "Main" , srcLocFile = "test/Spec.hs" , srcLocStartLine = 18 , srcLocStartCol = 9 , srcLocEndLine = 18 , srcLocEndCol = 22 } ) ] </code></pre></div></div> We get a <code class="language-plaintext highlighter-rouge">[(String, SrcLoc)]</code>. The <code class="language-plaintext highlighter-rouge">String</code> represents the function that was called, and the <code class="language-plaintext highlighter-rouge">SrcLoc</code> tells us the package, module, file, and the start and end of the source location of the call site - not the definition site. Let’s construct a helper that gets the current <code class="language-plaintext highlighter-rouge">SrcLoc</code>. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>getSrcLoc :: HasCallStack => SrcLoc getSrcLoc = snd $ head $ getCallStack callStack </code></pre></div></div> I’m going to call <code class="language-plaintext highlighter-rouge">print getSrcLoc</code> in my test binary, and this is the output (again, formatted for legibility): <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SrcLoc { srcLocPackage = "main" , srcLocModule = "Main" , srcLocFile = "test/Spec.hs" , srcLocStartLine = 27 , srcLocStartCol = 15 , srcLocEndLine = 27 , srcLocEndCol = 24 } </code></pre></div></div> We can use this to construct a link to a GitHub project - suppose that we called that inside the <a href="https://github.com/bitemyapp/esqueleto"><code class="language-plaintext highlighter-rouge">esqueleto</code> repository</a>, and we want to create a link that’ll go to that line of code. Normally, you’d want to shell out and grab the commit and branch information, but let’s just bake that into the link for now. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkGithubLink :: HasCallStack => String mkGithubLink = concat [ "https://www.github.com/bitemyapp/esqueleto/blob/master/" , srcLocFile srcLoc , "#L", show $ srcLocStartLine srcLoc , "-" , "L", show $ srcLocEndLine srcLoc ] where srcLoc = getSrcLoc </code></pre></div></div> Let’s call that from our test binary now: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main = do -- snip... example "mkGithubLink" do putStrLn mkGithubLink </code></pre></div></div> The output is given: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkGithubLink https://www.github.com/bitemyapp/esqueleto/blob/master/src/Lib.hs#L24-L24 </code></pre></div></div> But - that’s not right! 
That’s giving us the source location for <code class="language-plaintext highlighter-rouge">getSrcLoc</code> inside of <code class="language-plaintext highlighter-rouge">mkGithubLink</code>. We want it to give us the location of the callsite of <code class="language-plaintext highlighter-rouge">mkGithubLink</code>. Fortunately, we can freeze the current <code class="language-plaintext highlighter-rouge">CallStack</code>, which will prevent <code class="language-plaintext highlighter-rouge">getSrcLoc</code> from adding to the existing <code class="language-plaintext highlighter-rouge">CallStack</code>. <h1 id="freezing-the-callstack">Freezing the <code class="language-plaintext highlighter-rouge">CallStack</code></h1> <code class="language-plaintext highlighter-rouge">GHC.Stack</code> provides a function <a href="https://www.stackage.org/haddock/lts-20.20/base-4.16.4.0/GHC-Stack.html#v:withFrozenCallStack"><code class="language-plaintext highlighter-rouge">withFrozenCallStack</code></a>, with a bit of a strange type signature: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>withFrozenCallStack :: HasCallStack => (HasCallStack => a) -> a </code></pre></div></div> This function freezes the <code class="language-plaintext highlighter-rouge">CallStack</code> for the argument of the function. This is useful if you want to provide a wrapper around a function that manipulates or reports on the <code class="language-plaintext highlighter-rouge">CallStack</code> in some way, but you don’t want that polluting any other <code class="language-plaintext highlighter-rouge">CallStack</code>. Let’s call that before <code class="language-plaintext highlighter-rouge">getSrcLoc</code> and see what happens. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkGithubLinkFrozen :: HasCallStack => String mkGithubLinkFrozen = concat [ "https://www.github.com/bitemyapp/esqueleto/blob/master/" , srcLocFile srcLoc , "#L", show $ srcLocStartLine srcLoc , "-" , "L", show $ srcLocEndLine srcLoc ] where srcLoc = withFrozenCallStack getSrcLoc -- in test binary, main = do -- snip example "mkGithubLinkFrozen" do putStrLn mkGithubLinkFrozen </code></pre></div></div> Output: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkGithubLinkFrozen https://www.github.com/bitemyapp/esqueleto/blob/master/test/Spec.hs#L32-L32 </code></pre></div></div> Bingo! <h2 id="more-real-world-examples">More real-world examples</h2> As an example, the library <a href="https://hackage.haskell.org/package/annotated-exception"><code class="language-plaintext highlighter-rouge">annotated-exception</code></a> attaches <code class="language-plaintext highlighter-rouge">CallStack</code>s to thrown exceptions, and each function like <code class="language-plaintext highlighter-rouge">catch</code> or <code class="language-plaintext highlighter-rouge">onException</code> that touches exceptions will append the current source location to the existing <code class="language-plaintext highlighter-rouge">CallStack</code>. 
However, <code class="language-plaintext highlighter-rouge">handle</code> is implemented in terms of <code class="language-plaintext highlighter-rouge">catch</code>, which is implemented in terms of <code class="language-plaintext highlighter-rouge">catches</code>, and we wouldn’t want every single call-site of <code class="language-plaintext highlighter-rouge">handle</code> to mention <code class="language-plaintext highlighter-rouge">catch</code> and <code class="language-plaintext highlighter-rouge">catches</code>, and we wouldn’t want every call site of <code class="language-plaintext highlighter-rouge">catch</code> to mention <code class="language-plaintext highlighter-rouge">catches</code> - that’s just noise. So, we can freeze the <code class="language-plaintext highlighter-rouge">CallStack</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>handle :: (HasCallStack, Exception e, MonadCatch m) => (e -> m a) -> m a -> m a handle handler action = withFrozenCallStack catch action handler catch :: (HasCallStack, Exception e, MonadCatch m) => m a -> (e -> m a) -> m a catch action handler = withFrozenCallStack catches action [Handler handler] catches :: (HasCallStack, MonadCatch m) => m a -> [Handler m a] -> m a catches action handlers = Safe.catches action (withFrozenCallStack mkAnnotatedHandlers handlers) mkAnnotatedHandlers :: (HasCallStack, MonadCatch m) => [Handler m a] -> [Handler m a] mkAnnotatedHandlers xs = xs >>= \(Handler hndlr) -> [ Handler $ \e -> checkpointCallStack $ hndlr e , Handler $ \(AnnotatedException anns e) -> checkpointMany anns $ hndlr e ] </code></pre></div></div> Now, there’s something interesting going on here: consider these two possible definitions of <code class="language-plaintext highlighter-rouge">handle</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>handle handler action = withFrozenCallStack catch action handler 
handle handler action = withFrozenCallStack $ catch action handler </code></pre></div></div> It’s a Haskell instinct to write <code class="language-plaintext highlighter-rouge">function $ argument</code>, and it seems a bit odd to see <code class="language-plaintext highlighter-rouge">withFrozenCallStack</code> - a function - applied without a dollar. This is a subtle distinction - <code class="language-plaintext highlighter-rouge">withFrozenCallStack</code> applied to <code class="language-plaintext highlighter-rouge">catch</code> alone just freezes the <code class="language-plaintext highlighter-rouge">CallStack</code> for <code class="language-plaintext highlighter-rouge">catch</code>, but not for <code class="language-plaintext highlighter-rouge">handler</code> or <code class="language-plaintext highlighter-rouge">action</code>. If we apply <code class="language-plaintext highlighter-rouge">withFrozenCallStack $ catch action handler</code>, then we’ll freeze the <code class="language-plaintext highlighter-rouge">CallStack</code> for our arguments, too. This is usually not what you want. <h3 id="freezing-functions">Freezing Functions</h3> Let’s explore the above subtle distinction in more depth. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>wat :: HasCallStack => IO () wat = do wrap "unfrozen" (printSrcLoc getSrcLoc) withFrozenCallStack $ wrap "dolla" (printSrcLoc getSrcLoc) withFrozenCallStack wrap "undolla" (printSrcLoc getSrcLoc) printSrcLoc :: SrcLoc -> IO () printSrcLoc = putStrLn . 
prettySrcLoc wrap :: HasCallStack => String -> IO a -> IO a wrap message action = do putStrLn $ concat [ "Beginning ", message , ", called at ", prettySrcLoc getSrcLoc ] a <- action putStrLn $ "Ending " <> message pure a </code></pre></div></div> Before seeing the answer and discussion below, consider and predict what <code class="language-plaintext highlighter-rouge">SrcLoc</code> you expect to see printed out when <code class="language-plaintext highlighter-rouge">wat</code> is called. Let’s zoom in on that: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> wrap "unfrozen" (printSrcLoc getSrcLoc) withFrozenCallStack $ wrap (print getSrcLoc) withFrozenCallStack wrap (print getSrcLoc) </code></pre></div></div> Both lines type check just fine. The difference is in which <code class="language-plaintext highlighter-rouge">CallStack</code>s are frozen. The first line freezes the <code class="language-plaintext highlighter-rouge">CallStack</code> for the entire expression, <code class="language-plaintext highlighter-rouge">wrap (print getSrcLoc)</code>. The second line only freezes the <code class="language-plaintext highlighter-rouge">CallStack</code> for the <code class="language-plaintext highlighter-rouge">wrap</code> function - the <code class="language-plaintext highlighter-rouge">CallStack</code> for the <code class="language-plaintext highlighter-rouge">(print getSrcLoc)</code> is unfrozen. 
Let’s see what happens when we run that: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>wat Beginning unfrozen, called at src/Lib.hs:51:40 in callstack-examples-0.1.0.0-9VnJFsvI3QO7TuvXNKcHBF:Lib src/Lib.hs:40:34 in callstack-examples-0.1.0.0-9VnJFsvI3QO7TuvXNKcHBF:Lib Ending unfrozen Beginning dolla, called at test/Spec.hs:34:19 in main:Main test/Spec.hs:34:19 in main:Main Ending dolla Beginning undolla, called at test/Spec.hs:34:19 in main:Main src/Lib.hs:42:53 in callstack-examples-0.1.0.0-9VnJFsvI3QO7TuvXNKcHBF:Lib Ending undolla </code></pre></div></div> For <code class="language-plaintext highlighter-rouge">unfrozen</code>, <code class="language-plaintext highlighter-rouge">wrap</code> prints the <code class="language-plaintext highlighter-rouge">SrcLoc</code> that corresponds to its <code class="language-plaintext highlighter-rouge">putStrLn $ concat [..., getSrcLoc]</code> call. This always points to the <code class="language-plaintext highlighter-rouge">wrap</code> definition site - we’d want to freeze that <code class="language-plaintext highlighter-rouge">getSrcLoc</code> if we wanted the call site of <code class="language-plaintext highlighter-rouge">wrap</code> in that case. The next line (<code class="language-plaintext highlighter-rouge">src/Lib.hs:40:34 ...</code>) is our <code class="language-plaintext highlighter-rouge">printSrcLoc getSrcLoc</code> function provided to <code class="language-plaintext highlighter-rouge">wrap</code>. That <code class="language-plaintext highlighter-rouge">SrcLoc</code> points to the call site of <code class="language-plaintext highlighter-rouge">getSrcLoc</code> in the file for that function. For <code class="language-plaintext highlighter-rouge">dolla</code>, we’ve frozen the <code class="language-plaintext highlighter-rouge">CallStack</code> for both <code class="language-plaintext highlighter-rouge">wrap</code> and the function argument. 
That means the <code class="language-plaintext highlighter-rouge">SrcLoc</code> we get for both cases is the same - so we’re not really returning the exact <code class="language-plaintext highlighter-rouge">SrcLoc</code>, but the most recent <code class="language-plaintext highlighter-rouge">SrcLoc</code> before the entire <code class="language-plaintext highlighter-rouge">CallStack</code> was frozen. This <code class="language-plaintext highlighter-rouge">SrcLoc</code> corresponds to the call-site of <code class="language-plaintext highlighter-rouge">wat</code> in the test suite binary, not the library code that defined it. For <code class="language-plaintext highlighter-rouge">undolla</code>, we’ve only frozen the <code class="language-plaintext highlighter-rouge">CallStack</code> for <code class="language-plaintext highlighter-rouge">wrap</code>, and we leave it untouched for <code class="language-plaintext highlighter-rouge">printSrcLoc getSrcLoc</code>. The result is that <code class="language-plaintext highlighter-rouge">wrap</code> prints out the frozen <code class="language-plaintext highlighter-rouge">CallStack</code> pointing to the callsite of <code class="language-plaintext highlighter-rouge">wat</code> in the test binary, while the function argument <code class="language-plaintext highlighter-rouge">printSrcLoc getSrcLoc</code> is able to access the <code class="language-plaintext highlighter-rouge">CallStack</code> with new frames added. It’s easiest to see what’s going on here with explicit function parenthesization. Haskell uses whitespace for function application, which makes the parentheses implicit for multiple argument functions. 
Let’s write the above expressions with explicit parens around <code class="language-plaintext highlighter-rouge">withFrozenCallStack</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> (withFrozenCallStack (wrap "dolla" (printSrcLoc getSrcLoc))) (withFrozenCallStack wrap) "undolla" (printSrcLoc getSrcLoc) </code></pre></div></div> I almost wish that <code class="language-plaintext highlighter-rouge">withFrozenCallStack</code> always required parentheses, just to make this clearer - but that’s not possible to enforce. <h2 id="freezing-broke-mkgithublink">Freezing Broke <code class="language-plaintext highlighter-rouge">mkGithubLink</code></h2> Unfortunately, yeah, <code class="language-plaintext highlighter-rouge">mkGithubLinkFrozen</code> is broken if we’ve frozen the <code class="language-plaintext highlighter-rouge">CallStack</code> externally: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- test main :: HasCallStack => IO () main = do -- === line 16 -- snip... example "frozen githublink" do putStrLn (withFrozenCallStack mkGithubLinkFrozen) -- ^^^ line 37 </code></pre></div></div> This outputs: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>frozen githublink https://www.github.com/bitemyapp/esqueleto/blob/master/test/Spec.hs#L16-L16 </code></pre></div></div> Line 16 points to <code class="language-plaintext highlighter-rouge">main</code>, where we’ve included our <code class="language-plaintext highlighter-rouge">HasCallStack</code> constraint. What if we omit that constraint? <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main :: IO () main = do -- === line 16 -- snip... 
example "frozen githublink" do putStrLn (withFrozenCallStack mkGithubLinkFrozen) -- ^^^ line 37 </code></pre></div></div> This outputs: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>frozen githublink callstack-examples-test: Prelude.head: empty list </code></pre></div></div> Uh oh! Well, <code class="language-plaintext highlighter-rouge">GHC.Stack</code> doesn’t provide a utility for us to unfreeze the <code class="language-plaintext highlighter-rouge">CallStack</code>, which makes sense - that would break whatever guarantee that <code class="language-plaintext highlighter-rouge">withFrozenCallStack</code> is providing. If we look at the <a href="https://www.stackage.org/haddock/lts-20.20/base-4.16.4.0/src/GHC.Stack.Types.html#CallStack">internal definitions</a> for <code class="language-plaintext highlighter-rouge">CallStack</code>, we’ll see that it’s a list-like type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data CallStack = EmptyCallStack | PushCallStack [Char] SrcLoc CallStack | FreezeCallStack CallStack </code></pre></div></div> Then we can see <a href="https://www.stackage.org/haddock/lts-20.20/base-4.16.4.0/src/GHC.Stack.html#withFrozenCallStack"><code class="language-plaintext highlighter-rouge">withFrozenCallStack</code>’s implementation</a>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>withFrozenCallStack :: HasCallStack => ( HasCallStack => a ) -> a withFrozenCallStack do_this = -- we pop the stack before freezing it to remove -- withFrozenCallStack's call-site let ?callStack = freezeCallStack (popCallStack callStack) in do_this </code></pre></div></div> That <code class="language-plaintext highlighter-rouge">?callStack</code> syntax is GHC’s <code class="language-plaintext highlighter-rouge">ImplicitParams</code> extension - it’s an implementation detail that GHC may change at any 
point in the future. Let’s rely on that detail! It has remained true for 10 major versions of <code class="language-plaintext highlighter-rouge">base</code>, and we can always try and upstream this officially… <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import GHC.Stack.Types thawCallStack :: CallStack -> CallStack thawCallStack stack = case stack of FreezeCallStack stk -> stk _ -> stack withThawedCallStack :: HasCallStack => (HasCallStack => r) -> r withThawedCallStack action = let ?callStack = thawCallStack (popCallStack callStack) in action </code></pre></div></div> Unfortunately, we can’t call this within <code class="language-plaintext highlighter-rouge">mkGithubLink</code> - that unfreezes the <code class="language-plaintext highlighter-rouge">CallStack</code>, but at that point, it’s too late. Yet another “safe” use of <code class="language-plaintext highlighter-rouge">head</code> that turns out to be unsafe! Only in Haskell might we have a totally empty stack trace… <h1 id="propagating-callstack">Propagating <code class="language-plaintext highlighter-rouge">CallStack</code></h1> When you write a top-level function, you can include a <code class="language-plaintext highlighter-rouge">CallStack</code>. Any time you call <code class="language-plaintext highlighter-rouge">error</code>, the existing <code class="language-plaintext highlighter-rouge">CallStack</code> will be appended to the <a href="https://www.stackage.org/haddock/lts-20.20/base-4.16.4.0/Control-Exception.html#t:ErrorCall"><code class="language-plaintext highlighter-rouge">ErrorCall</code></a> thrown exception, which you can see by matching on <code class="language-plaintext highlighter-rouge">ErrorCallWithLocation</code> instead of plain <code class="language-plaintext highlighter-rouge">ErrorCall</code>. <code class="language-plaintext highlighter-rouge">CallStack</code> propagation is fragile. 
Any function which does not include a <code class="language-plaintext highlighter-rouge">HasCallStack</code> constraint will break the chain, and you’ll only have the lowest level of the <code class="language-plaintext highlighter-rouge">CallStack</code>. Consider <code class="language-plaintext highlighter-rouge">boom</code> and <code class="language-plaintext highlighter-rouge">boomStack</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>boom :: Int boom = error "oh no" boomStack :: HasCallStack => Int boomStack = error "oh no, but with a trace" </code></pre></div></div> If we evaluate these, then we’ll see very different information. <code class="language-plaintext highlighter-rouge">error</code> will summon its own <code class="language-plaintext highlighter-rouge">CallStack</code>, which will include the callsite of <code class="language-plaintext highlighter-rouge">error</code> itself: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>callstack-examples-test: oh no CallStack (from HasCallStack): error, called at src/Lib.hs:76:8 in callstack-examples-0.1.0.0-9VnJFsvI3Q O7TuvXNKcHBF:Lib </code></pre></div></div> Line 76 and column 8 point exactly to where <code class="language-plaintext highlighter-rouge">error</code> is called in the definition of <code class="language-plaintext highlighter-rouge">boom</code>. 
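To see the chain break in isolation, here is a minimal sketch that only looks at the function names recorded in the stack. The names `frameNames`, `withConstraint`, `withoutConstraint`, and `outer` are hypothetical and are not part of the example repository; the sketch assumes only `callStack` and `getCallStack` from `GHC.Stack`.

```haskell
import GHC.Stack (HasCallStack, callStack, getCallStack)

-- Hypothetical helper: the function names recorded in the current
-- CallStack, innermost call first.
frameNames :: HasCallStack => [String]
frameNames = map fst (getCallStack callStack)

-- Carries the constraint, so its call site is pushed onto the caller's stack.
withConstraint :: HasCallStack => [String]
withConstraint = frameNames

-- No constraint: there is no stack in scope here, so the call to
-- frameNames starts a fresh CallStack and the caller's frames are lost.
withoutConstraint :: [String]
withoutConstraint = frameNames

-- Evaluates both variants from the same calling context.
outer :: HasCallStack => ([String], [String])
outer = (withConstraint, withoutConstraint)
```

Called from a `main` that has no `HasCallStack` constraint, the first component should list `frameNames`, `withConstraint`, and `outer`, while the second should contain only `frameNames` - the same difference that shows up between `boom` and `boomStack`.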
Let’s evaluate <code class="language-plaintext highlighter-rouge">boomStack</code> now: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>callstack-examples-test: oh no, but with a trace CallStack (from HasCallStack): error, called at src/Lib.hs:79:13 in callstack-examples-0.1.0.0-9VnJFsvI3QO7TuvXNKcHBF:Lib boomStack, called at test/Spec.hs:40:15 in main:Main main, called at test/Spec.hs:16:1 in main:Main </code></pre></div></div> Now, we see the entry for <code class="language-plaintext highlighter-rouge">error</code>’s call-site, as well as <code class="language-plaintext highlighter-rouge">boomStack</code>’s call site, and finally <code class="language-plaintext highlighter-rouge">main</code> - the entire chain! Remembering to put <code class="language-plaintext highlighter-rouge">HasCallStack</code> constraints everywhere is a bit of a drag, which is another motivation for my <a href="https://hackage.haskell.org/package/annotated-exception-0.2.0.4/docs/Control-Exception-Annotated.html"><code class="language-plaintext highlighter-rouge">annotated-exception</code></a> library - all of the functions which touch exceptions in any way will push a stack frame onto any exception that has been thrown. This means that any <code class="language-plaintext highlighter-rouge">catch</code> or <code class="language-plaintext highlighter-rouge">finally</code> or similar will do a much better job of keeping track of the stack frame. Diagnosing problems becomes far easier. We can do this for <code class="language-plaintext highlighter-rouge">ErrorCall</code>, but it’s annoying, because the location is a <code class="language-plaintext highlighter-rouge">String</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkStackFrameLines :: CallStack -> [String] mkStackFrameLines = map formatFrame . 
getCallStack where formatFrame (fn, srcLoc) = fn <> ", called at " <> prettySrcLoc srcLoc addStackFrame :: HasCallStack => IO a -> IO a addStackFrame action = do let newLines = map (" " <>) $ mkStackFrameLines callStack appendLoc locs = unlines (locs : newLines) action `catch` \(ErrorCallWithLocation err loc) -> throwIO $ ErrorCallWithLocation err (appendLoc loc) -- These functions are used here . -- Try and predict what their output will be! moreContextPlease :: IO () moreContextPlease = addStackFrame $ do print boom moreContextPleaseStacked :: HasCallStack => IO () moreContextPleaseStacked = addStackFrame $ do print boom </code></pre></div></div> When we evaluate <code class="language-plaintext highlighter-rouge">moreContextPlease</code>, we’ll see this: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>callstack-examples-test: oh no CallStack (from HasCallStack): error, called at src/Lib.hs:77:8 in callstack-examples-0.1.0.0 addStackFrame, called at src/Lib.hs:84:5 in callstack-examples-0.1.0.0 </code></pre></div></div> This gives us a little more context - we at least have that <code class="language-plaintext highlighter-rouge">addStackFrame</code> call. But <code class="language-plaintext highlighter-rouge">addStackFrame</code> happily adds everything in the trace, and <code class="language-plaintext highlighter-rouge">moreContextPleaseStacked</code> has an unbroken line: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>callstack-examples-test: oh no CallStack (from HasCallStack): error, called at src/Lib.hs:77:8 in callstack-examples-0.1.0.0-9VnJFsvI3Q O7TuvXNKcHBF:Lib addStackFrame, called at src/Lib.hs:89:5 in callstack-examples-0.1.0.0-9V nJFsvI3QO7TuvXNKcHBF:Lib moreContextPleaseStacked, called at test/Spec.hs:40:9 in main:Main main, called at test/Spec.hs:16:1 in main:Main </code></pre></div></div> Wow! 
A complete stack trace, all the way from <code class="language-plaintext highlighter-rouge">error</code> to <code class="language-plaintext highlighter-rouge">main</code>. You never see that. Unfortunately, the <code class="language-plaintext highlighter-rouge">String</code> makes deduplicating lines more challenging. <code class="language-plaintext highlighter-rouge">boomStack</code> included the <code class="language-plaintext highlighter-rouge">HasCallStack</code>, which would be an unbroken chain too - let’s see how that plays out… <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>moreContextPleaseStacked :: HasCallStack => IO () moreContextPleaseStacked = addStackFrame $ do print boomStack </code></pre></div></div> Evaluating this now gives us: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>callstack-examples-test: oh no, but with a trace CallStack (from HasCallStack): error, called at src/Lib.hs:80:13 in callstack-examples-0.1.0.0-9VnJFsvI3 QO7TuvXNKcHBF:Lib boomStack, called at src/Lib.hs:90:15 in callstack-examples-0.1.0.0-9VnJF svI3QO7TuvXNKcHBF:Lib moreContextPleaseStacked, called at test/Spec.hs:40:9 in main:Main main, called at test/Spec.hs:16:1 in main:Main addStackFrame, called at src/Lib.hs:89:5 in callstack-examples-0.1.0.0-9V nJFsvI3QO7TuvXNKcHBF:Lib moreContextPleaseStacked, called at test/Spec.hs:40:9 in main:Main main, called at test/Spec.hs:16:1 in main:Main </code></pre></div></div> We get <code class="language-plaintext highlighter-rouge">error</code>, <code class="language-plaintext highlighter-rouge">boomStack</code>, <code class="language-plaintext highlighter-rouge">moreContextPleaseStacked</code>, <code class="language-plaintext highlighter-rouge">main</code> - the original stack trace. 
Then we append to that <code class="language-plaintext highlighter-rouge">addStackFrame</code>, which also adds in <code class="language-plaintext highlighter-rouge">moreContextPleaseStacked</code> and <code class="language-plaintext highlighter-rouge">main</code> again. So, clearly, this is noisier than it needs to be - ideally, we would not include duplicates. This should be possible - <code class="language-plaintext highlighter-rouge">addStackFrame</code> could potentially parse the location <code class="language-plaintext highlighter-rouge">String</code> and if it finds a shared suffix (in this case, <code class="language-plaintext highlighter-rouge">moreContextPleaseStacked</code>), then it can only insert the <code class="language-plaintext highlighter-rouge">addStackFrame</code> call above it and drop the rest. <h1 id="annotated-exception"><code class="language-plaintext highlighter-rouge">annotated-exception</code></h1> I’ve mentioned <a href="https://hackage.haskell.org/package/annotated-exception-0.2.0.4/docs/Control-Exception-Annotated.html"><code class="language-plaintext highlighter-rouge">annotated-exception</code></a> a few times. This library extends the <code class="language-plaintext highlighter-rouge">CallStack</code> machinery to any exception that is thrown by the library or passes through an exception handler. Additionally, you can provide additional metadata information on your exceptions, which makes debugging them much more useful. You can now transparently add, say, the logged in user ID to every single exception that gets thrown in a code block. The source code for this blog post is available <a href="https://github.com/parsonsmatt/callstack-examples">at this GitHub repository</a>. 
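Returning to the deduplication idea from the <code class="language-plaintext highlighter-rouge">addStackFrame</code> discussion: here is a minimal sketch of the "shared suffix" merge. It is not code from the repository. The names <code class="language-plaintext highlighter-rouge">sharedSuffixLength</code> and <code class="language-plaintext highlighter-rouge">mergeFrames</code> are hypothetical, and the sketch assumes the location <code class="language-plaintext highlighter-rouge">String</code> has already been split into one frame per line, with the header line removed and indentation normalised so that equal frames compare equal.

```haskell
-- | Number of trailing frame lines that the two traces have in common.
sharedSuffixLength :: [String] -> [String] -> Int
sharedSuffixLength xs ys =
  length (takeWhile id (zipWith (==) (reverse xs) (reverse ys)))

-- | Merge the frames already on the exception ('existing') with the frames
-- that a handler wants to add ('new'). Frames unique to 'new' are inserted
-- just above the shared suffix, and the duplicated suffix is kept only once.
-- With no shared suffix this degrades to plain 'existing ++ new'.
mergeFrames :: [String] -> [String] -> [String]
mergeFrames existing new = existingOnly ++ newOnly ++ shared
  where
    n = sharedSuffixLength existing new
    (existingOnly, shared) = splitAt (length existing - n) existing
    newOnly = take (length new - n) new
```

Applied to the trace above, this would keep <code class="language-plaintext highlighter-rouge">error</code> and <code class="language-plaintext highlighter-rouge">boomStack</code>, insert <code class="language-plaintext highlighter-rouge">addStackFrame</code>, and then list <code class="language-plaintext highlighter-rouge">moreContextPleaseStacked</code> and <code class="language-plaintext highlighter-rouge">main</code> a single time. Comparing whole lines is a simplification - a real implementation would probably want to compare parsed source locations instead.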
Thu, 11 May 2023 00:00:00 +0000 https://www.parsonsmatt.org/2023/05/11/working_with_haskell_callstack.html https://www.parsonsmatt.org/2023/05/11/working_with_haskell_callstack.html Garmin Fenix 6 Pro vs Apple Watch SE I’ve been pondering a smart watch for tracking more of my daily activity and health, as well as getting some metrics on my non-cycling workouts. I had narrowed the field down to the Fitbit Charge, Apple Watch SE, and Garmin Fenix 6. The Fenix seemed like the winner - awesome battery life, navigation, and even music on board. However, even at a discount, it’s $450, so it must be awesome to be worthwhile. tl;dr: Garmin’s anticooperative features are a dealbreaker. <ul> <li>Update History <ul> <li>2023-02-17: I got an Apple Watch and the comparison has me feeling like maybe I should keep the Garmin. As a result, this may become more of a “fitness watch review and comparison post.”</li> </ul> </li> </ul> <h1 id="garmin-fenix-6s">Garmin Fenix 6s</h1> <h2 id="the-good">The Good</h2> The watch comes in three sizes, and the 6s doesn’t look too big on my wrist, despite having small wrists. The Body Battery and sleep tracking are pretty good. Over the past three weeks or so of owning the Fenix, the Body Battery has dialed in to how I’m physically feeling, and I can pretty well tune my workouts and daily activities based on the recommendations. The pulse oximeter has been interesting for tracking sleep - when I wear a nose strip, my blood oxygen level is higher overnight. This is actionable and useful information, though I already knew that a nose strip helps a lot for sleep quality. The device appears to be very rugged. The lack of touch screen is a plus to me - I don’t like the idea of accidentally touching or disabling things. The button interface is a bit clunky, but overall feels reliable and secure, even when operating with gloves. The battery life is excellent, so long as you’re not downloading music or maps or navigating. 
<h2 id="the-bad">The Bad</h2> Garmin has a reputation for being anticooperative. Their devices, software, etc, have always blocked features that would be extremely useful if they think they can sell you another device or service. They desperately want to be your only interface into the fitness world. In that sense, they’re a lot like Apple. But unlike Apple, you don’t build your lifestyle around your fitness devices. I own a Wahoo bike computer and a Withings smart scale. Garmin explicitly is not compatible with these devices. Garmin explicitly chooses to not support syncing workouts from Strava to calculate “training readiness,” and they explicitly don’t sync weight from Apple Health. They want you to record workouts using Garmin Connect, and they want you to buy a Garmin smart scale. They used to support API workout uploads, so you could sync from Strava to Garmin using a third party. However, they recently disabled this. And - even if you could upload Strava workouts - they don’t count for the “Training Readiness” score that the app gives you. Some more annoyances: <ul> <li>You can’t do a firmware update over the phone. You need to plug in to a Windows or Mac computer. My computers are all Ubuntu, so this is a bad deal. I’ve read some reports that WINE can do it, but I’m not sure I want to put the firmware on a $450 device up to WINE and Garmin’s programmers.</li> <li>The music download feature is broken. I tried to download a playlist to the watch, and it burned the battery from 60% to dead in about 20 minutes. It didn’t even download anything.</li> <li>Activity tracking and options are overwhelming, but “track activities automatically” doesn’t work - it never registered a bike ride as a bike ride.</li> </ul> <h2 id="the-choice">The Choice</h2> So, Garmin presents a bargain: <blockquote> The Fenix is a fantastic watch for adventure and activity. 
But for you to really get the full benefit, you need to switch to Garmin Connect, and Garmin smart scale, and use Garmin for everything. Yeah, that includes tossing your Wahoo bike computer and buying a Garmin. Once you do that, you’ll get Training Status, Body Battery, and other useful training tools. </blockquote> The bargain is… expensive. The Garmin smart scale is more expensive than the Withings by a large amount, and per reviews, it’s not even as good. The Garmin bike computers are way worse than Wahoo, IME, and the cheaper ones have arbitrary software locks (like not being able to sync a route over Bluetooth). Furthermore, you know you’re signing up for a closed ecosystem that will never cooperate with anything else. This sort of ecosystem lock is comparable to what Apple is trying to do, but at a much smaller niche. For the most part, Apple’s ecosystem lock is about adding functionality. One Apple device is quite useful, and doesn’t really do anything to explicitly block compatibility. Sure, the Apple Watch only works on iPhone. But you can use a Fitbit or any other smart watch with the iPhone. You don’t need a Macbook to use an iPhone or iPad. Garmin, on the other hand, explicitly denies compatibility with other devices. Wahoo and Garmin don’t talk to each other. Wahoo is perfectly happy to talk to basically every other company - but Garmin won’t allow it. Garmin won’t even read data from Apple Health, if it can provide a slight edge to selling a product. <h2 id="the-alternatives">The Alternatives</h2> <a href="https://www.dcrainmaker.com/2021/11/best-gps-sport-smartwatches-recommendations-guide-2021.html">DC Rainmaker has a great guide to GPS sport watches</a>. I selected the Fenix as the “best choice” in the Hiking/Adventure watch, since I see myself as an “adventurer.” The COROS Vertix is the other recommendation in the category. 
That gives you the <a href="https://www.dcrainmaker.com/2021/05/coros-rolls-out-evolab-revamped-training-load-metrics-a-detailed-explainer.html">EvoLab “Training Load Metrics”</a>, which should be a nice competitor to “Body Battery.” It also gives a much better battery life - 60 hours - instead of the 24/36 from Garmin. However, at $500, it also seems like more of a “specialized/primary tool for runners/hikers,” and realistically, I’m a cyclist - my bike computer will be my go-to device for most activity I do. The Apple Watch SE seems to give most of the same featureset as the Garmin, with the main downside being battery life. However, the watch charges pretty quickly, and a habit of charging every day is pretty much what I’ve settled into with the Garmin. It’s also one of the cheapest options to provide mapping and navigation, and the WatchOS platform supports a bunch of apps like RideWithGPS and Hiking Project. The Fitbit Charge is considerably cheaper, and is only a basic activity/sleep tracker. If you don’t want any smartwatch features, then this may be a good bet. I’m boxing up and returning the Fenix today, and I’ve got an order in for the Apple Watch SE. <h1 id="2023-02-17-the-apple-watch-se">2023-02-17: The Apple Watch SE</h1> So, I haven’t actually returned the Fenix yet. I boxed it up, printed the return label, but then loitered at the bike store instead of going to the UPS store to drop it off. It’s sitting in my bag, just waiting for packaging. I picked up an Apple Watch SE on Monday the 13th. At $249, it’s about $200 cheaper than the Fenix 6s. These devices are very different. Really, they’re not in the same market category at all. The Apple competitor to the Garmin is probably the new Apple Watch Ultra, with longer battery life and more outdoor features. But at $800 that’s nearly twice what I paid for the Garmin, and I’m not that excited about having a watch. 
I think the differences actually make for an interesting compare-and-contrast, which is why I’m extending this post with the review of the Apple Watch. <h2 id="aesthetics">Aesthetics</h2> It’s an Apple product, so you really need to start with the aesthetics. The Apple Watch SE has a minimal interface compared to the Garmin. Despite having the same “watch face” size, the border of the Garmin is significantly larger. The Apple Watch looks like a sleek, chic, modern urban accessory. The variety of bands available is also cool - the fabric fitness band that I selected is much more comfortable than the rubber/plastic band that came with the Fenix stock. Apple provides much more choice and flexibility with the watch faces. Being able to select photos as a background is a really nice touch. I love looking down and seeing my cat doing something silly, or a beautiful photo memory of some bike ride or camping trip. Aesthetics are deeply personal. I like the Garmin Fenix - the rugged and sporty look works for me. However, the Apple Watch looks better when I’m not outdoors, especially if I’m dressed up (to the extent that a software engineer in Colorado ever dresses up). <h2 id="ease-of-use">Ease of Use</h2> The Apple Watch SE is extremely easy to use. Apple provides a touch screen, a button, and a scroll wheel button. The scroll wheel is a nice touch when gloves make it annoying to use the touch screen, but there’s only so much it can do. The real winner here is the gesture feature and Siri voice control. You can use Siri to start workouts, stop workouts, start timers, make reminders, etc - almost everything you’d expect your phone to do. The gestures are also very cool - you can set up the watch to do something if you pinch your fingers or clench your fist. This accessibility feature makes using the watch one handed significantly easier. The Garmin, on the other hand, relies on physical buttons. 
The experience is slightly clunkier - starting a workout is much more involved than “Hey Siri, start a yoga workout” or “Hey Siri, start an indoor cycling workout.” The plus side of this is that you won’t get mistaps from the touch screen or misclicks. The buttons on the Apple Watch are somewhat sensitive, and I’ve “misclicked” multiple times in the few days that I’ve owned it. There is no concern about gloves - buttons always work. In a “pure ease of use” contest, the Apple Watch wins easily. However, the reliability and security of physical buttons is an important feature, particularly in outdoor contexts. <h2 id="utility">Utility</h2> The Apple Watch series all share the same OS, and according to <a href="https://www.amazon.com/gp/bestsellers/electronics/7939901011">Amazon’s best sellers for smartwatches</a>, is the most popular smart watch platform. Even beyond the impressive built-in utility from Apple, the third party support is fantastic. Most cycling and fitness apps support the Apple Watch to some degree. Garmin also has an App Store of sorts, though the apps you can really put on a Garmin are much more limited. Apple Watch has more fitness apps available, and far more non-fitness apps. I can open the garage door with the Apple Watch. I can also use it as a walkie-talkie with my friends that have Apple Watches. I can record voice memos, deal with my car, send text messages, etc. It’s easier to see notifications and act on them. The Garmin is hindered here by being an “adventure fitness watch” and not a proper “smart watch.” So the comparison isn’t really fair. Fortunately, there are some points where the Garmin is clearly superior - so let’s dive into those. <h2 id="battery-life">Battery Life</h2> The Apple Watch has a relatively short battery life at only 18 hours. This means you will likely need to charge it several times per day. 
I’ve got a charger set up on my desk, where I bring it to a full charge in the morning (once I’m done with my morning routine and sleep). Then, I’ll charge it again when I am showering. This is usually enough to keep it working well enough for sleep tracking at night and some light activity tracking during the day. The Watch appears to require a 20W USB-C fast charger. I wasn’t able to get it to charge from my laptop’s USB-C port, nor the USB-C ports on my docking station. This is an inconvenience - I’m not sure I’d easily be able to charge it on a long hiking or bikepacking trip. The Garmin’s battery life is far better. Not only that, but the charging speed is faster. The “charge while showering” habit is all that’s necessary to bring it up to a full charge, even if I’ve forgotten to charge it for a few days. As a result, there’s less stress around the battery. With the Apple Watch, I feel like I’m needing to constantly manage the habit of keeping it charged, which is really more attention than I want to pay to a device. The Garmin is much more forgiving. The Garmin is also the clear winner for longer trips. While I would not bring the Apple Watch along for a multiday bikepacking expedition, the Garmin would definitely come along. <h2 id="fitness-information">Fitness Information</h2> The Garmin takes much finer grained fitness information, and does much more with it. But the Apple Watch does more for “health” - loudness levels, walking balance, etc. <h3 id="heart-rate">Heart Rate</h3> The Apple Watch takes heart rate readings periodically - about every 4 minutes according to my Apple Health information. The Garmin tracks much more frequently when not in a workout - it appears to be continuously reporting a Heart Rate number, although that is still probably sampling only every several seconds. Both track continuously during a workout, and the Apple Watch provides a nicer view into your heart rate zones. 
<h3 id="sleep-tracking">Sleep Tracking</h3> Both devices offer sleep tracking. I haven’t compared them directly, but both seem fairly good. The Garmin occasionally thinks I slept longer than I did, which is easy to correct. The Apple Watch doesn’t appear to have any ability to edit the overall sleep duration, but I also haven’t seen it be wrong yet, so that’s promising. On that brief experience, the Apple Watch appears to have a more reliable algorithm, so I’m tempted to trust it more. However, the Garmin provides a “stress” measurement during sleep, which can measure the quality of sleep. Apple measures the quantity of sleep and the time in various sleep stages, but it doesn’t try to tell you what that means. Garmin takes into account “stress” during sleep and incorporates that into a “Sleep Score.” In my experience, the “Sleep Score” did a pretty damn good job of predicting how I’d feel during the day. It did seem to notice when I had caffeine too late in the day, or even a single alcoholic beverage. The Apple Watch may be providing the same raw data, but I don’t know how to interpret it. The Garmin provides a much better reflection point. <h3 id="recovery-status">Recovery Status</h3> The Apple Watch does not attempt to provide a picture of your recovery status in the same way that Garmin does. Garmin tracks your “stress level” in addition to your heart rate and presents a “Body Battery” score indicating your relative readiness. In my experience, this number tracks pretty well with how I’m actually feeling. I haven’t had a time where I saw the number, checked in with my feelings, and thought “wow that’s wrong.” Much more often, I’d see a low number, reflect, and realize how tired I was. The Apple Watch does track HRV, though it appears to periodically take measurements throughout the day. This approach is inherently pretty noisy. 
<a href="https://medium.com/@altini_marco/how-to-make-sense-of-your-apple-watch-heart-rate-variability-hrv-data-89bf4a510438">This post by Marco Altini</a> goes into detail on the best way to use the HRV data from the Apple Watch, which is tricky. The Apple Watch tracks “heart rate recovery,” but the metric is pretty limited. It only works if you end a workout with your heart rate near peak. So if your workout doesn’t get to a peak heart rate, you won’t get a reliable number. You also won’t get a reliable number if your workout has a cool-down. In terms of providing feedback for training, the Garmin is far better. Now, that doesn’t mean that you can’t get useful feedback with the Apple Watch. Most cardio-based training apps (Strava, Xert, etc) will provide some “fitness/fatigue/form” numbers you can use to figure out how you’re feeling. This is often “good enough,” especially if you’re sufficiently embodied to just “know” how sleepy and stressed you are. Those models are often limited by only taking into account workout data. And that needs to be calibrated against some training parameters, like Functional Threshold Power or Maximum Heart Rate - so if you expect those are wrong or off (which, coming off of a surgery, they definitely are for me), then you shouldn’t expect them to be too accurate. The Garmin’s more holistic view of stress and fitness seems like a really useful tool for balancing actual recovery and not just training inputs. This is a big deal. I tend to take on about as much stress as I can in my life, and I’m not nearly as embodied as I would like. If the Garmin can help me attune to my own sensations better and provide more actionable recovery feedback, then that is very valuable to me. Does Apple Watch have a third party app that mimics Body Battery? <a href="https://www.reddit.com/r/AppleWatch/comments/erlzx7/is_there_an_app_thats_similar_to_garmins_body/">A three year old Reddit topic provides several options</a>. 
The most relevant one appears to be <a href="https://trainingtodayapp.com/">Training Today</a> and a few more specific sleep tracking apps. The app appears to be pretty good - I just downloaded it, and it loaded my data and said “You’re on the more tired side. Keep to Z1 or Z2.” This is fair - I just did a 2x20 at 100% of FTP workout yesterday, and that’s about what I’d expect. The free app gives the basic data you need, and only $20 lifetime for more advanced features is great. Garmin’s information is more detailed. <h3 id="training-status">Training Status</h3> I haven’t actually used this, because Garmin’s lack of interoperability means that I haven’t recorded any real workouts with the watch. If you do use a Garmin device to track workouts, then Garmin gives you information about your VO2 max, and provides some data about how your training is going. I can see this being effective, especially for runners, but cyclists tend to use power as the source-of-truth for training, and most training apps/websites provide that information pretty well. <h3 id="workouts">Workouts</h3> The Garmin can connect to power meters, external heart rate monitors, and other sensors. The Apple Watch cannot. So the Garmin is a better “fitness monitoring” device. However, the Apple Watch is better for actually doing workouts. Siri’s voice control is super nice for starting/stopping workouts and setting timers, all of which are pretty dang useful during a workout. The scroll wheel is a better interface for most things than Garmin’s clunky up/down button. The touch screen isn’t great when sweaty, but it’s not a disaster, either. For serious training, the Garmin wins, but for the more casual user, the Apple Watch is probably a better fit. If you’re a cyclist and into “serious training,” you probably have a dedicated bike computer anyway, which does the job much better than any watch. 
But if you’re also interested in running, snowshoeing, hiking, skiing, etc, then the bike computer is obviously a worse fit. <h1 id="garmin-vs-apple">Garmin vs Apple</h1> I’ve had the Apple Watch for about a week, and I used the Garmin for about two weeks before deciding to write this up and switch to the Apple Watch. The Apple Watch is $200 cheaper and has many more non-fitness features. Even if you spend the $5 on Autosleep and $20 on Training Today, you’re $175 cheaper - and now the Garmin’s only real advantage is the longer battery life. Thus, the question: Does it make sense to pay $175 more for a much nicer battery, and also lose a ton of really good features? For me, no. In large part, that’s because I’m a cyclist, and I already have an optimal setup for tracking cycling workouts - a bike computer, a chest strap heart rate monitor, and a power meter. If I didn’t have that stuff, then the Garmin becomes much more interesting. The Garmin can talk directly to a power meter when recording workouts, and is a heart rate monitor. It can also provide navigation, routes, data pages for workouts, and other good features. Changing anything on the bike would be a pain in the ass, though. The Apple Watch would rely on using my phone to record workouts, since it cannot talk directly to a power meter. However, it also can’t natively broadcast heart rate to other apps - <a href="https://apps.apple.com/us/app/heartcast-heart-rate-monitor/id1499771124">there’s a third party app</a>, but it only has 2.9 stars - maybe unreliable? 
So, to sum everything up: <ul> <li>The Garmin Fenix is a superior sport/fitness watch, if you use it as your central device for fitness tracking, and if a watch is a better form factor than a bike computer</li> <li>The Apple Watch SE is better in every other way, aside from battery life</li> <li>The Apple Watch SE is cheap enough that you can get a Wahoo Elemnt BOLT and the watch, which is a better combination for cycling than just the Garmin</li> </ul> Mon, 13 Feb 2023 00:00:00 +0000 https://www.parsonsmatt.org/2023/02/13/garmin_fenix_6_pro_review.html https://www.parsonsmatt.org/2023/02/13/garmin_fenix_6_pro_review.html Production Haskell Complete I’m happy to announce that my book “Production Haskell” is complete. The book is a 500+ page distillation of my experience working with Haskell in industry. I believe it’s the best resource available for building and scaling the use of Haskell in business. To buy the ebook, go to <a href="https://leanpub.com/production-haskell">the Leanpub page</a> - the price is slightly lower here than on Amazon. To buy hard copies, go to <a href="https://www.amazon.com/dp/B0BTNSJRKD">Amazon</a>. Thanks to all of you for reading my blog, commenting on Reddit, and encouraging me to write the book in the first place. Thu, 02 Feb 2023 00:00:00 +0000 https://www.parsonsmatt.org/2023/02/02/production_haskell_complete.html https://www.parsonsmatt.org/2023/02/02/production_haskell_complete.html Haddock Performance I was recently made aware that <code class="language-plaintext highlighter-rouge">haddock</code> hasn’t been working, at all, on the Mercury code base. I decided to investigate. Watching <code class="language-plaintext highlighter-rouge">htop</code>, <code class="language-plaintext highlighter-rouge">haddock</code> slowly accumulated memory, until it exploded in use and invoked the OOM killer. My laptop has 64GB of RAM. What. I rebooted, and tried again. 
With no other programs running, <code class="language-plaintext highlighter-rouge">haddock</code> was able to complete. I enabled <code class="language-plaintext highlighter-rouge">-v2</code> and <code class="language-plaintext highlighter-rouge">--optghc=-ddump-timings</code>, which printed out GHC timing information and Haddock time/memory information. With these flags, I could see that HTML generation alone was allocating 800GB of RAM. I decided to look at the source code and see if there were any low hanging fruit. Fortunately, there was! <h1 id="dont-use-writert-for-logging">Don’t use <code class="language-plaintext highlighter-rouge">WriterT</code> for logging</h1> This section culminated in <a href="https://github.com/haskell/haddock/pull/1543">this PR #1543</a>, which I’ll reference. At time of writing, it has been merged. The first thing that jumped out at me is that <code class="language-plaintext highlighter-rouge">haddock</code> used <code class="language-plaintext highlighter-rouge">WriterT</code> for logging. Even worse, it used <code class="language-plaintext highlighter-rouge">WriterT [String]</code>. This is maybe the slowest possible logging system you can imagine. <h2 id="at-least-use-the-cps-writer">At least, use the CPS Writer</h2> <code class="language-plaintext highlighter-rouge">WriterT</code> has a big problem with space leaks. Even the strict <code class="language-plaintext highlighter-rouge">WriterT</code> has this issue. The only <code class="language-plaintext highlighter-rouge">WriterT</code> that can avoid it is the <code class="language-plaintext highlighter-rouge">CPS</code> variant, newly available in <code class="language-plaintext highlighter-rouge">mtl-2.3</code> in <code class="language-plaintext highlighter-rouge">Control.Monad.Writer.CPS</code>. 
This is documented in Infinite Negative Utility’s post <a href="https://journal.infinitenegativeutility.com/writer-monads-and-space-leaks">“Writer Monads and Space Leaks”</a>, which references two posts from Gabriella Gonzalez to the mailing list in <a href="https://mail.haskell.org/pipermail/libraries/2012-October/018599.html">2012</a> and <a href="https://mail.haskell.org/pipermail/libraries/2013-March/019528.html">2013</a>. <h2 id="dont-use-string-or-string">Don’t use <code class="language-plaintext highlighter-rouge">[String]</code> or <code class="language-plaintext highlighter-rouge">String</code></h2> Beyond just leaking space, the <code class="language-plaintext highlighter-rouge">String</code> format for log messages is extremely inefficient. This is equal to a <code class="language-plaintext highlighter-rouge">[Char]</code>, which builds up as a big thunk in memory until the whole <code class="language-plaintext highlighter-rouge">WriterT</code> computation can complete. Each <code class="language-plaintext highlighter-rouge">Char</code> takes <a href="https://wiki.haskell.org/GHC/Memory_Footprint">2 words of memory</a>, and a <code class="language-plaintext highlighter-rouge">[Char]</code> takes <code class="language-plaintext highlighter-rouge">(1 + 3N) words + 2N</code> where <code class="language-plaintext highlighter-rouge">N</code> is the number of characters. Or, ~5 words per character. On a 64 bit machine, each word is 8 bytes, so each character costs 40 bytes. A UTF-8 encoded <code class="language-plaintext highlighter-rouge">ByteString</code> will use 1 to 4 bytes per character. Using a <code class="language-plaintext highlighter-rouge">ByteString</code> would make the representation much more compact, but these things get concatenated a bunch, and a <code class="language-plaintext highlighter-rouge">Builder</code> is the appropriate choice for an <code class="language-plaintext highlighter-rouge">O(1)</code> append. 
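To put rough numbers on that estimate, here is a small back-of-the-envelope sketch. It takes the formula quoted above at face value and assumes 8-byte words; the function names are made up for illustration, and the UTF-8 figure only counts payload bytes (it ignores the constant-size header of a <code class="language-plaintext highlighter-rouge">ByteString</code>).

```haskell
-- | Words used by a fully evaluated String of n characters, following the
-- formula quoted above: (1 + 3n) for the list spine plus 2n for the Chars.
stringWords :: Int -> Int
stringWords n = (1 + 3 * n) + 2 * n

-- | Assumption: a 64 bit machine, so one word is 8 bytes.
bytesPerWord :: Int
bytesPerWord = 8

-- | Bytes used by a fully evaluated String of n characters.
stringBytes :: Int -> Int
stringBytes n = bytesPerWord * stringWords n

-- | Worst-case UTF-8 payload for n characters (at most 4 bytes each).
utf8MaxBytes :: Int -> Int
utf8MaxBytes n = 4 * n
```

For a 100 character log line, <code class="language-plaintext highlighter-rouge">stringBytes 100</code> comes to 4008 bytes, while <code class="language-plaintext highlighter-rouge">utf8MaxBytes 100</code> is 400 bytes - and only 100 bytes if the message is plain ASCII.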
Switching to <code class="language-plaintext highlighter-rouge">CPS.WriterT [Builder]</code> instead of <code class="language-plaintext highlighter-rouge">WriterT [String]</code> helps, but we’re not done yet. <code class="language-plaintext highlighter-rouge">[]</code> is a bad choice for a <code class="language-plaintext highlighter-rouge">WriterT</code>, since <code class="language-plaintext highlighter-rouge">tell</code> will <code class="language-plaintext highlighter-rouge">mappend</code> the lists. <code class="language-plaintext highlighter-rouge">mappend</code> on lists can result in bad performance if it isn’t associated correctly - <code class="language-plaintext highlighter-rouge">(((a ++ b) ++ c) ++ d) ++ e</code> is accidentally quadratic, since we’ll traverse over the <code class="language-plaintext highlighter-rouge">a</code> list for every single <code class="language-plaintext highlighter-rouge">++</code> call. A “difference list” has much faster appends, since it can associate things correctly regardless of how you construct it. To make it easier to use the API, I created an <a href="https://github.com/haskell/haddock/pull/1543/files?diff=split&w=1#diff-fb24fea4d702952a1d040f7f9f4f7e547cbc2467b29587657c7fee02bfc1ee9bR686-R687"><code class="language-plaintext highlighter-rouge">ErrorMessages</code></a> type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype ErrorMessages = ErrorMessages { unErrorMessages :: [Builder] -> [Builder] }
    deriving newtype (Semigroup, Monoid)

runErrMsgM :: ErrMsgM a -> (a, ErrorMessages)
runErrMsgM = runWriter . unErrMsgM

singleMessage :: Builder -> ErrorMessages
singleMessage m = ErrorMessages $ (m :)

errorMessagesToList :: ErrorMessages -> [Builder]
errorMessagesToList messages = unErrorMessages messages []
</code></pre></div></div> <h2 id="dont-use-nub">Don’t use <code class="language-plaintext highlighter-rouge">nub</code></h2> There were a few places that called <code class="language-plaintext highlighter-rouge">nub</code>. <code class="language-plaintext highlighter-rouge">nub</code> is <code class="language-plaintext highlighter-rouge">O(n^2)</code> on a linked list - it must search the entire list for every element in the list to verify uniqueness. <ul> <li><a href="https://github.com/haskell/haddock/pull/1543/files?diff=split&w=1#diff-421689688051a1380a572c21720065b628525bba15300b606aca78753e699fdaL464"><code class="language-plaintext highlighter-rouge">nub</code> called on <code class="language-plaintext highlighter-rouge">packages</code></a>, a list of installed packages.</li> <li><a href="https://github.com/haskell/haddock/pull/1543/files?diff=split&w=1#diff-48dfb971b4cf3e94bd44acfafd718ac029840968b23192f25ac716d2cffe831fL177"><code class="language-plaintext highlighter-rouge">nub</code> called on <code class="language-plaintext highlighter-rouge">themeFiles</code></a></li> <li><a href="https://github.com/haskell/haddock/pull/1543/files?diff=split&w=1#diff-b5017e9be522fcf7f7bdfdf3fc9e8d32897ec99b84fc4122cee231ba83a0ea3cL327"><code class="language-plaintext highlighter-rouge">nub</code> called on <code class="language-plaintext highlighter-rouge">messages</code></a>, the result of the <code class="language-plaintext highlighter-rouge">WriterT [String]</code> computation</li> </ul> That last one is especially gnarly. 
We’re doing an <code class="language-plaintext highlighter-rouge">O(n^2)</code> job, leaking space along the way, and finally we accumulate the big list - only to do an <code class="language-plaintext highlighter-rouge">O(n^2)</code> traversal over it to delete duplicates. Fortunately, each call site of <code class="language-plaintext highlighter-rouge">nub</code> can be replaced with the easy <code class="language-plaintext highlighter-rouge">ordNub</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import qualified Data.Set as Set

ordNub :: Ord a => [a] -> [a]
ordNub = Set.toList . Set.fromList
</code></pre></div></div> This also sorts the list, which may not be desired. A more cumbersome implementation does this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ordNub :: Ord a => [a] -> [a]
ordNub = go Set.empty
  where
    go seen [] = []
    go seen (x:xs)
      | Set.member x seen = go seen xs
      | otherwise = x : go (Set.insert x seen) xs
</code></pre></div></div> <h2 id="results">Results</h2> This small change resulted in a huge improvement in performance for my test case. Running <code class="language-plaintext highlighter-rouge">haddock</code> on the <code class="language-plaintext highlighter-rouge">persistent-test</code> library, I observed a 30% improvement in the time to generate documentation with total memory use 4% better. Allocations went from 42GB to 25GB. I didn’t bother profiling to determine this as a hot-spot - it’s always wrong to use <code class="language-plaintext highlighter-rouge">WriterT</code> as a logger. A further performance improvement would be to remove <code class="language-plaintext highlighter-rouge">WriterT</code> entirely and simply output the messages directly. Then instead of retaining a big difference list of log messages, you can just print them right then and there. 
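To tie the two fixes from this section together, here is a small self-contained sketch: a difference list of messages plus an order-preserving deduplication. It is not the code from the PR. The names <code class="language-plaintext highlighter-rouge">Messages</code>, <code class="language-plaintext highlighter-rouge">single</code>, <code class="language-plaintext highlighter-rouge">messagesToList</code>, <code class="language-plaintext highlighter-rouge">leftNested</code> and <code class="language-plaintext highlighter-rouge">ordNubStable</code> are made up for illustration, and it uses <code class="language-plaintext highlighter-rouge">String</code> rather than <code class="language-plaintext highlighter-rouge">Builder</code> purely so that it only needs <code class="language-plaintext highlighter-rouge">base</code> and <code class="language-plaintext highlighter-rouge">containers</code>. The <code class="language-plaintext highlighter-rouge">Semigroup</code> instance is written by hand as function composition, which is what makes an append cheap no matter how the appends are nested.

```haskell
import qualified Data.Set as Set

-- | A difference list of log messages: a function that prepends its
-- messages to whatever list comes after them.
newtype Messages = Messages ([String] -> [String])

-- Appending is function composition, so it costs O(1) regardless of
-- whether the appends are nested to the left or to the right.
instance Semigroup Messages where
  Messages f <> Messages g = Messages (f . g)

instance Monoid Messages where
  mempty = Messages id

-- | A log containing exactly one message.
single :: String -> Messages
single m = Messages (m :)

-- | Materialise the list once, at the very end.
messagesToList :: Messages -> [String]
messagesToList (Messages f) = f []

-- | Build a log with left-nested appends, the association that is
-- quadratic for plain lists and '++'.
leftNested :: [String] -> Messages
leftNested = foldl (\acc m -> acc <> single m) mempty

-- | Remove duplicates in O(n log n), keeping the first occurrence of
-- each element and the original order.
ordNubStable :: Ord a => [a] -> [a]
ordNubStable = go Set.empty
  where
    go _ [] = []
    go seen (x:xs)
      | Set.member x seen = go seen xs
      | otherwise = x : go (Set.insert x seen) xs
```

With these definitions, <code class="language-plaintext highlighter-rouge">ordNubStable (messagesToList (leftNested ["b", "a", "b", "c", "a"]))</code> evaluates to <code class="language-plaintext highlighter-rouge">["b", "a", "c"]</code>: the messages come out in the order they were logged, and each one appears once.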
<h1 id="xhtml-and-string"><code class="language-plaintext highlighter-rouge">xhtml</code> and <code class="language-plaintext highlighter-rouge">[String]</code></h1> This section is represented by <a href="https://github.com/haskell/haddock/pull/1546">this <code class="language-plaintext highlighter-rouge">haddock</code> PR</a> and <a href="https://github.com/haskell/xhtml/pull/18">this <code class="language-plaintext highlighter-rouge">xhtml</code> PR</a>. <code class="language-plaintext highlighter-rouge">haddock</code> uses a library <a href="https://hackage.haskell.org/package/xhtml"><code class="language-plaintext highlighter-rouge">xhtml</code></a> for generating the HTML. This library is old - the initial copyright is 1999. <code class="language-plaintext highlighter-rouge">xhtml</code> predates <code class="language-plaintext highlighter-rouge">ByteString</code> entirely, which has an earliest copyright of 2003. Anyway, we have a similar problem. The <code class="language-plaintext highlighter-rouge">Html</code> type is defined like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype Html = Html { getHtmlElements :: [HtmlElement] } data HtmlElement = HtmlString String | HtmlTag { markupTag :: String, markupAttrs :: [HtmlAttr], markupContent :: Html } -- | Attributes with name and value. data HtmlAttr = HtmlAttr String String </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">xhtml</code> library uses <code class="language-plaintext highlighter-rouge">++</code> on lists all over the place. 
The <a href="https://github.com/haskell/xhtml/blob/68353ccd1a2e776d6c2b11619265d8140bb7dc07/Text/XHtml/Internals.hs#L286-L299"><code class="language-plaintext highlighter-rouge">renderHtml'</code> function</a> uses <code class="language-plaintext highlighter-rouge">ShowS</code>, fortunately - this is a difference list of <code class="language-plaintext highlighter-rouge">Char</code>, so we probably won’t be seeing pessimal performance. As in the above PR, which replaced <code class="language-plaintext highlighter-rouge">WriterT [String]</code> with a difference list of <code class="language-plaintext highlighter-rouge">Builder</code>, I did the same to <code class="language-plaintext highlighter-rouge">xhtml</code>. All explicit lists are now difference lists, and all <code class="language-plaintext highlighter-rouge">String</code> are replaced with <code class="language-plaintext highlighter-rouge">Builder</code>. The performance results are impressive: <table> <thead> <tr> <th> </th> <th>Haddock Head</th> <th><code class="language-plaintext highlighter-rouge">xhtml</code> Builder</th> <th>Absolute Difference</th> <th>Relative Change</th> </tr> </thead> <tbody> <tr> <td>HTML allocations</td> <td>1134 MB</td> <td>1141 MB</td> <td>+7 MB</td> <td>0.6% worse</td> </tr> <tr> <td>HTML time:</td> <td>380 ms</td> <td>198 ms</td> <td>-182 ms</td> <td>47.9% improvement</td> </tr> <tr> <td>Total Memory:</td> <td>554 MB</td> <td>466 MB</td> <td>-88 MB</td> <td>15.9% improvement</td> </tr> <tr> <td>Total Allocated:</td> <td>16.0 GB</td> <td>16.0 GB</td> <td>0</td> <td>No change</td> </tr> <tr> <td>Max residency:</td> <td>238 MB</td> <td>195 MB</td> <td>-43 MB</td> <td>18.1% improvement</td> </tr> <tr> <td>Total Time:</td> <td>10.88 s</td> <td>6.526 s</td> <td>-4.354 s</td> <td>40% improvement</td> </tr> </tbody> </table> Avoiding <code class="language-plaintext highlighter-rouge">[]</code> and <code class="language-plaintext highlighter-rouge">String</code> halves the time 
to render HTML, and results in a 40% overall improvement in the time to run <code class="language-plaintext highlighter-rouge">haddock</code>. While we don’t allocate any less memory during HTML generation, we’re using 16% less total memory and maximum residency is down by 18%. <h1 id="conclusion">Conclusion</h1> Haskell performance doesn’t have to be hard. If you avoid common footguns like <code class="language-plaintext highlighter-rouge">WriterT</code>, <code class="language-plaintext highlighter-rouge">[]</code>, <code class="language-plaintext highlighter-rouge">String</code>, <code class="language-plaintext highlighter-rouge">nub</code>, etc., then your code will probably be pretty quick. Picking the low hanging fruit is usually worthwhile, even if you haven’t spent the effort determining the real problem. Profiling shows that <code class="language-plaintext highlighter-rouge">haddock</code> spends an enormous amount of time generating object code - a necessary step for any module that has <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> enabled. With GHC 9.6, we’ll be able to pass <code class="language-plaintext highlighter-rouge">-fprefer-byte-code</code>, which will use the much faster byte code representation instead of object code. Even in HTML generation, profiling indicates that we spend the majority of time doing <code class="language-plaintext highlighter-rouge">fixChar</code> - the process of escaping a character into an HTML appropriate <code class="language-plaintext highlighter-rouge">Builder</code>. We also spend a bunch of time regenerating HTML for re-exports - the HTML documentation for a datatype, function, type class, etc is generated fresh for every module that exports it. 
Even if HTML were perfectly optimized, Haddock’s current design creates a huge <code class="language-plaintext highlighter-rouge">[Interface]</code>, where each <code class="language-plaintext highlighter-rouge">Interface</code> is a module that you are generating documentation for. This <code class="language-plaintext highlighter-rouge">[Interface]</code> must be retained in memory, because it is passed to each “component” of the documentation build. Refactoring <code class="language-plaintext highlighter-rouge">haddock</code> to stream these interfaces isn’t obvious, since some doc building steps require a summary of the entire <code class="language-plaintext highlighter-rouge">[Interface]</code> in order to proceed. Figuring out a fix for the “real problems” would have been much more difficult than these easy fixes, which have still made a huge difference in overall performance. Wed, 21 Dec 2022 00:00:00 +0000 https://www.parsonsmatt.org/2022/12/21/haddock_performance.html https://www.parsonsmatt.org/2022/12/21/haddock_performance.html Break Gently with Pattern Synonyms This is a really brief post to call out a nice trick for providing users a nice migration message when you delete a constructor in a sum type. <h1 id="the-problem">The Problem</h1> You have a sum type, and you want to delete a redundant constructor to refactor things. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Foo
  = Bar Int
  | Baz Char
  | Quux Double
</code></pre></div></div> That <code class="language-plaintext highlighter-rouge">Quux</code> is double trouble. But if we simply delete it, then users will get a <code class="language-plaintext highlighter-rouge">Constructor not found: Quux</code>. This isn’t super helpful. They’ll have to go find where <code class="language-plaintext highlighter-rouge">Quux</code> came from, what package defined it, and then go see if there’s a Changelog. 
If not, then they’ll have to dig through the Git history to see what’s going on. This isn’t a fun workflow. But, let’s say you really need end users to migrate off <code class="language-plaintext highlighter-rouge">Quux</code>. So we’re interested in giving a compile error that has more information than <code class="language-plaintext highlighter-rouge">Constructor not in scope</code>. Here’s what some calling code looks like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>blah :: Foo -> Int
blah x = case x of
    Bar i -> i
    Baz c -> fromEnum c
    Quux a -> 3
</code></pre></div></div> With <code class="language-plaintext highlighter-rouge">Quux</code> deleted, compiling this gives the output: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home/matt/patsyn.hs:24:5: error:
    Not in scope: data constructor ‘Quux’
   |
24 |     Quux a -> 3
   |     ^^^^
Failed, no modules loaded.
</code></pre></div></div> Fortunately, we can make this nicer. GHC gives us a neat trick called <a href="https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/pattern_synonyms.html"><code class="language-plaintext highlighter-rouge">PatternSynonyms</code></a>. They create constructor-like things that we can match on and construct with, but that are a bit smarter. <h2 id="matching">Matching</h2> Let’s redefine <code class="language-plaintext highlighter-rouge">Quux</code> as a pattern synonym on <code class="language-plaintext highlighter-rouge">Foo</code>. We’ll also export it as part of the datatype definition. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# language PatternSynonyms, ViewPatterns #-} module Wow (Foo (.., Quux)) where data Foo = Bar Int | Baz Char pattern Quux :: a -> Foo pattern Quux i <- (const Nothing -> Just i) </code></pre></div></div> This does something tricky: we always throw away the input with the <code class="language-plaintext highlighter-rouge">ViewPattern</code>, and we can summon whatever we want in the left hand side. This allows us to provide whatever <code class="language-plaintext highlighter-rouge">a</code> is needed to satisfy the type. This match will never succeed - so <code class="language-plaintext highlighter-rouge">Quux</code> behavior will never happen. Now, we get a warning for the match: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[1 of 1] Compiling Main ( /home/matt/patsyn.hs, interpreted ) /home/matt/patsyn.hs:25:5: warning: [-Woverlapping-patterns] Pattern match is redundant In a case alternative: Quux a -> ... | 25 | Quux a -> 3 | ^^^^^^^^^^^ Ok, one module loaded. </code></pre></div></div> But an error for constructing: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[1 of 1] Compiling Main ( /home/matt/patsyn.hs, interpreted ) /home/matt/patsyn.hs:28:10: error: • non-bidirectional pattern synonym ‘Quux’ used in an expression • In the expression: Quux 3 In an equation for ‘blargh’: blargh = Quux 3 | 28 | blargh = Quux 3 | ^^^^ Failed, no modules loaded. </code></pre></div></div> So we need to construct with it, too. We can modify the pattern synonym by providing a <code class="language-plaintext highlighter-rouge">where</code>, and specifying how to construct with it. Since we’re intending to prevent folks from using it, we’ll just use <code class="language-plaintext highlighter-rouge">undefined</code>. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pattern Quux :: a -> Foo pattern Quux i <- (const Nothing -> Just i) where Quux _ = undefined </code></pre></div></div> With this, we get just the warning about a redundant pattern match. Now it’s time to step up our game by providing a message to the end user. <h1 id="warnings">Warnings</h1> GHC gives us the ability to write <code class="language-plaintext highlighter-rouge">{-# WARNING Quux "migrate me pls" #-}</code>. This can make sense if we expect that the runtime behavior of a program won’t be changed by our pattern synonym. So let’s write a warning: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pattern Quux :: a -> Foo pattern Quux i <- (const Nothing -> Just i) where Quux _ = undefined {-# WARNING Quux "Please migrate away from Quux in some cool manner. \ \See X resource for migration tips." #-} </code></pre></div></div> Now, when compiling, we’ll see the warnings: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home/matt/patsynimp.hs:11:5: warning: [-Wdeprecations] In the use of data constructor ‘Quux’ (imported from PatSyn): "Please migrate away from Quux in some cool manner. See X resource for migration tips." | 11 | Quux _ -> 3 | ^^^^ /home/matt/patsynimp.hs:11:5: warning: [-Woverlapping-patterns] Pattern match is redundant In a case alternative: Quux _ -> ... | 11 | Quux _ -> 3 | ^^^^^^^^^^^ /home/matt/patsynimp.hs:14:10: warning: [-Wdeprecations] In the use of data constructor ‘Quux’ (imported from PatSyn): "Please migrate away from Quux in some cool manner. See X resource for migration tips." | 14 | blargh = Quux (3 :: Int) | ^^^^ </code></pre></div></div> But this may not be good enough. We may want to give them an error, so they can’t build. 
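Putting the pieces so far into one file, a minimal sketch might look like the following. The single-file layout and the shortened warning text are illustrative, and <code class="language-plaintext highlighter-rouge">blah</code> deliberately lists the <code class="language-plaintext highlighter-rouge">Quux</code> branch first to show that the match never succeeds. As far as I know, GHC only reports <code class="language-plaintext highlighter-rouge">WARNING</code> pragmas for uses outside the defining module, so the deprecation message would show up for importers rather than in this file itself.

```haskell
{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE ViewPatterns #-}

data Foo
  = Bar Int
  | Baz Char

-- Matching: the view pattern discards the scrutinee and always yields
-- Nothing, so the 'Just i' pattern can never succeed.
-- Constructing: allowed by the 'where' clause, but the result is undefined.
pattern Quux :: a -> Foo
pattern Quux i <- (const Nothing -> Just i)
  where
    Quux _ = undefined

{-# WARNING Quux "Please migrate away from Quux." #-}

-- The Quux branch is tried first and always falls through.
blah :: Foo -> Int
blah x = case x of
  Quux _ -> 3
  Bar i -> i
  Baz c -> fromEnum c
```

With these definitions, <code class="language-plaintext highlighter-rouge">blah (Bar 7)</code> evaluates to <code class="language-plaintext highlighter-rouge">7</code> even though the <code class="language-plaintext highlighter-rouge">Quux</code> branch comes first, which is the "runtime behavior won’t change" property the warning approach relies on.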
<h1 id="typeerror"><code class="language-plaintext highlighter-rouge">TypeError</code></h1> <a href="https://www.stackage.org/haddock/lts-19.31/base-4.15.1.0/GHC-TypeLits.html#t:TypeError"><code class="language-plaintext highlighter-rouge">base</code> defines a type <code class="language-plaintext highlighter-rouge">TypeError</code></a>, which GHC treats specially - it raises a type error. This isn’t generally useful, but can be great for marking branches of a <code class="language-plaintext highlighter-rouge">type family</code> or type class <code class="language-plaintext highlighter-rouge">instance</code> as “impossible.” The error message can be fantastic for guiding folks towards writing correct code. <code class="language-plaintext highlighter-rouge">PatternSynonym</code>s can have two sets of constraints: the first is required wherever the synonym is used, whether matching or constructing, and the second is provided by a successful match (so it must also be satisfied when constructing). So let’s just put an error in the first and see what happens: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pattern Quux
    :: (TypeError ('Text "please migrate ..."))
    => ()
    => a -> Foo
pattern Quux i <- (const Nothing -> Just i) where
    Quux _ = undefined
</code></pre></div></div> Unfortunately, GHC blows up immediately while compiling the synonym! <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[1 of 2] Compiling PatSyn ( PatSyn.hs, interpreted )

PatSyn.hs:20:1: error:
    please migrate ...
   |
20 | pattern Quux
   | ^^^^^^^^^^^^...
Failed, no modules loaded.
</code></pre></div></div> We can’t even <code class="language-plaintext highlighter-rouge">-fdefer-type-errors</code> this one. Are we hosed? What about the second position? Same problem. We can’t put a bare <code class="language-plaintext highlighter-rouge">TypeError</code> in there at all. Fortunately, we can have a lil’ bit of laziness by introducing it as a constraint. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class DeferredError instance (TypeError ('Text "please migrate ...")) => DeferredError pattern Quux :: DeferredError => DeferredError => a -> Foo pattern Quux i <- (const Nothing -> Just i) where Quux _ = undefined </code></pre></div></div> This actually does give us a warning now - at the <code class="language-plaintext highlighter-rouge">const Nothing -> Just i</code> line, we have a deferred type error. This gives us the error behavior we want! <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home/matt/patsynimp.hs:14:10: error: • please migrate ... • In the expression: Quux (3 :: Int) In an equation for ‘blargh’: blargh = Quux (3 :: Int) | 14 | blargh = Quux (3 :: Int) | ^^^^^^^^^^^^^^^ Failed, one module loaded. </code></pre></div></div> We only get the one error - but if we delete it, we can see the other error: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[2 of 2] Compiling Main ( /home/matt/patsynimp.hs, interpreted ) /home/matt/patsynimp.hs:11:5: error: • please migrate ... • In the pattern: Quux _ In a case alternative: Quux _ -> 3 In the expression: case x of Bar i -> i Baz c -> fromEnum c Quux _ -> 3 | 11 | Quux _ -> 3 | ^^^^^^ Failed, one module loaded. </code></pre></div></div> What’s fun is that we can actually provide two different messages. Constructing something will give both error messages, and pattern matching only uses the “required” constraint. This should make it much easier for end users to migrate to new versions of your library. 
<h1 id="final-code-and-errors">Final Code and Errors</h1> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# language PatternSynonyms #-} {-# language KindSignatures #-} {-# language FlexibleContexts #-} {-# language FlexibleInstances #-} {-# language ViewPatterns #-} {-# language MultiParamTypeClasses #-} {-# language UndecidableInstances #-} {-# language DataKinds #-} {-# OPTIONS_GHC -fdefer-type-errors #-} module PatSyn where import Prelude import GHC.Exts import GHC.TypeLits data Foo = Bar Int | Baz Char class DeferredError (a :: ErrorMessage) instance (TypeError a) => DeferredError a pattern Quux :: DeferredError ('Text "please migrate (required constraint)") => DeferredError ('Text "please migrate (provided constraint)") => a -> Foo pattern Quux i <- (const Nothing -> Just i) where Quux _ = undefined </code></pre></div></div> Matching a constructor: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[2 of 2] Compiling Main ( /home/matt/patsynimp.hs, interpreted ) /home/matt/patsynimp.hs:11:5: error: • please migrate (required constraint) • In the pattern: Quux _ In a case alternative: Quux _ -> 3 In the expression: case x of Bar i -> i Baz c -> fromEnum c Quux _ -> 3 | 11 | Quux _ -> 3 | ^^^^^^ Failed, one module loaded. 
</code></pre></div></div> Using a constructor: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[2 of 2] Compiling Main ( /home/matt/patsynimp.hs, interpreted ) /home/matt/patsynimp.hs:14:10: error: • please migrate (required constraint) • In the expression: Quux (3 :: Int) In an equation for ‘blargh’: blargh = Quux (3 :: Int) | 14 | blargh = Quux (3 :: Int) | ^^^^^^^^^^^^^^^ /home/matt/patsynimp.hs:14:10: error: • please migrate (provided constraint) • In the expression: Quux (3 :: Int) In an equation for ‘blargh’: blargh = Quux (3 :: Int) | 14 | blargh = Quux (3 :: Int) | ^^^^^^^^^^^^^^^ Failed, one module loaded. </code></pre></div></div> Wed, 02 Nov 2022 00:00:00 +0000 https://www.parsonsmatt.org/2022/11/02/break-gently-pattern-syn.html https://www.parsonsmatt.org/2022/11/02/break-gently-pattern-syn.html Spooky Masks and Async Exceptions Everyone loves Haskell because it makes concurrent programming so easy! <code class="language-plaintext highlighter-rouge">forkIO</code> is great, and you’ve got <code class="language-plaintext highlighter-rouge">STM</code> and <code class="language-plaintext highlighter-rouge">MVar</code> and other fun tools that are pleasant to use. Well, then you learn about asynchronous exceptions. The world seems a little scarier - an exception could be lurking around any corner! Anyone with your <code class="language-plaintext highlighter-rouge">ThreadId</code> could blast you with a <code class="language-plaintext highlighter-rouge">killThread</code> or <code class="language-plaintext highlighter-rouge">throwTo</code> and you would have no idea what happened. The <code class="language-plaintext highlighter-rouge">async</code> library hides a lot of this from you by managing the <code class="language-plaintext highlighter-rouge">forkIO</code> and <code class="language-plaintext highlighter-rouge">throwTo</code> stuff for you. 
It also makes it easy to wait on a thread to finish, and receive exceptions that the forked thread died with. Consider how nice the implementation of <code class="language-plaintext highlighter-rouge">timeout</code> is here: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>timeout :: Int -> IO a -> IO (Maybe a)
timeout microseconds action = do
  withAsync (Just <$> action) $ \a0 ->
    withAsync (Nothing <$ threadDelay microseconds) $ \a1 ->
      either id id <$> waitEither a0 a1
</code></pre></div></div> The <code class="language-plaintext highlighter-rouge">async</code> library uses asynchronous exceptions to signal that a thread must die. The <code class="language-plaintext highlighter-rouge">withAsync</code> function guarantees that the forked thread is killed off when the inner action is complete. So <code class="language-plaintext highlighter-rouge">timeout</code> will fork a thread to run <code class="language-plaintext highlighter-rouge">Just <$> action</code>, and then fork another thread to <code class="language-plaintext highlighter-rouge">threadDelay</code>. <code class="language-plaintext highlighter-rouge">waitEither</code> accepts an <code class="language-plaintext highlighter-rouge">Async a</code> and an <code class="language-plaintext highlighter-rouge">Async b</code> and returns an <code class="language-plaintext highlighter-rouge">IO (Either a b)</code> - whichever one finishes first determines the result. If <code class="language-plaintext highlighter-rouge">threadDelay</code> finishes first, then we get a <code class="language-plaintext highlighter-rouge">Right Nothing</code> as the result, and <code class="language-plaintext highlighter-rouge">timeout</code> returns <code class="language-plaintext highlighter-rouge">Nothing</code>. This spells doom for the <code class="language-plaintext highlighter-rouge">action</code> thread. But if our brave hero is able to escape before the deadline, it’s the <code class="language-plaintext highlighter-rouge">threadDelay</code> that gets killed! 
Indeed, this is a specialization of <code class="language-plaintext highlighter-rouge">race :: IO a -> IO b -> IO (Either a b)</code>, which runs two <code class="language-plaintext highlighter-rouge">IO</code> actions in separate threads. The first to complete returns the value, and the remaining thread is sacrificed to unspeakable horrors. But, you really shouldn’t <code class="language-plaintext highlighter-rouge">catch</code> or <code class="language-plaintext highlighter-rouge">handle</code> async exceptions yourself. GHC uses them to indicate “you really need to shut down extremely quickly, please handle your shit right now.” <code class="language-plaintext highlighter-rouge">ThreadKilled</code> is used to end a thread’s execution, and <code class="language-plaintext highlighter-rouge">UserInterrupt</code> means that you got a <code class="language-plaintext highlighter-rouge">SIGINT</code> signal and need to stop gracefully. The <code class="language-plaintext highlighter-rouge">async</code> package uses <code class="language-plaintext highlighter-rouge">AsyncCancelled</code> to, well, cancel threads. However, the <code class="language-plaintext highlighter-rouge">base</code> package’s <code class="language-plaintext highlighter-rouge">Control.Exception</code> has a footgun: if you catch-all-exceptions by matching on <code class="language-plaintext highlighter-rouge">SomeException</code>, then you’ll catch these async exceptions too! Now, you should pretty much never be catching <code class="language-plaintext highlighter-rouge">SomeException</code>, unless you really really know what you’re doing. 
But I see it all the time: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Control.Exception (catch) blah = Just <$> coolThing `catch` \(SomeException e) -> do reportException e pure Nothing </code></pre></div></div> If <code class="language-plaintext highlighter-rouge">coolThing</code> receives a <code class="language-plaintext highlighter-rouge">ThreadKilled</code> or an <code class="language-plaintext highlighter-rouge">AsyncCancelled</code> or <code class="language-plaintext highlighter-rouge">UserInterrupt</code> or anything else from <code class="language-plaintext highlighter-rouge">throwTo</code>, it’ll catch it, report it, and then your program will continue running. Then the second <code class="language-plaintext highlighter-rouge">Ctrl-C</code> comes from the user, and your program halts immediately without running any cleanup. This is pretty dang bad! You really want your <code class="language-plaintext highlighter-rouge">finally</code> calls to run. You search for a bit, and you find the <a href="https://hackage.haskell.org/package/safe-exceptions"><code class="language-plaintext highlighter-rouge">safe-exceptions</code></a> package. It promises to make things a lot nicer by not catching async exceptions by default. So our prior code block, with just a change in import, becomes much safer: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Control.Exception.Safe (catch) blah = Just <$> coolThing `catch` \(SomeException e) -> do reportException e pure Nothing </code></pre></div></div> This code will no longer catch and report an async exception. However, the blocks in your <code class="language-plaintext highlighter-rouge">finally</code> and <code class="language-plaintext highlighter-rouge">bracket</code> for cleanup will run! 
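To make the difference concrete, here is a <code class="language-plaintext highlighter-rouge">base</code>-only sketch of the idea: catch everything, but rethrow anything wrapped in <code class="language-plaintext highlighter-rouge">SomeAsyncException</code>. This is an illustration of the behavior, not the actual implementation in <code class="language-plaintext highlighter-rouge">safe-exceptions</code>; <code class="language-plaintext highlighter-rouge">catchSync</code> and <code class="language-plaintext highlighter-rouge">demo</code> are hypothetical names. Note that it classifies exceptions by type, so <code class="language-plaintext highlighter-rouge">demo</code> can simulate an async exception with a plain <code class="language-plaintext highlighter-rouge">throwIO ThreadKilled</code>.

```haskell
import Control.Exception
  ( AsyncException (..)
  , ErrorCall (..)
  , SomeAsyncException (..)
  , SomeException
  , catch
  , fromException
  , throwIO
  , try
  )

-- Hypothetical helper: like 'catch' on SomeException, except that
-- exceptions in the SomeAsyncException hierarchy are rethrown untouched.
catchSync :: IO a -> (SomeException -> IO a) -> IO a
catchSync action handler =
  action `catch` \e ->
    case fromException e of
      Just (SomeAsyncException _) -> throwIO e -- not ours to swallow
      Nothing -> handler e

-- An ordinary exception is handled; ThreadKilled escapes the handler
-- and is only observed by the outer 'try'.
demo :: IO (String, Either AsyncException String)
demo = do
  handled <- catchSync (throwIO (ErrorCall "boom")) (\_ -> pure "handled")
  escaped <- try (catchSync (throwIO ThreadKilled) (\_ -> pure "handled"))
  pure (handled, escaped)
```

The handler never sees <code class="language-plaintext highlighter-rouge">ThreadKilled</code>, so whoever sent it still gets the shutdown they asked for, while <code class="language-plaintext highlighter-rouge">finally</code> and <code class="language-plaintext highlighter-rouge">bracket</code> cleanup further up the stack can still run as the exception propagates.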
Unfortunately, the <code class="language-plaintext highlighter-rouge">safe-exceptions</code> library (and the <code class="language-plaintext highlighter-rouge">unliftio</code> package, which uses the same behavior) have a dark secret… <code class="language-plaintext highlighter-rouge">*thunder claps in the distance, as rain begins to fall*</code> … they wear spooky masks while cleaning! WowowoOOOoOoOooOooOOooOooOOo No, really, they do something like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bracket provide cleanup action =
  Control.Exception.bracket
    provide
    (\a -> Control.Exception.uninterruptibleMask_ $ cleanup a)
    action
</code></pre></div></div> This code looks pretty innocuous. It even says that it’s good! “Your cleanup function is guaranteed not to be interrupted by an asynchronous exception.” So if you’re cleaning things up, and BAMM a vampire <code class="language-plaintext highlighter-rouge">ThreadKill</code>s you, you’ll finish your cleanup before rethrowing. This might just be all you need to make it out of the dungeon alive. Behind the sweet smile and innocent demeanor of the <code class="language-plaintext highlighter-rouge">safe-exceptions</code> package, though, is a dark mystery - and a vendetta for blood. Well, maybe not blood, but I guess “intercompatibility of default expectations”? <h1 id="a-nightmare-scenario-night-of-the-living-deadlock">A Nightmare Scenario: Night of the Living Deadlock</h1> Once, a brave detective tried to understand how slow the database was. But in her studies, she accidentally caused the entire app to deadlock and become an unkillable zombie?! There are three actors in this horror mystery. Mr DA, the prime suspect. Alice, our detective. And Bob, the unassuming janitor. <h2 id="mr-database-acquisition">Mr Database Acquisition</h2> One of the suspected villains is Mr. Database Acquisition, a known rogue. Usually, Mr. 
Database Acquisition works quickly and effectively, but sometimes everything stops and he’s nowhere to be found. We’re already recording how long he takes by measuring the job completion time, but if the job never finishes, we don’t know anything. The database connection is provided from a <code class="language-plaintext highlighter-rouge">resource-pool</code> <code class="language-plaintext highlighter-rouge">Pool</code>, which is supposed to be thread safe and guarantee resource allocation. But something seems shady about it… <h2 id="alice">Alice</h2> Alice is a performance engineer and lead detective. She’s interested in making the codebase faster, and to do so, she sets up inspection points to log how long things are taking. Alice cleverly sets up a phantom detective - a forked thread that occasionally checks in on Mr Database. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>withAcquisitionTimer :: (IO () -> IO r) -> IO r withAcquisitionTimer action = do timeSpent <- newIORef 0 let tracker = forever $ do threadDelay 1000 timeSpent <- atomicModifyIORef' timeSpent (\a -> (a+1000, a+1000)) recordMetric runningWait timeSpent report = do elapsed <- readIORef timeSpent recordMetric totalWait elapsed withAsync (tracker `finally` report) $ \a -> action (cancel a) </code></pre></div></div> The actual implementation is a bit more robust and sensible, but this gets the gist across. Pretend we’re in a campy low budget horror movie. The <code class="language-plaintext highlighter-rouge">tracker</code> thread wakes up every millisecond to record how long we’re waiting, and continues running until the thread is finally cancelled, or killed with an async exception, or the <code class="language-plaintext highlighter-rouge">action</code> finishes successfully, or if a regular exception causes <code class="language-plaintext highlighter-rouge">action</code> to exit early. 
<code class="language-plaintext highlighter-rouge">withAsync</code> will <code class="language-plaintext highlighter-rouge">cancel</code> the tracker thread, ensuring that we don’t leak threads. Part of <code class="language-plaintext highlighter-rouge">cancel</code>’s API is that it doesn’t return until the thread is totally, completely, certainly dead - so when <code class="language-plaintext highlighter-rouge">withAsync</code> returns, you’re guaranteed that the thread is dead. Alice sets the tracker up for every database acquisition, and waits to see what’s really going on. <h2 id="bob-the-janitor">Bob, the Janitor</h2> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>theSceneOfTheCrime =
  bracket
    (runDB startProcess)
    (\processId -> runDB (closeProcess processId))
    $ \processId -> do
      doWorkWith processId
      {- ... snip ... -}
</code></pre></div></div> There’s a great big mess - it appears that someone was thrown from a high building! Foul play is suspected from the initial detective work. But after the excitement dies down, the janitor, Bob, is left to clean up the mess. One of the perks of being a janitor is protection from all sorts of evil. While you’re cleaning stuff up, nothing spooky can harm you - no async exceptions are allowed. You might expect there’s a loophole here, but it’s foolproof. It’s such a strong protection that the janitor is even able to bestow it upon anyone that works for him to help clean up. Bob begins cleaning up by recording the work he’s doing in the database. To do this, he requests a database connection from Mr Database. However, this provides Mr Database with the same protections: no one can kill him, or anyone that works for him! Now, by the particular and odd rules of this protection magic, you don’t have to know that someone is working for you. So the phantom tracker that Alice set up is similarly extended this protection. 
Mr Database provides the database connection to Bob in a prompt manner, and Bob completes his task. However, when Bob attempts to release the database back, he can’t! The database connection is permanently stuck to his hand. Mr Database can’t accept it back and put it in the pool, and he can’t continue to his next job. The entire application comes grinding to a halt, as no one can access the database. What kind of bizarre curse is this? <h1 id="the-gift-of-safety">The Gift of Safety</h1> <code class="language-plaintext highlighter-rouge">withAsync</code> wants to be safe - it wants to guarantee that the forked thread is killed when the block exits. It accomplishes this by effectively doing: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>withAsync thread action = bracket (async thread) uninterruptibleCancel action </code></pre></div></div> <code class="language-plaintext highlighter-rouge">async</code> forks the thread and prepares the <code class="language-plaintext highlighter-rouge">Async</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>async action = do var <- newEmptyTMVarIO threadId <- mask $ \restore -> forkIO $ try (restore action) >>= atomically . putTMVar var return Async { asyncThreadId = threadId , _asyncWait = readTMVar var } </code></pre></div></div> <code class="language-plaintext highlighter-rouge">async</code> is careful to <code class="language-plaintext highlighter-rouge">mask</code> the <code class="language-plaintext highlighter-rouge">forkIO</code> call, which ensures that the forked thread is <code class="language-plaintext highlighter-rouge">mask</code>ed. 
That allows <code class="language-plaintext highlighter-rouge">action</code> to receive async exceptions, but outside of <code class="language-plaintext highlighter-rouge">action</code>, it’s guaranteed that if <code class="language-plaintext highlighter-rouge">try</code> succeeds, then the <code class="language-plaintext highlighter-rouge">atomically . putTMVar var</code> also succeeds. Since <code class="language-plaintext highlighter-rouge">try</code> will catch async exceptions, this means that the async exception will definitely be registered in the <code class="language-plaintext highlighter-rouge">putTMVar</code> call. <code class="language-plaintext highlighter-rouge">uninterruptibleCancel</code> cancels the thread in an uninterruptible state. <code class="language-plaintext highlighter-rouge">cancel</code> waits for the thread to complete - either with an exception or a real value. Meanwhile, <code class="language-plaintext highlighter-rouge">bracket</code> is also cursed with safety: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module UnliftIO.Exception where bracket make clean action = withRunInIO $ \runInIO -> Control.Exception.bracket (runInIO make) (\a -> uninterruptibleMask_ $ runInIO $ clean a) (\a -> runInIO $ action a) </code></pre></div></div> <h1 id="the-curse-of-two-gifts">The Curse of Two Gifts</h1> Unspeakable magical rules dictate that two gifts form a curse, under the usual laws for associativity and commutativity. To understand what’s going on, we start by inlining the bracket. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>crimeSceneCleanedUp =
    withRunInIO $ \runInIO ->
        bracket
            (runInIO $ runDB startProcess)
            (\pid -> uninterruptibleMask_ $ do
                runInIO $ runDB $ do
                    closeProcess pid
            )
            _stuff
</code></pre></div></div> We know that the <code class="language-plaintext highlighter-rouge">make</code> and <code class="language-plaintext highlighter-rouge">action</code> managed to complete, so we’re interested in the cleanup. Let’s expand <code class="language-plaintext highlighter-rouge">runDB</code> and omit some noise: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>crimeSceneCleanedUp =
    withRunInIO $ \runInIO ->
        uninterruptibleMask_ $ do
            runInIO $ do
                sqlPool <- getSqlPool
                withAcquisitionTimer $ \stop ->
                    flip runSqlPool sqlPool $ do
                        stop
                        closeProcess pid
</code></pre></div></div> Hmm! That <code class="language-plaintext highlighter-rouge">withAcquisitionTimer</code> is new! Enhance!! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>crimeSceneCleanedUp =
    withRunInIO $ \runInIO ->
        uninterruptibleMask_ $ do
            runInIO $ do
                sqlPool <- getSqlPool
                withAsync (task `finally` record) $ \async ->
                    flip runSqlPool sqlPool $ do
                        cancel async
                        closeProcess pid
</code></pre></div></div> Uh oh. Let’s zoom in on <code class="language-plaintext highlighter-rouge">withAsync</code> (and get rid of some indentation): <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>crimeSceneCleanedUp =
    uninterruptibleMask_ $ do
        sqlPool <- getSqlPool
        bracket
            (async (task `finally` record))
            (uninterruptibleCancel)
            $ \async ->
                flip runSqlPool sqlPool $ do
                    cancel async
                    closeProcess pid
</code></pre></div></div> One more level!
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>crimeSceneCleanedUp =
    uninterruptibleMask_ $ do
        sqlPool <- getSqlPool
        bracket
            (do
                var <- newEmptyTMVarIO
                threadId <- mask $ \restore ->
                    forkIO $ do
                        eres <- try $ restore $ task `finally` record
                        atomically $ putTMVar var eres
                return Async
                    { asyncThreadId = threadId
                    , _asyncWait = readTMVar var
                    }
            )
            uninterruptibleCancel
            $ \async ->
                flip runSqlPool sqlPool $ do
                    cancel async
                    closeProcess pid
</code></pre></div></div> Uh oh. <code class="language-plaintext highlighter-rouge">forkIO</code> inherits the masking state from the parent thread. This means that the <code class="language-plaintext highlighter-rouge">uninterruptibleMask_</code> state, set by <code class="language-plaintext highlighter-rouge">bracket</code>’s <code class="language-plaintext highlighter-rouge">cleanup</code>, is inherited by our <code class="language-plaintext highlighter-rouge">forkIO</code>. Let’s zoom back out on that <code class="language-plaintext highlighter-rouge">async</code> call and inline the <code class="language-plaintext highlighter-rouge">task</code>… <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>crimeSceneCleanedUp =
    uninterruptibleMask_ $ do
        withAsync (do
                forever $ do
                    threadDelay 1000
                    {- hmm -}
            `finally` record) $ \async ->
                {- snip -}
</code></pre></div></div> Ah! That’s the zombie. Reducing it to its most basic nature, we have: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>zombie :: IO (Async a)
zombie =
    uninterruptibleMask_ $
        async $
            forever $
                threadDelay 1000
</code></pre></div></div> <code class="language-plaintext highlighter-rouge">uninterruptibleMask_</code> means “I cannot be killed by async exceptions.” <code class="language-plaintext highlighter-rouge">async</code> allows the forked thread to inherit the masking state of the parent.
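You can watch the inheritance happen without <code class="language-plaintext highlighter-rouge">async</code> at all. Here’s a minimal sketch using only <code class="language-plaintext highlighter-rouge">base</code>: fork a thread, have it ask for its own masking state, and hand the answer back. The name <code class="language-plaintext highlighter-rouge">childMaskingState</code> is made up for this sketch - it isn’t part of any library.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception
    ( MaskingState (..)
    , getMaskingState
    , mask_
    , uninterruptibleMask_
    )

-- | Fork a thread and report the masking state that the child observes.
-- (Hypothetical helper, for illustration only.)
childMaskingState :: IO MaskingState
childMaskingState = do
    result <- newEmptyMVar
    -- The child does nothing except look at its own masking state.
    _ <- forkIO (getMaskingState >>= putMVar result)
    takeMVar result

-- | Run the probe from three different parent states.
inheritedStates :: IO [MaskingState]
inheritedStates =
    sequence
        [ childMaskingState                      -- parent is unmasked
        , mask_ childMaskingState                -- parent is masked
        , uninterruptibleMask_ childMaskingState -- parent is uninterruptible
        ]
```

When the parent itself starts out unmasked, <code class="language-plaintext highlighter-rouge">inheritedStates</code> should come back as <code class="language-plaintext highlighter-rouge">[Unmasked, MaskedInterruptible, MaskedUninterruptible]</code>. The child never asked for any of those states - it just got whatever the parent had at the moment of the fork.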
But about half of the API of <code class="language-plaintext highlighter-rouge">async</code> requires that the forked thread can be killed by async exceptions. <code class="language-plaintext highlighter-rouge">race</code> is completely broken with unkillable <code class="language-plaintext highlighter-rouge">Async</code>s. The solution is to use <code class="language-plaintext highlighter-rouge">withAsyncWithUnmask</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>safeWithAsync thread action =
    withAsyncWithUnmask (\unmask -> unmask thread) action
</code></pre></div></div> This unmasks the child thread, revealing it to be an imposter all along. <blockquote> And I would have ~gotten away with it~ never exited and consumed all resources, if it weren’t for you danged kids!!! </blockquote> The unmasked phantom thread, free from its curse of safety, was killed and returned to the phantom aether to be called upon in other sorcery. Sat, 29 Oct 2022 00:00:00 +0000 https://www.parsonsmatt.org/2022/10/29/spooky_masks_and_async_exceptions.html Femoroacetabular Impingement Apparently, I’ve spent my entire life with a condition called “femoroacetabular impingement.” The bones in my hips are deformed - the femoral neck is too thick and misshapen, and I have a “pincer” on my acetabulum which restricts range of motion even further. As a result, I could barely internally rotate my hips at all - I had a single degree of range of motion (normal for the population is 45 degrees). I can get my knee to about 90 degrees, but that’s it - for my knees to come up any further, I need to flex my low back. This makes sitting, cycling, weightlifting, yoga, and, uh, pretty much everything a painful and difficult experience.
For a long time, I thought I just had “tight hamstrings,” and would occasionally get really into mobility exercises and stretching to try and improve it. Nothing ever worked. In fact, all of that stretching and mobilization was really stretching my low back, not my hamstrings, because the joint was already fully flexed - bone-on-bone contact. And, yeah, bone-on-bone. From squatting, deadlifting, sitting in a chair and programming, and cycling, I’ve pretty much shredded the labrum on each side of my hip. Turns out, the weird aching pains in the front of my legs are hip arthritis. I found out about all of this in such a roundabout way. Last year, my girlfriend wanted to join a bike racing team. She found a team ride/race for No Ride Around, which happened to be the team for my favorite local bike shop. I love cycling and wanted to support her, so I joined too, even though racing isn’t really my thing. Being on a race team, especially a really supportive one, is a fantastic motivation. The team leader recommended Denver Fit Loft for a race bike fit. Charles Van Atta, the fitter, was surprised at my limited range of motion, and recommended that I consult an orthopedic surgeon for hip impingement. Fortunately, Denver has a really great sports medicine scene. In my Google research, I found Dr. James Genuario, a world leading expert in exactly this sort of thing. Within a few weeks of the bicycle fitting, I had X-rays confirming a severe case of hip impingement. In a normal hip, there’s a number called the “alpha angle” that describes how round the femoral head is. A normal alpha angle is 45 degrees, and <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5005062/">50-55 degrees is considered “pathological”</a> and warrants surgical intervention. My alpha angle is 69 degrees. Based on my current hip condition, I was looking at a total hip replacement in 5-15 years if I didn’t act quick. 
Yet another fortunate coincidence - another member of my race team worked in medical device support, and knew many of the surgeons in the area. He gave me a bunch of advice, and spoke very well of Dr. Genuario. I spent six weeks going to PT twice a week. Lots of weird stretches and exercises did - well, nothing at all. Insurance companies require six weeks of PT before they’ll pay for the MRI and CT scans required for surgery, much less the surgery itself. Apparently, about half of the folks that initially report these problems can resolve with stretching. Given my seriously messed up bone anatomy, I wish we could have skipped that step. After six weeks, I got my MRI scan - and fortunately my connective tissue is good enough to warrant corrective surgery. A month of waiting, and I was able to get the CT scan, which provides a highly detailed 3D picture of my hip. The CT scan goes to Germany, where they construct (in software) a 3D model of a “healthy” version of my hip. This is the blueprint. The surgeon will use that to trim my bones to the right shape. What’s fun is that I found <a href="https://www.youtube.com/watch?v=CKgNr5cl0rs">a video of this procedure on YouTube</a>. They literally use a fancy dremel tool to shave the bone down. On September 22nd, I received my first surgery. The doctor said that he wasn’t sure if he could repair the labrum, and I may need a reconstruction - which is a fancy way of saying “get a dead person’s labrum and stitch it in there.” Once I signed all the consent forms, they gave me a Valium, and started hooking me up to an IV. The nurse was jovial as I was being wheeled away - “we got you on the good drugs, it’s party time.” To which my drugged out self responded - “double fisting valium and whatever this is.” That’s my last memory before going under. 
On waking up, the doctor said that he couldn’t repair the existing labrum - something about it looking like “crab meat.” Given that I was still high on the anesthesia, I said “hell yeah i’m part zombie.” I was in a fog all that day, and for two days afterward, I was taking narcotics. I weaned myself off pretty quick, since I dislike the side effects, and they don’t work that great on me anyway. After a few days, my hip was feeling totally fine, but every single medication I was on otherwise had “constipation” as a side effect, including the anti-nausea medications. So when my stomach started to feel bad, I took all the nausea meds, which only made things worse. The 29th (my birthday) was the hardest day - I was completely laid up in bed. Once I determined the real cause of the stomach discomfort, it was pretty easy to manage. I’m at two weeks post-op right now, and recovery is great. Dr. Genuario’s skill as a surgeon is remarkable - he was able to bring my alpha angle to 45 degrees. Despite removing so much bone, there is no pain at this point. I’m supposed to be weaning off of crutches starting next week, but truth be told, I’m only using a single crutch most of the time anyway. The range of motion in my operative leg is much better than the My second surgery is scheduled for November 3rd. Another three weeks in crutches, and I’ll be able to walk unassisted for Thanksgiving. Another three weeks of recovery and PT, and I’ll be able to ride a bike outside - hopefully in time for the winter solstice (would hate to lose my Solstice Century streak). I should be back to full strength and regular activity by April. I’m incredibly grateful for everyone involved in the process. But the person who has helped the most is my partner. She’s supported me through all of this, helped me with my physical therapy, and changed my wound dressings. 
Fri, 07 Oct 2022 00:00:00 +0000 https://www.parsonsmatt.org/2022/10/07/femoroacetabular_impingement.html Dynamic Exception Reporting in Haskell Exceptions kind of suck in Haskell. You don’t get a stack trace. They don’t show up in the types of functions. They incorporate a subtyping mechanism that feels more like Java casting than typical Haskell programming. A partial solution to the problem is <code class="language-plaintext highlighter-rouge">HasCallStack</code> - that gives us a <code class="language-plaintext highlighter-rouge">CallStack</code> which gets attached to <code class="language-plaintext highlighter-rouge">error</code> calls. However, it only gets attached to <code class="language-plaintext highlighter-rouge">error</code> - so you can either have <code class="language-plaintext highlighter-rouge">String</code> error messages and a <code class="language-plaintext highlighter-rouge">CallStack</code>, or you can have richly typed exceptions with no location information. A <code class="language-plaintext highlighter-rouge">CallStack</code> is a static piece of information about the code. “You called <code class="language-plaintext highlighter-rouge">foo</code>, which called <code class="language-plaintext highlighter-rouge">bar</code>, which called <code class="language-plaintext highlighter-rouge">quuz</code>, which blew up with <code class="language-plaintext highlighter-rouge">Prelude.read: No parse</code>.” The <code class="language-plaintext highlighter-rouge">CallStack</code> answers a single question: “Where did this go wrong?” But there are often many more interesting questions than simply “Where?” You often want to know Who? When? How? in order to diagnose the big one: why did my code blow up?
In order to help answer these questions and develop robust exception reporting and diagnosing facilities, I created the <a href="https://hackage.haskell.org/package/annotated-exception"><code class="language-plaintext highlighter-rouge">annotated-exception</code></a> package. <h1 id="better-call-stacks">Better Call Stacks</h1> <code class="language-plaintext highlighter-rouge">annotated-exception</code> provides a big improvement in static <code class="language-plaintext highlighter-rouge">CallStack</code> behavior. To understand the improvement, let’s dig into the core problem: <h2 id="broken-chains-and-orphan-stacks">Broken Chains and Orphan Stacks</h2> If any function doesn’t include a <code class="language-plaintext highlighter-rouge">HasCallStack</code> constraint in your stack, then the chain is broken, and you only get the stack closest to the source. Consider this trivial example, which has a few ways of blowing up: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import GHC.Stack

foo :: HasCallStack => Int
foo = error "foo"

bar :: HasCallStack => Int
bar = foo

baz :: Int
baz = foo

quux :: HasCallStack => Int
quux = bar

ohno :: HasCallStack => Int
ohno = baz
</code></pre></div></div> If we call <code class="language-plaintext highlighter-rouge">foo</code> in GHCi, we get the immediate stack trace: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> foo
*** Exception: foo
CallStack (from HasCallStack):
  error, called at <interactive>:4:7 in interactive:Ghci1
  foo, called at <interactive>:14:1 in interactive:Ghci2
</code></pre></div></div> Since the <code class="language-plaintext highlighter-rouge">bar</code> term has the <code class="language-plaintext highlighter-rouge">HasCallStack</code> constraint, it will add its location to the mix: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> bar *** 
Exception: foo CallStack (from HasCallStack): error, called at <interactive>:4:7 in interactive:Ghci1 foo, called at <interactive>:6:7 in interactive:Ghci1 bar, called at <interactive>:15:1 in interactive:Ghci2 </code></pre></div></div> However, <code class="language-plaintext highlighter-rouge">baz</code> omits the constraint, which means that you won’t get that function in the stack: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> baz *** Exception: foo CallStack (from HasCallStack): error, called at <interactive>:4:7 in interactive:Ghci1 foo, called at <interactive>:8:7 in interactive:Ghci1 </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">quux</code> term has the call stack, so you get the whole story again: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> quux *** Exception: foo CallStack (from HasCallStack): error, called at <interactive>:4:7 in interactive:Ghci1 foo, called at <interactive>:6:7 in interactive:Ghci1 bar, called at <interactive>:10:8 in interactive:Ghci1 quux, called at <interactive>:17:1 in interactive:Ghci2 </code></pre></div></div> But here’s the crappy thing - <code class="language-plaintext highlighter-rouge">ohno</code> does have a <code class="language-plaintext highlighter-rouge">HasCallStack</code> constraint. You might expect that it would show up in the backtrace. 
But it does not: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> ohno *** Exception: foo CallStack (from HasCallStack): error, called at <interactive>:4:7 in interactive:Ghci1 foo, called at <interactive>:8:7 in interactive:Ghci1 </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">CallStack</code> for <code class="language-plaintext highlighter-rouge">foo</code>, <code class="language-plaintext highlighter-rouge">baz</code>, and <code class="language-plaintext highlighter-rouge">ohno</code> are indistinguishable. This makes diagnosing the failure difficult. To avoid this problem, you must diligently place a <code class="language-plaintext highlighter-rouge">HasCallStack</code> constraint on every function in your code base. This is pretty annoying! And if you have any library code that calls your code, the library’s lack of <code class="language-plaintext highlighter-rouge">HasCallStack</code> will break your chains for you. <h2 id="checkpoint-to-the-rescue"><code class="language-plaintext highlighter-rouge">checkpoint</code> to the rescue</h2> <code class="language-plaintext highlighter-rouge">annotated-exception</code> introduces the idea of a <a href="https://hackage.haskell.org/package/annotated-exception-0.2.0.3/docs/src/Control.Exception.Annotated.html#checkpoint"><code class="language-plaintext highlighter-rouge">checkpoint</code></a>. The simplest one is <code class="language-plaintext highlighter-rouge">checkpointCallStack</code>, which attaches the call-site to any exceptions thrown out of the action: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>checkpointCallStack :: (HasCallStack, MonadCatch m) => m a -> m a </code></pre></div></div> Let’s replicate the story from above. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Control.Exception.Annotated foo :: IO Int foo = throw (userError "foo") -- in GHCi, evaluate: -- λ> foo *** Exception: AnnotatedException { annotations = [ Annotation @CallStack [ ( "throw" , SrcLoc { srcLocPackage = "interactive" , srcLocModule = "Ghci1" , srcLocFile = "<interactive>" , srcLocStartLine = 4 , srcLocStartCol = 7 , srcLocEndLine = 4 , srcLocEndCol = 30 } ) ] ] , exception = user error (foo) } </code></pre></div></div> I’ve formatted the output to be a bit more legible. Now, instead of a plain <code class="language-plaintext highlighter-rouge">IOError</code>, we’ve thrown an <code class="language-plaintext highlighter-rouge">AnnotatedException IOError</code>. Inside of it, we have the <code class="language-plaintext highlighter-rouge">CallStack</code> from <code class="language-plaintext highlighter-rouge">throw</code>, which knows where it was thrown from. That <code class="language-plaintext highlighter-rouge">CallStack</code> inside of the exception is reporting the call-site of <code class="language-plaintext highlighter-rouge">throw</code> - not the definition site! This is true even though <code class="language-plaintext highlighter-rouge">foo</code> does not have a <code class="language-plaintext highlighter-rouge">HasCallStack</code> constraint! Let’s do <code class="language-plaintext highlighter-rouge">bar</code>. 
We’ll do <code class="language-plaintext highlighter-rouge">HasCallStack</code> and our <code class="language-plaintext highlighter-rouge">checkpointCallStack</code>, just to see what happens: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import GHC.Stack

bar :: HasCallStack => IO Int
bar = checkpointCallStack foo

-- λ> bar
*** Exception:
    AnnotatedException
        { annotations =
            [ Annotation @CallStack
                [ ( "throw"
                  , SrcLoc { srcLocPackage = "interactive", srcLocModule = "Ghci1", srcLocFile = "<interactive>", srcLocStartLine = 4, srcLocStartCol = 7, srcLocEndLine = 4, srcLocEndCol = 30}
                  )
                , ( "checkpointCallStack"
                  , SrcLoc {srcLocPackage = "interactive", srcLocModule = "Ghci2", srcLocFile = "<interactive>", srcLocStartLine = 15, srcLocStartCol = 7, srcLocEndLine = 15, srcLocEndCol = 30}
                  )
                , ( "bar"
                  , SrcLoc {srcLocPackage = "interactive", srcLocModule = "Ghci3", srcLocFile = "<interactive>", srcLocStartLine = 17, srcLocStartCol = 1, srcLocEndLine = 17, srcLocEndCol = 4}
                  )
                ]
            ]
        , exception = user error (foo)
        }
</code></pre></div></div> We get the source location for <code class="language-plaintext highlighter-rouge">throw</code>, <code class="language-plaintext highlighter-rouge">checkpointCallStack</code>, and then the use site of <code class="language-plaintext highlighter-rouge">bar</code>. Now, suppose we have our Problem Function again: <code class="language-plaintext highlighter-rouge">baz</code> doesn’t have a <code class="language-plaintext highlighter-rouge">HasCallStack</code> constraint or a <code class="language-plaintext highlighter-rouge">checkpointCallStack</code>. And when we called it through <code class="language-plaintext highlighter-rouge">ohno</code>, we lost the stack, even though <code class="language-plaintext highlighter-rouge">ohno</code> had the <code class="language-plaintext highlighter-rouge">HasCallStack</code> constraint.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>baz :: IO Int
baz = bar

ohno :: IO Int
ohno = checkpointCallStack baz

-- λ> ohno
*** Exception:
    AnnotatedException
        { annotations =
            [ Annotation @CallStack
                [ ( "throw"
                  , SrcLoc {srcLocPackage = "interactive", srcLocModule = "Ghci1", srcLocFile = "<interactive>", srcLocStartLine = 4, srcLocStartCol = 7, srcLocEndLine = 4, srcLocEndCol = 30}
                  )
                , ( "checkpointCallStack"
                  , SrcLoc {srcLocPackage = "interactive", srcLocModule = "Ghci2", srcLocFile = "<interactive>", srcLocStartLine = 15, srcLocStartCol = 7, srcLocEndLine = 15, srcLocEndCol = 30}
                  )
                , ( "bar"
                  , SrcLoc {srcLocPackage = "interactive", srcLocModule = "Ghci3", srcLocFile = "<interactive>", srcLocStartLine = 21, srcLocStartCol = 7, srcLocEndLine = 21, srcLocEndCol = 10}
                  )
                , ( "checkpointCallStack"
                  , SrcLoc {srcLocPackage = "interactive", srcLocModule = "Ghci3", srcLocFile = "<interactive>", srcLocStartLine = 23, srcLocStartCol = 8, srcLocEndLine = 23, srcLocEndCol = 31}
                  )
                ]
            ]
        , exception = user error (foo)
        }
</code></pre></div></div> When we call <code class="language-plaintext highlighter-rouge">ohno</code>, we preserve all of the entries in the <code class="language-plaintext highlighter-rouge">CallStack</code>. <code class="language-plaintext highlighter-rouge">checkpointCallStack</code> in <code class="language-plaintext highlighter-rouge">ohno</code> adds itself to the <code class="language-plaintext highlighter-rouge">CallStack</code> that is present on the <code class="language-plaintext highlighter-rouge">AnnotatedException</code> itself, so it doesn’t need to worry about the stack being broken. It’s perfectly capable of recording that history for you.
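To get a feel for why this works, here’s a toy model of the idea using only <code class="language-plaintext highlighter-rouge">base</code>. To be clear, this is not how <code class="language-plaintext highlighter-rouge">annotated-exception</code> is implemented - <code class="language-plaintext highlighter-rouge">Traced</code>, <code class="language-plaintext highlighter-rouge">traceHere</code>, and <code class="language-plaintext highlighter-rouge">sitesOf</code> are hypothetical names invented for this sketch. The point is only that the call sites ride along on the exception value, so a function in the middle with no <code class="language-plaintext highlighter-rouge">HasCallStack</code> constraint can’t break the chain.

```haskell
import Control.Exception (Exception, SomeException, catch, fromException, throwIO, try)
import GHC.Stack (CallStack, HasCallStack, SrcLoc (..), callStack, getCallStack)

-- | Toy stand-in for an annotated exception: the original exception plus
-- the trail of checkpoints it passed through. (Hypothetical type.)
data Traced = Traced [String] SomeException
    deriving Show

instance Exception Traced

-- | Describe the most recent entry of a call stack.
callSite :: CallStack -> String
callSite stack = case getCallStack stack of
    (name, loc) : _ -> name ++ ", called at line " ++ show (srcLocStartLine loc)
    [] -> "unknown call site"

-- | Append a site to an existing trail, or start a new trail.
addSite :: String -> SomeException -> Traced
addSite site e = case fromException e of
    Just (Traced sites original) -> Traced (sites ++ [site]) original
    Nothing -> Traced [site] e

-- | Record where this checkpoint was called on anything thrown by the action.
-- A toy: it catches everything, async exceptions included, which real code
-- should treat with much more care.
traceHere :: HasCallStack => IO a -> IO a
traceHere action = action `catch` (throwIO . addSite (callSite callStack))

-- | Run an action and return the trail on the exception it throws, if any.
sitesOf :: IO a -> IO [String]
sitesOf action = do
    result <- try action
    pure $ case result of
        Left (Traced sites _) -> sites
        Right _ -> []

foo :: IO Int
foo = traceHere (throwIO (userError "foo"))

baz :: IO Int -- no HasCallStack and no checkpoint
baz = foo

ohno :: IO Int
ohno = traceHere baz
```

Running <code class="language-plaintext highlighter-rouge">sitesOf ohno</code> should give two entries: one for the checkpoint in <code class="language-plaintext highlighter-rouge">foo</code> and one for the checkpoint in <code class="language-plaintext highlighter-rouge">ohno</code>, even though <code class="language-plaintext highlighter-rouge">baz</code> sits between them and knows nothing about any of this.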
<h2 id="aint-just-a-checkpoint---catch-me-later">Ain’t Just a Checkpoint - <code class="language-plaintext highlighter-rouge">catch</code> me later</h2> The type signature for <code class="language-plaintext highlighter-rouge">catch</code> in <code class="language-plaintext highlighter-rouge">annotated-exception</code> looks like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>catch :: (HasCallStack, Exception e, MonadCatch m) => m a -> (e -> m a) -> m a </code></pre></div></div> That <code class="language-plaintext highlighter-rouge">HasCallStack</code> constraint is used to give you a <code class="language-plaintext highlighter-rouge">CallStack</code> entry for any time that you <code class="language-plaintext highlighter-rouge">catch</code> an exception. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype MyException = MyException String deriving Show instance Exception MyException boom :: IO Int boom = throw (MyException "boom") recovery :: IO Int recovery = boom `catch` \(MyException message) -> do putStrLn message throw (MyException (message ++ " recovered")) </code></pre></div></div> <code class="language-plaintext highlighter-rouge">recovery</code> catches the <code class="language-plaintext highlighter-rouge">MyException</code> from <code class="language-plaintext highlighter-rouge">boom</code>, prints the message, and then throws a new exception with a modified message. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> recovery boom *** Exception: AnnotatedException { annotations = [ Annotation @CallStack [ ( "throw" , SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "src/annotated.hs", srcLocStartLine = 19, srcLocStartCol = 9, srcLocEndLine = 19, srcLocEndCol = 54} ) , ( "throw" , SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "src/annotated.hs", srcLocStartLine = 13, srcLocStartCol = 8, srcLocEndLine = 13, srcLocEndCol = 34} ) , ( "catch" , SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "src/annotated.hs", srcLocStartLine = 17, srcLocStartCol = 5, srcLocEndLine = 19, srcLocEndCol = 54} ) ] ] , exception = MyException "boom recovered" } </code></pre></div></div> Now, look at that call stack: we have the first <code class="language-plaintext highlighter-rouge">throw</code> (from <code class="language-plaintext highlighter-rouge">boom</code>), then we have the second <code class="language-plaintext highlighter-rouge">throw</code> (in <code class="language-plaintext highlighter-rouge">recovery</code>), and finally the <code class="language-plaintext highlighter-rouge">catch</code> in <code class="language-plaintext highlighter-rouge">recovery</code>. So we know where the exception originally happened, where it was rethrown, and where it was caught. This is fantastic! But, even better - these annotations survive even if you throw a different type of <code class="language-plaintext highlighter-rouge">Exception</code>. This means you can translate exceptions fearlessly, knowing that any essential annotated context won’t be lost. <h1 id="dynamic-annotations">Dynamic Annotations</h1> As I said earlier, <code class="language-plaintext highlighter-rouge">CallStack</code> is fine, but it’s a static thing. 
We can figure out “what code called what other code” that eventually led to an exception, but we can’t know anything about the running state of the program. Enter <code class="language-plaintext highlighter-rouge">checkpoint</code>. This function attaches an arbitrary <code class="language-plaintext highlighter-rouge">Annotation</code> to thrown exceptions. An <code class="language-plaintext highlighter-rouge">Annotation</code> is a wrapper around any value that has an instance of <code class="language-plaintext highlighter-rouge">Show</code> and <code class="language-plaintext highlighter-rouge">Typeable</code>. The library provides an instance of <code class="language-plaintext highlighter-rouge">IsString</code> for this, so you can enable <code class="language-plaintext highlighter-rouge">OverloadedStrings</code> and have stringly-typed annotations. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>constantAnnotation :: IO String constantAnnotation = checkpoint "from constant annotation" $ do msg <- getLine if null msg then throw (MyException "empty message") else pure msg </code></pre></div></div> But the real power is in using runtime data to annotate things. Let’s imagine you’ve got a web application. You’re reporting runtime exceptions to a service, like Bugsnag. Specific teams “own” routes, so if something breaks, you want to alert the right team. You can annotate thrown exceptions with the route. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Route = Login | Signup | ViewPosts | CreatePost | EditPost PostId deriving Show dispatch :: Request -> IO Response dispatch req = case parseRequest req of Right route -> checkpoint (Annotation route) $ case route of Login -> handleLogin Signup -> handleSignup ViewPosts -> handleViewPosts CreatePost -> handleCreatePost EditPost postId -> checkpoint (Annotation postId) $ handleEditPost postId Left _ -> invalidRouteError </code></pre></div></div> Now, suppose an exception is thrown somewhere in <code class="language-plaintext highlighter-rouge">handleLogin</code>. It’s going to bubble up past <code class="language-plaintext highlighter-rouge">dispatch</code> and get handled by the Warp default exception handler. That’s going to dig into the <code class="language-plaintext highlighter-rouge">[Annotation]</code> and use that to alter the report we send to Bugsnag. The team that is responsible for <code class="language-plaintext highlighter-rouge">handleLogin</code> gets a notification that something broke there. In the <code class="language-plaintext highlighter-rouge">EditPost</code> case, we’ve also annotated the exception with the post ID that we’re trying to edit. This means that, when debugging, we can know exactly which post threw the given exception. Now, when diagnosing and debugging, we can immediately pull up the problematic entry. This gives us much more information about the problem, which makes diagnosis easier. 
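The “dig into the annotations” step is worth a quick sketch. Since an annotation wraps any value with <code class="language-plaintext highlighter-rouge">Show</code> and <code class="language-plaintext highlighter-rouge">Typeable</code>, a handler can ask for “every annotation of type <code class="language-plaintext highlighter-rouge">Route</code>” and ignore the rest. The code below models that with plain <code class="language-plaintext highlighter-rouge">base</code>: <code class="language-plaintext highlighter-rouge">Note</code>, <code class="language-plaintext highlighter-rouge">notesOf</code>, <code class="language-plaintext highlighter-rouge">owningTeam</code>, and the team names are all assumptions for illustration, and the library’s own helpers may well look different.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.Maybe (listToMaybe, mapMaybe)
import Data.Typeable (Typeable, cast)

-- | Hypothetical stand-in for an annotation: any showable, typeable value.
data Note = forall a. (Show a, Typeable a) => Note a

newtype PostId = PostId Int
    deriving (Show, Eq)

data Route = Login | Signup | ViewPosts | CreatePost | EditPost PostId
    deriving (Show, Eq)

-- | Pull out every note of the requested type, skipping all the others.
notesOf :: Typeable a => [Note] -> [a]
notesOf = mapMaybe (\(Note value) -> cast value)

-- | Decide who gets alerted, based on the first Route note found.
-- The team names are invented for this sketch.
owningTeam :: [Note] -> String
owningTeam notes = case listToMaybe (notesOf notes) of
    Just Login -> "accounts"
    Just Signup -> "accounts"
    Just _ -> "content"
    Nothing -> "on-call"
```

With notes like <code class="language-plaintext highlighter-rouge">[Note (EditPost (PostId 7)), Note (PostId 7)]</code>, <code class="language-plaintext highlighter-rouge">owningTeam</code> picks the route and <code class="language-plaintext highlighter-rouge">notesOf</code> at type <code class="language-plaintext highlighter-rouge">[PostId]</code> picks the post ID. The reporting code never needs to know which other annotations happen to be attached.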
Likewise, suppose we have a function that gives us the logged in user: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>withLoggedInUser :: (Maybe (Entity User) -> IO a) -> IO a withLoggedInUser action = do muser <- getLoggedInUser checkpoint (Annotation (fmap entityKey muser)) $ do action muser </code></pre></div></div> If the action we pass in to <code class="language-plaintext highlighter-rouge">withLoggedInUser</code> throws an exception, that exception will carry the <code class="language-plaintext highlighter-rouge">Maybe UserId</code> of whoever was logged in. Now, we can easily know who is having a problem on our service, in addition to what the problem actually is. <h1 id="the-value-of-transparency">The Value of Transparency</h1> <blockquote> But wait - if all exceptions are wrapped with this <code class="language-plaintext highlighter-rouge">AnnotatedException</code> type, then how do I catch things? Won’t this pollute my codebase? And, what happens if I try to catch an <code class="language-plaintext highlighter-rouge">AnnotatedException MyException</code> but some other code only threw a plain <code class="language-plaintext highlighter-rouge">MyException</code>? Won’t that break things? </blockquote> These are great questions. <code class="language-plaintext highlighter-rouge">catch</code> and <code class="language-plaintext highlighter-rouge">try</code> from other libraries will fail to catch a <code class="language-plaintext highlighter-rouge">FooException</code> if the real type of the exception is <code class="language-plaintext highlighter-rouge">AnnotatedException FooException</code>. 
However, <code class="language-plaintext highlighter-rouge">catch</code> and <code class="language-plaintext highlighter-rouge">try</code> from <code class="language-plaintext highlighter-rouge">annotated-exception</code> are capable of “seeing through” the <code class="language-plaintext highlighter-rouge">AnnotatedException</code> wrapper. In fact, we took advantage of this earlier - here’s the code for <code class="language-plaintext highlighter-rouge">recovery</code> again: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>boom :: IO Int
boom = throw (MyException "boom")

recovery :: IO Int
recovery =
    boom `catch` \(MyException message) -> do
        putStrLn message
        throw (MyException (message ++ " recovered"))
</code></pre></div></div> Note how <code class="language-plaintext highlighter-rouge">catch</code> doesn’t say anything about annotations. We catch a <code class="language-plaintext highlighter-rouge">MyException</code>, exactly like you would in <code class="language-plaintext highlighter-rouge">Control.Exception</code>, and the annotations are propagated. But, let’s say you want to catch the <code class="language-plaintext highlighter-rouge">AnnotatedException MyException</code>. You just do that. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>recoveryAnnotated :: IO Int recoveryAnnotated = boom `catch` \(AnnotatedException annotations (MyException message)) -> do putStrLn message traverse print annotations throw (OtherException (length message)) -- in GHCi, λ> recoveryAnnotated boom Annotation @CallStack [("throw",SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "src/annotated.hs", srcLocStartLine = 13, srcLocStartCol = 8, srcLocEndLine = 13, srcLocEndCol = 34})] *** Exception: AnnotatedException { annotations = [ Annotation @CallStack [ ( "throw" , SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "src/annotated.hs", srcLocStartLine = 37, srcLocStartCol = 9, srcLocEndLine = 37, srcLocEndCol = 48} ) ] ] , exception = OtherException 4 } </code></pre></div></div> Now, something tricky occurs here: we don’t preserve the annotations on the thrown exception. If you catch an <code class="language-plaintext highlighter-rouge">AnnotatedException</code>, the library assumes that you’re going to handle those yourself. 
If you want to keep them, you’d need to throw an <code class="language-plaintext highlighter-rouge">AnnotatedException</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>recoveryAnnotatedPreserve :: IO Int recoveryAnnotatedPreserve = boom `catch` \(AnnotatedException annotations (MyException message)) -> do putStrLn message traverse print annotations throw (AnnotatedException annotations (OtherException (length message))) -- in GHCi, λ> recoveryAnnotatedPreserve boom Annotation @CallStack [("throw",SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "src/annotated.hs", srcLocStartLine = 13, srcLocStartCol = 8, srcLocEndLine = 13, srcLocEndCol = 34})] *** Exception: AnnotatedException { annotations = [ Annotation @CallStack [ ( "throw" , SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "src/annotated.hs", srcLocStartLine = 44, srcLocStartCol = 9, srcLocEndLine = 44, srcLocEndCol = 81} ) ] , Annotation @CallStack [ ( "throw" , SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "src/annotated.hs", srcLocStartLine = 13, srcLocStartCol = 8, srcLocEndLine = 13, srcLocEndCol = 34} ) ] ] , exception = OtherException 4 } </code></pre></div></div> We’re missing <code class="language-plaintext highlighter-rouge">catch</code>, which is unfortunate, but generally you aren’t going to be doing this - you’re either going to be handling an error completely, or rethrowing it, and the <code class="language-plaintext highlighter-rouge">[Annotation]</code> won’t be relevant to you… unless you’re writing an integration with Bugsnag, or reporting on them in some other way. 
So <code class="language-plaintext highlighter-rouge">annotated-exception</code>’s exception handling functions can “see through” an <code class="language-plaintext highlighter-rouge">AnnotatedException inner</code> to work only on the <code class="language-plaintext highlighter-rouge">inner</code> exception type. But what if I try to catch a <code class="language-plaintext highlighter-rouge">DatabaseException</code> as an <code class="language-plaintext highlighter-rouge">AnnotatedException DatabaseException</code>? Turns out, the <code class="language-plaintext highlighter-rouge">Exception</code> instance of <code class="language-plaintext highlighter-rouge">AnnotatedException</code> allows you to do that. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import qualified Control.Exception

emptyAnnotationsAreCool :: IO ()
emptyAnnotationsAreCool =
    Control.Exception.throwIO (MyException "definitely not annotated?")
        `Control.Exception.catch`
            \(AnnotatedException annotations (MyException woah)) -> do
                print annotations
                putStrLn woah

-- in GHCi,
λ> emptyAnnotationsAreCool
[]
definitely not annotated?
</code></pre></div></div> We promote the <code class="language-plaintext highlighter-rouge">inner</code> into <code class="language-plaintext highlighter-rouge">AnnotatedException [] inner</code>. So the library works regardless of whether the code that throws knows about <code class="language-plaintext highlighter-rouge">AnnotatedException</code>. 
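One way such a promotion can be written is by overriding `fromException`. The following is a simplified, base-only sketch under that assumption, not the library's actual instance: `Annotated`, `MyException`, and `promoted` are hypothetical names, and annotations are plain strings.

```haskell
import Control.Exception (Exception (..), SomeException (..), catch, throwIO)
import Data.Typeable (cast)

-- Hypothetical wrapper: string annotations around an inner exception.
data Annotated e = Annotated [String] e
  deriving Show

instance Exception e => Exception (Annotated e) where
  fromException whole@(SomeException inner) =
    case cast inner of
      -- The thrown value is already an `Annotated e`: use it as-is.
      Just annotated -> Just annotated
      -- Otherwise try to read it as a bare `e` and promote it
      -- with an empty annotation list.
      Nothing -> Annotated [] <$> fromException whole

newtype MyException = MyException String
  deriving Show

instance Exception MyException

-- Throw a plain MyException, but catch it as an Annotated MyException.
promoted :: IO ([String], String)
promoted =
  throwIO (MyException "plain")
    `catch` \(Annotated notes (MyException msg)) -> pure (notes, msg)
```

Here `promoted` returns `([], "plain")`: the handler asked for the wrapped type, and the instance supplied an empty annotation list for an exception that was never annotated.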
If you call some external library code which throws an exception, you’ll get the first annotation you try - including if that’s just from <code class="language-plaintext highlighter-rouge">catch</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>catchPutsACallStack :: IO () catchPutsACallStack = Control.Exception.throwIO (MyException "definitely not annotated?") `catch` \(MyException woah) -> do throw (OtherException (length woah)) -- in GHCi, λ> catchPutsACallStack *** Exception: AnnotatedException { annotations = [ Annotation @CallStack [ ( "throw" , SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "../parsonsmatt.github.io/src/annotated.hs", srcLocStartLine = 60, srcLocStartCol = 17, srcLocEndLine = 60, srcLocEndCol = 53}) , ("catch" , SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "../parsonsmatt.github.io/src/annotated.hs", srcLocStartLine = 58, srcLocStartCol = 9, srcLocEndLine = 58, srcLocEndCol = 16} ) ] ] , exception = OtherException 25 } </code></pre></div></div> We get <code class="language-plaintext highlighter-rouge">throw</code> and <code class="language-plaintext highlighter-rouge">catch</code> both showing up in our stack trace. If we’d used <code class="language-plaintext highlighter-rouge">Control.Exception.throwIO</code> instead of <code class="language-plaintext highlighter-rouge">Control.Exception.Annotated.throw</code>, then we’d still have <code class="language-plaintext highlighter-rouge">catch</code> as an annotation. <h1 id="do-you-feel-the-power">Do you feel the power?</h1> The primary purpose here is to share the technique and inspire a hunger for dynamic exception annotations. We’ve been using this technique at Mercury for most of this year. It has dramatically simplified how we report exceptions, the shape of our exceptions, and how much info we get from a Bugsnag report. It’s now much easier to diagnose problems and fix bugs. 
The Really Big Deal here is that we now have something better than other languages. The lack of stack traces in Haskell is really annoying, and a clear way that Haskell suffers compared to Ruby or Java. But now, with <code class="language-plaintext highlighter-rouge">annotated-exception</code>, we actually have more powerful and more useful exception annotations than a mere stack trace. And, since this is all just library functions, you can swap to <code class="language-plaintext highlighter-rouge">Control.Exception.Annotated</code> with little fuss.
Tue, 16 Aug 2022 00:00:00 +0000 https://www.parsonsmatt.org/2022/08/16/dynamic_exception_reporting_in_haskell.html
Moving the Programming Blog
I’m moving the programming stuff over to <a href="https://overcoming.software"><code class="language-plaintext highlighter-rouge">https://overcoming.software</code></a>. Well, I will at some point in the future. But I don’t want to break links. So I need to set up a server at this domain which has a permanent redirect to the relevant <code class="language-plaintext highlighter-rouge">overcoming.software</code> domain. That’s a decent amount of work, which I don’t have time for, so this probably won’t happen any time soon.
Tue, 03 May 2022 00:00:00 +0000 https://www.parsonsmatt.org/2022/05/03/moving_the_programming_blog.html
RankNTypes via Lambda Calculus
<code class="language-plaintext highlighter-rouge">RankNTypes</code> is a language extension in Haskell that allows you to write even more polymorphic programs. The most basic explanation is that it allows the implementer of a function to pick a type, rather than the caller of the function. 
A very brief version of this explanation follows: <h1 id="the-typical-explanation">The Typical Explanation</h1> Consider the identity function, or <code class="language-plaintext highlighter-rouge">const</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>id :: a -> a id x = x const :: a -> b -> a const a b = a </code></pre></div></div> These functions work for any types that the caller of the function picks. Which means that, as implementers, we can’t know anything about the types involved. Let’s say we want to apply a function to each element in a tuple. Without a type signature, we can write: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>applyToBoth f (a, b) = (f a, f b) </code></pre></div></div> Vanilla Haskell will provide this type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>applyToBoth :: (a -> a) -> (a, a) -> (a, a) </code></pre></div></div> This is a perfectly useful type, but what if we want to apply it to a tuple containing two different types? Well, we can’t do anything terribly interesting with that - if we don’t know anything about the type, the only thing we can provide is <code class="language-plaintext highlighter-rouge">id</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>applyToBoth :: (forall x. x -> x) -> (a, b) -> (a, b) applyToBoth f (a, b) = (f a, f b) </code></pre></div></div> And that <code class="language-plaintext highlighter-rouge">forall x</code> inside of a parentheses is a <code class="language-plaintext highlighter-rouge">RankNType</code>. It allows the implementer of the function to select the type that the function will be used at, and the caller of the function must provide something sufficiently polymorphic. This explanation is a bit weird and difficult, even though it captures the basic intuition. 
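To see the caller/implementer split at an actual call site, here is a small sketch that compiles with GHC. The name `example` is made up for illustration; `applyToBoth` is the definition from above.

```haskell
{-# LANGUAGE RankNTypes #-}

-- The argument must work at every type x,
-- so the implementation is free to use it at both a and b.
applyToBoth :: (forall x. x -> x) -> (a, b) -> (a, b)
applyToBoth f (a, b) = (f a, f b)

-- The caller has to hand over something fully polymorphic; `id` qualifies.
example :: (Int, Bool)
example = applyToBoth id (3, True)

-- Rejected by the type checker, because `not :: Bool -> Bool`
-- only works at one type:
-- bad = applyToBoth not (True, False)
```

The caller still chooses `a` and `b` (here `Int` and `Bool`), while the body of `applyToBoth` chooses what `x` is each time it uses `f`.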
It’s not super obvious why the caller or the implementer gets to pick types, though. Fortunately, by leveraging the lambda calculus, we can make this more precise! <h1 id="whirlwind-tour-of-lambda">Whirlwind Tour of Lambda</h1> Feel free to skip this section if you’re familiar with the lambda calculus. We’re going to work from untyped, to simply typed, and finally to the polymorphic lambda calculus. This will be sufficient for us to get a feeling for what <code class="language-plaintext highlighter-rouge">RankNTypes</code> are. <h2 id="untyped-lambda-calculus">Untyped Lambda Calculus</h2> The untyped lambda calculus is an extremely simple programming language with three things: <ol> <li>Variables</li> <li>Anonymous Functions (aka lambdas)</li> <li>Function Application</li> </ol> This language is Turing complete, surprisingly. We’ll use Haskell syntax, but basically, you can write things like: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>id = \x -> x const = \a -> \b -> a apply = \f -> \a -> f a </code></pre></div></div> <h2 id="simply-typed-lambda-calculus">Simply Typed Lambda Calculus</h2> The simply typed lambda calculus adds an extremely simple type system to the untyped lambda calculus. All terms must be given a type, and we will have a pretty simple type system - we’ll only have <code class="language-plaintext highlighter-rouge">Unit</code> and function arrows. A lambda will always introduce a function arrow, and a function application always eliminates it. 
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>id :: Unit -> Unit id = \(x :: Unit) -> x idFn :: (Unit -> Unit) -> (Unit -> Unit) idFn = \(f :: Unit -> Unit) -> f const :: Unit -> Unit -> Unit const = \(a :: Unit) -> \(b :: Unit) -> a apply :: (Unit -> Unit) -> Unit -> Unit apply = \(f :: Unit -> Unit) -> \(a :: Unit) -> f a </code></pre></div></div> This is a much less powerful programming language - it is not even Turing Complete. This makes sense - type systems forbid certain valid programs that are otherwise syntactically valid. The type system in this is only capable of referring to the constants that we provide. Since we only have <code class="language-plaintext highlighter-rouge">Unit</code> and <code class="language-plaintext highlighter-rouge">-></code> as valid type constants, we have a super limited ability to write programs. We can still do quite a bit - natural numbers and Boolean types are perfectly expressible, but many higher order combinators are impossible. Let’s add polymorphic types. <h2 id="polymorphic-lambda-calculus">Polymorphic Lambda Calculus</h2> The magic of the lambda calculus is that we have a means of introducing variables. The problem of the simply typed lambda calculus is that we don’t have variables. So we can introduce type variables. Like Haskell, we’ll use <code class="language-plaintext highlighter-rouge">forall</code> to introduce type variables. In a type signature, the syntax will be the same. However, unlike Haskell, we’re going to have explicit type variable application and introduction at the value level as well. Let’s write <code class="language-plaintext highlighter-rouge">id</code> with our new explicit type variables. <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>id :: forall a. a -> a id = forall a. 
 \(x :: a) -> x
</code></pre></div></div> Let’s write <code class="language-plaintext highlighter-rouge">const</code> and <code class="language-plaintext highlighter-rouge">apply</code>. <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>const :: forall a. forall b. a -> b -> a
const = forall a. forall b. \(x :: a) -> \(y :: b) -> x

apply :: forall a. forall b. (a -> b) -> a -> b
apply = forall a. forall b. \(f :: a -> b) -> \(x :: a) -> f x
</code></pre></div></div> Finally, let’s apply some type variables. <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>constUnit :: Unit -> Unit -> Unit
constUnit = const @Unit @Unit

idUnitFn :: (Unit -> Unit) -> (Unit -> Unit)
idUnitFn = id @(Unit -> Unit)

idReturnUnitFn :: forall a. (a -> Unit) -> (a -> Unit)
idReturnUnitFn = forall a. id @(a -> Unit)

constUnitFn :: Unit -> (Unit -> Unit) -> Unit
constUnitFn = const @Unit @(Unit -> Unit)
</code></pre></div></div> We’re passing types to functions. With all of these simple functions, the caller gets to provide the type. If we want the implementer to provide a type, then we’d just put the <code class="language-plaintext highlighter-rouge">forall</code> inside parentheses. Let’s look at <code class="language-plaintext highlighter-rouge">applyToBoth</code> from above, which we’ll call <code class="language-plaintext highlighter-rouge">applyBoth</code> here. This time, we’ll have explicit type annotations and introductions! <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>applyBoth :: forall a. forall b. (forall x. x -> x) -> (a, b) -> (a, b)
applyBoth =
    forall a. forall b.          -- [1]
    \(f :: forall x. x -> x) ->  -- [2]
    \((k, h) :: (a, b)) ->       -- [3]
        (f @a k, f @b h)         -- [4]
</code></pre></div></div> There’s a good bit going on here, so let’s break it down on a line-by-line basis. 
<ol> <li>Here, we’re introducing our type variables <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">b</code> so that we can refer to them in the type signatures of our variables, and apply them to our functions.</li> <li>Here, we’re introducing our first value parameter - the function <code class="language-plaintext highlighter-rouge">f</code>, which itself has a type that accepts a type variable.</li> <li>Now, we’re accepting our second value parameter - a tuple <code class="language-plaintext highlighter-rouge">(k, h) :: (a, b)</code>. We can refer to <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">b</code> in this signature because we’ve introduced them in step 1.</li> <li>Finally, we’re supplying the type <code class="language-plaintext highlighter-rouge">@a</code> to our function <code class="language-plaintext highlighter-rouge">f</code> in the left hand of the tuple, and the type <code class="language-plaintext highlighter-rouge">@b</code> to the type in the right. This allows our types to check.</li> </ol> Let’s see what it looks like to call this function. To give us some more interesting types to work with, we’ll include <code class="language-plaintext highlighter-rouge">Int</code> and <code class="language-plaintext highlighter-rouge">Bool</code> literals. <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foo :: (Int, Bool) foo = applyBoth @Int @Bool _f (3, True) </code></pre></div></div> We haven’t decided what <code class="language-plaintext highlighter-rouge">_f</code> will look like exactly, but the type of the value is <code class="language-plaintext highlighter-rouge">forall x. x -> x</code>. 
So, syntactically, we’ll introduce our type variable, then our value-variable: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foo :: (Int, Bool) foo = applyBoth @Int @Bool (forall x. \(a :: x) -> (_ :: x)) (3, True) </code></pre></div></div> As it happens, the only value we can possibly plug in here is <code class="language-plaintext highlighter-rouge">a :: x</code> to satisfy this. We know absolutely nothing about the type <code class="language-plaintext highlighter-rouge">x</code>, so we cannot do anything with it. <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foo :: (Int, Bool) foo = applyBoth @Int @Bool (forall x. \(a :: x) -> a) (3, True) </code></pre></div></div> <h1 id="tug-of-war">Tug of War</h1> <code class="language-plaintext highlighter-rouge">applyBoth</code> is an awful example of <code class="language-plaintext highlighter-rouge">RankNTypes</code> because there’s literally nothing useful you can do with it. The reason is that we don’t give the caller of the function any options! By giving the caller of the function more information, they can do more useful and interesting things with the results. This mirrors the guarantee of parametric polymorphism. The less that we know about our inputs, the less we can do with them - until we get to types like <code class="language-plaintext highlighter-rouge">const :: a -> b -> a</code> where the implementation is completely constrained. What this means is that we provide, as arguments to the callback function, more information! Let’s consider this other type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>applyBothList :: (forall x. 
 [x] -> Int) -> ([a], [b]) -> (Int, Int)
applyBothList f (as, bs) =
    (f as, f bs)
</code></pre></div></div> Now the function knows a good bit more: we have a list as our input (even if we don’t know anything about the type), and the output is an <code class="language-plaintext highlighter-rouge">Int</code>. Let’s translate this to our polymorphic lambda calculus. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>applyBothList =
    forall a. forall b.
    \(f :: forall x. [x] -> Int) ->
    \( as :: [a], bs :: [b] ) ->
        ( f @a as, f @b bs )
</code></pre></div></div> When we call this function, this is what it looks like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>applyBothList
    @Int
    @Char
    (forall x. \(xs :: [x]) -> length @x xs * 2)
    ( [1, 2, 3], ['a', 'b', 'c', 'd'] )
</code></pre></div></div> <h1 id="constraints">Constraints?</h1> In Haskell, a type class constraint is elaborated into a record-of-functions that is indexed by the type. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class Num a where
    fromInteger :: Integer -> a
    (+) :: a -> a -> a
    (*) :: a -> a -> a
    (-) :: a -> a -> a
    -- etc...

-- under the hood, this is the same thing:
data NumDict a = NumDict
    { fromInteger :: Integer -> a
    , (+) :: a -> a -> a
    , (-) :: a -> a -> a
    , (*) :: a -> a -> a
    }
</code></pre></div></div> When you have a function with a <code class="language-plaintext highlighter-rouge">Num a</code> constraint, GHC turns the constraint into a <code class="language-plaintext highlighter-rouge">NumDict a</code> argument and passes it explicitly. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Regular Haskell:
square :: Num a => a -> a
square a = a * a

-- What happens at runtime:
square :: NumDict a -> a -> a
square NumDict {..} a = a * a
</code></pre></div></div> Or, for a simpler variant, let’s consider <code class="language-plaintext highlighter-rouge">Eq</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Regular Haskell:
class Eq a where
    (==) :: a -> a -> Bool

-- Runtime dictionary:
newtype EqDict a = EqDict { (==) :: a -> a -> Bool }

-- Regular:
allEqual :: (Eq a) => a -> a -> a -> Bool
allEqual x y z =
    x == y && y == z && x == z

-- Runtime dictionary:
allEqual :: EqDict a -> a -> a -> a -> Bool
allEqual (EqDict (==)) x y z =
    x == y && y == z && x == z
</code></pre></div></div> (Note that binding a variable name to an operator is perfectly legal!) One common way to extend the power or flexibility of a <code class="language-plaintext highlighter-rouge">RankNTypes</code> program is to include allowed constraints in the callback function. Knowing how and when things come into scope can be tricky, but if we remember our polymorphic lambda calculus, this becomes easy. Consider this weird signature: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>weirdNum :: (forall a. Num a => a) -> String
weirdNum someNumber =
    show (someNumber @Int)
</code></pre></div></div> This isn’t exactly a function. What sort of things can we call here? Well, we have to produce an <code class="language-plaintext highlighter-rouge">a</code>. And we know that we have a <code class="language-plaintext highlighter-rouge">Num a</code> constraint. This means we can call <code class="language-plaintext highlighter-rouge">fromInteger :: Integer -> a</code>. 
And, we can also use any other <code class="language-plaintext highlighter-rouge">Num</code> methods - so we can add to it, double it, square it, etc. So, calling it might look like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main = do putStrLn $ weirdNum (fromInteger 3 + 6 * 2) </code></pre></div></div> Let’s elaborate this to our lambda calculus. We’ll convert the type class constraint into an explicit dictionary, and then everything should work normally. <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>weirdNum = \(number :: forall a. NumDict a -> a) -> show @Int intShowDict(number @Int intNumDict) </code></pre></div></div> Now, let’s call this: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> weirdNum ( forall a. \(dict :: NumDict a) -> fromInteger dict 3 ) </code></pre></div></div> <h1 id="more-on-the-lambda-calculus">More on the Lambda Calculus</h1> If you’ve found this elaboration interesting, you may want to consider reading <a href="https://www.amazon.com/Type-Theory-Formal-Proof-Introduction/dp/110703650X">Type Theory and Formal Proof</a>. This book is extremely accessible, and it taught me almost everything I know about the lambda calculus. Tue, 30 Nov 2021 00:00:00 +0000 https://www.parsonsmatt.org/2021/11/30/rank_n_types_via_lambda_calculus.html https://www.parsonsmatt.org/2021/11/30/rank_n_types_via_lambda_calculus.html Deferred Derivation <h2 id="justifiably-lazy-orphans">justifiably lazy orphans</h2> (alternative subtitle: “I used the <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> to destroy the <code class="language-plaintext highlighter-rouge">TemplateHaskell</code>”) (EDIT: 2021-11-05 - Having actually tried this approach in production, I now have an experience report and a warning. Scroll to the bottom for the details!) 
At the day job, we use the <code class="language-plaintext highlighter-rouge">aeson-typescript</code> library to generate TypeScript types from our Haskell types. One problem is that the library uses <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> to do this, which means we have <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> in almost all of our datatype-defining modules. Due to the <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> recompilation avoidance bug (any module that uses <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> must always be recompiled if any transitive dependency is changed), this means we spend a lot of time recompiling a lot of modules that don’t need to change. Freeing a module from <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> speeds up our build tremendously - not because <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> is slow (it’s very fast) but because compiling at all is slow. The best way to speed something up is to spend 0 time doing it - don’t do it at all! Anyway, I investigated switching the library to a <code class="language-plaintext highlighter-rouge">Generics</code> based approach, but it’s complicated and looked tricky, so I decided not to pursue that. The library defines a type class <a href="https://hackage.haskell.org/package/aeson-typescript-0.3.0.1/docs/Data-Aeson-TypeScript-TH.html#t:HasJSONOptions"><code class="language-plaintext highlighter-rouge">HasJSONOptions</code></a>, which lets you re-use the same JSON options for both <code class="language-plaintext highlighter-rouge">aeson</code>’s JSON encoding classes and the <code class="language-plaintext highlighter-rouge">TypeScript</code> class. 
I had recently done some work with <a href="https://hackage.haskell.org/package/persistent-2.13.1.2/docs/Database-Persist-TH.html#v:discoverEntities"><code class="language-plaintext highlighter-rouge">discoverEntities</code></a>, a <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> function which gathers the <code class="language-plaintext highlighter-rouge">PersistEntity</code> instances in scope and collects their <code class="language-plaintext highlighter-rouge">entityDef (Proxy :: Proxy a)</code>. I started wondering - can I use this trick to defer the derivation and omit the <code class="language-plaintext highlighter-rouge">TemplateHaskell</code>? In preparation, I wrote <a href="https://hackage.haskell.org/package/discover-instances-0.1.0.0/docs/DiscoverInstances.html"><code class="language-plaintext highlighter-rouge">discover-instances</code></a>, which generalized the above pattern (and was a fun exercise in typed <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> quotes). Now, we might have had an instance like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data X = X { ... } $(deriveJSON defaultOptions ''X) $(deriveTypeScript defaultOptions ''X) </code></pre></div></div> We can excise the <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> entirely by using <code class="language-plaintext highlighter-rouge">Generics</code>-based derivation for the JSON classes, and specify an instance for the <code class="language-plaintext highlighter-rouge">HasJSONOptions</code> class. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data X = X { ... 
} deriving stock Generic instance HasJSONOptions X where getJSONOptions _ = defaultOptions instance ToJSON X where toJSON = genericToJSON (getJSONOptions (Proxy :: Proxy X)) instance FromJSON X where parseJSON = genericFromJSON (getJSONOptions (Proxy :: Proxy X)) </code></pre></div></div> But we don’t yet have that <code class="language-plaintext highlighter-rouge">TypeScript</code> instance. Let’s look at the module that actually generates our <code class="language-plaintext highlighter-rouge">TypeScript</code> code. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module MakeTypeScript where import Model.X import Model.Y import Model.Z writeTypeScript :: IO () writeTypeScript = do writeFile frontendTypes $ concat [ makeTypeScriptFor (Proxy :: Proxy X) , makeTypeScriptFor (Proxy :: Proxy Y) , makeTypeScriptFor (Proxy :: Proxy Z) ] </code></pre></div></div> This is the only place those instances are ever used. With the above change (eg deleting the <code class="language-plaintext highlighter-rouge">deriveTypeScript</code> stuff), there’s no longer an instance present. But I can fix that by deriving the instance in this file. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module MakeTypeScript where import Model.X import Model.Y import Model.Z $(deriveTypeScript (getJSONOptions (Proxy :: Proxy X)) ''X) $(deriveTypeScript (getJSONOptions (Proxy :: Proxy Y)) ''Y) $(deriveTypeScript (getJSONOptions (Proxy :: Proxy Z)) ''Z) writeTypeScript :: IO () writeTypeScript = do writeFile frontendTypes $ concat [ makeTypeScriptFor (Proxy :: Proxy X) , makeTypeScriptFor (Proxy :: Proxy Y) , makeTypeScriptFor (Proxy :: Proxy Z) ] </code></pre></div></div> This is deeply unsatisfying to me. 
We have three points of repetition: <ol> <li><code class="language-plaintext highlighter-rouge">import</code>ing a type</li> <li><code class="language-plaintext highlighter-rouge">derive</code>ing an instance for it</li> <li>Mentioning the type in <code class="language-plaintext highlighter-rouge">writeTypeScript</code>.</li> </ol> Let’s improve this. First, we want to use <a href="https://hackage.haskell.org/package/discover-instances-0.1.0.0/docs/DiscoverInstances.html#v:discoverInstances"><code class="language-plaintext highlighter-rouge">discoverInstances</code></a> to splice in all the <code class="language-plaintext highlighter-rouge">HasJSONOptions</code> instances that are visible. Second, we’ll iterate over each instance, and derive the code there. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module MakeTypeScript where import Model.X import Model.Y import Model.Z $(do decs <- forInstances $$(discoverInstances @HasJSONOptions) $ \proxy@(Proxy :: Proxy a) -> do deriveTypeScript (getJSONOptions proxy) (nameForType proxy) pure (concat decs) ) writeTypeScript :: IO () writeTypeScript = do writeFile frontendTypes $ concat [ makeTypeScriptFor (Proxy :: Proxy X) , makeTypeScriptFor (Proxy :: Proxy Y) , makeTypeScriptFor (Proxy :: Proxy Z) ] </code></pre></div></div> <a href="https://hackage.haskell.org/package/discover-instances-0.1.0.0/docs/DiscoverInstances.html#v:forInstances"><code class="language-plaintext highlighter-rouge">forInstances</code></a> is a convenience function that lets you operate on the proxy with the constraint in scope. This totally works. We’ve derived all of the instances of <code class="language-plaintext highlighter-rouge">TypeScript</code>, and now we’re using them quite happily. Now we’re down to two points of repetition - importing a type and writing it specifically in <code class="language-plaintext highlighter-rouge">makeTypeScriptFor</code>. 
We can’t get rid of imports, so let’s look at the <code class="language-plaintext highlighter-rouge">writeFile</code> list. This is pretty easy to get rid of using our <code class="language-plaintext highlighter-rouge">SomeDict TypeScript</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module MakeTypeScript where

import Model.X
import Model.Y
import Model.Z

$(do
    decs <- forInstances $$(discoverInstances @HasJSONOptions) $ \proxy@(Proxy :: Proxy a) -> do
        deriveTypeScript (getJSONOptions proxy) (nameForType proxy)
    pure (concat decs)
 )

writeTypeScript :: IO ()
writeTypeScript = do
    writeFile frontendTypes $ concat $
        withInstances $$(discoverInstances @TypeScript) $ \proxy ->
            makeTypeScriptFor proxy
</code></pre></div></div> And we’re done! <h2 id="wins">Wins:</h2> <ol> <li>No more <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> requirement in a module that defines an API type!</li> <li>No more boring repetition!</li> <li>We got to use fancy types!</li> </ol> <h2 id="losses">Losses:</h2> <ol> <li>Orphan instances :(</li> <li>There’s still TemplateHaskell even if it’s localized</li> <li><code class="language-plaintext highlighter-rouge">Generic</code>-based deriving is slower, so a clean build will be worse</li> </ol> <h1 id="update-experience-report">UPDATE: Experience Report</h1> Alright, so I’ve put this technique through its paces on the work codebase, and it works, but it has an unforeseen negative consequence. 
Here’s what happens: <ol> <li>Types are enrolled in the “signal class”, which is effectively free for compilation time.</li> <li>But then we introduce a new module, <code class="language-plaintext highlighter-rouge">TypeScript.Instances</code>, which imports all the datatypes we expose on the front-end.</li> <li>This module runs <code class="language-plaintext highlighter-rouge">$$(discoverInstances)</code> and then <code class="language-plaintext highlighter-rouge">deriveTypeScript</code> on every single type</li> <li>This takes a long time! Like ~7 seconds.</li> <li>And, because it is downstream of every type, we have to recompile the module basically whenever any module is touched.</li> </ol> To put it in a textual meme format, <ul> <li>(!!!) Trade offer (!!!) <ul> <li>You get: <ul> <li>Recompilation avoidance for most of your modules</li> </ul> </li> <li>I get: <ul> <li>Forced recompilation of a big module in the hot path of a successful recompile, every single time, forever</li> </ul> </li> </ul> </li> </ul> This is not a good trade, unfortunately. There are two work-arounds that I’ve thought about: <ol> <li>Moving the TypeScript code generation and instance derivation to the <code class="language-plaintext highlighter-rouge">executable</code> component that actually runs.</li> <li>Moving all the types out-of-band of the modules that are more likely to change</li> </ol> <h2 id="moving-to-a-separate-package-component">Moving to a separate package component</h2> This essentially defers the problem, by saying: “I don’t care about compiling this for my normal workflow. Please just only compile this when I specifically ask for it.” This means that your <code class="language-plaintext highlighter-rouge">ghcid</code> flow (or HLS, or whatever) will stop at this boundary and not check for compilation past that. If you’re in the habit of trying to build as much of the package as possible with each compile-step, then this completely defeats that. 
To do this, you’d move the <code class="language-plaintext highlighter-rouge">MakeTypeScript</code> module from the <code class="language-plaintext highlighter-rouge">src/</code> or <code class="language-plaintext highlighter-rouge">library</code> component into an <code class="language-plaintext highlighter-rouge">executable</code> component. I’m not a fan of this approach, and won’t be pursuing it. <h2 id="moving-types-out-of-band">Moving types out-of-band</h2> So, this is a bit tricky. The way a lot of our application is designed, we have our web application handlers defined in a module along with their request/response types. For example, we might have a route called <code class="language-plaintext highlighter-rouge">FooR</code>, and it’d be defined in a module much like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module Handler.Foo where

import Import.Handler

getFooR :: Text -> Handler FooResponse
getFooR fooName = do
    foo <- getFoo fooName
    pure $ FooResponse {..}

postFooR :: Handler FooResponse
postFooR = do
    FooRequest {..} <- parseJSONBody
    doStuff
    pure $ FooResponse {..}
</code></pre></div></div> We might have 1-2 datatype definitions in a module, with <code class="language-plaintext highlighter-rouge">Handler</code> and business logic taking up 100-1000 lines of code. Recompiling this code every time is a drag, especially since the instances themselves are pretty quick. In <code class="language-plaintext highlighter-rouge">MakeTypeScript</code>, we import each of our <code class="language-plaintext highlighter-rouge">Handler.*</code> modules. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module MakeTypeScript where

import Handler.Foo
import Handler.Bar
import Handler.User
import Handler.Organization
import ..........
</code></pre></div></div> Due to the transitive nature of the recompilation problem, this means that any change to any types or business logic upon those types will trigger a recompilation, not just of the <code class="language-plaintext highlighter-rouge">Handler</code>, but also the <code class="language-plaintext highlighter-rouge">MakeTypeScript</code>. We can restructure the <code class="language-plaintext highlighter-rouge">Handler</code> modules to avoid this problem: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module Handler.Foo.Types where -- this import has a *lot* fewer dependences in the application import Import.Handler.Types data FooRequest = FooRequest ... $(deriveJSONAndTypeScript defaultOptions ''FooRequest) data FooResponse = FooResponse ... $(deriveJSONAndTypeScript defaultOptions ''FooResponse) module Handler.Foo where -- this import has mostly business logic import Import.Handler import Handler.Foo.Types getFooR :: Text -> Handler FooResponse getFooR fooName = do foo <- getFoo fooName pure $ FooResponse {..} postFooR :: Handler FooResponse postFooR = do FooRequest {..} <- parseJSONBody doStuff pure $ FooResponse {..} -- etc for a few hundred lines </code></pre></div></div> By doing this, we now only recompile <code class="language-plaintext highlighter-rouge">Handler.Foo</code> when it actually needs to change. If we can sufficiently separate the business logic and type declarations, then we can also avoid recompiling <code class="language-plaintext highlighter-rouge">MakeTypeScript</code> so often - <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module MakeTypeScript where import Handler.Foo.Types import Handler.Bar.Types import ... </code></pre></div></div> If these <code class="language-plaintext highlighter-rouge">Handler.*.Types</code> modules only ever depend on rarely-changing datatypes, then this should alleviate the problem. 
<h3 id="however">However…</h3> A problem with the <code class="language-plaintext highlighter-rouge">Deferred Derivation</code> approach is that it is extremely easy to break the fast compilation times you get from it. Since any <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> at all causes The Recompilation Problem, skipping only a few parts of it will mostly just make your compilation slower. The strategy for moving types out-of-band is also effective at solving the original problem. By defining the types and logic separately, you don’t have to needlessly recompile the types all of the time. And this has a cascading incremental effect - the more module splitups you do, the faster things get, immediately. <h1 id="maybe-this-is-a-bad-idea">Maybe this is a bad idea</h1> At the very least, the results of performing the experiment at work are: “let’s not use this.” I do still think there’s a lot of value in “signal classes” and <code class="language-plaintext highlighter-rouge">discoverInstances</code> for those classes to perform neat metaprogramming. Deferring derivation of type classes that are primarily used for a single thing may not be the ticket, though. Thu, 09 Sep 2021 00:00:00 +0000 https://www.parsonsmatt.org/2021/09/09/deferred_derivation.html Family Values I wrote a big thread on the company Slack to compare type families: open vs closed vs associated. I also ended up discussing data families, as well, since they are a good complement to type families. I’ll probably edit this further and include it in my book, Production Haskell, but here’s the Slack transcript for now: An associated type family is an open type family, but with the requirement that it be defined for any type that’s also an instance of the type class. Consider the <code class="language-plaintext highlighter-rouge">mono-traversable</code> library. 
We have: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type family Element o :: Type class MonoFunctor o where omap :: (Element o -> Element o) -> o -> o class MonoFoldable o where ofoldMap :: Monoid m => (Element o -> m) -> o -> m class (MonoFunctor o, MonoFoldable o) => MonoTraversable o where otraverse :: (Applicative f) => (Element o -> f (Element o)) -> o -> f o </code></pre></div></div> If we wanted to put <code class="language-plaintext highlighter-rouge">Element</code> as an associated type, like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class MonoFunctor o where type Element o :: Type omap :: (Element o -> Element o) -> o -> o </code></pre></div></div> Then we’d be a bit stuck - we need to require <code class="language-plaintext highlighter-rouge">class (MonoFunctor o) => MonoFoldable o</code>. This is a stronger requirement than regular Foldable, and that’s a problem. Or we define a separate type family on <code class="language-plaintext highlighter-rouge">MonoFoldable</code>. But then - what do we use with <code class="language-plaintext highlighter-rouge">MonoTraversable</code>? If we’re desperate to avoid open type families, then we can work around this by having a dummy class that only carries the type family. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class HasElement o where type Element o :: Type class (HasElement o) => MonoFunctor o where ... class (HasElement o) => MonoFoldable o where ... class (MonoFunctor o, MonoFoldable o) => MonoTraversable o where ... </code></pre></div></div> But “a dummy class that only has a type family” seems obviously worse to me than “a type family” :shrug: The question: “Should I use an open type family or a closed type family?” has an analog to simpler language features: type classes and sum types. If you want a closed set, you use a sum type. 
If you want an open set, you use a type class. So if you’re familiar with the trade-offs there, the trade-offs with open/closed type families are easier to evaluate. A closed set means you can be exhaustive - “every case is handled.” If you’re pattern matching on a datakind, like <code class="language-plaintext highlighter-rouge">type family Foo (x :: Bool)</code>, then you know that by handling <code class="language-plaintext highlighter-rouge">Foo 'True</code> and <code class="language-plaintext highlighter-rouge">Foo 'False</code>, you’ve handled all cases. You don’t have to worry that some user is going to add a case and blow things up. (Well, you kinda do, because even closed type families aren’t actually closed, due to stuckness semantics, but, uhhhhh, that’s a bit of very advanced trickery, talk to csongor if you’re curious.) An open set is a way of allowing easy extensibility. So you’re going to accept something of kind Type or possibly a polymorphic kind variable to allow people to define their own types, and their own instances of these types. For example, if I want to associate a type with a string, I can write: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type family TypeString (sym :: Symbol) :: Type

type instance TypeString "Int" = Int
type instance TypeString "Char" = Char
</code></pre></div></div> And that lets me run programs at the type level, that end users can extend. Much like you can write a type <code class="language-plaintext highlighter-rouge">class</code> and end users can extend your functionality. But ultimately you need to do something at the value level. Which means you need to take some type information and translate it to the value level. 
This is precisely what type classes do - they are morally “a function from a type to a value.” We can write a super basic function, like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>typeStringProxy :: Proxy sym -> Proxy (TypeString sym)
typeStringProxy _ = Proxy
</code></pre></div></div> But this is still not useful without further classes. The Default class assigns a special value to any type, and we could do something like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>typeStringDefault :: forall sm. Default (TypeString sm) => Proxy sm -> TypeString sm
typeStringDefault _ = def @(TypeString sm)
</code></pre></div></div> Since associating a type class and an open type family is so common, it’s almost always better to use an associated type unless you know that the type family is going to be shared across multiple type classes. “So how do you associate a closed type family with values?” That’s a great question. We can do the same trick with Proxy functions: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type family Closed (a :: Bool) where
    Closed 'True = Int
    Closed 'False = Char

closed :: Proxy b -> Proxy (Closed b)
closed _ = Proxy
</code></pre></div></div> But, until we know what <code class="language-plaintext highlighter-rouge">b</code> is, we can’t figure out what <code class="language-plaintext highlighter-rouge">Closed b</code> is. To pattern match on a type, we need a type class. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class RunClosed (b :: Bool) where
    runClosed :: Proxy b -> Closed b

instance RunClosed 'True where
    runClosed _ = 3

instance RunClosed 'False where
    runClosed _ = 'a'
</code></pre></div></div> I’m going to detour a bit here and mention data families. 
A data family is like a type family, but instead of allowing you to refer to any type, you have to specify the constructors inline. To take an example from persistent, <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data family Key a

newtype instance Key User = UserKey { unUserKey :: UUID }
type UserId = Key User

newtype instance Key Organization = OrganizationKey { unOrganizationKey :: UUID }
type OrganizationId = Key Organization
</code></pre></div></div> An advantage of this is that, since you specify the constructors, you can know the type of <code class="language-plaintext highlighter-rouge">Key a</code> by knowing the constructor in use - the value specifies the type. <code class="language-plaintext highlighter-rouge">OrganizationKey :: UUID -> Key Organization</code>. It looks a lot like an “open type family,” and in fact is completely analogous. But we don’t call them “open data families,” even though that’s an appropriate name for them. It should make you wonder - is there such a thing as a closed data family? Aaaaaand - the answer is “yes”, but we call them GADTs instead. The nice thing about an “open data family” is that you can learn about types by inspecting values - by knowing a value (like <code class="language-plaintext highlighter-rouge">OrganizationKey uuid</code>), I can work ‘backwards’ and learn that I have a <code class="language-plaintext highlighter-rouge">Key Organization</code>. But, I can’t write a <code class="language-plaintext highlighter-rouge">case</code> expression over all <code class="language-plaintext highlighter-rouge">Key a</code> - it’s open! And <code class="language-plaintext highlighter-rouge">case</code> only works on closed things. 
So this code does not work: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>whatKey :: Key a -> Maybe UUID
whatKey k = case k of
    UserKey uuid -> Just uuid
    OrganizationKey uuid -> Just uuid
    _ -> Nothing
</code></pre></div></div> Indeed, we need a type class to allow us to write <code class="language-plaintext highlighter-rouge">get :: Key a -> SqlPersistT m (Maybe a)</code>. A <code class="language-plaintext highlighter-rouge">GADT</code> - as a closed data family - allows us to work from a value to a type, and since it is exhaustive, we can write case expressions on it. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Key a where
    UserKey :: { unUserKey :: UUID } -> Key User
    OrganizationKey :: { unOrganizationKey :: UUID } -> Key Organization
</code></pre></div></div> If I have this structure, then I can actually write <code class="language-plaintext highlighter-rouge">get</code> without a type class. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>get :: Key a -> SqlPersistT IO (Maybe a)
get k = case k of
    UserKey uuid -> do
        [userName, userAge] <- rawSql "SELECT * FROM users WHERE id = ?" [toPersistValue uuid]
        pure (Just User {..})
    OrganizationKey uuid ->
        ...
</code></pre></div></div> A GADT is ‘basically’ a closed type family that gives you constructor tags for applying that type family. 
If we look at <code class="language-plaintext highlighter-rouge">Closed</code>, we can inline this: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type family ClosedTy (b :: Bool) where
    ClosedTy True = Int
    ClosedTy False = Char

data ClosedData (a :: Type) where
    ClosedData :: Proxy b -> ClosedData (ClosedTy b)

-- inlining:
data Closed (a :: Type) where
    ClosedTrue :: Proxy 'True -> Closed Int
    ClosedFalse :: Proxy 'False -> Closed Char
</code></pre></div></div> When we case on a <code class="language-plaintext highlighter-rouge">Closed</code> value, we get: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>runClosed :: Closed a -> a
runClosed closed = case closed of
    ClosedTrue (Proxy :: Proxy 'True) -> 3
    ClosedFalse (Proxy :: Proxy 'False) -> 'a'
</code></pre></div></div> I’m going to conclude now, with this distillation: <ul> <li>Open type family + type class = extensible, open programming, but no exhaustivity.</li> <li>Closed type family + GADT + functions = exhaustive handling of types, but not extensible</li> <li>An open type family + a GADT isn’t much fun.</li> <li>A closed type family + a type class isn’t much fun</li> </ul> Thu, 02 Sep 2021 00:00:00 +0000 https://www.parsonsmatt.org/2021/09/02/family_values.html Designing New I want a better way of constructing Haskell records. Let’s compare and contrast the existing ways. 
We’ll be using this datatype as an example: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Env = Env { accountId :: String , accountPassword :: String , requestHook :: Request -> IO Request , responseHook :: Response -> IO Response } </code></pre></div></div> This type is an <code class="language-plaintext highlighter-rouge">Env</code> that you might see in a <code class="language-plaintext highlighter-rouge">ReaderT Env IO</code> integration with some external service. We can attach request hooks and response hooks. <h1 id="function-arguments">Function Arguments</h1> The simplest and most boring way is to pass function arguments. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>env :: Env env = Env "asdfadf" "hunter42" pure pure </code></pre></div></div> This is undesirable for a few reasons: <ol> <li>We have no idea what those parameters mean without looking at the datatype definition.</li> <li>We have to pass arguments in a specific order.</li> <li>If the type of the <code class="language-plaintext highlighter-rouge">Env</code> changes, then this also changes.</li> <li>… but we don’t get a break if the field order is changed in a way that respects the types!</li> </ol> Consider swapping the order of <code class="language-plaintext highlighter-rouge">accountId</code> and <code class="language-plaintext highlighter-rouge">accountPassword</code> in our data definition. Now everything breaks mysteriously with no type errors. Using the function-style for constructing records is probably a bad idea. 
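To make that silent breakage concrete, here’s a minimal sketch. The <code class="language-plaintext highlighter-rouge">EnvBefore</code> and <code class="language-plaintext highlighter-rouge">EnvAfter</code> types are hypothetical, trimmed-down stand-ins for <code class="language-plaintext highlighter-rouge">Env</code> (the hooks are left out), and they differ only in the order of the two <code class="language-plaintext highlighter-rouge">String</code> fields.

```haskell
-- Hypothetical, trimmed-down stand-ins for Env; the hooks are left out.

-- Field order as originally declared: id first, password second.
data EnvBefore = EnvBefore
    { beforeAccountId :: String
    , beforeAccountPassword :: String
    }

-- The same record after the two String fields trade places.
data EnvAfter = EnvAfter
    { afterAccountPassword :: String
    , afterAccountId :: String
    }

-- Positional construction written against the original order.
envBefore :: EnvBefore
envBefore = EnvBefore "asdfadf" "hunter42"

-- The exact same argument list still type checks after the swap...
envAfter :: EnvAfter
envAfter = EnvAfter "asdfadf" "hunter42"

main :: IO ()
main = do
    putStrLn (beforeAccountId envBefore)
    -- ...but the password is now sitting in the account id field.
    putStrLn (afterAccountId envAfter)
```

Both definitions compile without a peep, which is exactly the problem: the only thing that changed is which value ends up in which field.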
<h1 id="record-labels">Record Labels</h1> The second most boring way is to use record construction syntax: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>env :: Env env = Env { accountId = "asdfasdf" , accountPassword = "hunter42" , requestHook = pure , responseHook = pure } </code></pre></div></div> This solves basically all the problems with function arguments. However, we’re still sensitive to changes in the record constructor. If we add a new field, we must account for that in all creation sites. This is annoying, especially since many new fields in records like this are designed to accommodate new functionality or customization, and most existing users want to just ignore them. <h1 id="a-default-record">A Default Record</h1> Instead of constructing a record, we’ll have end users modify an existing record. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>defaultEnv :: Env defaultEnv = Env { accountId = "" , accountPassword = "" , requestHook = pure , responseHook = pure } env :: Env env = defaultEnv { accountId = "asdfasdf" , accountPassword = "hunter42" } </code></pre></div></div> However, this is gross, for a few reasons. The first is that we provide a dummy value of <code class="language-plaintext highlighter-rouge">accountId</code> and <code class="language-plaintext highlighter-rouge">accountPassword</code>, and the end user is required to fill them in. There’s actually no way for us to give a warning or error if they fail to provide it. 
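Here’s a small sketch of that failure mode, using a trimmed-down <code class="language-plaintext highlighter-rouge">Env</code> without the hooks (my simplification for illustration, not the real type). Forgetting the password compiles cleanly, and the empty placeholder only shows up at runtime.

```haskell
-- A trimmed-down Env for illustration; the hook fields are omitted.
data Env = Env
    { accountId :: String
    , accountPassword :: String
    }

-- Dummy values stand in for the fields the caller is supposed to set.
defaultEnv :: Env
defaultEnv = Env
    { accountId = ""
    , accountPassword = ""
    }

-- Oops: accountPassword was never overridden. GHC accepts this without
-- complaint, because a record update only mentions the fields it changes.
forgetfulEnv :: Env
forgetfulEnv = defaultEnv { accountId = "asdfasdf" }

main :: IO ()
main =
    -- The placeholder survives all the way to runtime.
    print (null (accountPassword forgetfulEnv))
```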
The standard solution is to accept function arguments, but this has a nasty problem: record syntax binds tighter than anything else, even function application, so we need to do this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>defaultEnv :: String -> String -> Env defaultEnv a p = Env a p pure pure -- brevity, forgive me env :: Env env = (defaultEnv "asdfasdf" "hunter42") { requestHook = \req -> do logRequest req pure req } </code></pre></div></div> That’s right - we gotta put parens around our constructor. We can’t use <code class="language-plaintext highlighter-rouge">$</code> here, either, because the syntax explicitly requires a <code class="language-plaintext highlighter-rouge">value { field0 = val0, ... fieldN = valN }</code> form. Also now we’re back at the same problem with <code class="language-plaintext highlighter-rouge">defaultEnv</code> - we can mismatch our function arguments. <h1 id="an-args-record">An Args Record</h1> The pattern I chose for <a href="https://hackage.haskell.org/package/persistent-2.13.1.1/docs/Database-Persist-SqlBackend.html#v:mkSqlBackend"><code class="language-plaintext highlighter-rouge">SqlBackend</code></a> in <code class="language-plaintext highlighter-rouge">persistent</code> is to have an <code class="language-plaintext highlighter-rouge">*Args</code> record. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# language DuplicateRecordFields #-} {-# language RecordWildCards #-} data EnvArgs = EnvArgs { accountId :: String , accountPassword :: String } mkEnv :: EnvArgs -> Env mkEnv EnvArgs {..} = Env { requestHook = pure , responseHook = pure , .. 
} env :: Env env = mkEnv EnvArgs { accountId = "asdfasdf" , accountPassword = "hunter42" } </code></pre></div></div> This solves all of the above problems, but it’s a bit unsatisfying - we can’t also modify the <code class="language-plaintext highlighter-rouge">requestHook</code> and <code class="language-plaintext highlighter-rouge">responseHook</code> parameters directly in <code class="language-plaintext highlighter-rouge">mkEnv</code>, we have to do it outside. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fullEnv :: Env fullEnv = (mkEnv EnvArgs {..}) { requestHook = \req -> do log req pure req } </code></pre></div></div> Hmm, slightly annoying syntax, again. But, hey, whatever, it works. <h1 id="codependent-records">Codependent Records</h1> No, I’m not talking about some fancy type theory. Record syntax is essentially codependent on the value it is modifying, or the constructor it is using. We can’t pass in a ‘record’ of stuff and use it in ways that are clever or useful. Let’s talk about the “whitespace operator.” We can imagine defining it like this, for regular functions: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>( ) :: (a -> b) -> a -> b f a = f a </code></pre></div></div> OK, it’s special built in syntax, the definition doesn’t make any sense. But let’s try and write it for records now. Remember we need to support update and creation. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>( ) :: (AllowableRecord con rec result) => con -> rec -> result con rec = implementRecord con rec class AllowableRecord con rec result where implementRecord :: con -> rec -> result </code></pre></div></div> Now <code class="language-plaintext highlighter-rouge">rec</code> is something that can stand alone - it is freed from the codependent relationship with the values and constructors it serves. 
What is that something, though? It could be a row type, like PureScript. That’d be awesome. Well now I’ve just worked myself up into a Mood about GHC’s record syntax. Even with <code class="language-plaintext highlighter-rouge">OverloadedRecordDot</code>, Haskell’s records are still bad, they’re just not awful. <h1 id="ignore-records-use-functions">Ignore Records, Use Functions</h1> This approach eschews records entirely for updates and uses <code class="language-plaintext highlighter-rouge">set*</code> functions. It makes for a pretty clean interface. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>env :: Env env = addRequestHook (\req -> log req >> pure req) $ mkEnv EnvArgs { accountId = "asdfasdf" , accountPassword = "hunter42" } addRequestHook :: (Request -> IO Request) -> Env -> Env addRequestHook newHook env = env { requestHook = \req -> do requestHook env req newHook req } </code></pre></div></div> This is pretty tedious as a library author to write, but it gives you a better interface. It would be nice if we could use this for construction, too. But this is a challenge because the type would change with each new addition to the record. The <code class="language-plaintext highlighter-rouge">{ ... }</code> record syntax can know ahead of time how many fields there are, and GHC can issue warnings (or errors) if any are missing. <h1 id="type-changing-updates">Type Changing Updates</h1> We can use a type parameter for each field that is required to be set. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data EnvP a b = EnvP { accountId :: a , accountPassword :: b , requestHook :: Request -> IO Request , responseHook :: Response -> IO Response } type Env = EnvP String String data Void defaultEnv :: EnvP Void Void defaultEnv = EnvP { requestHook = pure , responseHook = pure } </code></pre></div></div> GHC will issue warnings here, but that’s okay - we know they’re undefined at the type level. Now we can write our <code class="language-plaintext highlighter-rouge">set</code> functions: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>setAccountId :: String -> EnvP a b -> EnvP String b setAccountId str env = env { accountId = str } setAccountPassword :: String -> EnvP a b -> EnvP a String setAccountPassword str env = env { accountPassword = str } env :: Env env = setAccountId "asdfasdf" $ setAccountPassword "hunter42" $ defaultEnv </code></pre></div></div> And, well, this actually works out. If we only expose the <code class="language-plaintext highlighter-rouge">Env</code> type (and maybe a pattern synonym for construction/deconstruction), this interface should be pretty safe and straightforward. A final <code class="language-plaintext highlighter-rouge">mkEnv</code> call could even put it behind a <code class="language-plaintext highlighter-rouge">newtype</code> wrapper, or a similar datatype, similar to the <code class="language-plaintext highlighter-rouge">*Args</code> pattern above. The boilerplate sucks, but would be easy to <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> away. Can <code class="language-plaintext highlighter-rouge">OverloadedRecordDot</code> help us here? With some of the tricks in <a href="https://www.parsonsmatt.org/2021/07/29/stealing_impl_from_rust.html">Stealing <code class="language-plaintext highlighter-rouge">impl</code> From Rust</a>, sort of. 
We can write simple setters: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data User = User { name :: String } instance HasField "setName" User (String -> User) where getField self newName = self { name = newName } </code></pre></div></div> And, using the One Weird Trick to defeat functional dependencies, we can write type-changing setters, too! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance HasField "setAccountId" (EnvP a b) (x -> EnvP x b) => HasField "setAccountId" (EnvP a b) (x -> EnvP x b) where getField self x = self { accountId = x } </code></pre></div></div> Now, to provide a good UX, we’d want to require this be <code class="language-plaintext highlighter-rouge">String</code>, possibly with a nice <code class="language-plaintext highlighter-rouge">TypeError</code> constraint that complains. But this’ll work for now - we can totally write this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>env :: EnvP String Void env = defaultEnv.setAccountId "asdfasdf" </code></pre></div></div> Unfortunately, chaining this isn’t really feasible. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>env :: EnvP String String env = defaultEnv.setAccountId "asdfasdf".setAccountPassword "hunter42" </code></pre></div></div> This fails with an error, as <code class="language-plaintext highlighter-rouge">.setAccountPassword</code> is attaching to <code class="language-plaintext highlighter-rouge">"asdfasdf"</code>, not the return of <code class="language-plaintext highlighter-rouge">defaultEnv.setAccountId "asdfasdf"</code>. 
So we can work around this with parens: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>env :: EnvP String String env = (defaultEnv.setAccountId "asdfasdf").setAccountPassword "hunter42" </code></pre></div></div> This gets annoying, especially as the chaining goes up. Assigning to intermediate values also works: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>env :: EnvP String String env = let withId = defaultEnv.setAccountId "asdfasdf" withPassword = withId.setAccountPassword "hunter42" in withPassword </code></pre></div></div> But, at this point, I’m wondering how this is any better than just writing <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>env :: EnvP String String env = setAccountId "asdfadsf" $ setAccountPassword "hunter42" defaultEnv </code></pre></div></div> Unfortunately, the type errors can get a bit weird and annoying carrying around the <code class="language-plaintext highlighter-rouge">EnvP</code> value. Wrapping it in a <code class="language-plaintext highlighter-rouge">newtype</code> or translating to a separate data structure can make errors better. It also distinguishes the “create this record” and “use this record” scenarios. <h1 id="back-to-args">Back to Args</h1> And, yeah, ultimately, I think <code class="language-plaintext highlighter-rouge">Args</code> is probably the right way to go. There’s not really much to a library for it. You’d define a class like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class New a where type Args a = r | r -> a new :: Args a -> a </code></pre></div></div> You want the <code class="language-plaintext highlighter-rouge">TypeFamilyDependencies</code> annotation on <code class="language-plaintext highlighter-rouge">Args</code> because you want the argument type to inform the result type. 
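As a rough sketch of how an instance might look, here’s <code class="language-plaintext highlighter-rouge">New</code> wired up to the <code class="language-plaintext highlighter-rouge">*Args</code> pattern. The trimmed-down <code class="language-plaintext highlighter-rouge">Env</code>, the <code class="language-plaintext highlighter-rouge">verbose</code> field, and the <code class="language-plaintext highlighter-rouge">args*</code> field names are all made up for illustration.

```haskell
{-# LANGUAGE TypeFamilyDependencies #-}

class New a where
    -- The injectivity annotation says the argument type determines the result.
    type Args a = r | r -> a
    new :: Args a -> a

-- Hypothetical, trimmed-down Env: one optional setting instead of the hooks.
data Env = Env
    { accountId :: String
    , accountPassword :: String
    , verbose :: Bool
    }

-- Only the required fields live in the args record.
data EnvArgs = EnvArgs
    { argsAccountId :: String
    , argsAccountPassword :: String
    }

instance New Env where
    type Args Env = EnvArgs
    new args = Env
        { accountId = argsAccountId args
        , accountPassword = argsAccountPassword args
        , verbose = False -- default for the optional field
        }

env :: Env
env = new EnvArgs
    { argsAccountId = "asdfasdf"
    , argsAccountPassword = "hunter42"
    }

main :: IO ()
main = putStrLn (accountId env)
```

With this shape, adding another optional field later only touches <code class="language-plaintext highlighter-rouge">Env</code> and the instance; call sites that build an <code class="language-plaintext highlighter-rouge">EnvArgs</code> stay as they are.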
A data family would also work, but it would not allow you to define it separately and document it with a separate type name. Maybe a problem, maybe not. It may also be nice to vary the return type, allowing <code class="language-plaintext highlighter-rouge">IO</code>, for example. That looks like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class New a where type Args a = r | r -> a type Return a = r | r -> a type Return a = a new :: Args a -> Return a </code></pre></div></div> But now we’ve just, got, like, this type class, where it takes a thing, and returns another thing (maybe in IO, maybe not?? who knows). And this is so general and lawless that making a library for it seems a bit silly. So, instead of writing a library, I wrote a blog post. Tue, 24 Aug 2021 00:00:00 +0000 https://www.parsonsmatt.org/2021/08/24/designing_new.html https://www.parsonsmatt.org/2021/08/24/designing_new.html Stealing Impl from Rust With the new <code class="language-plaintext highlighter-rouge">OverloadedRecordDot</code> language extension, we can use the <code class="language-plaintext highlighter-rouge">.</code> character to access stuff on records. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# language OverloadedRecordDot #-} data User = User { name :: String } main :: IO () main = do let user = User { name = "Matt" } putStrLn user.name </code></pre></div></div> This is syntax sugar for the following code: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import GHC.Records data User = User { name :: String } instance HasField "name" User String where getField (User n) = n main :: IO () main = do let user = User { name = "Matt" } putStrLn (getField @"name" user) </code></pre></div></div> As it happens, we can add fields to a record. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# language OverloadedRecordDot #-} data User = User { name :: String } instance HasField "age" User Int where getField user = 32 main :: IO () main = do let user = User { name = "Matt" } print user.age </code></pre></div></div> This works, though it’s a bit boring. It’s much more useful to have, say, virtual fields. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data User = User { firstName :: String , lastName :: String } instance HasField "name" User String where getField user = unwords [user.firstName, user.lastName] </code></pre></div></div> This gives us a “virtual field,” which can allow us to refactor code that depends on the record field in neat ways! <h1 id="methods">Methods</h1> So, those types, they don’t have to be ordinary values. They can be methods. Or, y’know, functions, whatever, it’s all the same in Haskell. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance HasField "greet" User (String -> IO ()) where getField self message = do putStrLn $ concat [message, ", ", self.name, "!"] main :: IO () main = do let user = User { firstName = "Matt", lastName = "Parsons" } user.greet "UhhhHHH Excuse me WTF" </code></pre></div></div> This prints out <code class="language-plaintext highlighter-rouge">UhhhHHH Excuse me WTF, Matt Parsons!</code>. Which is pretty cool. <h1 id="impl"><code class="language-plaintext highlighter-rouge">impl</code></h1> Rust has a <a href="https://doc.rust-lang.org/std/keyword.impl.html">keyword <code class="language-plaintext highlighter-rouge">impl</code></a>, which is used in two ways: <ol> <li>Adding methods to a type.</li> <li>Adding a trait to a type.</li> </ol> The linked docs tell the whole story, just about. 
I don’t know about you but I want nicer syntax than all the <code class="language-plaintext highlighter-rouge">instance HasField</code> stuff. I wrote a little library that should do this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data User = User { name :: String } impl ''User [d| greet :: String -> IO () greet message = do putStrLn $ concat [message, ", ", self.name] |] </code></pre></div></div> It’s relatively straightforward. In pseudocode, it’s implemented like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>impl :: Name -> Q [Dec] -> Q [Dec] impl tyName qds = do decs <- qds let namesTypesExprs :: [(String, Type, Exp)] namesTypesExprs = getTypesAndExprs decs instances <- for namesTypesExprs $ \(name, typ, exp) -> do [d| instance HasField $(name) $(tyName) $(typ) where getField self = $(exp) |] pure (concat instances) </code></pre></div></div> Unfortunately, I ran into a bit of a <a href="https://gitlab.haskell.org/ghc/ghc/-/issues/20185">blocking issue</a>, namely that GHC does not support <code class="language-plaintext highlighter-rouge">OverloadedRecordDot</code> in <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> <code class="language-plaintext highlighter-rouge">QuasiQuotes</code> yet. While I can work around it, I’d rather not bother until <code class="language-plaintext highlighter-rouge">OverloadedRecordDot</code> is fully supported by GHC. <h1 id="the-dealbreaker">The Dealbreaker</h1> There’s no polymorphism allowed. Like, at all. You can’t write: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance (Show a) => HasField "myPrint" User (a -> IO ()) where getField self a = do putStrLn (show a) </code></pre></div></div> This fails the functional dependencies. 
You can’t write methods generic in <code class="language-plaintext highlighter-rouge">MonadIO m => HasField User (String -> m ())</code> either. The functional dependencies seem pretty reasonable: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class HasField sym r a | sym r -> a where getField :: r -> a </code></pre></div></div> This means that the types <code class="language-plaintext highlighter-rouge">sym</code> and <code class="language-plaintext highlighter-rouge">r</code> uniquely determine the <code class="language-plaintext highlighter-rouge">a</code> type - or, that if you know what <code class="language-plaintext highlighter-rouge">sym</code> and <code class="language-plaintext highlighter-rouge">r</code> are, then you always know exactly what <code class="language-plaintext highlighter-rouge">a</code> is. Since users of our instance are able to select things like <code class="language-plaintext highlighter-rouge">IO</code>, <code class="language-plaintext highlighter-rouge">ReaderT () IO</code>, and <code class="language-plaintext highlighter-rouge">StateT Int IO</code> for this, you can’t uniquely determine the result type just based on <code class="language-plaintext highlighter-rouge">sym</code> and <code class="language-plaintext highlighter-rouge">r</code>. Seems like <code class="language-plaintext highlighter-rouge">ImpredicativeTypes</code> should work here, but they apparently don’t. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance HasField "myPrint" User (forall a. Show a => a -> IO ()) where ... </code></pre></div></div> This fails with a syntax error, due to the way <code class="language-plaintext highlighter-rouge">OverloadedRecordDot</code> affects GHC’s parser. <blockquote> <a href="https://gitlab.haskell.org/ghc/ghc/-/issues/20186">Bugs for the bug god!!</a> </blockquote> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance HasField "myPrint" User (forall a . 
Show a => a -> IO ()) where ... </code></pre></div></div> This fails because it is an illegal polymorphic type. <h1 id="illegal-polymorphism">Illegal Polymorphism</h1> But wait - this is an impredicative type. <code class="language-plaintext highlighter-rouge">ImpredicativeTypes</code> was a deprecated language extension, but I recall hearing that we landed support for them with <a href="https://github.com/ghc-proposals/ghc-proposals/pull/274">Quick Look Impredicativity</a>. And, in GHC 9, <a href="https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/impredicative_types.html#impredicative-polymorphism">we have a proper <code class="language-plaintext highlighter-rouge">ImpredicativeTypes</code> behavior</a>! We definitely are paying a big cost for it - the <a href="https://gitlab.haskell.org/ghc/ghc/-/wikis/migration/9.0#simplified-subsumption">Simplify Subsumption</a> proposal gives no practical benefit to programmers except that it gives additional power to Quick Look Impredicativity. Unfortunately, enabling <code class="language-plaintext highlighter-rouge">ImpredicativeTypes</code> doesn’t make this work - GHC still deems the above an illegal polymorphic type. It turns out, you can’t <a href="https://gitlab.haskell.org/ghc/ghc/-/issues/20188">put an impredicative type in an instance at all</a>. Oh well. <h1 id="breaking-news">BREAKING NEWS</h1> Okay, so I posted this, and was immediately offered a Prime Tip by Sandy Maguire. Apparently <a href="https://www.youtube.com/watch?v=ZXtdd8e7CQQ">Richard Eisenberg</a> has published a video stating how to defeat this. The answer is to demand the constraint in the context. 
So we can write <code class="language-plaintext highlighter-rouge">myPrint</code> like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance ( Show a , HasField "myPrint" User (a -> IO ()) ) => HasField "myPrint" User (a -> IO ()) where getField self a = do putStrLn $ concat [self.name, " says: ", show a] </code></pre></div></div> That lets us write code like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>go :: IO () go = do let user = User { name = "Matt" } user.myPrint 'a' user.myPrint 3 </code></pre></div></div> Which evaluates like this: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$> go Matt says: 'a' Matt says: 3 </code></pre></div></div> Nice! Deal unbreaker. <h1 id="yet-another-update">Yet Another Update</h1> Reddit user <code class="language-plaintext highlighter-rouge">/u/WhisterPayer</code> implemented <a href="https://github.com/ElderEphemera/instance-impl">this plugin</a> to accomplish this! Awesome. Thu, 29 Jul 2021 00:00:00 +0000 https://www.parsonsmatt.org/2021/07/29/stealing_impl_from_rust.html https://www.parsonsmatt.org/2021/07/29/stealing_impl_from_rust.html Hspec Hooks The <code class="language-plaintext highlighter-rouge">hspec</code> testing library includes many useful facilities for writing tests, including a powerful “hooks” capability. These hooks allow you to provide data and capabilities to your tests. 
<h1 id="specwith"><code class="language-plaintext highlighter-rouge">SpecWith</code></h1> The typical <code class="language-plaintext highlighter-rouge">hspec</code> test suite looks like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main :: IO () main = hspec specs specs :: Spec specs = do describe "math" $ do it "1 + 1" $ do 1 + 1 `shouldBe` 2 it "3 * 2" $ do 3 * 2 `shouldBe` 6 describe "words" $ do it "breaks stuff up" $ do words "asdf asdf asdf" `shouldBe` ["asdf", "asdf", "asdf"] </code></pre></div></div> Everything is a <code class="language-plaintext highlighter-rouge">Spec</code>, and it’s all nice and cute. Suddenly, you want to provide a database connection to each item in a spec. You can do this using a plain ol’ function argument, and this works alright. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main :: IO () main = do db <- createDatabase hspec (specs db) specs :: DB -> Spec specs db = do describe "SELECT" $ do it "works" $ do result <- runDb db "SELECT...." result `shouldBe` [1,2,3] </code></pre></div></div> But - what if we want to have a fresh database connection made, for each test? Well, then it’s a bit more awkward. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>specs :: Spec specs = do describe "SELECT" $ do it "works" $ do db <- createDatabase result <- runDb db "SELECT...." result `shouldBe` [1,2,3] it "does other stuff" $ do db <- createDatabase result <- runDb db "OTHER STUFF..." result `shouldBe` [4,3,2] </code></pre></div></div> That’s not much fun! <code class="language-plaintext highlighter-rouge">hspec</code> gives us a function <code class="language-plaintext highlighter-rouge">before</code> that can be used to provide a fresh value for each item in a spec. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>specs :: Spec specs = before createDatabase $ do describe "SELECT" $ do it "works" $ \db -> do result <- runDb db "SELECT...." result `shouldBe` [1,2,3] it "does other stuff" $ \db -> do result <- runDb db "OTHER STUFF..." result `shouldBe` [4,3,2] </code></pre></div></div> Let’s look at the type of <code class="language-plaintext highlighter-rouge">before</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>before :: IO a -> SpecWith a -> Spec </code></pre></div></div> This raises some questions. What is a <code class="language-plaintext highlighter-rouge">SpecWith</code>? All of the <code class="language-plaintext highlighter-rouge">describe</code> stuff functions in <a href="https://hackage.haskell.org/package/hspec-core-2.8.2/docs/Test-Hspec-Core-Spec.html#g:2">a <code class="language-plaintext highlighter-rouge">SpecM</code> monad</a>, which constructs the <code class="language-plaintext highlighter-rouge">Spec</code> tree and allows for filtering, focusing, and mapping of spec items. That link shows that <code class="language-plaintext highlighter-rouge">type Spec = SpecWith ()</code>. Expanding our type, we get this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>before :: IO a -> SpecWith a -> SpecWith () </code></pre></div></div> A <code class="language-plaintext highlighter-rouge">SpecWith</code> is a test that expects some additional context. <h1 id="more-before">More <code class="language-plaintext highlighter-rouge">before</code></h1> So, <code class="language-plaintext highlighter-rouge">before</code> is used to provide a fresh thing to every test item. What if you want to create a single thing and have it be shared among every spec item? We can use <code class="language-plaintext highlighter-rouge">beforeAll</code> to accomplish that. 
It has the same signature. The only difference is that the creation action is run once, and then shared among every test. What if you don’t want to pass anything to the tests, but you want to run some action? <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>before_ :: IO () -> SpecWith a -> SpecWith a </code></pre></div></div> You may be wondering: “What if I want to run an action once before all the items in the test go, but don’t provide a value?” <code class="language-plaintext highlighter-rouge">hspec</code> has you covered - <code class="language-plaintext highlighter-rouge">beforeAll_</code> works exactly like that. There’s one more tricky thing here - <code class="language-plaintext highlighter-rouge">beforeWith</code>. Note that in <code class="language-plaintext highlighter-rouge">before</code>, the result is a <code class="language-plaintext highlighter-rouge">Spec</code> - a test without extra context. How do we call <code class="language-plaintext highlighter-rouge">before</code> on something that has already had <code class="language-plaintext highlighter-rouge">before</code> called on it? <code class="language-plaintext highlighter-rouge">beforeWith</code> comes to the rescue. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>beforeWith :: (b -> IO a) -> SpecWith a -> SpecWith b </code></pre></div></div> If you understand <code class="language-plaintext highlighter-rouge">Contravariant</code> functors, then that intuition will carry you a decent way. If you don’t, that’s cool - let’s dig into it. Let’s say we have some group of tests that want to run a set of migrations against the database, and also provide some information along with the database connection. We’ll insert a fake <code class="language-plaintext highlighter-rouge">User</code> and make the <code class="language-plaintext highlighter-rouge">Id</code> available to the resulting tests. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spec :: Spec spec = before createDatabase $ do describe "SELECT" $ do it "has a database" $ \db -> do ... beforeWith createUser $ describe "With User" $ do it "has a db and a user" $ \(db, userId) -> do ... createUser :: DB -> IO (DB, UserId) createUser db = do userId <- runDb db $ insert User { name = "asdf" } pure (db, userId) </code></pre></div></div> Now, <code class="language-plaintext highlighter-rouge">beforeWith</code> means that the <code class="language-plaintext highlighter-rouge">createUser</code> action gets run before each spec item. So each test item in the “With User” group will have a different <code class="language-plaintext highlighter-rouge">User</code> created for it. Naturally, there’s <code class="language-plaintext highlighter-rouge">beforeAllWith</code>, which would only be run once, and would provide the same <code class="language-plaintext highlighter-rouge">UserId</code> to each test item. You may wonder: “Is there a <code class="language-plaintext highlighter-rouge">beforeAllWith_</code>? Or even just <code class="language-plaintext highlighter-rouge">beforeWith_</code>?” There is not, and the reason is that they’re redundant. Note how <code class="language-plaintext highlighter-rouge">before_</code> and <code class="language-plaintext highlighter-rouge">beforeAll_</code> don’t affect the context of the specs. <pre><code class="language-Haskell">before_ :: IO () -> SpecWith a -> SpecWith a beforeAll_ :: IO () -> SpecWith a -> SpecWith a </code></pre> If we want to not affect the context of the spec, then we can just return it directly. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>beforeWith_ :: (a -> IO ()) -> SpecWith a -> SpecWith a beforeWith_ action = beforeWith $ \a -> do action a pure a </code></pre></div></div> <h1 id="after"><code class="language-plaintext highlighter-rouge">after</code></h1> The <code class="language-plaintext highlighter-rouge">before</code> family of functions is useful for providing data and preparing the state of the world for a test. <code class="language-plaintext highlighter-rouge">after</code> is useful for tearing it down, or cleaning up after a test. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>after :: ActionWith a -> SpecWith a -> SpecWith a </code></pre></div></div> <code class="language-plaintext highlighter-rouge">ActionWith</code> is a type synonym, so let’s review the definition and inline it here: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type ActionWith a = a -> IO () after :: (a -> IO ()) -> SpecWith a -> SpecWith a </code></pre></div></div> (I often find that inlining type synonyms helps with <code class="language-plaintext highlighter-rouge">hspec</code> when reading and understanding it) Let’s write a function that deletes the <code class="language-plaintext highlighter-rouge">User</code> out of the database after each of the spec items. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spec :: Spec spec = before createDatabase $ do describe "SELECT" ... beforeWith createUser $ after deleteUser $ describe "With User" $ do it "has a db and a user" $ \(db, userId) -> do ... 
createUser :: DB -> IO (DB, UserId) createUser db = do userId <- runDb db $ insert User { name = "asdf" } pure (db, userId) deleteUser :: (DB, UserId) -> IO () deleteUser (db, userId) = do runDb db $ delete userId pure () </code></pre></div></div> Now, we aren’t polluting our database with all those <code class="language-plaintext highlighter-rouge">User</code> rows. <code class="language-plaintext highlighter-rouge">afterAll</code> does what you expect, if you know how <code class="language-plaintext highlighter-rouge">beforeAll</code> works. The action is run exactly once, after all spec items have been run. If we replace <code class="language-plaintext highlighter-rouge">after</code> with <code class="language-plaintext highlighter-rouge">afterAll</code> in the above code, we’ll get some slightly weird results. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>beforeWith createUser $ afterAll deleteUser $ describe "With User" $ do it "has a thing" ... it "likes cats" ... it "also likes dogs" ... </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">beforeWith</code> is called each time - so we create a fresh <code class="language-plaintext highlighter-rouge">User</code> for each database. <code class="language-plaintext highlighter-rouge">afterAll</code> gets called on the last spec item - so we keep the first two <code class="language-plaintext highlighter-rouge">User</code> rows in the database. <code class="language-plaintext highlighter-rouge">after_</code> and <code class="language-plaintext highlighter-rouge">afterAll_</code> ignore the <code class="language-plaintext highlighter-rouge">a</code> from <code class="language-plaintext highlighter-rouge">SpecWith a</code>. 
Instead of being an <code class="language-plaintext highlighter-rouge">ActionWith a</code> or an <code class="language-plaintext highlighter-rouge">(a -> IO ())</code> as the first parameter, it’s merely the <code class="language-plaintext highlighter-rouge">IO ()</code> action. <h1 id="around"><code class="language-plaintext highlighter-rouge">around</code></h1> <code class="language-plaintext highlighter-rouge">around</code> is pretty tricky. It encapsulates the pattern above - create something for each test, then tear it down afterwards. Most uses of <code class="language-plaintext highlighter-rouge">before create $ after destroy $ ...</code> can be refactored to use <code class="language-plaintext highlighter-rouge">around</code> and enjoy greater exception safety. Let’s start off with <code class="language-plaintext highlighter-rouge">around_</code>. It doesn’t worry about the extra context, which makes it easier to understand. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>around_ :: (IO () -> IO ()) -> SpecWith a -> SpecWith a </code></pre></div></div> Our first argument is a function, which accepts an <code class="language-plaintext highlighter-rouge">IO ()</code> action and returns another one. The <code class="language-plaintext highlighter-rouge">IO ()</code> can be named <code class="language-plaintext highlighter-rouge">runTest</code>, and it becomes clear how it works: <pre><code class="language-Haskell">spec :: Spec spec = around_ (\runTest -> do putStrLn "beginning" runTest putStrLn "ending" ) $ describe "My tests" $ </code></pre> So, our <code class="language-plaintext highlighter-rouge">IO ()</code> parameter is our test, and we can do whatever we want around it. Let’s get back to <code class="language-plaintext highlighter-rouge">around</code>. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>around :: (ActionWith a -> IO ()) -> SpecWith a -> Spec around :: ((a -> IO ()) -> IO ()) -> SpecWith a -> SpecWith () </code></pre></div></div> It’s really similar, but our <code class="language-plaintext highlighter-rouge">runTest</code> is now a function from <code class="language-plaintext highlighter-rouge">a</code> to the <code class="language-plaintext highlighter-rouge">IO ()</code>. Let’s write our user creation/deletion helper with <code class="language-plaintext highlighter-rouge">around</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spec :: Spec spec = around (\runTest -> do db <- createDatabase (_, userId) <- createUser db runTest (db, userId) deleteUser (db, userId) ) $ describe "With User" $ do it "has a user" $ \(db, userId) -> ... it "ok ya i get it" $ \(db, userId) -> ... </code></pre></div></div> One thing that’s neat is that we can use <code class="language-plaintext highlighter-rouge">bracket</code> style to safely close out resources, too. Instead of creating a database connection, let’s use the <code class="language-plaintext highlighter-rouge">withDatabase</code> sort of API. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spec :: Spec spec = around (\runTest -> do withDatabase $ \db -> do (_, userId) <- createUser db runTest (db, userId) deleteUser (db, userId) ) $ describe "With User" $ do it "has a user" $ \(db, userId) -> ... it "ok ya i get it" $ \(db, userId) -> ... </code></pre></div></div> Now, if an exception is thrown in the test or in the <code class="language-plaintext highlighter-rouge">around</code> action, the <code class="language-plaintext highlighter-rouge">withDatabase</code> function gets a chance to clean up the database connection. Resource safety FTW! 
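The <code class="language-plaintext highlighter-rouge">bracket</code>-style use of <code class="language-plaintext highlighter-rouge">around</code> can be tried out without a real database. The following is only a sketch: <code class="language-plaintext highlighter-rouge">DB</code> and <code class="language-plaintext highlighter-rouge">withDatabase</code> here are hypothetical stand-ins (an <code class="language-plaintext highlighter-rouge">IORef</code> holding a list of rows), not part of <code class="language-plaintext highlighter-rouge">hspec</code> or any database library. It assumes the <code class="language-plaintext highlighter-rouge">hspec</code> package is available.

```haskell
module Main where

import Control.Exception (bracket)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)
import Test.Hspec (Spec, around, describe, hspec, it, shouldBe)

-- Hypothetical stand-in for a database handle: a mutable list of rows.
newtype DB = DB (IORef [String])

-- Hypothetical 'withDatabase': acquire a handle, run the callback, and
-- run the cleanup afterwards even if the callback throws.
withDatabase :: (DB -> IO a) -> IO a
withDatabase =
  bracket
    (DB <$> newIORef [])              -- acquire a fresh, empty "database"
    (\(DB ref) -> writeIORef ref [])  -- release: clear it out

spec :: Spec
spec =
  -- 'withDatabase' already has the shape 'around' wants:
  -- (DB -> IO ()) -> IO ()
  around withDatabase $
    describe "With Database" $ do
      it "starts empty" $ \(DB ref) -> do
        rows <- readIORef ref
        rows `shouldBe` []
      it "gets a fresh handle for each item" $ \(DB ref) -> do
        modifyIORef' ref ("row" :)
        rows <- readIORef ref
        rows `shouldBe` ["row"]

main :: IO ()
main = hspec spec
```

Any function of the shape <code class="language-plaintext highlighter-rouge">(a -> IO ()) -> IO ()</code> can be handed to <code class="language-plaintext highlighter-rouge">around</code> directly, which is why <code class="language-plaintext highlighter-rouge">with</code>-style APIs fit here so well.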
<h1 id="aroundwith"><code class="language-plaintext highlighter-rouge">aroundWith</code></h1> You may have noticed that <code class="language-plaintext highlighter-rouge">around</code> results in a <code class="language-plaintext highlighter-rouge">Spec</code>, not a <code class="language-plaintext highlighter-rouge">SpecWith</code>. You may have further inferred that there must be an <code class="language-plaintext highlighter-rouge">aroundWith</code> that lifted that restriction. There is, and the type signature is a bit scary. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>aroundWith :: (ActionWith a -> ActionWith b) -> SpecWith a -> SpecWith b -- inlining ActionWith type synonym aroundWith :: ((a -> IO ()) -> (b -> IO ())) -> SpecWith a -> SpecWith b -- deleting unnecessary parens aroundWith :: ((a -> IO ()) -> b -> IO ()) -> SpecWith a -> SpecWith b </code></pre></div></div> The callback to <code class="language-plaintext highlighter-rouge">aroundWith</code> is intriguing. The <code class="language-plaintext highlighter-rouge">b</code> is provided to us, and we must provide an <code class="language-plaintext highlighter-rouge">a</code> to the callback. That <code class="language-plaintext highlighter-rouge">b</code> represents the “outer context” of our test suite - the result type, what we’re plugging the whole test into. While the <code class="language-plaintext highlighter-rouge">a</code> represents the “inner context” of the argument <code class="language-plaintext highlighter-rouge">SpecWith a</code> that we’re passed. 
<code class="language-plaintext highlighter-rouge">aroundWith</code> is saying: “I know how to unify these two contexts.” <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>aroundWith $ \runTest outerContext -> do innerContext <- createInnerContext outerContext runTest innerContext </code></pre></div></div> Now, we can rewrite our database creation, user creation, etc to properly delete and create these things. More importantly - it happens in a composable manner. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spec :: Spec spec = do let provideDatabase runTest = withDatabase $ \db -> runTest db around provideDatabase $ describe "With Database" $ do it "has stuff" ... it "okay" ... let provideUser runTest db = do (_, userId) <- createUser db runTest (db, userId) deleteUser (db, userId) aroundWith provideUser $ describe "With User" $ do it ... it ... </code></pre></div></div> We can even use <code class="language-plaintext highlighter-rouge">bracket</code> internally, to ensure that exceptions are handled neatly. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spec :: Spec spec = do let provideDatabase runTest = withDatabase $ \db -> runTest db around provideDatabase $ describe "With Database" $ do it "has stuff" ... it "okay" ... let provideUser runTest db = do bracket (createUser db) (\(_, userId) -> deleteUser (db, userId)) (\(_, userId) -> runTest (db, userId)) aroundWith provideUser $ describe "With User" $ do it ... it ... </code></pre></div></div> Finally, if you’re just mapping the <code class="language-plaintext highlighter-rouge">a</code> type, there’s <code class="language-plaintext highlighter-rouge">mapSubject</code>, which lets you modify the type for the underlying items. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mapSubject :: (b -> a) -> SpecWith a -> SpecWith b </code></pre></div></div> <h1 id="hspec-rules">Hspec Rules</h1> I love writing tests with <code class="language-plaintext highlighter-rouge">hspec</code>. Hopefully, you’ll enjoy writing fancy composable tests with the library too! Fri, 16 Jul 2021 00:00:00 +0000 https://www.parsonsmatt.org/2021/07/16/hspec_hooks.html https://www.parsonsmatt.org/2021/07/16/hspec_hooks.html Template Haskell Performance Tips <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> is a powerful feature. With it, you can generate Haskell code using Haskell code, and GHC will compile it for you. This allows you to do many neat things, like <a href="https://hackage.haskell.org/package/qq-literals">quoted type safe literals</a>, <a href="https://hackage.haskell.org/package/persistent-2.13.1.1/docs/Database-Persist-Quasi.html">database entity definitions</a>, <a href="https://hackage.haskell.org/package/singletons-th-3.0/docs/Data-Singletons-TH.html">singletonized types for type-level programming</a>, <a href="https://hackage.haskell.org/package/lens-5.0.1/docs/Control-Lens-TH.html">automatic <code class="language-plaintext highlighter-rouge">Lens</code> generation</a>, among other things. One of the main downsides to <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> is that it can cause compilation times to increase significantly. Let’s dig into these slowdowns and talk about how to make them a bit less onerous. <h1 id="firing-up-the-external-interpreter">Firing up the external interpreter</h1> EDIT: <a href="https://www.reddit.com/r/haskell/comments/oiwl6z/templatehaskell_performance_tips/h4ya0ay/">Adam Gundry commented on <code class="language-plaintext highlighter-rouge">reddit</code></a> that this section is wrong. 
The external interpreter is only used if the <code class="language-plaintext highlighter-rouge">-fexternal-interpreter</code> option is passed to GHC. This may be why I was unable to detect the overhead from running an external interpreter! If you use <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> at all in a module, then GHC needs to fire up an external interpreter. GHC loads the interpreter (typically something like <code class="language-plaintext highlighter-rouge">ghci</code>), then executes/interprets the Haskell code. Splices return one of the <a href="https://www.stackage.org/haddock/lts-18.2/template-haskell-2.16.0.0/Language-Haskell-TH.html#g:18">Haskell syntax algebraic data types</a>. This has a constant overhead cost. It’s difficult to measure directly, since GHC doesn’t have an easy means of outputting performance and timing information on a per module basis. However, we can pass <code class="language-plaintext highlighter-rouge">+RTS -s -RTS</code> to GHC, which will cause it to print performance for a “package target.” And, with GHC 9, I’m actually unable to determine a difference. The noise in a given run appears to overwhelm the costs of actually firing up the interpreter. So much for that! (If you find different things, please let me know - you can file an issue or a PR to the <a href="https://github.com/parsonsmatt/parsonsmatt.github.io">GitHub repo</a>) <h1 id="actually-running-code">Actually running code</h1> GHC has two phases for TH: <ol> <li>Generating Code</li> <li>Compiling Code</li> </ol> Generating code typically doesn’t take much time at all, though this isn’t guaranteed. Fortunately, we can easily write a timing utility, since the <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> generation type (<code class="language-plaintext highlighter-rouge">Q</code>) allows you to run arbitrary <code class="language-plaintext highlighter-rouge">IO</code> operations. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Data.Time (getCurrentTime, diffUTCTime) import Language.Haskell.TH (Q, runIO, reportWarning) timed :: String -> Q a -> Q a timed message action = do begin <- runIO getCurrentTime result <- action end <- runIO getCurrentTime let duration = end `diffUTCTime` begin reportWarning $ concat [ "[", message, "]: ", show duration] pure result </code></pre></div></div> Expert benchmarkers will complain about using <code class="language-plaintext highlighter-rouge">getCurrentTime</code> since it isn’t monotonic, which is a valid complaint. But we’re not getting a real benchmark anyway, and we’re mostly just going to see whether generation or compilation is dominating the elapsed time (hint: it’ll almost always be compilation). With this, we will get a reported warning about the duration of the code generation. In <a href="https://www.reddit.com/r/haskell/comments/oi1x5v/tiny_use_of_template_haskell_causing_huge_memory/h4tr7n8/">this reddit comment</a>, I used it to determine that generation of some code was taking <code class="language-plaintext highlighter-rouge">0.0015s</code>, while compilation of the resulting code took <code class="language-plaintext highlighter-rouge">21.201s</code>. The code looks like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module Main where import TuplesTH $(timed "tuples" $ generateTupleBoilerplate 62) main :: IO () main = do print $ _3 (1,2,42,"hello",'z') </code></pre></div></div> The output looks like this: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Building executable 'th-perf-exe' for th-perf-0.1.0.0.. 
[1 of 2] Compiling Main /home/matt/Projects/th-perf/app/Main.hs:11:2: warning: [tuples]: 0.001553454s | 11 | $(timed "tuples" $ generateTupleBoilerplate 62) | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ [2 of 2] Compiling Paths_th_perf 21,569,689,896 bytes allocated in the heap 6,231,564,888 bytes copied during GC 594,126,600 bytes maximum residency (17 sample(s)) 3,578,104 bytes maximum slop 1641 MiB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 1097 colls, 0 par 4.919s 4.921s 0.0045s 0.1072s Gen 1 17 colls, 0 par 4.466s 4.467s 0.2628s 1.0215s TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1) SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled) INIT time 0.001s ( 0.001s elapsed) MUT time 11.813s ( 12.135s elapsed) GC time 9.385s ( 9.388s elapsed) EXIT time 0.001s ( 0.007s elapsed) Total time 21.201s ( 21.530s elapsed) Alloc rate 1,825,890,582 bytes per MUT second Productivity 55.7% of total user, 56.4% of total elapsed </code></pre></div></div> This sort of timing is usually only useful to determine whether you need to benchmark and optimize the generation phase or the compilation phase. Optimizing generation is a relatively standard Haskell performance optimization process, so I won’t cover it here. If your code is mostly pure functions (or, with GHC 9, the new <a href="https://www.stackage.org/haddock/nightly-2021-07-11/template-haskell-2.17.0.0/Language-Haskell-TH.html#t:Quote"><code class="language-plaintext highlighter-rouge">Quote</code></a> type class), then it’s straightforward to do. Many <code class="language-plaintext highlighter-rouge">Q</code> features are not supported in <code class="language-plaintext highlighter-rouge">IO</code>, and it’s difficult to accurately benchmark them. <h1 id="optimizing-compilation">Optimizing Compilation</h1> In the above example, GHC spends a tiny amount of time generating code, and then spends a huge amount of time compiling it. What’s going on? 
In <a href="https://www.parsonsmatt.org/2019/11/27/keeping_compilation_fast.html">Keeping Compilation Fast</a>, I wrote that GHC compiles modules superlinearly in the size of the module. That means that large modules take longer to compile than the same amount of code split up over several modules. <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> has no way of creating modules, or even altering the imports/exports of a given module, and so it can easily run into this problem. We have two means of reducing the cost of generated code: spreading the use over multiple modules, and optimizing how we generate the code. <h2 id="fewer-calls-to-th">Fewer Calls to TH</h2> In <a href="https://www.parsonsmatt.org/2019/12/06/splitting_persistent_models.html">Splitting Persistent Models</a>, I wrote how to speed up compile-times by isolating the <code class="language-plaintext highlighter-rouge">persistent</code> model definitions into separate modules. This results in many smaller modules, which GHC can compile much faster - in part because the modules are able to be compiled in parallel, and in part because they are smaller, and don’t hit the superlinearity. The same approach works for any other use of <code class="language-plaintext highlighter-rouge">TemplateHaskell</code>, too. A large module that has a ton of data types and a <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> declaration for each type will quickly become a problem in compilation. Separating it out into multiple modules, each exporting a small subset of those types, will allow GHC to operate much more quickly. <h2 id="smaller-code">Smaller Code</h2> It’s relatively easy to generate a massive amount of Haskell code. After all, the entire point is to make GHC generate code for us, because we don’t want to write it ourselves! In order to see how much code we’re generating in a module, it’s useful to enable the <code class="language-plaintext highlighter-rouge">-ddump-splices</code> option. 
We can do this with an <code class="language-plaintext highlighter-rouge">OPTIONS_GHC</code> pragma above the module header: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# language TemplateHaskell #-}
{-# OPTIONS_GHC -ddump-splices #-}

module Lib where

import Language.Haskell.TH.Syntax (liftTyped)

asdf :: Int
asdf = $$(liftTyped 3)
</code></pre></div></div> With this option, GHC will print the splice and the corresponding output while compiling the module. <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Building library for th-perf-0.1.0.0..
[2 of 3] Compiling Lib
/home/matt/Projects/th-perf/src/Lib.hs:10:10-22: Splicing expression
    liftTyped 3
  ======>
    3
</code></pre></div></div> However, if you’ve got a performance problem, then you’ve probably got more output here than you have any idea what to do with. In <a href="https://www.reddit.com/r/haskell/comments/oi1x5v/tiny_use_of_template_haskell_causing_huge_memory/h4tr7n8/">the reddit thread</a>, we ended up generating enough code that I couldn’t scroll back to the top! So, we’ll want to dump the resulting splices to a file. We can use the <code class="language-plaintext highlighter-rouge">-ddump-to-file</code> option, and GHC will store the splices for a module in a file named <code class="language-plaintext highlighter-rouge">$(module-name).dump-$(phase)</code>. If you’re building with <code class="language-plaintext highlighter-rouge">stack</code>, then the files will be located in the <code class="language-plaintext highlighter-rouge">.stack-work</code> directory. We can get the resulting size of the file using <code class="language-plaintext highlighter-rouge">wc</code> and a bit of a glob. 
In that investigation, this is the command and output: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ wc -l .stack-work/**/*.dump-splices 15897 .stack-work/dist/x86_64-linux-tinfo6/Cabal-3.4.0.0/build/th-perf-exe/th-perf-exe-tmp/app/Main.dump-splices </code></pre></div></div> That’s 15,897 lines of code! You can open that file up and see what it generates. In that example, there wasn’t much to optimize. <h3 id="beware-splicing-and-lifting">Beware Splicing and Lifting</h3> At the work codebase, we had a <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> function that ended up taking several minutes to compile. It iterated through all of our database models and generated a function that would stream each row from the database and verify that we could successfully parse everything out of the database. This is nice to check that our <code class="language-plaintext highlighter-rouge">PersistField</code> definitions worked, or that our <code class="language-plaintext highlighter-rouge">JSONB</code> columns could all still be parsed. I investigated the slow compile-time by dumping splices, and managed to find that it was splicing in the entire <a href="https://www.stackage.org/haddock/lts-18.2/persistent-2.13.1.1/Database-Persist-EntityDef-Internal.html#t:EntityDef"><code class="language-plaintext highlighter-rouge">EntityDef</code></a> type, multiple times, for each table. This is a relatively large record, with a bunch of fields, and each <code class="language-plaintext highlighter-rouge">FieldDef</code> also is relatively large, with a bunch of fields! The resulting code size was enormous. Why was it doing this? I looked into it and discovered this innocuous bit of code: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>do -- ... tableName <- [| getEntityHaskellName entityDef |] dbName <- [| getEntityDBName entityDef |] -- ... 
pure $ mkFun tableName dbName </code></pre></div></div> You might expect that <code class="language-plaintext highlighter-rouge">tableName</code> would be an expression containing only the Haskell name of the entity. However, it’s actually the entire expression in the <code class="language-plaintext highlighter-rouge">QuasiQuote</code>! Haskell allows you to implicitly lift things, sometimes, depending on scope and context etc. The <a href="https://www.stackage.org/haddock/lts-18.2/template-haskell-2.16.0.0/Language-Haskell-TH-Syntax.html#t:Lift"><code class="language-plaintext highlighter-rouge">lift</code> in question refers to the <code class="language-plaintext highlighter-rouge">Lift</code> type class</a>, not the <code class="language-plaintext highlighter-rouge">MonadTrans</code> variant. This ends up being translated to: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tableName <- [| $(lift getEntityHaskellName) $(lift entityDef) |] </code></pre></div></div> Lifting a function like this is relatively easy - you just splice a reference to the function. So the resulting expression for the function name is something like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>lift getEntityHaskellName ===> VarE 'getEntityHaskellName </code></pre></div></div> In order to <code class="language-plaintext highlighter-rouge">lift</code> the <code class="language-plaintext highlighter-rouge">EntityDef</code> into the expression, we need to take the complete run-time value and transform it into valid Haskell code, which we then splice in directly. 
In this case, that looks something like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>lift entityDef
===>
    EntityDef
        { entityHaskell =
            EntityNameHS (Data.Text.pack "SomeTable")
        , entityDB =
            EntityNameDB (Data.Text.pack "some_table")
        , entityId =
            EntityIdField (
                FieldDef
                    { fieldHaskell =
                        FieldNameHS (Data.Text.pack "id")
                    , fieldDB =
                        FieldNameDB (Data.Text.pack "id")
                    , fieldType =
                        -- ....
                    , fieldSqlType =
                        -- ...
                    , -- etc...
                    }
                )
        , entityFields =
            [ FieldDef { ... }
            , FieldDef { ... }
            , FieldDef { ... }
            , ...
            ]
        }
</code></pre></div></div> The combined expression splices this in: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>VarE 'getEntityHaskellName
`AppE`
    (ConE 'EntityDef
    `AppE`
        (ConE 'EntityNameHS
        `AppE`
            (VarE 'pack `AppE` LitE (StringL "SomeTable"))
        )
    `AppE`
        (ConE 'EntityNameDB ...)
    )
</code></pre></div></div> Which is no good - we’re obviously only grabbing a single field from the record. Fortunately, we can fix that real easy: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tableName <- lift $ getEntityHaskellName entityDef
dbName <- lift $ getEntityDBName entityDef
</code></pre></div></div> This performs the access before we generate the code, resulting in significantly smaller code generation. <h1 id="recompilation-avoidance">Recompilation Avoidance</h1> GHC is usually pretty clever about determining if it can avoid recompiling a module. However, <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> defeats this, and GHC doesn’t even try to see if it can avoid recompiling - it just recompiles. (This may be fixed in an upcoming GHC, but as of 9.0, it’s still doing the safe/dumb thing). We can’t fix this, but we can work around it. 
Try to isolate your <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> use to only a few modules, and keep them as small as possible. For example, suppose you have a ~500 line module that contains a bunch of data types, <code class="language-plaintext highlighter-rouge">deriveJSON</code> calls for those types, business logic, and handler API functions. If any dependency of that module changes, you need to recompile the whole module due to the <code class="language-plaintext highlighter-rouge">TH</code> recompilation rule. This needlessly recompiles everything - the datatypes, functions, JSON derivation, etc. If you pull the datatypes and <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> into a separate module, then that module needs to be recompiled every time. However, GHC is smart enough to avoid recompiling the dependent module. Suppose you split the 500 line module into two files, one of which is 20 lines of <code class="language-plaintext highlighter-rouge">data</code> and <code class="language-plaintext highlighter-rouge">TemplateHaskell</code>, and the other is 480 lines of functions, code, etc. GHC will always recompile the 20 line module (very fast), and intelligently avoid recompiling the 480 lines when it doesn’t need to. <h2 id="recompilation-cascade">Recompilation Cascade</h2> Recompilation Cascade is the name I’ve given to a problem where a tiny change triggers a <code class="language-plaintext highlighter-rouge">[TH]</code> rebuild of a module, and, since that module got rebuilt, every dependent module using <code class="language-plaintext highlighter-rouge">TH</code> gets rebuilt. If you use <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> pervasively, then you may end up having <code class="language-plaintext highlighter-rouge">[TH]</code> rebuilds for your entire codebase! This can wreck incremental compile times. 
Try to avoid this by separating out your <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> into isolated modules, if at all possible. If you use the <code class="language-plaintext highlighter-rouge">typed QQ literals</code> trick, then you can isolate those literals into a <code class="language-plaintext highlighter-rouge">Constants</code> module, and use those constants directly. Instead of: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module X where sendEmailToFoo = sendEmail [email|foobar@gmail.com|] "hello world" </code></pre></div></div> Consider using this instead: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module Email.Constants where foobar_at_gmail = [email|foobar@gmail.com|] module X where import Email.Constants sendEmailToFoo = sendEmail foobar_at_gmail "hello world" </code></pre></div></div> With the latter form, <code class="language-plaintext highlighter-rouge">X</code> does not use <code class="language-plaintext highlighter-rouge">TemplateHaskell</code>, and therefore can skip recompilation if any dependencies change. Mon, 12 Jul 2021 00:00:00 +0000 https://www.parsonsmatt.org/2021/07/12/template_haskell_performance_tips.html https://www.parsonsmatt.org/2021/07/12/template_haskell_performance_tips.html Global IORef in Template Haskell I’m investigating <a href="https://github.com/yesodweb/persistent/issues/1241#issuecomment-824128224">a way to speed up <code class="language-plaintext highlighter-rouge">persistent</code> as well as make it more powerful</a>, and one of the potential solutions involves persisting some global state across module boundaries. I decided to investigate whether the “Global IORef Trick” would work for this. Unfortunately, it doesn’t. On reflection, it seems obvious: the interpreter for Template Haskell is a GHCi-like process that is loaded for each module. 
Loading an interpreter for each module is part of why Template Haskell imposes a compile-time penalty - in my measurements, it’s roughly 100ms. Not huge, but noticeable on large projects. (I still generally find <code class="language-plaintext highlighter-rouge">DeriveGeneric</code> and the related <code class="language-plaintext highlighter-rouge">Generic</code> code to be slower, but it’s a complex issue). Anyway, let’s review the trick and observe the behavior. <h1 id="global-ioref-trick">Global IORef Trick</h1> This trick allows you to have an <code class="language-plaintext highlighter-rouge">IORef</code> (or <code class="language-plaintext highlighter-rouge">MVar</code>) that serves as a global reference. You almost certainly do not need to do this, but it can be a convenient way to hide state and make your program deeply mysterious. Here’s the trick: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module Lib where

import Data.IORef
import System.IO.Unsafe

globalRef :: IORef [String]
globalRef = unsafePerformIO $ newIORef []
{-# NOINLINE globalRef #-}
</code></pre></div></div> There are two important things to note: <ol> <li>You must give a concrete type to this.</li> <li>You must write the <code class="language-plaintext highlighter-rouge">{-# NOINLINE globalRef #-}</code> pragma.</li> </ol> Let’s say we give <code class="language-plaintext highlighter-rouge">globalRef</code> a more general type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>globalRef :: IORef [a]
</code></pre></div></div> This means that we would be allowed to write and read whatever we want from this reference. That’s bad! We could do something like <code class="language-plaintext highlighter-rouge">writeIORef globalRef [1,2,3]</code>, and then <code class="language-plaintext highlighter-rouge">readIORef globalRef :: IO [String]</code>. Boom, your program explodes. 
Unless you want a dynamically typed reference for some reason - and even then, you’d better use <code class="language-plaintext highlighter-rouge">Dynamic</code>. If you omit the <code class="language-plaintext highlighter-rouge">NOINLINE</code> pragma, then you’ll just get a fresh reference each time you use it. GHC will see that any reference to <code class="language-plaintext highlighter-rouge">globalRef</code> can be inlined to <code class="language-plaintext highlighter-rouge">unsafePerformIO (newIORef [])</code>, and it’ll happily perform that optimization. But that means you won’t be sharing state through the reference. This is a bad idea, don’t use it. I hesitate to even explain it. <h1 id="testing-the-trick">Testing the Trick</h1> But, well, sometimes you try things out to see if they work. In this case, they don’t, so it’s useful to document that. We’re going to write a function <code class="language-plaintext highlighter-rouge">trackStrings</code> that remembers the strings that were passed previously, and defines a value that returns them. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>trackStrings "hello"
-- hello = []
trackStrings "goodbye"
-- goodbye = ["hello"]
trackStrings "what"
-- what = ["goodbye", "hello"]
</code></pre></div></div> Here’s our full module: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# language QuasiQuotes #-}
{-# language TemplateHaskell #-}

module Lib where

import Data.IORef
import System.IO.Unsafe
import Language.Haskell.TH
import Language.Haskell.TH.Quote
import Language.Haskell.TH.Syntax

globalRef :: IORef [String]
globalRef = unsafePerformIO $ newIORef []
{-# NOINLINE globalRef #-}

trackStrings :: String -> Q [Dec]
trackStrings input = do
    strs <- runIO $ readIORef globalRef
    _ <- runIO $ atomicModifyIORef globalRef (\i -> (input : i, ()))
    ty <- [t| [String] |]
    pure
        [ SigD (mkName input) ty
        , ValD (VarP (mkName input)) (NormalB (ListE $ map (LitE . stringL) $ strs)) []
        ]
</code></pre></div></div> This works in a single module just fine. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# language TemplateHaskell #-}

module Test where

import Lib

trackStrings "what"
trackStrings "good"
trackStrings "nothx"

test :: IO ()
test = do
    print what
    print good
    print nothx
</code></pre></div></div> If we evaluate <code class="language-plaintext highlighter-rouge">test</code>, we get the following output: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[]
["what"]
["good","what"]
</code></pre></div></div> This is exactly what we want. Unfortunately, this is only module-local state. 
Given this <code class="language-plaintext highlighter-rouge">Main</code> module, we get some disappointing output: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> {-# language TemplateHaskell #-} module Main where import Lib import Test trackStrings "hello" trackStrings "world" trackStrings "goodbye" main :: IO () main = do test print hello print world print goodbye </code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[] ["what"] ["good","what"] [] ["hello"] ["world","hello"] </code></pre></div></div> To solve my problem, <code class="language-plaintext highlighter-rouge">main</code> would have needed to output: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[] ["what"] ["good","what"] ["nothx","good","what"] ["hello","nothx","good","what"] ["world","hello","nothx","good","what"] </code></pre></div></div> <h1 id="module-local-state-in-template-haskell">Module-local state in Template Haskell</h1> Fortunately, we don’t even need to do anything awful like this. The <code class="language-plaintext highlighter-rouge">Q</code> monad offers two methods, <a href="https://www.stackage.org/haddock/lts-17.9/template-haskell-2.16.0.0/Language-Haskell-TH-Syntax.html#v:getQ"><code class="language-plaintext highlighter-rouge">getQ</code></a> and <a href="https://www.stackage.org/haddock/lts-17.9/template-haskell-2.16.0.0/Language-Haskell-TH-Syntax.html#v:putQ"><code class="language-plaintext highlighter-rouge">putQ</code></a> that allow module-local state. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- | Get state from the Q monad. Note that the state is -- local to the Haskell module in which the Template -- Haskell expression is executed. getQ :: Typeable a => Q (Maybe a) -- | Replace the state in the Q monad. 
Note that the -- state is local to the Haskell module in which the -- Template Haskell expression is executed. putQ :: Typeable a => a -> Q () </code></pre></div></div> These use a <code class="language-plaintext highlighter-rouge">Typeable</code> dictionary, so you can store many kinds of state - one for each type! This is a neat way to avoid the “polymorphic reference” problem I described above. <h1 id="how-to-actually-solve-the-problem">How to actually solve the problem?</h1> If y’all dare me enough I might write a follow-up where I investigate using a <a href="https://hackage.haskell.org/package/compact-0.2.0.0/docs/Data-Compact-Serialize.html">compact region</a> to persist state across modules, but I’m terrified of the potential complexity at play there. I imagine it’d work fine for a single-threaded compile, but there’d probably be contention on the file with parallel builds. Hey, maybe I just need to spin up a Redis server to manage the file locks… Perhaps I can install <code class="language-plaintext highlighter-rouge">nix</code> at compile-time and call out to a <code class="language-plaintext highlighter-rouge">nix-shell</code> that installs Redis and runs the server. Wed, 21 Apr 2021 00:00:00 +0000 https://www.parsonsmatt.org/2021/04/21/global_ioref_in_template_haskell.html https://www.parsonsmatt.org/2021/04/21/global_ioref_in_template_haskell.html Async Control Flow This post is an investigation of <a href="https://github.com/yesodweb/persistent/issues/1199"><code class="language-plaintext highlighter-rouge">persistent</code> issue #1199</a> where an asynchronous exception caused a database connection to be improperly returned to the pool. The linked issue contains some debugging notes, along with the <a href="https://github.com/yesodweb/persistent/pull/1207">PR that fixes the problem</a>. While I was able to identify the problem and provide a fix, I don’t really understand what happened - it’s a complex bit of work. 
So I’m going to write this up as an exploration into the exact code paths that are happening. <h1 id="datapool"><code class="language-plaintext highlighter-rouge">Data.Pool</code></h1> <a href="https://hackage.haskell.org/package/resource-pool"><code class="language-plaintext highlighter-rouge">resource-pool</code></a> is how <code class="language-plaintext highlighter-rouge">persistent</code> manages concurrent pooling and sharing of database connections. When you create a <code class="language-plaintext highlighter-rouge">Pool</code>, you specify how to create resources, destroy them, and then some information around resource management: how long to keep an unused resource open, how many sub-pools to maintain, and how many resources per sub-pool (aka stripe). <code class="language-plaintext highlighter-rouge">persistent</code> calls <code class="language-plaintext highlighter-rouge">createPool</code> <a href="https://github.com/yesodweb/persistent/blob/fdd7b7b42d2aaf8528cbe5e968002fef42e473fd/persistent/Database/Persist/Sql/Run.hs#L220-L238">here</a>. The database client libraries provide a <code class="language-plaintext highlighter-rouge">LogFunc -> IO SqlBackend</code> that is used to create new database connections, and the <code class="language-plaintext highlighter-rouge">close'</code> delegates to the <code class="language-plaintext highlighter-rouge">connClose</code> field on the <code class="language-plaintext highlighter-rouge">SqlBackend</code> record. While <code class="language-plaintext highlighter-rouge">resource-pool</code> isn’t seeing much maintenance activity, it’s relatively well tested and reliable. 
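To make the bookkeeping a pool does a little more concrete, here is a minimal sketch using only <code class="language-plaintext highlighter-rouge">base</code>. Everything in it (<code class="language-plaintext highlighter-rouge">ToyPool</code>, <code class="language-plaintext highlighter-rouge">takeToy</code>, <code class="language-plaintext highlighter-rouge">putToy</code>, <code class="language-plaintext highlighter-rouge">demoReuse</code>) is a hypothetical name made up for illustration, and it is not how <code class="language-plaintext highlighter-rouge">resource-pool</code> is implemented: it leaves out destroying resources, idle timeouts, stripes, and resource limits. It only shows the core idea of reusing an idle resource instead of creating a new one.

```haskell
import Control.Concurrent.MVar (MVar, modifyMVar, modifyMVar_, newMVar)
import Data.IORef (atomicModifyIORef', newIORef, readIORef)

-- | Hypothetical toy pool: an action that creates a fresh resource,
-- plus a list of idle resources waiting to be reused.
data ToyPool a = ToyPool
  { toyCreate :: IO a
  , toyIdle   :: MVar [a]
  }

newToyPool :: IO a -> IO (ToyPool a)
newToyPool create = ToyPool create <$> newMVar []

-- | Reuse an idle resource if there is one, otherwise create a new one.
takeToy :: ToyPool a -> IO a
takeToy pool = do
  found <- modifyMVar (toyIdle pool) $ \idle -> pure $ case idle of
    []       -> ([], Nothing)
    (x : xs) -> (xs, Just x)
  maybe (toyCreate pool) pure found

-- | Hand a resource back so that a later 'takeToy' can reuse it.
putToy :: ToyPool a -> a -> IO ()
putToy pool x = modifyMVar_ (toyIdle pool) (pure . (x :))

-- | Take, return, take again. Resources are numbered as they are created;
-- the result is both resource numbers and the total number created.
demoReuse :: IO (Int, Int, Int)
demoReuse = do
  created <- newIORef 0
  pool <- newToyPool (atomicModifyIORef' created (\n -> (n + 1, n + 1)))
  a <- takeToy pool
  putToy pool a
  b <- takeToy pool
  n <- readIORef created
  pure (a, b, n)
```

Running <code class="language-plaintext highlighter-rouge">demoReuse</code> gives <code class="language-plaintext highlighter-rouge">(1,1,1)</code>: the second take gets the same resource back, and only one resource is ever created.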
Once you’ve got a <code class="language-plaintext highlighter-rouge">Pool a</code> from <code class="language-plaintext highlighter-rouge">createPool</code>, the recommended way to use it is <a href="https://www.stackage.org/haddock/lts-17.6/resource-pool-0.2.3.2/Data-Pool.html#v:withResource"><code class="language-plaintext highlighter-rouge">withResource</code></a>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>withResource :: (MonadBaseControl IO m) => Pool a -> (a -> m b) -> m b withResource pool act = control $ \runInIO -> mask $ \restore -> do (resource, local) <- takeResource pool ret <- restore (runInIO (act resource)) `onException` destroyResource pool local resource putResource local resource return ret </code></pre></div></div> <h1 id="dataacquire"><code class="language-plaintext highlighter-rouge">Data.Acquire</code></h1> Now, in <code class="language-plaintext highlighter-rouge">persistent-2.10.5</code>, a new API based on the <code class="language-plaintext highlighter-rouge">resourcet</code> package’s <code class="language-plaintext highlighter-rouge">Data.Acquire</code> <a href="https://github.com/yesodweb/persistent/pull/984">was introduced</a>, and this API became the underlying implementation for the <code class="language-plaintext highlighter-rouge">runSqlPool</code> family of functions. The underlying functionality is in the new function <a href="https://github.com/yesodweb/persistent/pull/984/files#diff-f9d7f232cd00cb88188b7fcc68110e3f4cb378fcad9df652360de44d13cd86e3R33-R45"><code class="language-plaintext highlighter-rouge">unsafeAcquireSqlConnFromPool</code></a>, which was later factored out into <a href="https://hackage.haskell.org/package/resourcet-pool"><code class="language-plaintext highlighter-rouge">resourcet-pool</code></a>. 
This change was introduced because <code class="language-plaintext highlighter-rouge">resource-pool</code> operates in <code class="language-plaintext highlighter-rouge">MonadBaseControl</code>, which is incompatible with many other monad transformers - specifically, <code class="language-plaintext highlighter-rouge">ConduitT</code>. <code class="language-plaintext highlighter-rouge">Acquire</code> is based on <code class="language-plaintext highlighter-rouge">MonadUnliftIO</code>, which is compatible. In hindsight, we could have just changed the code to use <code class="language-plaintext highlighter-rouge">MonadUnliftIO</code> - it’s relatively straightforward to do. A term with a single constraint like <code class="language-plaintext highlighter-rouge">MonadBaseControl IO m => m a</code> can be specialized to <code class="language-plaintext highlighter-rouge">IO a</code>, and we can then run that using <code class="language-plaintext highlighter-rouge">withRunInIO</code> from <code class="language-plaintext highlighter-rouge">unliftio</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>toUnliftIO
    :: MonadUnliftIO n
    => (forall m. MonadBaseControl IO m => m a)
    -> n a
toUnliftIO mbc = withRunInIO $ \runInIO -> do
    mbc

toPlainIO
    :: (forall m. MonadBaseControl IO m => m a)
    -> IO a
toPlainIO mbc = mbc

toMonadIO
    :: MonadIO n
    => (forall m. MonadBaseControl IO m => m a)
    -> n a
toMonadIO mbc = liftIO (toPlainIO mbc)
</code></pre></div></div> <h1 id="acquire-vs-pool"><code class="language-plaintext highlighter-rouge">Acquire</code> vs <code class="language-plaintext highlighter-rouge">Pool</code></h1> I didn’t realize this at the time, but <code class="language-plaintext highlighter-rouge">Data.Acquire</code> is inherently a weaker tool than <code class="language-plaintext highlighter-rouge">Data.Pool</code>. 
<code class="language-plaintext highlighter-rouge">Data.Acquire</code> provides a means of creating a new resource, and also freeing it automatically when a scope is exited. <code class="language-plaintext highlighter-rouge">Data.Pool</code> keeps track of resources, resource counts, and occasionally destroys them if they’re unused. So let’s look at our conversion function: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>unsafeAcquireSqlConnFromPool = do
    pool <- MonadReader.ask

    let freeConn :: (backend, LocalPool backend) -> ReleaseType -> IO ()
        freeConn (res, localPool) relType = case relType of
            ReleaseException -> P.destroyResource pool localPool res
            _ -> P.putResource localPool res

    return $ fst <$> mkAcquireType (P.takeResource pool) freeConn
</code></pre></div></div> <code class="language-plaintext highlighter-rouge">mkAcquireType</code> is analogous to <code class="language-plaintext highlighter-rouge">createPool</code> - it creates a handle <code class="language-plaintext highlighter-rouge">Acquire a</code> that can be used with a function named <a href="https://www.stackage.org/haddock/lts-17.6/resourcet-1.2.4.2/Data-Acquire.html#v:with"><code class="language-plaintext highlighter-rouge">with</code></a>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>with :: MonadUnliftIO m
     => Acquire a
     -> (a -> m b)
     -> m b
with (Acquire f) g = withRunInIO $ \run -> E.mask $ \restore -> do
    Allocated x free <- f restore
    res <- restore (run (g x)) `E.onException` free ReleaseException
    free ReleaseNormal
    return res
</code></pre></div></div> <code class="language-plaintext highlighter-rouge">with</code> is aliased to <code class="language-plaintext highlighter-rouge">withAcquire</code>, which I’ll use from here on out to disambiguate. 
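The control flow in <code class="language-plaintext highlighter-rouge">with</code> can be tried out in plain <code class="language-plaintext highlighter-rouge">IO</code> with nothing but <code class="language-plaintext highlighter-rouge">base</code>. The sketch below is an illustration under assumptions, not code from <code class="language-plaintext highlighter-rouge">resourcet</code>: <code class="language-plaintext highlighter-rouge">HowReleased</code>, <code class="language-plaintext highlighter-rouge">withTracked</code>, and <code class="language-plaintext highlighter-rouge">demoReleases</code> are hypothetical names, and <code class="language-plaintext highlighter-rouge">HowReleased</code> is a stand-in for <code class="language-plaintext highlighter-rouge">ReleaseType</code>. It records which release path runs when the body returns and when it throws.

```haskell
import Control.Exception (ErrorCall (..), mask, onException, throwIO, try)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- | Hypothetical stand-in for resourcet's ReleaseType.
data HowReleased = ReleasedNormally | ReleasedOnException
  deriving (Eq, Show)

-- | Same shape as 'with', specialised to IO: acquire while masked,
-- run the body unmasked, release differently on exception and on success.
withTracked :: IO a -> (a -> HowReleased -> IO ()) -> (a -> IO b) -> IO b
withTracked acquire release act = mask $ \restore -> do
  x <- acquire
  r <- restore (act x) `onException` release x ReleasedOnException
  release x ReleasedNormally
  pure r

-- | Run one successful body and one throwing body, logging each release.
demoReleases :: IO [HowReleased]
demoReleases = do
  logRef <- newIORef []
  let release _ how = modifyIORef' logRef (how :)
  _ <- withTracked (pure ()) release (\_ -> pure "ok")
  _ <- try (withTracked (pure ()) release (\_ -> throwIO (ErrorCall "boom")))
         :: IO (Either ErrorCall ())
  reverse <$> readIORef logRef
```

<code class="language-plaintext highlighter-rouge">demoReleases</code> returns <code class="language-plaintext highlighter-rouge">[ReleasedNormally,ReleasedOnException]</code>: a body that returns takes the normal release path, and a body that throws takes the exception path and then rethrows.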
You may notice that <code class="language-plaintext highlighter-rouge">withAcquire</code> and <code class="language-plaintext highlighter-rouge">withResource</code> are implemented nearly identically. <code class="language-plaintext highlighter-rouge">withResource</code> uses <code class="language-plaintext highlighter-rouge">MonadBaseControl</code> and <code class="language-plaintext highlighter-rouge">withAcquire</code> uses <code class="language-plaintext highlighter-rouge">MonadUnliftIO</code>, and that’s the whole of the difference. They have the same async exception handling with <code class="language-plaintext highlighter-rouge">mask</code> and use the same <code class="language-plaintext highlighter-rouge">onException</code> functions. All the exception handling stuff is from <code class="language-plaintext highlighter-rouge">Control.Exception</code>, so we’re not using <code class="language-plaintext highlighter-rouge">UnliftIO.Exception</code> or <code class="language-plaintext highlighter-rouge">Control.Monad.Catch</code> or <code class="language-plaintext highlighter-rouge">Control.Exception.Safe</code> here. These are really similar. When we look at how the <code class="language-plaintext highlighter-rouge">unsafeSqlConnFromPool</code> works, it should provide identical behavior. For <code class="language-plaintext highlighter-rouge">free</code>, we case on <code class="language-plaintext highlighter-rouge">ReleaseType</code> and do <code class="language-plaintext highlighter-rouge">destroyResource</code> on exception and <code class="language-plaintext highlighter-rouge">putResource</code> on any other exit. We’re not handling <code class="language-plaintext highlighter-rouge">ReleaseEarly</code> specially - this constructor is only used when we use <code class="language-plaintext highlighter-rouge">ResourceT</code>’s <code class="language-plaintext highlighter-rouge">release</code> function on a value. 
Using <code class="language-plaintext highlighter-rouge">withAcquire</code>, we’ll only ever pass <code class="language-plaintext highlighter-rouge">ReleaseNormal</code> and <code class="language-plaintext highlighter-rouge">ReleaseException</code>. So this is locally safe. Weirdly enough, <code class="language-plaintext highlighter-rouge">resourcet</code> doesn’t really depend on the <code class="language-plaintext highlighter-rouge">Acquire</code> type at all, at least not directly - the <code class="language-plaintext highlighter-rouge">ReleaseMap</code> type contains a function <code class="language-plaintext highlighter-rouge">ReleaseType -> IO ()</code> for freeing resources, but doesn’t mention anything else about it. Anyway, let’s get back on track. Since <code class="language-plaintext highlighter-rouge">withAcquire</code> and <code class="language-plaintext highlighter-rouge">withResource</code> are nearly identical, it may be our translating code that is the problem. We can use algebraic substitution to check this out. Let’s look at <a href="https://www.stackage.org/haddock/lts-17.6/resourcet-1.2.4.2/src/Data.Acquire.Internal.html#mkAcquireType"><code class="language-plaintext highlighter-rouge">mkAcquireType</code></a>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkAcquireType :: IO a -- ^ acquire the resource -> (a -> ReleaseType -> IO ()) -- ^ free the resource -> Acquire a mkAcquireType create free = Acquire $ \_ -> do x <- create return $! Allocated x (free x) </code></pre></div></div> The ignored parameter in the lambda there is a function that looks like <code class="language-plaintext highlighter-rouge">restore</code> - and we’re ignoring it. So, this action gets run when we unpack the <code class="language-plaintext highlighter-rouge">Acquire</code> in <code class="language-plaintext highlighter-rouge">withAcquire</code>. 
Let’s plug in our <code class="language-plaintext highlighter-rouge">create</code> and <code class="language-plaintext highlighter-rouge">free</code> parameters: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkAcquireType :: IO a -- ^ acquire the resource -> (a -> ReleaseType -> IO ()) -- ^ free the resource -> Acquire a mkAcquireType (create = P.takeResource pool) (free = freeConn) = Acquire $ \_ -> do x <- (P.takeResource pool) return $! Allocated x (freeConn x) where freeConn (res, localPool) relType = case relType of ReleaseException -> P.destroyResource pool localPool res _ -> P.putResource localPool res </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">pool</code> variable is captured in the closure. Now we can look at <code class="language-plaintext highlighter-rouge">withAcquire</code>, and plug in our behavior: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>withAcquire (Acquire f) g = withRunInIO $ \run -> E.mask $ \restore -> do -- `f` ignores the `restore` argument: possible bug? Allocated x free <- f restore -- so `x` here comes from `P.takeResource pool` -- free = freeConn ret <- restore (run (g x)) `E.onException` free ReleaseException free ReleaseNormal return ret </code></pre></div></div> Let’s plug in the specific case for <code class="language-plaintext highlighter-rouge">free</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>withAcquire (Acquire f) g = withRunInIO $ \run -> E.mask $ \restore -> do -- `f` ignores the `restore` argument: possible bug? 
Allocated x free <- f restore -- so `x` here comes from `P.takeResource pool` -- free = freeConn ret <- restore (run (g x)) `E.onException` P.destroyResource pool localPool res P.putResource localPool res return ret </code></pre></div></div> Closer, closer… Let’s unpack the <code class="language-plaintext highlighter-rouge">Allocated</code> stuff: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>withAcquire (Acquire _) g = withRunInIO $ \run -> E.mask $ \restore -> do -- `f` ignores the `restore` argument: possible bug? Allocated x free <- f restore -- so `x` here comes from `P.takeResource pool` -- free = freeConn ret <- restore (run (g x)) `E.onException` P.destroyResource pool localPool res P.putResource localPool res return ret where f _ = do x <- (P.takeResource pool) return $! Allocated x (freeConn x) -- OK, let's splice in the definition of `f`: withAcquire (Acquire _) g = withRunInIO $ \run -> E.mask $ \restore -> do -- `f` ignores the `restore` argument: possible bug? Allocated x free <- do x <- P.takeResource pool return $! 
Allocated x (freeConn x) -- so `x` here comes from `P.takeResource pool` -- free = freeConn ret <- restore (run (g x)) `E.onException` P.destroyResource pool localPool res P.putResource localPool res return ret -- Now let's remove the `Allocated` constructor: withAcquire (Acquire _) g = withRunInIO $ \run -> E.mask $ \restore -> do x@(res, localPool) <- P.takeResource pool ret <- restore (run (g x)) `E.onException` P.destroyResource pool localPool res P.putResource localPool res return ret </code></pre></div></div> With this, we’re now nearly identical with <code class="language-plaintext highlighter-rouge">withResource</code> (copied again): <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>withResource :: (MonadBaseControl IO m) => Pool a -> (a -> m b) -> m b withResource pool act = control $ \runInIO -> mask $ \restore -> do (resource, local) <- takeResource pool ret <- restore (runInIO (act resource)) `onException` destroyResource pool local resource putResource local resource return ret </code></pre></div></div> The only difference here is that <code class="language-plaintext highlighter-rouge">Acquire</code> also passes the <code class="language-plaintext highlighter-rouge">LocalPool</code> to the given action. In the <code class="language-plaintext highlighter-rouge">persistent</code> code, we use <code class="language-plaintext highlighter-rouge">fmap fst</code> so that it only passes the resource to the callback. So, I’m not sure this function is at fault. Let’s see how we call this function. 
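The <code class="language-plaintext highlighter-rouge">fmap fst</code> trick can be checked in isolation. This is a small sketch, assuming the <code class="language-plaintext highlighter-rouge">resourcet</code> API (<code class="language-plaintext highlighter-rouge">mkAcquire</code>, <code class="language-plaintext highlighter-rouge">withAcquire</code>, and the <code class="language-plaintext highlighter-rouge">Functor</code> instance for <code class="language-plaintext highlighter-rouge">Acquire</code>); the strings are hypothetical placeholders for a real connection and <code class="language-plaintext highlighter-rouge">LocalPool</code>.

```haskell
import Data.Acquire (Acquire, mkAcquire, withAcquire)

-- Hypothetical stand-ins: strings instead of a real connection and LocalPool.
takePair :: IO (String, String)
takePair = pure ("conn", "localPool")

pairAcquire :: Acquire (String, String)
pairAcquire = mkAcquire takePair $ \(res, local) ->
  putStrLn ("put " ++ res ++ " back into " ++ local)

main :: IO ()
main =
  -- The callback sees only the first component. The release action still
  -- receives the full pair, because it closed over it before `fmap fst`.
  withAcquire (fst <$> pairAcquire) $ \res ->
    putStrLn ("using " ++ res)
```

Mapping over the <code class="language-plaintext highlighter-rouge">Acquire</code> changes what the callback receives without touching the release action, so the <code class="language-plaintext highlighter-rouge">LocalPool</code> is still available when the resource is returned.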
<h1 id="whats-that--doing-there">What’s that <code class="language-plaintext highlighter-rouge">>>=</code> doing there??</h1> <a href="https://github.com/yesodweb/persistent/pull/984/files#diff-f9d7f232cd00cb88188b7fcc68110e3f4cb378fcad9df652360de44d13cd86e3R62-R67"><code class="language-plaintext highlighter-rouge">acquireSqlConnFromPool</code></a> is what’s actually called by <a href="https://github.com/yesodweb/persistent/pull/984/files#diff-f9d7f232cd00cb88188b7fcc68110e3f4cb378fcad9df652360de44d13cd86e3R89"><code class="language-plaintext highlighter-rouge">runSqlPool</code></a> in this version of the code. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>acquireSqlConnFromPool :: (MonadReader (Pool backend) m, BackendCompatible SqlBackend backend) => m (Acquire backend) acquireSqlConnFromPool = do connFromPool <- unsafeAcquireSqlConnFromPool return $ connFromPool >>= acquireSqlConn </code></pre></div></div> That <code class="language-plaintext highlighter-rouge">>>=</code> is weird. What’s going on here? We have <code class="language-plaintext highlighter-rouge">return :: a -> m a</code>, and then <code class="language-plaintext highlighter-rouge">f >>= g</code>. <code class="language-plaintext highlighter-rouge">f :: Acquire backend</code> - so then <code class="language-plaintext highlighter-rouge">g</code> must have the type <code class="language-plaintext highlighter-rouge">g :: backend -> Acquire backend</code>, meaning that we’re using the <code class="language-plaintext highlighter-rouge">>>=</code> of <code class="language-plaintext highlighter-rouge">Acquire a -> (a -> Acquire b) -> Acquire b</code>. 
<code class="language-plaintext highlighter-rouge">acquireSqlConn</code> cashes out to <a href="https://github.com/yesodweb/persistent/pull/984/files#diff-f9d7f232cd00cb88188b7fcc68110e3f4cb378fcad9df652360de44d13cd86e3R122-R142"><code class="language-plaintext highlighter-rouge">rawAcquireSqlConn</code></a>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>rawAcquireSqlConn :: forall backend m . (MonadReader backend m, BackendCompatible SqlBackend backend) => Maybe IsolationLevel -> m (Acquire backend) rawAcquireSqlConn isolation = do conn <- MonadReader.ask let rawConn :: SqlBackend rawConn = projectBackend conn getter :: T.Text -> IO Statement getter = getStmtConn rawConn beginTransaction :: IO backend beginTransaction = conn <$ connBegin rawConn getter isolation finishTransaction :: backend -> ReleaseType -> IO () finishTransaction _ relType = case relType of ReleaseException -> connRollback rawConn getter _ -> connCommit rawConn getter return $ mkAcquireType beginTransaction finishTransaction </code></pre></div></div> So, in the investigation, the exception (<code class="language-plaintext highlighter-rouge">libpq: failed (another command is already in progress)</code>) would happen (as best as I can tell) when we try to call <code class="language-plaintext highlighter-rouge">connRollback</code>. The problem is somewhere around here. Um excuse me what? This is also operating in <code class="language-plaintext highlighter-rouge">m (Acquire backend)</code>, not <code class="language-plaintext highlighter-rouge">Acquire backend</code>. How is it possibly being used on the RHS of a <code class="language-plaintext highlighter-rouge">>>=</code>? … Oh, right. 
Just like <code class="language-plaintext highlighter-rouge">MonadBaseControl IO m => m a</code> can be concretized to <code class="language-plaintext highlighter-rouge">IO a</code>, we can concretize <code class="language-plaintext highlighter-rouge">MonadReader r m => m a</code> to <code class="language-plaintext highlighter-rouge">r -> a</code>. So what’s happening here is we’re picking the specialized type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>rawAcquireSqlConn :: Maybe IsolationLevel -> backend -> Acquire backend </code></pre></div></div> Wild. Well, let’s look at <a href="https://www.stackage.org/haddock/lts-17.6/resourcet-1.2.4.2/src/Data.Acquire.Internal.html#line-54"><code class="language-plaintext highlighter-rouge">>>=</code> for <code class="language-plaintext highlighter-rouge">Acquire</code></a>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Monad Acquire where return = pure Acquire f >>= g' = Acquire $ \restore -> do Allocated x free1 <- f restore let Acquire g = g' x Allocated y free2 <- g restore `E.onException` free1 ReleaseException return $! Allocated y (\rt -> free2 rt `E.finally` free1 rt) </code></pre></div></div> Hmm! This smells funny. The problem occurs when we try to roll back the transaction. So let’s apply some more substitution here. 
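As a quick aside before substituting: the ordering implied by this instance can be observed with a toy example. This is only a sketch, assuming the <code class="language-plaintext highlighter-rouge">resourcet</code> API (<code class="language-plaintext highlighter-rouge">mkAcquire</code>, <code class="language-plaintext highlighter-rouge">withAcquire</code>); <code class="language-plaintext highlighter-rouge">pooled</code> and <code class="language-plaintext highlighter-rouge">txn</code> are hypothetical names that merely print what the real pool and transaction code would do.

```haskell
import Data.Acquire (Acquire, mkAcquire, withAcquire)

-- Hypothetical stand-in for taking a connection from the pool.
pooled :: Acquire String
pooled = mkAcquire
  (putStrLn "take connection" >> pure "conn")
  (\_ -> putStrLn "return connection")

-- Hypothetical stand-in for beginning and finishing a transaction.
txn :: String -> Acquire String
txn c = mkAcquire
  (putStrLn "begin transaction" >> pure c)
  (\_ -> putStrLn "finish transaction")

main :: IO ()
main =
  -- Acquisition runs left to right; release runs in the opposite order,
  -- because the combined free action is `free2 rt `finally` free1 rt`.
  withAcquire (pooled >>= txn) $ \c ->
    putStrLn ("using " ++ c)
```

Going by the instance above, the transaction should be finished before the connection is returned, and <code class="language-plaintext highlighter-rouge">E.finally</code> means the connection is returned even if finishing the transaction throws.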
<code class="language-plaintext highlighter-rouge">Acquire f</code> contains: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>\_ -> do x <- P.takeResource pool pure $ Allocated x (freeConn x) </code></pre></div></div> And <code class="language-plaintext highlighter-rouge">g'</code> contains (simplifying a tiny bit): <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>\sqlBackend -> do Acquire $ \_ -> do _ <- beginTransaction sqlBackend getter isolation pure $ Allocated sqlBackend $ \case ReleaseException -> connRollback sqlBackend _ -> connCommit sqlBackend </code></pre></div></div> So, we can start inlining. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Acquire $ \restore -> do Allocated x free1 <- (\_ -> do x <- P.takeResource pool pure $ Allocated x (freeConn x)) restore let Acquire g = g' x Allocated y free2 <- g restore `E.onException` free1 ReleaseException return $! Allocated y (\rt -> free2 rt `E.finally` free1 rt) -- (\_ -> x) restore = x Acquire $ \restore -> do Allocated x free1 <- do x <- P.takeResource pool pure $ Allocated x (freeConn x) let Acquire g = g' x Allocated y free2 <- g restore `E.onException` free1 ReleaseException return $! Allocated y (\rt -> free2 rt `E.finally` free1 rt) -- float `x` and `freeConn` up Acquire $ \restore -> do x <- P.takeResource pool let free1 = freeConn x let Acquire g = g' x Allocated y free2 <- g restore `E.onException` free1 ReleaseException return $! 
Allocated y (\rt -> free2 rt `E.finally` free1 rt) -- inline g' Acquire $ \restore -> do x <- P.takeResource pool let free1 = freeConn x let sqlBackend = x let Acquire g = Acquire $ \_ -> do _ <- beginTransaction sqlBackend getter isolation pure $ Allocated sqlBackend $ \case ReleaseException -> connRollback sqlBackend _ -> connCommit sqlBackend Allocated y free2 <- g restore `E.onException` free1 ReleaseException return $! Allocated y (\rt -> free2 rt `E.finally` free1 rt) -- Remove `Acquire` constructor: Acquire $ \restore -> do x <- P.takeResource pool let free1 = freeConn x let sqlBackend = x let g _ = do _ <- beginTransaction sqlBackend getter isolation pure $ Allocated sqlBackend $ \case ReleaseException -> connRollback sqlBackend _ -> connCommit sqlBackend Allocated y free2 <- g restore `E.onException` free1 ReleaseException return $! Allocated y (\rt -> free2 rt `E.finally` free1 rt) -- Inline `g`, ignore `restore` parameter Acquire $ \restore -> do x <- P.takeResource pool let free1 = freeConn x let sqlBackend = x Allocated y free2 <- (do _ <- beginTransaction sqlBackend getter isolation pure $ Allocated sqlBackend $ \case ReleaseException -> connRollback sqlBackend _ -> connCommit sqlBackend ) `E.onException` free1 ReleaseException return $! Allocated y (\rt -> free2 rt `E.finally` free1 rt) </code></pre></div></div> Now, this next transformation feels a bit tricky. I’m going to float <code class="language-plaintext highlighter-rouge">beginTransaction</code> up and put the <code class="language-plaintext highlighter-rouge">E.onException</code> only on it. Note that we’re not actually running the <code class="language-plaintext highlighter-rouge">free2</code> action here - just preparing it. Then I’ll assign it with a <code class="language-plaintext highlighter-rouge">let</code>. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Acquire $ \restore -> do x <- P.takeResource pool let free1 = freeConn x let (sqlBackend, localPool) = x _ <- beginTransaction sqlBackend getter isolation `E.onException` free1 ReleaseException let free2 = \case ReleaseException -> connRollback sqlBackend _ -> connCommit sqlBackend return $! Allocated y (\rt -> free2 rt `E.finally` free1 rt) -- Inline free1 and free2 Acquire $ \restore -> do x <- P.takeResource pool let (sqlBackend, localPool) = x _ <- beginTransaction sqlBackend getter isolation `E.onException` freeConn x ReleaseException return $! Allocated y $ \rt -> (\case ReleaseException -> connRollback sqlBackend _ -> connCommit sqlBackend) rt `E.finally` (freeConn x rt) -- Inline freeConn Acquire $ \restore -> do x <- P.takeResource pool let (sqlBackend, localPool) = x _ <- beginTransaction sqlBackend getter isolation `E.onException` P.destroyResource pool localPool sqlBackend return $! Allocated y $ \rt -> (case rt of ReleaseException -> connRollback sqlBackend _ -> connCommit sqlBackend) `E.finally` do case rt of ReleaseException -> P.destroyResource pool localPool sqlBackend _ -> P.putResource localPool sqlBackend </code></pre></div></div> I think it’s important to note that, again we don’t ever actually call <code class="language-plaintext highlighter-rouge">restore</code>. So the masking state is inherited and not ever changed. It feels important but I’m not sure if it actually is. Let’s plug this into <code class="language-plaintext highlighter-rouge">withAcquire</code> now. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>withAcquire (Acquire f) g = withRunInIO $ \run -> E.mask $ \restore -> do Allocated x free <- f restore res <- restore (run (g x)) `E.onException` free ReleaseException free ReleaseNormal return res -- Inline `f`. 
Since `restore` is never called, we can omit passing it as -- a parameter. withAcquire (Acquire f) g = withRunInIO $ \run -> E.mask $ \restore -> do Allocated x free <- do x <- P.takeResource pool let (sqlBackend, localPool) = x _ <- beginTransaction sqlBackend getter isolation `E.onException` free1 ReleaseException return $! Allocated x $ \rt -> (case rt of ReleaseException -> connRollback sqlBackend _ -> connCommit sqlBackend) `E.finally` do case rt of ReleaseException -> P.destroyResource pool localPool sqlBackend _ -> P.putResource localPool sqlBackend res <- restore (run (g x)) `E.onException` free ReleaseException free ReleaseNormal return res -- float `x <- P.takeResource pool` to the top, and define `free` using `let` withAcquire (Acquire f) g = withRunInIO $ \run -> E.mask $ \restore -> do x <- P.takeResource pool let free1 = freeConn x let (sqlBackend, localPool) = x _ <- beginTransaction sqlBackend getter isolation `E.onException` free1 ReleaseException let free rt = (case rt of ReleaseException -> connRollback sqlBackend _ -> connCommit sqlBackend) `E.finally` do case rt of ReleaseException -> P.destroyResource pool localPool sqlBackend _ -> P.putResource localPool sqlBackend res <- restore (run (g x)) `E.onException` free ReleaseException free ReleaseNormal return res -- inline `free` for each case: withAcquire (Acquire f) g = withRunInIO $ \run -> E.mask $ \restore -> do x <- P.takeResource pool let (sqlBackend, localPool) = x _ <- beginTransaction sqlBackend getter isolation `E.onException` P.destroyResource pool localPool sqlBackend res <- restore (run (g x)) `E.onException` do connRollback sqlBackend `E.finally` P.destroyResource pool localPool sqlBackend do -- ReleaseNormal connCommit sqlBackend `E.finally` do P.putResource localPool sqlBackend return res </code></pre></div></div> Let’s consider our masking state. We’re masked for everything except for the <code class="language-plaintext highlighter-rouge">restore (run (g x))</code> call. 
Including beginning the transaction and committing the transaction. But we can still receive asynchronous exceptions during <a href="https://hackage.haskell.org/package/base-4.14.1.0/docs/Control-Exception.html#interruptible">interruptible operations</a>. Interruptible operations include “anything that can block or perform IO,” which seems very likely to include the Postgres code here. <h1 id="the-original">The Original</h1> Let’s compare this with the original code. The <a href="https://github.com/yesodweb/persistent/pull/984/files#diff-f9d7f232cd00cb88188b7fcc68110e3f4cb378fcad9df652360de44d13cd86e3L31">original code</a> delegated to <a href="https://github.com/yesodweb/persistent/pull/984/files#diff-f9d7f232cd00cb88188b7fcc68110e3f4cb378fcad9df652360de44d13cd86e3L63-L72"><code class="language-plaintext highlighter-rouge">runSqlConn</code></a> after acquiring a <code class="language-plaintext highlighter-rouge">SqlBackend</code> from the <code class="language-plaintext highlighter-rouge">Pool</code> in <code class="language-plaintext highlighter-rouge">MonadUnliftIO</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>runSqlConn :: (MonadUnliftIO m, BackendCompatible SqlBackend backend) => ReaderT backend m a -> backend -> m a runSqlConn r conn = withRunInIO $ \runInIO -> mask $ \restore -> do let conn' = projectBackend conn getter = getStmtConn conn' restore $ connBegin conn' getter Nothing x <- onException (restore $ runInIO $ runReaderT r conn) (restore $ connRollback conn' getter) restore $ connCommit conn' getter return x </code></pre></div></div> We’ll inline this into <code class="language-plaintext highlighter-rouge">runSqlPool</code>, so we’ll now see: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>runSqlPool r pconn = withRunInIO $ \run -> withResource pconn $ run . 
runSqlConn r -- expand lambda runSqlPool r pconn = withRunInIO $ \run -> withResource pconn $ \conn -> run $ runSqlConn r conn -- inline runSqlConn runSqlPool r pconn = withRunInIO $ \run -> withResource pconn $ \conn -> run $ withRunInIO $ \runInIO -> mask $ \restore -> do let conn' = projectBackend conn getter = getStmtConn conn' restore $ connBegin conn' getter Nothing x <- onException (restore $ runInIO $ runReaderT r conn) (restore $ connRollback conn' getter) restore $ connCommit conn' getter return x </code></pre></div></div> Kind of a lot of <code class="language-plaintext highlighter-rouge">withStuff</code> going on, including two <code class="language-plaintext highlighter-rouge">withRunInIO</code>s lol. Let’s make it even worse by inlining <code class="language-plaintext highlighter-rouge">withResource</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- abstract action to a variable runSqlPool r pconn = withRunInIO $ \run -> withResource pconn $ \conn -> let act = run $ withRunInIO $ \runInIO -> mask $ \restore -> do let conn' = projectBackend conn getter = getStmtConn conn' restore $ connBegin conn' getter Nothing x <- onException (restore $ runInIO $ runReaderT r conn) (restore $ connRollback conn' getter) restore $ connCommit conn' getter return x in act -- inline withResource runSqlPool r pconn = withRunInIO $ \run -> -- withResource pconn $ \conn -> control $ \runInIO0 -> mask $ \restore0 -> do let act conn = run $ withRunInIO $ \runInIO1 -> mask $ \restore1 -> do let conn' = projectBackend conn getter = getStmtConn conn' restore1 $ connBegin conn' getter Nothing x <- onException (restore1 $ runInIO1 $ runReaderT r conn) (restore1 $ connRollback conn' getter) restore1 $ connCommit conn' getter return x (resource, local) <- takeResource pool ret <- restore0 (runInIO0 (act resource)) `onException` destroyResource pool local resource putResource local resource return ret -- inline `act` 
runSqlPool r pconn = withRunInIO $ \run -> -- withResource pconn $ \conn -> control $ \runInIO0 -> mask $ \restore0 -> do (resource, local) <- takeResource pool ret <- restore0 (runInIO0 ( run $ withRunInIO $ \runInIO1 -> mask $ \restore1 -> do let conn = resource let conn' = projectBackend conn getter = getStmtConn conn' restore1 $ connBegin conn' getter Nothing x <- onException (restore1 $ runInIO1 $ runReaderT r conn) (restore1 $ connRollback conn' getter) restore1 $ connCommit conn' getter return x)) `onException` destroyResource pool local resource putResource local resource return ret </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">restore</code> parameter in <code class="language-plaintext highlighter-rouge">mask</code> doesn’t unmask it completely - it restores the <code class="language-plaintext highlighter-rouge">mask</code>ing state that was in effect before the <code class="language-plaintext highlighter-rouge">mask</code> was entered. So <code class="language-plaintext highlighter-rouge">mask $ \restore -> mask $ \restore -> restore (print 10)</code> doesn’t have <code class="language-plaintext highlighter-rouge">print 10</code> in an unmasked state, but the same mask as before. However, here, we have this pattern: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mask $ \restore -> do restore $ do mask $ \restore' -> do ... </code></pre></div></div> Which is interesting! 
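The two nesting patterns can be compared directly with <code class="language-plaintext highlighter-rouge">getMaskingState</code> from <code class="language-plaintext highlighter-rouge">Control.Exception</code>. This is a minimal sketch using only <code class="language-plaintext highlighter-rouge">base</code>; the names <code class="language-plaintext highlighter-rouge">restore0</code> and <code class="language-plaintext highlighter-rouge">restore1</code> just mirror the inlined code above, and the comments assume the program starts on an unmasked main thread.

```haskell
import Control.Exception (getMaskingState, mask)

main :: IO ()
main = do
  getMaskingState >>= print -- the main thread normally starts Unmasked
  mask $ \restore0 -> do
    getMaskingState >>= print -- MaskedInterruptible
    mask $ \restore1 ->
      -- restore1 restores the state from just before the inner mask,
      -- and that state was already masked.
      restore1 getMaskingState >>= print -- MaskedInterruptible
    -- The pattern from the inlined code: restore first, then mask again.
    -- The inner restore' now goes back to the unmasked state.
    restore0 $ mask $ \restore' ->
      restore' getMaskingState >>= print -- Unmasked
```

The last case is the one that matters for the original <code class="language-plaintext highlighter-rouge">runSqlPool</code>: because the inner <code class="language-plaintext highlighter-rouge">mask</code> is entered from inside <code class="language-plaintext highlighter-rouge">restore0</code>, everything wrapped in <code class="language-plaintext highlighter-rouge">restore1</code> runs unmasked.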
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>runSqlPool r pconn = -- Unmasked withRunInIO $ \run -> control $ \runInIO0 -> mask $ \restore0 -> do -- Masked (resource, local) <- takeResource pool ret <- restore0 -- Unmasked (runInIO0 $ run $ withRunInIO $ \runInIO1 -> -- Masked mask $ \restore1 -> do let conn = resource let conn' = projectBackend conn getter = getStmtConn conn' -- Unmasked restore1 $ connBegin conn' getter Nothing x <- onException -- Unmasked (restore1 $ runInIO1 $ runReaderT r conn) -- Unmasked (restore1 $ connRollback conn' getter) restore1 $ do --unmasked connCommit conn' getter return x) -- Masked `onException` destroyResource pool local resource -- Still masked putResource local resource return ret </code></pre></div></div> So our masked actions are: <ol> <li><code class="language-plaintext highlighter-rouge">takeResource pool</code></li> <li><code class="language-plaintext highlighter-rouge">onException</code></li> <li><code class="language-plaintext highlighter-rouge">onException</code> and then <code class="language-plaintext highlighter-rouge">destroyResource</code></li> <li><code class="language-plaintext highlighter-rouge">putResource</code></li> </ol> Unmasked, we have: <ol> <li><code class="language-plaintext highlighter-rouge">connBegin</code></li> <li><code class="language-plaintext highlighter-rouge">r</code> (the action passed to <code class="language-plaintext highlighter-rouge">runSqlConn</code>)</li> <li><code class="language-plaintext highlighter-rouge">connRollback</code></li> <li><code class="language-plaintext highlighter-rouge">connCommit</code></li> </ol> Let’s compare with <code class="language-plaintext highlighter-rouge">withAcquire</code> which was all inlined above: <ul> <li>Masked: <ol> <li><code class="language-plaintext highlighter-rouge">takeResource</code></li> <li><code class="language-plaintext highlighter-rouge">beginTransaction</code></li> <li><code 
class="language-plaintext highlighter-rouge">destroyResource</code></li> <li><code class="language-plaintext highlighter-rouge">connRollback</code></li> <li><code class="language-plaintext highlighter-rouge">destroyResource</code> again</li> <li><code class="language-plaintext highlighter-rouge">connCommit</code></li> <li><code class="language-plaintext highlighter-rouge">putResource</code></li> </ol> </li> <li>Unmasked <ol> <li><code class="language-plaintext highlighter-rouge">run (g x)</code> – the action passed to <code class="language-plaintext highlighter-rouge">withAcquire</code> and <code class="language-plaintext highlighter-rouge">runSqlConn</code>.</li> </ol> </li> </ul> So <code class="language-plaintext highlighter-rouge">withAcquire</code> actually has quite a bit more masking going on! Interesting. Remembering, the problem occurs when the <code class="language-plaintext highlighter-rouge">thread killed</code> exception happens and the <code class="language-plaintext highlighter-rouge">connRollback</code> function is called, causing <code class="language-plaintext highlighter-rouge">libpq</code> to die with the “command in progress” error. So, we throw a <code class="language-plaintext highlighter-rouge">killThread</code> at our <code class="language-plaintext highlighter-rouge">withAcquire</code> function. It’ll land as soon as we’re unmasked, or an interruptible action occurs. Since almost all of it is masked, we need to determine what the interruptible operations are. <a href="https://www.stackage.org/haddock/lts-17.6/resource-pool-0.2.3.2/src/Data.Pool.html#takeResource"><code class="language-plaintext highlighter-rouge">takeResource</code></a> might be interruptible - it has an STM transaction, which does call <code class="language-plaintext highlighter-rouge">retry</code>. 
I don’t know if any code with <code class="language-plaintext highlighter-rouge">retry</code> triggers an interrupt, or if only actually calling <code class="language-plaintext highlighter-rouge">retry</code> can trigger an interruptible state. Based on a quick and bewildering look at the GHC source, I think it’s just that <code class="language-plaintext highlighter-rouge">retry</code> itself can be interrupted. <code class="language-plaintext highlighter-rouge">retry</code> occurs when there are no available entries in the local pool and we’re at max resources for the pool. This is exactly the scenario this test is exercising: a single stripe with a single resource that’s constantly in use. <code class="language-plaintext highlighter-rouge">beginTransaction</code> kicks off an IO action to postgres, so it is almost definitely interruptible. Same for <code class="language-plaintext highlighter-rouge">connRollback</code> and <code class="language-plaintext highlighter-rouge">connCommit</code>. So the masked-state for these items in <code class="language-plaintext highlighter-rouge">withAcquire</code> is probably not a big deal - but we could check by using <code class="language-plaintext highlighter-rouge">uninterruptibleMask</code> on them. <h1 id="to-be-continued">To be continued?</h1> I wish I had a more satisfying conclusion here, but I’m all out of time to write on this for now. Please comment on the relevant GitHub issues if you’re interested or have some insight! Wed, 17 Mar 2021 00:00:00 +0000 https://www.parsonsmatt.org/2021/03/17/async_control_flow.html Haskell Proposal: Simplify Deriving Haskell’s type classes and deriving facilities are a killer feature for type safety and extensibility. Over nearly 30 years they’ve acquired quite a bit of cruft and language extensions. 
With <code class="language-plaintext highlighter-rouge">DerivingVia</code>, we now have the ability to dramatically simplify the deriving story. This post outlines a series of changes to the language that would hopefully be adopted with the next version of the language standard. They get less reasonable and more dramatic as the post goes on. <h1 id="add-to-the-stock-deriving-classes">Add to the Stock Deriving Classes</h1> GHC has a ton of extensions that only serve to unlock additional type classes to the “stock” deriving strategy: <code class="language-plaintext highlighter-rouge">Derive{Functor,Foldable,Traversable,Generic,Lift,etc}</code>. We can remove all of these extensions by folding them into the <code class="language-plaintext highlighter-rouge">stock</code> deriving strategy. <h1 id="remove-deriveanyclass">Remove <code class="language-plaintext highlighter-rouge">DeriveAnyClass</code></h1> <code class="language-plaintext highlighter-rouge">DeriveAnyClass</code> is a footgun. It allows you to write any type class in a <code class="language-plaintext highlighter-rouge">deriving</code> clause. It pastes in an “empty” instance, relying on <code class="language-plaintext highlighter-rouge">DefaultSignatures</code> to fill in the values. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- With DeriveAnyClass: data X = X deriving ToJSON -- Without: data X = X instance ToJSON X </code></pre></div></div> <h1 id="remove-defaultsignatures">Remove <code class="language-plaintext highlighter-rouge">DefaultSignatures</code></h1> <code class="language-plaintext highlighter-rouge">DefaultSignatures</code> is used to give a single default implementation of a type class if the underlying type matches a more restrictive constraint. This is primarily used to provide <code class="language-plaintext highlighter-rouge">Generic</code>-based implementations with very little syntax. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data X = X deriving Generic -- with DefaultSignatures: class ToJSON a where toJSON :: a -> Value default toJSON :: (Generic a, GToJSON (Rep a)) => a -> Value toJSON = gtoJSON instance ToJSON X -- without DefaultSignatures: class ToJSON a where toJSON :: a -> Value instance ToJSON X where toJSON = gtoJSON </code></pre></div></div> By privileging a single default, it makes any other possible defaults less useful and less discoverable. The <code class="language-plaintext highlighter-rouge">DeriveAnyClass</code> utility is subsumed by <code class="language-plaintext highlighter-rouge">DerivingVia</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype Generically a = Generically a instance (Generic a, GToJSON (Rep a)) => ToJSON (Generically a) where toJSON (Generically a) = gtoJSON a data X = X deriving stock Generic deriving ToJSON via Generically X </code></pre></div></div> <h1 id="remove-generalizednewtypederiving">Remove <code class="language-plaintext highlighter-rouge">GeneralizedNewtypeDeriving</code></h1> This extension is subsumed by <code class="language-plaintext highlighter-rouge">DerivingVia</code>, also. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- with GeneralizedNewtypeDeriving newtype UserId = UserId Text deriving newtype (Show, ToJSON) -- with DerivingVia newtype UserId = UserId Text deriving (Show, ToJSON) via Text </code></pre></div></div> <h1 id="remove-derivingstrategies">Remove <code class="language-plaintext highlighter-rouge">DerivingStrategies</code></h1> Now that there’s only two strategies, we can get rid of <code class="language-plaintext highlighter-rouge">DerivingStrategies</code>. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Before data X = X deriving stock (Show, Generic) deriving (ToJSON, FromJSON) via Generically X -- After data X = X deriving (Show, Generic) deriving (ToJSON, FromJSON) via Generically X </code></pre></div></div> <h1 id="allow-wildcards-in-deriving-clauses">Allow wildcards in deriving clauses</h1> Currently, you must write the complete type in a <code class="language-plaintext highlighter-rouge">DerivingVia</code> clause. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data X = X deriving Generic deriving ToJSON via Generically X newtype Y = Y Text deriving ToJSON via Text </code></pre></div></div> This can be cumbersome for a very large type. <pre><code class="language-Haskell">newtype App a = App (ExceptT () (StateT () (ReaderT () IO)) a) deriving ( Functor , Applicative , Monad , MonadReader () , MonadError () , MonadState () ) via ExceptT () (StateT () (ReaderT () IO)) </code></pre> It’s also annoyingly repetitive, and can lead to errors. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Foo = Foo deriving Generic deriving ToJSON via Generically Foo -- copy/paste error data Bar = Bar deriving Generic deriving ToJSON via Generically Foo </code></pre></div></div> A wildcard can be used to indicate either: a. The underlying type of a <code class="language-plaintext highlighter-rouge">newtype</code>, or b. The type of the <code class="language-plaintext highlighter-rouge">data</code> declaration. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Foo = Foo deriving Generic deriving ToJSON via Generically _ -- no more copy paste error data Bar = Bar deriving Generic deriving ToJSON via Generically _ -- mmmm nice and clean newtype App a = App (ExceptT () (StateT () (ReaderT () IO)) a) deriving ( Functor , Applicative , Monad , MonadReader () , MonadError () , MonadState () ) via _ </code></pre></div></div> <h1 id="remove-attached-deriving">Remove attached deriving</h1> There are two ways to derive things: <code class="language-plaintext highlighter-rouge">StandaloneDeriving</code> and attached deriving. Attached deriving is redundant, but convenient. <code class="language-plaintext highlighter-rouge">StandaloneDeriving</code> is more powerful, but less convenient. Attached deriving clauses don’t work with <code class="language-plaintext highlighter-rouge">GADTs</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Before: data Foo = Foo deriving Generic deriving (FromJSON, ToJSON) via Generically _ newtype App a = App (ExceptT () (StateT () (ReaderT () IO)) a) deriving ( Functor , Applicative , Monad , MonadReader () , MonadError () , MonadState () ) via _ -- Only standalone: data Foo = Foo deriving instance Generic Foo deriving via Generically _ instance ToJSON Foo deriving via Generically _ instance FromJSON Foo newtype App a = App ... deriving via _ instance Functor App deriving via _ instance Applicative App deriving via _ instance Monad App deriving via _ instance MonadReader () App deriving via _ instance MonadError () App deriving via _ instance MonadState () App -- GADT must use standalone to specify a context data Some f where Some :: Show a => f a -> Some f deriving instance (forall a. 
Show a => Show (f a)) => Show (Some f) </code></pre></div></div> <h1 id="lightweight-standalone-syntax">Lightweight Standalone Syntax</h1> The problem with the above proposal is that it carries a significant syntax cost. The keyword <code class="language-plaintext highlighter-rouge">deriving</code> is repeated for each instance, the keyword <code class="language-plaintext highlighter-rouge">instance</code> is repeated, the <code class="language-plaintext highlighter-rouge">via _</code> clause is repeated, and the type name is repeated. Multiple instances should be derivable with the same context. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Foo = Foo

deriving Foo
    ( Generic
    , via (Generically _)
        ( ToJSON
        , FromJSON
        )
    )
</code></pre></div></div> In this block, we define the <code class="language-plaintext highlighter-rouge">ToJSON</code> and <code class="language-plaintext highlighter-rouge">FromJSON</code> instances using the same <code class="language-plaintext highlighter-rouge">Generically</code> via type. We can still use <code class="language-plaintext highlighter-rouge">_</code> to refer to the type, since we know the type we’re deriving for: <code class="language-plaintext highlighter-rouge">Foo</code>. This recovers the syntax convenience of “attached deriving.” <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype App a = App ...

deriving App via _ instance
    ( Functor
    , Applicative
    , Monad
    , MonadReader ()
    , MonadError ()
    , MonadState ()
    )
</code></pre></div></div> This also recovers the convenience of attached deriving. Let’s look at the main point - GADTs. Otherwise we could just remove <code class="language-plaintext highlighter-rouge">StandaloneDeriving</code> (with the nice benefit/tragedy of banning orphan derived instances). 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Some f where Some :: Show a => f a -> Some f -- old deriving instance (forall a. Show a => Show (f a)) => Show (Some f) -- new deriving Some (forall a. Show a => Show (f a)) => Show (_ f) -- generally, deriving SomeGadtType (SomeContextOn a b c) => ( Show, Eq, ToJSON, FromJSON ) (_ a b c) </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">_</code> refers to the type name, without any variables applied. So you need to apply the type variables in the instance head. That’s a bit annoying, but maybe it’s fine <h1 id="remove-stock-deriving">Remove Stock Deriving</h1> GHC provides a <code class="language-plaintext highlighter-rouge">newtype Stock a = Stock a</code> that hooks in to <code class="language-plaintext highlighter-rouge">DerivingVia</code> somehow. Now we’re down to one deriving strategy. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data X = X deriving X via Stock _ ( Eq, Show, Generic ) via Generically _ ( ToJSON, FromJSON ) </code></pre></div></div> This “deprivileges” the <code class="language-plaintext highlighter-rouge">Stock</code> deriving classes. <h1 id="remove-standalone-deriving">Remove Standalone Deriving</h1> OK, so maybe you don’t like getting rid of attached deriving. Let’s get rid of standalone deriving instead. We need <code class="language-plaintext highlighter-rouge">StandaloneDeriving</code> for two reasons: <ol> <li>Orphan derived instances (shame on you)</li> <li>Specifying a context for GADTs and allow application of type variables</li> </ol> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Some f where Some :: Show a => f a -> Some f deriving ( (forall a. 
Show a => Show (f a)) => Show , -- generally, (SomeContext f) => SomeClass ) </code></pre></div></div> The type variable <code class="language-plaintext highlighter-rouge">f</code> is in scope from the <code class="language-plaintext highlighter-rouge">data</code> declaration. EDIT: <a href="https://twitter.com/quickdudley/status/1328068260659482624">@quickdudley</a> and <a href="https://twitter.com/nnotm/status/1327998875563683845">@nnotm</a> have correctly pointed out that you also want to be able to define instances of a class at the definition module of a class. These are perfectly valid instances, and so we must keep <code class="language-plaintext highlighter-rouge">StandaloneDeriving</code>. <h1 id="terrible-post-over">Terrible Post Over</h1> Alright, post is done. These ideas are certainly controversial and Bad, but man wouldn’t it be nice to have a simpler story around deriving and type class instances? The current story is so complex, and I think we can genuinely simplify Haskell-the-language by trimming some fat here. EDIT: <a href="https://twitter.com/am_i_tom/status/1327992136151789568">@i_am_tom</a> posted a reference to the <a href="https://github.com/tysonzero/ghc-proposals/blob/concrete-class-dictionaries/proposals/0000-concrete-class-dictionaries.md">Concrete Class Dictionaries</a> GHC proposal, which subsumes a lot of this. Tue, 10 Nov 2020 00:00:00 +0000 https://www.parsonsmatt.org/2020/11/10/simplifying_deriving.html https://www.parsonsmatt.org/2020/11/10/simplifying_deriving.html Plucking In, Plucking Out In <a href="/2020/01/03/plucking_constraints.html">plucking constraints</a>, I talked about a way to shrink a set of constraints by partially concretizing it. At the end of the article, I show how to use it for errors. 
The <a href="https://hackage.haskell.org/package/plucky"><code class="language-plaintext highlighter-rouge">plucky</code></a> package documents the technique, and my upcoming library <a href="https://github.com/parsonsmatt/prio/blob/master/src/Lib.hs"><code class="language-plaintext highlighter-rouge">prio</code></a> embeds plucking into run-time exceptions, effectively solving <a href="/2018/11/03/trouble_with_typed_errors.html">the trouble with typed errors</a>. Figuring that out has been bothering me for two and a half years! This got me thinking. Michael Snoyman’s <a href="https://www.stackage.org/lts-16.20/package/rio-0.1.19.0"><code class="language-plaintext highlighter-rouge">rio</code></a> package is an alternative Prelude which bakes the <a href="https://www.fpcomplete.com/blog/2017/06/readert-design-pattern/"><code class="language-plaintext highlighter-rouge">ReaderT</code> Design Pattern</a> into the base monad <code class="language-plaintext highlighter-rouge">RIO r a</code>, and then encourages users to write in a polymorphic style. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Concrete, monomorphic:
foo :: Int -> RIO AppEnv String

-- Abstract, polymorphic:
foo
    :: ( MonadReader env m, MonadIO m
       , HasThing env, HasOtherThing env
       )
    => Int
    -> m String
</code></pre></div></div> The abstract <code class="language-plaintext highlighter-rouge">foo</code> specifies exactly what the <code class="language-plaintext highlighter-rouge">env</code> must satisfy. With a concrete type, the <code class="language-plaintext highlighter-rouge">AppEnv</code> type is almost certainly too big - it probably needs to support all kinds of things, and <code class="language-plaintext highlighter-rouge">foo</code> is only a small part of that. In <a href="/2018/11/03/trouble_with_typed_errors.html">The Trouble With Typed Errors</a>, I argue that an error type that is too big is a major problem. 
But I’ve never really been bothered with an <code class="language-plaintext highlighter-rouge">env</code> type that is too big. Why is that? <h1 id="profunctors">Profunctors</h1> If I combine <code class="language-plaintext highlighter-rouge">prio</code>’s <code class="language-plaintext highlighter-rouge">CheckedT e m a</code> for checked runtime exceptions and <code class="language-plaintext highlighter-rouge">rio</code>’s <code class="language-plaintext highlighter-rouge">RIO r a</code>, I get this neat type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype App r e a = App { unApp :: CheckedT e (RIO r a) } deriving newtype ( Functor, Applicative, Monad, MonadIO , MonadReader r, MonadError e, MonadState s -- etc ) </code></pre></div></div> Let’s simplify. Instead of runtime exceptions with <code class="language-plaintext highlighter-rouge">CheckedT</code>, we’ll use <code class="language-plaintext highlighter-rouge">Either e</code>. We’ll inline the transformers, too. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype App r e a = App { unApp :: r -> IO (Either e a) } deriving via (ReaderT r (ExceptT e IO)) ( Functor, Applicative, Monad, MonadIO -- etc ) </code></pre></div></div> If you squint a little, this is a <code class="language-plaintext highlighter-rouge">Profunctor</code> with an input of type <code class="language-plaintext highlighter-rouge">r</code> and an output of type <code class="language-plaintext highlighter-rouge">Either e a</code>. <code class="language-plaintext highlighter-rouge">Either e a</code> is even just a way of “blessing” one possible output as the “bind” output while the <code class="language-plaintext highlighter-rouge">e</code> is a “short-circuit” output. <code class="language-plaintext highlighter-rouge">Profunctor</code> is a fancy math word that makes me think about category theory. 
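To make the squinting concrete, here is a minimal sketch of mapping over both ends of the simplified <code class="language-plaintext highlighter-rouge">App</code>. The names <code class="language-plaintext highlighter-rouge">lmapApp</code>, <code class="language-plaintext highlighter-rouge">rmapApp</code>, <code class="language-plaintext highlighter-rouge">dimapApp</code>, and the <code class="language-plaintext highlighter-rouge">lengthOrFail</code> example are made up for illustration. This is not an instance of the <code class="language-plaintext highlighter-rouge">Profunctor</code> class from the <code class="language-plaintext highlighter-rouge">profunctors</code> package, just plain functions with the same shape, and the deriving clause is left off. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- The simplified App from above, without the deriving clause.
newtype App r e a = App { unApp :: r -> IO (Either e a) }

-- Contravariant end: adapt the input before the action sees it.
lmapApp :: (r' -> r) -> App r e a -> App r' e a
lmapApp f (App run) = App (run . f)

-- Covariant end: adapt the whole `Either e a` output.
rmapApp :: (Either e a -> Either e' b) -> App r e a -> App r e' b
rmapApp g (App run) = App (fmap g . run)

-- Both at once, in the style of `dimap`.
dimapApp
    :: (r' -> r)
    -> (Either e a -> Either e' b)
    -> App r e a
    -> App r' e' b
dimapApp f g = lmapApp f . rmapApp g

-- Hypothetical action: needs a String, fails on the empty one.
lengthOrFail :: App String String Int
lengthOrFail = App $ \s ->
    pure (if null s then Left "empty" else Right (length s))

-- Feed it an Int instead, and report errors as their length.
example :: App Int Int Int
example = dimapApp show (either (Left . length) Right) lengthOrFail
</code></pre></div></div> Note the asymmetry in the arguments to <code class="language-plaintext highlighter-rouge">dimapApp</code>: the input function runs backwards, from the new environment to the old one, while the output function runs forwards. 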
If I apply some sloppy category theory thinking, maybe I can satisfactorily answer “Why is a too-big output bad, while a too-big input is fine?” <h1 id="the-problem-with-catch">The Problem With <code class="language-plaintext highlighter-rouge">catch</code></h1> The problem with <code class="language-plaintext highlighter-rouge">catch</code> is that it doesn’t change the error set at all. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>catch :: Either e a -> (e -> Either e a) -> Either e a </code></pre></div></div> There is no type-level evidence that anything has changed. Alexis King would call this <a href="https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/">validation, not parsing</a>. The Correct type of <code class="language-plaintext highlighter-rouge">catch</code> is something like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>catch :: forall input small rest . ( Contains small input , rest ~ Delete small input ) => Either input a -> (small -> Either rest a) -> Either rest a </code></pre></div></div> The big difference here is that the <code class="language-plaintext highlighter-rouge">small</code> type of problem has been handled, and we’re only left with the <code class="language-plaintext highlighter-rouge">rest</code>. Using Constraint Plucking, this signature is simply: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>catch :: Either (Either small rest) a -> (small -> Either rest a) -> Either rest a </code></pre></div></div> Since <code class="language-plaintext highlighter-rouge">catch</code> is our Big Problem with the output, we should get to our Big Problem with the input by taking the dual of this function. Taking the dual means flipping the arrows and replacing sums with products. 
(Real category theory experts will yell at me for this, I Will Not Log Off, but I will accept a PR linking to a better explanation). <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Provide alias names to help with flipping the arrows catch0 :: start -> handler -> result -- Let's do `start` now. We'll just convert Either to Tuple. start ~ Either (Either small rest) a start0 ~ ( (small, rest), a ) -- Handler: handler ~ (small -> Either rest a) -- Flip arrow handler0 ~ (Either rest a -> small) -- Tuplize handler1 ~ ((rest, a) -> small) -- Result result ~ Either rest a -- Tuplize result ~ (rest, a) cocatch0 :: start -> handler -> result -- substitute our named expressions cocatch :: ((small, rest), a) -> ((rest, a) -> small) -> (rest, a) cocatch ((small, rest), a) k = (rest, a) </code></pre></div></div> Huh. This is totally useless. I suspect I have performed the flipping of arrows incorrectly. <a href="https://blog.ezyang.com/2010/07/flipping-arrows-in-coburger-king/">Edward Yang</a> has a really good post on category theory and flipping arrows, so I’ll read that and come back to this. <h1 id="monads">Monads?</h1> <code class="language-plaintext highlighter-rouge">catch</code> is really like a <code class="language-plaintext highlighter-rouge">bind</code> on the <code class="language-plaintext highlighter-rouge">e</code> parameter of <code class="language-plaintext highlighter-rouge">Either e a</code>. Just compare the signatures: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bind :: Either e a -> (a -> Either e a) -> Either e a catch :: Either e a -> (e -> Either e a) -> Either e a </code></pre></div></div> This makes me think: <code class="language-plaintext highlighter-rouge">Monad</code> is the wrong approach. I want to look at the equivalent <code class="language-plaintext highlighter-rouge">Comonad</code>. 
The dual of <code class="language-plaintext highlighter-rouge">Either e a</code> is <code class="language-plaintext highlighter-rouge">(e, a)</code> - the <code class="language-plaintext highlighter-rouge">Env</code> comonad. So let’s look at <code class="language-plaintext highlighter-rouge">cobind</code> on <code class="language-plaintext highlighter-rouge">Env</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cobind  :: Env e a -> (Env e a -> b) -> Env e b
cocatch :: Env e a -> (Env e a -> x) -> Env x a
</code></pre></div></div> Now we’re getting somewhere. Let’s rewrite as a tuple: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cocatch :: (e, a) -> ((e, a) -> x) -> (x, a)
cocatch (e, a) k = (k (e, a), a)
</code></pre></div></div> Hmm. Let’s curry that second argument. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cocatch :: (e, a) -> (e -> a -> x) -> (x, a)
cocatch (e, a) k = (k e a, a)
</code></pre></div></div> Looks a lot like <code class="language-plaintext highlighter-rouge">censor</code> from <code class="language-plaintext highlighter-rouge">MonadWriter</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>censor :: (w -> w) -> Writer w a -> Writer w a
censor f (Writer x) = Writer $ cocatch x (\w _ -> f w)
</code></pre></div></div> It’s really a more specialized variant of <code class="language-plaintext highlighter-rouge">mapWriter</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cocatch w f = mapWriter (\(e, a) -> (f e a, a)) w
</code></pre></div></div> Well, that’s interesting, but it doesn’t answer my question. 
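As a quick sanity check, here is a sketch of the curried <code class="language-plaintext highlighter-rouge">cocatch</code> next to a <code class="language-plaintext highlighter-rouge">censor</code> built from it, in a form that loads in GHCi. <code class="language-plaintext highlighter-rouge">W</code> and <code class="language-plaintext highlighter-rouge">censorW</code> are toy names assumed for the example, with the log in the first slot of the tuple to match the <code class="language-plaintext highlighter-rouge">(e, a)</code> shape above. The <code class="language-plaintext highlighter-rouge">Writer</code> in <code class="language-plaintext highlighter-rouge">transformers</code> stores its pair the other way around, as <code class="language-plaintext highlighter-rouge">(a, w)</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Replace the first component using a function of both components.
cocatch :: (e, a) -> (e -> a -> x) -> (x, a)
cocatch (e, a) k = (k e a, a)

-- Toy writer (assumption): log first, value second.
newtype W w a = W { runW :: (w, a) }

-- censor only looks at the log, so it ignores the value argument.
censorW :: (w -> w) -> W w a -> W w a
censorW f (W x) = W (cocatch x (\w _ -> f w))

-- A small value to try it on.
sample :: W String Int
sample = W ("abc", 1)
</code></pre></div></div> Unlike <code class="language-plaintext highlighter-rouge">censorW</code>, the general <code class="language-plaintext highlighter-rouge">cocatch</code> is allowed to change the type of the first component, which is the part that mirrors how <code class="language-plaintext highlighter-rouge">catch</code> changes the error type. 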
The dual of the <code class="language-plaintext highlighter-rouge">Either</code> monad is the <code class="language-plaintext highlighter-rouge">Env</code> comonad, which is equivalent to the <code class="language-plaintext highlighter-rouge">Writer</code> monad, not the <code class="language-plaintext highlighter-rouge">Reader</code> monad. Maybe I need to get back to the <code class="language-plaintext highlighter-rouge">Profunctor</code> approach - inspect why the contravariant part of the functor is OK to be too big. <h1 id="inputs-and-outputs">Inputs and Outputs</h1> With <code class="language-plaintext highlighter-rouge">App r e a</code>, we have two output types and an input type. We have a few tools for working on these type parameters. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fmap  :: (a -> b) -> App r e a -> App r e b
fmapL :: (e -> f) -> App r e a -> App r f a
local :: (r -> x) -> App x e a -> App r e a
</code></pre></div></div> <code class="language-plaintext highlighter-rouge">local</code> is like our <code class="language-plaintext highlighter-rouge">contramap</code> function, but it won’t work because the kinds aren’t right. We can introduce effectful variants: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bind  :: (a -> App r e b) -> App r e a -> App r e b
catch :: (e -> App r f a) -> App r e a -> App r f a
what  :: (r -> App x e a) -> App x e a -> App r e a
</code></pre></div></div> Okay, <code class="language-plaintext highlighter-rouge">what</code> has my interest. It can’t be defined. We never have an <code class="language-plaintext highlighter-rouge">x</code>, so there’s no way we can discharge it. 
Maybe we can translate this and flip arrows slightly differently… <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- unwrap the newtypes: local :: (r -> x) -> (x -> Either e a) -> (r -> Either e a) localM :: (r -> x -> Either e a) -> (x -> Either e a) -> (r -> Either e a) </code></pre></div></div> Okay, this is obviously wrong. The <code class="language-plaintext highlighter-rouge">x</code> is in the wrong spot! We’re not supposed to be accepting an <code class="language-plaintext highlighter-rouge">x</code> as input, we’re supposed to be producing one. Let’s try that again. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>localM :: (r -> Either e x) -- App r e x -> (x -> Either e a) -- App x e a -> (r -> Either e a) -- App r e a </code></pre></div></div> This looks much more feasible. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>localM :: App r e x -> App x e a -> App r e a localM mkX withX = do x <- mkX localApp (\_ -> x) withX </code></pre></div></div> Well, this is a bit of a weird one. If we can produce an <code class="language-plaintext highlighter-rouge">x</code> from an <code class="language-plaintext highlighter-rouge">App r e</code>, then we can run an <code class="language-plaintext highlighter-rouge">App x e a</code> action into <code class="language-plaintext highlighter-rouge">App r e a</code>. Plucking with <code class="language-plaintext highlighter-rouge">catch</code> is about incrementally removing constraints that add cases to a type. So plucking with <code class="language-plaintext highlighter-rouge">localM</code> is about incrementally adding types to the environment product. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>localMPluck :: App r e x -> App (x, r) e a -> App r e a localMPluck mkX withXR = do x <- mkX localApp (\r -> (x, r)) withXR </code></pre></div></div> The tuple type works OK for a constraint plucking interface, but I don’t really like it, so let’s define a nested product. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data a :* b = a :* b infixr 7 :* </code></pre></div></div> The pattern for a plucking interface is to write a class that can delegate to a type parameter. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class Has t env where get :: env -> t instance Has x x where get = id instance {-# overlapping #-} Has x (x :* y) where get (a :* _) = a instance {-# overlappable #-} Has x y => Has x (a :* y) where get (_ :* b) = get b </code></pre></div></div> And, let’s give it a nicer name - <code class="language-plaintext highlighter-rouge">provide</code>. It’s providing some new bit of information to the environment. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>provide :: App r e new -> App (new :* r) e a -> App r e a </code></pre></div></div> <h1 id="a-man-provides">A Man Provides</h1> Let’s run <code class="language-plaintext highlighter-rouge">App</code> into <code class="language-plaintext highlighter-rouge">IO</code>. We’ll guarantee via <code class="language-plaintext highlighter-rouge">RankNTypes</code> that it doesn’t need anything and doesn’t throw anything. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>runApp :: (forall r e. 
App r e a) -> IO a runApp action = do eitherVoidA <- runExceptT $ runReaderT (unApp action) () case eitherVoidA of Left v -> absurd v Right a -> pure a </code></pre></div></div> Since we’re requiring a value that can work with any <code class="language-plaintext highlighter-rouge">r</code>, we can provide a <code class="language-plaintext highlighter-rouge">()</code> value. And, likewise, since we’re requiring that the <code class="language-plaintext highlighter-rouge">e</code> error type is any type we want, we can select <code class="language-plaintext highlighter-rouge">Void</code>. We get a guarantee that all checked exceptions are handled and we don’t need anything from the environment. Now, let’s use this stuff. We’re going to need a logger, first of all. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Logger = Logger mkLogger :: App r e Logger mkLogger = pure Logger logInfo :: (Has Logger r) => String -> App r e () logInfo msg = do Logger <- asks get liftIO $ putStrLn msg main :: IO () main = do runApp $ do logInfo "hello" </code></pre></div></div> Our <code class="language-plaintext highlighter-rouge">main</code> fails with an error: <code class="language-plaintext highlighter-rouge">No instance for (Has Logger r)</code>. So we need to provide one. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main :: IO () main = do runApp $ do provide mkLogger $ do logInfo "asdf" </code></pre></div></div> We can pass <code class="language-plaintext highlighter-rouge">provide mkLogger</code> to <code class="language-plaintext highlighter-rouge">runApp</code> because <code class="language-plaintext highlighter-rouge">mkLogger</code> has no requirements on the <code class="language-plaintext highlighter-rouge">e</code> or <code class="language-plaintext highlighter-rouge">r</code> types. Now, let’s make a database handle. This one is going to require logging. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data DbHandle = DbHandle

mkDbHandle :: (Has Logger r) => App r e DbHandle
mkDbHandle = do
    logInfo "making postgres handle"
    pure DbHandle

getUserIds :: (Has DbHandle r) => App r e [Int]
getUserIds = do
    DbHandle <- asks get
    pure [1,2,3]
</code></pre></div></div> We can’t call this next to <code class="language-plaintext highlighter-rouge">logInfo</code> above, because we haven’t <code class="language-plaintext highlighter-rouge">provide</code>d it. The following code block fails with an error <code class="language-plaintext highlighter-rouge">No instance for (Has DbHandle r)</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main :: IO ()
main = do
    runApp $ do
        provide mkLogger $ do
            logInfo "asdf"
            ids <- getUserIds
            forM_ ids $ \id -> do
                logInfo (show id)
</code></pre></div></div> We can fix it by providing one: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main :: IO ()
main = do
    runApp $ do
        provide mkLogger $ do
            logInfo "asdf"
            provide mkDbHandle $ do
                ids <- getUserIds
                forM_ ids $ \id -> do
                    logInfo (show id)
</code></pre></div></div> This works just fine. What about throwing errors? <h1 id="plucking-errors">Plucking Errors</h1> We need to pluck a sum type. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data a || b = This a | That b

class lil :< big where
    inject :: lil -> big
    project :: big -> Maybe lil

instance lil :< lil where
    inject = id
    project = Just

instance {-# overlapping #-} lil :< (lil || rest) where
    inject = This
    project x = case x of
        This a -> Just a
        _ -> Nothing

instance {-# overlappable #-} (lil :< rest) => lil :< (not || rest) where
    inject = That . inject
    project x = case x of
        This _ -> Nothing
        That a -> project a

throw :: lil :< big => lil -> App r big a
throw = throwError . 
    inject

catch
    :: App r (lil || rest) a
    -> (lil -> App r rest a)
    -> App r rest a
catch action handler =
    ReaderT $ \r ->
        runReaderT action r `catchE` \lilOrRest ->
            case lilOrRest of
                This lil ->
                    runReaderT (handler lil) r
                That rest ->
                    throwE rest
</code></pre></div></div> Let’s suppose that creating a database handle actually has a <code class="language-plaintext highlighter-rouge">DbExn</code> exception associated with it. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data DbExn = DbExn

mkDbHandle :: (Has Logger r, DbExn :< e) => App r e DbHandle
mkDbHandle = do
    logInfo "making postgres handle"
    if 3 == 4
        then pure DbHandle
        else throw DbExn
</code></pre></div></div> Now, our <code class="language-plaintext highlighter-rouge">main</code> no longer compiles! GHC complains about <code class="language-plaintext highlighter-rouge">No instance for (DbExn :< e) arising from a use of mkDbHandle</code>. So we need to discharge that exception in the above block. We’ll define <code class="language-plaintext highlighter-rouge">handle</code> as a shorthand (<code class="language-plaintext highlighter-rouge">handle = flip catch</code>), and we can discharge the error: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main :: IO ()
main = do
    runApp $ do
        provide mkLogger $ do
            logInfo "asdf"
            handle (\DbExn -> logInfo "uh oh") $
                provide mkDbHandle $ do
                    ids <- getUserIds
                    forM_ ids $ \id -> do
                        logInfo (show id)
</code></pre></div></div> This now compiles. But I suspect the <code class="language-plaintext highlighter-rouge">handle f . provide mk</code> pattern is common enough to factor it out. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>providing
    :: (lil -> App r rest a)
    -> App r (lil || rest) x
    -> App (x :* r) (lil || rest) a
    -> App r rest a
providing handler provider =
    handle handler . 
        provide provider

main :: IO ()
main = do
    runApp $ do
        provide mkLogger $ do
            logInfo "asdf"
            providing (\DbExn -> logInfo "uh oh") mkDbHandle $ do
                ids <- getUserIds
                forM_ ids $ \id -> do
                    logInfo (show id)
</code></pre></div></div> Neat! <h1 id="wait-where-were-we">Wait, where were we?</h1> Right. Right. I’m trying to figure out why a big type for an error is a Problem, but a big type for an environment isn’t. In doing so, I figured out how to make composable, growing environments that play nicely with constraints. If I don’t have a composable, growing environment that plays nicely with constraints, then what do I have? Usually just a big <code class="language-plaintext highlighter-rouge">AppEnv</code> type that has everything I ever need. Functions may be defined in terms of constraints, but usually are just defined in: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype App a = App { unApp :: ReaderT AppEnv IO a }
</code></pre></div></div> When do I need to get away from <code class="language-plaintext highlighter-rouge">App</code>? Only when I want to use a function in another context. It can be frustrating to provide an entire <code class="language-plaintext highlighter-rouge">AppEnv</code> when I only need a database handle. But it rarely bites me in a way that is frustrating. Why? Ignoring inputs is safe. But ignoring outputs is dangerous. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fine :: (Int, String) -> Int
fine (i, _) = i

bad :: Either Int String
bad = Left 2
</code></pre></div></div> <code class="language-plaintext highlighter-rouge">fine</code> throws away the input. It’s wasteful, but it’s not lying. 
<code class="language-plaintext highlighter-rouge">bad</code>, however, encodes partiality - it says it might have an <code class="language-plaintext highlighter-rouge">Int</code> or a <code class="language-plaintext highlighter-rouge">String</code>, but it definitely doesn’t have a <code class="language-plaintext highlighter-rouge">String</code>. It’s not lying, or even wrong, it’s just not helpful. If you want to use the <code class="language-plaintext highlighter-rouge">Int</code> inside, you gotta handle the <code class="language-plaintext highlighter-rouge">Right String</code> case. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>useBad :: (String -> Int) -> Int
useBad handleString = either id handleString bad
</code></pre></div></div> Now, <code class="language-plaintext highlighter-rouge">fine</code> has a problem: it’s too specific about its requirements. If I want to call <code class="language-plaintext highlighter-rouge">fine</code> with the result of <code class="language-plaintext highlighter-rouge">useBad</code>, then I have to come up with a <code class="language-plaintext highlighter-rouge">String</code> from somewhere. I don’t know whether it actually uses the <code class="language-plaintext highlighter-rouge">String</code>, so I have no idea if it matters what <code class="language-plaintext highlighter-rouge">String</code> I use. We can make <code class="language-plaintext highlighter-rouge">fine</code>’s type more precise: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fine :: (Int, unused) -> Int
fine (i, _) = i
</code></pre></div></div> Now, since <code class="language-plaintext highlighter-rouge">unused</code> is a type variable that I get to pick, I can pass <code class="language-plaintext highlighter-rouge">()</code> to satisfy the type checker. 
Likewise, we can refine <code class="language-plaintext highlighter-rouge">bad</code>’s type to make it more specific to what it has: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bad :: Either Int Void
bad = Left 2
</code></pre></div></div> Now we know we’re never getting a <code class="language-plaintext highlighter-rouge">Right</code> out of it, so we don’t have to worry about it. Our calling code is much simpler: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>useBad :: Int
useBad = either id absurd bad
</code></pre></div></div> <h1 id="getting-vs-handling">Getting vs Handling</h1> It’s a question of “Where to get?” vs “How to handle?” If you know the inputs to a function, you need to provide all of them. If some part of the input is unnecessary, you may perform extra work providing it (only for it to be thrown away). If you know the outputs of your function, you need to handle all of them. If some part of the output is unnecessary, you may perform extra work handling it (only for the code path to never be used). It’s less consequential if we’re sloppy about our inputs. Computational waste usually isn’t that expensive. But we care much more about the correctness and shape of our outputs. I’m still not satisfied with what I’ve covered here. I think there’s a lot more to this. 
Tue, 27 Oct 2020 00:00:00 +0000 https://www.parsonsmatt.org/2020/10/27/plucking_in_plucking_out.html https://www.parsonsmatt.org/2020/10/27/plucking_in_plucking_out.html Unpack your Existentials I recently wrote a library <a href="https://www.stackage.org/nightly-2020-10-13/package/prairie-0.0.1.0"><code class="language-plaintext highlighter-rouge">prairie</code></a> to have “First Class Record Fields.” The overall gist is that I wanted a pair of functions: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>diffRecord :: Record rec => rec -> rec -> [Update rec] updateRecord :: Record rec => [Update rec] -> rec -> rec </code></pre></div></div> I want to be able to send these <code class="language-plaintext highlighter-rouge">[Update rec]</code> over the wire, so they needed to be serializable. The design choice I ended up with borrowed the <code class="language-plaintext highlighter-rouge">EntityField</code> idea from <code class="language-plaintext highlighter-rouge">persistent</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class Record rec where data Field rec a recordFieldLens :: Field rec a -> Lens' rec a </code></pre></div></div> <code class="language-plaintext highlighter-rouge">Field</code> is an associated data family. Instances will be a <code class="language-plaintext highlighter-rouge">GADT</code> where the second type parameter indicates the type of the field. The <a href="https://www.stackage.org/haddock/nightly-2020-10-13/prairie-0.0.1.0/Prairie-Class.html">documentation for <code class="language-plaintext highlighter-rouge">Prairie.Class</code></a> has a good explainer on what’s going on. That’s where I ended up, but the story on how I got there is interesting too. 
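To make the shape of this class concrete, here is a minimal sketch of an instance, assuming a hypothetical two-field <code class="language-plaintext highlighter-rouge">User</code> record. The <code class="language-plaintext highlighter-rouge">Lens'</code> type and the <code class="language-plaintext highlighter-rouge">lens</code>, <code class="language-plaintext highlighter-rouge">view</code>, and <code class="language-plaintext highlighter-rouge">set</code> helpers are defined locally so the example needs only <code class="language-plaintext highlighter-rouge">base</code>. A real project would more likely import them from a lens library, and the actual <code class="language-plaintext highlighter-rouge">prairie</code> class may differ from the simplified one shown above.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- A van Laarhoven lens, defined locally (assumption: stands in for a lens library).
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

-- Build a lens from a getter and a setter.
lens :: (s -> a) -> (s -> a -> s) -> Lens' s a
lens get put f s = fmap (put s) (f (get s))

-- Read the focused value.
view :: Lens' s a -> s -> a
view l = getConst . l Const

-- Replace the focused value.
set :: Lens' s a -> a -> s -> s
set l a = runIdentity . l (const (Identity a))

-- The simplified class from the text.
class Record rec where
    data Field rec a
    recordFieldLens :: Field rec a -> Lens' rec a

-- Hypothetical record, used only for illustration.
data User = User
    { userName :: String
    , userAge :: Int
    }
    deriving (Eq, Show)

instance Record User where
    -- One constructor per field; the second index is the field's type.
    data Field User a where
        UserName :: Field User String
        UserAge :: Field User Int

    -- Matching on the constructor tells GHC what `a` is in each branch.
    recordFieldLens field = case field of
        UserName -> lens userName (\u n -> u { userName = n })
        UserAge -> lens userAge (\u n -> u { userAge = n })
```

With this in place, <code class="language-plaintext highlighter-rouge">set (recordFieldLens UserAge) 31 user</code> updates only the age, and the type of the new value is determined by the field constructor.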
<h2 id="packing-gadts">Packing GADTs</h2> The definition for <code class="language-plaintext highlighter-rouge">Update</code> is best represented as a <code class="language-plaintext highlighter-rouge">GADT</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Update rec where
    SetField :: Field rec a -> a -> Update rec

updateRecord :: Record rec => [Update rec] -> rec -> rec
updateRecord upds rec = foldr updateField rec upds
  where
    updateField (SetField field newVal) =
        set (recordFieldLens field) newVal
</code></pre></div></div> The type <code class="language-plaintext highlighter-rouge">a</code> is existential - we’ve hidden it from view. Matching on the <code class="language-plaintext highlighter-rouge">Field</code> type will expose it for use within the scope of the pattern match. So let’s write a <code class="language-plaintext highlighter-rouge">ToJSON</code> instance for an <code class="language-plaintext highlighter-rouge">Update</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance ToJSON (Update rec) where
    toJSON (SetField field newVal) =
        object
            [ "field" .= field
            , "value" .= newVal
            ]
</code></pre></div></div> This fails because we need a <code class="language-plaintext highlighter-rouge">ToJSON (Field rec a)</code>, and also because we need <code class="language-plaintext highlighter-rouge">ToJSON a</code>. We can use <code class="language-plaintext highlighter-rouge">QuantifiedConstraints</code> to get the first: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance (forall a. ToJSON (Field rec a)) => ToJSON (Update rec) where
    toJSON (SetField field newVal) =
        object
            [ "field" .= field
            , "value" .= newVal
            ]
</code></pre></div></div> But - how do we get the <code class="language-plaintext highlighter-rouge">ToJSON a</code> instance? GADTs allow us to pack up existential types. 
More powerfully, they allow us to pack up constraints. So let’s just paste it in: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Update rec where
    SetField :: ToJSON a => Field rec a -> a -> Update rec
</code></pre></div></div> Our code now compiles. Hooray! Now, let’s write <code class="language-plaintext highlighter-rouge">FromJSON</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance FromJSON (Update rec) where
    parseJSON = withObject "Update" $ \o -> do
        field <- o .: "field"
        newVal <- o .: "value"
        pure $ SetField field newVal
</code></pre></div></div> GHC complains about this. We can use <code class="language-plaintext highlighter-rouge">QuantifiedConstraints</code> again to get the <code class="language-plaintext highlighter-rouge">Field</code> parsing. But how are we going to know what type the <code class="language-plaintext highlighter-rouge">value</code> needs to be? We need to bring the type into scope. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance (forall a. FromJSON (Field rec a)) => FromJSON (Update rec) where
    parseJSON = withObject "Update" $ \o -> do
        o .: "field" >>= \(field :: Field rec a) -> do
            newVal <- o .: "value" :: Parser a
            pure $ SetField field newVal
</code></pre></div></div> GHC still isn’t happy. We don’t have an instance of <code class="language-plaintext highlighter-rouge">FromJSON a</code> in scope. We don’t have an instance of <code class="language-plaintext highlighter-rouge">ToJSON a</code> in scope, either! GHC is complaining about <code class="language-plaintext highlighter-rouge">ToJSON</code> because the <code class="language-plaintext highlighter-rouge">SetField</code> constructor requires the <code class="language-plaintext highlighter-rouge">ToJSON a</code> constraint to work. This appears to be a bad road. 
Writing an instance of <code class="language-plaintext highlighter-rouge">Show (Update rec)</code> is going to require that we pack another constraint <code class="language-plaintext highlighter-rouge">Show a</code> in the GADT. And <code class="language-plaintext highlighter-rouge">Eq (Update rec)</code> will require even more constraints. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance (forall a. Eq (Field rec a)) => Eq (Update rec) where SetField field0 val0 == SetField field1 val1 = field0 == field1 && val0 == val1 </code></pre></div></div> This code doesn’t work because we have no idea what the types are! But we can make it work by packing some more constraints in the GADT. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Update rec where SetField :: (Eq a, Typeable a, ToJSON a, FromJSON a, Show a) => Field rec a -> a -> Update rec </code></pre></div></div> Now, we can write that <code class="language-plaintext highlighter-rouge">Eq</code> instance. Because we’re comparing things with potentially different types, we need to compare their types first using the <code class="language-plaintext highlighter-rouge">Typeable</code> interface. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance (forall a. Eq (Field rec a)) => Eq (Update rec) where sf0 == sf1 = case sf0 of SetField (f0 :: Field rec a) (v0 :: a) -> case sf1 of SetField (f1 :: Field rec b) (v1 :: b) -> case eqT @a @b of Just Refl -> f0 == f1 && v0 == v1 Nothing -> False </code></pre></div></div> Well. This works, but dang is it ugly. Look at the constructor to <code class="language-plaintext highlighter-rouge">Update</code> - it has so many constraints! And, what’s worse, we’re going to need a new constraint packed in there for every single class we want to write an instance. Gross! Fortunately, there’s a better way. 
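Before moving on, the type-comparison step can be seen in isolation. The following is a small sketch using a hypothetical <code class="language-plaintext highlighter-rouge">SomeValue</code> wrapper (a stand-in for <code class="language-plaintext highlighter-rouge">Update</code>, not part of any library) that packs only <code class="language-plaintext highlighter-rouge">Eq</code> and <code class="language-plaintext highlighter-rouge">Typeable</code>. It shows why both constraints are needed: <code class="language-plaintext highlighter-rouge">eqT</code> proves the two hidden types are the same, and only then is <code class="language-plaintext highlighter-rouge">==</code> well-typed.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeOperators #-}

import Data.Typeable (Typeable, eqT, (:~:) (Refl))

-- Hypothetical existential wrapper: the type `a` is hidden,
-- and the constraints are packed into the constructor.
data SomeValue where
    SomeValue :: (Eq a, Typeable a) => a -> SomeValue

instance Eq SomeValue where
    -- The pattern signatures name the two hidden types so that
    -- they can be passed to `eqT` as type applications.
    SomeValue (x :: a) == SomeValue (y :: b) =
        case eqT @a @b of
            -- Matching on `Refl` teaches GHC that `a ~ b`,
            -- so comparing `x` and `y` type checks.
            Just Refl -> x == y
            -- Different types are never equal.
            Nothing -> False
```

Every additional class we want to use on the hidden value would need another constraint on the <code class="language-plaintext highlighter-rouge">SomeValue</code> constructor, which is the scaling problem described above.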
<h2 id="fielddict"><code class="language-plaintext highlighter-rouge">FieldDict</code></h2> We want the ability to say: <blockquote> I have a <code class="language-plaintext highlighter-rouge">Field rec a</code>, and I want to have an instance of <code class="language-plaintext highlighter-rouge">ToJSON a</code> in scope. </blockquote> Or, if we generalize, <blockquote> I have a <code class="language-plaintext highlighter-rouge">Field rec a</code>, and I want to have an instance of <code class="language-plaintext highlighter-rouge">c a</code> in scope. </blockquote> That sounds an awful lot like a function: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fieldDict :: forall (c :: Type -> Constraint) (rec :: Type) (a :: Type) . Field rec a -> c a </code></pre></div></div> Except that doesn’t work. We can’t have a <code class="language-plaintext highlighter-rouge">Constraint</code> as the return type of a function. But we can have a <code class="language-plaintext highlighter-rouge">Dict</code>. <code class="language-plaintext highlighter-rouge">Dict</code> is a GADT from the <a href="https://hackage.haskell.org/package/constraints-0.12/docs/Data-Constraint.html"><code class="language-plaintext highlighter-rouge">constraints</code></a> package. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Dict c where Dict :: c => Dict c </code></pre></div></div> Whenever we unpack a <code class="language-plaintext highlighter-rouge">Dict c</code>, we get the constraint <code class="language-plaintext highlighter-rouge">c</code> in scope. So we might have a <code class="language-plaintext highlighter-rouge">Dict (Eq Int)</code>. And unpacking that will bring <code class="language-plaintext highlighter-rouge">Eq Int</code> into scope. 
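As a small illustration of packing and unpacking, here is a sketch that defines <code class="language-plaintext highlighter-rouge">Dict</code> locally (so it does not depend on the <code class="language-plaintext highlighter-rouge">constraints</code> package) along with two hypothetical helpers, <code class="language-plaintext highlighter-rouge">eqIntDict</code> and <code class="language-plaintext highlighter-rouge">sameWith</code>. Note that <code class="language-plaintext highlighter-rouge">sameWith</code> has no <code class="language-plaintext highlighter-rouge">Eq a</code> constraint in its signature: the instance arrives as an ordinary value.

```haskell
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

import Data.Kind (Constraint)

-- Local copy of `Dict`, mirroring the definition shown in the text.
data Dict (c :: Constraint) where
    Dict :: c => Dict c

-- Packing: `Eq Int` is solved here and stored inside the constructor.
eqIntDict :: Dict (Eq Int)
eqIntDict = Dict

-- Unpacking: matching on `Dict` brings `Eq a` into scope,
-- which is what allows the use of `==` on the right-hand side.
sameWith :: Dict (Eq a) -> a -> a -> Bool
sameWith Dict x y = x == y
```

Calling <code class="language-plaintext highlighter-rouge">sameWith eqIntDict 3 3</code> compares two <code class="language-plaintext highlighter-rouge">Int</code>s using the instance carried by the dictionary value.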
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fieldDict :: Field rec a -> Dict (c a) </code></pre></div></div> Now our types and kinds line up. We might use this like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>case fieldDict someField of Dict :: Dict (Eq a)-> view (recordFieldLens someField) old == view (recordFieldLens someField) new </code></pre></div></div> The constraint <code class="language-plaintext highlighter-rouge">Eq a</code> is in scope there, so we can use <code class="language-plaintext highlighter-rouge">==</code> with it. We can write it in a “continuation passing style” to avoid needing to <code class="language-plaintext highlighter-rouge">case</code> on the GADT every time. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>withFieldDict :: (c a) => Field rec a -> (c a => r) -> r withFieldDict f k = case fieldDict f of Dict :: Dict (c a) -> k </code></pre></div></div> Unfortunately, this isn’t doing anything for us. We need the <code class="language-plaintext highlighter-rouge">c a</code> dictionary in scope to even call this, which means we’re not gleaning any additional information with the <code class="language-plaintext highlighter-rouge">Dict</code>. We’ll need to replace the <code class="language-plaintext highlighter-rouge">c a</code> constraint with something that’ll allow us to communicate what we need. We’re replacing a <code class="language-plaintext highlighter-rouge">c a :: Constraint</code>, which suggests we need a type class of our own. The whole problem is that we don’t know how to refer to <code class="language-plaintext highlighter-rouge">a</code> except through the GADT field, and we keep the <code class="language-plaintext highlighter-rouge">rec</code> parameter around. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class (Record rec) => FieldDict (c :: Type -> Constraint) (rec :: Type) where fieldDict :: Field rec a -> Dict (c a) withFieldDict :: forall c rec a r. FieldDict c rec => Field rec a -> (c a => r) -> r withFieldDict field cont = case fieldDict @c field of Dict :: Dict (c a) -> cont </code></pre></div></div> This compiles. Defining instances is relatively straightforward. We write a pattern match on the field type and return <code class="language-plaintext highlighter-rouge">Dict</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance FieldDict Eq User where fieldDict field = case field of UserName -> Dict UserAge -> Dict UserId -> Dict instance FieldDict Ord User where fieldDict field = case field of UserName -> Dict UserAge -> Dict UserId -> Dict instance FieldDict Show User where fieldDict field = case field of UserName -> Dict UserAge -> Dict UserId -> Dict instance FieldDict ToJSON User where fieldDict field = case field of UserName -> Dict UserAge -> Dict UserId -> Dict </code></pre></div></div> Uh oh. This is boring. Can we do better? Yes! We can actually be generic over the constraint, provided that it holds for all the types. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance (c String, c Int) => FieldDict c User where fieldDict field = case field of UserName -> Dict :: Dict (c String) UserId -> Dict :: Dict (c Int) UserAge -> Dict :: Dict (c Int) </code></pre></div></div> The type signatures are unnecessary - they’re just there to show that you are returning different things. <h2 id="back-to-the-json">Back to the JSON</h2> Alright, let’s rewrite our <code class="language-plaintext highlighter-rouge">ToJSON</code> instance using <code class="language-plaintext highlighter-rouge">FieldDict</code>. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Update rec where
    SetField :: Field rec a -> a -> Update rec

instance
    ( forall a. ToJSON (Field rec a)
    , FieldDict ToJSON rec
    )
  =>
    ToJSON (Update rec)
  where
    toJSON (SetField field newVal) =
        withFieldDict @ToJSON field $
            object
                [ "field" .= field
                , "value" .= newVal
                ]
</code></pre></div></div> This works. What about parsing? <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance
    ( forall a. FromJSON (Field rec a)
    , FieldDict FromJSON rec
    )
  =>
    FromJSON (Update rec)
  where
    parseJSON = withObject "Update" $ \o -> do
        field <- o .: "field"
        case field of
            (_ :: Field rec a) ->
                withFieldDict @FromJSON field $ do
                    val <- o .: "value"
                    pure (SetField field val)
</code></pre></div></div> This works too! If you find yourself packing your existentials to great dismay, consider this approach instead. <h3 id="aside">Aside…</h3> I’ve got a whole chapter dedicated to the design and implementation of the <code class="language-plaintext highlighter-rouge">prairie</code> record library in my book <a href="https://leanpub.com/production-haskell">Production Haskell</a>. If you liked what you read here, you’ll probably enjoy the rest of the book. Tue, 13 Oct 2020 00:00:00 +0000 https://www.parsonsmatt.org/2020/10/13/unpack_your_existentials.html https://www.parsonsmatt.org/2020/10/13/unpack_your_existentials.html Production Haskell Alpha Release I’m thrilled to announce that my book <a href="https://leanpub.com/production-haskell">Production Haskell</a> is released in alpha version. The first release has 240 pages of content, with much much more to come. If you want to <a href="https://leanpub.com/production-haskell">buy the book or sign up to receive updates, click here</a>. 
Wed, 07 Oct 2020 00:00:00 +0000 https://www.parsonsmatt.org/2020/10/07/production_haskell_alpha_release.html https://www.parsonsmatt.org/2020/10/07/production_haskell_alpha_release.html Quick Memory Trick So, Haskell has an amazing potential for writing correct code, but sometimes it doesn’t leave an obvious way to remember things. With record or product types, we don’t usually write direct pattern matches to access fields. You’re much more likely to see field labels as accessor functions, or an extension like <code class="language-plaintext highlighter-rouge">RecordWildCards</code> or <code class="language-plaintext highlighter-rouge">NamedFieldPuns</code>, or occasionally a field match. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data User = User { userName :: String , userAge :: Int , userAdmin :: Bool } -- a direct pattern match, rare: showUser :: User -> String showUser (User name age _admin) = name ++ "(age: " ++ show age ++ ")" -- function accessor: showUser :: User -> String showUser user = userName user ++ "(age: " ++ show (userAge user) ++ ")" -- field pattern match: showUser :: User -> String showUser User { userName = name, userAge = age } = name ++ "(age: " ++ show age ++ ")" -- RecordWildCards showUser :: User -> String showUser User {..} = userName ++ "(age: " ++ show userAge ++ ")" -- NamedFieldPuns showUser :: User -> String showUser User { userName, userAge } = userName ++ "(age: " ++ show userAge ++ ")" </code></pre></div></div> When you write with the direct pattern match, you have to match every single field. In <code class="language-plaintext highlighter-rouge">showUser</code>, we had to match on the <code class="language-plaintext highlighter-rouge">_admin</code> field, even though we weren’t using it. If you ever add or remove a field, you have to modify the pattern match, even if the modified fields are irrelevant to the function. 
This causes noisy diffs and busywork that doesn’t get things done. Accessor functions are super flexible - if we decide we want to refactor the <code class="language-plaintext highlighter-rouge">User</code> type to instead contain a first and last name, the field <code class="language-plaintext highlighter-rouge">userName</code> can be converted into a function that concatenates the two fields. This is like using methods instead of field access in Object Oriented languages. But often times the “field pattern matches” are more convenient, and they’re just as resilient to modifications of irrelevant fields. If a record type has fields that are irrelevant to the function, then this is a safe and reasonable choice. Sometimes, they’re so convenient that you want to use RecordWildCards to pattern match all the fields out, and consume all of them. Is there a way to get the convenience of <code class="language-plaintext highlighter-rouge">RecordWildCards</code> and the safety of knowing that modifying the type will cause a compile error? <h1 id="yes">Yes!</h1> I call it the “Undefined Pattern Match” trick. This is a good way to check that all fields are accounted for in a codebase, even if the record isn’t being pattern matched. I discovered the trick because we have a number of places in the work codebase where adding or removing a field must be accounted for in a way that isn’t tracked in the types, or by pattern matching on a relevant value. 
In any function you’re defining that requires attention when a constructor changes, write the following: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>userFields :: [UserField]
userFields =
    [ ( "name", SomeField userName )
    , ( "age", SomeField userAge )
    , ( "admin", SomeField userAdmin )
    ]
  where
    User _ _ _ = undefined
</code></pre></div></div> Now, if I go to add a field to <code class="language-plaintext highlighter-rouge">User</code> (or remove one), then this function will cause a compile error. I am reminded that I need to update this definition. You can also use <code class="language-plaintext highlighter-rouge">error</code> to attach a note: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>userFields :: [UserField]
userFields =
    [ ( "name", SomeField userName )
    , ( "age", SomeField userAge )
    , ( "admin", SomeField userAdmin )
    ]
  where
    User _ _ _ = error "Don't forget to update the fields"
</code></pre></div></div> Actually, since <code class="language-plaintext highlighter-rouge">undefined</code> can be used at any type, you can leave a note with it: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>userFields :: [UserField]
userFields =
    [ ( "name", SomeField userName )
    , ( "age", SomeField userAge )
    , ( "admin", SomeField userAdmin )
    ]
  where
    User _ _ _ = undefined "Don't forget to update fields!"
</code></pre></div></div> If you use this pattern in your codebase a lot, you may want to make a helper term, so you can attach documentation. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- | We use this function on the right-hand side of a pattern match
-- so we can remind ourselves to modify functions that rely on the
-- fields of a record.
undefinedPatternMatch :: String -> a undefinedPatternMatch _ = undefined </code></pre></div></div> Mon, 01 Jun 2020 00:00:00 +0000 https://www.parsonsmatt.org/2020/06/01/quick_memory_trick.html https://www.parsonsmatt.org/2020/06/01/quick_memory_trick.html On PVP + Restrictive Bounds The following discussion occurred on the <a href="https://fpchat-invite.herokuapp.com/">FPChat Slack (invite link)</a>. It was pasted into a gist by <a href="https://twitter.com/pitopos">Emily Pillmore</a>. However, the gist was mysteriously deleted. I offered to host the interesting discussion on my blog, which will hopefully preserve it. The gist was recovered from Google Cache and is reproduced here. Some formatting and emojis are probably wrong because I don’t have the original markdown. <hr /> <h2 id="on-pvp--restrictive-bounds">On PVP + Restrictive Bounds</h2> The prompt came from a tweet (later found out to be ranting into the void and not to be considered real advice) by a Stackage trustee: <blockquote> Hey Haskell developers! STOP clap USING clap RESTRICTIVE clap UPPER clap BOUNDS! clap At least stop using them by default! </blockquote> <h3 id="part-i">Part I</h3> C: What do people consider the best practices on upper bounds in packages? L: No build failures / No restrictive upper bounds. Pick one. (And stackage only works as a third alternative as long as you can afford to stay within it.) П: @C This is why PVP is so important. Having upper bounds to within the next major version are good enough to guarantee that your bounds will not get in your way until you absolutely need them to, and will save you from committing to potential API breakages. As long as everyone follows PVP, you will not see either semantic or api breakages within a major version. However, there are bad actors using Hackage who will not follow this protocol, and do their own thing. 
This is bad for everyone, and a few missteps in major packages have made Stackage people paranoid about enforcing builds as a source of truth. Rather than becoming more vigilant or chalking it up to human error, Stackage folks recommend no one use upper bounds, and we just all run all plans against all new package versions to pick one build plan per cycle , instead of a potential range cabal offers. However, builds are not a source of truth. They are just proof of compilation! The package that maintains the same api, has no tests, and redefines every definition to be undefined still would pass build in Stackage! <ol> <li>This is a waste of resources, and a combinatorially explosive algorithm you can only accomplish with lots of boxes.</li> <li>It asks everyone to commit to potential API breakages.</li> <li>It still doesn’t solve the problem of semantic breakages, and if a particular package has a downstream package set that lacks tests for a certain breaking change in semantics semantics, all of its downstream dependencies are now wrong for the user, despite building</li> <li>If any of these problems appear in Cabal, i can revise my bounds and commit a revision. Stackage does not like revisions.</li> <li>Some like to complain about a mythical Solver creating build plans that don’t work. This is not the case if everyone follows PVP, and I’ve still never seen evidence to support this. 
It’s probably an artifact of the Cabal 1.0 era that has long been out of date.</li> <li>I think in the end, J is complaining about the work he’s found himself doing as a result of a bad maintainer not following semantic versioning, which is a problem in any system.</li> </ol> For further reading, there’s <ul> <li>https://github.com/haskell/pvp/blob/master/pvp-faq.md</li> <li>https://www.yesodweb.com/blog/2015/09/true-root-pvp-debate</li> <li>https://www.reddit.com/r/haskell/comments/3ls6d4/stack_and_the_pvp/cv92v6k/</li> </ul> L: Following PVP alone is not sufficient, you also need people to avoid breaking APIs to not bump major versions. (Just to be clear, I’ve been guilty of this too, “don’t do it” is easier said than done) J: I’m complaining about the work I find myself doing as a result of the impedance mismatch between how the PVP dictates the way the world ought to be and the way that the people inhabiting this world actually follow it. П: PVP is not entirely sufficient, sure, but to quote Λ [from the tweet]: @J, Why? I’d much rather weaken bounds myself using allow-newer than get broken build plans. Build plan solver errors are much easier for me to debug and diagnose than compilation failures! If you don’t like having to weaken bounds periodically, maybe break compatibility less often. П: Or, just realize that Stackage and Hackage serve two different purposes, and that the Stackage set of needs should not be the default thing we prescribe to everyone. It breaks hackage, and forces us all to rely on an incredibly costly methodology and is (historically) affected by the same semantic issues when picking out build plans. Stackage got bitten by aeson as well, if you remember C: One major downside to allow-newer on the stack side is it doesn’t mean what it says, it means allow anything. Cabal has fixed that. 
<h3 id="part-ii">Part II.</h3> (Later) Sockington: I think people who want to ignore upper bounds should use allow-newer: True in their cabal.project (or stack equivalent) instead of expecting the entire world to commit to api breakages. Sockington: Frankly I’m surprised that this isn’t already the default in stack, what with everything being managed by LTS. L: that’s doesn’t solve the problem for Stackage maintainers though. which is that they get spammed with hundreds of packages with spurious bounds and the only thing they can do is wait for the packages to get fixed. J: Yes, if I wasn’t complaining vacuously on Twitter I would have written a more nuanced blog post explaining the crux of the problem. IME, Haskell’s community has poor hygiene when it comes to managing version boundaries over the course of a package’s lifecycle. I would say that in the vast majority of cases, version bumps signifying breaking changes that cause lots of downstream build plan failures are completely spurious. Put another way: requesting that people abide by these rules carefully is an exercise in strict discipline across a diverse community of practitioners. Part of why I like using languages with static types is that I can offload more of what would otherwise be disciplinary enforcement onto the computer for automated checking. J: @C: For a serious answer, I would say follow https://www.semver.org if you’re a library author, and follow a similar methodology to what SemVer proscribes if you’re a library consumer (e.g. ^1.0.0 == >= 1.0.0 && < 2.0.0) ChrisPenciler: Or follow PVP since it’s literally haskell’s package versioning policy J: Eh, AFAICT there’s nothing enforcing that and I don’t care for the PVP. If Hackage were to make a decision that the PVP was an actual standard for enforcement rather than a de facto one I’d be fine with it though. 
П: the only thing enforcing PVP is the mass of historical packages that conform to it, and the inertia of a significant percentage of the Haskell community. To be fair, PVP came out before SemVer, and it’s around for Hysterical Raisins. I’d like to see SemVer, but it’s probably not an option at this point given the decade+ of packages following PVP. SemVer came out later, and is a refinement of PVP. But you can achieve roughly the same thing with PVP, which is fine. I just don’t end up using the epochal first major version number! @J also, re: your ping about what’s holding it up, the same thing that’s holding a lot of Hackage infrastructure up: lack of people helping to work on it! If you or others want to dedicate some time to haskell infra, it’d be very appreciated. The Hackage servers need a lot of love, and i think there’s only 2 very busy-with-life people working on it currently. One of which is going through a housing crisis and is AWOL atm. If there’s one thing Snoyman has been great at, it’s herding cats. The haskell.org folks, not so much besides Jasper and myself doing what we can to wrangle everyone to get things done. Anyone can pm me for more info about contributing their time to solving things for Hackage. We can get you set up and onboarded Sockington: maybe the problem is complaining on Twitter thinking M: I have a bunch of packages that i maintain and I pretty much alternate between strict upper bounds and YOLO upper bounds. strict upper bounds is much more work to maintain for a relatively small benefit. YOLO upper bounds require much less frequent work/maintenance, but the occasional times I need to do it, it’s more annoying work. Both of the pain points could be automated with tooling but no one has written it yet. Λ: The PVP is enormously better than semver. I don’t understand why anyone prefers semver to the PVP. They’re essentially the same, except that with semver, you only have one major version number, and with the PVP you have two. 
This means that under semver, any time you make any breaking change, you have to bump the major version number. When upgrading from version 1.x to 2.x, you have no idea how much effort it will be. Maybe the entire API has been overhauled and the package is completely rewritten, or maybe someone changed the behavior of a single, rarely-used function. Under semver, these are treated identically. Under the PVP, you have two major versions, so you can distinguish between “major API revisions” and “changes that could break someone’s stuff but probably won’t, or at least will be easy to fix.” This is such a huge boon, since it communicates far more information. Versioning is to some degree fundamentally a social problem, not a technical one, and I’ve personally found semver extremely frustrating here because it encourages people to either end up with absurdly large major version numbers or to release technically-breaking changes as minor version bumps because it seems so silly to bump the major version. My experience is that in ecosystems that use semver, versioning is less reliable, and I get more breakages when bumping minor version numbers than I do in Haskell. However, of course, this all flies out the window if people mysteriously choose to dogmatically flout conventions and upload packages to Hackage that use semver for no reason I can discern except spite. One of the most frustrating things to me in this entire debate over versioning and revisions is the fact that, fundamentally, overly-loose constraints pollute the build plan solver forever if no revision is made to fix them. Even if subsequent versions of the package tighten the bounds, the solver will still see the older versions with incorrect bounds, and it will choose them and try a bad build plan instead of raising a solver error. 
So you have a group of people making the following arguments: <ul> <li>I don’t like restrictive upper bounds.</li> <li>I don’t like Hackage revisions.</li> <li>cabal-install sucks because it generates bad build plans. :shocked_pikachu:</li> </ul> J: To be fair Hackage revisions are terrible, and I cannot stand the argument that they’re fine because they don’t really introduce mutability or some other such nonsense. Λ: Why are they terrible? If you don’t have them, a mistake in the constraints pollutes the solver forever. How do you avoid that otherwise? Just never screw up? J: So you release a new version of your library, like every other programming ecosystem does! Λ: I explained above why that doesn’t solve the problem. M: I think a lot of it comes from those first two bullets being a legitimately large burden on library maintainers. So you really want some tooling around fixing that. The tooling we got was Stackage, which solved the problem by not having it in the first place. I think a more robust solution would involve lax upper bounds and tooling that can report build failures to a central service to serve as “official” upper bounds. Evidence-based bounds, rather than no-bounds or overly-conservative bounds J: I also find the notion of a third-party being able to arbitrarily mutate metadata associated with packages to be extremely distasteful. And again, this is something that no other programming language ecosystem had to my knowledge, so why must we have it? Λ: Okay, lets add two more bullets to the list: <ul> <li>Weakening restrictive bounds is too much work.</li> <li>I don’t want someone else to weaken my bounds for me.</li> </ul> M: With no upper bounds, there’s a small chance I’ll need to make a revision to establish upper bounds on every version of a package that needs it. With restrictive upper bounds, there’s a large chance that I’ll need to make a revision to relax upper bounds on each version of a package. 
Either of these could be automated, but are both currently annoying to work with. I actually wonder which would be easier to write tooling for lol J: Honestly though, I still don’t have a clear understanding of why Hackage is such a uniquely positioned package ecosystem that we need to implement things like revisions in the first place. To my knowledge no other package ecosystem has something like this in place, so what makes Haskell special in this regard? Λ: I want to be very clear: if you dislike revisions, the only option is restrictive upper bounds. Otherwise you cannot revise a package to fix overly-lax constraints, and now your build plans for that package are polluted forever. You are fighting for two things that are mutually contradictory to the way a build plan solver works. Why don’t other ecosystems do this? I don’t know, but it means they’re broken in a way that we aren’t. Maybe other ecosystems just assume everyone will never screw up! J: They don’t. B: If your model requires mutation to fix potentially global poisoning of builds your model is bad and you shouldn’t find a model that treats you better and buys you nice things Λ: Do you know of any such model? Look, I am usually pretty empathetic when it comes to technical disagreements within the Haskell community; anyone who knows anything about me should know that pretty well. But this argument has always come off as magical thinking to me, with no proposed alternative that works. When I started arguing in favor of revisions, I had exclusively used Stack as a build tool, so I do not say this out of personal bias. The model you’re arguing for just doesn’t make sense. B: Hackage and Cabal’s problems are sui generis. So pick your poison. J: As with all things, my argument here is that we should look at what Rust/crates.io is doing in this area and copy it wholesale. L: how do they do it? 
Λ: “Every other language does this broken thing, so we should do it too” is such a weird take coming from the Haskell community. J: @L: Defaulting to upper bounds, community consensus on using semver, no mutation in the package index. Λ: Cargo allows mutation via removing a package: https://doc.rust-lang.org/cargo/commands/cargo-yank.html. You need mutation somewhere. If you want to argue in favor of cargo yank rather than revisions, that’s fine. I personally think revisions are nicer to deal with, but that’s subjective at least. M: (i’m gonna come out in favor of hackage revisions fwiw but i would prefer “known to not work” and “not known to work” operators) Λ: Also, FWIW, rubygems also has gem yank, so it has mutation, too: https://guides.rubygems.org/command-reference/#gem-yank J: I disagree that package removal is the same as modifying the metadata (which, again, goes further than just allowing trustees to tweak bounds). I don’t mind package removal Λ: Why? The argument made above was “no other ecosystem needs this,” which is demonstrably false, it’s just that other ecosystems do it in a different way. In other ecosystems, you have to yank the versions with bad constraints and upload new ones (with different version numbers) with fixed constraints. Okay, so now the argument comes down to whether the revision model or the yank model is better. But with the yank model, suppose a new version of a package is released that invalidates a dozen versions’ bounds. Now you have to yank every one of those versions. If you want to support people running on those major/minor versions, you have to upload several new releases to fix the bounds for each of those major/minor lines. J: If the revision model was limited to only changing the version bounds associated with the library portion of code residing in Hackage, I would probably not be opposed to it. Λ: That is the revision model. J: And I would likely prefer it over yanking. No it isn’t. 
Λ: A Hackage revision cannot make code changes. Hackage revisions are to package metadata exclusively. J: https://github.com/snoyberg/trojan Package metadata modification can extend further than purely modifying bounds associated with the library Λ: You’re basically saying “Hackage revisions can be used maliciously because they can modify other parts of the cabal file,” but okay, so can modifying version bounds. Upload a new, malicious version of a package, then update the version bounds to use that package. If you want to pin to a particular revision to ensure you’re using a trusted plan, okay—that’s fine and reasonable! But this argument seems absurd to me: package management is fundamentally about installing untrusted code from someone else. Are you auditing all of it all the time? If not, then this “revision attack” seems like a very strange threat model to be worrying about. J: Modifying version bounds for the core library stanza is extremely limited in scope. It introduces the possibility of transitively introducing malicious code via a version change in a third party library, but the trade off there seems to at least be reasonable with respect to the stated goal of not blocking perfectly valid build plans to be constructed with that library. I don’t find the trade off to be acceptable beyond that, personally. I don’t find restrictive upper bounds useful for test or benchmark code, and I find revisions there of limited utility accordingly. Λ: > J: I don’t find the trade off to be acceptable beyond that, personally. Why? J: And I especially dislike the ability to modify flag defaults, flag types, or add custom setup stanzas as this can change the behavior of a library at a particular version retroactively with very little visibility into the cause. Λ: Has this literally ever happened? Or is this 100% theoretical? 
(I’m mostly referring to the custom-setup example; modifying flags seems like it has some valid use cases because automatic flags interact with the solver.) J: I’m not aware of any time it’s happened, I’m simply pointing out that this attitude colors my perception of the feature in a way that makes me disinclined to favor it. Λ: cabal-install allows you to: <ul> <li>Pin the version and flags of a package with freeze files.</li> <li>Pin yourself to a state of the package index using index-state.</li> </ul> So you can pin yourself to a particular set of flags and revisions pretty easily. Through this lens, revisions are basically just another part of the version number, so the alternative—yanking—is worse, if anything. The yanking model allows you to do everything the revisions model allows you to do, plus you can make arbitrary code changes! However, the real reason I like revisions is that I think they are meaningfully distinct from the version number. To me, the version number versions the code, and revisions version the metadata. The second of those two things really is orthogonal to the first: the yank model requires you to bump a version number, causing a bunch of trouble and headaches for everyone who has pinned to that version number, even when no code changes were made at all. Revisions allow you to bump a different version number to not invalidate people’s build plans, and there is a social expectation that they will not do anything evil. Package management is a social problem. There are ways to add technical guard rails and to make things easier on maintainers, but you cannot escape that at the end of the day, you are trusting tons and tons of other people at every level of the process. Revisions seem like a great option in that light, and most of the arguments I’ve heard against them boil down to “they make me uncomfortable” without providing much of a reasonable alternative. 
Now, as for restrictive upper bounds on test suites: modern versions of cabal-install allow you to install a package’s library without installing its test suites and benchmarks. So even if the package has restrictive upper bounds on its test suites, you can still construct a build plan that includes the library. That seems like getting the best of both worlds. You can have the proper bounds on your test suites, and people only have to care about them if they actually want to build your test suite. Sockington: @Λ: I would be so happy if you put together your research and thoughts on this into a blog post. Thu, 07 May 2020 00:00:00 +0000 https://www.parsonsmatt.org/2020/05/07/on_pvp_restrictive_bounds.html https://www.parsonsmatt.org/2020/05/07/on_pvp_restrictive_bounds.html Evolving Import Style For Diff Friendliness Raise your hand if you’ve been annoyed by imports in Haskell. They’re not fun. Imports are often noisy, lists are often huge, and diffs can be truly nightmarish to compare. Using a term often requires modifying the import list, which breaks your workflow. Fortunately, we can reduce some of the pain of these problems with a few choices in our <code class="language-plaintext highlighter-rouge">stylish-haskell</code> configuration and a script that gradually implements these changes in your codebase. This post begins with a style recommendation, continues with a script to implement it gradually in your codebase, and finishes with a discussion on relevant import styles and how they affect review quality. <h1 id="the-blessed-style">The Blessed Style</h1> I use <a href="https://hackage.haskell.org/package/stylish-haskell"><code class="language-plaintext highlighter-rouge">stylish-haskell</code></a> for my formatting tool. 
My editor’s default formatting choices with <a href="https://github.com/parsonsmatt/vim2hs"><code class="language-plaintext highlighter-rouge">vim2hs</code></a> work well for me (while I maintain that fork, it’s mostly a conglomeration of a bunch of changes that other people have made to it). I have this shortcut defined to run <code class="language-plaintext highlighter-rouge">stylish-haskell</code> in vim: <div class="language-vimscript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>" Haskell
nnoremap <leader>hs ms:%!stylish-haskell<cr>'s
</code></pre></div></div> This sets a mark, filters the file through <code class="language-plaintext highlighter-rouge">stylish-haskell</code>, and then returns to the mark. <code class="language-plaintext highlighter-rouge">stylish-haskell</code> is configured by a <code class="language-plaintext highlighter-rouge">.stylish-haskell.yaml</code> file, and it will walk up the directory tree searching for one to configure the project with. I place mine in the root of the Haskell directory, right next to the <code class="language-plaintext highlighter-rouge">stack.yaml</code> or <code class="language-plaintext highlighter-rouge">cabal.project</code> files. Here are the contents that I recommend: <div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>steps:
  - imports:
      align: none
      list_align: with_module_name
      pad_module_names: false
      long_list_align: new_line_multiline
      empty_list_align: inherit
      list_padding: 7 # length "import "
      separate_lists: false
      space_surround: false
  - language_pragmas:
      style: vertical
      align: false
      remove_redundant: true
  - simple_align:
      cases: false
      top_level_patterns: false
      records: false
  - trailing_whitespace: {}

# You need to put any language extensions that are enabled for the entire project here.
language_extensions: []

# This is up to personal preference, but 80 is the right answer.
columns: 80 </code></pre></div></div> Let’s look at a diff that compares the default stylish-haskell and this configuration. I created a <a href="https://github.com/parsonsmatt/servant-persistent/pull/41">pull request</a> against the <code class="language-plaintext highlighter-rouge">servant-persistent</code> example project to demonstrate the style. I left a bunch of review comments to explain the differences, and the UI for reading them is nice on GitHub. Here’s a reproduction of the differences: <div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code>- import Init (runApp) + import Init (runApp) </code></pre></div></div> We no longer indent so that module names are aligned. This helps keep the column count low, and makes it easier to just type this out manually without worrying about alignment. <div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE DataKinds #-} {-# LANGUAGE DataKinds #-} </code></pre></div></div> We don’t align on pragmas anymore. The diff will only show a new language pragma, not highlighting every line that was changed just to align the imports. <div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code>- import Api.User (UserAPI, userApi, userServer) - import Config (AppT (..), Config (..)) + import Api.User (UserAPI, userApi, userServer) + import Config (AppT(..), Config(..)) </code></pre></div></div> We no longer align the explicit import lists along the longest module name. This is less noisy, because adding a new module import that is longer than any others will no longer trigger a reformat across all the imports. 
<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code>- import Database.Persist.Postgresql (Entity (..), fromSqlKey, insert, - selectFirst, selectList, (==.)) + import Database.Persist.Postgresql + (Entity(..), fromSqlKey, insert, selectFirst, selectList, (==.)) </code></pre></div></div> If the import and module goes beyond the column count, then the import list is indented, but is kept on one line. This keeps the import lists compact in the smallest cases, where it’s easier to notice a small change. <div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code>- import Servant ((:<|>) ((:<|>)), Proxy (Proxy), Raw, - Server, serve, serveDirectoryFileServer) + import Servant + ( (:<|>)((:<|>)) + , Proxy(Proxy) + , Raw + , Server + , serve + , serveDirectoryFileServer + ) </code></pre></div></div> If a newline indented import list expands beyond the column count, then it’ll put each term on a new line. This takes up space, but it’s really easy to read, and the diff for adding or removing an import line points to exactly the change that was made. <div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code>- import Config (AppT (..)) - import Control.Monad.Metrics (increment, metricsCounters) - import Data.HashMap.Lazy (HashMap) - import Data.IORef (readIORef) - import Data.Text (Text) - import Lens.Micro ((^.)) - import Models (User (User), runDb, userEmail, - userName) - import qualified Models as Md - import qualified System.Metrics.Counter as Counter + import Config (AppT(..)) + import Control.Monad.Metrics (increment, metricsCounters) + import Data.HashMap.Lazy (HashMap) + import Data.IORef (readIORef) + import Data.Text (Text) + import Lens.Micro ((^.)) + import Models (User(User), runDb, userEmail, userName) + import qualified Models as Md + import qualified System.Metrics.Counter as Counter </code></pre></div></div> The end result is less pretty. 
It’s a little more cluttered to read. However, it dramatically improves diffs and reduces merge conflicts when using qualified and explicit imports, which will improve the overall readability of the codebase significantly. <h1 id="automating-the-migration">Automating the Migration</h1> You don’t want to shotgun the entire project with this, because that’ll cause a nightmare of merge conflicts for everyone until the dust settles. But if you did, you could write: <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ stylish-haskell --inplace **/*.hs
</code></pre></div></div> This is fine for small projects with few collaborators. But on large projects with many collaborators, we want to make this a bit more gentle. So instead, we’ll only require that files changed in a given PR are formatted. We can get that information using <code class="language-plaintext highlighter-rouge">git diff --name-status origin/master</code>. If your “target” remote and branch isn’t <code class="language-plaintext highlighter-rouge">origin master</code> then substitute whatever you use. The output of that command looks like this: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>M       .stylish-haskell.yaml
M       Setup.hs
M       app/Main.hs
M       src/Api.hs
M       src/Api/User.hs
M       src/Config.hs
M       src/DevelMain.hs
M       src/Init.hs
M       src/Logger.hs
M       src/Models.hs
M       test/ApiSpec.hs
M       test/UserDbSpec.hs
</code></pre></div></div> All of these symbols are <code class="language-plaintext highlighter-rouge">M</code>, but you can also get <code class="language-plaintext highlighter-rouge">A</code> for additions and <code class="language-plaintext highlighter-rouge">R</code> for renames, and we’ll want to <code class="language-plaintext highlighter-rouge">stylish</code> those up too. We’ll handle each of these three cases in its own step, because that’s easiest.
The first case is simply <code class="language-plaintext highlighter-rouge">M</code>, and we can focus on that with <code class="language-plaintext highlighter-rouge">grep "^M"</code>. We only want Haskell files, so we’ll filter on those with <code class="language-plaintext highlighter-rouge">grep ".hs"</code>. We want to get the second field, so we’ll do <code class="language-plaintext highlighter-rouge">cut -f 2</code>. Finally, we’ll send all the elements as arguments to <code class="language-plaintext highlighter-rouge">stylish-haskell --inplace</code> using <code class="language-plaintext highlighter-rouge">xargs</code>. The whole command is here: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git diff --name-status origin/master \
    | grep .hs \
    | grep "^M" \
    | cut -f 2 \
    | xargs stylish-haskell --inplace
</code></pre></div></div> Added files are handled the same way, but you’ll have <code class="language-plaintext highlighter-rouge">grep "^A"</code> instead. Renamed files are slightly different. Those have three fields - the type (<code class="language-plaintext highlighter-rouge">R</code>), the original filename, and the destination/new file name. We only want the new file name. So the script looks like this: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># renamed files
git diff --name-status origin/master \
    | grep .hs \
    | grep "^R" \
    | cut -f 3 \
    | xargs stylish-haskell --inplace
</code></pre></div></div> The only real difference is <code class="language-plaintext highlighter-rouge">cut -f 3</code>, which selects the third field instead of the second.
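If you'd rather not repeat the pipeline three times, the three cases can probably be folded into one filter. Here's a minimal sketch, assuming the tab-separated <code class="language-plaintext highlighter-rouge">--name-status</code> output shown above. The name <code class="language-plaintext highlighter-rouge">changed_haskell_files</code> is a hypothetical helper made up for illustration, not part of git or stylish-haskell. It keeps the <code class="language-plaintext highlighter-rouge">M</code>, <code class="language-plaintext highlighter-rouge">A</code>, and <code class="language-plaintext highlighter-rouge">R</code> lines, takes the last field (the new file name, in the rename case), and only prints paths that end in <code class="language-plaintext highlighter-rouge">.hs</code>.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of git or stylish-haskell):
# reads `git diff --name-status` output on stdin and prints the
# Haskell files that were modified, added, or renamed.
changed_haskell_files() {
  # M and A lines have two tab-separated fields. R lines (such as R100)
  # have three, and the last one is the path that exists after the rename.
  # $NF is the last field, so it is the right path in all three cases.
  awk -F '\t' '$1 ~ /^[MAR]/ && $NF ~ /\.hs$/ { print $NF }'
}

# Usage, assuming origin/master is the target branch as in the post:
#   git diff --name-status origin/master \
#       | changed_haskell_files \
#       | xargs stylish-haskell --inplace
```

Anchoring the pattern to the end of the path also sidesteps a quirk of <code class="language-plaintext highlighter-rouge">grep .hs</code>, where the unescaped dot matches any character, so a path that merely contains <code class="language-plaintext highlighter-rouge">hs</code> somewhere would slip through.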
Our full script is: <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/usr/bin/env bash
set -Eeux

# modified files
git diff --name-status origin/master \
    | grep .hs \
    | grep "^M" \
    | cut -f 2 \
    | xargs stylish-haskell --inplace

# added files
git diff --name-status origin/master \
    | grep .hs \
    | grep "^A" \
    | cut -f 2 \
    | xargs stylish-haskell --inplace

# renamed files
git diff --name-status origin/master \
    | grep .hs \
    | grep "^R" \
    | cut -f 3 \
    | xargs stylish-haskell --inplace
</code></pre></div></div> Save that somewhere as <code class="language-plaintext highlighter-rouge">stylish-haskell.sh</code>, and add an entry in your <code class="language-plaintext highlighter-rouge">Makefile</code> that references it (you do have a Makefile, right?). Now, we can run <code class="language-plaintext highlighter-rouge">make imports</code> and it’ll format all imports that have changed in our PR, but it won’t touch anything else. Over time, the codebase will converge on the new style, but only as people are working on relevant changes. <h1 id="adding-to-ci">Adding to CI</h1> We can add this to CI by calling the script and seeing if anything changed. <code class="language-plaintext highlighter-rouge">git diff</code> has an option <code class="language-plaintext highlighter-rouge">--exit-code</code> that will cause <code class="language-plaintext highlighter-rouge">git</code> to exit with a failure if there is a difference. In this snippet, I have some uncommitted changes: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ git diff --exit-code
diff --git a/Makefile b/Makefile
index f8d1636..df336de 100644
--- a/Makefile
+++ b/Makefile
@@ -6,4 +6,7 @@ ghcid-devel: ## Run the server in fast development mode.
See DevelMain for detai
     --command "stack ghci servant-persistent" \
     --test "DevelMain.update"
 
-.PHONY: ghcid-devel help
+imports: ## Format all the imports that have changed since the master branch.
+    ./stylish-haskell.sh
+
+.PHONY: ghcid-devel help imports
$ echo $?
1
</code></pre></div></div> We can use this to fail CI. In Travis CI, we can add the following lines: <div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>script:
  - make imports
  - git diff --exit-code
  - stack --no-terminal --install-ghc test
</code></pre></div></div> You can adapt this to whatever CI setup you need. However, you’ll probably need to install <code class="language-plaintext highlighter-rouge">stylish-haskell</code> in CI, too. Your build tool can handle that; just ensure that it’s present on the <code class="language-plaintext highlighter-rouge">PATH</code>. <h1 id="why-this-style">Why this style?</h1> The default style is really aesthetically nice. Everything lines up, there’s a lot of horizontal whitespace, it’s uncluttered looking. But it just doesn’t scale! It doesn’t look good with long module names. It doesn’t look good with long explicit import lists. It causes a ton of irrelevant diff noise and needless merge conflicts. It becomes a hassle when you’re working on a large codebase with other people. So let’s look at all the choices, their alternatives, and why I selected these. <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>steps:
  - imports:
      align: none
</code></pre></div></div> Alignment is visually appealing, but it creates diff noise and it consumes columns with whitespace that would be better spent on something meaningful.
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>      list_align: with_module_name
</code></pre></div></div> This option is superfluous, because we have selected <code class="language-plaintext highlighter-rouge">new_line_multiline</code> for <code class="language-plaintext highlighter-rouge">long_list_align</code>. <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>      pad_module_names: false
</code></pre></div></div> The docs for this give the justification quite nicely: <blockquote> Right-pad the module names to align imports in a group: <ul> <li> true: a little more readable <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>> import qualified Data.List       as List (concat, foldl, foldr,
>                                           init, last, length)
> import qualified Data.List.Extra as List (concat, foldl, foldr,
>                                           init, last, length)
</code></pre></div> </div> </li> <li> false: diff-safe <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>> import qualified Data.List as List (concat, foldl, foldr, init,
>                                     last, length)
> import qualified Data.List.Extra as List (concat, foldl, foldr,
>                                           init, last, length)
</code></pre></div> </div> </li> </ul> Default: true </blockquote> Ultimately, diff-safe is preferable to aesthetics, so we go with that. <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>      long_list_align: new_line_multiline
</code></pre></div></div> <code class="language-plaintext highlighter-rouge">long_list_align</code> determines what happens when the import list goes over the maximum column count. This option is a recent addition. There are a few choices here, and you may actually prefer an even more diff-friendly approach than I do.
<code class="language-plaintext highlighter-rouge">new_line_multiline</code> will move the import list to a new, indented line if the module and list together exceed the column length. If the list on its new line also exceeds the column length, then it’ll put every import on its own line. This is fantastic for diffs, but takes up a lot of space. It looks quite readable, at least. <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>      empty_list_align: inherit
</code></pre></div></div> This is a mostly irrelevant choice, since there is no alignment. <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>      list_padding: 7 # length "import "
</code></pre></div></div> This sets it up so that the import list clears the <code class="language-plaintext highlighter-rouge">import </code>, providing a clean visual break between lines. You could go longer or shorter, but that’s up to you. <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>      separate_lists: false
</code></pre></div></div> <code class="language-plaintext highlighter-rouge">separate_lists</code> adds a space between a class and its methods or a type and its constructors. <blockquote> <ul> <li> true: There is single space between Foldable type and list of it’s functions. <blockquote> import Data.Foldable (Foldable (fold, foldl, foldMap)) </blockquote> </li> <li> false: There is no space between Foldable type and list of it’s functions. <blockquote> import Data.Foldable (Foldable(fold, foldl, foldMap)) </blockquote> </li> </ul> </blockquote> I like it off, but this can go either way. <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>      space_surround: false
</code></pre></div></div> This doesn’t really matter and can go either way.
With <code class="language-plaintext highlighter-rouge">multiline</code> and now <code class="language-plaintext highlighter-rouge">new_line_multiline</code>, this is probably better to be <code class="language-plaintext highlighter-rouge">true</code>. <blockquote> Space surround option affects formatting of import lists on a single line. The only difference is single space after the initial parenthesis and a single space before the terminal parenthesis. <ul> <li> true: There is single space associated with the enclosing parenthesis. <blockquote> import Data.Foo ( foo ) </blockquote> </li> <li> false: There is no space associated with the enclosing parenthesis <blockquote> import Data.Foo (foo) </blockquote> </li> </ul> Default: false </blockquote> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  - language_pragmas:
      style: vertical
      align: false
      remove_redundant: true
</code></pre></div></div> I know it looks nice to have aligned pragmas, but it’s annoying to view a diff and not easily tell what pragmas were added or removed. This makes it obvious. <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  - simple_align:
      cases: false
      top_level_patterns: false
      records: false
</code></pre></div></div> All of this visual alignment just ruins diffs. If you want visual alignment, align on an indentation boundary. Compare: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fromMaybe def maybeA = case maybeA of Just a  -> a
                                      Nothing -> def
</code></pre></div></div> This looks nice, but it’s annoying to maintain and change. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fromMaybe def maybeA =
    case maybeA of
        Just a -> a
        Nothing -> def
</code></pre></div></div> You still get alignment of the important bits, but it’s now safe for diffs and refactoring.
Likewise, adding, removing, or changing a field to a record should only trigger a diff on the relevant fields. Anything else is noise that detracts from signal. <h1 id="conclusion">Conclusion</h1> Anyway, these are my recommendations for large projects that have multiple collaborators. If you’re working on a small project, then you don’t need to worry about anything here. These aren’t my aesthetic preferences, but these formatting choices do annoy me a lot less than pretty code pleases me. Tue, 17 Mar 2020 00:00:00 +0000 https://www.parsonsmatt.org/2020/03/17/gradual_import_style_improvements.html https://www.parsonsmatt.org/2020/03/17/gradual_import_style_improvements.html Effectful Property Testing You’re convinced that <a href="https://hypothesis.works/articles/what-is-property-based-testing/">Property Based Testing</a> is awesome. You’ve read about <a href="https://wickstrom.tech/programming/2019/03/02/property-based-testing-in-a-screencast-editor-introduction.html">using PBT to test a screencast editor</a> and can’t wait to do more. But it’s time to write some property tests that integrate with an external system, and suddenly, it’s not so easy. The fantastic <a href="https://github.com/hedgehogqa/haskell-hedgehog"><code class="language-plaintext highlighter-rouge">hedgehog</code></a> library has two “modes” of operation: generating values and making assertions on those values. I wrote the compatibility library <a href="https://hackage.haskell.org/package/hspec-hedgehog"><code class="language-plaintext highlighter-rouge">hspec-hedgehog</code></a> to allow using <code class="language-plaintext highlighter-rouge">hspec</code>’s nice testing features with <code class="language-plaintext highlighter-rouge">hedgehog</code>’s excellent error messages. But then the time came to start writing property tests against a Postgresql database. 
At work, we have a lot of complex SQL queries written both in <a href="https://hackage.haskell.org/package/esqueleto"><code class="language-plaintext highlighter-rouge">esqueleto</code></a> and in raw SQL. We’ve decided we want to increase our software quality by writing tests against our database code. While both Haskell and SQL are declarative and sometimes obviously correct, it’s not always the case. Writing property tests would help catch edge cases and prevent bugs from getting to our users. <h1 id="io-tests">IO Tests</h1> It’s considered good practice to model tests in three separate phases: <ol> <li>Arrange</li> <li>Act</li> <li>Assert</li> </ol> This works really well with property based testing, especially with <code class="language-plaintext highlighter-rouge">hedgehog</code>. We start by generating the data that we need. Then we call some function on it. Finally we assert that it should have some appropriate shape: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spec :: Spec spec = describe "some property" $ do it "works" $ hedgehog $ do value <- someGenerator let result = someFunction value result === someExpectedValue </code></pre></div></div> It’s relatively straightforward to call functions in IO. <code class="language-plaintext highlighter-rouge">hedgehog</code> provides a function <a href="https://www.stackage.org/haddock/lts-15.3/hedgehog-1.0.2/Hedgehog.html#v:evalIO"><code class="language-plaintext highlighter-rouge">evalIO</code></a> that lets you run arbitrary IO actions and still receive good error messages. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spec :: Spec spec = describe "some IO property" $ do it "works" $ hedgehog $ do value <- someGenerator result <- evalIO $ someFunction value result === someExpectedValue </code></pre></div></div> For very simple tests like this, this is fine. 
However, it becomes cumbersome quite quickly when you have a lot of values you want to make assertions on. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spec :: Spec spec = describe "some IO property" $ do it "works" $ hedgehog $ do value0 <- someGenerator0 value1 <- someGenerator1 value2 <- someGenerator2 (a, b, c, d, e) <- evalIO $ do prepare value0 prepare value1 prepare value2 a <- someFunction alterState b <- someFunction c <- otherFunction d <- anotherFunction e <- comeOnReally pure (a, b, c, d, e) a === expectedA diff a (<) b c === expectedC d /== anyNonDValue </code></pre></div></div> This pattern becomes unwieldy for a few reasons: <ol> <li>It’s awkward to have to <code class="language-plaintext highlighter-rouge">pure</code> up a tuple of the values you want to assert against.</li> <li>It’s repetitive to declare bindings twice for all the values you want to assert against.</li> <li>Modifying a return means adding or removing items from the tuple, which can possibly be error-prone.</li> </ol> Fortunately, we can do better. <h1 id="pure-on-pure-on-pure">pure on pure on pure</h1> Instead of returning values to a different scope, and then doing assertions against those values, we will return an action that does assertions, and then call it. 
The simple case barely changes: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spec :: Spec spec = describe "some simple IO property" $ do it "works" $ hedgehog $ do value <- someGenerator assertions <- evalIO $ do result <- someFunction value pure $ do result === expectedValue assertions </code></pre></div></div> An astute student of monadic patterns might notice that: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foo = do result <- thing result </code></pre></div></div> is equivalent to: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foo = do join thing </code></pre></div></div> and then simplify: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spec :: Spec spec = describe "some simple IO property" $ do it "works" $ hedgehog $ do value <- someGenerator join $ evalIO $ do result <- someFunction value pure $ do result === expectedValue </code></pre></div></div> Nice! Because we’re returning an action of assertions instead of values that will be asserted against, we don’t have to play any weird games with names or scopes. We’ve got all the values we need in scope, we write the assertions right there, and we defer running them by returning them as an action. 
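The same trick works without any testing library at all, which makes it easier to see what <code class="language-plaintext highlighter-rouge">join</code> is doing. Here is a minimal sketch in plain <code class="language-plaintext highlighter-rouge">IO</code>, using only <code class="language-plaintext highlighter-rouge">base</code>. The names <code class="language-plaintext highlighter-rouge">actThenAssert</code> and <code class="language-plaintext highlighter-rouge">runBoth</code> are made up for illustration, and the "assertion" is just a <code class="language-plaintext highlighter-rouge">Bool</code> standing in for hedgehog's checks.

```haskell
import Control.Monad (join)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical example: the outer IO action is the "act" phase.
-- Instead of returning values, it returns another action (the
-- "assert" phase) that closes over everything it needs.
actThenAssert :: IORef [String] -> IO (IO Bool)
actThenAssert logRef = do
  modifyIORef' logRef ("act" :)
  let result = 2 + 2 :: Int
  pure $ do
    modifyIORef' logRef ("assert" :)
    pure (result == 4)

-- join runs the outer action first, then the action it returned.
runBoth :: IO (Bool, [String])
runBoth = do
  logRef <- newIORef []
  ok <- join (actThenAssert logRef)
  entries <- readIORef logRef
  pure (ok, reverse entries)
```

The log records the two phases in order, and <code class="language-plaintext highlighter-rouge">result</code> never has to leave the scope it was defined in.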
Let’s refactor our more complex example: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spec :: Spec spec = describe "some IO property" $ do it "works" $ hedgehog $ do value0 <- someGenerator0 value1 <- someGenerator1 value2 <- someGenerator2 join $ evalIO $ do prepare value0 prepare value1 prepare value2 a <- someFunction alterState b <- someFunction c <- otherFunction d <- anotherFunction e <- comeOnReally pure $ do a === expectedA diff a (<) b c === expectedC d /== anyNonDValue </code></pre></div></div> On top of being more convenient and easy to write, it’s more difficult to do the wrong thing here. You can’t accidentally swap two names in a tuple, because there is no tuple! <h1 id="a-nice-api">A Nice API</h1> We can write a helper function that does some of the boilerplate for us: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>arrange :: PropertyT IO (IO (PropertyT IO a)) -> PropertyT IO a arrange mkAction = do action <- mkAction join (evalIO action) </code></pre></div></div> Since we’re feeling cute, let’s also write some helpers that’ll make this pattern more clear: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>act :: IO (PropertyT IO a) -> PropertyT IO (IO (PropertyT IO a)) act = pure assert :: PropertyT IO a -> IO (PropertyT IO a) assert = pure </code></pre></div></div> And now our code sample looks quite nice: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> spec :: Spec spec = describe "some IO property" $ do it "works" $ arrange $ do value0 <- someGenerator0 value1 <- someGenerator1 value2 <- someGenerator2 act $ do prepare value0 prepare value1 prepare value2 a <- someFunction alterState b <- someFunction c <- otherFunction d <- anotherFunction e <- comeOnReally assert $ do a === expectedA diff a (<) b c === expectedC d /== anyNonDValue 
</code></pre></div></div> <h1 id="beyond-io">Beyond IO</h1> It’s not enough to just do <code class="language-plaintext highlighter-rouge">IO</code>. The problem that motivated this research called for <code class="language-plaintext highlighter-rouge">persistent</code> and <code class="language-plaintext highlighter-rouge">esqueleto</code> tests against a Postgres database. These functions operate in <code class="language-plaintext highlighter-rouge">SqlPersistT</code>, and we use database transactions to keep tests fast by rolling back the commit instead of finalizing. Fortunately, we can achieve this by passing an “unlifter”: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>arrange :: (forall x. m x -> IO x) -> PropertyT IO (m (PropertyT IO a)) -> PropertyT IO a arrange unlift mkAction = do action <- mkAction join (evalIO (unlift action)) act :: m (PropertyT IO a) -> PropertyT IO (m (PropertyT IO a)) act = pure assert :: Applicative m => PropertyT IO a -> m (PropertyT IO a) assert = pure </code></pre></div></div> With these helpers, our database tests look quite neat. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spec :: SpecWith TestDb spec = describe "db testing" $ do it "is neat" $ \db -> arrange (runTestDb db) $ do entity0 <- forAll generateEntity0 entityList <- forAll $ Gen.list (Range.linear 1 100) $ generateEntityChild act $ do insert entity0 before <- someDatabaseFunction insertMany entityList after <- someDatabaseFunction assert $ do before === 0 diff before (<) after after === length entityList </code></pre></div></div> <h1 id="a-real-example">A Real Example</h1> OK, OK, so that last one was too abstract. Let’s say we’re writing a billing system (jeez, i sure do use that example a lot). 
We keep track of <code class="language-plaintext highlighter-rouge">Invoice</code>s, which group <code class="language-plaintext highlighter-rouge">InvoiceLineItem</code>s that contain actual amounts. We have a model for <code class="language-plaintext highlighter-rouge">Payment</code>s, which record details on a <code class="language-plaintext highlighter-rouge">Payment</code> like how it was made, who made it, whether it was successful, etc. A <code class="language-plaintext highlighter-rouge">Payment</code> can be applied to many <code class="language-plaintext highlighter-rouge">Invoice</code>s, so we have a join table <code class="language-plaintext highlighter-rouge">InvoicePayment</code> that records the amount of each payment allocated toward an <code class="language-plaintext highlighter-rouge">Invoice</code>. Neat. Because we love Postgresql, quite a bit of our business logic is performed database-side, either via custom SQL functions or <code class="language-plaintext highlighter-rouge">esqueleto</code> expressions. One of these functions is <code class="language-plaintext highlighter-rouge">invoicePaidTotal</code>, which tells us the total amount paid towards an <code class="language-plaintext highlighter-rouge">Invoice</code>. Here’s the <code class="language-plaintext highlighter-rouge">esqueleto</code> code: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>invoicePaidTotal :: SqlExpr (Entity DB.Invoice) -> SqlExpr (Value (Dollar E2)) invoicePaidTotal i = fromMaybe_ zeroDollars $ subSelect . from $ \(ip `InnerJoin` p) -> do on (p ^.DB.PaymentId ==. ip ^.DB.InvoicePaymentPaymentId) where_ (ip ^.DB.InvoicePaymentInvoiceId ==. 
i ^.DB.InvoiceId) where_ (Payment.isSucceeded p) pure (sumDollarDefaultZero (ip ^.DB.InvoicePaymentTotal)) </code></pre></div></div> (Some of these functions are internal to the work codebase, but they do the obvious thing.) This is equivalent to the following SQL: <div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT COALESCE(SUM(ip.total), 0) FROM invoice_payment AS ip INNER JOIN payment AS p ON p.id = ip.payment_id WHERE ip.invoice_id = :invoice_id AND payment_succeeded(p) </code></pre></div></div> Now, we want to write a test for it. Upon inspection, we’re testing two things: SQL’s <code class="language-plaintext highlighter-rouge">SUM</code> function and the <code class="language-plaintext highlighter-rouge">payment_succeeded</code> function (which itself is actually another esqueleto expression that would unfold). So, we can write a property with two parts: <ul> <li>If there are no <code class="language-plaintext highlighter-rouge">Payment</code>s or <code class="language-plaintext highlighter-rouge">InvoicePayment</code>s in the database, then this function should return <code class="language-plaintext highlighter-rouge">$0.00</code>.</li> <li>If there are some <code class="language-plaintext highlighter-rouge">InvoicePayment</code>s in the database, then this function should return the sum of their <code class="language-plaintext highlighter-rouge">total</code> fields provided that the associated <code class="language-plaintext highlighter-rouge">Payment</code> is successful.</li> </ul> Here’s the test code. We’ll start by looking at the <code class="language-plaintext highlighter-rouge">arrange</code> bit, which creates the database models. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>arrange (runTestDb db) "invoicePaidTotal" $ do let invoiceId = InvoiceKey "invoice" invoice = baseInvoice payments <- forAll $ Gen.list (Range.linear 1 50) $ do id <- Gen.id name <- Gen.faker Faker.name pure $ Entity id basePayment { paymentName = name } invoicePayments <- forAll $ for payments $ \(Entity paymentId _) -> do amount <- Gen.integral (Range.linear 1 1000) pure $ InvoicePayment invoiceId paymentId amount act $ do ... snip ... </code></pre></div></div> Values like <code class="language-plaintext highlighter-rouge">baseInvoice</code> and <code class="language-plaintext highlighter-rouge">basePayment</code> are useful as test fixtures. I’ve generally found that writing generators for models isn’t nearly as useful as generating modifications to models that alter what you care about. This doesn’t catch as many potential edge case bugs, so it has its downsides, but if the client name being “Foobar” instead of “AsdfQuux” affects payment totals, then something is deeply weird. Alright, let’s <code class="language-plaintext highlighter-rouge">act</code>! I usually like to define the function under test as <code class="language-plaintext highlighter-rouge">subject</code>, along with whatever scaffolding needs to happen to make it easy to call. In this case, I want to test an <code class="language-plaintext highlighter-rouge">esqueleto</code> <code class="language-plaintext highlighter-rouge">SqlExpr</code>, which means I need to convert it into a query and run it. Calling it <code class="language-plaintext highlighter-rouge">subject</code> is just an aesthetic thing. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> act $ do let subject = fmap (unValue . head) $ select $ from $ \i -> do where_ $ i ^. InvoiceId ==. 
val invoiceId pure $ invoicePaidTotal i </code></pre></div></div> I call <code class="language-plaintext highlighter-rouge">head</code> fearlessly here because I don’t care about runtime errors in test suites. YOLO. Next, we’re going to insert our invoice, and call <code class="language-plaintext highlighter-rouge">subject</code> to get the paid total. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> insertKey invoiceId invoice beforePayments <- subject </code></pre></div></div> Then we’ll mutate the state of the database by inserting all the payments and invoice payments. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> insertEntityMany payments insertMany invoicePayments afterPayments <- subject </code></pre></div></div> And that’s all we need to start writing some assertions. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> assert $ do beforePayments === 0 afterPayments === do sum $ map invoicePaymentTotal $ filter isSuccessfulPayment $ invoicePayments </code></pre></div></div> <code class="language-plaintext highlighter-rouge">isSuccessfulPayment</code> is a Haskell function that mirrors the logic in the SQL. If this test passes, then we know that the logic is all set. Next up, we might want to write an equivalence test for the Haskell <code class="language-plaintext highlighter-rouge">isSuccessfulPayment</code> and the esqueleto/SQL <code class="language-plaintext highlighter-rouge">Payment.isSuccessful</code>. This would look something like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>arrange (runTestDb db) $ do Entity paymentId payment <- forAll Payment.gen act $ do insertKey paymentId payment dbPaymentSuccessful <- fmap (unValue . head) $ select $ from $ \p -> do where_ $ p ^. PaymentId ==. 
val paymentId pure (Payment.isSuccessful p) assert $ do dbPaymentSuccessful === paymentIsSuccessful payment </code></pre></div></div> <h1 id="on-naming-things">On Naming Things</h1> No, I’m not going to talk about <a href="https://www.parsonsmatt.org/2017/06/23/on_naming_things.html">that kind of naming things</a>. This is about actually giving names to things! The most general types for <code class="language-plaintext highlighter-rouge">arrange</code>, <code class="language-plaintext highlighter-rouge">act</code>, and <code class="language-plaintext highlighter-rouge">assert</code> are: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>act, assert :: Applicative f => a -> f a act = pure assert = pure arrange :: Monad m => (forall x. n x -> m x) -> m (n (m a)) -> m a arrange transform mkAction = do action <- mkAction join (transform action) </code></pre></div></div> These are pretty ordinary and unassuming functions. They’re so general. It can be hard to see all the ways they can be useful. Likewise, if we only ever write the direct functions, then it can be difficult to capture the pattern and make it obvious in our code. Giving a thing a name makes it real in some sense. In the Haskell sense, it becomes a value you can link to, provide Haddocks for, and show examples on. In our work codebase, the equivalent functions to the <code class="language-plaintext highlighter-rouge">arrange</code>, <code class="language-plaintext highlighter-rouge">act</code>, and <code class="language-plaintext highlighter-rouge">assert</code> defined here have nearly 100 lines of documentation and examples, as well as more specified types that can help guide you to the correct implementation. Sometimes designing a library is all about narrowing the potential space of things that a user can do with your code. 
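To make the general types a bit more concrete, here is a small sketch that uses them with no testing library at all. The choice of <code class="language-plaintext highlighter-rouge">Maybe</code> as the outer monad and <code class="language-plaintext highlighter-rouge">Identity</code> as the inner one is purely an assumption for illustration, and <code class="language-plaintext highlighter-rouge">example</code> is a made-up name. The <code class="language-plaintext highlighter-rouge">arrange</code>, <code class="language-plaintext highlighter-rouge">act</code>, and <code class="language-plaintext highlighter-rouge">assert</code> definitions are the general ones from above.

```haskell
{-# LANGUAGE RankNTypes #-}

import Control.Monad (join)
import Data.Functor.Identity (Identity (..))

act, assert :: Applicative f => a -> f a
act = pure
assert = pure

arrange :: Monad m => (forall x. n x -> m x) -> m (n (m a)) -> m a
arrange transform mkAction = do
  action <- mkAction
  join (transform action)

-- Hypothetical instantiation: m = Maybe plays the role of the
-- "assertion" monad (Nothing means failure), and n = Identity plays
-- the role of the "effect" monad that we know how to run.
example :: Int -> Maybe Int
example input =
  arrange (Just . runIdentity) $ do
    value <- Just input
    act $ do
      doubled <- Identity (value * 2)
      assert $
        if doubled > 10
          then Just doubled
          else Nothing
```

With these choices, <code class="language-plaintext highlighter-rouge">example 6</code> succeeds with the doubled value and <code class="language-plaintext highlighter-rouge">example 3</code> fails in the final phase, which is the same shape as a property test passing or failing after the database work has run.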
Wed, 11 Mar 2020 00:00:00 +0000 https://www.parsonsmatt.org/2020/03/11/effectful_property_testing.html https://www.parsonsmatt.org/2020/03/11/effectful_property_testing.html Mirror Mirror: Reflection and Encoding Via Mirror, mirror, on the wall, where is the skolem that escapes the <code class="language-plaintext highlighter-rouge">forall</code>? This post is about reflection, reification, and (to get to the pragmatism) the use of the new <code class="language-plaintext highlighter-rouge">DerivingVia</code> mechanism to provide awesome codecs. What do reflection and reification have to do with any of this? Well, we’ll see, but first let’s dig into some code. Encoding and decoding JSON is a common problem, and you very often need to massage the data a little bit in order to get what you want. Sometimes you need to maintain backwards compatibility with old services, and this means that you can’t just do whatever you want internally. What works best for your domain and codebase doesn’t necessarily play nicely with the boilerplate reducing deriving mechanisms or metaprogramming. You can dispense with type classes and generic deriving. Writing encoders and decoders by hand is a great and declarative solution, and is often the right answer. However, the work can be boilerplate-y and error-prone, and some machine help is much appreciated. Fortunately, <a href="https://www.kosmikus.org/DerivingVia/deriving-via-paper.pdf"><code class="language-plaintext highlighter-rouge">DerivingVia</code></a> can be used to handle much of this work safely, composably, and without boilerplate. Let’s dig into what I’ve been working on. We’re going to need a boatload of language extensions to make this work. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE AllowAmbiguousTypes #-} {-# LANGUAGE DataKinds #-} {-# LANGUAGE DeriveAnyClass #-} {-# LANGUAGE DeriveGeneric #-} {-# LANGUAGE DerivingStrategies #-} {-# LANGUAGE DerivingVia #-} {-# LANGUAGE FlexibleContexts #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE KindSignatures #-} {-# LANGUAGE NoStarIsType #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE PolyKinds #-} {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE TypeApplications #-} {-# LANGUAGE TypeOperators #-} {-# LANGUAGE UndecidableInstances #-} </code></pre></div></div> Don’t worry about them if you don’t understand them. For some more boilerplate, we’re going to define the most common domain type: <code class="language-plaintext highlighter-rouge">User</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data User = User { userName :: String , userAge :: Int , userFavoriteAnimal :: String } deriving (Show, Generic) bob :: User bob = User "Bob" 32 "cats" </code></pre></div></div> Now, <code class="language-plaintext highlighter-rouge">User</code> does not have a <code class="language-plaintext highlighter-rouge">ToJSON</code> instance. But we want to convert it to JSON anyway. We can write a <code class="language-plaintext highlighter-rouge">newtype</code> wrapper that delegates to the <code class="language-plaintext highlighter-rouge">Generic</code> stuff with JSON, as a way to provide a <code class="language-plaintext highlighter-rouge">ToJSON</code> instance for a type that doesn’t have one. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype GenericToJSON value = GenericToJSON value instance ToJSON (GenericToJSON value) where toJSON (GenericToJSON value) = genericToJSON defaultOptions value </code></pre></div></div> GHC is definitely not going to like this, because we need some constraints. So let’s have GHC compile this and complain! <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home/matt/Projects/encoding-via/src/Lib.hs:74:24: error: • No instance for (Generic a) arising from a use of ‘genericToJSON’ Possible fix: add (Generic a) to the context of the instance declaration • In the expression: genericToJSON defaultOptions a In an equation for ‘toJSON’: toJSON (GenericToJSON a) = genericToJSON defaultOptions a In the instance declaration for ‘ToJSON (GenericToJSON a)’ | 74 | toJSON (GenericToJSON a) = genericToJSON defaultOptions a | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ </code></pre></div></div> Let’s follow GHC’s suggestion: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Generic a => ToJSON (GenericToJSON a) where toJSON (GenericToJSON a) = genericToJSON defaultOptions a </code></pre></div></div> Now we get another error: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home/matt/Projects/encoding-via/src/Lib.hs:75:24: error: • Could not deduce (aeson-1.4.6.0:Data.Aeson.Types.ToJSON.GToJSON Value Zero (Rep a)) arising from a use of ‘genericToJSON’ from the context: Generic a bound by the instance declaration at src/Lib.hs:(72,5)-(73,32) • In the expression: genericToJSON defaultOptions a In an equation for ‘toJSON’: toJSON (GenericToJSON a) = genericToJSON defaultOptions a In the instance declaration for ‘ToJSON (GenericToJSON a)’ | 75 | toJSON (GenericToJSON a) = genericToJSON defaultOptions a | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
</code></pre></div></div> Another type class constraint to paste in. However, there’s something tricky here: GHC is reporting a fully qualified name for the <code class="language-plaintext highlighter-rouge">GToJSON</code> class. That means it isn’t in scope. Let’s <a href="https://www.stackage.org/lts-14.22/hoogle?q=GToJSON">Hoogle the <code class="language-plaintext highlighter-rouge">GToJSON</code> class</a>. Looks like there are two types here with the same name. We’ve got <code class="language-plaintext highlighter-rouge">type GToJSON = Internal.GToJSON Value</code>. So I think we can just use that in the constraint: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance (Generic a, GToJSON Zero (Rep a)) => ToJSON (GenericToJSON a) where toJSON (GenericToJSON a) = genericToJSON defaultOptions a </code></pre></div></div> Sure enough, this compiles! Can we use it to convert a <code class="language-plaintext highlighter-rouge">User</code> to JSON, now? Yes. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>>>> BS8.putStrLn (Aeson.encode (GenericToJSON bob)) {"userName":"Bob","userAge":32,"userFavoriteAnimal":"cats"} </code></pre></div></div> OK, that may not be the encoding we want, but it does work. <h1 id="derivingvia">DerivingVia</h1> OK, OK, but now we actually need to provide a <code class="language-plaintext highlighter-rouge">ToJSON</code> instance for the <code class="language-plaintext highlighter-rouge">User</code>. We have a bunch of options: <h3 id="manual">Manual</h3> BORING <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance ToJSON User where toJSON user = object [ "userName" .= userName user , "userAge" .= userAge user , "userFavorateAnimal" .= userFavoriteAnimal user ] </code></pre></div></div> Also, HWOOPS, you may have noticed the typo. 
That’s unfortunately already been released and is now part of the Public API which we will embarrassingly support for the next decade or two. <code class="language-plaintext highlighter-rouge">Referer</code> has some company, at least. Anyway this is boring, error-prone, and full of repetition. But! Importantly, it gives us a tremendous amount of control over the representation. We specify exactly what we want and how we want it. Want to special case a field name? Easy! Just write it. Want to special case a value representation? Easy! Just do it. <h3 id="deriveanyclass">DeriveAnyClass</h3> All you have to do is throw a deriving clause on <code class="language-plaintext highlighter-rouge">User</code> for this to work out. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data User = User ... deriving (Generic, ToJSON) </code></pre></div></div> This is easy. But it requires a lot of work from GHC and the library author. GHC needs a feature to permit library authors to provide specialized defaults for type class methods, and library authors must then provide those specialized defaults. Library authors have to pick a single default set that is privileged for <code class="language-plaintext highlighter-rouge">DeriveAnyClass</code>, which is unfortunate. Fortunately for users, this is very easy. There’s nothing to it. But we also don’t have any control over it. So let’s look at a slightly more flexible way: <h3 id="generic-deriving">Generic Deriving</h3> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance ToJSON User where toJSON = genericToJSON defaultOptions </code></pre></div></div> The value <code class="language-plaintext highlighter-rouge">defaultOptions</code> gives us tools and hooks to modify field labels and constructor values and other ways that the JSON encoding can be handled. This is good and convenient. 
The <a href="https://hackage.haskell.org/package/aeson-casing"><code class="language-plaintext highlighter-rouge">aeson-casing</code></a> package gives us a function <code class="language-plaintext highlighter-rouge">snakeCase</code> that we can use to snake case the fields instead of using the text of the field that we’re given. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance ToJSON User where toJSON = genericToJSON options where options = defaultOptions { fieldLabelModifier = \fieldLabel -> snakeCase fieldLabel } >>> BS8.putStrLn (Aeson.encode bob) {"user_name":"Bob","user_age":32,"user_favorite_animal":"cats"} </code></pre></div></div> Cool. And finally we can drop the type name, because we want prettier fields. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance ToJSON User where toJSON = genericToJSON options where options = defaultOptions { fieldLabelModifier = \fieldLabel -> snakeCase (drop (length "user") fieldLabel) } >>> BS8.putStrLn (Aeson.encode bob) {"name":"Bob","age":32,"favorite_animal":"cats"} </code></pre></div></div> Nice. That’s what we want. <h3 id="derivingvia-1">DerivingVia</h3> We can derive a Generic-based instance using our <code class="language-plaintext highlighter-rouge">newtype</code> from earlier: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data User = User { userName :: String , userAge :: Int , userFavoriteAnimal :: String } deriving (Show, Generic) deriving ToJSON via GenericToJSON User </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">via</code> keyword allows us to specify a <code class="language-plaintext highlighter-rouge">newtype</code> wrapper that might contain additional information to use in deriving. 
This will generate an instance that looks like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance ToJSON User where toJSON user = toJSON (coerce user :: GenericToJSON User) </code></pre></div></div> Basically, we’re delegating to this instance under-the-hood: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance (Generic a, GToJSON Zero (Rep a)) => ToJSON (GenericToJSON a) where toJSON (GenericToJSON a) = genericToJSON defaultOptions a </code></pre></div></div> Here’s what I find awesome about this: <ul> <li>It subsumes default methods and democratizes them - library authors are no longer required to provide these default methods, and library users can supply them as well.</li> <li>Because it uses type classes, it is completely canonical - there can only be one instance for a type, and it should be pretty easy to find either the type or the instance.</li> <li>We can customize and reuse these values easily</li> </ul> Indeed, <code class="language-plaintext highlighter-rouge">GenericToJSON</code> is too strict of a name - we can use that wrapper for anything that just delegates to the Generic instance. This type is canonically available as <a href="https://www.stackage.org/haddock/lts-14.22/generic-data-0.7.0.0/Generic-Data.html#t:Generically"><code class="language-plaintext highlighter-rouge">Generically</code></a>. <h3 id="derivingvia--customization">DerivingVia + Customization?</h3> But, how can we customize? If we write the instance by hand, then we can customize the <code class="language-plaintext highlighter-rouge">options</code> passed in. But the language in <code class="language-plaintext highlighter-rouge">DerivingVia</code> doesn’t allow for mere values - only types can be talked about. Fortunately, we have ways of communicating across the type-value divide. 
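Before crossing that divide, it may help to see the <code class="language-plaintext highlighter-rouge">via</code> mechanism in isolation, with no JSON involved. The sketch below uses only <code class="language-plaintext highlighter-rouge">base</code>: the <code class="language-plaintext highlighter-rouge">Score</code> type and <code class="language-plaintext highlighter-rouge">totalScore</code> function are made up for illustration, and they borrow the <code class="language-plaintext highlighter-rouge">Semigroup</code> and <code class="language-plaintext highlighter-rouge">Monoid</code> instances of <code class="language-plaintext highlighter-rouge">Sum Int</code> from <code class="language-plaintext highlighter-rouge">Data.Monoid</code>.

```haskell
{-# LANGUAGE DerivingVia #-}

import Data.Monoid (Sum (..))

-- Hypothetical domain type. Score and Sum Int share a runtime
-- representation, so GHC can coerce Sum Int's instances for us.
newtype Score = Score Int
  deriving stock (Show, Eq)
  deriving (Semigroup, Monoid) via (Sum Int)

-- mconcat now adds scores, because that is what Sum Int does.
totalScore :: [Score] -> Score
totalScore = mconcat
```

Note that the behavior is picked entirely by naming a type after <code class="language-plaintext highlighter-rouge">via</code>. There is no place in that clause to hand over a value such as an options record, which is the limitation in question.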
<h1 id="functions-on-values-and-types">Functions on Values and Types</h1> In Haskell, we are very familiar with functions from values to values. It’s functional programming! But we also have types. Can we have functions from values to types? What about functions from types to types? Or functions from types to values? Value-to-value functions are ordinary functions. And we have type-to-type functions using <code class="language-plaintext highlighter-rouge">TypeFamilies</code>. Value-to-type functions are the realm of dependent types, and Haskell can only sorta simulate these sometimes in a limited and weird way. But we want type-to-value functions. Given a type, return a value. We have these - they are called “type classes.” <h2 id="what">what??</h2> It’s a bit of a mindbender! For sure. And the syntax is a little awkward. Don’t worry. Let’s make a type class that makes this super evident. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class TypeToInt a where typeToInt :: Int </code></pre></div></div> The class <code class="language-plaintext highlighter-rouge">TypeToInt</code> is a function that accepts a type and provides a value. We can define instances like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance TypeToInt Int where typeToInt = 1 instance TypeToInt String where typeToInt = 2 instance TypeToInt Char where typeToInt = 3 </code></pre></div></div> We can use the type function like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>>>> typeToInt @Int 1 >>> typeToInt @Char 3 </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">@</code> is <code class="language-plaintext highlighter-rouge">TypeApplications</code> syntax - it allows us to explicitly pass the type to the value. 
Typical type classes, like <code class="language-plaintext highlighter-rouge">Monoid</code>, are similar. Consider <code class="language-plaintext highlighter-rouge">mempty</code> - it’s a value, all alone. If we use it unadorned, it looks like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mempty :: (Monoid a) => a </code></pre></div></div> If we view this as a function from types to values, then we can pass a type and receive a value: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>>>> mempty @(Sum Int) Sum { getSum = 0 } </code></pre></div></div> Anyway, to get back on track, we’re going to need to build a type-level language for modifying JSON options, and then we’re going to need to use type classes to get a value level modifier. If that sounds scary, then, well, it kind of is. But no worries - you’ll get the hang of it! <h1 id="modify-options">Modify Options</h1> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype Codec (tag :: k) (val :: Type) = Codec val </code></pre></div></div> The type that we’ll use to hang our hat is this. The <code class="language-plaintext highlighter-rouge">Codec</code> type takes a type parameter <code class="language-plaintext highlighter-rouge">tag</code> that can be of any kind <code class="language-plaintext highlighter-rouge">k</code>, and it contains a single value of type <code class="language-plaintext highlighter-rouge">val</code>. This allows us to use it with <code class="language-plaintext highlighter-rouge">DerivingVia</code>. Now, we’ll define an instance of <code class="language-plaintext highlighter-rouge">ToJSON</code> for <code class="language-plaintext highlighter-rouge">Codec</code>, which modifies the options based on <code class="language-plaintext highlighter-rouge">tag</code>. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance ( GToJSON Zero (Rep a), Generic a , ModifyOptions tag ) => ToJSON (Codec tag a) where toJSON (Codec a) = genericToJSON (modifyOptions @tag defaultOptions) a </code></pre></div></div> <code class="language-plaintext highlighter-rouge">ModifyOptions</code> is a function from a type to a value - in this case, a function which modifies options. We’ll start with the base case - do nothing! For this, we can use the <code class="language-plaintext highlighter-rouge">()</code> type, but we’ll alias it for readability: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type AsIs = () class ModifyOptions tag where modifyOptions :: Options -> Options instance ModifyOptions AsIs where modifyOptions = id </code></pre></div></div> This gives us the same thing as <code class="language-plaintext highlighter-rouge">deriving ToJSON via Generically User</code>, and we can verify this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>>>> encode (Codec bob :: Codec AsIs User) {"userName":"Bob","userAge":32,"userFavoriteAnimal":"cats"} </code></pre></div></div> Now, we want the ability to <code class="language-plaintext highlighter-rouge">snake_case</code> the options. So we’ll create a type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data SnakeCase </code></pre></div></div> The purpose of this type is to “reflect” the value <code class="language-plaintext highlighter-rouge">snakeCase :: String -> String</code> and modify the field labels with that function. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance ModifyOptions SnakeCase where
    modifyOptions options = options
        { fieldLabelModifier = \fieldLabel ->
            snakeCase (fieldLabelModifier options fieldLabel)
        }
</code></pre></div></div> Oof, record update, how nasty. Let’s factor that out into its own pattern: we want to take an <code class="language-plaintext highlighter-rouge">Options</code> and compose a function with the existing <code class="language-plaintext highlighter-rouge">fieldLabelModifier</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>addFieldLabelModifier :: (String -> String) -> Options -> Options
addFieldLabelModifier f options = options
    { fieldLabelModifier = f . fieldLabelModifier options
    }

instance ModifyOptions SnakeCase where
    modifyOptions = addFieldLabelModifier snakeCase
</code></pre></div></div> Much nicer. Excellent. Does this work? Let’s try! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>>>> BS8.putStrLn (Aeson.encode (Codec bob :: Codec SnakeCase User))
{"user_name":"Bob","user_age":32,"user_favorite_animal":"cats"}
</code></pre></div></div> Nice. Now, let’s drop that type name from the front. We’ll write a combinator that lets you specify that you want to <code class="language-plaintext highlighter-rouge">Drop</code> something from the front.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Drop something </code></pre></div></div> And, here’s our instance: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance (KnownSymbol symbol) => ModifyOptions (Drop symbol) where modifyOptions = addFieldLabelModifier $ \fieldLabel -> case List.stripPrefix prefix fieldLabel of Just stripped -> stripped Nothing -> fieldLabel where prefix = symbolVal (Proxy @symbol) </code></pre></div></div> There’s a bit to unpack here. This type class is matching on two types: one visibly (<code class="language-plaintext highlighter-rouge">Drop symbol</code>), and one invisibly. It’s matching on the inferred kind of <code class="language-plaintext highlighter-rouge">symbol</code> – <code class="language-plaintext highlighter-rouge">symbol :: Symbol</code>. It’s real easy to get tripped up when GHC starts inferring stuff about kinds, so if you get confused here, you’re in good company - this stuff confuses me all the time. A <code class="language-plaintext highlighter-rouge">Symbol</code> is a String at the type level. The function <a href="https://www.stackage.org/haddock/lts-14.22/base-4.12.0.0/GHC-TypeLits.html#v:symbolVal"><code class="language-plaintext highlighter-rouge">symbolVal</code></a> is used to get a <code class="language-plaintext highlighter-rouge">String</code> from a <code class="language-plaintext highlighter-rouge">Symbol</code>. It’s another function from types to values that we’ve been using. So we’d say that we’re “reflecting” the symbol into the <code class="language-plaintext highlighter-rouge">prefix</code> variable, and then using it normally. This works! 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>>>> BS8.putStrLn (Aeson.encode (Codec bob :: Codec (Drop "user") User)) {"Name":"Bob","Age":32,"FavoriteAnimal":"cats"} </code></pre></div></div> But we want to do both of these at the same time, without writing a bunch of boilerplatey code. <h1 id="composing">Composing</h1> We need a type to compose these functions. We can’t use <code class="language-plaintext highlighter-rouge">.</code> as a type operator. So that leaves us with <code class="language-plaintext highlighter-rouge">$</code> and <code class="language-plaintext highlighter-rouge">&</code>. <code class="language-plaintext highlighter-rouge">$</code> has a useful type operator already - you can feasibly use it to write <code class="language-plaintext highlighter-rouge">IO $ Either String Char</code> and remove brackets there. So we’ll use <code class="language-plaintext highlighter-rouge">&</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data a & b infixr 6 & </code></pre></div></div> Now, we’ll write an instance of <code class="language-plaintext highlighter-rouge">ModifyOptions</code> for this type. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance () => ModifyOptions (a & b) where modifyOptions = undefined </code></pre></div></div> Just kidding, we put in a dummy/skeleton implementation. So the idea is that we want to have a symmetry with <code class="language-plaintext highlighter-rouge">&</code>, which is defined like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(&) :: a -> (a -> b) -> b a & f = f a </code></pre></div></div> You use it like <code class="language-plaintext highlighter-rouge">[1,2,3] & map (+1)</code>. It’s similar to Elm, F#, and Elixir’s <code class="language-plaintext highlighter-rouge">|></code> operator. 
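For a feel of the value-level operator before we mirror it in types, here is a small sketch using only <code class="language-plaintext highlighter-rouge">base</code>. The helpers <code class="language-plaintext highlighter-rouge">dropPrefix</code> and this hand-rolled <code class="language-plaintext highlighter-rouge">snakeCase</code> are hypothetical stand-ins written for the example; they are not the functions used elsewhere in this post.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module Main where

import Data.Char (isUpper, toLower)
import Data.Function ((&))
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe)

-- Hypothetical helper: remove a prefix if present, else leave the label alone.
dropPrefix :: String -> String -> String
dropPrefix prefix label = fromMaybe label (stripPrefix prefix label)

-- Hypothetical, simplified snake-casing: an underscore before each
-- upper-case letter, then strip any leading underscore.
snakeCase :: String -> String
snakeCase = dropWhile (== '_') . concatMap step
  where
    step c
      | isUpper c = ['_', toLower c]
      | otherwise = [c]

main :: IO ()
main = do
  -- (&) feeds the value on the left into the function on the right.
  print ([1, 2, 3 :: Int] & map (+ 1))
  -- prints [2,3,4]
  putStrLn ("userFavoriteAnimal" & dropPrefix "user" & snakeCase)
  -- prints favorite_animal
</code></pre></div></div>
Reading left to right, the label is first stripped of its prefix and then snake-cased, which is the order we will want the type-level version to follow too.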
With this understanding, we can stitch together the instance. We need for <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">b</code> to have an instance of <code class="language-plaintext highlighter-rouge">ModifyOptions</code>, and then we’ll compose those functions. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance (ModifyOptions a, ModifyOptions b) => ModifyOptions (a & b) where modifyOptions = modifyOptions @b . modifyOptions @a </code></pre></div></div> Now, we can write our <code class="language-plaintext highlighter-rouge">Codec</code> that will do both of these operations. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>>>> let val = Codec bob :: Codec (Drop "user" & SnakeCase) User >>> BS8.putStrLn (Aeson.encode val) {"name":"Bob","age":32,"favorite_animal":"cats"} </code></pre></div></div> Armed with this, we can now derive that instance: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data User = User ... deriving stock Generic deriving ToJSON via Codec (Drop "user" & SnakeCase) User </code></pre></div></div> <h1 id="derivingvia-reflecting-types-to-values-to-control-classes">DerivingVia: Reflecting Types to Values to Control Classes</h1> <code class="language-plaintext highlighter-rouge">DerivingVia</code> gives us a powerful language for deriving instances, but it requires that we write at the type level. Fortunately, we can reflect our types into functions, and use those to drive behavior. These <code class="language-plaintext highlighter-rouge">newtype</code> wrappers aren’t useful only for deriving. We can also use it to specify alternative behaviors easily. We’ve needed this recently at my company to load results in a JSONB array. 
PostgreSQL has an aggregation function <a href="https://www.postgresql.org/docs/9.5/functions-aggregate.html"><code class="language-plaintext highlighter-rouge">jsonb_agg</code></a> that will take an expression, convert it to JSONB, and collect the results in a JSONB list. However, there’s no way to control the JSONB representation - <code class="language-plaintext highlighter-rouge">postgresql</code> uses the column names for the keys, as-is. <code class="language-plaintext highlighter-rouge">persistent</code> can automatically derive JSON instances for you, but it can potentially pick a different encoder/decoder than the one PostgreSQL uses. This is the default behavior with most of the settings. Furthermore, you may not even have derived JSON instances for these types! So how are you going to make the communication work, without a ton of error-prone boilerplate? We’ll use the exact same <code class="language-plaintext highlighter-rouge">newtype</code> and reflection tricks. We’ll point these techniques at <code class="language-plaintext highlighter-rouge">FromJSON</code> instead, which should be able to reuse all of the combinators we’re building here to modify the requisite options. In the interest of brevity, though, that particular exposition will have to wait for another post. In the meantime, you can look at code on my <a href="https://github.com/parsonsmatt/encoding-via/blob/master/src/Lib.hs"><code class="language-plaintext highlighter-rouge">encoding-via</code></a> repository. Tue, 04 Feb 2020 00:00:00 +0000 https://www.parsonsmatt.org/2020/02/04/mirror_mirror.html Plucking Constraints There’s a Haskell trick that I’ve observed in a few settings, and I’ve never seen a name put to it. I’d like to write a post about the technique and give it a name. It’s often useful to write in a type class constrained manner, but at some point you need to discharge (or satisfy?) those constraints.
You can pluck a single constraint at a time. This technique is used primarily in <code class="language-plaintext highlighter-rouge">mtl</code> (or other effect libraries), but it also has uses in error handling. <h1 id="gathering-constraints">Gathering Constraints</h1> We can easily gather constraints by using functions that require them. Here’s a function that has a <code class="language-plaintext highlighter-rouge">MonadReader Int</code> constraint: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>number :: (MonadReader Int m) => m Int
number = ask
</code></pre></div></div> Here’s another function that has a <code class="language-plaintext highlighter-rouge">MonadError String</code> constraint: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>woops :: (MonadError String m) => m void
woops = throwError "woops!"
</code></pre></div></div> And yet another function with a <code class="language-plaintext highlighter-rouge">MonadState Char</code> constraint: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>update :: (MonadState Char m) => m ()
update = modify succ
</code></pre></div></div> We can seamlessly write a program that uses all of these functions together: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>program = do
    number
    woops
    update
</code></pre></div></div> GHC will happily infer the type of <code class="language-plaintext highlighter-rouge">program</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>program
    :: ( MonadReader Int m
       , MonadError String m
       , MonadState Char m
       )
    => m ()
</code></pre></div></div> At some point, we’ll need to actually use this. Virtually all Haskell code that gets used is called from <code class="language-plaintext highlighter-rouge">main :: IO ()</code>.
Let’s try just using it directly: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main :: IO () main = program </code></pre></div></div> GHC is going to complain about this. It’s going to say something like: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>No instance for `MonadReader Int IO` arising from a use of `program` .... No instance for `MonadState Char IO` arising from a use of `program` .... Couldn't match type `IOException` with type `String` .... </code></pre></div></div> This is GHC’s way of telling us that it doesn’t know how to run our program in <code class="language-plaintext highlighter-rouge">IO</code>. Perhaps the <code class="language-plaintext highlighter-rouge">IO</code> type is not powerful enough to do all the stuff we want as-is. And it has a conflicting way to throw errors - the <code class="language-plaintext highlighter-rouge">MonadError</code> instance is for the <code class="language-plaintext highlighter-rouge">IOException</code> type, not the <code class="language-plaintext highlighter-rouge">String</code> that we’re trying to use. So we have to do something differently. <h1 id="unify">Unify</h1> Let’s try figuring out what GHC is doing with <code class="language-plaintext highlighter-rouge">main = program</code>. First, we’ll look at the equations: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>program :: ( MonadReader Int m , MonadError String m , MonadState Char m ) => m () main :: IO () </code></pre></div></div> GHC sees that the “shape” of these types is similar. It can substitute <code class="language-plaintext highlighter-rouge">IO</code> for <code class="language-plaintext highlighter-rouge">m</code> in <code class="language-plaintext highlighter-rouge">program</code>. Does that work? 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>program
    :: ( MonadReader Int IO
       , MonadError String IO
       , MonadState Char IO
       )
    => IO ()
</code></pre></div></div> Yeah! That looks okay so far. Now, we have a totally concrete constraint: <code class="language-plaintext highlighter-rouge">MonadReader Int IO</code> doesn’t have any type variables. So let’s look it up and see if we can find an instance . . . Unfortunately, there’s no instance defined like this. If there’s no instance for <code class="language-plaintext highlighter-rouge">IO</code>, then how are we going to satisfy that constraint? We need to get rid of it and discharge it somehow! The <code class="language-plaintext highlighter-rouge">mtl</code> library gives us a type whose sole responsibility is discharging the <code class="language-plaintext highlighter-rouge">MonadReader</code> constraint: <code class="language-plaintext highlighter-rouge">ReaderT</code>. Let’s check out the <code class="language-plaintext highlighter-rouge">runReaderT</code> function: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>runReaderT :: ReaderT r m a -> r -> m a
</code></pre></div></div> <code class="language-plaintext highlighter-rouge">runReaderT</code> says: <blockquote> My first argument is a <code class="language-plaintext highlighter-rouge">ReaderT r m a</code>. My second argument is the <code class="language-plaintext highlighter-rouge">r</code> environment. And then I’ll take off the <code class="language-plaintext highlighter-rouge">ReaderT</code> business on the type, returning only <code class="language-plaintext highlighter-rouge">m</code>. </blockquote> We’re going to pluck off that <code class="language-plaintext highlighter-rouge">MonadReader</code> constraint by turning it into a concrete type. And <code class="language-plaintext highlighter-rouge">runReaderT</code> is one way to do that plucking.
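To see the plucking in isolation, here is a minimal sketch that discharges a single <code class="language-plaintext highlighter-rouge">MonadReader Int</code> constraint and nothing else. It assumes the <code class="language-plaintext highlighter-rouge">mtl</code> package is available; the <code class="language-plaintext highlighter-rouge">FlexibleContexts</code> pragma is only there for compilers that do not enable it by default.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE FlexibleContexts #-}

module Main where

import Control.Monad.Reader (MonadReader, ask, runReaderT)

-- Works in any monad that can supply an Int environment.
number :: MonadReader Int m => m Int
number = ask

main :: IO ()
main = do
  -- Applying runReaderT forces m to be ReaderT Int IO, which has the
  -- MonadReader Int instance; what comes out is a plain IO action.
  n <- runReaderT number 3
  print n -- prints 3
</code></pre></div></div>
Nothing in <code class="language-plaintext highlighter-rouge">number</code> changed; the call site alone picked the concrete type.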
GHC inferred a pretty general type for <code class="language-plaintext highlighter-rouge">program</code> earlier, but we can pick a more concrete type. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>program :: ( MonadError String n , MonadState Char n ) => ReaderT Int n () </code></pre></div></div> Notice how we’ve shifted a constraint into a concrete type. We’ve fixed the type of <code class="language-plaintext highlighter-rouge">m</code> to be <code class="language-plaintext highlighter-rouge">ReaderT Int n</code>, and all the other constraints got delegated down to this new type variable <code class="language-plaintext highlighter-rouge">n</code>. We don’t need to pick this concrete type at our definition site of <code class="language-plaintext highlighter-rouge">program</code>. Indeed, we can provide that annotation somewhere else, like in <code class="language-plaintext highlighter-rouge">main</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main :: IO () main = let program' :: ( MonadError String n , MonadState Char n ) => ReaderT Int n () program' = program in runReaderT program' 3 </code></pre></div></div> We’re literally saying “<code class="language-plaintext highlighter-rouge">program'</code> is exactly like <code class="language-plaintext highlighter-rouge">program</code> but we’re making it a tiny bit more concrete.” Now, GHC still isn’t happy. It’s going to complain that there’s no instance for <code class="language-plaintext highlighter-rouge">MonadState Char IO</code> and that <code class="language-plaintext highlighter-rouge">String</code> isn’t equal to <code class="language-plaintext highlighter-rouge">IOException</code>. So we have a little more work to do. Fortunately, the <code class="language-plaintext highlighter-rouge">mtl</code> library gives us types for plucking these constraints off too. 
<code class="language-plaintext highlighter-rouge">StateT</code> and <code class="language-plaintext highlighter-rouge">runStateT</code> can be used to pluck off a <code class="language-plaintext highlighter-rouge">MonadState</code> constraint, as well as <code class="language-plaintext highlighter-rouge">ExceptT</code> and <code class="language-plaintext highlighter-rouge">runExceptT</code>. Let’s write <code class="language-plaintext highlighter-rouge">program''</code>, which will use <code class="language-plaintext highlighter-rouge">StateT</code> to ‘pluck’ the <code class="language-plaintext highlighter-rouge">MonadState Char</code> constraint off. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main :: IO () main = let program' :: ( MonadError String n , MonadState Char n ) => ReaderT Int n () program' = program program'' :: (MonadError String n) => ReaderT Int (StateT Char n) () program'' = program' programRead :: (MonadError String n) => StateT Char n () programRead = runReaderT program'' 3 in runStateT programRead 'c' </code></pre></div></div> GHC still isn’t happy - it’s going to complain that <code class="language-plaintext highlighter-rouge">()</code> and <code class="language-plaintext highlighter-rouge">((), Char)</code> aren’t the same types. Also we still haven’t dealt with <code class="language-plaintext highlighter-rouge">IOException</code> and <code class="language-plaintext highlighter-rouge">String</code> being different. So let’s use <code class="language-plaintext highlighter-rouge">ExceptT</code> to pluck out that final constraint. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main :: IO ()
main =
    let
        program'
            :: ( MonadError String n
               , MonadState Char n
               )
            => ReaderT Int n ()
        program' = program

        program''
            :: (MonadError String n)
            => ReaderT Int (StateT Char n) ()
        program'' = program'

        program'''
            :: (Monad m)
            => ReaderT Int (StateT Char (ExceptT String m)) ()
        program''' = program''

        -- ... snip ...
</code></pre></div></div> Okay, so I’m going to snip here and talk about something interesting. When we plucked the <code class="language-plaintext highlighter-rouge">MonadError</code> constraint out, we didn’t totally remove it. Instead, we’re left with a <code class="language-plaintext highlighter-rouge">Monad</code> constraint. We’ll get into this later. But first, let’s look at the steps that happen when we run the program, one piece at a time. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>        -- ... snip ...

        programRead
            :: (Monad m)
            => StateT Char (ExceptT String m) ()
        programRead = runReaderT program''' 3

        programStated
            :: (Monad m)
            => ExceptT String m ((), Char)
        programStated = runStateT programRead 'a'

        programExcepted
            :: (Monad m)
            => m (Either String ((), Char))
        programExcepted = runExceptT programStated

        programInIO :: IO (Either String ((), Char))
        programInIO = programExcepted
    in do
        result <- programInIO
        case result of
            Left err -> do
                fail err
            Right ((), endState) -> do
                print endState
                pure ()
</code></pre></div></div> GHC doesn’t error on this! When we finally get to <code class="language-plaintext highlighter-rouge">programExcepted</code>, we have a type that GHC can happily accept.
The <code class="language-plaintext highlighter-rouge">IO</code> type has an instance of <code class="language-plaintext highlighter-rouge">Monad</code>, and so we can just substitute <code class="language-plaintext highlighter-rouge">(Monad m) => m ()</code> and <code class="language-plaintext highlighter-rouge">IO ()</code> without any fuss. These are all of the steps, laid out explicitly, but we can condense them significantly. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>program :: ( MonadReader Int m , MonadError String m , MonadState Char m ) => m () program = do number woops update main :: IO () main = do result <- runExceptT (runStateT (runReaderT program 3) 'a') case result of Left err -> do fail err Right ((), endState) -> do print endState pure () </code></pre></div></div> <h1 id="plucking-constraints">Plucking Constraints!</h1> The general pattern here is: <ol> <li>A function has many constraints.</li> <li>You can pluck a single constraint off by making the type a little more concrete.</li> <li>The rest of the constraints are delegated to the new type.</li> </ol> We don’t need to only do this in <code class="language-plaintext highlighter-rouge">main</code>. Suppose we want to discharge the <code class="language-plaintext highlighter-rouge">MonadReader Int</code> inside of <code class="language-plaintext highlighter-rouge">program</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>program :: ( MonadState Char m , MonadError String m ) => m () program = do i <- gets fromEnum runReaderT number i woops update </code></pre></div></div> We plucked the <code class="language-plaintext highlighter-rouge">MonadReader</code> constraint off of <code class="language-plaintext highlighter-rouge">number</code> directly and discharged it right there. 
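Both styles of discharge can be put in one runnable file. This is a sketch assuming the <code class="language-plaintext highlighter-rouge">mtl</code> package; the names <code class="language-plaintext highlighter-rouge">runAll</code> and <code class="language-plaintext highlighter-rouge">localNumber</code> are hypothetical helpers introduced for the example, and it prints the result rather than calling <code class="language-plaintext highlighter-rouge">fail</code> so the outcome is visible.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE FlexibleContexts #-}

module Main where

import Control.Monad.Except (ExceptT, MonadError, runExceptT, throwError)
import Control.Monad.Reader (MonadReader, ReaderT, ask, runReaderT)
import Control.Monad.State (MonadState, StateT, gets, modify, runStateT)

number :: MonadReader Int m => m Int
number = ask

woops :: MonadError String m => m void
woops = throwError "woops!"

update :: MonadState Char m => m ()
update = modify succ

program :: (MonadReader Int m, MonadError String m, MonadState Char m) => m ()
program = do
  _ <- number
  woops
  update

-- Hypothetical helper: pluck all three constraints, outermost first.
runAll
  :: ReaderT Int (StateT Char (ExceptT String IO)) a
  -> IO (Either String (a, Char))
runAll action = runExceptT (runStateT (runReaderT action 3) 'a')

-- Hypothetical helper: discharge MonadReader locally, so only
-- MonadState is left in the signature.
localNumber :: MonadState Char m => m Int
localNumber = do
  i <- gets fromEnum
  runReaderT number i

main :: IO ()
main = do
  failed <- runAll program
  print failed -- prints Left "woops!"
  ok <- runAll (number >> update)
  print ok -- prints Right ((),'b')
  local <- runStateT localNumber 'a'
  print local -- prints (97,'a')
</code></pre></div></div>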
So you don’t have to just collect constraints until you discharge them in <code class="language-plaintext highlighter-rouge">main</code>. You can pluck them off one-at-a-time as you need to, or as it becomes convenient to do so. <h1 id="how-does-it-work">How does it work?</h1> Let’s look at <code class="language-plaintext highlighter-rouge">ReaderT</code> and <code class="language-plaintext highlighter-rouge">MonadReader</code> to see how the type and class are designed for plucking. We don’t need to worry about the implementations, just the types: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype ReaderT r m a

-- or, with explicit kinds,
newtype ReaderT (r :: Type) (m :: Type -> Type) (a :: Type)

class MonadReader r m | m -> r

instance (Monad m) => MonadReader r (ReaderT r m)

instance (MonadError e m) => MonadError e (ReaderT r m)
instance (MonadState s m) => MonadState s (ReaderT r m)
</code></pre></div></div> <code class="language-plaintext highlighter-rouge">ReaderT</code>, partially applied, has a few different readings: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- [1]
ReaderT r     :: (Type -> Type) -> (Type -> Type)
-- [2]
ReaderT r m   :: (Type -> Type)
-- [3]
ReaderT r m a :: Type
</code></pre></div></div> <ol> <li>With just an <code class="language-plaintext highlighter-rouge">r</code> applied, we have a ‘monad transformer.’ Don’t worry if this is tricky: just notice that we have something like <code class="language-plaintext highlighter-rouge">(a -> a) -> (a -> a)</code>.
At the value level, this might look something like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> updatePlayer :: (Player -> Player) -> GameState -> GameState </code></pre></div> </div> Where we can call <code class="language-plaintext highlighter-rouge">updatePlayer</code> to ‘lift’ a function that operates on <code class="language-plaintext highlighter-rouge">Player</code>s to an entire <code class="language-plaintext highlighter-rouge">GameState</code>. </li> <li>With an <code class="language-plaintext highlighter-rouge">m</code> and an <code class="language-plaintext highlighter-rouge">r</code> applied, we have a ‘monad.’ Again, don’t worry if this is tricky. Just notice that we have something that fits the same shape that the <code class="language-plaintext highlighter-rouge">m</code> parameter has.</li> <li>Finally, we have a regular old type that has runtime values.</li> </ol> The important bit here is the ‘delegation’ type variable. For the class we know how to handle, we can write a ‘base case’: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance (Monad m) => MonadReader r (ReaderT r m) </code></pre></div></div> And for the classes that we don’t know how to handle, we can write ‘recursive cases’: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance (MonadError e m) => MonadError e (ReaderT r m) instance (MonadState s m) => MonadState s (ReaderT r m) </code></pre></div></div> Now, GHC has all the information it needs to pluck a single constraint off and delegate the rest. <h1 id="plucking-errors">Plucking Errors</h1> I mentioned that this technique can also be applied to errors. First, we need to write classes that work for our errors. 
Let’s say we have database, HTTP, and filesystem errors: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class AsDbError err where liftDbError :: DbError -> err isDbError :: err -> Maybe DbError class AsHttpError err where liftHttpError :: HttpError -> err isHttpError :: err -> Maybe HttpError class AsFileError err where liftFileError :: FileError -> err isFileError :: err -> Maybe FileError </code></pre></div></div> Obviously, our ‘base case’ instances are pretty simple. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance AsDbError DbError where liftDbError = id isDbError = Just instance AsHttpError HttpError where liftHttpError = id isHttpError = Just -- etc... </code></pre></div></div> But we need a way of “delegating.” So let’s write our ‘error transformer’ type for each error: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data DbErrorOr err = IsDbErr DbError | DbOther err data HttpErrorOr err = IsHttpErr HttpError | HttpOther err data FileErrorOr err = IsFileErr FileError | FileOther err </code></pre></div></div> Now, we can write an instance for <code class="language-plaintext highlighter-rouge">DbErrorOr</code>. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance AsDbError (DbErrorOr err) where liftDbError dbError = IsDbErr dbError isDbError (IsDbErr e) = Just e isDbError (DbOther _) = Nothing </code></pre></div></div> This one is pretty simple - it is also a ‘base case.’ Let’s write the recursive case: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance AsHttpError err => AsHttpError (DbErrorOr err) where liftHttpError httpError = DbOther (liftHttpError httpError) isHttpError (IsDbErr _) = Nothing isHttpError (DbOther err) = isHttpError err </code></pre></div></div> Here, we’re just writing some boilerplate code to delegate to the underlying <code class="language-plaintext highlighter-rouge">err</code> variable. We’d want to repeat this for every permutation, of course. Now, we can compose programs that throw varying errors: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>program :: (AsHttpError e, AsDbError e) => Either e () program = do Left (liftHttpError HttpError) Left (liftDbError DbError) </code></pre></div></div> The constraints collect exactly as nicely as you’d want, and the type class machinery allows you to easily go from the single type to the concrete type. Let’s ‘pluck’ the constraint. We’ll ‘pick’ a concrete type and delegate the other constraint to the type variable: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>program' :: (AsHttpError e) => Either (DbErrorOr e) () program' = program </code></pre></div></div> GHC is pretty happy about this. All the instances work out, and it solves the problem of how to delegate everything for you. 
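The pieces so far can be assembled into one file to check that the delegation really resolves. This is a sketch, not a prescribed encoding: <code class="language-plaintext highlighter-rouge">DbError</code> and <code class="language-plaintext highlighter-rouge">HttpError</code> are assumed to be single-constructor placeholder types, and the file-error class is left out to keep it short.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module Main where

-- Placeholder error types, assumed for the sake of the example.
data DbError = DbError deriving (Show, Eq)
data HttpError = HttpError deriving (Show, Eq)

class AsDbError err where
  liftDbError :: DbError -> err
  isDbError :: err -> Maybe DbError

class AsHttpError err where
  liftHttpError :: HttpError -> err
  isHttpError :: err -> Maybe HttpError

-- Base case: HttpError trivially "is" an HTTP error.
instance AsHttpError HttpError where
  liftHttpError = id
  isHttpError = Just

-- The 'error transformer': a database error, or whatever else err holds.
data DbErrorOr err = IsDbErr DbError | DbOther err
  deriving (Show, Eq)

-- Base case for the class this type knows how to handle.
instance AsDbError (DbErrorOr err) where
  liftDbError = IsDbErr
  isDbError (IsDbErr e) = Just e
  isDbError (DbOther _) = Nothing

-- Recursive case: delegate HTTP errors to the inner err.
instance AsHttpError err => AsHttpError (DbErrorOr err) where
  liftHttpError = DbOther . liftHttpError
  isHttpError (IsDbErr _) = Nothing
  isHttpError (DbOther err) = isHttpError err

program :: (AsHttpError e, AsDbError e) => Either e ()
program = do
  Left (liftHttpError HttpError)
  Left (liftDbError DbError)

-- Pluck AsDbError by choosing DbErrorOr; AsHttpError is delegated to e.
program' :: AsHttpError e => Either (DbErrorOr e) ()
program' = program

main :: IO ()
main = print (program' :: Either (DbErrorOr HttpError) ())
-- prints Left (DbOther HttpError)
</code></pre></div></div>
The first <code class="language-plaintext highlighter-rouge">Left</code> short-circuits, so the HTTP error is what surfaces, wrapped in <code class="language-plaintext highlighter-rouge">DbOther</code> by the recursive instance.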
We can pattern match directly on this, which allows us to “catch” individual errors and discharge them: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>handleLeft :: Either err a -> (err -> Either err' a) -> Either err' a handleLeft (Right r) _ = Right r handleLeft (Left l) f = f l program'' :: AsHttpError e => Either e () program'' = handleLeft program $ \err -> case err of IsDbErr dbError -> Right () DbOther dbOther -> Left dbOther </code></pre></div></div> Voila! We’ve “handled” the database error, but we’ve delegated handling the HTTP error. The technique of ‘constraint plucking’ works out here. Now, an astute reader might note that this technique is so boring. There’s so much boilerplate code!! SO MUCH!!! Come on, y’all. It’s exactly the same amount of boilerplate code as the <code class="language-plaintext highlighter-rouge">mtl</code> library requires. Is it really that bad? <blockquote> YESSSS!!! </blockquote> Okay, yeah, it’s pretty bad. This encoding is primarily here to present the ‘constraint plucking’ technique. You can do a more general and ergonomic approach to handling errors like this, but describing it is out of scope for this post. I’ve published a library named <a href="https://hackage.haskell.org/package/plucky"><code class="language-plaintext highlighter-rouge">plucky</code></a> that captures this pattern, and the <a href="https://hackage.haskell.org/package/plucky-0.0.0.1/docs/Data-Either-Plucky.html">module documentation</a> covers it pretty extensively. Hopefully you find this concept as useful as I have. Best of luck in your adventures! Fri, 03 Jan 2020 00:00:00 +0000 https://www.parsonsmatt.org/2020/01/03/plucking_constraints.html https://www.parsonsmatt.org/2020/01/03/plucking_constraints.html Write Junior Code <h2 id="a-plea-to-haskellers-everywhere">A plea to Haskellers everywhere.</h2> Haskell has a hiring problem. 
There aren’t many Haskell jobs, and there aren’t many Haskell employees. Haskell employees tend to be senior engineers, and the vast majority of job ads want senior-level Haskell candidates. The vast majority of Haskell users do not have any professional production experience, and yet almost every job wants production Haskell experience. Why is this the case? We write fancy code. Here’s a familiar story: <blockquote> Boss: You’re going to be allowed to make a new project in whatever language you want. Even Haskell. Employee: Oh yeah!! Time to write FANCY HASKELL!! Employee writes a ton of really fancy Haskell, delivers fantastically and in about 1000 lines of code. Everyone is very impressed. The project grows in scope. Boss: It’s time to hire another Haskeller. What are the job requirements? Employee: Oh, they’ll need to know type-level programming, lenses, <code class="language-plaintext highlighter-rouge">servant</code>, Generics, monad transformers, <code class="language-plaintext highlighter-rouge">mtl</code>, and advanced multithreading in order to be productive anytime soon. </blockquote> The boss then has trouble hiring anyone with that skill set. The project can’t really grow anymore. Maybe the original employee left, and now they have a legacy Haskell codebase that they can’t deal with. Maybe the original employee tried to train others on the codebase, but there was too much to teach before anyone could do anything productively. Finally, someone gets hired on - they have several years of Production Haskell under their belts. But where did they come from? Another Haskell company, most likely! Now that company has a job to fill. Where do they fill it from? The same pool of candidates that they just lost someone to! They can’t hire juniors or train folks for the same reasons. <h1 id="break-the-cycle">Break The Cycle</h1> This coming year, let’s break the cycle. Let’s write junior code. Let’s write simple, basic, easy Haskell. 
Let’s get bogged down with how much simple code we write. Let’s make jobs for juniors. Let’s hire juniors. Let’s train those juniors into seniors. Let us grow Haskell in industry by writing simpler code and making room for the less experienced. Let’s not delete all of our fancy code - it serves a purpose! Let’s make it a small part of our codebase, preferably hidden in libraries with nice simple interfaces. Let us share the joy and wonder of an Actually (Mostly) Good programming language with the people that haven’t had the privilege and opportunity to work for years in it already. <h2 id="addendum">Addendum:</h2> Kudos to <a href="https://alpacaaa.net/thoughts-on-haskell-2020/">Marco Sampellegrini</a> who wrote basically the same post as me today. Kudos to <a href="https://www.snoyman.com/blog/2019/11/boring-haskell-manifesto">Michael Snoyman</a> who has been championing Boring Haskell for a while. And kudos to everyone else who has contributed to making Haskell easy, and not just powerful/fun/flexible/fast/amazing. Thu, 26 Dec 2019 00:00:00 +0000 https://www.parsonsmatt.org/2019/12/26/write_junior_code.html https://www.parsonsmatt.org/2019/12/26/write_junior_code.html Splitting Persistent Models Reddit user /u/qseep <a href="https://www.reddit.com/r/haskell/comments/e2l1yj/keeping_compilation_fast/f91hcwm/?context=3">made a comment on my last blog post</a>, asking if I had any advice for splitting up <code class="language-plaintext highlighter-rouge">persistent</code> model definitions: <blockquote> A schema made using <a href="https://hackage.haskell.org/package/persistent">persistent</a> feels like a giant Types module. One change to an entity definition requires a recompile of the entire schema and everything that depends on it. Is there a similar process to break up a persistent schema into pieces? </blockquote> Yes! There is. In fact, I’ve been working on this at work, and it’s made a big improvement in our overall compile-times. 
I’m going to lay out the strategy and code here to make it all work out. You’d primarily want to do this to improve compilation times, though it’s also logically nice to “only import what you need” I guess. <h1 id="starting-point-and-background">Starting Point and Background</h1> <a href="https://hackage.haskell.org/package/persistent"><code class="language-plaintext highlighter-rouge">persistent</code></a> is a database library in Haskell that focuses on rapid productivity and iteration with relatively simple types. It’s heavier than a raw SQL library, but much simpler than something like <a href="https://hackage.haskell.org/package/opaleye"><code class="language-plaintext highlighter-rouge">opaleye</code></a> or <a href="https://hackage.haskell.org/package/beam"><code class="language-plaintext highlighter-rouge">beam</code></a>. It also offers fewer features and less type-safety than those two libraries. Trade-offs abound! Usually, <code class="language-plaintext highlighter-rouge">persistent</code> users will define the models/tables for their database using the <code class="language-plaintext highlighter-rouge">persistent</code> QuasiQuoter language. The examples in the <a href="https://www.yesodweb.com/book/persistent">Persistent chapter in the Yesod book</a> use the QuasiQuoter directly in line with the Haskell code: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
    name String
    age Int Maybe
    deriving Show
BlogPost
    title String
    authorId PersonId
    deriving Show
|]
</code></pre></div></div> The Yesod scaffold, however, loads a file: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- You can define all of your database entities in the entities file.
-- You can find more information on persistent and how to declare entities
-- at:
-- https://www.yesodweb.com/book/persistent/
share [mkPersist sqlSettings, mkMigrate "migrateAll"]
    $(persistFileWith lowerCaseSettings "config/models")
</code></pre></div></div> For smaller projects, I’d recommend using the <code class="language-plaintext highlighter-rouge">QuasiQuoter</code> - it causes fewer problems with GHCi (no need to worry about relative file paths). Once the models file gets big, compilation will become slow, and you’ll want to split it into many files. I <a href="https://twitter.com/mattoflambda/status/1158853267499057152">investigated this slowness to see what the deal was</a>, initially suspecting that the Template Haskell code was slowing things down. What I found was a little surprising: for a 1,200 line <code class="language-plaintext highlighter-rouge">models</code> file, we were spending <a href="https://twitter.com/mattoflambda/status/1158853269432651779">less than a second</a> doing TemplateHaskell. The rest of the module would take several minutes to compile, largely because the generated module was over 390,000 lines of code, and GHC is <a href="https://www.parsonsmatt.org/2019/11/27/keeping_compilation_fast.html">superlinear in compiling large modules</a>. (Note: this issue was fixed in <code class="language-plaintext highlighter-rouge">persistent-template-2.8.0</code>, which resulted in a massive performance improvement by generating dramatically less code! Upgrade!!) Another reason to split it up is to avoid GHCi linker issues. GHCi can exhaust linker ticks (or some other weird finite resource?) when compiling a module, and it will do this when you get more than ~1,000 lines of models (in my experience). 
<h1 id="split-up-approaches">Split Up Approaches</h1> I am aware of two approaches for splitting up the modules - one uses the <code class="language-plaintext highlighter-rouge">QuasiQuoter</code>, and the other uses external files for compilation. We’ll start with external files, as it works best with <code class="language-plaintext highlighter-rouge">persistent</code> migrations and requires the least amount of fallible human error checking. <h2 id="separate-files">Separate Files</h2> I prepared a <a href="https://github.com/parsonsmatt/split-persistent/pull/1/files">GitHub pull request</a> that demonstrates the changes in this section. Follow along for exactly what I did: In the Yesod scaffold, you have a <code class="language-plaintext highlighter-rouge">config/models</code> file which contains all of the entity definitions. We’re going to rename the file to <code class="language-plaintext highlighter-rouge">config/models_backup</code>, and we’re going to create a folder <code class="language-plaintext highlighter-rouge">config/models/</code> where we will put the new entity files. For consistency/convention, we’re going to name the files <code class="language-plaintext highlighter-rouge">${EntityName}.persistentmodels</code>, so we’ll end up with this directory structure: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>config └── models ├── Comment.persistentmodels ├── Email.persistentmodels └── User.persistentmodels </code></pre></div></div> Now, we’re going to create a Haskell file for each models file. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE EmptyDataDecls #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE GADTs #-} {-# LANGUAGE GeneralizedNewtypeDeriving #-} {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE NoImplicitPrelude #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE TemplateHaskell #-} {-# LANGUAGE TypeFamilies #-} module Model.User where import ClassyPrelude.Yesod import Database.Persist.Quasi mkPersistWith [] sqlSettings $(persistFileWith lowerCaseSettings "config/models/User.persistentmodels") </code></pre></div></div> <code class="language-plaintext highlighter-rouge">mkPersistWith</code> is a new function that accepts a <code class="language-plaintext highlighter-rouge">[EntityDef]</code> representing the entity definitions for tables defined outside of the current module. The library needs this so it knows how to generate foreign keys. So far, so good! The contents of the <code class="language-plaintext highlighter-rouge">User.persistentmodels</code> file only has the entity definition for the <code class="language-plaintext highlighter-rouge">User</code> table: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- config/models/User.persistentmodels User ident Text password Text Maybe UniqueUser ident deriving Typeable </code></pre></div></div> Next up, we’ll do <code class="language-plaintext highlighter-rouge">Email</code>, which is defined like this: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Email email Text userId UserId Maybe verkey Text Maybe UniqueEmail email </code></pre></div></div> <code class="language-plaintext highlighter-rouge">Email</code> refers to the <code class="language-plaintext highlighter-rouge">UserId</code> type, which is defined in <code class="language-plaintext highlighter-rouge">Model.User</code>. 
So we need to add that import to the <code class="language-plaintext highlighter-rouge">Model.Email</code> module, and use it with the <code class="language-plaintext highlighter-rouge">mkPersistWith</code> call. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE EmptyDataDecls #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE GADTs #-} {-# LANGUAGE GeneralizedNewtypeDeriving #-} {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE NoImplicitPrelude #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE TemplateHaskell #-} {-# LANGUAGE TypeFamilies #-} module Model.Email where import ClassyPrelude.Yesod import Database.Persist.Quasi import Model.User mkPersistWith [entityDef (Proxy :: Proxy User)] sqlSettings $(persistFileWith lowerCaseSettings "config/models/Email.persistentmodels") </code></pre></div></div> While you can write the <code class="language-plaintext highlighter-rouge">[entityDef ...]</code> list manually, it is considerably easier to write <code class="language-plaintext highlighter-rouge">$(discoverEntities)</code> and splice them in automatically. We need to do the same thing for the <code class="language-plaintext highlighter-rouge">Comment</code> type and module. Now, we have a bunch of modules that are defining our data entities. You may want to reexport them all from the top-level <code class="language-plaintext highlighter-rouge">Model</code> module, or you may choose to have finer grained imports. Either way has advantages and disadvantages. <h3 id="migrations">Migrations</h3> Let’s get those persistent migrations back. If you’re not using <code class="language-plaintext highlighter-rouge">persistent</code> migrations, then you can just skip this bit. 
We’ll define a new module, <code class="language-plaintext highlighter-rouge">Model.Migration</code>, which will load up all the <code class="language-plaintext highlighter-rouge">*.persistentmodels</code> files and make a migration out of them. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE EmptyDataDecls #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE GADTs #-} {-# LANGUAGE GeneralizedNewtypeDeriving #-} {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE NoImplicitPrelude #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE TemplateHaskell #-} {-# LANGUAGE TypeFamilies #-} module Model.Migration where import System.Directory import ClassyPrelude.Yesod import Database.Persist.Quasi mkMigrate "migrateAll" $(do files <- liftIO $ do dirContents <- getDirectoryContents "config/models/" pure $ map ("config/models/" <>) $ filter (".persistentmodels" `isSuffixOf`) dirContents persistManyFileWith lowerCaseSettings files ) </code></pre></div></div> Some tricks here: <ol> <li>You can write <code class="language-plaintext highlighter-rouge">do</code> notation in a TemplateHaskell splice, because <code class="language-plaintext highlighter-rouge">Q</code> is a monad, and a splice only expects that the result have <code class="language-plaintext highlighter-rouge">Q splice</code> where <code class="language-plaintext highlighter-rouge">splice</code> depends on syntactically where it’s going. 
Here, we have <code class="language-plaintext highlighter-rouge">Q Exp</code> because it’s used in an expression context.</li> <li>We do a relatively simple scan - get directory contents for our models, then <code class="language-plaintext highlighter-rouge">filter</code> to the suffix we care about, and then <code class="language-plaintext highlighter-rouge">map</code> the full directory path on there.</li> <li>Finally we call <code class="language-plaintext highlighter-rouge">persistManyFileWith</code>, which takes a list of files and parses it into the <code class="language-plaintext highlighter-rouge">[EntityDef]</code>.</li> </ol> Now we’ve got migrations going, and our files are split up. This speeds up compilation quite a bit. <h2 id="quasiquotes">QuasiQuotes</h2> If you’re not using migrations, this approach has a lot less boilerplate and extra files you have to mess about with. However, the migration story is a little more complicated. Basically, you just put your QuasiQuote blocks in separate Haskell modules, and import the types you need for the references to work out. Easy-peasy! <h3 id="migrations-1">Migrations</h3> In recent versions of <code class="language-plaintext highlighter-rouge">persistent</code>, migrations aren’t generated at compile-time. Instead, they’re created at run-time using the <code class="language-plaintext highlighter-rouge">[EntityDef]</code> provided. So the <code class="language-plaintext highlighter-rouge">QuasiQuote</code> and separate-file-schemes work equally well. Fri, 06 Dec 2019 00:00:00 +0000 https://www.parsonsmatt.org/2019/12/06/splitting_persistent_models.html https://www.parsonsmatt.org/2019/12/06/splitting_persistent_models.html Keeping Compilation Fast You’re a Haskell programmer, which means you complain about compilation times. We typically spend a lot of time waiting for GHC to compile code. To some extent, this is unavoidable - GHC does a tremendous amount of work for us, and we only ever ask it to do more. 
At some point, we shouldn’t be terribly surprised that “doing more work” ends up meaning “taking more time.” However, there are some things we can do to allow GHC to avoid doing more work than necessary. For the most part, these are going to be code organization decisions. In my experience, the following things are true, and should guide organization: <ul> <li>Superlinear: GHC takes more time to compile larger modules than smaller modules.</li> <li>Constant costs: GHC takes a certain amount of start-up time to compile a module</li> <li>Parallelism: GHC can compile modules in parallel (and build tools can typically compile packages in parallel)</li> <li>Caching: GHC can cache modules</li> </ul> So let’s talk about some aspects of project organization and how they can affect compile times. <h1 id="the-projecttypes-megamodule">The <code class="language-plaintext highlighter-rouge">Project.Types</code> Megamodule</h1> You just start on a new project, and you get directed to the God module - <code class="language-plaintext highlighter-rouge">Project.Types</code>. It’s about 4,000 lines long. “All the types are defined in here, it’s great!” However, this is going to cause big problems for your compilation time: <ul> <li>A super large module is going to take way longer to compile</li> <li>Any change to any type requires touching this module, and recompiling everything in it</li> <li>Any change to this module requires recompiling any module that depends on it, which is usually everything</li> </ul> We pretty much can’t take advantage of caching, because GHC doesn’t cache any finer than the module-level. We can’t take advantage of parallelism, as GHC’s parallelism machinery only seems to work at module granularity. Furthermore, we’re tripping this constantly, which is causing GHC to recompile a lot of modules that probably don’t need to be recompiled. 
<h2 id="resolution">Resolution</h2> Factor concepts out of your <code class="language-plaintext highlighter-rouge">Project.Types</code> module. This will require manually untangling the dependency graph, which can be a little un-fun. It’s probably also a good excuse to learn <code class="language-plaintext highlighter-rouge">.hs-boot</code> files for breaking mutual recursion. There’s a small constant cost to compile a module, so you probably shouldn’t define a module for every single type. Group related types into modules. The sweet spot is probably between 50 and 200 lines, but that’s a number I just summoned out of the intuitional aether. This process can be done incrementally. Pick a concept or type from the bottom of your dependency graph, and put it in its own module. You’ll need to import that into <code class="language-plaintext highlighter-rouge">Project.Types</code> - but do not reexport it! Everywhere that complains, add an import of your new module. As you factor more and more modules out, eventually you’ll start dropping the dependency on <code class="language-plaintext highlighter-rouge">Project.Types</code>. Now, as you edit <code class="language-plaintext highlighter-rouge">Project.Types</code>, you won’t have to recompile these modules, and your overall compile-times will improve dramatically. All the types that are pulled out of <code class="language-plaintext highlighter-rouge">Project.Types</code> will be cached, so recompiling <code class="language-plaintext highlighter-rouge">Project.Types</code> itself will become much faster. Before too long, you’ll be minimizing the amount of compilation you have to do, and everything will be happy. <h1 id="package-splitting">Package Splitting</h1> Okay so you think “I know! I’ll make a bunch of packages to separate my logical concerns!” This is probably smart but it comes with some important trade-offs for development velocity and compile-times. 
<h2 id="ghci">GHCi</h2> GHCi is pretty picky about loading specific targets, and what you load is going to determine what it will pick up on a reload. You need to ensure that each target has the same default extensions, dependencies, compiler flags, etc. because all source files will be loaded as though they were in a single project. This is a good reason to either use Cabal or <code class="language-plaintext highlighter-rouge">hpack</code> common stanzas for all of this information, or to use file-specific stuff and avoid using implicit configuration. What’s a “load target”? A target is a part of a package, like a library, a specific test-suite, a specific executable, or a sub-library. In a multi-package Cabal or Stack project, load targets can come from different packages. Another gotcha is that any relative filepaths must resolve based on where you’re going to invoke <code class="language-plaintext highlighter-rouge">{stack,cabal} ghci</code>. Suppose you decide you want to split your web app into two packages: <code class="language-plaintext highlighter-rouge">database</code> and <code class="language-plaintext highlighter-rouge">web</code>, where <code class="language-plaintext highlighter-rouge">database</code> has a file it loads for the model definitions, and <code class="language-plaintext highlighter-rouge">web</code> has a bunch of files it loads for HTML templating. The Template Haskell file-loading libraries pretty much assume that your paths are relative to the directory containing the <code class="language-plaintext highlighter-rouge">.cabal</code> file. When you invoke <code class="language-plaintext highlighter-rouge">stack ghci</code> (or <code class="language-plaintext highlighter-rouge">cabal repl</code>), it puts your CWD in the directory you launch it, and the relative directories there are probably not going to work. Once you’ve created that package boundary, it becomes difficult to operate across it. 
The natural inclination - indeed, the reason why you might break it up - is to allow them to evolve independently. The more they evolve apart, the less easily you can load everything into GHCi. You can certainly load things into GHCi - in the above example, <code class="language-plaintext highlighter-rouge">web</code> depends on <code class="language-plaintext highlighter-rouge">database</code>, and so you can do <code class="language-plaintext highlighter-rouge">stack ghci web</code>, and it’ll compile <code class="language-plaintext highlighter-rouge">database</code> just fine as a library and load <code class="language-plaintext highlighter-rouge">web</code> into GHCi. However, you won’t be able to modify a module in <code class="language-plaintext highlighter-rouge">database</code>, and hit <code class="language-plaintext highlighter-rouge">:reload</code> to perform a minimal recompilation. Instead, you’ll need to kill the GHCi session and reload it from scratch. This takes a lot more time than an incremental recompilation. <h2 id="module-parallelism">Module Parallelism</h2> GHC is pretty good at compiling modules in parallel. It’s also pretty good at compiling packages in parallel. Unfortunately, it can’t see across the package boundary. Suppose your package <code class="language-plaintext highlighter-rouge">hello</code> depends on module <code class="language-plaintext highlighter-rouge">Tiny.Little.Module</code> in the package <code class="language-plaintext highlighter-rouge">the-world</code>, which also contains about a thousand utility modules and Template Haskell splices and derived Generic instances for data types and type family computations and (etc……). 
You’d really want to just start compiling <code class="language-plaintext highlighter-rouge">hello</code> as soon as <code class="language-plaintext highlighter-rouge">Tiny.Little.Module</code> is completely compiled, but you can’t - GHC must compile everything else in the package before it can start on yours. Breaking up your project into multiple packages can cause overall compile-times to go up significantly in this manner. If you do this, it should ideally be to split out a focused library that will need to change relatively rarely while you iterate on the rest of your codebase. I’d beware of breaking things up until absolutely necessary - a package boundary is a heavy tool to merely separate responsibilities. At the day job, we have a package graph that looks like this: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>         +-> C
A -> B --|
         +-> D
</code></pre></div></div> By combining <code class="language-plaintext highlighter-rouge">A</code> and <code class="language-plaintext highlighter-rouge">B</code> into a single package, we sped up compile times for a complete build of the application by 10%. A clean build of the new <code class="language-plaintext highlighter-rouge">AB</code> package was 15% faster all told, and incremental builds improved too. <h2 id="package-parallelism">Package parallelism</h2> The good news is that it is quite easy to cache entire packages, and the common build tools are quite good at compiling packages in parallel. It’s not that big of a deal to depend on <code class="language-plaintext highlighter-rouge">lens</code> anymore, largely because of how good sharing and caching has gotten. So certainly don’t be afraid to split out libraries and push them to GitHub or Hackage, but if you’re not willing to GitHub it, then it should probably stay in the main package. <h1 id="big-ol-instances-module">Big Ol Instances Module</h1> Well, you did it. 
You have a bunch of packages and you don’t want to merge them together. Then you defined a bunch of types in <code class="language-plaintext highlighter-rouge">foo</code>, and then defined a type class in <code class="language-plaintext highlighter-rouge">bar</code>. <code class="language-plaintext highlighter-rouge">bar</code> depends on <code class="language-plaintext highlighter-rouge">foo</code>, so you can’t put the instances with the type definitions, and you’re a Good Haskeller so you want to avoid orphan instances, which means you need to put all the instances in the same module. Except - you know how you had a 4,000 line types module, which was then split up into dozens of smaller modules? Now you have to import all of those, and you’ve got a big 3,000 line class/instance module. All the same problems apply - you’ve got a bottleneck in compilation, and any touch to any type causes this big module to get recompiled, which in turn causes everything that depends on the class to be recompiled. A solution is to ensure that all your type classes are defined above the types in the module graph. This is easiest to do if you have only a single package. But you may not be able to do that easily, so here’s a solution: <h2 id="hidden-orphans">Hidden Orphans</h2> The real problem is that you want to refer to the class and operations without incurring the wrath of the dependency graph. You can do this with orphan instances. Define each instance in its own module and import them into the module that defines the class. Don’t expose the orphan modules - you really want to ensure that you don’t run into the practical downsides of orphans while allowing recompilation and caching. 
You’ll start with a module like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module MyClass where

import Types.Foo
import Types.Bar
import Types.Baz

class C a

instance C Foo
instance C Bar
instance C Baz
</code></pre></div></div> where a change to any <code class="language-plaintext highlighter-rouge">Types</code> module requires a recompilation of the entirety of the <code class="language-plaintext highlighter-rouge">MyClass</code> module. You’ll create an internal module for the class (and any helpers etc), then a module for each type/instance: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module MyClass.Class where

class C a


module MyClass.Foo where

import MyClass.Class
import Types.Foo

instance C Foo


module MyClass.Bar where

import MyClass.Class
import Types.Bar

instance C Bar


module MyClass.Baz where

import MyClass.Class
import Types.Baz

instance C Baz


module MyClass (module X) where

import MyClass.Class as X
import MyClass.Foo as X
import MyClass.Bar as X
import MyClass.Baz as X
</code></pre></div></div> So what happens when we touch <code class="language-plaintext highlighter-rouge">Types.Foo</code>? With the old layout, it’d trigger a recompile of <code class="language-plaintext highlighter-rouge">MyClass</code>, which would have to start entirely over and recompile everything. With the new layout, it triggers a recompile of <code class="language-plaintext highlighter-rouge">MyClass.Foo</code>, which is presumably much smaller. Then, we do need to recompile <code class="language-plaintext highlighter-rouge">MyClass</code>, but because all the rest of the modules are untouched, they can be reused and cached, and compiling the entire module is much faster. This is a bit nasty, but it can break up a module bottleneck quite nicely, and if you’re careful to only use the MyClass interface, you’ll be safe from the dangers of orphan instances. 
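One way to keep the orphan modules hidden, assuming the class lives in a Cabal package, is to list only the facade module under <code class="language-plaintext highlighter-rouge">exposed-modules</code> and everything else under <code class="language-plaintext highlighter-rouge">other-modules</code>. The module names come from the example above; the <code class="language-plaintext highlighter-rouge">build-depends</code> entry is an assumption based on the <code class="language-plaintext highlighter-rouge">foo</code>/<code class="language-plaintext highlighter-rouge">bar</code> scenario, and the stanza is a sketch, not a complete package description:

```cabal
library
  -- Only the facade module can be imported by other packages.
  exposed-modules:
    MyClass
  -- Compiled and cached individually, but not importable from outside.
  other-modules:
    MyClass.Class
    MyClass.Foo
    MyClass.Bar
    MyClass.Baz
  -- Assumption: the Types.* modules live in the foo package.
  build-depends:
    base,
    foo
```

Anything that imports <code class="language-plaintext highlighter-rouge">MyClass</code> still sees every instance, because instances come along with the modules that <code class="language-plaintext highlighter-rouge">MyClass</code> imports. What downstream code cannot do is import <code class="language-plaintext highlighter-rouge">MyClass.Class</code> on its own and end up without them.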
<h1 id="some-random-parting-thoughts">Some random parting thoughts</h1> <ul> <li>Don’t do more work than you need to. Derived type class instances are work that GHC must redo every time the module is compiled.</li> <li>Keep the module graph broad and shallow.</li> <li>TemplateHaskell isn’t that bad for compile times. <ul> <li>You pay a 200-500ms hit to fire up the interpreter at all, but from there, most TH code is quite fast - running the TH code to parse and generate models from 1,500 lines of <code class="language-plaintext highlighter-rouge">persistent</code> quasiquoter takes about 50ms.</li> <li>The slow part is compiling the resulting code - those 1,500 lines of model definitions expanded out to something like 200kloc (note that this is no longer true as of <code class="language-plaintext highlighter-rouge">persistent-template-2.8.0</code>, which dramatically speeds up compile-times by generating less code)</li> <li>The solution is to split up the module, following the tips in <a href="https://www.parsonsmatt.org/2019/12/06/splitting_persistent_models.html">this post</a>!</li> </ul> </li> <li>The following command speeds up compilation significantly, especially after exposing all those parallelism opportunities: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> stack build --fast --file-watch --ghc-options "-j4 +RTS -A128m -n2m -RTS" </code></pre></div> </div> These flags give GHC 4 threads to work with (more didn’t help on my 8 core computer), and <code class="language-plaintext highlighter-rouge">-A128m</code> gives it more memory before it does GC. (EDIT: this post used to recommend <code class="language-plaintext highlighter-rouge">-qg</code> which turns off the parallel garbage collector. <a href="https://twitter.com/MaxTagher">@MaxTagher</a> did the brave work of actually verifying this, and it was indeed slowing things down! 
Removing this flag dropped a full build from 3:28 to 3:03, a huge improvement.) Thanks to <a href="https://www.reddit.com/r/haskell/comments/e2l1yj/keeping_compilation_fast/f8wt34p/">/u/dukerutledge</a> for pointing out <code class="language-plaintext highlighter-rouge">-n2m</code>, which I don’t understand, but it helped! </li> <li>Try to keep things <code class="language-plaintext highlighter-rouge">ghci</code> friendly as much as possible. <code class="language-plaintext highlighter-rouge">:reload</code> is usually the fastest way to test stuff out, and REPL-friendly code is test-friendly too!</li> </ul> <h1 id="results">Results</h1> (this section was added on 2020-01-30) Not sure if these tips are legit? Here’s an experience report. Before we implemented these changes at work, a full build of the repository code (56kloc) took 9:55. Today, a full build of the repository (with GHC flags) is down to 3:40, and we have 70kloc. Much of that - about 20% - comes from <code class="language-plaintext highlighter-rouge">persistent-template-2.8.0</code>’s performance improvements. The rest is module splitting and exposing parallelism. Wed, 27 Nov 2019 00:00:00 +0000 https://www.parsonsmatt.org/2019/11/27/keeping_compilation_fast.html https://www.parsonsmatt.org/2019/11/27/keeping_compilation_fast.html Why 'Functor' Doesn't Matter Alternative, less click-baity title: Names Do Not Transmit Meaning People often complain about the names for concepts that are commonly used in Functional Programming, especially Haskell. Functor, monoid, monad, foldable, traversable, arrow, optics, etc. They’re weird words! <code class="language-plaintext highlighter-rouge">Functor</code> comes from category theory, <code class="language-plaintext highlighter-rouge">Monoid</code> comes from abstract algebra. <code class="language-plaintext highlighter-rouge">Arrow</code> comes from – well it’s just kind of made up! 
<code class="language-plaintext highlighter-rouge">Optics</code>, lenses, prisms, etc are all somewhat strange metaphors for what’s going on. What’s the deal? Why can’t they just pick simple names that mean what they are? Why can’t they use practical and ordinary terms, the way that Object Oriented Programming does? Some people strengthen their complaint with moral urgency: <code class="language-plaintext highlighter-rouge">Functor</code> is a confusing word, <ul> <li>and it makes it more difficult for people to learn Haskell,</li> <li>and this makes Haskell an elitist, non-inclusive language,</li> <li>and if we just used “my favorite term” instead, it wouldn’t be a problem!</li> </ul> So, let’s talk about names. What are they? What do they do? How do they matter, and why? <h1 id="whats-in-a-name">What’s in a name?</h1> My name is Matthew Parsons. If you google “Matthew Parsons”, you’ll see a bunch of footballers, a doctor, a professor, a radio personality. My blog is the last entry on the second page of Google results for my name, which I’m pretty proud of. My name isn’t globally unique - there are a lot of Matt Parsons running around. Many of them think that they have my email, and sign me up for all kinds of silly stuff (and some serious stuff, too). If you further qualify the name - by appending ‘Haskell’ to the search query - then you get a ton of stuff that points to me. I appear to be the most prominent Haskell programmer named Matt Parsons (for now). If I’m in a group of folks, and you say the word “Matt,” I’m going to assume you’re trying to get my attention. Unless there’s another Matt in the group, at which point you’ll probably say “Matt Parsons” or similar to disambiguate. I’m about to go on a bikepacking trip with two other Matts. I suspect my first name will be dropped entirely on this trip. What does my name tell you about me? Almost nothing. I’m an English speaking male, probably of British descent. 
But it doesn’t tell you that I like kittens, that I like Haskell, that I like to ride bikes, or that I dislike the color red. It’s merely an imperfect, globally duplicated pointer. <h1 id="whats-a-name-good-for">What’s a name good for?</h1> It’s a pointer with an ambiguous address space. It’s a key in a nondeterministic map. It’s a database ID column with an index, but not a unique index. They’re bad! They don’t scale, at all. I think of a concept, I say a word, and hopefully this points to the same concept in your brain. We use names as shortcuts for communicating common concepts. If we need to learn more about a concept, we can look up the name and try to find relationships to other names. Names can’t transmit meaning. They just point to concepts. Concepts must be explained and understood, usually in terms of a large quantity of simpler or more familiar names. If we want a name to fully describe the concept it points to, then it must be a very simple concept indeed. <h1 id="how-can-we-judge-a-name">How can we judge a name?</h1> Names can’t transmit meaning, and so a name shouldn’t be judged on how well it transmits meaning. That doesn’t mean that names can’t be judged at all - there are good and bad aspects to names. <ul> <li>How reliably does it point to the right concept?</li> <li>How memorable is it?</li> <li>How easy is it to pronounce (for the language it originated in)?</li> <li>How aesthetically appealing is it?</li> </ul> The last three are pretty subjective - I used to find it difficult to remember the difference between <code class="language-plaintext highlighter-rouge">Monoid</code> and <code class="language-plaintext highlighter-rouge">Monad</code> because they both have the form <code class="language-plaintext highlighter-rouge">mon*d</code>, and because people kept saying things like “Monads are monoids for functors.” I find a word like ‘illuminate’ pretty and ‘buzzfeed’ gross, which is totally just because I am weird and have opinions about this. 
Reliability is also subjective. It all depends on context and familiarity. It’s essentially impossible to have fully unique names for ideas - even if you pick something totally novel, someone else can come along and use your unique name for a totally different concept. <code class="language-plaintext highlighter-rouge">Functor</code> gets picked on a lot, so let’s look at that. The word <a href="https://en.wiktionary.org/wiki/functor">functor</a> has three meanings, one in linguistics, one in object oriented programming, and one in category theory. This is pretty good - only three collisions, and it is usually pretty clear what you mean based on context clues. The best (but still extremely bad) alternative name to <code class="language-plaintext highlighter-rouge">Functor</code> is <code class="language-plaintext highlighter-rouge">Mappable</code>. It’s the best alternative because it is the least misleading - I’ve seen people suggest <code class="language-plaintext highlighter-rouge">Iterable</code>, <code class="language-plaintext highlighter-rouge">Streaming</code>, <code class="language-plaintext highlighter-rouge">Container</code>, <code class="language-plaintext highlighter-rouge">Lift</code>, and they’re all dramatically more misleading. The <a href="https://en.wiktionary.org/wiki/mappable">Wiktionary page</a> relates it to maps, suggesting that it means you “can make a geographical map of a thing,” or that you can construct a “mapping” between two sets of things. Grammatically, it implies that you can use a verb “map” over the thing. So let’s look at the <a href="https://en.wiktionary.org/wiki/map">Wiktionary entry for ‘map’</a>. At a first glance, it’s a way bigger page than <code class="language-plaintext highlighter-rouge">Functor</code>. There’s a common understanding: geographic maps, like you use to navigate a new city. 
The next most common understanding is more abstract: <blockquote> A graphical representation of the relationships between objects, components or themes. </blockquote> Third definition is from math, and makes it a synonym for ‘function’. But that’s not quite right - after all, <code class="language-plaintext highlighter-rouge">Mappable x</code> implies “I can map x,” and you can apply a function to any value in Haskell. So that doesn’t really give any additional clarity. The other meanings are completely out-of-bounds, and it’s unlikely that someone would be confused. This name, <code class="language-plaintext highlighter-rouge">Mappable</code>, points to a bunch of potential meanings already, and none of them are really right. So we can create a new meaning that <code class="language-plaintext highlighter-rouge">Mappable</code> points to, and hope that people infer the right one. But we still have to explain what a <code class="language-plaintext highlighter-rouge">Mappable</code> is. “What’s a <code class="language-plaintext highlighter-rouge">Mappable</code>? Well, it’s something you can <code class="language-plaintext highlighter-rouge">map</code> over!” is a terrible explanation! It’s literally just the grammatic expansion of the word. All it does is move the question one bit further: <blockquote> What does it mean to be able to <code class="language-plaintext highlighter-rouge">map</code> over something? </blockquote> Wait, “map over”? This is unfamiliar terminology. I know about “maps” like Google Maps. I know that I can ‘map’ a space out and provide information about how to get from here to there. You might think that because you already learned <code class="language-plaintext highlighter-rouge">map</code> from JavaScript or Python that it’s a good enough name. But that’s an argument from familiarity. Rubyists and Smalltalkers are more familiar with the name <code class="language-plaintext highlighter-rouge">collect</code> for this operation. 
If they want to call it <code class="language-plaintext highlighter-rouge">Collectable</code>, then who are we to stop them? Fact is, “mapping” isn’t an easy concept, no matter what you call it. We could call it “florbing” and it would require the exact same amount of instruction and understanding for the concept to work out. Worse yet, a <code class="language-plaintext highlighter-rouge">Mappable</code> is a more permissive concept than a <code class="language-plaintext highlighter-rouge">Functor</code>. There are things that are <code class="language-plaintext highlighter-rouge">Mappable</code> that are not a <code class="language-plaintext highlighter-rouge">Functor</code>, because a <code class="language-plaintext highlighter-rouge">Functor</code> is structure preserving. This means that the following laws must hold: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>composition: fmap f . fmap g = fmap (f . g) identity: fmap id = id </code></pre></div></div> A <code class="language-plaintext highlighter-rouge">Set</code> datatype (collection of unique objects) is not a functor, because it is possible for a choice of <code class="language-plaintext highlighter-rouge">f</code> and <code class="language-plaintext highlighter-rouge">g</code> to violate the composition law. Likewise, a datatype <code class="language-plaintext highlighter-rouge">Counter</code> that keeps track of how many times you call <code class="language-plaintext highlighter-rouge">fmap</code> on it is not a <code class="language-plaintext highlighter-rouge">Functor</code> because <code class="language-plaintext highlighter-rouge">fmap id</code> would not be equal to <code class="language-plaintext highlighter-rouge">id</code>. These laws are important, because they allow us to perform refactoring and simplify the possibilities when thinking about code. 
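To make the <code class="language-plaintext highlighter-rouge">Set</code> claim concrete, here is a minimal sketch of one way the composition law can break. It assumes a made-up <code class="language-plaintext highlighter-rouge">Mod3</code> wrapper (not from any library) whose <code class="language-plaintext highlighter-rouge">Eq</code> and <code class="language-plaintext highlighter-rouge">Ord</code> instances only compare values modulo 3, and it uses <code class="language-plaintext highlighter-rouge">Set.map</code> from the <code class="language-plaintext highlighter-rouge">containers</code> package as the stand-in for <code class="language-plaintext highlighter-rouge">fmap</code>:

```haskell
import qualified Data.Set as Set

-- Hypothetical wrapper for illustration: equality and ordering only look at
-- the wrapped value modulo 3, so Mod3 0 and Mod3 3 count as "the same".
newtype Mod3 = Mod3 Int

unMod3 :: Mod3 -> Int
unMod3 (Mod3 n) = n

instance Eq Mod3 where
  Mod3 a == Mod3 b = a `mod` 3 == b `mod` 3

instance Ord Mod3 where
  compare (Mod3 a) (Mod3 b) = compare (a `mod` 3) (b `mod` 3)

input :: Set.Set Int
input = Set.fromList [0, 3]

-- Map in two steps: the intermediate Set Mod3 merges 0 and 3 into one element.
twoSteps :: Set.Set Int
twoSteps = Set.map unMod3 (Set.map Mod3 input)

-- Map once with the composed function: unMod3 . Mod3 is the identity on Int.
oneStep :: Set.Set Int
oneStep = Set.map (unMod3 . Mod3) input
```

Going through <code class="language-plaintext highlighter-rouge">Mod3</code> collapses 0 and 3 into a single element, so <code class="language-plaintext highlighter-rouge">twoSteps</code> ends up with one element while <code class="language-plaintext highlighter-rouge">oneStep</code> keeps both. Mapping <code class="language-plaintext highlighter-rouge">f</code> after mapping <code class="language-plaintext highlighter-rouge">g</code> is not the same as mapping <code class="language-plaintext highlighter-rouge">f . g</code> here.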
<code class="language-plaintext highlighter-rouge">Mappable</code> is an OK concept, but it ain’t a <code class="language-plaintext highlighter-rouge">Functor</code>, and there’s no way I’m trading the name for <code class="language-plaintext highlighter-rouge">StructurePreservingMappable</code>. <h1 id="so-what-makes-a-name-bad">So what makes a name bad?</h1> Names can’t transmit meaning. They can transmit a pointer, though, which might point to some meaning. If that meaning isn’t the right meaning, then the recipient will misunderstand. Misunderstandings like this can be difficult to track down, because our brains don’t give us a type error with a line and column number to look at. Instead, we just feel confused, and we have to dig through our concept graph to figure out what’s missing or wrong. Object Oriented Programming is littered with terrible names, precisely because they mislead and cause a false familiarity. Object, Class, Visitor, Factory, Command, Strategy, Interface, Adapter, Bridge, Composite. All of these are common English words with a relatively familiar understanding to them. And all of them are misleading. “Object” is possibly the most reasonable name choice - the English word ‘object’ just refers to any random physical thing, or grammatically speaking, the target of a subject - something we act upon. After that, it’s misleading names causing confusion. What’s a class? It’s a blueprint for objects! Why not call it an <code class="language-plaintext highlighter-rouge">ObjectBlueprint</code>? Uhhh… And how does that relate to static class variables, and other attributes of classes? What’s the visitor pattern? What does it mean for my code to “visit” another piece of code? You need to understand the abstract meaning of these words before “visitor pattern” means anything to you. What’s the difference between an interface, a bridge, and an adapter? 
These three terms are all idioms for the same sort of concept in English, but they have rather different precise meanings in programming. <code class="language-plaintext highlighter-rouge">Monad</code> - now that’s a name that didn’t mean anything to me when I first read it. I read it, and I immediately knew that I didn’t understand the underlying concept. At the time, I was so tired of reading familiar words, assuming they meant something that I understood, and stepping on abstract landmines that betrayed my lack of understanding. <code class="language-plaintext highlighter-rouge">Monad</code> was a breath of fresh air. A new concept, and a new name to go with it! Concepts are hard. Names don’t make them any easier or harder to understand. Names are only useful in their value as pointers, and to establish relationships between concepts. <code class="language-plaintext highlighter-rouge">Functor</code> is hard to learn. It is not hard to learn because it is named <code class="language-plaintext highlighter-rouge">Functor</code>. If you renamed it to anything else, you’d have just as hard of a time, and you’d be cutting off your student from all of the resources and information currently using the word “functor” to refer to that concept. Fri, 30 Aug 2019 00:00:00 +0000 https://www.parsonsmatt.org/2019/08/30/why_functor_doesnt_matter.html https://www.parsonsmatt.org/2019/08/30/why_functor_doesnt_matter.html Extending the Persistent QuasiQuoter Haskell’s <code class="language-plaintext highlighter-rouge">persistent</code> database library is convenient and flexible. 
The recommended way to define your database entities is the QuasiQuoter syntax, and a complete module that defines some typical entities looks like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- src/Models.hs {-# LANGUAGE DeriveGeneric #-} {-# LANGUAGE EmptyDataDecls #-} {-# LANGUAGE FlexibleContexts #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE GADTs #-} {-# LANGUAGE GeneralizedNewtypeDeriving #-} {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE QuasiQuotes #-} {-# LANGUAGE RecordWildCards #-} {-# LANGUAGE TemplateHaskell #-} {-# LANGUAGE TypeFamilies #-} module Models where import Database.Persist.TH import Data.Text (Text) share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase| User json name Text email Text age Int deriving Show Eq |] </code></pre></div></div> The QuasiQuoter does a ton of stuff for you. In this post, we’re going to learn how to make it work for you! <h1 id="sharing-is-caring">Sharing is Caring</h1> Let’s look at the <a href="https://www.stackage.org/haddock/lts-13.14/persistent-template-2.5.4/Database-Persist-TH.html#v:share"><code class="language-plaintext highlighter-rouge">share</code> function</a>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>share :: [[EntityDef] -> Q [Dec]] -> [EntityDef] -> Q [Dec] share fs x = fmap mconcat $ mapM ($ x) fs </code></pre></div></div> It takes a list of functions <code class="language-plaintext highlighter-rouge">[EntityDef] -> Q [Dec]</code> and then runs all of them over the <code class="language-plaintext highlighter-rouge">[EntityDef]</code> that is provided, and finally joins all the <code class="language-plaintext highlighter-rouge">[Dec]</code> together. So, if we want to make the QQ work for us, we need to write a function with that type and add it to our list. 
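Here is a small sketch of that behaviour that runs without <code class="language-plaintext highlighter-rouge">persistent</code>. <code class="language-plaintext highlighter-rouge">shareLike</code> is a hypothetical copy of <code class="language-plaintext highlighter-rouge">share</code> with the concrete types swapped for type variables, and the two generator functions are invented stand-ins that produce strings in <code class="language-plaintext highlighter-rouge">Maybe</code> instead of declarations in <code class="language-plaintext highlighter-rouge">Q</code>:

```haskell
-- Same body as 'share', but with [EntityDef], Q and Dec replaced by type
-- variables. This is an illustration, not the library's definition.
shareLike :: Monad m => [a -> m [b]] -> a -> m [b]
shareLike fs x = fmap mconcat $ mapM ($ x) fs

-- Invented stand-ins for functions like 'mkPersist sqlSettings':
-- each one turns the same list of entity names into some "declarations".
dataDecls :: [String] -> Maybe [String]
dataDecls = Just . map ("data " ++)

showInstances :: [String] -> Maybe [String]
showInstances = Just . map ("instance Show " ++)

-- Every function sees the full input; the results are concatenated in order.
generated :: Maybe [String]
generated = shareLike [dataDecls, showInstances] ["User", "Post"]
```

Each function in the list receives the whole input, and the outputs are joined in the order the functions were listed. That is why adding one more function to the list is enough to hook into the code generation.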
Let’s start with a problem: one of the instances that are generated for the <code class="language-plaintext highlighter-rouge">User</code> table is <code class="language-plaintext highlighter-rouge">PersistEntity</code>. <code class="language-plaintext highlighter-rouge">PersistEntity</code> has an associated data type, called <code class="language-plaintext highlighter-rouge">EntityField</code>. It’s a sum type which contains all of the fields for the <code class="language-plaintext highlighter-rouge">User</code> type, and it’s a GADT that tells you what the type of the field is. If we were to write that part of the <code class="language-plaintext highlighter-rouge">PersistEntity</code> instance by hand, it would look like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance PersistEntity User where data EntityField User fieldType where UserName :: EntityField User Text UserEmail :: EntityField User Text UserAge :: EntityField User Int </code></pre></div></div> There are a lot of functions that use the <code class="language-plaintext highlighter-rouge">EntityField</code> type when doing querying. This type has no instances defined for it! And you may want to do something interesting with these field types that hasn’t been considered. Let’s say we need to have <code class="language-plaintext highlighter-rouge">Show</code> instances for our record fields. 
We can derive them using the <code class="language-plaintext highlighter-rouge">StandaloneDeriving</code> language extension, so this works: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE DeriveGeneric #-} {-# LANGUAGE EmptyDataDecls #-} {-# LANGUAGE FlexibleContexts #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE GADTs #-} {-# LANGUAGE GeneralizedNewtypeDeriving #-} {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE QuasiQuotes #-} {-# LANGUAGE RecordWildCards #-} {-# LANGUAGE TemplateHaskell #-} {-# LANGUAGE TypeFamilies #-} {-# LANGUAGE StandaloneDeriving #-} module Models where import Database.Persist.TH import Data.Text (Text) share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase| User json name Text email Text age Int deriving Show Eq |] deriving instance Show (EntityField User field) </code></pre></div></div> The last line is our <code class="language-plaintext highlighter-rouge">StandaloneDeriving</code> instance. This works! However, it’s a bit annoying to write out for every single record in a larger schema. Let’s write a function that will do this for us automatically. <h1 id="template-rascal">Template Rascal</h1> Let’s first review the type of the function we pass to <code class="language-plaintext highlighter-rouge">share</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[EntityDef] -> Q [Dec] </code></pre></div></div> The input type is a list of the entity definitions. This type (<a href="https://www.stackage.org/haddock/lts-13.14/persistent-2.9.1/Database-Persist-Types.html#t:EntityDef"><code class="language-plaintext highlighter-rouge">EntityDef</code></a>) comes from the <code class="language-plaintext highlighter-rouge">persistent</code> package, and has a ton of information about the entities. 
The type <code class="language-plaintext highlighter-rouge">Q</code> comes from the <a href="https://hackage.haskell.org/package/template-haskell-2.14.0.0/docs/Language-Haskell-TH.html#t:Q"><code class="language-plaintext highlighter-rouge">template-haskell</code></a> package, as does <a href="https://hackage.haskell.org/package/template-haskell-2.14.0.0/docs/Language-Haskell-TH.html#t:Dec"><code class="language-plaintext highlighter-rouge">Dec</code></a>. This blog post isn’t going to elaborate too much on <code class="language-plaintext highlighter-rouge">TemplateHaskell</code> - if you’d like a beginner-friendly tutorial, see <a href="/2015/11/15/template_haskell.html">Template Haskell Is Not Scary</a>. We will begin by creating the skeleton for the function: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>deriveShowFields :: [EntityDef] -> Q [Dec]
deriveShowFields entities =
    undefined
</code></pre></div></div> We know we’re going to iterate over all of them, so let’s use <code class="language-plaintext highlighter-rouge">forM</code> - <code class="language-plaintext highlighter-rouge">Q</code> has a <code class="language-plaintext highlighter-rouge">Monad</code> instance. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>deriveShowFields :: [EntityDef] -> Q [Dec]
deriveShowFields entities =
    forM entities $ \entity ->
        undefined
</code></pre></div></div> We need to replace <code class="language-plaintext highlighter-rouge">undefined</code> with an expression of type <code class="language-plaintext highlighter-rouge">Q Dec</code>. We could attempt to construct the <code class="language-plaintext highlighter-rouge">Dec</code> value directly using data constructors. However, it will be a bit more straightforward to use a QuasiQuoter. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>deriveShowFields :: [EntityDef] -> Q [Dec]
deriveShowFields entities =
    forM entities $ \entity -> do
        let name = undefined
        [d|deriving instance Show (EntityField $(name) field)|]
</code></pre></div></div> This fails with a type error. The <code class="language-plaintext highlighter-rouge">[d| ... |]</code> quasiquoter returns a value of type <code class="language-plaintext highlighter-rouge">Q [Dec]</code>. That means that <code class="language-plaintext highlighter-rouge">forM entities ...</code> will return <code class="language-plaintext highlighter-rouge">Q [[Dec]]</code>. So we just need to flatten it: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>deriveShowFields :: [EntityDef] -> Q [Dec]
deriveShowFields entities =
    fmap join . forM entities $ \entity -> do
        let name = undefined
        [d|deriving instance Show (EntityField $(name) field)|]
</code></pre></div></div> Alright, now we need to get a <code class="language-plaintext highlighter-rouge">name</code> that fits in that splice. What’s the type of that splice? I’m going to throw a <code class="language-plaintext highlighter-rouge">()</code> in there and see what GHC complains about. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>deriveShowFields :: [EntityDef] -> Q [Dec]
deriveShowFields entities =
    fmap join . forM entities $ \entity -> do
        let name = ()
        [d|deriving instance Show (EntityField $(name) field)|]
</code></pre></div></div> This gives us an error: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>• Couldn't match type ‘()’ with ‘Q Type’
  Expected type: TypeQ
    Actual type: ()
• In the expression: name
  In a stmt of a 'do' block:
    [d| deriving instance Show (EntityField $(name) x) |]
    pending(rn) [<splice, name>]
  In the expression:
    do let name = ()
       [d| deriving instance Show (EntityField $(name) x) |]
       pending(rn) [<splice, name>]
    |
... |         [d|deriving instance Show (EntityField $(name) x)|]
    |                                                  ^^^^
</code></pre></div></div> Cool! We need something of type <code class="language-plaintext highlighter-rouge">Q Type</code>. <code class="language-plaintext highlighter-rouge">Type</code>, like <code class="language-plaintext highlighter-rouge">Dec</code>, comes from the <a href="https://hackage.haskell.org/package/template-haskell-2.14.0.0/docs/Language-Haskell-TH.html#t:Type"><code class="language-plaintext highlighter-rouge">template-haskell</code></a> package. So, we have an <code class="language-plaintext highlighter-rouge">entity :: EntityDef</code>, and we need a <code class="language-plaintext highlighter-rouge">name :: Q Type</code>. The name is the name of the entity. If we look at the <a href="https://www.stackage.org/haddock/lts-13.14/persistent-2.9.1/Database-Persist-Types.html#t:EntityDef">fields of <code class="language-plaintext highlighter-rouge">EntityDef</code></a> again, we’ll see that the first field is <code class="language-plaintext highlighter-rouge">entityHaskell :: HaskellName</code>. That is promising. 
We can use another <a href="https://www.stackage.org/haddock/lts-13.14/persistent-2.9.1/Database-Persist-Class.html#t:PersistEntity"><code class="language-plaintext highlighter-rouge">PersistEntity</code></a> class function, <code class="language-plaintext highlighter-rouge">entityDef :: (Monad m) => m rec -> EntityDef</code>, to summon an <code class="language-plaintext highlighter-rouge">EntityDef</code> in GHCi and see what we get. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>>>> entityHaskell $ entityDef (Nothing :: Maybe User) HaskellName {unHaskellName = "User"} </code></pre></div></div> What’s inside a <code class="language-plaintext highlighter-rouge">HaskellName</code>? Let’s find out! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>>>> :info HaskellName newtype HaskellName = HaskellName {unHaskellName :: Data.Text.Internal.Text} -- Defined in ‘persistent-2.8.2:Database.Persist.Types.Base’ </code></pre></div></div> So, we have a <code class="language-plaintext highlighter-rouge">Text</code> representation of the Haskell record name. And we know we need a <code class="language-plaintext highlighter-rouge">Type</code> that refers to this name. If we look at <a href="https://hackage.haskell.org/package/template-haskell-2.14.0.0/docs/Language-Haskell-TH.html#t:Type">the data constructors for <code class="language-plaintext highlighter-rouge">Type</code></a>, we’ll notice that <code class="language-plaintext highlighter-rouge">ConT</code> appears to be what we want. So now we need a <code class="language-plaintext highlighter-rouge">Name</code> to give to <code class="language-plaintext highlighter-rouge">ConT</code>. What is a <a href="https://hackage.haskell.org/package/template-haskell-2.14.0.0/docs/Language-Haskell-TH.html#t:Name"><code class="language-plaintext highlighter-rouge">Name</code></a>? 
The linked docs say that it’s an abstract type for the Haskell value names. They also give us a way of creating one: <a href="https://hackage.haskell.org/package/template-haskell-2.14.0.0/docs/Language-Haskell-TH.html#v:mkName"><code class="language-plaintext highlighter-rouge">mkName :: String -> Name</code></a>. The last building block is <code class="language-plaintext highlighter-rouge">Data.Text.unpack :: Text -> String</code>. Now, let’s plug our legos together: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>deriveShowFields :: [EntityDef] -> Q [Dec]
deriveShowFields entities =
    fmap join . forM entities $ \entity -> do
        let name = pure . ConT . mkName . Text.unpack . unHaskellName . entityHaskell $ entity
        [d|deriving instance Show (EntityField $(name) field)|]
</code></pre></div></div> Bingo! Let’s pass this to <code class="language-plaintext highlighter-rouge">share</code> in our model file. Note that <code class="language-plaintext highlighter-rouge">deriveShowFields</code> has to be defined in a different module and imported, due to the Template Haskell stage restriction. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>share
    [ deriveShowFields
    , mkPersist sqlSettings
    , mkMigrate "migrateAll"
    ] [persistLowerCase|
User json
    name Text
    email Text
    age Int
    deriving Show Eq
|]
</code></pre></div></div> And let’s try it in GHCi: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>>>> show UserEmail
"UserEmail"
>>> show UserName
"UserName"
</code></pre></div></div> Nice. We’ve hooked into <code class="language-plaintext highlighter-rouge">persistent</code>’s QuasiQuoter and provided our own functionality. 
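The <code class="language-plaintext highlighter-rouge">ConT . mkName</code> part of that pipeline can be tried on its own, without <code class="language-plaintext highlighter-rouge">persistent</code>. This is a rough sketch: <code class="language-plaintext highlighter-rouge">typeFromName</code> and <code class="language-plaintext highlighter-rouge">inspectUser</code> are names I made up for illustration, and it takes a plain <code class="language-plaintext highlighter-rouge">String</code> instead of unwrapping a <code class="language-plaintext highlighter-rouge">HaskellName</code>:

```haskell
import Language.Haskell.TH (Q, Type (ConT), mkName, runQ)

-- Build the 'Q Type' that refers to a type constructor with the given name.
-- (Hypothetical helper; the post inlines this as part of 'name'.)
typeFromName :: String -> Q Type
typeFromName = pure . ConT . mkName

-- 'runQ' can run a 'Q' action in IO as long as the action does not need
-- compiler state (no reification), which is the case here.
inspectUser :: IO Type
inspectUser = runQ (typeFromName "User")
```

Running <code class="language-plaintext highlighter-rouge">inspectUser</code> in GHCi is a quick way to look at the <code class="language-plaintext highlighter-rouge">Type</code> value that ends up in the splice before wiring it into <code class="language-plaintext highlighter-rouge">share</code>.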
Sat, 30 Mar 2019 00:00:00 +0000 https://www.parsonsmatt.org/2019/03/30/extending_the_persistent_quasiquoter.html https://www.parsonsmatt.org/2019/03/30/extending_the_persistent_quasiquoter.html Return a Function to Avoid Effects To help write robust, reliable, and easy-to-test software, I always recommend purifying your code of effects. There are a bunch of tricks and techniques to accomplish this sort of thing, and I’m going to share one of my favorites. I have implemented a pure data pipeline that imports records from one database and puts them in another database with a slightly different schema. Rather than implement all of the logic for saving each entity individually, I’ve created a function <code class="language-plaintext highlighter-rouge">migrate</code> that is abstract. The heart of this pipeline is a set of type classes: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# language FunctionalDependencies, ScopedTypeVariables, TypeApplications, AllowAmbiguousTypes #-} class LoadData i where load :: IO [i] class ConvertData i o | i -> o where convert :: i -> o class SaveData o where save :: [o] -> IO () migrate :: forall i o . (LoadData i, ConvertData i o, SaveData o) => IO () migrate = do old <- load @i save (map convert old) </code></pre></div></div> All of the business logic is in the <code class="language-plaintext highlighter-rouge">ConvertData</code> class. The <code class="language-plaintext highlighter-rouge">LoadData</code> and <code class="language-plaintext highlighter-rouge">SaveData</code> do typically boring things. A new requirement came in: we are going to import a new record, and we must generate new UUIDs for it. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Old = Old Int Text data New = New UUID Int Text </code></pre></div></div> These <code class="language-plaintext highlighter-rouge">UUID</code>s must be randomly generated. 
The logical place to generate the <code class="language-plaintext highlighter-rouge">UUID</code> is in the <code class="language-plaintext highlighter-rouge">ConvertData</code> part of the pipeline. However, this would require adding <code class="language-plaintext highlighter-rouge">IO</code> to the method signature, which would make testing and verifying this code more difficult. Instead, we are going to create a new type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype NeedsUUID = NeedsUUID
    { giveUUID :: UUID -> New
    }
</code></pre></div></div> Now, our conversion function instances will look like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance LoadData Old where
    load = loadOldData

instance ConvertData Old NeedsUUID where
    convert (Old i t) =
        NeedsUUID (\uuid -> New uuid i t)
</code></pre></div></div> We have abstracted the UUID generation. Our <code class="language-plaintext highlighter-rouge">ConvertData</code> type class remains pure, and we’ve pushed the implementation detail of UUID generation out. Now, we implement the <code class="language-plaintext highlighter-rouge">SaveData</code> type class, which already had <code class="language-plaintext highlighter-rouge">IO</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance SaveData NeedsUUID where
    save needsUUIDs = do
        values <- forM needsUUIDs $ \needsUUID -> do
            uuid <- freshUUID
            return (giveUUID needsUUID uuid)
        saveNewValues values
</code></pre></div></div> <h1 id="effects-on-the-edge">Effects on the Edge</h1> We want to keep effects isolated to the edges of our programs as much as possible. This allows most of our code to remain pure and easy to test and examine. 
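As a rough sketch of what "easy to test" buys us here: because the conversion returns a function, a test can hand it any fixed UUID and compare the result directly. To keep the example dependency-free, <code class="language-plaintext highlighter-rouge">UUID</code> below is a stand-in newtype over <code class="language-plaintext highlighter-rouge">String</code>, the records use <code class="language-plaintext highlighter-rouge">String</code> instead of <code class="language-plaintext highlighter-rouge">Text</code>, and <code class="language-plaintext highlighter-rouge">convertOld</code> is a plain function that mirrors the <code class="language-plaintext highlighter-rouge">convert</code> method. All three are assumptions for illustration:

```haskell
-- Stand-in for a real UUID type (assumption: the real one comes from a library).
newtype UUID = UUID String
  deriving (Eq, Show)

data Old = Old Int String

data New = New UUID Int String
  deriving (Eq, Show)

newtype NeedsUUID = NeedsUUID
  { giveUUID :: UUID -> New
  }

-- Pure conversion with the same shape as the 'convert' method in the post.
convertOld :: Old -> NeedsUUID
convertOld (Old i t) = NeedsUUID (\uuid -> New uuid i t)

-- In a test we pick the UUID ourselves, so no IO or randomness is involved.
convertedForTest :: New
convertedForTest = giveUUID (convertOld (Old 1 "hello")) (UUID "fixed-uuid")
```

The only place that needs a real random UUID is the <code class="language-plaintext highlighter-rouge">SaveData</code> instance, which was already in <code class="language-plaintext highlighter-rouge">IO</code>.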
I’ve written about similar topics in my posts <a href="/2017/10/11/type_safety_back_and_forth.html">Type Safety Back and Forth</a> and <a href="/2017/07/27/inverted_mocking.html">Invert Your Mocks!</a>. Fri, 22 Mar 2019 00:00:00 +0000 https://www.parsonsmatt.org/2019/03/22/return_a_function_to_avoid_effects.html https://www.parsonsmatt.org/2019/03/22/return_a_function_to_avoid_effects.html Sum Types In SQL Algebraic datatypes are a powerful feature of functional programming languages. By combining the expressive power of “and” and “or,” we can solve all kinds of problems cleanly and elegantly. SQL databases represent product types – “and” – extremely well - a SQL table can correspond easily and directly to a product type where each field in the product type can fit in a single column. On the other hand, SQL databases have trouble with sum types – “or”. Most SQL databases support simple enumerations easily, but they lack the ability to talk about real sum types with fields. We can encode sum types in SQL in a few different ways, each of which has upsides and downsides. For each of these examples, we will be encoding the following Haskell datatype: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Animal = Cat Name Age FavoriteFood | Dog Name OwnerId | Bird Name Song type Name = Text type Age = Int type FavoriteFood = Text type OwnerId = Int type Song = Text </code></pre></div></div> We’re going to ignore that this datatype could be normalized (though I will describe datatype normalization and show what I mean at the end of the post). The SQL examples in this blog post will use PostgreSQL. The Haskell query code will use the <code class="language-plaintext highlighter-rouge">persistent</code> syntax for entities and queries. <h1 id="shared-primary-keys">Shared Primary Keys</h1> The first technique is the one that I have used with the most success. 
It has a good set of tradeoffs with respect to SQL normalization and Haskell types. First, we are going to create a table for all animals and a type for the constructors: <div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE TYPE animal_constr AS ENUM ('cat', 'bird', 'dog'); CREATE TABLE animal ( id SERIAL NOT NULL, type animal_constr NOT NULL, PRIMARY KEY (id, type) ); </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">type</code> field on the table will distinguish the different constructors and tables we’ll use. Let’s create the <code class="language-plaintext highlighter-rouge">cat</code> table: <div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE TABLE cat ( id INTEGER PRIMARY KEY, type animal_constr NOT NULL DEFAULT 'cat' CHECK (type = 'cat'), name TEXT NOT NULL, age INTEGER NOT NULL, favorite_food TEXT NOT NULL, FOREIGN KEY (id, type) REFERENCES animal (id, type) ) </code></pre></div></div> The contents of the <code class="language-plaintext highlighter-rouge">type</code> column for the <code class="language-plaintext highlighter-rouge">cat</code> table are completely constrained – we cannot insert any record into the table that does not have a value of type <code class="language-plaintext highlighter-rouge">'cat'</code>. Fortunately, this is automated - we can omit specifying it, and it’ll be filled in automatically from the <code class="language-plaintext highlighter-rouge">DEFAULT 'cat'</code> clause. The other SQL tables will have a similar format. The <code class="language-plaintext highlighter-rouge">type</code> field will be constrained to the constructor. 
If we add new constructors to the Haskell datatype, we can use the SQL command: <div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ALTER TYPE animal_constr ADD VALUE 'alligator'; </code></pre></div></div> Then we can create an additional table <code class="language-plaintext highlighter-rouge">'alligator'</code>, and the constraints all work out exactly like we want. This representation has a downside. It is possible to insert entries into the <code class="language-plaintext highlighter-rouge">animal</code> table which have no corresponding entry in any other table. Indeed, this is necessary - there must first be an entry in the <code class="language-plaintext highlighter-rouge">animal</code> table before we can insert the corresponding <code class="language-plaintext highlighter-rouge">cat</code>. Let’s check out how to query this from Haskell. We are going to use the following <code class="language-plaintext highlighter-rouge">persistent</code> definition: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Animal type AnimalConstr Cat type AnimalConstr name Text age Int favoriteFood Text </code></pre></div></div> This corresponds exactly with the SQL description above, though it will generate migrations that are different. I am going to elide the <code class="language-plaintext highlighter-rouge">PersistField</code> definitions, as they are irrelevant, but <code class="language-plaintext highlighter-rouge">AnimalConstr</code> is defined as: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data AnimalConstr = Cat | Dog | Bird deriving (Eq, Show, Read) </code></pre></div></div> Assuming we have defined the other animal tables, we’ll use this SQL query: <div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT ??, ??, ??, ?? 
FROM animal
LEFT JOIN cat ON cat.id = animal.id
LEFT JOIN dog ON dog.id = animal.id
LEFT JOIN bird ON bird.id = animal.id
WHERE cat.id IS NOT NULL
   OR dog.id IS NOT NULL
   OR bird.id IS NOT NULL
</code></pre></div></div>
We need the <code class="language-plaintext highlighter-rouge">WHERE</code> clause to ensure that we do not select <code class="language-plaintext highlighter-rouge">animal</code> records that do not have corresponding records in the subtables. The Haskell code to load these records out of the database looks like this:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>selectAnimalsDb
    :: SqlPersistM
        [ ( Entity Animal
          , Maybe (Entity Cat)
          , Maybe (Entity Dog)
          , Maybe (Entity Bird)
          )
        ]
-- rawSql is the persistent function that understands the ?? placeholders.
selectAnimalsDb = rawSql theAboveSqlQuery []
</code></pre></div></div>
We have a post-condition on the return value of this query that is guaranteed by the database schema, but not present in the types. Exactly one of the <code class="language-plaintext highlighter-rouge">Maybe</code> values will be a <code class="language-plaintext highlighter-rouge">Just</code> constructor, and which one it is will be determined by the <code class="language-plaintext highlighter-rouge">AnimalConstr</code> value on the <code class="language-plaintext highlighter-rouge">Entity Animal</code> value. This allows us to unsafely extract the <code class="language-plaintext highlighter-rouge">Just</code> value based on the constructor.
We would write our conversion function as so: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>toDomainType :: Entity Animal -> Maybe (Entity Cat) -> Maybe (Entity Dog) -> Maybe (Entity Bird) -> Domain.Animal toDomainType (Entity _ animal) mcat mdog mbird = case (animalType animal, mcat, mdog, mbird) of (Cat, Just cat, _, _) -> toCat cat (Dog, _, Just dog, _) -> toDog dog (Bird, _, _, Just bird) -> toBird bird _ -> error "Impossible due to database constraints" where toDog :: Entity Dog -> Animal toCat :: Entity Cat -> Animal toBird :: Entity Bird -> Animal </code></pre></div></div> This approach is safe, easy to extend, and conforms to good relational database design. It is possible to make the Haskell-side even safer, by adding some fancier type-level tricks to the SQL conversion layer. However, that differs based on the database library you are using, and I would like to keep this post generalizable to other libraries. A followup post (or library?) might provide insight into this. <h1 id="the-persistent-approach">The <code class="language-plaintext highlighter-rouge">persistent</code> Approach</h1> The <code class="language-plaintext highlighter-rouge">persistent</code> library has an approach to encoding sum-type entities. The <a href="https://github.com/yesodweb/persistent/blob/master/docs/Persistent-entity-syntax.md#entity-level">documentation</a> describes this feature. Given that I wrote the documentation, I’m going to copy it in here: The schema in the test is reproduced here: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>share [mkPersist persistSettings, mkMigrate "sumTypeMigrate"] [persistLowerCase| Bicycle brand T.Text Car make T.Text model T.Text +Vehicle bicycle BicycleId car CarId |] </code></pre></div></div> Let’s check out the definition of the Haskell type <code class="language-plaintext highlighter-rouge">Vehicle</code>. 
Using <code class="language-plaintext highlighter-rouge">ghci</code>, we can query for <code class="language-plaintext highlighter-rouge">:info Vehicle</code>: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>>>> :i Vehicle type Vehicle = VehicleGeneric SqlBackend -- Defined at .../Projects/persistent/persistent-test/src/SumTypeTest.hs:26:1 >>> :i VehicleGeneric type role VehicleGeneric nominal data VehicleGeneric backend = VehicleBicycleSum (Key (BicycleGeneric backend)) | VehicleCarSum (Key (CarGeneric backend)) -- Defined at .../persistent/persistent-test/src/SumTypeTest.hs:26:1 -- lots of instances follow... </code></pre></div></div> A <code class="language-plaintext highlighter-rouge">VehicleGeneric</code> has two constructors: <ul> <li><code class="language-plaintext highlighter-rouge">VehicleBicycleSum</code> with a <code class="language-plaintext highlighter-rouge">Key (BicycleGeneric backend)</code> field</li> <li><code class="language-plaintext highlighter-rouge">VehicleCarSum</code> with a <code class="language-plaintext highlighter-rouge">Key (CarGeneric backend)</code> field</li> </ul> The <code class="language-plaintext highlighter-rouge">Bicycle</code> and <code class="language-plaintext highlighter-rouge">Car</code> are typical <code class="language-plaintext highlighter-rouge">persistent</code> entities. 
This generates the following SQL migrations (formatted for readability): <div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE TABLE "bicycle" ( "id" INTEGER PRIMARY KEY, "brand" VARCHAR NOT NULL ); CREATE TABLE "car"( "id" INTEGER PRIMARY KEY, "make" VARCHAR NOT NULL, "model" VARCHAR NOT NULL ); CREATE TABLE "vehicle"( "id" INTEGER PRIMARY KEY, "bicycle" INTEGER NULL REFERENCES "bicycle", "car" INTEGER NULL REFERENCES "car" ); </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">vehicle</code> table contains a nullable foreign key reference to both the bicycle and the car tables. A SQL query that grabs all the vehicles from the database looks like this (note the <code class="language-plaintext highlighter-rouge">??</code> is for the <code class="language-plaintext highlighter-rouge">persistent</code> raw SQL query functions): <div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT ??, ??, ?? FROM vehicle LEFT JOIN car ON vehicle.car = car.id LEFT JOIN bicycle ON vehicle.bicycle = bicycle.id </code></pre></div></div> If we use the above query with <code class="language-plaintext highlighter-rouge">rawSql</code>, we’d get the following result: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>getVehicles :: SqlPersistM [ ( Entity Vehicle , Maybe (Entity Bicycle) , Maybe (Entity Car) ) ] </code></pre></div></div> This result has some post-conditions that are not guaranteed by the types or the schema. The constructor for <code class="language-plaintext highlighter-rouge">Entity Vehicle</code> is going to determine which of the other members of the tuple is <code class="language-plaintext highlighter-rouge">Nothing</code>. 
We can convert this to a friendlier domain model like this:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Vehicle'
    = Car' Text Text
    | Bike Text

check = do
    result <- getVehicles
    pure (map convert result)

-- Takes the whole result tuple, so it can be mapped over the query result.
convert
    :: (Entity Vehicle, Maybe (Entity Bicycle), Maybe (Entity Car))
    -> Vehicle'
convert (Entity _ (VehicleBicycleSum _), Just (Entity _ (Bicycle brand)), _) =
    Bike brand
convert (Entity _ (VehicleCarSum _), _, Just (Entity _ (Car make model))) =
    Car' make model
convert _ =
    error "The database preconditions have been violated!"
</code></pre></div></div>
The SQL table that is automatically generated from the entities does not guarantee that exactly one ID column is present. We can resolve this by adding a <code class="language-plaintext highlighter-rouge">CHECK</code> constraint that requires exactly one of the two columns to be non-null:
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ALTER TABLE vehicle ADD CHECK (
    (bicycle IS NOT NULL AND car IS NULL)
    OR
    (car IS NOT NULL AND bicycle IS NULL)
);
</code></pre></div></div>
As compared to the “Shared Primary Keys” approach, this inverts the foreign key relationship. This means that we would need to insert a <code class="language-plaintext highlighter-rouge">Cat</code> or <code class="language-plaintext highlighter-rouge">Dog</code> into the database before we can insert an <code class="language-plaintext highlighter-rouge">Animal</code> into the database.
Where the previous method would allow “orphan” <code class="language-plaintext highlighter-rouge">animal</code> records with no corresponding <code class="language-plaintext highlighter-rouge">cat</code>, <code class="language-plaintext highlighter-rouge">dog</code>, etc, this method allows for orphan <code class="language-plaintext highlighter-rouge">cat</code>s and <code class="language-plaintext highlighter-rouge">dog</code>s without corresponding <code class="language-plaintext highlighter-rouge">animal</code> records. Adding constructors is easy - you add a new column to the <code class="language-plaintext highlighter-rouge">animal</code> table, and adjust the <code class="language-plaintext highlighter-rouge">CHECK</code> constraints so that only one can be present. This requires less work with adding custom <code class="language-plaintext highlighter-rouge">ENUM</code> types, requires fewer and less complicated foreign keys, and has less “dynamic” behavior (linking the <code class="language-plaintext highlighter-rouge">id</code> and <code class="language-plaintext highlighter-rouge">type</code> field at runtime vs statically known relationship). <h1 id="nullable-columns">Nullable Columns</h1> This approach dispenses with multiple tables and represents the sum-type in a single table. It is an awkward encoding, and it is not considered good database design. However, it does avoid many <code class="language-plaintext highlighter-rouge">JOIN</code>s, and is slightly more straightforward. We will start by creating a table for <code class="language-plaintext highlighter-rouge">animal</code> and denormalize the other tables in, with <code class="language-plaintext highlighter-rouge">NULL</code>able columns. 
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE TABLE animal (
    -- Common fields:
    id SERIAL PRIMARY KEY,
    type animal_constr NOT NULL,

    -- Cat fields:
    cat_name TEXT,
    cat_age INTEGER,
    cat_food TEXT,

    -- Dog fields:
    dog_name TEXT,
    dog_owner_id INTEGER REFERENCES owner,

    -- Bird fields:
    bird_name TEXT,
    bird_song TEXT
);
</code></pre></div></div>
Now, when we parse fields out of the database, we will look at the <code class="language-plaintext highlighter-rouge">type</code> field, and assume that the relevant fields are not null. We can make this safe by using a <code class="language-plaintext highlighter-rouge">CHECK</code> constraint to ensure that the corresponding fields are not null, and that the irrelevant fields are null.
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ALTER TABLE animal ADD CHECK (
    ( type = 'cat'
        AND cat_name IS NOT NULL
        AND cat_age IS NOT NULL
        AND cat_food IS NOT NULL
        AND dog_name IS NULL
        AND dog_owner_id IS NULL
        AND bird_name IS NULL
        AND bird_song IS NULL
    ) OR
    ( type = 'dog'
        AND dog_name IS NOT NULL
        AND dog_owner_id IS NOT NULL
        AND cat_name IS NULL
        AND cat_age IS NULL
        AND cat_food IS NULL
        AND bird_name IS NULL
        AND bird_song IS NULL
    ) OR
    ( type = 'bird'
        AND bird_name IS NOT NULL
        AND bird_song IS NOT NULL
        AND dog_name IS NULL
        AND dog_owner_id IS NULL
        AND cat_name IS NULL
        AND cat_age IS NULL
        AND cat_food IS NULL
    )
);
</code></pre></div></div>
This <code class="language-plaintext highlighter-rouge">CHECK</code> constraint is getting pretty gnarly, and it’s only going to get worse if we add additional constructors and fields. This kind of boilerplate can be automated away. I suspect that a complex <code class="language-plaintext highlighter-rouge">CHECK</code> constraint like this might become computationally intensive, as well, though I have no idea what the performance characteristics of it are.
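The parsing step described above (look at the <code class="language-plaintext highlighter-rouge">type</code> column, then read only the relevant nullable columns) could be sketched roughly as follows. Everything here is an assumption made for illustration: <code class="language-plaintext highlighter-rouge">AnimalRow</code>, its field names, and the <code class="language-plaintext highlighter-rouge">CatC</code>/<code class="language-plaintext highlighter-rouge">DogC</code>/<code class="language-plaintext highlighter-rouge">BirdC</code> tags are hypothetical, and <code class="language-plaintext highlighter-rouge">String</code> stands in for <code class="language-plaintext highlighter-rouge">Text</code>.

```haskell
-- Hypothetical tag type mirroring the animal_constr enum.
data AnimalConstr = CatC | DogC | BirdC
    deriving (Eq, Show)

-- Hypothetical row type: one Maybe per nullable column.
data AnimalRow = AnimalRow
    { rowType       :: AnimalConstr
    , rowCatName    :: Maybe String
    , rowCatAge     :: Maybe Int
    , rowCatFood    :: Maybe String
    , rowDogName    :: Maybe String
    , rowDogOwnerId :: Maybe Int
    , rowBirdName   :: Maybe String
    , rowBirdSong   :: Maybe String
    }

data Animal
    = Cat String Int String
    | Dog String Int
    | Bird String String
    deriving (Eq, Show)

-- Dispatch on the type tag and read only the columns for that constructor.
-- Returns Nothing if a required column is NULL, which the CHECK constraint
-- is meant to rule out.
parseAnimal :: AnimalRow -> Maybe Animal
parseAnimal row =
    case rowType row of
        CatC  -> Cat <$> rowCatName row <*> rowCatAge row <*> rowCatFood row
        DogC  -> Dog <$> rowDogName row <*> rowDogOwnerId row
        BirdC -> Bird <$> rowBirdName row <*> rowBirdSong row
```

Returning <code class="language-plaintext highlighter-rouge">Maybe</code> instead of calling <code class="language-plaintext highlighter-rouge">error</code> is a design choice for the sketch: it keeps the function total even when the table lacks the constraint.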
This approach is explicitly denormalized, and your DBA friends may scoff at you for implementing it. However, it has many upsides, as well. It is simple and easy to query, and aside from the safety <code class="language-plaintext highlighter-rouge">CHECK</code> constraints to guarantee data integrity, it is relatively low boilerplate. If you want to provide this schema for convenience, you might consider using one of the previous two choices and exposing this as a <code class="language-plaintext highlighter-rouge">VIEW</code> on the underlying data. The query to provide the same schema from the “Shared Primary Keys” approach is here:
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- A view's column list may only name columns; types come from the query.
CREATE VIEW animal_a (
    id,
    type,
    cat_name,
    cat_age,
    cat_food,
    dog_name,
    dog_owner_id,
    bird_name,
    bird_song
) AS
SELECT
    animal.id,
    animal.type,
    cat.name AS cat_name,
    cat.age AS cat_age,
    cat.favorite_food AS cat_food,
    dog.name AS dog_name,
    dog.owner_id AS dog_owner_id,
    bird.name AS bird_name,
    bird.song AS bird_song
FROM animal
LEFT JOIN cat ON animal.id = cat.id
LEFT JOIN dog ON animal.id = dog.id
LEFT JOIN bird ON animal.id = bird.id
WHERE cat.id IS NOT NULL
   OR dog.id IS NOT NULL
   OR bird.id IS NOT NULL;
</code></pre></div></div>
The view from the “Persistent” approach is slightly different, because the IDs for the cats/dogs/birds differ from the animal ID. If you needed to provide absolute backwards compatibility here, then you could provide the view with redundant <code class="language-plaintext highlighter-rouge">cat_id</code> etc columns.
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- A view's column list may only name columns; types come from the query.
CREATE VIEW animal_b (
    id,
    type,
    cat_id,
    cat_name,
    cat_age,
    cat_food,
    dog_id,
    dog_name,
    dog_owner_id,
    bird_id,
    bird_name,
    bird_song
) AS
SELECT
    animal.id,
    animal.type,
    cat.id AS cat_id,
    cat.name AS cat_name,
    cat.age AS cat_age,
    cat.food AS cat_food,
    dog.id AS dog_id,
    dog.name AS dog_name,
    dog.owner_id AS dog_owner_id,
    bird.id AS bird_id,
    bird.name AS bird_name,
    bird.song AS bird_song
FROM animal
LEFT JOIN cat ON animal.cat_id = cat.id
LEFT JOIN dog ON animal.dog_id = dog.id
LEFT JOIN bird ON animal.bird_id = bird.id;
</code></pre></div></div>
Since this option is expressible as a <code class="language-plaintext highlighter-rouge">VIEW</code> on the other two options, I’d suggest doing that if you need to provide this schema. <h1 id="datatype-normalization">Datatype Normalization</h1> Above, I mentioned that the <code class="language-plaintext highlighter-rouge">Animal</code> datatype we’re using is not normalized. Haskell datatypes can be normalized in much the same way that SQL relations can be. The <a href="https://en.wikipedia.org/wiki/Database_normalization">Wikipedia article on database normalization</a> is a good place to learn about this, and the process is essentially the same. First, we notice that the <code class="language-plaintext highlighter-rouge">Name</code> field is repeated in each constructor. We are going to factor that field out.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data AnimalDetails
    = Cat Age FavoriteFood
    | Dog OwnerId
    | Bird Song

data Animal = Animal Name AnimalDetails
</code></pre></div></div>
All of the repetition has been factored out.
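One practical payoff of the normalized type, sketched below under the assumption that <code class="language-plaintext highlighter-rouge">String</code> stands in for <code class="language-plaintext highlighter-rouge">Text</code>: functions that only care about the shared field no longer need a case for every constructor. The names <code class="language-plaintext highlighter-rouge">animalName</code> and <code class="language-plaintext highlighter-rouge">rename</code> are hypothetical helpers, not something defined earlier in the post.

```haskell
-- String replaces Text only to keep this sketch dependency-free.
type Name = String
type Age = Int
type FavoriteFood = String
type OwnerId = Int
type Song = String

data AnimalDetails
    = Cat Age FavoriteFood
    | Dog OwnerId
    | Bird Song
    deriving (Eq, Show)

data Animal = Animal Name AnimalDetails
    deriving (Eq, Show)

-- With the name factored out, reading it is a single pattern match
-- that never needs updating when a constructor is added.
animalName :: Animal -> Name
animalName (Animal name _) = name

-- Updating the shared field leaves the constructor-specific details untouched.
rename :: Name -> Animal -> Animal
rename new (Animal _ details) = Animal new details
```

With the original, unnormalized <code class="language-plaintext highlighter-rouge">Animal</code>, both helpers would need one equation per constructor.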
<h1 id="addendum">Addendum</h1> <a href="https://twitter.com/cliffordheath/status/1108497262450102272">@cliffordheath</a> commented that these three strategies have well-known names in the database world - absorption, separation, and partition. Tue, 19 Mar 2019 00:00:00 +0000 https://www.parsonsmatt.org/2019/03/19/sum_types_in_sql.html https://www.parsonsmatt.org/2019/03/19/sum_types_in_sql.html Implementing Union in Esqueleto I We use the SQL <code class="language-plaintext highlighter-rouge">UNION</code> operator at IOHK in one of our <code class="language-plaintext highlighter-rouge">beam</code> queries, and <code class="language-plaintext highlighter-rouge">esqueleto</code> does not support it. To make porting the IOHK SQL code more straightforward, I decided to implement <code class="language-plaintext highlighter-rouge">UNION</code>. This blog post series will delve into implementing this feature, in a somewhat stream-of-thought manner. <h1 id="background">Background</h1> <code class="language-plaintext highlighter-rouge">esqueleto</code> is a SQL library that builds on the <code class="language-plaintext highlighter-rouge">persistent</code> library for database definitions and simple queries. It attempts to provide an embedded DSL that allows you to use SQL and Haskell together. In my opinion, it has less complicated types than <code class="language-plaintext highlighter-rouge">beam</code> and an easier to learn UX than <code class="language-plaintext highlighter-rouge">opaleye</code>. The <code class="language-plaintext highlighter-rouge">persistent</code> quasiquoter model definitions save a bunch of boilerplate, too. <code class="language-plaintext highlighter-rouge">esqueleto</code> is implemented in a somewhat convoluted manner – we have a type class <code class="language-plaintext highlighter-rouge">Esqueleto query expr backend</code> that everything is defined in terms of. 
However, the functional dependencies on the class essentially only permit a single instance. The <code class="language-plaintext highlighter-rouge">query</code> type is always fixed to <code class="language-plaintext highlighter-rouge">SqlQuery</code>, a <code class="language-plaintext highlighter-rouge">WriterT [Clauses] (StateT IdentInfo)</code> monad. The <code class="language-plaintext highlighter-rouge">expr</code> type is always <code class="language-plaintext highlighter-rouge">SqlExpr</code>, which is a GADT that provides a structure for SQL expressions. It’s kind of a tagless final encoding paired with a GADT initial encoding. Neat. <h1 id="goal">Goal</h1> Alright, so let’s start with the SQL. We want to be able to write a Haskell thing that translates to this SQL:
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT name FROM person
UNION
SELECT title FROM blog_post
</code></pre></div></div>
Let’s just write the syntax out that <code class="language-plaintext highlighter-rouge">esqueleto</code> usually uses, and see where that takes us:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>unionTest =
    ( select $
      from $ \person ->
      return (person ^. PersonName)
    ) `union`
    ( select $
      from $ \blog ->
      return (blog ^. BlogPostTitle)
    )
</code></pre></div></div>
This is a pleasing looking API! Can it work? What type would <code class="language-plaintext highlighter-rouge">union</code> need to have? Well, it probably can’t work. Let’s look at the type of <code class="language-plaintext highlighter-rouge">select</code>:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>select
    :: (SqlSelect a r, MonadIO m)
    => SqlQuery a
    -> SqlReadT m [r]
</code></pre></div></div>
Once something has become a <code class="language-plaintext highlighter-rouge">SqlReadT</code> value, we can’t really introspect on the query structure anymore.
So we can’t have this syntax :( Let’s try something else: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>unionTest = select $ union ( from $ \person -> do pure (person ^. PersonName) ) ( from $ \blog -> do pure (blog ^. BlogPostTitle) ) </code></pre></div></div> This means that <code class="language-plaintext highlighter-rouge">union</code> will end up returning a <code class="language-plaintext highlighter-rouge">SqlQuery a</code>. It takes two arguments, each of which is a query returning the same thing. We have our first attempt at a type to implement! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>union :: SqlQuery a -> SqlQuery a -> SqlQuery a union query0 query1 = undefined </code></pre></div></div> <h1 id="first-attempt">First Attempt</h1> Alright, so, uh, how do we make values of type <code class="language-plaintext highlighter-rouge">SqlQuery</code>? Let’s first look at what the type actually is: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- | SQL backend for @esqueleto@ using 'SqlPersistT'. newtype SqlQuery a = Q { unQ :: W.WriterT SideData (S.State IdentState) a } -- | Side data written by 'SqlQuery'. data SideData = SideData { sdDistinctClause :: !DistinctClause , sdFromClause :: ![FromClause] , sdSetClause :: ![SetClause] , sdWhereClause :: !WhereClause , sdGroupByClause :: !GroupByClause , sdHavingClause :: !HavingClause , sdOrderByClause :: ![OrderByClause] , sdLimitClause :: !LimitClause , sdLockingClause :: !LockingClause } -- | List of identifiers already in use and supply of temporary -- identifiers. 
newtype IdentState = IdentState
    { inUse :: HS.HashSet T.Text
    }

initialIdentState :: IdentState
initialIdentState = IdentState mempty
</code></pre></div></div>
So, we use the <code class="language-plaintext highlighter-rouge">WriterT SideData</code> to accumulate information about the query we’re building. And then we use <code class="language-plaintext highlighter-rouge">IdentState</code> to keep track of identifiers in use. Let’s look at some things that return <code class="language-plaintext highlighter-rouge">SqlQuery</code> values. I searched through the <code class="language-plaintext highlighter-rouge">Database.Esqueleto.Internal.Sql</code> module for <code class="language-plaintext highlighter-rouge">-> SqlQuery</code> and got some interesting results. The only function that returns a <code class="language-plaintext highlighter-rouge">SqlQuery</code> value in the whole module is this:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- line 497
withNonNull
    :: PersistField typ
    => SqlExpr (Value (Maybe typ))
    -> (SqlExpr (Value typ) -> SqlQuery a)
    -> SqlQuery a
withNonNull field f = do
    where_ $ not_ $ isNothing field
    f $ veryUnsafeCoerceSqlExprValue field
</code></pre></div></div>
Okay, so <code class="language-plaintext highlighter-rouge">where_</code> is a <code class="language-plaintext highlighter-rouge">SqlQuery</code> function. Let’s look for its definition:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class (Monad query) => Esqueleto query expr backend
    | query -> expr backend
    , expr -> query backend
  where
    -- snip...

    -- in Database.Esqueleto.Internal.Language, line 93
    -- | @WHERE@ clause: restrict the query's result.
    where_ :: expr (Value Bool) -> query ()
</code></pre></div></div>
The class definition has functional dependencies that basically make it so you can determine any type variable from any other.
Since <code class="language-plaintext highlighter-rouge">persistent</code> uses the <code class="language-plaintext highlighter-rouge">SqlBackend</code> type for the <code class="language-plaintext highlighter-rouge">backend</code>, you end up needing to totally pick <code class="language-plaintext highlighter-rouge">Esqueleto SqlQuery SqlExpr SqlBackend</code>, and you can’t vary any of those types. Okay, let’s find the instance definition: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- line 452 in Database.Esqueleto.Internal.Sql where_ expr = Q $ W.tell mempty { sdWhereClause = Where expr } on expr = Q $ W.tell mempty { sdFromClause = [OnClause expr] } groupBy expr = Q $ W.tell mempty { sdGroupByClause = GroupBy $ toSomeValues expr } having expr = Q $ W.tell mempty { sdHavingClause = Where expr } locking kind = Q $ W.tell mempty { sdLockingClause = Monoid.Last (Just kind) } </code></pre></div></div> There’s actually a bunch, and they mostly just <code class="language-plaintext highlighter-rouge">tell</code> about a part of the query we’re building. Cool. This may be useful soon, but it’s not immediately obvious to me how. I spent some time perusing the rest of the library, and I found another combinator that takes a <code class="language-plaintext highlighter-rouge">SqlQuery</code> value and produces something else: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sub :: PersistField a => Mode -> SqlQuery (SqlExpr (Value a)) -> SqlExpr (Value a) sub mode query = ERaw Parens $ \info -> toRawSql mode info query </code></pre></div></div> This looks useful! 
Let’s look at the <code class="language-plaintext highlighter-rouge">ERaw</code> constructor, which is from the <code class="language-plaintext highlighter-rouge">SqlExpr</code> datatype: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> -- Raw expression: states whether parenthesis are needed -- around this expression, and takes information about the SQL -- connection (mainly for escaping names) and returns both an -- string ('TLB.Builder') and a list of values to be -- interpolated by the SQL backend. ERaw :: NeedParens -> (IdentInfo -> (TLB.Builder, [PersistValue])) -> SqlExpr (Value a) </code></pre></div></div> Okay, so we can start with this approach and just generate the raw SQL we need. This is probably the wrong approach, but it might work, and working is better than imaginary. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>union :: PersistField a => SqlQuery (SqlExpr (Value a)) -> SqlQuery (SqlExpr (Value a)) -> SqlQuery (SqlExpr (Value a)) union query0 query1 = pure $ ERaw Parens $ \info -> let (q0, v0) = toRawSql SELECT info query0 (q1, v1) = toRawSql SELECT info query1 in (q0 <> " UNION " <> q1, v0 <> v1) </code></pre></div></div> This is basically what <code class="language-plaintext highlighter-rouge">sub</code> does. We just concatenate them with the <code class="language-plaintext highlighter-rouge">UNION</code> operator in between. Let’s write a test and see how it works! 
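Before running that test, the shape of the concatenation can be seen in a toy model. This is only a sketch under loud assumptions: <code class="language-plaintext highlighter-rouge">Rendered</code> and <code class="language-plaintext highlighter-rouge">unionRendered</code> are hypothetical stand-ins using plain strings, not <code class="language-plaintext highlighter-rouge">esqueleto</code>’s <code class="language-plaintext highlighter-rouge">TLB.Builder</code> and <code class="language-plaintext highlighter-rouge">PersistValue</code> types.

```haskell
-- Toy stand-in for a rendered query: the SQL text plus its positional
-- parameters, in the order their placeholders appear in the text.
-- (Hypothetical type; the real library uses a text builder and PersistValue.)
type Rendered = (String, [String])

-- Join two rendered queries with UNION. The parameter lists must be
-- appended in the same order as the SQL text, or the placeholders would
-- be bound to the wrong values.
unionRendered :: Rendered -> Rendered -> Rendered
unionRendered (q0, v0) (q1, v1) = (q0 <> " UNION " <> q1, v0 <> v1)
```

The point of the model is the pairing: text and parameters are combined with the same left-to-right order, which is what the <code class="language-plaintext highlighter-rouge">(q0 <> " UNION " <> q1, v0 <> v1)</code> expression above relies on.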
<h2 id="testing-the-first-approach">Testing the First Approach</h2> I hop into the <code class="language-plaintext highlighter-rouge">esqueleto</code> test suite and start writing my test: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>testCaseUnion :: Run -> Spec testCaseUnion run = do describe "union" $ do it "works" $ do run $ do let names = [ "john", "joe", "jordan", "james" ] blogs = [ "asdf", "qwer", "berty", "nopex" ] (pid:_) <- forM names $ \name -> insert (Person name Nothing Nothing 3) forM_ blogs $ \blog -> insert (BlogPost blog pid) res <- select $ ( from $ \person -> do pure (person ^. PersonName) ) `union` ( from $ \blog -> do pure (blog ^. BlogPostTitle) ) liftIO $ L.sort (map unValue res) `shouldBe` L.sort (names <> blogs) </code></pre></div></div> We insert four blogs and people into the database, and then get the <code class="language-plaintext highlighter-rouge">UNION</code> of their names and titles. Does it work? No :( <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> test/Common/Test.hs:1421:11: 1) Tests that are common to all backends, union, works expected: ["asdf","berty","james","joe","john","jordan","nopex","qwer"] but got: ["asdf"] </code></pre></div></div> Okay, that’s a bit weird. Why does it only pick the first? Let’s test our understanding and try a raw query. I’ll add a line that runs the raw SQL and then I’m going to <code class="language-plaintext highlighter-rouge">error</code> out to see the output in the test suite: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- ... res' <- rawSql ( "SELECT Person.name FROM Person " <> "UNION " <> "SELECT BlogPost.title FROM BlogPost" ) [] error (show $ map unSingle (res' :: [Single String])) -- ... 
</code></pre></div></div> Now, we get an error that shows what that query returned: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> test/Common/Test.hs:1417:9: 1) Tests that are common to all backends, union, works uncaught exception: ErrorCall ["asdf","berty","james","joe","john","jordan","nopex","qwer"] CallStack (from HasCallStack): error, called at test/Common/Test.hs:1417:9 in main:Common.Test </code></pre></div></div> Okay, that’s exactly what I expected to see! So there’s something weird about how the query is being generated. I want to get a textual representation of the query, and <code class="language-plaintext highlighter-rouge">toRawSql</code> is the function to do that. I’m going to make a wrapper around it: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>renderQuery :: (Monad m, EI.SqlSelect a r) => SqlQuery a -> SqlPersistT m TL.Text renderQuery q = do conn <- ask pure (queryToText conn q) queryToText :: EI.SqlSelect a r => SqlBackend -> SqlQuery a -> TL.Text queryToText conn q = let (tlb, _) = EI.toRawSql EI.SELECT (conn, EI.initialIdentState) q in TLB.toLazyText tlb </code></pre></div></div> And we’ll render the query: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- ...snip let q = ( from $ \person -> do pure (person ^. PersonName) ) `union` ( from $ \blog -> do pure (blog ^. BlogPostTitle) ) res <- select q error . show =<< renderQuery q -- snip... </code></pre></div></div> Now, what do we get? 
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> test/Common/Test.hs:1415:9: 1) Tests that are common to all backends, union, works uncaught exception: ErrorCall "SELECT (SELECT \"Person\".\"name\"\nFROM \"Person\"\n UNION SELECT \"BlogPost\".\"title\"\nFROM \"BlogPost\"\n)\n" CallStack (from HasCallStack): error, called at test/Common/Test.hs:1415:9 in main:Common.Test </code></pre></div></div> Okay, so it’s doing something that we don’t want. We want this: <div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT name FROM person UNION SELECT title FROM blog_post </code></pre></div></div> And it’s doing this: <div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT ( SELECT name FROM person UNION SELECT title FROM blog_post ) </code></pre></div></div> Which explains our problem! We actually need it to do <code class="language-plaintext highlighter-rouge">SELECT * FROM (the union query)</code>, or to remove the outer SELECT entirely. So, this suggests that this isn’t the right approach. Next post, I’ll attempt to find another way to implement it, and write down the stream-of-thought process on how I got there. Thu, 31 Jan 2019 00:00:00 +0000 https://www.parsonsmatt.org/2019/01/31/esqueleto_union_i.html https://www.parsonsmatt.org/2019/01/31/esqueleto_union_i.html 2018 Retrospective 2018 was a bit of a rollercoaster. Like <a href="/2017/12/31/2017_retrospective.html">last year</a>, I kept a detailed todolist in <a href="https://www.workflowy.com">Workflowy</a>. As a result, I can look back at my goals for the year and see how I worked through them and what I accomplished. One thing that I noted in my previous year’s retrospective was a desire to focus on non-software stuff, and I think I did fairly well on that. 
<h1 id="physical-health">Physical Health</h1> <h2 id="weight">Weight</h2> My goal was to get to 170lbs, as a continuation of last year’s goal. I did not succeed, but I did make significant progress! <img src="https://www.parsonsmatt.org/images/2018-retro/bodyweight.png" alt="Weight loss chart from 2018" /> I started the year around 210lb, and got my monthly trendline to 190lb in mid-August. I credit my weight loss to riding my bike a shitload and the ketogenic diet. In mid-August, I started seeing someone seriously, and I no longer had quite so much time for the bike or discipline for keto. Rather than focus on losing weight, I was also focusing on increasing performance on the bike. For next year, I’d like to finish out the weight loss (finally). Ketogenic dieting works really damn well for me, so I’ll just have to make it work with my lifestyle. Carbs are just so good for endurance performance, it’ll be annoying to take that performance hit. <h2 id="cycling">Cycling</h2> I didn’t have any goals coming into 2018, but I did develop some throughout the year. My brother got me into cycling, and I’m so thankful for his influence and for sharing that hobby with me. So of course I had to compete with him – I decided that I’d beat him in miles for the year. We were pretty close throughout the year, until I crashed on my mountain bike and bruised my ribs. He gained several hundred miles on me while I was recovering for nearly six weeks. Unfortunately for him, he broke his foot, and I am finishing the year with 4,351 miles, just barely passing his 4,312. I wanted to do a lot more bikepacking and bike touring than I ended up doing. I planned a big trip from Denver to Santa Fe, but I was totally unprepared for it. The route was dramatically more difficult than I anticipated, and despite all my riding and training, I was not capable of the 70-90 mile days that I needed to hit my schedule. 
Next year, I plan on mapping out the route better and accomplishing it, though probably aiming to hit half the daily distance. I want to do a lot more overnight bikepacking trips, too. I did a section of the Colorado Trail a few times, and it was great. I should do some ordinary street/paved tours, too, as those are also a lot of fun and a bit less challenging. I have a few actual bike races planned! I’m excited to try that out and see how it goes, though I don’t expect to place well or do too well. <h2 id="lifting">Lifting</h2> I lifted about three times in 2018. I made 0 progress towards my strength goals of a 225lb bench press, 315lb squat, and 405lb deadlift. This year was almost entirely focused on cycling distance and endurance. Next year, I’d like to focus more on a balanced health and fitness approach. So I will likely sacrifice some cycling performance to get stronger and lose weight. <h1 id="mental-health">Mental Health</h1> This was a hard year for mental health for me. I managed to escape an emotionally abusive personal situation, and it left me with what I later realized were PTSD symptoms. Escape, recovery and reflection on these experiences have been a major theme of this year for me. I’ve learned and grown tremendously from this, and I am eternally grateful for my friends and the folks that supported me through this. I signed up for fancy and expensive health insurance next year with Kaiser primarily so that I could finally begin getting better mental healthcare. Naturally, <a href="https://www.sfchronicle.com/bayarea/article/Kaiser-mental-health-workers-to-strike-for-5-days-13452791.php">Kaiser doesn’t seem to live up to their legal obligations</a> for mental health, which rules. I’m planning on beginning talk therapy in the new year, and I’m excited to see where that goes. <h1 id="write-a-book">Write a Book</h1> This was a goal for last year, but I didn’t work towards it at all. This year, I decided to put more energy into it. 
Early on, Sandy Maguire and I decided to team up to write a COMPENDIUM OF HASKELL. We came up with a sweet table-of-contents. Unfortunately, I was too busy with work and other pursuits to contribute, and so Sandy took his parts of the table of contents and wrote <a href="https://leanpub.com/thinking-with-types">Thinking with Types</a>, a fantastic book on type-level programming in Haskell. While I haven’t made as much progress as I want on this, I am happy to announce that I do have a table-of-contents, about 30 pages, and a working title: Haskell In Production. It will be an intermediate level manual on low variance, high success strategies for using Haskell at work. When it comes closer to completion, I will have a more formal announcement. <h1 id="buy-a-house">Buy a House</h1> I was all set to buy a house, but banks decided to revoke their approval when my employment contract status (1099) was known. Apparently they need 2 or more years at a single employer before they’ll consider a 1099 valid/stable income for extending a mortgage. Honestly, it’s probably good that I’m taking an extra year to save up for a down payment, but it is frustrating. I love living in Denver, and I can’t wait to have more lasting roots here. Mon, 31 Dec 2018 00:00:00 +0000 https://www.parsonsmatt.org/2018/12/31/2018_retrospective.html https://www.parsonsmatt.org/2018/12/31/2018_retrospective.html Laziness Quiz Do you understand laziness? It’s okay if you don’t. Most people don’t. It can be somewhat surprising when something actually gets evaluated in Haskell, even when you’re using bang patterns. So, here is a quick quiz on laziness in Haskell! If it makes you feel better, I didn’t get it right either on my first try. You can copy and paste this into a source file and run it in GHCi. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE NamedFieldPuns #-}

module Quiz ( quiz ) where

import Debug.Trace
import System.IO.Unsafe
import Data.IORef

-- | A datatype with a "strict" field and a lazy field. We use
-- a "BangPattern" to annotate that `strict` should be "strict." Why the
-- quotes? Well, because it's not really as strict as you might expect!
data Foo = Foo { strict :: !Int, lazy :: Int }

-- | A global variable. Yes, Haskell has global variables, you just have to
-- be careful with them.
lineNumber :: IORef Int
lineNumber = unsafePerformIO (newIORef 0)
{-# NOINLINE lineNumber #-}

-- | This action is used to keep track of where we are in the quiz.
logLine :: IO ()
logLine = do
    i <- atomicModifyIORef' lineNumber (\i -> (i + 1, i + 1))
    putStrLn $ "Log line: " ++ show i

-- | The 'trace' functions from Debug.Trace will print a message to the
-- console whenever the expression is evaluated. It will only print it
-- once, and after that, the value will be cached.
quiz :: IO ()
quiz = do
    logLine
    let a = trace "evaluating a" $ 1 + 2
    logLine
    let !b = trace "evaluating b" $ 2 + 4
    logLine
    let foo = Foo
            { strict = trace "evaluating strict" $ 1 + 2 + 3 + 4
            , lazy = trace "evaluating lazy" $ 9 + 12
            }
    logLine
    print a
    logLine
    print a -- this line is not a typo
    logLine
    case foo of
        Foo { strict, lazy } -> do
            logLine
            print lazy
</code></pre></div></div> Okay, make sure you can open this up in GHCi. The quiz is: When will the various <code class="language-plaintext highlighter-rouge">trace</code> calls print things out? We have four traces: the one for <code class="language-plaintext highlighter-rouge">a</code>, the one for <code class="language-plaintext highlighter-rouge">b</code>, the one for the <code class="language-plaintext highlighter-rouge">strict</code> field, and the one for the <code class="language-plaintext highlighter-rouge">lazy</code> field. 
Read the program and guess where each of these traces will output. Then, run the program. How does your expectation differ from what actually happened? What mental model were you using, and how did it differ from reality? <h1 id="spoiling">Spoiling…</h1> seriously don’t look at this section until you’ve tried it yourself i’m gonna put more filler here M O A R F I L L E R <h1 id="fill-it-up">fill it up</h1> okay, well, if you’ve made it this far, here’s how I was wrong: I expected that <code class="language-plaintext highlighter-rouge">"evaluating strict"</code> would print out with the <code class="language-plaintext highlighter-rouge">let foo = ...</code> line. My understanding of bang patterns on data fields was that the value was evaluated to WHNF, and then the record was constructed with it. BUT – I also thought that declaring a data structure with a strict field would make the constructor strict as well. It isn’t! In fact, <code class="language-plaintext highlighter-rouge">strict</code>’s evaluation doesn’t print out until later – can you guess when? … It’s when we evaluate the <code class="language-plaintext highlighter-rouge">Foo</code> constructor to weak head normal form. Indeed – a strict field does not evaluate when the record is constructed, but rather, when the record is evaluated. Anyway, this may be useful to you. Enjoy! EDIT: The <a href="https://old.reddit.com/r/haskell/comments/a339r4/laziness_quiz_can_you_get_it_right_i_didnt/">reddit thread</a> is quite good! Tue, 04 Dec 2018 00:00:00 +0000 https://www.parsonsmatt.org/2018/12/04/laziness_quiz.html https://www.parsonsmatt.org/2018/12/04/laziness_quiz.html The Trouble with Typed Errors You, like me, program in either Haskell, or Scala, or F#, or Elm, or PureScript, and you don’t like runtime errors. They’re awful and nasty! You have to debug them, and they’re not represented in the types. 
Instead, we like to use <code class="language-plaintext highlighter-rouge">Either</code> (or something isomorphic) to represent stuff that might fail: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Either l r = Left l | Right r </code></pre></div></div> <code class="language-plaintext highlighter-rouge">Either</code> has a <code class="language-plaintext highlighter-rouge">Monad</code> instance, so you can short-circuit an <code class="language-plaintext highlighter-rouge">Either l r</code> computation with an <code class="language-plaintext highlighter-rouge">l</code> value, or bind it to a function on the <code class="language-plaintext highlighter-rouge">r</code> value. So, we take our unsafe, runtime failure functions: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>head :: [a] -> a lookup :: k -> Map k v -> v parse :: String -> Integer </code></pre></div></div> and we use informative error types to represent their possible failures: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data HeadError = ListWasEmpty head :: [a] -> Either HeadError a data LookupError = KeyWasNotPresent lookup :: k -> Map k v -> Either LookupError v data ParseError = UnexpectedChar Char String | RanOutOfInput parse :: String -> Either ParseError Integer </code></pre></div></div> Except, we don’t really use types like <code class="language-plaintext highlighter-rouge">HeadError</code> or <code class="language-plaintext highlighter-rouge">LookupError</code>. There’s only one way that <code class="language-plaintext highlighter-rouge">head</code> or <code class="language-plaintext highlighter-rouge">lookup</code> could fail. So we just use <code class="language-plaintext highlighter-rouge">Maybe</code> instead. 
<code class="language-plaintext highlighter-rouge">Maybe a</code> is just like using <code class="language-plaintext highlighter-rouge">Either () a</code> – there’s only one possible <code class="language-plaintext highlighter-rouge">Left ()</code> value, and there’s only one possible <code class="language-plaintext highlighter-rouge">Nothing</code> value. (If you’re unconvinced, write <code class="language-plaintext highlighter-rouge">newtype Maybe a = Maybe (Either () a)</code>, derive all the relevant instances, and try and detect a difference between this <code class="language-plaintext highlighter-rouge">Maybe</code> and the stock one). But, <code class="language-plaintext highlighter-rouge">Maybe</code> isn’t great – we’ve lost information! Suppose we have some computation: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foo :: String -> Maybe Integer foo str = do c <- head str r <- lookup str strMap eitherToMaybe (parse (c : r)) </code></pre></div></div> Now, we try it on some input, and it gives us <code class="language-plaintext highlighter-rouge">Nothing</code> back. Which step failed? We actually can’t know that! All we can know is that something failed. So, let’s try using <code class="language-plaintext highlighter-rouge">Either</code> to get more information on what failed. Can we just write this? <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foo :: String -> Either ??? Integer foo str = do c <- head str r <- lookup str strMap parse (c : r) </code></pre></div></div> Unfortunately, this gives us a type error. 
We can see why by looking at the type of <code class="language-plaintext highlighter-rouge">>>=</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(>>=) :: (Monad m) => m a -> (a -> m b) -> m b </code></pre></div></div> The type variable <code class="language-plaintext highlighter-rouge">m</code> must be an instance of <code class="language-plaintext highlighter-rouge">Monad</code>, and the type <code class="language-plaintext highlighter-rouge">m</code> must be exactly the same for the value on the left and the function on the right. <code class="language-plaintext highlighter-rouge">Either LookupError</code> and <code class="language-plaintext highlighter-rouge">Either ParseError</code> are not the same type, and so this does not type check. Instead, we need some way of accumulating these possible errors. We’ll introduce a utility function <code class="language-plaintext highlighter-rouge">mapLeft</code> that helps us: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mapLeft :: (l -> l') -> Either l r -> Either l' r mapLeft f (Left l) = Left (f l) mapLeft _ r = r </code></pre></div></div> Now, we can combine these error types: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foo :: String -> Either (Either HeadError (Either LookupError ParseError)) Integer foo str = do c <- mapLeft Left (head str) r <- mapLeft (Right . Left) (lookup str strMap) mapLeft (Right . Right) (parse (c : r)) </code></pre></div></div> There! Now we can know exactly how and why the computation failed. Unfortunately, that type is a bit of a monster. It’s verbose and all the <code class="language-plaintext highlighter-rouge">mapLeft</code> boilerplate is annoying. At this point, most application developers will create an “application error” type, and they’ll just shove everything that can go wrong into it. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data AllErrorsEver = AllParseError ParseError | AllLookupError LookupError | AllHeadError HeadError | AllWhateverError WhateverError | FileNotFound FileNotFoundError | etc... </code></pre></div></div> Now, this slightly cleans up the code: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foo :: String -> Either AllErrorsEver Integer foo str = do c <- mapLeft AllHeadError (head str) r <- mapLeft AllLookupError (lookup str strMap) mapLeft AllParseError (parse (c : r)) </code></pre></div></div> However, there’s a pretty major problem with this code. <code class="language-plaintext highlighter-rouge">foo</code> is claiming that it can “throw” all kinds of errors – it’s being honest about parse errors, lookup errors, and head errors, but it’s also claiming that it will throw if files aren’t found, “whatever” happens, and <code class="language-plaintext highlighter-rouge">etc</code>. There’s no way that a call to <code class="language-plaintext highlighter-rouge">foo</code> will result in <code class="language-plaintext highlighter-rouge">FileNotFound</code>, because <code class="language-plaintext highlighter-rouge">foo</code> can’t even do <code class="language-plaintext highlighter-rouge">IO</code>! It’s absurd. The type is too large! And I have <a href="/2018/10/02/small_types.html">written about keeping your types small</a> and how wonderful it can be for getting rid of bugs. Suppose we want to handle <code class="language-plaintext highlighter-rouge">foo</code>’s error. 
We call the function, and then write a case expression like good Haskellers: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>case foo "hello world" of Right val -> pure val Left err -> case err of AllParseError parseError -> handleParseError parseError AllLookupError lookupErr -> handleLookupError AllHeadError headErr -> handleHeadError _ -> error "impossible?!?!?!" </code></pre></div></div> Unfortunately, this code is brittle to refactoring! We’ve claimed to handle all errors, but we’re really not handling many of them. We currently “know” that these are the only errors that can happen, but there’s no compiler guarantee that this is the case. Someone might later modify <code class="language-plaintext highlighter-rouge">foo</code> to throw another error, and this case expression will break. Any case expression that evaluates any result from <code class="language-plaintext highlighter-rouge">foo</code> will need to be updated. The error type is too big, and so we introduce the possibility of mishandling it. There’s another problem. Let’s suppose we know how to handle a case or two of the error, but we must pass the rest of the error cases upstream: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bar :: String -> Either AllErrorsEver Integer bar str = case foo str of Right val -> Right val Left err -> case err of AllParseError pe -> Right (handleParseError pe) _ -> Left err </code></pre></div></div> We know that <code class="language-plaintext highlighter-rouge">AllParseError</code> has been handled by <code class="language-plaintext highlighter-rouge">bar</code>, because – just look at it! However, the compiler has no idea. 
Whenever we inspect the error content of <code class="language-plaintext highlighter-rouge">bar</code>, we must either a) “handle” an error case that has already been handled, perhaps dubiously, or b) ignore the error, and desperately hope that no underlying code ever ends up throwing the error. Are we done with the problems on this approach? No! There’s no guarantee that I throw the right error! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>head :: [a] -> Either AllErrorsEver a head (x:xs) = Right x head [] = Left (AllLookupError KeyWasNotPresent) </code></pre></div></div> This code typechecks, but it’s wrong, because <code class="language-plaintext highlighter-rouge">LookupError</code> is only supposed to be thrown by <code class="language-plaintext highlighter-rouge">lookup</code>! It’s obvious in this case, but in larger functions and codebases, it won’t be so clear. <h1 id="monolithic-error-types-are-bad">Monolithic error types are bad</h1> So, having a monolithic error type has a ton of problems. I’m going to make a claim here: <blockquote> All error types should have a single constructor </blockquote> That is, no sum types for errors. How can we handle this? Let’s maybe see if we can make <code class="language-plaintext highlighter-rouge">Either</code> any nicer to use. We’ll define a few helpers: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type (+) = Either infixr 5 + l :: l -> Either l r l = Left r :: r -> Either l r r = Right </code></pre></div></div> Now, let’s refactor that uglier <code class="language-plaintext highlighter-rouge">Either</code> code with these new helpers: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foo :: String -> Either (HeadError + LookupError + ParseError) Integer foo str = do c <- mapLeft l (head str) rest <- mapLeft (r . l) (lookup str strMap) mapLeft (r . 
r) (parse (c : rest)) </code></pre></div></div> Well, the syntax is nicer. We can <code class="language-plaintext highlighter-rouge">case</code> over the nested Either in the error branch to eliminate single error cases. It’s easier to ensure we don’t claim to throw errors we don’t – after all, GHC will correctly infer the type of <code class="language-plaintext highlighter-rouge">foo</code>, and if GHC infers a type variable for any <code class="language-plaintext highlighter-rouge">+</code>, then we can assume that we’re not using that error slot, and can delete it. Unfortunately, there’s still the <code class="language-plaintext highlighter-rouge">mapLeft</code> boilerplate. And expressions which you’d really want to be equal, aren’t – <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>x :: Either (HeadError + LookupError) Int y :: Either (LookupError + HeadError) Int </code></pre></div></div> The values <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code> are isomorphic, but we can’t use them in a <code class="language-plaintext highlighter-rouge">do</code> block because they’re not exactly equal. If we add errors, then we must revise all <code class="language-plaintext highlighter-rouge">mapLeft</code> code, as well as all <code class="language-plaintext highlighter-rouge">case</code> expressions that inspect the errors. Fortunately, these are entirely compiler-guided refactors, so the chance of messing them up is small. However, they contribute significant boilerplate, noise, and busywork to our program. <h1 id="boilerplate-be-gone">Boilerplate be gone!</h1> Well, turns out, we can get rid of the order dependence and boilerplate with type classes! The most powerful approach is to use “classy prisms” from the <code class="language-plaintext highlighter-rouge">lens</code> package. 
Let’s translate our types from concrete values to prismatic ones: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Concrete: head :: [a] -> Either HeadError a -- Prismatic: head :: AsHeadError err => [a] -> Either err a -- Concrete: lookup :: k -> Map k v -> Either LookupError v -- Prismatic: lookup :: (AsLookupError err) => k -> Map k v -> Either err v </code></pre></div></div> Now, type class constraints don’t care about order – <code class="language-plaintext highlighter-rouge">(Foo a, Bar a) => a</code> and <code class="language-plaintext highlighter-rouge">(Bar a, Foo a) => a</code> are exactly the same thing as far as GHC is concerned. The <code class="language-plaintext highlighter-rouge">AsXXX</code> type classes will automatically provide the <code class="language-plaintext highlighter-rouge">mapLeft</code> stuff for us, so now our <code class="language-plaintext highlighter-rouge">foo</code> function looks a great bit cleaner: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foo :: (AsHeadError err, AsLookupError err, AsParseError err) => String -> Either err Integer foo str = do c <- head str r <- lookup str strMap parse (c : r) </code></pre></div></div> This appears to be a significant improvement over what we’ve had before! And, most of the boilerplate with the <code class="language-plaintext highlighter-rouge">AsXXX</code> classes is taken care of via Template Haskell: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>makeClassyPrisms ''ParseError -- this line generates the following: class AsParseError a where _ParseError :: Prism' a ParseError _UnexpectedChar :: Prism' a (Char, String) _RanOutOfInput :: Prism' a () instance AsParseError ParseError where -- etc... </code></pre></div></div> However, we do have to write our own boilerplate when we eventually want to concretely handle these types. 
We may end up writing a huge <code class="language-plaintext highlighter-rouge">AppError</code> that all of these errors get injected into. There’s one major, fatal flaw with this approach. While it composes very nicely, it doesn’t decompose at all! There’s no way to catch a single case and ensure that it’s handled. The machinery that prisms give us doesn’t allow us to separate out a single constraint, so we can’t pattern match on a single error. Once again, our types become ever larger, with all of the problems that entails. <h1 id="generics-to-the-rescue">Generics to the rescue!</h1> What we really want is: <ul> <li>Order independence</li> <li>No boilerplate</li> <li>Easy composition</li> <li>Easy decomposition</li> </ul> In PureScript or OCaml, you can use open variant types to do this flawlessly. Haskell doesn’t have open variants, and the attempts to mock them end up quite clumsy to use in practice. I’m happy to say that the entire job is handled quite nicely with the amazing <a href="https://hackage.haskell.org/package/generic-lens"><code class="language-plaintext highlighter-rouge">generic-lens</code></a> package. I created <a href="https://gist.github.com/parsonsmatt/880fbf79eaad6ed863786c6c02f8ddc9">a gist</a> that demonstrates their usage, but the magic comes down to this simple fact: there’s an instance of the prismatic <code class="language-plaintext highlighter-rouge">AsType</code> class for <code class="language-plaintext highlighter-rouge">Either</code>, which allows you to “pluck” a constraint off. This satisfies all of the things I wanted in my list, and we can consider representing errors mostly solved. <h2 id="womp-womp-edit-2020-06-02">Womp Womp (edit: 2020-06-02)</h2> Turns out, the above approach was premature. 
<code class="language-plaintext highlighter-rouge">generic-lens</code> only handles two types - it doesn’t “deep search” the <code class="language-plaintext highlighter-rouge">Either</code>, so this approach doesn’t work as well as I wanted it to. Fortunately, the technique I document in the <a href="/2020/01/03/plucking_constraints.html">Plucking Constraints</a> post does work quite well, and it doesn’t even require <code class="language-plaintext highlighter-rouge">lens</code> knowledge! I wrote the <a href="https://hackage.haskell.org/package/plucky"><code class="language-plaintext highlighter-rouge">plucky</code></a> package to demonstrate the technique for errors specifically. As far as I know, this is the best approach in Haskell for type-safe errors. <h1 id="mostly">Mostly?</h1> Well, <code class="language-plaintext highlighter-rouge">ExceptT e IO a</code> still imposes a significant runtime performance hit, and asynchronous exceptions aren’t considered here. A bifunctor IO type like <code class="language-plaintext highlighter-rouge">newtype BIO err a = BIO (IO a)</code> which carries the type class constraints of the errors it contains is promising, but I haven’t been able to write a satisfying interface to this yet. I also haven’t used this technique in a large codebase yet, and I don’t know how it scales. And the technique does require you to be comfortable with <code class="language-plaintext highlighter-rouge">lens</code>, which is a fairly high bar for training new folks on. I’m sure that API improvements could be made to make this style more accessible and remove some of the lens knowledge prerequisites. 
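To make the constraint-plucking idea mentioned in the edit above a bit more concrete, here is a minimal sketch. It is not the <code class="language-plaintext highlighter-rouge">plucky</code> API: the <code class="language-plaintext highlighter-rouge">Throws</code> class and the names <code class="language-plaintext highlighter-rouge">inject</code>, <code class="language-plaintext highlighter-rouge">catchOne</code>, <code class="language-plaintext highlighter-rouge">safeHead</code>, <code class="language-plaintext highlighter-rouge">safeLookup</code>, <code class="language-plaintext highlighter-rouge">firstOf</code> and <code class="language-plaintext highlighter-rouge">firstOrZero</code> are hypothetical and invented for illustration. The sketch assumes GHC's overlapping instance pragmas and uses an association list in place of a <code class="language-plaintext highlighter-rouge">Map</code> so that it needs no imports.

```haskell
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}

-- Single-constructor error types, as in the post.
data HeadError = ListWasEmpty deriving (Show, Eq)
data LookupError = KeyWasNotPresent deriving (Show, Eq)

-- | Hypothetical class (not from any library): the error type @big@
-- has room for a @small@ error.
class Throws small big where
    inject :: small -> big

-- Base case: an error type trivially contains itself.
instance Throws e e where
    inject = id

-- The error we want sits in the left slot of the Either.
instance {-# OVERLAPPING #-} Throws e (Either e rest) where
    inject = Left

-- Otherwise, keep searching in the right slot.
instance {-# OVERLAPPABLE #-} Throws e rest => Throws e (Either other rest) where
    inject = Right . inject

safeHead :: Throws HeadError err => [a] -> Either err a
safeHead (x : _) = Right x
safeHead [] = Left (inject ListWasEmpty)

safeLookup :: (Eq k, Throws LookupError err) => k -> [(k, v)] -> Either err v
safeLookup k table = maybe (Left (inject KeyWasNotPresent)) Right (lookup k table)

-- Composition: no mapLeft, and the order of the constraints is irrelevant.
firstOf
    :: (Throws HeadError err, Throws LookupError err)
    => [String] -> [(String, Int)] -> Either err Int
firstOf keys table = do
    k <- safeHead keys
    safeLookup k table

-- | Decomposition: handle the leftmost error and remove it from the type.
catchOne :: Either (Either e rest) a -> (e -> Either rest a) -> Either rest a
catchOne (Left (Left e)) handler = handler e
catchOne (Left (Right rest)) _ = Left rest
catchOne (Right a) _ = Right a

-- HeadError is handled here, so only LookupError remains in the signature.
firstOrZero :: Throws LookupError err => [String] -> [(String, Int)] -> Either err Int
firstOrZero keys table =
    catchOne (firstOf keys table) (\ListWasEmpty -> Right 0)
```

In GHCi, <code class="language-plaintext highlighter-rouge">firstOrZero [] [("a", 1)] :: Either LookupError Int</code> evaluates to <code class="language-plaintext highlighter-rouge">Right 0</code>. The point of the design is in the signature of <code class="language-plaintext highlighter-rouge">firstOrZero</code>: calling <code class="language-plaintext highlighter-rouge">catchOne</code> forces the polymorphic error type to be an <code class="language-plaintext highlighter-rouge">Either HeadError rest</code>, which discharges the <code class="language-plaintext highlighter-rouge">HeadError</code> constraint and passes the remaining one along to <code class="language-plaintext highlighter-rouge">rest</code>.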
Sat, 03 Nov 2018 00:00:00 +0000 https://www.parsonsmatt.org/2018/11/03/trouble_with_typed_errors.html https://www.parsonsmatt.org/2018/11/03/trouble_with_typed_errors.html Capability and Suitability Gary Bernhardt has a fantastic talk on <a href="https://www.youtube.com/watch?v=NftT6HWFgq0">Capability vs Suitability</a>, where he separates advances in software engineering into two buckets: <ul> <li>Capability: The ability to do new things!</li> <li>Suitability: The ability to do things well.</li> </ul> Capability is progressive and daring, while suitability is conservative and boring. Capability wants to create entirely new things, while suitability wants to refine existing things. This post is going to explore a metaphor with bicycles, specifically bike tires, while we think about capability and suitability. When you get a bike, you have so many options. Tire size is one of them. You can opt for a super narrow road tire – a mere 19mm in width! Or, on the other end of the scale, you can opt for a truly fat tire at around 5” in width. What’s the difference? Narrower tires are less capable – there is less terrain you can cover on a narrow tire. However, they’re more suitable for the terrain they can cover – a 19mm tire will be significantly lighter and faster than a 5” tire. A good 19mm tire weighs around 200g, while a 5” tire might weigh 1,800g each. Lugging around an extra 7lbs of rubber takes a lot of energy! Additionally, all that rubber is going to have a lot of rolling resistance – it’ll be harder to push across the ground on smooth surfaces where the 19mm tire excels. So, most cyclists don’t use fat tire bikes. But they also don’t use 19mm skinny tires. Most road cyclists have moved up to 25 or 28mm tires. While the 19mm tires work fantastically on a perfectly smooth surface, they start suffering when the road gets bumpy. All the bumps and rough surfaces call for a slightly more capable tire. 
The wider tires can run lower air pressure, which lets them float over bumps rather than being bumped up and down. So, we have two competing forces in bike tires: <ul> <li>The speed and comfort on the terrain you ride most frequently</li> <li>The speed and comfort on the worst terrain you encounter regularly</li> </ul> You want enough capability to handle the latter, while a tire that’s suitable for the former. In computer programming, we tend to reach for the most capable thing we can get our hands on. Dynamically typed, impure, and Turing complete programming languages like Ruby, JavaScript, and Python are immensely popular. Statically typed languages are often seen as stifling, and pure languages even more so. There simply aren’t many languages that are Turing incomplete, that’s how little we like them! Yet, these extra gains in capability are often unnecessary. There’s very little code that’s difficult to statically type with a reasonable type system. Impurity seems convenient, until you realize that you need to look at every single method call to see why the code that renders an HTML page is making an N+1 query and ruining performance. Indeed, even Turing completeness is overrated – a Turing incomplete language permits dramatically more optimizations and static analysis for bug prevention, and very few programs actually require Turing completeness. In this sense, programmers are like cyclists that pick up the 5” tire fat bikes and then wonder why they’re moving so slow. They may ride in the snow or deep sand once or twice a year, and they stick with the 5” tire for that reason alone. Programmers that are willing to give up the capability they don’t need in order to purchase suitability they could use tend to go faster, as you might expect. Folks that learn Haskell and become sufficiently familiar with purely functional and statically typed programming tend to take those practices with them, even in impure or dynamically typed languages. 
It is easier to understand what you did when you limit what you can do. Sat, 03 Nov 2018 00:00:00 +0000 https://www.parsonsmatt.org/2018/11/03/capability_and_suitability.html https://www.parsonsmatt.org/2018/11/03/capability_and_suitability.html TChan vs TQueue: What's the difference? I always forget the difference between a <code class="language-plaintext highlighter-rouge">TChan</code> and a <code class="language-plaintext highlighter-rouge">TQueue</code>. They appear to have an almost identical API, so whenever I need a concurrent message thing, I spend some time working out what the difference is. I’ve done this a few times now, and it’s about time that I write it out so that I don’t need to keep reconstructing it. Aside: Please don’t use <code class="language-plaintext highlighter-rouge">TChan</code> or <code class="language-plaintext highlighter-rouge">TQueue</code>. These types are unbounded, which means that you can run into unbounded memory use if your producer is faster than your consumer. Instead, use <code class="language-plaintext highlighter-rouge">TBChan</code> or <code class="language-plaintext highlighter-rouge">TBQueue</code>, which allow you to set a bound. I have run into issues with livelock with the standard <code class="language-plaintext highlighter-rouge">stm</code> and <code class="language-plaintext highlighter-rouge">stm-chans</code> types, and have found that <a href="https://hackage.haskell.org/package/unagi-chan"><code class="language-plaintext highlighter-rouge">unagi-chan</code> package</a> has better performance in all cases, so I usually reach for that when I have a need for a high performance concurrent channel. Unfortunately, the <code class="language-plaintext highlighter-rouge">unagi-chan</code> variants don’t operate in <code class="language-plaintext highlighter-rouge">STM</code>, which can be a dealbreaker depending on your workflow. tl;dr: Use a channel when you want all readers to receive each message. 
Use a queue when you want only one reader to receive each message. The <a href="https://hackage.haskell.org/package/stm-2.5.0.0/docs/Control-Concurrent-STM-TChan.html">docs for a <code class="language-plaintext highlighter-rouge">TChan</code></a> are concise: <blockquote> TChan is an abstract type representing an unbounded FIFO channel. </blockquote> The <a href="">docs for a <code class="language-plaintext highlighter-rouge">TQueue</code> are a bit more verbose</a>: <blockquote> A <code class="language-plaintext highlighter-rouge">TQueue</code> is like a <code class="language-plaintext highlighter-rouge">TChan</code>, with two important differences: <ul> <li>it has faster throughput than both <code class="language-plaintext highlighter-rouge">TChan</code> and <code class="language-plaintext highlighter-rouge">Chan</code> (although the costs are amortised, so the cost of individual operations can vary a lot).</li> <li>it does not provide equivalents of the <code class="language-plaintext highlighter-rouge">dupTChan</code> and <code class="language-plaintext highlighter-rouge">cloneTChan</code> operations.</li> </ul> The implementation is based on the traditional purely-functional queue representation that uses two lists to obtain amortised $O(1)$ enqueue and dequeue operations. </blockquote> So the docs say that <code class="language-plaintext highlighter-rouge">TQueue</code> is faster, but has fewer operations. Presumably, we should use a <code class="language-plaintext highlighter-rouge">TQueue</code> unless we need these operations. What do <code class="language-plaintext highlighter-rouge">dupTChan</code> and <code class="language-plaintext highlighter-rouge">cloneTChan</code> do? 
Let’s look at <a href="https://hackage.haskell.org/package/stm-2.5.0.0/docs/Control-Concurrent-STM-TChan.html#v:dupTChan">the Haddocks</a>: <blockquote> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dupTChan :: TChan a -> STM (TChan a) </code></pre></div> </div> Duplicate a TChan: the duplicate channel begins empty, but data written to either channel from then on will be available from both. Hence this creates a kind of broadcast channel, where data written by anyone is seen by everyone else. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cloneTChan :: TChan a -> STM (TChan a) </code></pre></div> </div> Clone a TChan: similar to dupTChan, but the cloned channel starts with the same content available as the original channel. </blockquote> So, what’s the point of these? Let’s write some code and see what happens. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>test2 :: IO () test2 = do c <- newTChanIO forkIO $ do for_ [1..5] $ \n -> do i <- atomically $ readTChan c putStrLn $ "First thread received: " ++ show i ++ " on #: " ++ show n forkIO $ do for_ [6..10] $ \n -> do i <- atomically $ readTChan c putStrLn $ "Second thread received: " ++ show i ++ " on #: " ++ show n for_ [1..10] $ \i -> do threadDelay 10000 atomically $ writeTChan c i </code></pre></div></div> This creates a new <code class="language-plaintext highlighter-rouge">TChan</code>, then forks two threads. Each thread sits and waits on the <code class="language-plaintext highlighter-rouge">TChan</code> to have a value, and then it prints the value out. Finally we stuff the numbers 1 through 10 into the channel, with a slight delay. This is the output: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Beginning test... 
First thread received: 1 on #: 1 First thread received: 2 on #: 2 Second thread received: 3 on #: 6 First thread received: 4 on #: 3 Second thread received: 5 on #: 7 Second thread received: 6 on #: 8 First thread received: 7 on #: 4 Second thread received: 8 on #: 9 First thread received: 9 on #: 5 Second thread received: 10 on #: 10 </code></pre></div></div> Alright, so the two threads mostly just interleave their work. The values 1-10 are split between the two threads, and each value is printed exactly once. Let’s try using <code class="language-plaintext highlighter-rouge">dupTChan</code> and see what happens: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>test3 :: IO () test3 = do c <- newTChanIO forkIO $ do for_ [1..5] $ \n -> do i <- atomically $ readTChan c putStrLn $ "First thread received: " ++ show i ++ " on #: " ++ show n forkIO $ do c' <- atomically $ dupTChan c for_ [6..10] $ \n -> do i <- atomically $ readTChan c' putStrLn $ "Second thread received: " ++ show i ++ " on #: " ++ show n for_ [1..10] $ \i -> do threadDelay 10000 atomically $ writeTChan c i </code></pre></div></div> This is basically the same code, but we’ve duplicated the <code class="language-plaintext highlighter-rouge">TChan</code> in the second thread. Here’s the new output: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Beginning test... First thread received: 1 on #: 1 Second thread received: 1 on #: 6 Second thread received: 2 on #: 7 First thread received: 2 on #: 2 First thread received: 3 on #: 3 Second thread received: 3 on #: 8 First thread received: 4 on #: 4 Second thread received: 4 on #: 9 First thread received: 5 on #: 5 Second thread received: 5 on #: 10 </code></pre></div></div> Interesting! So the duplicated channel is able to receive a <code class="language-plaintext highlighter-rouge">writeTChan</code>, and both threads are able to see the values. 
So, a <code class="language-plaintext highlighter-rouge">TChan</code> with a <code class="language-plaintext highlighter-rouge">dupTChan</code> call is suitable for when you want all copies of the <code class="language-plaintext highlighter-rouge">TChan</code> to receive a value. A <code class="language-plaintext highlighter-rouge">TQueue</code> will only permit a value to be seen once, by a single thread. There’s a variant of <code class="language-plaintext highlighter-rouge">newTChan</code> called <code class="language-plaintext highlighter-rouge">newBroadcastTChan</code>. How does it differ? The docs explain: <blockquote> Create a write-only TChan. More precisely, readTChan will retry even after items have been written to the channel. The only way to read a broadcast channel is to duplicate it with dupTChan. Consider a server that broadcasts messages to clients: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>serve :: TChan Message -> Client -> IO loop serve broadcastChan client = do myChan <- dupTChan broadcastChan forever $ do message <- readTChan myChan send client message </code></pre></div> </div> The problem with using newTChan to create the broadcast channel is that if it is only written to and never read, items will pile up in memory. By using newBroadcastTChan to create the broadcast channel, items can be garbage collected after clients have seen them. </blockquote> The last paragraph is the important part. A standard <code class="language-plaintext highlighter-rouge">TChan</code> will accumulate messages until that copy of the <code class="language-plaintext highlighter-rouge">TChan</code> is read. A broadcast <code class="language-plaintext highlighter-rouge">TChan</code> will not accumulate messages on the write end of the channel. 
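To make the broadcast behaviour concrete, here is a minimal sketch that assumes only the <code class="language-plaintext highlighter-rouge">stm</code> package. The name <code class="language-plaintext highlighter-rouge">broadcastDemo</code> is made up for illustration. It creates the write end with <code class="language-plaintext highlighter-rouge">newBroadcastTChanIO</code>, duplicates it before writing, and reads only from the duplicate:

```haskell
import Control.Concurrent.STM

-- Hypothetical demo, not taken from the stm documentation.
-- The broadcast channel is used purely as a write end; all reads go
-- through a duplicate that was created before any writes happened.
broadcastDemo :: IO [Int]
broadcastDemo = do
  writeEnd <- newBroadcastTChanIO
  reader   <- atomically (dupTChan writeEnd)
  -- Both writes land in one transaction, so the reader sees them in order.
  atomically $ do
    writeTChan writeEnd 1
    writeTChan writeEnd 2
  -- The duplicate receives everything written after it was created.
  atomically $ do
    a <- readTChan reader
    b <- readTChan reader
    pure [a, b]
```

Because nothing ever reads from <code class="language-plaintext highlighter-rouge">writeEnd</code> itself, this is the shape where a plain <code class="language-plaintext highlighter-rouge">newTChanIO</code> would keep every message alive on the original channel.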
This points to an important performance concern: <ul> <li>If you <code class="language-plaintext highlighter-rouge">dupTChan</code> a channel, you must read from every duplicate of the <code class="language-plaintext highlighter-rouge">TChan</code> in order to avoid a memory leak.</li> <li>If you intend on having a write-only end, you must use <code class="language-plaintext highlighter-rouge">newBroadcastTChan</code> for any channel that you won’t read from.</li> </ul> <code class="language-plaintext highlighter-rouge">TQueue</code> avoids this problem as it cannot be duplicated. Fri, 12 Oct 2018 00:00:00 +0000 https://www.parsonsmatt.org/2018/10/12/tchan_vs_tqueue.html https://www.parsonsmatt.org/2018/10/12/tchan_vs_tqueue.html Keep your types small... <h1 id="-and-your-bugs-smaller">… and your bugs smaller</h1> In my post <a href="/2017/10/11/type_safety_back_and_forth.html">“Type Safety Back and Forth”</a>, I discussed two different techniques for bringing type safety to programs that may fail. On the one hand, you can push the responsibility forward. This technique uses types like <code class="language-plaintext highlighter-rouge">Either</code> and <code class="language-plaintext highlighter-rouge">Maybe</code> to report a problem with the inputs to the function. Here are two example type signatures: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>safeDivide :: Int -> Int -> Maybe Int lookup :: Ord k => k -> Map k a -> Maybe a </code></pre></div></div> If the second parameter to <code class="language-plaintext highlighter-rouge">safeDivide</code> is <code class="language-plaintext highlighter-rouge">0</code>, then we return <code class="language-plaintext highlighter-rouge">Nothing</code>. 
Likewise, if the given <code class="language-plaintext highlighter-rouge">k</code> is not present in the <code class="language-plaintext highlighter-rouge">Map</code>, then we return <code class="language-plaintext highlighter-rouge">Nothing</code>. On the other hand, you can push it back. Here are those functions, but with the safety pushed back: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>safeDivide :: Int -> NonZero Int -> Int lookupJustified :: Key ph k -> Map ph k a -> a </code></pre></div></div> With <code class="language-plaintext highlighter-rouge">safeDivide</code>, we require the user pass in a <code class="language-plaintext highlighter-rouge">NonZero Int</code> – a type that guarantees that the underlying value is not <code class="language-plaintext highlighter-rouge">0</code>. With <code class="language-plaintext highlighter-rouge">lookupJustified</code>, the <code class="language-plaintext highlighter-rouge">ph</code> type guarantees that the <code class="language-plaintext highlighter-rouge">Key</code> is present in the <code class="language-plaintext highlighter-rouge">Map</code>, so we can pull the resulting value out without requiring a <code class="language-plaintext highlighter-rouge">Maybe</code>. (Check out the <a href="https://hackage.haskell.org/package/justified-containers-0.3.0.0/docs/Data-Map-Justified-Tutorial.html">tutorial</a> for <code class="language-plaintext highlighter-rouge">justified-containers</code>, it is pretty awesome) <h1 id="expansion-and-restriction">Expansion and Restriction</h1> “Type Safety Back and Forth” uses the metaphor of “pushing” the responsibility in one of two directions: <ul> <li>forwards: the caller of the function is responsible for handling the possible error output</li> <li>backwards: the caller of the function is required to provide correct inputs</li> </ul> However, this metaphor is a bit squishy. 
We can make it more precise by talking about the “cardinality” of a type – how many values it can contain. The type <code class="language-plaintext highlighter-rouge">Bool</code> can contain two values – <code class="language-plaintext highlighter-rouge">True</code> and <code class="language-plaintext highlighter-rouge">False</code>, so we say it has a cardinality of 2. The type <code class="language-plaintext highlighter-rouge">Word8</code> can express the numbers from 0 to 255, so we say it has a cardinality of 256. The type <code class="language-plaintext highlighter-rouge">Maybe a</code> has a cardinality of <code class="language-plaintext highlighter-rouge">1 + a</code>. We get a “free” value <code class="language-plaintext highlighter-rouge">Nothing :: Maybe a</code>. For every value of type <code class="language-plaintext highlighter-rouge">a</code>, we can wrap it in <code class="language-plaintext highlighter-rouge">Just</code>. The type <code class="language-plaintext highlighter-rouge">Either e a</code> has a cardinality of <code class="language-plaintext highlighter-rouge">e + a</code>. We can wrap all the values of type <code class="language-plaintext highlighter-rouge">e</code> in <code class="language-plaintext highlighter-rouge">Left</code>, and then we can wrap all the values of type <code class="language-plaintext highlighter-rouge">a</code> in <code class="language-plaintext highlighter-rouge">Right</code>. The first technique – pushing forward – is “expanding the result type.” When we wrap our results in <code class="language-plaintext highlighter-rouge">Maybe</code>, <code class="language-plaintext highlighter-rouge">Either</code>, and similar types, we’re saying that we can’t handle all possible inputs, and so we must have extra outputs to safely deal with this. Let’s consider the second technique. 
Specifically, here’s <code class="language-plaintext highlighter-rouge">NonZero</code> and <code class="language-plaintext highlighter-rouge">NonEmpty</code>, two common ways to implement it: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype NonZero a = UnsafeNonZero { unNonZero :: a } nonZero :: (Num a, Eq a) => a -> Maybe (NonZero a) nonZero 0 = Nothing nonZero i = Just (UnsafeNonZero i) data NonEmpty a = a :| [a] nonEmpty :: [a] -> Maybe (NonEmpty a) nonEmpty [] = Nothing nonEmpty (x:xs) = Just (x :| xs) </code></pre></div></div> What is the cardinality of these types? <code class="language-plaintext highlighter-rouge">NonZero a</code> represents “the type of values <code class="language-plaintext highlighter-rouge">a</code> such that the value is not equal to <code class="language-plaintext highlighter-rouge">0</code>.” <code class="language-plaintext highlighter-rouge">NonEmpty a</code> represents “the type of lists of <code class="language-plaintext highlighter-rouge">a</code> that are not empty.” In both of these cases, we start with some larger type and remove some potential values. So the type <code class="language-plaintext highlighter-rouge">NonZero a</code> has the cardinality <code class="language-plaintext highlighter-rouge">a - 1</code>, and the type <code class="language-plaintext highlighter-rouge">NonEmpty a</code> has the cardinality <code class="language-plaintext highlighter-rouge">[a] - 1</code>. Interestingly enough, <code class="language-plaintext highlighter-rouge">[a]</code> has an infinite cardinality, so <code class="language-plaintext highlighter-rouge">[a] - 1</code> seems somewhat strange – it is also infinite! Math tells us that these are even the same infinity. So it’s not the mere cardinality that helps – it is the specific value(s) that we have removed that makes this type safer for certain operations. 
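As a rough illustration of "safer for certain operations", here is a small self-contained sketch. It repeats the two definitions so the block compiles on its own, implements <code class="language-plaintext highlighter-rouge">safeDivide</code> with the pushed-back signature shown earlier, and adds <code class="language-plaintext highlighter-rouge">safeHead</code>, a name I am making up for this example:

```haskell
newtype NonZero a = UnsafeNonZero { unNonZero :: a }

-- Smart constructor: the only sanctioned way to obtain a NonZero.
nonZero :: (Num a, Eq a) => a -> Maybe (NonZero a)
nonZero 0 = Nothing
nonZero i = Just (UnsafeNonZero i)

-- Cannot divide by zero: that case was ruled out when the divisor was built.
safeDivide :: Int -> NonZero Int -> Int
safeDivide n d = n `div` unNonZero d

data NonEmpty a = a :| [a]

nonEmpty :: [a] -> Maybe (NonEmpty a)
nonEmpty []     = Nothing
nonEmpty (x:xs) = Just (x :| xs)

-- Hypothetical helper: there is always a first element, so no Maybe is needed.
safeHead :: NonEmpty a -> a
safeHead (x :| _) = x
```

The <code class="language-plaintext highlighter-rouge">Maybe</code> has not vanished. It has moved to the point where the input is validated, so <code class="language-plaintext highlighter-rouge">fmap (safeDivide 10) (nonZero 2)</code> is <code class="language-plaintext highlighter-rouge">Just 5</code> and <code class="language-plaintext highlighter-rouge">fmap (safeDivide 10) (nonZero 0)</code> is <code class="language-plaintext highlighter-rouge">Nothing</code>.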
These are custom examples of <a href="https://ucsd-progsys.github.io/liquidhaskell-tutorial/">refinement types</a>. Another closely related idea is <a href="https://www.hedonisticlearning.com/posts/quotient-types-for-programmers.html">quotient types</a>. The basic idea here is to restrict the size of our inputs. Slightly more formally, <ul> <li>Forwards: expand the range</li> <li>Backwards: restrict the domain</li> </ul> <h1 id="constraints-liberate">Constraints Liberate</h1> Runar Bjarnason has a wonderful talk titled <a href="https://www.youtube.com/watch?v=GqmsQeSzMdw">Constraints Liberate, Liberties Constrain</a>. The big idea of the talk, as I see it, is this: <blockquote> When we restrict what we can do, it’s easier to understand what we can do. </blockquote> I feel there is a deep connection between this idea and Rich Hickey’s talk <a href="https://www.youtube.com/watch?v=34_L7t7fD_U">Simple Made Easy</a>. In both cases, we are focusing on simplicity – on cutting away the inessential and striving for more elegant ways to express our problems. Pushing the safety forward – expanding the range – does not make things simpler. It provides us with more power, more options, and more possibilities. Pushing the safety backwards – restricting the domain – does make things simpler. We can use this technique to take away the power to get it wrong, the options that aren’t right, and the possibilities we don’t want. Indeed, if we manage to restrict our types sufficiently, there may be only one implementation possible! The classic example is the <code class="language-plaintext highlighter-rouge">identity</code> function: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>identity :: a -> a identity a = a </code></pre></div></div> This is the only implementation of this function that satisfies the type signature (ignoring <code class="language-plaintext highlighter-rouge">undefined</code>, of course). 
In fact, for any function with a sufficiently precise type signature, there is a way to automatically derive the function! Joachim Breitner’s <a href="https://www.joachim-breitner.de/blog/735-The_magic_%E2%80%9CJust_do_it%E2%80%9D_type_class"><code class="language-plaintext highlighter-rouge">justDoIt</code></a> is a fascinating utility that can solve these implementations for you. With sufficiently fancy types, the computer can write even more code for you. The programming language Idris can <a href="https://youtu.be/X36ye-1x_HQ?t=1140">write well-defined functions like <code class="language-plaintext highlighter-rouge">zipWith</code> and <code class="language-plaintext highlighter-rouge">transpose</code> for length-indexed lists nearly automatically!</a> <h1 id="restrict-the-range">Restrict the Range</h1> I see this pattern and I am compelled to fill it in: <table> <thead> <tr> <th> </th> <th>Restrict</th> <th>Expand</th> </tr> </thead> <tbody> <tr> <td>Range</td> <td> </td> <td>:(</td> </tr> <tr> <td>Domain</td> <td>:D</td> <td> </td> </tr> </tbody> </table> I’ve talked about restricting the domain and expanding the range. Expanding the domain seems silly to do – we accept more possible values than we know what to do with. This is clearly not going to make it easier or simpler to implement our programs. However, there are many functions in Haskell’s standard library that have a domain that is too large. Consider: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>take :: Int -> [a] -> [a] </code></pre></div></div> <code class="language-plaintext highlighter-rouge">Int</code>, as a domain, is both too large and too small. It allows us to provide negative numbers: what does it even mean to take <code class="language-plaintext highlighter-rouge">-3</code> elements from a list? 
As <code class="language-plaintext highlighter-rouge">Int</code> is a finite type, and <code class="language-plaintext highlighter-rouge">[a]</code> is infinite, we are restricted to only using this function with sufficiently small <code class="language-plaintext highlighter-rouge">Int</code>s. A closer fit would be <code class="language-plaintext highlighter-rouge">take :: Natural -> [a] -> [a]</code>. <code class="language-plaintext highlighter-rouge">Natural</code> allows any non-negative whole number, and perfectly expresses the reasonable domain. Expanding the domain isn’t desirable, as we might expect. <code class="language-plaintext highlighter-rouge">base</code> has functions with a range that is too large, as well. Let’s consider: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>length :: [a] -> Int </code></pre></div></div> This has many of the same problems as <code class="language-plaintext highlighter-rouge">take</code> – a list with too many elements will overflow the <code class="language-plaintext highlighter-rouge">Int</code>, and we won’t get the right answer. Additionally, we have a guarantee that we forget – a <code class="language-plaintext highlighter-rouge">length</code> for any container must be non-negative! We can more correctly express this type by restricting the output type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>length :: [a] -> Natural </code></pre></div></div> <h1 id="a-perfect-fit">A perfect fit</h1> The more precisely our types describe our program, the fewer ways we have to go wrong. Ideally, we can provide a correct output for every input, and we use a type that tightly describes the properties of possible outputs. Tue, 02 Oct 2018 00:00:00 +0000 https://www.parsonsmatt.org/2018/10/02/small_types.html https://www.parsonsmatt.org/2018/10/02/small_types.html ghcid for the win! 
<h2 id="supercharge-your-haskell-development-experience-with-ghcid">Supercharge your Haskell development experience with ghcid!</h2> <a href="https://github.com/ndmitchell/ghcid"><code class="language-plaintext highlighter-rouge">ghcid</code></a> is – at the current moment – the most important tool for Haskell development environments. It is fast, reliable, works on all kinds of projects, and is remarkably versatile. You can use it with any editor workflow, primarily by not integrating your editor! (though there are integrations available if you’re brave) For these reasons, whenever someone asks about a Haskell IDE, I tell them to ignore the siren song of <code class="language-plaintext highlighter-rouge">ghc-mod</code>, <code class="language-plaintext highlighter-rouge">hdevtools</code>, <code class="language-plaintext highlighter-rouge">intero</code>, <code class="language-plaintext highlighter-rouge">haskell-ide-engine</code>, etc<a href="#fn:1" class="footnote" rel="footnote">1</a>, and just stick with the old faithful GHCi and <code class="language-plaintext highlighter-rouge">ghcid</code>. Use whatever editor you want – make sure it has syntax highlighting, and open up GHCi and/or <code class="language-plaintext highlighter-rouge">ghcid</code> in a separate terminal. Here are some things we’re going to do with it in this post: <ul> <li>Basic warning/error reporting</li> <li>Fake type-of-expression support</li> <li>Reload your web app on every edit</li> <li>Run your test suite on every edit</li> </ul> As I think of additional “tricks” with <code class="language-plaintext highlighter-rouge">ghcid</code>, I will be updating this post and adding them here. If you have a suggestion or question, please open an issue on my blog’s GitHub :) <h1 id="basic-warnings-and-errors">Basic warnings and errors</h1> This is the bread and butter of what <code class="language-plaintext highlighter-rouge">ghcid</code> is good for. 
At this point, you’re probably used to running <code class="language-plaintext highlighter-rouge">ghci</code> and doing <code class="language-plaintext highlighter-rouge">:reload</code> to see whether or not your code compiles. <code class="language-plaintext highlighter-rouge">ghci</code> has some advantages over a <code class="language-plaintext highlighter-rouge">cabal new-build</code> or <code class="language-plaintext highlighter-rouge">stack build</code> or similar – it loads everything in interpreted byte code by default, which is much faster, and is capable of very intelligent module reloading to minimize work. This can cut the feedback time from compilation dramatically. By default, <code class="language-plaintext highlighter-rouge">ghcid</code> will load with the flag <code class="language-plaintext highlighter-rouge">-fno-code</code> enabled. This turns off all code generation, and basically only gives you syntax and type checking. When you eventually customize your <code class="language-plaintext highlighter-rouge">ghcid</code> command, you will want to remember to either enable <code class="language-plaintext highlighter-rouge">-fno-code</code> or <code class="language-plaintext highlighter-rouge">-fobject-code</code>. You need <code class="language-plaintext highlighter-rouge">-fobject-code</code> in order to do stuff like run tests, check Template Haskell expressions, etc. 
To customize your <code class="language-plaintext highlighter-rouge">ghcid</code> command, you do this: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ghcid --command "the command to start ghci" # example, for a Template Haskell heavy project: $ ghcid --command "stack ghci package:lib --ghci-options=-fobject-code" # example, to pick a single executable target: $ ghcid --command "stack ghci package:exe:main-node" # example, to defer type errors: $ ghcid --command "stack ghci --ghci-options=-fdefer-type-errors" </code></pre></div></div> At IOHK, I wrote up a <code class="language-plaintext highlighter-rouge">Makefile</code> with the common <code class="language-plaintext highlighter-rouge">ghcid</code> commands I use when working on the new wallet. <a href="https://github.com/input-output-hk/cardano-sl/blob/755b3b8fa7981982e9355052be69432952dee528/wallet/Makefile">This command</a> lets me say <code class="language-plaintext highlighter-rouge">make ghcid</code> in the <code class="language-plaintext highlighter-rouge">wallet-new</code> subdirectory and get lightning fast reloading of code, display of all warnings and errors, and lets me run through refactorings quite nicely and quickly. <h1 id="fake-type-of-expression-support">Fake type-of-expression support</h1> Sometimes GHC feels like a reluctant wizard. It knows things. You know it knows things. It knows that you know that it knows things. But it doesn’t want to tell you! A common question that IDE authors want to ask is “What’s the type of this expression?” <code class="language-plaintext highlighter-rouge">ghc-mod</code>, <code class="language-plaintext highlighter-rouge">intero</code>, all try to support this, to varying degrees of success and performance. But GHC is curious and easily distracted, and would much rather tell you that you’re wrong than answer a question. So let’s trick the wizard! 
Just today, I was working on this snippet of code, pulled from my <a href="https://github.com/parsonsmatt/servant-persistent"><code class="language-plaintext highlighter-rouge">servant-persistent</code></a> starter pack/example project: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main :: IO () main = do env <- lookupSetting "ENV" Development port <- lookupSetting "PORT" 8081 logEnv <- defaultLogEnv pool <- makePool env logEnv store <- serverMetricStore <$> forkServer "localhost" 8000 waiMetrics <- registerWaiMetrics store metr <- M.initializeWith store let cfg = Config { configPool = pool , configEnv = env , configMetrics = metr , configLogEnv = logEnv } logger = setLogger env runSqlPool doMigrations pool generateJavaScript run port $ logger $ metrics waiMetrics $ app cfg </code></pre></div></div> I wanted to know what the type of the <code class="language-plaintext highlighter-rouge">port</code> variable was. With a more sophisticated toolchain, I might hover over <code class="language-plaintext highlighter-rouge">port</code>, and get a tooltip telling me what. But, we’re using the more primitive <code class="language-plaintext highlighter-rouge">ghcid</code>. Well, we know what it isn’t – It’s not <code class="language-plaintext highlighter-rouge">()</code>. 
So, in the olden tradition, let’s loudly be wrong and await correction: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> run (port :: ()) $ logger $ metrics waiMetrics $ app cfg </code></pre></div></div> We fire up <code class="language-plaintext highlighter-rouge">ghcid</code>, making sure to include the executable package target: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ghcid --command "stack ghci servant-persistent:exe:perservant" </code></pre></div></div> And we’re greeted with an error message: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home/matt/Projects/servant-persistent/app/Main.hs:37:10: error: • Couldn't match type ‘()’ with ‘Int’ Expected type: warp-3.2.22:Network.Wai.Handler.Warp.Types.Port Actual type: () • In the first argument of ‘run’, namely ‘(port :: ())’ In the expression: run (port :: ()) In a stmt of a 'do' block: run (port :: ()) $ logger $ metrics waiMetrics $ app cfg | 37 | run (port :: ()) $ logger $ metrics waiMetrics $ app cfg | ^^^^^^^^^^ </code></pre></div></div> Ah, GHC expects it to be of type <code class="language-plaintext highlighter-rouge">Int</code>. There we go! This works well with functions, too. 
Let’s say we want to know the type of <code class="language-plaintext highlighter-rouge">run</code>, instead: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> (run :: ()) port $ logger $ metrics waiMetrics $ app cfg </code></pre></div></div> <code class="language-plaintext highlighter-rouge">ghcid</code> is happy to tell us how wrong we are: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home/matt/Projects/servant-persistent/app/Main.hs:37:5: error: • Couldn't match expected type ‘Integer -> Network.Wai.Application -> IO ()’ with actual type ‘()’ • The function ‘run :: ()’ is applied to one argument, but its type ‘()’ has none In the expression: (run :: ()) port In a stmt of a 'do' block: (run :: ()) port $ logger $ metrics waiMetrics $ app cfg | 37 | (run :: ()) port $ logger $ metrics waiMetrics $ app cfg | ^^^^^^^^^^^^^^^^ /home/matt/Projects/servant-persistent/app/Main.hs:37:6: error: • Couldn't match expected type ‘()’ with actual type ‘warp-3.2.22:Network.Wai.Handler.Warp.Types.Port -> Network.Wai.Application -> IO ()’ • Probable cause: ‘run’ is applied to too few arguments In the expression: run :: () In the expression: (run :: ()) port In a stmt of a 'do' block: (run :: ()) port $ logger $ metrics waiMetrics $ app cfg | 37 | (run :: ()) port $ logger $ metrics waiMetrics $ app cfg | ^^^ </code></pre></div></div> Note that we get a slightly inconsistent message. We’ve asserted that <code class="language-plaintext highlighter-rouge">run :: ()</code>, and it has two expected types: one from definition, and one from inferred use. The inferred type is <code class="language-plaintext highlighter-rouge">Integer -> Application -> IO ()</code>. The defined type is <code class="language-plaintext highlighter-rouge">Port -> Application -> IO ()</code>. 
It infers <code class="language-plaintext highlighter-rouge">Integer</code> because, without <code class="language-plaintext highlighter-rouge">run</code> forcing <code class="language-plaintext highlighter-rouge">port</code> to be a <code class="language-plaintext highlighter-rouge">Port</code>, it has nothing else to tell it what to be, and therefore defaults to <code class="language-plaintext highlighter-rouge">Integer</code>. <h1 id="reload-your-web-app-on-every-edit">Reload your web app on every edit</h1> <code class="language-plaintext highlighter-rouge">ghcid</code>, in addition to the <code class="language-plaintext highlighter-rouge">--command</code> flag, also takes a <code class="language-plaintext highlighter-rouge">--test</code> flag. The flag name is somewhat too specific – upon a successful compile with no warnings or errors, it will issue that command to GHCi for you. It was initially intended for running tests, but we can do anything we like with it – and we are going to use it to get our web application reloading lightning fast. <a href="https://github.com/parsonsmatt/servant-persistent/pull/33">This PR on the <code class="language-plaintext highlighter-rouge">servant-persistent</code> project</a> includes the necessary changes to get this running. I have left a self-review on the PR, so I won’t explain too much here. The <code class="language-plaintext highlighter-rouge">ghcid</code> command we use is: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ghcid \ --command "stack ghci servant-persistent" \ --test "DevelMain.update" </code></pre></div></div> This calls the <code class="language-plaintext highlighter-rouge">DevelMain.update</code> function on every successful compile. The <code class="language-plaintext highlighter-rouge">DevelMain</code> module was mostly copied from the Yesod scaffold, with a few updates to make it work with this repo. 
In truth, all you need to provide is a development-oriented <code class="language-plaintext highlighter-rouge">IO Application</code> action that boots your application and gives you the WAI value. The <code class="language-plaintext highlighter-rouge">DevelMain</code> code uses the <code class="language-plaintext highlighter-rouge">foreign-store</code> library to persist the state across GHCi reloads. The <code class="language-plaintext highlighter-rouge">ghcid</code> README links to <a href="https://binarin.ru/post/auto-reload-threepenny-gui/">an article on threepenny-gui apps</a> with a similar strategy. This is really fast – because GHCi can reload only exactly what it needs, and doesn’t have to link anything, you get to see your changes almost immediately. <h1 id="run-your-test-suite-on-every-edit">Run your test suite on every edit</h1> What’s better than knowing your project compiles? Knowing that it passes the test suite! We’ll use the <code class="language-plaintext highlighter-rouge">--test</code> flag again here, this time to actually run the tests. In the <a href="https://github.com/input-output-hk/cardano-sl/"><code class="language-plaintext highlighter-rouge">cardano-sl</code></a> repository, I put a <a href="https://github.com/input-output-hk/cardano-sl/blob/develop/wallet-new/Makefile#L8-L11"><code class="language-plaintext highlighter-rouge">Makefile</code> command for test running</a>: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ghcid-test: ## Have ghcid run the test suite for the wallet-new-specs on successful recompile ghcid \ --command "stack ghci cardano-sl-wallet-new:lib cardano-sl-wallet-new:test:wallet-new-specs --ghci-options=-fobject-code" \ --test "main" </code></pre></div></div> There’s a tricky bit here: We have to tell <code class="language-plaintext highlighter-rouge">stack ghci</code> which package targets to load. 
I specify the library (<code class="language-plaintext highlighter-rouge">cardano-sl-wallet-new:lib</code>) so that it adds the library to the set of modules to watch for reloading. Then I specify the test-suite I want to run (<code class="language-plaintext highlighter-rouge">cardano-sl-wallet-new:test:wallet-new-specs</code>). Finally I use <code class="language-plaintext highlighter-rouge">--ghci-options=-fobject-code</code>, because this is fast, and I need to actually run the code (you get weird linker errors if you do <code class="language-plaintext highlighter-rouge">-fno-code</code> and try to run the nonexistent code). <h1 id="conclusion">Conclusion</h1> <code class="language-plaintext highlighter-rouge">ghcid</code> is awesome and everyone owes Neil Mitchell a beverage. If you have a suggested use case you want added, ping me on GitHub and I’ll credit you :) <div class="footnotes" role="doc-endnotes"> <ol> <li id="fn:1" role="doc-endnote"> These are great projects. But they are flaky, partially because GHC’s API is difficult to interface with, and partially because GHCi’s interactive features have some performance issues with larger code bases. For small projects and libraries, they often work great. For larger projects, or more varied environments, they show their pain. You can spend a lot of time fussing with the editor integration and waiting on some command to finish, or you can just develop habits that don’t need them (like <code class="language-plaintext highlighter-rouge">ghcid</code> in a separate terminal). I say this as the author of the <a href="https://github.com/parsonsmatt/intero-neovim"><code class="language-plaintext highlighter-rouge">intero-neovim</code></a> plugin. 
<a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a> </li> </ol> </div> Sat, 19 May 2018 00:00:00 +0000 https://www.parsonsmatt.org/2018/05/19/ghcid_for_the_win.html https://www.parsonsmatt.org/2018/05/19/ghcid_for_the_win.html Transforming Transformers There’s a kind fellow named <code class="language-plaintext highlighter-rouge">lunaris</code> on <a href="https://fpchat-invite.herokuapp.com/">the FPChat slack channel</a> that shares exceptionally good advice. Unfortunately, due to the ephemeral nature of Slack, a lot of this advice is lost to history. I’ve been pestering him to write up his advice in a blog so that it could be preserved. He hasn’t posted it yet, so I’m going to start posting his rants for him ;) <code class="language-plaintext highlighter-rouge">lunaris</code> works with a company called Habito, and they are currently hiring for a wide variety of roles. If this post appeals to you (and you live in London), then <a href="https://www.habito.com/careers">check out their job openings</a>! <hr /> <blockquote> @lunaris says… (with minor formatting edits) </blockquote> What I meant by obviating transformer stacks was perhaps specific to my (or what I think is my) use case. That is, you’re building a set of services, <code class="language-plaintext highlighter-rouge">MonadAccounts m</code> (<code class="language-plaintext highlighter-rouge">createAccount :: Email -> Password -> m Account</code>), etc. You can do them as dictionaries or type classes. If you go down the latter (which I think is worth it because eventually the hassle of passing those dictionaries becomes a mite too great for my liking), you probably want to build the services modularly. So you whip out some transformers <code class="language-plaintext highlighter-rouge">AccountT</code>, <code class="language-plaintext highlighter-rouge">ProfileT</code>, <code class="language-plaintext highlighter-rouge">ApplicationsT</code>, etc. 
And you instantiate a big stack <code class="language-plaintext highlighter-rouge">App = AccountT (ProfileT ..</code> in your main. Where it’s something like, for each transformer: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype AccountT m a = AccountT (m a)

instance
  ( MonadReader r m
  , HasSomeAccountConfiguration r
  ) => MonadAccounts (AccountT m) where
  ...
</code></pre></div></div> Or some such. And at the bottom of your <code class="language-plaintext highlighter-rouge">App</code> is <code class="language-plaintext highlighter-rouge">ReaderT GlobalConfig IO</code> such that <code class="language-plaintext highlighter-rouge">HasSomeAccountConfiguration GlobalConfig</code> is an instance that tells you where to get the things needed to configure your account service. This is all fine, except you also have to write the passthrough instances for <code class="language-plaintext highlighter-rouge">MonadReader</code> for all your services. And of course any other things you might want to pass through (e.g. <code class="language-plaintext highlighter-rouge">MonadPostgreSQL</code>, <code class="language-plaintext highlighter-rouge">MonadHTTP</code> – “effect”-like things). We previously “solved” the passthrough problem using something like <code class="language-plaintext highlighter-rouge">monad-classes</code> in Haskell, which uses a load of type hackery to avoid the squared-instances problem. But it comes with lots of costs and we ended up abandoning it. There are other games you can play around it. 
But what we’ve ended up pursuing instead is taking the functions you’d normally write: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- api/ class Monad m => MonadAccounts m where createAccount :: Email -> Password -> m Account -- impl/ createAccountImpl :: (MonadReader r m, HasAccountConfig r) => Email -> Password -> m Account </code></pre></div></div> And instead of then also having <code class="language-plaintext highlighter-rouge">impl</code> define and export <code class="language-plaintext highlighter-rouge">AccountT</code> with an instance such that <code class="language-plaintext highlighter-rouge">createAccount = createAccountImpl</code>, just export <code class="language-plaintext highlighter-rouge">createAccountImpl</code>. Then in main, do: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype App a = App (ReaderT GlobalConfig IO a) instance MonadAccounts App where createAccount = createAccountImpl </code></pre></div></div> This has a different set of trade-offs. For one, you no longer have a stack of binds to wade through or lift. Things like <code class="language-plaintext highlighter-rouge">HasAccountConfig</code> you can automatically instantiate using generics too. The last tradeoff is that you can’t derive these mechanical instances. Moreover, because you can’t derive them, you can’t enforce that people will write them correctly. E.g. if your class has methods M1, M2 and you export M1Impl, M2Impl, nothing stops someone from using M1Impl but ignoring M2Impl, which may violate any laws your class’ implementation would otherwise fulfill. However. If you have <a href="https://github.com/Icelandjack/deriving-via"><code class="language-plaintext highlighter-rouge">deriving via</code></a> (and sorry, the flood is nearly over). 
You can have <code class="language-plaintext highlighter-rouge">impl</code> define and export: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype AccountT m a = AccountT (m a) instance ( MonadReader r m , HasAccountConfig r ) => MonadAccounts (AccountT m) where createAccount = createAccountImpl </code></pre></div></div> And not export the method implementations (as before). Now, in main, you just write: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype App a = App (ReaderT GlobalConfig IO a) deriving MonadAccounts via AccountT </code></pre></div></div> Or something similar. And get the instances you want, without the transformer stack. Of course, you still want things like <code class="language-plaintext highlighter-rouge">MaybeT</code> and the like for their use in composing effects, even in MTL-like code blocks. But assuming this works, that feels to me like how I’d want to do application effects from then on. Still mulling it over though. <hr /> Big thanks to <code class="language-plaintext highlighter-rouge">@lunaris</code> for letting me post this. Tue, 10 Apr 2018 00:00:00 +0000 https://www.parsonsmatt.org/2018/04/10/transforming_transformers.html https://www.parsonsmatt.org/2018/04/10/transforming_transformers.html Stealing Where from Rust I normally write about Haskell, but I’m also super excited about Rust! I’m pretty new to it, though, so this post might have some incorrectness. One convenient feature that Rust has is a type-level <code class="language-plaintext highlighter-rouge">where</code>. It’s used to provide trait bounds (equivalent to Haskell’s type class constraints) to type variables. 
Here’s a type signature from <code class="language-plaintext highlighter-rouge">nom</code>, a parser combinator library (<a href="https://github.com/Geal/nom/blob/b478db511696ddbc7ff6ee6a012dd50bc3a6789e/src/character.rs#L141-L143">github link</a>): <div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pub fn anychar<T>(input: T) -> IResult<T, char> where T: InputIter + InputLength + Slice<RangeFrom<usize>> + AtEof, <T as InputIter>::Item: AsChar, </code></pre></div></div> We’re declaring a public function named <code class="language-plaintext highlighter-rouge">anychar</code> that is parameterized by some type <code class="language-plaintext highlighter-rouge">T</code>. It accepts a single argument, named <code class="language-plaintext highlighter-rouge">input</code>, that has the type <code class="language-plaintext highlighter-rouge">T</code>. It returns a value of type <code class="language-plaintext highlighter-rouge">IResult<T, char></code>. The <code class="language-plaintext highlighter-rouge">where</code> bit defines the trait bounds on the <code class="language-plaintext highlighter-rouge">T</code> type – we say that <code class="language-plaintext highlighter-rouge">T</code> must satisfy the <code class="language-plaintext highlighter-rouge">InputIter</code>, <code class="language-plaintext highlighter-rouge">InputLength</code>, <code class="language-plaintext highlighter-rouge">Slice</code>, etc. traits. The <code class="language-plaintext highlighter-rouge">::</code> refers to an associated type, which is essentially the same thing as an associated type on a type class in Haskell. So <code class="language-plaintext highlighter-rouge"><T as InputIter>::Item</code> refers to the <code class="language-plaintext highlighter-rouge">Item</code> type for the implementation of <code class="language-plaintext highlighter-rouge">T</code>’s <code class="language-plaintext highlighter-rouge">InputIter</code> trait. 
Let’s write this in Haskell: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>anychar :: forall t. ( InputIter t, InputLength t , Slice (RangeFrom Int) t, AtEof t , AsChar (InputIterItem t) ) => t -> Result t Char </code></pre></div></div> Haskell has implicit quantification by default. That is, you don’t have to introduce type variables – you can just use them where you want. The <code class="language-plaintext highlighter-rouge">forall</code> keyword introduces type variables to a type. So the syntax <code class="language-plaintext highlighter-rouge">forall t.</code> is analogous to the <code class="language-plaintext highlighter-rouge"><T></code> syntax in Rust. Haskell’s type classes take type variables as arguments and turn them into <code class="language-plaintext highlighter-rouge">Constraint</code>s. So, instead of Rust’s <code class="language-plaintext highlighter-rouge">t : InputIter</code> (“<code class="language-plaintext highlighter-rouge">t</code> satisfies the trait bound <code class="language-plaintext highlighter-rouge">InputIter</code>”), we say <code class="language-plaintext highlighter-rouge">InputIter t</code> (“given an instance of <code class="language-plaintext highlighter-rouge">InputIter</code> for <code class="language-plaintext highlighter-rouge">t</code>”). Can we get this into Haskell, as well? Yes, with some type families! 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type family Where a cs :: Constraint where
  Where _ '[] = ()
  Where a (c ': cs) = (c a, Where a cs)
</code></pre></div></div> Here, we define a type family <code class="language-plaintext highlighter-rouge">Where</code>, which takes two parameters: a type <code class="language-plaintext highlighter-rouge">a</code> of kind <code class="language-plaintext highlighter-rouge">k</code>, and a type <code class="language-plaintext highlighter-rouge">cs</code> of kind <code class="language-plaintext highlighter-rouge">[k -> Constraint]</code>, and it turns them into a single <code class="language-plaintext highlighter-rouge">Constraint</code>. For the empty list, we have no constraints to add – therefore, we use the empty constraint, <code class="language-plaintext highlighter-rouge">()</code>. If we have a constraint, then we add that constraint <code class="language-plaintext highlighter-rouge">c a</code> and do constraint union with the result of <code class="language-plaintext highlighter-rouge">Where a cs</code>. Here’s how it looks: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>anychar
  :: forall t.
  ( Where t
    [ InputIter, InputLength
    , Slice (RangeFrom Int), AtEof
    ]
  , AsChar (InputIterItem t)
  )
  => t -> Result t Char
</code></pre></div></div> Nice. It might be good to remember the story “If you give a mouse a cookie…” <h1 id="if-you-give-a-haskeller-a-syntax">If you give a Haskeller a syntax,</h1> <h2 id="theyll-just-want-more">they’ll just want more!</h2> If we can have <code class="language-plaintext highlighter-rouge">Where</code>, that makes me want <code class="language-plaintext highlighter-rouge">Let</code>. It would be really nice to be able to provide shorthands for commonly repeated types in a function. 
Consider this signature: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>doThings :: MaybeT (ExceptT MyError IO) Int -> MaybeT (ExceptT MyError IO) Char -> MaybeT (ExceptT MyError IO) (Int, Char) doThings mi mc = do i <- mi c <- mc pure (i, c) </code></pre></div></div> Look at all of that repetition. We can factor it out into a top level definition: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type ThingDoing = MaybeT (ExceptT MyError IO) doThings :: ThingDoing Int -> ThingDoing Char -> ThingDoing (Int, Char) </code></pre></div></div> But that can clutter the namespace. We want something local, for the same reason we want <code class="language-plaintext highlighter-rouge">let</code> and <code class="language-plaintext highlighter-rouge">where</code> in the value language. What do we have to do to make the following code work? <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>doThings :: Let m (MaybeT (ExceptT MyError IO)) => m Int -> m Char -> m (Int, Char) </code></pre></div></div> Well, it’s easier than you might think: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Let = (~) </code></pre></div></div> That squiggle <code class="language-plaintext highlighter-rouge">(~)</code> is a type equality constraint. We’re saying that <code class="language-plaintext highlighter-rouge">m</code> and <code class="language-plaintext highlighter-rouge">MaybeT (ExceptT MyError IO)</code> must be equal. So that lets us use <code class="language-plaintext highlighter-rouge">m</code> where we might use the longer, explicit type. This trick is great for reducing duplication in type signatures. Thanks, Rust, for inspiring me to want this. 
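To see both tricks working together, here is a small self-contained sketch. The definitions of <code class="language-plaintext highlighter-rouge">Where</code> and <code class="language-plaintext highlighter-rouge">Let</code> are the ones from this post; everything else (<code class="language-plaintext highlighter-rouge">describeMax</code>, <code class="language-plaintext highlighter-rouge">pairUp</code>, and the use of <code class="language-plaintext highlighter-rouge">Show</code>, <code class="language-plaintext highlighter-rouge">Ord</code>, and <code class="language-plaintext highlighter-rouge">Either String</code>) is made up for illustration and stands in for the <code class="language-plaintext highlighter-rouge">nom</code>-style classes. I'm assuming a reasonably recent GHC; the extension list is what I'd expect to need, and <code class="language-plaintext highlighter-rouge">PolyKinds</code> is there so that <code class="language-plaintext highlighter-rouge">Let</code> can be used at kinds other than <code class="language-plaintext highlighter-rouge">Type</code>.

```haskell
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Constraint)

-- | Apply every class in the list to the same type and
-- collect the results into one constraint.
type family Where (a :: k) (cs :: [k -> Constraint]) :: Constraint where
  Where _ '[]       = ()
  Where a (c ': cs) = (c a, Where a cs)

-- | A "local type alias", spelled as an equality constraint.
type Let = (~)

-- Hypothetical example: 'Show' and 'Ord' stand in for the
-- parser classes used earlier in the post.
describeMax :: Where a '[Show, Ord] => a -> a -> String
describeMax x y = show (max x y)

-- Hypothetical example: 'm' is shorthand for 'Either String'.
-- The Monad instance is found through the equality constraint.
pairUp :: Let m (Either String) => m Int -> m Char -> m (Int, Char)
pairUp mi mc = do
  i <- mi
  c <- mc
  pure (i, c)
```

One thing worth noticing: <code class="language-plaintext highlighter-rouge">Let</code> does not abstract over anything. Callers still see <code class="language-plaintext highlighter-rouge">pairUp</code> at exactly one monad; the constraint only saves typing in the signature.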
Mon, 26 Mar 2018 00:00:00 +0000 https://www.parsonsmatt.org/2018/03/26/stealing_where_from_rust.html https://www.parsonsmatt.org/2018/03/26/stealing_where_from_rust.html Three Layer Haskell Cake The question of “How do I design my application in Haskell?” comes up a lot. There’s a bunch of perspectives and choices, so it makes sense that it’s difficult to choose just one. Do I use plain monad transformers, <code class="language-plaintext highlighter-rouge">mtl</code>, just pass the parameters manually and use <code class="language-plaintext highlighter-rouge">IO</code> for everything, the <a href="https://www.fpcomplete.com/blog/2017/06/readert-design-pattern"><code class="language-plaintext highlighter-rouge">ReaderT</code> design pattern</a>, <a href="https://www.parsonsmatt.org/2017/09/22/what_does_free_buy_us.html">free monads</a>, freer monads, some other kind of algebraic effect system?! The answer is: why not both/all? Each approach has pros and cons. Instead of sticking with one technique for everything, let’s instead leverage all of the techniques where they shine. Lately, I’ve been centering on an application design architecture with roughly three layers. <h1 id="layer-1">Layer 1:</h1> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype AppT m a = AppT { unAppT :: ReaderT YourStuff m a } deriving (Functor, Applicative, Monad, etc) </code></pre></div></div> The <a href="https://www.fpcomplete.com/blog/2017/06/readert-design-pattern"><code class="language-plaintext highlighter-rouge">ReaderT</code> Design Pattern</a>, essentially. This is what everything gets boiled down to, and what everything eventually gets interpreted in. This type is the backbone of your app. 
For some components, you carry around some info/state (consider <a href="https://hackage.haskell.org/package/monad-metrics"><code class="language-plaintext highlighter-rouge">MonadMetrics</code></a> or <a href="https://hackage.haskell.org/package/katip-0.5.2.0/docs/Katip.html"><code class="language-plaintext highlighter-rouge">katip</code>’s</a> logging state/data); for others, you can carry <a href="https://www.parsonsmatt.org/2016/07/14/rank_n_classy_limited_effects.html">an explicit effect interpreter</a>. This layer is for defining how the upper layers work, and for handling operational concerns like performance, concurrency, etc. At IOHK, we had a name for this kind of thing: a “capability”. We have a <a href="https://github.com/parsonsmatt/cardano-sl/blob/10e55bde9a5c0d9d28bca25950a8811407c5fc8c/docs/monads.md">big design doc on monads</a>, and the doc goes into what makes something a capability or not. IOHK has since deleted this design document and decided that it wasn’t good to follow. This layer sucks to test. So don’t. Shift all the business logic up into the next two layers as much as possible. You want this layer to be tiny. If you get a request to test something in this layer, don’t - factor the logic out, test that function, and call it in IO. How do you shift something out? I wrote a post on <a href="https://www.parsonsmatt.org/2017/07/27/inverted_mocking.html">Inverting your Mocks</a> that I believe covers it well, but the general routine is: <ul> <li>Factor the inputs of code out</li> <li>Represent the output of effects as data returned from pure functions</li> </ul> If I had to give a name to this layer, I’d call it the “orchestration” layer. All of the code has been composed, and now we’re arranging it for a real performance. <h1 id="layer-2">Layer 2</h1> This layer provides a bridge between the first and third layer. Here, we’re mostly interested in mocking out external services and dependencies. 
The most convenient way I’ve found to do this is <code class="language-plaintext highlighter-rouge">mtl</code> style classes, implemented in terms of domain resources or effects. This is a trivial example: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class MonadTime m where
  getCurrentTime :: m UTCTime
</code></pre></div></div> <code class="language-plaintext highlighter-rouge">MonadTime</code> is a class that I might use to “purify” an action that uses IO only for the current time. Doing so makes unit testing a time-based function easier. However – this isn’t a great use for the technique. The best “pure” instance of this is <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance MonadTime ((->) UTCTime) where
  getCurrentTime = id
</code></pre></div></div> And, if you’ve factored your effects out, this will already be done for you. Furthermore, it would actually be quite difficult to write a realistic <code class="language-plaintext highlighter-rouge">MonadTime</code> mock. One law we might like <code class="language-plaintext highlighter-rouge">getCurrentTime</code> to satisfy is that the following always returns <code class="language-plaintext highlighter-rouge">True</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>timeLessThan = do
  x <- getCurrentTime
  y <- getCurrentTime
  pure (x < y)
</code></pre></div></div> A pure implementation returning a constant time would fail this. We could have a <code class="language-plaintext highlighter-rouge">State</code> with a random generator and a <code class="language-plaintext highlighter-rouge">UTCTime</code> and add a random number of seconds for every call, but this wouldn’t really make testing any easier than just getting the actual time. Getting the current time is best kept as a Layer 1 concern - don’t bother mocking it. 
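If you want to see the constant-time failure concretely, here is a minimal sketch using the function instance from above. <code class="language-plaintext highlighter-rouge">frozenClock</code> and <code class="language-plaintext highlighter-rouge">frozenResult</code> are names I made up for illustration, and the date is arbitrary. It assumes the <code class="language-plaintext highlighter-rouge">time</code> package is available.

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Import only what we need, so our class method does not clash
-- with Data.Time's own getCurrentTime.
import Data.Time (UTCTime (..), fromGregorian)

class MonadTime m where
  getCurrentTime :: m UTCTime

-- The "pure" instance from the post: the function argument is the clock.
instance MonadTime ((->) UTCTime) where
  getCurrentTime = id

timeLessThan :: (Monad m, MonadTime m) => m Bool
timeLessThan = do
  x <- getCurrentTime
  y <- getCurrentTime
  pure (x < y)

-- A fixed instant, chosen arbitrarily for illustration.
frozenClock :: UTCTime
frozenClock = UTCTime (fromGregorian 2018 3 22) 0

-- Both reads see the same instant, so the strict comparison is False.
frozenResult :: Bool
frozenResult = timeLessThan frozenClock
```

The mock is trivially easy to write, but it cannot model the one property of time that tends to matter, which is that it moves.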
A more realistic example from a past codebase is this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class Monad m => MonadLock m where acquireLock :: NominalDiffTime -> Key -> m (Maybe Lock) renewLock :: NominalDiffTime -> Lock -> m (Maybe Lock) releaseLock :: Lock -> m () </code></pre></div></div> This class describes logic around implementing distributed locks. The production instance talked to a Redis datastore. Setting up redis for dev/test sounded annoying, so I implemented a testing mock that held an <code class="language-plaintext highlighter-rouge">IORef (Map ByteString ByteString)</code>. Another good class is a simplified DSL for working with your data. In OOP land, you’d call this your “Data Access Object.” It doesn’t try to contain a full SQL interpreter, it only represents a small set of queries/data that you need. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class (Monad m) => AcquireUser m where getUserBy :: UserQuery -> m [User] getUser :: UserId -> m (Maybe User) getUserWithDog :: UserId -> m (Maybe (User, Dog)) class AcquireUser m => UpdateUser m where deleteUser :: UserId -> m () insertUser :: User -> m () </code></pre></div></div> We can use this class to provide a mock database for testing, without having to write an entire SQL database mocking system. These classes also come in handy because you can swap out the underlying production implementations. Suppose you have a microservices system going on, and <code class="language-plaintext highlighter-rouge">AcquireUser</code> is done through an HTTP API. Suddenly, your boss is convinced that monoliths are king, and gives you One Large Server To Rule Them All. Now your HTTP API has direct database access to the underlying data – you can make SQL requests instead of HTTP! How wonderful. 
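Here is a rough sketch of what such a mock could look like, using a <code class="language-plaintext highlighter-rouge">State</code> monad over a couple of <code class="language-plaintext highlighter-rouge">Map</code>s. The two classes are the ones above; the domain types (<code class="language-plaintext highlighter-rouge">User</code>, <code class="language-plaintext highlighter-rouge">Dog</code>, <code class="language-plaintext highlighter-rouge">UserQuery</code>), the <code class="language-plaintext highlighter-rouge">MockDb</code> type, and <code class="language-plaintext highlighter-rouge">renameUser</code> are all hypothetical, since the real ones depend on your application. It assumes the <code class="language-plaintext highlighter-rouge">mtl</code> and <code class="language-plaintext highlighter-rouge">containers</code> packages.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Monad.State (State, evalState, gets, modify)
import qualified Data.Map.Strict as Map

-- Hypothetical domain types; the post does not define them.
newtype UserId = UserId Int deriving (Eq, Ord, Show)
data User = User { userId :: UserId, userName :: String } deriving (Eq, Show)
newtype Dog = Dog String deriving (Eq, Show)
newtype UserQuery = NameIs String

class Monad m => AcquireUser m where
  getUserBy      :: UserQuery -> m [User]
  getUser        :: UserId -> m (Maybe User)
  getUserWithDog :: UserId -> m (Maybe (User, Dog))

class AcquireUser m => UpdateUser m where
  deleteUser :: UserId -> m ()
  insertUser :: User -> m ()

-- The whole mock "database": users and dogs keyed by user id.
data Db = Db
  { dbUsers :: Map.Map UserId User
  , dbDogs  :: Map.Map UserId Dog
  }

newtype MockDb a = MockDb (State Db a)
  deriving (Functor, Applicative, Monad)

runMockDb :: Db -> MockDb a -> a
runMockDb db (MockDb act) = evalState act db

instance AcquireUser MockDb where
  getUserBy (NameIs n) =
    MockDb (gets (filter ((== n) . userName) . Map.elems . dbUsers))
  getUser uid =
    MockDb (gets (Map.lookup uid . dbUsers))
  getUserWithDog uid =
    MockDb (gets (\db -> (,) <$> Map.lookup uid (dbUsers db)
                             <*> Map.lookup uid (dbDogs db)))

instance UpdateUser MockDb where
  deleteUser uid =
    MockDb (modify (\db -> db { dbUsers = Map.delete uid (dbUsers db) }))
  insertUser u =
    MockDb (modify (\db -> db { dbUsers = Map.insert (userId u) u (dbUsers db) }))

-- Code under test is written only against the classes,
-- so it runs unchanged against the mock or the real thing.
renameUser :: UpdateUser m => UserId -> String -> m Bool
renameUser uid newName = do
  found <- getUser uid
  case found of
    Nothing -> pure False
    Just u  -> do
      insertUser u { userName = newName }
      pure True
```

The mock only has to answer the handful of questions the class asks, which is why it fits in a page rather than needing a SQL engine.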
These are higher level than <code class="language-plaintext highlighter-rouge">App</code> and delimit the effects you use, but are ultimately lower level than real business logic. You might see some <code class="language-plaintext highlighter-rouge">MonadIO</code> in this layer, but it should be avoided where possible. This layer should be expanded on an as-needed (or as-convenient) basis. As an example, implementing <code class="language-plaintext highlighter-rouge">MonadLock</code> as a class instead of directly in <code class="language-plaintext highlighter-rouge">AppT</code> was done because using Redis directly would require every development and test environment to have full Redis connection information. That is wasteful, so we avoid it. Implementing <code class="language-plaintext highlighter-rouge">AcquireModel</code> as a class allows you to omit database calls in testing, and if you’re real careful, you can isolate the database tests well. DO NOT try to implement <code class="language-plaintext highlighter-rouge">MonadRedis</code> or <code class="language-plaintext highlighter-rouge">MonadDatabase</code> or <code class="language-plaintext highlighter-rouge">MonadFilesystem</code> here. That is a fool’s errand. Instead, capture the tiny bits of your domain: <code class="language-plaintext highlighter-rouge">MonadLock</code>, <code class="language-plaintext highlighter-rouge">MonadModel</code>, or <code class="language-plaintext highlighter-rouge">MonadSpecificDataAcquisition</code>. The smaller your domain, the easier it is to write mocks and tests for it. You probably don’t want to try and write a SQL database, so don’t – capture the queries you need as methods on the class so they can easily be mocked. Alternatively, present a tiny query DSL that is easy to write an interpreter for. This layer excels at providing swappable implementations of external services. 
This technique is still quite heavy-weight: <code class="language-plaintext highlighter-rouge">mtl</code> classes require tons of <code class="language-plaintext highlighter-rouge">newtype</code>s and instance boilerplate. This layer should be as thin as possible, preferring to instead push stuff into Layer 3. <h1 id="layer-3">Layer 3:</h1> Business logic. This should be entirely pure, with no <code class="language-plaintext highlighter-rouge">IO</code> component at all. This should almost always just be pure functions and relatively simple data types. Reach for only as much power as you need – and you need much less than you think! All the effectful data should have been acquired beforehand, and all effectful post-processing should be handled afterwards. My post on <a href="https://www.parsonsmatt.org/2017/07/27/inverted_mocking.html">inverting your mocks</a> goes into detail on ways to handle this. If you need streaming, then you can implement “pure” <code class="language-plaintext highlighter-rouge">conduit</code> or <code class="language-plaintext highlighter-rouge">pipe</code>s with a type signature like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pureConduit :: Monad m => Conduit i m o </code></pre></div></div> This expresses no dependency on where the data comes from, nor on how the output is handled. We can easily run it with mock data, or put it in the real production pipeline. It abstracts over such concerns. As a result, it’s lots of fun to test. If the result of a computation needs to perform an effect, then it is useful to encode that effect as a datatype. Free monads are a technique to encode computation as data, but they’re very complicated; you can usually get away with a much simpler datatype to express the behavior you want. Oftentimes, a simple non-recursive sum type “command” suffices as an interface between a pure function and an effectful one. 
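As a sketch of what I mean by a “command” type, consider the following. The invoice domain, the <code class="language-plaintext highlighter-rouge">Command</code> constructors, and the thresholds are all invented for illustration; the point is only the shape: a pure function decides, and a tiny interpreter in Layer 1 performs.

```haskell
-- Hypothetical domain: deciding what to do about an overdue invoice.
data Invoice = Invoice
  { invoiceId   :: Int
  , daysOverdue :: Int
  , amountCents :: Int
  }

-- A non-recursive sum type describing effects to perform later.
data Command
  = SendReminder Int        -- invoice id
  | ChargeLateFee Int Int   -- invoice id, fee in cents
  | Escalate Int            -- invoice id
  deriving (Eq, Show)

-- Pure decision logic: trivial to unit test or QuickCheck.
-- The 30-day cutoff and 5% fee are arbitrary example values.
decide :: Invoice -> [Command]
decide inv
  | daysOverdue inv <= 0 = []
  | daysOverdue inv < 30 = [SendReminder (invoiceId inv)]
  | otherwise =
      [ ChargeLateFee (invoiceId inv) (amountCents inv `div` 20)
      , Escalate (invoiceId inv)
      ]

-- Effectful interpreter. A real one would send email or hit the
-- database; this stand-in just prints what it would do.
runCommand :: Command -> IO ()
runCommand cmd = case cmd of
  SendReminder i    -> putStrLn ("reminder for invoice " ++ show i)
  ChargeLateFee i c -> putStrLn ("late fee of " ++ show c ++ " cents on invoice " ++ show i)
  Escalate i        -> putStrLn ("escalating invoice " ++ show i)

processInvoice :: Invoice -> IO ()
processInvoice = mapM_ runCommand . decide
```

All of the interesting behavior lives in <code class="language-plaintext highlighter-rouge">decide</code>, and testing it is just comparing lists for equality.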
A list of commands adds a lot of flexibility without dramatically complicating things. Before you jump to monads, consider: would a monoidal means of constructing/reading this data work? If not, what about a Free Applicative? What about a limited recursive sum type, perhaps a GADT, that can express what I want to do? When you’re testing pure functions emitting data structures, it’s a dream. This is the Haskell we know and love. So try to put as much of your code into the pleasant, testable, QuickCheckable, type-verified bits as possible. If you manage to isolate your application like this, then you won’t need to test your IO stuff (aside from the usual integration testing). May all of your tests be pleasant and your software correct. <h3 id="addendum">Addendum</h3> When I initially wrote this blog post, it really bothered me that I couldn’t come up with a good name for layers 2 and 3. I published it anyway because it’s useful enough with just the numbers, and names have the potential to mislead. I’ve since realized that the layers already have names! <ol> <li>Imperative programming</li> <li>Object Oriented programming</li> <li>Functional programming</li> </ol> I’ve provided a new explanation of the <a href="https://www.destroyallsoftware.com/talks/boundaries">“functional core, imperative shell”</a> model of programming! <h3 id="examples">Examples</h3> I get folks asking me for examples fairly regularly. Unfortunately, I haven’t had time to write an OSS app using this technique. Fortunately, other folks have! 
<ul> <li><a href="https://github.com/Holmusk/three-layer">Holmusk/three-layer</a></li> <li><a href="https://github.com/thomashoneyman/purescript-halogen-realworld">thomashoneyman/purescript-halogen-realworld</a></li> <li><a href="https://github.com/incoherentsoftware/defect-process">defect-process</a> is a 62kloc Haskell video game project</li> </ul> Thu, 22 Mar 2018 00:00:00 +0000 https://www.parsonsmatt.org/2018/03/22/three_layer_haskell_cake.html Servant Route Smooshing Haskell’s <code class="language-plaintext highlighter-rouge">servant</code> library needs only a modest introduction – it’s an attempt to stuff the description of an API into the type system. Using compile-time type-level programming, we’re able to get a number of benefits: <ul> <li>The server implements the type faithfully</li> <li>You can derive clients automagically</li> <li>You can get a Swagger specification automagically</li> <li>You can get lots of testing facilities for free</li> </ul> This is all really neat! However, because Haskell’s <a href="https://www.haskellforall.com/2012/05/scrap-your-type-classes.html">type system isn’t as pleasant as the value-system</a>, this can get gnarly. Servant has a very happy path – but that path is very narrow. Let’s look at an API type that roughly describes a game: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Api
    = "player" :> Capture "playerId" Int :> "x" :> Get '[JSON] Int
    :<|> "player" :> Capture "playerId" Int :> "y" :> Get '[JSON] Int
</code></pre></div></div> This <code class="language-plaintext highlighter-rouge">Api</code> type describes two routes: either <code class="language-plaintext highlighter-rouge">player/:playerId/x</code> or <code class="language-plaintext highlighter-rouge">player/:playerId/y</code>, returning the coordinates of the given player. All by itself, it doesn’t do a whole lot.
However, we can write a <code class="language-plaintext highlighter-rouge">Server</code> and a <code class="language-plaintext highlighter-rouge">Client</code> for it: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apiServer :: Server Api apiServer = serveX :<|> serveY where serveX :: Int -> Handler Int serveX playerId = return 42 serveY :: Int -> Handler Int serveY playerId = return 24 getX :: Int -> ClientM Int getY :: Int -> ClientM Int getX :<|> getY = client (Proxy :: Proxy Api) </code></pre></div></div> Now this – this is really cool! Our client is free, and our server is type-checked. <h1 id="the-drying">The DRYing</h1> Now, the anti-repetition DRY part of your brain is going to see that route and want to factor parts of it out. After all, the <code class="language-plaintext highlighter-rouge">"player" :> Capture "playerId" Int</code> part is repeated. Servant has no problem with factoring it out: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Api' = "player" :> Capture "playerId" Int :> ( "y" :> Get '[JSON] Int :<|> "x" :> Get '[JSON] Int ) </code></pre></div></div> The repetition is gone! Very cool. Unfortunately… this complicates the types of the <code class="language-plaintext highlighter-rouge">Server</code> and <code class="language-plaintext highlighter-rouge">Client</code>. 
Let’s reuse the old implementation for the server and see what happens: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apiServer' :: Server Api' apiServer' = serveX :<|> serveY where serveX playerId = return 42 serveY playerId = return 42 </code></pre></div></div> We get an error, reproduced below: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home/matt/Projects/servant-smoosh/src/Lib.hs:41:14: error: • Couldn't match type ‘(p0 -> m0 Integer) :<|> (p1 -> m1 Integer)’ with ‘Int -> Handler Int :<|> Handler Int’ Expected type: Server Api' Actual type: (p0 -> m0 Integer) :<|> (p1 -> m1 Integer) • In the expression: serveX :<|> serveY In an equation for ‘apiServer'’: apiServer' = serveX :<|> serveY where serveX playerId = return 42 serveY playerId = return 42 | 41 | apiServer' = serveX :<|> serveY | ^^^^^^^^^^^^^^^^^^ </code></pre></div></div> What’s going on here? <h1 id="-does-not-distribute"><code class="language-plaintext highlighter-rouge">:<|></code> does not distribute</h1> The two API types we provide above end up describing the exact same API structure. However, the structure of the type is different, and the operation does not in fact distribute. What’s it mean for something to distribute? Let’s look at addition and multiplication, a very simple form of distribution. If we have $(x \times y) + (x \times z)$, we can factor out the multiplication of $x$. That gives us $x \times (y + z)$, an expression that is exactly equal. Ideally, we could factor out parameters in our servant API types, and they would “distribute” those parameters to all subroutes. So, let’s fix that initial type error, and we’ll dig into some simplified implementation details of Servant to figure out why. 
Instead of having a <code class="language-plaintext highlighter-rouge">Server</code> that contains two handlers, each a function from the captured <code class="language-plaintext highlighter-rouge">Int</code> to the return type, we have a function from an <code class="language-plaintext highlighter-rouge">Int</code> to a server of two handlers. We can inspect this in GHCi with the <code class="language-plaintext highlighter-rouge">:kind!</code> command: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>>>> :kind! Server Api Server Api :: * = (Int -> Handler Int) :<|> (Int -> Handler Int) >>> :kind! Server Api' Server Api' :: * = Int -> Handler Int :<|> Handler Int </code></pre></div></div> <code class="language-plaintext highlighter-rouge">:kind!</code> takes a type and tries to normalize it fully – this means applying all available type synonyms and computing all type families. We have factored out the parameter, and this has complicated our server type. Here’s the new server for it: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apiServer' :: Server Api' apiServer' playerId = serveX :<|> serveY where serveX = return 42 serveY = return 42 </code></pre></div></div> <code class="language-plaintext highlighter-rouge">apiServer'</code> accepts the <code class="language-plaintext highlighter-rouge">playerId</code> parameter from the capture. <code class="language-plaintext highlighter-rouge">serveX</code> and <code class="language-plaintext highlighter-rouge">serveY</code> are mere <code class="language-plaintext highlighter-rouge">Handler Int</code>s, now. They have access to <code class="language-plaintext highlighter-rouge">playerId</code> because it’s in scope, but if you factor those functions out into top-level definitions, you’d need to pass it explicitly. 
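For illustration, here is roughly what passing the parameter explicitly looks like once the handlers become top-level definitions. To keep the sketch compilable with <code class="language-plaintext highlighter-rouge">base</code> alone, <code class="language-plaintext highlighter-rouge">:<|></code> and <code class="language-plaintext highlighter-rouge">Handler</code> below are local stand-ins rather than Servant’s real types, and the handler bodies are placeholders.

```haskell
{-# LANGUAGE TypeOperators #-}

-- Stand-ins for servant's types (assumptions made for this sketch only).
data a :<|> b = a :<|> b
infixr 3 :<|>

type Handler = IO

-- Top-level handlers can no longer see playerId through a where clause,
-- so each one takes it as an ordinary argument.
serveX :: Int -> Handler Int
serveX playerId = return (playerId * 2)   -- placeholder logic

serveY :: Int -> Handler Int
serveY playerId = return (playerId * 3)   -- placeholder logic

-- Same shape as the normalized Server Api': one function from the
-- captured Int to a pair of handlers.
apiServer' :: Int -> (Handler Int :<|> Handler Int)
apiServer' playerId = serveX playerId :<|> serveY playerId
```

The capture is received once by <code class="language-plaintext highlighter-rouge">apiServer'</code> and then threaded by hand into every handler, which is the extra plumbing the nested API type asks for.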
<h1 id="what-about-the-client">What about the client?</h1> We can use <code class="language-plaintext highlighter-rouge">:kind!</code> with the <code class="language-plaintext highlighter-rouge">Client</code> type as well: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>>>> :kind! Client Api Client Api :: * = (Int -> ClientM Int) :<|> (Int -> ClientM Int) >>> :kind Client Api' Client Api' :: * </code></pre></div></div> Huh, that’s – weird. <code class="language-plaintext highlighter-rouge">Client Api'</code> has the kind <code class="language-plaintext highlighter-rouge">*</code> – it’s an ordinary value. Let’s instead look at the type of the derived client, using the <code class="language-plaintext highlighter-rouge">client</code> function: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>>>> :t client (Proxy :: Proxy Api) client (Proxy :: Proxy Api) :: (Int -> ClientM Int) :<|> (Int -> ClientM Int) >>> :t client (Proxy :: Proxy Api') client (Proxy :: Proxy Api') :: Int -> ClientM Int :<|> ClientM Int </code></pre></div></div> Ah! So <code class="language-plaintext highlighter-rouge">client</code> with the <code class="language-plaintext highlighter-rouge">Api'</code> type does not give us a pair of client functions, but rather, a function that returns a pair of clients. This makes derivation much less pleasant. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkClient :: Int -> ClientM Int :<|> ClientM Int mkClient = client (Proxy :: Proxy Api') getX :: Int -> ClientM Int getX i = getX' where (getX' :<|> _) = mkClient i getY :: Int -> ClientM Int getY i = getY' where (_ :<|> getY') = mkClient i </code></pre></div></div> This gets dramatically worse as the number of parameters goes up, and as the level of nesting increases. 
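To give a feel for how the unpacking grows, here is a sketch with two levels of nesting. Everything in it is hypothetical: <code class="language-plaintext highlighter-rouge">:<|></code> and <code class="language-plaintext highlighter-rouge">ClientM</code> are local stand-ins so that the code compiles without Servant, and <code class="language-plaintext highlighter-rouge">mkNested</code> fakes the value that <code class="language-plaintext highlighter-rouge">client</code> might produce for an API that captures a player id and then, further down, an item name.

```haskell
{-# LANGUAGE TypeOperators #-}

-- Stand-ins for servant's types (assumptions made for this sketch only).
data a :<|> b = a :<|> b
infixr 3 :<|>

type ClientM = IO

-- The shape a derived client would plausibly have for a doubly nested API:
-- a function returning a pair whose second half is another function
-- returning another pair.
type NestedClient =
  Int -> (ClientM Int :<|> (String -> (ClientM Int :<|> ClientM Bool)))

-- Fake client with placeholder results instead of HTTP requests.
mkNested :: NestedClient
mkNested playerId =
  pure playerId :<|> (\item -> pure (length item) :<|> pure (playerId > 0))

-- One pattern match to recover the first-level client...
getPlayerScore :: Int -> ClientM Int
getPlayerScore playerId = score
  where
    (score :<|> _) = mkNested playerId

-- ...and two matches, one per level, for anything nested deeper.
getItemCount :: Int -> String -> ClientM Int
getItemCount playerId item = count
  where
    (_ :<|> byItem) = mkNested playerId
    (count :<|> _)  = byItem item
```

Each additional level of nesting adds another pattern match to every client function that lives below it.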
At this point, I strongly recommend keeping your API types as flat and repetitive as possible. Doing otherwise takes you off Servant’s happy path. <h1 id="how-does-this-happen">How does this happen?</h1> Servant uses a technique that I refer to as “inductive type class programming.” It provides a lot of extensibility, and is super cool. Let’s reproduce a bit of Servant’s stuff: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE PolyKinds #-}

module Butler where

import GHC.TypeLits

data thing :> route
infixr 9 :>

data alt1 :<|> alt2 = alt1 :<|> alt2
infixr 8 :<|>

data Capture (sym :: Symbol) typ

data Get a
</code></pre></div></div> This is all we need to define our own type-level APIs! We have a type operator <code class="language-plaintext highlighter-rouge">:></code>, a data type <code class="language-plaintext highlighter-rouge">:<|></code>, and a <code class="language-plaintext highlighter-rouge">Capture</code> type that describes a named parameter. Here’s our simplified API using the Butler types: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type BPI
    = "player" :> Capture "playerId" Int :> "x" :> Get Int
    :<|> "player" :> Capture "playerId" Int :> "y" :> Get Int
</code></pre></div></div> Now, we want to implement a <code class="language-plaintext highlighter-rouge">Server</code> for it. But first, we need a way of describing what a <code class="language-plaintext highlighter-rouge">Server</code> for a given API type looks like. We will ignore actually serving the API using a <code class="language-plaintext highlighter-rouge">Server</code>, and instead focus on defining the handlers.
For our purposes, the <code class="language-plaintext highlighter-rouge">Server</code> class will be quite simple: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class HasServer x where type Server x </code></pre></div></div> Now the fun begins. We are going to write a lot of overlapping and orphan instances. That’s just part of the deal with this style of programming. We’re going to start with our base case: the <code class="language-plaintext highlighter-rouge">Get</code> handler. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance HasServer (Get a) where type Server (Get a) = IO a </code></pre></div></div> The handler for a simple <code class="language-plaintext highlighter-rouge">Get a</code> is an <code class="language-plaintext highlighter-rouge">IO a</code>. What about alternation? If we have <code class="language-plaintext highlighter-rouge">left :<|> right</code>, then it makes sense that we’d need for <code class="language-plaintext highlighter-rouge">left</code> to be serve-able and <code class="language-plaintext highlighter-rouge">right</code> to be serve-able. We express this by requiring <code class="language-plaintext highlighter-rouge">HasServer</code> instances in the instance context. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance ( HasServer left , HasServer right ) => HasServer (left :<|> right) where type Server (left :<|> right) = Server left :<|> Server right </code></pre></div></div> So the server for an alternation of APIs is the alternation of the servers of those APIs. Let’s do <code class="language-plaintext highlighter-rouge">Capture</code> now – that one is a bit interesting! In order to handle the Capture, we need to take the capture-d thing as a parameter. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance HasServer (Capture paramName paramType) where type Server (Capture paramName paramType) = -- ...? </code></pre></div></div> Well, that doesn’t quite work out. We don’t have the rest of the server to delegate to. That’s because we use <code class="language-plaintext highlighter-rouge">:></code> for chaining combinators. We’ll need to write the instance for <code class="language-plaintext highlighter-rouge">Capture</code> using the <code class="language-plaintext highlighter-rouge">:></code> combinator to make it flow. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance ( HasServer rest ) => HasServer (Capture paramName paramType :> rest) where type Server (Capture paramName paramType :> rest) = paramType -> Server rest </code></pre></div></div> We need to enable <code class="language-plaintext highlighter-rouge">FlexibleInstances</code> for this one. Let’s try it out so far: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>>>> :kind! Server (Capture "hey" Int :> Get String) Server (Capture "hey" Int :> Get String) :: * = Int -> IO [Char] >>> :kind! Server (Capture "hey" Int :> Get String :<|> Get Char) Server (Capture "hey" Int :> Get String :<|> Get Char) :: * = (Int -> IO [Char]) :<|> IO Char </code></pre></div></div> Nice! This works out. Let’s add the instance for <code class="language-plaintext highlighter-rouge">"hey" :> rest</code> symbols: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance ( HasServer rest ) => HasServer (skipMe :> rest) where type Server (skipMe :> rest) = Server rest </code></pre></div></div> Because we’re only dealing with handler types, we just ignore this. 
Unfortunately, GHC isn’t happy with this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> Conflicting family instance declarations: forall k1 k2 (skipMe :: k2) (rest :: k1). Server (skipMe :> rest) = Server rest -- Defined at /home/matt/Projects/servant-smoosh/src/Butler.hs:51:10 forall k (paramName :: Symbol) paramType (rest :: k). Server (Capture paramName paramType :> rest) = paramType -> Server rest -- Defined at /home/matt/Projects/servant-smoosh/src/Butler.hs:58:10 | 51 | type Server (skipMe :> rest) = | ^^^^^^^^^^^^^^^^^^^^^^^^^... </code></pre></div></div> Frankly, this one mystified me. Google didn’t help me find it. This kind of programming puts you into a fairly hostile territory, and it becomes difficult to figure out how to solve problems. It requires a lot of experimentation, guesswork, and luck. The fix was weird: provide a kind annotation to <code class="language-plaintext highlighter-rouge">skipMe</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance ( HasServer rest ) => HasServer ((skipMe :: Symbol) :> rest) where type Server (skipMe :> rest) = Server rest </code></pre></div></div> And now we’re back in business! We can write out the handler type for our server: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type BPI = "player" :> Capture "playerId" Int :> "x" :> Get Int :<|> "player" :> Capture "playerId" Int :> "y" :> Get Int bpiServer :: Server BPI bpiServer = handleX :<|> handleY where handleX :: Int -> IO Int handleX _ = return 32 handleY :: Int -> IO Int handleY _ = return 35 </code></pre></div></div> <h1 id="what-about-the-client-1">What about the client?</h1> Writing out the client is actually very similar to the server. We create a type class <code class="language-plaintext highlighter-rouge">HasClient</code>, and write instances for all the various parts of the chain. 
Since we’re not actually serving or requesting anything here, I’ll omit that part. The small pretend server implementation is sufficient for us to continue. <h1 id="smooshing-the-servant">Smooshing the Servant</h1> But… we want to have a nice DRY API type with nesting! That eliminates a lot of boilerplate and makes writing handlers and clients quite nice. Therefore, we need some way of distributing the parameters. Type level programming in Haskell is quite hairy. It’s generally easier to sketch out a value-level program, desugar it, and then port it to the type level than it is to implement it directly. We’ll start by implementing a highly simplified version of the routes, at the value level: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module Smoosh where

data Route
    = Route :> Route
    | Route :<|> Route
    | Capture String
    | Path String
    | Get Int
    deriving (Eq, Show) -- Eq is needed to compare routes with (==)

infixr 9 :>
infixr 8 :<|>
</code></pre></div></div> This more-or-less mirrors the shape of the routes in Servant. Our goal is to take a value that nests parameters, like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>nested :: Route
nested = Path "player" :> Capture "playerId" :> (Get 3 :<|> Get 4)
</code></pre></div></div> and flatten it out into a route that looks like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>flat :: Route
flat = Path "player" :> Capture "playerId" :> Get 3
    :<|> Path "player" :> Capture "playerId" :> Get 4
</code></pre></div></div> We’ll walk up the route tree, collecting the Captures into a list, and when we hit a <code class="language-plaintext highlighter-rouge">:<|></code> branch, we distribute the captures to both sides of the alternation.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>smooshRoute :: Route -> Route
smooshRoute = go []
  where
    go captures r =
        case r of
            Capture str :> rest ->
                go (Capture str : captures) rest
            Path str :> rest ->
                go (Path str : captures) rest
            Get i ->
                applyCaptures captures (Get i)
            r1 :<|> r2 ->
                applyCaptures captures (smooshRoute r1)
                :<|> applyCaptures captures (smooshRoute r2)

applyCaptures :: [Route] -> Route -> Route
applyCaptures [] route = route
applyCaptures (x:xs) route = applyCaptures xs (x :> route)

test :: Bool
test = smooshRoute nested == flat
</code></pre></div></div> You might note that <code class="language-plaintext highlighter-rouge">applyCaptures</code> could be rewritten using <code class="language-plaintext highlighter-rouge">foldl'</code>. We won’t do that, because you don’t have <code class="language-plaintext highlighter-rouge">foldl'</code> at the type level. <h1 id="port-to-the-types">Port to the Types</h1> Now that we’ve implemented this at the value level, it’s relatively straightforward to desugar it and bring it to the type level, once you know the desugaring rules.
<ul> <li>There are no case expressions, so all pattern matching must be done at the top level.</li> <li>There are no <code class="language-plaintext highlighter-rouge">where</code> blocks, so all expressions must be at the top level.</li> </ul> This desugaring gives us this implementation: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>smooshRoute' :: Route -> Route smooshRoute' = smooshHelper [] smooshHelper :: [Route] -> Route -> Route smooshHelper captures (Capture str :> rest) = smooshHelper (Capture str : captures) rest smooshHelper captures (Path str :> rest) = smooshHelper (Path str : captures) rest smooshHelper captures (Get i) = applyCaptures captures (Get i) smooshHelper captures (r1 :<|> r2) = applyCaptures captures (smooshRoute' r1) :<|> applyCaptures captures (smooshRoute' r2) </code></pre></div></div> Now, we’ve got to make a choice: how do we do this at the type level? Type classes, or type families? Let’s try type families first. We’ll start with the simplified <code class="language-plaintext highlighter-rouge">Butler</code> types we defined earlier, and then port to the more complicated <code class="language-plaintext highlighter-rouge">servant</code> types. 
The translation works out relatively simply: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type SmooshRoute route = Smoosh '[] route type family Smoosh xs rt where Smoosh xs (Capture pname pty :> rest) = Smoosh (Capture pname pty ': xs) rest Smoosh xs ((sym :: Symbol) :> rest) = Smoosh (sym ': xs) rest Smoosh xs (Get i) = ApplyCaptures xs (Get i) Smoosh xs (r1 :<|> r2) = ApplyCaptures xs (SmooshRoute r1) :<|> ApplyCaptures xs (SmooshRoute r2) type family ApplyCaptures xs r where ApplyCaptures '[] r = r ApplyCaptures (x ': xs) r = ApplyCaptures xs (x :> r) </code></pre></div></div> Unfortunately, this doesn’t work :( Haskell’s type level lists require that every type inside has the same kind, and this complains about <code class="language-plaintext highlighter-rouge">Symbol</code> and <code class="language-plaintext highlighter-rouge">*</code> not aligning. We’ll write a wrapper for <code class="language-plaintext highlighter-rouge">Symbol</code> and then special case in unpacking: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data SWrap s type family Smoosh xs rt where Smoosh xs (Capture pname pty :> rest) = Smoosh (Capture pname pty ': xs) rest Smoosh xs ((sym :: Symbol) :> rest) = Smoosh (SWrap sym ': xs) rest Smoosh xs (Get i) = ApplyCaptures xs (Get i) Smoosh xs (r1 :<|> r2) = ApplyCaptures xs (SmooshRoute r1) :<|> ApplyCaptures xs (SmooshRoute r2) type family ApplyCaptures xs r where ApplyCaptures '[] r = r ApplyCaptures (SWrap x ': xs) r = ApplyCaptures xs (x :> r) ApplyCaptures (x ': xs) r = ApplyCaptures xs (x :> r) </code></pre></div></div> Now, does it work? 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type BPI' = "player" :> Capture "playerId" Int :> ( "x" :> Get Int :<|> "y" :> Get Int ) bpiServer' :: Server (SmooshRoute BPI') bpiServer' = handleX :<|> handleY where handleX :: Int -> IO Int handleX _ = return 32 handleY :: Int -> IO Int handleY _ = return 35 </code></pre></div></div> GHC doesn’t complain! We did it! Awesome! <h1 id="port-to-servant">Port to Servant</h1> Alright, it’s time to level this thing up. Let’s port to <code class="language-plaintext highlighter-rouge">servant</code> and see if it works. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Distribute route = Flatten '[] route data SWrap (s :: Symbol) type family Flatten xs route where Flatten xs ((sym :: Symbol) :> rest) = Flatten (SWrap sym ': xs) rest Flatten xs ((x :: *) :> rest) = Flatten (x ': xs) rest Flatten xs (Verb meth code typs a) = ApplyCaptures xs (Verb meth code typs a) Flatten xs (r1 :<|> r2) = ApplyCaptures xs (Distribute r1) :<|> ApplyCaptures xs (Distribute r2) type family ApplyCaptures xs r where ApplyCaptures '[] r = r ApplyCaptures (SWrap x ': xs) r = ApplyCaptures xs (x :> r) ApplyCaptures (x ': xs) r = ApplyCaptures xs (x :> r) </code></pre></div></div> OK, this is the port. I changed <code class="language-plaintext highlighter-rouge">Get</code> to <code class="language-plaintext highlighter-rouge">Verb</code> and added all the type parameters. Everything else gets collected and distributed out to all the API leaves. Let’s write the server using this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>distributed :: Server (Distribute Api') distributed = serveX :<|> serveY where serveX _ = return 32 serveY _ = return 42 </code></pre></div></div> But, it doesn’t work. 
We get this error: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home/matt/Projects/servant-smoosh/src/Lib.hs:54:15: error: • Couldn't match type ‘ServerT (Flatten '[SWrap "y"] (Verb 'GET 200 '[JSON] Int)) Handler’ with ‘m0 a0’ Expected type: Server (Distribute Api') Actual type: (Int -> m0 a0) :<|> (Int -> m1 a1) The type variables ‘m0’, ‘a0’ are ambiguous • In the expression: serveX :<|> serveY In an equation for ‘distributed’: distributed = serveX :<|> serveY where serveX _ = return 32 serveY _ = return 42 | 54 | distributed = serveX :<|> serveY | ^^^^^^^^^^^^^^^^^^ </code></pre></div></div> It’s got a type error. If we look at that type error, we see that the <code class="language-plaintext highlighter-rouge">Flatten</code> type family is still there. GHC does a thing where it gets “stuck” if a type family doesn’t match anything. Rather than saying “I can’t figure this type family out, you must have made a mistake,” it just carries the type on in the non-reduced state. This is totally bizarre if you don’t know what you’re looking for. So if you’re type-level-hacking and you see a type family application, that means that it failed to match a case. Specifically, it seems like we missed the <code class="language-plaintext highlighter-rouge">Flatten xs (Verb _ _ _ _)</code> case. Why is that? Let’s inspect it: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> Flatten xs (Verb meth code typs a) = ApplyCaptures xs (Verb meth code typs a) </code></pre></div></div> Hmmm.. What extensions are enabled? The behavior of type level programming in Haskell is dependent on the extensions we provide. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE TypeFamilies #-} {-# LANGUAGE UndecidableInstances #-} {-# LANGUAGE KindSignatures #-} {-# LANGUAGE DataKinds #-} {-# LANGUAGE TypeOperators #-} </code></pre></div></div> <code class="language-plaintext highlighter-rouge">TypeFamilies</code> enables, well, type families. <code class="language-plaintext highlighter-rouge">UndecidableInstances</code> is needed for the recursion in <code class="language-plaintext highlighter-rouge">ApplyCaptures</code>. <code class="language-plaintext highlighter-rouge">KindSignatures</code> allows us to write <code class="language-plaintext highlighter-rouge">data SWrap (s :: Symbol)</code>. <code class="language-plaintext highlighter-rouge">DataKinds</code> lets us promote values-to-types and types-to-kinds. And <code class="language-plaintext highlighter-rouge">TypeOperators</code> lets us use operators in types. But you know what GHC does with type variables that don’t have a kind signature? By default, it infers that they’re of kind <code class="language-plaintext highlighter-rouge">*</code>! This is the explicit case that we’ve defined: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Flatten xs (Verb (meth :: *) (code :: *) (typs :: *) (a :: *)) = ApplyCaptures xs (Verb meth code typs a) </code></pre></div></div> And the case we’re trying to apply it to is this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Flatten '[SWrap "y"] (Verb 'GET 200 '[JSON] Int) </code></pre></div></div> Ah! <code class="language-plaintext highlighter-rouge">'GET</code>, <code class="language-plaintext highlighter-rouge">'[JSON]</code> and <code class="language-plaintext highlighter-rouge">200</code> don’t have kind <code class="language-plaintext highlighter-rouge">*</code>! That is the trick. 
So we make an addition to our extensions list: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE PolyKinds #-} </code></pre></div></div> And now our implementation works! Does the client work? <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>getX' :: Int -> ClientM Int
getY' :: Int -> ClientM Int
getX' :<|> getY' = client (Proxy :: Proxy (Distribute Api'))
</code></pre></div></div> Yes, yes it does. <h1 id="take-aways">Takeaways</h1> <ul> <li>Type level programming is hard and full of minefields.</li> <li>Type class inductive programming gives awful error messages.</li> <li>Servant is fantastic, but nontrivial modifications and extensions require intense knowledge of GHC’s weird flavor of type level programming.</li> </ul> Wed, 14 Mar 2018 00:00:00 +0000 https://www.parsonsmatt.org/2018/03/14/servant_route_smooshing.html 2017 Retrospective It’s that time of year again – a retrospective! One interesting difference this year is that I kept a fairly detailed todo list in Workflowy throughout the year. I can show the completed listings throughout the year to get an idea of what I got up to. 2017 was a huge year for me. I made several goals for the year, almost all of which I completed! <h1 id="move-to-denver">Move to Denver</h1> I moved from Athens, Georgia to Denver, Colorado in February. I’d lived in Athens for about ten years, and I really needed to be out of that tiny town and the humid south. Denver has been absolutely amazing – I’ve loved the outdoors for the first time in my adult life, and I’m way happier and healthier than I have ever been before. This was truly needed, as I’d sacrificed a lot of my health to get my degree and career on track.
<h1 id="give-a-conference-talk">Give a Conference Talk</h1> I gave <a href="https://www.youtube.com/watch?v=Ej5FQtEgTBw">a talk at LambdaConf</a>. People seemed to like it and find it valuable, despite the fact that it had a ton of Ruby code. I enjoyed giving the talk, and I’m excited to give more of them next year :D <h1 id="lose-30lbs">Lose 30lbs</h1> One of my goals for 2017 was to lose 30lbs, going from 200lbs to 170lbs. I did not make any progress toward this goal, and dropped it pretty early in the year. I decided to focus on lifting and activity for fitness instead of weight. I started getting more into cycling. I was mostly into commuting beforehand, but now I’m getting really into gravel riding and might even try to do a road race next year. I just finished a bike tour from Daphne, AL to Port Saint Joe, FL, too! It’s a lot of fun and really fulfilling. <h1 id="write-a-book">Write a Book</h1> I did not write a book. I have been mulling the idea over, but I have not taken any steps to write out a table of contents or structure anything. <h1 id="non-goal-related">Non-Goal Related</h1> <h2 id="careerprofessional-stuff">Career/Professional Stuff</h2> I did a bunch of stuff related to software development and Haskell. I helped revive the <a href="https://www.meetup.com/denverfp">DenverFP meetup group</a>, which has had a bunch of cool talks and a quite successful <a href="https://haskellbook.com/">Haskell Book Club</a>. I gave five talks at various local meetups. I wrote fifteen blog posts, totalling 28,579 words. We have three Haskellers at my job right now, and are actively training others. Another big Haskell project is about to get started, and I’m excited to help get it off the ground. The future is bright for Haskell here. I went to BayHac in San Francisco and had a blast. I spent a bit of time in Portland, and hung out with the Rust folks at RustConf and had a great time there too.
Speaking of, I should really get my Rust project going again… <h2 id="music">Music</h2> I haven’t done anything with music. I ended up giving up on the cello, after about a year of practice and lessons. I’m a little sad about it, but it’s just too much of a distraction from my larger goals right now. I’ve been intending on learning classical guitar for a while, and have all the stuff for it. But I haven’t put in the time yet. <h2 id="travel">Travel</h2> This year, I went to San Francisco, Portland, Atlanta, and then the bike tour along the coast. I liked it a bunch, and I want to do more of it. <h1 id="whats-next">What’s next?</h1> Last year’s retrospective was pretty bleak. I had a really good year this year, and I’m so grateful for that. I’m excited to do more traveling, especially out of the country now that I have my passport (Poland and Switzerland are already on my calendar). I’ve got another bike tour planned (Denver to Santa Fe). I’m going to try and get more involved in the local non-programming communities. I’m going to follow a budget better and live healthier. I’m excited for another year on this earth. Sun, 31 Dec 2017 00:00:00 +0000 https://www.parsonsmatt.org/2017/12/31/2017_retrospective.html https://www.parsonsmatt.org/2017/12/31/2017_retrospective.html Haskell Performance Debugging Someone posted <a href="https://www.reddit.com/r/haskell/comments/7km60k/optimization_ideas_in_treap_implementation/">a Treap implementation to reddit</a> that was slow. Let’s analyze it and determine what’s up. The repo is available <a href="https://github.com/parsonsmatt/performance-debugging">here</a>. <h1 id="base-run">Base Run</h1> I set the code up in a Cabal project, created a makefile, and ran an initial profiling run. The code and profiling output are in the <code class="language-plaintext highlighter-rouge">base</code> branch on GitHub.
Before we look at any of the executing code or profiling output, let’s check out the definition of the data structure in question: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Node v d = Node { val :: v, info :: d, prior :: Int }
    deriving (Eq, Show)

data Treap v d = Leaf | Tree {node :: Node v d, left :: Treap v d, right :: Treap v d}
    deriving Show
</code></pre></div></div> We have a binary tree along with some annotations. The spine and values are lazy, like a linked list. Here’s the <code class="language-plaintext highlighter-rouge">main</code> function that we’re going to be inspecting output for: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main = do
    g <- getStdGen
    let nulls = repeat ()
        n = 100000
        rxs = take n $ randomRs (1,100000) g :: [Int]
        nodeList = feedFold (zip rxs nulls) g buildNode
        treap = insertMany empty nodeList
    print $ heightTreap treap
    print $ map (\Node{val = v} -> v) $ inOrder treap
</code></pre></div></div> I build the executable with profiling and do a run with <code class="language-plaintext highlighter-rouge">-p</code> and <code class="language-plaintext highlighter-rouge">-s</code>. This gets me a time and allocation profile.
Here’s the <code class="language-plaintext highlighter-rouge">-s</code> output: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 1,691,027,808 bytes allocated in the heap 1,179,783,328 bytes copied during GC 42,694,944 bytes maximum residency (25 sample(s)) 8,493,296 bytes maximum slop 121 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 3033 colls, 0 par 0.716s 0.752s 0.0002s 0.0008s Gen 1 25 colls, 0 par 0.544s 0.560s 0.0224s 0.0460s INIT time 0.000s ( 0.000s elapsed) MUT time 1.088s ( 1.140s elapsed) GC time 1.260s ( 1.312s elapsed) RP time 0.000s ( 0.000s elapsed) PROF time 0.000s ( 0.000s elapsed) EXIT time 0.000s ( 0.003s elapsed) Total time 2.404s ( 2.455s elapsed) %GC time 52.4% (53.4% elapsed) Alloc rate 1,554,253,500 bytes per MUT second Productivity 47.6% of total user, 46.5% of total elapsed </code></pre></div></div> 52% time spent in GC isn’t great. The profiling output indicates we’re spending the vast majority of our time in the <code class="language-plaintext highlighter-rouge">splitTreap</code> function. So let’s look there and see what’s up: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>splitTreap :: (Ord v) => Treap v d -> v -> (Treap v d, Treap v d) splitTreap Leaf _ = (Leaf, Leaf) splitTreap (tree @ Tree {node = Node { val = x }, left = l, right = r}) v | x < v = let (lt, rt) = splitTreap r v in ( Tree { node = node tree, left = l, right = lt } , rt ) | v <= x = let (lt, rt) = splitTreap l v in ( lt , Tree { node = node tree, left = rt, right = r} ) </code></pre></div></div> I see two things that are concerning to me: <ul> <li>Tuples.</li> <li>Recursion.</li> </ul> Tuples are a source of often unwanted laziness and space leaks. GHC can sometimes “see through” a tuple data structure and unbox it entirely, making it 0 overhead. 
However, sometimes it can’t, and then you have a bunch more allocations and thunks happening. Recursion totally defeats GHC’s ability to inline things, which can wreck performance. Stuff like <code class="language-plaintext highlighter-rouge">map</code>, <code class="language-plaintext highlighter-rouge">foldr</code>, etc. have clever means of being optimized, but naive recursive functions can often have issues with inlining. So, these are my impressions before I get started with experimenting. In order to test my “tuple allocation” hypothesis, I’m going to do a heap profiling run. We’ll use the <code class="language-plaintext highlighter-rouge">-hd</code> flag to get the data constructors that are allocated: <img src="https://www.parsonsmatt.org/treap-base-hd.png" alt="The output of -hd" /> Neat! Okay, so this graph tells us that we allocate a ton of nodes, tuples, and <code class="language-plaintext highlighter-rouge">I#</code> (the constructor for <code class="language-plaintext highlighter-rouge">Int</code>), before we start allocating a bunch of <code class="language-plaintext highlighter-rouge">Tree</code> constructors. Given the <code class="language-plaintext highlighter-rouge">main</code> function we’re dealing with, that’s not entirely unreasonable. <h1 id="experiment-one-strictifying-the-data-structure">Experiment One: Strictifying the Data Structure</h1> The code for this section is in <code class="language-plaintext highlighter-rouge">strictify-treap</code>.
I modify the data structure by placing bang patterns at certain points: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Node v d = Node { val :: !v , info :: d , prior :: !Int } deriving (Eq, Show) data Treap v d = Leaf | Tree { node :: !(Node v d) , left :: Treap v d , right :: Treap v d } deriving Show </code></pre></div></div> This makes the <code class="language-plaintext highlighter-rouge">Node</code> type strict in the <code class="language-plaintext highlighter-rouge">val</code> and <code class="language-plaintext highlighter-rouge">prior</code> fields, and the <code class="language-plaintext highlighter-rouge">Treap</code> type strict in the <code class="language-plaintext highlighter-rouge">node</code> field. The <code class="language-plaintext highlighter-rouge">info</code> field is left lazy, like most containers. We are leaving the spine of the data structure lazy as well. Here’s the <code class="language-plaintext highlighter-rouge">-s</code> output: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 1,659,050,200 bytes allocated in the heap 1,144,049,696 bytes copied during GC 43,890,168 bytes maximum residency (33 sample(s)) 8,508,680 bytes maximum slop 102 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 2905 colls, 0 par 0.676s 0.696s 0.0002s 0.0007s Gen 1 33 colls, 0 par 0.544s 0.567s 0.0172s 0.0409s INIT time 0.000s ( 0.000s elapsed) MUT time 0.920s ( 0.996s elapsed) GC time 1.220s ( 1.263s elapsed) RP time 0.000s ( 0.000s elapsed) PROF time 0.000s ( 0.000s elapsed) EXIT time 0.000s ( 0.001s elapsed) Total time 2.196s ( 2.260s elapsed) %GC time 55.6% (55.9% elapsed) Alloc rate 1,803,315,434 bytes per MUT second Productivity 44.4% of total user, 44.1% of total elapsed </code></pre></div></div> We’re using about 20MB less memory now, which is good. 
And we’re using less time overall (2.4 vs 2.2 seconds), which is also good! But we’re actually doing 55% GC now, which is worse than before! Here’s the output of the heap profile now: <img src="https://www.parsonsmatt.org/treap-strict-nodes.png" alt="The output of -hd" /> This hasn’t made a huge difference, but it’s certainly a bit better. The time and allocation profile tell another story: we’ve gone from 2.49 seconds to run the program to 0.97 seconds. I’m feeling pretty encouraged by this, so I’m going to make the tree spine-strict as well. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Node v d = Node { val :: !v , info :: d , prior :: !Int } deriving (Eq, Show) data Treap v d = Leaf | Tree { node :: !(Node v d) , left :: !(Treap v d) , right :: !(Treap v d) } deriving Show </code></pre></div></div> We’re still at around 94MB of total memory in use according to the <code class="language-plaintext highlighter-rouge">-s</code> output: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 1,161,437,656 bytes allocated in the heap 449,893,272 bytes copied during GC 43,890,328 bytes maximum residency (24 sample(s)) 8,520,808 bytes maximum slop 94 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 2143 colls, 0 par 0.152s 0.166s 0.0001s 0.0006s Gen 1 24 colls, 0 par 0.188s 0.203s 0.0085s 0.0272s INIT time 0.000s ( 0.000s elapsed) MUT time 0.556s ( 0.644s elapsed) GC time 0.320s ( 0.345s elapsed) RP time 0.000s ( 0.000s elapsed) PROF time 0.020s ( 0.024s elapsed) EXIT time 0.000s ( 0.000s elapsed) Total time 0.952s ( 0.989s elapsed) %GC time 33.6% (34.9% elapsed) Alloc rate 2,088,916,647 bytes per MUT second Productivity 64.3% of total user, 62.6% of total elapsed </code></pre></div></div> This is much better. 33% GC isn’t great, but it’s much better than we had before. 
We’re down from 2.4 seconds to 0.95 seconds, which is a substantial improvement. Let’s look at the heap output now: <img src="https://www.parsonsmatt.org/treap-strict-spine.png" alt="heap output with strict spine" /> Now that’s a lot closer! We’ll note that we generate a big spike of memory, and then collect it all. That’s a pretty tell-tale sign that something’s up. We’ve got a lot of allocations on tuple constructors, which bothers me. <h1 id="strictifying-the-split">Strictifying the Split</h1> The largest offender is still <code class="language-plaintext highlighter-rouge">splitTreap</code>, which is responsible for nearly half of the runtime of the program. We know we’re allocating and then throwing away tuples, so we’ve likely got a space leak there. I am going to add bang patterns inside of the tuples and observe the output. Here’s the change: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>splitTreap :: (Ord v) => Treap v d -> v -> (Treap v d, Treap v d)
splitTreap Leaf _ = (Leaf, Leaf)
splitTreap (tree @ Tree {node = Node { val = x }, left = l, right = r}) v
    | x < v =
        let (!lt, !rt) = splitTreap r v
         in ( Tree { node = node tree, left = l, right = lt }, rt )
    | v <= x =
        let (!lt, !rt) = splitTreap l v
         in ( lt, Tree { node = node tree, left = rt, right = r} )
</code></pre></div></div> Where the original code destructures the tuple immediately and leaves the <code class="language-plaintext highlighter-rouge">lt</code> and <code class="language-plaintext highlighter-rouge">rt</code> variables lazy, this forces those variables to weak head normal form.
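To make the evaluation order concrete, here is a minimal sketch of how nested bang patterns in a <code class="language-plaintext highlighter-rouge">let</code> behave. The names <code class="language-plaintext highlighter-rouge">lazyFirst</code>, <code class="language-plaintext highlighter-rouge">bangFirst</code>, and <code class="language-plaintext highlighter-rouge">throwsWhenForced</code> are made up for illustration and are not part of the treap code. As far as I understand GHC’s semantics, the binding itself stays lazy: the bangs only kick in once one of the bound variables is demanded, at which point the whole pattern is matched and every banged component is forced to weak head normal form.

```haskell
{-# LANGUAGE BangPatterns #-}

import Control.Exception (SomeException, evaluate, try)

-- Plain tuple pattern: only the component we use is ever evaluated.
lazyFirst :: (Int, Int) -> Int
lazyFirst pair = let (a, _b) = pair in a

-- Nested bangs: demanding 'a' matches the whole pattern, which also
-- forces '_b' to weak head normal form.
bangFirst :: (Int, Int) -> Int
bangFirst pair = let (!a, !_b) = pair in a

-- Illustrative helper: True when forcing the value to WHNF throws.
throwsWhenForced :: Int -> IO Bool
throwsWhenForced x = do
  result <- try (evaluate x) :: IO (Either SomeException Int)
  pure (either (const True) (const False) result)
```

Loaded in GHCi, <code class="language-plaintext highlighter-rouge">throwsWhenForced (lazyFirst (1, undefined))</code> comes back <code class="language-plaintext highlighter-rouge">False</code>, while the same call with <code class="language-plaintext highlighter-rouge">bangFirst</code> comes back <code class="language-plaintext highlighter-rouge">True</code>, because the second component is forced as soon as the first one is used.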
Here’s the new <code class="language-plaintext highlighter-rouge">-s</code> output: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 1,331,896,120 bytes allocated in the heap 497,880,136 bytes copied during GC 43,890,328 bytes maximum residency (25 sample(s)) 8,516,712 bytes maximum slop 94 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 2245 colls, 0 par 0.188s 0.186s 0.0001s 0.0007s Gen 1 25 colls, 0 par 0.212s 0.251s 0.0100s 0.0386s INIT time 0.000s ( 0.000s elapsed) MUT time 0.636s ( 0.756s elapsed) GC time 0.360s ( 0.394s elapsed) RP time 0.000s ( 0.000s elapsed) PROF time 0.040s ( 0.043s elapsed) EXIT time 0.000s ( 0.000s elapsed) Total time 1.084s ( 1.151s elapsed) %GC time 33.2% (34.2% elapsed) Alloc rate 2,094,176,289 bytes per MUT second Productivity 63.1% of total user, 62.0% of total elapsed </code></pre></div></div> This is barely changed at all from the previous run. The heap profile didn’t change, either. I’m going to do a run now with <code class="language-plaintext highlighter-rouge">-hc</code> to see where these tuples are getting allocated. <code class="language-plaintext highlighter-rouge">-hc</code> records which functions are actually producing the data, which tells us where to focus our efforts. <img src="https://www.parsonsmatt.org/treap-strict-tuple-hc.png" alt="Output of -hc" /> Ah, nuts! <code class="language-plaintext highlighter-rouge">splitTreap</code> is allocating a tiny amount of memory now. It looks like we’re allocating the most in <code class="language-plaintext highlighter-rouge">buildNode</code>, <code class="language-plaintext highlighter-rouge">feedFold</code>, and <code class="language-plaintext highlighter-rouge">insertMany</code>. 
This seems to disagree with the <code class="language-plaintext highlighter-rouge">-p</code> output, which indicates that we’re spending the majority of our time and allocations in <code class="language-plaintext highlighter-rouge">splitTreap</code>. Well, I guess I’ll focus on <code class="language-plaintext highlighter-rouge">insertMany</code> now. <h1 id="insertmany"><code class="language-plaintext highlighter-rouge">insertMany</code></h1> The code for this section is in the <code class="language-plaintext highlighter-rouge">insert-many</code> branch on GitHub. <code class="language-plaintext highlighter-rouge">mergeTreap</code> is uncurried for some reason: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mergeTreap :: (Treap v d, Treap v d) -> Treap v d
</code></pre></div></div> And this bothers me, so I’m currying it. This doesn’t do anything. At this point I actually look at <code class="language-plaintext highlighter-rouge">insertMany</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>insertMany :: (Ord v) => Treap v d -> [Node v d] -> Treap v d
insertMany = foldl insertTreap
</code></pre></div></div> Oh. Duh. <code class="language-plaintext highlighter-rouge">foldl</code> strikes again. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>insertMany :: (Ord v) => Treap v d -> [Node v d] -> Treap v d
insertMany = foldl' insertTreap
</code></pre></div></div> Who would win, GHC’s amazing optimization powers, or one prime boi??
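For anyone who hasn’t been bitten by this yet, here is a small sketch of the difference, using made-up helper names rather than anything from the treap repo. <code class="language-plaintext highlighter-rouge">foldl</code> defers every application of the step function and hands back a chain of thunks, while <code class="language-plaintext highlighter-rouge">foldl'</code> forces each intermediate accumulator to weak head normal form before moving on. With optimizations on, GHC can often rescue simple cases like <code class="language-plaintext highlighter-rouge">sum</code>, so treat the first pair as an illustration of the semantics, not a benchmark.

```haskell
import Control.Exception (SomeException, evaluate, try)
import Data.List (foldl')

-- Same result, different evaluation: the lazy version may build up
-- (((0 + 1) + 2) + ...) as thunks before anything is added.
sumLazy :: [Int] -> Int
sumLazy = foldl (+) 0

sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

-- A step function that ignores the accumulator makes the difference
-- observable: foldl never looks at intermediate accumulators...
lastViaFoldl :: [Int] -> Int
lastViaFoldl = foldl (\_ x -> x) 0

-- ...while foldl' forces every one of them along the way.
lastViaFoldl' :: [Int] -> Int
lastViaFoldl' = foldl' (\_ x -> x) 0

-- Illustrative helper: True when forcing the value to WHNF throws.
throwsWhenForced :: Int -> IO Bool
throwsWhenForced x = do
  result <- try (evaluate x) :: IO (Either SomeException Int)
  pure (either (const True) (const False) result)
```

On the list <code class="language-plaintext highlighter-rouge">[undefined, 7]</code>, <code class="language-plaintext highlighter-rouge">lastViaFoldl</code> happily returns <code class="language-plaintext highlighter-rouge">7</code>, whereas <code class="language-plaintext highlighter-rouge">lastViaFoldl'</code> throws, because the intermediate accumulator <code class="language-plaintext highlighter-rouge">undefined</code> gets forced.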
<code class="language-plaintext highlighter-rouge">-s</code> output: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 1,115,162,944 bytes allocated in the heap 245,033,472 bytes copied during GC 12,088,896 bytes maximum residency (22 sample(s)) 306,112 bytes maximum slop 32 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 2134 colls, 0 par 0.116s 0.128s 0.0001s 0.0003s Gen 1 22 colls, 0 par 0.080s 0.105s 0.0048s 0.0136s INIT time 0.000s ( 0.000s elapsed) MUT time 0.596s ( 0.700s elapsed) GC time 0.180s ( 0.216s elapsed) RP time 0.000s ( 0.000s elapsed) PROF time 0.016s ( 0.018s elapsed) EXIT time 0.000s ( 0.001s elapsed) Total time 0.852s ( 0.916s elapsed) %GC time 21.1% (23.6% elapsed) Alloc rate 1,871,078,765 bytes per MUT second Productivity 77.0% of total user, 74.4% of total elapsed </code></pre></div></div> Nice, down to 32MB of total memory in use, and about a tenth of a second less time overall. We’re also only spending 21% of our time in garbage collection, which is a huge win. Let’s check out the heap profile: <img src="https://www.parsonsmatt.org/treap-foldl.png" alt="heap profile after foldl'" /> <h1 id="never-ever-use-foldl">Never Ever Use <code class="language-plaintext highlighter-rouge">foldl</code></h1> <h1 id="always-use-foldl">Always Use <code class="language-plaintext highlighter-rouge">foldl'</code></h1> What if we go back to the beginning and only do the <code class="language-plaintext highlighter-rouge">foldl</code> to <code class="language-plaintext highlighter-rouge">foldl'</code> change? I run <code class="language-plaintext highlighter-rouge">git checkout base</code> to get back to the original timeline, change <code class="language-plaintext highlighter-rouge">foldl</code> to <code class="language-plaintext highlighter-rouge">foldl'</code>. 
Here’s <code class="language-plaintext highlighter-rouge">-s</code>: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 1,581,972,168 bytes allocated in the heap 1,140,799,032 bytes copied during GC 40,964,944 bytes maximum residency (43 sample(s)) 495,784 bytes maximum slop 114 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 3044 colls, 0 par 0.664s 0.636s 0.0002s 0.0004s Gen 1 43 colls, 0 par 0.796s 0.812s 0.0189s 0.0611s INIT time 0.000s ( 0.000s elapsed) MUT time 0.964s ( 1.226s elapsed) GC time 1.336s ( 1.320s elapsed) RP time 0.000s ( 0.000s elapsed) PROF time 0.124s ( 0.128s elapsed) EXIT time 0.000s ( 0.002s elapsed) Total time 2.488s ( 2.548s elapsed) %GC time 53.7% (51.8% elapsed) Alloc rate 1,641,049,966 bytes per MUT second Productivity 41.3% of total user, 43.2% of total elapsed </code></pre></div></div> Not great – actually a little worse than where we started! What about the heap profile? <img src="https://www.parsonsmatt.org/treap-just-foldl.png" alt="heap profile with only foldl" /> This profile is also nearly the same! The allocations appear to be a little smoother, but not significantly different. So just switching to <code class="language-plaintext highlighter-rouge">foldl'</code> without also making the data structure strict didn’t help. 
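My reading of this result is that <code class="language-plaintext highlighter-rouge">foldl'</code> only forces the accumulator to weak head normal form, which means the outermost constructor and nothing more. If the fields under that constructor are lazy, the thunks just move one level down. The sketch below uses two hypothetical box types (not the treap itself) to show that a strict field is what makes forcing the constructor also force its contents.

```haskell
import Control.Exception (SomeException, evaluate, try)
import Data.List (foldl')

-- Hypothetical accumulators: same shape, different field strictness.
data LazyBox = LazyBox Int
data StrictBox = StrictBox !Int

-- Each step replaces the stored value with the newest element.
stepLazy :: LazyBox -> Int -> LazyBox
stepLazy (LazyBox _) x = LazyBox x

stepStrict :: StrictBox -> Int -> StrictBox
stepStrict (StrictBox _) x = StrictBox x

-- foldl' forces each LazyBox constructor, but never the Int inside it.
lastLazy :: [Int] -> Int
lastLazy xs = case foldl' stepLazy (LazyBox 0) xs of
  LazyBox n -> n

-- Forcing a StrictBox constructor also forces its field.
lastStrict :: [Int] -> Int
lastStrict xs = case foldl' stepStrict (StrictBox 0) xs of
  StrictBox n -> n

-- Illustrative helper: True when forcing the value to WHNF throws.
throwsWhenForced :: Int -> IO Bool
throwsWhenForced x = do
  result <- try (evaluate x) :: IO (Either SomeException Int)
  pure (either (const True) (const False) result)
```

With <code class="language-plaintext highlighter-rouge">[undefined, 3]</code> as input, <code class="language-plaintext highlighter-rouge">lastLazy</code> returns <code class="language-plaintext highlighter-rouge">3</code> because the bad element sits untouched inside a lazy field, while <code class="language-plaintext highlighter-rouge">lastStrict</code> throws as soon as the first box is built. That is the same interaction the treap shows: the strict fold and the strict fields only pay off together.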
<h1 id="final-run">Final Run</h1> Finally, we disable profiling and run the code again: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 650,786,800 bytes allocated in the heap 132,515,880 bytes copied during GC 7,278,528 bytes maximum residency (17 sample(s)) 353,296 bytes maximum slop 21 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 1233 colls, 0 par 0.112s 0.100s 0.0001s 0.0004s Gen 1 17 colls, 0 par 0.056s 0.056s 0.0033s 0.0122s INIT time 0.000s ( 0.000s elapsed) MUT time 0.212s ( 0.341s elapsed) GC time 0.168s ( 0.156s elapsed) EXIT time 0.000s ( 0.000s elapsed) Total time 0.436s ( 0.497s elapsed) %GC time 38.5% (31.3% elapsed) Alloc rate 3,069,749,056 bytes per MUT second Productivity 61.5% of total user, 68.7% of total elapsed </code></pre></div></div> We get 21MB total memory usage and 0.43 seconds of execution time. <h1 id="conclusion">Conclusion?</h1> <h2 id="strict-in-the-spine-lazy-in-the-leaves">Strict in the spine, lazy in the leaves</h2> Data structures should be strict in the spine and lazy in the leaves, unless you explicitly intend on constructing/consuming the data constructor in a streaming fashion. <h2 id="never-use-foldl">Never use foldl</h2> Seriously, don’t. Make an <code class="language-plaintext highlighter-rouge">hlint</code> rule to never use it. Ban it from your codebase. Make a GHC proposal to repeal and replace it in <code class="language-plaintext highlighter-rouge">Prelude</code>. Mon, 18 Dec 2017 00:00:00 +0000 https://www.parsonsmatt.org/2017/12/18/haskell_performance_debugging.html https://www.parsonsmatt.org/2017/12/18/haskell_performance_debugging.html MonadBaseControl in Five Minutes This post is intended to be a short guide on using <code class="language-plaintext highlighter-rouge">MonadBaseControl</code> effectively in Haskell code without understanding it. 
<h1 id="tiny-synopsis">Tiny synopsis</h1> The big idea behind <code class="language-plaintext highlighter-rouge">MonadIO m</code> is that you can perform a transformation <code class="language-plaintext highlighter-rouge">IO a -> m a</code>. The big idea behind <code class="language-plaintext highlighter-rouge">MonadBaseControl</code> is that you can perform a transformation <code class="language-plaintext highlighter-rouge">m a -> IO a</code>. Most monads carry more context than just <code class="language-plaintext highlighter-rouge">IO</code>, so going from your custom monad to <code class="language-plaintext highlighter-rouge">IO</code> requires providing that additional context. Additionally, many monads alter the return type slightly. <code class="language-plaintext highlighter-rouge">ExceptT e IO a</code> turns into <code class="language-plaintext highlighter-rouge">IO (Either e a)</code>, and <code class="language-plaintext highlighter-rouge">StateT s IO a</code> turns into <code class="language-plaintext highlighter-rouge">IO (a, s)</code>. The <code class="language-plaintext highlighter-rouge">StM m a</code> type family is used to associate the types. There is one function you need to know: <code class="language-plaintext highlighter-rouge">control</code>. <h1 id="lifting-callbacks">Lifting Callbacks</h1> The primary reason to use <code class="language-plaintext highlighter-rouge">MonadBaseControl</code> is to lift IO callbacks.
Here are some examples: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>withFile
    :: FilePath
    -> IOMode
    -> (Handle -> IO a) -- callback to lift
    -> IO a

withFileLifted
    :: (MonadBaseControl IO m, StM m a ~ a)
    => FilePath
    -> IOMode
    -> (Handle -> m a)
    -> m a
withFileLifted path mode action =
    control $ \runInIO ->
        withFile path mode (\handle -> runInIO (action handle))
</code></pre></div></div> The <code class="language-plaintext highlighter-rouge">StM m a ~ a</code> line asserts that the <code class="language-plaintext highlighter-rouge">m</code> monad does not alter the return state. This means no <code class="language-plaintext highlighter-rouge">ExceptT</code>, <code class="language-plaintext highlighter-rouge">StateT</code>, etc. Those have unpredictable effects in the presence of multithreading, so it’s best to avoid them. Here is another example: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>forkIO :: IO () -> IO ThreadId

-- 'void' (from Data.Functor) discards the action's result, since forkIO wants an IO ().
forkIOLifted :: (MonadBaseControl IO m, StM m ThreadId ~ ThreadId) => m a -> m ThreadId
forkIOLifted action = control $ \runInIO -> forkIO (void (runInIO action))
</code></pre></div></div> The general pattern is to write: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>control $ \runInIO -> do -- [1]
    putStrLn "go"
    res <- runInIO foobar -- [2]
    putStrLn "end"
    return res -- [3]
</code></pre></div></div> <ol> <li><code class="language-plaintext highlighter-rouge">control</code> takes a callback that operates in <code class="language-plaintext highlighter-rouge">IO</code>.</li> <li>It provides a function for dropping the extra stuff and running a lifted action as an IO action.</li> <li>When you return something, it needs the <code class="language-plaintext highlighter-rouge">StM m a</code> type to know how to augment the value.</li> </ol> <h1 id="with-exceptions">With Exceptions</h1> There have been many questions on
converting <code class="language-plaintext highlighter-rouge">MonadBaseControl</code> and <code class="language-plaintext highlighter-rouge">MonadCatch</code> sorts of functions. If you have a function <code class="language-plaintext highlighter-rouge">foo :: MonadCatch m => m a</code>, then you can specialize the type to any <code class="language-plaintext highlighter-rouge">MonadCatch</code> instance. For <code class="language-plaintext highlighter-rouge">(MonadCatch m, MonadIO m)</code>, you can specialize to <code class="language-plaintext highlighter-rouge">m ~ IO</code>. Finally, you can use <code class="language-plaintext highlighter-rouge">control</code> to generalize it. Given: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Control.Exception.Safe catch :: (MonadCatch m, Exception e) => m a -> (e -> m a) -> m a catchLifted :: ( MonadBaseControl IO m , Exception e , StM m a ~ a ) => m a -> (e -> m a) -> m a catchLifted action handler = control $ \runInIO -> (runInIO action) `catch` (\e -> runInIO (handler e)) </code></pre></div></div> This works because we can specialize <code class="language-plaintext highlighter-rouge">MonadCatch m => m a</code> into <code class="language-plaintext highlighter-rouge">IO a</code>. Tue, 21 Nov 2017 00:00:00 +0000 https://www.parsonsmatt.org/2017/11/21/monadbasecontrol_in_five_minutes.html https://www.parsonsmatt.org/2017/11/21/monadbasecontrol_in_five_minutes.html Contributing to GHC UPDATED 2019-08-22: See the end of the post for how this turned out. This post serves as notes and explorations of my first patch to GHC. I’m going to start from the very beginning – so it might be kind of boring! <h1 id="get-on-that-documentation">Get on that documentation</h1> First thing’s first: <a href="https://ghc.haskell.org/trac/ghc/">check out the documentation</a>. 
It was a little overwhelming at first, until I noticed the <a href="https://ghc.haskell.org/trac/ghc/wiki/Newcomers">Newcomer’s info</a> link. This one is much easier to get started with – has all the directions you need to get GHC building! I ran the commands, GHC downloaded and built, and soon all was ready to go. <h1 id="pick-a-ticket">Pick a Ticket</h1> The next thing to do is to pick a ticket to implement. The Newcomer’s Info page has a list of newcomer friendly tickets. <a href="https://ghc.haskell.org/trac/ghc/ticket/12389">This ticket</a> seemed relatively easy – a small change to the parsing rules to allow trailing and leading commas to data constructor export lists. I made a comment indicating that I would try to make a patch for it. <h1 id="explore-the-source-code">Explore the source code!</h1> This was fun – GHC has a huge codebase. Fortunately, the parser was relatively easy to locate. There’s a directory <code class="language-plaintext highlighter-rouge">compiler</code> that has the compiler, and <code class="language-plaintext highlighter-rouge">parser</code> is located right under there. As an aside: I use FZF to fuzzy locate files in projects via <a href="https://github.com/junegunn/fzf.vim">fzf.vim</a>. This lets me open vim up, type some barely coherent garbage in, and mostly find what I am looking for. I do this as my primary means of exploring a code base – I’ll whack <code class="language-plaintext highlighter-rouge"><leader>e</code> which lets me fuzzy search file names. The Haskell language parser uses <a href="https://www.haskell.org/happy/">Happy, the parser generator for Haskell</a>. I used Happy briefly when working on the Appel compiler book, but have otherwise never used a parser generator. <h1 id="implement-a-test">Implement a test!</h1> How do testing!? 
I used my <code class="language-plaintext highlighter-rouge">FZF</code> trick from above and did a fuzzy search for <code class="language-plaintext highlighter-rouge">testparse</code>. This showed me that there was a <code class="language-plaintext highlighter-rouge">testsuite</code> directory that included a <code class="language-plaintext highlighter-rouge">tests/parser</code> directory. There’s a <code class="language-plaintext highlighter-rouge">testsuite/README.md</code> file that has more information on how tests run, and also a <a href="https://ghc.haskell.org/trac/ghc/wiki/Building/RunningTests">link to the official test suite documentation</a>. That page included a <a href="https://ghc.haskell.org/trac/ghc/wiki/Building/RunningTests/Adding">link on how to add a test</a>. I ran the test suite at first, and – holy crap – it takes a LONG time to run the entire GHC test suite! So I killed it, reread the README, found that you can run just a section of the test suite, and then did that. I followed the directions to add a test and came up with <a href="https://github.com/parsonsmatt/ghc/commit/b3544c708d73ea42af5468814fceffb99dd844d6">this commit</a>. The test failed, so I’m good to go. <h1 id="write-a-fix">Write a fix!</h1> Now, it’s time to get that test passing. I’m not super familiar with parser combinators, so I tried stuff until the test passed. The test suite readme had instructions on running a single test with <code class="language-plaintext highlighter-rouge">make TEST=T12389</code> which I gladly took advantage of. To make compiling things faster, you can <code class="language-plaintext highlighter-rouge">cd</code> into the relevant directory and run <code class="language-plaintext highlighter-rouge">make</code>. For the parser, that’s <code class="language-plaintext highlighter-rouge">compiler</code>. That work ended up with <a href="https://github.com/parsonsmatt/ghc/commit/97561e566b1524a971ab4511ce26b6c8623438b4">this commit</a>.
I made the test example a little bigger to test for more stuff. <h1 id="make-a-pr">Make a PR!</h1> This is where things get Weird, if you’re like me and you started getting into open source after GitHub had essentially established world dominance. GHC is the first project I’ve ever worked on that wasn’t hosted on GitHub or a GitHub-like site such as BitBucket or GitLab. But, before I get too excited, I look at the instructions for <a href="https://ghc.haskell.org/trac/ghc/wiki/WorkingConventions/FixingBugs">how to submit a patch to GHC</a>. After making the commits, it asks to “validate the commits” using <a href="https://ghc.haskell.org/trac/ghc/wiki/TestingPatches">Travis or a validation script</a>. <h1 id="er-validate-commits">Er, Validate Commits!</h1> Looks like I can just run <code class="language-plaintext highlighter-rouge">./validate</code> in the GHC repo. Except that gets me a <code class="language-plaintext highlighter-rouge">sphinx-build</code> not found error. Fortunately, that’s easy to fix by installing it via apt-get. <code class="language-plaintext highlighter-rouge">./validate</code> then proceeds to run for a VERY long time. … Still running… Ok, at this point I got impatient. <h1 id="make-a-pr-again">Make a PR, again!</h1> <a href="https://ghc.haskell.org/trac/ghc/wiki/Phabricator">The directions for submitting a patch using Phabricator</a> are pretty great. I ran the relevant commands, and now <a href="https://phabricator.haskell.org/D4134">my patch is on the website</a>. Neat! At this point, it’s time to wait for code review, implement any requested changes, and then pat myself on the back for contributing to GHC. You can do it, too! <h1 id="update-2019-08-22">UPDATE: 2019-08-22</h1> It’s been almost two years since this whole process got started. The contribution did not land. Here’s what happened.
<a href="https://phabricator.haskell.org/D4134#115719"><code class="language-plaintext highlighter-rouge">hvr</code> left a comment</a> indicating that this had been thought through, and it would need a much larger investment of time and energy. I was told to make a GHC proposal so that it could go through the official process. So, <a href="https://github.com/ghc-proposals/ghc-proposals/pull/87">I made an official GHC proposal</a>. The proposal discussion ballooned to 166 comments at the time of writing this update. Here’s my summary: <ul> <li>Matt: Here’s a proposal!</li> <li>Community: This is good, but it must be expanded to cover all cases of comma-separated enumerations.</li> <li>Matt: OK, here’s the updated proposal. There’s an interaction with <code class="language-plaintext highlighter-rouge">TupleSections</code>, though. How should we handle it?</li> <li>Community: bikeshedding about syntax for about 1.5 years</li> <li>Committee: Matt can you address concerns/questions?</li> <li>Matt: I’ve addressed everything as well as I can, as far as I know it’s just up to a vote.</li> <li>Committee: Matt can you address concerns/questions?</li> <li>Matt: Can you be more specific about the questions/concerns?</li> <li>Committee:</li> <li>Committee:</li> <li>Committee: Matt can you address concerns/questions?</li> <li> Matt: OK, so there are two options for this: <ol> <li>Make the extensions <code class="language-plaintext highlighter-rouge">ExtraCommas</code> and <code class="language-plaintext highlighter-rouge">TupleSections</code> completely incompatible.</li> <li>Exclude tuples from <code class="language-plaintext highlighter-rouge">ExtraCommas</code>.</li> </ol> Let’s pick one and get on with this. I want this off my plate so if you haven’t decided in a week then I’m closing it out. 
</li> <li>Committee:</li> <li>Committee:</li> <li>Matt: OK, closing it out.</li> <li>Committee:</li> <li>Committee: Okay, we’ve decided to accept it!</li> <li>SPJ: <a href="https://github.com/ghc-proposals/ghc-proposals/pull/87#issuecomment-506664603">Wait, hold on, I thought we were voting on the first proposal and I hadn’t actually read what we were voting on! What if…</a></li> </ul> And so I will not be contributing to GHC until they’ve significantly improved their processes for newcomers. Sun, 29 Oct 2017 00:00:00 +0000 https://www.parsonsmatt.org/2017/10/29/contributing_to_ghc.html LambdaConf 2017 Talk I’m happy to say that my talk at LambdaConf 2017 is finally posted! In “<a href="https://www.youtube.com/watch?v=Ej5FQtEgTBw">I command you to be free!</a>”, I motivate the virtues of reifying programs as values using the command pattern. The command pattern has a very natural evolution into the free monad, and it can be implemented in a way that’s idiomatic and convenient for both object-oriented and functional programming languages. This strategy has worked well to improve the correctness and testability of the PHP codebase at work. Sun, 15 Oct 2017 00:00:00 +0000 https://www.parsonsmatt.org/2017/10/15/lambdaconf_2017_talk.html Type Safety Back and Forth Types are a powerful construct for improving program safety. Haskell has a few notable ways of handling potential failure, the most famous being the venerable <code class="language-plaintext highlighter-rouge">Maybe</code> type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Maybe a = Nothing | Just a
</code></pre></div></div> We can use <code class="language-plaintext highlighter-rouge">Maybe</code> as the result of a function to indicate: <blockquote> Hey, friend! This function might fail. 
You’ll need to handle the <code class="language-plaintext highlighter-rouge">Nothing</code> case. </blockquote> This allows us to write functions like a safe division function: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>safeDivide :: Int -> Int -> Maybe Int safeDivide i 0 = Nothing safeDivide i j = Just (i `div` j) </code></pre></div></div> I like to think of this as pushing the responsibility for failure forward. I’m telling the caller of the code that they can provide whatever <code class="language-plaintext highlighter-rouge">Int</code>s they want, but that some condition might cause them to fail. And the caller of the code has to handle that failure later on. This is the easiest technique to show and tell, because it’s one-size-fits-all. If your function can fail, just slap <code class="language-plaintext highlighter-rouge">Maybe</code> or <code class="language-plaintext highlighter-rouge">Either</code> on the result type and you’ve got safety. I can write a 35 line blog post to show off the technique, and if I were feeling frisky, I could use it as an introduction to <code class="language-plaintext highlighter-rouge">Functor</code>, <code class="language-plaintext highlighter-rouge">Monad</code>, and all that jazz. Instead, I’d like to share another technique. Rather than push the responsibility for failure forward, let’s explore pushing it back. This technique is a little harder to show, because it depends on the individual cases you might use. If pushing responsibility forward means accepting whatever parameters and having the caller of the code handle possibility of failure, then pushing it back is going to mean we accept stricter parameters that we can’t fail with. 
Let’s consider <code class="language-plaintext highlighter-rouge">safeDivide</code>, but with a more lax type signature: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>safeDivide :: String -> String -> Maybe Int safeDivide iStr jStr = do i <- readMay iStr j <- readMay jStr guard (j /= 0) pure (i `div` j) </code></pre></div></div> This function takes two strings, and then tries to parse <code class="language-plaintext highlighter-rouge">Int</code>s out of them. Then, if the <code class="language-plaintext highlighter-rouge">j</code> parameter isn’t <code class="language-plaintext highlighter-rouge">0</code>, we return the result of division. This function is safe, but we have a much larger space of calls to <code class="language-plaintext highlighter-rouge">safeDivide</code> that fail and return <code class="language-plaintext highlighter-rouge">Nothing</code>. We’ve accepted more parameters, but we’ve pushed a lot of responsibility forward for handling possible failure. Let’s push the failure back. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>safeDivide :: Int -> NonZero Int -> Int safeDivide i (NonZero j) = i `div` j </code></pre></div></div> We’ve required that users provide us a <code class="language-plaintext highlighter-rouge">NonZero Int</code> rather than any old <code class="language-plaintext highlighter-rouge">Int</code>. We’ve pushed back against the callers of our function: <blockquote> No! You must provide a <code class="language-plaintext highlighter-rouge">NonZero Int</code>. I refuse to work with just any <code class="language-plaintext highlighter-rouge">Int</code>, because then I might fail, and that’s annoying. </blockquote> So speaks our valiant little function, standing up for itself! Let’s implement <code class="language-plaintext highlighter-rouge">NonZero</code>. 
We’ll take advantage of Haskell’s <code class="language-plaintext highlighter-rouge">PatternSynonyms</code> language extension to allow people to pattern match on a “constructor” without exposing a way to unsafely construct values. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE PatternSynonyms #-} module NonZero ( NonZero() , pattern NonZero , unNonZero , nonZero ) where newtype NonZero a = UnsafeNonZero a pattern NonZero a <- UnsafeNonZero a unNonZero :: NonZero a -> a unNonZero (UnsafeNonZero a) = a nonZero :: (Num a, Eq a) => a -> Maybe (NonZero a) nonZero 0 = Nothing nonZero i = Just (UnsafeNonZero i) </code></pre></div></div> This module allows us to push the responsibility for type safety backwards onto callers. As another example, consider <code class="language-plaintext highlighter-rouge">head</code>. Here’s the unsafe, convenient variety: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>head :: [a] -> a head (x:xs) = x head [] = error "oh no" </code></pre></div></div> This code is making a promise that it can’t keep. Given the empty list, it will fail at runtime. Let’s push the responsibility for safety forward: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>headMay :: [a] -> Maybe a headMay (x:xs) = Just x headMay [] = Nothing </code></pre></div></div> Now, we won’t fail at runtime. We’ve required the caller to handle a <code class="language-plaintext highlighter-rouge">Nothing</code> case. Let’s try pushing it back now: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>headOr :: a -> [a] -> a headOr def (x:xs) = x headOr def [] = def </code></pre></div></div> Now, we’re requiring that the caller of the function handle possible failure before they ever call this. There’s no way to get it wrong. 
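To see what pushing back looks like at a call site, here’s a minimal sketch of a caller that starts from raw text and has to earn a <code class="language-plaintext highlighter-rouge">NonZero Int</code> before it can divide. The <code class="language-plaintext highlighter-rouge">divideInput</code> name is hypothetical, and the newtype is redefined inline only to keep the example self-contained; in the module above, the <code class="language-plaintext highlighter-rouge">UnsafeNonZero</code> constructor stays hidden.

```haskell
import Text.Read (readMaybe)

-- Stand-in for the NonZero module above. In a real project the
-- UnsafeNonZero constructor would stay hidden behind the module boundary.
newtype NonZero a = UnsafeNonZero a

nonZero :: (Num a, Eq a) => a -> Maybe (NonZero a)
nonZero 0 = Nothing
nonZero i = Just (UnsafeNonZero i)

-- Total: there is no way to hand this function a zero divisor.
safeDivide :: Int -> NonZero Int -> Int
safeDivide i (UnsafeNonZero j) = i `div` j

-- Hypothetical boundary function: raw text comes in, and every failure
-- (bad parse, zero divisor) is dealt with here, before safeDivide runs.
divideInput :: String -> String -> Maybe Int
divideInput iStr jStr = do
  i <- readMaybe iStr
  j <- readMaybe jStr >>= nonZero
  pure (safeDivide i j)
```

All of the <code class="language-plaintext highlighter-rouge">Maybe</code> handling lives in the one boundary function, and <code class="language-plaintext highlighter-rouge">safeDivide</code> itself stays total. <code class="language-plaintext highlighter-rouge">headOr</code> makes the same trade: the caller settles the empty-list question by choosing the default up front.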
Alternatively, we can use a type for nonempty lists! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data NonEmpty a = a :| [a] safeHead :: NonEmpty a -> a safeHead (x :| xs) = x </code></pre></div></div> This one works just as well. We’re requiring that the calling code handle failure ahead of time. A more complicated example of this technique is the <a href="https://hackage.haskell.org/package/justified-containers-0.1.2.0/docs/Data-Map-Justified-Tutorial.html"><code class="language-plaintext highlighter-rouge">justified-containers</code></a> library. The library uses the type system to prove that a given key exists in the underlying <code class="language-plaintext highlighter-rouge">Map</code>. From that point on, lookups using those keys are total: they are guaranteed to return a value, and they don’t return a <code class="language-plaintext highlighter-rouge">Maybe</code>. This works even if you <code class="language-plaintext highlighter-rouge">map</code> over the <code class="language-plaintext highlighter-rouge">Map</code> with a function, transforming values. You can also use it to ensure that two maps share related information. It’s a powerful feature, beyond just having type safety. <h1 id="the-ripple-effect">The Ripple Effect</h1> When some piece of code hands us responsibility, we have two choices: <ol> <li>Handle that responsibility.</li> <li>Pass it to someone else!</li> </ol> In my experience, developers will tend to push responsibility in the same direction that the code they call does. So if some function returns a <code class="language-plaintext highlighter-rouge">Maybe</code>, the developer is going to be inclined to also return a <code class="language-plaintext highlighter-rouge">Maybe</code> value. 
If some function requires a <code class="language-plaintext highlighter-rouge">NonEmpty Int</code>, then the developer is going to be inclined to also require a <code class="language-plaintext highlighter-rouge">NonEmpty Int</code> be passed in. This played out in my work codebase. We have a type representing an <code class="language-plaintext highlighter-rouge">Order</code> with many <code class="language-plaintext highlighter-rouge">Item</code>s in it. Originally, the type looked something like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Order = Order { items :: [Item] } </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">Item</code>s contained nearly all of the interesting information in the order, so almost everything that we did with an <code class="language-plaintext highlighter-rouge">Order</code> would need to return a <code class="language-plaintext highlighter-rouge">Maybe</code> value to handle the empty list case. This was a lot of work, and a lot of <code class="language-plaintext highlighter-rouge">Maybe</code> values! The type is too permissive. As it happens, an <code class="language-plaintext highlighter-rouge">Order</code> may not exist without at least one <code class="language-plaintext highlighter-rouge">Item</code>. So we can make the type more restrictive and have more fun! We redefined the type to be: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Order = Order { items :: NonEmpty Item } </code></pre></div></div> All of the <code class="language-plaintext highlighter-rouge">Maybe</code>s relating to the empty list were purged, and all of the code was pure and free. 
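A minimal sketch of what that purge looks like, assuming a hypothetical <code class="language-plaintext highlighter-rouge">Item</code> with just a name and a price (the real type isn’t shown here). The only function that can fail is the one that builds an <code class="language-plaintext highlighter-rouge">Order</code> from a plain list; everything downstream is total.

```haskell
import Data.List.NonEmpty (NonEmpty, nonEmpty)
import qualified Data.List.NonEmpty as NE

-- Hypothetical Item: a stand-in with two illustrative fields.
data Item = Item { itemName :: String, itemPrice :: Int }

data Order = Order { items :: NonEmpty Item }

-- The one place that can fail: turning an arbitrary list into an Order.
mkOrder :: [Item] -> Maybe Order
mkOrder = fmap Order . nonEmpty

-- Total functions: no Maybe needed, because an Order always has an Item.
firstItem :: Order -> Item
firstItem = NE.head . items

mostExpensive :: Order -> Int
mostExpensive = maximum . fmap itemPrice . items
```

With the <code class="language-plaintext highlighter-rouge">[Item]</code> version, both <code class="language-plaintext highlighter-rouge">firstItem</code> and <code class="language-plaintext highlighter-rouge">mostExpensive</code> would have had to return a <code class="language-plaintext highlighter-rouge">Maybe</code> or risk a runtime error on the empty list.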
The failure case (an empty list of orders) was moved to two sites: <ol> <li>Decoding JSON</li> <li>Decoding database rows</li> </ol> Decoding JSON happens at the API side of things, when various services <code class="language-plaintext highlighter-rouge">POST</code> updates to us. Now, we can respond with a <code class="language-plaintext highlighter-rouge">400</code> error and tell API clients that they’ve provided invalid data! This prevents our data from going bad. Decoding database rows is even easier. We use an <code class="language-plaintext highlighter-rouge">INNER JOIN</code> when retrieving <code class="language-plaintext highlighter-rouge">Order</code>s and <code class="language-plaintext highlighter-rouge">Item</code>s, which guarantees that each <code class="language-plaintext highlighter-rouge">Order</code> will have at least one <code class="language-plaintext highlighter-rouge">Item</code> in the result set. Foreign keys ensure that each <code class="language-plaintext highlighter-rouge">Item</code>’s <code class="language-plaintext highlighter-rouge">Order</code> is actually present in the database. This does leave the possibility that an <code class="language-plaintext highlighter-rouge">Order</code> might be orphaned in the database, but it’s mostly safe. When we push our type safety back, we’re encouraged to continue pushing it back. Eventually, we push it all the way back – to the edges of our system! This simplifies all of the code and logic inside of the system. We’re taking advantage of types to make our code simpler, safer, and easier to understand. <h1 id="ask-only-what-you-need">Ask Only What You Need</h1> In many senses, designing our code with type safety in mind is about being as strict as possible about your possible inputs. 
Haskell makes this easier than many other languages, but there’s nothing stopping you from writing a function that can take literally any binary value, do whatever effects you want, and return whatever binary value: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foobar :: ByteString -> IO ByteString
</code></pre></div></div> A <code class="language-plaintext highlighter-rouge">ByteString</code> is a totally unrestricted data type. It can contain any sequence of bytes. Because it can express any value, we have very few guarantees about what it actually contains, and we are very limited in how we can safely handle it. By restricting our past, we gain freedom in the future. Wed, 11 Oct 2017 00:00:00 +0000 https://www.parsonsmatt.org/2017/10/11/type_safety_back_and_forth.html What does Free buy us? Let’s talk about free monads. Why are they free? Do monads ordinarily cost us something? 
The category theory intuition for “free” roughly expands to: <blockquote> This structure gives you a free X when given a Y </blockquote> So, when we talk about the typical free monad type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Free f a = Pure a | Free (f (Free f a)) </code></pre></div></div> this really expands to: <blockquote> This structure gives you a free monad for a given functor </blockquote> This expansion is witnessed by the instance: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance (Functor f) => Monad (Free f) where return = Pure Pure a >>= k = k a Free m >>= k = Free ((>>= k) <$> m) </code></pre></div></div> The instance says: “If your <code class="language-plaintext highlighter-rouge">f</code> is a <code class="language-plaintext highlighter-rouge">Functor</code>, then a <code class="language-plaintext highlighter-rouge">Free f</code> is a <code class="language-plaintext highlighter-rouge">Monad</code>.” The exact implementation is less important. <h1 id="free-monoids">Free Monoids</h1> We say that “List is the free monoid.” What we mean is that: <blockquote> This structure gives you a free monoid for a given type. </blockquote> So we can equip any value with list and it becomes a monoid, for free. This expansion is witnessed by the instance: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Monoid [a] where mempty = [] mappend xs ys = xs ++ ys </code></pre></div></div> Are there other free monoids? Yes! We have a free monoid for a given semigroup. 
This is the more moral instance of <code class="language-plaintext highlighter-rouge">Monoid</code> for <code class="language-plaintext highlighter-rouge">Maybe</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance (Semigroup a) => Monoid (Maybe a) where
    mempty = Nothing
    mappend (Just a) (Just b) = Just (a <> b)
    mappend Nothing (Just b) = Just b
    mappend (Just a) Nothing = Just a
    mappend Nothing Nothing = Nothing

-- where `<>` comes from Data.Semigroup
</code></pre></div></div> <h1 id="whats-the-point">What’s the point?</h1> Great question! What do these constructs buy us? What is the alternative to using <code class="language-plaintext highlighter-rouge">Free</code>’s <code class="language-plaintext highlighter-rouge">Monad</code> instance for a given <code class="language-plaintext highlighter-rouge">Functor</code>? The common motivation for <code class="language-plaintext highlighter-rouge">Free</code> is writing a data structure that we can build up specialized programs in, and then vary the interpretation. So let’s do this without free. We’ve been tasked with writing the billing logic for our SaaS product. We’re going to construct a data type that represents a program where we check a user’s balance and either charge them, notify them, or cancel their subscriptions based on various factors. We absolutely need to get this right, so we’re doing this weird heavyweight technique to improve our confidence in its correctness. 
<h1 id="sum-commands">Sum Commands</h1> First, we need to represent the various things we want to do: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data BillingProgram = GetUserBalance | GetUserLastPaymentDate | CancelSubscription | ChargeUser | SendLateNotice </code></pre></div></div> These are the commands we need to do: <ol> <li>We need a way to get the user’s current account balance.</li> <li>We need a way to get the user’s last successful payment date.</li> <li>We need a way to cancel a user’s subscription.</li> <li>We need to be able to charge the user.</li> <li>We need to be able to send the user a late notice.</li> </ol> This data type is a sum type that represents the possible commands we can issue to the system. Now, we can construct “programs” using just this type! Here is a basic interpreter for this data type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data BillingState = BillingState { userId :: UserId , userBalance :: Double , userSubscription :: SubscriptionId , lastPaymentDate :: Day } interpret :: BillingProgram -> StateT BillingState IO () interpret GetUserBalance = do id <- gets userId balance <- liftIO $ Stripe.getUserBalance id modify (\s -> s { userBalance = balance }) interpret GetUserLastPaymentDate = -- etc... </code></pre></div></div> We can have our logic construct a <code class="language-plaintext highlighter-rouge">[BillingProgram]</code> value, and then use <code class="language-plaintext highlighter-rouge">mapM_ interpret</code> over that. However, this is really inflexible. We’re required to store every bit of state in the interpreter, as well as the current user and subscription that we’re working on. Let’s delegate some of that work to our command type. 
In a sense, this is sort of like the code we might write in a highly stateful, imperative OOP context: <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class BillingProgram {
    private UserId userId;
    private double userBalance;
    private SubscriptionId userSubscription;
    private Day lastPaymentDate;

    public BillingProgram(UserId u) { /* etc... */ }

    public void runLogic() {
        getUserBalance();
        if (this.userBalance > 100) {
            chargeUser();
        } else {
            sendBalanceNotice();
        }
    }
}
</code></pre></div></div> The command data type has no way of communicating arguments, and it has no way of communicating a return value. This is no fun. <h1 id="commands-with-info">Commands, with info!</h1> We want our commands to contain the information they need in order to be able to do work. Rather than a simple signal to our interpreter on what action to take, we’ll also include the parameters that we wish to act on. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data BillingProgram
    = GetUserBalance UserId
    | GetUserLastPaymentDate UserId
    | CancelSubscription UserId PlanId
    | ChargeUser UserId Double
    | SendLateNotice PlanId Email
</code></pre></div></div> Now, we’ve augmented our data type. Interpreting this has become a lot easier – we no longer need to carry the user ID in the state, or the last payment date. These are just things we can interpret and request. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>interpret :: BillingProgram -> IO ()
interpret (GetUserBalance userId) =
    Stripe.getUserBalance userId
interpret (GetUserLastPaymentDate userId) =
    Stripe.getLastPaymentDate userId
interpret (CancelSubscription userId planId) = do
    subscriptions <- Stripe.getSubscriptionsFor userId
    for_ subscriptions $ \sub -> do
        when (subPlan sub == planId) $ do
            Stripe.cancelSubscription (subId sub)
-- etc...
</code></pre></div></div> This implementation is pretty clean. Just like the previous type, we can have our logic functions create a list of these commands, and we can use <code class="language-plaintext highlighter-rouge">mapM_ interpret</code> to interpret them meaningfully. However, we have a problem: the two commands <code class="language-plaintext highlighter-rouge">GetUserBalance</code> and <code class="language-plaintext highlighter-rouge">GetUserLastPaymentDate</code> are queries. These queries have a meaningful return value. And we don’t have a way to vary behavior. The interpret function won’t type check: <code class="language-plaintext highlighter-rouge">Stripe.getUserBalance</code> doesn’t return <code class="language-plaintext highlighter-rouge">()</code>, it returns a <code class="language-plaintext highlighter-rouge">Double</code> that we want to use! So, we have two choices: <ol> <li>Refactor the data type to not have queries.</li> <li>Refactor the data type to be able to use queries.</li> </ol> Let’s explore #1 first. <h1 id="no-queries-no-masters">No Queries No Masters</h1> So, we can’t have queries in our data type. That means we need to factor all of the logic that we’d do on those queries into the individual commands. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data BillingProgram = CancelSubscriptionIfUserPaymentTooOld UserId SubscriptionId | IfBalanceGreatEnoughThenChargeUserElseSendNotice UserId SubscriptionId Email </code></pre></div></div> Ugh. 
Let’s write the interpreter: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>interpret :: BillingProgram -> IO () interpret (CancelSubscriptionIfUserPaymentTooOld userId subscriptionId) = do date <- Stripe.getLastPaymentDate userId now <- getCurrentTime when (now `diffTime` date > days 60) $ do Stripe.cancelSubscription userId subscriptionId interpret (IfBalanceGreatEnoughThenChargeUserElseSendNotice userId subscriptionId email) = do balance <- Stripe.getUserBalance userId subscription <- Stripe.getSubscriptions subscriptionId if balance > subPrice subscription then Stripe.chargeUser userId (subPrice subscription) else Email.sendBalanceNotice email subscription </code></pre></div></div> Ugh! This is horrible. OK, this approach was a mistake. Let’s try refactoring the data type to be able to use queries. <h1 id="the-question-of-next">The question of next</h1> A query is “something that informs what we might want to do next.” But our command data type doesn’t have any concept of “next” or “previous,” only “now”: Charge the user money! Send a billing email! We’d previously used lists to have a sequence of commands, and we’d execute each of them individually. Lists are a fine way to express iteration and sequencing, but they don’t allow previous commands to affect future commands. So, if we want to incorporate the idea of “next” into our data type, then we can’t use lists. We have to make it part of the type. We’ll include another field on each command: this will have the <code class="language-plaintext highlighter-rouge">BillingProgram</code> to execute after the current program. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data BillingProgram = GetUserBalance UserId BillingProgram | GetUserLastPaymentDate UserId BillingProgram | CancelSubscription UserId PlanId BillingProgram | ChargeUser UserId Double BillingProgram | SendLateNotice PlanId Email BillingProgram </code></pre></div></div> Now, these commands all have a way of expressing “Once you’re done with this command, here’s the next command you’ll want to execute.” However, we’re still not using the information from the <code class="language-plaintext highlighter-rouge">UserBalance</code> and <code class="language-plaintext highlighter-rouge">LastPaymentDate</code> commands. We can express that as a function where the construction of the next <code class="language-plaintext highlighter-rouge">BillingProgram</code> depends on the value that the interpreter returns. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data BillingProgram = GetUserBalance UserId (Double -> BillingProgram) | GetUserLastPaymentDate UserId (Day -> BillingProgram) | CancelSubscription UserId PlanId BillingProgram | ChargeUser UserId Double BillingProgram | SendLateNotice PlanId Email BillingProgram </code></pre></div></div> That does it! Now, we can express complex logic in our billing program. Let’s construct our billing program that expresses “If the user has enough balance, then charge them, otherwise send a balance notice.” <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>chargeOrEmail :: User -> Subscription -> BillingProgram chargeOrEmail user sub = GetUserBalance (userId user) $ \userBalance -> if userBalance >= subPrice sub then ChargeUser (userId user) (subPrice sub) ??? else SendLateNotice (subPlan sub) (userEmail user) ??? </code></pre></div></div> Errr, this doesn’t quite work. 
We need something in our command data type to indicate “This program is complete.” Let’s add that constructor: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data BillingProgram = GetUserBalance UserId (Double -> BillingProgram) | GetUserLastPaymentDate UserId (Day -> BillingProgram) | CancelSubscription UserId PlanId BillingProgram | ChargeUser UserId Double BillingProgram | SendLateNotice PlanId Email BillingProgram | Done </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">Done</code> constructor is the only constructor that doesn’t contain a <code class="language-plaintext highlighter-rouge">BillingProgram</code>, which means that every <code class="language-plaintext highlighter-rouge">BillingProgram</code> must end with a <code class="language-plaintext highlighter-rouge">Done</code> (or loop infinitely). Alright, we can finish our program now: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>chargeOrEmail :: User -> Subscription -> BillingProgram chargeOrEmail user sub = GetUserBalance (userId user) $ \userBalance -> if userBalance >= subPrice sub then ChargeUser (userId user) (subPrice sub) Done else SendLateNotice (subPlan sub) (userEmail user) Done </code></pre></div></div> <h1 id="meaningful-return-values">Meaningful return values</h1> But, hmm, what if we want to report on whether or not we could successfully bill the user? The command data type has no way of “returning” a value. That’s kind of unfortunate. We can change the <code class="language-plaintext highlighter-rouge">Done</code> constructor to take a value, but we don’t want to constrain the type of the value – we could potentially return all kinds of things! 
That means we need to add a type variable to the command data type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data BillingProgram ret = GetUserBalance UserId (Double -> BillingProgram ret) | GetUserLastPaymentDate UserId (Day -> BillingProgram ret) | CancelSubscription UserId PlanId (BillingProgram ret) | ChargeUser UserId Double (BillingProgram ret) | SendLateNotice PlanId Email (BillingProgram ret) | Done ret </code></pre></div></div> Alright, now we can rewrite our program to return whether or not we successfully billed the customer: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>chargeOrEmail :: User -> Subscription -> BillingProgram Bool chargeOrEmail user sub = GetUserBalance (userId user) $ \userBalance -> if userBalance >= subPrice sub then ChargeUser (userId user) (subPrice sub) (Done True) else SendLateNotice (subPlan sub) (userEmail user) (Done False) </code></pre></div></div> Very cool. We’re starting to have a reasonably fully featured language for billing our customers. However, we don’t have any tools for taking an existing program and extending it, or composing it with another one. <h1 id="extending-programs">Extending programs</h1> So, we want to extend a preexisting program. What does it mean to extend a program? To me, that suggests that we’ll start running a new program with the output of the old program. In order to get the output of the old program, we need to run it until we get to a <code class="language-plaintext highlighter-rouge">Done</code> constructor. Then, we can use the value from <code class="language-plaintext highlighter-rouge">Done</code> to continue the program. 
We’ll start with the <code class="language-plaintext highlighter-rouge">Done</code> case: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>andThen :: BillingProgram a -> (a -> BillingProgram b) -> BillingProgram b
andThen (Done ret) mkProgram = mkProgram ret
</code></pre></div></div> We take the return value from the previous program, and use it to construct the next bit of the program. Now, we just need to plumb this through the other constructors: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>andThen (GetUserBalance userId next) mkProgram =
    GetUserBalance userId (\balance -> andThen (next balance) mkProgram)
andThen (GetUserLastPaymentDate userId next) mkProgram =
    GetUserLastPaymentDate userId (\date -> andThen (next date) mkProgram)
andThen (CancelSubscription userId planId next) mkProgram =
    CancelSubscription userId planId (andThen next mkProgram)
andThen (ChargeUser userId amount next) mkProgram =
    ChargeUser userId amount (andThen next mkProgram)
andThen (SendLateNotice planId email next) mkProgram =
    SendLateNotice planId email (andThen next mkProgram)
</code></pre></div></div> We don’t want to change the existing command structure. The only thing we do here is use <code class="language-plaintext highlighter-rouge">andThen</code> to recursively walk the program until we hit <code class="language-plaintext highlighter-rouge">Done</code>, at which point we extend the program with the new program using the output of the old program. 
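To check that <code class="language-plaintext highlighter-rouge">andThen</code> really does graft the new program on at <code class="language-plaintext highlighter-rouge">Done</code>, here’s a small self-contained sketch. It trims the type down to three constructors, uses <code class="language-plaintext highlighter-rouge">Int</code> as a stand-in for <code class="language-plaintext highlighter-rouge">UserId</code>, and adds a hypothetical <code class="language-plaintext highlighter-rouge">commandLog</code> function that answers every balance query with a fixed number and lists the commands in the order they would run.

```haskell
-- Placeholder type; UserId is left abstract in the text above.
type UserId = Int

-- Trimmed to three constructors to keep the sketch short.
data BillingProgram ret
  = GetUserBalance UserId (Double -> BillingProgram ret)
  | ChargeUser UserId Double (BillingProgram ret)
  | Done ret

andThen :: BillingProgram a -> (a -> BillingProgram b) -> BillingProgram b
andThen (Done ret) mkProgram = mkProgram ret
andThen (GetUserBalance uid next) mkProgram =
  GetUserBalance uid (\balance -> andThen (next balance) mkProgram)
andThen (ChargeUser uid amount next) mkProgram =
  ChargeUser uid amount (andThen next mkProgram)

-- A program that only asks for the balance and returns it.
askBalance :: UserId -> BillingProgram Double
askBalance uid = GetUserBalance uid Done

-- Extend it: charge half of whatever balance came back.
chargeHalf :: UserId -> BillingProgram ()
chargeHalf uid =
  askBalance uid `andThen` \balance ->
    ChargeUser uid (balance / 2) (Done ())

-- Hypothetical helper (not part of the billing language): answers every
-- balance query with the same number and records each command in order.
commandLog :: Double -> BillingProgram a -> [String]
commandLog balance (GetUserBalance uid next) =
  ("balance " ++ show uid) : commandLog balance (next balance)
commandLog balance (ChargeUser uid amount next) =
  ("charge " ++ show uid ++ " " ++ show amount) : commandLog balance next
commandLog _ (Done _) = ["done"]
```

With a fixed balance of <code class="language-plaintext highlighter-rouge">100</code>, <code class="language-plaintext highlighter-rouge">commandLog 100 (chargeHalf 7)</code> lists the balance query first and the charge for <code class="language-plaintext highlighter-rouge">50.0</code> second: the function passed to <code class="language-plaintext highlighter-rouge">andThen</code> only runs once the first program has reached <code class="language-plaintext highlighter-rouge">Done</code>.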
<h1 id="huh">huh</h1> <h2 id="that-looks-familiar">that looks familiar</h2> Let’s write a non-trivial program that uses these commands: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>billingProgram :: User -> [Subscription] -> BillingProgram ()
billingProgram _ [] =
    Done ()
billingProgram user (sub:subs) =
    GetUserBalance uid $ \balance ->
        if balance > price
            then
                ChargeUser uid price theRest
            else
                SendLateNotice plan (userEmail user)
                    $ GetUserLastPaymentDate uid
                    $ \day ->
                        if day < sixtyDaysAgo
                            then CancelSubscription uid plan theRest
                            else theRest
  where
    uid = userId user
    price = subPrice sub
    plan = subPlan sub
    theRest = billingProgram user subs
    -- sixtyDaysAgo :: Day is assumed to be in scope
</code></pre></div></div> This is super clumsy and ugly to write. We have to manually iterate over the list, and we have these weird uppercase constructors everywhere. We need to manually handle lambda scopes and other such nonsense. Let’s factor out some of the common patterns here, and use the <code class="language-plaintext highlighter-rouge">andThen</code> function we wrote earlier to build programs rather than manually grafting this stuff together. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>getUserBalance :: UserId -> BillingProgram Double
getUserBalance userId = GetUserBalance userId (\amount -> Done amount)

end :: BillingProgram ()
end = Done ()

getUserLastPaymentDate :: UserId -> BillingProgram Day
getUserLastPaymentDate userId = GetUserLastPaymentDate userId (\day -> Done day)

cancelSubscription :: UserId -> PlanId -> BillingProgram ()
cancelSubscription userId planId = CancelSubscription userId planId end

-- etc, this gets pretty repetitive
</code></pre></div></div>
Alright, let’s use these and <code class="language-plaintext highlighter-rouge">andThen</code> to write the above logic out:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- sixtyDaysAgo :: Day is assumed to be in scope
billingProgram :: User -> [Subscription] -> BillingProgram ()
billingProgram _ [] = end
billingProgram user (sub:subs) =
    getUserBalance uid `andThen` \balance ->
        if balance > price
            then
                chargeUser uid price `andThen` \_ -> theRest
            else
                sendLateNotice plan (userEmail user) `andThen` \_ ->
                getUserLastPaymentDate uid `andThen` \day ->
                    if day < sixtyDaysAgo
                        then cancelSubscription uid plan `andThen` \_ -> theRest
                        else theRest
  where
    uid = userId user
    price = subPrice sub
    plan = subPlan sub
    theRest = billingProgram user subs
</code></pre></div></div>
This looks quite a bit nicer! <h1 id="ah-yes-ive-seen-this-before">ah yes, i’ve seen this before</h1> This is strongly reminding me of <code class="language-plaintext highlighter-rouge">Monad</code> at this point. Let’s write an instance of <code class="language-plaintext highlighter-rouge">Monad</code> for our type:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Monad BillingProgram where
    return = Done
    (>>=) = andThen
</code></pre></div></div>
Huh. That was easy.
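One caveat if you are compiling along: since GHC 7.10, <code class="language-plaintext highlighter-rouge">Monad</code> has <code class="language-plaintext highlighter-rouge">Applicative</code> (and therefore <code class="language-plaintext highlighter-rouge">Functor</code>) as a superclass, so the instance needs two companions before GHC accepts it. The sketch below shows one common way to supply them, via <code class="language-plaintext highlighter-rouge">liftM</code> and <code class="language-plaintext highlighter-rouge">ap</code>. It assumes a cut-down <code class="language-plaintext highlighter-rouge">BillingProgram</code> with <code class="language-plaintext highlighter-rouge">UserId</code> as <code class="language-plaintext highlighter-rouge">Int</code>; <code class="language-plaintext highlighter-rouge">chargeIfAffordable</code> and <code class="language-plaintext highlighter-rouge">countCharges</code> are hypothetical helpers added only to exercise the instance.

```haskell
import Control.Monad (ap, liftM)

-- Cut-down version of the post's type; UserId as Int is an assumption.
type UserId = Int

data BillingProgram ret
  = GetUserBalance UserId (Double -> BillingProgram ret)
  | ChargeUser UserId Double (BillingProgram ret)
  | Done ret

andThen :: BillingProgram a -> (a -> BillingProgram b) -> BillingProgram b
andThen (Done ret) mkProgram = mkProgram ret
andThen (GetUserBalance uid next) mkProgram =
  GetUserBalance uid (\balance -> andThen (next balance) mkProgram)
andThen (ChargeUser uid amount next) mkProgram =
  ChargeUser uid amount (andThen next mkProgram)

-- Functor and Applicative are derived from the Monad operations.
instance Functor BillingProgram where
  fmap = liftM

instance Applicative BillingProgram where
  pure = Done
  (<*>) = ap

instance Monad BillingProgram where
  (>>=) = andThen

getUserBalance :: UserId -> BillingProgram Double
getUserBalance uid = GetUserBalance uid Done

chargeUser :: UserId -> Double -> BillingProgram ()
chargeUser uid amount = ChargeUser uid amount (Done ())

-- Hypothetical program: charge only when the balance covers the price.
chargeIfAffordable :: UserId -> Double -> BillingProgram Bool
chargeIfAffordable uid price =
  getUserBalance uid >>= \balance ->
    if balance > price
      then chargeUser uid price >>= \_ -> pure True
      else pure False

-- Hypothetical pure runner: fixed balance, counts the ChargeUser commands.
countCharges :: Double -> BillingProgram a -> (Int, a)
countCharges _ (Done a) = (0, a)
countCharges bal (GetUserBalance _ next) = countCharges bal (next bal)
countCharges bal (ChargeUser _ _ next) =
  let (n, a) = countCharges bal next
  in (n + 1, a)
```

Defining <code class="language-plaintext highlighter-rouge">fmap</code> and <code class="language-plaintext highlighter-rouge">(&lt;*&gt;)</code> this way keeps <code class="language-plaintext highlighter-rouge">andThen</code> as the single place where the constructors are traversed.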
Now we can take advantage of <code class="language-plaintext highlighter-rouge">do</code> notation and all the functions that are generic over the monad. I’m specifically thinking of these friends:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>forM_ :: (Monad m) => [a] -> (a -> m b) -> m ()
when :: (Monad m) => Bool -> m () -> m ()
</code></pre></div></div>
Let’s rewrite our program with this new fanciness:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- sixtyDaysAgo :: Day is assumed to be in scope
billingProgram :: User -> [Subscription] -> BillingProgram ()
billingProgram user subs = forM_ subs $ \sub -> do
    let uid = userId user
        price = subPrice sub
        plan = subPlan sub
    balance <- getUserBalance uid
    if balance > price
        then do
            chargeUser uid price
        else do
            sendLateNotice plan (userEmail user)
            day <- getUserLastPaymentDate uid
            when (day < sixtyDaysAgo) $ do
                cancelSubscription uid plan
</code></pre></div></div>
Now this is some nice, readable, and idiomatic code. We’ve used the <code class="language-plaintext highlighter-rouge">Monad</code> instance to get sweet, sweet <code class="language-plaintext highlighter-rouge">do</code> notation. We haven’t tried to interpret it, yet – is this going to suck? <h1 id="interpreting-the-monad">interpreting the monad</h1> It’s fairly straightforward. Let’s do a Stripe interpreter:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>interpret :: BillingProgram a -> IO a
interpret (Done a) = pure a
interpret (ChargeUser uid price next) = do
    Stripe.chargeUser uid price
    interpret next
interpret (SendLateNotice plan email next) = do
    Email.sendLateNoticeFor plan email
    interpret next
interpret (GetUserBalance uid next) = do
    balance <- Stripe.getBalance uid
    interpret (next balance)
interpret _etc = error "you could finish me"
</code></pre></div></div>
This interpreter just walks down the command tree.
It interprets the command, and then calls the interpreter on the next command recursively. Where the next command is a function, we first acquire the value, pass it to the function to generate the next command, and then interpret the result. Can we write a test interpreter? Yes!
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>interpretTest :: BillingProgram a -> State Mock a
interpretTest (Done a) = pure a
interpretTest (ChargeUser uid price next) = do
    modify (subtractBalance uid price)
    interpretTest next
interpretTest (SendLateNotice plan email next) = do
    modify (addBillingEmail plan email)
    interpretTest next
interpretTest (GetUserBalance uid next) = do
    balance <- gets (userBalance uid)
    interpretTest (next balance)
interpretTest etc = error "finish meeee"
</code></pre></div></div>
This one doesn’t use any IO. It operates in the <code class="language-plaintext highlighter-rouge">State</code> monad, so we can keep it entirely pure. We can provide an initial <code class="language-plaintext highlighter-rouge">Mock</code> state and then make assertions on what the <code class="language-plaintext highlighter-rouge">Mock</code> looks like after we run a program. This lets us write tests without needing to mock out any IO or anything else nasty. <h1 id="what-have-we-done">what have we done?!</h1> You might be satisfied to stop here. We’ve accomplished a lot, after all! You might think: <blockquote> Dang, that was a lot of boilerplate. There was a lot of repetition in the definition of <code class="language-plaintext highlighter-rouge">andThen</code>, and the definition of the interpreter seemed awfully repetitive as well. What if I write another EDSL (embedded domain specific language)? Will I have to write all this boilerplate again? </blockquote> Let’s go deeper. Let’s write another data type for an EDSL.
This one describes a terminal interaction: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Terminal a = GetLine (String -> Terminal a) | Done a | PrintLine String (Terminal a) instance Monad Terminal where return = Done t >>= mk = case t of GetLine next -> GetLine $ \s -> next s >>= mk PrintLine str next -> PrintLine str (next >>= mk) Done a -> mk a interpret :: Terminal a -> IO a interpret (Done a) = pure a interpret (GetLine next) = do str <- getLine interpret (next str) interpret (PrintLine str next) = do putStrLn str interpret next </code></pre></div></div> There’s definitely a fair amount of boilerplate here. The structure is very similar. Let’s look at these two types and see what we can factor out: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Terminal a = GetLine (String -> Terminal a) | PrintLine String (Terminal a) | Done a data BillingProgram ret = GetUserBalance UserId (Double -> BillingProgram ret) | GetUserLastPaymentDate UserId (Day -> BillingProgram ret) | CancelSubscription UserId PlanId (BillingProgram ret) | ChargeUser UserId Double (BillingProgram ret) | SendLateNotice PlanId Email (BillingProgram ret) | Done ret </code></pre></div></div> Both of these types have a <code class="language-plaintext highlighter-rouge">Done</code> constructor, so we should be able to factor that out. Both of these types are also recursive, so we should be able to factor the recursion out. 
That means our type should have two components: <ol> <li>Factored out recursion (aka, <code class="language-plaintext highlighter-rouge">Fix</code>)</li> <li>A <code class="language-plaintext highlighter-rouge">Done</code> constructor.</li> </ol>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Free f a
    = Free (f (Free f a))
    | Done a
</code></pre></div></div>
This actually looks really similar to a list, except the recursion has an intermediate step, the <code class="language-plaintext highlighter-rouge">Free</code> constructor doesn’t carry a value of its own, and the <code class="language-plaintext highlighter-rouge">Done</code> constructor does. Let’s lay them side by side:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Free f a
    = Free (f (Free f a))
    | Done a

data List a
    = Cons a (List a)
    | Nil
</code></pre></div></div>
In fact, <code class="language-plaintext highlighter-rouge">Free</code> is more general than list! We can recover singly linked lists of <code class="language-plaintext highlighter-rouge">a</code> by choosing an appropriate functor and return type, specifically, <code class="language-plaintext highlighter-rouge">(,) a</code> and <code class="language-plaintext highlighter-rouge">()</code>:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type List a = Free ((,) a) ()

totallyAList :: List Int
totallyAList = Free (1, Free (2, Free (3, Done ())))
</code></pre></div></div>
Anyway, back to stuff people actually care about.
Now that we’ve factored out the common stuff between our two program types, let’s get some common machinery between them: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data TerminalF next = GetLine (String -> next) | PrintLine String next type Terminal = Free TerminalF getLine :: Terminal String getLine = Free (GetLine (\str -> Done str)) printLine :: String -> Terminal () printLine str = Free (PrintLine str (Done ())) </code></pre></div></div> The new command data type only has the commands we care about. We replace the explicit recursion with a <code class="language-plaintext highlighter-rouge">next</code> type variable, which the <code class="language-plaintext highlighter-rouge">Free</code> type fills in. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data BillingF next = GetUserBalance UserId (Double -> next) | ChargeUser UserId Double next | etc you get it type Billing = Free BillingF getUserBalance :: UserId -> Billing Double getUserBalance userId = Free (GetUserBalance userId Done) chargeUser :: UserId -> Double -> Billing () chargeUser uid amt = Free (ChargeUser uid amt (Done ())) </code></pre></div></div> Same – the new command data type doesn’t have to worry about <code class="language-plaintext highlighter-rouge">Done</code>, or anything else. Because <code class="language-plaintext highlighter-rouge">Free</code> has an instance of <code class="language-plaintext highlighter-rouge">Monad</code> for any <code class="language-plaintext highlighter-rouge">Functor</code>, we only have to write a <code class="language-plaintext highlighter-rouge">Functor</code> instance to make this work. I lied. We don’t even have to write that instance. We just have to ask for it! 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE DeriveFunctor #-}

data TerminalF next
    = GetLine (String -> next)
    | PrintLine String next
    deriving Functor
</code></pre></div></div>
Haskell lets us derive <code class="language-plaintext highlighter-rouge">Functor</code> for types where it can figure it out. So, the free monad instance makes it easy to write programs, but does it make interpreters easy? Yes! We can define this function:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- The forall in the first argument needs the RankNTypes extension.
foldFree :: (Functor f, Monad m) => (forall a. f a -> m a) -> Free f a -> m a
foldFree morph (Done a) = return a
foldFree morph (Free f) = do
    a <- morph f
    foldFree morph a
</code></pre></div></div>
This reads as: <blockquote> Give me a way to interpret your commands into some monad. Then give me a program built of these commands. I’ll interpret all of the commands for you. </blockquote> So we can write our terminal program as:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data TerminalF next
    = GetLine (String -> next)
    | PrintLine String next
    deriving Functor

type Terminal = Free TerminalF

interpret :: Terminal a -> IO a
interpret = foldFree morph
  where
    morph :: TerminalF a -> IO a
    morph (GetLine next) =
        next <$> getLine
    morph (PrintLine s n) = do
        putStrLn s
        pure n
</code></pre></div></div>
Note the really interesting bit of this interpreter – we don’t have to specify any recursion, at all. <code class="language-plaintext highlighter-rouge">foldFree</code> handles all of that for us. We just need to specify the bits that should happen at each step of the recursion. <h1 id="wrap-it-up">Wrap it up</h1> We’ve implemented a data type to represent a primitive set of commands. We’ve then extended those commands with arguments, which allowed us to shift complexity from the interpreter into the commands themselves.
Then, we factored the question of “what to do next” from the list data structure into the command data type. This increased the complexity of both the data type and the interpreter. However, we were able to get a <code class="language-plaintext highlighter-rouge">Monad</code> instance for our programs, which gave us a lot of awesome flexibility for writing the EDSLs. To tame that complexity, we factored the “what to do next” back out into a new data type, this time called <code class="language-plaintext highlighter-rouge">Free</code> instead of <code class="language-plaintext highlighter-rouge">List</code>. <code class="language-plaintext highlighter-rouge">Free</code> and <code class="language-plaintext highlighter-rouge">List</code> are similar, and we can use <code class="language-plaintext highlighter-rouge">Free</code> to write <code class="language-plaintext highlighter-rouge">List</code> and other interesting data structures. The only thing <code class="language-plaintext highlighter-rouge">Free</code> requires in order to give a monad to the whole type is that the <code class="language-plaintext highlighter-rouge">f</code> type parameter be a <code class="language-plaintext highlighter-rouge">Functor</code>. I did a similar dive into recursive types in <a href="https://www.parsonsmatt.org/2015/09/24/recursion.html">Recursion Excursion</a>, which you may find interesting.
Fri, 22 Sep 2017 00:00:00 +0000 https://www.parsonsmatt.org/2017/09/22/what_does_free_buy_us.html
Debugging Types: A Stream of Thought
At the day job, we’ve got a whole bunch of database models. And it’s somewhat easy to accidentally query the wrong database. Fortunately, Persistent has a mechanism for fixing that – using type-specific backends! However, Persistent’s mechanism was not designed around this sort of use case, so I’ve had to work around it.
I wrote a wrapper library called <a href="https://github.com/parsonsmatt/persistent-typed-db"><code class="language-plaintext highlighter-rouge">persistent-typed-db</code></a> to enable type safe access. It’s almost entirely vendored code from <code class="language-plaintext highlighter-rouge">persistent</code> with a phantom type variable for the database you’re accessing. I got to work on integrating the library into the work codebase, and ran into a bunch of roadblocks. As part of the debugging process at work, we’ve started writing stream-of-thought “as it happens” debugging logs. They’ve been tremendously helpful for sharing workflow, thought processes, and “why is this path a dead end?”, which doesn’t typically make its way into process documentation. Since this debugging workflow was mostly for open source stuff (Esqueleto, Persistent, and my wrapper library), I figured I’d post the entire flow here. It’s mostly stream-of-thought and the direction isn’t great, but it pretty closely mirrors the work and research I had to do to solve the problem. (For best accuracy, read along while listening <a href="https://www.youtube.com/watch?v=3jWRrafhO7M">to some fine Ghibli tunes</a>) <h1 id="persistent-typed-db"><code class="language-plaintext highlighter-rouge">persistent-typed-db</code></h1> The library will allow us to have type safety when running database queries, so that we don’t accidentally issue a texas-toast account query on an FBG master database (as an example). The library needs to be compatible with Persistent and Esqueleto to be useful. Currently: <ul> <li><code class="language-plaintext highlighter-rouge">persistent-typed-db</code> + <code class="language-plaintext highlighter-rouge">persistent:</code> great!</li> <li><code class="language-plaintext highlighter-rouge">persistent-typed-db</code> + <code class="language-plaintext highlighter-rouge">esqueleto:</code> incompatible</li> </ul> Why is persistent-typed-db incompatible with Esqueleto?
Let’s dig into the error we receive: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home/matt/Projects/sellerlabs-hs/texas-toast/src/Texas/Query/Venue.hs:60:5: error: • Couldn't match type ‘persistent-typed-db-0.0.1.0:Database.Persist.Typed.SqlFor TexAcctDb’ with ‘SqlBackend’ arising from a use of ‘from’ • In the expression: from $ \ (v `LeftOuterJoin` vs) -> do { on (just (v ^. VenueId) ==. vs ?. VenueSettingVenue); pure (v, vs) } In an equation for ‘venueWithSettings’: venueWithSettings = from $ \ (v `LeftOuterJoin` vs) -> do { on (just (v ^. VenueId) ==. vs ?. VenueSettingVenue); .... } </code></pre></div></div> The error comes from this code: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>venueWithSettings :: SqlQuery (SqlExpr (Entity Venue), SqlExpr (Maybe (Entity VenueSetting))) venueWithSettings = from $ \(v `LeftOuterJoin` vs) -> do on (just (v ^. VenueId) ==. vs ?. VenueSettingVenue) pure (v, vs) </code></pre></div></div> So, the error indicates that GHC is trying to unify <code class="language-plaintext highlighter-rouge">SqlFor TexAcctDb ~ SqlBackend</code> due to a use of <code class="language-plaintext highlighter-rouge">from</code>. What is the type of <code class="language-plaintext highlighter-rouge">from</code>, and how is it specifying <code class="language-plaintext highlighter-rouge">SqlBackend</code>? If we dig into <code class="language-plaintext highlighter-rouge">esqueleto</code>, we’ll find <code class="language-plaintext highlighter-rouge">from</code> at line 935 in Database.Esqueleto.Internal.Language: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from :: From query expr backend a => (a -> query b) -> query b from = (from_ >>=) </code></pre></div></div> Now we need to know what <code class="language-plaintext highlighter-rouge">From</code> is all about. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- | (Internal) Class that implements the tuple 'from' magic (see -- 'fromStart'). class Esqueleto query expr backend => From query expr backend a where from_ :: query a </code></pre></div></div> So <code class="language-plaintext highlighter-rouge">From</code> is a class that explains how to select a value of type <code class="language-plaintext highlighter-rouge">a</code> using a <code class="language-plaintext highlighter-rouge">query</code> that has an instance from <code class="language-plaintext highlighter-rouge">Esqueleto</code> class. We need to dig into the Esqueleto class to identify why it’s coercing the backend. Here is the class definition for Esqueleto: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class (Functor query, Applicative query, Monad query) => Esqueleto query expr backend | query -> expr backend, expr -> query backend where </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">-></code> arrows in the class definitions are “functional dependencies.” A simpler example is this guy: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class Container container element | container -> element where toList :: container -> [element] instance Container [a] a where toList = id instance Container (Set a) a where toList = Set.toList instance Container Text Char where toList = Text.unpack </code></pre></div></div> This class and instances say: “For a given type <code class="language-plaintext highlighter-rouge">container</code>, the <code class="language-plaintext highlighter-rouge">element</code> type of that container is fully determined by the container.” For <code class="language-plaintext highlighter-rouge">[a]</code> and <code class="language-plaintext highlighter-rouge">Set a</code>, the element type of the container is 
the type that it is polymorphic over. For <code class="language-plaintext highlighter-rouge">Text</code>, the element type is fixed to be <code class="language-plaintext highlighter-rouge">Char</code>. Back to the Esqueleto class definition! The functional dependencies state that the type of <code class="language-plaintext highlighter-rouge">query</code> is enough to select the type of <code class="language-plaintext highlighter-rouge">expr</code> and <code class="language-plaintext highlighter-rouge">backend</code>, and that the type of <code class="language-plaintext highlighter-rouge">expr</code> is sufficient to select the type of <code class="language-plaintext highlighter-rouge">query</code> and <code class="language-plaintext highlighter-rouge">backend</code>. Practically, this means we can only have one instance for a given <code class="language-plaintext highlighter-rouge">query</code> or <code class="language-plaintext highlighter-rouge">expr</code> type – we may not vary the <code class="language-plaintext highlighter-rouge">backend</code> and reuse query/expr types. Our type signature for <code class="language-plaintext highlighter-rouge">venueWithSettings</code> fixes the type of <code class="language-plaintext highlighter-rouge">SqlQuery</code> and <code class="language-plaintext highlighter-rouge">SqlExpr</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>venueWithSettings :: SqlQuery (SqlExpr (Entity Venue), SqlExpr (Maybe (Entity VenueSetting))) </code></pre></div></div> When we do that, that tells GHC that it can also unambiguously select the backend: <code class="language-plaintext highlighter-rouge">SqlBackend</code>! But, why does it complain that the backend is <code class="language-plaintext highlighter-rouge">SqlBackend</code>? 
It must be asking GHC what the <code class="language-plaintext highlighter-rouge">PersistEntityBackend</code> is for the records, and when that doesn’t line up with <code class="language-plaintext highlighter-rouge">SqlBackend</code>, it throws a type error. Unfortunately, GHC’s type checker does not include a step-through debugger. So we have to prod it manually. I replaced the type signature with a more polymorphic one, which should cause the compiler to defer making that selection for a bit. That might give us some clues on how we can proceed. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>venueWithSettings :: Esqueleto q e b => q (e (Entity Venue), e (Maybe (Entity VenueSetting))) </code></pre></div></div> Now, the query, expression, and backend are polymorphic again. When we attempt to compile, we get more errors: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home/matt/Projects/sellerlabs-hs/texas-toast/src/Texas/Query/Venue.hs:19:20: error: • Couldn't match type ‘persistent-typed-db-0.0.1.0:Database.Persist.Typed.SqlFor TexAcctDb’ with ‘SqlBackend’ arising from a use of ‘select’ • In the first argument of ‘(.)’, namely ‘select’ In the second argument of ‘(.)’, namely ‘select . venueById’ In the expression: fmap convert . select . venueById </code></pre></div></div> This one suggests to me that <code class="language-plaintext highlighter-rouge">select</code> is responsible for selecting <code class="language-plaintext highlighter-rouge">SqlBackend</code>, so we’ll make a note to investigate that next. 
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home/matt/Projects/sellerlabs-hs/texas-toast/src/Texas/Query/Venue.hs:36:26: error: • Couldn't match type ‘persistent-typed-db-0.0.1.0:Database.Persist.Typed.SqlFor TexAcctDb’ with ‘SqlBackend’ arising from a use of ‘select’ • In the second argument of ‘(<$>)’, namely ‘select venueWithSettings’ In the expression: toList . convert <$> select venueWithSettings In an equation for ‘getVenuesWithSettings’: getVenuesWithSettings = toList . convert <$> select venueWithSettings where convert :: [(Entity Venue, Maybe (Entity VenueSetting))] -> Map (Key Venue) (Entity Venue, Map Text (Maybe Text)) convert = fmap (fmap (venueSettingsToMap . fmap entityVal)) . foldr (\ (evenue, evenueSetting) -> Map.insertWith (\ (ev, es1) (_, es2) -> ...) (entityKey evenue) (evenue, maybeToList evenueSetting)) Map.empty </code></pre></div></div> This appears to be the same thing: <code class="language-plaintext highlighter-rouge">select</code> seems to be looking up the record backend and complaining when the type doesn’t line up. 
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home/matt/Projects/sellerlabs-hs/texas-toast/src/Texas/Query/Venue.hs:61:5: error:
    • Overlapping instances for Database.Esqueleto.Internal.Language.FromPreprocess
                                  q e b (e (Entity Venue))
        arising from a use of ‘from’
      Matching instances:
        instance (Esqueleto query expr backend, PersistEntity val,
                  PersistEntityBackend val ~ backend) =>
                 Database.Esqueleto.Internal.Language.FromPreprocess
                   query expr backend (expr (Entity val))
          -- Defined in ‘Database.Esqueleto.Internal.Language’
        instance (Esqueleto query expr backend,
                  Database.Esqueleto.Internal.Language.FromPreprocess
                    query expr backend a,
                  Database.Esqueleto.Internal.Language.FromPreprocess
                    query expr backend b,
                  Database.Esqueleto.Internal.Language.IsJoinKind join) =>
                 Database.Esqueleto.Internal.Language.FromPreprocess
                   query expr backend (join a b)
          -- Defined in ‘Database.Esqueleto.Internal.Language’
      (The choice depends on the instantiation of ‘q, b, e’
       To pick the first instance above, use IncoherentInstances
       when compiling the other instance declarations)
    • In the expression:
        from
          $ \ (v `LeftOuterJoin` vs)
              -> do { on (just (v ^. VenueId) ==. vs ?. VenueSettingVenue);
                      pure (v, vs) }
      In an equation for ‘venueWithSettings’:
          venueWithSettings
            = from
                $ \ (v `LeftOuterJoin` vs)
                    -> do { on (just (v ^. VenueId) ==. vs ?. VenueSettingVenue);
                            .... }
</code></pre></div></div>
This error mentions <code class="language-plaintext highlighter-rouge">Database.Esqueleto.Internal.Language.FromPreprocess</code>, which I’m not familiar with, so I’ll need to look at it. It is also complaining about <code class="language-plaintext highlighter-rouge">from</code>.
The first instance mentioned looks promising:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance (Esqueleto query expr backend, PersistEntity val,
          PersistEntityBackend val ~ backend) =>
         Database.Esqueleto.Internal.Language.FromPreprocess
           query expr backend (expr (Entity val))
  -- Defined in ‘Database.Esqueleto.Internal.Language’
</code></pre></div></div>
This instance requires that <code class="language-plaintext highlighter-rouge">Esqueleto query expr backend</code> is satisfied, that <code class="language-plaintext highlighter-rouge">val</code> is a <code class="language-plaintext highlighter-rouge">PersistEntity</code>, and that <code class="language-plaintext highlighter-rouge">PersistEntityBackend val ~ backend</code>, where <code class="language-plaintext highlighter-rouge">backend</code> is the one named by the Esqueleto instance. So, GHC can’t solve this type class instance unless the record’s backend is exactly the backend that Esqueleto has picked. We know that this forces it to <code class="language-plaintext highlighter-rouge">SqlExpr</code>, <code class="language-plaintext highlighter-rouge">SqlQuery</code>, and <code class="language-plaintext highlighter-rouge">SqlBackend</code> thanks to the functional dependencies (also: fundeps if you’re lazy, like me). K, back to the issues with <code class="language-plaintext highlighter-rouge">select</code>. Let’s look at its type:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>select :: ( SqlSelect a r
          , MonadIO m )
       => SqlQuery a -> SqlReadT m [r]
</code></pre></div></div>
Alright, so <code class="language-plaintext highlighter-rouge">select</code> is also taking a <code class="language-plaintext highlighter-rouge">SqlQuery</code>, which forces <code class="language-plaintext highlighter-rouge">expr ~ SqlExpr</code> and <code class="language-plaintext highlighter-rouge">backend ~ SqlBackend</code>. But it doesn’t appear to be using the backend type specifically yet. What is SqlReadT?
Doing a quick <a href="https://hoogle.haskell.org">hoogle</a> search, I get <a href="https://hoogle.haskell.org/?hoogle=SqlReadT">these results</a>, which point to <a href="https://hackage.haskell.org/package/persistent-2.7.0/docs/Database-Persist-Sql.html#t:SqlReadT">this type signature</a>:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type SqlReadT m a =
    forall backend. SqlBackendCanRead backend => ReaderT backend m a
</code></pre></div></div>
This type signature is abstracting the backend, and saying that “this query will work for all <code class="language-plaintext highlighter-rouge">backend</code>s, provided that the <code class="language-plaintext highlighter-rouge">backend</code> is an instance of <code class="language-plaintext highlighter-rouge">SqlBackendCanRead</code>.” If we give our <code class="language-plaintext highlighter-rouge">SqlFor a</code> type an instance of <code class="language-plaintext highlighter-rouge">SqlBackendCanRead</code> then we’ll be set there. So, <code class="language-plaintext highlighter-rouge">select</code> doesn’t appear to care about the records.
Let’s look at the <code class="language-plaintext highlighter-rouge">from</code> problem again: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> instance (Esqueleto query expr backend, PersistEntity val, PersistEntityBackend val ~ backend) => Database.Esqueleto.Internal.Language.FromPreprocess query expr backend (expr (Entity val)) </code></pre></div></div> The instance is saying: <blockquote> Given an instance <code class="language-plaintext highlighter-rouge">Esqueleto query expr backend</code>, and an instance for <code class="language-plaintext highlighter-rouge">PersistEntity</code> val, and requiring that <code class="language-plaintext highlighter-rouge">PersistEntityBackend val</code> have the same type as <code class="language-plaintext highlighter-rouge">backend</code>, we can provide an instance for <code class="language-plaintext highlighter-rouge">FromPreprocess</code>. </blockquote> We can open <code class="language-plaintext highlighter-rouge">stack ghci texas-toast-models</code> and ask what the type of <code class="language-plaintext highlighter-rouge">PersistEntityBackend Venue</code> is: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> :kind! PersistEntityBackend Venue PersistEntityBackend Venue :: * = Database.Persist.Typed.SqlFor TexAcctDb </code></pre></div></div> So this is exactly the problem. We need to provide an alternative way for Esqueleto to do this. Mostly, it just needs to accept that two backends are compatible. So a <code class="language-plaintext highlighter-rouge">SqlFor a</code> is compatible with <code class="language-plaintext highlighter-rouge">SqlBackend</code>, even if they’re not the same. 
If we replace <code class="language-plaintext highlighter-rouge">PersistEntityBackend val ~ backend</code> with <code class="language-plaintext highlighter-rouge">BackendCompatible val backend</code> for a suitable definition of <code class="language-plaintext highlighter-rouge">BackendCompatible</code>, then that should fix the issue. <h1 id="modifying-esqueleto">Modifying Esqueleto</h1> I prepared <a href="https://github.com/bitemyapp/esqueleto/pull/53">a patch for Esqueleto</a>. I added a class and some instances: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class BackendCompatible sup sub instance BackendCompatible SqlBackend SqlBackend instance BackendCompatible SqlBackend SqlReadBackend instance BackendCompatible SqlBackend SqlWriteBackend </code></pre></div></div> then, in the texas-toast-models repository, added an instance: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance BackendCompatible SqlBackend (SqlFor TxMasterDb) instance BackendCompatible SqlBackend (SqlFor TxAcctDb) </code></pre></div></div> I replaced the <code class="language-plaintext highlighter-rouge">PersistEntityBackend val ~ backend</code> constraints in the library with <code class="language-plaintext highlighter-rouge">BackendCompatible backend (PersistEntityBackend val)</code>, which solved the issues with <code class="language-plaintext highlighter-rouge">FromPreprocess</code>. However, we’re still getting an error: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home/matt/Projects/sellerlabs-hs/texas-toast/src/Texas/Query/Venue.hs:19:20: error: • Couldn't match type ‘persistent-typed-db-0.0.1.0:Database.Persist.Typed.SqlFor TexAcctDb’ with ‘SqlBackend’ arising from a use of ‘select’ • In the first argument of ‘(.)’, namely ‘select’ In the second argument of ‘(.)’, namely ‘select . 
venueById’ In the expression: fmap convert . select . venueById /home/matt/Projects/sellerlabs-hs/texas-toast/src/Texas/Query/Venue.hs:36:26: error: • Couldn't match type ‘persistent-typed-db-0.0.1.0:Database.Persist.Typed.SqlFor TexAcctDb’ with ‘SqlBackend’ arising from a use of ‘select’ • In the second argument of ‘(<$>)’, namely ‘select venueWithSettings’ In the expression: toList . convert <$> select venueWithSettings In an equation for ‘getVenuesWithSettings’: getVenuesWithSettings = toList . convert <$> select venueWithSettings where convert :: [(Entity Venue, Maybe (Entity VenueSetting))] -> Map (Key Venue) (Entity Venue, Map Text (Maybe Text)) convert = fmap (fmap (venueSettingsToMap . fmap entityVal)) . foldr (\ (evenue, evenueSetting) -> Map.insertWith (\ (ev, es1) (_, es2) -> ...) (entityKey evenue) (evenue, maybeToList evenueSetting)) Map.empty </code></pre></div></div> Same stuff as before: <code class="language-plaintext highlighter-rouge">select</code> is somehow trying to make the backend a <code class="language-plaintext highlighter-rouge">SqlBackend</code> instead of a <code class="language-plaintext highlighter-rouge">SqlFor TexAcctDb</code>. So, let’s dig into the implementation of <code class="language-plaintext highlighter-rouge">select</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>select :: ( SqlSelect a r , MonadIO m ) => SqlQuery a -> SqlReadT m [r] select query = do res <- rawSelectSource SELECT query conn <- R.ask liftIO $ with res $ flip R.runReaderT conn . runSource </code></pre></div></div> This delegates to <code class="language-plaintext highlighter-rouge">rawSelectSource</code>, which should have an answer. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>rawSelectSource :: ( SqlSelect a r , MonadIO m1 , MonadIO m2 ) => Mode -> SqlQuery a -> SqlReadT m1 (Acquire (C.Source m2 r)) rawSelectSource mode query = do conn <- persistBackend <$> R.ask res <- run conn return $ (C.$= massage) `fmap` res </code></pre></div></div> This calls <code class="language-plaintext highlighter-rouge">persistBackend <$> ask</code>. What’s <code class="language-plaintext highlighter-rouge">persistBackend</code>? <a href="https://hoogle.haskell.org/?hoogle=persistBackend">Hoogle gives us</a> this <a href="https://hackage.haskell.org/package/persistent-2.7.0/docs/Database-Persist-Class.html#v:persistBackend">top result</a>: it’s a method on the <code class="language-plaintext highlighter-rouge">HasPersistBackend</code> class. It takes a value of some type that has a <code class="language-plaintext highlighter-rouge">BaseBackend</code>, and returns the <code class="language-plaintext highlighter-rouge">BaseBackend</code> for that type. This works for <code class="language-plaintext highlighter-rouge">SqlReadBackend</code> and <code class="language-plaintext highlighter-rouge">SqlWriteBackend</code> because their backends are simply <code class="language-plaintext highlighter-rouge">SqlBackend</code>. That’s no good! The <code class="language-plaintext highlighter-rouge">persistent</code> library really wants for <code class="language-plaintext highlighter-rouge">BaseBackend backend ~ PersistEntityBackend record</code>. If we want type safety, then <code class="language-plaintext highlighter-rouge">BaseBackend backend ~ SqlFor DbName ~ PersistEntityBackend record</code> for the records we care about. So, we need to change the method to be something that grabs the <code class="language-plaintext highlighter-rouge">SqlBackend</code> out of whatever is passed on in. 
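To picture what "grabbing the backend out of whatever is passed in" could look like, here is a minimal sketch with stand-in types. Nothing in it comes from <code class="language-plaintext highlighter-rouge">persistent</code> or <code class="language-plaintext highlighter-rouge">persistent-typed-db</code>: <code class="language-plaintext highlighter-rouge">RawBackend</code>, <code class="language-plaintext highlighter-rouge">TaggedBackend</code>, <code class="language-plaintext highlighter-rouge">HasRawBackend</code>, and <code class="language-plaintext highlighter-rouge">rawBackend</code> are hypothetical names that only mimic the shape of the problem.

```haskell
-- Hypothetical stand-in for a raw SQL connection (not persistent's SqlBackend).
newtype RawBackend = RawBackend { connLabel :: String }
  deriving (Eq, Show)

-- Hypothetical stand-in for a backend tagged with the database it belongs to.
-- The phantom parameter `db` is what keeps different databases apart.
newtype TaggedBackend db = TaggedBackend { untag :: RawBackend }

-- Hypothetical database tag.
data AcctDb

-- "Whatever is passed in" only has to know how to hand over the raw backend.
class HasRawBackend backend where
  rawBackend :: backend -> RawBackend

instance HasRawBackend RawBackend where
  rawBackend = id

instance HasRawBackend (TaggedBackend db) where
  rawBackend = untag

-- Needs the raw connection, yet accepts any backend that can produce one.
describeConnection :: HasRawBackend backend => backend -> String
describeConnection backend = "running on " ++ connLabel (rawBackend backend)
```

The tagged backend never has to be equal to the raw one; it only has to be able to produce it, so the tag survives in the types of everything that calls <code class="language-plaintext highlighter-rouge">describeConnection</code>.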
<h1 id="extending-the-class">Extending the Class</h1> The class we introduced to Esqueleto provides a natural way to solve this. Rather than use <code class="language-plaintext highlighter-rouge">persistBackend</code>, which returns the <code class="language-plaintext highlighter-rouge">BaseBackend</code>, we can add a new method: <code class="language-plaintext highlighter-rouge">projectBackend</code>, which returns the large backend that the smaller backend is compatible with. This should backwards-compatibly fix the issue. I extended the class with: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class BackendCompatible sup sub where projectBackend :: sub -> sup instance BackendCompatible SqlBackend SqlBackend where projectBackend = id instance BackendCompatible SqlBackend SqlReadBackend where projectBackend = unSqlReadBackend </code></pre></div></div> This allows us to acquire a <code class="language-plaintext highlighter-rouge">SqlBackend</code> from any compatible backend. The definition for <code class="language-plaintext highlighter-rouge">select</code> and friends is also changed: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>select :: ( SqlSelect a r , MonadIO m , SqlBackendCanRead backend , BackendCompatible SqlBackend backend ) => SqlQuery a -> R.ReaderT backend m [r] select query = do res <- rawSelectSource SELECT query conn <- R.ask liftIO $ with res $ flip R.runReaderT conn . runSource rawSelectSource :: ( SqlSelect a r , MonadIO m1 , MonadIO m2 , SqlBackendCanRead backend , BackendCompatible SqlBackend backend) => Mode -> SqlQuery a -> R.ReaderT backend m1 (Acquire (C.Source m2 r)) rawSelectSource mode query = do conn <- projectBackend <$> R.ask let _ = conn :: SqlBackend res <- run conn return $ (C.$= massage) `fmap` res where ... 
</code></pre></div></div> <code class="language-plaintext highlighter-rouge">select</code> mostly just needed to change the constraints on the <code class="language-plaintext highlighter-rouge">backend</code> type. Beforehand, it was using <code class="language-plaintext highlighter-rouge">SqlBackendCanRead backend => ReaderT backend m [r]</code>. Now, it’s using that, provided that we’ve also constrained the backend to be compatible with <code class="language-plaintext highlighter-rouge">SqlBackend</code>. <code class="language-plaintext highlighter-rouge">rawSelectSource</code> has the same constraint differences. We also need to use <code class="language-plaintext highlighter-rouge">projectBackend</code> instead of <code class="language-plaintext highlighter-rouge">persistBackend</code> to convert it to the backend we want. <h1 id="sigh">sigh</h1> Unfortunately, we’re still running into the same issue. Here’s the error output we get with those changes to the library: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home/matt/Projects/sellerlabs-hs/texas-toast/src/Texas/Query/Venue.hs:19:20: error: • Couldn't match type ‘persistent-typed-db-0.0.1.0:Database.Persist.Typed.SqlFor TexAcctDb’ with ‘SqlBackend’ arising from a use of ‘select’ • In the first argument of ‘(.)’, namely ‘select’ In the second argument of ‘(.)’, namely ‘select . venueById’ In the expression: fmap convert . select . venueById /home/matt/Projects/sellerlabs-hs/texas-toast/src/Texas/Query/Venue.hs:36:26: error: • Couldn't match type ‘persistent-typed-db-0.0.1.0:Database.Persist.Typed.SqlFor TexAcctDb’ with ‘SqlBackend’ arising from a use of ‘select’ • In the second argument of ‘(<$>)’, namely ‘select venueWithSettings’ </code></pre></div></div> This is exactly the problem we had before generalizing the constraint. What gives?! 
<h1 id="type-inference-rules">type inference RULES</h1> Whenever Haskell types are confusing, it can be helpful to just blow the type signature away and see what GHC comes up with. So I deleted the type signatures that were throwing errors, and asked GHC what it thought the types should be. GHC very helpfully gave me the following: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home/matt/Projects/sellerlabs-hs/texas-toast/src/Texas/Query/Venue.hs:18:1: warning: [-Wmissing-signatures] Top-level binding with no type signature: getVenueById :: (BaseBackend backend ~ SqlBackend, PersistUniqueRead backend, PersistQueryRead backend, IsPersistBackend backend, Database.Esqueleto.Internal.Language.BackendCompatible SqlBackend backend, MonadIO m) => VenueId -> ReaderT backend m (Maybe (Entity Venue, Map Text (Maybe Text))) /home/matt/Projects/sellerlabs-hs/texas-toast/src/Texas/Query/Venue.hs:35:1: warning: [-Wmissing-signatures] Top-level binding with no type signature: getVenuesWithSettings :: (BaseBackend backend ~ SqlBackend, PersistUniqueRead backend, PersistQueryRead backend, IsPersistBackend backend, Database.Esqueleto.Internal.Language.BackendCompatible SqlBackend backend, MonadIO m) => ReaderT backend m [(Entity Venue, Map Text (Maybe Text))] </code></pre></div></div> Ah HAH! The synonym <code class="language-plaintext highlighter-rouge">SqlBackendCanRead</code> must be carrying around that <code class="language-plaintext highlighter-rouge">BaseBackend backend ~ SqlBackend</code> constraint. That’s what’s borking this. 
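A constraint synonym expands to everything on its right-hand side, equality constraints included, which is why GHC reports the expanded pieces rather than the alias name. Here is a small sketch of that effect. All of the names (<code class="language-plaintext highlighter-rouge">Base</code>, <code class="language-plaintext highlighter-rouge">Raw</code>, <code class="language-plaintext highlighter-rouge">Tagged</code>, <code class="language-plaintext highlighter-rouge">Readable</code>, <code class="language-plaintext highlighter-rouge">CanRead</code>) are made up for illustration and are not the real <code class="language-plaintext highlighter-rouge">persistent</code> definitions.

```haskell
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeFamilies #-}

-- Hypothetical stand-ins; none of these names come from persistent.
data Raw = Raw
newtype Tagged = Tagged Raw

-- Plays the role of a "base backend" type family.
type family Base backend
type instance Base Raw = Raw
type instance Base Tagged = Tagged

class Readable backend where
  describe :: backend -> String

instance Readable Raw where
  describe Raw = "raw"

instance Readable Tagged where
  describe (Tagged _) = "tagged"

-- Looks harmless at use sites, but it carries an equality constraint.
type CanRead backend = (Readable backend, Base backend ~ Raw)

-- Only backends whose Base is Raw are accepted here.
viaAlias :: CanRead backend => backend -> String
viaAlias = describe

-- Dropping the alias drops the hidden equality, so Tagged is accepted too.
viaClass :: Readable backend => backend -> String
viaClass = describe

-- `viaAlias (Tagged Raw)` is rejected by the type checker,
-- because Base Tagged is Tagged, not Raw.
```

Nothing in the signature of <code class="language-plaintext highlighter-rouge">viaAlias</code> mentions an equality, yet the equality is what rejects the tagged backend.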
Indeed, <a href="https://hackage.haskell.org/package/persistent-2.7.0/docs/Database-Persist-Sql.html#t:SqlBackendCanRead">the Hackage docs for SqlBackendCanRead</a> show that it is a constraint alias: it aliases all of <code class="language-plaintext highlighter-rouge">IsSqlBackend</code>, <code class="language-plaintext highlighter-rouge">PersistQueryRead</code>, <code class="language-plaintext highlighter-rouge">PersistStoreRead</code>, and <code class="language-plaintext highlighter-rouge">PersistUniqueRead</code>. <code class="language-plaintext highlighter-rouge">IsSqlBackend</code> is also an alias: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type IsSqlBackend backend = (IsPersistBackend backend, BaseBackend backend ~ SqlBackend) </code></pre></div></div> Well, that’s our issue. So now we need to unwrap all those aliases, toss the <code class="language-plaintext highlighter-rouge">BaseBackend</code> requirement, and instead use the <code class="language-plaintext highlighter-rouge">BackendCompatible</code> class. Wed, 13 Sep 2017 00:00:00 +0000 https://www.parsonsmatt.org/2017/09/13/debugging_types_a_stream_of_thought.html Using GHC CallStacks (Note: this blog post has code accessible on <a href="https://github.com/parsonsmatt/callstacks-what-even">this GitHub repository</a>. You can follow along there if you’d like.) Haskell doesn’t really have a callstack. The evaluation strategy is more like a graph reduction. If you don’t understand that, that’s okay – I don’t either! All I know about it is that it makes questions like “what’s the stack trace for this error?” surprisingly difficult to answer. 
While Haskell’s debugging story tends to be rather nice (break up code into small, composable, reusable functions; take advantage of types to make errors unrepresentable where practical; write unit and property tests for the rest), it’s also great to know where errors actually come from. Coding practices like “don’t ever use partial functions like <code class="language-plaintext highlighter-rouge">head :: [a] -> a</code>” and “prefer <code class="language-plaintext highlighter-rouge">NonEmpty a</code> to <code class="language-plaintext highlighter-rouge">[a]</code> where possible” help a lot. However, you may find yourself stuck staring at <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>recv: resource vanished </code></pre></div></div> or similar, and that frankly sucks. <h2 id="ghccallstack"><code class="language-plaintext highlighter-rouge">GHC.CallStack</code></h2> GHC has a callstack simulation mechanism. The interface is the <code class="language-plaintext highlighter-rouge">HasCallStack</code> constraint, and you can include a callstack in your program by adding it to a function’s type signature: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>headNoCallStack :: [a] -> a headNoCallStack (x:xs) = x headNoCallStack [] = error "nope" headWithCallStack :: HasCallStack => [a] -> a headWithCallStack (x:xs) = x headWithCallStack [] = error "nope" </code></pre></div></div> Let’s compare the behavior of these various functions. The ordinary <code class="language-plaintext highlighter-rouge">head</code> from the Prelude gives us this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> head [] *** Exception: Prelude.head: empty list </code></pre></div></div> Well, that’s useless. No information about where it was even called! 
Our own <code class="language-plaintext highlighter-rouge">headNoCallStack</code> gives slightly better results: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> headNoCallStack [] *** Exception: nope CallStack (from HasCallStack): error, called at src/Lib.hs:7:22 in main:Lib </code></pre></div></div> We get a callstack! <code class="language-plaintext highlighter-rouge">error</code> was modified recently to carry a <code class="language-plaintext highlighter-rouge">CallStack</code> parameter, though that information is a little hidden: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> :t error error :: [Char] -> a λ> :i error error :: forall (r :: ghc-prim-0.5.0.0:GHC.Types.RuntimeRep) (a :: TYPE r). HasCallStack => [Char] -> a -- Defined in ‘GHC.Err’ </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">:info</code> output shows that <code class="language-plaintext highlighter-rouge">error</code> is polymorphic in the runtime representation (eg: the phantom type <code class="language-plaintext highlighter-rouge">a</code> can be an unlifted type like <code class="language-plaintext highlighter-rouge">Int#</code> or a lifted type like <code class="language-plaintext highlighter-rouge">Int</code>). The <code class="language-plaintext highlighter-rouge">:type</code> omits the <code class="language-plaintext highlighter-rouge">HasCallStack</code> constraint for some reason. 
When <code class="language-plaintext highlighter-rouge">headWithCallStack</code> throws that error, you’ll get extra information: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> headWithCallStack [] *** Exception: nope CallStack (from HasCallStack): error, called at src/Lib.hs:11:24 in main:Lib headWithCallStack, called at <interactive>:6:1 in interactive:Ghci1 </code></pre></div></div> This constructs a <code class="language-plaintext highlighter-rouge">CallStack</code> from <code class="language-plaintext highlighter-rouge">headWithCallStack</code> down to the <code class="language-plaintext highlighter-rouge">error</code> call. Nice! <h1 id="a-shallow-stack">A Shallow Stack</h1> How does this interact with more complex programs? Let’s write something with a bit of nesting: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>maximumCS :: (HasCallStack, Ord a) => [a] -> a maximumCS = foldr1CS max foldr1CS :: HasCallStack => (a -> a -> a) -> [a] -> a foldr1CS _ [x] = x foldr1CS k (x:xs) = k x (foldr1CS k xs) foldr1CS _ [] = error "foldr1 empty list" someProgram :: HasCallStack => [[Int]] -> Int someProgram = headWithCallStack . maximumCS </code></pre></div></div> Nothing terribly complicated, but we’re propagating that callstack all the way down. 
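The propagated stack is an ordinary value, so it can be inspected without throwing anything. A small sketch, assuming only <code class="language-plaintext highlighter-rouge">GHC.Stack</code> from <code class="language-plaintext highlighter-rouge">base</code>. The helpers <code class="language-plaintext highlighter-rouge">frameNames</code>, <code class="language-plaintext highlighter-rouge">inner</code>, and <code class="language-plaintext highlighter-rouge">outer</code> are hypothetical and not part of the program above.

```haskell
import GHC.Stack (HasCallStack, callStack, getCallStack)

-- Names of the functions on the current call stack, most recent call first.
-- getCallStack returns (function name, source location) pairs.
frameNames :: HasCallStack => [String]
frameNames = map fst (getCallStack callStack)

-- Each layer mentions HasCallStack, so each call site is pushed onto the stack.
inner :: HasCallStack => [String]
inner = frameNames

outer :: HasCallStack => [String]
outer = inner
```

Evaluating <code class="language-plaintext highlighter-rouge">outer</code> from a caller without the constraint should list <code class="language-plaintext highlighter-rouge">frameNames</code>, <code class="language-plaintext highlighter-rouge">inner</code>, and <code class="language-plaintext highlighter-rouge">outer</code>, in that order: the same chain of frames a trace would print.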
Let’s see what happens when it blows up: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> someProgram [] *** Exception: foldr1 empty list CallStack (from HasCallStack): error, called at src/Lib.hs:19:17 in main:Lib foldr1CS, called at src/Lib.hs:14:13 in main:Lib maximumCS, called at src/Lib.hs:22:35 in main:Lib someProgram, called at <interactive>:36:1 in interactive:Ghci1 λ> someProgram [[]] *** Exception: nope CallStack (from HasCallStack): error, called at src/Lib.hs:11:24 in main:Lib headWithCallStack, called at src/Lib.hs:22:15 in main:Lib someProgram, called at <interactive>:37:1 in interactive:Ghci1 </code></pre></div></div> Nice! We get a complete stack trace of everything that went wrong. When we pass it the empty list, we can see that <code class="language-plaintext highlighter-rouge">error</code> was called by <code class="language-plaintext highlighter-rouge">foldr1CS</code>, which was called by <code class="language-plaintext highlighter-rouge">maximumCS</code>, and finally <code class="language-plaintext highlighter-rouge">someProgram</code> was the main offender. When given <code class="language-plaintext highlighter-rouge">[[]]</code>, we can see that <code class="language-plaintext highlighter-rouge">headWithCallStack</code> is the one that threw the exception. Nice! <h1 id="omitting-the-callstack">Omitting the CallStack</h1> Let’s see how this works if we omit something at some point. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foo :: HasCallStack => Maybe a -> a foo (Just a) = a foo Nothing = error "foo is unpleased" bar :: Maybe a -> a bar = foo baz :: HasCallStack => Maybe a -> a baz = bar </code></pre></div></div> These are all just <code class="language-plaintext highlighter-rouge">fromJust</code> in disguise. 
<code class="language-plaintext highlighter-rouge">baz</code> delegates to <code class="language-plaintext highlighter-rouge">bar</code> and <code class="language-plaintext highlighter-rouge">bar</code> delegates to <code class="language-plaintext highlighter-rouge">foo</code>. Let’s observe the stack traces we get when we call <code class="language-plaintext highlighter-rouge">baz Nothing</code>! <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> foo Nothing *** Exception: foo is unpleased CallStack (from HasCallStack): error, called at src/Lib.hs:28:15 in main:Lib foo, called at <interactive>:44:1 in interactive:Ghci1 λ> bar Nothing *** Exception: foo is unpleased CallStack (from HasCallStack): error, called at src/Lib.hs:28:15 in main:Lib foo, called at src/Lib.hs:31:7 in main:Lib λ> baz Nothing *** Exception: foo is unpleased CallStack (from HasCallStack): error, called at src/Lib.hs:28:15 in main:Lib foo, called at src/Lib.hs:31:7 in main:Lib </code></pre></div></div> Our callstack appears to be cut off! We only get to see what happens in <code class="language-plaintext highlighter-rouge">foo</code>’s local stack. Since <code class="language-plaintext highlighter-rouge">bar</code> does not have the <code class="language-plaintext highlighter-rouge">HasCallStack</code> constraint, it doesn’t propagate any more information when the error is bubbled up. If any function in the chain does not have <code class="language-plaintext highlighter-rouge">HasCallStack</code> in the signature, then nothing above that will be represented in the stack trace. This is a pretty big limitation. <h1 id="should-i-include-a-hascallstack-constraint">Should I include a HasCallStack constraint?</h1> Great question! 
<code class="language-plaintext highlighter-rouge">HasCallStack</code> is implemented as <a href="https://hackage.haskell.org/package/base-4.10.0.0/docs/GHC-Stack.html#t:HasCallStack"><code class="language-plaintext highlighter-rouge">type HasCallStack = ?callStack :: CallStack</code></a> where the <code class="language-plaintext highlighter-rouge">?</code> means <a href="https://wiki.haskell.org/Implicit_parameters">an implicit parameter</a> in current versions of GHC. This is an extra parameter that gets passed around and handled in your program, which will affect performance. Implicit parameters can potentially interact with sharing in weird ways, which might also cause strange performance issues. <code class="language-plaintext highlighter-rouge">HasCallStack</code> is not pervasive in many libraries, so you’re unlikely to actually have a <code class="language-plaintext highlighter-rouge">CallStack</code> present in the functions you pass to library or framework code. This makes them less useful. Lastly, the GHC Exceptions machinery doesn’t have any notion of a callstack, and any proper exceptions that you throw or catch will not have a callstack: only <code class="language-plaintext highlighter-rouge">error</code> calls do. Sat, 29 Jul 2017 00:00:00 +0000 https://www.parsonsmatt.org/2017/07/29/using_ghc_callstacks.html Invert Your Mocks! Mocking comes up a lot in discussions of testing effectful code in Haskell. One of the advantages of <code class="language-plaintext highlighter-rouge">mtl</code> type classes or <code class="language-plaintext highlighter-rouge">Eff</code> freer monads is that you can swap implementations and run the same program on different underlying interpretations. This is cool! However, it’s an extremely heavyweight technique, with a ton of complexity. 
I’ve recently gravitated to mostly doing everything in this sort of type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype App a = App { unApp :: ReaderT AppCtx IO a } </code></pre></div></div> It’s simple, has great error messages, and is easy to hook into existing libraries and frameworks by writing instances for either <code class="language-plaintext highlighter-rouge">AppCtx</code> or <code class="language-plaintext highlighter-rouge">App</code>. There’s a small cost: I have to call <code class="language-plaintext highlighter-rouge">lift</code> manually if I use an <code class="language-plaintext highlighter-rouge">App a</code> function inside of a Conduit or <code class="language-plaintext highlighter-rouge">MaybeT</code> block or similar. This is a fairly small cost to pay, all told, and the benefit in getting new developers up to speed on our projects is a big sell. Now, how would I go about testing this sort of function? <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>doWork :: App () doWork = do query <- runHTTP getUserQuery users <- runDB (usersSatisfying query) for_ users $ \user -> do thing <- getSomething user let result = compute thing runRedis (writeKey (userRedisKey user) result) </code></pre></div></div> If we have our <code class="language-plaintext highlighter-rouge">mtl</code> or <code class="language-plaintext highlighter-rouge">Eff</code> or OOP mocking hats on, we might think: <blockquote> I know! We need to mock our HTTP, database, and Redis effects. Then we can control the environment using mock implementations, and verify that the results are sound! </blockquote> Let’s step back and apply some more elementary techniques to this problem. I bet we can simplify our solution to testing. 
<h1 id="decomposing-effects">Decomposing Effects</h1> The first thing we need to do is recognize that effects and values are separate, and try to keep them as separate as possible. This is a basic principle of purely functional programming, and we would be wise to heed it. Generally speaking, functions that look like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>doWork :: App () </code></pre></div></div> are not functional (in the “functional programming” sense). The only point to this is to run it for the effect it has on the outside world. We can tell just by looking at the type signature! So, let’s look at what it does, and how we might test it: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>doWork :: App () doWork = do query <- runHTTP getUserQuery users <- runDB (usersSatisfying query) for_ users $ \user -> do thing <- getSomething user let result = compute thing runRedis (writeKey (userRedisKey user) result) </code></pre></div></div> We get a bunch of stuff – inputs – that are acquired as an effect. We can make this a lot easier to test by simply taking those things as inputs. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>doWork :: App () doWork = do query <- runHTTP getUserQuery users <- runDB (usersSatisfying query) doWorkHelper users doWorkHelper :: [User] -> App () doWorkHelper users = for_ users $ \user -> do thing <- getSomething user let result = compute thing runRedis (writeKey (userRedisKey user) result) </code></pre></div></div> Now, the only effects we need to mock for <code class="language-plaintext highlighter-rouge">doWorkHelper</code> are <code class="language-plaintext highlighter-rouge">getSomething</code> and <code class="language-plaintext highlighter-rouge">runRedis</code>. But I’m not satisfied. 
We can get rid of the <code class="language-plaintext highlighter-rouge">getSomething</code> by factoring another helper out. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>doWorkHelper :: [User] -> App () doWorkHelper users = do things'users <- for users $ \user -> do thing <- getSomething user pure (thing, user) lookMaNoInputs things'users lookMaNoInputs :: [(Thing, User)] -> App () lookMaNoInputs things'users = for_ things'users $ \(thing, user) -> do let result = compute thing runRedis (writeKey (userRedisKey user) result) </code></pre></div></div> We’ve now extracted all of the “input effects.” Can we decompose this further? We can! Let’s inspect our output effect: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>runRedis (writeKey (userRedisKey user) result) </code></pre></div></div> It expects two things: <ol> <li>The user’s Redis key</li> <li>The computed result from the <code class="language-plaintext highlighter-rouge">thing</code>.</li> </ol> We can prepare the redis key and computed result fairly easily: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>businessLogic :: (Thing, User) -> (RedisKey, Result) businessLogic (thing, user) = (userRedisKey user, compute thing) lookMaNoInputs :: [(Thing, User)] -> App () lookMaNoInputs users = do for_ (map businessLogic users) $ \(key, result) -> do runRedis (writeKey key result) </code></pre></div></div> neat! We’ve isolated the core business logic out and now we can write nice unit tests on that business logic. All of the business logic has been excised from the effectful code, and we’ve reduced the amount of code we need to test. 
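With the logic isolated, a test needs no <code class="language-plaintext highlighter-rouge">App</code>, no Redis, and no mocks. Here is a minimal sketch of such a test. The <code class="language-plaintext highlighter-rouge">Thing</code>, <code class="language-plaintext highlighter-rouge">User</code>, <code class="language-plaintext highlighter-rouge">RedisKey</code>, and <code class="language-plaintext highlighter-rouge">Result</code> types and the bodies of <code class="language-plaintext highlighter-rouge">compute</code> and <code class="language-plaintext highlighter-rouge">userRedisKey</code> are invented stand-ins, since the post never defines them.

```haskell
-- Hypothetical stand-ins for the domain types, which the post leaves undefined.
newtype Thing = Thing Int deriving (Eq, Show)
newtype User = User { userName :: String } deriving (Eq, Show)
newtype RedisKey = RedisKey String deriving (Eq, Show)
newtype Result = Result Int deriving (Eq, Show)

-- Hypothetical implementations; the real ones are not shown in the post.
compute :: Thing -> Result
compute (Thing n) = Result (n * 2)

userRedisKey :: User -> RedisKey
userRedisKey user = RedisKey ("user:" ++ userName user)

-- Same shape as the businessLogic function above.
businessLogic :: (Thing, User) -> (RedisKey, Result)
businessLogic (thing, user) = (userRedisKey user, compute thing)

-- A plain example-based check: values in, values out, nothing to mock.
businessLogicSpec :: Bool
businessLogicSpec =
  businessLogic (Thing 21, User "matt")
    == (RedisKey "user:matt", Result 42)
```

The same check drops straight into whatever test framework you already use, because it is just a comparison of two values.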
<h1 id="decomposition-conduit-style">Decomposition: Conduit-style</h1> Streaming libraries like <code class="language-plaintext highlighter-rouge">Pipes</code> and <code class="language-plaintext highlighter-rouge">Conduit</code> are a great way to handle large data sets and interleave effects. They’re also a great way to decompose functions and provide “inverted mocking” facilities to your programs. Most conduits look like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Data.Conduit (runConduit, (.|)) import qualified Data.Conduit.List as CL streamSomeStuff :: IO () streamSomeStuff = do runConduit $ conduitThatGetsStuff .| conduitThatProcessesStuff .| conduitThatConsumesStuff </code></pre></div></div> You have some <code class="language-plaintext highlighter-rouge">Source</code> or <code class="language-plaintext highlighter-rouge">Producer</code> that initially provides things. This can be from a database action, an HTTP request, or from a file handle. Now, each part of this conduit can itself have many conduits inside of it: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>conduitThatGetsStuff :: Producer IO ByteString conduitThatGetsStuff = ... conduitThatProcessesStuff :: Conduit ByteString IO RealThing conduitThatProcessesStuff = CL.mapM (\bs -> case parseFromByteString bs of Left err -> throwIO err Right yesss -> pure yesss ) .| CL.map convertSomeThing .| CL.filter someFilterCondition passThrough :: (a -> IO ()) -> Conduit a IO a passThrough action = CL.mapM (\a -> do action a pure a) conduitThatConsumesStuff :: Consumer RealThing IO () conduitThatConsumesStuff = passThrough print .| passThrough makeHttpPost .| CL.mapM_ saveToDatabase </code></pre></div></div> We have a bunch of small, decomposed things. 
Our <code class="language-plaintext highlighter-rouge">conduitThatProcessesStuff</code> doesn’t care where it gets the <code class="language-plaintext highlighter-rouge">ByteString</code>s that it parses – you can hook it up to anything. Databases, HTTP calls, file IO, or even just <code class="language-plaintext highlighter-rouge">CL.sourceList [example1, example2, example3]</code>. Likewise, the <code class="language-plaintext highlighter-rouge">conduitThatConsumesStuff</code> doesn’t care where the <code class="language-plaintext highlighter-rouge">RealThing</code>s come from. You can use <code class="language-plaintext highlighter-rouge">CL.sourceList</code> to provide a bunch of fake input. We’re not usually working directly with <code class="language-plaintext highlighter-rouge">Conduit</code>s here, either – most of the functions are provided to <code class="language-plaintext highlighter-rouge">CL.mapM_</code> or <code class="language-plaintext highlighter-rouge">CL.filter</code> or <code class="language-plaintext highlighter-rouge">CL.map</code>. That allows us to write functions that are simple <code class="language-plaintext highlighter-rouge">a -> m b</code> or <code class="language-plaintext highlighter-rouge">a -> Bool</code> or <code class="language-plaintext highlighter-rouge">a -> b</code>, and these are really easy to test. <h1 id="plain-ol-abstraction">Plain ol’ abstraction</h1> Always keep in mind the lightest and most general techniques in functional programming: <ol> <li>Make it a function</li> <li>Abstract a parameter</li> </ol> These will get you very, very far. 
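As a tiny warm-up for those two steps, here is a sketch with invented names (<code class="language-plaintext highlighter-rouge">greetConcrete</code>, <code class="language-plaintext highlighter-rouge">greetWith</code>, and friends are not from any library). The first version bakes its effects in. The second takes them as parameters, so a test can run it in the pair monad <code class="language-plaintext highlighter-rouge">(,) [String]</code> from <code class="language-plaintext highlighter-rouge">base</code>, which simply accumulates a log.

```haskell
-- Effects baked in: the only way to exercise this is to run it in IO.
greetConcrete :: IO ()
greetConcrete = do
  name <- getLine
  putStrLn ("hello, " ++ name)

-- Made into a function with the effects abstracted as parameters.
greetWith :: Monad m => m String -> (String -> m ()) -> m ()
greetWith readName say = do
  name <- readName
  say ("hello, " ++ name)

-- The concrete program is recovered by passing the real effects back in.
greetIO :: IO ()
greetIO = greetWith getLine putStrLn

-- A test run in the pair monad: the first component collects what was "said".
greetLogged :: ([String], ())
greetLogged = greetWith ([], "matt") (\line -> ([line], ()))
```

Here <code class="language-plaintext highlighter-rouge">greetLogged</code> evaluates to <code class="language-plaintext highlighter-rouge">(["hello, matt"], ())</code> without touching stdin or stdout.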
Let’s revisit the <code class="language-plaintext highlighter-rouge">doWork</code> business up top: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>doWork :: App () doWork = do query <- runHTTP getUserQuery users <- runDB (usersSatisfying query) for_ users $ \user -> do thing <- getSomething user let result = compute thing runRedis (writeKey (userRedisKey user) result) </code></pre></div></div> We can make this abstract by taking concrete terms and making them function parameters. The literal definition of lambda abstraction! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>doWorkAbstract :: Monad m => m Query -- ^ The HTTP getUserQuery -> (Query -> m [User]) -- ^ The database action -> (User -> m Thing) -- ^ The getSomething function -> (RedisKey -> Result -> m ()) -- ^ finally, the redis action -> m () doWorkAbstract getUserQuery getUsers getSomething redisAction = do query <- getUserQuery users <- getUsers query for_ users $ \user -> do thing <- getSomething user let result = compute thing redisAction (userRedisKey user) result </code></pre></div></div> There are some interesting things to note about this abstract definition: <ol> <li>It’s parameterized over any monad. <code class="language-plaintext highlighter-rouge">Identity</code>, <code class="language-plaintext highlighter-rouge">State</code>, <code class="language-plaintext highlighter-rouge">IO</code>, whatever. You choose!</li> <li>We have a pure specification of the effect logic. This can’t do anything. 
It just describes what to do, when given the right tools.</li> <li>This is basically dependency injection on steroids.</li> </ol> Given the above abstract definition, we can easily recover the concrete <code class="language-plaintext highlighter-rouge">doWork</code> by providing the necessary functions: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>doWork :: App () doWork = doWorkAbstract (runHTTP getUserQuery) (\query -> runDB (usersSatisfying query)) (\user -> getSomething user) (\key result -> runRedis (writeKey key result)) </code></pre></div></div> We can also easily get a testing variant that logs the actions taken: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>doWorkScribe :: Writer [String] () doWorkScribe = doWorkAbstract getQ getUsers getSomething redis where getQ = do tell ["getting users query"] pure AnyUserQuery getUsers _ = do tell ["getting users"] pure [exampleUser1, exampleUser2] getSomething u = do tell ["getting something for " <> show u] pure (fakeSomethingFor u) redis k v = do tell ["wrote k: " <> show k] tell ["wrote v: " <> show v] </code></pre></div></div> All without having to fuss about with monad transformers, type classes, or anything else that’s terribly complicated. <h1 id="decompose">Decompose!!!</h1> Ultimately, this is all about decomposition of programs into their smallest, most easily testable parts. You then unit or property test these tiny parts to ensure they work together. If all the parts work independently, then they should work together when composed. Your effects should ideally not be anywhere near your business logic. Pure functions from <code class="language-plaintext highlighter-rouge">a</code> to <code class="language-plaintext highlighter-rouge">b</code> are ridiculously easy to test, especially if you can express properties. 
If your business logic really needs to perform effects, then try the simplest possible techniques first: functions and abstractions. Ultimately, I believe that it’s simpler and easier to write and test functions that take pure values. These are agnostic to where the data comes from, and don’t need to be mocked at all. This transformation is typically easier than introducing <code class="language-plaintext highlighter-rouge">mtl</code> classes, monad transformers, <code class="language-plaintext highlighter-rouge">Eff</code>, or similar techniques.
<h1 id="what-if-i-need-to">What if I need to?</h1>
Sometimes, you really just can’t avoid testing effectful code. A common pattern I’ve noticed is that people want to make things abstract at a level that is far too low. You want to make the abstraction as weak as possible, to make it easy to mock. Consider the common case of wanting to mock out the database. This is reasonable: database calls are extremely slow! Implementing a mock database, however, is an extremely difficult task – you essentially have to implement a database. Wherever the behavior of the database differs from your mock, you’ll have a test/prod mismatch that will blow up at some point. Instead, go a level up – create a new indirection layer that can be satisfied by either the database or a simple-to-implement mock. You can do this with a type class, or just by abstracting the relevant functions concretely. Abstracting the relevant functions is the easiest and simplest technique, but it’s not unreasonable to also write:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data UserQuery
    = AllUsers
    | UserById UserId
    | UserByEmail Email

class Monad m => GetUsers m where
    runUserQuery :: UserQuery -> m [User]
</code></pre></div></div>
This is a vastly more tenable interface to implement than a SQL database!
Let’s write our instances, one for the <a href="https://hackage.haskell.org/package/persistent"><code class="language-plaintext highlighter-rouge">persistent</code> </a> library and another for a mock that uses QuickCheck’s <code class="language-plaintext highlighter-rouge">Gen</code> type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance MonadIO m => GetUsers (SqlPersistT m) where runUserQuery = selectList . convertToQuery instance GetUsers Gen where runUserQuery query = case query of AllUsers -> arbitrary UserById userId -> take 1 . fmap (setUserId userId) <$> arbitrary UserByEmail userEmail -> take 1 . fmap (setUserEmail userEmail) <$> arbitrary </code></pre></div></div> Alternatively, you can just pass functions around manually instead of using the type class mechanism to pass them for you. Oh, wait, no! That <code class="language-plaintext highlighter-rouge">GetUsers Gen</code> instance has a bug! Can you guess what it is? <hr /> In the <code class="language-plaintext highlighter-rouge">UserById</code> and <code class="language-plaintext highlighter-rouge">UserByEmail</code> case, we’re not ever testing the “empty list” case – what if that user does not exist? A fixed variant looks like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance GetUsers Gen where runUserQuery query = case query of AllUsers -> arbitrary UserById userId -> do oneOrZero <- choose (0, 1) take oneOrZero . fmap (setUserId userId) <$> arbitrary UserByEmail userEmail -> do oneOrZero <- choose (0, 1) take oneOrZero . fmap (setUserEmail userEmail) <$> arbitrary </code></pre></div></div> I made a mistake writing a super simple generator. Just think about how many mistakes I might have made if I were trying to model something more complex! 
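The “just pass functions around manually” option mentioned above can be sketched as a record of functions. This is a minimal, hypothetical sketch rather than code from any library: the <code class="language-plaintext highlighter-rouge">User</code>, <code class="language-plaintext highlighter-rouge">UserId</code>, and <code class="language-plaintext highlighter-rouge">Email</code> definitions, the <code class="language-plaintext highlighter-rouge">inMemory</code> implementation, and <code class="language-plaintext highlighter-rouge">countUsers</code> are stand-ins invented for illustration. Only <code class="language-plaintext highlighter-rouge">UserQuery</code> and the shape of <code class="language-plaintext highlighter-rouge">runUserQuery</code> follow the class above.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module Main where

import Data.Functor.Identity (Identity (..))

-- Hypothetical stand-ins for the post's domain types.
newtype UserId = UserId Int deriving (Eq, Show)
newtype Email = Email String deriving (Eq, Show)

data User = User
    { userId :: UserId
    , userEmail :: Email
    } deriving (Eq, Show)

data UserQuery
    = AllUsers
    | UserById UserId
    | UserByEmail Email

-- The same interface as the GetUsers class, but passed as a plain value.
newtype GetUsers m = GetUsers
    { runUserQuery :: UserQuery -> m [User] }

-- An assumed in-memory implementation backed by a fixed list of users.
inMemory :: Applicative m => [User] -> GetUsers m
inMemory users = GetUsers $ \query -> pure $ case query of
    AllUsers -> users
    UserById uid -> take 1 (filter ((== uid) . userId) users)
    UserByEmail email -> take 1 (filter ((== email) . userEmail) users)

-- Business logic receives the capability as an ordinary argument.
countUsers :: Functor m => GetUsers m -> UserQuery -> m Int
countUsers getUsers query = length <$> runUserQuery getUsers query

main :: IO ()
main = do
    let alice = User (UserId 1) (Email "alice@example.com")
        bob = User (UserId 2) (Email "bob@example.com")
        db = inMemory [alice, bob]
    print (runIdentity (countUsers db AllUsers))              -- prints 2
    print (runIdentity (countUsers db (UserById (UserId 3)))) -- prints 0
</code></pre></div></div>
Because the in-memory version filters a real list, the “user does not exist” case falls out as an empty result without anyone having to remember to generate it.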
Thu, 27 Jul 2017 00:00:00 +0000 https://www.parsonsmatt.org/2017/07/27/inverted_mocking.html https://www.parsonsmatt.org/2017/07/27/inverted_mocking.html On Naming Things: Library Design Perhaps you’ve heard this joke: <blockquote> There are only two hard problems in computer science: Naming things, cache invalidation, and off-by-one errors </blockquote> Lol. Naming things ends up being actually pretty difficult! It’s a nontrivial problem in library design, and there are interesting constraints imposed by the technologies we use. There are a number of best practices and guidelines available for using libraries that make code easier to read and understand. But we don’t have compelling guidelines available for actually writing these libraries. I’ve written a few libraries now and have tried out different naming and exporting conventions. I’ve developed a bit of a feel for how it is to write and use them, and I’m going to put out my personal preferences and opinions on library design here. This post will be discussing the Haskell programming language and ecosystem. <h1 id="usage-best-practices">Usage best practices</h1> So you’ve got a bare Haskell module: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module Main where main :: IO () main = do ... </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">module $NAME where</code> starts the module definition, followed by a list of imports, and then declarations. There are no imports declared. So you know that all terms are coming from the <code class="language-plaintext highlighter-rouge">Prelude</code> module. If you’ve enabled <code class="language-plaintext highlighter-rouge">NoImplicitPrelude</code> language extension, then you won’t even have that in scope! As we add imports, we add new terms. Each new term might be unfamiliar to a programmer who is unfamiliar with the import. 
As you add more and more imports, it’s important (sorry) to make it easy for people to find where each term comes from. Consider this import list:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Data.Aeson
import Data.Aeson.TH
import qualified Data.Map.Strict as Map
import Data.Swagger
import Database.Esqueleto
import Database.Esqueleto.Internal.Language (Update)
import Servant
</code></pre></div></div>
If a new person comes upon a term that they don’t understand, where did it come from? It could be any of those imports. There are typically two proposed solutions: explicit import lists, and qualified imports.
<h2 id="qualified-imports">Qualified Imports</h2>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import qualified Data.Aeson as Aeson
import qualified Data.Aeson.TH as Aeson
import qualified Data.Map.Strict as Map
import qualified Data.Swagger as Swagger
import qualified Database.Esqueleto as Esqueleto
import qualified Database.Esqueleto.Internal.Language (Update)
import qualified Servant
</code></pre></div></div>
Unfortunately, this leads to extremely verbose code, and also makes operators super annoying to use. Consider this example usage:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Api = "hello" Servant.:> Servant.Get '[Servant.JSON] Hello

data Hello = Hello { helloMap :: Map.Map String Int }

Aeson.deriveJSON Aeson.defaultOptions ''Hello

-- or,

instance Aeson.ToJSON Hello where
    toJSON hello = Aeson.object
        [ "map" Aeson..= Aeson.toJSON (helloMap hello)
        ]
</code></pre></div></div>
Gross! This makes the code way noisier, and more verbose. Everyone pays a significant cost when writing and reading this code. Only people new to the codebase benefit, and even then, only for a short time.
It’s common for the module name to be shortened, so <code class="language-plaintext highlighter-rouge">Data.ByteString</code> becomes <code class="language-plaintext highlighter-rouge">BS</code> or <code class="language-plaintext highlighter-rouge">B</code>, and <code class="language-plaintext highlighter-rouge">Data.HashMap.Strict</code> becomes <code class="language-plaintext highlighter-rouge">M</code>. Sometimes you’ll have <code class="language-plaintext highlighter-rouge">Data.Text</code>, <code class="language-plaintext highlighter-rouge">Data.Text.Encoding</code>, and <code class="language-plaintext highlighter-rouge">Data.Text.IO</code> all qualified under <code class="language-plaintext highlighter-rouge">T</code> or <code class="language-plaintext highlighter-rouge">Text</code>, which makes it harder to figure out where a term is coming from. Qualified imports are great sometimes, but they don’t seem to be a great solution all of the time. So typically people use another strategy for keeping the namespace clean:
<h2 id="explicit-import-lists">Explicit Import Lists</h2>
You can also list out all of the terms that you use explicitly after the import. This is a good practice, because it doesn’t make the code more verbose, it just makes the import lists bigger. Here’s the previous code snippet and import list, but with explicit imports:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Servant ((:>), Get, JSON)
import Data.Map.Strict (Map)
import Data.Aeson (ToJSON(..), object, (.=))
import Data.Aeson.TH (deriveJSON, defaultOptions)

type Api = "hello" :> Get '[JSON] Hello

data Hello = Hello { helloMap :: Map String Int }

deriveJSON defaultOptions ''Hello

-- or,

instance ToJSON Hello where
    toJSON hello = object
        [ "map" .= toJSON (helloMap hello)
        ]
</code></pre></div></div>
This looks a lot cleaner. Anyone that wants to know where a term comes from can simply search for it in the import lists.
However, it requires developer discipline or tooling to keep the import lists up to date. If you update the API type to include the <code class="language-plaintext highlighter-rouge">Post</code> type, then compilation will fail until you add it to the import list. This can be somewhat frustrating to work with. There’s another approach that some libraries use:
<h2 id="just-stick-me-in-my-own-module">Just stick me in my own module!</h2>
Some libraries, like Parsec, Esqueleto, etc. want to be used in their own module. In my projects at work, I typically will have a module structure like:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module App.SomeType where

-- type definition, functions for operating on the type, etc

module App.SomeType.Parser where

import Data.Attoparsec.ByteString

-- the parser definition
</code></pre></div></div>
Why? The exports of <code class="language-plaintext highlighter-rouge">Attoparsec</code> and the encouraged style of writing functions have a tendency to collide with the names and functions for actually using the types. These aren’t typically imported or cared about: most clients of a <code class="language-plaintext highlighter-rouge">Parser</code> module just want <code class="language-plaintext highlighter-rouge">parseThing :: ByteString -> Either Error Thing</code>, not <code class="language-plaintext highlighter-rouge">thing :: Parser Thing</code> or <code class="language-plaintext highlighter-rouge">someSubComponentOfThing :: Parser Whatever</code>. Likewise, the SQL library Esqueleto builds upon the Persistent database library for writing more advanced queries. It redefines some of the terms and operators so that the language looks consistent with the Persistent DSL for queries and updates, but the result is name collisions with Persistent.
Additionally, the eDSL defines a bunch of common names like <code class="language-plaintext highlighter-rouge">on</code>, <code class="language-plaintext highlighter-rouge">from</code>, and <code class="language-plaintext highlighter-rouge">select</code>, which easily collide with other modules. So I’ll typically have a module structure like:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module App.Models.SomeModel where

import Database.Persist

-- functions specific to the model

module App.Query.SomeModel where

import Database.Esqueleto

-- queries specific to the model
</code></pre></div></div>
where I can use Esqueleto’s full eDSL without having to worry about import/export business.
<h2 id="moderation">Moderation</h2>
The problem of “where does this term come from” manifests mostly in very large modules with a ton of imports. It’s not an issue to find where a term is from if you have a 100 line module with 5 imports. If you have a 1,000 line module with 20 imports, you’re in trouble. By breaking your modules up into smaller logical chunks, you can avoid this problem, at the expense of having your code spread out more. Most codebases use a combination of qualified imports, explicit import lists, and open imports. The decision tends to be made in terms of some combination of taste and the design of the library/module that you’re importing. Consider these modules:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- 1. Designed for qualified import
import Data.Map (Map)
import qualified Data.Map as Map

-- 2. Basic enough that they don't need
-- to be introduced
import Control.Monad
import Control.Applicative

-- 3. Designed for *unqualified* import
import Data.IORef
import Control.Concurrent

-- 4. 
Obscure enough that you might want -- to have an explicit import list import Control.Arrow (first, (&&&)) </code></pre></div></div> This is a somewhat moderate import strategy. Unusual terms are imported explicitly, common terms are imported implicitly. Some libraries are designed to be imported and used in a specific manner. <h2 id="what-do-we-want">What do we want?</h2> Let’s summarize what we want out of our import/export situation: <ol> <li>Easy to find where an identifier comes from</li> <li>Not excessively verbose</li> <li>Tooling isn’t necessary to use the strategy</li> <li>It’s easy to write tooling for the strategy</li> </ol> With all that out of the way, I think I’m ready to talk about library design. <h1 id="library-design">Library Design</h1> There are a few ways to design libraries to handle the import/export pain. <ol> <li>Qualified Import</li> <li>Module Isolation</li> <li>Open import</li> </ol> <h2 id="qualified-import">Qualified Import</h2> The <code class="language-plaintext highlighter-rouge">containers</code> library is designed to be imported qualified – that is, you must import it qualified in order to use it. If you don’t, you get ambiguous term errors. 
Any code that uses it typically has these two imports:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import qualified Data.Map as Map
import Data.Map (Map)
</code></pre></div></div>
so that you can use the <code class="language-plaintext highlighter-rouge">Map</code> type unqualified and the functions for operating on <code class="language-plaintext highlighter-rouge">Map</code>s qualified, like so:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>someMap :: Map String Int
someMap =
    Map.insert "hello" 3 Map.empty
        `Map.union` Map.singleton "bar" 2
</code></pre></div></div>
This has a problem: If you want to write a custom Prelude to cut down on import-related boilerplate, you’re unable to do so with this strategy.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module MyPrelude
    ( module MyPrelude
    , module X
    ) where

import Data.Map as X (Map)
import Data.Set as X (Set)
import Data.Text as X (Text)
import Data.ByteString as X (ByteString)
</code></pre></div></div>
When we import <code class="language-plaintext highlighter-rouge">MyPrelude</code>, we get the type names in scope. This is an improvement. But we still need to write out all the qualified imports to use the types:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module Main where

import MyPrelude
import qualified Data.Map as Map
</code></pre></div></div>
There is currently no way to do a qualified reexport, which would fix this issue. If you are intending for your module to be used qualified, I’d recommend making the intended name available as a top level module.
Consider my library <code class="language-plaintext highlighter-rouge">monad-metrics</code>, which is designed for qualified import: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Control.Monad.Metrics (MonadMetrics(..)) import qualified Control.Monad.Metrics as Metrics </code></pre></div></div> This is a lot of typing! Instead, in a new version of the library, I will do: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Control.Monad.Metrics import qualified Metrics </code></pre></div></div> where <code class="language-plaintext highlighter-rouge">Control.Monad.Metrics</code> will export the types, and <code class="language-plaintext highlighter-rouge">Metrics</code> will export the functions for operating on them. This cuts down on the effort required by the user. This scheme requires a lot of maintenance on the part of the user, or a dependency on tooling that may or may not be available for a user’s editor solution. Furthermore, this encourages a style of naming where the same basic identifier gets used many times: <a href="https://hoogle.haskell.org/?hoogle=empty&scope=set%3Astackage"><code class="language-plaintext highlighter-rouge">empty</code></a> is used 25 times in the Stackage database. This makes it more difficult for tooling to know what to suggest in these cases. <h2 id="module-isolation">Module Isolation</h2> This strategy hearkens back to the Esqueleto and Parsec examples I presented earlier. It also applies to some other libraries I’ve used, like the Swagger library. This is the easiest thing to do – you stop caring about stepping on anyone’s toes, and require that your users define functions that use your library in encapsulated, isolated modules, that they then reexport however they like. This makes a lot of sense when you’re defining an EDSL (embedded domain specific language) for working with something. 
Parsers, SQL queries, HTML DSLs, and Swagger definitions all fall into this category. Data structures and web servers typically don’t. This approach puts a tax on users: it requires that they break functionality into a separate module. This is typically a good thing, but it asks more of users than a qualified import strategy or an open import strategy. Typically, there won’t be that many libraries that can use this scheme productively.
<h2 id="open-import">Open Import</h2>
The open import strategy is quickly becoming my favorite. The library is designed such that you can just import the whole thing without an import list and it’s still easy to tell where each identifier comes from. If your library uses operators, you’re strongly encouraging your users to use this strategy, even if the rest of your library doesn’t fit it well. To do this, you’ll need to incorporate a bit of redundant information into the identifiers you use. This is a sort of Hungarian notation. Let’s look at some examples of libraries that take this design.
<h3 id="dataioref">Data.IORef</h3>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Data.Function (fix)
import Data.IORef
import Control.Concurrent

main = do
    ref <- newIORef 100
    fix $ \loop -> do
        val <- readIORef ref
        if val >= 0
            then do
                modifyIORef ref (subtract 1)
                loop
            else
                putStrLn "Done!"
</code></pre></div></div>
<h3 id="controlconcurrentstm">Control.Concurrent.STM</h3>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Control.Concurrent
import Control.Concurrent.STM
import Control.Monad (forM_, forever)

main = do
    -- newTQueue lives in STM, so run it with atomically
    q <- atomically newTQueue
    forkIO $ forever $
        print =<< atomically (readTQueue q)
    forM_ [1..1000] $ \i ->
        forkIO (atomically (writeTQueue q i))
</code></pre></div></div>
<h3 id="lucid">Lucid</h3>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE OverloadedStrings #-}

import Lucid

view :: Html ()
view = do
    h1_ "Hello world!"
    div_ $ do
        p_ $ do
            "The consistent naming scheme "
            "and specific identifier name "
            "choices make this an easy lib"
            "rary to use with an open impo"
            "rt."
        ul_ $ do
            li_ "Lucid is just great."
            li_ "Fave HTML lib fo sho"
</code></pre></div></div>
Now, where does <code class="language-plaintext highlighter-rouge">newIORef</code> come from? What does it result in? It creates a new <code class="language-plaintext highlighter-rouge">IORef</code>. Likewise, what about <code class="language-plaintext highlighter-rouge">newTQueue</code>? It creates a new <code class="language-plaintext highlighter-rouge">TQueue</code>. The Lucid functions all follow a simple convention that makes them easy to use and purposefully avoids collision with other identifiers: the <code class="language-plaintext highlighter-rouge">_</code> suffix of an HTML tag makes it easy to know that an identifier is from <code class="language-plaintext highlighter-rouge">Lucid</code>. The only collision I notice ordinarily is <code class="language-plaintext highlighter-rouge">for_</code>. If these modules were designed with qualified import in mind, we’d write <code class="language-plaintext highlighter-rouge">TQueue.new</code>, <code class="language-plaintext highlighter-rouge">IORef.new</code>, or <code class="language-plaintext highlighter-rouge">Lucid.h1_</code> instead. However, that would cause us a few problems:
<ol> <li>You’d have to write <code class="language-plaintext highlighter-rouge">import qualified Data.IORef as IORef</code> and <code class="language-plaintext highlighter-rouge">import Data.IORef (IORef)</code> in every module that uses it</li> <li>You can’t reexport the functions from <code class="language-plaintext highlighter-rouge">IORef</code> anymore, so you need to explicitly import them every time, leading to more boilerplate in a custom prelude.</li> </ol>
What would <code class="language-plaintext highlighter-rouge">Data.Map</code> look like under this scheme?
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Data.Map main = do let a = insertMap "hello" 0 emptyMap b = singletonMap "goodbye" 1 c = unionMap a b print c </code></pre></div></div> Aesthetically, I think I prefer <code class="language-plaintext highlighter-rouge">Map.insert</code> over <code class="language-plaintext highlighter-rouge">insertMap</code>. Unfortunately, until we’re able to re-export modules qualified, we’re unable to use the qualified imports conveniently. <h1 id="imo">IMO</h1> In my opinion, designing libraries for open import is the most convenient and useful way to go. Alternatively, it would be nice to have a Template Haskell function that can take a module designed for qualified import and mechanically convert it to this style. Then you could design with qualified import in mind and people that want an open import strategy could simply use the function and make their own. Fri, 23 Jun 2017 00:00:00 +0000 https://www.parsonsmatt.org/2017/06/23/on_naming_things.html https://www.parsonsmatt.org/2017/06/23/on_naming_things.html Exceptional Servant Handling The <a href="https://haskell-servant.readthedocs.io/en/stable/">Haskell Servant</a> library is a fantastic way to write web APIs. When you’re implementing the handlers, you either need to write them in the <code class="language-plaintext highlighter-rouge">Handler</code> monad, or define a <code class="language-plaintext highlighter-rouge">Nat</code>ural transformation (a way of converting) your choice of monad into the <code class="language-plaintext highlighter-rouge">Handler</code> monad. 
The <a href="https://haskell-servant.readthedocs.io/en/stable/tutorial/Server.html#the-handler-monad"><code class="language-plaintext highlighter-rouge">Handler</code> monad</a> is a <code class="language-plaintext highlighter-rouge">newtype</code> around <code class="language-plaintext highlighter-rouge">ExceptT ServantErr IO a</code>, where <a href="https://hackage.haskell.org/package/servant-server-0.11/docs/Servant-Server.html#t:ServantErr"><code class="language-plaintext highlighter-rouge">ServantErr</code></a> is a way of providing errors like <code class="language-plaintext highlighter-rouge">404 -- Not Found</code> or <code class="language-plaintext highlighter-rouge">401 -- Not Authorized</code>, or other non-200 responses, like <code class="language-plaintext highlighter-rouge">302</code> redirection. If you’re familiar with <code class="language-plaintext highlighter-rouge">ExceptT</code>, this isn’t new to you. You can always use <code class="language-plaintext highlighter-rouge">throwError</code> in <code class="language-plaintext highlighter-rouge">ExceptT</code> to short-circuit the block and return the given error. Servant handles the <code class="language-plaintext highlighter-rouge">ServantErr</code> intelligently, converting it into an appropriate response. For non-<code class="language-plaintext highlighter-rouge">ServantErr</code> exceptions, Servant lets the serving backend (typically <a href="https://hackage.haskell.org/package/wai"><code class="language-plaintext highlighter-rouge">WAI</code></a>) handle it, usually by providing a <code class="language-plaintext highlighter-rouge">500</code> error. <h1 id="exceptt-e-io-antipattern"><code class="language-plaintext highlighter-rouge">ExceptT e IO</code> antipattern</h1> Perhaps you’ve read Michael Snoyman’s <a href="https://www.fpcomplete.com/blog/2016/11/exceptions-best-practices-haskell">Exceptions Best Practices In Haskell</a> blog post. 
Perhaps you’re sold on the idea – why bother with <code class="language-plaintext highlighter-rouge">ExceptT</code> over <code class="language-plaintext highlighter-rouge">IO</code> when <code class="language-plaintext highlighter-rouge">IO</code> already can throw runtime errors? Furthermore, maybe you’re concerned with performance – the <code class="language-plaintext highlighter-rouge">>>=</code> implementation for <code class="language-plaintext highlighter-rouge">ExceptT</code> must do case analysis on the result to determine what to do next. Carter Schonwald’s <a href="https://hackage.haskell.org/package/monad-ste">monad-ste</a> package provides a more efficient way of dealing with exceptions, as it uses GHC’s runtime exception system. There are various good reasons why you might want to strip <code class="language-plaintext highlighter-rouge">ExceptT</code> from your Servant handlers. There are various good reasons why you wouldn’t want to do that. I’m in the first camp – I don’t want <code class="language-plaintext highlighter-rouge">ExceptT</code> over <code class="language-plaintext highlighter-rouge">IO</code>. Maybe you don’t even like monad transformers at all, and just want your handlers to be in plain ol’ <code class="language-plaintext highlighter-rouge">IO</code>. <img src="https://www.parsonsmatt.org/brain-meme.jpg-large" alt="stupid expanding brain meme" /> Well, it turns out, that doesn’t take much code! <h1 id="nat-simplification">Nat Simplification</h1> The <code class="language-plaintext highlighter-rouge">servant-server</code> library allows you to use the function <a href="https://hackage.haskell.org/package/servant-server-0.11/docs/Servant-Server.html#g:5"><code class="language-plaintext highlighter-rouge">enter</code></a> to provide a conversion function from one monad to another. 
Given an API type like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type API = "best-numbers" :> Get '[JSON] [Int] </code></pre></div></div> we can write a handler like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>server :: IO [Int] server = do now <- getCurrentTime let timeInSeconds = utctDayTime now wakeUpTime = 8 * 60 * 60 when (timeInSeconds <= wakeUpTime) $ throwIO err400 { errBody = "request too early!" } return [1,2,3] </code></pre></div></div> This handler returns <code class="language-plaintext highlighter-rouge">[1, 2, 3]</code> if it’s awake. But if the current time is less than 8 AM UTCTime, then we throw a 400 error instead. Since <code class="language-plaintext highlighter-rouge">ServantErr</code> is an instance of <code class="language-plaintext highlighter-rouge">Exception</code>, we’re allowed to throw it in <code class="language-plaintext highlighter-rouge">IO</code> using <code class="language-plaintext highlighter-rouge">throwIO :: Exception e => e -> IO a</code>. If you’re using the <code class="language-plaintext highlighter-rouge">exceptions</code> package, you can use <code class="language-plaintext highlighter-rouge">throwM</code> as well. To hook this up with the <code class="language-plaintext highlighter-rouge">serve</code> function, we need to use <code class="language-plaintext highlighter-rouge">enter</code> and provide a <a href="https://hackage.haskell.org/package/servant-server-0.11/docs/Servant-Server.html#t::-126--62-"><code class="language-plaintext highlighter-rouge">NT</code></a> natural transformation/conversion function. The type signature in the documentation is super generic, but ultimately, we’re looking for a function like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type NaturalTransformation source target = forall a. 
source a -> target a </code></pre></div></div> Or, in English, “a function that converts a <code class="language-plaintext highlighter-rouge">source a</code> into a <code class="language-plaintext highlighter-rouge">target a</code> that is forbidden from inspecting the <code class="language-plaintext highlighter-rouge">a</code> values.” Concretely, for our specific use case, we want: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>convert :: IO a -> Handler a </code></pre></div></div> <h1 id="hole-driven-development">Hole Driven Development</h1> Hole driven development to the rescue! HDD is where you create a type hole and fill it in with your ‘best guess’ based on the surrounding context. Typically you’ll drop another type hole, which allows you to interactively develop with the compiler. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>convert :: IO a -> Handler a convert action = _f </code></pre></div></div> Well, <code class="language-plaintext highlighter-rouge">_f</code> gives us a type hole for <code class="language-plaintext highlighter-rouge">Handler a</code>, which isn’t surprising. How can we construct a <a href="https://hackage.haskell.org/package/servant-server-0.11/docs/Servant-Server.html#t:Handler"><code class="language-plaintext highlighter-rouge">Handler</code></a>? The Haddocks point us to an exposed constructor, also <code class="language-plaintext highlighter-rouge">Handler</code>, which we can use. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>convert :: IO a -> Handler a convert action = Handler _f </code></pre></div></div> Now, <code class="language-plaintext highlighter-rouge">_f</code> is <code class="language-plaintext highlighter-rouge">ExceptT ServantErr IO a</code>. 
How do we construct an <a href="https://hackage.haskell.org/package/mtl-2.2.1/docs/Control-Monad-Except.html#t:ExceptT"><code class="language-plaintext highlighter-rouge">ExceptT</code></a>? The Haddocks show that we have a constructor, also called <code class="language-plaintext highlighter-rouge">ExceptT</code>, which expects an <code class="language-plaintext highlighter-rouge">m (Either e a)</code>. So let’s plug that in: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>convert :: IO a -> Handler a convert action = Handler (ExceptT _f) </code></pre></div></div> Now <code class="language-plaintext highlighter-rouge">_f</code> is <code class="language-plaintext highlighter-rouge">IO (Either ServantErr a)</code>. This is where it gets tricky. We know that we have an <code class="language-plaintext highlighter-rouge">IO a</code> on hand. If we do a hoogle search for <a href="https://hoogle.haskell.org/?hoogle=IO+a+-%3E+IO+%28Either+ServantErr+a%29&scope=set%3Astackage"><code class="language-plaintext highlighter-rouge">IO a -> IO (Either e a)</code></a>, then we get a bunch of funny results. None of them are exactly right, but there are a lot of variants on <code class="language-plaintext highlighter-rouge">try</code>. So let’s hoogle for <a href="https://hoogle.haskell.org/?hoogle=try&scope=set%3Astackage"><code class="language-plaintext highlighter-rouge">try</code></a>! 
That gives us this nice definition: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>try :: Exception e => IO a -> IO (Either e a) </code></pre></div></div> so let’s plug that in: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>convert :: IO a -> Handler a convert action = Handler (ExceptT (try _f)) </code></pre></div></div> Now, <code class="language-plaintext highlighter-rouge">_f</code> has the type <code class="language-plaintext highlighter-rouge">IO a</code>. And we have an <code class="language-plaintext highlighter-rouge">IO a</code> already – it’s the parameter we’ve been passed! So we can simplify our convert: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>convert :: IO a -> Handler a convert = Handler . ExceptT . try </code></pre></div></div> wrap it in the <code class="language-plaintext highlighter-rouge">NT</code> natural transformation newtype: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>convert :: IO :~> Handler convert = NT (Handler . ExceptT . try) </code></pre></div></div> and use it in <code class="language-plaintext highlighter-rouge">enter</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>app :: Application app = serve (Proxy :: Proxy Api) (enter convert handler) </code></pre></div></div> Voila! You’re throwing and catching exceptions in IO, and Servant is still getting the chance to handle them however it wants. Wed, 21 Jun 2017 00:00:00 +0000 https://www.parsonsmatt.org/2017/06/21/exceptional_servant_handling.html https://www.parsonsmatt.org/2017/06/21/exceptional_servant_handling.html Basic Type Level Programming in Haskell Dependently typed programming is becoming all the rage these days. 
Advocates are talking about all the neat stuff you can do by putting more and more information into the type system. It’s true! Type level programming gives you interesting new tools for designing software. You can guarantee safety properties, and in some cases, even gain performance optimizations through the use of these types. I’m not going to try and sell you on these benefits – presumably you’ve read about something like the <a href="https://blog.jle.im/entry/practical-dependent-types-in-haskell-1.html">dependently typed neural networks</a>, or about Idris’s <a href="https://docs.idris-lang.org/en/latest/st/examples.html">encoding of network protocols in the type system</a>. If you’re not convinced, then this isn’t the right article for you. If you are interested, and have some familiarity with Haskell, Elm, F#, or another ML-family language, then this article will be right up your alley. <h1 id="the-basic-types">The Basic Types</h1> So let’s talk about some basic types. I’m going to stick with the real basic types here: no primitives, just stuff we can define in one line in Haskell. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Unit = MkUnit data Bool = True | False </code></pre></div></div> This code block defines two new types: <code class="language-plaintext highlighter-rouge">Unit</code> and <code class="language-plaintext highlighter-rouge">Bool</code>. The <code class="language-plaintext highlighter-rouge">Unit</code> type has one constructor, called <code class="language-plaintext highlighter-rouge">MkUnit</code>. Since there’s only one constructor for this type, and it takes no parameters, there is only one value of this type. We call it <code class="language-plaintext highlighter-rouge">Unit</code> because there’s only one value. 
<code class="language-plaintext highlighter-rouge">Bool</code> is a type that has two constructors: <code class="language-plaintext highlighter-rouge">True</code> and <code class="language-plaintext highlighter-rouge">False</code>. These don’t take any parameters either, so they’re kind of like constants. What does it mean to be a type? A type is a way of classifying things. Things – that’s a vague word. What do I mean by ‘things’? Well, for these simple types above, we’ve already seen all their possible values – we can say that <code class="language-plaintext highlighter-rouge">True</code> and <code class="language-plaintext highlighter-rouge">False</code> are members of the type <code class="language-plaintext highlighter-rouge">Bool</code>. Furthermore, <code class="language-plaintext highlighter-rouge">1</code>, <code class="language-plaintext highlighter-rouge">'a'</code>, and <code class="language-plaintext highlighter-rouge">MkUnit</code> are not members of the type <code class="language-plaintext highlighter-rouge">Bool</code>. These types are kind of boring. Let’s look at another type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data IntAndChar = MkIntAndChar Int Char </code></pre></div></div> This introduces a type <code class="language-plaintext highlighter-rouge">IntAndChar</code> with a single constructor that takes two arguments: one of which is an <code class="language-plaintext highlighter-rouge">Int</code> and the other is a <code class="language-plaintext highlighter-rouge">Char</code>. Values of this type look like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>theFirstOne = MkIntAndChar 3 'a' theSecond = MkIntAndChar (-3) 'b' </code></pre></div></div> <code class="language-plaintext highlighter-rouge">MkIntAndChar</code> looks a lot like a function. 
In fact, if we ask GHCi about its type, we get this back: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> :t MkIntAndChar MkIntAndChar :: Int -> Char -> IntAndChar </code></pre></div></div> <code class="language-plaintext highlighter-rouge">MkIntAndChar</code> is a function accepting an <code class="language-plaintext highlighter-rouge">Int</code> and a <code class="language-plaintext highlighter-rouge">Char</code> and finally yielding a value of type <code class="language-plaintext highlighter-rouge">IntAndChar</code>. So we can construct values, and values have types. Can we construct types? And if so, what do they have? <h1 id="the-higher-kinds">The Higher Kinds</h1> Let’s hold onto our intuition about functions and values. A function with the type <code class="language-plaintext highlighter-rouge">foo :: Int -> IntAndChar</code> is saying: <blockquote> Give me a value with the type <code class="language-plaintext highlighter-rouge">Int</code>, and I will give you a value with the type <code class="language-plaintext highlighter-rouge">IntAndChar</code>. </blockquote> Now, let’s lift that intuition into the type level. A value constructor accepts a value and yields a value. So a type constructor accepts a type and yields a type. Haskell’s type variables allow us to express this. Let’s consider everyone’s favorite sum type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Maybe a = Just a | Nothing </code></pre></div></div> Here, we declare a type <code class="language-plaintext highlighter-rouge">Maybe</code>, with two data constructors: <code class="language-plaintext highlighter-rouge">Just</code>, which accepts a value of type <code class="language-plaintext highlighter-rouge">a</code>, and <code class="language-plaintext highlighter-rouge">Nothing</code>, which does not accept any values at all. 
Let’s ask GHCi about the type of <code class="language-plaintext highlighter-rouge">Just</code> and <code class="language-plaintext highlighter-rouge">Nothing</code>! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> :t Just Just :: a -> Maybe a λ> :t Nothing Nothing :: Maybe a </code></pre></div></div> So <code class="language-plaintext highlighter-rouge">Just</code> has that function type – and it looks like, whatever type of value we give it, it becomes a <code class="language-plaintext highlighter-rouge">Maybe</code> of that type. <code class="language-plaintext highlighter-rouge">Nothing</code>, however, can conjure up whatever type it wants, without needing a value at all. Let’s play with that a bit: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> let nothingA = Nothing :: Maybe a λ> let nothingInt = Nothing :: Maybe Int λ> let nothingChar = Nothing :: Maybe Char λ> nothingInt == nothingChar \<interactive\>:27:15: error: • Couldn't match type ‘Char’ with ‘Int’ Expected type: Maybe Int Actual type: Maybe Char • In the second argument of ‘(==)’, namely ‘nothingChar’ In the expression: nothingInt == nothingChar In an equation for ‘it’: it = nothingInt == nothingChar λ> nothingA == nothingInt True λ> nothingA == nothingChar True </code></pre></div></div> Woah – we get a type error when trying to compare <code class="language-plaintext highlighter-rouge">nothingInt</code> with <code class="language-plaintext highlighter-rouge">nothingChar</code>. That makes sense – <code class="language-plaintext highlighter-rouge">(==)</code> only works on values that have the same type. But then, wait, why does <code class="language-plaintext highlighter-rouge">nothingA</code> not complain when compared with <code class="language-plaintext highlighter-rouge">nothingInt</code> and <code class="language-plaintext highlighter-rouge">nothingChar</code>? 
The reason is that <code class="language-plaintext highlighter-rouge">nothingA :: Maybe a</code> really means: <blockquote> I am a value of <code class="language-plaintext highlighter-rouge">Maybe a</code>, for any and all types <code class="language-plaintext highlighter-rouge">a</code> that you might provide to me. </blockquote> So I’m seeing that we’re passing types to <code class="language-plaintext highlighter-rouge">Maybe</code>, in much the same way that we pass values to <code class="language-plaintext highlighter-rouge">MkIntAndChar</code>. Let’s ask GHCi about the type of <code class="language-plaintext highlighter-rouge">Maybe</code>! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> :type Maybe \<interactive\>:1:1: error: • Data constructor not in scope: Maybe • Perhaps you meant variable ‘maybe’ (imported from Prelude) </code></pre></div></div> Well, it turns out that types don’t have types (kind of, sort of). Types have kinds. We can ask GHCi about the kind of types with the <code class="language-plaintext highlighter-rouge">:kind</code> command: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> :kind Maybe Maybe :: * -> * </code></pre></div></div> What is <code class="language-plaintext highlighter-rouge">*</code> doing there? Well, <code class="language-plaintext highlighter-rouge">*</code> is the kind of types which have values. Check this out: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> :kind Maybe Maybe :: * -> * </code></pre></div></div> So <code class="language-plaintext highlighter-rouge">Maybe</code> has the kind <code class="language-plaintext highlighter-rouge">* -> *</code>, which means: <blockquote> Give me a type that has values, and I will give you a type that has values. </blockquote> Maybe you’ve heard about higher kinded polymorphism before. 
Let’s write a data type that demonstrates what this means: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data HigherKinded f a = Bare a | Wrapped (f a) </code></pre></div></div> Haskell’s kind inference is awfully kind on the fingers – since we use <code class="language-plaintext highlighter-rouge">f</code> applied to <code class="language-plaintext highlighter-rouge">a</code>, Haskell just knows that <code class="language-plaintext highlighter-rouge">f</code> must have the kind <code class="language-plaintext highlighter-rouge">* -> *</code>. If we ask GHCi about the kind of <code class="language-plaintext highlighter-rouge">HigherKinded</code>, we get back: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> :kind HigherKinded HigherKinded :: (* -> *) -> * -> * </code></pre></div></div> So <code class="language-plaintext highlighter-rouge">HigherKinded</code> is a type that accepts a type of kind <code class="language-plaintext highlighter-rouge">* -> *</code>, and a type of kind <code class="language-plaintext highlighter-rouge">*</code>, and returns a type of kind <code class="language-plaintext highlighter-rouge">*</code>. In plain, verbose English, this reads as: <blockquote> Give me two types: the first of which is a function that does not have values itself, but when given a type that does have values, it can have values. The second being a type that has values. Finally, I will return to you a type that can have ordinary values. 
</blockquote> <h1 id="dynamically-kinded-programming">Dynamically Kinded Programming</h1> <code class="language-plaintext highlighter-rouge">*</code> reminds me of regular expressions – “match anything.” Indeed, <code class="language-plaintext highlighter-rouge">*</code> matches any type that has values, or even types that are only inhabited by the infinite loop: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> data Void λ> :kind Void Void :: * </code></pre></div></div> We don’t provide any ways to construct a void value, yet it still has kind <code class="language-plaintext highlighter-rouge">*</code>. In the same way that you can productively program at the value level with dynamic types, you can productively program at the type level with dynamic kinds. And <code class="language-plaintext highlighter-rouge">*</code> is basically that! Let’s encode our first type level numbers. We’ll start with the Peano natural numbers, where numbers are inductively defined as either <code class="language-plaintext highlighter-rouge">Zero</code> or the <code class="language-plaintext highlighter-rouge">Successor</code> of some natural number. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Zero data Succ a type One = Succ Zero type Two = Succ One type Three = Succ Two type Four = Succ (Succ (Succ (Succ Zero))) </code></pre></div></div> But this is pretty unsatisfying. After all, there’s nothing that stops us from saying <code class="language-plaintext highlighter-rouge">Succ Bool</code>, which doesn’t make any sense. I’m pretty sold on the benefits of types for clarifying thinking and preventing errors, so abandoning the safety of types when I program my types just seems silly. In order to get that safety back, we need to introduce more kinds than merely <code class="language-plaintext highlighter-rouge">*</code>. For this, we have to level up our GHC. 
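To make that complaint concrete, here is a minimal sketch of a file that GHC accepts without any extensions. The names <code class="language-plaintext highlighter-rouge">Nonsense</code> and <code class="language-plaintext highlighter-rouge">nonsense</code> are made up for this illustration and are not used anywhere else in the post.

```haskell
-- Untyped ("dynamically kinded") Peano numbers, as defined above.
data Zero
data Succ a

type One = Succ Zero

-- Hypothetical name, for illustration only: Succ takes any type of
-- kind *, so the kind checker happily accepts the successor of Bool.
type Nonsense = Succ Bool

-- Succ has no constructors, so we cannot build a value of type
-- Nonsense directly. Wrapping it in Maybe shows that it is still a
-- perfectly well-kinded type as far as GHC is concerned.
nonsense :: Maybe Nonsense
nonsense = Nothing
```

Nothing here is rejected, because every parameter has kind <code class="language-plaintext highlighter-rouge">*</code> and <code class="language-plaintext highlighter-rouge">Bool</code> fits that just as well as <code class="language-plaintext highlighter-rouge">Zero</code> does.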
<h1 id="data-kinds">Data Kinds</h1> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE DataKinds #-} </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">DataKinds</code> extension allows us to promote data constructors into type constructors, which also promotes their type constructors into kind constructors. To promote something up a level, we prefix the name with an apostrophe, or tick: <code class="language-plaintext highlighter-rouge">'</code>. Now, let’s define our kind safe type level numbers: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Nat = Zero | Succ Nat </code></pre></div></div> In plain Haskell, this definition introduces a new type <code class="language-plaintext highlighter-rouge">Nat</code> with two value constructors, <code class="language-plaintext highlighter-rouge">Zero</code> and <code class="language-plaintext highlighter-rouge">Succ</code> (which takes a value of type <code class="language-plaintext highlighter-rouge">Nat</code>). With the <code class="language-plaintext highlighter-rouge">DataKinds</code> extension, this also defines some extra new tidbits. We get a new kind <code class="language-plaintext highlighter-rouge">Nat</code>, which exists in a separate namespace. And we get two new types: a type constant <code class="language-plaintext highlighter-rouge">'Zero</code>, which has the kind <code class="language-plaintext highlighter-rouge">Nat</code>, and a type constructor <code class="language-plaintext highlighter-rouge">'Succ</code>, which accepts a type of kind <code class="language-plaintext highlighter-rouge">Nat</code>. 
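As a rough sketch of how this looks in a source file rather than in GHCi, the snippet below uses the promoted constructors in a type synonym. <code class="language-plaintext highlighter-rouge">Two</code> and <code class="language-plaintext highlighter-rouge">twoProxy</code> are names invented for this example, and <code class="language-plaintext highlighter-rouge">Proxy</code> (from <code class="language-plaintext highlighter-rouge">Data.Proxy</code> in <code class="language-plaintext highlighter-rouge">base</code>) is only there because types of kind <code class="language-plaintext highlighter-rouge">Nat</code> have no values of their own.

```haskell
{-# LANGUAGE DataKinds #-}

import Data.Proxy (Proxy (..))

data Nat = Zero | Succ Nat

-- Hypothetical name, for illustration: a type of kind Nat built from
-- the promoted constructors.
type Two = 'Succ ('Succ 'Zero)

-- Proxy is poly-kinded, so it can carry a type of kind Nat around at
-- the value level even though Two itself has no values.
twoProxy :: Proxy Two
twoProxy = Proxy

-- Uncommenting the next line should produce a kind error, because
-- 'Succ expects a type of kind Nat and Bool has kind *:
-- type Nonsense = 'Succ Bool
```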
Let’s ask GHCi about our new buddies: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> :kind 'Zero 'Zero :: Nat λ> :kind 'Succ 'Succ :: Nat -> Nat </code></pre></div></div> You might think: that looks familiar! And it should. After all, the types look very much the same! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> :type Zero Zero :: Nat λ> :type Succ Succ :: Nat -> Nat </code></pre></div></div> Where it can be ambiguous, the <code class="language-plaintext highlighter-rouge">'</code> is used to disambiguate. Otherwise, Haskell can infer which you mean. It’s important to note that there are no values of type <code class="language-plaintext highlighter-rouge">'Zero</code>. The only kind that can have types that can have values is <code class="language-plaintext highlighter-rouge">*</code>. We’ve gained the ability to construct some pretty basic types and kinds. In order to actually use them, though, we need a bit more power. <h1 id="gadts">GADTs</h1> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE GADTs #-} </code></pre></div></div> You may be wondering, “What does GADT stand for?” Richard Eisenberg will tell you that they’re <a href="https://www.youtube.com/watch?v=6snteFntvjM">Generalized Algebraic Data Types, but that the terminology isn’t helpful, so just think of them as Gadts</a>. GADTs are a tool we can use to provide extra type information by matching on constructors. They use a slightly different syntax than normal Haskell data types. 
Let’s check out some simpler types that we’ll write with this syntax: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Maybe a where Just :: a -> Maybe a Nothing :: Maybe a </code></pre></div></div> The GADT syntax lists the constructors line-by-line, and instead of providing the fields of the constructor, we provide the type signature of the constructor. This is an interesting change – I just wrote out <code class="language-plaintext highlighter-rouge">a -> Maybe a</code>. That suggests, to me, that I can make these whatever type I want. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data IntBool a where Int :: Int -> IntBool Int Bool :: Bool -> IntBool Bool </code></pre></div></div> This declaration creates a new type <code class="language-plaintext highlighter-rouge">IntBool</code>, which has the kind <code class="language-plaintext highlighter-rouge">* -> *</code>. It has two constructors: <code class="language-plaintext highlighter-rouge">Int</code>, which has the type <code class="language-plaintext highlighter-rouge">Int -> IntBool Int</code>, and <code class="language-plaintext highlighter-rouge">Bool</code>, which has the type <code class="language-plaintext highlighter-rouge">Bool -> IntBool Bool</code>. Since the constructors carry information about the resulting type, we get bonus information about the type when we pattern match on the constructors! Check this signature out: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>extractIntBool :: IntBool a -> a extractIntBool (Int _) = 0 extractIntBool (Bool b) = b </code></pre></div></div> Something really interesting is happening here! When we match on <code class="language-plaintext highlighter-rouge">Int</code>, we know that <code class="language-plaintext highlighter-rouge">IntBool a ~ IntBool Int</code>. 
That <code class="language-plaintext highlighter-rouge">~</code> tilde is a symbol for type equality, and introduces a constraint that GHC needs to solve to type check the code. For this branch, we know that <code class="language-plaintext highlighter-rouge">a ~ Int</code>, so we can return an <code class="language-plaintext highlighter-rouge">Int</code> value. We now have enough power in our toolbox to implement everyone’s favorite example of dependent types: length indexed vectors! <h1 id="vectors">Vectors</h1> Length indexed vectors allow us to put the length of a list into the type system, which allows us to statically forbid out-of-bounds errors. We have a way to promote numbers into the type level using <code class="language-plaintext highlighter-rouge">DataKinds</code>, and we have a way to provide bonus type information using <code class="language-plaintext highlighter-rouge">GADTs</code>. Let’s combine these two powers for this task. I’ll split this definition up into multiple blocks, so I can walk through it easily. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Vector (n :: Nat) a where </code></pre></div></div> We’re defining a type <code class="language-plaintext highlighter-rouge">Vector</code> with kind <code class="language-plaintext highlighter-rouge">Nat -> * -> *</code>. The first type parameter is the length index. The second type parameter is the type of values contained in the vector. Note that, in order to compile something with a kind signature, we need… <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE KindSignatures #-} </code></pre></div></div> Thinking about types often requires us to think in a logical manner. We often need to consider things inductively when constructing them, and recursively when destructing them. What is the base case for a vector? 
It’s the empty vector, with a length of <code class="language-plaintext highlighter-rouge">Zero</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> VNil :: Vector 'Zero a </code></pre></div></div> A value constructed by <code class="language-plaintext highlighter-rouge">VNil</code> can have any type <code class="language-plaintext highlighter-rouge">a</code>, but the length is always constrained to be <code class="language-plaintext highlighter-rouge">'Zero</code>. The inductive case is adding another value to a vector. One more value means one more length. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> VCons :: a -> Vector n a -> Vector ('Succ n) a </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">VCons</code> constructor takes two values: one of type <code class="language-plaintext highlighter-rouge">a</code>, and another of type <code class="language-plaintext highlighter-rouge">Vector n a</code>. We don’t know how long the <code class="language-plaintext highlighter-rouge">Vector</code> provided is – it can be any <code class="language-plaintext highlighter-rouge">n</code> such that <code class="language-plaintext highlighter-rouge">n</code> is a <code class="language-plaintext highlighter-rouge">Nat</code>ural number. We do know that the resulting vector is the <code class="language-plaintext highlighter-rouge">Succ</code>essor of that <code class="language-plaintext highlighter-rouge">n</code>umber, though. So here’s the fully annotated and explicit definition: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Vector (n :: Nat) (a :: *) where VNil :: Vector 'Zero a VCons :: a -> Vector n a -> Vector ('Succ n) a </code></pre></div></div> Fortunately, Haskell can infer these things for us! 
Whether you use the above explicit definition or the below implicit definition is a matter of taste, aesthetics, style, and documentation. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Vector n a where VNil :: Vector Zero a VCons :: a -> Vector n a -> Vector (Succ n) a </code></pre></div></div> Let’s now write a <code class="language-plaintext highlighter-rouge">Show</code> instance for these length indexed vectors. It’s pretty painless: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Show a => Show (Vector n a) where show VNil = "VNil" show (VCons a as) = "VCons " ++ show a ++ " (" ++ show as ++ ")" </code></pre></div></div> That <code class="language-plaintext highlighter-rouge">n</code> type parameter is totally arbitrary, so we don’t have to worry about it too much. <h2 id="the-vector-api">The Vector API</h2> As a nice exercise, let’s write <code class="language-plaintext highlighter-rouge">append :: Vector n a -> Vector m a -> Vector ??? a</code>. But wait, what is <code class="language-plaintext highlighter-rouge">???</code> going to be? It needs to represent the addition of these two natural numbers. Addition is a function. And we don’t have type functions, right? Well, we do, but we have to upgrade our GHC again. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE TypeFamilies #-} </code></pre></div></div> For some reason, functions that operate on types are called type families. There are two ways to write a type family: open, where anyone can add new cases, and closed, where all the cases are defined at once. We’ll mostly be dealing with closed type families here. So let’s figure out how to add two <code class="language-plaintext highlighter-rouge">Nat</code>ural numbers, at the type level. For starters, let’s figure out how to add at the value level first. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>add :: Nat -> Nat -> Nat </code></pre></div></div> This is the kind of standard Haskell function definition we all know and love. We can pattern match on values, write <code class="language-plaintext highlighter-rouge">where</code> clauses with helpers, etc. We’re working with an inductive definition of numbers, so we’ll need to use recursion to get our answer. We need a base case, and then the inductive case. So let’s start basic: if we add 0 to any number, then the answer is that number. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>add Zero n = n </code></pre></div></div> The inductive case asks: <blockquote> If we add the successor of a number (<code class="language-plaintext highlighter-rouge">Succ n</code>) to another number (<code class="language-plaintext highlighter-rouge">m</code>), what is the answer? </blockquote> Well, we know we want to get to <code class="language-plaintext highlighter-rouge">Zero</code>, so we want to somehow shrink our problem a bit. We’ll have to shift that <code class="language-plaintext highlighter-rouge">Succ</code> from the left term to the right term. Then we can recurse on the addition. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>add (Succ n) m = add n (Succ m) </code></pre></div></div> If you imagine a natural number as a stack of plates, we can visualize the addition of two natural numbers as taking one plate off the top of the first stack, and putting it on top of the second. Eventually, we’ll use all of the plates – this leaves us with <code class="language-plaintext highlighter-rouge">Zero</code> plates, and our final single stack of plates is the answer. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>add :: Nat -> Nat -> Nat add Zero n = n add (Succ n) m = add n (Succ m) </code></pre></div></div> Alright, let’s promote this to the type level. <h1 id="type-families">Type Families</h1> The first line of a type family definition is the signature: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type family Add n m where </code></pre></div></div> This introduces a new type function <code class="language-plaintext highlighter-rouge">Add</code> which accepts two parameters. We can now define the individual cases. We can pattern match on type constructors, just like we can pattern match on value constructors. So we’ll write the <code class="language-plaintext highlighter-rouge">Zero</code> case: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> Add 'Zero n = n </code></pre></div></div> Next, we recurse on the inductive case: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> Add ('Succ n) m = Add n ('Succ m) </code></pre></div></div> Ahh, except now GHC is going to give us an error. <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> • The type family application ‘Add n ('Succ m)’ is no smaller than the instance head (Use UndecidableInstances to permit this) • In the equations for closed type family ‘Add’ In the type family declaration for ‘Add’ </code></pre></div></div> GHC is extremely scared of undecidability, and won’t do anything that it can’t easily figure out on its own. <code class="language-plaintext highlighter-rouge">UndecidableInstances</code> is an extension which allows you to say: <blockquote> Look, GHC, it’s okay. I know you can’t figure this out. I promise this makes sense and will eventually terminate. 
</blockquote> So now we get to add: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE UndecidableInstances #-} </code></pre></div></div> to our file. The type family definition compiles fine now. How can we test it out? Where we used <code class="language-plaintext highlighter-rouge">:kind</code> to inspect the kind of types, we can use <code class="language-plaintext highlighter-rouge">:kind!</code> to evaluate these types as far as GHC can. This snippet illustrates the difference: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> :kind Add (Succ (Succ Zero)) (Succ Zero) Add (Succ (Succ Zero)) (Succ Zero) :: Nat λ> :kind! Add (Succ (Succ Zero)) (Succ Zero) Add (Succ (Succ Zero)) (Succ Zero) :: Nat = 'Succ ('Succ ('Succ 'Zero)) </code></pre></div></div> The first line just tells us that the result of <code class="language-plaintext highlighter-rouge">Add</code>ing two <code class="language-plaintext highlighter-rouge">Nat</code>ural numbers is itself a <code class="language-plaintext highlighter-rouge">Nat</code>ural number. The second line shows the actual result of evaluating the type level function. Cool! So now we can finally finish writing <code class="language-plaintext highlighter-rouge">append</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>append :: Vector n a -> Vector m a -> Vector (Add n m) a </code></pre></div></div> Let’s start with some bad attempts, to see what the types buy us: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>append VNil rest = VNil </code></pre></div></div> This fails with a type error – cool! <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> • Could not deduce: m ~ 'Zero from the context: n ~ 'Zero bound by a pattern with constructor: VNil :: forall a. 
Vector 'Zero a, in an equation for ‘append’ at /home/matt/Projects/dep-types.hs:31:8-11 ‘m’ is a rigid type variable bound by the type signature for: append :: forall (n :: Nat) a (m :: Nat). Vector n a -> Vector m a -> Vector (Add n m) a at /home/matt/Projects/dep-types.hs:30:11 Expected type: Vector (Add n m) a Actual type: Vector 'Zero a • In the expression: VNil In an equation for ‘append’: append VNil rest = VNil • Relevant bindings include rest :: Vector m a (bound at /home/matt/Projects/dep-types.hs:31:13) append :: Vector n a -> Vector m a -> Vector (Add n m) a (bound at /home/matt/Projects/dep-types.hs:31:1) </code></pre></div></div> The error is kinda big and scary at first. Let’s dig into it a bit. GHC is telling us that it can’t infer that <code class="language-plaintext highlighter-rouge">m</code> (which is the length of the second parameter vector) is equal to <code class="language-plaintext highlighter-rouge">Zero</code>. It knows that <code class="language-plaintext highlighter-rouge">n</code> (the length of the first parameter) is <code class="language-plaintext highlighter-rouge">Zero</code> because we’ve pattern matched on <code class="language-plaintext highlighter-rouge">VNil</code>. So, what values can we return? Let’s replace the definition we have thus far with <code class="language-plaintext highlighter-rouge">undefined</code>, reload in GHCi, and inspect some types: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> :t append VNil append VNil :: Vector m a -> Vector m a </code></pre></div></div> We need to construct a value <code class="language-plaintext highlighter-rouge">Vector m a</code>, and we have been given a value <code class="language-plaintext highlighter-rouge">Vector m a</code>. BUT – we don’t know what <code class="language-plaintext highlighter-rouge">m</code> is! So we have no way to spoof this or fake it. We have to return our input. 
So our first case is simply: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>append VNil xs = xs </code></pre></div></div> Like with addition of natural numbers, we’ll need to have the inductive case. Since we have the base case on our first parameter, we’ll want to try shrinking our first parameter in the recursive call. So let’s try another bad implementation: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>append (VCons a rest) xs = append rest (VCons a xs) </code></pre></div></div> This doesn’t really do what we want, which we can verify in the REPL: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> append (VCons 1 (VCons 3 VNil)) (VCons 2 VNil) VCons 3 (VCons 1 (VCons 2 (VNil))) </code></pre></div></div> The answer should be <code class="language-plaintext highlighter-rouge">VCons 1 (VCons 3 (VCons 2 VNil))</code>. However, our Vector type only encodes the length of the vector in the type. The sequence is not considered. Anything that isn’t lifted into the type system doesn’t get any correctness guarantees. So let’s fix the implementation: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>append (VCons a rest) xs = VCons a (append rest xs) </code></pre></div></div> And let’s reload in GHCi to test it out! <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> :reload [1 of 1] Compiling DepTypes ( /home/matt/Projects/dep-types.hs, interpreted ) /home/matt/Projects/dep-types.hs:32:28: error: • Could not deduce: Add n1 ('Succ m) ~ 'Succ (Add n1 m) from the context: n ~ 'Succ n1 bound by a pattern with constructor: VCons :: forall a (n :: Nat). 
a -> Vector n a -> Vector ('Succ n) a, in an equation for ‘append’ at /home/matt/Projects/dep-types.hs:32:9-20 Expected type: Vector (Add n m) a Actual type: Vector ('Succ (Add n1 m)) a • In the expression: VCons a (append rest xs) In an equation for ‘append’: append (VCons a rest) xs = VCons a (append rest xs) • Relevant bindings include xs :: Vector m a (bound at /home/matt/Projects/dep-types.hs:32:23) rest :: Vector n1 a (bound at /home/matt/Projects/dep-types.hs:32:17) append :: Vector n a -> Vector m a -> Vector (Add n m) a (bound at /home/matt/Projects/dep-types.hs:31:1) Failed, modules loaded: none. </code></pre></div></div> Oh no! A type error! GHC can’t figure out that <code class="language-plaintext highlighter-rouge">Add n (Succ m)</code> is the same as <code class="language-plaintext highlighter-rouge">Succ (Add n m)</code>. We can kinda see what went wrong if we lay the <code class="language-plaintext highlighter-rouge">Vector</code>, <code class="language-plaintext highlighter-rouge">Add</code> and <code class="language-plaintext highlighter-rouge">append</code> definitions next to each other: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Vector n a where VNil :: Vector Zero a VCons :: a -> Vector n a -> Vector (Succ n) a type family Add x y where Add 'Zero n = n Add ('Succ n) m = Add n ('Succ m) append :: Vector n a -> Vector m a -> Vector (Add n m) a append VNil xs = xs append (VCons a rest) xs = VCons a (append rest xs) </code></pre></div></div> In <code class="language-plaintext highlighter-rouge">Vector</code>’s inductive case, we are building up a bunch of <code class="language-plaintext highlighter-rouge">Succ</code>s. In <code class="language-plaintext highlighter-rouge">Add</code>’s recursive case, we’re tearing down the left hand side, such that the exterior is another <code class="language-plaintext highlighter-rouge">Add</code>. 
And in <code class="language-plaintext highlighter-rouge">append</code>’s recursive case, we’re building up the right hand side. Let’s trace how this error happens, and supply some type annotations as well: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>append (VCons a rest) xs = </code></pre></div></div> Here, we know that <code class="language-plaintext highlighter-rouge">VCons a rest</code> has the type <code class="language-plaintext highlighter-rouge">Vector (Succ n) a</code>, and <code class="language-plaintext highlighter-rouge">xs</code> has the type <code class="language-plaintext highlighter-rouge">Vector m a</code>. We need to produce a result of type <code class="language-plaintext highlighter-rouge">Vector (Add (Succ n) m) a</code> in order for the type to line up right. We use <code class="language-plaintext highlighter-rouge">VCons a (append rest xs)</code>. <code class="language-plaintext highlighter-rouge">VCons</code> has a length value that is the <code class="language-plaintext highlighter-rouge">Succ</code>essor of the result of <code class="language-plaintext highlighter-rouge">append rest xs</code>, which should have the value <code class="language-plaintext highlighter-rouge">Add n m</code>, so the length there is <code class="language-plaintext highlighter-rouge">Succ (Add n m)</code>. Unfortunately, our result type needs to be <code class="language-plaintext highlighter-rouge">Add (Succ n) m</code>, which GHC reduces to <code class="language-plaintext highlighter-rouge">Add n (Succ m)</code> using the second equation of <code class="language-plaintext highlighter-rouge">Add</code>. We know these values are equivalent. Unfortunately, GHC cannot prove this, so it throws up its hands. Two definitions, which are provably equivalent, are structurally different, and this causes the types and proofs to fail. This is a HUGE gotcha in type level programming – the implementation details matter, a lot, and they leak, hard.
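The claim that the two sides are equal can at least be sanity-checked at the value level. The sketch below reuses the value-level add from earlier and tests the analogue of the equation GHC could not deduce on a handful of small inputs. The helpers fromInt, lawHolds and checkSmall are hypothetical names introduced here, and this is testing, not a proof.

```haskell
-- Value-level sanity check (not a proof) of the equation GHC could not
-- deduce: Add n ('Succ m) ~ 'Succ (Add n m).

data Nat = Zero | Succ Nat
  deriving (Eq, Show)

-- The accumulating definition of add from earlier in the post.
add :: Nat -> Nat -> Nat
add Zero n = n
add (Succ n) m = add n (Succ m)

-- Hypothetical helper (not in the post): build a Nat from an Int.
fromInt :: Int -> Nat
fromInt k
  | k <= 0 = Zero
  | otherwise = Succ (fromInt (k - 1))

-- Value-level analogue of the type equality in the error message.
lawHolds :: Nat -> Nat -> Bool
lawHolds n m = add n (Succ m) == Succ (add n m)

-- Check the law for every pair of inputs between 0 and 5.
checkSmall :: Bool
checkSmall =
  and [lawHolds (fromInt i) (fromInt j) | i <- [0 .. 5], j <- [0 .. 5]]

main :: IO ()
main = print checkSmall
```

The values agree, but the type checker only compares the shapes that the type family equations reduce to, so this equality never becomes visible to it.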
We can fix this by using a slightly different definition of <code class="language-plaintext highlighter-rouge">Add</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type family Add x y where Add 'Zero n = n Add ('Succ n) m = 'Succ (Add n m) </code></pre></div></div> This definition has a similar structure of recursion – we pull the <code class="language-plaintext highlighter-rouge">Succ</code>s out, which allows us to match the way that <code class="language-plaintext highlighter-rouge">VCons</code> adds <code class="language-plaintext highlighter-rouge">Succ</code> on top. This new definition compiles and works fine. <h1 id="this-sucks">This Sucks</h1> Agreed, which is why I’ll refer the interested reader to <a href="https://www.schoolofhaskell.com/user/konn/prove-your-haskell-for-great-safety/dependent-types-in-haskell">this much better tutorial</a> on length indexed vectors in Haskell. Instead, let’s look at some other more interesting and practical examples of type level programming. <h1 id="heterogeneous-lists">Heterogeneous Lists</h1> Heterogeneous lists are kind of like tuples, but they’re defined inductively. We keep a type level list of the contents of the heterogeneous list, which lets us operate safely on them. To use ordinary Haskell lists at the type level, we need another extension: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE TypeOperators #-} </code></pre></div></div> which allows us to use operators at the type level. Here’s the data type definition: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data HList xs where HNil :: HList '[] (:::) :: a -> HList as -> HList (a ': as) infixr 6 ::: </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">HNil</code> constructor has an empty list of values, which makes sense, because it doesn’t have any values!
The <code class="language-plaintext highlighter-rouge">:::</code> construction operator takes a value of type <code class="language-plaintext highlighter-rouge">a</code>, an <code class="language-plaintext highlighter-rouge">HList</code> that already has a list of types <code class="language-plaintext highlighter-rouge">as</code> that it contains, and returns an <code class="language-plaintext highlighter-rouge">HList</code> where the first element in the type level list is <code class="language-plaintext highlighter-rouge">a</code> followed by <code class="language-plaintext highlighter-rouge">as</code>. Let’s see what a value for this looks like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> :t 'a' ::: 1 ::: "hello" ::: HNil 'a' ::: 1 ::: "hello" ::: HNil :: HList '[Char, Int, String] </code></pre></div></div> So now we know that we have a <code class="language-plaintext highlighter-rouge">Char</code>, <code class="language-plaintext highlighter-rouge">Int</code>, and <code class="language-plaintext highlighter-rouge">String</code> contained in this <code class="language-plaintext highlighter-rouge">HList</code>, and their respective indexes. What if we want to <code class="language-plaintext highlighter-rouge">Show</code> that? <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> 'a' ::: 1 ::: "hello" ::: HNil \<interactive\>:13:1: No instance for (Show (HList '[Char, Int, String])) arising from a use of ‘print’ In the first argument of ‘print’, namely ‘it’ In a stmt of an interactive GHCi command: print it </code></pre></div></div> Hmm. We’ll need to write a <code class="language-plaintext highlighter-rouge">Show</code> instance for <code class="language-plaintext highlighter-rouge">HList</code>. How should we approach this? Let’s try something dumb first. We’ll ignore all the contents! 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Show (HList xs) where show HNil = "HNil" show (x ::: rest) = "_ ::: " ++ show rest </code></pre></div></div> Ahah! this compiles, and it even works! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> 'a' ::: 1 ::: "hello" ::: HNil _ ::: _ ::: _ ::: HNil </code></pre></div></div> Unfortunately, it’s not very useful. Can we do better? We can! <h1 id="inductive-type-class-instances">Inductive Type Class Instances</h1> First, we’ll define the base case – showing an empty HList! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Show (HList '[]) where show HNil = "HNil" </code></pre></div></div> This causes a compile error, requiring that we enable yet another language extension: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE FlexibleInstances #-} </code></pre></div></div> If you’re doing this in another file than the type family above, you’ll also get an error about <code class="language-plaintext highlighter-rouge">FlexibleContexts</code>. It turns out that enabling <code class="language-plaintext highlighter-rouge">UndecidableInstances</code> implies <code class="language-plaintext highlighter-rouge">FlexibleContexts</code> for some reason. So let’s throw that one on too, for good measure: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE FlexibleContexts #-} </code></pre></div></div> This compiles, and we can finally <code class="language-plaintext highlighter-rouge">show HNil</code> and it works out. Now, we must recurse! 
The principle of induction states that: <ol> <li>We must be able to do something for the base case.</li> <li>If we can do something for an arbitrary case, then we can do it for a case that is one step larger.</li> <li>By 1 and 2, you can do it for all cases.</li> </ol> We’ve covered that base case. We’ll assume that we can handle the smaller cases, and demonstrate how to handle a slightly larger case: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance (Show (HList as), Show a) => Show (HList (a ': as)) where show (a ::: rest) = show a ++ " ::: " ++ show rest </code></pre></div></div> This instance basically says: <blockquote> Given that I know how to <code class="language-plaintext highlighter-rouge">Show</code> an <code class="language-plaintext highlighter-rouge">HList</code> of <code class="language-plaintext highlighter-rouge">as</code>, and I know how to <code class="language-plaintext highlighter-rouge">Show</code> an <code class="language-plaintext highlighter-rouge">a</code>: I can <code class="language-plaintext highlighter-rouge">Show</code> an <code class="language-plaintext highlighter-rouge">HList</code> with an <code class="language-plaintext highlighter-rouge">a</code> and a bunch of <code class="language-plaintext highlighter-rouge">as</code>. </blockquote> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> 'a' ::: 1 ::: "hello" ::: HNil 'a' ::: 1 ::: "hello" ::: HNil </code></pre></div></div> <h1 id="further-exercises">Further Exercises</h1> Write an <a href="https://hackage.haskell.org/package/aeson"><code class="language-plaintext highlighter-rouge">aeson</code></a> instance for <code class="language-plaintext highlighter-rouge">HList</code>. It’ll be similar to the <code class="language-plaintext highlighter-rouge">Show</code> instance, but require a bit more stuff.
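For reference, the HList pieces from this section can be collected into one file. This is a sketch that should load as written; the explicit kind annotation on HList, the example binding and main are additions made here, and pinning the literal 1 to Int through the type signature is an assumption chosen to keep the example unambiguous.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}

import Data.Kind (Type)

-- A heterogeneous list indexed by the type-level list of its element types.
data HList (xs :: [Type]) where
  HNil  :: HList '[]
  (:::) :: a -> HList as -> HList (a ': as)

infixr 6 :::

-- Base case: the empty HList.
instance Show (HList '[]) where
  show HNil = "HNil"

-- Inductive case: show the head, then defer to the instance for the tail.
instance (Show (HList as), Show a) => Show (HList (a ': as)) where
  show (a ::: rest) = show a ++ " ::: " ++ show rest

-- The signature fixes the numeric literal to Int.
example :: HList '[Char, Int, String]
example = 'a' ::: 1 ::: "hello" ::: HNil

main :: IO ()
main = print example
```

Instance resolution walks the type-level list one element at a time, so a list containing an element type without a Show instance is rejected at compile time.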
<h1 id="extensible-records">Extensible Records</h1> There are a few variants on extensible records in Haskell. Here’s a tiny implementation that requires yet more extensions: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE PolyKinds #-} {-# LANGUAGE TypeApplications #-} import GHC.TypeLits (KnownSymbol, symbolVal) import Data.Proxy </code></pre></div></div> <code class="language-plaintext highlighter-rouge">PolyKinds</code> lets GHC infer polymorphic kinds for type variables instead of defaulting them to <code class="language-plaintext highlighter-rouge">*</code>, which is what allows the phantom <code class="language-plaintext highlighter-rouge">s</code> parameter below to be a <code class="language-plaintext highlighter-rouge">Symbol</code>. Type applications allow us to explicitly pass types as arguments using an <code class="language-plaintext highlighter-rouge">@</code> symbol. First, we must define the type of our fields, and then our record: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype s >> a = Named a data HRec xs where HEmpty :: HRec '[] HCons :: (s >> a) -> HRec xs -> HRec (s >> a ': xs) </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">s</code> parameter is going to be a type with the kind <code class="language-plaintext highlighter-rouge">Symbol</code>. <code class="language-plaintext highlighter-rouge">Symbol</code> is defined in <code class="language-plaintext highlighter-rouge">GHC.TypeLits</code>, so we need that import to do the fun stuff. We’ll construct a value using the <code class="language-plaintext highlighter-rouge">TypeApplications</code> syntax, so a record will look like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> HCons (Named @"foo" 'a') (HCons (Named @"bar" (3 :: Int)) HEmpty) \<interactive\>:10:1: error: • No instance for (Show (HRec '["foo" >> Char, "bar" >> Int])) arising from a use of ‘print’ • In a stmt of an interactive GHCi command: print it </code></pre></div></div> So, this type checks fine! Cool.
But it does not Show, so we need to define a <code class="language-plaintext highlighter-rouge">Show</code> instance. Those string record fields only exist at the type level – but we can use the <code class="language-plaintext highlighter-rouge">KnownSymbol</code> class to bring them back down to the value level using <code class="language-plaintext highlighter-rouge">symbolVal</code>. Here’s our base case: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Show (HRec '[]) where show _ = "HEmpty" </code></pre></div></div> And, when we recurse, we need a tiny bit more information. Let’s start with the instance head first, so we know what variables we need: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Show (HRec (s >> a ': xs)) where </code></pre></div></div> OK, so we have a <code class="language-plaintext highlighter-rouge">s</code> type, which has the kind <code class="language-plaintext highlighter-rouge">Symbol</code>, an <code class="language-plaintext highlighter-rouge">a :: *</code>, and <code class="language-plaintext highlighter-rouge">xs</code>. So now we pattern match on that bad boy: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Show (HRec (s >> a ': xs)) where show (HCons (Named a) rest) = </code></pre></div></div> OK, so we need to <code class="language-plaintext highlighter-rouge">show</code> the <code class="language-plaintext highlighter-rouge">a</code> value. Easy. Which means we need a <code class="language-plaintext highlighter-rouge">Show a</code> constraint tacked onto our instance: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance (Show a) => Show (HRec (s >> a ': xs)) where show (HCons (Named a) rest) = let val = show a </code></pre></div></div> Next up, we need the key as a string. 
Which means we need to use <code class="language-plaintext highlighter-rouge">symbolVal</code>, which takes a <code class="language-plaintext highlighter-rouge">proxy s</code> and returns the <code class="language-plaintext highlighter-rouge">String</code> associated with the <code class="language-plaintext highlighter-rouge">s</code> provided that <code class="language-plaintext highlighter-rouge">s</code> is a <code class="language-plaintext highlighter-rouge">KnownSymbol</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance (Show a, KnownSymbol s) => Show (HRec (s >> a ': xs)) where show (HCons (Named a) rest) = let val = show a key = symbolVal (Proxy :: Proxy s) </code></pre></div></div> At this point, you’re probably going to get an error like <code class="language-plaintext highlighter-rouge">No instance for 'KnownSymbol s0'</code>. This is because Haskell’s type variables have a very limited scope by default. When you write: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>topLevelFunction :: a -> (a -> b) -> b topLevelFunction a = go where go :: (a -> b) -> b go f = f a </code></pre></div></div> Haskell interprets each type signature as its own scope for the type variables.
This means that the <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">b</code> variables in the <code class="language-plaintext highlighter-rouge">go</code> helper function are different type variables, and a more precise way to write it would be: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>topLevelFunction :: a0 -> (a0 -> b0) -> b0 topLevelFunction a = go where go :: (a1 -> b1) -> b1 go f = f a </code></pre></div></div> If we want for type variables to have a scope similar to other variables, we need another extension: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE ScopedTypeVariables #-} </code></pre></div></div> Finally, we need to show the rest of the stuff! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance (Show a, KnownSymbol s, Show (HRec xs)) => Show (HRec (s >> a ': xs)) where show (HCons (Named a) rest) = let val = show a key = symbolVal (Proxy :: Proxy s) more = show rest in "(" ++ key ++ ": " ++ val ++ ") " ++ more </code></pre></div></div> This gives us a rather satisfying <code class="language-plaintext highlighter-rouge">Show</code> instance, now: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> HCons (Named @"foo" 'a') (HCons (Named @"bar" (3 :: Int)) HEmpty) (foo: 'a') (bar: 3) HEmpty </code></pre></div></div> <h1 id="exercise">Exercise:</h1> Write an Aeson <code class="language-plaintext highlighter-rouge">ToJSON</code> instance for this <code class="language-plaintext highlighter-rouge">HRec</code> type which converts into a JSON object. Bonus points: Write an Aeson <code class="language-plaintext highlighter-rouge">FromJSON</code> instance for the <code class="language-plaintext highlighter-rouge">HRec</code> type. 
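As a reference point before attempting the exercise, the HRec pieces from this section can be assembled into a single file. This is a sketch assuming a reasonably recent GHC; the example binding and main are additions made here, and everything else follows the definitions above.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}

import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownSymbol, symbolVal)

-- A value of type a, tagged with a phantom label s.
newtype s >> a = Named a

-- A record indexed by the type-level list of its labelled fields.
data HRec xs where
  HEmpty :: HRec '[]
  HCons  :: (s >> a) -> HRec xs -> HRec (s >> a ': xs)

-- Base case: the empty record.
instance Show (HRec '[]) where
  show _ = "HEmpty"

-- Inductive case. ScopedTypeVariables brings the s from the instance
-- head into scope, so the Proxy annotation refers to the same s.
instance (Show a, KnownSymbol s, Show (HRec xs))
    => Show (HRec (s >> a ': xs)) where
  show (HCons (Named a) rest) =
    let val  = show a
        key  = symbolVal (Proxy :: Proxy s)
        more = show rest
     in "(" ++ key ++ ": " ++ val ++ ") " ++ more

example :: HRec '["foo" >> Char, "bar" >> Int]
example = HCons (Named @"foo" 'a') (HCons (Named @"bar" (3 :: Int)) HEmpty)

main :: IO ()
main = print example
```

A ToJSON instance for the exercise can follow the same two-instance shape, with symbolVal supplying the object keys.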
<h1 id="like-what-you-read">Like what you read?</h1> If you enjoyed this post, you should check out the book <a href="https://thinkingwithtypes.com/">Thinking with Types</a> by Sandy Maguire. It is an extensive manual on practical type-level programming in Haskell. Wed, 26 Apr 2017 00:00:00 +0000 https://www.parsonsmatt.org/2017/04/26/basic_type_level_programming_in_haskell.html Maybe? Use a type parameter! Haskell’s a powerful and flexible language for modeling the real world. By pushing information into the type level, we can make our program safer and easier to refactor. Where many safety features provide limitations, we also get flexibility from these. So let’s look at a common Real World data set: a microblogging system with Users, Posts, and Organizations! <h1 id="the-data-model">The Data Model</h1> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data User = User { userName :: Text , userPosts :: [Post] , userOrganization :: Maybe Organization } data Organization = Organization { organizationName :: Text , organizationUsers :: [User] } data Post = Post { postTitle :: Text , postBody :: Text , postComments :: [Post] , postAuthor :: User } </code></pre></div></div> This is a pretty simple data model, and it captures our relationships fairly well. However, it has some issues – A <code class="language-plaintext highlighter-rouge">User</code>’s <code class="language-plaintext highlighter-rouge">Organization</code> is going to link back to that <code class="language-plaintext highlighter-rouge">User</code>, which is going to result in a cycle! If we try to <code class="language-plaintext highlighter-rouge">print</code> that <code class="language-plaintext highlighter-rouge">User</code>, then it’ll go on forever.
Also, any function which takes a <code class="language-plaintext highlighter-rouge">User</code> and operates on the <code class="language-plaintext highlighter-rouge">Organization</code> will have to consider the <code class="language-plaintext highlighter-rouge">Maybe</code>. Consider this function that gets a user’s comembers in the organization: <h1 id="the-pain-points">The Pain Points</h1> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>coworkers :: User -> Maybe [User] coworkers user = case userOrganization user of Nothing -> Nothing Just organization -> Just (organizationUsers organization) -- or, coworkers = fmap organizationUsers . userOrganization </code></pre></div></div> Having <code class="language-plaintext highlighter-rouge">Maybe</code> values all over the place is much nicer than implicit <code class="language-plaintext highlighter-rouge">null</code>, but it’s still a pain compared to ordinary values. When we’re loading this information from the database, it’s going to be a little awkward, as our database model isn’t going to correspond exactly to this. We’d need to have slightly different data types to represent keys, rather than entities: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data DbUser = DbUser { dbUserName :: Text , dbUserOrganization :: Maybe OrganizationId } data DbOrganization = DbOrganization { dbOrganizationName :: Text } data DbPost = DbPost { dbPostTitle :: Text , dbPostBody :: Text , dbPostAuthor :: UserId , dbPostParent :: Maybe PostId } type UserId = Text type OrganizationId = Text type PostId = Text </code></pre></div></div> So now, we represent a <code class="language-plaintext highlighter-rouge">DbUser</code> with a name and an optional organization ID, for which we’ll use the organization’s name. An organization just contains its name – the relationship to Users is contained by the User model.
Likewise, posts no longer contain a reference to their replies, but instead a reference to the post that they are a reply to. Users don’t have Posts directly, and the Post model refers to the author. Man, this is getting to be a lot of boilerplate, and there’s a lot of duplication. It seems like this can be simplified or made more general. Maybe we can reach for some Template Haskell, or perhaps we should get some extensible records library and turn on the kitchen sink of language extensions. <h1 id="the-template-haskell-solution">The Template Haskell Solution</h1> Actually, <h1 id="lets-not">Let’s not</h1> tis a silly place Instead, let’s inspect some commonalities in our <code class="language-plaintext highlighter-rouge">User</code> and <code class="language-plaintext highlighter-rouge">DbUser</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data User = User { userName :: Text , userPosts :: [Post] , userOrganization :: Maybe Organization } data DbUser = DbUser { dbUserName :: Text , dbUserOrganization :: Maybe OrganizationId } </code></pre></div></div> So the <code class="language-plaintext highlighter-rouge">name</code> remains the same, but the shape of the organization changes – we have a <code class="language-plaintext highlighter-rouge">Maybe</code> in both cases, but a reference/ID for the database and an entity for the user. The database also has no concept of the Posts. Our first step in cleaning this up is in making the organization a type parameter: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data User org = User { userName :: Text , userOrganization :: Maybe org } type UserModel = User Organization type UserDb = User OrganizationId </code></pre></div></div> And now, our data model allows us to use the same type to describe these two use cases! So this is a small victory.
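One immediate payoff of the type parameter is that a function which does not care how the organization is represented can be written once and used at both UserModel and UserDb. The sketch below makes several assumptions: String stands in for Text to avoid a package dependency, Organization is cut down to just a name, and describeUser is a hypothetical helper that does not appear in the post.

```haskell
-- Sketch only: String stands in for Text, and Organization is simplified.

data User org = User
  { userName         :: String
  , userOrganization :: Maybe org
  }

newtype Organization = Organization { organizationName :: String }

type OrganizationId = String

type UserModel = User Organization
type UserDb    = User OrganizationId

-- Hypothetical helper: render a user, given a way to render whatever
-- organization representation it happens to carry.
describeUser :: (org -> String) -> User org -> String
describeUser showOrg user =
  userName user
    ++ maybe "" (\org -> " (" ++ showOrg org ++ ")") (userOrganization user)

modelUser :: UserModel
modelUser = User "matt" (Just (Organization "acme"))

dbUser :: UserDb
dbUser = User "matt" (Just "acme-id")

main :: IO ()
main = do
  putStrLn (describeUser organizationName modelUser)
  putStrLn (describeUser id dbUser)
```

The same function body serves both representations, and the caller decides how an organization is rendered.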
We can take it a bit further, though – why hardcode the <code class="language-plaintext highlighter-rouge">Maybe</code>ness of that organization? We’ve solved some of the boilerplate, but we still have the issue with <code class="language-plaintext highlighter-rouge">coworkers</code> returning a <code class="language-plaintext highlighter-rouge">Maybe</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>coworkers :: User Organization -> Maybe [User OrganizationId] coworkers = fmap organizationUsers . userOrganization </code></pre></div></div> So, let’s remove the <code class="language-plaintext highlighter-rouge">Maybe</code> from our definition, which moves the absence or presence of the organization from the value level to the type level. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data User org = User { userName :: Text , userOrganization :: org } </code></pre></div></div> Now, let’s look at all of our cool variants! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type UserWithOrg = User Organization type UserInDb = User (Maybe OrganizationId) type UserWithOrgId = User OrganizationId type UserWithoutOrganization = User () </code></pre></div></div> We can express some really neat stuff here. Our type for <code class="language-plaintext highlighter-rouge">coworkers</code> is a lot nicer: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>coworkers :: User Organization -> [User OrganizationId] coworkers = organizationUsers . userOrganization </code></pre></div></div> We’re now disallowed from passing a <code class="language-plaintext highlighter-rouge">User</code> in unless we’ve already given that user an <code class="language-plaintext highlighter-rouge">Organization</code>. 
We’ve also gained a nice way of bottoming out our relationship: the <code class="language-plaintext highlighter-rouge">Organization</code> contains a list of users with organization references, instead of actual organizations. This makes it safe to print the whole thing out. We can also immediately see whether or not we need to do joins, inner joins, left joins, etc. because the nature of the relationship is specified in the type. The functions for loading stuff out of the database look like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- | Load all the users out of the database. -- This is an ordinary select. loadUsers :: Database [User (Maybe OrganizationId)] loadUsers = execute [sql| select users.* from users |] -- | Load all the users with organizations out of the database. -- This does an inner join. loadUsersWithOrganizations :: Database [User OrganizationId] loadUsersWithOrganizations = execute [sql| select users.* from users inner join organizations on users.organization_id = organizations.id |] -- | Load all the users with their organization loadUsersAndOrganizations :: Database [User Organization] loadUsersAndOrganizations = combine <$> execute [sql| select users.*, organizations.* from users inner join organizations on users.organization_id = organizations.id |] where combine user organization = user { userOrganization = organization } </code></pre></div></div> You’d also know from the type signature if we did a <code class="language-plaintext highlighter-rouge">left join</code> instead, since we’d have a <code class="language-plaintext highlighter-rouge">Maybe Organization</code>. So, how would I write this model out?
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data User org posts = User { userName :: Text , userOrganization :: org , userPosts :: posts } data Organization users = Organization { organizationName :: Text , organizationUsers :: users } data Post user = Post { postTitle :: Text , postBody :: Text , postAuthor :: user } </code></pre></div></div> I’m not going to contain the Post hierarchy within the post datatype, because that makes that data type responsible for too much. If I want to represent that, a <code class="language-plaintext highlighter-rouge">Tree (Post user)</code> does fine. What’s another benefit we get from this? <h1 id="type-classes">TYPE CLASSES</h1> Oh dang! Now that our <code class="language-plaintext highlighter-rouge">User</code>, <code class="language-plaintext highlighter-rouge">Organization</code>, and <code class="language-plaintext highlighter-rouge">Post</code> have type parameters, we can write <code class="language-plaintext highlighter-rouge">Functor</code>, <code class="language-plaintext highlighter-rouge">Foldable</code>, <code class="language-plaintext highlighter-rouge">Traversable</code>, etc instances. 
Actually, we don’t have to – we can derive them with the help of our language extension friends: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# language DeriveFunctor, DeriveFoldable, DeriveTraversable #-}

data User org posts = User
    { userName :: Text
    , userOrganization :: org
    , userPosts :: posts
    } deriving (Functor, Foldable, Traversable)
</code></pre></div></div> These derived instances work over the last type parameter, <code class="language-plaintext highlighter-rouge">posts</code>. <code class="language-plaintext highlighter-rouge">User</code> can also be made an instance of <code class="language-plaintext highlighter-rouge">Bifunctor</code>, <code class="language-plaintext highlighter-rouge">Bifoldable</code>, and <code class="language-plaintext highlighter-rouge">Bitraversable</code> (GHC won’t derive those for us, so they need to be written by hand or generated by a library), so we can map over both the <code class="language-plaintext highlighter-rouge">org</code> and the <code class="language-plaintext highlighter-rouge">posts</code> parameter. This gives us a lot of good code reuse. <h1 id="make-fields-for-fun-and-profit">Make Fields For Fun And Profit</h1> As the final thing to do, we’ll use the <code class="language-plaintext highlighter-rouge">Control.Lens</code> function <code class="language-plaintext highlighter-rouge">makeFields</code> to make it easy to access these types. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>makeFields ''User
makeFields ''Organization
makeFields ''Post
</code></pre></div></div> And now we can write code like <code class="language-plaintext highlighter-rouge">user ^. organization . name</code> to access a user’s organization name, or <code class="language-plaintext highlighter-rouge">user ^.. posts . traverse . title</code> to collect the titles of a user’s posts (assuming <code class="language-plaintext highlighter-rouge">posts</code> is a list). <h1 id="further-watching">Further Watching</h1> This post is certainly inspired by <a href="https://www.youtube.com/watch?v=BHjIl81HgfE">Stephen Compall’s ComposeConf talk</a>, which is a great thing to watch. 
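Coming back to <code class="language-plaintext highlighter-rouge">Bifunctor</code> for a moment: GHC has no built-in deriving for it, so here’s a minimal sketch of what a hand-written instance could look like. As assumptions, I’m using <code class="language-plaintext highlighter-rouge">String</code> in place of <code class="language-plaintext highlighter-rouge">Text</code> to stay within <code class="language-plaintext highlighter-rouge">base</code>, and <code class="language-plaintext highlighter-rouge">summarize</code> is a hypothetical helper that only exists for this example.

```haskell
import Data.Bifunctor (Bifunctor (..))

-- Sketch only: String stands in for Text so this needs nothing beyond base.
data User org posts = User
    { userName :: String
    , userOrganization :: org
    , userPosts :: posts
    }
    deriving (Show, Eq)

-- bimap gets one function per type parameter: f for org, g for posts.
instance Bifunctor User where
    bimap f g (User name org posts) = User name (f org) (g posts)

-- Hypothetical helper: replace the organization name with its length
-- and the list of posts with a count.
summarize :: User String [post] -> User Int Int
summarize = bimap length length
```

With this instance, <code class="language-plaintext highlighter-rouge">first</code> touches only the organization and <code class="language-plaintext highlighter-rouge">second</code> touches only the posts, which is handy when a query loads one side of the relationship but not the other.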
Sat, 08 Apr 2017 00:00:00 +0000 https://www.parsonsmatt.org/2017/04/08/maybe_use_a_type_parameter.html https://www.parsonsmatt.org/2017/04/08/maybe_use_a_type_parameter.html OOPH: Data Inheritance (this post is part of a series on Object Oriented Programming in Haskell – see <a href="/tutorials/">tutorials</a> for a table of contents) <blockquote> Does Haskell have inheritance? </blockquote> Well, no, it doesn’t, because Haskell does not have objects, and inheritance is a relationship between two objects. Objects are a combination of internal state (data) and methods (behavior). Since inheritance is a combination of both of these things, we’ll need to treat them separately. This post will approach data inheritance, since it is somewhat easier to deal with. tl;dr: Decompose OO features into simple parts, reconstitute with FP <h1 id="the-objects-java">The Objects: Java</h1> Let’s consider a simple example of data-only inheritance. We’ll use Java for this example. <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class Shape {
    public final int x;
    public final int y;

    public Shape(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class Circle extends Shape {
    public final int radius;

    public Circle(int x, int y, int radius) {
        super(x, y);
        this.radius = radius;
    }
}
</code></pre></div></div> Here we’ve defined a class <code class="language-plaintext highlighter-rouge">Shape</code> that contains the X and Y coordinates where it is located. We extend <code class="language-plaintext highlighter-rouge">Shape</code> to create the <code class="language-plaintext highlighter-rouge">Circle</code> class, which also has a <code class="language-plaintext highlighter-rouge">radius</code>. 
Whenever we have a <code class="language-plaintext highlighter-rouge">Circle</code>, or any other class that extends <code class="language-plaintext highlighter-rouge">Shape</code>, we know we can access the <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code> properties on that class. In the introduction, I mentioned that inheritance can be broken down into simpler features. Let’s put on our “Objects and Messages” thinking caps, and think about what we mean by these constructs. <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Circle c = new Circle(1, 2, 3);
</code></pre></div></div> We’ll read this as: <blockquote> Create a new Circle object with the values 1, 2, 3. </blockquote> <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>c.x;
</code></pre></div></div> Generally, this is read as “access the property <code class="language-plaintext highlighter-rouge">x</code> on the object <code class="language-plaintext highlighter-rouge">c</code>.” Let’s instead read it as: <blockquote> Send the message <code class="language-plaintext highlighter-rouge">x</code> to the object <code class="language-plaintext highlighter-rouge">c</code>. </blockquote> This gets us more in the “pure” object oriented sense of things. So, when we send the message <code class="language-plaintext highlighter-rouge">x</code> to the object <code class="language-plaintext highlighter-rouge">c</code>, how does it know how to respond? First, it’ll do a lookup on all of its attributes. <code class="language-plaintext highlighter-rouge">Circle</code> only has a <code class="language-plaintext highlighter-rouge">radius</code> attribute. It doesn’t have an <code class="language-plaintext highlighter-rouge">x</code> attribute. It’s not time to give up, though. The next thing it does is delegate the message to the parent class. 
<code class="language-plaintext highlighter-rouge">Shape</code> does have <code class="language-plaintext highlighter-rouge">x</code> defined, so we can respond with that value. What if we do this: <code class="language-plaintext highlighter-rouge">c.toString()</code>? Well, <code class="language-plaintext highlighter-rouge">Shape</code> doesn’t have <code class="language-plaintext highlighter-rouge">toString()</code> in its list of defined attributes. Java (like most OOP languages) has an implicit <code class="language-plaintext highlighter-rouge">Object</code> class that is the superclass of all objects. <code class="language-plaintext highlighter-rouge">Object</code> does have <code class="language-plaintext highlighter-rouge">toString()</code> defined, so <code class="language-plaintext highlighter-rouge">c</code> delegates to <code class="language-plaintext highlighter-rouge">Shape</code>, which then delegates to <code class="language-plaintext highlighter-rouge">Object</code>. Super classes, mixins, traits, etc. are all – conceptually – just an automatic form of delegation. “If I don’t know how to handle something, ask this object.” If we were to ban the <code class="language-plaintext highlighter-rouge">extends</code> keyword in Java, then we could recover the same functionality with object composition: <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class Shape {
    private final int x;
    private final int y;

    public Shape(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public int getX() { return this.x; }
    public int getY() { return this.y; }
}

class Circle {
    private final Shape shape;
    private final int radius;

    public Circle(int x, int y, int radius) {
        this.shape = new Shape(x, y);
        this.radius = radius;
    }

    // Every Shape message has to be forwarded by hand.
    public int getX() { return this.shape.getX(); }
    public int getY() { return this.shape.getY(); }
    public int getRadius() { return this.radius; }
}
</code></pre></div></div> Oof, that’s a lot of boilerplate. 
No wonder people prefer inheritance. In Java, you’d probably want to define an interface so that you can use <code class="language-plaintext highlighter-rouge">Shape</code>s polymorphically, but we’ll ignore that for now. <h1 id="port-to-haskell">Port to Haskell</h1> Now that we’ve simplified the implementation of data inheritance in Java to the core features of OOP, it’s easy to port it to Haskell. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Shape = Shape { x :: Int, y :: Int }

data Circle = Circle { shape :: Shape, radius :: Int }
</code></pre></div></div> That’s it! We just compose two bits of data. That’s the easy way. However, we have some flexibility problems. To access the <code class="language-plaintext highlighter-rouge">Circle</code>’s <code class="language-plaintext highlighter-rouge">x</code> parameter, we need to compose functions: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>circleX = x . shape
circleY = y . shape
</code></pre></div></div> Ideally, we’d like the <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code> functions to work on any type that has those attributes. Haskell’s solution for name overloading is the type class. <h1 id="class-it-up">Class it up!</h1> Here’s a new definition of our types: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Shape = Shape
    { shapeX :: Int
    , shapeY :: Int
    }

data Circle = Circle
    { circleShape :: Shape
    , circleRadius :: Int
    }
</code></pre></div></div> Now, we’re prefixing the record fields with the type name. This saves the more general names for the type classes. 
For each field that we want to be polymorphic about, we create a type class and an instance for the classes: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class HasX a where x :: a -> Int instance HasX Shape where x = shapeX instance HasX Circle where x = shapeX . circleShape </code></pre></div></div> Now, we can use the function <code class="language-plaintext highlighter-rouge">x</code> on any type that we instantiate it for. We can even retrofit existing types: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance HasX (Int, Int) where x = fst </code></pre></div></div> This is reminiscent of monkey patching in Ruby, but rather than being a gross hack, it’s an elegant way to extend types with new functionality. <h1 id="modification-of-values">Modification of values</h1> We’ve covered immutable reading of values, but we also want to be able to update values. The idiomatic Java is presented here, with most of the boilerplate: <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class Shape { private int x; private int y; public Shape(int x, int y) { this.x = x; this.y = y; } public int getX() { return this.x; } public void setX(int x) { this.x = x; } } class Circle extends Shape { // pretend I'm not too lazy to // write out the boilerplate } </code></pre></div></div> Now, we want to be able to translate: <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Circle c = new Circle(1, 2, 3); c.setX(4); </code></pre></div></div> into idiomatic Haskell. First, we’ll have to translate the above into using immutable objects, since Haskell’s data types are immutable. 
Let’s see what that looks like: <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class Shape {
    private final int x;
    private final int y;

    public Shape(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public Shape setX(int x) {
        return new Shape(x, this.y);
    }

    // etc...
}
</code></pre></div></div> Rather than modifying the old Shape, we return a new Shape with the field changed. Modifying a Circle is pretty similar. We’ll switch back to using object composition as well: <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class Circle {
    private final Shape shape;
    private final int radius;

    public Circle(int x, int y, int radius) {
        this.shape = new Shape(x, y);
        this.radius = radius;
    }

    public Circle setX(int x) {
        return new Circle(
            x,
            this.shape.getY(),
            this.radius
        );
    }
}
</code></pre></div></div> Alright, now we can directly translate this to Haskell: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class SetX a where
    setX :: Int -> a -> a

instance SetX Shape where
    setX x (Shape _ y) = Shape x y

instance SetX Circle where
    setX x (Circle shape radius) =
        Circle (setX x shape) radius
</code></pre></div></div> Now we can express a modification function, using the <code class="language-plaintext highlighter-rouge">x</code> getter from <code class="language-plaintext highlighter-rouge">HasX</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>modifyX :: (HasX a, SetX a) => (Int -> Int) -> a -> a
modifyX f a = setX (f (x a)) a
</code></pre></div></div> Neat! <h1 id="lenses">Lenses</h1> So, you may notice that the boilerplate involved with the polymorphic accessors and setters is pretty intense. Lenses solve this issue nicely. 
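To make that a bit more concrete before reaching for a library, here’s a minimal sketch of the idea using only <code class="language-plaintext highlighter-rouge">base</code>: a lens packages the getter and the setter into a single value, so one class method replaces the <code class="language-plaintext highlighter-rouge">HasX</code>/<code class="language-plaintext highlighter-rouge">SetX</code> pair. The <code class="language-plaintext highlighter-rouge">Lens</code>, <code class="language-plaintext highlighter-rouge">view</code>, and <code class="language-plaintext highlighter-rouge">over</code> definitions below are hand-rolled stand-ins written for this example (they follow the usual van Laarhoven encoding), not the real library’s source.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

data Shape = Shape { shapeX :: Int, shapeY :: Int }
    deriving (Show, Eq)

data Circle = Circle { circleShape :: Shape, circleRadius :: Int }
    deriving (Show, Eq)

-- Hand-rolled stand-in for a lens type: a getter and a setter in one value.
type Lens s a = forall f. Functor f => (a -> f a) -> s -> f s

-- Read the focused field by running the lens with Const.
view :: Lens s a -> s -> a
view l = getConst . l Const

-- Modify the focused field by running the lens with Identity.
over :: Lens s a -> (a -> a) -> s -> s
over l f = runIdentity . l (Identity . f)

-- One class now covers both reading and updating x.
class HasX s where
    x :: Lens s Int

instance HasX Shape where
    x f (Shape sx sy) = fmap (\sx' -> Shape sx' sy) (f sx)

-- Circle delegates to the Shape it contains.
instance HasX Circle where
    x f (Circle s r) = fmap (\s' -> Circle s' r) (x f s)
```

With these definitions, <code class="language-plaintext highlighter-rouge">over x (+1)</code> plays the role of <code class="language-plaintext highlighter-rouge">modifyX (+1)</code> for both <code class="language-plaintext highlighter-rouge">Shape</code> and <code class="language-plaintext highlighter-rouge">Circle</code>, and <code class="language-plaintext highlighter-rouge">view x</code> plays the role of the plain getter.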
I won’t go into depth on how great lenses are and why learning them has made my life so much better, but I will recommend <a href="https://hackage.haskell.org/package/lens-tutorial-1.0.2/docs/Control-Lens-Tutorial.html">this tutorial</a> and <a href="https://artyom.me/lens-over-tea-1">this lengthy blog series</a>. Classy lenses work especially well for this problem: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Shape = Shape { _shapeX :: Int , _shapeY :: Int } data Circle = Circle { _circleShape :: Shape , _circleRadius :: Int } makeClassy ''Shape makeClassy ''Circle instance HasShape Circle where shape = circleShape </code></pre></div></div> which give us some pretty nice helpers: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>someShape = Shape 1 2 someCircle = Circle someShape 3 moveShape = over x (+1) . over y (+1) </code></pre></div></div> And we can use <code class="language-plaintext highlighter-rouge">moveShape</code> on both a <code class="language-plaintext highlighter-rouge">Shape</code> and a <code class="language-plaintext highlighter-rouge">Circle</code>, and truly, anything that implements the <code class="language-plaintext highlighter-rouge">HasShape</code> type class. <h1 id="summary">Summary</h1> The somewhat-mechanical process that we followed to reach this point was: <ol> <li>Take the fancy OOP features</li> <li>Boil them down to the core features: objects and messages</li> <li>Translate to immutable values</li> <li>(somewhat) mechanically translate to Haskell</li> </ol> I suspect that this process will work fairly well for most of the things we’ll run into on this series. Fri, 17 Feb 2017 00:00:00 +0000 https://www.parsonsmatt.org/2017/02/17/ooph_data_inheritance.html https://www.parsonsmatt.org/2017/02/17/ooph_data_inheritance.html Object Oriented Programming in Haskell <h1 id="introduction">Introduction</h1> Well, here you are. 
You drank the kool aid and learned Haskell. You overcame all the novelties and differences from imperative programming, figured out how to use monoids and monads, and wrote yourself a couple dozen parsers. Now it’s time to implement some Real World Program, and you’re a little lost. <blockquote> Haskell is so different from object oriented programming that you must unlearn all of that knowledge to learn the Functional Paradigm </blockquote> Perhaps you’ve heard this a few times. Well, it’s true – Haskell has different idioms than OOP. It’s also unhelpful. If you’re thinking, “Oh, I know, I’ll use a factory to select the right instance,” then being told that “factories don’t real in Haskell” doesn’t help you. I had a conversation with a coworker where he asked: <blockquote> Does Haskell have inheritance? </blockquote> To which I responded: <blockquote> No. </blockquote> This was deeply unsatisfying to me, because I didn’t get to actually help him accomplish his goal. This blog post series is my attempt to address this and similar questions. <h1 id="haskells-xy-problem">Haskell’s XY Problem</h1> The XY problem (described in much greater detail in <a href="https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem">this Stack Overflow post</a>) is what happens when you ask about how to implement your chosen solution, rather than ask about how to solve your given problem. People that get started building software in Haskell tend to ask XY questions. Here are a few common ones that I’ve seen: <ul> <li>“Does Haskell have inheritance?”</li> <li>“How can I have a heterogeneous list?”</li> <li>“How can I use mocks or spys to make testing easier?”</li> </ul> The answer to all of these is usually “don’t” or “no,” but instead of just saying “no,” I’d like to cover common features in OOP and how Haskell developers accomplish the same goals. 
<h1 id="what-is-oop">What is OOP?</h1> “Object oriented programming” is about as controversial of a term as “functional programming.” My OOP instruction was mostly through reading and watching <a href="https://www.sandimetz.com/">Sandi Metz</a>, <a href="https://www.destroyallsoftware.com/screencasts">Gary Bernhardt</a>, and the surrounding Ruby community, which is heavily inspired by Smalltalk. Sandi Metz’s excellent book <a href="https://www.sandimetz.com/products#product-poodr">Practical Object Oriented Design in Ruby</a> is a fantastic guide to writing good OOP code. The core of object oriented programming is the object. An object has many metaphors: cells in biology, actors, messengers, etc. Objects are: <ol> <li>Data – they hold on to some internal state, which may be hidden</li> <li>Behavior – they respond to messages (or methods) by updating their internal state, issuing side effects, and/or returning a value.</li> </ol> There’s a lot more to most OOP – interfaces, classes, inheritance, traits, modules, mixins, overriding, namespaces, abstract classes, etc. But most of those features can be expressed in terms of simpler ideas. To maximize the applicability, I’ll try to stay as simple in OOP terms as possible, so you can use this information regardless of which OOP language you’re most familiar with. For the up-to-date table of contents, check out my <a href="/tutorials/">tutorials</a> page. Fri, 17 Feb 2017 00:00:00 +0000 https://www.parsonsmatt.org/2017/02/17/object_oriented_programming_in_haskell.html https://www.parsonsmatt.org/2017/02/17/object_oriented_programming_in_haskell.html How do type classes differ from interfaces? Haskell type classes are a tricky concept for many Haskell beginners to learn. Most languages cannot express them at all, and they don’t have a concept that comes close. For many object oriented languages, the <code class="language-plaintext highlighter-rouge">Interface</code> is the closest language construct available. 
Ruby <code class="language-plaintext highlighter-rouge">module</code>s fill a similar niche. However, while these concepts both address name overloading and a kind of polymorphism, they miss some of the power that type classes provide. This post is intended for people curious about type classes. It doesn’t assume any knowledge of Haskell or functional programming. Familiarity with a statically typed language like Java or C# will help. <h1 id="type-class-introductionrecap">Type Class Introduction/Recap</h1> If you know what a type class is, feel free to skip to the next header. To recap, a Haskell type class is defined like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- [1] [2] [3] class (Eq a) => Ord a where -- [4] compare :: a -> a -> Ordering data Ordering = EQ | GT | LT </code></pre></div></div> <ol> <li>This is the “super class” constraint. To make a type an instance of <code class="language-plaintext highlighter-rouge">Ord</code>, it must be an instance of <code class="language-plaintext highlighter-rouge">Eq</code>.</li> <li>This is the class name.</li> <li>This is the type that we are parameterizing the class over.</li> <li>This is the function type definition (or list of such) we need to define for <code class="language-plaintext highlighter-rouge">a</code> in order to make <code class="language-plaintext highlighter-rouge">a</code> an instance of <code class="language-plaintext highlighter-rouge">Ord</code>.</li> </ol> We can read the above code snippet as: <blockquote> Declare a class <code class="language-plaintext highlighter-rouge">Ord</code> that is parameterized on some type <code class="language-plaintext highlighter-rouge">a</code> where <code class="language-plaintext highlighter-rouge">a</code> has an <code class="language-plaintext highlighter-rouge">Eq</code> type class instance. 
To make the type <code class="language-plaintext highlighter-rouge">a</code> an instance of <code class="language-plaintext highlighter-rouge">Ord</code>, you must define the <code class="language-plaintext highlighter-rouge">compare</code> function. This function takes two values of the type <code class="language-plaintext highlighter-rouge">a</code> and returns a value with the type <code class="language-plaintext highlighter-rouge">Ordering</code>. </blockquote> First, we’ll create a toy datatype <code class="language-plaintext highlighter-rouge">ToyOrd</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data ToyOrd = Smol | Large Int </code></pre></div></div> This data type has two constructors: <code class="language-plaintext highlighter-rouge">Smol</code>, which has no fields, and <code class="language-plaintext highlighter-rouge">Large</code>, which has a single <code class="language-plaintext highlighter-rouge">Int</code> field. 
In order to make it an instance of <code class="language-plaintext highlighter-rouge">Ord</code>, we first have to make it an instance of <code class="language-plaintext highlighter-rouge">Eq</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- [1]   [2] [3]
instance Eq  ToyOrd where
-- [4]
    Smol == Smol = True
    Large x == Large y = x == y
    _ == _ = False
</code></pre></div></div> <ol> <li>We start making an instance with the <code class="language-plaintext highlighter-rouge">instance</code> keyword.</li> <li>This is the class name we’re making an instance of.</li> <li><code class="language-plaintext highlighter-rouge">ToyOrd</code> is the type that we’re making an <code class="language-plaintext highlighter-rouge">Eq</code> instance for.</li> <li>The <code class="language-plaintext highlighter-rouge">Eq</code> class defines two functions, <code class="language-plaintext highlighter-rouge">(==) :: a -> a -> Bool</code> and <code class="language-plaintext highlighter-rouge">(/=) :: a -> a -> Bool</code>. Since <code class="language-plaintext highlighter-rouge">(/=)</code> has a default implementation, we only need to implement <code class="language-plaintext highlighter-rouge">(==)</code>.</li> </ol> Now that we’ve defined an <code class="language-plaintext highlighter-rouge">Eq</code> instance for our <code class="language-plaintext highlighter-rouge">ToyOrd</code> data type, we can define <code class="language-plaintext highlighter-rouge">Ord</code>. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Ord ToyOrd where compare Smol Smol = -- [1] EQ compare Smol (Large _) = -- [2] LT compare (Large x) (Large y) = -- [3] compare x y compare (Large _) Smol = -- [4] GT </code></pre></div></div> <ol> <li>Two <code class="language-plaintext highlighter-rouge">Smol</code> values are equal.</li> <li>A <code class="language-plaintext highlighter-rouge">Smol</code> value is always less than a <code class="language-plaintext highlighter-rouge">Large</code>.</li> <li>Two <code class="language-plaintext highlighter-rouge">Large</code> values are compared by their <code class="language-plaintext highlighter-rouge">Int</code> values.</li> <li>A <code class="language-plaintext highlighter-rouge">Large</code> is always greater than a <code class="language-plaintext highlighter-rouge">Smol</code>.</li> </ol> Once you have a type class, you can write functions which expect an instance of that type class as an input. Let’s define the ordering operators: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(<=) :: Ord a => a -> a -> Bool a1 <= a2 = case compare a1 a2 of LT -> True EQ -> True GT -> False (>) :: Ord a => a -> a -> Bool a1 > a2 = not (a1 <= a2) </code></pre></div></div> We specify that this function works for all types <code class="language-plaintext highlighter-rouge">a</code> provided that these types are an instance of the <code class="language-plaintext highlighter-rouge">Ord</code> type class. <h1 id="similarity-with-interfaces">Similarity With Interfaces</h1> Java interfaces allow you to specify a set of methods that an object supports, and, as of Java 8, default implementations for these methods. So we can write an interface that does essentially the same thing as <code class="language-plaintext highlighter-rouge">Ord</code>. 
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>public interface Eq {
    default public boolean equalTo(Eq a) {
        return ! this.notEqualTo(a);
    }

    default public boolean notEqualTo(Eq a) {
        return ! this.equalTo(a);
    }
}
</code></pre></div></div> This is the <code class="language-plaintext highlighter-rouge">Eq</code> interface in Java. We’re taking advantage of those default implementations. Each default is defined in terms of the other, so if we don’t override at least one of them, we’ll loop infinitely when we try to call either method. <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>public interface Ord extends Eq {
    public Ordering compare(Ord other);
}

public enum Ordering {
    LT, EQ, GT
}
</code></pre></div></div> We can also write generic methods in terms of these interfaces. <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class OrdUtil {
    boolean lessThanOrEqual(Ord a1, Ord a2) {
        Ordering result = a1.compare(a2);
        return result == Ordering.LT || result == Ordering.EQ;
    }
}
</code></pre></div></div> On the surface, these look similar. However, there are a number of important differences! <h1 id="differences">Differences!</h1> <h3 id="differing-types">Differing types!</h3> The type signature for the Haskell <code class="language-plaintext highlighter-rouge">compare</code> function is very specific about the types of its arguments. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>compare :: Ord a => a -> a -> Ordering
</code></pre></div></div> This type signature says: <blockquote> The caller of this function can pick any type <code class="language-plaintext highlighter-rouge">a</code> that is an instance of <code class="language-plaintext highlighter-rouge">Ord</code>. I will return an <code class="language-plaintext highlighter-rouge">Ordering</code>. 
</blockquote> Note that the type signature requires that both parameters to <code class="language-plaintext highlighter-rouge">compare</code> have the same type! It is illegal to write <code class="language-plaintext highlighter-rouge">compare Smol 10</code>. The Java version allows any two objects to be passed, provided they implement the <code class="language-plaintext highlighter-rouge">Ord</code> interface. The Java equivalent looks more like: <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>public class OrdUtil {
    static <A extends Ord> boolean lessThanOrEqual(A a1, A a2) {
        Ordering result = a1.compare(a2);
        return result == Ordering.LT || result == Ordering.EQ;
    }
}
</code></pre></div></div> This method signature introduces a generic type variable <code class="language-plaintext highlighter-rouge">A</code>, and says that <code class="language-plaintext highlighter-rouge">A</code> must extend/implement the <code class="language-plaintext highlighter-rouge">Ord</code> interface. The method then takes two parameters, both of which have the same generic <code class="language-plaintext highlighter-rouge">A</code> type. <h3 id="separation-of-implementation">Separation of Implementation</h3> Java classes are defined in one place. Any interface a class implements must be declared on that class. Java doesn’t handle sum types very well, so we’ll just do <code class="language-plaintext highlighter-rouge">Large</code> from our <code class="language-plaintext highlighter-rouge">ToyOrd</code> type above. 
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class Large implements Eq, Ord {
    public final int size;

    public Large(int size) {
        this.size = size;
    }

    public boolean equalTo(Eq other) {
        if (other instanceof Large) {
            Large other1 = (Large) other;
            return other1.size == this.size;
        }
        return false;
    }

    // Result describes how this object orders relative to other.
    public Ordering compare(Ord other) {
        if (other instanceof Large) {
            Large other1 = (Large) other;
            if (this.size < other1.size) {
                return Ordering.LT;
            }
            if (this.size == other1.size) {
                return Ordering.EQ;
            }
            return Ordering.GT;
        }
        throw new RuntimeException("what does this even mean");
    }
}
</code></pre></div></div> We’ve defined <code class="language-plaintext highlighter-rouge">compare</code> and <code class="language-plaintext highlighter-rouge">equalTo</code>. Note that we have to do <code class="language-plaintext highlighter-rouge">instanceof</code> and type casting in order to properly implement these methods. What does it even mean to try and compare two objects of arbitrary type? Suppose that we’ve imported <code class="language-plaintext highlighter-rouge">Large</code> from some upstream package, and we’ve defined our own interface. <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>interface SomeOtherPackage {
    public boolean doSomething(int lol);
}
</code></pre></div></div> We are completely incapable of making <code class="language-plaintext highlighter-rouge">Large</code> implement our <code class="language-plaintext highlighter-rouge">SomeOtherPackage</code> interface! Instead, we must wrap the class with a new class that we control, which implements the interface and otherwise delegates to <code class="language-plaintext highlighter-rouge">Large</code>. 
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>public class MyLarge implements SomeOtherPackage {
    public final Large large;

    public MyLarge(Large large) {
        this.large = large;
    }

    public boolean doSomething(int lol) {
        System.out.println("wut");
        return true;
    }
}
</code></pre></div></div> Type classes separate the definition of data types and the instances of a class. So, supposing that I imported the <code class="language-plaintext highlighter-rouge">ToyOrd</code> from another package, I can easily do: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import JokesAreFun (ToyOrd(..))

class MyNewClass a where
    doSomething :: a -> Int -> IO ()

instance MyNewClass ToyOrd where
    doSomething Smol x =
        putStrLn "hahahaa yess"
    doSomething (Large x) y =
        putStrLn ("numbers! " ++ show (x + y))
</code></pre></div></div> <h1 id="return-type-polymorphism">Return Type Polymorphism</h1> Here’s one of the bigger and more amazing things that type classes allow you to do. We call it return type polymorphism. And it’s kind of obscene. Let’s define a Haskell type class for actions which can fail. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>--            [1]
class CanFail failable where
--  [2]     [3]
    oops :: failable a
--  [4]     [5]           [6]           [7]
    pick :: failable a -> failable a -> failable a
--  [8]
    win :: a -> failable a
</code></pre></div></div> <ol> <li><code class="language-plaintext highlighter-rouge">failable</code> is the type variable name that we’re using for the class.</li> <li><code class="language-plaintext highlighter-rouge">oops</code> is a value representing a failed computation.</li> <li>We apply the class variable <code class="language-plaintext highlighter-rouge">failable</code> to the type variable <code class="language-plaintext highlighter-rouge">a</code>. 
So <code class="language-plaintext highlighter-rouge">failable</code> must take a generic type parameter.</li> <li><code class="language-plaintext highlighter-rouge">pick</code> is a function which looks at the two parameters.</li> <li>If the first parameter is not a failure, then we accept it.</li> <li>Otherwise, we return the 2nd parameter.</li> <li>So the return value is going to allow us to choose a successful value from two possibilities, or fail entirely.</li> <li>Finally, we give a way to succeed, but only if we take an <code class="language-plaintext highlighter-rouge">a</code> as a parameter.</li> </ol> We can easily make an instance for the <code class="language-plaintext highlighter-rouge">Maybe</code> type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- [1] [2] data Maybe a -- [3] = Just a -- [4] | Nothing </code></pre></div></div> <ol> <li><code class="language-plaintext highlighter-rouge">Maybe</code> is the name of the type we are declaring here.</li> <li>It takes a single generic type variable, which we introduce and name as <code class="language-plaintext highlighter-rouge">a</code>.</li> <li>It has two constructors. The first is <code class="language-plaintext highlighter-rouge">Just</code>, which takes a single parameter of the generic type <code class="language-plaintext highlighter-rouge">a</code>.</li> <li>The second constructor <code class="language-plaintext highlighter-rouge">Nothing</code> does not take any parameters.</li> </ol> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- without annotations, data Maybe a = Just a | Nothing notAnInt :: Maybe Int notAnInt = Nothing hasAnInt :: Maybe Int hasAnInt = Just 5 </code></pre></div></div> Now, let’s write our <code class="language-plaintext highlighter-rouge">CanFail</code> instance! 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance CanFail Maybe where oops = Nothing pick (Just a) _ = Just a pick Nothing (Just a) = Just a pick Nothing Nothing = oops win a = Just a </code></pre></div></div> Now, we can write some functions in terms of <code class="language-plaintext highlighter-rouge">CanFail</code>. We can write a division function that is safe against division by zero: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>safeDivision :: CanFail failable => Double -> Double -> failable Double safeDivision x y = if y == 0 then oops else win (x / y) </code></pre></div></div> This function signature is doing something very interesting here! Let’s translate it to plain English: <blockquote> <code class="language-plaintext highlighter-rouge">safeDivision</code> is a function which accepts two arguments of type <code class="language-plaintext highlighter-rouge">Double</code>, and returns a value having the type <code class="language-plaintext highlighter-rouge">failable Double</code> where <code class="language-plaintext highlighter-rouge">failable</code> is a generic type variable that the caller may pick, as long as that type has an instance of <code class="language-plaintext highlighter-rouge">CanFail</code>. </blockquote> Woah! The caller gets to pick the type? That means I can write code like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>someMathFunction :: Double -> Double -> Double someMathFunction x y = let result = safeDivision x y in case result of Just number -> number * 3 Nothing -> 0 </code></pre></div></div> As the caller of <code class="language-plaintext highlighter-rouge">safeDivision</code> in this function, I am able to select the <code class="language-plaintext highlighter-rouge">Maybe</code> type. What if there are other instances? 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data MaybeError a = Error String | Result a instance CanFail MaybeError where oops = Error "oops" pick (Result a) _ = Result a pick _ (Result a) = Result a pick ohNooo _ = ohNooo win a = Result a </code></pre></div></div> Now I can also select the <code class="language-plaintext highlighter-rouge">MaybeError</code> type! If I want to, I can also write an instance for <code class="language-plaintext highlighter-rouge">IO</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- simplified. Runs an action and catches an exception. try :: IO a -> IO (MaybeError a) instance CanFail IO where oops = throwException "oops" pick first second = do eResult <- try first case eResult of Result a -> return a Error exception -> second win a = return a </code></pre></div></div> Now, we can use our <code class="language-plaintext highlighter-rouge">safeDivision</code> function in <code class="language-plaintext highlighter-rouge">IO</code>, just as if it were <code class="language-plaintext highlighter-rouge">print</code> or similar! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main :: IO () main = do putStrLn "Woah, look at us!" :: IO () x <- safeDivision 3 2 :: IO Double putStrLn "the result was: " print x y <- safeDivision 3 0 print "This never happens because we threw an exception!" </code></pre></div></div> Return type polymorphism is super cool, and definitely one of the best things about type classes. It’s also one of the things that really sets it apart from interfaces or modules in other languages. 
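To see the caller picking the instance in one place, here is a minimal sketch that gathers the <code class="language-plaintext highlighter-rouge">CanFail</code> pieces from this post into a single module. The names <code class="language-plaintext highlighter-rouge">viaMaybe</code>, <code class="language-plaintext highlighter-rouge">viaMaybeError</code>, and <code class="language-plaintext highlighter-rouge">fallback</code> are made up for illustration, and the <code class="language-plaintext highlighter-rouge">deriving</code> clause is an addition so the results can be compared and printed.

```haskell
-- The class and instances from the post, collected so they compile together.
class CanFail failable where
  oops :: failable a
  pick :: failable a -> failable a -> failable a
  win  :: a -> failable a

instance CanFail Maybe where
  oops = Nothing
  pick (Just a) _        = Just a
  pick Nothing  (Just a) = Just a
  pick Nothing  Nothing  = oops
  win a = Just a

-- deriving (Eq, Show) is an addition for illustration only.
data MaybeError a = Error String | Result a
  deriving (Eq, Show)

instance CanFail MaybeError where
  oops = Error "oops"
  pick (Result a) _ = Result a
  pick _ (Result a) = Result a
  pick ohNooo _     = ohNooo
  win a = Result a

safeDivision :: CanFail failable => Double -> Double -> failable Double
safeDivision x y = if y == 0 then oops else win (x / y)

-- Hypothetical callers: the call is identical, only the annotation differs,
-- and the annotation is what selects the instance.
viaMaybe :: Maybe Double
viaMaybe = safeDivision 3 0

viaMaybeError :: MaybeError Double
viaMaybeError = safeDivision 3 0

-- pick falls back to the second computation when the first one failed.
fallback :: MaybeError Double
fallback = pick (safeDivision 1 0) (safeDivision 6 3)
```

Nothing in <code class="language-plaintext highlighter-rouge">safeDivision</code> mentions <code class="language-plaintext highlighter-rouge">Maybe</code> or <code class="language-plaintext highlighter-rouge">MaybeError</code>; the type signature at the use site is the only thing that changes.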
Sat, 07 Jan 2017 00:00:00 +0000 https://www.parsonsmatt.org/2017/01/07/how_do_type_classes_differ_from_interfaces.html https://www.parsonsmatt.org/2017/01/07/how_do_type_classes_differ_from_interfaces.html 2016 Retrospective <a href="/2015/12/31/2015_retrospective.html">Last year</a>, I wrote: <blockquote> 2015 has been a hell of a year for me. </blockquote> If only I knew how much more so 2016 would be! <h1 id="the-longest-year">The Longest Year</h1> Subjectively, this has felt like the longest year in my entire life. Technically, my last CD <a href="https://alastyranny.bandcamp.com/album/the-cycle-of-the-void-2">The Cycle of the Void</a> is the crowning moment of my 2015, but I somehow forgot to mention it in the last post, so I figure it makes a good thing to start with. <h2 id="the-search-for-work">The Search for Work</h2> The year began with a flood of job interviews. I’d cast my net far and wide, and thanks to my intense self-study of Haskell and blogging, I had a decent amount of interest. The previously abstract concern of “get a real job, leave Athens” became starkly concrete as I evaluated job offers, compared cost of living, and had to deal with the very real effects of potentially moving. A move to Atlanta, New York City, or San Francisco seemed like an impending certainty. I ended up choosing the offer from Seller Labs, a company based in Athens. It’s hard to beat a remote-friendly Haskell job, even if you have to do some PHP. <h2 id="compose-conf">Compose Conf</h2> In February, I went to New York for <a href="https://www.composeconference.org/">Compose Conf</a>. I had a tremendous amount of fun, learned a ton, and got to meet a bunch of great folks. I highly recommend the experience. <h2 id="escape-from-academic-fortress">Escape from Academic Fortress</h2> I entered the spring semester burnt out from the previous semester. 
I had 17 hours of classes: computer networks, distributed systems, intro to numeric/scientific computation, intro to women’s studies, and an independent study on category theory/type theory/logic. Fortunately, since I’d already signed a job offer, my inability to care about grades didn’t end up harming me. I got to focus on the far more interesting task of actually learning things. The highlight of this semester was certainly my independent study. I got to study lambda calculus, category theory, formal logic, the Curry-Howard correspondence, and applications of the above to distributed systems. I was able to make the connection that the type signatures for the Cloud Haskell primitives have categorical interpretations that can be shown to follow along with a modal logic designed specifically for distributed systems. If you’re interested, you can <a href="https://github.com/parsonsmatt/modalities/blob/master/paper/paper.md.pdf">read the writeup</a>. Midway through the semester, I wrote a spreadsheet and <a href="https://github.com/parsonsmatt/shouldistudy">a stupid little PureScript app</a> to determine how much work I needed to do in order to graduate. This dramatically reduced the academic stress. Unfortunately, my prior attempt at college had left me with a poor GPA, and even straight As in the last semester wouldn’t be enough to get my GPA over the 3.0 mark. <h2 id="sweet-relief">Sweet Relief</h2> May came, and I was finally done. I rested and relaxed for a few weeks before going on a two week trip to Colorado for LambdaConf. Every time I go to that state, I love it even more. The open air, the lack of humidity, the mountains, the potential for snow, the bike-friendliness are just wonderful. I greatly enjoyed exploring Denver and Boulder on my own. The people I met and things I learned at LambdaConf were more than worth the price of entry. When I returned, I had a few days to recover before starting the full time job with Seller Labs. 
<h2 id="professional-software-development">Professional Software Development</h2> On June 1st, I started my first official full time non-intern software engineering role. My task was to port some failing/inefficient workers from PHP to Haskell. I spent a ton of time learning how PHP works, reading the existing code, and writing Haskell to do the same job. As of now, Haskell is the beating heart of Seller Labs’ most successful product, managing a 2x increase in throughput with 1/4 the resources. <h2 id="a-second-denver-trip">A Second Denver Trip</h2> In October, I made a two week trip to Denver to explore the city and scope out where I’d want to live. I went with my then-partner and housemate. My housemate ended up moving out there in early December, being too enthused to wait any longer. I had another excellent time visiting the city and working remotely. I got to explore more coffee bars, parks, and community this time around. Denver has captivated me, and I can’t wait to move there early next year. <h2 id="finally-cello">Finally, cello</h2> I’ve wanted to learn to play cello for over half my life. My parents could never have afforded it when I was a kid, and I didn’t have the financial resources myself until late this year. I’d played around with my housemate’s electric cello and tried to teach myself, but made very little progress. I finally decided to rent a real cello and get lessons, and it’s been an awesome experience thus far. I really appreciate the feeling of being a beginner. You suck at something, and it’s hard to get motivated when you suck. But progress is measured in hours: you practice for an hour and notice immediate improvement. You’re noticeably better after a week. This novice progression is addictive and fun, once you get over the complete incompetency associated with being a beginner. Being a competent incompetent is one of my greatest strengths. I hope I never become allergic to learning new things. 
<h1 id="everything-has-a-price">Everything has a Price</h1> Last year, I wrote: <blockquote> Many of my relationships faltered this year, and some failed entirely. </blockquote> Unfortunately, this trend continued this year. In May, I returned to Athens from Boulder, and a partner decided to end our relationship after two months of avoiding me. In November, I returned to Athens from Denver, and my partner of 4 years and I finally gave up on our relationship. I worked for, earned, and bought success. I learned everything I could about good software development techniques and practices. I demonstrated my skill publicly and to employers. I picked up tens of thousands of dollars in student loans. Most importantly, I devoted nearly all of my time and effort to this goal. Relationships require time, care, and effort. Without these things, they die. I have no roots in this city. When I shared my fears and anticipation of leaving, what little that held me here recoiled and retreated. The city that left me numerous times has left me yet again; except this time, it’s still here. Despite being surrounded by familiarity, I’m isolated. The city is building walls to keep me out, to let me know I’m no longer welcome. I found a document I wrote in early 2014, before I started this whole crazy software development thing. All of my estimations were so conservative. Fear and anxiety were soothed by meticulous planning; a kind of reassurance from the future that I’d be OK. I succeeded far more than I’d anticipated. I paid a far greater price than I knew. Fri, 30 Dec 2016 00:00:00 +0000 https://www.parsonsmatt.org/2016/12/30/2016_retrospective.html https://www.parsonsmatt.org/2016/12/30/2016_retrospective.html Servant in Yesod - Yo Dawg If you’re a web programmer, Haskell has a lot of neat toys. 
<a href="https://haskell-servant.readthedocs.io/en/stable/">Servant</a> does a fantastic job of describing RESTful APIs, and the ability to generate <a href="https://haskell-servant.readthedocs.io/en/stable/tutorial/Javascript.html">JavaScript clients</a>, <a href="https://haskell-servant.readthedocs.io/en/stable/tutorial/Client.html">Haskell clients</a>, and <a href="https://haskell-servant.github.io/posts/2016-02-06-servant-swagger.html">Swagger documentation and UIs</a> makes it a compelling choice for implementing your API. <a href="https://www.yesodweb.com/">Yesod</a> makes a similarly compelling choice for full blown websites, with lots of documentation, HTML templates, solid routing and type safe links, and convenient database modeling. Where Servant excels for RESTful APIs, Yesod excels for websites. If you’re writing a web app, you may be wondering: how do I choose? An API may end up needing to render some pages, and a website may need to expose JSON endpoints. Fortunately, we can easily have both! In this blog post, I’ll demonstrate how to mount my <a href="https://github.com/parsonsmatt/servant-persistent">servant-persistent</a> starter project inside of a newly minted Yesod application. tl;dr: Both Servant and Yesod expose functions to convert them to <a href="https://hackage.haskell.org/package/wai">WAI</a> applications, and both have means of running arbitrary <code class="language-plaintext highlighter-rouge">WAI</code> applications. If you’re too impatient to read the walkthrough, the complete repository is <a href="https://github.com/parsonsmatt/yo-dawg">on GitHub</a>. <h1 id="servant-persistent"><code class="language-plaintext highlighter-rouge">servant-persistent</code></h1> Since we’ll be using this package as our API, you may want to read a bit about it. I wrote a post describing the project and how it’s used <a href="https://www.parsonsmatt.org/2016/07/08/servant-persistent_updated.html">here</a>. 
<h1 id="start-your-yesods">Start your Yesods</h1> Yesod has a feature called <a href="https://www.yesodweb.com/book/creating-a-subsite">subsites</a>. This allows you to write a modular little website, and then put it inside of a larger site. A lesser known feature is the <code class="language-plaintext highlighter-rouge">WaiSubsite</code> which allows you to embed an arbitrary <code class="language-plaintext highlighter-rouge">WAI</code> <code class="language-plaintext highlighter-rouge">Application</code>. We’ll use this feature to embed the Servant app. Start up a new Yesod project using <code class="language-plaintext highlighter-rouge">stack</code>, like so: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stack new yo-dawg yesod-postgres --resolver=lts-6.27 </code></pre></div></div> Next up, we’ll add <code class="language-plaintext highlighter-rouge">servant-persistent</code> as another package in the <code class="language-plaintext highlighter-rouge">stack.yaml</code> so that we can use it: <div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code># in stack.yaml packages: - '.' - location: git: git@github.com:parsonsmatt/servant-persistent commit: 98479a423609794ffa9b668b0ae13ae9a57be18e extra-dep: true </code></pre></div></div> And we’ll need to add <code class="language-plaintext highlighter-rouge">servant-persistent</code> as a dependency of our project: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- in yo-dawg.cabal build-depends: -- ....... , servant-persistent </code></pre></div></div> That should be all the setup we need to do in order to start using the stuff. <h1 id="who-controls-the-database">Who controls the database?</h1> Yesod comes with a bunch of stuff for models. But <code class="language-plaintext highlighter-rouge">servant-persistent</code> already has database stuff and models defined. 
How should we handle this? At the day job, I factored the models out into their own package. That’s an option that has worked well, though it’s a little more labor intensive. You could allow the API and the website to have separate models, though that seems like a lot of duplication and shared concerns. What we’ll do, for simplicity, is rely on the models present in the <code class="language-plaintext highlighter-rouge">servant-persistent</code> app and delete the model code out of the Yesod repository. In order to get this running, we’ll need to delete the stuff relating to <code class="language-plaintext highlighter-rouge">Comment</code>s, as they’re not present in the <code class="language-plaintext highlighter-rouge">servant-persistent</code> models. We’ll also need to delete the authentication code and <code class="language-plaintext highlighter-rouge">userIdent</code> references. Thankfully, GitHub makes these changes easy to see. <a href="https://github.com/parsonsmatt/yo-dawg/commit/03ecf35fcc7322f4aeddc1b145195bcfe791c6a7">Here’s a commit link</a> that shows the changes to the base template necessary. <h1 id="adding-the-route">Adding the Route</h1> Our next task is to put the API somewhere. <code class="language-plaintext highlighter-rouge">api</code> is a sensible place to put it, so let’s add that to the routes file. 
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- in config/routes /api ServantPersistentR WaiSubsite getServantPersistent {- [1] [2] [3] [4] -} </code></pre></div></div> The line has four components: <ol> <li>The route that we’ll mount the API on</li> <li>The data constructor to generate to route things to the API</li> <li>The foundation of the subsite</li> <li>The name of the function that will actually return the subsite.</li> </ol> When this code is added, we’ll get an error in <code class="language-plaintext highlighter-rouge">Foundation.hs</code>, since it’ll be trying to refer to <code class="language-plaintext highlighter-rouge">getServantPersistent</code>, which isn’t defined. The expected type of <code class="language-plaintext highlighter-rouge">getServantPersistent</code> is going to be a function that takes our <code class="language-plaintext highlighter-rouge">App</code> type in the <code class="language-plaintext highlighter-rouge">Foundation</code> and returns a <code class="language-plaintext highlighter-rouge">WaiSubsite</code>. So we’ll add the bare minimum to shut GHC up: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- in Foundation.hs getServantPersistent :: App -> WaiSubsite getServantPersistent = error "later" </code></pre></div></div> Now, compilation succeeds. The next step is to use the <code class="language-plaintext highlighter-rouge">WaiSubsite</code> constructor, which has the type <code class="language-plaintext highlighter-rouge">Application -> WaiSubsite</code>, where <code class="language-plaintext highlighter-rouge">Application</code> is a WAI application. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- in Foundation.hs getServantPersistent :: App -> WaiSubsite getServantPersistent = WaiSubsite . 
error "later" </code></pre></div></div> Now, we’re left with the question: Given our <code class="language-plaintext highlighter-rouge">App</code>, how do we get the <code class="language-plaintext highlighter-rouge">servant-persistent</code> <code class="language-plaintext highlighter-rouge">Application</code> out of it? <h1 id="initializing-the-api">Initializing the API</h1> <code class="language-plaintext highlighter-rouge">servant-persistent</code> has a <code class="language-plaintext highlighter-rouge">Config</code> type that is in many ways similar to the <code class="language-plaintext highlighter-rouge">App</code> type in Yesod. It contains all of the Stuff you need in order to get the API up and running, including the settings, database pool, etc. Fortunately, the package also exposes a function <code class="language-plaintext highlighter-rouge">app :: Config -> Application</code>. So all we need to do is get our hands on the <code class="language-plaintext highlighter-rouge">Config</code> data type, call that function, and we’re set. Fortunately, in this case, the <code class="language-plaintext highlighter-rouge">Config</code> is pretty simple: just an <code class="language-plaintext highlighter-rouge">Environment</code> and a <code class="language-plaintext highlighter-rouge">ConnectionPool</code>. Yesod prefers to handle <code class="language-plaintext highlighter-rouge">Environment</code> by different (and better) means, so we’ll just pass <code class="language-plaintext highlighter-rouge">Production</code> in. The <code class="language-plaintext highlighter-rouge">ConnectionPool</code> is created in <code class="language-plaintext highlighter-rouge">Application.hs</code> function <code class="language-plaintext highlighter-rouge">makeFoundation</code>. 
Since we don’t want to re-make the API Application on every request, we’ll just go ahead and add the <code class="language-plaintext highlighter-rouge">Application</code> to the Yesod <code class="language-plaintext highlighter-rouge">App</code> data type. In <code class="language-plaintext highlighter-rouge">Foundation.hs</code>, we’ll make the following changes: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data App = App { appSettings :: AppSettings , appStatic :: Static -- ^ Settings for static file serving. , appConnPool :: ConnectionPool -- ^ Database connection pool. , appHttpManager :: Manager , appLogger :: Logger -- vvv New! :D vvv , appSubApi :: Application -- ^^^ New! :D ^^^ } getServantPersistent :: App -> WaiSubsite getServantPersistent = WaiSubsite . appSubApi </code></pre></div></div> And in <code class="language-plaintext highlighter-rouge">Application.hs</code>, we’ll need to initialize the API when we get the connection pool. Yesod does a bit of a hack by default here, so we’ll respond in kind with a hack. In the function <code class="language-plaintext highlighter-rouge">makeFoundation</code>, we’ll modify the <code class="language-plaintext highlighter-rouge">mkFoundation</code> function defined in the <code class="language-plaintext highlighter-rouge">let</code> like so: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Application.hs import qualified Api as ServantPersistent import Config (Config(..), Environment(Production)) -- down to makeFoundation ... let mkFoundation appConnPool = let apiCfg = Config appConnPool Production appSubApi = ServantPersistent.app apiCfg in App {..} -- ... </code></pre></div></div> This ties everything together, and you’ll be serving your Servant API out of a Yesod subsite. Neat! 
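The reason this embedding is possible at all is that a WAI application is just a function, so one application can hand a request to another. The sketch below is a toy model of that idea using only <code class="language-plaintext highlighter-rouge">base</code>. It deliberately does not use the real WAI or Yesod types: <code class="language-plaintext highlighter-rouge">Path</code>, <code class="language-plaintext highlighter-rouge">Response</code>, this <code class="language-plaintext highlighter-rouge">Application</code> synonym, <code class="language-plaintext highlighter-rouge">mountAt</code>, and the two toy apps are all hypothetical stand-ins, and stripping the matched prefix before delegating is an assumption of the model rather than a statement about how <code class="language-plaintext highlighter-rouge">WaiSubsite</code> behaves.

```haskell
import Data.List (intercalate)

-- Simplified stand-ins, NOT the real WAI types. In the wai package an
-- Application takes a Request and a response callback instead.
type Path        = [String]
type Response    = String
type Application = Path -> IO Response

-- Hypothetical helper: send requests whose first path segment matches the
-- prefix to the sub-application, and everything else to the main site.
-- The matched prefix is dropped before delegating (an assumption of this model).
mountAt :: String -> Application -> Application -> Application
mountAt prefix sub site path =
  case path of
    (segment : rest) | segment == prefix -> sub rest
    _                                    -> site path

-- Toy application playing the role of the Servant API.
apiApp :: Application
apiApp path = pure ("api: " ++ intercalate "/" path)

-- Toy application playing the role of the Yesod site.
siteApp :: Application
siteApp path = pure ("site: " ++ intercalate "/" path)

-- The combined application, analogous to serving the API under /api.
combined :: Application
combined = mountAt "api" apiApp siteApp
```

In the real project the routes file entry and <code class="language-plaintext highlighter-rouge">getServantPersistent</code> do this wiring for you; the model only shows why a plain function type makes it cheap.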
Sun, 18 Dec 2016 00:00:00 +0000 https://www.parsonsmatt.org/2016/12/18/servant_in_yesod_-_yo_dawg.html https://www.parsonsmatt.org/2016/12/18/servant_in_yesod_-_yo_dawg.html Clean Alternatives with MaybeT Haskell’s abstraction facilities are awesome. <code class="language-plaintext highlighter-rouge">Functor</code>, <code class="language-plaintext highlighter-rouge">Applicative</code>, and <code class="language-plaintext highlighter-rouge">Monad</code> are all great, and <code class="language-plaintext highlighter-rouge">Maybe</code> is a pretty fantastic example of each. Lifting functions over optional values, combining optional values, and sequencing the possibility of <code class="language-plaintext highlighter-rouge">Nothing</code>ness are pretty powerful tools for cleaning up code. The first time I refactored some <code class="language-plaintext highlighter-rouge">Maybe</code> infested code like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>someFunc :: Int -> Maybe String someFunc i = case foo i of Nothing -> Nothing Just a -> case bar a of Nothing -> Nothing Just b -> Just (show b) </code></pre></div></div> into the elegant: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>someFunc i = do a <- foo i b <- bar a pure (show b) </code></pre></div></div> I knew I was totally hooked. The <code class="language-plaintext highlighter-rouge">Monad</code> instance for <code class="language-plaintext highlighter-rouge">Maybe</code> covers a common case: given some sequence of functions which may fail, we want to try them all and if any of them fail then we’ll short circuit it all. However, that’s not the only case. Very often, you’ll want to take the first thing that succeeds, rather than failing unless everything works. 
Something like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>someOtherFunc :: Int -> Maybe String someOtherFunc i = do case foo i of Just a -> Just a Nothing -> case bar i of Just b -> Just b Nothing -> wat i </code></pre></div></div> One of Haskell’s lesser known type classes is <code class="language-plaintext highlighter-rouge">Alternative</code>, which is precisely the abstraction we want here! <h1 id="alternative">Alternative</h1> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class Applicative f => Alternative f where empty :: f a (<|>) :: f a -> f a -> f a </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">Alternative</code> class gives us <code class="language-plaintext highlighter-rouge">empty</code>, which is an “empty” value, and <code class="language-plaintext highlighter-rouge"><|></code>, which allows us to define a way to choose between two values. The <a href="https://hackage.haskell.org/package/base-4.9.0.0/docs/Control-Applicative.html#t:Alternative">documentation</a> tells us that <code class="language-plaintext highlighter-rouge">empty</code> should be an identity for <code class="language-plaintext highlighter-rouge"><|></code>, and that <code class="language-plaintext highlighter-rouge"><|></code> is a binary associative operator (huh, sounds like a monoid, right?) Maybe has a nice <code class="language-plaintext highlighter-rouge">Alternative</code> instance that looks like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Alternative Maybe where empty = Nothing Just a <|> _ = Just a Nothing <|> b = b </code></pre></div></div> Does this make sense? 
Well, if we have some <code class="language-plaintext highlighter-rouge">Just 10</code>, and we choose between <code class="language-plaintext highlighter-rouge">Nothing <|> Just 10</code>, then we’ll pick <code class="language-plaintext highlighter-rouge">Just 10</code>. Likewise, if we choose between <code class="language-plaintext highlighter-rouge">Just 10 <|> Nothing</code>, we’ll take <code class="language-plaintext highlighter-rouge">Just 10</code>. It’s associative, so we don’t need parentheses. <code class="language-plaintext highlighter-rouge">a <|> b <|> c <|> d</code> will choose the first value that isn’t <code class="language-plaintext highlighter-rouge">empty</code>. Okay, so how can we rewrite <code class="language-plaintext highlighter-rouge">someOtherFunc</code> like this? <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>someOtherFunc :: Int -> Maybe String someOtherFunc i = foo i <|> bar i <|> wat i </code></pre></div></div> Now that looks pretty nice! Definitely a lot cleaner than the previous one. <h1 id="transformers-in-disguise">Transformers In Disguise</h1> Raise your hand if you’ve written some Haskell code like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>getFromCache :: String -> IO (Maybe Record) getFromDatabase :: String -> IO (Maybe Record) getFromRemoteAPI :: String -> IO (Maybe Record) retrieveRecord :: String -> IO (Maybe Record) retrieveRecord name = do mrec <- getFromCache name case mrec of Just rec -> pure (Just rec) Nothing -> do mrec' <- getFromDatabase name case mrec' of Just rec -> pure (Just rec) Nothing -> getFromRemoteAPI name </code></pre></div></div> GROSS! That’s just as bad as before. Wouldn’t it be great if we could get that nice <code class="language-plaintext highlighter-rouge">Maybe</code> <code class="language-plaintext highlighter-rouge">Alternative</code> action going here? Well, we can! 
The entire magic of a monad transformer is that we can enhance a base monad with features of another monad. Let’s cover the implementation of <code class="language-plaintext highlighter-rouge">MaybeT</code> and see how to use it to wrap our actions and get that choice. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype MaybeT m a = MaybeT { runMaybeT :: m (Maybe a) } </code></pre></div></div> I’m going to elide the <code class="language-plaintext highlighter-rouge">Functor</code> and <code class="language-plaintext highlighter-rouge">Applicative</code> definitions – let’s get right into <code class="language-plaintext highlighter-rouge">Monad</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Monad m => Monad (MaybeT m) where return = MaybeT . return . Just MaybeT ma >>= f = ??? </code></pre></div></div> <code class="language-plaintext highlighter-rouge">???</code> has the type <code class="language-plaintext highlighter-rouge">MaybeT m b</code>, <code class="language-plaintext highlighter-rouge">ma :: m (Maybe a)</code>, and <code class="language-plaintext highlighter-rouge">f :: a -> MaybeT m b</code>. We need to get the <code class="language-plaintext highlighter-rouge">a</code> out of that <code class="language-plaintext highlighter-rouge">ma</code> value, but it’s a <code class="language-plaintext highlighter-rouge">Monad</code>, so we can only <code class="language-plaintext highlighter-rouge">bind</code> out of it. So we’ll have to start with the <code class="language-plaintext highlighter-rouge">MaybeT</code> constructor. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Monad m => Monad (MaybeT m) where return = MaybeT . return . Just MaybeT ma >>= f = MaybeT ??? 
</code></pre></div></div> The <code class="language-plaintext highlighter-rouge">???</code> value has the type <code class="language-plaintext highlighter-rouge">m (Maybe b)</code> now, which means that it’s in the same monad. This means we can use <code class="language-plaintext highlighter-rouge">do</code> and bind out of that original <code class="language-plaintext highlighter-rouge">ma</code> value! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Monad m => Monad (MaybeT m) where return = MaybeT . return . Just MaybeT ma >>= f = MaybeT $ do maybeA <- ma ??? </code></pre></div></div> We’ve got a <code class="language-plaintext highlighter-rouge">maybeA :: Maybe a</code> value now, so we’re not out of the weeds yet. We’ll case match on the value. If it’s <code class="language-plaintext highlighter-rouge">Nothing</code>, we’ll <code class="language-plaintext highlighter-rouge">return Nothing</code> since we can’t do anything else. Otherwise, we can continue! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Monad m => Monad (MaybeT m) where return = MaybeT . return . Just MaybeT ma >>= f = MaybeT $ do maybeA <- ma case maybeA of Nothing -> return Nothing Just a -> ??? </code></pre></div></div> Now that we’ve finally got that <code class="language-plaintext highlighter-rouge">a</code>, we need to apply it to <code class="language-plaintext highlighter-rouge">f</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Monad m => Monad (MaybeT m) where return = MaybeT . return . 
Just MaybeT ma >>= f = MaybeT $ do maybeA <- ma case maybeA of Nothing -> return Nothing Just a -> f a </code></pre></div></div> However, this isn’t quite right, because <code class="language-plaintext highlighter-rouge">f a :: MaybeT m b</code>, and we need <code class="language-plaintext highlighter-rouge">m (Maybe b)</code>! We’ll unwrap with <code class="language-plaintext highlighter-rouge">runMaybeT</code> and it’ll work. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Monad m => Monad (MaybeT m) where return = MaybeT . return . Just MaybeT ma >>= f = MaybeT $ do maybeA <- ma case maybeA of Nothing -> return Nothing Just a -> runMaybeT (f a) </code></pre></div></div> Cool! Now, what does the <code class="language-plaintext highlighter-rouge">Alternative</code> instance look like? <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Monad m => Alternative (MaybeT m) where empty = MaybeT (return Nothing) MaybeT first <|> MaybeT second = ??? </code></pre></div></div> Well, we’ll want to check the first value, and if it’s <code class="language-plaintext highlighter-rouge">Nothing</code>, then we’ll check the second value. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Monad m => Alternative (MaybeT m) where empty = MaybeT (return Nothing) MaybeT first <|> MaybeT second = MaybeT $ do maybeA <- first case maybeA of Just a -> return (Just a) Nothing -> ??? </code></pre></div></div> Now we’ve taken care of the <code class="language-plaintext highlighter-rouge">first</code> action. If it was <code class="language-plaintext highlighter-rouge">Nothing</code>, then we’ll need to bind out of the <code class="language-plaintext highlighter-rouge">second</code> action. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Monad m => Alternative (MaybeT m) where empty = MaybeT (return Nothing) MaybeT first <|> MaybeT second = MaybeT $ do maybeA <- first case maybeA of Just a -> return (Just a) Nothing -> do maybeA' <- second case maybeA' of Just a -> return (Just a) Nothing -> return Nothing </code></pre></div></div> There’s some redundancy here that we can clean up: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Monad m => Alternative (MaybeT m) where empty = MaybeT (return Nothing) MaybeT first <|> MaybeT second = MaybeT $ do maybeA <- first case maybeA of Just a -> return (Just a) Nothing -> second </code></pre></div></div> Cool! <h1 id="using-the-alternative">Using the Alternative</h1> Now that we’ve got our <code class="language-plaintext highlighter-rouge">Alternative</code>, we can use it with our previous functions: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>getFromCache :: String -> IO (Maybe Record) getFromDatabase :: String -> IO (Maybe Record) getFromRemoteAPI :: String -> IO (Maybe Record) retrieveRecord :: String -> IO (Maybe Record) retrieveRecord name = runMaybeT $ MaybeT (getFromCache name) <|> MaybeT (getFromDatabase name) <|> MaybeT (getFromRemoteAPI name) </code></pre></div></div> This is a lot cleaner! I hope this has convinced you to check out the <code class="language-plaintext highlighter-rouge">Alternative</code> class and consider using it in your code. 
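To see the short-circuiting in action, here is a minimal, self-contained sketch that puts the pieces from this post into one file. The `Functor` and `Applicative` instances are my own filling-in of the parts elided above, and `lookupVia`, the `Record` synonym, and the canned results are hypothetical stand-ins for a real cache, database, and remote API.

```haskell
import Control.Applicative (Alternative (..))
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

newtype MaybeT m a = MaybeT { runMaybeT :: m (Maybe a) }

-- The instances the post elides, written out so the file compiles on its own.
instance Functor m => Functor (MaybeT m) where
  fmap f (MaybeT ma) = MaybeT (fmap (fmap f) ma)

instance Monad m => Applicative (MaybeT m) where
  pure = MaybeT . pure . Just
  MaybeT mf <*> MaybeT ma = MaybeT $ do
    maybeF <- mf
    case maybeF of
      Nothing -> pure Nothing
      Just f  -> fmap (fmap f) ma

instance Monad m => Monad (MaybeT m) where
  MaybeT ma >>= f = MaybeT $ do
    maybeA <- ma
    case maybeA of
      Nothing -> pure Nothing
      Just a  -> runMaybeT (f a)

-- Note the instance head: Alternative takes a type of kind * -> *,
-- so it is `MaybeT m`, not `MaybeT m a`.
instance Monad m => Alternative (MaybeT m) where
  empty = MaybeT (pure Nothing)
  MaybeT first <|> MaybeT second = MaybeT $ do
    maybeA <- first
    case maybeA of
      Just a  -> pure (Just a)
      Nothing -> second

-- Hypothetical record type for the sketch.
type Record = String

-- Hypothetical lookup: records which source was consulted,
-- then returns a canned result.
lookupVia :: IORef [String] -> String -> Maybe Record -> IO (Maybe Record)
lookupVia logRef source result = do
  modifyIORef logRef (++ [source])
  pure result

-- The cache misses, the database hits, so the remote API is never consulted.
retrieveRecord :: IORef [String] -> IO (Maybe Record)
retrieveRecord logRef = runMaybeT $
      MaybeT (lookupVia logRef "cache" Nothing)
  <|> MaybeT (lookupVia logRef "database" (Just "record from database"))
  <|> MaybeT (lookupVia logRef "remote API" (Just "record from API"))
```

In GHCi, create the log with `logRef <- newIORef []`, run `retrieveRecord logRef`, and then inspect `readIORef logRef` to see which sources were actually hit. The third action is never run because `<|>` only evaluates its right-hand side when the left-hand side produced `Nothing`.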
Fri, 18 Nov 2016 00:00:00 +0000 https://www.parsonsmatt.org/2016/11/18/clean_alternatives_with_maybet.html Grokking Fix <blockquote> This post is intended for beginners of functional programming interested in an exploration of laziness, Haskell, and recursion </blockquote> Haskell’s laziness enables some pretty cool tricks. The <code class="language-plaintext highlighter-rouge">fix</code> function is one of the neater ones, though it can be hard to understand how to use it from just the implementation and type signature. If you grab a calculator and put any number into it, you can start hitting the <code class="language-plaintext highlighter-rouge">cos</code> button. After a while, the number will start getting closer and closer to the <a href="https://mathworld.wolfram.com/DottieNumber.html">fixed point of cosine</a>. A fixed point of a function is some value where applying the function to the value returns the same value. The equation is a little easier to grasp: for some function $f$, the fixed point $c$ satisfies \[f(c) = c\] We can implement this in Haskell! The entirety of the magic is right here: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fix :: (a -> a) -> a fix f = let x = f x in x </code></pre></div></div> You might be thinking: <blockquote> wat </blockquote> And you’d be right! This is a really odd definition. It relies on the fact that Haskell values are lazy, and that you can refer to terms before defining them. <h1 id="fixing-identity">fixing identity</h1> The type signature says that we, the callers of the function, get to choose whatever <code class="language-plaintext highlighter-rouge">a</code> type we want. 
<code class="language-plaintext highlighter-rouge">(a -> a)</code> calls to mind <code class="language-plaintext highlighter-rouge">id</code>, which we can use as an easy first choice to see how Haskell evaluates this expression. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- We call it: fix id -- Rewrite: let x = id x in x -- Rewrite `x` on the right hand side in terms of how it is defined: let x = id (id x) in x -- Repeat: let x = id (id (id x)) in x -- Again, but with function composition: let x = (id . id . id . id) x in x -- *yawn* let x = (id . id . ... . id . id) x in x </code></pre></div></div> So this is just an infinite application of <code class="language-plaintext highlighter-rouge">id</code> to <code class="language-plaintext highlighter-rouge">x</code>! But where is <code class="language-plaintext highlighter-rouge">x</code>? What is it? This is precisely <code class="language-plaintext highlighter-rouge">_|_</code>: bottom, the value-that-is-no-value, the term <code class="language-plaintext highlighter-rouge">undefined</code>! So no matter how far you dig into that infinite pile of <code class="language-plaintext highlighter-rouge">id</code>s, you’ll never reach <code class="language-plaintext highlighter-rouge">bottom</code>. Another way to write this is to inline the definition of <code class="language-plaintext highlighter-rouge">id</code> right into our calling of <code class="language-plaintext highlighter-rouge">fix</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fix (\x -> x) </code></pre></div></div> Well, that was kind of pointless. How else can we use this function? Perhaps <code class="language-plaintext highlighter-rouge">fix cos</code> can get us that number we want! If we type that into GHCi, though, we get <code class="language-plaintext highlighter-rouge">*** Exception: <<loop>></code>. 
The function doesn’t have any way to terminate recursion, so this still repeats forever. <h1 id="fixing-more-interesting-things">fixing more interesting things</h1> Specializing the type means we can specialize to anything we want. This includes function types! So we can also specialize the type of <code class="language-plaintext highlighter-rouge">fix</code> to be: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fix :: (a -> a ) -> a [1] fix :: ((b -> c) -> (b -> c)) -> (b -> c) [2] fix :: ((b -> c) -> b -> c ) -> b -> c [3] </code></pre></div></div> Here we have: <ol> <li>The original definition.</li> <li>Specializing the type variable <code class="language-plaintext highlighter-rouge">a</code> to the function type <code class="language-plaintext highlighter-rouge">b -> c</code></li> <li>Dropping some redundant parentheses (remember, function arrow associates to the right, so <code class="language-plaintext highlighter-rouge">a -> (b -> c)</code> is equivalent to <code class="language-plaintext highlighter-rouge">a -> b -> c</code>)</li> </ol> This small change has had a pretty dramatic effect on how the type signature reads. <code class="language-plaintext highlighter-rouge">fix :: (a -> a) -> a</code> reads like “Give me a function that takes a single argument and returns a value of the same type, and I’ll give you a value of that type.” The two parameter version reads like: <blockquote> Give me a function that takes two arguments: the first being a function from <code class="language-plaintext highlighter-rouge">b</code> to <code class="language-plaintext highlighter-rouge">c</code>, and the second being a value of type <code class="language-plaintext highlighter-rouge">b</code>. Then, if you give me a <code class="language-plaintext highlighter-rouge">b</code>, then, I’ll give you a <code class="language-plaintext highlighter-rouge">c</code>. </blockquote> This is much more interesting. 
What might an example of this look like? <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cosFixpoint x = fix (\f b -> if cos b == b then b else f (cos b) ) x </code></pre></div></div> Evaluating <code class="language-plaintext highlighter-rouge">cosFixpoint</code> for any <code class="language-plaintext highlighter-rouge">x</code> gives us the same result: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ> fix (\f x -> if x == cos x then x else f (cos x)) 3 0.7390851332151607 λ> fix (\f x -> if x == cos x then x else f (cos x)) 4 0.7390851332151607 λ> fix (\f x -> if x == cos x then x else f (cos x)) 5 0.7390851332151607 λ> fix (\f x -> if x == cos x then x else f (cos x)) 6 0.7390851332151607 </code></pre></div></div> Now you might notice something interesting here. The function argument <code class="language-plaintext highlighter-rouge">f</code> – what is that function’s definition? It’s the lambda! We could rewrite this as an explicit recursion with a very similar structure: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cosFixpointExplicit x = if cos x == x then x else cosFixpointExplicit (cos x) </code></pre></div></div> In fact, we can use <code class="language-plaintext highlighter-rouge">fix</code> to factor out recursion anywhere we might find it. What might this look like for <code class="language-plaintext highlighter-rouge">last</code>? <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>last :: [a] -> Maybe a last [] = Nothing last [x] = Just x last (x:xs) = last xs </code></pre></div></div> First, we’d factor out the named recursion, and then pattern match on the list. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>last' = fix $ \f xs -> case xs of [] -> Nothing [x] -> Just x (x:xs) -> f xs </code></pre></div></div> How about <code class="language-plaintext highlighter-rouge">map</code>? <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>map :: (a -> b) -> [a] -> [b] map _ [] = [] map f (x:xs) = f x : map f xs map' f = fix $ \recurse list -> case list of [] -> [] (x:xs) -> f x : recurse xs </code></pre></div></div> Neat! <h1 id="monadic-fixins">monadic fixins</h1> Do we need another definition of the function to work with monadic functions? Let’s specialize the type and see what happens: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fix :: (a -> a) -> a fix :: Monad m => ((a -> m b) -> a -> m b) -> a -> m b </code></pre></div></div> That checks out. Let’s write something that does a bit of <code class="language-plaintext highlighter-rouge">IO</code> with <code class="language-plaintext highlighter-rouge">fix</code> now: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>printUntilZero = fix $ \f x -> if x >= 0 then do print x f (x - 1) else pure x -- pasting into GHCi: λ> fix (\f x -> if x >= 0 then do print x; f (x - 1) else pure x) 4 4 3 2 1 0 -1 </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">-1</code> in the output is the return value of the fix expression: the printed numbers are the side effects. Well, that’s weird and cool. How exactly does this all work again? Let’s review the definition: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fix :: (a -> a) -> a fix f = let x = f x in x </code></pre></div></div> Haskell’s laziness allows us to essentially rewrite this one step at a time, exactly as we demand the results of it. 
How does the language compile and understand this? First, it checks the syntax, which is totally okay. Second, it collects all of the declarations and their type signatures together. We’re declaring three things: <ol> <li>The top level function <code class="language-plaintext highlighter-rouge">fix</code></li> <li><code class="language-plaintext highlighter-rouge">fix</code>’s first argument, <code class="language-plaintext highlighter-rouge">f</code></li> <li>The <code class="language-plaintext highlighter-rouge">let</code> bound variable <code class="language-plaintext highlighter-rouge">x</code>, which is in scope both in the right-hand side of its own <code class="language-plaintext highlighter-rouge">let</code> binding and in the expression immediately following the <code class="language-plaintext highlighter-rouge">in</code>.</li> </ol> Then, it goes to type check the expression. As it type checks, it also ensures that no variables are undeclared. Since <code class="language-plaintext highlighter-rouge">x</code> is declared in the <code class="language-plaintext highlighter-rouge">let</code> block, and <code class="language-plaintext highlighter-rouge">f x</code> is well typed, it accepts this definition. Haskell will keep applying <code class="language-plaintext highlighter-rouge">f</code> to <code class="language-plaintext highlighter-rouge">x</code> each time we demand the next bit of evaluation, until <code class="language-plaintext highlighter-rouge">f</code> stops asking for its argument and the recursion terminates. We saw above that <code class="language-plaintext highlighter-rouge">fix id</code> dug a big hole of <code class="language-plaintext highlighter-rouge">id</code> applications, from which we’d never get out. But specializing to a function type allowed us to provide a starting point, and terminate early! 
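Here is a small sketch of that demand-driven unrolling, using the `fix` that ships in `Data.Function`. The names `factorial`, `factorialStep`, and `ones` are my own illustrations and do not appear earlier in the post.

```haskell
import Data.Function (fix)

-- The "step" function: it receives the recursive call as an argument
-- instead of referring to itself by name.
factorialStep :: (Integer -> Integer) -> Integer -> Integer
factorialStep recurse n =
  if n <= 1
    then 1                      -- base case: `recurse` is never demanded
    else n * recurse (n - 1)    -- demands one more unrolling of fix

-- Tying the knot: fix feeds factorialStep its own fixed point.
factorial :: Integer -> Integer
factorial = fix factorialStep

-- fix also works on lazy data: (1 :) only needs its argument when
-- someone looks past the first cons cell, so this is an infinite list
-- that we can safely consume a prefix of.
ones :: [Int]
ones = fix (1 :)
```

In GHCi, `factorial 5` evaluates to `120` and `take 3 ones` evaluates to `[1,1,1]`. Contrast that with `fix id`: `id` demands its argument immediately and produces nothing first, so there is never a base case or a constructor to stop at.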
<h1 id="the-secret-tricks">The Secret Tricks</h1> <ol> <li>Combinators are cool.</li> <li>You can specialize plain <code class="language-plaintext highlighter-rouge">a</code> types to function types <code class="language-plaintext highlighter-rouge">a -> b</code> for interesting results.</li> <li>Laziness is fun.</li> </ol> Wed, 26 Oct 2016 00:00:00 +0000 https://www.parsonsmatt.org/2016/10/26/grokking_fix.html Rank 'n Classy Limited Effects <h2 id="update-from-the-future-2019-05-22">Update from the Future (2019-05-22):</h2> I don’t recommend this technique any more. It’s quite complicated, and there’s a much simpler formulation of the idea that uses records-of-functions instead of type class instances. I’ve written up this formulation at the end of the blog post. I don’t recommend using the record-of-functions approach either, except in very narrow use cases, and with very constrained interfaces. <h2 id="original-post">Original Post</h2> Side effects are awful. Database access, HTTP requests, file reading, talking to Redis, ah! Just so much gross IO code to shuffle around. There’s been a lot of effort to make things nicer. Using monads to track effects in the type is a great start, but it’s a little painful to work with without some good abstractions. The <a href="https://hackage.haskell.org/package/mtl"><code class="language-plaintext highlighter-rouge">mtl</code></a> library does a great job of making abstractions, but it has a big flaw: every new monad you want to introduce incurs $O(n^2)$ instances that you need to write. 
If you’re using <code class="language-plaintext highlighter-rouge">Reader</code>, <code class="language-plaintext highlighter-rouge">State</code>, <code class="language-plaintext highlighter-rouge">Logger</code>, <code class="language-plaintext highlighter-rouge">Http</code>, <code class="language-plaintext highlighter-rouge">Database</code>, <code class="language-plaintext highlighter-rouge">Email</code>, etc. (with special instances for testing/production/etc) then eventually this becomes too much of a burden. More recently, the <a href="https://www.haskellforall.com/2012/06/you-could-have-invented-free-monads.html">free monad</a> approach and <a href="https://hackage.haskell.org/package/extensible-effects">extensible-effects</a> on top of it have become more popular. Free monads solve the $O(n^2)$ instance problem, and they offer the ability to introspect on the computation and perform optimizations on it. However, they have worse performance and are more complicated to implement. You either have to build a giant command functor with an equally complex interpreter, or you need to build many small languages and manage their combinations. I’ve been working on a very promising pattern at the day job lately, and it’s worked out quite well thus far. It seems to solve the issues involved with a ridiculous proliferation of monad instances and the complications involved with free monads, while still giving most of the benefits of both. <h1 id="mtl-style-revisited"><code class="language-plaintext highlighter-rouge">mtl</code> style, revisited</h1> If you’re unfamiliar, the <code class="language-plaintext highlighter-rouge">mtl</code> style of documenting effects is to use type classes to specify the effects of functions. 
This has two main benefits: <h2 id="you-dont-have-to-worry-about-the-order-of-your-monad-stack">You don’t have to worry about the order of your monad stack.</h2> Concretely, these two functions are incompatible: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foo :: StateT Int (Reader Char) Bool foo = do int <- get char <- lift ask pure False bar :: ReaderT Char (State Int) Bool bar = do int <- lift get char <- ask pure True </code></pre></div></div> Since we have to specify the <code class="language-plaintext highlighter-rouge">lift</code>s, we can’t use them together. The <code class="language-plaintext highlighter-rouge">mtl</code> approach makes this possible: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foo :: (MonadState Int m, MonadReader Char m) => m Bool foo = do int <- get char <- ask pure False bar :: (MonadState Int m, MonadReader Char m) => m Bool bar = do int <- get char <- ask pure True </code></pre></div></div> The two functions are now easily interoperable! Nice. We also didn’t have to type <code class="language-plaintext highlighter-rouge">lift</code> a bunch, though truth be told, you just need the <code class="language-plaintext highlighter-rouge">mtl</code> type classes in scope for that to work. While this is nice, it’s not the real benefit of <code class="language-plaintext highlighter-rouge">mtl</code>. <h2 id="strict-specification-of-effects">Strict specification of effects</h2> When you use an <code class="language-plaintext highlighter-rouge">mtl</code> type class, you’re restricting yourself to the interface that the type class provides. If your monad is <code class="language-plaintext highlighter-rouge">StateT Int IO String</code>, then your monad can do any <code class="language-plaintext highlighter-rouge">IO</code> it wants. That’s no good! 
But if you know your function is <code class="language-plaintext highlighter-rouge">MonadState Int m => m String</code>, you know it can only operate on the state. This lets you swap implementations easily. The PureScript compiler had an awesome demonstration, where they moved a <code class="language-plaintext highlighter-rouge">WriterT</code> based logger to an <code class="language-plaintext highlighter-rouge">IO</code> based instance (<a href="https://blog.functorial.com/posts/2016-01-31-PureScript-0.8.html">documented here</a>) for big performance gains. <h1 id="mocking-monads">Mocking Monads</h1> First, we’ll define a type class that represents an effect. We’ll use a limited subset of <code class="language-plaintext highlighter-rouge">Http</code> requests. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class Monad m => MonadHttp m where get :: Url -> m ByteString post :: ToJSON a => Url -> a -> m ByteString type Url = String </code></pre></div></div> We can easily make an instance for <code class="language-plaintext highlighter-rouge">IO</code>, using <code class="language-plaintext highlighter-rouge">wreq</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance MonadHttp IO where get = fmap (view responseBody) . Wreq.get post url = fmap (view responseBody) . Wreq.post url . toJSON </code></pre></div></div> Now, wherever we might have been using a function like <code class="language-plaintext highlighter-rouge">makeRequest :: Something -> IO OtherThing</code>, we can abstract that <code class="language-plaintext highlighter-rouge">IO</code> into <code class="language-plaintext highlighter-rouge">makeRequest :: MonadHttp m => Something -> m OtherThing</code>. We can make the change transparently, since <code class="language-plaintext highlighter-rouge">IO</code> will still be inferred and used. 
Plus, we have the assurance that we’re not going to be accessing the database or printing any output in our <code class="language-plaintext highlighter-rouge">MonadHttp</code> functions. Actually running HTTP requests in dev/test is boring. It’s slow, annoying, unreliable, etc. and we’d much rather run locally for faster tests and more reliable development. We can easily create a mock implementation of <code class="language-plaintext highlighter-rouge">MonadHttp</code> that returns canned responses: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype MockHttp a = MockHttp { runMockHttp :: ReaderT HttpEnv IO a } deriving (Functor, Applicative, Monad, MonadReader HttpEnv, MonadIO) type HttpEnv = IORef HttpState type HttpState = Map String ByteString instance MonadHttp MockHttp where get url = do state <- ask >>= liftIO . readIORef pure (Map.findWithDefault mempty url state) post url body = do ref <- ask state <- liftIO (readIORef ref) liftIO (writeIORef ref (Map.insert url (encode body) state)) pure "200 OK" </code></pre></div></div> Now, with this instance, all of your <code class="language-plaintext highlighter-rouge">MonadHttp</code> requests will be performed real fast (and dumb). <h1 id="rankn-classy"><code class="language-plaintext highlighter-rouge">RankN</code> Classy</h1> Now, how can we select which interpretation we want? We obviously want <code class="language-plaintext highlighter-rouge">IO</code> for production and <code class="language-plaintext highlighter-rouge">MockHttp</code> for testing. Ultimately, what we want is one of a family of functions, with the following generalized type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>forall app a. (forall m. MonadHttp m => m a) -> app a </code></pre></div></div> And here, we see <code class="language-plaintext highlighter-rouge">RankNTypes</code> come into play. 
The two type variables <code class="language-plaintext highlighter-rouge">app</code> and <code class="language-plaintext highlighter-rouge">a</code> are both <code class="language-plaintext highlighter-rouge">Rank1</code> types, since they’re introduced at the leftmost part of the function and are mentioned to the right of all the function arrows. The <code class="language-plaintext highlighter-rouge">m</code> type variable, on the other hand, is hidden – that’s a <code class="language-plaintext highlighter-rouge">Rank2</code> type variable. The variables <code class="language-plaintext highlighter-rouge">app</code> and <code class="language-plaintext highlighter-rouge">a</code> can both be chosen by the user to be whatever works, but we’re forcing the user to provide a value that not only satisfies the <code class="language-plaintext highlighter-rouge">MonadHttp</code> type class, but that it can do no more than <code class="language-plaintext highlighter-rouge">MonadHttp</code>. Consider this other signature: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>forall app m a. MonadHttp m => m a -> app a </code></pre></div></div> It looks really similar, but that <code class="language-plaintext highlighter-rouge">m</code> is no longer hidden. The user of the function can easily select <code class="language-plaintext highlighter-rouge">IO</code> as the implementation, as that satisfies the type class requirements. The user could then execute arbitrary IO actions, and the types haven’t helped much. The <code class="language-plaintext highlighter-rouge">Rank2</code> type above forces the user to only use <code class="language-plaintext highlighter-rouge">MonadHttp</code> functions and actions. 
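The difference between the two signatures is easier to feel with something you can load into GHCi. What follows is a minimal sketch under simplifying assumptions: the class is cut down to a single hypothetical `httpGet` method over `String`, the `IO` instance returns a canned string instead of touching the network, and `Mock` is a hand-rolled reader over an association list rather than the `ReaderT`/`IORef` stack used in this post.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Maybe (fromMaybe)

-- A cut-down, hypothetical version of the effect class.
class Monad m => MonadHttp m where
  httpGet :: String -> m String

-- Stand-in for the production instance: no real network access here.
instance MonadHttp IO where
  httpGet url = pure ("live response from " ++ url)

-- A tiny reader-like mock: responses come from an association list.
newtype Mock a = Mock { runMock :: [(String, String)] -> a }

instance Functor Mock where
  fmap f (Mock g) = Mock (f . g)

instance Applicative Mock where
  pure = Mock . const
  Mock f <*> Mock g = Mock (\env -> f env (g env))

instance Monad Mock where
  Mock g >>= f = Mock (\env -> runMock (f (g env)) env)

instance MonadHttp Mock where
  httpGet url = Mock (fromMaybe "404 not found" . lookup url)

-- Rank 2: the *interpreter* picks m, so the caller must hand over an
-- action that works for every MonadHttp instance.
runHttpRequests :: (forall m. MonadHttp m => m a) -> IO a
runHttpRequests action = action

mockHttpRequests :: [(String, String)] -> (forall m. MonadHttp m => m a) -> IO a
mockHttpRequests env action = pure (runMock action env)

-- An action written only against the MonadHttp interface.
pageLength :: MonadHttp m => m Int
pageLength = length <$> httpGet "https://example.com"

-- The following is rejected by the type checker, because `putStrLn`
-- forces m to be IO while the argument must stay polymorphic in m:
--
-- sneaky :: IO Int
-- sneaky = runHttpRequests (putStrLn "arbitrary IO" >> pageLength)
```

The same `pageLength` value runs under both interpreters: `mockHttpRequests [("https://example.com", "hello")] pageLength` yields `5`, because the mock looks the URL up in the list it was given. Uncommenting `sneaky` is a quick way to watch the rank-2 argument do its job.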
We can safely specialize <code class="language-plaintext highlighter-rouge">app</code> to <code class="language-plaintext highlighter-rouge">IO</code> for now, which we’ll need in order to read <code class="language-plaintext highlighter-rouge">IORef</code>s and do HTTP. So the functions we’re looking for, then, are: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mockHttpRequests :: HttpEnv -> (forall m. MonadHttp m => m a) -> IO a mockHttpRequests env action = runReaderT (runMockHttp action) env runHttpRequests :: (forall m. MonadHttp m => m a) -> IO a runHttpRequests action = action </code></pre></div></div> <h1 id="abstracting-the-implementations">Abstracting the implementations</h1> Now, here’s the last bit of the trick: You abstract out the implementations into a record. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Services = Services { runHttp :: forall a. (forall m. MonadHttp m => m a) -> IO a } </code></pre></div></div> which you store in your application environment: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Application = ReaderT Services IO </code></pre></div></div> Now, when you need to run HTTP requests, you can do: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foobar :: Application Int foobar = do service <- ask lift $ runHttp service $ do page <- get "https://www.google.com/" post "https://secret-data" (collectData page) pure (length page) </code></pre></div></div> Then, while initializing your application, you can choose which environment to pass in: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>runApplicationProd :: Application a -> IO a runApplicationProd action = runReaderT action (Services runHttpRequests) runApplicationTest :: Application a -> IO a runApplicationTest 
action = do ref <- newIORef initialHttpState runReaderT action (Services (mockHttpRequests ref)) </code></pre></div></div> <h1 id="what-about-free-monads">What about free monads?</h1> Oh this is the best part! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>reify :: (forall m. MonadHttp m => m a) -> Free HttpF a reify = unFreeHttp newtype FreeHttp a = FreeHttp { unFreeHttp :: Free HttpF a } data HttpF next = Get Url (ByteString -> next) | forall a. ToJSON a => Post Url a (ByteString -> next) instance MonadHttp FreeHttp where get url = FreeHttp (liftF (Get url id)) post url body = FreeHttp (liftF (Post url body id)) </code></pre></div></div> And now you can grab ahold of a Free monad representation of your AST. <h1 id="the-theory">The Theory</h1> Where free monads seem like a great way to describe the effects of a computation, they seem more awkward at implementing requests. I’m inspired by <a href="https://tomasp.net/coeffects/">Tomas Petricek’s Coeffects</a> concept, which describes the context or environment of a computation as an indexed comonad. It seems like this approach allows you to request an environment comonad of interpreters for effects. By reifying these effects at the value level (a trick similar to <a href="https://www.haskellforall.com/2012/07/first-class-modules-without-defaults.html">Gabriel Gonzalez’s First Class Module Records</a>), we avoid a lot of the problems with type classes and instances, while keeping the niceties of their abstractions. An environment comonad is essentially a really complicated way of saying “tuple”, and that’s left adjoint<a href="#fn:1" class="footnote" rel="footnote">1</a> to a reader monad. We get nice syntax sugar for monads and not comonads in Haskell, so <code class="language-plaintext highlighter-rouge">ReaderT Services</code> provides a nice approach to packaging up your environment’s request context. What’s next? 
Well, you might note that the <code class="language-plaintext highlighter-rouge">Services</code> type was a little restricted. Indeed, the following is a bit nicer: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Services g = Services { runHttp :: forall a. (forall f. MonadHttp f => f a) -> g a } </code></pre></div></div> Which, hey… That’s just a natural transformation! Specifically, a monad morphism. We can reify that type with <code class="language-plaintext highlighter-rouge">ConstraintKinds</code> to get: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type InterpreterFor g eff = forall a. (forall f. eff f => f a) -> g a </code></pre></div></div> We can read this as: You choose the <code class="language-plaintext highlighter-rouge">eff</code>ect you want to interpret, and the monad you want to interpret it to. But you can’t choose the underlying concrete <code class="language-plaintext highlighter-rouge">f</code>, nor can you introspect on the <code class="language-plaintext highlighter-rouge">a</code>s to do so. 
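As a rough sketch of how that synonym behaves once the target monad is a parameter, the file below builds one `Services` value per target and runs the same polymorphic action through each. The single-method `MonadHttp` class, both instances, and `fetchLength` are hypothetical simplifications of my own; only the `InterpreterFor` synonym is taken from the text above.

```haskell
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Identity (Identity (..))

-- The synonym from the text: choose the target monad g and the effect eff,
-- but never the concrete f that the action is written in.
type InterpreterFor g eff = forall a. (forall f. eff f => f a) -> g a

-- A cut-down, hypothetical effect class.
class Monad m => MonadHttp m where
  httpGet :: String -> m String

-- Stand-in for production: a canned string, no real network access.
instance MonadHttp IO where
  httpGet url = pure ("live response from " ++ url)

-- A pure instance for tests.
instance MonadHttp Identity where
  httpGet url = pure ("canned response for " ++ url)

-- The record of interpreters, now parameterised by the target monad.
data Services g = Services
  { runHttp :: InterpreterFor g MonadHttp
  }

-- Each interpreter simply instantiates the polymorphic action at its target.
prodServices :: Services IO
prodServices = Services { runHttp = \action -> action }

testServices :: Services Identity
testServices = Services { runHttp = \action -> action }

-- Code that uses the record never learns which monad it ends up in.
fetchLength :: Services g -> g Int
fetchLength services =
  runHttp services (length <$> httpGet "https://example.com")
```

With this in scope, `runIdentity (fetchLength testServices)` is a pure value you can assert on directly, while `fetchLength prodServices` is an `IO Int`. The interpreters are written as explicit lambdas rather than `id` because recent GHC versions no longer accept `id` at a higher-rank field type without extra extensions.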
If you allow <code class="language-plaintext highlighter-rouge">TypeOperators</code>, then it even reads nicely, and we can replace our <code class="language-plaintext highlighter-rouge">IO</code> services with: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Services = Services { runHttp :: IO `InterpreterFor` MonadHttp } </code></pre></div></div> (Not going to lie, that syntax really pleases my inner Rubyist) And, in a final act of cutting <code class="language-plaintext highlighter-rouge">IO</code> out of the program, we can parametrize that, yielding: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Services eff = Services { runHttp :: eff `InterpreterFor` MonadHttp , runDatabase :: eff `InterpreterFor` MonadDatabase , runEmails :: eff `InterpreterFor` MonadMandrill -- etc... } </code></pre></div></div> along with a final <code class="language-plaintext highlighter-rouge">Application</code> type that abstracts over that: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Application = forall m. ReaderT (Services m) m </code></pre></div></div> (<a href="https://www.reddit.com/r/haskell/comments/4std0v/rank_n_classy_limited_effects/d5c2rcb">credit to /u/Faucelme on reddit for that!</a>) Applications, then, are just an environment comonad of monad morphisms. More plainly, they’re a record of effect interpreters. <h1 id="update">Update:</h1> Thanks to <a href="https://www.reddit.com/r/haskell/comments/4std0v/rank_n_classy_limited_effects/d5cdon4">/u/ElvishJerrico on Reddit</a> who has implemented a <code class="language-plaintext highlighter-rouge">Category</code> instance for these morphisms! This is a great way to compose effects. 
The given example is copied here: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE ConstraintKinds #-} {-# LANGUAGE DeriveFunctor #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE GeneralizedNewtypeDeriving #-} {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE PolyKinds #-} {-# LANGUAGE RankNTypes #-} {-# LANGUAGE UndecidableInstances #-} module Lib where import Control.Category import Control.Monad.Free import Control.Monad.IO.Class import Control.Monad.Reader import Prelude hiding (id, (.)) import qualified Prelude as P newtype Interpret c d = Interpret (forall n a. d n => (forall m. c m => m a) -> n a) instance Category Interpret where id = Interpret P.id Interpret f . Interpret g = Interpret $ \h -> f (g h) class Monad m => MonadHttp m where httpGet :: String -> m String newtype HttpApp a = HttpApp { runHttpApp :: IO a } deriving (Functor, Applicative, Monad, MonadIO) instance MonadHttp HttpApp where httpGet _ = return "[]" -- Should do actual IO runIO :: Interpret MonadHttp MonadIO runIO = Interpret $ \x -> liftIO $ runHttpApp x newtype MockHttp m a = MockHttp { runMockHttp :: m a } deriving (Functor, Applicative, Monad) instance MonadReader r m => MonadReader r (MockHttp m) where ask = MockHttp ask local f (MockHttp m) = MockHttp $ local f m instance MonadReader String m => MonadHttp (MockHttp m) where httpGet _ = ask runMock :: Interpret MonadHttp (MonadReader String) runMock = Interpret runMockHttp class Monad m => MonadRestApi m where getUserIds :: m [Int] data RestApi a = GetUsers ([Int] -> a) deriving Functor instance MonadRestApi (Free RestApi) where getUserIds = liftF $ GetUsers id runRestApi :: Interpret MonadRestApi MonadHttp runRestApi = Interpret $ iterA go where go (GetUsers f) = do response <- httpGet "url" f $ read response runApplication :: Interpret MonadRestApi MonadIO runApplication = runIO . 
runRestApi mockApplication :: Interpret MonadRestApi (MonadReader String) mockApplication = runMock . runRestApi </code></pre></div></div> <h1 id="actually">Actually…</h1> Don’t use this. It’s complicated and overly boilerplatey. Here’s the <code class="language-plaintext highlighter-rouge">MonadHttp</code> effect code we ended up developing: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type InterpreterFor g eff = forall a. (forall f. eff f => f a) -> g a class MonadHttp m where get :: Url -> m ByteString post :: ToJSON a => Url -> a -> m ByteString data Services eff = Services { runHttp :: eff `InterpreterFor` MonadHttp } </code></pre></div></div> To create one of these <code class="language-plaintext highlighter-rouge">InterpreterFor</code>s, we have to make a type and define an instance: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype MockHttp a = MockHttp { runMockHttp :: ReaderT HttpEnv IO a } deriving (Functor, Applicative, Monad, MonadReader HttpEnv, MonadIO) type HttpEnv = IORef HttpState type HttpState = Map String ByteString instance MonadHttp MockHttp where get url = do state <- ask >>= liftIO . readIORef pure (Map.lookup url state) post url body = do ref <- ask state <- liftIO (readIORef ref) liftIO (writeIORef ref (Map.insert url (encode body) state)) pure "200 OK" instance MonadHttp IO where get = fmap (view responseBody) . Wreq.get post url = fmap (view responseBody) . Wreq.post url . toJSON </code></pre></div></div> Instead, we will create a record-of-functions for the type class, and create two values: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Http m = Http { get :: Url -> m ByteString , post :: forall a. ToJSON a => Url -> a -> m ByteString } prodHttp :: Http IO prodHttp = Http { get = fmap (view responseBody) . Wreq.get , post = \url -> fmap (view responseBody) . 
Wreq.post url . toJSON } mockHttp :: IORef (Map String ByteString) -> Http IO mockHttp env = Http { get = \url -> do state <- readIORef env pure (Map.lookup url state) , post = \url body -> do state <- liftIO (readIORef env) liftIO (writeIORef env (Map.insert url (encode body) state)) pure "200 OK" } </code></pre></div></div> and we include the record of functions directly into <code class="language-plaintext highlighter-rouge">Services</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Services eff = Services { http :: Http eff } type Application eff = ReaderT (Services eff) eff </code></pre></div></div> This gives us the same expressive power without having to deal with type classes, instances, and any of that other hassle. You construct plain values and pass them around. <div class="footnotes" role="doc-endnotes"> <ol> <li id="fn:1" role="doc-endnote"> I had initially written “isomorphic,” and was corrected by George Wilson who reminded me that tuple and reader form an adjunction, and that the isomorphism is between Kleisli (Reader r) and CoKleisli (Env r) <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a> </li> </ol> </div> Thu, 14 Jul 2016 00:00:00 +0000 https://www.parsonsmatt.org/2016/07/14/rank_n_classy_limited_effects.html servant-persistent updated Previously, I <a href="/2015/06/07/servant-persistent.html">wrote a blog post on using <code class="language-plaintext highlighter-rouge">servant</code> and <code class="language-plaintext highlighter-rouge">persistent</code> together</a>. <code class="language-plaintext highlighter-rouge">servant</code> has updated to the new 0.7 version, and I felt like it was a good idea to bring my tutorial up to date. I’d also noticed that some folks were using the repository as a starter scaffold for their own apps, which is great! 
To accommodate that, I’ve beefed up the application a bit to demonstrate some of the features of Servant, including a primitive client, as well as configuration for easy deployment with the <a href="https://hackage.haskell.org/package/keter"><code class="language-plaintext highlighter-rouge">keter</code></a> package. Let’s dive in! The code for all of this is on the <a href="https://github.com/parsonsmatt/servant-persistent/tree/0.7">GitHub repository</a>. I’ll be keeping the 0.7 branch up to date with any edits to this post. Take note: This is less of a tutorial on <code class="language-plaintext highlighter-rouge">servant</code> specifically, and more of an exposition on a <code class="language-plaintext highlighter-rouge">servant</code> base package that has some convenient defaults for running applications. <h1 id="application-structure">Application Structure</h1> The application has three sub-components: <ul> <li><code class="language-plaintext highlighter-rouge">src</code> : contains all the library code</li> <li><code class="language-plaintext highlighter-rouge">app</code> : contains the code for the executable</li> <li><code class="language-plaintext highlighter-rouge">test</code> : contains the test code</li> </ul> It’s a good idea to extract as much code as you can in the library. This makes it easier to test the code, as you can import it into the tests without having to recompile it every time. Additionally, you can make the library functions available for all kinds of potential executables down the line. We’ll start with <code class="language-plaintext highlighter-rouge">Main</code> and dig into the rest. <h1 id="appmainhs"><code class="language-plaintext highlighter-rouge">app/Main.hs</code></h1> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- | The 'main' function gathers the required environment information and -- initializes the application. 
main :: IO () main = do putStrLn "servant-persistent booting up" env <- lookupSetting "ENV" Development port <- lookupSetting "PORT" 8081 pool <- makePool env let cfg = Config { getPool = pool, getEnv = env } logger = setLogger env runSqlPool doMigrations pool generateJavaScript run port $ logger $ app cfg </code></pre></div></div> <code class="language-plaintext highlighter-rouge">main</code> grabs some settings from the environment, creates the database pool, runs migrations, generates JavaScript for querying the API, and finally runs the app. We define <code class="language-plaintext highlighter-rouge">lookupSetting</code> a little below: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- | Looks up a setting in the environment, with a provided default, and -- 'read's that information into the inferred type. lookupSetting :: Read a => String -> a -> IO a lookupSetting env def = do maybeValue <- lookupEnv env case maybeValue of Nothing -> return def Just str -> maybe (handleFailedRead str) return (readMay str) where handleFailedRead str = error $ mconcat [ "Failed to read [[" , str , "]] for environment variable " , env ] </code></pre></div></div> First, we look up the environment variable. If it’s not present, then we just return the default value. If it is present, then we use the function <code class="language-plaintext highlighter-rouge">readMay</code> which we’ve imported from the <code class="language-plaintext highlighter-rouge">Safe</code> module. If <code class="language-plaintext highlighter-rouge">readMay</code> fails to read the variable, then we throw an error instead of falling back to the default. Consider the alternative: <code class="language-plaintext highlighter-rouge">readMay "PRoduction" :: Maybe Environment</code> returns <code class="language-plaintext highlighter-rouge">Nothing</code>, so treating a failed read like a missing variable would silently put us in <code class="language-plaintext highlighter-rouge">Development</code> mode because of a typo. We definitely don’t want that! 
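The three outcomes described above (missing variable, readable value, unreadable value) can be sketched as a pure function, which makes them easy to check without touching the real environment. This is a minimal sketch under stated assumptions: <code class="language-plaintext highlighter-rouge">resolveSetting</code> is a hypothetical helper that is not in the repository, it reports failure with <code class="language-plaintext highlighter-rouge">Left</code> rather than calling <code class="language-plaintext highlighter-rouge">error</code>, and it uses <code class="language-plaintext highlighter-rouge">readMaybe</code> from <code class="language-plaintext highlighter-rouge">Text.Read</code> in place of <code class="language-plaintext highlighter-rouge">readMay</code> so that it only needs <code class="language-plaintext highlighter-rouge">base</code>.

```haskell
module Main where

import Text.Read (readMaybe)

-- Stand-in for the Environment type used by the application.
data Environment = Development | Test | Production
  deriving (Eq, Show, Read)

-- Hypothetical pure core of lookupSetting: the third argument plays
-- the role of the result of 'lookupEnv'.
resolveSetting :: Read a => String -> a -> Maybe String -> Either String a
resolveSetting _ def Nothing = Right def -- variable absent: use the default
resolveSetting name _ (Just raw) =
  -- variable present: it must parse, otherwise we refuse to guess
  maybe (Left failure) Right (readMaybe raw)
  where
    failure =
      "Failed to read [[" ++ raw ++ "]] for environment variable " ++ name

main :: IO ()
main = do
  print (resolveSetting "ENV" Development Nothing)
  print (resolveSetting "ENV" Development (Just "Production"))
  print (resolveSetting "ENV" Development (Just "PRoduction"))
  print (resolveSetting "PORT" (8081 :: Int) (Just "3000"))
```

The third call is the interesting one: the misspelled value produces a <code class="language-plaintext highlighter-rouge">Left</code> with the error message, not a quiet <code class="language-plaintext highlighter-rouge">Right Development</code>.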
Next up is <code class="language-plaintext highlighter-rouge">makePool</code>, so let’s check that out. We’ve imported it from <code class="language-plaintext highlighter-rouge">Config</code>. <h1 id="srcconfighs"><code class="language-plaintext highlighter-rouge">src/Config.hs</code></h1> For <code class="language-plaintext highlighter-rouge">Development</code> and <code class="language-plaintext highlighter-rouge">Test</code> environments, the <code class="language-plaintext highlighter-rouge">makePool</code> function is relatively simple: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- | This function creates a 'ConnectionPool' for the given environment. -- For 'Development' and 'Test' environments, we use a stock and highly -- insecure connection string. The 'Production' environment acquires the -- information from environment variables that are set by the keter -- deployment application. makePool :: Environment -> IO ConnectionPool makePool Test = runNoLoggingT (createPostgresqlPool (connStr "_test") (envPool Test)) makePool Development = runStdoutLoggingT (createPostgresqlPool (connStr "") (envPool Development)) </code></pre></div></div> In <code class="language-plaintext highlighter-rouge">Test</code>, we don’t want it to print anything out, so we use the <code class="language-plaintext highlighter-rouge">runNoLoggingT</code> function from <code class="language-plaintext highlighter-rouge">Control.Monad.Logger</code> to tell <code class="language-plaintext highlighter-rouge">createPostgresqlPool</code> which instance of the <code class="language-plaintext highlighter-rouge">MonadLogger</code> type class it’ll use. Likewise, <code class="language-plaintext highlighter-rouge">Development</code> will be printing all of the logs to standard out. We create a <code class="language-plaintext highlighter-rouge">connStr</code> with a database name suffix of “_test” for testing and no suffix for development. 
For production, it gets a bit trickier. We need to get the database environment from <code class="language-plaintext highlighter-rouge">keter</code>, so we have to read each bit of the connection string in as environment variables. This part of the function makes heavy use of the <code class="language-plaintext highlighter-rouge">MaybeT</code> monad transformer, which might be confusing if you’re not familiar with it. It allows us to combine the effects from ‘IO’ and the effect of <code class="language-plaintext highlighter-rouge">Maybe</code> into a single “big effect”, so that when we bind out of <code class="language-plaintext highlighter-rouge">MaybeT IO a</code>, we get an <code class="language-plaintext highlighter-rouge">a</code>. If we just had <code class="language-plaintext highlighter-rouge">IO (Maybe a)</code>, then binding out of the IO would give us a <code class="language-plaintext highlighter-rouge">Maybe a</code>, which would make the code quite a bit more verbose. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>makePool Production = do pool <- runMaybeT $ do let keys = [ "host=" , "port=" , "user=" , "password=" , "dbname=" ] envs = [ "PGHOST" , "PGPORT" , "PGUSER" , "PGPASS" , "PGDATABASE" ] envVars <- traverse (MaybeT . lookupEnv) envs let prodStr = mconcat . zipWith (<>) keys . fmap BS.pack $ envVars runStdoutLoggingT $ createPostgresqlPool prodStr (envPool Production) </code></pre></div></div> <code class="language-plaintext highlighter-rouge">traverse</code> is tons of fun. If you know <code class="language-plaintext highlighter-rouge">map :: (a -> b) -> [a] -> [b]</code>, then <code class="language-plaintext highlighter-rouge">mapM</code> shouldn’t be too scary: it’s just <code class="language-plaintext highlighter-rouge">mapM :: (a -> m b) -> [a] -> m [b]</code>. 
As it happens, the <code class="language-plaintext highlighter-rouge">m</code> in <code class="language-plaintext highlighter-rouge">mapM</code> doesn’t have to be a <code class="language-plaintext highlighter-rouge">Monad</code>, just <code class="language-plaintext highlighter-rouge">Applicative</code>, and it works for more things than just lists. In this case, <code class="language-plaintext highlighter-rouge">traverse</code> is taking each <code class="language-plaintext highlighter-rouge">String</code> in the <code class="language-plaintext highlighter-rouge">envs</code> list, looking it up in the environment and wrapping it in <code class="language-plaintext highlighter-rouge">MaybeT</code>, and finally producing a value of type <code class="language-plaintext highlighter-rouge">MaybeT IO [String]</code>. We now have a list of keys, and a list of values. We zip them together with <code class="language-plaintext highlighter-rouge"><></code> and concatenate them all into a big connection string, with which we create a pool. Finally, we <code class="language-plaintext highlighter-rouge">runMaybeT</code> to convert the <code class="language-plaintext highlighter-rouge">MaybeT IO ConnectionPool</code> to an <code class="language-plaintext highlighter-rouge">IO (Maybe ConnectionPool)</code> and bind that value out. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> case pool of Nothing -> error "Database Configuration not present in environment." Just a -> return a </code></pre></div></div> If the database configuration isn’t there, then we error out. Otherwise, we return it. This shouldn’t happen, as <code class="language-plaintext highlighter-rouge">keter</code> automatically manages the PostgreSQL database information for us on the deployment server. <h1 id="srcmodelshs"><code class="language-plaintext highlighter-rouge">src/Models.hs</code></h1> That covers making the pool. Running migrations was next. 
This step is neatly handled for us by the <a href="https://www.yesodweb.com/book/persistent"><code class="language-plaintext highlighter-rouge">persistent</code></a> library. For further reading on that, check the chapter out. It’s a great resource. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>doMigrations :: SqlPersistM () doMigrations = runMigration migrateAll </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">migrateAll</code> function is generated by the following Persistent Entity Definitions: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase| User json name String email String deriving Show |] </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">json</code> keyword there means “please generate <code class="language-plaintext highlighter-rouge">FromJSON</code> and <code class="language-plaintext highlighter-rouge">ToJSON</code> instances for this entity,” which is a really handy tool. Persistent is smart enough to know if the current database schema is in line with what the entity definitions say. If it is, then it doesn’t do anything. If it can safely make the migrations, then it does so. If it can’t, then it helpfully prints the SQL necessary to the console for you to do yourself. You’ll probably want to move to something like <a href="https://hackage.haskell.org/package/dbmigrations">dbmigrations</a> when your database is a bit more complicated, but Persistent’s migrations are still really useful to verify that your data looks like you expect. You can run <code class="language-plaintext highlighter-rouge">printMigration</code> to just print out what Persistent would do. Easy! Let’s see how we’re generating the JavaScript now. 
That function was imported from <code class="language-plaintext highlighter-rouge">Api.User</code>, which we’ll check out next. <h1 id="srcapiuserhs"><code class="language-plaintext highlighter-rouge">src/Api/User.hs</code></h1> In classic <code class="language-plaintext highlighter-rouge">servant</code> manner, we’ve got a little API we’ve defined: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type UserAPI = "users" :> Get '[JSON] [Entity User] :<|> "users" :> Capture "name" String :> Get '[JSON] (Entity User) :<|> "users" :> ReqBody '[JSON] User :> Post '[JSON] Int64 </code></pre></div></div> Along with our handlers for the server: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- | The server that runs the UserAPI userServer :: ServerT UserAPI App userServer = allUsers :<|> singleUser :<|> createUser -- | Returns all users in the database. allUsers :: App [Entity User] allUsers = runDb (selectList [] []) </code></pre></div></div> It still blows my mind how good Haskell’s type inference is. <code class="language-plaintext highlighter-rouge">selectList</code> is a function that accepts a list of filters and a list of options, and returns a list of matching records. Here, we provide nothing other than the inferred return type of <code class="language-plaintext highlighter-rouge">Entity User</code> and it knows how to run the query. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- | Returns a user by name or throws a 404 error. singleUser :: String -> App (Entity User) singleUser str = do maybeUser <- runDb (selectFirst [UserName ==. str] []) case maybeUser of Nothing -> throwError err404 Just person -> return person -- | Creates a user in the database. 
createUser :: User -> App Int64 createUser p = do newUser <- runDb (insert (User (userName p) (userEmail p))) return $ fromSqlKey newUser </code></pre></div></div> Here’s a neat trick: <code class="language-plaintext highlighter-rouge">App (Entity User)</code> is just a function. We can easily reuse that handler code in the rest of the codebase if we wanted to, and it’d do the right thing. Finally, the JavaScript generation: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- | Generates JavaScript to query the User API. generateJavaScript :: IO () generateJavaScript = writeJSForAPI (Proxy :: Proxy UserAPI) vanillaJS "./assets/api.js" </code></pre></div></div> Well, what does that look like? <h1 id="assetsapijs"><code class="language-plaintext highlighter-rouge">assets/api.js</code></h1> The generated code isn’t super pretty, but it gets the job done. <div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>var getUsers = function(onSuccess, onError) { var xhr = new XMLHttpRequest(); xhr.open('GET', '/users', true); xhr.setRequestHeader("Accept","application/json"); xhr.onreadystatechange = function (e) { if (xhr.readyState == 4) { if (xhr.status == 204 || xhr.status == 205) { onSuccess(); } else if (xhr.status >= 200 && xhr.status < 300) { var value = JSON.parse(xhr.responseText); onSuccess(value); } else { var value = JSON.parse(xhr.responseText); onError(value); } } } xhr.send(null); } </code></pre></div></div> This is just the <code class="language-plaintext highlighter-rouge">vanillaJS</code> option. There’s also jQuery and AngularJS options available. The same machinery that generates client JavaScript code can also be used to generate <a href="https://hackage.haskell.org/package/lackey">Ruby</a> clients, if you need them. Now, we still need to serve up some static files. 
We do that in the <code class="language-plaintext highlighter-rouge">app</code> function, imported from <code class="language-plaintext highlighter-rouge">Api</code>. <h1 id="srcapihs"><code class="language-plaintext highlighter-rouge">src/Api.hs</code></h1> This is the function we export to run our <code class="language-plaintext highlighter-rouge">UserAPI</code>. Given a <code class="language-plaintext highlighter-rouge">Config</code>, we return a WAI <code class="language-plaintext highlighter-rouge">Application</code> which any WAI compliant server can run. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>userApp :: Config -> Application userApp cfg = serve (Proxy :: Proxy UserAPI) (appToServer cfg) </code></pre></div></div> This function tells Servant how to run the <code class="language-plaintext highlighter-rouge">App</code> monad with the Servant provided <code class="language-plaintext highlighter-rouge">server</code> function. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>appToServer :: Config -> Server UserAPI appToServer cfg = enter (convertApp cfg) userServer </code></pre></div></div> This function converts our <code class="language-plaintext highlighter-rouge">App</code> monad into the <code class="language-plaintext highlighter-rouge">ExceptT ServantErr IO</code> monad that Servant’s <code class="language-plaintext highlighter-rouge">enter</code> function needs in order to run the application. The <code class="language-plaintext highlighter-rouge">:~></code> type is a natural transformation, or, in non-category theory terms, a function that converts between two type constructors without looking at the values in the types. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>convertApp :: Config -> App :~> ExceptT ServantErr IO convertApp cfg = Nat (flip runReaderT cfg . 
runApp) </code></pre></div></div> Since we also want to provide a minimal front end, we need to give Servant a way to serve a directory with HTML and JavaScript. This function creates a WAI application that just serves the files out of the given directory. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>files :: Application files = serveDirectory "assets" </code></pre></div></div> Just like a normal API type, we can use the <code class="language-plaintext highlighter-rouge">:<|></code> combinator to unify two different APIs and applications. This is a powerful tool for code reuse and abstraction! We need to put the ‘Raw’ endpoint last, since it always succeeds. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type AppAPI = UserAPI :<|> Raw appApi :: Proxy AppAPI appApi = Proxy </code></pre></div></div> Finally, this function takes a configuration and runs our <code class="language-plaintext highlighter-rouge">UserAPI</code> alongside the <code class="language-plaintext highlighter-rouge">Raw</code> endpoint that serves all of our files. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>app :: Config -> Application app cfg = serve appApi (appToServer cfg :<|> files) </code></pre></div></div> Now, we can do: <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ stack build $ stack exec perservant </code></pre></div></div> and open <code class="language-plaintext highlighter-rouge">localhost:8081</code> to see our primitive little UI. We’re done, right? Well, sort of! There’s also deployment with <code class="language-plaintext highlighter-rouge">keter</code>! <h1 id="deployment">Deployment</h1> <code class="language-plaintext highlighter-rouge">keter</code> is a very nice little utility for deploying Haskell applications. 
Here’s the configuration required for the app: <div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code># config/keter.yaml exec: ../perservant host: your.host.name.com plugins: postgres: true </code></pre></div></div> And the deployment script: <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#! /bin/bash set -e echo "Building Perservant..." stack build strip `stack exec -- which perservant` echo "Creating bundle..." cp `stack exec -- which perservant` perservant tar -czvf perservant.keter perservant config ql-ui/assets rm perservant scp ./perservant.keter user@host:/opt/keter/incoming/perservant.keter rm perservant.keter </code></pre></div></div> And that’s all you need to get going! Fri, 08 Jul 2016 00:00:00 +0000 https://www.parsonsmatt.org/2016/07/08/servant-persistent_updated.html Incremental API Takeover with Haskell Servant Haskell’s <a href="https://haskell-servant.github.io/"><code class="language-plaintext highlighter-rouge">servant</code></a> library is a compelling choice for a web API. By providing a specification of the API as a type, you ensure at compile time that your application correctly implements the specification. You also get automatically generated or derived clients for <a href="https://haskell-servant.readthedocs.io/en/stable/tutorial/Client.html">Haskell</a>, <a href="https://haskell-servant.readthedocs.io/en/stable/tutorial/Javascript.html">JavaScript</a>, and <a href="https://github.com/tfausak/lackey#readme">Ruby</a>. Using <a href="https://haskell-servant.github.io/posts/2016-02-06-servant-swagger.html">servant-swagger</a>, you can automatically generate <code class="language-plaintext highlighter-rouge">swagger</code> API specification, with all the goodies that come from that. 
If you’re not familiar, I’d highly recommend checking out the links – <a href="https://haskell-servant.github.io/posts/2015-08-05-content-types.html">this article in particular</a> is a bit of a mindblower. “Fine, fine, you’ve convinced me, I’ll start my next project with Servant. But I still have all these old APIs in Ruby and JavaScript and Java that I need to support!” What if I told you that you could incrementally take over an existing API, and gradually reap the benefits of a <code class="language-plaintext highlighter-rouge">servant</code> application? Oh yes. Let’s do this! We’ll put the old API behind a reverse proxy that’s handled in Haskell by Servant, and take over an endpoint at a time. The code for this blog post is located in <a href="https://www.github.com/parsonsmatt/incremental-servant">this repository</a>. Each section has its own git branch. The first section is the master branch. <h1 id="the-initial-api">The Initial API</h1> Here’s the super important business logic legacy API that we need to preserve while we’re replacing it: <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code># rubby/api.rb require 'sinatra' require 'json' get '/' do 'You can get either <a href="cat">cat</a> or <a href="dog">dog</a>.' end get '/cat' do { cat: "meow" }.to_json end get '/dog' do { dog: "woof" }.to_json end </code></pre></div></div> Pretty hairy, right? (groan) Sinatra is a nice Ruby DSL for writing APIs and web apps. When you make a <code class="language-plaintext highlighter-rouge">GET</code> request to <code class="language-plaintext highlighter-rouge">/</code>, it responds with a bit of text linking you to either <code class="language-plaintext highlighter-rouge">cat</code> or <code class="language-plaintext highlighter-rouge">dog</code>. Ruby implicitly returns the last line in a block, so this just returns the corresponding hashes converted to JSON. 
Ordinarily, we’d run this with <code class="language-plaintext highlighter-rouge">ruby api.rb</code> and it’d be on <code class="language-plaintext highlighter-rouge">http://localhost:4567</code>. Our first step is creating the reverse proxy in Haskell that is handled by Servant. We’ll use the <a href="https://www.stackage.org/lts-6.4/package/http-reverse-proxy-0.4.3"><code class="language-plaintext highlighter-rouge">http-reverse-proxy</code></a> package to simplify the process. Here’s the full source code of the initial reverse proxy. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- src/Api.hs {-# LANGUAGE DataKinds #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE TypeOperators #-} module Api where import Network.HTTP.ReverseProxy import Network.HTTP.Client (Manager, defaultManagerSettings, newManager) import Network.Wai import Network.Wai.Handler.Warp import Servant </code></pre></div></div> As usual, we start with the language extensions and imports. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>forwardRequest :: Request -> IO WaiProxyResponse forwardRequest _ = pure (WPRProxyDest (ProxyDest "127.0.0.1" 4567)) </code></pre></div></div> <code class="language-plaintext highlighter-rouge">forwardRequest</code> is a function we’ll use to return a <a href="https://www.stackage.org/haddock/lts-6.4/http-reverse-proxy-0.4.3/Network-HTTP-ReverseProxy.html#t:WaiProxyResponse"><code class="language-plaintext highlighter-rouge">WaiProxyResponse</code></a>. This function inspects the request, and then gets to decide what to do with it. In this case, we just want to forward the request to our Sinatra app running on <code class="language-plaintext highlighter-rouge">localhost:4567</code>. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>app :: Manager -> Application app manager = waiProxyTo forwardRequest defaultOnExc manager startApp :: IO () startApp = do manager <- newManager defaultManagerSettings run 8080 (app manager) </code></pre></div></div> <code class="language-plaintext highlighter-rouge">app</code> is going to be how we define our application, and <code class="language-plaintext highlighter-rouge">startApp</code> is a convenience function we’ll use to run the app in GHCi. Let’s verify that this is working like we want it to! <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ git clone https://www.github.com/parsonsmatt/incremental-servant $ cd incremental-servant/rubby $ bundle install $ ruby api.rb </code></pre></div></div> This runs our Ruby API. When you go to <code class="language-plaintext highlighter-rouge">http://localhost:4567</code>, you should see the text defined in the <code class="language-plaintext highlighter-rouge">get '/' do ... end</code> block above. Clicking either of the links will return a JSON object. Now, if this works, then we should be able to just run the Haskell thing in GHCi and access it through <code class="language-plaintext highlighter-rouge">localhost:8080</code>. Let’s give that a shot. Leave the <code class="language-plaintext highlighter-rouge">ruby api.rb</code> task running, and in another terminal, do: <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cd incremental-servant $ stack init $ stack ghci # GHCi loads... Main Api> startApp -- silence . . . </code></pre></div></div> Now, when I navigate to <code class="language-plaintext highlighter-rouge">localhost:8080</code> in Chrome, I see the original application we defined. Nice! 
<h1 id="commandeering-a-route">Commandeering a Route</h1>
Next up, we’ll take over the <code class="language-plaintext highlighter-rouge">cat</code> route. The code for this section is in the <a href="https://github.com/parsonsmatt/incremental-servant/tree/cat-takeover"><code class="language-plaintext highlighter-rouge">cat-takeover</code> branch on GitHub</a>. Our first step is to define our API type for Servant:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type API = "cat" :> Get '[JSON] Cat
</code></pre></div></div>
For the string literal <code class="language-plaintext highlighter-rouge">"cat"</code>, we’ll return a <code class="language-plaintext highlighter-rouge">JSON</code> representation of a <code class="language-plaintext highlighter-rouge">Cat</code>. We could alternatively specify other encodings, like HTML or plain text, but for now we’re just returning JSON. What is a <code class="language-plaintext highlighter-rouge">Cat</code> exactly? We have to define it!
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype Cat = Cat { cat :: String }
</code></pre></div></div>
And how do we convert it to JSON? We’ll use the <a href="https://www.stackage.org/lts-6.4/package/aeson-0.11.2.0">Aeson</a> library to do the conversion! Here’s a manual instance that mirrors the API we have in the Ruby app:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance ToJSON Cat where
    toJSON (Cat mew) =
        object [ "cat" .= mew ]
</code></pre></div></div>
Serialization and deserialization are often some of the more annoying parts of writing a web app. Aeson provides both Template Haskell functions for deriving compile time instances, as well as generic implementations for derived instances.
If we enable the <code class="language-plaintext highlighter-rouge">DeriveGeneric</code> and <code class="language-plaintext highlighter-rouge">DeriveAnyClass</code> extensions and <code class="language-plaintext highlighter-rouge">import GHC.Generics</code>, then we can change our <code class="language-plaintext highlighter-rouge">newtype</code> declaration above to the following and get the JSON instance without having to write it:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype Cat = Cat { cat :: String }
    deriving (Generic, ToJSON)
</code></pre></div></div>
Nice! Alright, let’s get it hooked up to our application. First, we need a Servant server function:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>server :: Server API
server = pure (Cat { cat = "mrowl" })
</code></pre></div></div>
Now we need to turn our <code class="language-plaintext highlighter-rouge">Server API</code> into a <code class="language-plaintext highlighter-rouge">WAI</code> <code class="language-plaintext highlighter-rouge">Application</code>. We use the <code class="language-plaintext highlighter-rouge">serve</code> function and the funny <code class="language-plaintext highlighter-rouge">:<|></code> constructor.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>api :: Proxy (API :<|> Raw)
api = Proxy

app :: Manager -> Application
app manager =
    serve api (server :<|> waiProxyTo forwardRequest defaultOnExc manager)
</code></pre></div></div>
The <code class="language-plaintext highlighter-rouge">api</code> proxy is required to tell Servant what our API type is supposed to look like. Otherwise, it can’t figure out whether or not the function we’re <code class="language-plaintext highlighter-rouge">serve</code>ing conforms to it!
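To make the role of the <code class="language-plaintext highlighter-rouge">api</code> proxy a little more concrete, here is a minimal sketch that uses only <code class="language-plaintext highlighter-rouge">base</code>, not Servant itself. A <code class="language-plaintext highlighter-rouge">Proxy</code> carries no data at runtime; it only pins down a type, and a function like <code class="language-plaintext highlighter-rouge">symbolVal</code> can then read information back out of that type. The names <code class="language-plaintext highlighter-rouge">catSegment</code> and <code class="language-plaintext highlighter-rouge">segmentName</code> are made up for this illustration, and this is only an analogy for how a library can turn a type-level <code class="language-plaintext highlighter-rouge">"cat"</code> into a value, not a description of Servant’s internals.

```haskell
{-# LANGUAGE DataKinds #-}
module Main where

import Data.Proxy (Proxy (..))
import GHC.TypeLits (symbolVal)

-- A Proxy has no runtime payload; its only job is to name a type.
-- Here the type is the type-level string "cat" (hypothetical example name).
catSegment :: Proxy "cat"
catSegment = Proxy

-- symbolVal recovers the type-level string as an ordinary String.
segmentName :: String
segmentName = symbolVal catSegment

main :: IO ()
main = putStrLn segmentName
```

The value <code class="language-plaintext highlighter-rouge">Proxy</code> is the same in every case; all of the information lives in the type annotation, which is why <code class="language-plaintext highlighter-rouge">api</code> needs an explicit signature.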
The second argument of <code class="language-plaintext highlighter-rouge">serve</code> is a pair of handlers. The handlers must line up with the types. Note that <code class="language-plaintext highlighter-rouge">server</code> in this code has the type <code class="language-plaintext highlighter-rouge">Server API</code>, while <code class="language-plaintext highlighter-rouge">waiProxyTo ...</code> has the type <code class="language-plaintext highlighter-rouge">Application</code>. If we have Servant server types (like <code class="language-plaintext highlighter-rouge">"cat" :> Get '[JSON] Cat</code>), then we need to have a <code class="language-plaintext highlighter-rouge">Server</code> for them. For the <code class="language-plaintext highlighter-rouge">Raw</code> endpoint, we just need any <code class="language-plaintext highlighter-rouge">WAI</code> application. These are all the changes we need! Let’s test this out. You haven’t closed the <code class="language-plaintext highlighter-rouge">ruby api.rb</code> process, right? That’s been running the whole time, right? Open it up at <code class="language-plaintext highlighter-rouge">localhost:4567</code> and verify that it’s still doing the thing we want it to. Close out the current <code class="language-plaintext highlighter-rouge">startApp</code> call in GHCi, hit <code class="language-plaintext highlighter-rouge">:reload</code> to reload the code, and run <code class="language-plaintext highlighter-rouge">startApp</code> again. Open <code class="language-plaintext highlighter-rouge">localhost:8080</code> in the browser, and you should see the same text from Sinatra. However, when you click <code class="language-plaintext highlighter-rouge">cat</code>, instead of the cat saying <code class="language-plaintext highlighter-rouge">meow</code>, you get <code class="language-plaintext highlighter-rouge">mrowl</code>!
EXCITING We just took over a route without any loss of service or touching the underlying application at all!
<h1 id="dogs">DOGS</h1>
Let’s knock out the dogs route now. The code for that is <a href="https://github.com/parsonsmatt/incremental-servant/tree/dog-takeover">here</a>. Then the original app will only be serving as an entry point! It’s actually a really minimal change! We’ll add two language pragmas so we can derive generic instances, import the generic machinery, and that’s almost the whole of it!
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE DeriveGeneric, DeriveAnyClass #-}
-- ...
import GHC.Generics
-- ...

newtype Dog = Dog { dog :: String }
    deriving (Generic, ToJSON)

type API =
    "cat" :> Get '[JSON] Cat
    :<|> "dog" :> Get '[JSON] Dog

server :: Server API
server = cats :<|> dogs
  where
    cats = pure (Cat { cat = "mrowl" })
    dogs = pure (Dog { dog = "zzzzzz" })
</code></pre></div></div>
And that’s all! We’ll reload the code in GHCi and see that the <code class="language-plaintext highlighter-rouge">dog</code> route has been successfully captured.
<h1 id="intro-text">Intro Text</h1>
The second to last remaining bit is to grab the index page. Let’s do it! (this code is in the <a href="https://github.com/parsonsmatt/incremental-servant/tree/index"><code class="language-plaintext highlighter-rouge">index</code> branch on GitHub</a>) We’ll use the excellent <a href="https://www.stackage.org/lts-6.4/package/lucid-2.9.5"><code class="language-plaintext highlighter-rouge">lucid</code></a> library for HTML templating rather than a bare string. This means we’ll need to add <code class="language-plaintext highlighter-rouge">lucid</code> and <code class="language-plaintext highlighter-rouge">servant-lucid</code> to the cabal file, and add the relevant imports.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Lucid
import Servant.HTML.Lucid

type API =
    Get '[HTML] (Html ())
    :<|> "cat" :> Get '[JSON] Cat
    :<|> "dog" :> Get '[JSON] Dog
</code></pre></div></div>
Our API definition adds a first route, returning a content type of HTML and a value of <code class="language-plaintext highlighter-rouge">Html ()</code>.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>server :: Server API
server = pure index :<|> cats :<|> dogs
  where
    cats = pure (Cat { cat = "mrowl" })
    dogs = pure (Dog { dog = "zzzzzzzz" })
    index = p_ $ do
        "You can get either a "
        a_ [href_ "cat"] "cat"
        " or a "
        a_ [href_ "dog"] "dog"
        "."
</code></pre></div></div>
I really like <code class="language-plaintext highlighter-rouge">lucid</code>. It’s a great DSL for HTML. At this point, all the API endpoints are being routed to Haskell. We still have the reverse proxy setup, though, and a <code class="language-plaintext highlighter-rouge">Raw</code> endpoint never fails to match. This means that any missing route will go back to the Sinatra application, and will be handled there. If we remove the proxy and <code class="language-plaintext highlighter-rouge">Raw</code> endpoint, then we’ll be able to handle those errors in Servant. Anyway, that’s all we had to do. If we quit GHCi, restart it, and then rerun <code class="language-plaintext highlighter-rouge">startApp</code>, the application will be serving up our index page rather than the Ruby app. We have successfully ousted a Ruby API with one based on Haskell’s Servant, from which we’ll reap tremendous benefits in terms of generated documentation, clients, and improved performance.
<h1 id="performance">Performance</h1>
So what is the overhead on this reverse proxy? Let’s run a shady benchmark with <code class="language-plaintext highlighter-rouge">httperf</code>.
Keep in mind that this is done locally, and the Ruby server is <code class="language-plaintext highlighter-rouge">WEBrick</code>, which is notoriously slow. Here’s the output from <code class="language-plaintext highlighter-rouge">httperf</code> on the Ruby server:
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ httperf --port=4567 --num-calls=500
httperf --client=0/1 --server=localhost --port=4567 --uri=/ --send-buffer=4096 --recv-buffer=16384 --ssl-protocol=auto --num-conns=1 --num-calls=500
Maximum connect burst length: 0

Total: connections 1 requests 500 replies 500 test-duration 19.961 s

Connection rate: 0.1 conn/s (19960.9 ms/conn, <=1 concurrent connections)
Connection time [ms]: min 19960.9 avg 19960.9 max 19960.9 median 19960.5 stddev 0.0
Connection time [ms]: connect 0.1
Connection length [replies/conn]: 500.000

Request rate: 25.0 req/s (39.9 ms/req)
Request size [B]: 62.0

Reply rate [replies/s]: min 25.0 avg 25.0 max 25.0 stddev 0.0 (3 samples)
Reply time [ms]: response 1.4 transfer 38.5
Reply size [B]: header 282.0 content 66.0 footer 0.0 (total 348.0)
Reply status: 1xx=0 2xx=500 3xx=0 4xx=0 5xx=0

CPU time [s]: user 6.18 system 13.78 (user 30.9% system 69.0% total 100.0%)
Net I/O: 10.0 KB/s (0.1*10^6 bps)

Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0
</code></pre></div></div>
We’re getting about 25 requests per second, with a test duration of about 20 seconds. Here’s output from just running the reverse proxy. I built the binary using <code class="language-plaintext highlighter-rouge">-O2</code> to enable optimizations.
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ httperf --port=8080 --num-calls=500
httperf --client=0/1 --server=localhost --port=8080 --uri=/ --send-buffer=4096 --recv-buffer=16384 --ssl-protocol=auto --num-conns=1 --num-calls=500
Maximum connect burst length: 0

Total: connections 1 requests 500 replies 500 test-duration 19.984 s

Connection rate: 0.1 conn/s (19984.3 ms/conn, <=1 concurrent connections)
Connection time [ms]: min 19984.4 avg 19984.4 max 19984.4 median 19984.5 stddev 0.0
Connection time [ms]: connect 0.1
Connection length [replies/conn]: 500.000

Request rate: 25.0 req/s (40.0 ms/req)
Request size [B]: 62.0

Reply rate [replies/s]: min 25.0 avg 25.0 max 25.0 stddev 0.0 (3 samples)
Reply time [ms]: response 39.9 transfer 0.0
Reply size [B]: header 290.0 content 66.0 footer 2.0 (total 358.0)
Reply status: 1xx=0 2xx=500 3xx=0 4xx=0 5xx=0

CPU time [s]: user 5.76 system 14.20 (user 28.8% system 71.1% total 99.9%)
Net I/O: 10.2 KB/s (0.1*10^6 bps)

Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0
</code></pre></div></div>
This is essentially the exact same. About 20 seconds test duration and 25 requests per second. There doesn’t seem to be any overhead detectable in this test, which makes me want a better test!
For reference, here’s the Haskell <code class="language-plaintext highlighter-rouge">index</code> branch, where it’s doing the <code class="language-plaintext highlighter-rouge">lucid</code> HTML templating:
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ ~/Projects/incremental-servant/ index* httperf --port=8080 --num-calls=500
httperf --client=0/1 --server=localhost --port=8080 --uri=/ --send-buffer=4096 --recv-buffer=16384 --ssl-protocol=auto --num-conns=1 --num-calls=500
Maximum connect burst length: 0

Total: connections 1 requests 500 replies 500 test-duration 0.061 s

Connection rate: 16.3 conn/s (61.5 ms/conn, <=1 concurrent connections)
Connection time [ms]: min 61.5 avg 61.5 max 61.5 median 61.5 stddev 0.0
Connection time [ms]: connect 0.1
Connection length [replies/conn]: 500.000

Request rate: 8136.6 req/s (0.1 ms/req)
Request size [B]: 62.0

Reply rate [replies/s]: min 0.0 avg 0.0 max 0.0 stddev 0.0 (0 samples)
Reply time [ms]: response 0.1 transfer 0.0
Reply size [B]: header 143.0 content 77.0 footer 2.0 (total 222.0)
Reply status: 1xx=0 2xx=500 3xx=0 4xx=0 5xx=0

CPU time [s]: user 0.02 system 0.04 (user 32.5% system 65.1% total 97.6%)
Net I/O: 2240.7 KB/s (18.4*10^6 bps)

Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0
</code></pre></div></div>
8000 requests per second, with a test duration of 60 milliseconds. I like that pretty well!
Fri, 24 Jun 2016 00:00:00 +0000 https://www.parsonsmatt.org/2016/06/24/take_over_an_api_with_servant.html
The Magic of Folds
Folds are a common stumbling point for people learning about the functional paradigm. I remember being pretty confused about the difference between a left fold, a right fold, and how either of them differ from a <code class="language-plaintext highlighter-rouge">reduce</code>.
I’m going to try and explain them in a way that’s easy for non-functional programmers to get. If you don’t get it, I’ve screwed up – feel free to let me know! To keep things simple, the example will just use singly linked lists in Java and Haskell.
<h1 id="lists">Lists</h1>
Let’s get warmed up and write out our <code class="language-plaintext highlighter-rouge">List</code> class. We’re trying on our functional programming hats, so we’ll keep it fairly simple:
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class List<T> {
    public final T head;
    public final List<T> tail;

    private List(T head, List<T> tail) {
        this.head = head;
        this.tail = tail;
    }

    public static <T> List<T> Cons(T item, List<T> rest) {
        return new List<T>(item, rest);
    }

    public static <T> List<T> Nil() {
        return new List<>(null, null);
    }

    // Nil is the only list built with a null tail.
    public boolean isNil() {
        return tail == null;
    }
}
</code></pre></div></div>
We’ve got two constructors: <code class="language-plaintext highlighter-rouge">Cons</code> for putting something on a list, and <code class="language-plaintext highlighter-rouge">Nil</code> for an empty list. The <code class="language-plaintext highlighter-rouge">isNil</code> method tells the two apart. We can build a simple list like:
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Cons(1, Cons(2, Cons(3, Nil())))
</code></pre></div></div>
And the same in Haskell:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data [t] = t : [t] | []
</code></pre></div></div>
If you’re unfamiliar with Haskell, this defines two constructors: an infix constructor <code class="language-plaintext highlighter-rouge">:</code> and the empty list <code class="language-plaintext highlighter-rouge">[]</code> constructor. <code class="language-plaintext highlighter-rouge">t</code> is a type parameter.
We can make <code class="language-plaintext highlighter-rouge">1 : 2 : 3 : []</code> like this, though the language gives us syntax sugar to write <code class="language-plaintext highlighter-rouge">[1, 2, 3]</code> instead. To continue warming up, let’s write out <code class="language-plaintext highlighter-rouge">map</code> and <code class="language-plaintext highlighter-rouge">filter</code> for lists. This will help with our intuition on folds later on!
<h1 id="maps-and-filters">Maps And Filters</h1>
<code class="language-plaintext highlighter-rouge">map</code> is a function or method usually defined on lists and arrays, though you can define it for all kinds of types. Our intuition for a map is this: for a structure like <code class="language-plaintext highlighter-rouge">List<A></code>, we’ll have a <code class="language-plaintext highlighter-rouge">Function<A, B></code>. We’ll take all the <code class="language-plaintext highlighter-rouge"><A></code> values in the list and return a new list with <code class="language-plaintext highlighter-rouge"><B></code> values instead. Recursive stuff works best if you can think of the possible cases. For lists, we’ve got two constructors: <code class="language-plaintext highlighter-rouge">Nil</code> and <code class="language-plaintext highlighter-rouge">Cons</code>. If we have a <code class="language-plaintext highlighter-rouge">Nil</code>, we return another <code class="language-plaintext highlighter-rouge">Nil</code>.
If we have a <code class="language-plaintext highlighter-rouge">Cons</code>, we return another <code class="language-plaintext highlighter-rouge">Cons</code> after applying the function to the <code class="language-plaintext highlighter-rouge">head</code> and <code class="language-plaintext highlighter-rouge">map</code>ping over the tail:
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>static <A, B> List<B> map(Function<A, B> f, List<A> list) {
    if (list.isNil()) {
        return List.Nil();
    }
    return List.Cons(f.apply(list.head), map(f, list.tail));
}
</code></pre></div></div>
Haskell gives us pattern matching, so instead of an <code class="language-plaintext highlighter-rouge">if</code>, we match on the constructors:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>map f [] = []
map f (head : tail) = (f head) : (map f tail)
</code></pre></div></div>
This is a little concise, and we can possibly make it more clear by naming some of the intermediate steps.
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>static <A, B> List<B> map(Function<A, B> f, List<A> list) {
    if (list.isNil()) {
        return List.Nil();
    }
    A val = list.head;
    List<A> rest = list.tail;
    B newVal = f.apply(val);
    List<B> newRest = map(f, rest);
    return List.Cons(newVal, newRest);
}
</code></pre></div></div>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>map f [] = []
map f (head : tail) =
    let newVal = f head
        newRest = map f tail
     in newVal : newRest
</code></pre></div></div>
Cool. Alright, let’s do <code class="language-plaintext highlighter-rouge">filter</code> now. Filter takes a <code class="language-plaintext highlighter-rouge">predicate</code> and returns a new list where the elements in the new list returned true for the predicate. Filtering an empty list gives us an empty list.
When we filter a <code class="language-plaintext highlighter-rouge">Cons</code>, we have two cases: <ol> <li>Applying the <code class="language-plaintext highlighter-rouge">predicate</code> to the <code class="language-plaintext highlighter-rouge">head</code> returns <code class="language-plaintext highlighter-rouge">true</code>.</li> <li>The above returns <code class="language-plaintext highlighter-rouge">false</code>.</li> </ol> In both cases, we’ll want to continue filtering the list. In the first case, we want to keep the item. In the second case, we don’t want to keep it.
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>static <A> List<A> filter(Function<A, Boolean> predicate, List<A> list) {
    if (list.isNil()) {
        return List.Nil();
    }
    if (predicate.apply(list.head)) {
        return List.Cons(list.head, filter(predicate, list.tail));
    }
    return filter(predicate, list.tail);
}
</code></pre></div></div>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>filter pred [] = []
filter pred (head : tail) =
    if pred head
        then head : (filter pred tail)
        else filter pred tail
</code></pre></div></div>
<h1 id="last-warmup">Last Warmup</h1>
Finally, let’s write <code class="language-plaintext highlighter-rouge">sum</code> to sum all the numbers in a list. The sum of an empty list is <code class="language-plaintext highlighter-rouge">0</code>, and the sum of a non-empty list is the value of the head plus the sum of the tail.
Easy enough, let’s write it out:
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>static Integer sum(List<Integer> list) {
    if (list.isNil()) {
        return 0;
    }
    return list.head + sum(list.tail);
}
</code></pre></div></div>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sum [] = 0
sum (x:xs) = x + sum xs
</code></pre></div></div>
Map, filter, and sum all have some things in common: <ol> <li>They recursively walk a list</li> <li>They combine each element with the result of processing the rest of the list</li> <li>They have a value for the empty list case.</li> </ol> Alright, with that, we’re ready to conquer folds.
<h1 id="fold">Fold</h1>
A fold has three arguments: <ol> <li>The zero value (or, what to do with the end of the list)</li> <li>The function to combine</li> <li>The list to fold</li> </ol> <code class="language-plaintext highlighter-rouge">foldr</code> is a right fold, <code class="language-plaintext highlighter-rouge">foldl</code> is a left fold, and they’re defined like this:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foldr k z [] = z
foldr k z (x:xs) = k x (foldr k z xs)

foldl k z [] = z
foldl k z (x:xs) = foldl k (k z x) xs
</code></pre></div></div>
The variable <code class="language-plaintext highlighter-rouge">k</code> is our combining function. <code class="language-plaintext highlighter-rouge">foldl</code> is tail recursive: it combines the accumulator <code class="language-plaintext highlighter-rouge">z</code> with the current item on the list and passes the result along as the accumulator for the next call. “but matt, this doesn’t help me understand” Right. We’re getting there.
Haskell lets us use functions infix if we surround them with backticks, so we can also write <code class="language-plaintext highlighter-rouge">foldr</code> like this:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foldr k z [] = z
foldr k z (x:xs) = x `k` (foldr k z xs)
</code></pre></div></div>
The infix isn’t superfluous. We can get a nice intuition for how <code class="language-plaintext highlighter-rouge">foldr</code> works on a list with it. Let’s see how Haskell would write out <code class="language-plaintext highlighter-rouge">[1..3]</code> without any sugar:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1 : 2 : 3 : []
</code></pre></div></div>
Take every <code class="language-plaintext highlighter-rouge">:</code> and replace it with <code class="language-plaintext highlighter-rouge">k</code>, and take the <code class="language-plaintext highlighter-rouge">[]</code> and replace it with <code class="language-plaintext highlighter-rouge">z</code>:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1 `k` 2 `k` 3 `k` z
</code></pre></div></div>
Now we can substitute <code class="language-plaintext highlighter-rouge">k</code> for <code class="language-plaintext highlighter-rouge">+</code> and <code class="language-plaintext highlighter-rouge">z</code> for <code class="language-plaintext highlighter-rouge">0</code> and see that this is <code class="language-plaintext highlighter-rouge">sum</code>:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1 + 2 + 3 + 0
</code></pre></div></div>
We can get <code class="language-plaintext highlighter-rouge">map</code> by replacing <code class="language-plaintext highlighter-rouge">k</code> with <code class="language-plaintext highlighter-rouge">(\x acc -> f x : acc)</code>, and we can get <code class="language-plaintext highlighter-rouge">filter</code> by replacing <code class="language-plaintext highlighter-rouge">k</code> with <code class="language-plaintext highlighter-rouge">(\x acc -> if p x then x:acc else acc)</code>. Let’s walk through an example, step by step:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- initial function call
foldr (+) 0 (1 : 2 : 3 : [])
-- recurse:
-- foldr k z (x:xs) = x `k` foldr k z xs
= 1 + (foldr (+) 0 (2 : 3 : []))
-- recurse:
= 1 + (2 + (foldr (+) 0 (3 : [])))
-- recurse:
= 1 + (2 + (3 + (foldr (+) 0 [])))
-- base case:
-- foldr k z [] = z
= 1 + (2 + (3 + 0))
</code></pre></div></div>
Yup! What about <code class="language-plaintext highlighter-rouge">foldl</code>? There must be some magic there, right? Nope, though it might look a little strange:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- initial function call
foldl (+) 0 (1 : 2 : 3 : [])
-- recurse:
-- foldl k z (x:xs) = foldl k (k z x) xs
foldl (+) (0 + 1) (2 : 3 : [])
-- recurse:
foldl (+) ((0 + 1) + 2) (3 : [])
-- recurse:
foldl (+) (((0 + 1) + 2) + 3) []
-- base case:
-- foldl k z [] = z
(((0 + 1) + 2) + 3)
</code></pre></div></div>
Interesting! This has nearly the same shape as what <code class="language-plaintext highlighter-rouge">foldr</code> ended up looking like, but the parentheses are nested differently. With <code class="language-plaintext highlighter-rouge">foldr</code>, we directly replace <code class="language-plaintext highlighter-rouge">[]</code> with our <code class="language-plaintext highlighter-rouge">z</code> value.
<code class="language-plaintext highlighter-rouge">foldl</code> prepends the <code class="language-plaintext highlighter-rouge">z</code> value to the list and just drops the <code class="language-plaintext highlighter-rouge">[]</code> entirely, so our <code class="language-plaintext highlighter-rouge">foldr</code> “replace the <code class="language-plaintext highlighter-rouge">:</code> with <code class="language-plaintext highlighter-rouge">k</code>” trick needs to be adjusted slightly. With addition, it doesn’t really matter, since you can swap arguments and parentheses around. Let’s try it with subtraction and see the difference:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foldr (-) 0 [1..3]
-- desugar the list:
= 1 : 2 : 3 : []
-- replace : and [] with right associating k and z
= 1 `k` (2 `k` (3 `k` z))
-- replace with args:
= 1 - (2 - (3 - 0))
-- evaluate:
= 1 - (2 - 3)
= 1 - (-1)
= 2

foldl (-) 0 [1..3]
= 1 : 2 : 3 : []
-- prepend with z
= z : 1 : 2 : 3 : []
-- replace : with left associating k and drop the []
= ((z `k` 1) `k` 2) `k` 3
-- replace z with 0 and k with -
= ((0 - 1) - 2) - 3
-- evaluate:
= (-1 - 2) - 3
= -3 - 3
= -6
</code></pre></div></div>
If you’re curious about the evaluation of these functions, you can use their cousins <code class="language-plaintext highlighter-rouge">scanr</code> and <code class="language-plaintext highlighter-rouge">scanl</code>. Instead of returning a single end result, they return a list of all intermediate steps.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- In GHCi,
λ> scanr (+) 0 [1..3]
[6,5,3,0]
λ> scanl (+) 0 [1..3]
[0,1,3,6]
λ> scanr (-) 0 [1..3]
[2,-1,3,0]
λ> scanl (-) 0 [1..3]
[0,-1,-3,-6]
</code></pre></div></div>
<h1 id="caffeine-pls">caffeine pls</h1>
Alright, enough Haskell for now, let’s implement these two in Java:
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>static <A, B> B foldRight(BiFunction<A, B, B> k, B z, List<A> list) {
    if (list.isNil()) {
        return z;
    }
    return k.apply(list.head, foldRight(k, z, list.tail));
}

static <A, B> B foldLeft(BiFunction<B, A, B> k, B z, List<A> list) {
    if (list.isNil()) {
        return z;
    }
    return foldLeft(k, k.apply(z, list.head), list.tail);
}
</code></pre></div></div>
And now, let’s rewrite <code class="language-plaintext highlighter-rouge">map</code> and <code class="language-plaintext highlighter-rouge">filter</code> in terms of these:
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>static <A, B> List<B> mapR(Function<A, B> f, List<A> list) {
    return foldRight(
        (x, acc) -> List.Cons(f.apply(x), acc),
        List.Nil(),
        list
    );
}

static <A> List<A> filterR(Function<A, Boolean> p, List<A> list) {
    return foldRight(
        (x, acc) -> p.apply(x) ? List.Cons(x, acc) : acc,
        List.Nil(),
        list
    );
}
</code></pre></div></div>
<h1 id="whats-the-point-of-foldright">What’s the point of foldRight?</h1>
So, at first glance, you might think that <code class="language-plaintext highlighter-rouge">foldl</code> is superior. Its recursive call is in tail position, so a clever enough compiler could easily optimize it to a loop (unfortunately, Java doesn’t do tail call optimization as of now). Lacking tail call optimization, though, they both have to do about the same amount of work, and seem to be equivalent. There are two reasons why <code class="language-plaintext highlighter-rouge">foldRight</code> is useful.
We can see the first by implementing map in terms of <code class="language-plaintext highlighter-rouge">foldLeft</code>:
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>static <A, B> List<B> mapL(Function<A, B> f, List<A> list) {
    return foldLeft(
        (acc, x) -> acc.append(f.apply(x)),
        List.Nil(),
        list
    );
}
</code></pre></div></div>
Alas! This is quadratic in the size of the input list! Appending to the end of a singly linked list is $O(n)$ time, and we do it once per element. We can verify this by looking at the simplest definition of <code class="language-plaintext highlighter-rouge">append</code>:
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>public List<T> append(T elem) {
    // Rebuild the whole list, ending in a one-element list instead of Nil.
    return foldRight(List::Cons, Cons(elem, Nil()), this);
}
</code></pre></div></div>
So <code class="language-plaintext highlighter-rouge">foldRight</code> can be useful when constructing new lists. In fact, we can easily write <code class="language-plaintext highlighter-rouge">concat</code> using <code class="language-plaintext highlighter-rouge">foldRight</code>:
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>public static <T> List<T> concat(List<T> first, List<T> second) {
    return foldRight(List::Cons, second, first);
}
</code></pre></div></div>
The ease of writing functions like this isn’t a coincidence. <code class="language-plaintext highlighter-rouge">foldRight</code> is theoretically entwined with singly linked lists. The ordinary definitions of data structures involve “how do I construct this,” but you can also define data structures in terms of “how do I deconstruct this.” <code class="language-plaintext highlighter-rouge">foldRight</code> is that definition. This is referred to as the Church encoding of a list.
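To see what “a list defined by how you deconstruct it” could look like, here is a small Haskell sketch. It is only an illustration under a few assumptions: the names <code class="language-plaintext highlighter-rouge">ChurchList</code>, <code class="language-plaintext highlighter-rouge">runList</code>, <code class="language-plaintext highlighter-rouge">nil</code>, <code class="language-plaintext highlighter-rouge">cons</code>, <code class="language-plaintext highlighter-rouge">fromList</code>, <code class="language-plaintext highlighter-rouge">toList</code> and <code class="language-plaintext highlighter-rouge">append</code> are made up for this example, and it needs the <code class="language-plaintext highlighter-rouge">RankNTypes</code> extension. The list is stored as nothing but its own right fold.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

-- A list represented purely by its right fold: give it a combining
-- function and a zero value, and it hands back the folded result.
-- (All names in this block are hypothetical, chosen for illustration.)
newtype ChurchList a = ChurchList
    { runList :: forall r. (a -> r -> r) -> r -> r }

-- The empty list ignores the combining function and returns the zero.
nil :: ChurchList a
nil = ChurchList (\_ z -> z)

-- Cons applies the combining function to the head and the folded tail.
cons :: a -> ChurchList a -> ChurchList a
cons x xs = ChurchList (\k z -> k x (runList xs k z))

-- Converting from an ordinary list is exactly foldr.
fromList :: [a] -> ChurchList a
fromList xs = ChurchList (\k z -> foldr k z xs)

-- Converting back: fold with the real constructors (:) and [].
toList :: ChurchList a -> [a]
toList xs = runList xs (:) []

-- Concatenation: fold the first list, using the folded second list
-- as the zero value. This mirrors the Java concat above.
append :: ChurchList a -> ChurchList a -> ChurchList a
append xs ys = ChurchList (\k z -> runList xs k (runList ys k z))

main :: IO ()
main = do
    print (toList (cons 1 (cons 2 nil)) :: [Int])
    print (toList (append (fromList [1, 2]) (fromList [3])) :: [Int])
    print (runList (fromList [1, 2, 3 :: Int]) (+) 0)
```

Note how <code class="language-plaintext highlighter-rouge">toList</code> is the “replace <code class="language-plaintext highlighter-rouge">k</code> with <code class="language-plaintext highlighter-rouge">:</code> and <code class="language-plaintext highlighter-rouge">z</code> with <code class="language-plaintext highlighter-rouge">[]</code>” trick run in reverse.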
<h1 id="laziness">laziness</h1>
Another difference between <code class="language-plaintext highlighter-rouge">foldl</code> and <code class="language-plaintext highlighter-rouge">foldr</code> is how they work with laziness. Laziness can be tricky to understand at first, since it defies all of our intuitions about how to evaluate code. Consider the implementation of <code class="language-plaintext highlighter-rouge">map</code> using <code class="language-plaintext highlighter-rouge">foldr</code>:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>map f xs = foldr (\x acc -> f x : acc) [] xs
</code></pre></div></div>
A <code class="language-plaintext highlighter-rouge">map</code> law is that composing two maps is the same as a single map with the two functions composed. In fancy math, \[\mathrm{map}\ f \circ \mathrm{map}\ g = \mathrm{map}\ (f \circ g)\] In Haskell,
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>map f . map g = map (f . g)
</code></pre></div></div>
In Java,
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>map(f, map(g, list)) = map(compose(f, g), list)
</code></pre></div></div>
If we can fuse the two maps like this, then we can make this dramatically more efficient. Does <code class="language-plaintext highlighter-rouge">foldr</code> respect this law with respect to performance? Let’s watch <code class="language-plaintext highlighter-rouge">map (+1) . map (*2)</code> work, using our <code class="language-plaintext highlighter-rouge">foldr</code> definitions. <code class="language-plaintext highlighter-rouge">print</code> will demand our values and force their evaluation.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>printEach [] = print "Done" printEach (x:xs) = do print x printEach xs printEach (map (+1) (map (*2) [1, 2, 3])) </code></pre></div></div> <code class="language-plaintext highlighter-rouge">print</code> is what actually forces evaluation here, so nothing gets evaluated until <code class="language-plaintext highlighter-rouge">print</code> forces it, and only as much as is required for <code class="language-plaintext highlighter-rouge">print</code>. First, <code class="language-plaintext highlighter-rouge">printEach</code> matches on <code class="language-plaintext highlighter-rouge">map (+1) (map (*2) [1,2,3])</code>. It needs to know if that evaluates to <code class="language-plaintext highlighter-rouge">[]</code> or <code class="language-plaintext highlighter-rouge">(x:xs)</code>. This causes <code class="language-plaintext highlighter-rouge">map (+1)</code> to match on <code class="language-plaintext highlighter-rouge">map (*2) [1,2,3]</code>. Which causes <code class="language-plaintext highlighter-rouge">map (*2)</code> to match on <code class="language-plaintext highlighter-rouge">[1,2,3]</code>. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>printEach ( map (+1) ( map (*2) [1, 2, 3] ) ) -- substitute `foldr` definition for `map`: printEach ( map (+1) ( foldr (\x acc -> x * 2 : acc) [] [1,2,3] ) ) -- foldr's (x:xs) case matches, expression becomes: printEach ( map (+1) ( 1 * 2 : foldr (\x acc -> x * 2 : acc) [] [2, 3] ) ) -- first `map` can pattern match now, so we expand to the -- foldr definition: printEach ( foldr (\x acc -> x + 1 : acc) [] ( 1 * 2 : foldr (\x acc -> x * 2 : acc) [] [2, 3] ) ) -- foldr matches on `(x:xs)`, so we evaluate that bit: printEach ( 1 * 2 + 1 : foldr (\x acc -> x + 1 : acc) [] ( foldr (\x acc -> x * 2 : acc) [] [2, 3] ) ) -- printEach matches on (x:xs), so we can go to `print x`: printEach (1 * 2 + 1 : xs) = do print (1 * 2 + 1) printEach xs -- `print` does the match and prints it, and then recurses: printEach ( foldr (\x acc -> x + 1 : acc) [] ( foldr (\x acc -> x * 2 : acc) [] [2, 3] ) ) </code></pre></div></div> This process repeats, until eventually the inner <code class="language-plaintext highlighter-rouge">foldr</code> yields an empty list, and then the outer <code class="language-plaintext highlighter-rouge">foldr</code> yields an empty list, and then <code class="language-plaintext highlighter-rouge">printEach</code> prints <code class="language-plaintext highlighter-rouge">"Done"</code> to finish things off. Thu, 24 Mar 2016 00:00:00 +0000 https://www.parsonsmatt.org/2016/03/24/the_magic_of_folds.html An Elegant Fizzbuzz Fizzbuzz is a notorious programming problem to give during interviews. It’s designed to weed out people that can’t program at all. The problem formulation is: <blockquote> Print the numbers 1 to 100. If the number is a multiple of 3, then print “Fizz” instead of the number. If the number is a multiple of 5, then print “Buzz” instead of the number. 
If the number is divisible by both 3 and 5, then print “FizzBuzz” instead. </blockquote> The basic implementation of the problem looks like this: <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class Fizz { public static void fizzBuzz() { for (int i = 1; i <= 100; i++) { if (i % 3 == 0 && i % 5 == 0) { System.out.println("FizzBuzz"); } else if (i % 3 == 0) { System.out.println("Fizz"); } else if (i % 5 == 0) { System.out.println("Buzz"); } else { System.out.println(i); } } } } </code></pre></div></div> Which we can translate directly to Haskell: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fizzBuzz :: IO () fizzBuzz = forM_ [1..100] (\i -> if i `mod` 3 == 0 && i `mod` 5 == 0 then putStrLn "FizzBuzz" else if i `mod` 3 == 0 then putStrLn "Fizz" else if i `mod` 5 == 0 then putStrLn "Buzz" else print i ) </code></pre></div></div> If the candidate is capable of answering it easily, then the spec gets increased: <blockquote> Now if it is divisible by 7, print “Quux”. If it’s divisible by a combination, then it needs to print all the words. </blockquote> Uh oh! These prime factors are no fun. There’s going to be an additional <code class="language-plaintext highlighter-rouge">if</code> case for each prime factor, and then you’ll need an additional <code class="language-plaintext highlighter-rouge">if</code> case for each possible combination. It’s much simpler if we can handle the logic without worrying about the duplication. 
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class Fizz { public static void fizzBuzz() { for (int i = 1; i < 101; i++) { StringBuilder s = new StringBuilder(); if (i % 3 == 0) { s.append("Fizz"); } if (i % 5 == 0) { s.append("Buzz"); } if (i % 7 == 0) { s.append("Quux"); } if (s.length() == 0) { s.append(i); } System.out.println(s.toString()); } } } </code></pre></div></div> We’ve reduced the logic duplication by accumulating state. In Haskell, we’d use the State monad: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fizzBuzz :: (MonadIO m, MonadState String m) => m () fizzBuzz = forM_ [1..100] (\i -> do put "" when (i `mod` 3 == 0) (modify (++"Fizz")) when (i `mod` 5 == 0) (modify (++"Buzz")) when (i `mod` 7 == 0) (modify (++"Quux")) str <- get when (null str) (put (show i)) get >>= liftIO . putStrLn ) </code></pre></div></div> This implementation has the unfortunate problem of being somewhat inefficient: those <code class="language-plaintext highlighter-rouge">++</code> calls are $O(n)$, and we’ll have to traverse the whole string each time we add something to the end. This solves the “extensibility” problem, but there’s another issue: the printing is tied in with the generation of the strings. Let’s convert this to a map – that will both solve the performance issue noted above as well as making it less difficult to observe what’s happening. 
In Java, we’ve got: <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class Fizz { public static String fizzLogic(int i) { StringBuilder s = new StringBuilder(); if (i % 3 == 0) { s.append("Fizz"); } if (i % 5 == 0) { s.append("Buzz"); } if (i % 7 == 0) { s.append("Quux"); } if (s.length() == 0) { s.append(i); } return s.toString(); } public static List<String> fizzBuzz() { return IntStream .range(1, 101) .mapToObj(Fizz::fizzLogic) .collect(ArrayList::new, ArrayList::add, ArrayList::addAll); } } </code></pre></div></div> And, in Haskell: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fizzBuzz :: [Integer] -> [String] fizzBuzz = map fizzLogic fizzLogic :: Integer -> String fizzLogic i = if i `mod` 3 == 0 && i `mod` 5 == 0 then "FizzBuzz" else if i `mod` 3 == 0 then "Fizz" else if i `mod` 5 == 0 then "Buzz" else show i </code></pre></div></div> Alas! Now that we’re back to a single expression, we have to consider all the cases again as single if blocks. The Java implementation is nicer! What gives?! Well, <code class="language-plaintext highlighter-rouge">hlint</code> is telling us to refactor that to use guards rather than <code class="language-plaintext highlighter-rouge">if</code>, so let’s do that and see if anything jumps out at us. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fizzLogic :: Integer -> String fizzLogic i | i `mod` 3 == 0 && i `mod` 5 == 0 = "FizzBuzz" | i `mod` 3 == 0 = "Fizz" | i `mod` 5 == 0 = "Buzz" | otherwise = show i </code></pre></div></div> Now that it’s laid out like this… I think I see a pattern! 
Let me align it a little differently: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fizzLogic :: Integer -> String fizzLogic i | i `mod` 3 == 0 && i `mod` 5 == 0 = "Fizz" ++ "Buzz" | i `mod` 3 == 0 = "Fizz" ++ "" | i `mod` 5 == 0 = "" ++ "Buzz" | otherwise = show i </code></pre></div></div> Do you see it? It’s one of our favorite things: a monoid! Actually, it’s a whole bunch of monoids. <h1 id="monoids">Monoids</h1> A monoid is a neat little idea from abstract algebra that shows up almost everywhere. A monoid is a collection of three things: <ol> <li>A set of objects</li> <li>An associative binary operation (that is, $a \diamond (b \diamond c) = (a \diamond b) \diamond c$)</li> <li>An identity value for the operation (that is, $a \diamond id = a$ and $id \diamond a = a$)</li> </ol> Boolean values and <code class="language-plaintext highlighter-rouge">&&</code> form a monoid, where the set is <code class="language-plaintext highlighter-rouge">{True, False}</code>, the operation is <code class="language-plaintext highlighter-rouge">&&</code>, and the identity is <code class="language-plaintext highlighter-rouge">True</code>. Strings and <code class="language-plaintext highlighter-rouge">++</code> form a monoid, where <code class="language-plaintext highlighter-rouge">""</code> (the empty string) is the identity element. Integers, <code class="language-plaintext highlighter-rouge">+</code>, and <code class="language-plaintext highlighter-rouge">0</code> form a monoid, as do the integers, <code class="language-plaintext highlighter-rouge">*</code>, and <code class="language-plaintext highlighter-rouge">1</code>. They’re everywhere! Getting back to fizzing and buzzing, let’s codify the general form of the rule. We get an integer, and we might return a string. If we return multiple strings, we concatenate them all. If we don’t, then we just print the number. We seem to have a set of rules that may or may not fire. 
If more than one rule fires, we combine the results of the rules. If none fire, then we need a default value. Let’s represent this in Haskell: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type FizzRule = Integer -> Maybe String rule :: Integer -> String -> FizzRule rule n m i = case i `mod` n of 0 -> Just m _ -> Nothing fizz = rule 3 "Fizz" buzz = rule 5 "Buzz" </code></pre></div></div> Alright, so now we have a <code class="language-plaintext highlighter-rouge">[FizzRule]</code>. How do we use that? There are quite a few neat things we can do. <code class="language-plaintext highlighter-rouge">sequence</code> is a promising candidate: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sequence :: Monad m => [m a] -> m [a] -- or, the more generic: sequenceA :: (Applicative f, Traversable t) => t (f a) -> f (t a) </code></pre></div></div> As it happens, <code class="language-plaintext highlighter-rouge">a -> b</code> forms a monad and an applicative! So if we specialize <code class="language-plaintext highlighter-rouge">sequence</code> to functions (and then again to <code class="language-plaintext highlighter-rouge">Integer -> Maybe String</code>), we get: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sequence :: [a -> b] -> (a -> [b]) sequence :: [Integer -> Maybe String] -> Integer -> [Maybe String] </code></pre></div></div> This is pretty close! Now we want to combine the <code class="language-plaintext highlighter-rouge">[Maybe String]</code> into a <code class="language-plaintext highlighter-rouge">Maybe [String]</code>. This is, again, <code class="language-plaintext highlighter-rouge">sequence</code>, since <code class="language-plaintext highlighter-rouge">Maybe</code> is a monad. Finally, we want to concatenate those inner strings. 
We can use <code class="language-plaintext highlighter-rouge">fmap</code> over the Maybe with <code class="language-plaintext highlighter-rouge">mconcat</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mconcat :: Monoid m => [m] -> m combineRules :: [Integer -> Maybe String] -> Integer -> Maybe String combineRules rules i = fmap mconcat (sequence (sequence rules i)) </code></pre></div></div> Nice! This kind of polymorphic power is what’s so neat about Haskell. Those type classes are doing a bunch of work for us under the hood, and we don’t really have to worry about it. What if I told you a lot of that work was unnecessary? We can punt almost all of this to our fancy monoid instances with a single function: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fold :: (Foldable t, Monoid m) => t m -> m </code></pre></div></div> <h1 id="dat-fold">Dat Fold</h1> Here’s where things get fun: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fizzBuzz :: [FizzRule] -> [Integer] -> [String] fizzBuzz rules = map f where f i = fromMaybe (show i) (ruleSet i) ruleSet = fold rules </code></pre></div></div> The magic happens in <code class="language-plaintext highlighter-rouge">fold rules</code>. Let’s inspect the type signature of <code class="language-plaintext highlighter-rouge">fold</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fold :: (Foldable t, Monoid m) => t m -> m </code></pre></div></div> We’ve got a <code class="language-plaintext highlighter-rouge">[FizzRule]</code> or <code class="language-plaintext highlighter-rouge">[Integer -> Maybe String]</code>. So <code class="language-plaintext highlighter-rouge">Foldable t ~ []</code> and <code class="language-plaintext highlighter-rouge">Monoid m ~ Integer -> Maybe String</code>. 
The monoid instance that comes into play here is: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Monoid m => Monoid (a -> m) where mempty = const mempty mappend f g = \x -> f x `mappend` g x </code></pre></div></div> So we require yet another monoid instance for the result of the function. The instance that comes into play here is: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Monoid a => Monoid (Maybe a) where mempty = Nothing mappend (Just a) (Just b) = Just (mappend a b) mappend (Just a) Nothing = Just a mappend Nothing (Just a) = Just a mappend Nothing Nothing = Nothing </code></pre></div></div> Along with the <code class="language-plaintext highlighter-rouge">Monoid</code> instance for <code class="language-plaintext highlighter-rouge">String</code>, which is <code class="language-plaintext highlighter-rouge">mappend = (++)</code> and <code class="language-plaintext highlighter-rouge">mempty = ""</code>. So we’ve folded three levels of Monoidal structure together. All that work came for free with the <code class="language-plaintext highlighter-rouge">fold</code> function. Note that we’ve come across a really powerful concept here: <code class="language-plaintext highlighter-rouge">fold rules</code> can be used to take a bunch of distinct rules, run them across the same input, and collect their responses. Fizzbuzz is a somewhat trivial implementation, but the generalized concept is really useful. If you’re feeling frisky, you can get a little more type class magic going by using the Functor and Applicative instance for functions. 
This lets us write: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fizzBuzz :: (Functor f, Foldable t) => t (Integer -> Maybe String) -> f Integer -> f String fizzBuzz rules = fmap (fromMaybe <$> show <*> ruleSet) where ruleSet = fold rules </code></pre></div></div> What? The Functor instance for functions is just function composition. Compare the type signature of <code class="language-plaintext highlighter-rouge">(.) :: (b -> c) -> (a -> b) -> (a -> c)</code> with: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Functor ((->) r) where fmap :: (a -> b) -> (r -> a) -> (r -> b) fmap f g = f . g </code></pre></div></div> Now, we get to the Applicative instance… <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Applicative ((->) r) where pure = const f <*> g = \x -> f x (g x) </code></pre></div></div> The pattern <code class="language-plaintext highlighter-rouge">f <$> g <*> h</code> with explicit parentheses is <code class="language-plaintext highlighter-rouge">(f <$> g) <*> h</code>. When we expand that out, we get <code class="language-plaintext highlighter-rouge">\x -> f (g x) (h x)</code>, which is how we get <code class="language-plaintext highlighter-rouge">fromMaybe <$> show <*> ruleSet</code>. I don’t know about you, but when I was working on this, my brain just about exploded. There’s one last fun bit of abstract nonsense… <h1 id="a-monoid-in-the-category">A monoid in the category…</h1> The joke in functional programming is: <blockquote> “A monad is just a monoid in the category of endofunctors, what’s the problem?” </blockquote> Everyone laughs because wait what. But – as it happens, we get a little bit of a hint as to the deeper meaning! 
The <code class="language-plaintext highlighter-rouge">Foldable</code> class is the <a href="https://comonad.com/reader/2015/free-monoids-in-haskell/"><code class="language-plaintext highlighter-rouge">toFreeMonoid</code></a> class. One of the member functions is <code class="language-plaintext highlighter-rouge">foldMap</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foldMap :: (Foldable t, Monoid m) => (a -> m) -> t a -> m </code></pre></div></div> which maps each element of the structure to a monoid and then <code class="language-plaintext highlighter-rouge">mappends</code> them all together into a final monoid value. Now, suppose we specialize <code class="language-plaintext highlighter-rouge">foldMap</code> such that <code class="language-plaintext highlighter-rouge">Foldable t ~ []</code> and <code class="language-plaintext highlighter-rouge">Monoid m ~ [b]</code>… This is the resulting signature: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foldMap :: (a -> [b]) -> [a] -> [b] </code></pre></div></div> Which should look really familiar: that’s very nearly the type signature of <code class="language-plaintext highlighter-rouge">>>=</code>! In fact, that’s precisely the type signature of <code class="language-plaintext highlighter-rouge">=<<</code> specialized to lists: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- or :: (a -> [b]) -> [a] -> [b] (=<<) :: (a -> m b) -> m a -> m b (=<<) = flip (>>=) </code></pre></div></div> <h2 id="uhhh">uhhh</h2> Fizzbuzz has a neat monoidal solution that directly relates to rules engines. I accidentally discovered another neat connection between monoids, folds, and monads. You never know where this kind of thing might take you! 
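For anyone who wants to run the pieces from this post together, here is a minimal self-contained sketch. It reuses the definitions of <code class="language-plaintext highlighter-rouge">rule</code>, <code class="language-plaintext highlighter-rouge">combineRules</code>, and the <code class="language-plaintext highlighter-rouge">fold</code>-based <code class="language-plaintext highlighter-rouge">fizzBuzz</code> as written above. The <code class="language-plaintext highlighter-rouge">quux</code> rule and the <code class="language-plaintext highlighter-rouge">bindAgrees</code> check are additions for illustration, and the sketch assumes a GHC whose <code class="language-plaintext highlighter-rouge">base</code> provides the function, <code class="language-plaintext highlighter-rouge">Maybe</code>, and list monoid instances.

```haskell
import Data.Foldable (fold)
import Data.Maybe (fromMaybe)

type FizzRule = Integer -> Maybe String

rule :: Integer -> String -> FizzRule
rule n m i = case i `mod` n of
  0 -> Just m
  _ -> Nothing

-- `quux` is the extra rule from the extended spec (an illustrative addition).
fizz, buzz, quux :: FizzRule
fizz = rule 3 "Fizz"
buzz = rule 5 "Buzz"
quux = rule 7 "Quux"

-- The `sequence`-based combinator from the post. Because `sequence` on
-- `[Maybe String]` is Nothing as soon as one element is Nothing, this
-- yields a Just only when every rule fires.
combineRules :: [FizzRule] -> Integer -> Maybe String
combineRules rules i = fmap mconcat (sequence (sequence rules i))

-- The `fold`-based version: rules that do not fire contribute Nothing,
-- which is the identity of the Maybe monoid, so the others still combine.
fizzBuzz :: [FizzRule] -> [Integer] -> [String]
fizzBuzz rules = map f
  where
    f i = fromMaybe (show i) (ruleSet i)
    ruleSet = fold rules

-- `foldMap` specialised to lists agrees with `=<<` on this sample input.
bindAgrees :: Bool
bindAgrees = foldMap dup [1, 2, 3 :: Integer] == (dup =<< [1, 2, 3])
  where
    dup x = [x, x * 10]
```

Under these definitions, <code class="language-plaintext highlighter-rouge">fizzBuzz [fizz, buzz, quux] [1, 3, 5, 15, 21, 105]</code> evaluates to <code class="language-plaintext highlighter-rouge">["1","Fizz","Buzz","FizzBuzz","FizzQuux","FizzBuzzQuux"]</code>. One thing worth noticing is that <code class="language-plaintext highlighter-rouge">combineRules [fizz, buzz] 3</code> is <code class="language-plaintext highlighter-rouge">Nothing</code> while <code class="language-plaintext highlighter-rouge">fold [fizz, buzz] 3</code> is <code class="language-plaintext highlighter-rouge">Just "Fizz"</code>, so the two combinators are not interchangeable: the monoidal fold is the one that gives the Fizzbuzz behavior.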
Sat, 27 Feb 2016 00:00:00 +0000 https://www.parsonsmatt.org/2016/02/27/an_elegant_fizzbuzz.html Proving With Types I think that the Curry-Howard correspondence is one of the coolest things ever. Philip Wadler’s <a href="https://homepages.inf.ed.ac.uk/wadler/papers/propositions-as-types/propositions-as-types.pdf">“Propositions as Types”</a> paper sends chills down my spine. Getting to literally reason about the programs we write and use ~Logic~ on them is fascinating and amazing to me. So what does this all mean? Is it practical? Can we do anything with it? <h1 id="we-can">We can!</h1> (please clap) Here’s the basic idea: when we write a type signature in our programs, we’re being tricked into writing a proposition in logic. When we write an implementation for it, we’re providing evidence for that proposition, and potentially proving it. The more powerful our type system, the more logically we can think about it. Let’s take Java for example. Here’s a method signature: <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>public Integer length(List<Character> string); </code></pre></div></div> This signature is like a promise: “If you give me a <code class="language-plaintext highlighter-rouge">List<Character></code>, then I’ll give you an <code class="language-plaintext highlighter-rouge">Integer</code>.” We can write that in propositional logic as: \[List_{Character} \implies Integer\] Now, in order to prove this, we must demonstrate that this works for all lists of characters. If someone can provide a <code class="language-plaintext highlighter-rouge">List<Character></code> that causes this function to break, then our proposition is false! Fortunately, lists are simple, and even Java’s <code class="language-plaintext highlighter-rouge">null</code> isn’t terribly awful here. 
Here’s our proof: <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>public Integer length(List<Character> string) { if (string == null) { return 0; } Integer result = 0; for (Character c : string) { result += 1; } return result; } </code></pre></div></div> Actually, we can provide a proof much more simply: <code class="language-plaintext highlighter-rouge">return 0;</code> satisfies the promise in the type signature! It’s a little easier to see the connection with types and logic with Haskell’s syntax. Compare these two declarations: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>length :: [Char] -> Integer </code></pre></div></div> \[List_{Character} \implies Integer\] There’s even an arrow! <h1 id="proving-something-false">Proving Something False</h1> We can use this sort of logic to prove things false, too. Let’s say you want to prove ONCE AND FOR ALL that the Haskell function <code class="language-plaintext highlighter-rouge">head :: [a] -> a</code> is just bad. It’s not OK and you want to remove it from your codebase, and maybe even get a ruckus going on the Haskell mailing list. The idea that throwing exceptions in pure code is bad hasn’t been cutting it at the company meetings, so we’ll have to resort to logic. First, we need a way to translate some ideas to get at their logical counterparts. How are we going to represent the idea of a linked list in logic? Our tools in logic are <code class="language-plaintext highlighter-rouge">and</code> ($\land$), <code class="language-plaintext highlighter-rouge">or</code> ($\lor$), and <code class="language-plaintext highlighter-rouge">implies</code> ($\implies$). 
We’ve already seen how <code class="language-plaintext highlighter-rouge">implies</code> corresponds to functions, so now we just need to figure out <code class="language-plaintext highlighter-rouge">or</code> and <code class="language-plaintext highlighter-rouge">and</code> and we can do some THEOREM PROVING. <h2 id="and">And</h2> $A \land B$ is true if $A$ is true and if $B$ is true. If either of them is false, then the proposition $A \land B$ is false. Logical <code class="language-plaintext highlighter-rouge">and</code> has a few rules that we can look at: <h3 id="and-introduction">And Introduction</h3> If $A$ is true and $B$ is true, then $A \land B$ is true. That means whenever we know $A$ and $B$, we can introduce $A \land B$. This theorem looks like: \[A \implies (B \implies A \land B)\] What’s a Java method signature that corresponds with this? <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>public <A, B> And<A, B> andIntroduction(A a, B b); </code></pre></div></div> Or Haskell: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>andIntroduction :: a -> b -> And a b </code></pre></div></div> <h3 id="and-elimination">And Elimination</h3> If $A \land B$ is true, then we can eliminate the $\land$ and write down both $A$ and $B$ on their own. As a logical proposition, this looks like: $A \land B \implies A$ $A \land B \implies B$ Which we can implement in our two favorite programming languages: <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>public <A, B> A andElimOne(And<A, B> and); </code></pre></div></div> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>andElimOne :: And a b -> a andElimTwo :: And a b -> b </code></pre></div></div> Given the rules and type signatures for and elimination, I think we can arrive at a suitable class to implement it. 
It’s the humble pair, or tuple! <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>public class Tuple<A, B> { public final A fst; public final B snd; public Tuple(A a, B b) { this.fst = a; this.snd = b; } } </code></pre></div></div> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type And a b = (a, b) </code></pre></div></div> <h2 id="or">Or</h2> The proposition $A \lor B$ is true if either $A$ or $B$ is true. If one is false, that’s fine! As long as they both aren’t false. Now let’s review the rules for <code class="language-plaintext highlighter-rouge">or</code> ($\lor$). This one will be a little trickier. <h3 id="or-introduction">Or Introduction</h3> If we know $A$ is true, then we can say that $A \lor B$ is true. Even if we know that $B$ is false, we can still say $A \lor B$ since $A$ is true. Expressed as a theorem, we have: \[A \implies A \lor B\] Which gives us a Java method signature: <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>public <A, B> Or<A, B> orIntroduction(A a); </code></pre></div></div> and a Haskell type signature: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>orIntroduction :: a -> Or a b </code></pre></div></div> Naturally, we can also write <code class="language-plaintext highlighter-rouge">b -> Or a b</code>. <h3 id="or-elimination">Or Elimination</h3> Or elimination is a way of rewriting an <code class="language-plaintext highlighter-rouge">Or</code> value. It’s a bit trickier than <code class="language-plaintext highlighter-rouge">And</code> elimination, since in <code class="language-plaintext highlighter-rouge">And</code> elimination we know we’ve got an A and a B. If we know that $A \lor B$ is true, then we have two possible cases: either $A$ is true, or $B$ is true. But we don’t know which! 
So we’ll have to be able to handle either case. So, provided we know how to handle $A \implies Q$ and $B \implies Q$, then we can do some elimination. Because a full type signature might get messy, here’s a list of “requirements” and a guarantee: <ul> <li>If we can handle an $A$: $A \implies Q$</li> <li>If we can handle a $B$: $B \implies Q$</li> <li>If one is true: $A \lor B$</li> <li>Then I can give you a $Q$</li> </ul> Written out, this is: \[(A \lor B) \implies (A \implies Q) \implies (B \implies Q) \implies Q\] Translated to Java: <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>public <A, B, Q> Q orElimination(Or<A, B> or, Function<A, Q> left, Function<B, Q> right); </code></pre></div></div> In Haskell, we’ve got: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>orElimination :: (Or a b) -> (a -> q) -> (b -> q) -> q </code></pre></div></div> <h3 id="wat">wat</h3> So we know we can make an $A \lor B$ from either an $A$ or a $B$. And we know that we can eliminate it, but only if we can handle either an $A$ or a $B$. This is Haskell’s <code class="language-plaintext highlighter-rouge">Either</code> type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Either a b = Left a | Right b type Or = Either </code></pre></div></div> Implementing Either in Java is left as an exercise for the reader. <h1 id="logically-sound">Logically Sound</h1> Alright, we’ve reviewed our logic and got some tools on our belts. Let’s get back to lists, and how much we hate the <code class="language-plaintext highlighter-rouge">head</code> function. 
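Before returning to lists, the introduction and elimination rules reviewed so far can be collected into one small Haskell sketch. The type signatures follow the ones given earlier; the implementations, and the names <code class="language-plaintext highlighter-rouge">orIntroLeft</code> and <code class="language-plaintext highlighter-rouge">orIntroRight</code>, are filled in here for illustration. Each function body is the "proof" of the proposition in its signature:

```haskell
type And a b = (a, b)
type Or a b = Either a b

-- A ==> (B ==> A /\ B): knowing both, we can pair them up.
andIntroduction :: a -> b -> And a b
andIntroduction a b = (a, b)

-- A /\ B ==> A and A /\ B ==> B: project either component.
andElimOne :: And a b -> a
andElimOne = fst

andElimTwo :: And a b -> b
andElimTwo = snd

-- A ==> A \/ B and B ==> A \/ B (hypothetical names for the two directions).
orIntroLeft :: a -> Or a b
orIntroLeft = Left

orIntroRight :: b -> Or a b
orIntroRight = Right

-- (A \/ B) ==> (A ==> Q) ==> (B ==> Q) ==> Q: handle whichever case we hold.
orElimination :: Or a b -> (a -> q) -> (b -> q) -> q
orElimination (Left a) l _ = l a
orElimination (Right b) _ r = r b
```

With these definitions, <code class="language-plaintext highlighter-rouge">orElimination</code> behaves like <code class="language-plaintext highlighter-rouge">either</code> from the Prelude with its arguments reordered.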
The Haskell list type is defined like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data List a = Nil | Cons a (List a) </code></pre></div></div> Now, we need to translate this to just using <code class="language-plaintext highlighter-rouge">Or</code> and <code class="language-plaintext highlighter-rouge">And</code>. We can start by deconstructing the <code class="language-plaintext highlighter-rouge">|</code> in the top level sum and replacing it with an <code class="language-plaintext highlighter-rouge">Either</code>. Since <code class="language-plaintext highlighter-rouge">Nil</code> is a nullary constructor, we can say that the end of a list is like <code class="language-plaintext highlighter-rouge">Left ()</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type List' = Either () (...) </code></pre></div></div> That leaves the other case. We’ve got a constructor <code class="language-plaintext highlighter-rouge">Cons a (List a)</code>, which has two elements. We can express that as a tuple: <code class="language-plaintext highlighter-rouge">(a, List a)</code>. Haskell doesn’t let you have cycles in type synonym declarations, so we can’t actually say: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type List a = Either () (a, List a) </code></pre></div></div> But we can imagine that this is the case. 
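If imagining is unsatisfying, one way to make the recursive definition legal is to wrap it in a <code class="language-plaintext highlighter-rouge">newtype</code>, since the cycle restriction applies only to type synonyms. This is a minimal sketch; the helper names <code class="language-plaintext highlighter-rouge">nil</code>, <code class="language-plaintext highlighter-rouge">cons</code>, and <code class="language-plaintext highlighter-rouge">toBuiltin</code> are hypothetical:

```haskell
-- The same shape as the imagined synonym, made legal by a newtype wrapper.
newtype List a = List (Either () (a, List a))

-- The end of the list: the `()` side of the Either.
nil :: List a
nil = List (Left ())

-- A head paired with the rest of the list: the tuple side of the Either.
cons :: a -> List a -> List a
cons x xs = List (Right (x, xs))

-- Convert to the built-in list type to show the two carry the same data.
toBuiltin :: List a -> [a]
toBuiltin (List (Left ())) = []
toBuiltin (List (Right (x, xs))) = x : toBuiltin xs
```

The constructor <code class="language-plaintext highlighter-rouge">List</code> adds no structure of its own, so reading the type as $() \lor (A \land List_{A})$ still holds.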
Now that we have it in terms of <code class="language-plaintext highlighter-rouge">Either</code> and <code class="language-plaintext highlighter-rouge">(,)</code>, we can write it as a logical proposition: \[List_{A} = () \lor (A \land List_{A})\] Now, we can plug that directly in to the type signature for <code class="language-plaintext highlighter-rouge">head</code>, which we’ll put here in logical form: \[head : List_{A} \implies A\] Replacing $List_{A}$ with the definition of list: \[() \lor (A \land List_{A}) \implies A\] Now we can use the <code class="language-plaintext highlighter-rouge">and</code> elimination rule as listed above to get rid of the $List_{A}$ term: \[() \lor A \implies A\] And, we’re stuck. We can’t eliminate the $() \lor A$ since the <code class="language-plaintext highlighter-rouge">or</code> rule requires more information than we have. Consider what is meant by <code class="language-plaintext highlighter-rouge">or</code>: “I have evidence that either $A$ or $B$ is true.” If we don’t know what evidence we have, then we certainly can’t say that $A$ is true. If we had an $A$ available in the environment, then we could prove the $A$ that we’re looking for. Consider: <ul> <li>Add an A: $(() \lor A) \land A \implies A$</li> <li>And elimination: $A \implies A$</li> <li>Tautology – we win!</li> </ul> This is equivalent to changing <code class="language-plaintext highlighter-rouge">head</code>’s type signature to: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>head :: [a] -> a -> a head [] a = a head (x:xs) _ = x </code></pre></div></div> Alternatively, we could change that final bit to: \[() \lor A \implies () \lor A\] which is also tautologically true. 
$() \lor A$ is equivalent to <code class="language-plaintext highlighter-rouge">Either () a</code> is equivalent to <code class="language-plaintext highlighter-rouge">Maybe a</code>, so we’d have: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>head :: [a] -> Maybe a </code></pre></div></div> By breaking down <code class="language-plaintext highlighter-rouge">head</code>’s real type signature, we’ve proven that it is partial. Furthermore, by altering the final step in the proof, we can work backwards and determine what a total solution would look like. <h1 id="victory">Victory!</h1> We’ve achieved a minor victory today. When functional programmers talk about ‘reasoning about their code’, this is one of the things we’re talking about. Like, literal, actual, reasoning. Not just ‘thinking’, but logical, clear, structured reasoning about how the code is working and what it means. If you’re interested in learning more, I’d highly recommend reading <a href="https://www.amazon.com/Type-Theory-Formal-Proof-Introduction/dp/110703650X">Type Theory and Formal Proof: An Introduction</a>. (thanks to @hdgarrood on Twitter for posting some corrections! brackets are hard) (thanks to /u/ouchthats on Reddit for correcting my faulty use of a logic… damn LEM showing up everywhere! As it happens, reasoning and logic are difficult to get exactly right, but they’re the only things we can get exactly right!) Tue, 23 Feb 2016 00:00:00 +0000 https://www.parsonsmatt.org/2016/02/23/proving_with_types.html Hardware Simulation in Idris I I’ve been digging into Idris lately, and have found using dependent types to be a fun and challenging exercise. Last semester, I implemented some basic computer architecture concepts in Idris as a means of learning both. There’s more work that I’d like to do on it, and I’d like to write up the process involved thus far. 
You shouldn’t need any knowledge in either Idris or computer hardware to get this. Some experience with Elm, ML, Haskell, etc. might be useful though! Let’s build some (simulated) hardware! <h1 id="boolean-logic">Boolean Logic</h1> Computers implement Boolean logic using voltages. If a source has higher voltage, that’s <code class="language-plaintext highlighter-rouge">True</code>. If it has lower voltage, that’s <code class="language-plaintext highlighter-rouge">False</code>. The exact specifics depend on the hardware manufacturer, and we’ll assume that we have <code class="language-plaintext highlighter-rouge">True</code> and <code class="language-plaintext highlighter-rouge">False</code>. One neat fact about Boolean logic is that it is completely expressible with a single operation. <code class="language-plaintext highlighter-rouge">NAND</code> and <code class="language-plaintext highlighter-rouge">NOR</code> are both capable of expressing the entirety of Boolean logic. Here is the truth table for <code class="language-plaintext highlighter-rouge">NAND</code>: <table> <thead> <tr> <th>x</th> <th>y</th> <th>NAND x y</th> </tr> </thead> <tbody> <tr> <td>T</td> <td>T</td> <td>F</td> </tr> <tr> <td>T</td> <td>F</td> <td>T</td> </tr> <tr> <td>F</td> <td>T</td> <td>T</td> </tr> <tr> <td>F</td> <td>F</td> <td>T</td> </tr> </tbody> </table> And the implementation in Idris: <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>nand : Bool -> Bool -> Bool nand True True = False nand x y = True </code></pre></div></div> Idris, like Haskell, allows top level pattern matching. This makes it pretty easy to define the function. We use <code class="language-plaintext highlighter-rouge">x : y</code> to say <code class="language-plaintext highlighter-rouge">x</code> has the type <code class="language-plaintext highlighter-rouge">y</code>, like Elm and ML (and proper math). 
We’ll be using the above type signature frequently, so we’ll want to define a type alias for it. This is how that looks in Idris: <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Gate : Type Gate = Bool -> Bool -> Bool </code></pre></div></div> We declare a term <code class="language-plaintext highlighter-rouge">Gate</code> that has the type <code class="language-plaintext highlighter-rouge">Type</code>. Then we define <code class="language-plaintext highlighter-rouge">Gate</code> to be equal to a function type <code class="language-plaintext highlighter-rouge">Bool -> Bool -> Bool</code>. Idris uses the same stuff to talk about types and terms. Now we need to recover <code class="language-plaintext highlighter-rouge">AND</code>, <code class="language-plaintext highlighter-rouge">OR</code>, <code class="language-plaintext highlighter-rouge">NOT</code>, etc. Let’s do <code class="language-plaintext highlighter-rouge">AND</code> first. Wikipedia is <a href="https://en.wikipedia.org/wiki/And_gate">kind enough to include a diagram for constructing <code class="language-plaintext highlighter-rouge">AND</code> from <code class="language-plaintext highlighter-rouge">NAND</code></a>, and has diagrams for all the other gates too. Here are the implementations: <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>and : Gate and a b = nand (nand a b) (nand a b) </code></pre></div></div> Idris, like Haskell and Elm, uses whitespace for function application. So <code class="language-plaintext highlighter-rouge">f x</code> is the function <code class="language-plaintext highlighter-rouge">f</code> applied to <code class="language-plaintext highlighter-rouge">x</code>. 
<div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>or : Gate or a b = nand (nand a a) (nand b b) not : Bool -> Bool not a = nand a a nor : Gate nor a b = nand q q where q = nand (nand a a) (nand b b) </code></pre></div></div> Idris lets you define sub expressions in <code class="language-plaintext highlighter-rouge">where</code> declarations. If the type is simple enough, then it can infer the type. We don’t have to say <code class="language-plaintext highlighter-rouge">q : Bool</code>, for example. <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>xor : Gate xor a b = nand d e where c : Bool c = nand a b d = nand c a e = nand c b </code></pre></div></div> Note that in <code class="language-plaintext highlighter-rouge">xor</code>, we are required to annotate the type of <code class="language-plaintext highlighter-rouge">c</code>. Dependently typed languages trade complete type inference for additional expressiveness in the types. <h1 id="adders">Adders</h1> Now that we’ve assembled all of our logic gates, we can start doing some interesting work with them. <code class="language-plaintext highlighter-rouge">OR</code> in Boolean logic is equivalent to addition. We’ll represent True as 1 and False as 0. Compare the truth table for <code class="language-plaintext highlighter-rouge">OR</code> with adding 0 and 1: <table> <thead> <tr> <th>x</th> <th>y</th> <th>x OR y</th> <th>x + y</th> </tr> </thead> <tbody> <tr> <td>1</td> <td>1</td> <td>1</td> <td>2</td> </tr> <tr> <td>1</td> <td>0</td> <td>1</td> <td>1</td> </tr> <tr> <td>0</td> <td>1</td> <td>1</td> <td>1</td> </tr> <tr> <td>0</td> <td>0</td> <td>0</td> <td>0</td> </tr> </tbody> </table> This is almost right! 
When <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code> are both 1, then <code class="language-plaintext highlighter-rouge">x OR y</code> loses the information. What we’re missing is an implied <code class="language-plaintext highlighter-rouge">0</code> to indicate that we’re at <code class="language-plaintext highlighter-rouge">2</code> and not <code class="language-plaintext highlighter-rouge">1</code>. To make it right, we really need <code class="language-plaintext highlighter-rouge">1 OR 1</code> to be <code class="language-plaintext highlighter-rouge">10</code>. So we need an additional bit: a carry bit. This will only be set when <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code> are both <code class="language-plaintext highlighter-rouge">1</code>. We also can’t use OR, since it doesn’t have the right truth table anymore. We’ll interpret the <code class="language-plaintext highlighter-rouge">carry</code> bit as the two’s place and the XOR bit as the one’s place. Here’s our truth table, with + represented in binary: <table> <thead> <tr> <th>x</th> <th>y</th> <th>carry</th> <th>x XOR y</th> <th>x + y</th> </tr> </thead> <tbody> <tr> <td>1</td> <td>1</td> <td>1</td> <td>0</td> <td>10</td> </tr> <tr> <td>1</td> <td>0</td> <td>0</td> <td>1</td> <td>01</td> </tr> <tr> <td>0</td> <td>1</td> <td>0</td> <td>1</td> <td>01</td> </tr> <tr> <td>0</td> <td>0</td> <td>0</td> <td>0</td> <td>00</td> </tr> </tbody> </table> So our <code class="language-plaintext highlighter-rouge">carry</code> bit is the same as <code class="language-plaintext highlighter-rouge">AND</code>, and our first bit is <code class="language-plaintext highlighter-rouge">XOR</code>. 
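Before wiring this up, the table above can be checked mechanically. Here is a minimal sketch in Haskell (not the Idris we’re about to write) that confirms the carry bit and the XOR bit, read together as a two-bit binary number, always equal <code class="language-plaintext highlighter-rouge">x + y</code>. The names <code class="language-plaintext highlighter-rouge">bitValue</code>, <code class="language-plaintext highlighter-rouge">xorB</code> and <code class="language-plaintext highlighter-rouge">rowHolds</code> are made up for this illustration.

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module Main where

-- Hypothetical helper names; none of these come from the original code.

-- Interpret a Bool as the bit it represents.
bitValue :: Bool -> Int
bitValue b = if b then 1 else 0

-- XOR on Bool is just inequality.
xorB :: Bool -> Bool -> Bool
xorB = (/=)

-- True when (carry, xor), read as a two-bit binary number, equals x + y.
rowHolds :: Bool -> Bool -> Bool
rowHolds x y =
  bitValue x + bitValue y == 2 * bitValue (x && y) + bitValue (xorB x y)

-- Check all four rows of the truth table.
main :: IO ()
main = print (and [rowHolds x y | x <- [False, True], y <- [False, True]])
</code></pre></div></div>

If every row of the table holds, this prints <code class="language-plaintext highlighter-rouge">True</code>.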
Let’s put it in Idris: <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>halfAdder : Bool -> Bool -> (Bool, Bool) halfAdder x y = (xor x y, and x y) </code></pre></div></div> Idris has easy tupling like Haskell and Elm. We can signify that a function has two outputs by returning a tuple of values. The half adder is capable of adding two Boolean values, but since it doesn’t take a carry bit, it can’t deal with its own output. A full adder has a <code class="language-plaintext highlighter-rouge">CarryIn</code> input and uses that to determine the result and output. The Idris implementation looks like: <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fullAdder : Bool -> Bool -> Bool -> (Bool, Bool) fullAdder x y c = (result, carryOut) where result = xor x (xor y c) carryOut = or (and x y) (and c (xor x y)) </code></pre></div></div> Alright, so we’ve implemented the circuit for adding two <code class="language-plaintext highlighter-rouge">Bool</code>s with a carry value. We’re going to cheat a little bit and use Idris’s vectors rather than worrying about making memory right now. <h1 id="actual-operations-vectors">Actual Operations: Vectors!</h1> We haven’t really taken advantage of any dependent typing yet. Length indexed vectors are the first dependently typed data structure that most people work with. They’re pretty useful! Idris defines them using this syntax: <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Vect : Nat -> Type -> Type where Nil : Vect Z a (::) : a -> Vect k a -> Vect (S k) a data Nat = Z | S Nat </code></pre></div></div> We can read that like “Vect is a type constructor that takes a natural number, a type, and returns a type.” The first data constructor, <code class="language-plaintext highlighter-rouge">Nil</code>, has the type <code class="language-plaintext highlighter-rouge">Vect Z a</code>. 
<code class="language-plaintext highlighter-rouge">Z</code> is a representation of the natural number zero, so <code class="language-plaintext highlighter-rouge">Nil</code> has length 0 (as we’d expect). <code class="language-plaintext highlighter-rouge">::</code> is used to prepend a value of type <code class="language-plaintext highlighter-rouge">a</code> to a vector of type <code class="language-plaintext highlighter-rouge">Vect k a</code>, resulting in a <code class="language-plaintext highlighter-rouge">Vect (S k) a</code>. The <code class="language-plaintext highlighter-rouge">S</code> is for successor, and essentially means <code class="language-plaintext highlighter-rouge">+1</code>. This lets us talk about functions which only take non-empty vectors: <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>head : Vect (S n) a -> a </code></pre></div></div> Idris does pattern matching on the <code class="language-plaintext highlighter-rouge">(S n)</code>, which will succeed for values greater than 0. It’s now a compile time type error to provide <code class="language-plaintext highlighter-rouge">Nil</code> to <code class="language-plaintext highlighter-rouge">head</code>, which is great! We’re going to represent a machine word as a vector of Bool: <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Word : Nat -> Type Word n = Vect n Bool Byte : Type Byte = Word 8 </code></pre></div></div> Zero, as a <code class="language-plaintext highlighter-rouge">Byte</code>, will just be 8 <code class="language-plaintext highlighter-rouge">False</code> values. That’s easy enough to implement: <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>zeroByte : Byte zeroByte = replicate 8 False </code></pre></div></div> But what if we want a <code class="language-plaintext highlighter-rouge">zero</code> of arbitrary word size? 
Idris allows us to use type information in our function bodies. Check this out: <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>zero : Word n zero {n} = replicate n False </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">{n}</code> is the exact same natural number in the type! If we load that in the REPL, we can check out that it works: <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>src/Hardware> the (Word 2) zero [False, False] : Vect 2 Bool src/Hardware> the (Word 8) zero [False, False, False, False, False, False, False, False] : Vect 8 Bool </code></pre></div></div> What is <code class="language-plaintext highlighter-rouge">the</code>? <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>src/Hardware> :t the the : (a : Type) -> a -> a </code></pre></div></div> It’s a function that takes a type and a value of that type and returns that value unchanged. We can use this to constrain the type of <code class="language-plaintext highlighter-rouge">zero</code> and make it produce the right vector. <h1 id="endianness">Endianness</h1> When we want to convert a series of <code class="language-plaintext highlighter-rouge">Bool</code>s into a number, we have to know which end is significant. We can either start with the least significant bits, known as little-endian, or the most significant, known as big-endian. A number in little-endian format has the following structure: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bit: 0 0 0 0 0 0 0 0 2^i: 0 1 2 3 4 5 6 7 </code></pre></div></div> It reads backwards from what we might expect. We can take the first bit, multiply it by the power of two for its position, and add it to the rest of the list recursively converted. 
Meanwhile, a big-endian format has this structure: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bit: 0 0 0 0 0 0 0 0 2^i: 7 6 5 4 3 2 1 0 </code></pre></div></div> That’s how we normally read numbers. Let’s start with little-endian: <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>littleEndian : Word n -> Nat littleEndian = helper 0 where helper : Nat -> Word m -> Nat helper x [] = 0 helper x (True :: bs) = (power 2 x) + helper (x + 1) bs helper x (False :: bs) = helper (x + 1) bs </code></pre></div></div> We don’t have to worry about the length of the list because we can just increase the <code class="language-plaintext highlighter-rouge">x</code> as we pass it along. We can make this work for big-endian by reversing the list, but that’s inefficient. Instead, we can take advantage of the fact that we know the length of the list in the type. <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bigEndian : Word n -> Nat bigEndian [] = 0 bigEndian {n = S k} (b :: bs) = (if b then power 2 k else 0) + bigEndian bs </code></pre></div></div> First, we pattern match on the empty vector, and give 0 as a result. Then, we pattern match on <code class="language-plaintext highlighter-rouge">b :: bs</code>, and also assert that <code class="language-plaintext highlighter-rouge">n</code> is non-zero (that is, the successor of a natural number). This brings into scope the variable <code class="language-plaintext highlighter-rouge">k</code>, which we can use. We add <code class="language-plaintext highlighter-rouge">2^k</code> if the bit is set and <code class="language-plaintext highlighter-rouge">0</code> otherwise, and add that to the result of calling <code class="language-plaintext highlighter-rouge">bigEndian</code> on the rest of the list. 
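To make the two conversions concrete, here is a rough Haskell translation that works on plain lists of <code class="language-plaintext highlighter-rouge">Bool</code>. This is a sketch under one big assumption: a Haskell list doesn’t carry its length in its type, so the big-endian version recomputes the exponent with <code class="language-plaintext highlighter-rouge">length</code> instead of reading it from the type as the Idris version does.

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module Main where

-- Little-endian: the head of the list is the 2^0 bit.
littleEndian :: [Bool] -> Integer
littleEndian = go 0
  where
    go :: Integer -> [Bool] -> Integer
    go _ []       = 0
    go i (b : bs) = (if b then 2 ^ i else 0) + go (i + 1) bs

-- Big-endian: the head of the list is the most significant bit.
-- The exponent is the number of bits that follow, computed with `length`
-- (an assumption of this sketch; Idris gets it from the type for free).
bigEndian :: [Bool] -> Integer
bigEndian []       = 0
bigEndian (b : bs) = (if b then 2 ^ length bs else 0) + bigEndian bs

main :: IO ()
main = do
  let w = [True, True, False]
  print (littleEndian w)  -- 1 + 2 + 0 = 3
  print (bigEndian w)     -- 4 + 2 + 0 = 6
</code></pre></div></div>

The same three bits give 3 when read little-endian and 6 when read big-endian. Calling <code class="language-plaintext highlighter-rouge">length</code> at every step makes this version quadratic, which is exactly the cost the length-indexed <code class="language-plaintext highlighter-rouge">Vect</code> avoids.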
<h1 id="adding-two-words">Adding Two Words</h1> Let’s recap: we’ve built logic gates, zeroes, and a means of converting a <code class="language-plaintext highlighter-rouge">Word n</code> into a <code class="language-plaintext highlighter-rouge">Nat</code>ural number. Nice! Now, let’s implement addition of two vectors using a <a href="https://en.wikipedia.org/wiki/Adder_(electronics)#Ripple-carry_adder">ripple carry adder</a>. Note that this function only works on little-endian <code class="language-plaintext highlighter-rouge">Word</code>s because the carry bit flows the wrong direction for a big-endian word. We would have to reverse the two lists in order to make this operate correctly on big-endian vectors. <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>rippleCarry : Word n -> Word n -> Word n rippleCarry x y = go False x y where go : Bool -> Vect n Bool -> Vect n Bool -> Vect n Bool go carry [] [] = Data.Vect.Nil go carry (a :: as) (b :: bs) = let (s, c) = fullAdder a b carry in s :: go c as bs </code></pre></div></div> We’re asserting that you can only add two <code class="language-plaintext highlighter-rouge">Word</code>s of the same length, so when we pattern match in <code class="language-plaintext highlighter-rouge">go</code>, we don’t have to worry about mismatched lengths. Now, the last challenge: given a <code class="language-plaintext highlighter-rouge">Nat</code>, it’d be nice to get the <code class="language-plaintext highlighter-rouge">Word n</code> from it. We can implement that simply for lists: <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkWordList : Nat -> List Bool mkWordList Z = [] mkWordList s@(S k) = ((s `mod` 2) == 1) :: mkWordList (divCeil k 2) </code></pre></div></div> <code class="language-plaintext highlighter-rouge">Z</code> is the empty list. 
For the <code class="language-plaintext highlighter-rouge">S</code>uccessor of any natural number <code class="language-plaintext highlighter-rouge">k</code>, we check whether <code class="language-plaintext highlighter-rouge">S k</code> is odd, that is, whether it leaves a remainder of 1 when divided by 2. If it is, we have a <code class="language-plaintext highlighter-rouge">True</code> bit, and otherwise we have <code class="language-plaintext highlighter-rouge">False</code>. Then we <code class="language-plaintext highlighter-rouge">cons</code> that onto the front of the list formed by recursing down. But what happens when we try for vectors? <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkWord : (n : Nat) -> Word m mkWord Z = ?zero mkWord (S k) = ?succ </code></pre></div></div> What is <code class="language-plaintext highlighter-rouge">m</code> going to be? We know that, given <code class="language-plaintext highlighter-rouge">x</code> bits, we can store <code class="language-plaintext highlighter-rouge">2^x</code> numbers, with the largest being <code class="language-plaintext highlighter-rouge">2^x - 1</code>. Idris defines <code class="language-plaintext highlighter-rouge">log2 : Nat -> Nat</code>, which is the inverse of <code class="language-plaintext highlighter-rouge">power 2 n</code>. So <code class="language-plaintext highlighter-rouge">m</code> should be <code class="language-plaintext highlighter-rouge">S (log2 n)</code>. Let’s put in the <code class="language-plaintext highlighter-rouge">?zero</code> case now. We’ll use the ‘empty vector = 0’ convention we’ve been using: <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkWord : (n : Nat) -> Word (S (log2 n)) mkWord Z = [] </code></pre></div></div> Boom! 
Type error: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> Type checking ./src/Hardware.idr ./src/Hardware.idr:98:8: When checking right hand side of mkWord with expected type Word (S (log2 0)) Type mismatch between Vect 0 a (Type of []) and Vect (S (log2 0)) Bool (Expected type) Specifically: Type mismatch between 0 and S (log2 0) </code></pre></div></div> Well, <code class="language-plaintext highlighter-rouge">0</code> doesn’t match <code class="language-plaintext highlighter-rouge">S k</code>, so that’s wrong! Fortunately, we can handle that case directly in the type. Check this out: <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkWord : (n : Nat) -> Word (case n of Z => 0; S k => S (log2 (S k))) mkWord Z = [] </code></pre></div></div> We can use a <code class="language-plaintext highlighter-rouge">case</code> statement right there in the type signature. Pretty awesome, right? 
Now we can get to the other case: <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkWord s@(S k) = ((s `mod` 2) == 1) :: mkWord (k `divCeil` 2) </code></pre></div></div> We get another type error here: <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>*src/Hardware> :r Type checking ./src/Hardware.idr ./src/Hardware.idr:99:38-40: When checking right hand side of mkWord with expected type Word (case block in mkWord at ./src/Hardware.idr:97:34 (S k) (S k)) When checking argument xs to constructor Data.Vect.::: Type mismatch between Word (case block in mkWord at ./src/Hardware.idr:97:34 (divCeil k 2) (divCeil k 2)) (Type of mkWord (divCeil k 2)) and Vect (log2 (S k)) Bool (Expected type) Specifically: Type mismatch between case block in mkWord at ./src/Hardware.idr:97:34 (divCeil k 2) (divCeil k 2) and log2 (S k) </code></pre></div></div> So it appears that Idris can’t determine that, by halving <code class="language-plaintext highlighter-rouge">k</code> with each recursive call, it’ll only take about <code class="language-plaintext highlighter-rouge">log2 k</code> steps to bottom out. To be proper, we should really prove that this holds. Perhaps I’m wrong, and my code doesn’t work! Maybe there’s a corner case I’m forgetting, and Idris is rightfully blocking me from providing garbage. Perhaps I have a deadline looming, and I need to Move Fast and (potentially) Break Things. I’m pretty sure that I didn’t screw up too bad, and some playing at the REPL indicates that I’m alright. Fortunately, Idris is remarkably flexible. 
Here’s how we can make that typecheck: <div class="language-idris highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkWord s@(S k) = believe_me (((s `mod` 2) == 1) :: mkWord (k `divCeil` 2)) </code></pre></div></div> Idris has an escape hatch <code class="language-plaintext highlighter-rouge">believe_me : a -> b</code> that allows you to make assertions to the compiler without proving them. I’m planning on learning more about proving things with Idris so that I can make this work, but for right now, this works for me. <h1 id="further-work">Further work…</h1> Next time, I’m planning on implementing memory units so we can stop cheating and use a hardware simulation for memory. Then we can build a processor, some RAM, and have a real hardware simulation going on! Thu, 18 Feb 2016 00:00:00 +0000 https://www.parsonsmatt.org/2016/02/18/dependent-hardware-i.html https://www.parsonsmatt.org/2016/02/18/dependent-hardware-i.html ANN: QuickLift I’m happy to announce that my project QuickLift has finally reached the point where I can start using it for myself! It’s still definitely alpha level software, so I don’t really recommend anyone else using it right now. Not that there’s any real risk of that, I don’t think, since it has basically zero features. QuickLift is a weightlifting logging web application. I’ve developed the back end in Haskell using the Servant framework, and the front end currently in PureScript using the Halogen framework. That I’m intending to make it a weightlifting application is less cool than that I’ve intentionally built it thus far to be a reasonably useful scaffold for building functional single page applications. I’ve made a git branch for the state of the repositories as of these blogposts, so that code will be available. This post will serve as a bit of a walkthrough of the QuickLift application. 
Along the way, I’ll make notes on what will likely be factored out into their own libraries, and where parts could be improved a lot. Here are the relevant links: <ul> <li><a href="https://github.com/parsonsmatt/quicklift/tree/blogpost">The backend repository</a></li> <li><a href="https://github.com/parsonsmatt/ql-purs/tree/blogpost">The PureScript frontend repository</a></li> </ul> <h1 id="the-backend">The Backend</h1> The QuickLift backend is fairly standard for a Haskell Servant application. If you’ve seen my <a href="/2015/06/07/servant-persistent.html">servant-persistent</a> tutorial, then you’ve seen most of what the back end is all about. The monad and configuration I’m using are presented here: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type AppM = ReaderT Config (EitherT ServantErr IO) data Config = Config { getPool :: ConnectionPool , getEnv :: Environment } </code></pre></div></div> I’ll cover some of the other items here: <h2 id="user-authentication">User Authentication</h2> User authentication is currently an open problem in Servant, though they’re working on getting a <a href="https://github.com/haskell-servant/servant/pull/311">blessed solution</a> for the v0.5 release. I’m doing the bare minimum to work, with the minimum of magic. I’m also leveraging the excellent <a href="https://hackage.haskell.org/package/users-0.3.0.0"><code class="language-plaintext highlighter-rouge">users</code></a> library to handle the boilerplate around user management. 
I’m encapsulating the authentication and registration handlers in a <code class="language-plaintext highlighter-rouge">UserAPI</code> type, presented here: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type UserAPI = Get '[JSON] [Person] :<|> ReqBody '[JSON] Registration :> Post '[JSON] (Either Text.Text Int64) :<|> "login" :> ReqBody '[JSON] Auth :> Post '[JSON] (Maybe AuthResponse) :<|> "verify" :> ReqBody '[JSON] Text :> Post '[JSON] (Maybe AuthResponse) </code></pre></div></div> The first endpoint returns a list of all users. The second is used to register a new user account, returning either a text error message or the user ID. The third is used to log a user in, and returns an <code class="language-plaintext highlighter-rouge">AuthResponse</code> if the login was successful. <code class="language-plaintext highlighter-rouge">verify</code> is used to get the user account information for a given authentication token. I’ve used some Template Haskell <a href="https://www.parsonsmatt.org/2015/11/15/template_haskell.html">as documented in this tutorial</a> to make the Users code available in my <code class="language-plaintext highlighter-rouge">AppM</code> monad, so <code class="language-plaintext highlighter-rouge">getUsers</code> is just: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- Get '[JSON] [Person] getUsers :: AppM [Person] getUsers = do users <- listUsers Nothing return (map (uncurry userToPerson) users) </code></pre></div></div> I put the Servant API type description for the handler above the function to make it a bit more clear what is going on. 
Registering a new user account is also pretty easy: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- ReqBody '[JSON] Registration :> Post '[JSON] (Either Text.Text Int64) registerUser :: Registration -> AppM (Either Text.Text Int64) registerUser reg = do user <- createUser (convertRegistration reg) return $ either (Left . Text.pack . show) (Right . fromSqlKey) user data Registration = Registration { regName :: Text , regEmail :: Text , regPassword :: Text , regConfirmation :: Text } deriving (Eq, Show) </code></pre></div></div> I could have done <code class="language-plaintext highlighter-rouge">bimap (Text.pack . show) (fromSqlKey) user</code> to save a few characters, but bifunctors are kinda scary right! The <code class="language-plaintext highlighter-rouge">Registration</code> data type is just a dumb data type that I made to serve as the endpoint request. This pattern is pretty common – have a datatype corresponding to the input I expect from the endpoint, and a function to convert it to whatever internal format I need. In this case, I use <code class="language-plaintext highlighter-rouge">convertRegistration</code> to convert a <code class="language-plaintext highlighter-rouge">Registration</code> value into a <code class="language-plaintext highlighter-rouge">User</code> value as expected by the <code class="language-plaintext highlighter-rouge">users</code> library. This separation of concerns made it really easy to switch to the <code class="language-plaintext highlighter-rouge">users</code> library in the first place. Since <code class="language-plaintext highlighter-rouge">servant-0.5</code> is going to come out with real legit authentication support soon, I went with something somewhat janky. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- "login" :> ReqBody '[JSON] Auth :> Post '[JSON] (Maybe AuthResponse) authenticateUser :: Auth -> AppM (Maybe AuthResponse) authenticateUser auth = runMaybeT $ do sessionId <- MaybeT $ authUser (authEmail auth) (WU.PasswordPlain $ authPassword auth) 1200000 person <- lift $ getUser (authEmail auth) return $ AuthResponse sessionId person data Auth = Auth { authEmail :: Text , authPassword :: Text , authConfirmation :: Text } deriving (Eq, Show) data AuthResponse = AuthResponse { sessionId :: SessionId , person :: Person } deriving (Eq, Show, Generic) </code></pre></div></div> Right now, it’s sending a <code class="language-plaintext highlighter-rouge">Maybe AuthResponse</code>, though doing <code class="language-plaintext highlighter-rouge">AppM AuthResponse</code> and using an HTTP error code might be more appropriate to indicate a failed login. We want to respond with both the session information as well as the actual user profile, so we pack them both in the <code class="language-plaintext highlighter-rouge">AuthResponse</code> value. <code class="language-plaintext highlighter-rouge">getUser</code> is actually another function in the API – this is a fantastic display of the composability of Servant handlers. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- This is actually for a different part of the API getUser :: Text -> AppM Person getUser k = do person <- runMaybeT $ do userid <- MaybeT $ getUserIdByName k user <- MaybeT $ getUserById userid return $ userToPerson userid user case person of Nothing -> lift $ left err404 Just person -> return person </code></pre></div></div> This function returns a user, or errors out with a 404. If someone tries to login with a user that doesn’t exist, then they get a 404 error. Nice! 
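The <code class="language-plaintext highlighter-rouge">MaybeT</code> pattern in <code class="language-plaintext highlighter-rouge">getUser</code> is easier to see without the database in the way. Here’s a stripped-down sketch that chains two lookups that can each fail. The lookup functions and their data are hypothetical in-memory stand-ins, not the real <code class="language-plaintext highlighter-rouge">users</code> API, and it runs in plain <code class="language-plaintext highlighter-rouge">IO</code> rather than <code class="language-plaintext highlighter-rouge">AppM</code>.

<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module Main where

import Control.Monad.Trans.Maybe (MaybeT (..), runMaybeT)

-- Hypothetical stand-in for a "find the user id by name" query.
lookupIdByName :: String -> IO (Maybe Int)
lookupIdByName name = pure (lookup name [("alice", 1)])

-- Hypothetical stand-in for a "load the user by id" query.
lookupUserById :: Int -> IO (Maybe String)
lookupUserById uid = pure (lookup uid [(1, "Alice")])

-- Chain both lookups; the first Nothing short-circuits the rest.
findUser :: String -> IO (Maybe String)
findUser name = runMaybeT $ do
  uid <- MaybeT (lookupIdByName name)
  MaybeT (lookupUserById uid)

main :: IO ()
main = do
  findUser "alice"  >>= print  -- both lookups succeed
  findUser "nobody" >>= print  -- first lookup fails, second never runs
</code></pre></div></div>

This prints <code class="language-plaintext highlighter-rouge">Just "Alice"</code> and then <code class="language-plaintext highlighter-rouge">Nothing</code>. The real handler does one more thing at the end: it turns the <code class="language-plaintext highlighter-rouge">Nothing</code> into a 404 instead of handing it back to the caller.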
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>verifyToken :: Text -> AppM (Maybe AuthResponse) verifyToken sid = runMaybeT $ do let session = WU.SessionId sid userId <- MaybeT $ verifySession session 12000 user <- MaybeT $ getUserById userId return (AuthResponse session (userToPerson userId user)) </code></pre></div></div> Verifying a token is similar. We take an authentication token, ask <code class="language-plaintext highlighter-rouge">users</code> to verify the session. If it’s valid, we then get the user by ID and return an <code class="language-plaintext highlighter-rouge">AuthResponse</code>. <h2 id="composable-handlers">Composable Handlers</h2> So, <code class="language-plaintext highlighter-rouge">getUser</code> is a composable handler, and I’m reusing it all over the place in my app. Here’s the API it is defined in: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type LifterAPI = Get '[JSON] [Person] :<|> Capture "name" Text :> (Get '[JSON] Person :<|> "sessions" :> SessionAPI) type SessionAPI = Get '[JSON] [Entity Liftsession] :<|> Header "auth" Text :> ReqBody '[JSON] Liftsession :> Post '[JSON] (Either Text Int64) </code></pre></div></div> Now, I’ve got a sub-API that handles getting a person, and then a bunch of stuff for them to handle their weightlifting sessions. (I know, weightlifting session, browser session, sigh) At first, I was dismayed at the thought of writing repetitive <code class="language-plaintext highlighter-rouge">lift $ left err404</code> code to check for the user not being present. 
Naturally, there’s a better way, and the current implementation of the <code class="language-plaintext highlighter-rouge">lifters</code> logic is great: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sessionServer :: Text -> ServerT SessionAPI AppM sessionServer username = getSessions' :<|> createSession' where -- Get '[JSON] [Entity Liftsession] getSessions' :: AppM [Entity Liftsession] getSessions' = getUser username >>= getSessions -- Header "auth" Text -- :> ReqBody '[JSON] Liftsession -- :> Post '[JSON] (Either Text Int64) createSession' :: Maybe Text -> Liftsession -> AppM (Either Text Int64) createSession' Nothing _ = lift $ left err401 createSession' (Just sid) s = do loginId <- verifySession (WU.SessionId sid) 10 user <- getUser username if loginId == Just (personId user) then createSession s user else lift $ left err401 </code></pre></div></div> And by ‘great’, I mean “this could be way cooler but wow compared to Rails/Express…” <code class="language-plaintext highlighter-rouge">createSession'</code> is actually using the authentication mechanism in an ad-hoc way. API clients are required to put a header “auth” with their request. If they don’t send anything, then I <code class="language-plaintext highlighter-rouge">lift $ left err401</code> and their party is over. If they do send a header, then I verify that it is a proper authentication token. This gives me a <code class="language-plaintext highlighter-rouge">Maybe LoginId</code>. Next, I <code class="language-plaintext highlighter-rouge">getUser</code>, and if the user’s ID matches up with the one provided from the session, then I create the session. Otherwise they get booted. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>getSessions :: Person -> AppM [Entity Liftsession] getSessions Person {..} = runDb $ selectList [ LiftsessionUser ==. 
personId ] [] createSession :: Liftsession -> Person -> AppM (Either Text Int64) createSession ls person = do let ls' = ls { liftsessionUser = personId person } key <- runDb $ insert ls' return . return . fromSqlKey $ key </code></pre></div></div> <code class="language-plaintext highlighter-rouge">getSessions</code> doesn’t have to worry about the user not being present, because <code class="language-plaintext highlighter-rouge">getUser</code> already dealt with that. Likewise, <code class="language-plaintext highlighter-rouge">createSession</code> doesn’t have to worry about authorizing the user because we took care of that upstream. Servant’s composable handlers are really cool. In any case, that’s the entire back end as it differs from the <code class="language-plaintext highlighter-rouge">servant-persistent</code> tutorial. Let’s check out the front end. <h1 id="the-frontend">The Frontend</h1> The front end of QuickLift is currently written in PureScript using the Halogen UI library. I’m planning on doing another front end in GHCjs with Reflex-dom, and perhaps another with Elm (and maybe even one with ClojureScript if I’ve got enough time). All of the various ML-inspired JavaScript languages are fairly bleeding edge, with their own sets of trade offs. PureScript has a very nice position, and has the advantage of being designed from the ground up to be great at compiling to JavaScript and avoiding making some of the same mistakes that Haskell has made. The typeclass hierarchy is better. The record system is awesome. Some syntax is much better (no <code class="language-plaintext highlighter-rouge">$</code> needed before <code class="language-plaintext highlighter-rouge">do</code> or lambdas!), some is much worse (<code class="language-plaintext highlighter-rouge"><<<</code> is the default composition operator, though the lens package exports <code class="language-plaintext highlighter-rouge">..</code>). 
PureScript also has a really good router, which Elm and GHCJS didn’t really have at the time I started. Routers are important for making SPAs useful and not counterintuitive – the back button is your friend, and URLs are what makes the web great. I’m using <a href="https://github.com/slamdata/purescript-halogen">Halogen</a>, which is a beast of a library. I’m going to briefly cover the architecture and design, but you’ll want to refer to my <a href="https://www.parsonsmatt.org/2015/10/05/elm_vs_purescript_ii.html">Elm Architecture in PureScript</a> series, the <a href="https://github.com/slamdata/purescript-halogen#introduction">official introduction</a>, and the <a href="https://github.com/slamdata/purescript-halogen/tree/master/examples">excellent set of examples</a> if you want to know what’s going on in more depth. <h3 id="note-to-the-future">Note to the future:</h3> The PureScript ecosystem is evolving extremely rapidly, and it’s likely that the code in the <code class="language-plaintext highlighter-rouge">blogpost</code> branch will bitrot. I’ve tightened the dependencies in the <code class="language-plaintext highlighter-rouge">bower.json</code> file, but PureScript and <code class="language-plaintext highlighter-rouge">pulp</code> themselves might evolve and break the project. I’ll try to keep it updated and building, but for posterity, this is the version information that makes it go: <ul> <li>PureScript v0.7.6.3 or v0.8 RC 1</li> <li><code class="language-plaintext highlighter-rouge">pulp</code> versions 4.4 and 7.0.0 tested</li> </ul> <h2 id="mainpurs"><code class="language-plaintext highlighter-rouge">Main.purs</code></h2> The Main module kicks off the application, router, and digs an auth token out of LocalStorage. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- ... import QuickLift as Q import QuickLift.State as S import Router as R import Types (QL()) main :: forall eff. 
Eff (QL eff) Unit main = do token <- WS.getItem WS.localStorage "auth" runAff throwException (const (pure unit)) do app <- runUI Q.ui S.initialState { authToken = token } appendToBody app.node forkAff $ R.routeSignal app.driver </code></pre></div></div> The router is the next interesting bit of the application. <h2 id="routerpurs"><code class="language-plaintext highlighter-rouge">Router.purs</code></h2> I wrote <a href="https://www.parsonsmatt.org/2015/10/22/purescript_router.html">an introductory tutorial</a> on using <code class="language-plaintext highlighter-rouge">purescript-routing</code> with <code class="language-plaintext highlighter-rouge">purescript-halogen</code>. If you’re wanting more detail, check that out. I’ll briefly cover the main differences here. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>routing :: Match Routes routing = profile <|> sessions <|> register <|> login <|> logout <|> home where login = Login <$ route "login" logout = Logout <$ route "logout" register = Registration <$ route "register" profile = Profile <$ route "profile" home = Home <$ lit "" sessions = Sessions <$> (route "sessions" *> parseCRUD) route str = lit "" *> lit str parseCRUD = Show <$> int <|> New <$ lit "new" <|> pure Index int = floor <$> num </code></pre></div></div> I’m actually using an ADT in this implementation of the router, which is much nicer than a stringly typed one in the tutorial. The routes library includes an Applicative-style parser, which should be right at home to anyone who’s used Parsec or related. 
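For readers coming from Haskell, the same Applicative-style shape can be sketched with <code class="language-plaintext highlighter-rouge">Text.ParserCombinators.ReadP</code> from <code class="language-plaintext highlighter-rouge">base</code>. This is only an illustration, not the <code class="language-plaintext highlighter-rouge">purescript-routing</code> API: the <code class="language-plaintext highlighter-rouge">Route</code> and <code class="language-plaintext highlighter-rouge">CRUD</code> types, <code class="language-plaintext highlighter-rouge">segment</code>, and <code class="language-plaintext highlighter-rouge">parseRoute</code> are hypothetical names, and <code class="language-plaintext highlighter-rouge">Detail</code> stands in for the <code class="language-plaintext highlighter-rouge">Show</code> constructor used above.

```haskell
import Control.Applicative ((<|>))
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

-- Hypothetical, cut-down mirror of the route ADT (names are assumptions).
data CRUD = Index | New | Detail Int deriving (Eq, Show)
data Route = Home | Login | Sessions CRUD deriving (Eq, Show)

-- One path segment: a slash followed by a literal string.
segment :: String -> ReadP String
segment s = char '/' *> string s

-- "/new", "/<digits>", or nothing at all (the index page).
crud :: ReadP CRUD
crud =  (New <$ segment "new")
    <|> (Detail . read <$> (char '/' *> munch1 isDigit))
    <|> pure Index

route :: ReadP Route
route =  (Sessions <$> (segment "sessions" *> crud))
     <|> (Login <$ segment "login")
     <|> (Home <$ optional (char '/'))

-- Keep only parses that consume the whole input.
parseRoute :: String -> Maybe Route
parseRoute input =
  case [r | (r, "") <- readP_to_S (route <* eof) input] of
    (r:_) -> Just r
    []    -> Nothing

main :: IO ()
main = do
  print (parseRoute "/sessions/new")  -- Just (Sessions New)
  print (parseRoute "/sessions/42")   -- Just (Sessions (Detail 42))
  print (parseRoute "/sessions")      -- Just (Sessions Index)
  print (parseRoute "/nope")          -- Nothing
```

The point of the ADT is the same in both languages: a URL that parses becomes a value the compiler can check, and a URL that does not parse becomes <code class="language-plaintext highlighter-rouge">Nothing</code> rather than a stray string.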
I’ve also got a type class setup for generating URLs from the routes, and a link convenience function for my templates:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class HasLink a where
  link :: a -> String

instance routesHasLink :: HasLink Routes where
  link Profile = "#/profile"
  link (Sessions crud) = "#/sessions" ++ link crud
  link Home = "#/"
  link Registration = "#/register"
  link Login = "#/login"
  link Logout = "#/logout"

instance crudHasLink :: HasLink CRUD where
  link Index = ""
  link New = "/new"
  link (Show n) = "/" ++ show n

(</>) :: forall a b. (HasLink a, HasLink b) => (a -> b) -> a -> b
(</>) = ($)

linkTo :: Routes -> String -> HTML _ _
linkTo r t = H.a [ P.href (link r) ] [ H.text t ]
</code></pre></div></div>
I’d like to get this split off as a library for use with <code class="language-plaintext highlighter-rouge">halogen</code> and <code class="language-plaintext highlighter-rouge">routing</code>, but I’d like to get something that can generate both the route parsing and URL generation from a single source of truth. <h2 id="quickliftpurs"><code class="language-plaintext highlighter-rouge">QuickLift.purs</code></h2> Next up, we’ll check out the component definition. I started the app going crazy with subcomponents, but ended up finding that it was massively unwieldy to have so many coproducts floating around, and ended up refactoring everything and bringing it back into a single component. As it happens, a single component is not annoying or painful at all to deal with yet, so this has been a great shift. Like any good developer, I spread all my code far-and-wide: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ui :: forall eff.
Component State Input (QLEff eff) ui = component render eval where render state = L.defaultLayout state [ renderView state.currentPage state ] </code></pre></div></div> <code class="language-plaintext highlighter-rouge">L.defaultLayout</code> and <code class="language-plaintext highlighter-rouge">renderView</code> are in different modules. I’m experimenting with different ways to template, and this has been pretty good so far. Let’s check out the <code class="language-plaintext highlighter-rouge">eval</code> logic: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> eval :: Eval Input State Input (QLEff eff) eval (Goto route next) = do modify (_ { currentPage = route }) case route of Registration -> modify (stCurrentUser .~ Just emptyUser) Sessions Index -> eval (LoadSessions next) $> unit Logout -> eval (UserLogout next) $> unit _ -> pure unit st <- get unless (isJust st.currentUser) do for_ st.authToken \auth -> do res <- liftAff' (API.verifySession auth) case res of Nothing -> modify (stAuthToken .~ Nothing) Just (Tuple session user) -> do liftEff' (WS.setItem WS.localStorage "auth" session) modify (stErrors ?~ []) modify (stCurrentUser ?~ user) modify (stAuthToken ?~ session) liftAff' (updateUrl route) pure next </code></pre></div></div> This is the code that handles the routes. The <code class="language-plaintext highlighter-rouge">State</code> type for QuickLift keeps track of the current page. Then, depending on what the route is, we do some other stuff. In two of those cases, we’re punting to other cases of the <code class="language-plaintext highlighter-rouge">eval</code> function. The rest of the function retrieves the user from the server if there’s an auth token available. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> eval (LoadSessions a) = do u <- gets _.currentUser for_ u \user -> do s <- liftAff' (API.getUserSessions user) modify .. 
 set stLoadedSessions .. concat .. maybeToArray $ s pure a </code></pre></div></div> Stateful asynchronous code with potential <code class="language-plaintext highlighter-rouge">null</code> values is so nice with Haskell, er, PureScript. <code class="language-plaintext highlighter-rouge">Maybe</code>’s <code class="language-plaintext highlighter-rouge">Foldable</code> instance (which is all <code class="language-plaintext highlighter-rouge">for_</code> needs) works to great effect here. And not having to write <code class="language-plaintext highlighter-rouge">$</code> looks so much nicer. Getting to use lenses is fun too. This is what programming should be like! Handling forms is good too:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    handleNewSession (Edit fn) =
        modify (stCurrentSession %~ fn)
    handleNewSession Submit = do
        auth <- gets (\s -> Tuple <$> s.authToken <*> s.currentUser)
        for_ auth \(Tuple token user) -> do
            sess <- gets _.currentSession
            result <- liftAff' (API.postSession token user sess)
            for_ result \n -> do
                let saved' = sess # _Session .. id_ .~ n
                    rt = Sessions </> Show n
                modify (stCurrentSession .~ saved')
                modify (stLoadedSessions %~ (saved' :))
                eval (Goto rt unit)
</code></pre></div></div>
This comes from my <code class="language-plaintext highlighter-rouge">Form</code> module, which I’m planning on spinning off into a library. We get more fun with the <code class="language-plaintext highlighter-rouge">Foldable</code> instance of <code class="language-plaintext highlighter-rouge">Maybe</code>, first checking to see if a user and authentication token are available. If they both are, then we get the current session from the state, post it to the API, and if there’s a result, we save it. <code class="language-plaintext highlighter-rouge">eval (Goto (Sessions </> Show n) unit)</code> lets us send redirects from within the application handlers.
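The "only run this if both values are present" trick carries over to Haskell unchanged. Here is a minimal sketch, assuming a made-up <code class="language-plaintext highlighter-rouge">AppState</code> record and a <code class="language-plaintext highlighter-rouge">submit</code> action that just prints; neither is QuickLift's real code. <code class="language-plaintext highlighter-rouge">for_</code> needs only a <code class="language-plaintext highlighter-rouge">Foldable</code> instance, which <code class="language-plaintext highlighter-rouge">Maybe</code> has, and the Applicative combination yields <code class="language-plaintext highlighter-rouge">Nothing</code> as soon as either field is missing.

```haskell
import Data.Foldable (for_)

-- Hypothetical stand-in for the application state (field names assumed).
data AppState = AppState
  { authToken   :: Maybe String
  , currentUser :: Maybe String
  }

-- Runs the body only when both the token and the user are present;
-- otherwise for_ over Nothing is a no-op.
submit :: AppState -> IO ()
submit st =
  for_ ((,) <$> authToken st <*> currentUser st) $ \(token, user) ->
    putStrLn ("posting session for " ++ user ++ " with token " ++ token)

main :: IO ()
main = do
  submit (AppState (Just "abc123") (Just "matt"))  -- prints one line
  submit (AppState Nothing (Just "matt"))          -- prints nothing
```

No explicit <code class="language-plaintext highlighter-rouge">case</code> on the <code class="language-plaintext highlighter-rouge">Maybe</code> is needed, which is what keeps the handler above so short.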
<h2 id="quickliftviewpurs"><code class="language-plaintext highlighter-rouge">QuickLift/View.purs</code></h2> Let’s check out the view. The layout is mostly boring code that I’m reusing to wrap all the views in, so we can skip it. The <code class="language-plaintext highlighter-rouge">renderView</code> function is defined here: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>renderView :: Routes -> State -> ComponentHTML Input renderView Home _ = H.div_ [ H.h1_ [ H.text "QuickLift" ] , H.p_ [ H.text "Welcome to QuickLift" ] ] </code></pre></div></div> We’re pattern matching on the <code class="language-plaintext highlighter-rouge">Routes</code> to determine which page to visit, and we have access to the <code class="language-plaintext highlighter-rouge">State</code> to render data. For creating sessions, there’s some CRUD to deal with: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>renderView (Sessions Index) st = let sessions = case map linkSession st.loadedSessions of [] -> H.p_ [ H.text "No sessions." ] xs -> H.ul_ (map (H.li_ <<< pure) xs) in H.div_ [ newButton , loadButton , sessions ] renderView (Sessions (Show n)) st = let maybeIndex = findIndex (eq n .. view (_Session .. id_)) st.loadedSessions session = maybeIndex >>= index (st ^. stLoadedSessions) in showPage n session -- ... later ... showPage :: forall a. Int -> Maybe Session -> HTML a Input showPage n (Just (Session s)) = H.div_ [ H.h1_ [ H.text $ yyyy_mm_dd s.date ] , H.p_ [ H.text s.text ] , newButton ] showPage n Nothing = H.div_ [ H.h2_ [ H.text "hmm, not found... load it?" ] , loadButton ] </code></pre></div></div> lol @ mapping over the sessions in a case statement what am i doing. In any case, the views here are pretty non-remarkable if you’re used to Lucid or other Haskell/Elm templating solutions. 
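The <code class="language-plaintext highlighter-rouge">Sessions (Show n)</code> case above finds an index and then indexes back into the loaded sessions. As a rough sketch of the same lookup done in one step, here is a Haskell version using <code class="language-plaintext highlighter-rouge">find</code> from <code class="language-plaintext highlighter-rouge">Data.List</code>. The <code class="language-plaintext highlighter-rouge">Session</code> record and its field names are simplified assumptions, not the real model.

```haskell
import Data.List (find)

-- Hypothetical, simplified session record (field names are assumptions).
data Session = Session
  { sessionId   :: Int
  , sessionText :: String
  } deriving (Eq, Show)

-- Return the first session whose id matches, or Nothing if none does.
lookupSession :: Int -> [Session] -> Maybe Session
lookupSession n = find ((== n) . sessionId)

main :: IO ()
main = do
  let loaded = [Session 1 "squats", Session 2 "deadlifts"]
  print (sessionText <$> lookupSession 2 loaded)  -- Just "deadlifts"
  print (sessionText <$> lookupSession 9 loaded)  -- Nothing
```

Either way the result is a <code class="language-plaintext highlighter-rouge">Maybe</code>, so the "not found" page falls out of the pattern match for free.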
The <code class="language-plaintext highlighter-rouge">New</code> route is interesting: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>renderView (Sessions New) st = H.div_ [ F.form (NewSession Submit) [ F.textarea "session" "Session:" (st.currentSession ^. _Session .. text_) (NewSession .. Edit .. set (_Session .. text_)) , F.date "date" "Date:" (yyyy_mm_dd (st.currentSession ^. _Session .. date_)) (NewSession .. Edit .. edDate) ] ] where edDate :: String -> Session -> Session edDate str sess = let d = fromMaybe (sess ^. _Session .. date_) (dateFromString str) in sess # _Session .. date_ .~ d </code></pre></div></div> I’m using a form abstraction that’s based on <code class="language-plaintext highlighter-rouge">lens</code> here. <code class="language-plaintext highlighter-rouge">F.textarea</code> is a function that takes an ID, a label, an initial value, and a function that takes a String and the target of the lens and updates the state. I’m intending to explore that abstraction more, but haven’t had the chance to build it out. <h2 id="forms">Forms</h2> I also built out a <code class="language-plaintext highlighter-rouge">Writer</code> based form that takes advantage of lenses for user stuff: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>renderView Registration st = H.div_ $ errs st.errors : WF.renderForm st.registration Register do WF.textField "name" "User name:" (_UserReg .. name) urlSafe WF.emailField "email" "Email:" (_UserReg .. email) validEmail WF.passwordField "password" "Password:" (_UserReg .. password) validPassword WF.passwordField "confirm" "Confirmation:" (_UserReg .. confirmation) validConfirmation where validPassword str | Str.length str < 6 = Left "Password must be at least 6 characters" | otherwise = Right str validConfirmation str | str == st ^. stRegistration .. _UserReg .. 
password = Right str | otherwise = Left "Password must match confirmation" validEmail str = maybe (Left "Must have @ symbol") (const (Right str)) (Str.indexOf "@" str) urlSafe str = case Reg.match (Reg.regex "^[\\w\\d-_]*$" Reg.noFlags) str of Just _ -> Right str Nothing -> Left "Only alphanumeric characters, '_', and '-' are allowed." </code></pre></div></div> So the <code class="language-plaintext highlighter-rouge">WF</code> WriterForm is a bit clever, though I need to finish it. <code class="language-plaintext highlighter-rouge">WF.renderForm</code> takes 1) a thing to operate on, 2) an action in the Halogen query algebra which has a constructor which accepts a <code class="language-plaintext highlighter-rouge">FormInput</code> data, and 3) a series of input fields in the <code class="language-plaintext highlighter-rouge">WForm</code> monad (which is just a ReaderT Writer). Each field accepts an ID, a label, and a lens into the object, and a validation function with type <code class="language-plaintext highlighter-rouge">String -> Either String a</code> where <code class="language-plaintext highlighter-rouge">a</code> is the thing being constructed. (todo: construct more than just strings and i guess have a show function?) The query algebra I’ve got setup looks like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Input a = Goto Routes a | GetUser Int a | LoadSessions a | NewSession (FormInput Session) a | Register (FormInput UserReg) a | Authenticate (FormInput UserAuth) a | UserLogout a </code></pre></div></div> So, for <code class="language-plaintext highlighter-rouge">Register</code>, we’ve got an action that takes (as first argument) a <code class="language-plaintext highlighter-rouge">FormInput UserReg</code>. 
A <code class="language-plaintext highlighter-rouge">FormInput</code> is simply: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data FormInput a = Submit | Edit (a -> a) </code></pre></div></div> This ends up working quite well! I do need to integrate server received errors better somehow. <h2 id="models">Models</h2> There’s a decent amount of boilerplate involved with PureScript right now. Generic deriving arrived in 0.7.3, which has helped tremendously, but it’s not all there yet unfortunately. There are <a href="https://github.com/parsonsmatt/ql-purs/blob/blogpost/src/QuickLift/Model/Registration.purs">85 lines in this module</a>, almost all of which are boilerplate. This wasn’t much fun to write, and I’ll have to do it again for each and every one of my objects, it seems! I’ll likely write an abstraction or type class of my own to make this more convenient, but there’s a surprising amount one has to write in order to have convenient serialization and deserialization of types. Also I might be doing this totally wrong and would love if someone could show me a better way! Deriving lenses automatically would be great too… <h2 id="api">API</h2> Affjax makes dealing with the API requests really great. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>postRegistration :: forall eff. UserReg -> Aff (ajax :: AJAX | eff) (Either String Int) postRegistration u = do { response: res } <- qlPost "users" u pure $ joinForeign show res verifySession :: forall eff . 
String -> Aff (ajax :: AJAX | eff) (Maybe (Tuple String User)) verifySession token = do { response: res } <- qlPost "users/verify" (show token) pure <<< eitherToMaybe $ Tuple <$> readProp "sessionId" res <*> readProp "person" res </code></pre></div></div> <code class="language-plaintext highlighter-rouge">qlPost</code> and <code class="language-plaintext highlighter-rouge">joinForeign</code> are defined in the <a href="https://github.com/parsonsmatt/ql-purs/blob/blogpost/src/QuickLift/Api/Util.purs"><code class="language-plaintext highlighter-rouge">QuickLift.API.Util</code></a> module. <h1 id="the-future">The Future…</h1> Well, that’s QuickLift right now. Here’s what you can look out for later: <h2 id="an-actually-useful-app">An actually useful app</h2> that’s right! It’ll be way cooler when I’ve had more time to implement features and improvements. <h2 id="an-easily-packaged-spa-scaffold">An easily packaged SPA scaffold</h2> You can copy the repositories as-is right now and modify them to your own desires. Authentication is covered, as well as super basic CRUD. I want to make a <code class="language-plaintext highlighter-rouge">stack</code> template for easy distribution, and make this as simple as possible to get running. <h2 id="more-libraries-spun-out">More libraries spun out</h2> moar sharing moaaarrrrr I want to polish up the forms library, my utilities for routing, and my utilities for dealing with JSON/API stuff. They’re all general enough that I don’t think they belong in app code. <h2 id="deployment-with-keter">Deployment with Keter</h2> I need to get this deployed with Keter. Manual deployment isn’t bad but automation is great. Sun, 03 Jan 2016 00:00:00 +0000 https://www.parsonsmatt.org/2016/01/03/ann_quicklift.html https://www.parsonsmatt.org/2016/01/03/ann_quicklift.html 2015 Retrospective 2015 has been a hell of a year for me. It was around three years ago that I realized I was trapped in a dead end job in a dead end career. 
I decided to switch into software development, and spent 2014 working full time and going to classes part time. January of 2014 marked the first time that I did any real programming with CSCI 1301. That summer, I managed to get an internship where I worked about 10 hours per week, on top of everything else. In January of 2015, I quit my job to focus on going to school full time. I didn’t intend to work. Really, I had every intention of focusing on school and enjoying hobbies and interests that I didn’t have time for in 2014. I didn’t lift, and I didn’t touch my music. I made it about 20 days until I got an offer to work for another local startup. They were an awesome group doing Ruby on Rails, which I was enjoying at the time, so I accepted. Initially, I was only going to work for 10-15 hours per week. As time went on, there was so much work to do that I had the opportunity to work more. I liked the work, and I ended up working closer to 20-30 hours per week. Over the summer, I worked full time while taking the data structures and algorithms course. My experience there was intense! I was one of three developers, and as such, had a lot of responsibility in design, testing, implementation, and all that other stuff. I learned a ton extremely quickly, and my proficiency with Ruby went through the roof. Juggling the work with school was really difficult and stressful, and by the end, I could feel that I was pretty close to burnout. In between spring and summer semesters, I went to Lambdaconf in Boulder as a vacation of sorts. I had a ton of fun and plan on returning in 2016! The conference gave me a ton of confidence in the future of functional programming, and Haskell specifically. I spent the summer juggling Ruby and JavaScript at work, C++ for my data structures/algorithms course, and Haskell for my own amusement. I was incredibly fortunate to be contacted for an opportunity on reddit to work for a company in Atlanta doing Haskell as a fall internship. 
This fall, my schedule was entirely composed of senior level intensive classes. Software Engineering, Computer Architecture, Statistics, and Artificial Intelligence. I spent so much time working and studying. The little time I didn’t spend on school, I spent on work. Most of my leisure activity was spent on personal programming projects. In the beginning of September, I was contacted by Google to undergo the recruiting process. I hadn’t intended on doing the “interview prep/job search” thing until December or so! All of my leisure time was now spent on learning Python well enough to interview and studying for the interviews. They flew me out to Mountain View for an on site interview in October, and they’re still not quite sure if they want to hire me yet. To achieve this level of productivity, I dominated my circadian rhythm. A ton of caffeine in the morning and early afternoon kept me alert. Flux, melatonin, Valerian root extract, and (more frequently than I’d like) alcohol got me to sleep despite the caffeine. Soylent became my breakfast (if I didn’t skip it), and sometimes dinner. Cycling saved me some time over bussing, and removed the need for explicit exercise. All-nighters became more frequent as the semester went on. The two weeks before finals, I think I pulled four. Finally, exams and projects were all complete, and the semester was over. I won. I was hoping to dial back the intensity over the summer, as I’d gotten close to burnout, and didn’t have much of a break. Instead, I’d only increased it. Next semester, I’ll be taking 17 hours in order to graduate on time, in addition to working part time at my Haskell internship. I only anticipate my stress levels to increase. Fortunately, I have a bunch of job opportunities lined up, many of which seem very promising, interesting, and fun. It’s a little too early to tell precisely, but it certainly seems like I’ve won the “change careers into software development” game. 
You might have noticed that my 2015 retrospective only really spoke of programming. This blog is mostly for programming, but that’s certainly not the only thing I do… or, rather, it wasn’t the only thing I did. Many of my relationships faltered this year, and some failed entirely. I didn’t spend much time with friends. I didn’t touch music until just a few days ago. I didn’t lift. I didn’t read any non-technical thing for pleasure. Honestly, it sucked. But it worked? Here’s hoping that the latter half of 2016 is better. I know the first half won’t be. Thu, 31 Dec 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/12/31/2015_retrospective.html https://www.parsonsmatt.org/2015/12/31/2015_retrospective.html Exploratory Haskell <h2 id="it-doesnt-have-to-be-so-thought-out">It doesn’t have to be so thought out.</h2> A lot of people think that Haskell is great for expressing a problem that you understand really well, but it’s not so great for sketching out a problem and prototyping. In Ruby, you can start writing code that kinda works, and refine it to be more and more correct. In Haskell, you really have to get the code to type check before you do anything else, and if you don’t have a good idea of your data model, then you don’t have types to work with, and you can’t really make it all work. Right? I’m fortunate enough to be able to pick whatever language I want to solve programming problems in my Artificial Intelligence course at UGA. One assignment was to implement the <a href="https://en.wikipedia.org/wiki/DPLL_algorithm">DPLL algorithm</a> from the textbook (Russell and Norvig). I decided to copy the pseudocode, translate it directly into Haskell, and see if I could get to a working solution from there. The Wikipedia article linked does a good job explaining the algorithm, so I’ll just go over the code. 
Very briefly, the algorithm is a way of taking a list of clauses of Boolean logic and determining whether or not there is a way to assign the variables in the clauses to make the whole expression true. The original code is available <a href="https://www.github.com/parsonsmatt/dpll">in this repository</a>. <h2 id="the-pseudocode">The Pseudocode</h2> Here is the pseudocode for the algorithm, reproduced from Norvig and Russell’s textbook: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>function DPLL(clauses, symbols, model ) returns true or false if every clause in clauses is true in model then return true if some clause in clauses is false in model then return false P, value ← FIND-PURE-SYMBOL (symbols, clauses, model) if P is non-null then return DPLL(clauses, symbols – P, model ∪ {P =value}) P, value ← FIND-UNIT-CLAUSE (clauses, model) if P is non-null then return DPLL(clauses, symbols – P, model ∪ {P =value}) P ← FIRST (symbols) rest ← REST (symbols) return DPLL(clauses, rest, model ∪ {P =true}) or DPLL(clauses, rest, model ∪ {P =false})) </code></pre></div></div> Now let’s make it Haskell! We’ll translate names and terms directly and assume functions exist that we can use. We already know that <code class="language-plaintext highlighter-rouge">P</code> can be nullable, so we’ll go ahead and use <code class="language-plaintext highlighter-rouge">Maybe</code> for that type. Instead of <code class="language-plaintext highlighter-rouge">if</code> checking, we’ll pattern match on the result. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dpll :: Clauses -> Symbols -> Model -> Bool
dpll clauses symbols model =
    if all (`isTrueIn` model) clauses
       then True
       else
           if any (`isFalseIn` model) clauses
              then False
              else
                  case findPureSymbol symbols clauses model of
                       (Just sym, val) ->
                           dpll clauses (symbols `minus` sym)
                                        (model `including` (sym := val))
                       (Nothing, val) ->
                           case findUnitClause clauses model of
                                (Just sym, val) ->
                                    dpll clauses (symbols `minus` sym)
                                                 (model `including` (sym := val))
                                (Nothing, val) ->
                                    case symbols of
                                         (x:xs) ->
                                             dpll clauses xs (model `including` (x := False))
                                             || dpll clauses xs (model `including` (x := True))
                                         [] ->
                                             False -- should not happen?
</code></pre></div></div>
This doesn’t compile or type check at all. Despite that, <code class="language-plaintext highlighter-rouge">hlint</code> is capable of suggesting code improvements, and we can follow those to a much nicer looking implementation. Here’s a mostly syntactic transformation of the above code:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dpll :: Clauses -> Symbols -> Model -> Bool
dpll clauses symbols model
    | all (`isTrueIn` model) clauses = True
    | any (`isFalseIn` model) clauses = False
    | otherwise =
        case findPureSymbol symbols clauses model of
             Just (sym := val) ->
                 dpll clauses (symbols `minus` sym)
                              (model `including` (sym := val))
             Nothing ->
                 case findUnitClause clauses model of
                      Just (sym := val) ->
                          dpll clauses (symbols `minus` sym)
                                       (model `including` (sym := val))
                      Nothing ->
                          case symbols of
                               (x:xs) ->
                                   dpll clauses xs (model `including` (x := False))
                                   || dpll clauses xs (model `including` (x := True))
                               [] ->
                                   False -- really why
</code></pre></div></div>
The “pattern matching on Maybe” approach is kind of ugly.
Let’s extract those branches into named subexpressions and use <code class="language-plaintext highlighter-rouge">maybe</code> from <code class="language-plaintext highlighter-rouge">Data.Maybe</code> to choose which branch to go on, and pattern match directly on the list at the function definition rather than later on. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dpll :: Clauses -> Symbols -> Model -> Bool dpll clauses symbols@(x:xs) model | all (`isTrueIn` model) clauses = True | any (`isFalseIn` model) clauses = False | otherwise = maybe pureSymbolNothing pureSymbolJust (findPureSymbol symbols clauses model) where pureSymbolJust (sym := val) = dpll clauses (symbols `minus` sym) (model `including` (sym := val)) pureSymbolNothing = maybe unitClauseNothing unitClauseJust (findUnitClause clauses model) unitClauseJust (sym := val) = dpll clauses (symbols `minus` sym) (model `including` (sym := val)) unitClauseNothing = dpll clauses xs (model `including` (x := False)) || dpll clauses xs (model `including` (x := True)) </code></pre></div></div> We’ve eliminated the manual pattern matching. While more astute programmers might have noticed what’s going on with the previous form, I didn’t quite catch on to the overall pattern. Note that we’re doing the same thing in every terminal branch: we recursively call <code class="language-plaintext highlighter-rouge">dpll</code>, and only the variable assignment changes! More than that, we’re expressing the idea of “Try this thing. If it succeeds, take the result. 
If it fails, try the next thing.” The <code class="language-plaintext highlighter-rouge">Maybe</code> monad captures a similar idea, but isn’t quite what we want – that returns <code class="language-plaintext highlighter-rouge">Nothing</code> if any are <code class="language-plaintext highlighter-rouge">Nothing</code>, and really, we want to take the first <code class="language-plaintext highlighter-rouge">Just</code> value and do something with it. <code class="language-plaintext highlighter-rouge">Maybe</code> has an <code class="language-plaintext highlighter-rouge">Alternative</code> instance, which does exactly what we want: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>> Nothing <|> Just 10 <|> Nothing <|> Just 2 Just 10 </code></pre></div></div> One of the cool things we can do with an <code class="language-plaintext highlighter-rouge">Alternative</code> is the <a href="https://hackage.haskell.org/package/base-4.8.1.0/docs/Data-Foldable.html#v:asum"><code class="language-plaintext highlighter-rouge">asum</code></a> function. We give it a <code class="language-plaintext highlighter-rouge">Foldable t</code> full of <code class="language-plaintext highlighter-rouge">Alternative f => f a</code> values, and it gives us back an <code class="language-plaintext highlighter-rouge">f a</code> based on the definition of <code class="language-plaintext highlighter-rouge">Alternative</code> for our <code class="language-plaintext highlighter-rouge">f</code>. We can very succinctly express the idea “Try these actions and take the first successful one” with this! Alright, let’s refactor our code to use that idea. 
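To make <code class="language-plaintext highlighter-rouge">asum</code> concrete, here is a small standalone sketch. The three lookup functions are invented for illustration and have nothing to do with DPLL; the last one calls <code class="language-plaintext highlighter-rouge">error</code> to show that, for <code class="language-plaintext highlighter-rouge">Maybe</code>, alternatives after the first <code class="language-plaintext highlighter-rouge">Just</code> are never evaluated.

```haskell
import Data.Foldable (asum)

-- Hypothetical lookups standing in for "try this, then that".
fromCache, fromDisk, fromNetwork :: String -> Maybe Int
fromCache _   = Nothing
fromDisk key  = if key == "answer" then Just 42 else Nothing
fromNetwork _ = error "never evaluated when an earlier lookup succeeds"

-- Take the first lookup that yields a Just, trying them left to right.
firstHit :: String -> Maybe Int
firstHit key = asum [fromCache key, fromDisk key, fromNetwork key]

main :: IO ()
main = print (firstHit "answer")  -- Just 42
```

For a list of <code class="language-plaintext highlighter-rouge">Maybe</code> values, <code class="language-plaintext highlighter-rouge">asum</code> behaves like chaining every element with <code class="language-plaintext highlighter-rouge"><|></code>, ending in <code class="language-plaintext highlighter-rouge">Nothing</code> if the list runs out.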
We’re using the <code class="language-plaintext highlighter-rouge">Maybe</code> functor all over the place, so to bring <code class="language-plaintext highlighter-rouge">dpll</code> in line with the <code class="language-plaintext highlighter-rouge">find</code> functions, we’ll return a <code class="language-plaintext highlighter-rouge">Maybe Model</code>. This makes the function even more useful – now you get back the model instead of just whether or not the statement is satisfiable! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dpll :: Clauses -> Symbols -> Model -> Maybe Model dpll clauses symbols model | all (`isTrueIn` model) clauses = Just model | any (`isFalseIn` model) clauses = Nothing | otherwise = let controlList :: [Maybe Assignment] controlList = [ findPureSymbol symbols clauses model , findUnitClause clauses model , (:= False) <$> listToMaybe symbols , (:= True) <$> listToMaybe symbols ] </code></pre></div></div> <code class="language-plaintext highlighter-rouge">controlList</code> is our list of possible assignments. We want to take the first one that actually works and do something with it. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> next :: Assignment -> Maybe Model next (sym := val) = dpll clauses (symbols `minus` sym) (model `including` (sym := val)) </code></pre></div></div> The next step is taking the assignment, and recursively calling <code class="language-plaintext highlighter-rouge">dpll</code> with the symbol removed from the symbol collection and the model including the assignment. And now, for the expression that puts everything together: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> in asum (map (>>= next) controlList) </code></pre></div></div> Haskell’s laziness works well for us here. 
In a strict language, this code would generate all possible models before picking the first one. We can really leverage Haskell’s laziness to make this efficient and fast. <h2 id="uh-but-it-still-doesnt-compile">“uh but it still doesn’t compile”</h2> So we’ve taken a pseudocode algorithm, translated it to Haskell, and applied a bunch of Haskell idioms to it. There are no type definitions, and the code doesn’t compile, and virtually no functions are defined. Now I guess we’ll want to fill in those lower level details with stuff that fits what we want. Let’s analyze what we’re doing with the terms to get some idea of what data types will fit for them. A symbol could be anything – but we’ll just use a String for now. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Symbol = String
</code></pre></div></div> I’ve used <code class="language-plaintext highlighter-rouge">listToMaybe</code> on the symbols. This betrays my intent – <code class="language-plaintext highlighter-rouge">Symbols</code> is going to be a list. That gives us enough information to define <code class="language-plaintext highlighter-rouge">minus</code> as well. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Symbols = [Symbol]

minus :: Symbols -> Symbol -> Symbols
minus xs = (xs \\) . pure
-- or, if you don't like point-free (use one definition or the other, not both):
-- minus xs x = xs \\ [x]
</code></pre></div></div> The <code class="language-plaintext highlighter-rouge">Clauses</code> type is a collection of expressions. An expression is a list of terms in <a href="https://en.wikipedia.org/wiki/Conjunctive_normal_form">Conjunctive Normal Form (CNF)</a>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Clauses = [CNF]
</code></pre></div></div> With CNF, juxtaposition is conjunction. 
We can therefore represent a CNF expression as a list of literals: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type CNF = [Literal] </code></pre></div></div> Finally, a literal is a pair of Sign and Symbol. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Literal = (Sign, Symbol) </code></pre></div></div> A sign is a function that either negates or doesn’t negate a boolean expression. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Sign = Pos | Neg deriving (Eq, Ord, Show) apply :: Sign -> Bool -> Bool apply Pos = id apply Neg = not -- some helper functions for testing n :: Symbol -> Literal n c = (Neg, c) p :: Symbol -> Literal p c = (Pos, c) </code></pre></div></div> We can now actually construct an example <code class="language-plaintext highlighter-rouge">Clauses</code> to test our function (when it, you know, compiles): <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ex :: Clauses ex = [ [ n "p", p "a", p "c" ] , [ n "a" ] , [ p "p", n "c" ] ] </code></pre></div></div> The main use we have for the clauses type is to map over it with <code class="language-plaintext highlighter-rouge">isTrueIn</code> and <code class="language-plaintext highlighter-rouge">isFalseIn</code>, checking every clause in the list against the model. The model is an assignment of symbols to truth values, and random access time is important. We’ll use a map. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Model = Map Symbol Bool isTrueIn :: CNF -> Model -> Bool isTrueIn cnf model = or (mapMaybe f cnf) where f (sign, symbol) = apply sign <$> Map.lookup symbol model </code></pre></div></div> Here, we’re looking up every symbol in the CNF expression, and applying the literal’s sign to the possible value. 
If there’s no value in the model, then <code class="language-plaintext highlighter-rouge">mapMaybe</code> drops that literal. <code class="language-plaintext highlighter-rouge">or</code> checks to see if there are any <code class="language-plaintext highlighter-rouge">True</code> values in the resulting list. If there are, then the CNF is true in the model. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>isFalseIn :: CNF -> Model -> Bool
isFalseIn cnf model = all not literals
  where
    literals = map f cnf
    f (sign, symbol) =
      apply sign (Map.findWithDefault (apply sign True) symbol model)
</code></pre></div></div> <code class="language-plaintext highlighter-rouge">isFalseIn</code> is trickier – we map over the CNF expression with a default value of the <code class="language-plaintext highlighter-rouge">sign</code> applied to <code class="language-plaintext highlighter-rouge">True</code>. Then, we apply the <code class="language-plaintext highlighter-rouge">sign</code> again to the resulting value, so an unassigned literal always comes out as <code class="language-plaintext highlighter-rouge">True</code> and can’t make the clause false. <code class="language-plaintext highlighter-rouge">all not</code> is a way of saying “every element is false.” Now the compiler is complaining about not recognizing the <code class="language-plaintext highlighter-rouge">:=</code> symbol. As it happens, any infix operator that starts with a <code class="language-plaintext highlighter-rouge">:</code> is a data constructor. We’ll define the data type <code class="language-plaintext highlighter-rouge">Assignment</code> and give it some accessor functions. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Assignment
  = (:=)
  { getSymbol :: Symbol
  , getValue :: Bool
  }

instance Show Assignment where
  show (s := v) = "(" ++ s ++ " := " ++ show v ++ ")"
</code></pre></div></div> An advantage of using a data constructor is that we can pattern match on the values of that constructor. 
This gives us a rather nice definition of the <code class="language-plaintext highlighter-rouge">including</code> function: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>including :: Model -> Assignment -> Model including m (sym := val) = Map.insert sym val m </code></pre></div></div> The final remaining items that aren’t defined are <code class="language-plaintext highlighter-rouge">findPureSymbol</code> and <code class="language-plaintext highlighter-rouge">findUnitClause</code>. From the textbook, <blockquote> Pure symbol heuristic: A pure symbol is a symbol that always appears with the same “sign” in all clauses. For example, in the three clauses (A ∨ ¬ B), (¬ B ∨ ¬ C), and (C ∨ A), the symbol A is pure because only the positive literal appears, B is pure because only the negative literal appears, and C is impure. </blockquote> If a symbol has all negative signs, then the returned assignment is False. If a symbol has all positive signs, then the returned assignment is True. We’ll punt refining the clauses with the model to a future function… <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>findPureSymbol :: Symbols -> Clauses -> Model -> Maybe Assignment findPureSymbol symbols clauses' model = asum (map makeAssignment symbols) where clauses = refinePure clauses' model makeAssignment :: Symbol -> Maybe Assignment makeAssignment sym = (sym :=) <$> negOrPos (signsForSymbol sym) </code></pre></div></div> We’re using <code class="language-plaintext highlighter-rouge">asum</code> again to pick the first assignment that works out. This maps the assignment of the sym variable over the <code class="language-plaintext highlighter-rouge">negOrPos</code> function, which determines whether the symbol should have a True or False assignment. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    signsForSymbol :: Symbol -> [Sign]
    signsForSymbol sym = clauses >>= signsForSymInClause sym

    signsForSymInClause :: Symbol -> CNF -> [Sign]
    signsForSymInClause sym = map fst . filter ((== sym) . snd)
</code></pre></div></div> So we’re filtering the <code class="language-plaintext highlighter-rouge">CNF</code> (a list of <code class="language-plaintext highlighter-rouge">Literal</code>s or <code class="language-plaintext highlighter-rouge">(Sign, Symbol)</code>) to only contain the elements whose second element is equal to the symbol. And then we’re extracting the first element, leaving just the <code class="language-plaintext highlighter-rouge">Sign</code>s. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    negOrPos :: [Sign] -> Maybe Bool
    negOrPos = fmap (`apply` True) . getSingleton . nub

-- Top level: other functions reuse this helper.
getSingleton :: [a] -> Maybe a
getSingleton [x] = Just x
getSingleton _ = Nothing
</code></pre></div></div> Finally, we <code class="language-plaintext highlighter-rouge">nub</code> the list, and if it’s a singleton, then we have a success: applying that one remaining sign to <code class="language-plaintext highlighter-rouge">True</code> gives the value to assign. Now, we’ll define the <code class="language-plaintext highlighter-rouge">findUnitClause</code> function. From the textbook, <blockquote> Unit clause heuristic: A unit clause was defined earlier as a clause with just one literal. In the context of DPLL, it also means clauses in which all literals but one are already assigned false by the model. For example, if the model contains B = true, then (¬B ∨ ¬C) simplifies to ¬C, which is a unit clause. Obviously, for this clause to be true, C must be set to false. The unit clause heuristic assigns all such symbols before branching on the remainder. </blockquote> As above, we’ll punt refining the clauses with the model until later. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>findUnitClause :: Clauses -> Model -> Maybe Assignment findUnitClause clauses' model = assignSymbol <$> firstUnitClause where clauses :: Clauses clauses = refineUnit clauses' model firstUnitClause :: Maybe Literal firstUnitClause = asum (map (getSingleton . mapMaybe ifInModel) clauses) ifInModel :: Literal -> Maybe Literal ifInModel (sign, symbol) = case Map.lookup symbol model of Just val -> if apply sign val then Nothing else Just (sign, symbol) _ -> Just (sign, symbol) assignSymbol :: Literal -> Assignment assignSymbol (sign, symbol) = symbol := apply sign True </code></pre></div></div> This is much simpler than pure symbols! We map assignment over the first unit clause that is found. The first unit clause is found with the <code class="language-plaintext highlighter-rouge">asum</code> technique. We take the list of clauses, and for each clause, first determine what to do if it’s in the model already. If the symbol is in the model, then we check to see if the literal in the clause has a <code class="language-plaintext highlighter-rouge">True</code> or <code class="language-plaintext highlighter-rouge">False</code> value by applying the sign to the value. If the value is True, then we don’t include it. Otherwise, we include the literal in the list. Finally, we attempt to get the singleton list. <code class="language-plaintext highlighter-rouge">asum</code> gets the first clause which satisfies these conditions. Now, in the previous functions, we punted refining the clauses. It’s time to do that. For a pure symbol, the given optimization is (from the book): <blockquote> Note that, in determining the purity of a symbol, the algorithm can ignore clauses that are already known to be true in the model constructed so far. 
For example, if the model contains B = false, then the clause (¬ B ∨ ¬ C) is already true, and in the remaining clauses C appears only as a positive literal; therefore C becomes pure. </blockquote> We’ll start by folding the model and clauses into a new set of clauses. The helper function will go through each symbol in the model, find the relevant clauses, and modify them appropriately. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>refinePure :: Clauses -> Model -> Clauses
refinePure = Map.foldrWithKey f
  where
    f :: Symbol -> Bool -> Clauses -> Clauses
    f sym val = map discardTrue
      where
        discardTrue = filter (not . clauseIsTrue)
        clauseIsTrue (sign, symbol) = symbol == sym && apply sign val
</code></pre></div></div> The optimization from the text for the unit clause is: <blockquote> In the context of DPLL, it also means clauses in which all literals but one are already assigned false by the model. For example, if the model contains B = true, then (¬ B ∨ ¬ C) simplifies to ¬ C, which is a unit clause. Obviously, for this clause to be true, C must be set to false. The unit clause heuristic assigns all such symbols before branching on the remainder. </blockquote> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>refineUnit :: Clauses -> Model -> Clauses
refineUnit clauses model = map refine clauses
  where
    refine :: CNF -> CNF
    refine cnf =
      case allButOneFalse cnf of
        Just (s := True) -> [p s]
        Just (s := False) -> [n s]
        Nothing -> cnf

    allButOneFalse :: CNF -> Maybe Assignment
    allButOneFalse = getSingleton . filter (not . getValue) . map assign

    assign :: Literal -> Assignment
    assign (sign, sym) = sym := Map.findWithDefault (apply sign True) sym model
</code></pre></div></div> If all but one of the literals in the CNF are false, then we return that literal with the proper assignment. Otherwise, we return the whole CNF expression. 
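To make the textbook's rule concrete, here is a simplified, self-contained sketch of the "all literals but one are false" check. It is not the <code class="language-plaintext highlighter-rouge">refineUnit</code> above: the types are redefined locally so it runs on its own, and <code class="language-plaintext highlighter-rouge">literalValue</code> and <code class="language-plaintext highlighter-rouge">forcedAssignment</code> are hypothetical helper names I'm assuming just for this illustration.

```haskell
import qualified Data.Map as Map

-- Local stand-ins for the post's types, so the sketch compiles by itself.
data Sign = Pos | Neg deriving (Eq, Show)
type Literal = (Sign, String)
type Model = Map.Map String Bool

-- Truth value of a literal, or Nothing if its symbol is not in the model yet.
literalValue :: Model -> Literal -> Maybe Bool
literalValue model (sign, sym) = applySign sign <$> Map.lookup sym model
  where
    applySign Pos = id
    applySign Neg = not

-- Hypothetical helper: if no literal is already true and exactly one literal
-- is still unassigned, return the assignment that makes that literal true.
forcedAssignment :: Model -> [Literal] -> Maybe (String, Bool)
forcedAssignment model clause
  | any ((== Just True) . literalValue model) clause = Nothing
  | otherwise =
      case filter ((== Nothing) . literalValue model) clause of
        [(sign, sym)] -> Just (sym, sign == Pos)
        _             -> Nothing
```

With the book's example, <code class="language-plaintext highlighter-rouge">forcedAssignment (Map.fromList [("B", True)]) [(Neg, "B"), (Neg, "C")]</code> evaluates to <code class="language-plaintext highlighter-rouge">Just ("C", False)</code>: ¬B is already false, so C has to be false for the clause to hold.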
Starting from a straight up transcription, we’ve now finally implemented enough to solve problems! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>solved :: Maybe Model solved = dpll ex ["p", "a", "c"] Map.empty </code></pre></div></div> The output will be kind of ugly, so let’s make a pretty printing function: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>showModel :: Model -> String showModel = unlines . map (show . snd) . Map.toList . Map.mapWithKey (:=) </code></pre></div></div> Evaluating <code class="language-plaintext highlighter-rouge">solved</code> in GHCi gives us: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Prelude HW6> putStr . showModel . fromJust $ solved (a := False) (c := False) (p := False) </code></pre></div></div> Assigning the variables with those values does return a true model. Nice! <h2 id="australia">Australia</h2> Given a set of colors, and a map, can you color every state in a way that all touching states have different colors? This is the coloring problem (with a good <a href="https://www.seas.upenn.edu/~cis391/Lectures/CSP-6up.pdf">intro here</a>). As it happens, Australia makes a nice simple model for this, and the three coloring of Australia is easy enough to do by hand that it makes a good model for testing our software out. Let’s define the problem and find the solution! 
First, we’ll want to define our symbols: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>colors :: [Symbol] colors = [green, blue, red] green = "-green" blue = "-blue" red = "-red" states :: [Symbol] states = [ western , southern , northern , queensland , newSouthWales , victoria ] western = "Western" southern = "Southern" northern = "Northern" queensland = "Queensland" newSouthWales = "New South Wales" victoria = "Victoria" </code></pre></div></div> Now we’ll express that a given state can have one color, but only one: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hasColor :: Symbol -> Clauses hasColor st = [ [ p $ st `is` green , p $ st `is` blue , p $ st `is` red ] , [ n $ st `is` blue , n $ st `is` red ] , [ n $ st `is` green , n $ st `is` red ] , [ n $ st `is` green , n $ st `is` blue ] ] </code></pre></div></div> Since our symbols are lists, we can concatenate them together. We don’t want to get the two confused, so we make specialized functions that only work on symbols and clauses, respectively. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>is :: Symbol -> Symbol -> Symbol is = (++) (/\) :: Clauses -> Clauses -> Clauses (/\) = (++) </code></pre></div></div> And, since we’ll often want to take a list of things, apply a function to each, and make a clause out of the whole thing, we’ll alias the <code class="language-plaintext highlighter-rouge">bind</code> function to something that looks kind of like “take the conjunction of this whole set.” <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(/\:) :: Monad m => m a -> (a -> m b) -> m b (/\:) = (>>=) </code></pre></div></div> At first, we’ll simply say that every state has a color. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>initialConditions :: Clauses initialConditions = states /\: hasColor </code></pre></div></div> Next, we’ll say that for a pair of adjacent states, they can’t both be the same color. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>adjNotEqual :: (Symbol, Symbol) -> Clauses adjNotEqual (a, b) = colors /\: bothAreNot where bothAreNot color = [ [ n $ a `is` color , n $ b `is` color ] ] </code></pre></div></div> Next, a list of adjacent states… <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>adjStates :: [(Symbol, Symbol)] adjStates = [ (western, northern) , (western, southern) , (northern, southern) , (northern, queensland) , (southern, newSouthWales) , (southern, victoria) , (southern, queensland) , (newSouthWales, queensland) , (newSouthWales, victoria) ] adjacentStatesNotEqual :: Clauses adjacentStatesNotEqual = adjStates /\: adjNotEqual australiaClauses :: Clauses australiaClauses = initialConditions /\ adjacentStatesNotEqual australiaSymbols :: Symbols australiaSymbols = is <$> states <*> colors </code></pre></div></div> Now, we’ve finally accomplished the encoding, and we can get the solution. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>australiaSolution :: Maybe Model australiaSolution = dpll australiaClauses australiaSymbols mempty </code></pre></div></div> It can be printed with the following function: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>showOnlyTrue :: Model -> String showOnlyTrue = unlines . map (show . snd) . filter (getValue . snd) . Map.toList . 
Map.mapWithKey (:=) printAustralia :: IO () printAustralia = do let model = fromJust australiaSolution putStrLn (showOnlyTrue model) </code></pre></div></div> Which, when evaluated in GHCi, gives us: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Prelude HW6> printAustralia (New South Wales-red := True) (Northern-red := True) (Queensland-blue := True) (Southern-green := True) (Victoria-blue := True) (Western-blue := True) </code></pre></div></div> This can be verified manually to be a correct coloring of Australia. Well, we started with a bunch of pseudocode, transcribed it directly into Haskell syntax, refactored it blindly until it was nice and idiomatic, and finally started implementing the details we’d assumed. This is rather different from how I usually approach solving problems in Haskell, and it was pretty fun. Wed, 09 Dec 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/12/09/exploratory_haskell.html https://www.parsonsmatt.org/2015/12/09/exploratory_haskell.html An Intuition on Context II <h2 id="recapping">Recapping…</h2> <a href="https://www.parsonsmatt.org/2015/11/24/an_intuition_on_context.html">The previous post</a> discussed how information can be stored in types. This information forms a “computational context” that we can take advantage of in new and interesting ways. Now, we’re going to see how that relates to the functor/applicative/monad stuff. <h2 id="functor">Functor</h2> A functor is a combination of some context and a function <code class="language-plaintext highlighter-rouge">fmap</code> that allows us to lift a function into that context. The <code class="language-plaintext highlighter-rouge">fmap</code> function can’t access any information in the context, and it’s not allowed to change the context. 
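As a quick illustration of "not allowed to change the context", here is a small sketch using the standard <code class="language-plaintext highlighter-rouge">Functor</code> instances from base. The value names are made up for the example. In every case the function only touches the result value, and the shape around it stays put.

```haskell
-- Mapping over a list changes the elements, never the number of elements.
sameLength :: Bool
sameLength = length (fmap (* 2) [1, 2, 3 :: Int]) == length [1, 2, 3 :: Int]

-- A Nothing stays a Nothing: fmap cannot invent a value.
keepsNothing :: Maybe Int
keepsNothing = fmap (+ 1) Nothing

-- A Left stays exactly the same Left: the error is part of the context.
keepsLeft :: Either String Int
keepsLeft = fmap (+ 1) (Left "boom")

-- The first component of a tuple is context too, so only the second changes.
keepsLog :: (String, Int)
keepsLog = fmap (+ 1) ("log", 41) -- ("log", 42)
```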
A more formal way of expressing that are the Functor Laws: <ol> <li><code class="language-plaintext highlighter-rouge">fmap id === id</code></li> <li><code class="language-plaintext highlighter-rouge">fmap (f . g) === fmap f . fmap g</code></li> </ol> (where <code class="language-plaintext highlighter-rouge">id</code> is the identity function and returns its argument unchanged) The functor type class definition is presented here: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class Functor f where fmap :: (a -> b) -> f a -> f b -- or, since function arrows associate to the right, fmap :: (a -> b) -> (f a -> f b) </code></pre></div></div> We define a function that takes a normal function as input, and returns a new function that operates over the context. We can write an <code class="language-plaintext highlighter-rouge">fmap</code> for all of the contexts in the prior post: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fmap :: (a -> b) -> List a -> List b fmap _ Nil = Nil fmap f (Cons head rest) = Cons (f head) (fmap f rest) fmap :: (a -> b) -> Maybe a -> Maybe b fmap _ Nothing = Nothing fmap f (Just a) = Just (f a) fmap :: (a -> b) -> Either e a -> Either e b fmap _ (Left e) = Left e fmap f (Right a) = Right (f a) fmap :: (a -> b) -> (e, a) -> (e, b) fmap f (e, a) = (e, f a) </code></pre></div></div> There’s an intuition of functors/monads as containers, and this intuition works for the above types. The intuition breaks down when we start thinking about Reader and State, which are functions. The ‘container’ intuition also breaks down with certain containers, like <code class="language-plaintext highlighter-rouge">Set</code>. Thinking of a function as a container is pretty difficult, and there are other functors where the container metaphor breaks down completely. First, why isn’t <code class="language-plaintext highlighter-rouge">Set</code> a valid functor? 
A Set is an unordered collection of elements where duplicates are discarded. “Duplicates are discarded” sounds like a likely plan of attack! <a href="https://www.schoolofhaskell.com/user/chad/snippets/random-code-snippets/set-is-not-a-functor">Michael Snoyman’s code snippet</a> shows the simplest case. This violation of the Functor laws is enough to say that a Set can’t be a functor, despite being a container. Now, for the function functors! Let’s define <code class="language-plaintext highlighter-rouge">fmap</code> for Reader. It’s not as trivial as the above, where we just pattern match directly on the constructor and apply the <code class="language-plaintext highlighter-rouge">f</code>. We’ll take a little detour of using typed-holes to discover the implementation. We know already that we’ll be constructing a <code class="language-plaintext highlighter-rouge">Reader</code> as the return type, so we can go ahead and fill that in. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fmap :: (a -> b) -> Reader r a -> Reader r b fmap f readFn = Reader _f </code></pre></div></div> That <code class="language-plaintext highlighter-rouge">_f</code> has the type <code class="language-plaintext highlighter-rouge">r -> b</code>, as GHC is happy to tell us. <code class="language-plaintext highlighter-rouge">readFn</code> is a <code class="language-plaintext highlighter-rouge">Reader r a</code>, and we’ve got a <code class="language-plaintext highlighter-rouge">runReader</code> function with the signature <code class="language-plaintext highlighter-rouge">Reader r a -> (r -> a)</code>. Lastly, we have an <code class="language-plaintext highlighter-rouge">a -> b</code>. We’ll introduce an <code class="language-plaintext highlighter-rouge">r</code> using a lambda! 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fmap f readFn = Reader (\r -> _f) </code></pre></div></div> Now we can do <code class="language-plaintext highlighter-rouge">runReader readFn</code> to get an <code class="language-plaintext highlighter-rouge">r -> a</code>, apply the <code class="language-plaintext highlighter-rouge">r</code> to the function to get an <code class="language-plaintext highlighter-rouge">a</code>, and apply <code class="language-plaintext highlighter-rouge">a</code> to the <code class="language-plaintext highlighter-rouge">f</code> to get a <code class="language-plaintext highlighter-rouge">b</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fmap f readFn = Reader (\r -> f (runReader readFn r)) -- or, fmap f readFn = Reader (f . runReader readFn) </code></pre></div></div> Note that the function we’re mapping over <code class="language-plaintext highlighter-rouge">Reader</code> doesn’t get to see the environment. It has no access to the context of the computation, just the result value. <code class="language-plaintext highlighter-rouge">State</code> is like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fmap :: (a -> b) -> State s a -> State s b fmap f oldState = State (\stateValue -> let (result, newStateValue) = runState oldState stateValue in (f result, newStateValue) ) </code></pre></div></div> We start with the <code class="language-plaintext highlighter-rouge">State</code> constructor, and since we know we need a function of <code class="language-plaintext highlighter-rouge">s -> (b, s)</code> at the end, we introduce the lambda. 
We run the <code class="language-plaintext highlighter-rouge">oldState</code> with the <code class="language-plaintext highlighter-rouge">stateValue</code> from the lambda, and apply the <code class="language-plaintext highlighter-rouge">f</code> to the result in the return tuple. The <code class="language-plaintext highlighter-rouge">f</code> function in all of these functors is completely unaware of the context. It’s not allowed to alter the context, and it’s not allowed to be informed by the context. <h2 id="applicative">Applicative</h2> The fmap function that functors get is pretty limited. It doesn’t get to know anything about the context, so all it can do is transform each element. The Applicative class introduces the <code class="language-plaintext highlighter-rouge"><*></code> operator, pronounced “apply”. The <code class="language-plaintext highlighter-rouge"><*></code> allows us to start taking advantage of the information present in the context. We also get <code class="language-plaintext highlighter-rouge">pure</code> – a generic way to lift something into the Applicative context. 
<code class="language-plaintext highlighter-rouge">pure</code> and <code class="language-plaintext highlighter-rouge"><*></code> have to follow a big rule: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pure f <*> a === fmap f a </code></pre></div></div> The class definition looks like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class (Functor f) => Applicative f where pure :: a -> f a (<*>) :: f (a -> b) -> f a -> f b </code></pre></div></div> Let’s start with the basics: if we have <code class="language-plaintext highlighter-rouge">Identity (a -> b)</code> and <code class="language-plaintext highlighter-rouge"><*></code> it with <code class="language-plaintext highlighter-rouge">Identity a</code>, then we’ll get an <code class="language-plaintext highlighter-rouge">Identity b</code> back. There’s no extra information to consider, so we don’t have anything extra to do with our applicative powers. Check out the implementation: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Applicative Identity where pure = Identity Identity f <*> a = fmap f a </code></pre></div></div> For <code class="language-plaintext highlighter-rouge">Maybe</code>, we gain a tiny bit of new power. With <code class="language-plaintext highlighter-rouge">Maybe a</code>, we have the information contained in <code class="language-plaintext highlighter-rouge">a</code>, and one extra bit of information: <code class="language-plaintext highlighter-rouge">Nothing</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Applicative Maybe where pure = Just Just f <*> m = fmap f m Nothing <*> _ = Nothing </code></pre></div></div> With Functor, we definitely had to have a function to fmap over the value. 
With Applicative, we can potentially do <code class="language-plaintext highlighter-rouge">Nothing <*> Just 'a'</code>, which results in <code class="language-plaintext highlighter-rouge">Nothing</code>. We’ve gained a way to change the structure! Functors weren’t allowed to change the structure of the thing being mapped over. Applicatives are. When we’re applying, we get a new bit of information, and we’ll want to find out how to use that information effectively. Additionally, being able to apply a function in a functor to values in a functor lets us curry things and work across many values. The idiom is very common, and uses the fmap infix operator <code class="language-plaintext highlighter-rouge"><$></code>. If we have some function that takes a bunch of parameters, and a bunch of <code class="language-plaintext highlighter-rouge">Maybe</code> values, we can apply the function over the Maybes and get a result back only if they’re all <code class="language-plaintext highlighter-rouge">Just</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data SomeType = SomeType Int String Char isJust = SomeType <$> Just 2 <*> Just "ASdf" <*> Just 'a' -- Just (SomeType 2 "ASdf" 'a') isNothing = SomeType <$> Just 6 <*> Nothing <*> Just 'b' -- Nothing </code></pre></div></div> The contextual application says “If any of these contexts is <code class="language-plaintext highlighter-rouge">Nothing</code>, then the whole context is Nothing.” The <code class="language-plaintext highlighter-rouge">Either</code> functor has a similar Applicative instance. There’s a functor very similar to <code class="language-plaintext highlighter-rouge">Either</code> called <code class="language-plaintext highlighter-rouge">Validation</code> which has an interesting <code class="language-plaintext highlighter-rouge">Applicative</code> instance. 
Here is the definition and Functor instance: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Validation e a = Correct a | Errors [e] instance Functor (Validation e) where fmap f (Correct a) = Correct (f a) fmap _ (Errors e) = Errors e instance Functor (Either e) where fmap f (Right a) = Right (f a) fmap _ (Left e) = Left e </code></pre></div></div> They look nearly the same. For reasons that we’ll come to when looking at monads, <code class="language-plaintext highlighter-rouge">Validation</code> cannot form a monad, while <code class="language-plaintext highlighter-rouge">Either</code> can. Despite this, Validation has some nice properties that Either can’t have. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Applicative (Validation e) where pure = Correct Correct f <*> val = fmap f val Errors as <*> Correct _ = Errors as Errors as <*> Errors bs = Errors (as ++ bs) instance Applicative (Either e) where pure = Right Right f <*> a = fmap f a Left e <*> _ = Left e </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">Either</code> Applicative short circuits when it gets a <code class="language-plaintext highlighter-rouge">Left</code> value. It discards everything that comes next. The <code class="language-plaintext highlighter-rouge">Validation</code> applicative collects all the error messages in a list. 
Let’s see how this plays out: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Person = Person Int String type Error = String valGood :: Validation Error Person valGood = Person <$> Correct 27 <*> Correct "Matt" -- Correct (Person 27 "Matt") eitherGood :: Either Error Person eitherGood = Person <$> Right 25 <*> Right "Alice" -- Right (Person 25 "Alice") valError :: Validation Error Person valError = Person <$> Errors ["Age too low"] <*> Errors ["Empty name"] -- Errors ["Age too low", "Empty name"] eitherError :: Either Error Person eitherError = Person <$> Left "Age too low" <*> Left "Empty name" -- Left "Age too low" </code></pre></div></div> <h2 id="the-list-applicative">The List Applicative</h2> Let’s consider <code class="language-plaintext highlighter-rouge">List</code>, now. We have two bits of information in the context of the list: the number of elements in a list, and the order of elements in the list. If we use the number of elements in the list as our extra information, then we can combine the number of elements in some way in the result list. If we use the order of elements, then we can zip the two lists together with function application, pairing the function at index <code class="language-plaintext highlighter-rouge">i</code> with the value at index <code class="language-plaintext highlighter-rouge">i</code>. Let’s implement both! For a ZipList, we have to remember that <code class="language-plaintext highlighter-rouge">pure f <*> a === fmap f a</code> when considering our <code class="language-plaintext highlighter-rouge">pure</code> definition. If we have a single function that we want to zip a list with, then we need to make an infinite list of that function. Otherwise, we might come up short, and then we’d truncate the list, and that would break the law. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype ZipList a = ZipList { unZipList :: List a }

instance Applicative ZipList where
    pure a = ZipList as
      where as = Cons a as  -- an infinite List of `a`, playing the role of `repeat a`
</code></pre></div></div>
The only sensible behavior when zipping two lists together is to truncate the result list to the length of the shorter list, so we’ll return <code class="language-plaintext highlighter-rouge">Nil</code> if we reach the end of either list.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    ZipList Nil <*> _ = ZipList Nil
    _ <*> ZipList Nil = ZipList Nil
</code></pre></div></div>
Finally, if we have a function and an element, we apply the function to the element and continue zipping.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    ZipList (Cons f fs) <*> ZipList (Cons a as) =
        ZipList (Cons (f a) (unZipList (ZipList fs <*> ZipList as)))
</code></pre></div></div>
Here, we’re using the ordering of the lists to inform our application. If one list is shorter than the other, then we truncate the result to the shorter length. While we’re using the ordering of the lists to do application, we’re not changing the ordering of the list. This doesn’t seem like we’re fully taking advantage of the information present in the list. Perhaps if we use the count of elements rather than their order, we can have a more powerful impact. Since the length of the list is a number, we can potentially do numeric operations with the two numbers. We can’t have negative or fractional lengths of lists, so we’re back to natural numbers, and that means we can try addition and multiplication. Exponentiation might be a thing, but let’s not get too crazy. We also have to consider that <code class="language-plaintext highlighter-rouge">pure f <*> as === fmap f as</code>. A <code class="language-plaintext highlighter-rouge">pure f <*> as</code> can’t change the length of the list, otherwise it violates the law above.
This means that our operation (<code class="language-plaintext highlighter-rouge">+</code> or <code class="language-plaintext highlighter-rouge">*</code>) determines how <code class="language-plaintext highlighter-rouge">pure</code> behaves. If you’re familiar with the idea of a monoid, you might notice that this seems really familiar. In fact, one approach to Applicatives is that they are a: <blockquote> *drumroll please* </blockquote> <blockquote> (you don’t actually need to know this, but the paper might be fun!) </blockquote> <blockquote> <a href="https://strictlypositive.org/Idiom.pdf">strong lax monoidal endofunctor</a> </blockquote> A monoid is the combination of an associative operator <code class="language-plaintext highlighter-rouge"><></code> and an identity element (called unit, or <code class="language-plaintext highlighter-rouge">mempty</code> in Haskell), where the following laws hold: <ul> <li><code class="language-plaintext highlighter-rouge">unit <> x === x</code> and <code class="language-plaintext highlighter-rouge">x <> unit === x</code></li> <li><code class="language-plaintext highlighter-rouge">x <> (y <> z) === (x <> y) <> z</code></li> </ul> Fortunately, addition and multiplication both form a monoid with natural numbers. Observe! <ul> <li><code class="language-plaintext highlighter-rouge">1 * x === x</code>, <code class="language-plaintext highlighter-rouge">x * 1 === x</code>, and <code class="language-plaintext highlighter-rouge">x * (y * z) === (x * y) * z</code></li> <li><code class="language-plaintext highlighter-rouge">0 + x === x</code>, <code class="language-plaintext highlighter-rouge">x + 0 === x</code>, and <code class="language-plaintext highlighter-rouge">x + (y + z) === (x + y) + z</code></li> </ul> So the combination of the operation using the information of the context and the means of lifting a value form a monoid. 
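The laws above are easy to spot-check. Here is a minimal sketch, assuming the <code class="language-plaintext highlighter-rouge">Sum</code> and <code class="language-plaintext highlighter-rouge">Product</code> newtypes from <code class="language-plaintext highlighter-rouge">Data.Monoid</code> in base, which wrap a number and pick addition or multiplication as <code class="language-plaintext highlighter-rouge"><></code>. The names <code class="language-plaintext highlighter-rouge">sumLaws</code>, <code class="language-plaintext highlighter-rouge">productLaws</code>, and <code class="language-plaintext highlighter-rouge">units</code> are made up for illustration, and testing a few sample values is evidence rather than a proof.

```haskell
import Data.Monoid (Product (..), Sum (..))

-- Identity and associativity for addition, checked at a few sample values.
sumLaws :: Bool
sumLaws = and
  [ mempty <> x == x
  , x <> mempty == x
  , x <> (y <> z) == (x <> y) <> z
  ]
  where
    x = Sum 2 :: Sum Int
    y = Sum 3
    z = Sum 4

-- The same three checks for multiplication.
productLaws :: Bool
productLaws = and
  [ mempty <> x == x
  , x <> mempty == x
  , x <> (y <> z) == (x <> y) <> z
  ]
  where
    x = Product 2 :: Product Int
    y = Product 3
    z = Product 4

-- The units themselves: 0 for addition, 1 for multiplication.
units :: (Int, Int)
units = (getSum mempty, getProduct mempty)
```

Loading this in GHCi and evaluating <code class="language-plaintext highlighter-rouge">units</code> should show the pair of identity elements that the list lengths have to respect.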
With addition, the unit is <code class="language-plaintext highlighter-rouge">0</code>, and a list of length <code class="language-plaintext highlighter-rouge">0</code> is <code class="language-plaintext highlighter-rouge">Nil</code>. So, our <code class="language-plaintext highlighter-rouge">pure</code> function with the <code class="language-plaintext highlighter-rouge">+</code> operation would be: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pure _ = Nil </code></pre></div></div> Unfortunately, this does not satisfy <code class="language-plaintext highlighter-rouge">pure f <*> as === fmap f as</code>, so <code class="language-plaintext highlighter-rouge">+</code> can’t be used. Our next choice is multiplication, which has a unit of 1. This means that our <code class="language-plaintext highlighter-rouge">pure</code> function will return a list of length 1. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pure a = Cons a Nil </code></pre></div></div> We’ll multiply the lengths of the two lists together to form the new list. For <code class="language-plaintext highlighter-rouge"><*></code>, note that <code class="language-plaintext highlighter-rouge">Nil</code> corresponds to 0, and 0*x = 0. Therefore, if either argument is Nil, then we return <code class="language-plaintext highlighter-rouge">Nil</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(<*>) :: List (a -> b) -> List a -> List b Nil <*> _ = Nil _ <*> Nil = Nil </code></pre></div></div> If we have a single function, this corresponds to <code class="language-plaintext highlighter-rouge">pure f</code>, and so must be equivalent to <code class="language-plaintext highlighter-rouge">fmap f</code>. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Cons f Nil <*> Cons a as = Cons (f a) (fmap f as)
-- or, since fmap/<*> need to be equivalent:
-- Cons f Nil <*> Cons a as = Cons (f a) (pure f <*> as)
</code></pre></div></div>
If we have multiple functions, then we’ll need to fmap each function over the target list and combine the results.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fs <*> as = concatMap (\f -> fmap f as) fs
</code></pre></div></div>
We’ve now mathematically derived both of the Applicative instances for lists, based on an understanding of the information content available in the type. As it happens, this second method is really handy for modeling non-deterministic computation, if we imagine a list as being a bunch of possible values that the variable can take on. If we want to add two numbers, but we’re not sure exactly which numbers we have, we could do:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(+) <$> [1, 4, 3] <*> [6, 8, 7]
</code></pre></div></div>
representing that our first number could be either 1, 4, or 3, and our second number could be 6, 8, or 7. Since we don’t know exactly which numbers we have, we want to get all possible results. What does this end up looking like? Well, <code class="language-plaintext highlighter-rouge"><$></code> is map, so we’ll partially apply <code class="language-plaintext highlighter-rouge">+</code> to each element in the first list.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[(1 +), (4 +), (3 +)] <*> [6, 8, 7]
</code></pre></div></div>
Now, we pair each function with each possible element in the second list, and calculate all of the possibilities for the number: <code class="language-plaintext highlighter-rouge">[7, 9, 8, 10, 12, 11, 9, 11, 10]</code>. <h2 id="the-reader-applicative">The Reader Applicative</h2> What about Reader and State? How can they take advantage of their contextual information to do neat stuff?
Let’s review <code class="language-plaintext highlighter-rouge">Reader</code>, and get the <code class="language-plaintext highlighter-rouge">Applicative</code> instance: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype Reader r a = Reader { runReader :: r -> a } instance Applicative (Reader r) where pure a = Reader (\_ -> a) rf <*> ra = Reader (\r -> let f = runReader rf r a = runReader ra r in f a ) </code></pre></div></div> This allows the two functions to share an environment context. How much information is contained here? The <code class="language-plaintext highlighter-rouge">rf</code> value is <code class="language-plaintext highlighter-rouge">Reader r (a -> b)</code>, which is a newtype for <code class="language-plaintext highlighter-rouge">r -> (a -> b)</code>, which we can express as <code class="language-plaintext highlighter-rouge">(a -> b) :^ r</code>, or <code class="language-plaintext highlighter-rouge">(b :^ a) :^ r</code>. If we’ve got <code class="language-plaintext highlighter-rouge">Reader Circuit (Bool -> Circuit)</code>, then that’s (3^2)^3 = 729 possible implementations. In concrete implementations of the functions, however, we’re going to see essentially three paths: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>readerFn :: Circuit -> (Bool -> Circuit) readerFn High = \b -> if b then Low else High readerFn Low = \b -> if b then Disconnected else Low readerFn Disconnected = \b -> if b then Disconnected else High </code></pre></div></div> Reader, in this case, seems to be a set of contingency plans. “If the environment is high voltage, then do this. If it is low voltage, then do that. 
If the environment is disconnected, then do this other thing.” An applicative chain of Readers is like a single contingency plan, where the <code class="language-plaintext highlighter-rouge">r</code> value in <code class="language-plaintext highlighter-rouge">Reader r a</code> determines the entire plan. As a bit of an aside, since the <code class="language-plaintext highlighter-rouge">r</code> value can’t be changed, <code class="language-plaintext highlighter-rouge"><*></code> with Reader is commutative – the ordering of <code class="language-plaintext highlighter-rouge"><*></code> doesn’t matter as far as the end result is concerned. This means we can potentially execute all the <code class="language-plaintext highlighter-rouge">Reader</code> functions in parallel without changing the result. If the various functions had dependencies on each other, then we wouldn’t be able to execute them in parallel. If we can structure our code like this, then we can easily guarantee that the code we write could be executed in parallel! Later, we’ll learn that adding <code class="language-plaintext highlighter-rouge">Monad</code> powers to <code class="language-plaintext highlighter-rouge">Reader</code> is precisely what makes the guarantee of parallelism impossible. Let’s do an example of the Reader Applicative. We can write a function to express travel and dress plans based on the weather like this.
Take: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Weather = Sunny | Raining | Cold data Transport = Bike | Car data Clothing = BigCoat | Jacket | StretchyPants chooseTransport :: Reader Weather Transport chooseTransport = Reader (\env -> case env of Sunny -> Bike Cold -> Bike Raining -> Car ) chooseClothing :: Reader Weather Clothing chooseClothing = Reader (\env -> case env of Sunny -> StretchyPants Raining -> Jacket Cold -> BigCoat ) data Plan = Plan Transport Clothing whatDo :: Weather -> Plan whatDo = runReader (Plan <$> chooseTransport <*> chooseClothing) </code></pre></div></div> We provide an environment, and this makes the choice for all of the Reader functions. <h2 id="the-state-applicative">The State Applicative</h2> How about State? <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype State s a = State { runState :: s -> (a, s) } instance Applicative (State s) where pure a = State (\s -> (a, s)) </code></pre></div></div> <code class="language-plaintext highlighter-rouge">pure</code>, for a State Applicative, says “Whatever state value you end up giving me, I’ll return this value.” <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> (<*>) :: State s (a -> b) -> State s a -> State s b sf <*> sa = State (\s -> let (f, s') = runState sf s (a, s'') = runState sa s' in (f a, s'') ) </code></pre></div></div> Alright! So first, we construct our <code class="language-plaintext highlighter-rouge">State</code> value. We use a lambda to introduce the first <code class="language-plaintext highlighter-rouge">s</code> state value. We run the <code class="language-plaintext highlighter-rouge">sf</code> State to get the <code class="language-plaintext highlighter-rouge">f</code> function and the new <code class="language-plaintext highlighter-rouge">s'</code> state. 
We run the <code class="language-plaintext highlighter-rouge">sa</code> State to get the <code class="language-plaintext highlighter-rouge">a</code> value, and another new <code class="language-plaintext highlighter-rouge">s''</code> state value. Finally, we apply the function to the value and return the <code class="language-plaintext highlighter-rouge">s''</code> state. Our Reader example had 729 possible implementations, which tells us that the complexity expands a lot. It didn’t really give us much intuition about how the context worked. Instead, it was nicer to think about it in terms of a set of plans that shared a single decision. State is similar, but since we are able to update the state as we go along, we’re able to build a decision tree on the fly, where each function in the applicative sequence gets to update the state. This allows previous stateful things in the application context to have an effect on decisions that happen later. Let’s revisit the weather travel plans and use a stateful approach. State is pretty awkward to use without the <code class="language-plaintext highlighter-rouge">get</code> and <code class="language-plaintext highlighter-rouge">put</code> functions, so we’ll define those and discuss their utility. I’ll also cheat a little bit in the example by using <code class="language-plaintext highlighter-rouge">do</code> notation for the decision functions, but they won’t do anything an Applicative couldn’t do.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>get :: State s s
get = State (\s -> (s, s))

put :: s -> State s ()
put s = State (\_ -> ((), s))
</code></pre></div></div>
<code class="language-plaintext highlighter-rouge">get</code> lets us retrieve the state we’re passed easily, and <code class="language-plaintext highlighter-rouge">put</code> lets us set a new state and ignore the old one entirely.
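To see the two primitives interact, here is a minimal self-contained sketch. It repeats the <code class="language-plaintext highlighter-rouge">State</code> definitions from this section, adds a <code class="language-plaintext highlighter-rouge">Functor</code> instance (an assumption on my part, written the obvious way, since <code class="language-plaintext highlighter-rouge">Applicative</code> requires one), and defines a made-up value <code class="language-plaintext highlighter-rouge">beforeAndAfter</code> that reads the state, overwrites it, and reads it again using only Applicative operators.

```haskell
newtype State s a = State { runState :: s -> (a, s) }

-- Mapping changes the result and leaves the state alone.
instance Functor (State s) where
  fmap f sa = State (\s -> let (a, s') = runState sa s in (f a, s'))

instance Applicative (State s) where
  pure a = State (\s -> (a, s))
  sf <*> sa = State (\s ->
    let (f, s')  = runState sf s
        (a, s'') = runState sa s'
    in  (f a, s''))

get :: State s s
get = State (\s -> (s, s))

put :: s -> State s ()
put s = State (\_ -> ((), s))

-- Hypothetical example: read the state, then replace it with 10 and read again.
-- `*>` runs both computations in order and keeps the second result.
beforeAndAfter :: State Int (Int, Int)
beforeAndAfter = (,) <$> get <*> (put 10 *> get)

-- runState beforeAndAfter 1 == ((1, 10), 10)
```

The first <code class="language-plaintext highlighter-rouge">get</code> sees the initial state, while the second one sees whatever <code class="language-plaintext highlighter-rouge">put</code> left behind.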
Ok, on to the example:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Decision a = State WeatherState a
type WeatherState = (Weather, Grumpiness)
type Grumpiness = Int

data Plan' = Plan' Transport Clothing Food
data Food = HotSoup | SubSandwich | IceCream

chooseTransport' :: Decision Transport
chooseTransport' = do
    (weather, grumpy) <- get
    if grumpy > 5
        then return Car
        else case weather of
            Sunny -> do
                put (weather, grumpy - 2)
                return Bike
            _ -> do
                put (weather, grumpy + 6)
                return Car
</code></pre></div></div>
So if our grumpiness level is over 5, we just take the car. Otherwise, if the weather is Sunny, we’ll reduce our grumpiness a bit, and take the bike. If the weather is anything else, we increase our grumpiness and take the car.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>chooseClothing' :: Decision Clothing
chooseClothing' = do
    (weather, _) <- get
    return (runReader chooseClothing weather)
</code></pre></div></div>
Here, we just reuse the plan from the <code class="language-plaintext highlighter-rouge">Reader</code> example.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>chooseFood :: Decision Food
chooseFood = do
    (weather, grumpy) <- get
    if grumpy > 7
        then do
            put (weather, grumpy - 1)
            return IceCream
        else case weather of
            Cold -> return HotSoup
            _ -> return SubSandwich
</code></pre></div></div>
Now, if we’re really grumpy, then we’ll eat ice cream. That cheers us up, so we reduce our grumpiness. Otherwise, we get a soup or sandwich depending on the weather.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>makePlan :: Decision Plan'
makePlan = Plan' <$> chooseTransport' <*> chooseClothing' <*> chooseFood

-- evalState runs the computation and keeps only the result:
-- evalState m s = fst (runState m s)
runPlan :: WeatherState -> Plan'
runPlan state = evalState makePlan state
</code></pre></div></div>
We’re using the Applicative instance in <code class="language-plaintext highlighter-rouge">makePlan</code>. So far, we’ve been working with the intuition that applicative application allows contexts to interact in the final value. The context of State is the stateful information surrounding the computation. Using functor to map over the stateful computation didn’t touch the state, it just altered the return value. Applicative application, on the other hand, altered the state that was being passed around. The sequencing of functions matters here, and unlike Reader, we’re not free to reorganize the functions however we please. That means we couldn’t let the computer easily decide how to parallelize the operations! We’ve seen that <code class="language-plaintext highlighter-rouge">Either</code> had to short-circuit, and only pick up the first error, while <code class="language-plaintext highlighter-rouge">Validation</code> can collect all of the errors, and this behavior is more useful in some contexts. <code class="language-plaintext highlighter-rouge">Either</code> can be made into a Monad, while <code class="language-plaintext highlighter-rouge">Validation</code> can’t be. Reader (and similar structures) can be easily parallelized, while <code class="language-plaintext highlighter-rouge">State</code> can’t be. There are trade-offs here – sometimes, when we add more power, we lose something else. To summarize Applicatives and Functors: a functor is a context that we can map over, but we can’t look at the context, and we can’t change the context. An Applicative allows our contexts to interact independently of the values contained in the contexts.
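The contrast between Reader and State drawn above can be checked on a toy example. This is only a sketch under a few assumptions: plain functions <code class="language-plaintext highlighter-rouge">Int -> a</code> stand in for <code class="language-plaintext highlighter-rouge">Reader Int a</code> (base already gives functions an Applicative instance that shares the argument), <code class="language-plaintext highlighter-rouge">State</code> is redefined locally with a <code class="language-plaintext highlighter-rouge">Functor</code> instance I’ve added, and <code class="language-plaintext highlighter-rouge">plusOne</code>, <code class="language-plaintext highlighter-rouge">timesTwo</code>, <code class="language-plaintext highlighter-rouge">tick</code>, and <code class="language-plaintext highlighter-rouge">double</code> are made-up names.

```haskell
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f sa = State (\s -> let (a, s') = runState sa s in (f a, s'))

instance Applicative (State s) where
  pure a = State (\s -> (a, s))
  sf <*> sa = State (\s ->
    let (f, s')  = runState sf s
        (a, s'') = runState sa s'
    in  (f a, s''))

-- Reader-style computations: both only ever see the shared environment.
plusOne, timesTwo :: Int -> Int
plusOne  = (+ 1)
timesTwo = (* 2)

-- Same pair, with the two effects sequenced in opposite orders.
readerAB, readerBA :: Int -> (Int, Int)
readerAB = (,) <$> plusOne <*> timesTwo
readerBA = flip (,) <$> timesTwo <*> plusOne

-- State-style computations: each returns the current state and then changes it.
tick, double :: State Int Int
tick   = State (\n -> (n, n + 1))
double = State (\n -> (n, n * 2))

-- Again the same pair (tick result, double result), sequenced both ways.
stateAB, stateBA :: State Int (Int, Int)
stateAB = (,) <$> tick <*> double
stateBA = flip (,) <$> double <*> tick

-- readerAB 5 == readerBA 5 == (6, 10)
-- runState stateAB 5 == ((5, 6), 12)
-- runState stateBA 5 == ((10, 5), 11)
```

Swapping the order leaves the Reader result untouched, while the State version produces both a different pair and a different final state.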
What if we want to alter the context based on the values produced by the computation? <h2 id="the-dread-pirate-monad">The Dread Pirate Monad</h2> For instance, if the result of <code class="language-plaintext highlighter-rouge">chooseTransport</code> is <code class="language-plaintext highlighter-rouge">Bike</code>, then I need to wear bike-suitable clothing, regardless of the weather. We could model that with Applicative, but we’d have to keep track of every intermediate result in our state, and that sounds pretty awful. State is a pretty powerful functor, though. Weaker ones like Reader, List, and Maybe can’t simulate that level of decisive power. Let’s review the signatures for the class functions we’ve seen so far:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fmap  :: (a -> b) -> f a -> f b
(<*>) :: f (a -> b) -> f a -> f b
</code></pre></div></div>
<code class="language-plaintext highlighter-rouge">fmap</code> lifts a normal function into the <code class="language-plaintext highlighter-rouge">f</code> context. <code class="language-plaintext highlighter-rouge"><*></code> takes a function in the <code class="language-plaintext highlighter-rouge">f</code> context and a value in the <code class="language-plaintext highlighter-rouge">f</code> context, and combines the two contexts while applying the function to the value. So we can include contextual information by putting a value in the <code class="language-plaintext highlighter-rouge">f</code> context. If we want a function that takes a value <code class="language-plaintext highlighter-rouge">a</code> and returns some contextual information along with a result <code class="language-plaintext highlighter-rouge">b</code>, then we’re looking at:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>context :: a -> f b
</code></pre></div></div>
What does this mean for our functors?
Let’s specialize the signature:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ctxMaybe  :: a -> Maybe b
ctxList   :: a -> [b]
ctxReader :: a -> Reader r b  -- ~ a -> r -> b
ctxState  :: a -> State s b   -- ~ a -> s -> (b, s)
</code></pre></div></div>
For <code class="language-plaintext highlighter-rouge">Maybe</code>, it means: “Give me an <code class="language-plaintext highlighter-rouge">a</code>, and I’ll either return a result <code class="language-plaintext highlighter-rouge">Just b</code> or <code class="language-plaintext highlighter-rouge">Nothing</code>.” For <code class="language-plaintext highlighter-rouge">List</code>, it means: “Give me an <code class="language-plaintext highlighter-rouge">a</code>, and I’ll return a list of results <code class="language-plaintext highlighter-rouge">b</code>.” For <code class="language-plaintext highlighter-rouge">Reader</code>, it means: “Give me an <code class="language-plaintext highlighter-rouge">a</code>, and I’ll return a computation that depends on the environment.” For <code class="language-plaintext highlighter-rouge">State</code>, it means: “Give me an <code class="language-plaintext highlighter-rouge">a</code>, and I’ll return a stateful computation with some result <code class="language-plaintext highlighter-rouge">b</code>.” Let’s assume we have a function <code class="language-plaintext highlighter-rouge">ctx :: a -> f b</code>. What if we use that with <code class="language-plaintext highlighter-rouge">fmap</code>?
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- fmap with b specialized to `f b`
fmap :: (a -> f b) -> f a -> f (f b)
</code></pre></div></div>
We’re close to what we want, but we have two layers of <code class="language-plaintext highlighter-rouge">f</code>. If we have some way to join two layers of structure, then we’d be set. <h2 id="joinbind">Join/Bind</h2> And, in fact, that’s what a monad is.
We can write <code class="language-plaintext highlighter-rouge">bind</code> and <code class="language-plaintext highlighter-rouge">join</code> in terms of each other: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(>>=) :: f a -> (a -> f b) -> f b ma >>= f = join (fmap f ma) join :: f (f a) -> f a join ma = ma >>= id </code></pre></div></div> Rather than explaining it, I want you to play with it. Implement it yourself. Get a good feeling for it. This isn’t a monad tutorial, so I don’t want to explain too much, and instead I’d like to focus on how this new found power affects our interaction with the context. Let’s write the <code class="language-plaintext highlighter-rouge">Monad</code> instance for <code class="language-plaintext highlighter-rouge">Either</code>, and discover why we can’t write one for <code class="language-plaintext highlighter-rouge">Validation</code>! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>instance Monad (Either e) where Right a >>= f = f a Left e >>= _ = Left e instance Monad (Validation e) where Correct a >>= f = f a Errors e >>= f = ... </code></pre></div></div> Well… What do we want to happen? To be faithful to the Applicative instance, we’d like to collect the errors as we go. We don’t have an <code class="language-plaintext highlighter-rouge">a</code>, so we can’t apply the <code class="language-plaintext highlighter-rouge">f</code> to anything. If we can’t apply the <code class="language-plaintext highlighter-rouge">f</code> to anything, then we can’t get any more errors. So all we can do is short-circuit. Since the Monad is tied up in the sequence of events, there’s no way for a monad to consider the entire structure of computation. It has the power to short circuit. It has the power to branch based on results of computations. This added power is also a limitation – there are fewer guarantees you can make about monads. 
Since we can short circuit, we can’t consider the entire computation at once. We must approach it sequentially. Monads are often seen as strictly more powerful, and people understand that to mean strictly more useful. That’s not quite the case.
Sun, 29 Nov 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/11/29/an_intuition_on_context_ii.html
An Intuition on Context I
<h2 id="not-a-monad-tutorial">Not a monad tutorial!</h2> I don’t want to teach you about monads. Instead, I want to give you some intuition and understanding around what is meant by “a functor/monad is like a context.” This post is intended to be accessible to everyone with a basic knowledge of programming. I’ll use minimal Haskell syntax to express some ideas, and I’ll explain the meaning as I go. We’ll start with a way to get information from types, and then we’ll use that to determine how much information can be stored in a context. In another post, I’ll cover what a functor is, and how it relates to the idea of a context, and how to extend that to applicative and monad. <h2 id="types-as-numbers">Types as numbers</h2> If you’re not familiar with the formal notion of a set, a set is an unordered collection of unique things. The set <code class="language-plaintext highlighter-rouge">{ 1, 2, 3 }</code> is equivalent to <code class="language-plaintext highlighter-rouge">{ 3, 1, 2 }</code> and <code class="language-plaintext highlighter-rouge">{ 3, 3, 2, 1 }</code> (since the 3 really only gets counted once). They’re different from arrays and lists, which have an ordering and allow duplicate items. For most purposes, we can think of a type as a set (though this idea breaks down with further scrutiny). We can represent the type <code class="language-plaintext highlighter-rouge">Boolean</code> as a set of possible values <code class="language-plaintext highlighter-rouge">{ True, False }</code>.
When we say “This variable has the <code class="language-plaintext highlighter-rouge">Boolean</code> type,” we’re saying “This variable can have a value that is in the set of Boolean values.” Regardless of what is inside the set, we can count the number of things in a set. In this way, we can treat a set as a number. We’ve already covered the number 2: <code class="language-plaintext highlighter-rouge">Boolean</code>! The number 1 can be represented with the unit type. In Haskell, this is referred to as <code class="language-plaintext highlighter-rouge">()</code>, and it only has one value, also written as <code class="language-plaintext highlighter-rouge">()</code>. We get 0 from the <code class="language-plaintext highlighter-rouge">Void</code> type. There are no values of the void type. Sometimes, the word ‘value’ doesn’t exactly fit what we mean – and the word ‘inhabitant’ gets used instead. We’ll want 3 for some later examples, so we’ll use the following type/set and values for that purpose: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Circuit = High | Low | Disconnected </code></pre></div></div> This is a Haskell declaration that says: <blockquote> I’m creating a new type called <code class="language-plaintext highlighter-rouge">Circuit</code> with three possible data constructors: <code class="language-plaintext highlighter-rouge">High</code>, <code class="language-plaintext highlighter-rouge">Low</code>, <code class="language-plaintext highlighter-rouge">Disconnected</code>. </blockquote> This type represents digital circuits, which can either be <code class="language-plaintext highlighter-rouge">High</code> voltage (a 1), <code class="language-plaintext highlighter-rouge">Low</code> voltage (a 0), or <code class="language-plaintext highlighter-rouge">Disconnected</code> entirely. 
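If you’d like to see the counting happen, here is a minimal sketch. It assumes we add a <code class="language-plaintext highlighter-rouge">deriving</code> clause to <code class="language-plaintext highlighter-rouge">Circuit</code> (not part of the declaration above) so that Haskell can enumerate its values, and the helper name <code class="language-plaintext highlighter-rouge">inhabitants</code> is made up for illustration.

```haskell
-- Same type as above, plus derived instances that let us list every value.
data Circuit = High | Low | Disconnected
  deriving (Show, Eq, Enum, Bounded)

-- Every inhabitant of a type, from the smallest value to the largest.
inhabitants :: (Enum a, Bounded a) => [a]
inhabitants = [minBound .. maxBound]

circuitCount, boolCount, unitCount :: Int
circuitCount = length (inhabitants :: [Circuit])  -- 3
boolCount    = length (inhabitants :: [Bool])     -- 2
unitCount    = length (inhabitants :: [()])       -- 1
```

This trick only works for types with a smallest and largest value, so it can’t count <code class="language-plaintext highlighter-rouge">Void</code>, which has no values to start from.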
As a refresher, here’s a summary of the numbers we have so far, their Haskell declarations, and their set descriptions: <ul> <li>0 <ul> <li>Void = {}</li> <li><code class="language-plaintext highlighter-rouge">data Void</code></li> </ul> </li> <li>1 <ul> <li>Unit = { unit }</li> <li><code class="language-plaintext highlighter-rouge">data () = ()</code></li> </ul> </li> <li>2 <ul> <li>Boolean = { true, false }</li> <li><code class="language-plaintext highlighter-rouge">data Bool = True | False</code></li> </ul> </li> <li>3 <ul> <li>Circuit = { High, Low, Disconnected }</li> <li><code class="language-plaintext highlighter-rouge">data Circuit = High | Low | Disconnected</code></li> </ul> </li> </ul> <h2 id="generics-and-higher-kinded-types">Generics and Higher Kinded Types</h2> Above, we talked about how we can determine the size of a set regardless of what it contains. It’s so often useful to talk about containers irrespective of their contents that virtually all modern programming languages have some facility for generics. Java allows you to define a generic class with <code class="language-plaintext highlighter-rouge"><T></code> notation. Here’s an example of an immutable singly linked list class:
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>public class List<T> {
    final T head;
    final List<T> rest;

    private List(T val, List<T> rest) {
        this.head = val;
        this.rest = rest;
    }

    // Static methods need their own <T>; they can't use the class's type parameter.
    public static <T> List<T> Nil() {
        return new List<T>(null, null);
    }

    public static <T> List<T> Cons(T val, List<T> rest) {
        return new List<T>(val, rest);
    }
}
</code></pre></div></div>
where the <code class="language-plaintext highlighter-rouge">T</code> is a type variable, allowing you to instantiate a <code class="language-plaintext highlighter-rouge">List<Integer></code> or <code class="language-plaintext highlighter-rouge">List<String></code>.
We’re only making the <code class="language-plaintext highlighter-rouge">Nil</code> and <code class="language-plaintext highlighter-rouge">Cons</code> static methods public to help ensure that it’s constructed correctly. Haskell’s own system is quite a bit more powerful and less verbose. If we want to declare a <code class="language-plaintext highlighter-rouge">List</code> type that takes a single type parameter, analogous to the above, we’d do: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data List a = Nil | Cons a (List a) </code></pre></div></div> This data declaration is a bit more complex than the <code class="language-plaintext highlighter-rouge">Circuit</code> declaration. The <code class="language-plaintext highlighter-rouge">data</code> keyword creates a type constructor. <code class="language-plaintext highlighter-rouge">Circuit</code> is a type constructor that takes 0 type arguments. <code class="language-plaintext highlighter-rouge">List</code> is a type constructor that takes 1 type argument. Next up, we create two data constructors. The circuit definition had three data constructors, none of which took any arguments. The <code class="language-plaintext highlighter-rouge">List</code> definition has two: the first being <code class="language-plaintext highlighter-rouge">Nil</code>, and represents an empty list. The second is <code class="language-plaintext highlighter-rouge">Cons</code> which has two fields – the first is a value of type <code class="language-plaintext highlighter-rouge">a</code>, and the second is a value of type <code class="language-plaintext highlighter-rouge">List a</code>. 
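Here is a minimal sketch of how the two data constructors get used on the consuming side: every function over `List a` pattern matches on `Nil` and `Cons`. The helper names `listLength`, `toBuiltin`, and `example` are hypothetical and only serve the illustration.

```haskell
data List a = Nil | Cons a (List a)

-- | Count the elements: one case per data constructor.
listLength :: List a -> Int
listLength Nil           = 0
listLength (Cons _ rest) = 1 + listLength rest

-- | Convert to Haskell's built-in list type, again one case per constructor.
toBuiltin :: List a -> [a]
toBuiltin Nil           = []
toBuiltin (Cons x rest) = x : toBuiltin rest

-- | A three-element list used to exercise the helpers.
example :: List Integer
example = Cons 1 (Cons 2 (Cons 3 Nil))
```

Because the type has exactly two data constructors, two equations are enough for each function to be total.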
So, in Java and Haskell, we can now construct singly linked lists like this: <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>List<Integer> l = List.Cons(1, List.Cons(2, List.Cons(3, List.Nil()))); </code></pre></div></div> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>l :: List Integer l = Cons 1 (Cons 2 (Cons 3 Nil)) </code></pre></div></div> In Haskell, you apply arguments to things by putting them next to each other. So <code class="language-plaintext highlighter-rouge">List</code> is a type constructor, and <code class="language-plaintext highlighter-rouge">List Integer</code> is the type <code class="language-plaintext highlighter-rouge">Integer</code> applied to the type constructor <code class="language-plaintext highlighter-rouge">List</code>. Likewise, <code class="language-plaintext highlighter-rouge">Cons 3 Nil</code> is the data constructor <code class="language-plaintext highlighter-rouge">Cons</code> with the values <code class="language-plaintext highlighter-rouge">3</code> and <code class="language-plaintext highlighter-rouge">Nil</code> applied to it. Let’s simplify, though. List has two fields, one of which is recursively defined. That’s a little tricky for what we’re working with right now. We can get a lot simpler. The <code class="language-plaintext highlighter-rouge">Identity</code> type will suffice as our simplest thing: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Identity a = Identity a </code></pre></div></div> It doesn’t do anything. It just sits there, referencing a single type, and containing a single value of that type. We’ll also want to be able to represent choice: “I have either this or that”. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Either a b = Right b | Left a </code></pre></div></div> When I see “or”, I immediately want to be able to say “and”. We can say “I have this and that” with a pair, or tuple: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data (a, b) = (a, b) </code></pre></div></div> Now, we’re ready to start exploring some ~TYPE MATH~. <h2 id="type-math">Type Math</h2> We have numbers, so we can probably do some sort of math with them! The notion of sets/types as numbers seems to only really apply to the counting numbers. It doesn’t make much sense to think about a set with -1 elements. Likewise, it doesn’t make much sense to think about a set with 3.14 elements. Natural numbers can be added. From addition, we can derive multiplication, and from multiplication, we can derive exponentiation. As it happens, this applies to types, too! <h3 id="addition">Addition</h3> We’ve already seen how to do addition with types. The <code class="language-plaintext highlighter-rouge">|</code> symbol is a way of adding another value to a type. We can explore that with a new definition: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data BoolOrCircuit = B Bool | C Circuit </code></pre></div></div> We’ve declared a type constructor <code class="language-plaintext highlighter-rouge">BoolOrCircuit</code>, and two new data constructors. <code class="language-plaintext highlighter-rouge">B</code> takes an argument of type <code class="language-plaintext highlighter-rouge">Bool</code>, and <code class="language-plaintext highlighter-rouge">C</code> takes an argument of type <code class="language-plaintext highlighter-rouge">Circuit</code>. How many elements are in <code class="language-plaintext highlighter-rouge">BoolOrCircuit</code>? 
Well, we have two possibilities on the <code class="language-plaintext highlighter-rouge">B</code> side: <code class="language-plaintext highlighter-rouge">B True</code> and <code class="language-plaintext highlighter-rouge">B False</code>. On the <code class="language-plaintext highlighter-rouge">C</code> side, we have <code class="language-plaintext highlighter-rouge">C High</code>, <code class="language-plaintext highlighter-rouge">C Low</code>, and <code class="language-plaintext highlighter-rouge">C Disconnected</code> for five total values. 2 + 3 = 5, so this checks out! The single-argument constructor <code class="language-plaintext highlighter-rouge">C</code> has as many values as the type of the argument, and the <code class="language-plaintext highlighter-rouge">|</code> allows us to add the count of values together. Since we arrive at the total number of values by summing the values of each constructor, types like this are known as sum types. We can use <code class="language-plaintext highlighter-rouge">Either</code> to represent this without requiring a new data type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type BoolOrCircuit2 = Either Bool Circuit </code></pre></div></div> <h3 id="multiplication">Multiplication</h3> What if we have multiple arguments? <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data BoolAndCircuit = BC Bool Circuit </code></pre></div></div> <code class="language-plaintext highlighter-rouge">BC</code> here takes two arguments, the first is a <code class="language-plaintext highlighter-rouge">Bool</code> and the second is a <code class="language-plaintext highlighter-rouge">Circuit</code>. 
We can enumerate all the possible values: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>BC True High BC False High BC True Low BC False Low BC True Disconnected BC False Disconnected </code></pre></div></div> And… there are six! This generalizes to an arbitrary number of elements – when we take multiple arguments, we can know the total possible values of the type by taking the product of the values of each type. That’s what is meant when people say “product type.” As with <code class="language-plaintext highlighter-rouge">Either</code> being the general sum type, we can use the general product type to make this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type BoolAndCircuit2 = (Bool, Circuit) </code></pre></div></div> <h3 id="exponentiation">Exponentiation</h3> Exponentiation is a little trickier. We can add two types with <code class="language-plaintext highlighter-rouge">Either a b</code>. And we can multiply them with <code class="language-plaintext highlighter-rouge">(a, b)</code>. How can we raise the type <code class="language-plaintext highlighter-rouge">a</code> to the power of <code class="language-plaintext highlighter-rouge">b</code>? The answer is functions! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Exponent b a = a -> b </code></pre></div></div> If we use the <code class="language-plaintext highlighter-rouge">TypeOperators</code> language extension, we can do: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type b :^ a = a -> b </code></pre></div></div> which makes the relation clearer. Let’s think about the unit type – x^1 is always equal to x, and 1^x is always equal to 1. So <code class="language-plaintext highlighter-rouge">() :^ Bool</code> is like 1^2 which should have only one possible implementation. 
Let’s unpack and implement it: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fn :: () :^ Bool ~ Bool -> () fn True = () fn False = () </code></pre></div></div> In fact, there’s no other possible way to do this. No matter the input, we’ll always return <code class="language-plaintext highlighter-rouge">()</code>. <code class="language-plaintext highlighter-rouge">Bool :^ ()</code> is 2^1, which is equal to 2. So there are two possible implementations of the type signature: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fn1 :: Bool :^ () ~ () -> Bool fn1 () = True fn2 :: Bool :^ () ~ () -> Bool fn2 () = False </code></pre></div></div> If we consider <code class="language-plaintext highlighter-rouge">Bool :^ Bool</code>, we’ll see that there are 4 possible implementations of the function. <code class="language-plaintext highlighter-rouge">Bool :^ Circuit</code> has 8 possible implementations, and <code class="language-plaintext highlighter-rouge">Circuit :^ Bool</code> has 9 possible implementations. <a href="https://chris-taylor.github.io/blog/2013/02/10/the-algebra-of-algebraic-data-types/">This blog post series</a> presents these ideas with much more information and rigor than I do, and if you find it interesting, I’d recommend you check it out! We now have enough background information on types to talk about how they can contain information! <h2 id="information-in-the-context">Information in the context</h2> Alright, so let’s talk about some contexts and the information contained therein. <code class="language-plaintext highlighter-rouge">Identity</code> is the simplest context. It is the context of identity, of sameness. There is no extra information here. <code class="language-plaintext highlighter-rouge">Maybe</code> gives us some more information! 
It looks like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Maybe a = Just a | Nothing </code></pre></div></div> <code class="language-plaintext highlighter-rouge">Just</code> is essentially <code class="language-plaintext highlighter-rouge">Identity</code>, so the <code class="language-plaintext highlighter-rouge">Just</code> constructor doesn’t add any information to <code class="language-plaintext highlighter-rouge">a</code>. <code class="language-plaintext highlighter-rouge">Nothing</code>, however, is added to it, so the type <code class="language-plaintext highlighter-rouge">Maybe a</code> has <code class="language-plaintext highlighter-rouge">1 + a</code> inhabitants. <code class="language-plaintext highlighter-rouge">List</code>, likewise, gives us even more information: in addition to the elements, we have them in a linear order, and we have a count of how many elements there are in the list. How many inhabitants do lists have? For each element in the list, we have <code class="language-plaintext highlighter-rouge">a</code> possible values. So a list of size 0 has 1 value: the empty list. A list of size 1 has <code class="language-plaintext highlighter-rouge">a</code> values. A list of size 2 has <code class="language-plaintext highlighter-rouge">a * a</code> values. A list of size 3 has <code class="language-plaintext highlighter-rouge">a * a * a</code> values, and we end up with the sequence: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1 + a + a^2 + a^3 + a^4 + ... </code></pre></div></div> which is also: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>a^0 + a^1 + a^2 + a^3 + ... </code></pre></div></div> If you read the blog post I linked above, then you’ll get to see how this result can be derived from type arithmetic in a very cool way. 
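For a rough idea of how that derivation goes, here is a sketch that reads the declaration `data List a = Nil | Cons a (List a)` as an equation, writing L(a) for the number of inhabitants of `List a`. Treat it as a formal manipulation, not a proof: `Nil` contributes 1, and `Cons` contributes a product of `a` and the list type itself.

```latex
\begin{aligned}
L(a) &= 1 + a \cdot L(a) \\
     &= 1 + a \cdot \bigl(1 + a \cdot L(a)\bigr) = 1 + a + a^2 \cdot L(a) \\
     &= 1 + a + a^2 + a^3 + \cdots = \sum_{n \ge 0} a^n
\end{aligned}
```

Each substitution of the right-hand side into itself peels off one more term, which matches the "list of size n has a^n values" count above.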
Lists also keep another bit of information around: an idea of sequence, or order, of the elements. So the list context has two bits of information: how many elements, and in what order. There are two more common contexts. We’ll get to see how added information gives us more power and also more complexity. <h3 id="reader">Reader</h3> Reader is a context where we have some read-only environment information. The <code class="language-plaintext highlighter-rouge">Reader</code> type in Haskell is defined like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype Reader r a = Reader { runReader :: r -> a } </code></pre></div></div> Here, we’re defining a new type, calling it <code class="language-plaintext highlighter-rouge">Reader</code>. We know how to compute the possible implementations of a function-arrow – exponentiation! So for a <code class="language-plaintext highlighter-rouge">Reader r a</code> function, we know we have <code class="language-plaintext highlighter-rouge">a ^ r</code> implementations. Adding read-only information to a function, then, exponentially increases the possible ways for the function to work. <h3 id="state">State</h3> Stateful computation is modeled in Haskell as a function that takes some state as input, and produces a result value and a new state. We define it like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype State s a = State { runState :: s -> (a, s) } </code></pre></div></div> This is similar to the definition of Reader above, but we’re also returning a new value of our state type. How many inhabitants does this type have? It’s a bit trickier than Reader, but we’ve got all we need to figure it out. We have a pair, so we’ll multiply <code class="language-plaintext highlighter-rouge">a*s</code>. We have a function, so we’ll exponentiate: <code class="language-plaintext highlighter-rouge">(a*s)^s</code>. 
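These counting rules can be checked by brute force for small types. The sketch below represents a function as a lookup table and enumerates every possible table; the names `allFunctions`, `bools`, `circuits`, `readerCount`, and `stateCount` are hypothetical helpers, not part of any library.

```haskell
data Circuit = High | Low | Disconnected
  deriving (Show, Eq)

bools :: [Bool]
bools = [False, True]

circuits :: [Circuit]
circuits = [High, Low, Disconnected]

-- | Every total function from a finite domain to a finite codomain,
-- written as a table of (input, output) pairs. For each input we pick
-- one output independently, so there are |cod| ^ |dom| tables.
allFunctions :: [a] -> [b] -> [[(a, b)]]
allFunctions dom cod = traverse (\x -> [(x, y) | y <- cod]) dom

-- Reader Circuit Bool wraps Circuit -> Bool: 2 ^ 3 = 8 functions.
readerCount :: Int
readerCount = length (allFunctions circuits bools)

-- State Circuit Bool wraps Circuit -> (Bool, Circuit): (2 * 3) ^ 3 = 216.
stateCount :: Int
stateCount =
  length (allFunctions circuits [(b, c) | b <- bools, c <- circuits])
```

Enumerating tables only works because both types are tiny; the point is just that the exponent really is the size of the input type.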
Adding mutable state to a function multiplies the value’s inhabitants by the state’s inhabitants and then raises that to the power of the state’s inhabitants. Yikes. Considering <code class="language-plaintext highlighter-rouge">State Circuit Bool</code>, we’ve got <code class="language-plaintext highlighter-rouge">(Bool, Circuit) :^ Circuit</code>, which translates to <code class="language-plaintext highlighter-rouge">(2 * 3) ^ 3 = 216</code>. Talk about a huge increase in complexity! <h2 id="next-time">Next time…</h2> We’ve now got a bit of an idea on how basic types and functions can carry information, and can use that to figure out how much information is stored in a generic data type. We can also think about these generic data types as contexts which add some information to the types they contain. In the next post, we’ll Level Up our ability to use these contexts in a generic and reusable way. <a href="https://www.parsonsmatt.org/2015/11/29/an_intuition_on_context_ii.html">Here’s the link to the next post</a> Tue, 24 Nov 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/11/24/an_intuition_on_context.html Template Haskell Is Not Scary <h2 id="a-beginner-tutorial">A Beginner Tutorial</h2> This tutorial is aimed at people who are beginner-intermediate Haskellers looking to learn the basics of Template Haskell. I learned about the power and utility of metaprogramming in Ruby. Ruby metaprogramming is done by constructing source code with string concatenation and having the interpreter run it. There are also some methods that can be used to define methods, constants, variables, etc. In my <a href="https://github.com/parsonsmatt/squirrell/">Squirrell</a> Ruby library designed to make encapsulating SQL queries a bit easier, I have a few bits of metaprogramming to allow for some conveniences when defining classes. 
The idea is that you can define a query class like this: <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class PermissionExample include Squirrell requires :user_id permits :post_id def raw_sql <<SQL SELECT * FROM users INNER JOIN posts ON users.id = posts.user_id WHERE users.id = #{user_id} #{has_post?} SQL end def has_post? post_id ? "AND posts.id = #{post_id}" : "" end end </code></pre></div></div> and by specifying <code class="language-plaintext highlighter-rouge">requires</code> with the symbols you want to require, it will define an instance variable and an attribute reader for you, and raise errors if you don’t pass the required parameter. Accomplishing that was pretty easy. Calling <code class="language-plaintext highlighter-rouge">requires</code> does some bookkeeping with required parameters and then calls this method with the arguments passed: <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def define_readers(args) args.each do |arg| attr_reader arg end end </code></pre></div></div> Which you can kinda read like a macro: take the arguments, and call <code class="language-plaintext highlighter-rouge">attr_reader</code> with each. The magic happens later, where I overrode the <code class="language-plaintext highlighter-rouge">initialize</code> method: <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def initialize(args = {}) return self if args.empty? Squirrell.requires[self.class].each do |k| unless args.keys.include? k fail MissingParameterError, "Missing required parameter: #{k}" end instance_variable_set "@#{k}", args.delete(k) end fail UnusedParameter, "Unspecified parameters: #{args}" if args.any? end </code></pre></div></div> We loop over the arguments provided to <code class="language-plaintext highlighter-rouge">new</code>, and if any required ones are missing, error. 
Otherwise, we set the instance variable associated with the argument, and remove it from the hash. Another approach involves taking a string, and evaluating it in the context of whatever class you’re in: <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def lolwat(your_method, your_string) class_eval "def #{your_method}; puts #{your_string}; end" end </code></pre></div></div> This line of code defines a method with your choice of name and string to print in the context of whatever class is running. <h2 id="wait-this-isnt-haskell-what-am-i-doing-here">wait this isn’t haskell what am i doing here</h2> Metaprogramming in Ruby is mostly based on a textual approach to code. You use Ruby to generate a string of Ruby code, and then you have Ruby evaluate the code. If you’re coming from this sort of background (as I was), then Template Haskell will strike you as different and weird. You’ll think “Oh, I know, I’ll just use quasi quoters and it’ll all work just right.” Nope. You have to think very differently about metaprogramming in Template Haskell. You’re not going to be putting strings together that happen to make valid code. This is Haskell, we’re going to have some compile time checking! <h2 id="constructing-an-ast">Constructing an AST</h2> In Ruby, we built a string, which the Ruby interpreter then parsed, turned into an abstract syntax tree, and interpreted. In Haskell, we’ll skip the string step. We’ll build the abstract syntax tree directly using standard data constructors. GHC will verify that we’re doing everything OK in the construction of the syntax tree, and then it’ll print the syntax tree into our source code before compiling the whole thing. So we get two levels of compile time checking – that we built a correct template, and that we used the template correctly. 
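As a small taste of what "building the syntax tree directly" looks like, here is a sketch that constructs the expression `1 + 2` out of plain data constructors. `Exp`, `InfixE`, `LitE`, `VarE`, `IntegerL`, `mkName`, and `pprint` all come from `Language.Haskell.TH` in the `template-haskell` package; `onePlusTwo` and `rendered` are names made up for this example.

```haskell
import Language.Haskell.TH

-- The expression  1 + 2  assembled from Exp constructors, with no string
-- of source code anywhere: left operand, operator, right operand.
onePlusTwo :: Exp
onePlusTwo =
  InfixE
    (Just (LitE (IntegerL 1)))
    (VarE (mkName "+"))
    (Just (LitE (IntegerL 2)))

-- pprint renders an AST back into source text, which is handy when
-- debugging what a template actually produces.
rendered :: String
rendered = pprint onePlusTwo
```

Because `onePlusTwo` is an ordinary Haskell value, a malformed tree (say, passing a pattern where an expression belongs) is rejected by the type checker before any code is generated.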
One of the nastiest things about textual metaprogramming is that there’s no guarantee that your syntax is right – and it can be really hard to debug when doing more complicated stuff. Programming directly into an AST makes it a lot easier to verify the correctness of what we write. The quasiquoters are a convenience built around AST programming, but I’m of the opinion that you should learn the AST stuff first and then dive into the quoters when you have a good idea of how they work. Alright, so let’s get into our first example. We’ve written a function <code class="language-plaintext highlighter-rouge">bigBadMathProblem :: Int -> Double</code> that takes a lot of time at runtime, and we want to write a lookup table for the most common values. Since we want to ensure that runtime speed is super fast, and we don’t mind waiting on the compiler, we’ll do this with Template Haskell. We’ll pass in a list of common numbers, run the function on each to precompute them, and then finally punt to the function if we didn’t cache the number. Since we want to do something like the <code class="language-plaintext highlighter-rouge">makeLenses</code> function to generate a bunch of declarations for us, we’ll first look at the type of that in the <code class="language-plaintext highlighter-rouge">lens</code> library. Jumping to <a href="https://hackage.haskell.org/package/lens-4.13/docs/Control-Lens-TH.html">the lens docs</a>, we can see that the type of <code class="language-plaintext highlighter-rouge">makeLenses</code> is <code class="language-plaintext highlighter-rouge">Name -> DecsQ</code>. Jumping to <a href="https://hackage.haskell.org/package/template-haskell-2.10.0.0/docs/Language-Haskell-TH.html">the Template Haskell docs</a>, <code class="language-plaintext highlighter-rouge">DecsQ</code> is a type synonym for <code class="language-plaintext highlighter-rouge">Q [Dec]</code>. 
<code class="language-plaintext highlighter-rouge">Q</code> appears to be a monad for Template Haskell, and a <a href="https://hackage.haskell.org/package/template-haskell-2.10.0.0/docs/Language-Haskell-TH.html#t:Dec"><code class="language-plaintext highlighter-rouge">Dec</code></a> is the data type for a declaration. The constructor for making a function declaration is <code class="language-plaintext highlighter-rouge">FunD</code>. We can get started with this! We’ll start by defining our function. It’ll take a list of commonly used values, apply the function to each, and store the result. Finally, we’ll need a clause that passes the value to the math function in the event we don’t have it cached. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>precompute :: [Int] -> DecsQ precompute xs = do -- ....... return [FunD name clauses] </code></pre></div></div> Since <code class="language-plaintext highlighter-rouge">Q</code> is a monad, and <code class="language-plaintext highlighter-rouge">DecsQ</code> is a type synonym for it, we know we can start off with <code class="language-plaintext highlighter-rouge">do</code>. And we know we’re going to be returning a function definition, which, according to the <code class="language-plaintext highlighter-rouge">Dec</code> documentation, has a field for the name of the function and the list of clauses. Now it’s up to us to generate the name and clauses. Names are easy, so we’ll do that first. We can get a name from a string using <code class="language-plaintext highlighter-rouge">mkName</code>. This converts a string into an unqualified name. We’re going to choose <code class="language-plaintext highlighter-rouge">lookupTable</code> as the name of our lookup table, so we can just use that directly. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>precompute xs = do let name = mkName "lookupTable" -- ... 
</code></pre></div></div> Now, we need to apply each variable in <code class="language-plaintext highlighter-rouge">xs</code> to the function named <code class="language-plaintext highlighter-rouge">bigBadMathProblem</code>. This will go in the <code class="language-plaintext highlighter-rouge">[Clause]</code> field, so let’s look at what makes up a <code class="language-plaintext highlighter-rouge">Clause</code>. According to <a href="https://hackage.haskell.org/package/template-haskell-2.10.0.0/docs/Language-Haskell-TH.html#t:Clause">the documentation</a>, a clause is a data constructor with three fields: a list of <code class="language-plaintext highlighter-rouge">Pat</code> patterns, a <code class="language-plaintext highlighter-rouge">Body</code>, and a list of <code class="language-plaintext highlighter-rouge">Dec</code> declarations. The body corresponds to the actual function definition, the <code class="language-plaintext highlighter-rouge">Pat</code> patterns correspond to the patterns we’re matching input arguments on, and the <code class="language-plaintext highlighter-rouge">Dec</code> declarations are what we might find in a <code class="language-plaintext highlighter-rouge">where</code> clause. Let’s identify our patterns first. We’re trying to match on the <code class="language-plaintext highlighter-rouge">Int</code>s directly. Our desired output is going to look something like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>lookupTable 0 = 123.546 lookupTable 12 = 151626.4234 lookupTable 42 = 0.0 -- ... lookupTable x = bigBadMathProblem x </code></pre></div></div> So we need a way to get those <code class="language-plaintext highlighter-rouge">Int</code>s in our <code class="language-plaintext highlighter-rouge">xs</code> variable into a <code class="language-plaintext highlighter-rouge">Pat</code> pattern. 
We need some function <code class="language-plaintext highlighter-rouge">Int -> Pat</code>… Let’s check out <a href="https://hackage.haskell.org/package/template-haskell-2.10.0.0/docs/Language-Haskell-TH.html#t:Pat">the documentation</a> for <code class="language-plaintext highlighter-rouge">Pat</code> and see how it works. The very first pattern is <code class="language-plaintext highlighter-rouge">LitP</code>, which takes an argument of type <code class="language-plaintext highlighter-rouge">Lit</code>. A <code class="language-plaintext highlighter-rouge">Lit</code> is a sum type that has a constructor for the primitive Haskell types. There’s one for <code class="language-plaintext highlighter-rouge">IntegerL</code>, which we can use. So, we can get from <code class="language-plaintext highlighter-rouge">Int -> Pat</code> with the following function: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>intToPat :: Int -> Pat intToPat = LitP . IntegerL . toInteger </code></pre></div></div> Which we can map over the initial list to get our <code class="language-plaintext highlighter-rouge">[Pat]</code>! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>precompute xs = do let name = mkName "lookupTable" patterns = map intToPat xs -- ... return [FunD name clauses] </code></pre></div></div> Our <code class="language-plaintext highlighter-rouge">lookupTable</code> function is only going to take a single argument, so we’ll want to <code class="language-plaintext highlighter-rouge">map</code> our integer <code class="language-plaintext highlighter-rouge">Pat</code>s into <code class="language-plaintext highlighter-rouge">Clause</code>, going from our <code class="language-plaintext highlighter-rouge">[Pat] -> [Clause]</code>. That will get us the <code class="language-plaintext highlighter-rouge">clauses</code> variable that we need. 
From above, a clause is defined like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Clause = Clause [Pat] Body [Dec] </code></pre></div></div> So, our <code class="language-plaintext highlighter-rouge">[Pat]</code> is simple – we only have one literal value we’re matching on. <code class="language-plaintext highlighter-rouge">Body</code> is defined to be either a <code class="language-plaintext highlighter-rouge">GuardedB</code> which uses pattern guards, or a <code class="language-plaintext highlighter-rouge">NormalB</code> which doesn’t. We could define our function in terms of a single clause with a <code class="language-plaintext highlighter-rouge">GuardedB</code> body, but that sounds like more work, so we’ll just use a <code class="language-plaintext highlighter-rouge">NormalB</code> body. The <code class="language-plaintext highlighter-rouge">NormalB</code> constructor takes an argument of type <code class="language-plaintext highlighter-rouge">Exp</code>. So let’s dig in to <a href="https://hackage.haskell.org/package/template-haskell-2.10.0.0/docs/Language-Haskell-TH.html#t:Exp">the <code class="language-plaintext highlighter-rouge">Exp</code> documentation!</a> There’s a lot here. Looking above, we really just want to have a single thing – a literal! The precomputed value. There’s a <code class="language-plaintext highlighter-rouge">LitE</code> constructor which takes a <code class="language-plaintext highlighter-rouge">Lit</code> type. The <code class="language-plaintext highlighter-rouge">Lit</code> type has a constructor for <code class="language-plaintext highlighter-rouge">RationalL</code> which takes a <code class="language-plaintext highlighter-rouge">Rational</code> and produces an ordinary fractional literal that can be used as a <code class="language-plaintext highlighter-rouge">Double</code>, so we’ll have to do a bit of conversion. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>precomputeInteger :: Int -> Exp precomputeInteger = LitE . RationalL . 
toRational . bigBadMathProblem </code></pre></div></div> We can get the <code class="language-plaintext highlighter-rouge">Body</code>s for the <code class="language-plaintext highlighter-rouge">Clause</code>s by mapping this function over the list of arguments. The declarations will just be blank, so we’re ready to create our <code class="language-plaintext highlighter-rouge">clauses</code>! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>precompute xs = do let name = mkName "lookupTable" patterns = map intToPat xs fnBodies = map precomputeInteger xs precomputedClauses = zipWith (\body pattern -> Clause [pattern] (NormalB body) []) fnBodies patterns -- ...... return [FunD name clauses] </code></pre></div></div> There’s one thing left to do here. We need to create another clause with a variable <code class="language-plaintext highlighter-rouge">x</code> that we delegate to the function. Since we’re introducing a local variable, we don’t need to worry about being hygienic with our naming, so we can use <code class="language-plaintext highlighter-rouge">mkName</code> again. We will have to get a bit more complicated with our <code class="language-plaintext highlighter-rouge">Body</code> expression, since we’ve got an application to a function going on. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>precompute xs = do let name = mkName "lookupTable" patterns = map intToPat xs fnBodies = map precomputeInteger xs precomputedClauses = zipWith (\body pattern -> Clause [pattern] (NormalB body) []) fnBodies patterns x' = mkName "x" lastClause = [Clause [VarP x'] (NormalB appBody) []] -- ... clauses = precomputedClauses ++ lastClause return [FunD name clauses] </code></pre></div></div> Going back to the <code class="language-plaintext highlighter-rouge">Exp</code> type, we’re now looking for something that captures the idea of application. 
The <code class="language-plaintext highlighter-rouge">Exp</code> type has a data constructor <code class="language-plaintext highlighter-rouge">AppE</code> which takes two expressions and applies the first (the function) to the second (its argument). That’s precisely what we need! It also has a data constructor <code class="language-plaintext highlighter-rouge">VarE</code> which takes a <code class="language-plaintext highlighter-rouge">Name</code> argument. That’s all we need. Let’s do it. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>precompute xs = do
  let name = mkName "lookupTable"
      patterns = map intToPat xs
      fnBodies = map precomputeInteger xs
      precomputedClauses =
        zipWith (\body pattern -> Clause [pattern] (NormalB body) []) fnBodies patterns
      x' = mkName "x"
      lastClause = [Clause [VarP x'] (NormalB appBody) []]
      appBody = AppE (VarE (mkName "bigBadMathProblem")) (VarE x')
      clauses = precomputedClauses ++ lastClause
  return [FunD name clauses]
</code></pre></div></div> We did it! We wrangled up some Template Haskell and wrote ourselves a lookup table. Now, we’ll want to splice it into the top level of our program with the <code class="language-plaintext highlighter-rouge">$()</code> splice syntax: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$(precompute [1..1000])
</code></pre></div></div> As it happens, GHC is smart enough to know that a top level expression with the type <code class="language-plaintext highlighter-rouge">Q [Dec]</code> can be spliced without the explicit splicing syntax. Creating Haskell expressions using the data constructors is really easy, if a little verbose. Let’s look at a slightly more complicated example.
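To see what the generated declaration looks like without compiling a splice, the same constructors can be assembled in an ordinary program and rendered with <code class="language-plaintext highlighter-rouge">pprint</code>. This is only a sketch under a few assumptions: <code class="language-plaintext highlighter-rouge">bigBadMathProblem</code> and <code class="language-plaintext highlighter-rouge">intToPat</code> are placeholder definitions written for this example, and it swaps in <code class="language-plaintext highlighter-rouge">RationalL</code> (an ordinary fractional literal) where the walkthrough uses <code class="language-plaintext highlighter-rouge">DoublePrimL</code>, purely so the printed clauses read as plain numbers.

```haskell
import Language.Haskell.TH

-- Placeholder for the expensive function; any Int -> Double works here.
bigBadMathProblem :: Int -> Double
bigBadMathProblem n = fromIntegral n * 1.5

-- Assumed shape of the intToPat helper: an integer literal pattern.
intToPat :: Int -> Pat
intToPat = LitP . IntegerL . toInteger

-- Swapped in for this sketch: RationalL instead of DoublePrimL.
precomputeLit :: Int -> Exp
precomputeLit = LitE . RationalL . toRational . bigBadMathProblem

-- One clause per precomputed input, plus a fallback clause that
-- delegates to the real function for everything else.
lookupTableDec :: [Int] -> Dec
lookupTableDec xs = FunD name (precomputedClauses ++ [lastClause])
  where
    name = mkName "lookupTable"
    precomputedClauses =
      [ Clause [intToPat x] (NormalB (precomputeLit x)) [] | x <- xs ]
    x' = mkName "x"
    appBody = AppE (VarE (mkName "bigBadMathProblem")) (VarE x')
    lastClause = Clause [VarP x'] (NormalB appBody) []

-- Render the declaration as source text instead of splicing it.
main :: IO ()
main = putStrLn (pprint (lookupTableDec [1, 2, 3]))
```

Running it prints a <code class="language-plaintext highlighter-rouge">lookupTable</code> definition with three literal clauses followed by the <code class="language-plaintext highlighter-rouge">x</code> fallback, which is the same shape that <code class="language-plaintext highlighter-rouge">precompute</code> hands to the splice.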
<h2 id="boilerplate-be-gone">Boilerplate Be Gone!</h2> We’re excited to be using the excellent <code class="language-plaintext highlighter-rouge">users</code> library with the <code class="language-plaintext highlighter-rouge">persistent</code> backend for the web application we’re working on (source code located <a href="https://github.com/parsonsmatt/QuickLift/">here, if you’re curious</a>). It handles all kinds of stuff for us, taking care of a bunch of boilerplate and user-related code. It expects, as its first argument, a value that can be unwrapped and used to run a Persistent query. It also operates in the <code class="language-plaintext highlighter-rouge">IO</code> monad. Right now, our application is set up to use a custom monad <code class="language-plaintext highlighter-rouge">AppM</code> which is defined like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type AppM = ReaderT Config (EitherT ServantErr IO)
</code></pre></div></div> So, to actually use the functions in the <code class="language-plaintext highlighter-rouge">users</code> library, we have to do this bit of fun business: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>someFunc :: AppM [User]
someFunc = do
  connPool <- asks getPool
  let conn = Persistent (`runSqlPool` connPool)
  users <- liftIO (listUsers conn Nothing)
  return (map snd users)
</code></pre></div></div> That’s going to get annoying quickly, so we start writing functions specific to our monad that we can call instead of doing all that lifting and wrapping.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>backend :: AppM Persistent
backend = do
  pool <- asks getPool
  return (Persistent (`runSqlPool` pool))

myListUsers :: Maybe (Int64, Int64) -> AppM [(LoginId, QLUser)]
myListUsers m = do
  b <- backend
  liftIO (listUsers b m)

myGetUserById :: LoginId -> AppM (Maybe QLUser)
myGetUserById l = do
  b <- backend
  liftIO (getUserById b l)

myUpdateUser :: LoginId -> (QLUser -> QLUser) -> AppM (Either UpdateUserError ())
myUpdateUser id fn = do
  b <- backend
  liftIO (updateUser b id fn)
</code></pre></div></div> Ahh, totally mechanical code. Just straight-up boilerplate. This is exactly the sort of thing I’d have metaprogrammed in Ruby. So let’s metaprogram it in Haskell! First, we’ll want to simplify the expression. Let’s use <code class="language-plaintext highlighter-rouge">listUsers</code> as the example. We’ll make it as simple as possible – no infix operators, no <code class="language-plaintext highlighter-rouge">do</code> notation, etc. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>listUsersSimple m = (>>=) backend (\b -> liftIO (listUsers b m))
</code></pre></div></div> Nice. To make the AST a little easier to see, we can take it one step further. Let’s add parentheses so that every function application is explicit. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>listUsersExplicit m = ((>>=) backend) (\b -> liftIO ((listUsers b) m))
</code></pre></div></div> The general formula that we’re going for is: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>derivedFunction arg1 arg2 ... argn = ((>>=) backend) (\b -> liftIO ((...(((function b) arg1) arg2)...) 
argn)) </code></pre></div></div> We’ll start by creating our <code class="language-plaintext highlighter-rouge">deriveReader</code> function, which will take as its first argument the <code class="language-plaintext highlighter-rouge">backend</code> function name. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>deriveReader :: Name -> DecsQ
deriveReader rd =
  mapM (decForFunc rd)
    [ 'destroyUserBackend
    , 'housekeepBackend
    , 'getUserIdByName
    , 'getUserById
    , 'listUsers
    , 'countUsers
    , 'createUser
    , 'updateUser
    , 'updateUserDetails
    , 'authUser
    , 'deleteUser
    ]
</code></pre></div></div> This is our first bit of special syntax. The single quote in <code class="language-plaintext highlighter-rouge">'destroyUserBackend</code> is a name quote: it gives us the <code class="language-plaintext highlighter-rouge">Name</code> of the <code class="language-plaintext highlighter-rouge">destroyUserBackend</code> that is in scope here. It is close to writing <code class="language-plaintext highlighter-rouge">mkName "destroyUserBackend"</code>, except that the quoted name is resolved where the quote is written rather than wherever the generated code ends up. Now, what we need is a function <code class="language-plaintext highlighter-rouge">decForFunc</code>, which has the signature <code class="language-plaintext highlighter-rouge">Name -> Name -> Q Dec</code>. In order to do this, we’ll need to get some information about the function we’re trying to derive. Specifically, we need to know how many arguments the source function takes. There’s a whole section in the Template Haskell <a href="https://hackage.haskell.org/package/template-haskell-2.10.0.0/docs/Language-Haskell-TH.html#g:3">documentation about ‘Querying the Compiler’</a> which we can put to good use. The <code class="language-plaintext highlighter-rouge">reify</code> function returns a value of type <code class="language-plaintext highlighter-rouge">Info</code>.
For type class operations, it has the data constructor <code class="language-plaintext highlighter-rouge">ClassOpI</code> with arguments <code class="language-plaintext highlighter-rouge">Name</code>, <code class="language-plaintext highlighter-rouge">Type</code>, <code class="language-plaintext highlighter-rouge">ParentName</code>, and <code class="language-plaintext highlighter-rouge">Fixity</code>. None of these have the arity of the function directly… I think it’s time to do a bit of exploratory coding in the REPL. We can fire up <code class="language-plaintext highlighter-rouge">GHCi</code> and start doing some Template Haskell with the following commands: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ: :set -XTemplateHaskell λ: import Language.Haskell.TH </code></pre></div></div> We can also do the following command, and it’ll print out all of the generated code that it makes: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ: :set -ddump-splices </code></pre></div></div> Now, let’s run <code class="language-plaintext highlighter-rouge">reify</code> on something simple and see the output! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ: reify 'id <interactive>:4:1: No instance for (Show (Q Info)) arising from a use of ‘print’ In a stmt of an interactive GHCi command: print it </code></pre></div></div> Hmm.. No show instance. Fortunately, there’s a workaround that can print out stuff in the <code class="language-plaintext highlighter-rouge">Q</code> monad: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ: $(stringE . 
show =<< reify 'id) "VarI GHC.Base.id (ForallT [KindedTV a_1627463132 StarT] [] (AppT (AppT ArrowT (VarT a_1627463132)) (VarT a_1627463132) ) ) Nothing (Fixity 9 InfixL)" </code></pre></div></div> I’ve formatted it a bit to make it a bit more legible. We’ve got the <code class="language-plaintext highlighter-rouge">Name</code>, the <code class="language-plaintext highlighter-rouge">Type</code>, a <code class="language-plaintext highlighter-rouge">Nothing</code> value that is always <code class="language-plaintext highlighter-rouge">Nothing</code>, and the fixity of the function. The <code class="language-plaintext highlighter-rouge">Type</code> seems pretty useful… Let’s look at the <code class="language-plaintext highlighter-rouge">reify</code> output for one of the class methods we’re trying to work with: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>λ: $(stringE . show =<< reify 'Web.Users.Types.getUserById) "ClassOpI Web.Users.Types.getUserById (ForallT [KindedTV b_1627432398 StarT] [AppT (ConT Web.Users.Types.UserStorageBackend) (VarT b_1627432398)] (ForallT [KindedTV a_1627482920 StarT] [AppT (ConT Data.Aeson.Types.Class.FromJSON) (VarT a_1627482920),AppT (ConT Data.Aeson.Types.Class.ToJSON) (VarT a_1627482920)] (AppT (AppT ArrowT (VarT b_1627432398) ) (AppT (AppT ArrowT (AppT (ConT Web.Users.Types.UserId) (VarT b_1627432398) ) ) (AppT (ConT GHC.Types.IO) (AppT (ConT GHC.Base.Maybe) (AppT (ConT Web.Users.Types.User) (VarT a_1627482920) ) ) ) ) ) ) ) Web.Users.Types.UserStorageBackend (Fixity 9 InfixL)" </code></pre></div></div> WOOOOH. That is a ton of text!! We’re mainly interested in the <code class="language-plaintext highlighter-rouge">Type</code> declaration, and we can get a lot of information about what data constructors are used from <a href="https://hackage.haskell.org/package/template-haskell-2.10.0.0/docs/Language-Haskell-TH.html#t:Type">the rather nice documentation</a>. 
Just like <code class="language-plaintext highlighter-rouge">AppE</code> is how we applied an expression to an expression, <code class="language-plaintext highlighter-rouge">AppT</code> is how we apply a type to a type. <code class="language-plaintext highlighter-rouge">ArrowT</code> is the function arrow in the type signature. Just as an exercise, we’ll go through the following type signature and transform it into something a bit like the above: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fmap :: (a -> b) -> f a -> f b
~ ((->) a b) -> (f a) -> (f b)
~ (->) ((->) a b) ((f a) -> (f b))
~ (->) ((->) a b) ((->) (f a) (f b))
</code></pre></div></div> Ok, now all of our <code class="language-plaintext highlighter-rouge">(->)</code>s are written in prefix form. We’ll replace the arrows with <code class="language-plaintext highlighter-rouge">ArrowT</code>, add explicit parentheses, and put in the <code class="language-plaintext highlighter-rouge">AppT</code> constructors working from the innermost expressions out. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>~ (ArrowT ((ArrowT a) b)) ((ArrowT (f a)) (f b))
~ (ArrowT ((AppT ArrowT a) b)) ((ArrowT (AppT f a)) (AppT f b))
~ (ArrowT (AppT (AppT ArrowT a) b)) (AppT (AppT ArrowT (AppT f a)) (AppT f b))
~ AppT (AppT ArrowT (AppT (AppT ArrowT a) b)) (AppT (AppT ArrowT (AppT f a)) (AppT f b))
</code></pre></div></div> That got pretty out of hand and messy looking. But, we have a good idea now of how we can get from one representation to the other. So, going from our type signature, it looks like we can work out how many arguments the function takes from the type! We’ll pattern match on the type signature, and if we see something that looks like the continuation of a type signature, we’ll add one to a count and go deeper. Otherwise, we’ll skip out.
The function definition looks like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>functionLevels :: Type -> Int functionLevels = go 0 where go :: Int -> Type -> Int go n (AppT (AppT ArrowT _) rest) = go (n+1) rest go n (ForallT _ _ rest) = go n rest go n _ = n </code></pre></div></div> Neat! We can pattern match on these just like ordinary Haskell values. Well, they are ordinary Haskell values, so that makes perfect sense. Lastly, we’ll need a function that gets the type from an <code class="language-plaintext highlighter-rouge">Info</code>. Not all <code class="language-plaintext highlighter-rouge">Info</code> have types, so we’ll encode that with <code class="language-plaintext highlighter-rouge">Maybe</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>getType :: Info -> Maybe Type getType (ClassOpI _ t _ _) = Just t getType (DataConI _ t _ _) = Just t getType (VarI _ t _ _) = Just t getType (TyVarI _ t) = Just t getType _ = Nothing </code></pre></div></div> Alright, we’re ready to get started on that <code class="language-plaintext highlighter-rouge">decForFunc</code> function!! We’ll go ahead and fill in what we know we need to do: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>decForFunc :: Name -> Name -> Q Dec decForFunc reader fn = do info <- reify fn arity <- maybe (reportError "Unable to get arity of name" >> return 0) (return . functionLevels) (getType info) -- ... return (FunD fnName [Clause varPat (NormalB final) []]) </code></pre></div></div> Arity acquired. Now, we’ll want to get a list of new variable names corresponding with the function arguments. When we want to be hygienic with our variable names, we use the function <code class="language-plaintext highlighter-rouge">newName</code> which creates a totally unique variable name with the string prepended to it. 
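Both pieces so far can be tried outside of a splice. The sketch below reuses <code class="language-plaintext highlighter-rouge">functionLevels</code> from above on a hand-built type shaped like <code class="language-plaintext highlighter-rouge">b -> i -> IO r</code> (that type and its variable names are made up for illustration), then asks for fresh names through <code class="language-plaintext highlighter-rouge">runQ</code>, which can run simple <code class="language-plaintext highlighter-rouge">Q</code> actions such as <code class="language-plaintext highlighter-rouge">newName</code> in <code class="language-plaintext highlighter-rouge">IO</code>.

```haskell
import Control.Monad (replicateM)
import Language.Haskell.TH

-- Same arrow-counting function as in the walkthrough.
functionLevels :: Type -> Int
functionLevels = go 0
  where
    go :: Int -> Type -> Int
    go n (AppT (AppT ArrowT _) rest) = go (n + 1) rest
    go n (ForallT _ _ rest)          = go n rest
    go n _                           = n

-- Hypothetical type for illustration: b -> i -> IO r
sampleType :: Type
sampleType =
  AppT (AppT ArrowT (VarT (mkName "b")))
       (AppT (AppT ArrowT (VarT (mkName "i")))
             (AppT (ConT (mkName "IO")) (VarT (mkName "r"))))

main :: IO ()
main = do
  let arity = functionLevels sampleType
  print arity                          -- prints 2

  -- One fresh name per argument that the derived function still takes.
  names <- runQ (replicateM (arity - 1) (newName "arg"))
  print (map nameBase names)           -- prints ["arg"]

  -- Two separate newName calls with the same base are different Names,
  -- whereas mkName with the same string is always the same Name.
  n1 <- runQ (newName "arg")
  n2 <- runQ (newName "arg")
  print (n1 == n2)                     -- prints False
  print (mkName "arg" == mkName "arg") -- prints True
```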
We want (arity - 1) new names, since we’ll be using the bound value from the reader function for the other one. We’ll also want a name for the value we’ll bind out of the lambda. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>varNames <- replicateM (arity - 1) (newName "arg") b <- newName "b" </code></pre></div></div> Next up is the new function name. To keep a consistent API, we’ll use the same name as the one in the actual package. This will require us to import the other package qualified to avoid a name clash. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>let fnName = mkName . nameBase $ fn </code></pre></div></div> <code class="language-plaintext highlighter-rouge">nameBase</code> has the type <code class="language-plaintext highlighter-rouge">Name -> String</code>, and gets the non-qualified name string for a given <code class="language-plaintext highlighter-rouge">Name</code> value. Then we <code class="language-plaintext highlighter-rouge">mkName</code> with the string, giving us a new, non-qualified name with the same value as the original function. This might be a bad idea? You probably want to provide a unique identifier. Module namespacing does a fine job of that, imo. Next up, we’ll want to apply the <code class="language-plaintext highlighter-rouge">(>>=)</code> function to the <code class="language-plaintext highlighter-rouge">reader</code>. We’ll then want to create a function which applies the <code class="language-plaintext highlighter-rouge">bound</code> expression to a lambda. Lambdas have an <a href="https://hackage.haskell.org/package/template-haskell-2.10.0.0/docs/Language-Haskell-TH.html#v:LamE">LamE</a> constructor in the <code class="language-plaintext highlighter-rouge">Exp</code> type. 
They take a <code class="language-plaintext highlighter-rouge">[Pat]</code> to match on, and an <code class="language-plaintext highlighter-rouge">Exp</code> that represents the lambda body. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bound = AppE (VarE '(>>=)) (VarE reader)
binder = AppE bound . LamE [VarP b]
</code></pre></div></div> So <code class="language-plaintext highlighter-rouge">AppE bound . LamE [VarP b]</code> is the exact same thing as <code class="language-plaintext highlighter-rouge">(>>=) reader (\b -> ...)</code>! Cool. Next up, we’ll need to create <code class="language-plaintext highlighter-rouge">VarE</code> values for all of the variables. Then, we’ll need to apply all of the values to the <code class="language-plaintext highlighter-rouge">VarE fn</code> expression. Function application associates to the left, so we’ll have: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fn       ~ VarE fn
fn a     ~ AppE (VarE fn) (VarE a)
fn a b   ~ AppE (AppE (VarE fn) (VarE a)) (VarE b)
fn a b c ~ AppE (AppE (AppE (VarE fn) (VarE a)) (VarE b)) (VarE c)
</code></pre></div></div> This looks just like a left fold! Once we have that, we’ll apply the fully applied <code class="language-plaintext highlighter-rouge">fn</code> expression to <code class="language-plaintext highlighter-rouge">VarE 'liftIO</code>, and finally bind it to the lambda. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>varExprs = map VarE (b : varNames)
fullExpr = foldl AppE (VarE fn) varExprs
liftedExpr = AppE (VarE 'liftIO) fullExpr
final = binder liftedExpr
</code></pre></div></div> This produces our <code class="language-plaintext highlighter-rouge">(>>=) reader (\b -> liftIO (fn b arg1 arg2 ... argn))</code> expression. The last thing we need to do is get our patterns. This is just the list of variables we generated earlier.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>varPat = map VarP varNames </code></pre></div></div> And now, the whole thing: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>deriveReader :: Name -> DecsQ deriveReader rd = mapM (decForFunc rd) [ 'destroyUserBackend , 'housekeepBackend , 'getUserIdByName , 'getUserById , 'listUsers , 'countUsers , 'createUser , 'updateUser , 'updateUserDetails , 'authUser , 'deleteUser ] decForFunc :: Name -> Name -> Q Dec decForFunc reader fn = do info <- reify fn arity <- maybe (reportError "Unable to get arity of name" >> return 0) (return . functionLevels) (getType info) varNames <- replicateM (arity - 1) (newName "arg") b <- newName "b" let fnName = mkName . nameBase $ fn bound = AppE (VarE '(>>=)) (VarE reader) binder = AppE bound . LamE [VarP b] varExprs = map VarE (b : varNames) fullExpr = foldl AppE (VarE fn) varExprs liftedExpr = AppE (VarE 'liftIO) fullExpr final = binder liftedExpr varPat = map VarP varNames return $ FunD fnName [Clause varPat (NormalB final) []] </code></pre></div></div> And we’ve now metaprogrammed a bunch of boilerplate away! We’ve looked at the docs for Template Haskell, figured out how to construct values in Haskell’s AST, and worked out how to do some work at compile time, as well as automate some boilerplate. I’m excited to learn more about the magic of defining quasiquoters and more advanced Template Haskell constructs, but even a super basic “build expressions and declarations using data constructors” approach is very useful. Hopefully, you’ll find this as useful as I did. Sun, 15 Nov 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/11/15/template_haskell.html https://www.parsonsmatt.org/2015/11/15/template_haskell.html Using purescript-routing with purescript-halogen Updated on 2016-06-12: The blog post describes an old version of Halogen and the router. 
It has been updated to the 0.8 release of Halogen and 0.4 of the router. The changes are pretty small and are entirely in <a href="https://github.com/parsonsmatt/purescript-routing-example/pull/2/commits/e9c3325fc3d3bbaaadda65054044f1e05fe6b6aa">this GitHub commit</a>. <h2 id="a-tutorial">A tutorial</h2> Not only has SlamData come up with <a href="https://github.com/slamdata/purescript-halogen">purescript-halogen</a>, they’ve also got a nice routing library <a href="https://github.com/slamdata/purescript-routing"><code class="language-plaintext highlighter-rouge">purescript-routing</code></a>. While I’ll be demonstrating it with the <code class="language-plaintext highlighter-rouge">purescript-halogen</code> library, it’s actually library-agnostic and should work with anything. Let’s dive in and learn how to use it! Now, fair warning, this is alpha software and bleeding edge. This tutorial may be out of date by the time I post it! The code for this project is available in <a href="https://github.com/parsonsmatt/purescript-routing-example">this repository</a>. (Edit 12-28-15: there was a breaking change in <code class="language-plaintext highlighter-rouge">purescript-generics-0.7</code> which broke the repository. It has been fixed.) <h2 id="defining-routes">Defining Routes</h2> The first step is defining our routes. We’re making a website for logging weightlifting sessions, so we’re concerned with three things: <ol> <li>Getting home. Safety is important and it’s a dangerous world out there.</li> <li>Logging sessions. That’s literally the point, right?</li> <li>Viewing our own profile. Only our own. Vanity is key to success in lifting weights.</li> </ol> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Routes
  = Profile
  | Sessions
  | Home
</code></pre></div></div> Now that we’ve defined the data type, we need to write a matcher.
This is a function that takes the stuff after the <code class="language-plaintext highlighter-rouge">#</code> in the URL and figures out which of our <code class="language-plaintext highlighter-rouge">Routes</code> it corresponds to. For this super basic example, we’re just going to have the three pages above, so we’ll just parse literals: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>routing :: Match Routes
routing = Profile <$ lit "" <* lit "profile"
      <|> Sessions <$ lit "" <* lit "sessions"
      <|> Home <$ lit ""
</code></pre></div></div> “What’s that <code class="language-plaintext highlighter-rouge">lit ""</code> business?” Well, the routing library strips out all of the slashes, so if we want to refer to a single slash, we have to use the <code class="language-plaintext highlighter-rouge">lit ""</code> bit. Let’s define our Halogen component that will be in charge of routing. Right now, it’ll simply be a bit of text telling us which page we’re on. We’ll keep track of the current page in our state, and use the input query algebra to change it. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type State = { currentPage :: String }

data Input a
  = Goto Routes a

ui :: forall g. (Functor g) => Component State Input g
ui = component render eval
  where
    render st =
      div_
        [ H.h1_ [ H.text (st.currentPage) ]
        , H.p_ [ H.text "Routing!!" ]
        ]

    eval :: Eval Input State Input g
    eval (Goto Sessions next) = do
      modify (_{ currentPage = "Sessions" })
      pure next
    eval (Goto Home next) = do
      modify (_{ currentPage = "Home" })
      pure next
    eval (Goto Profile next) = do
      modify (_{ currentPage = "Profile" })
      pure next
</code></pre></div></div> Cool! Now, we can use these <code class="language-plaintext highlighter-rouge">Goto</code> queries to have our application “go to” a certain route. We’ve got our route matching defined, and a way for our component to react to routes.
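The query-algebra idea doesn’t depend on Halogen itself. As a rough plain-Haskell analogue (this is not Halogen’s API: <code class="language-plaintext highlighter-rouge">eval</code> below is an ordinary function standing in for the component’s evaluator, and <code class="language-plaintext highlighter-rouge">State</code> is a simple record invented for the sketch), each query carries the value to continue with, and evaluating it updates the state and hands that value back:

```haskell
data Routes = Profile | Sessions | Home
  deriving (Eq, Show)

-- Hypothetical stand-in for the component state.
newtype State = State { currentPage :: String }
  deriving (Eq, Show)

-- The query algebra: a route to go to, plus the value to continue with.
data Input a = Goto Routes a

-- Stand-in for the component's eval: update the state, return the continuation.
eval :: Input a -> State -> (a, State)
eval (Goto route next) st = (next, st { currentPage = title route })
  where
    title Profile  = "Profile"
    title Sessions = "Sessions"
    title Home     = "Home"

main :: IO ()
main = do
  let ((), st) = eval (Goto Sessions ()) (State "Home")
  putStrLn (currentPage st)  -- prints Sessions
```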
Let’s run our component: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main = runAff throwException (const (pure unit)) $ do
  app <- runUI R.ui R.init
  appendToBody app.node
</code></pre></div></div> When we do <code class="language-plaintext highlighter-rouge">runUI</code>, we get a record back. The <code class="language-plaintext highlighter-rouge">node</code> field is the most obvious thing. It’s how we mount components to the DOM. The <code class="language-plaintext highlighter-rouge">app</code> record also includes a <code class="language-plaintext highlighter-rouge">driver</code> field, which is a function that takes data in the query algebra. We can use that to send messages to our routing component. Let’s write a function that accepts the driver, matches the route, and sends messages to our component. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Effects e = (dom :: DOM, avar :: AVAR, err :: EXCEPTION | e)

routeSignal :: forall eff. Driver Input eff -> Aff (Effects eff) Unit
routeSignal driver = do
  Tuple old new <- matchesAff routing
  redirects driver old new
</code></pre></div></div> <code class="language-plaintext highlighter-rouge">matchesAff</code> is a function that takes our <code class="language-plaintext highlighter-rouge">routing</code> definition, watches the URL, and returns a tuple of <code class="language-plaintext highlighter-rouge">Maybe oldRoute</code> and <code class="language-plaintext highlighter-rouge">newRoute</code>. It runs asynchronously and will kick off the redirect function every time the URL changes. We want to have <code class="language-plaintext highlighter-rouge">routeSignal</code> be its own function in the event that we need to do some additional work here. Now, it’s time for <code class="language-plaintext highlighter-rouge">redirects</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>redirects :: forall eff. 
Driver Input eff -> Maybe Routes -> Routes -> Aff (Effects eff) Unit redirects driver _ Sessions = do driver (action (Goto Sessions)) redirects driver _ Profile = do driver (action (Goto Profile)) redirects driver _ Home = do driver (action (Goto Home)) </code></pre></div></div> Finally, we’re using the <code class="language-plaintext highlighter-rouge">action</code> to send messages to our driver. We could have expressed that as a one liner <code class="language-plaintext highlighter-rouge">redirects driver _ = driver <<< action <<< Goto</code>, but we’ll be wanting to do some more work here pretty quick. We’ll want to “fork” a process in our main function to run the <code class="language-plaintext highlighter-rouge">routeSignal</code> function. The <code class="language-plaintext highlighter-rouge">purescript-aff</code> package simulates forking with asynchronous code. We’ll add a line to our <code class="language-plaintext highlighter-rouge">main</code> function, and when we run it, we can watch it match routes! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main :: forall eff. Eff (R.Effects eff) Unit main = runAff throwException (const (pure unit)) $ do app <- runUI R.ui R.init appendToBody app.node forkAff $ R.routeSignal app.driver </code></pre></div></div> Now we can <code class="language-plaintext highlighter-rouge">pulp server</code>, open the browser, and sure enough, <code class="language-plaintext highlighter-rouge">localhost:1337/#/profile</code> causes the title to show “Profile”. Very cool! Let’s put some links in our component and see how it can drive the global state: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ui :: forall g. 
(Functor g) => Component State Input g ui = component render eval where render st = div_ [ H.h1_ [ H.text (st.currentPage) ] , H.ul_ (map link ["Sessions", "Profile", "Home"]) ] link s = H.li_ [ H.a [ P.href ("#/" ++ toLower s) ] [ H.text s ] ] </code></pre></div></div> So URLs and plain anchor tags can now act as a way to drive our application. The routing library is pretty low level still – there’s a good bit of room available for a higher level routing library specifically for Halogen. Note that the Home link still goes to the home page, even though the link is <code class="language-plaintext highlighter-rouge">#/home</code>. That’s because it goes to the last defined route in the event that no routes match. It’s a good idea to make the last route a catch-all 404 type thing. Now, we’ve got a basic Sessions route. Let’s expand that to have some basic CRUD actions: index and show. Show takes an identifier (a <code class="language-plaintext highlighter-rouge">Number</code> in this case), while Index just shows everything. We’ll update the Sessions route to also take this as a parameter. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data CRUD
  = Index
  | Show Number

data Routes
  = Home
  | Profile
  | Sessions CRUD
</code></pre></div></div> Immediately, <code class="language-plaintext highlighter-rouge">pulp -w build</code> complains. We need to update our <code class="language-plaintext highlighter-rouge">routing</code> function to take into account the <code class="language-plaintext highlighter-rouge">CRUD</code> parameters. We also need to update our component’s <code class="language-plaintext highlighter-rouge">eval</code> function. First, let’s just recover our original index behavior in the routing function. We’ll need to match the slash, the sessions literal, and finally apply it to <code class="language-plaintext highlighter-rouge">pure Index</code>.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>routing :: Match Routes
routing = Profile <$ lit "" <* lit "profile"
      <|> Home <$ lit ""
      <|> Sessions <$> (lit "" *> lit "sessions" *> pure Index)
</code></pre></div></div> Now, we’ll want to use the <code class="language-plaintext highlighter-rouge">Alternative</code> instance to let it choose between either <code class="language-plaintext highlighter-rouge">Show Number</code> or <code class="language-plaintext highlighter-rouge">Index</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>routing :: Match Routes
routing = Profile <$ lit "" <* lit "profile"
      <|> Home <$ lit ""
      <|> Sessions <$> (lit "" *> lit "sessions" *> (Show <$> num <|> pure Index))
</code></pre></div></div> Except, man, that’s kind of ugly… Let’s make that a bit nicer: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>routing :: Match Routes
routing = profile <|> sessions <|> home
  where
    route str = lit "" *> lit str
    parseCRUD = Show <$> num <|> pure Index
    profile = Profile <$ route "profile"
    home = Home <$ lit ""
    sessions = Sessions <$> (route "sessions" *> parseCRUD)
</code></pre></div></div> Much nicer! It’s starting to become clear that there’s a lot of room for making conveniences on top of this, especially for a routing component library… Now we need to update the <code class="language-plaintext highlighter-rouge">redirects</code> function: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>redirects :: forall eff. Driver Input eff -> Maybe Routes -> Routes -> Aff (Effects eff) Unit
redirects driver _ = driver <<< action <<< Goto
</code></pre></div></div> Yeah, that’s actually nicer… for now!
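The shape of that matcher (alternatives tried in order, literals consumed one path segment at a time) can be mimicked in plain Haskell to see how it behaves. This is a sketch under assumptions, not purescript-routing’s implementation: <code class="language-plaintext highlighter-rouge">Match</code>, <code class="language-plaintext highlighter-rouge">lit</code>, <code class="language-plaintext highlighter-rouge">num</code>, <code class="language-plaintext highlighter-rouge">segments</code> and <code class="language-plaintext highlighter-rouge">matchRoute</code> below are hypothetical stand-ins written for this example, with <code class="language-plaintext highlighter-rouge">Double</code> playing the role of <code class="language-plaintext highlighter-rouge">Number</code>.

```haskell
import Control.Applicative (Alternative (..))
import Text.Read (readMaybe)

data CRUD = Index | Show Double deriving (Eq, Show)
data Routes = Home | Profile | Sessions CRUD deriving (Eq, Show)

-- Hypothetical stand-in for the library's Match type:
-- consume some path segments, or fail.
newtype Match a = Match { runMatch :: [String] -> Maybe (a, [String]) }

instance Functor Match where
  fmap f (Match p) = Match $ \segs -> fmap (\(a, rest) -> (f a, rest)) (p segs)

instance Applicative Match where
  pure a = Match $ \segs -> Just (a, segs)
  Match pf <*> Match pa = Match $ \segs -> do
    (f, rest)  <- pf segs
    (a, rest') <- pa rest
    pure (f a, rest')

instance Alternative Match where
  empty = Match (const Nothing)
  -- Try the left matcher; fall back to the right one if it fails.
  Match p <|> Match q = Match $ \segs -> maybe (q segs) Just (p segs)

-- Match one literal segment.
lit :: String -> Match ()
lit s = Match $ \segs -> case segs of
  (x : rest) | x == s -> Just ((), rest)
  _                   -> Nothing

-- Match one numeric segment.
num :: Match Double
num = Match $ \segs -> case segs of
  (x : rest) -> fmap (\n -> (n, rest)) (readMaybe x)
  _          -> Nothing

routing :: Match Routes
routing = profile <|> sessions <|> home
  where
    route str = lit "" *> lit str
    parseCRUD = Show <$> num <|> pure Index
    profile = Profile <$ route "profile"
    home = Home <$ lit ""
    sessions = Sessions <$> (route "sessions" *> parseCRUD)

-- Split on '/', so a leading slash yields an empty first segment.
segments :: String -> [String]
segments s = case break (== '/') s of
  (seg, [])       -> [seg]
  (seg, _ : rest) -> seg : segments rest

matchRoute :: String -> Maybe Routes
matchRoute = fmap fst . runMatch routing . segments

main :: IO ()
main = mapM_ (print . matchRoute) ["/profile", "/sessions", "/sessions/2", "/home"]
-- prints:
--   Just Profile
--   Just (Sessions Index)
--   Just (Sessions (Show 2.0))
--   Just Home
```

In this sketch <code class="language-plaintext highlighter-rouge">/home</code> lands on <code class="language-plaintext highlighter-rouge">Home</code> only because the stand-in matcher does not insist on consuming every segment; that is a design choice of the sketch, made so the last alternative acts as a catch-all.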
Let’s check the <code class="language-plaintext highlighter-rouge">eval</code> function: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> eval :: Eval Input State Input g eval (Goto Profile next) = do modify (_ { currentPage = "Profile" }) pure next eval (Goto (Sessions view) next) = do modify case view of Index -> (_ { currentPage = "Sessions" }) Show n -> (_ { currentPage = "View Session " ++ show n }) pure next eval (Goto Home next) = do modify (_ { currentPage = "Home" }) pure next </code></pre></div></div> Now, we can type <code class="language-plaintext highlighter-rouge">localhost:1337/#/sessions/2</code> and it’ll change the title to “View Session 2.0”. This is all very cool. We have URL-driven state in our Halogen app. But we’re managing everything in a single top level component, and that <code class="language-plaintext highlighter-rouge">eval</code> function is already getting hairy. What we really want to do is have the routing component simply select the appropriate component and render that. We’ll define two new components: <code class="language-plaintext highlighter-rouge">Profile</code> and <code class="language-plaintext highlighter-rouge">Sessions</code> to handle the respective pages. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- src/Components/Profile.purs data Input a = Noop a type State = Unit data Slot = Slot ui :: forall g. (Functor g) => Component State Input g ui = component render eval where render _ = H.div_ [ H.h1_ [ H.text "Your Profile" ] , H.p_ [ H.text "what a nice profile!" ] ] eval :: Eval _ _ _ g eval (Noop n) = pure n </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">Sessions</code> component is the same for now, but it has slightly different text. 
Now we’re about to get into <code class="language-plaintext highlighter-rouge">purescript-halogen</code>’s machinery for having a parent component with multiple types of child components. We have to define a way for Halogen to know how to route the inputs, and how to get at the child states. Halogen uses <code class="language-plaintext highlighter-rouge">Coproduct</code> to route queries (<code class="language-plaintext highlighter-rouge">Coproduct f g a</code> is a newtype around <code class="language-plaintext highlighter-rouge">Either (f a) (g a)</code>), and <code class="language-plaintext highlighter-rouge">Either</code> to route states. First, we’ll define our child state: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type ChildState = Either Profile.State Sessions.State </code></pre></div></div> If we have more than one child component, then we can nest <code class="language-plaintext highlighter-rouge">Eithers</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Ex whatever = Either Profile.State (Either Sessions.State whatever) </code></pre></div></div> The child query is essentially the same thing. We have to ensure that the components states and queries have the same “paths”. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type ChildQuery = Coproduct Profile.Input Sessions.Input </code></pre></div></div> Like above, we can nest Coproducts to route more than two kinds of input to their respective query. Next up is a type for the slot. We’ll use Either again, making sure that the types line up. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type ChildSlot = Either Profile.Slot Sessions.Slot </code></pre></div></div> We’ll want to define some convenience functions to route the actions appropriately from the router. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pathToProfile :: ChildPath Profile.State ChildState Profile.Input ChildQuery Profile.Slot ChildSlot pathToProfile = cpL pathToSessions :: ChildPath Sessions.State ChildState Sessions.Input ChildQuery Sessions.Slot ChildSlot pathToSessions = cpR </code></pre></div></div> Another giant type signature! <code class="language-plaintext highlighter-rouge">ChildPath</code> wants to know state, input, and slot for the child and containing components. Two more type aliases and we’ll be done with the boilerplate. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type StateP g = InstalledState State ChildState Input ChildQuery g ChildSlot type QueryP = Coproduct Input (ChildF ChildSlot ChildQuery) </code></pre></div></div> Ok, with all that out of the way, it’s time to revise our router component definition. We’ll use our new type synonyms and make it a parent component. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ui :: forall g. (Plus g) => Component (StateP g) QueryP g ui = parentComponent render eval where render state = L.defaultLayout [ H.h1_ [ H.text state.currentPage ] , H.p_ [ H.text "QuickLift is a quick and easy way to log your weightlifting sessions." ] , viewPage state.currentPage ] </code></pre></div></div> We’ll use <code class="language-plaintext highlighter-rouge">viewPage</code> as a helper function to select the correct page from our various UIs. It’s pretty hacky. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> viewPage :: String -> HTML (SlotConstructor ChildState ChildQuery g ChildSlot) Input viewPage "Sessions" = H.slot' pathToSessions Sessions.Slot \_ -> { component: Sessions.ui, initialState: unit } viewPage "Profile" = H.slot' pathToProfile Profile.Slot \_ -> { component: Profile.ui, initialState: unit } viewPage _ = H.div_ [] eval :: EvalParent Input State ChildState Input ChildQuery g ChildSlot eval = ... </code></pre></div></div> The type signature of <code class="language-plaintext highlighter-rouge">eval</code> is all that changed, so I’ll elide the definition. There are two remaining adjustments to make: Change the <code class="language-plaintext highlighter-rouge">redirects</code> and <code class="language-plaintext highlighter-rouge">routeSignal</code> functions to account for the new types and <code class="language-plaintext highlighter-rouge">Coproduct</code> stuff: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>routeSignal :: forall eff. Driver QueryP eff -> Aff (Effects eff) Unit routeSignal driver = do Tuple old new <- matchesAff routing redirects driver old new redirects :: forall eff. Driver QueryP eff -> Maybe Routes -> Routes -> Aff (Effects eff) Unit redirects driver _ = driver <<< left <<< action <<< Goto -- or, if you prefer writing it all out, -- redirects driver _ Home = -- driver (left (action (Goto Home)))) -- etc... 
</code></pre></div></div> We’re using the <code class="language-plaintext highlighter-rouge">left</code> function from the <code class="language-plaintext highlighter-rouge">Coproduct</code>, which is shorthand for <code class="language-plaintext highlighter-rouge">Coproduct <<< Left</code> Change the <code class="language-plaintext highlighter-rouge">main</code> definition to use <code class="language-plaintext highlighter-rouge">installedState</code> instead of normal state: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main :: forall eff. Eff (R.Effects eff) Unit main = runAff throwException (const (pure unit)) $ do app <- runUI R.ui (installedState R.init) appendToBody app.node forkAff $ R.routeSignal app.driver </code></pre></div></div> In any case, this works! It correctly chooses the right component based on the current URL state. So, to review, we can now: <ul> <li>Define routes</li> <li>Define helpers and CRUD actions for routes</li> <li>Use the router in a single component to manage component state</li> <li>Use the router among multiple components to direct which component renders.</li> </ul> This should be enough to get you started with <code class="language-plaintext highlighter-rouge">purescript-routing</code>. Thu, 22 Oct 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/10/22/purescript_router.html https://www.parsonsmatt.org/2015/10/22/purescript_router.html Elm Architecture in PureScript IV: Effects <h2 id="the-final-chapter-ing">The Final Chapter-ing</h2> In the last post, I covered higher order components and making dynamic lists of components. We’re going to get into effects and AJAXing with this. It’s almost entirely like you might expect, given the previous posts, but we’ll finally start to specialize that <code class="language-plaintext highlighter-rouge">g</code> functor! 
As always, the code is available in <a href="https://github.com/parsonsmatt/purs-architecture-tutorial">this repository</a> <h2 id="gif-loader">Gif loader!</h2> I’m pretty excited for this. Counters are boring and now I can get cat gifs delivered? This is precisely what I want from all web UIs, really. So, we’re going to have a topic and a URL for the current gif. The only input we’ll need to handle is requesting a new gif. We’ll be interacting with the giphy public API to do this. Let’s define our state and inputs: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- src/Example/Five.purs type State = { topic :: String , gifUrl :: String } initialState :: State initialState = { topic: "cats", gifUrl: "" } data Input a = RequestMore a </code></pre></div></div> STANDARD SUPER BORING, you knew it was coming. Since we know we’ll be using effects, we’ll also define a type for the effects our component will be using: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type GifEffects eff = HalogenEffects (ajax :: AJAX | eff) </code></pre></div></div> Halogen defines a type of effects that it normally uses, and we’re just adding the <code class="language-plaintext highlighter-rouge">AJAX</code> effect on top of that. The <code class="language-plaintext highlighter-rouge">ui</code> component is pretty standard. We’ve replaced the <code class="language-plaintext highlighter-rouge">g</code> functor with <code class="language-plaintext highlighter-rouge">Aff (GifEffects ())</code> to indicate that we’ll be using the asynchronous effects monad. The render function is boring, so we’ll get right to the <code class="language-plaintext highlighter-rouge">eval</code> function. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ui :: Component State Input (Aff (GifEffects ())) ui = component render eval where render :: -- *yawn* let's skip to eval eval :: Eval Input State Input (Aff (GifEffects ())) eval (RequestMore a) = do state <- get newGifUrlFn <- liftFI (fetchGif state.topic) modify \s -> s { gifUrl = newGifUrlFn s.gifUrl } pure a </code></pre></div></div> <code class="language-plaintext highlighter-rouge">liftFI</code> is a function that lifts effects from the free monad. So we can phone home, launch missiles, write to the console, or do AJAX all from the <code class="language-plaintext highlighter-rouge">liftFI</code> function. Well, to be precise, we can only do those things if they’re included in the <code class="language-plaintext highlighter-rouge">Aff (GifEffects ())</code> effects type! (I haven’t checked <code class="language-plaintext highlighter-rouge">HalogenEffects</code>…) <code class="language-plaintext highlighter-rouge">fetchGif</code> uses the <code class="language-plaintext highlighter-rouge">Affjax</code> library to make the request, read the JSON, and return either a function to transform the current URL to the new one, or a function that doesn’t change it at all. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fetchGif :: forall eff. String -> Aff (ajax :: AJAX | eff) (String -> String) fetchGif topic = do result <- AJ.get (giphyRequestUrl topic) let url = readProp "data" result.response >>= readProp "image_url" pure (either (flip const) const url) </code></pre></div></div> So if <code class="language-plaintext highlighter-rouge">url</code> is a <code class="language-plaintext highlighter-rouge">Left</code>, <code class="language-plaintext highlighter-rouge">either</code> applies <code class="language-plaintext highlighter-rouge">flip const</code> to the error, which gives a function that ignores the error and returns its argument, the URL already in the state, unchanged.
If the request succeeds, then we do <code class="language-plaintext highlighter-rouge">const result</code> over the old URL, which sets it to be equal to the result. <code class="language-plaintext highlighter-rouge">readProp</code> tries to read the JSON property of the object passed, and either returns the result or a <code class="language-plaintext highlighter-rouge">Left</code> error type if it wasn’t successful. That can be a quick way of dealing with data if you don’t want to write a full JSON parser. And that’s it! We’ve got effects. NBD. Running the code in <code class="language-plaintext highlighter-rouge">main</code> looks the same as we’d expect: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>runEx5 = runUI Ex5.ui Ex5.initialState </code></pre></div></div> <h2 id="multi-gif-loaders">Multi Gif Loaders?!</h2> Alright, how about a pair of gif loaders? This is very similar to the pair of counters we had in two, but we don’t need to worry about resetting them. In fact, the entire bit of code (imports and all!) is 28 lines! 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module Example.Six where import Prelude import Control.Plus (Plus) import Data.Functor.Coproduct (Coproduct(..)) import Control.Monad.Aff (Aff()) import Halogen import qualified Halogen.HTML.Indexed as H import qualified Example.Five as Gif data Input a = NoOp a type State = InstalledState Unit Gif.State Input Gif.Input (Aff (Gif.GifEffects ())) Boolean type Query = Coproduct Input (ChildF Boolean Gif.Input) ui :: Component State Query (Aff (Gif.GifEffects ())) ui = parentComponent render eval where render _ = H.div_ [ H.slot true \_ -> { component: Gif.ui, initialState: Gif.initialState } , H.slot false \_ -> { component: Gif.ui, initialState: Gif.initialState } ] eval :: EvalParent Input Unit Gif.State Input Gif.Input (Aff (Gif.GifEffects ())) Boolean eval (NoOp a) = pure a </code></pre></div></div> I’m using <code class="language-plaintext highlighter-rouge">Boolean</code> as the slot type because it naturally only has two elements, and any type that just has two elements is equivalent to <code class="language-plaintext highlighter-rouge">boolean</code>, and this way I don’t have to make ord/eq instances… <h2 id="list-of-gifs">List of Gifs</h2> Next up is a list of gif downloaders. But wait. Instead of making a list of gif downloaders, let’s just make another higher order component that contains a list of other components. We’ll model it off of <code class="language-plaintext highlighter-rouge">Example.Three</code>, so much of the code should look pretty familiar. 
First we’ll need to define state, query, child slots, etc… <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type StateP = { itemArray :: Array Int , nextID :: Int } initialStateP :: StateP initialStateP = { itemArray: [] , nextID: 0 } data QueryP a = AddItem a | RemItem a newtype Slot = Slot Int </code></pre></div></div> We use the <code class="language-plaintext highlighter-rouge">P</code> suffix because we’ll want to create type synonyms for the installed state and child query stuff. The <code class="language-plaintext highlighter-rouge">Slot</code> type needs an instance of the Eq and Ord type classes. Fortunately, the newer versions of PureScript include a mechanism for generically deriving these. We have to import <code class="language-plaintext highlighter-rouge">Data.Generic</code>, and then we get to do: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>derive instance genericSlot :: Generic Slot instance ordSlot :: Ord Slot where compare = gCompare instance eqSlot :: Eq Slot where eq = gEq </code></pre></div></div> Nice! Much less tedious than writing the instances out manually. (Here’s hoping that <code class="language-plaintext highlighter-rouge">deriving (Eq, Ord)</code> makes it into the language soon…) Now we’ll define the <code class="language-plaintext highlighter-rouge">listUI</code>. Like we did with the higher-order “add a remove button” component, we’ll use two type variables for the child state and child query. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>makeList :: forall g p s f. 
(Plus g) => Component s f g -> s -> Component (State s f g) (Query f) g makeList comp initState = parentComponent render eval where render state = H.div_ [ H.button [ E.onClick $ E.input_ AddItem ] [ H.text "+" ] , H.button [ E.onClick $ E.input_ RemItem ] [ H.text "-" ] , H.ul_ (map (\i -> H.slot (Slot i) (initComp comp initState)) state.itemArray) ] initComp :: Component s f g -> s -> Unit -> { component :: _, initialState :: _ } initComp c s _ = {component: c, initialState: s} eval :: EvalParent QueryP StateP s QueryP f g Slot eval (AddItem next) = modify addItem $> next eval (RemItem next) = modify remItem $> next </code></pre></div></div> The only new thing about this is the <code class="language-plaintext highlighter-rouge">$></code> operator, but it does what you’d expect given its place in the function: it runs the action on its left and returns the value on its right. And we’re done with the component definition! Let’s run it and see where we go: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- src/Main.purs runEx7 = runUI (Ex7.makeList Ex5.ui Ex5.initialState) Ex7.initialState </code></pre></div></div> We don’t even need a type signature. Nice! And thus concludes my tutorial series on the Elm Architecture in PureScript. I’m not going to cover animation because I don’t know how it works, and that’s beyond the scope of the Halogen framework.
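One loose end: the <code class="language-plaintext highlighter-rouge">eval</code> cases in <code class="language-plaintext highlighter-rouge">makeList</code> call <code class="language-plaintext highlighter-rouge">addItem</code> and <code class="language-plaintext highlighter-rouge">remItem</code>, which the post never shows. The sketch below is one plausible way to write them, not necessarily what the repository does. The choice to append new IDs at the end and to remove the most recently added item is an assumption.

```purescript
module ListStateSketch where

import Prelude

import Data.Array (init, snoc)
import Data.Maybe (fromMaybe)

type StateP =
  { itemArray :: Array Int
  , nextID :: Int
  }

-- Assumed behaviour: append a fresh ID and bump the counter,
-- so an ID is never handed out twice.
addItem :: StateP -> StateP
addItem s = s { itemArray = snoc s.itemArray s.nextID, nextID = s.nextID + 1 }

-- Assumed behaviour: drop the most recently added item.
-- `init` returns Nothing for an empty array, so an empty list stays empty.
remItem :: StateP -> StateP
remItem s = s { itemArray = fromMaybe [] (init s.itemArray) }
```

Leaving <code class="language-plaintext highlighter-rouge">nextID</code> untouched in <code class="language-plaintext highlighter-rouge">remItem</code> means a removed slot's ID is not reused, which keeps Halogen from confusing a new child with the state of an old one.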
<h3 id="other-posts-in-the-series">Other posts in the series:</h3> <ol> <li><a href="https://www.parsonsmatt.org/2015/10/03/elm_vs_purescript.html">Elm vs PureScript I: War of the Hello, Worlds</a></li> <li><a href="https://www.parsonsmatt.org/2015/10/05/elm_vs_purescript_ii.html">Elm vs PureScript II</a></li> <li><a href="https://www.parsonsmatt.org/2015/10/10/elm_architecture_in_purescript_iii.html">Elm Architecture in PureScript III: Dynamic Lists of Counters</a></li> </ol> Sun, 11 Oct 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/10/11/elm_architecture_in_purescript_iv_effects.html https://www.parsonsmatt.org/2015/10/11/elm_architecture_in_purescript_iv_effects.html Elm Architecture in PureScript III <h2 id="dynamic-lists-of-counters">Dynamic Lists of Counters</h2> On the <a href="https://www.parsonsmatt.org/2015/10/05/elm_vs_purescript_ii.html">last post</a>, we implemented a pair of counters. Now, we’ll generalize that out to a dynamic list of counters, and later, give them all remove buttons. In the process, we’ll learn how to combine components, stack them, peek on them, and otherwise deal with them appropriately. The code for this is available in <a href="https://github.com/parsonsmatt/purs-architecture-tutorial">this repository</a>. Let’s get started! We want a list of counters, a button to add a counter, and a button to remove a counter. 
Let’s define our state and inputs: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type StateP = { counterArray :: Array Int , nextID :: Int } initialState :: StateP initialState = { counterArray: [] , nextID: 0 } data Input a = AddCounter a | RemoveCounter a </code></pre></div></div> Another quick detour to define our parent-level state and query types: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type State g = InstalledState StateP Counter.State Input Counter.Input g CounterSlot type Query = Coproduct Input (ChildF CounterSlot Counter.Input) </code></pre></div></div> And, our UI function: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ui :: forall g. (Plus g) => Component (State g) Query g ui = parentComponent render eval where render state = H.div_ [ H.h1_ [ H.text "Counters" ] , H.ul_ $ map (\i -> mslot (CounterSlot i) Counter.ui (Counter.init 0)) state.counterArray , H.button [ E.onClick $ E.input_ AddCounter ] [ H.text "Add Counter" ] , H.button [ E.onClick $ E.input_ RemoveCounter ] [ H.text "Remove Counter" ] ] eval :: EvalParent Input StateP Counter.State Input Counter.Input g CounterSlot eval (AddCounter next) = do modify addCounter pure next eval (RemoveCounter next) = do modify removeCounter pure next mslot :: forall s f g p i. p -> Component s f g -> s -> HTML (SlotConstructor s f g p) i mslot slot comp state = H.slot slot \_ -> { component: comp, initialState: state } </code></pre></div></div> Basically the same thing we’ve been working with already! Instead of keeping a <code class="language-plaintext highlighter-rouge">CounterSlot 0</code> and <code class="language-plaintext highlighter-rouge">CounterSlot 1</code> around, we’ve got an array of integers. 
When we want to render them, we map over them with the slot type constructor and the <code class="language-plaintext highlighter-rouge">H.slot</code> to give them a place to go. Halogen figures out all of the event routing for us. <h2 id="removing-a-counter">Removing a Counter</h2> Alright, it’s time to give counters their own remove button. Rather than touch the counter at all, we’re simply going to wrap the existing counter component in a new component. The sole responsibility of this component will be handling the removal of counters. There’s a bit of boiler plate around the State and Query, but after that, the result is pretty tiny! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- src/Example/CounterRem.purs data Input a = Remove a type State g = InstalledState Unit Counter.State Input Counter.Input g CounterSlot type Query = Coproduct Input (ChildF CounterSlot Counter.Input) ui :: forall g. (Plus g) => Component (State g) Query g ui = parentComponent render eval where render _ = H.div_ [ mslot (CounterSlot 0) Counter.ui (Counter.init 0) , H.button [ E.onClick $ E.input_ Remove ] [ H.text "Remove" ] ] eval :: EvalParent Input Unit Counter.State Input Counter.Input g CounterSlot eval (Remove a) = pure a </code></pre></div></div> Since we’re not maintaining any state, we’ll just use the <code class="language-plaintext highlighter-rouge">Unit</code> type to signify that. Our <code class="language-plaintext highlighter-rouge">eval</code> function is going to punt the behavior to the parent component. Now… Halogen does some impressive type trickery. Coproducts, free monads, query algebrae… it can be pretty intimidating. There’s a decent amount of associated boilerplate as well. We’re about to get into some of that. 
Let’s look at <code class="language-plaintext highlighter-rouge">InstalledState</code> in the <a href="https://github.com/slamdata/purescript-halogen/blob/master/docs/Halogen/Component.md#installedstate">Halogen documentation</a>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type InstalledState s s' f f' g p = { parent :: s , children :: Map p (Tuple (Component s' f' g) s') , memo :: Map p (HTML Void (Coproduct f (ChildF p f') Unit)) } </code></pre></div></div> It’s a record with a parent state, a map from child slots to child states, and a map from child slots to memoized HTML. But what is all of this <code class="language-plaintext highlighter-rouge">coproduct</code> stuff again? A <code class="language-plaintext highlighter-rouge">Coproduct</code> is defined like this: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype Coproduct f g a = Coproduct (Either (f a) (g a)) </code></pre></div></div> It’s a way of saying “I have a value of type a inside of a functor. That functor is either f or g.” We know we can specialize <code class="language-plaintext highlighter-rouge">f</code> in the <code class="language-plaintext highlighter-rouge">InstalledState</code> to our <code class="language-plaintext highlighter-rouge">Input</code> query algebra. And <code class="language-plaintext highlighter-rouge">ChildF p f'</code> is a given child’s identifier and the child’s query algebra. Halogen is using the coproduct structure to keep track of the children’s query algebra inputs.
Revisiting our type synonyms again, we have: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type State g = InstalledState Unit Counter.State Input Counter.Input g CounterSlot </code></pre></div></div> The true state of this component isn’t just <code class="language-plaintext highlighter-rouge">Unit</code> – it’s the result of installing the <code class="language-plaintext highlighter-rouge">Counter.State</code> into this component. We’re giving that a name we can reference, and allowing the caller to provide the functor. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Query = Coproduct Input (ChildF CounterSlot Counter.Input) </code></pre></div></div> Finally, our <code class="language-plaintext highlighter-rouge">Query</code> just fills in the types for the combined query algebra. Alright! Awesome! We’ve augmented a component with a <code class="language-plaintext highlighter-rouge">Remove</code> button. Let’s embed that into a list. We’ll actually get to reuse almost everything from example three! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- src/Example/Four.purs data Input a = AddCounter a type State g = InstalledState StateP (Counter.State g) Input Counter.Query g CounterSlot type Query = Coproduct Input (ChildF CounterSlot Counter.Query) ui :: forall g. (Plus g) => Component (State g) Query g ui = parentComponent' render eval peek where </code></pre></div></div> Ah! We’re peeking! I can tell because of the <code class="language-plaintext highlighter-rouge">peek</code> function. And also the <code class="language-plaintext highlighter-rouge">'</code> on the end of <code class="language-plaintext highlighter-rouge">parentComponent'</code>. The <code class="language-plaintext highlighter-rouge">'</code> indicates peeking. Peeking is the way to inspect child components in purescript-halogen.
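To make the "either f or g" reading concrete, here is a small standalone sketch. It redefines <code class="language-plaintext highlighter-rouge">Coproduct</code> locally instead of importing Halogen's, and <code class="language-plaintext highlighter-rouge">ParentQuery</code>, <code class="language-plaintext highlighter-rouge">ChildQuery</code> and <code class="language-plaintext highlighter-rouge">describe</code> are made-up names for illustration. It also leaves out the <code class="language-plaintext highlighter-rouge">ChildF</code> wrapper that Halogen puts around the child side.

```purescript
module CoproductSketch where

import Prelude

import Data.Either (Either(..))

-- Same shape as the definition quoted above.
newtype Coproduct f g a = Coproduct (Either (f a) (g a))

-- Hypothetical query algebras for a parent and a child component.
data ParentQuery a = AddCounter a

data ChildQuery a
  = Increment a
  | Decrement a

-- Inject a query from the left or right functor into the combined one.
left :: forall f g a. f a -> Coproduct f g a
left = Coproduct <<< Left

right :: forall f g a. g a -> Coproduct f g a
right = Coproduct <<< Right

-- Routing a combined query is a pattern match on the Either inside.
describe :: forall a. Coproduct ParentQuery ChildQuery a -> String
describe (Coproduct (Left (AddCounter _))) = "parent: add counter"
describe (Coproduct (Right (Increment _))) = "child: increment"
describe (Coproduct (Right (Decrement _))) = "child: decrement"
```

A single value of the combined type can carry either kind of query, and whoever receives it can tell from the <code class="language-plaintext highlighter-rouge">Left</code>/<code class="language-plaintext highlighter-rouge">Right</code> tag where it should go. That is the routing job Halogen is doing with the real type.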
So when a child component of a peeking parent is done with an action, then the parent gets a chance to see the action and act accordingly. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> render state = H.div_ [ H.h1_ [ H.text "Counters" ] , H.ul_ (map (mapSlot CounterSlot Counter.ui (installedState unit)) state.counterArray) , H.button [ E.onClick $ E.input_ AddCounter ] [ H.text "Add Counter" ] ] eval :: EvalParent _ _ _ _ _ g CounterSlot eval (AddCounter next) = do modify addCounter pure next mapSlot slot comp state index = mslot (slot index) comp state </code></pre></div></div> Rendering and evalling work exactly as you’d expect. Let’s look at peeking! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> peek :: Peek (ChildF CounterSlot Counter.Query) StateP (Counter.State g) Input Counter.Query g CounterSlot peek (ChildF counterSlot (Coproduct queryAction)) = case queryAction of Left (Counter.Remove _) -> modify (removeCounter counterSlot) _ -> pure unit </code></pre></div></div> So this is kind of a more complex <code class="language-plaintext highlighter-rouge">peek</code> than you’d normally start with. My bad. Generally, the <code class="language-plaintext highlighter-rouge">peek</code> function has a definition that’d look like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>peek (ChildF childSlot action) = case action of DoThing next -> -- ... </code></pre></div></div> But we’re working with the installed/child components who manage their state using the coproduct machinery, and as of now, we have to manually unwrap the coproduct and pattern match on the <code class="language-plaintext highlighter-rouge">Either</code> value inside. When we match on the <code class="language-plaintext highlighter-rouge">Left</code> value, we get to see the immediate child’s actions. 
If we were to match on the <code class="language-plaintext highlighter-rouge">Right</code> value, then we’d get to inspect the actions of the children’s own children. In any case, we <code class="language-plaintext highlighter-rouge">peek</code> on the child component, and if it just did a <code class="language-plaintext highlighter-rouge">Remove</code> action, then we modify our own state. Otherwise, we ignore it. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- src/Main.purs main = ... do app <- runEx4 appendToBody app.node runEx4 = runUI Ex4.ui (installedState (Ex3.initialState)) </code></pre></div></div> And now we’ve got our dynamic list of removable embedded counters going. Next up, we’ll be looking at AJAX, effects, and other fun stuff. <h2 id="update-modularize-me-capn">UPDATE: Modularize me, cap’n!</h2> Ok, so I wasn’t happy with how unmodular the above example was. We had to redefine a whole component just to add a remove button. If I wanted another component that had a remove button, I’d have to redo all that work! No thanks. Instead, I made a higher order component out of it. There’s no need to distinguish between children, because it only has one. There’s no state involved either, so we’ll use Unit for both of them. The only query is Remove. So let’s put that all together! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- src/Example/RemGeneric.purs data QueryP a = Remove a type State s f g = InstalledState Unit s QueryP f g Unit type Query f = Coproduct QueryP (ChildF Unit f) addRemove :: forall g s f. 
(Plus g) => Component s f g -> s -> Component (State s f g) (Query f) g addRemove comp state = parentComponent render eval where render _ = H.div_ [ H.slot unit \_ -> { component: comp, initialState: state } , H.button [ E.onClick $ E.input_ Remove ] [ H.text "Remove" ] ] eval :: EvalParent QueryP Unit s QueryP f g Unit eval (Remove a) = pure a </code></pre></div></div> Easy! We’ve got a few extra type variables to represent where the child state and query will go. Fairly standard type synonym definitions for use in client components. The only kinda tricky part is rendering: we accept a component and initial state as parameters. Cool! Let’s see what the definition for the counter looks like with the remove button added: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- src/Example/CounterRemPrime.purs type State g = Rem.State Counter.State Counter.Input g type Query = Rem.Query Counter.Input ui :: forall g. (Plus g) => Component (State g) Query g ui = Rem.addRemove Counter.ui (Counter.init 0) </code></pre></div></div> More type synonyms! And a fairly nice one liner function to wrap the counter. The code for the list itself is essentially unchanged. We do have to import the <code class="language-plaintext highlighter-rouge">RemGeneric</code> as well as the <code class="language-plaintext highlighter-rouge">CounterRemPrime</code> module to be able to use the <code class="language-plaintext highlighter-rouge">RemGeneric.Input</code> type, but the type declarations hardly change at all. All in all, this level of componentization is fairly easy! Defining the type synonyms is a bit of a pain, but you’ll likely be writing a lot fewer of them when you have more involved components.
<h3 id="other-posts-in-the-series">Other posts in the series:</h3> <ol> <li><a href="/2015/10/03/elm_vs_purescript.html">Elm vs PureScript I: War of the Hello, Worlds</a></li> <li><a href="/2015/10/05/elm_vs_purescript_ii.html">Elm vs PureScript II</a></li> <li><a href="/2015/10/11/elm_architecture_in_purescript_iv_effects.html">Elm Architecture in PureScript IV: Effects</a></li> </ol> Sat, 10 Oct 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/10/10/elm_architecture_in_purescript_iii.html https://www.parsonsmatt.org/2015/10/10/elm_architecture_in_purescript_iii.html Elm vs PureScript II <h2 id="the-elm-architecture-in-purescript">The Elm Architecture, In PureScript</h2> There’s a fantastic introduction to the Elm programming style and application architecture called, as you might expect, <a href="https://github.com/evancz/elm-architecture-tutorial/">The Elm Architecture</a>. The tutorial begins with a rather trivial application, and demonstrates how to extend the application to be more useful via composition of components and managing signals. In the previous post, I compared “Hello World” with PureScript and Elm. I’d like to compare some fairly trivial programs, to get an idea on what simpler programs in the two languages look and feel like. Since The Elm Architecture already does a fantastic job of doing that for Elm, I’ve decided to simply recreate it in PureScript. As it happens, <code class="language-plaintext highlighter-rouge">purescript-thermite</code> is very much like React, and relies a lot on internal state. As a result, it doesn’t work quite as naturally with the Elm architecture examples, especially for using things as nested components. <code class="language-plaintext highlighter-rouge">purescript-halogen</code> seems to have an easier way to handle events and inputs, so I’ve decided to focus on that library instead. The repository with the code is available <a href="https://github.com/parsonsmatt/purs-architecture-tutorial">here</a>. 
<h1 id="0-hello-world">0. Hello World!</h1> Since I didn’t do it in the previous post, here’s “Hello World!” in Halogen: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- src/Example/Zero.purs
data Input a = Input a

ui :: forall g. (Functor g) => Component Unit Input g
ui = component render eval
  where
    render state =
      H.div_ [ H.h1_ [ H.text "Hello, World!" ] ]

    eval :: Eval Input Unit Input g
    eval (Input a) = pure a

-- src/Main.purs
import qualified Example.Zero as Ex0

main = runAff throwException (const (pure unit)) $ do
  node <- runUI Ex0.ui unit
  appendToBody node.node
</code></pre></div></div> Halogen expects the Input type to have kind <code class="language-plaintext highlighter-rouge">* -> *</code>, and refers to it as a query algebra. We’ll get more into the details of that in the later examples, but we aren’t really using it at the moment. I find it helpful to think of the query algebra as a “public interface” for the component. The <code class="language-plaintext highlighter-rouge">ui</code> function defines the component that we’ll be using, and has two parts: <code class="language-plaintext highlighter-rouge">render</code> and <code class="language-plaintext highlighter-rouge">eval</code>. <code class="language-plaintext highlighter-rouge">render</code> defines the layout in terms of the current state. <code class="language-plaintext highlighter-rouge">eval</code> uses the <code class="language-plaintext highlighter-rouge">Input</code> query algebra to determine how to modify the state. 
The type signature for <code class="language-plaintext highlighter-rouge">ui</code> is a bit intimidating, but it’s not too bad: <ul> <li><code class="language-plaintext highlighter-rouge">Unit</code> is the state that this component is responsible for.</li> <li><code class="language-plaintext highlighter-rouge">Input</code> is the query algebra we’ll be using in the component.</li> <li><code class="language-plaintext highlighter-rouge">g</code> is the functor/monad that the component operates in (like <code class="language-plaintext highlighter-rouge">Eff</code>, <code class="language-plaintext highlighter-rouge">Aff</code>, etc.)</li> </ul> This is easy enough, so let’s move on to the first interactive example: Counter! <h1 id="1-counter">1. Counter</h1> First, the necessary imports: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- src/Example/One.purs module Example.One where import Prelude import Halogen import qualified Halogen.HTML.Indexed as H import qualified Halogen.HTML.Events.Indexed as E </code></pre></div></div> Now, we’ll define the type of our state. In this case, it’s a type alias to a record with a count field. Our query algebra is just like Elm but with the extra type parameter. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type State = { count :: Int } data Input a = Increment a | Decrement a </code></pre></div></div> And now, the <code class="language-plaintext highlighter-rouge">ui</code> function! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ui :: forall g. 
(Functor g) => Component State Input g ui = component render eval where render state = H.div_ [ H.button [ E.onClick $ E.input_ Decrement ] [ H.text "-" ] , H.p_ [ H.text (show state.count)] , H.button [ E.onClick $ E.input_ Increment ] [ H.text "+" ] ] </code></pre></div></div> We’re using <code class="language-plaintext highlighter-rouge">E.input_</code> to send an event to the <code class="language-plaintext highlighter-rouge">eval</code> function. If we cared about the event itself, then we could use <code class="language-plaintext highlighter-rouge">E.input</code> and provide a function that would accept the event information and provide a value on the <code class="language-plaintext highlighter-rouge">Input</code>. We don’t, so we’ll skip that for now. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> eval :: Eval Input State Input g eval (Increment next) = do modify (\state -> state { count = state.count + 1 }) pure next eval (Decrement next) = do modify (\state -> state { count = state.count - 1 }) pure next </code></pre></div></div> Halogen has <code class="language-plaintext highlighter-rouge">get</code> and <code class="language-plaintext highlighter-rouge">modify</code> functions for use in the eval functions, which let us either view the current state or modify it. Halogen uses the type variable associated with our query algebra to type the eval function. Even though we’re not using it yet, we still need for the function to evaluate to something of the same type. That’s why we pass <code class="language-plaintext highlighter-rouge">next</code> along. 
Running the UI is essentially the same: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- src/Main.purs main = runAff throwException (const (pure unit)) $ do -- node <- runUI Ex0.ui unit node <- runUI Ex1.ui { count: 0 } appendToBody node.node </code></pre></div></div> We use <code class="language-plaintext highlighter-rouge">runUI</code> with our <code class="language-plaintext highlighter-rouge">ui</code> definition and an initial state, and append that to the body. Nice! <h1 id="2-counter-pair">2. Counter Pair</h1> This is where the two examples begin to differ. In Elm, the render function has an address to send actions to, which are then evaluated later. This makes it very easy to lift a child component’s layout and rendering logic into a parent component: just provide a forwarding address and an <code class="language-plaintext highlighter-rouge">Input</code> constructor, and update the state. The state is all kept in the top level component, and passed down to children. As a result, the parent component has access to all of the state of the application, and can inspect it at will. Both Thermite and Halogen instead encapsulate the state, such that parents don’t know about the internal state of their children. Halogen’s query algebra (the <code class="language-plaintext highlighter-rouge">Input a</code> type we’ve been using) is meant to provide an API for the components, allowing them to be interacted with. So, let’s get started! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- src/Example/Two.purs -- .. imports .. 
newtype CounterSlot = CounterSlot Int
-- ord/eq instances for CounterSlot

type StateP =
  { topCounter :: CounterSlot
  , bottomCounter :: CounterSlot
  }

init :: StateP
init = { topCounter: CounterSlot 0, bottomCounter: CounterSlot 1 }
</code></pre></div></div> Since we’re now talking about a component that contains other components, we want some way to talk about how it contains them. Halogen carries around a lot more information in the type system about what’s going on with each component’s state, effects, etc. So now, the state of our CounterPair is just going to be a pair of slots for counters. The slot is used to give an identifier to the element that the component contains. The query algebra is much simpler: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Input a = Reset a
</code></pre></div></div> Since the counters are keeping track of their own internal state, all we need to do is know when to reset them. Before we write the parent component, we’ll want to define some type synonyms to make it easier to refer to the component. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type State g = InstalledState StateP Ex1.State Input Ex1.Input g CounterSlot
</code></pre></div></div> So, the real state of our component is going to be an <code class="language-plaintext highlighter-rouge">InstalledState</code> of our newly defined <code class="language-plaintext highlighter-rouge">StateP</code> type on top of the <code class="language-plaintext highlighter-rouge">Ex1.State</code> type. We’ll also have the <code class="language-plaintext highlighter-rouge">Input</code> over the <code class="language-plaintext highlighter-rouge">Ex1.Input</code> query types. 
Finally, we’ll mention the <code class="language-plaintext highlighter-rouge">g</code> functor, and then refer to the <code class="language-plaintext highlighter-rouge">CounterSlot</code> as the type of slot that the counters will go in. Now, to recap: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type State g = InstalledState -- we're installing a state into another state, StateP -- our parent state Ex1.State -- child state Input -- parent query Ex1.Input -- child query g -- functor variable CounterSlot -- slot for child components </code></pre></div></div> Alright! Next up, we’ve got our query type synonym. It’s using some fanciness in the form of coproduct, but it’s not too complex. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Query = Coproduct Input (ChildF CounterSlot Ex1.Input) </code></pre></div></div> Actually, this one is a lot simpler! A Coproduct is <code class="language-plaintext highlighter-rouge">newtype Coproduct f g a = Coproduct (Either (f a) (g a))</code>. In actual English, a coproduct is a way of saying “I have a value of type a, and it’s either in a functor f or a functor g.” So, our type synonym is saying something like “The query type is either some value inside the Input functor, or a value inside the slot-indexed child input functor.” It’s pretty complex, but the safety and composability make it worth it. I promise! Now, let’s write the component function! We’ll use the type synonyms to simplify the component type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ui :: forall g. (Plus g) => Component (State g) Query g </code></pre></div></div> Not bad! Note that we’re using <code class="language-plaintext highlighter-rouge">Plus g</code> instead of just <code class="language-plaintext highlighter-rouge">Functor</code> because we’re doing a parent component now. 
On to the function body! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ui = parentComponent render eval
  where
    render state =
      H.div_
        [ H.slot state.topCounter mkCounter
        , H.slot state.bottomCounter mkCounter
        , H.button [ E.onClick $ E.input_ Reset ] [ H.text "Reset!" ]
        ]

    mkCounter :: Unit -> { component :: Component Ex1.State Ex1.Input g, initialState :: Ex1.State }
    mkCounter _ = { component: Ex1.ui, initialState: Ex1.init 0 }
</code></pre></div></div> We use <code class="language-plaintext highlighter-rouge">parentComponent</code> now. The <code class="language-plaintext highlighter-rouge">H.slot</code> function accepts two arguments: <ol> <li>Some value of our <code class="language-plaintext highlighter-rouge">CounterSlot</code> type that we’re using to identify child components,</li> <li>A function from <code class="language-plaintext highlighter-rouge">Unit</code> to a record containing the component and initial state.</li> </ol> Halogen uses the function to lazily render the page. The button sends the <code class="language-plaintext highlighter-rouge">Reset</code> message, which will get handled by our <code class="language-plaintext highlighter-rouge">eval</code> function. Finally, the fun stuff! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    eval (Reset next) = -- ... wait, how do we change the child state?
</code></pre></div></div> Ah! We can’t! Not quite yet, anyway. The parent component has no way to directly set the state of the child component. Our counter in <code class="language-plaintext highlighter-rouge">Example.One</code> only supports the following actions: <code class="language-plaintext highlighter-rouge">Increment</code> and <code class="language-plaintext highlighter-rouge">Decrement</code>. 
If we want to reset the counter, we’ll have to add that to the list of actions our <code class="language-plaintext highlighter-rouge">Counter</code> supports. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- src/Example/Counter.purs -- mostly unchanged, but the query algebra: data Input a = Increment a | Decrement a | Reset a -- ... and the eval function: eval :: Eval Input State Input g eval (Increment next) = ... eval (Decrement next) = ... eval (Reset next) = do modify (const (init 0)) pure next </code></pre></div></div> <code class="language-plaintext highlighter-rouge">modify (const (init 0))</code> is equivalent to <code class="language-plaintext highlighter-rouge">modify \state -> init 0</code>, so we’re – as expected – resetting the counter state to 0. Now, the counters themselves don’t have a control for <code class="language-plaintext highlighter-rouge">Reset</code>ing themselves. Fortunately, we can easily send actions to child components from the parent component. Let’s get back to that <code class="language-plaintext highlighter-rouge">eval</code> function from the counter pair: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- src/Example/Two.purs -- first, change the import to the new counter: import qualified Example.Counter as Ex1 -- ... the rest of the file... eval (Reset next) = do query (CounterSlot 0) (action Ex1.Reset) query (CounterSlot 1) (action Ex1.Reset) pure next </code></pre></div></div> <code class="language-plaintext highlighter-rouge">query</code> allows us to use the query algebra (or, public interface) that our components define. We provide an identifier for the query, so we know where to look, and an action. The action in this case is simply <code class="language-plaintext highlighter-rouge">Reset</code>, and we don’t care about the return value. 
Halogen also defines <code class="language-plaintext highlighter-rouge">request</code>, which we can use to get some information out of the component. Finally, running the counter works pretty smooth: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- src/Main.purs runEx2 = runUI Ex2.ui (installedState Ex2.init) main = -- boilerplate elided... app <- runEx2 appendToBody app.node </code></pre></div></div> We have to use <code class="language-plaintext highlighter-rouge">installedState</code> since we’re dealing with parent/children components. Alright, that’s the first two examples from Elm Architecture in PureScript Halogen! I’ll be covering the rest in a future installment. <h3 id="other-posts-in-the-series">Other Posts in the series:</h3> <ol> <li><a href="https://www.parsonsmatt.org/2015/10/03/elm_vs_purescript.html">Elm vs PureScript I: War of the Hello, Worlds</a></li> <li><a href="https://www.parsonsmatt.org/2015/10/05/elm_vs_purescript_ii.html">Elm vs PureScript II</a></li> <li><a href="https://www.parsonsmatt.org/2015/10/10/elm_architecture_in_purescript_iii.html">Elm Architecture in PureScript III: Dynamic Lists of Counters</a></li> <li><a href="https://www.parsonsmatt.org/2015/10/11/elm_architecture_in_purescript_iv_effects.html">Elm Architecture in PureScript IV: Effects</a></li> </ol> Mon, 05 Oct 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/10/05/elm_vs_purescript_ii.html https://www.parsonsmatt.org/2015/10/05/elm_vs_purescript_ii.html Elm vs PureScript I <h2 id="war-of-the-hello-worlds">War of the Hello, Worlds</h2> I’m building a web application with Haskell, and I’d like the front end to be functional as well. I could write it all in JavaScript, but that sounds boring. I could go the other direction and write it all in Haskell, but I can’t figure out how to build GHCjs (and have concerns about performance of compiled Haskell). 
I could learn ClojureScript, but that’s a big investment, and I mostly want to get something built. That leaves Elm and PureScript. Elm is a fairly simple functional language with a focus on developing browser UIs and making it easy to learn and use. PureScript is an advanced functional language that is quite a bit more general and powerful. Where Elm prioritizes easy development, PureScript prioritizes powerful language features and abstractions. PureScript’s libraries and frameworks for developing applications are a bit less mature than Elm’s, but that’s somewhat to be expected given the relative age of the languages. This post seeks to evaluate both PureScript and Elm for the purpose of building a single page application from the perspective of a relative newbie. <h2 id="hello-world">“Hello, World”</h2> <h3 id="elm">Elm:</h3> Elm’s CLI is rather nice. Getting started with a project is just: <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ mkdir elm-project && cd elm-project
$ elm package install
</code></pre></div></div> And we can get “Hello World” on the screen with a few commands: <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat > Main.elm
module Main where
import Html exposing (..)
main = h1 [] [text "Hello World!"]
^C
$ elm package install evancz/elm-html --yes
$ elm reactor
</code></pre></div></div> Open <code class="language-plaintext highlighter-rouge">https://localhost:8000</code>, click on <code class="language-plaintext highlighter-rouge">Main.elm</code> and you’ll see Hello World. Nice! Getting trivial functionality in applications via signals is pretty easy, as demonstrated by the excellent <a href="https://github.com/evancz/elm-architecture-tutorial/">Elm Architecture</a> tutorial. Elm’s <code class="language-plaintext highlighter-rouge">reactor</code> server is very fast, which makes for a rather nice development cycle. 
<h3 id="purescript">PureScript:</h3> The <a href="https://github.com/bodil/pulp">pulp</a> build tool is excellent, and we can get a project started pretty easily: <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ mkdir purs-project && cd purs-project $ pulp init $ cat > index.html <head><script src="app.js"></script></head> ^C $ pulp server </code></pre></div></div> Now we can go to <code class="language-plaintext highlighter-rouge">https://localhost:1337</code>, open the console and see that <code class="language-plaintext highlighter-rouge">"Hello sailor!"</code> has been logged. PureScript, being more general and modular by default than Elm, requires a bit more work before you can have something up on the screen. There are a few frameworks for PureScript apps: <h4 id="purescript-react"><code class="language-plaintext highlighter-rouge">purescript-react</code></h4> is a library for low level React bindings. <h4 id="purescript-halogen"><code class="language-plaintext highlighter-rouge">purescript-halogen</code></h4> is a high level framework based on <code class="language-plaintext highlighter-rouge">virtual-dom</code> that seems extremely advanced and powerful. Unfortunately, the power comes with complexity, and the documentation, API, and examples seem to be in a state of flux. <h4 id="purescript-thermite"><code class="language-plaintext highlighter-rouge">purescript-thermite</code></h4> is a higher level framework based on <code class="language-plaintext highlighter-rouge">purescript-react</code>. It looks nicer to use and more abstract, but at the cost of some missing features. The examples and documentation are kept up to date and are quite readable. For this reason, I’ll go with it! 
<h3 id="getting-hello-world-on-the-screen">Getting Hello World on the screen…</h3> First, we need to install our dependencies: <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ pulp dep install --save purescript-thermite </code></pre></div></div> Well, there’s quite a bit of boilerplate… <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module Main where import Prelude import Data.Maybe import Control.Monad.Eff import Data.Maybe.Unsafe import Data.Nullable (toMaybe) import qualified Thermite as T import qualified Thermite.Action as T import qualified React as R import qualified React.DOM as R import qualified React.DOM.Props as RP import qualified DOM as DOM import qualified DOM.HTML as DOM import qualified DOM.HTML.Document as DOM import qualified DOM.HTML.Types as DOM import qualified DOM.HTML.Window as DOM </code></pre></div></div> (see what I meant about the granularity and modularity?) The actual code to get “Hello World” up isn’t so bad: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>render :: T.Render _ {} _ {} render _ _ _ _ = R.div' [ R.h1' [ R.text "Hello world!" ] ] spec :: T.Spec _ {} _ {} spec = T.simpleSpec {} perfAction render perfAction :: T.PerformAction _ {} _ {} perfAction _ _ = T.modifyState (const {}) main :: forall eff. Eff (dom :: DOM.DOM | eff) R.ReactElement main = body >>= R.render (R.createFactory (T.createClass spec) {}) where body = do win <- DOM.window doc <- DOM.document win elm <- fromJust <$> toMaybe <$> DOM.body doc return $ DOM.htmlElementToElement elm </code></pre></div></div> Thermite requires that we declare a <code class="language-plaintext highlighter-rouge">State</code> and action handlers. I used <code class="language-plaintext highlighter-rouge">Unit</code> for everything, and just rendered those divs to the screen. 
The <code class="language-plaintext highlighter-rouge">index.html</code> page needs to be modified to include a link to <code class="language-plaintext highlighter-rouge">React</code> in the head, and link to <code class="language-plaintext highlighter-rouge">app.js</code> after the body loads. So that’s a comparison on “Hello World” for the two. Elm’s quite a bit simpler, but we’ll see how PureScript’s more powerful language plays out in the more complex examples. <h3 id="other-posts-in-this-series">Other posts in this series:</h3> <ol> <li><a href="https://www.parsonsmatt.org/2015/10/05/elm_vs_purescript_ii.html">Elm vs PureScript II</a></li> <li><a href="https://www.parsonsmatt.org/2015/10/10/elm_architecture_in_purescript_iii.html">Elm Architecture in PureScript III: Dynamic Lists of Counters</a></li> <li><a href="https://www.parsonsmatt.org/2015/10/11/elm_architecture_in_purescript_iv_effects.html">Elm Architecture in PureScript IV: Effects</a></li> </ol> Sat, 03 Oct 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/10/03/elm_vs_purescript.html https://www.parsonsmatt.org/2015/10/03/elm_vs_purescript.html Recursion Excursion Recursive definitions are a lot of fun. The typical example of a recursive definition is the natural numbers: <blockquote> A natural number is either 0 or the successor of a natural number. </blockquote> Expressed in Haskell, this is: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Nat = Zero | Succ Nat </code></pre></div></div> Zero is <code class="language-plaintext highlighter-rouge">Zero</code>, as you’d expect. One is <code class="language-plaintext highlighter-rouge">Succ Zero</code>, two is <code class="language-plaintext highlighter-rouge">Succ (Succ Zero)</code>, etc. The natural numbers can be recursively defined like this. 
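To make the recursive definition a little more tangible, here is a minimal sketch of working with <code class="language-plaintext highlighter-rouge">Nat</code>. The helper names <code class="language-plaintext highlighter-rouge">toInt</code>, <code class="language-plaintext highlighter-rouge">fromInt</code>, and <code class="language-plaintext highlighter-rouge">add</code> are my own additions for illustration, and clamping negative inputs to <code class="language-plaintext highlighter-rouge">Zero</code> is just one possible choice:

```haskell
-- Peano naturals, exactly as defined above.
data Nat = Zero | Succ Nat
  deriving (Show, Eq)

-- Hypothetical helper: count the Succ constructors.
toInt :: Nat -> Int
toInt Zero     = 0
toInt (Succ n) = 1 + toInt n

-- Hypothetical helper: build a Nat from an Int.
-- Assumption: anything at or below zero becomes Zero.
fromInt :: Int -> Nat
fromInt n
  | n <= 0    = Zero
  | otherwise = Succ (fromInt (n - 1))

-- Addition by recursion on the first argument:
-- peel off one Succ at a time and put it back on the result.
add :: Nat -> Nat -> Nat
add Zero     m = m
add (Succ n) m = Succ (add n m)
```

In GHCi, <code class="language-plaintext highlighter-rouge">toInt (add (fromInt 2) (fromInt 3))</code> evaluates to <code class="language-plaintext highlighter-rouge">5</code>. Every function follows the shape of the data type: one case for <code class="language-plaintext highlighter-rouge">Zero</code>, one recursive case for <code class="language-plaintext highlighter-rouge">Succ</code>.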
<h2 id="extension-one">Extension One:</h2> Lists are extremely similar to the natural numbers – we just attach something to each <code class="language-plaintext highlighter-rouge">Succ</code>, and now we’ve got a linked list of things. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data List a = Nil | Cons a (List a)
</code></pre></div></div> And, interestingly enough, you can encode the natural numbers as lists. We can replace the <code class="language-plaintext highlighter-rouge">a</code> with the unit type <code class="language-plaintext highlighter-rouge">()</code> and the list is equivalent to the natural number representing its length: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Cons () (Cons () (Cons () Nil))
Succ (Succ (Succ Zero))
</code></pre></div></div> Cool! But now that I’ve seen one example of this sort of progression, I want to see what else can happen. For lists, we added a type variable <code class="language-plaintext highlighter-rouge">a</code> to the definition of the natural numbers. Let’s try adding another type variable: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data TwoList a b = Z | S a b (TwoList a b)
</code></pre></div></div> But this is boring. It’s basically the exact same thing as <code class="language-plaintext highlighter-rouge">List (a, b)</code>. And it’s easy to see that adding more of these variables is the same as adding elements to the tuple, so a <code class="language-plaintext highlighter-rouge">FiveList a b c d e</code> is really just a <code class="language-plaintext highlighter-rouge">List (a, b, c, d, e)</code>. So, we don’t want our new type variable to be trivially expressible by a regular list. What’s next? 
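One way to convince yourself that <code class="language-plaintext highlighter-rouge">TwoList a b</code> carries nothing beyond a list of pairs is to write the conversions in both directions. This is only a sketch: it uses Haskell’s built-in list type <code class="language-plaintext highlighter-rouge">[(a, b)]</code> in place of the <code class="language-plaintext highlighter-rouge">List</code> defined above, and <code class="language-plaintext highlighter-rouge">toPairs</code> and <code class="language-plaintext highlighter-rouge">fromPairs</code> are hypothetical names chosen for illustration:

```haskell
-- The two-variable list from above.
data TwoList a b = Z | S a b (TwoList a b)
  deriving (Show, Eq)

-- Hypothetical helper: bundle each cell's two fields into a tuple.
toPairs :: TwoList a b -> [(a, b)]
toPairs Z          = []
toPairs (S a b xs) = (a, b) : toPairs xs

-- Hypothetical helper: unbundle each tuple back into a cell.
fromPairs :: [(a, b)] -> TwoList a b
fromPairs []            = Z
fromPairs ((a, b) : xs) = S a b (fromPairs xs)
```

Both functions rebuild the structure cell by cell without dropping or inventing anything, so going there and back returns the value you started with. That round trip is what “basically the exact same thing” means here.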
We could add the type variable to the <code class="language-plaintext highlighter-rouge">Nil</code> constructor, but that’s also pretty uninteresting. <code class="language-plaintext highlighter-rouge">List' b a = Nil' b | Cons' a (List' b a)</code> doesn’t buy any real expressive power. Well, let’s differentiate that new type variable <code class="language-plaintext highlighter-rouge">b</code> in some way. Since it’s a type variable, we can really only talk about the kind that it has. <code class="language-plaintext highlighter-rouge">b :: *</code> didn’t work, so let’s try <code class="language-plaintext highlighter-rouge">b :: * -> *</code>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data ListB a b = NilB | ConsB (b a) (ListB a b) </code></pre></div></div> Ah, but this is also trivial! A <code class="language-plaintext highlighter-rouge">ListB Int Maybe</code> is the same as a <code class="language-plaintext highlighter-rouge">List (Maybe Int)</code>. We have bought nothing with this arrangement. Well, we can’t just apply <code class="language-plaintext highlighter-rouge">a</code> to <code class="language-plaintext highlighter-rouge">b</code>, we have to try something else. The remaining list is the only thing left, so let’s try that. <h2 id="extension-two">Extension Two:</h2> <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data ListF a b = NilF | ConsF b (a (ListF a b)) </code></pre></div></div> There we go. This is not trivially expressible as a <code class="language-plaintext highlighter-rouge">List</code>! And I really felt weird writing <code class="language-plaintext highlighter-rouge">ListF Int Maybe</code> so I swapped the type variables: <code class="language-plaintext highlighter-rouge">b</code> is now the value, and <code class="language-plaintext highlighter-rouge">a</code> is the new thing we’re playing with. But… What does it look like? 
Let’s try a <code class="language-plaintext highlighter-rouge">ListF Maybe Int</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- specialized type: List = Nil | Cons Int (Maybe List)
nil = NilF
one = ConsF 1 Nothing
one' = ConsF 1 (Just NilF)
two = ConsF 1 (Just (ConsF 2 Nothing))
</code></pre></div></div> Interesting! We have two termination cases here. The ordinary <code class="language-plaintext highlighter-rouge">NilF</code> that we get from the list definition, and <code class="language-plaintext highlighter-rouge">Nothing</code> we get from using <code class="language-plaintext highlighter-rouge">Maybe</code> as the functor. What about <code class="language-plaintext highlighter-rouge">Either</code>? Let’s see what <code class="language-plaintext highlighter-rouge">ListF (Either String) Int</code> looks like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- specialized: List = Nil | Cons Int (Either String List)
oneR = ConsF 1 (Right NilF)
oneL = ConsF 1 (Left "asdf")
twoR = ConsF 1 (Right (ConsF 2 (Right NilF)))
twoL = ConsF 1 (Right (ConsF 2 (Left "nope")))
</code></pre></div></div> We’ve got two ways to terminate again. The first is the <code class="language-plaintext highlighter-rouge">NilF</code> constructor. The second is the string in the Either definition. Is there another interesting thing of kind <code class="language-plaintext highlighter-rouge">* -> *</code> that we can try this with? We’ve tried <code class="language-plaintext highlighter-rouge">Maybe</code> and <code class="language-plaintext highlighter-rouge">Either</code>, which were both basically the same thing. 
We can figure out why by inspecting their definitions: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Maybe a = Nothing | Just a data Either b a = Left b | Right a </code></pre></div></div> They’re both sum types. When we have <code class="language-plaintext highlighter-rouge">Maybe ()</code>, we have two values: <code class="language-plaintext highlighter-rouge">Nothing</code> and <code class="language-plaintext highlighter-rouge">Just ()</code>. We’ve only added 1 to the number of values in the type variable. <code class="language-plaintext highlighter-rouge">Maybe Bool</code> has <code class="language-plaintext highlighter-rouge">Nothing</code>, <code class="language-plaintext highlighter-rouge">Just True</code>, and <code class="language-plaintext highlighter-rouge">Just False</code>. Again, just one more value than <code class="language-plaintext highlighter-rouge">Bool</code> itself. When we use these in our augmented list, the <code class="language-plaintext highlighter-rouge">a</code> variable is fixed to <code class="language-plaintext highlighter-rouge">ListF f a</code>. So anything that doesn’t carry the last type variable is just going to terminate the list. Either can therefore terminate the list with a different type, but this isn’t fundamentally very interesting. Since simple sum types don’t seem to be terribly interesting, let’s try a product type! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type With a = (,) a </code></pre></div></div> is a convenient synonym we’ll use. To keep things simple, we’ll stick with <code class="language-plaintext highlighter-rouge">Bool</code> for now. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>oneL :: ListF (With Bool) Int
oneL = ConsF 1 (True, NilF)

twoL :: ListF (With Bool) Int
twoL = ConsF 1 (False, ConsF 2 (True, NilF))
</code></pre></div></div> Well, this feels just like a <code class="language-plaintext highlighter-rouge">List (a, b)</code>, but laid out differently. So far, products and sums haven’t really bought us anything. Fortunately, there are exponential types, or functions as they’re more commonly known. Reader is defined as <code class="language-plaintext highlighter-rouge">type Reader r = (->) r</code>. It’s a functor, like <code class="language-plaintext highlighter-rouge">Maybe</code> and <code class="language-plaintext highlighter-rouge">Either</code>, and it has the right kind when we supply the type to be read. Let’s… give that a shot? We’ll specialize the Reader to Bool because there are only two possible values of type <code class="language-plaintext highlighter-rouge">Bool</code>. This makes it a bit easier to deal with. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- specialized: List = Nil | Cons Int (Reader Bool List)
</code></pre></div></div> Ok, wait, what does this mean again? A <code class="language-plaintext highlighter-rouge">ListF (Reader Bool) Int</code> means that the next item in the list is a function that takes a Boolean value and returns a List. Ok. Let’s do this. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>one = ConsF 1 (\c -> NilF)
two = ConsF 1 (\c -> if c then NilF else ConsF 2 (const NilF))
</code></pre></div></div> Wait, what? We’re expressing control flow and branching! Oh, this is just too cool. How do we evaluate that? We’ll pass a single Bool all the way through, see what it does, and collect the results into a regular list. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>single :: Bool -> ListF (Reader Bool) Int -> [Int] single _ NilF = [] single c (ConsF n f) = n : single c (f c) single False two == [1, 2] single True two == [1] </code></pre></div></div> Cool! By using the Reader functor we’ve turned the List into a binary tree. Observe: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>binTree = ConsF 1 $ \a -> if a then branchOne else branchTwo branchOne = ConsF 10 $ \a -> if a then binTree else branchTwo branchTwo = ConsF 2 $ \a -> if a then NilF else branchTwo </code></pre></div></div> Check this craziness out. What even is it? It’s some kind of graph/maze thing. Let’s write a function to traverse this business: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>traverseBool :: [Bool] -> ListF (Reader Bool) Int -> [Int] traverseBool _ NilF = [] traverseBool [] (ConsF n _) = [n] traverseBool (b:bs) (ConsF n f) = n : traverseBool bs (f b) </code></pre></div></div> Alright, so given <code class="language-plaintext highlighter-rouge">[True, True, False, True, True, False]</code>, we’ll generate <code class="language-plaintext highlighter-rouge">[1, 10, 1, 2]</code>. Given a different list of Bools, it’d traverse differently. Very interesting! We can represent binary trees using <code class="language-plaintext highlighter-rouge">Reader Bool</code>. Bool has two possible values, and we’ve got trees with two branches. It seems that the number of branches we’ll get is dependent on how many inhabitants there are in the type. If that’s the case, then <code class="language-plaintext highlighter-rouge">Reader ()</code> will simply be a linked list. 
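That hunch, and the traversal result quoted above, can be checked directly in GHCi. What follows is a minimal self-contained sketch rather than code lifted from the post: it restates <code class="language-plaintext highlighter-rouge">ListF</code> and the <code class="language-plaintext highlighter-rouge">Reader</code> synonym as defined in this post, repeats the three-node maze, and adds <code class="language-plaintext highlighter-rouge">toList</code>, a hypothetical helper that is not part of the original code and that flattens a <code class="language-plaintext highlighter-rouge">ListF (Reader ())</code> into an ordinary list.

```haskell
-- Restated from earlier in the post so the block stands alone.
data ListF f a = NilF | ConsF a (f (ListF f a))

-- The post's synonym: a "Reader r" is just a function out of r.
type Reader r = (->) r

binTree, branchOne, branchTwo :: ListF (Reader Bool) Int
binTree   = ConsF 1  $ \a -> if a then branchOne else branchTwo
branchOne = ConsF 10 $ \a -> if a then binTree   else branchTwo
branchTwo = ConsF 2  $ \a -> if a then NilF      else branchTwo

-- Follow one Bool per step; stop when either the directions
-- or the structure run out.
traverseBool :: [Bool] -> ListF (Reader Bool) Int -> [Int]
traverseBool _      NilF        = []
traverseBool []     (ConsF n _) = [n]
traverseBool (b:bs) (ConsF n f) = n : traverseBool bs (f b)

-- Hypothetical helper (not in the post): with () as the input type there
-- is exactly one way forward at every cell, so the structure is a list.
toList :: ListF (Reader ()) a -> [a]
toList NilF        = []
toList (ConsF x k) = x : toList (k ())
```

In GHCi, <code class="language-plaintext highlighter-rouge">traverseBool [True, True, False, True, True, False] binTree</code> evaluates to <code class="language-plaintext highlighter-rouge">[1, 10, 1, 2]</code>, while <code class="language-plaintext highlighter-rouge">toList</code> only ever has one direction to go, <code class="language-plaintext highlighter-rouge">()</code>, at each cell. A <code class="language-plaintext highlighter-rouge">Reader ()</code> list written out by hand looks like this: 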
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fancyList :: ListF (Reader ()) Int fancyList = ConsF 1 (\() -> ConsF 2 (\() -> NilF)) </code></pre></div></div> So <code class="language-plaintext highlighter-rouge">List a</code> can be expressed as <code class="language-plaintext highlighter-rouge">ListF (Reader ()) a</code>, and the natural numbers are then <code class="language-plaintext highlighter-rouge">ListF (Reader ()) ()</code>. This is the first really interesting functor we’ve used here. Either and Maybe simply provided extra termination options. Tupling just added a bit of extra information at each cell. Reader gives us branching. Finally, let’s try another more interesting functor: State! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype State s a = State { runState :: s -> (a, s) } -- specialized type: ListF (State Bool) Int one = ConsF 1 (State (\s -> (NilF, s))) two = ConsF 1 (State (\s -> (ConsF 2 (State (\s' -> (NilF, not s') )), s))) </code></pre></div></div> Well, this is kind of weird. Let’s traverse it and see what happens. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>traverseState :: [Bool] -> ListF (State Bool) Int -> [Int] traverseState _ NilF = [] traverseState [] (ConsF n _) = [n] traverseState (b:bs) (ConsF n s) = n : traverseState bs (fst (runState s b)) </code></pre></div></div> So, this will go down the list, applying the Bool values to the states in turn, extracting the Ints, and discarding the result states. This isn’t any different from Reader yet. It seems like we should be able to kick off the computation with a single Bool and let it carry on down. Let’s try passing the state down. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- specializing the type: -- traverseState' :: Bool -> ListF (State Bool) Int -> [Int] traverseState' :: s -> ListF (State s) a -> [a] traverseState' _ NilF = [] traverseState' b (ConsF n s) = let (list, state) = runState s b in n : traverseState' state list </code></pre></div></div> And if we want to remember the state history, <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- specialized: -- traverseState'' :: Bool -> ListF (State Bool) Int -> [(Int, Bool)] traverseState'' :: s -> ListF (State s) a -> [(a, s)] traverseState'' _ NilF = [] traverseState'' b (ConsF n s) = let (list, state) = runState s b in (n, b) : traverseState'' state list </code></pre></div></div> Cool! So now we can drop a seed value into a chain of stateful computations and the result will be a list of the return values of each computation. The specialized signatures in the comments are more specific than they need to be: nothing here depends on <code class="language-plaintext highlighter-rouge">Bool</code> or <code class="language-plaintext highlighter-rouge">Int</code>, so the general signature <code class="language-plaintext highlighter-rouge">traverseState' :: a -> ListF (State a) b -> [b]</code> works just as well. Let’s do something a tiny bit more interesting with it: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>collatz :: Int -> [Int] collatz x = traverseState' x (chain x) where chain n = ConsF n (State go) go y = if y == 1 then (NilF, y) else let val = if even y then y `div` 2 else 3 * y + 1 in (chain val, val) </code></pre></div></div> This computes the Collatz sequence for a given number, stopping when it reaches 1. <h2 id="what-is-it">What is it?</h2> Now, what is <code class="language-plaintext highlighter-rouge">ListF</code>? 
It’s kind of similar to the Free monad, but there’s an important distinction: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Free f a = Pure a | Free (f (Free f a)) data ListF f a = NilF | ConsF a (f (ListF f a)) </code></pre></div></div> Free only carries values at the leaves or end of the chain. ListF carries values at each step. The common structure seems like it could be extracted out like so: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data DualFree f b a = DPure a | DFree b (f (DualFree f b a)) </code></pre></div></div> Using the above definition of <code class="language-plaintext highlighter-rouge">DualFree</code>, the free monad can be thought of as a list of computations in the <code class="language-plaintext highlighter-rouge">f</code> functor with some return value <code class="language-plaintext highlighter-rouge">a</code>, where <code class="language-plaintext highlighter-rouge">b</code> is <code class="language-plaintext highlighter-rouge">()</code>. <code class="language-plaintext highlighter-rouge">ListF</code>, on the other hand, is a list of values <code class="language-plaintext highlighter-rouge">b</code> interleaved with computations in the <code class="language-plaintext highlighter-rouge">f</code> functor, where <code class="language-plaintext highlighter-rouge">a</code> is <code class="language-plaintext highlighter-rouge">()</code>. <code class="language-plaintext highlighter-rouge">ListF</code> over <code class="language-plaintext highlighter-rouge">Reader r</code> gives us trees of values. <code class="language-plaintext highlighter-rouge">ListF</code> over <code class="language-plaintext highlighter-rouge">State s</code> gives us unfolding structures. In fact… This is (almost!) 
the list monad transformer “done right!” The canonical implementation is: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype ListT m a = ListT { unListT :: m (Maybe (a, ListT m a)) } </code></pre></div></div> This uses the <code class="language-plaintext highlighter-rouge">Maybe</code> to avoid having the <code class="language-plaintext highlighter-rouge">Nil</code> data constructor, which allows it to use the <code class="language-plaintext highlighter-rouge">newtype</code> declaration. This also ensures that even the first element in the list is wrapped in the monad. The <code class="language-plaintext highlighter-rouge">ListF</code> version I wrote associates the monadic actions with the links between values, and this version associates the monadic action with the values themselves. It’s also pretty similar to the cofree comonad: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype Cofree h a = Cofree { unCofree :: (a, h (Cofree h a)) } </code></pre></div></div> If you took out the <code class="language-plaintext highlighter-rouge">NilF</code> constructor, anyway. <blockquote> Thanks to Gary Fixler for notifying me that the type signature in <code class="language-plaintext highlighter-rouge">traverseState</code> was wrong! </blockquote> Thu, 24 Sep 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/09/24/recursion.html https://www.parsonsmatt.org/2015/09/24/recursion.html Unix Is The IDE <h2 id="a-story-on-tdd-in-c">A Story on TDD in C++</h2> Whenever people talk about editors and IDEs and other things, something that invariably gets mentioned by a vim user is that: <blockquote> Unix is the IDE and vim is just the text editor. </blockquote> There usually isn’t a justification or explanation of that statement. The IDE or emacs fans tend to think of that as a limitation of vim, rather than a strength, and feel ever more confident in their superiority. 
I recently did “Unix is the IDE” and thought that a written experience on what that means exactly would be helpful. <h2 id="ruby-experience">Ruby Experience</h2> When I’m at work, I use Ruby, and I use a pretty standard workflow. Guard watches the files in the directory and runs the corresponding test whenever a file is written. With proper test design, this makes for an extremely quick and pleasant feedback loop. Guard is always running in one of my tmux panes, and I get constant feedback whenever I’m working on a file. This makes “test first” software development much easier and nicer. Manually switching contexts, running <code class="language-plaintext highlighter-rouge">rspec</code>, waiting for the entire test suite to run, etc. breaks the development flow and makes it difficult to program effectively. Allowing the feedback to exist solely as a color in my peripheral vision is huge for speeding up development. <h2 id="class-though">Class though</h2> I’m currently taking a data structures and algorithms course at UGA. The entire course is in C++, a language that I had no experience with before the course. The first thing I looked for was a testing library. <a href="https://github.com/philsquared/Catch">Catch</a> stood out as being easy to use, unobtrusive, and simple: precisely the qualities I was looking for. A few entries in my Makefile later, and I have <code class="language-plaintext highlighter-rouge">make test</code> for running the test suite and <code class="language-plaintext highlighter-rouge">make mtest</code> for running the test suite through Valgrind. Very nice! I looked for something similar to Guard (but for C++) and could not find anything. I admit that I didn’t look too hard. It was just easier for me to make something on my own, and I didn’t feel the need to spend a ton of time looking. <h2 id="use-the-fifo-luke">Use the FIFO, Luke</h2> What’s the goal? I want fast and unobtrusive test feedback. 
And I don’t want to spend a ton of time getting it set up, and it needs to work with what I’ve already got. After all, this is just for a class, and for small projects. Vim supports running shell commands. At first, I just mapped a key to <code class="language-plaintext highlighter-rouge">:! make test</code>. That took away my code, ran the tests, and then returned it back to me. Mediocre. I wanted it to be asynchronous and in my peripheral vision. Vim doesn’t support asynchronous stuff all that well natively, and I didn’t feel like delving into the details. Instead, I created a named pipe with <code class="language-plaintext highlighter-rouge">mkfifo .test_runner</code>. I made a tmux pane and ran the extraordinarily complicated code: <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ `cat .test_runner` </code></pre></div></div> Here’s what’s happening there: <ol> <li>The backticks tell the shell to evaluate what’s inside and execute the resulting string.</li> <li><code class="language-plaintext highlighter-rouge">cat</code> opens the <code class="language-plaintext highlighter-rouge">.test_runner</code> FIFO and waits for input.</li> <li>Until something writes to <code class="language-plaintext highlighter-rouge">.test_runner</code>, the backticks can’t evaluate what’s happening inside, and the shell waits.</li> </ol> Now, you can do <code class="language-plaintext highlighter-rouge">echo "make test" >> .test_runner</code>. <code class="language-plaintext highlighter-rouge">cat</code> finally receives something, and it dutifully returns the string. The backticks now have something to evaluate: <code class="language-plaintext highlighter-rouge">make test</code>, so they run the command. Nice! <h2 id="not-quite">Not quite…</h2> Unfortunately <code class="language-plaintext highlighter-rouge">cat</code> stops after the first thing is received, so we need to keep the file open somehow. 
I’m sure there are more elegant ways to do this, but the solution that comes to mind immediately is: wrap the whole thing in an infinite loop. <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ while true; do `cat .test_runner`; done </code></pre></div></div> And now I can <code class="language-plaintext highlighter-rouge">echo</code> commands to the test runner FIFO all I want, and it dutifully executes them. <h2 id="vimify">Vimify</h2> Now I’ve got a way for one process to send a command to another. I need to incorporate a shortcut key that handles this in vim, so I can get that quick feedback I so desperately crave. The code is again pretty simple: <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>map <leader>t :call system('echo "make test" >> .test_runner')<CR> </code></pre></div></div> And now whenever I hit <code class="language-plaintext highlighter-rouge"><spacebar>t</code>, my tests run in my peripheral vision, and everything is cool. All told, the time it took for me to get this stuff set up was probably ~20 minutes or so. It was faster for me to invent this kinda hacky solution than it was for me to “Do It Right”. In about half an hour, I had enough of a realtime testing feedback loop to work confidently. It wasn’t ideal. But it did get the job done. Over the next few days, I slowly tweaked the setup: the bash script that opened the loop also made the FIFO and deleted it on exit, and I had the vim shortcuts also save the current file. But for the most part, a tiny bit of effort did just about everything I needed to be productive. <h2 id="unix-as-ide">Unix As IDE</h2> I didn’t have to learn anything new here. Not even a plugin, or new shortcuts. All I did was reuse some existing components in a simple way to enable my current programming environment to support an entirely new language. 
My first programming experience was with Eclipse and Java, last spring. In the summer, we were encouraged to use vim or emacs, and I begrudgingly started learning vim. Before, I’d used the student edition of RubyMine for Ruby, and I think Notepad++ for JavaScript. If I hadn’t learned vim and Unix as well, I imagine that I’d be trying to learn Code::Blocks or something similar. I don’t want to have to spend a ton of time learning different tools, especially for something as basic as editing text. I want to leverage highly general tools to accomplish a wide variety of tasks. Using Unix as the IDE and vim as the text editor fits that perfectly. <h2 id="now-i-am-not-saying">now i am not saying…</h2> I’m not saying that IDEs suck. IDEs are awesome pieces of technology that are great at what they do. If you’re working on a large project in a single language that really benefits from having an IDE, then by all means, you should be using the IDE. If you’re a big fan of emacs as the OS and using its inferior process management to handle REPLs, shells, etc., then that’s also great. I’ve just found that emacs doesn’t really like to be subservient in the same way that vim does. It wants to dominate your workflow and be your single point of contact. There’s nothing wrong with that. But for me, I really like the simplicity and composability of more general tools that don’t mind taking up as little space as you want them to. Vim does a fantastic job of getting out of the way. Even with a ton of plugins and customization, loading vim is lightning fast. It integrates extremely well with the Unix shell. And I suppose that’s what I like the most about it – it’s not opinionated about what I do for compilation, or project management, or operating system, or anything else. It’s just there to edit text. And do a damn good job of it. 
Sun, 26 Jul 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/07/26/unix_as_ide.html https://www.parsonsmatt.org/2015/07/26/unix_as_ide.html Announcing Squirrell <h2 id="a-kinda-magical-query-gem">A kinda magical query gem</h2> The application we’re building where I work has a pretty complex data model. Our ActiveRecord models were getting more and more complex finders and scopes, with no end in sight. One particular query was unacceptably slow with ActiveRecord and too complex for Arel, so we had to drop down to SQL. On an instance method of a model. Yeah, it was kinda gross. I didn’t like it. I also couldn’t really find a good place to put the code, and I wasn’t able to find a gem or convention that handled this. So I decided to make one: <a href="https://www.github.com/parsonsmatt/squirrell">Squirrell</a>! <h2 id="query-object">Query Object</h2> A query object is a class whose responsibility is to query the database and return the result. ActiveRecord models make fine query objects, but have perhaps too much flexibility and power. Mocking <code class="language-plaintext highlighter-rouge">find</code>, <code class="language-plaintext highlighter-rouge">where</code>, <code class="language-plaintext highlighter-rouge">find_by</code> and variants leads to brittle test code, especially if you ever need to <code class="language-plaintext highlighter-rouge">includes</code>! Additionally, having such easy arbitrary access to the database spread throughout the application can make it difficult to pinpoint dependencies and interfaces. Squirrell’s query objects have an extremely simple API: <code class="language-plaintext highlighter-rouge">.find</code>. This makes it very easy to encapsulate their behavior, which makes for much easier testing. 
As a brief demonstration, I’ll write up a simple query object for finding a user: <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class UserFinder include Squirrell requires :id def finder User.find(@id) end end </code></pre></div></div> And in the controller, we’d call it like: <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def show @user = UserFinder.find(id: params[:id]) end </code></pre></div></div> And in our controller spec, we’d stub it like: <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>expect(UserFinder).to receive(:find).and_return(instance_double(User)) </code></pre></div></div> So far, this is just delegating the find method to the User. The real benefit comes when we need to change something on the query. Perhaps <code class="language-plaintext highlighter-rouge">UsersController#show</code> now needs to include all of a user’s friends in order to avoid an N+1 query. We only need to change the query code in the <code class="language-plaintext highlighter-rouge">UserFinder</code> class. The controller doesn’t need to change, and the controller spec doesn’t need to change. Contrast that with having the call to <code class="language-plaintext highlighter-rouge">User.find</code> directly in the controller. If we want to add <code class="language-plaintext highlighter-rouge">includes(:friends)</code>, we’d need to alter the code in the controller and the spec. Or let the spec hit the database, and then it’s slow. <h2 id="arel">Arel</h2> Squirrell really shines when you’re using it with Arel queries. It provides a convenient location to put the code and a clean interface for using it. 
Here’s an example of how you’d use that: <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class NumberOfLikedPosts include Squirrell requires :user_id def arel users = User.arel_table posts = Post.arel_table likes = Like.arel_table users.join(likes).on(users[:id].eq(likes[:user_id])) .join(posts).on(likes[:post_id].eq(posts[:like_id])) .where(users[:id].eq(@user_id)) .project(posts[:id].count) end end </code></pre></div></div> To use this, it’s just <code class="language-plaintext highlighter-rouge">NumberOfLikedPosts.find(user_id: 1)</code> and it returns the result of the query. ActiveRecord will return a pretty basic object, which you’ll likely want to modify. Squirrell provides a <code class="language-plaintext highlighter-rouge">process</code> hook that is called with the result of the query. The following process method converts the above query result into an integer: <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class NumberOfLikedPosts # ... def process(result) result[:count].to_i end end </code></pre></div></div> <h2 id="sql">SQL</h2> The main motivation for this gem was using raw SQL queries in Rails in a convention-over-configuration manner. The query that needed to be optimized in SQL ended up with a 25x speed increase over ActiveRecord, but the code wasn’t well organized and didn’t have a clear API. Testing it was tricky, and stubbing it even worse. With Squirrell, it’s great! <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class AverageAgeOverMinimum include Squirrell requires :min_age def raw_sql <<-SQL SELECT COUNT(*) AS total, AVG(users.age) AS average FROM users WHERE users.age > #{@min_age} SQL end end </code></pre></div></div> And this query is executed with <code class="language-plaintext highlighter-rouge">AverageAgeOverMinimum.find(min_age: 18)</code>. 
Sun, 21 Jun 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/06/21/announcing_squirrell.html https://www.parsonsmatt.org/2015/06/21/announcing_squirrell.html servant-persistent NOTE: This tutorial is for <code class="language-plaintext highlighter-rouge">servant-0.4</code>. The repository is currently set up for <code class="language-plaintext highlighter-rouge">servant-0.5</code>, and I’m writing a follow up blog post on it. <h2 id="a-brief-example">A Brief Example</h2> When people talk about why they like Haskell, the type system always comes out as a big win. Everything from inference, to refactorability, to compile-time checking. The more you can program into the type system, the more benefit you gain from it – and there have been some compelling demonstrations lately. <a href="https://haskell-servant.github.io/">Servant</a> is a fantastic example of this. Servant provides a type-level DSL for defining a webservice API. You describe the endpoints, named captures, query parameters, headers, response types, etc. The compiler is then able to verify the correctness of your code in fulfilling the API you described. Define a new route? The compiler immediately lets you know what’s missing. Change the parameters that an endpoint accepts? The compiler lets you know which endpoint function needs to change, and how. Servant is still rather new, and there wasn’t yet an example on connecting a Servant API with a database. <a href="https://www.yesodweb.com/book/persistent">Persistent</a> leverages the type system similarly, and the combination of the two makes for a compelling example on the power of Haskell. The code for this post is available in the following GitHub repository: <a href="https://www.github.com/parsonsmatt/servant-persistent/tree/servant-0.4">parsonsmatt/servant-persistent</a>. <h1 id="mainhs">Main.hs</h1> Main is brief – it gathers some configuration information from the environment, initializes the database resources, and runs the server. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module Main where import Network.Wai.Handler.Warp (run) import System.Environment (lookupEnv) import Database.Persist.Postgresql (runSqlPool) import Config (defaultConfig, Config(..), Environment(..), setLogger, makePool) import Api (app) import Models (doMigrations) main :: IO () main = do env <- lookupSetting "ENV" Development port <- lookupSetting "PORT" 8081 pool <- makePool env let cfg = defaultConfig { getPool = pool, getEnv = env } logger = setLogger env runSqlPool doMigrations pool run port $ logger $ app cfg lookupSetting :: Read a => String -> a -> IO a lookupSetting env def = do p <- lookupEnv env return $ case p of Nothing -> def Just a -> read a </code></pre></div></div> And that’s it for the Main module! <h1 id="apihs">Api.hs</h1> Now, let’s dig into the neat stuff – the API definition! This is located in <code class="language-plaintext highlighter-rouge">src/Api.hs</code>. As usual, language pragmas and imports come first: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE DataKinds #-} {-# LANGUAGE TypeOperators #-} module Api where import Control.Monad.Reader (ReaderT, runReaderT, lift) import Control.Monad.Trans.Either (EitherT, left) import Network.Wai (Application) import Database.Persist.Postgresql (selectList, Entity(..), (==.) , fromSqlKey) import Data.Int (Int64) import Servant import Config (Config(..)) import Models -- (Person, runDb, userToPerson, EntityField(UserName)) </code></pre></div></div> The Servant import has no explicit import list, as it brings about 20 terms into scope, including type operators, and GHC was complaining about trying to explicitly import them. <code class="language-plaintext highlighter-rouge">Config</code> and <code class="language-plaintext highlighter-rouge">Models</code> are both internal modules to the application. 
The explicit import list for <code class="language-plaintext highlighter-rouge">Models</code> is commented out, as the module doesn’t export the EntityField(UserName) definition and the import wouldn’t compile. I left the comment there to show what is being imported. The API we’ll be describing is pretty simple: we want to be able to get a list of users, create a user, and retrieve a single user by name. We’ll return them as JSON. We’ll go over each line individually to describe what’s going on here. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type PersonAPI = "users" :> Get '[JSON] [Person] {- scope / verb encoding return value -} </code></pre></div></div> <code class="language-plaintext highlighter-rouge">"users"</code> indicates that this route will start with <code class="language-plaintext highlighter-rouge">/users</code>. <code class="language-plaintext highlighter-rouge">:></code> acts to combine parts of the route. The final part of the route specifies the HTTP verb, the return content encodings, and what the request will return. So “return all users as JSON” – not too bad! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> :<|> "users" :> Capture "name" String :> Get '[JSON] Person </code></pre></div></div> The <code class="language-plaintext highlighter-rouge">:<|></code> is an <code class="language-plaintext highlighter-rouge">Alternative</code> type operator. It can be read as “or”: The type of this route is either ‘get all users at <code class="language-plaintext highlighter-rouge">/users</code> or get a single user at <code class="language-plaintext highlighter-rouge">/users/:name</code>’. <code class="language-plaintext highlighter-rouge">Capture "name" String</code> is how we specify that we want this to be a named capture, and we’ll <code class="language-plaintext highlighter-rouge">read</code> it as a String. Servant has a lot of types built in, and you’re able to define your own. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code> :<|> "users" :> ReqBody '[JSON] Person :> Post '[JSON] Int64 </code></pre></div></div> Lastly, this one specifies that we’ll accept a <code class="language-plaintext highlighter-rouge">POST</code> request at <code class="language-plaintext highlighter-rouge">/users</code>. The <code class="language-plaintext highlighter-rouge">ReqBody</code> will accept JSON encoding of a Person, and will return an <code class="language-plaintext highlighter-rouge">Int64</code>. <h2 id="implementing-the-server">Implementing the Server</h2> Now that we’ve defined the API, we need to serve it. By default, Servant uses an <code class="language-plaintext highlighter-rouge">EitherT ServantErr IO</code>. We’d like to extend this with the <code class="language-plaintext highlighter-rouge">Reader</code> monad to make the database configuration available without manually threading it through all the functions that require it. A detailed description on this is available <a href="https://haskell-servant.github.io/tutorial/server.html#using-another-monad-for-your-handlers">in this guide</a>. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type AppM = ReaderT Config (EitherT ServantErr IO) userAPI :: Proxy PersonAPI userAPI = Proxy readerToEither :: Config -> AppM :~> EitherT ServantErr IO readerToEither cfg = Nat $ \x -> runReaderT x cfg readerServer :: Config -> Server PersonAPI readerServer cfg = enter (readerToEither cfg) server app :: Config -> Application app cfg = serve userAPI (readerServer cfg) </code></pre></div></div> Now we’ve got something that returns a WAI <code class="language-plaintext highlighter-rouge">Application</code>! We’re just about ready to run this. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>server :: ServerT PersonAPI AppM server = allPersons :<|> singlePerson :<|> createPerson </code></pre></div></div> We defined three routes, so we need three handler functions. And the handler functions need to be of the right type! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>allPersons :: AppM [Person] allPersons = do users <- runDb $ selectList [] [] let people = map (\(Entity _ y) -> userToPerson y) users return people </code></pre></div></div> Persistent’s <code class="language-plaintext highlighter-rouge">selectList</code> here is pretty astounding at first – it’s capable of inferring the database table to query on based on the type. The first list is a list of filters, so we’re getting all of the Users out of the database. Persistent knows this because we’re mapping the function <code class="language-plaintext highlighter-rouge">userToPerson :: User -> Person</code> over the array. In any case, we’ve got a fairly standard Persistent query, and a <code class="language-plaintext highlighter-rouge">return</code> to raise it into the <code class="language-plaintext highlighter-rouge">AppM</code> monad. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>singlePerson :: String -> AppM Person singlePerson str = do users <- runDb $ selectList [UserName ==. str] [] let list = map (\(Entity _ y) -> userToPerson y) users case list of [] -> lift $ left err404 (x:xs) -> return x </code></pre></div></div> At the top, we defined the route to take a single named capture. This function uses that named capture as a parameter. We’ll select all users from the database with the same name as the capture. If the resulting list is empty, we’ll error out with a 404. Otherwise, we’ll return the first one. 
This isn’t particularly elegant, but it shows how to error out of a servant request. Since our overall monad is <code class="language-plaintext highlighter-rouge">ReaderT Config (EitherT ServantErr IO)</code>, we have to lift the call to left. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>createPerson :: Person -> AppM Int64 createPerson p = do newPerson <- runDb $ insert $ User (name p) (email p) return $ fromSqlKey newPerson </code></pre></div></div> In all of these examples, I’ve used an indirect type. <code class="language-plaintext highlighter-rouge">User</code> is the database model, and <code class="language-plaintext highlighter-rouge">Person</code> is what we’re interfacing with. If we were directly dealing with <code class="language-plaintext highlighter-rouge">User</code>s, then this function could be expressed as the point-free one liner: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>createUser = liftM fromSqlKey . runDb . insert </code></pre></div></div> And that’s it. The API is defined and ready to be served! <h1 id="confighs">Config.hs</h1> Config contains many of the functions used to configure the application, as well as the Config datatype that the ReaderT monad uses. I’ll skip the imports in the interest of brevity: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Config = Config { getPool :: ConnectionPool , getEnv :: Environment } data Environment = Development | Test | Production deriving (Eq, Show, Read) defaultConfig :: Config defaultConfig = Config { getPool = undefined , getEnv = Development } </code></pre></div></div> Config is a simple record that we stuff into the Reader monad so we can read information from it with <code class="language-plaintext highlighter-rouge">asks</code>. 
We’ll use this with the <code class="language-plaintext highlighter-rouge">runDb</code> function, which is defined in Models. <h1 id="modelshs">Models.hs</h1> Models holds the data definitions for our data types, and a few functions for running queries against the database. Persistent brings a ton of language extensions into play and uses Template Haskell extensively. For a good introduction to Persistent, see <a href="https://www.yesodweb.com/book/persistent">the chapter</a> from the Yesod book. Our database model <code class="language-plaintext highlighter-rouge">User</code> is here: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase| User name String email String deriving Show |] </code></pre></div></div> and our API data type <code class="language-plaintext highlighter-rouge">Person</code> is here: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Person = Person { name :: String , email :: String } deriving (Eq, Show, Generic) instance ToJSON Person instance FromJSON Person userToPerson :: User -> Person userToPerson User{..} = Person { name = userName, email = userEmail } </code></pre></div></div> Nothing too fancy! Now we just need to query the database, and we’ll be all set. The monad that our application uses is the <code class="language-plaintext highlighter-rouge">ReaderT Config (EitherT ServantErr IO)</code> monad stack. <code class="language-plaintext highlighter-rouge">runSqlPool</code> operates in IO, so we’ll need to <code class="language-plaintext highlighter-rouge">liftIO</code> to get it where we want it. Our connection pool is stored in the <code class="language-plaintext highlighter-rouge">Config</code>, so we’ll need to <code class="language-plaintext highlighter-rouge">asks getPool</code> to access it. 
<code class="language-plaintext highlighter-rouge">runDb</code> therefore looks like:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>runDb query = do
    pool <- asks getPool
    liftIO $ runSqlPool query pool
</code></pre></div></div>
This can also be written as <code class="language-plaintext highlighter-rouge">runDb query = asks getPool >>= liftIO . runSqlPool query</code> if you’re into one-liners. <h1 id="try-it-out">Try it out!</h1> All the files are available on the <a href="https://github.com/parsonsmatt/servant-persistent">GitHub repository</a>. Clone the repository, get it running, and play with it. If you’d like some practice, try the following exercises: <h2 id="add-another-data-model-and-route">Add another data model and route!</h2> <ol> <li>Perhaps this app is a blogging service. Add a <code class="language-plaintext highlighter-rouge">Post</code> model to the database definition, with a title, body, and a reference to a <code class="language-plaintext highlighter-rouge">User</code> that authored it. Use Persistent’s convenient <code class="language-plaintext highlighter-rouge">json</code> annotation to automatically derive FromJSON and ToJSON instances.</li> <li>Add a route to the API to get a user’s posts. It should look something like: <code class="language-plaintext highlighter-rouge">users/:id/posts</code>. After adding the route, read the type error, and see what function you need to add to the server.</li> <li>Create the function that will retrieve a user’s posts.</li> <li>Of course, accessing isn’t enough. Create a route/function that will allow someone to create a blog post at <code class="language-plaintext highlighter-rouge">users/:id/posts</code>.
You’ll need to access the <code class="language-plaintext highlighter-rouge">ReqBody</code> and a <code class="language-plaintext highlighter-rouge">Capture</code> for this.</li> </ol> <h2 id="create-a-join-model">Create a join model!</h2> <ol> <li>People now want to follow other people on the blog. Create a join model <code class="language-plaintext highlighter-rouge">FollowRelation</code> that stores a <code class="language-plaintext highlighter-rouge">Follower</code> user reference and a <code class="language-plaintext highlighter-rouge">Follows</code> user reference.</li> <li>Add a route to get a user’s followers: <code class="language-plaintext highlighter-rouge">users/:id/followers</code></li> <li>Now add a route to follow a user: POST a UserID to <code class="language-plaintext highlighter-rouge">users/:id/follow</code> that creates a <code class="language-plaintext highlighter-rouge">FollowRelation</code></li> </ol> Sun, 07 Jun 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/06/07/servant-persistent.html https://www.parsonsmatt.org/2015/06/07/servant-persistent.html Perscotty Pt III <h1 id="finishing-the-story">Finishing the Story</h1> (this is part three of three: <a href="/2015/05/02/scotty_and_persistent.html">one</a> and <a href="/2015/05/04/perscotty_pt_ii.html">two</a> are linked) Last time, we corrected resource utilization by pulling the pool out and reusing it in queries rather than creating and destroying it each time. We also set up the app to run in a specialized monad transformer stack, allowing us some read-only configuration information, but we had to seriously trim down our application in order to figure out how to get that working. Let’s restore our original functionality! <h2 id="reading-the-database">Reading the Database</h2> First, let’s do up our index action for all the posts. <code class="language-plaintext highlighter-rouge">S.get "/posts" (html "listing all posts!")</code> gets added to the application.
Let’s figure out how to use that database pool, tucked away in the Reader monad. Having learned my lesson about building up big complex things before breaking them into smaller bits, I want to stop inlining the actions in the routes and start using real functions. So let’s make the handler for <code class="language-plaintext highlighter-rouge">postsIndex</code> its own function. ghc-mod helpfully tells us that the type of <code class="language-plaintext highlighter-rouge">(html "listing all posts!")</code> is <code class="language-plaintext highlighter-rouge">ActionT T.Text ConfigM ()</code>, so let’s make our function and call it.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>app = do
    middleware logStdoutDev
    S.get "/" (html "hello world")
    S.get "/posts" postsIndex

postsIndex :: ActionT T.Text ConfigM ()
postsIndex = html "listing all posts!"
</code></pre></div></div>
Much cleaner. Now, what will our code look like to access the database? Our earlier function for database access was this:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>runDb pool query = liftIO (runSqlPool query pool)
</code></pre></div></div>
Now that we can read the pool from our ConfigM, we don’t have to pass it explicitly as a parameter. So let’s rewrite the function to get the pool from the reader. The function to get things out of the Reader is <code class="language-plaintext highlighter-rouge">asks</code>. Our <code class="language-plaintext highlighter-rouge">Reader</code> environment is a value of type <code class="language-plaintext highlighter-rouge">Config { getPool :: ConnectionPool }</code>, so the way to extract the pool from that is <code class="language-plaintext highlighter-rouge">getPool cfg</code>. So we’ll <code class="language-plaintext highlighter-rouge">asks getPool</code>. A first attempt!
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>runDb query = do pool <- asks getPool liftIO (runSqlPool query pool) </code></pre></div></div> Now, there’s something wrong with this pool. We’re not in the Reader monad, we’re in the ActionT monad, and the Reader monad is below our current context. So we have to <code class="language-plaintext highlighter-rouge">lift</code> that <code class="language-plaintext highlighter-rouge">asks</code> function up into the current monad before it’ll work. As is, we get an error. Let’s fix it up! <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>runDb query = do pool <- lift $ asks getPool liftIO (runSqlPool query pool) </code></pre></div></div> And this works! We’ll update the <code class="language-plaintext highlighter-rouge">postsIndex</code> to make use of this. We’ll just get a count of posts in the database. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>postsIndex :: ActionT T.Text ConfigM () postsIndex = do posts <- runDb (selectList [] []) html $ "This many posts! " <> T.pack (show (length (posts :: [Entity BlogPost]))) </code></pre></div></div> We can fire up the server, run it, and AH HAH! It works! It’s correctly reporting the count of posts in the database. Fantastic. <h2 id="but-does-it-scale">But does it scale?</h2> Previously, we’d use a function somewhere, and that would ‘bake in’ the type. Will this new <code class="language-plaintext highlighter-rouge">runDb</code> function work in other contexts? We’ll uncomment the <code class="language-plaintext highlighter-rouge">doMigrations</code> and <code class="language-plaintext highlighter-rouge">doDbStuff</code> functions, run them in the app, and see what happens. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>app = do runDb doMigrations runDb doDbStuff -- ... 
</code></pre></div></div> And it works! So our runDb function appears to be OK with working in any level of the stack, as long as it’s got that <code class="language-plaintext highlighter-rouge">ConfigM</code> to read from. ghc-mod infers the following type for <code class="language-plaintext highlighter-rouge">runDb</code>: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>runDb :: forall b (t :: (* -> *) -> * -> *) (m :: * -> *). (MonadTrans t, MonadReader Config m, MonadIO (t m)) => SqlPersistT IO b -> t m b </code></pre></div></div> Which is a little more type sorcery than I am comfortable with<a href="#1">1</a>. Taylor’s example has the following type: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>runDb :: (MonadTrans t, MonadIO (t ConfigM)) => SqlPersistT IO a -> t ConfigM a </code></pre></div></div> Instead of accepting any <code class="language-plaintext highlighter-rouge">MonadReader Config m</code>, we only want to accept <code class="language-plaintext highlighter-rouge">ConfigM</code>. <code class="language-plaintext highlighter-rouge">ConfigM</code> is an instance of <code class="language-plaintext highlighter-rouge">MonadReader Config</code> already, as derived in the newtype declaration. It turns out, that’s all that needs to be done, and now we have a working database connection that we can easily use in our application. If we want to make additional information available to our application, all we have to do is add another field to the ConfigM type and set that up top. That’s rather nice to work with! You can get the current state of the repository <a href="https://github.com/parsonsmatt/scotty-persistent-example/tree/finished">here</a>. Many thanks to Taylor Fausak for the excellent <a href="https://taylor.fausak.me/2014/10/21/building-a-json-rest-api-in-haskell/">Building a JSON REST API in Haskell</a> post. 
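For readers puzzling over the inferred <code class="language-plaintext highlighter-rouge">runDb</code> type above, here is a small sketch of the same constraint shape with the database removed. It is an illustration under assumptions, not code from the repository: a <code class="language-plaintext highlighter-rouge">String</code> stands in for the <code class="language-plaintext highlighter-rouge">ConnectionPool</code>, the names <code class="language-plaintext highlighter-rouge">currentPool</code>, <code class="language-plaintext highlighter-rouge">underMaybeT</code> and <code class="language-plaintext highlighter-rouge">underIdentityT</code> are invented, and it needs the <code class="language-plaintext highlighter-rouge">mtl</code> and <code class="language-plaintext highlighter-rouge">transformers</code> packages.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE FlexibleContexts #-}
import Control.Monad.Reader (MonadReader, asks, runReaderT)
import Control.Monad.Trans.Class (MonadTrans, lift)
import Control.Monad.Trans.Identity (runIdentityT)
import Control.Monad.Trans.Maybe (runMaybeT)

-- Assumption: a String stands in for the ConnectionPool.
newtype Config = Config { getPool :: String }

-- Same constraint shape as the inferred runDb type, minus the database:
-- t is any transformer, m is any monad that can read a Config.
currentPool :: (MonadTrans t, MonadReader Config m) => t m String
currentPool = lift (asks getPool)

-- The same definition runs under two different transformers.
underMaybeT :: Config -> IO (Maybe String)
underMaybeT = runReaderT (runMaybeT currentPool)

underIdentityT :: Config -> IO String
underIdentityT = runReaderT (runIdentityT currentPool)
</code></pre></div></div>
<code class="language-plaintext highlighter-rouge">underMaybeT (Config "demo")</code> produces <code class="language-plaintext highlighter-rouge">Just "demo"</code> and <code class="language-plaintext highlighter-rouge">underIdentityT (Config "demo")</code> produces <code class="language-plaintext highlighter-rouge">"demo"</code>: the <code class="language-plaintext highlighter-rouge">lift</code> is what lets one definition work one layer above whatever monad holds the <code class="language-plaintext highlighter-rouge">Config</code>.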
<a name="1"></a>[1] : (2015-12-13) Now that I’ve been studying Haskell for a bit longer, I can actually understand this! ghc-mod puts the <code class="language-plaintext highlighter-rouge">forall</code> qualification on there for clarity, even though it doesn’t have to. We have a function that takes a <code class="language-plaintext highlighter-rouge">SqlPersistT IO b</code>, and lifts it to run in any monad which is a transformer and has an inner layer of <code class="language-plaintext highlighter-rouge">MonadReader Config</code>. This is obvious to me now, but certainly wasn’t at the time. Sun, 10 May 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/05/10/perscotty_pt_iii.html https://www.parsonsmatt.org/2015/05/10/perscotty_pt_iii.html Perscotty Pt II <h2 id="some-updates">Some updates</h2> (this is part two of three: <a href="/2015/05/02/scotty_and_persistent.html">one</a> and <a href="/2015/05/10/perscotty_pt_iii.html">three</a> are linked) The <a href="https://www.reddit.com/r/haskell/comments/34nxtu/scotty_and_persistent_a_beginners_voyage/">reddit thread</a> about my <a href="https://www.parsonsmatt.org/2015/05/02/scotty_and_persistent.html">previous post</a> generated good discussion and advice. I’m going to attempt to work through them now. <a href="https://taylor.fausak.me/2014/10/21/building-a-json-rest-api-in-haskell/">This guide</a> contains a lot of information on what this all should look like in the end. I’ll be taking a decent bit of information from that. <h2 id="pooling">Pooling</h2> The most urgent issue is that my database pool is getting recreated every time I make a query, and then closed. Noo! Instead, I need to create a pool, and pass that to the database functions so they can efficiently reuse the resources. Also, it’s not really necessary to run the migrations from within the server, so we’ll extract that out. I’m also going to unqualify the scotty import to make the code a bit more readable. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>runDb pool query = liftIO (runSqlPool query pool)

main :: IO ()
main = do
    pool <- runStderrLoggingT $ createPostgresqlPool connStr 10
    runDb pool doMigrations
    runDb pool doDbStuff
    scotty 3000 $ do
        -- ...
</code></pre></div></div>
Pool acquired! Now we can delete the <code class="language-plaintext highlighter-rouge">inAppDb</code> function. Let’s get the inHandlerDb stuff working too. I’ll just dumb replace the inHandlerDb call with runDb and add the pool parameter:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- old:
posts <- inHandlerDb (selectList [] [])
-- new:
posts <- runDb pool (selectList [] [])
</code></pre></div></div>
And it works! I was kind of expecting a type mismatch that would require another function to be made, but it didn’t. Let’s inspect the inferred types:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>inAppDb     :: SqlPersistM a -> ScottyT T.Text IO a
inHandlerDb :: SqlPersistM a -> ActionT T.Text IO a
runDb       :: MonadIO m => Pool SqlBackend -> SqlPersistT IO a -> m a
</code></pre></div></div>
It looks like the reason that <code class="language-plaintext highlighter-rouge">runDb</code> is more general is that the inferred type doesn’t restrict it to a given monad, and it is expecting a transformer <code class="language-plaintext highlighter-rouge">SqlPersistT</code> instead of the <code class="language-plaintext highlighter-rouge">SqlPersistM</code>. <h2 id="fat-stacks-of-monads">Fat Stacks of Monads</h2> Passing around the <code class="language-plaintext highlighter-rouge">pool</code> is kind of annoying, especially when the application gets more complex. Let’s try to make a helper function that will encapsulate that process.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main = do
    pool <- etc...
    let runDb' = runDb pool
    runDb' doMigrations
    ...
</code></pre></div></div>
This works, at first! Unfortunately, it only works when we keep it in the top level. It doesn’t let us use this function in the application. The types don’t line up. Experiment with it a bit: the first place that you use the <code class="language-plaintext highlighter-rouge">runDb'</code> function is what fixes the type, and the type of the function that runs the database inside the application is incompatible with the type of the function that runs the database outside of the application. Let’s use <code class="language-plaintext highlighter-rouge">ghc-mod</code> to inspect the inferred type of <code class="language-plaintext highlighter-rouge">runDb'</code> in the above context:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SqlPersistT IO () -> IO ()
</code></pre></div></div>
And the type of <code class="language-plaintext highlighter-rouge">runDb pool</code> inside a handler:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SqlPersistT IO [Entity BlogPost] -> Web...ActionT T.Text IO [Entity BlogPost]
</code></pre></div></div>
In SqlPersistT and ActionT, the T indicates that these are monad transformers. A monad transformer gets stacked on top of another monad, allowing you to use the effects of both. And a transformer applied to a monad is itself a monad, so you can stack as many as you want! So we want to somehow generalize <code class="language-plaintext highlighter-rouge">SqlPersistT IO [Entity BlogPost] -> ActionT T.Text IO [Entity BlogPost]</code>. I tried a number of possible avenues for that, but wasn’t able to derive a function that would work generically. <h2 id="reader">Reader</h2> The Haskell idiom for implicitly threading some read-only information throughout a program is the Reader monad.
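As a rough illustration of that idiom, compare passing a value by hand with reading it from <code class="language-plaintext highlighter-rouge">Reader</code>. This is only a sketch: the <code class="language-plaintext highlighter-rouge">Settings</code> record and every function name in it are invented for the example, and it needs nothing beyond <code class="language-plaintext highlighter-rouge">mtl</code>.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Control.Monad.Reader (Reader, asks, runReader)

-- Hypothetical read-only settings, used only for this illustration.
data Settings = Settings
    { siteName :: String
    , pageSize :: Int
    }

-- Explicit threading: every function takes the settings as an argument.
titleExplicit :: Settings -> String -> String
titleExplicit s page = siteName s ++ " | " ++ page

-- Reader threading: the settings travel implicitly and are read with asks.
title :: String -> Reader Settings String
title page = do
    site <- asks siteName
    return (site ++ " | " ++ page)

-- Callers of title never mention Settings; it is supplied once, at the
-- call to runReader.
header :: Reader Settings String
header = do
    t <- title "Posts"
    n <- asks pageSize
    return (t ++ " (" ++ show n ++ " per page)")
</code></pre></div></div>
<code class="language-plaintext highlighter-rouge">runReader header (Settings "Perscotty" 10)</code> evaluates to <code class="language-plaintext highlighter-rouge">"Perscotty | Posts (10 per page)"</code>. The connection pool plays the role of <code class="language-plaintext highlighter-rouge">Settings</code> in what follows.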
The JSON API linked above uses this, along with a Config data type, to build the application up. Let’s start small and build something similar. Let’s keep the <code class="language-plaintext highlighter-rouge">Config</code> data type small, and just store the connection pool for now: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data Config = Config { getPool :: ConnectionPool } </code></pre></div></div> Next up is defining the <code class="language-plaintext highlighter-rouge">Reader</code> monad for this. The bottom of the stack is <code class="language-plaintext highlighter-rouge">IO</code>, so our <code class="language-plaintext highlighter-rouge">Reader</code> will read from Config and sit on top of <code class="language-plaintext highlighter-rouge">IO</code>. Here’s the code (pulled from Taylor Fausak’s post): <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>newtype ConfigM a = ConfigM { runConfigM :: ReaderT Config IO a } deriving (Applicative, Functor, Monad, MonadIO, MonadReader Config) </code></pre></div></div> We had to add a few imports up top to get this to work, and add the <code class="language-plaintext highlighter-rouge">mtl</code> library to the cabal file. Alright! How do we actually use this thing? It turns out, we need to stop calling <code class="language-plaintext highlighter-rouge">scotty</code> and call something else entirely. Taylor’s guide calls <code class="language-plaintext highlighter-rouge">scottyOptsT</code> which has a pretty full configuration set. For right now, I’d like to keep it a bit simpler. Let’s explore the <a href="https://hackage.haskell.org/package/scotty-0.9.1/docs/Web-Scotty-Trans.html">Hackage documentation</a> for scotty’s types and see what we can do. <code class="language-plaintext highlighter-rouge">scottyT</code> looks like the simplest of the bunch, so let’s run with that. 
Before we get too crazy, let’s make sure we can get <code class="language-plaintext highlighter-rouge">scottyT</code> working just by passing <code class="language-plaintext highlighter-rouge">id</code> in. Our main function now reads:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main = do
    pool <- runStdoutLoggingT $ createPostgresqlPool connStr 10
    runDb pool doMigrations
    runDb pool doDbStuff
    scottyT 3000 id id $ do
        -- ...
</code></pre></div></div>
This works! Awesome! So, what’s going on with those <code class="language-plaintext highlighter-rouge">id</code>s there? Inspecting the type with ghc-mod gives us <code class="language-plaintext highlighter-rouge">IO a -> IO a</code> for the first one, and <code class="language-plaintext highlighter-rouge">IO Response -> IO Response</code> for the second. Cool. Let’s add a line <code class="language-plaintext highlighter-rouge">let c = Config pool</code> under the pool declaration to make our Config data. Taylor’s guide has the following function, which I’m going to copy in: <code class="language-plaintext highlighter-rouge">let r m = runReaderT (runConfigM m) c</code>. The text about that function reads: <blockquote> This takes Scotty’s monad m and adds the ability to read our custom config c from it. This is called a monad transformer stack. It allows us to use any monad in the stack. So after adding our reader monad, we can both deal with requests (using Scotty’s monad) and read our config (using our monad). </blockquote> Cool! Let’s change <code class="language-plaintext highlighter-rouge">id</code> to <code class="language-plaintext highlighter-rouge">r</code> in both of those and see what happens… <h2 id="boom">BOOM</h2> Type errors! Type errors everywhere! While looking at the Hackage documentation above, I noticed that all of the normal methods were duplicated in the Web.Scotty.Trans module. Let’s swap out the Scotty version of those functions with the ScottyT versions.
And now we’re getting entirely different type errors! We’re missing an instance for <code class="language-plaintext highlighter-rouge">ScottyError</code>. So let’s break the application code into its own function, give that a type signature, and see what happens. Now we’re doing <code class="language-plaintext highlighter-rouge">scottyT 3000 r r app</code> and defining <code class="language-plaintext highlighter-rouge">app</code> below. The <code class="language-plaintext highlighter-rouge">pool</code> went out of scope. Let’s just be a tiny bit lazy, and comment out the whole body of that function, and just do a basic <code class="language-plaintext highlighter-rouge">"hello world"</code> for now. We want to get the monad stack working, and database access will be easy as pie after that. Here’s what our main and app functions look like now:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main = do
    pool <- runStdoutLoggingT $ createPostgresqlPool connStr 10
    let cfg = Config pool
        r m = runReaderT (runConfigM m) cfg
    scottyT 3000 r r app

app :: ScottyT T.Text ConfigM ()
app = S.get "/" $ S.html "Hello world"
</code></pre></div></div>
Now we’re getting a new and entirely vexing type error: <blockquote> Couldn’t match type ‘a’ with ‘Response’ ‘a’ is a rigid type variable bound by a type expected by the context: ConfigM a -> IO a at Main.hs:60:5 Expected type: ConfigM a -> IO a Actual type: ConfigM Response -> IO Response In the second argument of ‘scottyT’, namely ‘r’ In a stmt of a ‘do’ block: scottyT 3000 r r app </blockquote> (the actual error said “wai-3.0.2.3:Network.Wai.Internal.Response”, and I trimmed it for readability) <h2 id="a-detour">A detour</h2> I dug around the internet for hours trying to find the solution to this, and I never really got there. I tried so many things, and nothing fixed it.
Finally, I decided I’d rip out as much code as possible for a minimal reproduction to be able to ask the Internet, and something magical happened… First, I deleted all the code except for stuff directly required for the <code class="language-plaintext highlighter-rouge">main</code> and <code class="language-plaintext highlighter-rouge">app</code> functions above. <code class="language-plaintext highlighter-rouge">ghc-mod</code> let me know about a bunch of unused imports, so I trimmed the import list down until it was the bare necessities. ghc-mod was kind enough to let me know that I had a bunch of unused language pragmas, so I removed them. At this point, the type error goes away, and everything works. What. What. I don’t even know. I re-add them, problem recurs. I remove them one-by-one, and the problem was evidently with the <code class="language-plaintext highlighter-rouge">GADTs</code> language pragma, which was used by Persistent’s Template Haskell implementation. Weird. Let this be a warning – break your code into modules, and localize things as much as possible! <h3 id="break-out-the-data-model">Break out the data model</h3> <code class="language-plaintext highlighter-rouge">touch Model.hs</code>, throw all the Persistent stuff in there (minus the stuff required for <code class="language-plaintext highlighter-rouge">ConnectionPool</code>, etc) and start taking language pragmas out of Main. This time, GADTs didn’t fix it, but TypeFamilies did. Well, whatever. Our minimal HelloWorld with the right monad is finally working. The repository in the current state is available <a href="https://github.com/parsonsmatt/scotty-persistent-example/tree/monad-stacks">here</a>. You can look through the commit history and see the incremental changes that I made. This blog post is already way longer than is necessary or normal, so I’ll actually get the little demo working with the database next time. 
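A plausible explanation for the detour above (an assumption; the post itself doesn’t confirm it): in GHC, both <code class="language-plaintext highlighter-rouge">GADTs</code> and <code class="language-plaintext highlighter-rouge">TypeFamilies</code> imply <code class="language-plaintext highlighter-rouge">MonoLocalBinds</code>, which stops a local binding such as <code class="language-plaintext highlighter-rouge">r</code> from being generalized when it mentions another local variable. The error message shows <code class="language-plaintext highlighter-rouge">r</code> being needed at every result type in one place and at <code class="language-plaintext highlighter-rouge">Response</code> in another, which a monomorphic <code class="language-plaintext highlighter-rouge">r</code> cannot satisfy. The sketch below reproduces that shape without scotty; the <code class="language-plaintext highlighter-rouge">String</code> pool and the <code class="language-plaintext highlighter-rouge">demo</code> function are stand-ins invented for illustration.
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MonoLocalBinds #-}  -- what GADTs and TypeFamilies switch on

import Control.Monad.Reader (MonadReader, ReaderT, asks, runReaderT)

-- Assumption: a String stands in for the ConnectionPool.
newtype Config = Config { getPool :: String }

newtype ConfigM a = ConfigM { runConfigM :: ReaderT Config IO a }
    deriving (Functor, Applicative, Monad, MonadReader Config)

-- Hypothetical stand-in for main: r is used at two different result
-- types, much as scottyT uses its two runner arguments.
demo :: String -> IO (Int, String)
demo poolName = do
    let cfg = Config poolName
        -- Without this signature, MonoLocalBinds keeps r monomorphic
        -- (it mentions the local cfg), and the two uses below clash.
        r :: ConfigM a -> IO a
        r m = runReaderT (runConfigM m) cfg
    n <- r (asks (length . getPool))
    s <- r (asks getPool)
    return (n, s)
</code></pre></div></div>
<code class="language-plaintext highlighter-rouge">demo "abc"</code> returns <code class="language-plaintext highlighter-rouge">(3, "abc")</code>. Deleting the signature on <code class="language-plaintext highlighter-rouge">r</code> should bring back a type error of the same flavour; deleting the <code class="language-plaintext highlighter-rouge">MonoLocalBinds</code> pragma instead makes it compile again, which matches what happened when the pragmas were removed from Main.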
Mon, 04 May 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/05/04/perscotty_pt_ii.html https://www.parsonsmatt.org/2015/05/04/perscotty_pt_ii.html Scotty and Persistent <h2 id="a-beginners-voyage">A beginner’s voyage</h2> (this is part one of three: <a href="/2015/05/04/perscotty_pt_ii.html">two</a> and <a href="/2015/05/10/perscotty_pt_iii.html">three</a> are linked) I’ve been working on a small application with the Haskell web framework scotty, and decided to use the package Persistent to provide database access. I had some trouble getting them to work together, and I couldn’t find many complete examples that used PostgreSQL. I thought I’d put at least one example online of how I’ve gotten it to work thus far. Actually, I lied. I haven’t gotten it to work yet. This blog post is as much a rubber-ducky discovery process as it is a guide. If you have any complaints, suggestions, comments, or questions, I’d love to hear them. I’m going to be tagging commits in a repository, so you’ll have full code examples to work with. The repository for the first part is <a href="https://www.github.com/parsonsmatt/scotty-persistent-example">here</a>. <h2 id="smoke-test-just-the-db">Smoke Test: Just the DB</h2> The <a href="https://www.yesodweb.com/book/persistent">Yesod Book’s Persistent chapter</a> is very good, and has a great starting point for getting it working. The following code snippet is from the Synopsis, and it’s what we’ll be using to make sure that everything is working with Persistent before worrying too much about integrating with scotty. 
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{-# LANGUAGE EmptyDataDecls             #-}
{-# LANGUAGE FlexibleContexts           #-}
{-# LANGUAGE GADTs                      #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses      #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE QuasiQuotes                #-}
{-# LANGUAGE TemplateHaskell            #-}
{-# LANGUAGE TypeFamilies               #-}
import           Control.Monad.IO.Class  (liftIO)
import           Database.Persist
import           Database.Persist.Sqlite
import           Database.Persist.TH

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
    name String
    age Int Maybe
    deriving Show
BlogPost
    title String
    authorId PersonId
    deriving Show
|]

main :: IO ()
main = runSqlite ":memory:" $ do
    runMigration migrateAll

    johnId <- insert $ Person "John Doe" $ Just 35
    janeId <- insert $ Person "Jane Doe" Nothing

    insert $ BlogPost "My fr1st p0st" johnId
    insert $ BlogPost "One more for good measure" johnId

    oneJohnPost <- selectList [BlogPostAuthorId ==. johnId] [LimitTo 1]
    liftIO $ print (oneJohnPost :: [Entity BlogPost])

    john <- get johnId
    liftIO $ print (john :: Maybe Person)

    delete janeId
    deleteWhere [BlogPostAuthorId ==. johnId]
</code></pre></div></div>
First things first: make a directory to test this, <code class="language-plaintext highlighter-rouge">cabal init</code> to make a project, and <code class="language-plaintext highlighter-rouge">cabal sandbox init</code> to avoid screwing up my global package directory. I add <code class="language-plaintext highlighter-rouge">persistent, persistent-template, persistent-sqlite</code> to my <code class="language-plaintext highlighter-rouge">build-depends</code>, and do <code class="language-plaintext highlighter-rouge">cabal run</code>. I get an error that <code class="language-plaintext highlighter-rouge">Could not find module 'Control.Monad.IO.Class'...</code>, so I add the <code class="language-plaintext highlighter-rouge">transformers</code> package to build-depends.
<code class="language-plaintext highlighter-rouge">cabal run</code> now works without errors! Hooray! Now let’s get it running with Postgres. This section is tagged <a href="https://github.com/parsonsmatt/scotty-persistent-example/tree/sqlite">sqlite</a> in the repository. <h3 id="to-postgresql-and-beyond">To PostgreSQL and Beyond!</h3> First, we need to change the packages to PostgreSQL instead of SQLite. Change the line in <code class="language-plaintext highlighter-rouge">build-depends</code> to require <code class="language-plaintext highlighter-rouge">persistent-postgresql</code> instead of <code class="language-plaintext highlighter-rouge">-sqlite</code>, and change the import line in <code class="language-plaintext highlighter-rouge">Main.hs</code> to <code class="language-plaintext highlighter-rouge">import Database.Persist.Postgresql</code>. <code class="language-plaintext highlighter-rouge">cabal install --dep</code> (my favorite shorthand for the otherwise verbose <code class="language-plaintext highlighter-rouge">--dependencies-only</code>) to get them installed. Now running the project gets the highly uninteresting error “Not in scope: <code class="language-plaintext highlighter-rouge">runSqlite</code>”. The bottom of that Yesod Book post indicates what we need in order to replace the <code class="language-plaintext highlighter-rouge">runSqlite</code> function: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>... -- In the imports, add: import Control.Monad.Logger (runStderrLoggingT) ... connStr = "host=localhost dbname=test user=test password=test port=5432" main :: IO () main = runStderrLoggingT $ withPostgresqlPool connStr 10 $ \pool -> liftIO $ do flip runSqlPersistMPool pool $ do runMigration migrateAll -- etc ... </code></pre></div></div> Now, the module complains about lacking the <code class="language-plaintext highlighter-rouge">Control.Monad.Logger</code> module. 
Add the <code class="language-plaintext highlighter-rouge">monad-logger</code> package, which provides that module, to <code class="language-plaintext highlighter-rouge">build-depends</code>. Let’s try running now! <code class="language-plaintext highlighter-rouge">cabal run</code> and after a lot of thinking, it spits out: <blockquote> libpq: failed (could not connect to server: Connection refused. Is the server running on host “localhost” and accepting TCP/IP connections on port 5432?) </blockquote> I honestly wasn’t expecting that – I was sure it’d give an authentication error! <code class="language-plaintext highlighter-rouge">service postgresql status</code> informs me that Postgres is running on port 5433 for some reason, so I edit the connection string. Now I’m getting the expected problem: <code class="language-plaintext highlighter-rouge">FATAL: password authentication failed for user "test"</code>. This isn’t surprising, as I’ve not made a test user, test database, or test password. I make a Postgres user with <code class="language-plaintext highlighter-rouge">createuser -s test -W</code> (warning: this is terribly insecure! Use a more secure means of authentication for your actual application), make a database with <code class="language-plaintext highlighter-rouge">createdb perscotty</code>, and modify the <code class="language-plaintext highlighter-rouge">connStr</code> to reflect this:
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>...
connStr = "host=localhost dbname=perscotty port=5433 user=test password=test"
...
</code></pre></div></div>
I had to enter psql and manually do an <code class="language-plaintext highlighter-rouge">ALTER ROLE test WITH PASSWORD 'test';</code>, but afterwards, it all worked! So, what does all of that new code do, anyway? Let’s check out <a href="https://hackage.haskell.org/package/persistent-postgresql-2.1.5/docs/Database-Persist-Postgresql.html">the Hackage page</a>! Actually I kind of have no idea what that does. The rest of this is going to be a bit of an adventure.
The repository at this point has the <a href="https://github.com/parsonsmatt/scotty-persistent-example/tree/postgres">postgres</a> tag. <h3 id="breaking-it-down">Breaking it down</h3> Now, this can’t all be like this. It’s totally not a fun web application, and that’s what I signed up for. Let’s break it up into usable functions. First, ghc-mod is whining about some linting, which I’ll go ahead and follow. Redundant do and discarded values. Next I’ll rename <code class="language-plaintext highlighter-rouge">main</code> to <code class="language-plaintext highlighter-rouge">dbFunction</code> and make a new <code class="language-plaintext highlighter-rouge">main</code> that just calls that. Let’s pull out the <code class="language-plaintext highlighter-rouge">runMigration</code> line too, and also the database inserts/deletes into their own functions. ghc-mod will complain about some crazy missing type signatures, but that’s fine, really, I don’t mind missing those at the moment (I need to learn monad stacks). The current state looks something like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dbFunction = runStderrLoggingT $ withPostgresqlPool connStr 10 $ \pool -> liftIO $ do
    flip runSqlPersistMPool pool $ do
        doMigrations
        doDbStuff
</code></pre></div></div> Neat! I feel like we’re very close to a workable solution. We can refactor this to use parameters, and rewrite the current code as: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main :: IO ()
main = do
    dbFunction doMigrations
    dbFunction doDbStuff

dbFunction query = runStderrLoggingT $
    withPostgresqlPool connStr 10 $
        \pool -> liftIO $ runSqlPersistMPool query pool

doMigrations = runMigration migrateAll

doDbStuff = do
    johnId <- insert $ Person "John Doe" $ Just 35
    janeId <- insert $ Person "Jane Doe" Nothing
    -- ...
</code></pre></div></div> Alright, neat, we’ve broken the functions up, and can now use Persistent pretty well. Hooray! The code at this point is tagged at <a href="https://github.com/parsonsmatt/scotty-persistent-example/tree/break-it-up">break-it-up</a>. <h2 id="beam-me-up-scotty">Beam Me Up, Scotty</h2> Now, for the exciting stuff – let’s make this all work with scotty and expose it to the web! Add scotty and wai-extra to the build-depends and install them. Import qualified Scotty and the wai middleware request logger, and we can make our app a website. Our main function looks like this now: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main = do
    dbFunction doMigrations
    dbFunction doDbStuff
    S.scotty 3000 $ do
        S.middleware logStdoutDev
        S.get "/" $ S.html "Hello World"
</code></pre></div></div> I’d really like to get that db stuff in the app, instead of before it. After all, we’ll be needing to do all of our database stuff inside of scotty request handlers. The magic function for that is <code class="language-plaintext highlighter-rouge">liftIO</code>! So now we can move the DB migration and STUFF into the scotty app. I’ll also define a function to make it a bit less verbose. Now we’re looking like: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main = S.scotty 3000 $ do
    S.middleware logStdoutDev
    inAppDb $ do
        doMigrations
        doDbStuff
    S.get "/" $ S.html "Hello World"
</code></pre></div></div> Nice! We’re getting somewhere! <h3 id="one-level-deeper">One level deeper…</h3> Let’s do some DB stuff in the handler action!
This turns out to be a bit more complex, unfortunately… Here is the revised code that ‘works’: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main :: IO ()
main = S.scotty 3000 $ do
    S.middleware logStdoutDev
    inAppDb $ do
        doMigrations
        doDbStuff
    S.get "/" $ S.html "Hello World"
    S.get "/posts" $ do
        posts <- inHandlerDb $ selectList [] []
        S.html ("Posts!" <> (T.pack $ show $ length (posts :: [Entity BlogPost])))
    S.get "/posts/:id" $ do
        postId <- S.param "id"
        findPost <- inHandlerDb $ get (toSqlKey (read postId))
        S.html $ "You requested post: " <> (T.pack $ show (findPost :: Maybe BlogPost))

inHandlerDb = liftIO . dbFunction

inAppDb = liftIO . dbFunction
</code></pre></div></div> Haskell’s inferred type for <code class="language-plaintext highlighter-rouge">inHandlerDb</code> is <code class="language-plaintext highlighter-rouge">forall a. SqlPersistM a -> Web.Scotty.Internal.Types.ActionT T.Text IO a</code> and for <code class="language-plaintext highlighter-rouge">inAppDb</code> is <code class="language-plaintext highlighter-rouge">forall a. SqlPersistM a -> Web.Scotty.Internal.Types.ScottyT T.Text IO a</code>. I’ve no idea what the <code class="language-plaintext highlighter-rouge">forall</code> bit is about, and I’m sure there’s a better way to handle this than making two type-inferred functions, but that’s why this is a learning process, right? So, this is working! It’s querying the database, returning the count of the posts on that index action and returning a <code class="language-plaintext highlighter-rouge">Maybe BlogPost</code> when you request a given ID. All of the pieces are here to make a much, much nicer solution. The final version is tagged <a href="https://github.com/parsonsmatt/scotty-persistent-example/tree/scotty">scotty</a>. <h2 id="pain-points">Pain Points</h2> This wasn’t an easy process by any means.
Reading the available material was good to get me started, but even incredibly basic querying like “How do I get all records of a type out of the database?” or “How do I get a record of a given type with a certain ID?” is non-obvious to a nooblet like myself. First, you have to convert an integer into Persistent’s <code class="language-plaintext highlighter-rouge">Key</code> type. <code class="language-plaintext highlighter-rouge">toSqlKey</code> handles that nicely. Next, you annotate the type of what you’re looking for. That’s how Persistent figures out which table to query. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>findPost <- inHandlerDb $ get (toSqlKey (read postId))
...
(findPost :: Maybe BlogPost)
</code></pre></div></div> This also works for selecting multiple records. The code above that gets all posts out of the database is: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>posts <- inHandlerDb $ selectList [] []
...
(posts :: [Entity BlogPost])
...
</code></pre></div></div> This is pretty awesome. But it’s extremely difficult to search for implied information like that, and it took me a really long time to figure that out. It appears that the <code class="language-plaintext highlighter-rouge">ScopedTypeVariables</code> extension would allow one to write <code class="language-plaintext highlighter-rouge">posts :: [Entity BlogPost] <- inHandlerDb $ selectList [] []</code>, which is an extremely nice syntax for what’s happening here. I’ll let this be part #1, and I’ll write another post on cleaning all of this up and applying it in a real(ish) app.
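To make the type-annotation point above a bit more concrete, here is a database-free sketch of the same idea. The class and functions (<code class="language-plaintext highlighter-rouge">FakeEntity</code>, <code class="language-plaintext highlighter-rouge">fakeSelectList</code>, <code class="language-plaintext highlighter-rouge">demo</code>) are hypothetical stand-ins, not Persistent’s API; the only thing they share with the real library is that the annotated result type decides which “table” gets read.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

-- Hypothetical stand-in for Persistent's machinery: each record type
-- knows its own table name and carries a few canned rows.
class FakeEntity a where
  tableName  :: [a] -> String  -- the argument is ignored; only its type matters
  cannedRows :: [a]

data BlogPost = BlogPost String deriving (Show, Eq)
data Person   = Person String   deriving (Show, Eq)

instance FakeEntity BlogPost where
  tableName _ = "blog_post"
  cannedRows  = [BlogPost "Hello", BlogPost "World"]

instance FakeEntity Person where
  tableName _ = "person"
  cannedRows  = [Person "John Doe"]

-- Hypothetical analogue of `selectList [] []`: nothing in the call says
-- which table to read, so the result type has to pick the instance.
fakeSelectList :: FakeEntity a => IO [a]
fakeSelectList = pure cannedRows

demo :: IO (String, Int, Int)
demo = do
  -- Annotating the binding itself needs ScopedTypeVariables.
  posts :: [BlogPost] <- fakeSelectList
  -- Annotating at the use site works without the extension.
  people <- fakeSelectList
  pure (tableName posts, length posts, length (people :: [Person]))
```

Deleting both annotations leaves GHC with an ambiguous type, which is roughly the kind of error I kept running into before working out that the annotation is what selects the table.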
If you have any questions, corrections, or comments, please feel free to <a href="mailto:parsonsmatt@gmail.com">email me!</a> Sat, 02 May 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/05/02/scotty_and_persistent.html https://www.parsonsmatt.org/2015/05/02/scotty_and_persistent.html Educational Frustrational Finals! I totally should be studying for my ethics final tomorrow, but frankly I think the textbook lacks philosophical rigor, and I can’t really bring myself to study all that much. I did study a lot for my theory of computation course. I find it fascinating, and I can see the applicability in software design and programming languages. I probably should have studied for my web programming class, but reading <a href="https://chimera.labs.oreilly.com/books/1230000000929">“Parallel and Concurrent Programming in Haskell”</a> was way more interesting. I want to be reading <a href="https://www.poodr.com/">“Practical Object-Oriented Design in Ruby”</a>, but I may not make an A in my ‘UNIX Systems Programming in C’ class unless I study the texts provided (one of which refers to Visual Studio 98 and advises using deprecated C functions). I’m finding that I feel pretty frustrated with my educational experience. I can go on Youtube, Coursera, etc. and find extremely high quality educational material from institutions like MIT, Stanford, and other high level colleges. I can, or rather could, if I had time. Instead, I find myself jumping through an inane series of hoops for a line on my resume to guarantee a slightly higher chance of a slightly higher salary. The education system has to optimize for the incentives placed on it. These include the metrics that are easy to take. Unfortunately, the intersection between the things that are easy to measure and the things that are useful to learn and know seems to be small. 
I don’t want to waste my time learning things that are easy to measure and worthless, even if that’s making up a significant amount of what I’m being evaluated on. All of this is putting me in debt for an absurd quantity of money. Education needs to be fixed, but the tools and material are all out there already. The real value of a degree isn’t the skill attained. It’s a line on your resume, a little bit of status you can wear. Maybe what we need is a cultural fix in how we evaluate each other as potential contributors. Thu, 30 Apr 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/04/30/educational_frustrational.html https://www.parsonsmatt.org/2015/04/30/educational_frustrational.html Written with Haskell! Oh man! I wrote my first actually useful thing in Haskell! It’s a stupid simple program that creates a new post for this Jekyll blog. I had written something similar in bash, but forgot to push it to my dotfiles repo. Rather than walking all the way upstairs, I decided that this would be easier and more fun! <a href="https://github.com/parsonsmatt/parsonsmatt.github.io/blob/master/NewPost.hs">Here’s a link</a> to the code. All told, it was significantly easier for me to write than the equivalent bash script, and I definitely enjoyed writing Haskell more than bash. I can also see room for extensions: some additional command line parameters and a more versatile replace function (<code class="language-plaintext highlighter-rouge">replace :: [(Char, Char)] -> String -> String</code>). <h2 id="what-took-you-so-long">What took you so long?</h2> I’m strongly reminded of <a href="https://github.com/Dobiasd/articles/blob/master/programming_language_learning_curves.md">these graphs</a> on programming language productivity vs learning curves. I can’t remember when exactly I started learning Ruby – sometime in mid-2014 for sure, and after a few months, I’d written a script that read a CSV file and automated data entry over the web. 
I started learning Rails in early December and had my first app up and running early January, and am now employed as a Rails developer. Meanwhile, I’ve been learning Haskell since November ‘14 and, aside from the CIS194 exercises, haven’t made anything with it. Why is that? Perhaps the current learning resources focus too much on the more advanced concepts in Haskell, like the dreaded monad. <a href="https://learnyouahaskell.com/chapters">Learn You A Haskell</a> doesn’t get into basic file I/O until well after type classes and higher order functions. <a href="https://book.realworldhaskell.org/read/">Real World Haskell</a> has a similar layout. But when you look at resources for beginning programmers learning Ruby, Java, JavaScript, etc. they start off with basic printing and file stuff significantly quicker. Is there a “Haskell for the Everyman”? I’d love for there to be a resource for learning Haskell that let you get more productive at first. I don’t think the exact notation here is terribly difficult to teach or understand. <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main = do
    args <- getArgs
    date <- today
    let rawTitle = concat args
        fileName = newPostFileName date rawTitle
    withFile fileName WriteMode $ \handle ->
        hPutStrLn handle $ header date rawTitle
</code></pre></div></div> Sun, 05 Apr 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/04/05/written_with_haskell.html https://www.parsonsmatt.org/2015/04/05/written_with_haskell.html Haskell progressions I just finished the highly recommended <a href="https://www.cis.upenn.edu/~cis194/spring13/">CIS194</a> course! It’s very interesting to go back to my earlier questions when I started the whole process of learning Haskell back in November or so, and compare them to my code now. I cleaned up some of the older answers, just to compare the difference more explicitly.
Old: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>skips :: [a] -> [[a]]
skips [] = []
skips xs = map deIndex (map (deIndex) (flatten xs))

deIndex :: (Integral a) => [(a,b)] -> [b]
deIndex x = map snd x

flatten :: [b] -> [[(Int, (Int, b))]]
flatten xs = map dropIndex $ buildList xs

-- Drops elements in the list if they're divisible by the number.
dropIndex :: (Int,[a]) -> [(Int,a)]
dropIndex (n,xs) = filter (\(x,_) -> x `mod` n == 0) (index xs)

-- Takes a list, indexes it, expands it into a list of lists, and
-- indexes the top level list.
buildList :: [a] -> [(Int, [(Int, a)])]
buildList = index . expandList . index

-- Takes a list and converts it into a list of lists, each of which
-- is the original list.
expandList :: [a] -> [[a]]
expandList xs = map (\x -> xs) xs

-- Takes a list and indexes the items.
index :: [a] -> [(Int,a)]
index = zip [1..]
</code></pre></div></div> And lately: <div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>skips :: [a] -> [[a]]
skips xs = zipWith takeEvery [1..length xs] (repeat xs)

takeEvery :: Int -> [a] -> [a]
takeEvery n xs
    | n > length xs = []
    | otherwise = xs !! (n-1) : takeEvery n (drop n xs)
</code></pre></div></div> So much cleaner! So much more elegant! It’s rather obvious that I was still thinking imperatively in November and was having a hard time grasping the functional approach. Per Chris Allen’s <a href="https://github.com/bitemyapp/learnhaskell">learnhaskell</a> repository, my next step is the <a href="https://github.com/NICTA/course">NICTA course</a>. I’m certainly excited to see how that progresses. Thu, 02 Apr 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/04/02/haskell.html https://www.parsonsmatt.org/2015/04/02/haskell.html On Github Pages I’ve given my blog a bit of a face lift and moved it onto Github Pages. Hooray!
Tue, 31 Mar 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/03/31/github_pages.html https://www.parsonsmatt.org/2015/03/31/github_pages.html From Rails to Yesod I <h2 id="part-i-introduction">Part I: Introduction</h2> Ruby on Rails is a fantastic tool for building an app quickly and painlessly. The incredible array of gems and the focus on a pleasant development experience make it an excellent choice for beginning development: it’s trivial to run <code class="language-plaintext highlighter-rouge">rails new .</code>, throw some plugins into your <code class="language-plaintext highlighter-rouge">Gemfile</code>, and use the generator scripts to build a basic scaffold of your site. Ruby itself is a pleasure to use, allowing developers to rapidly write expressive code with an extensive standard library. Unfortunately, Ruby and Rails both suffer from being relatively slow and resource intensive. This isn’t a problem for most applications, as even a $5 DigitalOcean instance is plenty to run a Rails app under light loads. The size of your Rails app gets multiplied with every forked process, and every gem and feature you add just grows that complexity. It’s certainly possible to continue running Rails on a high volume application. However, many Rails applications are converted to more performant alternatives: Twitter went to Scala, LinkedIn went to Node, etc. When you’re paying for every bit of computer power you’re using, you can make significant business savings moving to a platform that requires less power to use. Additionally, while Ruby and Rails applications are very easy to start, they can easily become difficult to maintain as they grow. Migrating to another platform might provide a better development environment for a larger application. Node.js, by all accounts, is worse on the maintainability front (until ES6 becomes mainstream). Where do we go? I’d like to think that Yesod makes for an excellent choice.
<h3 id="why-yesod">Why Yesod?</h3> Haskell has a fantastic community built around producing reliable, robust, and beautiful code. The language itself encourages a development style that is concise, easy to maintain, highly modular, and highly composable. Additionally, Yesod is one of the fastest web frameworks around, beating Node.js by a factor of ~4 <a href="https://www.yesodweb.com/blog/2011/03/preliminary-warp-cross-language-benchmarks">according to this older benchmark</a> (the Warp server is the Haskell webserver that Yesod runs on top of). Yesod has many features that a Rails developer will enjoy. Routes are configured in a DSL similar to the <code class="language-plaintext highlighter-rouge">config/routes.rb</code> file. Views are composed similarly to Rails views, but with a greater level of modularity (widgets vs template partials). Yesod enforces RESTful design, which a Rails developer should find natural. Yesod’s default configuration is a MVC setup similar to Rails. The real reason, though, is that I really like using Haskell, and I want to use it productively in a real world way. <h2 id="planned-parts">Planned parts:</h2> In this blog post series, I’ll be converting my <a href="https://www.melodyscout.com">MelodyScout</a> application to Haskell, and tracking the changes in <a href="https://www.github.com/MelodyScout/MelodyScoutHS">this Github repository</a>. My plan of attack for the transition has three main stages: <ol> <li>Creating a JSON API</li> <li>Converting the front end to a SPA that consumes the JSON API</li> <li>Creating a Yesod JSON API that mirrors the MelodyScout API</li> </ol> (EDIT 4-11-2015: this project is actually WAY bigger than I thought it’d be, so it’s getting put off til I have way more confidence in Haskell and Yesod.) Thu, 19 Mar 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/03/19/rails_to_yesod_i.html https://www.parsonsmatt.org/2015/03/19/rails_to_yesod_i.html Rails, Ajax, and you! 
<h2 id="or-well-me">Or, well, me</h2> There are a bunch of tutorials online for implementing ajax functionality in Rails. It makes for a rather nice user experience, and provides a bit of a performance benefit from the perspective of the user. I know I enjoy using websites that provide this functionality. I pieced together the functionality that I wanted from a number of tutorials and the Rails guides, and I wanted to create a guide to show how I implemented them. <h2 id="example-application">Example application</h2> The code for this will be demonstrated in a minimal Rails app. Check it out on <a href="https://www.github.com/parsonsmatt/rails-ajax" title="rails-ajax">Github</a>. For reference, I’ll be using the following things: <ul> <li>Ruby 2.2.0</li> <li>Rails 4.2.0</li> </ul> If you’re viewing this from the future, this will hopefully make things more sensible for you. <h3 id="getting-started">Getting Started</h3> Create a new Rails app, set up your RVM/rbenv/chruby/etc as you like it, and we’re set. In the tradition of basic Rails tutorials, this will be a blog with a Post. <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ rails generate scaffold Post title:string body:text
$ rake db:migrate
</code></pre></div></div> Now we have the default MVC for Posts. Fantastic. Let’s make it work ajax-style. <h3 id="expected-behavior">Expected Behavior</h3> Here’s how I want it to work: <ol> <li>When a user clicks ‘New Post’, they’ll get the form injected into the page.</li> <li>When a user submits ‘New Post’, it will POST the request to the server.</li> <li>When the server processes the post, it will either re-render the form with errors or remove the form and update the index.</li> </ol> <h2 id="the-new-action">The <code class="language-plaintext highlighter-rouge">new</code> action</h2> <h3 id="the-controller">The Controller</h3> First, we’ll need to enable responding to JavaScript requests in the controller.
To do this, we create a familiar <code class="language-plaintext highlighter-rouge">respond_to</code> call in the <code class="language-plaintext highlighter-rouge">new</code> action. It’ll look like this: <figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"># app/controllers/posts_controller.rb
...
def new
  @post = Post.new
  respond_to do |format|
    format.html
    format.js
  end
end
...</code></pre></figure> The controller will now respond to “script” requests with the <code class="language-plaintext highlighter-rouge">app/views/posts/new.js.erb</code> by convention. <h3 id="the-view">The View</h3> To request this, it’s super simple. Edit the <code class="language-plaintext highlighter-rouge">index.html.erb</code> file and add <code class="language-plaintext highlighter-rouge">remote: true</code> to the <code class="language-plaintext highlighter-rouge">link_to</code> for New posts. <figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"># app/views/posts/index.html.erb
...
<%= link_to 'New Post', new_post_path, remote: true %>
...</code></pre></figure> Clicking “New Post” doesn’t do anything, because we don’t have a <code class="language-plaintext highlighter-rouge">new.js.erb</code> file yet. Create <code class="language-plaintext highlighter-rouge">app/views/posts/new.js.erb</code> with the following: <figure class="highlight"><pre><code class="language-javascript" data-lang="javascript">alert("New post!");</code></pre></figure> Now, clicking the New link causes an alert to show up. That’s all well and good, but we want to make a new post, not annoy users with <code class="language-plaintext highlighter-rouge">alert</code>s. <h3 id="injecting-a-form">Injecting a form</h3> Our first step is to render the form. <code class="language-plaintext highlighter-rouge">render 'form'</code>. Easy, right?
Rails provides a method <code class="language-plaintext highlighter-rouge">escape_javascript</code> which allows you to easily work with HTML in your embedded Ruby JavaScript files. <code class="language-plaintext highlighter-rouge">escape_javascript</code> is aliased to <code class="language-plaintext highlighter-rouge">j</code> for convenience. We’ll then wrap that all up in a nice little jQuery object for convenience. We’ll create a container <code class="language-plaintext highlighter-rouge">div</code> to wrap it, append the whole thing to the body, and be set. The result: <figure class="highlight"><pre><code class="language-javascript" data-lang="javascript">var form = $('<%= j(render 'form') %>');
var wrapper = $('<div>').attr('id', 'new-post-form').append(form);
$('body').append(wrapper);</code></pre></figure> Not too bad. Clicking “New Post” now slaps the form into the page. Of course, you can click “New Post” multiple times and it just continues appending forms willy-nilly. We want the form to submit over ajax as well, so edit the <code class="language-plaintext highlighter-rouge">form_for</code> method in the <code class="language-plaintext highlighter-rouge">_form.html.erb</code> and add <code class="language-plaintext highlighter-rouge">remote: true</code>. <figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><%= form_for @post, remote: true do |f| %></code></pre></figure> This won’t work, because we haven’t set up the create action yet. <h2 id="creation">Creation</h2> <h3 id="the-controller-1">The Controller</h3> As above, all we need to do is add <code class="language-plaintext highlighter-rouge">format.js</code> into the <code class="language-plaintext highlighter-rouge">respond_to</code> block for both branches of the if statement. This tells the server to respond with <code class="language-plaintext highlighter-rouge">create.js.erb</code>. <h3 id="createjserb">create.js.erb</h3> Alright, so what do we want to do?
<ul> <li>If the post saved successfully, <ul> <li>then remove the form and update the table.</li> <li>else re-render the form with errors.</li> </ul> </li> </ul> We’ll select the old form’s div, then enter the check to see if there are errors. If there are, then we’ll replace the div’s html with the re-rendered form. If not, we’ll render the post partial and insert it into the table. <figure class="highlight"><pre><code class="language-javascript" data-lang="javascript">var oldForm = $('#new-post-form');
<% if @post.errors.any? %>
  oldForm.html('<%= j( render 'form' ) %>');
<% else %>
  oldForm.remove();
  $('tbody').prepend('<%= j(render @post) %>');
<% end %></code></pre></figure> Actually running this will now cause an error, as the post partial hasn’t been made yet. Silly Rails scaffold! Let’s refactor the index page a bit. Replace the <code class="language-plaintext highlighter-rouge"><% @posts.each do |post| %></code> loop with the following: <figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"># app/views/posts/index.html.erb
...
<tbody>
  <%= render @posts %>
</tbody>
...</code></pre></figure> and create <code class="language-plaintext highlighter-rouge">app/views/posts/_post.html.erb</code>: <figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"># app/views/posts/_post.html.erb
<tr>
  <td><%= post.title %></td>
  <td><%= post.body %></td>
  <td><%= link_to 'Show', post %></td>
  <td><%= link_to 'Edit', edit_post_path(post) %></td>
  <td><%= link_to 'Destroy', post, method: :delete, data: { confirm: 'Are you sure?' } %> </td>
</tr></code></pre></figure> Rails is clever enough to render each post using the partial, and we got some reuse in our script. New posts and creating posts now work as you’d want!
<h2 id="discussion">Discussion</h2> There are some neat benefits to this: <h3 id="portability">Portability</h3> You can do a <code class="language-plaintext highlighter-rouge"><%= link_to 'New Post', new_post_path, remote: true %></code> anywhere on your site and it will inject the form for you. That might not be great for a <code class="language-plaintext highlighter-rouge">Post</code>, but for a <code class="language-plaintext highlighter-rouge">ReportContent</code> or other model that would be great. <h3 id="lightweight">Lightweight</h3> Rails only needs to render much smaller objects for these, rather than a full application page. This is a good bit faster, both for your server and the client. <h3 id="easy-to-implement">Easy to implement</h3> If you just want limited ajax for a few models, then this works great and is rather easy to set up. It’s not perfect, though: <h3 id="javascript">JavaScript!</h3> Yeah, you’re now having to keep track of two views for new/create, one of which is injecting content into your page. Does your site even have a place for it? <h3 id="middle-ground">Middle ground</h3> This is a middle-ground solution. It’s sitting somewhere in between a full-HTML solution and a single page application. If you were just sending JSON back and forth, then you’d have a much easier time managing the view. <h3 id="gets-hairy-quick">Gets hairy quick</h3> If you want to do something more complicated, this can get real tricky fast. <h2 id="further-thoughts">Further thoughts:</h2> <ul> <li>In my implementation of this for <a href="https://www.melodyscout.com" title="MelodyScout">MelodyScout</a> (on <a href="https://www.github.com/melodyscout/melodyscout" title="MelodyScout Repository">Github</a>), I’m using the new/create to manage the insertion of a Bootstrap modal window.
This looks and works very well!</li> <li>Since I want to have forms that work both over HTML and ajax, I added a <code class="language-plaintext highlighter-rouge">remote</code> parameter to my <code class="language-plaintext highlighter-rouge">_form.html.erb</code> object that would default to false unless passed. Forms rendered by default work over html, but the code <code class="language-plaintext highlighter-rouge"><%= render 'form', remote: true %></code> makes it ajax.</li> </ul> Sat, 24 Jan 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/01/24/rails-ajax.html https://www.parsonsmatt.org/2015/01/24/rails-ajax.html Announcing MelodyScout! My first Rails app is now online at <a href="https://www.melodyscout.com">www.melodyscout.com</a>. I’m taking inspiration from Wikipedia, the Metal Archives, and the clean/simple notification system that Bandcamp uses. I’m hoping to make it a really nice usable service, and I know I’m certainly planning on using it. The UI is coming along rather nicely, and Rails has been an absolute pleasure to work with. Mon, 19 Jan 2015 12:00:00 +0000 https://www.parsonsmatt.org/2015/01/19/melodyscout.html https://www.parsonsmatt.org/2015/01/19/melodyscout.html Nested Forms in Rails <h2 id="a-beginners-approach">A beginner’s approach</h2> I’ve got a project called TellMe, which is a music release notification service. The app does pretty much exactly what I want it to do, and now all I need to do is code up some user friendly UIs and deploy it for great victory. I’ve been keeping all of my models pretty thin, which means a lot of related models. This calls for nested forms.
The simplest example of this is the relation between my <code class="language-plaintext highlighter-rouge">Release</code> and <code class="language-plaintext highlighter-rouge">ReleaseDate</code> classes: <figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"># app/models/release.rb
class Release < ActiveRecord::Base
  has_many :release_dates, dependent: :destroy
end

# app/models/release_date.rb
class ReleaseDate < ActiveRecord::Base
  belongs_to :release
end</code></pre></figure> The <code class="language-plaintext highlighter-rouge">ReleaseDate</code> class contains information about when a <code class="language-plaintext highlighter-rouge">Release</code> comes out, items that may be relevant for filtering, and is responsible for kicking off the daily release notification process. <h2 id="the-form">The Form</h2> I want it to be great. It should do something like this: <ul> <li>Release Name: [text field]</li> <li>Release Description: [text area]</li> <li>Release Dates: <ul> <li>Date: [date] Country: [country] Delete? [ ]</li> <li>Date: [date] Country: [country] Delete? [x]</li> <li>Add Release Date</li> </ul> </li> </ul> When you click ‘Add Release Date’, it should insert a new bullet above it for a new release date with all relevant information. How can we accomplish this? <h2 id="back-end">Back End</h2> The first step is to allow the <code class="language-plaintext highlighter-rouge">Release</code> class to accept nested attributes in forms. <figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"># app/models/release.rb
class Release < ActiveRecord::Base
  has_many :release_dates
  accepts_nested_attributes_for :release_dates, allow_destroy: true
end</code></pre></figure> <code class="language-plaintext highlighter-rouge">allow_destroy</code> will allow us to destroy the object from the nested form. We want this.
There’s another attribute called <code class="language-plaintext highlighter-rouge">update_only</code> which prevents the nested form from creating new objects. Second, the controller needs to be modified to allow these new parameters in: <figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"># app/controllers/releases_controller.rb
...
def release_params
  params.require(:release).permit(:name, :description,
    release_dates_attributes: [:id, :country, :date, :_destroy])
end
...</code></pre></figure> If the <code class="language-plaintext highlighter-rouge">:_destroy</code> value for an entry in <code class="language-plaintext highlighter-rouge">params[:release][:release_dates_attributes]</code> evaluates to a truthy value, then that record will be marked for deletion. All of these parameters will get updated in a single transaction during <code class="language-plaintext highlighter-rouge">@release.save</code>. Cool! The back end of the app now works pretty much exactly like I want it to. It gets a params hash with a bunch of release date information and updates/creates/destroys accordingly. <h2 id="the-view">The View</h2> Rails is smart enough to do some pretty cool stuff behind-the-scenes with nested form objects thanks to the <code class="language-plaintext highlighter-rouge">accepts_nested_attributes_for</code> method above. The <code class="language-plaintext highlighter-rouge">form_builder</code> object has a method <code class="language-plaintext highlighter-rouge">fields_for</code> which handles much of the hard work. My form code looks something like this at first: <figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><%= form_for @release do |f| %>
  ...
  <ul>
    <%= f.fields_for :release_dates do |ff| %>
      <li>
        <%= ff.label :date %>
        <%= ff.date_field :date %>
        <%= ff.hidden_field :id %>
        <%= ff.label :_destroy %>
        <%= ff.check_box :_destroy %>
      </li>
    <% end %>
    <li>
      <%= link_to 'Add Date', '#' %>
    </li>
  </ul>
  ...</code></pre></figure> Rails is, as usual, rather intelligent.
It knows that Release has-many ReleaseDates and that <code class="language-plaintext highlighter-rouge">@release.release_dates</code> is going to be an array. It will pull all of the objects out of the array and create those fields for each of them. If there aren’t any objects in the collection, then it won’t create anything. That is pretty cool! But it doesn’t let us create new ones – that’s why I’ve added the <code class="language-plaintext highlighter-rouge">link_to 'Add Date'</code> up there. We’re going to come back to it and write up some JavaScript to make it work. <h2 id="dynamic-forms">Dynamic Forms</h2> First, I want the ‘Add Date’ link to work dynamically. I’ll add <code class="language-plaintext highlighter-rouge">remote: true</code> to the options. This causes Rails to implement the link as an AJAX request rather than a true link. Where will the link go? Since I’m creating a new <code class="language-plaintext highlighter-rouge">ReleaseDate</code> associated with a <code class="language-plaintext highlighter-rouge">Release</code>, I’ll make a route specifically for that: <code class="language-plaintext highlighter-rouge">new_release_release_date_path</code>. The <code class="language-plaintext highlighter-rouge">release_dates_controller</code> will need to set the release appropriately for the view. Lastly, I’ll need an actual <code class="language-plaintext highlighter-rouge">new.js.erb</code> to implement the change. Here are the changes: <figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"># app/views/releases/_form.html.erb
...
<%= link_to 'Add Release Date', new_release_release_date_path(@release), remote: true %>
...

# config/routes.rb
...
resources :releases do
  resources :release_dates, only: [:new, :destroy]
end
...

# app/controllers/release_dates_controller.rb
class ReleaseDatesController < ApplicationController
  before_action :set_release, only: [:new, :destroy]

  def new
    @release_date = @release.release_dates.build
  end

  def destroy
  end

  private

  def set_release
    @release = Release.find(params[:release_id])
  end
end</code></pre></figure> <figure class="highlight"><pre><code class="language-javascript" data-lang="javascript">// app/views/release_dates/new.js.erb
console.log("hello from <%= @release.name %>!");</code></pre></figure> Now, when I click on the link, the console gets a new message. So this is working exactly as I want it to so far! <h2 id="the-javascript">The JavaScript</h2> First, I want to insert a new <code class="language-plaintext highlighter-rouge">li</code> into the form. This will hold the form for the new field. jQuery makes this fairly easy: <figure class="highlight"><pre><code class="language-javascript" data-lang="javascript">// app/views/release_dates/new.js.erb
// $('<li>') creates a new element; $('li') would select every existing li instead
$('li#add-date').before(
  $('<li>').
    attr('class', 'release_date').
    append('hello!')
);</code></pre></figure> This selects the <code class="language-plaintext highlighter-rouge">li</code> with <code class="language-plaintext highlighter-rouge">id='add-date'</code>, goes before it in the list, and inserts a new <code class="language-plaintext highlighter-rouge">li</code> with <code class="language-plaintext highlighter-rouge">class='release_date'</code> and contents <code class="language-plaintext highlighter-rouge">hello!</code>. So now, I just need to append the form to the <code class="language-plaintext highlighter-rouge">li</code> and everything will be set, right? Unfortunately, while the following code looks right, it doesn’t quite work: <figure class="highlight"><pre><code class="language-javascript" data-lang="javascript">$('li#add-date').before(
  $('<li>').
    attr('class', 'release_date').
    <%= fields_for @release_date do |rd| %>
      append('<%= rd.label :date %>').
append('<%= rd.date_field :date %>') <% end %> );</code></pre></figure> Clicking the link creates the new li and it has a form that looks pretty much right, but the data attributes aren’t correct, and the information doesn’t get included in the <code class="language-plaintext highlighter-rouge">release_date_attributes</code> hash in the params. Check out the resulting HTML: <figure class="highlight"><pre><code class="language-html" data-lang="html">...  <label for="release_release_dates_attributes_3_date">Date</label> <input value="2015-01-06" type="date" name="release[release_date_attributes][3][date]" id="release_release_dates_attributes_3_date"> ...  <label for="release_date_date">Date</label> <input type="date" name="release_date[date]" id="release_date_date"></code></pre></figure> And when you click the link multiple times, the AJAX forms are going to have the same IDs and names, rendering them worthless. So it won’t be that easy! One solution would be to use the new JavaScript code to scan the previous fields, capture the right elements, and add them. That appears to be precisely the solution that <a href="https://rubygems.org/gems/cocoon">Cocoon</a> uses, and it also includes a bunch of helper methods to make the Rails code rather nice. Sat, 10 Jan 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/01/10/rails_forms.html https://www.parsonsmatt.org/2015/01/10/rails_forms.html If Ruby Had... <h3 id="a-small-wishlist">A small wishlist</h3> I went through <a href="https://www.learnyouahaskell.com/">LYAH</a>, and it gave me a great appreciation for the programming language and functional programming style in general. Two things I really like are partial function application and pattern matching. If Ruby had these features, how could they be used in a Rails app? <h2 id="beware">Beware!</h2> I’ve been learning Rails for about a month now. I’m sure my ‘before’ code could be improved markedly. 
<h2 id="pattern-matching">Pattern matching:</h2> My Rails app has a lot of related data objects, and I find that I’m adding new relations fairly regularly. <code class="language-plaintext highlighter-rouge">Artist</code>s have many <code class="language-plaintext highlighter-rouge">Release</code>s through <code class="language-plaintext highlighter-rouge">Contribution</code>s, <code class="language-plaintext highlighter-rouge">Release</code>s have many <code class="language-plaintext highlighter-rouge">ReleaseDate</code>s, <code class="language-plaintext highlighter-rouge">User</code>s follow many <code class="language-plaintext highlighter-rouge">Artist</code>s, etc. As such, I’ve got methods that will eventually look like this: <figure class="highlight"><pre><code class="language-ruby" data-lang="ruby">class User
  def follow_artist(artist)
    # case/when compares with ===, so match on the object itself, not its class
    case artist
    when String
      artist = Artist.find_by(name: artist)
    when Artist
      # everything OK
    else
      raise ArgumentError.new "Unsupported type"
    end
    self.follows.create(artist: artist)
  end
end

class Artist
  def add_release(release)
    # same idea: `case release.class` would never match String or Release
    case release
    when String
      release = Release.find_by(name: release)
    when Release
      # everything OK
    else
      raise ArgumentError.new "Unsupported type"
    end
    self.contributions.create(release: release)
  end
end</code></pre></figure> Pattern matching would mostly be nice as a way to make the code more concise. If Ruby had it, it might look like: <figure class="highlight"><pre><code class="language-ruby" data-lang="ruby">class User
  def follow_artist(artist)
    pattern_match artist.class,
      String => ->{ artist = Artist.find_by(name: artist) },
      Artist => ->{ artist },
      otherwise: ->{ raise ArgumentError.new "Unsupported type" }
    self.follows.create(artist: artist)
  end
end</code></pre></figure> The <code class="language-plaintext highlighter-rouge">pattern_match</code> function would take an expression and a hash of results paired with lambdas and execute the lambda corresponding to the result.
The return value of the <code class="language-plaintext highlighter-rouge">pattern_match</code> function is the return of the lambda that gets executed. The above code, in my opinion, looks quite a bit cleaner, and allows a bit more modularity than the former. It can be refactored like so: <figure class="highlight"><pre><code class="language-ruby" data-lang="ruby">class User def follow_artist(artist) self.follows.create( pattern_match(artist.class, String => ->{ { artist: Artist.find_by(name: artist) } }, Artist => ->{ { artist: artist } }, otherwise: ->{ raise ArgumentError } ) ) end end</code></pre></figure> <h2 id="partial-function-application">Partial function application:</h2> Mathematicians have determined that any function with multiple arguments can be expressed as a series of functions that take a single argument, return a function that takes a single argument, etc. until all arguments have been used and then returns the final result. In Haskell, this means that you can define a function: <code class="language-plaintext highlighter-rouge">func x y z = x + y + z</code> that takes three arguments and sums them. You can further define a function <code class="language-plaintext highlighter-rouge">func' = func 1</code>. <code class="language-plaintext highlighter-rouge">func'</code> is a function that takes two arguments, sums them, and adds 1. The <code class="language-plaintext highlighter-rouge">func</code> function has been partially applied. <code class="language-plaintext highlighter-rouge">func'' = func' 2</code> partially applies <code class="language-plaintext highlighter-rouge">func'</code> with the argument 2, which means that <code class="language-plaintext highlighter-rouge">func''</code> is now a function that takes a single argument and adds 3 to it. The following code snippet illustrates what is happening: That’s all fine, but it seems really abstract and kind of weird and confusing. Why would you want to do that? 
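Before answering that, here is a rough Ruby rendering of the <code class="language-plaintext highlighter-rouge">func</code> chain described above. It is only a sketch: it leans on Ruby’s built-in <code class="language-plaintext highlighter-rouge">Proc#curry</code>, which gets reasonably close to Haskell-style partial application for lambdas, and the constant names are made up for illustration because Ruby identifiers can’t contain apostrophes.

```ruby
# SUM3 stands in for the Haskell `func x y z = x + y + z` (hypothetical name).
# Calling .curry lets the lambda accept its arguments one at a time.
SUM3 = ->(x, y, z) { x + y + z }.curry

# Like `func' = func 1`: still waiting for two more arguments.
ADD_ONE_MORE = SUM3[1]

# Like `func'' = func' 2`: takes a single argument and adds 3 to it.
ADD_THREE = ADD_ONE_MORE[2]

puts ADD_THREE[10]    # supplies the last argument, so the sum is computed
puts SUM3[1, 2, 10]   # a curried lambda still accepts all arguments at once
```

Each intermediate value is an ordinary lambda, so it can be stored, passed around, and finished off later by whoever has the remaining argument.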
Going back to my <code class="language-plaintext highlighter-rouge">add_X</code> method above, even with the <code class="language-plaintext highlighter-rouge">pattern_match</code> function defined, there is a lot of code repetition between models. They’re all essentially doing the same thing: Receiving an object, pattern matching the object, and responding to the type of object. The specifics are different, but could it be abstracted out? With partial function application, it would be fairly easy. The method body would look something like: <figure class="highlight"><pre><code class="language-ruby" data-lang="ruby">class ActiveRecord def add_relation(base, relations, matching_function) base.relations.create(matching_function) # where matching_function calls pattern_match(expression, lambdas) end end</code></pre></figure> In languages with partially applied functions, the parameters that aren’t likely to change much are assigned first, and the parameters that change frequently are listed later. Each class would want to partially apply the method starting with the base class, then specify the relations, and then specify the matching function. <figure class="highlight"><pre><code class="language-ruby" data-lang="ruby">class Release # Creates the add_relation function for a Release object. Partially applied! def add_release_relation super.add_relation(self) end # Further supplies the contribution relation to the function, which now # expects the pattern matching function before it returns a result. def add_contribution_relation self.add_release_relation(self.contributions) end # Finally, a concrete function! This supplies the matching function to the # above relation. Of course, we have to define the artist lambdas now. def add_artist(artist) self.add_contribution_relation(pattern_match artist, artist_matching) end # The return value of the lambdas should be a parameters hash for creating # the relation. 
def artist_matching { Artist => ->{ { artist: artist } } , String => ->{ { artist: Artist.find_by(name: artist) } } , otherwise: ->{ raise ArgumentError.new } } end ## And again for release_dates! def add_release_date_relation self.add_release_relation(self.release_dates) end def add_release_date(release_date) self.add_release_date_relation(pattern_match release_date, release_date_matching) end def release_date_matching { ReleaseDate => ->{ { release_date: release_date } } , Date => -> { { release_date: ReleaseDate.new(date: release_date) } } , otherwise: ->{ raise ArgumentError.new } } end end</code></pre></figure> With this sort of setup, every class in my application is reusing the same basic code for the creation of releases. All they’re doing is customizing the methods to be more and more specific, until it eventually does what’s wanted. Since everything is so broken up, a rather thorough testing of the base methods will practically ensure that the methods that build upon it have little to go wrong. Naturally, this sort of thing is much more powerful in a language with a powerful type system and restricted side-effects, but it’s not strictly theoretical. (Yes, you could just explicitly pass all parameters to that initial method, but that’s not as fun!) Thu, 08 Jan 2015 00:00:00 +0000 https://www.parsonsmatt.org/2015/01/08/wishlist.html https://www.parsonsmatt.org/2015/01/08/wishlist.html OkFilter <a href="https://www.github.com/parsonsmatt/okfilter">OkFilter</a> is a program that I made to improve my OkCupid experience. It’s been pretty remarkably effective for me, and the development has been pretty fun. It’s currently on version 0.3, with a GUI thanks to Shoes (unfortunately, the packaged .jar file doesn’t work on Windows, and I haven’t tested it on OSX yet). 
<h2 id="what-it-does">What it does:</h2> OkFilter uses Watir to kick off a browser, login to OkCupid, go to the quickmatch page, and start rating the quickmatches automatically according to the given match settings. By default, it’s set to 5-star anyone with a match percentage greater than 90% and 1-star anyone with a match percentage less than 60%. It skips anyone between, and it skips 0% matches on the assumption that they haven’t filled in any questions yet. <h2 id="why-does-it-work">Why does it work?</h2> OkCupid is a really well designed service. They’ve put a lot of thought into making something that would provide excellent results to their users, and it shows. If you answer enough questions and properly rate their importance, then the match percentage is a very good predictor of whether or not you’ll actually like someone. This is especially true on the lower end: you’re unlikely to even want to be friends with someone with a sufficiently low match percentage, and for me, that’s around 60%. OkCupid also seems to believe that less-is-more, and only shows you a limited pool of users at any given time. Even searching for users by match percentage will not show you a comprehensive list of people in an area, unless the specified area is fairly small. They’re doing some filtering on their end to decide who you get to see. There are a few ways to remove a profile from your pool that I can tell: sending them a message, 1-starring their profile, hiding them, and blocking them. When you remove someone from your pool, then OkCupid will replace them with another user. When you combine the above ideas, you get the following strategy: remove poor matches from the pool as quickly as possible, in the hopes that their replacements will be good matches. OkFilter automates that process. <h2 id="my-experience-with-it">My experience with it</h2> OkFilter has rated somewhere around 10,000 profiles for me, disliking the vast majority and liking a small fraction. 
How’s it worked out for me? <ul> <li>The number of people that ‘Liked’ my profile is up from 220 to 280, with about 25 of those being mutual.</li> <li>I have visitors turned off, but it resulted in a huge uptick in visitors: something on the order of dozens per day immediately after the filtering.</li> <li>I’ve had good conversations with about half of the people that have matched me.</li> <li>My main page pretty much only has good matches.</li> </ul> All in all, this is a huge improvement on my experience on the website, and I’ve only had this going for a little over a week. Sun, 19 Oct 2014 18:00:00 +0000 https://www.parsonsmatt.org/2014/10/19/okfilter.html https://www.parsonsmatt.org/2014/10/19/okfilter.html So MEAN I decided it would be fun to learn the MEAN stack. I’m already developing some level of proficiency with Javascript, and the combination of performance and language/data consistency seems like a good platform with which to launch into developing a full blown webapp. Ruby on Rails will have to wait. <h2 id="developing-a-vagrantbox">Developing a Vagrantbox</h2> I started with the <code class="language-plaintext highlighter-rouge">ubuntu/trusty64</code> box on the vagrantcloud, modified the Vagrantfile for a provisioning script, and forwarded ports 3000 and 80 to 3030 and 8080 on the host. Bootstrap.sh: <figure class="highlight"><pre><code class="language-sh" data-lang="sh">#!/bin/sh # Bootstrap for MEAN box # using ubuntu/trusty64 base boxv # Install nodejs sudo add-apt-repository ppa:chris-lea/node.js sudo apt-get update sudo apt-get install nodejs git gcc make build-essential -y # Install MongoDB sudo apt-get install mongodb -y # Install Bower sudo npm install -g bower # Install grunt sudo npm install -g grunt-cli # Install mean sudo npm install -g meanio</code></pre></figure> This finishes without error, somewhat ominously. 
I <code class="language-plaintext highlighter-rouge">cd /vagrant/</code> and <code class="language-plaintext highlighter-rouge">mean init gzclweb</code>, no errors. <code class="language-plaintext highlighter-rouge">cd gzclweb && sudo npm install</code>, this time with a bunch of warnings about wanting things of an earlier version than I have installed. Running <code class="language-plaintext highlighter-rouge">grunt</code> starts everything up (gives an error about missing some c++ bson extensions), starts the server, and accessing localhost:3030 gives the 404 error. It’s working! <a href="https://stackoverflow.com/questions/21656420/failed-to-load-c-bson-extension">This SO question</a> caused me to add the <code class="language-plaintext highlighter-rouge">sudo apt-get install gcc make build-essential</code> to the bootstrap.sh above. This fixed the previous errors about bson C++. MEAN is now running fantastically on my vagrantbox. Sun, 10 Aug 2014 15:12:19 +0000 https://www.parsonsmatt.org/2014/08/10/MEAN.html https://www.parsonsmatt.org/2014/08/10/MEAN.html Vagrant, Windows, and Phonegap: A tale of woe A project I’m working on is converting from a <a href="https://build.phonegap.com/">Phonegap Build</a> to a locally built project, thanks to the lack of plugins available to Build compared to the much larger library of plugins available to the offline platform. I’m pretty sure I know why Phonegap Build is useful – Phonegap is a total PITA to setup and configure, especially in a command line only situation. There are a huge nest of dependencies and most of the answers to fix things are hidden in StackOverflow threads and bug reports in the official documentation for various tools. This is exactly the kind of situation where <a href="https://www.vagrantup.com">Vagrant</a> becomes useful – do the work to make a working dev box once and then share the result out. 
This’ll let me develop on both desktop and laptop with no issue, and integrating new folks into the team will be easy. So, let’s put in the work once, so I don’t have to do it again! <h2 id="getting-it-running">Getting it running</h2> I’m basing the box off of the <code class="language-plaintext highlighter-rouge">ubuntu/trusty64</code> box, since it seems to be kept up to date fairly well by the Ubuntu folks. I added the standard <code class="language-plaintext highlighter-rouge">config.vm.provision :shell, path: "bootstrap.sh"</code> to get a provisioning script running, and forwarded port 3000 on the guest to 3000 on the host to allow for use of the excellent <code class="language-plaintext highlighter-rouge">phonegap serve</code> for live debugging and testing. <h2 id="the-bootstrapsh">The Bootstrap.sh</h2> Developing with Android and Phonegap is kind of a mess since everything has so many dependencies. It’s terrible, really. If it weren’t so useful, I wouldn’t even bother. Here’s the bootstrap file I came up with: <figure class="highlight"><pre><code class="language-bash" data-lang="bash">#!/bin/sh

# Install nodejs and NPM
sudo add-apt-repository ppa:chris-lea/node.js -y
sudo dpkg --add-architecture i386
sudo apt-get update -y
sudo apt-get install nodejs -y
sudo npm install npm -g

# Install phonegap and phonegap plugin manager
sudo npm install phonegap -g
sudo npm install plugman -g

# Install Java SDK
sudo apt-get install openjdk-7-jdk -y

# Install Android SDK
wget https://dl.google.com/android/android-sdk_r23.0.2-linux.tgz
tar -xzf android-sdk_r23.0.2-linux.tgz
sudo apt-get install expect -y # allows to give Y to license prompt
sudo apt-get install libncurses5:i386 libstdc++6:i386 zlib1g:i386 -y
sudo apt-get update
sudo apt-get install libncurses5:i386 libstdc++6:i386 zlib1g:i386

expect -c '
set timeout -1 ;
spawn sudo android-sdk-linux/tools/android update sdk --no-ui;
expect {
    "Do you accept the license" { exp_send "y\r" ; exp_continue }
    eof
}
'</code></pre></figure> The <code class="language-plaintext highlighter-rouge">expect</code> block was pulled from <a href="https://stackoverflow.com/a/17863931/3780203">this very helpful SO answer</a>, and works great. Of course, this doesn’t work. It gets pretty far, though! You’ll be able to create an app and run <code class="language-plaintext highlighter-rouge">phonegap serve</code> and everything will look awesome. But when you try to do a <code class="language-plaintext highlighter-rouge">run</code> or <code class="language-plaintext highlighter-rouge">build</code> it crashes out with an error around shelljs. Unable to debug the matter, I threw in the towel for the night. Fortunately, someone else has begun working on a <a href="https://github.com/vasconcelloslf/phonegap-box">box</a> which is much farther along, so I decided to start working on this to try and get it going. When I download and test it out, I get the same problem – so something nefarious must be up. According to <a href="https://stackoverflow.com/questions/19592701/phonegap-building-phonegap-android-app-gives-compile-error-on-linux">this SO question</a>, doing a <code class="language-plaintext highlighter-rouge">cordova platform remove android</code> and <code class="language-plaintext highlighter-rouge">cordova platform add android</code> should fix it. However, I just get another error with the key line: <figure class="highlight"><pre><code class="language-bash" data-lang="bash">... at Object.fs.symlinkSync (fs.js:735:18) ...</code></pre></figure> Hmm… So a symlink problem? Searching Google for that brings up <a href="https://stackoverflow.com/questions/24200333/symbolic-links-and-synced-folders-in-vagrant">this SO post</a> with the following answer: <blockquote> Virtualbox does not allow symlinks on shared folders for security reasons. 
To enable symlinks the following line needs to be added to the vm provider config block in the Vagrantfile: </blockquote> <figure class="highlight"><pre><code class="language-ruby" data-lang="ruby">config.vm.provider "virtualbox" do |v| v.customize ["setextradata", :id, "VBoxInternal2/SharedFoldersEnableSymlinksCreate/v-root", "1"] end</code></pre></figure> <blockquote> Additionally, on windows vagrant up needs to be executed in a shell with admin rights. No workarounds necessary. </blockquote> <blockquote> source: https://coderwall.com/p/qklo9w </blockquote> Wonderful. This is exactly what I need, even if I hate the requirement of running a shell with admin rights. There’s an option in <code class="language-plaintext highlighter-rouge">secpol.msc</code> to give users the “Create Symlink” permission which I will experiment with next. Unfortunately, while it’s necessary that your user account be added to this group, that doesn’t remove the need to run the shell as administrator. The results of my work here have been merged into <a href="https://github.com/vasconcelloslf/phonegap-box">this Github repo</a>, which is now working great for me as I continue working on my Phonegap application. Fri, 01 Aug 2014 19:00:00 +0000 https://www.parsonsmatt.org/2014/08/01/phonegapbox.html https://www.parsonsmatt.org/2014/08/01/phonegapbox.html Protecting my Javascript <h2 id="well-what-do-we-have-here">Well, what do we have here?</h2> Good OO practice is to encapsulate your data. I recently wrote a script that managed some data and methods for a mobile application to write WiFi information to an NFC. 
My first design looked something like this: <figure class="highlight"><pre><code class="language-javascript" data-lang="javascript">var SSID;
var pass;

function formatNetwork() {
    var toWrite = {};
    toWrite['SSID'] = SSID;
    toWrite['pass'] = pass;
    return JSON.stringify(toWrite);
}
</code></pre></figure> Except this is terrible: <code class="language-plaintext highlighter-rouge">SSID</code> and <code class="language-plaintext highlighter-rouge">pass</code> are global variables and exposed to the whole world. Additionally, the network formatting is fragile and difficult to expand. I decided to upgrade to the following and store everything in an object. <h2 id="object-time">Object Time</h2> <figure class="highlight"><pre><code class="language-javascript" data-lang="javascript">var wifiWriter = {
    SSID: 'none',
    pass: 'none',
    formatNetwork: function() {
        var toWrite = {};
        for (var prop in wifiWriter) {
            // test the property's value; prop itself is just the key string
            if (!isFunction(wifiWriter[prop])) {
                toWrite[prop] = wifiWriter[prop];
            }
        }
        return JSON.stringify(toWrite);
    }
}
// isFunction() was pulled from UnderscoreJS
</code></pre></figure> Much better. There’s only a single global variable now. The <code class="language-plaintext highlighter-rouge">formatNetwork</code> function copies all of the non-function properties of the <code class="language-plaintext highlighter-rouge">wifiWriter</code> object into the output object, so we can add more fields to the string without breaking everything. I did end up adding a field, so this paid off sooner rather than later. It’s still not encapsulated. Anyone can access the two variables and modify them directly.
My OO training tells me that this is horrible, and I don’t trust myself not to screw things up somehow. We’ll encapsulate it in a function for best results, returning an object that has methods to access and modify the variables. <h2 id="its-function-time">It’s Function Time</h2> <figure class="highlight"><pre><code class="language-javascript" data-lang="javascript">var wifiWriter = function () {
    var SSID = 'none';
    var pass = 'none';
    return {
        getSSID: function() {
            return SSID;
        },
        setSSID: function(newSSID) {
            // Validation goes here right?
            SSID = newSSID;
        },
        // pass get/set left as exercise to reader
        formatNetwork: function() {
            var toWrite = {};
            for (var prop in wifiWriter) {
                if (!isFunction(wifiWriter[prop])) {
                    toWrite[prop] = wifiWriter[prop];
                }
            }
            return JSON.stringify(toWrite);
        }
    }; // trick #1: end the return statement
}(); // trick #2: call function immediately after declaring it
</code></pre></figure> PROBLEM. My clever way of looping through the variables no longer works, since they’re now local variables inside the function and not properties of the object anymore. It might be possible to pull the variables out in a similar manner, but you know, what if I wanted to add additional fields that weren’t to be incorporated? Let’s wrap them in an object! <figure class="highlight"><pre><code class="language-javascript" data-lang="javascript">var wifiWriter = function () {
    var wifiInfo = {
        SSID: 'none',
        pass: 'none',
        locationID: 'where even'
    };
    return {
        getSSID: function() {
            // SSID now lives on wifiInfo, not in a bare variable
            return wifiInfo.SSID;
        },
        setSSID: function(newSSID) {
            // Validation goes here right?
            wifiInfo.SSID = newSSID;
        },
        // pass get/set left as exercise to reader
        formatNetwork: function() {
            var toWrite = {};
            for (var prop in wifiInfo) {
                if (!isFunction(wifiInfo[prop])) {
                    toWrite[prop] = wifiInfo[prop];
                }
            }
            return JSON.stringify(toWrite);
        }
    };
}();
</code></pre></figure> This satisfies my need for good code writing. Sat, 26 Jul 2014 12:00:00 +0000 https://www.parsonsmatt.org/2014/07/26/protectiveJS.html https://www.parsonsmatt.org/2014/07/26/protectiveJS.html Postfix Arithmetic in Javascript <h2 id="getting-started">Getting Started</h2> I learned about the <a href="https://en.wikipedia.org/wiki/Stack_(abstract_data_type)">stack</a> data type in class, and the specific given example was a postfix arithmetic algorithm. I’ve also been learning Javascript from <a href="https://eloquentjavascript.net/">Eloquent Javascript</a> and <a href="https://www.amazon.com/JavaScript-Good-Parts-Douglas-Crockford/dp/0596517742">Javascript: The Good Parts</a>. It dawned on me that this was a great way to experiment with Javascript’s functional programming elements, get some experience implementing a basic data structure, and document the writing process. So I decided to get cracking! <h2 id="postfix-arithmetic">Postfix Arithmetic</h2> Postfix arithmetic means that the operator goes after the two numbers. Formally, you have <code class="language-plaintext highlighter-rouge">Operand1 Operand2 Operator</code>. It is contrasted with infix arithmetic, where you use the operator in between the operands. A simple example contrasts <code class="language-plaintext highlighter-rouge">5 + 3</code> with its equivalent <code class="language-plaintext highlighter-rouge">5 3 +</code>. A postfix expression can be an operand in another postfix expression: <code class="language-plaintext highlighter-rouge">5 3 + 3 *</code> is equivalent to <code class="language-plaintext highlighter-rouge">(5 + 3) * 3</code>.
Since an expression can serve as an operand, it’s unnecessary to use parentheses to specify operation order. The infix expression <code class="language-plaintext highlighter-rouge">(3 * 4 / (2 + 5)) * (3 + 4)</code> is equivalent to the postfix expression <code class="language-plaintext highlighter-rouge">3 4 * 2 5 + / 3 4 + *</code>. This seems a bit cryptic. I’ll go through the expression piece by piece and solve it. <code class="language-plaintext highlighter-rouge">3 4 *</code> is the first expression and it evaluates to 12. <code class="language-plaintext highlighter-rouge">2 5 +</code> is the second expression and evaluates to 7. The first part of the original expression can be thought of as <code class="language-plaintext highlighter-rouge">(3 4 *) (2 5 +) /</code> which makes it a bit easier to see how the operator takes the two prior expressions as operands. The rest is easy enough to evaluate with that. <h2 id="the-algorithm">The Algorithm</h2> You can easily use a stack to parse postfix expressions. The general algorithm looks like this: <figure class="highlight"><pre><code class="language-javascript" data-lang="javascript">for each (element in expression) {
    if (element is an operator) {
        right = pop;
        left = pop;
        push(element(left, right));
    }
    else {
        push(element);
    }
}
</code></pre></figure> The algorithm goes through each element and checks to see if it’s an operator or operand. If it’s an operand, it pushes it onto the stack. If it’s an operator, it pops the top two values off the stack, applies the operator to them, and pushes the result back onto the stack. The first value popped is the right-hand operand, which matters for subtraction, division, and modulo. <h2 id="the-javascript">The Javascript</h2> This seems like an obvious use of functional programming!
Javascript already implements the stack operations <code class="language-plaintext highlighter-rouge">push</code> and <code class="language-plaintext highlighter-rouge">pop</code> in the basic array methods, so we can use a simple array for this. First, we’ll consider a properly formatted string of characters (presumably acquired from some other function which validates user input) which we’ll split into an array.
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><table class="rouge-table"><tbody><tr><td class="code"><pre>var expression = "3 4 * 2 5 + / 3 4 + *";
var postfix = expression.split(" ");
var postfixStack = [];
</pre></td></tr></tbody></table></code></pre></figure>
Then we’ll go through the array and either push or compute:
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><table class="rouge-table"><tbody><tr><td class="code"><pre>postfix.forEach( function(current) {
    if ( isOperator(current) ) {
        // The right operand was pushed last, so it is popped first.
        var right = postfixStack.pop();
        var left = postfixStack.pop();
        postfixStack.push( compute(left, symbolToOperator(current), right) );
    } else {
        // Convert the token to a number so that + adds instead of concatenating strings.
        postfixStack.push( Number(current) );
    }
});
</pre></td></tr></tbody></table></code></pre></figure>
This makes good enough sense.
Let’s define those functions:
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><table class="rouge-table"><tbody><tr><td class="code"><pre>function isOperator(toCheck) {
    switch (toCheck) {
        case '+':
        case '-':
        case '*':
        case '/':
        case '%':
            return true;
        default:
            return false;
    }
}

function compute(a, operator, b) {
    return operator(a,b);
}

function symbolToOperator(symbol) {
    switch (symbol) {
        case '+': return plus;
        case '-': return minus;
        case '*': return multiply;
        case '/': return divide;
        case '%': return modulo;
    }
}

function plus(a,b) { return a + b; }
function minus(a,b) { return a - b; }
function multiply(a,b) { return a * b; }
function divide(a,b) { return a / b; }
function modulo(a,b) { return a % b; }
</pre></td></tr></tbody></table></code></pre></figure>
Finally, we’ll add <code class="language-plaintext highlighter-rouge">console.log(postfixStack[0])</code>. Now you can run the script and see how it works. Going forward, I’d probably want to stuff all of this into its own function, and write another set of functions – one to accept input, one to validate it, and perhaps another to convert an infix expression into postfix.
Mon, 07 Jul 2014 12:00:00 +0000 https://www.parsonsmatt.org/2014/07/07/postfixjs.html https://www.parsonsmatt.org/2014/07/07/postfixjs.html
What I've been listening to lately
I’ve been getting into some new music recently and figured I’d post about it. <ul> <li>Metal <ul> <li>I just discovered <a href="https://www.youtube.com/watch?v=qihOp52Tots">Graveworm</a>, though I’ve evidently had it on my PC since 2008. Very cool symphonic black/death metal.</li> <li><a href="https://www.youtube.com/watch?v=kzQpiJjdprQ">Anaal Nathrakh</a> has been blowing my mind since I heard their Eschaton CD. Whenever I need to get motivated, the chorus to this just fucking blows me away.
<a href="https://www.youtube.com/watch?v=2JNI0vd0tz8">sweet live version</a> too!</li> </ul> </li> <li>Electronica <ul> <li><a href="https://www.youtube.com/watch?v=cTjF2_-bneM">Bonobo</a>. His album Black Sands is just perfect when I want a hard contrast to the usual metal.</li> <li><a href="https://www.youtube.com/watch?v=ArPpVRxyyRY">Seven Lions</a> is a bit more upbeat, but satisfies my need for smooth and fun music.</li> </ul> </li> </ul> I’ve regrettably not had time to work on my own music since starting my internship. I’ll have to make more time for it.
Fri, 04 Jul 2014 13:00:00 +0000 https://www.parsonsmatt.org/2014/07/04/listening-to.html https://www.parsonsmatt.org/2014/07/04/listening-to.html
A New Position
I just got my first position as a programmer intern with <a href="https://pylonproducts.com/" title="Pylon Products">Pylon Products</a>. It looks like I’ll be working mostly with Javascript. I’m very excited to get working on the projects.
Sun, 22 Jun 2014 18:36:19 +0000 https://www.parsonsmatt.org/2014/06/22/a_new_position.html https://www.parsonsmatt.org/2014/06/22/a_new_position.html
FIRST
Jekyll is finally running on my server and frankly it rules. Previously, I was using a “buildSite.js” file with document.write(partHTML) to handle templates and layouts. Not fun. I’m managing this website with git, and even have it set up with staging to test changes. How responsible!
Sat, 21 Jun 2014 15:12:19 +0000 https://www.parsonsmatt.org/2014/06/21/my-first-post.html https://www.parsonsmatt.org/2014/06/21/my-first-post.html
