Trait-Constrained Enums in Rust https://kcsongor.github.io/gadts-in-rust 2025-11-07T00:00:00-00:00 2025-06-10T00:00:00+00:00 Csongor Kiss https://kcsongor.github.io <section id="table-of-contents" class="toc"> <header> <h3><i class="fa fa-book"></i> Overview</h3> </header> <div id="drawer"> <ul id="markdown-toc"> <li><a href="#a-simple-expression-language" id="markdown-toc-a-simple-expression-language">A simple expression language</a></li> <li><a href="#more-precise-types-with-gadts" id="markdown-toc-more-precise-types-with-gadts">More precise types with GADTs</a></li> <li><a href="#more-flexible-types-with-gadts" id="markdown-toc-more-flexible-types-with-gadts">More flexible types with GADTs</a></li> <li><a href="#encoding-gadts-in-rust" id="markdown-toc-encoding-gadts-in-rust">Encoding GADTs in Rust</a> <ul> <li><a href="#type-equality-witnesses" id="markdown-toc-type-equality-witnesses">Type equality witnesses</a></li> <li><a href="#trait-constraint-witnesses" id="markdown-toc-trait-constraint-witnesses">Trait constraint witnesses</a></li> <li><a href="#using-specialisation-to-recover-constraints" id="markdown-toc-using-specialisation-to-recover-constraints">Using specialisation to recover constraints</a></li> <li><a href="#why-this-works" id="markdown-toc-why-this-works">Why this works</a></li> </ul> </li> <li><a href="#limitations" id="markdown-toc-limitations">Limitations</a></li> <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li> </ul> </div> </section> <!-- /#table-of-contents --> <p>Rust doesn’t have GADTs (generalised algebraic data types), but we can get surprisingly close with some creative type-level tricks.</p> <p>This post might look like a departure from my usual (in the sense of <em>typical</em>, not <em>frequent</em>) Haskell posts since we’ll be writing Rust today. 
Don’t let that fool you; we’ll just be writing Haskell in Rust.</p> <p>GADTs are a Haskell feature that let constructors carry richer type information. They can enforce constraints or refine type parameters per constructor – which is what we’ll achieve here in Rust.</p> <p>As this post is mainly for Rust programmers, I’ll start by motivating why GADTs are useful. For that, we’ll build a small expression language and see where plain algebraic data types fall short. Then we’ll introduce GADTs to fix the problem, first through type refinement and then with per-constructor constraints. After that, we’ll move to Rust and reconstruct both mechanisms: type equality witnesses (a known trick) and constraint witnesses (the new bit this post is really about). You’ll know when we’ve switched from Haskell to Rust — the syntax gets ugly.</p> <h2 id="a-simple-expression-language">A simple expression language</h2> <p>Let’s start with a small expression language, encoded as a Haskell datatype. It supports defining integer literals, and adding them together.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">Expr</span> <span class="o">=</span> <span class="kt">LitInt</span> <span class="kt">Int</span> <span class="o">|</span> <span class="kt">Add</span> <span class="kt">Expr</span> <span class="kt">Expr</span></code></pre></figure> <p>The Rust equivalent of this is an enum with two constructors and more parentheses (and explicit heap indirection).</p> <p>We can evaluate expressions recursively:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">eval</span> <span class="o">::</span> <span class="kt">Expr</span> <span class="o">-&gt;</span> <span class="kt">Int</span> <span class="n">eval</span> <span class="p">(</span><span class="kt">LitInt</span> <span class="n">n</span><span class="p">)</span> <span class="o">=</span> <span class="n">n</span> <span 
class="n">eval</span> <span class="p">(</span><span class="kt">Add</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="o">=</span> <span class="n">eval</span> <span class="n">a</span> <span class="o">+</span> <span class="n">eval</span> <span class="n">b</span></code></pre></figure> <p>Evaluating <code class="language-plaintext highlighter-rouge">eval (Add (LitInt 3) (Add (LitInt 4) (LitInt 5)))</code> yields <code class="language-plaintext highlighter-rouge">12</code>.</p> <p>This is not a very useful expression language, so let’s add another literal type and another binary operator:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">Expr</span> <span class="o">=</span> <span class="kt">LitInt</span> <span class="kt">Int</span> <span class="o">|</span> <span class="kt">Add</span> <span class="kt">Expr</span> <span class="kt">Expr</span> <span class="o">|</span> <span class="kt">LitBool</span> <span class="kt">Bool</span> <span class="o">|</span> <span class="kt">Or</span> <span class="kt">Expr</span> <span class="kt">Expr</span></code></pre></figure> <p>Now we can write expressions like <code class="language-plaintext highlighter-rouge">Add (LitInt 1) (LitInt 2)</code> and <code class="language-plaintext highlighter-rouge">Or (LitBool False) (LitBool True)</code>. But we can also write <code class="language-plaintext highlighter-rouge">Add (LitInt 1) (LitBool False)</code>, which shouldn’t type-check!</p> <p>Worse, we’re in trouble when writing the return type of <code class="language-plaintext highlighter-rouge">eval :: Expr -&gt; ???</code>. What it should return depends on the input, but the input type doesn’t contain enough information.</p> <h2 id="more-precise-types-with-gadts">More precise types with GADTs</h2> <p>GADTs let us say more about each constructor’s result type. 
We can extend our <code class="language-plaintext highlighter-rouge">Expr</code> definition so that <code class="language-plaintext highlighter-rouge">Add</code> only exists for integers, and <code class="language-plaintext highlighter-rouge">Or</code> only for booleans.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="cp">{-# LANGUAGE GADTs #-}</span> <span class="kr">data</span> <span class="kt">Expr</span> <span class="n">a</span> <span class="kr">where</span> <span class="kt">LitInt</span> <span class="o">::</span> <span class="kt">Int</span> <span class="o">-&gt;</span> <span class="kt">Expr</span> <span class="kt">Int</span> <span class="kt">Add</span> <span class="o">::</span> <span class="kt">Expr</span> <span class="kt">Int</span> <span class="o">-&gt;</span> <span class="kt">Expr</span> <span class="kt">Int</span> <span class="o">-&gt;</span> <span class="kt">Expr</span> <span class="kt">Int</span> <span class="kt">LitBool</span> <span class="o">::</span> <span class="kt">Bool</span> <span class="o">-&gt;</span> <span class="kt">Expr</span> <span class="kt">Bool</span> <span class="kt">Or</span> <span class="o">::</span> <span class="kt">Expr</span> <span class="kt">Bool</span> <span class="o">-&gt;</span> <span class="kt">Expr</span> <span class="kt">Bool</span> <span class="o">-&gt;</span> <span class="kt">Expr</span> <span class="kt">Bool</span></code></pre></figure> <p>Notice that <code class="language-plaintext highlighter-rouge">Expr a</code> is now <em>parameterised</em> (this would read <code class="language-plaintext highlighter-rouge">Expr&lt;A&gt;</code> in Rust), and each constructor specifies the type of expression it builds. 
<code class="language-plaintext highlighter-rouge">LitInt</code> takes an <code class="language-plaintext highlighter-rouge">Int</code> and produces an <code class="language-plaintext highlighter-rouge">Expr Int</code>, and <code class="language-plaintext highlighter-rouge">Add</code> combines two <code class="language-plaintext highlighter-rouge">Expr Int</code>s into another <code class="language-plaintext highlighter-rouge">Expr Int</code>.</p> <p>As a result, <code class="language-plaintext highlighter-rouge">Add (LitInt 1) (LitBool False)</code> is rejected at compile time because the second operand has the wrong type.</p> <p>The evaluation function can now have a precise type:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">eval</span> <span class="o">::</span> <span class="kt">Expr</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="n">a</span> <span class="n">eval</span> <span class="p">(</span><span class="kt">LitInt</span> <span class="n">n</span><span class="p">)</span> <span class="o">=</span> <span class="n">n</span> <span class="n">eval</span> <span class="p">(</span><span class="kt">Add</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="o">=</span> <span class="n">eval</span> <span class="n">a</span> <span class="o">+</span> <span class="n">eval</span> <span class="n">b</span> <span class="n">eval</span> <span class="p">(</span><span class="kt">LitBool</span> <span class="n">b</span><span class="p">)</span> <span class="o">=</span> <span class="n">b</span> <span class="n">eval</span> <span class="p">(</span><span class="kt">Or</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="o">=</span> <span class="n">eval</span> <span class="n">a</span> <span class="o">||</span> <span class="n">eval</span> <span class="n">b</span></code></pre></figure> <p><code class="language-plaintext 
highlighter-rouge">eval</code> now takes an expression of any type, and returns a value of that type. When pattern matching on <code class="language-plaintext highlighter-rouge">Expr a</code>, if we see a <code class="language-plaintext highlighter-rouge">LitInt</code>, we learn that <code class="language-plaintext highlighter-rouge">a</code> is <code class="language-plaintext highlighter-rouge">Int</code>, so the result must be an integer. In the <code class="language-plaintext highlighter-rouge">Add</code> branch, both sub-expressions are <code class="language-plaintext highlighter-rouge">Expr Int</code>, so <code class="language-plaintext highlighter-rouge">eval</code> produces two <code class="language-plaintext highlighter-rouge">Int</code>s which can be added together.</p> <p>In other words, we not only <em>restrict</em> what types of expressions can be used when constructing <code class="language-plaintext highlighter-rouge">Add</code>, but also <em>learn</em> type information when destructuring it.</p> <p>So far, every constructor fixes a concrete return type. 
But what if we wanted to support other types that can also be added together?</p> <h2 id="more-flexible-types-with-gadts">More flexible types with GADTs</h2> <p>Let’s say we want to support <code class="language-plaintext highlighter-rouge">Double</code>s in our language too:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">Expr</span> <span class="n">a</span> <span class="kr">where</span> <span class="kt">LitInt</span> <span class="o">::</span> <span class="kt">Int</span> <span class="o">-&gt;</span> <span class="kt">Expr</span> <span class="kt">Int</span> <span class="kt">LitDouble</span> <span class="o">::</span> <span class="kt">Double</span> <span class="o">-&gt;</span> <span class="kt">Expr</span> <span class="kt">Double</span> <span class="kt">Add</span> <span class="o">::</span> <span class="kt">Expr</span> <span class="kt">Int</span> <span class="o">-&gt;</span> <span class="kt">Expr</span> <span class="kt">Int</span> <span class="o">-&gt;</span> <span class="kt">Expr</span> <span class="kt">Int</span> <span class="o">...</span></code></pre></figure> <p>Doubles, being numeric values, can also be added together, but the current <code class="language-plaintext highlighter-rouge">Add</code> constructor only works on integers. 
We can relax this by constraining the <code class="language-plaintext highlighter-rouge">a</code> type parameter just in this constructor:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">Expr</span> <span class="n">a</span> <span class="kr">where</span> <span class="kt">LitInt</span> <span class="o">::</span> <span class="kt">Int</span> <span class="o">-&gt;</span> <span class="kt">Expr</span> <span class="kt">Int</span> <span class="kt">LitDouble</span> <span class="o">::</span> <span class="kt">Double</span> <span class="o">-&gt;</span> <span class="kt">Expr</span> <span class="kt">Double</span> <span class="kt">Add</span> <span class="o">::</span> <span class="kt">Num</span> <span class="n">a</span> <span class="o">=&gt;</span> <span class="kt">Expr</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="kt">Expr</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="kt">Expr</span> <span class="n">a</span> <span class="o">...</span></code></pre></figure> <p>The <code class="language-plaintext highlighter-rouge">Num a =&gt;</code> part is a <em>type class constraint</em> in Haskell, equivalent to a trait bound in Rust. 
In other words, <code class="language-plaintext highlighter-rouge">Add</code> can now take any two expressions of the same type, as long as that type supports numeric operations.</p> <p>The <code class="language-plaintext highlighter-rouge">eval</code> function simply gains one extra case:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">eval</span> <span class="o">::</span> <span class="kt">Expr</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="n">a</span> <span class="n">eval</span> <span class="p">(</span><span class="kt">LitInt</span> <span class="n">n</span><span class="p">)</span> <span class="o">=</span> <span class="n">n</span> <span class="n">eval</span> <span class="p">(</span><span class="kt">LitDouble</span> <span class="n">n</span><span class="p">)</span> <span class="o">=</span> <span class="n">n</span> <span class="n">eval</span> <span class="p">(</span><span class="kt">Add</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="o">=</span> <span class="n">eval</span> <span class="n">a</span> <span class="o">+</span> <span class="n">eval</span> <span class="n">b</span> <span class="n">eval</span> <span class="p">(</span><span class="kt">LitBool</span> <span class="n">b</span><span class="p">)</span> <span class="o">=</span> <span class="n">b</span> <span class="n">eval</span> <span class="p">(</span><span class="kt">Or</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="o">=</span> <span class="n">eval</span> <span class="n">a</span> <span class="o">||</span> <span class="n">eval</span> <span class="n">b</span></code></pre></figure> <p><code class="language-plaintext highlighter-rouge">Add</code> now supports <code class="language-plaintext highlighter-rouge">Add (LitDouble 1.0) (LitDouble 2.0)</code> and <code class="language-plaintext highlighter-rouge">Add (LitInt 1) (LitInt 2)</code>. 
Crucially, the <code class="language-plaintext highlighter-rouge">Num a</code> constraint is attached to just the <code class="language-plaintext highlighter-rouge">Add</code> constructor, not the entire type. It’s still possible to construct <code class="language-plaintext highlighter-rouge">LitBool</code> values, even though booleans don’t support addition.</p> <p>Each <code class="language-plaintext highlighter-rouge">Add</code> value carries evidence that its type parameter <code class="language-plaintext highlighter-rouge">a</code> satisfies <code class="language-plaintext highlighter-rouge">Num</code>. When we pattern match on <code class="language-plaintext highlighter-rouge">Add</code>, the type checker brings that constraint into scope automatically, allowing us to use <code class="language-plaintext highlighter-rouge">(+)</code> in the corresponding branch.</p> <p>This is a subtle but powerful idea: constraints can be local to a constructor. Even the precise return types we saw earlier are another form of locality — <code class="language-plaintext highlighter-rouge">LitInt</code> locally records that <code class="language-plaintext highlighter-rouge">a</code> is equal to <code class="language-plaintext highlighter-rouge">Int</code>.</p> <p>Next, we’ll rebuild the expression language in Rust, and see how to emulate both of these features: constructor-local type equalities and constructor-local constraints. 
To start adopting the Rust nomenclature, we’ll build an enum whose constructors are trait-constrained.</p> <h2 id="encoding-gadts-in-rust">Encoding GADTs in Rust</h2> <p>We’ll be encoding both properties of GADTs:</p> <ol> <li><strong>Constructor-local type equalities</strong> — like <code class="language-plaintext highlighter-rouge">LitInt</code> refining <code class="language-plaintext highlighter-rouge">a ~ Int</code>.</li> <li><strong>Constructor-local constraints</strong> — like <code class="language-plaintext highlighter-rouge">Add</code> requiring <code class="language-plaintext highlighter-rouge">Num a</code>.</li> </ol> <p>We’ll start with the first one, since the idea is already well known in the Rust community.</p> <p>As a baseline, here’s the simple expression language, with the promised parentheses and heap indirections:</p> <figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">enum</span> <span class="n">Expr</span> <span class="p">{</span> <span class="nf">LitInt</span><span class="p">(</span><span class="nb">i64</span><span class="p">),</span> <span class="nf">Add</span><span class="p">(</span><span class="nb">Box</span><span class="o">&lt;</span><span class="n">Expr</span><span class="o">&gt;</span><span class="p">,</span> <span class="nb">Box</span><span class="o">&lt;</span><span class="n">Expr</span><span class="o">&gt;</span><span class="p">),</span> <span class="nf">LitBool</span><span class="p">(</span><span class="nb">bool</span><span class="p">),</span> <span class="nf">Or</span><span class="p">(</span><span class="nb">Box</span><span class="o">&lt;</span><span class="n">Expr</span><span class="o">&gt;</span><span class="p">,</span> <span class="nb">Box</span><span class="o">&lt;</span><span class="n">Expr</span><span class="o">&gt;</span><span class="p">),</span> <span class="p">}</span></code></pre></figure> <p>As things stand, we have the same issues as the original Haskell version, namely that we 
can construct invalid combinations, and can’t give a precise type to <code class="language-plaintext highlighter-rouge">eval</code>.</p> <h3 id="type-equality-witnesses">Type equality witnesses</h3> <p>In Haskell, specifying the return type of a GADT constructor allowed us to express a type equality, which the type checker could then use automatically to unify the type variable with the concrete type.</p> <p>This relies on the type checker’s ability to make progress with locally learned information, which Rust doesn’t natively support. Our encoding will instead rely on an explicit <em>witness</em> of type equality, which we then use where Haskell would use the GADT constraint.</p> <p>In Rust, we can encode this concept as a zero-sized type: <sup id="fnref:is-invariance" role="doc-noteref"><a href="#fn:is-invariance" class="footnote" rel="footnote">1</a></sup></p> <figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">use</span> <span class="nn">core</span><span class="p">::</span><span class="nn">marker</span><span class="p">::</span><span class="n">PhantomData</span><span class="p">;</span> <span class="k">struct</span> <span class="n">Is</span><span class="o">&lt;</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="o">&gt;</span><span class="p">(</span><span class="n">PhantomData</span><span class="o">&lt;</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="p">)</span><span class="o">&gt;</span><span class="p">);</span> <span class="k">impl</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span> <span class="n">Is</span><span class="o">&lt;</span><span class="n">A</span><span class="p">,</span> <span class="n">A</span><span class="o">&gt;</span> <span class="p">{</span> <span class="k">fn</span> <span class="nf">refl</span><span class="p">()</span> <span class="k">-&gt;</span> <span class="k">Self</span> <span 
class="p">{</span> <span class="nf">Is</span><span class="p">(</span><span class="n">PhantomData</span><span class="p">)</span> <span class="p">}</span> <span class="p">}</span></code></pre></figure> <p><code class="language-plaintext highlighter-rouge">Is&lt;A, B&gt;</code> is our equality witness: a value of type <code class="language-plaintext highlighter-rouge">Is&lt;A, B&gt;</code> is only constructible when <code class="language-plaintext highlighter-rouge">A</code> and <code class="language-plaintext highlighter-rouge">B</code> are equal. <code class="language-plaintext highlighter-rouge">refl</code> is the only <em>safe</em> way to construct values of type <code class="language-plaintext highlighter-rouge">Is</code>.</p> <p>If you’ve seen this trick before, you can safely skim this part. The more interesting bit is how to do the same thing for <strong>trait bounds</strong>, not just type equalities.</p> <p>We can now write <code class="language-plaintext highlighter-rouge">Expr&lt;A&gt;</code> where each constructor stores a type equality witness:</p> <figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">enum</span> <span class="n">Expr</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span> <span class="p">{</span> <span class="nf">LitInt</span><span class="p">(</span><span class="n">Is</span><span class="o">&lt;</span><span class="nb">i64</span><span class="p">,</span> <span class="n">A</span><span class="o">&gt;</span><span class="p">,</span> <span class="nb">i64</span><span class="p">),</span> <span class="nf">Add</span><span class="p">(</span><span class="n">Is</span><span class="o">&lt;</span><span class="nb">i64</span><span class="p">,</span> <span class="n">A</span><span class="o">&gt;</span><span class="p">,</span> <span class="nb">Box</span><span class="o">&lt;</span><span class="n">Expr</span><span class="o">&lt;</span><span class="nb">i64</span><span 
class="o">&gt;&gt;</span><span class="p">,</span> <span class="nb">Box</span><span class="o">&lt;</span><span class="n">Expr</span><span class="o">&lt;</span><span class="nb">i64</span><span class="o">&gt;&gt;</span><span class="p">),</span> <span class="nf">LitBool</span><span class="p">(</span><span class="n">Is</span><span class="o">&lt;</span><span class="nb">bool</span><span class="p">,</span> <span class="n">A</span><span class="o">&gt;</span><span class="p">,</span> <span class="nb">bool</span><span class="p">),</span> <span class="nf">Or</span><span class="p">(</span><span class="n">Is</span><span class="o">&lt;</span><span class="nb">bool</span><span class="p">,</span> <span class="n">A</span><span class="o">&gt;</span><span class="p">,</span> <span class="nb">Box</span><span class="o">&lt;</span><span class="n">Expr</span><span class="o">&lt;</span><span class="nb">bool</span><span class="o">&gt;&gt;</span><span class="p">,</span> <span class="nb">Box</span><span class="o">&lt;</span><span class="n">Expr</span><span class="o">&lt;</span><span class="nb">bool</span><span class="o">&gt;&gt;</span><span class="p">),</span> <span class="p">}</span></code></pre></figure> <p><code class="language-plaintext highlighter-rouge">Expr::LitInt(Is::refl(), 42)</code> has type <code class="language-plaintext highlighter-rouge">Expr&lt;i64&gt;</code>, because the <code class="language-plaintext highlighter-rouge">refl()</code> constructor forces the <code class="language-plaintext highlighter-rouge">A</code> variable to unify with <code class="language-plaintext highlighter-rouge">i64</code>.</p> <p>So</p> <figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="nn">Expr</span><span class="p">::</span><span class="nf">Add</span><span class="p">(</span> <span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span> <span class="nn">Box</span><span class="p">::</span><span 
class="nf">new</span><span class="p">(</span><span class="nn">Expr</span><span class="p">::</span><span class="nf">LitInt</span><span class="p">(</span><span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span> <span class="mi">1</span><span class="p">)),</span> <span class="nn">Box</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">Expr</span><span class="p">::</span><span class="nf">LitInt</span><span class="p">(</span><span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span> <span class="mi">2</span><span class="p">)),</span> <span class="p">)</span></code></pre></figure> <p>typechecks, but</p> <figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="nn">Expr</span><span class="p">::</span><span class="nf">Add</span><span class="p">(</span> <span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span> <span class="nn">Box</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">Expr</span><span class="p">::</span><span class="nf">LitInt</span><span class="p">(</span><span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span> <span class="mi">1</span><span class="p">)),</span> <span class="c1">// wrong type, expected an Expr&lt;i64&gt; but got Expr&lt;bool&gt;</span> <span class="nn">Box</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">Expr</span><span class="p">::</span><span class="nf">LitBool</span><span class="p">(</span><span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span> <span class="k">false</span><span class="p">)),</span> <span class="p">)</span></code></pre></figure> <p>doesn’t.</p> <p>This machinery allows us to <em>restrict</em> the types of 
expressions that can be used in <code class="language-plaintext highlighter-rouge">Add</code>, but how do we <em>learn</em> type information?</p> <figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="n">eval</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="p">(</span><span class="n">expr</span><span class="p">:</span> <span class="n">Expr</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="n">A</span> <span class="p">{</span> <span class="k">match</span> <span class="n">expr</span> <span class="p">{</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">LitInt</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span> <span class="k">=&gt;</span> <span class="o">???</span> <span class="c1">// n is of type 'i64', we need to return 'A'</span> <span class="o">...</span> <span class="p">}</span> <span class="p">}</span></code></pre></figure> <p>In the Haskell version, the type equality bound by the GADT constructor is a native language feature that the typechecker knows about, so it freely converts between <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">Int</code> under a pattern match.</p> <p>In Rust, we created a custom encoding of type equality, and the typechecker doesn’t (and shouldn’t, in general) use it to unify types.</p> <p>This means that we need to write a function that actually performs the conversion:</p> <figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span><span class="o">&lt;</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="o">&gt;</span> <span class="n">Is</span><span class="o">&lt;</span><span 
class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="o">&gt;</span> <span class="p">{</span> <span class="k">fn</span> <span class="nf">convert</span><span class="p">(</span><span class="k">self</span><span class="p">,</span> <span class="n">a</span><span class="p">:</span> <span class="n">A</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="n">B</span> <span class="p">{</span> <span class="k">unsafe</span> <span class="p">{</span> <span class="nn">std</span><span class="p">::</span><span class="nn">intrinsics</span><span class="p">::</span><span class="nf">transmute_unchecked</span><span class="p">(</span><span class="n">a</span><span class="p">)</span> <span class="p">}</span> <span class="p">}</span> <span class="p">}</span></code></pre></figure> <p><code class="language-plaintext highlighter-rouge">transmute_unchecked</code> is a nightly-only compiler intrinsic (at the time of writing it needs <code class="language-plaintext highlighter-rouge">#![feature(core_intrinsics)]</code>). We reach for it because the ordinary <code class="language-plaintext highlighter-rouge">std::mem::transmute</code> is rejected here: the compiler cannot prove that the generic <code class="language-plaintext highlighter-rouge">A</code> and <code class="language-plaintext highlighter-rouge">B</code> have the same size. It is a very unsafe function in general, but in our case, we only invoke it when we have a type equality witness available (which can only be safely constructed via <code class="language-plaintext highlighter-rouge">refl</code>), so we <em>know</em> the types <code class="language-plaintext highlighter-rouge">A</code> and <code class="language-plaintext highlighter-rouge">B</code> are actually equal.</p> <p>With this, we can now use the equality witnesses in the constructors to rewrite the results into the desired <code class="language-plaintext highlighter-rouge">A</code>:</p> <figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="n">eval</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="p">(</span><span class="n">expr</span><span class="p">:</span> <span class="n">Expr</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="n">A</span> <span class="p">{</span> <span class="k">match</span> <span 
class="n">expr</span> <span class="p">{</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">LitInt</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span> <span class="k">=&gt;</span> <span class="n">p</span><span class="nf">.convert</span><span class="p">(</span><span class="n">n</span><span class="p">),</span> <span class="c1">// i64 -&gt; A</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">Add</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">left</span><span class="p">,</span> <span class="n">right</span><span class="p">)</span> <span class="k">=&gt;</span> <span class="n">p</span><span class="nf">.convert</span><span class="p">(</span><span class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">left</span><span class="p">)</span> <span class="o">+</span> <span class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">right</span><span class="p">)),</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">LitBool</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="k">=&gt;</span> <span class="n">p</span><span class="nf">.convert</span><span class="p">(</span><span class="n">b</span><span class="p">),</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">Or</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">left</span><span class="p">,</span> <span class="n">right</span><span class="p">)</span> <span class="k">=&gt;</span> <span class="n">p</span><span class="nf">.convert</span><span class="p">(</span><span class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">left</span><span class="p">)</span> <span class="p">||</span> <span 
class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">right</span><span class="p">)),</span> <span class="p">}</span> <span class="p">}</span></code></pre></figure> <h3 id="trait-constraint-witnesses">Trait constraint witnesses</h3> <p>The type equality witnesses from the previous section are relatively simple, because the only thing we need to record about our type parameter is that it’s equal to a known type in the local context.</p> <p>Trait implementations are more complicated, because we need to know how certain functionality is implemented for our type parameter.</p> <p>Haskell’s GADTs store references to type class dictionaries in their constructors – essentially dynamic dispatch. While Rust supports dynamic dispatch via <code class="language-plaintext highlighter-rouge">dyn Trait</code>, it’s severely limited (requiring “object safe” traits), so we’ll need a different approach.</p> <p>We’ll start with a similar witness idea, but this time, the witness will record the fact that a trait implementation exists for a type.</p> <p>We’ll define a witness for the existence of an <code class="language-plaintext highlighter-rouge">Add</code>-like capability, corresponding to the <code class="language-plaintext highlighter-rouge">Num</code> constraint in the Haskell version.</p> <figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">struct</span> <span class="n">CanAdd</span><span class="o">&lt;</span><span class="n">T</span><span class="p">:</span> <span class="o">?</span><span class="nb">Sized</span><span class="o">&gt;</span> <span class="p">{</span> <span class="n">_phantom</span><span class="p">:</span> <span class="n">PhantomData</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">,</span> <span class="p">}</span> <span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="p">:</span> <span class="nn">std</span><span 
class="p">::</span><span class="nn">ops</span><span class="p">::</span><span class="nb">Add</span><span class="o">&lt;</span><span class="n">Output</span> <span class="o">=</span> <span class="n">T</span><span class="o">&gt;&gt;</span> <span class="n">CanAdd</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span> <span class="p">{</span> <span class="k">fn</span> <span class="nf">new</span><span class="p">()</span> <span class="k">-&gt;</span> <span class="k">Self</span> <span class="p">{</span> <span class="n">CanAdd</span> <span class="p">{</span> <span class="n">_phantom</span><span class="p">:</span> <span class="n">PhantomData</span> <span class="p">}</span> <span class="p">}</span> <span class="p">}</span> <span class="k">fn</span> <span class="n">can_add</span><span class="o">&lt;</span><span class="n">T</span><span class="p">:</span> <span class="nn">std</span><span class="p">::</span><span class="nn">ops</span><span class="p">::</span><span class="nb">Add</span><span class="o">&lt;</span><span class="n">Output</span> <span class="o">=</span> <span class="n">T</span><span class="o">&gt;&gt;</span><span class="p">()</span> <span class="k">-&gt;</span> <span class="n">CanAdd</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span> <span class="p">{</span> <span class="nn">CanAdd</span><span class="p">::</span><span class="nf">new</span><span class="p">()</span> <span class="p">}</span></code></pre></figure> <p><code class="language-plaintext highlighter-rouge">CanAdd&lt;T&gt;</code> can only be constructed (via <code class="language-plaintext highlighter-rouge">can_add</code>) if <code class="language-plaintext highlighter-rouge">T</code> supports the <code class="language-plaintext highlighter-rouge">Add</code> operation with result type <code class="language-plaintext highlighter-rouge">T</code>. 
This mirrors the <code class="language-plaintext highlighter-rouge">Num a =&gt;</code> constraint on the Haskell side.</p> <p>We can now extend our expression type with a constructor that carries this constraint witness:</p> <figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">enum</span> <span class="n">Expr</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span> <span class="p">{</span> <span class="nf">LitInt</span><span class="p">(</span><span class="n">Is</span><span class="o">&lt;</span><span class="nb">i64</span><span class="p">,</span> <span class="n">A</span><span class="o">&gt;</span><span class="p">,</span> <span class="nb">i64</span><span class="p">),</span> <span class="nf">LitDouble</span><span class="p">(</span><span class="n">Is</span><span class="o">&lt;</span><span class="nb">f64</span><span class="p">,</span> <span class="n">A</span><span class="o">&gt;</span><span class="p">,</span> <span class="nb">f64</span><span class="p">),</span> <span class="nf">Add</span><span class="p">(</span><span class="n">CanAdd</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="p">,</span> <span class="nb">Box</span><span class="o">&lt;</span><span class="n">Expr</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;&gt;</span><span class="p">,</span> <span class="nb">Box</span><span class="o">&lt;</span><span class="n">Expr</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;&gt;</span><span class="p">),</span> <span class="nf">LitBool</span><span class="p">(</span><span class="n">Is</span><span class="o">&lt;</span><span class="nb">bool</span><span class="p">,</span> <span class="n">A</span><span class="o">&gt;</span><span class="p">,</span> <span class="nb">bool</span><span class="p">),</span> <span class="nf">Or</span><span class="p">(</span><span class="n">Is</span><span class="o">&lt;</span><span 
class="nb">bool</span><span class="p">,</span> <span class="n">A</span><span class="o">&gt;</span><span class="p">,</span> <span class="nb">Box</span><span class="o">&lt;</span><span class="n">Expr</span><span class="o">&lt;</span><span class="nb">bool</span><span class="o">&gt;&gt;</span><span class="p">,</span> <span class="nb">Box</span><span class="o">&lt;</span><span class="n">Expr</span><span class="o">&lt;</span><span class="nb">bool</span><span class="o">&gt;&gt;</span><span class="p">),</span> <span class="p">}</span></code></pre></figure> <p>This version of <code class="language-plaintext highlighter-rouge">Expr</code> is the Rust analogue of the final Haskell GADT. The <code class="language-plaintext highlighter-rouge">Add</code> constructor now carries a <code class="language-plaintext highlighter-rouge">CanAdd&lt;A&gt;</code> witness that proves <code class="language-plaintext highlighter-rouge">A</code> implements <code class="language-plaintext highlighter-rouge">Add&lt;Output = A&gt;</code>.</p> <p>So far this handles the <em>construction</em> side of the story, but not the <em>destruction</em> side. 
When we pattern match on an <code class="language-plaintext highlighter-rouge">Expr&lt;A&gt;</code>, Rust doesn’t know that <code class="language-plaintext highlighter-rouge">A</code> satisfies the constraint carried by <code class="language-plaintext highlighter-rouge">CanAdd&lt;A&gt;</code>.</p> <figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="n">eval</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="p">(</span><span class="n">expr</span><span class="p">:</span> <span class="n">Expr</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="n">A</span> <span class="p">{</span> <span class="k">match</span> <span class="n">expr</span> <span class="p">{</span> <span class="o">...</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">Add</span><span class="p">(</span><span class="n">w</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="k">=&gt;</span> <span class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">a</span><span class="p">)</span> <span class="o">+</span> <span class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">b</span><span class="p">),</span> <span class="c1">// cannot add `A` to `A`</span> <span class="o">...</span> <span class="p">}</span> <span class="p">}</span></code></pre></figure> <p>To recover that information, we’ll need to encode it in a trait that can selectively enable the operation based on the presence of a witness.</p> <h3 id="using-specialisation-to-recover-constraints">Using specialisation to recover constraints</h3> <p>We now want to <em>use</em> the information stored in <code class="language-plaintext highlighter-rouge">CanAdd&lt;A&gt;</code> when pattern 
matching on an expression. In Haskell, this happens automatically: matching on <code class="language-plaintext highlighter-rouge">Add</code> brings the <code class="language-plaintext highlighter-rouge">Num a</code> constraint into scope. Rust has no mechanism for this, so we’ll need an indirection.</p> <p>We’ll introduce a helper trait <code class="language-plaintext highlighter-rouge">MaybeAdd</code> that acts like a type class dictionary. It provides an operation <code class="language-plaintext highlighter-rouge">maybe_add</code>, which only exists when the type supports addition. We’ll use specialisation to make that conditional.</p> <figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="nd">#![feature(specialization)]</span> <span class="k">trait</span> <span class="n">MaybeAdd</span> <span class="p">{</span> <span class="k">fn</span> <span class="nf">maybe_add</span><span class="p">(</span><span class="k">self</span><span class="p">,</span> <span class="n">rhs</span><span class="p">:</span> <span class="k">Self</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="k">Self</span><span class="p">;</span> <span class="p">}</span></code></pre></figure> <p>We define a <em>default</em> implementation for all types:</p> <figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span> <span class="n">MaybeAdd</span> <span class="k">for</span> <span class="n">T</span> <span class="p">{</span> <span class="n">default</span> <span class="k">fn</span> <span class="nf">maybe_add</span><span class="p">(</span><span class="k">self</span><span class="p">,</span> <span class="n">_rhs</span><span class="p">:</span> <span class="k">Self</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="k">Self</span> <span class="p">{</span> <span class="nd">unreachable!</span><span 
class="p">(</span><span class="s">"no Add implementation for this type"</span><span class="p">)</span> <span class="p">}</span> <span class="p">}</span></code></pre></figure> <p>and a <em>specialised</em> implementation for types that implement <code class="language-plaintext highlighter-rouge">Add</code>:</p> <figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="p">:</span> <span class="nn">std</span><span class="p">::</span><span class="nn">ops</span><span class="p">::</span><span class="nb">Add</span><span class="o">&lt;</span><span class="n">Output</span> <span class="o">=</span> <span class="n">T</span><span class="o">&gt;&gt;</span> <span class="n">MaybeAdd</span> <span class="k">for</span> <span class="n">T</span> <span class="p">{</span> <span class="k">fn</span> <span class="nf">maybe_add</span><span class="p">(</span><span class="k">self</span><span class="p">,</span> <span class="n">rhs</span><span class="p">:</span> <span class="k">Self</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="k">Self</span> <span class="p">{</span> <span class="k">self</span> <span class="o">+</span> <span class="n">rhs</span> <span class="p">}</span> <span class="p">}</span></code></pre></figure> <p>With this machinery, we can now use the constraint witness inside <code class="language-plaintext highlighter-rouge">eval</code>:</p> <figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="n">eval</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="p">(</span><span class="n">expr</span><span class="p">:</span> <span class="n">Expr</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="n">A</span> <span class="p">{</span> <span class="k">match</span> 
<span class="n">expr</span> <span class="p">{</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">LitInt</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span> <span class="k">=&gt;</span> <span class="n">p</span><span class="nf">.convert</span><span class="p">(</span><span class="n">n</span><span class="p">),</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">LitDouble</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">d</span><span class="p">)</span> <span class="k">=&gt;</span> <span class="n">p</span><span class="nf">.convert</span><span class="p">(</span><span class="n">d</span><span class="p">),</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">LitBool</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="k">=&gt;</span> <span class="n">p</span><span class="nf">.convert</span><span class="p">(</span><span class="n">b</span><span class="p">),</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">Add</span><span class="p">(</span><span class="n">w</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="k">=&gt;</span> <span class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">a</span><span class="p">)</span><span class="nf">.maybe_add</span><span class="p">(</span><span class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">b</span><span class="p">)),</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">Or</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span 
class="k">=&gt;</span> <span class="n">p</span><span class="nf">.convert</span><span class="p">(</span><span class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">a</span><span class="p">)</span> <span class="p">||</span> <span class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">b</span><span class="p">)),</span> <span class="p">}</span> <span class="p">}</span></code></pre></figure> <p>Rather than directly using <code class="language-plaintext highlighter-rouge">+</code> (which we can’t, since <code class="language-plaintext highlighter-rouge">A</code> isn’t known to implement <code class="language-plaintext highlighter-rouge">std::ops::Add</code> in this context), we delegate to <code class="language-plaintext highlighter-rouge">maybe_add</code>, which uses specialisation to select the correct implementation at monomorphisation time.</p> <figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="nd">#[test]</span> <span class="k">fn</span> <span class="nf">eval_test</span><span class="p">()</span> <span class="p">{</span> <span class="k">let</span> <span class="n">expr_int</span> <span class="o">=</span> <span class="p">{</span> <span class="k">let</span> <span class="n">a</span> <span class="o">=</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">LitInt</span><span class="p">(</span><span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span> <span class="mi">3</span><span class="p">);</span> <span class="k">let</span> <span class="n">b</span> <span class="o">=</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">LitInt</span><span class="p">(</span><span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span> <span class="mi">4</span><span class="p">);</span> <span class="nn">Expr</span><span class="p">::</span><span 
class="nf">Add</span><span class="p">(</span><span class="nf">can_add</span><span class="p">(),</span> <span class="nn">Box</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="n">a</span><span class="p">),</span> <span class="nn">Box</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="n">b</span><span class="p">))</span> <span class="p">};</span> <span class="nd">assert_eq!</span><span class="p">(</span><span class="nf">eval</span><span class="p">(</span><span class="n">expr_int</span><span class="p">),</span> <span class="mi">7</span><span class="p">);</span> <span class="k">let</span> <span class="n">expr_double</span> <span class="o">=</span> <span class="p">{</span> <span class="k">let</span> <span class="n">a</span> <span class="o">=</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">LitDouble</span><span class="p">(</span><span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span> <span class="mf">2.5</span><span class="p">);</span> <span class="k">let</span> <span class="n">b</span> <span class="o">=</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">LitDouble</span><span class="p">(</span><span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span> <span class="mf">4.0</span><span class="p">);</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">Add</span><span class="p">(</span><span class="nf">can_add</span><span class="p">(),</span> <span class="nn">Box</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="n">a</span><span class="p">),</span> <span class="nn">Box</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="n">b</span><span class="p">))</span> <span class="p">};</span> <span class="nd">assert_eq!</span><span 
class="p">(</span><span class="nf">eval</span><span class="p">(</span><span class="n">expr_double</span><span class="p">),</span> <span class="mf">6.5</span><span class="p">);</span> <span class="p">}</span></code></pre></figure> <h3 id="why-this-works">Why this works</h3> <p>If you’re coming from Haskell, it might be surprising that <code class="language-plaintext highlighter-rouge">eval</code> works at all. In Haskell, type class resolution is coupled with evidence generation: when the compiler decides that a type satisfies a constraint, it also produces a reference to the corresponding dictionary. If Rust worked the same, then that algorithm would pick the catch-all default implementation of <code class="language-plaintext highlighter-rouge">MaybeAdd</code> under the <code class="language-plaintext highlighter-rouge">Expr::Add</code> arm of <code class="language-plaintext highlighter-rouge">eval</code>, because at that point, no information is known about the type (and our <code class="language-plaintext highlighter-rouge">CanAdd</code> witness is invisible to the typechecker).</p> <p>However, Rust’s specialisation works differently. During type checking, the compiler only checks that <em>some</em> implementation of <code class="language-plaintext highlighter-rouge">MaybeAdd</code> exists – it doesn’t commit to which one. This step is <strong>proof-irrelevant</strong>: the fact that a trait is implemented matters, but not which implementation it resolves to.</p> <p>The actual selection happens later, during <strong>monomorphisation</strong>, once all type parameters are concrete. At that point, the specialiser sees that <code class="language-plaintext highlighter-rouge">A = i64</code> (or <code class="language-plaintext highlighter-rouge">A = f64</code>, etc.) and picks the more specific implementation that performs real addition. 
The default <code class="language-plaintext highlighter-rouge">unreachable!()</code> version is never instantiated, precisely because our witness mechanism disallows constructing expressions that try to add values without <code class="language-plaintext highlighter-rouge">Add</code> implementations.</p> <p>This is the crucial distinction between Haskell and Rust: in Haskell, dictionary resolution is part of type checking; in Rust, it’s deferred until code generation. The specialiser makes the final decision once it knows the concrete types, and because our witness types restrict what can actually be constructed, the correct implementation is always chosen.</p> <p>In effect, Rust’s specialisation system lets us recover local constraint learning at compile time, without runtime dictionaries or dynamic dispatch. Everything is resolved statically and erased before code generation. A truly zero-cost abstraction!<sup id="fnref:zero-cost" role="doc-noteref"><a href="#fn:zero-cost" class="footnote" rel="footnote">2</a></sup></p> <h2 id="limitations">Limitations</h2> <p>This technique has a few obvious caveats.</p> <p>First, it relies on <strong>specialisation</strong>, which is still unstable and only available on nightly Rust. The feature also has some unsound edge cases that the compiler can’t currently detect, though this particular usage is benign because it doesn’t overlap implementations in unsafe ways.</p> <p>Second, the design doesn’t generalise to <strong>existential types</strong> — Rust simply has no equivalent. We can simulate type refinement (as with <code class="language-plaintext highlighter-rouge">Expr&lt;A&gt;</code>), but not “forgetting” type information safely.</p> <p>Finally, while the runtime cost is zero, the cognitive cost certainly isn’t. 
The type signatures are verbose, the ergonomics are questionable, and the amount of ceremony required to recover what Haskell gives you by default is non-trivial.</p> <h2 id="conclusion">Conclusion</h2> <p>Until now we’ve been preoccupied with whether or not we could. Now it’s time to stop and think if we should.</p> <div class="footnotes" role="doc-endnotes"> <ol> <li id="fn:is-invariance" role="doc-endnote"> <p>Using <code class="language-plaintext highlighter-rouge">PhantomData&lt;fn(A) -&gt; B&gt;</code> would make <code class="language-plaintext highlighter-rouge">A</code> and <code class="language-plaintext highlighter-rouge">B</code> invariant, which is slightly more robust if lifetimes are involved. For this post, <code class="language-plaintext highlighter-rouge">PhantomData&lt;(A, B)&gt;</code> is simpler and works fine. <a href="#fnref:is-invariance" class="reversefootnote" role="doc-backlink">&#8617;</a></p> </li> <li id="fn:zero-cost" role="doc-endnote"> <p>Luckily the cost analysis of abstractions doesn’t include developer ergonomics. 
<a href="#fnref:zero-cost" class="reversefootnote" role="doc-backlink">&#8617;</a></p> </li> </ol> </div> <p><a href="https://kcsongor.github.io/gadts-in-rust/">Trait-Constrained Enums in Rust</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on June 10, 2025.</p> <![CDATA[Announcing generic-optics (& generic-lens 2.0.0.0)]]> https://kcsongor.github.io/generic-lens-2 2020-02-11T00:00:00-00:00 2020-02-11T00:00:00+00:00 Csongor Kiss https://kcsongor.github.io <section id="table-of-contents" class="toc"> <header> <h3><i class="fa fa-book"></i> Overview</h3> </header> <div id="drawer"> <ul id="markdown-toc"> <li><a href="#background" id="markdown-toc-background">Background</a></li> <li><a href="#examples" id="markdown-toc-examples">Examples</a></li> <li><a href="#differences" id="markdown-toc-differences">Differences</a> <ul> <li><a href="#labels" id="markdown-toc-labels">Labels</a></li> </ul> </li> <li><a href="#changes-in-generic-lens" id="markdown-toc-changes-in-generic-lens">Changes in generic-lens</a></li> <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li> </ul> </div> </section> <!-- /#table-of-contents --> <p>I’m happy to announce a new library, <a href="https://hackage.haskell.org/package/generic-optics">generic-optics</a>, accompanied by version 2.0.0.0 of <a href="https://hackage.haskell.org/package/generic-lens">generic-lens</a>.</p> <h2 id="background">Background</h2> <p>A few months ago, the folks at Well-Typed <a href="https://www.well-typed.com/blog/2019/09/announcing-the-optics-library/">announced the <code class="language-plaintext highlighter-rouge">optics</code> library</a>, which aims to improve on the user experience compared to the <code class="language-plaintext highlighter-rouge">lens</code> library. 
Oleg Grenrus has written an excellent <a href="http://oleg.fi/gists/posts/2020-01-25-case-study-migration-from-lens-to-optics.html">migration guide</a> from <code class="language-plaintext highlighter-rouge">lens</code> to <code class="language-plaintext highlighter-rouge">optics</code>, so please have a look there for some more background.</p> <p><code class="language-plaintext highlighter-rouge">generic-optics</code> is essentially a port of <code class="language-plaintext highlighter-rouge">generic-lens</code> that is compatible with <code class="language-plaintext highlighter-rouge">optics</code>, and is designed to be a drop-in replacement for <code class="language-plaintext highlighter-rouge">generic-lens</code>. This means that if you’re already using <code class="language-plaintext highlighter-rouge">generic-lens</code> with <code class="language-plaintext highlighter-rouge">lens</code> and decide to migrate to <code class="language-plaintext highlighter-rouge">optics</code>, you should be able to replace the <code class="language-plaintext highlighter-rouge">generic-lens</code> dependency with <code class="language-plaintext highlighter-rouge">generic-optics</code> and expect things to just work.</p> <h2 id="examples">Examples</h2> <p>To explain why I’m so excited about <code class="language-plaintext highlighter-rouge">optics</code>, I’m going to compare a real-life workflow between <code class="language-plaintext highlighter-rouge">generic-lens</code> and <code class="language-plaintext highlighter-rouge">generic-optics</code>.</p> <p>First, language pragmas and imports:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="cp">{-# LANGUAGE DataKinds #-}</span> <span class="cp">{-# LANGUAGE TypeApplications #-}</span> <span class="cp">{-# LANGUAGE DeriveGeneric #-}</span> <span class="kr">import</span> <span class="nn">Data.Generics.Product</span> <span class="kr">import</span> <span 
class="nn">GHC.Generics</span></code></pre></figure> <p>Note that the module <code class="language-plaintext highlighter-rouge">Data.Generics.Product</code> is shared between <code class="language-plaintext highlighter-rouge">generic-lens</code> and <code class="language-plaintext highlighter-rouge">generic-optics</code>.</p> <p>When using <code class="language-plaintext highlighter-rouge">generic-lens</code> with the <code class="language-plaintext highlighter-rouge">lens</code> library, we would import</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">import</span> <span class="nn">Control.Lens</span></code></pre></figure> <p>When using <code class="language-plaintext highlighter-rouge">generic-optics</code> with <code class="language-plaintext highlighter-rouge">optics</code>, the import becomes</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">import</span> <span class="nn">Optics.Core</span></code></pre></figure> <p>Now we define a simple record:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">MyRecord</span> <span class="o">=</span> <span class="kt">MyRecord</span> <span class="p">{</span> <span class="n">a</span> <span class="o">::</span> <span class="kt">Int</span><span class="p">,</span> <span class="n">b</span> <span class="o">::</span> <span class="kt">Int</span><span class="p">,</span> <span class="n">c</span> <span class="o">::</span> <span class="p">(</span><span class="kt">Bool</span><span class="p">,</span> <span class="kt">Int</span><span class="p">)</span> <span class="p">}</span> <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Generic</span><span class="p">,</span> <span class="kt">Show</span><span class="p">)</span> <span class="n">myRecord1</span> <span class="o">::</span> <span class="kt">MyRecord</span> <span 
class="n">myRecord1</span> <span class="o">=</span> <span class="kt">MyRecord</span> <span class="mi">0</span> <span class="mi">1</span> <span class="p">(</span><span class="kt">False</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span></code></pre></figure> <p>With either library, we can view the <code class="language-plaintext highlighter-rouge">a</code> field using the <code class="language-plaintext highlighter-rouge">field</code> lens:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">lens|optics&gt; myRecord1 ^. field @"a" 0</code></pre></figure> <p>If we ask what the type of <code class="language-plaintext highlighter-rouge">field @"a"</code> is in GHCi, we already see the advantage of <code class="language-plaintext highlighter-rouge">optics</code>’s opaque representation.</p> <p>Compare</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">lens&gt; :t field @"a" field @"a" :: (HasField "a" s t a b, Functor f) =&gt; (a -&gt; f b) -&gt; s -&gt; f t</code></pre></figure> <p>with</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">optics&gt; :t field @"a" field @"a" :: HasField "a" s t a b =&gt; Lens s t a b</code></pre></figure> <p>Now let us use the <code class="language-plaintext highlighter-rouge">typed</code> lens, which performs a type-directed lookup in a product type, as long as there is a unique field with that type:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">lens|optics&gt; myRecord1 ^. 
typed @(Bool, Int) (False,2)</code></pre></figure> <p>When the type of the field is not unique (for example, if we try to retrieve a field of type <code class="language-plaintext highlighter-rouge">Int</code>), both <code class="language-plaintext highlighter-rouge">generic-optics</code> and <code class="language-plaintext highlighter-rouge">generic-lens</code> provide a helpful type error:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">lens|optics&gt; myRecord1 ^. typed @Int &lt;interactive&gt; error: • The type MyRecord contains multiple values of type Int. The choice of value is thus ambiguous. The offending constructors are: • MyRecord</code></pre></figure> <p>For situations like this, both libraries provide a traversal called <code class="language-plaintext highlighter-rouge">types</code> that focuses on all values of the given type.</p> <p>Let’s see what happens if we replace <code class="language-plaintext highlighter-rouge">typed</code> with <code class="language-plaintext highlighter-rouge">types</code> in the above example when using <code class="language-plaintext highlighter-rouge">lens</code>:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">lens&gt; myRecord1 ^. types @Int &lt;interactive&gt;:43:14-23: error: • No instance for (Monoid Int) arising from a use of ‘types’</code></pre></figure> <p>This error is rather puzzling. Unless we know what’s going on under the hood, it’s not obvious where the <code class="language-plaintext highlighter-rouge">Monoid</code> constraint is coming from.</p> <p>Compare this with <code class="language-plaintext highlighter-rouge">generic-optics</code>:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">optics&gt; myRecord1 ^. types @Int &lt;interactive&gt;:32:1-23: error: • A_Traversal cannot be used as A_Getter</code></pre></figure> <p>Right! 
<code class="language-plaintext highlighter-rouge">types @Int</code> is a traversal, but <code class="language-plaintext highlighter-rouge">^.</code> takes a getter! Arguably this is a more helpful message. Consulting the documentation of <code class="language-plaintext highlighter-rouge">optics</code>, we find the combinator we’re looking for: <code class="language-plaintext highlighter-rouge">^..</code>, which returns all the values focused on by a traversal:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">lens|optics&gt; myRecord1 ^.. types @Int [0,1,2]</code></pre></figure> <p>This now of course works in both libraries.</p> <p>To summarise, using the two libraries should be nearly identical as long as everything goes well and we’re not hitting type errors. Where <code class="language-plaintext highlighter-rouge">generic-optics</code> (but really, <code class="language-plaintext highlighter-rouge">optics</code> itself) shines is when things do not go all that well, in which case the resulting error messages are a lot more comprehensible.</p> <h2 id="differences">Differences</h2> <p>The above was just to give a little taste of using <code class="language-plaintext highlighter-rouge">generic-optics</code>. The interface of <code class="language-plaintext highlighter-rouge">generic-optics</code> is intended to be largely identical to that of <code class="language-plaintext highlighter-rouge">generic-lens</code>.</p> <h3 id="labels">Labels</h3> <p>At the time of writing, the main difference is the support for overloaded labels in <code class="language-plaintext highlighter-rouge">generic-lens</code>, which allows writing</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">lens&gt; import Data.Generics.Labels () lens&gt; myRecord1 ^. 
#a 0</code></pre></figure> <p>I intend to add support for this for <code class="language-plaintext highlighter-rouge">generic-optics</code> too, but it isn’t implemented yet.</p> <h2 id="changes-in-generic-lens">Changes in generic-lens</h2> <p>To support this new interface, <code class="language-plaintext highlighter-rouge">generic-lens</code> itself has undergone a major reorganisation. I thought this was a good opportunity to clean some things up and change the interface in places, which ultimately resulted in a new major version bump.</p> <p>Most notably, GHC versions below 8.4 are no longer supported. <code class="language-plaintext highlighter-rouge">generic-lens</code> (and <code class="language-plaintext highlighter-rouge">generic-optics</code> too) promises good performance by making sure that the generic overhead is eliminated at compile time. Doing so requires really careful coding practices, and GHC’s optimiser changes between every version, which meant that certain tricks that worked for 8.2 didn’t work for 8.6 and vice versa. The result was horrible CPP macros to enable certain hacks on certain versions of GHC. In the end, I decided it wasn’t worth the effort to maintain these hacks for older versions of the compiler.</p> <p>I intend to write a blog post in the near future describing some of these hacks, as they are quite interesting and potentially educational.</p> <p>For a more comprehensive list of changes, refer to the <a href="https://github.com/kcsongor/generic-lens/blob/master/generic-lens/ChangeLog.md">changelog</a>.</p> <h2 id="conclusion">Conclusion</h2> <p>Thanks for reading this blog post, and I hope you’re as excited about <code class="language-plaintext highlighter-rouge">generic-optics</code> as I am! Since this release required a major refactoring and moving things around, it is possible that some documentation is out of date, or certain functions are not exported from where you would expect. 
If you find anything that looks off, please either open a pull request or let me know on the issue tracker!</p> <p>Finally, if you find <code class="language-plaintext highlighter-rouge">generic-lens</code> or <code class="language-plaintext highlighter-rouge">generic-optics</code> useful, consider <a href="https://github.com/sponsors/kcsongor">buying me a coffee</a>!</p> <p><a href="https://kcsongor.github.io/generic-lens-2/">Announcing generic-optics (& generic-lens 2.0.0.0)</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on February 11, 2020.</p> <![CDATA[Opaque constraint synonyms]]> https://kcsongor.github.io/opaque-constraint-synonyms 2019-09-25T00:00:00+00:00 2019-09-25T00:00:00+00:00 Csongor Kiss https://kcsongor.github.io <section id="table-of-contents" class="toc"> <header> <h3><i class="fa fa-book"></i> Overview</h3> </header> <div id="drawer"> <ul id="markdown-toc"> <li><a href="#constraints-newtypes-kind-of" id="markdown-toc-constraints-newtypes-kind-of">Constraints newtypes (kind of)</a></li> <li><a href="#a-real-world-example" id="markdown-toc-a-real-world-example">A real world example</a></li> <li><a href="#acknowledgements" id="markdown-toc-acknowledgements">Acknowledgements</a></li> </ul> </div> </section> <!-- /#table-of-contents --> <p>The list of type class constraints in a function signature can sometimes get out of hand. 
In these situations, we can introduce a type synonym (thanks to <code class="language-plaintext highlighter-rouge">ConstraintKinds</code>) to avoid repetition.</p> <p>Say we want to group together the <code class="language-plaintext highlighter-rouge">Show</code> and <code class="language-plaintext highlighter-rouge">Read</code> constraints:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="kt">Serialise</span> <span class="n">a</span> <span class="o">=</span> <span class="p">(</span><span class="kt">Show</span> <span class="n">a</span><span class="p">,</span> <span class="kt">Read</span> <span class="n">a</span><span class="p">)</span></code></pre></figure> <p>Now <code class="language-plaintext highlighter-rouge">Serialise a</code> can be used anywhere where we require both constraints:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">roundtrip</span> <span class="o">::</span> <span class="kt">Serialise</span> <span class="n">a</span> <span class="o">=&gt;</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="n">a</span> <span class="n">roundtrip</span> <span class="o">=</span> <span class="n">read</span> <span class="o">.</span> <span class="n">show</span></code></pre></figure> <p>This is great, because it means we no longer have to spell out <code class="language-plaintext highlighter-rouge">(Show a, Read a)</code> whenever we need both, and we also improved readability, because <code class="language-plaintext highlighter-rouge">Serialise</code> conveys some additional domain-specific meaning.</p> <p>There’s a problem with this, however. 
If we ask GHCi about the type of <code class="language-plaintext highlighter-rouge">roundtrip</code>:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">&gt;&gt;&gt; :t roundtrip roundtrip :: (Show a, Read a) =&gt; a -&gt; a</code></pre></figure> <p>it will eagerly expand the type synonym, removing all traces of <code class="language-plaintext highlighter-rouge">Serialise</code>. Of course this is a well known problem of type synonyms, so we generally avoid them in favour of <code class="language-plaintext highlighter-rouge">newtype</code>s.</p> <p>But there’s no analogous construction for constraints. Or is there?</p> <h2 id="constraints-newtypes-kind-of">Constraints newtypes (kind of)</h2> <p>To begin, we’re going to drop the type synonym in favour of the “constraint synonym” technique, which is essentially the following:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="p">(</span><span class="kt">Show</span> <span class="n">a</span><span class="p">,</span> <span class="kt">Read</span> <span class="n">a</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="kt">Serialise</span> <span class="n">a</span> <span class="kr">instance</span> <span class="p">(</span><span class="kt">Show</span> <span class="n">a</span><span class="p">,</span> <span class="kt">Read</span> <span class="n">a</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="kt">Serialise</span> <span class="n">a</span></code></pre></figure> <p>In other words, we introduce a new type class with the required superclass constraints, and a single catchall instance.</p> <p>So far, the status quo hasn’t improved though. 
GHC is quite renitent:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">&gt;&gt;&gt; :t roundtrip roundtrip :: (Show a, Read a) =&gt; a -&gt; a</code></pre></figure> <p>This happens because the compiler sees that there’s only one matching instance, so it’s safe to pick that one, and it will do so. This point is the important one: that there’s only one instance. So, if we could somehow trick GHC into thinking that there are other options, then maybe it wouldn’t be so eager to expand our constraints.</p> <p>So, we create an empty data type, only to be used internally:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">Opaque</span></code></pre></figure> <p>Next, we satisfy the superclass constraints</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="kt">Read</span> <span class="kt">Opaque</span> <span class="kr">where</span> <span class="n">readsPrec</span> <span class="o">=</span> <span class="n">undefined</span> <span class="kr">instance</span> <span class="kt">Show</span> <span class="kt">Opaque</span> <span class="kr">where</span> <span class="n">showsPrec</span> <span class="o">=</span> <span class="n">undefined</span></code></pre></figure> <p>Note that these two instances only exist so that the constraint is satisfied, but since the type is internal, the actual functions are never going to be invoked.</p> <p>Finally, the key ingredient: an overlapping instance for <code class="language-plaintext highlighter-rouge">Serialise Opaque</code>.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="cp">{-# OVERLAPPING #-}</span> <span class="kt">Serialise</span> <span class="kt">Opaque</span></code></pre></figure> <p>Now, every time GHC sees a <code class="language-plaintext 
Serialise a</code>">
highlighter-rouge">Serialise a</code> constraint, it will no longer be able to pick the catchall instance, in case <code class="language-plaintext highlighter-rouge">a</code> gets instantiated to <code class="language-plaintext highlighter-rouge">Opaque</code> later. Of course, this won’t happen, because we don’t export <code class="language-plaintext highlighter-rouge">Opaque</code>, but it’s good enough for GHC.</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">&gt;&gt;&gt; :t roundtrip roundtrip :: Serialise a =&gt; a -&gt; a</code></pre></figure> <h2 id="a-real-world-example">A real world example</h2> <p>You might say that the <code class="language-plaintext highlighter-rouge">(Show a, Read a)</code> example is perhaps overly simplistic. I came up with this technique to solve a very real problem in the <a href="http://hackage.haskell.org/package/generic-lens-1.2.0.0">generic-lens</a> library. This problem shows up in many places in the library, but to pick one, consider the <code class="language-plaintext highlighter-rouge">AsType</code> class:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">AsType</span> <span class="n">a</span> <span class="n">s</span> <span class="kr">where</span> <span class="n">_Typed</span> <span class="o">::</span> <span class="kt">Prism'</span> <span class="n">s</span> <span class="n">a</span></code></pre></figure> <p>The exact meaning of the class is irrelevant here (but see the <a href="http://hackage.haskell.org/package/generic-lens-1.2.0.0/docs/Data-Generics-Sum-Typed.html#v:_Typed">documentation</a> if you’re interested). What matters is that there’s a catchall instance defined for all types (using <code class="language-plaintext highlighter-rouge">GHC.Generics</code>), which in turn requires a large number of other constraints and predicates to hold. 
Since this catchall instance is the only one defined by the library, asking for the type of <code class="language-plaintext highlighter-rouge">_Typed</code> in GHCi eagerly expands the constraints to those of the instance.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">&gt;&gt;&gt;</span> <span class="o">:</span><span class="n">t</span> <span class="n">_Typed</span> <span class="n">_Typed</span> <span class="o">::</span> <span class="p">(</span><span class="kt">ErrorUnlessOne</span> <span class="n">a</span> <span class="n">s</span> <span class="p">(</span><span class="kt">CollectPartialType</span> <span class="p">(</span><span class="kt">TupleToList</span> <span class="n">a</span><span class="p">)</span> <span class="p">(</span><span class="kt">Rep</span> <span class="n">s</span><span class="p">)),</span> <span class="kt">Defined</span> <span class="p">(</span><span class="kt">Rep</span> <span class="n">s</span><span class="p">)</span> <span class="p">(</span><span class="kt">TypeError</span> <span class="o">...</span><span class="p">)</span> <span class="p">(</span><span class="nb">()</span> <span class="o">::</span> <span class="kt">Constraint</span><span class="p">),</span> <span class="kt">Generic</span> <span class="n">s</span><span class="p">,</span> <span class="kt">ListTuple</span> <span class="n">a</span> <span class="p">(</span><span class="kt">TupleToList</span> <span class="n">a</span><span class="p">),</span> <span class="kt">GAsType</span> <span class="p">(</span><span class="kt">Rep</span> <span class="n">s</span><span class="p">)</span> <span class="p">(</span><span class="kt">TupleToList</span> <span class="n">a</span><span class="p">),</span> <span class="kt">Data</span><span class="o">.</span><span class="kt">Profunctor</span><span class="o">.</span><span class="kt">Choice</span><span class="o">.</span><span class="kt">Choice</span> <span class="n">p</span><span class="p">,</span> <span 
class="kt">Applicative</span> <span class="n">f</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="n">p</span> <span class="n">a</span> <span class="p">(</span><span class="n">f</span> <span class="n">a</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">p</span> <span class="n">s</span> <span class="p">(</span><span class="n">f</span> <span class="n">s</span><span class="p">)</span></code></pre></figure> <p>Not great. All the internal implementation details leak out. By employing the opaque constraint trick above, we can define overlapping instances for the <code class="language-plaintext highlighter-rouge">AsType</code> class, which results in the following type signature:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">&gt;&gt;&gt;</span> <span class="o">:</span><span class="n">t</span> <span class="n">_Typed</span> <span class="n">_Typed</span> <span class="o">::</span> <span class="kt">AsType</span> <span class="n">a</span> <span class="n">s</span> <span class="o">=&gt;</span> <span class="kt">Prism'</span> <span class="n">s</span> <span class="n">a</span></code></pre></figure> <p>which is much nicer!</p> <h2 id="acknowledgements">Acknowledgements</h2> <p>I wrote most of this post a while time ago, but never published it. Thanks to <a href="https://twitter.com/rob_rix">Rob Rix</a> for bringing up this topic and thus reminding me to publish it. 
It’s good to see library authors care about the user experience of their library down to this level of detail, and I hope this technique will be useful for many others!</p> <p><a href="https://kcsongor.github.io/opaque-constraint-synonyms/">Opaque constraint synonyms</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on September 25, 2019.</p> <![CDATA[Tripping up type inference]]> https://kcsongor.github.io/ambiguous-tags 2019-09-18T00:00:00-00:00 2019-09-18T00:00:00+00:00 Csongor Kiss https://kcsongor.github.io <p>One of the main selling points of Haskell is that despite (or because of) its strong static type system, it frees us from the burden of having to spell out tedious type signatures everywhere.</p> <p>Type inference is a blessing, but sometimes it can also be a curse. Inference that is too good can hinder the readability of code, because the compiler knows what the type of an identifier is even when we don’t. It’s not just readability though: correctness can be imperilled too.</p> <p>As an example, consider the <code class="language-plaintext highlighter-rouge">Tagged</code> type, which allows us to attach type information to some other type.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">newtype</span> <span class="kt">Tagged</span> <span class="p">(</span><span class="n">s</span> <span class="o">::</span> <span class="n">k</span><span class="p">)</span> <span class="n">a</span> <span class="o">=</span> <span class="kt">MkTagged</span> <span class="n">a</span></code></pre></figure> <p>Then we might want to define a <code class="language-plaintext highlighter-rouge">Person</code> type consisting of a first name and a last name, both of type <code class="language-plaintext highlighter-rouge">String</code>, tagged by (type-level) symbols accordingly:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span 
class="kt">Person</span> <span class="o">=</span> <span class="kt">MkPerson</span> <span class="p">(</span><span class="kt">Tagged</span> <span class="s">"firstName"</span> <span class="kt">String</span><span class="p">)</span> <span class="p">(</span><span class="kt">Tagged</span> <span class="s">"lastName"</span> <span class="kt">String</span><span class="p">)</span></code></pre></figure> <p>We can then construct values of this type:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">joseph</span> <span class="o">::</span> <span class="kt">Person</span> <span class="n">joseph</span> <span class="o">=</span> <span class="kt">MkPerson</span> <span class="p">(</span><span class="kt">MkTagged</span> <span class="s">"Joseph"</span><span class="p">)</span> <span class="p">(</span><span class="kt">MkTagged</span> <span class="s">"Knecht"</span><span class="p">)</span></code></pre></figure> <p>And here is the problem. Since both fields are constructed just with the <code class="language-plaintext highlighter-rouge">MkTagged</code> constructor, nothing is stopping us from mixing up the field names if we misremember the ordering:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">joseph'</span> <span class="o">::</span> <span class="kt">Person</span> <span class="n">joseph'</span> <span class="o">=</span> <span class="kt">MkPerson</span> <span class="p">(</span><span class="kt">MkTagged</span> <span class="s">"Knecht"</span><span class="p">)</span> <span class="p">(</span><span class="kt">MkTagged</span> <span class="s">"Joseph"</span><span class="p">)</span></code></pre></figure> <p>We would wish to get a type error, but GHC happily infers that <code class="language-plaintext highlighter-rouge">MkTagged "Joseph"</code> indeed has type <code class="language-plaintext highlighter-rouge">Tagged t String</code> for any <code class="language-plaintext 
highlighter-rouge">t</code>, thus it fits perfectly into the <code class="language-plaintext highlighter-rouge">"lastName"</code> field.</p> <p>We can fix this example by providing explicit type applications to the <code class="language-plaintext highlighter-rouge">MkTagged</code> constructor. Then, mixing up the order <em>is</em> a type error.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">joseph'</span> <span class="o">::</span> <span class="kt">Person</span> <span class="n">joseph'</span> <span class="o">=</span> <span class="kt">MkPerson</span> <span class="p">(</span><span class="kt">MkTagged</span> <span class="o">@</span><span class="s">"lastName"</span> <span class="s">"Knecht"</span><span class="p">)</span> <span class="p">(</span><span class="kt">MkTagged</span> <span class="o">@</span><span class="s">"firstName"</span> <span class="s">"Joseph"</span><span class="p">)</span></code></pre></figure> <p>results in:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text"> • Couldn't match type ‘"lastName"’ with ‘"firstName"’</code></pre></figure> <p>This works, but these annotations are entirely optional, and if we forget about them, we’re in trouble once again.</p> <p>To summarise, the problem is that GHC can infer the type of <code class="language-plaintext highlighter-rouge">MkTagged "Joseph"</code>, and due to the generality of the result, it can also unify it with any arbitrary tag.</p> <p>So the question is this: how do we stop GHC from inferring the type of expressions like <code class="language-plaintext highlighter-rouge">MkTagged "Joseph"</code>? 
In other words, how do we enforce that the tag must be provided by explicit type annotation?</p> <h2 id="an-ambiguous-smart-constructor">An ambiguous smart constructor</h2> <p>We’re going to write a smart constructor that can only be invoked by explicit type annotation of the tag type.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">mkTagged</span> <span class="o">::</span> <span class="n">forall</span> <span class="n">t</span> <span class="n">a</span><span class="o">.</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="kt">Tagged</span> <span class="p">(</span><span class="o">???</span><span class="p">)</span> <span class="n">a</span> <span class="n">mkTagged</span> <span class="o">=</span> <span class="kt">MkTagged</span></code></pre></figure> <p>What to put in the <code class="language-plaintext highlighter-rouge">???</code> hole? The idea is that we want <code class="language-plaintext highlighter-rouge">t</code> in this type to be <em>ambiguous</em>, in other words, it should be impossible to infer <code class="language-plaintext highlighter-rouge">t</code> even if we know what <code class="language-plaintext highlighter-rouge">Tagged (???) a</code> is. If it can’t be inferred, then GHC will insist that we specify a type annotation at the use site for what <code class="language-plaintext highlighter-rouge">t</code> should be.</p> <p>The obvious thing to plug into <code class="language-plaintext highlighter-rouge">???</code> would be <code class="language-plaintext highlighter-rouge">t</code> itself, but that doesn’t work of course, because from knowing <code class="language-plaintext highlighter-rouge">Tagged t a</code>, <code class="language-plaintext highlighter-rouge">t</code> can be trivially inferred. 
For example, when given a value of type <code class="language-plaintext highlighter-rouge">Tagged "firstName" String</code>, we can infer that <code class="language-plaintext highlighter-rouge">t</code> must be <code class="language-plaintext highlighter-rouge">"firstName"</code>.</p> <p>As always (at least this seems to be a recurring theme here on my blog), we reach for type families to solve this problem. In particular, we define a rather funny-looking variant of the identity type family, which I’m going to call <code class="language-plaintext highlighter-rouge">Ambiguous</code>:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="n">family</span> <span class="kt">Ambiguous</span> <span class="p">(</span><span class="n">a</span> <span class="o">::</span> <span class="n">k</span><span class="p">)</span> <span class="o">::</span> <span class="n">j</span> <span class="kr">where</span> <span class="kt">Ambiguous</span> <span class="n">x</span> <span class="o">=</span> <span class="n">x</span></code></pre></figure> <p>The first thing that might strike you is the kind signature: <code class="language-plaintext highlighter-rouge">Ambiguous</code> takes an argument of kind <code class="language-plaintext highlighter-rouge">k</code>, and returns something of kind <code class="language-plaintext highlighter-rouge">j</code>. It helps to think of these kind parameters as additional <em>inputs</em> to the type family.</p> <p>That is, <code class="language-plaintext highlighter-rouge">Ambiguous "firstName"</code> will get stuck:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">&gt;&gt;&gt; :kind! 
Ambiguous "firstName" Ambiguous "firstName" :: j = Ambiguous "firstName"</code></pre></figure> <p>because GHC doesn’t know at which <code class="language-plaintext highlighter-rouge">j</code> we want to evaluate the type family (and indeed, in principle this choice could change the behaviour of the type family, since in GHC, type families are not parametric).</p> <p>In order to properly reduce the family, we must provide the result kind as an input, like so:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">&gt;&gt;&gt; :kind! (Ambiguous "firstName" :: Symbol) (Ambiguous "firstName" :: Symbol) :: Symbol = "firstName"</code></pre></figure> <p>Now let us plug this type family into the type of <code class="language-plaintext highlighter-rouge">mkTagged</code>, and see what happens.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">mkTagged</span> <span class="o">::</span> <span class="n">forall</span> <span class="n">t</span> <span class="n">a</span><span class="o">.</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="kt">Tagged</span> <span class="p">(</span><span class="kt">Ambiguous</span> <span class="n">t</span><span class="p">)</span> <span class="n">a</span> <span class="n">mkTagged</span> <span class="o">=</span> <span class="kt">MkTagged</span></code></pre></figure> <p>Now, when GHC’s given <code class="language-plaintext highlighter-rouge">Ambiguous t</code>, it can’t work out what <code class="language-plaintext highlighter-rouge">t</code> is. Why? Suppose we know that <code class="language-plaintext highlighter-rouge">Ambiguous t :: Symbol</code>, that is, we expect it to reduce to a symbol. That still doesn’t tell us anything about the kind of <code class="language-plaintext highlighter-rouge">t</code>! 
According to the kind signature of <code class="language-plaintext highlighter-rouge">Ambiguous</code>, the kind of <code class="language-plaintext highlighter-rouge">t</code> could be <em>anything</em>. Indeed, the only way to disambiguate this is to provide the kind of <code class="language-plaintext highlighter-rouge">t</code>. As the signature of <code class="language-plaintext highlighter-rouge">mkTagged</code> does not have an explicit kind annotation on <code class="language-plaintext highlighter-rouge">t</code>, the only way to provide the kind of <code class="language-plaintext highlighter-rouge">t</code> is to provide <code class="language-plaintext highlighter-rouge">t</code> itself (since only visibly quantified variables can be applied with visible type applications).</p> <p>Now, the following code</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">joseph</span> <span class="o">::</span> <span class="kt">Person</span> <span class="n">joseph</span> <span class="o">=</span> <span class="kt">MkPerson</span> <span class="p">(</span><span class="n">mkTagged</span> <span class="s">"Joseph"</span><span class="p">)</span> <span class="p">(</span><span class="n">mkTagged</span> <span class="s">"Knecht"</span><span class="p">)</span></code></pre></figure> <p>results in the error:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text"> • Couldn't match type ‘Ambiguous t0’ with ‘"firstName"’</code></pre></figure> <p>To fix it, we now <em>must</em> provide type applications:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">joseph</span> <span class="o">::</span> <span class="kt">Person</span> <span class="n">joseph</span> <span class="o">=</span> <span class="kt">MkPerson</span> <span class="p">(</span><span class="n">mkTagged</span> <span class="o">@</span><span class="s">"firstName"</span> <span class="s">"Joseph"</span><span 
class="p">)</span> <span class="p">(</span><span class="n">mkTagged</span> <span class="o">@</span><span class="s">"lastName"</span> <span class="s">"Knecht"</span><span class="p">)</span></code></pre></figure> <p><a href="https://kcsongor.github.io/ambiguous-tags/">Tripping up type inference</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on September 18, 2019.</p> <![CDATA[Most underrated vim features: C-a]]> https://kcsongor.github.io/underrated-vim-c-a 2019-09-12T00:00:00-00:00 2019-09-12T00:00:00+00:00 Csongor Kiss https://kcsongor.github.io <p>The aim of this series of blog posts is to shed light on some of the darker corners of the vim text editor that I have encountered over the years. Each post will focus on one particular feature, and should take no longer than a couple of minutes to read.</p> <p>Today, I’d like to talk about the <code class="language-plaintext highlighter-rouge">&lt;C-a&gt;</code> key sequence (that is, control+a). It is extremely simple: pressing <code class="language-plaintext highlighter-rouge">&lt;C-a&gt;</code> searches the current line (starting at the cursor position) for a number, then increments it.</p> <p>For example:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">this is a number: 10. ^</code></pre></figure> <p><code class="language-plaintext highlighter-rouge">&lt;C-a&gt;</code></p> <figure class="highlight"><pre><code class="language-text" data-lang="text">this is a number: 11. ^</code></pre></figure> <p>where <code class="language-plaintext highlighter-rouge">^</code> marks the cursor position.</p> <p>Its inverse is <code class="language-plaintext highlighter-rouge">&lt;C-x&gt;</code>, which decrements the number. We can also specify a count, for example <code class="language-plaintext highlighter-rouge">20&lt;C-x&gt;</code> will result in:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">this is a number: -9. 
^</code></pre></figure> <p>Hexadecimal and binary numbers are supported too. For example, to convert <code class="language-plaintext highlighter-rouge">192</code> to hex, we can do</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">this is a hexadecimal number: 0x0. ^</code></pre></figure> <p><code class="language-plaintext highlighter-rouge">192&lt;C-a&gt;</code></p> <figure class="highlight"><pre><code class="language-text" data-lang="text">this is a hexadecimal number: 0xc0. ^</code></pre></figure> <p><a href="https://kcsongor.github.io/underrated-vim-c-a/">Most underrated vim features: C-a</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on September 12, 2019.</p> <![CDATA[Global Implicit Parameters]]> https://kcsongor.github.io/global-implicit-parameters 2019-07-11T00:00:00-00:00 2019-07-11T00:00:00+00:00 Csongor Kiss https://kcsongor.github.io <section id="table-of-contents" class="toc"> <header> <h3><i class="fa fa-book"></i> Overview</h3> </header> <div id="drawer"> <ul id="markdown-toc"> <li><a href="#under-the-hood" id="markdown-toc-under-the-hood">Under the hood</a></li> <li><a href="#barewords" id="markdown-toc-barewords">Barewords</a></li> </ul> </div> </section> <!-- /#table-of-contents --> <p>Implicit parameters (enabled with the <code class="language-plaintext highlighter-rouge">{-# LANGUAGE ImplicitParams #-}</code> pragma) provide a way to dynamically bind variables in Haskell.</p> <p>For example, the following function can be called in any context where <code class="language-plaintext highlighter-rouge">?x</code> is bound:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">foo</span> <span class="o">::</span> <span class="p">(</span><span class="o">?</span><span class="n">x</span> <span class="o">::</span> <span class="kt">Int</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="kt">Int</span> <span 
class="n">foo</span> <span class="o">=</span> <span class="o">?</span><span class="n">x</span> <span class="n">bar</span> <span class="o">::</span> <span class="kt">Int</span> <span class="n">bar</span> <span class="o">=</span> <span class="kr">let</span> <span class="o">?</span><span class="n">x</span> <span class="o">=</span> <span class="mi">10</span> <span class="kr">in</span> <span class="n">foo</span></code></pre></figure> <p>Unlike type classes, implicit parameters are bound locally. But what if we want to bind one in the global scope? This would allow a global “default” value, which could then be shadowed locally.</p> <p>Unfortunately, the following is syntactically invalid:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">?</span><span class="n">x</span> <span class="o">=</span> <span class="mi">21</span></code></pre></figure> <p>We turn to the <a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#implicit-parameter-bindings">GHC User Manual</a>, only to be further discouraged:</p> <blockquote> <p>A group of implicit-parameter bindings may occur anywhere a normal group of Haskell bindings can occur, except at top level.</p> </blockquote> <p>Of course, we won’t let mere syntactic restrictions get in our way.</p> <h2 id="under-the-hood">Under the hood</h2> <p>Since global binding of implicit parameters is officially not possible, we need to turn to unofficial methods. To begin, we pass the <code class="language-plaintext highlighter-rouge">-ddump-tc-trace</code> flag to GHC and recompile the module containing <code class="language-plaintext highlighter-rouge">foo</code> and <code class="language-plaintext highlighter-rouge">bar</code>. This makes GHC dump information about what it’s doing while typechecking the module. 
There is quite a lot of output, but one line looks interesting:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">canEvNC:cls ghc-prim-0.5.3:GHC.Classes.IP ["x", Int]</code></pre></figure> <p>Good software engineering practice dictates code reuse, and we all know that GHC is a well-engineered piece of software. Therefore, it is not surprising to find that implicit parameters are implemented by piggybacking off of type class resolution with some additional rules to disregard issues like global coherence.</p> <p>As the above line suggests, implicit parameter resolution is desugared into the resolution of the <code class="language-plaintext highlighter-rouge">GHC.Classes.IP</code> type class from <code class="language-plaintext highlighter-rouge">ghc-prim</code>.</p> <p>Even though this module is <a href="http://hackage.haskell.org/package/ghc-prim-0.5.3">not documented</a>, we can import it and ask GHCi for more information:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">IP</span> <span class="p">(</span><span class="n">s</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="n">a</span> <span class="o">|</span> <span class="n">s</span> <span class="o">-&gt;</span> <span class="n">a</span> <span class="kr">where</span> <span class="n">ip</span> <span class="o">::</span> <span class="n">a</span> <span class="cp">{-# MINIMAL ip #-}</span></code></pre></figure> <p>It looks like GHC generates instances of the <code class="language-plaintext highlighter-rouge">IP</code> class on the fly whenever it sees a binder for an implicit parameter. The name of the parameter is represented as a type-level symbol. 
The functional dependency allows the variable’s type to be resolved just from its name.</p> <p>Let’s try to write an instance for this class by hand:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="c1">-- ?x = 21</span> <span class="kr">instance</span> <span class="kt">IP</span> <span class="s">"x"</span> <span class="kt">Int</span> <span class="kr">where</span> <span class="n">ip</span> <span class="o">=</span> <span class="mi">21</span></code></pre></figure> <p>GHC happily accepts this definition. Indeed, we can now write</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">baz</span> <span class="o">::</span> <span class="kt">Int</span> <span class="n">baz</span> <span class="o">=</span> <span class="o">?</span><span class="n">x</span></code></pre></figure> <p>which evaluates to <code class="language-plaintext highlighter-rouge">21</code>, by picking up the <code class="language-plaintext highlighter-rouge">?x</code> variable from the top-level scope. As expected, <code class="language-plaintext highlighter-rouge">let ?x = 10 in foo</code> still evaluates to <code class="language-plaintext highlighter-rouge">10</code>, as it <em>shadows</em> the top-level binding.</p> <h2 id="barewords">Barewords</h2> <p>Perhaps this is a good place to stop. But we can go further: above, we defined only the <code class="language-plaintext highlighter-rouge">?x</code> variable. 
It turns out that we can define an instance for <em>all</em> symbols at once:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="kt">KnownSymbol</span> <span class="n">s</span> <span class="o">=&gt;</span> <span class="kt">IP</span> <span class="n">s</span> <span class="kt">String</span> <span class="kr">where</span> <span class="n">ip</span> <span class="o">=</span> <span class="n">symbolVal</span> <span class="p">(</span><span class="kt">Proxy</span> <span class="o">::</span> <span class="kt">Proxy</span> <span class="n">s</span><span class="p">)</span></code></pre></figure> <p>This instance brings all possible implicit variables into scope, and gives each one its own name as its value, by reflecting the symbol into a string.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">bye</span> <span class="o">::</span> <span class="kt">String</span> <span class="n">bye</span> <span class="o">=</span> <span class="o">?</span><span class="n">thanks</span> <span class="o">++</span> <span class="s">" "</span> <span class="o">++</span> <span class="o">?</span><span class="n">for</span> <span class="o">++</span> <span class="s">" "</span> <span class="o">++</span> <span class="o">?</span><span class="n">reading</span></code></pre></figure> <p>Which <em>almost</em> feels like writing Perl!</p> <p><a href="https://kcsongor.github.io/global-implicit-parameters/">Global Implicit Parameters</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on July 11, 2019.</p> <![CDATA[Detecting the undetectable: custom type errors for stuck type families]]> https://kcsongor.github.io/report-stuck-families 2018-11-30T00:00:00-00:00 2018-11-29T00:00:00+00:00 Csongor Kiss https://kcsongor.github.io <section id="table-of-contents" class="toc"> <header> <h3><i class="fa fa-book"></i> Overview</h3> </header> <div id="drawer"> <ul 
id="markdown-toc"> <li><a href="#type-family-evaluation-semantics" id="markdown-toc-type-family-evaluation-semantics">Type family evaluation semantics</a></li> <li><a href="#custom-type-errors" id="markdown-toc-custom-type-errors">Custom type errors</a></li> <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li> </ul> </div> </section> <!-- /#table-of-contents --> <p>Custom type errors are a great way to improve the usability of Haskell libraries that utilise some of the more recent language extensions. Yet anyone who has written or used one of these libraries will know that despite the authors’ best efforts, there are still many occasions where a wall of text jumps out, leaving us puzzled as to what went wrong.</p> <p>This post is about one particular class of such errors that have been troubling users of many modern Haskell libraries: stuck type families.</p> <p>The following type error perfectly illustrates the problem. It is an actual error <a href="https://github.com/kcsongor/generic-lens/issues/73">reported</a> on the issue tracker of the <a href="http://hackage.haskell.org/package/generic-lens">generic-lens</a> library.</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">• No instance for (Data.Generics.Product.Types.HasTypes' (Data.Generics.Product.Types.Snd (Data.Generics.Product.Types.InterestingOr Description (Data.Generics.Product.Types.InterestingOr Description (Data.Generics.Product.Types.Interesting' Description (Rep Text) Name '[Text, Sirname, None, Description]) (M1 S ('MetaSel ('Just "name") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedLazy) (Rec0 Name)) Name) (M1 C ('MetaCons "M" 'PrefixI 'False) (S1 ('MetaSel 'Nothing 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedLazy) (Rec0 Multiple))) Name)) Description Name) arising from a use of ‘types’</code></pre></figure> <p>Can you spot the problem? Even if you know what to look for, it takes a good few seconds to locate the culprit. 
The goal of this post is to turn the above into the following:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">• No instance for Generic Text arising from a traversal over Description.</code></pre></figure> <p>How could we possibly identify the lack of a <code class="language-plaintext highlighter-rouge">Generic</code> instance from the above? Let us have a closer look at that large type error. It is a nested chain of function calls, such as <code class="language-plaintext highlighter-rouge">Snd</code> and <code class="language-plaintext highlighter-rouge">Interesting</code>, which are type families leaking out of the library’s implementation. The reason we see these type families (as opposed to the result they evaluate to) is that the computation is <em>stuck</em>. The culprit is the <code class="language-plaintext highlighter-rouge">Rep Text</code> part somewhere in the middle.</p> <p>It turns out that <code class="language-plaintext highlighter-rouge">Rep</code> is an associated type family of the <code class="language-plaintext highlighter-rouge">Generic</code> class:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">Generic</span> <span class="n">a</span> <span class="kr">where</span> <span class="kr">type</span> <span class="kt">Rep</span> <span class="n">a</span> <span class="o">::</span> <span class="kt">Type</span> <span class="o">-&gt;</span> <span class="kt">Type</span> <span class="o">...</span></code></pre></figure> <p>Thus, the reason <code class="language-plaintext highlighter-rouge">Rep Text</code> is not defined is that <code class="language-plaintext highlighter-rouge">Text</code> has no <code class="language-plaintext highlighter-rouge">Generic</code> instance. 
Clearly, it’s unreasonable to expect users to keep such implementation details in mind and hunt for unreduced occurrences of <code class="language-plaintext highlighter-rouge">Rep</code> in their type errors to find out what the issue is!</p> <p>Yet, reporting this is not so easy. To explain why, we need to understand the behaviour of type families.</p> <p class="notice">As things stand today, the associated family <code class="language-plaintext highlighter-rouge">Rep</code> is not actually connected to the <code class="language-plaintext highlighter-rouge">Generic</code> class as far as the type checker is concerned. This is why unreduced occurrences will not result in error messages mentioning anything about <code class="language-plaintext highlighter-rouge">Generic</code> in the first place. <a href="https://arxiv.org/abs/1706.09715">Constrained type families</a> offer a solution to this problem, but they are not (yet) implemented in GHC.</p> <h2 id="type-family-evaluation-semantics">Type family evaluation semantics</h2> <p>The reduction of type families is driven by the constraint solver. To the best of my knowledge, there is no formal specification for their semantics, so I’m not going to attempt to give a comprehensive account here either. Instead, let us just make some key observations about how type families reduce.</p> <p>A type involving a type family is said to be <em>stuck</em> if none of the type family’s equations can be selected for the provided arguments. Since <code class="language-plaintext highlighter-rouge">Text</code> has no <code class="language-plaintext highlighter-rouge">Generic</code> instance, there is consequently no <code class="language-plaintext highlighter-rouge">Rep Text</code> instance defined either. Thus, <code class="language-plaintext highlighter-rouge">Rep Text</code> is stuck.</p> <p>How does “stuckness” propagate up a chain of function calls? 
Consider the following type family:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="n">family</span> <span class="kt">Foo</span> <span class="n">a</span> <span class="kr">where</span> <span class="kt">Foo</span> <span class="n">a</span> <span class="o">=</span> <span class="n">a</span></code></pre></figure> <p>No matter what we pass in as the argument, the single equation will always match. This means that even if we pass in a stuck type, such as <code class="language-plaintext highlighter-rouge">Rep Text</code>, the equation can reduce to the right hand side (and get stuck afterwards):</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">&gt;&gt;&gt; :kind! Foo (Rep Text) = Rep Text</code></pre></figure> <p>In other words, we can think of <code class="language-plaintext highlighter-rouge">Foo</code> as a type family that’s “lazy” in its argument. Now consider the <code class="language-plaintext highlighter-rouge">Bar</code> type family:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="n">family</span> <span class="kt">Bar</span> <span class="n">a</span> <span class="kr">where</span> <span class="kt">Bar</span> <span class="kt">Maybe</span> <span class="o">=</span> <span class="kt">Maybe</span> <span class="kt">Bar</span> <span class="n">a</span> <span class="o">=</span> <span class="n">a</span></code></pre></figure> <p>Here, we first check if the argument is <code class="language-plaintext highlighter-rouge">Maybe</code>, in which case <code class="language-plaintext highlighter-rouge">Maybe</code> is returned, otherwise we pick the second equation. 
Perhaps surprisingly, <code class="language-plaintext highlighter-rouge">Bar</code> behaves the same as <code class="language-plaintext highlighter-rouge">Foo</code>:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">&gt;&gt;&gt; :kind! Bar (Rep Text) = Rep Text</code></pre></figure> <p>The two equations of <code class="language-plaintext highlighter-rouge">Bar</code> <em>agree</em> with each other, because the first one is a substitution instance of the second. GHC recognises this, and decides that it is safe to drop the first equation in favour of the second one.</p> <p>We can of course write disagreeing equations:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">T1</span> <span class="n">x</span> <span class="kr">data</span> <span class="kt">T2</span> <span class="n">x</span> <span class="kr">type</span> <span class="n">family</span> <span class="kt">FooBar</span> <span class="n">a</span> <span class="kr">where</span> <span class="kt">FooBar</span> <span class="kt">T1</span> <span class="o">=</span> <span class="kt">T2</span> <span class="kt">FooBar</span> <span class="n">a</span> <span class="o">=</span> <span class="n">a</span></code></pre></figure> <p>This time, notice that the first equation is not a substitution instance of the second: it returns something other than the argument.</p> <p>GHC won’t optimise this case away anymore, and now instance matching will have to consider both equations. A given equation matches, if the argument unifies with the pattern, and is apart from all of the preceding patterns (i.e. doesn’t match any of them). The important thing here is that a stuck type is <em>not</em> apart from any other type, but neither does it match any other type. This means that</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">&gt;&gt;&gt; :kind! 
FooBar (Rep Text) = FooBar (Rep Text)</code></pre></figure> <p><code class="language-plaintext highlighter-rouge">FooBar</code> gets stuck just when its argument does. We can think of <code class="language-plaintext highlighter-rouge">FooBar</code> as a type family that is “strict” in its argument.</p> <p>If we pass in a non-stuck value, evaluation proceeds as normal:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">&gt;&gt;&gt; :kind! FooBar Maybe = Maybe</code></pre></figure> <p>This works because <code class="language-plaintext highlighter-rouge">Maybe</code> is apart from <code class="language-plaintext highlighter-rouge">T1</code> (they are different ground types), and it unifies with the catch-all pattern <code class="language-plaintext highlighter-rouge">a</code>.</p> <p>So, if a type family that inspects its argument is given a stuck type, then the resulting type will be stuck itself. Notice that we can’t proceed any further: there is no way to detect if the argument was stuck or not. This is why the type error above is so impenetrable. If we ignore our argument like <code class="language-plaintext highlighter-rouge">Foo</code> does, then it just slips by, but if we try to do something with it like <code class="language-plaintext highlighter-rouge">FooBar</code> does, we get stuck.</p> <p>Of course, I wouldn’t have written down all of these low-level details about type family reduction if they didn’t lead to a solution!</p> <h2 id="custom-type-errors">Custom type errors</h2> <p>The mechanism of custom type errors is quite simple. The constraint solver proceeds normally, reducing all type family equations and solving all type class instances. 
If at the end, there are any constraints of the form <code class="language-plaintext highlighter-rouge">TypeError ...</code>, then the payload of the error gets printed, otherwise any unsolved constraints are reported.</p> <p>As an example</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">foo</span> <span class="o">::</span> <span class="kt">TypeError</span> <span class="p">(</span><span class="kt">'Text</span> <span class="s">"Ouch"</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="nb">()</span> <span class="n">foo</span> <span class="o">=</span> <span class="mi">10</span></code></pre></figure> <p>yields</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">• Ouch</code></pre></figure> <p>even though <code class="language-plaintext highlighter-rouge">10</code> clearly doesn’t have type <code class="language-plaintext highlighter-rouge">()</code>.</p> <p>We want to produce a custom type error when the <code class="language-plaintext highlighter-rouge">Rep</code> type family gets stuck, and we’d like to continue normally otherwise. As discussed above, there is no way to branch on whether a type family is stuck or not.</p> <p>However, we now have all the necessary pieces: all we need to do is to make sure that when <code class="language-plaintext highlighter-rouge">Rep</code> gets stuck, we leave a <code class="language-plaintext highlighter-rouge">TypeError</code> in the residual constraints. To do this, we’re going to wrap the call to <code class="language-plaintext highlighter-rouge">Rep</code> in another type family, which will get stuck just when <code class="language-plaintext highlighter-rouge">Rep</code> is stuck. When <code class="language-plaintext highlighter-rouge">Rep</code> reduces, our wrapper reduces too. 
The additional piece is that the wrapper will also hold a type error as its argument, which will reside in the unsolved constraint in the stuck case, but disappear otherwise.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="n">family</span> <span class="kt">Break</span> <span class="p">(</span><span class="n">c</span> <span class="o">::</span> <span class="kt">Constraint</span><span class="p">)</span> <span class="p">(</span><span class="n">rep</span> <span class="o">::</span> <span class="kt">Type</span> <span class="o">-&gt;</span> <span class="kt">Type</span><span class="p">)</span> <span class="o">::</span> <span class="kt">Constraint</span> <span class="kr">where</span> <span class="kt">Break</span> <span class="kr">_</span> <span class="kt">T1</span> <span class="o">=</span> <span class="p">(</span><span class="nb">()</span><span class="p">,</span> <span class="nb">()</span><span class="p">)</span> <span class="kt">Break</span> <span class="kr">_</span> <span class="kr">_</span> <span class="o">=</span> <span class="nb">()</span></code></pre></figure> <p><code class="language-plaintext highlighter-rouge">Break</code> is the wrapper family. It takes a constraint, which will be our type error. Then it forces its argument by testing against <code class="language-plaintext highlighter-rouge">T1</code>. 
Note that in both equations, the type family reduces to the trivial constraint <code class="language-plaintext highlighter-rouge">()</code>, but in the first case, we use <code class="language-plaintext highlighter-rouge">((), ())</code> (a tuple of two trivial constraints) to ensure that the equations don’t optimise away, like they did with <code class="language-plaintext highlighter-rouge">Bar</code>.</p> <p>Finally, we introduce a type family to construct a custom error message:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="n">family</span> <span class="kt">NoGeneric</span> <span class="n">t</span> <span class="kr">where</span> <span class="kt">NoGeneric</span> <span class="n">x</span> <span class="o">=</span> <span class="kt">TypeError</span> <span class="p">(</span><span class="kt">'Text</span> <span class="s">"No instance for "</span> <span class="n">'</span><span class="o">:&lt;&gt;:</span> <span class="kt">'ShowType</span> <span class="p">(</span><span class="kt">Generic</span> <span class="n">x</span><span class="p">))</span></code></pre></figure> <p>Now, consider what happens when we call <code class="language-plaintext highlighter-rouge">Break</code> with the stuck argument <code class="language-plaintext highlighter-rouge">Rep Text</code>:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">&gt;&gt;&gt; :kind! Break (NoGeneric Text) (Rep Text) = Break (TypeError ...) (Rep Text)</code></pre></figure> <p>the type gets stuck, with a <code class="language-plaintext highlighter-rouge">TypeError</code> inside! 
However, when called with a type where <code class="language-plaintext highlighter-rouge">Rep</code> is defined, such as <code class="language-plaintext highlighter-rouge">Bool</code>, the type reduces to the unit constraint, no mention of the type error.</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">&gt;&gt;&gt; :kind! Break (NoGeneric Bool) (Rep Bool) = () :: Constraint</code></pre></figure> <p>And with this, we can report errors for any stuck type family.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">bar</span> <span class="o">::</span> <span class="kt">Break</span> <span class="p">(</span><span class="kt">NoGeneric</span> <span class="kt">Text</span><span class="p">)</span> <span class="p">(</span><span class="kt">Rep</span> <span class="kt">Text</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="nb">()</span> <span class="n">bar</span> <span class="o">=</span> <span class="nb">()</span></code></pre></figure> <p>yields</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">• No instance for Generic Text • In the expression: bar</code></pre></figure> <h1 id="conclusion">Conclusion</h1> <p>Using this technique, we can place custom type errors right where our stuck type families are, and provide more contextual information about what went wrong. 
We can even generalise the above to the following type family:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="n">family</span> <span class="kt">Any</span> <span class="o">::</span> <span class="n">k</span> <span class="kr">type</span> <span class="n">family</span> <span class="kt">Assert</span> <span class="p">(</span><span class="n">err</span> <span class="o">::</span> <span class="kt">Constraint</span><span class="p">)</span> <span class="p">(</span><span class="n">break</span> <span class="o">::</span> <span class="kt">Type</span> <span class="o">-&gt;</span> <span class="kt">Type</span><span class="p">)</span> <span class="p">(</span><span class="n">a</span> <span class="o">::</span> <span class="n">k</span><span class="p">)</span> <span class="o">::</span> <span class="n">k</span> <span class="kr">where</span> <span class="kt">Assert</span> <span class="kr">_</span> <span class="kt">T1</span> <span class="kr">_</span> <span class="o">=</span> <span class="kt">Any</span> <span class="kt">Assert</span> <span class="kr">_</span> <span class="kr">_</span> <span class="n">k</span> <span class="o">=</span> <span class="n">k</span></code></pre></figure> <p>which we can use at any point in a computation, not just in constraints. <code class="language-plaintext highlighter-rouge">Assert</code> takes a type error, a potentially stuck computation, and a value. If the computation is stuck, then the custom error is presented, otherwise the value is passed through without any errors. 
Here, strictness is forced by the same <code class="language-plaintext highlighter-rouge">T1</code> trick, but this time, to ensure that the right hand sides are also different, we return the <code class="language-plaintext highlighter-rouge">Any</code> type family in the first case.</p> <p><a href="https://kcsongor.github.io/report-stuck-families/">Detecting the undetectable: custom type errors for stuck type families</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on November 29, 2018.</p> <![CDATA[Parsing type-level strings in Haskell]]> https://kcsongor.github.io/symbol-parsing-haskell 2018-11-28T00:00:00-00:00 2018-11-28T00:00:00+00:00 Csongor Kiss https://kcsongor.github.io <section id="table-of-contents" class="toc"> <header> <h3><i class="fa fa-book"></i> Overview</h3> </header> <div id="drawer"> <ul id="markdown-toc"> <li><a href="#motivation" id="markdown-toc-motivation">Motivation</a></li> <li><a href="#primitives" id="markdown-toc-primitives">Primitives</a> <ul> <li><a href="#appendsymbol" id="markdown-toc-appendsymbol">AppendSymbol</a></li> <li><a href="#cmpsymbol" id="markdown-toc-cmpsymbol">CmpSymbol</a></li> </ul> </li> <li><a href="#decomposition" id="markdown-toc-decomposition">Decomposition</a> <ul> <li><a href="#head" id="markdown-toc-head">Head</a></li> <li><a href="#uncons" id="markdown-toc-uncons">Uncons</a></li> </ul> </li> <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li> </ul> </div> </section> <!-- /#table-of-contents --> <p>Haskell, as implemented in GHC, has a very rich language for expressing computations in types. Thanks to the <a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html?highlight=datakinds#datatype-promotion">DataKinds</a> extension, any inductively defined data type can be used not only at the term level, but also at the type level. 
A notable exception is strings, which provide the main theme for today’s blog post.</p> <p>The <code class="language-plaintext highlighter-rouge">String</code> type in Haskell is defined as a list of <code class="language-plaintext highlighter-rouge">Char</code>s. However, the type-level equivalent, <code class="language-plaintext highlighter-rouge">Symbol</code>, is defined as a primitive in GHC, presumably for efficiency. After all, the type checker passes these types around, and the simpler their structure, the less potential work the constraint solver needs to do.</p> <p>The problem is this: since <code class="language-plaintext highlighter-rouge">Symbol</code> is defined as a primitive, there is no way to pattern match on its structure, and the only way to interact with it is by using the built-in primitive operations, namely appending and (efficient, constant-time) comparison.</p> <p>In this blog post, I will show how these primitives can be used to recover the ability to do arbitrary introspection of these type-level string literals, thereby enabling a whole range of applications where statically known information can be exploited.</p> <p>The technique presented here was inspired by Daniel Winograd-Cort’s <a href="https://github.com/kcsongor/generic-lens/pull/69">pull request for the generic-lens library</a>.</p> <p>All of this is packaged into the <a href="https://github.com/kcsongor/symbols">symbols</a> library.</p> <h1 id="motivation">Motivation</h1> <p>I have <a href="/purescript-safe-printf">written</a> about type-level symbol parsing in PureScript to implement a type-safe <code class="language-plaintext highlighter-rouge">printf</code> function. 
(There, I achieved symbol decomposition by patching the compiler, but no such thing is required here.)</p> <p>Reusing that example, we will be able to write</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">&gt;&gt;&gt;</span> <span class="o">:</span><span class="n">t</span> <span class="n">printf</span> <span class="o">@</span><span class="s">"Wurble %d %d %s"</span> <span class="n">printf</span> <span class="o">@</span><span class="s">"Wurble %d %d %s"</span> <span class="o">::</span> <span class="kt">Int</span> <span class="o">-&gt;</span> <span class="kt">Int</span> <span class="o">-&gt;</span> <span class="kt">String</span> <span class="o">-&gt;</span> <span class="kt">String</span></code></pre></figure> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">&gt;&gt;&gt;</span> <span class="n">printf</span> <span class="o">@</span><span class="s">"Wurble %d %d %s"</span> <span class="mi">10</span> <span class="mi">20</span> <span class="s">"foo"</span> <span class="s">"Wurble 10 20 foo"</span></code></pre></figure> <p>The implementation of the printf example using the technique described in this blog post can be found on <a href="https://github.com/kcsongor/symbols/blob/master/src/Data/Symbol/Examples/Printf.hs">GitHub</a>.</p> <h1 id="primitives">Primitives</h1> <p>First, let’s have a look at the primitives GHC provides for manipulating types of kind <code class="language-plaintext highlighter-rouge">Symbol</code>, namely <code class="language-plaintext highlighter-rouge">AppendSymbol</code> and <code class="language-plaintext highlighter-rouge">CmpSymbol</code>.</p> <p>These functions are implemented in the compiler, and exported from the <a href="http://hackage.haskell.org/package/base-4.12.0.0/docs/GHC-TypeLits.html">GHC.TypeLits</a> module:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span 
class="n">family</span> <span class="kt">AppendSymbol</span> <span class="p">(</span><span class="n">m</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">n</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="o">::</span> <span class="kt">Symbol</span> <span class="kr">type</span> <span class="n">family</span> <span class="kt">CmpSymbol</span> <span class="p">(</span><span class="n">m</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">n</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="o">::</span> <span class="kt">Ordering</span></code></pre></figure> <p>Note that there is no <code class="language-plaintext highlighter-rouge">Uncons</code> primitive that returns the head (first character) and the tail of the symbol. It turns out that we can implement <code class="language-plaintext highlighter-rouge">Uncons</code> using the two primitives above.</p> <h2 id="appendsymbol">AppendSymbol</h2> <p>The fact that <code class="language-plaintext highlighter-rouge">AppendSymbol</code> is a type family suggests a rather straightforward semantics. 
It appends two symbols, producing a third:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">&gt;&gt;&gt;</span> <span class="o">:</span><span class="n">kind</span><span class="o">!</span> <span class="kt">AppendSymbol</span> <span class="s">"foo"</span> <span class="s">"bar"</span>
<span class="o">=</span> <span class="s">"foobar"</span></code></pre></figure> <p>That is, we would expect it to compute in one direction only, from the two inputs to the result.</p> <p>However, if we have a look at the <a href="https://github.com/ghc/ghc/blob/1c2c2d3dfd4c36884b22163872feb87122b4528d/compiler/typecheck/TcTypeNats.hs#L835">implementation</a> in GHC, we can see that there’s more going on. There are special rules for the interaction of <code class="language-plaintext highlighter-rouge">AppendSymbol</code> constraints with equality constraints. In concrete terms, GHC will solve the following constraint:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">(AppendSymbol "foo" b ~ "foobar") =&gt; (b ~ "bar")</code></pre></figure> <p>That is, if we know a prefix of a symbol, we can decompose it to get the matching suffix. 
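</p> <p>As a minimal sketch of this decomposition, the following program fixes the prefix and leaves the suffix for GHC to work out. The name <code class="language-plaintext highlighter-rouge">stripHello</code> is hypothetical and chosen only for this illustration, and the sketch assumes a GHC version whose <code class="language-plaintext highlighter-rouge">GHC.TypeLits</code> exports <code class="language-plaintext highlighter-rouge">AppendSymbol</code>:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

import Data.Proxy (Proxy (..))
import GHC.TypeLits (AppendSymbol, symbolVal)

-- `stripHello` is a hypothetical name used only for this illustration.
-- The prefix "hello " is known, so the constraint leaves `suffix`
-- as the only unknown of the three symbols involved.
stripHello :: AppendSymbol "hello " suffix ~ "hello world" =&gt; Proxy suffix
stripHello = Proxy

main :: IO ()
main =
  -- `symbolVal` needs a concrete symbol, so this only compiles if GHC
  -- manages to determine `suffix` from the equality constraint.
  putStrLn (symbolVal stripHello)</code></pre></figure> <p>If the solver applies the rule described above, <code class="language-plaintext highlighter-rouge">suffix</code> is resolved to <code class="language-plaintext highlighter-rouge">"world"</code>, and <code class="language-plaintext highlighter-rouge">symbolVal</code> can turn it back into an ordinary string.</p> <p>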
Morally, the actual signature of <code class="language-plaintext highlighter-rouge">AppendSymbol</code> would be closer to</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="n">family</span> <span class="kt">AppendSymbol</span> <span class="n">m</span> <span class="n">n</span> <span class="o">=</span> <span class="n">r</span> <span class="o">|</span> <span class="n">r</span> <span class="n">m</span> <span class="o">-&gt;</span> <span class="n">n</span><span class="p">,</span> <span class="n">r</span> <span class="n">n</span> <span class="o">-&gt;</span> <span class="n">m</span></code></pre></figure> <p>But this can’t be expressed today in GHC (type family dependencies only allow the result alone to determine the inputs; a dependency that combines the result with one of the inputs is not allowed), so <code class="language-plaintext highlighter-rouge">AppendSymbol</code> really is a lot more powerful than the type system would like to admit!</p> <p>Even with the ability to decompose symbols, a problem remains. This decomposition only works if we <em>know</em> what the prefix is. 
And in general, we need to know two out of the three symbols involved in the constraint to get the third.</p> <p>As a result, the following won’t work:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">bad</span> <span class="o">::</span> <span class="kt">AppendSymbol</span> <span class="n">prefix</span> <span class="n">suffix</span> <span class="o">~</span> <span class="s">"hello world"</span> <span class="o">=&gt;</span> <span class="kt">Proxy</span> <span class="n">suffix</span>
<span class="n">bad</span> <span class="o">=</span> <span class="kt">Proxy</span></code></pre></figure> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">&gt;&gt;&gt;</span> <span class="o">:</span><span class="n">t</span> <span class="n">bad</span>
<span class="n">bad</span> <span class="o">::</span> <span class="p">(</span><span class="kt">AppendSymbol</span> <span class="n">prefix</span> <span class="n">suffix</span> <span class="o">~</span> <span class="s">"hello world"</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="kt">Proxy</span> <span class="n">suffix</span></code></pre></figure> <p>That is, <code class="language-plaintext highlighter-rouge">suffix</code> remains unsolved.</p> <p>We might think that we can just try all possible characters as potential prefixes until one matches, but that would require backtracking in the constraint solver, and GHC’s constraint solver doesn’t backtrack.</p> <p>That is, trying a prefix that doesn’t match results in an unsolvable constraint:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">bad'</span> <span class="o">::</span> <span class="kt">AppendSymbol</span> <span class="s">"a"</span> <span class="n">suffix</span> <span class="o">~</span> <span class="s">"hello world"</span> <span class="o">=&gt;</span> <span class="kt">Proxy</span> <span class="n">suffix</span>
<span
class="n">bad'</span> <span class="o">=</span> <span class="kt">Proxy</span></code></pre></figure> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">&gt;&gt;&gt;</span> <span class="o">:</span><span class="n">t</span> <span class="n">bad'</span>
<span class="n">bad'</span> <span class="o">::</span> <span class="p">(</span><span class="kt">AppendSymbol</span> <span class="s">"a"</span> <span class="n">suffix</span> <span class="o">~</span> <span class="s">"hello world"</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="kt">Proxy</span> <span class="n">suffix</span></code></pre></figure> <p>But since we can’t backtrack, there is no way to try a different character once we’ve committed to a particular prefix.</p> <p><em>If we knew</em> what the first character was, we could strip it off and get the remaining symbol this way, which would essentially allow us to treat a <code class="language-plaintext highlighter-rouge">Symbol</code> as a list of characters.</p> <h2 id="cmpsymbol">CmpSymbol</h2> <p>It turns out that we can simply use alphabetical ordering to find out what the first character of a string is. <code class="language-plaintext highlighter-rouge">CmpSymbol</code> compares two symbols, and returns one of <code class="language-plaintext highlighter-rouge">LT</code>, <code class="language-plaintext highlighter-rouge">EQ</code>, or <code class="language-plaintext highlighter-rouge">GT</code> as a result.</p> <p>Observe that any string longer than one character sorts after its first character, and before every character that comes later in the alphabet than that first character. As an example, consider the string <code class="language-plaintext highlighter-rouge">"hello world"</code>, whose first character is <code class="language-plaintext highlighter-rouge">h</code>, and the letter after <code class="language-plaintext highlighter-rouge">h</code> is <code class="language-plaintext highlighter-rouge">i</code>. 
Then we have</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">"h" &lt; "hello world" &lt; "i"</code></pre></figure> <p>A string of length one simply compares as <code class="language-plaintext highlighter-rouge">EQ</code> with its first character (that is, with itself).</p> <h1 id="decomposition">Decomposition</h1> <p>We now put the pieces together to implement an uncons function for symbols. First, we need <code class="language-plaintext highlighter-rouge">Head</code>, a function that returns the first character of a symbol. Second, we will feed the result of <code class="language-plaintext highlighter-rouge">Head</code> to <code class="language-plaintext highlighter-rouge">AppendSymbol</code> as the known prefix, to retrieve the tail of the symbol. Doing this repeatedly will allow us to turn a symbol into a list of characters, which in turn can be consumed by ordinary type families.</p> <h2 id="head">Head</h2> <p>So, to find out what the first character of a symbol is, we just need to find the last character in the ASCII table that precedes (or equals) our symbol. To do this reasonably efficiently, we use binary search. Since indexing into a type-level list takes linear time, we use a balanced binary search tree instead. 
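</p> <p>To make the search criterion concrete, here is a value-level sketch of the same idea on ordinary strings. It is only an analogue: it uses a plain linear scan instead of the binary search tree, <code class="language-plaintext highlighter-rouge">String</code> comparison stands in for <code class="language-plaintext highlighter-rouge">CmpSymbol</code>, and the names <code class="language-plaintext highlighter-rouge">printable</code> and <code class="language-plaintext highlighter-rouge">headViaCompare</code> are hypothetical:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">-- Printable ASCII, from ' ' to '~' (assumed to match the alphabet used here).
printable :: String
printable = [' ' .. '~']

-- The first character of a string is the last character whose
-- one-character string does not compare greater than the whole string.
-- Assumption: the input is non-empty and starts with a printable ASCII
-- character; otherwise `last` is applied to an empty list and fails.
headViaCompare :: String -&gt; Char
headViaCompare s = last (takeWhile (\c -&gt; [c] &lt;= s) printable)

main :: IO ()
main = do
  print (headViaCompare "hello world") -- 'h'
  print (headViaCompare "h")           -- 'h' (the equality case)</code></pre></figure> <p>The scan stops at <code class="language-plaintext highlighter-rouge">i</code>, the first character that compares greater than the input, which is exactly the upper bound from the ordering observation in the previous section.</p> <p>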
Recall that symbol comparisons are constant-time and that we’re working with a fixed-size alphabet, so the whole operation is constant time either way; the tree simply improves the constant factor by an order of magnitude.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">Tree</span> <span class="n">a</span> <span class="o">=</span> <span class="kt">Leaf</span> <span class="o">|</span> <span class="kt">Node</span> <span class="p">(</span><span class="kt">Tree</span> <span class="n">a</span><span class="p">)</span> <span class="n">a</span> <span class="p">(</span><span class="kt">Tree</span> <span class="n">a</span><span class="p">)</span> <span class="kr">deriving</span> <span class="kt">Show</span></code></pre></figure> <p>The printable subset of the ASCII character set can be encoded as the following tree:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="kt">Chars</span> <span class="o">=</span> <span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">" "</span><span class="p">,</span> <span class="s">"!"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"!"</span><span class="p">,</span> <span class="s">"</span><span class="se">\"</span><span class="s">"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"</span><span
class="se">\"</span><span class="s">"</span><span class="p">,</span> <span class="s">"#"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"#"</span><span class="p">,</span> <span class="s">"$"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"$"</span><span class="p">,</span> <span class="s">"%"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))</span> <span class="n">'</span><span class="p">(</span><span class="s">"%"</span><span class="p">,</span> <span class="s">"&amp;"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"&amp;"</span><span class="p">,</span> <span class="s">"'"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"'"</span><span class="p">,</span> <span class="s">"("</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"("</span><span class="p">,</span> <span class="s">")"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">")"</span><span class="p">,</span> <span class="s">"*"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"*"</span><span 
class="p">,</span> <span class="s">"+"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)))</span> <span class="n">'</span><span class="p">(</span><span class="s">"+"</span><span class="p">,</span> <span class="s">","</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">","</span><span class="p">,</span> <span class="s">"-"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"-"</span><span class="p">,</span> <span class="s">"."</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"."</span><span class="p">,</span> <span class="s">"/"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"/"</span><span class="p">,</span> <span class="s">"0"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"0"</span><span class="p">,</span> <span class="s">"1"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))</span> <span class="n">'</span><span class="p">(</span><span class="s">"1"</span><span class="p">,</span> <span class="s">"2"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span 
class="p">(</span><span class="s">"2"</span><span class="p">,</span> <span class="s">"3"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"3"</span><span class="p">,</span> <span class="s">"4"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"4"</span><span class="p">,</span> <span class="s">"5"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"5"</span><span class="p">,</span> <span class="s">"6"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"6"</span><span class="p">,</span> <span class="s">"7"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))))</span> <span class="n">'</span><span class="p">(</span><span class="s">"7"</span><span class="p">,</span> <span class="s">"8"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"8"</span><span class="p">,</span> <span class="s">"9"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"9"</span><span class="p">,</span> <span class="s">":"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span 
class="s">":"</span><span class="p">,</span> <span class="s">";"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">";"</span><span class="p">,</span> <span class="s">"&lt;"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"&lt;"</span><span class="p">,</span> <span class="s">"="</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))</span> <span class="n">'</span><span class="p">(</span><span class="s">"="</span><span class="p">,</span> <span class="s">"&gt;"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"&gt;"</span><span class="p">,</span> <span class="s">"?"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"?"</span><span class="p">,</span> <span class="s">"@"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"@"</span><span class="p">,</span> <span class="s">"A"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"A"</span><span class="p">,</span> <span class="s">"B"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"B"</span><span class="p">,</span> 
<span class="s">"C"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)))</span> <span class="n">'</span><span class="p">(</span><span class="s">"C"</span><span class="p">,</span> <span class="s">"D"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"D"</span><span class="p">,</span> <span class="s">"E"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"E"</span><span class="p">,</span> <span class="s">"F"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"F"</span><span class="p">,</span> <span class="s">"G"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"G"</span><span class="p">,</span> <span class="s">"H"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"H"</span><span class="p">,</span> <span class="s">"I"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))</span> <span class="n">'</span><span class="p">(</span><span class="s">"I"</span><span class="p">,</span> <span class="s">"J"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span 
class="p">(</span><span class="s">"J"</span><span class="p">,</span> <span class="s">"K"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"K"</span><span class="p">,</span> <span class="s">"L"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"L"</span><span class="p">,</span> <span class="s">"M"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"M"</span><span class="p">,</span> <span class="s">"N"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"N"</span><span class="p">,</span> <span class="s">"O"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)))))</span> <span class="n">'</span><span class="p">(</span><span class="s">"O"</span><span class="p">,</span> <span class="s">"P"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"P"</span><span class="p">,</span> <span class="s">"Q"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="s">"R"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span 
class="n">'</span><span class="p">(</span><span class="s">"R"</span><span class="p">,</span> <span class="s">"S"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"S"</span><span class="p">,</span> <span class="s">"T"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"T"</span><span class="p">,</span> <span class="s">"U"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))</span> <span class="n">'</span><span class="p">(</span><span class="s">"U"</span><span class="p">,</span> <span class="s">"V"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"V"</span><span class="p">,</span> <span class="s">"W"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"W"</span><span class="p">,</span> <span class="s">"X"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"X"</span><span class="p">,</span> <span class="s">"Y"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"Y"</span><span class="p">,</span> <span class="s">"Z"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span 
class="s">"Z"</span><span class="p">,</span> <span class="s">"["</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)))</span> <span class="n">'</span><span class="p">(</span><span class="s">"["</span><span class="p">,</span> <span class="s">"</span><span class="se">\\</span><span class="s">"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"</span><span class="se">\\</span><span class="s">"</span><span class="p">,</span> <span class="s">"]"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"]"</span><span class="p">,</span> <span class="s">"^"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"^"</span><span class="p">,</span> <span class="s">"_"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"_"</span><span class="p">,</span> <span class="s">"`"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"`"</span><span class="p">,</span> <span class="s">"a"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))</span> <span class="n">'</span><span class="p">(</span><span class="s">"a"</span><span class="p">,</span> <span class="s">"b"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span 
class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"b"</span><span class="p">,</span> <span class="s">"c"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"c"</span><span class="p">,</span> <span class="s">"d"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"d"</span><span class="p">,</span> <span class="s">"e"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"e"</span><span class="p">,</span> <span class="s">"f"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"f"</span><span class="p">,</span> <span class="s">"g"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))))</span> <span class="n">'</span><span class="p">(</span><span class="s">"g"</span><span class="p">,</span> <span class="s">"h"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"h"</span><span class="p">,</span> <span class="s">"i"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"i"</span><span class="p">,</span> <span class="s">"j"</span><span 
class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"j"</span><span class="p">,</span> <span class="s">"k"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"k"</span><span class="p">,</span> <span class="s">"l"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"l"</span><span class="p">,</span> <span class="s">"m"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))</span> <span class="n">'</span><span class="p">(</span><span class="s">"m"</span><span class="p">,</span> <span class="s">"n"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"n"</span><span class="p">,</span> <span class="s">"o"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"o"</span><span class="p">,</span> <span class="s">"p"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"p"</span><span class="p">,</span> <span class="s">"q"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"q"</span><span class="p">,</span> <span class="s">"r"</span><span class="p">)</span> <span class="kt">'Leaf</span><span 
class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"r"</span><span class="p">,</span> <span class="s">"s"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)))</span> <span class="n">'</span><span class="p">(</span><span class="s">"s"</span><span class="p">,</span> <span class="s">"t"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"t"</span><span class="p">,</span> <span class="s">"u"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"u"</span><span class="p">,</span> <span class="s">"v"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"v"</span><span class="p">,</span> <span class="s">"w"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"w"</span><span class="p">,</span> <span class="s">"x"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"x"</span><span class="p">,</span> <span class="s">"y"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))</span> <span class="n">'</span><span class="p">(</span><span class="s">"y"</span><span class="p">,</span> <span class="s">"z"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span 
class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"z"</span><span class="p">,</span> <span class="s">"{"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"{"</span><span class="p">,</span> <span class="s">"|"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"|"</span><span class="p">,</span> <span class="s">"}"</span><span class="p">)</span> <span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"}"</span><span class="p">,</span> <span class="s">"~"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"~"</span><span class="p">,</span> <span class="s">"~"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)))))</span></code></pre></figure> <p>(I generated this structure with the help of other type families, but found that inlining the result into the source file results in much faster lookups.)</p> <p>Note that each node contains two consecutive characters, which makes it easy to decide when to stop: we are done when the first element is less than our input string and the second element is greater than it.</p> <p>The <code class="language-plaintext highlighter-rouge">Lookup</code> type family (and <code class="language-plaintext highlighter-rouge">Lookup2</code>, to make up for a lack of local declarations in type families) implements a standard binary search.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="kt">LookupTable</span> <span class="o">=</span> <span class="kt">Tree</span> <span class="p">(</span><span class="kt">Symbol</span><span class="p">,</span> <span class="kt">Symbol</span><span class="p">)</span>

<span class="kr">type</span> <span class="n">family</span> <span class="kt">Lookup</span> <span class="p">(</span><span class="n">x</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">xs</span> <span class="o">::</span> <span class="kt">LookupTable</span><span class="p">)</span> <span class="o">::</span> <span class="kt">Symbol</span> <span class="kr">where</span>
  <span class="kt">Lookup</span> <span class="n">x</span> <span class="p">(</span><span class="kt">Node</span> <span class="n">l</span> <span class="n">'</span><span class="p">(</span><span class="n">cl</span><span class="p">,</span> <span class="n">cr</span><span class="p">)</span> <span class="n">r</span><span class="p">)</span> <span class="o">=</span> <span class="kt">Lookup2</span> <span class="p">(</span><span class="kt">CmpSymbol</span> <span class="n">cl</span> <span class="n">x</span><span class="p">)</span> <span class="p">(</span><span class="kt">CmpSymbol</span> <span class="n">cr</span> <span class="n">x</span><span class="p">)</span> <span class="n">x</span> <span class="n">cl</span> <span class="n">l</span> <span class="n">r</span>

<span class="kr">type</span> <span class="n">family</span> <span class="kt">Lookup2</span> <span class="n">ol</span> <span class="n">or</span> <span class="n">x</span> <span class="n">cl</span> <span class="n">l</span> <span class="n">r</span> <span class="o">::</span> <span class="kt">Symbol</span> <span class="kr">where</span>
  <span class="kt">Lookup2</span> <span class="kt">'EQ</span> <span class="kr">_</span> <span class="kr">_</span> <span class="n">cl</span> <span class="kr">_</span> <span class="kr">_</span> <span class="o">=</span> <span class="n">cl</span> <span class="c1">-- character matches</span>
  <span class="kt">Lookup2</span> <span class="kt">'LT</span> <span class="kt">'GT</span> <span class="kr">_</span> <span class="n">cl</span> <span class="kr">_</span> <span class="n">r</span> <span class="o">=</span> <span class="n">cl</span> <span class="c1">-- found the right node</span>
  <span class="kt">Lookup2</span> <span class="kt">'LT</span> <span class="kr">_</span> <span class="kr">_</span> <span class="n">cl</span> <span class="kr">_</span> <span class="kt">'Leaf</span> <span class="o">=</span> <span class="n">cl</span> <span class="c1">-- we're at the rightmost node (~)</span>
  <span class="kt">Lookup2</span> <span class="kt">'LT</span> <span class="kr">_</span> <span class="n">x</span> <span class="kr">_</span> <span class="kr">_</span> <span class="n">r</span> <span class="o">=</span> <span class="kt">Lookup</span> <span class="n">x</span> <span class="n">r</span> <span class="c1">-- go right</span>
  <span class="kt">Lookup2</span> <span class="kt">'GT</span> <span class="kr">_</span> <span class="n">x</span> <span class="kr">_</span> <span class="n">l</span> <span class="kr">_</span> <span class="o">=</span> <span class="kt">Lookup</span> <span class="n">x</span> <span class="n">l</span> <span class="c1">-- go left</span></code></pre></figure> <p>Finally, <code class="language-plaintext highlighter-rouge">Head</code> is just a lookup in the binary tree.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="kt">Head</span> <span class="n">sym</span> <span class="o">=</span> <span class="kt">Lookup</span> <span class="n">sym</span> <span class="kt">Chars</span></code></pre></figure> <figure class="highlight"><pre><code class="language-text" data-lang="text">&gt;&gt;&gt; :kind! 
Head "Wurble" = "W"</code></pre></figure> <h2 id="uncons">Uncons</h2> <p>Next, we need to make the <code class="language-plaintext highlighter-rouge">AppendSymbol</code> constraint interact with <code class="language-plaintext highlighter-rouge">Head</code>. For this, we turn to a type class, <code class="language-plaintext highlighter-rouge">Uncons</code>:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">Uncons</span> <span class="p">(</span><span class="n">sym</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">h</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">t</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="kr">where</span> <span class="n">uncons</span> <span class="o">::</span> <span class="kt">Proxy</span> <span class="n">'</span><span class="p">(</span><span class="n">h</span><span class="p">,</span> <span class="n">t</span><span class="p">)</span></code></pre></figure> <p><code class="language-plaintext highlighter-rouge">sym</code> is our symbol, <code class="language-plaintext highlighter-rouge">h</code> is the head, and <code class="language-plaintext highlighter-rouge">t</code> is the tail. 
It would be nice to have a functional dependency <code class="language-plaintext highlighter-rouge">sym -&gt; h t</code>, but unfortunately we can’t get that to typecheck: recall that the backwards dependencies of <code class="language-plaintext highlighter-rouge">AppendSymbol</code> are essentially hidden from the type system.</p> <p>We write a single instance, which sets up the right constraints:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="p">(</span> <span class="n">h</span> <span class="o">~</span> <span class="kt">Head</span> <span class="n">sym</span> <span class="p">,</span> <span class="kt">AppendSymbol</span> <span class="n">h</span> <span class="n">t</span> <span class="o">~</span> <span class="n">sym</span> <span class="p">)</span> <span class="o">=&gt;</span> <span class="kt">Uncons</span> <span class="n">sym</span> <span class="n">h</span> <span class="n">t</span> <span class="kr">where</span> <span class="n">uncons</span> <span class="o">=</span> <span class="kt">Proxy</span></code></pre></figure> <p>First, we write <code class="language-plaintext highlighter-rouge">h ~ Head sym</code>, which unifies <code class="language-plaintext highlighter-rouge">h</code> with the first character of the symbol using the binary lookup defined previously. 
Then, the <code class="language-plaintext highlighter-rouge">AppendSymbol h t ~ sym</code> constraint will trigger the solution of <code class="language-plaintext highlighter-rouge">t</code>, due to the now known prefix <code class="language-plaintext highlighter-rouge">h</code>.</p> <p>The <code class="language-plaintext highlighter-rouge">uncons</code> member is not necessary for things to work out, but it helps illustrate the working of the type class in the REPL:</p> <figure class="highlight"><pre><code class="language-text" data-lang="text">&gt;&gt;&gt; :t uncons @"foo" uncons @"foo" :: Proxy '("f", "oo")</code></pre></figure> <p>Finally, we can write the <code class="language-plaintext highlighter-rouge">Listify</code> class to recursively break down a symbol into a list of characters:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">Listify</span> <span class="p">(</span><span class="n">sym</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">result</span> <span class="o">::</span> <span class="p">[</span><span class="kt">Symbol</span><span class="p">])</span> <span class="kr">where</span> <span class="n">listify</span> <span class="o">::</span> <span class="kt">Proxy</span> <span class="n">result</span> <span class="kr">instance</span> <span class="cp">{-# OVERLAPPING #-}</span> <span class="n">nil</span> <span class="o">~</span> <span class="n">'</span><span class="kt">[]</span> <span class="o">=&gt;</span> <span class="kt">Listify</span> <span class="s">""</span> <span class="n">nil</span> <span class="kr">where</span> <span class="n">listify</span> <span class="o">=</span> <span class="kt">Proxy</span> <span class="kr">instance</span> <span class="p">(</span> <span class="kt">Uncons</span> <span class="n">sym</span> <span class="n">h</span> <span class="n">t</span> <span class="p">,</span> <span 
class="kt">Listify</span> <span class="n">t</span> <span class="n">result</span><span class="p">,</span> <span class="n">result'</span> <span class="o">~</span> <span class="p">(</span><span class="n">h</span> <span class="n">'</span><span class="o">:</span> <span class="n">result</span><span class="p">)</span> <span class="p">)</span> <span class="o">=&gt;</span> <span class="kt">Listify</span> <span class="n">sym</span> <span class="n">result'</span> <span class="kr">where</span> <span class="n">listify</span> <span class="o">=</span> <span class="kt">Proxy</span></code></pre></figure> <figure class="highlight"><pre><code class="language-text" data-lang="text">&gt;&gt;&gt; :t listify @"Hello" listify @"Hello" :: Proxy '["H", "e", "l", "l", "o"]</code></pre></figure> <p>And with this, we can parse anything we’d like.</p> <h1 id="conclusion">Conclusion</h1> <p>Of course all of the above could be done a lot more efficiently with compiler support, and there’s no reason for that not to happen at some point in the future. This post is just a proof of concept that something like this is already possible today, and the presented technique is suitable for some lightweight applications. 
For anything larger scale, Template Haskell is probably much better suited for the job today.</p> <p><a href="https://kcsongor.github.io/symbol-parsing-haskell/">Parsing type-level strings in Haskell</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on November 28, 2018.</p> <![CDATA[Deriving Bifunctor with Generics]]> https://kcsongor.github.io/generic-deriving-bifunctor 2018-01-01T00:00:00-00:00 2017-12-31T00:00:00+00:00 Csongor Kiss https://kcsongor.github.io <section id="table-of-contents" class="toc"> <header> <h3><i class="fa fa-book"></i> Overview</h3> </header> <div id="drawer"> <ul id="markdown-toc"> <li><a href="#the-problem" id="markdown-toc-the-problem">The problem</a></li> <li><a href="#the-solution" id="markdown-toc-the-solution">The solution</a> <ul> <li><a href="#the-boring-instances" id="markdown-toc-the-boring-instances">The boring instances</a></li> <li><a href="#incoherent-instances" id="markdown-toc-incoherent-instances">Incoherent instances</a></li> <li><a href="#default-signatures" id="markdown-toc-default-signatures">Default signatures</a></li> <li><a href="#a-few-more-instances" id="markdown-toc-a-few-more-instances">A few more instances</a></li> </ul> </li> <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li> <li><a href="#acknowledgements" id="markdown-toc-acknowledgements">Acknowledgements</a></li> </ul> </div> </section> <!-- /#table-of-contents --> <p>Recently, I’ve been experimenting with deriving various type class instances generically, and seeing how far we can go before having to resort to TemplateHaskell. This post is a showcase of one such experiment: deriving <a href="https://hackage.haskell.org/package/bifunctors">Bifunctor</a>, a type class that ranges over types of kind <code class="language-plaintext highlighter-rouge">* -&gt; * -&gt; *</code>, something <code class="language-plaintext highlighter-rouge">GHC.Generics</code> is known not to be well suited for. 
The accompanying source code can be found in <a href="https://gist.github.com/kcsongor/a8cb718f676c6ca1d999bfc56def9b7b">this gist</a>.</p> <h2 id="the-problem">The problem</h2> <p>The <a href="https://hackage.haskell.org/package/base-4.10.1.0/docs/GHC-Generics.html">GHC.Generics</a> module defines two representations: <code class="language-plaintext highlighter-rouge">Generic</code> and <code class="language-plaintext highlighter-rouge">Generic1</code>. The former is used to describe types of kind <code class="language-plaintext highlighter-rouge">*</code>, while the latter is used for <code class="language-plaintext highlighter-rouge">* -&gt; *</code>. For example, the <code class="language-plaintext highlighter-rouge">Generic1</code> representation is used in the <a href="http://hackage.haskell.org/package/generic-deriving-1.12/docs/Generics-Deriving-Functor.html">generic-deriving</a> package’s Functor derivation.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">GFunctor</span> <span class="p">(</span><span class="n">f</span> <span class="o">::</span> <span class="o">*</span> <span class="o">-&gt;</span> <span class="o">*</span><span class="p">)</span> <span class="kr">where</span> <span class="n">gmap</span> <span class="o">::</span> <span class="p">(</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="n">b</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">f</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="n">f</span> <span class="n">b</span></code></pre></figure> <p>Then instances are defined for the generic building blocks. 
Whenever we have a <code class="language-plaintext highlighter-rouge">GFunctor (Rep1 f)</code>, we can turn that into a <code class="language-plaintext highlighter-rouge">Functor f</code>.</p> <p>With this, it’s possible to derive many useful instances of classes that range over <code class="language-plaintext highlighter-rouge">*</code> or <code class="language-plaintext highlighter-rouge">* -&gt; *</code>. However, there’s no <code class="language-plaintext highlighter-rouge">Generic2</code>, so if we try to adapt <code class="language-plaintext highlighter-rouge">generic-deriving</code>’s Functor approach to Bifunctors, we’ll run into problems.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">Bifunctor</span> <span class="p">(</span><span class="n">p</span> <span class="o">::</span> <span class="o">*</span> <span class="o">-&gt;</span> <span class="o">*</span> <span class="o">-&gt;</span> <span class="o">*</span><span class="p">)</span> <span class="kr">where</span> <span class="n">bimap</span> <span class="o">::</span> <span class="p">(</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="n">b</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="n">c</span> <span class="o">-&gt;</span> <span class="n">d</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">p</span> <span class="n">a</span> <span class="n">c</span> <span class="o">-&gt;</span> <span class="n">p</span> <span class="n">b</span> <span class="n">d</span></code></pre></figure> <p>The type parameter <code class="language-plaintext highlighter-rouge">p</code> takes two arguments, but the generic <code class="language-plaintext highlighter-rouge">Rep</code> and <code class="language-plaintext highlighter-rouge">Rep1</code> representations are strictly <code class="language-plaintext highlighter-rouge">* -&gt; *</code> (in the case of 
<code class="language-plaintext highlighter-rouge">Rep</code>, the type parameter is phantom – it’s only there so that much of the structure of <code class="language-plaintext highlighter-rouge">Rep</code> and <code class="language-plaintext highlighter-rouge">Rep1</code> can be shared, and <code class="language-plaintext highlighter-rouge">Rep1</code> requires <code class="language-plaintext highlighter-rouge">* -&gt; *</code>). This means that even if we defined a <code class="language-plaintext highlighter-rouge">GBifunctor</code>, we would need to require a <code class="language-plaintext highlighter-rouge">GBifunctor (Rep2 p)</code> which we could then turn into a <code class="language-plaintext highlighter-rouge">Bifunctor p</code>. Alas, <code class="language-plaintext highlighter-rouge">Rep2</code> doesn’t exist.</p> <p>Indeed, the deriving mechanism in the bifunctors package uses TH.</p> <h2 id="the-solution">The solution</h2> <p>The solution is inspired by how lenses implement polymorphic updates. 
The idea is that a <code class="language-plaintext highlighter-rouge">Lens s t a b</code> focuses on the <code class="language-plaintext highlighter-rouge">a</code> inside some structure <code class="language-plaintext highlighter-rouge">s</code>, and if we swap that <code class="language-plaintext highlighter-rouge">a</code> with a <code class="language-plaintext highlighter-rouge">b</code>, we get a <code class="language-plaintext highlighter-rouge">t</code>.</p> <p>Since we’re talking about Bifunctors now, we need two more type variables:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">GBifunctor</span> <span class="n">s</span> <span class="n">t</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span> <span class="n">gbimap</span> <span class="o">::</span> <span class="p">(</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="n">b</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="n">c</span> <span class="o">-&gt;</span> <span class="n">d</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">s</span> <span class="n">x</span> <span class="o">-&gt;</span> <span class="n">t</span> <span class="n">x</span></code></pre></figure> <p><code class="language-plaintext highlighter-rouge">s</code> and <code class="language-plaintext highlighter-rouge">t</code> will be the generic representations, which means they are of kind <code class="language-plaintext highlighter-rouge">* -&gt; *</code>. 
However, we’re going to be using <code class="language-plaintext highlighter-rouge">Generic</code> instead of <code class="language-plaintext highlighter-rouge">Generic1</code>, so the type parameter <code class="language-plaintext highlighter-rouge">x</code> is not used.</p> <p>Unlike the <code class="language-plaintext highlighter-rouge">GFunctor</code> class, which looked exactly like <code class="language-plaintext highlighter-rouge">Functor</code>, this one looks quite different from <code class="language-plaintext highlighter-rouge">Bifunctor</code>. It is also important to note that <code class="language-plaintext highlighter-rouge">gbimap</code>’s type signature is more polymorphic than that of <code class="language-plaintext highlighter-rouge">bimap</code>, so we need to ensure that our instances are properly parametric.</p> <p class="notice">In an earlier version of this class, I had functional dependencies on the class that expressed this interrelation between the type variables, but I had to drop them so that more interesting instances could be defined (more on this later).</p> <h3 id="the-boring-instances">The boring instances</h3> <p>The first instance simply looks through the metadata node.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="kt">GBifunctor</span> <span class="n">s</span> <span class="n">t</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="o">=&gt;</span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">M1</span> <span class="n">k</span> <span class="n">m</span> <span class="n">s</span><span class="p">)</span> <span class="p">(</span><span class="kt">M1</span> <span class="n">k</span> <span class="n">m</span> <span class="n">t</span><span class="p">)</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span 
class="kr">where</span> <span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="o">=</span> <span class="kt">M1</span> <span class="o">.</span> <span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="o">.</span> <span class="n">unM1</span></code></pre></figure> <p>A sum <code class="language-plaintext highlighter-rouge">l :+: r</code> can be turned into <code class="language-plaintext highlighter-rouge">l' :+: r'</code> if we can turn <code class="language-plaintext highlighter-rouge">l</code> into <code class="language-plaintext highlighter-rouge">l'</code> and <code class="language-plaintext highlighter-rouge">r</code> into <code class="language-plaintext highlighter-rouge">r'</code>.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="p">(</span> <span class="kt">GBifunctor</span> <span class="n">l</span> <span class="n">l'</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="p">,</span> <span class="kt">GBifunctor</span> <span class="n">r</span> <span class="n">r'</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="p">)</span> <span class="o">=&gt;</span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="n">l</span> <span class="o">:+:</span> <span class="n">r</span><span class="p">)</span> <span class="p">(</span><span class="n">l'</span> <span class="o">:+:</span> <span class="n">r'</span><span class="p">)</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span> <span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="p">(</span><span class="kt">L1</span> <span class="n">l</span><span class="p">)</span> <span class="o">=</span> <span 
class="kt">L1</span> <span class="p">(</span><span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="n">l</span><span class="p">)</span> <span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="p">(</span><span class="kt">R1</span> <span class="n">r</span><span class="p">)</span> <span class="o">=</span> <span class="kt">R1</span> <span class="p">(</span><span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="n">r</span><span class="p">)</span></code></pre></figure> <p>And similarly, for products.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="p">(</span> <span class="kt">GBifunctor</span> <span class="n">l</span> <span class="n">l'</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="p">,</span> <span class="kt">GBifunctor</span> <span class="n">r</span> <span class="n">r'</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="p">)</span> <span class="o">=&gt;</span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="n">l</span> <span class="o">:*:</span> <span class="n">r</span><span class="p">)</span> <span class="p">(</span><span class="n">l'</span> <span class="o">:*:</span> <span class="n">r'</span><span class="p">)</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span> <span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="p">(</span><span class="n">l</span> <span class="o">:*:</span> <span class="n">r</span><span class="p">)</span> <span class="o">=</span> <span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="n">l</span> <span class="o">:*:</span> <span 
class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="n">r</span></code></pre></figure> <p>The last boring instance is for unit types, which are trivially Bifunctors.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="kt">GBifunctor</span> <span class="kt">U1</span> <span class="kt">U1</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span> <span class="n">gbimap</span> <span class="kr">_</span> <span class="kr">_</span> <span class="o">=</span> <span class="n">id</span></code></pre></figure> <h3 id="incoherent-instances">Incoherent instances</h3> <p>With all of the gluing out of the way, we can now get to the meat of the problem: the actual fields in the constructors. When considering a field, there are three cases:</p> <p>The field is of type <code class="language-plaintext highlighter-rouge">a</code>, and we apply the first function to turn it into a <code class="language-plaintext highlighter-rouge">b</code>.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="cp">{-# INCOHERENT #-}</span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="n">a</span><span class="p">)</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="n">b</span><span class="p">)</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span> <span class="n">gbimap</span> <span class="n">f</span> <span class="kr">_</span> <span class="p">(</span><span class="kt">K1</span> <span class="n">a</span><span class="p">)</span> <span class="o">=</span> <span class="kt">K1</span> <span class="p">(</span><span class="n">f</span> <span class="n">a</span><span class="p">)</span></code></pre></figure> 
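<p>As an aside, it may help to see where such a <code class="language-plaintext highlighter-rouge">Rec0 a</code> field sits in a concrete representation. The following is a minimal sketch, not taken from the accompanying gist: <code class="language-plaintext highlighter-rouge">leftField</code> is a hypothetical helper introduced only for illustration. It peels the <code class="language-plaintext highlighter-rouge">Generic</code> representation of <code class="language-plaintext highlighter-rouge">Either</code> by hand, passing through the same <code class="language-plaintext highlighter-rouge">M1</code>, <code class="language-plaintext highlighter-rouge">:+:</code> and <code class="language-plaintext highlighter-rouge">K1</code> layers that the instances above traverse.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">import GHC.Generics

-- Hypothetical helper, for illustration only (not part of the post's code).
-- Modulo metadata, Rep (Either a c) has the shape
--   M1 (M1 (M1 (K1 a)) :+: M1 (M1 (K1 c)))
-- and Rec0 is a synonym for K1 R, so the innermost K1 is the "field".
leftField :: Either a c -&gt; Maybe a
leftField e = case from e of
  M1 (L1 (M1 (M1 (K1 x)))) -&gt; Just x   -- datatype, constructor and selector metadata, then the field
  M1 (R1 _)                -&gt; Nothing  -- the Right constructor lives in the other summand</code></pre></figure> <p>Under these assumptions, <code class="language-plaintext highlighter-rouge">leftField (Left 'x')</code> reaches the <code class="language-plaintext highlighter-rouge">K1</code> node wrapping <code class="language-plaintext highlighter-rouge">'x'</code>, which is exactly the node the <code class="language-plaintext highlighter-rouge">Rec0 a</code> instance rewrites with the first function.</p> 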
<p>Similarly, if it’s a <code class="language-plaintext highlighter-rouge">c</code>, we turn it into a <code class="language-plaintext highlighter-rouge">d</code> using the second function.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="cp">{-# INCOHERENT #-}</span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="n">c</span><span class="p">)</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="n">d</span><span class="p">)</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span> <span class="n">gbimap</span> <span class="kr">_</span> <span class="n">g</span> <span class="p">(</span><span class="kt">K1</span> <span class="n">a</span><span class="p">)</span> <span class="o">=</span> <span class="kt">K1</span> <span class="p">(</span><span class="n">g</span> <span class="n">a</span><span class="p">)</span></code></pre></figure> <p>Finally, the field is neither <code class="language-plaintext highlighter-rouge">a</code>, nor <code class="language-plaintext highlighter-rouge">c</code>, so we just leave it alone.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="cp">{-# INCOHERENT #-}</span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="n">x</span><span class="p">)</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="n">x</span><span class="p">)</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span> <span class="n">gbimap</span> <span class="kr">_</span> <span class="kr">_</span> <span class="o">=</span> <span class="n">id</span></code></pre></figure> <p>Note that these instances need to be defined with 
<code class="language-plaintext highlighter-rouge">{-# INCOHERENT #-}</code> pragmas. This is required because neither of <code class="language-plaintext highlighter-rouge">(Rec0 a) (Rec0 b) a b c d</code> and <code class="language-plaintext highlighter-rouge">(Rec0 c) (Rec0 d) a b c d</code> is more specific than the other.</p> <p>However, in our case, this is not a problem, because we’re going to invoke instance resolution with polymorphic arguments, so there will be exactly one instance that matches.</p> <h3 id="default-signatures">Default signatures</h3> <p>We can now revise our original class definition, and add a default signature (<code class="language-plaintext highlighter-rouge">DefaultSignatures</code>). This will make <code class="language-plaintext highlighter-rouge">Bifunctor</code> derivable with <code class="language-plaintext highlighter-rouge">DeriveAnyClass</code>.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">Bifunctor</span> <span class="n">p</span> <span class="kr">where</span> <span class="n">bimap</span> <span class="o">::</span> <span class="p">(</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="n">b</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="n">c</span> <span class="o">-&gt;</span> <span class="n">d</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">p</span> <span class="n">a</span> <span class="n">c</span> <span class="o">-&gt;</span> <span class="n">p</span> <span class="n">b</span> <span class="n">d</span> <span class="kr">default</span> <span class="n">bimap</span> <span class="o">::</span> <span class="p">(</span> <span class="kt">Generic</span> <span class="p">(</span><span class="n">p</span> <span class="n">a</span> <span class="n">c</span><span class="p">)</span> <span class="p">,</span> <span class="kt">Generic</span> <span 
class="p">(</span><span class="n">p</span> <span class="n">b</span> <span class="n">d</span><span class="p">)</span> <span class="p">,</span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">Rep</span> <span class="p">(</span><span class="n">p</span> <span class="n">a</span> <span class="n">c</span><span class="p">))</span> <span class="p">(</span><span class="kt">Rep</span> <span class="p">(</span><span class="n">p</span> <span class="n">b</span> <span class="n">d</span><span class="p">))</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="p">)</span> <span class="o">=&gt;</span> <span class="p">(</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="n">b</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="n">c</span> <span class="o">-&gt;</span> <span class="n">d</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">p</span> <span class="n">a</span> <span class="n">c</span> <span class="o">-&gt;</span> <span class="n">p</span> <span class="n">b</span> <span class="n">d</span> <span class="n">bimap</span> <span class="n">f</span> <span class="n">g</span> <span class="o">=</span> <span class="n">to</span> <span class="o">.</span> <span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="o">.</span> <span class="n">from</span></code></pre></figure> <p>Note the line <code class="language-plaintext highlighter-rouge">GBifunctor (Rep (p a c)) (Rep (p b d)) a b c d</code>. Here’s where we establish the relationship between the types. 
This now allows us to derive a <code class="language-plaintext highlighter-rouge">Bifunctor</code> instance for <code class="language-plaintext highlighter-rouge">Either</code>:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">deriving</span> <span class="kr">instance</span> <span class="kt">Bifunctor</span> <span class="kt">Either</span></code></pre></figure> <p>For example, when looking at the <code class="language-plaintext highlighter-rouge">Left</code> constructor, the compiler will try to find an instance for <code class="language-plaintext highlighter-rouge">GBifunctor (Rec0 a) (Rec0 b) a b c d</code>. There is exactly one instance that matches this, so our incoherent instance will not bite us. This is important: if instead we wanted an instance for a concrete type, say, <code class="language-plaintext highlighter-rouge">Either Int Int</code>, all of our incoherent instances would match, and an arbitrary one would be picked. However, we avoid this problem by ensuring that the instance is derived for the aforementioned polymorphic form.</p> <p>With this, we have a correct implementation of <code class="language-plaintext highlighter-rouge">bimap</code> for <code class="language-plaintext highlighter-rouge">Either</code>:</p> <figure class="highlight"><pre><code class="language-txt" data-lang="txt">&gt;&gt;&gt; bimap show (+ 10) (Left 10) Left "10" &gt;&gt;&gt; bimap show (+ 10) (Right 10) Right 20</code></pre></figure> <p>Even better, compiled with <code class="language-plaintext highlighter-rouge">-O1</code>, all of the overhead from using generics is optimised away:</p> <figure class="highlight"><pre><code class="language-txt" data-lang="txt">$fBifunctorEither_$cbimap = \ @ a_a3EL @ b_a3EM @ c_a3EN @ d_a3EO f_X1EN g_X1EP eta_B1 -&gt; case eta_B1 of { Left g1_a3X5 -&gt; Left (f_X1EN g1_a3X5); Right g1_a3X8 -&gt; Right (g_X1EP g1_a3X8) }</code></pre></figure> <h3 id="a-few-more-instances">A few more 
instances</h3> <p>The above deriving mechanism is naive: it only looks at fields whose type is exactly <code class="language-plaintext highlighter-rouge">a</code> or <code class="language-plaintext highlighter-rouge">b</code>. But we can do better: what if the field is a <code class="language-plaintext highlighter-rouge">Maybe a</code>? Surely we can turn that into a <code class="language-plaintext highlighter-rouge">Maybe b</code>. Or if it’s an <code class="language-plaintext highlighter-rouge">Either a b</code>, we can turn that into an <code class="language-plaintext highlighter-rouge">Either c d</code>, since it has a <code class="language-plaintext highlighter-rouge">Bifunctor</code> instance.</p> <p>The following five instances do exactly that.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="cp">{-# INCOHERENT #-}</span> <span class="kt">Bifunctor</span> <span class="n">f</span> <span class="o">=&gt;</span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">a</span> <span class="n">c</span><span class="p">))</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">b</span> <span class="n">d</span><span class="p">))</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span> <span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="p">(</span><span class="kt">K1</span> <span class="n">a</span><span class="p">)</span> <span class="o">=</span> <span class="kt">K1</span> <span class="p">(</span><span class="n">bimap</span> <span class="n">f</span> <span class="n">g</span> <span class="n">a</span><span class="p">)</span> <span class="kr">instance</span> <span class="cp">{-# INCOHERENT #-}</span> <span 
class="kt">Functor</span> <span class="n">f</span> <span class="o">=&gt;</span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">c</span><span class="p">))</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">d</span><span class="p">))</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span> <span class="n">gbimap</span> <span class="kr">_</span> <span class="n">g</span> <span class="p">(</span><span class="kt">K1</span> <span class="n">a</span><span class="p">)</span> <span class="o">=</span> <span class="kt">K1</span> <span class="p">(</span><span class="n">fmap</span> <span class="n">g</span> <span class="n">a</span><span class="p">)</span> <span class="kr">instance</span> <span class="cp">{-# INCOHERENT #-}</span> <span class="kt">Functor</span> <span class="n">f</span> <span class="o">=&gt;</span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">a</span><span class="p">))</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">b</span><span class="p">))</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span> <span class="n">gbimap</span> <span class="n">f</span> <span class="kr">_</span> <span class="p">(</span><span class="kt">K1</span> <span class="n">a</span><span class="p">)</span> <span class="o">=</span> <span class="kt">K1</span> <span class="p">(</span><span class="n">fmap</span> <span class="n">f</span> <span class="n">a</span><span class="p">)</span> <span class="kr">instance</span> <span class="cp">{-# INCOHERENT #-}</span> <span 
class="kt">Bifunctor</span> <span class="n">f</span> <span class="o">=&gt;</span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">a</span> <span class="n">a</span><span class="p">))</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">b</span> <span class="n">b</span><span class="p">))</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span> <span class="n">gbimap</span> <span class="n">f</span> <span class="kr">_</span> <span class="p">(</span><span class="kt">K1</span> <span class="n">a</span><span class="p">)</span> <span class="o">=</span> <span class="kt">K1</span> <span class="p">(</span><span class="n">bimap</span> <span class="n">f</span> <span class="n">f</span> <span class="n">a</span><span class="p">)</span> <span class="kr">instance</span> <span class="cp">{-# INCOHERENT #-}</span> <span class="kt">Bifunctor</span> <span class="n">f</span> <span class="o">=&gt;</span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">c</span> <span class="n">c</span><span class="p">))</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">d</span> <span class="n">d</span><span class="p">))</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span> <span class="n">gbimap</span> <span class="kr">_</span> <span class="n">g</span> <span class="p">(</span><span class="kt">K1</span> <span class="n">b</span><span class="p">)</span> <span class="o">=</span> <span class="kt">K1</span> <span class="p">(</span><span class="n">bimap</span> <span class="n">g</span> <span 
class="n">g</span> <span class="n">b</span><span class="p">)</span></code></pre></figure> <p>Now we can derive even more interesting <code class="language-plaintext highlighter-rouge">Bifunctor</code> instances.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">T</span> <span class="n">a</span> <span class="n">b</span> <span class="o">=</span> <span class="kt">T1</span> <span class="p">(</span><span class="kt">Maybe</span> <span class="n">a</span><span class="p">)</span> <span class="n">a</span> <span class="p">(</span><span class="kt">Either</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="o">|</span> <span class="kt">T2</span> <span class="p">(</span><span class="kt">Maybe</span> <span class="n">b</span><span class="p">)</span> <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Generic</span><span class="p">,</span> <span class="kt">Bifunctor</span><span class="p">)</span></code></pre></figure> <h2 id="conclusion">Conclusion</h2> <p>We have seen a technique for approximating a hypothetical <code class="language-plaintext highlighter-rouge">Generic2</code> representation using only <code class="language-plaintext highlighter-rouge">Generic</code>. 
Of course, there was nothing specific about the number 2; we can easily generalise this to any fixed number of parameters.</p> <p>I’m planning on writing a post about a further generalisation of this idea, which allows us to talk about types that have an arbitrary number of type parameters (unlike here, where it’s a fixed number), which I used in the <a href="https://hackage.haskell.org/package/generic-lens">generic-lens</a> library, to allow for type-changing lenses over any type parameter (thanks to the more elaborate extra machinery, there is no need for incoherent instance resolution).</p> <p>It would be interesting to see how far this can be pushed before hitting a roadblock that would truly require a bespoke <code class="language-plaintext highlighter-rouge">GenericN</code> representation.</p> <h2 id="acknowledgements">Acknowledgements</h2> <p>Thanks to <a href="https://github.com/adituv">@adituv</a> for pointing out that two instances were missing.</p> <p><a href="https://kcsongor.github.io/generic-deriving-bifunctor/">Deriving Bifunctor with Generics</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on December 31, 2017.</p> <![CDATA[Announcing generic-lens 0.5.0.0]]> https://kcsongor.github.io/generic-lens 2017-12-10T00:00:00-00:00 2017-12-10T00:00:00+00:00 Csongor Kiss https://kcsongor.github.io <section id="table-of-contents" class="toc"> <header> <h3><i class="fa fa-book"></i> Overview</h3> </header> <div id="drawer"> <ul id="markdown-toc"> <li><a href="#overview" id="markdown-toc-overview">Overview</a> <ul> <li><a href="#examples" id="markdown-toc-examples">Examples</a> <ul> <li><a href="#field" id="markdown-toc-field">field</a></li> <li><a href="#typed" id="markdown-toc-typed">typed</a></li> <li><a href="#position" id="markdown-toc-position">position</a></li> <li><a href="#super-row-polymorphism" id="markdown-toc-super-row-polymorphism">super (row polymorphism)</a></li> <li><a href="#_ctor" 
id="markdown-toc-_ctor">_Ctor</a></li> </ul> </li> <li><a href="#mtl" id="markdown-toc-mtl">mtl</a></li> </ul> </li> <li><a href="#performance" id="markdown-toc-performance">Performance</a></li> <li><a href="#quick-note-migration" id="markdown-toc-quick-note-migration">Quick note (migration)</a></li> <li><a href="#acknowledgements" id="markdown-toc-acknowledgements">Acknowledgements</a></li> </ul> </div> </section> <!-- /#table-of-contents --> <p>The <a href="https://hackage.haskell.org/package/generic-lens">generic-lens</a> library provides utilities for deriving various optics for your datatypes, using <code class="language-plaintext highlighter-rouge">GHC.Generics</code>. In this post I’ll go over some of the features and provide examples of using them.</p> <h2 id="overview">Overview</h2> <p>Lenses have proven to be an extremely powerful tool in the Haskell ecosystem. <code class="language-plaintext highlighter-rouge">generic-lens</code> uses <code class="language-plaintext highlighter-rouge">GHC.Generics</code> to derive lenses and prisms on the fly, only when they are needed. These optics are highly polymorphic, and can be used with all types that are of the right shape. 
Extra care has been taken to keep type errors readable.</p> <h3 id="examples">Examples</h3> <p>To get started, we will need the following extensions:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="cp">{-# LANGUAGE DataKinds #-}</span> <span class="cp">{-# LANGUAGE DeriveGeneric #-}</span> <span class="cp">{-# LANGUAGE FlexibleContexts #-}</span> <span class="cp">{-# LANGUAGE TypeApplications #-}</span> <span class="cp">{-# LANGUAGE TypeFamilies #-}</span></code></pre></figure> <p>And the following imports</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">import</span> <span class="nn">Control.Lens</span> <span class="kr">import</span> <span class="nn">Data.Generics.Product</span> <span class="kr">import</span> <span class="nn">GHC.Generics</span></code></pre></figure> <p>Consider the following datatype:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">Human</span> <span class="n">a</span> <span class="o">=</span> <span class="kt">Human</span> <span class="p">{</span> <span class="n">name</span> <span class="o">::</span> <span class="kt">String</span> <span class="p">,</span> <span class="n">age</span> <span class="o">::</span> <span class="kt">Int</span> <span class="p">,</span> <span class="n">address</span> <span class="o">::</span> <span class="kt">String</span> <span class="p">,</span> <span class="n">other</span> <span class="o">::</span> <span class="n">a</span> <span class="p">}</span> <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Generic</span><span class="p">,</span> <span class="kt">Show</span><span class="p">)</span></code></pre></figure> <h4 id="field">field</h4> <p>We can access the <code class="language-plaintext highlighter-rouge">name</code> field:</p> <figure class="highlight"><pre><code class="language-txt" data-lang="txt">&gt;&gt;&gt; 
Human "John" 18 "London" True ^. field @"name" "John"</code></pre></figure> <p>We can update fields too, even changing types where possible (when the type of the field is a type parameter of the datatype):</p> <figure class="highlight"><pre><code class="language-txt" data-lang="txt">&gt;&gt;&gt; Human "John" 18 "London" True &amp; field @"other" %~ show Human {name = "John", age = 18, address = "London", other = "True"}</code></pre></figure> <p>In case of sum types, it only makes sense to have a lens on the fields that appear in every constructor. Trying to use <code class="language-plaintext highlighter-rouge">field</code> to get a lens for a partial field is a type error.</p> <p class="notice">Note that the <code class="language-plaintext highlighter-rouge">field</code> lens works with <code class="language-plaintext highlighter-rouge">DuplicateRecordFields</code>, which means that record fields can actually be shared, and we can get a reusable lens for all cases without code duplication.</p> <h4 id="typed">typed</h4> <p>We can directly reference a field by its type, as long as the type is unique in the structure.</p> <figure class="highlight"><pre><code class="language-txt" data-lang="txt">&gt;&gt;&gt; Human "John" 18 "London" True ^. typed @Bool True</code></pre></figure> <figure class="highlight"><pre><code class="language-txt" data-lang="txt">&gt;&gt;&gt; Human "John" 18 "London" True ^. typed @String &lt;interactive&gt;:34:34: error: • The type Human Bool contains multiple values of type [Char]. The choice of value is thus ambiguous. The offending constructors are: • Human • In the second argument of ‘(^.)’, namely ‘typed @String’ In the expression: Human "John" 18 "London" True ^. typed @String In an equation for ‘it’: it = Human "John" 18 "London" True ^. 
typed @String</code></pre></figure> <h4 id="position">position</h4> <p>When the above two fail, and we have a product type, we can specify the field of interest by its position.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">MyTuple</span> <span class="n">a</span> <span class="n">b</span> <span class="o">=</span> <span class="kt">MyTuple</span> <span class="n">a</span> <span class="n">b</span> <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Generic</span><span class="p">,</span> <span class="kt">Show</span><span class="p">)</span></code></pre></figure> <figure class="highlight"><pre><code class="language-txt" data-lang="txt">&gt;&gt;&gt; MyTuple 10 20 &amp; position @1 .~ "hello" MyTuple "hello" 20</code></pre></figure> <h4 id="super-row-polymorphism">super (row polymorphism)</h4> <p>Given two records, where the set of fields of one is a subset of that of the other, we can talk about a structural subtype relationship. 
The <code class="language-plaintext highlighter-rouge">super</code> lens allows us to treat the subtype as the supertype - without forgetting the original structure.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">Small</span> <span class="o">=</span> <span class="kt">Small</span> <span class="p">{</span> <span class="n">small</span> <span class="o">::</span> <span class="kt">Int</span> <span class="p">}</span> <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Generic</span><span class="p">,</span> <span class="kt">Show</span><span class="p">)</span> <span class="kr">data</span> <span class="kt">Large</span> <span class="o">=</span> <span class="kt">Large</span> <span class="p">{</span> <span class="n">small</span> <span class="o">::</span> <span class="kt">Int</span> <span class="p">,</span> <span class="n">large</span> <span class="o">::</span> <span class="kt">String</span> <span class="p">}</span> <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Generic</span><span class="p">,</span> <span class="kt">Show</span><span class="p">)</span> <span class="n">smallFun</span> <span class="o">::</span> <span class="kt">Small</span> <span class="o">-&gt;</span> <span class="kt">Small</span> <span class="n">smallFun</span> <span class="p">(</span><span class="kt">Small</span> <span class="n">n</span><span class="p">)</span> <span class="o">=</span> <span class="kt">Small</span> <span class="p">(</span><span class="n">n</span> <span class="o">+</span> <span class="mi">10</span><span class="p">)</span></code></pre></figure> <p>(Here, we need the <code class="language-plaintext highlighter-rouge">{-# LANGUAGE DuplicateRecordFields #-}</code> extension in addition to the previous ones.)</p> <figure class="highlight"><pre><code class="language-txt" data-lang="txt">&gt;&gt;&gt; Large 10 "foo" &amp; super %~ smallFun Large {small = 20, large = 
"foo"}</code></pre></figure> <p>Or we can simply upcast:</p> <figure class="highlight"><pre><code class="language-txt" data-lang="txt">&gt;&gt;&gt; Large 10 "foo" ^. super :: Small Small {small = 10}</code></pre></figure> <figure class="highlight"><pre><code class="language-txt" data-lang="txt">&gt;&gt;&gt; Small 10 ^. super :: Large &lt;interactive&gt;:53:13: error: • The type 'Small' is not a subtype of 'Large'. The following fields are missing from 'Small': • large</code></pre></figure> <h4 id="_ctor">_Ctor</h4> <p>We can also obtain prisms that focus on individual constructors:</p> <figure class="highlight"><pre><code class="language-txt" data-lang="txt">&gt;&gt;&gt; Human "John" 18 "London" True ^? _Ctor @"Human" Just ("John",18,"London",True)</code></pre></figure> <figure class="highlight"><pre><code class="language-txt" data-lang="txt">&gt;&gt;&gt; Human "John" 18 "London" True ^? _Ctor @"Human" . position @3 Just "London"</code></pre></figure> <h3 id="mtl">mtl</h3> <p>So far, we haven’t provided any type signatures. Indeed, everything can be inferred by the compiler. 
However, because these combinators are highly polymorphic, it might be interesting to use them in a polymorphic context.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">f</span> <span class="o">::</span> <span class="p">(</span><span class="kt">MonadReader</span> <span class="n">env</span> <span class="n">m</span><span class="p">,</span> <span class="kt">HasField'</span> <span class="s">"username"</span> <span class="n">env</span> <span class="kt">String</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="n">m</span> <span class="kt">String</span> <span class="n">f</span> <span class="o">=</span> <span class="n">view</span> <span class="p">(</span><span class="n">field</span> <span class="o">@</span><span class="s">"username"</span><span class="p">)</span></code></pre></figure> <p>This function is now polymorphic not just in the monad stack it will eventually run in, but also in the type of the environment.</p> <p>The type of <code class="language-plaintext highlighter-rouge">field</code> is</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">field</span> <span class="o">::</span> <span class="kt">HasField</span> <span class="n">field</span> <span class="n">s</span> <span class="n">t</span> <span class="n">a</span> <span class="n">b</span> <span class="o">=&gt;</span> <span class="kt">Lens</span> <span class="n">s</span> <span class="n">t</span> <span class="n">a</span> <span class="n">b</span></code></pre></figure> <p><code class="language-plaintext highlighter-rouge">HasField'</code> (similarly to <code class="language-plaintext highlighter-rouge">Lens'</code>) is a type synonym for <code class="language-plaintext highlighter-rouge">HasField field s s a a</code>.</p> <p>For a more comprehensive overview and more examples, please have a look at the library on <a href="https://hackage.haskell.org/package/generic-lens">hackage</a>, or on <a 
href="https://github.com/kcsongor/generic-lens">github</a>.</p> <h2 id="performance">Performance</h2> <p>An important question when evaluating such high-level abstractions is whether the abstraction comes at the cost of performance. Fortunately, GHC optimises away all of the overhead of the generic transformations, leaving us with code that is equivalent to what we would’ve written manually.</p> <p>This can be verified by comparing the generated core of both the manually written lens and the generated one. However, it happened multiple times during development that a small change (such as eta-reduction) broke the optimisation. Joachim Breitner’s excellent <a href="https://github.com/nomeata/inspection-testing">inspection-testing</a> tool, which is now integrated into the automated test suite, makes sure that the optimisation happens by doing this comparison automatically. This tool has been invaluable in ensuring the performance guarantees, without having to manually inspect the generated core after every single commit. The tests can be found <a href="https://github.com/kcsongor/generic-lens/blob/master/test/Spec.hs">here</a>.</p> <p class="notice">It’s important to mention that as of this release, only the lenses are optimised away completely; the prisms still have some leftover overhead. This is planned to be fixed in a future release.</p> <h2 id="quick-note-migration">Quick note (migration)</h2> <p>In case you were already using the library, there are some breaking changes in <code class="language-plaintext highlighter-rouge">0.5.0.0</code>. Namely, all the <code class="language-plaintext highlighter-rouge">Has*</code> classes have been extended from 3 type parameters to 5. 
Auxiliary constraint synonyms are provided, and migration should be relatively simple:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">f</span> <span class="o">::</span> <span class="kt">HasField</span> <span class="n">field</span> <span class="n">a</span> <span class="n">record</span> <span class="o">=&gt;</span> <span class="o">...</span></code></pre></figure> <p>becomes</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">f</span> <span class="o">::</span> <span class="kt">HasField'</span> <span class="n">field</span> <span class="n">record</span> <span class="n">a</span> <span class="o">=&gt;</span> <span class="o">...</span></code></pre></figure> <p class="notice">Notice the <code class="language-plaintext highlighter-rouge">'</code> at the end of the class name, and the swapping of the last two arguments.</p> <h2 id="acknowledgements">Acknowledgements</h2> <p>Thanks to Matthew Pickering for useful comments on a draft of this post.</p> <p><a href="https://kcsongor.github.io/generic-lens/">Announcing generic-lens 0.5.0.0</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on December 10, 2017.</p> <![CDATA[Well-typed printfs cannot go wrong]]> https://kcsongor.github.io/purescript-safe-printf 2017-9-25T00:00:00-00:00 2017-09-25T00:00:00+00:00 Csongor Kiss https://kcsongor.github.io <section id="table-of-contents" class="toc"> <header> <h3><i class="fa fa-book"></i> Overview</h3> </header> <div id="drawer"> <ul id="markdown-toc"> <li><a href="#the-problem" id="markdown-toc-the-problem">The problem</a></li> <li><a href="#type-level-parsing" id="markdown-toc-type-level-parsing">Type-level parsing</a></li> <li><a href="#how-the-sausage-gets-made-computing-the-output-type" id="markdown-toc-how-the-sausage-gets-made-computing-the-output-type">How the sausage gets made: computing the output type</a></li> <li><a 
href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li> </ul> </div> </section> <!-- /#table-of-contents --> <p>One of the classic examples that keeps coming up when talking about dependently typed programming languages is the “safe” <code class="language-plaintext highlighter-rouge">printf</code> function – one that ensures that the number and type of arguments match the requirement in the format specification.</p> <p>In languages like Idris, this is just a function that takes a format string, and returns the type of arguments required for constructing the formatted output string.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"> <span class="n">format</span> <span class="s">"A number: %d, and a string: %s"</span> <span class="o">:</span> <span class="kt">Int</span> <span class="o">-&gt;</span> <span class="kt">String</span> <span class="o">-&gt;</span> <span class="kt">String</span></code></pre></figure> <p>Other languages, like Rust, solve this by various means of metaprogramming: writing a program (macro) that runs at compile-time, generating the program to be executed at runtime.</p> <p>What these two approaches have in common is that they both operate on strings that are statically available to the compiler. 
The aim of this post is to show another way of achieving the same result, with tools that are available in PureScript – a strongly-typed functional language, with no dependent types.</p> <h2 id="the-problem">The problem</h2> <p>We want to write a program that takes a format string, some number of arguments, and returns the result of inserting the arguments at their specified places in the format string, and does all this in a type-safe way.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">&gt;</span> <span class="o">:</span><span class="n">t</span> <span class="n">format</span> <span class="o">@</span><span class="s">"Wurble %d %d %s"</span> <span class="kt">Int</span> <span class="o">-&gt;</span> <span class="kt">Int</span> <span class="o">-&gt;</span> <span class="kt">String</span> <span class="o">-&gt;</span> <span class="kt">String</span></code></pre></figure> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">&gt;</span> <span class="n">format</span> <span class="o">@</span><span class="s">"Wurble %d %d %s"</span> <span class="mi">10</span> <span class="mi">20</span> <span class="s">"foo"</span> <span class="s">"Wurble 10 20 foo"</span></code></pre></figure> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">&gt;</span> <span class="n">format</span> <span class="o">@</span><span class="s">"Wurble %d %d %s"</span> <span class="mi">10</span> <span class="mi">20</span> <span class="mi">30</span> <span class="kt">Error</span> <span class="n">found</span><span class="o">:</span> <span class="kt">Could</span> <span class="n">not</span> <span class="n">match</span> <span class="kr">type</span> <span class="kt">String</span> <span class="n">with</span> <span class="kr">type</span> <span class="kt">Int</span> <span class="n">while</span> <span class="n">trying</span> <span class="n">to</span> <span class="n">match</span> 
<span class="kr">type</span> <span class="kt">Function</span> <span class="kt">String</span> <span class="n">with</span> <span class="kr">type</span> <span class="kt">Function</span> <span class="kt">Int</span></code></pre></figure> <p class="notice">The <code class="language-plaintext highlighter-rouge">@</code> symbol before the string is the proxy syntax introduced in 0.12 which provides a concise way of passing types around. The format strings are actually type-level literals – but more on this later.</p> <p>Crucially, we need to compute a type from some input, but because PureScript has no dependent types, values and functions in the traditional sense are not available for evaluation at compile-time. However, there is a way to interact with the compiler: via the type-checker.</p> <p>The solution therefore is to encode this computation in the types, and have the type-checker evaluate it for us as part of type-checking. Luckily, PureScript allows string literals in types (these are types whose kind is <code class="language-plaintext highlighter-rouge">Symbol</code>).</p> <p>Thus, constructing our <code class="language-plaintext highlighter-rouge">printf</code> function comprises two steps:</p> <ul> <li>parse the input <code class="language-plaintext highlighter-rouge">Symbol</code> into a list of format tokens</li> <li>generate the function from the format list that will then assemble the output string</li> </ul> <h2 id="type-level-parsing">Type-level parsing</h2> <p>For the sake of simplicity, we’re going to focus on two types of format specifiers: decimals (<code class="language-plaintext highlighter-rouge">%d</code>) and strings (<code class="language-plaintext highlighter-rouge">%s</code>).</p> <p>We represent these cases with a <em>custom kind</em>, which is like a regular algebraic datatype, but lifted to the type-level. 
This means that these constructors can be used <em>in types</em>.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">foreign</span> <span class="kr">import</span> <span class="nn">kind</span> <span class="kt">Specifier</span> <span class="n">foreign</span> <span class="kr">import</span> <span class="nn">data</span> <span class="kt">D</span> <span class="o">::</span> <span class="kt">Specifier</span> <span class="n">foreign</span> <span class="kr">import</span> <span class="nn">data</span> <span class="kt">S</span> <span class="o">::</span> <span class="kt">Specifier</span> <span class="n">foreign</span> <span class="kr">import</span> <span class="nn">data</span> <span class="kt">Lit</span> <span class="o">::</span> <span class="kt">Symbol</span> <span class="o">-&gt;</span> <span class="kt">Specifier</span></code></pre></figure> <p>Of course, apart from the format specifiers <code class="language-plaintext highlighter-rouge">%d</code> and <code class="language-plaintext highlighter-rouge">%s</code>, everything else is a literal, which we account for by wrapping them in the <code class="language-plaintext highlighter-rouge">Lit</code> type constructor.</p> <p class="notice">The <code class="language-plaintext highlighter-rouge">foreign import</code> bit means that we’re introducing types here that have no constructors. That is to say, it’s impossible to construct a value of type <code class="language-plaintext highlighter-rouge">D</code> and <code class="language-plaintext highlighter-rouge">S</code>. 
We’ll see later how it is still possible to carry these types around in terms (hint: proxies).</p> <p>Furthermore, we need a way of representing a sequence of these specifiers, for which we introduce another kind:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">foreign</span> <span class="kr">import</span> <span class="nn">kind</span> <span class="kt">FList</span> <span class="n">foreign</span> <span class="kr">import</span> <span class="nn">data</span> <span class="kt">FNil</span> <span class="o">::</span> <span class="kt">FList</span> <span class="n">foreign</span> <span class="kr">import</span> <span class="nn">data</span> <span class="kt">FCons</span> <span class="o">::</span> <span class="kt">Specifier</span> <span class="o">-&gt;</span> <span class="kt">FList</span> <span class="o">-&gt;</span> <span class="kt">FList</span></code></pre></figure> <p>With this, we can now write types like <code class="language-plaintext highlighter-rouge">FCons D (FCons (Lit " foo") FNil)</code>, corresponding to the string <code class="language-plaintext highlighter-rouge">%d foo</code>.</p> <p class="notice">Kind-polymorphism is not supported by the current version (0.12) of PureScript, so we can’t define a parametric type-level list once and for all – we need a new one for each type we want to store in lists. With this, and some syntactic sugar, we would be able to write (as we can in Haskell today) <code class="language-plaintext highlighter-rouge">[D, "foo"]</code>. This limitation is likely to be removed in a future version of the compiler.</p> <p>With these building blocks defined, now we have a vocabulary for talking about the parser itself: it is a function that takes a <code class="language-plaintext highlighter-rouge">Symbol</code> as an input, and returns a <code class="language-plaintext highlighter-rouge">FList</code>. 
We encode the computation in the following type class:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">Parse</span> <span class="p">(</span><span class="n">string</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">format</span> <span class="o">::</span> <span class="kt">FList</span><span class="p">)</span> <span class="o">|</span> <span class="n">string</span> <span class="o">-&gt;</span> <span class="n">format</span></code></pre></figure> <p>The functional dependency <code class="language-plaintext highlighter-rouge">string -&gt; format</code> states that the input <code class="language-plaintext highlighter-rouge">string</code> determines the output <code class="language-plaintext highlighter-rouge">format</code>. This bit is crucial, as it is what tells the compiler that knowing <code class="language-plaintext highlighter-rouge">string</code> is sufficient to determine <code class="language-plaintext highlighter-rouge">format</code>. 
It is then our task to ensure that this dependency indeed holds, when writing out the instances.</p> <p>To deconstruct the input symbol, we use the following type class available in 0.12:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">ConsSymbol</span> <span class="p">(</span><span class="n">head</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">tail</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">sym</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="o">|</span> <span class="n">sym</span> <span class="o">-&gt;</span> <span class="n">head</span> <span class="n">tail</span><span class="p">,</span> <span class="n">head</span> <span class="n">tail</span> <span class="o">-&gt;</span> <span class="n">sym</span></code></pre></figure> <p>The interesting functional dependency here is the <code class="language-plaintext highlighter-rouge">sym -&gt; head tail</code>, which, given some symbol, deconstructs it into its <code class="language-plaintext highlighter-rouge">head</code> (the first character) and its <code class="language-plaintext highlighter-rouge">tail</code> – the rest.</p> <p>The parser is like a state machine, with the following legal states:</p> <ul> <li>State 1: found a non-<code class="language-plaintext highlighter-rouge">%</code> character</li> <li>State 2: found a <code class="language-plaintext highlighter-rouge">%</code> character</li> </ul> <p>One possible way of representing these states is by having a separate type class to deal with each.</p> <p>Since in our simplified example, we know that the specifier symbols can only be single characters, we can define the second state as:</p> <figure class="highlight"><pre><code class="language-haskell" 
data-lang="haskell"><span class="kr">class</span> <span class="kt">Parse2</span> <span class="p">(</span><span class="n">head</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">out</span> <span class="o">::</span> <span class="kt">Specifier</span><span class="p">)</span> <span class="o">|</span> <span class="n">head</span> <span class="o">-&gt;</span> <span class="n">out</span></code></pre></figure> <p>That is, it takes a symbol, and returns the matching specifier. The implementation is straightforward:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="n">parse2D</span> <span class="o">::</span> <span class="kt">Parse2</span> <span class="s">"d"</span> <span class="kt">D</span> <span class="kr">instance</span> <span class="n">parse2S</span> <span class="o">::</span> <span class="kt">Parse2</span> <span class="s">"s"</span> <span class="kt">S</span></code></pre></figure> <p>This is a partial function, which means that format strings that contain unsupported specifier tokens will simply fail to compile.</p> <p>The first state is more complicated, as it can consume an arbitrary number of characters, so we pass it the remaining string (<code class="language-plaintext highlighter-rouge">tail</code>) as well.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">Parse1</span> <span class="p">(</span><span class="n">head</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">tail</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">out</span> <span class="o">::</span> <span class="kt">FList</span><span class="p">)</span> <span class="o">|</span> <span class="n">head</span> <span 
class="n">tail</span> <span class="o">-&gt;</span> <span class="n">out</span></code></pre></figure> <p><code class="language-plaintext highlighter-rouge">Parse1</code> represents the parsing state where we have the current character <code class="language-plaintext highlighter-rouge">head</code>, the rest of the input string <code class="language-plaintext highlighter-rouge">tail</code>, and we know that the previous character was not a <code class="language-plaintext highlighter-rouge">%</code>.</p> <p>The first case is when the tail is empty. In this case, we just return the current character as the literal in a singleton list:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="n">parse1Nil</span> <span class="o">::</span> <span class="kt">Parse1</span> <span class="n">a</span> <span class="s">""</span> <span class="p">(</span><span class="kt">FCons</span> <span class="p">(</span><span class="kt">Lit</span> <span class="n">a</span><span class="p">)</span> <span class="kt">FNil</span><span class="p">)</span></code></pre></figure> <p>The second case is more interesting. This is when we find a <code class="language-plaintext highlighter-rouge">%</code>, so we need to invoke the other function, <code class="language-plaintext highlighter-rouge">Parse2</code>, which handles parsing the specifier itself. To do that, we use <code class="language-plaintext highlighter-rouge">ConsSymbol</code> to split our current tail <code class="language-plaintext highlighter-rouge">s</code> into its head <code class="language-plaintext highlighter-rouge">h</code> and tail <code class="language-plaintext highlighter-rouge">t</code>. <code class="language-plaintext highlighter-rouge">h</code> contains the format specifier, which we pass on to <code class="language-plaintext highlighter-rouge">Parse2</code>. 
Then, recursively invoke <code class="language-plaintext highlighter-rouge">Parse</code> on <code class="language-plaintext highlighter-rouge">t</code> to parse the rest of the input. In addition to returning <code class="language-plaintext highlighter-rouge">spec</code> consed to <code class="language-plaintext highlighter-rouge">rest</code>, we also put an empty string literal at the head of the output list: this is to maintain the invariant that the head of the output list always contains a string literal. This invariant will be useful for the last case…</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">else</span> <span class="kr">instance</span> <span class="n">parse1Pc</span> <span class="o">::</span> <span class="p">(</span> <span class="kt">ConsSymbol</span> <span class="n">h</span> <span class="n">t</span> <span class="n">s</span> <span class="p">,</span> <span class="kt">Parse2</span> <span class="n">h</span> <span class="n">spec</span> <span class="p">,</span> <span class="kt">Parse</span> <span class="n">t</span> <span class="n">rest</span> <span class="p">)</span> <span class="o">=&gt;</span> <span class="kt">Parse1</span> <span class="s">"%"</span> <span class="n">s</span> <span class="p">(</span><span class="kt">FCons</span> <span class="p">(</span><span class="kt">Lit</span> <span class="s">""</span><span class="p">)</span> <span class="p">(</span><span class="kt">FCons</span> <span class="n">spec</span> <span class="n">rest</span><span class="p">))</span></code></pre></figure> <p>…when we match any other character, i.e. other than <code class="language-plaintext highlighter-rouge">%</code>. Since we’re in <code class="language-plaintext highlighter-rouge">Parse1</code>, that means that the current character needs to be in a string literal. 
For this, we first recursively parse the tail <code class="language-plaintext highlighter-rouge">s</code> into <code class="language-plaintext highlighter-rouge">FCons (Lit acc) r</code>. We need the head of the parsed remainder to be a <code class="language-plaintext highlighter-rouge">Lit</code> so that we can prepend the current character to that literal; after all, we have to rebuild long string literals character by character. This is where the invariant from the previous two cases is useful: we don’t have to handle the cases where the head is not a <code class="language-plaintext highlighter-rouge">Lit</code>, because the recursive calls guarantee that it is. <code class="language-plaintext highlighter-rouge">acc</code> is thus the tail of the string literal we’re currently parsing, so we join it with the current character via <code class="language-plaintext highlighter-rouge">ConsSymbol o acc rest</code> (recall that this type class can both construct and deconstruct symbols via its functional dependencies). 
Then we simply return <code class="language-plaintext highlighter-rouge">Lit rest</code> along with <code class="language-plaintext highlighter-rouge">r</code>.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">else</span> <span class="kr">instance</span> <span class="n">parse1Other</span> <span class="o">::</span> <span class="p">(</span> <span class="kt">Parse</span> <span class="n">s</span> <span class="p">(</span><span class="kt">FCons</span> <span class="p">(</span><span class="kt">Lit</span> <span class="n">acc</span><span class="p">)</span> <span class="n">r</span><span class="p">)</span> <span class="p">,</span> <span class="kt">ConsSymbol</span> <span class="n">o</span> <span class="n">acc</span> <span class="n">rest</span> <span class="p">)</span> <span class="o">=&gt;</span> <span class="kt">Parse1</span> <span class="n">o</span> <span class="n">s</span> <span class="p">(</span><span class="kt">FCons</span> <span class="p">(</span><span class="kt">Lit</span> <span class="n">rest</span><span class="p">)</span> <span class="n">r</span><span class="p">)</span></code></pre></figure> <p>Notice how these instances actually overlap. In the third case, we can easily imagine a particular instantiation of <code class="language-plaintext highlighter-rouge">o</code> and <code class="language-plaintext highlighter-rouge">r</code> such that it matches the instance head in the second case. In other words, when the current character is <code class="language-plaintext highlighter-rouge">%</code>, both <code class="language-plaintext highlighter-rouge">parse1Pc</code> and <code class="language-plaintext highlighter-rouge">parse1Other</code> match (because <code class="language-plaintext highlighter-rouge">parse1Other</code> is more general).</p> <p>To make sure that the instances are selected in the order we want them to be, we use instance chains. 
That is, by writing <code class="language-plaintext highlighter-rouge">instance A else instance B</code> we tell the compiler to try to match instance <code class="language-plaintext highlighter-rouge">A</code> first, and if it fails, then try <code class="language-plaintext highlighter-rouge">B</code>. This is a new feature in PureScript 0.12, and a very powerful one – it allows us to avoid the overlapping instance problem for good.</p> <p>Finally, we need to actually kick off the parser. We do this by invoking it in the first state.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="n">parseNil</span> <span class="o">::</span> <span class="kt">Parse</span> <span class="s">""</span> <span class="p">(</span><span class="kt">FCons</span> <span class="p">(</span><span class="kt">Lit</span> <span class="s">""</span><span class="p">)</span> <span class="kt">FNil</span><span class="p">)</span> <span class="kr">else</span> <span class="kr">instance</span> <span class="n">parseCons</span> <span class="o">::</span> <span class="p">(</span> <span class="kt">ConsSymbol</span> <span class="n">h</span> <span class="n">t</span> <span class="n">string</span> <span class="p">,</span> <span class="kt">Parse1</span> <span class="n">h</span> <span class="n">t</span> <span class="n">fl</span> <span class="p">)</span> <span class="o">=&gt;</span> <span class="kt">Parse</span> <span class="n">string</span> <span class="n">fl</span></code></pre></figure> <h2 id="how-the-sausage-gets-made-computing-the-output-type">How the sausage gets made: computing the output type</h2> <p>But how do we know how many arguments we need to pass to the formatter? It depends on the format string! 
No surprises here: just like all the previous type-level computations, this one will also be encoded in a type class with a functional dependency.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">FormatF</span> <span class="p">(</span><span class="n">format</span> <span class="o">::</span> <span class="kt">FList</span><span class="p">)</span> <span class="n">fun</span> <span class="o">|</span> <span class="n">format</span> <span class="o">-&gt;</span> <span class="n">fun</span> <span class="kr">where</span> <span class="n">formatF</span> <span class="o">::</span> <span class="o">@</span><span class="n">format</span> <span class="o">-&gt;</span> <span class="kt">String</span> <span class="o">-&gt;</span> <span class="n">fun</span></code></pre></figure> <p class="notice">The <code class="language-plaintext highlighter-rouge">@</code> symbol is special syntax, and in this case, it means that the <code class="language-plaintext highlighter-rouge">formatF</code> function takes an <code class="language-plaintext highlighter-rouge">FList</code> (<code class="language-plaintext highlighter-rouge">format</code>) as an input. But because <code class="language-plaintext highlighter-rouge">FList</code> is a <em>custom kind</em>, it has no value-level inhabitants. So, how can we still get something whose type mentions <code class="language-plaintext highlighter-rouge">format</code>? This is what <code class="language-plaintext highlighter-rouge">@</code> does – it’s a proxy for a type. Its value is isomorphic to <code class="language-plaintext highlighter-rouge">Unit</code>, and carries no information, other than its type. 
Notice that it works for any kind – indeed, proxies are currently a special-cased type in PureScript, in that they are kind-polymorphic.</p> <p>Thus <code class="language-plaintext highlighter-rouge">formatF</code> takes a format list, and an accumulator string, and returns some <code class="language-plaintext highlighter-rouge">fun</code> – this type depends on the actual format list.</p> <p>Starting with the base case: when there’s nothing left to print, we simply return the accumulated formatted string.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="n">formatFNil</span> <span class="o">::</span> <span class="kt">FormatF</span> <span class="kt">FNil</span> <span class="kt">String</span> <span class="kr">where</span> <span class="n">formatF</span> <span class="kr">_</span> <span class="n">str</span> <span class="o">=</span> <span class="n">str</span></code></pre></figure> <p>When the head of the list is <code class="language-plaintext highlighter-rouge">D</code>, we know that we will need an <code class="language-plaintext highlighter-rouge">Int</code> argument, and the rest of the function’s type can be computed by recursing on the tail of the list. As for the implementation, since the return type is now refined to be of the form <code class="language-plaintext highlighter-rouge">Int -&gt; fun</code>, we are allowed to construct a lambda that takes the <code class="language-plaintext highlighter-rouge">Int</code>, appends it to the end of the accumulator, and then recurses on the rest. 
The implementation of <code class="language-plaintext highlighter-rouge">S</code> is identical, and is omitted for brevity.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="n">formatFConsD</span> <span class="o">::</span> <span class="kt">FormatF</span> <span class="n">rest</span> <span class="n">fun</span> <span class="o">=&gt;</span> <span class="kt">FormatF</span> <span class="p">(</span><span class="kt">FCons</span> <span class="kt">D</span> <span class="n">rest</span><span class="p">)</span> <span class="p">(</span><span class="kt">Int</span> <span class="o">-&gt;</span> <span class="n">fun</span><span class="p">)</span> <span class="kr">where</span> <span class="n">formatF</span> <span class="kr">_</span> <span class="n">str</span> <span class="o">=</span> <span class="nf">\</span><span class="n">i</span> <span class="o">-&gt;</span> <span class="n">formatF</span> <span class="o">@</span><span class="n">rest</span> <span class="p">(</span><span class="n">str</span> <span class="o">&lt;&gt;</span> <span class="n">show</span> <span class="n">i</span><span class="p">)</span></code></pre></figure> <p>Handling literals (<code class="language-plaintext highlighter-rouge">Lit</code>) is left as an exercise for the reader.</p> <h2 id="conclusion">Conclusion</h2> <p>Finally, as a matter of convenience, we can wrap the above type classes into one, that serves as a bridge between the parser and the formatter, as such:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">Format</span> <span class="p">(</span><span class="n">string</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="n">fun</span> <span class="o">|</span> <span class="n">string</span> <span class="o">-&gt;</span> <span class="n">fun</span> <span class="kr">where</span> <span class="n">format</span> 
<span class="o">::</span> <span class="o">@</span><span class="n">string</span> <span class="o">-&gt;</span> <span class="n">fun</span> <span class="kr">instance</span> <span class="n">formatFFormat</span> <span class="o">::</span> <span class="p">(</span> <span class="kt">Parse</span> <span class="n">string</span> <span class="n">format</span> <span class="p">,</span> <span class="kt">FormatF</span> <span class="n">format</span> <span class="n">fun</span> <span class="p">)</span> <span class="o">=&gt;</span> <span class="kt">Format</span> <span class="n">string</span> <span class="n">fun</span> <span class="kr">where</span> <span class="n">format</span> <span class="kr">_</span> <span class="o">=</span> <span class="n">formatF</span> <span class="o">@</span><span class="n">format</span> <span class="s">""</span></code></pre></figure> <p>And that’s it! It might be instructive to try to work out <code class="language-plaintext highlighter-rouge">FormatF</code>’s instance resolution for a few simple examples by hand, to get a better idea of why this works. 
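</p> <p>As one such hand-worked example, here is a sketch of how resolution might proceed for the format string <code class="language-plaintext highlighter-rouge">%d</code>, using only the instances shown in this post. The <code class="language-plaintext highlighter-rouge">?fl</code> and <code class="language-plaintext highlighter-rouge">~</code> notation is not PureScript syntax; it merely marks a type that is still to be solved and what it gets solved to. The last step is an assumption: it supposes that the <code class="language-plaintext highlighter-rouge">Lit</code> instance of <code class="language-plaintext highlighter-rouge">FormatF</code> (left as an exercise above) appends the literal to the accumulator without adding an argument.</p>

<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">-- Goal: Parse "%d" ?fl
--
-- parseNil does not match ("%d" is not ""), so parseCons is chosen:
--   ConsSymbol h t "%d"   gives  h ~ "%", t ~ "d"
--   Parse1 "%" "d" ?fl
--
-- parse1Nil does not match (the tail "d" is not ""), parse1Pc does (s ~ "d"):
--   ConsSymbol h t "d"    gives  h ~ "d", t ~ ""
--   Parse2 "d" spec       gives  spec ~ D                     (parse2D)
--   Parse "" rest         gives  rest ~ FCons (Lit "") FNil   (parseNil)
--
-- Hence ?fl ~ FCons (Lit "") (FCons D (FCons (Lit "") FNil)).
--
-- FormatF then walks this list. Assuming the Lit cells add no argument,
-- FCons D contributes `Int -&gt;` and FNil ends in String,
-- so fun ~ Int -&gt; String.
</code></pre></figure>

<p>Note how both <code class="language-plaintext highlighter-rouge">Lit ""</code> cells come from the invariant that every output list starts with a literal.</p> <p>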
A fully working implementation of the code in this post can be found <a href="https://github.com/kcsongor/purescript-safe-printf">on github</a>.</p> <p><a href="https://kcsongor.github.io/purescript-safe-printf/">Well-typed printfs cannot go wrong</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on September 25, 2017.</p> <![CDATA[Time travel in Haskell for dummies]]> https://kcsongor.github.io/time-travel-in-haskell-for-dummies 2015-10-02T00:00:00-00:00 2015-10-02T00:00:00+00:00 Csongor Kiss https://kcsongor.github.io <section id="table-of-contents" class="toc"> <header> <h3><i class="fa fa-book"></i> Overview</h3> </header> <div id="drawer"> <ul id="markdown-toc"> <li><a href="#how" id="markdown-toc-how">How?</a></li> <li><a href="#the-repmax-problem" id="markdown-toc-the-repmax-problem">The repMax problem</a></li> <li><a href="#wait-what" id="markdown-toc-wait-what">Wait, what?</a></li> <li><a href="#states-travelling-back-in-time" id="markdown-toc-states-travelling-back-in-time">States travelling back in time</a> <ul> <li><a href="#what-are-states-anyway" id="markdown-toc-what-are-states-anyway">What are states anyway?</a></li> </ul> </li> <li><a href="#finally-the-time-machine-tardis" id="markdown-toc-finally-the-time-machine-tardis">Finally, the time machine, TARDIS</a> <ul> <li><a href="#a-single-pass-assembler-an-example" id="markdown-toc-a-single-pass-assembler-an-example">A single-pass assembler: an example</a></li> <li><a href="#io-doesnt-mix-with-the-future-the-past-is-fine" id="markdown-toc-io-doesnt-mix-with-the-future-the-past-is-fine">IO doesn’t mix with the future! (The past is fine)</a></li> <li><a href="#thanks" id="markdown-toc-thanks">Thanks</a></li> </ul> </li> </ul> </div> </section> <!-- /#table-of-contents --> <p>Browsing Hackage the other day, I came across the <a href="https://hackage.haskell.org/package/tardis-0.3.0.0/docs/Control-Monad-Tardis.html">Tardis Monad</a>. 
Reading its description, it turns out that the Tardis monad is capable of sending state back in time. Yep. Back in time.</p> <h2 id="how">How?</h2> <p>No, it’s not the reification of <a href="https://en.wikipedia.org/wiki/Tachyon">some hypothetical time-travelling particle</a>, but rather a really clever way of exploiting Haskell’s laziness.</p> <p>In this rather lengthy post, I’ll showcase some interesting consequences of lazy evaluation, and work our way up from simple examples to ’time travelling’ craziness through different levels of abstraction.</p> <h2 id="the-repmax-problem">The repMax problem</h2> <p>Imagine you had a list, and you wanted to replace all the elements of the list with the largest element, while passing over the list only once. You might say something like “Easier said than done, how do I know the largest element without having gone through the list first?”</p> <p>Let’s start from the beginning. First, you ask the future for the largest element of the list (don’t worry, this will make sense in a bit); let’s call this value <code class="language-plaintext highlighter-rouge">rep</code> (as in the value we replace stuff with).</p> <p>Walking through the list, you do two things:</p> <ul> <li>replace the current element with <code class="language-plaintext highlighter-rouge">rep</code></li> <li>’return’ the larger of the current element and the largest element of the remaining list.</li> </ul> <p>When only one element remains, replace it with <code class="language-plaintext highlighter-rouge">rep</code>, and return what was there originally. 
(this is the base case)</p> <p>Right, at the moment, we haven’t acquired the skill of seeing the future, so we just write the rest of the function with that bit left out.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">repMax</span> <span class="o">::</span> <span class="p">[</span><span class="kt">Int</span><span class="p">]</span> <span class="o">-&gt;</span> <span class="kt">Int</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="kt">Int</span><span class="p">,</span> <span class="p">[</span><span class="kt">Int</span><span class="p">])</span> <span class="n">repMax</span> <span class="kt">[]</span> <span class="n">rep</span> <span class="o">=</span> <span class="p">(</span><span class="n">rep</span><span class="p">,</span> <span class="kt">[]</span><span class="p">)</span> <span class="n">repMax</span> <span class="p">[</span><span class="n">x</span><span class="p">]</span> <span class="n">rep</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="p">[</span><span class="n">rep</span><span class="p">])</span> <span class="n">repMax</span> <span class="p">(</span><span class="n">l</span> <span class="o">:</span> <span class="n">ls</span><span class="p">)</span> <span class="n">rep</span> <span class="o">=</span> <span class="p">(</span><span class="n">m'</span><span class="p">,</span> <span class="n">rep</span> <span class="o">:</span> <span class="n">ls'</span><span class="p">)</span> <span class="kr">where</span> <span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">ls'</span><span class="p">)</span> <span class="o">=</span> <span class="n">repMax</span> <span class="n">ls</span> <span class="n">rep</span> <span class="n">m'</span> <span class="o">=</span> <span class="n">max</span> <span class="n">m</span> <span class="n">l</span></code></pre></figure> <p>So, it takes a list, and the rep 
element, and returns (Int, [Int])</p> <p><code class="language-plaintext highlighter-rouge">repMax [1,2,3,4,5,3] 6</code> gives us <code class="language-plaintext highlighter-rouge">(5, [6,6,6,6,6,6])</code> which is exactly what we wanted: the elements are replaced with rep and we also have the largest element. Now, all we need to do is use that largest element as <code class="language-plaintext highlighter-rouge">rep</code>:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">doRepMax</span> <span class="o">::</span> <span class="p">[</span><span class="kt">Int</span><span class="p">]</span> <span class="o">-&gt;</span> <span class="p">[</span><span class="kt">Int</span><span class="p">]</span> <span class="n">doRepMax</span> <span class="n">xs</span> <span class="o">=</span> <span class="n">xs'</span> <span class="kr">where</span> <span class="p">(</span><span class="n">largest</span><span class="p">,</span> <span class="n">xs'</span><span class="p">)</span> <span class="o">=</span> <span class="n">repMax</span> <span class="n">xs</span> <span class="n">largest</span></code></pre></figure> <h2 id="wait-what">Wait, what?</h2> <p>This can be done thanks to lazy evaluation. Haskell systems use so-called ’thunks’ for values that are yet to be evaluated. When you say <code class="language-plaintext highlighter-rouge">(min 5 6)</code>, the expression will form a thunk and not be evaluated until it really needs to be. Here, <code class="language-plaintext highlighter-rouge">rep</code> can be thought of as a reference to a thunk. When we tell GHC to put <code class="language-plaintext highlighter-rouge">largest</code> in all slots of the list, it will in fact put a reference to the same thunk in those slots, not the actual data. As we pass the list, this thunk is building up with nested <code class="language-plaintext highlighter-rouge">max</code> expressions. 
For <code class="language-plaintext highlighter-rouge">[1,2,3,4]</code>, we will end up with a thunk: <code class="language-plaintext highlighter-rouge">max 1 (max 2 (max 3 4))</code>. A reference to this thunk will be placed everywhere in the list. By the time we have finished traversing the list, the thunk will be complete too, and can be evaluated. (Before that, the thunk has a form similar to <code class="language-plaintext highlighter-rouge">max 1 (_something_)</code> where <code class="language-plaintext highlighter-rouge">_something_</code> is the max of the rest of the list. This obviously cannot be evaluated at that point.)</p> <p>How about generalising this idea to other data structures?</p> <p>There’s an old saying in the world of lists</p> <blockquote> <p>“Everything’s a fold”.</p> </blockquote> <p>Indeed, we could easily rewrite our doRepMax function using a fold:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">foldmax</span> <span class="o">::</span> <span class="p">(</span><span class="kt">Ord</span> <span class="n">a</span><span class="p">,</span> <span class="kt">Num</span> <span class="n">a</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="o">-&gt;</span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="n">foldmax</span> <span class="n">ls</span> <span class="o">=</span> <span class="n">ls'</span> <span class="kr">where</span> <span class="p">(</span><span class="n">ls'</span><span class="p">,</span> <span class="n">largest</span><span class="p">)</span> <span class="o">=</span> <span class="n">foldl</span> <span class="p">(</span><span class="nf">\</span><span class="p">(</span><span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="p">(</span><span 
class="n">largest</span> <span class="o">:</span> <span class="n">b</span><span class="p">,</span> <span class="n">max</span> <span class="n">a</span> <span class="n">c</span><span class="p">))</span> <span class="p">(</span><span class="kt">[]</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="n">ls</span></code></pre></figure> <p>Brilliant! Now we can use this technique on everything that is Foldable! Or can we?</p> <p>Taking a look at the type signature of the generalised <code class="language-plaintext highlighter-rouge">foldl</code> (from Data.Foldable): <code class="language-plaintext highlighter-rouge">Data.Foldable.foldl :: Foldable t =&gt; (b -&gt; a -&gt; b) -&gt; b -&gt; t a -&gt; b</code> we realise that the returned value’s structure <code class="language-plaintext highlighter-rouge">b</code> is independent of that of the input <code class="language-plaintext highlighter-rouge">t a</code>. The reason we could get away with this in our fold example was that we knew we were dealing with a list, so we used the <code class="language-plaintext highlighter-rouge">:</code> operator explicitly to restore the structure.</p> <p>No problem! There is a type class that does just what we want, that is, it lets us fold over a structure while keeping its shape. 
This magical class is called <code class="language-plaintext highlighter-rouge">Traversable</code>.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="cp">{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}</span> <span class="kr">data</span> <span class="kt">Tree</span> <span class="n">a</span> <span class="o">=</span> <span class="kt">Empty</span> <span class="o">|</span> <span class="kt">Leaf</span> <span class="n">a</span> <span class="o">|</span> <span class="kt">Node</span> <span class="p">(</span><span class="kt">Tree</span> <span class="n">a</span><span class="p">)</span> <span class="n">a</span> <span class="p">(</span><span class="kt">Tree</span> <span class="n">a</span><span class="p">)</span> <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Show</span><span class="p">,</span> <span class="kt">Functor</span><span class="p">,</span> <span class="kt">Foldable</span><span class="p">,</span> <span class="kt">Traversable</span><span class="p">)</span></code></pre></figure> <p>Thankfully, GHC is clever enough to derive Traversable for us from this data definition. (But it wouldn’t be too difficult to do by hand anyway.)</p> <p>Traversable data structures can do a really neat trick (among many others): <code class="language-plaintext highlighter-rouge">mapAccumR :: Traversable t =&gt; (a -&gt; b -&gt; (a, c)) -&gt; a -&gt; t b -&gt; (a, t c)</code></p> <p>This function is like combining a map with a fold (and so all Traversables also need to be Functors and Foldables). 
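</p> <p>To get a feel for <code class="language-plaintext highlighter-rouge">mapAccumR</code> before any time travel is involved, here is a minimal sketch on a plain list. The name <code class="language-plaintext highlighter-rouge">suffixSums</code> is made up for this illustration; only <code class="language-plaintext highlighter-rouge">mapAccumR</code> itself comes from <code class="language-plaintext highlighter-rouge">Data.Traversable</code>.</p>

<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">import Data.Traversable (mapAccumR)

-- Hypothetical helper: replace each element with the sum of the
-- elements to its right, and also return the total of the whole list.
-- mapAccumR threads the accumulator through from right to left.
suffixSums :: [Int] -&gt; (Int, [Int])
suffixSums = mapAccumR (\acc x -&gt; (acc + x, acc)) 0

main :: IO ()
main = print (suffixSums [1, 2, 3])  -- prints (6,[5,3,0])
</code></pre></figure>

<p>The second component of the pair returned by the lambda is what ends up in the output structure, while the first is carried along as the accumulator.</p> <p>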
We take a function <code class="language-plaintext highlighter-rouge">(a -&gt; b -&gt; (a, c))</code>, an initial <code class="language-plaintext highlighter-rouge">a</code> and a Traversable of <code class="language-plaintext highlighter-rouge">b</code>s (<code class="language-plaintext highlighter-rouge">t b</code>).</p> <p>Each element is replaced with its respective <code class="language-plaintext highlighter-rouge">c</code> (the one calculated by <code class="language-plaintext highlighter-rouge">(a -&gt; b -&gt; (a, c))</code>), so <code class="language-plaintext highlighter-rouge">c</code> is a perfect place for us to put our <code class="language-plaintext highlighter-rouge">rep</code> (the largest element in this case).</p> <p>Apart from the final Traversable <code class="language-plaintext highlighter-rouge">t c</code>, it also returns the final accumulated <code class="language-plaintext highlighter-rouge">a</code> (that’s where we get the largest element from).</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">generalMax</span> <span class="o">::</span> <span class="p">(</span><span class="kt">Traversable</span> <span class="n">t</span><span class="p">,</span> <span class="kt">Num</span> <span class="n">a</span><span class="p">,</span> <span class="kt">Ord</span> <span class="n">a</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="n">t</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="n">t</span> <span class="n">a</span> <span class="n">generalMax</span> <span class="n">t</span> <span class="o">=</span> <span class="n">xs'</span> <span class="kr">where</span> <span class="p">(</span><span class="n">largest</span><span class="p">,</span> <span class="n">xs'</span><span class="p">)</span> <span class="o">=</span> <span class="n">mapAccumR</span> <span class="p">(</span><span class="nf">\</span><span class="n">a</span> <span class="n">b</span> <span
class="o">-&gt;</span> <span class="p">(</span><span class="n">max</span> <span class="n">a</span> <span class="n">b</span><span class="p">,</span> <span class="n">largest</span><span class="p">))</span> <span class="mi">0</span> <span class="n">t</span></code></pre></figure> <p>This generalisation gives us new options! So far, we’ve used <code class="language-plaintext highlighter-rouge">a</code>, <code class="language-plaintext highlighter-rouge">b</code> and <code class="language-plaintext highlighter-rouge">c</code> at the same type (say, Int), but they don’t have to be the same.</p> <p>For instance, if we want to replace all the elements with their average, then we can accumulate the sum and the count of elements in a tuple (<code class="language-plaintext highlighter-rouge">a</code> will then take the role of this tuple) and <code class="language-plaintext highlighter-rouge">c</code> will be the sum divided by the count, for which we’re going to ask the future again!</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">generalAvg</span> <span class="o">::</span> <span class="p">(</span><span class="kt">Traversable</span> <span class="n">t</span><span class="p">,</span> <span class="kt">Integral</span> <span class="n">a</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="n">t</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="n">t</span> <span class="n">a</span> <span class="n">generalAvg</span> <span class="n">t</span> <span class="o">=</span> <span class="n">xs'</span> <span class="kr">where</span> <span class="n">avg</span> <span class="o">=</span> <span class="n">s</span> <span class="p">`</span><span class="n">div</span><span class="p">`</span> <span class="n">c</span> <span class="p">((</span><span class="n">s</span><span class="p">,</span> <span class="n">c</span><span class="p">),</span> <span class="n">xs'</span><span class="p">)</span> <span class="o">=</span>
<span class="n">mapAccumR</span> <span class="p">(</span><span class="nf">\</span><span class="p">(</span><span class="n">s'</span><span class="p">,</span> <span class="n">c'</span><span class="p">)</span> <span class="n">b</span> <span class="o">-&gt;</span> <span class="p">((</span><span class="n">s'</span> <span class="o">+</span> <span class="n">b</span><span class="p">,</span> <span class="n">c'</span> <span class="o">+</span> <span class="mi">1</span><span class="p">),</span> <span class="n">avg</span><span class="p">))</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span> <span class="n">t</span></code></pre></figure> <p>And so on, we can do all sorts of interesting things in a single traversal of our data structures.</p> <h2 id="states-travelling-back-in-time">States travelling back in time</h2> <hr /> <h5 id="what-are-states-anyway">What are states anyway?</h5> <p>In Haskell, whenever we want to write functions that operate on some sort of environment or state, we write these functions in the following form: <code class="language-plaintext highlighter-rouge">statefulFunction :: b -&gt; c -&gt; d -&gt; s -&gt; (a, s)</code>; that is, we take some arguments (<code class="language-plaintext highlighter-rouge">b</code>, <code class="language-plaintext highlighter-rouge">c</code>, <code class="language-plaintext highlighter-rouge">d</code> here), a state <code class="language-plaintext highlighter-rouge">s</code>, and return a new, possibly modified state along with some value <code class="language-plaintext highlighter-rouge">a</code>.
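</p> <p>As a minimal sketch of that shape, here is a counter threaded through three steps by hand. The names <code class="language-plaintext highlighter-rouge">tick</code> and <code class="language-plaintext highlighter-rouge">threeTicks</code> are hypothetical, chosen only for this illustration:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">-- Hypothetical stateful function: return the current counter, then bump it.
tick :: Int -&gt; (Int, Int)
tick n = (n, n + 1)

-- Every intermediate state (s1, s2, s3) has to be named and passed on manually.
threeTicks :: Int -&gt; ((Int, Int, Int), Int)
threeTicks s0 =
  let (a, s1) = tick s0
      (b, s2) = tick s1
      (c, s3) = tick s2
  in ((a, b, c), s3)
-- threeTicks 0 == ((0, 1, 2), 3)</code></pre></figure> <p>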
Now, this involves writing a lot of boilerplate code, both in the type signatures and in the actual code that is using the state.</p> <p>For example, using the state as a counter:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">statefulFunction</span> <span class="n">arg1</span> <span class="n">arg2</span> <span class="n">arg3</span> <span class="n">counter</span> <span class="o">=</span> <span class="p">(</span><span class="n">arg1</span> <span class="o">+</span> <span class="n">arg2</span> <span class="o">+</span> <span class="n">arg3</span><span class="p">,</span> <span class="n">counter</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span></code></pre></figure> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">bindStatefulFunctions</span> <span class="o">::</span> <span class="p">(</span><span class="n">s</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">s</span><span class="p">))</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="n">s</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="n">b</span><span class="p">,</span> <span class="n">s</span><span class="p">))</span> <span class="o">-&gt;</span> <span class="n">s</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="n">b</span><span class="p">,</span> <span class="n">s</span><span class="p">)</span> <span class="n">bindStatefulFunctions</span> <span class="n">f1</span> <span class="n">f2</span> <span class="o">=</span> <span class="nf">\</span><span class="n">initialState</span> <span class="o">-&gt;</span> <span class="kr">let</span> <span class="p">(</span><span class="n">result</span><span class="p">,</span> <span class="n">updatedState</span><span class="p">)</span> <span 
class="o">=</span> <span class="n">f1</span> <span class="n">initialState</span> <span class="kr">in</span> <span class="n">f2</span> <span class="n">result</span> <span class="n">updatedState</span></code></pre></figure> <p>Note that f2 takes an extra <code class="language-plaintext highlighter-rouge">a</code>: that’s the output of the first function. That’s why this function is called bind: we bind the output of the first function to the input of the second while passing the modified state along.</p> <p>The State monad essentially does something like the above code, but hides it all and makes the state passing implicit. Also, being a monad, it gives us the oh-so-convenient do notation!</p> <p><code class="language-plaintext highlighter-rouge">State s a</code> is essentially a newtype wrapper around <code class="language-plaintext highlighter-rouge">s -&gt; (a, s)</code>, so our previous example could be written as <code class="language-plaintext highlighter-rouge">statefulFunction :: b -&gt; c -&gt; d -&gt; State s a</code></p> <p>and bindStatefulFunctions we get for free from State (known as <code class="language-plaintext highlighter-rouge">&gt;&gt;=</code> for monads).</p> <p>Now we can do:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">statefulFunction</span> <span class="n">arg1</span> <span class="n">arg2</span> <span class="n">arg3</span> <span class="o">=</span> <span class="kr">do</span> <span class="n">counter</span> <span class="o">&lt;-</span> <span class="n">get</span> <span class="n">put</span> <span class="p">(</span><span class="n">counter</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="n">return</span> <span class="p">(</span><span class="n">arg1</span> <span class="o">+</span> <span class="n">arg2</span> <span class="o">+</span> <span class="n">arg3</span><span class="p">)</span></code></pre></figure> <p>(Did you know that Haskell is also the best
imperative language?) Notice how the state is not explicitly passed as an argument (thus our function is partially applied), but is bound to <code class="language-plaintext highlighter-rouge">counter</code> by the <code class="language-plaintext highlighter-rouge">get</code> function. <code class="language-plaintext highlighter-rouge">put</code> then puts the updated counter back in the state. Finally, <code class="language-plaintext highlighter-rouge">return</code> just makes sure that the value we get out is wrapped back in the State monad.</p> <hr /> <p>The nice thing about the State monad is that all the computations we do within it are essentially just partially applied functions, so they can’t be evaluated until provided with an initial state, which will then magically flow through the pipeline of computations, each making its own modifications along the way.</p> <p><code class="language-plaintext highlighter-rouge">mapAccumR</code> does a series of computations that are stateful in nature (though it’s not using the State monad): it takes a value and a state, then returns a new value with a modified state. (Accum refers to the fact that this state can be used as an accumulator as we traverse the data.)</p> <p><code class="language-plaintext highlighter-rouge">mapAccumR :: Traversable t =&gt; (a -&gt; b -&gt; (a, c)) -&gt; a -&gt; t b -&gt; (a, t c)</code></p> <p><code class="language-plaintext highlighter-rouge">a</code> is that state here; it is what we used to store the largest element. This state, however, travels forward in time, so to speak, as we go through the list. The trick we do only happens at the end, when we feed it its own output.
We can do so thanks to lazy evaluation.</p> <p>So the State monad passes its <code class="language-plaintext highlighter-rouge">s</code> from computation to computation; that’s how these computations are bound.</p> <p>Imagine using the same lazy self-feeding trick, but for passing the state:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">reverseBind</span> <span class="n">stateful1</span> <span class="n">stateful2</span> <span class="o">=</span> <span class="nf">\</span><span class="n">s</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="n">x'</span><span class="p">,</span> <span class="n">s''</span><span class="p">)</span> <span class="kr">where</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">s''</span><span class="p">)</span> <span class="o">=</span> <span class="n">stateful1</span> <span class="n">s'</span> <span class="p">(</span><span class="n">x'</span><span class="p">,</span> <span class="n">s'</span><span class="p">)</span> <span class="o">=</span> <span class="n">stateful2</span> <span class="n">x</span> <span class="n">s</span></code></pre></figure> <p>So first we run stateful1 <strong>with the state modified by stateful2</strong>! Then we run stateful2 with stateful1’s output. Finally, we return the state after running stateful1 along with the value <code class="language-plaintext highlighter-rouge">x'</code> from stateful2. Note that because of the way this binding is done, stateful1’s output state will actually be the <em>past</em> of stateful1. (That is, whatever we do with the state in stateful1 will be visible to the computations preceding stateful1, just like how stateful2’s effects are seen in stateful1.
Lazy evaluation rocks!)</p> <p>Coming from an imperative background, this can be thought of as stateful1 putting forward references to the values it uses from the state, and once those values are actually calculated in the future, stateful1 will be able to do whatever it wanted. These references are not explicit, though, as they would be in C (using pointers, for example), but implicitly placed there by GHC as thunks.</p> <p>That also means whatever we do with these values has to be done lazily. (An example follows below.)</p> <p class="notice">The above code is a modified version of the monadic binding found in the rev-state package (which is in turn a modification of the original State monad by reversing the flow of state).</p> <h2 id="finally-the-time-machine-tardis">Finally, the time machine, TARDIS</h2> <p>So we have the State monad, whose state flows forwards, then we have the Rev-State, which sends the state backwards. So what do we get if we combine these two? Yes, a time machine! Also known as the Tardis monad: it is in fact a combination of the State and Rev-State monads with some nice functions to deal with the bidirectional states.</p> <p>I say states, because naturally, we have data coming from the future and data coming from the past, and those make two (a backwards travelling and a forwards travelling state).</p> <p>These could be of different types: say, we can send Strings back in time and Ints to the future.</p> <h3 id="a-single-pass-assembler-an-example">A single-pass assembler: an example</h3> <p>Writing an assembler is relatively straightforward. We go through a list of assembly instructions and turn them into their binary equivalent for the given CPU architecture.</p> <p>However, there are some instructions that we can’t immediately convert. One such instruction is a label for branching.
(jumps) For these labels, we need a symbol table.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">import</span> <span class="k">qualified</span> <span class="nn">Data.Map.Strict</span> <span class="k">as</span> <span class="n">M</span> <span class="kr">type</span> <span class="kt">Addr</span> <span class="o">=</span> <span class="kt">Int</span> <span class="kr">type</span> <span class="kt">SymTable</span> <span class="o">=</span> <span class="kt">M</span><span class="o">.</span><span class="kt">Map</span> <span class="kt">String</span> <span class="kt">Addr</span> <span class="c1">-- map label names to their addresses</span> <span class="kr">data</span> <span class="kt">Instr</span> <span class="o">=</span> <span class="kt">Add</span> <span class="o">|</span> <span class="kt">Mov</span> <span class="o">|</span> <span class="kt">ToLabel</span> <span class="kt">String</span> <span class="o">|</span> <span class="kt">ToAddr</span> <span class="kt">Addr</span> <span class="o">|</span> <span class="kt">Label</span> <span class="kt">String</span> <span class="o">|</span> <span class="kt">Err</span> <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Show</span><span class="p">)</span></code></pre></figure> <p><code class="language-plaintext highlighter-rouge">Instr</code> is a rather rudimentary representation of assembly instructions, but it does the job for us now.</p> <p>What we want to have is a function that takes a list of <code class="language-plaintext highlighter-rouge">Instr</code>s and returns a <code class="language-plaintext highlighter-rouge">[(Addr, Instr)]</code>, replacing every <code class="language-plaintext highlighter-rouge">ToLabel</code> with a <code class="language-plaintext highlighter-rouge">ToAddr</code> that points to the address of the label. If the label is never defined, we put an <code class="language-plaintext highlighter-rouge">Err</code> there.
(In real life, you would use some ExceptT monad transformer to handle such errors.)</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">runAssembler</span> <span class="o">::</span> <span class="p">[</span><span class="kt">Instr</span><span class="p">]</span> <span class="o">-&gt;</span> <span class="p">[(</span><span class="kt">Addr</span><span class="p">,</span> <span class="kt">Instr</span><span class="p">)]</span></code></pre></figure> <p>Jumping to a label that is already defined is easy: we look it up in our SymTable and convert <code class="language-plaintext highlighter-rouge">ToLabel</code> to <code class="language-plaintext highlighter-rouge">ToAddr</code>. This sounds like an application of the State monad, doesn’t it? When we encounter a label definition, we just add it to the state (<code class="language-plaintext highlighter-rouge">SymTable</code>). Done!</p> <p>The problem arises from the fact that some labels might be defined after they are used. The ‘else’ block of an if statement is typically compiled like this. Implementing this in C, you could remember these positions and, at the end, fill in the gaps with the knowledge you have acquired. Thunks, anyone?</p> <p>I’ll just use a Rev-State monad and send these definitions back in time. Simple enough, right?</p> <p>So at this point, we can see that we will need both types of these states: one that’s travelling forward and one that is going backwards.
And that is exactly what the Tardis monad is!</p> <p class="notice">Labels will not be turned into any binary, instead the next actual instruction’s address will be used.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="kt">Assembler</span> <span class="n">a</span> <span class="o">=</span> <span class="kt">Tardis</span> <span class="kt">SymTable</span> <span class="kt">SymTable</span> <span class="n">a</span></code></pre></figure> <p>Right, our <code class="language-plaintext highlighter-rouge">runAssembler</code> function will run some <code class="language-plaintext highlighter-rouge">assemble</code> function in the Tardis monad. (That is, it will give it the initial states and extract the final value at the end).</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">runAssembler</span> <span class="n">asm</span> <span class="o">=</span> <span class="n">instructions</span> <span class="kr">where</span> <span class="p">(</span><span class="n">instructions</span><span class="p">,</span> <span class="kr">_</span><span class="p">)</span> <span class="o">=</span> <span class="n">runTardis</span> <span class="p">(</span><span class="n">assemble</span> <span class="mi">0</span> <span class="n">asm</span><span class="p">)</span> <span class="p">(</span><span class="kt">M</span><span class="o">.</span><span class="n">empty</span><span class="p">,</span> <span class="kt">M</span><span class="o">.</span><span class="n">empty</span><span class="p">)</span></code></pre></figure> <p>The <code class="language-plaintext highlighter-rouge">assemble</code> function turns a list of instructions to <code class="language-plaintext highlighter-rouge">[(Addr, Instr)]</code> in the Assembler monad (which is a synonym for Tardis SymTable SymTable). 
What’s that 0 doing there, you ask?</p> <p>We need to keep track of the address we will use for the next instruction. This is because of labels. When we encounter a regular instruction, we put that at the provided address, then increment that address by 1. If a label comes around, we put it in the State then continue without incrementing the address.</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">assemble</span> <span class="o">::</span> <span class="kt">Addr</span> <span class="o">-&gt;</span> <span class="p">[</span><span class="kt">Instr</span><span class="p">]</span> <span class="o">-&gt;</span> <span class="kt">Assembler</span> <span class="p">[(</span><span class="kt">Addr</span><span class="p">,</span> <span class="kt">Instr</span><span class="p">)]</span> <span class="n">assemble</span> <span class="kr">_</span> <span class="kt">[]</span> <span class="o">=</span> <span class="n">return</span> <span class="kt">[]</span> <span class="c1">-- label found, update state then go on</span> <span class="n">assemble</span> <span class="n">addr</span> <span class="p">(</span><span class="kt">Label</span> <span class="n">label</span> <span class="o">:</span> <span class="n">is'</span><span class="p">)</span> <span class="o">=</span> <span class="kr">do</span> <span class="n">modifyBackwards</span> <span class="p">(</span><span class="kt">M</span><span class="o">.</span><span class="n">insert</span> <span class="n">label</span> <span class="n">addr</span><span class="p">)</span> <span class="c1">-- send to past</span> <span class="n">modifyForwards</span> <span class="p">(</span><span class="kt">M</span><span class="o">.</span><span class="n">insert</span> <span class="n">label</span> <span class="n">addr</span><span class="p">)</span> <span class="c1">-- send to future</span> <span class="n">assemble</span> <span class="n">addr</span> <span class="n">is'</span> <span class="c1">-- assemble the rest of the 
instructions</span> <span class="c1">-- jump to label found, replace with</span> <span class="c1">-- jump to address</span> <span class="c1">-- then do the rest starting at (addr + 1)</span> <span class="n">assemble</span> <span class="n">addr</span> <span class="p">(</span><span class="kt">ToLabel</span> <span class="n">label</span> <span class="o">:</span> <span class="n">is'</span><span class="p">)</span> <span class="o">=</span> <span class="kr">do</span> <span class="n">bw</span> <span class="o">&lt;-</span> <span class="n">getFuture</span> <span class="n">fw</span> <span class="o">&lt;-</span> <span class="n">getPast</span> <span class="kr">let</span> <span class="n">union</span> <span class="o">=</span> <span class="kt">M</span><span class="o">.</span><span class="n">union</span> <span class="n">bw</span> <span class="n">fw</span> <span class="c1">-- take union of the two symbol tables</span> <span class="n">this</span> <span class="o">=</span> <span class="kr">case</span> <span class="kt">M</span><span class="o">.</span><span class="n">lookup</span> <span class="n">label</span> <span class="n">union</span> <span class="kr">of</span> <span class="kt">Just</span> <span class="n">a'</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="n">addr</span><span class="p">,</span> <span class="kt">ToAddr</span> <span class="n">a'</span><span class="p">)</span> <span class="kt">Nothing</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="n">addr</span><span class="p">,</span> <span class="kt">Err</span><span class="p">)</span> <span class="n">rest</span> <span class="o">&lt;-</span> <span class="n">assemble</span> <span class="p">(</span><span class="n">addr</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="n">is'</span> <span class="n">return</span> <span class="o">$</span> <span class="n">this</span> <span class="o">:</span> <span class="n">rest</span> <span class="c1">-- regular 
instruction found,</span> <span class="c1">-- assign it to the address</span> <span class="c1">-- then do the rest starting at (addr + 1)</span> <span class="n">assemble</span> <span class="n">addr</span> <span class="p">(</span><span class="n">instr</span> <span class="o">:</span> <span class="n">is'</span><span class="p">)</span> <span class="o">=</span> <span class="kr">do</span> <span class="n">rest</span> <span class="o">&lt;-</span> <span class="n">assemble</span> <span class="p">(</span><span class="n">addr</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="n">is'</span> <span class="n">return</span> <span class="o">$</span> <span class="p">(</span><span class="n">addr</span><span class="p">,</span> <span class="n">instr</span><span class="p">)</span> <span class="o">:</span> <span class="n">rest</span></code></pre></figure> <p>Now we come up with some test instructions:</p> <figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">input</span> <span class="o">::</span> <span class="p">[</span><span class="kt">Instr</span><span class="p">]</span> <span class="n">input</span> <span class="o">=</span> <span class="p">[</span><span class="kt">Add</span><span class="p">,</span> <span class="kt">Add</span><span class="p">,</span> <span class="kt">ToLabel</span> <span class="s">"my_label"</span><span class="p">,</span> <span class="kt">Mov</span><span class="p">,</span> <span class="kt">Mov</span><span class="p">,</span> <span class="kt">Label</span> <span class="s">"my_label"</span><span class="p">,</span> <span class="kt">Label</span> <span class="s">"second_label"</span><span class="p">,</span> <span class="kt">Mov</span><span class="p">,</span> <span class="kt">ToLabel</span> <span class="s">"second_label"</span><span class="p">,</span> <span class="kt">Mov</span><span class="p">]</span></code></pre></figure> <p>…and we can try running the assembler on this data:</p> <p><code 
class="language-plaintext highlighter-rouge">&gt; runAssembler input</code></p> <p><code class="language-plaintext highlighter-rouge">&gt; [(0,Add),(1,Add),(2,ToAddr 5),(3,Mov),(4,Mov),(5,Mov),(6,ToAddr 5),(7,Mov)]</code></p> <p>Yay! Just what we wanted!</p> <h3 id="io-doesnt-mix-with-the-future-the-past-is-fine">IO doesn’t mix with the future! (The past is fine)</h3> <p>Be careful about what you do with the state coming from the future. Everything has to be lazily passed through.</p> <p>You might be tempted to use the TardisT monad transformer to interleave IO effects in your time-travelling code. Most IO computations, however, are strict.</p> <p>Let’s say you want to get the label from the future and print its address. IO’s print will try to evaluate its argument (which is still an unevaluated thunk at this point). It will block the thread until the evaluation is completed, but the value can only be produced by code that runs after the print, so the program gets stuck and never progresses any further. In this case, I’d advise the use of a Writer monad, which accumulates its output lazily, so the results can be printed at the end using IO.</p> <h3 id="thanks">Thanks</h3> <p>Thanks for reading this lengthy post, in which we saw how we can mimic the use of references in pure Haskell code (although time-travel is arguably a better name for this). This comes at a price though: accumulating unevaluated thunks can use up quite a bit of memory, so be careful if you want to use these techniques in a memory-critical environment.</p> <p>If you find any bugs or mistakes, please make sure to let me know!</p> <p><a href="https://kcsongor.github.io/time-travel-in-haskell-for-dummies/">Time travel in Haskell for dummies</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on October 02, 2015.</p>