<p>Csongor Kiss, https://kcsongor.github.io/gadts-in-rust (dated 2025-11-07 and 2025-06-10 in the feed; feed generated 2025-11-10)</p>
<section id="table-of-contents" class="toc">
<header>
<h3><i class="fa fa-book"></i> Overview</h3>
</header>
<div id="drawer">
<ul id="markdown-toc">
<li><a href="#a-simple-expression-language" id="markdown-toc-a-simple-expression-language">A simple expression language</a></li>
<li><a href="#more-precise-types-with-gadts" id="markdown-toc-more-precise-types-with-gadts">More precise types with GADTs</a></li>
<li><a href="#more-flexible-types-with-gadts" id="markdown-toc-more-flexible-types-with-gadts">More flexible types with GADTs</a></li>
<li><a href="#encoding-gadts-in-rust" id="markdown-toc-encoding-gadts-in-rust">Encoding GADTs in Rust</a> <ul>
<li><a href="#type-equality-witnesses" id="markdown-toc-type-equality-witnesses">Type equality witnesses</a></li>
<li><a href="#trait-constraint-witnesses" id="markdown-toc-trait-constraint-witnesses">Trait constraint witnesses</a></li>
<li><a href="#using-specialisation-to-recover-constraints" id="markdown-toc-using-specialisation-to-recover-constraints">Using specialisation to recover constraints</a></li>
<li><a href="#why-this-works" id="markdown-toc-why-this-works">Why this works</a></li>
</ul>
</li>
<li><a href="#limitations" id="markdown-toc-limitations">Limitations</a></li>
<li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ul>
</div>
</section>
<!-- /#table-of-contents -->
<p>Rust doesn’t have GADTs (generalised algebraic data types), but we can get
surprisingly close with some creative type-level tricks.</p>
<p>This post might look like a departure from my usual (in the sense of <em>typical</em>,
not <em>frequent</em>) Haskell posts since we’ll be writing Rust today. Don’t let that
fool you; we’ll just be writing Haskell in Rust.</p>
<p>GADTs are a Haskell feature that let
constructors carry richer type information. They can enforce constraints or
refine type parameters per constructor – which is what we’ll achieve here in
Rust.</p>
<p>As this post is mainly for Rust programmers, I’ll start by motivating why GADTs
are useful.
For that, we’ll build a small expression language and see where plain algebraic
data types fall short.
Then we’ll introduce GADTs to fix the problem, first through type refinement and
then with per-constructor constraints.
After that, we’ll move to Rust and reconstruct both mechanisms: type equality
witnesses (a known trick) and constraint witnesses (the new bit this post is
really about). You’ll know when we’ve switched from Haskell to Rust — the syntax
gets ugly.</p>
<h2 id="a-simple-expression-language">A simple expression language</h2>
<p>Let’s start with a small expression language, encoded as a Haskell datatype. It
supports defining integer literals, and adding them together.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">Expr</span>
<span class="o">=</span> <span class="kt">LitInt</span> <span class="kt">Int</span>
<span class="o">|</span> <span class="kt">Add</span> <span class="kt">Expr</span> <span class="kt">Expr</span></code></pre></figure>
<p>The Rust equivalent of this is an enum with two constructors and more parentheses
(and explicit heap indirection).</p>
<p>We can evaluate expressions recursively:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">eval</span> <span class="o">::</span> <span class="kt">Expr</span> <span class="o">-></span> <span class="kt">Int</span>
<span class="n">eval</span> <span class="p">(</span><span class="kt">LitInt</span> <span class="n">n</span><span class="p">)</span> <span class="o">=</span> <span class="n">n</span>
<span class="n">eval</span> <span class="p">(</span><span class="kt">Add</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="o">=</span> <span class="n">eval</span> <span class="n">a</span> <span class="o">+</span> <span class="n">eval</span> <span class="n">b</span></code></pre></figure>
<p>Evaluating <code class="language-plaintext highlighter-rouge">eval (Add (LitInt 3) (Add (LitInt 4) (LitInt 5)))</code> yields <code class="language-plaintext highlighter-rouge">12</code>.</p>
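<p>For readers who prefer to follow along in Rust from the start, here is a minimal sketch of the same language. It assumes <code class="language-plaintext highlighter-rouge">i64</code> as the stand-in for Haskell’s <code class="language-plaintext highlighter-rouge">Int</code>, and the <code class="language-plaintext highlighter-rouge">sample</code> helper is a hypothetical name used only for illustration.</p>

```rust
// Untyped expression language: integer literals and addition only.
enum Expr {
    LitInt(i64),
    // Box gives the recursive variant a known size (the "heap indirection").
    Add(Box<Expr>, Box<Expr>),
}

// Recursively evaluate an expression to an integer.
fn eval(expr: &Expr) -> i64 {
    match expr {
        Expr::LitInt(n) => *n,
        Expr::Add(a, b) => eval(a) + eval(b),
    }
}

// Hypothetical helper mirroring `Add (LitInt 3) (Add (LitInt 4) (LitInt 5))`.
fn sample() -> Expr {
    Expr::Add(
        Box::new(Expr::LitInt(3)),
        Box::new(Expr::Add(
            Box::new(Expr::LitInt(4)),
            Box::new(Expr::LitInt(5)),
        )),
    )
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">eval(&amp;sample())</code> gives the same <code class="language-plaintext highlighter-rouge">12</code> as the Haskell version.</p>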
<p>This is not a very useful expression language, so let’s add another literal type
and another binary operator:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">Expr</span>
<span class="o">=</span> <span class="kt">LitInt</span> <span class="kt">Int</span>
<span class="o">|</span> <span class="kt">Add</span> <span class="kt">Expr</span> <span class="kt">Expr</span>
<span class="o">|</span> <span class="kt">LitBool</span> <span class="kt">Bool</span>
<span class="o">|</span> <span class="kt">Or</span> <span class="kt">Expr</span> <span class="kt">Expr</span></code></pre></figure>
<p>Now we can write expressions like <code class="language-plaintext highlighter-rouge">Add (LitInt 1) (LitInt 2)</code> and <code class="language-plaintext highlighter-rouge">Or (LitBool False) (LitBool True)</code>.
But we can also write <code class="language-plaintext highlighter-rouge">Add (LitInt 1) (LitBool False)</code>, which shouldn’t
type-check!</p>
<p>Worse, we’re in trouble when writing the return type of <code class="language-plaintext highlighter-rouge">eval :: Expr -> ???</code>.
What it should return depends on the input, but the input type doesn’t contain
enough information.</p>
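<p>To make the problem concrete, here is one common workaround without GADTs, sketched in Rust: return a dynamically tagged value and report type errors at run time. The <code class="language-plaintext highlighter-rouge">Value</code> type and the error strings are assumptions made for this illustration, not something the rest of the post relies on.</p>

```rust
// Hypothetical runtime-tagged result type: the price of an imprecise `Expr`.
#[derive(Debug, PartialEq)]
enum Value {
    Int(i64),
    Bool(bool),
}

enum Expr {
    LitInt(i64),
    Add(Box<Expr>, Box<Expr>),
    LitBool(bool),
    Or(Box<Expr>, Box<Expr>),
}

// Evaluation can now fail: ill-typed trees are only detected while running.
fn eval(expr: &Expr) -> Result<Value, String> {
    match expr {
        Expr::LitInt(n) => Ok(Value::Int(*n)),
        Expr::LitBool(b) => Ok(Value::Bool(*b)),
        Expr::Add(a, b) => match (eval(a)?, eval(b)?) {
            (Value::Int(x), Value::Int(y)) => Ok(Value::Int(x + y)),
            (x, y) => Err(format!("cannot add {:?} and {:?}", x, y)),
        },
        Expr::Or(a, b) => match (eval(a)?, eval(b)?) {
            (Value::Bool(x), Value::Bool(y)) => Ok(Value::Bool(x || y)),
            (x, y) => Err(format!("cannot or {:?} and {:?}", x, y)),
        },
    }
}
```

<p>This compiles, but every caller has to handle both the <code class="language-plaintext highlighter-rouge">Value</code> tag and the error case, even for expressions that are obviously well-typed. That is the bookkeeping we would like the type checker to do for us.</p>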
<h2 id="more-precise-types-with-gadts">More precise types with GADTs</h2>
<p>GADTs let us say more about each constructor’s result type.
We can extend our <code class="language-plaintext highlighter-rouge">Expr</code> definition so that <code class="language-plaintext highlighter-rouge">Add</code> only exists for integers, and
<code class="language-plaintext highlighter-rouge">Or</code> only for booleans.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="cp">{-# LANGUAGE GADTs #-}</span>
<span class="kr">data</span> <span class="kt">Expr</span> <span class="n">a</span> <span class="kr">where</span>
<span class="kt">LitInt</span> <span class="o">::</span> <span class="kt">Int</span> <span class="o">-></span> <span class="kt">Expr</span> <span class="kt">Int</span>
<span class="kt">Add</span> <span class="o">::</span> <span class="kt">Expr</span> <span class="kt">Int</span> <span class="o">-></span> <span class="kt">Expr</span> <span class="kt">Int</span> <span class="o">-></span> <span class="kt">Expr</span> <span class="kt">Int</span>
<span class="kt">LitBool</span> <span class="o">::</span> <span class="kt">Bool</span> <span class="o">-></span> <span class="kt">Expr</span> <span class="kt">Bool</span>
<span class="kt">Or</span> <span class="o">::</span> <span class="kt">Expr</span> <span class="kt">Bool</span> <span class="o">-></span> <span class="kt">Expr</span> <span class="kt">Bool</span> <span class="o">-></span> <span class="kt">Expr</span> <span class="kt">Bool</span></code></pre></figure>
<p>Notice that <code class="language-plaintext highlighter-rouge">Expr a</code> is now <em>parameterised</em> (this would read <code class="language-plaintext highlighter-rouge">Expr<A></code> in Rust),
and each constructor specifies the type of expression it builds. <code class="language-plaintext highlighter-rouge">LitInt</code> takes
an <code class="language-plaintext highlighter-rouge">Int</code> and produces an <code class="language-plaintext highlighter-rouge">Expr Int</code>, and <code class="language-plaintext highlighter-rouge">Add</code> combines two <code class="language-plaintext highlighter-rouge">Expr Int</code>s into
another <code class="language-plaintext highlighter-rouge">Expr Int</code>.</p>
<p>As a result, <code class="language-plaintext highlighter-rouge">Add (LitInt 1) (LitBool False)</code> is rejected at compile time
because the second operand has the wrong type.</p>
<p>The evaluation function can now have a precise type:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">eval</span> <span class="o">::</span> <span class="kt">Expr</span> <span class="n">a</span> <span class="o">-></span> <span class="n">a</span>
<span class="n">eval</span> <span class="p">(</span><span class="kt">LitInt</span> <span class="n">n</span><span class="p">)</span> <span class="o">=</span> <span class="n">n</span>
<span class="n">eval</span> <span class="p">(</span><span class="kt">Add</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="o">=</span> <span class="n">eval</span> <span class="n">a</span> <span class="o">+</span> <span class="n">eval</span> <span class="n">b</span>
<span class="n">eval</span> <span class="p">(</span><span class="kt">LitBool</span> <span class="n">b</span><span class="p">)</span> <span class="o">=</span> <span class="n">b</span>
<span class="n">eval</span> <span class="p">(</span><span class="kt">Or</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="o">=</span> <span class="n">eval</span> <span class="n">a</span> <span class="o">||</span> <span class="n">eval</span> <span class="n">b</span></code></pre></figure>
<p><code class="language-plaintext highlighter-rouge">eval</code> now takes an expression of any type, and returns a value of that type.
When pattern matching on <code class="language-plaintext highlighter-rouge">Expr a</code>, if we see a <code class="language-plaintext highlighter-rouge">LitInt</code>, we learn that <code class="language-plaintext highlighter-rouge">a</code> is
<code class="language-plaintext highlighter-rouge">Int</code>, so the result must be an integer.
In the <code class="language-plaintext highlighter-rouge">Add</code> branch, both sub-expressions are <code class="language-plaintext highlighter-rouge">Expr Int</code>, so <code class="language-plaintext highlighter-rouge">eval</code> produces two
<code class="language-plaintext highlighter-rouge">Int</code>s which can be added together.</p>
<p>In other words, we not only <em>restrict</em> what types of expressions can be used
when constructing <code class="language-plaintext highlighter-rouge">Add</code>, but also <em>learn</em> type information when destructuring
it.</p>
<p>So far, every constructor fixes a concrete return type.
But what if we wanted to support other types that can also be added together?</p>
<h2 id="more-flexible-types-with-gadts">More flexible types with GADTs</h2>
<p>Let’s say we want to support <code class="language-plaintext highlighter-rouge">Double</code>s in our language too:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">Expr</span> <span class="n">a</span> <span class="kr">where</span>
<span class="kt">LitInt</span> <span class="o">::</span> <span class="kt">Int</span> <span class="o">-></span> <span class="kt">Expr</span> <span class="kt">Int</span>
<span class="kt">LitDouble</span> <span class="o">::</span> <span class="kt">Double</span> <span class="o">-></span> <span class="kt">Expr</span> <span class="kt">Double</span>
<span class="kt">Add</span> <span class="o">::</span> <span class="kt">Expr</span> <span class="kt">Int</span> <span class="o">-></span> <span class="kt">Expr</span> <span class="kt">Int</span> <span class="o">-></span> <span class="kt">Expr</span> <span class="kt">Int</span>
<span class="o">...</span></code></pre></figure>
<p>Doubles, being numeric values, can also be added together, but the current <code class="language-plaintext highlighter-rouge">Add</code>
constructor only works on integers. We can relax this by constraining the <code class="language-plaintext highlighter-rouge">a</code>
type parameter just in this constructor:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">Expr</span> <span class="n">a</span> <span class="kr">where</span>
<span class="kt">LitInt</span> <span class="o">::</span> <span class="kt">Int</span> <span class="o">-></span> <span class="kt">Expr</span> <span class="kt">Int</span>
<span class="kt">LitDouble</span> <span class="o">::</span> <span class="kt">Double</span> <span class="o">-></span> <span class="kt">Expr</span> <span class="kt">Double</span>
<span class="kt">Add</span> <span class="o">::</span> <span class="kt">Num</span> <span class="n">a</span> <span class="o">=></span> <span class="kt">Expr</span> <span class="n">a</span> <span class="o">-></span> <span class="kt">Expr</span> <span class="n">a</span> <span class="o">-></span> <span class="kt">Expr</span> <span class="n">a</span>
<span class="o">...</span></code></pre></figure>
<p>The <code class="language-plaintext highlighter-rouge">Num a =></code> part is a <em>type class constraint</em> in Haskell, roughly analogous to a
trait bound in Rust.
With it, <code class="language-plaintext highlighter-rouge">Add</code> accepts any two expressions of the same type, as long as that type
supports numeric operations.</p>
<p>The <code class="language-plaintext highlighter-rouge">eval</code> function simply gains one extra case:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">eval</span> <span class="o">::</span> <span class="kt">Expr</span> <span class="n">a</span> <span class="o">-></span> <span class="n">a</span>
<span class="n">eval</span> <span class="p">(</span><span class="kt">LitInt</span> <span class="n">n</span><span class="p">)</span> <span class="o">=</span> <span class="n">n</span>
<span class="n">eval</span> <span class="p">(</span><span class="kt">LitDouble</span> <span class="n">n</span><span class="p">)</span> <span class="o">=</span> <span class="n">n</span>
<span class="n">eval</span> <span class="p">(</span><span class="kt">Add</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="o">=</span> <span class="n">eval</span> <span class="n">a</span> <span class="o">+</span> <span class="n">eval</span> <span class="n">b</span>
<span class="n">eval</span> <span class="p">(</span><span class="kt">LitBool</span> <span class="n">b</span><span class="p">)</span> <span class="o">=</span> <span class="n">b</span>
<span class="n">eval</span> <span class="p">(</span><span class="kt">Or</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="o">=</span> <span class="n">eval</span> <span class="n">a</span> <span class="o">||</span> <span class="n">eval</span> <span class="n">b</span></code></pre></figure>
<p><code class="language-plaintext highlighter-rouge">Add</code> now supports <code class="language-plaintext highlighter-rouge">Add (LitDouble 1.0) (LitDouble 2.0)</code> and <code class="language-plaintext highlighter-rouge">Add (LitInt 1) (LitInt 2)</code>.
Crucially, the <code class="language-plaintext highlighter-rouge">Num a</code> constraint is attached to just the <code class="language-plaintext highlighter-rouge">Add</code> constructor, not
the entire type. It’s still possible to construct <code class="language-plaintext highlighter-rouge">LitBool</code> values, even though
booleans don’t support addition.</p>
<p>Each <code class="language-plaintext highlighter-rouge">Add</code> value carries evidence that its type parameter <code class="language-plaintext highlighter-rouge">a</code> satisfies <code class="language-plaintext highlighter-rouge">Num</code>.
When we pattern match on <code class="language-plaintext highlighter-rouge">Add</code>, the type checker brings that constraint into
scope automatically, allowing us to use <code class="language-plaintext highlighter-rouge">(+)</code> in the corresponding branch.</p>
<p>This is a subtle but powerful idea: constraints can be local to a constructor.
Even the precise return types we saw earlier are another form of locality —
<code class="language-plaintext highlighter-rouge">LitInt</code> locally records that <code class="language-plaintext highlighter-rouge">a</code> is equal to <code class="language-plaintext highlighter-rouge">Int</code>.</p>
<p>Next, we’ll rebuild the expression language in Rust, and see how to emulate both
of these features: constructor-local type equalities and constructor-local
constraints.
In Rust terms, we’ll build an enum whose variants (Rust’s name for constructors) can each carry
their own type equality or trait bound.</p>
<h2 id="encoding-gadts-in-rust">Encoding GADTs in Rust</h2>
<p>We’ll be encoding both properties of GADTs:</p>
<ol>
<li><strong>Constructor-local type equalities</strong> — like <code class="language-plaintext highlighter-rouge">LitInt</code> refining <code class="language-plaintext highlighter-rouge">a ~ Int</code>.</li>
<li><strong>Constructor-local constraints</strong> — like <code class="language-plaintext highlighter-rouge">Add</code> requiring <code class="language-plaintext highlighter-rouge">Num a</code>.</li>
</ol>
<p>We’ll start with the first one, since the idea is already well known in the Rust community.</p>
<p>As a baseline, here’s the simple expression language, with the promised
parentheses and heap indirections:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">enum</span> <span class="n">Expr</span> <span class="p">{</span>
<span class="nf">LitInt</span><span class="p">(</span><span class="nb">i64</span><span class="p">),</span>
<span class="nf">Add</span><span class="p">(</span><span class="nb">Box</span><span class="o"><</span><span class="n">Expr</span><span class="o">></span><span class="p">,</span> <span class="nb">Box</span><span class="o"><</span><span class="n">Expr</span><span class="o">></span><span class="p">),</span>
<span class="nf">LitBool</span><span class="p">(</span><span class="nb">bool</span><span class="p">),</span>
<span class="nf">Or</span><span class="p">(</span><span class="nb">Box</span><span class="o"><</span><span class="n">Expr</span><span class="o">></span><span class="p">,</span> <span class="nb">Box</span><span class="o"><</span><span class="n">Expr</span><span class="o">></span><span class="p">),</span>
<span class="p">}</span></code></pre></figure>
<p>As things stand, we have the same issues as the original Haskell version, namely that we can construct invalid combinations, and can’t give a precise type to <code class="language-plaintext highlighter-rouge">eval</code>.</p>
<h3 id="type-equality-witnesses">Type equality witnesses</h3>
<p>In Haskell, specifying the return type of each GADT constructor allowed us to express a type
equality, which the typechecker could then use automatically to unify the type
variable with the concrete type.</p>
<p>This relies on the type checker’s ability to make progress with locally learned
information, which Rust doesn’t natively support. Our encoding will instead
rely on an explicit <em>witness</em> of type equality, which we then use where Haskell
would use the GADT constraint.</p>
<p>In Rust, we can encode this concept as a zero-sized type: <sup id="fnref:is-invariance" role="doc-noteref"><a href="#fn:is-invariance" class="footnote" rel="footnote">1</a></sup></p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">use</span> <span class="nn">core</span><span class="p">::</span><span class="nn">marker</span><span class="p">::</span><span class="n">PhantomData</span><span class="p">;</span>
<span class="k">struct</span> <span class="n">Is</span><span class="o"><</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="o">></span><span class="p">(</span><span class="n">PhantomData</span><span class="o"><</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="p">)</span><span class="o">></span><span class="p">);</span>
<span class="k">impl</span><span class="o"><</span><span class="n">A</span><span class="o">></span> <span class="n">Is</span><span class="o"><</span><span class="n">A</span><span class="p">,</span> <span class="n">A</span><span class="o">></span> <span class="p">{</span>
<span class="k">fn</span> <span class="nf">refl</span><span class="p">()</span> <span class="k">-></span> <span class="k">Self</span> <span class="p">{</span>
<span class="nf">Is</span><span class="p">(</span><span class="n">PhantomData</span><span class="p">)</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p><code class="language-plaintext highlighter-rouge">Is<A, B></code> is our equality witness: a value of type <code class="language-plaintext highlighter-rouge">Is<A, B></code> is only constructible when <code class="language-plaintext highlighter-rouge">A</code> and
<code class="language-plaintext highlighter-rouge">B</code> are equal. <code class="language-plaintext highlighter-rouge">refl</code> is the only <em>safe</em> way to construct values of type <code class="language-plaintext highlighter-rouge">Is</code>.</p>
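<p>A quick sanity check of the two claims above, as a sketch: the witness takes no space, and <code class="language-plaintext highlighter-rouge">refl</code> only typechecks when both parameters agree. The <code class="language-plaintext highlighter-rouge">witness_size</code> helper is a hypothetical name introduced just for this check.</p>

```rust
use core::marker::PhantomData;

// Equality witness, as defined in the post.
struct Is<A, B>(PhantomData<(A, B)>);

impl<A> Is<A, A> {
    fn refl() -> Self {
        Is(PhantomData)
    }
}

// Hypothetical helper: builds a witness and reports its size in bytes.
fn witness_size() -> usize {
    // Fine: both parameters are the same type.
    let _proof: Is<i64, i64> = Is::refl();

    // Rejected by the compiler if uncommented, because `refl` only
    // produces `Is<A, A>`:
    // let _bad: Is<i64, bool> = Is::refl();

    core::mem::size_of::<Is<i64, i64>>()
}
```

<p>Since the only field is a <code class="language-plaintext highlighter-rouge">PhantomData</code>, storing a witness in a constructor should not make the enum any larger.</p>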
<p>If you’ve seen this trick before, you can safely skim this part. The more
interesting bit is how to do the same thing for <strong>trait bounds</strong>, not just type
equalities.</p>
<p>We can now write <code class="language-plaintext highlighter-rouge">Expr<A></code> where each constructor stores a type equality witness:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">enum</span> <span class="n">Expr</span><span class="o"><</span><span class="n">A</span><span class="o">></span> <span class="p">{</span>
<span class="nf">LitInt</span><span class="p">(</span><span class="n">Is</span><span class="o"><</span><span class="nb">i64</span><span class="p">,</span> <span class="n">A</span><span class="o">></span><span class="p">,</span> <span class="nb">i64</span><span class="p">),</span>
<span class="nf">Add</span><span class="p">(</span><span class="n">Is</span><span class="o"><</span><span class="nb">i64</span><span class="p">,</span> <span class="n">A</span><span class="o">></span><span class="p">,</span> <span class="nb">Box</span><span class="o"><</span><span class="n">Expr</span><span class="o"><</span><span class="nb">i64</span><span class="o">>></span><span class="p">,</span> <span class="nb">Box</span><span class="o"><</span><span class="n">Expr</span><span class="o"><</span><span class="nb">i64</span><span class="o">>></span><span class="p">),</span>
<span class="nf">LitBool</span><span class="p">(</span><span class="n">Is</span><span class="o"><</span><span class="nb">bool</span><span class="p">,</span> <span class="n">A</span><span class="o">></span><span class="p">,</span> <span class="nb">bool</span><span class="p">),</span>
<span class="nf">Or</span><span class="p">(</span><span class="n">Is</span><span class="o"><</span><span class="nb">bool</span><span class="p">,</span> <span class="n">A</span><span class="o">></span><span class="p">,</span> <span class="nb">Box</span><span class="o"><</span><span class="n">Expr</span><span class="o"><</span><span class="nb">bool</span><span class="o">>></span><span class="p">,</span> <span class="nb">Box</span><span class="o"><</span><span class="n">Expr</span><span class="o"><</span><span class="nb">bool</span><span class="o">>></span><span class="p">),</span>
<span class="p">}</span></code></pre></figure>
<p><code class="language-plaintext highlighter-rouge">Expr::LitInt(Is::refl(), 42)</code> has type <code class="language-plaintext highlighter-rouge">Expr<i64></code>, because the <code class="language-plaintext highlighter-rouge">refl()</code> constructor forces the <code class="language-plaintext highlighter-rouge">A</code> variable to unify with <code class="language-plaintext highlighter-rouge">i64</code>.</p>
<p>So</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="nn">Expr</span><span class="p">::</span><span class="nf">Add</span><span class="p">(</span>
<span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span>
<span class="nn">Box</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">Expr</span><span class="p">::</span><span class="nf">LitInt</span><span class="p">(</span><span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span> <span class="mi">1</span><span class="p">)),</span>
<span class="nn">Box</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">Expr</span><span class="p">::</span><span class="nf">LitInt</span><span class="p">(</span><span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span> <span class="mi">2</span><span class="p">)),</span>
<span class="p">)</span></code></pre></figure>
<p>typechecks, but</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="nn">Expr</span><span class="p">::</span><span class="nf">Add</span><span class="p">(</span>
<span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span>
<span class="nn">Box</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">Expr</span><span class="p">::</span><span class="nf">LitInt</span><span class="p">(</span><span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span> <span class="mi">1</span><span class="p">)),</span>
<span class="c1">// wrong type, expected an Expr<i64> but got Expr<bool></span>
<span class="nn">Box</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">Expr</span><span class="p">::</span><span class="nf">LitBool</span><span class="p">(</span><span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span> <span class="k">false</span><span class="p">)),</span>
<span class="p">)</span></code></pre></figure>
<p>doesn’t.</p>
<p>This machinery allows us to <em>restrict</em> the types of expressions that can be used
in <code class="language-plaintext highlighter-rouge">Add</code>, but how do we <em>learn</em> type information?</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="n">eval</span><span class="o"><</span><span class="n">A</span><span class="o">></span><span class="p">(</span><span class="n">expr</span><span class="p">:</span> <span class="n">Expr</span><span class="o"><</span><span class="n">A</span><span class="o">></span><span class="p">)</span> <span class="k">-></span> <span class="n">A</span> <span class="p">{</span>
<span class="k">match</span> <span class="n">expr</span> <span class="p">{</span>
<span class="nn">Expr</span><span class="p">::</span><span class="nf">LitInt</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span> <span class="k">=></span> <span class="o">???</span> <span class="c1">// n is of type 'i64', we need to return 'A'</span>
<span class="o">...</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>In the Haskell version, the type equality bound by the GADT constructor is a
native language feature that the typechecker knows about, so it freely converts
between <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">Int</code> under a pattern match.</p>
<p>In Rust, we created a custom encoding of type equality, and the typechecker
doesn’t (and shouldn’t, in general) use it to unify types.</p>
<p>This means that we need to write a function that actually performs the conversion:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span><span class="o"><</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="o">></span> <span class="n">Is</span><span class="o"><</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="o">></span> <span class="p">{</span>
<span class="k">fn</span> <span class="nf">convert</span><span class="p">(</span><span class="k">self</span><span class="p">,</span> <span class="n">a</span><span class="p">:</span> <span class="n">A</span><span class="p">)</span> <span class="k">-></span> <span class="n">B</span> <span class="p">{</span>
<span class="k">unsafe</span> <span class="p">{</span> <span class="nn">std</span><span class="p">::</span><span class="nn">intrinsics</span><span class="p">::</span><span class="nf">transmute_unchecked</span><span class="p">(</span><span class="n">a</span><span class="p">)</span> <span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p><code class="language-plaintext highlighter-rouge">transmute_unchecked</code> is a very unsafe function in general (and an unstable intrinsic, so it needs a nightly compiler), but in our case, we
only invoke it when we have a type equality witness available (which can only be
constructed via <code class="language-plaintext highlighter-rouge">refl</code>), so we <em>know</em> the types <code class="language-plaintext highlighter-rouge">A</code> and <code class="language-plaintext highlighter-rouge">B</code> are actually equal.</p>
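<p>If you want to try the conversion on a stable compiler, one possible substitute (an assumption of this sketch, not the code used in the rest of the post) is <code class="language-plaintext highlighter-rouge">transmute_copy</code> combined with <code class="language-plaintext highlighter-rouge">ManuallyDrop</code>. Plain <code class="language-plaintext highlighter-rouge">transmute</code> is not an option here, because the compiler cannot check that two generic types have the same size.</p>

```rust
use core::marker::PhantomData;
use core::mem::{transmute_copy, ManuallyDrop};

struct Is<A, B>(PhantomData<(A, B)>);

impl<A> Is<A, A> {
    fn refl() -> Self {
        Is(PhantomData)
    }
}

impl<A, B> Is<A, B> {
    // Stable-Rust variant of `convert` (a sketch, swapped in for the intrinsic).
    fn convert(self, a: A) -> B {
        // Keep `a` from being dropped here: ownership moves into the result.
        let a = ManuallyDrop::new(a);
        // SAFETY: a witness is only obtainable through `refl`, so `A` and `B`
        // are the same type and the bitwise copy is a plain move.
        unsafe { transmute_copy::<A, B>(&*a) }
    }
}
```

<p>With <code class="language-plaintext highlighter-rouge">Is::&lt;String, String&gt;::refl().convert(s)</code> the string is moved, not duplicated, which is why the <code class="language-plaintext highlighter-rouge">ManuallyDrop</code> wrapper matters for types that own heap data.</p>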
<p>With this, we can now use the equality witnesses in the constructors to rewrite
the results into the desired <code class="language-plaintext highlighter-rouge">A</code>:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="n">eval</span><span class="o"><</span><span class="n">A</span><span class="o">></span><span class="p">(</span><span class="n">expr</span><span class="p">:</span> <span class="n">Expr</span><span class="o"><</span><span class="n">A</span><span class="o">></span><span class="p">)</span> <span class="k">-></span> <span class="n">A</span> <span class="p">{</span>
<span class="k">match</span> <span class="n">expr</span> <span class="p">{</span>
<span class="nn">Expr</span><span class="p">::</span><span class="nf">LitInt</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span> <span class="k">=></span> <span class="n">p</span><span class="nf">.convert</span><span class="p">(</span><span class="n">n</span><span class="p">),</span> <span class="c1">// i64 -> A</span>
<span class="nn">Expr</span><span class="p">::</span><span class="nf">Add</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">left</span><span class="p">,</span> <span class="n">right</span><span class="p">)</span> <span class="k">=></span> <span class="n">p</span><span class="nf">.convert</span><span class="p">(</span><span class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">left</span><span class="p">)</span> <span class="o">+</span> <span class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">right</span><span class="p">)),</span>
<span class="nn">Expr</span><span class="p">::</span><span class="nf">LitBool</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="k">=></span> <span class="n">p</span><span class="nf">.convert</span><span class="p">(</span><span class="n">b</span><span class="p">),</span>
<span class="nn">Expr</span><span class="p">::</span><span class="nf">Or</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">left</span><span class="p">,</span> <span class="n">right</span><span class="p">)</span> <span class="k">=></span> <span class="n">p</span><span class="nf">.convert</span><span class="p">(</span><span class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">left</span><span class="p">)</span> <span class="p">||</span> <span class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">right</span><span class="p">)),</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<h3 id="trait-constraint-witnesses">Trait constraint witnesses</h3>
<p>The type equality witnesses from the previous section are relatively simple,
because the only thing we need to record about our type parameter is that it’s
equal to a known type in the local context.</p>
<p>Trait implementations are more complicated, because we need to know how certain
functionality is implemented for our type parameter.</p>
<p>Haskell’s GADTs store references to type class dictionaries in their
constructors – essentially dynamic dispatch.
While Rust supports dynamic dispatch via <code class="language-plaintext highlighter-rouge">dyn Trait</code>, it’s severely limited
(requiring “object safe” traits), so we’ll need a different approach.</p>
<p>We’ll start with a similar witness idea, but this time, the witness will record
the fact that a trait implementation exists for a type.</p>
<p>We’ll define a witness for the existence of an <code class="language-plaintext highlighter-rouge">Add</code>-like capability, corresponding
to the <code class="language-plaintext highlighter-rouge">Num</code> constraint in the Haskell version.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">struct</span> <span class="n">CanAdd</span><span class="o"><</span><span class="n">T</span><span class="p">:</span> <span class="o">?</span><span class="nb">Sized</span><span class="o">></span> <span class="p">{</span>
<span class="n">_phantom</span><span class="p">:</span> <span class="n">PhantomData</span><span class="o"><</span><span class="n">T</span><span class="o">></span><span class="p">,</span>
<span class="p">}</span>
<span class="k">impl</span><span class="o"><</span><span class="n">T</span><span class="p">:</span> <span class="nn">std</span><span class="p">::</span><span class="nn">ops</span><span class="p">::</span><span class="nb">Add</span><span class="o"><</span><span class="n">Output</span> <span class="o">=</span> <span class="n">T</span><span class="o">>></span> <span class="n">CanAdd</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="p">{</span>
<span class="k">fn</span> <span class="nf">new</span><span class="p">()</span> <span class="k">-></span> <span class="k">Self</span> <span class="p">{</span>
<span class="n">CanAdd</span> <span class="p">{</span> <span class="n">_phantom</span><span class="p">:</span> <span class="n">PhantomData</span> <span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="n">can_add</span><span class="o"><</span><span class="n">T</span><span class="p">:</span> <span class="nn">std</span><span class="p">::</span><span class="nn">ops</span><span class="p">::</span><span class="nb">Add</span><span class="o"><</span><span class="n">Output</span> <span class="o">=</span> <span class="n">T</span><span class="o">>></span><span class="p">()</span> <span class="k">-></span> <span class="n">CanAdd</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="p">{</span>
<span class="nn">CanAdd</span><span class="p">::</span><span class="nf">new</span><span class="p">()</span>
<span class="p">}</span></code></pre></figure>
<p><code class="language-plaintext highlighter-rouge">CanAdd<T></code> can only be constructed (via <code class="language-plaintext highlighter-rouge">can_add</code>) if <code class="language-plaintext highlighter-rouge">T</code> supports the <code class="language-plaintext highlighter-rouge">Add</code> operation
with result type <code class="language-plaintext highlighter-rouge">T</code>. This mirrors the <code class="language-plaintext highlighter-rouge">Num a =></code> constraint on the Haskell side.</p>
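<p>As a quick sanity check, here is a minimal sketch that repeats the witness definition so that it is self-contained. The helper <code class="language-plaintext highlighter-rouge">witness_sizes</code> is a hypothetical name introduced only for illustration. It shows the witness being built for <code class="language-plaintext highlighter-rouge">i64</code> and <code class="language-plaintext highlighter-rouge">f64</code>, and that it occupies no space at runtime.</p>

```rust
use std::marker::PhantomData;
use std::mem::size_of;
use std::ops::Add;

// Same shape as the witness defined above, repeated for self-containment.
struct CanAdd<T: ?Sized> {
    _phantom: PhantomData<T>,
}

// The bound on `T` is the only thing gating construction.
fn can_add<T: Add<Output = T>>() -> CanAdd<T> {
    CanAdd { _phantom: PhantomData }
}

// Hypothetical helper: builds two witnesses and reports their sizes in bytes.
fn witness_sizes() -> (usize, usize) {
    let _int: CanAdd<i64> = can_add();
    let _float: CanAdd<f64> = can_add();
    // let _no: CanAdd<bool> = can_add(); // rejected: `bool` does not implement `Add`
    (size_of::<CanAdd<i64>>(), size_of::<CanAdd<f64>>())
}
```

<p>Both sizes are zero, since the only field is a <code class="language-plaintext highlighter-rouge">PhantomData</code>. The witness carries information for the type checker only.</p>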
<p>We can now extend our expression type with a constructor that carries this
constraint witness:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">enum</span> <span class="n">Expr</span><span class="o"><</span><span class="n">A</span><span class="o">></span> <span class="p">{</span>
<span class="nf">LitInt</span><span class="p">(</span><span class="n">Is</span><span class="o"><</span><span class="nb">i64</span><span class="p">,</span> <span class="n">A</span><span class="o">></span><span class="p">,</span> <span class="nb">i64</span><span class="p">),</span>
<span class="nf">LitDouble</span><span class="p">(</span><span class="n">Is</span><span class="o"><</span><span class="nb">f64</span><span class="p">,</span> <span class="n">A</span><span class="o">></span><span class="p">,</span> <span class="nb">f64</span><span class="p">),</span>
<span class="nf">Add</span><span class="p">(</span><span class="n">CanAdd</span><span class="o"><</span><span class="n">A</span><span class="o">></span><span class="p">,</span> <span class="nb">Box</span><span class="o"><</span><span class="n">Expr</span><span class="o"><</span><span class="n">A</span><span class="o">>></span><span class="p">,</span> <span class="nb">Box</span><span class="o"><</span><span class="n">Expr</span><span class="o"><</span><span class="n">A</span><span class="o">>></span><span class="p">),</span>
<span class="nf">LitBool</span><span class="p">(</span><span class="n">Is</span><span class="o"><</span><span class="nb">bool</span><span class="p">,</span> <span class="n">A</span><span class="o">></span><span class="p">,</span> <span class="nb">bool</span><span class="p">),</span>
<span class="nf">Or</span><span class="p">(</span><span class="n">Is</span><span class="o"><</span><span class="nb">bool</span><span class="p">,</span> <span class="n">A</span><span class="o">></span><span class="p">,</span> <span class="nb">Box</span><span class="o"><</span><span class="n">Expr</span><span class="o"><</span><span class="nb">bool</span><span class="o">>></span><span class="p">,</span> <span class="nb">Box</span><span class="o"><</span><span class="n">Expr</span><span class="o"><</span><span class="nb">bool</span><span class="o">>></span><span class="p">),</span>
<span class="p">}</span></code></pre></figure>
<p>This version of <code class="language-plaintext highlighter-rouge">Expr</code> is the Rust analogue of the final Haskell GADT.
The <code class="language-plaintext highlighter-rouge">Add</code> constructor now carries a <code class="language-plaintext highlighter-rouge">CanAdd<A></code> witness that proves
<code class="language-plaintext highlighter-rouge">A</code> implements <code class="language-plaintext highlighter-rouge">Add<Output = A></code>.</p>
<p>So far this handles the <em>construction</em> side of the story, but not the
<em>destruction</em> side. When we pattern match on an <code class="language-plaintext highlighter-rouge">Expr<A></code>, Rust doesn’t know
that <code class="language-plaintext highlighter-rouge">A</code> satisfies the constraint carried by <code class="language-plaintext highlighter-rouge">CanAdd<A></code>.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="n">eval</span><span class="o"><</span><span class="n">A</span><span class="o">></span><span class="p">(</span><span class="n">expr</span><span class="p">:</span> <span class="n">Expr</span><span class="o"><</span><span class="n">A</span><span class="o">></span><span class="p">)</span> <span class="k">-></span> <span class="n">A</span> <span class="p">{</span>
<span class="k">match</span> <span class="n">expr</span> <span class="p">{</span>
<span class="o">...</span>
<span class="nn">Expr</span><span class="p">::</span><span class="nf">Add</span><span class="p">(</span><span class="n">w</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="k">=></span> <span class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">a</span><span class="p">)</span> <span class="o">+</span> <span class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">b</span><span class="p">),</span> <span class="c1">// cannot add `A` to `A`</span>
<span class="o">...</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>To recover that information, we’ll need to encode it in a trait that can
selectively enable the operation based on the presence of a witness.</p>
<h3 id="using-specialisation-to-recover-constraints">Using specialisation to recover constraints</h3>
<p>We now want to <em>use</em> the information stored in <code class="language-plaintext highlighter-rouge">CanAdd<A></code> when pattern
matching on an expression. In Haskell, this happens automatically:
matching on <code class="language-plaintext highlighter-rouge">Add</code> brings the <code class="language-plaintext highlighter-rouge">Num a</code> constraint into scope.
Rust has no mechanism for this, so we’ll need an indirection.</p>
<p>We’ll introduce a helper trait <code class="language-plaintext highlighter-rouge">MaybeAdd</code> that acts like a type class dictionary.
It provides an operation <code class="language-plaintext highlighter-rouge">maybe_add</code>, which only exists when the type supports
addition. We’ll use specialisation to make that conditional.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="nd">#![feature(specialization)]</span>
<span class="k">trait</span> <span class="n">MaybeAdd</span> <span class="p">{</span>
<span class="k">fn</span> <span class="nf">maybe_add</span><span class="p">(</span><span class="k">self</span><span class="p">,</span> <span class="n">rhs</span><span class="p">:</span> <span class="k">Self</span><span class="p">)</span> <span class="k">-></span> <span class="k">Self</span><span class="p">;</span>
<span class="p">}</span></code></pre></figure>
<p>We define a <em>default</em> implementation for all types:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span><span class="o"><</span><span class="n">T</span><span class="o">></span> <span class="n">MaybeAdd</span> <span class="k">for</span> <span class="n">T</span> <span class="p">{</span>
<span class="n">default</span> <span class="k">fn</span> <span class="nf">maybe_add</span><span class="p">(</span><span class="k">self</span><span class="p">,</span> <span class="n">_rhs</span><span class="p">:</span> <span class="k">Self</span><span class="p">)</span> <span class="k">-></span> <span class="k">Self</span>
<span class="p">{</span>
<span class="nd">unreachable!</span><span class="p">(</span><span class="s">"no Add implementation for this type"</span><span class="p">)</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>and a <em>specialised</em> implementation for types that implement <code class="language-plaintext highlighter-rouge">Add</code>:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">impl</span><span class="o"><</span><span class="n">T</span><span class="p">:</span> <span class="nn">std</span><span class="p">::</span><span class="nn">ops</span><span class="p">::</span><span class="nb">Add</span><span class="o"><</span><span class="n">Output</span> <span class="o">=</span> <span class="n">T</span><span class="o">>></span> <span class="n">MaybeAdd</span> <span class="k">for</span> <span class="n">T</span> <span class="p">{</span>
<span class="k">fn</span> <span class="nf">maybe_add</span><span class="p">(</span><span class="k">self</span><span class="p">,</span> <span class="n">rhs</span><span class="p">:</span> <span class="k">Self</span><span class="p">)</span> <span class="k">-></span> <span class="k">Self</span>
<span class="p">{</span>
<span class="k">self</span> <span class="o">+</span> <span class="n">rhs</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>With this machinery, we can now use the constraint witness inside <code class="language-plaintext highlighter-rouge">eval</code>:</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="k">fn</span> <span class="n">eval</span><span class="o"><</span><span class="n">A</span><span class="o">></span><span class="p">(</span><span class="n">expr</span><span class="p">:</span> <span class="n">Expr</span><span class="o"><</span><span class="n">A</span><span class="o">></span><span class="p">)</span> <span class="k">-></span> <span class="n">A</span> <span class="p">{</span>
<span class="k">match</span> <span class="n">expr</span> <span class="p">{</span>
<span class="nn">Expr</span><span class="p">::</span><span class="nf">LitInt</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span> <span class="k">=></span> <span class="n">p</span><span class="nf">.convert</span><span class="p">(</span><span class="n">n</span><span class="p">),</span>
<span class="nn">Expr</span><span class="p">::</span><span class="nf">LitDouble</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">d</span><span class="p">)</span> <span class="k">=></span> <span class="n">p</span><span class="nf">.convert</span><span class="p">(</span><span class="n">d</span><span class="p">),</span>
<span class="nn">Expr</span><span class="p">::</span><span class="nf">LitBool</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="k">=></span> <span class="n">p</span><span class="nf">.convert</span><span class="p">(</span><span class="n">b</span><span class="p">),</span>
<span class="nn">Expr</span><span class="p">::</span><span class="nf">Add</span><span class="p">(</span><span class="n">w</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="k">=></span> <span class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">a</span><span class="p">)</span><span class="nf">.maybe_add</span><span class="p">(</span><span class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">b</span><span class="p">)),</span>
<span class="nn">Expr</span><span class="p">::</span><span class="nf">Or</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span> <span class="k">=></span> <span class="n">p</span><span class="nf">.convert</span><span class="p">(</span><span class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">a</span><span class="p">)</span> <span class="p">||</span> <span class="nf">eval</span><span class="p">(</span><span class="o">*</span><span class="n">b</span><span class="p">)),</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>Rather than directly using <code class="language-plaintext highlighter-rouge">+</code> (which we can’t, since <code class="language-plaintext highlighter-rouge">A</code> isn’t known to
implement <code class="language-plaintext highlighter-rouge">std::ops::Add</code> in this context), we delegate to <code class="language-plaintext highlighter-rouge">maybe_add</code>, which
uses specialisation to select the correct implementation at monomorphisation
time.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="nd">#[test]</span>
<span class="k">fn</span> <span class="nf">eval_test</span><span class="p">()</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">expr_int</span> <span class="o">=</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">a</span> <span class="o">=</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">LitInt</span><span class="p">(</span><span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span> <span class="mi">3</span><span class="p">);</span>
<span class="k">let</span> <span class="n">b</span> <span class="o">=</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">LitInt</span><span class="p">(</span><span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span> <span class="mi">4</span><span class="p">);</span>
<span class="nn">Expr</span><span class="p">::</span><span class="nf">Add</span><span class="p">(</span><span class="nf">can_add</span><span class="p">(),</span> <span class="nn">Box</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="n">a</span><span class="p">),</span> <span class="nn">Box</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="n">b</span><span class="p">))</span>
<span class="p">};</span>
<span class="nd">assert_eq!</span><span class="p">(</span><span class="nf">eval</span><span class="p">(</span><span class="n">expr_int</span><span class="p">),</span> <span class="mi">7</span><span class="p">);</span>
<span class="k">let</span> <span class="n">expr_double</span> <span class="o">=</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">a</span> <span class="o">=</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">LitDouble</span><span class="p">(</span><span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span> <span class="mf">2.5</span><span class="p">);</span>
<span class="k">let</span> <span class="n">b</span> <span class="o">=</span> <span class="nn">Expr</span><span class="p">::</span><span class="nf">LitDouble</span><span class="p">(</span><span class="nn">Is</span><span class="p">::</span><span class="nf">refl</span><span class="p">(),</span> <span class="mf">4.0</span><span class="p">);</span>
<span class="nn">Expr</span><span class="p">::</span><span class="nf">Add</span><span class="p">(</span><span class="nf">can_add</span><span class="p">(),</span> <span class="nn">Box</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="n">a</span><span class="p">),</span> <span class="nn">Box</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="n">b</span><span class="p">))</span>
<span class="p">};</span>
<span class="nd">assert_eq!</span><span class="p">(</span><span class="nf">eval</span><span class="p">(</span><span class="n">expr_double</span><span class="p">),</span> <span class="mf">6.5</span><span class="p">);</span>
<span class="p">}</span></code></pre></figure>
<h3 id="why-this-works">Why this works</h3>
<p>If you’re coming from Haskell, it might be surprising that <code class="language-plaintext highlighter-rouge">eval</code> works at all.
In Haskell, type class resolution is coupled with evidence generation: when the
compiler decides that a type satisfies a constraint, it also produces a
reference to the corresponding dictionary. If Rust worked the same, then that
algorithm would pick the catch-all default implementation of <code class="language-plaintext highlighter-rouge">MaybeAdd</code> under
the <code class="language-plaintext highlighter-rouge">Expr::Add</code> arm of <code class="language-plaintext highlighter-rouge">eval</code>, because at that point, no information is known
about the type (and our <code class="language-plaintext highlighter-rouge">CanAdd</code> witness is invisible to the typechecker).</p>
<p>However, Rust’s specialisation works differently. During type checking, the
compiler only checks that <em>some</em> implementation of <code class="language-plaintext highlighter-rouge">MaybeAdd</code> exists – it
doesn’t commit to which one.
This step is <strong>proof-irrelevant</strong>: the fact that a trait is implemented matters,
but not which implementation it resolves to.</p>
<p>The actual selection happens later, during <strong>monomorphisation</strong>, once all type
parameters are concrete. At that point, the specialiser sees that <code class="language-plaintext highlighter-rouge">A = i64</code> (or
<code class="language-plaintext highlighter-rouge">A = f64</code>, etc.) and picks the more specific implementation that performs real
addition. For a type without <code class="language-plaintext highlighter-rouge">Add</code>, such as <code class="language-plaintext highlighter-rouge">bool</code>, the default <code class="language-plaintext highlighter-rouge">unreachable!()</code>
version is still selected for the <code class="language-plaintext highlighter-rouge">Expr::Add</code> arm, but it is never executed, precisely
because our witness mechanism disallows constructing expressions that try to add
values without <code class="language-plaintext highlighter-rouge">Add</code> implementations.</p>
<p>This is the crucial distinction between Haskell and Rust: in Haskell, dictionary
resolution is part of type checking; in Rust, it’s deferred until code
generation. The specialiser makes the final decision once it knows the concrete
types, and because our witness types restrict what can actually be constructed,
the correct implementation is always chosen.</p>
<p>In effect, Rust’s specialisation system lets us recover local constraint
learning at compile time, without runtime dictionaries or dynamic dispatch.
Everything is resolved statically and erased before code generation.
A truly zero-cost abstraction!<sup id="fnref:zero-cost" role="doc-noteref"><a href="#fn:zero-cost" class="footnote" rel="footnote">2</a></sup></p>
<h2 id="limitations">Limitations</h2>
<p>This technique has a few obvious caveats.</p>
<p>First, it relies on <strong>specialisation</strong>, which is still unstable and only
available on nightly Rust. The feature also has some unsound edge cases that
the compiler can’t currently detect, though this particular usage is benign
because it doesn’t overlap implementations in unsafe ways.</p>
<p>Second, the design doesn’t generalise to <strong>existential types</strong> — Rust simply
has no equivalent. We can simulate type refinement (as with <code class="language-plaintext highlighter-rouge">Expr<A></code>), but not
“forgetting” type information safely.</p>
<p>Finally, while the runtime cost is zero, the cognitive cost certainly isn’t.
The type signatures are verbose, the ergonomics are questionable, and the
amount of ceremony required to recover what Haskell gives you by default is
non-trivial.</p>
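<p>Returning to the first caveat: for comparison, here is a sketch of a stable-Rust alternative that avoids specialisation. This is not the technique used in this post. It is the dictionary-passing style that Haskell uses, with the witness storing a function pointer. The names <code class="language-plaintext highlighter-rouge">AddDict</code>, <code class="language-plaintext highlighter-rouge">add_dict</code>, <code class="language-plaintext highlighter-rouge">DictExpr</code>, <code class="language-plaintext highlighter-rouge">eval_dict</code> and <code class="language-plaintext highlighter-rouge">sum_example</code> are hypothetical, and the expression type is cut down to two constructors.</p>

```rust
use std::ops::Add;

// Hypothetical dictionary-style witness (not the post's `CanAdd`):
// it stores the addition function itself rather than a phantom marker.
struct AddDict<T> {
    add: fn(T, T) -> T,
}

// Can only be called when `T: Add<Output = T>`, like `can_add` in the post.
fn add_dict<T: Add<Output = T>>() -> AddDict<T> {
    AddDict { add: <T as Add>::add }
}

// Cut-down expression type, just enough to exercise the witness.
enum DictExpr<A> {
    Lit(A),
    Add(AddDict<A>, Box<DictExpr<A>>, Box<DictExpr<A>>),
}

// No bound on `A`: the addition comes from the stored function pointer.
fn eval_dict<A>(expr: DictExpr<A>) -> A {
    match expr {
        DictExpr::Lit(x) => x,
        DictExpr::Add(dict, l, r) => (dict.add)(eval_dict(*l), eval_dict(*r)),
    }
}

// Builds and evaluates 3 + 4 at type `i64`.
fn sum_example() -> i64 {
    let a = DictExpr::Lit(3);
    let b = DictExpr::Lit(4);
    eval_dict(DictExpr::Add(add_dict(), Box::new(a), Box::new(b)))
}
```

<p>The trade-off is that the witness is now pointer-sized rather than zero-sized, and every addition goes through an indirect call. That is the runtime cost the specialisation-based encoding avoids.</p>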
<h2 id="conclusion">Conclusion</h2>
<p>Until now we’ve been preoccupied with whether or not we could.
Now it’s time to stop and think if we should.</p>
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:is-invariance" role="doc-endnote">
<p>Using <code class="language-plaintext highlighter-rouge">PhantomData<fn(A) -> B></code> would make <code class="language-plaintext highlighter-rouge">A</code> and <code class="language-plaintext highlighter-rouge">B</code> invariant, which is slightly more robust if lifetimes are involved. For this post, <code class="language-plaintext highlighter-rouge">PhantomData<(A, B)></code> is simpler and works fine. <a href="#fnref:is-invariance" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:zero-cost" role="doc-endnote">
<p>Luckily the cost analysis of abstractions doesn’t include developer ergonomics. <a href="#fnref:zero-cost" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
<p><a href="https://kcsongor.github.io/gadts-in-rust/">Trait-Constrained Enums in Rust</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on June 10, 2025.</p>
https://kcsongor.github.io/generic-lens-22020-02-11T00:00:00-00:002020-02-11T00:00:00+00:00Csongor Kisshttps://kcsongor.github.io
<section id="table-of-contents" class="toc">
<header>
<h3><i class="fa fa-book"></i> Overview</h3>
</header>
<div id="drawer">
<ul id="markdown-toc">
<li><a href="#background" id="markdown-toc-background">Background</a></li>
<li><a href="#examples" id="markdown-toc-examples">Examples</a></li>
<li><a href="#differences" id="markdown-toc-differences">Differences</a> <ul>
<li><a href="#labels" id="markdown-toc-labels">Labels</a></li>
</ul>
</li>
<li><a href="#changes-in-generic-lens" id="markdown-toc-changes-in-generic-lens">Changes in generic-lens</a></li>
<li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ul>
</div>
</section>
<!-- /#table-of-contents -->
<p>I’m happy to announce a new library,
<a href="https://hackage.haskell.org/package/generic-optics">generic-optics</a>,
accompanied by version 2.0.0.0 of <a href="https://hackage.haskell.org/package/generic-lens">generic-lens</a>.</p>
<h2 id="background">Background</h2>
<p>A few months ago, the folks at Well-Typed <a href="https://www.well-typed.com/blog/2019/09/announcing-the-optics-library/">announced the <code class="language-plaintext highlighter-rouge">optics</code>
library</a>,
which aims to improve on the user experience compared to the <code class="language-plaintext highlighter-rouge">lens</code> library.
Oleg Grenrus has written an excellent <a href="http://oleg.fi/gists/posts/2020-01-25-case-study-migration-from-lens-to-optics.html">migration guide</a>
from <code class="language-plaintext highlighter-rouge">lens</code> to <code class="language-plaintext highlighter-rouge">optics</code>, so please have a look there for some more background.</p>
<p><code class="language-plaintext highlighter-rouge">generic-optics</code> is essentially a port of <code class="language-plaintext highlighter-rouge">generic-lens</code> that is
compatible with <code class="language-plaintext highlighter-rouge">optics</code>, and is designed to be a drop-in replacement
for <code class="language-plaintext highlighter-rouge">generic-lens</code>. This means that if you’re already using <code class="language-plaintext highlighter-rouge">generic-lens</code>
with <code class="language-plaintext highlighter-rouge">lens</code> and decide to migrate to <code class="language-plaintext highlighter-rouge">optics</code>, you should be able to replace the
<code class="language-plaintext highlighter-rouge">generic-lens</code> dependency with <code class="language-plaintext highlighter-rouge">generic-optics</code> and expect things to just work.</p>
<h2 id="examples">Examples</h2>
<p>To explain why I’m so excited about <code class="language-plaintext highlighter-rouge">optics</code>, I’m going to
compare a real-life workflow between <code class="language-plaintext highlighter-rouge">generic-lens</code> and <code class="language-plaintext highlighter-rouge">generic-optics</code>.</p>
<p>First, language pragmas and imports:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="cp">{-# LANGUAGE DataKinds #-}</span>
<span class="cp">{-# LANGUAGE TypeApplications #-}</span>
<span class="cp">{-# LANGUAGE DeriveGeneric #-}</span>
<span class="kr">import</span> <span class="nn">Data.Generics.Product</span>
<span class="kr">import</span> <span class="nn">GHC.Generics</span></code></pre></figure>
<p>Note that the module <code class="language-plaintext highlighter-rouge">Data.Generics.Product</code> is shared between
<code class="language-plaintext highlighter-rouge">generic-lens</code> and <code class="language-plaintext highlighter-rouge">generic-optics</code>.</p>
<p>When using <code class="language-plaintext highlighter-rouge">generic-lens</code> with the <code class="language-plaintext highlighter-rouge">lens</code> library, we would import</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">import</span> <span class="nn">Control.Lens</span></code></pre></figure>
<p>When using <code class="language-plaintext highlighter-rouge">generic-optics</code> with <code class="language-plaintext highlighter-rouge">optics</code>, the import becomes</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">import</span> <span class="nn">Optics.Core</span></code></pre></figure>
<p>Now we define a simple record:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">MyRecord</span> <span class="o">=</span> <span class="kt">MyRecord</span> <span class="p">{</span> <span class="n">a</span> <span class="o">::</span> <span class="kt">Int</span><span class="p">,</span> <span class="n">b</span> <span class="o">::</span> <span class="kt">Int</span><span class="p">,</span> <span class="n">c</span> <span class="o">::</span> <span class="p">(</span><span class="kt">Bool</span><span class="p">,</span> <span class="kt">Int</span><span class="p">)</span> <span class="p">}</span>
<span class="kr">deriving</span> <span class="p">(</span><span class="kt">Generic</span><span class="p">,</span> <span class="kt">Show</span><span class="p">)</span>
<span class="n">myRecord1</span> <span class="o">::</span> <span class="kt">MyRecord</span>
<span class="n">myRecord1</span> <span class="o">=</span> <span class="kt">MyRecord</span> <span class="mi">0</span> <span class="mi">1</span> <span class="p">(</span><span class="kt">False</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span></code></pre></figure>
<p>With either library, we can view the <code class="language-plaintext highlighter-rouge">a</code> field using
the <code class="language-plaintext highlighter-rouge">field</code> lens:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">lens|optics> myRecord1 ^. field @"a"
0</code></pre></figure>
<p>If we ask what the type of <code class="language-plaintext highlighter-rouge">field @"a"</code> is in GHCi, we already see
the advantage of <code class="language-plaintext highlighter-rouge">optics</code>’s opaque representation.</p>
<p>Compare</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">lens> :t field @"a"
field @"a"
:: (HasField "a" s t a b, Functor f) => (a -> f b) -> s -> f t</code></pre></figure>
<p>with</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">optics> :t field @"a"
field @"a" :: HasField "a" s t a b => Lens s t a b</code></pre></figure>
<p>Now let us use the <code class="language-plaintext highlighter-rouge">typed</code> lens, which performs a type-directed lookup in
a product type, as long as there is a unique field with that type:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">lens|optics> myRecord1 ^. typed @(Bool, Int)
(False,2)</code></pre></figure>
<p>When the type of the field is not unique (such as if we tried to retrieve a field
of type <code class="language-plaintext highlighter-rouge">Int</code>), both <code class="language-plaintext highlighter-rouge">generic-optics</code> and <code class="language-plaintext highlighter-rouge">generic-lens</code> provide a helpful type error:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">lens|optics> myRecord1 ^. typed @Int
<interactive> error:
• The type MyRecord contains multiple values of type Int.
The choice of value is thus ambiguous. The offending constructors are:
• MyRecord</code></pre></figure>
<p>For situations like this, both libraries provide a traversal called
<code class="language-plaintext highlighter-rouge">types</code> that focuses on all values of the given type.</p>
<p>Let’s see what happens if we replace <code class="language-plaintext highlighter-rouge">typed</code> with <code class="language-plaintext highlighter-rouge">types</code> in the above
example when using <code class="language-plaintext highlighter-rouge">lens</code>:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">lens> myRecord1 ^. types @Int
<interactive>:43:14-23: error:
• No instance for (Monoid Int) arising from a use of ‘types’</code></pre></figure>
<p>This error is rather puzzling. Unless we know what’s going on under
the hood, it’s not obvious where the <code class="language-plaintext highlighter-rouge">Monoid</code> constraint is coming from.</p>
<p>Compare this with <code class="language-plaintext highlighter-rouge">generic-optics</code>:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">optics> myRecord1 ^. types @Int
<interactive>:32:1-23: error:
• A_Traversal cannot be used as A_Getter</code></pre></figure>
<p>Right! <code class="language-plaintext highlighter-rouge">types @Int</code> is a traversal, but <code class="language-plaintext highlighter-rouge">^.</code> takes a getter!
Arguably this is a more helpful message. Consulting the documentation
of <code class="language-plaintext highlighter-rouge">optics</code>, we find the combinator we’re looking for: <code class="language-plaintext highlighter-rouge">^..</code>, which returns
all the values focused on by a traversal:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">lens|optics> myRecord1 ^.. types @Int
[0,1,2]</code></pre></figure>
<p>This now of course works in both libraries.</p>
<p>To summarise, using the two libraries should be nearly identical as
long as everything goes well and we’re not hitting type errors.
Where <code class="language-plaintext highlighter-rouge">generic-optics</code> (but really, <code class="language-plaintext highlighter-rouge">optics</code> itself) shines is when things
do not go all that well, in which case the resulting error messages are a lot more
comprehensible.</p>
<h2 id="differences">Differences</h2>
<p>The above was just to give a little taste of using
<code class="language-plaintext highlighter-rouge">generic-optics</code>. The interface of <code class="language-plaintext highlighter-rouge">generic-optics</code> is
intended to be largely identical to that of <code class="language-plaintext highlighter-rouge">generic-lens</code>.</p>
<h3 id="labels">Labels</h3>
<p>At the time of writing, the main difference is the support for overloaded
labels in <code class="language-plaintext highlighter-rouge">generic-lens</code>, which allows writing</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">lens> import Data.Generics.Labels ()
lens> myRecord1 ^. #a
0</code></pre></figure>
<p>I intend to add support for this to <code class="language-plaintext highlighter-rouge">generic-optics</code> too, but it
isn’t implemented yet.</p>
<h2 id="changes-in-generic-lens">Changes in generic-lens</h2>
<p>To support this new interface, <code class="language-plaintext highlighter-rouge">generic-lens</code> itself has undergone a major
reorganisation. I thought this was a good opportunity to clean some things
up and change the interface in places, which ultimately resulted in a major
version bump.</p>
<p>Most notably, GHC versions below 8.4 are no longer
supported. <code class="language-plaintext highlighter-rouge">generic-lens</code> (and <code class="language-plaintext highlighter-rouge">generic-optics</code> too) promises good
performance by making sure that the generic overhead is eliminated at
compile time. Doing so requires really careful coding practices, and
GHC’s optimiser changes between every version, which meant that
certain tricks that worked for 8.2 didn’t work for 8.6 and vice
versa. The result was horrible CPP macros to enable certain hacks on
certain versions of GHC. In the end, I decided it wasn’t worth the
effort to maintain these hacks for older versions of the compiler.</p>
<p>I intend to write a blog post in the near future describing some of these
hacks, as they are quite interesting and potentially educational.</p>
<p>For a more comprehensive list of changes, refer to the
<a href="https://github.com/kcsongor/generic-lens/blob/master/generic-lens/ChangeLog.md">changelog</a>.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Thanks for reading this blog post, and I hope you’re as excited
about <code class="language-plaintext highlighter-rouge">generic-optics</code> as I am! Since this release required a major refactoring
and moving things around, it is possible that some documentation is out of date, or
certain functions are not exported from where you would expect. If you find anything
that looks off, please either open a pull request or let me know on the issue tracker!</p>
<p>Finally, if you find <code class="language-plaintext highlighter-rouge">generic-lens</code> or <code class="language-plaintext highlighter-rouge">generic-optics</code> useful,
consider <a href="https://github.com/sponsors/kcsongor">buying me a coffee</a>!</p>
<p><a href="https://kcsongor.github.io/generic-lens-2/">Announcing generic-optics (& generic-lens 2.0.0.0)</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on February 11, 2020.</p>
https://kcsongor.github.io/opaque-constraint-synonyms2019-09-25T00:00:00+00:002019-09-25T00:00:00+00:00Csongor Kisshttps://kcsongor.github.io
<section id="table-of-contents" class="toc">
<header>
<h3><i class="fa fa-book"></i> Overview</h3>
</header>
<div id="drawer">
<ul id="markdown-toc">
<li><a href="#constraints-newtypes-kind-of" id="markdown-toc-constraints-newtypes-kind-of">Constraints newtypes (kind of)</a></li>
<li><a href="#a-real-world-example" id="markdown-toc-a-real-world-example">A real world example</a></li>
<li><a href="#acknowledgements" id="markdown-toc-acknowledgements">Acknowledgements</a></li>
</ul>
</div>
</section>
<!-- /#table-of-contents -->
<p>The list of type class constraints in a function signature can
sometimes get out of hand. In these situations, we can introduce a
type synonym (thanks to <code class="language-plaintext highlighter-rouge">ConstraintKinds</code>) to avoid repetition.</p>
<p>Say we want to group together the <code class="language-plaintext highlighter-rouge">Show</code> and <code class="language-plaintext highlighter-rouge">Read</code> constraints:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="kt">Serialise</span> <span class="n">a</span> <span class="o">=</span> <span class="p">(</span><span class="kt">Show</span> <span class="n">a</span><span class="p">,</span> <span class="kt">Read</span> <span class="n">a</span><span class="p">)</span></code></pre></figure>
<p>Now <code class="language-plaintext highlighter-rouge">Serialise a</code> can be used wherever we require both constraints:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">roundtrip</span> <span class="o">::</span> <span class="kt">Serialise</span> <span class="n">a</span> <span class="o">=></span> <span class="n">a</span> <span class="o">-></span> <span class="n">a</span>
<span class="n">roundtrip</span> <span class="o">=</span> <span class="n">read</span> <span class="o">.</span> <span class="n">show</span></code></pre></figure>
<p>This is great, because it means we no longer have to spell out <code class="language-plaintext highlighter-rouge">(Show a, Read a)</code>
whenever we need both, and we also improved readability, because
<code class="language-plaintext highlighter-rouge">Serialise</code> conveys some additional domain-specific meaning.</p>
<p>There’s a problem with this, however. If we ask GHCi about the type of
<code class="language-plaintext highlighter-rouge">roundtrip</code>:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">>>> :t roundtrip
roundtrip :: (Show a, Read a) => a -> a</code></pre></figure>
<p>it will eagerly expand the type synonym, removing all traces of
<code class="language-plaintext highlighter-rouge">Serialise</code>. Of course, this is a well-known problem with type
synonyms, so we generally avoid them in favour of <code class="language-plaintext highlighter-rouge">newtype</code>s.</p>
<p>But there’s no analogous construction for constraints. Or is there?</p>
<h2 id="constraints-newtypes-kind-of">Constraints newtypes (kind of)</h2>
<p>To begin, we’re going to drop the type synonym in favour of the
“constraint synonym” technique, which is essentially the following:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="p">(</span><span class="kt">Show</span> <span class="n">a</span><span class="p">,</span> <span class="kt">Read</span> <span class="n">a</span><span class="p">)</span> <span class="o">=></span> <span class="kt">Serialise</span> <span class="n">a</span>
<span class="kr">instance</span> <span class="p">(</span><span class="kt">Show</span> <span class="n">a</span><span class="p">,</span> <span class="kt">Read</span> <span class="n">a</span><span class="p">)</span> <span class="o">=></span> <span class="kt">Serialise</span> <span class="n">a</span></code></pre></figure>
<p>In other words, we introduce a new type class with the required
superclass constraints, and a single catchall instance.</p>
<p>So far, the situation hasn’t improved, though. GHC is quite stubborn:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">>>> :t roundtrip
roundtrip :: (Show a, Read a) => a -> a</code></pre></figure>
<p>This happens because the compiler sees that there’s only one matching
instance, so it’s safe to pick that one, and it will do so. This point
is the important one: that there’s only one instance. So, if we could somehow
trick GHC into thinking that there are other options, then maybe it wouldn’t
be so eager to expand our constraints.</p>
<p>So, we create an empty data type, only to be used internally:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">Opaque</span></code></pre></figure>
<p>Next, we satisfy the superclass constraints:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="kt">Read</span> <span class="kt">Opaque</span> <span class="kr">where</span>
  <span class="n">readsPrec</span> <span class="o">=</span> <span class="n">undefined</span>
<span class="kr">instance</span> <span class="kt">Show</span> <span class="kt">Opaque</span> <span class="kr">where</span>
  <span class="n">showsPrec</span> <span class="o">=</span> <span class="n">undefined</span></code></pre></figure>
<p>Note that these two instances only exist so that the constraint is
satisfied, but since the type is internal, the actual functions are
never going to be invoked.</p>
<p>Finally, the key ingredient: an overlapping instance for <code class="language-plaintext highlighter-rouge">Serialise Opaque</code>.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="cp">{-# OVERLAPPING #-}</span> <span class="kt">Serialise</span> <span class="kt">Opaque</span></code></pre></figure>
<p>Now, every time GHC sees a <code class="language-plaintext highlighter-rouge">Serialise a</code> constraint, it will no longer
be able to pick the catchall instance, in case <code class="language-plaintext highlighter-rouge">a</code> gets instantiated
to <code class="language-plaintext highlighter-rouge">Opaque</code> later. Of course, this won’t happen, because we don’t
export <code class="language-plaintext highlighter-rouge">Opaque</code>, but it’s good enough for GHC.</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">>>> :t roundtrip
roundtrip :: Serialise a => a -> a</code></pre></figure>
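<p>For reference, the pieces above can be assembled into a single file. This is a minimal sketch, not the exact code of any library: the two language extensions are what I assume GHC needs in order to accept the catchall instance (its head is a bare type variable, and its context is no smaller than the head), and in a real library the module’s export list would simply leave out <code class="language-plaintext highlighter-rouge">Opaque</code>.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">{-# LANGUAGE FlexibleInstances    #-}
{-# LANGUAGE UndecidableInstances #-}

-- In a library, this would live in a module whose export list
-- mentions Serialise and roundtrip, but not Opaque.

-- The "constraint synonym": a class with the desired superclasses...
class (Show a, Read a) => Serialise a

-- ...and a single catchall instance covering every suitable type.
instance (Show a, Read a) => Serialise a

-- An internal type that users can never name.
data Opaque

-- These instances only exist to satisfy the superclass constraints
-- of Serialise; their methods are never meant to be called.
instance Read Opaque where
  readsPrec = undefined

instance Show Opaque where
  showsPrec = undefined

-- The overlapping instance: it stops GHC from committing to the
-- catchall instance while the type variable is still unknown.
instance {-# OVERLAPPING #-} Serialise Opaque

roundtrip :: Serialise a => a -> a
roundtrip = read . show</code></pre></figure>
<p>Concrete types are unaffected by the extra instance: <code class="language-plaintext highlighter-rouge">roundtrip (42 :: Int)</code> still resolves to the catchall instance, because <code class="language-plaintext highlighter-rouge">Int</code> can never match <code class="language-plaintext highlighter-rouge">Opaque</code>.</p>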
<h2 id="a-real-world-example">A real world example</h2>
<p>You might say that the <code class="language-plaintext highlighter-rouge">(Show a, Read a)</code> example is perhaps overly
simplistic. I came up with this technique to solve a very real problem
in the <a href="http://hackage.haskell.org/package/generic-lens-1.2.0.0">generic-lens</a> library.
This problem shows up in many places in the library, but to pick one, consider the <code class="language-plaintext highlighter-rouge">AsType</code>
class:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">AsType</span> <span class="n">a</span> <span class="n">s</span> <span class="kr">where</span>
  <span class="n">_Typed</span> <span class="o">::</span> <span class="kt">Prism'</span> <span class="n">s</span> <span class="n">a</span></code></pre></figure>
<p>The exact meaning of the class is irrelevant here (but see the
<a href="http://hackage.haskell.org/package/generic-lens-1.2.0.0/docs/Data-Generics-Sum-Typed.html#v:_Typed">documentation</a> if you’re interested). What
matters is that there’s a catchall instance defined for all types
(using <code class="language-plaintext highlighter-rouge">GHC.Generics</code>), which in turn requires a large number of other constraints and predicates
to hold. Since this catchall instance is the only one defined by the library, asking for the
type of <code class="language-plaintext highlighter-rouge">_Typed</code> in GHCi eagerly expands the constraints to those of the instance.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">>>></span> <span class="o">:</span><span class="n">t</span> <span class="n">_Typed</span>
<span class="n">_Typed</span>
<span class="o">::</span> <span class="p">(</span><span class="kt">ErrorUnlessOne</span>
<span class="n">a</span> <span class="n">s</span> <span class="p">(</span><span class="kt">CollectPartialType</span> <span class="p">(</span><span class="kt">TupleToList</span> <span class="n">a</span><span class="p">)</span> <span class="p">(</span><span class="kt">Rep</span> <span class="n">s</span><span class="p">)),</span>
<span class="kt">Defined</span> <span class="p">(</span><span class="kt">Rep</span> <span class="n">s</span><span class="p">)</span> <span class="p">(</span><span class="kt">TypeError</span> <span class="o">...</span><span class="p">)</span> <span class="p">(</span><span class="nb">()</span> <span class="o">::</span> <span class="kt">Constraint</span><span class="p">),</span> <span class="kt">Generic</span> <span class="n">s</span><span class="p">,</span>
<span class="kt">ListTuple</span> <span class="n">a</span> <span class="p">(</span><span class="kt">TupleToList</span> <span class="n">a</span><span class="p">),</span> <span class="kt">GAsType</span> <span class="p">(</span><span class="kt">Rep</span> <span class="n">s</span><span class="p">)</span> <span class="p">(</span><span class="kt">TupleToList</span> <span class="n">a</span><span class="p">),</span>
<span class="kt">Data</span><span class="o">.</span><span class="kt">Profunctor</span><span class="o">.</span><span class="kt">Choice</span><span class="o">.</span><span class="kt">Choice</span> <span class="n">p</span><span class="p">,</span> <span class="kt">Applicative</span> <span class="n">f</span><span class="p">)</span> <span class="o">=></span>
<span class="n">p</span> <span class="n">a</span> <span class="p">(</span><span class="n">f</span> <span class="n">a</span><span class="p">)</span> <span class="o">-></span> <span class="n">p</span> <span class="n">s</span> <span class="p">(</span><span class="n">f</span> <span class="n">s</span><span class="p">)</span></code></pre></figure>
<p>Not great. All the internal implementation details leak out. By
employing the opaque constraint trick above, we can define overlapping
instances for the <code class="language-plaintext highlighter-rouge">AsType</code> class, which results in the following type signature:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">>>></span> <span class="o">:</span><span class="n">t</span> <span class="n">_Typed</span>
<span class="n">_Typed</span> <span class="o">::</span> <span class="kt">AsType</span> <span class="n">a</span> <span class="n">s</span> <span class="o">=></span> <span class="kt">Prism'</span> <span class="n">s</span> <span class="n">a</span></code></pre></figure>
<p>which is much nicer!</p>
<h2 id="acknowledgements">Acknowledgements</h2>
<p>I wrote most of this post a while ago, but never published it.
Thanks to <a href="https://twitter.com/rob_rix">Rob Rix</a> for bringing up this
topic and thus reminding me to publish it. It’s good to see library authors
care about the user experience of their library down to this level of detail,
and I hope this technique will be useful for many others!</p>
<p><a href="https://kcsongor.github.io/opaque-constraint-synonyms/">Opaque constraint synonyms</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on September 25, 2019.</p>
https://kcsongor.github.io/ambiguous-tags2019-09-18T00:00:00-00:002019-09-18T00:00:00+00:00Csongor Kisshttps://kcsongor.github.io
<p>One of the main selling points of Haskell is that despite (or because of)
its strong static type system, it frees us from the burden of having
to spell out tedious type signatures everywhere.</p>
<p>Type inference is a blessing, but sometimes it can also be a
curse. Inference that is too good can hinder the readability of code, because
the compiler knows what the type of an identifier is even when we
don’t. It’s not just readability though: correctness
can be imperilled too.</p>
<p>As an example, consider the <code class="language-plaintext highlighter-rouge">Tagged</code> type, which allows
us to attach type information to some other type.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">newtype</span> <span class="kt">Tagged</span> <span class="p">(</span><span class="n">s</span> <span class="o">::</span> <span class="n">k</span><span class="p">)</span> <span class="n">a</span> <span class="o">=</span> <span class="kt">MkTagged</span> <span class="n">a</span></code></pre></figure>
<p>Then we might want to define a <code class="language-plaintext highlighter-rouge">Person</code> type consisting of a first
name and a last name, both of type <code class="language-plaintext highlighter-rouge">String</code>, tagged by (type-level)
symbols accordingly:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">Person</span> <span class="o">=</span> <span class="kt">MkPerson</span>
  <span class="p">(</span><span class="kt">Tagged</span> <span class="s">"firstName"</span> <span class="kt">String</span><span class="p">)</span>
  <span class="p">(</span><span class="kt">Tagged</span> <span class="s">"lastName"</span> <span class="kt">String</span><span class="p">)</span></code></pre></figure>
<p>We can then construct values of this type:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">joseph</span> <span class="o">::</span> <span class="kt">Person</span>
<span class="n">joseph</span> <span class="o">=</span> <span class="kt">MkPerson</span>
  <span class="p">(</span><span class="kt">MkTagged</span> <span class="s">"Joseph"</span><span class="p">)</span>
  <span class="p">(</span><span class="kt">MkTagged</span> <span class="s">"Knecht"</span><span class="p">)</span></code></pre></figure>
<p>And here is the problem. Since both fields are constructed just with
the <code class="language-plaintext highlighter-rouge">MkTagged</code> constructor, nothing is stopping us from mixing up the
field names if we misremember the ordering:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">joseph'</span> <span class="o">::</span> <span class="kt">Person</span>
<span class="n">joseph'</span> <span class="o">=</span> <span class="kt">MkPerson</span>
  <span class="p">(</span><span class="kt">MkTagged</span> <span class="s">"Knecht"</span><span class="p">)</span>
  <span class="p">(</span><span class="kt">MkTagged</span> <span class="s">"Joseph"</span><span class="p">)</span></code></pre></figure>
<p>We would like to get a type error, but GHC happily infers that
<code class="language-plaintext highlighter-rouge">MkTagged "Joseph"</code> indeed has type <code class="language-plaintext highlighter-rouge">Tagged t String</code> for any <code class="language-plaintext highlighter-rouge">t</code>,
thus it fits perfectly into the <code class="language-plaintext highlighter-rouge">"lastName"</code> field.</p>
<p>We can fix this example by providing explicit type applications to
the <code class="language-plaintext highlighter-rouge">MkTagged</code> constructor. Then, mixing up the order <em>is</em> a type error.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">joseph'</span> <span class="o">::</span> <span class="kt">Person</span>
<span class="n">joseph'</span> <span class="o">=</span> <span class="kt">MkPerson</span>
  <span class="p">(</span><span class="kt">MkTagged</span> <span class="o">@</span><span class="s">"lastName"</span> <span class="s">"Knecht"</span><span class="p">)</span>
  <span class="p">(</span><span class="kt">MkTagged</span> <span class="o">@</span><span class="s">"firstName"</span> <span class="s">"Joseph"</span><span class="p">)</span></code></pre></figure>
<p>results in:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text"> • Couldn't match type ‘"lastName"’ with ‘"firstName"’</code></pre></figure>
<p>This works, but these annotations are entirely optional, and if we
forget about them, we’re in trouble once again.</p>
<p>To summarise, the problem is that GHC can infer the type of <code class="language-plaintext highlighter-rouge">MkTagged "Joseph"</code>,
and due to the generality of the result, it can also unify
it with any arbitrary tag.</p>
<p>So the question is this: how do we stop GHC from inferring the type of
expressions like <code class="language-plaintext highlighter-rouge">MkTagged "Joseph"</code>? In other words, how do we enforce
that the tag must be provided by explicit type annotation?</p>
<h2 id="an-ambiguous-smart-constructor">An ambiguous smart constructor</h2>
<p>We’re going to write a smart constructor that can only be invoked by
explicit type annotation of the tag type.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">mkTagged</span> <span class="o">::</span> <span class="n">forall</span> <span class="n">t</span> <span class="n">a</span><span class="o">.</span> <span class="n">a</span> <span class="o">-></span> <span class="kt">Tagged</span> <span class="p">(</span><span class="o">???</span><span class="p">)</span> <span class="n">a</span>
<span class="n">mkTagged</span> <span class="o">=</span> <span class="kt">MkTagged</span></code></pre></figure>
<p>What to put in the <code class="language-plaintext highlighter-rouge">???</code> hole? The idea is that we want <code class="language-plaintext highlighter-rouge">t</code> in this
type to be <em>ambiguous</em>, in other words, it should be impossible to
infer <code class="language-plaintext highlighter-rouge">t</code> even if we know what <code class="language-plaintext highlighter-rouge">Tagged (???) a</code> is. If it can’t be
inferred, then GHC will insist that we specify a type annotation at
the use site for what <code class="language-plaintext highlighter-rouge">t</code> should be.</p>
<p>The obvious thing to plug into <code class="language-plaintext highlighter-rouge">???</code> would be <code class="language-plaintext highlighter-rouge">t</code> itself, but that
doesn’t work of course, because from knowing <code class="language-plaintext highlighter-rouge">Tagged t a</code>, <code class="language-plaintext highlighter-rouge">t</code> can be
trivially inferred.
For example, when given a value of type <code class="language-plaintext highlighter-rouge">Tagged "firstName" String</code>,
we can infer that <code class="language-plaintext highlighter-rouge">t</code> must be <code class="language-plaintext highlighter-rouge">"firstName"</code>.</p>
<p>As always (at least this seems to be a recurring theme here on my
blog), we reach for type families to solve this problem. In
particular, we define a rather funny-looking variant of the identity
type family, which I’m going to call <code class="language-plaintext highlighter-rouge">Ambiguous</code>:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="n">family</span> <span class="kt">Ambiguous</span> <span class="p">(</span><span class="n">a</span> <span class="o">::</span> <span class="n">k</span><span class="p">)</span> <span class="o">::</span> <span class="n">j</span> <span class="kr">where</span>
  <span class="kt">Ambiguous</span> <span class="n">x</span> <span class="o">=</span> <span class="n">x</span></code></pre></figure>
<p>The first thing that might strike you is the kind signature:
<code class="language-plaintext highlighter-rouge">Ambiguous</code> takes an argument of kind <code class="language-plaintext highlighter-rouge">k</code>, and returns something of
kind <code class="language-plaintext highlighter-rouge">j</code>. It helps to think of these kind parameters as additional
<em>inputs</em> to the type family.</p>
<p>That is, <code class="language-plaintext highlighter-rouge">Ambiguous "firstName"</code> will get stuck:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">>>> :kind! Ambiguous "firstName"
Ambiguous "firstName" :: j
= Ambiguous "firstName"</code></pre></figure>
<p>because GHC doesn’t know at which <code class="language-plaintext highlighter-rouge">j</code> we want to evaluate the type
family (and indeed, in principle this choice could change the
behaviour of the type family, since in GHC, type families are not
parametric).</p>
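<p>To illustrate what non-parametric means here, the following sketch defines a closed type family whose result depends on the <em>kind</em> of its argument. The family <code class="language-plaintext highlighter-rouge">KindName</code> and the two helper values are made up for this illustration; they are not part of the technique itself.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">{-# LANGUAGE DataKinds      #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PolyKinds      #-}
{-# LANGUAGE TypeFamilies   #-}

import Data.Kind (Type)
import Data.Proxy (Proxy (..))
import GHC.TypeLits (Symbol, symbolVal)

-- KindName is hypothetical: it returns a different result depending on
-- the kind of its argument, which a parametric function could not do.
type family KindName (a :: k) :: Symbol where
  KindName (a :: Type)   = "Type"
  KindName (a :: Symbol) = "Symbol"

-- Reflect the results back to the term level for inspection.
kindOfInt :: String
kindOfInt = symbolVal (Proxy :: Proxy (KindName Int))

kindOfLabel :: String
kindOfLabel = symbolVal (Proxy :: Proxy (KindName "firstName"))</code></pre></figure>
<p>Since a type family may branch on its kind arguments like this, GHC has to know every kind involved before it can pick an equation, and that includes the result kind <code class="language-plaintext highlighter-rouge">j</code> of <code class="language-plaintext highlighter-rouge">Ambiguous</code>.</p>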
<p>In order to properly reduce the family, we must provide the result
kind as an input, like so:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">>>> :kind! (Ambiguous "firstName" :: Symbol)
(Ambiguous "firstName" :: Symbol) :: Symbol
= "firstName"</code></pre></figure>
<p>Now let us plug this type family into the type of <code class="language-plaintext highlighter-rouge">mkTagged</code>, and see what happens.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">mkTagged</span> <span class="o">::</span> <span class="n">forall</span> <span class="n">t</span> <span class="n">a</span><span class="o">.</span> <span class="n">a</span> <span class="o">-></span> <span class="kt">Tagged</span> <span class="p">(</span><span class="kt">Ambiguous</span> <span class="n">t</span><span class="p">)</span> <span class="n">a</span>
<span class="n">mkTagged</span> <span class="o">=</span> <span class="kt">MkTagged</span></code></pre></figure>
<p>Now, when GHC’s given <code class="language-plaintext highlighter-rouge">Ambiguous t</code>, it can’t work out what <code class="language-plaintext highlighter-rouge">t</code>
is. Why? Suppose we know that <code class="language-plaintext highlighter-rouge">Ambiguous t :: Symbol</code>, that is, we
expect it to reduce to a symbol. That still doesn’t tell us anything
about the kind of <code class="language-plaintext highlighter-rouge">t</code>! According to the kind signature of <code class="language-plaintext highlighter-rouge">Ambiguous</code>, the
kind of <code class="language-plaintext highlighter-rouge">t</code> could be <em>anything</em>. Indeed, the only way to disambiguate this
is to provide the kind of <code class="language-plaintext highlighter-rouge">t</code>. As the signature of <code class="language-plaintext highlighter-rouge">mkTagged</code> does
not have an explicit kind annotation on <code class="language-plaintext highlighter-rouge">t</code>, the only way to provide
the kind of <code class="language-plaintext highlighter-rouge">t</code> is to provide <code class="language-plaintext highlighter-rouge">t</code> itself (since only explicitly quantified
variables can be instantiated with visible type applications).</p>
<p>Now, the following code</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">joseph</span> <span class="o">::</span> <span class="kt">Person</span>
<span class="n">joseph</span> <span class="o">=</span> <span class="kt">MkPerson</span>
  <span class="p">(</span><span class="n">mkTagged</span> <span class="s">"Joseph"</span><span class="p">)</span>
  <span class="p">(</span><span class="n">mkTagged</span> <span class="s">"Knecht"</span><span class="p">)</span></code></pre></figure>
<p>results in the error:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text"> • Couldn't match type ‘Ambiguous t0’ with ‘"firstName"’</code></pre></figure>
<p>To fix it, we now <em>must</em> provide type applications:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">joseph</span> <span class="o">::</span> <span class="kt">Person</span>
<span class="n">joseph</span> <span class="o">=</span> <span class="kt">MkPerson</span>
  <span class="p">(</span><span class="n">mkTagged</span> <span class="o">@</span><span class="s">"firstName"</span> <span class="s">"Joseph"</span><span class="p">)</span>
  <span class="p">(</span><span class="n">mkTagged</span> <span class="o">@</span><span class="s">"lastName"</span> <span class="s">"Knecht"</span><span class="p">)</span></code></pre></figure>
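<p>Putting the pieces together, a complete file might look like the sketch below. The list of language extensions is my assumption of what GHC needs here; in particular, <code class="language-plaintext highlighter-rouge">AllowAmbiguousTypes</code> is there because the signature of <code class="language-plaintext highlighter-rouge">mkTagged</code> is ambiguous by design. The helpers <code class="language-plaintext highlighter-rouge">untag</code> and <code class="language-plaintext highlighter-rouge">firstName</code> are hypothetical additions, included only so that the result can be inspected.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds           #-}
{-# LANGUAGE ExplicitForAll      #-}
{-# LANGUAGE KindSignatures      #-}
{-# LANGUAGE PolyKinds           #-}
{-# LANGUAGE TypeApplications    #-}
{-# LANGUAGE TypeFamilies        #-}

newtype Tagged (s :: k) a = MkTagged a

-- Hypothetical helper (not in the text above) to unwrap a tagged value.
untag :: Tagged s a -> a
untag (MkTagged x) = x

-- The identity on types, except that the argument kind and the result
-- kind are independent, so the argument can't be inferred from the result.
type family Ambiguous (a :: k) :: j where
  Ambiguous x = x

-- t only occurs under Ambiguous, so callers must supply it explicitly.
mkTagged :: forall t a. a -> Tagged (Ambiguous t) a
mkTagged = MkTagged

data Person = MkPerson
  (Tagged "firstName" String)
  (Tagged "lastName" String)

-- Swapping the two type applications below is a type error.
joseph :: Person
joseph = MkPerson
  (mkTagged @"firstName" "Joseph")
  (mkTagged @"lastName" "Knecht")

-- Hypothetical accessor, used to check the value at the term level.
firstName :: Person -> String
firstName (MkPerson f _) = untag f</code></pre></figure>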
<p><a href="https://kcsongor.github.io/ambiguous-tags/">Tripping up type inference</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on September 18, 2019.</p>
https://kcsongor.github.io/underrated-vim-c-a2019-09-12T00:00:00-00:002019-09-12T00:00:00+00:00Csongor Kisshttps://kcsongor.github.io
<p>The aim of this series of blog posts is to shed light on some of the
darker corners of the vim text editor that I have encountered over the
years. Each post will focus on one particular feature, and should take
no longer than a couple of minutes to read.</p>
<p>Today, I’d like to talk about the <code class="language-plaintext highlighter-rouge"><C-a></code> key sequence (that is,
control+a). It is extremely simple: pressing <code class="language-plaintext highlighter-rouge"><C-a></code> searches the
current line (starting at the cursor position) for a number, then
increments it.</p>
<p>For example:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">this is a number: 10.
^</code></pre></figure>
<p><code class="language-plaintext highlighter-rouge"><C-a></code></p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">this is a number: 11.
^</code></pre></figure>
<p>where <code class="language-plaintext highlighter-rouge">^</code> marks the cursor position.</p>
<p>Its inverse is <code class="language-plaintext highlighter-rouge"><C-x></code>, which decrements the number.
We can also specify a count, for example <code class="language-plaintext highlighter-rouge">20<C-x></code> will result in:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">this is a number: -9.
^</code></pre></figure>
<p>Hexadecimal and binary numbers are supported too.
For example, to convert <code class="language-plaintext highlighter-rouge">192</code> to hex, we can do</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">this is a hexadecimal number: 0x0.
^</code></pre></figure>
<p><code class="language-plaintext highlighter-rouge">192<C-a></code></p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">this is a hexadecimal number: 0xc0.
^</code></pre></figure>
<p><a href="https://kcsongor.github.io/underrated-vim-c-a/">Most underrated vim features: C-a</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on September 12, 2019.</p>
https://kcsongor.github.io/global-implicit-parameters2019-07-11T00:00:00-00:002019-07-11T00:00:00+00:00Csongor Kisshttps://kcsongor.github.io
<section id="table-of-contents" class="toc">
<header>
<h3><i class="fa fa-book"></i> Overview</h3>
</header>
<div id="drawer">
<ul id="markdown-toc">
<li><a href="#under-the-hood" id="markdown-toc-under-the-hood">Under the hood</a></li>
<li><a href="#barewords" id="markdown-toc-barewords">Barewords</a></li>
</ul>
</div>
</section>
<!-- /#table-of-contents -->
<p>Implicit parameters (enabled with the <code class="language-plaintext highlighter-rouge">{-# LANGUAGE ImplicitParams #-}</code> pragma) provide a way to dynamically bind variables in Haskell.</p>
<p>For example, the following function can be called in any context where <code class="language-plaintext highlighter-rouge">?x</code> is bound:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">foo</span> <span class="o">::</span> <span class="p">(</span><span class="o">?</span><span class="n">x</span> <span class="o">::</span> <span class="kt">Int</span><span class="p">)</span> <span class="o">=></span> <span class="kt">Int</span>
<span class="n">foo</span> <span class="o">=</span> <span class="o">?</span><span class="n">x</span>
<span class="n">bar</span> <span class="o">::</span> <span class="kt">Int</span>
<span class="n">bar</span> <span class="o">=</span> <span class="kr">let</span> <span class="o">?</span><span class="n">x</span> <span class="o">=</span> <span class="mi">10</span> <span class="kr">in</span> <span class="n">foo</span></code></pre></figure>
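<p>For reference, here is a minimal, self-contained sketch of the example above as a complete module. The <code class="language-plaintext highlighter-rouge">nested</code> binding and <code class="language-plaintext highlighter-rouge">main</code> are additions for illustration only; they show that the innermost binding of <code class="language-plaintext highlighter-rouge">?x</code> is the one <code class="language-plaintext highlighter-rouge">foo</code> sees.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">{-# LANGUAGE ImplicitParams #-}
module Main where

-- foo can be called wherever ?x :: Int is bound.
foo :: (?x :: Int) => Int
foo = ?x

-- Bind ?x locally, then call foo.
bar :: Int
bar = let ?x = 10 in foo

-- Hypothetical addition: an inner binding shadows an outer one.
nested :: Int
nested = let ?x = 10 in (let ?x = 32 in foo)

main :: IO ()
main = do
  print bar     -- 10
  print nested  -- 32</code></pre></figure>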
<p>Unlike type class instances, implicit parameters are bound locally. But what
if we want to bind one in the global scope? This would allow a global
“default” value, which could then be shadowed locally.</p>
<p>Unfortunately, the following is syntactically invalid:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">?</span><span class="n">x</span> <span class="o">=</span> <span class="mi">21</span></code></pre></figure>
<p>We turn to the <a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#implicit-parameter-bindings">GHC User Manual</a>,
only to be further discouraged:</p>
<blockquote>
<p>A group of implicit-parameter bindings may occur anywhere a normal group of Haskell bindings can occur, except at top level.</p>
</blockquote>
<p>Of course, we won’t let mere syntactic restrictions get in our way.</p>
<h2 id="under-the-hood">Under the hood</h2>
<p>Since global binding of implicit parameters is officially not possible,
we need to turn to unofficial methods.
To begin, we pass the <code class="language-plaintext highlighter-rouge">-ddump-tc-trace</code> flag
to GHC and recompile the module containing <code class="language-plaintext highlighter-rouge">foo</code> and <code class="language-plaintext highlighter-rouge">bar</code>.
This makes GHC dump a trace of what it does while typechecking
the module. There is quite a lot of output, but one line looks interesting:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">canEvNC:cls ghc-prim-0.5.3:GHC.Classes.IP ["x", Int]</code></pre></figure>
<p>Good software engineering practice dictates code reuse, and we all
know that GHC is a well-engineered piece of software. Therefore, it is
not surprising to find that implicit parameters are implemented by
piggybacking off of type class resolution with some additional rules
to disregard issues like global coherence.</p>
<p>As the above line suggests, implicit parameter resolution is desugared into
the resolution of the <code class="language-plaintext highlighter-rouge">GHC.Classes.IP</code> type class from <code class="language-plaintext highlighter-rouge">ghc-prim</code>.</p>
<p>Even though this module is <a href="http://hackage.haskell.org/package/ghc-prim-0.5.3">not documented</a>, we
can import it and ask GHCi for more information:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">IP</span> <span class="p">(</span><span class="n">s</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="n">a</span> <span class="o">|</span> <span class="n">s</span> <span class="o">-></span> <span class="n">a</span> <span class="kr">where</span>
  <span class="n">ip</span> <span class="o">::</span> <span class="n">a</span>
  <span class="cp">{-# MINIMAL ip #-}</span></code></pre></figure>
<p>It looks like GHC generates instances of the <code class="language-plaintext highlighter-rouge">IP</code> class on the fly
whenever it sees a binder for an implicit parameter. The name
of the parameter is represented as a type-level symbol. The functional
dependency allows the variable’s type to be resolved just from its name.</p>
<p>Let’s try to write an instance for this class by hand:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="c1">-- ?x = 21</span>
<span class="kr">instance</span> <span class="kt">IP</span> <span class="s">"x"</span> <span class="kt">Int</span> <span class="kr">where</span>
  <span class="n">ip</span> <span class="o">=</span> <span class="mi">21</span></code></pre></figure>
<p>GHC happily accepts this definition. Indeed, we can now write</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">baz</span> <span class="o">::</span> <span class="kt">Int</span>
<span class="n">baz</span> <span class="o">=</span> <span class="o">?</span><span class="n">x</span></code></pre></figure>
<p>which evaluates to <code class="language-plaintext highlighter-rouge">21</code>, by picking up the <code class="language-plaintext highlighter-rouge">?x</code> variable from the
top-level scope. As expected, <code class="language-plaintext highlighter-rouge">let ?x = 10 in foo</code> still evaluates to <code class="language-plaintext highlighter-rouge">10</code>, as it
<em>shadows</em> the top-level binding.</p>
<h2 id="barewords">Barewords</h2>
<p>Perhaps this is a good place to stop. But we can go further:
above, we defined only the <code class="language-plaintext highlighter-rouge">?x</code> variable. It turns out
that we can define an instance for <em>all</em> symbols at once:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="kt">KnownSymbol</span> <span class="n">s</span> <span class="o">=></span> <span class="kt">IP</span> <span class="n">s</span> <span class="kt">String</span> <span class="kr">where</span>
  <span class="n">ip</span> <span class="o">=</span> <span class="n">symbolVal</span> <span class="p">(</span><span class="kt">Proxy</span> <span class="o">::</span> <span class="kt">Proxy</span> <span class="n">s</span><span class="p">)</span></code></pre></figure>
<p>This instance brings all possible implicit variables into scope, and
sets the value of each one to its own name by reflecting the symbol into a string.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">bye</span> <span class="o">::</span> <span class="kt">String</span>
<span class="n">bye</span> <span class="o">=</span> <span class="o">?</span><span class="n">thanks</span> <span class="o">++</span> <span class="s">" "</span> <span class="o">++</span> <span class="o">?</span><span class="n">for</span> <span class="o">++</span> <span class="s">" "</span> <span class="o">++</span> <span class="o">?</span><span class="n">reading</span></code></pre></figure>
<p>Which <em>almost</em> feels like writing Perl!</p>
<p><a href="https://kcsongor.github.io/global-implicit-parameters/">Global Implicit Parameters</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on July 11, 2019.</p>
https://kcsongor.github.io/report-stuck-families2018-11-30T00:00:00-00:002018-11-29T00:00:00+00:00Csongor Kisshttps://kcsongor.github.io
<section id="table-of-contents" class="toc">
<header>
<h3><i class="fa fa-book"></i> Overview</h3>
</header>
<div id="drawer">
<ul id="markdown-toc">
<li><a href="#type-family-evaluation-semantics" id="markdown-toc-type-family-evaluation-semantics">Type family evaluation semantics</a></li>
<li><a href="#custom-type-errors" id="markdown-toc-custom-type-errors">Custom type errors</a></li>
<li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ul>
</div>
</section>
<!-- /#table-of-contents -->
<p>Custom type errors are a great way to improve the usability of Haskell
libraries that utilise some of the more recent language extensions.
Yet anyone who has written or used one of these libraries will know
that despite the authors’ best efforts, there are still many occasions
where a wall of text jumps out, leaving us puzzled as to what went wrong.</p>
<p>This post is about one particular class of such errors that have been
troubling users of many modern Haskell libraries: stuck type
families.</p>
<p>The following type error perfectly illustrates the problem. It is an
actual error <a href="https://github.com/kcsongor/generic-lens/issues/73">reported</a>
on the issue tracker of the
<a href="http://hackage.haskell.org/package/generic-lens">generic-lens</a>
library.</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">• No instance for (Data.Generics.Product.Types.HasTypes'
(Data.Generics.Product.Types.Snd
(Data.Generics.Product.Types.InterestingOr
Description
(Data.Generics.Product.Types.InterestingOr
Description
(Data.Generics.Product.Types.Interesting'
Description
(Rep Text)
Name
'[Text, Sirname, None, Description])
(M1
S
('MetaSel
('Just "name")
'NoSourceUnpackedness
'NoSourceStrictness
'DecidedLazy)
(Rec0 Name))
Name)
(M1
C
('MetaCons "M" 'PrefixI 'False)
(S1
('MetaSel
'Nothing
'NoSourceUnpackedness
'NoSourceStrictness
'DecidedLazy)
(Rec0 Multiple)))
Name))
Description
Name)
arising from a use of ‘types’</code></pre></figure>
<p>Can you spot the problem? Even if you know what to look for,
it takes a good few seconds to locate the culprit. The goal of this
post is to turn the above into the following:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">• No instance for Generic Text
arising from a traversal over Description.</code></pre></figure>
<p>How could we possibly identify a missing <code class="language-plaintext highlighter-rouge">Generic</code> instance from the
above? Let us have a closer look at that large type error. It is a
nested chain of function calls, such as <code class="language-plaintext highlighter-rouge">Snd</code> and <code class="language-plaintext highlighter-rouge">Interesting</code>, which
are type families leaking out of the library’s implementation. We see
these type families (as opposed to the result they evaluate to)
because the computation is <em>stuck</em>. The culprit is
the <code class="language-plaintext highlighter-rouge">Rep Text</code> part somewhere in the middle.</p>
<p>It turns out that <code class="language-plaintext highlighter-rouge">Rep</code> is an associated type family of the <code class="language-plaintext highlighter-rouge">Generic</code>
class:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">Generic</span> <span class="n">a</span> <span class="kr">where</span>
  <span class="kr">type</span> <span class="kt">Rep</span> <span class="n">a</span> <span class="o">::</span> <span class="kt">Type</span> <span class="o">-></span> <span class="kt">Type</span>
  <span class="o">...</span></code></pre></figure>
<p>Thus, the reason <code class="language-plaintext highlighter-rouge">Rep Text</code> is not defined is that <code class="language-plaintext highlighter-rouge">Text</code> has no
<code class="language-plaintext highlighter-rouge">Generic</code> instance. Clearly, it’s unreasonable to expect users to keep
such implementation details in mind and hunt for unreduced occurrences
of <code class="language-plaintext highlighter-rouge">Rep</code> in their type errors to find out what the issue is!</p>
<p>Yet, reporting this is not so easy. To explain why, we need to
understand the behaviour of type families.</p>
<p class="notice">As things stand today, the associated family <code class="language-plaintext highlighter-rouge">Rep</code> is not actually
connected to the <code class="language-plaintext highlighter-rouge">Generic</code> class as far as the type checker is
concerned. This is why unreduced occurrences will not result in error
messages mentioning anything about <code class="language-plaintext highlighter-rouge">Generic</code> in the first place.
<a href="https://arxiv.org/abs/1706.09715">Constrained type families</a> offer a solution to this problem, but
they are not (yet) implemented in GHC.</p>
<h2 id="type-family-evaluation-semantics">Type family evaluation semantics</h2>
<p>The reduction of type families is driven by the constraint solver. To
the best of my knowledge, there is no formal specification for their
semantics, so I’m not going to attempt to give a comprehensive account
here either. Instead, let us just make some key observations about
how type families reduce.</p>
<p>A type involving a type family is said to be <em>stuck</em> if none of the
type family’s equations can be selected for the provided
arguments. Since <code class="language-plaintext highlighter-rouge">Text</code> has no <code class="language-plaintext highlighter-rouge">Generic</code> instance, there is
consequently no <code class="language-plaintext highlighter-rouge">Rep Text</code> instance defined either. Thus, <code class="language-plaintext highlighter-rouge">Rep Text</code>
is stuck.</p>
<p>How does “stuckness” propagate up a chain of function calls? Consider
the following type family:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="n">family</span> <span class="kt">Foo</span> <span class="n">a</span> <span class="kr">where</span>
  <span class="kt">Foo</span> <span class="n">a</span> <span class="o">=</span> <span class="n">a</span></code></pre></figure>
<p>No matter what we pass in as the argument, the single equation will
always match. This means that even if we pass in a stuck type, such as
<code class="language-plaintext highlighter-rouge">Rep Text</code>, the equation can reduce to the right hand side (and get
stuck afterwards):</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">>>> :kind! Foo (Rep Text)
= Rep Text</code></pre></figure>
<p>In other words, we can think of <code class="language-plaintext highlighter-rouge">Foo</code> as a type family that’s “lazy”
in its argument. Now consider the <code class="language-plaintext highlighter-rouge">Bar</code> type family:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="n">family</span> <span class="kt">Bar</span> <span class="n">a</span> <span class="kr">where</span>
  <span class="kt">Bar</span> <span class="kt">Maybe</span> <span class="o">=</span> <span class="kt">Maybe</span>
  <span class="kt">Bar</span> <span class="n">a</span> <span class="o">=</span> <span class="n">a</span></code></pre></figure>
<p>Here, we first check if the argument is <code class="language-plaintext highlighter-rouge">Maybe</code>, in which case <code class="language-plaintext highlighter-rouge">Maybe</code>
is returned, otherwise we pick the second equation. Perhaps
surprisingly, <code class="language-plaintext highlighter-rouge">Bar</code> behaves the same as <code class="language-plaintext highlighter-rouge">Foo</code>:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">>>> :kind! Bar (Rep Text)
= Rep Text</code></pre></figure>
<p>The two equations of <code class="language-plaintext highlighter-rouge">Bar</code> <em>agree</em> with each other, because the first
one is a substitution instance of the second. GHC recognises this, and
decides that it is safe to drop the first equation in favour of the
second one.</p>
<p>We can of course write disagreeing equations:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">T1</span> <span class="n">x</span>
<span class="kr">data</span> <span class="kt">T2</span> <span class="n">x</span>
<span class="kr">type</span> <span class="n">family</span> <span class="kt">FooBar</span> <span class="n">a</span> <span class="kr">where</span>
  <span class="kt">FooBar</span> <span class="kt">T1</span> <span class="o">=</span> <span class="kt">T2</span>
  <span class="kt">FooBar</span> <span class="n">a</span> <span class="o">=</span> <span class="n">a</span></code></pre></figure>
<p>This time, notice that the first equation is not a substitution
instance of the second: it returns something other than the argument.</p>
<p>GHC won’t optimise this case away anymore, and now instance matching
will have to consider both equations. A given equation matches if the
argument unifies with the pattern and is apart from all of the preceding
patterns (i.e. it can never match any of them, no matter how it is further reduced or instantiated).
The important thing here is that a stuck type is <em>not</em> apart from any
other type, but neither does it match any other type. This means that</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">>>> :kind! FooBar (Rep Text)
= FooBar (Rep Text)</code></pre></figure>
<p><code class="language-plaintext highlighter-rouge">FooBar</code> gets stuck just when its argument does. We can think of
<code class="language-plaintext highlighter-rouge">FooBar</code> as a type family that is “strict” in its argument.</p>
<p>If we pass in a non-stuck value, evaluation proceeds as normal:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">>>> :kind! FooBar Maybe
= Maybe</code></pre></figure>
<p>This works because <code class="language-plaintext highlighter-rouge">Maybe</code> is apart from <code class="language-plaintext highlighter-rouge">T1</code> (they are different ground types), and
it unifies with the catch-all pattern <code class="language-plaintext highlighter-rouge">a</code>.</p>
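<p>The lazy/strict distinction can also be checked by the compiler rather than in GHCi. The following is only a sketch under a few assumptions: <code class="language-plaintext highlighter-rouge">Stuck</code> is a hypothetical open type family with no instances, standing in for <code class="language-plaintext highlighter-rouge">Rep Text</code>, and the explicit kind annotations and <code class="language-plaintext highlighter-rouge">Refl</code> proofs are additions that do not appear in the definitions above.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
module Main where

import Data.Kind (Type)
import Data.Type.Equality ((:~:) (Refl))

-- Hypothetical stand-in for 'Rep Text': an open family with no instances,
-- so every use of it is stuck.
type family Stuck :: Type -> Type

data T1 (x :: Type)
data T2 (x :: Type)

-- "Lazy": the single equation never inspects its argument.
type family Foo (a :: Type -> Type) :: Type -> Type where
  Foo a = a

-- "Strict": the first equation forces a comparison against T1.
type family FooBar (a :: Type -> Type) :: Type -> Type where
  FooBar T1 = T2
  FooBar a  = a

-- Foo reduces even when its argument is stuck.
lazyFoo :: Foo Stuck :~: Stuck
lazyFoo = Refl

-- FooBar reduces when given ground arguments.
onT1 :: FooBar T1 :~: T2
onT1 = Refl

onMaybe :: FooBar Maybe :~: Maybe
onMaybe = Refl

-- The analogous proof for a stuck argument should be rejected,
-- because 'FooBar Stuck' does not reduce:
--
-- strictFooBar :: FooBar Stuck :~: Stuck
-- strictFooBar = Refl

main :: IO ()
main = putStrLn "type-level checks passed"</code></pre></figure>
<p>Each <code class="language-plaintext highlighter-rouge">Refl</code> only typechecks if the two sides reduce to the same type, so a successful compile confirms the reductions shown in the GHCi sessions above.</p>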
<p>So, if a type family that inspects its argument is given a stuck type,
then the resulting type will be stuck itself. Notice that we can’t
proceed any further: there is no way to detect if the argument was
stuck or not. This is why the type error above is so impenetrable.
If we ignore our argument like <code class="language-plaintext highlighter-rouge">Foo</code> does, then it just slips by, but
if we try to do something with it like <code class="language-plaintext highlighter-rouge">FooBar</code> does, we get stuck.</p>
<p>Of course, I wouldn’t have written down all of these low-level details
about type family reduction if they didn’t lead to a solution!</p>
<h2 id="custom-type-errors">Custom type errors</h2>
<p>The mechanism of custom type errors is quite simple. The constraint
solver proceeds normally, reducing all type family equations and
solving all type class instances. If at the end, there are any
constraints of the form <code class="language-plaintext highlighter-rouge">TypeError ...</code>, then the payload of the error
gets printed, otherwise any unsolved constraints are reported.</p>
<p>As an example,</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">foo</span> <span class="o">::</span> <span class="kt">TypeError</span> <span class="p">(</span><span class="kt">'Text</span> <span class="s">"Ouch"</span><span class="p">)</span> <span class="o">=></span> <span class="nb">()</span>
<span class="n">foo</span> <span class="o">=</span> <span class="mi">10</span></code></pre></figure>
<p>yields</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">• Ouch</code></pre></figure>
<p>even though <code class="language-plaintext highlighter-rouge">10</code> clearly doesn’t have type <code class="language-plaintext highlighter-rouge">()</code>.</p>
<p>We want to produce a custom type error when the <code class="language-plaintext highlighter-rouge">Rep</code> type family gets
stuck, and we’d like to continue normally otherwise. As discussed
above, there is no way to branch on whether a type family is stuck or
not.</p>
<p>However, we now have all the necessary pieces: all we need to do is
to make sure that when <code class="language-plaintext highlighter-rouge">Rep</code> gets stuck, we leave a <code class="language-plaintext highlighter-rouge">TypeError</code> in the
residual constraints. To do this, we’re going to wrap the call to <code class="language-plaintext highlighter-rouge">Rep</code> in
another type family, which will get stuck just when <code class="language-plaintext highlighter-rouge">Rep</code> is stuck. When
<code class="language-plaintext highlighter-rouge">Rep</code> reduces, our wrapper reduces too. The additional piece is that
the wrapper will also hold a type error as its argument, which will
reside in the unsolved constraint in the stuck case, but disappear otherwise.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="n">family</span> <span class="kt">Break</span> <span class="p">(</span><span class="n">c</span> <span class="o">::</span> <span class="kt">Constraint</span><span class="p">)</span> <span class="p">(</span><span class="n">rep</span> <span class="o">::</span> <span class="kt">Type</span> <span class="o">-></span> <span class="kt">Type</span><span class="p">)</span> <span class="o">::</span> <span class="kt">Constraint</span> <span class="kr">where</span>
  <span class="kt">Break</span> <span class="kr">_</span> <span class="kt">T1</span> <span class="o">=</span> <span class="p">(</span><span class="nb">()</span><span class="p">,</span> <span class="nb">()</span><span class="p">)</span>
  <span class="kt">Break</span> <span class="kr">_</span> <span class="kr">_</span> <span class="o">=</span> <span class="nb">()</span></code></pre></figure>
<p><code class="language-plaintext highlighter-rouge">Break</code> is the wrapper family. It takes a constraint, which will be
our type error. Then it forces its argument by testing against
<code class="language-plaintext highlighter-rouge">T1</code>. Note that in both equations, the type family reduces to the
trivial constraint <code class="language-plaintext highlighter-rouge">()</code>, but in the first case, we use <code class="language-plaintext highlighter-rouge">((), ())</code> (a
tuple of two trivial constraints) to ensure that the equations
don’t optimise away, like they did with <code class="language-plaintext highlighter-rouge">Bar</code>.</p>
<p>Finally, we introduce a type family to construct a custom error message:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="n">family</span> <span class="kt">NoGeneric</span> <span class="n">t</span> <span class="kr">where</span>
  <span class="kt">NoGeneric</span> <span class="n">x</span> <span class="o">=</span> <span class="kt">TypeError</span> <span class="p">(</span><span class="kt">'Text</span> <span class="s">"No instance for "</span> <span class="n">'</span><span class="o">:<>:</span> <span class="kt">'ShowType</span> <span class="p">(</span><span class="kt">Generic</span> <span class="n">x</span><span class="p">))</span></code></pre></figure>
<p>Now, consider what happens when we call <code class="language-plaintext highlighter-rouge">Break</code> with the stuck
argument <code class="language-plaintext highlighter-rouge">Rep Text</code>:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">>>> :kind! Break (NoGeneric Text) (Rep Text)
= Break (TypeError ...) (Rep Text)</code></pre></figure>
<p>the type gets stuck, with a <code class="language-plaintext highlighter-rouge">TypeError</code> inside! However, when
called with a type where <code class="language-plaintext highlighter-rouge">Rep</code> is defined, such as <code class="language-plaintext highlighter-rouge">Bool</code>, the type reduces
to the unit constraint, with no mention of the type error:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">>>> :kind! Break (NoGeneric Bool) (Rep Bool)
= () :: Constraint</code></pre></figure>
<p>And with this, we can report errors for any stuck type family.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">bar</span> <span class="o">::</span> <span class="kt">Break</span> <span class="p">(</span><span class="kt">NoGeneric</span> <span class="kt">Text</span><span class="p">)</span> <span class="p">(</span><span class="kt">Rep</span> <span class="kt">Text</span><span class="p">)</span> <span class="o">=></span> <span class="nb">()</span>
<span class="n">bar</span> <span class="o">=</span> <span class="nb">()</span></code></pre></figure>
<p>yields</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">• No instance for Generic Text
• In the expression: bar</code></pre></figure>
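<p>Putting the pieces above together, a self-contained module might look like the following sketch. The language pragmas, the explicit kind annotations, and the names <code class="language-plaintext highlighter-rouge">ok</code> and <code class="language-plaintext highlighter-rouge">main</code> are assumptions added for illustration; only the successful case is compiled, and the failing case is left as a comment.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}
module Main where

import Data.Kind (Constraint, Type)
import GHC.Generics (Generic, Rep)
import GHC.TypeLits (ErrorMessage (..), TypeError)

-- A type used only to force Break to inspect its second argument.
data T1 (a :: Type)

-- Gets stuck exactly when 'rep' is stuck; otherwise reduces to a
-- trivial constraint and discards the error 'c'.
type family Break (c :: Constraint) (rep :: Type -> Type) :: Constraint where
  Break _ T1 = ((), ())
  Break _ _  = ()

-- The custom error message, as a constraint.
type family NoGeneric (t :: Type) :: Constraint where
  NoGeneric x = TypeError ('Text "No instance for " ':<>: 'ShowType (Generic x))

-- Bool has a Generic instance, so 'Rep Bool' reduces, Break reduces,
-- and the TypeError never reaches the residual constraints.
ok :: Break (NoGeneric Bool) (Rep Bool) => ()
ok = ()

-- For a type without a Generic instance, using a definition with the
-- analogous signature should report the custom error instead.

main :: IO ()
main = print ok</code></pre></figure>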
<h2 id="conclusion">Conclusion</h2>
<p>Using this technique, we can place custom type errors right where our
stuck type families are, and provide more contextual information about
what went wrong. We can even generalise the above to the following type family:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="n">family</span> <span class="kt">Any</span> <span class="o">::</span> <span class="n">k</span>
<span class="kr">type</span> <span class="n">family</span> <span class="kt">Assert</span> <span class="p">(</span><span class="n">err</span> <span class="o">::</span> <span class="kt">Constraint</span><span class="p">)</span> <span class="p">(</span><span class="n">break</span> <span class="o">::</span> <span class="kt">Type</span> <span class="o">-></span> <span class="kt">Type</span><span class="p">)</span> <span class="p">(</span><span class="n">a</span> <span class="o">::</span> <span class="n">k</span><span class="p">)</span> <span class="o">::</span> <span class="n">k</span> <span class="kr">where</span>
  <span class="kt">Assert</span> <span class="kr">_</span> <span class="kt">T1</span> <span class="kr">_</span> <span class="o">=</span> <span class="kt">Any</span>
  <span class="kt">Assert</span> <span class="kr">_</span> <span class="kr">_</span> <span class="n">k</span> <span class="o">=</span> <span class="n">k</span></code></pre></figure>
<p>which we can use at any point in a computation, not just in
constraints. <code class="language-plaintext highlighter-rouge">Assert</code> takes a type error, a potentially stuck
computation, and a value. If the computation is stuck, then the custom
error is presented, otherwise the value is passed through without any
errors. Here, strictness is forced by the same <code class="language-plaintext highlighter-rouge">T1</code> trick, but this
time, to ensure that the right hand sides are also different, we
return the <code class="language-plaintext highlighter-rouge">Any</code> type family in the first case.</p>
<p><a href="https://kcsongor.github.io/report-stuck-families/">Detecting the undetectable: custom type errors for stuck type families</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on November 29, 2018.</p>
https://kcsongor.github.io/symbol-parsing-haskell2018-11-28T00:00:00-00:002018-11-28T00:00:00+00:00Csongor Kisshttps://kcsongor.github.io
<section id="table-of-contents" class="toc">
<header>
<h3><i class="fa fa-book"></i> Overview</h3>
</header>
<div id="drawer">
<ul id="markdown-toc">
<li><a href="#motivation" id="markdown-toc-motivation">Motivation</a></li>
<li><a href="#primitives" id="markdown-toc-primitives">Primitives</a> <ul>
<li><a href="#appendsymbol" id="markdown-toc-appendsymbol">AppendSymbol</a></li>
<li><a href="#cmpsymbol" id="markdown-toc-cmpsymbol">CmpSymbol</a></li>
</ul>
</li>
<li><a href="#decomposition" id="markdown-toc-decomposition">Decomposition</a> <ul>
<li><a href="#head" id="markdown-toc-head">Head</a></li>
<li><a href="#uncons" id="markdown-toc-uncons">Uncons</a></li>
</ul>
</li>
<li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ul>
</div>
</section>
<!-- /#table-of-contents -->
<p>Haskell, as implemented in GHC, has a very rich language for expressing computations in types. Thanks to the
<a href="https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html?highlight=datakinds#datatype-promotion">DataKinds</a>
extension, any inductively defined data type can be used not only at the term level, but also at the type level.
Strings are a notable exception, and they are the main theme of today’s blog post.</p>
<p>The <code class="language-plaintext highlighter-rouge">String</code> type in Haskell is defined as a list of <code class="language-plaintext highlighter-rouge">Char</code>s. However,
the type-level equivalent, <code class="language-plaintext highlighter-rouge">Symbol</code>, is defined as a primitive in GHC,
presumably for efficiency. After all, the type checker passes these
types around, and the simpler their structure, the less potential work
the constraint solver needs to do.</p>
<p>The problem is this: since <code class="language-plaintext highlighter-rouge">Symbol</code> is defined as a primitive, there
is no way to pattern match on its structure, and the only way to
interact with it is through the built-in primitive operations,
namely appending and (efficient, constant-time) comparison.</p>
<p>In this blog post, I will show how these primitives can be used to
recover the ability to do arbitrary introspection of these type-level
string literals, thereby enabling a whole range of applications where
statically known information can be exploited.</p>
<p>The technique presented here was inspired by Daniel Winograd-Cort’s
<a href="https://github.com/kcsongor/generic-lens/pull/69">pull request for the generic-lens library</a>.</p>
<p>All of this is packaged into the
<a href="https://github.com/kcsongor/symbols">symbols</a> library.</p>
<h1 id="motivation">Motivation</h1>
<p>I have <a href="/purescript-safe-printf">written</a> about type-level symbol
parsing in PureScript to implement a type-safe <code class="language-plaintext highlighter-rouge">printf</code>
function. (There, I achieved symbol decomposition by patching the
compiler, but no such thing is required here.)</p>
<p>Reusing that example, we will be able to write</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">>>></span> <span class="o">:</span><span class="n">t</span> <span class="n">printf</span> <span class="o">@</span><span class="s">"Wurble %d %d %s"</span>
<span class="n">printf</span> <span class="o">@</span><span class="s">"Wurble %d %d %s"</span> <span class="o">::</span> <span class="kt">Int</span> <span class="o">-></span> <span class="kt">Int</span> <span class="o">-></span> <span class="kt">String</span> <span class="o">-></span> <span class="kt">String</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">>>></span> <span class="n">printf</span> <span class="o">@</span><span class="s">"Wurble %d %d %s"</span> <span class="mi">10</span> <span class="mi">20</span> <span class="s">"foo"</span>
<span class="s">"Wurble 10 20 foo"</span></code></pre></figure>
<p>The implementation of the printf example using the technique described in this blog post can be found on
<a href="https://github.com/kcsongor/symbols/blob/master/src/Data/Symbol/Examples/Printf.hs">GitHub</a>.</p>
<h1 id="primitives">Primitives</h1>
<p>First, let’s have a look at the primitives GHC provides for
manipulating types of kind <code class="language-plaintext highlighter-rouge">Symbol</code>, namely <code class="language-plaintext highlighter-rouge">AppendSymbol</code> and
<code class="language-plaintext highlighter-rouge">CmpSymbol</code>.</p>
<p>These functions are implemented in the compiler, and exported from the
<a href="http://hackage.haskell.org/package/base-4.12.0.0/docs/GHC-TypeLits.html">GHC.TypeLits</a> module:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="n">family</span> <span class="kt">AppendSymbol</span> <span class="p">(</span><span class="n">m</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">n</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="o">::</span> <span class="kt">Symbol</span>
<span class="kr">type</span> <span class="n">family</span> <span class="kt">CmpSymbol</span> <span class="p">(</span><span class="n">m</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">n</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="o">::</span> <span class="kt">Ordering</span></code></pre></figure>
<p>Note that there is no <code class="language-plaintext highlighter-rouge">Uncons</code> primitive that returns the head (first
character) and the tail of the symbol. It turns out that we can
implement <code class="language-plaintext highlighter-rouge">Uncons</code> using the two primitives above.</p>
<h2 id="appendsymbol">AppendSymbol</h2>
<p>The fact that <code class="language-plaintext highlighter-rouge">AppendSymbol</code> is a type family suggests a rather
straightforward semantics. It appends two symbols together resulting
in a third one:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">>>></span> <span class="o">:</span><span class="n">kind</span><span class="o">!</span> <span class="kt">AppendSymbol</span> <span class="s">"foo"</span> <span class="s">"bar"</span>
<span class="o">=</span> <span class="s">"foobar"</span></code></pre></figure>
<p>That is, it should only compute in one direction: from the two input symbols to the result.</p>
<p>However, if we have a look at the
<a href="https://github.com/ghc/ghc/blob/1c2c2d3dfd4c36884b22163872feb87122b4528d/compiler/typecheck/TcTypeNats.hs#L835">implementation</a>
in GHC, we can see that there’s more going on. There are special rules
for the interaction of <code class="language-plaintext highlighter-rouge">AppendSymbol</code> constraints with equality
constraints. In concrete terms, from the equality on the left GHC
derives the one on the right:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">(AppendSymbol "foo" b ~ "foobar") => (b ~ "bar")</code></pre></figure>
<p>That is, if we know a prefix of a symbol, we can decompose it to get
the matching suffix. Morally, the actual signature of
<code class="language-plaintext highlighter-rouge">AppendSymbol</code> would be closer to</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="n">family</span> <span class="kt">AppendSymbol</span> <span class="n">m</span> <span class="n">n</span> <span class="o">=</span> <span class="n">r</span> <span class="o">|</span> <span class="n">r</span> <span class="n">m</span> <span class="o">-></span> <span class="n">n</span><span class="p">,</span> <span class="n">r</span> <span class="n">n</span> <span class="o">-></span> <span class="n">m</span></code></pre></figure>
<p>But this can’t be expressed today in GHC (type family dependencies
only allow the inputs to be determined solely by the result; a
dependency that combines an input with the result is not allowed), so <code class="language-plaintext highlighter-rouge">AppendSymbol</code>
really is a lot more powerful than what the type system would like to admit!</p>
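<p>The decomposition rule can be observed from ordinary code. The following is
a minimal sketch, assuming GHC 8.6 or later; the helper name
<code class="language-plaintext highlighter-rouge">dropPrefix</code> is hypothetical and not part of any library. The prefix
and the whole symbol are supplied by type application, and the suffix is left
for the solver to work out from the <code class="language-plaintext highlighter-rouge">AppendSymbol</code> equality.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds           #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications    #-}
{-# LANGUAGE TypeFamilies        #-}

import Data.Proxy   (Proxy (..))
import GHC.TypeLits (AppendSymbol, KnownSymbol, symbolVal)

-- Hypothetical helper: the caller fixes the prefix and the whole symbol,
-- and the suffix is never written down explicitly.
dropPrefix
  :: forall prefix whole suffix
   . (AppendSymbol prefix suffix ~ whole, KnownSymbol suffix)
  => String
dropPrefix = symbolVal (Proxy @suffix)

main :: IO ()
main = putStrLn (dropPrefix @"foo" @"foobar")</code></pre></figure>
<p>Here <code class="language-plaintext highlighter-rouge">suffix</code> appears only in the constraints, so the program can
only compile if the solver determines it from the other two symbols, which is
exactly the interaction described above.</p>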
<p>Even with the ability to decompose symbols, there is a problem,
however. This decomposition only works if we <em>know</em> what the prefix
is. And in general, we need to know two out of the three symbols
involved in the constraint to get the third.</p>
<p>As a result, the following won’t work:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">bad</span> <span class="o">::</span> <span class="kt">AppendSymbol</span> <span class="n">prefix</span> <span class="n">suffix</span> <span class="o">~</span> <span class="s">"hello world"</span> <span class="o">=></span> <span class="kt">Proxy</span> <span class="n">suffix</span>
<span class="n">bad</span> <span class="o">=</span> <span class="kt">Proxy</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">>>></span> <span class="o">:</span><span class="n">t</span> <span class="n">bad</span>
<span class="n">bad</span> <span class="o">::</span> <span class="p">(</span><span class="kt">AppendSymbol</span> <span class="n">prefix</span> <span class="n">suffix</span> <span class="o">~</span> <span class="s">"hello world"</span><span class="p">)</span> <span class="o">=></span> <span class="kt">Proxy</span> <span class="n">suffix</span></code></pre></figure>
<p>that is, <code class="language-plaintext highlighter-rouge">suffix</code> is unsolved.</p>
<p>We might think that we can just try all possible characters as
potential prefixes until one matches, but that would require
backtracking in the constraint solver, and GHC’s constraint solver
doesn’t backtrack.</p>
<p>That is, trying a prefix that doesn’t match results in an unsolvable
constraint:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">bad'</span> <span class="o">::</span> <span class="kt">AppendSymbol</span> <span class="s">"a"</span> <span class="n">suffix</span> <span class="o">~</span> <span class="s">"hello world"</span> <span class="o">=></span> <span class="kt">Proxy</span> <span class="n">suffix</span>
<span class="n">bad'</span> <span class="o">=</span> <span class="kt">Proxy</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">>>></span> <span class="o">:</span><span class="n">t</span> <span class="n">bad'</span>
<span class="n">bad'</span> <span class="o">::</span> <span class="p">(</span><span class="kt">AppendSymbol</span> <span class="s">"a"</span> <span class="n">suffix</span> <span class="o">~</span> <span class="s">"hello world"</span><span class="p">)</span> <span class="o">=></span> <span class="kt">Proxy</span> <span class="n">suffix</span></code></pre></figure>
<p>But since we can’t backtrack, there is no way to try a different
character once we’ve committed to a particular prefix.</p>
<p><em>If we knew</em> what the first character was, we could strip it off
and get the remaining symbol this way, which would essentially allow us to
treat a <code class="language-plaintext highlighter-rouge">Symbol</code> as a list of characters.</p>
<h2 id="cmpsymbol">CmpSymbol</h2>
<p>It turns out that we can simply use alphabetical ordering to find out
what the first character of a string is. <code class="language-plaintext highlighter-rouge">CmpSymbol</code> compares two symbols,
and returns one of <code class="language-plaintext highlighter-rouge">LT</code>, <code class="language-plaintext highlighter-rouge">EQ</code>, or <code class="language-plaintext highlighter-rouge">GT</code> as a result.</p>
<p>Observe that any string longer than one character sorts after its
first character, and before every character that comes later in the
alphabet than that first character. As an example, consider the string
<code class="language-plaintext highlighter-rouge">"hello world"</code>, whose first character is <code class="language-plaintext highlighter-rouge">h</code>, and the letter after
<code class="language-plaintext highlighter-rouge">h</code> is <code class="language-plaintext highlighter-rouge">i</code>. Then we have</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">"h" < "hello world" < "i"</code></pre></figure>
<p>A string of length one simply compares as <code class="language-plaintext highlighter-rouge">EQ</code>
with its first character (itself).</p>
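<p>These orderings can be checked at compile time. The sketch below uses
<code class="language-plaintext highlighter-rouge">Refl</code> from <code class="language-plaintext highlighter-rouge">Data.Type.Equality</code> as a type-level assertion; the
names of the three definitions are made up for illustration. Each one only
type-checks if <code class="language-plaintext highlighter-rouge">CmpSymbol</code> reduces to the ordering on the right.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">{-# LANGUAGE DataKinds     #-}
{-# LANGUAGE TypeFamilies  #-}
{-# LANGUAGE TypeOperators #-}

import Data.Type.Equality ((:~:) (Refl))
import GHC.TypeLits       (CmpSymbol)

-- "h" is a proper prefix of "hello world", so it sorts first.
afterHead :: CmpSymbol "h" "hello world" :~: 'LT
afterHead = Refl

-- 'h' sorts before 'i', so the whole string precedes "i".
beforeNext :: CmpSymbol "hello world" "i" :~: 'LT
beforeNext = Refl

-- A one-character symbol is equal to its own first character.
single :: CmpSymbol "h" "h" :~: 'EQ
single = Refl

main :: IO ()
main = putStrLn "all comparisons hold"</code></pre></figure>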
<h1 id="decomposition">Decomposition</h1>
<p>We now put the pieces together to implement an uncons function for
symbols. First, we need <code class="language-plaintext highlighter-rouge">Head</code>, a function that returns the first
character of a symbol. Second, we will use <code class="language-plaintext highlighter-rouge">Head</code> to interact with
<code class="language-plaintext highlighter-rouge">AppendSymbol</code> to retrieve the tail of the symbol. Doing this
repeatedly will allow us to turn a symbol into a list of characters,
which in turn can be consumed by ordinary type families.</p>
<h2 id="head">Head</h2>
<p>So, to find out what the first character of a symbol is, we just
need to find the last character in the ASCII table that does not sort
after our symbol. To do this reasonably efficiently, we use binary search.
Since indexing into a type-level list takes linear time, we use a
balanced binary search tree instead. Treating each symbol comparison
as constant-time, the whole operation is constant-time either way,
because we’re working with a fixed-size alphabet; the tree simply
improves the constant factor by roughly an order of magnitude (about 7
nodes visited instead of up to 95 for printable ASCII).</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">Tree</span> <span class="n">a</span>
  <span class="o">=</span> <span class="kt">Leaf</span>
  <span class="o">|</span> <span class="kt">Node</span> <span class="p">(</span><span class="kt">Tree</span> <span class="n">a</span><span class="p">)</span> <span class="n">a</span> <span class="p">(</span><span class="kt">Tree</span> <span class="n">a</span><span class="p">)</span>
  <span class="kr">deriving</span> <span class="kt">Show</span></code></pre></figure>
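<p>To make the search concrete, here is a sketch on a toy three-letter
alphabet. It is an illustration under assumptions, not the library’s
actual definition: the names <code class="language-plaintext highlighter-rouge">Tiny</code>, <code class="language-plaintext highlighter-rouge">Lookup</code> and <code class="language-plaintext highlighter-rouge">Step</code> are
hypothetical, and each node is assumed to hold a pair of a character and the
character that follows it. A symbol starts with the node’s character when it
sorts strictly between the two.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">{-# LANGUAGE DataKinds            #-}
{-# LANGUAGE KindSignatures       #-}
{-# LANGUAGE TypeFamilies         #-}
{-# LANGUAGE TypeOperators        #-}
{-# LANGUAGE UndecidableInstances #-}

import Data.Type.Equality ((:~:) (Refl))
import GHC.TypeLits       (CmpSymbol, Symbol)

-- Same shape as the Tree above, repeated to keep the sketch self-contained.
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- Toy alphabet "a", "b", "c"; each node holds (character, next character).
type Tiny =
  'Node ('Node 'Leaf '("a", "b") 'Leaf)
        '("b", "c")
        ('Node 'Leaf '("c", "d") 'Leaf)

-- Compare the symbol against both bounds stored in the current node.
type family Lookup (x :: Symbol) (t :: Tree (Symbol, Symbol)) :: Symbol where
  Lookup x ('Node l '(lo, hi) r) =
    Step (CmpSymbol lo x) (CmpSymbol hi x) x lo l r

-- Decide whether to stop at this node or descend into a subtree.
type family Step (a :: Ordering) (b :: Ordering) (x :: Symbol) (lo :: Symbol)
                 (l :: Tree (Symbol, Symbol)) (r :: Tree (Symbol, Symbol))
                 :: Symbol where
  Step 'EQ _   _ lo _ _ = lo          -- x is exactly lo
  Step 'LT 'GT _ lo _ _ = lo          -- x lies strictly between lo and hi
  Step 'GT _   x _  l _ = Lookup x l  -- x sorts before lo: search left
  Step 'LT _   x _  _ r = Lookup x r  -- x sorts at or after hi: search right

-- Compile-time checks: these only type-check if the lookup finds the head.
firstOfCab :: Lookup "cab" Tiny :~: "c"
firstOfCab = Refl

firstOfApple :: Lookup "apple" Tiny :~: "a"
firstOfApple = Refl

main :: IO ()
main = putStrLn "lookups type-check"</code></pre></figure>
<p>In this toy version a symbol outside the alphabet runs into a
<code class="language-plaintext highlighter-rouge">'Leaf</code> and the family simply gets stuck; a full implementation
would need to decide what to do in that case.</p>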
<p>The printable subset of the ASCII character set can be encoded as the
following tree:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="kt">Chars</span>
<span class="o">=</span> <span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">" "</span><span class="p">,</span> <span class="s">"!"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"!"</span><span class="p">,</span> <span class="s">"</span><span class="se">\"</span><span class="s">"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span>
<span class="n">'</span><span class="p">(</span><span class="s">"</span><span class="se">\"</span><span class="s">"</span><span class="p">,</span> <span class="s">"#"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"#"</span><span class="p">,</span> <span class="s">"$"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"$"</span><span class="p">,</span> <span class="s">"%"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))</span>
<span class="n">'</span><span class="p">(</span><span class="s">"%"</span><span class="p">,</span> <span class="s">"&"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"&"</span><span class="p">,</span> <span class="s">"'"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"'"</span><span class="p">,</span> <span class="s">"("</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span>
<span class="n">'</span><span class="p">(</span><span class="s">"("</span><span class="p">,</span> <span class="s">")"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">")"</span><span class="p">,</span> <span class="s">"*"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"*"</span><span class="p">,</span> <span class="s">"+"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)))</span>
<span class="n">'</span><span class="p">(</span><span class="s">"+"</span><span class="p">,</span> <span class="s">","</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">","</span><span class="p">,</span> <span class="s">"-"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"-"</span><span class="p">,</span> <span class="s">"."</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span>
<span class="n">'</span><span class="p">(</span><span class="s">"."</span><span class="p">,</span> <span class="s">"/"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"/"</span><span class="p">,</span> <span class="s">"0"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"0"</span><span class="p">,</span> <span class="s">"1"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))</span>
<span class="n">'</span><span class="p">(</span><span class="s">"1"</span><span class="p">,</span> <span class="s">"2"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"2"</span><span class="p">,</span> <span class="s">"3"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"3"</span><span class="p">,</span> <span class="s">"4"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span>
<span class="n">'</span><span class="p">(</span><span class="s">"4"</span><span class="p">,</span> <span class="s">"5"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"5"</span><span class="p">,</span> <span class="s">"6"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"6"</span><span class="p">,</span> <span class="s">"7"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))))</span>
<span class="n">'</span><span class="p">(</span><span class="s">"7"</span><span class="p">,</span> <span class="s">"8"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"8"</span><span class="p">,</span> <span class="s">"9"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"9"</span><span class="p">,</span> <span class="s">":"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span>
<span class="n">'</span><span class="p">(</span><span class="s">":"</span><span class="p">,</span> <span class="s">";"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">";"</span><span class="p">,</span> <span class="s">"<"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"<"</span><span class="p">,</span> <span class="s">"="</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))</span>
<span class="n">'</span><span class="p">(</span><span class="s">"="</span><span class="p">,</span> <span class="s">">"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">">"</span><span class="p">,</span> <span class="s">"?"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"?"</span><span class="p">,</span> <span class="s">"@"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span>
<span class="n">'</span><span class="p">(</span><span class="s">"@"</span><span class="p">,</span> <span class="s">"A"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"A"</span><span class="p">,</span> <span class="s">"B"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"B"</span><span class="p">,</span> <span class="s">"C"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)))</span>
<span class="n">'</span><span class="p">(</span><span class="s">"C"</span><span class="p">,</span> <span class="s">"D"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"D"</span><span class="p">,</span> <span class="s">"E"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"E"</span><span class="p">,</span> <span class="s">"F"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span>
<span class="n">'</span><span class="p">(</span><span class="s">"F"</span><span class="p">,</span> <span class="s">"G"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"G"</span><span class="p">,</span> <span class="s">"H"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"H"</span><span class="p">,</span> <span class="s">"I"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))</span>
<span class="n">'</span><span class="p">(</span><span class="s">"I"</span><span class="p">,</span> <span class="s">"J"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"J"</span><span class="p">,</span> <span class="s">"K"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"K"</span><span class="p">,</span> <span class="s">"L"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span>
<span class="n">'</span><span class="p">(</span><span class="s">"L"</span><span class="p">,</span> <span class="s">"M"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"M"</span><span class="p">,</span> <span class="s">"N"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"N"</span><span class="p">,</span> <span class="s">"O"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)))))</span>
<span class="n">'</span><span class="p">(</span><span class="s">"O"</span><span class="p">,</span> <span class="s">"P"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"P"</span><span class="p">,</span> <span class="s">"Q"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"Q"</span><span class="p">,</span> <span class="s">"R"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span>
<span class="n">'</span><span class="p">(</span><span class="s">"R"</span><span class="p">,</span> <span class="s">"S"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"S"</span><span class="p">,</span> <span class="s">"T"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"T"</span><span class="p">,</span> <span class="s">"U"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))</span>
<span class="n">'</span><span class="p">(</span><span class="s">"U"</span><span class="p">,</span> <span class="s">"V"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"V"</span><span class="p">,</span> <span class="s">"W"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"W"</span><span class="p">,</span> <span class="s">"X"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span>
<span class="n">'</span><span class="p">(</span><span class="s">"X"</span><span class="p">,</span> <span class="s">"Y"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"Y"</span><span class="p">,</span> <span class="s">"Z"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"Z"</span><span class="p">,</span> <span class="s">"["</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)))</span>
<span class="n">'</span><span class="p">(</span><span class="s">"["</span><span class="p">,</span> <span class="s">"</span><span class="se">\\</span><span class="s">"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"</span><span class="se">\\</span><span class="s">"</span><span class="p">,</span> <span class="s">"]"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"]"</span><span class="p">,</span> <span class="s">"^"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span>
<span class="n">'</span><span class="p">(</span><span class="s">"^"</span><span class="p">,</span> <span class="s">"_"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"_"</span><span class="p">,</span> <span class="s">"`"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"`"</span><span class="p">,</span> <span class="s">"a"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))</span>
<span class="n">'</span><span class="p">(</span><span class="s">"a"</span><span class="p">,</span> <span class="s">"b"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"b"</span><span class="p">,</span> <span class="s">"c"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"c"</span><span class="p">,</span> <span class="s">"d"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span>
<span class="n">'</span><span class="p">(</span><span class="s">"d"</span><span class="p">,</span> <span class="s">"e"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"e"</span><span class="p">,</span> <span class="s">"f"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"f"</span><span class="p">,</span> <span class="s">"g"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))))</span>
<span class="n">'</span><span class="p">(</span><span class="s">"g"</span><span class="p">,</span> <span class="s">"h"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"h"</span><span class="p">,</span> <span class="s">"i"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"i"</span><span class="p">,</span> <span class="s">"j"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span>
<span class="n">'</span><span class="p">(</span><span class="s">"j"</span><span class="p">,</span> <span class="s">"k"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"k"</span><span class="p">,</span> <span class="s">"l"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"l"</span><span class="p">,</span> <span class="s">"m"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))</span>
<span class="n">'</span><span class="p">(</span><span class="s">"m"</span><span class="p">,</span> <span class="s">"n"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"n"</span><span class="p">,</span> <span class="s">"o"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"o"</span><span class="p">,</span> <span class="s">"p"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span>
<span class="n">'</span><span class="p">(</span><span class="s">"p"</span><span class="p">,</span> <span class="s">"q"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"q"</span><span class="p">,</span> <span class="s">"r"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"r"</span><span class="p">,</span> <span class="s">"s"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)))</span>
<span class="n">'</span><span class="p">(</span><span class="s">"s"</span><span class="p">,</span> <span class="s">"t"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"t"</span><span class="p">,</span> <span class="s">"u"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"u"</span><span class="p">,</span> <span class="s">"v"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span>
<span class="n">'</span><span class="p">(</span><span class="s">"v"</span><span class="p">,</span> <span class="s">"w"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"w"</span><span class="p">,</span> <span class="s">"x"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"x"</span><span class="p">,</span> <span class="s">"y"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">))</span>
<span class="n">'</span><span class="p">(</span><span class="s">"y"</span><span class="p">,</span> <span class="s">"z"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"z"</span><span class="p">,</span> <span class="s">"{"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"{"</span><span class="p">,</span> <span class="s">"|"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span>
<span class="n">'</span><span class="p">(</span><span class="s">"|"</span><span class="p">,</span> <span class="s">"}"</span><span class="p">)</span>
<span class="p">(</span><span class="kt">'Node</span> <span class="p">(</span><span class="kt">'Node</span> <span class="kt">'Leaf</span> <span class="n">'</span><span class="p">(</span><span class="s">"}"</span><span class="p">,</span> <span class="s">"~"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)</span> <span class="n">'</span><span class="p">(</span><span class="s">"~"</span><span class="p">,</span> <span class="s">"~"</span><span class="p">)</span> <span class="kt">'Leaf</span><span class="p">)))))</span></code></pre></figure>
<p>(I generated this structure with the help of other type families, but
found that inlining the result into the source file results in much
faster lookups.)</p>
<p>Note that each node contains two consecutive characters, so that we
can easily decide when to stop: the search ends when the first element
is less than our input string and the second element is greater than it.</p>
<p>The <code class="language-plaintext highlighter-rouge">Lookup</code> type family (and <code class="language-plaintext highlighter-rouge">Lookup2</code>, to make up for a lack of local
declarations in type families) implements a standard binary search.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="kt">LookupTable</span> <span class="o">=</span> <span class="kt">Tree</span> <span class="p">(</span><span class="kt">Symbol</span><span class="p">,</span> <span class="kt">Symbol</span><span class="p">)</span>
<span class="kr">type</span> <span class="n">family</span> <span class="kt">Lookup</span> <span class="p">(</span><span class="n">x</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">xs</span> <span class="o">::</span> <span class="kt">LookupTable</span><span class="p">)</span> <span class="o">::</span> <span class="kt">Symbol</span> <span class="kr">where</span>
<span class="kt">Lookup</span> <span class="n">x</span> <span class="p">(</span><span class="kt">Node</span> <span class="n">l</span> <span class="n">'</span><span class="p">(</span><span class="n">cl</span><span class="p">,</span> <span class="n">cr</span><span class="p">)</span> <span class="n">r</span><span class="p">)</span>
<span class="o">=</span> <span class="kt">Lookup2</span> <span class="p">(</span><span class="kt">CmpSymbol</span> <span class="n">cl</span> <span class="n">x</span><span class="p">)</span> <span class="p">(</span><span class="kt">CmpSymbol</span> <span class="n">cr</span> <span class="n">x</span><span class="p">)</span> <span class="n">x</span> <span class="n">cl</span> <span class="n">l</span> <span class="n">r</span>
<span class="kr">type</span> <span class="n">family</span> <span class="kt">Lookup2</span> <span class="n">ol</span> <span class="n">or</span> <span class="n">x</span> <span class="n">cl</span> <span class="n">l</span> <span class="n">r</span> <span class="o">::</span> <span class="kt">Symbol</span> <span class="kr">where</span>
<span class="kt">Lookup2</span> <span class="kt">'EQ</span> <span class="kr">_</span> <span class="kr">_</span> <span class="n">cl</span> <span class="kr">_</span> <span class="kr">_</span> <span class="o">=</span> <span class="n">cl</span> <span class="c1">-- character matches</span>
<span class="kt">Lookup2</span> <span class="kt">'LT</span> <span class="kt">'GT</span> <span class="kr">_</span> <span class="n">cl</span> <span class="kr">_</span> <span class="n">r</span> <span class="o">=</span> <span class="n">cl</span> <span class="c1">-- found the right node</span>
<span class="kt">Lookup2</span> <span class="kt">'LT</span> <span class="kr">_</span> <span class="kr">_</span> <span class="n">cl</span> <span class="kr">_</span> <span class="kt">'Leaf</span> <span class="o">=</span> <span class="n">cl</span> <span class="c1">-- we're at the rightmost node (~)</span>
<span class="kt">Lookup2</span> <span class="kt">'LT</span> <span class="kr">_</span> <span class="n">x</span> <span class="kr">_</span> <span class="kr">_</span> <span class="n">r</span> <span class="o">=</span> <span class="kt">Lookup</span> <span class="n">x</span> <span class="n">r</span> <span class="c1">-- go right</span>
<span class="kt">Lookup2</span> <span class="kt">'GT</span> <span class="kr">_</span> <span class="n">x</span> <span class="kr">_</span> <span class="n">l</span> <span class="kr">_</span> <span class="o">=</span> <span class="kt">Lookup</span> <span class="n">x</span> <span class="n">l</span> <span class="c1">-- go left</span></code></pre></figure>
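<p>To make the search easier to follow, here is a rough value-level analogue of <code class="language-plaintext highlighter-rouge">Lookup</code> and
<code class="language-plaintext highlighter-rouge">Lookup2</code>. This is only a sketch: <code class="language-plaintext highlighter-rouge">lookupHead</code> and the three-node <code class="language-plaintext highlighter-rouge">table</code> are
hypothetical names made up for illustration, and returning <code class="language-plaintext highlighter-rouge">Nothing</code> for a <code class="language-plaintext highlighter-rouge">Leaf</code> is
an assumption (the type family simply has no equation for that case).</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">-- Value-level analogue of Lookup/Lookup2 (a sketch, not the post's code).
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- Hypothetical three-node table; the real Chars tree covers many more characters.
table :: Tree (String, String)
table =
  Node (Node Leaf ("a", "b") Leaf) ("b", "c") (Node Leaf ("c", "d") Leaf)

lookupHead :: String -> Tree (String, String) -> Maybe String
lookupHead _ Leaf = Nothing -- assumption: the type family has no Leaf case
lookupHead x (Node l (cl, cr) r) =
  case (compare cl x, compare cr x, r) of
    (EQ, _, _)    -> Just cl         -- character matches
    (LT, GT, _)   -> Just cl         -- found the right node
    (LT, _, Leaf) -> Just cl         -- we're at the rightmost node
    (LT, _, _)    -> lookupHead x r  -- go right
    (GT, _, _)    -> lookupHead x l  -- go left

main :: IO ()
main = mapM_ (print . (`lookupHead` table)) ["apple", "b", "cat"]</code></pre></figure>
<p>Running <code class="language-plaintext highlighter-rouge">main</code> prints <code class="language-plaintext highlighter-rouge">Just "a"</code>, <code class="language-plaintext highlighter-rouge">Just "b"</code> and <code class="language-plaintext highlighter-rouge">Just "c"</code>. The two comparisons
per node play the role of the two <code class="language-plaintext highlighter-rouge">CmpSymbol</code> results passed to <code class="language-plaintext highlighter-rouge">Lookup2</code>.</p>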
<p>Finally, <code class="language-plaintext highlighter-rouge">Head</code> is just a lookup in the binary tree.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="kt">Head</span> <span class="n">sym</span> <span class="o">=</span> <span class="kt">Lookup</span> <span class="n">sym</span> <span class="kt">Chars</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-text" data-lang="text">>>> :kind! Head "Wurble"
= "W"</code></pre></figure>
<h2 id="uncons">Uncons</h2>
<p>Next, we need to make the <code class="language-plaintext highlighter-rouge">AppendSymbol</code> constraint interact with
<code class="language-plaintext highlighter-rouge">Head</code>. We now turn to a type class, <code class="language-plaintext highlighter-rouge">Uncons</code>:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">Uncons</span> <span class="p">(</span><span class="n">sym</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">h</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">t</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="kr">where</span>
<span class="n">uncons</span> <span class="o">::</span> <span class="kt">Proxy</span> <span class="n">'</span><span class="p">(</span><span class="n">h</span><span class="p">,</span> <span class="n">t</span><span class="p">)</span></code></pre></figure>
<p><code class="language-plaintext highlighter-rouge">sym</code> is our symbol, <code class="language-plaintext highlighter-rouge">h</code> is the head, and <code class="language-plaintext highlighter-rouge">t</code> is the tail. It would
be nice to have a functional dependency <code class="language-plaintext highlighter-rouge">sym -> h t</code>, but
unfortunately we can’t make that pass: recall that the backwards
dependencies of <code class="language-plaintext highlighter-rouge">AppendSymbol</code> are essentially hidden from the type
system.</p>
<p>We write a single instance, which sets up the right constraints:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="p">(</span> <span class="n">h</span> <span class="o">~</span> <span class="kt">Head</span> <span class="n">sym</span>
<span class="p">,</span> <span class="kt">AppendSymbol</span> <span class="n">h</span> <span class="n">t</span> <span class="o">~</span> <span class="n">sym</span>
<span class="p">)</span> <span class="o">=></span> <span class="kt">Uncons</span> <span class="n">sym</span> <span class="n">h</span> <span class="n">t</span> <span class="kr">where</span>
<span class="n">uncons</span> <span class="o">=</span> <span class="kt">Proxy</span></code></pre></figure>
<p>First, we write <code class="language-plaintext highlighter-rouge">h ~ Head sym</code>, which unifies <code class="language-plaintext highlighter-rouge">h</code> with the first
element of the symbol using the binary lookup defined
previously. Then, the <code class="language-plaintext highlighter-rouge">AppendSymbol h t ~ sym</code> constraint lets GHC solve
for <code class="language-plaintext highlighter-rouge">t</code>, because the prefix <code class="language-plaintext highlighter-rouge">h</code> is now known.</p>
<p>The <code class="language-plaintext highlighter-rouge">uncons</code> member is not necessary for things to work out, but it
helps illustrate the working of the type class in the REPL:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">>>> :t uncons @"foo"
uncons @"foo" :: Proxy '("f", "oo")</code></pre></figure>
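<p>The same two-step idea can be sketched at the value level: first pick out the
head, then recover the tail as whatever is left once that known prefix is
stripped off. The name <code class="language-plaintext highlighter-rouge">unconsValue</code> is hypothetical, and returning <code class="language-plaintext highlighter-rouge">Nothing</code> for
the empty string is an assumption made for this sketch.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">import Data.List (stripPrefix)

-- Value-level sketch of the Uncons instance (not the post's code).
unconsValue :: String -> Maybe (String, String)
unconsValue []          = Nothing -- assumption: no head for the empty string
unconsValue sym@(c : _) = fmap (\t -> (h, t)) (stripPrefix h sym)
  where
    h = [c] -- plays the role of h ~ Head sym
    -- stripPrefix h sym plays the role of solving AppendSymbol h t ~ sym for t

main :: IO ()
main = print (unconsValue "foo")</code></pre></figure>
<p>This prints <code class="language-plaintext highlighter-rouge">Just ("f","oo")</code>, mirroring the <code class="language-plaintext highlighter-rouge">Proxy '("f", "oo")</code> type reported by the REPL.</p>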
<p>Finally, we can write the <code class="language-plaintext highlighter-rouge">Listify</code> class to recursively break down a
symbol into a list of characters:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">Listify</span> <span class="p">(</span><span class="n">sym</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">result</span> <span class="o">::</span> <span class="p">[</span><span class="kt">Symbol</span><span class="p">])</span> <span class="kr">where</span>
<span class="n">listify</span> <span class="o">::</span> <span class="kt">Proxy</span> <span class="n">result</span>
<span class="kr">instance</span> <span class="cp">{-# OVERLAPPING #-}</span> <span class="n">nil</span> <span class="o">~</span> <span class="n">'</span><span class="kt">[]</span> <span class="o">=></span> <span class="kt">Listify</span> <span class="s">""</span> <span class="n">nil</span> <span class="kr">where</span>
<span class="n">listify</span> <span class="o">=</span> <span class="kt">Proxy</span>
<span class="kr">instance</span> <span class="p">(</span> <span class="kt">Uncons</span> <span class="n">sym</span> <span class="n">h</span> <span class="n">t</span>
<span class="p">,</span> <span class="kt">Listify</span> <span class="n">t</span> <span class="n">result</span><span class="p">,</span> <span class="n">result'</span> <span class="o">~</span> <span class="p">(</span><span class="n">h</span> <span class="n">'</span><span class="o">:</span> <span class="n">result</span><span class="p">)</span>
<span class="p">)</span> <span class="o">=></span> <span class="kt">Listify</span> <span class="n">sym</span> <span class="n">result'</span> <span class="kr">where</span>
<span class="n">listify</span> <span class="o">=</span> <span class="kt">Proxy</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-text" data-lang="text">>>> :t listify @"Hello"
listify @"Hello" :: Proxy '["H", "e", "l", "l", "o"]</code></pre></figure>
<p>And with this, we can parse anything we’d like.</p>
<h1 id="conclusion">Conclusion</h1>
<p>Of course all of the above could be done a lot more efficiently with
compiler support, and there’s no reason for that not to happen at some
point in the future. This post is just a proof of concept that
something like this is already possible today, and the presented
technique is suitable for some lightweight applications. For anything
larger in scale, Template Haskell is probably better suited to the job
for now.</p>
<p><a href="https://kcsongor.github.io/symbol-parsing-haskell/">Parsing type-level strings in Haskell</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on November 28, 2018.</p>
https://kcsongor.github.io/generic-deriving-bifunctor | 2018-01-01T00:00:00-00:00 | 2017-12-31T00:00:00+00:00 | Csongor Kiss | https://kcsongor.github.io
<section id="table-of-contents" class="toc">
<header>
<h3><i class="fa fa-book"></i> Overview</h3>
</header>
<div id="drawer">
<ul id="markdown-toc">
<li><a href="#the-problem" id="markdown-toc-the-problem">The problem</a></li>
<li><a href="#the-solution" id="markdown-toc-the-solution">The solution</a> <ul>
<li><a href="#the-boring-instances" id="markdown-toc-the-boring-instances">The boring instances</a></li>
<li><a href="#incoherent-instances" id="markdown-toc-incoherent-instances">Incoherent instances</a></li>
<li><a href="#default-signatures" id="markdown-toc-default-signatures">Default signatures</a></li>
<li><a href="#a-few-more-instances" id="markdown-toc-a-few-more-instances">A few more instances</a></li>
</ul>
</li>
<li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
<li><a href="#acknowledgements" id="markdown-toc-acknowledgements">Acknowledgements</a></li>
</ul>
</div>
</section>
<!-- /#table-of-contents -->
<p>Recently, I’ve been experimenting with deriving various type class instances
generically, and seeing how far we can go before having to resort to
TemplateHaskell. This post is a showcase of one such experiment: deriving
<a href="https://hackage.haskell.org/package/bifunctors">Bifunctor</a>, a type class that ranges
over types of kind <code class="language-plaintext highlighter-rouge">* -> * -> *</code>, something <code class="language-plaintext highlighter-rouge">GHC.Generics</code> is known not to be
well suited for. The accompanying source code can be found in <a href="https://gist.github.com/kcsongor/a8cb718f676c6ca1d999bfc56def9b7b">this gist</a>.</p>
<h2 id="the-problem">The problem</h2>
<p>The <a href="https://hackage.haskell.org/package/base-4.10.1.0/docs/GHC-Generics.html">GHC.Generics</a>
module defines two representations: <code class="language-plaintext highlighter-rouge">Generic</code> and <code class="language-plaintext highlighter-rouge">Generic1</code>. The former is used to describe
types of kind <code class="language-plaintext highlighter-rouge">*</code>, while the latter is used for <code class="language-plaintext highlighter-rouge">* -> *</code>.
For example, the <code class="language-plaintext highlighter-rouge">Generic1</code> representation is used in the <a href="http://hackage.haskell.org/package/generic-deriving-1.12/docs/Generics-Deriving-Functor.html">generic-deriving</a> package’s Functor derivation.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">GFunctor</span> <span class="p">(</span><span class="n">f</span> <span class="o">::</span> <span class="o">*</span> <span class="o">-></span> <span class="o">*</span><span class="p">)</span> <span class="kr">where</span>
<span class="n">gmap</span> <span class="o">::</span> <span class="p">(</span><span class="n">a</span> <span class="o">-></span> <span class="n">b</span><span class="p">)</span> <span class="o">-></span> <span class="n">f</span> <span class="n">a</span> <span class="o">-></span> <span class="n">f</span> <span class="n">b</span></code></pre></figure>
<p>Then instances are defined for the generic building blocks. Whenever we have a
<code class="language-plaintext highlighter-rouge">GFunctor (Rep1 f)</code>, we can turn that into a <code class="language-plaintext highlighter-rouge">Functor f</code>.</p>
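<p>As a rough illustration of that last step, here is a cut-down sketch of the approach. It is
not the actual code from <code class="language-plaintext highlighter-rouge">generic-deriving</code>: the names <code class="language-plaintext highlighter-rouge">genericFmap</code> and <code class="language-plaintext highlighter-rouge">Tagged</code> are
hypothetical, and the instances for <code class="language-plaintext highlighter-rouge">Rec1</code> and <code class="language-plaintext highlighter-rouge">:.:</code> are left out, so recursive and
nested types are not covered.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">{-# LANGUAGE DeriveGeneric, FlexibleContexts, TypeOperators #-}
import GHC.Generics

class GFunctor f where
  gmap :: (a -> b) -> f a -> f b

-- Instances for the generic building blocks.
instance GFunctor U1 where
  gmap _ U1 = U1

instance GFunctor Par1 where
  gmap f (Par1 a) = Par1 (f a)          -- the type parameter itself: apply f

instance GFunctor (K1 i c) where
  gmap _ (K1 c) = K1 c                  -- a constant field: leave it alone

instance GFunctor f => GFunctor (M1 i m f) where
  gmap f (M1 x) = M1 (gmap f x)         -- look through metadata

instance (GFunctor l, GFunctor r) => GFunctor (l :+: r) where
  gmap f (L1 x) = L1 (gmap f x)
  gmap f (R1 x) = R1 (gmap f x)

instance (GFunctor l, GFunctor r) => GFunctor (l :*: r) where
  gmap f (x :*: y) = gmap f x :*: gmap f y

-- Hypothetical helper: go to the representation, map, and come back.
genericFmap :: (Generic1 f, GFunctor (Rep1 f)) => (a -> b) -> f a -> f b
genericFmap f = to1 . gmap f . from1

-- Hypothetical example type.
data Tagged a = Untagged | Tagged Int a
  deriving (Show, Generic1)

main :: IO ()
main = print (genericFmap show (Tagged 1 True))</code></pre></figure>
<p>Running <code class="language-plaintext highlighter-rouge">main</code> prints <code class="language-plaintext highlighter-rouge">Tagged 1 "True"</code>: the <code class="language-plaintext highlighter-rouge">Int</code> field is a <code class="language-plaintext highlighter-rouge">K1</code> and is left untouched,
while the <code class="language-plaintext highlighter-rouge">a</code> field is a <code class="language-plaintext highlighter-rouge">Par1</code> and gets mapped.</p>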
<p>With this, it’s possible to derive many useful instances of classes that range
over <code class="language-plaintext highlighter-rouge">*</code> or <code class="language-plaintext highlighter-rouge">* -> *</code>. However, there’s no <code class="language-plaintext highlighter-rouge">Generic2</code>, so if we try to adapt <code class="language-plaintext highlighter-rouge">generic-deriving</code>’s
Functor approach to Bifunctors, we’ll run into problems.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">Bifunctor</span> <span class="p">(</span><span class="n">p</span> <span class="o">::</span> <span class="o">*</span> <span class="o">-></span> <span class="o">*</span> <span class="o">-></span> <span class="o">*</span><span class="p">)</span> <span class="kr">where</span>
<span class="n">bimap</span> <span class="o">::</span> <span class="p">(</span><span class="n">a</span> <span class="o">-></span> <span class="n">b</span><span class="p">)</span> <span class="o">-></span> <span class="p">(</span><span class="n">c</span> <span class="o">-></span> <span class="n">d</span><span class="p">)</span> <span class="o">-></span> <span class="n">p</span> <span class="n">a</span> <span class="n">c</span> <span class="o">-></span> <span class="n">p</span> <span class="n">b</span> <span class="n">d</span></code></pre></figure>
<p>The type parameter <code class="language-plaintext highlighter-rouge">p</code> takes two arguments, but the generic <code class="language-plaintext highlighter-rouge">Rep</code> and <code class="language-plaintext highlighter-rouge">Rep1</code>
representations are strictly <code class="language-plaintext highlighter-rouge">* -> *</code> (in the case of <code class="language-plaintext highlighter-rouge">Rep</code>, the type parameter
is phantom – it’s only there so that much of the structure of <code class="language-plaintext highlighter-rouge">Rep</code> and <code class="language-plaintext highlighter-rouge">Rep1</code>
can be shared, and <code class="language-plaintext highlighter-rouge">Rep1</code> requires <code class="language-plaintext highlighter-rouge">* -> *</code>). This means that even if we
defined a <code class="language-plaintext highlighter-rouge">GBifunctor</code>, we would need to require a <code class="language-plaintext highlighter-rouge">GBifunctor (Rep2 p)</code> which
we could then turn into a <code class="language-plaintext highlighter-rouge">Bifunctor p</code>. Alas, <code class="language-plaintext highlighter-rouge">Rep2</code> doesn’t exist.</p>
<p>Indeed, the deriving mechanism in the bifunctors package uses Template Haskell.</p>
<h2 id="the-solution">The solution</h2>
<p>The solution is inspired by how lenses implement polymorphic updates. The idea
is that a <code class="language-plaintext highlighter-rouge">Lens s t a b</code> focuses on the <code class="language-plaintext highlighter-rouge">a</code> inside some structure <code class="language-plaintext highlighter-rouge">s</code>, and if
we swap that <code class="language-plaintext highlighter-rouge">a</code> with a <code class="language-plaintext highlighter-rouge">b</code>, we get a <code class="language-plaintext highlighter-rouge">t</code>.</p>
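<p>For readers who haven’t met polymorphic updates before, here is a minimal van Laarhoven-style
sketch. Everything in it is defined locally for illustration (<code class="language-plaintext highlighter-rouge">first'</code> and <code class="language-plaintext highlighter-rouge">over</code> are not
imported from a lens library), so treat it as an assumption-laden toy rather than the real API.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">{-# LANGUAGE RankNTypes #-}
import Data.Functor.Identity (Identity (..))

-- A lens focusing on an `a` inside `s`; replacing it with a `b` yields a `t`.
type Lens s t a b = forall f. Functor f => (a -> f b) -> s -> f t

-- Focus on the first component of a pair. Changing its type from a to b
-- changes the type of the whole pair from (a, c) to (b, c).
first' :: Lens (a, c) (b, c) a b
first' k (a, c) = fmap (\b -> (b, c)) (k a)

-- Apply a function to the focused value.
over :: Lens s t a b -> (a -> b) -> s -> t
over l f = runIdentity . l (Identity . f)

main :: IO ()
main = print (over first' show (1 :: Int, True))</code></pre></figure>
<p>This prints <code class="language-plaintext highlighter-rouge">("1",True)</code>: the structure went from <code class="language-plaintext highlighter-rouge">(Int, Bool)</code> to <code class="language-plaintext highlighter-rouge">(String, Bool)</code>, which is
exactly the <code class="language-plaintext highlighter-rouge">s</code>-to-<code class="language-plaintext highlighter-rouge">t</code> relationship we want to track.</p>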
<p>Since we’re talking about Bifunctors now, we need two more type variables:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">GBifunctor</span> <span class="n">s</span> <span class="n">t</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span>
<span class="n">gbimap</span> <span class="o">::</span> <span class="p">(</span><span class="n">a</span> <span class="o">-></span> <span class="n">b</span><span class="p">)</span> <span class="o">-></span> <span class="p">(</span><span class="n">c</span> <span class="o">-></span> <span class="n">d</span><span class="p">)</span> <span class="o">-></span> <span class="n">s</span> <span class="n">x</span> <span class="o">-></span> <span class="n">t</span> <span class="n">x</span></code></pre></figure>
<p><code class="language-plaintext highlighter-rouge">s</code> and <code class="language-plaintext highlighter-rouge">t</code> will be the generic representations, which means they are of kind
<code class="language-plaintext highlighter-rouge">* -> *</code>. However, we’re going to be using <code class="language-plaintext highlighter-rouge">Generic</code> instead of <code class="language-plaintext highlighter-rouge">Generic1</code>, so
the type parameter <code class="language-plaintext highlighter-rouge">x</code> is not used.</p>
<p>Unlike the <code class="language-plaintext highlighter-rouge">GFunctor</code> class, which looked exactly like <code class="language-plaintext highlighter-rouge">Functor</code>, this one looks
quite different from <code class="language-plaintext highlighter-rouge">Bifunctor</code>. It is also important to note that <code class="language-plaintext highlighter-rouge">gbimap</code>’s type
signature is more polymorphic than that of <code class="language-plaintext highlighter-rouge">bimap</code>, so we need to ensure that
our instances are properly parametric.</p>
<p class="notice">In an earlier version of this class, I had functional dependencies on the
class that expressed this interrelation between the type variables, but I had to
lose them so that more interesting instances could be defined (more on this
later).</p>
<h3 id="the-boring-instances">The boring instances</h3>
<p>The first instance simply looks through the metadata node.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="kt">GBifunctor</span> <span class="n">s</span> <span class="n">t</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span>
<span class="o">=></span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">M1</span> <span class="n">k</span> <span class="n">m</span> <span class="n">s</span><span class="p">)</span> <span class="p">(</span><span class="kt">M1</span> <span class="n">k</span> <span class="n">m</span> <span class="n">t</span><span class="p">)</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span>
<span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="o">=</span> <span class="kt">M1</span> <span class="o">.</span> <span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="o">.</span> <span class="n">unM1</span></code></pre></figure>
<p>A sum <code class="language-plaintext highlighter-rouge">l :+: r</code> can be turned into <code class="language-plaintext highlighter-rouge">l' :+: r'</code> if we can turn <code class="language-plaintext highlighter-rouge">l</code> into <code class="language-plaintext highlighter-rouge">l'</code> and
<code class="language-plaintext highlighter-rouge">r</code> into <code class="language-plaintext highlighter-rouge">r'</code>.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span>
<span class="p">(</span> <span class="kt">GBifunctor</span> <span class="n">l</span> <span class="n">l'</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span>
<span class="p">,</span> <span class="kt">GBifunctor</span> <span class="n">r</span> <span class="n">r'</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span>
<span class="p">)</span> <span class="o">=></span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="n">l</span> <span class="o">:+:</span> <span class="n">r</span><span class="p">)</span> <span class="p">(</span><span class="n">l'</span> <span class="o">:+:</span> <span class="n">r'</span><span class="p">)</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span>
<span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="p">(</span><span class="kt">L1</span> <span class="n">l</span><span class="p">)</span> <span class="o">=</span> <span class="kt">L1</span> <span class="p">(</span><span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="n">l</span><span class="p">)</span>
<span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="p">(</span><span class="kt">R1</span> <span class="n">r</span><span class="p">)</span> <span class="o">=</span> <span class="kt">R1</span> <span class="p">(</span><span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="n">r</span><span class="p">)</span></code></pre></figure>
<p>And similarly, for products.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span>
<span class="p">(</span> <span class="kt">GBifunctor</span> <span class="n">l</span> <span class="n">l'</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span>
<span class="p">,</span> <span class="kt">GBifunctor</span> <span class="n">r</span> <span class="n">r'</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span>
<span class="p">)</span> <span class="o">=></span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="n">l</span> <span class="o">:*:</span> <span class="n">r</span><span class="p">)</span> <span class="p">(</span><span class="n">l'</span> <span class="o">:*:</span> <span class="n">r'</span><span class="p">)</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span>
<span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="p">(</span><span class="n">l</span> <span class="o">:*:</span> <span class="n">r</span><span class="p">)</span> <span class="o">=</span> <span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="n">l</span> <span class="o">:*:</span> <span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="n">r</span></code></pre></figure>
<p>The last boring instance is for unit types, which are trivially Bifunctors.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="kt">GBifunctor</span> <span class="kt">U1</span> <span class="kt">U1</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span>
<span class="n">gbimap</span> <span class="kr">_</span> <span class="kr">_</span> <span class="o">=</span> <span class="n">id</span></code></pre></figure>
<h3 id="incoherent-instances">Incoherent instances</h3>
<p>With all of the gluing out of the way, we can now get to the meat of the
problem: the actual fields in the constructors. When considering a field, we
have 3 cases:</p>
<p>The field is of type <code class="language-plaintext highlighter-rouge">a</code>, and we apply the first function to turn it into a <code class="language-plaintext highlighter-rouge">b</code>.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="cp">{-# INCOHERENT #-}</span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="n">a</span><span class="p">)</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="n">b</span><span class="p">)</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span>
<span class="n">gbimap</span> <span class="n">f</span> <span class="kr">_</span> <span class="p">(</span><span class="kt">K1</span> <span class="n">a</span><span class="p">)</span> <span class="o">=</span> <span class="kt">K1</span> <span class="p">(</span><span class="n">f</span> <span class="n">a</span><span class="p">)</span></code></pre></figure>
<p>Similarly, if it’s a <code class="language-plaintext highlighter-rouge">c</code>, we turn it into a <code class="language-plaintext highlighter-rouge">d</code> using the second function.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="cp">{-# INCOHERENT #-}</span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="n">c</span><span class="p">)</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="n">d</span><span class="p">)</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span>
<span class="n">gbimap</span> <span class="kr">_</span> <span class="n">g</span> <span class="p">(</span><span class="kt">K1</span> <span class="n">a</span><span class="p">)</span> <span class="o">=</span> <span class="kt">K1</span> <span class="p">(</span><span class="n">g</span> <span class="n">a</span><span class="p">)</span></code></pre></figure>
<p>Finally, if the field is neither <code class="language-plaintext highlighter-rouge">a</code> nor <code class="language-plaintext highlighter-rouge">c</code>, we just leave it alone.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="cp">{-# INCOHERENT #-}</span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="n">x</span><span class="p">)</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="n">x</span><span class="p">)</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span>
<span class="n">gbimap</span> <span class="kr">_</span> <span class="kr">_</span> <span class="o">=</span> <span class="n">id</span></code></pre></figure>
<p>Note that these instances need to be defined with <code class="language-plaintext highlighter-rouge">{-# INCOHERENT #-}</code> pragmas.
This is required because neither of <code class="language-plaintext highlighter-rouge">(Rec0 a) (Rec0 b) a b c d</code> and <code class="language-plaintext highlighter-rouge">(Rec0 c) (Rec0 d) a b c d</code> is
more specific than the other.</p>
<p>However, in our case, this is not a problem, because we’re going to invoke
instance resolution with polymorphic arguments, so there will be exactly one
instance that matches.</p>
<h3 id="default-signatures">Default signatures</h3>
<p>We can now revise our original class definition, and add a default signature
(<code class="language-plaintext highlighter-rouge">DefaultSignatures</code>). This will make <code class="language-plaintext highlighter-rouge">Bifunctor</code> derivable with <code class="language-plaintext highlighter-rouge">DeriveAnyClass</code>.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">Bifunctor</span> <span class="n">p</span> <span class="kr">where</span>
<span class="n">bimap</span> <span class="o">::</span> <span class="p">(</span><span class="n">a</span> <span class="o">-></span> <span class="n">b</span><span class="p">)</span> <span class="o">-></span> <span class="p">(</span><span class="n">c</span> <span class="o">-></span> <span class="n">d</span><span class="p">)</span> <span class="o">-></span> <span class="n">p</span> <span class="n">a</span> <span class="n">c</span> <span class="o">-></span> <span class="n">p</span> <span class="n">b</span> <span class="n">d</span>
<span class="kr">default</span> <span class="n">bimap</span>
<span class="o">::</span> <span class="p">(</span> <span class="kt">Generic</span> <span class="p">(</span><span class="n">p</span> <span class="n">a</span> <span class="n">c</span><span class="p">)</span>
<span class="p">,</span> <span class="kt">Generic</span> <span class="p">(</span><span class="n">p</span> <span class="n">b</span> <span class="n">d</span><span class="p">)</span>
<span class="p">,</span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">Rep</span> <span class="p">(</span><span class="n">p</span> <span class="n">a</span> <span class="n">c</span><span class="p">))</span> <span class="p">(</span><span class="kt">Rep</span> <span class="p">(</span><span class="n">p</span> <span class="n">b</span> <span class="n">d</span><span class="p">))</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span>
<span class="p">)</span> <span class="o">=></span> <span class="p">(</span><span class="n">a</span> <span class="o">-></span> <span class="n">b</span><span class="p">)</span> <span class="o">-></span> <span class="p">(</span><span class="n">c</span> <span class="o">-></span> <span class="n">d</span><span class="p">)</span> <span class="o">-></span> <span class="n">p</span> <span class="n">a</span> <span class="n">c</span> <span class="o">-></span> <span class="n">p</span> <span class="n">b</span> <span class="n">d</span>
<span class="n">bimap</span> <span class="n">f</span> <span class="n">g</span> <span class="o">=</span> <span class="n">to</span> <span class="o">.</span> <span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="o">.</span> <span class="n">from</span></code></pre></figure>
<p>Note the line <code class="language-plaintext highlighter-rouge">GBifunctor (Rep (p a c)) (Rep (p b d)) a b c d</code>. Here’s where we
establish the relationship between the types. This now allows us to derive a
<code class="language-plaintext highlighter-rouge">Bifunctor</code> instance for <code class="language-plaintext highlighter-rouge">Either</code>:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">deriving</span> <span class="kr">instance</span> <span class="kt">Bifunctor</span> <span class="kt">Either</span></code></pre></figure>
<p>For example, when looking at the <code class="language-plaintext highlighter-rouge">Left</code> constructor, the compiler will try to
find an instance for <code class="language-plaintext highlighter-rouge">GBifunctor (Rec0 a) (Rec0 b) a b c d</code>. There is exactly
one instance that matches this, so our incoherent instance will not bite us.
This is important: if instead we wanted an instance for a concrete type, say,
<code class="language-plaintext highlighter-rouge">Either Int Int</code>, all of our incoherent instances would match, and an arbitrary
one would be picked. However, we avoid this problem by ensuring that the
instance is derived for the aforementioned polymorphic form.</p>
<p>With this, we have a correct implementation of <code class="language-plaintext highlighter-rouge">bimap</code> for <code class="language-plaintext highlighter-rouge">Either</code>:</p>
<figure class="highlight"><pre><code class="language-txt" data-lang="txt">>>> bimap show (+ 10) (Left 10)
Left "10"
>>> bimap show (+ 10) (Right 10)
Right 20</code></pre></figure>
<p>Even better, compiled with <code class="language-plaintext highlighter-rouge">-O1</code>, all of the overhead from using generics is
optimised away:</p>
<figure class="highlight"><pre><code class="language-txt" data-lang="txt">$fBifunctorEither_$cbimap
= \ @ a_a3EL @ b_a3EM @ c_a3EN @ d_a3EO f_X1EN g_X1EP eta_B1 ->
case eta_B1 of {
Left g1_a3X5 -> Left (f_X1EN g1_a3X5);
Right g1_a3X8 -> Right (g_X1EP g1_a3X8)
}</code></pre></figure>
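<p>For comparison, the Core above has the same shape as the instance one would write by hand. The following is a minimal sketch of that hand-written version. It redeclares the <code class="language-plaintext highlighter-rouge">Bifunctor</code> class without the default signature so that it compiles on its own; that stand-alone copy is an assumption made purely for illustration.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">-- Stand-alone copy of the class from this post, minus the default signature.
class Bifunctor p where
  bimap :: (a -> b) -> (c -> d) -> p a c -> p b d

-- The hand-written instance: one case per constructor, no generics involved.
instance Bifunctor Either where
  bimap f _ (Left a)  = Left (f a)
  bimap _ g (Right c) = Right (g c)</code></pre></figure>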
<h3 id="a-few-more-instances">A few more instances</h3>
<p>The above deriving mechanism is naive: it only looks at fields whose type is
exactly one of the two type parameters. But we can do better: what if the field is a <code class="language-plaintext highlighter-rouge">Maybe a</code>?
Surely we can turn that into a <code class="language-plaintext highlighter-rouge">Maybe b</code>. Or if it’s an <code class="language-plaintext highlighter-rouge">Either a c</code>, we can turn that into
an <code class="language-plaintext highlighter-rouge">Either b d</code>, since <code class="language-plaintext highlighter-rouge">Either</code> has a <code class="language-plaintext highlighter-rouge">Bifunctor</code> instance.</p>
<p>The following five instances do exactly that.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="cp">{-# INCOHERENT #-}</span> <span class="kt">Bifunctor</span> <span class="n">f</span>
<span class="o">=></span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">a</span> <span class="n">c</span><span class="p">))</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">b</span> <span class="n">d</span><span class="p">))</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span>
<span class="n">gbimap</span> <span class="n">f</span> <span class="n">g</span> <span class="p">(</span><span class="kt">K1</span> <span class="n">a</span><span class="p">)</span> <span class="o">=</span> <span class="kt">K1</span> <span class="p">(</span><span class="n">bimap</span> <span class="n">f</span> <span class="n">g</span> <span class="n">a</span><span class="p">)</span>
<span class="kr">instance</span> <span class="cp">{-# INCOHERENT #-}</span> <span class="kt">Functor</span> <span class="n">f</span>
<span class="o">=></span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">c</span><span class="p">))</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">d</span><span class="p">))</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span>
<span class="n">gbimap</span> <span class="kr">_</span> <span class="n">g</span> <span class="p">(</span><span class="kt">K1</span> <span class="n">a</span><span class="p">)</span> <span class="o">=</span> <span class="kt">K1</span> <span class="p">(</span><span class="n">fmap</span> <span class="n">g</span> <span class="n">a</span><span class="p">)</span>
<span class="kr">instance</span> <span class="cp">{-# INCOHERENT #-}</span> <span class="kt">Functor</span> <span class="n">f</span>
<span class="o">=></span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">a</span><span class="p">))</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">b</span><span class="p">))</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span>
<span class="n">gbimap</span> <span class="n">f</span> <span class="kr">_</span> <span class="p">(</span><span class="kt">K1</span> <span class="n">a</span><span class="p">)</span> <span class="o">=</span> <span class="kt">K1</span> <span class="p">(</span><span class="n">fmap</span> <span class="n">f</span> <span class="n">a</span><span class="p">)</span>
<span class="kr">instance</span> <span class="cp">{-# INCOHERENT #-}</span> <span class="kt">Bifunctor</span> <span class="n">f</span>
<span class="o">=></span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">a</span> <span class="n">a</span><span class="p">))</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">b</span> <span class="n">b</span><span class="p">))</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span>
<span class="n">gbimap</span> <span class="n">f</span> <span class="kr">_</span> <span class="p">(</span><span class="kt">K1</span> <span class="n">a</span><span class="p">)</span> <span class="o">=</span> <span class="kt">K1</span> <span class="p">(</span><span class="n">bimap</span> <span class="n">f</span> <span class="n">f</span> <span class="n">a</span><span class="p">)</span>
<span class="kr">instance</span> <span class="cp">{-# INCOHERENT #-}</span> <span class="kt">Bifunctor</span> <span class="n">f</span>
<span class="o">=></span> <span class="kt">GBifunctor</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">c</span> <span class="n">c</span><span class="p">))</span> <span class="p">(</span><span class="kt">Rec0</span> <span class="p">(</span><span class="n">f</span> <span class="n">d</span> <span class="n">d</span><span class="p">))</span> <span class="n">a</span> <span class="n">b</span> <span class="n">c</span> <span class="n">d</span> <span class="kr">where</span>
<span class="n">gbimap</span> <span class="kr">_</span> <span class="n">g</span> <span class="p">(</span><span class="kt">K1</span> <span class="n">b</span><span class="p">)</span> <span class="o">=</span> <span class="kt">K1</span> <span class="p">(</span><span class="n">bimap</span> <span class="n">g</span> <span class="n">g</span> <span class="n">b</span><span class="p">)</span></code></pre></figure>
<p>Now we can derive even more interesting <code class="language-plaintext highlighter-rouge">Bifunctor</code> instances.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">T</span> <span class="n">a</span> <span class="n">b</span> <span class="o">=</span> <span class="kt">T1</span> <span class="p">(</span><span class="kt">Maybe</span> <span class="n">a</span><span class="p">)</span> <span class="n">a</span> <span class="p">(</span><span class="kt">Either</span> <span class="n">a</span> <span class="n">b</span><span class="p">)</span> <span class="o">|</span> <span class="kt">T2</span> <span class="p">(</span><span class="kt">Maybe</span> <span class="n">b</span><span class="p">)</span>
<span class="kr">deriving</span> <span class="p">(</span><span class="kt">Generic</span><span class="p">,</span> <span class="kt">Bifunctor</span><span class="p">)</span></code></pre></figure>
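<p>To make the behaviour concrete, here is a sketch of a hand-written function that the derived instance for <code class="language-plaintext highlighter-rouge">T</code> should agree with, assuming each field is handled by the matching instance listed above. The name <code class="language-plaintext highlighter-rouge">bimapT</code> is hypothetical, and the type is redeclared without the generic deriving so that the snippet stands on its own.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">data T a b = T1 (Maybe a) a (Either a b) | T2 (Maybe b)
  deriving (Eq, Show)

-- Hand-written counterpart of the derived bimap for T (illustrative name).
bimapT :: (a -> b) -> (c -> d) -> T a c -> T b d
bimapT f g (T1 ma a e) =
  T1 (fmap f ma)                        -- Maybe a: mapped with the first function
     (f a)                              -- bare field of the first parameter
     (either (Left . f) (Right . g) e)  -- Either a c: mapped on both sides
bimapT _ g (T2 mc) = T2 (fmap g mc)     -- Maybe c: mapped with the second function</code></pre></figure>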
<h2 id="conclusion">Conclusion</h2>
<p>We have seen a technique for approximating a hypothetical <code class="language-plaintext highlighter-rouge">Generic2</code>
representation using only <code class="language-plaintext highlighter-rouge">Generic</code>. Of course, there was nothing specific
about the number 2: we can easily generalise this to any fixed number of
parameters.</p>
<p>I’m planning on writing a post about a further generalisation of
this idea, which allows us to talk about types that have an arbitrary number of type
parameters (unlike here, where it’s a fixed number). I used it in the
<a href="https://hackage.haskell.org/package/generic-lens">generic-lens</a> library to
allow for type-changing lenses over any type parameter (thanks to the more
elaborate machinery, there is no need for incoherent instance
resolution).</p>
<p>It would be interesting to see how far this can be pushed before hitting a
roadblock that would truly require a bespoke <code class="language-plaintext highlighter-rouge">GenericN</code> representation.</p>
<h2 id="acknowledgements">Acknowledgements</h2>
<p>Thanks to <a href="https://github.com/adituv">@adituv</a> for pointing out that two instances were missing.</p>
<p><a href="https://kcsongor.github.io/generic-deriving-bifunctor/">Deriving Bifunctor with Generics</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on December 31, 2017.</p>
https://kcsongor.github.io/generic-lens2017-12-10T00:00:00-00:002017-12-10T00:00:00+00:00Csongor Kisshttps://kcsongor.github.io
<section id="table-of-contents" class="toc">
<header>
<h3><i class="fa fa-book"></i> Overview</h3>
</header>
<div id="drawer">
<ul id="markdown-toc">
<li><a href="#overview" id="markdown-toc-overview">Overview</a> <ul>
<li><a href="#examples" id="markdown-toc-examples">Examples</a> <ul>
<li><a href="#field" id="markdown-toc-field">field</a></li>
<li><a href="#typed" id="markdown-toc-typed">typed</a></li>
<li><a href="#position" id="markdown-toc-position">position</a></li>
<li><a href="#super-row-polymorphism" id="markdown-toc-super-row-polymorphism">super (row polymorphism)</a></li>
<li><a href="#_ctor" id="markdown-toc-_ctor">_Ctor</a></li>
</ul>
</li>
<li><a href="#mtl" id="markdown-toc-mtl">mtl</a></li>
</ul>
</li>
<li><a href="#performance" id="markdown-toc-performance">Performance</a></li>
<li><a href="#quick-note-migration" id="markdown-toc-quick-note-migration">Quick note (migration)</a></li>
<li><a href="#acknowledgements" id="markdown-toc-acknowledgements">Acknowledgements</a></li>
</ul>
</div>
</section>
<!-- /#table-of-contents -->
<p>The <a href="https://hackage.haskell.org/package/generic-lens">generic-lens</a> library
provides utilities for deriving various optics for your datatypes,
using <code class="language-plaintext highlighter-rouge">GHC.Generics</code>. In this post I’ll go over some of the features and
provide examples of using them.</p>
<h2 id="overview">Overview</h2>
<p>Lenses have proven to be an extremely powerful tool in the Haskell ecosystem.
<code class="language-plaintext highlighter-rouge">generic-lens</code> uses <code class="language-plaintext highlighter-rouge">GHC.Generics</code> to derive lenses and prisms on the fly, only
when they are needed. These optics are highly polymorphic, and can be used with
all types that are of the right shape. Extra care has been taken to keep type
errors readable.</p>
<h3 id="examples">Examples</h3>
<p>To get started, we will need the following extensions:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="cp">{-# LANGUAGE DataKinds #-}</span>
<span class="cp">{-# LANGUAGE DeriveGeneric #-}</span>
<span class="cp">{-# LANGUAGE FlexibleContexts #-}</span>
<span class="cp">{-# LANGUAGE TypeApplications #-}</span>
<span class="cp">{-# LANGUAGE TypeFamilies #-}</span></code></pre></figure>
<p>And the following imports:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">import</span> <span class="nn">Control.Lens</span>
<span class="kr">import</span> <span class="nn">Data.Generics.Product</span>
<span class="kr">import</span> <span class="nn">GHC.Generics</span></code></pre></figure>
<p>Consider the following datatype:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">Human</span> <span class="n">a</span>
<span class="o">=</span> <span class="kt">Human</span>
<span class="p">{</span> <span class="n">name</span> <span class="o">::</span> <span class="kt">String</span>
<span class="p">,</span> <span class="n">age</span> <span class="o">::</span> <span class="kt">Int</span>
<span class="p">,</span> <span class="n">address</span> <span class="o">::</span> <span class="kt">String</span>
<span class="p">,</span> <span class="n">other</span> <span class="o">::</span> <span class="n">a</span>
<span class="p">}</span> <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Generic</span><span class="p">,</span> <span class="kt">Show</span><span class="p">)</span></code></pre></figure>
<h4 id="field">field</h4>
<p>We can access the <code class="language-plaintext highlighter-rouge">name</code> field:</p>
<figure class="highlight"><pre><code class="language-txt" data-lang="txt">>>> Human "John" 18 "London" True ^. field @"name"
"John"</code></pre></figure>
<p>We can update fields too, even changing types where possible (when the type of
the field is a type parameter of the datatype):</p>
<figure class="highlight"><pre><code class="language-txt" data-lang="txt">>>> Human "John" 18 "London" True & field @"other" %~ show
Human {name = "John", age = 18, address = "London", other = "True"}</code></pre></figure>
<p>In case of sum types, it only makes sense to have a lens on the fields that
appear in every constructor. Trying to use <code class="language-plaintext highlighter-rouge">field</code> to get a lens for a partial
field is a type error.</p>
<p class="notice">Note that the <code class="language-plaintext highlighter-rouge">field</code> lens works with <code class="language-plaintext highlighter-rouge">DuplicateRecordFields</code>, which means that
record fields can actually be shared, and we can get a reusable lens for all
cases without code duplication.</p>
<h4 id="typed">typed</h4>
<p>We can directly reference a field by its type, as long as the type is unique in
the structure.</p>
<figure class="highlight"><pre><code class="language-txt" data-lang="txt">>>> Human "John" 18 "London" True ^. typed @Bool
True</code></pre></figure>
<figure class="highlight"><pre><code class="language-txt" data-lang="txt">>>> Human "John" 18 "London" True ^. typed @String
<interactive>:34:34: error:
• The type Human Bool contains multiple values of type [Char].
The choice of value is thus ambiguous. The offending constructors are:
• Human
• In the second argument of ‘(^.)’, namely ‘typed @String’
In the expression: Human "John" 18 "London" True ^. typed @String
In an equation for ‘it’:
it = Human "John" 18 "London" True ^. typed @String</code></pre></figure>
<h4 id="position">position</h4>
<p>When the above two fail, and we have a product type, we can specify the field
of interest by its position.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">MyTuple</span> <span class="n">a</span> <span class="n">b</span> <span class="o">=</span> <span class="kt">MyTuple</span> <span class="n">a</span> <span class="n">b</span> <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Generic</span><span class="p">,</span> <span class="kt">Show</span><span class="p">)</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-txt" data-lang="txt">>>> MyTuple 10 20 & position @1 .~ "hello"
MyTuple "hello" 20</code></pre></figure>
<h4 id="super-row-polymorphism">super (row polymorphism)</h4>
<p>Given two records, where the set of fields of one is a subset of that of the
other, we can talk about a structural subtype relationship. The <code class="language-plaintext highlighter-rouge">super</code> lens
allows us to treat the subtype as the supertype - without forgetting the
original structure.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">data</span> <span class="kt">Small</span>
<span class="o">=</span> <span class="kt">Small</span>
<span class="p">{</span> <span class="n">small</span> <span class="o">::</span> <span class="kt">Int</span>
<span class="p">}</span> <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Generic</span><span class="p">,</span> <span class="kt">Show</span><span class="p">)</span>
<span class="kr">data</span> <span class="kt">Large</span>
<span class="o">=</span> <span class="kt">Large</span>
<span class="p">{</span> <span class="n">small</span> <span class="o">::</span> <span class="kt">Int</span>
<span class="p">,</span> <span class="n">large</span> <span class="o">::</span> <span class="kt">String</span>
<span class="p">}</span> <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Generic</span><span class="p">,</span> <span class="kt">Show</span><span class="p">)</span>
<span class="n">smallFun</span> <span class="o">::</span> <span class="kt">Small</span> <span class="o">-></span> <span class="kt">Small</span>
<span class="n">smallFun</span> <span class="p">(</span><span class="kt">Small</span> <span class="n">n</span><span class="p">)</span> <span class="o">=</span> <span class="kt">Small</span> <span class="p">(</span><span class="n">n</span> <span class="o">+</span> <span class="mi">10</span><span class="p">)</span></code></pre></figure>
<p>(Here, we need the <code class="language-plaintext highlighter-rouge">{-# LANGUAGE DuplicateRecordFields #-}</code> extension in
addition to the previous ones.)</p>
<figure class="highlight"><pre><code class="language-txt" data-lang="txt">>>> Large 10 "foo" & super %~ smallFun
Large {small = 20, large = "foo"}</code></pre></figure>
<p>Or we can simply upcast:</p>
<figure class="highlight"><pre><code class="language-txt" data-lang="txt">>>> Large 10 "foo" ^. super :: Small
Small {small = 10}</code></pre></figure>
<figure class="highlight"><pre><code class="language-txt" data-lang="txt">>>> Small 10 ^. super :: Large
<interactive>:53:13: error:
• The type 'Small' is not a subtype of 'Large'.
The following fields are missing from 'Small':
• large</code></pre></figure>
<h4 id="_ctor">_Ctor</h4>
<p>We can also obtain prisms that focus on individual constructors:</p>
<figure class="highlight"><pre><code class="language-txt" data-lang="txt">>>> Human "John" 18 "London" True ^? _Ctor @"Human"
Just ("John",18,"London",True)</code></pre></figure>
<figure class="highlight"><pre><code class="language-txt" data-lang="txt">>>> Human "John" 18 "London" True ^? _Ctor @"Human" . position @3
Just "London"</code></pre></figure>
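<p>The same prisms can be used in a compiled module. The sketch below assumes that <code class="language-plaintext highlighter-rouge">_Ctor</code> is imported from <code class="language-plaintext highlighter-rouge">Data.Generics.Sum</code> (it is not covered by the imports listed earlier) and uses a hypothetical <code class="language-plaintext highlighter-rouge">Shape</code> type with two constructors, so that the prism can actually fail to match. The helper names <code class="language-plaintext highlighter-rouge">radius</code> and <code class="language-plaintext highlighter-rouge">rectWidth</code> are made up for this example.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">{-# LANGUAGE DataKinds        #-}
{-# LANGUAGE DeriveGeneric    #-}
{-# LANGUAGE TypeApplications #-}

import Control.Lens ((^?))
import Data.Generics.Product (position)
import Data.Generics.Sum (_Ctor)
import GHC.Generics (Generic)

-- Hypothetical sum type, used only for illustration.
data Shape
  = Circle Double
  | Rect Double Double
  deriving (Generic, Show)

-- A single-field constructor focuses on the field itself.
radius :: Shape -> Maybe Double
radius s = s ^? _Ctor @"Circle"

-- A multi-field constructor focuses on a tuple, which can be
-- refined further with position.
rectWidth :: Shape -> Maybe Double
rectWidth s = s ^? _Ctor @"Rect" . position @1</code></pre></figure>
<p>Applied to a value built with the other constructor, both helpers return <code class="language-plaintext highlighter-rouge">Nothing</code>.</p>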
<h3 id="mtl">mtl</h3>
<p>So far, we haven’t provided any type signatures. Indeed, everything can be
inferred by the compiler. However, because these combinators are highly
polymorphic, it might be interesting to use them in a polymorphic context.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">f</span> <span class="o">::</span> <span class="p">(</span><span class="kt">MonadReader</span> <span class="n">env</span> <span class="n">m</span><span class="p">,</span> <span class="kt">HasField'</span> <span class="s">"username"</span> <span class="n">env</span> <span class="kt">String</span><span class="p">)</span> <span class="o">=></span> <span class="n">m</span> <span class="kt">String</span>
<span class="n">f</span> <span class="o">=</span> <span class="n">view</span> <span class="p">(</span><span class="n">field</span> <span class="o">@</span><span class="s">"username"</span><span class="p">)</span></code></pre></figure>
<p>This function is now polymorphic not just in the monad stack it will eventually
run in, but also in the type of the environment.</p>
<p>The type of <code class="language-plaintext highlighter-rouge">field</code> is</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">field</span> <span class="o">::</span> <span class="kt">HasField</span> <span class="n">field</span> <span class="n">s</span> <span class="n">t</span> <span class="n">a</span> <span class="n">b</span> <span class="o">=></span> <span class="kt">Lens</span> <span class="n">s</span> <span class="n">t</span> <span class="n">a</span> <span class="n">b</span></code></pre></figure>
<p><code class="language-plaintext highlighter-rouge">HasField'</code> (similarly to <code class="language-plaintext highlighter-rouge">Lens'</code>) is a type synonym for <code class="language-plaintext highlighter-rouge">HasField field s s a a</code>.</p>
<p>For a more comprehensive overview and more examples, please have a look at the
library on <a href="https://hackage.haskell.org/package/generic-lens">hackage</a>, or on
<a href="https://github.com/kcsongor/generic-lens">github</a>.</p>
<h2 id="performance">Performance</h2>
<p>An important question when evaluating such high-level abstractions is whether
the abstraction comes at the cost of performance. Fortunately, GHC optimises
away all of the overhead of the generic transformations, leaving us with code
that is equivalent to what we would’ve written manually.</p>
<p>This can be verified by comparing the generated core of both the manually
written lens and the generated one. However, it happened multiple times during
development that a small change (such as eta-reduction) broke the optimisation.
Joachim Breitner’s excellent
<a href="https://github.com/nomeata/inspection-testing">inspection-testing</a> tool, which
is now integrated into the automated test suite, is making sure that the
optimisation happens by automatically doing this comparison. This tool has been
invaluable in ensuring the performance guarantees, without having to manually
inspect the generated core after every single commit. The tests can be found
<a href="https://github.com/kcsongor/generic-lens/blob/master/test/Spec.hs">here</a>.</p>
<p class="notice">It’s important to mention that as of this release, only the lenses are
optimised away completely; the prisms still have some leftover overhead. This
is planned to be fixed in a future release.</p>
<h2 id="quick-note-migration">Quick note (migration)</h2>
<p>In case you were already using the library, there are some breaking changes in <code class="language-plaintext highlighter-rouge">0.5.0.0</code>.
Namely, all the <code class="language-plaintext highlighter-rouge">Has*</code> classes have been extended from 3 type parameters to 5.
Auxiliary constraint synonyms are provided, and migration should be relatively simple:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">f</span> <span class="o">::</span> <span class="kt">HasField</span> <span class="n">field</span> <span class="n">a</span> <span class="n">record</span> <span class="o">=></span> <span class="o">...</span></code></pre></figure>
<p>becomes</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">f</span> <span class="o">::</span> <span class="kt">HasField'</span> <span class="n">field</span> <span class="n">record</span> <span class="n">a</span> <span class="o">=></span> <span class="o">...</span></code></pre></figure>
<p class="notice">Notice the <code class="language-plaintext highlighter-rouge">'</code> at the end of the class name, and the swapping of the last two arguments.</p>
<h2 id="acknowledgements">Acknowledgements</h2>
<p>Thanks to Matthew Pickering for useful comments on a draft of this post.</p>
<p><a href="https://kcsongor.github.io/generic-lens/">Announcing generic-lens 0.5.0.0</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on December 10, 2017.</p>
https://kcsongor.github.io/purescript-safe-printf2017-9-25T00:00:00-00:002017-09-25T00:00:00+00:00Csongor Kisshttps://kcsongor.github.io
<section id="table-of-contents" class="toc">
<header>
<h3><i class="fa fa-book"></i> Overview</h3>
</header>
<div id="drawer">
<ul id="markdown-toc">
<li><a href="#the-problem" id="markdown-toc-the-problem">The problem</a></li>
<li><a href="#type-level-parsing" id="markdown-toc-type-level-parsing">Type-level parsing</a></li>
<li><a href="#how-the-sausage-gets-made-computing-the-output-type" id="markdown-toc-how-the-sausage-gets-made-computing-the-output-type">How the sausage gets made: computing the output type</a></li>
<li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ul>
</div>
</section>
<!-- /#table-of-contents -->
<p>One of the classic examples that keeps coming up when talking about dependently
typed programming languages is the “safe” <code class="language-plaintext highlighter-rouge">printf</code> function – one that ensures
that the number and type of arguments match the requirement in the format
specification.</p>
<p>In languages like Idris, this is just a function that takes a format string,
and returns the type of arguments required for constructing the formatted
output string.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"> <span class="n">format</span> <span class="s">"A number: %d, and a string: %s"</span> <span class="o">:</span> <span class="kt">Int</span> <span class="o">-></span> <span class="kt">String</span> <span class="o">-></span> <span class="kt">String</span></code></pre></figure>
<p>Other languages, like Rust, solve this by various means of metaprogramming:
writing a program (macro) that runs at compile-time, generating the program to
be executed at runtime.</p>
<p>What these two approaches have in common is that they both operate on strings
that are statically available to the compiler. The aim of this post is to show
another way of achieving the same result, with tools that are available in
PureScript – a strongly-typed functional language, with no dependent types.</p>
<h2 id="the-problem">The problem</h2>
<p>We want to write a program that takes a format string, some number of
arguments, and returns the result of inserting the arguments at their specified
places in the format string, and does all this in a type-safe way.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">></span> <span class="o">:</span><span class="n">t</span> <span class="n">format</span> <span class="o">@</span><span class="s">"Wurble %d %d %s"</span>
<span class="kt">Int</span> <span class="o">-></span> <span class="kt">Int</span> <span class="o">-></span> <span class="kt">String</span> <span class="o">-></span> <span class="kt">String</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">></span> <span class="n">format</span> <span class="o">@</span><span class="s">"Wurble %d %d %s"</span> <span class="mi">10</span> <span class="mi">20</span> <span class="s">"foo"</span>
<span class="s">"Wurble 10 20 foo"</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">></span> <span class="n">format</span> <span class="o">@</span><span class="s">"Wurble %d %d %s"</span> <span class="mi">10</span> <span class="mi">20</span> <span class="mi">30</span>
<span class="kt">Error</span> <span class="n">found</span><span class="o">:</span>
<span class="kt">Could</span> <span class="n">not</span> <span class="n">match</span> <span class="kr">type</span>
<span class="kt">String</span>
<span class="n">with</span> <span class="kr">type</span>
<span class="kt">Int</span>
<span class="n">while</span> <span class="n">trying</span> <span class="n">to</span> <span class="n">match</span> <span class="kr">type</span> <span class="kt">Function</span> <span class="kt">String</span>
<span class="n">with</span> <span class="kr">type</span> <span class="kt">Function</span> <span class="kt">Int</span></code></pre></figure>
<p class="notice">The <code class="language-plaintext highlighter-rouge">@</code> symbol before the string is the proxy syntax introduced in 0.12
which provides a concise way of passing types around. The format strings
are actually type-level literals – but more on this later.</p>
<p>Crucially, we need to compute a type from some input, but because PureScript
has no dependent types, values and functions in the traditional sense are not
available for evaluation at compile-time. However, there is a way to interact
with the compiler: via the type-checker.</p>
<p>The solution therefore is to encode this computation in the types, and have the
type-checker evaluate it for us as part of type-checking. Luckily, PureScript
allows string literals in types (these are types whose kind is <code class="language-plaintext highlighter-rouge">Symbol</code>).</p>
<p>Thus, constructing our <code class="language-plaintext highlighter-rouge">printf</code> function comprises two steps:</p>
<ul>
<li>parse the input <code class="language-plaintext highlighter-rouge">Symbol</code> into a list of format tokens</li>
<li>generate the function from the format list that will then assemble the output string</li>
</ul>
<h2 id="type-level-parsing">Type-level parsing</h2>
<p>For the sake of simplicity, we’re going to focus on two types of format
specifiers: decimals (<code class="language-plaintext highlighter-rouge">%d</code>) and strings (<code class="language-plaintext highlighter-rouge">%s</code>).</p>
<p>We represent these cases with a <em>custom kind</em>, which is like a regular
algebraic datatype, but lifted to the type-level. This means that these
constructors can be used <em>in types</em>.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">foreign</span> <span class="kr">import</span> <span class="nn">kind</span> <span class="kt">Specifier</span>
<span class="n">foreign</span> <span class="kr">import</span> <span class="nn">data</span> <span class="kt">D</span> <span class="o">::</span> <span class="kt">Specifier</span>
<span class="n">foreign</span> <span class="kr">import</span> <span class="nn">data</span> <span class="kt">S</span> <span class="o">::</span> <span class="kt">Specifier</span>
<span class="n">foreign</span> <span class="kr">import</span> <span class="nn">data</span> <span class="kt">Lit</span> <span class="o">::</span> <span class="kt">Symbol</span> <span class="o">-></span> <span class="kt">Specifier</span></code></pre></figure>
<p>Of course, apart from the format specifiers <code class="language-plaintext highlighter-rouge">%d</code> and <code class="language-plaintext highlighter-rouge">%s</code>, everything else is a
literal, which we account for by wrapping them in the <code class="language-plaintext highlighter-rouge">Lit</code> type constructor.</p>
<p class="notice">The <code class="language-plaintext highlighter-rouge">foreign import</code> bit means that we’re introducing types here that have no
constructors. That is to say, it’s impossible to construct a value of type <code class="language-plaintext highlighter-rouge">D</code>
or <code class="language-plaintext highlighter-rouge">S</code>. We’ll see later how it is still possible to carry these types around
in terms (hint: proxies).</p>
<p>Furthermore, we need a way of representing a sequence of these specifiers, for
which we introduce another kind:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">foreign</span> <span class="kr">import</span> <span class="nn">kind</span> <span class="kt">FList</span>
<span class="n">foreign</span> <span class="kr">import</span> <span class="nn">data</span> <span class="kt">FNil</span> <span class="o">::</span> <span class="kt">FList</span>
<span class="n">foreign</span> <span class="kr">import</span> <span class="nn">data</span> <span class="kt">FCons</span> <span class="o">::</span> <span class="kt">Specifier</span> <span class="o">-></span> <span class="kt">FList</span> <span class="o">-></span> <span class="kt">FList</span></code></pre></figure>
<p>With this, we can now write types like <code class="language-plaintext highlighter-rouge">FCons D (FCons (Lit " foo") FNil)</code>,
corresponding to the string <code class="language-plaintext highlighter-rouge">%d foo</code>.</p>
<p class="notice">Kind-polymorphism is not supported by the current version (0.12) of PureScript,
so we can’t define a parametric type-level list once and for all; we need a
new one for each kind of element we want to store in lists. With kind-polymorphism and some syntactic
sugar, we would be able to write (as we can in Haskell today) <code class="language-plaintext highlighter-rouge">[D, "foo"]</code>.
This limitation is likely to be removed in a future version of the compiler.</p>
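<p>For comparison, here is a minimal Haskell sketch (Haskell, not PureScript) of what the notice above alludes to: with GHC’s <code class="language-plaintext highlighter-rouge">DataKinds</code> extension, the ordinary list type is promoted and works for any element kind. The names <code class="language-plaintext highlighter-rouge">Example</code> and <code class="language-plaintext highlighter-rouge">example</code> are made up for illustration, and in this sketch the literal still has to be wrapped in <code class="language-plaintext highlighter-rouge">Lit</code> explicitly.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">{-# LANGUAGE DataKinds #-}
module Main where

import Data.Proxy (Proxy (..))
import GHC.TypeLits (Symbol)

-- With DataKinds, this declaration also gives us the kind Specifier,
-- with the type-level constructors 'D, 'S and 'Lit.
data Specifier = D | S | Lit Symbol

-- One promoted list type serves every element kind,
-- so no separate FList/FNil/FCons is needed.
type Example = '[ 'D, 'Lit " foo" ]

-- A proxy carries the type-level list around at the value level.
example :: Proxy Example
example = Proxy

main :: IO ()
main = example `seq` putStrLn "type-level list kind-checks"</code></pre></figure>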
<p>With these building blocks defined, now we have a vocabulary for talking about
the parser itself: it is a function that takes a <code class="language-plaintext highlighter-rouge">Symbol</code> as an input, and
returns a <code class="language-plaintext highlighter-rouge">FList</code>. We encode the computation in the following type class:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">Parse</span> <span class="p">(</span><span class="n">string</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">format</span> <span class="o">::</span> <span class="kt">FList</span><span class="p">)</span> <span class="o">|</span> <span class="n">string</span> <span class="o">-></span> <span class="n">format</span></code></pre></figure>
<p>The functional dependency <code class="language-plaintext highlighter-rouge">string -> format</code> states that the input <code class="language-plaintext highlighter-rouge">string</code>
determines the output <code class="language-plaintext highlighter-rouge">format</code>. This bit is crucial, as this is what tells the
compiler that knowing <code class="language-plaintext highlighter-rouge">string</code> is sufficient to determine what the value of
<code class="language-plaintext highlighter-rouge">format</code> is. It is then our task to ensure that this dependency indeed holds,
when writing out the instances.</p>
<p>To deconstruct the input symbol, we use the following type class available in 0.12:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">ConsSymbol</span> <span class="p">(</span><span class="n">head</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span>
                 <span class="p">(</span><span class="n">tail</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span>
                 <span class="p">(</span><span class="n">sym</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="o">|</span>
                 <span class="n">sym</span> <span class="o">-></span> <span class="n">head</span> <span class="n">tail</span><span class="p">,</span> <span class="n">head</span> <span class="n">tail</span> <span class="o">-></span> <span class="n">sym</span></code></pre></figure>
<p>The interesting functional dependency here is the <code class="language-plaintext highlighter-rouge">sym -> head tail</code>, which,
given some symbol, deconstructs it into its <code class="language-plaintext highlighter-rouge">head</code> (the first character) and
its <code class="language-plaintext highlighter-rouge">tail</code> – the rest.</p>
<p>The parser is like a state machine, with the following legal states:</p>
<ul>
<li>State 1: found a non-<code class="language-plaintext highlighter-rouge">%</code> character</li>
<li>State 2: found a <code class="language-plaintext highlighter-rouge">%</code> character</li>
</ul>
<p>One possible way of representing these states is by having a separate type class
to deal with each.</p>
<p>Since in our simplified example, we know that the specifier symbols can
only be single characters, we can define the second state as:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">Parse2</span> <span class="p">(</span><span class="n">head</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">out</span> <span class="o">::</span> <span class="kt">Specifier</span><span class="p">)</span> <span class="o">|</span> <span class="n">head</span> <span class="o">-></span> <span class="n">out</span></code></pre></figure>
<p>That is, it takes a symbol, and returns the matching specifier. The
implementation is straightforward:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="n">parse2D</span> <span class="o">::</span> <span class="kt">Parse2</span> <span class="s">"d"</span> <span class="kt">D</span>
<span class="kr">instance</span> <span class="n">parse2S</span> <span class="o">::</span> <span class="kt">Parse2</span> <span class="s">"s"</span> <span class="kt">S</span></code></pre></figure>
<p>This is a partial function, which means that format strings that contain
unsupported specifier tokens will simply fail to compile.</p>
<p>The first state is more complicated, as it can consume an arbitrary number of
characters, so we pass it the remaining string (<code class="language-plaintext highlighter-rouge">tail</code>) as well.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">Parse1</span> <span class="p">(</span><span class="n">head</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">tail</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="p">(</span><span class="n">out</span> <span class="o">::</span> <span class="kt">FList</span><span class="p">)</span> <span class="o">|</span> <span class="n">head</span> <span class="n">tail</span> <span class="o">-></span> <span class="n">out</span></code></pre></figure>
<p><code class="language-plaintext highlighter-rouge">Parse1</code> represents the parsing state where we have the current character
<code class="language-plaintext highlighter-rouge">head</code>, the rest of the input string <code class="language-plaintext highlighter-rouge">tail</code>, and we know that the previous
character was not a <code class="language-plaintext highlighter-rouge">%</code>.</p>
<p>The first case is when the tail is empty. In this case, we just return the
current character as the literal in a singleton list:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="n">parse1Nil</span> <span class="o">::</span> <span class="kt">Parse1</span> <span class="n">a</span> <span class="s">""</span> <span class="p">(</span><span class="kt">FCons</span> <span class="p">(</span><span class="kt">Lit</span> <span class="n">a</span><span class="p">)</span> <span class="kt">FNil</span><span class="p">)</span></code></pre></figure>
<p>The second case is more interesting. This is when we find a <code class="language-plaintext highlighter-rouge">%</code>, so we need to
invoke the other function, <code class="language-plaintext highlighter-rouge">Parse2</code>, which handles parsing the specifier
itself. To do that, we use <code class="language-plaintext highlighter-rouge">ConsSymbol</code> to split our current tail <code class="language-plaintext highlighter-rouge">s</code> into
its head <code class="language-plaintext highlighter-rouge">h</code> and tail <code class="language-plaintext highlighter-rouge">t</code>. <code class="language-plaintext highlighter-rouge">h</code> contains the format specifier, which we pass on
to <code class="language-plaintext highlighter-rouge">Parse2</code>. Then, recursively invoke <code class="language-plaintext highlighter-rouge">Parse</code> on <code class="language-plaintext highlighter-rouge">t</code> to parse the rest of
the input. In addition to returning <code class="language-plaintext highlighter-rouge">spec</code> consed to <code class="language-plaintext highlighter-rouge">rest</code>, we also put
an empty string literal at the head of the output list: this is to maintain
the invariant that the head of the output list always contains a string literal.
This invariant will be useful for the last case…</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">else</span> <span class="kr">instance</span> <span class="n">parse1Pc</span> <span class="o">::</span>
  <span class="p">(</span> <span class="kt">ConsSymbol</span> <span class="n">h</span> <span class="n">t</span> <span class="n">s</span>
  <span class="p">,</span> <span class="kt">Parse2</span> <span class="n">h</span> <span class="n">spec</span>
  <span class="p">,</span> <span class="kt">Parse</span> <span class="n">t</span> <span class="n">rest</span>
  <span class="p">)</span> <span class="o">=></span> <span class="kt">Parse1</span> <span class="s">"%"</span> <span class="n">s</span> <span class="p">(</span><span class="kt">FCons</span> <span class="p">(</span><span class="kt">Lit</span> <span class="s">""</span><span class="p">)</span> <span class="p">(</span><span class="kt">FCons</span> <span class="n">spec</span> <span class="n">rest</span><span class="p">))</span></code></pre></figure>
<p>…when we match any other character, i.e. other than <code class="language-plaintext highlighter-rouge">%</code>. Since we’re in
<code class="language-plaintext highlighter-rouge">Parse1</code>, that means that the current character needs to be in a string
literal. For this, we first recursively parse the tail <code class="language-plaintext highlighter-rouge">s</code> into
<code class="language-plaintext highlighter-rouge">FCons (Lit acc) r</code>. The reason we want to know that at the head of parsing
the remaining string is a <code class="language-plaintext highlighter-rouge">Lit</code> is so that we can prepend the current character
to that literal – we need to rebuild long string literals
character-by-character after all. This is where the invariant from the previous
two cases is useful: we don’t have to handle the cases where the head is not
a <code class="language-plaintext highlighter-rouge">Lit</code>, because the recursive calls guarantee that it is. <code class="language-plaintext highlighter-rouge">acc</code> is thus the
tail of the string literal we’re currently parsing, so we put it together
with the current character by <code class="language-plaintext highlighter-rouge">ConsSymbol o acc rest</code> (recall that this type
class can both construct and deconstruct symbols via its functional
dependencies). Then we simply return <code class="language-plaintext highlighter-rouge">Lit rest</code> along with <code class="language-plaintext highlighter-rouge">r</code>.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">else</span> <span class="kr">instance</span> <span class="n">parse1Other</span> <span class="o">::</span>
  <span class="p">(</span> <span class="kt">Parse</span> <span class="n">s</span> <span class="p">(</span><span class="kt">FCons</span> <span class="p">(</span><span class="kt">Lit</span> <span class="n">acc</span><span class="p">)</span> <span class="n">r</span><span class="p">)</span>
  <span class="p">,</span> <span class="kt">ConsSymbol</span> <span class="n">o</span> <span class="n">acc</span> <span class="n">rest</span>
  <span class="p">)</span> <span class="o">=></span> <span class="kt">Parse1</span> <span class="n">o</span> <span class="n">s</span> <span class="p">(</span><span class="kt">FCons</span> <span class="p">(</span><span class="kt">Lit</span> <span class="n">rest</span><span class="p">)</span> <span class="n">r</span><span class="p">)</span></code></pre></figure>
<p>Notice how these instances actually overlap. In the third case, we can
easily imagine a particular instantiation of <code class="language-plaintext highlighter-rouge">o</code> and <code class="language-plaintext highlighter-rouge">r</code> such that it
matches the instance head in the second case. In other words, when the current
character is <code class="language-plaintext highlighter-rouge">%</code>, both <code class="language-plaintext highlighter-rouge">parse1Pc</code> and <code class="language-plaintext highlighter-rouge">parse1Other</code> match (because
<code class="language-plaintext highlighter-rouge">parse1Other</code> is more general).</p>
<p>To make sure that the instances are selected in the order we want them to be,
we use instance chains. That is, by writing <code class="language-plaintext highlighter-rouge">instance A else instance B</code> we
tell the compiler to try to match instance <code class="language-plaintext highlighter-rouge">A</code> first, and if it fails, then try
<code class="language-plaintext highlighter-rouge">B</code>. This is a new feature in PureScript 0.12, and a very powerful one – it
allows us to avoid the overlapping instance problem for good.</p>
<p>Finally, we need to actually kick off the parser. We do this by invoking it
in the first state.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="n">parseNil</span> <span class="o">::</span>
  <span class="kt">Parse</span> <span class="s">""</span> <span class="p">(</span><span class="kt">FCons</span> <span class="p">(</span><span class="kt">Lit</span> <span class="s">""</span><span class="p">)</span> <span class="kt">FNil</span><span class="p">)</span>
<span class="kr">else</span> <span class="kr">instance</span> <span class="n">parseCons</span> <span class="o">::</span>
  <span class="p">(</span> <span class="kt">ConsSymbol</span> <span class="n">h</span> <span class="n">t</span> <span class="n">string</span>
  <span class="p">,</span> <span class="kt">Parse1</span> <span class="n">h</span> <span class="n">t</span> <span class="n">fl</span>
  <span class="p">)</span> <span class="o">=></span> <span class="kt">Parse</span> <span class="n">string</span> <span class="n">fl</span></code></pre></figure>
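<p>It can help to read the three classes as ordinary functions first. The following is a value-level Haskell analogue, offered as an illustration only and not as part of the library: <code class="language-plaintext highlighter-rouge">parse</code>, <code class="language-plaintext highlighter-rouge">parse1</code> and <code class="language-plaintext highlighter-rouge">parse2</code> are hypothetical names mirroring <code class="language-plaintext highlighter-rouge">Parse</code>, <code class="language-plaintext highlighter-rouge">Parse1</code> and <code class="language-plaintext highlighter-rouge">Parse2</code>, <code class="language-plaintext highlighter-rouge">Nothing</code> stands in for a failed instance resolution, and the order of the clauses plays the role of the instance chain.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">module Main where

-- Value-level counterpart of the Specifier kind.
data Specifier = D | S | Lit String
  deriving (Eq, Show)

-- Parse: the entry point, which starts the machine in state 1.
parse :: String -> Maybe [Specifier]
parse ""      = Just [Lit ""]
parse (h : t) = parse1 h t

-- Parse1: the previous character was not a '%'.
-- Clauses are tried top to bottom, like the instance chain.
parse1 :: Char -> String -> Maybe [Specifier]
parse1 c   ""      = Just [Lit [c]]                    -- parse1Nil
parse1 '%' (h : t) =                                   -- parse1Pc
  case (parse2 h, parse t) of
    (Just spec, Just rest) -> Just (Lit "" : spec : rest)
    _                      -> Nothing
parse1 c   s       =                                   -- parse1Other
  case parse s of
    Just (Lit acc : r) -> Just (Lit (c : acc) : r)
    _                  -> Nothing

-- Parse2: the character right after a '%'.
parse2 :: Char -> Maybe Specifier
parse2 'd' = Just D
parse2 's' = Just S
parse2 _   = Nothing   -- unsupported specifier: "fails to compile"

main :: IO ()
main = print (parse "%d foo")
-- prints: Just [Lit "",D,Lit " foo"]</code></pre></figure>
<p>Note how the result always starts with a <code class="language-plaintext highlighter-rouge">Lit</code>, which is the invariant the last clause relies on.</p>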
<h2 id="how-the-sausage-gets-made-computing-the-output-type">How the sausage gets made: computing the output type</h2>
<p>But how do we know how many arguments we need to pass to the formatter? It
depends on the format string! No surprises here: just like all the previous
type-level computations, this one will also be encoded in a type class with a
functional dependency.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">FormatF</span> <span class="p">(</span><span class="n">format</span> <span class="o">::</span> <span class="kt">FList</span><span class="p">)</span> <span class="n">fun</span> <span class="o">|</span> <span class="n">format</span> <span class="o">-></span> <span class="n">fun</span> <span class="kr">where</span>
  <span class="n">formatF</span> <span class="o">::</span> <span class="o">@</span><span class="n">format</span> <span class="o">-></span> <span class="kt">String</span> <span class="o">-></span> <span class="n">fun</span></code></pre></figure>
<p class="notice">The <code class="language-plaintext highlighter-rouge">@</code> symbol is special syntax, and in this case, it means that the <code class="language-plaintext highlighter-rouge">formatF</code>
function takes an <code class="language-plaintext highlighter-rouge">FList</code> (<code class="language-plaintext highlighter-rouge">format</code>) as an input. But because <code class="language-plaintext highlighter-rouge">FList</code> is a
<em>custom kind</em>, it has no value-level inhabitants. So, how can we still get
something whose type mentions <code class="language-plaintext highlighter-rouge">format</code>? This is what <code class="language-plaintext highlighter-rouge">@</code> does – it’s a proxy
for a type. Its value is isomorphic to <code class="language-plaintext highlighter-rouge">Unit</code>, and carries no information,
other than its type. Notice that it works for any kind – indeed, proxies are
currently a special-cased type in PureScript, in that they are kind-polymorphic.</p>
<p>Thus <code class="language-plaintext highlighter-rouge">formatF</code> takes a format list, and an accumulator string, and returns some
<code class="language-plaintext highlighter-rouge">fun</code> – this type depends on the actual format list.</p>
<p>Starting with the base case: when there’s nothing left to print, we simply
return the accumulated formatted string.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="n">formatFNil</span> <span class="o">::</span> <span class="kt">FormatF</span> <span class="kt">FNil</span> <span class="kt">String</span> <span class="kr">where</span>
  <span class="n">formatF</span> <span class="kr">_</span> <span class="n">str</span> <span class="o">=</span> <span class="n">str</span></code></pre></figure>
<p>When the head of the list is <code class="language-plaintext highlighter-rouge">D</code>, we know that we will need an <code class="language-plaintext highlighter-rouge">Int</code> argument,
and the rest of the function’s type can be computed by recursing on the tail of
the list. As for the implementation, since the return type is now refined to
be of the form <code class="language-plaintext highlighter-rouge">Int -> fun</code>, we are allowed to construct a lambda that takes
the <code class="language-plaintext highlighter-rouge">Int</code>, and appends it to the end of the accumulator, then recurses on the
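rest. The implementation of <code class="language-plaintext highlighter-rouge">S</code> is analogous (it takes a <code class="language-plaintext highlighter-rouge">String</code> instead of an <code class="language-plaintext highlighter-rouge">Int</code>), and is omitted for brevity.</p>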
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">instance</span> <span class="n">formatFConsD</span> <span class="o">::</span>
  <span class="kt">FormatF</span> <span class="n">rest</span> <span class="n">fun</span>
  <span class="o">=></span> <span class="kt">FormatF</span> <span class="p">(</span><span class="kt">FCons</span> <span class="kt">D</span> <span class="n">rest</span><span class="p">)</span> <span class="p">(</span><span class="kt">Int</span> <span class="o">-></span> <span class="n">fun</span><span class="p">)</span> <span class="kr">where</span>
  <span class="n">formatF</span> <span class="kr">_</span> <span class="n">str</span>
    <span class="o">=</span> <span class="nf">\</span><span class="n">i</span> <span class="o">-></span> <span class="n">formatF</span> <span class="o">@</span><span class="n">rest</span> <span class="p">(</span><span class="n">str</span> <span class="o"><></span> <span class="n">show</span> <span class="n">i</span><span class="p">)</span></code></pre></figure>
<p>Handling literals (<code class="language-plaintext highlighter-rouge">Lit</code>) is left as an exercise for the reader.</p>
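<p>To experiment with this outside PureScript, here is a rough Haskell transliteration of <code class="language-plaintext highlighter-rouge">FormatF</code>. It is a sketch under a few assumptions: GHC’s promoted lists replace <code class="language-plaintext highlighter-rouge">FList</code>, an explicit <code class="language-plaintext highlighter-rouge">Proxy</code> replaces the <code class="language-plaintext highlighter-rouge">@</code> syntax, and <code class="language-plaintext highlighter-rouge">++</code> replaces the append operator. The <code class="language-plaintext highlighter-rouge">Lit</code> case is deliberately left out here too.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}
module Main where

import Data.Proxy (Proxy (..))
import GHC.TypeLits (Symbol)

data Specifier = D | S | Lit Symbol

-- The format list determines the type of the formatting function.
class FormatF (format :: [Specifier]) fun | format -> fun where
  formatF :: Proxy format -> String -> fun

-- Base case: nothing left to print, return the accumulator.
instance FormatF '[] String where
  formatF _ str = str

-- %d: expect an Int, append it, then recurse on the rest of the list.
instance FormatF rest fun => FormatF ('D ': rest) (Int -> fun) where
  formatF _ str = \i -> formatF (Proxy :: Proxy rest) (str ++ show i)

-- %s: expect a String and append it unchanged.
instance FormatF rest fun => FormatF ('S ': rest) (String -> fun) where
  formatF _ str = \s -> formatF (Proxy :: Proxy rest) (str ++ s)

main :: IO ()
main = putStrLn (formatF (Proxy :: Proxy '[ 'D, 'S ]) "" 10 "foo")
-- prints: 10foo</code></pre></figure>
<p>The functional dependency is what lets the literal <code class="language-plaintext highlighter-rouge">10</code> be typed as an <code class="language-plaintext highlighter-rouge">Int</code> without an annotation: the format list fixes the whole function type.</p>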
<h2 id="conclusion">Conclusion</h2>
<p>Finally, as a matter of convenience, we can wrap the above type classes into
one, that serves as a bridge between the parser and the formatter, as such:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">class</span> <span class="kt">Format</span> <span class="p">(</span><span class="n">string</span> <span class="o">::</span> <span class="kt">Symbol</span><span class="p">)</span> <span class="n">fun</span> <span class="o">|</span> <span class="n">string</span> <span class="o">-></span> <span class="n">fun</span> <span class="kr">where</span>
  <span class="n">format</span> <span class="o">::</span> <span class="o">@</span><span class="n">string</span> <span class="o">-></span> <span class="n">fun</span>

<span class="kr">instance</span> <span class="n">formatFFormat</span> <span class="o">::</span>
  <span class="p">(</span> <span class="kt">Parse</span> <span class="n">string</span> <span class="n">format</span>
  <span class="p">,</span> <span class="kt">FormatF</span> <span class="n">format</span> <span class="n">fun</span>
  <span class="p">)</span> <span class="o">=></span> <span class="kt">Format</span> <span class="n">string</span> <span class="n">fun</span> <span class="kr">where</span>
  <span class="n">format</span> <span class="kr">_</span> <span class="o">=</span> <span class="n">formatF</span> <span class="o">@</span><span class="n">format</span> <span class="s">""</span></code></pre></figure>
<p>And that’s it! It might be instructive to try and work out <code class="language-plaintext highlighter-rouge">FormatF</code>’s
instance resolution for a few simple examples by hand, to get a better idea of why
this works. A fully working implementation of the code in this post can be
found <a href="https://github.com/kcsongor/purescript-safe-printf">on GitHub</a>.</p>
<p><a href="https://kcsongor.github.io/purescript-safe-printf/">Well-typed printfs cannot go wrong</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on September 25, 2017.</p>
https://kcsongor.github.io/time-travel-in-haskell-for-dummies2015-10-02T00:00:00-00:002015-10-02T00:00:00+00:00Csongor Kisshttps://kcsongor.github.io
<section id="table-of-contents" class="toc">
<header>
<h3><i class="fa fa-book"></i> Overview</h3>
</header>
<div id="drawer">
<ul id="markdown-toc">
<li><a href="#how" id="markdown-toc-how">How?</a></li>
<li><a href="#the-repmax-problem" id="markdown-toc-the-repmax-problem">The repMax problem</a></li>
<li><a href="#wait-what" id="markdown-toc-wait-what">Wait, what?</a></li>
<li><a href="#states-travelling-back-in-time" id="markdown-toc-states-travelling-back-in-time">States travelling back in time</a> <ul>
<li><a href="#what-are-states-anyway" id="markdown-toc-what-are-states-anyway">What are states anyway?</a></li>
</ul>
</li>
<li><a href="#finally-the-time-machine-tardis" id="markdown-toc-finally-the-time-machine-tardis">Finally, the time machine, TARDIS</a> <ul>
<li><a href="#a-single-pass-assembler-an-example" id="markdown-toc-a-single-pass-assembler-an-example">A single-pass assembler: an example</a></li>
<li><a href="#io-doesnt-mix-with-the-future-the-past-is-fine" id="markdown-toc-io-doesnt-mix-with-the-future-the-past-is-fine">IO doesn’t mix with the future! (The past is fine)</a></li>
<li><a href="#thanks" id="markdown-toc-thanks">Thanks</a></li>
</ul>
</li>
</ul>
</div>
</section>
<!-- /#table-of-contents -->
<p>Browsing Hackage the other day, I came across the <a href="https://hackage.haskell.org/package/tardis-0.3.0.0/docs/Control-Monad-Tardis.html">Tardis
Monad</a>.
Reading its description, it turns out that the Tardis monad is capable of
sending state back in time. Yep. Back in time.</p>
<h2 id="how">How?</h2>
<p>No, it’s not the reification of <a href="https://en.wikipedia.org/wiki/Tachyon">some hypothetical time-travelling
particle</a>, rather a really clever way of
exploiting Haskell’s laziness.</p>
<p>In this rather lengthy post, I’ll showcase some interesting consequences of
lazy evaluation, and we’ll work our way up from simple examples to ’time
travelling’ craziness through different levels of abstraction.</p>
<h2 id="the-repmax-problem">The repMax problem</h2>
<p>Imagine you had a list, and you wanted to replace all the elements of the list
with the largest element, while traversing the list only once. You might say
something like “Easier said than done, how do I know the largest element
without having traversed the list before?”</p>
<p>Let’s start from the beginning.
First, you ask the future for the largest element of the list (don’t worry,
this will make sense in a bit). Let’s call this value <code class="language-plaintext highlighter-rouge">rep</code> (as in the value we
replace stuff with).</p>
<p>Walking through the list, you do two things:</p>
<ul>
<li>replace the current element with <code class="language-plaintext highlighter-rouge">rep</code></li>
<li>’return’ the larger of the current element and the largest element of the remaining list.</li>
</ul>
<p>When only one element remains, replace it with <code class="language-plaintext highlighter-rouge">rep</code>, and return what was there
originally. (This is the base case.)</p>
<p>Right, at the moment, we haven’t acquired the skill of seeing the future, so we
just write the rest of the function with that bit left out.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">repMax</span> <span class="o">::</span> <span class="p">[</span><span class="kt">Int</span><span class="p">]</span> <span class="o">-></span> <span class="kt">Int</span> <span class="o">-></span> <span class="p">(</span><span class="kt">Int</span><span class="p">,</span> <span class="p">[</span><span class="kt">Int</span><span class="p">])</span>
<span class="n">repMax</span> <span class="kt">[]</span> <span class="n">rep</span> <span class="o">=</span> <span class="p">(</span><span class="n">rep</span><span class="p">,</span> <span class="kt">[]</span><span class="p">)</span>
<span class="n">repMax</span> <span class="p">[</span><span class="n">x</span><span class="p">]</span> <span class="n">rep</span> <span class="o">=</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="p">[</span><span class="n">rep</span><span class="p">])</span>
<span class="n">repMax</span> <span class="p">(</span><span class="n">l</span> <span class="o">:</span> <span class="n">ls</span><span class="p">)</span> <span class="n">rep</span> <span class="o">=</span> <span class="p">(</span><span class="n">m'</span><span class="p">,</span> <span class="n">rep</span> <span class="o">:</span> <span class="n">ls'</span><span class="p">)</span>
  <span class="kr">where</span> <span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">ls'</span><span class="p">)</span> <span class="o">=</span> <span class="n">repMax</span> <span class="n">ls</span> <span class="n">rep</span>
        <span class="n">m'</span> <span class="o">=</span> <span class="n">max</span> <span class="n">m</span> <span class="n">l</span></code></pre></figure>
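<p>So, it takes a list and the <code class="language-plaintext highlighter-rouge">rep</code> element, and returns a pair of type <code class="language-plaintext highlighter-rouge">(Int, [Int])</code>: the largest element, and the list with every element replaced by <code class="language-plaintext highlighter-rouge">rep</code>.</p>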
<p><code class="language-plaintext highlighter-rouge">repMax [1,2,3,4,5,3] 6</code>
gives us <code class="language-plaintext highlighter-rouge">(5, [6,6,6,6,6,6])</code> which is exactly what we wanted: the elements are
replaced with rep and we also have the largest element.
Now, all we need to do is use that largest element as <code class="language-plaintext highlighter-rouge">rep</code>:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">doRepMax</span> <span class="o">::</span> <span class="p">[</span><span class="kt">Int</span><span class="p">]</span> <span class="o">-></span> <span class="p">[</span><span class="kt">Int</span><span class="p">]</span>
<span class="n">doRepMax</span> <span class="n">xs</span> <span class="o">=</span> <span class="n">xs'</span>
  <span class="kr">where</span> <span class="p">(</span><span class="n">largest</span><span class="p">,</span> <span class="n">xs'</span><span class="p">)</span> <span class="o">=</span> <span class="n">repMax</span> <span class="n">xs</span> <span class="n">largest</span></code></pre></figure>
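<p>Putting the two definitions together gives a complete program you can load into GHCi or compile. The <code class="language-plaintext highlighter-rouge">main</code> function and the sample input are additions for illustration; everything else is as above.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">module Main where

-- Replace every element with rep, and report the largest element seen.
repMax :: [Int] -> Int -> (Int, [Int])
repMax []       rep = (rep, [])
repMax [x]      rep = (x, [rep])
repMax (l : ls) rep = (m', rep : ls')
  where (m, ls') = repMax ls rep
        m'       = max m l

-- Tie the knot: the largest element is both a result of repMax and its input.
doRepMax :: [Int] -> [Int]
doRepMax xs = xs'
  where (largest, xs') = repMax xs largest

main :: IO ()
main = print (doRepMax [1, 2, 3, 4, 5, 3])
-- prints: [5,5,5,5,5,5]</code></pre></figure>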
<h2 id="wait-what">Wait, what?</h2>
<p>This can be done thanks to lazy evaluation. Haskell implementations use so-called
‘thunks’ for values that are yet to be evaluated. When you write <code class="language-plaintext highlighter-rouge">(min 5 6)</code>, the
expression forms a thunk and is not evaluated until its value is actually needed.
Here, <code class="language-plaintext highlighter-rouge">rep</code> can be thought of as a reference to a thunk. When we tell GHC to put
<code class="language-plaintext highlighter-rouge">largest</code> in all slots of the list, it in fact puts a reference to the same
thunk in those slots, not the actual data. As we traverse the list, this thunk is
built up from nested <code class="language-plaintext highlighter-rouge">max</code> expressions. For <code class="language-plaintext highlighter-rouge">[1,2,3,4]</code>, we end up with the
thunk <code class="language-plaintext highlighter-rouge">max 1 (max 2 (max 3 4))</code>.
A reference to this thunk is placed everywhere in the list. By the time we
have finished traversing the list, the thunk is complete too, and can be
evaluated. (Before that point, the thunk has a form similar to
<code class="language-plaintext highlighter-rouge">max 1 (_something_)</code>, where <code class="language-plaintext highlighter-rouge">_something_</code> is the max of the rest of the list.
This obviously cannot be evaluated yet.)</p>
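<p>To see the knot being tied end to end, here is a minimal, self-contained sketch.
The <code class="language-plaintext highlighter-rouge">repMax</code> in it is an assumed reconstruction with the shape used above
(it takes the list and the replacement value, and returns the maximum together with
the rebuilt list), so it may differ in detail from the earlier definition.</p>

```haskell
-- Assumed reconstruction of repMax: returns the largest element together
-- with a copy of the list in which every slot holds `rep`.
-- Note that `rep` is only stored, never inspected.
repMax :: [Int] -> Int -> (Int, [Int])
repMax []       _   = error "repMax: empty list"
repMax [x]      rep = (x, [rep])
repMax (x : xs) rep = (max x m, rep : xs')
  where (m, xs') = repMax xs rep

-- The result `largest` is fed back in as the argument `rep`.
doRepMax :: [Int] -> [Int]
doRepMax xs = xs'
  where (largest, xs') = repMax xs largest

main :: IO ()
main = print (doRepMax [1, 2, 3, 4])  -- prints [4,4,4,4]
```

<p>The sketch only works because <code class="language-plaintext highlighter-rouge">repMax</code> never pattern matches on its second
argument. If it did, evaluation would need the result before producing it.</p>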
<p>How about generalising this idea to other data structures?</p>
<p>There’s an old saying in the world of lists</p>
<blockquote>
<p>“Everything’s a fold”.</p>
</blockquote>
<p>Indeed, we could easily rewrite our doRepMax function using a fold:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">foldmax</span> <span class="o">::</span> <span class="p">(</span><span class="kt">Ord</span> <span class="n">a</span><span class="p">,</span> <span class="kt">Num</span> <span class="n">a</span><span class="p">)</span> <span class="o">=></span> <span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="o">-></span> <span class="p">[</span><span class="n">a</span><span class="p">]</span>
<span class="n">foldmax</span> <span class="n">ls</span> <span class="o">=</span> <span class="n">ls'</span>
  <span class="kr">where</span>
    <span class="p">(</span><span class="n">ls'</span><span class="p">,</span> <span class="n">largest</span><span class="p">)</span>
      <span class="o">=</span> <span class="n">foldl</span> <span class="p">(</span><span class="nf">\</span><span class="p">(</span><span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span> <span class="n">a</span> <span class="o">-></span> <span class="p">(</span><span class="n">largest</span> <span class="o">:</span> <span class="n">b</span><span class="p">,</span> <span class="n">max</span> <span class="n">a</span> <span class="n">c</span><span class="p">))</span> <span class="p">(</span><span class="kt">[]</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="n">ls</span></code></pre></figure>
<p>Brilliant! Now we can use this technique on everything that is Foldable!
Or can we?</p>
<p>Taking a look at the type signature of the generalised <code class="language-plaintext highlighter-rouge">foldl</code> (from
Data.Foldable), <code class="language-plaintext highlighter-rouge">Data.Foldable.foldl :: Foldable t => (b -> a -> b) -> b -> t a
-> b</code>, we realise that the type of the returned value, <code class="language-plaintext highlighter-rouge">b</code>, is independent of
the structure of the input <code class="language-plaintext highlighter-rouge">t a</code>. The reason we could get away with this in our fold
example was that we knew we were dealing with a list, so we used the <code class="language-plaintext highlighter-rouge">:</code>
operator explicitly to restore the structure.</p>
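<p>A small sketch of the problem, using a <code class="language-plaintext highlighter-rouge">Data.Map</code> as the Foldable (the map
<code class="language-plaintext highlighter-rouge">m</code> is just an illustrative value and needs the containers package): the fold
happily visits every value, but what comes out is only the list we chose to build.</p>

```haskell
import qualified Data.Map as M

main :: IO ()
main = do
  let m = M.fromList [(1 :: Int, 'a'), (2, 'b'), (3, 'c')]
  -- The fold visits the values in key order, but the result is whatever
  -- `b` we build: here a plain list. The keys and the Map shape are gone.
  print (foldl (\acc x -> x : acc) [] m)  -- prints "cba"
```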
<p>No problem! There is a type class that does just what we want: it
lets us fold over a structure while keeping its shape.
This magical class is called <code class="language-plaintext highlighter-rouge">Traversable</code>.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="cp">{-# LANGUAGE DeriveFunctor,
             DeriveFoldable,
             DeriveTraversable #-}</span>
<span class="kr">data</span> <span class="kt">Tree</span> <span class="n">a</span> <span class="o">=</span> <span class="kt">Empty</span> <span class="o">|</span> <span class="kt">Leaf</span> <span class="n">a</span> <span class="o">|</span> <span class="kt">Node</span> <span class="p">(</span><span class="kt">Tree</span> <span class="n">a</span><span class="p">)</span> <span class="n">a</span> <span class="p">(</span><span class="kt">Tree</span> <span class="n">a</span><span class="p">)</span>
  <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Show</span><span class="p">,</span> <span class="kt">Functor</span><span class="p">,</span> <span class="kt">Foldable</span><span class="p">,</span> <span class="kt">Traversable</span><span class="p">)</span></code></pre></figure>
<p>Thankfully, GHC is clever enough to derive Traversable for us from this
data definition. (But it wouldn’t be too difficult to do by hand anyway.)</p>
<p>Traversable data structures can do a really neat trick (among many others):
<code class="language-plaintext highlighter-rouge">mapAccumR :: Traversable t => (a -> b -> (a, c)) -> a -> t b -> (a, t c)</code></p>
<p>This function is like combining a map with a fold (and so all Traversables also
need to be Functors and Foldables). We take a function <code class="language-plaintext highlighter-rouge">(a -> b -> (a, c))</code>,
an initial <code class="language-plaintext highlighter-rouge">a</code> and a Traversable of <code class="language-plaintext highlighter-rouge">b</code>s (<code class="language-plaintext highlighter-rouge">t b</code>).</p>
<p>The elements are replaced with their respective <code class="language-plaintext highlighter-rouge">c</code>s (the ones calculated by
<code class="language-plaintext highlighter-rouge">(a -> b -> (a, c))</code>), so <code class="language-plaintext highlighter-rouge">c</code> is a perfect place for us to put our <code class="language-plaintext highlighter-rouge">rep</code> (the
largest element in this case).</p>
<p>Apart from the final Traversable <code class="language-plaintext highlighter-rouge">t c</code>, it also returns the final accumulated <code class="language-plaintext highlighter-rouge">a</code>
(that’s where we return the largest).</p>
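<p>To get a feel for <code class="language-plaintext highlighter-rouge">mapAccumR</code> before tying any knots, here is a small sketch
that threads a running sum through a list and a <code class="language-plaintext highlighter-rouge">Maybe</code>. The R in the name
matters: the accumulator starts at the right-hand end of the structure.</p>

```haskell
import Data.Traversable (mapAccumR)

main :: IO ()
main = do
  -- Accumulate a running sum from the right. Each element is replaced by
  -- the sum of everything to its right, and the total is returned as well.
  print (mapAccumR (\acc x -> (acc + x, acc)) 0 [1, 2, 3 :: Int])
  -- prints (6,[5,3,0])

  -- The same function works on any Traversable and keeps its shape.
  print (mapAccumR (\acc x -> (acc + x, acc)) 0 (Just (10 :: Int)))
  -- prints (10,Just 0)
```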
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">generalMax</span> <span class="o">::</span> <span class="p">(</span><span class="kt">Traversable</span> <span class="n">t</span><span class="p">,</span> <span class="kt">Num</span> <span class="n">a</span><span class="p">,</span> <span class="kt">Ord</span> <span class="n">a</span><span class="p">)</span> <span class="o">=></span> <span class="n">t</span> <span class="n">a</span> <span class="o">-></span> <span class="n">t</span> <span class="n">a</span>
<span class="n">generalMax</span> <span class="n">t</span> <span class="o">=</span> <span class="n">xs'</span>
  <span class="kr">where</span>
    <span class="p">(</span><span class="n">largest</span><span class="p">,</span> <span class="n">xs'</span><span class="p">)</span>
      <span class="o">=</span> <span class="n">mapAccumR</span> <span class="p">(</span><span class="nf">\</span><span class="n">a</span> <span class="n">b</span> <span class="o">-></span> <span class="p">(</span><span class="n">max</span> <span class="n">a</span> <span class="n">b</span><span class="p">,</span> <span class="n">largest</span><span class="p">))</span> <span class="mi">0</span> <span class="n">t</span></code></pre></figure>
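<p>One caveat: seeding the accumulator with <code class="language-plaintext highlighter-rouge">0</code> means a structure holding only
negative numbers gets filled with <code class="language-plaintext highlighter-rouge">0</code>, not with its real maximum. A possible
variant, sketched here under the hypothetical name <code class="language-plaintext highlighter-rouge">generalMax'</code>, starts from
<code class="language-plaintext highlighter-rouge">Nothing</code> instead, which also lets us drop the <code class="language-plaintext highlighter-rouge">Num</code> constraint.</p>

```haskell
import Data.Maybe (fromMaybe)
import Data.Traversable (mapAccumR)

-- Hypothetical variant of generalMax without the 0 seed.
generalMax' :: (Traversable t, Ord a) => t a -> t a
generalMax' t = xs'
  where
    (acc, xs') = mapAccumR step Nothing t
    -- The accumulator stays Nothing until the first element has been seen.
    step seen b = (Just (maybe b (max b) seen), largest)
    -- Only forced when a slot is inspected, so never for an empty structure.
    largest = fromMaybe (error "generalMax': no elements") acc

main :: IO ()
main = do
  print (generalMax' [-3, -1, -2 :: Int])  -- prints [-1,-1,-1]
  print (generalMax' ([] :: [Int]))        -- prints []
```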
<p>This generalisation gives us new options! So far, we’ve used
<code class="language-plaintext highlighter-rouge">a</code>, <code class="language-plaintext highlighter-rouge">b</code> and <code class="language-plaintext highlighter-rouge">c</code> at the same type (say, <code class="language-plaintext highlighter-rouge">Int</code>).</p>
<p>For instance, if we want to replace all the elements with the average of them,
then we can accumulate the sum and the count of elements in a tuple (<code class="language-plaintext highlighter-rouge">a</code> will
then take the role of this tuple) and <code class="language-plaintext highlighter-rouge">c</code> will be the sum divided by the count,
for which we’re going to ask the future again!</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">generalAvg</span> <span class="o">::</span> <span class="p">(</span><span class="kt">Traversable</span> <span class="n">t</span><span class="p">,</span> <span class="kt">Integral</span> <span class="n">a</span><span class="p">)</span> <span class="o">=></span> <span class="n">t</span> <span class="n">a</span> <span class="o">-></span> <span class="n">t</span> <span class="n">a</span>
<span class="n">generalAvg</span> <span class="n">t</span> <span class="o">=</span> <span class="n">xs'</span>
  <span class="kr">where</span>
    <span class="n">avg</span> <span class="o">=</span> <span class="n">s</span> <span class="p">`</span><span class="n">div</span><span class="p">`</span> <span class="n">c</span>
    <span class="p">((</span><span class="n">s</span><span class="p">,</span> <span class="n">c</span><span class="p">),</span> <span class="n">xs'</span><span class="p">)</span>
      <span class="o">=</span> <span class="n">mapAccumR</span> <span class="p">(</span><span class="nf">\</span><span class="p">(</span><span class="n">s'</span><span class="p">,</span> <span class="n">c'</span><span class="p">)</span> <span class="n">b</span> <span class="o">-></span> <span class="p">((</span><span class="n">s'</span> <span class="o">+</span> <span class="n">b</span><span class="p">,</span> <span class="n">c'</span> <span class="o">+</span> <span class="mi">1</span><span class="p">),</span> <span class="n">avg</span><span class="p">))</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span> <span class="n">t</span></code></pre></figure>
<p>And so on, we can do all sorts of interesting things in a single traversal of
our data structures.</p>
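<p>Putting the pieces together, the following sketch runs both functions on the
<code class="language-plaintext highlighter-rouge">Tree</code> type from above. The value <code class="language-plaintext highlighter-rouge">tree</code> is just an illustrative input.</p>

```haskell
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}
import Data.Traversable (mapAccumR)

data Tree a = Empty | Leaf a | Node (Tree a) a (Tree a)
  deriving (Show, Eq, Functor, Foldable, Traversable)

-- Replace every element with the largest one (seeded with 0, as above).
generalMax :: (Traversable t, Num a, Ord a) => t a -> t a
generalMax t = xs'
  where
    (largest, xs') = mapAccumR (\a b -> (max a b, largest)) 0 t

-- Replace every element with the (integer) average of all elements.
generalAvg :: (Traversable t, Integral a) => t a -> t a
generalAvg t = xs'
  where
    avg = s `div` c
    ((s, c), xs') = mapAccumR (\(s', c') b -> ((s' + b, c' + 1), avg)) (0, 0) t

main :: IO ()
main = do
  let tree = Node (Leaf 1) 5 (Node Empty 3 (Leaf 9)) :: Tree Int
  print (generalMax tree)  -- prints Node (Leaf 9) 9 (Node Empty 9 (Leaf 9))
  print (generalAvg tree)  -- prints Node (Leaf 4) 4 (Node Empty 4 (Leaf 4))
```

<p>Note that the shape of the tree, including the <code class="language-plaintext highlighter-rouge">Empty</code> branch, is preserved.</p>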
<h2 id="states-travelling-back-in-time">States travelling back in time</h2>
<hr />
<h5 id="what-are-states-anyway">What are states anyway?</h5>
<p>In Haskell, whenever we want to write functions that operate on some sort
of environment or state, we write these functions in the following form:
<code class="language-plaintext highlighter-rouge">statefulFunction :: b -> c -> d -> s -> (a, s)</code>.
That is, we take some arguments (<code class="language-plaintext highlighter-rouge">b</code>, <code class="language-plaintext highlighter-rouge">c</code>, <code class="language-plaintext highlighter-rouge">d</code> here), a state <code class="language-plaintext highlighter-rouge">s</code>, and
return a new, possibly modified state along with some value <code class="language-plaintext highlighter-rouge">a</code>.
This involves writing a lot of boilerplate code, both in the type
signatures and in the actual code that uses the state.</p>
<p>For example, using the state as a counter:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">statefulFunction</span> <span class="n">arg1</span> <span class="n">arg2</span> <span class="n">arg3</span> <span class="n">counter</span> <span class="o">=</span>
  <span class="p">(</span><span class="n">arg1</span> <span class="o">+</span> <span class="n">arg2</span> <span class="o">+</span> <span class="n">arg3</span><span class="p">,</span> <span class="n">counter</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">bindStatefulFunctions</span> <span class="o">::</span>
  <span class="p">(</span><span class="n">s</span> <span class="o">-></span> <span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">s</span><span class="p">))</span> <span class="o">-></span>
  <span class="p">(</span><span class="n">a</span> <span class="o">-></span> <span class="n">s</span> <span class="o">-></span> <span class="p">(</span><span class="n">b</span><span class="p">,</span> <span class="n">s</span><span class="p">))</span> <span class="o">-></span>
  <span class="n">s</span> <span class="o">-></span> <span class="p">(</span><span class="n">b</span><span class="p">,</span> <span class="n">s</span><span class="p">)</span>
<span class="n">bindStatefulFunctions</span> <span class="n">f1</span> <span class="n">f2</span> <span class="o">=</span> <span class="nf">\</span><span class="n">initialState</span> <span class="o">-></span>
  <span class="kr">let</span> <span class="p">(</span><span class="n">result</span><span class="p">,</span> <span class="n">updatedState</span><span class="p">)</span> <span class="o">=</span> <span class="n">f1</span> <span class="n">initialState</span>
  <span class="kr">in</span> <span class="n">f2</span> <span class="n">result</span> <span class="n">updatedState</span></code></pre></figure>
<p>Note that f2 takes an extra <code class="language-plaintext highlighter-rouge">a</code>, that’s the output of the first function.
That’s why this function is called bind, we bind the output of the first
function to the input of the second while passing the modified state.</p>
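<p>As a quick usage sketch, here are two small stateful functions chained with
<code class="language-plaintext highlighter-rouge">bindStatefulFunctions</code>. The helpers <code class="language-plaintext highlighter-rouge">tick</code> and <code class="language-plaintext highlighter-rouge">addTen</code> are hypothetical names
made up for this illustration.</p>

```haskell
bindStatefulFunctions
  :: (s -> (a, s))
  -> (a -> s -> (b, s))
  -> s -> (b, s)
bindStatefulFunctions f1 f2 = \initialState ->
  let (result, updatedState) = f1 initialState
  in  f2 result updatedState

-- Hypothetical helper: return the current counter and increment it.
tick :: Int -> (Int, Int)
tick counter = (counter, counter + 1)

-- Hypothetical helper: add ten to the incoming value, bump the counter.
addTen :: Int -> Int -> (Int, Int)
addTen x counter = (x + 10, counter + 1)

main :: IO ()
main = print (bindStatefulFunctions tick addTen 5)  -- prints (15,7)
```

<p>Starting from counter 5, <code class="language-plaintext highlighter-rouge">tick</code> yields the value 5 and the state 6, and
<code class="language-plaintext highlighter-rouge">addTen</code> then turns those into the value 15 and the state 7.</p>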
<p>The State monad essentially does something like the above code, but hides
it all and makes the state passing implicit. Also, being a monad, it gives us
the oh-so-convenient do notation!</p>
<p><code class="language-plaintext highlighter-rouge">State s a</code> is basically just a wrapper around <code class="language-plaintext highlighter-rouge">s -> (a, s)</code>, so our
previous example could be written as
<code class="language-plaintext highlighter-rouge">statefulFunction :: b -> c -> d -> State s a</code></p>
<p>and we get bindStatefulFunctions for free from State (it is known as <code class="language-plaintext highlighter-rouge">>>=</code> for monads).</p>
<p>Now we can do:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">statefulFunction</span> <span class="n">arg1</span> <span class="n">arg2</span> <span class="n">arg3</span> <span class="o">=</span> <span class="kr">do</span>
  <span class="n">counter</span> <span class="o"><-</span> <span class="n">get</span>
  <span class="n">put</span> <span class="p">(</span><span class="n">counter</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>
  <span class="n">return</span> <span class="p">(</span><span class="n">arg1</span> <span class="o">+</span> <span class="n">arg2</span> <span class="o">+</span> <span class="n">arg3</span><span class="p">)</span></code></pre></figure>
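<p>(Did you know that Haskell is also the best imperative language?)
Notice how the state is not explicitly passed as an argument (thus our
function is partially applied), but is bound to <code class="language-plaintext highlighter-rouge">counter</code> by the <code class="language-plaintext highlighter-rouge">get</code> function.
<code class="language-plaintext highlighter-rouge">put</code> then puts the updated counter back in the state. Finally, <code class="language-plaintext highlighter-rouge">return</code> makes
sure that the value we get out is wrapped back in the State monad.</p>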
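<p>To actually run such a computation we have to hand it an initial state. A minimal
sketch, assuming the <code class="language-plaintext highlighter-rouge">Control.Monad.State</code> module from the mtl package and fixing
all the types to <code class="language-plaintext highlighter-rouge">Int</code> for simplicity:</p>

```haskell
import Control.Monad.State (State, get, put, runState)

-- Types fixed to Int here purely for illustration.
statefulFunction :: Int -> Int -> Int -> State Int Int
statefulFunction arg1 arg2 arg3 = do
  counter <- get
  put (counter + 1)
  return (arg1 + arg2 + arg3)

main :: IO ()
main = do
  -- runState supplies the initial counter and returns (value, final state).
  print (runState (statefulFunction 1 2 3) 0)  -- prints (6,1)
  -- Two calls chained with >>=: the counter is threaded through both.
  print (runState (statefulFunction 1 2 3 >>= \x -> statefulFunction x x x) 0)
  -- prints (18,2)
```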
<hr />
<p>The nice thing about the State monad is that all the computations we do
within it are essentially just partially applied functions, so they can’t be
evaluated until provided with an initial state, which will then magically
flow through the pipeline of computations, each doing their respective
modifications in the meantime.</p>
<p><code class="language-plaintext highlighter-rouge">mapAccumR</code> does a series of stateful computations (in nature, but it’s not
using the State monad), where it takes a value and a state, then returns a
new value with a modified state. (Accum refers to the fact that this state
can be used as an accumulator as we traverse the data)</p>
<p><code class="language-plaintext highlighter-rouge">mapAccumR :: Traversable t => (a -> b -> (a, c)) -> a -> t b -> (a, t c)</code></p>
<p><code class="language-plaintext highlighter-rouge">a</code> is that state here, that is what we used to store the largest element.
This state, however, travels forward in time, so to speak, as we go through
the list. The trick we do only happens at the end, when we feed it its own
output. We can do so thanks to lazy evaluation.</p>
<p>So the State monad passes its <code class="language-plaintext highlighter-rouge">s</code> from computation to computation, that’s
how these computations are bound.</p>
<p>Imagine using the same laziness self-feeding trick, but for passing the
state:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">reverseBind</span> <span class="n">stateful1</span> <span class="n">stateful2</span> <span class="n">s</span>
  <span class="o">=</span> <span class="p">(</span><span class="n">x'</span><span class="p">,</span> <span class="n">s''</span><span class="p">)</span>
  <span class="kr">where</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">s''</span><span class="p">)</span> <span class="o">=</span> <span class="n">stateful1</span> <span class="n">s'</span>
        <span class="p">(</span><span class="n">x'</span><span class="p">,</span> <span class="n">s'</span><span class="p">)</span> <span class="o">=</span> <span class="n">stateful2</span> <span class="n">x</span> <span class="n">s</span></code></pre></figure>
<p>So first we run stateful1 <strong>with the state modified by stateful2</strong>!
Then we run stateful2 with stateful1’s output value. Finally, we return the
state after running stateful1 along with the value <code class="language-plaintext highlighter-rouge">x'</code> from stateful2.
Note that because of the way this binding is done, stateful1’s output state
will actually be the <em>past</em> of stateful1. (That is, whatever we do with the
state in stateful1 will be visible to the computations preceding stateful1,
just like how stateful2’s effects are seen in stateful1. Lazy evaluation
rocks!)</p>
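<p>A tiny sketch makes the backwards flow visible. The two computations
<code class="language-plaintext highlighter-rouge">peek</code> and <code class="language-plaintext highlighter-rouge">setAnswer</code> are hypothetical helpers invented for this example, and
the state is taken as an ordinary argument so that it is in scope for the
<code class="language-plaintext highlighter-rouge">where</code> bindings.</p>

```haskell
reverseBind :: (s -> (a, s)) -> (a -> s -> (b, s)) -> s -> (b, s)
reverseBind stateful1 stateful2 s = (x', s'')
  where
    (x, s'') = stateful1 s'    -- stateful1 sees the state left by stateful2
    (x', s') = stateful2 x s   -- stateful2 sees the initial state

-- Hypothetical first computation: report the state it observes.
peek :: Int -> (Int, Int)
peek st = (st, st)

-- Hypothetical second computation: ignore the incoming state, set it to 42.
-- It must not force `x` before producing its state, or we would loop.
setAnswer :: Int -> Int -> (String, Int)
setAnswer x _ = ("peek saw " ++ show x, 42)

main :: IO ()
main = print (reverseBind peek setAnswer 0)  -- prints ("peek saw 42",42)
```

<p>Although <code class="language-plaintext highlighter-rouge">peek</code> comes first in the bind, it observes the 42 written by the
computation that comes after it.</p>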
<p>Coming from an imperative background, this can be thought of as stateful1
putting forward references to the values it uses from the state, and once
those values are actually calculated in the future, stateful1 will be able
to do whatever it wanted. These references are not explicit though as they
would be in C (using pointers, for example), but implicitly placed there
by GHC as thunks.</p>
<p>That also means whatever we do with these values has to be done lazily (the
assembler example below shows this in practice).</p>
<p class="notice">The above code is a modified version of the monadic binding found in the
rev-state package (which is in turn a modification of the original State
monad by reversing the flow of state).</p>
<h2 id="finally-the-time-machine-tardis">Finally, the time machine, TARDIS</h2>
<p>So we have the State monad, of which the state flows forwards, then we have
the Rev-State, which sends the state backwards. So what do we get if we
combine these two? Yes, a time machine! Also known as the Tardis monad: it
is in fact a combination of the State and Rev-State monads with some nice
functions to deal with the bidirectional states.</p>
<p>I say states, because naturally, we have data coming from the future and
data coming from the past, and those make two (a backwards travelling and a
forwards travelling state).</p>
<p>These could be of different types, say we can send Strings back in time and
Ints to the future.</p>
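<p>As a rough sketch of what that looks like, assuming the
<code class="language-plaintext highlighter-rouge">Control.Monad.Tardis</code> module from the tardis package and its
<code class="language-plaintext highlighter-rouge">sendFuture</code>, <code class="language-plaintext highlighter-rouge">sendPast</code>, <code class="language-plaintext highlighter-rouge">getFuture</code>, <code class="language-plaintext highlighter-rouge">getPast</code> and <code class="language-plaintext highlighter-rouge">evalTardis</code> functions
(the name <code class="language-plaintext highlighter-rouge">demo</code> is made up for this example):</p>

```haskell
import Control.Monad.Tardis

-- Backwards-travelling state: String. Forwards-travelling state: Int.
demo :: Tardis String Int (String, Int)
demo = do
  sendFuture 1             -- visible to everything after this line
  fromFuture <- getFuture  -- will be filled in by the sendPast below
  fromPast   <- getPast    -- the 1 we sent above
  sendPast "hello"         -- visible to everything before this line
  return (fromFuture, fromPast)

main :: IO ()
main = print (evalTardis demo ("unused", 0))  -- prints ("hello",1)
```

<p>The value bound to <code class="language-plaintext highlighter-rouge">fromFuture</code> is only a thunk at the point where it is bound, which
is why it may be stored in the result but not inspected before <code class="language-plaintext highlighter-rouge">sendPast</code> runs.</p>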
<h3 id="a-single-pass-assembler-an-example">A single-pass assembler: an example</h3>
<p>Writing an assembler is relatively straightforward. We go through a list of
assembly instructions and turn them into their binary equivalent for the given
CPU architecture.</p>
<p>However, there are some instructions that we can’t immediately convert.
One such instruction is a jump to a label (used for branching).
For these labels, we need a symbol table.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">import</span> <span class="k">qualified</span> <span class="nn">Data.Map.Strict</span> <span class="k">as</span> <span class="n">M</span>
<span class="kr">type</span> <span class="kt">Addr</span> <span class="o">=</span> <span class="kt">Int</span>
<span class="kr">type</span> <span class="kt">SymTable</span> <span class="o">=</span> <span class="kt">M</span><span class="o">.</span><span class="kt">Map</span> <span class="kt">String</span> <span class="kt">Addr</span> <span class="c1">-- map label names to their addresses</span>
<span class="kr">data</span> <span class="kt">Instr</span> <span class="o">=</span> <span class="kt">Add</span>
           <span class="o">|</span> <span class="kt">Mov</span>
           <span class="o">|</span> <span class="kt">ToLabel</span> <span class="kt">String</span>
           <span class="o">|</span> <span class="kt">ToAddr</span> <span class="kt">Addr</span>
           <span class="o">|</span> <span class="kt">Label</span> <span class="kt">String</span>
           <span class="o">|</span> <span class="kt">Err</span>
           <span class="kr">deriving</span> <span class="p">(</span><span class="kt">Show</span><span class="p">)</span></code></pre></figure>
<p><code class="language-plaintext highlighter-rouge">Instr</code> is a rather rudimentary representation of assembly instructions, but it
does the job for us now.</p>
<p>What we want is a function that takes a list of <code class="language-plaintext highlighter-rouge">Instr</code>s and returns
a list of <code class="language-plaintext highlighter-rouge">(Addr, Instr)</code> pairs, replacing all the <code class="language-plaintext highlighter-rouge">ToLabel</code>s with <code class="language-plaintext highlighter-rouge">ToAddr</code>s
that point to the address of the label. If the label is never defined, we
put an <code class="language-plaintext highlighter-rouge">Err</code> there. (In real life, you would use some ExceptT monad transformer
to handle such errors.)</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">runAssembler</span> <span class="o">::</span> <span class="p">[</span><span class="kt">Instr</span><span class="p">]</span> <span class="o">-></span> <span class="p">[(</span><span class="kt">Addr</span><span class="p">,</span> <span class="kt">Instr</span><span class="p">)]</span></code></pre></figure>
<p>Jumping to a label that is already defined is easy: we look it up in our
SymTable and convert <code class="language-plaintext highlighter-rouge">ToLabel</code> to <code class="language-plaintext highlighter-rouge">ToAddr</code>. This sounds like an application
of the State monad, doesn’t it?
When we encounter a label definition, we just add it to the state (<code class="language-plaintext highlighter-rouge">SymTable</code>).
Done!</p>
<p>The problem arises from the fact that some labels might be defined after they
are used. The ‘else’ block of an if statement will typically be done like this.
Implementing this in C, you could remember these positions and at the end, fill
in the gaps with the knowledge you have acquired. Thunks, anyone?</p>
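<p>For comparison, here is a sketch of the conventional approach without any
time travel: one pass to collect the labels, and a second pass to fill in the
gaps. The function names <code class="language-plaintext highlighter-rouge">collectLabels</code>, <code class="language-plaintext highlighter-rouge">resolve</code> and <code class="language-plaintext highlighter-rouge">twoPass</code> are made up
for this illustration, and the types are repeated so that the block is
self-contained.</p>

```haskell
import qualified Data.Map.Strict as M

type Addr = Int
type SymTable = M.Map String Addr

data Instr = Add | Mov | ToLabel String | ToAddr Addr | Label String | Err
  deriving (Show, Eq)

-- Pass 1: record the address each label refers to. Labels take up no
-- space, so the address only advances on real instructions.
collectLabels :: [Instr] -> SymTable
collectLabels = go 0 M.empty
  where
    go _    table []               = table
    go addr table (Label l : rest) = go addr (M.insert l addr table) rest
    go addr table (_ : rest)       = go (addr + 1) table rest

-- Pass 2: assign addresses and replace jumps to labels with jumps to
-- addresses, using the complete symbol table from pass 1.
resolve :: SymTable -> [Instr] -> [(Addr, Instr)]
resolve table = go 0
  where
    go _    []                 = []
    go addr (Label _ : rest)   = go addr rest
    go addr (ToLabel l : rest) =
      (addr, maybe Err ToAddr (M.lookup l table)) : go (addr + 1) rest
    go addr (i : rest)         = (addr, i) : go (addr + 1) rest

twoPass :: [Instr] -> [(Addr, Instr)]
twoPass prog = resolve (collectLabels prog) prog

main :: IO ()
main = print (twoPass [ToLabel "end", Mov, Label "end", Add, ToLabel "nowhere"])
-- prints [(0,ToAddr 2),(1,Mov),(2,Add),(3,Err)]
```

<p>The time-travelling version does the same job, except that the second pass is
replaced by thunks that are resolved once the symbol table is known.</p>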
<p>I’ll just use a Rev-State monad and send these definitions back in time.
Simple enough, right?</p>
<p>So at this point, we can see that we will need both types of these states:
one that’s travelling forward and one that is going backwards. And that is
exactly what the Tardis monad is!</p>
<p class="notice">Labels will not be turned into any binary, instead
the next actual instruction’s address will be used.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">type</span> <span class="kt">Assembler</span> <span class="n">a</span> <span class="o">=</span> <span class="kt">Tardis</span> <span class="kt">SymTable</span> <span class="kt">SymTable</span> <span class="n">a</span></code></pre></figure>
<p>Right, our <code class="language-plaintext highlighter-rouge">runAssembler</code> function will run some <code class="language-plaintext highlighter-rouge">assemble</code> function
in the Tardis monad. (That is, it will give it the initial states and extract
the final value at the end).</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">runAssembler</span> <span class="n">asm</span> <span class="o">=</span> <span class="n">instructions</span>
  <span class="kr">where</span> <span class="p">(</span><span class="n">instructions</span><span class="p">,</span> <span class="kr">_</span><span class="p">)</span>
          <span class="o">=</span> <span class="n">runTardis</span> <span class="p">(</span><span class="n">assemble</span> <span class="mi">0</span> <span class="n">asm</span><span class="p">)</span> <span class="p">(</span><span class="kt">M</span><span class="o">.</span><span class="n">empty</span><span class="p">,</span> <span class="kt">M</span><span class="o">.</span><span class="n">empty</span><span class="p">)</span></code></pre></figure>
<p>The <code class="language-plaintext highlighter-rouge">assemble</code> function turns a list of instructions to <code class="language-plaintext highlighter-rouge">[(Addr, Instr)]</code>
in the Assembler monad (which is a synonym for Tardis SymTable SymTable).
What’s that 0 doing there, you ask?</p>
<p>We need to keep track of the address we will use for the next instruction.
This is because of labels. When we encounter a regular instruction, we put
that at the provided address, then increment that address by 1. If a label
comes around, we put it in the State then continue without incrementing the
address.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">assemble</span> <span class="o">::</span> <span class="kt">Addr</span> <span class="o">-></span> <span class="p">[</span><span class="kt">Instr</span><span class="p">]</span> <span class="o">-></span> <span class="kt">Assembler</span> <span class="p">[(</span><span class="kt">Addr</span><span class="p">,</span> <span class="kt">Instr</span><span class="p">)]</span>
<span class="n">assemble</span> <span class="kr">_</span> <span class="kt">[]</span> <span class="o">=</span> <span class="n">return</span> <span class="kt">[]</span>
<span class="c1">-- label found, update state then go on</span>
<span class="n">assemble</span> <span class="n">addr</span> <span class="p">(</span><span class="kt">Label</span> <span class="n">label</span> <span class="o">:</span> <span class="n">is'</span><span class="p">)</span> <span class="o">=</span> <span class="kr">do</span>
  <span class="n">modifyBackwards</span> <span class="p">(</span><span class="kt">M</span><span class="o">.</span><span class="n">insert</span> <span class="n">label</span> <span class="n">addr</span><span class="p">)</span> <span class="c1">-- send to past</span>
  <span class="n">modifyForwards</span> <span class="p">(</span><span class="kt">M</span><span class="o">.</span><span class="n">insert</span> <span class="n">label</span> <span class="n">addr</span><span class="p">)</span> <span class="c1">-- send to future</span>
  <span class="n">assemble</span> <span class="n">addr</span> <span class="n">is'</span> <span class="c1">-- assemble the rest of the instructions</span>
<span class="c1">-- jump to label found, replace with</span>
<span class="c1">-- jump to address</span>
<span class="c1">-- then do the rest starting at (addr + 1)</span>
<span class="n">assemble</span> <span class="n">addr</span> <span class="p">(</span><span class="kt">ToLabel</span> <span class="n">label</span> <span class="o">:</span> <span class="n">is'</span><span class="p">)</span> <span class="o">=</span> <span class="kr">do</span>
  <span class="n">bw</span> <span class="o"><-</span> <span class="n">getFuture</span>
  <span class="n">fw</span> <span class="o"><-</span> <span class="n">getPast</span>
  <span class="kr">let</span> <span class="n">union</span> <span class="o">=</span> <span class="kt">M</span><span class="o">.</span><span class="n">union</span> <span class="n">bw</span> <span class="n">fw</span> <span class="c1">-- take union of the two symbol tables</span>
      <span class="n">this</span> <span class="o">=</span> <span class="kr">case</span> <span class="kt">M</span><span class="o">.</span><span class="n">lookup</span> <span class="n">label</span> <span class="n">union</span> <span class="kr">of</span>
        <span class="kt">Just</span> <span class="n">a'</span> <span class="o">-></span> <span class="p">(</span><span class="n">addr</span><span class="p">,</span> <span class="kt">ToAddr</span> <span class="n">a'</span><span class="p">)</span>
        <span class="kt">Nothing</span> <span class="o">-></span> <span class="p">(</span><span class="n">addr</span><span class="p">,</span> <span class="kt">Err</span><span class="p">)</span>
  <span class="n">rest</span> <span class="o"><-</span> <span class="n">assemble</span> <span class="p">(</span><span class="n">addr</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="n">is'</span>
  <span class="n">return</span> <span class="o">$</span> <span class="n">this</span> <span class="o">:</span> <span class="n">rest</span>
<span class="c1">-- regular instruction found,</span>
<span class="c1">-- assign it to the address</span>
<span class="c1">-- then do the rest starting at (addr + 1)</span>
<span class="n">assemble</span> <span class="n">addr</span> <span class="p">(</span><span class="n">instr</span> <span class="o">:</span> <span class="n">is'</span><span class="p">)</span> <span class="o">=</span> <span class="kr">do</span>
  <span class="n">rest</span> <span class="o"><-</span> <span class="n">assemble</span> <span class="p">(</span><span class="n">addr</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="n">is'</span>
  <span class="n">return</span> <span class="o">$</span> <span class="p">(</span><span class="n">addr</span><span class="p">,</span> <span class="n">instr</span><span class="p">)</span> <span class="o">:</span> <span class="n">rest</span></code></pre></figure>
<p>Now we come up with some test instructions:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">input</span> <span class="o">::</span> <span class="p">[</span><span class="kt">Instr</span><span class="p">]</span>
<span class="n">input</span> <span class="o">=</span> <span class="p">[</span><span class="kt">Add</span><span class="p">,</span>
         <span class="kt">Add</span><span class="p">,</span>
         <span class="kt">ToLabel</span> <span class="s">"my_label"</span><span class="p">,</span>
         <span class="kt">Mov</span><span class="p">,</span>
         <span class="kt">Mov</span><span class="p">,</span>
         <span class="kt">Label</span> <span class="s">"my_label"</span><span class="p">,</span>
         <span class="kt">Label</span> <span class="s">"second_label"</span><span class="p">,</span>
         <span class="kt">Mov</span><span class="p">,</span>
         <span class="kt">ToLabel</span> <span class="s">"second_label"</span><span class="p">,</span>
         <span class="kt">Mov</span><span class="p">]</span></code></pre></figure>
<p>…and we can try running the assembler on this data:</p>
<p><code class="language-plaintext highlighter-rouge">> runAssembler input</code></p>
<p><code class="language-plaintext highlighter-rouge">> [(0,Add),(1,Add),(2,ToAddr 5),(3,Mov),(4,Mov),(5,Mov),(6,ToAddr 5),(7,Mov)]</code></p>
<p>Yay! Just what we wanted!</p>
<h3 id="io-doesnt-mix-with-the-future-the-past-is-fine">IO doesn’t mix with the future! (The past is fine)</h3>
<p>Be careful about what you do with the state coming from the future.
Everything has to be lazily passed through.</p>
<p>You might be tempted to use the TardisT monad transformer to interleave IO
effects in your time-travelling code. Most IO computations, however, are
strict.</p>
<p>Let’s say you want to get the label from the future and print its address.
IO’s print will try to evaluate its argument (which is a partial thunk at this
point). It will block the thread waiting for an evaluation that can never
complete, because the value depends on code that only runs after the print, so
the program breaks instead of progressing further. In this case, I’d advise the
use of a Writer monad, which has a lazy mechanism, so that the results can be
printed at the end using IO.</p>
<h3 id="thanks">Thanks</h3>
<p>Thanks for reading this lengthy post, in which we saw how we can mimic the use
of references in pure Haskell code (although time travel is arguably a better
name for this). This comes at a price though: accumulating
unevaluated thunks can use up quite a bit of memory, so be careful if you want
to use these techniques in a memory-critical environment.</p>
<p>If you find any bugs or mistakes, please make sure to let me know!</p>
<p><a href="https://kcsongor.github.io/time-travel-in-haskell-for-dummies/">Time travel in Haskell for dummies</a> was originally published by Csongor Kiss at <a href="https://kcsongor.github.io">( )</a> on October 02, 2015.</p>