vigoo's software development blog Zola 2025-12-19T00:00:00+00:00 https://blog.vigoo.dev/atom.xml Agent patterns in Golem 2025-12-19T00:00:00+00:00 2025-12-19T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/golem-agent-patterns/ <p><a href="https://golem.cloud">Golem</a> is an <em>agent-native</em> platform that provides a high level of fault tolerance and exactly-once (or in some cases, at-least-once) semantics automatically, without requiring you to write any code for persisting and recovering state. We have published several blog posts, demos and live coding sessions showing what this looks like. Yesterday I <a href="https://blog.vigoo.dev/posts/rust-agents-golem14/">posted about writing a Golem application in Rust</a>, and a week earlier John de Goes held a really nice <a href="https://www.youtube.com/live/ovVn_fNIyJU">live coding session on writing a NoSQL database using Golem in TypeScript</a>.</p> <p>In all these examples we define one or more <strong>agents</strong> that interact with each other - the agent is the fundamental building block of a Golem application. But what are these agents? How should we structure our application, and what are some common patterns?</p> <p>I'm trying to answer some of these questions in this article.</p> <h2 id="agents-as-workflows">Agents as workflows</h2> <p>When we first released Golem, we said you can run any application (with some restrictions, of course) on it without having to rewrite it to use anything Golem-specific. With the new, agent-centric approach this might no longer seem to be true - but to some extent it still is.
The simplest way to map an application to agents is to treat every unique run of the program as an agent - we create a unique instance of our program and run it, and it performs some <em>side effects</em> such as calling remote HTTP endpoints or databases.</p> <p>In this setup the agent's identity is a unique identifier, for example a UUID, and the agent exposes a single callable entry point, similar to how a traditional program exposes its single entry point in the form of a <code>main</code> function. Let's see what an agent like this looks like in Golem, using TypeScript:</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#a626a4;">import </span><span>{ </span><span style="color:#e45649;">BaseAgent</span><span>, </span><span style="color:#e45649;">agent</span><span>,} </span><span style="color:#a626a4;">from </span><span style="color:#50a14f;">&#39;@golemcloud/golem-ts-sdk&#39;</span><span>; </span><span style="color:#a626a4;">import </span><span>{ </span><span style="color:#e45649;">validate </span><span style="color:#a626a4;">as </span><span style="color:#e45649;">uuidValidate </span><span>} </span><span style="color:#a626a4;">from </span><span style="color:#50a14f;">&#39;uuid&#39;</span><span>; </span><span> </span><span>@</span><span style="color:#0184bc;">agent</span><span>() </span><span style="color:#a626a4;">class </span><span style="color:#c18401;">BlackboxWorkflow </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">BaseAgent { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private readonly </span><span style="color:#e45649;">id</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string; </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">constructor</span><span
style="color:#c18401;">(</span><span style="color:#e45649;">id</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string) { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">super</span><span style="color:#c18401;">() </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">if </span><span style="color:#c18401;">(</span><span style="color:#a626a4;">!</span><span style="color:#0184bc;">uuidValidate</span><span style="color:#c18401;">(</span><span style="color:#e45649;">id</span><span style="color:#c18401;">)) { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">throw new </span><span style="color:#c18401;">Error(</span><span style="color:#50a14f;">`Invalid id, must be a UUID: ${</span><span style="color:#e45649;">id</span><span style="color:#50a14f;">}`</span><span style="color:#c18401;">); </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.id </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">id</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">async </span><span style="color:#0184bc;">run</span><span style="color:#c18401;">()</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Promise&lt;void&gt; { </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// Potentially long running workflow doing a series of steps </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre> <p>The <code>run</code> function can perform a series of steps - call <code>fetch</code> multiple times, sleep, use third-party libraries to connect to external systems, and so on - without having any Golem-specific detail in it.
Even an agent like this is completely <strong>durable</strong> in Golem. For example, if its execution is interrupted by a scaling event, the running agent can be "moved" to another node (interrupted on one node and restored on another) and continue executing without any noticeable effect, except for some latency.</p> <h3 id="observing-a-workflow-s-state">Observing a workflow's state</h3> <p>The above example is a <strong>black box</strong> - once you have started a workflow like this, by creating the agent and invoking its <code>run</code> method, you cannot really observe the workflow's inner state. You can observe its side effects - it may call external systems, write logs, etc. - but there is no real way to query or control what's happening until <code>run</code> finishes.</p> <p>The primary reason for this is that Golem agents are <strong>single-threaded</strong> and <strong>invocations cannot overlap</strong>. Although <code>run</code> can contain overlapping asynchronous operations (network calls, for example), even if we exported more methods than just <code>run</code> from our agent, we would not be able to call them <em>while</em> <code>run</code> is executing. Those other calls would end up in a message queue, waiting to be processed one by one.</p> <p>One very important property of <strong>agents</strong> is that an agent can <strong>call</strong> another agent; this call can be synchronous (the caller awaits a response) or just a trigger (the caller puts an invocation request in the other agent's message queue). Both kinds of calls are persisted, and because of that Golem can guarantee <strong>exactly-once semantics</strong>.
If the code to call an agent from another agent runs, you can be sure that the call is going to happen, and only once, no matter what happens to the execution environment.</p> <p>We can use this feature to add simple observability to a long-running workflow like the one above, by defining a second agent that represents the running workflow's state:</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#a626a4;">type </span><span>State </span><span style="color:#a626a4;">= </span><span>{ </span><span> </span><span style="color:#e45649;">tag</span><span style="color:#a626a4;">: </span><span style="color:#50a14f;">&quot;not-started&quot;</span><span>, </span><span> </span><span style="color:#e45649;">id</span><span style="color:#a626a4;">: </span><span>string, </span><span>} </span><span style="color:#a626a4;">| </span><span>{ </span><span> </span><span style="color:#e45649;">tag</span><span style="color:#a626a4;">: </span><span style="color:#50a14f;">&quot;in-progress&quot;</span><span>, </span><span> </span><span style="color:#e45649;">id</span><span style="color:#a626a4;">: </span><span>string, </span><span> </span><span style="color:#e45649;">currentStep</span><span style="color:#a626a4;">: </span><span>string, </span><span> </span><span style="color:#e45649;">startedAt</span><span style="color:#a626a4;">: </span><span>string </span><span>} </span><span style="color:#a626a4;">| </span><span>{ </span><span> </span><span style="color:#e45649;">tag</span><span style="color:#a626a4;">: </span><span style="color:#50a14f;">&quot;completed&quot;</span><span>, </span><span> </span><span style="color:#e45649;">id</span><span style="color:#a626a4;">: </span><span>string, </span><span> </span><span style="color:#e45649;">startedAt</span><span style="color:#a626a4;">?: </span><span>string, </span><span> </span><span
style="color:#e45649;">finishedAt</span><span style="color:#a626a4;">: </span><span>string, </span><span> </span><span style="color:#e45649;">results</span><span style="color:#a626a4;">: </span><span>number[] </span><span style="color:#a0a1a7;">// some domain-specific result </span><span>} </span><span> </span><span>@agent() </span><span style="color:#a626a4;">class </span><span style="color:#c18401;">WorkflowState </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">BaseAgent { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private </span><span style="color:#e45649;">state</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">State; </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">constructor</span><span style="color:#c18401;">(</span><span style="color:#e45649;">id</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string) { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">super</span><span style="color:#c18401;">() </span><span style="color:#c18401;"> </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">state </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">{tag: </span><span style="color:#50a14f;">&quot;not-started&quot;</span><span style="color:#c18401;">, </span><span style="color:#e45649;">id</span><span style="color:#c18401;">}; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#0184bc;">get</span><span style="color:#c18401;">()</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">State { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span 
style="color:#e45649;">state</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#0184bc;">update</span><span style="color:#c18401;">(</span><span style="color:#e45649;">currentStep</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string) { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">if </span><span style="color:#c18401;">(</span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">state</span><span style="color:#c18401;">.</span><span style="color:#e45649;">tag </span><span style="color:#a626a4;">=== </span><span style="color:#50a14f;">&quot;not-started&quot;</span><span style="color:#c18401;">) { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">state </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> tag: </span><span style="color:#50a14f;">&quot;in-progress&quot;</span><span style="color:#c18401;">, </span><span style="color:#c18401;"> id: </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">state</span><span style="color:#c18401;">.id, </span><span style="color:#e45649;">currentStep</span><span style="color:#c18401;">, </span><span style="color:#c18401;"> startedAt: </span><span style="color:#a626a4;">new </span><span style="color:#c18401;">Date().</span><span style="color:#0184bc;">toISOString</span><span style="color:#c18401;">() </span><span style="color:#c18401;"> }; </span><span style="color:#c18401;"> } </span><span style="color:#a626a4;">else if </span><span style="color:#c18401;">(</span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">state</span><span 
style="color:#c18401;">.</span><span style="color:#e45649;">tag </span><span style="color:#a626a4;">=== </span><span style="color:#50a14f;">&quot;in-progress&quot;</span><span style="color:#c18401;">) { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">state</span><span style="color:#c18401;">.</span><span style="color:#e45649;">currentStep </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">currentStep</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> } </span><span style="color:#a626a4;">else if </span><span style="color:#c18401;">(</span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">state</span><span style="color:#c18401;">.</span><span style="color:#e45649;">tag </span><span style="color:#a626a4;">== </span><span style="color:#50a14f;">&quot;completed&quot;</span><span style="color:#c18401;">) { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">throw new </span><span style="color:#c18401;">Error(</span><span style="color:#50a14f;">&quot;Cannot update completed workflow state&quot;</span><span style="color:#c18401;">); </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#0184bc;">finished</span><span style="color:#c18401;">(</span><span style="color:#e45649;">results</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">number[]) { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">if </span><span style="color:#c18401;">(</span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">state</span><span style="color:#c18401;">.</span><span style="color:#e45649;">tag </span><span style="color:#a626a4;">== </span><span 
style="color:#50a14f;">&quot;completed&quot;</span><span style="color:#c18401;">) { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">throw new </span><span style="color:#c18401;">Error(</span><span style="color:#50a14f;">&quot;Cannot finish completed workflow state&quot;</span><span style="color:#c18401;">); </span><span style="color:#c18401;"> } </span><span style="color:#a626a4;">else </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">state </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> tag: </span><span style="color:#50a14f;">&quot;completed&quot;</span><span style="color:#c18401;">, </span><span style="color:#c18401;"> id: </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">state</span><span style="color:#c18401;">.id, </span><span style="color:#c18401;"> startedAt: </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">state</span><span style="color:#c18401;">.</span><span style="color:#e45649;">tag </span><span style="color:#a626a4;">=== </span><span style="color:#50a14f;">&quot;in-progress&quot; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">? 
</span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">state</span><span style="color:#c18401;">.</span><span style="color:#e45649;">startedAt </span><span style="color:#a626a4;">: </span><span style="color:#c18401;">undefined, </span><span style="color:#c18401;"> finishedAt: </span><span style="color:#a626a4;">new </span><span style="color:#c18401;">Date().</span><span style="color:#0184bc;">toISOString</span><span style="color:#c18401;">(), </span><span style="color:#c18401;"> </span><span style="color:#e45649;">results </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre> <p>With this, we can modify our black box agent's <code>run</code> method to report progress and completion to this other agent:</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#e45649;">async </span><span style="color:#0184bc;">run</span><span>(): </span><span style="color:#c18401;">Promise</span><span style="color:#a626a4;">&lt;void&gt; </span><span>{ </span><span> const </span><span style="color:#e45649;">state </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">WorkflowState</span><span>.</span><span style="color:#0184bc;">get</span><span>(</span><span style="color:#e45649;">this</span><span>.id); </span><span> </span><span style="color:#a0a1a7;">// ... </span><span> state.update.trigger(</span><span style="color:#50a14f;">&quot;step 1&quot;</span><span>); </span><span> </span><span style="color:#a0a1a7;">// ... </span><span> state.update.trigger(</span><span style="color:#50a14f;">&quot;step 2&quot;</span><span>); </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span> state.finished.trigger([1, 2, 3]); </span><span>} </span></code></pre> <p>Using <code>trigger</code> in these agent calls means we don't want to block on them - we just trigger the update on the other agent, so it's very fast.</p> <p>Our <code>WorkflowState</code> agent is very different from our first one, as it is reactive. Its methods are not long-running; they just update the agent's inner state. We can call <code>get</code> on it at any time while the workflow is running; these <code>get</code> calls are going to be interleaved with the status update calls.</p> <h3 id="multi-step-workflows">Multi-step workflows</h3> <p>As we've seen with <code>WorkflowState</code>, agents can export <strong>multiple methods</strong>. This suggests that instead of writing a workflow as in the first example - a single method performing all the steps sequentially - we could also expose these steps as agent methods.</p> <p>This is not always what we want, and it has pros and cons:</p> <ul> <li>We may already have our code structured like this, with sub-steps extracted into functions. Exposing them as agent methods is just a matter of making them public methods on the agent class</li> <li>However, once we do that, we no longer have a single entry point for our workflow. We need something that orchestrates the workflow execution!
This can be both an advantage and a disadvantage: <ul> <li>The external orchestrator may customize the execution flow by deciding which steps to call, and so on</li> <li>But we need to write this orchestrator (potentially another agent), which complicates our architecture</li> </ul> </li> </ul> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span>@</span><span style="color:#0184bc;">agent</span><span>() </span><span style="color:#a626a4;">class </span><span style="color:#c18401;">InnerWorkflow </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">BaseAgent { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private readonly </span><span style="color:#e45649;">id</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private </span><span style="color:#e45649;">value</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">number </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">0; </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">constructor</span><span style="color:#c18401;">(</span><span style="color:#e45649;">id</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string) { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">super</span><span style="color:#c18401;">() </span><span style="color:#c18401;"> </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.id </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">id</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">async </span><span
style="color:#0184bc;">step1</span><span style="color:#c18401;">()</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Promise&lt;void&gt; { </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// .. </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">async </span><span style="color:#0184bc;">step2</span><span style="color:#c18401;">()</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Promise&lt;void&gt; { </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// .. </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">async </span><span style="color:#0184bc;">step3</span><span style="color:#c18401;">()</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Promise&lt;void&gt; { </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// .. 
</span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span><span> </span><span>@</span><span style="color:#0184bc;">agent</span><span>() </span><span style="color:#a626a4;">class </span><span style="color:#c18401;">OuterWorkflow </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">BaseAgent { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private readonly </span><span style="color:#e45649;">id</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private </span><span style="color:#e45649;">value</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">number </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">0; </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">constructor</span><span style="color:#c18401;">(</span><span style="color:#e45649;">id</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string) { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">super</span><span style="color:#c18401;">() </span><span style="color:#c18401;"> </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.id </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">id</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">async </span><span style="color:#0184bc;">run</span><span style="color:#c18401;">()</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Promise&lt;void&gt; { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">inner </span><span style="color:#a626a4;">= </span><span 
style="color:#e45649;">InnerWorkflow</span><span style="color:#c18401;">.</span><span style="color:#0184bc;">get</span><span style="color:#c18401;">(</span><span style="color:#e45649;">this</span><span style="color:#c18401;">.id); </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">await </span><span style="color:#e45649;">inner</span><span style="color:#c18401;">.</span><span style="color:#0184bc;">step1</span><span style="color:#c18401;">(); </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">await </span><span style="color:#e45649;">inner</span><span style="color:#c18401;">.</span><span style="color:#0184bc;">step2</span><span style="color:#c18401;">(); </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">await </span><span style="color:#e45649;">inner</span><span style="color:#c18401;">.</span><span style="color:#0184bc;">step3</span><span style="color:#c18401;">(); </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre> <p>We can introduce these hierarchies of agents in as many layers as we want, but the reason we would want to do so is to control <strong>concurrency</strong>.</p> <h3 id="concurrency">Concurrency</h3> <p>Even though agent methods can be <code>async</code> and do some operations in parallel - HTTP requests, remote agent calls and so on - agents are single-threaded and their exported methods cannot overlap.</p> <p>We can implement fully parallel execution with two techniques:</p> <ul> <li>Spawning child agents</li> <li>Forking agents</li> </ul> <p>We have already seen a simple example of spawning child agents to achieve concurrency when we created a <code>WorkflowState</code> agent as a child of our <code>BlackboxWorkflow</code> agent.</p> <p>In general the pattern looks like the following:</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript"
data-lang="typescript"><span>@</span><span style="color:#0184bc;">agent</span><span>() </span><span style="color:#a626a4;">class </span><span style="color:#c18401;">ConcurrentAgent </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">BaseAgent { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private readonly </span><span style="color:#e45649;">id</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string; </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">constructor</span><span style="color:#c18401;">(</span><span style="color:#e45649;">id</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string) { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">super</span><span style="color:#c18401;">() </span><span style="color:#c18401;"> </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.id </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">id</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">async </span><span style="color:#0184bc;">run</span><span style="color:#c18401;">()</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Promise&lt;void&gt; { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">inputs </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">[</span><span style="color:#50a14f;">&quot;a&quot;</span><span style="color:#c18401;">, </span><span style="color:#50a14f;">&quot;b&quot;</span><span style="color:#c18401;">, </span><span style="color:#50a14f;">&quot;c&quot;</span><span style="color:#c18401;">]; </span><span style="color:#a0a1a7;">// example chunks we want to process in parallel </span><span 
style="color:#c18401;"> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">stepPromises </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">inputs</span><span style="color:#c18401;">.</span><span style="color:#0184bc;">map</span><span style="color:#c18401;">((</span><span style="color:#e45649;">input</span><span style="color:#c18401;">, </span><span style="color:#e45649;">idx</span><span style="color:#c18401;">) </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;"> </span><span style="color:#e45649;">Substep</span><span style="color:#c18401;">.</span><span style="color:#0184bc;">get</span><span style="color:#c18401;">(</span><span style="color:#e45649;">this</span><span style="color:#c18401;">.id, </span><span style="color:#e45649;">idx</span><span style="color:#c18401;">).</span><span style="color:#0184bc;">run</span><span style="color:#c18401;">(</span><span style="color:#e45649;">input</span><span style="color:#c18401;">)); </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">results </span><span style="color:#a626a4;">= await </span><span style="color:#c18401;">Promise.</span><span style="color:#0184bc;">all</span><span style="color:#c18401;">(</span><span style="color:#e45649;">stepPromises</span><span style="color:#c18401;">); </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... 
</span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span><span> </span><span>@</span><span style="color:#0184bc;">agent</span><span>() </span><span style="color:#a626a4;">class </span><span style="color:#c18401;">Substep </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">BaseAgent { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private readonly </span><span style="color:#e45649;">id</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private readonly </span><span style="color:#e45649;">substepId</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">number; </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">constructor</span><span style="color:#c18401;">(</span><span style="color:#e45649;">id</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string, </span><span style="color:#e45649;">substepId</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">number) { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">super</span><span style="color:#c18401;">() </span><span style="color:#c18401;"> </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.id </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">id</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">substepId </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">substepId</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">async </span><span 
style="color:#0184bc;">run</span><span style="color:#c18401;">(</span><span style="color:#e45649;">input</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string)</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Promise&lt;string&gt; { </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// Some operation on input producing an output </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">input</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre> <p>This example takes every element of <code>inputs</code> and spawns a separate <strong>child agent</strong> to process that chunk. The <code>Substep</code> agent is responsible for performing work on one chunk - its <em>identity</em> is no longer just the main workflow's ID, but it also contains an index. This can be useful if this number (<code>substepId</code> in the example) holds some important meaning in our problem domain (imagine it's not just an index, but a domain-specific identifier of the chunk of data it works on).</p> <p>In case the identity of the substeps does not matter, just that we get a separate instance for each substep that can run in parallel, we can use the <strong>phantom agent</strong> feature of Golem to make it even simpler:</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span>@</span><span style="color:#0184bc;">agent</span><span>() </span><span style="color:#a626a4;">class </span><span style="color:#c18401;">PhantomSubstep </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">BaseAgent { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">constructor</span><span style="color:#c18401;">() { </span><span 
style="color:#c18401;"> </span><span style="color:#e45649;">super</span><span style="color:#c18401;">() </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">async </span><span style="color:#0184bc;">run</span><span style="color:#c18401;">(</span><span style="color:#e45649;">input</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string)</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Promise&lt;string&gt; { </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span><span> </span><span style="color:#a0a1a7;">// ... </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">stepPromises </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">inputs</span><span>.</span><span style="color:#0184bc;">map</span><span>(</span><span style="color:#e45649;">input </span><span style="color:#a626a4;">=&gt; </span><span style="color:#e45649;">PhantomSubstep</span><span>.</span><span style="color:#0184bc;">newPhantom</span><span>().</span><span style="color:#0184bc;">run</span><span>(</span><span style="color:#e45649;">input</span><span>)); </span></code></pre> <p><code>newPhantom</code> always creates a new instance of the agent, even if there is no user-defined unique identity to distinguish them.</p> <h4 id="results">Results</h4> <p>The above example was <strong>awaiting all results</strong> before moving forward. This is just one of the possibilities. When using <code>Promise.all</code> to await all the agent invocations, we were using the divide and conquer strategy to delegate work to other agents running in parallel, to speed up execution. 
We need all the results to move forward to the next step (or completion) of our main agent.</p> <p>It's also possible to do a <strong>race</strong> instead - we wait for the first child agent to complete, and use its result, ignoring the others. This is problematic in Golem 1.4 though, because <code>Promise.race</code> does not cancel the losing promises, and Golem's current JS runtime <strong>ensures that all promises complete</strong> before an invocation stops. So even though we would get the winning promise's result as soon as possible, our method would only return once every other sub-agent has finished as well. This is a temporary limitation and future Golem TypeScript SDK versions will provide a way to pass an <a href="https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal"><code>AbortSignal</code></a> to the invocations.</p> <p>Until then, let's see what racing looks like in a Rust agent!</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">agent_definition</span><span>] </span><span style="color:#a626a4;">pub trait </span><span>RustRaceSubAgent { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">new</span><span>() -&gt; </span><span style="color:#a626a4;">Self</span><span>; </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">run</span><span>(</span><span style="color:#a626a4;">&amp;mut </span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">millis</span><span>: </span><span style="color:#a626a4;">u64</span><span>) -&gt; String; </span><span>} </span><span> </span><span style="color:#a626a4;">struct </span><span>RustRaceSubAgentImpl {} </span><span> </span><span>#[</span><span style="color:#e45649;">agent_implementation</span><span>] </span><span style="color:#a626a4;">impl </span><span>RustRaceSubAgent </span><span 
style="color:#a626a4;">for </span><span>RustRaceSubAgentImpl { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">new</span><span>() -&gt; </span><span style="color:#a626a4;">Self </span><span>{ </span><span> </span><span style="color:#a626a4;">Self </span><span>{} </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">run</span><span>(</span><span style="color:#a626a4;">&amp;mut </span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">millis</span><span>: </span><span style="color:#a626a4;">u64</span><span>) -&gt; String { </span><span> std::thread::sleep(std::time::Duration::from_millis(millis)); </span><span> format!(</span><span style="color:#50a14f;">&quot;slept </span><span style="color:#c18401;">{}</span><span style="color:#50a14f;"> millis&quot;</span><span>, millis) </span><span> } </span><span>} </span><span> </span><span>#[</span><span style="color:#e45649;">agent_definition</span><span>] </span><span style="color:#a626a4;">pub trait </span><span>RustRaceAgent { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">new</span><span>(</span><span style="color:#e45649;">name</span><span>: String) -&gt; </span><span style="color:#a626a4;">Self</span><span>; </span><span> async </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">run</span><span>(</span><span style="color:#a626a4;">&amp;mut </span><span style="color:#e45649;">self</span><span>); </span><span>} </span><span> </span><span style="color:#a626a4;">struct </span><span>RustRaceAgentImpl { </span><span> </span><span style="color:#e45649;">_name</span><span>: String, </span><span>} </span><span> </span><span>#[</span><span style="color:#e45649;">agent_implementation</span><span>] </span><span style="color:#a626a4;">impl </span><span>RustRaceAgent </span><span style="color:#a626a4;">for 
</span><span>RustRaceAgentImpl { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">new</span><span>(</span><span style="color:#e45649;">name</span><span>: String) -&gt; </span><span style="color:#a626a4;">Self </span><span>{ </span><span> </span><span style="color:#a626a4;">Self </span><span>{ </span><span> _name: name, </span><span> } </span><span> } </span><span> </span><span> async </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">run</span><span>(</span><span style="color:#a626a4;">&amp;mut </span><span style="color:#e45649;">self</span><span>) { </span><span> </span><span style="color:#a626a4;">let mut</span><span> a </span><span style="color:#a626a4;">= </span><span>RustRaceSubAgentClient::new_phantom(); </span><span> </span><span style="color:#a626a4;">let mut</span><span> b </span><span style="color:#a626a4;">= </span><span>RustRaceSubAgentClient::new_phantom(); </span><span> </span><span style="color:#a626a4;">let mut</span><span> c </span><span style="color:#a626a4;">= </span><span>RustRaceSubAgentClient::new_phantom(); </span><span> </span><span> </span><span style="color:#a626a4;">let</span><span> f1 </span><span style="color:#a626a4;">=</span><span> a.</span><span style="color:#0184bc;">run</span><span>(</span><span style="color:#c18401;">1000</span><span>); </span><span> </span><span style="color:#a626a4;">let</span><span> f2 </span><span style="color:#a626a4;">=</span><span> b.</span><span style="color:#0184bc;">run</span><span>(</span><span style="color:#c18401;">2000</span><span>); </span><span> </span><span style="color:#a626a4;">let</span><span> f3 </span><span style="color:#a626a4;">=</span><span> c.</span><span style="color:#0184bc;">run</span><span>(</span><span style="color:#c18401;">10000</span><span>); </span><span> </span><span> </span><span style="color:#a626a4;">let</span><span> result </span><span style="color:#a626a4;">= </span><span>(f1, f2, f3).</span><span 
style="color:#0184bc;">race</span><span>().await; </span><span> println!(</span><span style="color:#50a14f;">&quot;</span><span style="color:#c18401;">{result}</span><span style="color:#50a14f;">&quot;</span><span>); </span><span> } </span><span>} </span></code></pre> <p>This works as expected - the main agent finishes in 1 second.</p> <p>What can we do in TypeScript until invocations become abortable? There is a workaround - we can use <strong>Golem Promises</strong> to signal completion, instead of awaiting the asynchronous method invocations themselves.</p> <h4 id="golem-promises">Golem Promises</h4> <p>A <strong>Golem Promise</strong> is a cluster-level entity that can be <strong>completed</strong> either from an agent, or even from the outside. Completing it from the outside is our basic building block for introducing <strong>human-in-the-loop</strong> for agentic workflows.</p> <p>Let's see how we can use promises as a workaround for the race issue in TypeScript!</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#a626a4;">import </span><span>{</span><span style="color:#e45649;">awaitPromise</span><span>, </span><span style="color:#e45649;">completePromise</span><span>, </span><span style="color:#e45649;">createPromise</span><span>, </span><span style="color:#e45649;">PromiseId</span><span>} </span><span style="color:#a626a4;">from </span><span style="color:#50a14f;">&#39;@golemcloud/golem-ts-sdk&#39;</span><span>; </span><span> </span><span>@</span><span style="color:#0184bc;">agent</span><span>() </span><span style="color:#a626a4;">class </span><span style="color:#c18401;">RaceAgentWithPromise </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">BaseAgent { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private readonly </span><span style="color:#e45649;">id</span><span 
style="color:#a626a4;">: </span><span style="color:#c18401;">string; </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">constructor</span><span style="color:#c18401;">(</span><span style="color:#e45649;">id</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string) { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">super</span><span style="color:#c18401;">() </span><span style="color:#c18401;"> </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.id </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">id</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">async </span><span style="color:#0184bc;">run</span><span style="color:#c18401;">()</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Promise&lt;void&gt; { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">promise </span><span style="color:#a626a4;">= </span><span style="color:#0184bc;">createPromise</span><span style="color:#c18401;">(); </span><span style="color:#c18401;"> </span><span style="color:#e45649;">RaceSubstepWithPromise</span><span style="color:#c18401;">.</span><span style="color:#0184bc;">newPhantom</span><span style="color:#c18401;">(</span><span style="color:#e45649;">promise</span><span style="color:#c18401;">).</span><span style="color:#e45649;">run</span><span style="color:#c18401;">.</span><span style="color:#0184bc;">trigger</span><span style="color:#c18401;">(1000); </span><span style="color:#c18401;"> </span><span style="color:#e45649;">RaceSubstepWithPromise</span><span style="color:#c18401;">.</span><span style="color:#0184bc;">newPhantom</span><span style="color:#c18401;">(</span><span 
style="color:#e45649;">promise</span><span style="color:#c18401;">).</span><span style="color:#e45649;">run</span><span style="color:#c18401;">.</span><span style="color:#0184bc;">trigger</span><span style="color:#c18401;">(2000); </span><span style="color:#c18401;"> </span><span style="color:#e45649;">RaceSubstepWithPromise</span><span style="color:#c18401;">.</span><span style="color:#0184bc;">newPhantom</span><span style="color:#c18401;">(</span><span style="color:#e45649;">promise</span><span style="color:#c18401;">).</span><span style="color:#e45649;">run</span><span style="color:#c18401;">.</span><span style="color:#0184bc;">trigger</span><span style="color:#c18401;">(10000); </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">result </span><span style="color:#a626a4;">= await </span><span style="color:#0184bc;">awaitPromise</span><span style="color:#c18401;">(</span><span style="color:#e45649;">promise</span><span style="color:#c18401;">); </span><span style="color:#c18401;"> console.</span><span style="color:#0184bc;">log</span><span style="color:#c18401;">(</span><span style="color:#a626a4;">new </span><span style="color:#c18401;">TextDecoder().</span><span style="color:#0184bc;">decode</span><span style="color:#c18401;">(</span><span style="color:#e45649;">result</span><span style="color:#c18401;">)); </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span><span> </span><span> </span><span>@</span><span style="color:#0184bc;">agent</span><span>() </span><span style="color:#a626a4;">class </span><span style="color:#c18401;">RaceSubstepWithPromise </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">BaseAgent { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private readonly </span><span style="color:#e45649;">promiseId</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">PromiseId; </span><span 
style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">constructor</span><span style="color:#c18401;">(</span><span style="color:#e45649;">promiseId</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">PromiseId) { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">super</span><span style="color:#c18401;">() </span><span style="color:#c18401;"> </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">promiseId </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">promiseId</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">async </span><span style="color:#0184bc;">run</span><span style="color:#c18401;">(</span><span style="color:#e45649;">millis</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">number)</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Promise&lt;void&gt; { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">sleep</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Promise&lt;string&gt; </span><span style="color:#a626a4;">= new </span><span style="color:#c18401;">Promise(</span><span style="color:#e45649;">resolve </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;"> </span><span style="color:#0184bc;">setTimeout</span><span style="color:#c18401;">(() </span><span style="color:#a626a4;">=&gt; </span><span style="color:#0184bc;">resolve</span><span style="color:#c18401;">(</span><span style="color:#50a14f;">`Slept ${</span><span style="color:#e45649;">millis</span><span style="color:#50a14f;">}`</span><span style="color:#c18401;">), </span><span style="color:#e45649;">millis</span><span 
style="color:#c18401;">) </span><span style="color:#c18401;"> ); </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">result </span><span style="color:#a626a4;">= await </span><span style="color:#e45649;">sleep</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> </span><span style="color:#0184bc;">completePromise</span><span style="color:#c18401;">(</span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">promiseId</span><span style="color:#c18401;">, </span><span style="color:#a626a4;">new </span><span style="color:#c18401;">TextEncoder().</span><span style="color:#0184bc;">encode</span><span style="color:#c18401;">(</span><span style="color:#e45649;">result</span><span style="color:#c18401;">)); </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre> <p>Instead of returning the result as a return value from our method, we are completing a <strong>Golem promise</strong> with it. This way we don't need to await the invocation itself on the caller side; instead we await the promise - and the first sub-agent that completes it will unblock that await.</p> <h4 id="forking">Forking</h4> <p>Forking is another way to do work in parallel from Golem agents. It is a special way to spawn a new agent - instead of explicitly creating a new agent with an initial state, <code>fork()</code> creates a <strong>copy</strong> of the agent it is called in, and both agents continue running from the fork point. The new copy inherits the state of the original agent, including the values of all its variables. 
The only distinction between the two copies is the <strong>return value</strong> of <code>fork()</code> - it can be used to decide what to do next.</p> <p><strong>Golem promises</strong> are an important part of working with forking, as in this case there is no invoked method to return values from.</p> <p>Forking can be a convenient way to parallelize some work in cases where the context necessary for the child agent is big and would be hard to pass down as parameters.</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#e45649;">async </span><span style="color:#0184bc;">run</span><span>(): </span><span style="color:#c18401;">Promise</span><span style="color:#a626a4;">&lt;void&gt; </span><span>{ </span><span> const </span><span style="color:#e45649;">inputs </span><span style="color:#a626a4;">= </span><span>[ </span><span> </span><span style="color:#50a14f;">&quot;a&quot;</span><span>, </span><span style="color:#50a14f;">&quot;b&quot;</span><span>, </span><span style="color:#50a14f;">&quot;c&quot; </span><span> ]; </span><span> </span><span> const </span><span style="color:#e45649;">promises </span><span style="color:#a626a4;">= </span><span>[]; </span><span> </span><span> </span><span style="color:#0184bc;">for </span><span>(</span><span style="color:#e45649;">const input of inputs</span><span>) { </span><span> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">promise </span><span style="color:#a626a4;">= </span><span style="color:#0184bc;">createPromise</span><span>(); </span><span> </span><span style="color:#e45649;">promises</span><span>.</span><span style="color:#0184bc;">push</span><span>(</span><span style="color:#e45649;">this</span><span>.</span><span style="color:#0184bc;">processInput</span><span>(</span><span style="color:#e45649;">promise</span><span>, </span><span 
style="color:#e45649;">input</span><span>)); </span><span> </span><span style="color:#a626a4;">if </span><span>(</span><span style="color:#e45649;">this</span><span>.</span><span style="color:#e45649;">forked</span><span>) { </span><span> </span><span style="color:#a626a4;">break</span><span>; </span><span> } </span><span> } </span><span> </span><span style="color:#0184bc;">if </span><span>(!this.forked) { </span><span> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">results </span><span style="color:#a626a4;">= await </span><span style="color:#c18401;">Promise</span><span>.</span><span style="color:#0184bc;">all</span><span>(</span><span style="color:#e45649;">promises</span><span>); </span><span> </span><span style="color:#c18401;">console</span><span>.</span><span style="color:#0184bc;">log</span><span>(</span><span style="color:#e45649;">results</span><span>); </span><span> } </span><span>} </span><span> </span><span style="color:#e45649;">async </span><span style="color:#0184bc;">processInput</span><span>(</span><span style="color:#e45649;">promise</span><span>: </span><span style="color:#e45649;">PromiseId</span><span>, </span><span style="color:#e45649;">input</span><span>: </span><span style="color:#e45649;">string</span><span>): </span><span style="color:#c18401;">Promise</span><span style="color:#a626a4;">&lt;</span><span style="color:#e45649;">string</span><span style="color:#a626a4;">&gt; </span><span>{ </span><span> </span><span style="color:#0184bc;">switch </span><span>(</span><span style="color:#e45649;">fork</span><span>().tag) { </span><span> </span><span style="color:#a626a4;">case </span><span style="color:#50a14f;">&quot;original&quot;</span><span>: { </span><span> </span><span style="color:#a0a1a7;">// awaiting the promise in the original copy </span><span> const </span><span style="color:#e45649;">bytes </span><span style="color:#a626a4;">= await </span><span 
style="color:#0184bc;">awaitPromise</span><span>(</span><span style="color:#e45649;">promise</span><span>); </span><span> return new </span><span style="color:#0184bc;">TextDecoder</span><span>().</span><span style="color:#0184bc;">decode</span><span>(</span><span style="color:#e45649;">bytes</span><span>); </span><span> } </span><span> </span><span style="color:#a626a4;">case </span><span style="color:#50a14f;">&quot;forked&quot;</span><span>: { </span><span> </span><span style="color:#a0a1a7;">// do the actual work in the forked copy </span><span> const </span><span style="color:#e45649;">processed </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">input </span><span style="color:#a626a4;">+ </span><span style="color:#50a14f;">&quot;!&quot;</span><span>; </span><span> </span><span style="color:#0184bc;">completePromise</span><span>(</span><span style="color:#e45649;">promise</span><span>, </span><span style="color:#e45649;">new TextEncoder</span><span>().</span><span style="color:#0184bc;">encode</span><span>(</span><span style="color:#e45649;">processed</span><span>)); </span><span> this.</span><span style="color:#e45649;">forked </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">true</span><span>; </span><span> return processed; </span><span> } </span><span> } </span><span>} </span></code></pre> <p>Note that we need to remember that we got into a forked copy to avoid forking again from the fork - as all copies are (almost) identical. This makes things easy as the forked copy has the same state as the original at the fork point, but also makes things hard by having to remember to not get back to the same code path in both instances.</p> <h2 id="agents-as-domain-entities">Agents as domain entities</h2> <p>So far we represented workflows and their subtasks as agents. 
We can also <strong>model our domain using agents</strong>!</p> <p><strong>Domain-driven design</strong> as an approach has been with us for more than 20 years, and many books and articles discuss how to model our application based on the domain it's applied to. The entities identified can be directly mapped to Golem agents, and the interaction between these entities can be implemented with our <strong>agent-to-agent</strong> communication.</p> <p>It is important that agents are <strong>durable</strong> by default, which can significantly simplify the implementation of these entities, so we end up with something that closely maps to our domain model.</p> <p>To demonstrate this, consider a simple e-commerce example where we identify three entities: <strong>Customer</strong>, <strong>Order</strong> and <strong>OrderItem</strong>. Customers and orders are top-level entities directly accessible by their unique identifiers: the customer's e-mail address and the order ID. Each order is associated with a customer, and can have one or more items.</p> <p>We said that both Customer and Order are top-level entities with unique identifiers. 
This maps directly to Golem agents:</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#a626a4;">type </span><span>OrderId </span><span style="color:#a626a4;">= </span><span>string; </span><span> </span><span>@</span><span style="color:#0184bc;">agent</span><span>() </span><span style="color:#a626a4;">class </span><span style="color:#c18401;">Customer </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">BaseAgent { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private readonly </span><span style="color:#e45649;">email</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string; </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">constructor</span><span style="color:#c18401;">(</span><span style="color:#e45649;">email</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string) { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">super</span><span style="color:#c18401;">(); </span><span style="color:#c18401;"> </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">email </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">email</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... 
</span><span style="color:#c18401;">} </span><span> </span><span>@</span><span style="color:#0184bc;">agent</span><span>() </span><span style="color:#a626a4;">class </span><span style="color:#c18401;">Order </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">BaseAgent { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private readonly </span><span style="color:#e45649;">orderId</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">OrderId; </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">constructor</span><span style="color:#c18401;">(</span><span style="color:#e45649;">orderId</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">OrderId) { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">super</span><span style="color:#c18401;">(); </span><span style="color:#c18401;"> </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">orderId </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">orderId</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;">} </span></code></pre> <p>We can refer to a customer directly by its domain-specific unique identifier, for example in the Golem REPL:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>&gt;&gt;&gt; let vigoo = customer(&quot;[email protected]&quot;) </span></code></pre> <p>The third entity, an item associated with an Order, is important from a data modelling perspective but probably not something we would map to an individual Golem agent. 
It is small enough that we can just model it as a record type and associate the list of items directly with our Order by just adding a new field to it:</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#a626a4;">type </span><span>OrderItem </span><span style="color:#a626a4;">= </span><span>{ </span><span> </span><span style="color:#e45649;">productId</span><span style="color:#a626a4;">: </span><span>string, </span><span> </span><span style="color:#e45649;">quantity</span><span style="color:#a626a4;">: </span><span>number, </span><span> </span><span style="color:#e45649;">price</span><span style="color:#a626a4;">: </span><span>number </span><span>} </span><span> </span><span>@agent() </span><span style="color:#a626a4;">class </span><span style="color:#c18401;">Order </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">BaseAgent { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private readonly </span><span style="color:#e45649;">orderId</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">OrderId; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private readonly </span><span style="color:#e45649;">items</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">OrderItem[]; </span><span style="color:#a0a1a7;">// ... 
</span></code></pre> <p>Operations on these entities are going to be <strong>agent methods</strong>:</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#0184bc;">add</span><span>(</span><span style="color:#e45649;">item</span><span>: </span><span style="color:#e45649;">OrderItem</span><span>) { </span><span> </span><span style="color:#e45649;">this</span><span>.</span><span style="color:#e45649;">items</span><span>.</span><span style="color:#0184bc;">push</span><span>(</span><span style="color:#e45649;">item</span><span>) </span><span>} </span><span> </span><span style="color:#0184bc;">getItems</span><span>(): </span><span style="color:#e45649;">OrderItem</span><span>[] { </span><span> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">this</span><span>.</span><span style="color:#e45649;">items</span><span>; </span><span>} </span></code></pre> <p>The agents can simply call each other when needed using <strong>agent-to-agent calls</strong>. 
For example, assuming we also added a way to attach the customer's email address to an order, we could have a method in <code>Customer</code> like the following:</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#e45649;">async </span><span style="color:#0184bc;">newOrder</span><span>(): </span><span style="color:#c18401;">Promise</span><span style="color:#a626a4;">&lt;</span><span style="color:#e45649;">OrderId</span><span style="color:#a626a4;">&gt; </span><span>{ </span><span> const </span><span style="color:#e45649;">newId </span><span style="color:#a626a4;">= </span><span style="color:#0184bc;">uuidv4</span><span>(); </span><span> const </span><span style="color:#e45649;">order </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">Order</span><span>.</span><span style="color:#0184bc;">get</span><span>(</span><span style="color:#e45649;">newId</span><span>); </span><span> await order.setCustomer(this.email); </span><span> return newId; </span><span>} </span></code></pre> <p>The facts that agents are <strong>durable</strong> and that calls between agents are guaranteed to be <strong>exactly-once</strong> make these trivial implementations very powerful.</p> <h2 id="patterns">Patterns</h2> <p>We can identify some useful patterns that can help a lot in developing agent-based applications on Golem. As people write more and more applications on this platform, we are going to identify more of these patterns.</p> <h3 id="cluster-level-singletons">Cluster level singletons</h3> <p>The first pattern we discuss is <strong>cluster level singletons</strong>. A cluster level singleton agent has exactly one instance in the whole application. 
In Golem an agent's identity is given by its constructor parameters - so if our agent does not have any constructor parameters, it can only have a single instance, and any other agent or external call referring to it will refer to the same instance.</p> <p>As an example, let's assume that our <code>OrderId</code> type from the previous section is not a UUID but needs to be a unique, sequential number. It's quite easy to achieve this of course if we have some kind of database in our application. But in Golem we don't need a third-party database for this - we can just create a singleton agent responsible for generating new unique order IDs:</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#a626a4;">type </span><span>OrderId </span><span style="color:#a626a4;">= </span><span>number; </span><span> </span><span>@</span><span style="color:#0184bc;">agent</span><span>() </span><span style="color:#a626a4;">class </span><span style="color:#c18401;">OrderIds </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">BaseAgent { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private </span><span style="color:#e45649;">next</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">OrderId </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">0; </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">constructor</span><span style="color:#c18401;">() { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">super</span><span style="color:#c18401;">(); </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#0184bc;">nextId</span><span style="color:#c18401;">()</span><span style="color:#a626a4;">: </span><span 
style="color:#c18401;">OrderId { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.next</span><span style="color:#a626a4;">++</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span><span> </span><span>@</span><span style="color:#0184bc;">agent</span><span>() </span><span style="color:#a626a4;">class </span><span style="color:#c18401;">Customer </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">BaseAgent { </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">async </span><span style="color:#0184bc;">newOrder</span><span style="color:#c18401;">()</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Promise&lt;OrderId&gt; { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">newId </span><span style="color:#a626a4;">= await </span><span style="color:#e45649;">OrderIds</span><span style="color:#c18401;">.</span><span style="color:#0184bc;">get</span><span style="color:#c18401;">().</span><span style="color:#0184bc;">nextId</span><span style="color:#c18401;">(); </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... </span></code></pre> <h3 id="shard-singletons">Shard singletons</h3> <p>Having a single shared global state using <strong>cluster singletons</strong> can be very useful, however, it can very soon become a bottleneck. One possible way to reduce this bottleneck is to not have just a single instance in the whole cluster, but to somehow define a set of distinct <strong>shards</strong>, and have a single instance <strong>per shard</strong>. 
Whether this technique can be applied depends very much on the actual domain, but in general the idea is that if every agent can calculate the <strong>shard</strong> it belongs to, and there is no need to share information among the shards, each agent can simply access the shard instance it needs.</p> <p>To demonstrate this, let's assume we want to maintain a <strong>list of all customers</strong> with a count of how many orders they have. As this map will have to be updated every time an order is created, we don't want to maintain it in a cluster level singleton. Instead, we assign the customers to N (=100) shards, and have a separate <code>Customer-&gt;OrderCount</code> map in each shard:</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#a626a4;">type </span><span>Shard </span><span style="color:#a626a4;">= </span><span>number; </span><span> </span><span>@</span><span style="color:#0184bc;">agent</span><span>() </span><span style="color:#a626a4;">class </span><span style="color:#c18401;">ShardedCustomers </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">BaseAgent { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private readonly </span><span style="color:#e45649;">shard</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Shard; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private readonly </span><span style="color:#e45649;">customerWithOrderCount</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Map&lt;string, number&gt; </span><span style="color:#a626a4;">= new </span><span style="color:#c18401;">Map(); </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">constructor</span><span style="color:#c18401;">(</span><span 
style="color:#e45649;">shard</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Shard) { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">super</span><span style="color:#c18401;">(); </span><span style="color:#c18401;"> </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">shard </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">shard</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#0184bc;">registerCustomer</span><span style="color:#c18401;">(</span><span style="color:#e45649;">customer</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string) { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">customerWithOrderCount</span><span style="color:#c18401;">.</span><span style="color:#0184bc;">set</span><span style="color:#c18401;">(</span><span style="color:#e45649;">customer</span><span style="color:#c18401;">, 0); </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#0184bc;">registerOrder</span><span style="color:#c18401;">(</span><span style="color:#e45649;">customer</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string, </span><span style="color:#e45649;">orderId</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">OrderId) { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">count </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">customerWithOrderCount</span><span style="color:#c18401;">.</span><span 
style="color:#0184bc;">get</span><span style="color:#c18401;">(</span><span style="color:#e45649;">customer</span><span style="color:#c18401;">) </span><span style="color:#a626a4;">?? </span><span style="color:#c18401;">0; </span><span style="color:#c18401;"> </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.</span><span style="color:#e45649;">customerWithOrderCount</span><span style="color:#c18401;">.</span><span style="color:#0184bc;">set</span><span style="color:#c18401;">(</span><span style="color:#e45649;">customer</span><span style="color:#c18401;">, </span><span style="color:#e45649;">count </span><span style="color:#a626a4;">+ </span><span style="color:#c18401;">1); </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre> <p>We can then write a function that determines the <code>Shard</code> from a customer's identifier (an email address string):</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#a626a4;">import </span><span style="color:#e45649;">md5 </span><span style="color:#a626a4;">from </span><span style="color:#50a14f;">&quot;md5-ts&quot;</span><span>; </span><span> </span><span style="color:#a626a4;">function </span><span style="color:#0184bc;">shardOfCustomer</span><span>(</span><span style="color:#e45649;">email</span><span style="color:#a626a4;">: </span><span>string)</span><span style="color:#a626a4;">: </span><span>Shard { </span><span> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">SHARDS </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">100</span><span>; </span><span> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">hash </span><span style="color:#a626a4;">= </span><span style="color:#0184bc;">md5</span><span>(</span><span 
style="color:#e45649;">email</span><span>); </span><span> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">hashPrefixAsNumber </span><span style="color:#a626a4;">= </span><span style="color:#0184bc;">parseInt</span><span>(</span><span style="color:#e45649;">hash</span><span>.</span><span style="color:#0184bc;">substring</span><span>(</span><span style="color:#c18401;">0</span><span>, </span><span style="color:#c18401;">8</span><span>), </span><span style="color:#c18401;">16</span><span>); </span><span> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">hashPrefixAsNumber </span><span style="color:#a626a4;">% </span><span style="color:#e45649;">SHARDS</span><span>; </span><span>} </span></code></pre> <p>Using this function we can call <code>registerCustomer</code> and <code>registerOrder</code> in an asynchronous way using <code>trigger</code> on the remote agent interface:</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#a0a1a7;">// in class Order: </span><span style="color:#0184bc;">setCustomer</span><span>(</span><span style="color:#e45649;">customer</span><span>: </span><span style="color:#e45649;">string</span><span>) { </span><span> </span><span style="color:#e45649;">this</span><span>.</span><span style="color:#e45649;">customer </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">customer</span><span>; </span><span> </span><span style="color:#e45649;">ShardedCustomers </span><span> .</span><span style="color:#0184bc;">get</span><span>(</span><span style="color:#0184bc;">shardOfCustomer</span><span>(</span><span style="color:#e45649;">customer</span><span>)) </span><span> .</span><span style="color:#e45649;">registerOrder </span><span> .</span><span style="color:#0184bc;">trigger</span><span>(</span><span style="color:#e45649;">customer</span><span>, 
</span><span style="color:#e45649;">this</span><span>.</span><span style="color:#e45649;">orderId</span><span>); </span><span>} </span><span> </span><span style="color:#a0a1a7;">// in class Customer: </span><span style="color:#0184bc;">constructor</span><span>(</span><span style="color:#e45649;">email</span><span>: </span><span style="color:#e45649;">string</span><span>) { </span><span> </span><span style="color:#e45649;">super</span><span>(); </span><span> </span><span style="color:#e45649;">this</span><span>.</span><span style="color:#e45649;">email </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">email</span><span>; </span><span> </span><span style="color:#e45649;">ShardedCustomers </span><span> .</span><span style="color:#0184bc;">get</span><span>(</span><span style="color:#0184bc;">shardOfCustomer</span><span>(</span><span style="color:#e45649;">email</span><span>)) </span><span> .</span><span style="color:#e45649;">registerCustomer </span><span> .</span><span style="color:#0184bc;">trigger</span><span>(</span><span style="color:#e45649;">email</span><span>); </span><span>} </span></code></pre> <p>By using <code>trigger</code> we guarantee that even if the shard-level singleton gets overloaded with registration messages, this does not block our Order or Customer agents.</p> <h3 id="agents-as-message-queues">Agents as message queues</h3> <p>It's not really a pattern, but this last statement also points to an important feature of Golem - every agent <strong>has its own persistent message queue</strong>. If every agent invocation is done using <code>trigger</code> (or <code>schedule</code> to trigger an invocation at a future point in time), an agent actually works as a persistent message queue from the sender's point of view.</p> <p>The example above demonstrated this - <code>Order</code> and <code>Customer</code> were just sending <em>registration messages</em> to a message queue, without blocking on anything. 
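</p> <p>To make this queueing behaviour a bit more tangible, here is a toy in-memory model of it in plain TypeScript. It does not use the Golem SDK at all: <code>ToyAgent</code> and its <code>trigger</code>, <code>pendingCount</code> and <code>drain</code> methods are hypothetical names made up for this sketch, and unlike a real agent nothing here is persisted.</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript">// Toy, in-memory model of an agent's invocation queue.
// This is NOT the Golem SDK: all names are hypothetical and nothing is durable.
type Invocation = { method: string; args: string[] };

class ToyAgent {
  private readonly pending: Invocation[] = [];
  readonly processed: string[] = [];

  // Fire-and-forget: only enqueue the invocation and return immediately.
  trigger(method: string, ...args: string[]): void {
    this.pending.push({ method, args });
  }

  // Number of invocations still waiting in the queue.
  pendingCount(): number {
    return this.pending.length;
  }

  // The consumer side: handle the queued invocations one by one, in order.
  drain(): void {
    let next = this.pending.shift();
    while (next !== undefined) {
      this.processed.push(next.method + "(" + next.args.join(", ") + ")");
      next = this.pending.shift();
    }
  }
}

const shard = new ToyAgent();
shard.trigger("registerCustomer", "alice@example.com");
shard.trigger("registerOrder", "alice@example.com", "order-1");
// The senders have already moved on here; nothing has been processed yet.
shard.drain();
</code></pre> <p>In this sketch the two <code>trigger</code> calls play the role of <code>Customer</code> and <code>Order</code> sending their registration messages, and <code>drain</code> stands in for the target agent working through its queue sequentially.</p> <p>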
The consumer of the message queue was the <code>ShardedCustomers</code> agent itself.</p> <p>These message queues are just as durable as everything else in Golem. Once the <code>trigger</code> call has returned, we can be sure that the invocation is in the target agent's pending invocation queue, and even if the agent gets restarted or relocated to another node, the invocation will still be in its queue.</p> <p>A different example could be a cluster-level singleton for aggregating important domain-level events from various agents:</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#a626a4;">type </span><span>Event </span><span style="color:#a626a4;">= </span><span style="color:#a0a1a7;">// ... </span><span> </span><span>@agent() </span><span style="color:#a626a4;">class </span><span style="color:#c18401;">EventLog </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">BaseAgent { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">constructor</span><span style="color:#c18401;">() { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">super</span><span style="color:#c18401;">(); </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#0184bc;">log</span><span style="color:#c18401;">(</span><span style="color:#e45649;">event</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Event) { </span><span style="color:#c18401;"> console.</span><span style="color:#0184bc;">log</span><span style="color:#c18401;">(event); </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span><span> </span><span style="color:#a0a1a7;">/// Log an event from anywhere </span><span style="color:#a626a4;">function </span><span 
style="color:#0184bc;">log</span><span>(</span><span style="color:#e45649;">event</span><span style="color:#a626a4;">: </span><span>Event) { </span><span> </span><span style="color:#e45649;">EventLog</span><span>.</span><span style="color:#0184bc;">get</span><span>().</span><span style="color:#e45649;">log</span><span>.</span><span style="color:#0184bc;">trigger</span><span>(event); </span><span>} </span></code></pre> <p>Our <code>log</code> processor could do anything - just log the event in the singleton agent's log stream, or store it in memory, extract and aggregate some information, etc. Even just logging it can help with debugging distributed systems, as it is a developer-defined aggregated view of the logs important for the application's logic itself, unlike the lower level server logs of the Golem platform itself.</p> <h3 id="ephemeral-agents">Ephemeral agents</h3> <p>Finally, let's talk about <strong>ephemeral agents</strong>, which are so important that Golem has built-in support for them and makes using them very convenient. Ephemeral agents are agents - they can have constructor parameters and exported methods, but they are <strong>not durable</strong>. For each invocation a fresh instance is created, and although Golem allows you to observe past ephemeral agent instances for debugging purposes, you cannot invoke them again.</p> <p>Calling an ephemeral agent's method is Golem's equivalent of <strong>serverless functions</strong>. As each invocation gets a fresh instance, we don't have to worry about the message queueing property of agents - all the invocations are going to run in parallel. This is a very good fit for writing <strong>request handlers</strong> at the edge of a Golem application.</p> <p>As an example we are going to write an ephemeral agent with a single function that enumerates <strong>all customers</strong> in the system. 
In a real application where we have so many customers that we are managing the list of them in sharded singletons, we would not want a single function returning all of them at once, of course; but for simplicity, let's implement it anyway.</p> <p>First we add a new agent method to <code>ShardedCustomers</code> to get the list of customers in that particular shard:</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#0184bc;">getCustomers</span><span>(): </span><span style="color:#e45649;">string</span><span>[] { </span><span> </span><span style="color:#a626a4;">return </span><span style="color:#c18401;">Array</span><span>.</span><span style="color:#0184bc;">from</span><span>(</span><span style="color:#e45649;">this</span><span>.</span><span style="color:#e45649;">customerWithOrderCount</span><span>.</span><span style="color:#0184bc;">keys</span><span>()); </span><span>} </span></code></pre> <p>And then we create an <strong>ephemeral</strong> <code>RequestHandler</code> agent:</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span>@</span><span style="color:#0184bc;">agent</span><span>({mode: </span><span style="color:#50a14f;">&quot;ephemeral&quot;</span><span>}) </span><span style="color:#a626a4;">class </span><span style="color:#c18401;">RequestHandler </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">BaseAgent { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">constructor</span><span style="color:#c18401;">() { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">super</span><span style="color:#c18401;">(); </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span 
style="color:#a626a4;">async </span><span style="color:#0184bc;">getAllCustomers</span><span style="color:#c18401;">()</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Promise&lt;string[]&gt; { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">SHARDS </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">100; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">allCustomers</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string[] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">[]; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">for </span><span style="color:#c18401;">(</span><span style="color:#a626a4;">let </span><span style="color:#e45649;">shard </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">0; </span><span style="color:#e45649;">shard </span><span style="color:#a626a4;">&lt; </span><span style="color:#e45649;">SHARDS</span><span style="color:#c18401;">; </span><span style="color:#e45649;">shard</span><span style="color:#a626a4;">++</span><span style="color:#c18401;">) { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">customers </span><span style="color:#a626a4;">= await </span><span style="color:#e45649;">ShardedCustomers</span><span style="color:#c18401;">.</span><span style="color:#0184bc;">get</span><span style="color:#c18401;">(</span><span style="color:#e45649;">shard</span><span style="color:#c18401;">).</span><span style="color:#0184bc;">getCustomers</span><span style="color:#c18401;">(); </span><span style="color:#c18401;"> </span><span style="color:#e45649;">allCustomers</span><span style="color:#c18401;">.</span><span style="color:#0184bc;">push</span><span style="color:#c18401;">(</span><span 
style="color:#a626a4;">...</span><span style="color:#e45649;">customers</span><span style="color:#c18401;">); </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">allCustomers</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre> <p>This is a long-running call that makes 100 remote agent calls in sequence. But it is done every time on a fresh ephemeral agent instance without affecting any potentially already running request handlers.</p> <h2 id="conclusion">Conclusion</h2> <p>As I demonstrated, the basic building blocks of Golem - <strong>agents</strong> - can be used in many different ways to write applications that can run in a safe way, surviving external failure conditions, automatically providing observability and many other features with almost zero boilerplate. Some of these benefits apply even if the application is not really designed with a distributed net of agents in mind, but to use the full potential of Golem we should think about how our application can be modelled like that.</p> <p>It is nothing fundamentally new - there are many publications discussing domain driven design, distributed actor systems, reactive messaging patterns and so on. These existing materials can be useful inspiration, but keep in mind that in Golem each agent is durable (persistent) by default - this provides a lot of guarantees out of the box, which would have to be solved by explicit architecture in other systems.</p> Rust agents in Golem 1.4 2025-12-18T00:00:00+00:00 2025-12-18T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/rust-agents-golem14/ <p>The <a href="https://blog.vigoo.dev/posts/golem-code-first-agents/">previous version of Golem, 1.3</a> made a big leap from earlier versions by introducing <strong>code-first agents</strong>. 
We could now write stateful, persistent entities called agents by simply defining TypeScript classes with some annotations, while previously it required learning the WebAssembly interface definition language and using that to first define our application's public interface, then figure out how to implement that.</p> <p>But in Golem 1.3 we only supported this for TypeScript - we dropped support for all other languages, as we wanted to only support this new experience; with Golem 1.4, <a href="https://x.com/GolemCloud/status/2000696015142228039">launched on 22nd of December</a> we have a similar new code-first developer experience for <strong>Rust</strong>.</p> <p>In this post I'm showing how to write a small Golem application with the new Rust SDK.</p> <p>Note that even though this post is about Rust agents, everything we are going to see is possible with TypeScript agents as well, in a very similar way (with even slightly less boilerplate).</p> <h2 id="the-example">The example</h2> <p>The application we are going to develop, although not being extremely useful by itself, is small enough to fit in this post, but still contains many interesting details of how writing Rust agents in Golem 1.4 feels.</p> <p>Our application is going to be a graph database of <em>libraries</em> for various <em>programming languages</em>, organized into <em>topics</em>. By selecting a topic (let's say <code>json</code>) we will be able to discover libraries (Rust and JavaScript libraries) that are associated with that topic - then all hits are going to be analyzed and stored. Each analyzed library is going to also add new topics, and these topics can be used to search for more libraries, and so on.</p> <p>For searching libraries for a given topic, we are going to use the Google programmable search API to look for repositories on GitHub. 
To analyze the hits, we are going to ask OpenAI to give a short description and set of topics based on the project's README.</p> <p>The number of libraries and topics will be arbitrarily scalable, as well as the number of parallel topic discoveries (but limited by the 3rd party provider's limitations, of course).</p> <h2 id="implementation">Implementation</h2> <p>In general it is a good idea to design the whole architecture of an application like this in advance, and start working on it with a good understanding of what to do. I'm not going to show the overall design at this point, because introducing the parts one by one will make it easier to explain the concepts and decisions.</p> <h3 id="starting-the-project">Starting the project</h3> <p>The only prerequisites to implement this example are:</p> <ul> <li>Golem 1.4 (the <code>golem</code> CLI application)</li> <li>Rust toolchain with the <code>wasm32-wasip1</code> target installed</li> <li><code>cargo-component 0.21.1</code></li> </ul> <p>See the Rust tab on <a href="https://learn.golem.cloud/develop/setup">the official setup page</a> to learn exactly how to set these up.</p> <p>Once we have them, we can create a new application using <code>golem</code>:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>$ golem new </span><span>&gt; Application name: lib-db </span><span>&gt; Select language for the new component Rust </span><span>&gt; Select a template for the new component: default: A simple agent implementing a counter </span><span>&gt; Component Package Name: libdb:backend </span><span>&gt; Add another component or create application? Create application </span><span>Created application directory: lib-db </span><span>Adding component libdb:backend </span><span>Added new app component libdb:backend </span><span>Created application lib-db </span></code></pre> <p>The application is called <code>lib-db</code>, and it consists of a single component, <code>libdb:backend</code>. 
Components are an organizational unit in Golem - an application can have multiple components, and each component can have any number of <strong>agents</strong>. In this example we don't need to use multiple components. A possible reason to use more than one would be to have different update/deploy strategies for different subsets of our agents.</p> <p>The <code>default</code> template we've chosen consists of a simple agent called <code>CounterAgent</code>, implementing a stateful counter identified by a name, with a single <code>increment</code> method.</p> <p>We can delete that, and start implementing our own agents. All agents are going to be defined in <code>components-rust/libdb-backend/src</code> - the module structure can be anything, I prefer putting each agent in its own submodule, and the shared data types (if not many) in the root module.</p> <h3 id="the-library-agent">The library agent</h3> <p>What is an agent in Golem? A stateful, durable entity, identified by its constructor parameters, exposing methods. Agents can run in parallel, but each agent itself executes its invoked methods sequentially. They also scale horizontally as each agent is put on one executor of the whole Golem cluster, based on some internal sharding logic.</p> <p>One good candidate for an agent in our example application is a <strong>library</strong>. A library is an entity identified by its name and programming language, and it holds state - was it already analyzed? 
If it was, it has some data - a description and a set of topics.</p> <p>As the agent state is not publicly visible, we also need to expose a method to query it.</p> <p>Let's see what this looks like in Rust!</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">use </span><span>golem_rust::{agent_definition, agent_implementation}; </span><span style="color:#a626a4;">use </span><span>http::Uri; </span><span style="color:#a626a4;">use </span><span>std::collections::HashSet; </span><span> </span><span>#[</span><span style="color:#e45649;">agent_definition</span><span>] </span><span style="color:#a626a4;">pub trait </span><span>Library { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">new</span><span>(</span><span style="color:#e45649;">reference</span><span>: LibraryReference) -&gt; </span><span style="color:#a626a4;">Self</span><span>; </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">get_details</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>) -&gt; Result&lt;LibraryDetails, String&gt;; </span><span>} </span><span> </span><span style="color:#a626a4;">struct </span><span>LibraryImpl { </span><span> </span><span style="color:#e45649;">reference</span><span>: LibraryReference, </span><span> </span><span style="color:#e45649;">state</span><span>: LibraryState </span><span>} </span><span> </span><span style="color:#a626a4;">enum </span><span>LibraryState { </span><span> Unknown, </span><span> Analysed { </span><span> repository: Uri, </span><span> description: String, </span><span> topics: HashSet&lt;String&gt;, </span><span> }, </span><span> Failed { </span><span> message: String, </span><span> }, </span><span>} </span><span> </span><span>#[</span><span style="color:#e45649;">agent_implementation</span><span>] </span><span 
style="color:#a626a4;">impl </span><span>Library </span><span style="color:#a626a4;">for </span><span>LibraryImpl { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">new</span><span>(</span><span style="color:#e45649;">reference</span><span>: LibraryReference) -&gt; </span><span style="color:#a626a4;">Self </span><span>{ </span><span> </span><span style="color:#a626a4;">Self </span><span>{ </span><span> reference, </span><span> state: LibraryState::Unknown </span><span> } </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">get_details</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>) -&gt; Result&lt;LibraryDetails, String&gt; { </span><span> </span><span style="color:#a626a4;">match &amp;</span><span style="color:#e45649;">self</span><span>.state { </span><span> LibraryState::Failed { message } </span><span style="color:#a626a4;">=&gt; </span><span>Err(message.</span><span style="color:#0184bc;">clone</span><span>()), </span><span> LibraryState::Analysed { </span><span> description, </span><span> topics, </span><span> repository, </span><span> } </span><span style="color:#a626a4;">=&gt; </span><span>Ok(LibraryDetails { </span><span> description: description.</span><span style="color:#0184bc;">clone</span><span>(), </span><span> name: </span><span style="color:#e45649;">self</span><span>.reference.name.</span><span style="color:#0184bc;">clone</span><span>(), </span><span> language: </span><span style="color:#e45649;">self</span><span>.reference.language.</span><span style="color:#0184bc;">clone</span><span>(), </span><span> repository: repository.</span><span style="color:#0184bc;">clone</span><span>(), </span><span> topics: topics.</span><span style="color:#0184bc;">iter</span><span>().</span><span style="color:#0184bc;">cloned</span><span>().</span><span style="color:#0184bc;">collect</span><span>(), 
</span><span> }), </span><span> LibraryState::Unknown </span><span style="color:#a626a4;">=&gt; </span><span>Err(</span><span style="color:#50a14f;">&quot;Library not yet analysed&quot;</span><span>.</span><span style="color:#0184bc;">to_string</span><span>()), </span><span> } </span><span> } </span><span>} </span></code></pre> <p>And some common data types used in the above snippet:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">use </span><span>golem_rust::Schema; </span><span style="color:#a626a4;">use </span><span>http::Uri; </span><span style="color:#a626a4;">use </span><span>std::collections::HashSet; </span><span style="color:#a626a4;">use </span><span>std::fmt::{Display, Formatter}; </span><span> </span><span>#[</span><span style="color:#e45649;">derive</span><span>(Debug, Clone, Hash, PartialEq, Eq, Schema)] </span><span style="color:#a626a4;">pub enum </span><span>Language { </span><span> Rust, </span><span> JavaScript, </span><span>} </span><span> </span><span>#[</span><span style="color:#e45649;">derive</span><span>(Debug, Clone, Hash, PartialEq, Eq, Schema)] </span><span style="color:#a626a4;">pub struct </span><span>LibraryReference { </span><span> </span><span style="color:#e45649;">name</span><span>: String, </span><span> </span><span style="color:#e45649;">language</span><span>: Language, </span><span>} </span><span> </span><span style="color:#a626a4;">impl </span><span>Display </span><span style="color:#a626a4;">for </span><span>LibraryReference { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">fmt</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">f</span><span>: </span><span style="color:#a626a4;">&amp;mut </span><span>Formatter&lt;&#39;</span><span style="color:#a626a4;">_</span><span>&gt;) -&gt; std::fmt::Result { </span><span> write!(f, </span><span style="color:#50a14f;">&quot;</span><span style="color:#c18401;">{}</span><span style="color:#50a14f;"> [</span><span style="color:#c18401;">{:?}</span><span 
style="color:#50a14f;">]&quot;</span><span>, </span><span style="color:#e45649;">self</span><span>.name, </span><span style="color:#e45649;">self</span><span>.language) </span><span> } </span><span>} </span><span> </span><span>#[</span><span style="color:#e45649;">derive</span><span>(Debug, Clone, Schema)] </span><span style="color:#a626a4;">pub struct </span><span>LibraryDetails { </span><span> </span><span style="color:#e45649;">name</span><span>: String, </span><span> </span><span style="color:#e45649;">language</span><span>: Language, </span><span> </span><span style="color:#e45649;">repository</span><span>: Uri, </span><span> </span><span style="color:#e45649;">description</span><span>: String, </span><span> </span><span style="color:#e45649;">topics</span><span>: HashSet&lt;String&gt;, </span><span>} </span></code></pre> <p>In these snippets we only have two Golem-specific details:</p> <ul> <li>Data types used anywhere in the agent's interface must derive <code>Schema</code></li> <li>The agent must be implemented as a pair of a <code>trait</code> and an implementation, with two macros: <code>agent_definition</code> and <code>agent_implementation</code> applied</li> </ul> <p>That's all that is required - our application can be built with <code>golem build</code> and deployed with <code>golem deploy</code>, assuming we have a locally started Golem server (<code>golem server run</code>).</p> <p>Let's do that, and try it out with <code>golem repl</code>:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>&gt;&gt;&gt; let testr = library({ name: &quot;test-r&quot;, language: rust}) </span><span>() </span><span>&gt;&gt;&gt; testr.get-details() </span><span>err(&quot;Library not yet analysed&quot;) </span><span>&gt;&gt;&gt; let desertrs = library({ name: &quot;desert-rust&quot;, language: rust}) </span><span>() </span><span>&gt;&gt;&gt; desertrs.get-details() </span><span>err(&quot;Library not yet analysed&quot;) </span><span>&gt;&gt;&gt; let 
golem-ts-sdk = library({ name: &quot;golem-ts-sdk&quot;, language: java-script}) </span><span>() </span><span>&gt;&gt;&gt; golem-ts-sdk.get-details() </span><span>err(&quot;Library not yet analysed&quot;) </span><span>&gt;&gt;&gt; </span></code></pre> <p>Of course every agent is initialized with <code>LibraryState::Unknown</code> so we can't see anything interesting yet. If we get out of the REPL, and check <code>golem agent list</code>, we can see that it indeed created three different agents in our server:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>Selected app: lib-db, env: local, server: local - builtin (http://localhost:9881) </span><span>+----------------+-----------------------------+-----------+--------+-------------+--------------------------+ </span><span>| Component name | Agent name | Component | Status | Pending | Created at | </span><span>| | | revision | | invocations | | </span><span>+----------------+-----------------------------+-----------+--------+-------------+--------------------------+ </span><span>| libdb:backend | library({name:&quot;desert- | 0 | Idle | 0 | 2025-12-18T14:53:14.531Z | </span><span>| | rust&quot;,language:rust}) | | | | | </span><span>+----------------+-----------------------------+-----------+--------+-------------+--------------------------+ </span><span>| libdb:backend | log() | 0 | Idle | 0 | 2025-12-18T14:52:50.858Z | </span><span>+----------------+-----------------------------+-----------+--------+-------------+--------------------------+ </span><span>| libdb:backend | library({name:&quot;test- | 0 | Idle | 0 | 2025-12-18T14:52:50.731Z | </span><span>| | r&quot;,language:rust}) | | | | | </span><span>+----------------+-----------------------------+-----------+--------+-------------+--------------------------+ </span><span>| libdb:backend | library({name:&quot;golem-ts- | 0 | Idle | 0 | 2025-12-18T14:53:38.352Z | </span><span>| | sdk&quot;,language:java-script}) | | | | | 
</span><span>+----------------+-----------------------------+-----------+--------+-------------+--------------------------+ </span></code></pre> <p>One thing to notice - although we defined our data types and method names in the normal convention of Rust - pascal case for the type names, snake case for the method names and fields - when using the REPL and other Golem CLI commands, we have to use a <code>kebab-cased</code> version of everything. This is a limitation of Golem 1.4 that's going to be removed in the next version. For now, because of how it builds on WebAssembly components under the hood, we need to accept this, but at least the REPL provides auto-completion to make it easier to discover these transformed names.</p> <h3 id="the-topic-agent">The topic agent</h3> <p>The second entity in our system is going to be a <strong>topic</strong>. Let's just define a topic as something identified by a (lowercase) string, and has two methods: one to get the known list of libraries implementing this topic, and another to start <strong>discovering</strong> more libraries for this topic.</p> <p>We want to keep this discovery process on-demand, otherwise we would create an exponentially growing system trying to explore the whole GitHub.</p> <p>We can create another submodule and just sketch an initial version of this agent:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">agent_definition</span><span>] </span><span style="color:#a626a4;">pub trait </span><span>Topic { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">new</span><span>(</span><span style="color:#e45649;">name</span><span>: String) -&gt; </span><span style="color:#a626a4;">Self</span><span>; </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">discover_libraries</span><span>(</span><span 
style="color:#a626a4;">&amp;mut </span><span style="color:#e45649;">self</span><span>); </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">get_libraries</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>) -&gt; Result&lt;HashSet&lt;LibraryReference&gt;, Vec&lt;String&gt;&gt;; </span><span>} </span><span> </span><span style="color:#a626a4;">struct </span><span>TopicImpl { </span><span> </span><span style="color:#e45649;">name</span><span>: String, </span><span> </span><span style="color:#e45649;">libraries</span><span>: HashSet&lt;LibraryReference&gt;, </span><span> </span><span style="color:#e45649;">failures</span><span>: Vec&lt;String&gt; </span><span>} </span><span> </span><span>#[</span><span style="color:#e45649;">agent_implementation</span><span>] </span><span style="color:#a626a4;">impl </span><span>Topic </span><span style="color:#a626a4;">for </span><span>TopicImpl { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">new</span><span>(</span><span style="color:#e45649;">name</span><span>: String) -&gt; </span><span style="color:#a626a4;">Self </span><span>{ </span><span> </span><span style="color:#a626a4;">if</span><span> name.</span><span style="color:#0184bc;">to_lowercase</span><span>() </span><span style="color:#a626a4;">!=</span><span> name { </span><span> panic!(</span><span style="color:#50a14f;">&quot;Topic names must be lowercase&quot;</span><span>) </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">Self </span><span>{ </span><span> name, </span><span> libraries: HashSet::new(), </span><span> failures: Vec::new() </span><span> } </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">discover_libraries</span><span>(</span><span style="color:#a626a4;">&amp;mut </span><span style="color:#e45649;">self</span><span>) { 
</span><span> todo!(); </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">get_libraries</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>) -&gt; Result&lt;HashSet&lt;LibraryReference&gt;, Vec&lt;String&gt;&gt; { </span><span> </span><span style="color:#a626a4;">if </span><span style="color:#e45649;">self</span><span>.failures.</span><span style="color:#0184bc;">is_empty</span><span>() { </span><span> Ok(</span><span style="color:#e45649;">self</span><span>.libraries.</span><span style="color:#0184bc;">clone</span><span>()) </span><span> } </span><span style="color:#a626a4;">else </span><span>{ </span><span> Err(</span><span style="color:#e45649;">self</span><span>.failures.</span><span style="color:#0184bc;">clone</span><span>()) </span><span> } </span><span> } </span><span>} </span></code></pre> <p>With our two main entities defined, we can finally switch to implementing the topic discovery and library analysis!</p> <h3 id="topic-discovery">Topic discovery</h3> <p>The <code>discover_libraries</code> method that we left unimplemented so far needs to do some Google search calls to find links to libraries on GitHub, and then for each hit, spawn a <strong>Library agent</strong> that is going to analyze that library and, if the analysis is successful, register it as belonging to our <strong>topic</strong>.</p> <p>At this point we could write some code in <code>discover_libraries</code> implementing this - do requests to Google, process the response, loop on paginated results, etc. But this is a slow process, and as I mentioned earlier, <strong>agents are single-threaded and their invocations are sequential</strong>. 
If we did it this way, the topic agent would be unresponsive during the discovery process; we could not even query its status (for example invoking <code>get_libraries</code> on it would just be enqueued to be executed <em>after</em> the discovery method returned).</p> <p>There are two ways to solve this in Golem - forking and spawning child agents. In this example we are going to define a new agent, <strong>TopicDiscovery</strong>, which is going to be responsible for the long-running process of searching for libraries, while the Topic agents remain responsive.</p> <p>When defining a new agent, we need to think about two things: its identity (constructor parameters) and its methods. In this case we have only one really good choice for the agent identity - we could say that each <strong>topic</strong> can have at most one <strong>topic discovery</strong>. To achieve this, we can make the topic discovery agent also be identified by <strong>the topic name</strong>.</p> <p>If we chose something with smaller cardinality, for example by making the <strong>TopicDiscovery</strong> a singleton with no constructor parameters, then we could not run searches for multiple topics in parallel. If we just assigned a random ID (Golem has a built-in feature for that called <strong>phantom agents</strong>) then we would be able to run multiple searches for the <em>same topic</em> in parallel, which also would not make much sense.</p> <p>So we are going to have a 1-1 mapping, and use the topic name as our discovery agent's identity, and we are going to add a <strong>single run method</strong> to it. 
This is going to be long-running, but it does not matter because there isn't anything else to be called on this agent.</p> <p>Let's first define the "skeleton" of this agent without an actual implementation:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">agent_definition</span><span>] </span><span style="color:#a626a4;">pub trait </span><span>TopicDiscovery { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">new</span><span>(</span><span style="color:#e45649;">name</span><span>: String) -&gt; </span><span style="color:#a626a4;">Self</span><span>; </span><span> async </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">run</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>); </span><span>} </span><span> </span><span style="color:#a626a4;">struct </span><span>TopicDiscoveryImpl { </span><span> </span><span style="color:#e45649;">name</span><span>: String </span><span>} </span><span> </span><span style="color:#a626a4;">impl </span><span>TopicDiscoveryImpl { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">try_run</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>) -&gt; anyhow::Result&lt;Vec&lt;(Language, SearchResult)&gt;&gt; { </span><span> todo!() </span><span style="color:#a0a1a7;">// A </span><span> } </span><span>} </span><span> </span><span>#[</span><span style="color:#e45649;">agent_implementation</span><span>] </span><span style="color:#a626a4;">impl </span><span>TopicDiscovery </span><span style="color:#a626a4;">for </span><span>TopicDiscoveryImpl { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">new</span><span>(</span><span style="color:#e45649;">name</span><span>: 
String) -&gt; </span><span style="color:#a626a4;">Self </span><span>{ </span><span> </span><span style="color:#a626a4;">Self </span><span>{ </span><span> name </span><span> } </span><span> } </span><span> </span><span> async </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">run</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>) { </span><span> </span><span style="color:#a626a4;">match </span><span style="color:#e45649;">self</span><span>.</span><span style="color:#0184bc;">try_run</span><span>() { </span><span> Ok(results) </span><span style="color:#a626a4;">=&gt; </span><span>{ </span><span> todo!() </span><span style="color:#a0a1a7;">// B </span><span> } </span><span> Err(err) </span><span style="color:#a626a4;">=&gt; </span><span>{ </span><span> todo!() </span><span style="color:#a0a1a7;">// C </span><span> } </span><span> } </span><span> } </span><span>} </span></code></pre> <p>There are two new things in this snippet we haven't seen so far:</p> <ul> <li>agent methods (and also the constructor) can be <code>async</code>. Even being single-threaded, and invocations not being able to overlap, within an invocation there are async operations that can overlap, such as RPC calls, HTTP requests, and waiting for external events (using Golem promises).</li> <li>we can easily add helper methods to our implementations - everything outside of the <code>agent_implementation</code> is an implementation detail of that agent</li> </ul> <p>There are three <code>todo!</code>s in the above implementation, let's discuss them one by one.</p> <h4 id="searching-the-web-a">Searching the web (A)</h4> <p>We could manually do HTTP requests to use Google's search APIs (the recommended way is using the <a href="https://docs.rs/wstd/latest/wstd/">wstd crate's HTTP client</a>) but we have a better option. 
Golem comes with a large number of <strong>connectors</strong> for 3rd party providers: LLMs, embeddings, text-to-speech, speech-to-text, video generation, code snippet execution, vector databases, searching, etc. Each of these categories define a unified API for working with various third-party providers in that category.</p> <p>For implementing <code>try_run</code>, we are going to use the <code>golem_rust::golem_ai::golem::web_search</code> module and as a separate step, we are going to choose Google as the selected implementation for it.</p> <p>The first step is to enable these connectors in the <code>Cargo.toml</code> file, as they are disabled in the default template:</p> <pre data-lang="toml" style="background-color:#fafafa;color:#383a42;" class="language-toml "><code class="language-toml" data-lang="toml"><span style="color:#e45649;">golem-rust </span><span>= { </span><span style="color:#e45649;">version </span><span>= </span><span style="color:#50a14f;">&quot;1.10.3&quot;</span><span>, </span><span style="color:#e45649;">features </span><span>= [</span><span style="color:#50a14f;">&quot;export_golem_agentic&quot;</span><span>, </span><span style="color:#50a14f;">&quot;golem_ai&quot;</span><span>] } </span></code></pre> <p>Adding the <code>golem_ai</code> feature enables access to all the connectors defined in the <a href="https://github.com/golemcloud/golem-ai">golem-ai repo</a>.</p> <p>Then we can use these bindings in our method implementation:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">use </span><span>golem_rust::golem_ai::golem::web_search::types::{SearchParams, SearchResult}; </span><span style="color:#a626a4;">use </span><span>golem_rust::golem_ai::golem::web_search::web_search::start_search; </span><span> </span><span style="color:#a626a4;">const </span><span style="color:#c18401;">LANGUAGES</span><span>: </span><span 
style="color:#a626a4;">&amp;</span><span>[Language] </span><span style="color:#a626a4;">= &amp;</span><span>[Language::Rust, Language::JavaScript]; </span><span style="color:#a626a4;">let mut</span><span> result </span><span style="color:#a626a4;">= </span><span>vec![]; </span><span> </span><span style="color:#a626a4;">for</span><span> language </span><span style="color:#a626a4;">in </span><span style="color:#c18401;">LANGUAGES </span><span>{ </span><span> </span><span style="color:#a626a4;">let</span><span> search </span><span style="color:#a626a4;">= </span><span style="color:#0184bc;">start_search</span><span>(</span><span style="color:#a626a4;">&amp;</span><span>SearchParams { </span><span> query: format!(</span><span style="color:#50a14f;">&quot;</span><span style="color:#c18401;">{}</span><span style="color:#50a14f;"> library for </span><span style="color:#c18401;">{:?}</span><span style="color:#50a14f;">&quot;</span><span>, </span><span style="color:#e45649;">self</span><span>.name.</span><span style="color:#0184bc;">clone</span><span>(), language), </span><span> include_domains: Some(vec![</span><span style="color:#50a14f;">&quot;github.com&quot;</span><span>.</span><span style="color:#0184bc;">to_string</span><span>()]), </span><span> include_images: Some(</span><span style="color:#c18401;">false</span><span>), </span><span> </span><span style="color:#a0a1a7;">// everything else is undefined below </span><span> safe_search: None, language: None, </span><span> region: None, max_results: None, time_range: None, </span><span> exclude_domains: None, include_html: None, advanced_answer: None, </span><span> })</span><span style="color:#a626a4;">?</span><span>; </span><span> </span><span> </span><span style="color:#a626a4;">loop </span><span>{ </span><span> </span><span style="color:#a626a4;">let</span><span> page </span><span style="color:#a626a4;">=</span><span> search.</span><span style="color:#0184bc;">next_page</span><span>()</span><span 
style="color:#a626a4;">?</span><span>; </span><span> </span><span style="color:#a626a4;">if</span><span> page.</span><span style="color:#0184bc;">is_empty</span><span>() { </span><span> </span><span style="color:#a626a4;">break</span><span>; </span><span> } </span><span> result.</span><span style="color:#0184bc;">extend</span><span>(page.</span><span style="color:#0184bc;">into_iter</span><span>().</span><span style="color:#0184bc;">map</span><span>(|</span><span style="color:#e45649;">r</span><span>| (language.</span><span style="color:#0184bc;">clone</span><span>(), r))); </span><span> } </span><span>} </span><span> </span><span>Ok(result) </span></code></pre> <p>We perform a separate search query for each language we are interested in, and go through all pages of the results for each.</p> <p>Note that the search API (and all the others in <code>golem-ai</code>) is NOT async. This is a limitation coming from being built on the current version of the WASM component model, and it is going to be lifted in the next Golem release.</p> <p>This search interface is not limited to Google search - we have implementations for Brave Search, Google Custom Search, Serper.dev and Tavily AI at the moment. Before deploying our application we need to choose which provider to use, by editing the <code>golem.yaml</code> file of our component. It comes by default with commented-out sections for all these connectors. 
To enable Google, we need to add a <code>dependency</code>, and two entries to the <code>env</code> section:</p> <pre data-lang="yaml" style="background-color:#fafafa;color:#383a42;" class="language-yaml "><code class="language-yaml" data-lang="yaml"><span style="color:#e45649;">components</span><span>: </span><span> </span><span style="color:#e45649;">libdb:backend</span><span>: </span><span> </span><span style="color:#e45649;">templates</span><span>: </span><span style="color:#50a14f;">rust </span><span> </span><span style="color:#e45649;">env</span><span>: </span><span> </span><span style="color:#e45649;">GOOGLE_API_KEY</span><span>: </span><span style="color:#50a14f;">&quot;{{ GOOGLE_API_KEY }}&quot; </span><span> </span><span style="color:#e45649;">GOOGLE_SEARCH_ENGINE_ID</span><span>: </span><span style="color:#50a14f;">&quot;{{ GOOGLE_SEARCH_ENGINE_ID }}&quot; </span><span> </span><span style="color:#e45649;">dependencies</span><span>: </span><span> - </span><span style="color:#e45649;">type</span><span>: </span><span style="color:#50a14f;">wasm </span><span> </span><span style="color:#e45649;">url</span><span>: </span><span style="color:#50a14f;">https://github.com/golemcloud/golem-ai/releases/download/v0.4.0-dev.1/golem_web_search_google-dev.wasm </span></code></pre> <p>Using the <code>{{ X }}</code> syntax for the environment variables allows the <code>golem</code> CLI tool to read them from the environment during deployment, so we don't accidentally commit our keys to our repo. See the <a href="https://developers.google.com/custom-search/v1/introduction">official Google page</a> to learn how to define an API key and a search engine ID.</p> <h4 id="processing-results-b">Processing results (B)</h4> <p>If the search was successful, we end up having a list of <code>SearchResult</code> values - these are records defined in the web-search API. 
In this example we are only going to use the <code>url</code> field of it, which is the search result URL.</p> <p>For each result we are going to spawn a <strong>LibraryAnalysis</strong> agent. The idea is the same as with <strong>Topic</strong> vs <strong>TopicDiscovery</strong> - we want something that runs in the background, not affecting the actual Library, so it can be accessed freely while the analysis runs. Let's assume we identify a library analysis by the library reference (1-1 mapping between a library and its analysis agent), and we pass additional information, such as the GitHub repository our search revealed, to its <strong>run method</strong>:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">for </span><span>(language, result) </span><span style="color:#a626a4;">in</span><span> results { </span><span> </span><span style="color:#a626a4;">let mut</span><span> library_analysis </span><span style="color:#a626a4;">= </span><span>LibraryAnalysisClient::get(LibraryReference { </span><span> name: </span><span style="color:#0184bc;">extract_github_repo_name</span><span>(result.url.</span><span style="color:#0184bc;">clone</span><span>()), </span><span> language, </span><span> }); </span><span> library_analysis </span><span> .</span><span style="color:#0184bc;">run</span><span>(Some(</span><span style="color:#e45649;">self</span><span>.name.</span><span style="color:#0184bc;">clone</span><span>()), </span><span style="color:#0184bc;">extract_github_repo</span><span>(result.url)) </span><span> .await; </span><span>} </span></code></pre> <p>Before talking about the more interesting parts of this snippet, let's just quickly define what <code>extract_github_repo_name</code> and <code>extract_github_repo</code> are. 
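</p> <p>As a rough illustration, these two helpers could look something like the sketch below. This is only an assumption about their shape: the <code>split_github_url</code> function is hypothetical, and this version takes a <code>&amp;str</code> and returns an <code>Option</code> (so URLs that do not point into a repository can be skipped), while the helpers called in the snippet above evidently take and return plain values:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust">/// Hypothetical helper: splits a GitHub URL into its (org, name) path segments.
/// Returns None if the URL does not point into a github.com repository.
fn split_github_url(url: &amp;str) -&gt; Option&lt;(&amp;str, &amp;str)&gt; {
    let rest = url.strip_prefix("https://github.com/")?;
    // Ignore any query string or fragment before looking at the path
    let path = rest.split(|c: char| c == '?' || c == '#').next()?;
    let mut segments = path.split('/').filter(|s| !s.is_empty());
    let org = segments.next()?;
    let name = segments.next()?;
    Some((org, name))
}

/// The repository name, i.e. the second path segment of the URL
fn extract_github_repo_name(url: &amp;str) -&gt; Option&lt;String&gt; {
    split_github_url(url).map(|(_, name)| name.to_string())
}

/// The root repository URL, dropping anything after the repository name
fn extract_github_repo(url: &amp;str) -&gt; Option&lt;String&gt; {
    split_github_url(url).map(|(org, name)| format!("https://github.com/{org}/{name}"))
}
</code></pre> <p>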
Our search is constrained to only give hits within https://github.com and repository links have the format <code>https://github.com/&lt;org&gt;/&lt;name&gt;</code>. These helper functions just extract <code>name</code> and the root repository URL from an arbitrary search result URL (that can point to anywhere within a repo).</p> <p>But more importantly, what we see here is <strong>agent-to-agent communication</strong>!</p> <p>For every agent we define, the <code>#[agent_definition]</code> macro automatically creates a <strong>client</strong> type - if the agent name is <code>LibraryAnalysis</code>, the client is a type called <code>LibraryAnalysisClient</code>. Each such agent client has a <code>get</code> method, with exactly the same parameters as the agent's constructor. The semantics of this <code>get</code> method is "upsert" - as the constructor parameters are the identity of an agent, calling this method either returns a reference to an existing agent with the given identity, or to a new one if it had not existed before. This explains why it is called <code>get</code> and not <code>new</code>.</p> <p>There are two more client constructor methods in each client (<code>new_phantom</code> and <code>get_phantom</code>) but we don't need them for this example.</p> <p>The clients returned by <code>get</code> have an <strong>async method</strong> for each <strong>agent method</strong> the agent exports, with the same parameters as in the original definition. No matter if the agent method was async or not, the method on the client is always <code>async</code> - as it represents an async remote call awaiting the method's result.</p> <p>So, <code>.await</code>-ing <code>run</code> in the loop means we do the analysis one by one. I did that to reduce the load on my OpenAI account that the analysis uses. 
We could also trigger an analysis for all libraries together, or in batches - these are standard Rust futures, so we could use crates like <a href="https://docs.rs/futures-concurrency/latest/futures_concurrency/index.html">futures_concurrency</a> to manage them.</p> <h4 id="search-failures-c">Search failures (C)</h4> <p>If the search failed, we want to report this back to the <strong>Topic</strong> because that's the "user-facing" representation of our topic. So let's add a new agent method to the topic agent (both its trait and its impl):</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">record_failure</span><span>(</span><span style="color:#a626a4;">&amp;mut </span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">failure</span><span>: String) { </span><span> </span><span style="color:#e45649;">self</span><span>.failures.</span><span style="color:#0184bc;">push</span><span>(failure); </span><span>} </span></code></pre> <p>and then use <strong>agent-to-agent communication</strong> again to call this from our <code>Err</code> branch in the topic discovery implementation:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">let mut</span><span> topic </span><span style="color:#a626a4;">= </span><span>TopicClient::get(</span><span style="color:#e45649;">self</span><span>.name.</span><span style="color:#0184bc;">clone</span><span>()); </span><span>topic.</span><span style="color:#0184bc;">record_failure</span><span>(err.</span><span style="color:#0184bc;">to_string</span><span>()).await; </span></code></pre> <h3 id="logging">Logging</h3> <p>Before implementing <strong>LibraryAnalysis</strong>, let's take a look at logging. 
Now that we can spawn multiple parallel web searches running in background agents, if we started playing with our application (which we can't at the moment, without defining the library analysis agent first), it would be very hard to observe what is happening on each agent.</p> <p>In Golem each agent can emit log events - writing to the standard output is a log event, but in Rust we can also use the <a href="https://docs.rs/log/0.4.29/log/">log crate</a> to emit log events at different log levels. This log stream is <strong>per agent</strong>. We can observe it by using for example the <code>golem agent stream</code> command:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>golem agent stream &#39;library({ name: &quot;test-r&quot;, language: rust})&#39; </span></code></pre> <p>Invocations from the Golem REPL also automatically stream the log events back to the REPL. We haven't seen that before in this example because we did not log anything.</p> <p>Golem currently does not have any built-in support for observing logs from a tree of agents. So if we want to see - after asking our application to discover a topic - logs from our topic discovery agents, and then from each library analysis agent that we spawned, we are in trouble.</p> <p>But we can simply solve this by building our <strong>own log aggregator</strong> in Golem itself! As we've seen, it's very easy to call an agent from another agent. We can define a <strong>log agent</strong> that receives messages, and then emits them as its own log events that we can stream with Golem's CLI.</p> <p>But if we only had what we have seen so far, this would have a terrible effect on our application. 
The <strong>log agent</strong> would be a single instance processing log messages one by one, and every remote log method call would block until it processed the message.</p> <p>Fortunately the <strong>agent clients</strong> have another variant of each agent method on their interface - they can <strong>trigger</strong> an agent method invocation in a non-blocking way. This is very fast and returns immediately (as soon as the invocation is enqueued in the remote agent). It is also very safe - no message is going to be lost. Golem guarantees exactly-once calling semantics between agents, and the log agent itself is also automatically durable.</p> <p>(With a very large number of agents, or with large log entries, a single log agent can of course become a bottleneck - it may not be able to process the messages fast enough. We are not going to solve this problem in this post.)</p> <p>Let's define our log agent!</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">use </span><span>golem_rust::bindings::wasi::logging::logging::{log, Level}; </span><span> </span><span>#[</span><span style="color:#e45649;">agent_definition</span><span>] </span><span style="color:#a626a4;">trait </span><span>Log { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">new</span><span>() -&gt; </span><span style="color:#a626a4;">Self</span><span>; </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">log</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">level</span><span>: Level, </span><span style="color:#e45649;">sender</span><span>: String, </span><span style="color:#e45649;">message</span><span>: String); </span><span>} </span><span> </span><span style="color:#a626a4;">struct </span><span>LogImpl {} </span><span> 
</span><span>#[</span><span style="color:#e45649;">agent_implementation</span><span>] </span><span style="color:#a626a4;">impl </span><span>Log </span><span style="color:#a626a4;">for </span><span>LogImpl { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">new</span><span>() -&gt; </span><span style="color:#a626a4;">Self </span><span>{ </span><span> </span><span style="color:#a626a4;">Self </span><span>{} </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">log</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">level</span><span>: Level, </span><span style="color:#e45649;">sender</span><span>: String, </span><span style="color:#e45649;">message</span><span>: String) { </span><span> </span><span style="color:#0184bc;">log</span><span>(level, </span><span style="color:#a626a4;">&amp;</span><span>sender, </span><span style="color:#a626a4;">&amp;</span><span>message) </span><span> } </span><span>} </span></code></pre> <p>This is a very simple agent. It has <strong>no constructor parameters</strong>, which means it is a <strong>cluster-level singleton</strong>. 
It has a single method, that just delegates the call to the low-level <code>log</code> function defined in the Golem Rust SDK.</p> <p>To make this agent nice to use, we define a helper struct called <code>Logger</code>, which we can use in our other agents to conveniently log messages.</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">pub struct </span><span>Logger { </span><span> </span><span style="color:#e45649;">client</span><span>: LogClient, </span><span> </span><span style="color:#e45649;">sender</span><span>: String </span><span>} </span><span> </span><span style="color:#a626a4;">impl </span><span>Logger { </span><span> </span><span style="color:#a626a4;">pub fn </span><span style="color:#0184bc;">new</span><span>(</span><span style="color:#e45649;">sender</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>) -&gt; </span><span style="color:#a626a4;">Self </span><span>{ </span><span> </span><span style="color:#a626a4;">Self </span><span>{ </span><span> client: LogClient::get(), </span><span> sender: sender.</span><span style="color:#0184bc;">to_string</span><span>(), </span><span> } </span><span> } </span><span> </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span> </span><span> </span><span style="color:#a626a4;">pub fn </span><span style="color:#0184bc;">info</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">message</span><span>: impl AsRef&lt;</span><span style="color:#a626a4;">str</span><span>&gt;) { </span><span> </span><span style="color:#0184bc;">log</span><span>(Level::Info, </span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>.sender, message.</span><span style="color:#0184bc;">as_ref</span><span>()); </span><span> </span><span style="color:#e45649;">self</span><span>.client.</span><span style="color:#0184bc;">trigger_log</span><span>( </span><span> Level::Info, </span><span> </span><span style="color:#e45649;">self</span><span>.sender.</span><span style="color:#0184bc;">clone</span><span>(), </span><span> message.</span><span style="color:#0184bc;">as_ref</span><span>().</span><span style="color:#0184bc;">to_string</span><span>(), </span><span> ); </span><span> } </span><span> </span><span> </span><span style="color:#a0a1a7;">// ... </span><span>} </span></code></pre> <p>We create the remote client in the constructor, and then expose methods for each log level. 
In these methods we first emit the log message in our "own" agent's log stream, and then also enqueue the log message in the singleton log agent's message queue.</p> <p>We do this by calling <strong>trigger_log</strong> on the client, instead of <strong>log</strong> - this is the non-blocking method to trigger an invocation without awaiting its execution.</p> <p>With this set up, we can add a <code>Logger</code> to our other agents, for example:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">struct </span><span>TopicDiscoveryImpl { </span><span> </span><span style="color:#e45649;">name</span><span>: String, </span><span> </span><span style="color:#e45649;">logger</span><span>: Logger, </span><span>} </span></code></pre> <p>and then use it to log messages:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">for</span><span> language </span><span style="color:#a626a4;">in </span><span style="color:#c18401;">LANGUAGES </span><span>{ </span><span> </span><span style="color:#e45649;">self</span><span>.logger.</span><span style="color:#0184bc;">debug</span><span>(format!(</span><span style="color:#50a14f;">&quot;Searching for libraries in </span><span style="color:#c18401;">{language:?}</span><span style="color:#50a14f;">...&quot;</span><span>)); </span><span> </span><span style="color:#a0a1a7;">// ... 
</span></code></pre> <p>When running our application we can observe all logs by running</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>golem agent stream &#39;log()&#39; --logs-only </span></code></pre> <h3 id="library-analysis">Library analysis</h3> <p>We have already seen what our <strong>library analysis agent</strong> will look like:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">agent_definition</span><span>] </span><span style="color:#a626a4;">trait </span><span>LibraryAnalysis { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">new</span><span>(</span><span style="color:#e45649;">reference</span><span>: LibraryReference) -&gt; </span><span style="color:#a626a4;">Self</span><span>; </span><span> async </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">run</span><span>(</span><span style="color:#a626a4;">&amp;mut </span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">parent_topic</span><span>: Option&lt;String&gt;, </span><span style="color:#e45649;">repo_uri</span><span>: Uri); </span><span>} </span></code></pre> <p>The agent's identity is the same as the library agent's - there is a 1-1 mapping between them. The only agent method is the long-running <code>run</code> method, which receives some details (which topic initiated the analysis, and what the GitHub repo URL is).</p> <p>As mentioned earlier, we also have an LLM library with implementations for various providers: Anthropic, OpenAI, OpenRouter, Amazon Bedrock, Grok and Ollama. 
It works the same way as I explained with the web search - we use the library through a module of <code>golem_rust</code>, and then configure the provider and its API keys in <code>golem.yaml</code>.</p> <p>The library analysis itself won't be very sophisticated - it only serves as an example. We are going to ask an LLM to:</p> <ul> <li>Read the repository's front page</li> <li>Check if it's a library of the programming language we believe it is</li> <li>If yes, write a short summary of what the library does</li> <li>Also collect a set of tags (or topics) representing what the library implements</li> </ul> <p>We ask it to return this in a structured (JSON) format. If it does not, or anything else fails, we mark the library analysis as failed.</p> <p>Either way, at the end we will call a method on the corresponding <strong>LibraryAgent</strong> to store the analysis results.</p> <p>So first let's extend <strong>LibraryAgent</strong> with two new methods to store the results:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">agent_definition</span><span>] </span><span style="color:#a626a4;">pub trait </span><span>Library { </span><span style="color:#a0a1a7;">// ... 
</span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">analysis_failed</span><span>(</span><span style="color:#a626a4;">&amp;mut </span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">message</span><span>: String); </span><span> async </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">analysis_succeeded</span><span>( </span><span> </span><span style="color:#a626a4;">&amp;mut </span><span style="color:#e45649;">self</span><span>, </span><span> </span><span style="color:#e45649;">repository</span><span>: Uri, </span><span> </span><span style="color:#e45649;">description</span><span>: String, </span><span> </span><span style="color:#e45649;">topics</span><span>: Vec&lt;String&gt; </span><span> ); </span><span>} </span></code></pre> <p>The <code>analysis_failed</code> implementation just changes the state and logs a message:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">analysis_failed</span><span>(</span><span style="color:#a626a4;">&amp;mut </span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">message</span><span>: String) { </span><span> </span><span style="color:#e45649;">self</span><span>.logger </span><span> .</span><span style="color:#0184bc;">error</span><span>(format!(</span><span style="color:#50a14f;">&quot;Library analysis failed: </span><span style="color:#c18401;">{message}</span><span style="color:#50a14f;">&quot;</span><span>)); </span><span> </span><span> </span><span style="color:#e45649;">self</span><span>.state </span><span style="color:#a626a4;">= </span><span>LibraryState::Failed { message }; </span><span>} </span></code></pre> <p>The <code>analysis_succeeded</code> also registers the library into the <strong>topics</strong> the LLM identified it belongs to! 
This way we continuously build our information graph. To register a library to a topic, we can add a simple <code>add</code> method to the <strong>TopicAgent</strong> and then trigger the invocation (to not introduce any slowdowns here):</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>async </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">analysis_succeeded</span><span>( </span><span> </span><span style="color:#a626a4;">&amp;mut </span><span style="color:#e45649;">self</span><span>, </span><span> </span><span style="color:#e45649;">repository</span><span>: Uri, </span><span> </span><span style="color:#e45649;">description</span><span>: String, </span><span> </span><span style="color:#e45649;">topics</span><span>: Vec&lt;String&gt;, </span><span>) { </span><span> </span><span style="color:#e45649;">self</span><span>.logger.</span><span style="color:#0184bc;">info</span><span>(format!( </span><span> </span><span style="color:#50a14f;">&quot;Library analysis based on </span><span style="color:#c18401;">{repository}</span><span style="color:#50a14f;"> succeeded with description: </span><span style="color:#c18401;">{description}</span><span style="color:#50a14f;"> and topics: </span><span style="color:#c18401;">{topics:?}</span><span style="color:#50a14f;">&quot; </span><span> )); </span><span> </span><span> </span><span style="color:#a626a4;">for</span><span> topic </span><span style="color:#a626a4;">in &amp;</span><span>topics { </span><span> </span><span style="color:#a626a4;">let mut</span><span> topic </span><span style="color:#a626a4;">= </span><span>TopicClient::get(topic.</span><span style="color:#0184bc;">clone</span><span>()); </span><span> topic.</span><span style="color:#0184bc;">trigger_add</span><span>(</span><span style="color:#e45649;">self</span><span>.reference.</span><span style="color:#0184bc;">clone</span><span>()); 
</span><span> } </span><span> </span><span> </span><span style="color:#e45649;">self</span><span>.state </span><span style="color:#a626a4;">= </span><span>LibraryState::Analysed { </span><span> repository, </span><span> description, </span><span> topics: topics.</span><span style="color:#0184bc;">into_iter</span><span>().</span><span style="color:#0184bc;">collect</span><span>(), </span><span> }; </span><span>} </span></code></pre> <p>With this being ready, let's go back to our <strong>LibraryAnalysis</strong> agent's <code>run</code> method!</p> <p>We start by using Golem's LLM connector to ask a question:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">let</span><span> response </span><span style="color:#a626a4;">= </span><span style="color:#0184bc;">send</span><span>(</span><span style="color:#a626a4;">&amp;</span><span>[Event::Message( </span><span> Message { </span><span> role: Role::User, </span><span> name: None, </span><span> content: vec![ </span><span> ContentPart::Text(format!(</span><span style="color:#50a14f;">&quot;Let&#39;s analyse the GitHub repository at </span><span style="color:#c18401;">{}</span><span style="color:#50a14f;">. First check if this is a library for </span><span style="color:#c18401;">{:?}</span><span style="color:#50a14f;">. If it is, then come up with a list of tags describing what this library is for, and return it as a JSON array of strings. 
If it is not for the given language, return an empty tag array.&quot;</span><span>, repo_uri, </span><span style="color:#e45649;">self</span><span>.reference.language)), </span><span> ContentPart::Text(</span><span style="color:#50a14f;">&quot;In addition to the array of tags, also return a short description of the library in a separate field of the result JSON object.&quot;</span><span>.</span><span style="color:#0184bc;">to_string</span><span>()), </span><span> ContentPart::Text(</span><span style="color:#50a14f;">&quot;Always response with a JSON object with the following structure: { </span><span style="color:#0997b3;">\&quot;</span><span style="color:#50a14f;">description</span><span style="color:#0997b3;">\&quot;</span><span style="color:#50a14f;">: </span><span style="color:#0997b3;">\&quot;</span><span style="color:#50a14f;">short description of the library</span><span style="color:#0997b3;">\&quot;</span><span style="color:#50a14f;">, </span><span style="color:#0997b3;">\&quot;</span><span style="color:#50a14f;">tags</span><span style="color:#0997b3;">\&quot;</span><span style="color:#50a14f;">: [</span><span style="color:#0997b3;">\&quot;</span><span style="color:#50a14f;">tag1</span><span style="color:#0997b3;">\&quot;</span><span style="color:#50a14f;">, </span><span style="color:#0997b3;">\&quot;</span><span style="color:#50a14f;">tag2</span><span style="color:#0997b3;">\&quot;</span><span style="color:#50a14f;">, ...] 
}&quot;</span><span>.</span><span style="color:#0184bc;">to_string</span><span>()), </span><span> ], </span><span> } </span><span> )], </span><span> </span><span style="color:#a626a4;">&amp;</span><span>Config { </span><span> model: </span><span style="color:#50a14f;">&quot;gpt-3.5-turbo&quot;</span><span>.</span><span style="color:#0184bc;">to_string</span><span>(), </span><span> temperature: None, </span><span> max_tokens: None, </span><span> stop_sequences: None, </span><span> tools: None, </span><span> tool_choice: None, </span><span> provider_options: None, </span><span> }, </span><span>); </span><span> </span><span style="color:#a626a4;">let mut</span><span> library </span><span style="color:#a626a4;">= </span><span>LibraryClient::get(</span><span style="color:#e45649;">self</span><span>.reference.</span><span style="color:#0184bc;">clone</span><span>()); </span><span style="color:#a626a4;">match</span><span> response { </span><span> </span><span style="color:#a0a1a7;">// ... </span></code></pre> <p>The response is either <code>Ok</code> or <code>Err</code>. If it was successful, it just contains a list of <code>ContentPart</code>s. We just naively try to concatenate those and decode as our expected JSON:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">derive</span><span>(Debug, Clone, serde::Deserialize)] </span><span style="color:#a626a4;">struct </span><span>ExpectedLlmResponse { </span><span> </span><span style="color:#e45649;">description</span><span>: String, </span><span> </span><span style="color:#e45649;">tags</span><span>: Vec&lt;String&gt;, </span><span>} </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span> </span><span style="color:#a626a4;">let</span><span> raw_string_content </span><span style="color:#a626a4;">=</span><span> response </span><span> .content </span><span> .</span><span style="color:#0184bc;">iter</span><span>() </span><span> .</span><span style="color:#0184bc;">map</span><span>(|</span><span style="color:#e45649;">c</span><span>| </span><span style="color:#a626a4;">match</span><span> c { </span><span> ContentPart::Text(s) </span><span style="color:#a626a4;">=&gt;</span><span> s.</span><span style="color:#0184bc;">clone</span><span>(), </span><span> </span><span style="color:#a626a4;">_ =&gt; </span><span style="color:#50a14f;">&quot;&quot;</span><span>.</span><span style="color:#0184bc;">to_string</span><span>(), </span><span> }) </span><span> .collect::&lt;Vec&lt;String&gt;&gt;() </span><span> .</span><span style="color:#0184bc;">join</span><span>(</span><span style="color:#50a14f;">&quot;&quot;</span><span>); </span><span> </span><span style="color:#e45649;">self</span><span>.logger.</span><span style="color:#0184bc;">debug</span><span>(format!(</span><span style="color:#50a14f;">&quot;LLM response: </span><span style="color:#c18401;">{raw_string_content}</span><span style="color:#50a14f;">&quot;</span><span>)); </span><span> </span><span style="color:#a626a4;">match </span><span>serde_json::from_str::&lt;ExpectedLlmResponse&gt;(</span><span style="color:#a626a4;">&amp;</span><span>raw_string_content) { </span><span> </span><span style="color:#a0a1a7;">// ... 
</span></code></pre> <p>If the <code>tags</code> array in the response is empty, or anything else fails, we call <code>analysis_failed</code> through <code>library</code>, otherwise we call <code>analysis_succeeded</code> with the information gathered from the LLM's response.</p> <p>At this point we can build and deploy our application, and start playing with it:</p> <p><img src="/images/libdb-golem-rust-1.png" alt="" /></p> <p>and the log stream:</p> <p><img src="/images/libdb-golem-rust-2.png" alt="" /></p> <h3 id="catalog-agent">Catalog agent</h3> <p>The application we created so far spawns many top-level agents automatically - discovering one topic can create a lot of new topic agents, all ready to be investigated further by calling <code>discover-libraries</code> on them.</p> <p>To see what topics and libraries we've discovered so far, we can use the <code>golem agent list</code> command - but that's just a debug tool, not suitable for use as part of our application's API. If we want to, for example, build a UI on top of this app, we need a way to enumerate all the topics we currently know about.</p> <p>This can be very easily done by introducing a new <strong>singleton agent</strong> to just keep a catalog of all the topics and libraries in its memory. This, however, will become a bottleneck if we want to scale this application significantly. There are solutions to that, for example we could define a sharded multi-agent catalog. 
In this post, however, we are going to do the simple version and just define it as a singleton agent with two lists:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">agent_definition</span><span>] </span><span style="color:#a626a4;">trait </span><span>Catalog { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">new</span><span>() -&gt; </span><span style="color:#a626a4;">Self</span><span>; </span><span> </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">get_libraries</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>) -&gt; Vec&lt;LibraryReference&gt;; </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">get_topics</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>) -&gt; Vec&lt;String&gt;; </span><span> </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">register_library</span><span>(</span><span style="color:#a626a4;">&amp;mut </span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">library</span><span>: LibraryReference); </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">register_topic</span><span>(</span><span style="color:#a626a4;">&amp;mut </span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">topic</span><span>: String); </span><span>} </span></code></pre> <p>The implementation is straightforward - just store the libraries and topics in two <code>Vec</code>s in the agent's internal state. 
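</p> <p>As a rough illustration of that state, here is a plain-Rust sketch without any Golem macros. The name <code>CatalogState</code>, the fields assumed for <code>LibraryReference</code> (with the language simplified to a string) and the duplicate check are assumptions made for this sketch, not taken from the actual implementation:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust">/// Stand-in for the post's `LibraryReference`; the fields are assumed here
/// and the language is simplified to a plain string.
#[derive(Debug, Clone, PartialEq, Eq)]
struct LibraryReference {
    name: String,
    language: String,
}

/// Hypothetical model of the catalog agent's internal state: two `Vec`s.
#[derive(Debug, Default)]
struct CatalogState {
    libraries: Vec&lt;LibraryReference&gt;,
    topics: Vec&lt;String&gt;,
}

impl CatalogState {
    fn get_libraries(&amp;self) -&gt; Vec&lt;LibraryReference&gt; {
        self.libraries.clone()
    }

    fn get_topics(&amp;self) -&gt; Vec&lt;String&gt; {
        self.topics.clone()
    }

    /// Skipping duplicates is an assumption of this sketch; it keeps the
    /// list stable if the same library is registered more than once.
    fn register_library(&amp;mut self, library: LibraryReference) {
        if !self.libraries.contains(&amp;library) {
            self.libraries.push(library);
        }
    }

    /// Same duplicate handling as for libraries.
    fn register_topic(&amp;mut self, topic: String) {
        if !self.topics.contains(&amp;topic) {
            self.topics.push(topic);
        }
    }
}

fn main() {
    let mut catalog = CatalogState::default();
    catalog.register_topic(&quot;mp3&quot;.to_string());
    catalog.register_topic(&quot;mp3&quot;.to_string()); // ignored, already known
    catalog.register_library(LibraryReference {
        name: &quot;test-r&quot;.to_string(),
        language: &quot;rust&quot;.to_string(),
    });
    println!(&quot;{:?}&quot;, catalog.get_topics());
    println!(&quot;{:?}&quot;, catalog.get_libraries());
}
</code></pre> <p>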
We can call <code>register_library</code> from the <code>analysis_succeeded</code> method of <code>Library</code>:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">let mut</span><span> catalog </span><span style="color:#a626a4;">= </span><span>CatalogClient::get(); </span><span>catalog.</span><span style="color:#0184bc;">trigger_register_library</span><span>(</span><span style="color:#e45649;">self</span><span>.reference.</span><span style="color:#0184bc;">clone</span><span>()); </span></code></pre> <p>And similarly, <code>register_topic</code> from the <strong>constructor</strong> of <code>Topic</code>:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">let mut</span><span> catalog </span><span style="color:#a626a4;">= </span><span>CatalogClient::get(); </span><span>catalog.</span><span style="color:#0184bc;">trigger_register_topic</span><span>(name.</span><span style="color:#0184bc;">clone</span><span>()); </span></code></pre> <h2 id="public-api">Public API</h2> <p>At this point we are mostly done with our application's implementation, but we can only interact with it through debug tools like the Golem REPL. We could also use Golem's <a href="https://learn.golem.cloud/rest-api/worker">REST API to invoke agents</a> but that's not a very nice way to integrate our application with, for example, a user interface.</p> <p>Fortunately Golem supports <strong>defining custom APIs</strong> for applications. 
In Golem 1.4, this has to be done in the application manifest - defining routes in the <code>golem.yaml</code>, and mapping logic in a custom scripting language called <a href="https://learn.golem.cloud/rib">Rib</a>.</p> <p>This is going to change in the next release (a few months from now), and we are going to be able to define these APIs fully from code, in our chosen programming language. Until then, let's see what the current method looks like!</p> <p>In our component's <code>golem.yaml</code> file, there is a <code>httpApi</code> section:</p> <pre data-lang="yaml" style="background-color:#fafafa;color:#383a42;" class="language-yaml "><code class="language-yaml" data-lang="yaml"><span style="color:#e45649;">httpApi</span><span>: </span><span> </span><span style="color:#e45649;">definitions</span><span>: </span><span> </span><span style="color:#e45649;">libdb-backend-api</span><span>: </span><span> </span><span style="color:#e45649;">version</span><span>: </span><span style="color:#50a14f;">&#39;0.0.1&#39; </span><span> </span><span style="color:#e45649;">routes</span><span>: </span><span> </span><span style="color:#a0a1a7;"># ... 
</span></code></pre> <p>Here we can list endpoints, and for each endpoint include a script that can access information from the request, <strong>call an agent</strong> and use the agent's results to construct HTTP response.</p> <p>A simple one can be an endpoint that lists <em>all the libraries</em> by invoking the <strong>Catalog</strong> agent:</p> <pre data-lang="yaml" style="background-color:#fafafa;color:#383a42;" class="language-yaml "><code class="language-yaml" data-lang="yaml"><span> - </span><span style="color:#e45649;">method</span><span>: </span><span style="color:#50a14f;">GET </span><span> </span><span style="color:#e45649;">path</span><span>: </span><span style="color:#50a14f;">/libdb-backend-api/libs </span><span> </span><span style="color:#e45649;">binding</span><span>: </span><span> </span><span style="color:#e45649;">type</span><span>: </span><span style="color:#50a14f;">default </span><span> </span><span style="color:#e45649;">componentName</span><span>: </span><span style="color:#50a14f;">libdb:backend </span><span> </span><span style="color:#e45649;">response</span><span>: </span><span style="color:#a626a4;">| </span><span style="color:#50a14f;"> let agent = catalog(); </span><span style="color:#50a14f;"> let libs = agent.get-libraries(); </span><span style="color:#50a14f;"> { status: 200, body: { libraries: libs } } </span></code></pre> <p>The language used in these scripts is the same that we were using in the Golem REPL.</p> <p>For more advanced cases, we may need to use pattern matching in the scripts. 
For example to get <em>the libraries belonging to a topic</em>, our agent method returns a Rust <code>Result</code> which we have to process in the script:</p> <pre data-lang="yaml" style="background-color:#fafafa;color:#383a42;" class="language-yaml "><code class="language-yaml" data-lang="yaml"><span> - </span><span style="color:#e45649;">method</span><span>: </span><span style="color:#50a14f;">GET </span><span> </span><span style="color:#e45649;">path</span><span>: </span><span style="color:#50a14f;">/libdb-backend-api/topics/{name} </span><span> </span><span style="color:#e45649;">binding</span><span>: </span><span> </span><span style="color:#e45649;">type</span><span>: </span><span style="color:#50a14f;">default </span><span> </span><span style="color:#e45649;">componentName</span><span>: </span><span style="color:#50a14f;">libdb:backend </span><span> </span><span style="color:#e45649;">response</span><span>: </span><span style="color:#a626a4;">| </span><span style="color:#50a14f;"> let name: string = request.path.name; </span><span style="color:#50a14f;"> let agent = topic(name); </span><span style="color:#50a14f;"> let res = agent.get-libraries(); </span><span style="color:#50a14f;"> match res { </span><span style="color:#50a14f;"> ok(libraries) =&gt; { status: 200, body: { libraries: some(libraries), failures: none } }, </span><span style="color:#50a14f;"> err(failures) =&gt; { status: 500, body: { libraries: none, failures: some(failures) } } </span><span style="color:#50a14f;"> } </span></code></pre> <p>One important trick here is that the branches of the <code>match</code> must evaluate to the same type. So we can't just use <code>body : { libraries: libraries }</code> in one branch and <code>body: { failures: failures }</code> in the other, as Rib cannot unify those two body types.</p> <p>We can add more endpoints to get details of a library or trigger discovery of a topic, etc. 
Once we're done with all that, simply running <code>golem deploy</code> again will make these endpoints available on the chosen deployment. For a locally running Golem server, this is <code>http://lib-db.localhost:9006</code> by default for this example.</p> <p>Deployments are also configurable in the application manifest, and there can be different environments such as local, prod, etc., with different properties.</p> <p>Once the API is deployed we can try it out with <code>curl</code> for example:</p> <pre data-lang="bas" style="background-color:#fafafa;color:#383a42;" class="language-bas "><code class="language-bas" data-lang="bas"><span>$ curl -X GET http://lib-db.localhost:9006/libdb-backend-api/topics </span><span>{&quot;status&quot;:200,&quot;topics&quot;:[&quot;mp3&quot;]}% </span></code></pre> <h2 id="frontend">Frontend</h2> <h3 id="writing-the-frontend">"Writing" the frontend</h3> <p>Now that we have a public REST API for our application, we can build a simple web application on top of it, and host it from our Golem application itself. As the frontend itself is not the focus of this post, we are going to generate it with AI and just see how we can integrate it within our application.</p> <p>The first step is to export an <strong>OpenAPI definition</strong> for our API, hoping that our AI tools will understand it better than Golem's own API definition language. We can do this by running the following command:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>$ golem api definition open-api libdb-backend-api </span><span>Selected app: lib-db, env: local, server: local - builtin (http://localhost:9881) </span><span>Exported OpenAPI spec for libdb-backend-api to libdb-backend-api.yaml </span></code></pre> <p>Then we can ask our favorite coding agent to use this to build a frontend for us. 
I asked for a single HTML page with embedded scripts, with no dependencies or build steps necessary, for simplicity: <a href="https://ampcode.com/threads/T-019b313f-191c-7444-80ca-d8f46ba7fdee">see the amp thread</a>.</p> <p>With this we have an <code>index.html</code>, and we want to host that as part of our application.</p> <h3 id="hosting-the-frontend">Hosting the frontend</h3> <p>One thing we can do is to modify the <code>golem.yaml</code> again, and list files to be added to our <strong>agent's file system</strong>:</p> <pre data-lang="yaml" style="background-color:#fafafa;color:#383a42;" class="language-yaml "><code class="language-yaml" data-lang="yaml"><span style="color:#e45649;">components</span><span>: </span><span> </span><span style="color:#e45649;">libdb:backend</span><span>: </span><span> </span><span style="color:#e45649;">files</span><span>: </span><span> - </span><span style="color:#e45649;">sourcePath</span><span>: </span><span style="color:#50a14f;">index.html </span><span> </span><span style="color:#e45649;">targetPath</span><span>: </span><span style="color:#50a14f;">/index.html </span><span> </span><span style="color:#e45649;">permissions</span><span>: </span><span style="color:#50a14f;">read-only </span></code></pre> <p>With Rust, however, it is much easier to include a single HTML file by using the <code>include_bytes!</code> macro. 
This happens at compile time, so we don't need to add any files to our agent's run-time file system.</p> <p>We can define a new agent whose only purpose is to return this file:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">agent_definition</span><span>(ephemeral)] </span><span style="color:#a626a4;">trait </span><span>Frontend { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">new</span><span>() -&gt; </span><span style="color:#a626a4;">Self</span><span>; </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">index</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>) -&gt; Vec&lt;</span><span style="color:#a626a4;">u8</span><span>&gt;; </span><span>} </span><span> </span><span style="color:#a626a4;">struct </span><span>FrontendImpl {} </span><span> </span><span>#[</span><span style="color:#e45649;">agent_implementation</span><span>] </span><span style="color:#a626a4;">impl </span><span>Frontend </span><span style="color:#a626a4;">for </span><span>FrontendImpl { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">new</span><span>() -&gt; </span><span style="color:#a626a4;">Self </span><span>{ </span><span> </span><span style="color:#a626a4;">Self </span><span>{} </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">index</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>) -&gt; Vec&lt;</span><span style="color:#a626a4;">u8</span><span>&gt; { </span><span> </span><span style="color:#a626a4;">let</span><span> bytes </span><span style="color:#a626a4;">= </span><span>include_bytes!(</span><span
style="color:#50a14f;">&quot;../index.html&quot;</span><span>); </span><span> bytes.</span><span style="color:#0184bc;">to_vec</span><span>() </span><span> } </span><span>} </span><span> </span></code></pre> <p>This agent is a <strong>singleton</strong> - there is only one way to return this <code>index.html</code>, so we don't need multiple agents with different parameters to do so. On the other hand we already learned that agents execute only a single request at a time, so if we served our <code>index.html</code> through a single agent instance, that would be a significant performance problem.</p> <p>The solution for these cases in Golem is to mark the agent as <strong>ephemeral</strong>. In Rust we can do it in the parameter of the <code>agent_definition</code> macro, as shown above. Ephemeral agents are different from the default, <strong>durable agents</strong> in the following ways:</p> <ul> <li>They are faster and cheaper - their state is not persisted (but to some extent their history is still preserved in long-term storage)</li> <li>They are not durable - every invocation starts from a fresh state, and they cannot survive failures</li> <li>Invoking an ephemeral agent from an API definition (or the REPL, etc.) <strong>creates a separate agent every time</strong>.</li> </ul> <p>This last feature allows us to serve an arbitrary number of <code>index.html</code> requests simultaneously, even though our agent looks like a singleton as there is no constructor parameter to distinguish these parallel instances. 
Golem has a built-in feature called <strong>phantom-id</strong> that is appended to the identity of these agents in this case.</p> <h3 id="endpoint-for-index-html">Endpoint for index.html</h3> <p>With this new <strong>Frontend</strong> agent we can add a new endpoint to our routes:</p> <pre data-lang="yaml" style="background-color:#fafafa;color:#383a42;" class="language-yaml "><code class="language-yaml" data-lang="yaml"><span> - </span><span style="color:#e45649;">method</span><span>: </span><span style="color:#50a14f;">GET </span><span> </span><span style="color:#e45649;">path</span><span>: </span><span style="color:#50a14f;">/libdb-backend-api </span><span> </span><span style="color:#e45649;">binding</span><span>: </span><span> </span><span style="color:#e45649;">type</span><span>: </span><span style="color:#50a14f;">default </span><span> </span><span style="color:#e45649;">componentName</span><span>: </span><span style="color:#50a14f;">&quot;libdb:backend&quot; </span><span> </span><span style="color:#e45649;">response</span><span>: </span><span style="color:#a626a4;">| </span><span style="color:#50a14f;"> let agent = frontend(); </span><span style="color:#50a14f;"> let file = agent.index(); </span><span style="color:#50a14f;"> { </span><span style="color:#50a14f;"> headers: { </span><span style="color:#50a14f;"> Content-Type: &quot;text/html; charset=utf-8&quot; </span><span style="color:#50a14f;"> }, </span><span style="color:#50a14f;"> body: file </span><span style="color:#50a14f;"> } </span></code></pre> <p>Let's deploy this and try it out in the browser! The page gets downloaded, but it does not work - it fails with CORS errors.</p> <h3 id="cors">CORS</h3> <p>We need to add CORS Preflight endpoints to our route to make the scripts work. 
In the current version of Golem this is a bit inconvenient, as we need to add them one by one for each endpoint we defined, for example:</p> <pre data-lang="yaml" style="background-color:#fafafa;color:#383a42;" class="language-yaml "><code class="language-yaml" data-lang="yaml"><span> - </span><span style="color:#e45649;">method</span><span>: </span><span style="color:#50a14f;">OPTIONS </span><span> </span><span style="color:#e45649;">path</span><span>: </span><span style="color:#50a14f;">/libdb-backend-api/topics/{name} </span><span> </span><span style="color:#e45649;">binding</span><span>: </span><span> </span><span style="color:#e45649;">type</span><span>: </span><span style="color:#50a14f;">cors-preflight </span></code></pre> <p>Once we added all of them and redeployed, our frontend works as expected!</p> <p><img src="/images/libdb-golem-rust-3.png" alt="" /></p> <h2 id="conclusion">Conclusion</h2> <p>I hope this post shows how much more fun it is to write applications for Golem in this new release. The important thing is to think about the problem to be solved as a set of durable agents communicating with each other. You can think of a Golem application as a distributed, persistent actor system, if you are familiar with those concepts. Once the architecture is done, it's mostly just writing the application logic, without dealing with code generators, new languages (except Rib, for now), or boilerplate to set the network up. 
Everything is automatically persisted, the agents remain available forever, and by scaling the Golem Cluster your application scales horizontally as well.</p> <p>The example is available on <a href="https://github.com/vigoo/golem-example-lib-db">GitHub</a>.</p> Golem 1.3's code-first TypeScript agents 2025-10-04T00:00:00+00:00 2025-10-04T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/golem-code-first-agents/ <h2 id="introduction">Introduction</h2> <p>In the <a href="https://blog.vigoo.dev/posts/golem-new-js-engine/">previous post about Golem 1.3</a> I explained that the new release comes with a new JavaScript engine. This engine fixes some important bugs and allows us to move faster and provide a better JS/TS experience - but it comes at a price. One of the tradeoffs was that building an actual WebAssembly component containing this engine requires the Rust toolchain - we are generating a Rust component that implements the component's (WIT) interface by delegating the calls to <code>rquickjs</code>.</p> <p>But we don't want TypeScript developers to compile Rust components. Even more importantly, we don't want them to have to learn about the WebAssembly Component Model's own interface definition language, WIT, or any other WASM specific tooling; they should just write TypeScript code and Golem should deal with all the underlying complexity.</p> <h2 id="history">History</h2> <p>Before showing how we solved this issue, let's talk about our previous approach. Up until now Golem embraced the component model - we've built Golem on it from the beginning, and I talked multiple times (at <a href="https://blog.vigoo.dev/posts/golem-and-the-wasm-component-model/">LambdaConf</a> and at <a href="https://blog.vigoo.dev/posts/golem-powered-by-wasm/">Wasm I/O</a>) about how we take advantage of its various properties. 
Golem components were just arbitrary WebAssembly components, and Golem directly exposed the component's exported interface through its invocation and remote procedure call APIs. Although we provided more and more help in <em>building</em> these components by making our CLI tool aware of WIT interfaces, relationships between components and so on, the idea still was that Golem should just work with any component built by any WebAssembly tooling.</p> <p>We take a leap away from this with Golem 1.3: while the underlying technology is still the same, we decided to hide more of its complexity from the users, and to focus on one supported language (with a few more to come later), so that our users are not exposed to the ever-changing complexity of WebAssembly tooling. The first such supported language is <strong>TypeScript</strong>.</p> <h2 id="the-new-way">The new way</h2> <p>Let's go through the technical details of how we are doing it!</p> <p>We can avoid the need to generate and build Rust crates (when using TS) and avoid having to learn about WIT with a simple shift in Golem's approach to user defined components: we no longer support components with an arbitrary, user-defined WIT interface. There is one specific <code>WIT world</code> (a set of imports and exports) applied to every Golem component. This world <em>imports</em> all the supported host APIs of Golem - its durability controls, forking, ability to update and query information about agents, etc. It also imports all the supported AI libraries <a href="https://github.com/golemcloud/golem-ai">of the golem-ai project</a>.</p> <p>With this predefined set of imports, we can generate the Rust crate with <a href="https://github.com/golemcloud/wasm-rquickjs/">wasm-rquickjs</a> once, at build time, and the resulting WASM will contain a QuickJS engine with all the bindings set up to work with Golem. 
This WASM is then packaged in our TypeScript SDK and published <a href="https://www.npmjs.com/package/@golemcloud/golem-ts-sdk">on npmjs.com</a>.</p> <p>It's clear that this way we can provide support for the fixed set of features Golem provides. But we still want our users to be able to define their own interfaces that can be invoked through <a href="https://learn.golem.cloud/invoke">Golem's invocation API</a>, bound to HTTP routes, or called through RPC from one agent to another. How can we do this while having a static WIT interface which is even hidden from our users?</p> <p>The answer is again a tradeoff - we give up some performance and composability coming from the component model to have something much more flexible and extensible.</p> <p>The idea is that every Golem component implements the following interface:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>package golem:agent; </span><span> </span><span>interface guest { </span><span> use common.{agent-error, agent-type, data-value}; </span><span> </span><span> /// Initializes the agent of a given type with the given constructor parameters. </span><span> /// If called a second time, it fails. </span><span> initialize: func(agent-type: string, input: data-value) -&gt; result&lt;_, agent-error&gt;; </span><span> </span><span> /// Invokes an agent. If create was not called before, it fails </span><span> invoke: func(method-name: string, input: data-value) -&gt; result&lt;data-value, agent-error&gt;; </span><span> </span><span> /// Gets the agent type. 
If create was not called before, it fails </span><span> get-definition: func() -&gt; agent-type; </span><span> </span><span> /// Gets the agent types defined by this component </span><span> discover-agent-types: func() -&gt; result&lt;list&lt;agent-type&gt;, agent-error&gt;; </span><span>} </span></code></pre> <p>This is quite low level and dynamic, so let's see what this means:</p> <ul> <li>Every Golem <strong>component</strong> can implement one or more <strong>agent types</strong>. The agent types are defined by the <code>agent-type</code> data type and the component can self-describe the set of agent types it implements, using the <code>discover-agent-types</code> exported function.</li> <li>Every <strong>instance</strong> of a Golem component (called a worker in previous Golem versions) is a single instance of one of the <strong>agent types</strong> implemented by the component.</li> <li>The instance is initialized by the <code>initialize</code> exported function - this selects the agent type the instance belongs to, and passes <strong>constructor parameters</strong> (in the form of a dynamic value of the <code>data-value</code> type). The initialize call is always the first call to an agent and it is automatically called by Golem itself.</li> <li>Once an agent is initialized, it can tell its own agent-type (with <code>get-definition</code>), and more importantly it can be <strong>invoked</strong> dynamically using the exported <code>invoke</code> function. This is dynamic in the sense that it is a single exported WIT function that takes the invoked <strong>agent method</strong>'s name as a string, and the parameters as an arbitrary <code>data-value</code> (just like with the constructor parameters). 
This is not type safe on the component level - but type safety is guaranteed by Golem on both the invocation side and the SDK side.</li> </ul> <p>To have a better sense of what this interface is capable of, let's take a look at some parts of the <code>agent-type</code> and <code>data-value</code> types.</p> <p>The <code>agent-type</code> record contains all the metadata available about an agent type, including its constructor and methods, with full type information. In addition, it can contain extra metadata, for example to help integration with AI systems.</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>record agent-type { </span><span> type-name: string, </span><span> description: string, </span><span> %constructor: agent-constructor, </span><span> methods: list&lt;agent-method&gt;, </span><span> dependencies: list&lt;agent-dependency&gt;, </span><span>} </span></code></pre> <p>Dependencies are not used at the moment, but they are going to allow us to statically know the dependency graph of agents. Both the constructor and the agent's methods use the <code>data-schema</code> type to describe their input (parameters) and output (return type). A <code>data-value</code>, mentioned earlier, is a value of a type described by a <code>data-schema</code>.</p> <p>In Golem 1.3, <code>data-schema</code> is still tightly coupled with the component model - it supports all the data types supported by the component model, but extends them with some concepts that are more agent specific. 
It's defined in the following way:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>variant data-schema { </span><span> /// List of named elements </span><span> %tuple(list&lt;tuple&lt;string, element-schema&gt;&gt;), </span><span> /// List of named variants that can be used 0 or more times in a multimodal `data-value` </span><span> multimodal(list&lt;tuple&lt;string, element-schema&gt;&gt;), </span><span>} </span><span> </span><span>variant element-schema { </span><span> component-model(wit-type), </span><span> unstructured-text(text-descriptor), </span><span> unstructured-binary(binary-descriptor), </span><span>} </span></code></pre> <p>Without going into much detail, an input or output of an agent method can be either a tuple of elements, or multimodal. The tuple case is the traditional case - for example a method with three parameters would have a schema describing a 3-tuple. Multimodal, on the other hand, can be thought of as a list of variant values, where each element of the multimodal schema can appear any number of times. 
A practical example of such an interface is a chat agent that accepts (and/or returns) content in multiple media formats, such as text, audio, or image.</p> <p>An element of these tuple or multimodal schemas can be one of the WebAssembly component model types (<code>wit-type</code>), an unstructured text (possibly annotated with a language code) or unstructured binary (annotated with a MIME type) data.</p> <h2 id="code-first-sdk">Code-first SDK</h2> <p>The above defined interface explains how we can define and implement multiple agent types without writing any WIT definitions, but using it directly would be very inconvenient.</p> <p>In TypeScript, when using <a href="https://github.com/golemcloud/wasm-rquickjs/">wasm-rquickjs</a>, the implementation would require writing the following functions:</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#a626a4;">export namespace </span><span>guest { </span><span> </span><span style="color:#a0a1a7;">/** </span><span style="color:#a0a1a7;"> * Initializes the agent of a given type with the given constructor parameters. </span><span style="color:#a0a1a7;"> * If called a second time, it fails. </span><span style="color:#a0a1a7;"> * </span><span style="color:#a626a4;">@throws</span><span style="color:#a0a1a7;"> AgentError </span><span style="color:#a0a1a7;"> */ </span><span> </span><span style="color:#a626a4;">export function </span><span style="color:#0184bc;">initialize</span><span>(</span><span style="color:#e45649;">agentType</span><span style="color:#a626a4;">: </span><span>string, </span><span style="color:#e45649;">input</span><span style="color:#a626a4;">: </span><span>DataValue)</span><span style="color:#a626a4;">: </span><span>Promise&lt;void&gt;; </span><span> </span><span style="color:#a0a1a7;">/** </span><span style="color:#a0a1a7;"> * Invokes an agent. 
If create was not called before, it fails </span><span style="color:#a0a1a7;"> * </span><span style="color:#a626a4;">@throws</span><span style="color:#a0a1a7;"> AgentError </span><span style="color:#a0a1a7;"> */ </span><span> </span><span style="color:#a626a4;">export function </span><span style="color:#0184bc;">invoke</span><span>(</span><span style="color:#e45649;">methodName</span><span style="color:#a626a4;">: </span><span>string, </span><span style="color:#e45649;">input</span><span style="color:#a626a4;">: </span><span>DataValue)</span><span style="color:#a626a4;">: </span><span>Promise&lt;DataValue&gt;; </span><span> </span><span style="color:#a0a1a7;">/** </span><span style="color:#a0a1a7;"> * Gets the agent type. If create was not called before, it fails </span><span style="color:#a0a1a7;"> */ </span><span> </span><span style="color:#a626a4;">export function </span><span style="color:#0184bc;">getDefinition</span><span>()</span><span style="color:#a626a4;">: </span><span>Promise&lt;AgentType&gt;; </span><span> </span><span style="color:#a0a1a7;">/** </span><span style="color:#a0a1a7;"> * Gets the agent types defined by this component </span><span style="color:#a0a1a7;"> * </span><span style="color:#a626a4;">@throws</span><span style="color:#a0a1a7;"> AgentError </span><span style="color:#a0a1a7;"> */ </span><span> </span><span style="color:#a626a4;">export function </span><span style="color:#0184bc;">discoverAgentTypes</span><span>()</span><span style="color:#a626a4;">: </span><span>Promise&lt;AgentType[]&gt;; </span><span>} </span></code></pre> <p>It's inconvenient to write these by hand, but it's just TypeScript code - we can just write a library on top of it that makes it more user friendly!</p> <h3 id="a-pure-typescript-approach">A pure TypeScript approach</h3> <p>One possibility would be to write a TypeScript library that exports functions to define the data schemas and agent type metadata, then connect an implementation to each. 
This can be made very type safe using advanced type level techniques - for example defining the schema would not only assemble a <code>DataSchema</code> value, but would also track the value type (such as a tuple of the agent method parameters) on the type system level. Then when attaching an actual implementation to the defined method, the compiler can infer that the parameters have these types.</p> <p>The SDK would expose some kind of global registry to define these well typed agents in, possibly using a builder-like fluent API. In the end it would just implement the above WIT exports itself using the registered agent definitions.</p> <p>Parts of this would be very similar to how some TypeScript libraries define schemas for validation. It is important to mention though that with Golem validating the input is not necessary - the runtime guarantees that the agent constructor and agent methods are only called with values matching the types from the agent type metadata.</p> <p>With a library like this, you could define the simplest possible stateful Golem agent, a counter, like this:</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#0184bc;">defineAgentType</span><span>({ </span><span> typeName: </span><span style="color:#50a14f;">&quot;counter&quot;</span><span>, </span><span> description: </span><span style="color:#50a14f;">&quot;An example Golem agent implementing a counter&quot;</span><span>, </span><span> id: { name: </span><span style="color:#0184bc;">type_string</span><span>() }, </span><span> </span><span style="color:#0184bc;">state</span><span>: (</span><span style="color:#e45649;">_id</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span>{ value: </span><span style="color:#c18401;">0 </span><span>}, </span><span> methods: [ </span><span> </span><span
style="color:#0184bc;">agentMethod</span><span>(</span><span style="color:#50a14f;">&quot;increment&quot;</span><span>, {}, { result: </span><span style="color:#0184bc;">type_u32</span><span>() }) </span><span> ] </span><span>}).</span><span style="color:#0184bc;">implement</span><span>({ </span><span> </span><span style="color:#0184bc;">initialize</span><span>: (</span><span style="color:#e45649;">name</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span>{ </span><span> </span><span style="color:#c18401;">console</span><span>.</span><span style="color:#0184bc;">log</span><span>(</span><span style="color:#50a14f;">`Counter ${</span><span style="color:#e45649;">name</span><span style="color:#50a14f;">} created`</span><span>) </span><span> }, </span><span> </span><span style="color:#0184bc;">increment</span><span>: (</span><span style="color:#e45649;">state</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span>{ </span><span> </span><span style="color:#e45649;">state</span><span>.value </span><span style="color:#a626a4;">+= </span><span style="color:#c18401;">1</span><span>; </span><span> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">state</span><span>.value; </span><span> } </span><span>}) </span></code></pre> <p>Here the agent's identity (constructor parameters), state and agent methods would be defined in terms of typed versions of <code>DataSchema</code>, with schema constructors such as <code>type_string()</code>. 
Then, in the <code>implement</code> call, the object passed would have to provide implementations for the constructor and the methods, with the data schema used to infer their parameter and return types.</p> <p>Note that this is just a sketch - we decided to <em>not</em> implement a library like this.</p> <h3 id="golem-s-typescript-sdk">Golem's TypeScript SDK</h3> <p>The approach we chose is to take advantage of <strong>decorators</strong> and the <a href="https://ts-morph.com">ts-morph library</a> to make writing the agents even more convenient. The primary advantage is <em>not</em> having to specify data schemas at all. When compiling the TypeScript code the types are extracted and made available to the SDK at runtime - it can transform the TypeScript AST to the matching <code>DataSchema</code> values and <code>AgentType</code> definitions, or fail in a user friendly way if something in the user's code is not supported.</p> <p>Before looking into the details, see what the same <em>counter</em> example looks like with the actual Golem TS SDK!</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#a626a4;">import </span><span>{ </span><span> </span><span style="color:#e45649;">BaseAgent</span><span>, </span><span> </span><span style="color:#e45649;">agent</span><span>, </span><span> </span><span style="color:#e45649;">description</span><span>, </span><span>} </span><span style="color:#a626a4;">from </span><span style="color:#50a14f;">&#39;@golemcloud/golem-ts-sdk&#39;</span><span>; </span><span> </span><span>@</span><span style="color:#0184bc;">agent</span><span>() </span><span style="color:#a626a4;">class </span><span style="color:#c18401;">CounterAgent </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">BaseAgent { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private readonly </span><span
style="color:#e45649;">name</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private </span><span style="color:#e45649;">value</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">number </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">0; </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">constructor</span><span style="color:#c18401;">(</span><span style="color:#e45649;">name</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">string) { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">super</span><span style="color:#c18401;">() </span><span style="color:#c18401;"> </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.name </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">name</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> @</span><span style="color:#0184bc;">description</span><span style="color:#c18401;">(</span><span style="color:#50a14f;">&quot;Increases the count by one and returns the new value&quot;</span><span style="color:#c18401;">) </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">async </span><span style="color:#0184bc;">increment</span><span style="color:#c18401;">()</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Promise&lt;number&gt; { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.value </span><span style="color:#a626a4;">+= </span><span style="color:#c18401;">1; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">this</span><span style="color:#c18401;">.value; </span><span 
style="color:#c18401;"> } </span><span style="color:#c18401;">} </span><span> </span></code></pre> <p>Every class annotated with <code>@agent</code> becomes an <strong>exported agent type</strong>. There are not many restrictions - they have to extend <code>BaseAgent</code>, and every type used in the constructor and the methods of this class must be something that the SDK can express with a <code>DataSchema</code>. But it is fully automatic - we can define and use complex custom data types in the agent's interface without having to manually write any schema for them.</p> <p>A component can define as many agent types (classes decorated with <code>@agent()</code>) as necessary. The only reason to have multiple <em>components</em> in an application is to have different update policies or other configuration for them.</p> <p>In addition to automatically converting these annotated classes into agent type definitions and their implementations, the SDK also provides support for <strong>remote agent calls</strong>. Every agent class gets a static method on it (put there by the decorator) called <code>get</code>. The get method has "get-or-create" semantics. Every agent is identified by its constructor parameters. There can be only one instance with a specific constructor parameter value. The get-or-create semantics of the <code>get</code> method guarantee that this is true (it creates a new agent if it did not exist yet, otherwise returns a reference to the existing one).</p> <p>In the above example our counter has a string identifier called <code>name</code>. As I wrote earlier, in Golem every component instance corresponds to a single agent. 
This means that referring to and calling any other agent (either of the same type or of another type) ends up being an "agent-to-agent" remote procedure call under the hood.</p> <p>With the SDK this is very convenient - we can use the <code>get</code> method to get a remote agent reference by just passing the constructor parameters to it, then call any of the agent's methods directly on the agent reference:</p> <pre data-lang="typescript" style="background-color:#fafafa;color:#383a42;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#a626a4;">const </span><span style="color:#e45649;">anotherCounter </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">CounterAgent</span><span>.</span><span style="color:#0184bc;">get</span><span>(</span><span style="color:#50a14f;">&quot;not-my-name&quot;</span><span>); </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">newValue </span><span style="color:#a626a4;">= await </span><span style="color:#e45649;">anotherCounter</span><span>.</span><span style="color:#0184bc;">increment</span><span>(); </span></code></pre> <p>What happens under the hood is that when compiling the TypeScript agent (using <code>golem-cli app build</code>), first a pre-compilation step called <code>golem-typegen</code> analyses the source code and emits a JSON document describing the agent classes and their method parameter and return types. 
This step uses <a href="https://ts-morph.com">ts-morph</a>, which is a wrapper around the TypeScript compiler API, to get the AST of the user code and extract the necessary information by traversing it.</p> <p>This generated JSON gets bundled into the final JS code and is used by the decorator logic to implement the <code>discoverAgentTypes</code>, <code>initialize</code> and <code>invoke</code> methods.</p> <h3 id="future">Future</h3> <p>Note that this approach of having the low-level agent interface and building SDKs on top means that even though we chose a specific approach we support as the official way of writing TypeScript agents for Golem, it is easy to experiment with alternative techniques and publish alternative SDKs. The official SDK can also be extended with different styles of agent definitions, if we decide to do so.</p> <p>It is also possible to experiment with supporting other languages. The next Golem release after 1.3 will bring back support for using Rust. With Rust we are getting the same code-first agent SDK as with TypeScript, only instead of ts-morph generated ASTs and decorators it is going to be built on proc macros and type classes.</p> <h2 id="composition">Composition</h2> <p>There are two additional interesting build steps hidden behind the scenes that we haven't talked about yet. Both are implemented using WebAssembly <strong>component composition</strong> - something that we no longer expose to our users (to avoid having to fall back to WASM tooling and hand-written WIT specs) but still use under the hood.</p> <h3 id="base-wasm-and-user-js">Base WASM and user JS</h3> <p>The compilation steps described above - <code>golem-typegen</code> and then compiling the TypeScript code itself - produce a single JS file. On the other hand we want to have a Golem component - which is a WASM component. 
I explained above that by restricting a Golem component to always implement a specific world, we can precompile the JavaScript engine with all the import and export bindings. This precompiled WASM is part of the <code>golem-ts-sdk</code> NPM package. We still have to somehow inject the compiled JavaScript file into this component!</p> <p>What we do is the following:</p> <ul> <li>The precompiled WASM <strong>imports</strong> a specific WIT interface with a single method that returns a JS string</li> <li>We generate <em>another WASM component</em> bundling the compiled JS and exposing a single <strong>export</strong> that returns this string.</li> <li>The import matches the export so we can <strong>compose</strong> the two WASM components into one.</li> </ul> <p>Generating raw WASM component bytecode might not be very difficult for this particular use case, but we wanted something more scalable. As you will see in the next section, this is not the only place where we needed to generate WASM on the fly.</p> <p>Instead we are using <a href="https://www.moonbitlang.com">MoonBit</a> to compile high level MoonBit source code directly into WASM. We can even embed the MoonBit compiler in Golem's CLI so there are no external dependencies for our users, thanks to the fact that the <a href="https://www.moonbitlang.com/blog/moonbit-wasm-compiler">compiler itself is running on WASM</a>. 
MoonBit is a high level and very exciting new programming language, and what makes it a perfect choice for this job is that it generates really concise WASM bytecode.</p> <p>I have written a small helper crate for Rust, <a href="https://github.com/golemcloud/moonbit-component-generator">moonbit-component-generator</a> that embeds the compiler and helps with generating these small wrapper components.</p> <p>For injecting the scripts, we basically give the WIT of the component we want to generate:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">let mut</span><span> component </span><span style="color:#a626a4;">= </span><span>MoonBitComponent::empty_from_wit( </span><span> </span><span style="color:#a626a4;">r</span><span style="color:#50a14f;">#&quot; </span><span style="color:#50a14f;"> package golem:script-source; </span><span style="color:#50a14f;"> </span><span style="color:#50a14f;"> world script-source { </span><span style="color:#50a14f;"> export get-script: func() -&gt; string; </span><span style="color:#50a14f;"> } </span><span style="color:#50a14f;"> &quot;#</span><span>, </span><span> Some(</span><span style="color:#50a14f;">&quot;script-source&quot;</span><span>), </span><span>)</span><span style="color:#a626a4;">?</span><span>; </span></code></pre> <p>Then we define the MoonBit bindings based on this WIT, and add our implementation as a MoonBit source string:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>component </span><span> .</span><span style="color:#0184bc;">define_bindgen_packages</span><span>() </span><span> .</span><span style="color:#0184bc;">context</span><span>(</span><span style="color:#50a14f;">&quot;Defining bindgen packages&quot;</span><span>)</span><span style="color:#a626a4;">?</span><span>; </span><span> 
</span><span style="color:#a626a4;">let mut</span><span> stub_mbt </span><span style="color:#a626a4;">= </span><span>String::new(); </span><span>uwriteln!(stub_mbt, </span><span style="color:#50a14f;">&quot;// Generated by `moonbit-component-generator`&quot;</span><span>); </span><span>uwriteln!(stub_mbt, </span><span style="color:#50a14f;">&quot;&quot;</span><span>); </span><span>uwriteln!(stub_mbt, </span><span style="color:#50a14f;">&quot;pub fn get_script() -&gt; String {{&quot;</span><span>); </span><span style="color:#a626a4;">for</span><span> line </span><span style="color:#a626a4;">in</span><span> script.</span><span style="color:#0184bc;">lines</span><span>() { </span><span> uwriteln!(stub_mbt, </span><span style="color:#50a14f;">&quot; #|{line}&quot;</span><span>); </span><span>} </span><span>uwriteln!(stub_mbt, </span><span style="color:#50a14f;">&quot;}}&quot;</span><span>); </span><span> </span><span>component </span><span> .</span><span style="color:#0184bc;">write_world_stub</span><span>(</span><span style="color:#a626a4;">&amp;</span><span>stub_mbt) </span><span> .</span><span style="color:#0184bc;">context</span><span>(</span><span style="color:#50a14f;">&quot;Writing world stub&quot;</span><span>)</span><span style="color:#a626a4;">?</span><span>; </span></code></pre> <p>And finally build the WASM component:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>component </span><span> .</span><span style="color:#0184bc;">build</span><span>(None, target) </span><span> .</span><span style="color:#0184bc;">context</span><span>(</span><span style="color:#50a14f;">&quot;Building component&quot;</span><span>)</span><span style="color:#a626a4;">?</span><span>; </span></code></pre> <p>The resulting WASM is ready to be composed with the prebuilt JavaScript engine. 
For this we can use the <a href="https://crates.io/crates/wac-graph">wac-graph</a> Rust crate (or the <code>wac</code> command line tool when doing it by hand).</p> <h3 id="wit-wrapper-for-agents">WIT wrapper for Agents</h3> <p>As I wrote in the <em>History</em> section of this post, Golem was built directly on the WASM component model. Its component exports and invocation mechanism directly depend on analyzing the component's WIT exports and providing ways to invoke them remotely.</p> <p>We've also created a <a href="https://learn.golem.cloud/rib">scripting language called <strong>Rib</strong></a> that is used for defining HTTP APIs on top of Golem components as well as for playing with them through a REPL; this scripting language uses WASM-specific syntax and naming conventions to allow users to call component model interfaces through Golem. Rib here is just used as an example of something still depending on component exports. For more information check the linked official documentation, or <a href="https://www.youtube.com/watch?v=vgrZxN0t-N0">Afsal Thaj's presentation from Wasm I/O 2025</a>.</p> <p>Although we decided to move our user experience to the higher level agent interface described in this post, many parts of Golem still depend on components exporting their interfaces on the component model level. Most of these can be evolved in future versions to directly know about the agent type metadata, etc., but this is going to be a migration process through multiple Golem releases.</p> <p>Until then, without some additional trick, we would be in a very bad situation when for example using Rib to define HTTP API mappings or just manually trying to invoke an agent method. The only relevant exported WIT interface is the low-level dynamic one - each invocation would need to be put together by passing agent type name and agent method name strings, and assembling parameter values by converting them to the <code>data-value</code> component model value representation. 
This would be completely unusable in practice.</p> <p>To avoid rewriting a large part of the system for this release, we generate <strong>a static wrapper</strong> as part of the build process. This wrapper exports a WIT interface that represents the user-defined agent types coming from code.</p> <p>The steps to do this are the following:</p> <ul> <li>First the user's code is compiled to JS and composed with the base WASM, as explained before</li> <li>Then we instantiate this WASM and call the <code>discoverAgentTypes</code> export - we get back the agent type metadata</li> <li>Using this we generate a <strong>WIT interface per agent type</strong>, with an <code>initialize</code> function representing the constructor and one exported function for each agent method</li> <li>We generate a <strong>MoonBit</strong> implementation of these interfaces. These implementations encode/decode the values into <code>data-value</code> and call the underlying component's dynamic <code>initialize</code> and <code>invoke</code> exports.</li> <li>Finally we compose this wrapper with the original component too. The resulting WASM will have all the same exports and imports as the original one, but in addition to that will also export static, well typed interfaces for each agent type defined in the TypeScript code.</li> </ul> <p>Even though this static wrapper will most likely not be needed in the next major Golem release, the technique may remain useful if we want to use a Golem component in the context of another WASM environment or tool.</p> Golem 1.3's new JavaScript engine 2025-09-19T00:00:00+00:00 2025-09-19T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/golem-new-js-engine/ <p>As we are rapidly approaching the release date for Golem 1.3, a major update, I'm going to publish a series of small posts talking about some of the technical details of this new release. 
In this first one, let's talk about the <em>new JavaScript engine</em>.</p> <h2 id="javascript-support-in-previous-versions">JavaScript support in previous versions</h2> <p>In previous Golem versions we tried to support JavaScript (and TypeScript) using the "official" way of using these languages in the WASM Component Model: using the <a href="https://github.com/bytecodealliance/ComponentizeJS">ComponentizeJS project</a>. This embeds a special version of the <a href="https://spidermonkey.dev">SpiderMonkey JS engine</a>, called StarlingMonkey, in a WASM component together with the user's JS code, and generates import and export bindings based on the component model interface definition (WIT). In addition to this, ComponentizeJS also does a <em>preinitialization step</em> - basically pre-running and snapshotting parts of the resulting component at compile time to make the component's initialization quicker.</p> <p>Although this all sounds very good, this project is still considered <em>experimental</em> and we ran into serious issues with it, especially around its implementation of <code>fetch</code> and async boundaries. We reported these issues, and also tried to fix some of them ourselves, but working on this project is extremely difficult and we did not reach a point where our users would be guaranteed to be able to build on top of these core JS APIs.</p> <h2 id="the-new-engine">The new engine</h2> <p>Instead of trying to fix ComponentizeJS or waiting for others to do so, we decided to try to <em>replace it</em> for the next Golem release. This worked out so well that we were able to refocus our language support to be primarily TypeScript for the next release.</p> <p>So what did I do?</p> <p>The goal was to have a similar solution - take the user's JS and an interface definition (<a href="https://component-model.bytecodealliance.org/design/wit.html">WIT</a>) and get a WebAssembly component implementing this interface by running the user's JavaScript code. 
But we wanted something that is significantly easier to work with, and easier to extend with more and more "built-in" JS APIs. This is important for us as we want people to be able to use as many existing libraries in their Golem applications as possible. There must be a trade-off somewhere, of course - and there are two that I'm going to talk about in detail. First, our new engine is expected to have worse performance than ComponentizeJS, although it has not been benchmarked yet; and the second one is the need for a Rust compiler toolchain to convert the JavaScript code to WASM. This, however, does not affect Golem users due to some other changes we introduced; more about it later.</p> <p>So with all these constraints, I ended up creating <a href="https://github.com/golemcloud/wasm-rquickjs">wasm-rquickjs</a>, with the following properties:</p> <ul> <li>It's built on the <a href="https://quickjs-ng.github.io/quickjs/">QuickJS-NG engine</a></li> <li>But, to make it much easier to maintain and extend, it is using this engine through Rust, using the excellent <a href="https://github.com/DelSkayn/rquickjs">rquickjs crate</a></li> <li>It generates glue code to bridge the JS world with the Rust bindings generated by <code>wit-bindgen-rust</code> for the component model exports and imports</li> <li>And also defines a growing set of built-in JS APIs, some implemented from scratch, others by taking various open-source polyfill libraries</li> </ul> <p>The result is a CLI tool (<code>wasm-rquickjs-cli</code>) and embeddable Rust library that takes a WIT world and a JS file, and generates a standalone Rust crate that, when compiled using <a href="https://github.com/bytecodealliance/cargo-component">cargo-component</a>, emits the WASM that we need.</p> <p>It also supports emitting TypeScript module definitions for all the imports and exports of the component.</p> <h2 id="details">Details</h2> <p>To understand why I chose to go with generating Rust crates and using the 
above-mentioned <code>rquickjs</code> library, let's take a closer look at how things are done within <code>wasm-rquickjs</code>.</p> <h3 id="defining-built-in-apis">Defining built-in APIs</h3> <p>We wanted to be able to easily grow the set of supported built-in APIs to improve compatibility with the existing JS ecosystem. Some of these APIs can be introduced with pure JS polyfill libraries, but many of them need to be implemented on top of imported WebAssembly system interfaces (WASI). A good example is implementing (a subset of) the <code>node:fs</code> API to work with files and filesystems.</p> <p>The <code>rquickjs</code> crate really makes this very easy to do - it has a convenient way to bind native Rust functions into the JavaScript context, and it also solves the difficult problem of bridging the world of JS promises with <em>async Rust</em>.</p> <p>This means we can write Rust functions in which we can use the Rust standard library or any other imported WIT bindings and then call these functions from JS. 
For example we can define a <code>read_file</code> function that exposes <code>std::fs::read</code> for JavaScript:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">rquickjs</span><span>::</span><span style="color:#e45649;">function</span><span>] </span><span style="color:#a626a4;">pub fn </span><span style="color:#0184bc;">read_file</span><span>(</span><span style="color:#e45649;">path</span><span>: String, </span><span style="color:#e45649;">ctx</span><span>: Ctx&lt;&#39;</span><span style="color:#a626a4;">_</span><span>&gt;) -&gt; List&lt;(Option&lt;TypedArray&lt;&#39;</span><span style="color:#a626a4;">_</span><span>, </span><span style="color:#a626a4;">u8</span><span>&gt;&gt;, Option&lt;String&gt;)&gt; { </span><span> </span><span style="color:#a626a4;">let</span><span> path </span><span style="color:#a626a4;">= </span><span>Path::new(</span><span style="color:#a626a4;">&amp;</span><span>path); </span><span> </span><span style="color:#a626a4;">match </span><span>std::fs::read(path) { </span><span> Ok(bytes) </span><span style="color:#a626a4;">=&gt; </span><span>{ </span><span> </span><span style="color:#a626a4;">let</span><span> typed_array </span><span style="color:#a626a4;">= </span><span> TypedArray::new_copy(ctx.</span><span style="color:#0184bc;">clone</span><span>(), </span><span style="color:#a626a4;">&amp;</span><span>bytes) </span><span> .</span><span style="color:#0184bc;">expect</span><span>(</span><span style="color:#50a14f;">&quot;Failed to create TypedArray&quot;</span><span>); </span><span> List((Some(typed_array), None)) </span><span> } </span><span> Err(err) </span><span style="color:#a626a4;">=&gt; </span><span>{ </span><span> </span><span style="color:#a626a4;">let</span><span> error_message </span><span style="color:#a626a4;">= </span><span>format!(</span><span style="color:#50a14f;">&quot;Failed to read 
file </span><span style="color:#c18401;">{path:?}</span><span style="color:#50a14f;">: </span><span style="color:#c18401;">{err}</span><span style="color:#50a14f;">&quot;</span><span>); </span><span> List((None, Some(error_message))) </span><span> } </span><span> } </span><span>} </span></code></pre> <p>Then the actual JavaScript API can be implemented in JS itself, using these native functions:</p> <pre data-lang="js" style="background-color:#fafafa;color:#383a42;" class="language-js "><code class="language-js" data-lang="js"><span style="color:#a626a4;">export function </span><span style="color:#0184bc;">readFile</span><span>(</span><span style="color:#e45649;">path</span><span>, </span><span style="color:#e45649;">optionsOrCallback</span><span>, </span><span style="color:#e45649;">callback</span><span>) { </span><span> </span><span style="color:#a0a1a7;">// ... </span><span> } </span><span style="color:#a626a4;">else </span><span>{ </span><span> </span><span style="color:#a626a4;">const </span><span>[</span><span style="color:#e45649;">contents</span><span>, </span><span style="color:#e45649;">error</span><span>] </span><span style="color:#a626a4;">= </span><span style="color:#0184bc;">read_file</span><span>(</span><span style="color:#e45649;">path</span><span>); </span><span> </span><span style="color:#a626a4;">if </span><span>(</span><span style="color:#e45649;">error </span><span style="color:#a626a4;">=== </span><span style="color:#c18401;">undefined</span><span>) { </span><span> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">buffer </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">Buffer</span><span>.</span><span style="color:#0184bc;">from</span><span>(</span><span style="color:#e45649;">contents</span><span>); </span><span> </span><span style="color:#0184bc;">callback</span><span>(</span><span style="color:#e45649;">buffer</span><span>); </span><span> } </span><span style="color:#a626a4;">else 
</span><span>{ </span><span> </span><span style="color:#0184bc;">callback</span><span>(</span><span style="color:#c18401;">undefined</span><span>, </span><span style="color:#e45649;">error</span><span>); </span><span> } </span><span> } </span><span>} </span></code></pre> <p>This makes it really convenient to add support for more and more APIs, and as mentioned earlier, these native functions can be <code>async</code> Rust functions too, which simply translate to async JS functions.</p> <p>For example, part of the <code>fetch</code> implementation is sending the request body asynchronously:</p> <pre data-lang="js" style="background-color:#fafafa;color:#383a42;" class="language-js "><code class="language-js" data-lang="js"><span style="color:#a626a4;">async function </span><span style="color:#0184bc;">sendBody</span><span>(</span><span style="color:#e45649;">bodyWriter</span><span>, </span><span style="color:#e45649;">body</span><span>) { </span><span> </span><span style="color:#a626a4;">const </span><span style="color:#e45649;">reader </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">body</span><span>.</span><span style="color:#0184bc;">getReader</span><span>(); </span><span> </span><span style="color:#a626a4;">while </span><span>(</span><span style="color:#c18401;">true</span><span>) { </span><span> </span><span style="color:#a626a4;">const </span><span>{</span><span style="color:#e45649;">done</span><span>, </span><span style="color:#e45649;">value</span><span>} </span><span style="color:#a626a4;">= await </span><span style="color:#e45649;">reader</span><span>.</span><span style="color:#0184bc;">read</span><span>(); </span><span> </span><span style="color:#a626a4;">if </span><span>(</span><span style="color:#e45649;">done</span><span>) </span><span style="color:#a626a4;">break</span><span>; </span><span> </span><span style="color:#a626a4;">await </span><span style="color:#e45649;">bodyWriter</span><span>.</span><span 
style="color:#0184bc;">writeRequestBodyChunk</span><span>(</span><span style="color:#e45649;">value</span><span>); </span><span> } </span><span> </span><span style="color:#e45649;">bodyWriter</span><span>.</span><span style="color:#0184bc;">finishBody</span><span>(); </span><span>} </span></code></pre> <p>The <code>writeRequestBodyChunk</code> method is a native Rust method defined like this:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">rquickjs</span><span>::</span><span style="color:#e45649;">methods</span><span>(rename_all </span><span style="color:#a626a4;">= </span><span style="color:#50a14f;">&quot;camelCase&quot;</span><span>)] </span><span style="color:#a626a4;">impl </span><span>WrappedRequestBodyWriter { </span><span> #[</span><span style="color:#e45649;">qjs</span><span>(constructor)] </span><span> </span><span style="color:#a626a4;">pub fn </span><span style="color:#0184bc;">new</span><span>() -&gt; </span><span style="color:#a626a4;">Self </span><span>{ </span><span> WrappedRequestBodyWriter { writer: None } </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">pub</span><span> async </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">write_request_body_chunk</span><span>(</span><span style="color:#a626a4;">&amp;mut </span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">chunk</span><span>: TypedArray&lt;&#39;</span><span style="color:#a626a4;">_</span><span>, </span><span style="color:#a626a4;">u8</span><span>&gt;) { </span><span> </span><span style="color:#a0a1a7;">// ... </span><span> } </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span>} </span></code></pre> <h3 id="implementing-imports">Implementing imports</h3> <p>With the above technique, we could have a precompiled WASM JS engine that is capable of running user code while providing a fixed set of supported APIs. This is what a similar project, <a href="https://github.com/second-state/wasmedge-quickjs">wasmedge-quickjs</a>, does.</p> <p>But <code>wasm-rquickjs</code> does not stop here - it uses the same method of defining JS modules with native Rust bindings to define a JS module for <em>each imported WIT interface</em>.</p> <p>So a code generator takes the WIT imports and emits Rust code in the style of the above examples that exposes these WIT imports to JavaScript by calling the Rust WIT bindings generated by <code>wit-bindgen-rust</code> (this happens automatically under the hood when using the already mentioned <code>cargo-component</code> build tool).</p> <p>Every data type WIT supports is mapped to a specific JS construct, and <em>resources</em> are mapped to JS classes. 
The following example shows the generated function for one of the exported functions of <code>golem:llm</code> <a href="https://github.com/golemcloud/golem-ai">from the Golem AI libraries</a>:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">rquickjs</span><span>::</span><span style="color:#e45649;">function</span><span>] </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">send</span><span>( </span><span> </span><span style="color:#e45649;">messages</span><span>: Vec&lt;crate::bindings::golem::llm::llm::Message&gt;, </span><span> </span><span style="color:#e45649;">config</span><span>: crate::bindings::golem::llm::llm::Config, </span><span>) -&gt; crate::bindings::golem::llm::llm::ChatEvent { </span><span> </span><span style="color:#a626a4;">let</span><span> result: </span><span style="color:#a626a4;">crate</span><span>::bindings::golem::llm::llm::ChatEvent </span><span style="color:#a626a4;">= crate</span><span>::bindings::golem::llm::llm::send( </span><span> messages.</span><span style="color:#0184bc;">into_iter</span><span>().</span><span style="color:#0184bc;">map</span><span>(|</span><span style="color:#e45649;">v</span><span>| v).collect::&lt;Vec&lt;</span><span style="color:#a626a4;">_</span><span>&gt;&gt;().</span><span style="color:#0184bc;">as_slice</span><span>(), </span><span> </span><span style="color:#a626a4;">&amp;</span><span>config, </span><span> ); </span><span> result </span><span>} </span></code></pre> <p>This simply uses <code>rquickjs</code>'s native binding macro to do the hard work, and calls the generated Rust bindings under the hood.</p> <p>Of course to make this work, <code>rquickjs</code> also needs to know how to encode these data types, such as the LLM <code>Message</code>, as JS. 
So the code generator also emits instances of the <code>IntoJs</code> and <code>FromJs</code> type classes, such as:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">impl</span><span>&lt;</span><span style="color:#a626a4;">&#39;js</span><span>&gt; rquickjs::IntoJs&lt;</span><span style="color:#a626a4;">&#39;js</span><span>&gt; </span><span style="color:#a626a4;">for </span><span>crate::bindings::golem::llm::llm::Message { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">into_js</span><span>( </span><span> </span><span style="color:#e45649;">self</span><span>, </span><span> </span><span style="color:#e45649;">ctx</span><span>: </span><span style="color:#a626a4;">&amp;</span><span>rquickjs::Ctx&lt;</span><span style="color:#a626a4;">&#39;js</span><span>&gt;, </span><span> ) -&gt; rquickjs::Result&lt;rquickjs::Value&lt;</span><span style="color:#a626a4;">&#39;js</span><span>&gt;&gt; { </span><span> </span><span style="color:#a626a4;">let</span><span> obj </span><span style="color:#a626a4;">= </span><span>rquickjs::Object::new(ctx.</span><span style="color:#0184bc;">clone</span><span>())</span><span style="color:#a626a4;">?</span><span>; </span><span> </span><span style="color:#a626a4;">let</span><span> role: </span><span style="color:#a626a4;">crate</span><span>::bindings::golem::llm::llm::Role </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">self</span><span>.role; </span><span> obj.</span><span style="color:#0184bc;">set</span><span>(</span><span style="color:#50a14f;">&quot;role&quot;</span><span>, role)</span><span style="color:#a626a4;">?</span><span>; </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span> Ok(obj.</span><span style="color:#0184bc;">into_value</span><span>()) </span><span> } </span><span>} </span><span> </span><span style="color:#a626a4;">impl</span><span>&lt;</span><span style="color:#a626a4;">&#39;js</span><span>&gt; rquickjs::FromJs&lt;</span><span style="color:#a626a4;">&#39;js</span><span>&gt; </span><span style="color:#a626a4;">for </span><span>crate::bindings::golem::llm::llm::Message { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">from_js</span><span>( </span><span> </span><span style="color:#e45649;">_ctx</span><span>: </span><span style="color:#a626a4;">&amp;</span><span>rquickjs::Ctx&lt;</span><span style="color:#a626a4;">&#39;js</span><span>&gt;, </span><span> </span><span style="color:#e45649;">value</span><span>: rquickjs::Value&lt;</span><span style="color:#a626a4;">&#39;js</span><span>&gt;, </span><span> ) -&gt; rquickjs::Result&lt;</span><span style="color:#a626a4;">Self</span><span>&gt; { </span><span> </span><span style="color:#a626a4;">let</span><span> obj </span><span style="color:#a626a4;">= </span><span>rquickjs::Object::from_value(value)</span><span style="color:#a626a4;">?</span><span>; </span><span> </span><span style="color:#a626a4;">let</span><span> role: </span><span style="color:#a626a4;">crate</span><span>::bindings::golem::llm::llm::Role </span><span style="color:#a626a4;">=</span><span> obj.</span><span style="color:#0184bc;">get</span><span>(</span><span style="color:#50a14f;">&quot;role&quot;</span><span>)</span><span style="color:#a626a4;">?</span><span>; </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span> } </span><span>} </span></code></pre> <p>The main difficulty was not generating these JS mappings - it was matching the expected signatures of <code>wit-bindgen-rust</code>, as it has some complex rules for deciding when to pass things by value or by reference.</p> <h3 id="implementing-exports">Implementing exports</h3> <p>For all the exported interfaces in a component's WIT definition, <code>wit-bindgen-rust</code> generates a <em>trait</em> to be implemented. We expect the JS developers to implement all these exports following some well defined rules (interfaces becoming exported objects, kebab-case names becoming camel cased, etc.). With the assumption that the user's JS code implements all the exports, <code>wasm-rquickjs</code> can generate implementations for these Rust traits that call into the QuickJS engine and run these functions.</p> <p>Part of the problem is very similar to what we have with imports - converting from the Rust types (coming from the WIT types) to JS types. 
This is done using the same conversion type classes we already talked about.</p> <p>When setting up the JS context, we always store a reference to the user's module in a global variable, so the generated export code can easily access it:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">let</span><span> module: Object </span><span style="color:#a626a4;">=</span><span> ctx.</span><span style="color:#0184bc;">globals</span><span>().</span><span style="color:#0184bc;">get</span><span>(</span><span style="color:#50a14f;">&quot;userModule&quot;</span><span>) </span><span> .</span><span style="color:#0184bc;">expect</span><span>(</span><span style="color:#50a14f;">&quot;Failed to get userModule&quot;</span><span>); </span></code></pre> <p>There are similar global helper tables for tracking the class instances for WIT resource instances.</p> <p>Once we have the module object, we can apply the naming rules and find the function value and call it with <code>rquickjs</code>:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">call_with_this</span><span>&lt;</span><span style="color:#a626a4;">&#39;js</span><span>, A, R&gt;( </span><span> </span><span style="color:#e45649;">ctx</span><span>: Ctx&lt;</span><span style="color:#a626a4;">&#39;js</span><span>&gt;, </span><span> </span><span style="color:#e45649;">function</span><span>: Function&lt;</span><span style="color:#a626a4;">&#39;js</span><span>&gt;, </span><span> </span><span style="color:#e45649;">this</span><span>: Object&lt;</span><span style="color:#a626a4;">&#39;js</span><span>&gt;, </span><span> </span><span style="color:#e45649;">args</span><span>: A, </span><span>) -&gt; rquickjs::Result&lt;R&gt; </span><span style="color:#a626a4;">where 
</span><span> A: IntoArgs&lt;</span><span style="color:#a626a4;">&#39;js</span><span>&gt;, </span><span> R: FromJs&lt;</span><span style="color:#a626a4;">&#39;js</span><span>&gt;, </span><span>{ </span><span> </span><span style="color:#a626a4;">let</span><span> num </span><span style="color:#a626a4;">=</span><span> args.</span><span style="color:#0184bc;">num_args</span><span>(); </span><span> </span><span style="color:#a626a4;">let mut</span><span> accum_args </span><span style="color:#a626a4;">= </span><span>Args::new(ctx.</span><span style="color:#0184bc;">clone</span><span>(), num </span><span style="color:#a626a4;">+ </span><span style="color:#c18401;">1</span><span>); </span><span> accum_args.</span><span style="color:#0184bc;">this</span><span>(this)</span><span style="color:#a626a4;">?</span><span>; </span><span> args.</span><span style="color:#0184bc;">into_args</span><span>(</span><span style="color:#a626a4;">&amp;mut</span><span> accum_args)</span><span style="color:#a626a4;">?</span><span>; </span><span> function.</span><span style="color:#0184bc;">call_arg</span><span>(accum_args) </span><span>} </span></code></pre> <p>A nice property we can offer is that we don't have to constrain the user to always implement the exported functions as async JavaScript functions. We can simply check the return value before trying to convert it to the Rust equivalent whether it is a <code>Promise</code> or not. 
And if it is, we can just <code>await</code> it in the Rust code!</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">if</span><span> value.</span><span style="color:#0184bc;">is_promise</span><span>() { </span><span> </span><span style="color:#a626a4;">let</span><span> promise: Promise </span><span style="color:#a626a4;">=</span><span> value.</span><span style="color:#0184bc;">into_promise</span><span>().</span><span style="color:#0184bc;">unwrap</span><span>(); </span><span> </span><span style="color:#a626a4;">let</span><span> promise_future </span><span style="color:#a626a4;">=</span><span> promise.into_future::&lt;R&gt;(); </span><span> </span><span style="color:#a626a4;">match</span><span> promise_future.await { </span><span> </span><span style="color:#a0a1a7;">// ... </span></code></pre> <h3 id="async-all-the-way-down">Async all the way down</h3> <p>This seamless integration of the JS and Rust async worlds is a key component in making <code>wasm-rquickjs</code> easy to work with. But it's not enough that <code>rquickjs</code> implements the boundary between JS and Rust. The end result is a WASM component, which is single threaded and only provides a very specific set of system APIs to build on; we cannot just use Tokio, for example, as our Rust runtime (at the time of writing). At the bottom of all the Rust and JS async stacks, there is a single small WASI API supporting all this: <code>wasi:io/poll</code>. <a href="https://blog.yoshuawuyts.com/building-an-async-runtime-for-wasi/">Yoshua Wuyts has an excellent blog post</a> about the topic. 
<code>wasm-rquickjs</code> builds on his <a href="https://docs.rs/wasi-async-runtime/latest/wasi_async_runtime/"><code>wasi_async_runtime</code></a> crate (and soon will be migrated to the newer <a href="https://docs.rs/wstd/latest/wstd/"><code>wstd</code> crate</a>).</p> <h2 id="trade-offs">Trade-offs</h2> <p>As I mentioned in the introduction, this approach naturally comes with some trade-offs compared to ComponentizeJs.</p> <h3 id="performance">Performance</h3> <p>We are not doing any precompilation at the moment, so component initialization time for bigger projects is expected to be slower. On the other hand the engine itself is much smaller than the modified SpiderMonkey in ComponentizeJs, so this may balance out the difference in some cases. I also expect SpiderMonkey to be faster in general than QuickJS, although this is not as clear <a href="https://cfallin.org/blog/2023/10/11/spidermonkey-pbl/">because SpiderMonkey also has to run in interpreter mode</a> on WASM.</p> <h3 id="rust-compilation">Rust compilation</h3> <p>A more serious trade-off is that by generating a Rust crate, we force the JS/TS users to have a Rust toolchain available and compile these generated crates to WASM.</p> <p>We've spent a lot of effort in the past year hiding the complexity of having these build tools, and especially having the <em>correct version</em> of WASM / component model related tools automatically set up and invoked, by hiding the component creation process in Golem's own CLI.</p> <p>Still, having to set up Rust to just run a simple JavaScript snippet on Golem is too much to ask. We worked around this issue by not allowing users to work directly on the component model level anymore - no WIT, no composition for them. This way we can embed a precompiled WASM binary in our tooling that can be combined with the user's JavaScript code to form a final WASM component. 
I am going to write a separate post about this decision and its technical details.</p> <h2 id="conclusion">Conclusion</h2> <p><a href="https://github.com/golemcloud/wasm-rquickjs/">wasm-rquickjs</a> turned out to be a very capable alternative to ComponentizeJs that is much easier to iterate on. It is a standalone project, completely usable outside of Golem; if the above two trade-offs are acceptable, it provides a nice experience of writing JavaScript or TypeScript code for the WASM Component Model.</p> [Video] Missing Testing Features in Rust @ LambdaConf 2025 2025-06-23T00:00:00+00:00 2025-06-23T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/missing-testing-features-in-rust/ <p>My talk at <a href="https://www.lambdaconf.us">LambdaConf 2025</a> about my Rust test framework <a href="https://test-r.vigoo.dev">test-r</a>.</p> <iframe width="800" height="450" src="https://www.youtube.com/embed/Yf5oIj816mw?list=PL7DZ7q3nEWhwo2OmeaMzNggy7sof9qg5p" title="Daniel Vigovszky - Missing Testing Features in Rust" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe> LambdaConf 2024-2025 - one year of Golem 2025-05-13T00:00:00+00:00 2025-05-13T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/one-year-of-golem/ <p>I'm at the last LambdaConf at the moment, and exactly a year ago I gave a <a href="/posts/golem-and-the-wasm-component-model">talk about Golem at LambdaConf 2024</a>. Someone asked me what has happened with Golem since then. So many things that I could not properly answer. So here is a short summary of one year's worth of Golem development, from LambdaConf 2024 to LambdaConf 2025.</p> <p>We had a Golem Hackathon at last year's LambdaConf, and had a fresh release of Golem for it which we called <strong>Golem 0.0.100</strong>. 
We spent the summer after that preparing the first production-ready release of Golem OSS, <strong>Golem 1.0</strong>, which we released on the 23rd of August, 2024. In the few months between the hackathon and Golem 1.0, we made Golem's oplog store more scalable, introduced <em>environment inheritance</em> between workers when they are spawned through RPC, and added a worker scheduler that tracks worker memory usage and suspends workers when necessary to keep the system responsive. The stability of the executor itself has been improved significantly. We tried to make the CLI for 1.0 more user friendly, created precompiled binaries of it so users don't have to compile it themselves, and created the first usable version of <strong>Rib</strong>, our scripting language used in Golem's API gateway.</p> <p>Rib itself did not stop there - we continued working on it in the rest of the year and it got much better type inference, error messages, new language features such as first-class worker support, list comprehensions and aggregations and so on.</p> <p>Our next milestone was <strong>Golem 1.1</strong>, which was released on the 9th of December, 2024. With this release we were no longer just targeting durable execution but realized that most applications also need components that are ephemeral - so we added support for <strong>ephemeral components</strong>, which are stateless programs getting a fresh instance for each incoming call. We added the concept of <strong>plugins</strong> in this release, although not fully complete yet, but with a vision of a future plugin ecosystem where these plugins can transform users' components and observe the living workers in real time. The API gateway got support for things like <strong>authentication</strong> and <strong>CORS</strong>, and we created tools to better observe the Golem worker's history by <strong>querying their oplog</strong> itself. 
This was the first release with the ability to add an <strong>initial file system</strong> for components.</p> <p>We were trying to make it easier for users, especially if they are not Rust developers, to use Golem. So we created precompiled, downloadable <strong>single executable Golem versions</strong> for local development.</p> <p>Golem 1.1 was also the first version introducing the <strong>Golem application manifest</strong>. This brings the concept of an <em>application</em>, consisting of one or more components, with the ability to describe (RPC) <strong>dependencies</strong> between these components in a declarative way. This significantly simplified how these multi-component Golem applications are built, especially the iterative development process.</p> <p>For <strong>Golem 1.2</strong> we decided to make this application manifest feature a core element of Golem development. We have redesigned the CLI to be based on the application concept, with single commands to build and deploy whole, multi-component Golem applications with support for dependencies between these components and allowing them to be written in different programming languages, even within a single application.</p> <p>We also improved our RPC solution so it no longer requires a working Rust toolchain. Instead we are <strong>linking</strong> the RPC clients dynamically in the executor.</p> <p>Other improvements added in the 3-4 months of development of Golem 1.2 include the first version of Golem's <strong>debugging</strong> service, which will allow interactive debugging and observation of running workers once it is done. Other features helping with the debugging of Golem code are the support for <strong>reverting</strong> workers and cancelling pending invocations. 
We have added support for a special kind of worker implementing an HTTP <strong>request handler</strong> (using the wasi-http interface) that can be directly mapped in the API gateway to various endpoints, and get the whole incoming HTTP request to be processed in the worker itself. We now also support <strong>scheduling</strong> invocations (even through RPC) to be done at an arbitrary point in time instead of being executed immediately.</p> <p>Golem 1.2 was released on the 27th of March, 2025.</p> <p>In the roughly 1.5 months since then we have been focusing on further improving the developer experience by updating our <strong>language support</strong> to the latest version of everything (especially the JavaScript and TypeScript, Python and Go tooling). We can now define <em>plugin installations</em> and <em>APIs</em> in the <strong>application manifest</strong> itself. We continued making the application manifest the primary way to work with Golem by further simplifying the CLI and enforcing that every Golem component is named the same as the WIT package it defines. We've introduced a <strong>new dependency type</strong> in the application manifest for directly depending on another WASM component, composing them at build time. This dependency type even supports downloading these WASM components from remote URLs, which is a nice way to use WASM components as language-independent libraries. The first such library we provide is <a href="https://github.com/golemcloud/golem-llm"><strong>golem-llm</strong></a>. Dependencies can now also be added using a CLI command for those who prefer this method. In addition to that, we improved the CLI's <strong>error messages</strong> significantly. Another small detail is that in Golem Cloud, accounts can now be referenced by their <strong>e-mail address</strong>, which allows us to, for example, share a project with other accounts by using their e-mail. 
Another small improvement is the ability to define <strong>environment variables</strong> on a per-component level now (not just per worker). A nice new feature available for Golem programs is the ability to <strong>fork</strong> workers.</p> <p>Rib continued to evolve, being more and more stable and having better error reporting. It is no longer just a scripting language to be used as glue code in the API gateway, but it is also integrated into the CLI as a <strong>REPL</strong>, a convenient way to interact with Golem workers.</p> <p>All these DX improvements are going to be released today as <strong>Golem 1.2.2</strong>, the version to be used on the LambdaConf 2025 hackathon.</p> [Video] Golem powered by WebAssembly @ Wasm I/O 2025 2025-04-04T00:00:00+00:00 2025-04-04T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/golem-powered-by-wasm/ <p>My talk at <a href="https://www.wasm.io">Wasm I/O 2025</a> explaining how <a href="https://golem.cloud">Golem</a> is built on WebAssembly and the Component Model.</p> <iframe width="800" height="450" src="https://www.youtube.com/embed/_oEhuFjTyeQ?si=zYPcpJLasEBV-vGE" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe> Durable Execution is not just for failures 2025-03-28T00:00:00+00:00 2025-03-28T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/durable-execution-is-not-just-for-failures/ <h2 id="introduction">Introduction</h2> <p>When talking about <a href="https://golem.cloud">Golem</a> or other <strong>durable execution engines</strong> the most important property we are always pointing out is that by making the application <em>durable</em>, it can automatically survive various failure scenarios. 
In case of a transient error, or some other external event such as updating or restarting the underlying servers, durable programs can survive by seamlessly continuing their execution from the point where they were interrupted, without any visible (except for some latency, of course) effect for the application's users.</p> <p>But having this core capability has many other interesting consequences.</p> <p>A durable program can be dropped out of memory any time without having to explicitly save its state or shut it down in any way - and whenever it is needed it can be automatically recovered and it continues from where it left off. The application developers can rely on very simple code storing everything in memory - as it is guaranteed that the in-memory state never gets lost.</p> <p>If a <strong>Golem worker</strong> (a running durable program) is not performing any active job at the moment - for example it is waiting to be invoked, or waiting for some scheduled event - it automatically gets dropped out of the executor's memory to make space for other workers. This means we can have an (almost arbitrarily) large number of "running" workers, if they are not performing CPU intensive tasks. Sure, having to continuously recover dropped out workers affects latency, but still, it means we can run this large number of simultaneous, stateful programs even on a locally started Golem on a developer machine.</p> <h2 id="demo">Demo</h2> <h3 id="setting-it-up">Setting it up</h3> <p>In this short blog post we are going to demonstrate this. We are going to start the latest version of Golem (1.2) locally, then use the CLI (and some <a href="https://www.nushell.sh">Nushell</a> snippets) to build, deploy and run a large number of workers.</p> <p>First we download the latest <code>golem</code> command line application <a href="https://learn.golem.cloud/quickstart">according to Golem's Quick Start pages</a>. 
With that we can start our local Golem cluster - all the core Golem services are integrated in this single <code>golem</code> binary:</p> <pre data-lang="nu" style="background-color:#fafafa;color:#383a42;" class="language-nu "><code class="language-nu" data-lang="nu"><span>golem server run </span></code></pre> <p>We are going to use the same <code>golem</code> CLI application to create, deploy and invoke Golem components.</p> <p>Next we create a new <em>golem application</em>:</p> <pre data-lang="nu" style="background-color:#fafafa;color:#383a42;" class="language-nu "><code class="language-nu" data-lang="nu"><span>golem app new manyworkers rust </span></code></pre> <p><img src="/images/2025-03-28/1.png" alt="" /></p> <p>Golem comes with a set of <strong>component templates</strong> for all supported languages. One of these templates is a simple <em>shopping cart</em> implementation in Rust, where each Golem worker (running instance of this component) represents a single shopping cart, keeping its contents in memory.</p> <p>We are going to create <strong>10</strong> (identical) versions of this template, simulating that we have more than one application running in a cluster. Even though they are going to be exactly the same to keep the post simple, from Golem's point of view they are going to be 10 different applications, compiled and deployed separately.</p> <p>Let's call the <code>golem component new</code> command 10 times in the newly generated application to set this up!</p> <pre data-lang="nu" style="background-color:#fafafa;color:#383a42;" class="language-nu "><code class="language-nu" data-lang="nu"><span>0..9 | each { |x| golem component new rust/example-shopping-cart $&quot;demo:cart(</span><span style="color:#a626a4;">$</span><span style="color:#e45649;">x</span><span>)&quot; } </span></code></pre> <p>This command created 10 components in our application, with names <code>demo:cart0</code> to <code>demo:cart9</code>. 
First let's build and deploy these components:</p> <pre data-lang="nu" style="background-color:#fafafa;color:#383a42;" class="language-nu "><code class="language-nu" data-lang="nu"><span>golem app build </span><span>golem app deploy </span></code></pre> <p><img src="/images/2025-03-28/2.png" alt="" /></p> <p>To see the interface of this example, let's query one using <code>component get</code>:</p> <pre data-lang="nu" style="background-color:#fafafa;color:#383a42;" class="language-nu "><code class="language-nu" data-lang="nu"><span>golem component get demo:cart0 </span></code></pre> <p><img src="/images/2025-03-28/3.png" alt="" /></p> <p>Before spawning our thousands of workers, we try out this exported interface by creating a single worker of <code>demo:cart0</code> called <code>test</code> and calling a few methods in it:</p> <pre data-lang="nu" style="background-color:#fafafa;color:#383a42;" class="language-nu "><code class="language-nu" data-lang="nu"><span> golem worker invoke demo:cart0/test initialize-cart &#39;&quot;user1&quot;&#39; </span></code></pre> <p><img src="/images/2025-03-28/4.png" alt="" /></p> <pre data-lang="nu" style="background-color:#fafafa;color:#383a42;" class="language-nu "><code class="language-nu" data-lang="nu"><span>golem worker invoke demo:cart0/test add-item &#39;{ product-id: &quot;p1&quot;, name: &quot;Example product&quot;, price: 1000.0, quantity: 2 }&#39; </span></code></pre> <p><img src="/images/2025-03-28/5.png" alt="" /></p> <pre data-lang="nu" style="background-color:#fafafa;color:#383a42;" class="language-nu "><code class="language-nu" data-lang="nu"><span>golem worker invoke demo:cart0/test get-cart-contents </span></code></pre> <p><img src="/images/2025-03-28/6.png" alt="" /></p> <p>For some more context, we can also check the size of the compiled WASM files (we were doing a debug build so they are relatively large) for these components:</p> <p><img src="/images/2025-03-28/7.png" alt="" /></p> <p>We can also query 
metadata of the created worker to get the same size information, and it is also going to tell us the amount of <strong>memory</strong> the instance allocates on startup:</p> <pre data-lang="nu" style="background-color:#fafafa;color:#383a42;" class="language-nu "><code class="language-nu" data-lang="nu"><span>golem worker get demo:cart0/test </span></code></pre> <p><img src="/images/2025-03-28/9.png" alt="" /></p> <p>And we can query the test worker's <em>oplog</em> to get an idea of how much additional memory it allocated dynamically at runtime:</p> <pre data-lang="nu" style="background-color:#fafafa;color:#383a42;" class="language-nu "><code class="language-nu" data-lang="nu"><span>golem worker oplog demo:cart0/test --query memory </span></code></pre> <p><img src="/images/2025-03-28/8.png" alt="" /></p> <h3 id="spawning-many-workers">Spawning many workers</h3> <p>Now that we have seen what a single worker looks like, let's spawn 1000 workers of each test component. This is going to take some time as it actually <strong>instantiates</strong> the WASM program for each to make the initial two invocations.</p> <pre data-lang="nu" style="background-color:#fafafa;color:#383a42;" class="language-nu "><code class="language-nu" data-lang="nu"><span>mut j = 0; </span><span>loop { </span><span> mut i = 0; </span><span> loop { </span><span> golem worker new $&quot;demo:cart(</span><span style="color:#a626a4;">$</span><span style="color:#e45649;">i</span><span>)/(</span><span style="color:#a626a4;">$</span><span style="color:#e45649;">j</span><span>)&quot;; </span><span> golem worker invoke $&quot;demo:cart(</span><span style="color:#a626a4;">$</span><span style="color:#e45649;">i</span><span>)/(</span><span style="color:#a626a4;">$</span><span style="color:#e45649;">j</span><span>)&quot; initialize-cart &#39;&quot;user1&quot;&#39;; </span><span> golem worker invoke $&quot;demo:cart(</span><span style="color:#a626a4;">$</span><span style="color:#e45649;">i</span><span>)/(</span><span 
style="color:#a626a4;">$</span><span style="color:#e45649;">j</span><span>)&quot; add-item $&quot;{ product-id: \&quot;p1\&quot;, name: \&quot;Example product (</span><span style="color:#a626a4;">$</span><span style="color:#e45649;">j</span><span>)/(</span><span style="color:#a626a4;">$</span><span style="color:#e45649;">i</span><span>)\&quot;, price: 1000.0, quantity: 2 }&quot;; </span><span> </span><span> </span><span style="color:#a626a4;">if $</span><span style="color:#e45649;">i</span><span> &gt;= 9 { break; }; </span><span> </span><span style="color:#a626a4;">$</span><span style="color:#e45649;">i</span><span> = </span><span style="color:#a626a4;">$</span><span style="color:#e45649;">i</span><span> + 1; </span><span> } </span><span> </span><span style="color:#a626a4;">if $</span><span style="color:#e45649;">j</span><span> &gt;= 999 { break; }; </span><span> </span><span style="color:#a626a4;">$</span><span style="color:#e45649;">j</span><span> = </span><span style="color:#a626a4;">$</span><span style="color:#e45649;">j</span><span> + 1; </span><span>} </span></code></pre> <p>After that, we have 10000 "running" workers (all idle, waiting for a next invocation). We can check by listing for example one of the component's workers:</p> <pre data-lang="nu" style="background-color:#fafafa;color:#383a42;" class="language-nu "><code class="language-nu" data-lang="nu"><span>golem worker list demo:cart5 </span></code></pre> <p><img src="/images/2025-03-28/10.png" alt="" /></p> <p>Of course only some of these workers (the last accessed ones) are really in the locally running executor's memory. Whenever a worker that's not in memory is going to be accessed, it is loaded and its state is transparently restored before it gets the request. 
Golem is tracking the resource usage of its running components and if there is not enough memory to load the new component, an old one is going to be dropped out.</p> <h3 id="trying-it-out">Trying it out</h3> <p>To demonstrate this, we can just invoke workers randomly from the 10000 we've created:</p> <p><img src="/images/2025-03-28/11.png" alt="" /></p> <p>Thanks to the durable execution model, every one of the 10000 workers reacts just as if it was running.</p> Using MoonBit with Golem Cloud 2025-01-03T00:00:00+00:00 2025-01-03T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/moonbit-with-golem/ <h2 id="introduction">Introduction</h2> <p><a href="https://www.moonbitlang.com">MoonBit</a>, a new programming language, was open sourced a few weeks ago - see <a href="https://www.moonbitlang.com/blog/compiler-opensource">this blog post</a>. MoonBit is an exciting modern programming language that natively supports WebAssembly, including the component model - this makes it a perfect fit for writing applications for <a href="https://golem.cloud">Golem Cloud</a>.</p> <p>In this post I'm exploring the current state of MoonBit and whether it is ready for writing Golem components, by implementing an example application more complex than a simple "hello world" example.</p> <p>The application to be implemented is a simple collaborative list editor - at the <a href="https://youtu.be/11Cig1iH6S0">launch event of Golem 1.0</a> I live-coded the same example using three different programming languages (TypeScript, Rust and Go) for the three main modules it requires. In this post I am implementing all three using <strong>MoonBit</strong>, including the e-mail sending feature that was omitted from the live demo due to time constraints.</p> <p>The application can handle an arbitrary number of simultaneously open <strong>lists</strong>. Each list consists of a list of string items. 
These items can be appended, inserted and deleted simultaneously by multiple users; the current list state can be queried any time, as well as the active connections (users who can perform editing operations on the list). Modification is only allowed for connected editors, and there is a <code>poll</code> function exposed for them which returns the new changes since the last poll call. Lists can be archived, in which case they are no longer editable and their contents are saved in a separate <strong>list archive</strong>. Then the list itself can be deleted, while its last state remains stored forever in the archive. An additional feature is that if a list is <em>not archived</em> and there were no changes for a certain period of time, all the connected editors are notified by sending an <strong>email</strong> to them.</p> <h2 id="golem-architecture">Golem Architecture</h2> <p>In Golem a good architecture to run this is to have three different <strong>golem components</strong>:</p> <ul> <li>the list</li> <li>the archive</li> <li>the email notifier</li> </ul> <p>These are compiled WebAssembly components, each exporting a distinct set of functions. Golem provides APIs to invoke these functions from the external world (for example mapping them to an HTTP API) and also allows <strong>workers</strong> (instances of these components) to invoke each other. A component can have an arbitrary number of instances, each such worker being identified by a unique name.</p> <p>We can use this feature to have a very simple and straightforward implementation of the list editor - each document (editable list) will be mapped to its own worker, identified by the list's identifier. 
This way our list component only has to deal with a single list; scaling it up to handle multiple (possibly even millions of) lists is done automatically by Golem.</p> <p>For archiving lists, we want to store each archived list in a single place - so we are going to have only a single instance of our archive component, where each archived list's information is sent. This singleton worker can store the archived lists in some database if needed - but because of Golem's durable execution guarantees, it is enough to just store them in memory (one important exception is if we want to store a really large number of archived lists that does not fit in a single worker's memory). Golem guarantees that the worker's state is restored after any failure or rescaling event, so the archive component can really remain very simple.</p> <p>Finally, because Golem workers are single threaded and do not support async calls overlapping with their invocations at the moment, we need a third component to implement the delayed email sending functionality. There will be an <strong>email sending worker</strong> corresponding to each <strong>list worker</strong> and this worker will be suspended for an extended period of time (the amount we want to wait before sending out the email). Again, because of Golem's durable execution feature we can just "sleep" for an arbitrarily long time in this component and we don't need to care about what can happen to our execution environment during that long period.</p> <h2 id="initial-moonbit-implementation">Initial MoonBit implementation</h2> <p>Before going into details of how to develop Golem components with MoonBit, let's try to implement the above described components in this new language, without any Golem or WebAssembly specifics.</p> <p>First we create a new <code>lib</code> project using <code>moon new</code>. This creates a new <strong>project</strong> with a single <strong>package</strong>. 
To match our architecture let's start by creating multiple packages, one for each component to develop (<code>list</code>, <code>archive</code>, <code>email</code>)</p> <p>We create a folder for each package, with a <code>moon.pkg.json</code> in each:</p> <pre data-lang="json" style="background-color:#fafafa;color:#383a42;" class="language-json "><code class="language-json" data-lang="json"><span>{ </span><span> </span><span style="color:#50a14f;">&quot;import&quot;</span><span>: [ </span><span> ] </span><span>} </span></code></pre> <h3 id="list-model">List model</h3> <p>Let's start by modelling our <strong>list</strong>. The edited "document" itself is just an array of strings:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a626a4;">struct </span><span>Document { </span><span> </span><span style="color:#a626a4;">mut </span><span style="color:#e45649;">items</span><span>: Array[String] </span><span>} </span></code></pre> <p>We can implement <strong>methods</strong> on <code>Document</code> corresponding to the document editing operations we want to support. 
On this level we don't care about collaborative editing or connected users, just model our document as a pure data structure:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Creates an empty document </span><span style="color:#a626a4;">pub</span><span> fn Document::new() </span><span style="color:#a626a4;">-&gt; </span><span>Document { </span><span> { </span><span style="color:#e45649;">items</span><span>: [] } </span><span>} </span><span> </span><span style="color:#a0a1a7;">///| Adds a new item to the document </span><span style="color:#a626a4;">pub</span><span> fn add(</span><span style="color:#e45649;">self</span><span> : Document, </span><span style="color:#e45649;">item</span><span> : String) </span><span style="color:#a626a4;">-&gt; </span><span>Unit { </span><span> </span><span style="color:#a626a4;">if </span><span style="color:#e45649;">self</span><span>.items.search(</span><span style="color:#e45649;">item</span><span>).is_empty() { </span><span> </span><span style="color:#e45649;">self</span><span>.items.push(</span><span style="color:#e45649;">item</span><span>) </span><span> } </span><span>} </span><span> </span><span style="color:#a0a1a7;">///| Deletes an item from the document </span><span style="color:#a626a4;">pub</span><span> fn delete(</span><span style="color:#e45649;">self</span><span> : Document, </span><span style="color:#e45649;">item</span><span> : String) </span><span style="color:#a626a4;">-&gt; </span><span>Unit { </span><span> </span><span style="color:#e45649;">self</span><span>.items </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">self</span><span>.items.filter(fn(</span><span style="color:#e45649;">i</span><span>) { </span><span style="color:#e45649;">item </span><span style="color:#a626a4;">!= </span><span style="color:#e45649;">i</span><span> }) </span><span>} 
</span><span> </span><span style="color:#a0a1a7;">///| Inserts an item to the document after an existing item. If `after` is not in the document, the new item is inserted at the end. </span><span style="color:#a626a4;">pub</span><span> fn insert(</span><span style="color:#e45649;">self</span><span> : Document, </span><span style="color:#e45649;">after</span><span>~ : String, </span><span style="color:#e45649;">value</span><span>~ : String) </span><span style="color:#a626a4;">-&gt; </span><span>Unit { </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">index </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">self</span><span>.items.search(</span><span style="color:#e45649;">after</span><span>) </span><span> </span><span style="color:#a626a4;">match </span><span style="color:#e45649;">index</span><span> { </span><span> Some(</span><span style="color:#e45649;">index</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span style="color:#e45649;">self</span><span>.items.insert(</span><span style="color:#e45649;">index </span><span style="color:#a626a4;">+ </span><span style="color:#c18401;">1</span><span>, </span><span style="color:#e45649;">value</span><span>) </span><span> None </span><span style="color:#a626a4;">=&gt; </span><span style="color:#e45649;">self</span><span>.add(</span><span style="color:#e45649;">value</span><span>) </span><span> } </span><span>} </span><span> </span><span style="color:#a0a1a7;">///| Gets a view of the document&#39;s items </span><span style="color:#a626a4;">pub</span><span> fn get(</span><span style="color:#e45649;">self</span><span> : Document) </span><span style="color:#a626a4;">-&gt; </span><span>ArrayView[String] { </span><span> </span><span style="color:#e45649;">self</span><span>.items[:] </span><span>} </span><span> </span><span style="color:#a0a1a7;">///| Iterates the items in the document </span><span style="color:#a626a4;">pub</span><span> fn 
iter(</span><span style="color:#e45649;">self</span><span> : Document) </span><span style="color:#a626a4;">-&gt; </span><span>Iter[String] { </span><span> </span><span style="color:#e45649;">self</span><span>.items.iter() </span><span>} </span></code></pre> <p>We can also use MoonBit's built-in test feature to write unit tests for this. The following test contains an assertion that the initial document is empty:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a626a4;">test </span><span style="color:#50a14f;">&quot;</span><span>new document is empty</span><span style="color:#50a14f;">&quot;</span><span> { </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">empty </span><span style="color:#a626a4;">= </span><span>Document::new() </span><span> assert_eq!(</span><span style="color:#e45649;">empty</span><span>.items, []) </span><span>} </span></code></pre> <p>With the <code>inspect</code> function tests can use <strong>snapshot values</strong> to compare values with. 
The <code>moon</code> CLI tool and the IDE integration provide a way to automatically update the snapshot values (the <code>content=</code> part) in these test functions when needed:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a626a4;">test </span><span style="color:#50a14f;">&quot;</span><span>basic document operations</span><span style="color:#50a14f;">&quot;</span><span> { </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">doc </span><span style="color:#a626a4;">= </span><span>Document::new() </span><span> ..add(</span><span style="color:#50a14f;">&quot;</span><span>x</span><span style="color:#50a14f;">&quot;</span><span>) </span><span> ..add(</span><span style="color:#50a14f;">&quot;</span><span>y</span><span style="color:#50a14f;">&quot;</span><span>) </span><span> ..add(</span><span style="color:#50a14f;">&quot;</span><span>z</span><span style="color:#50a14f;">&quot;</span><span>) </span><span> ..insert(</span><span style="color:#e45649;">after</span><span style="color:#a626a4;">=</span><span style="color:#50a14f;">&quot;</span><span>y</span><span style="color:#50a14f;">&quot;</span><span>, </span><span style="color:#e45649;">value</span><span style="color:#a626a4;">=</span><span style="color:#50a14f;">&quot;</span><span>w</span><span style="color:#50a14f;">&quot;</span><span>) </span><span> ..insert(</span><span style="color:#e45649;">after</span><span style="color:#a626a4;">=</span><span style="color:#50a14f;">&quot;</span><span>a</span><span style="color:#50a14f;">&quot;</span><span>, </span><span style="color:#e45649;">value</span><span style="color:#a626a4;">=</span><span style="color:#50a14f;">&quot;</span><span>b</span><span style="color:#50a14f;">&quot;</span><span>) </span><span> ..delete(</span><span style="color:#50a14f;">&quot;</span><span>z</span><span 
style="color:#50a14f;">&quot;</span><span>) </span><span> ..delete(</span><span style="color:#50a14f;">&quot;</span><span>f</span><span style="color:#50a14f;">&quot;</span><span>) </span><span> inspect!( </span><span> </span><span style="color:#e45649;">doc</span><span>.get(), </span><span> </span><span style="color:#e45649;">content</span><span style="color:#a626a4;">= </span><span> </span><span style="color:#50a14f;">#|[&quot;x&quot;, &quot;y&quot;, &quot;w&quot;, &quot;b&quot;] </span><span> , </span><span> ) </span><span>} </span></code></pre> <h3 id="list-editor-state">List editor state</h3> <p>The next step is to implement the editor state management on top of this <code>Document</code> type. As a reminder, we decided that every instance (Golem worker) of the list component will only be responsible for editing a single list. So we don't need to care about storing and indexing the lists, or routing connections to the corresponding node where the list state is - this is all going to be managed by Golem.</p> <p>What we need to do, however, is write stateful code to handle connecting and disconnecting users ("editors"), add some validation on top of the document editing API so only connected editors can make changes, and collect change events for the polling API.</p> <p>We can start by defining a new datatype holding our document editing state:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Document state </span><span style="color:#a626a4;">struct </span><span>State { </span><span> </span><span style="color:#e45649;">document</span><span> : Document </span><span> </span><span style="color:#e45649;">connected</span><span> : Map[ConnectionId, EditorState] </span><span> </span><span style="color:#a626a4;">mut </span><span style="color:#e45649;">last_connection_id</span><span> : ConnectionId </span><span> </span><span 
style="color:#a626a4;">mut </span><span style="color:#e45649;">archived</span><span> : Bool </span><span> </span><span style="color:#a626a4;">mut </span><span style="color:#e45649;">email_deadline</span><span> : @datetime.DateTime </span><span> </span><span style="color:#a626a4;">mut </span><span style="color:#e45649;">email_recipients</span><span> : Array[EmailAddress] </span><span>} </span></code></pre> <p>Besides the actual document we are going to store:</p> <ul> <li>A map of connected editors, with some per-editor state associated with them</li> <li>The last used connection ID so we can always generate a new unique one</li> <li>Whether the document has been archived or not</li> <li>When we should send out the email notification, and to which recipients</li> </ul> <p>So far we have only defined the <code>Document</code> type, so let's continue by specifying all the other types used in <code>State</code>'s fields.</p> <p><code>ConnectionId</code> is going to be a <strong>newtype</strong> wrapping an integer:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Identifier of a connected editor </span><span style="color:#a626a4;">type </span><span>ConnectionId Int </span><span style="color:#a626a4;">derive</span><span>(</span><span style="color:#c18401;">Eq</span><span>, </span><span style="color:#c18401;">Hash</span><span>) </span><span> </span><span style="color:#a0a1a7;">///| Generates a next unique connection ID </span><span>fn next(</span><span style="color:#e45649;">self</span><span> : ConnectionId) </span><span style="color:#a626a4;">-&gt; </span><span>ConnectionId { </span><span> ConnectionId(</span><span style="color:#e45649;">self</span><span>._ </span><span style="color:#a626a4;">+ </span><span style="color:#c18401;">1</span><span>) </span><span>} </span></code></pre> <p>We want to use this type as a <strong>key</strong> 
of a <code>Map</code> so we need instances of the <code>Eq</code> and <code>Hash</code> type classes. MoonBit can derive them for us automatically for newtypes. In addition to that, we also define a method called <code>next</code> that generates a new connection ID with an incremented value.</p> <p>The <code>EditorState</code> structure holds information for each connected editor. To keep things simple, we only store the editor's <strong>email address</strong> and a buffer of change events since the last call to <code>poll</code>.</p> <p>An email address is a newtype of a <code>String</code>:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Email address of a connected editor </span><span style="color:#a626a4;">type </span><span>EmailAddress String </span></code></pre> <p>The <code>Change</code> enum describes the possible changes made to the document:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| An observable change of the edited document </span><span style="color:#a626a4;">enum </span><span>Change { </span><span> Added(String) </span><span> Deleted(String) </span><span> Inserted(</span><span style="color:#e45649;">after</span><span>~ : String, </span><span style="color:#e45649;">value</span><span>~ : String) </span><span>} </span><span style="color:#a626a4;">derive</span><span>(</span><span style="color:#c18401;">Show</span><span>) </span></code></pre> <p>Deriving <code>Show</code> (or implementing it by hand) makes it possible to use the <code>inspect</code> test function to compare string snapshots of arrays of changes with the results of our <code>poll</code> function.</p> <p>Finally, let's define <code>EditorState</code> using these two new types:</p> <pre data-lang="moonbit" 
style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| State per connected editor </span><span style="color:#a626a4;">struct </span><span>EditorState { </span><span> </span><span style="color:#e45649;">email</span><span> : EmailAddress </span><span> </span><span style="color:#a626a4;">mut </span><span style="color:#e45649;">events</span><span> : Array[Change] </span><span>} </span></code></pre> <p>The <code>email</code> field of a connected editor never changes - but the <code>events</code> array does, as every call to <code>poll</code> resets it so the next poll returns only the new changes. To be able to do this, we have to mark the field as <code>mut</code>.</p> <p>The last new type we need to introduce for <code>State</code> is something representing a point in time. MoonBit's <code>core</code> standard library does not currently have anything for this, but there is already a package database, <a href="https://mooncakes.io">mooncakes</a>, with published MoonBit packages. Here we can find a <a href="https://mooncakes.io/docs/#/suiyunonghen/datetime/">package called <code>datetime</code></a>. 
Adding it to our project can be done with the <code>moon</code> CLI:</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span> </span><span style="color:#e45649;">moon</span><span> add suiyunonghen/datetime </span></code></pre> <p>and then importing it into the <code>list</code> package by modifying its <code>moon.pkg.json</code>:</p> <pre data-lang="json" style="background-color:#fafafa;color:#383a42;" class="language-json "><code class="language-json" data-lang="json"><span>{ </span><span> </span><span style="color:#50a14f;">&quot;import&quot;</span><span>: [ </span><span> </span><span style="color:#50a14f;">&quot;suiyunonghen/datetime&quot; </span><span> ] </span><span>} </span></code></pre> <p>With this we can refer to the <code>DateTime</code> type in this package using <code>@datetime.DateTime</code>.</p> <p>Before starting to implement methods for <code>State</code>, we have to think about error handling too - some of the operations on <code>State</code> may fail, for example if a wrong connection ID is used, or a document editing operation comes in for an already archived list. MoonBit has built-in support for error handling, and it starts by defining our own error type in the following way:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Error type for editor state operations </span><span style="color:#a626a4;">type! 
</span><span>EditorError { </span><span> </span><span style="color:#a0a1a7;">///| Error returned when an invalid connection ID is used </span><span> InvalidConnection(ConnectionId) </span><span> </span><span style="color:#a0a1a7;">///| Error when trying to modify an already archived document </span><span> AlreadyArchived </span><span>} </span></code></pre> <p>With this we are ready to implement the collaborative list editor! I'm not going to list <em>all</em> the methods of <code>State</code> in this post, but the full source code is available <a href="https://github.com/vigoo/golem-moonbit-example">on GitHub</a>.</p> <p>The <code>connect</code> method associates a new connection ID with the connected user, and also returns the current document state. This is important to be able to use the results of <code>poll</code> - the returned list of changes has to be applied to exactly this document state on the client side.</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Connects a new editor </span><span style="color:#a626a4;">pub</span><span> fn connect( </span><span> </span><span style="color:#e45649;">self</span><span> : State, </span><span> </span><span style="color:#e45649;">email</span><span> : EmailAddress </span><span>) </span><span style="color:#a626a4;">-&gt;</span><span> (ConnectionId, ArrayView[String]) { </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">connection_id </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">self</span><span>.last_connection_id.next() </span><span> </span><span style="color:#e45649;">self</span><span>.last_connection_id </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">connection_id </span><span> </span><span style="color:#e45649;">self</span><span>.connected.set(</span><span 
style="color:#e45649;">connection_id</span><span>, EditorState::new(</span><span style="color:#e45649;">email</span><span>)) </span><span> (</span><span style="color:#e45649;">connection_id</span><span>, </span><span style="color:#e45649;">self</span><span>.document.get()) </span><span>} </span></code></pre> <p>The <em>editing operations</em> are more interesting. They build on top of the editing operations we already defined for <code>Document</code>, but in addition to that, they all perform the following tasks:</p> <ul> <li>Validating the connection ID</li> <li>Validating that the document is not archived yet</li> <li>Adding a <code>Change</code> event to each connected editor's state</li> <li>Updating the <code>email_deadline</code> and <code>email_recipients</code> fields, as each editing operation <em>resets</em> the timeout for sending out the emails</li> </ul> <p>Let's go through these steps one by one. For validations, we define two helper methods as we want to reuse them in all editing methods:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Fails if the document is archived </span><span>fn ensure_not_archived(</span><span style="color:#e45649;">self</span><span> : State) </span><span style="color:#a626a4;">-&gt; </span><span>Unit!EditorError { </span><span> </span><span style="color:#a626a4;">guard not</span><span>(</span><span style="color:#e45649;">self</span><span>.archived) </span><span style="color:#a626a4;">else</span><span> { </span><span style="color:#a626a4;">raise </span><span>AlreadyArchived } </span><span> </span><span>} </span><span> </span><span style="color:#a0a1a7;">///| Fails if the given `connection_id` is not in the connection map </span><span>fn ensure_is_connected( </span><span> </span><span style="color:#e45649;">self</span><span> : State, </span><span> </span><span 
style="color:#e45649;">connection_id</span><span> : ConnectionId </span><span>) </span><span style="color:#a626a4;">-&gt; </span><span>Unit!EditorError { </span><span> </span><span style="color:#a626a4;">guard </span><span style="color:#e45649;">self</span><span>.connected.contains(</span><span style="color:#e45649;">connection_id</span><span>) </span><span style="color:#a626a4;">else</span><span> { </span><span> </span><span style="color:#a626a4;">raise </span><span>InvalidConnection(</span><span style="color:#e45649;">connection_id</span><span>) </span><span> } </span><span> </span><span>} </span></code></pre> <p>The <code>Unit!EditorError</code> result type indicates that these methods can fail with <code>EditorError</code>.</p> <p>We can also define a helper method for adding a change event to each connected editor's state:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Adds a change event to each connected editor&#39;s state </span><span>fn add_event(</span><span style="color:#e45649;">self</span><span> : State, </span><span style="color:#e45649;">change</span><span> : Change) </span><span style="color:#a626a4;">-&gt; </span><span>Unit { </span><span> </span><span style="color:#a626a4;">for </span><span style="color:#e45649;">editor_state </span><span style="color:#a626a4;">in </span><span style="color:#e45649;">self</span><span>.connected.values() { </span><span> </span><span style="color:#e45649;">editor_state</span><span>.events.push(</span><span style="color:#e45649;">change</span><span>) </span><span> } </span><span>} </span></code></pre> <p>And finally one for resetting the email-sending deadline and list of recipients:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Updates the 
`email_deadline` and `email_recipients` fields after an update. </span><span>fn update_email_properties(</span><span style="color:#e45649;">self</span><span> : State) </span><span style="color:#a626a4;">-&gt; </span><span>Unit { </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">now </span><span style="color:#a626a4;">= </span><span>@datetime.DateTime::from_unix_mseconds(</span><span style="color:#c18401;">0</span><span>) </span><span style="color:#a0a1a7;">// TODO </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">send_at </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">now</span><span>.inc_hour(</span><span style="color:#c18401;">12</span><span>) </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">email_list </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">self</span><span>.connected_editors() </span><span> </span><span style="color:#e45649;">self</span><span>.email_deadline </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">send_at </span><span> </span><span style="color:#e45649;">self</span><span>.email_recipients </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">email_list </span><span>} </span></code></pre> <p>Note that the <code>datetime</code> library we imported has no concept of getting the <em>current</em> date and time which we need for this function to work properly. 
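</p> <p>Until a platform clock is available, one possible way to keep this helper testable is to take the current time as a parameter instead of reading it inside the function. The following is only a sketch under that assumption: <code>update_email_properties_at</code> is a hypothetical name that is not part of the original code, and it relies only on <code>inc_hour</code> and <code>connected_editors</code> as they are used above.</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit">///| Hypothetical variant of `update_email_properties`: the caller supplies `now`
fn update_email_properties_at(self : State, now : @datetime.DateTime) -&gt; Unit {
  // Every edit pushes the deadline 12 hours past the supplied time
  self.email_deadline = now.inc_hour(12)
  // The recipients are the editors connected at the time of the edit
  self.email_recipients = self.connected_editors()
}
</code></pre> <p>A unit test could then pass a fixed value such as <code>@datetime.DateTime::from_unix_mseconds(0)</code>, while the real component would pass whatever the target platform reports as the current time.</p> <p>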
We are going to address this problem once we start targeting WebAssembly (and Golem), as getting the current system time depends on the target platform.</p> <p>With these helper functions, implementing the editor functions, for example <code>add</code>, is straightforward:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Adds a new element to the document as a connected editor </span><span style="color:#a626a4;">pub</span><span> fn add( </span><span> </span><span style="color:#e45649;">self</span><span> : State, </span><span> </span><span style="color:#e45649;">connection_id</span><span> : ConnectionId, </span><span> </span><span style="color:#e45649;">value</span><span> : String </span><span>) </span><span style="color:#a626a4;">-&gt; </span><span>Unit!EditorError { </span><span> </span><span style="color:#e45649;">self</span><span>.ensure_not_archived!() </span><span> </span><span style="color:#e45649;">self</span><span>.ensure_is_connected!(</span><span style="color:#e45649;">connection_id</span><span>) </span><span> </span><span style="color:#e45649;">self</span><span>.document.add(</span><span style="color:#e45649;">value</span><span>) </span><span> </span><span style="color:#e45649;">self</span><span>.add_event(Change::Added(</span><span style="color:#e45649;">value</span><span>)) </span><span> </span><span style="color:#e45649;">self</span><span>.update_email_properties() </span><span>} </span></code></pre> <p>Implementing <code>poll</code> is also easy: as we already maintain the list of changes per connection, we just need to reset it after each call:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Returns the list of changes that occurred since the last call to poll 
</span><span style="color:#a626a4;">pub</span><span> fn poll( </span><span> </span><span style="color:#e45649;">self</span><span> : State, </span><span> </span><span style="color:#e45649;">connection_id</span><span> : ConnectionId </span><span>) </span><span style="color:#a626a4;">-&gt; </span><span>Array[Change]!EditorError { </span><span> </span><span style="color:#a626a4;">match </span><span style="color:#e45649;">self</span><span>.connected.get(</span><span style="color:#e45649;">connection_id</span><span>) { </span><span> Some(</span><span style="color:#e45649;">editor_state</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> { </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">events </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">editor_state</span><span>.events </span><span> </span><span style="color:#e45649;">editor_state</span><span>.events </span><span style="color:#a626a4;">=</span><span> [] </span><span> </span><span style="color:#e45649;">events </span><span> } </span><span> None </span><span style="color:#a626a4;">=&gt; raise </span><span>InvalidConnection(</span><span style="color:#e45649;">connection_id</span><span>) </span><span> } </span><span>} </span></code></pre> <h3 id="list-archiving">List archiving</h3> <p>As mentioned in the introduction, we are going to have a singleton Golem worker to store <strong>archived lists</strong>. At this point we still do not have anything Golem or WebAssembly specific, like RPC calls, so let's just implement the list archive store in the simplest possible way. As I wrote earlier, we can simply store the archived lists in memory, and Golem will take care of persisting them.</p> <p>We don't want to reuse the same <code>Document</code> type as it represents a live, editable document. 
Instead we define a few new types in the <code>archive</code> package:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Unique name of a document </span><span style="color:#a626a4;">type </span><span>DocumentName String </span><span style="color:#a626a4;">derive</span><span>(</span><span style="color:#c18401;">Eq</span><span>, </span><span style="color:#c18401;">Hash</span><span>) </span><span> </span><span style="color:#a0a1a7;">///| Show instance for DocumentName </span><span style="color:#a626a4;">impl </span><span style="color:#c18401;">Show </span><span style="color:#a626a4;">for </span><span>DocumentName </span><span style="color:#a626a4;">with </span><span>output(</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">logger</span><span>) { </span><span style="color:#e45649;">self</span><span>._.output(</span><span style="color:#e45649;">logger</span><span>) } </span><span> </span><span style="color:#a0a1a7;">///| A single archived immutable document, encapsulating the document&#39;s name and its items </span><span style="color:#a626a4;">struct </span><span>ArchivedDocument { </span><span> </span><span style="color:#e45649;">name</span><span> : DocumentName </span><span> </span><span style="color:#e45649;">items</span><span> : Array[String] </span><span>} </span><span style="color:#a626a4;">derive</span><span>(</span><span style="color:#c18401;">Show</span><span>) </span><span> </span><span style="color:#a0a1a7;">///| Archive is a list of archived documents </span><span style="color:#a626a4;">struct </span><span>Archive { </span><span> </span><span style="color:#e45649;">documents</span><span> : Map[DocumentName, ArchivedDocument] </span><span>} </span></code></pre> <p>All we need is an <code>insert</code> method and a way to iterate all the archived documents:</p> <pre 
data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Archives a named document </span><span style="color:#a626a4;">pub</span><span> fn insert( </span><span> </span><span style="color:#e45649;">self</span><span> : Archive, </span><span> </span><span style="color:#e45649;">name</span><span> : DocumentName, </span><span> </span><span style="color:#e45649;">items</span><span> : Array[String] </span><span>) </span><span style="color:#a626a4;">-&gt; </span><span>Unit { </span><span> </span><span style="color:#e45649;">self</span><span>.documents.set(</span><span style="color:#e45649;">name</span><span>, { </span><span style="color:#e45649;">name</span><span>, </span><span style="color:#e45649;">items</span><span> }) </span><span>} </span><span> </span><span style="color:#a0a1a7;">///| Iterates all the archived documents </span><span style="color:#a626a4;">pub</span><span> fn iter(</span><span style="color:#e45649;">self</span><span> : Archive) </span><span style="color:#a626a4;">-&gt; </span><span>Iter[ArchivedDocument] { </span><span> </span><span style="color:#e45649;">self</span><span>.documents.values() </span><span>} </span></code></pre> <p>With this done, we first implement the list archiving in the <code>list</code> package using simple method calls. 
Later we are going to replace it with Golem's own <em>Worker to Worker communication</em>.</p> <p>As there will be a singleton archive worker, we can simulate this for now by having a top-level <code>Archive</code> instance in the <code>archive</code> package:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>pub let archive: Archive = Archive::new() </span></code></pre> <p>And calling this in our <code>State::archive</code> method:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a626a4;">pub</span><span> fn archive(</span><span style="color:#e45649;">self</span><span> : State) </span><span style="color:#a626a4;">-&gt; </span><span>Unit { </span><span> </span><span style="color:#e45649;">self</span><span>.archived </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">true </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">name </span><span style="color:#a626a4;">= </span><span>@archive.DocumentName(</span><span style="color:#50a14f;">&quot;</span><span>TODO</span><span style="color:#50a14f;">&quot;</span><span>) </span><span> @archive.archive.insert(</span><span style="color:#e45649;">name</span><span>, </span><span style="color:#e45649;">self</span><span>.document.iter().to_array()) </span><span>} </span></code></pre> <p>Note that so far we have no way to know the document's name in <code>State</code> - we did not store it anywhere. This is intentional: as we discussed earlier, the <strong>worker name</strong> will be used as the document's unique identifier. Getting the worker's name will be done in a Golem specific way once we get there.</p> <h3 id="sending-an-email">Sending an email</h3> <p>We already prepared some part of the email sending logic in the <code>State</code> type: it has a <em>deadline</em> and a list of <em>recipients</em>. 
The idea is that we start an <strong>email sending worker</strong> when a new list is created, and this runs in parallel to our editing session, in a loop. In this loop it first queries the deadline and list of recipients from our list editing state, and then just sleeps until that given deadline. When it wakes up (after 12 hours), it queries the list again, and if it is <em>past</em> the deadline, it means there were no further editing operations in the meantime. Then it sends the notification emails to the list of recipients.</p> <p>There is no library on <a href="https://mooncakes.io">mooncakes</a> yet for sending emails or even for making HTTP requests, so this is something we will have to do ourselves. Also, spawning the worker to run it in parallel is something Golem specific, so at this point we are not going to implement anything for the <code>email</code> package. We will get back to it once the rest of the application is already compiled as Golem components.</p> <h2 id="compiling-as-golem-components">Compiling as Golem Components</h2> <p>It is time to try to compile our code as <strong>Golem components</strong> - these are WebAssembly components (using the <a href="https://component-model.bytecodealliance.org">component model</a>) exporting an API described with the Wasm Interface Type (WIT) language.</p> <h3 id="bindings">Bindings</h3> <p>In the current world of the WASM component model, components are defined in a spec-first way - first we write the WIT files describing types and exported interfaces, and then use a <em>binding generator</em> to generate language-specific glue code from them. 
Fortunately the <a href="https://github.com/bytecodealliance/wit-bindgen"><code>wit-bindgen</code> tool</a> already has MoonBit support, so we can start by installing the latest version:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>cargo install wit-bindgen-cli </span></code></pre> <p>Note that Golem's documentation recommends an older, specific version of <code>wit-bindgen</code> - but that version did not support MoonBit yet. The new version should work well, but the example code for Golem has not been tested with it.</p> <p>We will reuse the WIT definitions that were created for the Golem 1.0 launch demo.</p> <p>For the <code>list</code> component, it is the following:</p> <pre data-lang="wit" style="background-color:#fafafa;color:#383a42;" class="language-wit "><code class="language-wit" data-lang="wit"><span style="color:#a626a4;">package </span><span>demo:lst; </span><span> </span><span style="color:#a626a4;">interface </span><span>api { </span><span> </span><span style="color:#a626a4;">record </span><span>connection { </span><span> </span><span style="color:#e45649;">id</span><span>: </span><span style="color:#a626a4;">u64 </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">record </span><span>insert-params { </span><span> </span><span style="color:#e45649;">after</span><span>: </span><span style="color:#a626a4;">string</span><span>, </span><span> </span><span style="color:#e45649;">value</span><span>: </span><span style="color:#a626a4;">string </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">variant </span><span>change { </span><span> added(</span><span style="color:#a626a4;">string</span><span>), </span><span> deleted(</span><span style="color:#a626a4;">string</span><span>), </span><span> inserted(insert-params) </span><span> } </span><span> </span><span> </span><span style="color:#0184bc;">add</span><span>: </span><span style="color:#a626a4;">func</span><span>(</span><span 
style="color:#e45649;">c</span><span>: connection, </span><span style="color:#e45649;">value</span><span>: </span><span style="color:#a626a4;">string</span><span>) </span><span style="color:#a626a4;">-&gt; result</span><span>&lt;</span><span style="color:#a626a4;">_</span><span>, </span><span style="color:#a626a4;">string</span><span>&gt;; </span><span> </span><span style="color:#0184bc;">delete</span><span>: </span><span style="color:#a626a4;">func</span><span>(</span><span style="color:#e45649;">c</span><span>: connection, </span><span style="color:#e45649;">value</span><span>: </span><span style="color:#a626a4;">string</span><span>) </span><span style="color:#a626a4;">-&gt; result</span><span>&lt;</span><span style="color:#a626a4;">_</span><span>, </span><span style="color:#a626a4;">string</span><span>&gt;; </span><span> </span><span style="color:#0184bc;">insert</span><span>: </span><span style="color:#a626a4;">func</span><span>(</span><span style="color:#e45649;">c</span><span>: connection, </span><span style="color:#e45649;">after</span><span>: </span><span style="color:#a626a4;">string</span><span>, </span><span style="color:#e45649;">value</span><span>: </span><span style="color:#a626a4;">string</span><span>) </span><span style="color:#a626a4;">-&gt; result</span><span>&lt;</span><span style="color:#a626a4;">_</span><span>, </span><span style="color:#a626a4;">string</span><span>&gt;; </span><span> </span><span style="color:#0184bc;">get</span><span>: </span><span style="color:#a626a4;">func</span><span>() </span><span style="color:#a626a4;">-&gt; list</span><span>&lt;</span><span style="color:#a626a4;">string</span><span>&gt;; </span><span> </span><span> </span><span style="color:#0184bc;">poll</span><span>: </span><span style="color:#a626a4;">func</span><span>(</span><span style="color:#e45649;">c</span><span>: connection) </span><span style="color:#a626a4;">-&gt; result</span><span>&lt;</span><span style="color:#a626a4;">list</span><span>&lt;change&gt;, 
</span><span style="color:#a626a4;">string</span><span>&gt;; </span><span> </span><span> </span><span style="color:#0184bc;">connect</span><span>: </span><span style="color:#a626a4;">func</span><span>(</span><span style="color:#e45649;">email</span><span>: </span><span style="color:#a626a4;">string</span><span>) </span><span style="color:#a626a4;">-&gt; tuple</span><span>&lt;connection, </span><span style="color:#a626a4;">list</span><span>&lt;</span><span style="color:#a626a4;">string</span><span>&gt;&gt;; </span><span> </span><span style="color:#0184bc;">disconnect</span><span>: </span><span style="color:#a626a4;">func</span><span>(</span><span style="color:#e45649;">c</span><span>: connection) </span><span style="color:#a626a4;">-&gt; result</span><span>&lt;</span><span style="color:#a626a4;">_</span><span>, </span><span style="color:#a626a4;">string</span><span>&gt;; </span><span> </span><span style="color:#0184bc;">connected-editors</span><span>: </span><span style="color:#a626a4;">func</span><span>() </span><span style="color:#a626a4;">-&gt; list</span><span>&lt;</span><span style="color:#a626a4;">string</span><span>&gt;; </span><span> </span><span> </span><span style="color:#0184bc;">archive</span><span>: </span><span style="color:#a626a4;">func</span><span>(); </span><span> </span><span style="color:#0184bc;">is-archived</span><span>: </span><span style="color:#a626a4;">func</span><span>() </span><span style="color:#a626a4;">-&gt; bool</span><span>; </span><span>} </span><span> </span><span style="color:#a626a4;">interface </span><span>email-query { </span><span> </span><span style="color:#0184bc;">deadline</span><span>: </span><span style="color:#a626a4;">func</span><span>() </span><span style="color:#a626a4;">-&gt; option</span><span>&lt;</span><span style="color:#a626a4;">u64</span><span>&gt;; </span><span> </span><span style="color:#0184bc;">recipients</span><span>: </span><span style="color:#a626a4;">func</span><span>() </span><span 
style="color:#a626a4;">-&gt; list</span><span>&lt;</span><span style="color:#a626a4;">string</span><span>&gt;; </span><span>} </span><span> </span><span style="color:#a626a4;">world </span><span>lst { </span><span> </span><span style="color:#a0a1a7;">// .. imports to be explained later .. </span><span> </span><span> </span><span style="color:#a626a4;">export </span><span>api; </span><span> </span><span style="color:#a626a4;">export </span><span>email-query; </span><span>} </span></code></pre> <p>This interface definition exports two APIs - one is the public API of our list editors, very similar to the methods we already implemented for the <code>State</code> type. The other is an internal API for the <code>email</code> component to query the deadline and recipients as it was explained earlier.</p> <p>For simplicity, we are using <code>string</code> as an error type on the public API.</p> <p>For the <code>archive</code> component, we define a much simpler interface:</p> <pre data-lang="wit" style="background-color:#fafafa;color:#383a42;" class="language-wit "><code class="language-wit" data-lang="wit"><span style="color:#a626a4;">package </span><span>demo:archive; </span><span> </span><span style="color:#a626a4;">interface </span><span>api { </span><span> </span><span style="color:#a626a4;">record </span><span>archived-list { </span><span> </span><span style="color:#e45649;">name</span><span>: </span><span style="color:#a626a4;">string</span><span>, </span><span> </span><span style="color:#e45649;">items</span><span>: </span><span style="color:#a626a4;">list</span><span>&lt;</span><span style="color:#a626a4;">string</span><span>&gt; </span><span> } </span><span> </span><span> </span><span style="color:#0184bc;">store</span><span>: </span><span style="color:#a626a4;">func</span><span>(</span><span style="color:#e45649;">name</span><span>: </span><span style="color:#a626a4;">string</span><span>, </span><span style="color:#e45649;">items</span><span>: </span><span 
style="color:#a626a4;">list</span><span>&lt;</span><span style="color:#a626a4;">string</span><span>&gt;); </span><span> </span><span style="color:#0184bc;">get-all</span><span>: </span><span style="color:#a626a4;">func</span><span>() </span><span style="color:#a626a4;">-&gt; list</span><span>&lt;archived-list&gt;; </span><span>} </span><span> </span><span style="color:#a626a4;">world </span><span>archive { </span><span> </span><span style="color:#a0a1a7;">// .. imports to be explained later .. </span><span> </span><span> </span><span style="color:#a626a4;">export </span><span>api; </span><span>} </span></code></pre> <p>And finally, for the <code>email</code> component:</p> <pre data-lang="wit" style="background-color:#fafafa;color:#383a42;" class="language-wit "><code class="language-wit" data-lang="wit"><span style="color:#a626a4;">package </span><span>demo:email; </span><span> </span><span style="color:#a626a4;">interface </span><span>api { </span><span> </span><span style="color:#a626a4;">use </span><span>golem:rpc/types@</span><span style="color:#c18401;">0.1.0</span><span>.{</span><span style="color:#e45649;">uri</span><span>}; </span><span> </span><span> </span><span style="color:#0184bc;">send-email</span><span>: </span><span style="color:#a626a4;">func</span><span>(</span><span style="color:#e45649;">list-uri</span><span>: uri); </span><span>} </span><span> </span><span style="color:#a626a4;">world </span><span>email { </span><span> </span><span style="color:#a0a1a7;">// .. imports to be explained later .. </span><span> </span><span> </span><span style="color:#a626a4;">export </span><span>api; </span><span>} </span></code></pre> <p>Here we are using a Golem-specific type: <code>uri</code>. This is needed because each <code>email</code> worker needs to call the specific <code>list</code> worker it was spawned from. 
The details of this will be explained later.</p> <p>These WIT definitions need to be put in the <code>wit</code> directory of each package, with their dependencies in subdirectories of <code>wit/deps</code>. Check the <a href="https://github.com/vigoo/golem-moonbit-example">repository</a> for reference.</p> <p>We started by defining a single MoonBit <strong>module</strong> (identified by <code>moon.mod.json</code> in the root) and just created <code>list</code>, <code>email</code> and <code>archive</code> as internal packages. At this point we have to change this because we need to have a separate module for each chunk of code we want to compile to a separate Golem component. Running <code>wit-bindgen</code> in each of the three subdirectories (shown below) actually generates module definitions for us.</p> <p>We reorganize the directory structure a bit, moving <code>src/archive</code> to <code>archive</code> and so on, and moving the previously written source code to <code>archive/src</code>. This way the generated bindings and our hand-written implementation will be put next to each other. We can also delete the top-level module definition JSON.</p> <p>Now in all three directories we can generate the bindings:</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">wit-bindgen</span><span> moonbit wit </span></code></pre> <p>Note that once we start modifying the generated <code>stub.mbt</code> files, running this command again will overwrite our changes. 
To avoid that, it can be run in the following way:</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">wit-bindgen</span><span> moonbit wit</span><span style="color:#e45649;"> --ignore-stub </span></code></pre> <p>With this done,</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">moon</span><span> build</span><span style="color:#e45649;"> --target</span><span> wasm </span></code></pre> <p>will compile a WASM module for us in <code>./target/wasm/release/build/gen/gen.wasm</code>. This is not yet a WASM <strong>component</strong> - so it's not ready to be used directly in Golem. To do so, we will have to use another command line tool, <a href="https://github.com/bytecodealliance/wasm-tools"><code>wasm-tools</code></a> to convert this module into a component that self-describes its higher level exported interface.</p> <h3 id="wit-dependencies">WIT dependencies</h3> <p>We are going to need to depend on some WIT packages, some from WASI (WebAssembly System Interface) to access things like environment variables and the current date/time, and some Golem specific ones to implement worker-to-worker communication.</p> <p>The simplest way to get the appropriate version of all the dependencies Golem provides is to use Golem's "all" packaged interfaces with the <a href="https://github.com/bytecodealliance/wit-deps"><code>wit-deps</code></a> tool.</p> <p>So first we install <code>wit-deps</code>:</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">cargo</span><span> install wit-deps-cli </span></code></pre> <p>And create a <code>deps.toml</code> file in each <code>wit</code> directory we have created with the following 
contents:</p> <pre data-lang="toml" style="background-color:#fafafa;color:#383a42;" class="language-toml "><code class="language-toml" data-lang="toml"><span style="color:#e45649;">all </span><span>= </span><span style="color:#50a14f;">&quot;https://github.com/golemcloud/golem-wit/archive/main.tar.gz&quot; </span></code></pre> <p>And finally we run the following command to fill the <code>wit/deps</code> directory:</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">wit-deps</span><span> update </span></code></pre> <h3 id="implementing-the-exports">Implementing the exports</h3> <p>Before setting up this compilation chain let's see how we can connect the generated bindings with our existing code. Let's start with the <code>archive</code> component, as it is the simplest one.</p> <p>The binding generator creates a <code>stub.mbt</code> file at <code>archive/gen/interface/demo/archive/api/stub.mbt</code> with the two exported functions to be implemented. Here we face the usual question when working with code generators: we have a definition of <code>archived-list</code> in WIT and the binding generator generated the following MoonBit definition from it:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">// Generated by `wit-bindgen` 0.36.0. DO NOT EDIT! </span><span> </span><span style="color:#a626a4;">pub struct </span><span>ArchivedList { </span><span> </span><span style="color:#e45649;">name</span><span> : String; </span><span style="color:#e45649;">items</span><span> : Array[String] </span><span>} </span><span style="color:#a626a4;">derive</span><span>() </span></code></pre> <p>But we already defined a very similar structure called <code>ArchivedDocument</code>! 
The only differences are the use of the <code>DocumentName</code> newtype and that our version was deriving a <code>Show</code> instance. We could decide to give up using the newtype, and use the generated type in our business logic, or we could keep the generated types separated from our actual code. (This is not really specific to MoonBit or the WASM tooling; we face the same issue with any approach based on code generators.)</p> <p>In this post I will keep the generated code separate from our already written business logic, and just show how to write the conversions needed to implement the <code>stub.mbt</code> file(s).</p> <p>The first exported function to implement is called <code>store</code>. We can implement it by just calling <code>insert</code> on our singleton top-level <code>Archive</code> as we did before when we directly wired the <code>archive</code> package to the <code>list</code> package:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a626a4;">pub</span><span> fn store(</span><span style="color:#e45649;">name</span><span> : String, </span><span style="color:#e45649;">items</span><span> : Array[String]) </span><span style="color:#a626a4;">-&gt; </span><span>Unit { </span><span> @src.archive.insert(@src.DocumentName(</span><span style="color:#e45649;">name</span><span>), </span><span style="color:#e45649;">items</span><span>) </span><span>} </span></code></pre> <p>Note that we need to import our main <code>archive</code> source in the stub's package JSON:</p> <pre data-lang="json" style="background-color:#fafafa;color:#383a42;" class="language-json "><code class="language-json" data-lang="json"><span>{ </span><span> </span><span style="color:#50a14f;">&quot;import&quot;</span><span>: [ </span><span> { </span><span style="color:#50a14f;">&quot;path&quot; </span><span>: </span><span 
style="color:#50a14f;">&quot;demo/archive/ffi&quot;</span><span>, </span><span style="color:#50a14f;">&quot;alias&quot; </span><span>: </span><span style="color:#50a14f;">&quot;ffi&quot; </span><span>}, </span><span> { </span><span style="color:#50a14f;">&quot;path&quot; </span><span>: </span><span style="color:#50a14f;">&quot;demo/archive/src&quot;</span><span>, </span><span style="color:#50a14f;">&quot;alias&quot; </span><span>: </span><span style="color:#50a14f;">&quot;src&quot; </span><span>} </span><span> ] </span><span>} </span></code></pre> <p>The second function to be implemented needs to convert between the two representations of an archived document:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a626a4;">pub</span><span> fn get_all() </span><span style="color:#a626a4;">-&gt; </span><span>Array[ArchivedList] { </span><span> @src.archive </span><span> .iter() </span><span> .map(fn(</span><span style="color:#e45649;">archived</span><span>) { { </span><span style="color:#e45649;">name</span><span>: </span><span style="color:#e45649;">archived</span><span>.name._, </span><span style="color:#e45649;">items</span><span>: </span><span style="color:#e45649;">archived</span><span>.items } }) </span><span> .to_array() </span><span>} </span></code></pre> <p>Note that for this to work, we also have to make the previously defined <code>struct ArchivedDocument</code> a <code>pub struct</code>; otherwise we cannot access its <code>name</code> and <code>items</code> fields from the stub package.</p> <p>(Note: at the time of writing https://github.com/bytecodealliance/wit-bindgen/pull/1100 was not merged yet, and it is needed for the binding generator to produce working code with Golem wasm-rpc. Until it is merged, it is possible to compile the fork and use it directly.)</p> <p>In the same way, we can implement the two generated stubs in the 
<code>list</code> module (in <code>list/gen/interface/demo/lst/api/stub.mbt</code> and <code>list/gen/interface/demo/lst/emailQuery/stub.mbt</code>) using our existing implementation of <code>State</code>.</p> <p>One interesting detail is how we can map the <code>EditorError</code> failures into the string errors used in the WIT definition. First we define a <code>to_string</code> method for <code>EditorError</code>:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a626a4;">pub</span><span> fn to_string(</span><span style="color:#e45649;">self</span><span> : EditorError) </span><span style="color:#a626a4;">-&gt; </span><span>String { </span><span> </span><span style="color:#a626a4;">match </span><span style="color:#e45649;">self</span><span> { </span><span> InvalidConnection(</span><span style="color:#e45649;">id</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span style="color:#50a14f;">&quot;</span><span>Invalid connection ID: \{id._}</span><span style="color:#50a14f;">&quot; </span><span> AlreadyArchived </span><span style="color:#a626a4;">=&gt; </span><span style="color:#50a14f;">&quot;</span><span>Document is already archived</span><span style="color:#50a14f;">&quot; </span><span> } </span><span>} </span></code></pre> <p>Then we use <code>?</code> and <code>map_err</code> in the stubs:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a626a4;">pub</span><span> fn add(</span><span style="color:#e45649;">c</span><span> : Connection, </span><span style="color:#e45649;">value</span><span> : String) </span><span style="color:#a626a4;">-&gt; </span><span>Result[Unit, String] { </span><span> @src.state </span><span> .add?(to_connection_id(</span><span style="color:#e45649;">c</span><span>), </span><span 
style="color:#e45649;">value</span><span>) </span><span> .map_err(fn(</span><span style="color:#e45649;">err</span><span>) { </span><span style="color:#e45649;">err</span><span>.to_string() }) </span><span>} </span></code></pre> <h3 id="using-host-functions">Using host functions</h3> <p>When we implemented the <code>update_email_properties</code> function earlier, we could not properly query the current time to calculate the proper deadline. Now that we are targeting Golem, we can use the WebAssembly system interface (WASI) to access things like the system time. One way would be to use the published <a href="https://mooncakes.io/docs/#/yamajik/wasi-bindings/"><code>wasi-bindings</code> package</a> but as we are already generating bindings from WIT anyway, we can just use our own generated bindings to imported host functions.</p> <p>First, we need to import the WASI wall-clock interface into our WIT world:</p> <pre data-lang="wit" style="background-color:#fafafa;color:#383a42;" class="language-wit "><code class="language-wit" data-lang="wit"><span style="color:#a626a4;">world </span><span>lst { </span><span> </span><span style="color:#a626a4;">export </span><span>api; </span><span> </span><span style="color:#a626a4;">export </span><span>email-query; </span><span> </span><span> </span><span style="color:#a626a4;">import </span><span>wasi:clocks/wall-clock@</span><span style="color:#c18401;">0.2.0</span><span>; </span><span>} </span></code></pre> <p>Then we regenerate the bindings (make sure to use <code>--ignore-stub</code> to avoid rewriting our stub implementation!) 
and import it into our main (<code>src</code>) package:</p> <pre data-lang="json" style="background-color:#fafafa;color:#383a42;" class="language-json "><code class="language-json" data-lang="json"><span>{ </span><span> </span><span style="color:#50a14f;">&quot;import&quot;</span><span>: [ </span><span> </span><span style="color:#50a14f;">&quot;suiyunonghen/datetime&quot;</span><span>, </span><span> { </span><span style="color:#50a14f;">&quot;path&quot; </span><span>: </span><span style="color:#50a14f;">&quot;demo/lst/interface/wasi/clocks/wallClock&quot;</span><span>, </span><span style="color:#50a14f;">&quot;alias&quot; </span><span>: </span><span style="color:#50a14f;">&quot;wallClock&quot; </span><span>} </span><span> ] </span><span>} </span></code></pre> <p>With that we can call the WASI <code>now</code> function to query the current system time, and convert it to the <code>datetime</code> module's <code>DateTime</code> type which we were using before:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Queries the WASI wall clock and returns it as a @datetime.DateTime </span><span style="color:#a0a1a7;">/// </span><span style="color:#a0a1a7;">/// Note that DateTime has only millisecond precision </span><span>fn now() </span><span style="color:#a626a4;">-&gt; </span><span>@datetime.DateTime { </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">wasi_now </span><span style="color:#a626a4;">= </span><span>@wallClock.now() </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">base_ms </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">wasi_now</span><span>.seconds.reinterpret_as_int64() </span><span style="color:#a626a4;">* </span><span style="color:#c18401;">1000</span><span>; </span><span> </span><span style="color:#a626a4;">let 
</span><span style="color:#e45649;">nano_ms </span><span style="color:#a626a4;">=</span><span> (</span><span style="color:#e45649;">wasi_now</span><span>.nanoseconds.reinterpret_as_int() </span><span style="color:#a626a4;">/ </span><span style="color:#c18401;">1000000</span><span>).to_int64(); </span><span> @datetime.DateTime::from_unix_mseconds(</span><span style="color:#e45649;">base_ms </span><span style="color:#a626a4;">+ </span><span style="color:#e45649;">nano_ms</span><span>) </span><span>} </span></code></pre> <h2 id="golem-app-manifest">Golem app manifest</h2> <p>In the next step of our implementation we will have to connect our two existing components, <code>list</code> and <code>archive</code>, in a way that <code>list</code> can make remote procedure calls to <code>archive</code>. With the same technique we will be able to implement the third component, <code>email</code>, which is both called <em>from</em> <code>list</code> (when started) and calls back into it (when getting the deadline and recipients).</p> <p>Golem has tooling supporting this - but before trying to use it, let's convert our project into a <strong>Golem application</strong> described by <strong>app manifests</strong>. This will enable us to use <code>golem-cli</code> to generate the necessary files for worker-to-worker communication, and will also make it easier to deploy the compiled components into Golem.</p> <h3 id="the-build-steps">The build steps</h3> <p>To build a single MoonBit module into a Golem component, without any worker-to-worker communication involved, we have to perform the following steps:</p> <ul> <li>(Optionally) regenerate the WIT bindings with <code>wit-bindgen ... 
--ignore-stub</code></li> <li>Compile the MoonBit source code into a WASM module with <code>moon build --target wasm</code></li> <li>Embed the WIT specification into a custom WASM section using <code>wasm-tools component embed</code></li> <li>Convert the WASM module into a WASM <em>component</em> using <code>wasm-tools component new</code></li> </ul> <p>Once we start using worker-to-worker communication, the build will require even more steps, as we are going to generate stub WIT interfaces, and compile and link multiple WASM components. An earlier version of this was <a href="https://blog.vigoo.dev/posts/w2w-communication-golem/">described in the Worker to Worker communication in Golem</a> blog post last year.</p> <p>The Golem app manifest and the corresponding CLI tool, introduced with <strong>Golem 1.1</strong>, automate all these steps for us.</p> <h3 id="manifest-template">Manifest template</h3> <p>We start by creating a root app manifest, <code>golem.yaml</code>, in the root of our project. 
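</p> <p>For comparison with what the manifest is about to automate, the four manual build steps from the previous section could be scripted for a single module roughly as follows. This is only a sketch under a few assumptions: <code>run</code>, <code>build_component</code> and <code>DRY_RUN</code> are made-up helper names, the script is expected to run from the module's directory (for example <code>archive</code>) with its WIT files in <code>wit</code>, and the output paths and the <code>--encoding utf16</code> flag are simply taken from the commands used elsewhere in this post.</p>
<pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash">#!/usr/bin/env bash
# Sketch of the manual build for one MoonBit module; run it from the module's
# directory. run, build_component and DRY_RUN are illustrative names only.

# Print a command, and execute it unless DRY_RUN is set.
run() {
  echo "+ $*"
  if [ -z "${DRY_RUN:-}" ]; then "$@"; fi
}

build_component() {
  local name="$1"
  local out="../target/release"  # assumed shared output directory
  run mkdir -p "$out"
  # 1. (Optionally) regenerate the bindings without touching our stubs
  run wit-bindgen moonbit wit --ignore-stub
  # 2. Compile the MoonBit sources into a core WASM module
  run moon build --target wasm
  # 3. Embed the WIT specification into a custom section of the module
  run wasm-tools component embed wit target/wasm/release/build/gen/gen.wasm -o "$out/$name.module.wasm" --encoding utf16
  # 4. Turn the module into a WASM component
  run wasm-tools component new "$out/$name.module.wasm" -o "$out/$name.wasm"
}
</code></pre>
<p>Calling <code>DRY_RUN=1 build_component archive</code> only prints the five commands (the directory creation plus the four steps), which is a cheap way to check the paths before running the real tools.</p> <p>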
We start by setting up a temporary directory and a shared directory for the WIT dependencies we previously fetched with <code>wit-deps</code>:</p> <pre data-lang="yaml" style="background-color:#fafafa;color:#383a42;" class="language-yaml "><code class="language-yaml" data-lang="yaml"><span style="color:#a0a1a7;"># Schema for IDEA: </span><span style="color:#a0a1a7;"># $schema: https://schema.golem.cloud/app/golem/1.1.0/golem.schema.json </span><span style="color:#a0a1a7;"># Schema for vscode-yaml </span><span style="color:#a0a1a7;"># yaml-language-server: $schema=https://schema.golem.cloud/app/golem/1.1.0/golem.schema.json </span><span> </span><span style="color:#e45649;">tempDir</span><span>: </span><span style="color:#50a14f;">target/golem-temp </span><span style="color:#e45649;">witDeps</span><span>: </span><span> - </span><span style="color:#50a14f;">common-wit/deps </span></code></pre> <p>By moving our previous <code>deps.toml</code> into <code>common-wit</code> and doing a <code>wit-deps update</code> in the root, we can fill up this <code>deps</code> directory with all the WASI and Golem APIs we need.</p> <p>Then we define a <strong>template</strong> for building MoonBit components with Golem CLI. In the template, we are going to define two <strong>profiles</strong> - one for doing a <strong>release</strong> build and one for <strong>debug</strong>. 
In this post I'm only going to show the release build.</p> <p>It starts by specifying some directory names and where the final WASM files will be placed:</p> <pre data-lang="yaml" style="background-color:#fafafa;color:#383a42;" class="language-yaml "><code class="language-yaml" data-lang="yaml"><span style="color:#e45649;">templates</span><span>: </span><span> </span><span style="color:#e45649;">moonbit</span><span>: </span><span> </span><span style="color:#e45649;">profiles</span><span>: </span><span> </span><span style="color:#e45649;">release</span><span>: </span><span> </span><span style="color:#e45649;">sourceWit</span><span>: </span><span style="color:#50a14f;">wit </span><span> </span><span style="color:#e45649;">generatedWit</span><span>: </span><span style="color:#50a14f;">wit-generated </span><span> </span><span style="color:#e45649;">componentWasm</span><span>: </span><span style="color:#50a14f;">../target/release/{{ componentName }}.wasm </span><span> </span><span style="color:#e45649;">linkedWasm</span><span>: </span><span style="color:#50a14f;">../target/release/{{ componentName }}-linked.wasm </span></code></pre> <p>These paths are relative to the components' subdirectories (for example <code>archive</code>), so what we say here is that once all the components are built, they will all be put in the root <code>target/release</code> directory.</p> <p>Then we specify the <strong>build steps</strong>, described in the previous section:</p> <pre data-lang="yaml" style="background-color:#fafafa;color:#383a42;" class="language-yaml "><code class="language-yaml" data-lang="yaml"><span> </span><span style="color:#e45649;">build</span><span>: </span><span> - </span><span style="color:#e45649;">command</span><span>: </span><span style="color:#50a14f;">wit-bindgen moonbit wit-generated --ignore-stub --derive-error --derive-show </span><span> </span><span style="color:#e45649;">sources</span><span>: </span><span> - </span><span 
style="color:#50a14f;">wit-generated </span><span> </span><span style="color:#e45649;">targets</span><span>: </span><span> - </span><span style="color:#50a14f;">ffi </span><span> - </span><span style="color:#50a14f;">interface </span><span> - </span><span style="color:#50a14f;">world </span><span> - </span><span style="color:#e45649;">command</span><span>: </span><span style="color:#50a14f;">moon build --target wasm </span><span> - </span><span style="color:#e45649;">command</span><span>: </span><span style="color:#50a14f;">wasm-tools component embed wit-generated target/wasm/release/build/gen/gen.wasm -o ../target/release/{{ componentName }}.module.wasm --encoding utf16 </span><span> </span><span style="color:#e45649;">mkdirs</span><span>: </span><span> - </span><span style="color:#50a14f;">../target/release </span><span> - </span><span style="color:#e45649;">command</span><span>: </span><span style="color:#50a14f;">wasm-tools component new ../target/release/{{ componentName }}.module.wasm -o ../target/release/{{ componentName }}.wasm </span></code></pre> <p>Finally, we can define additional directories to be cleaned by the <code>golem app clean</code> command, and we can even define custom commands to be executed with <code>golem app xxx</code>:</p> <pre data-lang="yaml" style="background-color:#fafafa;color:#383a42;" class="language-yaml "><code class="language-yaml" data-lang="yaml"><span> </span><span style="color:#e45649;">clean</span><span>: </span><span> - </span><span style="color:#50a14f;">target </span><span> - </span><span style="color:#50a14f;">wit-generated </span><span> </span><span style="color:#e45649;">customCommands</span><span>: </span><span> </span><span style="color:#e45649;">update-deps</span><span>: </span><span> - </span><span style="color:#e45649;">command</span><span>: </span><span style="color:#50a14f;">wit-deps update </span><span> </span><span style="color:#e45649;">dir</span><span>: </span><span style="color:#c18401;">.. 
</span><span> </span><span style="color:#e45649;">regenerate-stubs</span><span>: </span><span> - </span><span style="color:#e45649;">command</span><span>: </span><span style="color:#50a14f;">wit-bindgen moonbit wit-generated </span></code></pre> <p>With this in place, we can add a new <em>MoonBit module</em> to this <strong>Golem project</strong> by creating a <code>golem.yaml</code> in its directory - so <code>archive/golem.yaml</code> and <code>list/golem.yaml</code> for now.</p> <p>In these sub-manifests we can use the template defined above to tell Golem that this is a MoonBit module. It is possible to mix Golem components written in different languages in a single application.</p> <p>For example, the <code>archive</code> component's manifest will look like this:</p> <pre data-lang="yaml" style="background-color:#fafafa;color:#383a42;" class="language-yaml "><code class="language-yaml" data-lang="yaml"><span style="color:#a0a1a7;"># Schema for IDEA: </span><span style="color:#a0a1a7;"># $schema: https://schema.golem.cloud/app/golem/1.1.0/golem.schema.json </span><span style="color:#a0a1a7;"># Schema for vscode-yaml </span><span style="color:#a0a1a7;"># yaml-language-server: $schema=https://schema.golem.cloud/app/golem/1.1.0/golem.schema.json </span><span> </span><span style="color:#e45649;">components</span><span>: </span><span> </span><span style="color:#e45649;">archive</span><span>: </span><span> </span><span style="color:#e45649;">template</span><span>: </span><span style="color:#50a14f;">moonbit </span></code></pre> <h3 id="building-the-components">Building the components</h3> <p>With this in place, the whole application (with its two already written components) can be compiled by simply running</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>golem app build </span></code></pre> <p>There are a few organizational things to do first, as <code>golem app build</code> does some transformations on the WIT definitions. 
This means that our previously written <strong>stubs</strong> are in the wrong place. The easiest way to fix this is to delete all the wit-bindgen generated directories (but first back up the hand-written stubs!) and then copy the stubs back into the newly created directories. We are not going to discuss this in more detail here. This blog post incrementally discovers how to build Golem applications with MoonBit and introduces the app manifest at a late stage, but the recommended way is to start with an app manifest immediately, in which case none of these fixes are needed.</p> <h3 id="first-try">First try</h3> <p>Running the build command results in two WASM files that are ready to be used with Golem! Although they are not able to communicate with each other yet (so the archiving functionality does not work), it is already possible to try them out with Golem.</p> <p>To do so, we can either start Golem locally by downloading the latest release of <a href="https://github.com/golemcloud/golem/releases/tag/v1.1.0">single-executable Golem</a>, or use our hosted Golem Cloud. 
With the <code>golem</code> binary, we just use the following command to start up the services locally:</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">$</span><span> golem start</span><span style="color:#e45649;"> -vv </span></code></pre> <p>Then, from the root of our project, we can upload the two compiled components using the same command:</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">$</span><span> golem component add</span><span style="color:#e45649;"> --component-name</span><span> archive </span><span style="color:#e45649;">Added</span><span> new component archive </span><span> </span><span style="color:#e45649;">Component</span><span> URN: urn:component:bde2da89-75a8-4adf-953f-33b360c978d0 </span><span style="color:#e45649;">Component</span><span> name: archive </span><span style="color:#e45649;">Component</span><span> version: 0 </span><span style="color:#e45649;">Component</span><span> size: 9.35 KiB </span><span style="color:#e45649;">Created</span><span> at: 2025-01-03 15:09:05.166785 UTC </span><span style="color:#e45649;">Exports: </span><span> </span><span style="color:#0184bc;">demo:archive-interface/api.</span><span>{</span><span style="color:#0184bc;">get-all}</span><span>() -&gt; list&lt;record { </span><span style="color:#e45649;">name:</span><span> string, items: list</span><span style="color:#a626a4;">&lt;</span><span>string</span><span style="color:#a626a4;">&gt;</span><span> }</span><span style="color:#a626a4;">&gt; </span><span> </span><span style="color:#e45649;">demo:archive-interface/api.{store</span><span>}(</span><span style="color:#e45649;">name:</span><span> string, items: list</span><span style="color:#a626a4;">&lt;</span><span>string</span><span style="color:#a626a4;">&gt;</span><span>) 
</span></code></pre> <p>and</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">$</span><span> golem component add</span><span style="color:#e45649;"> --component-name</span><span> list </span><span style="color:#e45649;">Added</span><span> new component list </span><span> </span><span style="color:#e45649;">Component</span><span> URN: urn:component:b6420554-62b5-4902-8994-89c692a937f7 </span><span style="color:#e45649;">Component</span><span> name: list </span><span style="color:#e45649;">Component</span><span> version: 0 </span><span style="color:#e45649;">Component</span><span> size: 28.46 KiB </span><span style="color:#e45649;">Created</span><span> at: 2025-01-03 15:09:09.743733 UTC </span><span style="color:#e45649;">Exports: </span><span> </span><span style="color:#e45649;">demo:lst-interface/api.{add</span><span>}(</span><span style="color:#e45649;">c:</span><span> record { id: u64 }, value: string) -</span><span style="color:#a626a4;">&gt;</span><span> result</span><span style="color:#a626a4;">&lt;</span><span>_, string</span><span style="color:#a626a4;">&gt; </span><span> </span><span style="color:#0184bc;">demo:lst-interface/api.</span><span>{</span><span style="color:#0184bc;">archive}</span><span>() </span><span> demo:lst-interface/api.{</span><span style="color:#e45649;">connect</span><span>}(</span><span style="color:#e45649;">email:</span><span> string) -</span><span style="color:#a626a4;">&gt;</span><span> tuple</span><span style="color:#a626a4;">&lt;</span><span>record { id: u64 }, list</span><span style="color:#a626a4;">&lt;</span><span>string</span><span style="color:#a626a4;">&gt;&gt; </span><span> </span><span style="color:#0184bc;">demo:lst-interface/api.</span><span>{</span><span style="color:#0184bc;">connected-editors}</span><span>() -&gt; list&lt;string&gt; </span><span> demo:lst-interface/api.{</span><span 
style="color:#e45649;">delete</span><span>}(</span><span style="color:#e45649;">c:</span><span> record { id: u64 }, value: string) -</span><span style="color:#a626a4;">&gt;</span><span> result</span><span style="color:#a626a4;">&lt;</span><span>_, string</span><span style="color:#a626a4;">&gt; </span><span> </span><span style="color:#e45649;">demo:lst-interface/api.{disconnect</span><span>}(</span><span style="color:#e45649;">c:</span><span> record { id: u64 }) -</span><span style="color:#a626a4;">&gt;</span><span> result</span><span style="color:#a626a4;">&lt;</span><span>_, string</span><span style="color:#a626a4;">&gt; </span><span> </span><span style="color:#0184bc;">demo:lst-interface/api.</span><span>{</span><span style="color:#0184bc;">get}</span><span>() -&gt; list&lt;string&gt; </span><span> demo:lst-interface/api.{</span><span style="color:#e45649;">insert</span><span>}(</span><span style="color:#e45649;">c:</span><span> record { id: u64 }, after: string, value: string) -</span><span style="color:#a626a4;">&gt;</span><span> result</span><span style="color:#a626a4;">&lt;</span><span>_, string</span><span style="color:#a626a4;">&gt; </span><span> </span><span style="color:#0184bc;">demo:lst-interface/api.</span><span>{</span><span style="color:#0184bc;">is-archived}</span><span>() -&gt; bool </span><span> demo:lst-interface/api.{</span><span style="color:#e45649;">poll</span><span>}(</span><span style="color:#e45649;">c:</span><span> record { id: u64 }) -</span><span style="color:#a626a4;">&gt;</span><span> result</span><span style="color:#a626a4;">&lt;</span><span>list</span><span style="color:#a626a4;">&lt;</span><span>variant { added(string), deleted(string), inserted(record { after: string, value: string }) }</span><span style="color:#a626a4;">&gt;</span><span>, string</span><span style="color:#a626a4;">&gt; </span><span> </span><span style="color:#0184bc;">demo:lst-interface/email-query.</span><span>{</span><span 
style="color:#0184bc;">deadline}</span><span>() -&gt; option&lt;u64&gt; </span><span> demo:lst-interface/email-query.{</span><span style="color:#0184bc;">recipients}</span><span>() -&gt; list&lt;string&gt; </span></code></pre> <p>We can try out the <code>archive</code> component by first invoking the <code>store</code> function, and then the <code>get-all</code> function, using the CLI's <code>worker invoke-and-await</code> command:</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">$</span><span> golem worker invoke-and-await</span><span style="color:#e45649;"> --worker</span><span> urn:worker:bde2da89-75a8-4adf-953f-33b360c978d0/archive</span><span style="color:#e45649;"> --function </span><span style="color:#50a14f;">&#39;demo:archive-interface/api.{store}&#39;</span><span style="color:#e45649;"> --arg </span><span style="color:#50a14f;">&#39;&quot;list1&quot;&#39;</span><span style="color:#e45649;"> --arg </span><span style="color:#50a14f;">&#39;[&quot;x&quot;, &quot;y&quot;, &quot;z&quot;]&#39; </span><span style="color:#e45649;">Empty</span><span> result. 
</span><span> </span><span style="color:#e45649;">$</span><span> golem worker invoke-and-await</span><span style="color:#e45649;"> --worker</span><span> urn:worker:bde2da89-75a8-4adf-953f-33b360c978d0/archive</span><span style="color:#e45649;"> --function </span><span style="color:#50a14f;">&#39;demo:archive-interface/api.{get-all}&#39; </span><span style="color:#e45649;">Invocation</span><span> results in WAVE format: </span><span style="color:#e45649;">- </span><span style="color:#50a14f;">&#39;[{name: &quot;list1&quot;, items: [&quot;x&quot;, &quot;y&quot;, &quot;z&quot;]}]&#39; </span></code></pre> <p>Similarly, we can try out the <code>list</code> component, keeping in mind that the <strong>worker name</strong> is the list name:</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span> </span></code></pre> <p>When we try out <code>list</code>, we get an error (and if we used the <code>debug</code> profile, by passing <code>--build-profile debug</code>, we also get a nice call stack):</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>Failed to create worker b6420554-62b5-4902-8994-89c692a937f7/list6: Failed to instantiate worker -1/b6420554-62b5-4902-8994-89c692a937f7/list6: error while executing at wasm backtrace: </span><span> 0: 0x19526 - wit-component:shim!indirect-wasi:clocks/[email protected] </span><span> 1: 0x414b - &lt;unknown&gt;!demo/lst/interface/wasi/clocks/wallClock.wasmImportNow </span><span> 2: 0x4165 - &lt;unknown&gt;!demo/lst/interface/wasi/clocks/wallClock.now </span><span> 3: 0x42c1 - &lt;unknown&gt;!demo/lst/src.now </span><span> 4: 0x433d - &lt;unknown&gt;!@demo/lst/src.State::update_email_properties </span><span> 5: 0x440e - &lt;unknown&gt;!@demo/lst/src.State::new </span><span> 6: 0x5d81 - &lt;unknown&gt;!*init*/38 </span></code></pre> <p>The reason is that we are creating a global variable of type <code>State</code>, and in its constructor we are 
trying to call a WASI function (to get the current date-time). This is too early for that, so let's modify the <code>State::new</code> method to not call any host functions:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Creates a new empty document editing state </span><span style="color:#a626a4;">pub</span><span> fn State::new() </span><span style="color:#a626a4;">-&gt; </span><span>State { </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">state </span><span style="color:#a626a4;">=</span><span> { </span><span> </span><span style="color:#e45649;">document</span><span>: Document::new(), </span><span> </span><span style="color:#e45649;">connected</span><span>: Map::new(), </span><span> </span><span style="color:#e45649;">last_connection_id</span><span>: ConnectionId(</span><span style="color:#c18401;">0</span><span>), </span><span> </span><span style="color:#e45649;">archived</span><span>: </span><span style="color:#c18401;">false</span><span>, </span><span> </span><span style="color:#e45649;">email_deadline</span><span>: @datetime.DateTime::from_unix_mseconds(</span><span style="color:#c18401;">0</span><span>), </span><span style="color:#a0a1a7;">// Note: can&#39;t use now() here because it would run at initialization time (due to the global `state` variable) </span><span> </span><span style="color:#e45649;">email_recipients</span><span>: [], </span><span> } </span><span> </span><span style="color:#e45649;">state </span><span>} </span></code></pre> <p>This fixes the issue! 
Now we can create and play with our collaboratively editable lists:</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">$</span><span> golem worker start</span><span style="color:#e45649;"> --component</span><span> urn:component:b6420554-62b5-4902-8994-89c692a937f7</span><span style="color:#e45649;"> --worker-name</span><span> list7 </span><span style="color:#e45649;">Added</span><span> worker list7 </span><span> </span><span style="color:#e45649;">Worker</span><span> URN: urn:worker:b6420554-62b5-4902-8994-89c692a937f7/list7 </span><span style="color:#e45649;">Component</span><span> URN: urn:component:b6420554-62b5-4902-8994-89c692a937f7 </span><span style="color:#e45649;">Worker</span><span> name: list7 </span><span> </span><span style="color:#e45649;">$</span><span> golem worker invoke-and-await</span><span style="color:#e45649;"> --component</span><span> urn:component:b6420554-62b5-4902-8994-89c692a937f7</span><span style="color:#e45649;"> --worker-name</span><span> list7</span><span style="color:#e45649;"> --function </span><span style="color:#50a14f;">&#39;demo:lst-interface/api.{connect}&#39;</span><span style="color:#e45649;"> --arg </span><span style="color:#50a14f;">&#39;&quot;[email protected]&quot;&#39; </span><span style="color:#e45649;">Invocation</span><span> results in WAVE format: </span><span style="color:#e45649;">- </span><span style="color:#50a14f;">&#39;({id: 1}, [])&#39; </span><span> </span><span style="color:#e45649;">$</span><span> golem worker invoke-and-await</span><span style="color:#e45649;"> --component</span><span> urn:component:b6420554-62b5-4902-8994-89c692a937f7</span><span style="color:#e45649;"> --worker-name</span><span> list7</span><span style="color:#e45649;"> --function </span><span style="color:#50a14f;">&#39;demo:lst-interface/api.{add}&#39;</span><span style="color:#e45649;"> --arg </span><span 
style="color:#50a14f;">&#39;{ id: 1}&#39;</span><span style="color:#e45649;"> --arg </span><span style="color:#50a14f;">&#39;&quot;a&quot;&#39; </span><span style="color:#e45649;">Invocation</span><span> results in WAVE format: </span><span style="color:#e45649;">-</span><span> ok </span><span> </span><span style="color:#e45649;">$</span><span> golem worker invoke-and-await</span><span style="color:#e45649;"> --component</span><span> urn:component:b6420554-62b5-4902-8994-89c692a937f7</span><span style="color:#e45649;"> --worker-name</span><span> list7</span><span style="color:#e45649;"> --function </span><span style="color:#50a14f;">&#39;demo:lst-interface/api.{add}&#39;</span><span style="color:#e45649;"> --arg </span><span style="color:#50a14f;">&#39;{ id: 1}&#39;</span><span style="color:#e45649;"> --arg </span><span style="color:#50a14f;">&#39;&quot;b&quot;&#39; </span><span style="color:#e45649;">Invocation</span><span> results in WAVE format: </span><span style="color:#e45649;">-</span><span> ok </span><span> </span><span style="color:#e45649;">$</span><span> golem worker invoke-and-await</span><span style="color:#e45649;"> --component</span><span> urn:component:b6420554-62b5-4902-8994-89c692a937f7</span><span style="color:#e45649;"> --worker-name</span><span> list7</span><span style="color:#e45649;"> --function </span><span style="color:#50a14f;">&#39;demo:lst-interface/api.{connect}&#39;</span><span style="color:#e45649;"> --arg </span><span style="color:#50a14f;">&#39;&quot;[email protected]&quot;&#39; </span><span style="color:#e45649;">Invocation</span><span> results in WAVE format: </span><span style="color:#e45649;">- </span><span style="color:#50a14f;">&#39;({id: 2}, [&quot;a&quot;, &quot;b&quot;])&#39; </span></code></pre> <h2 id="worker-to-worker-communication">Worker to Worker communication</h2> <h3 id="list-calling-archive">List calling archive</h3> <p>The first worker-to-worker communication we want to set up is the <code>list</code> component 
calling the <code>archive</code> component - basically, when we call <code>archive()</code> on the list, it needs to call <code>store</code> in a singleton archive worker, sending its data to it.</p> <p>The first step is to simply state this dependency in the app manifest of <code>list</code>:</p> <pre data-lang="yaml" style="background-color:#fafafa;color:#383a42;" class="language-yaml "><code class="language-yaml" data-lang="yaml"><span style="color:#e45649;">components</span><span>: </span><span> </span><span style="color:#e45649;">list</span><span>: </span><span> </span><span style="color:#e45649;">template</span><span>: </span><span style="color:#50a14f;">moonbit </span><span> </span><span style="color:#e45649;">dependencies</span><span>: </span><span> </span><span style="color:#e45649;">list</span><span>: </span><span> - </span><span style="color:#e45649;">type</span><span>: </span><span style="color:#50a14f;">wasm-rpc </span><span> </span><span style="color:#e45649;">target</span><span>: </span><span style="color:#50a14f;">archive </span></code></pre> <p>Running <code>golem app build</code> after this will run a lot of new build steps - including generating and compiling some Rust source code, which is something that will no longer be needed in the next release of Golem.</p> <p>We are not going into details of what is generated for worker to worker communication in this post - what is important is that after this change, and running build once, we can <strong>import</strong> a generated <strong>stub</strong> of our <code>archive</code> component in our <code>list</code> component's moonbit package:</p> <pre data-lang="json" style="background-color:#fafafa;color:#383a42;" class="language-json "><code class="language-json" data-lang="json"><span>{ </span><span> </span><span style="color:#50a14f;">&quot;import&quot;</span><span>: [ </span><span> </span><span style="color:#50a14f;">&quot;suiyunonghen/datetime&quot;</span><span>, </span><span> { </span><span 
style="color:#50a14f;">&quot;path&quot; </span><span>: </span><span style="color:#50a14f;">&quot;demo/lst/interface/wasi/clocks/wallClock&quot;</span><span>, </span><span style="color:#50a14f;">&quot;alias&quot; </span><span>: </span><span style="color:#50a14f;">&quot;wallClock&quot; </span><span>}, </span><span> { </span><span style="color:#50a14f;">&quot;path&quot; </span><span>: </span><span style="color:#50a14f;">&quot;demo/lst/interface/demo/archive_stub/stubArchive&quot;</span><span>, </span><span style="color:#50a14f;">&quot;alias&quot;</span><span>: </span><span style="color:#50a14f;">&quot;stubArchive&quot; </span><span>}, </span><span> { </span><span style="color:#50a14f;">&quot;path&quot; </span><span>: </span><span style="color:#50a14f;">&quot;demo/lst/interface/golem/rpc/types&quot;</span><span>, </span><span style="color:#50a14f;">&quot;alias&quot;</span><span>: </span><span style="color:#50a14f;">&quot;rpcTypes&quot; </span><span>} </span><span> ] </span><span>} </span></code></pre> <p>Then we can add the following code into our <code>archive</code> function to call the remote worker:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">archive_component_id </span><span style="color:#a626a4;">= </span><span style="color:#50a14f;">&quot;</span><span>bde2da89-75a8-4adf-953f-33b360c978d0</span><span style="color:#50a14f;">&quot;</span><span>; </span><span style="color:#a0a1a7;">// TODO </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">archive </span><span style="color:#a626a4;">= </span><span>@stubArchive.Api::api({ </span><span style="color:#e45649;">value</span><span>: </span><span style="color:#50a14f;">&quot;</span><span>urn:worker:\{archive_component_id}/archive</span><span style="color:#50a14f;">&quot;</span><span>}); 
</span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">name </span><span style="color:#a626a4;">= </span><span style="color:#50a14f;">&quot;</span><span>TODO</span><span style="color:#50a14f;">&quot;</span><span>; </span><span style="color:#a0a1a7;">// TODO </span><span> </span><span> </span><span style="color:#e45649;">archive</span><span>.blocking_store(</span><span style="color:#e45649;">name</span><span>, </span><span style="color:#e45649;">self</span><span>.document.iter().to_array()) </span></code></pre> <p>In line 2 we construct the remote interface by pointing to a specific <strong>worker</strong>, using the component ID and the worker's name. (In the next Golem release this is going to be simplified by being able to use the component's name instead.) In line 5 we call the remote <code>store</code> function.</p> <p>Two things are still missing:</p> <ul> <li>We should not hard-code the archive component's ID as it is automatically generated when the component is first uploaded to Golem</li> <li>We need to know our own <strong>worker name</strong> to be used as the list's name</li> </ul> <p>The solution to both is to use <strong>environment variables</strong> - Golem automatically sets the <code>GOLEM_WORKER_NAME</code> environment variable to the worker's name, and we can manually provide values to workers through custom environment variables. This allows us to inject the component ID from the outside (until a more sophisticated configuration feature is added in Golem 1.2).</p> <p>We have already seen how we can use WASI to query the current date/time; we can use another WASI interface to get environment variables. 
So once again, we add an import to our WIT file:</p> <pre data-lang="wit" style="background-color:#fafafa;color:#383a42;" class="language-wit "><code class="language-wit" data-lang="wit"><span> </span><span style="color:#a626a4;">import </span><span>wasi:cli/environment@</span><span style="color:#c18401;">0.2.0</span><span>; </span></code></pre> <p>Then run <code>golem app build</code> to regenerate the bindings, and import it in the <code>list/src</code> MoonBit package:</p> <pre data-lang="json" style="background-color:#fafafa;color:#383a42;" class="language-json "><code class="language-json" data-lang="json"><span> { </span><span style="color:#50a14f;">&quot;path&quot; </span><span>: </span><span style="color:#50a14f;">&quot;demo/lst/interface/wasi/cli/environment&quot;</span><span>, </span><span style="color:#50a14f;">&quot;alias&quot;</span><span>: </span><span style="color:#50a14f;">&quot;environment&quot; </span><span>} </span></code></pre> <p>and implement a helper function to get a specific key from the environment variables:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Gets an environment variable using WASI </span><span>fn get_env(</span><span style="color:#e45649;">key</span><span> : String) </span><span style="color:#a626a4;">-&gt; </span><span>String? 
{ </span><span> @environment.get_environment() </span><span> .iter() </span><span> .find_first(fn(</span><span style="color:#e45649;">pair</span><span>) { </span><span> </span><span style="color:#e45649;">pair</span><span>.</span><span style="color:#c18401;">0 </span><span style="color:#a626a4;">== </span><span style="color:#e45649;">key </span><span> }) </span><span> .map(fn(</span><span style="color:#e45649;">pair</span><span>) { </span><span> </span><span style="color:#e45649;">pair</span><span>.</span><span style="color:#c18401;">1 </span><span> }) </span><span>} </span></code></pre> <p>We can use this to get the worker's name and the archive component ID:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a626a4;">let </span><span style="color:#e45649;">archive_component_id </span><span style="color:#a626a4;">= </span><span>get_env(</span><span style="color:#50a14f;">&quot;</span><span>ARCHIVE_COMPONENT_ID</span><span style="color:#50a14f;">&quot;</span><span>).or(</span><span style="color:#50a14f;">&quot;</span><span>unknown</span><span style="color:#50a14f;">&quot;</span><span>); </span><span style="color:#a0a1a7;">// ... 
</span><span style="color:#a626a4;">let </span><span style="color:#e45649;">name </span><span style="color:#a626a4;">= </span><span>get_env(</span><span style="color:#50a14f;">&quot;</span><span>GOLEM_WORKER_NAME</span><span style="color:#50a14f;">&quot;</span><span>).or(</span><span style="color:#50a14f;">&quot;</span><span>unknown</span><span style="color:#50a14f;">&quot;</span><span>); </span></code></pre> <p>When starting the list workers, we have to explicitly specify <code>ARCHIVE_COMPONENT_ID</code>:</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">$</span><span> golem worker start</span><span style="color:#e45649;"> --component</span><span> urn:component:b6420554-62b5-4902-8994-89c692a937f7</span><span style="color:#e45649;"> --worker-name</span><span> list10</span><span style="color:#e45649;"> --env </span><span style="color:#50a14f;">&quot;ARCHIVE_COMPONENT_ID=bde2da89-75a8-4adf-953f-33b360c978d0&quot; </span></code></pre> <p>With that we can try connecting to the list, adding some items and then calling <code>archive</code> on it, and finally calling <code>get-all</code> on the archive worker - we can see that the remote procedure call works!</p> <h3 id="list-and-email">List and email</h3> <p>We haven't implemented the third component of the application yet - the one responsible for sending an email after some deadline. Setting up the component and the worker-to-worker communication works exactly as demonstrated above. The app manifest supports circular dependencies, so we can say that <code>list</code> depends on <code>email</code> via <code>wasm-rpc</code>, and also that <code>email</code> depends on <code>list</code> via <code>wasm-rpc</code>. 
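</p> <p>As a rough sketch of what such a circular setup could look like, the two sub-manifests below mirror the shape of the <code>list</code> manifest shown earlier. The file locations, the nesting of the top-level <code>dependencies</code> key, and the assumption that the third component is simply named <code>email</code> and uses the same <code>moonbit</code> template are illustrative guesses, not taken from the final project:</p> <pre data-lang="yaml" style="background-color:#fafafa;color:#383a42;" class="language-yaml "><code class="language-yaml" data-lang="yaml"># email/golem.yaml (hypothetical location, mirroring list/golem.yaml)
components:
  email:
    template: moonbit
dependencies:
  email:
    - type: wasm-rpc
      target: list   # email calls back into the list worker
---
# list/golem.yaml (the existing archive dependency plus the assumed email one)
components:
  list:
    template: moonbit
dependencies:
  list:
    - type: wasm-rpc
      target: archive
    - type: wasm-rpc
      target: email  # list starts the email worker
</code></pre> <p>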
We need to communicate in both directions.</p> <p>We will have to use the WASI <code>monotonic-clock</code> interface's <code>subscribe-duration</code> function to <strong>sleep</strong> until the given deadline.</p> <p>Without showing all the details, here is the MoonBit code implementing the single <code>send-email</code> function we defined in the <code>email.wit</code> file:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Structure holding an email sender&#39;s configuration </span><span style="color:#a626a4;">pub</span><span>(</span><span style="color:#e45649;">all</span><span>) </span><span style="color:#a626a4;">struct </span><span>Email { </span><span> </span><span style="color:#e45649;">list_worker_urn</span><span> : String </span><span>} </span><span> </span><span style="color:#a0a1a7;">///| Run the email sending loop </span><span style="color:#a626a4;">pub</span><span> fn run(</span><span style="color:#e45649;">self</span><span> : Email) </span><span style="color:#a626a4;">-&gt; </span><span>Unit { </span><span> </span><span style="color:#a626a4;">while </span><span style="color:#c18401;">true</span><span> { </span><span> </span><span style="color:#a626a4;">match </span><span style="color:#e45649;">self</span><span>.get_deadline() { </span><span> Some(</span><span style="color:#e45649;">epoch_ms</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> { </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">now </span><span style="color:#a626a4;">= </span><span>@wallClock.now() </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">now_ms </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">now</span><span>.seconds </span><span style="color:#a626a4;">* </span><span style="color:#c18401;">1000 </span><span 
style="color:#a626a4;">+ </span><span> (</span><span style="color:#e45649;">now</span><span>.nanoseconds.reinterpret_as_int() </span><span style="color:#a626a4;">/ </span><span style="color:#c18401;">1000000</span><span>).to_uint64() </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">duration_ms </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">epoch_ms</span><span>.reinterpret_as_int64() </span><span style="color:#a626a4;">- </span><span> </span><span style="color:#e45649;">now_ms</span><span>.reinterpret_as_int64() </span><span> </span><span style="color:#a626a4;">if </span><span style="color:#e45649;">duration_ms </span><span style="color:#a626a4;">&gt; </span><span style="color:#c18401;">0</span><span> { </span><span> sleep(</span><span style="color:#e45649;">duration_ms</span><span>.reinterpret_as_uint64()) </span><span> } </span><span style="color:#a626a4;">else</span><span> { </span><span> send_emails(</span><span style="color:#e45649;">self</span><span>.get_recipients()) </span><span> } </span><span> </span><span style="color:#a626a4;">continue </span><span> } </span><span> None </span><span style="color:#a626a4;">=&gt; break </span><span> } </span><span> } </span><span>} </span></code></pre> <p>We use the <code>wall-clock</code> interface again to query the current time and calculate the duration to sleep for based on the deadline obtained from the corresponding list worker. 
The <code>get_deadline</code> and <code>get_recipients</code> methods are just using Golem's worker-to-worker communication as shown before:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Get the current deadline from the associated list worker </span><span>fn get_deadline(</span><span style="color:#e45649;">self</span><span> : Email) </span><span style="color:#a626a4;">-&gt; </span><span>UInt64? { </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">api </span><span style="color:#a626a4;">= </span><span>@stubLst.EmailQuery::email_query({ </span><span style="color:#e45649;">value</span><span>: </span><span style="color:#e45649;">self</span><span>.list_worker_urn }) </span><span> </span><span style="color:#e45649;">api</span><span>.blocking_deadline() </span><span>} </span><span> </span><span style="color:#a0a1a7;">///| Get the current list of recipients from the associated list worker </span><span>fn get_recipients(</span><span style="color:#e45649;">self</span><span> : Email) </span><span style="color:#a626a4;">-&gt; </span><span>Array[String] { </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">api </span><span style="color:#a626a4;">= </span><span>@stubLst.EmailQuery::email_query({ </span><span style="color:#e45649;">value</span><span>: </span><span style="color:#e45649;">self</span><span>.list_worker_urn }) </span><span> </span><span style="color:#e45649;">api</span><span>.blocking_recipients() </span><span>} </span></code></pre> <p>The two remaining interesting parts are sleeping and sending emails.</p> <p>We can <strong>sleep</strong> by calling the <code>subscribe-duration</code> function in the WASI <code>monotonic-clock</code> package to get a pollable, and then polling it. 
As we only pass a single pollable in the list, <code>poll</code> won't return until the deadline we want to wait for expires:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Sleep for the given amount of milliseconds </span><span>fn sleep(</span><span style="color:#e45649;">ms</span><span> : UInt64) </span><span style="color:#a626a4;">-&gt; </span><span>Unit { </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">ns </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">ms </span><span style="color:#a626a4;">* </span><span style="color:#c18401;">1000000 </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">pollable </span><span style="color:#a626a4;">= </span><span>@monotonicClock.subscribe_duration(</span><span style="color:#e45649;">ns</span><span>) </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">= </span><span>@poll.poll([</span><span style="color:#e45649;">pollable</span><span>]) </span><span>} </span></code></pre> <p>On the <code>list</code> side, we don't want to block while this email sending loop runs - as it would block our list from receiving new requests. 
The generated RPC stubs support this, we simply use the non-blocking version on the generated <code>Api</code> type:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span> </span><span style="color:#a626a4;">if not</span><span>(</span><span style="color:#e45649;">self</span><span>.email_worker_started) { </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">email_component_id </span><span style="color:#a626a4;">= </span><span>get_env(</span><span style="color:#50a14f;">&quot;</span><span>EMAIL_COMPONENT_ID</span><span style="color:#50a14f;">&quot;</span><span>).or(</span><span style="color:#50a14f;">&quot;</span><span>unknown</span><span style="color:#50a14f;">&quot;</span><span>); </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">name </span><span style="color:#a626a4;">= </span><span>get_env(</span><span style="color:#50a14f;">&quot;</span><span>GOLEM_WORKER_NAME</span><span style="color:#50a14f;">&quot;</span><span>).or(</span><span style="color:#50a14f;">&quot;</span><span>unknown</span><span style="color:#50a14f;">&quot;</span><span>) </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">self_component_id </span><span style="color:#a626a4;">= </span><span>get_env(</span><span style="color:#50a14f;">&quot;</span><span>GOLEM_COMPONENT_ID</span><span style="color:#50a14f;">&quot;</span><span>).or(</span><span style="color:#50a14f;">&quot;</span><span>unknown</span><span style="color:#50a14f;">&quot;</span><span>) </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">api </span><span style="color:#a626a4;">= </span><span>@stubEmail.Api::api({ </span><span style="color:#e45649;">value</span><span>: </span><span style="color:#50a14f;">&quot;</span><span>urn:worker:\{email_component_id}:\{name}</span><span 
style="color:#50a14f;">&quot;</span><span>}) </span><span> </span><span style="color:#e45649;">api</span><span>.send_email({ </span><span style="color:#e45649;">value</span><span>: </span><span style="color:#50a14f;">&quot;</span><span>urn:worker:\{self_component_id}:\{name}</span><span style="color:#50a14f;">&quot;</span><span>}) </span><span> </span><span style="color:#e45649;">self</span><span>.email_worker_started </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">true</span><span>; </span><span> } </span></code></pre> <h2 id="sending-emails">Sending emails</h2> <p>Sending actual emails is a bit more difficult, as there are no HTTP client libraries in the MoonBit ecosystem at the moment. But Golem implements the WASI HTTP interface, so we can use the already demonstrated techniques to import WASI HTTP through WIT, generate bindings for it, and then use it from MoonBit code to send emails through a third party provider.</p> <p>In the example we are going to use <a href="https://sendgrid.com/en-us">SendGrid</a> as a provider. 
This means we have to send an HTTP <strong>POST</strong> request to <code>https://api.sendgrid.com/v3/mail/send</code> with a pre-configured authorization header, and a JSON body describing our email sending request.</p> <p>First we are going to define a few helper constants and functions to assemble the parts of the request:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a626a4;">const </span><span>AUTHORITY : String </span><span style="color:#a626a4;">= </span><span style="color:#50a14f;">&quot;</span><span>api.sendgrid.com</span><span style="color:#50a14f;">&quot; </span><span style="color:#a626a4;">const </span><span>PATH : String </span><span style="color:#a626a4;">= </span><span style="color:#50a14f;">&quot;</span><span>/v3/mail/send</span><span style="color:#50a14f;">&quot; </span><span> </span><span style="color:#a626a4;">type! </span><span>HttpClientError String </span></code></pre> <p>The payload is JSON, which can be constructed using MoonBit's built-in JSON literal feature. However, in the WASI HTTP interface we have to write it out as a byte array. MoonBit strings are UTF-16 but SendGrid requires the payload to be in UTF-8. 
Unfortunately there isn't any string encoding library available for MoonBit yet, so we write a simple function that fails if any of the characters is not ASCII:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Converts a string to ASCII byte array if all characters are ASCII characters, otherwise fails </span><span>fn string_to_ascii( </span><span> </span><span style="color:#e45649;">what</span><span> : String, </span><span> </span><span style="color:#e45649;">value</span><span> : String </span><span>) </span><span style="color:#a626a4;">-&gt; </span><span>FixedArray[Byte]!HttpClientError { </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">result </span><span style="color:#a626a4;">= </span><span>FixedArray::makei(</span><span style="color:#e45649;">value</span><span>.length(), fn(</span><span style="color:#e45649;">_</span><span>) { </span><span style="color:#e45649;">b</span><span style="color:#50a14f;">&#39; &#39;</span><span> }) </span><span> </span><span style="color:#a626a4;">for </span><span style="color:#e45649;">i</span><span>, </span><span style="color:#e45649;">ch </span><span style="color:#a626a4;">in </span><span style="color:#e45649;">value</span><span> { </span><span> </span><span style="color:#a626a4;">if </span><span style="color:#e45649;">ch</span><span>.to_int() </span><span style="color:#a626a4;">&lt; </span><span style="color:#c18401;">128</span><span> { </span><span> </span><span style="color:#e45649;">result</span><span>[</span><span style="color:#e45649;">i</span><span>] </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">ch</span><span>.to_int().to_byte() </span><span> } </span><span style="color:#a626a4;">else</span><span> { </span><span> </span><span style="color:#a626a4;">raise </span><span>HttpClientError(</span><span 
style="color:#50a14f;">&quot;</span><span>The \{what} contains non-ASCII characters</span><span style="color:#50a14f;">&quot;</span><span>) </span><span> } </span><span> } </span><span> </span><span style="color:#e45649;">result </span><span>} </span></code></pre> <p>With this we can construct the payload and we can also read the SendGrid API key from an environment variable:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a0a1a7;">///| Constructs a SendGrid send message payload as an ASCII byte array </span><span>fn payload(</span><span style="color:#e45649;">recipients</span><span> : Array[String]) </span><span style="color:#a626a4;">-&gt; </span><span>FixedArray[Byte]!HttpClientError { </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">email_addresses </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">recipients </span><span> .iter() </span><span> .map(fn(</span><span style="color:#e45649;">email</span><span>) { { </span><span style="color:#50a14f;">&quot;</span><span>email</span><span style="color:#50a14f;">&quot;</span><span>: </span><span style="color:#e45649;">email</span><span>, </span><span style="color:#50a14f;">&quot;</span><span>name</span><span style="color:#50a14f;">&quot;</span><span>: </span><span style="color:#e45649;">email</span><span> } }) </span><span> .to_array() </span><span> .to_json() </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">from</span><span> : Json </span><span style="color:#a626a4;">=</span><span> { </span><span style="color:#50a14f;">&quot;</span><span>email</span><span style="color:#50a14f;">&quot;</span><span>: </span><span style="color:#50a14f;">&quot;</span><span>[email protected]</span><span style="color:#50a14f;">&quot;</span><span>, </span><span 
style="color:#50a14f;">&quot;</span><span>name</span><span style="color:#50a14f;">&quot;</span><span>: </span><span style="color:#50a14f;">&quot;</span><span>Daniel Vigovszky</span><span style="color:#50a14f;">&quot;</span><span> } </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">json</span><span> : Json </span><span style="color:#a626a4;">=</span><span> { </span><span> </span><span style="color:#50a14f;">&quot;</span><span>personalizations</span><span style="color:#50a14f;">&quot;</span><span>: [{ </span><span style="color:#50a14f;">&quot;</span><span>to</span><span style="color:#50a14f;">&quot;</span><span>: </span><span style="color:#e45649;">email_addresses</span><span>, </span><span style="color:#50a14f;">&quot;</span><span>cc</span><span style="color:#50a14f;">&quot;</span><span>: [], </span><span style="color:#50a14f;">&quot;</span><span>bcc</span><span style="color:#50a14f;">&quot;</span><span>: [] }], </span><span> </span><span style="color:#50a14f;">&quot;</span><span>from</span><span style="color:#50a14f;">&quot;</span><span>: </span><span style="color:#e45649;">from</span><span>, </span><span> </span><span style="color:#50a14f;">&quot;</span><span>subject</span><span style="color:#50a14f;">&quot;</span><span>: </span><span style="color:#50a14f;">&quot;</span><span>Collaborative list editor warning</span><span style="color:#50a14f;">&quot;</span><span>, </span><span> </span><span style="color:#50a14f;">&quot;</span><span>content</span><span style="color:#50a14f;">&quot;</span><span>: [ </span><span> { </span><span> </span><span style="color:#50a14f;">&quot;</span><span>type</span><span style="color:#50a14f;">&quot;</span><span>: </span><span style="color:#50a14f;">&quot;</span><span>text/html</span><span style="color:#50a14f;">&quot;</span><span>, </span><span> </span><span style="color:#50a14f;">&quot;</span><span>value</span><span style="color:#50a14f;">&quot;</span><span>: </span><span 
style="color:#50a14f;">&quot;</span><span>&lt;p&gt;The list opened for editing has not been changed in the last 12 hours&lt;/p&gt;</span><span style="color:#50a14f;">&quot;</span><span>, </span><span> }, </span><span> ], </span><span> } </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">json_str </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">json</span><span>.to_string() </span><span> string_to_ascii!(</span><span style="color:#50a14f;">&quot;</span><span>constructed JSON body</span><span style="color:#50a14f;">&quot;</span><span>, </span><span style="color:#e45649;">json_str</span><span>) </span><span>} </span><span> </span><span style="color:#a0a1a7;">///| Gets the SENDGRID_API_KEY environment variable as an ASCII byte array </span><span>fn authorization_header() </span><span style="color:#a626a4;">-&gt; </span><span>FixedArray[Byte]!HttpClientError { </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">key_str </span><span style="color:#a626a4;">= </span><span>@environment.get_environment() </span><span> .iter() </span><span> .find_first(fn(</span><span style="color:#e45649;">pair</span><span>) { </span><span style="color:#e45649;">pair</span><span>.</span><span style="color:#c18401;">0 </span><span style="color:#a626a4;">== </span><span style="color:#50a14f;">&quot;</span><span>SENDGRID_API_KEY</span><span style="color:#50a14f;">&quot;</span><span> }) </span><span> .map(fn(</span><span style="color:#e45649;">pair</span><span>) { </span><span style="color:#e45649;">pair</span><span>.</span><span style="color:#c18401;">1</span><span> }) </span><span> .unwrap() </span><span> string_to_ascii!( </span><span> </span><span style="color:#50a14f;">&quot;</span><span>provided authorization header via SENDGRID_API_KEY</span><span style="color:#50a14f;">&quot;</span><span>, </span><span style="color:#e45649;">key_str</span><span>, </span><span> ) </span><span>} 
</span></code></pre> <p>The next step is to create the data structures for sending out the HTTP request. In WASI HTTP, outgoing requests are modeled as WIT <strong>resources</strong>, which means we have to construct them with a constructor and call various methods to set properties of the request. All these methods have a <code>Result</code> result type so our code is going to be quite verbose:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">headers </span><span style="color:#a626a4;">= </span><span>@httpTypes.Fields::fields() </span><span> </span><span style="color:#e45649;">headers </span><span> .append(</span><span style="color:#50a14f;">&quot;</span><span>Authorization</span><span style="color:#50a14f;">&quot;</span><span>, authorization_header!()) </span><span> .map_err(fn(</span><span style="color:#e45649;">error</span><span>) { </span><span> HttpClientError(</span><span style="color:#50a14f;">&quot;</span><span>Failed to set Authorization header: \{error}</span><span style="color:#50a14f;">&quot;</span><span>) </span><span> }) </span><span> .unwrap_or_error!() </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">request </span><span style="color:#a626a4;">= </span><span>@httpTypes.OutgoingRequest::outgoing_request(</span><span style="color:#e45649;">headers</span><span>) </span><span> </span><span style="color:#e45649;">request </span><span> .set_authority(Some(AUTHORITY)) </span><span> .map_err(fn(</span><span style="color:#e45649;">_</span><span>) { HttpClientError(</span><span style="color:#50a14f;">&quot;</span><span>Failed to set request authority</span><span style="color:#50a14f;">&quot;</span><span>) }) </span><span> .unwrap_or_error!() </span><span> </span><span style="color:#e45649;">request </span><span> 
.set_method(@httpTypes.Method::Post) </span><span> .map_err(fn(</span><span style="color:#e45649;">_</span><span>) { HttpClientError(</span><span style="color:#50a14f;">&quot;</span><span>Failed to set request method</span><span style="color:#50a14f;">&quot;</span><span>) }) </span><span> .unwrap_or_error!() </span><span> </span><span style="color:#e45649;">request </span><span> .set_path_with_query(Some(PATH)) </span><span> .map_err(fn(</span><span style="color:#e45649;">_</span><span>) { HttpClientError(</span><span style="color:#50a14f;">&quot;</span><span>Failed to set request path</span><span style="color:#50a14f;">&quot;</span><span>) }) </span><span> .unwrap_or_error!() </span><span> </span><span style="color:#e45649;">request </span><span> .set_scheme(Some(@httpTypes.Scheme::Https)) </span><span> .map_err(fn(</span><span style="color:#e45649;">_</span><span>) { HttpClientError(</span><span style="color:#50a14f;">&quot;</span><span>Failed to set request scheme</span><span style="color:#50a14f;">&quot;</span><span>) }) </span><span> .unwrap_or_error!() </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">outgoing_body </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">request </span><span> .body() </span><span> .map_err(fn(</span><span style="color:#e45649;">_</span><span>) { HttpClientError(</span><span style="color:#50a14f;">&quot;</span><span>Failed to get the outgoing body</span><span style="color:#50a14f;">&quot;</span><span>) }) </span><span> .unwrap_or_error!() </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">stream </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">outgoing_body </span><span> .write() </span><span> .map_err(fn(</span><span style="color:#e45649;">_</span><span>) { </span><span> HttpClientError(</span><span style="color:#50a14f;">&quot;</span><span>Failed to open the outgoing body stream</span><span 
style="color:#50a14f;">&quot;</span><span>) </span><span> }) </span><span> .unwrap_or_error!() </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">stream </span><span> .blocking_write_and_flush(payload!(</span><span style="color:#e45649;">recipients</span><span>)) </span><span> .map_err(fn(</span><span style="color:#e45649;">error</span><span>) { </span><span> HttpClientError(</span><span style="color:#50a14f;">&quot;</span><span>Failed to write request body: \{error}</span><span style="color:#50a14f;">&quot;</span><span>) </span><span> }) </span><span> .unwrap_or_error!() </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">outgoing_body </span><span> .finish(None) </span><span> .map_err(fn(</span><span style="color:#e45649;">_</span><span>) { HttpClientError(</span><span style="color:#50a14f;">&quot;</span><span>Failed to close the outgoing body</span><span style="color:#50a14f;">&quot;</span><span>) }) </span><span> .unwrap_or_error!() </span></code></pre> <p>At this point we have our <code>request</code> variable initialized with everything we need, so we can call the <code>handle</code> function to initiate the HTTP request:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">future_incoming_response </span><span style="color:#a626a4;">= </span><span>@outgoingHandler.handle(</span><span style="color:#e45649;">request</span><span>, None) </span><span> .map_err(fn(</span><span style="color:#e45649;">error</span><span>) { HttpClientError(</span><span style="color:#50a14f;">&quot;</span><span>Failed to send request: \{error}</span><span 
style="color:#50a14f;">&quot;</span><span>) }) </span><span> .unwrap_or_error!() </span></code></pre> <p>Sending a request is an async operation, and what we have as a result here is just a handle for a future value we have to await somehow. As we don't want to do anything else in parallel in this example, we just write a loop that waits for the result and checks for errors:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span> </span><span style="color:#a626a4;">while </span><span style="color:#c18401;">true</span><span> { </span><span> </span><span style="color:#a626a4;">match </span><span style="color:#e45649;">future_incoming_response</span><span>.get() { </span><span> Some(Ok(Ok(</span><span style="color:#e45649;">response</span><span>))) </span><span style="color:#a626a4;">=&gt;</span><span> { </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">status </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">response</span><span>.status() </span><span> </span><span style="color:#a626a4;">if </span><span style="color:#e45649;">status </span><span style="color:#a626a4;">&gt;= </span><span style="color:#c18401;">200 </span><span style="color:#a626a4;">&amp;&amp; </span><span style="color:#e45649;">status </span><span style="color:#a626a4;">&lt; </span><span style="color:#c18401;">300</span><span> { </span><span> </span><span style="color:#a626a4;">break </span><span> } </span><span style="color:#a626a4;">else</span><span> { </span><span> </span><span style="color:#a626a4;">raise </span><span>HttpClientError(</span><span style="color:#50a14f;">&quot;</span><span>Http request returned with status \{status}</span><span style="color:#50a14f;">&quot;</span><span>) </span><span> } </span><span> } </span><span> Some(Ok(Err(</span><span style="color:#e45649;">code</span><span>))) </span><span 
style="color:#a626a4;">=&gt; </span><span> </span><span style="color:#a626a4;">raise </span><span>HttpClientError(</span><span style="color:#50a14f;">&quot;</span><span>Http request failed with \{code}</span><span style="color:#50a14f;">&quot;</span><span>) </span><span> Some(Err(</span><span style="color:#e45649;">_</span><span>)) </span><span style="color:#a626a4;">=&gt; raise </span><span>HttpClientError(</span><span style="color:#50a14f;">&quot;</span><span>Http request failed</span><span style="color:#50a14f;">&quot;</span><span>) </span><span> None </span><span style="color:#a626a4;">=&gt;</span><span> { </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">pollable </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">future_incoming_response</span><span>.subscribe() </span><span> </span><span style="color:#a626a4;">let </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">= </span><span>@poll.poll([</span><span style="color:#e45649;">pollable</span><span>]) </span><span> </span><span> } </span><span> } </span><span> } </span></code></pre> <p>We are ignoring the response body in this example - but in other applications, <code>response</code> could be used to open an incoming body stream and read chunks from it.</p> <p>With this we implemented the simplest possible way to call the SendGrid API for sending an e-mail using WASI HTTP provided by Golem.</p> <h2 id="debugging">Debugging</h2> <p>When compiled in debug mode (using <code>golem app build --build-profile debug</code>), Golem shows a nice stack trace when something goes wrong in a MoonBit component. 
Another useful way to observe a worker is to write a <strong>log</strong> in it, which can be watched in real time (or queried later) using tools like <code>golem worker connect</code> or the Golem Console.</p> <p>The best way to write logs from MoonBit is to use the WASI Logging interface. 
We can import it as usual in our WITs:</p> <pre data-lang="wit" style="background-color:#fafafa;color:#383a42;" class="language-wit "><code class="language-wit" data-lang="wit"><span style="color:#a626a4;">import </span><span>wasi:logging/logging; </span></code></pre> <p>and then to our MoonBit packages:</p> <pre data-lang="json" style="background-color:#fafafa;color:#383a42;" class="language-json "><code class="language-json" data-lang="json"><span> </span><span style="color:#50a14f;">&quot;demo/archive/interface/wasi/logging/logging&quot; </span></code></pre> <p>and then write out log messages of various levels from our application logic:</p> <pre data-lang="moonbit" style="background-color:#fafafa;color:#383a42;" class="language-moonbit "><code class="language-moonbit" data-lang="moonbit"><span style="color:#a626a4;">let </span><span style="color:#e45649;">recipients </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">self</span><span>.get_recipients(); </span><span>@logging.log(@logging.Level::INFO, </span><span style="color:#50a14f;">&quot;&quot;</span><span>, </span><span style="color:#50a14f;">&quot;</span><span>Sending emails to recipients: \{recipients}</span><span style="color:#50a14f;">&quot;</span><span>) </span><span style="color:#a626a4;">match </span><span>send_emails?(</span><span style="color:#e45649;">recipients</span><span>) { </span><span> Ok(</span><span style="color:#e45649;">_</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span>@logging.log(@logging.Level::INFO, </span><span style="color:#50a14f;">&quot;&quot;</span><span>, </span><span style="color:#50a14f;">&quot;</span><span>Sending emails succeeded</span><span style="color:#50a14f;">&quot;</span><span>) </span><span> Err(</span><span style="color:#e45649;">error</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span>@logging.log(@logging.Level::ERROR, </span><span style="color:#50a14f;">&quot;&quot;</span><span>, </span><span 
style="color:#50a14f;">&quot;</span><span>Failed to send emails: \{error}</span><span style="color:#50a14f;">&quot;</span><span>) </span><span>} </span></code></pre> <h2 id="conclusion">Conclusion</h2> <p>MoonBit is a nice new language that is quite powerful and expressive, and seems to be a very good fit for developing applications for Golem. The resulting WASM binaries are very small - a few tens of kilobytes for this application (only increased by the generated Rust stubs - but those are going away soon). A few things in the language felt a little bit inconvenient - but maybe it is just a matter of personal taste - mostly the JSON files describing MoonBit packages, the anonymous function syntax and the way the built-in formatter organizes things. I'm sure some of these, especially the tooling, will greatly improve in the future.</p> <p>The support for WASM and the Component Model are still in an early stage - but working. It requires many manual steps, but fortunately Golem's app manifest feature can automate most of this for us. Still the generated directory structure of <code>wit-bindgen moonbit</code> felt a little overwhelming at first.</p> <p>I hope the MoonBit ecosystem will get some useful libraries in the near future, convenient wrappers for WASI and WASI HTTP, (and Golem specific ones!), string encoding utilities, etc. 
As there are not many libraries yet, it is very easy to find something useful to work on.</p> <p>I'm looking forward to having official support for MoonBit in Golem, such as templates for the <code>golem new ...</code> command and extensive documentation on our website.</p> [Video] Golem and the WASM Component Model @ LambdaConf 2024 2024-06-16T00:00:00+00:00 2024-06-16T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/golem-and-the-wasm-component-model/ <p>My talk at <a href="https://www.lambdaconf.us">LambdaConf 2024</a> explaining how <a href="https://golem.cloud">Golem</a> takes advantage of the WebAssembly Component Model.</p> <iframe width="800" height="450" src="https://www.youtube.com/embed/g5uUQSByvI4?si=lxlQgFztHp94WjrU" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe> Zig and the WASM Component Model 2024-05-09T00:00:00+00:00 2024-05-09T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/zig-wasm-component-model/ <p><a href="https://golem.cloud">Golem</a> always considered <a href="https://ziglang.org">Zig</a> a supported language, but until now the only documented way to use it was to compile a program with a single <code>main</code> function into a <em>core WebAssembly module</em> and then wrap that as a component that can be uploaded to Golem for execution. 
This is very limiting, as in order to take full advantage of Golem (and any other part of the evolving <em>WASM Component Model ecosystem</em>) a Zig program must have definitions for both <em>importing</em> and <em>exporting</em> functions and data types in order to be a usable component.</p> <h2 id="binding-generators">Binding generators</h2> <p>For many supported languages the workflow is to write a <strong>WIT</strong> file, which is the Component Model's <a href="https://component-model.bytecodealliance.org/design/wit.html">interface definition language</a>, and then use a <em>binding generator</em>, such as <a href="https://github.com/bytecodealliance/wit-bindgen/">wit-bindgen</a>, to create a statically typed representation of the component's imports and exports in the targeted language.</p> <p>The binding generator does not support Zig, but it does support C. So the best we can do with existing tooling is to use the C binding generator and Zig's excellent C interoperability together to be able to create WASM components with Zig.</p> <h2 id="the-steps">The steps</h2> <p>The primary steps are the following:</p> <ul> <li><strong>Define</strong> the component's interface using WIT</li> <li><strong>Generate</strong> C bindings from this definition</li> <li><strong>Implement</strong> the exported functions in Zig, potentially using other imported interfaces and data types available through the generated binding</li> <li><strong>Compile</strong> the whole project into WASM</li> <li>As Zig's standard library still uses <em>WASI Preview 1</em>, and outputs a single WASM module, we also have to <strong>compose</strong> our resulting module with an <em>adapter component</em> in order to get a WASM component depending on <em>WASI Preview 2</em>.</li> </ul> <p>The first step is manual work - although we may eventually get code-first approaches in some languages where the WIT interface is generated as part of the build flow, it is not the case for Zig at the moment.</p> 
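<p>As a rough illustration of the first step, a minimal WIT definition could look like the sketch below. The package name <code>demo:example</code>, the <code>api</code> interface, the <code>add</code> function and the <code>example</code> world are all hypothetical placeholders chosen for this sketch, not the interface used by the example in this post:</p> <pre data-lang="wit" style="background-color:#fafafa;color:#383a42;" class="language-wit "><code class="language-wit" data-lang="wit">// Hypothetical package name, for illustration only
package demo:example;

// Hypothetical interface with a single function
interface api {
  add: func(a: u32, b: u32) -&gt; u32;
}

// The world lists what the component exports (and imports)
world example {
  export api;
}
</code></pre> <p>A definition of this shape is what the binding generator takes as input; the functions of the exported interface are the ones that have to be implemented in Zig.</p>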
<p>For generating the bindings we use <code>wit-bindgen</code>, and once the implementation is done we compile the Zig source code, together with the generated C bindings, into a WASM module using Zig's build system (<code>zig build</code>).</p> <p>Finally we can use <code>wasm-tools component new</code> to take this WASM module and an appropriate version of a Preview1 adapter such as <a href="https://github.com/golemcloud/golem-wit/blob/main/adapters/tier1/wasi_snapshot_preview1.wasm">the one we provide for Golem</a> to get the final component that's ready to be used with Golem.</p> <h2 id="zig-s-build-system">Zig's build system</h2> <p>Executing all these steps manually is not convenient, but fortunately we can integrate all the steps within Zig's <em>build system</em>. Let's see how!</p> <p>We need to write a custom <code>build.zig</code> in the following way. First, let's do some imports and start defining our build flow:</p> <pre data-lang="zig" style="background-color:#fafafa;color:#383a42;" class="language-zig "><code class="language-zig" data-lang="zig"><span style="color:#a626a4;">const</span><span> std </span><span style="color:#a626a4;">= @import</span><span>(</span><span style="color:#50a14f;">&quot;std&quot;</span><span>); </span><span style="color:#a626a4;">const</span><span> Builder </span><span style="color:#a626a4;">=</span><span> std.build.Builder; </span><span style="color:#a626a4;">const</span><span> CrossTarget </span><span style="color:#a626a4;">=</span><span> std.zig.CrossTarget; </span><span> </span><span style="color:#a626a4;">pub fn </span><span style="color:#0184bc;">build</span><span>(</span><span style="color:#e45649;">b</span><span>: </span><span style="color:#a626a4;">*Builder</span><span>) </span><span style="color:#a626a4;">!void </span><span>{ </span></code></pre> <p>The first non-manual thing on our list of steps is <strong>generating</strong> the C bindings. 
Let's define a build step that just runs <code>wit-bindgen</code> for us:</p> <pre data-lang="zig" style="background-color:#fafafa;color:#383a42;" class="language-zig "><code class="language-zig" data-lang="zig"><span> </span><span style="color:#a626a4;">const</span><span> bindgen </span><span style="color:#a626a4;">=</span><span> b.</span><span style="color:#e45649;">addSystemCommand</span><span>(&amp;.{ </span><span style="color:#50a14f;">&quot;wit-bindgen&quot;</span><span>, </span><span style="color:#50a14f;">&quot;c&quot;</span><span>, </span><span style="color:#50a14f;">&quot;--autodrop-borrows&quot;</span><span>, </span><span style="color:#50a14f;">&quot;yes&quot;</span><span>, </span><span> </span><span style="color:#50a14f;">&quot;./wit&quot;</span><span>, </span><span style="color:#50a14f;">&quot;--out-dir&quot;</span><span>, </span><span style="color:#50a14f;">&quot;src/bindings&quot; </span><span>}); </span></code></pre> <p>This is just a description of running the binding generator, not integrated within the build flow yet. 
The next step is <strong>compiling</strong> our Zig and C files into WASM.</p> <p>First we define it as an <em>executable target</em>:</p> <pre data-lang="zig" style="background-color:#fafafa;color:#383a42;" class="language-zig "><code class="language-zig" data-lang="zig"><span> </span><span style="color:#a626a4;">const</span><span> optimize </span><span style="color:#a626a4;">=</span><span> b.</span><span style="color:#e45649;">standardOptimizeOption</span><span>(.{ </span><span> .preferred_optimize_mode </span><span style="color:#a626a4;">=</span><span style="color:#c18401;"> .ReleaseSmall</span><span>, </span><span> }); </span><span> </span><span style="color:#a626a4;">const</span><span> wasm </span><span style="color:#a626a4;">=</span><span> b.</span><span style="color:#e45649;">addExecutable</span><span>(.{ </span><span> .name </span><span style="color:#a626a4;">= </span><span style="color:#50a14f;">&quot;main&quot;</span><span>, </span><span> .root_source_file </span><span style="color:#a626a4;">=</span><span> .{ .path </span><span style="color:#a626a4;">= </span><span style="color:#50a14f;">&quot;src/main.zig&quot; </span><span>}, </span><span> .target </span><span style="color:#a626a4;">=</span><span> .{ </span><span> .cpu_arch </span><span style="color:#a626a4;">=</span><span style="color:#c18401;"> .wasm32</span><span>, </span><span> .os_tag </span><span style="color:#a626a4;">=</span><span style="color:#c18401;"> .wasi</span><span>, </span><span> }, </span><span> .optimize </span><span style="color:#a626a4;">=</span><span> optimize </span><span> }); </span></code></pre> <p>This already defines we want to use WASM and target WASI and points to our root source file. 
We are not done yet though: if we run the binding generator step defined above, we end up with a couple of files generated in our <code>src/bindings</code> directory:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>λ l src/bindings </span><span>.rw-r--r-- 909 vigoo 9 May 09:34 zig3.c </span><span>.rw-r--r-- 371 vigoo 9 May 09:34 zig3.h </span><span>.rw-r--r-- 299 vigoo 9 May 09:34 zig3_component_type.o </span></code></pre> <p>The <code>.c</code>/<code>.h</code> pair contains the generated binding, while the object file holds the binary representation of the WIT interface it was generated from.</p> <p>We need to add the C source and the object file into our build, and the header file to the include file paths. As the names of the generated files depend on the WIT file's contents, we need to list all files in this <code>bindings</code> directory and mutate our <code>wasm</code> build target according to what we find:</p> <pre data-lang="zig" style="background-color:#fafafa;color:#383a42;" class="language-zig "><code class="language-zig" data-lang="zig"><span> </span><span style="color:#a626a4;">const</span><span> binding_root </span><span style="color:#a626a4;">=</span><span> b.</span><span style="color:#e45649;">pathFromRoot</span><span>(</span><span style="color:#50a14f;">&quot;src/bindings&quot;</span><span>); </span><span> </span><span style="color:#a626a4;">var</span><span> binding_root_dir </span><span style="color:#a626a4;">= try</span><span> std.fs.</span><span style="color:#e45649;">cwd</span><span>().</span><span style="color:#e45649;">openIterableDir</span><span>(binding_root, .{}); </span><span> </span><span style="color:#a626a4;">defer</span><span> binding_root_dir.</span><span style="color:#e45649;">close</span><span>(); </span><span> </span><span style="color:#a626a4;">var</span><span> it </span><span style="color:#a626a4;">= try</span><span> binding_root_dir.</span><span 
style="color:#e45649;">walk</span><span>(b.allocator); </span><span> </span><span style="color:#a626a4;">while </span><span>(</span><span style="color:#a626a4;">try</span><span> it.</span><span style="color:#e45649;">next</span><span>()) </span><span style="color:#a626a4;">|</span><span>entry</span><span style="color:#a626a4;">| </span><span>{ </span><span> </span><span style="color:#a626a4;">switch </span><span>(entry.kind) { </span><span style="color:#c18401;"> .file </span><span style="color:#a626a4;">=&gt; </span><span>{ </span><span> </span><span style="color:#a626a4;">const</span><span> path </span><span style="color:#a626a4;">=</span><span> b.</span><span style="color:#e45649;">pathJoin</span><span>(&amp;.{ binding_root, entry.path }); </span><span> </span><span style="color:#a626a4;">if </span><span>(std.mem.</span><span style="color:#e45649;">endsWith</span><span>(u8, entry.basename, </span><span style="color:#50a14f;">&quot;.c&quot;</span><span>)) { </span><span> wasm.</span><span style="color:#e45649;">addCSourceFile</span><span>(.{ .file </span><span style="color:#a626a4;">=</span><span> .{ .path </span><span style="color:#a626a4;">=</span><span> path }, .flags </span><span style="color:#a626a4;">=</span><span> &amp;.{} }); </span><span> } </span><span style="color:#a626a4;">else if </span><span>(std.mem.</span><span style="color:#e45649;">endsWith</span><span>(u8, entry.basename, </span><span style="color:#50a14f;">&quot;.o&quot;</span><span>)) { </span><span> wasm.</span><span style="color:#e45649;">addObjectFile</span><span>(.{ .path </span><span style="color:#a626a4;">=</span><span> path }); </span><span> } </span><span> }, </span><span> </span><span style="color:#a626a4;">else =&gt; continue</span><span>, </span><span> } </span><span> } </span></code></pre> <p>This registers all the <code>.c</code> and <code>.o</code> files from the generated bindings, but we still need to add the whole directory as an include path:</p> <pre data-lang="zig" 
style="background-color:#fafafa;color:#383a42;" class="language-zig "><code class="language-zig" data-lang="zig"><span> wasm.</span><span style="color:#e45649;">addIncludePath</span><span>(.{ .path </span><span style="color:#a626a4;">=</span><span> binding_root }); </span></code></pre> <p>and enable linking with <code>libc</code>:</p> <pre data-lang="zig" style="background-color:#fafafa;color:#383a42;" class="language-zig "><code class="language-zig" data-lang="zig"><span> wasm.</span><span style="color:#e45649;">linkLibC</span><span>(); </span></code></pre> <p>Now that we have defined two build steps - generating the bindings and compiling to a WASM module - we define the third step, which is <strong>composing</strong> the generated module and the Preview1 adapter into a WASM component:</p> <pre data-lang="zig" style="background-color:#fafafa;color:#383a42;" class="language-zig "><code class="language-zig" data-lang="zig"><span> </span><span style="color:#a626a4;">const</span><span> adapter </span><span style="color:#a626a4;">=</span><span> b.</span><span style="color:#e45649;">option</span><span>( </span><span> []</span><span style="color:#a626a4;">const</span><span> u8, </span><span> </span><span style="color:#50a14f;">&quot;adapter&quot;</span><span>, </span><span> </span><span style="color:#50a14f;">&quot;Path to the Golem Tier1 WASI adapter&quot;</span><span>) </span><span style="color:#a626a4;">orelse </span><span style="color:#50a14f;">&quot;adapters/tier1/wasi_snapshot_preview1.wasm&quot;</span><span>; </span><span> </span><span style="color:#a626a4;">const</span><span> out </span><span style="color:#a626a4;">= try</span><span> std.fmt.</span><span style="color:#e45649;">allocPrint</span><span>(b.allocator, </span><span style="color:#50a14f;">&quot;zig-out/bin/{s}&quot;</span><span>, .{wasm.out_filename}); </span><span> </span><span style="color:#a626a4;">const</span><span> component </span><span style="color:#a626a4;">=</span><span> b.</span><span 
style="color:#e45649;">addSystemCommand</span><span>(&amp;.{ </span><span style="color:#50a14f;">&quot;wasm-tools&quot;</span><span>, </span><span style="color:#50a14f;">&quot;component&quot;</span><span>, </span><span style="color:#50a14f;">&quot;new&quot;</span><span>, out, </span><span> </span><span style="color:#50a14f;">&quot;-o&quot;</span><span>, </span><span style="color:#50a14f;">&quot;zig-out/bin/component.wasm&quot;</span><span>, </span><span style="color:#50a14f;">&quot;--adapt&quot;</span><span>, adapter }); </span></code></pre> <p>Here we provide a way to override the path to the adapter WASM using <code>zig build -Dadapter=xxx</code> but default to <code>adapters/tier1/wasi_snapshot_preview1.wasm</code> in case it is not specified.</p> <p>The final step is to set up dependencies between these build steps and wire them to the main build flow:</p> <pre data-lang="zig" style="background-color:#fafafa;color:#383a42;" class="language-zig "><code class="language-zig" data-lang="zig"><span> wasm.step.</span><span style="color:#e45649;">dependOn</span><span>(</span><span style="color:#a626a4;">&amp;</span><span>bindgen.step); </span><span> component.step.</span><span style="color:#e45649;">dependOn</span><span>(</span><span style="color:#a626a4;">&amp;</span><span>wasm.step); </span><span> b.</span><span style="color:#e45649;">installArtifact</span><span>(wasm); </span><span> b.</span><span style="color:#e45649;">getInstallStep</span><span>().</span><span style="color:#e45649;">dependOn</span><span>(</span><span style="color:#a626a4;">&amp;</span><span>component.step); </span><span> } </span></code></pre> <h2 id="trying-it-out">Trying it out</h2> <p>Let's try this out by implementing a simple counter component. 
We start with the first step - defining our WIT file, putting it into <code>wit/counter.wit</code>:</p> <pre data-lang="wit" style="background-color:#fafafa;color:#383a42;" class="language-wit "><code class="language-wit" data-lang="wit"><span style="color:#a626a4;">package </span><span>golem:example; </span><span> </span><span style="color:#a626a4;">interface </span><span>api { </span><span> </span><span style="color:#0184bc;">add</span><span>: </span><span style="color:#a626a4;">func</span><span>(</span><span style="color:#e45649;">value</span><span>: </span><span style="color:#a626a4;">u64</span><span>); </span><span> </span><span style="color:#0184bc;">get</span><span>: </span><span style="color:#a626a4;">func</span><span>() </span><span style="color:#a626a4;">-&gt; u64</span><span>; </span><span>} </span><span> </span><span style="color:#a626a4;">world </span><span>counter { </span><span> </span><span style="color:#a626a4;">export </span><span>api; </span><span>} </span><span> </span></code></pre> <p>We also save the above defined build script as <code>build.zig</code> (full version <a href="https://gist.github.com/vigoo/19ed4b5d3e47ca2f5f1258d1ae8b28a4">available here</a>) and then write an initial <code>src/main.zig</code> file:</p> <pre data-lang="zig" style="background-color:#fafafa;color:#383a42;" class="language-zig "><code class="language-zig" data-lang="zig"><span style="color:#a626a4;">const</span><span> std </span><span style="color:#a626a4;">= @import</span><span>(</span><span style="color:#50a14f;">&quot;std&quot;</span><span>); </span><span> </span><span style="color:#a626a4;">pub fn </span><span style="color:#0184bc;">main</span><span>() </span><span style="color:#a626a4;">anyerror!void </span><span>{} </span></code></pre> <p>Let's place the <a href="https://github.com/golemcloud/golem-wit/raw/main/adapters/tier1/wasi_snapshot_preview1.wasm">adapter WASM</a> as well in the <code>adapters/tier1</code> directory, and then try to compile this:</p> 
<pre style="background-color:#fafafa;color:#383a42;"><code><span>λ zig build --summary all ... </span><span>zig build-exe main Debug wasm32-wasi: error: the following command failed with 2 compilation errors: </span><span>... </span><span>error: wasm-ld: /Users/vigoo/projects/demo/counter/zig-cache/o/a212123ad3dcf4839747c2bd77f7ef4e/counter.o: </span><span>undefined symbol: exports_golem_example_api_add </span><span>error: wasm-ld: /Users/vigoo/projects/demo/counter/zig-cache/o/a212123ad3dcf4839747c2bd77f7ef4e/counter.o: </span><span>undefined symbol: exports_golem_example_api_get </span></code></pre> <p>It fails because we defined two exported functions: <code>api/add</code> and <code>api/get</code> in our WIT file but haven't implemented them yet. Let's do that:</p> <pre data-lang="zig" style="background-color:#fafafa;color:#383a42;" class="language-zig "><code class="language-zig" data-lang="zig"><span style="color:#a626a4;">var </span><span style="color:#e45649;">state</span><span>: </span><span style="color:#a626a4;">u64 = </span><span style="color:#c18401;">0</span><span>; </span><span> </span><span style="color:#a626a4;">export fn </span><span style="color:#0184bc;">exports_golem_example_api_add</span><span>(</span><span style="color:#e45649;">value</span><span>: </span><span style="color:#a626a4;">u64</span><span>) </span><span style="color:#a626a4;">void </span><span>{ </span><span> </span><span style="color:#a626a4;">const</span><span> stdout </span><span style="color:#a626a4;">=</span><span> std.io.</span><span style="color:#e45649;">getStdOut</span><span>().</span><span style="color:#e45649;">writer</span><span>(); </span><span> stdout.</span><span style="color:#e45649;">print</span><span>(</span><span style="color:#50a14f;">&quot;Adding {} to state</span><span style="color:#0997b3;">\n</span><span style="color:#50a14f;">&quot;</span><span>, .{value}) </span><span style="color:#a626a4;">catch unreachable</span><span>; </span><span> state </span><span 
style="color:#a626a4;">+=</span><span> value; </span><span>} </span><span> </span><span style="color:#a626a4;">export fn </span><span style="color:#0184bc;">exports_golem_example_api_get</span><span>() </span><span style="color:#a626a4;">u64 </span><span>{ </span><span> </span><span style="color:#a626a4;">return</span><span> state; </span><span>} </span></code></pre> <p>Then compile it:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>λ zig build --summary all </span><span>Generating &quot;src/bindings/counter.c&quot; </span><span>Generating &quot;src/bindings/counter.h&quot; </span><span>Generating &quot;src/bindings/counter_component_type.o&quot; </span><span>Build Summary: 5/5 steps succeeded </span><span>install success </span><span>├─ install main cached </span><span>│ └─ zig build-exe main Debug wasm32-wasi cached 9ms MaxRSS:29M </span><span>│ └─ run wit-bindgen success 3ms MaxRSS:3M </span><span>└─ run wasm-tools success 11ms MaxRSS:8M </span><span> └─ zig build-exe main Debug wasm32-wasi (+1 more reused dependencies) </span></code></pre> <p>and we can verify our resulting <code>zig-out/bin/component.wasm</code> using <code>wasm-tools</code>:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>λ wasm-tools print --skeleton zig-out/bin/component.wasm </span><span>(component </span><span> ... 
</span><span> (instance (;11;) (instantiate 0 </span><span> (with &quot;import-func-add&quot; (func 16)) </span><span> (with &quot;import-func-get&quot; (func 17)) </span><span> ) </span><span> ) </span><span> (export (;12;) &quot;golem:example/api&quot; (instance 11)) </span><span> (@producers </span><span> (processed-by &quot;wit-component&quot; &quot;0.20.1&quot;) </span><span> ) </span><span>) </span></code></pre> <h2 id="using-imports">Using imports</h2> <p>After this simple example let's try <em>importing</em> an interface and using it from our Zig code. 
What we are going to do is every time our counter changes, we are going to also save that value to an external key-value store. This is usually not something you need to do when writing a Golem application, because your program will be durable anyway - you can just keep the counter in memory. But it is a simple enough example to demonstrate how to use imported interfaces from Zig.</p> <p>First let's add some additional WIT files into <code>wit/deps</code> from the <a href="https://github.com/golemcloud/golem-wit">golem-wit repository</a> (Note that the WASI Key-Value interface is defined <a href="https://github.com/WebAssembly/wasi-keyvalue">here</a>, the <code>golem-wit</code> repo just stores the exact version of its definitions which is currently implemented by Golem ).</p> <p>We need the following directory tree:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>λ tree wit </span><span>wit </span><span>├── counter.wit </span><span>└── deps </span><span> ├── io </span><span> │   ├── error.wit </span><span> │   ├── poll.wit </span><span> │   ├── streams.wit </span><span> │   └── world.wit </span><span> └── keyvalue </span><span> ├── atomic.wit </span><span> ├── caching.wit </span><span> ├── error.wit </span><span> ├── eventual-batch.wit </span><span> ├── eventual.wit </span><span> ├── handle-watch.wit </span><span> ├── types.wit </span><span> └── world.wit </span><span> </span><span>4 directories, 13 files </span></code></pre> <p>Then we can import the key-value interface to <code>counter.wit</code>:</p> <pre data-lang="wit" style="background-color:#fafafa;color:#383a42;" class="language-wit "><code class="language-wit" data-lang="wit"><span style="color:#a626a4;">package </span><span>golem:example; </span><span> </span><span style="color:#a626a4;">interface </span><span>api { </span><span> </span><span style="color:#0184bc;">add</span><span>: </span><span style="color:#a626a4;">func</span><span>(</span><span 
style="color:#e45649;">value</span><span>: </span><span style="color:#a626a4;">u64</span><span>); </span><span> </span><span style="color:#0184bc;">get</span><span>: </span><span style="color:#a626a4;">func</span><span>() </span><span style="color:#a626a4;">-&gt; u64</span><span>; </span><span>} </span><span> </span><span style="color:#a626a4;">world </span><span>counter { </span><span> </span><span style="color:#a626a4;">import </span><span>wasi:keyvalue/eventual@</span><span style="color:#c18401;">0.1.0</span><span>; </span><span> </span><span> </span><span style="color:#a626a4;">export </span><span>api; </span><span>} </span></code></pre> <p>By recompiling the project we can verify everything still works, and we will also get our new bindings generated in the C source.</p> <p>Before implementing writing to the key-value store in Zig, let's just take a look at the WIT interface of <code>wasi:keyvalue/eventual@0.1.0</code> to understand what we will have to do:</p> <pre data-lang="wit" style="background-color:#fafafa;color:#383a42;" class="language-wit "><code class="language-wit" data-lang="wit"><span style="color:#a626a4;">interface </span><span>eventual { </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span> </span><span style="color:#0184bc;">set</span><span>: </span><span style="color:#a626a4;">func</span><span>( </span><span> </span><span style="color:#e45649;">bucket</span><span>: </span><span style="color:#a626a4;">borrow</span><span>&lt;bucket&gt;, </span><span> </span><span style="color:#e45649;">key</span><span>: key, </span><span> </span><span style="color:#e45649;">outgoing-value</span><span>: </span><span style="color:#a626a4;">borrow</span><span>&lt;outgoing-value&gt; </span><span> ) </span><span style="color:#a626a4;">-&gt; result</span><span>&lt;</span><span style="color:#a626a4;">_</span><span>, error&gt;; </span><span>} </span></code></pre> <p>We will need to pass a <code>bucket</code> and an <code>outgoing-value</code>, both being <em>WIT resources</em> so we first need to create them, then borrow references of them for the <code>set</code> call, and finally drop them.</p> <p>The bucket resource can be constructed with a static function called <code>open-bucket</code>:</p> <pre data-lang="wit" style="background-color:#fafafa;color:#383a42;" class="language-wit "><code class="language-wit" data-lang="wit"><span style="color:#a626a4;">resource </span><span style="color:#c18401;">bucket </span><span>{ </span><span> </span><span style="color:#0184bc;">open-bucket</span><span>: </span><span style="color:#a626a4;">static func</span><span>(</span><span style="color:#e45649;">name</span><span>: </span><span style="color:#a626a4;">string</span><span>) </span><span style="color:#a626a4;">-&gt; result</span><span>&lt;bucket, error&gt;; </span><span>} </span></code></pre> <p>Searching for this in the generated C bindings reveals the following:</p> <pre data-lang="c" style="background-color:#fafafa;color:#383a42;" class="language-c "><code class="language-c" data-lang="c"><span style="color:#a626a4;">extern bool </span><span style="color:#0184bc;">wasi_keyvalue_types_static_bucket_open_bucket</span><span>( </span><span> counter_string_t </span><span 
style="color:#a626a4;">*</span><span style="color:#e45649;">name</span><span>, </span><span> wasi_keyvalue_types_own_bucket_t </span><span style="color:#a626a4;">*</span><span style="color:#e45649;">ret</span><span>, </span><span> wasi_keyvalue_types_own_error_t </span><span style="color:#a626a4;">*</span><span>err </span><span>); </span></code></pre> <p>We will have to drop the created bucket with</p> <pre data-lang="c" style="background-color:#fafafa;color:#383a42;" class="language-c "><code class="language-c" data-lang="c"><span style="color:#a626a4;">extern void </span><span style="color:#0184bc;">wasi_keyvalue_types_bucket_drop_own</span><span>( </span><span> wasi_keyvalue_types_own_bucket_t handle </span><span>); </span></code></pre> <p>With all this information let's try to open a bucket in Zig by directly using the generated C bindings. First we need to import the C headers:</p> <pre data-lang="zig" style="background-color:#fafafa;color:#383a42;" class="language-zig "><code class="language-zig" data-lang="zig"><span style="color:#a626a4;">const</span><span> c </span><span style="color:#a626a4;">= @cImport</span><span>({ </span><span> </span><span style="color:#0184bc;">@cDefine</span><span>(</span><span style="color:#50a14f;">&quot;_NO_CRT_STDIO_INLINE&quot;</span><span>, </span><span style="color:#50a14f;">&quot;1&quot;</span><span>); </span><span> </span><span style="color:#a626a4;">@cInclude</span><span>(</span><span style="color:#50a14f;">&quot;counter.h&quot;</span><span>); </span><span>}); </span></code></pre> <p>We also define an initial error type for our function for using later:</p> <pre data-lang="zig" style="background-color:#fafafa;color:#383a42;" class="language-zig "><code class="language-zig" data-lang="zig"><span style="color:#a626a4;">const </span><span>KVError </span><span style="color:#a626a4;">= error </span><span>{ </span><span> FailedToOpenBucket, </span><span>}; </span></code></pre> <p>Then start implementing the store function by 
first storing the bucket's name in <code>counter_string_t</code>:</p> <pre data-lang="zig" style="background-color:#fafafa;color:#383a42;" class="language-zig "><code class="language-zig" data-lang="zig"><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">record_state</span><span>() </span><span style="color:#a626a4;">anyerror!void </span><span>{ </span><span> </span><span style="color:#a626a4;">const</span><span> stdout </span><span style="color:#a626a4;">=</span><span> std.io.</span><span style="color:#e45649;">getStdOut</span><span>().</span><span style="color:#e45649;">writer</span><span>(); </span><span> </span><span> </span><span style="color:#a626a4;">var </span><span style="color:#e45649;">bucket_name</span><span>: </span><span style="color:#a626a4;">c.counter_string_t = </span><span style="color:#c18401;">undefined</span><span>; </span><span> c.</span><span style="color:#e45649;">counter_string_dup</span><span>(</span><span style="color:#a626a4;">&amp;</span><span>bucket_name, </span><span style="color:#50a14f;">&quot;state&quot;</span><span>); </span><span> </span><span style="color:#a626a4;">defer</span><span> c.</span><span style="color:#e45649;">counter_string_free</span><span>(</span><span style="color:#a626a4;">&amp;</span><span>bucket_name); </span></code></pre> <p>and then invoking the <code>wasi_keyvalue_types_static_bucket_open_bucket</code> function:</p> <pre data-lang="zig" style="background-color:#fafafa;color:#383a42;" class="language-zig "><code class="language-zig" data-lang="zig"><span> </span><span style="color:#a626a4;">var </span><span style="color:#e45649;">bucket</span><span>: </span><span style="color:#a626a4;">c.wasi_keyvalue_types_own_bucket_t = </span><span style="color:#c18401;">undefined</span><span>; </span><span> </span><span style="color:#a626a4;">var </span><span style="color:#e45649;">bucket_err</span><span>: </span><span style="color:#a626a4;">c.wasi_keyvalue_wasi_keyvalue_error_own_error_t = </span><span 
style="color:#c18401;">undefined</span><span>; </span><span> </span><span style="color:#a626a4;">if </span><span>(c.</span><span style="color:#e45649;">wasi_keyvalue_types_static_bucket_open_bucket</span><span>(</span><span style="color:#a626a4;">&amp;</span><span>bucket_name, </span><span style="color:#a626a4;">&amp;</span><span>bucket, </span><span style="color:#a626a4;">&amp;</span><span>bucket_err)) { </span><span> </span><span style="color:#a626a4;">defer</span><span> c.</span><span style="color:#e45649;">wasi_keyvalue_types_bucket_drop_own</span><span>(bucket); </span><span> </span><span> </span><span style="color:#a0a1a7;">// TODO </span><span> } </span><span style="color:#a626a4;">else </span><span>{ </span><span> </span><span style="color:#a626a4;">defer</span><span> c.</span><span style="color:#e45649;">wasi_keyvalue_wasi_keyvalue_error_error_drop_own</span><span>(bucket_err); </span><span> </span><span style="color:#a626a4;">try</span><span> stdout.</span><span style="color:#e45649;">print</span><span>(</span><span style="color:#50a14f;">&quot;Failed to open bucket</span><span style="color:#0997b3;">\n</span><span style="color:#50a14f;">&quot;</span><span>, .{}); </span><span> </span><span> </span><span style="color:#a626a4;">return</span><span> KVError.FailedToOpenBucket; </span><span> } </span><span>} </span></code></pre> <p>Now that we have an open bucket we want to call the <code>set</code> function to update a key's value:</p> <pre data-lang="c" style="background-color:#fafafa;color:#383a42;" class="language-c "><code class="language-c" data-lang="c"><span style="color:#a626a4;">extern bool </span><span style="color:#0184bc;">wasi_keyvalue_eventual_set</span><span>( </span><span> wasi_keyvalue_eventual_borrow_bucket_t </span><span style="color:#e45649;">bucket</span><span>, </span><span> wasi_keyvalue_eventual_key_t </span><span style="color:#a626a4;">*</span><span style="color:#e45649;">key</span><span>, </span><span> 
wasi_keyvalue_eventual_borrow_outgoing_value_t </span><span style="color:#e45649;">outgoing_value</span><span>, </span><span> wasi_keyvalue_eventual_own_error_t </span><span style="color:#a626a4;">*</span><span>err </span><span>); </span></code></pre> <p>We already have our bucket, but we <em>own</em> it and we need to pass a <em>borrowed</em> bucket to this function. What's the difference? There is no difference in the actual value - both just store a <em>handle</em> to a resource that exists in the runtime engine, but we still have to borrow the owned value using the <code>wasi_keyvalue_types_borrow_bucket</code> function. The <code>wasi_keyvalue_eventual_key_t</code> type is just an alias for <code>counter_string_t</code> and <code>wasi_keyvalue_eventual_borrow_outgoing_value_t</code> is another resource we need to construct first. Let's put this together!</p> <p>First we borrow the owned bucket:</p> <pre data-lang="zig" style="background-color:#fafafa;color:#383a42;" class="language-zig "><code class="language-zig" data-lang="zig"><span style="color:#a626a4;">var</span><span> borrowed_bucket </span><span style="color:#a626a4;">=</span><span> c.</span><span style="color:#e45649;">wasi_keyvalue_types_borrow_bucket</span><span>(bucket); </span><span style="color:#a626a4;">defer</span><span> c.</span><span style="color:#e45649;">wasi_keyvalue_types_bucket_drop_borrow</span><span>(borrowed_bucket); </span></code></pre> <p>Then we create an <em>outgoing value</em> that's going to be stored in the key-value store:</p> <pre data-lang="zig" style="background-color:#fafafa;color:#383a42;" class="language-zig "><code class="language-zig" data-lang="zig"><span style="color:#a626a4;">var</span><span> outgoing_value </span><span style="color:#a626a4;">=</span><span> c.</span><span style="color:#e45649;">wasi_keyvalue_types_static_outgoing_value_new_outgoing_value</span><span>(); </span><span style="color:#a626a4;">defer</span><span> c.</span><span 
style="color:#e45649;">wasi_keyvalue_types_outgoing_value_drop_own</span><span>(outgoing_value); </span><span style="color:#a626a4;">var</span><span> borrowed_outgoing_value </span><span style="color:#a626a4;">=</span><span> c.</span><span style="color:#e45649;">wasi_keyvalue_types_borrow_outgoing_value</span><span>(outgoing_value); </span><span style="color:#a626a4;">defer</span><span> c.</span><span style="color:#e45649;">wasi_keyvalue_types_outgoing_value_drop_borrow</span><span>(borrowed_outgoing_value); </span><span> </span><span style="color:#a626a4;">var </span><span style="color:#e45649;">body</span><span>: </span><span style="color:#a626a4;">c.counter_string_t = </span><span style="color:#c18401;">undefined</span><span>; </span><span style="color:#a626a4;">var</span><span> value </span><span style="color:#a626a4;">= try</span><span> std.fmt.</span><span style="color:#e45649;">allocPrint</span><span>(gpa.</span><span style="color:#e45649;">allocator</span><span>(), </span><span style="color:#50a14f;">&quot;{d}&quot;</span><span>, .{state}); </span><span>c.</span><span style="color:#e45649;">counter_string_set</span><span>(</span><span style="color:#a626a4;">&amp;</span><span>body, </span><span style="color:#0184bc;">@ptrCast</span><span>(value)); </span><span style="color:#a626a4;">defer</span><span> c.</span><span style="color:#e45649;">counter_string_free</span><span>(</span><span style="color:#a626a4;">&amp;</span><span>body); </span><span> </span><span style="color:#a626a4;">var </span><span style="color:#e45649;">write_err</span><span>: </span><span style="color:#a626a4;">c.wasi_keyvalue_types_own_error_t = </span><span style="color:#c18401;">undefined</span><span>; </span><span style="color:#a626a4;">if </span><span>(</span><span style="color:#a626a4;">!</span><span>c.</span><span style="color:#e45649;">wasi_keyvalue_types_method_outgoing_value_outgoing_value_write_body_sync</span><span>( </span><span> borrowed_outgoing_value, </span><span> 
</span><span style="color:#0184bc;">@ptrCast</span><span>(</span><span style="color:#a626a4;">&amp;</span><span>body), </span><span> </span><span style="color:#a626a4;">&amp;</span><span>write_err)) { </span><span> </span><span> </span><span style="color:#a626a4;">defer</span><span> c.</span><span style="color:#e45649;">wasi_keyvalue_wasi_keyvalue_error_error_drop_own</span><span>(write_err); </span><span> </span><span style="color:#a626a4;">try</span><span> stdout.</span><span style="color:#e45649;">print</span><span>(</span><span style="color:#50a14f;">&quot;Failed to set outgoing value</span><span style="color:#0997b3;">\n</span><span style="color:#50a14f;">&quot;</span><span>, .{}); </span><span> </span><span style="color:#a626a4;">return</span><span> KVError.FailedToSetKey; </span><span>} </span></code></pre> <p>We also need to create a string for holding the <em>key</em>:</p> <pre data-lang="zig" style="background-color:#fafafa;color:#383a42;" class="language-zig "><code class="language-zig" data-lang="zig"><span style="color:#a626a4;">var </span><span style="color:#e45649;">key</span><span>: </span><span style="color:#a626a4;">c.counter_string_t = </span><span style="color:#c18401;">undefined</span><span>; </span><span>c.</span><span style="color:#e45649;">counter_string_dup</span><span>(</span><span style="color:#a626a4;">&amp;</span><span>key, </span><span style="color:#50a14f;">&quot;latest&quot;</span><span>); </span><span style="color:#a626a4;">defer</span><span> c.</span><span style="color:#e45649;">counter_string_free</span><span>(</span><span style="color:#a626a4;">&amp;</span><span>key); </span></code></pre> <p>And finally call the <code>set</code> function:</p> <pre data-lang="zig" style="background-color:#fafafa;color:#383a42;" class="language-zig "><code class="language-zig" data-lang="zig"><span style="color:#a626a4;">var </span><span style="color:#e45649;">set_err</span><span>: </span><span 
style="color:#a626a4;">c.wasi_keyvalue_eventual_own_error_t = </span><span style="color:#c18401;">undefined</span><span>; </span><span style="color:#a626a4;">if </span><span>(</span><span style="color:#a626a4;">!</span><span>c.</span><span style="color:#e45649;">wasi_keyvalue_eventual_set</span><span>(borrowed_bucket, </span><span style="color:#a626a4;">&amp;</span><span>key, borrowed_outgoing_value, </span><span style="color:#a626a4;">&amp;</span><span>set_err)) { </span><span> </span><span style="color:#a626a4;">try</span><span> stdout.</span><span style="color:#e45649;">print</span><span>(</span><span style="color:#50a14f;">&quot;Failed to set key</span><span style="color:#0997b3;">\n</span><span style="color:#50a14f;">&quot;</span><span>, .{}); </span><span> </span><span style="color:#a626a4;">return</span><span> KVError.FailedToSetKey; </span><span>} </span></code></pre> <p>With this implementation we can compile the new version of our WASM component, which now also depends on <code>wasi:keyvalue</code> and stores the latest value in remote storage every time it gets updated.</p> <h2 id="what-s-next">What's next?</h2> <p>With the above technique we have a way to implement WASM components in Zig, but working with the generated C bindings is a bit inconvenient. It would be nice to have a more idiomatic Zig interface to the component model, and maybe it can be achieved just by using Zig's metaprogramming features without having to create a Zig-specific binding generator in addition to the existing ones.</p> Golem's Rust transaction API 2024-04-13T00:00:00+00:00 2024-04-13T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/golem-rust-transaction-api/ <h2 id="introduction">Introduction</h2> <p>A few weeks ago we added a new set of <em>host functions</em> to <a href="https://golem.cloud">Golem</a>, which allow programs running on this platform to control some of the persistency and transactional behavior of the executor. 
You can learn about these low-level functions on <a href="https://learn.golem.cloud/docs/transaction-api">the corresponding learn page</a>.</p> <p>These exported functions allow a lot of control but they are very low level, and definitely not pleasant to use directly. To make them nicer we can write language-specific wrapper libraries on top of them, providing a first-class experience for the supported programming languages.</p> <p>The first such wrapper library is <a href="http://github.com/golemcloud/golem-rust">golem-rust</a>, and this post explains some of the Rust-specific technical details of how this library works.</p> <h2 id="regional-changes">Regional changes</h2> <p>The easy part is providing higher level support for temporarily changing the executor's behavior. The common property of these host functions is that they come in pairs:</p> <ul> <li>The <code>mark-begin-operation</code>/<code>mark-end-operation</code> pair defines a region that is treated as an atomic operation</li> <li>We can get the current retry policy and change it to something else with the <code>get-retry-policy</code> and <code>set-retry-policy</code> functions</li> <li>We can control persistency with <code>get-oplog-persistence-level</code> and <code>set-oplog-persistence-level</code></li> <li>And we can change whether the executor assumes that external calls are idempotent using the <code>get-idempotence-mode</code> and <code>set-idempotence-mode</code> pair.</li> </ul> <p>For all of these, a simple way to make them safer and more idiomatic is to connect the lifetime of the temporarily changed behavior to the lifetime of a Rust variable. 
For example, in the following snippet the whole function will be treated as an atomic region, and the region ends as soon as the function returns:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">some_atomic_operation</span><span>() { </span><span> </span><span style="color:#a626a4;">let</span><span> _atomic </span><span style="color:#a626a4;">= </span><span>golem_rust::mark_atomic_operation(); </span><span> </span><span style="color:#a0a1a7;">// ... </span><span>} </span></code></pre> <p>Implementing these wrappers is quite simple. First we need to define a <em>data type</em> which the wrapper will return. Let's call it <code>AtomicOperationGuard</code>:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">pub struct </span><span>AtomicOperationGuard { </span><span> </span><span style="color:#e45649;">begin</span><span>: OplogIndex, </span><span>} </span></code></pre> <p>We store the return value of Golem's <code>mark-begin-operation</code> in it, as we have to pass this value to <code>mark-end-operation</code> when we want to close the atomic region.</p> <p>We want to close the atomic region when this value is dropped - so we can call Golem's <code>mark-end-operation</code> in an explicitly implemented <code>drop</code> function:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">impl </span><span>Drop </span><span style="color:#a626a4;">for </span><span>AtomicOperationGuard { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">drop</span><span>(</span><span style="color:#a626a4;">&amp;mut </span><span 
style="color:#e45649;">self</span><span>) { </span><span> </span><span style="color:#0184bc;">mark_end_operation</span><span>(</span><span style="color:#e45649;">self</span><span>.begin); </span><span> } </span><span>} </span></code></pre> <p>Finally we define the wrapper function which returns this guard value:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">must_use</span><span>] </span><span style="color:#a626a4;">pub fn </span><span style="color:#0184bc;">mark_atomic_operation</span><span>() -&gt; AtomicOperationGuard { </span><span> </span><span style="color:#a626a4;">let</span><span> begin </span><span style="color:#a626a4;">= </span><span style="color:#0184bc;">mark_begin_operation</span><span>(); </span><span> AtomicOperationGuard { begin } </span><span>} </span></code></pre> <p>By using the <code>#[must_use]</code> attribute we can make the compiler give a warning if the result value is not used - this is important, because that would mean that the atomic region gets closed as soon as it has been opened.</p> <p>With this basic building block we can also support an alternative style where we pass a function to be executed with the temporary change in Golem's behavior. 
These are higher order functions, taking a function as a parameter, and just using the already defined wrapper to apply the change:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">pub fn </span><span style="color:#0184bc;">atomically</span><span>&lt;T&gt;(</span><span style="color:#e45649;">f</span><span>: impl FnOnce() -&gt; T) -&gt; T { </span><span> </span><span style="color:#a626a4;">let</span><span> _guard </span><span style="color:#a626a4;">= </span><span style="color:#0184bc;">mark_atomic_operation</span><span>(); </span><span> </span><span style="color:#0184bc;">f</span><span>() </span><span>} </span></code></pre> <p>The same pattern can be used for all the mentioned host function pairs to get a pair of wrappers (one returning a guard, the other taking a function as a parameter):</p> <ul> <li><code>use_retry_policy</code> and <code>with_retry_policy</code></li> <li><code>use_idempotence_mode</code> and <code>with_idempotence_mode</code></li> <li><code>use_persistence_level</code> and <code>with_persistence_level</code></li> </ul> <h2 id="transactions">Transactions</h2> <p>Golem provides <strong>durable execution</strong> and that comes with guarantees that your program will always run until it terminates, and (by default) all external operations are performed <em>at least once</em>. (Here <em>at least once</em> is the guarantee we can provide - naturally it does not mean that we just rerun all operations in case of a failure event. Golem tries to perform every operation exactly once but this cannot be guaranteed without special collaboration with the remote host. This behavior can be switched to <em>at most once</em> by changing the <strong>idempotence mode</strong> with the helper functions we defined above.)</p> <p>Many times external operations (such as HTTP calls to remote hosts) need to be executed <em>transactionally</em>. 
If some of the operations fail, the transaction needs to be rolled back - <strong>compensation actions</strong> need to undo whatever the already successfully performed operations did.</p> <p>We identified and implemented two different transaction types - they provide different guarantees and both can be useful.</p> <p>A <strong>fallible transaction</strong> only deals with domain errors. Within the transaction every <strong>operation</strong> that succeeds gets recorded. If an operation fails, all the recorded operations get <em>compensated</em> in reverse order before the transaction block returns with a failure.</p> <p>What if a non-domain-specific failure happens to the worker? It can be an unexpected fatal error, hardware failure, an executor restarted because of a deployment, etc. A fallible transaction is completely implemented as regular user code, so Golem's durable execution guarantees apply to it. If, for example, the executor dies after 3 of the 5 operations in the transaction have completed, the execution will continue from where it was - continuing with the 4th operation. If the 4th operation fails with a domain error, the <code>golem-rust</code> library starts executing the compensation actions, and then a random failure causes a panic in the middle of this, the execution will continue from the middle of the compensation actions, making sure that all the operations are properly rolled back.</p> <p>Another possibility is what we call <strong>infallible transaction</strong>s. Here we say that the transaction must not fail - but if a step in it fails, we still want to run compensation actions before we retry.</p> <p>To implement this we need some of the low-level transaction controls Golem provides. First of all, we need to mark the whole transaction as an <em>atomic region</em>. 
This way if a (non domain level) failure happens during the transaction, the previously performed external operations will be automatically retried as the atomic region was never committed.</p> <p>We can capture the domain errors in user code and perform the compensation actions just like in the <em>fallible transaction</em> case. But what should we do when all operations have been rolled back? We can use the <code>set-oplog-index</code> host function to tell Golem to "go back in time" to the beginning of the transaction, forget everything that was performed after it, and start executing the transaction again.</p> <p>There is a third, more complete version of <strong>infallible transactions</strong> which is not implemented yet - in this version we can guarantee that the compensation actions are performed even in case of a non-domain failure event. This can be implemented with the existing features of Golem but it is out of the scope of this post.</p> <h3 id="operation-and-transaction">Operation and Transaction</h3> <p>Let's see how we can implement this transaction feature.</p> <p>The first thing we need to define is an <em>operation</em> - something that pairs an arbitrary action with a compensation action that undoes it. 
We can define it as a trait with two methods:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">pub trait </span><span>Operation: Clone { </span><span> </span><span style="color:#a626a4;">type </span><span>In: Clone; </span><span> </span><span style="color:#a626a4;">type </span><span>Out: Clone; </span><span> </span><span style="color:#a626a4;">type </span><span>Err: Clone; </span><span> </span><span> </span><span style="color:#a0a1a7;">/// Executes the operation which may fail with a domain error </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">execute</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">input</span><span>: </span><span style="color:#a626a4;">Self::</span><span>In) -&gt; Result&lt;</span><span style="color:#a626a4;">Self::</span><span>Out, </span><span style="color:#a626a4;">Self::</span><span>Err&gt;; </span><span> </span><span> </span><span style="color:#a0a1a7;">/// Executes a compensation action for the operation. 
</span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">compensate</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">input</span><span>: </span><span style="color:#a626a4;">Self::</span><span>In, </span><span style="color:#e45649;">result</span><span>: </span><span style="color:#a626a4;">Self::</span><span>Out) -&gt; Result&lt;(), </span><span style="color:#a626a4;">Self::</span><span>Err&gt;; </span><span>} </span></code></pre> <p>If the operation succeeds, its result of type <code>Out</code> will be stored - if it fails, <code>compensate</code> will be called for all the previous operations with these stored output values.</p> <p>We also need something that defines the boundaries of a transaction, and allows executing these operations. Here we can create two slightly different interfaces for fallible and infallible transactions - to make it more user friendly.</p> <p>For fallible transactions we can define a higher order function where the user's logic itself can fail, and in the end we get back a transaction result:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">pub fn </span><span style="color:#0184bc;">fallible_transaction</span><span>&lt;Out, Err: Clone </span><span style="color:#a626a4;">+ &#39;static</span><span>&gt;( </span><span> </span><span style="color:#e45649;">f</span><span>: impl FnOnce(&amp;</span><span style="color:#e45649;">mut FallibleTransaction</span><span>&lt;</span><span style="color:#e45649;">Err</span><span>&gt;) -&gt; Result&lt;Out, Err&gt;, </span><span>) -&gt; TransactionResult&lt;Out, Err&gt; </span></code></pre> <p>The result type here is just an alias to the standard Rust <code>Result</code> type, in which the error type will be <code>TransactionFailure</code>:</p> <pre data-lang="rust" 
style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">pub type </span><span>TransactionResult</span><span style="color:#a626a4;">&lt;</span><span>Out, Err</span><span style="color:#a626a4;">&gt; = </span><span>Result&lt;Out, TransactionFailure&lt;Err&gt;&gt;; </span><span> </span><span style="color:#a626a4;">pub enum </span><span>TransactionFailure&lt;Err&gt; { </span><span> </span><span style="color:#a0a1a7;">/// One of the operations failed with an error, and the transaction was fully rolled back. </span><span> FailedAndRolledBackCompletely(Err), </span><span> </span><span style="color:#a0a1a7;">/// One of the operations failed with an error, and the transaction was partially rolled back </span><span> </span><span style="color:#a0a1a7;">/// because the compensation action of one of the operations also failed. </span><span> FailedAndRolledBackPartially { </span><span> failure: Err, </span><span> compensation_failure: Err, </span><span> }, </span><span>} </span></code></pre> <p>The function we pass to <code>fallible_transaction</code> gets a mutable reference to a transaction object - this is what we can use to execute operations:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">struct </span><span>FallibleTransaction { </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span>} </span><span> </span><span style="color:#a626a4;">impl</span><span>&lt;Err: Clone </span><span style="color:#a626a4;">+ &#39;static</span><span>&gt; FallibleTransaction&lt;Err&gt; { </span><span> </span><span style="color:#a626a4;">pub fn </span><span style="color:#0184bc;">execute</span><span>&lt;OpIn: Clone </span><span style="color:#a626a4;">+ &#39;static</span><span>, OpOut: Clone </span><span style="color:#a626a4;">+ &#39;static</span><span>&gt;( </span><span> </span><span style="color:#a626a4;">&amp;mut </span><span style="color:#e45649;">self</span><span>, </span><span> </span><span style="color:#e45649;">operation</span><span>: impl Operation&lt;In = OpIn, Out = OpOut, Err = Err&gt; + </span><span style="color:#a626a4;">&#39;static</span><span>, </span><span> </span><span style="color:#e45649;">input</span><span>: OpIn, </span><span> ) -&gt; Result&lt;OpOut, Err&gt; </span><span>} </span></code></pre> <p>This looks a bit verbose, but all it says is that you can pass an arbitrary <code>Operation</code> to this function as long as all of them have the same failure type, and that you provide an <em>input</em> value for your operation. This separation of operation and input makes it possible to define reusable operations by implementing the <code>Operation</code> trait manually - we will see more ways to define operations later.</p> <p>We also define a similar function and corresponding data type for <em>infallible transactions</em>. 
There are two main differences:</p> <ul> <li>The <code>infallible_transaction</code> function's result type is simply <code>Out</code> - it can never fail</li> <li>Similarly, <code>execute</code> itself cannot fail, which means that the transactional function cannot fail either - so there is no need to use <code>?</code> or other ways to deal with result types.</li> </ul> <p>Storing the compensation actions in these structs is easy - we can just create closures capturing the input and output values and calling the trait's <code>compensate</code> function, and store these closures in a <code>Vec</code>:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">struct </span><span>CompensationAction&lt;Err&gt; { </span><span> </span><span style="color:#e45649;">action</span><span>: Box&lt;dyn Fn() -&gt; Result&lt;(), Err&gt;&gt;, </span><span>} </span><span> </span><span style="color:#a626a4;">impl</span><span>&lt;Err&gt; CompensationAction&lt;Err&gt; { </span><span> </span><span style="color:#a626a4;">pub fn </span><span style="color:#0184bc;">execute</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>) -&gt; Result&lt;(), Err&gt; { </span><span> (</span><span style="color:#e45649;">self</span><span>.action)() </span><span> } </span><span>} </span><span> </span><span style="color:#a626a4;">pub struct </span><span>FallibleTransaction&lt;Err&gt; { </span><span> </span><span style="color:#e45649;">compensations</span><span>: Vec&lt;CompensationAction&lt;Err&gt;&gt;, </span><span>} </span></code></pre> <p>One last thing we can do at this level of the API is to think about cases where one would write generic code that works both with fallible and infallible transactions. 
Using a unified interface would not be as nice as using the dedicated one - as it deals with error types even if the transaction can never fail - but may provide better code reusability. We can hide the difference by defining a trait:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">pub trait </span><span>Transaction&lt;Err&gt; { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">execute</span><span>&lt;OpIn: Clone </span><span style="color:#a626a4;">+ &#39;static</span><span>, OpOut: Clone </span><span style="color:#a626a4;">+ &#39;static</span><span>&gt;( </span><span> </span><span style="color:#a626a4;">&amp;mut </span><span style="color:#e45649;">self</span><span>, </span><span> </span><span style="color:#e45649;">operation</span><span>: impl Operation&lt;In = OpIn, Out = OpOut, Err = Err&gt; + </span><span style="color:#a626a4;">&#39;static</span><span>, </span><span> </span><span style="color:#e45649;">input</span><span>: OpIn, </span><span> ) -&gt; Result&lt;OpOut, Err&gt;; </span><span> </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">fail</span><span>(</span><span style="color:#a626a4;">&amp;mut </span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">error</span><span>: Err) -&gt; Result&lt;(), Err&gt;; </span><span> </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">run</span><span>&lt;Out&gt;(</span><span style="color:#e45649;">f</span><span>: impl FnOnce(&amp;</span><span style="color:#e45649;">mut Self</span><span>) -&gt; Result&lt;Out, Err&gt;) -&gt; TransactionResult&lt;Out, Err&gt;; </span><span>} </span></code></pre> <p>The trait provides a way to execute operations and explicitly fail the transaction, and it also generalizes the <code>fallible_transaction</code> and 
<code>infallible_transaction</code> functions with a static function called <code>run</code>. Implementing this interface for our two transaction types is straightforward.</p> <h3 id="defining-operations">Defining operations</h3> <p>We defined an <code>Operation</code> trait but haven't yet talked about how we will declare new operations. One obvious way is to define a type and implement the trait for it:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">struct </span><span>CreateAccount { </span><span> </span><span style="color:#a0a1a7;">// configuration </span><span>} </span><span> </span><span style="color:#a626a4;">impl </span><span>Operation </span><span style="color:#a626a4;">for </span><span>CreateAccount { </span><span> </span><span style="color:#a626a4;">type </span><span>In </span><span style="color:#a626a4;">=</span><span> AccountDetails; </span><span> </span><span style="color:#a626a4;">type </span><span>Out </span><span style="color:#a626a4;">=</span><span> AccountId; </span><span> </span><span style="color:#a626a4;">type </span><span>Err </span><span style="color:#a626a4;">=</span><span> DomainError; </span><span> </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">execute</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">input</span><span>: AccountDetails) -&gt; Result&lt;AccountId, DomainError&gt; { </span><span> todo!(</span><span style="color:#50a14f;">&quot;Create the account&quot;</span><span>) </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">compensate</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">input</span><span>: 
AccountDetails, </span><span style="color:#e45649;">result</span><span>: AccountId) -&gt; Result&lt;(), </span><span style="color:#a626a4;">Self::</span><span>Err&gt; { </span><span> todo!(</span><span style="color:#50a14f;">&quot;Delete the account&quot;</span><span>); </span><span> } </span><span>} </span></code></pre> <p>The library provides a more concise way to define ad-hoc operations by just passing two functions:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">pub fn </span><span style="color:#0184bc;">operation</span><span>&lt;In: Clone, Out: Clone, Err: Clone&gt;( </span><span> </span><span style="color:#e45649;">execute_fn</span><span>: impl Fn(</span><span style="color:#e45649;">In</span><span>) -&gt; Result&lt;Out, Err&gt; + </span><span style="color:#a626a4;">&#39;static</span><span>, </span><span> </span><span style="color:#e45649;">compensate_fn</span><span>: impl Fn(</span><span style="color:#e45649;">In</span><span>, </span><span style="color:#e45649;">Out</span><span>) -&gt; Result&lt;(), Err&gt; + </span><span style="color:#a626a4;">&#39;static</span><span>, </span><span>) -&gt; impl Operation&lt;In = In, Out = Out, Err = Err&gt; { </span><span style="color:#a626a4;">... </span><span>} </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span> </span><span style="color:#a626a4;">let</span><span> op </span><span style="color:#a626a4;">= </span><span style="color:#0184bc;">operation</span><span>( </span><span> </span><span style="color:#a626a4;">move |</span><span>account_details: AccountDetails</span><span style="color:#a626a4;">| </span><span>{ </span><span> todo!(</span><span style="color:#50a14f;">&quot;Create the account&quot;</span><span>) </span><span> }, </span><span> </span><span style="color:#a626a4;">move |</span><span>account_details: AccountDetails, account_id: AccountId</span><span style="color:#a626a4;">| </span><span>{ </span><span> todo!(</span><span style="color:#50a14f;">&quot;Delete the account&quot;</span><span>) </span><span> }); </span></code></pre> <p>Under the hood this creates a struct called <code>FnOperation</code> storing these two closures in it.</p> <p>There is a third way though. Let's see what it looks like, and then explore how it can be implemented with <em>Rust macros</em>!</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">golem_operation</span><span>(compensation</span><span style="color:#a626a4;">=</span><span>delete_account)] </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">create_account</span><span>(</span><span style="color:#e45649;">username</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>, </span><span style="color:#e45649;">email</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>) -&gt; Result&lt;AccountId, DomainError&gt; { </span><span> todo!(</span><span style="color:#50a14f;">&quot;Create the account&quot;</span><span>) </span><span>} </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">delete_account</span><span>(</span><span style="color:#e45649;">account_id</span><span>: AccountId) -&gt; Result&lt;(), 
DomainError&gt; { </span><span> todo!(</span><span style="color:#50a14f;">&quot;Delete the account&quot;</span><span>) </span><span>} </span><span> </span><span style="color:#a0a1a7;">// ... </span><span> </span><span style="color:#0184bc;">infallible_transaction</span><span>(|</span><span style="color:#e45649;">tx</span><span>| { </span><span> </span><span style="color:#a626a4;">let</span><span> account_id </span><span style="color:#a626a4;">=</span><span> tx.</span><span style="color:#0184bc;">create_account</span><span>(</span><span style="color:#50a14f;">&quot;vigoo&quot;</span><span>, </span><span style="color:#50a14f;">&quot;x@y&quot;</span><span>); </span><span> </span><span style="color:#a0a1a7;">// ... </span><span>}); </span></code></pre> <h3 id="operation-macro">Operation macro</h3> <p>In the above example <code>golem_operation</code> is a macro. It is a function executed at compile time that takes the annotated item - in this case the <code>create_account</code> function - and <strong>transforms</strong> it into something else.</p> <p>The first thing to figure out when writing a macro like this is what exactly we want to transform the function into. 
Let's see what this macro generates, and then I explain how to get there.</p> <p>If we expand the macro for the above example we get the following:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">create_account</span><span>(</span><span style="color:#e45649;">username</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>, </span><span style="color:#e45649;">email</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>) -&gt; Result&lt;AccountId, DomainError&gt; { </span><span> todo!(</span><span style="color:#50a14f;">&quot;Create the account&quot;</span><span>) </span><span>} </span><span> </span><span style="color:#a626a4;">trait </span><span>CreateAccount { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">create_account</span><span>(</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">username</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>, </span><span style="color:#e45649;">email</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>) -&gt; Result&lt;AccountId, DomainError&gt;; </span><span>} </span><span> </span><span style="color:#a626a4;">impl</span><span>&lt;T: Transaction&lt;DomainError&gt;&gt; CreateAccount </span><span style="color:#a626a4;">for &amp;</span><span>mut T { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">create_account</span><span>(</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">username</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>, </span><span style="color:#e45649;">email</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>) -&gt; Result&lt;AccountId, DomainError&gt; { </span><span> </span><span 
style="color:#e45649;">self</span><span>.</span><span style="color:#0184bc;">execute</span><span>( </span><span> </span><span style="color:#0184bc;">operation</span><span>( </span><span> |(</span><span style="color:#e45649;">username</span><span>, </span><span style="color:#e45649;">email</span><span>): (</span><span style="color:#a626a4;">&amp;str</span><span>, </span><span style="color:#a626a4;">&amp;str</span><span>)| { </span><span> </span><span style="color:#0184bc;">create_account</span><span>(username, email) </span><span> }, </span><span> |(</span><span style="color:#e45649;">username</span><span>, </span><span style="color:#e45649;">email</span><span>): (</span><span style="color:#a626a4;">&amp;str</span><span>, </span><span style="color:#a626a4;">&amp;str</span><span>), </span><span style="color:#e45649;">op_result</span><span>: AccountId| { </span><span> </span><span style="color:#0184bc;">call_compensation_function</span><span>( </span><span> delete_account, </span><span> op_result, </span><span> (username, email) </span><span> ).</span><span style="color:#0184bc;">map_err</span><span>(|</span><span style="color:#e45649;">err</span><span>| err.</span><span style="color:#c18401;">0</span><span>) </span><span> }), (username, email)) </span><span> } </span><span>} </span></code></pre> <p>So it seems the macro leaves the function in its original form, but generates some additional items: a <em>trait</em> which contains the same function signature as the annotated one (plus a <code>self</code> parameter), and an <em>implementation</em> of this trait for any <code>&amp;mut T</code> where <code>T</code> is a <code>Transaction&lt;DomainError&gt;</code>.</p> <p>As I explained above, <code>Transaction</code> is a trait that provides a unified interface for both the fallible and infallible transactions. 
With this implementation we define an <strong>extension method</strong> for the <code>tx</code> value we get in our transaction functions - this is what allows us to write <code>tx.create_account</code> in the above example.</p> <p>Two more details to notice:</p> <ul> <li>Our <code>Operation</code> type deals with a single input value but our annotated function can have an arbitrary number of parameters. We can solve this by defining the operation's input as a <strong>tuple</strong> containing all the function parameters.</li> <li>The compensation function (<code>delete_account</code>) is not called directly, but through a helper called <code>call_compensation_function</code>. This allows us to support compensation functions of different shapes, and I will explain how it works in detail.</li> </ul> <h4 id="defining-the-macro-and-parsing-the-function">Defining the macro and parsing the function</h4> <p>This type of Rust macro, which is invoked by annotating items in the code, is an attribute macro, one kind of <a href="https://doc.rust-lang.org/reference/procedural-macros.html">proc-macro</a>. We need to create a separate Rust <em>crate</em> for defining the macro, set <code>proc-macro = true</code> in the <code>[lib]</code> section of its <code>Cargo.toml</code> file, and then create a top-level function annotated with <code>#[proc_macro_attribute]</code> to define our macro:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">proc_macro_attribute</span><span>] </span><span style="color:#a626a4;">pub fn </span><span style="color:#0184bc;">golem_operation</span><span>(</span><span style="color:#e45649;">attr</span><span>: TokenStream, </span><span style="color:#e45649;">item</span><span>: TokenStream) -&gt; TokenStream { </span><span> </span><span style="color:#a0a1a7;">// ... </span><span>} </span></code></pre> <p>Rust macros are transformations on <strong>token streams</strong>. 
The first parameter of our macro gets the <em>parameters</em> passed to the macro - so in our example it will contain a stream of tokens representing <code>compensation=delete_account</code>. The second parameter is the annotated item itself - in our case it's a stream of tokens of the whole function definition including its body.</p> <p>The result of the function is also a token stream and the easiest thing we can do is to just return <code>item</code>:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">proc_macro_attribute</span><span>] </span><span style="color:#a626a4;">pub fn </span><span style="color:#0184bc;">golem_operation</span><span>(</span><span style="color:#e45649;">attr</span><span>: TokenStream, </span><span style="color:#e45649;">item</span><span>: TokenStream) -&gt; TokenStream { </span><span> item </span><span>} </span></code></pre> <p>This is a valid macro that does not do anything.</p> <p>We somehow have to generate a trait and a trait implementation with only having these two token streams. 
Before we can generate anything we need to understand the annotated function - we need its name, its parameters, its result type etc.</p> <p>We can use the <a href="https://docs.rs/syn/latest/syn/">syn</a> crate to parse the stream of tokens into a Rust AST.</p> <p>To parse <code>item</code> as a function, we can write:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">let</span><span> ast: ItemFn </span><span style="color:#a626a4;">= </span><span>syn::parse(item).</span><span style="color:#0184bc;">expect</span><span>(</span><span style="color:#50a14f;">&quot;Expected a function&quot;</span><span>); </span></code></pre> <p>This is something we can extract information from; for example, <code>ItemFn</code> has the following contents:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">pub struct </span><span>ItemFn { </span><span> </span><span style="color:#a626a4;">pub </span><span style="color:#e45649;">attrs</span><span>: Vec&lt;Attribute&gt;, </span><span> </span><span style="color:#a626a4;">pub </span><span style="color:#e45649;">vis</span><span>: Visibility, </span><span> </span><span style="color:#a626a4;">pub </span><span style="color:#e45649;">sig</span><span>: Signature, </span><span> </span><span style="color:#a626a4;">pub </span><span style="color:#e45649;">block</span><span>: Box&lt;Block&gt;, </span><span>} </span></code></pre> <p>And <code>sig</code> contains things like the function's name, parameters and return type. It is important to keep in mind though that this is just a parsed AST from the tokens - the whole transformation runs before any type checking and we don't have any way to identify actual Rust types. 
We only see what's in the source code.</p> <p>For example in our macro we expect that the annotated function returns with a <code>Result</code> type and we need to look into this type because we will use the success and error types in separate places in the generated code.</p> <p>We cannot do this in a 100% reliable way. We can look for things like the result type <em>looks like</em> a <code>Result&lt;Out, Err&gt;</code>, and we may support some additional forms such as <code>std::result::Result&lt;Out, Err&gt;</code>, but if the user defined a type alias and uses that, a macro that looks at the AST cannot know that it is equal to a result type. In many cases these limitations can be solved by applying type level programming - we could have a trait that extracts the success and error types of a <code>Result</code> and is not implemented for any other type, and then generate code from the macro that uses these helper types.</p> <p>The current implementation of the <code>golem_operation</code> macro does not do this for determining the result types, so it has this limitation that it only works if you use the "standard" way of writing <code>Result&lt;Out, Err&gt;</code>.</p> <p>This looks like the following:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">result_type</span><span>(</span><span style="color:#e45649;">ty</span><span>: </span><span style="color:#a626a4;">&amp;</span><span>Type) -&gt; Option&lt;(Type, Type)&gt; { </span><span> </span><span style="color:#a626a4;">match</span><span> ty { </span><span> Type::Group(group) </span><span style="color:#a626a4;">=&gt; </span><span style="color:#0184bc;">result_type</span><span>(</span><span style="color:#a626a4;">&amp;</span><span>group.elem), </span><span> Type::Paren(paren) </span><span style="color:#a626a4;">=&gt; </span><span 
style="color:#0184bc;">result_type</span><span>(</span><span style="color:#a626a4;">&amp;</span><span>paren.elem), </span><span> Type::Path(type_path) </span><span style="color:#a626a4;">=&gt; </span><span>{ </span><span> </span><span style="color:#a626a4;">let</span><span> idents </span><span style="color:#a626a4;">=</span><span> type_path.path.segments.</span><span style="color:#0184bc;">iter</span><span>().</span><span style="color:#0184bc;">map</span><span>(|</span><span style="color:#e45649;">segment</span><span>| segment.ident.</span><span style="color:#0184bc;">to_string</span><span>()).collect::&lt;Vec&lt;</span><span style="color:#a626a4;">_</span><span>&gt;&gt;()</span><span>;</span><span> </span><span> </span><span style="color:#a626a4;">if</span><span> idents </span><span style="color:#a626a4;">== </span><span>vec![</span><span style="color:#50a14f;">&quot;Result&quot;</span><span>] { </span><span style="color:#a0a1a7;">// ... some more cases </span><span> </span><span style="color:#a626a4;">let</span><span> last_segment </span><span style="color:#a626a4;">=</span><span> type_path.path.segments.</span><span style="color:#0184bc;">last</span><span>().</span><span style="color:#0184bc;">unwrap</span><span>(); </span><span> </span><span style="color:#a626a4;">let </span><span>syn::PathArguments::AngleBracketed(generics) </span><span style="color:#a626a4;">= &amp;</span><span>last_segment.arguments </span><span style="color:#a626a4;">else </span><span>{ </span><span style="color:#a626a4;">return </span><span>None }; </span><span> </span><span style="color:#a626a4;">if</span><span> generics.args.</span><span style="color:#0184bc;">len</span><span>() </span><span style="color:#a626a4;">!= </span><span style="color:#c18401;">2 </span><span>{ </span><span> </span><span style="color:#a626a4;">return </span><span>None; </span><span> } </span><span> </span><span style="color:#a626a4;">let 
</span><span>syn::GenericArgument::Type(success_type) </span><span style="color:#a626a4;">= &amp;</span><span>generics.args[</span><span style="color:#c18401;">0</span><span>] </span><span style="color:#a626a4;">else </span><span>{ </span><span> </span><span style="color:#a626a4;">return </span><span>None; </span><span> }; </span><span> </span><span style="color:#a626a4;">let </span><span>syn::GenericArgument::Type(err_type) </span><span style="color:#a626a4;">= &amp;</span><span>generics.args[</span><span style="color:#c18401;">1</span><span>] </span><span style="color:#a626a4;">else </span><span>{ </span><span> </span><span style="color:#a626a4;">return </span><span>None; </span><span> }; </span><span> Some((success_type.</span><span style="color:#0184bc;">clone</span><span>(), err_type.</span><span style="color:#0184bc;">clone</span><span>())) </span><span> } </span><span> </span><span style="color:#a0a1a7;">// ... other cases returning None </span><span>} </span></code></pre> <p>Once we have all the information we need - the function's name, its parameters, the success and error result types, all as <code>syn</code> AST nodes - we can generate the additional code and return it in the end as the new token stream.</p> <p>To generate the token stream we use the <a href="https://docs.rs/quote/latest/quote/">quote library</a>. This library provides the <code>quote!</code> macro, which itself generates a <code>TokenStream</code>. (Although it is not the same <code>TokenStream</code> as the one we need to return from the macro. The macro requires <code>proc_macro::TokenStream</code> and <code>quote!</code> returns <code>proc_macro2::TokenStream</code>. 
Fortunately it can be simply converted with <code>.into()</code>).</p> <p>We write a single <code>quote!</code> for producing the result of the macro:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">let</span><span> result </span><span style="color:#a626a4;">= </span><span>quote! { </span><span> </span><span style="color:#a626a4;">#</span><span>ast </span><span> </span><span> </span><span style="color:#a626a4;">trait #</span><span>traitname { </span><span> </span><span style="color:#a626a4;">#</span><span>fnsig; </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">impl</span><span>&lt;T: golem_rust::Transaction&lt;</span><span style="background-color:#e06c75;color:#fafafa;">#</span><span>err&gt;&gt; #traitname for </span><span style="color:#a626a4;">&amp;</span><span>mut T { </span><span> </span><span style="color:#a626a4;">#</span><span>fnsig { </span><span> </span><span style="color:#e45649;">self</span><span>.</span><span style="color:#0184bc;">execute</span><span>( </span><span> golem_rust::</span><span style="color:#a626a4;">#</span><span style="color:#0184bc;">operation</span><span>( </span><span> |#</span><span style="color:#e45649;">input_pattern</span><span>| { </span><span> </span><span style="color:#a626a4;">#</span><span style="color:#0184bc;">fnname</span><span>(</span><span style="color:#a626a4;">#</span><span>(</span><span style="color:#a626a4;">#</span><span>input_args), </span><span style="color:#a626a4;">*</span><span>) </span><span> }, </span><span> |#</span><span style="color:#e45649;">compensation_pattern</span><span>| { </span><span> </span><span style="color:#a626a4;">#</span><span style="color:#0184bc;">compensate</span><span>( </span><span> </span><span style="color:#a626a4;">#</span><span>compensation, </span><span> (op_result,), </span><span> (</span><span 
style="color:#a626a4;">#</span><span>(</span><span style="color:#a626a4;">#</span><span>compensation_args), </span><span style="color:#a626a4;">*</span><span>) </span><span> ).</span><span style="color:#0184bc;">map_err</span><span>(|</span><span style="color:#e45649;">err</span><span>| err.</span><span style="color:#c18401;">0</span><span>) </span><span> } </span><span> ), </span><span> (</span><span style="color:#a626a4;">#</span><span>(</span><span style="color:#a626a4;">#</span><span>input_args), </span><span style="color:#a626a4;">*</span><span>) </span><span> ) </span><span> } </span><span> } </span><span>}; </span><span> </span><span>result.</span><span style="color:#0184bc;">into</span><span>() </span><span style="color:#a0a1a7;">// proc_macro2::TokenStream to proc_macro::TokenStream </span></code></pre> <p>All the parts prefixed with <code>#</code> are references to Rust variables outside of the quote, and they can be (and usually are) various <code>syn</code> AST nodes or raw token streams.</p> <p>There is a special syntax for interpolating sequences of values. The case used in the above example is when you write <code>#(#var), *</code>. This means that <code>var</code> is expected to be an iterable variable (in our case it will usually be a <code>Vec&lt;_&gt;</code>) and it interpolates each element, inserting the extra tokens written between <code>)</code> and <code>*</code> as a separator between these elements. So this example inserts a comma between the elements.</p> <p>The above defined <code>quote</code> is a template that matches what we wanted to generate. All that's needed is to define all these variables holding dynamic parts of the generated code. The <code>#ast</code> variable itself is the parsed function - so the first line of the quote just makes sure the original definition is part of the result.</p> <p>The <code>#succ</code> and <code>#err</code> types are extracted with the <code>result_type</code> helper function as described above. 
The others are just defined by either transforming and cloning AST nodes, or using <code>quote!</code> to generate sub token streams.</p> <p>Let's see a few examples!</p> <p>The new trait's name has to be an <code>Ident</code>:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">let</span><span> fnname </span><span style="color:#a626a4;">=</span><span> fnsig.ident.</span><span style="color:#0184bc;">clone</span><span>(); </span><span style="color:#a626a4;">let</span><span> traitname </span><span style="color:#a626a4;">= </span><span>Ident::new(</span><span style="color:#a626a4;">&amp;</span><span>fnname.</span><span style="color:#0184bc;">to_string</span><span>().</span><span style="color:#0184bc;">to_pascal_case</span><span>(), fnsig.ident.</span><span style="color:#0184bc;">span</span><span>()); </span></code></pre> <p>Here we use the <code>to_pascal_case</code> extension method provided by the <a href="https://docs.rs/heck/latest/heck/">heck crate</a>.</p> <p>Another example is the signature of the function that's inside the trait. It is <em>almost</em> the same as the annotated function's, but it has to have <code>self</code> as its first parameter - that's how it becomes an extension method on the transaction.</p> <p>We can do this by cloning the annotated function's signature and just adding a new parameter:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">let mut</span><span> fnsig </span><span style="color:#a626a4;">=</span><span> ast.sig.</span><span style="color:#0184bc;">clone</span><span>(); </span><span>fnsig.inputs.</span><span style="color:#0184bc;">insert</span><span>(</span><span style="color:#c18401;">0</span><span>, parse_quote! 
{ </span><span style="color:#e45649;">self </span><span>}); </span></code></pre> <p>Note that <code>parse_quote!</code> immediately parses the token stream generated by quote back to a <code>syn</code> AST node.</p> <h4 id="compensation-function-shapes">Compensation function shapes</h4> <p>The last interesting bit is how the macro supports compensation functions of different shapes. What we support right now, is the following.</p> <ul> <li>The compensation function has no parameters at all</li> <li>The compensation function takes the output of the action but not the inputs</li> <li>The compensation function takes the output and all the inputs</li> </ul> <p>With the account creation example this means all of these are valid:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">golem_operation</span><span>(compensation</span><span style="color:#a626a4;">=</span><span>delete_account)] </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">create_account</span><span>(</span><span style="color:#e45649;">username</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>, </span><span style="color:#e45649;">email</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>) -&gt; Result&lt;AccountId, DomainError&gt;; </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">delete_account</span><span>() -&gt; Result&lt;(), DomainError&gt;; </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">delete_account</span><span>(</span><span style="color:#e45649;">account_id</span><span>: AccountId) -&gt; Result&lt;(), DomainError&gt;; </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">delete_account</span><span>(</span><span style="color:#e45649;">account_id</span><span>: AccountId, </span><span 
style="color:#e45649;">username</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>, </span><span style="color:#e45649;">email</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>) -&gt; Result&lt;(), DomainError&gt;; </span></code></pre> <p>If we could have the AST of <code>delete_account</code> from the macro, it would be easy to decide which shape we have - we would not even need to worry about not having actual types because we could just compare the parameter list and result type tokens of the two functions to be able to decide which way to go.</p> <p>Unfortunately our macro is on the <code>create_account</code> function and there is no way to access anything else about <code>delete_account</code> from it than the <code>compensation=delete_account</code> part which we passed as an attribute parameter.</p> <p>Before solving this problem let's see how we can get the <em>name</em> of the compensation function, at least:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">let</span><span> args </span><span style="color:#a626a4;">= </span><span>parse_macro_input!(args with Punctuated::&lt;Meta, syn::Token</span><span style="background-color:#e06c75;color:#fafafa;">!</span><span>[,]</span><span style="color:#a626a4;">&gt;</span><span>::parse_terminated); </span><span> </span><span style="color:#a626a4;">let mut</span><span> compensation </span><span style="color:#a626a4;">= </span><span>None; </span><span style="color:#a626a4;">for</span><span> arg </span><span style="color:#a626a4;">in</span><span> args { </span><span> </span><span style="color:#a626a4;">if let </span><span>Meta::NameValue(name_value) </span><span style="color:#a626a4;">=</span><span> arg { </span><span> </span><span style="color:#a626a4;">let</span><span> name </span><span style="color:#a626a4;">=</span><span> name_value.path.</span><span 
style="color:#0184bc;">get_ident</span><span>().</span><span style="color:#0184bc;">unwrap</span><span>().</span><span style="color:#0184bc;">to_string</span><span>(); </span><span> </span><span style="color:#a626a4;">let</span><span> value </span><span style="color:#a626a4;">=</span><span> name_value.value; </span><span> </span><span> </span><span style="color:#a626a4;">if</span><span> name </span><span style="color:#a626a4;">== </span><span style="color:#50a14f;">&quot;compensation&quot; </span><span>{ </span><span> compensation </span><span style="color:#a626a4;">= </span><span>Some(value); </span><span> } </span><span> } </span><span>} </span></code></pre> <p>We parse the macro's input into a list of <code>Meta</code> nodes, and look for the <code>NameValue</code> cases representing the attribute arguments having the <code>x=y</code> form. If the key is <code>compensation</code> we store the value, which has the type <code>Expr</code> (expression AST node) and we can interpolate this expression node directly in the quoted code to get our function name.</p> <p>Let's go back to the primary problem - how can we generate code that invokes this function which can have three different shapes, if we cannot know which one it is?</p> <p>First we define a <strong>trait</strong> that abstracts this problem for us:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">pub trait </span><span>CompensationFunction&lt;In, Out, Err&gt; { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">call</span><span>(</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">result</span><span>: Out, </span><span style="color:#e45649;">input</span><span>: In) -&gt; Result&lt;(), Err&gt;; </span><span>} </span></code></pre> <p>This always has the same shape - we just pass both the results and the inputs to 
it, and the trait's implementation can decide to use any of these parameters to actually call the compensation function or not.</p> <p>We can define a function that takes an arbitrary value <code>T</code> for which we have an implementation of this trait, and just call it:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">pub fn </span><span style="color:#0184bc;">call_compensation_function</span><span>&lt;In, Out, Err&gt;( </span><span> </span><span style="color:#e45649;">f</span><span>: impl CompensationFunction&lt;In, Out, Err&gt;, </span><span> </span><span style="color:#e45649;">result</span><span>: Out, </span><span> </span><span style="color:#e45649;">input</span><span>: In, </span><span>) -&gt; Result&lt;(), Err&gt; { </span><span> f.</span><span style="color:#0184bc;">call</span><span>(result, input) </span><span>} </span></code></pre> <p>With this, we can simply generate code from the macro that passes <strong>the actual compensation function</strong> to the <code>f</code> parameter of <code>call_compensation_function</code>, and always pass both the result and the input!</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#0184bc;">call_compensation_function</span><span>( </span><span> delete_account, </span><span> op_result, </span><span> (username, email) </span><span>) </span></code></pre> <p>To make this work we need instances of <code>CompensationFunction</code> for arbitrary function types.</p> <p>Let's try to define it for the function with no parameters (the first supported shape):</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">impl</span><span>&lt;F, Err&gt; CompensationFunction&lt;(), (), Err&gt; 
</span><span style="color:#a626a4;">for </span><span>F </span><span style="color:#a626a4;">where </span><span> F: FnOnce() -&gt; Result&lt;(), Err&gt;, </span><span>{ </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">call</span><span>( </span><span> </span><span style="color:#e45649;">self</span><span>, </span><span> </span><span style="color:#e45649;">_result</span><span>: (), </span><span> </span><span style="color:#e45649;">_input</span><span>: (), </span><span> ) -&gt; Result&lt;(), Err&gt; { </span><span> </span><span style="color:#0184bc;">self</span><span>()</span><span style="color:#a626a4;">?</span><span>; </span><span> Ok(()) </span><span> } </span><span>} </span></code></pre> <p>This is not the final implementation, as we will see soon. If we try to write an implementation for the second shape - where we only use the result and not the input - we immediately run into a problem:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">impl</span><span>&lt;F, Out, Err&gt; CompensationFunction&lt;(), Out, Err&gt; </span><span style="color:#a626a4;">for </span><span>F </span><span style="color:#a626a4;">where </span><span> F: FnOnce(Out) -&gt; Result&lt;(), Err&gt; { </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span>} </span></code></pre> <p>The error is about <strong>conflicting implementations</strong> of our trait:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>error[E0119]: conflicting implementations of trait `CompensationFunction&lt;(), (), _&gt;` </span><span> --&gt; golem-rust/src/transaction/compfn.rs:45:1 </span><span> | </span><span>31 | / impl&lt;F, Err&gt; CompensationFunction&lt;(), (), Err&gt; for F </span><span>32 | | where </span><span>33 | | F: FnOnce() -&gt; Result&lt;(), Err&gt;, </span><span> | |___________________________________- first implementation here </span><span>... </span><span>45 | / impl&lt;F, Out, Err&gt; CompensationFunction&lt;(), Out, Err&gt; for F </span><span>46 | | where </span><span>47 | | F: FnOnce(Out) -&gt; Result&lt;(), Err&gt;, </span><span> | |______________________________________^ conflicting implementation </span></code></pre> <p>These two trait implementations <strong>overlap</strong>. Although it is not obvious at first glance why the two are overlapping, what happens is that all the types involved in the overlap check can be unified:</p> <ul> <li>The trait's parameters - <ul> <li>The first is <code>()</code> in both cases</li> <li>The second is <code>()</code> vs <code>Out</code>. Nothing prevents <code>Out</code> from being <code>()</code></li> <li>The third can be anything in both cases</li> </ul> </li> <li>The type we implement the trait for <ul> <li>This is the confusing part - as we have two different function type signatures in the two cases! But these are only trait bounds. We say we implement <code>CompensationFunction</code> for a type <code>F</code> which implements the trait <code>FnOnce() ...</code>. 
The problem is that in theory there can be a type that implements both these function traits, so this is not preventing the overlap either.</li> </ul> </li> </ul> <p>This is something <a href="https://github.com/rust-lang/rfcs/blob/master/text/1210-impl-specialization.md">specialization</a> would solve but that is currently an unstable compiler feature.</p> <p>If at least one of the above types could not be unified, we would not have an overlap, so that's what we have to do. The simplest way to do so is to stop having unconstrained types in the trait's type parameters such as <code>In</code> and <code>Out</code> and <code>Err</code> (Actually <code>Err</code> should not be affected by this, but I applied the same technique to all parameters at once in the library. This is something that could be potentially simplified in the future.).</p> <p>So we just have to have a type parameter that can contain an arbitrary input or output type, but does not unify with <code>()</code>. We can do that by wrapping the output type in a tuple:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">impl</span><span>&lt;F, Out, Err&gt; CompensationFunction&lt;(), (Out,), (Err,)&gt; </span><span style="color:#a626a4;">for </span><span>F </span><span> </span><span style="color:#a626a4;">where </span><span> F: FnOnce(Out) -&gt; Result&lt;(), Err&gt; </span></code></pre> <p>Here instead of <code>Out</code> we use <code>(Out,)</code> which is a 1-tuple wrapping our output type. 
This no longer unifies with <code>()</code> so the compiler error is resolved!</p> <p>We can imagine additional trait implementations for one or more input parameters:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">impl</span><span>&lt;F, T1, Out, Err&gt; CompensationFunction&lt;(T1,), (Out,), (Err,)&gt; </span><span style="color:#a626a4;">for </span><span>F </span><span> </span><span style="color:#a626a4;">where </span><span> F: FnOnce(Out, T1) -&gt; Result&lt;(), Err&gt;, </span><span> </span><span>impl&lt;F, T1, T2, Out, Err&gt; CompensationFunction&lt;(T1,T2), (Out,), (Err,)&gt; </span><span style="color:#a626a4;">for</span><span> F </span><span> </span><span style="color:#a626a4;">where </span><span> F: FnOnce(Out, T1, T2) -&gt; Result&lt;(), Err&gt; </span><span> </span><span style="color:#a0a1a7;">// ... </span></code></pre> <p>Two more problems to solve before we are done!</p> <p>The first problem occurs when we try to use this mechanism for the first two compensation function shapes - when the input, or both the result and the input, are not used by the function.</p> <p>The problem is that these trait implementations bind the <code>In</code> and/or <code>Out</code> types to <code>()</code> in these cases, which means that our <code>call</code> function will use the unit type for these parameters. 
For example, for <code>delete_account</code>, which does not take the input parameters, it would have the following types if we replace the generic parameters with the inferred ones:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">pub fn </span><span style="color:#0184bc;">call_compensation_function</span><span>( </span><span> </span><span style="color:#e45649;">f</span><span>: impl FnOnce(</span><span style="color:#e45649;">AccountId</span><span>) -&gt; Result&lt;(), DomainError&gt;, </span><span> </span><span style="color:#e45649;">result</span><span>: AccountId, </span><span> </span><span style="color:#e45649;">input</span><span>: (), </span><span>) -&gt; Result&lt;(), DomainError&gt; </span></code></pre> <p>And our macro will call it like this:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#0184bc;">call_compensation_function</span><span>( </span><span> delete_account, </span><span> op_result, </span><span> (username, email) </span><span>) </span></code></pre> <p>This of course will not compile, because we pass <code>(&amp;str, &amp;str)</code> in place of a <code>()</code>.</p> <p>Let's take a step back, and change our <code>CompensationFunction</code> trait:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">pub trait </span><span>CompensationFunction&lt;In, Out, Err&gt; { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">call</span><span>( </span><span> </span><span style="color:#e45649;">self</span><span>, </span><span> </span><span style="color:#e45649;">result</span><span>: impl TupleOrUnit&lt;Out&gt;, </span><span> </span><span 
style="color:#e45649;">input</span><span>: impl TupleOrUnit&lt;In&gt; </span><span> ) -&gt; Result&lt;(), Err&gt;; </span><span>} </span><span> </span></code></pre> <p>Instead of directly taking <code>Out</code> and <code>In</code> in the parameters we now accept <strong>anything that implements TupleOrUnit</strong> for the given type.</p> <p><code>TupleOrUnit</code> is just a special conversion trait:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">pub trait </span><span>TupleOrUnit&lt;T&gt; { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">into</span><span>(</span><span style="color:#e45649;">self</span><span>) -&gt; T; </span><span>} </span></code></pre> <p>What makes it special and what makes it solve our problem is what instances we have for it.</p> <p>First of all we say that <strong>anything can be converted to unit</strong>:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">impl</span><span>&lt;T&gt; TupleOrUnit&lt;()&gt; </span><span style="color:#a626a4;">for </span><span>T { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">into</span><span>(</span><span style="color:#e45649;">self</span><span>) {} </span><span>} </span></code></pre> <p>Then we use the same trick to avoid overlapping instances, and we say that 1-tuple, 2-tuple, etc. 
can be converted only to themselves:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">impl</span><span>&lt;T1&gt; TupleOrUnit&lt;(T1, )&gt; </span><span style="color:#a626a4;">for</span><span> (T1, ) { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">into</span><span>(</span><span style="color:#e45649;">self</span><span>) -&gt; (T1, ) { </span><span> </span><span style="color:#e45649;">self </span><span> } </span><span>} </span><span style="color:#a626a4;">impl</span><span>&lt;T1, T2&gt; TupleOrUnit&lt;(T1, T2, )&gt; </span><span style="color:#a626a4;">for</span><span> (T1, T2, ) { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">into</span><span>(</span><span style="color:#e45649;">self</span><span>) -&gt; (T1, T2, ) { </span><span> </span><span style="color:#e45649;">self </span><span> } </span><span>} </span><span style="color:#a0a1a7;">// ... </span></code></pre> <p>With this we achieved that the <code>call_compensation_function</code> function is still type safe - it requires us to pass the proper <code>Out</code> and <code>In</code> types - but in the special case when either of these types is unit, it allows us to pass an arbitrary value instead of an actual <code>()</code>.</p> <p>This makes our macro complete.</p> <p>The last thing to solve is to have enough instances of these two type classes - <code>CompensationFunction</code> and <code>TupleOrUnit</code> - so our library works with more than 1 or 2 parameters. 
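</p> <p>For reference, a hand-written sketch that covers only unit, 1-tuples and 2-tuples might look like the following. The <code>convert</code> helper is hypothetical and exists only to drive the example; the trait and its instances follow the definitions shown above:</p>

```rust
// Conversion trait from the article: turn `self` into `T`.
trait TupleOrUnit<T> {
    fn into(self) -> T;
}

// Anything can be converted to unit; the value is simply dropped.
impl<T> TupleOrUnit<()> for T {
    fn into(self) {}
}

// Tuples convert only to themselves (written by hand up to two elements).
impl<T1> TupleOrUnit<(T1,)> for (T1,) {
    fn into(self) -> (T1,) {
        self
    }
}

impl<T1, T2> TupleOrUnit<(T1, T2)> for (T1, T2) {
    fn into(self) -> (T1, T2) {
        self
    }
}

// Hypothetical helper (not part of the library): converts `value` into the
// target type `T` requested by the caller.
fn convert<T, V: TupleOrUnit<T>>(value: V) -> T {
    // Fully qualified call, so it cannot be confused with `std::convert::Into`.
    <V as TupleOrUnit<T>>::into(value)
}
```

<p>Asking for <code>()</code> accepts any value, for example <code>let dropped: () = convert(("alice", "alice@example.com"));</code>, while asking for a tuple type only accepts that exact tuple. Every further arity needs one more instance of the same shape.</p> <p>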
Writing these instances by hand is an option, but we can easily generate them with another macro!</p> <p>This time we don't have to write a procedural macro - we can use <strong>declarative macros</strong>, which are simpler and can be defined inline in the same module where we define these types.</p> <p>Let's start with <code>TupleOrUnit</code> as it is a bit simpler. We use the <a href="https://doc.rust-lang.org/reference/macros-by-example.html">macro_rules</a> macro which is basically a pattern match with a special syntax - you can match on what is passed to the macro, and generate code with interpolation similar to the <code>quote!</code> macro - but using <code>$</code> instead of <code>#</code> as the interpolation symbol. The following macro generates an instance of <code>TupleOrUnit</code> for a tuple of the given type parameters:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#0184bc;">macro_rules! </span><span>tuple_or_unit { </span><span> (</span><span style="color:#a626a4;">$</span><span>(</span><span style="color:#e45649;">$ty</span><span>:</span><span style="color:#a626a4;">ident</span><span>),</span><span style="color:#a626a4;">*</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span>{ </span><span> </span><span style="color:#a626a4;">impl</span><span>&lt;$(</span><span style="color:#e45649;">$ty</span><span>),*&gt; TupleOrUnit&lt;($(</span><span style="color:#e45649;">$ty</span><span>,)*)&gt; </span><span style="color:#a626a4;">for</span><span> ($(</span><span style="color:#e45649;">$ty</span><span>,)*) { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">into</span><span>(</span><span style="color:#e45649;">self</span><span>) -&gt; ($(</span><span style="color:#e45649;">$ty</span><span>,)*) { </span><span> </span><span style="color:#e45649;">self </span><span> } </span><span> } </span><span> } </span><span>} 
</span></code></pre> <p>We have a single case of our pattern match, which matches a <strong>comma-separated list of identifiers</strong>. We can refer to this list of identifiers as <code>ty</code>. Then we use the same syntax for interpolating sequences into the code as we have seen already in our procedural macro and just generate the instance.</p> <p>We can call this macro with a list of type parameters (which are all <em>identifiers</em>):</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>tuple_or_unit!(</span><span style="color:#c18401;">T1</span><span>, </span><span style="color:#c18401;">T2</span><span>, </span><span style="color:#c18401;">T3</span><span>); </span></code></pre> <p>Let's do the same for generating <code>CompensationFunction</code> instances:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#0184bc;">macro_rules! 
</span><span>compensation_function { </span><span> (</span><span style="color:#a626a4;">$</span><span>(</span><span style="color:#e45649;">$ty</span><span>:</span><span style="color:#a626a4;">ident</span><span>),</span><span style="color:#a626a4;">*</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span>{ </span><span> </span><span style="color:#a626a4;">impl</span><span>&lt;F, $(</span><span style="color:#e45649;">$ty</span><span>),*, Out, Err&gt; CompensationFunction&lt;($(</span><span style="color:#e45649;">$ty</span><span>),*,), (Out,), (Err,)&gt; </span><span style="color:#a626a4;">for </span><span>F </span><span> </span><span style="color:#a626a4;">where </span><span> F: FnOnce(Out, $(</span><span style="color:#e45649;">$ty</span><span>),*) -&gt; Result&lt;(), Err&gt;, </span><span> { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">call</span><span>( </span><span> </span><span style="color:#e45649;">self</span><span>, </span><span> </span><span style="color:#e45649;">out</span><span>: impl TupleOrUnit&lt;(Out,)&gt;, </span><span> </span><span style="color:#e45649;">input</span><span>: impl TupleOrUnit&lt;($(</span><span style="color:#e45649;">$ty</span><span>),*,)&gt;, </span><span> ) -&gt; Result&lt;(), (Err,)&gt; { </span><span> #[</span><span style="color:#e45649;">allow</span><span>(non_snake_case)] </span><span> </span><span style="color:#a626a4;">let </span><span>( </span><span style="color:#a626a4;">$</span><span>(</span><span style="color:#e45649;">$ty</span><span>,)</span><span style="color:#a626a4;">+ </span><span>) </span><span style="color:#a626a4;">=</span><span> input.</span><span style="color:#0184bc;">into</span><span>(); </span><span> </span><span style="color:#a626a4;">let </span><span>(out,) </span><span style="color:#a626a4;">=</span><span> out.</span><span style="color:#0184bc;">into</span><span>(); </span><span> </span><span style="color:#0184bc;">self</span><span>(out, </span><span 
style="color:#a626a4;">$</span><span>(</span><span style="color:#e45649;">$ty</span><span>),</span><span style="color:#a626a4;">*</span><span>).</span><span style="color:#0184bc;">map_err</span><span>(|</span><span style="color:#e45649;">err</span><span>| (err,)) </span><span> } </span><span> } </span><span> } </span><span>} </span></code></pre> <p>The only interesting part here is how we access the components of our tuple.</p> <p>Let's imagine we pass <code>T1, T2, T3</code> as arguments to this macro, so <code>ty</code> is a sequence of three identifiers. We can interpolate this comma-separated list into the type parameter part (<code>impl&lt;F, $($ty),*, Out, Err&gt;</code>) without any problems, but this is still just a list of identifiers - and when we call our compensation function (<code>self</code>), we have to access the individual elements of this tuple and pass them to the function as separate parameters.</p> <p>We could write it by hand like this:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">call</span><span>(</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">out</span><span>: impl TupleOrUnit&lt;(Out,)&gt;, input: impl TupleOrUnit&lt;(T1, T2, T3)&gt;) -&gt; Result&lt;(), Err&gt; </span><span>{</span><span> </span><span> let </span><span>(</span><span style="color:#e45649;">out</span><span>,): (Out,) = out.into(); </span><span> let </span><span style="color:#e45649;">input</span><span>: (</span><span style="color:#e45649;">T1</span><span>, </span><span style="color:#e45649;">T2</span><span>, </span><span style="color:#e45649;">T3</span><span>) = input.into(); </span><span> </span><span style="color:#e45649;">self</span><span>(</span><span style="color:#e45649;">out</span><span>, </span><span 
style="color:#e45649;">input</span><span>.0, </span><span style="color:#e45649;">input</span><span>.1, </span><span style="color:#e45649;">input</span><span>.2) </span><span>} </span></code></pre> <p>It is possible to generate a list of accessors like this from a procedural macro, but not in a declarative one - we only have <code>ty</code> to work with. We can instead <strong>destructure</strong> the tuple and we can actually reuse the list of identifiers to do so!</p> <p>In the above macro code, this can be seen as:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">allow</span><span>(non_snake_case)] </span><span style="color:#a626a4;">let </span><span>( </span><span style="color:#a626a4;">$</span><span>(</span><span style="color:#e45649;">$ty</span><span>,)</span><span style="color:#a626a4;">+ </span><span>) </span><span style="color:#a626a4;">=</span><span> input.</span><span style="color:#0184bc;">into</span><span>(); </span><span style="color:#0184bc;">self</span><span>(out, </span><span style="color:#a626a4;">$</span><span>(</span><span style="color:#e45649;">$ty</span><span>),</span><span style="color:#a626a4;">*</span><span>) </span></code></pre> <p>This translates to</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">allow</span><span>(non_snake_case)] </span><span style="color:#a626a4;">let </span><span>(</span><span style="color:#c18401;">T1</span><span>, </span><span style="color:#c18401;">T2</span><span>, </span><span style="color:#c18401;">T3</span><span>) </span><span style="color:#a626a4;">=</span><span> input.</span><span style="color:#0184bc;">into</span><span>(); </span><span style="color:#0184bc;">self</span><span>(out, </span><span style="color:#c18401;">T1</span><span>, 
</span><span style="color:#c18401;">T2</span><span>, </span><span style="color:#c18401;">T3</span><span>) </span></code></pre> <p>The error mapping is only necessary because currently the error type is also wrapped into a tuple - this could enable additional function shapes where the compensation function never fails, for example, but it is not implemented yet.</p> <h2 id="conclusion">Conclusion</h2> <p>The library described here is open source and is available <a href="https://github.com/golemcloud/golem-rust">on GitHub</a> and published <a href="https://crates.io/crates/golem-rust">to crates.io</a>. Documentation and examples will soon be added to <a href="https://learn.golem.cloud/docs/intro">Golem's learn pages</a>. And of course this is just a first version, which I hope to see grow based on user feedback.</p> <p>We also plan to have similar higher-level wrapper libraries for Golem's features for the other supported languages - everything Golem provides is exposed through the WASM Component Model, so any language supporting it has immediate access to the building blocks. All that remains is writing idiomatic wrappers on top of them for each language.</p> Worker to Worker communication in Golem 2024-03-08T00:00:00+00:00 2024-03-08T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/w2w-communication-golem/ <p>This article was originally posted at <a href="https://www.golem.cloud/post/worker-to-worker-communication">the Golem Cloud Blog</a></p> <p>Golem Cloud's first developer preview <a href="https://www.golem.cloud/post/unveiling-golem-cloud">was unveiled in August</a>, and just a month ago, we released <a href="https://www.golem.cloud/post/golem-goes-open-source">an open-source version of Golem</a>. 
Workers, the fundamental primitive in Golem, expose a typed interface that can be invoked through the REST API or the command line tools, but until today, calling a worker from <em>another worker</em> was neither easy nor type-safe.</p> <p>With the latest release of Golem and the <code>golem-cli</code> tool, we finally have a first-class, typed way to invoke one worker from another, using any of the supported guest languages!</p> <h2 id="golem-wasm-rpc">Golem WASM RPC</h2> <p>Golem's new worker to worker communication feature consists of two major layers:</p> <ul> <li>A low-level, dynamic worker invocation API exposed as a Golem <strong>host function</strong> to all workers. This interface is not type safe. Rather, it matches the capabilities of the external REST API, allowing a worker to invoke any method on any other worker with any parameters. However, it avoids the overhead of setting up an HTTP connection and will be optimized in the future.</li> <li>The ability to generate <strong>stubs</strong> for having a completely type-safe, language-independent remote worker invocation for any supported language having a WIT-based binding generator.</li> </ul> <p>With the new stub generator commands integrated into Golem's command line tool (<code>golem-cli</code>) worker to worker communication is now a simple and fully type-safe experience.</p> <h2 id="a-full-example">A full example</h2> <p>To demonstrate how this new feature works, we will take one of the first Golem example projects, the <strong>shopping cart</strong>, and extend it with worker-to-worker communication. The original shopping-cart project defines a worker for each shopping cart of an online web store, with exported functions to add items to the cart and eventually check out and finish the shopping process.</p> <p>In this example, we introduce a second <strong>worker template</strong>, one that will be used to create a single <strong>worker</strong> for each online shopper. 
This worker will keep a log of all the purchases of the user it belongs to. We will extend the shopping cart's <code>checkout</code> function with a remote worker invocation to add a new entry to the account's purchase log.</p> <p>First, let's make sure we have the latest version of <code>golem-cli</code>, if using the open-source Golem version, or <code>golem-cloud-cli</code>, if using the hosted version. It must have the new <code>stubgen</code> subcommand, to check let's run <code>golem-cli stubgen --help</code>:</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">WASM</span><span> RPC stub generator </span><span> </span><span style="color:#e45649;">Usage:</span><span> golem-cli stubgen </span><span style="color:#a626a4;">[</span><span>OPTIONS</span><span style="color:#a626a4;">] &lt;</span><span>COMMAND</span><span style="color:#a626a4;">&gt; </span><span> </span><span style="color:#e45649;">Commands: </span><span> </span><span style="color:#e45649;">generate</span><span> Generate a Rust RPC stub crate for a WASM component </span><span> </span><span style="color:#e45649;">build</span><span> Build an RPC stub for a WASM component </span><span> </span><span style="color:#e45649;">add-stub-dependency</span><span> Adds a generated stub as a dependency to another WASM component </span><span> </span><span style="color:#e45649;">compose</span><span> Compose a WASM component with a generated stub WASM </span><span> </span><span style="color:#e45649;">initialize-workspace</span><span> Initializes a Golem-specific cargo-make configuration in a Cargo workspace for automatically generating stubs and composing results </span><span> </span><span style="color:#0184bc;">help</span><span> Print this message or the help of the given subcommand(s) </span><span> </span><span style="color:#e45649;">Options: </span><span> </span><span style="color:#e45649;">-v, 
--verbose</span><span>... Increase logging verbosity </span><span> </span><span style="color:#e45649;">-q, --quiet</span><span>... Decrease logging verbosity </span><span> </span><span style="color:#e45649;">-h, --help</span><span> Print help </span></code></pre> <h3 id="preparing-the-example">Preparing the example</h3> <p>We are going to create two different <strong>Golem templates</strong>, and keep the source code of both in a single <strong>Cargo workspace</strong>. This is not required - they could live in completely separate places - but it allows using our built-in cargo-make support, which currently gives us the best possible developer experience for worker-to-worker communication.</p> <p>First, let's use the <code>golem-cli new</code> command to take the <strong>shopping-cart example</strong> and generate a new template source from it:</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">$</span><span> golem-cli new</span><span style="color:#e45649;"> --example</span><span> rust-shopping-cart</span><span style="color:#e45649;"> --template-name</span><span> shopping-cart-rpc </span><span style="color:#e45649;">See</span><span> the documentation about installing common tooling: https://golem.cloud/learn/rust </span><span> </span><span style="color:#e45649;">Compile</span><span> the Rust component with cargo-component: </span><span> </span><span style="color:#e45649;">cargo</span><span> component build</span><span style="color:#e45649;"> --release </span><span style="color:#e45649;">The</span><span> result in target/wasm32-wasi/release/shopping_cart_rpc.wasm is ready to be used with Golem! </span></code></pre> <p>The <code>shopping-cart-rpc</code> directory now contains a single Rust crate, which can be compiled to WASM using <code>cargo component build</code>. 
We need two different WASMs (two Golem templates) so as a first step, we convert the generated Cargo project to a <a href="https://doc.rust-lang.org/book/ch14-03-cargo-workspaces.html"><strong>cargo workspace</strong></a>.</p> <p>First, create two sub-directories for the two templates we will use:</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">$</span><span> mkdir</span><span style="color:#e45649;"> -pv</span><span> shopping-cart </span><span style="color:#e45649;">shopping-cart </span><span style="color:#e45649;">$</span><span> mkdir</span><span style="color:#e45649;"> -pv</span><span> purchase-history </span><span style="color:#e45649;">purchase-history </span></code></pre> <p>Then, move the generated shopping cart source code into the <code>shopping-cart</code> subdirectory:</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">$</span><span> mv</span><span style="color:#e45649;"> -v</span><span> src shopping-cart </span><span style="color:#e45649;">src</span><span> -</span><span style="color:#a626a4;">&gt;</span><span> shopping-cart/src </span><span> </span><span style="color:#e45649;">$</span><span> mv</span><span style="color:#e45649;"> -v</span><span> wit shopping-cart </span><span style="color:#e45649;">wit</span><span> -</span><span style="color:#a626a4;">&gt;</span><span> shopping-cart/wit </span><span> </span><span style="color:#e45649;">$</span><span> mv</span><span style="color:#e45649;"> -v</span><span> Cargo.toml shopping-cart </span><span style="color:#e45649;">Cargo.toml</span><span> -</span><span style="color:#a626a4;">&gt;</span><span> shopping-cart/Cargo.toml </span></code></pre> <p>We can copy the whole contents of the <code>shopping-cart</code> directory to the <code>purchase-history</code> directory 
too:</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">$</span><span> cp</span><span style="color:#e45649;"> -rv</span><span> shopping-cart/</span><span style="color:#a626a4;">*</span><span> purchase-history </span><span style="color:#e45649;">shopping-cart/Cargo.toml</span><span> -</span><span style="color:#a626a4;">&gt;</span><span> purchase-history/Cargo.toml </span><span style="color:#e45649;">shopping-cart/src</span><span> -</span><span style="color:#a626a4;">&gt;</span><span> purchase-history/src </span><span style="color:#e45649;">shopping-cart/src/lib.rs</span><span> -</span><span style="color:#a626a4;">&gt;</span><span> purchase-history/src/lib.rs </span><span style="color:#e45649;">shopping-cart/wit</span><span> -</span><span style="color:#a626a4;">&gt;</span><span> purchase-history/wit </span><span style="color:#e45649;">shopping-cart/wit/shopping-cart-rpc.wit</span><span> -</span><span style="color:#a626a4;">&gt;</span><span> purchase-history/wit/shopping-cart-rpc.wit </span></code></pre> <p>Then we create a new <code>Cargo.toml</code> file in the root, pointing to the two sub-projects:</p> <pre data-lang="toml" style="background-color:#fafafa;color:#383a42;" class="language-toml "><code class="language-toml" data-lang="toml"><span>[workspace] </span><span style="color:#e45649;">resolver </span><span>= </span><span style="color:#50a14f;">&quot;2&quot; </span><span> </span><span style="color:#e45649;">members </span><span>= [ </span><span> </span><span style="color:#50a14f;">&quot;shopping-cart&quot;</span><span>, </span><span> </span><span style="color:#50a14f;">&quot;purchase-history&quot;</span><span>, </span><span>] </span></code></pre> <p>Next, modify the <code>name</code> property in both sub-projects' <code>Cargo.toml</code>. 
In <code>shopping-cart/Cargo.toml</code>, it should be:</p> <pre data-lang="toml" style="background-color:#fafafa;color:#383a42;" class="language-toml "><code class="language-toml" data-lang="toml"><span style="color:#e45649;">name </span><span>= </span><span style="color:#50a14f;">&quot;shopping-cart&quot; </span></code></pre> <p>while in the other</p> <pre data-lang="toml" style="background-color:#fafafa;color:#383a42;" class="language-toml "><code class="language-toml" data-lang="toml"><span style="color:#e45649;">name </span><span>= </span><span style="color:#50a14f;">&quot;purchase-history&quot; </span></code></pre> <p>It's also recommended that you rename the WIT file in both the <code>wit</code> directories to a file name that corresponds to the given sub-project's name, but it does not have any effect on the compilation—it just makes working on the source code easier.</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">$</span><span> mv shopping-cart/wit/shopping-cart-rpc.wit shopping-cart/wit/shopping-cart.wit </span><span style="color:#e45649;">$</span><span> mv purchase-history/wit/shopping-cart-rpc.wit purchase-history/wit/purchase-history.wit </span></code></pre> <p>At this point running <code>cargo component build</code> in the root will compile both identical sub-projects, creating two different WASM files (but both containing the shopping cart implementation for now):</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">$</span><span> cargo component build </span><span style="color:#e45649;">... 
</span><span> </span><span style="color:#e45649;">Creating</span><span> component /Users/vigoo/projects/demo/shopping-cart-rpc/target/wasm32-wasi/debug/purchase_history.wasm </span><span> </span><span style="color:#e45649;">Creating</span><span> component /Users/vigoo/projects/demo/shopping-cart-rpc/target/wasm32-wasi/debug/shopping_cart.wasm </span></code></pre> <h3 id="implementing-the-purchase-history-template">Implementing the purchase history template</h3> <p>Before talking about <em>worker-to-worker communication</em>, let's just implement a simple version of the <strong>purchase history template</strong>. Each worker of this template will correspond to a <strong>user</strong> of the system, the worker name being equal to the user's identifier. We only need two exported functions, one for recording a purchase, and one for getting all the previous purchases.</p> <p>Let's completely replace <code>purchase-history/wit/purchase-history.wit</code> with the following interface definition:</p> <pre data-lang="wit" style="background-color:#fafafa;color:#383a42;" class="language-wit "><code class="language-wit" data-lang="wit"><span style="color:#a626a4;">package </span><span>shopping:purchase-history; </span><span> </span><span style="color:#a626a4;">interface </span><span>api { </span><span> </span><span style="color:#a626a4;">record </span><span>product-item { </span><span> </span><span style="color:#e45649;">product-id</span><span>: </span><span style="color:#a626a4;">string</span><span>, </span><span> </span><span style="color:#e45649;">name</span><span>: </span><span style="color:#a626a4;">string</span><span>, </span><span> </span><span style="color:#e45649;">price</span><span>: float32, </span><span> </span><span style="color:#e45649;">quantity</span><span>: </span><span style="color:#a626a4;">u32</span><span>, </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">record </span><span>order { </span><span> </span><span 
style="color:#e45649;">order-id</span><span>: </span><span style="color:#a626a4;">string</span><span>, </span><span> </span><span style="color:#e45649;">items</span><span>: </span><span style="color:#a626a4;">list</span><span>&lt;product-item&gt;, </span><span> </span><span style="color:#e45649;">total</span><span>: float32, </span><span> </span><span style="color:#e45649;">timestamp</span><span>: </span><span style="color:#a626a4;">u64</span><span>, </span><span> } </span><span> </span><span> </span><span style="color:#0184bc;">add-order</span><span>: </span><span style="color:#a626a4;">func</span><span>(</span><span style="color:#e45649;">order</span><span>: order) </span><span style="color:#a626a4;">-&gt; </span><span>(); </span><span> </span><span> </span><span style="color:#0184bc;">get-orders</span><span>: </span><span style="color:#a626a4;">func</span><span>() </span><span style="color:#a626a4;">-&gt; list</span><span>&lt;order&gt;; </span><span>} </span><span> </span><span style="color:#a626a4;">world </span><span>purchase-history { </span><span> </span><span style="color:#a626a4;">export </span><span>api; </span><span>} </span></code></pre> <p>Our <code>product-item</code> and <code>order</code> types are the same that we have in the shopping-cart WIT. In a next step, we will remove them from the shopping-cart WIT, and import them from this component's interface definition!</p> <p>Running <code>cargo component build</code> now will print a couple of errors, as we did not update the <code>purchase-history</code> module's Rust source code yet:</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">$</span><span> cargo component build </span><span style="color:#e45649;">... 
</span><span style="color:#e45649;">error[E0433]:</span><span> failed to resolve: could not find `</span><span style="color:#e45649;">golem</span><span>` in `</span><span style="color:#e45649;">exports</span><span>` </span><span> </span><span style="color:#e45649;">--</span><span style="color:#a626a4;">&gt;</span><span> purchase-history/src/lib.rs:3:31 </span><span> </span><span style="color:#a626a4;">| </span><span style="color:#e45649;">3 </span><span style="color:#a626a4;">| </span><span style="color:#e45649;">use</span><span> crate::bindings::exports::golem::template::api::</span><span style="color:#a626a4;">*; </span><span> </span><span style="color:#a626a4;">| </span><span style="color:#e45649;">^^^^^</span><span> could not find `</span><span style="color:#e45649;">golem</span><span>` in `</span><span style="color:#e45649;">exports</span><span>` </span><span style="color:#e45649;">... </span></code></pre> <p>A simple implementation of this can be the following code replacing the existing <code>lib.rs</code>:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">mod </span><span>bindings; </span><span> </span><span style="color:#a626a4;">use crate</span><span>::bindings::exports::shopping::purchase_history::api::</span><span style="color:#a626a4;">*</span><span>; </span><span> </span><span style="color:#a626a4;">struct </span><span>Component; </span><span> </span><span style="color:#a626a4;">struct </span><span>State { </span><span> </span><span style="color:#e45649;">orders</span><span>: Vec&lt;Order&gt;, </span><span>} </span><span> </span><span style="color:#a626a4;">static mut </span><span style="color:#c18401;">STATE</span><span>: State </span><span style="color:#a626a4;">=</span><span> State { </span><span> orders: Vec::new() </span><span>}; </span><span> </span><span style="color:#a626a4;">fn </span><span 
style="color:#0184bc;">with_state</span><span>&lt;T&gt;(</span><span style="color:#e45649;">f</span><span>: impl FnOnce(&amp;</span><span style="color:#e45649;">mut State</span><span>) -&gt; T) -&gt; T { </span><span> </span><span style="color:#a626a4;">let</span><span> result </span><span style="color:#a626a4;">= unsafe </span><span>{ </span><span style="color:#0184bc;">f</span><span>(</span><span style="color:#a626a4;">&amp;mut </span><span style="color:#c18401;">STATE</span><span>) }; </span><span> </span><span> </span><span style="color:#a626a4;">return</span><span> result; </span><span>} </span><span> </span><span style="color:#a626a4;">impl </span><span>Guest </span><span style="color:#a626a4;">for </span><span>Component { </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">add_order</span><span>(</span><span style="color:#e45649;">order</span><span>: Order) { </span><span> </span><span style="color:#0184bc;">with_state</span><span>(|</span><span style="color:#e45649;">state</span><span>| { </span><span> state.orders.</span><span style="color:#0184bc;">push</span><span>(order); </span><span> }); </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">get_orders</span><span>() -&gt; Vec&lt;Order&gt; { </span><span> </span><span style="color:#0184bc;">with_state</span><span>(|</span><span style="color:#e45649;">state</span><span>| { </span><span> state.orders.</span><span style="color:#0184bc;">clone</span><span>() </span><span> }) </span><span> } </span><span>} </span></code></pre> <p>With this, <code>cargo component build</code> now compiles the new <code>purchase_history.wasm</code> for us.</p> <h3 id="worker-to-worker-communication">Worker to worker communication</h3> <p>At this point, the only outstanding task in our example is to <strong>invoke the appropriate purchase history worker</strong> in the <code>checkout</code> implementation of the shopping 
cart.</p> <p>To find all the available options for doing this, check the <a href="https://learn.golem.cloud/docs/rpc">Worker-to-Worker communication's documentation</a>. In this example, we have both the target (the purchase history) and the caller (the shopping cart) in <strong>the same cargo workspace</strong>, so we can use Golem's <a href="https://github.com/sagiegurari/cargo-make">cargo-make</a> based solution for enabling communication between the different sub-projects of the workspace.</p> <p>Let's initialize this using <code>golem-cli</code> (or <code>golem-cloud-cli</code>):</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">$</span><span> golem-cli stubgen initialize-workspace</span><span style="color:#e45649;"> --targets</span><span> purchase-history</span><span style="color:#e45649;"> --callers</span><span> shopping-cart </span><span style="color:#e45649;">Writing</span><span> cargo-make Makefile to </span><span style="color:#50a14f;">&quot;/Users/vigoo/projects/demo/shopping-cart-rpc/Makefile.toml&quot; </span><span style="color:#e45649;">Generating</span><span> initial stub for purchase-history </span><span style="color:#e45649;">Generating</span><span> stub WIT to /Users/vigoo/projects/demo/shopping-cart-rpc/purchase-history-stub/wit/_stub.wit </span><span style="color:#e45649;">Copying</span><span> root package shopping:purchasehistory </span><span> </span><span style="color:#e45649;">..</span><span> /Users/vigoo/projects/demo/shopping-cart-rpc/purchase-history/wit/purchase-history.wit to /Users/vigoo/projects/demo/shopping-cart-rpc/purchase-history-stub/wit/deps/shopping_purchasehistory/purchase-history.wit </span><span style="color:#e45649;">Writing</span><span> wasm-rpc.wit to /Users/vigoo/projects/demo/shopping-cart-rpc/purchase-history-stub/wit/deps/wasm-rpc </span><span style="color:#e45649;">Generating</span><span> 
Cargo.toml to /Users/vigoo/projects/demo/shopping-cart-rpc/purchase-history-stub/Cargo.toml </span><span style="color:#e45649;">Generating</span><span> stub source to /Users/vigoo/projects/demo/shopping-cart-rpc/purchase-history-stub/src/lib.rs </span><span style="color:#e45649;">Writing</span><span> updated Cargo.toml to </span><span style="color:#50a14f;">&quot;/Users/vigoo/projects/demo/shopping-cart-rpc/Cargo.toml&quot; </span></code></pre> <p>As a next step, we check if the generated artifacts work, by running <strong>cargo make</strong> to execute the full build flow. It contains custom steps invoking <code>golem-cli</code> to implement the typed worker-to-worker communication.</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">$</span><span> cargo make build-flow </span><span style="color:#e45649;">... </span><span> </span><span style="color:#e45649;">Creating</span><span> component /Users/vigoo/projects/demo/shopping-cart-rpc/target/wasm32-wasi/debug/purchase_history.wasm </span><span> </span><span style="color:#e45649;">Creating</span><span> component /Users/vigoo/projects/demo/shopping-cart-rpc/target/wasm32-wasi/debug/shopping_cart.wasm </span><span> </span><span style="color:#e45649;">Creating</span><span> component /Users/vigoo/projects/demo/shopping-cart-rpc/target/wasm32-wasi/debug/purchase_history_stub.wasm </span><span style="color:#e45649;">[cargo-make]</span><span> INFO - Execute Command: </span><span style="color:#50a14f;">&quot;wasm-rpc-stubgen&quot; &quot;compose&quot; &quot;--source-wasm&quot; &quot;target/wasm32-wasi/debug/shopping_cart.wasm&quot; &quot;--stub-wasm&quot; &quot;target/wasm32-wasi/debug/purchase_history_stub.wasm&quot; &quot;--dest-wasm&quot; &quot;target/wasm32-wasi/debug/shopping_cart_composed.wasm&quot; </span><span style="color:#e45649;">Error:</span><span> no dependencies of component `</span><span 
style="color:#e45649;">target/wasm32-wasi/debug/shopping_cart.wasm</span><span>` were found </span></code></pre> <p>Don't worry about the failure at the end - it will be fixed in the next step.</p> <p>There are several changes in our workspace after running this command:</p> <ul> <li>We have a <code>Makefile.toml</code> file describing custom build tasks related to worker to worker communication</li> <li>We have a completely new sub-project called <code>purchase-history-stub</code> which is added to the Cargo workspace</li> <li>The <code>shopping-cart/wit/deps</code> directory now contains three dependencies: the original purchase history module, the generated stub interface, and the general purpose <code>wasm-rpc</code> package.</li> <li>These dependencies are also registered in <code>shopping-cart/Cargo.toml</code></li> </ul> <p>Before further explaining what these generated stubs are, let's finish our example. We need to modify the <strong>shopping cart</strong> template's interface definition (<code>shopping-cart/wit/shopping-cart.wit</code>) to import the generated stub, and to reuse the data types defined for the purchase history template instead of redefining them.</p> <p>The updated WIT file would look like this:</p> <pre data-lang="wit" style="background-color:#fafafa;color:#383a42;" class="language-wit "><code class="language-wit" data-lang="wit"><span style="color:#a626a4;">package </span><span>shopping:cart; </span><span> </span><span style="color:#a626a4;">interface </span><span>api { </span><span> </span><span style="color:#a626a4;">use </span><span>shopping:purchase-history/api.{</span><span style="color:#e45649;">product-item</span><span>}; </span><span> </span><span style="color:#a626a4;">use </span><span>shopping:purchase-history/api.{</span><span style="color:#e45649;">order</span><span>}; </span><span> </span><span> </span><span style="color:#a626a4;">record </span><span>order-confirmation { </span><span> </span><span 
style="color:#e45649;">order-id</span><span>: </span><span style="color:#a626a4;">string</span><span>, </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">variant </span><span>checkout-result { </span><span> error(</span><span style="color:#a626a4;">string</span><span>), </span><span> success(order-confirmation), </span><span> } </span><span> </span><span> </span><span style="color:#0184bc;">initialize-cart</span><span>: </span><span style="color:#a626a4;">func</span><span>(</span><span style="color:#e45649;">user-id</span><span>: </span><span style="color:#a626a4;">string</span><span>) </span><span style="color:#a626a4;">-&gt; </span><span>(); </span><span> </span><span style="color:#0184bc;">add-item</span><span>: </span><span style="color:#a626a4;">func</span><span>(</span><span style="color:#e45649;">item</span><span>: product-item) </span><span style="color:#a626a4;">-&gt; </span><span>(); </span><span> </span><span style="color:#0184bc;">remove-item</span><span>: </span><span style="color:#a626a4;">func</span><span>(</span><span style="color:#e45649;">product-id</span><span>: </span><span style="color:#a626a4;">string</span><span>) </span><span style="color:#a626a4;">-&gt; </span><span>(); </span><span> </span><span style="color:#0184bc;">update-item-quantity</span><span>: </span><span style="color:#a626a4;">func</span><span>(</span><span style="color:#e45649;">product-id</span><span>: </span><span style="color:#a626a4;">string</span><span>, </span><span style="color:#e45649;">quantity</span><span>: </span><span style="color:#a626a4;">u32</span><span>) </span><span style="color:#a626a4;">-&gt; </span><span>(); </span><span> </span><span style="color:#0184bc;">checkout</span><span>: </span><span style="color:#a626a4;">func</span><span>() </span><span style="color:#a626a4;">-&gt; </span><span>checkout-result; </span><span> </span><span style="color:#0184bc;">get-cart-contents</span><span>: </span><span 
style="color:#a626a4;">func</span><span>() </span><span style="color:#a626a4;">-&gt; list</span><span>&lt;product-item&gt;; </span><span>} </span><span> </span><span style="color:#a626a4;">world </span><span>shopping-cart { </span><span> </span><span style="color:#a626a4;">import </span><span>shopping:purchase-history-stub/stub-purchase-history; </span><span> </span><span style="color:#a626a4;">export </span><span>api; </span><span>} </span></code></pre> <p>There are three changes:</p> <ul> <li>We renamed the package from the default <code>golem:template</code> to <code>shopping:cart</code> to make it more consistent with the other packages.</li> <li>We deleted the definitions of <code>product-item</code> and <code>order</code>, and instead import them from the <code>shopping:purchase-history</code> package.</li> <li>We added the <code>import</code> statement in the <code>world</code>, which loads the generated <strong>stub</strong> into the template's world, so we can call it from the Rust code to initiate remote calls to the <code>purchase-history</code> workers.</li> </ul> <p>Because of the change of the package name, we have to update the import in <code>lib.rs</code>:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">use crate</span><span>::bindings::exports::shopping::cart::api::</span><span style="color:#a626a4;">*</span><span>; </span></code></pre> <p>The only remaining step is to extend the <code>checkout</code> function with the remote worker invocation!</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">use crate</span><span>::bindings::shopping::purchase_history::api::{Order}; </span><span style="color:#a626a4;">use crate</span><span>::bindings::shopping::purchase_history_stub::stub_purchase_history; </span><span 
style="color:#a626a4;">use crate</span><span>::bindings::golem::rpc::types::Uri; </span><span> </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">checkout</span><span>() -&gt; CheckoutResult { </span><span> </span><span style="color:#a0a1a7;">// ... </span><span> </span><span style="color:#0184bc;">dispatch_order</span><span>()</span><span style="color:#a626a4;">?</span><span>; </span><span> </span><span> </span><span style="color:#a0a1a7;">// Defining the order to be saved in history </span><span> </span><span style="color:#a626a4;">let</span><span> order </span><span style="color:#a626a4;">=</span><span> Order { </span><span> items: state.items.</span><span style="color:#0184bc;">clone</span><span>(), </span><span> order_id: order_id.</span><span style="color:#0184bc;">clone</span><span>(), </span><span> timestamp: std::time::SystemTime::now().</span><span style="color:#0184bc;">duration_since</span><span>(std::time::SystemTime::</span><span style="color:#c18401;">UNIX_EPOCH</span><span>).</span><span style="color:#0184bc;">unwrap</span><span>().</span><span style="color:#0184bc;">as_secs</span><span>(), </span><span> total: state.items.</span><span style="color:#0184bc;">iter</span><span>().</span><span style="color:#0184bc;">map</span><span>(|</span><span style="color:#e45649;">item</span><span>| item.price </span><span style="color:#a626a4;">*</span><span> item.quantity </span><span style="color:#a626a4;">as f32</span><span>).</span><span style="color:#0184bc;">sum</span><span>(), </span><span> }; </span><span> </span><span> </span><span style="color:#a0a1a7;">// Constructing the remote worker&#39;s URI </span><span> </span><span style="color:#a626a4;">let</span><span> template_id </span><span style="color:#a626a4;">= </span><span> std::env::var(</span><span style="color:#50a14f;">&quot;PURCHASE_HISTORY_TEMPLATE_ID&quot;</span><span>) </span><span> .</span><span style="color:#0184bc;">expect</span><span>(</span><span 
style="color:#50a14f;">&quot;PURCHASE_HISTORY_TEMPLATE_ID not set&quot;</span><span>); </span><span> </span><span style="color:#a626a4;">let</span><span> uri </span><span style="color:#a626a4;">=</span><span> Uri { </span><span> value: format!(</span><span style="color:#50a14f;">&quot;worker://</span><span style="color:#c18401;">{template_id}</span><span style="color:#50a14f;">/</span><span style="color:#c18401;">{}</span><span style="color:#50a14f;">&quot;</span><span>, state.user_id), </span><span> }; </span><span> </span><span> </span><span style="color:#a0a1a7;">// Connecting to the remote worker and invoking it </span><span> </span><span style="color:#a626a4;">let</span><span> history </span><span style="color:#a626a4;">= </span><span>stub_purchase_history::Api::new(</span><span style="color:#a626a4;">&amp;</span><span>uri); </span><span> history.</span><span style="color:#0184bc;">add_order</span><span>(</span><span style="color:#a626a4;">&amp;</span><span>order); </span><span>} </span></code></pre> <p>With all these changes, running <code>cargo make</code> again will succeed:</p> <pre data-lang="bash" style="background-color:#fafafa;color:#383a42;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#e45649;">$</span><span> cargo make build-flow </span><span style="color:#e45649;">... </span><span style="color:#e45649;">Writing</span><span> composed component to </span><span style="color:#50a14f;">&quot;target/wasm32-wasi/debug/shopping_cart_composed.wasm&quot; </span><span style="color:#e45649;">[cargo-make]</span><span> INFO - Build Done in 7.38 seconds. </span></code></pre> <p>In the updated <code>checkout</code> function, we first create the <code>Order</code> value to be saved in the remote purchase history. Then we read an <strong>environment variable</strong> to figure out the Golem <em>template-id</em> of the purchase history template. 
This is something we need to record when uploading the template to Golem, and set for every shopping cart worker when creating it. The remote URI consists of the template identifier and the <em>worker name</em>, and in our example the worker name is the same as the <strong>user id</strong> that the shopping cart belongs to. This guarantees that we will have a distinct purchase history worker for each user.</p> <p>When we have the URI, we just instantiate the <strong>generated stub</strong> by passing the remote worker's URI, and we get an interface that corresponds to the remote worker's exported interface! This way we can just call <code>add_order</code> on it, passing the constructed order value.</p> <p>Everything else is handled by Golem. If this is the user's first order, a new purchase history worker is created. Otherwise the existing worker is targeted, which is likely in a suspended state and not held in any worker executor's memory. Golem restores the worker's state and invokes the <code>add_order</code> function on it, which adds the new order to the list of orders for that user, in a fully durable way, without the need for a database.</p> <h3 id="how-does-it-work">How does it work?</h3> <p>The generated cargo-make makefile just wraps a couple of <code>golem-cli stubgen</code> commands.</p> <p>First, <code>stubgen generate</code> creates a new Rust crate for each <strong>target</strong> that has a similar interface to the original worker, but all the exported functions and interfaces are wrapped in a resource, which has to be instantiated with a <strong>worker URI</strong>. 
This generated crate can be compiled to a WASM file (or <code>stubgen build</code> can do that automatically) and it also contains a <strong>WIT</strong> file describing this interface.</p> <p>The <code>stubgen add-stub-dependency</code> command takes this generated interface specification and <strong>adds it</strong> to another worker's <code>wit</code> folder, making it a <em>dependency</em> of that worker. So the caller worker does not depend directly on the target worker; it depends on the <strong>generated stub</strong>.</p> <p>If we compile this caller worker to WASM, it will not only require host functions provided by Golem (such as the WASI interfaces or Golem specific APIs) but it will also require an <strong>implementation</strong> of the stub interface. That's where the generated Rust crate comes into the picture: its compiled WASM <strong>implements</strong> (exports) the stub interface while the caller WASM <strong>requires</strong> (imports) it. WASM components can be composed, so by combining the two we get a resulting WASM that no longer imports the stub interface, because it is wired within the component. It only imports the other dependencies the original modules had.</p> <p>One way to do this composition is to use <code>wasm-tools compose</code>, but it is more convenient to use the built-in command of <code>golem-cli</code> (or <code>golem-cloud-cli</code>) for it, called <code>stubgen compose</code>. This is the last step the generated cargo-make file performs when running the <code>build-flow</code> task.</p> <p>The following diagram demonstrates how the components in the example interact with each other:</p> <p><img src="/images/w2w-comm.png" alt="" /></p> <h2 id="conclusion">Conclusion</h2> <p>We have seen how the new Golem tools enable simple, fully-typed communication between <strong>workers</strong>. 
Although the <code>cargo-make</code>-based build demonstrated above is Rust specific, the other <code>stubgen</code> commands are not: they can be used with any language that has WIT binding generator support (see <a href="https://learn.golem.cloud/docs/building-templates/tier-2">Golem's Tier 2 languages</a>): Rust, C, Go, JavaScript, Python and Scala.js.</p> <p>The remote calls are not only simple to use, they are also efficient, and they get translated to direct function calls when the source and the target workers are running on the same <strong>worker executor</strong>. They are also fully durable, like all other external interactions running on Golem. This means we don't have to worry about failures when calling remote workers. Additionally, Golem applies retry policies in case of transient failures, and it makes sure that a remote invocation only happens once.</p> <p>This feature is ready to use both in the <a href="http://github.com/golemcloud/golem">open source</a> and the <a href="https://www.golem.cloud/">cloud version</a>.</p> desert part 1 - features 2024-02-19T00:00:00+00:00 2024-02-19T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/desert-1/ <h2 id="introduction">Introduction</h2> <p>This is the <strong>first part</strong> of a series of blog posts about my serialization library, <a href="https://vigoo.github.io/desert">desert</a>. I also gave an overview of this library at Functional Scala 2022 - you can check the <a href="https://blog.vigoo.dev/posts/desert-1/@posts/funscala2022-talk.md">talk on YouTube if interested</a>.</p> <p>In this post I'm going to give an overview of the features this serialization library provides, and then dive into the details of how it supports evolving data types.</p> <h2 id="where-is-it-coming-from">Where is it coming from?</h2> <p>The idea of creating <code>desert</code> came after some serious disappointment in our previously chosen serialization library. 
It was used for serialization of both persistent Akka actors and the distributed actor messages, and it turned out that just updating the Scala version from 2.12 to 2.13 completely broke our serialization format.</p> <p>None of the alternatives looked good enough to me - I wanted something that is code first and fits well with our functional Scala style. Support for multiple platforms or programming languages was not a requirement.</p> <p>So I started thinking about what a perfect serialization library would look like, at least for our use cases. It was something that has first-class support for ADTs and for Scala's collection libraries (I don't want to see Scala lists serialized via Java reflection ever again!), with a focus on supporting evolution of the serialized data types. We <em>knew</em> that our persisted data and actor messages would change over time, and we had to be able to survive these changes without any downtime.</p> <h2 id="features">Features</h2> <p>Let's just go through all the features provided by the library before we talk about how exactly it supports these kinds of changes in the serialized data structures.</p> <p><code>desert</code> is a Scala library. 
As probably expected, it captures the core concept of binary serialization through a simple <code>trait</code> called <code>BinaryCodec[T]</code>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> BinarySerializer</span><span>[</span><span style="color:#c18401;">T</span><span>] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">serialize</span><span style="color:#c18401;">(</span><span style="color:#e45649;">value</span><span style="color:#c18401;">: T)(</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">context</span><span style="color:#c18401;">: SerializationContext): </span><span style="color:#a626a4;">Unit </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">contramap</span><span style="color:#c18401;">[U](</span><span style="color:#e45649;">f</span><span style="color:#c18401;">: U </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">T): BinarySerializer[U] </span><span style="color:#a626a4;">= </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">contramapOrFail</span><span style="color:#c18401;">[U](</span><span style="color:#e45649;">f</span><span style="color:#c18401;">: U </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">Either[DesertFailure, T]): BinarySerializer[U] </span><span style="color:#a626a4;">= </span><span style="color:#a0a1a7;">// ... 
</span><span style="color:#c18401;">} </span><span> </span><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> BinaryDeserializer</span><span>[</span><span style="color:#c18401;">T</span><span>] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">deserialize</span><span style="color:#c18401;">()(</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">ctx</span><span style="color:#c18401;">: DeserializationContext): T </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">map</span><span style="color:#c18401;">[U](</span><span style="color:#e45649;">f</span><span style="color:#c18401;">: T </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">U): BinaryDeserializer[U] </span><span style="color:#a626a4;">= </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">mapOrFail</span><span style="color:#c18401;">[U](</span><span style="color:#e45649;">f</span><span style="color:#c18401;">: T </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">Either[DesertFailure, U]): BinaryDeserializer[U] </span><span style="color:#a626a4;">= </span><span style="color:#a0a1a7;">// ... 
</span><span style="color:#c18401;">} </span><span> </span><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> BinaryCodec</span><span>[</span><span style="color:#c18401;">T</span><span>] </span><span style="color:#a626a4;">extends </span><span>BinarySerializer[</span><span style="color:#c18401;">T</span><span>] </span><span style="color:#a626a4;">with </span><span>BinaryDeserializer[</span><span style="color:#c18401;">T</span><span>] </span></code></pre> <p>These <code>BinaryCodec</code> instances should be made implicitly available for each type we need to serialize. There are multiple ways to create an instance of a binary codec:</p> <ul> <li>There are many built-in codecs for primitive types, standard collections, date-time classes, etc.</li> <li>The <code>map</code> and <code>contramap</code> operators can be used to construct new codecs from existing ones</li> <li>There is a codec derivation macro for ADTs (case classes and sealed traits / enums)</li> <li>Custom implementations can directly read/write the binary data and access some of the built-in features like the type registry, references, string deduplication and compression</li> <li>It is also possible to define these custom implementations in a more functional way on top of <code>ZPure</code></li> </ul> <p>Under the hood there is a simple, extensible <code>BinaryInput</code> / <code>BinaryOutput</code> abstraction, implemented by default for Java <code>InputStream</code> and <code>OutputStream</code>.</p> <p>On the lowest level, in addition to having an interface for serializing primitive types we also have support for <strong>variable length integer encoding</strong> and for gzip <strong>compression</strong>. 
Custom codecs can also use the built-in <strong>string deduplication</strong> feature, and encode cyclic graphs using support for storing <strong>references</strong>.</p> <p>Sometimes you want to serialize only a part of your data structure - a real-world example we had was a set of <em>typed actor messages</em> where only a subset of the cases were designed to be used between different nodes. Some cases were only used locally, and in those we would store things that are not serializable at all - for example open websocket connection handles. <code>desert</code> supports this with the concepts of <strong>transient fields</strong> and <strong>transient constructors</strong>.</p> <p>What if a field is not an ADT but contains a reference to an arbitrary type with a given interface? Or if we don't know the root type of a message, only a set of possible types which are otherwise unrelated? The library provides a <strong>type registry</strong> for this purpose. Every type registered into it gets an associated identifier, and in places where we don't know the exact type, we can use this identifier to look up the codec by its unique ID in the type registry.</p> <p>On the top level <code>desert</code> also comes with a set of <strong>integration modules</strong>. 
The following modules are available at the time of writing:</p> <ul> <li><code>desert-akka</code> provides helper functions to serialize from/to <code>ByteString</code>, provides codecs for both typed and untyped <code>ActorRef</code>s, and provides an implementation of Akka's <code>Serializer</code> interface.</li> <li><code>desert-cats</code> adds codecs for <code>Validation</code>, <code>NonEmptyList</code>, <code>NonEmptySet</code> and <code>NonEmptyMap</code> from the <a href="https://typelevel.org/cats/">cats library</a>.</li> <li><code>desert-cats-effect</code> gives a <a href="https://typelevel.org/cats-effect/">cats-effect</a> <code>IO</code> version of the top level serialization and deserialization functions</li> <li><code>desert-zio</code> provides a <code>ZIO</code> version of the top level serialization and deserialization functions and adds codecs and helper functions to work with <code>Chunk</code>s</li> <li><code>desert-zio-prelude</code> provides a more functional interface for defining custom codecs, as well as having built-in codecs for</li> <li><code>desert-shardcake</code> provides easy integration with the <a href="https://devsisters.github.io/shardcake/">Shardcake</a> library</li> </ul> <p>There are two more modules which implement the same core functionality, <strong>codec derivation</strong>, with different tradeoffs:</p> <ul> <li><code>desert-shapeless</code> is a <a href="https://github.com/milessabin/shapeless">shapeless</a> based codec deriver, the original implementation of <code>desert</code>'s derivation logic. It only works for <strong>Scala 2</strong> but it has no additional requirements.</li> <li><code>desert-zio-schema</code> is an alternative implementation of the same codec derivation, built on the <code>Deriver</code> feature of <a href="https://zio.dev/zio-schema/">zio-schema</a>. 
This works with both <strong>Scala 2</strong> and <strong>Scala 3</strong>, and is supposed to provide better compile-time error messages, but requires deriving an implicit <code>Schema</code> for each serialized type in addition to the binary codec.</li> </ul> <p>I wrote a <a href="https://blog.vigoo.dev/posts/desert-1/@posts/zio-schema-deriving.md">detailed post about typeclass derivation</a> a few months ago.</p> <h2 id="data-evolution">Data evolution</h2> <p>Let's see in detail what it means that <code>desert</code> supports <em>evolving</em> data structures.</p> <h3 id="primitives-vs-newtype-wrappers">Primitives vs newtype wrappers</h3> <p>Let's start with a simple example: we are serializing a single <code>Int</code>. The default codec just uses the fixed width 32-bit representation of the integer:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">x</span><span>: </span><span style="color:#a626a4;">Int = </span><span style="color:#c18401;">100 </span></code></pre> <p>results in:</p> <table style="border-collapse: initial; border: 0px; width: auto; color: black"> <tr> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">100</td> 
</tr> </table> <p>Imagine that later we decide that <code>Int</code> is just too generic, and what we have here is in fact a <code>Coordinate</code>. We can define a newtype wrapper like the following:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> Coordinate</span><span>(</span><span style="color:#e45649;">value</span><span>: </span><span style="color:#a626a4;">Int</span><span>) </span><span style="color:#a626a4;">extends </span><span>AnyVal </span></code></pre> <p>and then define the binary codec either by using <code>map</code> and <code>contramap</code> on the integer codec, or by using the <code>deriveForWrapper</code> macro:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">object</span><span style="color:#c18401;"> Coordinate { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">implicit val </span><span style="color:#e45649;">codec</span><span style="color:#c18401;">: BinaryCodec[Coordinate] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">DeriveBinaryCodec.deriveForWrapper </span><span style="color:#c18401;">} </span></code></pre> <p>The binary representation of a <code>Coordinate</code> will be exactly the same as for an <code>Int</code>, so we are still fully backward and forward compatible regarding our serialization format:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">x</span><span>: </span><span style="color:#c18401;">Coordinate </span><span style="color:#a626a4;">= </span><span>Coordinate(</span><span 
style="color:#c18401;">100</span><span>) </span></code></pre> <p>results in:</p> <table style="border-collapse: initial; border: 0px; width: auto; color: black"> <tr> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">100</td> </tr> </table> <h3 id="collections">Collections</h3> <p>First let's see what happens if we try to serialize a pair of coordinates:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">xy </span><span style="color:#a626a4;">= </span><span>(Coordinate(</span><span style="color:#c18401;">1</span><span>), Coordinate(</span><span style="color:#c18401;">2</span><span>)) </span></code></pre> <p>results in:</p> <table style="border-collapse: initial; border: 0px; width: auto; color: black"> <tr> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(147, 154, 231); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; 
text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">1</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">2</td> </tr> </table> <p>The binary representation starts with a <code>0</code>, which is an <em>ADT header</em>. We will talk about it later. 
The rest of the data is just a flat representation of the two coordinates, taking in total 9 bytes.</p> <p>Now we start storing arrays of these coordinates:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">coordinates</span><span>: </span><span style="color:#c18401;">Array</span><span>[(</span><span style="color:#c18401;">Coordinate</span><span>, </span><span style="color:#c18401;">Coordinate</span><span>)] </span><span style="color:#a626a4;">= </span><span> Array( </span><span> (Coordinate(</span><span style="color:#c18401;">1</span><span>), Coordinate(</span><span style="color:#c18401;">2</span><span>)), </span><span> (Coordinate(</span><span style="color:#c18401;">3</span><span>), Coordinate(</span><span style="color:#c18401;">4</span><span>)), </span><span> (Coordinate(</span><span style="color:#c18401;">5</span><span>), Coordinate(</span><span style="color:#c18401;">6</span><span>)) </span><span> ) </span></code></pre> <p>Arrays are serialized simply by writing the length of the array as a variable-length integer and then serializing all elements.</p> <table style="border-collapse: initial; border: 0px; width: auto; color: black"> <tr> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(147, 154, 231); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">6</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 
0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">1</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">2</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 
6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">3</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">4</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">5</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: 
monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">6</td> </tr> </table> <p>The variable-length integer encoding of <code>3</code> is <code>6</code>, and that is simply followed by the three 9-byte-long serialized representations of the coordinate pairs.</p> <p>What if we decide we don't want to use <code>Array</code> but ZIO's <code>Chunk</code> instead? Or if we realize our data model is more precise if we talk about a <em>set</em> of coordinate pairs? Nothing! Desert uses the same encoding for all collection types, allowing us to always choose the best data type without worrying about breaking the serialization format. In some collections, such as linked lists, there is no way to know the number of elements without iterating through the whole data set. Desert supports these collection types by writing <code>-1</code> as the number of elements, and then prefixing each element with a single byte where <code>1</code> means there is a next element and <code>0</code> means there is none. 
This is actually exactly the same binary format as a series of <code>Option[T]</code> values where the first and only <code>None</code> represents the end of the sequence.</p> <h3 id="records">Records</h3> <p>Maybe using tuples of coordinates was a good idea in the beginning, but as our data model evolves we want to introduce a named record type instead:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> Point</span><span>(</span><span style="color:#e45649;">x</span><span>: </span><span style="color:#c18401;">Coordinate</span><span>, </span><span style="color:#e45649;">y</span><span>: </span><span style="color:#c18401;">Coordinate</span><span>) </span></code></pre> <p>We can use <code>desert</code>'s codec derivation feature to get a binary codec for this type:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">object</span><span style="color:#c18401;"> Point { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">implicit val </span><span style="color:#e45649;">schema</span><span style="color:#c18401;">: Schema[Point] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">DeriveSchema.gen </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">implicit val </span><span style="color:#e45649;">codec</span><span style="color:#c18401;">: BinaryCodec[Point] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">DerivedBinaryCodec.derive </span><span style="color:#c18401;">} </span></code></pre> <p>When using <code>desert-zio-schema</code> we also need to derive a <code>Schema</code> instance - this is not required when using the <code>desert-shapeless</code> version of the codec derivation.</p> 
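<p>The layouts described so far can be reproduced by hand. The following is a minimal sketch rather than <code>desert</code>'s actual implementation: it does not use the library's API, the names <code>LayoutSketch</code>, <code>zigZagVarInt</code>, <code>encodePair</code> and <code>encodePairs</code> are made up for illustration, and the zig-zag variable-length scheme is an assumption inferred from the examples above, where <code>3</code> is encoded as <code>6</code>.</p>
<pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala">import java.io.{ByteArrayOutputStream, DataOutputStream}

// Hypothetical helper object, not part of desert
object LayoutSketch {

  // Assumed scheme: zig-zag maps signed ints to unsigned ones
  // (0 to 0, -1 to 1, 1 to 2, ...), then 7 bits are written per byte,
  // least significant group first, with the high bit as a continuation flag.
  def zigZagVarInt(value: Int): Array[Byte] = {
    var v = (value &lt;&lt; 1) ^ (value &gt;&gt; 31)
    val out = new ByteArrayOutputStream()
    while ((v &amp; ~0x7f) != 0) {
      out.write((v &amp; 0x7f) | 0x80)
      v = v &gt;&gt;&gt; 7
    }
    out.write(v)
    out.toByteArray
  }

  // A pair of coordinates: one ADT header byte (0),
  // followed by two big-endian 32-bit integers.
  def encodePair(x: Int, y: Int): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new DataOutputStream(bytes)
    out.writeByte(0)
    out.writeInt(x)
    out.writeInt(y)
    out.flush()
    bytes.toByteArray
  }

  // A collection of known size: the element count, then every element.
  def encodePairs(pairs: Seq[(Int, Int)]): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    out.write(zigZagVarInt(pairs.size))
    pairs.foreach { case (x, y) =&gt; out.write(encodePair(x, y)) }
    out.toByteArray
  }

  def main(args: Array[String]): Unit = {
    // prints: 0 0 0 0 1 0 0 0 2
    println(encodePair(1, 2).mkString(" "))
    // prints: 6 0 0 0 0 1 0 0 0 2 0 0 0 0 3 0 0 0 4 0 0 0 0 5 0 0 0 6
    println(encodePairs(Seq((1, 2), (3, 4), (5, 6))).mkString(" "))
  }
}
</code></pre>
<p>Running <code>LayoutSketch.main</code> prints the same byte sequences as the tables in the Collections section. Under the same assumption, the <code>-1</code> written for collections of unknown size would come out as the single byte <code>1</code>.</p>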
<p>Let's see how <code>desert</code> serializes an instance of this <code>Point</code> type:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">pt </span><span style="color:#a626a4;">= </span><span>Point(Coordinate(</span><span style="color:#c18401;">1</span><span>), Coordinate(</span><span style="color:#c18401;">2</span><span>)) </span></code></pre> <p>results in:</p> <table style="border-collapse: initial; border: 0px; width: auto; color: black"> <tr> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(147, 154, 231); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">1</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: 
center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">2</td> </tr> </table> <p>This is exactly the same as the tuple's binary representation, which probably isn't a big surprise as they are structurally equivalent. Still, this is an important property as it allows us to replace any tuple with an equivalent record type while keeping the binary format exactly the same!</p> <p>If we have to change a record's type, we can only change any of its fields if that field's new type has a binary representation compatible with the old one. All the cases described in this post are valid data evolution steps. Besides those, there are a few special types of changes <code>desert</code> supports for records. Let's see!</p> <h3 id="adding-a-field">Adding a field</h3> <p>As a next step let's imagine our data type requires a new field. 
Let's add a <code>z</code> coordinate to our point:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> Point</span><span>(</span><span style="color:#e45649;">x</span><span>: </span><span style="color:#c18401;">Coordinate</span><span>, </span><span style="color:#e45649;">y</span><span>: </span><span style="color:#c18401;">Coordinate</span><span>, </span><span style="color:#e45649;">z</span><span>: </span><span style="color:#c18401;">Coordinate</span><span>) </span><span style="color:#a626a4;">object</span><span style="color:#c18401;"> Point { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">implicit val </span><span style="color:#e45649;">codec</span><span style="color:#c18401;">: BinaryCodec[Point] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">DerivedBinaryCodec.derive </span><span style="color:#c18401;">} </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">pt </span><span style="color:#a626a4;">= </span><span>Point(Coordinate(</span><span style="color:#c18401;">1</span><span>), Coordinate(</span><span style="color:#c18401;">2</span><span>), Coordinate(</span><span style="color:#c18401;">3</span><span>)) </span></code></pre> <p>Serializing this <code>pt</code> value results in:</p> <table style="border-collapse: initial; border: 0px; width: auto; color: black"> <tr> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(147, 154, 231); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; 
background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">1</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">2</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: 
normal">3</td> </tr> </table> <p>If we try to read this value with the <em>deserializer</em> of our original <code>Point</code> type, it will read <code>Point(Coordinate(1), Coordinate(2))</code>, but the next deserialized value will be corrupt as the input stream will point to the beginning of the <code>0, 0, 0, 3</code> value. Similarly, if we tried to read data written by the old <code>Point</code> <em>serializer</em>, the new deserializer would read an extra four bytes from the data stream which, if they exist at all, belong to some other serialized element.</p> <p>The solution for this in <code>desert</code> is to <strong>explicitly document data evolution</strong>. This is done by listing each modification in an <em>attribute</em> called <code>evolutionSteps</code>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>@</span><span style="color:#e45649;">evolutionSteps</span><span>(FieldAdded[</span><span style="color:#c18401;">Coordinate</span><span>](</span><span style="color:#50a14f;">&quot;z&quot;</span><span>, Coordinate(</span><span style="color:#c18401;">0</span><span>))) </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> Point</span><span>(</span><span style="color:#e45649;">x</span><span>: </span><span style="color:#c18401;">Coordinate</span><span>, </span><span style="color:#e45649;">y</span><span>: </span><span style="color:#c18401;">Coordinate</span><span>, </span><span style="color:#e45649;">z</span><span>: </span><span style="color:#c18401;">Coordinate</span><span>) </span><span style="color:#a626a4;">object</span><span style="color:#c18401;"> Point { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">implicit val </span><span style="color:#e45649;">codec</span><span style="color:#c18401;">: BinaryCodec[Point] </span><span style="color:#a626a4;">= </span><span 
style="color:#c18401;">DerivedBinaryCodec.derive </span><span style="color:#c18401;">} </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">pt </span><span style="color:#a626a4;">= </span><span>Point(Coordinate(</span><span style="color:#c18401;">1</span><span>), Coordinate(</span><span style="color:#c18401;">2</span><span>), Coordinate(</span><span style="color:#c18401;">3</span><span>)) </span></code></pre> <p>With this annotation, we mark <code>z</code> as a newly added field, and provide a <em>default value</em> for it which will be used in cases when reading an old version of the serialized data which did not have this field yet. Every time we change the data type we record the change as a new element in this attribute. There are other supported evolution step types as we will see soon.</p> <p>But first let's see what changes in the binary representation of <code>Point</code> now that we added this attribute!</p> <table style="border-collapse: initial; border: 0px; width: auto; color: black"> <tr> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(147, 154, 231); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">1</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">16</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">8</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td 
style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">1</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">2</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">3</td> </tr> </table> <p>Now that we have an <em>evolution step</em> the first byte, which was always <code>0</code> before, 
becomes <code>1</code>. Every evolution step increases this value, which is interpreted as the type's <em>version</em>. For each ADT which has a version other than 0, this first version byte is followed by the binary encoding of the evolution steps. Here the <code>16</code> is the variable-length encoding of the value <code>8</code>, which is the length of the "version 0" part of the data type. This is followed by <code>8</code>, which is just the variable-length encoding of the value <code>4</code>; it represents the <em>field added</em> evolution step and encodes the newly added field's size.</p> <p>With this format, when the <em>old</em> deserializer reads the point, it knows it needs to skip an additional 4 bytes after reading the <code>x</code> and <code>y</code> coordinates. Also, when the <em>new</em> deserializer encounters an old point, that binary data will begin with <code>0</code>, so the deserializer is aware that it's an older version and can set the deserialized value's <code>z</code> coordinate to the provided default.</p> <p>By documenting the data type change we get full forward and backward compatibility in this case. The cost is that instead of <code>13</code> bytes, now each <code>Point</code> takes <code>15</code> bytes.</p> <h3 id="making-a-field-optional">Making a field optional</h3> <p>Another special data type change is making an existing field optional. 
Staying with the previous example we could change our <code>Point</code> type like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>@</span><span style="color:#e45649;">evolutionSteps</span><span>( </span><span> FieldAdded[</span><span style="color:#c18401;">Coordinate</span><span>](</span><span style="color:#50a14f;">&quot;z&quot;</span><span>, Coordinate(</span><span style="color:#c18401;">0</span><span>)), </span><span> FieldMadeOptional(</span><span style="color:#50a14f;">&quot;z&quot;</span><span>) </span><span>) </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> Point</span><span>(</span><span style="color:#e45649;">x</span><span>: </span><span style="color:#c18401;">Coordinate</span><span>, </span><span style="color:#e45649;">y</span><span>: </span><span style="color:#c18401;">Coordinate</span><span>, </span><span style="color:#e45649;">z</span><span>: </span><span style="color:#c18401;">Option</span><span>[</span><span style="color:#c18401;">Coordinate</span><span>]) </span><span style="color:#a626a4;">object</span><span style="color:#c18401;"> Point { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">implicit val </span><span style="color:#e45649;">codec</span><span style="color:#c18401;">: BinaryCodec[Point] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">DerivedBinaryCodec.derive </span><span style="color:#c18401;">} </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">pt </span><span style="color:#a626a4;">= </span><span>Point(Coordinate(</span><span style="color:#c18401;">1</span><span>), Coordinate(</span><span style="color:#c18401;">2</span><span>), None) </span></code></pre> <p>This of course can no longer guarantee full forward and backward compatibility - but it can be useful as an intermediate step in getting rid of 
some unused parts of the data model, while still being able to access it when it's available from older serialized data.</p> <p>This evolution step is represented by a variable-length integer <code>-1</code> in the ADT header. All positive values represent the <em>field added</em> case, with the actual value containing the size of the added field. <code>-1</code> is a special marker for a field made optional, and it is followed by another variable-length integer encoding the position of the field which has been made optional. Then, when serializing the <code>Option</code> field, the value gets prefixed by a <code>1</code> if it is <code>Some</code>, or the whole option is serialized as a single <code>0</code> if it is <code>None</code>.</p> <p>The total serialized record of the above example would look like this:</p> <table style="border-collapse: initial; border: 0px; width: auto; color: black"> <tr> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(147, 154, 231); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">2</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">16</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">2</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(60, 200, 150); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">1</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(60, 200, 150); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">1</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; 
font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(154, 231, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">1</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 154, 147); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">2</td> <td style="border: 1px solid; padding: 6px; text-align: center; background-color: rgb(231, 147, 200); margin: 0px; border-spacing: 1px; font-family: monospace; font-weight: normal">0</td> </tr> </table> <p>The first byte is now <code>2</code> as we have two evolution steps. The next one still defines that the original part of the data is 8 bytes long, the third byte shows that this time the new <em>z</em> field is taking only 1 byte (as it was set to <code>None</code>). 
The header now contains two more bytes, as described above: the first <code>1</code> means a field has been made optional, and the second points to the field.</p> <p>This can still be loaded by the very first point deserializer (or even as the coordinate pair tuple), as everything after the first two coordinates would be skipped. It can also be loaded as a <code>Point</code> with non-optional z coordinate, but only if the serialized data is a <code>Some</code>. So in the above example it would lead to a deserialization error. The change is fully backward compatible so our latest deserializer can still load all the variants we have seen before.</p> <h3 id="removing-a-field">Removing a field</h3> <p>The final special data evolution step supported by the library is <em>removing</em> a field completely. This is more limited than the previous ones though - backward compatibility is easy, newer versions of the deserializer just have to skip the removed fields which they can easily do. But forward compatibility is only possible if the removed field was an <strong>option field</strong> - that's the only type for which <code>desert</code> can automatically provide a default value, <code>None</code>.</p> <p>The binary header for removing a field needs to store the actual <em>field name</em> because it cannot otherwise identify the field which is not actually in the rest of the data set. To make this more space-efficient, <code>desert</code> uses string deduplication and only needs to serialize the actual field name once.</p> <h3 id="sum-types">Sum types</h3> <p>Scala 2 sealed trait hierarchies and Scala 3 enums are simply serialized with the same techniques mentioned above, but with a <em>constructor ID</em> serialized as a prefix to the binary. Constructor identifiers are assigned in order - as the constructors appear in the source code. This means that adding new constructors is backward and forward compatible, as long as they are added as the <em>last</em> constructor. 
Otherwise the identifiers will be rearranged and binary compatibility breaks.</p> <h3 id="transients">Transients</h3> <p>It is possible to make a previously non-transient field transient and maintain binary compatibility. The rules are the same as for <em>removing</em> a field.</p> <h3 id="type-registry">Type registry</h3> <p>As mentioned earlier, a <em>type registry</em> can be used to associate identifiers with types, and then serialize arbitrary values using these identifiers. Maintaining the stability of this mapping is also very important when evolving data types. What if we want to delete a type which was added to the type registry because we never want to use it again, and we already migrated our serialized data and we are sure we will never encounter that ID again during deserialization?</p> <p>We still cannot simply remove the entry from the type registry, because it will break all the following identifiers as they get assigned sequentially. The library has a solution for this - it is possible to register empty placeholders where we previously had an actual type - it will maintain the identifier order, but will lead to a runtime error when that identifier is encountered during deserialization.</p> <h2 id="summary">Summary</h2> <p>In this post I summarized the key features of the <code>desert</code> serialization library, and explained in detail how it supports changes to the data model while trying to keep maximal backward and forward compatibility.</p> <p>In the next post I will show how the same library can be implemented for <strong>Rust</strong>, how the Scala solution maps to different concepts in the other language and what difficulties I've encountered during the migration process.</p> [Video] Beyond OpenAPI @ Functional Scala 2023 2024-01-21T00:00:00+00:00 2024-01-21T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/funscala2023-talk/ <p>My talk at <a href="https://www.functionalscala.com/">Functional Scala 2023</a> about my 
experience with generating client libraries from OpenAPI specifications, and an alternative code-first approach using <a href="https://github.com/zio/zio-http/">ZIO Http</a>.</p> <iframe width="800" height="450" src="https://www.youtube.com/embed/wwKs37GVubg?si=dMEqTmjUgGBhYB38" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe> Type class derivation with ZIO Schema 2023-12-02T00:00:00+00:00 2023-12-02T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/zio-schema-deriving/ <h2 id="introduction">Introduction</h2> <p>Making the compiler automatically <em>derive</em> implementations of a type class for your custom algebraic data types is a common technique in programming languages. Haskell, for example, has built-in syntax for it:</p> <pre data-lang="haskell" style="background-color:#fafafa;color:#383a42;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span style="color:#a626a4;">data </span><span style="color:#c18401;">Literal </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">StringLit String </span><span> </span><span style="color:#a626a4;">| </span><span style="color:#c18401;">BoolLit Bool </span><span> </span><span style="color:#a626a4;">deriving</span><span> (Show) </span></code></pre> <p>and Rust uses macros instantiated by <em>annotations</em> to do the same:</p> <pre data-lang="Rust" style="background-color:#fafafa;color:#383a42;" class="language-Rust "><code class="language-Rust" data-lang="Rust"><span>#[</span><span style="color:#e45649;">derive</span><span>(Debug)] </span><span style="color:#a626a4;">enum </span><span>Literal { </span><span> StringLit(String), </span><span> BoolLit(</span><span style="color:#a626a4;">bool</span><span>) </span><span>} </span></code></pre> <p>Scala 3 has its own syntax for deriving type classes:</p> <pre data-lang="scala" 
style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>enum Literal deriving Show: </span><span> </span><span style="color:#a626a4;">case </span><span>StringLit(</span><span style="color:#e45649;">value</span><span>: </span><span style="color:#c18401;">String</span><span>) </span><span> </span><span style="color:#a626a4;">case </span><span>BoolLit(</span><span style="color:#e45649;">value</span><span>: </span><span style="color:#a626a4;">Boolean</span><span>) </span></code></pre> <p>but the more traditional way that works with Scala 2 as well is to define an implicit in the type's companion object by an explicit macro invocation:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">sealed trait</span><span style="color:#c18401;"> Literal </span><span style="color:#a626a4;">object</span><span style="color:#c18401;"> Literal { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> StringLit(</span><span style="color:#e45649;">value</span><span style="color:#c18401;">: String) </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">Literal </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> BoolLit(</span><span style="color:#e45649;">value</span><span style="color:#c18401;">: Boolean) </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">Literal </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">implicit val </span><span style="color:#e45649;">show</span><span style="color:#c18401;">: Show[Literal] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">DeriveShow[Literal] </span><span style="color:#c18401;">} 
</span></code></pre> <p>All these examples from different languages have one thing in common: in order to automatically generate an implementation for an arbitrary type, we need to be able to gather information about these types as values available at compile time (the macro's runtime), and to generate new code fragments (or an actual abstract syntax tree) which then take part in the compilation, producing the same result as writing the implementation by hand.</p> <p>This means using some kind of macro, depending on which programming language we use. But writing these macros is never easy, and in some cases can be very different from the usual way of writing code - so in each programming language people are writing <em>libraries</em> helping type class derivation in one way or the other.</p> <p>In this post I will show a library like that for Scala, the <code>Deriver</code> feature of <a href="https://zio.dev/zio-schema/">ZIO Schema</a> that I added at the end of last year (2022). But before that let's see a real world example and what alternatives we had.</p> <h2 id="example">Example</h2> <p><a href="https://vigoo.github.io/desert/">Desert</a> is a Scala serialization library I wrote in 2020. 
Not surprisingly, at the core of Desert is a <em>trait</em> that describes serialization and deserialization of a type <code>T</code>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> BinaryCodec</span><span>[</span><span style="color:#c18401;">T</span><span>] </span><span style="color:#a626a4;">extends </span><span>BinarySerializer[</span><span style="color:#c18401;">T</span><span>] </span><span style="color:#a626a4;">with </span><span>BinaryDeserializer[</span><span style="color:#c18401;">T</span><span>] </span><span> </span><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> BinarySerializer</span><span>[</span><span style="color:#c18401;">T</span><span>] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">serialize</span><span style="color:#c18401;">(</span><span style="color:#e45649;">value</span><span style="color:#c18401;">: T)(</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">context</span><span style="color:#c18401;">: SerializationContext): </span><span style="color:#a626a4;">Unit </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... 
</span><span style="color:#c18401;">} </span><span> </span><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> BinaryDeserializer</span><span>[</span><span style="color:#c18401;">T</span><span>] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">deserialize</span><span style="color:#c18401;">()(</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">ctx</span><span style="color:#c18401;">: DeserializationContext): T </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;">} </span></code></pre> <p>Although we can implement these traits manually, in order to take advantage of Desert's type evolution capabilities, for complex types like <em>case classes</em> or <em>enums</em> we want the user to be able to write something like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> Point</span><span>(</span><span style="color:#e45649;">x</span><span>: </span><span style="color:#a626a4;">Int</span><span>, </span><span style="color:#e45649;">y</span><span>: </span><span style="color:#a626a4;">Int</span><span>, </span><span style="color:#e45649;">z</span><span>: </span><span style="color:#a626a4;">Int</span><span>) </span><span style="color:#a626a4;">object</span><span style="color:#c18401;"> Point { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">implicit val </span><span style="color:#e45649;">codec</span><span style="color:#c18401;">: BinaryCodec[Point] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">DerivedBinaryCodec.derive </span><span style="color:#c18401;">} </span></code></pre> <h2 id="alternatives">Alternatives</h2> <h3 
id="scala-3-mirrors">Scala 3 mirrors</h3> <p>First of all, <strong>Scala 3</strong> has some built-in support for implementing derivation macros using its <code>Mirror</code> type, explained in the <a href="https://docs.scala-lang.org/scala3/reference/contextual/derivation.html">official documentation</a>. We can see a simple example of this technique <a href="https://github.com/zio/zio/blob/series%2F2.x/test-magnolia/shared/src/main/scala-3/zio/test/magnolia/DeriveGen.scala">in the ZIO codebase</a> where I have implemented a deriving mechanism for the <code>Gen[R, A]</code> trait which is Scala 3 specific. (The Scala 2 version is using the Magnolia library, introduced below, which did not have a Scala 3 version back then). The <code>Mirror</code> values are summoned by the compiler and they provide the type information:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>inline </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">gen</span><span>[</span><span style="color:#c18401;">T</span><span>](using </span><span style="color:#e45649;">m</span><span>: </span><span style="color:#c18401;">Mirror</span><span>.</span><span style="color:#c18401;">Of</span><span>[</span><span style="color:#c18401;">T</span><span>]): </span><span style="color:#c18401;">DeriveGen</span><span>[</span><span style="color:#c18401;">T</span><span>] </span><span style="color:#a626a4;">= </span><span> </span><span style="color:#a626a4;">new </span><span style="color:#c18401;">DeriveGen</span><span>[</span><span style="color:#c18401;">T</span><span>] { </span><span> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">derive</span><span>: </span><span style="color:#c18401;">Gen</span><span>[</span><span style="color:#a626a4;">Any</span><span>, </span><span style="color:#c18401;">T</span><span>] </span><span style="color:#a626a4;">= </span><span>{ 
</span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">elemInstances </span><span style="color:#a626a4;">=</span><span> summonAll[m.</span><span style="color:#c18401;">MirroredElemTypes</span><span>] </span><span> inline m </span><span style="color:#a626a4;">match </span><span>{ </span><span> </span><span style="color:#a626a4;">case </span><span style="color:#e45649;">s</span><span>: </span><span style="color:#c18401;">Mirror</span><span>.</span><span style="color:#c18401;">SumOf</span><span>[</span><span style="color:#c18401;">T</span><span>] </span><span style="color:#a626a4;">=&gt;</span><span> genSum(s, elemInstances) </span><span> </span><span style="color:#a626a4;">case </span><span style="color:#e45649;">p</span><span>: </span><span style="color:#c18401;">Mirror</span><span>.</span><span style="color:#c18401;">ProductOf</span><span>[</span><span style="color:#c18401;">T</span><span>] </span><span style="color:#a626a4;">=&gt;</span><span> genProduct(p, elemInstances) </span><span> } </span><span> } </span><span> } </span></code></pre> <p>As this function is an <a href="https://docs.scala-lang.org/scala3/reference/metaprogramming/inline.html">inline function</a>, it gets evaluated at compile time, using this summoned <code>Mirror</code> value to produce an implementation of <code>Gen[Any, T]</code>.</p> <p>This is a little low level and requires knowledge of inline functions and things like <code>summonAll</code>, but it is otherwise a relatively easy way to solve the type class derivation problem. 
But it is Scala 3 only.</p> <p>Back in 2020 when I wrote the first version of Desert, there was no Scala 3 at all, and the three main ways to do this were</p> <ul> <li>writing a (Scala 2) macro by hand</li> <li>using <a href="https://github.com/milessabin/shapeless">Shapeless</a></li> <li>using <a href="https://github.com/softwaremill/magnolia">Magnolia</a></li> </ul> <h3 id="scala-2-macros">Scala 2 macros</h3> <p>Writing custom derivation logic with Scala 2 macros is not easy, but it is completely possible. It starts by defining a <a href="https://www.scala-lang.org/api/2.13.12/scala-reflect/scala/reflect/macros/whitebox/Context.html">whitebox macro</a>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">object</span><span style="color:#c18401;"> Derive { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">derive</span><span style="color:#c18401;">[A]: BinaryCodec[A] </span><span style="color:#a626a4;">= macro</span><span style="color:#c18401;"> deriveImpl[A] </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">deriveImpl</span><span style="color:#c18401;">[A</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">c.WeakTypeTag]( </span><span style="color:#c18401;"> </span><span style="color:#e45649;">c</span><span style="color:#c18401;">: whitebox.Context </span><span style="color:#c18401;"> ): c.Tree </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">import</span><span style="color:#c18401;"> c.universe.</span><span style="color:#e45649;">_ </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... 
</span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre> <p>The job of <code>deriveImpl</code> is to examine the type of <code>A</code> and generate a <code>Tree</code> that represents the implementation of the <code>BinaryCodec</code> trait for <code>A</code>. We can start by getting a <code>Type</code> value for <code>A</code>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">tpe</span><span>: </span><span style="color:#c18401;">Type </span><span style="color:#a626a4;">=</span><span> weakTypeOf[</span><span style="color:#c18401;">A</span><span>] </span></code></pre> <p>and then use that to get all kinds of information about this type. For example, to check if it is a <em>case class</em>, we could write</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">isCaseClass</span><span>(</span><span style="color:#e45649;">tpe</span><span>: </span><span style="color:#c18401;">Type</span><span>): </span><span style="color:#a626a4;">Boolean =</span><span> tpe.typeSymbol.asClass.isCaseClass </span></code></pre> <p>and then try to collect all the fields of that case class:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">fields </span><span style="color:#a626a4;">=</span><span> tpe.decls.sorted.collect { </span><span> </span><span style="color:#a626a4;">case </span><span style="color:#e45649;">p</span><span>: </span><span style="color:#c18401;">TermSymbol </span><span style="color:#a626a4;">if</span><span> p.isCaseAccessor &amp;&amp; !p.isMethod 
</span><span style="color:#a626a4;">=&gt;</span><span> p </span><span>} </span></code></pre> <p>As we can see this is a very direct and low level way to work with the types, much harder than the <code>Mirror</code> type we used for Scala 3. Once we gathered all the necessary information for generating the derived type class, we can use <em>quotes</em> to construct fragments of Scala AST:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">fieldSerializationStatements </span><span style="color:#a626a4;">= </span><span style="color:#a0a1a7;">// ... </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">codec </span><span style="color:#a626a4;">= </span><span style="color:#0184bc;">q</span><span style="color:#50a14f;">&quot;&quot;&quot;new BinaryCodec[</span><span style="color:#e45649;">$tpe</span><span style="color:#50a14f;">] { </span><span style="color:#50a14f;"> def serialize(value: </span><span style="color:#e45649;">$tpe</span><span style="color:#50a14f;">)(implicit context: SerializationContext): Unit = { </span><span style="color:#50a14f;"> ..</span><span style="color:#e45649;">$fieldSerializationStatements </span><span style="color:#50a14f;"> } </span><span style="color:#50a14f;">}&quot;&quot;&quot; </span></code></pre> <p>In the end, this quoted <code>codec</code> value is a <code>Tree</code> which we can return from the macro.</p> <h3 id="shapeless">Shapeless</h3> <p><a href="https://github.com/milessabin/shapeless">Shapeless</a> is a library for <em>type level programming</em> in Scala 2 (and there is a <a href="https://github.com/typelevel/shapeless-3">new version</a> for Scala 3 too). It provides things like type-level heterogeneous lists and all kinds of operations on them, and it also defines <em>macros</em> that can convert an arbitrary case class into a <em>generic representation</em>, which is essentially a type level list containing all the fields. 
Similarly it can convert an arbitrary sum type (sealed trait in Scala 2) to a generic representation of coproducts. For example the <code>Point</code> case class we used in an earlier example would be represented like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> Point</span><span>(</span><span style="color:#e45649;">x</span><span>: </span><span style="color:#a626a4;">Int</span><span>, </span><span style="color:#e45649;">y</span><span>: </span><span style="color:#a626a4;">Int</span><span>, </span><span style="color:#e45649;">z</span><span>: </span><span style="color:#a626a4;">Int</span><span>) </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">point</span><span>: </span><span style="color:#c18401;">Point </span><span style="color:#a626a4;">= </span><span>Point(</span><span style="color:#c18401;">1</span><span>, </span><span style="color:#c18401;">2</span><span>, </span><span style="color:#c18401;">3</span><span>) </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">genericPoint</span><span>: </span><span style="color:#a626a4;">Int </span><span>:: </span><span style="color:#a626a4;">Int </span><span>:: </span><span style="color:#a626a4;">Int </span><span>:: </span><span style="color:#c18401;">HNil </span><span style="color:#a626a4;">= </span><span style="color:#a0a1a7;">// type </span><span> </span><span style="color:#c18401;">1</span><span> :: </span><span style="color:#c18401;">2</span><span> :: </span><span style="color:#c18401;">3</span><span> :: HNil </span><span style="color:#a0a1a7;">// value </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">labelledGenericPoint </span><span style="color:#a626a4;">= </span><span style="color:#a0a1a7;">// type too complex to show here </span><span> 
(</span><span style="color:#50a14f;">&quot;x&quot;</span><span> -&gt;&gt; </span><span style="color:#c18401;">1</span><span>) :: (</span><span style="color:#50a14f;">&quot;y&quot;</span><span> -&gt;&gt; </span><span style="color:#c18401;">2</span><span>) :: (</span><span style="color:#50a14f;">&quot;z&quot;</span><span> -&gt;&gt; </span><span style="color:#c18401;">3</span><span>) :: HNil </span><span style="color:#a0a1a7;">// value </span></code></pre> <p>In connection with type class derivation the idea is that by using Shapeless we no longer have to write macros to extract type information for our types - we can work with these generic representations instead using advanced type level programming techniques. So the complexity of writing macros is replaced with the complexity of doing type level computation.</p> <p>Let's see what it would look like. First we start by creating a <code>derive</code> method that gets the type we are deriving the codec for as a type parameter:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">derive</span><span>[</span><span style="color:#c18401;">T</span><span>] </span><span style="color:#a626a4;">= </span><span style="color:#a0a1a7;">// ... </span></code></pre> <p>This <code>T</code> is an arbitrary type, for example our <code>Point</code> structure. In order to get its generic representation provided by Shapeless we have to start using type level techniques, by introducing new type parameters for the things we want to calculate (as types) and implicits to drive these computations. 
The following version, when it compiles, will "calculate" the generic representation of <code>T</code> as the type parameter <code>H</code>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">derive</span><span>[</span><span style="color:#c18401;">T</span><span>, </span><span style="color:#c18401;">H</span><span>](</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">gen</span><span>: </span><span style="color:#c18401;">LabelledGeneric</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">T</span><span>, </span><span style="color:#c18401;">H</span><span>]) </span><span style="color:#a626a4;">= </span><span>{ </span><span> </span><span style="color:#a626a4;">new </span><span style="color:#c18401;">BinaryCodec</span><span>[</span><span style="color:#c18401;">T</span><span>] { </span><span> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">serialize</span><span>(</span><span style="color:#e45649;">value</span><span>: </span><span style="color:#c18401;">T</span><span>)(</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">context</span><span>: </span><span style="color:#c18401;">SerializationContext</span><span>): </span><span style="color:#a626a4;">Unit = </span><span>{ </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">h</span><span>: </span><span style="color:#c18401;">H </span><span style="color:#a626a4;">=</span><span> gen.to(value) </span><span style="color:#a0a1a7;">// generic representation of (value: T) </span><span> </span><span style="color:#a0a1a7;">// ... </span><span> } </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span> } </span><span>} </span></code></pre> <p>This is not that hard yet but we need to recursively summon implicit codecs for our fields, so we can't just use this <code>H</code> value to go through all the fields in a traditional way - we need to traverse it on the type level.</p> <p>To do that we need to write our own type level computations implemented as implicit instances for <code>HNil</code> and <code>::</code> etc. The serialization part of the codec would look something like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">implicit val </span><span style="color:#e45649;">hnilSerializer</span><span>: </span><span style="color:#c18401;">BinarySerializer</span><span>[</span><span style="color:#c18401;">HNil</span><span>] </span><span style="color:#a626a4;">= </span><span> </span><span style="color:#a626a4;">new </span><span style="color:#c18401;">BinarySerializer</span><span>[</span><span style="color:#c18401;">HNil</span><span>] { </span><span> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">serialize</span><span>(</span><span style="color:#e45649;">value</span><span>: </span><span style="color:#c18401;">HNil</span><span>)(</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">context</span><span>: </span><span style="color:#c18401;">SerializationContext</span><span>): </span><span style="color:#a626a4;">Unit = </span><span>{ </span><span> </span><span style="color:#a0a1a7;">// no (more) fields </span><span> } </span><span> } </span><span> </span><span style="color:#a626a4;">implicit def </span><span style="color:#0184bc;">hlistSerializer</span><span>[</span><span style="color:#c18401;">K </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">Symbol</span><span>, </span><span style="color:#c18401;">H</span><span>, </span><span 
style="color:#c18401;">T </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">HList</span><span>](</span><span style="color:#a626a4;">implicit </span><span> </span><span style="color:#e45649;">witness</span><span>: </span><span style="color:#c18401;">Witness</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">K</span><span>], </span><span style="color:#a0a1a7;">// type level extraction of the field&#39;s name </span><span> headSerializer: </span><span style="color:#c18401;">BinarySerializer</span><span>[</span><span style="color:#c18401;">H</span><span>], </span><span style="color:#a0a1a7;">// type class summoning for the field </span><span> tailSerializer: </span><span style="color:#c18401;">BinarySerializer</span><span>[</span><span style="color:#c18401;">T</span><span>] </span><span style="color:#a0a1a7;">// hlist recursion </span><span>): </span><span style="color:#c18401;">BinarySerializer</span><span>[</span><span style="color:#c18401;">FieldType</span><span>[</span><span style="color:#c18401;">K</span><span>, </span><span style="color:#c18401;">H</span><span>] :: </span><span style="color:#c18401;">T</span><span>] </span><span style="color:#a626a4;">= </span><span style="color:#a0a1a7;">// ... </span></code></pre> <p>Similar methods have to be implemented for coproducts too, and also in the codec example we would have to simultaneously derive the serializer <em>and</em> the deserializer. 
A real implementation would also require access to the <em>annotations</em> of various fields to drive the serialization logic, which requires more and more type level calculations and complicates these type signatures.</p> <p>I chose to use Shapeless in the first version of Desert, and the real <code>derive</code> method has the following signature:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">derive</span><span>[</span><span style="color:#c18401;">T</span><span>, </span><span style="color:#c18401;">H</span><span>, </span><span style="color:#c18401;">Ks </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">HList</span><span>, </span><span style="color:#c18401;">Trs </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">HList</span><span>, </span><span style="color:#c18401;">Trcs </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">HList</span><span>, </span><span style="color:#c18401;">KsTrs </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">HList</span><span>, </span><span style="color:#c18401;">TH</span><span>](</span><span style="color:#a626a4;">implicit </span><span> </span><span style="color:#e45649;">gen</span><span>: </span><span style="color:#c18401;">LabelledGeneric</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">T</span><span>, </span><span style="color:#c18401;">H</span><span>], </span><span> </span><span style="color:#e45649;">keys</span><span>: </span><span style="color:#c18401;">Lazy</span><span>[</span><span style="color:#c18401;">Symbols</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">H</span><span>, </span><span style="color:#c18401;">Ks</span><span>]], 
</span><span> </span><span style="color:#e45649;">transientAnnotations</span><span>: </span><span style="color:#c18401;">Annotations</span><span>.</span><span style="color:#c18401;">Aux</span><span>[transientField, </span><span style="color:#c18401;">T</span><span>, </span><span style="color:#c18401;">Trs</span><span>], </span><span> </span><span style="color:#e45649;">transientConstructorAnnotations</span><span>: </span><span style="color:#c18401;">Annotations</span><span>.</span><span style="color:#c18401;">Aux</span><span>[transientConstructor, </span><span style="color:#c18401;">T</span><span>, </span><span style="color:#c18401;">Trcs</span><span>], </span><span> </span><span style="color:#e45649;">taggedTransients</span><span>: </span><span style="color:#c18401;">TagTransients</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">H</span><span>, </span><span style="color:#c18401;">Trs</span><span>, </span><span style="color:#c18401;">Trcs</span><span>, </span><span style="color:#c18401;">TH</span><span>], </span><span> </span><span style="color:#e45649;">zip</span><span>: </span><span style="color:#c18401;">Zip</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">Ks </span><span>:: </span><span style="color:#c18401;">Trs </span><span>:: </span><span style="color:#c18401;">HNil</span><span>, </span><span style="color:#c18401;">KsTrs</span><span>], </span><span> </span><span style="color:#e45649;">toList</span><span>: </span><span style="color:#c18401;">ToTraversable</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">KsTrs</span><span>, </span><span style="color:#c18401;">List</span><span>, (</span><span style="color:#c18401;">Symbol</span><span>, </span><span style="color:#c18401;">Option</span><span>[transientField])], </span><span> </span><span style="color:#e45649;">serializationPlan</span><span>: </span><span 
style="color:#c18401;">Lazy</span><span>[</span><span style="color:#c18401;">SerializationPlan</span><span>[</span><span style="color:#c18401;">TH</span><span>]], </span><span> </span><span style="color:#e45649;">deserializationPlan</span><span>: </span><span style="color:#c18401;">Lazy</span><span>[</span><span style="color:#c18401;">DeserializationPlan</span><span>[</span><span style="color:#c18401;">TH</span><span>]], </span><span> </span><span style="color:#e45649;">toConstructorMap</span><span>: </span><span style="color:#c18401;">Lazy</span><span>[</span><span style="color:#c18401;">ToConstructorMap</span><span>[</span><span style="color:#c18401;">TH</span><span>]], </span><span> </span><span style="color:#e45649;">classTag</span><span>: </span><span style="color:#c18401;">ClassTag</span><span>[</span><span style="color:#c18401;">T</span><span>] </span><span> ): </span><span style="color:#c18401;">BinaryCodec</span><span>[</span><span style="color:#c18401;">T</span><span>] </span></code></pre> <p>Although this works, there are many problems with this approach. All these type and implicit resolutions can make the compilation quite slow, the code is very complex and hard to understand or modify, and most importantly error messages will be a nightmare. A user trying to derive a type class for our serialization library should not get an error that complains about not being able to find an implicit value of <code>Zip.Aux</code> for a weird type that does not even fit on one screen!</p> <h3 id="magnolia">Magnolia</h3> <p>The <a href="https://github.com/softwaremill/magnolia">Magnolia</a> library provides a much more friendly solution for deriving type classes for algebraic data types - it moves the whole problem into the value space by hiding the necessary macros. 
The derivation implementation for a given type class then only requires defining two functions (one for working with products, one for working with coproducts) that are regular Scala functions getting a "context" value and producing an instance of the derived type class. The context value contains type information - for example the name and type of all the fields of a case class - and also contains an <em>instance</em> of the derived type class for each of these inner elements.</p> <p>To write a Magnolia based deriver you have to create an <code>object</code> with a <code>join</code> and a <code>split</code> method and a <code>Typeclass</code> type:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">object</span><span style="color:#c18401;"> BinaryCodecDerivation { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">type </span><span style="color:#c18401;">Typeclass[T] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">BinaryCodec[T] </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">join</span><span style="color:#c18401;">[T](</span><span style="color:#e45649;">ctx</span><span style="color:#c18401;">: CaseClass[BinaryCodec, T]): BinaryCodec[T] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">new </span><span style="color:#c18401;">BinaryCodec[T] { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">serialize</span><span style="color:#c18401;">(</span><span style="color:#e45649;">value</span><span style="color:#c18401;">: T)(</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">context</span><span style="color:#c18401;">: SerializationContext) </span><span 
style="color:#a626a4;">=</span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">for </span><span style="color:#c18401;">(</span><span style="color:#e45649;">parameter </span><span style="color:#a626a4;">&lt;-</span><span style="color:#c18401;"> ctx.parameters) { </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// recursively serialize the fields </span><span style="color:#c18401;"> parameter.typeclass.serialize(parameter.dereference(value)) </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">split</span><span style="color:#c18401;">[T](</span><span style="color:#e45649;">ctx</span><span style="color:#c18401;">: SealedTrait[BinaryCodec, T]): BinaryCodec[T] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">gen</span><span style="color:#c18401;">[T]: BinaryCodec[T] </span><span style="color:#a626a4;">= macro </span><span style="color:#c18401;">Magnolia.gen[T] </span><span style="color:#c18401;">} </span></code></pre> <p>There is a Magnolia version for Scala 3 too. Although it is quite similar, it is not source compatible with the Scala 2 version, so these derivations have to be defined twice in cross-compiled projects.</p> <h2 id="why-not-magnolia">Why not Magnolia?</h2> <p>Magnolia already existed when I wrote the first version of Desert, but I could not use it for two reasons. 
In that early version of the library the derivation had to take a user defined list of <em>evolution steps</em>, so the actual codec definitions looked something like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">object</span><span style="color:#c18401;"> Point { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">implicit val </span><span style="color:#e45649;">codec</span><span style="color:#c18401;">: BinaryCodec[Point] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">BinaryCodec.derive(FieldAdded[</span><span style="color:#a626a4;">Int</span><span style="color:#c18401;">](</span><span style="color:#50a14f;">&quot;z&quot;</span><span style="color:#c18401;">, 1)) </span><span style="color:#c18401;">} </span></code></pre> <p>It was not clear how I could pass these parameters to the Magnolia context - with Shapeless it was not a problem because it is possible to simply pass them as a parameter to the <code>derive</code> function that "starts" the type level computation.</p> <p>This requirement no longer exists though, as in recent versions the <em>evolution steps</em> are defined by annotations, which are fully supported by Magnolia as well:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>@</span><span style="color:#e45649;">evolutionSteps</span><span>(FieldAdded[</span><span style="color:#a626a4;">Int</span><span>](</span><span style="color:#50a14f;">&quot;z&quot;</span><span>, </span><span style="color:#c18401;">1</span><span>)) </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> Point</span><span>(</span><span style="color:#e45649;">x</span><span>: </span><span style="color:#a626a4;">Int</span><span>, </span><span 
style="color:#e45649;">y</span><span>: </span><span style="color:#a626a4;">Int</span><span>, </span><span style="color:#e45649;">z</span><span>: </span><span style="color:#a626a4;">Int</span><span>) </span></code></pre> <p>The second reason was a much more important limitation in Magnolia that still exists - it is not possible to shortcut the derivation tree. Desert has <em>transient field</em> and <em>transient constructor</em> support. For fields and constructors that are marked as transient we don't want to, and cannot, define codec instances. They can be things like open files, streams, actor references, sockets etc. Even though Magnolia only instantiates the type class instances when they are accessed, the derivation fails if there are types in the tree that do not have an instance. This issue is <a href="https://github.com/softwaremill/magnolia/issues/297">tracked here</a>.</p> <p>There was one more decision I did not like regarding Magnolia - the decision to have an incompatible Scala 3 version. I believe it was a big missed opportunity to seamlessly support cross-compiled type class derivation code.</p> <h2 id="zio-schema-based-derivation">ZIO Schema based derivation</h2> <p>All these issues led to writing a new derivation library - as part of the <a href="https://zio.dev/zio-schema/">ZIO Schema</a> project. It was first released in version <a href="https://github.com/zio/zio-schema/releases/tag/v0.3.0">v0.3.0</a> in November of 2022.</p> <p>Of the previously demonstrated type class derivation techniques, the closest to ZIO Schema's deriver is Magnolia. 
On the other hand, it does support the transient field use case, and it is fully cross-compilation compatible between Scala 2 and Scala 3.</p> <p>To implement type class derivation based on ZIO Schema you need to implement a trait called <code>Deriver</code>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> Deriver</span><span>[</span><span style="color:#c18401;">F</span><span>[</span><span style="color:#e45649;">_</span><span>]] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">deriveRecord</span><span style="color:#c18401;">[A]( </span><span style="color:#c18401;"> </span><span style="color:#e45649;">record</span><span style="color:#c18401;">: Schema.Record[A], </span><span style="color:#c18401;"> </span><span style="color:#e45649;">fields</span><span style="color:#c18401;">: </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">Chunk[WrappedF[F, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">]], </span><span style="color:#c18401;"> </span><span style="color:#e45649;">summoned</span><span style="color:#c18401;">: </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">Option[F[A]] </span><span style="color:#c18401;"> ): F[A] </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// more deriveXXX methods to implement </span><span style="color:#c18401;">} </span></code></pre> <p>This looks similar to Magnolia's <code>join</code> method but has some significant differences. The first thing to notice is that we get a <code>Schema.Record</code> value describing our case class. 
This is one of the cases of the core data type <code>Schema[T]</code> which describes Scala data types and provides a lot of features to work with them. So having a <code>Schema[A]</code> is a requirement to derive an <code>F[A]</code> with <code>Deriver</code> - but luckily ZIO Schema has derivation support for <code>Schema</code> itself.</p> <p>The second thing to notice is that <code>Schema[A]</code> itself does not know anything about type class derivation and especially about the actual <code>F</code> type class that is being derived, so the second parameter of <code>deriveRecord</code> is a collection of potentially derived instances of our derived type class for each field. <code>WrappedF</code> is just making this lazy so if we decide we don't need instances for (some of) the fields they won't be traversed (they still need to have a <code>Schema</code> though - but it can even be a <code>Schema.fail</code> for things not representable by ZIO Schema - it will be fine if we never touch them by unwrapping the <code>WrappedF</code> value).</p> <p>The third parameter is also interesting as it provides full control to the developer to choose between the summoned implicit and the derivation logic. If your <code>deriveRecord</code> is called for a record type <code>A</code> and there is already an implicit <code>F[A]</code> that the compiler can find (for example defined in <code>A</code>'s companion object), it will be passed in the <code>summoned</code> parameter to <code>deriveRecord</code>. The usual logic is to choose the summoned value when it is available and only derive an instance when there isn't any. 
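</p> <p>A minimal sketch of this usual logic in plain Scala, independent of ZIO Schema. The names <code>Show</code>, <code>chooseInstance</code> and <code>Point</code> are hypothetical and only introduced for illustration - they are not part of ZIO Schema or Desert. Both arguments are passed by name, mirroring the <code>=&gt;</code> parameters of <code>deriveRecord</code>, so the fallback derivation is only evaluated when nothing was summoned:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala">// Hypothetical stand-in type class, used only for this illustration
trait Show[A] {
  def show(value: A): String
}

object SummonedOrDerived {
  // Prefer an instance the compiler already found, otherwise fall back to derivation.
  // Both parameters are by-name, so `derived` is never evaluated when `summoned` is defined.
  def chooseInstance[A](summoned: =&gt; Option[Show[A]])(derived: =&gt; Show[A]): Show[A] =
    summoned.getOrElse(derived)
}

final case class Point(x: Int, y: Int)

object SummonedOrDerivedExample {
  // Plays the role of an implicit found in the companion object
  val handWritten: Show[Point] = new Show[Point] {
    def show(p: Point): String = s"Point(${p.x}, ${p.y})"
  }

  // Plays the role of the derived instance
  val fallback: Show[Point] = new Show[Point] {
    def show(p: Point): String = p.toString
  }

  // The hand-written instance wins because it was "summoned"
  val chosen: Show[Point] =
    SummonedOrDerived.chooseInstance(Some(handWritten))(fallback)
}
</code></pre> <p>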
By calling <code>.autoAcceptSummoned</code> on our <code>Deriver</code> class we can automatically enable this behavior - in this case <code>deriveRecord</code> will only be called for the cases where <code>summoned</code> was <code>None</code>.</p> <p>Another method we have on <code>Deriver</code> is <code>.cached</code> which stores the generated type class instances in a concurrent hash map shared between the macro invocations.</p> <p>Our ZIO Schema based Desert codec derivation is defined using these modifiers:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">object</span><span style="color:#c18401;"> DerivedBinaryCodec { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">lazy val </span><span style="color:#e45649;">deriver </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">BinaryCodecDeriver().cached.autoAcceptSummoned </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private final case class</span><span style="color:#c18401;"> BinaryCodecDeriver() </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">Deriver[BinaryCodec] { </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre> <p>As ZIO Schema is not only describing records and enums but also primitive types, tuples, and special cases like <code>Option</code> and <code>Either</code> and collection types, the deriver has to support all these.</p> <p>The minimum set of methods to implement is <code>deriveRecord</code>, <code>deriveEnum</code>, <code>derivePrimitive</code>, <code>deriveOption</code>, <code>deriveSequence</code>, <code>deriveMap</code> and <code>deriveTransformedRecord</code>. 
In addition to that we can also override <code>deriveEither</code>, <code>deriveSet</code> and <code>deriveTupleN</code> (1-22) to handle these cases specially.</p> <p>In the case of Desert, <code>deriveRecord</code> and <code>deriveEnum</code> call into the implementation of the same data-evolution aware binary format that was previously implemented using Shapeless, but this time it automatically supports Scala 2 and Scala 3 at the same time. <code>derivePrimitive</code> just chooses from predefined <code>BinaryCodec</code> instances based on the primitive's type:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">derivePrimitive</span><span>[</span><span style="color:#c18401;">A</span><span>]( </span><span> </span><span style="color:#e45649;">st</span><span>: </span><span style="color:#c18401;">StandardType</span><span>[</span><span style="color:#c18401;">A</span><span>], </span><span> </span><span style="color:#e45649;">summoned</span><span>: </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">Option</span><span>[</span><span style="color:#c18401;">BinaryCodec</span><span>[</span><span style="color:#c18401;">A</span><span>]] </span><span>): </span><span style="color:#c18401;">BinaryCodec</span><span>[</span><span style="color:#c18401;">A</span><span>] </span><span style="color:#a626a4;">= </span><span> st </span><span style="color:#a626a4;">match </span><span>{ </span><span> </span><span style="color:#a626a4;">case </span><span>StandardType.UnitType </span><span style="color:#a626a4;">=&gt;</span><span> unitCodec </span><span> </span><span style="color:#a626a4;">case </span><span>StandardType.StringType </span><span style="color:#a626a4;">=&gt;</span><span> stringCodec </span><span> </span><span style="color:#a626a4;">case 
</span><span>StandardType.BoolType </span><span style="color:#a626a4;">=&gt;</span><span> booleanCodec </span><span> </span><span style="color:#a626a4;">case </span><span>StandardType.ByteType </span><span style="color:#a626a4;">=&gt;</span><span> byteCodec </span><span> </span><span style="color:#a0a1a7;">// ... </span><span> } </span></code></pre> <p>The same applies to option, either, sequence etc. - it is just a mapping to the library's own definitions of these binary codecs.</p> <p>Under the hood <code>Deriver</code> is a macro (implemented separately both for Scala 2 and Scala 3) that traverses the types simultaneously with the provided <code>Schema</code> (so it does not need to regenerate those) and maps this information into calls through the <code>Deriver</code> interface. The whole process is initiated by calling the <code>derive</code> method on our <code>Deriver</code>, which is the entry point of these macros, so it has a different looking (but source-code compatible) definition for Scala 2 and Scala 3:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a0a1a7;">// Scala 3 </span><span>inline </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">derive</span><span>[</span><span style="color:#c18401;">A</span><span>](</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">schema</span><span>: </span><span style="color:#c18401;">Schema</span><span>[</span><span style="color:#c18401;">A</span><span>]): </span><span style="color:#c18401;">F</span><span>[</span><span style="color:#c18401;">A</span><span>] </span><span> </span><span style="color:#a0a1a7;">// Scala 2 </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">derive</span><span>[</span><span style="color:#c18401;">F</span><span>[</span><span style="color:#e45649;">_</span><span>], </span><span 
style="color:#c18401;">A</span><span>](</span><span style="color:#e45649;">deriver</span><span>: </span><span style="color:#c18401;">Deriver</span><span>[</span><span style="color:#c18401;">F</span><span>])( </span><span> </span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">schema</span><span>: </span><span style="color:#c18401;">Schema</span><span>[</span><span style="color:#c18401;">A</span><span>] </span><span>): </span><span style="color:#c18401;">F</span><span>[</span><span style="color:#c18401;">A</span><span>] </span><span style="color:#a626a4;">= macro</span><span> deriveImpl[</span><span style="color:#c18401;">F</span><span>, </span><span style="color:#c18401;">A</span><span>] </span></code></pre> <p>These are compatible if you are directly calling them: so you can write</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">binaryCodecDeriver</span><span>: </span><span style="color:#c18401;">Deriver</span><span>[</span><span style="color:#c18401;">BinaryCodec</span><span>] </span><span style="color:#a626a4;">= </span><span style="color:#a0a1a7;">// ... 
</span><span style="color:#a626a4;">val </span><span style="color:#e45649;">pointCodec</span><span>: </span><span style="color:#c18401;">BinaryCodec</span><span>[</span><span style="color:#c18401;">Point</span><span>] </span><span style="color:#a626a4;">=</span><span> binaryCodecDeriver.derive[</span><span style="color:#c18401;">Point</span><span>] </span></code></pre> <p>Or even:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">object</span><span style="color:#c18401;"> BinaryCodecDeriver </span><span style="color:#a626a4;">extends </span><span>Deriver[</span><span style="color:#c18401;">BinaryCodec</span><span>] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;">} </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">pointCodec</span><span>: </span><span style="color:#c18401;">BinaryCodec</span><span>[</span><span style="color:#c18401;">Point</span><span>] </span><span style="color:#a626a4;">= </span><span>BinaryCodecDeriver.derive[</span><span style="color:#c18401;">Point</span><span>] </span></code></pre> <p>But if you want to wrap this derive call you have to be aware that they are macro calls, and they have to be wrapped by (version-specific) macros. This is what Desert is doing - as shown before, it uses the <code>cached</code> and <code>autoAcceptSummoned</code> modifiers to create a deriver, but still exposes a simple <code>derive</code> method through an <code>object</code>. 
To do so it needs to wrap the inner deriver macro with its own macro like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a0a1a7;">// Scala 2 </span><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> DerivedBinaryCodecVersionSpecific { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">deriver</span><span style="color:#c18401;">: Deriver[BinaryCodec] </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">derive</span><span style="color:#c18401;">[T](</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">schema</span><span style="color:#c18401;">: Schema[T]): BinaryCodec[T] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">macro </span><span style="color:#c18401;">DerivedBinaryCodecVersionSpecific.deriveImpl[T] </span><span style="color:#c18401;">} </span><span> </span><span style="color:#a626a4;">object</span><span style="color:#c18401;"> DerivedBinaryCodecVersionSpecific { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">deriveImpl</span><span style="color:#c18401;">[T</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">c.WeakTypeTag]( </span><span style="color:#c18401;"> </span><span style="color:#e45649;">c</span><span style="color:#c18401;">: whitebox.Context)( </span><span style="color:#c18401;"> </span><span style="color:#e45649;">schema</span><span style="color:#c18401;">: c.Expr[Schema[T]] </span><span style="color:#c18401;"> ): c.Tree </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span 
style="color:#a626a4;">import</span><span style="color:#c18401;"> c.universe.</span><span style="color:#e45649;">_ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">tpe </span><span style="color:#a626a4;">=</span><span style="color:#c18401;"> weakTypeOf[T] </span><span style="color:#c18401;"> </span><span style="color:#0184bc;">q</span><span style="color:#50a14f;">&quot;_root_.zio.schema.Derive.derive[BinaryCodec, </span><span style="color:#e45649;">$tpe</span><span style="color:#50a14f;">] (_root_.io.github.vigoo.desert.zioschema.DerivedBinaryCodec.deriver)(</span><span style="color:#e45649;">$schema</span><span style="color:#50a14f;">)&quot; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span><span> </span><span style="color:#a0a1a7;">// Scala 3 </span><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> DerivedBinaryCodecVersionSpecific { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">lazy val </span><span style="color:#e45649;">deriver</span><span style="color:#c18401;">: Deriver[BinaryCodec] </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> inline </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">derive</span><span style="color:#c18401;">[T](</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">schema</span><span style="color:#c18401;">: Schema[T]): BinaryCodec[T] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> Derive.derive[BinaryCodec, T](DerivedBinaryCodec.deriver) </span><span style="color:#c18401;">} </span></code></pre> <h2 id="conclusion">Conclusion</h2> <p>We have a new alternative for deriving type class instances from type information, based on ZIO Schema. 
You may want to use it if you want to have a single deriver source code for both Scala 2 and Scala 3, if you need more flexibility than what Magnolia provides, or if you are already using ZIO Schema in your project.</p> Generating a Rust client library for ZIO Http endpoints 2023-09-07T00:00:00+00:00 2023-09-07T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/generating-a-rust-client-library-for-zio-http-endpoints/ <p>We at <a href="https://golem.cloud">Golem Cloud</a> built our first developer preview on top of the ZIO ecosystem, including <a href="https://github.io/zio/zio-http">ZIO Http</a> for defining and implementing our server's REST API. By using <strong>ZIO Http</strong> we immediately had the ability to call our endpoints using endpoint <strong>client</strong>s, which allowed us to develop the first version of Golem's <strong>CLI tool</strong> very rapidly.</p> <p>Although very convenient for development, <em>using</em> a CLI tool built with Scala for the JVM is not a pleasant experience for the users due to the slow startup time. One possible solution is to compile to native using <a href="https://www.graalvm.org/22.0/reference-manual/native-image/">GraalVM Native Image</a> but it is very hard to set up and even when it works, it is extremely fragile - further changes to the code or updated dependencies can break it causing unexpected extra maintenance cost. 
After some initial experiments we dropped this idea - and instead chose to reimplement the CLI using <strong>Rust</strong> - a language that is a much better fit for command line tools, and also already an important technology in our Golem stack.</p> <h2 id="zio-http">ZIO Http</h2> <p>If we rewrite <code>golem-cli</code> in Rust, we lose the convenience of using <strong>endpoint definitions</strong> (written in Scala with ZIO Http, the ones we have for implementing the server) for calling our API, and we would also lose all the <strong>types</strong> used in these APIs as they are all defined as Scala case classes and enums. Just to have more context, let's take a look at one of the endpoints!</p> <p>A ZIO Http <strong>endpoint</strong> is just a definition of a single endpoint of an HTTP API, describing the routing as well as its inputs and outputs:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">getWorkerMetadata </span><span style="color:#a626a4;">= </span><span> Endpoint(GET / </span><span style="color:#50a14f;">&quot;v1&quot;</span><span> / </span><span style="color:#50a14f;">&quot;templates&quot;</span><span> / rawTemplateId / </span><span style="color:#50a14f;">&quot;workers&quot;</span><span> / workerName) </span><span> .header(Auth.tokenSecret) </span><span> .outErrorCodec(errorCodec) </span><span> .out[</span><span style="color:#c18401;">WorkerMetadata</span><span>] ?? 
Doc.p(</span><span style="color:#50a14f;">&quot;Get the current worker status and metadata&quot;</span><span>) </span></code></pre> <p>Let's see what we have here:</p> <ul> <li>the endpoint is reached by sending a <strong>GET</strong> request</li> <li>the request <strong>path</strong> consists of some static segments as well as the <em>template id</em> and the <em>worker name</em></li> <li>it also requires an <strong>authorization header</strong></li> <li>we define the kind of errors it can return</li> <li>and finally it defines that the response's <strong>body</strong> will contain a JSON representation (default in ZIO Http) of a type called <code>WorkerMetadata</code></li> </ul> <p>What are <code>rawTemplateId</code> and <code>workerName</code>? These are so called <strong>path codecs</strong>, defined in a common place so they can be reused in multiple endpoints. They allow us to have dynamic parts of the request path mapped to specific types - so when we implement the endpoint (or call it in a client) we don't have to pass strings and we can directly work with the business domain types, in this case <code>RawTemplateId</code> and <code>WorkerName</code>.</p> <p>The simplest way to define path codecs is to <strong>transform</strong> an existing one:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">workerName</span><span>: </span><span style="color:#c18401;">PathCodec</span><span>[</span><span style="color:#c18401;">WorkerName</span><span>] </span><span style="color:#a626a4;">= </span><span> string(</span><span style="color:#50a14f;">&quot;worker-name&quot;</span><span>).transformOrFailLeft(WorkerName.make(</span><span style="color:#e45649;">_</span><span>).toErrorEither, </span><span style="color:#e45649;">_</span><span>.value) </span></code></pre> <p>Here the <code>make</code> function is a 
<strong>ZIO Prelude</strong> <a href="https://zio.github.io/zio-prelude/docs/functionaldatatypes/validation"><code>Validation</code></a> which we have to convert to an <code>Either</code> for the transform function. Validations can contain more than one failure, as opposed to <code>Either</code>s, which allows us to compose them in a way that keeps multiple errors instead of immediately returning with the first failure.</p> <p>The <code>tokenSecret</code> is similar, but it is a <code>HeaderCodec</code> describing what type of header it is and how the value of the given header should be mapped to a specific type (a token, in this case).</p> <p>What is <code>WorkerMetadata</code> and how does ZIO Http know how to produce a JSON from it?</p> <p>It's just a simple <em>case class</em>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> WorkerMetadata</span><span>( </span><span> </span><span style="color:#e45649;">workerId</span><span>: </span><span style="color:#c18401;">ComponentInstanceId</span><span>, </span><span> </span><span style="color:#e45649;">accountId</span><span>: </span><span style="color:#c18401;">AccountId</span><span>, </span><span> </span><span style="color:#e45649;">args</span><span>: </span><span style="color:#c18401;">Chunk</span><span>[</span><span style="color:#c18401;">String</span><span>], </span><span> </span><span style="color:#e45649;">env</span><span>: </span><span style="color:#c18401;">Map</span><span>[</span><span style="color:#c18401;">String</span><span>, </span><span style="color:#c18401;">String</span><span>], </span><span> </span><span style="color:#e45649;">status</span><span>: </span><span style="color:#c18401;">InstanceStatus</span><span>, </span><span> </span><span style="color:#e45649;">templateVersion</span><span>: </span><span 
style="color:#a626a4;">Int</span><span>, </span><span> </span><span style="color:#e45649;">retryCount</span><span>: </span><span style="color:#a626a4;">Int </span><span>) </span></code></pre> <p>But with an implicit <strong>derived</strong> <strong>ZIO Schema</strong>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">object</span><span style="color:#c18401;"> WorkerMetadata { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">implicit val </span><span style="color:#e45649;">schema</span><span style="color:#c18401;">: Schema[WorkerMetadata] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">DeriveSchema.gen[WorkerMetadata] </span><span style="color:#c18401;">} </span></code></pre> <p>We will talk more about ZIO Schema below - for now all we need to know is it describes the structure of Scala types, and this information can be used to serialize data into various formats, including JSON.</p> <p>Once we have our endpoints defined like this, we can do several things with them - they are just data describing what an endpoint looks like!</p> <h3 id="implementing-an-endpoint">Implementing an endpoint</h3> <p>When developing a <em>server</em>, the most important thing to do with an endpoint is to <strong>implement</strong> it. 
Implementing an endpoint looks like the following:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">getWorkerMetadataImpl </span><span style="color:#a626a4;">= </span><span> getWorkerMetadata.implement { </span><span> Handler.fromFunctionZIO { (</span><span style="color:#e45649;">rawTemplateId</span><span>, </span><span style="color:#e45649;">workerName</span><span>, </span><span style="color:#e45649;">authTokenId</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span> </span><span style="color:#a0a1a7;">// ... ZIO program returning a WorkerMetadata </span><span> } </span><span> } </span></code></pre> <p>The <em>type</em> of <code>getWorkerMetadataImpl</code> is <code>Route</code> - it is no longer just a description of what an endpoint looks like, it defines a specific HTTP route and its associated <em>request handler</em>, implemented by a ZIO effect (remember that ZIO effects are also values - we <em>describe</em> what we need to do when a request comes in, but executing it will be the responsibility of the server implementation).</p> <p>The nice thing about ZIO Http endpoints is that they are completely type safe. 
I've hidden the type signature in the previous code snippets but actually <code>getWorkerMetadata</code> has the type:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>Endpoint[ </span><span> (</span><span style="color:#c18401;">RawTemplateId</span><span>, </span><span style="color:#c18401;">WorkerName</span><span>), </span><span> (</span><span style="color:#c18401;">RawTemplateId</span><span>, </span><span style="color:#c18401;">WorkerName</span><span>, </span><span style="color:#c18401;">TokenSecret</span><span>), </span><span> </span><span style="color:#c18401;">WorkerEndpointError</span><span>, </span><span> </span><span style="color:#c18401;">WorkerMetadata</span><span>, </span><span> </span><span style="color:#c18401;">None </span><span>] </span></code></pre> <p>Here the <em>second</em> type parameter defines the <strong>input</strong> of the request handler and the <em>fourth</em> type parameter defines the <strong>output</strong> the server constructs the response from.</p> <p>With these types, we really just have to implement a (ZIO) function from the input to the output:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>(</span><span style="color:#e45649;">RawTemplateId</span><span>, </span><span style="color:#e45649;">WorkerName</span><span>, </span><span style="color:#e45649;">TokenSecret</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span>ZIO[</span><span style="color:#a626a4;">Any</span><span>, </span><span style="color:#c18401;">WorkerEndpointError</span><span>, </span><span style="color:#c18401;">WorkerMetadata</span><span>] </span></code></pre> <p>and this is exactly what we pass to <code>Handler.fromFunctionZIO</code> in the above example.</p> <h3 id="calling-an-endpoint">Calling an endpoint</h3> <p>The same endpoint values can 
also be used to make requests to our API from clients such as <code>golem-cli</code>. Taking advantage of the same type safe representation we can just call <code>apply</code> on the endpoint definition passing its input as a parameter to get an <strong>invocation</strong>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">invocation </span><span style="color:#a626a4;">=</span><span> getWorkerMetadata(rawTemplateId, workerName, token) </span></code></pre> <p>This invocation can be <strong>executed</strong> to perform the actual request using an <code>EndpointExecutor</code> which can be easily constructed from a ZIO Http <code>Client</code> and some other parameters like the URL of the remote server:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>executor(invocation).flatMap { </span><span style="color:#e45649;">workerMetadata </span><span style="color:#a626a4;">=&gt; </span><span> </span><span style="color:#a0a1a7;">// ... </span><span>} </span></code></pre> <h2 id="the-task">The task</h2> <p>So can we do anything to keep this convenient way of calling our endpoints when migrating the CLI to Rust? At the time of writing we already had more than 60 endpoints, with many complex types used in them - defining them by hand in Rust, and keeping the Scala and Rust code in sync sounds like a nightmare.</p> <p>The ideal case would be to have something like this in Rust:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">async_trait</span><span>] </span><span style="color:#a626a4;">pub trait </span><span>Worker { </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span> async </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">get_worker_metadata</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">template_id</span><span>: </span><span style="color:#a626a4;">&amp;</span><span>TemplateId, </span><span style="color:#e45649;">worker_name</span><span>: </span><span style="color:#a626a4;">&amp;</span><span>WorkerName, </span><span style="color:#e45649;">authorization</span><span>: </span><span style="color:#a626a4;">&amp;</span><span>Token) -&gt; Result&lt;WorkerMetadata, WorkerError&gt;; </span><span>} </span></code></pre> <p>with an implementation that just requires the same amount of configuration as the Scala endpoint executor (server URL, etc), and all the referenced types like <code>WorkerMetadata</code> would be an exact clone of the Scala types just in Rust.</p> <p>Fortunately we can have (almost) this by taking advantage of the declarative nature of ZIO Http and ZIO Schema!</p> <p>In the rest of this post we will see how we can <strong>generate Rust code</strong> using a combination of ZIO libraries to automatically have all our type definitions and client implementation ready to use from the Rust version of <code>golem-cli</code>.</p> <h2 id="the-building-blocks">The building blocks</h2> <p>We want to generate from an arbitrary set of ZIO Http <code>Endpoint</code> definitions a <strong>Rust crate</strong> ready to be compiled, published and used. 
We will take advantage of the following libraries:</p> <ul> <li><a href="https://zio.dev/zio-http/">ZIO Http</a> as the source of <strong>endpoint</strong> definitions</li> <li><a href="https://zio.dev/zio-schema/">ZIO Schema</a> for observing the <strong>type</strong> definitions</li> <li><a href="https://zio.dev/zio-parser/">ZIO Parser</a> because it has a composable <strong>printer</strong> concept</li> <li><a href="https://zio.dev/zio-nio/">ZIO NIO</a> for working with the <strong>filesystem</strong></li> <li><a href="https://zio.dev/zio-prelude/">ZIO Prelude</a> for implementing the stateful endpoint/type discovery in a purely functional way</li> </ul> <h2 id="generating-rust-code">Generating Rust code</h2> <p>Let's start with the actual source code generation. This is something that can be done in many different ways - one extreme could be to just concatenate strings (or use a <code>StringBuilder</code>) while the other is to build a full real Rust <em>AST</em> and pretty print that. I gave a <a href="https://blog.vigoo.dev/posts/funscala2021-talk/">talk at Functional Scala 2021 about the topic</a>.</p> <p>For this task I chose a technique which is somewhere in the middle and provides some degree of composability while also allowing us to do just the amount of abstraction we want. The idea is that we define a <em>Rust code generator model</em> which does not have to strictly follow the actual generated language's concepts, and then define a pretty printer for this model. 
This way we only have to model the subset of the language we need for the code generator, and we can keep simplifications or even complete string fragments in it if that makes our life easier.</p> <p>Let's see how this works with some examples!</p> <p>We will have to generate <em>type definitions</em> so we can define a Scala <em>enum</em> describing what kind of type definitions we want to generate:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>enum RustDef: </span><span> </span><span style="color:#a626a4;">case </span><span>TypeAlias(</span><span style="color:#e45649;">name</span><span>: </span><span style="color:#c18401;">Name</span><span>, </span><span style="color:#e45649;">typ</span><span>: </span><span style="color:#c18401;">RustType</span><span>, </span><span style="color:#e45649;">derives</span><span>: </span><span style="color:#c18401;">Chunk</span><span>[</span><span style="color:#c18401;">RustType</span><span>]) </span><span> </span><span style="color:#a626a4;">case </span><span>Newtype(</span><span style="color:#e45649;">name</span><span>: </span><span style="color:#c18401;">Name</span><span>, </span><span style="color:#e45649;">typ</span><span>: </span><span style="color:#c18401;">RustType</span><span>, </span><span style="color:#e45649;">derives</span><span>: </span><span style="color:#c18401;">Chunk</span><span>[</span><span style="color:#c18401;">RustType</span><span>]) </span><span> </span><span style="color:#a626a4;">case </span><span>Struct(</span><span style="color:#e45649;">name</span><span>: </span><span style="color:#c18401;">Name</span><span>, </span><span style="color:#e45649;">fields</span><span>: </span><span style="color:#c18401;">Chunk</span><span>[</span><span style="color:#c18401;">RustDef</span><span>.</span><span style="color:#c18401;">Field</span><span>], </span><span style="color:#e45649;">derives</span><span>: </span><span 
style="color:#c18401;">Chunk</span><span>[</span><span style="color:#c18401;">RustType</span><span>], </span><span style="color:#e45649;">isPublic</span><span>: </span><span style="color:#a626a4;">Boolean</span><span>) </span><span> </span><span style="color:#a626a4;">case </span><span>Enum(</span><span style="color:#e45649;">name</span><span>: </span><span style="color:#c18401;">Name</span><span>, </span><span style="color:#e45649;">cases</span><span>: </span><span style="color:#c18401;">Chunk</span><span>[</span><span style="color:#c18401;">RustDef</span><span>], </span><span style="color:#e45649;">derives</span><span>: </span><span style="color:#c18401;">Chunk</span><span>[</span><span style="color:#c18401;">RustType</span><span>]) </span><span> </span><span style="color:#a626a4;">case </span><span>Impl(</span><span style="color:#e45649;">tpe</span><span>: </span><span style="color:#c18401;">RustType</span><span>, </span><span style="color:#e45649;">functions</span><span>: </span><span style="color:#c18401;">Chunk</span><span>[</span><span style="color:#c18401;">RustDef</span><span>]) </span><span> </span><span style="color:#a626a4;">case </span><span>ImplTrait(</span><span style="color:#e45649;">implemented</span><span>: </span><span style="color:#c18401;">RustType</span><span>, </span><span style="color:#e45649;">forType</span><span>: </span><span style="color:#c18401;">RustType</span><span>, </span><span style="color:#e45649;">functions</span><span>: </span><span style="color:#c18401;">Chunk</span><span>[</span><span style="color:#c18401;">RustDef</span><span>]) </span><span> </span><span style="color:#a626a4;">case </span><span>Function(</span><span style="color:#e45649;">name</span><span>: </span><span style="color:#c18401;">Name</span><span>, </span><span style="color:#e45649;">parameters</span><span>: </span><span style="color:#c18401;">Chunk</span><span>[</span><span style="color:#c18401;">RustDef</span><span>.</span><span 
style="color:#c18401;">Parameter</span><span>], </span><span style="color:#e45649;">returnType</span><span>: </span><span style="color:#c18401;">RustType</span><span>, </span><span style="color:#e45649;">body</span><span>: </span><span style="color:#c18401;">String</span><span>, </span><span style="color:#e45649;">isPublic</span><span>: </span><span style="color:#a626a4;">Boolean</span><span>) </span></code></pre> <p>We can make this as convenient to use as we want, for example adding constructors like:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">struct</span><span>(</span><span style="color:#e45649;">name</span><span>: </span><span style="color:#c18401;">Name</span><span>, </span><span style="color:#e45649;">fields</span><span>: </span><span style="color:#c18401;">Field</span><span style="color:#a626a4;">*</span><span>): </span><span style="color:#c18401;">RustDef </span></code></pre> <p>The <code>Name</code> is an opaque string type with extension methods to convert between various cases like pascal case, snake case, etc. <code>RustType</code> is a similar <em>enum</em> to <code>RustDef</code>, containing all the different type descriptions we will have to use. 
But it is definitely not how a proper Rust parser would define what a type is - for example we can have a <code>RustType.Option</code> as a shortcut for wrapping a Rust type in Rust's own option type, just because it makes our code generator simpler to write.</p> <p>So once we have this model (which in practice evolves together with the code generator, usually starting with a few simple case classes) we can use <strong>ZIO Parser</strong>'s printer feature to define composable elements constructing Rust source code.</p> <p>We start by defining a module and a type alias for our printer:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">object</span><span style="color:#c18401;"> Rust</span><span>: </span><span> </span><span style="color:#a626a4;">type </span><span>Rust[</span><span style="color:#a626a4;">-</span><span style="color:#c18401;">A</span><span>] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">Printer</span><span>[</span><span style="color:#c18401;">String</span><span>, </span><span style="color:#a626a4;">Char</span><span>, </span><span style="color:#c18401;">A</span><span>] </span></code></pre> <p>and then just define building blocks - what these building blocks are depends completely on us, and the only thing it affects is how well you can compose them. 
Having very small building blocks may reduce the readability of the code generator, but using too large chunks reduces their composability and makes it harder to change or refactor.</p> <p>We can define some short aliases for often used characters or string fragments:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">gt</span><span>: </span><span style="color:#c18401;">Rust</span><span>[</span><span style="color:#a626a4;">Any</span><span>] </span><span style="color:#a626a4;">= </span><span>Printer.print(</span><span style="color:#c18401;">&#39;&gt;&#39;</span><span>) </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">lt</span><span>: </span><span style="color:#c18401;">Rust</span><span>[</span><span style="color:#a626a4;">Any</span><span>] </span><span style="color:#a626a4;">= </span><span>Printer.print(</span><span style="color:#c18401;">&#39;&lt;&#39;</span><span>) </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">bracketed</span><span>[</span><span style="color:#c18401;">A</span><span>](</span><span style="color:#e45649;">inner</span><span>: </span><span style="color:#c18401;">Rust</span><span>[</span><span style="color:#c18401;">A</span><span>]): </span><span style="color:#c18401;">Rust</span><span>[</span><span style="color:#c18401;">A</span><span>] </span><span style="color:#a626a4;">= </span><span> lt ~ inner ~ gt </span></code></pre> <p>and we have to define <code>Rust</code> printers for each of our model types. 
For example for the <code>RustType</code> enum it could be something like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">typename</span><span>: </span><span style="color:#c18401;">Rust</span><span>[</span><span style="color:#c18401;">RustType</span><span>] </span><span style="color:#a626a4;">= </span><span>Printer.byValue: </span><span> </span><span style="color:#a626a4;">case </span><span>RustType.Primitive(</span><span style="color:#e45649;">name</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> str(name) </span><span> </span><span style="color:#a626a4;">case </span><span>RustType.Option(</span><span style="color:#e45649;">inner</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> typename(RustType.Primitive(</span><span style="color:#50a14f;">&quot;Option&quot;</span><span>)) ~ bracketed(typename(inner)) </span><span> </span><span style="color:#a626a4;">case </span><span>RustType.Vec(</span><span style="color:#e45649;">inner</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> typename(RustType.Primitive(</span><span style="color:#50a14f;">&quot;Vec&quot;</span><span>)) ~ bracketed(typename(inner)) </span><span> </span><span style="color:#a626a4;">case </span><span>RustType.SelectFromModule(</span><span style="color:#e45649;">path</span><span>, </span><span style="color:#e45649;">typ</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span>Printer.anyString.repeatWithSep(dcolon)(path) ~ dcolon ~ typename(typ) </span><span> </span><span style="color:#a626a4;">case </span><span>RustType.Parametric(</span><span style="color:#e45649;">name</span><span>, </span><span style="color:#e45649;">params</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span> str(name) ~ bracketed(typename.repeatWithSep(comma)(params)) 
</span><span> </span><span style="color:#a0a1a7;">// ... </span></code></pre> <p>We can see that <code>typename</code> uses itself to recursively generate inner type names, for example when generating type parameters of tuple members. It also demonstrates that we can extract patterns such as <code>bracketed</code> to simplify our printer definitions and eliminate repetition.</p> <p>Another nice feature we get by using a general purpose printer library like ZIO Parser is that we can use the built-in combinators to get printers for new types. One example is the sequential composition of printers. For example the following fragment:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">p </span><span style="color:#a626a4;">=</span><span> str(</span><span style="color:#50a14f;">&quot;pub &quot;</span><span>) ~ name ~ str(</span><span style="color:#50a14f;">&quot;: &quot;</span><span>) ~ typename </span></code></pre> <p>would have the type <code>Rust[(Name, RustType)]</code> and we can even make that a printer of a case class like:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> PublicField</span><span>(</span><span style="color:#e45649;">name</span><span>: </span><span style="color:#c18401;">Name</span><span>, </span><span style="color:#e45649;">typ</span><span>: </span><span style="color:#c18401;">RustType</span><span>) </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">p2 </span><span style="color:#a626a4;">=</span><span> p.from[</span><span style="color:#c18401;">PublicField</span><span>] </span></code></pre> <p>where <code>p2</code> will have the type <code>Rust[PublicField]</code>.</p> 
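<p>As a rough illustration of what these two steps amount to, here is a self-contained Scala 3 sketch that does not use ZIO Parser at all. <code>MiniPrinter</code>, <code>prefixedBy</code> and <code>contramap</code> are hypothetical names made up for this example, and names and types are modelled as plain strings to keep it short:</p>
<pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala">// Hypothetical stand-in for a printer: a function from a value to text.
// This is NOT ZIO Parser's Printer, it only mimics the shape of `~` and `from`.
final class MiniPrinter[A](val render: A =&gt; String):
  // Sequential composition: the result prints a pair, left part first.
  def ~[B](that: MiniPrinter[B]): MiniPrinter[(A, B)] =
    new MiniPrinter[(A, B)](pair =&gt; render(pair._1) + that.render(pair._2))

  // Put a fixed fragment in front of the output, keeping the input type.
  def prefixedBy(text: String): MiniPrinter[A] =
    new MiniPrinter[A](a =&gt; text + render(a))

  // Adapt the printer to another input type, similar in spirit to `from`.
  def contramap[B](f: B =&gt; A): MiniPrinter[B] =
    new MiniPrinter[B](b =&gt; render(f(b)))

// Simplified model: both the field name and its type are plain strings here.
final case class PublicField(name: String, typ: String)

val name: MiniPrinter[String] = new MiniPrinter[String](identity)
val typename: MiniPrinter[String] = new MiniPrinter[String](identity)

// Prints a (name, type) pair as `pub name: type`.
val p: MiniPrinter[(String, String)] =
  name.prefixedBy(&quot;pub &quot;) ~ typename.prefixedBy(&quot;: &quot;)

// The same printer, but taking the case class as its input.
val p2: MiniPrinter[PublicField] = p.contramap(f =&gt; (f.name, f.typ))

@main def demo(): Unit =
  println(p2.render(PublicField(&quot;worker_id&quot;, &quot;String&quot;)))
</code></pre>
<p>Running <code>demo</code> prints <code>pub worker_id: String</code>. The real <code>Printer</code> also tracks an error type and an output element type, as the <code>Rust[-A]</code> alias above shows, but composing small printers into bigger ones follows the same idea.</p>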
<p>Another very useful combinator is <strong>repetition</strong>. For example if we have a printer for an enum's case:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">enumCase</span><span>: </span><span style="color:#c18401;">Rust</span><span>[</span><span style="color:#c18401;">RustDef</span><span>] </span><span style="color:#a626a4;">= </span><span style="color:#a0a1a7;">// ... </span></code></pre> <p>we can simply use one of the repetition combinators to make a printer for a <em>list of enum cases</em>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">enumCases</span><span>: </span><span style="color:#c18401;">Rust</span><span>[</span><span style="color:#c18401;">Chunk</span><span>[</span><span style="color:#c18401;">RustDef</span><span>]] </span><span style="color:#a626a4;">=</span><span> enumCase.* </span></code></pre> <p>or as in the <code>typename</code> example above:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>typename.repeatWithSep(comma) </span></code></pre> <p>to have a <code>Rust[Chunk[RustType]]</code> that inserts a comma between each element when printed.</p> <h2 id="inspecting-the-scala-types">Inspecting the Scala types</h2> <p>As we have seen the <em>endpoint DSL</em> uses <strong>ZIO Schema</strong> to capture information about the types being used in the endpoints (usually as request or response bodies, serialized into JSON). 
We can use the same information to generate <strong>Rust types</strong> from our Scala types!</p> <p>The core data type defined by the ZIO Schema library is called <code>Schema</code>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">sealed trait</span><span style="color:#c18401;"> Schema</span><span>[</span><span style="color:#c18401;">A</span><span>] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;">} </span></code></pre> <p>Schema describes the structure of a Scala type <code>A</code> in a way we can inspect it from regular Scala code. Let's imagine we have <code>Schema[WorkerMetadata]</code> coming from our endpoint definition and we have to generate an equivalent Rust <code>struct</code> with the same field names and field types.</p> <p>The first thing to notice is that type definitions are recursive. Unless <code>WorkerMetadata</code> only contains fields of <em>primitive types</em> such as integer or string, our job does not end with generating a single Rust struct - we need to recursively generate all the other types <code>WorkerMetadata</code> is depending on! 
To capture this fact let's introduce a type that represents everything we have to extract from a single (or a set of) schemas in order to generate Rust types from them:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> RustModel</span><span>( </span><span> </span><span style="color:#e45649;">typeRefs</span><span>: </span><span style="color:#c18401;">Map</span><span>[</span><span style="color:#c18401;">Schema</span><span>[</span><span style="color:#e45649;">?</span><span>], </span><span style="color:#c18401;">RustType</span><span>], </span><span> </span><span style="color:#e45649;">definitions</span><span>: </span><span style="color:#c18401;">Chunk</span><span>[</span><span style="color:#c18401;">RustDef</span><span>], </span><span> </span><span style="color:#e45649;">requiredCrates</span><span>: </span><span style="color:#c18401;">Set</span><span>[</span><span style="color:#c18401;">Crate</span><span>] </span><span>) </span></code></pre> <p>We have <code>typeRefs</code> which associates a <code>RustType</code> with a schema so we can use it in future steps of our code generator to refer to a generated type in our Rust codebase. We have a list of <code>RustDef</code> values which are the generated type definitions, ready to be printed with our <code>Rust</code> pretty printer. And finally we can also gather a set of required extra Rust <em>crates</em>, because some of the types considered <em>primitive types</em> by ZIO Schema do not have proper representations in the Rust standard library, only in external crates. 
Examples are UUIDs and various date/time types.</p> <p>So our job now is to write a function of</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">fromSchemas</span><span>(</span><span style="color:#e45649;">schemas</span><span>: </span><span style="color:#c18401;">Seq</span><span>[</span><span style="color:#c18401;">Schema</span><span>[</span><span style="color:#e45649;">?</span><span>]]): </span><span style="color:#c18401;">Either</span><span>[</span><span style="color:#c18401;">String</span><span>, </span><span style="color:#c18401;">RustModel</span><span>] </span></code></pre> <p>The <code>Either</code> result type is used to indicate failures. Even if we write a transformation that can produce from any <code>Schema</code> a proper <code>RustModel</code>, we always have to have an error result when working with ZIO Schema because it has an explicit failure case called <code>Schema.Fail</code>. If we process a schema and end up with a <code>Fail</code> node, we can't do anything else than fail our code generator.</p> <p>There are many important details to consider when implementing this function, but let's just see first what the actual <code>Schema</code> type looks like. 
When we have a value of <code>Schema[?]</code> we can pattern match on it and implement the following cases:</p> <ul> <li><code>Schema.Primitive</code> describes a primitive type - there are a lot of primitive types defined by ZIO Schema's <code>StandardType</code> enum</li> <li><code>Schema.Enum</code> describes a type with multiple cases (a <em>sum type</em>) such as a <code>sealed trait</code> or <code>enum</code></li> <li><code>Schema.Record</code> describes a type with multiple fields (a <em>product type</em>) such as a <code>case class</code></li> <li><code>Schema.Map</code> represents a <em>map</em> with a key and value type</li> <li><code>Schema.Sequence</code> represents a <em>sequence</em> of items of a given element type</li> <li><code>Schema.Set</code> is a <em>set</em> of items of a given element type</li> <li><code>Schema.Optional</code> represents an <em>optional</em> type (like an <code>Option[T]</code>)</li> <li><code>Schema.Either</code> is a special case of sum types representing either one or the other type (like an <code>Either[A, B]</code>)</li> <li><code>Schema.Lazy</code> is used to safely encode recursive types, it contains a function that evaluates into an inner <code>Schema</code></li> <li><code>Schema.Dynamic</code> represents a type that is dynamic - like a <code>JSON</code> value</li> <li><code>Schema.Transform</code> assigns a transformation function that converts a <em>value</em> of a type represented by the schema to a value of some other type. As we have no way to inspect these functions (they are compiled Scala functions) in our code generator, this is not very interesting for us now.</li> <li><code>Schema.Fail</code> as already mentioned represents a failure in describing the data type</li> </ul> <p>When traversing a <code>Schema</code> recursively (for any reason), it is important to keep in mind that it <em>can</em> encode recursive types! 
A simple example is a binary tree:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> Tree</span><span>[</span><span style="color:#c18401;">A</span><span>](</span><span style="color:#e45649;">label</span><span>: </span><span style="color:#c18401;">A</span><span>, </span><span style="color:#e45649;">left</span><span>: </span><span style="color:#c18401;">Option</span><span>[</span><span style="color:#c18401;">Tree</span><span>[</span><span style="color:#c18401;">A</span><span>]], </span><span style="color:#e45649;">right</span><span>: </span><span style="color:#c18401;">Option</span><span>[</span><span style="color:#c18401;">Tree</span><span>[</span><span style="color:#c18401;">A</span><span>]]) </span></code></pre> <p>We can construct a <code>Schema[Tree[A]]</code> if we have a <code>Schema[A]</code>. This will be something like (pseudo-code):</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">lazy val </span><span style="color:#e45649;">tree</span><span>: </span><span style="color:#c18401;">Schema</span><span>[</span><span style="color:#c18401;">Tree</span><span>[</span><span style="color:#c18401;">A</span><span>]] </span><span style="color:#a626a4;">= </span><span> Schema.Record( </span><span> Field(</span><span style="color:#50a14f;">&quot;label&quot;</span><span>, Schema[</span><span style="color:#c18401;">A</span><span>]), </span><span> Field(</span><span style="color:#50a14f;">&quot;left&quot;</span><span>, Schema.Optional(Schema.Lazy(() </span><span style="color:#a626a4;">=&gt;</span><span> tree))), </span><span> Field(</span><span style="color:#50a14f;">&quot;right&quot;</span><span>, Schema.Optional(Schema.Lazy(() </span><span style="color:#a626a4;">=&gt;</span><span> tree))) </span><span> ) </span></code></pre> <p>If we are not prepared for recursive types we can easily get into an endless loop (or stack overflow) 
when processing these schemas.</p> <p>This is just one example of things to keep track of while converting a schema into a set of Rust definitions. If fields refer to the self type we want to use <code>Box</code> to put them on the heap. We also need to keep track of whether everything within a generated type derives <code>Ord</code> and <code>Hash</code> - and if so, we should derive the same type classes for our generated type as well.</p> <p>My preferred way to implement such recursive stateful transformation functions is to use <strong>ZIO Prelude</strong>'s <code>ZPure</code> type. Its type definition looks a little scary:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">sealed trait</span><span style="color:#c18401;"> ZPure</span><span>[</span><span style="color:#a626a4;">+</span><span style="color:#c18401;">W</span><span>, </span><span style="color:#a626a4;">-</span><span style="color:#c18401;">S1</span><span>, </span><span style="color:#a626a4;">+</span><span style="color:#c18401;">S2</span><span>, </span><span style="color:#a626a4;">-</span><span style="color:#c18401;">R</span><span>, </span><span style="color:#a626a4;">+</span><span style="color:#c18401;">E</span><span>, </span><span style="color:#a626a4;">+</span><span style="color:#c18401;">A</span><span>] </span></code></pre> <p><code>ZPure</code> describes a <em>purely functional computation</em> which:</p> <ul> <li>Can emit log entries of type <code>W</code></li> <li>Works with an initial state of type <code>S1</code></li> <li>Results in a final state of type <code>S2</code></li> <li>Has access to some context of type <code>R</code></li> <li>Can fail with a value of <code>E</code></li> <li>Or succeed with a value of <code>A</code></li> </ul> <p>In this case we need the state, failure and result types only, but we could also take advantage of <code>W</code> to 
log debug information within our schema transformation function.</p> <p>To make it easier to work with <code>ZPure</code> we can introduce a <em>type alias</em>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">type </span><span>Fx[</span><span style="color:#a626a4;">+</span><span style="color:#c18401;">A</span><span>] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">ZPure</span><span>[</span><span style="color:#a626a4;">Nothing</span><span>, </span><span style="color:#c18401;">State</span><span>, </span><span style="color:#c18401;">State</span><span>, </span><span style="color:#a626a4;">Any</span><span>, </span><span style="color:#c18401;">String</span><span>, </span><span style="color:#c18401;">A</span><span>] </span></code></pre> <p>where <code>State</code> is our own <em>case class</em> containing everything we need:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> State</span><span>( </span><span> </span><span style="color:#e45649;">typeRefs</span><span>: </span><span style="color:#c18401;">Map</span><span>[</span><span style="color:#c18401;">Schema</span><span>[</span><span style="color:#e45649;">?</span><span>], </span><span style="color:#c18401;">RustType</span><span>], </span><span> </span><span style="color:#e45649;">definitions</span><span>: </span><span style="color:#c18401;">Chunk</span><span>[</span><span style="color:#c18401;">RustDef</span><span>], </span><span> </span><span style="color:#e45649;">requiredCrates</span><span>: </span><span style="color:#c18401;">Set</span><span>[</span><span style="color:#c18401;">Crate</span><span>], </span><span> </span><span style="color:#e45649;">processed</span><span>: </span><span 
style="color:#c18401;">Set</span><span>[</span><span style="color:#c18401;">Schema</span><span>[</span><span style="color:#e45649;">?</span><span>]], </span><span> </span><span style="color:#e45649;">stack</span><span>: </span><span style="color:#c18401;">Chunk</span><span>[</span><span style="color:#c18401;">Schema</span><span>[</span><span style="color:#e45649;">?</span><span>]], </span><span> </span><span style="color:#e45649;">nameTypeIdMap</span><span>: </span><span style="color:#c18401;">Map</span><span>[</span><span style="color:#c18401;">Name</span><span>, </span><span style="color:#c18401;">Set</span><span>[</span><span style="color:#c18401;">TypeId</span><span>]], </span><span> </span><span style="color:#e45649;">schemaCaps</span><span>: </span><span style="color:#c18401;">Map</span><span>[</span><span style="color:#c18401;">Schema</span><span>[</span><span style="color:#e45649;">?</span><span>], </span><span style="color:#c18401;">Capabilities</span><span>] </span><span>) </span></code></pre> <p>We won't get into the details of the state type here, but I'm showing some fragments to get a feeling of working with <code>ZPure</code> values.</p> <p>Some helper functions to manipulate the state can make our code much easier to read:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">private def </span><span style="color:#0184bc;">getState</span><span>: </span><span style="color:#c18401;">Fx</span><span>[</span><span style="color:#c18401;">State</span><span>] </span><span style="color:#a626a4;">= </span><span>ZPure.get[</span><span style="color:#c18401;">State</span><span>] </span><span style="color:#a626a4;">private def </span><span style="color:#0184bc;">updateState</span><span>(</span><span style="color:#e45649;">f</span><span>: </span><span style="color:#c18401;">State </span><span style="color:#a626a4;">=&gt; </span><span 
style="color:#c18401;">State</span><span>): </span><span style="color:#c18401;">Fx</span><span>[</span><span style="color:#a626a4;">Unit</span><span>] </span><span style="color:#a626a4;">= </span><span>ZPure.update[</span><span style="color:#c18401;">State</span><span>, </span><span style="color:#c18401;">State</span><span>](f) </span></code></pre> <p>For example we can use <code>updateState</code> to manipulate the <code>stack</code> field of the state around another computation - before running it, we add a schema to the stack, after that we remove it:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">private def </span><span style="color:#0184bc;">stacked</span><span>[</span><span style="color:#c18401;">A</span><span>, </span><span style="color:#c18401;">R</span><span>](</span><span style="color:#e45649;">schema</span><span>: </span><span style="color:#c18401;">Schema</span><span>[</span><span style="color:#c18401;">A</span><span>])(</span><span style="color:#e45649;">f</span><span>: </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">Fx</span><span>[</span><span style="color:#c18401;">R</span><span>]): </span><span style="color:#c18401;">Fx</span><span>[</span><span style="color:#c18401;">R</span><span>] </span><span style="color:#a626a4;">= </span><span> updateState(</span><span style="color:#e45649;">s </span><span style="color:#a626a4;">=&gt;</span><span> s.copy(stack </span><span style="color:#a626a4;">=</span><span> s.stack :+ schema)) </span><span> .zipRight(f) </span><span> .zipLeft(updateState(</span><span style="color:#e45649;">s </span><span style="color:#a626a4;">=&gt;</span><span> s.copy(stack </span><span style="color:#a626a4;">=</span><span> s.stack.dropRight(</span><span style="color:#c18401;">1</span><span>)))) </span></code></pre> <p>This allows us to decide whether we have to wrap a generated 
field's type in <code>Box</code> in the Rust code:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">private def </span><span style="color:#0184bc;">boxIfNeeded</span><span>[</span><span style="color:#c18401;">A</span><span>](</span><span style="color:#e45649;">schema</span><span>: </span><span style="color:#c18401;">Schema</span><span>[</span><span style="color:#c18401;">A</span><span>]): </span><span style="color:#c18401;">Fx</span><span>[</span><span style="color:#c18401;">RustType</span><span>] </span><span style="color:#a626a4;">= </span><span> </span><span style="color:#a626a4;">for </span><span> state &lt;- getState </span><span> backRef </span><span style="color:#a626a4;">=</span><span> state.stack.contains(schema) </span><span> rustType &lt;- getRustType(schema) </span><span> </span><span style="color:#a626a4;">yield if</span><span> backRef then RustType.box(rustType) </span><span style="color:#a626a4;">else</span><span> rustType </span></code></pre> <p>By looking into <code>state.stack</code> we can decide whether we are dealing with a recursive type, and box the field accordingly.</p> <p>Another example is to guard against infinite recursion when traversing the schema definition, as I explained before. 
We can define a helper function that just keeps track of all the visited schemas and shortcuts the computation if something has already been seen:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">private def </span><span style="color:#0184bc;">ifNotProcessed</span><span>[</span><span style="color:#c18401;">A</span><span>](</span><span style="color:#e45649;">value</span><span>: </span><span style="color:#c18401;">Schema</span><span>[</span><span style="color:#c18401;">A</span><span>])(</span><span style="color:#e45649;">f</span><span>: </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">Fx</span><span>[</span><span style="color:#a626a4;">Unit</span><span>]): </span><span style="color:#c18401;">Fx</span><span>[</span><span style="color:#a626a4;">Unit</span><span>] </span><span style="color:#a626a4;">= </span><span> getState.</span><span style="color:#e45649;">flatMap</span><span>: state </span><span style="color:#a626a4;">=&gt; </span><span> </span><span style="color:#a626a4;">if</span><span> state.processed.contains(value) then ZPure.unit </span><span> </span><span style="color:#a626a4;">else</span><span> updateState(</span><span style="color:#e45649;">_</span><span>.copy(processed </span><span style="color:#a626a4;">=</span><span> state.processed + value)).zipRight(f) </span></code></pre> <p>Putting all these smaller combinators together we have an easy-to-read core recursive transformation function for converting the schema:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">private def </span><span style="color:#0184bc;">process</span><span>[</span><span style="color:#c18401;">A</span><span>](</span><span style="color:#e45649;">schema</span><span>: </span><span 
style="color:#c18401;">Schema</span><span>[</span><span style="color:#c18401;">A</span><span>]): </span><span style="color:#c18401;">Fx</span><span>[</span><span style="color:#a626a4;">Unit</span><span>] </span><span style="color:#a626a4;">= </span><span> ifNotProcessed(schema): </span><span> getRustType(schema).</span><span style="color:#e45649;">flatMap</span><span>: typeRef </span><span style="color:#a626a4;">=&gt; </span><span> stacked(schema): </span><span> schema </span><span style="color:#a626a4;">match </span><span> </span><span style="color:#a0a1a7;">// ... </span></code></pre> <p>In the end, to run an <code>Fx[A]</code> all we need to do is to provide an initial state:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>processSchema.provideState(State.empty).runEither </span></code></pre> <h2 id="inspecting-the-endpoints">Inspecting the endpoints</h2> <p>We generated Rust code for all our types but we still need to generate HTTP clients. The basic idea is the same as what we have seen so far:</p> <ul> <li>Traverse the <code>Endpoint</code> data structure for each endpoint we have</li> <li>Generate some intermediate model</li> <li>Pretty print this model to Rust code</li> </ul> <p>The conversion once again is recursive, can fail, and requires keeping track of various things, so we can use <code>ZPure</code> to implement it. Instead of repeating the same details, in this section we will talk about what exactly the endpoint descriptions look like and what we have to be aware of when trying to process them.</p> <p>The first problem to solve is that currently ZIO Http does not have a concept of multiple endpoints. We are not composing <code>Endpoint</code> values into an API, instead we first <strong>implement</strong> them to get <code>Route</code> values and compose those. 
We can no longer inspect the endpoint definitions from the composed routes, so unfortunately we have to repeat ourselves and somehow compose our set of endpoints for our code generator.</p> <p>First we can define a <code>RustEndpoint</code> class, similar to the <code>RustModel</code> earlier, containing all the necessary information to generate Rust code for a <strong>single endpoint</strong>.</p> <p>We can construct it with a function:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a0a1a7;">// ... </span><span style="color:#a626a4;">object</span><span style="color:#c18401;"> RustEndpoint</span><span>: </span><span> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">fromEndpoint</span><span>[</span><span style="color:#c18401;">PathInput</span><span>, </span><span style="color:#c18401;">Input</span><span>, </span><span style="color:#c18401;">Err</span><span>, </span><span style="color:#c18401;">Output</span><span>, </span><span style="color:#c18401;">Middleware </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">EndpointMiddleware</span><span>]( </span><span> </span><span style="color:#e45649;">name</span><span>: </span><span style="color:#c18401;">String</span><span>, </span><span> </span><span style="color:#e45649;">endpoint</span><span>: </span><span style="color:#c18401;">Endpoint</span><span>[</span><span style="color:#c18401;">PathInput</span><span>, </span><span style="color:#c18401;">Input</span><span>, </span><span style="color:#c18401;">Err</span><span>, </span><span style="color:#c18401;">Output</span><span>, </span><span style="color:#c18401;">Middleware</span><span>], </span><span> ): </span><span style="color:#c18401;">Either</span><span>[</span><span style="color:#c18401;">String</span><span>, </span><span style="color:#c18401;">RustEndpoint</span><span>] </span><span 
style="color:#a626a4;">= </span><span style="color:#a0a1a7;">// ... </span></code></pre> <p>The second thing to notice: endpoints do not have a name! If we look back to our initial example of <code>getWorkerMetadata</code>, it did not have a unique name except the Scala value it was assigned to. But we can't observe that in our code generator (without writing a macro) so here we have chosen to just get a name as a string next to the definition.</p> <p>Then we can define a <strong>collection</strong> of <code>RustEndpoint</code>s:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> RustEndpoints</span><span>(</span><span style="color:#e45649;">name</span><span>: </span><span style="color:#c18401;">Name</span><span>, </span><span style="color:#e45649;">originalEndpoints</span><span>: </span><span style="color:#c18401;">Chunk</span><span>[</span><span style="color:#c18401;">RustEndpoint</span><span>]) </span></code></pre> <p>and define a <code>++</code> operator between <code>RustEndpoint</code> and <code>RustEndpoints</code>. 
In the end we can use these to define APIs like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span> </span><span style="color:#a626a4;">for </span><span> getDefaultProject &lt;- fromEndpoint(</span><span style="color:#50a14f;">&quot;getDefaultProject&quot;</span><span>, ProjectEndpoints.getDefaultProject) </span><span> getProjects &lt;- fromEndpoint(</span><span style="color:#50a14f;">&quot;getProjects&quot;</span><span>, ProjectEndpoints.getProjects) </span><span> postProject &lt;- fromEndpoint(</span><span style="color:#50a14f;">&quot;postProject&quot;</span><span>, ProjectEndpoints.postProject) </span><span> getProject &lt;- fromEndpoint(</span><span style="color:#50a14f;">&quot;getProject&quot;</span><span>, ProjectEndpoints.getProject) </span><span> deleteProject &lt;- fromEndpoint(</span><span style="color:#50a14f;">&quot;deleteProject&quot;</span><span>, ProjectEndpoints.deleteProject) </span><span> </span><span style="color:#a626a4;">yield </span><span>(getDefaultProject ++ getProjects ++ postProject ++ getProject ++ deleteProject).named(</span><span style="color:#50a14f;">&quot;Project&quot;</span><span>) </span></code></pre> <p>The collection of endpoints also has a name (<code>"Project"</code>). In the code generator we can use these names to generate a separate <strong>client</strong> (trait and implementation) for each of these groups of endpoints.</p> <p>When processing a single endpoint, we need to process the following parts:</p> <ul> <li>Inputs (<code>endpoint.input</code>)</li> <li>Outputs (<code>endpoint.output</code>)</li> <li>Errors (<code>endpoint.error</code>)</li> </ul> <p>Everything we need is encoded in one of these three fields of an endpoint, and all three are built on the same abstraction called <code>HttpCodec</code>. 
Still there is a significant difference in what we want to do with inputs versus what we want to do with outputs and errors, so we can write two different traversals for gathering all the necessary information from them.</p> <h3 id="inputs">Inputs</h3> <p>When gathering information from the inputs, we are going to run into the following cases:</p> <ul> <li><code>HttpCodec.Combine</code> means we have two different inputs; we need both, so we have to process both inner codecs sequentially, both extending our conversion function's state.</li> <li><code>HttpCodec.Content</code> describes a <strong>request body</strong>. Here we have a <code>Schema</code> of our request body type and we can use the previously generated schema-to-Rust type mapping to know how to refer to the generated Rust type in our client code. Note that if there are <strong>multiple content codecs</strong>, the endpoint receives a <code>multipart/form-data</code> body, while if there is only one codec, it accepts an <code>application/json</code> representation of that type.</li> <li><code>HttpCodec.ContentStream</code> represents a body containing a stream of a given element type. We can model this as just a <code>Vec&lt;A&gt;</code> on the Rust side, but there is one special case here - if the element is a <code>Byte</code>, ZIO Http expects a simple byte stream of type <code>application/octet-stream</code> instead of a JSON-encoded array of bytes.</li> <li><code>HttpCodec.Fallback</code> represents the case when we should use either the first codec <em>or</em> the second. A special case is when the <code>right</code> value of <code>Fallback</code> is <code>HttpCodec.Empty</code>. This is how ZIO Http represents optional inputs! We have to handle this specially in our code generator to mark some of the input parameters of the generated API as optional parameters. 
We currently don't support the other cases (when <code>right</code> is not empty) as they are not frequently used and were not required for the <em>Golem API</em>.</li> <li><code>HttpCodec.Header</code> means we need to send a <em>header</em> in the request, which can be a static one (its value is described by the endpoint) or a dynamic one (where we need to add an extra parameter to the generated function to get the value of the header). There are a couple of different primitive types supported for the value, such as strings, numbers and UUIDs.</li> <li><code>HttpCodec.Method</code> defines the method to be used for calling the endpoint</li> <li><code>HttpCodec.Path</code> describes the request path, which consists of a sequence of static and dynamic segments - for the dynamic segments the generated API needs to expose function parameters of the appropriate type</li> <li><code>HttpCodec.Query</code>, similar to the header codec, defines query parameters to be sent</li> <li><code>HttpCodec.TransformOrFail</code> transforms a value with a Scala function - the same case as with <code>Schema.Transform</code>. 
We cannot use the Scala function in our code generator so we just need to ignore this and go to the inner codec.</li> <li><code>HttpCodec.Annotated</code> attaches additional information to the codecs that we are currently not using, but it could be used to get documentation strings and include them in the generated code as comments, for example.</li> </ul> <h3 id="outputs">Outputs</h3> <p>For outputs we are dealing with the same <code>HttpCodec</code> type but there are some significant differences:</p> <ul> <li>We can ignore <code>Path</code>, <code>Method</code>, <code>Query</code> as they have no meaning for outputs</li> <li>We could look for <em>output headers</em> but currently we ignore them</li> <li><code>Fallback</code> on the other hand needs to be properly handled for outputs (errors, especially) because this is how the different error responses are encoded.</li> <li><code>Status</code> is combined with <code>Content</code> in these <code>Fallback</code> nodes to describe cases. 
This complicates the code generator because we need to record "possible outputs" which are only added as real output once we are sure we will not get any other piece of information for them.</li> </ul> <p>To understand the error fallback handling better, let's take a look at how it is defined in one of Golem's endpoint groups:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">errorCodec</span><span>: </span><span style="color:#c18401;">HttpCodec</span><span>[</span><span style="color:#c18401;">HttpCodecType</span><span>.</span><span style="color:#c18401;">Status </span><span>&amp; </span><span style="color:#c18401;">HttpCodecType</span><span>.</span><span style="color:#c18401;">Content</span><span>, </span><span style="color:#c18401;">LimitsEndpointError</span><span>] </span><span style="color:#a626a4;">= </span><span> HttpCodec.enumeration[</span><span style="color:#c18401;">LimitsEndpointError</span><span>]( </span><span> HttpCodec.error[</span><span style="color:#c18401;">LimitsEndpointError</span><span>.</span><span style="color:#c18401;">Unauthorized</span><span>](Status.Unauthorized), </span><span> HttpCodec.error[</span><span style="color:#c18401;">LimitsEndpointError</span><span>.</span><span style="color:#c18401;">ArgValidationError</span><span>](Status.BadRequest), </span><span> HttpCodec.error[</span><span style="color:#c18401;">LimitsEndpointError</span><span>.</span><span style="color:#c18401;">LimitExceeded</span><span>](Status.Forbidden), </span><span> HttpCodec.error[</span><span style="color:#c18401;">LimitsEndpointError</span><span>.</span><span style="color:#c18401;">InternalError</span><span>](Status.InternalServerError) </span><span> ) </span></code></pre> <p>This leads to a series of nested <code>HttpCodec.Fallback</code>, <code>HttpCodec.Combine</code>, 
<code>HttpCodec.Status</code> and <code>HttpCodec.Content</code> nodes. When processing them we first add values of possible outputs:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> PossibleOutput</span><span>(</span><span style="color:#e45649;">tpe</span><span>: </span><span style="color:#c18401;">RustType</span><span>, </span><span style="color:#e45649;">status</span><span>: </span><span style="color:#c18401;">Option</span><span>[</span><span style="color:#c18401;">Status</span><span>], </span><span style="color:#e45649;">isError</span><span>: </span><span style="color:#a626a4;">Boolean</span><span>, </span><span style="color:#e45649;">schema</span><span>: </span><span style="color:#c18401;">Schema</span><span>[</span><span style="color:#e45649;">?</span><span>]) </span></code></pre> <p>and once we have fully processed one branch of a <code>Fallback</code>, we finalize these possible outputs and make them real outputs. The way these different error cases are mapped into different case classes of a single error type (<code>LimitsEndpointError</code>) also complicates things. When we reach a <code>HttpCodec.Content</code> referencing <code>Schema[LimitsEndpointError.LimitExceeded]</code> for example, all we see is a <code>Schema.Record</code> - and not the parent enum! 
For this reason in the code generator we are explicitly defining the error ADT type:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">fromEndpoint </span><span style="color:#a626a4;">= </span><span>RustEndpoint.withKnownErrorAdt[</span><span style="color:#c18401;">LimitsEndpointError</span><span>].zio </span></code></pre> <p>and we detect if all cases are subtypes of this error ADT and generate the client code according to that.</p> <h3 id="the-rust-client">The Rust client</h3> <p>It is time to take a look at what the output of all this looks like. In this section we will examine some parts of the generated Rust code.</p> <p>Let's take a look at the <strong>Projects API</strong>. We have generated a <code>trait</code> for all the endpoints belonging to it:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">async_trait</span><span>::</span><span style="color:#e45649;">async_trait</span><span>] </span><span style="color:#a626a4;">pub trait </span><span>Project { </span><span> async </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">get_default_project</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">authorization</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>) -&gt; Result&lt;crate::model::Project, ProjectError&gt;; </span><span> async </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">get_projects</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">project_name</span><span>: Option&lt;</span><span 
style="color:#a626a4;">&amp;str</span><span>&gt;, </span><span style="color:#e45649;">authorization</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>) -&gt; Result&lt;Vec&lt;crate::model::Project&gt;, ProjectError&gt;; </span><span> async </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">post_project</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">field0</span><span>: crate::model::ProjectDataRequest, </span><span style="color:#e45649;">authorization</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>) -&gt; Result&lt;crate::model::Project, ProjectError&gt;; </span><span> async </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">get_project</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">project_id</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>, </span><span style="color:#e45649;">authorization</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>) -&gt; Result&lt;crate::model::Project, ProjectError&gt;; </span><span> async </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">delete_project</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">project_id</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>, </span><span style="color:#e45649;">authorization</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>) -&gt; Result&lt;(), ProjectError&gt;; </span><span>} </span></code></pre> <p>This is quite close to our original goal! 
One significant difference is that some type information is lost: <code>project_id</code> was <code>ProjectId</code> in Scala, and <code>authorization</code> was <code>TokenSecret</code> etc. Unfortunately with the current version of ZIO Schema these newtypes (or Scala 3 opaque types) are represented as primitive types transformed by a function. As explained earlier, we can't inspect the transformation function so all we can do is to use the underlying primitive type's schema here. This can be solved by introducing the concept of newtypes into ZIO Schema.</p> <p>The <code>ProjectError</code> is a client specific generated <code>enum</code> which can represent a mix of internal errors (such as not being able to call the endpoint) as well as the endpoint-specific domain errors:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">pub enum </span><span>ProjectError { </span><span> RequestFailure(reqwest::Error), </span><span> InvalidHeaderValue(reqwest::header::InvalidHeaderValue), </span><span> UnexpectedStatus(reqwest::StatusCode), </span><span> Status404 { </span><span> message: String, </span><span> }, </span><span> Status403 { </span><span> error: String, </span><span> }, </span><span> Status400 { </span><span> errors: Vec&lt;String&gt;, </span><span> }, </span><span> Status500 { </span><span> error: String, </span><span> }, </span><span> Status401 { </span><span> message: String, </span><span> }, </span><span>} </span></code></pre> <p>So why are these per-status-code error types inlined here instead of generating the error ADT as a Rust <code>enum</code> and using that? 
The reason is a difference between Scala and Rust: we have a single error ADT in Scala and we can still use its <em>cases</em> directly in the endpoint definition:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">sealed trait</span><span style="color:#c18401;"> ProjectEndpointError </span><span style="color:#a626a4;">object</span><span style="color:#c18401;"> ProjectEndpointError { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> ArgValidation(</span><span style="color:#e45649;">errors</span><span style="color:#c18401;">: Chunk[String]) </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">ProjectEndpointError </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;">} </span><span> </span><span style="color:#a0a1a7;">// ... </span><span>HttpCodec.error[</span><span style="color:#c18401;">ProjectEndpointError</span><span>.</span><span style="color:#c18401;">ArgValidation</span><span>](Status.BadRequest), </span></code></pre> <p>We <em>do</em> generate the corresponding <code>ProjectEndpointError</code> enum in Rust:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">derive</span><span>(Debug, Clone, PartialEq, Eq, Hash, Ord, PartialOrd, serde::Serialize, serde::Deserialize)] </span><span style="color:#a626a4;">pub enum </span><span>ProjectEndpointError { </span><span> ArgValidation { </span><span> errors: Vec&lt;String&gt;, </span><span> }, </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span>} </span></code></pre> <p>but we cannot use <code>ProjectEndpointError::ArgValidation</code> as a type in the above <code>ProjectError</code> enum. And we cannot safely do something like <code>Either[ClientError, ProjectEndpointError]</code> because in the endpoint DSL we just have a sequence of status code - error case pairs. There is no guarantee that one enum case is only used once in that mapping, or that every case is used at least once. For this reason the mapping from <code>ProjectError</code> to <code>ProjectEndpointError</code> is generated as a transformation function:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#a626a4;">impl </span><span>ProjectError { </span><span> </span><span style="color:#a626a4;">pub fn </span><span style="color:#0184bc;">to_project_endpoint_error</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>) -&gt; Option&lt;crate::model::ProjectEndpointError&gt; { </span><span> </span><span style="color:#a626a4;">match </span><span style="color:#e45649;">self </span><span>{ </span><span> ProjectError::Status400 { errors } </span><span style="color:#a626a4;">=&gt; </span><span>Some(</span><span style="color:#a626a4;">crate</span><span>::model::ProjectEndpointError::ArgValidation { errors: errors.</span><span style="color:#0184bc;">clone</span><span>() }), </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span> } </span><span> } </span><span>} </span></code></pre> <p>For each client trait we also generate a <strong>live implementation</strong>, represented by a struct containing configuration for the client:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#e45649;">derive</span><span>(Clone, Debug)] </span><span style="color:#a626a4;">pub struct </span><span>ProjectLive { </span><span> </span><span style="color:#a626a4;">pub </span><span style="color:#e45649;">base_url</span><span>: reqwest::Url, </span><span> </span><span style="color:#a626a4;">pub </span><span style="color:#e45649;">allow_insecure</span><span>: </span><span style="color:#a626a4;">bool</span><span>, </span><span>} </span></code></pre> <p>The implementation of the client trait for these live structs simply uses <a href="https://docs.rs/reqwest/latest/reqwest/"><code>reqwest</code></a> (an HTTP client library for Rust) to construct the request from the input parameters exactly the way the endpoint definition describes:</p> <pre data-lang="rust" style="background-color:#fafafa;color:#383a42;" class="language-rust "><code class="language-rust" data-lang="rust"><span>async </span><span style="color:#a626a4;">fn </span><span style="color:#0184bc;">get_project</span><span>(</span><span style="color:#a626a4;">&amp;</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">project_id</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>, </span><span style="color:#e45649;">authorization</span><span>: </span><span style="color:#a626a4;">&amp;str</span><span>) -&gt; Result&lt;Project, ProjectError&gt; { </span><span> </span><span style="color:#a626a4;">let mut</span><span> url </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">self</span><span>.base_url.</span><span 
style="color:#0184bc;">clone</span><span>(); </span><span> url.</span><span style="color:#0184bc;">set_path</span><span>(</span><span style="color:#a626a4;">&amp;</span><span>format!(</span><span style="color:#50a14f;">&quot;v1/projects/</span><span style="color:#c18401;">{project_id}</span><span style="color:#50a14f;">&quot;</span><span>)); </span><span> </span><span> </span><span style="color:#a626a4;">let mut</span><span> headers </span><span style="color:#a626a4;">= </span><span>reqwest::header::HeaderMap::new(); </span><span> </span><span style="color:#a0a1a7;">// ... </span><span> </span><span> </span><span style="color:#a626a4;">let mut</span><span> builder </span><span style="color:#a626a4;">= </span><span>reqwest::Client::builder(); </span><span> </span><span style="color:#a0a1a7;">// ... </span><span> </span><span style="color:#a626a4;">let</span><span> client </span><span style="color:#a626a4;">=</span><span> builder.</span><span style="color:#0184bc;">build</span><span>()</span><span style="color:#a626a4;">?</span><span>; </span><span> </span><span style="color:#a626a4;">let</span><span> result </span><span style="color:#a626a4;">=</span><span> client </span><span> .</span><span style="color:#0184bc;">get</span><span>(url) </span><span> .</span><span style="color:#0184bc;">headers</span><span>(headers) </span><span> .</span><span style="color:#0184bc;">send</span><span>() </span><span> .await</span><span style="color:#a626a4;">?</span><span>; </span><span> </span><span style="color:#a626a4;">match</span><span> result.</span><span style="color:#0184bc;">status</span><span>().</span><span style="color:#0184bc;">as_u16</span><span>() { </span><span> </span><span style="color:#c18401;">200 </span><span style="color:#a626a4;">=&gt; </span><span>{ </span><span> </span><span style="color:#a626a4;">let</span><span> body </span><span style="color:#a626a4;">=</span><span> result.json::&lt;crate::model::Project&gt;().await</span><span 
style="color:#a626a4;">?</span><span>; </span><span> Ok(body) </span><span> } </span><span> </span><span style="color:#c18401;">404 </span><span style="color:#a626a4;">=&gt; </span><span>{ </span><span> </span><span style="color:#a626a4;">let</span><span> body </span><span style="color:#a626a4;">=</span><span> result.json::&lt;ProjectEndpointErrorNotFoundPayload&gt;().await</span><span style="color:#a626a4;">?</span><span>; </span><span> Err(ProjectError::Status404 { message: body.message }) </span><span> } </span><span> </span><span style="color:#a0a1a7;">// ... </span><span> } </span><span>} </span></code></pre> <h2 id="putting-it-all-together">Putting it all together</h2> <p>At this point we have seen how <em>ZIO Http</em> describes endpoints, how <em>ZIO Schema</em> encodes Scala types, how we can use <em>ZIO Parser</em> to have composable printers and how <em>ZIO Prelude</em> can help with working with state in a purely functional code. The only thing remaining is to wire everything together and define an easy to use function that, when executed, creates all the required <em>Rust files</em> ready to be compiled.</p> <p>We can create a class for this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> ClientCrateGenerator</span><span>(</span><span style="color:#e45649;">name</span><span>: </span><span style="color:#c18401;">String</span><span>, </span><span style="color:#e45649;">version</span><span>: </span><span style="color:#c18401;">String</span><span>, </span><span style="color:#e45649;">description</span><span>: </span><span style="color:#c18401;">String</span><span>, </span><span style="color:#e45649;">homepage</span><span>: </span><span style="color:#c18401;">String</span><span>, </span><span style="color:#e45649;">endpoints</span><span>: </span><span 
style="color:#c18401;">Chunk</span><span>[</span><span style="color:#c18401;">RustEndpoints</span><span>]): </span></code></pre> <p>Here <code>endpoints</code> is a collection of a <strong>group of endpoints</strong>, as it was shown earlier. So first you can use <code>RustEndpoint.fromEither</code> and <code>++</code> to create a <code>RustEndpoints</code> value for each API you have, and then generate a client for all of those in one run with this class.</p> <p>The first thing to do is collect <em>all</em> the referenced <code>Schema</code> from all the endpoints:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">private val </span><span style="color:#e45649;">allSchemas </span><span style="color:#a626a4;">=</span><span> endpoints.map(</span><span style="color:#e45649;">_</span><span>.endpoints.toSet.flatMap(</span><span style="color:#e45649;">_</span><span>.referredSchemas)).reduce(</span><span style="color:#e45649;">_</span><span> union </span><span style="color:#e45649;">_</span><span>) </span></code></pre> <p>Then we define a ZIO function (it is an effectful function, manipulating the filesystem!) 
to generate the files:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">generate</span><span>(</span><span style="color:#e45649;">targetDirectory</span><span>: </span><span style="color:#c18401;">Path</span><span>): </span><span style="color:#c18401;">ZIO</span><span>[</span><span style="color:#a626a4;">Any</span><span>, </span><span style="color:#c18401;">Throwable</span><span>, </span><span style="color:#a626a4;">Unit</span><span>] </span><span style="color:#a626a4;">= </span><span> </span><span style="color:#a626a4;">for </span><span> clientModel &lt;- ZIO.fromEither(RustModel.fromSchemas(allSchemas.toSeq)) </span><span> .mapError(</span><span style="color:#e45649;">err </span><span style="color:#a626a4;">=&gt; new </span><span style="color:#c18401;">RuntimeException</span><span>(</span><span style="color:#0184bc;">s</span><span style="color:#50a14f;">&quot;Failed to generate client model: </span><span style="color:#e45649;">$err</span><span style="color:#50a14f;">&quot;</span><span>)) </span><span> cargoFile </span><span style="color:#a626a4;">=</span><span> targetDirectory / </span><span style="color:#50a14f;">&quot;Cargo.toml&quot; </span><span> srcDir </span><span style="color:#a626a4;">=</span><span> targetDirectory / </span><span style="color:#50a14f;">&quot;src&quot; </span><span> libFile </span><span style="color:#a626a4;">=</span><span> srcDir / </span><span style="color:#50a14f;">&quot;lib.rs&quot; </span><span> modelFile </span><span style="color:#a626a4;">=</span><span> srcDir / </span><span style="color:#50a14f;">&quot;model.rs&quot; </span><span> </span><span> requiredCrates </span><span style="color:#a626a4;">=</span><span> clientModel.requiredCrates union endpoints.map(</span><span style="color:#e45649;">_</span><span>.requiredCrates).reduce(</span><span 
style="color:#e45649;">_</span><span> union </span><span style="color:#e45649;">_</span><span>) </span><span> </span><span> </span><span style="color:#e45649;">_</span><span> &lt;- Files.createDirectories(targetDirectory) </span><span> </span><span style="color:#e45649;">_</span><span> &lt;- Files.createDirectories(srcDir) </span><span> </span><span style="color:#e45649;">_</span><span> &lt;- writeCargo(cargoFile, requiredCrates) </span><span> </span><span style="color:#e45649;">_</span><span> &lt;- writeLib(libFile) </span><span> </span><span style="color:#e45649;">_</span><span> &lt;- writeModel(modelFile, clientModel.definitions) </span><span> </span><span style="color:#e45649;">_</span><span> &lt;- ZIO.foreachDiscard(endpoints): endpoints </span><span style="color:#a626a4;">=&gt; </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">clientFile </span><span style="color:#a626a4;">=</span><span> srcDir / </span><span style="color:#0184bc;">s</span><span style="color:#50a14f;">&quot;${</span><span>endpoints.name.toSnakeCase</span><span style="color:#50a14f;">}.rs&quot; </span><span> writeClient(clientFile, endpoints) </span><span> </span><span style="color:#a626a4;">yield </span><span style="color:#c18401;">() </span></code></pre> <p>The steps are straightforward:</p> <ul> <li>Create a <code>RustModel</code> using all the collected <code>Schema[?]</code> values</li> <li>Create all the required directories</li> <li>Write a <em>cargo file</em> - having all the dependencies and other metadata required to compile the Rust project</li> <li>Write a <em>lib file</em> - this is just a series of <code>pub mod xyz;</code> lines, defining the generated modules which are put in different files</li> <li>Write all the generated Rust types into a <code>model.rs</code></li> <li>For each endpoint group create a <code>xyz.rs</code> module containing the client trait and implementation</li> </ul> <p>For working with the file system - creating 
directories, writing data into files, we can use the <a href="https://zio.dev/zio-nio/">ZIO NIO</a> library providing ZIO wrappers for all these functionalities.</p> <h3 id="links">Links</h3> <p>Finally, some links:</p> <ul> <li>The <strong>code generator</strong> is open source and available at https://github.com/vigoo/zio-http-rust - the code and the repository itself are not documented at the moment, except by this blog post.</li> <li>The generated <strong>Golem client for Rust</strong> is published as a crate to https://crates.io/crates/golem-client</li> <li>The new <strong>Golem CLI</strong>, using the generated client, is also open sourced and can be found at https://github.com/golemcloud/golem-cli</li> <li>Finally you can learn more about <strong>Golem</strong> itself at https://www.golem.cloud</li> </ul> [Video] Introducing ZIO Flow @ ZIO World 2023 2023-09-06T00:00:00+00:00 2023-09-06T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/introducing-zio-flow/ <p>My short talk at <a href="https://www.zioworld.com/">ZIO World 2023</a> about the <a href="https://zio.dev/zio-flow/">zio-flow library</a>.</p> <iframe width="800" height="450" src="https://www.youtube.com/embed/ujJuFd6Vvfc?si=bsh3b7f-LXFVP_v_" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe> [Video] Binary Serialization Of Evolving Data Types @ Functional Scala 2022 2022-12-01T00:00:00+00:00 2022-12-01T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/funscala2022-talk/ <p>My talk at <a href="https://www.functionalscala.com/">Functional Scala 2022</a> about the binary serialization library <a href="https://vigoo.github.io/desert/">desert</a>.</p> <iframe width="800" height="450" src="https://www.youtube.com/embed/Y2KopYpjZ3Y" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; 
picture-in-picture; web-share" allowfullscreen></iframe> ZIO Kafka with transactions - a debugging story 2022-06-15T00:00:00+00:00 2022-06-15T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/zio-kafka-debugging-story/ <h2 id="introduction">Introduction</h2> <p>With one of our clients, we were working on a chain of services responsible for processing logs coming from a Kafka topic: partitioning them by properties such as user and date, inferring and aggregating the log schema, and eventually storing the partitioned data in a different format. The details of this use case are not important for understanding this post, in which I'm going to explain the recent changes to <a href="https://ziverge.com/blog/introduction-to-zio-kafka">ZIO Kafka</a>, how they were implemented, how we found out they were not perfect, and the long story of the investigation that finally resulted in a fix making this new feature usable in production.</p> <p>We only have to know about the first component of this data pipeline, which is a zio-kafka service:</p> <ul> <li>It consumes its source topic. Each record in this topic consists of one or more log entries for a given user. The Kafka topic's partitions are not aligned with our target partitioning (by user/date); every Kafka partition may contain data from all users.</li> <li>The service partitions the source data per user/date/hour and writes the log entries into Avro files in the local file system</li> <li>It also computes and aggregates a log schema in memory for each of these files</li> <li>It is using Kafka transactions to achieve <a href="https://www.baeldung.com/kafka-exactly-once">exactly-once delivery</a>. 
This means that the processed records are not committed when they are written to the Avro files - there is a periodic event triggered every 30 seconds and at each rebalance that uploads the Avro files to S3, and <em>then</em> it emits Kafka messages downstream containing references to the uploaded files and their aggregated schema, and it commits the offsets of all the input Kafka topic's partitions <em>transactionally</em>.</li> </ul> <p><img src="/images/blog-zio-kafka-debugging-1.png" alt="" /></p> <h2 id="stream-restarting-mode-in-zio-kafka">Stream restarting mode in zio-kafka</h2> <p>When we first implemented this using zio-kafka and started to test it, we saw a lot of errors like</p> <p><code>Transiting to abortable error state due to org.apache.kafka.clients.consumer.CommitFailedException: Transaction offset Commit failed due to consumer group metadata mismatch: Specified group generation id is not valid."}</code></p> <p><em>Group generation ID</em> is a counter that gets incremented at each rebalance. The problem was that zio-kafka by default provides a continuous stream for partitions that survives rebalances. So we have a single stream per Kafka partition and after a rebalance we end up with some of them revoked and their streams stopped, some new streams created, but the ones that remained assigned are not going to be recreated.</p> <p><img src="/images/blog-zio-kafka-debugging-2.png" alt="" /></p> <p>This works fine without using transactions, but it means your stream can contain messages from multiple generations. I first tried to solve this by detecting generation switches downstream but quickly realized this cannot work. 
It's too late to commit the previous generation when there are already records from the new generation; we have to do it before the rebalance finishes.</p> <p>To solve this I introduced a new <em>mode</em> in zio-kafka back in February 2022, with <a href="https://github.com/zio/zio-kafka/pull/427">this pull request</a>.</p> <p>This adds a new mode to zio-kafka's core run loop which guarantees that every rebalance stops all the partition streams and creates new ones.</p> <p><img src="/images/blog-zio-kafka-debugging-3.png" alt="" /></p> <p>With this approach the library user can build the following logic on top of the "stream of partition streams" API of zio-kafka:</p> <ul> <li>Get the next set of partition streams</li> <li>Merge and drain them all</li> <li>Perform a <em>flush</em> - upload and commit everything before starting to work on the new set of streams</li> <li>Repeat</li> </ul> <p>This alone is still not enough - we have to block the rebalancing until we are done with the committing, otherwise we would still get the invalid generation ID error.</p> <p>The <code>onRevoke</code> and <code>onAssigned</code> callbacks from the underlying Java Kafka library block the rebalance process while they run, so that's the place where we can finish all processing for the revoked partitions. 
This extension point is provided by zio-kafka too but it's completely detached from the streaming API so I have introduced a rebalance event queue with some promises and timeouts to coordinate this:</p> <ul> <li>In <code>onRevoke</code> we publish a rebalance event and wait until it gets processed.</li> <li>Because the new run loop mode is guaranteed to terminate all streams on rebalance (which <em>is</em> already happening, as we are in <code>onRevoke</code>) we can be sure that the main consumer stream's current stage - the one that drains the previous generation's partition streams - will finish soon</li> <li>and then it performs the rotation and fulfills the promise in the rebalance event.</li> </ul> <p>With these changes our service started to work - but we had to know if it works correctly.</p> <h2 id="qos-tests">QoS tests</h2> <p>We implemented a QoS test running on Spark which periodically checks that we are not losing any data with our new pipeline.</p> <p>Our log entries have associated unique identifiers coming from upstream - so what we can do in this test is to consume an hour's worth of log records from the same Kafka topic our service is consuming from, and read all the Avro files produced in that period (with some padding of course to have some tolerance for lag) and then see if there are any missing records in our output.</p> <p>Another source of truth for the investigation was an older system doing something similar, resulting in the same input being available as archived CSV files in some cases. 
Comparing the archived CSV files with the archived Avro files I could verify that the QoS test itself works correctly, by checking that both methods report the same set of missing records.</p> <p>What we learned from these tests was that:</p> <ul> <li>there is data loss</li> <li>the data loss is related to rebalances</li> </ul> <p>To establish that it is related to rebalances I compared failing QoS reports from several hours, figured out the ingestion time for some of the missing log records within these hours, and checked our service and infrastructure logs around that time. Every time there was a rebalance near the reported errors.</p> <h2 id="additional-tests">Additional tests</h2> <p>During the investigation I added some additional debug features and logs to the system.</p> <p>One of them is an extra verification step, enabled only temporarily in our development cluster, that</p> <ul> <li>aggregates all the log identifiers at the earliest point - as soon as they arrive in the zio-kafka partition stream</li> <li>after uploading the Avro files and committing the records, it re-downloads all the files from S3 and checks that they contain all the log identifiers they should.</li> </ul> <p>This never reported any error so based on that I considered the flow <em>after</em> zio-kafka correct.</p> <p>We also have a lot of debug logs coming from the Java Kafka library, from zio-kafka and from our service to help understand the issue:</p> <ul> <li>After each rebalance, the Java library logs the offset it's starting to read from</li> <li>When committing I'm logging the minimum and maximum offset contained by the committed and uploaded Avro files per Kafka partition</li> <li>Every stream creation and termination is logged</li> <li>A message is logged if records within a partition stream skip an offset (this was never actually logged)</li> </ul> <p>I wrote a test app that reads our service's logs from a given period, logged from all the Kubernetes pods it's running on, and runs a state 
machine that verifies that all the logged offsets from the different pods are in sync. It fails in two cases:</p> <ul> <li>When a pod <em>resets its offsets</em> to something that was previously seen in the logs and there is a gap</li> <li>When a pod <em>rotates</em> a Kafka partition that was not assigned to that pod first (as would happen if multiple pods somehow consumed the same partition, which Kafka prevents)</li> </ul> <p>I tried for a long time to write integration tests using embedded Kafka (similar to how it's done in zio-kafka's test suite) that reproduce the data loss issue, without any luck. In all my simulated cases everything works perfectly.</p> <h2 id="theories-and-fixes">Theories and fixes</h2> <p>For the logs from the time ranges in which data loss was reported, these additional checks did not show any discrepancies.</p> <p>This could only mean one of two things:</p> <ul> <li>Everything is correct on the Kafka/zio-kafka level but we are still losing data in our service-specific logic, somewhere in writing the Avro files and uploading them to S3.</li> <li>On the Kafka level everything is fine but somehow zio-kafka does not pass all the records to our service's logic</li> </ul> <p>I trusted the validation mode I described earlier (the one that re-downloads the data) so I ruled out the first option.</p> <h2 id="zio-kafka-internals">zio-kafka internals</h2> <p>Before discussing the fixes I tried to make in zio-kafka, first let's talk about how the library works.</p> <p>The zio-kafka library wraps the Java library for Kafka and provides a ZIO Stream interface for consuming the records. As I mentioned earlier, it creates a separate stream for each Kafka partition assigned to the consumer. The primary operation on the Java interface is called <code>poll</code>. This method is responsible for fetching data for all the subscribed partitions, waiting up to a given timeout. 
Another important property is that in case of rebalancing, the <code>poll</code> is blocked until the rebalancing completes, and it calls the already mentioned revoked/assigned callbacks in this blocked state.</p> <p>Another thing it has to support is back pressure. We don't want this <code>poll</code> to fetch more and more data for partitions where we did not process the previous records yet. In other words, upstream demand in our ZIO Streams must control what partitions we <code>poll</code>. On the Java level this is controlled by pausing and resuming individual partitions.</p> <p>So let's see a summary of how the consumer streams work:</p> <ul> <li>Each partition stream is a repeated ZIO effect that enqueues a <code>Request</code> in a <em>queue</em> and then waits for the promise contained in this request to be fulfilled. The promise will contain a chunk of records fetched from Kafka if everything went well.</li> <li>There is a single (per consumer) <em>run loop</em> which periodically calls <code>poll</code>. Before calling it, it pauses/resumes partitions based on which partitions have had at least one <code>Request</code> since the last <code>poll</code>.</li> <li>This, as ZIO streams are pull based, implements the back pressure semantics mentioned earlier.</li> </ul> <p>There is a similar mechanism for gathering commit requests and then performing them as part of the <em>run loop</em> but in our use case that is not used - the transactional producer is independent of this mechanism.</p> <p>There is one more concept which is very important to understand the problem: <em>buffered records</em>. Imagine that we are consuming five partitions, <code>1 .. 5</code> and only have a request (downstream pull) for partition <code>1</code>. This means we are pausing <code>2 .. 5</code> and do a <code>poll</code> but what if the resulting record set contains records from other partitions? 
There could be multiple reasons for this (and some of them may not be possible in practice), for example there could be some data already buffered within the Java library for the paused partitions, or maybe a rebalance assigns some new partitions which are not paused yet (as we don't know we are going to get them) resulting in immediately fetching some data for them.</p> <p>The library handles these cases in a simple way: it <em>buffers</em> the records that were not requested in a per-partition map, and when a partition is pulled next time, it will not only give the records returned by <code>poll</code> to the request's promise, but also all the buffered ones, prepended to the new set of records.</p> <p>Another important detail for this investigation is that we don't care about graceful shutdown, or if records got lost during shutdown. This is also very interesting in general, but our service is not trying to finish writing and uploading all data during shutdown, it simply ignores the partial data and quits without committing them so they get reprocessed as soon as possible in another consumer.</p> <p>What happens during rebalancing? Let's forget the default mode of zio-kafka for this discussion and focus on the new mode which <em>restarts</em> all the partition streams every time.</p> <p>We don't know in advance that a rebalance will happen; it happens during the call to <code>poll</code>. The method in the <em>run loop</em> that contains this logic is called <code>handlePoll</code> and does roughly the following (in our case):</p> <ul> <li>store the current state (containing the current streams, requests, buffered records etc) in a ref</li> <li>pause/resume partitions based on the current requests, as described earlier</li> <li>call <code>poll</code> <ul> <li>during <code>poll</code> in the revoked callback we end all partition streams. This means they get an interrupt signal and they stop. 
As I mentioned earlier, in this mode the consumer merges the partition streams and drains them; this is the other side of it, interrupting all the streams so we know that eventually this merged stream will also stop.</li> <li>dropping all the buffered records, but first adding them to a <em>drain queue</em> (this is a fix that was not part of the original implementation). It is now guaranteed that the partition streams will get the remaining buffered elements before they stop.</li> <li>storing the fact of the rebalancing, so the rest of <code>handlePoll</code> knows about it when <code>poll</code> returns.</li> </ul> </li> <li>once <code>poll</code> has returned, buffer all records for all unrequested partitions. This is another place where a fix was made: currently we treat <em>all</em> records as unrequested in case of a rebalancing, because all the streams were restarted, so the original requests were made by the previous set of streams; fulfilling them would lose data because the new streams are not waiting for the same promises.</li> <li>the next step would be to fulfill all the requests that we can by using the combination of buffered records and the <code>poll</code> result. But we had a rebalance and dropped all the requests! So this step must not do anything.</li> <li>finally we start new streams for each assigned partition</li> </ul> <p>So based on all this, and the theory that the commits/offsets are all correct but somehow data is lost between the Java library and the service logic, the primary suspect was the <em>buffered records</em>.</p> <p>Let's see what fixes and changes I made, in chronological order:</p> <h2 id="fix-attempt-1">Fix attempt 1</h2> <p>The first time I suspected buffered records were behind the issue, I realized that when we end <em>all</em> partition streams during rebalancing, we lose the buffered records. 
This is not a problem if those partitions are really revoked - it means there was no demand for those partitions, so it's just that some records were read ahead and now they get dropped and will be reprocessed on another consumer.</p> <p>But if the same partition is "reassigned" to the same consumer, this could cause data loss! The reason is that there is an internal state in Kafka which is a per-consumer, per-partition <em>position</em>. In this case this position would point to <em>after</em> the buffered records, so the next <code>poll</code> will get the next records and the previously buffered ones will not be prepended as usual because the revocation clears the buffer.</p> <p>Note that this whole problem would not exist if the reassigned partitions were <em>reset</em> to the last committed offset after rebalancing. I don't think this is the case; it only happens when a new partition is assigned to a consumer with no previous position.</p> <p>My first fix was passing the buffered records to the user-defined revoke handler so it could write the remaining records to the Avro files before uploading them. This was just a quick test, as it does not really fit into the API of zio-kafka.</p> <h2 id="fix-attempt-2">Fix attempt 2</h2> <p>After playing with the first fix for a while I thought it solved the issue, but the problem was just not reproducing - it is not completely clear why; I probably missed some test results.</p> <p>But I wrote a second version of the same fix, this time by adding the remaining buffered elements to the end of the partition streams before they stop, instead of explicitly passing them to the revoke handler.</p> <p>This should work exactly the same but handles the problem transparently.</p> <h2 id="fix-attempt-3">Fix attempt 3</h2> <p>After some more testing it was clear that the QoS tests were still showing data loss. 
The investigation continued and the next problem I found was that in <code>handlePoll</code> after a rebalance we were not storing the buffered records anymore in this "restarting streams" mode. I did not catch this in the first fix attempts because I was focusing on dealing with the buffered records at the <em>end</em> of the revoked streams.</p> <p>What does it mean that it was not storing the buffered records? In <code>handlePoll</code> there is a series of state manipulation functions and the buffered records map is part of this state. The logic here is quite complicated and it very much depends on whether we are running the consumer in <em>normal</em> or <em>stream restarting</em> mode. The problem was that for some reason after a rebalance (in the new mode only) this buffered records field was cleared instead of preserving records from before the rebalance.</p> <h2 id="fix-attempt-4">Fix attempt 4</h2> <p>Very soon it turned out that my previous fix was not doing anything, because there was one more problem in the state handling in <code>handlePoll</code>. As I wrote, it buffers only those records which were not <em>requested</em>. For those partitions which have a request, it fulfills these requests with the new records instead. When the reassigned partitions are not restarted during rebalancing (as in the <em>normal mode</em>) this is OK but for us, as we are creating new streams, the old requests must be dropped and not taken into account when deciding which records to buffer.</p> <p>In other words, in <em>restarting streams mode</em> we have to buffer all records after a rebalance.</p> <h2 id="fix-attempt-5">Fix attempt 5</h2> <p>I was very confident about the previous fix but something was still not OK; the test continued to report data loss. After several code reviews and discussions, I realized that it is not guaranteed that the <code>onRevoked</code> and <code>onAssigned</code> callbacks are called within a single <code>poll</code>! 
My code was not prepared for this (the original zio-kafka code was, actually, but I did not realize this for a long time).</p> <p>First of all I had to change the way the rebalance callbacks pass information to the poll handler. The previously added rebalance event (which was a simple case class) was changed to be either <code>Revoked</code>, <code>Assigned</code> or <code>RevokedAndAssigned</code> and I made sure that for each case all the run loop state variables are modified correctly.</p> <p>Immediately after deploying this, I saw evidence in the logs that indeed the revoked and assigned callbacks are called separately, so the fix was definitely needed. The only problem was that I did not really understand how this could cause data loss, and by doing some rebalancing tests it turned out that the problem still existed.</p> <h2 id="fix-attempt-6">Fix attempt 6</h2> <p>One more thing I added in the previous attempt was a log statement in a place that looked suspicious to me but that I had not paid attention to earlier. When adding requests to the run loop - these are added to the run loop's command queue when a partition stream tries to pull, completely asynchronous to the run loop itself - it was checking if currently the run loop is in the middle of a rebalancing. So in case the rebalancing takes multiple <code>poll</code>s, as we have seen, it is possible that between the <code>onRevoked</code> and <code>onAssigned</code> events we get some new requests from the streams.</p> <p>In the restart-streams mode all partition streams are interrupted on the revoke event, and no new streams are created until the assigned event. This means that these requests can <em>only</em> come from the previous streams so they should be ignored. But what zio-kafka was doing was to add these requests to the run loop's pending requests. 
This is correct behavior in its normal mode, because some of the streams survive the rebalance and their requests can still be fulfilled.</p> <p>But in our case it is incorrect, because after the assignment is done and some records are fetched by <code>poll</code>, these pending requests get fulfilled with them, "stealing" the records from the new partition streams!</p> <p>At this point I really felt like this was the last missing piece of the puzzle.</p> <h2 id="conclusion">Conclusion</h2> <p>And it was!</p> <p>The final set of fixes is published <a href="https://github.com/zio/zio-kafka/pull/473">in this pull request</a>. The service and its tests have been running perfectly for more than 10 days, proving that the fix is correct.</p> [Video] ZIO Parser @ ZIO World 2022 2022-03-11T00:00:00+00:00 2022-03-11T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/zioworld-talk/ <p>My talk at <a href="https://zioworld.com/">ZIO World 2022</a> introducing <a href="https://github.com/zio/zio-parser">ZIO Parser</a></p> <iframe width="800" height="450" src="https://www.youtube.com/embed/IG6SmKPPamY" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe> [Video] Generating Libraries @ Functional Scala 2021 2021-12-03T00:00:00+00:00 2021-12-03T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/funscala2021-talk/ <p>My talk at <a href="https://www.functionalscala.com/">Functional Scala 2021</a> about generating libraries in Scala:</p> <iframe width="800" height="450" src="https://www.youtube.com/embed/HCPTmytex3U" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe> Writing kubectl plugins with ZIO K8s 2021-03-07T00:00:00+00:00 2021-03-07T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/zio-k8s-plugins/ <p>Originally posted <a 
href="https://ziverge.com/blog/zio-k8s-kubectl-plugin">at the Ziverge blog</a>.</p> <p>Andrea Peruffo recently published <a href="https://www.lightbend.com/blog/writing-kubectl-plugins-with-scala-or-java-with-fabric8-kubernetes-client-on-graalvm?utm_campaign=Oktopost-BLG+-+Writing+Kubectl+plugins+in+Java+or+Scala">a blog post on the Lightbend blog</a> about how they migrated a <code>kubectl</code> plugin from Golang to Scala using the <a href="https://github.com/fabric8io/kubernetes-client">Fabric8</a> Kubernetes client and a few Scala libraries. This is a perfect use case for the <a href="https://coralogix.github.io/zio-k8s/">zio-k8s library</a> announced <a href="https://coralogix.com/log-analytics-blog/the-coralogix-operator-a-tale-of-zio-and-kubernetes/">two weeks ago</a>, so we decided to write this post demonstrating how to implement the same example using the ZIO ecosystem.</p> <p>We are going to implement the same example, originally described in the <a href="https://dev.to/ikwattro/write-a-kubectl-plugin-in-java-with-jbang-and-fabric8-566">Write a kubectl plugin in Java with JBang and fabric8</a> article, using the following libraries:</p> <ul> <li><a href="https://zio.dev/">ZIO</a></li> <li><a href="https://coralogix.github.io/zio-k8s/">ZIO K8s</a></li> <li><a href="https://zio.github.io/zio-logging/">ZIO Logging</a></li> <li><a href="https://vigoo.github.io/clipp/docs/">clipp</a></li> <li><a href="https://sttp.softwaremill.com/en/latest/">sttp</a></li> <li><a href="https://circe.github.io/circe/">circe</a></li> </ul> <p>The source code of the example <a href="https://github.com/zivergetech/zio-k8s-kubectl-plugin-example">can be found here</a>.</p> <p>The linked blog post does a great job in explaining the benefits and difficulties of compiling to native image with GraalVM so we are not going to repeat it here. 
Instead, we will focus on how the implementation looks in the functional Scala world.</p> <p>The example has to implement two <em>kubectl commands</em>: <code>version</code> to print its own version and <code>list</code> to list information about <em>all Pods of the Kubernetes cluster</em> in either ASCII table, JSON or YAML format.</p> <h3 id="cli-parameters">CLI parameters</h3> <p>Let's start with defining these command line options with the <a href="https://vigoo.github.io/clipp/docs/">clipp</a> library!</p> <p>First, we define the data structures that describe our parameters:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">sealed trait</span><span style="color:#c18401;"> Format </span><span style="color:#a626a4;">object</span><span style="color:#c18401;"> Format { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case object</span><span style="color:#c18401;"> Default </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">Format </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case object</span><span style="color:#c18401;"> Json </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">Format </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case object</span><span style="color:#c18401;"> Yaml </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">Format </span><span style="color:#c18401;">} </span><span> </span><span style="color:#a626a4;">sealed trait</span><span style="color:#c18401;"> Command </span><span style="color:#a626a4;">object</span><span style="color:#c18401;"> Command { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> ListPods(</span><span style="color:#e45649;">format</span><span style="color:#c18401;">: 
Format) </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">Command </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case object</span><span style="color:#c18401;"> Version </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">Command </span><span style="color:#c18401;">} </span><span> </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> Parameters</span><span>(</span><span style="color:#e45649;">verbose</span><span>: </span><span style="color:#a626a4;">Boolean</span><span>, </span><span style="color:#e45649;">command</span><span>: </span><span style="color:#c18401;">Command</span><span>) </span></code></pre> <p>When parsing the arguments (passed as an array of strings), we need to either produce a <code>Parameters</code> value or fail and print some usage information.</p> <p>With <code>clipp</code>, this is done by defining a parameter parser using its parser DSL in a <em>for comprehension</em>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">spec </span><span style="color:#a626a4;">= </span><span> </span><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">&lt;-</span><span> metadata(</span><span style="color:#50a14f;">&quot;kubectl lp&quot;</span><span>) </span><span> </span><span style="color:#e45649;">verbose </span><span style="color:#a626a4;">&lt;-</span><span> flag(</span><span style="color:#50a14f;">&quot;Verbose logging&quot;</span><span>, </span><span style="color:#c18401;">&#39;v&#39;</span><span>, </span><span style="color:#50a14f;">&quot;verbose&quot;</span><span>) </span><span> </span><span style="color:#e45649;">commandName </span><span style="color:#a626a4;">&lt;-</span><span> 
command(</span><span style="color:#50a14f;">&quot;version&quot;</span><span>, </span><span style="color:#50a14f;">&quot;list&quot;</span><span>) </span><span> </span><span style="color:#e45649;">command </span><span style="color:#a626a4;">&lt;- </span><span> commandName </span><span style="color:#a626a4;">match </span><span>{ </span><span> </span><span style="color:#a626a4;">case </span><span style="color:#50a14f;">&quot;version&quot; </span><span style="color:#a626a4;">=&gt; </span><span> pure(Command.Version) </span><span> </span><span style="color:#a626a4;">case </span><span style="color:#50a14f;">&quot;list&quot; </span><span style="color:#a626a4;">=&gt; </span><span> </span><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#e45649;">specifiedFormat </span><span style="color:#a626a4;">&lt;-</span><span> optional { </span><span> namedParameter[</span><span style="color:#c18401;">Format</span><span>]( </span><span> </span><span style="color:#50a14f;">&quot;Output format&quot;</span><span>, </span><span> </span><span style="color:#50a14f;">&quot;default|json|yaml&quot;</span><span>, </span><span> </span><span style="color:#c18401;">&#39;o&#39;</span><span>, </span><span> </span><span style="color:#50a14f;">&quot;output&quot; </span><span> ) </span><span> } </span><span> </span><span style="color:#e45649;">format </span><span style="color:#a626a4;">=</span><span> specifiedFormat.getOrElse(Format.Default) </span><span> } </span><span style="color:#a626a4;">yield </span><span>Command.ListPods(format) </span><span> } </span><span> } </span><span style="color:#a626a4;">yield </span><span>Parameters(verbose, command) </span></code></pre> <p>As we can see, it is possible to make decisions in the parser based on the previously parsed values, so each <em>command</em> can have a different set of arguments. 
In order to parse the possible <em>output formats</em>, we also implement the <code>ParameterParser</code> type class for <code>Format</code>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">implicit val </span><span style="color:#e45649;">parameterParser</span><span>: </span><span style="color:#c18401;">ParameterParser</span><span>[</span><span style="color:#c18401;">Format</span><span>] </span><span style="color:#a626a4;">= new </span><span style="color:#c18401;">ParameterParser</span><span>[</span><span style="color:#c18401;">Format</span><span>] { </span><span> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">parse</span><span>(</span><span style="color:#e45649;">value</span><span>: </span><span style="color:#c18401;">String</span><span>): </span><span style="color:#c18401;">Either</span><span>[</span><span style="color:#c18401;">String</span><span>, </span><span style="color:#c18401;">Format</span><span>] </span><span style="color:#a626a4;">= </span><span> value.toLowerCase </span><span style="color:#a626a4;">match </span><span>{ </span><span> </span><span style="color:#a626a4;">case </span><span style="color:#50a14f;">&quot;default&quot; </span><span style="color:#a626a4;">=&gt; </span><span>Right(Format.Default) </span><span> </span><span style="color:#a626a4;">case </span><span style="color:#50a14f;">&quot;json&quot; </span><span style="color:#a626a4;">=&gt; </span><span>Right(Format.Json) </span><span> </span><span style="color:#a626a4;">case </span><span style="color:#50a14f;">&quot;yaml&quot; </span><span style="color:#a626a4;">=&gt; </span><span>Right(Format.Yaml) </span><span> </span><span style="color:#a626a4;">case </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">=&gt; </span><span>Left(</span><span style="color:#0184bc;">s</span><span 
style="color:#50a14f;">&quot;Invalid output format &#39;</span><span style="color:#e45649;">$value</span><span style="color:#50a14f;">&#39;, use &#39;default&#39;, &#39;json&#39; or &#39;yaml&#39;&quot;</span><span>) </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">example</span><span>: </span><span style="color:#c18401;">Format </span><span style="color:#a626a4;">= </span><span>Format.Default </span><span> } </span></code></pre> <p>This is all we need to bootstrap our command line application. The following main function parses the arguments and provides the parsed <code>Parameters</code> value to the <code>ZIO</code> program:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">run</span><span>(</span><span style="color:#e45649;">args</span><span>: </span><span style="color:#c18401;">List</span><span>[</span><span style="color:#c18401;">String</span><span>]): </span><span style="color:#c18401;">URIO</span><span>[zio.</span><span style="color:#c18401;">ZEnv</span><span>, </span><span style="color:#c18401;">ExitCode</span><span>] </span><span style="color:#a626a4;">= </span><span>{ </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">clippConfig </span><span style="color:#a626a4;">=</span><span> config.fromArgsWithUsageInfo(args, Parameters.spec) </span><span> runWithParameters() </span><span> .provideCustomLayer(clippConfig) </span><span> .catchAll { </span><span style="color:#e45649;">_</span><span>: </span><span style="color:#c18401;">ParserFailure </span><span style="color:#a626a4;">=&gt; </span><span>ZIO.succeed(ExitCode.failure) } </span><span>} </span><span> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">runWithParameters</span><span>(): </span><span 
style="color:#c18401;">ZIO</span><span>[</span><span style="color:#c18401;">ZEnv </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">ClippConfig</span><span>[</span><span style="color:#c18401;">Parameters</span><span>], </span><span style="color:#a626a4;">Nothing</span><span>, </span><span style="color:#c18401;">ExitCode</span><span>] </span><span style="color:#a626a4;">= </span><span style="color:#a0a1a7;">// ... </span></code></pre> <h3 id="working-with-kubernetes">Working with Kubernetes</h3> <p>In <code>runWithParameters</code>, we have everything needed to initialize the logging and Kubernetes modules and perform the actual command. Before talking about the initialization though, let's take a look at how we can list the pods!</p> <p>We define a data type holding all the information we want to report about each pod:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">case class</span><span style="color:#c18401;"> PodInfo</span><span>(</span><span style="color:#e45649;">name</span><span>: </span><span style="color:#c18401;">String</span><span>, </span><span style="color:#e45649;">namespace</span><span>: </span><span style="color:#c18401;">String</span><span>, </span><span style="color:#e45649;">status</span><span>: </span><span style="color:#c18401;">String</span><span>, </span><span style="color:#e45649;">message</span><span>: </span><span style="color:#c18401;">String</span><span>) </span></code></pre> <p>The task now is to fetch <em>all pods</em> from Kubernetes and construct <code>PodInfo</code> values. In <code>zio-k8s</code> <em>getting a list of pods</em> is defined as a <strong>ZIO Stream</strong>, which under the hood sends multiple HTTP requests to Kubernetes taking advantage of its <em>pagination</em> capability. 
In this <em>stream</em> each element will be a <code>Pod</code> and we can start processing them one by one as soon as they arrive over the wire. This way the implementation of the <code>list</code> command can be something like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">run</span><span>(</span><span style="color:#e45649;">format</span><span>: </span><span style="color:#c18401;">Format</span><span>) </span><span style="color:#a626a4;">= </span><span> </span><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">&lt;-</span><span> log.debug(</span><span style="color:#50a14f;">&quot;Executing the list command&quot;</span><span>) </span><span> </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">&lt;-</span><span> pods </span><span> .getAll(</span><span style="color:#e45649;">namespace</span><span> = None) </span><span> .mapM(</span><span style="color:#e45649;">toModel</span><span>) </span><span> .run(reports.sink(</span><span style="color:#e45649;">format</span><span>)) </span><span> .catchAll { </span><span style="color:#e45649;">k8sFailure </span><span style="color:#a626a4;">=&gt; </span><span> console.putStrLnErr(</span><span style="color:#0184bc;">s</span><span style="color:#50a14f;">&quot;Failed to get the list of pods: </span><span style="color:#e45649;">$k8sFailure</span><span style="color:#50a14f;">&quot;</span><span>) </span><span> } </span><span> } </span><span style="color:#a626a4;">yield </span><span style="color:#c18401;">() </span></code></pre> <p>Let's take a look at each line!</p> <p>First, <code>log.debug</code> uses the <em>ZIO logging</em> library. 
We are going to initialize logging in such a way that these messages only appear if the <code>--verbose</code> option is enabled.</p> <p>Then <code>pods.getAll</code> is the ZIO Stream provided by the <em>ZIO K8s</em> library. Not providing a specific namespace means that we are getting pods from <em>all</em> namespaces.</p> <p>With <code>mapM(toModel)</code> we transform each <code>Pod</code> in the stream to our <code>PodInfo</code> data structure.</p> <p>Finally we <code>run</code> the stream into a <em>sink</em> that is responsible for displaying the <code>PodInfo</code> structures with the specific <em>output format</em>.</p> <p>The <code>Pod</code> objects returned in the stream are simple <em>case classes</em> containing all the information available for the given resource. Most of the fields of these case classes are <em>optional</em>, even though we can be sure that in our case each pod would have a name, a namespace and a status. To make working with these data structures easier within a set of expectations, they feature <em>getter methods</em> that are ZIO functions either returning the field's value, or failing if it is not specified. 
With these we can implement <code>toModel</code>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">toModel</span><span>(</span><span style="color:#e45649;">pod</span><span>: </span><span style="color:#c18401;">Pod</span><span>): </span><span style="color:#c18401;">IO</span><span>[</span><span style="color:#c18401;">K8sFailure</span><span>, </span><span style="color:#c18401;">PodInfo</span><span>] </span><span style="color:#a626a4;">= </span><span> </span><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#e45649;">metadata </span><span style="color:#a626a4;">&lt;-</span><span> pod.getMetadata </span><span> </span><span style="color:#e45649;">name </span><span style="color:#a626a4;">&lt;-</span><span> metadata.getName </span><span> </span><span style="color:#e45649;">namespace </span><span style="color:#a626a4;">&lt;-</span><span> metadata.getNamespace </span><span> </span><span style="color:#e45649;">status </span><span style="color:#a626a4;">&lt;-</span><span> pod.getStatus </span><span> </span><span style="color:#e45649;">phase </span><span style="color:#a626a4;">&lt;-</span><span> status.getPhase </span><span> </span><span style="color:#e45649;">message </span><span style="color:#a626a4;">=</span><span> status.message.getOrElse(</span><span style="color:#50a14f;">&quot;&quot;</span><span>) </span><span> } </span><span style="color:#a626a4;">yield </span><span>PodInfo(name, namespace, phase, message) </span></code></pre> <p>An alternative would be to just store the optional values in <code>PodInfo</code> and handle their absence in the <em>report sink</em>.</p> <p>Let's talk about the <em>type</em> of the above defined <code>run</code> function:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code 
class="language-scala" data-lang="scala"><span>ZIO[</span><span style="color:#c18401;">Pods </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">Console </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">Logging</span><span>, </span><span style="color:#a626a4;">Nothing</span><span>, </span><span style="color:#a626a4;">Unit</span><span>] </span></code></pre> <p>The ZIO <em>environment</em> precisely specifies the modules used by our <code>run</code> function:</p> <table><thead><tr><th>Module</th><th>Description</th></tr></thead><tbody> <tr><td><code>Pods</code></td><td>for accessing K8s pods</td></tr> <tr><td><code>Console</code></td><td>for printing <em>errors</em> on the standard error channel with <code>putStrLnErr</code></td></tr> <tr><td><code>Logging</code></td><td>for emitting some debug logs</td></tr> </tbody></table> <p>The error type is <code>Nothing</code> because it can never fail - all errors are caught and displayed to the user within the run function.</p> <h3 id="initialization">Initialization</h3> <p>Now we can see that in order to run the <code>list</code> command in <code>runWithParameters</code>, we must <em>provide</em> <code>Pods</code> and <code>Logging</code> modules to our implementation (<code>Console</code> is part of the default environment and does not need to be provided).</p> <p>These modules are described by <em>ZIO Layers</em> which can be composed together to provide the <em>environment</em> for running our ZIO program. 
In this case we need to define a <em>logging layer</em> and a <em>kubernetes pods client</em> layer and then compose the two for our <code>list</code> implementation.</p> <p>Let's start with logging:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">configuredLogging</span><span>(</span><span style="color:#e45649;">verbose</span><span>: </span><span style="color:#a626a4;">Boolean</span><span>): </span><span style="color:#c18401;">ZLayer</span><span>[</span><span style="color:#c18401;">Console </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">Clock</span><span>, </span><span style="color:#a626a4;">Nothing</span><span>, </span><span style="color:#c18401;">Logging</span><span>] </span><span style="color:#a626a4;">= </span><span>{ </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">logLevel </span><span style="color:#a626a4;">= if </span><span>(verbose) LogLevel.Trace </span><span style="color:#a626a4;">else </span><span>LogLevel.Info </span><span> Logging.consoleErr(logLevel) &gt;&gt;&gt; initializeSlf4jBridge </span><span> } </span></code></pre> <p>We create a simple ZIO console logger that will print lines to the standard error channel; the enabled log level is determined by the <code>verbose</code> command line argument. As this logger writes to the console and also prints timestamps, our logging layer <em>requires</em> <code>Console with Clock</code> to be able to build a <code>Logging</code> module. Enabling the <em>SLF4j bridge</em> guarantees that logs coming from third party libraries will also get logged through ZIO logging. 
In our example this means that when we enable verbose logging, our <code>kubectl</code> plugin will log the HTTP requests made by the Kubernetes library!</p> <p>The second layer we must define constructs a <code>Pods</code> module:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">pods </span><span style="color:#a626a4;">=</span><span> k8sDefault &gt;&gt;&gt; Pods.live </span></code></pre> <p>By using <code>k8sDefault</code> we ask <code>zio-k8s</code> to use the <em>default configuration chain</em>, which first tries to load the <code>kubeconfig</code> and use the active <em>context</em> stored in it. This is exactly what <code>kubectl</code> does, so it is the perfect choice when writing a <code>kubectl</code> plugin. Other variants provide more flexibility such as loading custom configuration with the <a href="https://zio.github.io/zio-config/">ZIO Config</a> library. Once we have a <em>k8s configuration</em> we just feed it to the set of resource modules we need. In this example we only need to access pods. In more complex applications this would be something like <code>k8sDefault &gt;&gt;&gt; (Pods.live ++ Deployments.live ++ ...)</code>.</p> <p>With both layers defined, we can now provide them to our command implementation:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>runCommand(parameters.command) </span><span> .provideCustomLayer(logging ++ pods) </span></code></pre> <h3 id="output">Output</h3> <p>The last thing missing is the <em>report sink</em> that we are running the stream of pods into. 
We are going to define three different sinks for the three output types.</p> <p>Let's start with JSON!</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">sink</span><span>[</span><span style="color:#c18401;">T</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Encoder</span><span>]: </span><span style="color:#c18401;">ZSink</span><span>[</span><span style="color:#c18401;">Console</span><span>, </span><span style="color:#a626a4;">Nothing</span><span>, </span><span style="color:#c18401;">T</span><span>, </span><span style="color:#c18401;">T</span><span>, </span><span style="color:#a626a4;">Unit</span><span>] </span><span style="color:#a626a4;">= </span><span> ZSink.foreach { (</span><span style="color:#e45649;">item</span><span>: </span><span style="color:#c18401;">T</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span> console.putStrLn(item.asJson.printWith(Printer.spaces2SortKeys)) </span><span> } </span></code></pre> <p>The JSON sink requires <code>Console</code>, and for each element of type <code>T</code> it converts the element to JSON and pretty prints it to the console. Note that this prints a separate JSON document for each element. 
We could easily define a different sink that collects each element and produces a single valid JSON array of them:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">arraySink</span><span>[</span><span style="color:#c18401;">T</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Encoder</span><span>]: </span><span style="color:#c18401;">ZSink</span><span>[</span><span style="color:#c18401;">Console</span><span>, </span><span style="color:#a626a4;">Nothing</span><span>, </span><span style="color:#c18401;">T</span><span>, </span><span style="color:#c18401;">T</span><span>, </span><span style="color:#a626a4;">Unit</span><span>] </span><span style="color:#a626a4;">= </span><span> ZSink.collectAll.flatMap { (</span><span style="color:#e45649;">items</span><span>: </span><span style="color:#c18401;">Chunk</span><span>[</span><span style="color:#c18401;">T</span><span>]) </span><span style="color:#a626a4;">=&gt; </span><span> ZSink.fromEffect { </span><span> console.putStrLn(Json.arr(items.map(</span><span style="color:#e45649;">_</span><span>.asJson): </span><span style="color:#a626a4;">_*</span><span>).printWith(Printer.spaces2SortKeys)) </span><span> } </span><span> } </span></code></pre> <p>The <code>T</code> type parameter in our example will always be <code>PodInfo</code>. By requiring it to have an implementation of circe's <code>Encoder</code> type class we can call <code>.asJson</code> on instances of <code>T</code>, encoding them into JSON objects. 
We can <em>derive</em> these encoders automatically:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">implicit val </span><span style="color:#e45649;">encoder</span><span>: </span><span style="color:#c18401;">Encoder</span><span>[</span><span style="color:#c18401;">PodInfo</span><span>] </span><span style="color:#a626a4;">=</span><span> deriveEncoder </span></code></pre> <p>Producing YAML output works exactly the same way, except that we first convert the JSON model to YAML with <code>asJson.asYaml</code>.</p> <p>The third output format option is to generate ASCII tables. We implement that with the same Java library as the original post, called <a href="https://github.com/vdmeer/asciitable"><code>asciitable</code></a>. In order to separate the specification of how to convert a <code>PodInfo</code> to a table from the sink implementation, we can define our own type class similar to the JSON <code>Encoder</code>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> Tabular</span><span>[</span><span style="color:#c18401;">T</span><span>] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">/** Initializes a table by setting properties and adding header rows </span><span style="color:#a0a1a7;"> */ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">createTableRenderer</span><span style="color:#c18401;">(): ZManaged[</span><span style="color:#a626a4;">Any</span><span style="color:#c18401;">, </span><span style="color:#a626a4;">Nothing</span><span style="color:#c18401;">, AsciiTable] </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span 
style="color:#a0a1a7;">/** Adds a single item of type T to the table created with [[createTableRenderer()]] </span><span style="color:#a0a1a7;"> */ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">addRow</span><span style="color:#c18401;">(</span><span style="color:#e45649;">table</span><span style="color:#c18401;">: AsciiTable)(</span><span style="color:#e45649;">item</span><span style="color:#c18401;">: T): UIO[</span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">] </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">/** Adds the table&#39;s footer and renders it to a string </span><span style="color:#a0a1a7;"> */ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">renderTable</span><span style="color:#c18401;">(</span><span style="color:#e45649;">table</span><span style="color:#c18401;">: AsciiTable): UIO[String] </span><span style="color:#c18401;"> } </span></code></pre> <p>We can implement this for <code>PodInfo</code> and then use a generic sink for printing the result table, similar to the previous examples:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">sink</span><span>[</span><span style="color:#c18401;">T</span><span>](</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">tabular</span><span>: </span><span style="color:#c18401;">Tabular</span><span>[</span><span style="color:#c18401;">T</span><span>]): </span><span style="color:#c18401;">ZSink</span><span>[</span><span style="color:#c18401;">Console</span><span>, </span><span style="color:#a626a4;">Nothing</span><span>, </span><span style="color:#c18401;">T</span><span>, </span><span 
style="color:#c18401;">T</span><span>, </span><span style="color:#a626a4;">Unit</span><span>] </span><span style="color:#a626a4;">= </span><span> ZSink.managed[</span><span style="color:#c18401;">Console</span><span>, </span><span style="color:#a626a4;">Nothing</span><span>, </span><span style="color:#c18401;">T</span><span>, </span><span style="color:#c18401;">AsciiTable</span><span>, </span><span style="color:#c18401;">T</span><span>, </span><span style="color:#a626a4;">Unit</span><span>](tabular.createTableRenderer()) { </span><span> </span><span style="color:#e45649;">table </span><span style="color:#a626a4;">=&gt; </span><span style="color:#a0a1a7;">// initialize the table </span><span> ZSink.foreach(tabular.addRow(table)) &lt;* </span><span style="color:#a0a1a7;">// add each row </span><span> printResultTable[</span><span style="color:#c18401;">T</span><span>](table) </span><span style="color:#a0a1a7;">// print the result </span><span> } </span><span> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">printResultTable</span><span>[</span><span style="color:#c18401;">T</span><span>]( </span><span> </span><span style="color:#e45649;">table</span><span>: </span><span style="color:#c18401;">AsciiTable </span><span>)(</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">tabular</span><span>: </span><span style="color:#c18401;">Tabular</span><span>[</span><span style="color:#c18401;">T</span><span>]): </span><span style="color:#c18401;">ZSink</span><span>[</span><span style="color:#c18401;">Console</span><span>, </span><span style="color:#a626a4;">Nothing</span><span>, </span><span style="color:#c18401;">T</span><span>, </span><span style="color:#c18401;">T</span><span>, </span><span style="color:#a626a4;">Unit</span><span>] </span><span style="color:#a626a4;">= </span><span> ZSink.fromEffect { </span><span> tabular </span><span> .renderTable(table) </span><span> .flatMap(</span><span style="color:#e45649;">str 
</span><span style="color:#a626a4;">=&gt;</span><span> console.putStrLn(str)) </span><span> } </span></code></pre> <h3 id="trying-it-out">Trying it out</h3> <p>With the report sinks implemented we have everything ready to try out our new <code>kubectl</code> plugin!</p> <p>We can compile the example to a <em>native image</em> and copy the resulting image to a location on the <code>PATH</code>:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>sbt nativeImage </span><span>cp target/native-image/kubectl-lp ~/bin </span></code></pre> <p>Then use <code>kubectl lp</code> to access our custom functions:</p> <p><img src="/images/blog-ziok8s-kubectlplugin.png" alt="kubectl-example" /></p> The Coralogix Operator: A Tale of ZIO and Kubernetes 2021-02-16T00:00:00+00:00 2021-02-16T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/zio-k8s/ <p>My blog post <a href="https://coralogix.com/blog/the-coralogix-operator-a-tale-of-zio-and-kubernetes/">published at the Coralogix blog</a> about using <a href="https://coralogix.github.io/zio-k8s/">zio-k8s</a> for writing operators.</p> ZIO-AWS with ZIO Query 2020-11-01T00:00:00+00:00 2020-11-01T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/zioaws-zioquery/ <p>A few years ago I wrote a <a href="https://blog.vigoo.dev/posts/aws-rate-limits-prezidig/">post</a> about how I refactored one of our internal tools at <a href="https://prezi.com">Prezi</a>. This command line tool was able to discover a set of AWS resources and present them in a nice human readable way. The primary motivation at that time was to introduce circuit breaking to survive AWS API rate limits.</p> <p>I have recently published a set of libraries, <a href="https://github.com/vigoo/zio-aws"><strong>zio-aws</strong></a>, and thought it would be interesting to rewrite this tool on top of it, and use this opportunity to try out <a href="https://zio.github.io/zio-query/"><strong>ZIO Query</strong></a> on a real-world example. 
In this post I'm going to show step by step how to build an efficient and easily extensible query tool with the help of <em>ZIO</em> libraries. The full source can be found <a href="https://github.com/vigoo/aws-query">on GitHub</a>.</p> <h2 id="the-task">The task</h2> <p>The CLI tool we build will get an arbitrary string as an input, and search for it in various AWS resources. Once it has a match, it has to traverse a graph of these resources and finally pretty-print all the gathered information to the console.</p> <img src="/images/awsquery-1.png"/> <p>The provided input could mean any of the following:</p> <ul> <li>An <strong>EC2</strong> <em>instance ID</em></li> <li>An <strong>ELB</strong> (load balancer)'s <em>name</em></li> <li>An <strong>ElasticBeanstalk</strong> <em>environment name</em> or <em>ID</em></li> <li>An <strong>ElasticBeanstalk</strong> <em>application name</em></li> <li>An <strong>ASG</strong> (auto-scaling group) <em>ID</em></li> </ul> <p>For the level of detail to be reported I copied the original tool. This means finding all the related resources in the above sets (plus among <em>launch configurations</em>) but only including a single <em>EC2 instance</em> in the output if it was explicitly queried. So for example if the search term matches an <em>ELB</em> that belongs to an <em>ElasticBeanstalk environment</em>, the report will contain the <em>EB app</em> and all its other environments as well, but won't show individual instances. This choice does not affect the design and could be easily changed or extended with additional resource types.</p> <h2 id="aws-client">AWS client</h2> <p>For querying the above mentioned resources, we have to call four different AWS services. 
The <code>zio-aws</code> project adds a streaming ZIO wrapper for <em>all</em> the libraries in <a href="https://docs.aws.amazon.com/sdk-for-java/v2/developer-guide/welcome.html">AWS Java SDK v2</a>, each published as a separate artifact:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>libraryDependencies ++= Seq( </span><span> </span><span style="color:#50a14f;">&quot;io.github.vigoo&quot;</span><span> %% </span><span style="color:#50a14f;">&quot;zio-aws-autoscaling&quot;</span><span> % zioAwsVersion, </span><span> </span><span style="color:#50a14f;">&quot;io.github.vigoo&quot;</span><span> %% </span><span style="color:#50a14f;">&quot;zio-aws-ec2&quot;</span><span> % zioAwsVersion, </span><span> </span><span style="color:#50a14f;">&quot;io.github.vigoo&quot;</span><span> %% </span><span style="color:#50a14f;">&quot;zio-aws-elasticloadbalancing&quot;</span><span> % zioAwsVersion, </span><span> </span><span style="color:#50a14f;">&quot;io.github.vigoo&quot;</span><span> %% </span><span style="color:#50a14f;">&quot;zio-aws-elasticbeanstalk&quot;</span><span> % zioAwsVersion, </span><span> </span><span> </span><span style="color:#50a14f;">&quot;io.github.vigoo&quot;</span><span> %% </span><span style="color:#50a14f;">&quot;zio-aws-netty&quot;</span><span> % zioAwsVersion </span><span>) </span></code></pre> <p>In addition to loading the necessary client libraries, we also need one of the <em>HTTP implementations</em>; in this case I chose the default <em>Netty</em>. Other possibilities are <em>akka-http</em> and <em>http4s</em>. 
If your application already uses one of these for other HTTP communications you may want to use them to share their configuration and pools.</p> <p>The client libraries have a <code>ZStream</code> API for all the operations that either support streaming (like for example S3 download/upload) or pagination, and <code>ZIO</code> wrapper for non-streaming simple operations. Instead of using the Java SDK's builders, the requests are described by <em>case classes</em>, and the <em>result</em> types have convenience accessors to handle the nullable results.</p> <p>Let's see some examples!</p> <p>We can get information about <em>EB applications</em> with the <em>ElasticBeanstalk</em> API's <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/api/API_DescribeApplications.html"><code>DescribeApplications</code> operation</a>. This is defined like the following in <code>zio-aws-elasticbeanstalk</code>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">describeApplications</span><span>(</span><span style="color:#e45649;">request</span><span>: </span><span style="color:#c18401;">DescribeApplicationsRequest</span><span>): </span><span style="color:#c18401;">ZIO</span><span>[</span><span style="color:#c18401;">ElasticBeanstalk</span><span>, </span><span style="color:#c18401;">AwsError</span><span>, </span><span style="color:#c18401;">DescribeApplicationsResponse</span><span>.</span><span style="color:#c18401;">ReadOnly</span><span>] </span><span> </span><span style="color:#a626a4;">type </span><span>ApplicationName </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">String </span><span style="color:#a626a4;">case class</span><span style="color:#c18401;"> DescribeApplicationsRequest</span><span>(</span><span style="color:#e45649;">applicationNames</span><span>: </span><span 
style="color:#c18401;">Option</span><span>[</span><span style="color:#c18401;">Iterable</span><span>[</span><span style="color:#c18401;">ApplicationName</span><span>]]) </span><span> </span><span style="color:#a626a4;">case class</span><span style="color:#c18401;"> DescribeApplicationsResponse</span><span>(</span><span style="color:#e45649;">applications </span><span>: </span><span style="color:#c18401;">Option</span><span>[</span><span style="color:#c18401;">Iterable</span><span>[</span><span style="color:#c18401;">ApplicationDescription</span><span>]]) </span><span style="color:#a626a4;">object</span><span style="color:#c18401;"> DescribeApplicationsResponse { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> ReadOnly { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">editable</span><span style="color:#c18401;">: DescribeApplicationsResponse </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">applicationsValue</span><span style="color:#c18401;">: Option[List[ApplicationDescription.ReadOnly]] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">applications</span><span style="color:#c18401;">: ZIO[</span><span style="color:#a626a4;">Any</span><span style="color:#c18401;">, AwsError, List[ApplicationDescription.ReadOnly]] </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre> <p>A few things to notice here:</p> <ul> <li>The client function requires the <code>ElasticBeanstalk</code> module. We will see how to set up the dependencies in the <em>Putting all together</em> section.</li> <li>The primitive types defined by the AWS schema are currently simple type aliases. 
In the future they will probably be replaced by <a href="https://github.com/zio/zio-prelude">zio-prelude</a>'s <em>newtypes</em>.</li> <li>Each wrapper type has a <code>ReadOnly</code> trait and a <em>case class</em>. The case classes are used as input, and the read-only interfaces as outputs. This way the result provided by the Java SDK can be accessed directly and it only has to be rewrapped in the case class if it is passed to another call as input.</li> <li>In many cases the AWS SDK describes fields as optional even if in normal circumstances they would never be <code>None</code>. To make it more convenient to work with these, the <code>ReadOnly</code> interface contains <em>accessor functions</em> which fail with <code>FieldIsNone</code> in case the field did not have any value. The pure optional values can be accessed with the <code>xxxValue</code> variants. See <code>applications</code> and <code>applicationsValue</code> in the above example.</li> </ul> <p>For operations that support pagination, the wrapper functions return a stream. The first actual AWS call happens when the stream is first pulled. 
An example for this that we have to use in this application is the <em>EC2</em> API's <a href="https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeInstances.html"><code>DescribeInstances</code> operation</a>.</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">describeInstances</span><span>(</span><span style="color:#e45649;">request</span><span>: </span><span style="color:#c18401;">DescribeInstancesRequest</span><span>): </span><span style="color:#c18401;">ZStream</span><span>[</span><span style="color:#c18401;">Ec2</span><span>, </span><span style="color:#c18401;">AwsError</span><span>, </span><span style="color:#c18401;">Reservation</span><span>.</span><span style="color:#c18401;">ReadOnly</span><span>] </span></code></pre> <p>The pagination can be controlled by setting the <code>MaxResults</code> property in <code>DescribeInstancesRequest</code>. For the user of the <code>describeInstances</code> function this is completely transparent, the returned stream will gather all the results, possibly by performing multiple AWS requests.</p> <h2 id="queries">Queries</h2> <p>We could implement the resource discovery directly using the low level AWS wrappers described above, using ZIO's tools to achieve concurrency. There are several things to consider though:</p> <ul> <li>We don't know what resource we are looking for, so we should start multiple queries in parallel to find a match as soon as possible</li> <li>Some queries return additional data that could be reused later. 
For example it is not possible to search for an ELB by an instance ID it contains; for that we have to query <em>all</em> load balancers and check the members on the client side.</li> <li>There are AWS operations that support querying multiple entities, for example by providing a list of IDs to look for</li> <li>We should minimize the number of calls to AWS, both for performance reasons, and to avoid getting rate limited</li> </ul> <p>We can achieve all this by expressing our AWS queries with a higher level abstraction, delegating the execution to a library called <a href="https://zio.github.io/zio-query/">ZIO Query</a>. This library lets us define composable <em>queries</em> to arbitrary <em>data sources</em>, and it automatically provides <em>pipelining</em>, <em>batching</em> and <em>caching</em>. A perfect match for the problem we have to solve here.</p> <p>To be able to cache results that became available as a side effect of a query, we need a <a href="https://github.com/zio/zio-query/pull/105">recent improvement</a> that is not published yet, so <code>aws-query</code> currently uses a snapshot release of <code>zio-query</code>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>libraryDependencies += </span><span style="color:#50a14f;">&quot;dev.zio&quot;</span><span> %% </span><span style="color:#50a14f;">&quot;zio-query&quot;</span><span> % </span><span style="color:#50a14f;">&quot;0.2.5+12-c41557f7-SNAPSHOT&quot; </span></code></pre> <p>The first step is to define custom <em>data sources</em>. 
Data sources must implement a function <code>runAll</code> with the following signature:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">runAll</span><span>(</span><span style="color:#e45649;">requests</span><span>: </span><span style="color:#c18401;">Chunk</span><span>[</span><span style="color:#c18401;">Chunk</span><span>[</span><span style="color:#c18401;">A</span><span>]]): </span><span style="color:#c18401;">ZIO</span><span>[</span><span style="color:#c18401;">R</span><span>, </span><span style="color:#a626a4;">Nothing</span><span>, </span><span style="color:#c18401;">CompletedRequestMap</span><span>] </span></code></pre> <p>Here <code>A</code> is the <em>request type</em> specific to a given data source (extending <code>Request[E, A]</code>), and the returned <code>CompletedRequestMap</code> will store an <code>Either[E, A]</code> result for each request. The two nested chunks model sequential and parallel execution: the requests in the inner chunks can be executed in parallel, while the batches contained by the outer chunk must be performed sequentially. In practice we won't implement this method but use <code>DataSource.Batched</code>, a simplified version that can perform requests in parallel but does not make further optimizations on the requests to be performed sequentially.</p> <p>What should belong to one data source? It could be a single data source for all the AWS queries, or one per service, or one per resource type. The best choice in this case is to have one for each resource type, for the following reasons:</p> <ul> <li>There are no opportunities to do any cross-resource-type caching. 
For example when we are querying EC2 instances, we won't fetch auto scaling groups as a side effect.</li> <li>If all requests are about the same data type, implementing the data source is much simpler</li> </ul> <p>Let's see a simple example. EC2 instances can be queried by <em>instance ID</em> with the <a href="https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeInstances.html"><code>DescribeInstances</code></a> operation, and it supports querying for multiple IDs in a single request. We first define a <em>request type</em>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">case class</span><span style="color:#c18401;"> GetEc2Instance</span><span>(</span><span style="color:#e45649;">id</span><span>: </span><span style="color:#c18401;">InstanceId</span><span>) </span><span style="color:#a626a4;">extends </span><span>Request[</span><span style="color:#c18401;">AwsError</span><span>, </span><span style="color:#c18401;">Instance</span><span>.</span><span style="color:#c18401;">ReadOnly</span><span>] </span></code></pre> <p>Then the data source:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">ec2InstancesDataSource</span><span>: </span><span style="color:#c18401;">DataSource</span><span>[</span><span style="color:#c18401;">Logging </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">Ec2</span><span>, </span><span style="color:#c18401;">GetEc2Instance</span><span>] </span><span style="color:#a626a4;">= </span><span> DataSource.Batched.make(</span><span style="color:#50a14f;">&quot;ec2&quot;</span><span>) { (</span><span style="color:#e45649;">requests</span><span>: </span><span style="color:#c18401;">Chunk</span><span>[</span><span 
style="color:#c18401;">GetEc2Instance</span><span>]) </span><span style="color:#a626a4;">=&gt; </span><span> </span><span style="color:#a626a4;">import</span><span> AwsDataSource.</span><span style="color:#e45649;">_ </span><span> </span><span> </span><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#e45649;">result </span><span style="color:#a626a4;">&lt;-</span><span> ec2.describeInstances(DescribeInstancesRequest(instanceIds </span><span style="color:#a626a4;">= </span><span>Some(requests.map(</span><span style="color:#e45649;">_</span><span>.id)))) </span><span> .mapM(</span><span style="color:#e45649;">_</span><span>.instances) </span><span> .flatMap(</span><span style="color:#e45649;">instances </span><span style="color:#a626a4;">=&gt; </span><span>ZStream.fromIterable(instances)) </span><span> .foldM(CompletedRequestMap.empty) { (</span><span style="color:#e45649;">resultMap</span><span>, </span><span style="color:#e45649;">item</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span> </span><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#e45649;">instanceId </span><span style="color:#a626a4;">&lt;-</span><span> item.instanceId </span><span> } </span><span style="color:#a626a4;">yield</span><span> resultMap.insert(GetEc2Instance(instanceId))(Right(item)) </span><span> } </span><span> .recordFailures(</span><span style="color:#50a14f;">&quot;DescribeInstances&quot;</span><span>, requests) </span><span> } </span><span style="color:#a626a4;">yield</span><span> result </span><span> } </span></code></pre> <p>Here <code>requests</code> holds a set of <code>GetEc2Instance</code> requests to be performed in parallel. We can simply do this by taking all the <em>instance IDs</em> from these requests and performing a single <code>describeInstances</code> AWS call. The result, as I explained before, is a <code>ZStream</code> of instances. 
We have to construct a <code>CompletedRequestMap</code> holding one entry for each request in <code>requests</code>. To do this we <code>foldM</code> the stream, using the <code>instanceId</code> accessor function to reconstruct the request value for each item in the result stream.</p> <p>The <code>.recordFailures</code> function is a helper extension method defined in <code>AwsDataSource</code>. It catches all errors and produces a <code>CompletedRequestMap</code> where all requested items are recorded as failures:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">recordFailures</span><span>[</span><span style="color:#c18401;">A</span><span>](</span><span style="color:#e45649;">description</span><span>: </span><span style="color:#c18401;">String</span><span>, </span><span> </span><span style="color:#e45649;">requests</span><span>: </span><span style="color:#c18401;">Iterable</span><span>[</span><span style="color:#c18401;">Request</span><span>[</span><span style="color:#c18401;">AwsError</span><span>, </span><span style="color:#c18401;">A</span><span>]]): </span><span style="color:#c18401;">ZIO</span><span>[</span><span style="color:#c18401;">R</span><span>, </span><span style="color:#a626a4;">Nothing</span><span>, </span><span style="color:#c18401;">CompletedRequestMap</span><span>] </span><span style="color:#a626a4;">= </span><span> f.catchAll { </span><span style="color:#e45649;">error </span><span style="color:#a626a4;">=&gt; </span><span> log.error(</span><span style="color:#0184bc;">s</span><span style="color:#50a14f;">&quot;</span><span style="color:#e45649;">$description</span><span style="color:#50a14f;"> failed with </span><span style="color:#e45649;">$error</span><span style="color:#50a14f;">&quot;</span><span>) *&gt; </span><span> ZIO.succeed { </span><span> 
requests.foldLeft(CompletedRequestMap.empty) { </span><span style="color:#a626a4;">case </span><span>(</span><span style="color:#e45649;">resultMap</span><span>, </span><span style="color:#e45649;">req</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span> resultMap.insert(req)(Left(error)) </span><span> } </span><span> } </span><span> } </span></code></pre> <p>This is necessary because the data source requires a function of type <code>Chunk[A] =&gt; ZIO[R, Nothing, CompletedRequestMap]</code> that cannot fail.</p> <p>With the data source defined, we can define primitive <em>queries</em> on it:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">getEc2Instance</span><span>(</span><span style="color:#e45649;">id</span><span>: </span><span style="color:#c18401;">InstanceId</span><span>): </span><span style="color:#c18401;">ZQuery</span><span>[</span><span style="color:#c18401;">Logging </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">Ec2</span><span>, </span><span style="color:#c18401;">AwsError</span><span>, </span><span style="color:#c18401;">Instance</span><span>.</span><span style="color:#c18401;">ReadOnly</span><span>] </span><span style="color:#a626a4;">= </span><span> ZQuery.fromRequest(GetEc2Instance(id))(ec2InstancesDataSource) </span></code></pre> <p>A more complex example is <code>ebEnvDataSource</code>, the data source of <em>ElasticBeanstalk environments</em>. 
For this resource, we have different request types:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">sealed trait</span><span style="color:#c18401;"> EbEnvRequest</span><span>[</span><span style="color:#a626a4;">+</span><span style="color:#c18401;">A</span><span>] </span><span style="color:#a626a4;">extends </span><span>Request[</span><span style="color:#c18401;">AwsError</span><span>, </span><span style="color:#c18401;">A</span><span>] </span><span style="color:#a626a4;">case class</span><span style="color:#c18401;"> GetEnvironmentByName</span><span>(</span><span style="color:#e45649;">name</span><span>: </span><span style="color:#c18401;">EnvironmentName</span><span>) </span><span> </span><span style="color:#a626a4;">extends </span><span>EbEnvRequest[</span><span style="color:#c18401;">Option</span><span>[</span><span style="color:#c18401;">EnvironmentDescription</span><span>.</span><span style="color:#c18401;">ReadOnly</span><span>]] </span><span style="color:#a626a4;">case class</span><span style="color:#c18401;"> GetEnvironmentById</span><span>(</span><span style="color:#e45649;">id</span><span>: </span><span style="color:#c18401;">EnvironmentId</span><span>) </span><span> </span><span style="color:#a626a4;">extends </span><span>EbEnvRequest[</span><span style="color:#c18401;">Option</span><span>[</span><span style="color:#c18401;">EnvironmentDescription</span><span>.</span><span style="color:#c18401;">ReadOnly</span><span>]] </span><span style="color:#a626a4;">case class</span><span style="color:#c18401;"> GetEnvironmentByApplicationName</span><span>(</span><span style="color:#e45649;">name</span><span>: </span><span style="color:#c18401;">ApplicationName</span><span>) </span><span> </span><span style="color:#a626a4;">extends </span><span>EbEnvRequest[</span><span style="color:#c18401;">List</span><span>[</span><span 
style="color:#c18401;">EnvironmentDescription</span><span>.</span><span style="color:#c18401;">ReadOnly</span><span>]] </span></code></pre> <p>In the data source implementation we get a <code>Chunk</code> of <code>EbEnvRequest</code> values to be performed in parallel. We start by separating them by request type:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">byName </span><span style="color:#a626a4;">=</span><span> requests.collect { </span><span style="color:#a626a4;">case </span><span>GetEnvironmentByName(</span><span style="color:#e45649;">name</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> name } </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">byId </span><span style="color:#a626a4;">=</span><span> requests.collect { </span><span style="color:#a626a4;">case </span><span>GetEnvironmentById(</span><span style="color:#e45649;">id</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> id } </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">byAppName </span><span style="color:#a626a4;">=</span><span> requests.collect { </span><span style="color:#a626a4;">case </span><span>GetEnvironmentByApplicationName(</span><span style="color:#e45649;">name</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> name } </span></code></pre> <p>Then for each of these collections, if not empty, we can perform a <code>describeEnvironments</code> AWS call and then fold the result stream to create partial <code>CompletedRequestMap</code> values. What is interesting here is that once we have queried an environment by its name, its id or its application name, we know both its identifier and its name, so we can store additional items in <code>CompletedRequestMap</code> that will be cached and reused in future queries. 
For example this is how the query by-id gets processed:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>resultMap &lt;- elasticbeanstalk </span><span> .describeEnvironments(DescribeEnvironmentsRequest(environmentIds </span><span style="color:#a626a4;">= </span><span>Some(byId))) </span><span> .foldM(initialResultMap) { (</span><span style="color:#e45649;">resultMap</span><span>, </span><span style="color:#e45649;">item</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span> </span><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#e45649;">name </span><span style="color:#a626a4;">&lt;-</span><span> item.environmentName </span><span> </span><span style="color:#e45649;">id </span><span style="color:#a626a4;">&lt;-</span><span> item.environmentId </span><span> } </span><span style="color:#a626a4;">yield</span><span> resultMap </span><span> .insert(GetEnvironmentById(id))(Right(Some(item))) </span><span> .insert(GetEnvironmentByName(name))(Right(Some(item))) </span><span> } </span><span> .recordFailures(</span><span style="color:#50a14f;">&quot;DescribeEnvironmentRequest(id)&quot;</span><span>, byId.map(GetEnvironmentById)) </span></code></pre> <p>For all three request types we describe the computation to create a partial <code>CompletedRequestMap</code> for them. 
Then we can implement the data source by executing these (at most) three queries in parallel and combining the results:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>byNameResultMap </span><span> .zipWithPar(byIdResultMap)(</span><span style="color:#e45649;">_</span><span> ++ </span><span style="color:#e45649;">_</span><span>) </span><span> .zipWithPar(byAppNameResultMap)(</span><span style="color:#e45649;">_</span><span> ++ </span><span style="color:#e45649;">_</span><span>) </span></code></pre> <p>There are some cases where being able to query <em>all</em> instances of a given resource is also a requirement. An example is <em>load balancers</em>, where the only way to find out whether an ELB contains a given <em>EC2 instance</em> is to query <em>all</em> ELBs and check their members. There are a few more cases that require a very similar implementation, so it makes sense to extract it to a common place. 
We define an <code>AllOrPerItem</code> trait that defines the specifics per use case:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> AllOrPerItem</span><span>[</span><span style="color:#c18401;">R</span><span>, </span><span style="color:#c18401;">Req</span><span>, </span><span style="color:#c18401;">Item</span><span>] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">name</span><span style="color:#c18401;">: String </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">isGetAll</span><span style="color:#c18401;">(</span><span style="color:#e45649;">request</span><span style="color:#c18401;">: Req): </span><span style="color:#a626a4;">Boolean </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">isPerItem</span><span style="color:#c18401;">(</span><span style="color:#e45649;">request</span><span style="color:#c18401;">: Req): </span><span style="color:#a626a4;">Boolean </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">allReq</span><span style="color:#c18401;">: Req </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">itemToReq</span><span style="color:#c18401;">(</span><span style="color:#e45649;">item</span><span style="color:#c18401;">: Item): ZIO[R, AwsError, Req] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">getAll</span><span style="color:#c18401;">(): ZStream[R, AwsError, Item] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span 
style="color:#0184bc;">getSome</span><span style="color:#c18401;">(</span><span style="color:#e45649;">reqs</span><span style="color:#c18401;">: Set[Req]): ZStream[R, AwsError, Item] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">processAdditionalRequests</span><span style="color:#c18401;">(</span><span style="color:#e45649;">requests</span><span style="color:#c18401;">: Chunk[Req], </span><span style="color:#c18401;"> </span><span style="color:#e45649;">partialResult</span><span style="color:#c18401;">: CompletedRequestMap): ZIO[R, </span><span style="color:#a626a4;">Nothing</span><span style="color:#c18401;">, CompletedRequestMap] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> ZIO.succeed(partialResult) </span><span style="color:#c18401;">} </span></code></pre> <p>By implementing these one-liners the actual data source implementation can be shared code, defined in <code>AllOrPerItem.make</code>. It's very similar to the examples already seen. If any of the requests is the <em>get all request</em>, that's the only request to be performed, and all the result items will be cached. Otherwise a single batched request is made.</p> <p>These primitive <code>ZQuery</code>s can then be composed into more complex queries. 
For example the following code:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#e45649;">instance </span><span style="color:#a626a4;">&lt;-</span><span> ec2query.getEc2Instance(instanceId) </span><span> </span><span style="color:#e45649;">imageId </span><span style="color:#a626a4;">&lt;- </span><span>ZQuery.fromEffect(instance.imageId) </span><span> </span><span style="color:#e45649;">imgElb </span><span style="color:#a626a4;">&lt;- </span><span>(ec2query.getImage(imageId) &lt;&amp;&gt; elbquery.loadBalancerOf(instanceId)) </span><span> (</span><span style="color:#e45649;">image</span><span>, </span><span style="color:#e45649;">elb</span><span>) </span><span style="color:#a626a4;">=</span><span> imgElb </span><span> </span><span style="color:#e45649;">elbReport </span><span style="color:#a626a4;">&lt;-</span><span> optionally(elb)(getElbReport) </span><span> </span><span style="color:#e45649;">result </span><span style="color:#a626a4;">&lt;- </span><span style="color:#a0a1a7;">// ... </span><span>} </span><span style="color:#a626a4;">yield</span><span> result </span></code></pre> <p>This is part of the definition of a query of type <code>ZQuery[QueryEnv, AwsError, LinkedReport[Ec2InstanceKey, Ec2InstanceReport]]</code>. We will talk about <code>QueryEnv</code> and <code>LinkedReport</code> later; for now it's enough to understand that this is a more complex query that provides an <em>EC2 instance report</em>, the data type that will be used to render the human-readable output. The query first gets an EC2 instance by <em>instance ID</em>. Then with <code>ZQuery.fromEffect</code> we lift a <code>ZIO</code> effect to the query. 
In this case this is a <code>zio-aws</code> <em>accessor function</em> that fails if <code>imageId</code> is <code>None</code>.</p> <p>By this we express that we <em>expect</em> that <code>imageId</code> is always specified, and if not, we fail the <em>whole query</em>. Then we use <code>&lt;&amp;&gt;</code> (its alias is <code>zipPar</code>) to perform two queries <strong>in parallel</strong>: getting an EC2 image and finding the load balancer containing the instance. Once both queries are finished, we optionally generate a <em>load balancer report</em> (if we have found an ELB link) and then we construct the result.</p> <p>Here <code>optionally</code> is a simple helper function that makes our query more readable. It could have been written as <code>elb.fold(ZQuery.none)(getElbReport)</code>.</p> <p>Another useful combinator on <code>ZQuery</code> is <code>collectAllPar</code>, which runs a subquery on each item of a collection in parallel:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#e45649;">elbNames </span><span style="color:#a626a4;">&lt;- </span><span>ZQuery.fromEffect(asg.loadBalancerNames) </span><span> </span><span style="color:#e45649;">result </span><span style="color:#a626a4;">&lt;- </span><span>ZQuery.collectAllPar(elbNames.map(</span><span style="color:#e45649;">name </span><span style="color:#a626a4;">=&gt;</span><span> elbquery.getLoadBalancer(name) &gt;&gt;= getElbReport)) </span><span>} </span><span style="color:#a626a4;">yield</span><span> result </span></code></pre> <p>As I mentioned earlier, we have no way to know what resource we are looking for (in fact we could for example detect EC2 <em>instance IDs</em> by a pattern but let's ignore that for now). 
So on the top level we simply start <em>all</em> the possible queries <strong>at once</strong> and print all the non-failing ones:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#e45649;">renderers </span><span style="color:#a626a4;">&lt;- </span><span>ZQuery.collectAllPar(possibleQueries).run </span><span> </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">&lt;- </span><span>ZIO.foreach_(renderers.flatten)(identity) </span><span>} </span><span style="color:#a626a4;">yield </span><span style="color:#c18401;">() </span></code></pre> <p>Here <code>possibleQueries</code> is where we list all the queries we want to support, each tied to the <em>renderer</em> that shows its result on the console.</p> <h2 id="report-cache">Report cache</h2> <p><em>ZIO Query</em> solves caching and optimizes the requests on the AWS resource level, but we still have a problem. The queries form a cyclic graph. For example an <em>EC2 instance</em> holds a link to its <em>load balancer</em>, which holds a link to the <em>EB environment</em> it is defined in. The environment refers back to the ELB, and it also links to the <em>EB app</em>, and the application in turn links to all the <em>environments</em> it contains.</p> <p>We want to collect all these resources exactly once, and there is a chance that parallel queries reach the same resource. To solve this we can add an extra <em>caching layer</em> on top of <em>ZIO Query</em>. 
Let's define this caching layer as a ZIO <em>module</em>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">object</span><span style="color:#c18401;"> ReportCache { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> Service { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">storeIfNew</span><span style="color:#c18401;">[A </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">Report](</span><span style="color:#e45649;">reportKey</span><span style="color:#c18401;">: ReportKey, </span><span style="color:#c18401;"> </span><span style="color:#e45649;">query</span><span style="color:#c18401;">: ZQuery[</span><span style="color:#a626a4;">Any</span><span style="color:#c18401;">, AwsError, A]): ZQuery[</span><span style="color:#a626a4;">Any</span><span style="color:#c18401;">, AwsError, </span><span style="color:#a626a4;">Boolean</span><span style="color:#c18401;">] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">retrieve</span><span style="color:#c18401;">[A </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">Report](</span><span style="color:#e45649;">key</span><span style="color:#c18401;">: ReportKey): ZIO[</span><span style="color:#a626a4;">Any</span><span style="color:#c18401;">, AwsError, Option[A]] </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre> <p>The <code>storeIfNew</code> function is a <em>query</em>, to be used in high level queries to shortcut cycles in case a given report is already stored in the cache. 
We can define a helper function <code>cached</code> like the following:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">protected def </span><span style="color:#0184bc;">cached</span><span>[</span><span style="color:#c18401;">R </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ReportCache </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">Logging</span><span>, </span><span style="color:#c18401;">A</span><span>, </span><span style="color:#c18401;">B </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">Report</span><span>, </span><span style="color:#c18401;">K </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ReportKey</span><span>] </span><span> (</span><span style="color:#e45649;">input</span><span>: </span><span style="color:#c18401;">A</span><span>) </span><span> (</span><span style="color:#e45649;">keyFn</span><span>: </span><span style="color:#c18401;">A </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">ZIO</span><span>[</span><span style="color:#a626a4;">Any</span><span>, </span><span style="color:#c18401;">AwsError</span><span>, </span><span style="color:#c18401;">K</span><span>]) </span><span> (</span><span style="color:#e45649;">query</span><span>: </span><span style="color:#c18401;">K </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">ZQuery</span><span>[</span><span style="color:#c18401;">R</span><span>, </span><span style="color:#c18401;">AwsError</span><span>, </span><span style="color:#c18401;">B</span><span>]): </span><span style="color:#c18401;">ZQuery</span><span>[</span><span style="color:#c18401;">R</span><span>, </span><span style="color:#c18401;">AwsError</span><span>, </span><span style="color:#c18401;">LinkedReport</span><span>[</span><span 
style="color:#c18401;">K</span><span>, </span><span style="color:#c18401;">B</span><span>]] </span><span style="color:#a626a4;">= </span><span> </span><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#e45649;">key </span><span style="color:#a626a4;">&lt;- </span><span>ZQuery.fromEffect(keyFn(input)) </span><span> </span><span style="color:#e45649;">env </span><span style="color:#a626a4;">&lt;- </span><span>ZQuery.environment[</span><span style="color:#c18401;">R</span><span>] </span><span> </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">&lt;-</span><span> storeIfNew( </span><span> key, </span><span> query(key).provide(env ? </span><span style="color:#50a14f;">&quot;provided environment&quot;</span><span>) </span><span> ) </span><span> } </span><span style="color:#a626a4;">yield </span><span>LinkedReport[</span><span style="color:#c18401;">K</span><span>, </span><span style="color:#c18401;">B</span><span>](key) </span></code></pre> <p>Then we can use it in queries like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">getEbAppReport</span><span>(</span><span style="color:#e45649;">name</span><span>: </span><span style="color:#c18401;">ApplicationName</span><span>): </span><span style="color:#c18401;">ZQuery</span><span>[</span><span style="color:#c18401;">QueryEnv</span><span>, </span><span style="color:#c18401;">AwsError</span><span>, </span><span style="color:#c18401;">LinkedReport</span><span>[</span><span style="color:#c18401;">EbAppKey</span><span>, </span><span style="color:#c18401;">EbAppReport</span><span>]] </span><span style="color:#a626a4;">= </span><span> cached(name)(</span><span style="color:#e45649;">name </span><span style="color:#a626a4;">=&gt; </span><span>ZIO.succeed(EbAppKey(name))) { (</span><span 
style="color:#e45649;">key</span><span>: </span><span style="color:#c18401;">EbAppKey</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span> </span><span style="color:#a0a1a7;">// ... </span><span> } </span></code></pre> <p>Let's see in detail how this works!</p> <p>First of all, we define the following types:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> LinkedReport</span><span>[</span><span style="color:#a626a4;">+</span><span style="color:#c18401;">K </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ReportKey</span><span>, </span><span style="color:#a626a4;">+</span><span style="color:#c18401;">R </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">Report</span><span>](</span><span style="color:#e45649;">key</span><span>: </span><span style="color:#c18401;">K</span><span>) </span><span> </span><span style="color:#a626a4;">sealed trait</span><span style="color:#c18401;"> ReportKey </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> Ec2InstanceKey</span><span>(</span><span style="color:#e45649;">instanceId</span><span>: </span><span style="color:#c18401;">InstanceId</span><span>) </span><span style="color:#a626a4;">extends </span><span>ReportKey </span><span style="color:#a0a1a7;">// ... </span><span> </span><span style="color:#a626a4;">sealed trait</span><span style="color:#c18401;"> Report </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> Ec2InstanceReport</span><span>(</span><span style="color:#e45649;">instanceId</span><span>: ec2.model.primitives.</span><span style="color:#c18401;">InstanceId</span><span>, </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span> </span><span style="color:#e45649;">elb</span><span>: </span><span style="color:#c18401;">Option</span><span>[</span><span style="color:#c18401;">LinkedReport</span><span>[</span><span style="color:#c18401;">ElbKey</span><span>, </span><span style="color:#c18401;">ElbReport</span><span>]] </span><span> ) </span><span style="color:#a626a4;">extends </span><span>Report </span></code></pre> <p>In <code>cached</code>, we provide a <code>keyFn</code>, an effectful function that extracts the <code>ReportKey</code> from an arbitrary input, which can be the key itself or an already fetched resource. Then we call the <code>ReportCache</code> module's <code>storeIfNew</code> query and return a <code>LinkedReport</code>. A <em>linked report</em> is just a wrapper around a report key; it is the type to be used in <code>Report</code> types to refer to each other. We store the cyclic resource graph by using these report keys and the cache's <code>retrieve</code> function to resolve the references on demand.</p> <p>One thing to notice is the <code>.provide</code> in the code of <code>cached</code>. The report cache does not know about the environments needed for the queries it caches the results of; the <code>query</code> parameter of <code>storeIfNew</code> has the type <code>ZQuery[Any, AwsError, A]</code>. For this reason <code>cached</code> eliminates the environment of its inner query by getting it and calling <code>.provide(env)</code> before passing it to the cache.</p> <p>The report cache itself can be implemented with <a href="https://zio.dev/docs/datatypes/datatypes_stm"><em>STM</em></a>. 
First we create a <a href="https://zio.dev/docs/datatypes/datatypes_tmap"><code>TMap</code></a>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>cache &lt;- TMap.empty[</span><span style="color:#c18401;">ReportKey</span><span>, </span><span style="color:#c18401;">Promise</span><span>[</span><span style="color:#c18401;">AwsError</span><span>, </span><span style="color:#c18401;">Report</span><span>]].commit </span></code></pre> <p>We want to store the fact that a query <em>has been started</em> for a given report key. This can be modelled with a <code>Promise</code> that eventually gets a <code>Report</code> value. With this <code>TMap</code> structure, the <code>storeIfNew</code> function can be defined as:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">storeIfNew</span><span>[</span><span style="color:#c18401;">A </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">Report</span><span>](</span><span style="color:#e45649;">reportKey</span><span>: </span><span style="color:#c18401;">ReportKey</span><span>, </span><span> </span><span style="color:#e45649;">query</span><span>: </span><span style="color:#c18401;">ZQuery</span><span>[</span><span style="color:#a626a4;">Any</span><span>, </span><span style="color:#c18401;">AwsError</span><span>, </span><span style="color:#c18401;">A</span><span>]): </span><span style="color:#c18401;">ZQuery</span><span>[</span><span style="color:#a626a4;">Any</span><span>, </span><span style="color:#c18401;">AwsError</span><span>, </span><span style="color:#a626a4;">Boolean</span><span>] </span><span style="color:#a626a4;">= </span><span> ZQuery.fromEffect { </span><span> </span><span style="color:#a626a4;">for </span><span>{ </span><span> 
</span><span style="color:#e45649;">promise </span><span style="color:#a626a4;">&lt;- </span><span>Promise.make[</span><span style="color:#c18401;">AwsError</span><span>, </span><span style="color:#c18401;">Report</span><span>] </span><span> </span><span style="color:#e45649;">finalQuery </span><span style="color:#a626a4;">&lt;-</span><span> cache.get(reportKey).flatMap { </span><span> </span><span style="color:#a626a4;">case </span><span>Some(</span><span style="color:#e45649;">report</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span> </span><span style="color:#a0a1a7;">// already cached (or in progress): replacing the query with a no-op </span><span> ZSTM.succeed(ZQuery.succeed(</span><span style="color:#c18401;">false</span><span>)) </span><span> </span><span style="color:#a626a4;">case </span><span>None </span><span style="color:#a626a4;">=&gt; </span><span> </span><span style="color:#a0a1a7;">// not cached yet: storing the promise and returning the query that completes it </span><span> cache.put(reportKey, promise).map { </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">=&gt; </span><span> query.foldM( </span><span> </span><span style="color:#e45649;">failure </span><span style="color:#a626a4;">=&gt; </span><span>ZQuery.fromEffect(promise.fail(failure)) *&gt; ZQuery.fail(failure), </span><span> </span><span style="color:#e45649;">success </span><span style="color:#a626a4;">=&gt; </span><span>ZQuery.fromEffect(promise.succeed(success)) </span><span> ) </span><span> } </span><span> }.commit </span><span> } </span><span style="color:#a626a4;">yield</span><span> finalQuery </span><span> }.flatMap(identity) </span></code></pre> <p>This may seem simple but actually we are combining three different layers of abstraction here!</p> <ul> <li>The whole thing is a <em>query</em>. 
But we first run a <em>ZIO effect</em> that <strong>produces</strong> a query, and then execute that result query (in <code>.flatMap(identity)</code>)</li> <li>In the effect we create a promise that might be used or not, depending on the outcome of the transaction. Then we do <code>cache.get</code> which is an <em>STM transaction</em>.</li> <li>In the transaction we produce a <code>ZQuery</code> value that is either returning a simple <code>false</code> value if the report was already cached, or we store the already created promise in the map and return the query that constructs the report as the <em>result</em> of the transaction.</li> <li>As it is an <em>STM transaction</em> it may be retried multiple times but eventually it returns with a query that is either a NOP or calculates the <em>report</em> <strong>and</strong> sets the promise at the end.</li> </ul> <p>The other function of <code>ReportCache</code>, <code>retrieve</code>, will be used when traversing the gathered <em>reports</em> to follow the <code>LinkedReport</code> links. It is simply a combination of getting an item from the <code>TMap</code> and then waiting for the stored promise.</p> <h2 id="throttling">Throttling</h2> <p>The original implementation of this tool did not control the number and rate of AWS requests in any way, and a few years ago API rate limits made it somewhat unusable. As I explained <a href="https://blog.vigoo.dev/posts/aws-rate-limits-prezidig/">in a previous post</a>, I solved it by centralizing the calls to AWS and then adding <em>circuit breaking and retry</em> to handle the <em>throttling errors</em>.</p> <p>In this new implementation <em>ZIO Query</em>'s batching feature already reduces the load, but AWS has a global rate limit that can be reached at any time, regardless of the actual request rate produced by this application. 
So how could we handle this with <code>zio-aws</code> and ZIO Query?</p> <p>There is a useful ZIO library called <a href="https://www.vroste.nl/rezilience/">rezilience</a> that defines utilities to express circuit breaking, retries, rate limiting and other similar policies. With this library we can create a policy that detects <code>AwsError</code>s representing throttling failures:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">private def </span><span style="color:#0184bc;">throttlingPolicy</span><span>: </span><span style="color:#c18401;">ZManaged</span><span>[</span><span style="color:#c18401;">Random </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">Clock </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">Logging</span><span>, </span><span style="color:#a626a4;">Nothing</span><span>, </span><span style="color:#c18401;">Policy</span><span>[</span><span style="color:#c18401;">AwsError</span><span>]] </span><span style="color:#a626a4;">= </span><span> </span><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#e45649;">cb </span><span style="color:#a626a4;">&lt;- </span><span>CircuitBreaker.make[</span><span style="color:#c18401;">AwsError</span><span>]( </span><span> trippingStrategy </span><span style="color:#a626a4;">= </span><span>TrippingStrategy.failureCount(</span><span style="color:#c18401;">1</span><span>), </span><span> resetPolicy </span><span style="color:#a626a4;">= </span><span>Retry.Schedules.exponentialBackoff(min </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">1</span><span>.second, max </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">1</span><span>.minute), </span><span> isFailure </span><span style="color:#a626a4;">= </span><span>{ </span><span> </span><span 
style="color:#a626a4;">case </span><span>GenericAwsError(</span><span style="color:#e45649;">error</span><span>: </span><span style="color:#c18401;">AwsServiceException</span><span>) </span><span style="color:#a626a4;">if</span><span> error.isThrottlingException </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">true </span><span> } </span><span> ) </span><span> </span><span style="color:#e45649;">retry </span><span style="color:#a626a4;">&lt;- </span><span>Retry.make(min </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">1</span><span>.second, max </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">1</span><span>.minute) </span><span> </span><span style="color:#e45649;">retryComposable </span><span style="color:#a626a4;">=</span><span> retry.widen[</span><span style="color:#c18401;">PolicyError</span><span>[</span><span style="color:#c18401;">AwsError</span><span>]] { </span><span style="color:#a626a4;">case </span><span>Policy.WrappedError(</span><span style="color:#e45649;">e</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> e } </span><span> } </span><span style="color:#a626a4;">yield</span><span> cb.toPolicy compose retryComposable.toPolicy </span></code></pre> <p>This will open a circuit breaker in case of throttling errors, and retry the operation with exponential back-off.</p> <p>These policies can be applied to <code>ZIO</code> effects. What we really need is to apply a policy like this to <em>all</em> AWS calls. It should be applied to the actual call to the underlying <em>AWS Java SDK</em>, not on the <code>zio-aws</code> wrapper level, because for example a streaming API function may produce multiple AWS requests.</p> <p>The <code>zio-aws</code> library supports applying <code>AwsCallAspect</code>s on the <em>AWS service client layers</em> to modify the underlying SDK calls. This is exactly what we need to apply the throttling policy to all calls! 
What's even better, by creating a single <code>throttlingPolicy</code> and applying it to all the service layers (<code>ec2</code>, <code>elasticloadbalancing</code>, <code>elasticbeanstalk</code> and <code>autoscaling</code>) they will share a common circuit breaker. This matches the situation perfectly, as the AWS API rate limiting is applied globally to all services.</p> <p>An AWS call aspect has the following form:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">throttling </span><span style="color:#a626a4;">= new </span><span style="color:#c18401;">AwsCallAspect</span><span>[</span><span style="color:#a626a4;">Any</span><span>] { </span><span> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">apply</span><span>[</span><span style="color:#c18401;">R1</span><span>, </span><span style="color:#c18401;">A</span><span>](</span><span style="color:#e45649;">f</span><span>: </span><span style="color:#c18401;">ZIO</span><span>[</span><span style="color:#c18401;">R1</span><span>, </span><span style="color:#c18401;">AwsError</span><span>, </span><span style="color:#c18401;">Described</span><span>[</span><span style="color:#c18401;">A</span><span>]]): </span><span style="color:#c18401;">ZIO</span><span>[</span><span style="color:#c18401;">R1</span><span>, </span><span style="color:#c18401;">AwsError</span><span>, aspects.</span><span style="color:#c18401;">Described</span><span>[</span><span style="color:#c18401;">A</span><span>]] </span><span style="color:#a626a4;">= </span><span> policy(f).mapError { </span><span> </span><span style="color:#a626a4;">case </span><span>Policy.WrappedError(</span><span style="color:#e45649;">e</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> e </span><span> </span><span style="color:#a626a4;">case 
</span><span>Policy.BulkheadRejection </span><span style="color:#a626a4;">=&gt; </span><span>AwsError.fromThrowable(</span><span style="color:#a626a4;">new </span><span style="color:#c18401;">RuntimeException</span><span>(</span><span style="color:#0184bc;">s</span><span style="color:#50a14f;">&quot;Bulkhead rejection&quot;</span><span>)) </span><span> </span><span style="color:#a626a4;">case </span><span>Policy.CircuitBreakerOpen </span><span style="color:#a626a4;">=&gt; </span><span>AwsError.fromThrowable(</span><span style="color:#a626a4;">new </span><span style="color:#c18401;">RuntimeException</span><span>(</span><span style="color:#0184bc;">s</span><span style="color:#50a14f;">&quot;AWS rate limit exceeded&quot;</span><span>)) </span><span> } </span><span> } </span></code></pre> <p>Another simple example could be logging all AWS requests:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">callLogging</span><span>: </span><span style="color:#c18401;">AwsCallAspect</span><span>[</span><span style="color:#c18401;">Logging</span><span>] </span><span style="color:#a626a4;">= </span><span> </span><span style="color:#a626a4;">new </span><span style="color:#c18401;">AwsCallAspect</span><span>[</span><span style="color:#c18401;">Logging</span><span>] { </span><span> </span><span style="color:#a626a4;">override final def </span><span style="color:#0184bc;">apply</span><span>[</span><span style="color:#c18401;">R1 </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">Logging</span><span>, </span><span style="color:#c18401;">A</span><span>](</span><span style="color:#e45649;">f</span><span>: </span><span style="color:#c18401;">ZIO</span><span>[</span><span style="color:#c18401;">R1</span><span>, </span><span style="color:#c18401;">AwsError</span><span>, </span><span 
style="color:#c18401;">Described</span><span>[</span><span style="color:#c18401;">A</span><span>]]): </span><span style="color:#c18401;">ZIO</span><span>[</span><span style="color:#c18401;">R1</span><span>, </span><span style="color:#c18401;">AwsError</span><span>, </span><span style="color:#c18401;">Described</span><span>[</span><span style="color:#c18401;">A</span><span>]] </span><span style="color:#a626a4;">= </span><span> f.flatMap { </span><span style="color:#a626a4;">case </span><span style="color:#e45649;">r</span><span style="color:#a626a4;">@</span><span>Described(</span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">description</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span> log.info(</span><span style="color:#0184bc;">s</span><span style="color:#50a14f;">&quot;[${</span><span>description.service</span><span style="color:#50a14f;">}/${</span><span>description.operation</span><span style="color:#50a14f;">}]&quot;</span><span>).as(r) </span><span> } </span><span> } </span></code></pre> <p>These aspects can be applied to a <code>zio-aws</code> <code>ZLayer</code> directly, such as:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>ec2.live @@ (throttling &gt;&gt;&gt; callLogging) </span></code></pre> <h2 id="rendering">Rendering</h2> <p>With the queries and report cache ready the last missing building block is <em>rendering</em> the gathered reports. 
We implement it in its own ZIO module with the following interface:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">object</span><span style="color:#c18401;"> Rendering { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> Service { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">renderEc2Instance</span><span style="color:#c18401;">(</span><span style="color:#e45649;">report</span><span style="color:#c18401;">: LinkedReport[Ec2InstanceKey, Ec2InstanceReport]): UIO[</span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">renderElb</span><span style="color:#c18401;">(</span><span style="color:#e45649;">report</span><span style="color:#c18401;">: LinkedReport[ElbKey, ElbReport], </span><span style="color:#e45649;">context</span><span style="color:#c18401;">: Option[String]): UIO[</span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">renderAsg</span><span style="color:#c18401;">(</span><span style="color:#e45649;">report</span><span style="color:#c18401;">: LinkedReport[AsgKey, AsgReport]): UIO[</span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">renderEbEnv</span><span style="color:#c18401;">(</span><span style="color:#e45649;">report</span><span style="color:#c18401;">: LinkedReport[EbEnvKey, EbEnvReport]): UIO[</span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">] </span><span 
style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">renderEbApp</span><span style="color:#c18401;">(</span><span style="color:#e45649;">report</span><span style="color:#c18401;">: LinkedReport[EbAppKey, EbAppReport]): UIO[</span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">] </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre> <p>The live implementation of course needs access to <code>ReportCache</code> and writes the report out to <code>Console</code>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">live</span><span>: </span><span style="color:#c18401;">ZLayer</span><span>[</span><span style="color:#c18401;">Console </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">ReportCache</span><span>, </span><span style="color:#a626a4;">Nothing</span><span>, </span><span style="color:#c18401;">Rendering</span><span>] </span><span style="color:#a626a4;">= </span><span style="color:#a0a1a7;">// ... 
</span></code></pre> <p>We need two main things to implement report rendering:</p> <ul> <li>A way to pretty-print reports to the console</li> <li>We have to track which report was already rendered to be able to traverse the cyclic result graph</li> </ul> <p>To track the already printed reports we can simply create a <code>Ref</code> holding a set of visited <code>ReportKey</code>s:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">private case class</span><span style="color:#c18401;"> State</span><span>(</span><span style="color:#e45649;">alreadyVisited</span><span>: </span><span style="color:#c18401;">Set</span><span>[</span><span style="color:#c18401;">ReportKey</span><span>]) </span><span style="color:#a0a1a7;">// ... </span><span>alreadyVisited &lt;- Ref.make(State(Set.empty)) </span></code></pre> <p>For pretty printing the reports there are several possibilities. Eventually we want to call <code>console.putStr</code> to write to the console. The original implementation of this tool used a string templating engine to define the output. Instead of doing that we can write a pretty-printing DSL to define our output in Scala. 
Take a look at the following example:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>ifNotVisitedYet(report) { </span><span style="color:#e45649;">env </span><span style="color:#a626a4;">=&gt; </span><span> sectionHeader(</span><span style="color:#50a14f;">&quot;Beanstalk/Env&quot;</span><span>) &lt;-&gt; highlighted(env.name) &lt;-&gt; </span><span> details(env.id) &lt;-&gt; </span><span> normal(</span><span style="color:#0184bc;">s</span><span style="color:#50a14f;">&quot;is a Beanstalk environment of the application ${</span><span>env.appName</span><span style="color:#50a14f;">}&quot;</span><span>) \\ </span><span> indented { </span><span> keyword(</span><span style="color:#50a14f;">&quot;AWS Console&quot;</span><span>) &lt;:&gt; </span><span> link(</span><span style="color:#0184bc;">s</span><span style="color:#50a14f;">&quot;https://console.aws.amazon.com/elasticbeanstalk/home?region=${</span><span>env.region</span><span style="color:#50a14f;">}#/environment/dashboard?applicationName=${</span><span>env.appName</span><span style="color:#50a14f;">}&amp;environmentId=${</span><span>env.id</span><span style="color:#50a14f;">}&quot;</span><span>) \\ </span><span> keyword(</span><span style="color:#50a14f;">&quot;Health&quot;</span><span>) &lt;:&gt; highlighted(env.health.toString) \\ </span><span> keyword(</span><span style="color:#50a14f;">&quot;Currently running version&quot;</span><span>) &lt;:&gt; normal(env.version) \\ </span><span> normal(</span><span style="color:#0184bc;">s</span><span style="color:#50a14f;">&quot;${</span><span>env.asgs.size</span><span style="color:#50a14f;">} ASGs, ${</span><span>env.instanceCount</span><span style="color:#50a14f;">} instances, ${</span><span>env.elbs.size</span><span style="color:#50a14f;">} ELBs&quot;</span><span>) \\ </span><span> env.elbs.foreach_(elb(</span><span style="color:#e45649;">_</span><span>, None)) \\ 
</span><span> env.asgs.foreach_(asg) \\ </span><span> ebApp(env.app) </span><span> } </span><span>} </span></code></pre> <p>We can see here a couple of functions and operators, all created for the specific task of printing <em>AWS resource reports</em>:</p> <ul> <li><code>ifNotVisitedYet</code> must somehow interact with the <code>Ref</code> we defined above</li> <li><code>&lt;-&gt;</code> concatenates two texts with a space</li> <li><code>&lt;:&gt;</code> concatenates two texts with a colon and a space</li> <li><code>\\</code> concatenates two texts with a newline</li> <li><code>keyword</code>, <code>link</code>, <code>normal</code>, <code>highlighted</code> etc. add styling to the given text</li> <li><code>foreach_</code> comes from <code>zio-prelude</code>'s <code>Traversable</code>. We will soon see why it is used.</li> </ul> <p>We could define these styling functions as <code>ZIO</code> effects and the helper operators as general extension methods on <code>ZIO</code>. Then we could store the required state (for example, the indentation) in a <code>Ref</code>. This works but we can do better. By defining our own monadic data type <code>Print[A]</code> we get the following advantages:</p> <ul> <li>It is more type safe. 
The pretty printing operators will be only applicable to pretty printing functions, not to arbitrary ZIO effects</li> <li>Pretty printing state gets completely hidden from the pretty printing definitions</li> <li>We can easily do some optimizations such as collapsing multiple newlines into one, which makes rendering optional lines more convenient</li> </ul> <p>So let's define a data type to represent pretty printing:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">sealed trait</span><span style="color:#c18401;"> Print</span><span>[</span><span style="color:#a626a4;">+</span><span style="color:#c18401;">A</span><span>] </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> PrintPure</span><span>[</span><span style="color:#c18401;">A</span><span>](</span><span style="color:#e45649;">a</span><span>: </span><span style="color:#c18401;">A</span><span>) </span><span style="color:#a626a4;">extends </span><span>Print[</span><span style="color:#c18401;">A</span><span>] </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> PrintS</span><span>(</span><span style="color:#e45649;">s</span><span>: </span><span style="color:#c18401;">String</span><span>) </span><span style="color:#a626a4;">extends </span><span>Print[</span><span style="color:#a626a4;">Unit</span><span>] </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> PrintModified</span><span>(</span><span style="color:#e45649;">s</span><span>: </span><span style="color:#c18401;">String</span><span>, </span><span style="color:#e45649;">modifiers</span><span>: </span><span style="color:#c18401;">String</span><span>) </span><span style="color:#a626a4;">extends </span><span>Print[</span><span style="color:#a626a4;">Unit</span><span>] </span><span style="color:#a626a4;">final case object</span><span 
style="color:#c18401;"> PrintNL </span><span style="color:#a626a4;">extends </span><span>Print[</span><span style="color:#a626a4;">Unit</span><span>] </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> PrintIndented</span><span>[</span><span style="color:#c18401;">A</span><span>](</span><span style="color:#e45649;">p</span><span>: </span><span style="color:#c18401;">Print</span><span>[</span><span style="color:#c18401;">A</span><span>]) </span><span style="color:#a626a4;">extends </span><span>Print[</span><span style="color:#c18401;">A</span><span>] </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> PrintFlatMap</span><span>[</span><span style="color:#c18401;">A</span><span>, </span><span style="color:#c18401;">B</span><span>](</span><span style="color:#e45649;">a</span><span>: </span><span style="color:#c18401;">Print</span><span>[</span><span style="color:#c18401;">A</span><span>], </span><span style="color:#e45649;">f</span><span>: </span><span style="color:#c18401;">A </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">Print</span><span>[</span><span style="color:#c18401;">B</span><span>]) </span><span style="color:#a626a4;">extends </span><span>Print[</span><span style="color:#c18401;">B</span><span>] </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> PrintEffect</span><span>[</span><span style="color:#c18401;">A</span><span>](</span><span style="color:#e45649;">f</span><span>: </span><span style="color:#c18401;">UIO</span><span>[</span><span style="color:#c18401;">A</span><span>]) </span><span style="color:#a626a4;">extends </span><span>Print[</span><span style="color:#c18401;">A</span><span>] </span></code></pre> <p><code>PrintPure</code> and <code>PrintFlatMap</code> can be used to implement <code>zio-prelude</code>'s type classes:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" 
class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">implicit val </span><span style="color:#e45649;">print </span><span style="color:#a626a4;">= new </span><span style="color:#c18401;">Covariant</span><span>[</span><span style="color:#c18401;">Print</span><span>] </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">IdentityFlatten</span><span>[</span><span style="color:#c18401;">Print</span><span>] </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">IdentityBoth</span><span>[</span><span style="color:#c18401;">Print</span><span>] { </span><span> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">map</span><span>[</span><span style="color:#c18401;">A</span><span>, </span><span style="color:#c18401;">B</span><span>](</span><span style="color:#e45649;">f</span><span>: </span><span style="color:#c18401;">A </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">B</span><span>): </span><span style="color:#c18401;">Print</span><span>[</span><span style="color:#c18401;">A</span><span>] </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">Print</span><span>[</span><span style="color:#c18401;">B</span><span>] </span><span style="color:#a626a4;">= </span><span> </span><span style="color:#e45649;">fa </span><span style="color:#a626a4;">=&gt; </span><span>PrintFlatMap(fa, (</span><span style="color:#e45649;">a</span><span>: </span><span style="color:#c18401;">A</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span>PrintPure(f(a))) </span><span> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">any</span><span>: </span><span style="color:#c18401;">Print</span><span>[</span><span style="color:#a626a4;">Any</span><span>] </span><span style="color:#a626a4;">= </span><span> PrintPure(</span><span style="color:#c18401;">()</span><span>) </span><span> 
</span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">flatten</span><span>[</span><span style="color:#c18401;">A</span><span>](</span><span style="color:#e45649;">ffa</span><span>: </span><span style="color:#c18401;">Print</span><span>[</span><span style="color:#c18401;">Print</span><span>[</span><span style="color:#c18401;">A</span><span>]]): </span><span style="color:#c18401;">Print</span><span>[</span><span style="color:#c18401;">A</span><span>] </span><span style="color:#a626a4;">= </span><span> PrintFlatMap(ffa, (</span><span style="color:#e45649;">fa</span><span>: </span><span style="color:#c18401;">Print</span><span>[</span><span style="color:#c18401;">A</span><span>]) </span><span style="color:#a626a4;">=&gt;</span><span> fa) </span><span> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">both</span><span>[</span><span style="color:#c18401;">A</span><span>, </span><span style="color:#c18401;">B</span><span>](</span><span style="color:#e45649;">fa</span><span>: </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">Print</span><span>[</span><span style="color:#c18401;">A</span><span>], </span><span style="color:#e45649;">fb</span><span>: </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">Print</span><span>[</span><span style="color:#c18401;">B</span><span>]): </span><span style="color:#c18401;">Print</span><span>[(</span><span style="color:#c18401;">A</span><span>, </span><span style="color:#c18401;">B</span><span>)] </span><span style="color:#a626a4;">= </span><span> PrintFlatMap(fa, (</span><span style="color:#e45649;">a</span><span>: </span><span style="color:#c18401;">A</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> map((</span><span style="color:#e45649;">b</span><span>: </span><span style="color:#c18401;">B</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span>(a, b))(fb)) </span><span>} 
</span></code></pre> <p>What are these type classes providing to us?</p> <ul> <li><code>Covariant</code> basically gives us <code>map</code></li> <li><code>IdentityFlatten</code> means that the data type can be "flattened" associatively and has an identity element. This gives us <code>flatten</code> and <code>flatMap</code>.</li> <li><code>IdentityBoth</code> means we have an associative binary operator to combine two values. This enables syntax like <code>&lt;*&gt;</code>.</li> </ul> <p>Having this we can define primitive pretty printing operators like:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">normal</span><span>(</span><span style="color:#e45649;">text</span><span>: </span><span style="color:#c18401;">String</span><span>): </span><span style="color:#c18401;">Print</span><span>[</span><span style="color:#a626a4;">Unit</span><span>] </span><span style="color:#a626a4;">= </span><span>PrintS(text) </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">space</span><span>: </span><span style="color:#c18401;">Print</span><span>[</span><span style="color:#a626a4;">Unit</span><span>] </span><span style="color:#a626a4;">= </span><span>PrintS(</span><span style="color:#50a14f;">&quot; &quot;</span><span>) </span><span> </span><span style="color:#a626a4;">implicit class</span><span style="color:#c18401;"> PrintOps</span><span>[</span><span style="color:#c18401;">A</span><span>](</span><span style="color:#e45649;">self</span><span>: </span><span style="color:#c18401;">Print</span><span>[</span><span style="color:#c18401;">A</span><span>]) </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">&lt;-&gt;</span><span style="color:#c18401;">[B](</span><span 
style="color:#e45649;">next</span><span style="color:#c18401;">: </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">Print[B]): Print[B] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> self *&gt; space *&gt; next </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;">} </span></code></pre> <p>Then we can use the syntax provided by <code>zio-prelude</code> to compose these pretty printer values. The only thing remaining is to provide a transformation of <code>Print[A]</code> to <code>UIO[A]</code>. This is where we can hide the pretty printer state and can handle special rules like collapsing newlines:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">private trait</span><span style="color:#c18401;"> PrettyConsole { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">protected val </span><span style="color:#e45649;">console</span><span style="color:#c18401;">: Console.Service </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private case class</span><span style="color:#c18401;"> PrettyState(</span><span style="color:#e45649;">indentation</span><span style="color:#c18401;">: String, </span><span style="color:#e45649;">afterNL</span><span style="color:#c18401;">: </span><span style="color:#a626a4;">Boolean</span><span style="color:#c18401;">) </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private def </span><span style="color:#0184bc;">printFlatMap</span><span style="color:#c18401;">[A, B](</span><span style="color:#e45649;">a</span><span style="color:#c18401;">: Print[A], </span><span style="color:#e45649;">f</span><span style="color:#c18401;">: A </span><span style="color:#a626a4;">=&gt; 
</span><span style="color:#c18401;">Print[B], </span><span style="color:#e45649;">state</span><span style="color:#c18401;">: PrettyState): UIO[(B, PrettyState)] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">for </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#e45649;">r1 </span><span style="color:#a626a4;">&lt;-</span><span style="color:#c18401;"> runImpl(a, state) </span><span style="color:#c18401;"> </span><span style="color:#e45649;">r2 </span><span style="color:#a626a4;">&lt;-</span><span style="color:#c18401;"> runImpl(f(r1._1), r1._2) </span><span style="color:#c18401;"> } </span><span style="color:#a626a4;">yield</span><span style="color:#c18401;"> r2 </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private def </span><span style="color:#0184bc;">runImpl</span><span style="color:#c18401;">[A](</span><span style="color:#e45649;">p</span><span style="color:#c18401;">: Print[A], </span><span style="color:#e45649;">state</span><span style="color:#c18401;">: PrettyState): UIO[(A, PrettyState)] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> p </span><span style="color:#a626a4;">match </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case </span><span style="color:#c18401;">PrintPure(</span><span style="color:#e45649;">a</span><span style="color:#c18401;">) </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">ZIO.succeed((a, state)) </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case </span><span style="color:#c18401;">PrintS(</span><span style="color:#e45649;">s</span><span style="color:#c18401;">) </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">ZIO.when(state.afterNL)(console.putStr(state.indentation)) *&gt; </span><span 
style="color:#c18401;"> console.putStr(s).as(((), state.copy(afterNL </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">false))) </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case </span><span style="color:#c18401;">PrintModified(</span><span style="color:#e45649;">s</span><span style="color:#c18401;">, </span><span style="color:#e45649;">modifiers</span><span style="color:#c18401;">) </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">ZIO.when(state.afterNL)(console.putStr(state.indentation)) *&gt; </span><span style="color:#c18401;"> console.putStr(</span><span style="color:#0184bc;">s</span><span style="color:#50a14f;">&quot;${</span><span style="color:#c18401;">modifiers</span><span style="color:#50a14f;">}</span><span style="color:#e45649;">$s$RESET</span><span style="color:#50a14f;">&quot;</span><span style="color:#c18401;">).as(((), state.copy(afterNL </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">false))) </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case </span><span style="color:#c18401;">PrintNL </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">if </span><span style="color:#c18401;">(state.afterNL) ZIO.succeed(((), state)) </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">else</span><span style="color:#c18401;"> console.putStrLn(</span><span style="color:#50a14f;">&quot;&quot;</span><span style="color:#c18401;">).as(((), state.copy(afterNL </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">true))) </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case </span><span style="color:#c18401;">PrintIndented(</span><span style="color:#e45649;">f</span><span style="color:#c18401;">) </span><span style="color:#a626a4;">=&gt;</span><span style="color:#c18401;"> runImpl(f, state.copy(indentation 
</span><span style="color:#a626a4;">=</span><span style="color:#c18401;"> state.indentation + </span><span style="color:#50a14f;">&quot; &quot;</span><span style="color:#c18401;">)).map { </span><span style="color:#a626a4;">case </span><span style="color:#c18401;">(</span><span style="color:#e45649;">a</span><span style="color:#c18401;">, </span><span style="color:#e45649;">s</span><span style="color:#c18401;">) </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;"> (a, s.copy(indentation </span><span style="color:#a626a4;">=</span><span style="color:#c18401;"> state.indentation)) } </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case </span><span style="color:#c18401;">PrintFlatMap(</span><span style="color:#e45649;">a</span><span style="color:#c18401;">, </span><span style="color:#e45649;">f</span><span style="color:#c18401;">) </span><span style="color:#a626a4;">=&gt;</span><span style="color:#c18401;"> printFlatMap(a, f, state) </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case </span><span style="color:#c18401;">PrintEffect(</span><span style="color:#e45649;">f</span><span style="color:#c18401;">) </span><span style="color:#a626a4;">=&gt;</span><span style="color:#c18401;"> f.map((</span><span style="color:#e45649;">_</span><span style="color:#c18401;">, state)) </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">run</span><span style="color:#c18401;">[A](</span><span style="color:#e45649;">p</span><span style="color:#c18401;">: Print[A]): UIO[A] </span><span style="color:#a626a4;">=</span><span style="color:#c18401;"> runImpl(p, PrettyState(</span><span style="color:#50a14f;">&quot;&quot;</span><span style="color:#c18401;">, afterNL </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">false)).map(</span><span 
style="color:#e45649;">_</span><span style="color:#c18401;">._1) </span><span style="color:#c18401;">} </span></code></pre> <p>A couple of things to notice here:</p> <ul> <li><code>PrettyState</code> holds the indentation and a flag that is true when the last print was a <em>new line</em></li> <li><code>runImpl</code> gets the state as input and has the capability to modify it, by returning the modified state together with the computation's result</li> <li>there is a <code>PrintEffect</code> constructor that allows lifting arbitrary <code>ZIO</code> effects to the pretty printer. This is needed for interacting with the <code>Ref</code> that holds the record of already printed reports.</li> </ul> <h2 id="putting-all-together">Putting all together</h2> <p>Putting all this together means getting command line arguments, setting up the AWS client libraries, the report cache and the rendering modules and running the top level queries.</p> <p>To parse the command line arguments we can use my <a href="https://vigoo.github.io/clipp/docs/">clipp library</a>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> Parameters</span><span>(</span><span style="color:#e45649;">verbose</span><span>: </span><span style="color:#a626a4;">Boolean</span><span>, </span><span> </span><span style="color:#e45649;">searchInput</span><span>: </span><span style="color:#c18401;">String</span><span>, </span><span> </span><span style="color:#e45649;">region</span><span>: </span><span style="color:#c18401;">String</span><span>) </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">paramSpec </span><span style="color:#a626a4;">= for </span><span>{ </span><span> </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">&lt;-</span><span> metadata(</span><span style="color:#50a14f;">&quot;aws-query&quot;</span><span>, </span><span style="color:#50a14f;">&quot;search for AWS infrastructure resources&quot;</span><span>) </span><span> </span><span style="color:#e45649;">verbose </span><span style="color:#a626a4;">&lt;-</span><span> flag(</span><span style="color:#50a14f;">&quot;Verbose logging&quot;</span><span>, </span><span style="color:#c18401;">&#39;v&#39;</span><span>, </span><span style="color:#50a14f;">&quot;verbose&quot;</span><span>) </span><span> </span><span style="color:#e45649;">searchInput </span><span style="color:#a626a4;">&lt;-</span><span> parameter[</span><span style="color:#c18401;">String</span><span>](</span><span style="color:#50a14f;">&quot;Search input&quot;</span><span>, </span><span style="color:#50a14f;">&quot;NAME_OR_ID&quot;</span><span>) </span><span> </span><span style="color:#e45649;">region </span><span style="color:#a626a4;">&lt;-</span><span> optional { namedParameter[</span><span style="color:#c18401;">String</span><span>](</span><span style="color:#50a14f;">&quot;AWS region&quot;</span><span>, </span><span style="color:#50a14f;">&quot;REGION&quot;</span><span>, </span><span style="color:#50a14f;">&quot;region&quot;</span><span>) } </span><span>} </span><span style="color:#a626a4;">yield </span><span>Parameters(verbose, searchInput, region.getOrElse(</span><span style="color:#50a14f;">&quot;us-east-1&quot;</span><span>)) </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">params </span><span style="color:#a626a4;">=</span><span> clipp.zioapi.config.fromArgsWithUsageInfo(args, paramSpec) </span></code></pre> <p>The <code>verbose</code> flag is used to set up logging. 
We use <a href="https://zio.github.io/zio-logging/">zio-logging</a> with SLF4J support (to be able to see logs from the underlying AWS Java SDK) and a log4j2 backend. To control the log level with the command line <code>verbose</code> flag, instead of the usual XML-based configuration for log4j2 we define a ZIO <em>layer</em> whose only purpose is to perform the configuration programmatically:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">private def </span><span style="color:#0184bc;">log4j2Configuration</span><span>: </span><span style="color:#c18401;">ZLayer</span><span>[</span><span style="color:#c18401;">Has</span><span>[</span><span style="color:#c18401;">ClippConfig</span><span>.</span><span style="color:#c18401;">Service</span><span>[</span><span style="color:#c18401;">Parameters</span><span>]], </span><span style="color:#c18401;">Throwable</span><span>, </span><span style="color:#c18401;">Has</span><span>[</span><span style="color:#c18401;">Log4jConfiguration</span><span>]] </span><span style="color:#a626a4;">= </span><span>{ </span><span> ZLayer.fromServiceM[</span><span style="color:#c18401;">ClippConfig</span><span>.</span><span style="color:#c18401;">Service</span><span>[</span><span style="color:#c18401;">Parameters</span><span>], </span><span style="color:#a626a4;">Any</span><span>, </span><span style="color:#c18401;">Throwable</span><span>, </span><span style="color:#c18401;">Log4jConfiguration</span><span>] { </span><span style="color:#e45649;">params </span><span style="color:#a626a4;">=&gt; </span><span> ZIO.effect { </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">builder </span><span style="color:#a626a4;">= </span><span>ConfigurationBuilderFactory.newConfigurationBuilder() </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span> Configurator.initialize(builder.build()) </span><span> Log4jConfiguration() </span><span> } </span><span> } </span><span>} </span></code></pre> <p>This way the root logger's level can depend on the <code>Parameters</code> parsed by <code>clipp</code>. Composing this layer with <code>zio-logging</code>'s <code>Slf4jLogger</code> gives us a working <code>Logging</code> layer:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">logging </span><span style="color:#a626a4;">=</span><span> log4j2Configuration &gt;+&gt; Slf4jLogger.make { (</span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">message</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> message } </span></code></pre> <p>By bootstrapping the parameters and the logging we can run our main application like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#e45649;">result </span><span style="color:#a626a4;">&lt;-</span><span> awsQuery() </span><span> .provideCustomLayer(params &gt;+&gt; logging) </span><span> .catchAll { </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">=&gt; </span><span>ZIO.succeed(ExitCode.failure) } </span><span> </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">&lt;- </span><span>ZIO.effect(LogManager.shutdown()).orDie </span><span>} </span><span style="color:#a626a4;">yield</span><span> result </span></code></pre> <p>The <code>clipp</code> parser will print detailed usage info in case it fails, and other runtime errors are logged, so we can simply catch all errors and exit with a failure at the top level.</p> <p>In <code>awsQuery</code> we create all 
the other layers necessary for running the queries. First we need to create the <em>throttling policy</em> that is used by all the AWS service clients as I explained above:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">private def </span><span style="color:#0184bc;">awsQuery</span><span>(): </span><span style="color:#c18401;">ZIO</span><span>[</span><span style="color:#c18401;">Random </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">Clock </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">Console </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">Logging </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">ClippConfig</span><span>[</span><span style="color:#c18401;">Parameters</span><span>], </span><span style="color:#a626a4;">Nothing</span><span>, </span><span style="color:#c18401;">ExitCode</span><span>] </span><span style="color:#a626a4;">= </span><span> throttlingPolicy.use { </span><span style="color:#e45649;">policy </span><span style="color:#a626a4;">=&gt; </span></code></pre> <p>The <code>zio-aws</code> library uses <a href="https://zio.github.io/zio-config/">ZIO Config</a> for configuration. 
This means we need a <code>ZConfig[CommonAwsConfig]</code> to construct the <code>AwsConfig</code> layer:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">commonConfig </span><span style="color:#a626a4;">= </span><span>ZLayer.succeed(CommonAwsConfig( </span><span> region </span><span style="color:#a626a4;">= </span><span>Some(Region.of(params.region)), </span><span> credentialsProvider </span><span style="color:#a626a4;">= </span><span>DefaultCredentialsProvider.create(), </span><span> endpointOverride </span><span style="color:#a626a4;">= </span><span>None, </span><span> commonClientConfig </span><span style="color:#a626a4;">= </span><span>None </span><span>)) </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">awsCore </span><span style="color:#a626a4;">= </span><span>(netty.default ++ commonConfig) &gt;&gt;&gt; core.config.configured() </span></code></pre> <p>The <code>AwsConfig</code> layer combines the configuration with a selected HTTP backend. 
In our case this is the <em>Netty</em> backend, using its default configuration.</p> <p>Then we define the per-service client layers, applying the throttling and call logging <em>aspects</em> as I described before:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">awsClients </span><span style="color:#a626a4;">= </span><span> ec2.live @@ (throttling &gt;&gt;&gt; callLogging) ++ </span><span> elasticloadbalancing.live @@ (throttling &gt;&gt;&gt; callLogging) ++ </span><span> elasticbeanstalk.live @@ (throttling &gt;&gt;&gt; callLogging) ++ </span><span> autoscaling.live @@ (throttling &gt;&gt;&gt; callLogging) </span></code></pre> <p>To produce the final layer, we feed the logging and the <code>AwsConfig</code> layers to the client layers, and add the <code>ReportCache</code> and <code>Render</code> implementations:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">finalLayer </span><span style="color:#a626a4;">= </span><span> ((ZLayer.service[</span><span style="color:#c18401;">Logger</span><span>[</span><span style="color:#c18401;">String</span><span>]] ++ awsCore) &gt;&gt;&gt; awsClients) ++ </span><span> ((Console.any ++ cache.live) &gt;+&gt; render.live) </span></code></pre> <p>This has the environment <code>ClippConfig[Parameters] with Console with Logging with ReportCache with Rendering with AllServices</code> where</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">type </span><span>AllServices </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">Ec2 </span><span style="color:#a626a4;">with 
</span><span style="color:#c18401;">ElasticLoadBalancing </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">ElasticBeanstalk </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">AutoScaling </span></code></pre> <h2 id="conclusion">Conclusion</h2> <p>We reimplemented the tool to query AWS resources using functional programming techniques, built on top of ZIO libraries. By separating the execution from the problem specification we get readable, maintainable code that can be easily extended with new queries or reports without having to think about how caching and concurrency are implemented under the hood. We can rate limit AWS requests without touching the actual queries, and take advantage of batching AWS operations while keeping the query logic simple and unaware of this optimization.</p> Code generation in ZIO-AWS 2020-09-23T00:00:00+00:00 2020-09-23T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/zioaws-code-generation/ <p>I have recently published a set of libraries, <a href="https://github.com/vigoo/zio-aws"><strong>zio-aws</strong></a>, aiming to provide a better interface for working with <em>AWS services</em> from <a href="https://zio.dev/">ZIO</a> applications. For more information about how the ZIO <em>interface</em> works and how to get started with these libraries, read the repository's README. In this post, I will focus on how these libraries are generated from the schema provided by the <a href="https://github.com/aws/aws-sdk-java-v2">AWS Java SDK v2</a>.</p> <h2 id="generating-code">Generating code</h2> <p>I wanted to cover <em>all</em> AWS services at once. This means client libraries for more than 200 services, so the only possible approach was to <em>generate</em> these libraries on top of a small hand-written core.</p> <h3 id="schema">Schema</h3> <p>The first thing we need for generating code is a source schema. This is the model that we use to create the source code from. 
It is usually constructed with some kind of DSL, or described more directly by a JSON, YAML or similar data model. In the case of <strong>zio-aws</strong> this was already defined in the <a href="https://github.com/aws/aws-sdk-java-v2">AWS Java SDK v2</a> project. The way it works is:</p> <ul> <li>There is a <code>codegen</code> project, published in the <code>software.amazon.awssdk</code> group among the client libraries, that contains the Java classes used for generating the Java SDK itself. This contains the data model classes for parsing the actual schema as well.</li> <li>In the AWS Java SDK v2 repository, the schema is located in the subdirectory called <a href="https://github.com/aws/aws-sdk-java-v2/tree/master/services"><code>services</code></a>. There is a directory for each AWS service and it contains, among other things, some relevant <em>JSON</em> schema files: <ul> <li><code>service-2.json</code> is the main schema of the service, describing the data structures and operations</li> <li><code>paginators-1.json</code> describes the operations that the Java SDK creates a <em>paginator interface</em> for</li> <li><code>customization.config</code> contains extra information, including changes to be applied on top of the service model</li> </ul> </li> <li>Fortunately, these are also embedded in the generated <em>AWS Java SDK</em> libraries as resources, so getting <em>all client libraries</em> on the classpath gives us an easy way to get the corresponding schemas as well</li> </ul> <p>I decided to use the low-level data classes from the AWS <code>codegen</code> library to parse these files and, based on them, build a higher-level model that can then be used as input for the <em>code generator</em>.</p> <p>This is encapsulated in a <em>ZIO layer</em> called <code>Loader</code>, which has two functions:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span 
style="color:#a626a4;">def </span><span style="color:#0184bc;">findModels</span><span>(): </span><span style="color:#c18401;">ZIO</span><span>[</span><span style="color:#c18401;">Blocking</span><span>, </span><span style="color:#c18401;">Throwable</span><span>, </span><span style="color:#c18401;">Set</span><span>[</span><span style="color:#c18401;">ModelId</span><span>]] </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">loadCodegenModel</span><span>(</span><span style="color:#e45649;">id</span><span>: </span><span style="color:#c18401;">ModelId</span><span>): </span><span style="color:#c18401;">ZIO</span><span>[</span><span style="color:#c18401;">Blocking</span><span>, </span><span style="color:#c18401;">Throwable</span><span>, </span><span style="color:#c18401;">C2jModels</span><span>] </span></code></pre> <p>The first one, <code>findModels</code>, uses the <code>ClassLoader</code> to enumerate all <code>codegen-resources</code> folders on the <em>classpath</em> and just returns a set of <code>ModelId</code>s. <code>ModelId</code> is a pair of a model name (such as <code>s3</code>) and an optional submodule name (for example <code>dynamodb:dynamodbstreams</code>).</p> <p>Then each detected model can be loaded with the <code>loadCodegenModel</code> function. Its result type, <code>C2jModels</code>, is a class from the AWS <code>codegen</code> library.</p> <p>Figuring out how to interpret these data structures, and how to map them to the generated Java API was the hardest part, but it's out of scope for this post. Our next topic here is how we generate code from our <em>model</em>.</p> <h3 id="scalameta">Scalameta</h3> <p>There are several possibilities to generate source code and I tried many of them during the past years. Let's see some examples:</p> <ul> <li>Using a general-purpose text template engine. 
An example we used at <a href="https://prezi.com">Prezi</a> is the <a href="https://github.com/bkiers/Liqp">Java implementation of the Liquid templating engine</a>. Another example is the <a href="https://github.com/OpenAPITools/openapi-generator">OpenAPI generator project</a> that uses <a href="https://mustache.github.io/">Mustache</a> templates to generate server and client code from OpenAPI specifications.</li> <li>Generating from code with some general-purpose pretty-printing library. With this approach, we are using the pretty-printer library's composability features to create source code building blocks, and map the code generator model to these constructs. It is easier to express complex logic in this case, as we don't have to encode it in a limited dynamic template model. On the other hand, reading the code generator's source and imagining the output is not easy, and nothing enforces that the pretty-printer building blocks are actually creating valid source code.</li> <li>If the target language has an AST with a pretty-printing feature, we can map the model to the AST directly and just pretty print at the end. With this, we get a much more efficient development cycle, as the generated code is at least guaranteed to be syntactically correct. But the AST can be far from what the target language's textual representation looks like, which makes it difficult to read and write this code.</li> <li>With a library that supports building ASTs with <em>quasiquotes</em>, we can build the AST fragments with a syntax that is very close to the generated target language. For <em>Scala</em>, a library that supports this and is used in a lot of tooling projects is <a href="https://scalameta.org/">Scalameta</a></li> </ul> <p>I wanted to try using <em>Scalameta</em> ever since I met Devon Stewart and he mentioned how he uses it in <a href="https://github.com/twilio/guardrail/">guardrail</a>. 
Finally, this was a perfect use case to do so!</p> <p>To get an understanding of what kind of Scala language constructs can be built with <em>quasiquotes</em> with <em>Scalameta</em>, check <a href="https://scalameta.org/docs/trees/quasiquotes.html">the list of them in the official documentation</a>.</p> <p>We get a good mix of both worlds with this. It is possible to express complex template logic in real code, creating higher-level constructs, taking advantage of the full power of Scala. On the other hand, the actual <em>quasiquoted</em> fragments are still close to the code generator's target language (which is in this case also Scala).</p> <p>Let's see a short example of this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">private def </span><span style="color:#0184bc;">generateMap</span><span>(</span><span style="color:#e45649;">m</span><span>: </span><span style="color:#c18401;">Model</span><span>): </span><span style="color:#c18401;">ZIO</span><span>[</span><span style="color:#c18401;">GeneratorContext</span><span>, </span><span style="color:#c18401;">GeneratorFailure</span><span>, </span><span style="color:#c18401;">ModelWrapper</span><span>] </span><span style="color:#a626a4;">= </span><span>{ </span><span> </span><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#e45649;">keyModel </span><span style="color:#a626a4;">&lt;-</span><span> get(m.shape.getMapKeyType.getShape) </span><span> </span><span style="color:#e45649;">valueModel </span><span style="color:#a626a4;">&lt;-</span><span> get(m.shape.getMapValueType.getShape) </span><span> </span><span style="color:#e45649;">keyT </span><span style="color:#a626a4;">&lt;- </span><span>TypeMapping.toWrappedType(keyModel) </span><span> </span><span style="color:#e45649;">valueT </span><span style="color:#a626a4;">&lt;- 
</span><span>TypeMapping.toWrappedType(valueModel) </span><span> } </span><span style="color:#a626a4;">yield </span><span>ModelWrapper( </span><span> code </span><span style="color:#a626a4;">= </span><span>List(</span><span style="color:#0184bc;">q</span><span style="color:#50a14f;">&quot;&quot;&quot;type ${</span><span>m.asType</span><span style="color:#50a14f;">} = Map[</span><span style="color:#e45649;">$keyT</span><span style="color:#50a14f;">, </span><span style="color:#e45649;">$valueT</span><span style="color:#50a14f;">]&quot;&quot;&quot;</span><span>) </span><span> ) </span><span>} </span></code></pre> <p>For each <em>AWS</em> service-specific <em>model type</em> we generate some kind of wrapper code into the ZIO service client library. This is done by processing the schema model to an intermediate format where for each such wrapper, we have a <code>ModelWrapper</code> value that already has the <em>Scalameta AST</em> for that particular wrapper. The above code fragment creates this for <em>map types</em>, which is a simple type alias for a Scala <code>Map</code>. It's a <code>ZIO</code> function, taking advantage of passing around the context in the <em>environment</em> and safely handling generator failures, while the actual generated code part in the <code>q"""..."""</code> remained quite readable.</p> <p>Then the whole <em>model package</em> can be expressed like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span> </span><span style="color:#e45649;">primitiveModels </span><span style="color:#a626a4;">&lt;- </span><span>ZIO.foreach(primitiveModels.toList.sortBy(</span><span style="color:#e45649;">_</span><span>.name))(generateModel) </span><span> </span><span style="color:#e45649;">models </span><span style="color:#a626a4;">&lt;- </span><span>ZIO.foreach(complexModels.toList.sortBy(</span><span style="color:#e45649;">_</span><span>.name))(generateModel) </span><span>} </span><span style="color:#a626a4;">yield </span><span style="color:#0184bc;">q</span><span style="color:#50a14f;">&quot;&quot;&quot;package </span><span style="color:#e45649;">$fullPkgName</span><span style="color:#50a14f;"> { </span><span style="color:#50a14f;"> </span><span style="color:#50a14f;"> import scala.jdk.CollectionConverters._ </span><span style="color:#50a14f;"> import java.time.Instant </span><span style="color:#50a14f;"> import zio.{Chunk, ZIO} </span><span style="color:#50a14f;"> import software.amazon.awssdk.core.SdkBytes </span><span style="color:#50a14f;"> </span><span style="color:#50a14f;"> ..</span><span style="color:#e45649;">$parentModuleImport </span><span style="color:#50a14f;"> </span><span style="color:#50a14f;"> package object model { </span><span style="color:#50a14f;"> object primitives { </span><span style="color:#50a14f;"> ..${</span><span>primitiveModels.flatMap(</span><span style="color:#e45649;">_</span><span>.code)</span><span style="color:#50a14f;">} </span><span style="color:#50a14f;"> } </span><span style="color:#50a14f;"> </span><span style="color:#50a14f;"> ..${</span><span>models.flatMap(</span><span style="color:#e45649;">_</span><span>.code)</span><span style="color:#50a14f;">} </span><span style="color:#50a14f;"> }}&quot;&quot;&quot; </span></code></pre> <p>This can then be <em>pretty printed</em> simply with <code>.toString</code> and saved to a <code>.scala</code> file.</p> <h2 id="building-the-libraries">Building the libraries</h2> <p>We have a way 
to collect the service models and generate source code from them, but we still have to use that generated code somehow. In <code>zio-aws</code> the goal was to generate a separate <em>client library</em> for each AWS service. At the time of writing, there were <strong>235</strong> such services. The generated libraries have to be built and published to <em>Sonatype</em>.</p> <h3 id="first-version">First version</h3> <p>In the first version I simply wired together the <code>loader</code> and <code>generator</code> modules described above into a <code>ZIO</code> <em>command line</em> app, using <a href="https://vigoo.github.io/clipp/docs/">clipp</a> for command line parsing. Its <code>main</code> was really just something like the following:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">app </span><span style="color:#a626a4;">= for </span><span>{ </span><span> </span><span style="color:#e45649;">svcs </span><span style="color:#a626a4;">&lt;-</span><span> config.parameters[</span><span style="color:#c18401;">Parameters</span><span>].map(</span><span style="color:#e45649;">_</span><span>.serviceList) </span><span> </span><span style="color:#e45649;">ids </span><span style="color:#a626a4;">&lt;-</span><span> svcs </span><span style="color:#a626a4;">match </span><span>{ </span><span> </span><span style="color:#a626a4;">case </span><span>Some(</span><span style="color:#e45649;">ids</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span>ZIO.succeed(ids.toSet) </span><span> </span><span style="color:#a626a4;">case </span><span>None </span><span style="color:#a626a4;">=&gt;</span><span> loader.findModels().mapError(ReflectionError) </span><span> } </span><span> </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">&lt;- </span><span>ZIO.foreachPar(ids) { </span><span 
style="color:#e45649;">id </span><span style="color:#a626a4;">=&gt; </span><span> </span><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#e45649;">model </span><span style="color:#a626a4;">&lt;-</span><span> loader.loadCodegenModel(id).mapError(ReflectionError) </span><span> </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">&lt;-</span><span> generator.generateServiceCode(id, model).mapError(GeneratorError) </span><span> } </span><span style="color:#a626a4;">yield </span><span style="color:#c18401;">() </span><span> } </span><span> </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">&lt;-</span><span> generator.generateBuildSbt(ids).mapError(GeneratorError) </span><span> </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">&lt;-</span><span> generator.copyCoreProject().mapError(GeneratorError) </span><span>} </span><span style="color:#a626a4;">yield </span><span>ExitCode.success </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">cfg </span><span style="color:#a626a4;">=</span><span> config.fromArgsWithUsageInfo(args, Parameters.spec).mapError(ParserError) </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">modules </span><span style="color:#a626a4;">=</span><span> loader.live ++ (cfg &gt;+&gt; generator.live) </span><span>app.provideCustomLayer(modules) </span></code></pre> <p>Then I created a <em>multi-module</em> <code>sbt</code> project with the following modules:</p> <ul> <li><code>zio-aws-codegen</code>, the CLI code generator we were talking about so far</li> <li><code>zio-aws-core</code>, holding the common part of all AWS service wrapper libraries. 
This contains things like how to translate AWS pagination into <code>ZStream</code> etc.</li> <li><code>zio-aws-akka-http</code>, <code>zio-aws-http4s</code> and <code>zio-aws-netty</code> are the supported <em>HTTP layers</em>, all depend on <code>zio-aws-core</code></li> </ul> <p>I also created a first <em>example</em> project in a separate <code>sbt</code> project, that demonstrated the use of some of the generated AWS client libraries. With this primitive setup, building everything from scratch and running the example took the following steps:</p> <ol> <li><code>sbt compile</code> the root project</li> <li>manually running <code>zio-aws-codegen</code> to generate <em>all client libs at once</em> to a separate directory, with a corresponding <code>build.sbt</code> including all these projects in a single <code>sbt</code> project</li> <li><code>sbt publishLocal</code> in the generated <code>sbt</code> project</li> <li><code>sbt run</code> in the <em>examples</em> project</li> </ol> <p>For the second, manual step I created some <em>custom sbt tasks</em> called <code>generateAll</code>, <code>buildAll</code>, and <code>publishLocalAll</code>, that downloaded an <code>sbt-launch-*.jar</code> and used it to run the code generator and fork an <code>sbt</code> to build the generated project.</p> <p>The <code>generateAll</code> task was quite simple:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>generateAll := Def.taskDyn { </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">root </span><span style="color:#a626a4;">=</span><span> baseDirectory.value.getAbsolutePath </span><span> Def.task { </span><span> (codegen / Compile / run).toTask(</span><span style="color:#0184bc;">s</span><span style="color:#50a14f;">&quot; --target-root ${</span><span>root</span><span style="color:#50a14f;">}/generated --source-root 
${</span><span>root</span><span style="color:#50a14f;">} --version </span><span style="color:#e45649;">$zioAwsVersion</span><span style="color:#50a14f;"> --zio-version </span><span style="color:#e45649;">$zioVersion</span><span style="color:#50a14f;"> --zio-rs-version </span><span style="color:#e45649;">$zioReactiveStreamsInteropVersion</span><span style="color:#50a14f;">&quot;</span><span>).value </span><span> } </span><span>}.value </span></code></pre> <p>Launching a second <code>sbt</code> took more effort:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>buildAll := Def.taskDyn { </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">=</span><span> generateAll.value </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">generatedRoot </span><span style="color:#a626a4;">=</span><span> baseDirectory.value / </span><span style="color:#50a14f;">&quot;generated&quot; </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">launcherVersion </span><span style="color:#a626a4;">=</span><span> sbtVersion.value </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">launcher </span><span style="color:#a626a4;">= </span><span style="color:#0184bc;">s</span><span style="color:#50a14f;">&quot;sbt-launch-</span><span style="color:#e45649;">$launcherVersion</span><span style="color:#50a14f;">.jar&quot; </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">launcherFile </span><span style="color:#a626a4;">=</span><span> generatedRoot / launcher </span><span> </span><span> Def.task[</span><span style="color:#a626a4;">Unit</span><span>] { </span><span> </span><span style="color:#a626a4;">if </span><span>(!launcherFile.exists) { </span><span> </span><span 
style="color:#a626a4;">val </span><span style="color:#e45649;">u </span><span style="color:#a626a4;">=</span><span> url(</span><span style="color:#0184bc;">s</span><span style="color:#50a14f;">&quot;https://oss.sonatype.org/content/repositories/public/org/scala-sbt/sbt-launch/</span><span style="color:#e45649;">$launcherVersion</span><span style="color:#50a14f;">/sbt-launch-</span><span style="color:#e45649;">$launcherVersion</span><span style="color:#50a14f;">.jar&quot;</span><span>) </span><span> sbt.io.Using.urlInputStream(u) { </span><span style="color:#e45649;">inputStream </span><span style="color:#a626a4;">=&gt; </span><span> IO.transfer(inputStream, launcherFile) </span><span> } </span><span> } </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">fork </span><span style="color:#a626a4;">= new </span><span style="color:#c18401;">ForkRun</span><span>(ForkOptions() </span><span> .withWorkingDirectory(generatedRoot)) </span><span> fork.run( </span><span> </span><span style="color:#50a14f;">&quot;xsbt.boot.Boot&quot;</span><span>, </span><span> classpath </span><span style="color:#a626a4;">=</span><span> launcherFile :: Nil, </span><span> options </span><span style="color:#a626a4;">= </span><span style="color:#50a14f;">&quot;compile&quot;</span><span> :: Nil, </span><span> log </span><span style="color:#a626a4;">=</span><span> streams.value.log </span><span> ) </span><span> } </span><span>}.value </span></code></pre> <p>With these extra tasks, I released the first version of the library manually, but there were a lot of annoying difficulties:</p> <ul> <li>Having to switch between various <code>sbt</code> projects</li> <li>The need to <code>publishLocal</code> the generated artifacts in order to build the examples, or any kind of integration tests that I planned to add</li> <li>The only way to build only those client libraries that are needed for the examples/tests was to build and publish them manually, as this dependency was 
not tracked at all between the unrelated <code>sbt</code> projects</li> <li>Because the generated <code>sbt</code> project could not refer to the outer <code>zio-aws-core</code> project, it had to be copied into the generated project in the code generator step</li> <li>Building and publishing all the <strong>235</strong> projects at once required about <strong>16 GB</strong> of memory and hours of compilation time. It was too big to run on any of the (freely available) CI systems.</li> </ul> <h3 id="proper-solution">Proper solution</h3> <p>When I mentioned this, <em>Itamar Ravid</em> recommended trying to make it an <em>sbt code generator</em>. <code>sbt</code> has built-in support for generating source code, as described <a href="https://www.scala-sbt.org/1.0/docs/Howto-Generating-Files.html">on its documentation page</a>. This alone though would not be enough to cover our use case, as in <code>zio-aws</code> even the <em>set of projects</em> is dynamic and comes from the enumeration of schema models. Fortunately, there is support for that too, through the <code>extraProjects</code> property of <code>sbt</code> <em>plugins</em>.</p> <p>With these two tools, the new project layout became the following:</p> <ul> <li><code>zio-aws-codegen</code> is an sbt <strong>plugin</strong>, having its own <code>sbt</code> project in a subdirectory</li> <li>the <code>zio-aws-core</code> and the HTTP libraries are all in the top-level project as before</li> <li>examples and integration tests are also part of the top-level project</li> <li>the <code>zio-aws-codegen</code> plugin is referenced using a <code>ProjectRef</code> from the outer project</li> <li>the plugin adds all the <em>AWS service client wrapper libraries</em> to the top-level project</li> <li>these projects generate their source on-demand</li> </ul> <p>In this setup, it is possible to build any subset of the generated libraries without the need to process and compile all of them, so it needs much less memory. 
It is also much simpler to run tests or build examples on top of them, as the test and example projects can directly depend on the generated libraries as <code>sbt</code> submodules. And even developing the <em>code generator</em> itself is convenient - it has to be opened in a separate IDE session for editing, but otherwise <code>sbt reload</code> on the top level project automatically recompiles the plugin when needed.</p> <p>Let's see piece by piece how we can achieve this!</p> <h4 id="project-as-a-source-dependency">Project as a source dependency</h4> <p>The first thing I wanted to do was to convert the <code>zio-aws-codegen</code> project to an <code>sbt</code> plugin, while still keeping it in the same repository and being able to use it without having to install it to a local repository. Although the whole code generator code could have been added to the top level <code>sbt</code> project's <code>project</code> source, I wanted to keep it as a separate module to be able to publish it as a library or a CLI tool in the future if needed.</p> <p>This can be achieved by putting it in a subdirectory of the top level project, with a separate <code>build.sbt</code> that contains the following setting</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>sbtPlugin := </span><span style="color:#c18401;">true </span></code></pre> <p>(besides the usual ones). 
Then it can be referenced in the top level project's <code>project/plugins.sbt</code> in the following way:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">lazy val </span><span style="color:#e45649;">codegen </span><span style="color:#a626a4;">=</span><span> project </span><span> .in(file(</span><span style="color:#50a14f;">&quot;.&quot;</span><span>)) </span><span> .dependsOn(ProjectRef(file(</span><span style="color:#50a14f;">&quot;../zio-aws-codegen&quot;</span><span>), </span><span style="color:#50a14f;">&quot;zio-aws-codegen&quot;</span><span>)) </span></code></pre> <p>and enabled in the <code>build.sbt</code> as</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>enablePlugins(ZioAwsCodegenPlugin) </span></code></pre> <h4 id="dynamically-generating-projects">Dynamically generating projects</h4> <p>To generate the subprojects dynamically, we need the <code>Set[ModelId]</code> coming from the <code>loader</code> module. 
It is a <code>ZIO</code> module, so from the <code>sbt</code> plugin we have to use <code>Runtime.default.unsafeRun</code> to execute it.</p> <p>As the code generator project is now an <code>sbt</code> plugin, all the <code>sbt</code> data structures are directly available, so we can just write a function that maps the <code>ModelId</code>s to <code>Project</code>s:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">protected def </span><span style="color:#0184bc;">generateSbtSubprojects</span><span>(</span><span style="color:#e45649;">ids</span><span>: </span><span style="color:#c18401;">Set</span><span>[</span><span style="color:#c18401;">ModelId</span><span>]): </span><span style="color:#c18401;">Seq</span><span>[</span><span style="color:#c18401;">Project</span><span>] </span><span style="color:#a626a4;">= ??? </span></code></pre> <p>One interesting part here is that some of the subprojects depend on each other. This happens with AWS service <em>submodules</em>, indicated by the second parameter of <code>ModelId</code>. An example is <code>dynamodbstreams</code> that depends on <code>dynamodb</code>. 
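</p> <p>To make the submodule relationship concrete, here is a minimal sketch. It assumes a simplified <code>ModelId</code> with a <code>name</code> and an optional <code>subModuleName</code>; the <code>isDependent</code> and <code>parent</code> helpers are hypothetical names used only for illustration and are not taken from the actual code generator:</p>
<pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala">// Simplified stand-in for the code generator's ModelId (assumed shape)
final case class ModelId(name: String, subModuleName: Option[String]) {
  // A submodule has a second component that differs from the service name
  def isDependent: Boolean = subModuleName.exists(_ != name)

  // The project a submodule depends on (hypothetical helper),
  // assuming the base service is identified by ModelId(name, Some(name))
  def parent: Option[ModelId] =
    if (isDependent) Some(ModelId(name, Some(name))) else None
}

object ModelIdExample extends App {
  val dynamodb = ModelId(&quot;dynamodb&quot;, Some(&quot;dynamodb&quot;))
  val streams = ModelId(&quot;dynamodb&quot;, Some(&quot;dynamodbstreams&quot;))

  println(dynamodb.isDependent)
  println(streams.isDependent)
  println(streams.parent == Some(dynamodb))
}
</code></pre> <p>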
When creating the <code>Project</code> values, we have to be able to <code>dependsOn</code> some other already generated projects, and they have to be generated in the correct order to do so.</p> <p>We could do a full topological sort, but it is not necessary: here we know that the maximum depth of dependencies is 1, so it is enough to put the submodules at the end of the sequence:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">map </span><span style="color:#a626a4;">=</span><span> ids </span><span> .toSeq </span><span> .sortWith { </span><span style="color:#a626a4;">case </span><span>(</span><span style="color:#e45649;">a</span><span>, </span><span style="color:#e45649;">b</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">aIsDependent </span><span style="color:#a626a4;">=</span><span> a.subModuleName </span><span style="color:#a626a4;">match </span><span>{ </span><span> </span><span style="color:#a626a4;">case </span><span>Some(</span><span style="color:#e45649;">value</span><span>) </span><span style="color:#a626a4;">if</span><span> value </span><span style="color:#a626a4;">!=</span><span> a.name </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">true </span><span> </span><span style="color:#a626a4;">case </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">false </span><span> } </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">bIsDependent </span><span style="color:#a626a4;">=</span><span> b.subModuleName </span><span style="color:#a626a4;">match </span><span>{ </span><span> </span><span style="color:#a626a4;">case </span><span>Some(</span><span 
style="color:#e45649;">value</span><span>) </span><span style="color:#a626a4;">if</span><span> value </span><span style="color:#a626a4;">!=</span><span> b.name </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">true </span><span> </span><span style="color:#a626a4;">case </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">false </span><span> } </span><span> bIsDependent || (!aIsDependent &amp;&amp; a.toString &lt; b.toString) </span><span> } </span></code></pre> <p>Then, in order to be able to get the dependencies, we do a <em>fold</em> on the ordered <code>ModelId</code>s:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span> .foldLeft(Map.empty[</span><span style="color:#c18401;">ModelId</span><span>, </span><span style="color:#c18401;">Project</span><span>]) { (</span><span style="color:#e45649;">mapping</span><span>, </span><span style="color:#e45649;">id</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">deps </span><span style="color:#a626a4;">=</span><span> id.subModule </span><span style="color:#a626a4;">match </span><span>{ </span><span> </span><span style="color:#a626a4;">case </span><span>Some(</span><span style="color:#e45649;">value</span><span>) </span><span style="color:#a626a4;">if</span><span> value </span><span style="color:#a626a4;">!=</span><span> id.name </span><span style="color:#a626a4;">=&gt; </span><span> Seq(ClasspathDependency(LocalProject(</span><span style="color:#50a14f;">&quot;zio-aws-core&quot;</span><span>), None), </span><span> ClasspathDependency(mapping(ModelId(id.name, Some(id.name))), None)) </span><span> </span><span style="color:#a626a4;">case </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">=&gt; </span><span> Seq(ClasspathDependency(LocalProject(</span><span style="color:#50a14f;">&quot;zio-aws-core&quot;</span><span>), None)) </span><span> } </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">project </span><span style="color:#a626a4;">= </span><span>Project(fullName, file(</span><span style="color:#50a14f;">&quot;generated&quot;</span><span>) / name) </span><span> .settings( </span><span> libraryDependencies += </span><span style="color:#50a14f;">&quot;software.amazon.awssdk&quot;</span><span> % id.name % awsLibraryVersion.value, </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span> .dependsOn(deps: </span><span style="color:#a626a4;">_*</span><span>) </span><span> </span><span> mapping.updated(id, project) </span><span> } </span></code></pre> <p>To make it easier to work with the generated projects, we also create a project named <code>all</code> that aggregates all the ones generated above.</p> <h4 id="applying-settings-to-the-generated-projects">Applying settings to the generated projects</h4> <p>The code generator only sets the basic settings for the generated projects: name, path and dependencies. We need a lot more: setting the organization and version, all the publishing options, controlling the Scala version, etc.</p> <p>I decided to keep these settings outside of the code generator plugin, in the top-level <code>sbt</code> project. By creating an <code>AutoPlugin</code> and enabling it for all projects, we can inject all the common settings for both the hand-written and the generated projects:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">object</span><span style="color:#c18401;"> Common </span><span style="color:#a626a4;">extends </span><span>AutoPlugin </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">object</span><span style="color:#c18401;"> autoImport { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">scala212Version </span><span style="color:#a626a4;">= </span><span style="color:#50a14f;">&quot;2.12.12&quot; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">scala213Version </span><span style="color:#a626a4;">= </span><span style="color:#50a14f;">&quot;2.13.3&quot; </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... 
</span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">import</span><span style="color:#c18401;"> autoImport.</span><span style="color:#e45649;">_ </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override val </span><span style="color:#e45649;">trigger </span><span style="color:#a626a4;">=</span><span style="color:#c18401;"> allRequirements </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override val </span><span style="color:#e45649;">requires </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">Sonatype </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override lazy val </span><span style="color:#e45649;">projectSettings </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> Seq( </span><span style="color:#c18401;"> scalaVersion := scala213Version, </span><span style="color:#c18401;"> crossScalaVersions := List(scala212Version, scala213Version), </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;"> ) </span><span style="color:#c18401;">} </span></code></pre> <h4 id="source-generator-task">Source generator task</h4> <p>At this point, we could also add the already existing <em>source code generation</em> to the initialization of the plugin, and just generate all the source files of all the subprojects every time the <code>sbt</code> project is loaded. 
With this number of generated projects though, it would have been a very big startup overhead and would not allow us to split the build (at least not the code generation part) on CI, to solve the memory and build time issues.</p> <p>As <code>sbt</code> has built-in support for defining <em>source generator tasks</em>, we can do much better!</p> <p>Instead of generating the source code in one step, we define a <code>generateSources</code> task and add it to each <em>generated subproject</em> as a <em>source generator</em>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>Compile / sourceGenerators += generateSources.taskValue, </span><span>awsLibraryId := id.toString </span></code></pre> <p>The <code>awsLibraryId</code> is a custom property that the <code>generateSources</code> task can use to determine which schema to use for the code generation.</p> <p>The first part of this task is to gather the information from the project it got applied on, including the custom <code>awsLibraryId</code> property:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">lazy val </span><span style="color:#e45649;">generateSources </span><span style="color:#a626a4;">= </span><span> Def.task { </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">log </span><span style="color:#a626a4;">=</span><span> streams.value.log </span><span> </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">idStr </span><span style="color:#a626a4;">=</span><span> awsLibraryId.value </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">id </span><span style="color:#a626a4;">= </span><span>ModelId.parse(idStr) </span><span style="color:#a626a4;">match </span><span>{ 
</span><span> </span><span style="color:#a626a4;">case </span><span>Left(</span><span style="color:#e45649;">failure</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> sys.error(failure) </span><span> </span><span style="color:#a626a4;">case </span><span>Right(</span><span style="color:#e45649;">value</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> value </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">targetRoot </span><span style="color:#a626a4;">= </span><span>(sourceManaged in Compile).value </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">travisSrc </span><span style="color:#a626a4;">=</span><span> travisSource.value </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">travisDst </span><span style="color:#a626a4;">=</span><span> travisTarget.value </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">parallelJobs </span><span style="color:#a626a4;">=</span><span> travisParallelJobs.value </span></code></pre> <p>From these, we create a <code>Parameters</code> data structure to pass to the <code>generator</code> module. 
This is what we used to construct with <code>clipp</code> from CLI arguments:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">params </span><span style="color:#a626a4;">= </span><span>Parameters( </span><span> targetRoot </span><span style="color:#a626a4;">= </span><span>Path.fromJava(targetRoot.toPath), </span><span> travisSource </span><span style="color:#a626a4;">= </span><span>Path.fromJava(travisSrc.toPath), </span><span> travisTarget </span><span style="color:#a626a4;">= </span><span>Path.fromJava(travisDst.toPath), </span><span> parallelTravisJobs </span><span style="color:#a626a4;">=</span><span> parallelJobs </span><span> ) </span></code></pre> <p>And finally, construct the <code>ZIO</code> environment, load a <strong>single</strong> schema model, and generate the library's source code:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span> zio.Runtime.default.unsafeRun { </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">cfg </span><span style="color:#a626a4;">= </span><span>ZLayer.succeed(params) </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">env </span><span style="color:#a626a4;">=</span><span> loader.live ++ (cfg &gt;+&gt; generator.live) </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">task </span><span style="color:#a626a4;">= </span><span> </span><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">&lt;- </span><span>ZIO.effect(log.info(</span><span style="color:#0184bc;">s</span><span style="color:#50a14f;">&quot;Generating sources for </span><span 
style="color:#e45649;">$id</span><span style="color:#50a14f;">&quot;</span><span>)) </span><span> </span><span style="color:#e45649;">model </span><span style="color:#a626a4;">&lt;-</span><span> loader.loadCodegenModel(id) </span><span> </span><span style="color:#e45649;">files </span><span style="color:#a626a4;">&lt;-</span><span> generator.generateServiceCode(id, model) </span><span> } </span><span style="color:#a626a4;">yield</span><span> files.toSeq </span><span> task.provideCustomLayer(env).catchAll { </span><span style="color:#e45649;">generatorError </span><span style="color:#a626a4;">=&gt; </span><span> ZIO.effect(log.error(</span><span style="color:#0184bc;">s</span><span style="color:#50a14f;">&quot;Code generator failure: ${</span><span>generatorError</span><span style="color:#50a14f;">}&quot;</span><span>)).as(Seq.empty) </span><span> } </span><span> } </span><span> } </span></code></pre> <p>The <code>generateServiceCode</code> function returns a <code>Set[File]</code> value containing all the generated source files. This is the result of the <em>source generator task</em>, and <code>sbt</code> uses this information to add the generated files to the compilation.</p> <h4 id="referencing-the-generated-projects">Referencing the generated projects</h4> <p>When defining downstream projects in the <code>build.sbt</code>, such as integration tests and other examples, we have to refer to the generated projects somehow. There is no value of type <code>Project</code> in scope to do so, but we can do it easily by name using <code>LocalProject</code>. 
The following example shows how the <code>example1</code> subproject does this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">lazy val </span><span style="color:#e45649;">example1 </span><span style="color:#a626a4;">= </span><span>Project(</span><span style="color:#50a14f;">&quot;example1&quot;</span><span>, file(</span><span style="color:#50a14f;">&quot;examples&quot;</span><span>) / </span><span style="color:#50a14f;">&quot;example1&quot;</span><span>) </span><span> .dependsOn( </span><span> core, </span><span> http4s, </span><span> netty, </span><span> LocalProject(</span><span style="color:#50a14f;">&quot;zio-aws-elasticbeanstalk&quot;</span><span>), </span><span> LocalProject(</span><span style="color:#50a14f;">&quot;zio-aws-ec2&quot;</span><span>) </span><span> ) </span></code></pre> <h4 id="parallel-build-on-travis-ci">Parallel build on Travis CI</h4> <p>The last thing that I wanted to solve was building the full <code>zio-aws</code> suite on a CI. I am using <a href="https://travis-ci.org/">Travis CI</a> for my private projects, so that's what I built it for. The idea is to split the set of <em>service client libraries</em> into chunks and create a <a href="https://docs.travis-ci.com/user/build-matrix/">build matrix</a> to run those in parallel. The tricky part is that the set of generated service libraries is dynamic, collected by the code generator.</p> <p>To solve this, I started to generate the <code>.travis.yml</code> build descriptor as well. 
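</p> <p>The chunking idea can be sketched as follows. This is only an illustration, assuming the generated project names are available as plain strings; <code>TravisChunks</code>, <code>commandSets</code> and <code>chunkCount</code> are hypothetical names, and the real plugin task may group and format the commands differently:</p>
<pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala">object TravisChunks {
  // Splits the project names into at most `chunkCount` groups and renders
  // one sbt command list per group, each compiling only its own projects
  def commandSets(projectNames: Seq[String], chunkCount: Int): Seq[String] = {
    require(chunkCount &gt; 0, &quot;chunkCount must be positive&quot;)
    val sorted = projectNames.sorted
    // Round up so that no more than `chunkCount` groups are produced
    val chunkSize = math.max(1, math.ceil(sorted.size.toDouble / chunkCount).toInt)
    sorted
      .grouped(chunkSize)
      .map(group =&gt; group.map(name =&gt; s&quot;$name/compile&quot;).mkString(&quot;clean &quot;, &quot; &quot;, &quot;&quot;))
      .toSeq
  }
}
</code></pre> <p>With three project names and <code>chunkCount = 2</code>, this produces two command lists: one compiling the first two projects in alphabetical order and one compiling the third. Each list can then become one entry of the build matrix.</p> <p>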
The <em>hand-written</em> part has been moved to <code>.travis.base.yml</code>:</p> <pre data-lang="yaml" style="background-color:#fafafa;color:#383a42;" class="language-yaml "><code class="language-yaml" data-lang="yaml"><span style="color:#e45649;">language</span><span>: </span><span style="color:#50a14f;">scala </span><span style="color:#e45649;">services</span><span>: </span><span> - </span><span style="color:#50a14f;">docker </span><span style="color:#e45649;">scala</span><span>: </span><span> - </span><span style="color:#c18401;">2.12.12 </span><span> - </span><span style="color:#c18401;">2.13.3 </span><span> </span><span style="color:#e45649;">cache</span><span>: </span><span> </span><span style="color:#e45649;">directories</span><span>: </span><span> - </span><span style="color:#50a14f;">$HOME/.cache/coursier </span><span> - </span><span style="color:#50a14f;">$HOME/.ivy2/cache </span><span> - </span><span style="color:#50a14f;">$HOME/.sbt </span><span> </span><span style="color:#e45649;">env</span><span>: </span><span> - </span><span style="color:#50a14f;">COMMANDS=&quot;clean zio-aws-core/test zio-aws-akka-http/test zio-aws-http4s/test zio-aws-netty/test&quot; </span><span> - </span><span style="color:#50a14f;">COMMANDS=&quot;clean examples/compile&quot; </span><span> - </span><span style="color:#50a14f;">COMMANDS=&quot;clean integtests/test&quot; </span><span> </span><span style="color:#e45649;">before_install</span><span>: </span><span> - </span><span style="color:#50a14f;">if [ &quot;$COMMANDS&quot; = &quot;clean integtests/test&quot; ]; then docker pull localstack/localstack; fi </span><span> - </span><span style="color:#50a14f;">if [ &quot;$COMMANDS&quot; = &quot;clean integtests/test&quot; ]; then docker run -d -p 4566:4566 --env SERVICES=s3,dynamodb --env START_WEB=0 localstack/localstack; fi </span><span> </span><span style="color:#e45649;">script</span><span>: </span><span> - </span><span style="color:#50a14f;">sbt ++$TRAVIS_SCALA_VERSION 
-jvm-opts travis/jvmopts $COMMANDS </span></code></pre> <p>I use the <code>COMMANDS</code> environment variable to define the parallel sets of <code>sbt</code> commands here. There are three predefined sets: building <code>zio-aws-core</code> and the HTTP implementations, building the <em>example projects</em> and running the <em>integration tests</em>. The last two involve generating actual service client code and building it - but only the few libraries that are necessary, so it is not an issue to do that redundantly.</p> <p>The real <code>.travis.yml</code> file is then generated by running a task <em>manually</em>, <code>sbt generateTravisYaml</code>. It is implemented in the <code>zio-aws-codegen</code> plugin and it loads the <code>.travis.base.yml</code> file and extends the <code>env</code> section with a set of <code>COMMANDS</code> variants, each compiling a subset of the generated subprojects.</p> <h2 id="conclusion">Conclusion</h2> <p>Travis CI can now build <code>zio-aws</code> and run its integration tests. A build runs for hours, but it is stable, and consists of 22 parallel jobs to build all the libraries for both Scala 2.12 and 2.13. 
At the same time, developing the code generator and the other subprojects and tests became really convenient.</p> prox part 4 - simplified redesign 2020-08-03T00:00:00+00:00 2020-08-03T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/prox-4-simplify/ <h2 id="blog-post-series">Blog post series</h2> <ul> <li><a href="https://blog.vigoo.dev/posts/prox-1-types/">Part 1 - type level programming</a></li> <li><a href="https://blog.vigoo.dev/posts/prox-2-io-akkastreams/">Part 2 - akka streams with cats effect</a></li> <li><a href="https://blog.vigoo.dev/posts/prox-3-zio/">Part 3 - effect abstraction and ZIO</a></li> <li><a href="https://blog.vigoo.dev/posts/prox-4-simplify/">Part 4 - simplified redesign</a></li> </ul> <h2 id="intro">Intro</h2> <p>In <a href="https://blog.vigoo.dev/posts/prox-4-simplify/2019-02-10-prox-1-types.html">Part 1</a> I described how advanced type level programming techniques can be used to describe the execution of system processes. It was a good playground to experiment with these techniques, and the result has proven useful as we started to use it in more and more production systems and test environments at <a href="https://prezi.com">Prezi</a>.</p> <p>On the other hand, as I mentioned at the end of the first post, there is a tradeoff. These techniques made the original version of <em>prox</em> very hard to maintain and improve, and the error messages library users got for small mistakes were really hard to understand.</p> <p>Last December (in 2019) I redesigned the library to be simpler and easier to use by making some compromises. Let's discover how!</p> <h2 id="a-single-process">A single process</h2> <p>We start completely from scratch and try to design the library with the same functionality but with simplicity in mind. 
The code snippets shown here are not necessarily the final, current state of the traits and objects of the library, but intermediate steps that show the thought process.</p> <p>First let's focus on defining a <strong>single process</strong>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> Process { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">command</span><span style="color:#c18401;">: String </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">arguments</span><span style="color:#c18401;">: List[String] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">workingDirectory</span><span style="color:#c18401;">: Option[Path] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">environmentVariables</span><span style="color:#c18401;">: Map[String, String] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">removedEnvironmentVariables</span><span style="color:#c18401;">: Set[String] </span><span style="color:#c18401;">} </span></code></pre> <p>Without deciding already how it will be implemented, we know we need this information to be able to launch the process alone. And how to execute it? 
Let's separate it completely:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> ProcessResult { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">exitCode</span><span style="color:#c18401;">: ExitCode </span><span style="color:#c18401;">} </span><span> </span><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> ProcessRunner { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">start</span><span style="color:#c18401;">(</span><span style="color:#e45649;">process</span><span style="color:#c18401;">: Process): Resource[IO, Fiber[IO, ProcessResult]] </span><span style="color:#c18401;">} </span></code></pre> <p>I decided that better integration with the IO library (<a href="https://typelevel.org/cats-effect/">cats-effect</a> in this case) is also a goal of the redesign, so for starters I modelled the <em>running process</em> as a cancellable fiber resulting in <code>ProcessResult</code>, where cancellation means <strong>terminating</strong> the process. 
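</p> <p>From the caller's side, this interface could be used roughly like the following sketch. It assumes cats-effect 2.x (the version that has <code>ContextShift</code> and <code>Timer</code>) and that releasing the resource cancels the fiber; <code>RunnerUsage</code>, <code>runToCompletion</code> and <code>runWithTimeout</code> are hypothetical names for illustration, not part of <em>prox</em>, and the traits are restated in trimmed form only to keep the snippet self-contained:</p>
<pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala">import cats.effect.{ContextShift, ExitCode, Fiber, IO, Resource, Timer}
import scala.concurrent.duration.FiniteDuration

// Restated from the snippets above, trimmed to what the sketch needs
trait Process {
  val command: String
  val arguments: List[String]
}

trait ProcessResult {
  val exitCode: ExitCode
}

trait ProcessRunner {
  def start(process: Process): Resource[IO, Fiber[IO, ProcessResult]]
}

object RunnerUsage {
  // Starts the process, waits for it to exit and returns its result
  def runToCompletion(runner: ProcessRunner, process: Process): IO[ProcessResult] =
    runner.start(process).use(_.join)

  // Fails with a TimeoutException if the process has not exited within `limit`.
  // Leaving `use` releases the resource; assuming the release action cancels
  // the fiber, this is what terminates the still running process.
  def runWithTimeout(runner: ProcessRunner, process: Process, limit: FiniteDuration)(
    implicit timer: Timer[IO], contextShift: ContextShift[IO]
  ): IO[ProcessResult] =
    runner.start(process).use(_.join.timeout(limit))
}
</code></pre> <p>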
At this stage of the redesign I worked directly with <code>IO</code> instead of the <em>IO typeclasses</em> and later replaced it as I described in <a href="https://blog.vigoo.dev/posts/prox-4-simplify/2019-08-13-prox-3-zio.html">the previous post</a>.</p> <p>Let's see what a simple runner implementation would look like:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">import</span><span> java.lang.{Process </span><span style="color:#a626a4;">=&gt;</span><span> JvmProcess} </span><span> </span><span style="color:#a626a4;">class</span><span style="color:#c18401;"> JVMProcessRunner</span><span>(</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">contextShift</span><span>: </span><span style="color:#c18401;">ContextShift</span><span>[</span><span style="color:#c18401;">IO</span><span>]) </span><span style="color:#a626a4;">extends </span><span>ProcessRunner </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">import</span><span style="color:#c18401;"> JVMProcessRunner.</span><span style="color:#e45649;">_ </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">start</span><span style="color:#c18401;">(</span><span style="color:#e45649;">process</span><span style="color:#c18401;">: Process): Resource[IO, Fiber[IO, ProcessResult]] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">builder </span><span style="color:#a626a4;">=</span><span style="color:#c18401;"> withEnvironmentVariables(process, </span><span style="color:#c18401;"> withWorkingDirectory(process, </span><span style="color:#c18401;"> </span><span 
style="color:#a626a4;">new </span><span style="color:#c18401;">ProcessBuilder((process.command :: process.arguments).asJava))) </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">start </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">IO.delay(</span><span style="color:#a626a4;">new </span><span style="color:#c18401;">JVMRunningProcess(builder.start())).bracketCase { </span><span style="color:#e45649;">runningProcess </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;"> runningProcess.waitForExit() </span><span style="color:#c18401;"> } { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case </span><span style="color:#c18401;">(</span><span style="color:#e45649;">_</span><span style="color:#c18401;">, Completed) </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;"> IO.unit </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case </span><span style="color:#c18401;">(</span><span style="color:#e45649;">_</span><span style="color:#c18401;">, Error(</span><span style="color:#e45649;">reason</span><span style="color:#c18401;">)) </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;"> IO.raiseError(reason) </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case </span><span style="color:#c18401;">(</span><span style="color:#e45649;">runningProcess</span><span style="color:#c18401;">, Canceled) </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;"> runningProcess.terminate() &gt;&gt; IO.unit </span><span style="color:#c18401;"> }.start </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> Resource.make(start)(</span><span style="color:#e45649;">_</span><span style="color:#c18401;">.cancel) </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} 
</span></code></pre> <p>Here <code>withEnvironmentVariables</code> and <code>withWorkingDirectory</code> are just helper functions around the JVM <em>process builder</em>. The more important parts are the <em>cancelation</em> and the fact that we expose the running process as a <em>resource</em>.</p> <p>First we wrap the started JVM process in a <code>JVMRunningProcess</code> class which really just wraps some of its operations in IO operations:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">case class</span><span style="color:#c18401;"> SimpleProcessResult</span><span>(</span><span style="color:#a626a4;">override val </span><span style="color:#e45649;">exitCode</span><span>: </span><span style="color:#c18401;">ExitCode</span><span>) </span><span> </span><span style="color:#a626a4;">extends </span><span>ProcessResult </span><span> </span><span style="color:#a626a4;">class</span><span style="color:#c18401;"> JVMRunningProcess</span><span>(</span><span style="color:#a626a4;">val </span><span style="color:#e45649;">nativeProcess</span><span>: </span><span style="color:#c18401;">JvmProcess</span><span>) </span><span style="color:#a626a4;">extends </span><span>RunningProcess </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">isAlive</span><span style="color:#c18401;">: IO[</span><span style="color:#a626a4;">Boolean</span><span style="color:#c18401;">] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">IO.delay(nativeProcess.isAlive) </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">kill</span><span style="color:#c18401;">(): IO[ProcessResult] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">IO.delay(nativeProcess.destroyForcibly()) &gt;&gt; 
waitForExit() </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">terminate</span><span style="color:#c18401;">(): IO[ProcessResult] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">IO.delay(nativeProcess.destroy()) &gt;&gt; waitForExit() </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">waitForExit</span><span style="color:#c18401;">(): IO[ProcessResult] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">for </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#e45649;">exitCode </span><span style="color:#a626a4;">&lt;- </span><span style="color:#c18401;">IO.delay(nativeProcess.waitFor()) </span><span style="color:#c18401;"> } </span><span style="color:#a626a4;">yield </span><span style="color:#c18401;">SimpleProcessResult(ExitCode(exitCode)) </span><span style="color:#c18401;">} </span></code></pre> <p>Then we wrap the <em>starting of the process</em> with <code>bracketCase</code>, specifying what happens during normal execution and on release:</p> <ul> <li>On normal execution, we call <code>waitForExit</code> to wait for the process to stop and produce the <code>ProcessResult</code> as the result of the bracketed IO operation.</li> <li>In the release case, if the JVM threw an exception, it is raised to the IO level.</li> <li>And if it got <em>canceled</em>, we <code>terminate</code> the process.</li> </ul> <p>This gives us a simple way to wait for or terminate a running process through the IO cancelation interface. By calling <code>.start</code> on this bracketed IO operation we move it to a concurrent <em>fiber</em>.</p> <p>Finally we wrap it in a <code>Resource</code>, so if the user code that started the process gets canceled, the <em>resource is released</em> too, which ends up <em>terminating</em> the process and leaves no leaked processes. 
This is something that was missing from the earlier versions of the library.</p> <p>To make starting processes more convenient we can create an <strong>extension method</strong> on the <code>Process</code> trait:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">implicit class</span><span style="color:#c18401;"> ProcessOps</span><span>(</span><span style="color:#a626a4;">private val </span><span style="color:#e45649;">process</span><span>: </span><span style="color:#c18401;">Process</span><span>) </span><span style="color:#a626a4;">extends </span><span>AnyVal </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">start</span><span style="color:#c18401;">(</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">runner</span><span style="color:#c18401;">: ProcessRunner): Resource[IO, Fiber[IO, ProcessResult]] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> runner.start(process) </span><span style="color:#c18401;">} </span></code></pre> <h2 id="redirection">Redirection</h2> <p>The next step was to implement input/output/error <em>redirection</em>. In the original <em>prox</em> library we had two important features, both implemented with type level techniques:</p> <ul> <li>Allow redirection only once per channel</li> <li>The redirection source or target was a type class with <em>dependent result types</em></li> </ul> <p>To keep the type signatures simpler I decided to work around these by sacrificing some genericity and terseness. 
Let's start by defining an interface for <strong>redirecting process output</strong>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> RedirectableOutput</span><span>[</span><span style="color:#a626a4;">+</span><span style="color:#c18401;">P</span><span>[</span><span style="color:#e45649;">_</span><span>] </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">Process</span><span>[</span><span style="color:#e45649;">_</span><span>]] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">connectOutput</span><span style="color:#c18401;">[R </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">OutputRedirection, O](</span><span style="color:#e45649;">target</span><span style="color:#c18401;">: R)(</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">outputRedirectionType</span><span style="color:#c18401;">: OutputRedirectionType.Aux[R, O]): P[O] </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... 
</span><span style="color:#c18401;">} </span></code></pre> <p>This is not <em>very</em> different from the output redirection operator in the previous <em>prox</em> versions:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">&gt;</span><span>[</span><span style="color:#c18401;">F</span><span>[</span><span style="color:#e45649;">_</span><span>], </span><span style="color:#c18401;">To</span><span>, </span><span style="color:#c18401;">NewOut</span><span>, </span><span style="color:#c18401;">NewOutResult</span><span>, </span><span style="color:#c18401;">Result </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ProcessNode</span><span>[</span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#c18401;">Redirected</span><span>, </span><span style="color:#e45649;">_</span><span>]] </span><span> (</span><span style="color:#e45649;">to</span><span>: </span><span style="color:#c18401;">To</span><span>) </span><span> (</span><span style="color:#a626a4;">implicit </span><span> </span><span style="color:#e45649;">contextOf</span><span>: </span><span style="color:#c18401;">ContextOf</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">PN</span><span>, </span><span style="color:#c18401;">F</span><span>], </span><span> </span><span style="color:#e45649;">target</span><span>: </span><span style="color:#c18401;">CanBeProcessOutputTarget</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">F</span><span>, </span><span style="color:#c18401;">To</span><span>, </span><span style="color:#c18401;">NewOut</span><span>, </span><span 
style="color:#c18401;">NewOutResult</span><span>], </span><span> </span><span style="color:#e45649;">redirectOutput</span><span>: </span><span style="color:#c18401;">RedirectOutput</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">F</span><span>, </span><span style="color:#c18401;">PN</span><span>, </span><span style="color:#c18401;">To</span><span>, </span><span style="color:#c18401;">NewOut</span><span>, </span><span style="color:#c18401;">NewOutResult</span><span>, </span><span style="color:#c18401;">Result</span><span>]) </span></code></pre> <p>One of the primary differences is that we no longer allow arbitrary targets just by requiring a <code>CanBeProcessOutputTarget</code> type class. Instead we can only connect the output to a value of <code>OutputRedirection</code>, which is an ADT:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">sealed trait</span><span style="color:#c18401;"> OutputRedirection </span><span style="color:#a626a4;">case object</span><span style="color:#c18401;"> StdOut </span><span style="color:#a626a4;">extends </span><span>OutputRedirection </span><span style="color:#a626a4;">case class</span><span style="color:#c18401;"> OutputFile</span><span>(</span><span style="color:#e45649;">path</span><span>: </span><span style="color:#c18401;">Path</span><span>, </span><span style="color:#e45649;">append</span><span>: </span><span style="color:#a626a4;">Boolean</span><span>) </span><span style="color:#a626a4;">extends </span><span>OutputRedirection </span><span style="color:#a626a4;">case class</span><span style="color:#c18401;"> OutputStream</span><span>[</span><span style="color:#c18401;">O</span><span>, </span><span style="color:#a626a4;">+</span><span style="color:#c18401;">OR</span><span>](</span><span style="color:#e45649;">pipe</span><span>: </span><span 
style="color:#c18401;">Pipe</span><span>[</span><span style="color:#c18401;">IO</span><span>, </span><span style="color:#a626a4;">Byte</span><span>, </span><span style="color:#c18401;">O</span><span>], </span><span style="color:#e45649;">runner</span><span>: </span><span style="color:#c18401;">Stream</span><span>[</span><span style="color:#c18401;">IO</span><span>, </span><span style="color:#c18401;">O</span><span>] </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">IO</span><span>[</span><span style="color:#c18401;">OR</span><span>], </span><span style="color:#e45649;">chunkSize</span><span>: </span><span style="color:#a626a4;">Int = </span><span style="color:#c18401;">8192</span><span>) </span><span style="color:#a626a4;">extends </span><span>OutputRedirection </span></code></pre> <p>We still need a type level calculation to extract the result type of the <code>OutputStream</code> case (which is the <code>OR</code> type parameter). It is extracted by the following trait with the help of the <code>Aux</code> pattern:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> OutputRedirectionType</span><span>[</span><span style="color:#c18401;">R</span><span>] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">type </span><span style="color:#c18401;">Out </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">runner</span><span style="color:#c18401;">(</span><span style="color:#e45649;">of</span><span style="color:#c18401;">: R)(</span><span style="color:#e45649;">nativeProcess</span><span style="color:#c18401;">: JvmProcess, </span><span style="color:#e45649;">blocker</span><span style="color:#c18401;">: Blocker, </span><span 
style="color:#e45649;">contextShift</span><span style="color:#c18401;">: ContextShift[IO]): IO[Out] </span><span style="color:#c18401;">} </span></code></pre> <p>The important difference from earlier versions of the library is that this remains completely an implementation detail. <code>OutputRedirectionType</code> is implemented for all three cases of the <code>OutputRedirection</code> type and <code>connectOutput</code> is not even used in the default use cases, only when implementing redirection for something custom.</p> <p>Instead the <code>RedirectableOutput</code> trait itself defines a set of operators and named function versions for redirecting to different targets. With this we lose a general-purpose, type class managed way to redirect to <em>anything</em> but improve a lot on the usability of the library. All these functions are easily discoverable from the IDE and there are no weird implicit resolution errors.</p> <p>Let's see some examples of these functions:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> RedirectableOutput</span><span>[</span><span style="color:#a626a4;">+</span><span style="color:#c18401;">P</span><span>[</span><span style="color:#e45649;">_</span><span>] </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">Process</span><span>[</span><span style="color:#e45649;">_</span><span>]] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... 
</span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">&gt;</span><span style="color:#c18401;">(</span><span style="color:#e45649;">sink</span><span style="color:#c18401;">: Pipe[IO, </span><span style="color:#a626a4;">Byte</span><span style="color:#c18401;">, </span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">]): P[</span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">] </span><span style="color:#a626a4;">=</span><span style="color:#c18401;"> toSink(sink) </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">toSink</span><span style="color:#c18401;">(</span><span style="color:#e45649;">sink</span><span style="color:#c18401;">: Pipe[F, </span><span style="color:#a626a4;">Byte</span><span style="color:#c18401;">, </span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">]): P[</span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> connectOutput(OutputStream(sink, (</span><span style="color:#e45649;">s</span><span style="color:#c18401;">: Stream[F, </span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">]) </span><span style="color:#a626a4;">=&gt;</span><span style="color:#c18401;"> s.compile.drain)) </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">&gt;#</span><span style="color:#c18401;">[O</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Monoid](</span><span style="color:#e45649;">pipe</span><span style="color:#c18401;">: Pipe[F, </span><span style="color:#a626a4;">Byte</span><span style="color:#c18401;">, O]): P[O] </span><span style="color:#a626a4;">=</span><span style="color:#c18401;"> toFoldMonoid(pipe) </span><span style="color:#c18401;"> 
</span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">toFoldMonoid</span><span style="color:#c18401;">[O</span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Monoid](</span><span style="color:#e45649;">pipe</span><span style="color:#c18401;">: Pipe[F, </span><span style="color:#a626a4;">Byte</span><span style="color:#c18401;">, O]): P[O] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> connectOutput(OutputStream(pipe, (</span><span style="color:#e45649;">s</span><span style="color:#c18401;">: Stream[F, O]) </span><span style="color:#a626a4;">=&gt;</span><span style="color:#c18401;"> s.compile.foldMonoid)) </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">&gt;&gt;</span><span style="color:#c18401;">(</span><span style="color:#e45649;">path</span><span style="color:#c18401;">: Path): P[</span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">] </span><span style="color:#a626a4;">=</span><span style="color:#c18401;"> appendToFile(path) </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">appendToFile</span><span style="color:#c18401;">(</span><span style="color:#e45649;">path</span><span style="color:#c18401;">: Path): P[</span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> connectOutput(OutputFile[F](path, append </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">true)) </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... 
</span><span style="color:#c18401;">} </span></code></pre> <p>All of them just use the <code>connectOutput</code> function, so implementations of the <code>RedirectableOutput</code> trait only need to define that single function to get this capability.</p> <p>Note that <code>connectOutput</code> has a return type of <code>P[O]</code> instead of being just <code>Process</code>. This is important for multiple reasons.</p> <p>First, in order to actually <em>execute</em> the output streams, we need to store them somehow in the <code>Process</code> data type itself. For this reason we add a type parameter to the <code>Process</code> trait representing the <em>output type</em> and store the <em>output stream runner function</em> itself in it:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> Process</span><span>[</span><span style="color:#c18401;">O</span><span>] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">outputRedirection</span><span style="color:#c18401;">: OutputRedirection </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">runOutputStream</span><span style="color:#c18401;">: (JvmProcess, Blocker, ContextShift[IO]) </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">IO[O] </span><span style="color:#c18401;">} </span></code></pre> <p>Note that <code>runOutputStream</code> is actually the <code>OutputRedirectionType.runner</code> function, obtained from the "hidden" type level operation and stored in the process data structure. 
With this, the <em>process runner</em> can be extended to pass the started JVM process to this function that sets up the redirection, and then store the result of type <code>O</code> in <code>ProcessResult[O]</code>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">start</span><span>[</span><span style="color:#c18401;">O</span><span>](</span><span style="color:#e45649;">process</span><span>: </span><span style="color:#c18401;">Process</span><span>[</span><span style="color:#c18401;">O</span><span>], </span><span style="color:#e45649;">blocker</span><span>: </span><span style="color:#c18401;">Blocker</span><span>): </span><span style="color:#c18401;">Resource</span><span>[</span><span style="color:#c18401;">IO</span><span>, </span><span style="color:#c18401;">Fiber</span><span>[</span><span style="color:#c18401;">IO</span><span>, </span><span style="color:#c18401;">ProcessResult</span><span>[</span><span style="color:#c18401;">O</span><span>]]] </span><span style="color:#a626a4;">= </span><span>{ </span><span> </span><span style="color:#a0a1a7;">// ... 
process builder </span><span> </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">outputRedirect </span><span style="color:#a626a4;">=</span><span> process.outputRedirection </span><span style="color:#a626a4;">match </span><span>{ </span><span> </span><span style="color:#a626a4;">case </span><span>StdOut </span><span style="color:#a626a4;">=&gt; </span><span>ProcessBuilder.Redirect.INHERIT </span><span> </span><span style="color:#a626a4;">case </span><span>OutputFile(</span><span style="color:#e45649;">path</span><span>, </span><span style="color:#c18401;">false</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span>ProcessBuilder.Redirect.to(path.toFile) </span><span> </span><span style="color:#a626a4;">case </span><span>OutputFile(</span><span style="color:#e45649;">path</span><span>, </span><span style="color:#c18401;">true</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span>ProcessBuilder.Redirect.appendTo(path.toFile) </span><span> </span><span style="color:#a626a4;">case </span><span>OutputStream(</span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span>ProcessBuilder.Redirect.PIPE </span><span> } </span><span> builder.redirectOutput(outputRedirect) </span><span> </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">startProcess </span><span style="color:#a626a4;">= for </span><span>{ </span><span> </span><span style="color:#e45649;">nativeProcess </span><span style="color:#a626a4;">&lt;- </span><span>IO.delay(builder.start()) </span><span> </span><span style="color:#e45649;">runningOutput </span><span style="color:#a626a4;">&lt;-</span><span> process.runOutputStream(nativeProcess, blocker, contextShift).start </span><span> } </span><span style="color:#a626a4;">yield new </span><span style="color:#c18401;">JVMRunningProcess</span><span>(nativeProcess, runningOutput) </span><span> </span><span> </span><span style="color:#a0a1a7;">// ... 
bracketCase, start, Resource.make </span><span>} </span></code></pre> <p>It is also important that this <code>RedirectableOutput</code> trait is not something every process has: it is a <strong>capability</strong>, and only processes with unbound output should implement it. This is the new encoding of fixing the three channels of a process. Instead of having three type parameters with <em>phantom types</em>, we now have a combination of capability traits mixed into the <code>Process</code> trait, constraining what kind of redirections we can do. As the number of combinations is bounded and relatively small, I chose to implement them by hand, designing them in a way that minimizes the redundancy in these implementation classes. This means a total of <strong>8</strong> classes representing the combinations of bound input, output and error.</p> <p>I will demonstrate this with a single example. The <code>Process</code> constructor now returns a type with everything unbound, represented by having all the redirection capability traits:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">object</span><span style="color:#c18401;"> Process { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">apply</span><span style="color:#c18401;">(</span><span style="color:#e45649;">command</span><span style="color:#c18401;">: String, </span><span style="color:#e45649;">arguments</span><span style="color:#c18401;">: List[String] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">List.empty): ProcessImpl </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> ProcessImpl( </span><span style="color:#c18401;"> command, </span><span style="color:#c18401;"> arguments, </span><span style="color:#c18401;"> workingDirectory </span><span 
style="color:#a626a4;">= </span><span style="color:#c18401;">None, </span><span style="color:#c18401;"> environmentVariables </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">Map.empty, </span><span style="color:#c18401;"> removedEnvironmentVariables </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">Set.empty, </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> outputRedirection </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">StdOut, </span><span style="color:#c18401;"> runOutputStream </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">(</span><span style="color:#e45649;">_</span><span style="color:#c18401;">, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">) </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">IO.unit, </span><span style="color:#c18401;"> errorRedirection </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">StdOut, </span><span style="color:#c18401;"> runErrorStream </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">(</span><span style="color:#e45649;">_</span><span style="color:#c18401;">, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">) </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">IO.unit, </span><span style="color:#c18401;"> inputRedirection </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">StdIn </span><span style="color:#c18401;"> ) </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case class</span><span style="color:#c18401;"> ProcessImpl(</span><span style="color:#a626a4;">override val </span><span style="color:#e45649;">command</span><span 
style="color:#c18401;">: String, </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override val </span><span style="color:#e45649;">arguments</span><span style="color:#c18401;">: List[String], </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override val </span><span style="color:#e45649;">workingDirectory</span><span style="color:#c18401;">: Option[Path], </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override val </span><span style="color:#e45649;">environmentVariables</span><span style="color:#c18401;">: Map[String, String], </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override val </span><span style="color:#e45649;">removedEnvironmentVariables</span><span style="color:#c18401;">: Set[String], </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override val </span><span style="color:#e45649;">outputRedirection</span><span style="color:#c18401;">: OutputRedirection[F], </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override val </span><span style="color:#e45649;">runOutputStream</span><span style="color:#c18401;">: (java.io.InputStream, Blocker, ContextShift[F]) </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">F[</span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">], </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override val </span><span style="color:#e45649;">errorRedirection</span><span style="color:#c18401;">: OutputRedirection[F], </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override val </span><span style="color:#e45649;">runErrorStream</span><span style="color:#c18401;">: (java.io.InputStream, Blocker, ContextShift[F]) </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">F[</span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">], </span><span 
style="color:#c18401;"> </span><span style="color:#a626a4;">override val </span><span style="color:#e45649;">inputRedirection</span><span style="color:#c18401;">: InputRedirection[F]) </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">Process[</span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">, </span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">RedirectableOutput[ProcessImplO[</span><span style="color:#e45649;">*</span><span style="color:#c18401;">]] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">RedirectableError[ProcessImplE[</span><span style="color:#e45649;">*</span><span style="color:#c18401;">]] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">RedirectableInput[ProcessImplI]] { </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">connectOutput</span><span style="color:#c18401;">[R </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">OutputRedirection, RO](</span><span style="color:#e45649;">target</span><span style="color:#c18401;">: R)(</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">outputRedirectionType</span><span style="color:#c18401;">: OutputRedirectionType.Aux[R, RO]): ProcessImplO[RO] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> ProcessImplO( </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... 
</span><span style="color:#c18401;"> target, </span><span style="color:#c18401;"> outputRedirectionType.runner(target), </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;"> ) </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case class</span><span style="color:#c18401;"> ProcessImplO[O](</span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override val </span><span style="color:#e45649;">runOutputStream</span><span style="color:#c18401;">: (java.io.InputStream, Blocker, ContextShift[F]) </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">F[O], </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;"> ) </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">Process[O, </span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">RedirectableError[ProcessImplOE[O, </span><span style="color:#e45649;">*</span><span style="color:#c18401;">]] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">RedirectableInput[ProcessImplIO[O]] { </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre> <p>Each implementation class only has the necessary subset of type parameters <code>O</code> and <code>E</code> (<code>E</code> is the error output type), and the <code>I</code> <code>O</code> and <code>E</code> postfixes in the class names represent which channels are <em>bound</em>. 
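</p> <p>To make the naming scheme concrete, here is a minimal, self-contained sketch of the same idea. It is a toy model and not prox's real code: <code>Proc</code>, <code>ProcImpl</code>, <code>fromInput</code> and <code>toOutput</code> are hypothetical names, plain values stand in for the real stream-based redirections, and only the input and output channels are modeled:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala">object CapabilitySketch {
  // Toy stand-in for prox's Process: O is the result type of the output channel.
  trait Proc[O] { def describe: String }

  // Capability traits: each one is mixed in only while its channel is unbound.
  trait RedirectableInput[+P] { def fromInput(source: String): P }
  trait RedirectableOutput[+P[_]] { def toOutput[O](result: O): P[O] }

  // Nothing bound yet, so both capabilities are available.
  final case class ProcImpl(command: String)
      extends Proc[Unit]
      with RedirectableInput[ProcImplI]
      with RedirectableOutput[ProcImplO] {
    def describe: String = command
    def fromInput(source: String): ProcImplI = ProcImplI(command, source)
    def toOutput[O](result: O): ProcImplO[O] = ProcImplO(command, result)
  }

  // Input bound: only the output can still be redirected.
  final case class ProcImplI(command: String, source: String)
      extends Proc[Unit]
      with RedirectableOutput[ProcImplIO] {
    def describe: String = s&quot;$command &lt; $source&quot;
    def toOutput[O](result: O): ProcImplIO[O] = ProcImplIO(command, source, result)
  }

  // Output bound: only the input can still be redirected.
  final case class ProcImplO[O](command: String, result: O)
      extends Proc[O]
      with RedirectableInput[ProcImplIO[O]] {
    def describe: String = s&quot;$command &gt; out&quot;
    def fromInput(source: String): ProcImplIO[O] = ProcImplIO(command, source, result)
  }

  // Fully bound: no redirection capability is left.
  final case class ProcImplIO[O](command: String, source: String, result: O) extends Proc[O] {
    def describe: String = s&quot;$command &lt; $source &gt; out&quot;
  }

  def main(args: Array[String]): Unit = {
    // The result type is inferred step by step: ProcImpl, then ProcImplI, then ProcImplIO[Int].
    val bound: ProcImplIO[Int] = ProcImpl(&quot;wc&quot;).fromInput(&quot;in.txt&quot;).toOutput(0)
    println(bound.describe)
    // bound.fromInput(&quot;other.txt&quot;) // does not compile: the input is already bound
  }
}
</code></pre> <p>Binding a channel returns a class that no longer mixes in the matching capability trait, so trying to bind the same channel twice is rejected by the compiler instead of failing at runtime.</p> <p>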
Each redirection leads to a different implementation class with fewer and fewer redirection <em>capabilities</em>. <code>ProcessImplIOE</code> is the fully bound process.</p> <p>This makes all the redirection operators completely type inferable and very pleasant to use for building up concrete process definitions. And we don't lose the ability to create generic functions either. We can do it by requiring redirection capabilities:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">withInput</span><span>[</span><span style="color:#c18401;">O</span><span>, </span><span style="color:#c18401;">E</span><span>, </span><span style="color:#c18401;">P </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">Process</span><span>[</span><span style="color:#c18401;">O</span><span>, </span><span style="color:#c18401;">E</span><span>]](</span><span style="color:#e45649;">s</span><span>: </span><span style="color:#c18401;">String</span><span>)(</span><span style="color:#e45649;">process</span><span>: </span><span style="color:#c18401;">Process</span><span>[</span><span style="color:#c18401;">O</span><span>, </span><span style="color:#c18401;">E</span><span>] </span><span style="color:#a626a4;">with </span><span style="color:#c18401;">RedirectableInput</span><span>[</span><span style="color:#c18401;">P</span><span>]): </span><span style="color:#c18401;">P </span><span style="color:#a626a4;">= </span><span>{ </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">input </span><span style="color:#a626a4;">= </span><span>Stream(s).through(text.utf8Encode) </span><span> process &lt; input </span><span>} </span></code></pre> <p>Here we know we want to have a <code>Process</code> with the 
<code>RedirectableInput</code> capability. We also know that by binding the input we get something without that trait, so we know the result is a process <code>P</code> but know nothing else about its further capabilities. This is where this solution gets a bit inconvenient if we want to chain these wrapper functions. To help with this, the library contains <em>type aliases</em> for the whole redirection capability chain that can be used in these functions. For example:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a0a1a7;">/** Process with unbound input, output and error streams */ </span><span style="color:#a626a4;">type </span><span>UnboundProcess </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">Process</span><span>[</span><span style="color:#a626a4;">Unit</span><span>, </span><span style="color:#a626a4;">Unit</span><span>] </span><span> with RedirectableInput[</span><span style="color:#c18401;">UnboundOEProcess</span><span>] </span><span> with RedirectableOutput[</span><span style="color:#c18401;">UnboundIEProcess</span><span>[</span><span style="color:#e45649;">*</span><span>]] </span><span> with RedirectableError[</span><span style="color:#c18401;">UnboundIOProcess</span><span>[</span><span style="color:#e45649;">*</span><span>]] </span></code></pre> <h2 id="process-piping">Process piping</h2> <p>The other major feature besides redirection that <em>prox</em> had is <strong>piping processes together</strong>, meaning the first process' output gets redirected to the second process' input. 
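</p> <p>As a point of reference for what piping means at the operating system level, the JDK can already connect two processes this way. The following sketch is not how prox does it. It assumes Java 9 or later (for <code>ProcessBuilder.startPipeline</code>) and a Unix-like system with <code>echo</code> and <code>wc</code> on the path; the object name is made up for the example:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala">import java.util.Arrays

object PipeConceptSketch {
  def main(args: Array[String]): Unit = {
    // First process: its standard output stays a pipe (the default).
    val producer = new ProcessBuilder(&quot;echo&quot;, &quot;hello world&quot;)
    // Last process: its standard input stays a pipe, its output goes to our console.
    val consumer = new ProcessBuilder(&quot;wc&quot;, &quot;-w&quot;)
      .redirectOutput(ProcessBuilder.Redirect.INHERIT)

    // The JDK connects producer's output to consumer's input and starts both.
    val running = ProcessBuilder.startPipeline(Arrays.asList(producer, consumer))

    // Wait for every member of the pipeline to finish.
    running.forEach { p =&gt; p.waitFor(); () }
  }
}
</code></pre> <p>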
Now that we have redesigned processes and redirection capabilities, we can try to implement this on top of them.</p> <p>The idea is that when we construct a <em>process group</em> from a list of <code>Process</code> instances with the necessary redirection capabilities, this construction could set up the redirection and store the modified processes instead, and then run them together. And it can reuse the <code>RedirectableOutput</code> and <code>RedirectableInput</code> capabilities to bind the first/last process!</p> <p>Let's again start by defining what we need for the <em>process group</em>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> ProcessGroup</span><span>[</span><span style="color:#c18401;">O</span><span>, </span><span style="color:#c18401;">E</span><span>] </span><span style="color:#a626a4;">extends </span><span>ProcessLike </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">firstProcess</span><span style="color:#c18401;">: Process[Stream[IO, </span><span style="color:#a626a4;">Byte</span><span style="color:#c18401;">], E] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">innerProcesses</span><span style="color:#c18401;">: List[Process.UnboundIProcess[Stream[IO, </span><span style="color:#a626a4;">Byte</span><span style="color:#c18401;">], E]] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">lastProcess</span><span style="color:#c18401;">: Process.UnboundIProcess[O, E] </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">originalProcesses</span><span 
style="color:#c18401;">: List[Process[</span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">, </span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">]] </span><span style="color:#c18401;">} </span></code></pre> <p><code>ProcessLike</code> is a common base trait for <code>Process</code> and <code>ProcessGroup</code>. By introducing it, we can change the <code>RedirectableOutput</code> trait's self type bounds so it works for both processes and process groups.</p> <p>A valid process group always has at least <strong>2</strong> processes and they get pre-configured during the construction of the group so when they get started, their channels can be joined. This means the group members can be split into three groups:</p> <ul> <li>The <strong>first process</strong> has its output redirected to a stream, but <em>running</em> the stream just returns the stream itself; this way it can be connected to the next process's input</li> <li>The <strong>inner processes</strong> all have their output redirected in the same way, and it is also a <em>requirement</em> that these must have their <em>input channel</em> unbound. 
This is needed for the operation described above, when we plug the previous process' output into the input</li> <li>The <strong>last process</strong> can have its output freely redirected by the user, but its <em>input</em> must be unbound so the previous process can be plugged in</li> </ul> <p>We also store the <em>original</em> process values for reasons explained later.</p> <p>So as we can see the piping has two stages:</p> <ol> <li>First we prepare the processes by setting up their output to return an un-executed stream</li> <li>Then we need a process-group-specific start function in the <code>ProcessRunner</code> that plugs everything together</li> </ol> <p>The first step is performed by the <em>pipe operator</em> (<code>|</code>), which is defined on <code>Process</code> via an extension method to construct a group of two processes, and on <code>ProcessGroupImpl</code> to add more. For simplicity the piping operator is currently not defined on the bound process group types. So the group has to be constructed first, and the redirection set up afterwards.</p> <p>Let's see the one that adds one more process to a group:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">pipeInto</span><span>(</span><span style="color:#e45649;">other</span><span>: </span><span style="color:#c18401;">Process</span><span>.</span><span style="color:#c18401;">UnboundProcess</span><span>, </span><span> </span><span style="color:#e45649;">channel</span><span>: </span><span style="color:#c18401;">Pipe</span><span>[</span><span style="color:#c18401;">IO</span><span>, </span><span style="color:#a626a4;">Byte</span><span>, </span><span style="color:#a626a4;">Byte</span><span>]): </span><span style="color:#c18401;">ProcessGroupImpl </span><span style="color:#a626a4;">= </span><span>{ </span><span> </span><span style="color:#a626a4;">val 
</span><span style="color:#e45649;">pl1 </span><span style="color:#a626a4;">=</span><span> lastProcess.connectOutput(OutputStream(channel, (</span><span style="color:#e45649;">stream</span><span>: </span><span style="color:#c18401;">Stream</span><span>[</span><span style="color:#c18401;">IO</span><span>, </span><span style="color:#a626a4;">Byte</span><span>]) </span><span style="color:#a626a4;">=&gt; </span><span>IO.pure(stream))) </span><span> </span><span> copy( </span><span> innerProcesses </span><span style="color:#a626a4;">=</span><span> pl1 :: innerProcesses, </span><span> lastProcess </span><span style="color:#a626a4;">=</span><span> other, </span><span> originalProcesses </span><span style="color:#a626a4;">=</span><span> other :: originalProcesses </span><span> ) </span><span>} </span><span> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">|</span><span>(</span><span style="color:#e45649;">other</span><span>: </span><span style="color:#c18401;">Process</span><span>.</span><span style="color:#c18401;">UnboundProcess</span><span>): </span><span style="color:#c18401;">ProcessGroupImpl </span><span style="color:#a626a4;">=</span><span> pipeInto(other, identity) </span></code></pre> <p>Other than moving processes around in the <code>innerProcesses</code> and <code>lastProcess</code>, we also set up the <strong>previous last process</strong>'s output in the way I described:</p> <ul> <li>It gets redirected to a pipe which is by default <code>identity</code></li> <li>And its <em>runner</em>, instead of actually running the stream, just returns the stream definition</li> </ul> <p>This way we can write a process-group-specific start function in the <em>process runner</em>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">startProcessGroup</span><span>[</span><span 
style="color:#c18401;">O</span><span>, </span><span style="color:#c18401;">E</span><span>](</span><span style="color:#e45649;">processGroup</span><span>: </span><span style="color:#c18401;">ProcessGroup</span><span>[</span><span style="color:#c18401;">O</span><span>, </span><span style="color:#c18401;">E</span><span>], </span><span style="color:#e45649;">blocker</span><span>: </span><span style="color:#c18401;">Blocker</span><span>): </span><span style="color:#c18401;">IO</span><span>[</span><span style="color:#c18401;">RunningProcessGroup</span><span>[</span><span style="color:#c18401;">O</span><span>, </span><span style="color:#c18401;">E</span><span>]] </span><span style="color:#a626a4;">= </span><span> </span><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#e45649;">first </span><span style="color:#a626a4;">&lt;-</span><span> startProcess(processGroup.firstProcess, blocker) </span><span> </span><span style="color:#e45649;">firstOutput </span><span style="color:#a626a4;">&lt;-</span><span> first.runningOutput.join </span><span> </span><span style="color:#e45649;">innerResult </span><span style="color:#a626a4;">&lt;- if </span><span>(processGroup.innerProcesses.isEmpty) { </span><span> IO.pure((List.empty, firstOutput)) </span><span> } </span><span style="color:#a626a4;">else </span><span>{ </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">inner </span><span style="color:#a626a4;">=</span><span> processGroup.innerProcesses.reverse </span><span> connectAndStartProcesses(inner.head, firstOutput, inner.tail, blocker, List.empty) </span><span> } </span><span> (</span><span style="color:#e45649;">inner</span><span>, </span><span style="color:#e45649;">lastInput</span><span>) </span><span style="color:#a626a4;">=</span><span> innerResult </span><span> </span><span style="color:#e45649;">last </span><span style="color:#a626a4;">&lt;-</span><span> 
startProcess(processGroup.lastProcess.connectInput(InputStream(lastInput, flushChunks </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">false</span><span>)), blocker) </span><span> </span><span style="color:#e45649;">runningProcesses </span><span style="color:#a626a4;">=</span><span> processGroup.originalProcesses.reverse.zip((first :: inner) :+ last).toMap </span><span> } </span><span style="color:#a626a4;">yield new </span><span style="color:#c18401;">JVMRunningProcessGroup</span><span>[</span><span style="color:#c18401;">O</span><span>, </span><span style="color:#c18401;">E</span><span>](runningProcesses, last.runningOutput) </span></code></pre> <p>where <code>connectAndStartProcesses</code> is a recursive function that does the same as we do with the first process:</p> <ul> <li>start it with the <code>startProcess</code> function (this is the same function we discussed in the first section, that starts <code>Process</code> values)</li> <li>then "join" the output fiber; this completes immediately as it is not really running the output stream, just returning it</li> <li>we connect the <em>input</em> of the next process to the previous process' output</li> </ul> <p>One thing we did not talk about yet is getting the <strong>results</strong> of a process group. This is where the old implementation again used some type level techniques and returned a <code>RunningProcess</code> value with specific per-process output and error types for each member of the group, as a <code>HList</code> (or converted to a <em>tuple</em>).</p> <p>By making the library a bit more dynamic we can drop this part too. What is it that we really want to do with a running process group?</p> <ul> <li><strong>Terminating</strong> the whole group together. 
Terminating just one part is something we do not support currently, although it would not be hard to add.</li> <li><strong>Waiting</strong> for all processes to stop</li> <li>Examining the <strong>exit code</strong> for each member of the group</li> <li>Redirecting the <strong>error</strong> channel of each process to something and getting them in the result</li> <li>Redirecting the <strong>input</strong> of the group's first process</li> <li>Redirecting the <strong>output</strong> of the group's last process, and getting it in the result</li> </ul> <p>The most difficult part, and the primary reason for the <code>HList</code> in the old version, is the error redirection, as it can be done <em>per process</em>. With some restrictions we can make a reasonable implementation though.</p> <p>First, we require that the processes participating in forming a <em>process group</em> do not have their <em>error channel</em> bound yet. Then we create a <code>RedirectableErrors</code> capability that is very similar to the existing <code>RedirectableError</code> trait, but provides an advanced interface through its <code>customizedPerProcess</code> field:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> RedirectableErrors</span><span>[</span><span style="color:#a626a4;">+</span><span style="color:#c18401;">P</span><span>[</span><span style="color:#e45649;">_</span><span>] </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ProcessGroup</span><span>[</span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>]] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">lazy val </span><span style="color:#e45649;">customizedPerProcess</span><span style="color:#c18401;">: 
RedirectableErrors.CustomizedPerProcess[P] </span><span style="color:#a626a4;">= </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;">} </span></code></pre> <p>where the <code>CustomizedPerProcess</code> interface contains the same redirection functions, but each accepts a function of a <code>Process</code> as parameter.</p> <p>For example:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">errorsToSink</span><span>(</span><span style="color:#e45649;">sink</span><span>: </span><span style="color:#c18401;">Pipe</span><span>[</span><span style="color:#c18401;">IO</span><span>, </span><span style="color:#a626a4;">Byte</span><span>, </span><span style="color:#a626a4;">Unit</span><span>]): </span><span style="color:#c18401;">P</span><span>[</span><span style="color:#a626a4;">Unit</span><span>] </span><span style="color:#a0a1a7;">// vs </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">errorsToSink</span><span>(</span><span style="color:#e45649;">sinkFn</span><span>: </span><span style="color:#c18401;">Process</span><span>[</span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>] </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">Pipe</span><span>[</span><span style="color:#c18401;">IO</span><span>, </span><span style="color:#a626a4;">Byte</span><span>, </span><span style="color:#a626a4;">Unit</span><span>]): </span><span style="color:#c18401;">P</span><span>[</span><span style="color:#a626a4;">Unit</span><span>] </span><span style="color:#a626a4;">= </span></code></pre> <p>The limitation is that all processes need to have the same <strong>error result type</strong>, but we still get a lot of freedom via the advanced interface: we can tag the output with the process and split their 
processing further in the stream.</p> <p>With this choice, we can finally define the result type of the process group too:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> ProcessGroupResult</span><span>[</span><span style="color:#a626a4;">+</span><span style="color:#c18401;">O</span><span>, </span><span style="color:#a626a4;">+</span><span style="color:#c18401;">E</span><span>] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">exitCodes</span><span style="color:#c18401;">: Map[Process[</span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">, </span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">], ExitCode] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">output</span><span style="color:#c18401;">: O </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">errors</span><span style="color:#c18401;">: Map[Process[</span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">, </span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">], E] </span><span style="color:#c18401;">} </span></code></pre> <p>The error results and the exit codes are in a map indexed by the <strong>original process</strong>. This is the value passed to the piping operator, the one that the user constructing the group has. 
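</p> <p>Why the key has to be the original value can be seen in a small sketch. It assumes a toy <code>Proc</code> case class (a hypothetical stand-in for prox's immutable process definitions), where setting up a redirection produces a modified copy:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala">object OriginalProcessKeys {
  // Toy stand-in for an immutable process definition.
  final case class Proc(command: String, outputBound: Boolean = false)

  def main(args: Array[String]): Unit = {
    val original = Proc(&quot;grep&quot;)
    // Piping stores a reconfigured copy, which is no longer equal to the original.
    val modified = original.copy(outputBound = true)

    // Results are keyed by the value the user still holds a reference to.
    val exitCodes: Map[Proc, Int] = Map(original -&gt; 0)

    println(exitCodes.get(original)) // prints Some(0)
    println(exitCodes.get(modified)) // prints None
  }
}
</code></pre> <p>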
That's why in the <code>ProcessGroup</code> trait we also had to store the original process values.</p> <p>As the output of each inner process is piped to the next process, we only have to care about the last process' output.</p> <h2 id="conclusion">Conclusion</h2> <p>With a full redesign and some compromises, we get a library with much more readable and easier-to-maintain code, and an API that is discoverable by the IDE and does not produce any weird error messages on misuse.</p> <p>Note that in all the code snippets above I removed the <em>effect abstraction</em> and just used <code>IO</code> to make them simpler. The real code of course can be used with any IO library such as ZIO, just like the previous versions.</p> prox part 3 - effect abstraction and ZIO 2019-08-13T00:00:00+00:00 2019-08-13T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/prox-3-zio/ <h2 id="blog-post-series">Blog post series</h2> <ul> <li><a href="https://blog.vigoo.dev/posts/prox-1-types/">Part 1 - type level programming</a></li> <li><a href="https://blog.vigoo.dev/posts/prox-2-io-akkastreams/">Part 2 - akka streams with cats effect</a></li> <li><a href="https://blog.vigoo.dev/posts/prox-3-zio/">Part 3 - effect abstraction and ZIO</a></li> <li><a href="https://blog.vigoo.dev/posts/prox-4-simplify/">Part 4 - simplified redesign</a></li> </ul> <h2 id="intro">Intro</h2> <p>The <a href="https://blog.vigoo.dev/posts/prox-3-zio/2019-02-10-prox-1-types.html">first post</a> introduced the <em>prox library</em> and demonstrated the advanced type level programming techniques it uses. 
Then in the <a href="https://blog.vigoo.dev/posts/prox-3-zio/2019-03-07-prox-2-io-akkastreams.html">second part</a> of this series we experimented with replacing the <em>streaming library</em> from <a href="https://fs2.io/">fs2</a> to <a href="https://doc.akka.io/docs/akka/2.5/stream/">Akka Streams</a>.</p> <p>In both cases the library used <a href="https://typelevel.org/cats-effect/">cats-effect</a> for describing side effects. But it did not really take advantage of <em>cats-effect</em>'s effect abstraction: it explicitly defined everything to be a computation in <a href="https://typelevel.org/cats-effect/datatypes/io.html"><code>IO</code></a>, cats-effect's implementation of describing effectful computations.</p> <p>But we can do better! By not relying on <code>IO</code> but the various type classes the <em>cats-effect</em> library provides we can make <em>prox</em> work with any kind of effect library out of the box. One such example is <a href="https://github.com/zio/zio">ZIO</a>.</p> <h2 id="effect-abstraction">Effect abstraction</h2> <p>Let's see an example of how <code>IO</code> used to be used in the library! 
The following function is in the <code>Start</code> type class, and it starts a process or piped process group:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">apply</span><span>(</span><span style="color:#e45649;">process</span><span>: </span><span style="color:#c18401;">PN</span><span>, </span><span style="color:#e45649;">dontStartOutput</span><span>: </span><span style="color:#a626a4;">Boolean = </span><span style="color:#c18401;">false</span><span>, </span><span style="color:#e45649;">blocker</span><span>: </span><span style="color:#c18401;">Blocker</span><span>) </span><span> (</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">contextShift</span><span>: </span><span style="color:#c18401;">ContextShift</span><span>[</span><span style="color:#c18401;">IO</span><span>]): </span><span style="color:#c18401;">IO</span><span>[</span><span style="color:#c18401;">RunningProcesses</span><span>] </span></code></pre> <p>We can observe two things here:</p> <ul> <li>The function returns an effectful computation in <code>IO</code></li> <li>An implicit <em>context shifter</em> is needed because the implementations call streaming functions that require it.</li> </ul> <p>To make it independent of the effect library implementation we have to get rid of <code>IO</code> and use a generic type instead, let's call it <code>F</code>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">apply</span><span>(</span><span style="color:#e45649;">process</span><span>: </span><span style="color:#c18401;">PN</span><span>, </span><span> </span><span style="color:#e45649;">dontStartOutput</span><span>: </span><span 
style="color:#a626a4;">Boolean = </span><span style="color:#c18401;">false</span><span>, </span><span> </span><span style="color:#e45649;">blocker</span><span>: </span><span style="color:#c18401;">Blocker</span><span>) </span><span> (</span><span style="color:#a626a4;">implicit </span><span> </span><span style="color:#e45649;">concurrent</span><span>: </span><span style="color:#c18401;">Concurrent</span><span>[</span><span style="color:#c18401;">F</span><span>], </span><span> </span><span style="color:#e45649;">contextShift</span><span>: </span><span style="color:#c18401;">ContextShift</span><span>[</span><span style="color:#c18401;">F</span><span>]): </span><span style="color:#c18401;">F</span><span>[</span><span style="color:#c18401;">RunningProcesses</span><span>] </span></code></pre> <p>Besides using <code>F</code> instead of <code>IO</code> everywhere, we also have a new requirement: our context type (<code>F</code>) has to have an implementation of the <a href="https://typelevel.org/cats-effect/typeclasses/concurrent.html"><code>Concurrent</code></a> type class.</p> <p><em>Cats-effect</em> defines a hierarchy of type classes to deal with effectful computations. At the time of writing it looks like this: <img src="https://typelevel.org/cats-effect/img/cats-effect-typeclasses.svg"/></p> <p>Read the <a href="https://typelevel.org/cats-effect/typeclasses/">official documentation</a> for more information.</p> <p>Prox is based on the <code>ProcessNode</code> type which has two implementations, a single <code>Process</code> or a set of processes piped together to a <code>PipedProcess</code>. 
Because these types store their I/O redirection within themselves, they also have to be enriched with a context type parameter.</p> <p>For example <code>Process</code> will look like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">class</span><span style="color:#c18401;"> Process</span><span>[</span><span style="color:#c18401;">F</span><span>[</span><span style="color:#e45649;">_</span><span>], </span><span style="color:#c18401;">Out</span><span>, </span><span style="color:#c18401;">Err</span><span>, </span><span style="color:#c18401;">OutResult</span><span>, </span><span style="color:#c18401;">ErrResult</span><span>, </span><span style="color:#c18401;">IRS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>, </span><span style="color:#c18401;">ORS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>, </span><span style="color:#c18401;">ERS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>] </span><span>(</span><span style="color:#a626a4;">val </span><span style="color:#e45649;">command</span><span>: </span><span style="color:#c18401;">String</span><span>, </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">arguments</span><span>: </span><span style="color:#c18401;">List</span><span>[</span><span style="color:#c18401;">String</span><span>], </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">workingDirectory</span><span>: </span><span style="color:#c18401;">Option</span><span>[</span><span style="color:#c18401;">Path</span><span>], </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">inputSource</span><span>: </span><span 
style="color:#c18401;">ProcessInputSource</span><span>[</span><span style="color:#c18401;">F</span><span>], </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">outputTarget</span><span>: </span><span style="color:#c18401;">ProcessOutputTarget</span><span>[</span><span style="color:#c18401;">F</span><span>, </span><span style="color:#c18401;">Out</span><span>, </span><span style="color:#c18401;">OutResult</span><span>], </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">errorTarget</span><span>: </span><span style="color:#c18401;">ProcessErrorTarget</span><span>[</span><span style="color:#c18401;">F</span><span>, </span><span style="color:#c18401;">Err</span><span>, </span><span style="color:#c18401;">ErrResult</span><span>], </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">environmentVariables</span><span>: </span><span style="color:#c18401;">Map</span><span>[</span><span style="color:#c18401;">String</span><span>, </span><span style="color:#c18401;">String</span><span>], </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">removedEnvironmentVariables</span><span>: </span><span style="color:#c18401;">Set</span><span>[</span><span style="color:#c18401;">String</span><span>]) </span><span> </span><span style="color:#a626a4;">extends </span><span>ProcessNode[</span><span style="color:#c18401;">Out</span><span>, </span><span style="color:#c18401;">Err</span><span>, </span><span style="color:#c18401;">IRS</span><span>, </span><span style="color:#c18401;">ORS</span><span>, </span><span style="color:#c18401;">ERS</span><span>] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... 
</span><span style="color:#c18401;">} </span></code></pre> <p>The context parameter (<code>F</code>) is needed because the <em>input source</em> and <em>output target</em> all represent effectful code such as writing to the standard output, reading from a file, or passing data through concurrent streams.</p> <p>Let's see some examples of how the abstract types of <em>cats-effect</em> can be used to describe the computation, when we cannot rely on <code>IO</code> itself!</p> <p>The most basic operation is to <em>delay the execution</em> of some code that does not use the effect abstractions. This is how we wrap the Java process API, for example.</p> <p>While with the original implementation of <em>prox</em> it was done by using the <code>IO</code> constructor:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>IO { </span><span> systemProcess.isAlive </span><span>} </span></code></pre> <p>with an arbitrary <code>F</code> we only need to require that it has an implementation of the <code>Sync</code> type class:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">private class</span><span style="color:#c18401;"> WrappedProcess</span><span>[</span><span style="color:#c18401;">F</span><span>[</span><span style="color:#e45649;">_</span><span>] </span><span style="color:#a626a4;">: </span><span style="color:#c18401;">Sync</span><span>, </span><span style="color:#a0a1a7;">// ... 
</span></code></pre> <p>and then use the <code>delay</code> function:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>Sync[</span><span style="color:#c18401;">F</span><span>].delay { </span><span> systemProcess.isAlive </span><span>} </span></code></pre> <p>Similarly, the <code>Concurrent</code> type class can be used to start a concurrent computation on a <em>fiber</em>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>Concurrent[</span><span style="color:#c18401;">F</span><span>].start(stream.compile.toVector) </span></code></pre> <h2 id="type-level">Type level</h2> <p>This would be it - except that we need one more thing because of the type level techniques described in the <a href="https://blog.vigoo.dev/posts/prox-3-zio/2019-02-10-prox-1-types.html">first post</a>.</p> <p>To understand the problem, let's see how the <em>output redirection</em> operator works. 
It is implemented as an <em>extension method</em> on the <code>ProcessNode</code> type:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">implicit class</span><span style="color:#c18401;"> ProcessNodeOutputRedirect</span><span>[</span><span style="color:#c18401;">PN </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ProcessNode</span><span>[</span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#c18401;">NotRedirected</span><span>, </span><span style="color:#e45649;">_</span><span>]](</span><span style="color:#e45649;">processNode</span><span>: </span><span style="color:#c18401;">PN</span><span>) </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">&gt;</span><span style="color:#c18401;">[F[</span><span style="color:#e45649;">_</span><span style="color:#c18401;">], To, NewOut, NewOutResult, Result </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ProcessNode[</span><span style="color:#e45649;">_</span><span style="color:#c18401;">, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">, Redirected, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">]] </span><span style="color:#c18401;"> (</span><span style="color:#e45649;">to</span><span style="color:#c18401;">: To) </span><span style="color:#c18401;"> (</span><span style="color:#a626a4;">implicit </span><span style="color:#c18401;"> </span><span style="color:#e45649;">target</span><span style="color:#c18401;">: CanBeProcessOutputTarget.Aux[F, To, NewOut, NewOutResult], </span><span 
style="color:#c18401;"> </span><span style="color:#e45649;">redirectOutput</span><span style="color:#c18401;">: RedirectOutput.Aux[F, PN, To, NewOut, NewOutResult, Result]): Result </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> redirectOutput(processNode, to) </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre> <p>This extension method basically just finds the appropriate type class implementations and then calls <code>redirectOutput</code> to alter the process node and register the output redirection:</p> <ul> <li>we are redirecting the output of <code>processNode</code> (of type <code>PN</code>) to <code>to</code> (of type <code>To</code>)</li> <li><code>target</code> is the <code>CanBeProcessOutputTarget</code> implementation, containing the actual code to set up the redirection</li> <li><code>redirectOutput</code> is the process node type specific implementation of the <code>RedirectOutput</code> interface, knowing how to set up the redirection of a <code>Process</code> or a <code>PipedProcess</code></li> </ul> <p>This code would compile, but we won't be able to use it. Take the following code, for example:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>running &lt;- (Process[</span><span style="color:#c18401;">IO</span><span>](</span><span style="color:#50a14f;">&quot;echo&quot;</span><span>, List(</span><span style="color:#50a14f;">&quot;Hello world!&quot;</span><span>)) &gt; tempFile.toPath).start(blocker) </span></code></pre> <p>It fails because the compiler cannot resolve the implicits correctly. 
The exact error of course depends heavily on the context, but one example for the above line could be:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>[error] prox/src/test/scala/io/github/vigoo/prox/ProcessSpecs.scala:95:63: diverging implicit expansion for type cats.effect.Concurrent[F] </span><span>[error] starting with method catsIorTConcurrent in object Concurrent </span><span>[error] running &lt;- (Process[IO](&quot;echo&quot;, List(&quot;Hello world!&quot;)) &gt; tempFile.toPath).start(blocker) </span></code></pre> <p>This does not really help in understanding the real problem, though. As we have seen earlier, in this library the <code>Process</code> types have to be parameterized with the context as well, because they store their redirection logic within themselves. That's why we specify it explicitly in the example to be <code>IO</code>: <code>Process[IO](...)</code>. What we would expect is that by tying <code>F[_]</code> to <code>IO</code> at the beginning, all the subsequent operations such as the <code>&gt;</code> redirection would respect this and the context gets inferred to be <code>IO</code> everywhere in the expression.</p> <p>The compiler cannot do this. 
If we check the definition of <code>&gt;</code> again, we can see that there is no connection expressed between the type <code>PN</code> (the actual process node type) and <code>F</code>, which is used as a type parameter for the implicit parameters.</p> <p>The fix is to link the two, and we have a technique exactly for this that I described earlier: the <em>aux pattern</em>.</p> <p>First let's write some code that, at compile time, can "extract" the context type from a process node type:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> ContextOf</span><span>[</span><span style="color:#c18401;">PN</span><span>] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">type </span><span style="color:#c18401;">Context[</span><span style="color:#e45649;">_</span><span style="color:#c18401;">] </span><span style="color:#c18401;">} </span><span> </span><span style="color:#a626a4;">object</span><span style="color:#c18401;"> ContextOf { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">type </span><span style="color:#c18401;">Aux[PN, F[</span><span style="color:#e45649;">_</span><span style="color:#c18401;">]] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">ContextOf[PN] { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">type </span><span style="color:#c18401;">Context[</span><span style="color:#e45649;">_</span><span style="color:#c18401;">] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">F[</span><span style="color:#e45649;">_</span><span style="color:#c18401;">] </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span 
style="color:#0184bc;">apply</span><span style="color:#c18401;">[PN </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ProcessNode[</span><span style="color:#e45649;">_</span><span style="color:#c18401;">, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">], F[</span><span style="color:#e45649;">_</span><span style="color:#c18401;">]](</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">contextOf</span><span style="color:#c18401;">: ContextOf.Aux[PN, F]): Aux[PN, F] </span><span style="color:#a626a4;">=</span><span style="color:#c18401;"> contextOf </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">implicit def </span><span style="color:#0184bc;">contextOfProcess</span><span style="color:#c18401;">[F[</span><span style="color:#e45649;">_</span><span style="color:#c18401;">], Out, Err, OutResult, ErrResult, IRS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState, ORS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState, ERS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState]: </span><span style="color:#c18401;"> Aux[Process[F, Out, Err, OutResult, ErrResult, IRS, ORS, ERS], F] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">new </span><span style="color:#c18401;">ContextOf[Process[F, Out, Err, OutResult, ErrResult, IRS, ORS, ERS]] { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override type </span><span style="color:#c18401;">Context[</span><span style="color:#e45649;">_</span><span 
style="color:#c18401;">] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">F[</span><span style="color:#e45649;">_</span><span style="color:#c18401;">] </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">implicit def </span><span style="color:#0184bc;">contextOfPipedProcess</span><span style="color:#c18401;">[ </span><span style="color:#c18401;"> F[</span><span style="color:#e45649;">_</span><span style="color:#c18401;">], </span><span style="color:#c18401;"> Out, Err, </span><span style="color:#c18401;"> PN1 </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ProcessNode[</span><span style="color:#e45649;">_</span><span style="color:#c18401;">, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">], </span><span style="color:#c18401;"> PN2 </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ProcessNode[</span><span style="color:#e45649;">_</span><span style="color:#c18401;">, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">], </span><span style="color:#c18401;"> IRS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState, ORS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState, ERS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState]: </span><span style="color:#c18401;"> 
Aux[PipedProcess[F, Out, Err, </span><span style="color:#a626a4;">Byte</span><span style="color:#c18401;">, PN1, PN2, IRS, ORS, ERS], F] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">new </span><span style="color:#c18401;">ContextOf[PipedProcess[F, Out, Err, </span><span style="color:#a626a4;">Byte</span><span style="color:#c18401;">, PN1, PN2, IRS, ORS, ERS]] { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override type </span><span style="color:#c18401;">Context[</span><span style="color:#e45649;">_</span><span style="color:#c18401;">] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">F[</span><span style="color:#e45649;">_</span><span style="color:#c18401;">] </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre> <p>Both <code>Process</code> and <code>PipedProcess</code> have the context as their first type parameter. By creating the <code>ContextOf</code> type class and the corresponding <code>Aux</code> type we can extend the <code>&gt;</code> operator to <em>require</em> such a connection (a way to get an <code>F[_]</code> context out of a type <code>PN</code>) at compile time. With the aux pattern the compiler unifies the type parameters, and the context type gets <em>chained</em> through all the subsequent calls as we desired:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">&gt;</span><span>[</span><span style="color:#c18401;">F</span><span>[</span><span style="color:#e45649;">_</span><span>], </span><span style="color:#c18401;">To</span><span>, </span><span style="color:#c18401;">NewOut</span><span>, </span><span style="color:#c18401;">NewOutResult</span><span>, </span><span style="color:#c18401;">Result </span><span 
style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ProcessNode</span><span>[</span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#c18401;">Redirected</span><span>, </span><span style="color:#e45649;">_</span><span>]] </span><span> (</span><span style="color:#e45649;">to</span><span>: </span><span style="color:#c18401;">To</span><span>) </span><span> (</span><span style="color:#a626a4;">implicit </span><span> </span><span style="color:#e45649;">contextOf</span><span>: </span><span style="color:#c18401;">ContextOf</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">PN</span><span>, </span><span style="color:#c18401;">F</span><span>], </span><span> </span><span style="color:#e45649;">target</span><span>: </span><span style="color:#c18401;">CanBeProcessOutputTarget</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">F</span><span>, </span><span style="color:#c18401;">To</span><span>, </span><span style="color:#c18401;">NewOut</span><span>, </span><span style="color:#c18401;">NewOutResult</span><span>], </span><span> </span><span style="color:#e45649;">redirectOutput</span><span>: </span><span style="color:#c18401;">RedirectOutput</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">F</span><span>, </span><span style="color:#c18401;">PN</span><span>, </span><span style="color:#c18401;">To</span><span>, </span><span style="color:#c18401;">NewOut</span><span>, </span><span style="color:#c18401;">NewOutResult</span><span>, </span><span style="color:#c18401;">Result</span><span>]): </span><span style="color:#c18401;">Result </span><span style="color:#a626a4;">= </span><span>{ </span><span> redirectOutput(processNode, to) </span><span> } </span></code></pre> <h2 id="zio">ZIO</h2> <p>Now 
that everything is in place, we can try out whether <em>prox</em> really works with other effect libraries such as <a href="https://github.com/zio/zio">ZIO</a>.</p> <p><em>ZIO</em> has a compatibility layer for <em>cats-effect</em>: an implementation of the type classes cats-effect provides, published as a separate library called <a href="https://github.com/zio/interop-cats">zio-interop-cats</a>.</p> <p>For running processes with <em>prox</em> we can use the following variants of the <code>ZIO</code> type:</p> <ul> <li><code>RIO[-R, +A]</code> which is an alias for <code>ZIO[R, scala.Throwable, A]</code></li> <li>or <code>Task[A]</code> which is an alias for <code>ZIO[scala.Any, scala.Throwable, A]</code> if we don't take advantage of the environment parameter <code>R</code>.</li> </ul> <p>In fact, assuming the correct context is in scope, this only means switching <code>IO</code> to <code>RIO</code> or <code>Task</code> in the type parameter for <code>Process</code>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">import</span><span> zio.interop.catz.</span><span style="color:#e45649;">_ </span><span> </span><span>Blocker[</span><span style="color:#c18401;">RIO</span><span>[</span><span style="color:#c18401;">Console</span><span>, </span><span style="color:#e45649;">?</span><span>]].use { </span><span style="color:#e45649;">blocker </span><span style="color:#a626a4;">=&gt; </span><span> </span><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#a0a1a7;">// ... 
</span><span> </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">&lt;-</span><span> console.putStrLn(</span><span style="color:#50a14f;">&quot;Starting external process...&quot;</span><span>) </span><span> </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">&lt;- </span><span>(Process[</span><span style="color:#c18401;">Task</span><span>](</span><span style="color:#50a14f;">&quot;echo&quot;</span><span>, List(</span><span style="color:#50a14f;">&quot;Hello world!&quot;</span><span>)) &gt; tempFile.toPath).start(blocker) </span><span> </span><span style="color:#a0a1a7;">// ... </span><span> } </span><span style="color:#a626a4;">yield </span><span style="color:#c18401;">() </span><span>} </span></code></pre> <p>A nice way to have everything set up for this is to use the interop library's <a href="https://zio.dev/docs/interop/interop_catseffect#cats-app"><code>CatsApp</code></a> trait as an entrypoint for the application.</p> <p>This brings all the necessary implicits in scope and requires you to implement the following function as the entry point of the application:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">run</span><span>(</span><span style="color:#e45649;">args</span><span>: </span><span style="color:#c18401;">List</span><span>[</span><span style="color:#c18401;">String</span><span>]): </span><span style="color:#c18401;">ZIO</span><span>[</span><span style="color:#c18401;">Environment</span><span>, </span><span style="color:#a626a4;">Nothing</span><span>, </span><span style="color:#a626a4;">Int</span><span>] </span></code></pre> prox part 2 - akka streams with cats effect 2019-03-07T00:00:00+00:00 2019-03-07T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/prox-2-io-akkastreams/ <h2 id="blog-post-series">Blog post series</h2> <ul> <li><a 
href="https://blog.vigoo.dev/posts/prox-1-types/">Part 1 - type level programming</a></li> <li><a href="https://blog.vigoo.dev/posts/prox-2-io-akkastreams/">Part 2 - akka streams with cats effect</a></li> <li><a href="https://blog.vigoo.dev/posts/prox-3-zio/">Part 3 - effect abstraction and ZIO</a></li> <li><a href="https://blog.vigoo.dev/posts/prox-4-simplify/">Part 4 - simplified redesign</a></li> </ul> <h2 id="intro">Intro</h2> <p>In the previous post we have seen how <a href="https://github.com/vigoo/prox">prox</a> applies advanced type level programming techniques to express executing external system processes. The input and output of these processes can be connected to <strong>streams</strong>. The current version of <a href="https://github.com/vigoo/prox">prox</a> uses the <a href="https://fs2.io/">fs2</a> library to describe these streams, and <a href="https://typelevel.org/cats-effect/">cats-effect</a> as an <strong>IO</strong> abstraction, allowing it to separate the specification of a process pipeline from its actual execution.</p> <p>In this post we will keep <a href="https://typelevel.org/cats-effect/">cats-effect</a> but replace <a href="https://fs2.io/">fs2</a> with the stream library of the Akka toolkit, <a href="https://doc.akka.io/docs/akka/2.5/stream/">Akka Streams</a>. This will be a hybrid solution, as Akka Streams is not using any kind of IO abstraction, unlike <a href="https://fs2.io/">fs2</a> which is implemented on top of <a href="https://typelevel.org/cats-effect/">cats-effect</a>. 
We will experiment with implementing <a href="https://github.com/vigoo/prox">prox</a> purely with the <em>Akka</em> libraries in a future post.</p> <h2 id="replacing-fs2-with-akka-streams">Replacing fs2 with Akka Streams</h2> <p>We start by removing the <a href="https://fs2.io/">fs2</a> dependency and adding <em>Akka Streams</em>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>- </span><span style="color:#50a14f;">&quot;co.fs2&quot;</span><span> %% </span><span style="color:#50a14f;">&quot;fs2-core&quot;</span><span> % </span><span style="color:#50a14f;">&quot;1.0.3&quot;</span><span>, </span><span>- </span><span style="color:#50a14f;">&quot;co.fs2&quot;</span><span> %% </span><span style="color:#50a14f;">&quot;fs2-io&quot;</span><span> % </span><span style="color:#50a14f;">&quot;1.0.3&quot;</span><span>, </span><span> </span><span>+ </span><span style="color:#50a14f;">&quot;com.typesafe.akka&quot;</span><span> %% </span><span style="color:#50a14f;">&quot;akka-stream&quot;</span><span> % </span><span style="color:#50a14f;">&quot;2.5.20&quot;</span><span>, </span></code></pre> <p>Then we have to change all the <em>fs2</em> types used in the codebase to the matching <em>Akka Streams</em> types. 
The following table describes these pairs:</p> <table><thead><tr><th>fs2</th><th>Akka Streams</th></tr></thead><tbody> <tr><td><code>Stream[IO, O]</code></td><td><code>Source[O, Any]</code></td></tr> <tr><td><code>Pipe[IO, I, O]</code></td><td><code>Flow[I, O, Any]</code></td></tr> <tr><td><code>Sink[IO, O]</code></td><td><code>Sink[O, Future[Done]]</code></td></tr> </tbody></table> <p>Another small difference that requires changing a lot of our functions is the <em>implicit context</em> these streaming solutions require.</p> <p>With the original implementation it used to be:</p> <ul> <li>an implicit <code>ContextShift[IO]</code> instance</li> <li>and an explicitly passed <em>blocking execution context</em> of type <code>ExecutionContext</code></li> </ul> <p>We can treat the blocking execution context as part of the implicit context for <em>prox</em> too, and could refactor the library to pass both of them wrapped together within a context object.</p> <p>Let's see what we need for the <em>Akka Streams</em> based implementation!</p> <ul> <li>an implicit <code>ContextShift[IO]</code> is <em>still needed</em> because we are still using <code>cats-effect</code> as our IO abstraction</li> <li>The blocking execution context however was only used for passing it to <em>fs2</em>, so we can remove that</li> <li>And for <em>Akka Streams</em> we will need an execution context of type <code>ExecutionContext</code> and also a <code>Materializer</code>. The materializer is used by <em>Akka Streams</em> to execute blueprints of streams. 
The usual implementation is <code>ActorMaterializer</code>, which does that by spawning actors implementing the stream graph.</li> </ul> <p>So, for example, the <code>start</code> extension method is modified like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>- </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">start</span><span>[</span><span style="color:#c18401;">RP</span><span>](</span><span style="color:#e45649;">blockingExecutionContext</span><span>: </span><span style="color:#c18401;">ExecutionContext</span><span>) </span><span> (</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">start</span><span>: </span><span style="color:#c18401;">Start</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">PN</span><span>, </span><span style="color:#c18401;">RP</span><span>, </span><span style="color:#e45649;">_</span><span>], </span><span> </span><span style="color:#e45649;">contextShift</span><span>: </span><span style="color:#c18401;">ContextShift</span><span>[</span><span style="color:#c18401;">IO</span><span>]): </span><span style="color:#c18401;">IO</span><span>[</span><span style="color:#c18401;">RP</span><span>] </span><span>+ </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">start</span><span>[</span><span style="color:#c18401;">RP</span><span>]() </span><span> (</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">start</span><span>: </span><span style="color:#c18401;">Start</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">PN</span><span>, </span><span style="color:#c18401;">RP</span><span>, </span><span style="color:#e45649;">_</span><span>], </span><span> </span><span style="color:#e45649;">contextShift</span><span>: </span><span 
style="color:#c18401;">ContextShift</span><span>[</span><span style="color:#c18401;">IO</span><span>], </span><span> </span><span style="color:#e45649;">materializer</span><span>: </span><span style="color:#c18401;">Materializer</span><span>, </span><span> </span><span style="color:#e45649;">executionContext</span><span>: </span><span style="color:#c18401;">ExecutionContext</span><span>): </span><span style="color:#c18401;">IO</span><span>[</span><span style="color:#c18401;">RP</span><span>] </span></code></pre> <p>It turns out that there is one more minor difference that needs changes in the internal type signatures.</p> <p>In <em>Akka Streams</em> byte streams are represented not by streams of element type <code>Byte</code>, like in <em>fs2</em>, but by streams of <em>chunks</em> called <code>ByteString</code>s. So everywhere we used <code>Byte</code> as element type, such as on the process boundaries, we now simply have to use <code>ByteString</code>, for example:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>- </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">apply</span><span>(</span><span style="color:#e45649;">from</span><span>: </span><span style="color:#c18401;">PN1</span><span>, </span><span style="color:#e45649;">to</span><span>: </span><span style="color:#c18401;">PN2</span><span>, </span><span style="color:#e45649;">via</span><span>: </span><span style="color:#c18401;">Pipe</span><span>[</span><span style="color:#c18401;">IO</span><span>, </span><span style="color:#a626a4;">Byte</span><span>, </span><span style="color:#a626a4;">Byte</span><span>]): </span><span style="color:#c18401;">ResultProcess </span><span>+ </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">apply</span><span>(</span><span style="color:#e45649;">from</span><span>: </span><span style="color:#c18401;">PN1</span><span>, </span><span 
style="color:#e45649;">to</span><span>: </span><span style="color:#c18401;">PN2</span><span>, </span><span style="color:#e45649;">via</span><span>: </span><span style="color:#c18401;">Flow</span><span>[</span><span style="color:#c18401;">ByteString</span><span>, </span><span style="color:#c18401;">ByteString</span><span>, </span><span style="color:#a626a4;">Any</span><span>]): </span><span style="color:#c18401;">ResultProcess </span></code></pre> <p>Another thing to notice is that <em>fs2</em> had a type parameter for passing the <code>IO</code> monad to run on. As I wrote earlier, <em>Akka Streams</em> does not depend on such abstractions, so this parameter is missing. On the other hand, it has a third type parameter which is set in the above example to <code>Any</code>. This parameter is called <code>Mat</code> and represents the type of the value the flow will materialize to. At this point we don't care about it so we set it to <code>Any</code>.</p> <p>Let's take a look at the <code>connect</code> function of the <code>ProcessIO</code> trait. 
With <em>fs2</em> the <code>InputStreamingSource</code> is implemented like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">class</span><span style="color:#c18401;"> InputStreamingSource</span><span>(</span><span style="color:#e45649;">source</span><span>: </span><span style="color:#c18401;">Stream</span><span>[</span><span style="color:#c18401;">IO</span><span>, </span><span style="color:#a626a4;">Byte</span><span>]) </span><span style="color:#a626a4;">extends </span><span>ProcessInputSource </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">toRedirect</span><span style="color:#c18401;">: Redirect </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">Redirect.PIPE </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">connect</span><span style="color:#c18401;">(</span><span style="color:#e45649;">systemProcess</span><span style="color:#c18401;">: lang.Process, </span><span style="color:#e45649;">blockingExecutionContext</span><span style="color:#c18401;">: ExecutionContext) </span><span style="color:#c18401;"> (</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">contextShift</span><span style="color:#c18401;">: ContextShift[IO]): Stream[IO, </span><span style="color:#a626a4;">Byte</span><span style="color:#c18401;">] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> source.observe( </span><span style="color:#c18401;"> io.writeOutputStream[IO]( </span><span style="color:#c18401;"> IO { systemProcess.getOutputStream }, </span><span style="color:#c18401;"> closeAfterUse </span><span style="color:#a626a4;">= 
</span><span style="color:#c18401;">true, </span><span style="color:#c18401;"> blockingExecutionContext </span><span style="color:#a626a4;">=</span><span style="color:#c18401;"> blockingExecutionContext)) </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">run</span><span style="color:#c18401;">(</span><span style="color:#e45649;">stream</span><span style="color:#c18401;">: Stream[IO, </span><span style="color:#a626a4;">Byte</span><span style="color:#c18401;">])(</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">contextShift</span><span style="color:#c18401;">: ContextShift[IO]): IO[Fiber[IO, </span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">]] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> Concurrent[IO].start(stream.compile.drain) </span><span style="color:#c18401;">} </span></code></pre> <p>We have a <code>source</code> stream and during the setup of the process graph, when the system process has already been created, we have to set up the redirection of this source stream to this process. This is separated into a <code>connect</code> and a <code>run</code> step:</p> <ul> <li>The <code>connect</code> step creates an <em>fs2 stream</em> that observes the source stream and sends each byte to the system process's standard input. This just <strong>defines</strong> this stream, and returns it as a pure functional value.</li> <li>The <code>run</code> step on the other hand has the result type <code>IO[Fiber[IO, Unit]]</code>. 
It <strong>defines</strong> the effect of starting a new thread and running the stream on it.</li> </ul> <p>In the case of <em>fs2</em> we can be sure that the <code>source.observe</code> function is pure just by checking its type signature:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">observe</span><span>(</span><span style="color:#e45649;">p</span><span>: </span><span style="color:#c18401;">Pipe</span><span>[</span><span style="color:#c18401;">F</span><span>, </span><span style="color:#c18401;">O</span><span>, </span><span style="color:#a626a4;">Unit</span><span>])(</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">F</span><span>: </span><span style="color:#c18401;">Concurrent</span><span>[</span><span style="color:#c18401;">F</span><span>]): </span><span style="color:#c18401;">Stream</span><span>[</span><span style="color:#c18401;">F</span><span>, </span><span style="color:#c18401;">O</span><span>] </span></code></pre> <p>All side-effecting functions in <em>fs2</em> are defined as <code>IO</code> functions, so we simply know that this one is not among them, and that's why <code>connect</code> was a pure, non-<code>IO</code> function in the original implementation. With <em>Akka Streams</em> we don't have any information about this encoded in the type system. 
We use the <code>source.alsoTo</code> function:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">alsoTo</span><span>(</span><span style="color:#e45649;">that</span><span>: </span><span style="color:#c18401;">Graph</span><span>[</span><span style="color:#c18401;">SinkShape</span><span>[</span><span style="color:#c18401;">Out</span><span>], </span><span style="color:#e45649;">_</span><span>]): </span><span style="color:#c18401;">Repr</span><span>[</span><span style="color:#c18401;">Out</span><span>] </span></code></pre> <p>which is actually also pure (only creating a blueprint of the graph to be executed), so we can safely replace the implementation with this in the <em>Akka Streams</em> version:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">class</span><span style="color:#c18401;"> InputStreamingSource</span><span>(</span><span style="color:#e45649;">source</span><span>: </span><span style="color:#c18401;">Source</span><span>[</span><span style="color:#c18401;">ByteString</span><span>, </span><span style="color:#a626a4;">Any</span><span>]) </span><span style="color:#a626a4;">extends </span><span>ProcessInputSource </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">toRedirect</span><span style="color:#c18401;">: Redirect </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">Redirect.PIPE </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">connect</span><span style="color:#c18401;">(</span><span style="color:#e45649;">systemProcess</span><span 
style="color:#c18401;">: lang.Process)(</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">contextShift</span><span style="color:#c18401;">: ContextShift[IO]): Source[ByteString, </span><span style="color:#a626a4;">Any</span><span style="color:#c18401;">] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> source.alsoTo(fromOutputStream(() </span><span style="color:#a626a4;">=&gt;</span><span style="color:#c18401;"> systemProcess.getOutputStream, autoFlush </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">true)) </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">run</span><span style="color:#c18401;">(</span><span style="color:#e45649;">stream</span><span style="color:#c18401;">: Source[ByteString, </span><span style="color:#a626a4;">Any</span><span style="color:#c18401;">]) </span><span style="color:#c18401;"> (</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">contextShift</span><span style="color:#c18401;">: ContextShift[IO], </span><span style="color:#c18401;"> </span><span style="color:#e45649;">materializer</span><span style="color:#c18401;">: Materializer, </span><span style="color:#c18401;"> </span><span style="color:#e45649;">executionContext</span><span style="color:#c18401;">: ExecutionContext): IO[Fiber[IO, </span><span style="color:#a626a4;">Unit</span><span style="color:#c18401;">]] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> Concurrent[IO].start(IO.async { </span><span style="color:#e45649;">finish </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;"> stream.runWith(Sink.ignore).onComplete { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case </span><span style="color:#c18401;">Success(Done) </span><span 
style="color:#a626a4;">=&gt;</span><span style="color:#c18401;"> finish(Right(())) </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case </span><span style="color:#c18401;">Failure(</span><span style="color:#e45649;">reason</span><span style="color:#c18401;">) </span><span style="color:#a626a4;">=&gt;</span><span style="color:#c18401;"> finish(Left(reason)) </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> }) </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre> <p>The implementation of <code>run</code> above is a nice example of how we can integrate asynchronous operations not implemented with <code>cats-effect</code> into an <code>IO</code> based program. With <code>IO.async</code> we define how to start the asynchronous operation (in this case running the <em>Akka stream</em>) and we get a callback function, <code>finish</code>, to be called when the asynchronous operation ends. The stream here <em>materializes</em> to a <code>Future[T]</code> value, so we can use its <code>onComplete</code> function to notify the IO system about the finished stream. The <code>IO</code> value returned by <code>IO.async</code> represents the whole asynchronous operation: it returns its final result when the callback is called, and "blocks" the program flow until it is done. This does not mean actually blocking a thread, but the next IO function will be executed only when it has finished running (as its type is <code>IO[A]</code>). That is not what we need here, so we use <code>Concurrent[IO].start</code> to put this <code>IO</code> action on a separate <em>fiber</em>. 
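</p> <p>The same bridging idea can be captured in a small reusable helper. The following is only a minimal sketch, assuming cats-effect 1.x or 2.x (where <code>ContextShift</code> and <code>Fiber</code> have this shape); the names <code>FutureBridge</code> and <code>startFuture</code> are hypothetical and not part of <em>prox</em>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala">import cats.effect.{ContextShift, Fiber, IO}
import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}

object FutureBridge {
  // Hypothetical helper (an assumption, not prox code): lifts a lazily
  // started Future into IO with IO.async, then forks it onto its own fiber.
  def startFuture[A](future: () =&gt; Future[A])
                    (implicit contextShift: ContextShift[IO],
                     executionContext: ExecutionContext): IO[Fiber[IO, A]] =
    IO.async[A] { finish =&gt;
      // The Future is only created when the resulting IO is actually run
      future().onComplete {
        case Success(value)  =&gt; finish(Right(value))
        case Failure(reason) =&gt; finish(Left(reason))
      }
    }.start
}
</code></pre> <p>Under those assumptions, <code>FutureBridge.startFuture(() =&gt; stream.runWith(Sink.ignore))</code> would start the stream on its own fiber and yield a <code>Fiber[IO, Done]</code>; taking the <code>Future</code> as a function keeps the stream from being materialized before the <code>IO</code> is run.</p> <p>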
This way all streams involved in the process graph will be executing in parallel.</p> <h3 id="calculating-the-result">Calculating the result</h3> <p><a href="https://github.com/vigoo/prox">prox</a> supports multiple ways to calculate the result of running a process graph:</p> <ul> <li>If the target is a <code>Sink</code>, the result type is <code>Unit</code></li> <li>If the pipe's output is <code>Out</code> and there is a <code>Monoid</code> instance for <code>Out</code>, the stream is folded into an <code>Out</code> value</li> <li>Otherwise if the pipe's output is <code>Out</code>, the result type will be <code>Vector[Out]</code></li> </ul> <p>These cases can be enforced by the <code>Drain</code>, <code>Fold</code> and <code>ToVector</code> wrapper classes, respectively.</p> <p>Let's see how we can implement them with <em>Akka Streams</em> compared to <em>fs2</em>.</p> <h4 id="drain-sink">Drain sink</h4> <p>The sink version was implemented like this with <em>fs2</em>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>Concurrent[</span><span style="color:#c18401;">IO</span><span>].start(stream.compile.drain) </span></code></pre> <ul> <li><code>.compile</code> gets an interface that can be used to convert the stream to an <code>IO[A]</code> value in multiple ways.</li> <li><code>.drain</code> is one of them. It runs the stream but ignores its elements, having a result type of <code>IO[Unit]</code>.</li> <li>We want to run this concurrently with the other streams so we move it to a <em>fiber</em>.</li> </ul> <p>With <em>Akka Streams</em> there is one big difference. In <em>fs2</em> the sink is represented as a <code>Pipe[F, E, Unit]</code>, so we could treat it in the same way as other stream segments. 
In this case the <code>Sink</code> is not a <code>Flow</code>, so we do a trick to keep the interface as close to the original one as possible:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>create((</span><span style="color:#e45649;">sink</span><span>: </span><span style="color:#c18401;">Sink</span><span>[</span><span style="color:#c18401;">ByteString</span><span>, </span><span style="color:#c18401;">Future</span><span>[</span><span style="color:#c18401;">R</span><span>]]) </span><span style="color:#a626a4;">=&gt; new </span><span style="color:#c18401;">OutputStreamingTarget</span><span>(Flow.fromFunction(identity)) </span><span> with ProcessOutputTarget[</span><span style="color:#c18401;">ByteString</span><span>, </span><span style="color:#c18401;">R</span><span>] { </span><span> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">run</span><span>(</span><span style="color:#e45649;">stream</span><span>: </span><span style="color:#c18401;">Source</span><span>[</span><span style="color:#c18401;">ByteString</span><span>, </span><span style="color:#a626a4;">Any</span><span>]) </span><span> (</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">contextShift</span><span>: </span><span style="color:#c18401;">ContextShift</span><span>[</span><span style="color:#c18401;">IO</span><span>], </span><span> </span><span style="color:#e45649;">materializer</span><span>: </span><span style="color:#c18401;">Materializer</span><span>, </span><span> </span><span style="color:#e45649;">executionContext</span><span>: </span><span style="color:#c18401;">ExecutionContext</span><span>): </span><span style="color:#c18401;">IO</span><span>[</span><span style="color:#c18401;">Fiber</span><span>[</span><span style="color:#c18401;">IO</span><span>, </span><span style="color:#c18401;">R</span><span>]] </span><span 
style="color:#a626a4;">= </span><span> Concurrent[</span><span style="color:#c18401;">IO</span><span>].start(IO.async { </span><span style="color:#e45649;">complete </span><span style="color:#a626a4;">=&gt; </span><span> stream.runWith(sink).onComplete { </span><span> </span><span style="color:#a626a4;">case </span><span>Success(</span><span style="color:#e45649;">value</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> complete(Right(value)) </span><span> </span><span style="color:#a626a4;">case </span><span>Failure(</span><span style="color:#e45649;">reason</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> complete(Left(reason)) </span><span> } </span><span> }) </span><span>} </span></code></pre> <p>The trick is that we create the <code>OutputStreamingTarget</code> with an identity flow, and only use the <code>Sink</code> when we actually run the stream, passing it to the <code>runWith</code> function. This materializes the stream into a <code>Future[R]</code> value (<code>Future[Done]</code> for a sink such as <code>Sink.ignore</code>) that we can tie back to our <code>IO</code> system with <code>IO.async</code>, as described above.</p> <h4 id="combine-with-monoid">Combine with Monoid</h4> <p>When the element type is a <em>monoid</em> we can fold the stream into a single value. 
<em>Fs2</em> directly supports this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>Concurrent[</span><span style="color:#c18401;">IO</span><span>].start(stream.compile.foldMonoid) </span></code></pre> <p><em>Akka Streams</em> does not use cats type classes, but it also has a way to <em>fold</em> the stream, so we can easily implement it using the <em>monoid instance</em>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>Concurrent[</span><span style="color:#c18401;">IO</span><span>].start(IO.async { </span><span style="color:#e45649;">complete </span><span style="color:#a626a4;">=&gt; </span><span> stream.runFold(monoid.empty)(monoid.combine).onComplete { </span><span> </span><span style="color:#a626a4;">case </span><span>Success(</span><span style="color:#e45649;">value</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> complete(Right(value)) </span><span> </span><span style="color:#a626a4;">case </span><span>Failure(</span><span style="color:#e45649;">reason</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> complete(Left(reason)) </span><span> } </span><span>}) </span></code></pre> <h4 id="vector-of-elements">Vector of elements</h4> <p>Finally let's see the version that keeps all the stream elements in a vector as a result:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>Concurrent[</span><span style="color:#c18401;">IO</span><span>].start(stream.compile.toVector) </span></code></pre> <p>With <em>Akka Streams</em> we can do it by running the stream into a <em>sink</em> created for this, <code>Sink.seq</code>. 
It materializes into a <code>Future[Seq[T]]</code> value that holds all the elements of the executed stream:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>Concurrent[</span><span style="color:#c18401;">IO</span><span>].start(IO.async { </span><span style="color:#e45649;">complete </span><span style="color:#a626a4;">=&gt; </span><span> stream.runWith(Sink.seq).onComplete { </span><span> </span><span style="color:#a626a4;">case </span><span>Success(</span><span style="color:#e45649;">value</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> complete(Right(value.toVector)) </span><span> </span><span style="color:#a626a4;">case </span><span>Failure(</span><span style="color:#e45649;">reason</span><span>) </span><span style="color:#a626a4;">=&gt;</span><span> complete(Left(reason)) </span><span> } </span><span>}) </span></code></pre> <h3 id="testing">Testing</h3> <p>At this point the only remaining thing is to modify the tests too. One of the more complex examples is the <code>customProcessPiping</code> test case. 
With <em>fs2</em> it takes advantage of some <em>text processing</em> pipe elements coming with the library:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">customPipe</span><span>: </span><span style="color:#c18401;">Pipe</span><span>[</span><span style="color:#c18401;">IO</span><span>, </span><span style="color:#a626a4;">Byte</span><span>, </span><span style="color:#a626a4;">Byte</span><span>] </span><span style="color:#a626a4;">= </span><span> (</span><span style="color:#e45649;">s</span><span>: </span><span style="color:#c18401;">Stream</span><span>[</span><span style="color:#c18401;">IO</span><span>, </span><span style="color:#a626a4;">Byte</span><span>]) </span><span style="color:#a626a4;">=&gt;</span><span> s </span><span> .through(text.utf8Decode) </span><span> .through(text.lines) </span><span> .map(</span><span style="color:#e45649;">_</span><span>.split(</span><span style="color:#c18401;">&#39; &#39;</span><span>).toVector) </span><span> .map(</span><span style="color:#e45649;">v </span><span style="color:#a626a4;">=&gt;</span><span> v.map(</span><span style="color:#e45649;">_</span><span> + </span><span style="color:#50a14f;">&quot; !!!&quot;</span><span>).mkString(</span><span style="color:#50a14f;">&quot; &quot;</span><span>)) </span><span> .intersperse(</span><span style="color:#50a14f;">&quot;</span><span style="color:#0997b3;">\n</span><span style="color:#50a14f;">&quot;</span><span>) </span><span> .through(text.utf8Encode) </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">proc </span><span style="color:#a626a4;">= </span><span>Process(</span><span style="color:#50a14f;">&quot;echo&quot;</span><span>, List(</span><span style="color:#50a14f;">&quot;This is a test string&quot;</span><span>)) </span><span> .via(customPipe) </span><span> 
.to(Process(</span><span style="color:#50a14f;">&quot;wc&quot;</span><span>, List(</span><span style="color:#50a14f;">&quot;-w&quot;</span><span>)) &gt; text.utf8Decode[</span><span style="color:#c18401;">IO</span><span>]) </span></code></pre> <p>There are similar tools in <em>Akka Streams</em> to express this in the <code>Framing</code> module:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">customPipe </span><span style="color:#a626a4;">= </span><span>Framing.delimiter( </span><span> delimiter </span><span style="color:#a626a4;">= </span><span>ByteString(</span><span style="color:#50a14f;">&quot;</span><span style="color:#0997b3;">\n</span><span style="color:#50a14f;">&quot;</span><span>), </span><span> maximumFrameLength </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">10000</span><span>, </span><span> allowTruncation </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">true </span><span> ).map(</span><span style="color:#e45649;">_</span><span>.utf8String) </span><span> .map(</span><span style="color:#e45649;">_</span><span>.split(</span><span style="color:#c18401;">&#39; &#39;</span><span>).toVector) </span><span> .map(</span><span style="color:#e45649;">v </span><span style="color:#a626a4;">=&gt;</span><span> v.map(</span><span style="color:#e45649;">_</span><span> + </span><span style="color:#50a14f;">&quot; !!!&quot;</span><span>).mkString(</span><span style="color:#50a14f;">&quot; &quot;</span><span>)) </span><span> .intersperse(</span><span style="color:#50a14f;">&quot;</span><span style="color:#0997b3;">\n</span><span style="color:#50a14f;">&quot;</span><span>) </span><span> .map(ByteString.apply) </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">proc </span><span style="color:#a626a4;">= 
</span><span>Process(</span><span style="color:#50a14f;">&quot;echo&quot;</span><span>, List(</span><span style="color:#50a14f;">&quot;This is a test string&quot;</span><span>)) </span><span> .via(customPipe) </span><span> .to(Process(</span><span style="color:#50a14f;">&quot;wc&quot;</span><span>, List(</span><span style="color:#50a14f;">&quot;-w&quot;</span><span>)) &gt; utf8Decode) </span></code></pre> <p>where <code>utf8Decode</code> is a helper sink defined as:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">utf8Decode</span><span>: </span><span style="color:#c18401;">Sink</span><span>[</span><span style="color:#c18401;">ByteString</span><span>, </span><span style="color:#c18401;">Future</span><span>[</span><span style="color:#c18401;">String</span><span>]] </span><span style="color:#a626a4;">= </span><span> Flow[</span><span style="color:#c18401;">ByteString</span><span>] </span><span> .reduce(</span><span style="color:#e45649;">_</span><span> ++ </span><span style="color:#e45649;">_</span><span>) </span><span> .map(</span><span style="color:#e45649;">_</span><span>.utf8String) </span><span> .toMat(Sink.head)(Keep.right) </span></code></pre> <p>First it concatenates the <code>ByteString</code> chunks, then simply calls <code>.utf8String</code> on the result.</p> <h2 id="final-thoughts">Final thoughts</h2> <p>We have seen that it is relatively easy to replace the stream library in <a href="https://github.com/vigoo/prox">prox</a> without changing its interface much, if we keep <a href="https://typelevel.org/cats-effect/">cats-effect</a> for expressing the effectful computations. 
The complete working example is available on the <a href="https://github.com/vigoo/prox/compare/akka-streams"><code>akka-streams</code> branch</a>.</p> prox part 1 - type level programming 2019-02-10T00:00:00+00:00 2019-02-10T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/prox-1-types/ <h2 id="blog-post-series">Blog post series</h2> <ul> <li><a href="https://blog.vigoo.dev/posts/prox-1-types/">Part 1 - type level programming</a></li> <li><a href="https://blog.vigoo.dev/posts/prox-2-io-akkastreams/">Part 2 - akka streams with cats effect</a></li> <li><a href="https://blog.vigoo.dev/posts/prox-3-zio/">Part 3 - effect abstraction and ZIO</a></li> <li><a href="https://blog.vigoo.dev/posts/prox-4-simplify/">Part 4 - simplified redesign</a></li> </ul> <h2 id="intro">Intro</h2> <p>I started writing <a href="https://github.com/vigoo/prox">prox</a> at the end of 2017 for two reasons. First, I never liked any of the existing solutions for running external processes and capturing their input/output streams. Second, I had just returned from the <a href="https://scala.io/">scala.io conference</a> full of inspiration; I wanted to try out some techniques and libraries and this seemed to be a nice small project to do so.</p> <p>Since then, <a href="https://github.com/vigoo/prox">prox</a> has proved to be useful: we are using it at <a href="https://prezi.com/">Prezi</a> in all our Scala projects where we have to deal with external processes. The last stable version was created last October, after <a href="https://typelevel.org/cats-effect/">cats-effect 1.0</a> and <a href="https://fs2.io/">fs2 1.0</a> were released.</p> <p>This is the first part of a series of blog posts dedicated to this library. In the first one I'm going to talk about how <a href="https://github.com/milessabin/shapeless">shapeless</a> and <em>type level programming</em> techniques are used to create a strongly typed interface for starting system processes. 
In future posts I will explore replacing its dependencies such as using <a href="https://doc.akka.io/docs/akka/2.5/stream/">akka-streams</a> instead of <a href="https://fs2.io/">fs2</a> or <a href="https://scalaz.github.io/scalaz-zio/">ZIO</a> instead of <a href="https://typelevel.org/cats-effect/">cats-effect</a>. These different versions will be a good opportunity to do some performance comparison, and to close the series with creating a new version of the library which is easier to use in the alternative environments.</p> <h2 id="limiting-redirection">Limiting redirection</h2> <p>When I started writing the library I wanted to explore how I can express some strict constraints on the type level:</p> <ul> <li>A process can have its input, output and error streams redirected, but only once</li> <li>Processes without redirected output can be piped to processes without a redirected input</li> </ul> <p>In prox <em>0.2.1</em> a single system process is described by the following type:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">class</span><span style="color:#c18401;"> Process</span><span>[</span><span style="color:#c18401;">Out</span><span>, </span><span style="color:#c18401;">Err</span><span>, </span><span style="color:#c18401;">OutResult</span><span>, </span><span style="color:#c18401;">ErrResult</span><span>, </span><span> </span><span style="color:#c18401;">IRS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>, </span><span style="color:#c18401;">ORS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>, </span><span style="color:#c18401;">ERS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>]( </span><span> </span><span style="color:#a626a4;">val </span><span 
style="color:#e45649;">command</span><span>: </span><span style="color:#c18401;">String</span><span>, </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">arguments</span><span>: </span><span style="color:#c18401;">List</span><span>[</span><span style="color:#c18401;">String</span><span>], </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">workingDirectory</span><span>: </span><span style="color:#c18401;">Option</span><span>[</span><span style="color:#c18401;">Path</span><span>], </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">inputSource</span><span>: </span><span style="color:#c18401;">ProcessInputSource</span><span>, </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">outputTarget</span><span>: </span><span style="color:#c18401;">ProcessOutputTarget</span><span>[</span><span style="color:#c18401;">Out</span><span>, </span><span style="color:#c18401;">OutResult</span><span>], </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">errorTarget</span><span>: </span><span style="color:#c18401;">ProcessErrorTarget</span><span>[</span><span style="color:#c18401;">Err</span><span>, </span><span style="color:#c18401;">ErrResult</span><span>], </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">environmentVariables</span><span>: </span><span style="color:#c18401;">Map</span><span>[</span><span style="color:#c18401;">String</span><span>, </span><span style="color:#c18401;">String</span><span>]) </span><span> </span><span style="color:#a626a4;">extends </span><span>ProcessNode[</span><span style="color:#c18401;">Out</span><span>, </span><span style="color:#c18401;">Err</span><span>, </span><span style="color:#c18401;">IRS</span><span>, </span><span style="color:#c18401;">ORS</span><span>, </span><span style="color:#c18401;">ERS</span><span>] 
</span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;">} </span></code></pre> <p>but let's focus first on the requirement to be able to redirect each of the streams <em>at most once</em>. This is encoded by the <code>IRS</code>, <code>ORS</code> and <code>ERS</code> type parameters, which all have to be subtypes of <code>RedirectionState</code>. <code>RedirectionState</code> is a <strong>phantom type</strong>; there are no values ever created of this type, it is only used in type signatures to encode whether each of the three streams is already redirected or not:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a0a1a7;">/** Phantom type representing the redirection state of a process */ </span><span style="color:#a626a4;">sealed trait</span><span style="color:#c18401;"> RedirectionState </span><span> </span><span style="color:#a0a1a7;">/** Indicates that the given channel is not redirected yet */ </span><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> NotRedirected </span><span style="color:#a626a4;">extends </span><span>RedirectionState </span><span> </span><span style="color:#a0a1a7;">/** Indicates that the given channel has already been redirected */ </span><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> Redirected </span><span style="color:#a626a4;">extends </span><span>RedirectionState </span></code></pre> <p>So for example with a simplified model of a <em>process</em>, <code>Process[IRS &lt;: RedirectionState, ORS &lt;: RedirectionState, ERS &lt;: RedirectionState]</code>, using the output redirection operator <code>&gt;</code> would change the types in the following way:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" 
data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">p1</span><span>: </span><span style="color:#c18401;">Process</span><span>[</span><span style="color:#c18401;">NotRedirected</span><span>, </span><span style="color:#c18401;">NotRedirected</span><span>, </span><span style="color:#c18401;">NotRedirected</span><span>] </span><span style="color:#a626a4;">= ??? </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">p2</span><span>: </span><span style="color:#c18401;">Process</span><span>[</span><span style="color:#c18401;">NotRedirected</span><span>, </span><span style="color:#c18401;">Redirected</span><span>, </span><span style="color:#c18401;">NotRedirected</span><span>] </span><span style="color:#a626a4;">=</span><span> p1 &gt; (home / </span><span style="color:#50a14f;">&quot;tmp&quot;</span><span> / </span><span style="color:#50a14f;">&quot;out.txt&quot;</span><span>) </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">p3 </span><span style="color:#a626a4;">=</span><span> p2 &gt; (home / </span><span style="color:#50a14f;">&quot;tmp&quot;</span><span> / </span><span style="color:#50a14f;">&quot;another.txt&quot;</span><span>) </span><span style="color:#a0a1a7;">// THIS MUST NOT COMPILE </span></code></pre> <p>How can we restrict the redirect function to only work on <code>Process[_, NotRedirected, _]</code>? 
We can define it as an <strong>extension method</strong> with an implicit class (once again this is a simplified version focusing only on the <em>redirection state</em> handling):</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span> </span><span style="color:#a626a4;">implicit class</span><span style="color:#c18401;"> ProcessNodeOutputRedirect</span><span>[ </span><span> </span><span style="color:#c18401;">IRS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>, </span><span> </span><span style="color:#c18401;">ERS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>, </span><span> </span><span style="color:#c18401;">PN </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">Process</span><span>[</span><span style="color:#c18401;">IRS</span><span>, </span><span style="color:#c18401;">NotRedirected</span><span>, </span><span style="color:#c18401;">ERS</span><span>]](</span><span style="color:#e45649;">process</span><span>: </span><span style="color:#c18401;">PN</span><span>) </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">&gt;</span><span style="color:#c18401;">[To](</span><span style="color:#e45649;">to</span><span style="color:#c18401;">: To)(</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">target</span><span style="color:#c18401;">: CanBeProcessOutputTarget[To]): Process[IRS, Redirected, ERS] </span><span style="color:#a626a4;">= ??? 
</span><span style="color:#c18401;"> } </span></code></pre> <p>By forcing the <code>ORS</code> type parameter to be <code>NotRedirected</code> and setting it to <code>Redirected</code> in the result type we can guarantee that this function can only be called on a process that does not have its output redirected yet. The <em>target</em> of the redirection is extensible through the <code>CanBeProcessOutputTarget</code> type class, as we will see later.</p> <h2 id="dependent-types">Dependent types</h2> <p>Reality is much more complicated, because of <em>process piping</em> and because the process types encode the redirection result types too. Let's get back to our <code>&gt;</code> function and see how we could modify it so it works with piped processes too. But first, how is process piping encoded in this library?</p> <p>Two processes connected through a pipe are represented by the <code>PipedProcess</code> class. Both <code>Process</code> and <code>PipedProcess</code> implement the following trait:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">sealed trait</span><span style="color:#c18401;"> ProcessNode</span><span>[</span><span style="color:#c18401;">Out</span><span>, </span><span style="color:#c18401;">Err</span><span>, </span><span style="color:#c18401;">IRS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>, </span><span style="color:#c18401;">ORS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>, </span><span style="color:#c18401;">ERS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>] </span></code></pre> <p>We've already seen <code>Process</code>. 
<code>PipedProcess</code> is a bit more complicated:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">class</span><span style="color:#c18401;"> PipedProcess</span><span>[</span><span style="color:#c18401;">Out</span><span>, </span><span style="color:#c18401;">Err</span><span>, </span><span style="color:#c18401;">PN1Out</span><span>, </span><span> </span><span style="color:#c18401;">PN1 </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ProcessNode</span><span>[</span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>], </span><span> </span><span style="color:#c18401;">PN2 </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ProcessNode</span><span>[</span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>], </span><span> </span><span style="color:#c18401;">IRS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>, </span><span style="color:#c18401;">ORS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>, </span><span style="color:#c18401;">ERS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>] </span><span> (</span><span style="color:#a626a4;">val </span><span style="color:#e45649;">from</span><span>: </span><span style="color:#c18401;">PN1</span><span>, </span><span style="color:#a626a4;">val </span><span 
style="color:#e45649;">createTo</span><span>: </span><span style="color:#c18401;">PipeConstruction</span><span>[</span><span style="color:#c18401;">PN1Out</span><span>] </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">PN2</span><span>) </span><span> </span><span style="color:#a626a4;">extends </span><span>ProcessNode[</span><span style="color:#c18401;">Out</span><span>, </span><span style="color:#c18401;">Err</span><span>, </span><span style="color:#c18401;">IRS</span><span>, </span><span style="color:#c18401;">ORS</span><span>, </span><span style="color:#c18401;">ERS</span><span>] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... </span><span style="color:#c18401;">} </span></code></pre> <p>To make <code>&gt;</code> work on both, we can start by modifying its definition to work on <em>any</em> <code>ProcessNode</code> not just <code>Process</code> (omitting the output type params for now):</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">implicit class</span><span style="color:#c18401;"> ProcessNodeOutputRedirect</span><span>[ </span><span> </span><span style="color:#c18401;">IRS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>, </span><span> </span><span style="color:#c18401;">ERS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>, </span><span> </span><span style="color:#c18401;">PN </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ProcessNode</span><span>[</span><span style="color:#c18401;">IRS</span><span>, </span><span style="color:#c18401;">NotRedirected</span><span>, </span><span style="color:#c18401;">ERS</span><span>]](</span><span style="color:#e45649;">process</span><span>: </span><span 
style="color:#c18401;">PN</span><span>) </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">&gt;</span><span style="color:#c18401;">[To](</span><span style="color:#e45649;">to</span><span style="color:#c18401;">: To)(</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">target</span><span style="color:#c18401;">: CanBeProcessOutputTarget[To]): ProcessNode[IRS, Redirected, ERS] </span><span style="color:#a626a4;">= ??? </span><span style="color:#c18401;">} </span></code></pre> <p>This has a serious problem though. The result type is <code>ProcessNode</code> and not the "real" process type, which means that we lose type information and all the other dependently typed operations will not work. We have to make the result type <strong>depend</strong> on the input!</p> <p>We may try to use a <code>RedirectOutput</code> type class like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">implicit class</span><span style="color:#c18401;"> ProcessNodeOutputRedirect</span><span>[ </span><span> </span><span style="color:#c18401;">IRS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>, </span><span> </span><span style="color:#c18401;">ERS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>, </span><span> </span><span style="color:#c18401;">PN </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ProcessNode</span><span>[</span><span style="color:#c18401;">IRS</span><span>, </span><span style="color:#c18401;">NotRedirected</span><span>, </span><span style="color:#c18401;">ERS</span><span>]](</span><span style="color:#e45649;">process</span><span>: 
</span><span style="color:#c18401;">PN</span><span>) </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">&gt;</span><span style="color:#c18401;">[To](</span><span style="color:#e45649;">to</span><span style="color:#c18401;">: To) </span><span style="color:#c18401;"> (</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">target</span><span style="color:#c18401;">: CanBeProcessOutputTarget[To], </span><span style="color:#c18401;"> </span><span style="color:#e45649;">redirectOutput</span><span style="color:#c18401;">: RedirectOutput[PN, To]): redirectOutput.Result </span><span style="color:#a626a4;">=</span><span style="color:#c18401;"> redirectOutput(process, to) </span><span style="color:#c18401;">} </span></code></pre> <p>Here the result (<code>redirectOutput.Result</code>) is a <em>path dependent type</em>. This may work in some simple cases but has two serious issues:</p> <ul> <li>It is not possible to use <code>redirectOutput.Result</code> in the same <em>parameter block</em> of the function, so if another type class needed it as a type parameter we could not pass it.</li> <li>Further implicit resolutions and type level operations will quickly break as the compiler will not be able to unify the various path dependent types.</li> </ul> <p>The <strong>Aux pattern</strong>, used heavily in the <a href="https://github.com/milessabin/shapeless">shapeless</a> library, provides a nice way of fixing both problems. 
We start by defining a <em>type class</em> for describing the operation, in this case <em>redirecting the output channel of a process</em>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> RedirectOutput</span><span>[</span><span style="color:#c18401;">PN </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ProcessNode</span><span>[</span><span style="color:#e45649;">_</span><span>, </span><span style="color:#c18401;">NotRedirected</span><span>, </span><span style="color:#e45649;">_</span><span>], </span><span style="color:#c18401;">To</span><span>] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">type </span><span style="color:#c18401;">Result </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ProcessNode[</span><span style="color:#e45649;">_</span><span style="color:#c18401;">, Redirected, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">] </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">apply</span><span style="color:#c18401;">(</span><span style="color:#e45649;">process</span><span style="color:#c18401;">: PN, </span><span style="color:#e45649;">to</span><span style="color:#c18401;">: To)(</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">target</span><span style="color:#c18401;">: CanBeProcessOutputTarget[To]): Result </span><span style="color:#c18401;">} </span><span> </span><span style="color:#a626a4;">object</span><span style="color:#c18401;"> RedirectOutput { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">type </span><span style="color:#c18401;">Aux[PN </span><span style="color:#a626a4;">&lt;: 
</span><span style="color:#c18401;">ProcessNode[</span><span style="color:#e45649;">_</span><span style="color:#c18401;">, NotRedirected, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">], To, Result0] </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> RedirectOutput[PN, To] { </span><span style="color:#a626a4;">type </span><span style="color:#c18401;">Result </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">Result0 } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ... type class instances </span><span style="color:#c18401;">} </span></code></pre> <p>The type class itself is straightforward. We have to implement it for both <code>Process</code> and <code>PipedProcess</code> and set the <code>Result</code> type accordingly, then implement <code>apply</code> that sets up the actual redirection. But what is the <code>Aux</code> type for?</p> <p>It solves the problems with the <em>path dependent</em> version if we use it like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">implicit class</span><span style="color:#c18401;"> ProcessNodeOutputRedirect</span><span>[ </span><span> </span><span style="color:#c18401;">IRS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>, </span><span> </span><span style="color:#c18401;">ERS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>, </span><span> </span><span style="color:#c18401;">PN </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ProcessNode</span><span>[</span><span style="color:#c18401;">IRS</span><span>, </span><span style="color:#c18401;">NotRedirected</span><span>, </span><span 
style="color:#c18401;">ERS</span><span>]](</span><span style="color:#e45649;">process</span><span>: </span><span style="color:#c18401;">PN</span><span>) </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">&gt;</span><span style="color:#c18401;">[To, Result </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ProcessNode[</span><span style="color:#e45649;">_</span><span style="color:#c18401;">, Redirected, </span><span style="color:#e45649;">_</span><span style="color:#c18401;">]](</span><span style="color:#e45649;">to</span><span style="color:#c18401;">: To) </span><span style="color:#c18401;"> (</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">target</span><span style="color:#c18401;">: CanBeProcessOutputTarget[To], </span><span style="color:#c18401;"> </span><span style="color:#e45649;">redirectOutput</span><span style="color:#c18401;">: RedirectOutput.Aux[PN, To, Result]): Result </span><span style="color:#a626a4;">=</span><span style="color:#c18401;"> redirectOutput(process, to) </span><span style="color:#c18401;">} </span></code></pre> <p>By lifting the <code>Result</code> from the type class instance to a type parameter the compiler can now "extract" the calculated type from <code>redirectOutput.Result</code> to the <code>&gt;</code> function's <code>Result</code> type parameter and use it directly, either for further type requirements or, as we do here, in the result type.</p> <p>This is the basic pattern used for <em>all</em> the operations in prox. You can check <a href="http://gigiigig.github.io/posts/2015/09/13/aux-pattern.html">Luigi's short introduction to the <code>Aux</code> pattern</a> for a more detailed explanation.</p> <h2 id="starting-the-processes">Starting the processes</h2> <p>So far we just combined purely functional data structures in a complicated way. 
The result value may encode the launching of several system processes that are connected via pipes to each other and possibly other streams as we will see.</p> <p>When we eventually decide to <em>start</em> these processes, we need a way to observe their status, wait for them to stop, get their exit code, and to access the data sent to the output streams if they were redirected. And we need this <em>per process</em>, while launching the whole process graph in a <em>single step</em>.</p> <p>First let's model a single <em>running process</em>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> RunningProcess</span><span>[</span><span style="color:#c18401;">Out</span><span>, </span><span style="color:#c18401;">OutResult</span><span>, </span><span style="color:#c18401;">ErrResult</span><span>] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">isAlive</span><span style="color:#c18401;">: IO[</span><span style="color:#a626a4;">Boolean</span><span style="color:#c18401;">] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">waitForExit</span><span style="color:#c18401;">(): IO[ProcessResult[OutResult, ErrResult]] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">terminate</span><span style="color:#c18401;">(): IO[ProcessResult[OutResult, ErrResult]] </span><span style="color:#c18401;">} </span></code></pre> <p>and <code>ProcessResult</code> that represents an already <em>terminated process</em>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">case class</span><span 
style="color:#c18401;"> ProcessResult</span><span>[</span><span style="color:#c18401;">OutResult</span><span>, </span><span style="color:#c18401;">ErrResult</span><span>]( </span><span> </span><span style="color:#e45649;">exitCode</span><span>: </span><span style="color:#a626a4;">Int</span><span>, </span><span> </span><span style="color:#e45649;">fullOutput</span><span>: </span><span style="color:#c18401;">OutResult</span><span>, </span><span> </span><span style="color:#e45649;">fullError</span><span>: </span><span style="color:#c18401;">ErrResult </span><span>) </span></code></pre> <p>Now we need to define a <code>start</code> extension method on <code>ProcessNode</code> that somehow returns one well-typed <code>RunningProcess</code> for <em>each</em> system process that it starts.</p> <p>Let's forget for a second about having multiple processes piped together and just consider the single process case. For that, we would need something like this (the <code>Out</code> parameter is needed only for piping so I omitted it):</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">start</span><span>: </span><span style="color:#c18401;">IO</span><span>[</span><span style="color:#c18401;">RunningProcess</span><span>[</span><span style="color:#c18401;">OutResult</span><span>, </span><span style="color:#c18401;">ErrResult</span><span>]] </span></code></pre> <p>Now we can see why <code>Process</code> has those additional type parameters. It is not enough to encode whether the output and error channels were redirected or not; we also have to encode the expected <em>result type</em> of redirecting these. 
By storing these types in type parameters of <code>Process</code> we can easily imagine that by using the pattern described in the previous section, the <em>result type</em> can <strong>depend</strong> on what we redirected the process to.</p> <p>Let's see some examples of what this means!</p> <table><thead><tr><th>Target</th><th>Result type</th></tr></thead><tbody> <tr><td>A file system path</td><td>The result type is <code>Unit</code>, the redirection happens on OS level</td></tr> <tr><td>Sink</td><td>The result type is <code>Unit</code>, only the sink's side effect matters</td></tr> <tr><td>Pipe with monoid elem type</td><td>The stream is folded by the monoid, the result type is <code>T</code></td></tr> <tr><td>Pipe with non-monoid elem type</td><td>The stream captures the elements in a vector, the result type is <code>Vector[T]</code></td></tr> <tr><td>Custom fold function</td><td>The result type is the function's result type</td></tr> </tbody></table> <p>The <code>CanBeProcessOutputTarget</code> type class we've seen earlier defines both the stream element type and the result type:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> CanBeProcessOutputTarget</span><span>[</span><span style="color:#c18401;">To</span><span>] </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">/** Output stream element type */ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">type </span><span style="color:#c18401;">Out </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">/** Result type of running the output stream */ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">type </span><span style="color:#c18401;">OutResult </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> 
</span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">apply</span><span style="color:#c18401;">(</span><span style="color:#e45649;">to</span><span style="color:#c18401;">: To): ProcessOutputTarget[Out, OutResult] </span><span style="color:#c18401;">} </span></code></pre> <p><code>ProcessOutputTarget</code> contains the actual IO code to build the redirection of the streams; I won't go into the details in this post. Note that there are similar type classes for <em>error</em> and <em>input</em> redirection too.</p> <p>For two processes piped together we have to provide <em>two</em> <code>RunningProcess</code> instances with the proper result type parameters. So we can see that it is not enough that the <em>redirection</em> stores the result type in the process type; the <em>start</em> method must be dependently typed too.</p> <p>One way to encode this in the type system would be something like this (simplified):</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">p1 </span><span style="color:#a626a4;">= </span><span>Process() </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">p2 </span><span style="color:#a626a4;">= </span><span>Process() </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">p3 </span><span style="color:#a626a4;">= </span><span>Process() </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">rp1</span><span>: </span><span style="color:#c18401;">IO</span><span>[</span><span style="color:#c18401;">RunningProcess</span><span>] </span><span style="color:#a626a4;">=</span><span> p1.start </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">rp2</span><span>: </span><span style="color:#c18401;">IO</span><span>[(</span><span style="color:#c18401;">RunningProcess</span><span>, 
</span><span style="color:#c18401;">RunningProcess</span><span>)] </span><span style="color:#a626a4;">= </span><span>(p1 | p2).start </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">rp3</span><span>: </span><span style="color:#c18401;">IO</span><span>[(</span><span style="color:#c18401;">RunningProcess</span><span>, </span><span style="color:#c18401;">RunningProcess</span><span>, </span><span style="color:#c18401;">RunningProcess</span><span>)] </span><span style="color:#a626a4;">= </span><span>(p1 | p2 | p3).start </span></code></pre> <p>We encode piped processes with tuples of <code>RunningProcess</code> and a single process with a single <code>RunningProcess</code>. To implement this we can make use of the <a href="https://github.com/milessabin/shapeless">shapeless</a> library's <code>HList</code> implementation.</p> <p>HLists are heterogeneous lists; basically similar to a tuple, but with all the "usual" list-like functions implemented as dependently typed functions. An HList's type describes the types of all its elements, and you can split it into head and tail, append two of them, etc. 
And we can do it both on the <em>type level</em> (computing the result type of appending two <code>HList</code>s, for example) and on the <em>value level</em> (appending the two values, creating a third <code>HList</code> value).</p> <p>We can implement the <code>start</code> method more easily by building an <code>HList</code>, while still keeping the desired interface, as <a href="https://github.com/milessabin/shapeless">shapeless</a> implements a conversion from <code>HList</code> to tuples.</p> <p>We can define two separate <em>start functions</em>, one producing the tuples and another the <code>HList</code> (IO-related parameters omitted):</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">start</span><span>[</span><span style="color:#c18401;">RP</span><span>](</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">start</span><span>: </span><span style="color:#c18401;">Start</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">PN</span><span>, </span><span style="color:#c18401;">RP</span><span>, </span><span style="color:#e45649;">_</span><span>]): </span><span style="color:#c18401;">IO</span><span>[</span><span style="color:#c18401;">RP</span><span>] </span><span style="color:#a626a4;">= ??? 
</span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">startHL</span><span>[</span><span style="color:#c18401;">RPL </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">HList</span><span>](</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">start</span><span>: </span><span style="color:#c18401;">Start</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">PN</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#c18401;">RPL</span><span>]): </span><span style="color:#c18401;">IO</span><span>[</span><span style="color:#c18401;">RPL</span><span>] = </span><span style="color:#e45649;">??? </span></code></pre> <p>The <code>Start</code> type class calculates both the tupled and the <code>HList</code> version's result type. The implementation's responsibility is to start the actual system processes and wire the streams together.</p> <p>The interesting part is how we use <em>type level calculations</em> from <a href="https://github.com/milessabin/shapeless">shapeless</a> to calculate the tuple and <code>HList</code> types for piped processes. This is all done using the technique I described earlier, but it may look a bit shocking at first. 
Let's take a look!</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">implicit def </span><span style="color:#0184bc;">startPipedProcess</span><span>[ </span><span> </span><span style="color:#c18401;">Out</span><span>, </span><span style="color:#c18401;">Err</span><span>, </span><span> </span><span style="color:#c18401;">PN1 </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ProcessNode</span><span>[</span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>], </span><span> </span><span style="color:#c18401;">PN2 </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">ProcessNode</span><span>[</span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>], </span><span> </span><span style="color:#c18401;">IRS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>, </span><span style="color:#c18401;">ORS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>, </span><span style="color:#c18401;">ERS </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RedirectionState</span><span>, </span><span> </span><span style="color:#c18401;">RP1</span><span>, </span><span style="color:#c18401;">RPL1 </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">HList</span><span>, </span><span style="color:#c18401;">RP1Last </span><span style="color:#a626a4;">&lt;: </span><span 
style="color:#c18401;">RunningProcess</span><span>[</span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>], </span><span> </span><span style="color:#c18401;">RP2</span><span>, </span><span style="color:#c18401;">RPL2 </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">HList</span><span>, </span><span style="color:#c18401;">RP2Head </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RunningProcess</span><span>[</span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>], </span><span style="color:#c18401;">RP2Tail </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">HList</span><span>, </span><span> </span><span style="color:#c18401;">RPT</span><span>, </span><span style="color:#c18401;">RPL </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">HList</span><span>] </span><span> (</span><span style="color:#a626a4;">implicit </span><span> </span><span style="color:#e45649;">start1</span><span>: </span><span style="color:#c18401;">Start</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">PN1</span><span>, </span><span style="color:#c18401;">RP1</span><span>, </span><span style="color:#c18401;">RPL1</span><span>], </span><span> </span><span style="color:#e45649;">start2</span><span>: </span><span style="color:#c18401;">Start</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">PN2</span><span>, </span><span style="color:#c18401;">RP2</span><span>, </span><span style="color:#c18401;">RPL2</span><span>], </span><span> </span><span style="color:#e45649;">last1</span><span>: </span><span style="color:#c18401;">Last</span><span>.</span><span 
style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">RPL1</span><span>, </span><span style="color:#c18401;">RP1Last</span><span>], </span><span> </span><span style="color:#e45649;">rp1LastType</span><span>: </span><span style="color:#c18401;">RP1Last </span><span>&lt;:&lt; </span><span style="color:#c18401;">RunningProcess</span><span>[</span><span style="color:#a626a4;">Byte</span><span>, </span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>], </span><span> </span><span style="color:#e45649;">hcons2</span><span>: </span><span style="color:#c18401;">IsHCons</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">RPL2</span><span>, </span><span style="color:#c18401;">RP2Head</span><span>, </span><span style="color:#c18401;">RP2Tail</span><span>], </span><span> </span><span style="color:#e45649;">prepend</span><span>: </span><span style="color:#c18401;">Prepend</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">RPL1</span><span>, </span><span style="color:#c18401;">RPL2</span><span>, </span><span style="color:#c18401;">RPL</span><span>], </span><span> </span><span style="color:#e45649;">tupler</span><span>: </span><span style="color:#c18401;">Tupler</span><span>.</span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">RPL</span><span>, </span><span style="color:#c18401;">RPT</span><span>]): </span><span> </span><span style="color:#c18401;">Aux</span><span>[</span><span style="color:#c18401;">PipedProcess</span><span>[</span><span style="color:#c18401;">Out</span><span>, </span><span style="color:#c18401;">Err</span><span>, </span><span style="color:#a626a4;">Byte</span><span>, </span><span style="color:#c18401;">PN1</span><span>, </span><span style="color:#c18401;">PN2</span><span>, </span><span style="color:#c18401;">IRS</span><span>, </span><span 
style="color:#c18401;">ORS</span><span>, </span><span style="color:#c18401;">ERS</span><span>], </span><span style="color:#c18401;">RPT</span><span>, </span><span style="color:#c18401;">RPL</span><span>] </span><span style="color:#a626a4;">= </span><span> </span><span> </span><span style="color:#a626a4;">new </span><span style="color:#c18401;">Start</span><span>[</span><span style="color:#c18401;">PipedProcess</span><span>[</span><span style="color:#c18401;">Out</span><span>, </span><span style="color:#c18401;">Err</span><span>, </span><span style="color:#a626a4;">Byte</span><span>, </span><span style="color:#c18401;">PN1</span><span>, </span><span style="color:#c18401;">PN2</span><span>, </span><span style="color:#c18401;">IRS</span><span>, </span><span style="color:#c18401;">ORS</span><span>, </span><span style="color:#c18401;">ERS</span><span>]] { </span><span> </span><span style="color:#a626a4;">override type </span><span>RunningProcesses </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">RPT </span><span> </span><span style="color:#a626a4;">override type </span><span>RunningProcessList </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">RPL </span><span> </span><span> </span><span style="color:#a0a1a7;">// ... </span><span> } </span></code></pre> <p>The way to parse this is to follow the type level computations performed through the <em>Aux types</em> in the implicit parameter list:</p> <ul> <li><code>PN1</code> and <code>PN2</code> are the types of the two processes piped together.</li> <li>The first two implicit definitions calculate the <em>running process tuple</em> and the <em>running process HList</em> types of these individual process nodes and "store" the results in the <code>RP1</code>, <code>RPL1</code>, <code>RP2</code> and <code>RPL2</code> type parameters. 
For example, if the two processes piped together are single <code>Process</code> instances, then <code>RP1</code> and <code>RP2</code> would be some kind of <code>RunningProcess</code>, and the HLists would be one element long, like <code>RunningProcess :: HNil</code>.</li> <li>The <code>last1</code> implicit parameter is a type level <em>last</em> function on the first process's <code>HList</code>. This is required because <code>PN1</code> itself can also be a sequence of piped processes, and we are connecting <code>PN2</code> to the <strong>last</strong> of these. The <code>RP1Last</code> type parameter becomes the <em>type</em> of the <em>last running process</em> of the first process node.</li> <li>The next line, <code>rp1LastType</code>, is an additional constraint fixing the <em>output stream element type</em> of <code>RP1Last</code> to <code>Byte</code>. The piping implementation is not able to connect streams of arbitrary element types, as the <em>process input</em> is always required to be a <em>byte stream</em>.</li> <li><code>hcons2</code> is similar to <code>last1</code>, but here we are calculating the type level <em>head type</em> of the <code>HList</code> called <code>RPL2</code>. The head will be in <code>RP2Head</code> and the tail <code>HList</code> in <code>RP2Tail</code>.</li> <li>In the <code>prepend</code> step we concatenate <code>RPL1</code> with <code>RPL2</code> using the <code>Prepend</code> operation; the resulting <code>HList</code> type is in <code>RPL</code>. This is the <code>HList</code> representation of the piped running process.</li> <li>Finally we use the <code>Tupler</code> operation to calculate the tuple type from the <code>HList</code>, and store it in <code>RPT</code>.</li> </ul> <p>The compiler performs the type level calculations and we can use the result types <code>RPT</code> and <code>RPL</code> to actually implement the <em>start typeclass</em>. 
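</p> <p>The chaining of implicit parameters through <em>Aux types</em> can also be shown in isolation. The following is only a minimal sketch in plain Scala, not code from prox or shapeless: <code>Second</code>, <code>forPair</code>, <code>secondAsString</code> and <code>AuxSketch</code> are hypothetical names invented for this illustration. The first implicit parameter calculates the type <code>O</code>, and the second one constrains the calculated type, similar to what <code>rp1LastType</code> does in the list above:</p>
<pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala">// Hypothetical typeclass: computes the type (and value) of a pair's second element.
trait Second[P] {
  type Out
  def apply(pair: P): Out
}

object Second {
  // The Aux alias exposes the type member as a type parameter,
  // so later implicit parameters can refer to it.
  type Aux[P, O] = Second[P] { type Out = O }

  implicit def forPair[A, B]: Aux[(A, B), B] =
    new Second[(A, B)] {
      type Out = B
      def apply(pair: (A, B)): B = pair._2
    }
}

object AuxSketch {
  // `second` calculates O; `ev` then requires the calculated type to be a String.
  def secondAsString[P, O](pair: P)(implicit second: Second.Aux[P, O], ev: O &lt;:&lt; String): String =
    ev(second(pair))

  def main(args: Array[String]): Unit =
    println(secondAsString((1, &quot;hello&quot;))) // prints: hello
}
</code></pre>
<p>With these definitions <code>secondAsString((1, &quot;hello&quot;))</code> compiles and returns <code>hello</code>, while <code>secondAsString((&quot;hello&quot;, 1))</code> should be rejected at compile time, because no <code>Int &lt;:&lt; String</code> evidence exists.</p> <p>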
The <code>startPipedProcess</code> instance is the most complicated type level calculation in the library.</p> <h2 id="final-thoughts">Final thoughts</h2> <p>As we've seen, Scala's type system can bring us quite far in expressing a dependently typed interface. On the other hand, writing and reading code in this style is really hard, and if things go wrong, decoding the compiler's error messages is not an easy task either. This is a serious tradeoff that has to be considered, and in many cases a more dynamic but much more readable and maintainable approach can be better.</p> <p>With <a href="https://github.com/vigoo/prox">prox</a> I explicitly wanted to explore these features of the Scala language.</p> <p>In the next posts we will ignore the type level parts of the library and focus on different <em>streaming</em> and <em>effect</em> libraries.</p> AWS rate limits vs prezidig 2018-09-21T00:00:00+00:00 2018-09-21T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/aws-rate-limits-prezidig/ <p>At <a href="https://prezi.com">Prezi</a>, we have an internal tool called <strong>prezidig</strong> for discovering AWS resources. I like it a lot so I was quite annoyed recently that it always fails with a <em>throttling exception</em> because of our increased use of the AWS API. 
It made it completely unusable, so I decided to try to fix this.</p> <p>Then I decided to write the story in this blog post, as the steps I had to make to achieve the results I aimed for can be useful for writing maintainable, fast and safe Scala code in the future.</p> <p>I will describe the phases as they happened, as I did not really know anything about this codebase, so the path to success was not immediately clear.</p> <h2 id="wrapping-the-calls">Wrapping the calls</h2> <p>So my initial thought was to just find the AWS API calls and wrap them in a helper function which catches the throttling error and retries with an increasing delay.</p> <p>I basically wrote this in the base class of all the <em>mirrors</em> (the classes which are responsible for fetching AWS and other resource data for <strong>prezidig</strong>):</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span> </span><span style="color:#a626a4;">protected def </span><span style="color:#0184bc;">byHandlingThrottling</span><span>[</span><span style="color:#c18401;">T</span><span>](</span><span style="color:#e45649;">awsCall</span><span>: </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">T</span><span>): </span><span style="color:#c18401;">Future</span><span>[</span><span style="color:#c18401;">T</span><span>] </span><span style="color:#a626a4;">= </span><span>{ </span><span> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">call</span><span>(</span><span style="color:#e45649;">remainingTries</span><span>: </span><span style="color:#a626a4;">Int</span><span>, </span><span style="color:#e45649;">wait</span><span>: </span><span style="color:#c18401;">FiniteDuration</span><span>): </span><span style="color:#c18401;">Future</span><span>[</span><span style="color:#c18401;">T</span><span>] </span><span style="color:#a626a4;">= </span><span>{ 
</span><span> Future(Try(awsCall)).flatMap { </span><span> </span><span style="color:#a626a4;">case </span><span>Success(</span><span style="color:#e45649;">result</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span>Future.successful(result) </span><span> </span><span style="color:#a626a4;">case </span><span>Failure(</span><span style="color:#e45649;">awsException</span><span>: </span><span style="color:#c18401;">AmazonServiceException</span><span>) </span><span style="color:#a626a4;">if</span><span> awsException.getErrorCode </span><span style="color:#a626a4;">== </span><span style="color:#50a14f;">&quot;Throttling&quot;</span><span> &amp;&amp; remainingTries &gt; </span><span style="color:#c18401;">0 </span><span style="color:#a626a4;">=&gt; </span><span> akka.pattern.after(wait, actorSystem.scheduler) { </span><span> call(remainingTries - </span><span style="color:#c18401;">1</span><span>, wait * </span><span style="color:#c18401;">2</span><span>) </span><span> } </span><span> </span><span style="color:#a626a4;">case </span><span>Failure(</span><span style="color:#e45649;">reason</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span>Future.failed(reason) </span><span> } </span><span> } </span><span> call(</span><span style="color:#c18401;">10</span><span>, </span><span style="color:#c18401;">100</span><span>.millis) </span><span style="color:#a0a1a7;">// TODO: make configurable </span><span> } </span></code></pre> <p>Then the only thing I had to do was to wrap all the existing AWS calls with this. Then I realized that it would not be that simple, as these calls were not always asynchronous, just sometimes. To see an example, for an <em>ElasticBeanstalk application</em>, it fetches the <em>application metadata</em> with a synchronous call, then fetches the related <em>EB environments</em> asynchronously. 
The whole thing might be wrapped in another future somewhere else, but that’s a different story.</p> <p>While making these discoveries I also found several synchronization points, like the code waiting for some futures to complete in a blocking way. I also found that the model is mutable. So… just for trying this out, I <em>still <strong>wrapped</strong></em> all the AWS calls with this stuff, by converting the future back to a synchronous call by immediately blocking on it.</p> <p>What did I achieve with this? Well, some throttling errors were fixed, the code became extremely ugly, and I could not even wrap everything so the errors remained, and because of the tons of blocking, timeouts, etc. it was basically impossible to understand whether this would work or deadlock or just be slow.</p> <p>That was the point where I decided to do this properly.</p> <h2 id="reflection">Reflection</h2> <p>Before solving the real problem I found that the mirrors are initialized via reflection, something like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">buildMirrors</span><span>[</span><span style="color:#c18401;">A </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RegionAwareAWSMirror</span><span>[</span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>]](</span><span style="color:#a626a4;">implicit </span><span style="color:#e45649;">mf</span><span>: </span><span style="color:#c18401;">Manifest</span><span>[</span><span style="color:#c18401;">A</span><span>]): </span><span style="color:#c18401;">Seq</span><span>[</span><span style="color:#c18401;">A</span><span>] </span><span style="color:#a626a4;">= </span><span> Config.regions.map(</span><span style="color:#e45649;">region </span><span style="color:#a626a4;">=&gt;</span><span> 
mf.runtimeClass.getConstructor(classOf[</span><span style="color:#c18401;">String</span><span>]).newInstance(region).asInstanceOf[</span><span style="color:#c18401;">A</span><span>]) </span></code></pre> <p>This is something that you should avoid, as it leads to problems that are not detected by the compiler, only at runtime, every time you refactor something around these classes. There are some use cases where this may be required, like dynamically loading plugins or stuff like this, but to just have a factory for something, it is much simpler to use… <strong>functions</strong>!</p> <p>So I could not hold myself back and quickly changed this to:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">buildMirrors</span><span>[</span><span style="color:#c18401;">A </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">RegionAwareAWSMirror</span><span>[</span><span style="color:#e45649;">_</span><span>, </span><span style="color:#e45649;">_</span><span>]](</span><span style="color:#e45649;">factory</span><span>: (</span><span style="color:#c18401;">String</span><span>, </span><span style="color:#c18401;">ActorSystem</span><span>) </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">A</span><span>) </span><span style="color:#a626a4;">= </span><span> Config.regions.map(</span><span style="color:#e45649;">region </span><span style="color:#a626a4;">=&gt;</span><span> factory(region, system)) </span></code></pre> <p>(Since then even this has disappeared, but let’s not run that far ahead.)</p> <h2 id="async-fetching">Async fetching</h2> <p>Ok so the first obvious step was to refactor the whole fetching code in a way that it is just a chain of <strong>futures</strong>. 
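</p> <p>As a rough illustration of what such a chain looks like, here is a small self-contained sketch using only the standard library. The names <code>describeApplication</code>, <code>describeEnvironments</code>, <code>Application</code> and <code>Environment</code> are hypothetical stand-ins, not prezidig or AWS SDK types. A blocking call is lifted into a <code>Future</code> and composed with an already asynchronous one, and blocking happens only once, at the edge of the program:</p>
<pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala">import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object FutureChainSketch {
  // Hypothetical stand-ins for an application and its environments.
  final case class Environment(name: String)
  final case class Application(name: String, environments: List[Environment])

  // A blocking call, as a synchronous SDK call would be.
  def describeApplication(id: String): String = s&quot;app-$id&quot;

  // An already asynchronous call.
  def describeEnvironments(appName: String)(implicit ec: ExecutionContext): Future[List[Environment]] =
    Future(List(Environment(s&quot;$appName-prod&quot;), Environment(s&quot;$appName-test&quot;)))

  // The blocking call is lifted into a Future, then both steps are chained.
  def fetch(id: String)(implicit ec: ExecutionContext): Future[Application] =
    for {
      name &lt;- Future(describeApplication(id))
      environments &lt;- describeEnvironments(name)
    } yield Application(name, environments)

  def main(args: Array[String]): Unit = {
    import ExecutionContext.Implicits.global
    // Blocking happens only here, at the very end.
    println(Await.result(fetch(&quot;42&quot;), 5.seconds))
  }
}
</code></pre>
<p>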
By making everything async in the process, the AWS calls would be simply replaceable with the throttling function above or anything more sophisticated!</p> <p>But I knew that I could not safely do this while the model we are building is itself mutable - there is no way I want to debug what happens with it once all the steps are really becoming parallel!</p> <h3 id="immutable-model">Immutable model</h3> <p>I believe the following GitHub diff captures the core change of this step:</p> <img src="/images/prezidig-img-1.png" width="800"/> <p>Of course I had to change all the subtypes of Model, and I went through the code looking for</p> <ul> <li><strong>var</strong>s</li> <li>mutable collections</li> </ul> <p>and got rid of them, except for the caching constructs: I planned to refactor those in the next step, so for now I left them alone.</p> <h3 id="async-mirrors">Async mirrors</h3> <p>Once I felt the model was safe enough, I moved on to the next big change, making everything asynchronous.</p> <img src="/images/prezidig-img-2.png" width="800"/> <p>This took some hours, to be honest. But really, the core idea is only that the result must be a <code>Future[T]</code>, not <code>T</code>.</p> <p>So how do you refactor code that was previously half synchronous, half asynchronous to achieve this? Let’s see an example! 
It will be the <em>key-pair mirror</em> as it is the smallest.</p> <p>Originally (with my ugly wrapping in the previous step) it looked like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span> </span><span style="color:#a626a4;">override protected def </span><span style="color:#0184bc;">fetch</span><span>(</span><span style="color:#e45649;">input</span><span>: </span><span style="color:#c18401;">SimpleParsedInput</span><span>, </span><span style="color:#e45649;">context</span><span>: </span><span style="color:#c18401;">Context</span><span>): </span><span style="color:#c18401;">Seq</span><span>[</span><span style="color:#c18401;">KeyPair</span><span>] </span><span style="color:#a626a4;">= </span><span> </span><span style="color:#a626a4;">try </span><span>{ </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">futureResult </span><span style="color:#a626a4;">=</span><span> byHandlingThrottling( </span><span> buildClient(AmazonEC2ClientBuilder.standard()).describeKeyPairs( </span><span> </span><span style="color:#a626a4;">new </span><span style="color:#c18401;">DescribeKeyPairsRequest</span><span>().withKeyNames(input.id) </span><span> )) </span><span> </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">result </span><span style="color:#a626a4;">= </span><span>Await.result(futureResult, </span><span style="color:#c18401;">10</span><span>.seconds) </span><span> result.getKeyPairs.asScala.map(</span><span style="color:#e45649;">info </span><span style="color:#a626a4;">=&gt; </span><span>KeyPair(info, region)).seq </span><span> .map(</span><span style="color:#e45649;">keypair </span><span style="color:#a626a4;">=&gt;</span><span> keypair.withFutureChildren(LaunchConfigurationMirror(region, actorSystem).apply(context.withInput(keypair.description.getKeyName)))) </span><span> } </span><span 
style="color:#a626a4;">catch </span><span>{ </span><span> </span><span style="color:#a626a4;">case </span><span style="color:#e45649;">_</span><span>: </span><span style="color:#c18401;">AmazonEC2Exception </span><span style="color:#a626a4;">=&gt; </span><span>Seq() </span><span> } </span></code></pre> <p>So as you can see, fetching the key pairs by name was a synchronous request, but then the <em>launch configurations</em> were fetched asynchronously and written back into the result model in a mutable way. We want to transform this function so it does not have any side effects, just performs a chain of asynchronous operations and in the end yields a fully fetched <em>key pair</em> with the related <em>launch configurations</em>.</p> <p>In every case the only thing needed was a combination of <code>map</code> and <code>flatMap</code> on futures, and of course the <em>for syntax</em> can also be used to make the code more readable:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span> </span><span style="color:#a626a4;">private def </span><span style="color:#0184bc;">fetchKeyPair</span><span>(</span><span style="color:#e45649;">client</span><span>: </span><span style="color:#c18401;">AmazonEC2</span><span>, </span><span style="color:#e45649;">context</span><span>: </span><span style="color:#c18401;">Context</span><span>, </span><span style="color:#e45649;">info</span><span>: </span><span style="color:#c18401;">KeyPairInfo</span><span>): </span><span style="color:#c18401;">Future</span><span>[</span><span style="color:#c18401;">KeyPair</span><span>] </span><span style="color:#a626a4;">= </span><span>{ </span><span> </span><span style="color:#a626a4;">for </span><span>{ </span><span> </span><span style="color:#e45649;">launchConfigurations </span><span style="color:#a626a4;">&lt;- </span><span>LaunchConfigurationMirror(region, 
actorSystem).apply(context.withInput(info.getKeyName)) </span><span> } </span><span style="color:#a626a4;">yield </span><span>KeyPair( </span><span> description </span><span style="color:#a626a4;">=</span><span> info, </span><span> region </span><span style="color:#a626a4;">=</span><span> region, </span><span> children </span><span style="color:#a626a4;">=</span><span> launchConfigurations </span><span> ) </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">override protected def </span><span style="color:#0184bc;">fetch</span><span>(</span><span style="color:#e45649;">input</span><span>: </span><span style="color:#c18401;">SimpleParsedInput</span><span>, </span><span style="color:#e45649;">context</span><span>: </span><span style="color:#c18401;">Context</span><span>): </span><span style="color:#c18401;">Future</span><span>[</span><span style="color:#c18401;">List</span><span>[</span><span style="color:#c18401;">KeyPair</span><span>]] </span><span style="color:#a626a4;">= </span><span>{ </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">client </span><span style="color:#a626a4;">=</span><span> buildClient(AmazonEC2ClientBuilder.standard()) </span><span> </span><span> byHandlingThrottling(client.describeKeyPairs(</span><span style="color:#a626a4;">new </span><span style="color:#c18401;">DescribeKeyPairsRequest</span><span>().withKeyNames(input.id))).flatMap { </span><span style="color:#e45649;">result </span><span style="color:#a626a4;">=&gt; </span><span> Future.sequence( </span><span> result.getKeyPairs.asScala.toList.map(fetchKeyPair(client, context, </span><span style="color:#e45649;">_</span><span>)) </span><span> ) </span><span> }.recover { </span><span> </span><span style="color:#a626a4;">case </span><span style="color:#e45649;">_</span><span>: </span><span style="color:#c18401;">AmazonEC2Exception </span><span style="color:#a626a4;">=&gt; </span><span>List() </span><span 
style="color:#a0a1a7;">// TODO: log? </span><span> } </span><span> } </span></code></pre> <p>Note that the <code>Future.sequence</code> function is quite useful in these scenarios, as it makes a <code>Future[List[T]]</code> from <code>List[Future[T]]</code>.</p> <p>Of course the code became more verbose because of all this chaining; this is the price of this transformation. It is also why I don’t like to express complex logic with a chain of futures, but rather with some higher level abstraction such as actors (or for this use case, streams would fit even better).</p> <p>But I wanted to make iterative changes, so I did this transformation on all the mirrors and eventually got a <code>Future[List[Model]]</code> in the main function that I could await. I also threw out the global atomic integer that counted the running tasks to detect completion, as in this model the completion of the composed future should mark the end of the whole computation.</p> <p>So did I succeed at this point? Of course not. Actually this whole thing is a big deadlock :)</p> <h2 id="caching-and-circular-references">Caching and circular references</h2> <p>It was not immediately obvious what caused the deadlock. In a system like this it can happen in different ways. For example I knew that there are global singleton caches in the code, protected by <strong>locks</strong>. This <em>could</em> cause deadlocks if all the executors got blocked and no new threads could be spawned by the active executor. I did not know if this was happening, but would not have been surprised at all, as many more things were happening in parallel because of the previous refactoring step.</p> <p>And circular references in the huge chained future graph can also lead to this. 
Let’s consider this simplified example:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> Cache { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">get</span><span style="color:#c18401;">(</span><span style="color:#e45649;">key</span><span style="color:#c18401;">: String): Future[Work] </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">put</span><span style="color:#c18401;">(</span><span style="color:#e45649;">key</span><span style="color:#c18401;">: String, </span><span style="color:#e45649;">compute</span><span style="color:#c18401;">: () </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">Future[Work]): </span><span style="color:#a626a4;">Unit </span><span style="color:#c18401;">} </span><span> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">cache</span><span>: </span><span style="color:#c18401;">Cache </span><span style="color:#a626a4;">= ??? 
</span><span style="color:#a626a4;">val </span><span style="color:#e45649;">work1</span><span>: </span><span style="color:#c18401;">Future</span><span>[</span><span style="color:#c18401;">Work</span><span>] </span><span style="color:#a626a4;">=</span><span> cache.get(</span><span style="color:#50a14f;">&quot;work2&quot;</span><span>).map { </span><span style="color:#e45649;">w2 </span><span style="color:#a626a4;">=&gt; </span><span>Work(</span><span style="color:#0184bc;">s</span><span style="color:#50a14f;">&quot;Hello </span><span style="color:#e45649;">$w2</span><span style="color:#50a14f;">&quot;</span><span>) } </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">work2</span><span>: </span><span style="color:#c18401;">Future</span><span>[</span><span style="color:#c18401;">Work</span><span>] </span><span style="color:#a626a4;">=</span><span> cache.get(</span><span style="color:#50a14f;">&quot;work1&quot;</span><span>).map { </span><span style="color:#e45649;">w1 </span><span style="color:#a626a4;">=&gt; </span><span>Work(</span><span style="color:#0184bc;">s</span><span style="color:#50a14f;">&quot;Hello </span><span style="color:#e45649;">$w1</span><span style="color:#50a14f;">&quot;</span><span>) } </span><span> </span><span>cache.put(</span><span style="color:#50a14f;">&quot;work1&quot;</span><span>, () </span><span style="color:#a626a4;">=&gt; </span><span>work1) </span><span>cache.put(</span><span style="color:#50a14f;">&quot;work2&quot;</span><span>, () </span><span style="color:#a626a4;">=&gt; </span><span>work2) </span><span> </span><span>println(Await.result(work1, </span><span style="color:#c18401;">1</span><span>.second)) </span></code></pre> <p>This can never work. If you think about what <strong>prezidig</strong> does, you will have a feeling that this happens. A lot.</p> <p>But let’s go in order.</p> <h3 id="non-blocking-cache">Non-blocking cache</h3> <p>First I wanted to get rid of the global, lock-protected mutable maps used as caches, and have a non-blocking implementation with more control and better performance and safety. 
This is the kind of job that an <strong>actor</strong> can model nicely, so I created a <em>model cache actor</em> that is spawned for <em>each mirror</em> and can store and retrieve lists of AWS models for a given key.</p> <p>I won’t list the whole actor’s code here, let’s see the messages it consumes:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span> </span><span style="color:#a626a4;">sealed trait</span><span style="color:#c18401;"> ModelCacheMessage</span><span>[</span><span style="color:#c18401;">M </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">Model</span><span>] </span><span> </span><span> </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> Put</span><span>[</span><span style="color:#c18401;">M </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">Model</span><span>](</span><span style="color:#e45649;">key</span><span>: </span><span style="color:#c18401;">String</span><span>, </span><span style="color:#e45649;">value</span><span>: </span><span style="color:#c18401;">List</span><span>[</span><span style="color:#c18401;">M</span><span>]) </span><span> </span><span style="color:#a626a4;">extends </span><span>ModelCacheMessage[</span><span style="color:#c18401;">M</span><span>] </span><span> </span><span> </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> FetchFailed</span><span>[</span><span style="color:#c18401;">M </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">Model</span><span>](</span><span style="color:#e45649;">key</span><span>: </span><span style="color:#c18401;">String</span><span>, </span><span style="color:#e45649;">failure</span><span>: </span><span style="color:#c18401;">Failure</span><span>[</span><span style="color:#e45649;">_</span><span>]) </span><span> </span><span 
style="color:#a626a4;">extends </span><span>ModelCacheMessage[</span><span style="color:#c18401;">M</span><span>] </span><span> </span><span> </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> GetOrFetch</span><span>[</span><span style="color:#c18401;">M </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">Model</span><span>](</span><span style="color:#e45649;">key</span><span>: </span><span style="color:#c18401;">String</span><span>, </span><span style="color:#e45649;">fetch</span><span>: () </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">Future</span><span>[</span><span style="color:#c18401;">List</span><span>[</span><span style="color:#c18401;">M</span><span>]], </span><span style="color:#e45649;">respondTo</span><span>: </span><span style="color:#c18401;">ActorRef</span><span>[</span><span style="color:#c18401;">Try</span><span>[</span><span style="color:#c18401;">List</span><span>[</span><span style="color:#c18401;">M</span><span>]]]) </span><span> </span><span style="color:#a626a4;">extends </span><span>ModelCacheMessage[</span><span style="color:#c18401;">M</span><span>] </span><span> </span><span> </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> GetRefOrFetch</span><span>[</span><span style="color:#c18401;">M </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">Model</span><span>](</span><span style="color:#e45649;">key</span><span>: </span><span style="color:#c18401;">String</span><span>, </span><span style="color:#e45649;">fetch</span><span>: () </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">Future</span><span>[</span><span style="color:#c18401;">List</span><span>[</span><span style="color:#c18401;">M</span><span>]], </span><span style="color:#e45649;">respondTo</span><span>: </span><span style="color:#c18401;">ActorRef</span><span>[</span><span 
style="color:#c18401;">ModelRef</span><span>[</span><span style="color:#c18401;">M</span><span>]]) </span><span> </span><span style="color:#a626a4;">extends </span><span>ModelCacheMessage[</span><span style="color:#c18401;">M</span><span>] </span><span> </span><span> </span><span style="color:#a626a4;">final case class</span><span style="color:#c18401;"> Dump</span><span>[</span><span style="color:#c18401;">M </span><span style="color:#a626a4;">&lt;: </span><span style="color:#c18401;">Model</span><span>](</span><span style="color:#e45649;">respondTo</span><span>: </span><span style="color:#c18401;">ActorRef</span><span>[</span><span style="color:#c18401;">Map</span><span>[</span><span style="color:#c18401;">String</span><span>, </span><span style="color:#c18401;">List</span><span>[</span><span style="color:#c18401;">M</span><span>]]]) </span><span> </span><span style="color:#a626a4;">extends </span><span>ModelCacheMessage[</span><span style="color:#c18401;">M</span><span>] </span></code></pre> <p>This cache itself is responsible for executing the <em>fetch function</em> only if needed, when the value for the given key is not cached yet. It is done by using the <strong>pipe pattern</strong>: it starts the asynchronous fetch function on a configured worker executor (which can be the actor system, or a fixed thread pool, etc.) and registers an <code>onComplete</code> callback for the future which <em>pipes back</em> the future’s result to the actor as actor messages (<code>Put</code> and <code>FetchFailed</code>).</p> <p>I will talk about references and cache dumps in the next section.</p> <p>There was one more big problem with the existing code that prevented introducing these cache actors: the mirrors were not really singletons, as some mirrors created new instances of existing mirrors (without any difference to the ones created in the main function). These shared the singleton mutable lock-protected cache map in the original version, that’s why it worked. 
But in the new implementation each mirror spawned its own cache actor, so it was no longer allowed to create multiple instances of the same thing.</p> <p>So in this step I collected all the mirrors into a class called <code>Mirrors</code>, which later became the collection of all the resources needed to perform the “dig”, so in the final version it is called <code>DigSite</code>.</p> <p>With this change the caching could be replaced, and with the <strong>ask pattern</strong> I was able to fit it into the chain of futures created in the previous step.</p> <p>Did it solve the deadlock? No, of course not.</p> <h3 id="circular-references">Circular references</h3> <p>But now it was obvious that there were some circular references. And by simply drawing it, I could see that this is actually the core concept of the whole thing :)</p> <p>Let me show you <em>the drawing</em>:</p> <img src="/images/prezidig-img-3.png" width="800"/> <p>So everything refers back to everything; it is no surprise that this chained-together code cannot finish.</p> <p>To be honest, I was not sure how exactly it worked in the original version: whether the boundary between sync and async calls was carefully designed to make it work, or it just worked by accident.</p> <p>I wanted to have a solution where you don’t have to think about it so nobody will fuck it up the next time it has to be modified.</p> <p>The chosen solution can be summarized in the following way:</p> <ul> <li>The <em>models</em> only store <strong>references to other models</strong>, encoded by the <code>ModelRef</code> type. A reference basically selects a mirror (by its <em>cache</em>) and an item in it by its <em>key</em>.</li> <li>When fetching a model, you immediately get back a <em>model reference</em> from the cache so it can be stored in the owner model, even with circular references. 
The real data is still fetched and cached as before.</li> <li>This works because nobody uses the actual child models until the <strong>rendering</strong> of the output. So we have the asynchronous, parallel fetching of all the models, and then a completely separate, non-async step where we need the real connections to actually render the output based on the templates. I could have changed how the rendering works to query the model references from the cache, but I did not want to touch that part. So I introduced a middle step where all the <em>model cache actors</em> <strong>dump</strong> their state to simple immutable maps, and then the model gets <em>updated</em> by selecting the referenced models from this map and changing a field. Yes, a mutable field. It is a non-threadsafe operation that has a single, well defined place to be called, and this way the whole third part (rendering the output) could remain untouched.</li> <li>Because the actual fetching is decoupled from the result future (which completes earlier, as it only needs the references!), I had to have something that keeps track of the ongoing tasks run by the cache actors, so there is also a <em>work monitor actor</em> that notifies the main logic once everything is complete.</li> </ul> <p>Considering all this, the main steps before starting to render the output look like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">val </span><span style="color:#e45649;">result </span><span style="color:#a626a4;">= for </span><span>{ </span><span> </span><span style="color:#e45649;">models </span><span style="color:#a626a4;">&lt;-</span><span> runRelevantMirrors(digSite.allMirrors, Context.initial(input)) </span><span> </span><span style="color:#e45649;">fetchingDone </span><span style="color:#a626a4;">&lt;-</span><span> digSite.workMonitor ? 
WorkMonitor.WaitForReady </span><span> </span><span style="color:#e45649;">cacheDumps </span><span style="color:#a626a4;">&lt;- </span><span>CacheDumps.fromMirrors(digSite.allMirrors) </span><span> </span><span style="color:#e45649;">_ </span><span style="color:#a626a4;">=</span><span> models.foreach(</span><span style="color:#e45649;">_</span><span>.resolveChildren(cacheDumps)) </span><span style="color:#a0a1a7;">// side effect! </span><span>} </span><span style="color:#a626a4;">yield</span><span> models </span></code></pre> <h2 id="anyone-else-blocking">Anyone else blocking?</h2> <p>At this point the tool started to work again and produce results. So I went back checking if any other blocking code remained that can be implemented in other ways. The progress tracker was like that, it had mutable state and locks, so I converted that to an actor too. It was quite simple, and on the usage side almost nothing changed compared to the original.</p> <h2 id="and-what-about-the-throttling">And what about the throttling?</h2> <p>Ok so at this point I refactored the whole stuff but still did not solve the throttling issue, right?</p> <p>Right.</p> <p>But now finally I knew how to do it!</p> <p>I already wrapped all AWS calls with that specific function (and at this point it was really <em>all</em> calls, not just <em>almost)</em>. 
So I just had to write it in a better way.</p> <p>I wanted to:</p> <ul> <li>Have control over how many AWS requests we are doing in parallel</li> <li>In case of throttling errors, delay <strong>everything</strong> as soon as possible</li> </ul> <p>This can be achieved easily with some standard patterns: treating AWS as an encapsulated resource, putting some circuit breaking logic in it, and explicitly distributing the work among multiple workers.</p> <p>Let’s see the designed solution on a drawing:</p> <img src="/images/prezidig-img-4.png" width="800"/> <p><strong>Note</strong>: <em>classic Akka</em> has built-in support for this kind of routing and circuit breaking, but I prefer <em>Akka-typed</em> because of its type safety, even though it has no official reusable higher level components like this yet. The one I implemented here is quite specific; later it could be refactored to be built from more reusable typed actor components.</p> <p>So how does this work?</p> <ul> <li>There is a single coordinator actor called <strong>AWS</strong> and multiple (32 by default) worker actors called <strong>AWS Worker</strong>.</li> <li>The number of worker actors controls the maximum number of parallel AWS operations, because each worker actor is guaranteed to run at most one such operation at a time. All the other incoming requests are distributed among the workers and get enqueued.</li> <li>The AWS calls are executed on a different thread pool, not blocking the actors. Their result is sent back by the already mentioned <em>pipe to</em> pattern.</li> <li>AWS throttling errors are detected on the worker nodes, and the worker node immediately switches to <strong>open circuit state</strong> in which it does not start any new AWS command. 
The length of the open state increases with every throttling error, and is reset after a number of successful requests.</li> <li>Opening the circuit breaker on one worker node is immediately followed by opening it on <strong>all other</strong> worker nodes too, to stop overloading AWS.</li> </ul> <p>This could be further improved with more advanced logic but I believe it is good enough for our current purposes, and now we can use <strong>prezidig</strong> again!</p> Bari with Visual Studio Code 2016-01-21T00:00:00+00:00 2016-01-21T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/bari-vscode/ <h2 id="intro">Intro</h2> <p>A few weeks ago I discovered <a href="https://code.visualstudio.com/">Visual Studio Code</a> and started using it for some of my work. <em>(Note: I'm using multiple editors/IDEs all the time, based on the task; Emacs, Sublime, Atom, IntelliJ, VS, etc.)</em> So far <em>Code</em> is my favourite among the set of similar editors, such as Atom. I was pleasantly surprised by how well it works with its integrated <a href="http://www.omnisharp.net/">OmniSharp</a> plugin on <a href="http://vigoo.github.io/bari/">bari's</a> codebase, so I decided to try to write a <em>bari plugin</em> for it.</p> <p>Writing an extension for <em>Code</em> was a nice experience. The outcome is the <a href="https://marketplace.visualstudio.com/items/vigoo.bari">bari build management extension</a>, which I'll demonstrate in the next section.</p> <h2 id="developing-net-applications-with-visual-studio-code-and-bari">Developing .NET applications with Visual Studio Code and bari</h2> <p>As <em>Code</em> is multiplatform, and <em>bari</em> also works with <a href="http://www.mono-project.com/">Mono</a>, I'll demonstrate how you can use these tools to develop a .NET application (actually <em>bari</em> itself) on a Mac. 
The steps here (except installing Mono) would be the same on Windows or Linux as well.</p> <h3 id="installing-the-tools">Installing the tools</h3> <p>First, if you are not on Windows, you'll have to install the latest <a href="http://www.mono-project.com/">Mono</a> framework. On OSX I recommend using <a href="http://brew.sh/"><code>brew</code></a> to do that:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>brew install mono </span><span>mono --version </span></code></pre> <p>Then get the latest <a href="https://code.visualstudio.com/">Visual Studio Code</a> version, either by downloading it from its homepage or with <a href="https://github.com/caskroom/homebrew-cask"><code>brew cask</code></a>:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>brew cask install visual-studio-code </span></code></pre> <p>Get the latest <em>bari</em>. On Windows I recommend downloading and extracting the <a href="https://github.com/vigoo/bari/releases/latest">latest official release</a> and adding it to the <code>PATH</code>. On OSX, with <code>mono</code> we already have <code>nuget</code>, so let's use that:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>cd /opt </span><span>nuget install bari-mono </span><span>ln -s bari-mono.1.0.2.2 bari </span></code></pre> <p>and create a script to execute it somewhere in your <code>PATH</code>:</p> <pre data-lang="sh" style="background-color:#fafafa;color:#383a42;" class="language-sh "><code class="language-sh" data-lang="sh"><span style="color:#a0a1a7;">#!/bin/sh </span><span style="color:#e45649;">mono</span><span> /opt/bari/tools/bari.exe $</span><span style="color:#e45649;">@ </span></code></pre> <p>That's it. 
Future versions of the <em>bari extension</em> will probably be able to install <em>bari</em> itself.</p> <p>Let's start <em>Code</em> now!</p> <h3 id="installing-the-extension">Installing the extension</h3> <p>Open the <em>command palette</em> (F1, or ⇧⌘P) and type <code>ext install bari</code>.</p> <p><a href="/images/baricode1.png" class="zimg"><img width="600" src="/images/baricode1.png" alt="bari-code-1"></a></p> <h3 id="loading-the-project">Loading the project</h3> <p>After that, restart the editor. Have your bari-built project available somewhere. As we are going to develop bari itself, let's clone its repository:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>git clone https://github.com/vigoo/bari.git </span></code></pre> <p>Then open the resulting <code>bari</code> directory with <em>Code</em>. This should look like the following:</p> <p><a href="/images/baricode2.png" class="zimg"><img width="800" src="/images/baricode2.png" alt="bari-code-2"></a></p> <p>The <em>bari plugin</em> automatically detected that the opened folder has a <code>suite.yaml</code> in its root, and loaded it. That's why we can see the two sections on the statusbar's right side: <code>full</code> and <code>debug</code>. The first one is the <a href="https://github.com/vigoo/bari/wiki/Product">selected target product</a> and the second one is the <a href="https://github.com/vigoo/bari/wiki/Goal">selected goal</a>. 
All the <em>bari commands</em> provided by the extension will be executed with these settings.</p> <h3 id="changing-the-target">Changing the target</h3> <p>To change the active product or goal, you can click on the statusbar or use the <em>command palette</em> (F1, or ⇧⌘P) and choose <code>bari: Change goal</code> or <code>bari: Change target product</code>.</p> <p>Let's change the <em>goal</em> to <code>debug-mono</code>, as we are working in a non-Windows environment:</p> <p><a href="/images/baricode3.png" class="zimg"><img width="800" src="/images/baricode3.png" alt="bari-code-3"></a></p> <h3 id="generating-the-solution">Generating the solution</h3> <p>The next step before we start coding is to actually <strong>generate</strong> the solution and project files (and fetch the dependencies, etc.) so <em>OmniSharp</em> can load them and provide features such as code completion and analysis.</p> <p>To do so, just use the <em>command palette</em> and choose <code>bari: Regenerate solution</code>, which <a href="https://github.com/vigoo/bari/wiki/VsCommand">runs the <code>bari vs</code> command</a> with the correct parameters. The command's output is displayed in an <em>output panel</em> called <code>bari</code>. This looks like the following:</p> <p><a href="/images/baricode4.png" class="zimg"><img width="800" src="/images/baricode4.png" alt="bari-code-4"></a></p> <p>All that is left is to point <em>OmniSharp</em> to the generated solution, with the following command:</p> <p><a href="/images/baricode5.png" class="zimg"><img width="800" src="/images/baricode5.png" alt="bari-code-5"></a></p> <p>It will automatically find the generated <code>.sln</code> file; just select the correct one:</p> <p><a href="/images/baricode6.png" class="zimg"><img width="800" src="/images/baricode6.png" alt="bari-code-6"></a></p> <p>In a few seconds (and with a few warnings for this project), <em>OmniSharp</em> works. 
To see what it can do, <a href="https://code.visualstudio.com/Docs/languages/csharp">check this page</a>. A simple example is to jump to a given class or interface with ⌘P:</p> <p><a href="/images/baricode7.png" class="zimg"><img width="600" src="/images/baricode7.png" alt="bari-code-7"></a></p> <h3 id="working-on-the-project">Working on the project</h3> <p>You can work on the project and build it from <em>Code</em> or run its tests using the <code>bari: Build</code> and <code>bari: Test</code> commands. The build output will be shown just like in the <em>solution generation step</em>.</p> <p><a href="/images/baricode8.png" class="zimg"><img width="600" src="/images/baricode8.png" alt="bari-code-8"></a></p> <p>Whenever the suite definition itself must be modified, you can jump there with the <code>bari: Open suite.yaml</code> command and then just regenerate the solution as it was shown above.</p> <h2 id="implementation">Implementation</h2> <p>The implementation was really straightforward. The source code <a href="https://github.com/vigoo/bari-code">can be found here</a>. It's basically a <em>JSON</em> defining how the plugin is integrated and some implementation code in <em>TypeScript</em>. 
It's easy to run and debug the plugin from <em>Code</em> itself.</p> <p>For example the following section from the extension definition describes which events trigger the extension:</p> <pre data-lang="json" style="background-color:#fafafa;color:#383a42;" class="language-json "><code class="language-json" data-lang="json"><span style="color:#50a14f;">&quot;activationEvents&quot;</span><span>: [ </span><span> </span><span style="color:#50a14f;">&quot;onCommand:bari.build&quot;</span><span>, </span><span> </span><span style="color:#50a14f;">&quot;onCommand:bari.test&quot;</span><span>, </span><span> </span><span style="color:#50a14f;">&quot;onCommand:bari.vs&quot;</span><span>, </span><span> </span><span style="color:#50a14f;">&quot;onCommand:bari.openSuiteYaml&quot;</span><span>, </span><span> </span><span style="color:#50a14f;">&quot;onCommand:bari.selfUpdate&quot;</span><span>, </span><span> </span><span style="color:#50a14f;">&quot;onCommand:bari.goal.changeCurrentGoal&quot;</span><span>, </span><span> </span><span style="color:#50a14f;">&quot;onCommand:bari.goal.changeCurrentProduct&quot;</span><span>, </span><span> </span><span style="color:#50a14f;">&quot;workspaceContains:suite.yaml&quot; </span><span>], </span></code></pre> <p>The extension is activated either when one of the defined commands is invoked from the <em>command palette</em>, or when the opened workspace contains a <code>suite.yaml</code>. 
The latter enables the extension to parse the suite definition and initialize the statusbar immediately once the suite has been opened.</p> <p>The package definition also specifies the provided configuration values, such as:</p> <pre data-lang="json" style="background-color:#fafafa;color:#383a42;" class="language-json "><code class="language-json" data-lang="json"><span style="color:#50a14f;">&quot;bari.commandLine&quot;</span><span>: { </span><span> </span><span style="color:#50a14f;">&quot;type&quot;</span><span>: </span><span style="color:#50a14f;">&quot;string&quot;</span><span>, </span><span> </span><span style="color:#50a14f;">&quot;default&quot;</span><span>: </span><span style="color:#50a14f;">&quot;bari&quot;</span><span>, </span><span> </span><span style="color:#50a14f;">&quot;description&quot;</span><span>: </span><span style="color:#50a14f;">&quot;Command line to execute bari&quot; </span><span>}, </span><span style="color:#50a14f;">&quot;bari.verboseOutput&quot;</span><span>: { </span><span> </span><span style="color:#50a14f;">&quot;type&quot;</span><span>: </span><span style="color:#50a14f;">&quot;boolean&quot;</span><span>, </span><span> </span><span style="color:#50a14f;">&quot;default&quot;</span><span>: </span><span style="color:#c18401;">false</span><span>, </span><span> </span><span style="color:#50a14f;">&quot;description&quot;</span><span>: </span><span style="color:#50a14f;">&quot;Turns on verbose output for all the executed bari commands&quot; </span><span>} </span></code></pre> <p>The implementation itself is really simple: all the user interface elements involved, such as the console output window, the command palette and the statusbar panels, can be easily managed.</p> <p>For example the panel showing <code>bari</code>'s output is created by the following code snippet:</p> <pre data-lang="javascript" style="background-color:#fafafa;color:#383a42;" class="language-javascript "><code class="language-javascript" data-lang="javascript"><span 
style="color:#a626a4;">var </span><span style="color:#e45649;">channel </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">vscode</span><span>.</span><span style="color:#e45649;">window</span><span>.</span><span style="color:#0184bc;">createOutputChannel</span><span>(</span><span style="color:#50a14f;">&#39;bari&#39;</span><span>); </span><span style="color:#e45649;">channel</span><span>.</span><span style="color:#0184bc;">show</span><span>(); </span></code></pre> <p>Or to display the result of an operation:</p> <pre data-lang="javascript" style="background-color:#fafafa;color:#383a42;" class="language-javascript "><code class="language-javascript" data-lang="javascript"><span style="color:#e45649;">vscode</span><span>.</span><span style="color:#e45649;">window</span><span>.</span><span style="color:#0184bc;">showErrorMessage</span><span>(</span><span style="color:#50a14f;">&quot;No suite.yaml in the current workspace!&quot;</span><span>) </span></code></pre> <p>or to create the statusbar panel:</p> <pre data-lang="javascript" style="background-color:#fafafa;color:#383a42;" class="language-javascript "><code class="language-javascript" data-lang="javascript"><span style="color:#e45649;">this</span><span>.</span><span style="color:#e45649;">goals </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">vscode</span><span>.</span><span style="color:#e45649;">window</span><span>.</span><span style="color:#0184bc;">createStatusBarItem</span><span>(</span><span style="color:#e45649;">vscode</span><span>.</span><span style="color:#e45649;">StatusBarAlignment</span><span>.</span><span style="color:#e45649;">Right</span><span>); </span><span style="color:#e45649;">this</span><span>.</span><span style="color:#e45649;">goals</span><span>.</span><span style="color:#e45649;">command </span><span style="color:#a626a4;">= </span><span style="color:#50a14f;">&#39;bari.goal.changeCurrentGoal&#39;</span><span>; </span><span 
style="color:#e45649;">this</span><span>.</span><span style="color:#e45649;">goals</span><span>.</span><span style="color:#0184bc;">show</span><span>(); </span></code></pre> <p>This API is simple and well documented enough so basic integrations like this can be done in an hour.</p> Gradle-Haskell-plugin with experimental Stack support 2015-12-22T00:00:00+00:00 2015-12-22T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/gradle-haskell-plugin-stack/ <p>I've released a <strong>new version (0.4)</strong> of <a href="https://github.com/prezi/gradle-haskell-plugin">gradle-haskell-plugin</a> today, with <strong>experimental stack support</strong>. It is not enabled by default, but I used it exclusively for months and it seems to get quite stable. To use it you need <a href="https://haskellstack.com">stack</a>, have it enabled with <code>-Puse-stack</code> and have to keep some rules in your <code>.cabal</code> file, as explained <a href="https://github.com/prezi/gradle-haskell-plugin#explanation-stack-mode">in the README</a>.</p> <h2 id="how-does-it-work">How does it work?</h2> <p>The core idea did not change <a href="https://blog.vigoo.dev/posts/gradle-haskell-plugin/">compared to the original, cabal based solution</a>.</p> <p>To support chaining the binary artifacts, I had to add a new option to <em>stack</em> called <a href="https://github.com/commercialhaskell/stack/pull/990">extra package databases</a>. The databases listed in this section are passed <em>after the global</em> but <strong>before</strong> the snapshot and the local databases, which means that the snapshot database cannot be used (the packages in the binary artifacts are not "seeing" them). 
This sounds bad, but <em>gradle-haskell-plugin</em> applies a workaround; it <strong>generates</strong> the <code>stack.yaml</code> automatically, and in a way that:</p> <ul> <li>it disables snapshots on stack level (uses a resolver like <code>ghc-7.10.2</code>)</li> <li>lists all the dependencies explicitly in <code>extra-deps</code></li> <li>but it still figures out the <em>versions</em> of the dependencies (to be listed in <code>extra-deps</code>) based on a given <em>stackage snapshot</em>!</li> </ul> <p>With this approach we get the same behavior that was already proven in cabal mode, but with the advantage that the generated <code>stack.yaml</code> completely defines the project for any tool that knows stack. So after Gradle has extracted the dependencies and generated the <code>stack.yaml</code>, Gradle itself is no longer needed to successfully compile, run or test the project, which means that tools like IDE integration will work much better than with the more hacky cabal mode of the plugin.</p> Case Study - Haskell at Prezi 2015-09-21T00:00:00+00:00 2015-09-21T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/haskell-case-study/ <p>I wrote a <em>case study</em> for <a href="http://www.fpcomplete.com">FPComplete</a> on how we use Haskell at <a href="https://prezi.com">Prezi</a>. It is published <a href="https://www.fpcomplete.com/page/case-study-prezi">here</a>, but I'm just posting it here as well:</p> <p><a href="https://prezi.com">Prezi</a> is a cloud-based presentation and storytelling tool, based on a zoomable canvas. The company was founded in 2009, and today we have more than 50 million users, with more than 160 million prezis created.</p> <p>The company is using several different platforms and technologies; one of these is <em>Haskell</em>, which we are using server side, for code generation and for testing.</p> <h2 id="pdom">PDOM</h2> <p>Prezi's document format is continuously evolving as we add features to the application. 
It is very important for us that this format is handled correctly on all our supported platforms, and both on client and server side. To achieve this, we created an eDSL in Haskell that defines the schema of a Prezi. From this schema we are able to generate several artifacts.</p> <p>Most importantly we are generating a <em>Prezi Document Object Model (PDOM)</em> library for multiple platforms - Haxe (compiled to JS) code for the web, C++ code for the native platforms, and Haskell code for our tests, tools and the server side. These libraries are responsible for loading, updating, maintaining consistency and saving Prezis.</p> <p>This API also implements <em>collaborative editing</em> functionality by transparently synchronising document changes between multiple clients. This technique is called <a href="https://en.wikipedia.org/wiki/Operational_transformation">operational transformation (OT)</a>. We implemented the server side of this in Haskell; it supports clients from any of the supported platforms and it is connected to several other backend services.</p> <h2 id="benefits">Benefits</h2> <p>Using <em>Haskell</em> for this project turned out to have huge benefits.</p> <p>We are taking advantage of Haskell's capabilities to create embedded domain specific languages, using it to define the document's schema in our own eDSL which is used not only by Haskell developers but many others too.</p> <p>Haskell's clean and terse code allows us to describe document invariants and rules in a very readable way and the type system guarantees that we handle all the necessary cases, providing a stable base Haskell implementation which we can compare the other language backends to.</p> <p>It was also possible to define a set of merge laws for OT, which are verified whenever we introduce a new element to the document schema, guaranteeing that the collaboration functionality works correctly.</p> <p>We use the <em>QuickCheck</em> testing library on all levels. 
We can generate arbitrary Prezi documents and test serialization on all the backends. We are even generating arbitrary JavaScript code which uses our generated API to test random collaborative network sessions. These tests turned out to be critical for our success as they caught many interesting problems before we deployed anything to production.</p> Haskell plugin for Gradle 2015-04-22T00:00:00+00:00 2015-04-22T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/gradle-haskell-plugin/ <p>My team at <a href="https://prezi.com">Prezi</a> uses <strong>Haskell</strong> for several projects, which usually depend on each other, often with build steps using other languages such as Scala, C++ or Haxe. As <a href="https://gradle.org/">Gradle</a> is used heavily in the company, we decided to try to integrate our Haskell projects within Gradle.</p> <p>The result is <a href="https://github.com/prezi/gradle-haskell-plugin">Gradle Haskell Plugin</a>, which we have been using successfully for the last 2 months in our daily work, and which we have <em>open-sourced</em> recently.</p> <p>What makes this solution interesting is that it does not simply wrap <em>cabal</em> within Gradle tasks, but implements a way to define <strong>dependencies</strong> between Haskell projects and to upload the binary Haskell artifacts to a <em>repository</em> such as <a href="http://www.jfrog.com/open-source/">Artifactory</a>.</p> <p>This makes it easy to modularize our projects, publish them, and also works perfectly with <a href="https://github.com/prezi/pride">pride</a>, another <em>open-source</em> Prezi project. 
This means that we can work on a subset of our Haskell projects while the other dependencies are built on Jenkins, and it also integrates well with our non-Haskell projects.</p> <h2 id="how-does-it-work">How does it work?</h2> <p>The main idea is that we let <em>cabal</em> manage the Haskell packages, and handle whole Haskell <em>sandboxes</em> on Gradle level. 
So if you have a single Haskell project, it will be built using <em>cabal</em> and the resulting sandbox (the built project together with all the dependent cabal packages which are not installed in the <em>global package database</em>) will be packed/published as a Gradle <em>artifact</em>.</p> <p>This is not very interesting so far, but when you introduce dependencies on Gradle level, the plugin does something which (as far as I know) is not really done by anyone else, which I call <em>sandbox chaining</em>. This basically means that to compile the haskell project, the plugin will pass the package databases of all the dependent sandboxes to cabal and GHC, so for the actual sandbox only the packages which are <strong>not</strong> in any of the dependent sandboxes will be installed.</p> <h2 id="example">Example</h2> <p>Let's see an example scenario with <em>4 gradle-haskell projects</em>.</p> <p><a href="https://raw.githubusercontent.com/prezi/gradle-haskell-plugin/master/doc/gradle-haskell-plugin-drawing1.png" class="zimg"><img width="600" src="https://raw.githubusercontent.com/prezi/gradle-haskell-plugin/master/doc/gradle-haskell-plugin-drawing1.png" alt="gradle-haskell-plugin"></a></p> <p>The project called <em>Haskell project</em> depends on two other projects, which, taking into account the transitive dependencies, means it depends on <em>three other haskell projects</em>. Each project has its own haskell source and <em>cabal file</em>. Building this suite consists of the following steps:</p> <ul> <li><strong>dependency 1</strong> is built using only the <em>global package database</em>; everything <strong>not</strong> in that database, together with the compiled project, goes into its <code>build/sandbox</code> directory, which is a combination of a <em>GHC package database</em> and the project's build output. 
This is packed as <strong>dependency 1</strong>'s build artifact.</li> <li>For <strong>dependency 2</strong>, Gradle first downloads the build artifact of <em>dependency 1</em> and extracts it to <code>build/deps/dependency1</code>.</li> <li>Then it runs <a href="https://github.com/exFalso/sandfix">SandFix</a> on it</li> <li>And compiles the second project, now passing <strong>both</strong> the <em>global package database</em> and <strong>dependency 1</strong>'s sandbox to cabal/ghc. The result is that only the packages which are <strong>not</strong> in any of these two package databases will be installed in the project's own sandbox, which becomes the build artifact of <strong>dependency 2</strong>.</li> <li>For <strong>dependency 3</strong>, Gradle extracts both the direct dependency and the transitive dependency's sandbox, to <code>build/deps/dependency2</code> and <code>build/deps/dependency3</code>.</li> <li>Then it runs <a href="https://github.com/exFalso/sandfix">SandFix</a> on both the dependencies</li> <li>And finally passes three package databases to cabal/ghc to compile the project. 
Only those cabal dependencies that are in neither the global package database nor any of the dependent sandboxes will be installed into this sandbox.</li> <li>Finally, for <strong>Haskell project</strong> it goes the same way, but here we have three sandboxes, all chained together to make sure the built sandbox only contains what is not yet in the dependent sandboxes.</li> </ul> <p>For more information, check out <a href="https://github.com/prezi/gradle-haskell-plugin">the documentation</a>.</p> bari 1.0 released 2014-12-08T00:00:00+00:00 2014-12-08T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/bari-1-0/ <p>I already wrote about <a href="http://vigoo.github.io/bari">bari</a> in <a href="https://blog.vigoo.dev/posts/introducing-bari/">May</a>.</p> <p>As a reminder, <a href="http://vigoo.github.io/bari">bari</a> is a <em>build management system</em> primarily for .NET, trying to fix Visual Studio's bad parts while keeping the good ones.</p> <p>After more than two years of development, and being in production at <a href="http://www.kotem.com/">KOTEM</a> for almost half a year, bari reached a state where it can be considered a <em>stable</em> and <em>usable</em> first version.</p> <p>To mark this, today I released <strong>bari 1.0</strong>.</p> <p>Try it out and feel free to give any kind of feedback or ask any questions!</p> <p><img src="http://vigoo.github.io/bari/img/barilogo-small.png" alt="" /></p> ScalaFXML 0.2.2 available 2014-10-22T00:00:00+00:00 2014-10-22T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/scalafxml-0-2-2/ <p>I've released a new version of <a href="https://github.com/vigoo/scalafxml">ScalaFXML</a>, which now supports <em>both</em> <a href="https://github.com/scalafx/scalafx">ScalaFX 8</a> with <em>JavaFX 8</em> on Java 8, and <a href="https://github.com/scalafx/scalafx">ScalaFX 2.2</a> with <em>JavaFX 2.x</em> on Java 7.</p> <p>The two branches are separated by the <code>sfx2</code> and <code>sfx8</code> postfixes, and both 
are available for <em>Scala</em> <code>2.10.x</code> and <code>2.11.x</code>.</p> <p>To use it with <a href="http://www.scala-sbt.org/">sbt</a> on Java 7:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>addCompilerPlugin(</span><span style="color:#50a14f;">&quot;org.scalamacros&quot;</span><span> % </span><span style="color:#50a14f;">&quot;paradise&quot;</span><span> % </span><span style="color:#50a14f;">&quot;2.0.1&quot;</span><span> cross CrossVersion.full) </span><span> </span><span>libraryDependencies += </span><span style="color:#50a14f;">&quot;org.scalafx&quot;</span><span> %% </span><span style="color:#50a14f;">&quot;scalafx&quot;</span><span> % </span><span style="color:#50a14f;">&quot;2.2.67-R10&quot; </span><span> </span><span>libraryDependencies += </span><span style="color:#50a14f;">&quot;org.scalafx&quot;</span><span> %% </span><span style="color:#50a14f;">&quot;scalafxml-core-sfx2&quot;</span><span> % </span><span style="color:#50a14f;">&quot;0.2.2&quot; </span></code></pre> <p>And on Java 8:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>addCompilerPlugin(</span><span style="color:#50a14f;">&quot;org.scalamacros&quot;</span><span> % </span><span style="color:#50a14f;">&quot;paradise&quot;</span><span> % </span><span style="color:#50a14f;">&quot;2.0.1&quot;</span><span> cross CrossVersion.full) </span><span> </span><span>libraryDependencies += </span><span style="color:#50a14f;">&quot;org.scalafx&quot;</span><span> %% </span><span style="color:#50a14f;">&quot;scalafx&quot;</span><span> % </span><span style="color:#50a14f;">&quot;8.0.20-R6&quot; </span><span> </span><span>libraryDependencies += </span><span style="color:#50a14f;">&quot;org.scalafx&quot;</span><span> %% </span><span 
style="color:#50a14f;">&quot;scalafxml-core-sfx8&quot;</span><span> % </span><span style="color:#50a14f;">&quot;0.2.2&quot; </span></code></pre> A python/thrift profiling story 2014-09-15T00:00:00+00:00 2014-09-15T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/thrift-profiling/ <p>A few weeks ago I ran into a problem where a script that runs once every night to send out some emails did not run correctly because a remote thrift call in it timed out. When I started investigating, it turned out to be a <em>search</em> call:</p> <pre data-lang="python" style="background-color:#fafafa;color:#383a42;" class="language-python "><code class="language-python" data-lang="python"><span>staff_users </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">RemoteUserFactory</span><span>().</span><span style="color:#e45649;">search</span><span>(</span><span style="color:#e45649;">is_staff</span><span style="color:#a626a4;">=</span><span style="color:#c18401;">True</span><span>) </span></code></pre> <p>The details here are not really important; what this call does is ask a service to return a <em>set of users</em>, and the communication goes over <a href="https://thrift.apache.org/">thrift</a>.</p> <p>Executing it manually on the server revealed that it should return <em>5649</em> users. Checking the logs I could see that the call took an extremely long time, between 8 and 12 seconds. Even when the cron job was moved from 3:00 AM to a less busy time (several other jobs were executing at the same time), it took more than 6 seconds!</p> <p>This was suspicious so I also checked the log of a <em>proxy</em> which runs on the same host as the script itself and provides client side load balancing, circuit breaking, retry logic etc. for thrift connections. This log showed that the service replied in <em>2.5 seconds</em>, but it took almost 4 seconds to get this response from the proxy to the client on localhost! 
This seemed completely unacceptable, and the 2.5 second response time from the service also seemed too long (I ran the query on one of the server nodes and it returned the users from the database almost instantly). I had also had similar experiences before (but without measurements).</p> <p>So I decided to find out what's going on. And I found the process interesting enough to write this post about it :)</p> <h2 id="test-environment">Test environment</h2> <p>I started by adding a test method to the service's thrift API called <code>test_get_users(count, sleep)</code> which returns <code>count</code> fake users after waiting <code>sleep</code> seconds. Then in the following experiments I called it with <code>(5499, 1)</code>. The 1 second sleep was intended to simulate the network latency and database query; in the end it did not turn out to be useful, but as it is visible everywhere in the results, I had to mention it.</p> <p>To find out what's going on I used <a href="https://docs.python.org/2/library/profile.html">cProfile</a> with <a href="https://code.google.com/p/jrfonseca/">gprof2dot</a>, calling the remote test method from a django shell, while everything was running on localhost.</p> <h3 id="first-measurement">First measurement</h3> <p>Without touching anything, returning 5499 dummy users on localhost took <strong>5.272 seconds</strong>!</p> <p>The client side of the call looked like this:</p> <p><a href="/images/profile1.png" class="zimg"><img width="600" src="/images/profile1.png" alt="profile1"></a></p> <p>Here we can see that the call has two major phases:</p> <ul> <li>The thrift call itself (65%)</li> <li>Converting the raw results to model objects with <code>_row_to_model</code> (35%)</li> </ul> <p>Let's look at the thrift call first (the green branch on the picture). Once again it has two branches of nearly equal weight:</p> <ul> <li><code>send_test_get_users</code> which sends the request and waits for the response. 
This includes the 1 second sleep as well.</li> <li><code>recv_test_get_users</code> processes the response</li> </ul> <p>What's interesting here is that <code>recv_test_get_users</code> took ~32% of the overall time, which is ~1.6 seconds for simple data deserialization.</p> <h3 id="optimizing-thrift-deserialization">Optimizing thrift deserialization</h3> <p>I did not want to believe that the python thrift deserialization is that slow, so I did a search and found that the <code>TBinaryProtocol</code> which we are using really is that slow.</p> <p>But the thrift library contains a class called <code>TBinaryProtocolAccelerated</code> which is about 10x faster (according to a stackoverflow post).</p> <p>First I simply changed the protocol in use to this, but nothing happened. Digging deeper I found that this is not a real protocol implementation, but a lower level hack.</p> <p>The documentation of the protocol class says:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span> C-Accelerated version of TBinaryProtocol. </span><span> </span><span> This class does not override any of TBinaryProtocol&#39;s methods, </span><span> but the generated code recognizes it directly and will call into </span><span> our C module to do the encoding, bypassing this object entirely. </span><span> We inherit from TBinaryProtocol so that the normal TBinaryProtocol </span><span> encoding can happen if the fastbinary module doesn&#39;t work for some </span><span> reason. (TODO(dreiss): Make this happen sanely in more cases.) </span><span> </span><span> In order to take advantage of the C module, just use </span><span> TBinaryProtocolAccelerated instead of TBinaryProtocol. </span></code></pre> <p>So why didn't it work? 
The answer is in <a href="https://github.com/apache/thrift/blob/master/lib/py/src/protocol/TBase.py#L52-L58">TBase.py</a>.</p> <p>The following conditions have to be met in order to use the fast deserializer:</p> <ul> <li>Protocol must be <code>TBinaryProtocolAccelerated</code> (I changed that)</li> <li>Protocol's transport implementation must implement the <code>TTransport.CReadableTransport</code> interface</li> <li><code>thrift_spec</code> must be available (this was true in this case)</li> <li><code>fastbinary</code> must be available (also true)</li> </ul> <p>The problem was that we were replacing the <code>TTransport</code> implementation with a custom class called <code>ThriftifyTransport</code> in order to do thrift logging, HMAC authentication, etc.</p> <p>Fortunately all the default transport implementations implement the <code>CReadableTransport</code> interface, and one of them, <code>TBufferedTransport</code>, can be used to wrap another transport to add buffering around it. That's what I did, and it immediately started using the fast deserialization code.</p> <p>The test call now ran in <strong>3.624 seconds</strong>.</p> <p>And the new profiling results with this change:</p> <p><a href="/images/profile2.png" class="zimg"><img width="600" src="/images/profile2.png" alt="profile2"></a></p> <p>The left-hand side of the call graph remained the same, but <code>recv_test_get_users</code> is now only 2.35% of the overall time, which is ~0.08 seconds (compared with the 1.6 seconds of the original deserializer!)</p> <h3 id="optimizing-thrift-serialization">Optimizing thrift serialization</h3> <p>The obvious next step was to apply this change on the server side as well, so our service can use the fast binary protocol for serialization too. 
For this I simply copied the change and remeasured everything.</p> <p>The test call now ran in <strong>3.328 seconds</strong>!</p> <p>Let's see the call graph of this stage:</p> <p><a href="/images/profile3.png" class="zimg"><img width="600" src="/images/profile3.png" alt="profile3"></a></p> <h3 id="optimizing-result-processing">Optimizing result processing</h3> <p>The client side of the test method was written similar to how the original API method is written:</p> <pre data-lang="python" style="background-color:#fafafa;color:#383a42;" class="language-python "><code class="language-python" data-lang="python"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">test_get_users_thrift</span><span>(</span><span style="color:#e45649;">self</span><span>, </span><span style="color:#e45649;">count</span><span>, </span><span style="color:#e45649;">sleep</span><span>): </span><span> rpc </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">ThriftRPC</span><span>(UserDataService, </span><span style="color:#e45649;">self</span><span>.name, </span><span style="color:#e45649;">service_name</span><span style="color:#a626a4;">=</span><span style="color:#e45649;">self</span><span>.service_name, </span><span style="color:#e45649;">client_config</span><span style="color:#a626a4;">=</span><span>client_config) </span><span> </span><span> result </span><span style="color:#a626a4;">= </span><span>[] </span><span> </span><span style="color:#a626a4;">for </span><span>row </span><span style="color:#a626a4;">in </span><span>rpc.</span><span style="color:#e45649;">test_get_users</span><span>(count, sleep).</span><span style="color:#e45649;">iteritems</span><span>(): </span><span> user </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">self</span><span>.</span><span style="color:#e45649;">_row_to_model</span><span>(</span><span style="color:#e45649;">self</span><span>.user_factory, row) </span><span> result.</span><span 
style="color:#e45649;">append</span><span>(user) </span><span> </span><span> </span><span style="color:#a626a4;">return </span><span>result </span></code></pre> <p>It is clearly visible on the call graph that the 5499 calls to <code>_row_to_model</code> take 53% of the total time, which is ~1.7 seconds. There are two main branches of this call. The left hand side (<code>row_to_model</code>) seemed to be simple data conversion, and its slowest part is date-time deserialization.</p> <p>The other branch however looked like a real problem; why should we resolve the HMAC host, or parse configuration, for each row?</p> <p>It turned out to be a bug: <code>_row_to_model</code> created a new <em>model factory</em> in each call, which involves a lot of initialization, config parsing, and similar things.</p> <p>So the simple fix was to create a <code>_rows_to_model</code> helper function which does the same for multiple rows with a single factory.</p> <p>Running my test code once again showed that the optimization makes sense. Now it ran in <strong>2.448 seconds</strong>, with the following call graph:</p> <p><a href="/images/profile4.png" class="zimg"><img width="600" src="/images/profile4.png" alt="profile4"></a></p> <h3 id="further-optimizations">Further optimizations</h3> <p>I saw two possible ways to further optimize this case:</p> <ol> <li> <p>Lazy conversion of raw thrift data to model data (per field). 
This would make sense because many times only a few fields (the id for example) are used, but it seemed to be too complex a change</p> </li> <li> <p>Checking the server side as well</p> </li> </ol> <p>To profile the server side and only measure the thrift request processing I had to add profiling code to the django view class in the following way:</p> <pre data-lang="python" style="background-color:#fafafa;color:#383a42;" class="language-python "><code class="language-python" data-lang="python"><span style="color:#a626a4;">import </span><span>cProfile </span><span> </span><span>cProfile.</span><span style="color:#e45649;">runctx</span><span>(</span><span style="color:#50a14f;">&#39;self._call_processor(op_data)&#39;</span><span>, </span><span style="color:#0184bc;">globals</span><span>(), </span><span style="color:#0184bc;">locals</span><span>(), </span><span style="color:#50a14f;">&#39;callstats&#39;</span><span>) </span><span style="color:#a0a1a7;"># self._call_processor(op_data) </span></code></pre> <p>The server-side call took <strong>1.691 seconds</strong> and looked like this:</p> <p><a href="/images/profile5.png" class="zimg"><img width="600" src="/images/profile5.png" alt="profile5"></a></p> <p>As expected, 60% of this was the 1 second sleep. 
The rest of the calls are data conversion with no obvious point to improve.</p> <h2 id="summary">Summary</h2> <p>These optimizations decrease the response time significantly, especially for calls returning multiple rows.</p> <p>The interesting part was that the extremely slow performance was caused by both the slow performance of the python thrift serializer and a bug in our code.</p> Conditional blocks in Distributed Documentor 2014-07-13T00:00:00+00:00 2014-07-13T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/conditional-blocks-in-ddoc/ <p>I've added a new feature to <a href="https://github.com/vigoo/distributed-documentor">Distributed Documentor</a> today, <em>conditional blocks</em>.</p> <p>The idea is that parts of the documents can be enabled when a given <em>condition</em> is present. This is very similar to <a href="http://gcc.gnu.org/onlinedocs/cpp/Ifdef.html">C's ifdef blocks</a>. To use it with the <em>MediaWiki syntax</em>, put <code>[When:X]</code> and <code>[End]</code> commands in separate lines:</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span>Unconditional </span><span> </span><span>[When:FIRST] </span><span>First conditional </span><span> </span><span>[When:SECOND] </span><span>First and second conditional </span><span>[End] </span><span>[End] </span><span> </span><span>[When:SECOND] </span><span>Second conditional </span><span>[End] </span></code></pre> <p><em>Snippets</em> can also have conditional blocks.</p> <p>There are two ways to set which conditionals are enabled:</p> <ol> <li> <p>Specifying it with command line arguments, such as</p> <pre style="background-color:#fafafa;color:#383a42;"><code><span> java -jar DistributedDocumentor.jar -D FIRST -D SECOND </span></code></pre> <p>This is useful when exporting documentation from the command line, or to launch the documentation editor with a predefined set of enabled conditions.</p> </li> <li> <p>On the user interface, using the <em>View</em> menu's 
<em>Enabled conditions...</em> menu item:</p> </li> </ol> <p><img src="/images/enabled-conditions-dialog.png" alt="" /></p> Introducing bari 2014-05-16T00:00:00+00:00 2014-05-16T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/introducing-bari/ <p>Over the past two years I have been working on a project called <a href="https://github.com/vigoo/bari">bari</a>, which has now reached a usable state. <strong>bari</strong> is a <em>build management system</em>, trying to fix Visual Studio's bad parts while keeping the good ones.</p> <p>Basically it tries to make .NET development more convenient, when</p> <ul> <li>The application may consist of a <em>large number of projects</em></li> <li>There may be several different <em>subsets</em> of these projects defining valuable target <em>products</em></li> <li><em>Custom build steps</em> may be required</li> <li>It is important to be able to <em>reproduce</em> the build environment as easily as possible</li> <li>The developers want to use the full power of their <em>IDE</em></li> </ul> <p>The main idea is to generate Visual Studio solutions and projects <em>on the fly</em> as needed, from a concise <em>declarative</em> build description. I tried to optimize this build description for human readability. 
Let's see an example, a short section from <strong>bari</strong>'s own build definition:</p> <pre data-lang="yaml" style="background-color:#fafafa;color:#383a42;" class="language-yaml "><code class="language-yaml" data-lang="yaml"><span>- </span><span style="color:#e45649;">name</span><span>: </span><span style="color:#50a14f;">bari </span><span> </span><span style="color:#e45649;">type</span><span>: </span><span style="color:#50a14f;">executable </span><span> </span><span style="color:#e45649;">references</span><span>: </span><span> - </span><span style="color:#50a14f;">gac://System </span><span> - </span><span style="color:#50a14f;">nuget://log4net </span><span> - </span><span style="color:#50a14f;">nuget://Ninject/3.0.1.10 </span><span> - </span><span style="color:#50a14f;">nuget://QuickGraph </span><span> - </span><span style="color:#50a14f;">module://Bari.Core </span><span> </span><span style="color:#e45649;">csharp</span><span>: </span><span> </span><span style="color:#e45649;">root-namespace</span><span>: </span><span style="color:#50a14f;">Bari.Console </span></code></pre> <p>The main advantage of generating solutions and projects on the fly is that each developer can work on the subset they need for their current task, keeping the IDE fast, but can also open everything in one solution if it is useful for performing a refactoring.</p> <p>To keep build definitions short and readable, <strong>bari</strong> prefers <em>convention</em> over <em>configuration</em>. The directory structure in which the source code lies defines not only the names of the modules to build, but also the way they are built. 
For example, in a simple <em>hello world</em> example the C# source code would be put in the <code>src/TestModule/HelloWorld/cs</code> directory, and <strong>bari</strong> would build <code>target/TestModule/HelloWorld.exe</code>.</p> <p><strong>bari</strong> unifies the handling of <em>project references</em> in a way that referencing projects within a suite, from the GAC, using <a href="http://www.nuget.org">Nuget</a> or from a custom repository works exactly the same. It is also possible to write <em>custom builders</em> in Python.</p> <p>For more information check out <a href="https://github.com/vigoo/bari/wiki/GettingStarted">the getting started page</a>.</p> ScalaFX with FXML 2014-01-12T00:00:00+00:00 2014-01-12T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/scalafx-with-fxml/ <p><a href="https://code.google.com/p/scalafx/">ScalaFX</a> is a nice wrapper around JavaFX for Scala, but currently it lacks support for using <a href="http://docs.oracle.com/javafx/2/api/javafx/fxml/doc-files/introduction_to_fxml.html">FXML</a> instead of Scala code for defining the user interfaces. This is understandable, as <em>ScalaFX</em> is in fact a DSL for defining the UI in Scala instead of an XML file. Still I believe that using FXML instead may have its advantages; first of all it has a visual designer (<a href="http://www.oracle.com/technetwork/java/javafx/tools/index.html">JavaFX Scene Builder</a>). For me, designing a UI without immediate visual feedback is hard, and involves a lot of iterations of tweaking the code, running it and checking the results. I also expect that in the future there will be more tools available which work on FXML data.</p> <p>It is not impossible to use FXML user interfaces from Scala, but the ScalaFX wrappers do not help and the code for the controller classes is not clean enough. 
See <a href="https://github.com/jpsacha/ProScalaFX/blob/master/src/proscalafx/ch10/fxml/AdoptionFormController.scala">the following example</a> to get a feeling for what it looks like.</p> <p>To make it better I wrote a small library called <a href="https://github.com/vigoo/scalafxml">ScalaFXML</a>. In this post I'll go through a small example to explain how it works.</p> <p>The following image shows what our sample application will look like:</p> <p><img src="/images/unit-conversion-shot.png" alt="" /></p> <p>The <em>From</em> field is editable, and the result in the <em>To</em> field is filled as you type using <em>data binding</em>. The <em>Close</em> button's only purpose is to demonstrate event handlers.</p> <p>The conversion logic itself is implemented by <a href="https://github.com/vigoo/scalafxml/blob/master/demo/src/main/scala/scalafxml/demo/unitconverter/UnitConverter.scala">small classes</a> sharing the same trait:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">trait</span><span style="color:#c18401;"> UnitConverter { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">description</span><span style="color:#c18401;">: String </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">run</span><span style="color:#c18401;">(</span><span style="color:#e45649;">input</span><span style="color:#c18401;">: String): String </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">toString </span><span style="color:#a626a4;">=</span><span style="color:#c18401;"> description </span><span style="color:#c18401;">} </span><span> </span><span style="color:#a626a4;">object</span><span style="color:#c18401;"> MMtoInches </span><span 
style="color:#a626a4;">extends </span><span>UnitConverter </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">description</span><span style="color:#c18401;">: String </span><span style="color:#a626a4;">= </span><span style="color:#50a14f;">&quot;Millimeters to inches&quot; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">run</span><span style="color:#c18401;">(</span><span style="color:#e45649;">input</span><span style="color:#c18401;">: String): String </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">try </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> (input.toDouble / 25.4).toString </span><span style="color:#c18401;"> } </span><span style="color:#a626a4;">catch </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case </span><span style="color:#e45649;">ex</span><span style="color:#c18401;">: Throwable </span><span style="color:#a626a4;">=&gt;</span><span style="color:#c18401;"> ex.toString </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span><span> </span><span style="color:#a626a4;">object</span><span style="color:#c18401;"> InchesToMM </span><span style="color:#a626a4;">extends </span><span>UnitConverter </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">description</span><span style="color:#c18401;">: String </span><span style="color:#a626a4;">= </span><span style="color:#50a14f;">&quot;Inches to millimeters&quot; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">run</span><span style="color:#c18401;">(</span><span style="color:#e45649;">input</span><span 
style="color:#c18401;">: String): String </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">try </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> (input.toDouble * 25.4).toString </span><span style="color:#c18401;"> } </span><span style="color:#a626a4;">catch </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">case </span><span style="color:#e45649;">ex</span><span style="color:#c18401;">: Throwable </span><span style="color:#a626a4;">=&gt;</span><span style="color:#c18401;"> ex.toString </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre> <p>To describe the set of available <em>unit converters</em>, we define one more helper class:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">class</span><span style="color:#c18401;"> UnitConverters</span><span>(</span><span style="color:#e45649;">converters</span><span>: </span><span style="color:#c18401;">UnitConverter</span><span style="color:#a626a4;">*</span><span>) </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">available </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">List(converters : </span><span style="color:#a626a4;">_*</span><span style="color:#c18401;">) </span><span style="color:#c18401;">} </span></code></pre> <p>Now let's start with a <a href="https://github.com/vigoo/scalafxml/blob/master/demo/src/main/scala/scalafxml/demo/unitconverter/PureScalaFX.scala">pure ScalaFX solution</a>, where the user interface is defined in Scala. 
I've implemented the view itself in a class called <code>PureScalaFXView</code>, which gets the set of available <em>unit converters</em> as a dependency through its constructor. This makes the main application object very simple:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">object</span><span style="color:#c18401;"> PureScalaFX </span><span style="color:#a626a4;">extends </span><span>JFXApp </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> stage </span><span style="color:#a626a4;">= new </span><span style="color:#c18401;">PureScalaFXView( </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">new </span><span style="color:#c18401;">UnitConverters(InchesToMM, MMtoInches)) </span><span style="color:#c18401;">} </span></code></pre> <p>The <code>PureScalaFXView</code> class consists of two distinct parts. First we define the user interface using the <em>ScalaFX UI DSL</em>:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">class</span><span style="color:#c18401;"> PureScalaFXView</span><span>(</span><span style="color:#e45649;">converters</span><span>: </span><span style="color:#c18401;">UnitConverters</span><span>) </span><span style="color:#a626a4;">extends </span><span>JFXApp.PrimaryStage </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// UI Definition </span><span style="color:#c18401;"> title </span><span style="color:#a626a4;">= </span><span style="color:#50a14f;">&quot;Unit conversion&quot; </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private val </span><span style="color:#e45649;">types </span><span 
style="color:#a626a4;">= new </span><span style="color:#c18401;">ComboBox[UnitConverter]() { </span><span style="color:#c18401;"> maxWidth </span><span style="color:#a626a4;">= Double</span><span style="color:#c18401;">.MaxValue </span><span style="color:#c18401;"> margin </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">Insets(3) </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private val </span><span style="color:#e45649;">from </span><span style="color:#a626a4;">= new </span><span style="color:#c18401;">TextField { </span><span style="color:#c18401;"> margin </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">Insets(3) </span><span style="color:#c18401;"> prefWidth </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">200.0 </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private val </span><span style="color:#e45649;">to </span><span style="color:#a626a4;">= new </span><span style="color:#c18401;">TextField { </span><span style="color:#c18401;"> prefWidth </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">200.0 </span><span style="color:#c18401;"> margin </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">Insets(3) </span><span style="color:#c18401;"> editable </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">false </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> scene </span><span style="color:#a626a4;">= new </span><span style="color:#c18401;">Scene { </span><span style="color:#c18401;"> content </span><span style="color:#a626a4;">= new </span><span style="color:#c18401;">GridPane { </span><span style="color:#c18401;"> padding </span><span style="color:#a626a4;">= 
</span><span style="color:#c18401;">Insets(5) </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> add(</span><span style="color:#a626a4;">new </span><span style="color:#c18401;">Label(</span><span style="color:#50a14f;">&quot;Conversion type:&quot;</span><span style="color:#c18401;">), 0, 0) </span><span style="color:#c18401;"> add(</span><span style="color:#a626a4;">new </span><span style="color:#c18401;">Label(</span><span style="color:#50a14f;">&quot;From:&quot;</span><span style="color:#c18401;">), 0, 1) </span><span style="color:#c18401;"> add(</span><span style="color:#a626a4;">new </span><span style="color:#c18401;">Label(</span><span style="color:#50a14f;">&quot;To:&quot;</span><span style="color:#c18401;">), 0, 2) </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> add(types, 1, 0) </span><span style="color:#c18401;"> add(from, 1, 1) </span><span style="color:#c18401;"> add(to, 1, 2) </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> add(</span><span style="color:#a626a4;">new </span><span style="color:#c18401;">Button(</span><span style="color:#50a14f;">&quot;Close&quot;</span><span style="color:#c18401;">) { </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// inline event handler binding </span><span style="color:#c18401;"> onAction </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">(</span><span style="color:#e45649;">e</span><span style="color:#c18401;">: ActionEvent) </span><span style="color:#a626a4;">=&gt; </span><span style="color:#c18401;">Platform.exit() </span><span style="color:#c18401;"> }, 1, 3) </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> columnConstraints </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">List( </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">new </span><span style="color:#c18401;">ColumnConstraints { </span><span style="color:#c18401;"> 
halignment </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">HPos.LEFT </span><span style="color:#c18401;"> hgrow </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">Priority.SOMETIMES </span><span style="color:#c18401;"> margin </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">Insets(5) </span><span style="color:#c18401;"> }, </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">new </span><span style="color:#c18401;">ColumnConstraints { </span><span style="color:#c18401;"> halignment </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">HPos.RIGHT </span><span style="color:#c18401;"> hgrow </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">Priority.ALWAYS </span><span style="color:#c18401;"> margin </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">Insets(5) </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> ) </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> } </span></code></pre> <p>This is not a 100% pure UI definition, because it also contains an inline event handler definition for the <em>Close</em> button.</p> <p>The next part fills the <em>combo box</em> and defines the data binding. 
Filling the combo box is a simple procedural loop:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span> </span><span style="color:#a626a4;">for </span><span>(</span><span style="color:#e45649;">converter </span><span style="color:#a626a4;">&lt;-</span><span> converters.available) { </span><span> types += converter </span><span> } </span><span> types.getSelectionModel.selectFirst() </span></code></pre> <p>For the data binding we define a <a href="http://docs.oracle.com/javafx/2/binding/jfxpub-binding.htm">low level data binding</a> which depends on the combo box's selected value and the <em>From</em> field's text, and produces the output for the <em>To</em> field:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span> to.text &lt;== </span><span style="color:#a626a4;">new </span><span style="color:#c18401;">StringBinding </span><span>{ </span><span> bind(from.text.delegate, types.getSelectionModel.selectedItemProperty) </span><span> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">computeValue</span><span>() </span><span style="color:#a626a4;">=</span><span> types.getSelectionModel.getSelectedItem.run(from.text.value) </span><span> } </span></code></pre> <p>That's all; the application is fully functional. The next step is to split this class so that the UI definition and the UI logic are separated. 
This <a href="https://github.com/vigoo/scalafxml/blob/master/demo/src/main/scala/scalafxml/demo/unitconverter/RefactoredPureScalaFX.scala">refactored ScalaFX solution</a> is very similar to the previous one, but the initialization of the combo box, the data binding and the event handler are all encapsulated by a new, separate class:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">class</span><span style="color:#c18401;"> RawUnitConverterPresenter</span><span>( </span><span> </span><span style="color:#a626a4;">private val </span><span style="color:#e45649;">from</span><span>: </span><span style="color:#c18401;">TextField</span><span>, </span><span> </span><span style="color:#a626a4;">private val </span><span style="color:#e45649;">to</span><span>: </span><span style="color:#c18401;">TextField</span><span>, </span><span> </span><span style="color:#a626a4;">private val </span><span style="color:#e45649;">types</span><span>: </span><span style="color:#c18401;">ComboBox</span><span>[</span><span style="color:#c18401;">UnitConverter</span><span>], </span><span> </span><span style="color:#a626a4;">private val </span><span style="color:#e45649;">converters</span><span>: </span><span style="color:#c18401;">UnitConverters</span><span>) </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// Filling the combo box </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">for </span><span style="color:#c18401;">(</span><span style="color:#e45649;">converter </span><span style="color:#a626a4;">&lt;-</span><span style="color:#c18401;"> converters.available) { </span><span style="color:#c18401;"> types += converter </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> types.getSelectionModel.selectFirst() </span><span 
style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// Data binding </span><span style="color:#c18401;"> to.text &lt;== </span><span style="color:#a626a4;">new </span><span style="color:#c18401;">StringBinding { </span><span style="color:#c18401;"> bind(from.text.delegate, types.getSelectionModel.selectedItemProperty) </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">computeValue</span><span style="color:#c18401;">() </span><span style="color:#a626a4;">=</span><span style="color:#c18401;"> types.getSelectionModel.getSelectedItem.run(from.text.value) </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// Close button event handler </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">onClose</span><span style="color:#c18401;">(</span><span style="color:#e45649;">event</span><span style="color:#c18401;">: ActionEvent) { </span><span style="color:#c18401;"> Platform.exit() </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre> <p>What I wanted is to be able to define the controller class exactly like this while building the user interface from FXML. 
Without <a href="https://github.com/vigoo/scalafxml">ScalaFXML</a> the controller class has some serious limitations:</p> <ul> <li>It must implement the <a href="http://docs.oracle.com/javafx/2/api/javafx/fxml/Initializable.html">Initializable</a> interface</li> <li>It cannot have any constructor arguments</li> <li>The user interface objects must be variable fields of the class</li> <li>These fields must have the types of the JavaFX controls, so to use the ScalaFX wrappers they have to be explicitly wrapped in the <code>initialize</code> method.</li> </ul> <p>With <a href="https://github.com/vigoo/scalafxml">ScalaFXML</a> the process is really simple. First we create the FXML, for example with the <a href="http://www.oracle.com/technetwork/java/javafx/tools/index.html">JavaFX Scene Builder</a>:</p> <p><img src="/images/unit-conversion-scenebuilder.png" alt="" /></p> <p>In the FXML we give the <code>from</code>, <code>to</code>, and <code>types</code> identifiers to our controls using the <code>fx:id</code> attribute, for example:</p> <pre data-lang="xml" style="background-color:#fafafa;color:#383a42;" class="language-xml "><code class="language-xml" data-lang="xml"><span> &lt;</span><span style="color:#e45649;">TextField </span><span style="color:#c18401;">fx:id</span><span>=</span><span style="color:#50a14f;">&quot;from&quot; </span><span style="color:#c18401;">prefWidth</span><span>=</span><span style="color:#50a14f;">&quot;200.0&quot; </span><span> </span><span style="color:#c18401;">GridPane.columnIndex</span><span>=</span><span style="color:#50a14f;">&quot;1&quot; </span><span> </span><span style="color:#c18401;">GridPane.margin</span><span>=</span><span style="color:#50a14f;">&quot;$x1&quot; </span><span> </span><span style="color:#c18401;">GridPane.rowIndex</span><span>=</span><span style="color:#50a14f;">&quot;1&quot; </span><span>/&gt; </span></code></pre> <p>The event handlers can be specified simply by their name:</p> <pre data-lang="xml" 
style="background-color:#fafafa;color:#383a42;" class="language-xml "><code class="language-xml" data-lang="xml"><span>&lt;</span><span style="color:#e45649;">Button </span><span style="color:#c18401;">onAction</span><span>=</span><span style="color:#50a14f;">&quot;#onClose&quot; </span><span style="color:#c18401;">text</span><span>=</span><span style="color:#50a14f;">&quot;Close&quot; </span><span> </span><span style="color:#c18401;">mnemonicParsing</span><span>=</span><span style="color:#50a14f;">&quot;false&quot; </span><span> </span><span style="color:#c18401;">GridPane.columnIndex</span><span>=</span><span style="color:#50a14f;">&quot;1&quot; </span><span> </span><span style="color:#c18401;">GridPane.halignment</span><span>=</span><span style="color:#50a14f;">&quot;RIGHT&quot; </span><span> </span><span style="color:#c18401;">GridPane.rowIndex</span><span>=</span><span style="color:#50a14f;">&quot;3&quot; </span><span>/&gt; </span></code></pre> <p>and the controller class must be referenced on the root node</p> <pre data-lang="xml" style="background-color:#fafafa;color:#383a42;" class="language-xml "><code class="language-xml" data-lang="xml"><span>fx:controller=&quot;scalafxml.demo.unitconverter.UnitConverterPresenter&quot; </span></code></pre> <p>The controller class <a href="https://github.com/vigoo/scalafxml/blob/master/demo/src/main/scala/scalafxml/demo/unitconverter/ScalaFXML.scala">can be exactly the same as the <code>RawUnitConverterPresenter</code></a>, adding an additional <code>@sfxml</code> annotation for it. 
Everything else is handled by the library, as we will see.</p> <p>The application object itself looks like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">object</span><span style="color:#c18401;"> ScalaFXML </span><span style="color:#a626a4;">extends </span><span>JFXApp </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">val </span><span style="color:#e45649;">root </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">FXMLView(getClass.getResource(</span><span style="color:#50a14f;">&quot;unitconverter.fxml&quot;</span><span style="color:#c18401;">), </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">new </span><span style="color:#c18401;">DependenciesByType(Map( </span><span style="color:#c18401;"> typeOf[UnitConverters] -&gt; </span><span style="color:#a626a4;">new </span><span style="color:#c18401;">UnitConverters(InchesToMM, MMtoInches)))) </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> stage </span><span style="color:#a626a4;">= new </span><span style="color:#c18401;">JFXApp.PrimaryStage() { </span><span style="color:#c18401;"> title </span><span style="color:#a626a4;">= </span><span style="color:#50a14f;">&quot;Unit conversion&quot; </span><span style="color:#c18401;"> scene </span><span style="color:#a626a4;">= new </span><span style="color:#c18401;">Scene(root) </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre> <p>Besides giving the URI of the FXML file, we also have to provide the <em>additional dependencies</em> of the controller class. 
This is an easily extensible part of the library, and it already has support for <a href="https://github.com/dickwall/subcut">SubCut</a> and <a href="https://code.google.com/p/google-guice/">Guice</a> as well. Here we are using a simple <em>type-&gt;value</em> mapping instead.</p> <p>How does this work? What happens behind the scenes?</p> <p>The <code>@sfxml</code> is a <a href="http://docs.scala-lang.org/overviews/macros/annotations.html">macro annotation</a>. At <em>compile time</em>, the class definition itself is transformed by the <a href="https://github.com/vigoo/scalafxml/blob/master/core-macros/src/main/scala/scalafxml/core/macros/sfxmlMacro.scala"><code>sfxmlMacro.impl</code> function</a>.</p> <p>The transformation's result is a class definition with the source class's name, but with completely different content. The original class is added as an inner class, always called <code>Controller</code>. In our example, the generated class definition would look something like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">class</span><span style="color:#c18401;"> UnitConverterPresenter</span><span>(</span><span style="color:#a626a4;">private val </span><span style="color:#e45649;">dependencyResolver</span><span>: </span><span style="color:#c18401;">ControllerDependencyResolver</span><span>) </span><span> </span><span style="color:#a626a4;">extends </span><span>javafx.fxml.Initializable </span><span> </span><span style="color:#a626a4;">with </span><span>FxmlProxyGenerator.ProxyDependencyInjection </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">class</span><span style="color:#c18401;"> Controller( </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private val </span><span 
style="color:#e45649;">from</span><span style="color:#c18401;">: TextField, </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private val </span><span style="color:#e45649;">to</span><span style="color:#c18401;">: TextField, </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private val </span><span style="color:#e45649;">types</span><span style="color:#c18401;">: ComboBox[UnitConverter], </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private val </span><span style="color:#e45649;">converters</span><span style="color:#c18401;">: UnitConverters) { </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// … </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private var </span><span style="color:#e45649;">impl</span><span style="color:#c18401;">: Controller </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">null </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// … </span><span style="color:#c18401;">} </span></code></pre> <p>The class has four distinct parts:</p> <ol> <li>Getting the additional dependencies from the <em>dependency resolver</em></li> <li>Variable fields for binding the JavaFX controls defined in the FXML</li> <li>Event handler methods</li> <li>The <code>initialize</code> method's implementation</li> </ol> <p>The first one is simple - for each constructor argument of the controller class which is <em>not</em> a ScalaFX control, we query the <em>dependency resolver</em> to get a value for it. These queries are performed when the outer, generated class is instantiated, and their results are stored through the <code>ProxyDependencyInjection</code> trait.</p> <p>The variable fields are simple fields for all the ScalaFX constructor arguments of the controller class, but converted to their JavaFX counterpart. 
For example the generated field for the controller's <code>from</code> argument will look like this:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>@javafx.fxml.</span><span style="color:#e45649;">FXML </span><span style="color:#a626a4;">private var </span><span style="color:#e45649;">from</span><span>: javafx.scene.control.</span><span style="color:#c18401;">TextField </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">null </span></code></pre> <p>The <em>event handlers</em> are proxies for all the public methods of the controller, but the ScalaFX event argument types are replaced with JavaFX event argument types and they are wrapped automatically when forwarding the call to the real implementation. For the <code>onClose</code> event handler it would look like the following:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span>@javafx.fxml.</span><span style="color:#e45649;">FXML </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">onClose</span><span>(</span><span style="color:#e45649;">e</span><span>: javafx.event.</span><span style="color:#c18401;">ActionEvent</span><span>) { </span><span> impl.onClose(</span><span style="color:#a626a4;">new </span><span>scalafx.event.</span><span style="color:#c18401;">ActionEvent</span><span>(e)) </span><span>} </span></code></pre> <p>When JavaFX calls the generated controller's <code>initialize</code> method, the control fields are already set up, and the additional dependencies were already gathered from the dependency resolver so we have all the values required to instantiate the real controller class. 
For ScalaFX arguments we wrap the JavaFX controls, for the additional dependencies we use the <code>ProxyDependencyInjection</code> trait's <code>getDependency</code> method:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span style="color:#a626a4;">def </span><span style="color:#0184bc;">initialize</span><span>(</span><span style="color:#e45649;">url</span><span>: java.net.</span><span style="color:#c18401;">URL</span><span>, </span><span style="color:#e45649;">rb</span><span>: java.util.</span><span style="color:#c18401;">ResourceBundle</span><span>) { </span><span> impl </span><span style="color:#a626a4;">= new </span><span style="color:#c18401;">Controller</span><span>( </span><span> </span><span style="color:#a626a4;">new </span><span>scalafx.scene.control.</span><span style="color:#c18401;">TextField</span><span>(from), </span><span> </span><span style="color:#a626a4;">new </span><span>scalafx.scene.control.</span><span style="color:#c18401;">TextField</span><span>(to), </span><span> </span><span style="color:#a626a4;">new </span><span>scalafx.scene.control.</span><span style="color:#c18401;">ComboBox</span><span>[</span><span style="color:#c18401;">UnitConverter</span><span>](types), </span><span> getDependency[</span><span style="color:#c18401;">UnitConverters</span><span>](</span><span style="color:#50a14f;">&quot;converters&quot;</span><span>)) </span><span>} </span></code></pre> <p>That's all. The final interesting bit is the <code>FXMLView</code> object, which overrides JavaFX's default controller factory. 
This is only necessary to be able to pass the given <code>ControllerDependencyResolver</code> to the generated controller's constructor:</p> <pre data-lang="scala" style="background-color:#fafafa;color:#383a42;" class="language-scala "><code class="language-scala" data-lang="scala"><span> </span><span style="color:#a626a4;">def </span><span style="color:#0184bc;">apply</span><span>(</span><span style="color:#e45649;">fxml</span><span>: </span><span style="color:#c18401;">URL</span><span>, </span><span style="color:#e45649;">dependencies</span><span>: </span><span style="color:#c18401;">ControllerDependencyResolver</span><span>): jfxs.</span><span style="color:#c18401;">Parent </span><span style="color:#a626a4;">= </span><span> jfxf.FXMLLoader.load( </span><span> fxml, </span><span> </span><span style="color:#c18401;">null</span><span>, </span><span> </span><span style="color:#a626a4;">new </span><span>jfxf.</span><span style="color:#c18401;">JavaFXBuilderFactory</span><span>(), </span><span> </span><span style="color:#a626a4;">new </span><span>jfxu.</span><span style="color:#c18401;">Callback</span><span>[</span><span style="color:#c18401;">Class</span><span>[</span><span style="color:#e45649;">_</span><span>], </span><span style="color:#c18401;">Object</span><span>] { </span><span> </span><span style="color:#a626a4;">override def </span><span style="color:#0184bc;">call</span><span>(</span><span style="color:#e45649;">cls</span><span>: </span><span style="color:#c18401;">Class</span><span>[</span><span style="color:#e45649;">_</span><span>]): </span><span style="color:#c18401;">Object </span><span style="color:#a626a4;">= </span><span> FxmlProxyGenerator(cls, dependencies) </span><span> }) </span></code></pre> <p><code>FxmlProxyGenerator</code> uses reflection to create a new instance of the generated controller, and pass the dependency resolver as its only constructor argument.</p> Trying out Ceylon - Part 1 2013-11-17T00:00:00+00:00 2013-11-17T00:00:00+00:00 
Daniel Vigovszky https://blog.vigoo.dev/posts/trying-out-ceylon-part-1/ <p>Ceylon's first production release was announced on 12th of November. I decided to try it out after going through the quick introduction, as it looked quite promising. In a series of posts I'd like to share my first attempts to use this interesting language.</p> <p>This first release came with an Eclipse plugin as well - after installing it I was immediately able to start working on my test project. In these few hours the plugin seemed to be stable enough, I did not experience any problems.</p> <p>I have a <code>JVLT</code> file which I created while attending a foreign language course about a year ago. I was using only a limited subset of this application, so basically what I have is a .jvlt file, which is in fact a ZIP archive, in which a <code>dict.xml</code> stores a set of words and, for each word, one or more translations and the lesson in which we learnt it.</p> <p>See the following example:</p> <pre data-lang="xml" style="background-color:#fafafa;color:#383a42;" class="language-xml "><code class="language-xml" data-lang="xml"><span>&lt;</span><span style="color:#e45649;">dictionary </span><span style="color:#c18401;">language</span><span>=</span><span style="color:#50a14f;">&quot;french&quot; </span><span style="color:#c18401;">version</span><span>=</span><span style="color:#50a14f;">&quot;1.4&quot;</span><span>&gt; </span><span> &lt;</span><span style="color:#e45649;">entry </span><span style="color:#c18401;">id</span><span>=</span><span style="color:#50a14f;">&quot;e275&quot;</span><span>&gt; </span><span> &lt;</span><span style="color:#e45649;">orth</span><span>&gt;à côté de&lt;/</span><span style="color:#e45649;">orth</span><span>&gt; </span><span> &lt;</span><span style="color:#e45649;">sense </span><span style="color:#c18401;">id</span><span>=</span><span style="color:#50a14f;">&quot;e275-s1&quot;</span><span>&gt; </span><span> &lt;</span><span 
style="color:#e45649;">trans</span><span>&gt;mellett&lt;/</span><span style="color:#e45649;">trans</span><span>&gt; </span><span> &lt;/</span><span style="color:#e45649;">sense</span><span>&gt; </span><span> &lt;</span><span style="color:#e45649;">sense </span><span style="color:#c18401;">id</span><span>=</span><span style="color:#50a14f;">&quot;e275-s2&quot;</span><span>&gt; </span><span> &lt;</span><span style="color:#e45649;">trans</span><span>&gt;mellé&lt;/</span><span style="color:#e45649;">trans</span><span>&gt; </span><span> &lt;/</span><span style="color:#e45649;">sense</span><span>&gt; </span><span> &lt;</span><span style="color:#e45649;">lesson</span><span>&gt;8&lt;/</span><span style="color:#e45649;">lesson</span><span>&gt; </span><span> &lt;/</span><span style="color:#e45649;">entry</span><span>&gt; </span><span>&lt;/</span><span style="color:#e45649;">dictionary</span><span>&gt; </span></code></pre> <p>My idea was to write an application that helps me learn and practice these words.</p> <p>In this first post I'm going to load the dictionary from the JVLT file.</p> <p>To get started, I created a new Ceylon module called <code>jvlt</code> with the help of the IDE. This immediately created three program units: <code>module.ceylon</code>, <code>package.ceylon</code> and <code>run.ceylon</code>. The <code>module.ceylon</code> contains the module definition, which also describes the module's dependencies. 
As I was trying to implement the dictionary reader, I ended up with the following module definition:</p> <pre data-lang="ceylon" style="background-color:#fafafa;color:#383a42;" class="language-ceylon "><code class="language-ceylon" data-lang="ceylon"><span style="color:#a626a4;">module </span><span style="color:#e45649;">jvlt </span><span style="color:#50a14f;">&quot;1.0.0&quot;</span><span> { </span><span> </span><span style="color:#a626a4;">shared import </span><span style="color:#e45649;">ceylon</span><span>.</span><span style="color:#e45649;">file </span><span style="color:#50a14f;">&quot;1.0.0&quot;</span><span>; </span><span> </span><span style="color:#a626a4;">import </span><span style="color:#e45649;">ceylon</span><span>.</span><span style="color:#e45649;">collection </span><span style="color:#50a14f;">&quot;1.0.0&quot;</span><span>; </span><span> </span><span style="color:#a626a4;">import </span><span style="color:#e45649;">ceylon</span><span>.</span><span style="color:#e45649;">interop</span><span>.</span><span style="color:#e45649;">java </span><span style="color:#50a14f;">&quot;1.0.0&quot;</span><span>; </span><span> </span><span> </span><span style="color:#a626a4;">import </span><span style="color:#e45649;">javax</span><span>.</span><span style="color:#e45649;">xml </span><span style="color:#50a14f;">&quot;7&quot;</span><span>; </span><span> </span><span> </span><span style="color:#a626a4;">import </span><span style="color:#e45649;">ceylon</span><span>.</span><span style="color:#e45649;">test </span><span style="color:#50a14f;">&quot;1.0.0&quot;</span><span>; </span><span>} </span></code></pre> <p>Let's start with the data model we want to build up! 
The dictionary consists of words:</p> <pre data-lang="ceylon" style="background-color:#fafafa;color:#383a42;" class="language-ceylon "><code class="language-ceylon" data-lang="ceylon"><span style="color:#50a14f;">&quot;Represents a foreign word with one or more senses&quot; </span><span style="color:#a626a4;">shared class </span><span style="color:#c18401;">Word</span><span>(</span><span style="color:#a626a4;">shared </span><span style="color:#c18401;">String </span><span style="color:#e45649;">word</span><span>, </span><span style="color:#a626a4;">shared </span><span style="color:#c18401;">Set</span><span>&lt;</span><span style="color:#c18401;">String</span><span>&gt; </span><span style="color:#e45649;">senses</span><span>, </span><span style="color:#a626a4;">shared </span><span style="color:#c18401;">Integer </span><span style="color:#e45649;">lesson</span><span>){ </span><span>} </span></code></pre> <p>The word, senses and lesson are all shared attributes of this class, accessible from the outside. 
To make it easy to access the word objects by their foreign word, I'm currently storing them in a map:</p> <pre data-lang="ceylon" style="background-color:#fafafa;color:#383a42;" class="language-ceylon "><code class="language-ceylon" data-lang="ceylon"><span style="color:#50a14f;">&quot;Represents a dictionary of words in a given language&quot; </span><span style="color:#a626a4;">shared class </span><span style="color:#c18401;">Dictionary</span><span>(</span><span style="color:#a626a4;">shared </span><span style="color:#c18401;">String </span><span style="color:#e45649;">language</span><span>, </span><span style="color:#a626a4;">shared </span><span style="color:#c18401;">Map</span><span>&lt;</span><span style="color:#c18401;">String</span><span>, </span><span style="color:#c18401;">Word</span><span>&gt; </span><span style="color:#e45649;">words</span><span>) { </span><span>} </span></code></pre> <p>Basically that's the data model, but I wrapped the whole thing in an abstract JVLT class which looks like this:</p> <pre data-lang="ceylon" style="background-color:#fafafa;color:#383a42;" class="language-ceylon "><code class="language-ceylon" data-lang="ceylon"><span style="color:#50a14f;">&quot;Represents a JVLT file&quot; </span><span style="color:#a626a4;">abstract shared class </span><span style="color:#c18401;">JVLT</span><span>() { </span><span> </span><span> </span><span style="color:#50a14f;">&quot;The dictionary stored in this JVLT&quot; </span><span> </span><span style="color:#a626a4;">formal shared </span><span style="color:#c18401;">Dictionary </span><span style="color:#e45649;">dictionary</span><span>; </span><span>} </span></code></pre> <p>The idea is that you get a JVLT instance from one of the helper functions and then use it as a root of the data model.</p> <p>The next thing is to create this data model from the JVLT files. 
For this, I needed two things:</p> <ul> <li>Reading a ZIP archive</li> <li>Parsing XML</li> </ul> <p>It turned out that Ceylon's file module has ZIP support, with the <code>createZipFileSystem</code> function as an entry point. I made two module-level functions beside the JVLT class for creating instances deriving from the abstract JVLT class:</p> <ul> <li><code>loadJVLT</code> which loads a JVLT ZIP archive from the file system</li> <li><code>loadJVLTFromDictionaryString</code> which loads a dict.xml-like XML document passed directly as a simple string. I'm using this for unit testing the XML parser.</li> </ul> <p>Let's see the ZIP handling first:</p> <pre data-lang="ceylon" style="background-color:#fafafa;color:#383a42;" class="language-ceylon "><code class="language-ceylon" data-lang="ceylon"><span style="color:#50a14f;">&quot;Loads a JVLT file from a `.jvlt` ZIP archive, if possible.&quot; </span><span style="color:#a626a4;">shared </span><span style="color:#c18401;">JVLT</span><span>? </span><span style="color:#e45649;">loadJVLT</span><span>(</span><span style="color:#c18401;">File </span><span style="color:#e45649;">file</span><span>) { </span><span> </span><span style="color:#a626a4;">value </span><span style="color:#e45649;">zip</span><span> = </span><span style="color:#e45649;">createZipFileSystem</span><span>(</span><span style="color:#e45649;">file</span><span>); </span><span> </span><span style="color:#a626a4;">value </span><span style="color:#e45649;">dictPath</span><span> = </span><span style="color:#e45649;">zip</span><span>.</span><span style="color:#e45649;">parsePath</span><span>(</span><span style="color:#50a14f;">&quot;/dict.xml&quot;</span><span>); </span><span> </span><span style="color:#a626a4;">if</span><span> (</span><span style="color:#a626a4;">is </span><span style="color:#c18401;">File </span><span style="color:#e45649;">dictFile</span><span> = </span><span style="color:#e45649;">dictPath</span><span>.</span><span 
style="color:#e45649;">resource</span><span>) { </span><span> </span><span style="color:#a626a4;">try</span><span> (</span><span style="color:#e45649;">reader</span><span> = </span><span style="color:#e45649;">dictFile</span><span>.</span><span style="color:#c18401;">Reader</span><span>()) { </span><span> </span><span> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">loadJVLTFromDictionaryString</span><span>(</span><span style="color:#e45649;">readAll</span><span>(</span><span style="color:#e45649;">reader</span><span>)); </span><span> } </span><span> } </span><span style="color:#a626a4;">else</span><span> { </span><span> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">null</span><span>; </span><span> } </span><span>} </span></code></pre> <p>Well, the error handling is not too sophisticated in this case, it either returns a JVLT or returns <code>null</code> if the given file did not have a <code>dict.xml</code> in it. Other error conditions such as a <code>dict.xml</code> with a wrong format, etc., are not handled currently. As you can see, I'm reusing my other load function here, once the <code>dict.xml</code> is read.</p> <p>There are two interesting things here. First, the if statement where we check if the resource is an instance of <code>File</code> and immediately store it in the value called <code>dictFile</code>. The <code>dictPath.resource</code> attribute has the type <code>Resource</code> which is a Ceylon interface. It is either an <code>ExistingResource</code>: <code>Directory</code>, <code>File</code> or <code>Link</code>, or <code>Nil</code>. In any case if it is not a <code>File</code> instance, we just return <code>null</code>.</p> <p>For simplicity, I'm reading the full <code>dict.xml</code> into a string before parsing it. 
For this purpose I wrote a small helper function <code>readAll</code>:</p> <pre data-lang="ceylon" style="background-color:#fafafa;color:#383a42;" class="language-ceylon "><code class="language-ceylon" data-lang="ceylon"><span style="color:#50a14f;">&quot;Reads all lines from a file reader and returns the concatenated string&quot; </span><span style="color:#c18401;">String </span><span style="color:#e45649;">readAll</span><span>(</span><span style="color:#c18401;">File</span><span>.</span><span style="color:#c18401;">Reader </span><span style="color:#e45649;">reader</span><span>) { </span><span> </span><span style="color:#a626a4;">variable </span><span style="color:#c18401;">String </span><span style="color:#e45649;">result</span><span> = </span><span style="color:#50a14f;">&quot;&quot;</span><span>; </span><span> </span><span> </span><span style="color:#a626a4;">while</span><span> (</span><span style="color:#a626a4;">exists </span><span style="color:#e45649;">line</span><span> = </span><span style="color:#e45649;">reader</span><span>.</span><span style="color:#e45649;">readLine</span><span>()) { </span><span> </span><span style="color:#e45649;">result</span><span> += </span><span style="color:#e45649;">line</span><span>; </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">result</span><span>; </span><span>} </span></code></pre> <p>Probably it's not an optimal solution, but works :)</p> <p>Now that we have our data model and have a way to build it up from XML, we can write some unit tests to see how it works. The Ceylon SDK has a test module and the Ceylon IDE supports running the tests. There is a <a href="http://ceylon-lang.org/documentation/1.0/ide/test-plugin/">separate page in the documentation</a> describing how. It is really simple, I had to add the test module as a dependency, and I created a separate file to hold my test definitions. 
The class groups the tests together and optionally supports running extra code before/after each test case, as in other test frameworks:</p> <pre data-lang="ceylon" style="background-color:#fafafa;color:#383a42;" class="language-ceylon "><code class="language-ceylon" data-lang="ceylon"><span style="color:#a626a4;">class </span><span style="color:#c18401;">DictionaryParserTests</span><span>() { </span><span> </span><span> </span><span style="color:#a626a4;">shared </span><span style="color:#e45649;">test </span><span style="color:#a626a4;">void </span><span style="color:#e45649;">emptyDictionary</span><span>() { </span><span> </span><span style="color:#a626a4;">value </span><span style="color:#e45649;">dic</span><span> = </span><span style="color:#e45649;">loadJVLTFromDictionaryString</span><span>(</span><span style="color:#50a14f;">&quot;&lt;dictionary&gt;&quot;</span><span>); </span><span> </span><span> </span><span style="color:#a626a4;">assert</span><span> (</span><span style="color:#e45649;">dic</span><span>.</span><span style="color:#e45649;">dictionary</span><span>.</span><span style="color:#e45649;">words</span><span>.</span><span style="color:#e45649;">empty</span><span>); </span><span> </span><span style="color:#a626a4;">assert</span><span> (</span><span style="color:#e45649;">dic</span><span>.</span><span style="color:#e45649;">dictionary</span><span>.</span><span style="color:#e45649;">language</span><span> == </span><span style="color:#50a14f;">&quot;unknown&quot;</span><span>); </span><span> } </span><span> </span><span> </span><span style="color:#a626a4;">shared </span><span style="color:#e45649;">test </span><span style="color:#a626a4;">void </span><span style="color:#e45649;">languageAttributeRead</span><span>() { </span><span> </span><span style="color:#a626a4;">value </span><span style="color:#e45649;">dic</span><span> = </span><span style="color:#e45649;">loadJVLTFromDictionaryString</span><span>(</span><span 
style="color:#50a14f;">&quot;&lt;dictionary language=&quot;</span><span style="color:#e45649;">testlang</span><span style="color:#50a14f;">&quot;&gt;&quot;</span><span>); </span><span> </span><span style="color:#a626a4;">assert</span><span> (</span><span style="color:#e45649;">dic</span><span>.</span><span style="color:#e45649;">dictionary</span><span>.</span><span style="color:#e45649;">language</span><span> == </span><span style="color:#50a14f;">&quot;testlang&quot;</span><span>); </span><span> } </span><span> </span><span> </span><span style="color:#a0a1a7;">// ... </span><span> </span></code></pre> <p>I won't paste all the test code here, only a few samples to give a feeling of what Ceylon code looks like. To test whether a given word's translations are loaded correctly, I wrote a helper function:</p> <pre data-lang="ceylon" style="background-color:#fafafa;color:#383a42;" class="language-ceylon "><code class="language-ceylon" data-lang="ceylon"><span style="color:#a626a4;">void </span><span style="color:#e45649;">assertSenses</span><span>(</span><span style="color:#c18401;">JVLT </span><span style="color:#e45649;">jvlt</span><span>, </span><span style="color:#c18401;">String </span><span style="color:#e45649;">w</span><span>, [</span><span style="color:#c18401;">String</span><span>+] </span><span style="color:#e45649;">expectedSenses</span><span>) { </span><span> </span><span> </span><span style="color:#c18401;">Word</span><span>? 
</span><span style="color:#e45649;">word</span><span> = </span><span style="color:#e45649;">jvlt</span><span>.</span><span style="color:#e45649;">dictionary</span><span>.</span><span style="color:#e45649;">words</span><span>[</span><span style="color:#e45649;">w</span><span>]; </span><span> </span><span style="color:#a626a4;">if</span><span> (</span><span style="color:#a626a4;">exists </span><span style="color:#e45649;">word</span><span>) { </span><span> </span><span style="color:#a626a4;">assert</span><span> (</span><span style="color:#e45649;">word</span><span>.</span><span style="color:#e45649;">senses</span><span>.</span><span style="color:#e45649;">equals</span><span>(</span><span style="color:#c18401;">HashSet</span><span>(</span><span style="color:#e45649;">expectedSenses</span><span>))); </span><span> } </span><span style="color:#a626a4;">else</span><span> { </span><span> </span><span style="color:#e45649;">fail</span><span>(</span><span style="color:#50a14f;">&quot;Word does not exist&quot;</span><span>); </span><span> } </span><span>} </span></code></pre> <p>This helper function can be used to assert that a word has been loaded correctly:</p> <pre data-lang="ceylon" style="background-color:#fafafa;color:#383a42;" class="language-ceylon "><code class="language-ceylon" data-lang="ceylon"><span style="color:#a626a4;">shared </span><span style="color:#e45649;">test </span><span style="color:#a626a4;">void </span><span style="color:#e45649;">wordWithMultipleSenses</span><span>() { </span><span> </span><span style="color:#a626a4;">value </span><span style="color:#e45649;">dic</span><span> = </span><span style="color:#e45649;">loadJVLTFromDictionaryString</span><span>( </span><span> </span><span style="color:#50a14f;">&quot;&lt;dictionary&gt; </span><span style="color:#50a14f;"> &lt;entry id=&quot;</span><span style="color:#e45649;">e1</span><span style="color:#50a14f;">&quot;&gt; </span><span style="color:#50a14f;"> &lt;orth&gt;src1&lt;/orth&gt; </span><span 
style="color:#50a14f;"> &lt;sense id=&quot;</span><span style="color:#e45649;">e1</span><span>-</span><span style="color:#e45649;">s1</span><span style="color:#50a14f;">&quot;&gt; </span><span style="color:#50a14f;"> &lt;trans&gt;dst1&lt;/trans&gt; </span><span style="color:#50a14f;"> &lt;/sense&gt; </span><span style="color:#50a14f;"> &lt;sense id=&quot;</span><span style="color:#e45649;">e1</span><span>-</span><span style="color:#e45649;">s2</span><span style="color:#50a14f;">&quot;&gt; </span><span style="color:#50a14f;"> &lt;trans&gt;dst2&lt;/trans&gt; </span><span style="color:#50a14f;"> &lt;/sense&gt; </span><span style="color:#50a14f;"> &lt;/entry&gt; </span><span style="color:#50a14f;"> &lt;/dictionary&gt;&quot;</span><span>); </span><span> </span><span> </span><span style="color:#e45649;">assertSenses</span><span>(</span><span style="color:#e45649;">dic</span><span>, </span><span style="color:#50a14f;">&quot;src1&quot;</span><span>, [</span><span style="color:#50a14f;">&quot;dst1&quot;</span><span>, </span><span style="color:#50a14f;">&quot;dst2&quot;</span><span>]); </span><span>} </span></code></pre> <p>Now the only problem is that there is no XML parsing support in the Ceylon SDK currently, so it has to be done using Java interop. While writing the code that builds up the data model from the XML, I wrote several helper functions to make the Java API fit the language better. So let's first see how the dictionary loading is defined, and then I'll show the helper functions.</p> <p>The XML parsing is done by two module-level functions which are not shared - they are only used by the JVLT constructor functions I showed before. 
The first one creates a map entry for a single word:</p> <pre data-lang="ceylon" style="background-color:#fafafa;color:#383a42;" class="language-ceylon "><code class="language-ceylon" data-lang="ceylon"><span style="color:#50a14f;">&quot;Creates a word entry for the dictionary&quot; </span><span style="color:#c18401;">String</span><span>-&gt;</span><span style="color:#c18401;">Word </span><span style="color:#e45649;">loadEntry</span><span>(</span><span style="color:#c18401;">Element </span><span style="color:#e45649;">elem</span><span>) { </span><span> </span><span> </span><span style="color:#a626a4;">value </span><span style="color:#e45649;">w</span><span> = </span><span style="color:#c18401;">Word</span><span> { </span><span> </span><span style="color:#e45649;">word</span><span> = </span><span style="color:#e45649;">selectNodeText</span><span>(</span><span style="color:#e45649;">elem</span><span>, </span><span style="color:#50a14f;">&quot;orth&quot;</span><span>) </span><span style="color:#a626a4;">else </span><span style="color:#50a14f;">&quot;???&quot;</span><span>; </span><span> </span><span style="color:#e45649;">lesson</span><span> = </span><span style="color:#e45649;">selectNodeInteger</span><span>(</span><span style="color:#e45649;">elem</span><span>, </span><span style="color:#50a14f;">&quot;lesson&quot;</span><span>) </span><span style="color:#a626a4;">else </span><span style="color:#c18401;">0</span><span>; </span><span> </span><span style="color:#e45649;">senses</span><span> = </span><span style="color:#c18401;">HashSet</span><span>(</span><span style="color:#e45649;">selectNodes</span><span>(</span><span style="color:#e45649;">elem</span><span>, </span><span style="color:#50a14f;">&quot;sense/trans&quot;</span><span>) </span><span> .</span><span style="color:#e45649;">map</span><span>((</span><span style="color:#c18401;">Node </span><span style="color:#e45649;">n</span><span>) =&gt; </span><span style="color:#e45649;">n</span><span>.</span><span 
style="color:#e45649;">textContent</span><span>)); </span><span> }; </span><span> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">w</span><span>.</span><span style="color:#e45649;">word</span><span>-&gt;</span><span style="color:#e45649;">w</span><span>; </span><span>} </span></code></pre> <p>and the second one loads all the words from the XML document:</p> <pre data-lang="ceylon" style="background-color:#fafafa;color:#383a42;" class="language-ceylon "><code class="language-ceylon" data-lang="ceylon"><span style="color:#50a14f;">&quot;Loads a dictionary from JVLT&#39;s `dict.xml` format.&quot; </span><span style="color:#c18401;">Dictionary </span><span style="color:#e45649;">loadDictionaryFromXML</span><span>(</span><span style="color:#c18401;">Document </span><span style="color:#a626a4;">doc</span><span>) { </span><span> </span><span> </span><span style="color:#a626a4;">doc</span><span>.</span><span style="color:#e45649;">documentElement</span><span>.</span><span style="color:#e45649;">normalize</span><span>(); </span><span> </span><span> </span><span style="color:#a626a4;">return </span><span style="color:#c18401;">Dictionary</span><span> { </span><span> </span><span style="color:#e45649;">language</span><span> = </span><span style="color:#e45649;">getAttribute</span><span>(</span><span style="color:#a626a4;">doc</span><span>.</span><span style="color:#e45649;">documentElement</span><span>, </span><span style="color:#50a14f;">&quot;language&quot;</span><span>) </span><span style="color:#a626a4;">else </span><span style="color:#50a14f;">&quot;unknown&quot;</span><span>; </span><span> </span><span style="color:#e45649;">words</span><span> = </span><span style="color:#c18401;">HashMap</span><span>({ </span><span> </span><span style="color:#a626a4;">for</span><span> (</span><span style="color:#e45649;">node </span><span style="color:#a626a4;">in </span><span style="color:#e45649;">selectNodes</span><span>(</span><span 
style="color:#a626a4;">doc</span><span>, </span><span style="color:#50a14f;">&quot;dictionary/entry&quot;</span><span>)) </span><span> </span><span style="color:#a626a4;">if</span><span> (</span><span style="color:#a626a4;">is </span><span style="color:#c18401;">Element </span><span style="color:#e45649;">elem</span><span> = </span><span style="color:#e45649;">node</span><span>) </span><span> </span><span style="color:#e45649;">loadEntry</span><span>(</span><span style="color:#e45649;">elem</span><span>) }); </span><span> }; </span><span>} </span></code></pre> <p>The function which returns the JVLT instance uses this function and Java interop to read the dictionary:</p> <pre data-lang="ceylon" style="background-color:#fafafa;color:#383a42;" class="language-ceylon "><code class="language-ceylon" data-lang="ceylon"><span style="color:#50a14f;">&quot;Loads a JVLT file by parsing the dictionary XML directly from a string&quot; </span><span style="color:#a626a4;">shared </span><span style="color:#c18401;">JVLT </span><span style="color:#e45649;">loadJVLTFromDictionaryString</span><span>(</span><span style="color:#c18401;">String </span><span style="color:#e45649;">dictXML</span><span>) { </span><span> </span><span style="color:#a626a4;">value </span><span style="color:#e45649;">docBuilderFactory</span><span> = </span><span style="color:#c18401;">DocumentBuilderFactory</span><span>.</span><span style="color:#e45649;">newInstance</span><span>(); </span><span> </span><span style="color:#a626a4;">value </span><span style="color:#e45649;">builder</span><span> = </span><span style="color:#e45649;">docBuilderFactory</span><span>.</span><span style="color:#e45649;">newDocumentBuilder</span><span>(); </span><span> </span><span style="color:#a626a4;">value doc</span><span> = </span><span style="color:#e45649;">builder</span><span>.</span><span style="color:#e45649;">parse</span><span>(</span><span style="color:#c18401;">ByteArrayInputStream</span><span>(</span><span 
style="color:#e45649;">javaString</span><span>(</span><span style="color:#e45649;">dictXML</span><span>).</span><span style="color:#e45649;">bytes</span><span>)); </span><span> </span><span> </span><span style="color:#a626a4;">object </span><span style="color:#e45649;">result </span><span style="color:#a626a4;">extends </span><span style="color:#c18401;">JVLT</span><span>() { </span><span> </span><span style="color:#e45649;">dictionary</span><span> = </span><span style="color:#e45649;">loadDictionaryFromXML</span><span>(</span><span style="color:#a626a4;">doc</span><span>); </span><span> } </span><span> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">result</span><span>; </span><span>} </span></code></pre> <p>There are two things to notice here. First, we had to convert Ceylon's string to a Java string. This is not done automatically, and we need the <code>ceylon.interop.java</code> module to do it. Second, in the last lines we define an anonymous class extending JVLT and overriding its abstract dictionary attribute. Then this anonymous class instance is returned as the loaded JVLT.</p> <p>To make the XML parsing less painful, I defined a few helper functions in a separate compilation unit (<code>XmlHelper.ceylon</code>). I won't show the full file here, but there are some interesting parts. First, from Ceylon you cannot call static methods, but you can import them. 
I'm using the following two import statements:</p> <pre data-lang="ceylon" style="background-color:#fafafa;color:#383a42;" class="language-ceylon "><code class="language-ceylon" data-lang="ceylon"><span style="color:#a626a4;">import </span><span style="color:#e45649;">org</span><span>.</span><span style="color:#e45649;">w3c</span><span>.</span><span style="color:#e45649;">dom</span><span> { </span><span style="color:#c18401;">Node</span><span>, </span><span style="color:#c18401;">NodeList</span><span>, </span><span style="color:#c18401;">Element</span><span> } </span><span style="color:#a626a4;">import </span><span style="color:#e45649;">javax</span><span>.</span><span style="color:#e45649;">xml</span><span>.</span><span style="color:#e45649;">xpath</span><span> { </span><span style="color:#c18401;">XPathFactory</span><span> { </span><span style="color:#e45649;">newXPathFactory</span><span> = </span><span style="color:#e45649;">newInstance</span><span> }, </span><span> </span><span style="color:#c18401;">XPathConstants</span><span> { </span><span style="color:#e45649;">nodeSet</span><span> = </span><span style="color:#e45649;">\iNODESET</span><span> }} </span></code></pre> <p>The first one is straightforward. It imports three DOM interfaces. The second one first imports the <code>XPathFactory.newInstance</code> static method and also renames it, as newInstance is too generic a name without its class name as a prefix. The third line imports a constant value and gives it a Ceylon-compatible name. Because in Ceylon only type names can start with an uppercase character, we have to use a special and ugly syntax which helps the interoperability - prefixing it with <code>\i</code>.</p> <p>The <code>ceylon.interop.java</code> module has helper classes to make Java Iterable objects iterable in Ceylon, but unfortunately the <code>NodeList</code> interface is not iterable in Java either. 
So I wrote a simple wrapper that iterates through a node list:</p> <pre data-lang="ceylon" style="background-color:#fafafa;color:#383a42;" class="language-ceylon "><code class="language-ceylon" data-lang="ceylon"><span style="color:#a626a4;">class </span><span style="color:#c18401;">NodeListIterator</span><span>(</span><span style="color:#c18401;">NodeList </span><span style="color:#e45649;">nodes</span><span>) </span><span style="color:#a626a4;">satisfies </span><span style="color:#c18401;">Iterable</span><span>&lt;</span><span style="color:#c18401;">Node</span><span>&gt; { </span><span> </span><span style="color:#a626a4;">shared actual default </span><span style="color:#c18401;">Iterator</span><span>&lt;</span><span style="color:#c18401;">Node</span><span>&gt; </span><span style="color:#e45649;">iterator</span><span>() { </span><span> </span><span style="color:#a626a4;">object </span><span style="color:#e45649;">it </span><span style="color:#a626a4;">satisfies </span><span style="color:#c18401;">Iterator</span><span>&lt;</span><span style="color:#c18401;">Node</span><span>&gt; { </span><span> </span><span style="color:#a626a4;">variable </span><span style="color:#c18401;">Integer </span><span style="color:#e45649;">i</span><span> = </span><span style="color:#c18401;">0</span><span>; </span><span> </span><span> </span><span style="color:#a626a4;">shared actual </span><span style="color:#c18401;">Node</span><span>|</span><span style="color:#c18401;">Finished </span><span style="color:#e45649;">next</span><span>() { </span><span> </span><span style="color:#a626a4;">if</span><span> (</span><span style="color:#e45649;">i</span><span> &lt; </span><span style="color:#e45649;">nodes</span><span>.</span><span style="color:#e45649;">length</span><span>) { </span><span> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">nodes</span><span>.</span><span style="color:#e45649;">item</span><span>(</span><span style="color:#e45649;">i</span><span>++); 
</span><span> } </span><span style="color:#a626a4;">else</span><span> { </span><span> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">finished</span><span>; </span><span> } </span><span> } </span><span> } </span><span> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">it</span><span>; </span><span> } </span><span>} </span></code></pre> <p>Using this iterator and the imports I wrote a <code>selectNodes</code> function to run XPath expressions and return the result as a Ceylon iterable:</p> <pre data-lang="ceylon" style="background-color:#fafafa;color:#383a42;" class="language-ceylon "><code class="language-ceylon" data-lang="ceylon"><span>{</span><span style="color:#c18401;">Node</span><span>*} </span><span style="color:#e45649;">selectNodes</span><span>(</span><span style="color:#c18401;">Node </span><span style="color:#e45649;">root</span><span>, </span><span style="color:#c18401;">String </span><span style="color:#e45649;">xpath</span><span>) { </span><span> </span><span style="color:#a626a4;">value </span><span style="color:#e45649;">factory</span><span> = </span><span style="color:#e45649;">newXPathFactory</span><span>(); </span><span> </span><span style="color:#a626a4;">value </span><span style="color:#e45649;">xpathCompiler</span><span> = </span><span style="color:#e45649;">factory</span><span>.</span><span style="color:#e45649;">newXPath</span><span>(); </span><span> </span><span style="color:#a626a4;">value </span><span style="color:#e45649;">expr</span><span> = </span><span style="color:#e45649;">xpathCompiler</span><span>.</span><span style="color:#e45649;">compile</span><span>(</span><span style="color:#e45649;">xpath</span><span>); </span><span> </span><span style="color:#a626a4;">value </span><span style="color:#e45649;">nodeList</span><span> = </span><span style="color:#e45649;">expr</span><span>.</span><span style="color:#e45649;">evaluate</span><span>(</span><span 
style="color:#e45649;">root</span><span>, </span><span style="color:#e45649;">nodeSet</span><span>); </span><span> </span><span style="color:#a626a4;">if</span><span> (</span><span style="color:#a626a4;">is </span><span style="color:#c18401;">NodeList </span><span style="color:#e45649;">nodeList</span><span>) { </span><span> </span><span style="color:#a626a4;">return </span><span style="color:#c18401;">NodeListIterator</span><span>(</span><span style="color:#e45649;">nodeList</span><span>); </span><span> } </span><span> </span><span style="color:#a626a4;">else</span><span> { </span><span> </span><span style="color:#a626a4;">return</span><span> []; </span><span> } </span><span>} </span></code></pre> <p>Using this function it is very easy to write a variant that selects a single node:</p> <pre data-lang="ceylon" style="background-color:#fafafa;color:#383a42;" class="language-ceylon "><code class="language-ceylon" data-lang="ceylon"><span style="color:#c18401;">Node</span><span>? </span><span style="color:#e45649;">selectNode</span><span>(</span><span style="color:#c18401;">Node </span><span style="color:#e45649;">root</span><span>, </span><span style="color:#c18401;">String </span><span style="color:#e45649;">xpath</span><span>) { </span><span> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">selectNodes</span><span>(</span><span style="color:#e45649;">root</span><span>, </span><span style="color:#e45649;">xpath</span><span>).</span><span style="color:#e45649;">first</span><span>; </span><span>} </span></code></pre> <p>There are some other helper functions returning the node's text, converting it to an integer, etc., but I think they are not that interesting. Now that I have my data model which is built from my JVLT file, the next thing is to make a user interface somehow where the vocabulary can be shown and the user's knowledge can be tested/improved. 
This will be the topic of some future posts, as soon as I have time to experiment more with this new language.</p> Cloning WPF flow document fragments 2013-10-25T00:00:00+00:00 2013-10-25T00:00:00+00:00 Daniel Vigovszky https://blog.vigoo.dev/posts/cloning-wpf-flow-document-fragments/ <p>Today I had to write such an ugly hack to fix a bug that I decided to start writing a blog where I can show it to the world :)</p> <p>The software I'm working on has some sort of context-sensitive help panel, which is implemented using dynamically generated <a href="http://msdn.microsoft.com/en-us/library/aa970909.aspx">flow documents</a>. The software loads a large set of flow document sections from a XAML file at runtime, and later builds documents from a subset of them.</p> <p>For some reason (which belongs in a separate post), it is not possible to reuse these flow document elements in multiple flow documents, not even if there is only one at a time. To work around this, I was cloning these sections before adding them to the document.</p> <p>As WPF elements are not <em>cloneable</em>, I was using the method recommended in many places, for example <a href="http://stackoverflow.com/questions/32541/how-can-you-clone-a-wpf-object">in this StackOverflow post</a>: saving the object tree to an in-memory XAML stream, and loading it back.</p> <p>This worked quite well... until we discovered a bug, which I still cannot explain. 
In some cases, which any developer could easily reproduce, the clone method simply stopped working, even though the code running in those cases was exactly the same as in the other, working cases.</p> <p>Stopped working here means that the following code:</p> <pre data-lang="cs" style="background-color:#fafafa;color:#383a42;" class="language-cs "><code class="language-cs" data-lang="cs"><span style="color:#a626a4;">var </span><span style="color:#e45649;">xaml </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">XamlWriter</span><span>.</span><span style="color:#e45649;">Save</span><span>(</span><span style="color:#e45649;">block</span><span>); </span></code></pre> <p>would write out the correct object hierarchy, but without any properties (no attributes, no content properties, nothing but the element names)! At the same time the objects in memory were untouched and still had all the relevant properties set.</p> <p>I also tried to write my own XAML serializer based on the code found <a href="http://go4answers.webhost4life.com/Example/xaml-serialization-replacement-75133.aspx">at this site</a>, but this only helped me find out that the problem lies deep within the <code>MarkupWriter</code> class, which is what the <code>XamlWriter</code> uses internally. 
When the <code>XamlWriter</code> failed, my own code could not find any properties using the returned <a href="http://msdn.microsoft.com/en-us/library/system.windows.markup.primitives.markupobject.aspx">MarkupObject</a>:</p> <pre data-lang="cs" style="background-color:#fafafa;color:#383a42;" class="language-cs "><code class="language-cs" data-lang="cs"><span>MarkupObject </span><span style="color:#e45649;">markupObj </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">MarkupWriter</span><span>.</span><span style="color:#e45649;">GetMarkupObjectFor</span><span>(</span><span style="color:#e45649;">obj</span><span>); </span></code></pre> <p>For the same object, in the working scenarios it returned a markup object with a working <code>Properties</code> collection.</p> <p>So here is the final <em>"solution"</em> which I'm not really proud of, but it solved the problem. Maybe with some modifications it will be useful for someone struggling with the framework:</p> <pre data-lang="cs" style="background-color:#fafafa;color:#383a42;" class="language-cs "><code class="language-cs" data-lang="cs"><span style="color:#a0a1a7;">/// &lt;</span><span style="color:#e45649;">summary</span><span style="color:#a0a1a7;">&gt; </span><span style="color:#a0a1a7;">/// Horrible ugly clone hack to work around issues where XamlWriter/XamlReader based </span><span style="color:#a0a1a7;">/// clone method did not work. 
</span><span style="color:#a0a1a7;">/// &lt;/</span><span style="color:#e45649;">summary</span><span style="color:#a0a1a7;">&gt; </span><span style="color:#a626a4;">public static class </span><span style="color:#c18401;">CloneHelper </span><span style="color:#c18401;">{ </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">public static </span><span style="color:#c18401;">Block </span><span style="color:#0184bc;">Clone</span><span style="color:#c18401;">&lt;T&gt;(</span><span style="color:#a626a4;">this </span><span style="color:#c18401;">T </span><span style="color:#e45649;">block</span><span style="color:#c18401;">) </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">where </span><span style="color:#c18401;">T : Block </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">var </span><span style="color:#e45649;">result </span><span style="color:#a626a4;">= </span><span style="color:#c18401;">(T)</span><span style="color:#e45649;">DeepClone</span><span style="color:#c18401;">(</span><span style="color:#e45649;">block</span><span style="color:#c18401;">); </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">result</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private static object </span><span style="color:#0184bc;">DeepClone</span><span style="color:#c18401;">(</span><span style="color:#a626a4;">object </span><span style="color:#e45649;">obj</span><span style="color:#c18401;">) </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">if </span><span style="color:#c18401;">(</span><span style="color:#e45649;">obj </span><span style="color:#a626a4;">!= </span><span 
style="color:#c18401;">null) </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// Replacing ResourceDictionary and Style values with null. </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// In this particular use case it is correct to do </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">if </span><span style="color:#c18401;">(</span><span style="color:#e45649;">obj</span><span style="color:#c18401;">.</span><span style="color:#e45649;">GetType</span><span style="color:#c18401;">() </span><span style="color:#a626a4;">== typeof</span><span style="color:#c18401;">(ResourceDictionary) </span><span style="color:#a626a4;">|| </span><span style="color:#c18401;"> </span><span style="color:#e45649;">obj</span><span style="color:#c18401;">.</span><span style="color:#e45649;">GetType</span><span style="color:#c18401;">() </span><span style="color:#a626a4;">== typeof</span><span style="color:#c18401;">(Style)) </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">return </span><span style="color:#c18401;">null; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">else </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// Value types and some special cases where we don&#39;t want to clone </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">if </span><span style="color:#c18401;">(</span><span style="color:#e45649;">obj</span><span style="color:#c18401;">.</span><span style="color:#e45649;">GetType</span><span style="color:#c18401;">().</span><span style="color:#e45649;">IsValueType </span><span style="color:#a626a4;">|| </span><span style="color:#c18401;"> </span><span style="color:#e45649;">obj</span><span style="color:#c18401;">.</span><span 
style="color:#e45649;">GetType</span><span style="color:#c18401;">() </span><span style="color:#a626a4;">== typeof </span><span style="color:#c18401;">(Cursor) </span><span style="color:#a626a4;">|| </span><span style="color:#c18401;"> </span><span style="color:#e45649;">obj</span><span style="color:#c18401;">.</span><span style="color:#e45649;">GetType</span><span style="color:#c18401;">() </span><span style="color:#a626a4;">== typeof </span><span style="color:#c18401;">(XmlLanguage)) </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">obj</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">else </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// If it is cloneable, use it </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">var </span><span style="color:#e45649;">cloneable </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">obj </span><span style="color:#a626a4;">as </span><span style="color:#c18401;">ICloneable; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">if </span><span style="color:#c18401;">(</span><span style="color:#e45649;">cloneable </span><span style="color:#a626a4;">!= </span><span style="color:#c18401;">null) </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">cloneable</span><span style="color:#c18401;">.</span><span style="color:#e45649;">Clone</span><span style="color:#c18401;">(); </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">else </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// Creating 
the clone with reflection </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">var </span><span style="color:#e45649;">typ </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">obj</span><span style="color:#c18401;">.</span><span style="color:#e45649;">GetType</span><span style="color:#c18401;">(); </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">var </span><span style="color:#e45649;">clone </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">Activator</span><span style="color:#c18401;">.</span><span style="color:#e45649;">CreateInstance</span><span style="color:#c18401;">(</span><span style="color:#e45649;">typ</span><span style="color:#c18401;">); </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// Property names which are known locally set </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// dependency properties </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">var </span><span style="color:#e45649;">usedNames </span><span style="color:#a626a4;">= new </span><span style="color:#c18401;">HashSet&lt;</span><span style="color:#a626a4;">string</span><span style="color:#c18401;">&gt;(); </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// Copying locally set dependency properties from the </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// source to the target </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">var </span><span style="color:#e45649;">dobjSource </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">obj </span><span style="color:#a626a4;">as </span><span style="color:#c18401;">DependencyObject; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">var </span><span style="color:#e45649;">dobjTarget </span><span 
style="color:#a626a4;">= </span><span style="color:#e45649;">clone </span><span style="color:#a626a4;">as </span><span style="color:#c18401;">DependencyObject; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">if </span><span style="color:#c18401;">(</span><span style="color:#e45649;">dobjSource </span><span style="color:#a626a4;">!= </span><span style="color:#c18401;">null </span><span style="color:#a626a4;">&amp;&amp; </span><span style="color:#e45649;">dobjTarget </span><span style="color:#a626a4;">!= </span><span style="color:#c18401;">null) </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">var </span><span style="color:#e45649;">locallySetProperties </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> </span><span style="color:#e45649;">dobjSource</span><span style="color:#c18401;">.</span><span style="color:#e45649;">GetLocalValueEnumerator</span><span style="color:#c18401;">(); </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">while </span><span style="color:#c18401;">(</span><span style="color:#e45649;">locallySetProperties</span><span style="color:#c18401;">.</span><span style="color:#e45649;">MoveNext</span><span style="color:#c18401;">()) </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> DependencyProperty </span><span style="color:#e45649;">dp </span><span style="color:#a626a4;">= </span><span style="color:#c18401;"> </span><span style="color:#e45649;">locallySetProperties</span><span style="color:#c18401;">.</span><span style="color:#e45649;">Current</span><span style="color:#c18401;">.</span><span style="color:#e45649;">Property</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">if </span><span style="color:#c18401;">(</span><span style="color:#a626a4;">!</span><span style="color:#e45649;">dp</span><span style="color:#c18401;">.</span><span 
style="color:#e45649;">ReadOnly</span><span style="color:#c18401;">) </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">dobjTarget</span><span style="color:#c18401;">.</span><span style="color:#e45649;">SetValue</span><span style="color:#c18401;">(</span><span style="color:#e45649;">dp</span><span style="color:#c18401;">, </span><span style="color:#e45649;">dobjSource</span><span style="color:#c18401;">.</span><span style="color:#e45649;">GetValue</span><span style="color:#c18401;">(</span><span style="color:#e45649;">dp</span><span style="color:#c18401;">)); </span><span style="color:#c18401;"> </span><span style="color:#e45649;">usedNames</span><span style="color:#c18401;">.</span><span style="color:#e45649;">Add</span><span style="color:#c18401;">(</span><span style="color:#e45649;">dp</span><span style="color:#c18401;">.</span><span style="color:#e45649;">Name</span><span style="color:#c18401;">); </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// Getting all the public, non-static properties of the source </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">foreach </span><span style="color:#c18401;">(</span><span style="color:#a626a4;">var </span><span style="color:#e45649;">pi </span><span style="color:#a626a4;">in </span><span style="color:#e45649;">typ</span><span style="color:#c18401;">.</span><span style="color:#e45649;">GetProperties</span><span style="color:#c18401;">( </span><span style="color:#c18401;"> </span><span style="color:#e45649;">BindingFlags</span><span style="color:#c18401;">.</span><span style="color:#e45649;">Instance </span><span style="color:#a626a4;">| </span><span style="color:#c18401;"> </span><span style="color:#e45649;">BindingFlags</span><span 
style="color:#c18401;">.</span><span style="color:#e45649;">Public </span><span style="color:#a626a4;">| </span><span style="color:#c18401;"> </span><span style="color:#e45649;">BindingFlags</span><span style="color:#c18401;">.</span><span style="color:#e45649;">FlattenHierarchy</span><span style="color:#c18401;">)) </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// If it is not a dependency property </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// and not the default property... </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">if </span><span style="color:#c18401;">(</span><span style="color:#e45649;">pi</span><span style="color:#c18401;">.</span><span style="color:#e45649;">CanRead </span><span style="color:#a626a4;">&amp;&amp; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">!</span><span style="color:#e45649;">usedNames</span><span style="color:#c18401;">.</span><span style="color:#e45649;">Contains</span><span style="color:#c18401;">(</span><span style="color:#e45649;">pi</span><span style="color:#c18401;">.</span><span style="color:#e45649;">Name</span><span style="color:#c18401;">) </span><span style="color:#a626a4;">&amp;&amp; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">!</span><span style="color:#e45649;">IsDependencyProperty</span><span style="color:#c18401;">(</span><span style="color:#e45649;">dobjSource</span><span style="color:#c18401;">, </span><span style="color:#e45649;">pi</span><span style="color:#c18401;">) </span><span style="color:#a626a4;">&amp;&amp; </span><span style="color:#c18401;"> </span><span style="color:#e45649;">pi</span><span style="color:#c18401;">.</span><span style="color:#e45649;">Name </span><span style="color:#a626a4;">!= </span><span style="color:#50a14f;">&quot;Item&quot;</span><span style="color:#c18401;">) </span><span style="color:#c18401;"> { </span><span 
style="color:#c18401;"> </span><span style="color:#a626a4;">var </span><span style="color:#e45649;">val </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">pi</span><span style="color:#c18401;">.</span><span style="color:#e45649;">GetValue</span><span style="color:#c18401;">(</span><span style="color:#e45649;">obj</span><span style="color:#c18401;">, null); </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ..and it is writeable, then we recursively clone </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// the value and set the property: </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">if </span><span style="color:#c18401;">(</span><span style="color:#e45649;">pi</span><span style="color:#c18401;">.</span><span style="color:#e45649;">CanWrite</span><span style="color:#c18401;">) </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#e45649;">pi</span><span style="color:#c18401;">.</span><span style="color:#e45649;">SetValue</span><span style="color:#c18401;">(</span><span style="color:#e45649;">clone</span><span style="color:#c18401;">, </span><span style="color:#e45649;">DeepClone</span><span style="color:#c18401;">(</span><span style="color:#e45649;">val</span><span style="color:#c18401;">), null); </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">else </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// ..otherwise if it is a readonly list property, </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// go through each item, clone it and add to </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">// the clone&#39;s list property </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">if </span><span 
style="color:#c18401;">(</span><span style="color:#e45649;">pi</span><span style="color:#c18401;">.</span><span style="color:#e45649;">PropertyType </span><span style="color:#c18401;"> .</span><span style="color:#e45649;">GetInterfaces</span><span style="color:#c18401;">() </span><span style="color:#c18401;"> .</span><span style="color:#e45649;">Contains</span><span style="color:#c18401;">(</span><span style="color:#a626a4;">typeof </span><span style="color:#c18401;">(IList))) </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">var </span><span style="color:#e45649;">source </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">val </span><span style="color:#a626a4;">as </span><span style="color:#c18401;">IList; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">var </span><span style="color:#e45649;">target </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">pi</span><span style="color:#c18401;">.</span><span style="color:#e45649;">GetValue</span><span style="color:#c18401;">(</span><span style="color:#e45649;">clone</span><span style="color:#c18401;">, null) </span><span style="color:#a626a4;">as </span><span style="color:#c18401;">IList; </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">if </span><span style="color:#c18401;">(</span><span style="color:#e45649;">source </span><span style="color:#a626a4;">!= </span><span style="color:#c18401;">null </span><span style="color:#a626a4;">&amp;&amp; </span><span style="color:#e45649;">target </span><span style="color:#a626a4;">!= </span><span style="color:#c18401;">null) </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">foreach </span><span style="color:#c18401;">(</span><span style="color:#a626a4;">var </span><span style="color:#e45649;">item </span><span 
style="color:#a626a4;">in </span><span style="color:#e45649;">source</span><span style="color:#c18401;">) </span><span style="color:#c18401;"> </span><span style="color:#e45649;">target</span><span style="color:#c18401;">.</span><span style="color:#e45649;">Add</span><span style="color:#c18401;">(</span><span style="color:#e45649;">DeepClone</span><span style="color:#c18401;">(</span><span style="color:#e45649;">item</span><span style="color:#c18401;">)); </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">return </span><span style="color:#e45649;">clone</span><span style="color:#c18401;">; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">else </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">return </span><span style="color:#c18401;">null; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a0a1a7;">/// &lt;</span><span style="color:#e45649;">summary</span><span style="color:#a0a1a7;">&gt; </span><span style="color:#a0a1a7;"> /// Tries to determine if a property is a dependency property, by reflection and </span><span style="color:#a0a1a7;"> /// naming convention </span><span style="color:#a0a1a7;"> /// &lt;/</span><span style="color:#e45649;">summary</span><span style="color:#a0a1a7;">&gt; </span><span style="color:#a0a1a7;"> /// &lt;</span><span style="color:#e45649;">param </span><span style="color:#c18401;">name</span><span 
style="color:#a0a1a7;">=</span><span style="color:#50a14f;">&quot;dobj&quot;</span><span style="color:#a0a1a7;">&gt;Dependency object&lt;/</span><span style="color:#e45649;">param</span><span style="color:#a0a1a7;">&gt; </span><span style="color:#a0a1a7;"> /// &lt;</span><span style="color:#e45649;">param </span><span style="color:#c18401;">name</span><span style="color:#a0a1a7;">=</span><span style="color:#50a14f;">&quot;pi&quot;</span><span style="color:#a0a1a7;">&gt;Property info&lt;/</span><span style="color:#e45649;">param</span><span style="color:#a0a1a7;">&gt; </span><span style="color:#a0a1a7;"> /// &lt;</span><span style="color:#e45649;">returns</span><span style="color:#a0a1a7;">&gt;Returns &lt;</span><span style="color:#e45649;">c</span><span style="color:#a0a1a7;">&gt;true&lt;/</span><span style="color:#e45649;">c</span><span style="color:#a0a1a7;">&gt; if the given property seems to be a </span><span style="color:#a0a1a7;"> /// CLR access property for a dependency property.&lt;/</span><span style="color:#e45649;">returns</span><span style="color:#a0a1a7;">&gt; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">private static bool </span><span style="color:#0184bc;">IsDependencyProperty</span><span style="color:#c18401;">(DependencyObject </span><span style="color:#e45649;">dobj</span><span style="color:#c18401;">, PropertyInfo </span><span style="color:#e45649;">pi</span><span style="color:#c18401;">) </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">if </span><span style="color:#c18401;">(</span><span style="color:#e45649;">dobj </span><span style="color:#a626a4;">!= </span><span style="color:#c18401;">null) </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">var </span><span style="color:#e45649;">dpProp </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">dobj</span><span style="color:#c18401;">.</span><span style="color:#e45649;">GetType</span><span style="color:#c18401;">().</span><span style="color:#e45649;">GetProperty</span><span 
style="color:#c18401;">(</span><span style="color:#e45649;">pi</span><span style="color:#c18401;">.</span><span style="color:#e45649;">Name </span><span style="color:#a626a4;">+ </span><span style="color:#50a14f;">&quot;Property&quot;</span><span style="color:#c18401;">, </span><span style="color:#c18401;"> </span><span style="color:#e45649;">BindingFlags</span><span style="color:#c18401;">.</span><span style="color:#e45649;">Static </span><span style="color:#a626a4;">| </span><span style="color:#c18401;"> </span><span style="color:#e45649;">BindingFlags</span><span style="color:#c18401;">.</span><span style="color:#e45649;">Public </span><span style="color:#a626a4;">| </span><span style="color:#c18401;"> </span><span style="color:#e45649;">BindingFlags</span><span style="color:#c18401;">.</span><span style="color:#e45649;">FlattenHierarchy</span><span style="color:#c18401;">); </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">if </span><span style="color:#c18401;">(</span><span style="color:#e45649;">dpProp </span><span style="color:#a626a4;">!= </span><span style="color:#c18401;">null </span><span style="color:#a626a4;">&amp;&amp; </span><span style="color:#e45649;">dpProp</span><span style="color:#c18401;">.</span><span style="color:#e45649;">PropertyType </span><span style="color:#a626a4;">== typeof </span><span style="color:#c18401;">(DependencyProperty)) </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">return </span><span style="color:#c18401;">true; </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">else </span><span style="color:#c18401;"> { </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">var </span><span style="color:#e45649;">dpField </span><span style="color:#a626a4;">= </span><span style="color:#e45649;">dobj</span><span style="color:#c18401;">.</span><span style="color:#e45649;">GetType</span><span style="color:#c18401;">().</span><span 
style="color:#e45649;">GetField</span><span style="color:#c18401;">(</span><span style="color:#e45649;">pi</span><span style="color:#c18401;">.</span><span style="color:#e45649;">Name </span><span style="color:#a626a4;">+ </span><span style="color:#50a14f;">&quot;Property&quot;</span><span style="color:#c18401;">, </span><span style="color:#c18401;"> </span><span style="color:#e45649;">BindingFlags</span><span style="color:#c18401;">.</span><span style="color:#e45649;">Static </span><span style="color:#a626a4;">| </span><span style="color:#c18401;"> </span><span style="color:#e45649;">BindingFlags</span><span style="color:#c18401;">.</span><span style="color:#e45649;">Public </span><span style="color:#a626a4;">| </span><span style="color:#c18401;"> </span><span style="color:#e45649;">BindingFlags</span><span style="color:#c18401;">.</span><span style="color:#e45649;">FlattenHierarchy</span><span style="color:#c18401;">); </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">if </span><span style="color:#c18401;">(</span><span style="color:#e45649;">dpField </span><span style="color:#a626a4;">!= </span><span style="color:#c18401;">null </span><span style="color:#a626a4;">&amp;&amp; </span><span style="color:#c18401;"> </span><span style="color:#e45649;">dpField</span><span style="color:#c18401;">.</span><span style="color:#e45649;">FieldType </span><span style="color:#a626a4;">== typeof </span><span style="color:#c18401;">(DependencyProperty) </span><span style="color:#a626a4;">&amp;&amp; </span><span style="color:#c18401;"> </span><span style="color:#e45649;">dpField</span><span style="color:#c18401;">.</span><span style="color:#e45649;">IsInitOnly </span><span style="color:#a626a4;">&amp;&amp; </span><span style="color:#e45649;">dpField</span><span style="color:#c18401;">.</span><span style="color:#e45649;">IsStatic</span><span style="color:#c18401;">) </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">return </span><span 
style="color:#c18401;">true; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> } </span><span style="color:#c18401;"> </span><span style="color:#c18401;"> </span><span style="color:#a626a4;">return </span><span style="color:#c18401;">false; </span><span style="color:#c18401;"> } </span><span style="color:#c18401;">} </span></code></pre>