<h1>About memory pressure, lock contention, and Data-oriented Design</h1>
<p><time datetime="2026-02-23T00:00:00+00:00">2026-02-23</time> · <a href="https://mnt.io/articles/about-memory-pressure-lock-contention-and-data-oriented-design/">https://mnt.io/articles/about-memory-pressure-lock-contention-and-data-oriented-design/</a></p>
<p>I'm here to tell you a story about performance. Recently, I was in the same
room as some Memory Pressure and some Lock Contention. It took me a while to
recognize them. Legend says it only happens in obscure, low-level systems,
but I'm here to refute the legend. While exploring, I had the pleasure of fixing a
funny bug in a higher-order stream: lucky us, to top it all off, we even have a
sweet treat! This story is also a pretext to introduce you to Data-oriented Design,
and to show how it improved execution time by 98.7% and throughput
by 7718.5%. I believe we have all the ingredients for a juicy story. Let's cook,
and <em lang="fr">bon appétit !</em></p>
<h2 id="on-a-beautiful-morning">On a Beautiful Morning…<a role="presentation" class="anchor" href="#on-a-beautiful-morning" title="Anchor link to this header">#</a>
</h2>
<p>While powering on my <a rel="noopener external" target="_blank" href="https://dygma.com/pages/defy">Dygma Defy</a>, unlocking my computer, and checking
messages from my colleagues, I suddenly came across this one:</p>
<blockquote>
<p>Does anyone also experience a frozen room list?</p>
</blockquote>
<p>Ah yeah, for some years now, I've been employed by <a rel="noopener external" target="_blank" href="https://element.io/">Element</a> to work on the
<a rel="noopener external" target="_blank" href="https://github.com/matrix-org/matrix-rust-sdk">Matrix Rust SDK</a>. If one needs to write a complete, modern, cross-platform,
fast Matrix client or bot, this SDK is an excellent choice. The SDK is composed
of many crates. Some are very low in the stack and are not aimed at being used
directly by developers, like <code>matrix_sdk_crypto</code>. Some others are higher in the
stack — the highest is for User Interfaces (UI) with <code>matrix_sdk_ui</code>. While it is
a bit opinionated, it is designed to provide the high-quality features everybody
expects in a modern Matrix client.</p>
<p>One of these features is the Room List. The Room List is a place where users
spend a lot of their time in a messaging application (along with the Timeline,
i.e. the room's messages). Some expectations for this component:</p>
<ul>
<li>Be superfast,</li>
<li>List all the rooms,</li>
<li>Interact with rooms (open them, mark them as unread, etc.),</li>
<li>Filter the rooms,</li>
<li>Sort the rooms.</li>
</ul>
<p>Let's focus on the part that interests us today: <em>Sort the rooms</em>. The Room List
holds… no rooms. It actually provides a <em>stream of updates about rooms</em>; more
precisely a <code>Stream<Item = Vec<VectorDiff<Room>>></code>. What does this mean? The
stream yields a vector of “diffs” of rooms. I'm writing <a href="https://mnt.io/series/reactive-programming-in-rust/">a series about reactive
programming</a> — you might be
interested to read more about it. Otherwise, here is what you need to know.</p>
<p><a rel="noopener external" target="_blank" href="https://docs.rs/eyeball-im/0.8.0/eyeball_im/enum.VectorDiff.html">The <code>VectorDiff</code> type</a> comes from <a rel="noopener external" target="_blank" href="https://docs.rs/eyeball-im/0.8.0/eyeball_im/">the <code>eyeball-im</code>
crate</a>, initially created for the Matrix Rust SDK as a solid
foundation for reactive programming. It looks like this:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> enum</span><span class="z-entity z-name"> VectorDiff</span><span><</span><span class="z-entity z-name">T</span><span>> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Append</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> values</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Vector</span><span><</span><span class="z-entity z-name">T</span><span>>,</span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Clear</span><span>,</span></span>
<span class="giallo-l"><span class="z-entity z-name"> PushFront</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> value</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> T</span><span>,</span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"><span class="z-entity z-name"> PushBack</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> value</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> T</span><span>,</span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"><span class="z-entity z-name"> PopFront</span><span>,</span></span>
<span class="giallo-l"><span class="z-entity z-name"> PopBack</span><span>,</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Insert</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> index</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> usize</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> value</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> T</span><span>,</span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Set</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> index</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> usize</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> value</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> T</span><span>,</span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Remove</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> index</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> usize</span><span>,</span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Truncate</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> length</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> usize</span><span>,</span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Reset</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> values</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Vector</span><span><</span><span class="z-entity z-name">T</span><span>>,</span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>It represents a <em>change</em> in <a rel="noopener external" target="_blank" href="https://docs.rs/eyeball-im/0.8.0/eyeball_im/struct.ObservableVector.html">an <code>ObservableVector</code></a>.
This is like a <code>Vec</code>, but <a rel="noopener external" target="_blank" href="https://docs.rs/eyeball-im/0.8.0/eyeball_im/struct.ObservableVector.html#method.subscribe">one can subscribe to the
changes</a>, and will receive… well… <code>VectorDiff</code>s!</p>
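<p>To make the idea tangible, here is a minimal sketch of an “observable vector” built with the standard library only. It is <em>not</em> how <code>eyeball-im</code> is implemented: the <code>TinyObservableVec</code> and <code>Diff</code> names are hypothetical, only two variants are modelled, and a plain channel stands in for the real subscriber type.</p>

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical, reduced version of `VectorDiff`: only two variants.
#[derive(Clone, Debug, PartialEq)]
enum Diff<T> {
    PushBack { value: T },
    Remove { index: usize },
}

/// Toy observable vector: every mutation is also broadcast as a `Diff`.
struct TinyObservableVec<T> {
    values: Vec<T>,
    subscribers: Vec<Sender<Diff<T>>>,
}

impl<T: Clone> TinyObservableVec<T> {
    fn new() -> Self {
        Self { values: Vec::new(), subscribers: Vec::new() }
    }

    /// Returns a snapshot of the current values, plus a channel of future changes.
    fn subscribe(&mut self) -> (Vec<T>, Receiver<Diff<T>>) {
        let (sender, receiver) = channel();
        self.subscribers.push(sender);
        (self.values.clone(), receiver)
    }

    fn push_back(&mut self, value: T) {
        self.values.push(value.clone());
        self.broadcast(Diff::PushBack { value });
    }

    fn remove(&mut self, index: usize) -> T {
        let removed = self.values.remove(index);
        self.broadcast(Diff::Remove { index });
        removed
    }

    /// Sends the diff to every live subscriber, and forgets the dropped ones.
    fn broadcast(&mut self, diff: Diff<T>) {
        self.subscribers.retain(|subscriber| subscriber.send(diff.clone()).is_ok());
    }
}

fn main() {
    let mut vector = TinyObservableVec::new();
    vector.push_back('a');

    // The subscriber gets the current values first, then only the changes.
    let (initial_values, updates) = vector.subscribe();
    assert_eq!(initial_values, vec!['a']);

    vector.push_back('b');
    vector.remove(0);

    assert_eq!(updates.try_recv(), Ok(Diff::PushBack { value: 'b' }));
    assert_eq!(updates.try_recv(), Ok(Diff::Remove { index: 0 }));
    assert!(updates.try_recv().is_err());
}
```

<p>The subscriber never receives the whole vector again after the initial snapshot: it only sees what changed, which is the property the Room List relies on.</p>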
<p>The Room List type merges several streams into a single stream representing the
list of rooms. For example, let's imagine the room at index 3 receives a new
message. Its “preview” (the <em>latest event</em> displayed beneath the room's name,
e.g. <q>Alice: Hello!</q>) changes. Also, the Room List sorts rooms by their
“recency” (the <em>time</em> something happened in the room). And since the “preview”
has changed, its “recency” changes too, which means the room is sorted and
re-positioned. Then, we expect the Room List's stream to yield:</p>
<ol>
<li><code>VectorDiff::Set { index: 3, value: new_room }</code> because of the new “preview”,</li>
<li><code>VectorDiff::Remove { index: 3 }</code> to remove the room… immediately followed by</li>
<li><code>VectorDiff::PushFront { value: new_room }</code> to insert the room at the top of the Room List.</li>
</ol>
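<p>The three updates listed above can be replayed on a local copy of the list, the way a UI layer would consume them. This is only a sketch under assumptions: <code>Diff</code> is a hypothetical subset of <code>VectorDiff</code>, and the <code>Room</code> struct here is a stand-in with a name and a preview, not the SDK's type.</p>

```rust
use std::collections::VecDeque;

/// Stand-in for the SDK's `Room`: just a name and a preview.
#[derive(Clone, Debug, PartialEq)]
struct Room {
    name: &'static str,
    preview: &'static str,
}

/// Hypothetical subset of `VectorDiff`, limited to the three variants used here.
#[derive(Debug)]
enum Diff<T> {
    Set { index: usize, value: T },
    Remove { index: usize },
    PushFront { value: T },
}

/// Applies one diff to the local copy of the list.
fn apply<T>(list: &mut VecDeque<T>, diff: Diff<T>) {
    match diff {
        Diff::Set { index, value } => list[index] = value,
        Diff::Remove { index } => {
            list.remove(index);
        }
        Diff::PushFront { value } => list.push_front(value),
    }
}

fn main() {
    let mut rooms: VecDeque<Room> = vec!["general", "rust", "random", "alice", "ops"]
        .into_iter()
        .map(|name| Room { name, preview: "" })
        .collect();

    // The room at index 3 receives a new message.
    let new_room = Room { name: "alice", preview: "Alice: Hello!" };

    let diffs = vec![
        // 1. The preview changes in place.
        Diff::Set { index: 3, value: new_room.clone() },
        // 2. The room leaves its old position…
        Diff::Remove { index: 3 },
        // 3. … and is re-inserted at the top.
        Diff::PushFront { value: new_room.clone() },
    ];

    for diff in diffs {
        apply(&mut rooms, diff);
    }

    let names: Vec<&str> = rooms.iter().map(|room| room.name).collect();
    assert_eq!(names, ["alice", "general", "rust", "random", "ops"]);
    assert_eq!(rooms[0], new_room);
}
```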
<p>This reactive programming mechanism has proven to be extremely efficient.</p>
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>I did my calculation: the size of <code>VectorDiff<Room></code> is 72 bytes (mostly because
<code>Room</code> contains <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/sync/struct.Arc.html">an <code>Arc</code></a> over the real struct type). This is pretty
small for an update. Not only does it bring a small memory footprint, but it also crosses
the FFI boundary pretty easily, making it easy to map to other languages like
Swift or Kotlin — languages that provide UI components, like <a rel="noopener external" target="_blank" href="https://developer.apple.com/swiftui/">SwiftUI</a> or
<a rel="noopener external" target="_blank" href="https://developer.android.com/compose">Jetpack Compose</a>.</p>
</div>
</div>
<p>Absolutely! These are two popular UI frameworks, and a <code>VectorDiff</code> maps
straightforwardly onto the update operations of their list components. They are actually
(remarkably) pretty similar to each other<sup class="footnote-reference" id="fr-vectordiff_on_other_uis-1"><a href="#fn-vectordiff_on_other_uis">1</a></sup>.</p>
<p>You're always a good digression companion, thank you. Let's get back to our
problem:</p>
<blockquote>
<p>What does "frozen" mean for the Room List?</p>
</blockquote>
<p>It means that the Room List is simply… <em>blank</em>, <em>empty</em>, <em
lang="fr">vide</em>, <em lang="es">vacía</em>, <em lang="it">vuoto</em>, <em
lang="ar">خلو</em>… well, you get the idea.</p>
<blockquote>
<p>What could freeze the Room List?</p>
</blockquote>
<p>What are our options?</p>
<div class="conversation" data-character="factotum">
<div class="conversation--character">
<span lang="fr">Le Factotum</span>
<picture role="presentation">
<source srcset="/image/factotum.avif" type="image/avif" />
<source srcset="/image/factotum.webp" type="image/webp" />
<img src="/image/factotum.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>It would be a real pleasure if you let me assist you in this task.</p>
<ul>
<li>The network sync is not running properly, hence giving the <em>impression</em> of a
frozen Room List? Hmm, no, everything works as expected here. Moreover, local
data should be displayed.</li>
<li>The “source streams” used by the Room List are not yielding the expected
updates? No, everything works like a charm.</li>
<li>The “merge of streams” is broken for some reason? No, it seems fine.</li>
<li>The filtering of the streams? Not touched in a long time.</li>
<li>The sorting? Ah, maybe, I reckon we have changed something here…</li>
</ul>
</div>
</div>
<p>Indeed, we have changed one sorter recently. Let's take a look at how this Room List stream is computed, shall we?</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-storage">let</span><span class="z-variable"> stream</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name z-function"> stream!</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> loop</span><span> {</span></span>
<span class="giallo-l"><span class="z-comment"> // Wait for the filter to be updated.</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> filter</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> filter_cell</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">take</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Get the “raw” entries.</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span> (</span><span class="z-variable">initial_values</span><span>,</span><span class="z-variable"> stream</span><span>)</span><span class="z-keyword z-operator"> =</span><span class="z-variable z-language"> self</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">entries</span><span>();</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Combine normal stream updates with other room updates.</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> stream</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name z-function"> merge_streams</span><span>(</span><span class="z-variable">initial_values</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">clone</span><span>(),</span><span class="z-variable"> stream</span><span>,</span><span class="z-variable"> other_updates</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> let</span><span> (</span><span class="z-variable">initial_values</span><span>,</span><span class="z-variable"> stream</span><span>)</span><span class="z-keyword z-operator"> =</span><span> (</span><span class="z-variable">initial_values</span><span>,</span><span class="z-variable"> stream</span><span>)</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">filter</span><span>(</span><span class="z-variable">filter</span><span>)</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">sort_by</span><span>(</span><span class="z-entity z-name z-function">new_sorter_lexicographic</span><span>(</span><span class="z-entity z-name z-function">vec!</span><span>[</span></span>
<span class="giallo-l"><span class="z-comment"> // Sort by latest event's kind.</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Box</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">new</span><span>(</span><span class="z-entity z-name z-function">new_sorter_latest_event</span><span>()),</span></span>
<span class="giallo-l"><span class="z-comment"> // Sort rooms by their recency.</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Box</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">new</span><span>(</span><span class="z-entity z-name z-function">new_sorter_recency</span><span>()),</span></span>
<span class="giallo-l"><span class="z-comment"> // Finally, sort by name.</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Box</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">new</span><span>(</span><span class="z-entity z-name z-function">new_sorter_name</span><span>()),</span></span>
<span class="giallo-l"><span> ]))</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">dynamic_head_with_initial_value</span><span>(</span><span class="z-variable">page_size</span><span>,</span><span class="z-variable"> limit_stream</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Clearing the stream before chaining with the real stream.</span></span>
<span class="giallo-l"><span class="z-keyword"> yield</span><span class="z-entity z-name z-function"> once</span><span>(</span><span class="z-entity z-name z-function">ready</span><span>(</span><span class="z-entity z-name z-function">vec!</span><span>[</span><span class="z-entity z-name">VectorDiff</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Reset</span><span> {</span><span class="z-variable"> values</span><span class="z-keyword z-operator">:</span><span class="z-variable"> initial_values</span><span> }]))</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">chain</span><span>(</span><span class="z-variable">stream</span><span>);</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">switch</span><span>();</span></span></code></pre>
<p>There is a lot going on here. Sadly, we are not going to explain everything in
this beautiful piece of art<sup class="footnote-reference" id="fr-switch-1"><a href="#fn-switch">2</a></sup>.</p>
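<p>One detail is worth a small illustration though: combining several sorters <em>lexicographically</em> means asking each sorter in turn, and letting the first one that does not answer <code>Equal</code> decide. The sketch below is an assumption about the idea, not the SDK's code: <code>lexicographic</code>, <code>by_recency</code>, <code>by_name</code> and this <code>Room</code> struct are hypothetical names, and the real sorters compare richer data.</p>

```rust
use std::cmp::Ordering;

/// A boxed comparison function, standing in for one sorter.
type Sorter<T> = Box<dyn Fn(&T, &T) -> Ordering>;

/// Hypothetical stand-in for a lexicographic sorter: each sorter is tried
/// in order, and the first one that is not `Equal` wins.
fn lexicographic<T>(sorters: Vec<Sorter<T>>) -> impl Fn(&T, &T) -> Ordering {
    move |left: &T, right: &T| {
        sorters
            .iter()
            .map(|sorter| sorter(left, right))
            .find(|ordering| *ordering != Ordering::Equal)
            .unwrap_or(Ordering::Equal)
    }
}

/// Stand-in for the SDK's `Room`.
#[derive(Debug)]
struct Room {
    name: &'static str,
    recency: u64,
}

/// Most recent rooms first.
fn by_recency(left: &Room, right: &Room) -> Ordering {
    right.recency.cmp(&left.recency)
}

/// Alphabetical order, used as a tie-breaker.
fn by_name(left: &Room, right: &Room) -> Ordering {
    left.name.cmp(right.name)
}

fn main() {
    let sorters: Vec<Sorter<Room>> = vec![Box::new(by_recency), Box::new(by_name)];
    let sorter = lexicographic(sorters);

    let mut rooms = vec![
        Room { name: "general", recency: 10 },
        Room { name: "beta", recency: 30 },
        Room { name: "alpha", recency: 30 },
    ];
    rooms.sort_by(|left, right| sorter(left, right));

    // Same recency for `alpha` and `beta`: the name breaks the tie.
    let names: Vec<&str> = rooms.iter().map(|room| room.name).collect();
    assert_eq!(names, ["alpha", "beta", "general"]);
}
```

<p>The order of the sorters matters: a later sorter is only consulted when all the previous ones consider two rooms equal.</p>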
<p>The <code>.filter()</code>, <code>.sort_by()</code> and <code>.dynamic_head_with_initial_value()</code> methods
are part of <a rel="noopener external" target="_blank" href="https://docs.rs/eyeball-im-util/0.10.0/eyeball_im_util/">the <code>eyeball-im-util</code> crate</a>. They are used
to filter, sort, or otherwise transform a stream: they essentially map a <code>Stream<Item = Vec<VectorDiff<T>>></code> to another <code>Stream<Item = Vec<VectorDiff<T>>></code>. In
other words, they “change” the <code>VectorDiff</code>s on-the-fly to simulate filtering,
sorting, or something else. Let's see a very concrete example with <a rel="noopener external" target="_blank" href="https://docs.rs/eyeball-im-util/0.10.0/eyeball_im_util/vector/struct.Sort.html">the <code>Sort</code>
higher-order stream</a> (the following example is
mostly a copy of the documentation of <code>Sort</code>, but <a rel="noopener external" target="_blank" href="https://github.com/jplatte/eyeball/pull/43">since I wrote this algorithm,
I guess you, dear reader, will find it acceptable</a>).</p>
<p>Let's imagine we have a vector of <code>char</code>. We want a <code>Stream</code> of <em>changes</em> about
this vector (the famous <code>VectorDiff</code>). We also want to <em>simulate</em> a sorted
vector, by only modifying the <em>changes</em>. The solution looks like this:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> std</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">cmp</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Ordering</span><span>;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> eyeball_im</span><span class="z-keyword z-operator">::</span><span>{</span><span class="z-entity z-name">ObservableVector</span><span>,</span><span class="z-entity z-name"> VectorDiff</span><span>};</span></span>
<span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> eyeball_im_util</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">vector</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">VectorObserverExt</span><span>;</span></span>
<span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> stream_assert</span><span class="z-keyword z-operator">::</span><span>{assert_next_eq, assert_pending};</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// Our comparison function.</span></span>
<span class="giallo-l"><span class="z-keyword">fn</span><span class="z-entity z-name z-function"> cmp</span><span><</span><span class="z-entity z-name">T</span><span>>(</span><span class="z-variable">left</span><span class="z-keyword z-operator">: &</span><span class="z-entity z-name">T</span><span>,</span><span class="z-variable"> right</span><span class="z-keyword z-operator">: &</span><span class="z-entity z-name">T</span><span>)</span><span class="z-keyword z-operator"> -></span><span class="z-entity z-name"> Ordering</span></span>
<span class="giallo-l"><span class="z-keyword">where</span></span>
<span class="giallo-l"><span class="z-entity z-name"> T</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Ord</span><span>,</span></span>
<span class="giallo-l"><span>{</span></span>
<span class="giallo-l"><span class="z-variable"> left</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">cmp</span><span>(</span><span class="z-variable">right</span><span>)</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// Our vector.</span></span>
<span class="giallo-l"><span class="z-storage">let mut</span><span class="z-variable"> vector</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> ObservableVector</span><span class="z-keyword z-operator">::</span><span><</span><span class="z-entity z-name">char</span><span>></span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">new</span><span>();</span></span>
<span class="giallo-l"><span class="z-storage">let</span><span> (</span><span class="z-variable">initial_values</span><span>,</span><span class="z-storage"> mut</span><span class="z-variable"> stream</span><span>)</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> vector</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">subscribe</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">sort_by</span><span>(</span><span class="z-variable">cmp</span><span>);</span></span>
<span class="giallo-l"><span class="z-comment">// ^^^</span></span>
<span class="giallo-l"><span class="z-comment">// |</span></span>
<span class="giallo-l"><span class="z-comment">// there</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert!</span><span>(</span><span class="z-variable">initial_values</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">is_empty</span><span>());</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert_pending!</span><span>(</span><span class="z-variable">stream</span><span>);</span></span></code></pre>
<p>Alrighty. That's a good start. <code>vector</code> is empty, so the initial values from the
subscription are empty, and the <code>stream</code> is also pending<sup class="footnote-reference" id="fr-stream_assert-1"><a href="#fn-stream_assert">3</a></sup>. I think
it's time to play with this new toy, isn't it?</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-comment">// Append unsorted values.</span></span>
<span class="giallo-l"><span class="z-variable">vector</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">append</span><span>(</span><span class="z-entity z-name z-function">vector!</span><span>[</span><span class="z-string">'d'</span><span>,</span><span class="z-string"> 'b'</span><span>,</span><span class="z-string"> 'e'</span><span>]);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// We get a `VectorDiff::Append` with sorted values!</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert_next_eq!</span><span>(</span></span>
<span class="giallo-l"><span class="z-variable"> stream</span><span>,</span></span>
<span class="giallo-l"><span class="z-entity z-name"> VectorDiff</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Append</span><span> {</span><span class="z-variable"> values</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name z-function"> vector!</span><span>[</span><span class="z-string">'b'</span><span>,</span><span class="z-string"> 'd'</span><span>,</span><span class="z-string"> 'e'</span><span>] }</span></span>
<span class="giallo-l"><span>);</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert_pending!</span><span>(</span><span class="z-variable">stream</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// Let's recap what we have. `vector` is our `ObservableVector`,</span></span>
<span class="giallo-l"><span class="z-comment">// `stream` is the “sorted view”/“sorted stream” of `vector`:</span></span>
<span class="giallo-l"><span class="z-comment">//</span></span>
<span class="giallo-l"><span class="z-comment">// | index | 0 1 2 |</span></span>
<span class="giallo-l"><span class="z-comment">// | `vector` | d b e |</span></span>
<span class="giallo-l"><span class="z-comment">// | `stream` | b d e |</span></span></code></pre>
<p>So far, so good. It looks naive and simple: one operation in, one operation out.
It gets more fun when things get more complicated, though:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-comment">// Append multiple other values.</span></span>
<span class="giallo-l"><span class="z-variable">vector</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">append</span><span>(</span><span class="z-entity z-name z-function">vector!</span><span>[</span><span class="z-string">'f'</span><span>,</span><span class="z-string"> 'g'</span><span>,</span><span class="z-string"> 'a'</span><span>,</span><span class="z-string"> 'c'</span><span>]);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// We get three `VectorDiff`s this time!</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert_next_eq!</span><span>(</span></span>
<span class="giallo-l"><span class="z-variable"> stream</span><span>,</span></span>
<span class="giallo-l"><span class="z-entity z-name"> VectorDiff</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">PushFront</span><span> {</span><span class="z-variable"> value</span><span class="z-keyword z-operator">:</span><span class="z-string"> 'a'</span><span> }</span></span>
<span class="giallo-l"><span>);</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert_next_eq!</span><span>(</span></span>
<span class="giallo-l"><span class="z-variable"> stream</span><span>,</span></span>
<span class="giallo-l"><span class="z-entity z-name"> VectorDiff</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Insert</span><span> {</span><span class="z-variable"> index</span><span class="z-keyword z-operator">:</span><span class="z-constant z-numeric"> 2</span><span>,</span><span class="z-variable"> value</span><span class="z-keyword z-operator">:</span><span class="z-string"> 'c'</span><span> }</span></span>
<span class="giallo-l"><span>);</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert_next_eq!</span><span>(</span></span>
<span class="giallo-l"><span class="z-variable"> stream</span><span>,</span></span>
<span class="giallo-l"><span class="z-entity z-name"> VectorDiff</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Append</span><span> {</span><span class="z-variable"> values</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name z-function"> vector!</span><span>[</span><span class="z-string">'f'</span><span>,</span><span class="z-string"> 'g'</span><span>] }</span></span>
<span class="giallo-l"><span>);</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert_pending!</span><span>(</span><span class="z-variable">stream</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// Let's recap what we have:</span></span>
<span class="giallo-l"><span class="z-comment">//</span></span>
<span class="giallo-l"><span class="z-comment">// | index | 0 1 2 3 4 5 6 |</span></span>
<span class="giallo-l"><span class="z-comment">// | `vector` | d b e f g a c |</span></span>
<span class="giallo-l"><span class="z-comment">// | `stream` | a b c d e f g |</span></span>
<span class="giallo-l"><span class="z-comment">// ^ ^ ^^^</span></span>
<span class="giallo-l"><span class="z-comment">// | | |</span></span>
<span class="giallo-l"><span class="z-comment">// | | with `VectorDiff::Append { .. }`</span></span>
<span class="giallo-l"><span class="z-comment">// | with `VectorDiff::Insert { index: 2, .. }`</span></span>
<span class="giallo-l"><span class="z-comment">// with `VectorDiff::PushFront { .. }`</span></span></code></pre>
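<p>For the curious, the behaviour asserted above can be imitated with the standard library only. This is a deliberately naive sketch and makes no claim about how <code>Sort</code> is actually implemented in <code>eyeball-im-util</code>: <code>sorted_append</code> and <code>Diff</code> are hypothetical names, only <code>Append</code> is handled as input, and the “sorted view” is kept as a plain <code>Vec</code>.</p>

```rust
/// Hypothetical subset of `VectorDiff`, limited to what this sketch emits.
#[derive(Debug, PartialEq)]
enum Diff<T> {
    Append { values: Vec<T> },
    PushFront { value: T },
    Insert { index: usize, value: T },
}

/// Keeps `sorted` sorted, and turns an unsorted append into diffs that are
/// valid against the sorted view rather than against the original vector.
fn sorted_append<T: Ord + Clone>(sorted: &mut Vec<T>, mut new_values: Vec<T>) -> Vec<Diff<T>> {
    new_values.sort();
    let mut diffs = Vec::new();

    for (position, value) in new_values.iter().enumerate() {
        // Index of the first existing element strictly greater than `value`.
        let index = sorted.partition_point(|existing| existing <= value);

        if index == sorted.len() {
            // `value` goes at the end, and so does every value after it
            // (they are sorted): batch them in a single `Append`.
            let tail = new_values[position..].to_vec();
            sorted.extend(tail.iter().cloned());
            diffs.push(Diff::Append { values: tail });
            break;
        }

        sorted.insert(index, value.clone());
        diffs.push(if index == 0 {
            Diff::PushFront { value: value.clone() }
        } else {
            Diff::Insert { index, value: value.clone() }
        });
    }

    diffs
}

fn main() {
    let mut sorted = Vec::new();

    // One `Append` in, one sorted `Append` out.
    let diffs = sorted_append(&mut sorted, vec!['d', 'b', 'e']);
    assert_eq!(diffs, vec![Diff::Append { values: vec!['b', 'd', 'e'] }]);

    // One `Append` in, three diffs out.
    let diffs = sorted_append(&mut sorted, vec!['f', 'g', 'a', 'c']);
    assert_eq!(
        diffs,
        vec![
            Diff::PushFront { value: 'a' },
            Diff::Insert { index: 2, value: 'c' },
            Diff::Append { values: vec!['f', 'g'] },
        ]
    );
    assert_eq!(sorted, vec!['a', 'b', 'c', 'd', 'e', 'f', 'g']);
}
```

<p>Each index is computed against the sorted view <em>after</em> the previous diffs have been applied, which is why <code>'c'</code> lands at index 2 and not 1.</p>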
<p>Notice how <code>vector</code> is <em>never</em> sorted. That's the power of these higher-order
streams of <code>VectorDiff</code>s: light and, more importantly, <strong>combinable</strong>! Let me
repeat: we are always mapping a <code>Stream<Item = Vec<VectorDiff<T>>></code> to another
<code>Stream<Item = Vec<VectorDiff<T>>></code>. That's the same type! The whole collection
is never computed entirely, except for the initial values: only the changes are
handled and trigger a computation. And since <code>Stream</code> is lazy, in the manner of <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/future/trait.Future.html"><code>Future</code></a>
(i.e. it does something only when polled), things stay
pretty efficient. And…</p>
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>… as your favourite digression companion, I really, deeply, appreciate these
details. Nonetheless, I hope you don't mind if… I suggest to you that… you might
want to, maybe, go back to… <small>the main… subject, don't you think?</small></p>
</div>
</div>
<p>Which topic? Ah! The frozen Room List! Sorters <em>are</em> the culprit. There.
Happy? Short enough?</p>
<p>These details were important. Kind of. I hope you've learned something along the
way. Next, let's see how a sorter works, and how it could be responsible for our
memory pressure and lock contention.</p>
<h2 id="randomness">Randomness<a role="presentation" class="anchor" href="#randomness" title="Anchor link to this header">#</a>
</h2>
<p>Taking a step back, I was asking myself: <q>Is it really frozen?</q> The
cherry on the cake: I was unable to reproduce the problem! Even the reporters
of the problem were unable to reproduce it consistently. Hmm, a random problem?
Fortunately, two of the reporters are obstinate. Ultimately, we got an analysis.</p>
<figure>
<picture>
<source srcset="./memory-pressure.avif" type="image/avif" />
<source srcset="./memory-pressure.webp" type="image/webp" />
<img src="./memory-pressure.png" loading="lazy" decoding="async" />
</picture>
<figcaption>
<p>Memory analysis of Element X in Android Studio (Element X is based on the Matrix
Rust SDK). It presents a callback tree, with the number of allocations and
deallocations for each node in this tree. Thanks <a rel="noopener external" target="_blank" href="https://github.com/jmartinesp">Jorge</a>!</p>
<p>And, holy cow, we see <strong>a lot</strong> of memory allocations, 322'042 to be
precise, accounting for 743 MiB, for the <code>eyeball_im_util::vector::sort::SortBy</code>
type! I don't remember exactly how many rooms are part of the Room List, but
it's probably around 500-600.</p>
<p><small>Download fullsize image as: <a href="./memory-pressure.avif" title="Download the AVIF image">AVIF</a>,
<a href="./memory-pressure.webp" title="Download the WebP image">WebP</a>,
<a href="./memory-pressure.png" title="Download the PNG image">PNG</a>.</small></p>
</figcaption>
</figure>
<p>The Room List wasn't frozen. It was taking <strong>a lot</strong> of time to yield values.
Sometimes, up to 5 minutes on a phone. Alright, we have two problems to solve
here:</p>
<ol>
<li>Why is it random?</li>
<li>Why so many memory allocations and deallocations?</li>
</ol>
<p>The second problem will be discussed in the next section. Let's start with the
first problem in this section, shall we?</p>
<p>Let's start at the beginning. <code>eyeball_im_util::vector::sort::SortBy</code> is used
like so:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-variable">stream</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">sort_by</span><span>(</span><span class="z-entity z-name z-function">new_sorter_lexicographic</span><span>(</span><span class="z-entity z-name z-function">vec!</span><span>[</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Box</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">new</span><span>(</span><span class="z-entity z-name z-function">new_sorter_latest_event</span><span>()),</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Box</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">new</span><span>(</span><span class="z-entity z-name z-function">new_sorter_recency</span><span>()),</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Box</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">new</span><span>(</span><span class="z-entity z-name z-function">new_sorter_name</span><span>()),</span></span>
<span class="giallo-l"><span> ]))</span></span></code></pre>
<p><code>sort_by</code> receives a sorter: <a rel="noopener external" target="_blank" href="https://docs.rs/matrix-sdk-ui/0.16.0/matrix_sdk_ui/room_list_service/sorters/fn.new_sorter_lexicographic.html"><code>new_sorter_lexicographic</code></a>. It's from
<a rel="noopener external" target="_blank" href="https://docs.rs/matrix-sdk-ui/0.16.0/matrix_sdk_ui/room_list_service/sorters/"><code>matrix_sdk_ui::room_list_service::sorters</code></a>, and it's a constructor for a…
lexicographic sorter. All sorters must implement <a rel="noopener external" target="_blank" href="https://docs.rs/matrix-sdk-ui/0.16.0/matrix_sdk_ui/room_list_service/sorters/trait.Sorter.html">the <code>Sorter</code> trait</a>.
Once again, it's a trait from <code>matrix_sdk_ui</code>, nothing fancy, it's simply this:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-comment">// Trait “alias”.</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> trait</span><span class="z-entity z-name"> Sorter</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name z-function"> Fn</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-entity z-name">Room</span><span>,</span><span class="z-keyword z-operator"> &</span><span class="z-entity z-name">Room</span><span>)</span><span class="z-keyword z-operator"> -></span><span class="z-entity z-name"> Ordering</span><span> {}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// All functions `F` are auto-implementing `Sorter`.</span></span>
<span class="giallo-l"><span class="z-keyword">impl</span><span><</span><span class="z-entity z-name">F</span><span>></span><span class="z-entity z-name"> Sorter</span><span class="z-keyword"> for</span><span class="z-entity z-name"> F</span></span>
<span class="giallo-l"><span class="z-keyword">where</span></span>
<span class="giallo-l"><span class="z-entity z-name"> F</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name z-function"> Fn</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-entity z-name">Room</span><span>,</span><span class="z-keyword z-operator"> &</span><span class="z-entity z-name">Room</span><span>)</span><span class="z-keyword z-operator"> -></span><span class="z-entity z-name"> Ordering</span><span> {}</span></span></code></pre>
<p>Put differently, every function with two parameters of type <code>&Room</code> and
a return type of <code>Ordering</code> is considered a sorter. There. It's crystal clear now,
except… what's a lexicographic sorter?</p>
<div class="conversation" data-character="procureur">
<div class="conversation--character">
<span lang="fr">Le Procureur</span>
<picture role="presentation">
<source srcset="/image/procureur.avif" type="image/avif" />
<source srcset="/image/procureur.webp" type="image/webp" />
<img src="/image/procureur.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>Should I really quote the documentation of <code>new_sorter_lexicographic</code>? My work
here is turning into a tragedy.</p>
<p>It creates a new sorter that will run multiple sorters. When the
<math><msup><mi>n</mi><mtext>th</mtext></msup></math> sorter returns
<code>Ordering::Equal</code>, the next sorter is called. It stops as soon as a sorter
returns <code>Ordering::Greater</code> or <code>Ordering::Less</code>.</p>
<p>This is an implementation of a lexicographic order as defined for <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Lexicographic_order#Cartesian_products">cartesian
products</a>.</p>
</div>
</div>
<p>In short, we are executing 3 sorters: by <em>latest event</em>, by <em>recency</em>
and by <em>name</em>.</p>
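<p>To make this concrete, here is a minimal sketch of what such a lexicographic
sorter could look like. It is not the SDK's implementation: the <code>Room</code> struct,
its fields, and the <code>lexicographic</code> and <code>example_sorter</code> functions are all
made up for the illustration. Only the idea of “ask the next sorter when the
current one answers <code>Ordering::Equal</code>” comes from the documentation quoted above.</p>

```rust
use std::cmp::Ordering;

/// Hypothetical, simplified stand-in for the SDK's `Room` type.
struct Room {
    has_local_event: bool,
    recency: u64,
    name: &'static str,
}

/// A boxed sorter, mirroring the `Fn(&Room, &Room) -> Ordering` shape.
type BoxedSorter = Box<dyn Fn(&Room, &Room) -> Ordering>;

/// Runs the sorters in order and returns the first result that is not `Equal`.
fn lexicographic(sorters: Vec<BoxedSorter>) -> impl Fn(&Room, &Room) -> Ordering {
    move |left: &Room, right: &Room| {
        for sorter in &sorters {
            match sorter(left, right) {
                // This sorter cannot decide: ask the next one.
                Ordering::Equal => continue,
                decided => return decided,
            }
        }

        // Every sorter answered `Equal`.
        Ordering::Equal
    }
}

/// Three levels: local events first, then most recent first, then by name.
fn example_sorter() -> impl Fn(&Room, &Room) -> Ordering {
    lexicographic(vec![
        Box::new(|l: &Room, r: &Room| r.has_local_event.cmp(&l.has_local_event)),
        Box::new(|l: &Room, r: &Room| r.recency.cmp(&l.recency)),
        Box::new(|l: &Room, r: &Room| l.name.cmp(r.name)),
    ])
}
```

<p>The closure returned by <code>example_sorter</code> can be handed to anything that
expects a comparison function, for instance <code>slice::sort_by</code>.</p>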
<p>None of these sorters are using any form of randomness. It's a <em
lang="fr">cul-de-sac</em>. Let's take a step back by looking at <code>SortBy</code> in
<code>eyeball_im_util</code> itself maybe? <i>Scroll the documentation</i>, not here,
<i>read the initial patch</i>, hmm, I see a mention of a binary search, <i>jump
into the code</i>, ah, <a rel="noopener external" target="_blank" href="https://github.com/jplatte/eyeball/blob/b7dc6fde71e507459ecbd7519a8a22f12bf2a8de/eyeball-im-util/src/vector/sort.rs#L315-L318">here, look at the comment</a>:</p>
<blockquote>
<p>When looking for the <em>position</em> of a value (e.g. where to insert a new
value?), <code>Vector::binary_search_by</code> is used — it is possible because the
<code>Vector</code> is sorted. When looking for the <em>unsorted index</em> of a value,
<code>Iterator::position</code> is used.</p>
</blockquote>
<p><a rel="noopener external" target="_blank" href="https://docs.rs/imbl/7.0.0/imbl/type.Vector.html#method.binary_search_by"><code>Vector::binary_search_by</code></a> doesn't mention any form of randomness in its
documentation. Another <em lang="fr">cul-de-sac</em>.</p>
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>Remember that the Room List appears frozen but it is actually blank. The problem
is not when the stream receives an update, but when the stream is “created”,
i.e. when the initial items are sorted for the first time before receiving
updates.</p>
<p>Moreover, the comment says <q>it is possible because the <code>Vector</code> is sorted</q>,
which indicates that “the vector” (I guess it's a buffer somewhere) <em>has been
sorted</em> one way or another. What do you think?</p>
</div>
</div>
<p>Ah! Brilliant. That's correct! Looking at <a rel="noopener external" target="_blank" href="https://github.com/jplatte/eyeball/blob/b7dc6fde71e507459ecbd7519a8a22f12bf2a8de/eyeball-im-util/src/vector/sort.rs#L261">the constructor of
<code>SortBy</code></a> (or its implementation), we notice it's using
<a rel="noopener external" target="_blank" href="https://docs.rs/imbl/7.0.0/imbl/type.Vector.html#method.sort_by"><code>Vector::sort_by</code></a>. And guess what? It's relying on… <i>drum roll</i>…
<a rel="noopener external" target="_blank" href="https://github.com/jneem/imbl/blob/6feb48d04ed9bd2a004968541d1a90d61c423d31/src/vector/mod.rs#L1575-L1583">quicksort</a>! Following the path, we see
<a rel="noopener external" target="_blank" href="https://github.com/jneem/imbl/blob/6feb48d04ed9bd2a004968541d1a90d61c423d31/src/sort.rs#L177-L185">it actually creates a pseudo random number generator (PRNG) to do the
quicksort</a>.</p>
<p>Phew. Finally. Time for a cup of tea and a biscuit<sup class="footnote-reference" id="fr-biscuit-1"><a href="#fn-biscuit">4</a></sup>.</p>
<p>My guess here is the following. Depending on the (pseudo randomly) generated
pivot index, the number of comparisons may vary each time this runs. We can
enter a pathological case where more comparisons means more memory pressure,
which means slower sorting, which means… A Frozen Room List<sup><abbr
title="Trademark">TM</abbr></sup>, <i>play horror movie music</i>!</p>
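<p>To illustrate this guess, here is a small sketch of a quicksort that counts
its comparisons. It is a textbook version (Lomuto partition), not the one from
<code>imbl</code>, and the function names are mine. Instead of a PRNG, the pivot is
chosen by a closure, so that a “lucky” and an “unlucky” draw can be replayed
deterministically.</p>

```rust
/// Sorts `items` with a quicksort whose pivot index is chosen by `pick_pivot`
/// (which receives the length of the current slice), and returns the number
/// of comparisons performed.
fn quicksort_counting(items: &mut [u32], pick_pivot: &mut dyn FnMut(usize) -> usize) -> usize {
    if items.len() <= 1 {
        return 0;
    }

    let last = items.len() - 1;
    let pivot_index = pick_pivot(items.len()) % items.len();

    // Lomuto partition: park the pivot at the end, then sweep the slice once.
    items.swap(pivot_index, last);
    let pivot = items[last];
    let mut store = 0;
    let mut comparisons = 0;

    for i in 0..last {
        comparisons += 1;

        if items[i] < pivot {
            items.swap(i, store);
            store += 1;
        }
    }

    // Put the pivot at its final position.
    items.swap(store, last);

    let (left, right) = items.split_at_mut(store);

    comparisons
        + quicksort_counting(left, pick_pivot)
        + quicksort_counting(&mut right[1..], pick_pivot)
}

/// Unlucky draw: on a sorted input, always picking the first element gives
/// the most unbalanced partitions possible.
fn comparisons_with_first_pivot(n: u32) -> usize {
    let mut items: Vec<u32> = (0..n).collect();
    quicksort_counting(&mut items, &mut |_len: usize| -> usize { 0 })
}

/// Lucky draw: on a sorted input, the middle element splits the slice evenly.
fn comparisons_with_middle_pivot(n: u32) -> usize {
    let mut items: Vec<u32> = (0..n).collect();
    quicksort_counting(&mut items, &mut |len: usize| -> usize { len / 2 })
}
```

<p>With the unlucky strategy, sorting <math><mi>n</mi></math> items costs
<math><mfrac><mrow><mi>n</mi><mo>(</mo><mi>n</mi><mo>-</mo><mn>1</mn><mo>)</mo></mrow><mn>2</mn></mfrac></math>
comparisons; the lucky one needs far fewer. A random pivot lands somewhere in
between, and each comparison runs our sorters.</p>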
<h2 id="memory-pressure">Memory Pressure<a role="presentation" class="anchor" href="#memory-pressure" title="Anchor link to this header">#</a>
</h2>
<p>A memory allocator is responsible for… well… allocating the memory. If you
believe this is a simple problem, please retract this offensive thought quickly:
what an oaf! Memory is managed based on the strategy or strategies used by the
memory allocator: there is not a unique solution. Each memory allocator comes
with trade-offs: do you allocate and replace multiple similar small objects
several times in a row? Do you need fixed-size blocks of memory, or dynamic
blocks? And so on.</p>
<p>Allocating memory is not free. The memory allocator has a cost in itself —which
could be mitigated by implementing a custom memory allocator maybe—, but there
is also <strong>a hardware cost</strong>, and it's comparatively more difficult to mitigate.
Memory is allocated on the heap, i.e. <em>the RAM</em>, also called <em>the main memory</em>
(not to be confused with <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/CPU_cache">CPU caches: L1, L2…</a>). The RAM is nice and
all, but it lives far from the CPU. It <em>takes time</em> to allocate something on the
heap and…</p>
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>Hold on a second. I heard it takes around 100-150 nanoseconds to fetch data from
the heap. In what world is this “costly”? How is this “far” from the CPU?</p>
<p>I understand we are talking about <em>random</em> accesses (the <em>R</em> in RAM), and
multiple indirections, but still, it sounds pretty fast, right?</p>
</div>
</div>
<p>Hmm, <i>refrain from opening Pandora's box</i>, let's try to stay high-level
here, shall we? Be careful: the numbers I am going to present can vary depending
on your hardware, but the important part is <strong>the scale</strong>: keep that in mind.</p>
<figure>
<table><thead><tr><th>Operation</th><th style="text-align: right">Time</th><th style="text-align: right">“Human scale”</th></tr></thead><tbody>
<tr><td>Fetch from L1 cache</td><td style="text-align: right">1ns</td><td style="text-align: right">1min</td></tr>
<tr><td>Branch misprediction</td><td style="text-align: right">3ns</td><td style="text-align: right">3min</td></tr>
<tr><td>Fetch from L2 cache</td><td style="text-align: right">4ns</td><td style="text-align: right">4min</td></tr>
<tr><td>Mutex lock/unlock</td><td style="text-align: right">17ns</td><td style="text-align: right">17min</td></tr>
<tr><td>Fetch from the main memory</td><td style="text-align: right">100ns</td><td style="text-align: right">1h40min</td></tr>
<tr><td>SSD random read</td><td style="text-align: right">16'000ns</td><td style="text-align: right">11.11 days</td></tr>
</tbody></table>
<figcaption>
<p>Latency numbers for the year 2020 for various operations (source:
<a rel="noopener external" target="_blank" href="https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html"><cite>Latency Numbers Every Programmer Should Know</cite> from Colin Scott (UC
Berkeley)</a>).</p>
<p>The time in the second column is given in nanoseconds, i.e.
<math><mfrac><mn>1</mn><mn>1'000'000'000</mn></mfrac></math> second. The time in
the third column is “humanized” to give us a better sense of the scale here: we
imagine 1ns maps to 1min.</p>
</figcaption>
</figure>
<p>Do you see the difference between the L1/L2 caches and the main memory? 1ns to
100ns is the same ratio as 1min to 1h40min. So, yes, it takes time to read
from memory. That's why we try to avoid allocations as much as possible.</p>
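<p>If you want to play with the “human scale” yourself, the conversion is a
one-liner at heart: pretend every nanosecond is a minute. The
<code>human_scale</code> function below is a hypothetical helper written for this
article, nothing more.</p>

```rust
/// Maps a latency in nanoseconds to the “human scale”, where 1ns is 1 minute.
fn human_scale(nanoseconds: u64) -> String {
    // 1ns maps to 1min, so the number itself does not change, only its unit.
    let minutes = nanoseconds;

    if minutes < 60 {
        format!("{minutes}min")
    } else if minutes < 60 * 24 {
        format!("{}h{:02}min", minutes / 60, minutes % 60)
    } else {
        let days = minutes as f64 / (60.0 * 24.0);
        format!("{days:.2} days")
    }
}
```

<p>For example, <code>human_scale(100)</code> gives <code>1h40min</code> for the main memory, and
<code>human_scale(16_000)</code> gives <code>11.11 days</code> for an SSD random read.</p>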
<figure>
<svg viewBox="0 0 200 55" role="img" id="memory-race">
<style>
#memory-race text { font-size: 4pt }
#memory-race circle {
fill: oklch(69.50% .140 76.18);
animation: 4s linear 0s infinite alternate slide;
}
#memory-race .l1 { animation-duration: .5s }
#memory-race .l2 { animation-duration: 2s }
#memory-race .ram { animation-duration: 50s }
@keyframes slide {
from {
transform: translateX(15%);
}
to {
transform: translateX(85%);
}
}
</style>
<text x="0" y="12">CPU</text>
<text x="0" y="27">CPU</text>
<text x="0" y="42">CPU</text>
<text x="180" y="12">L1</text>
<text x="180" y="27">L2</text>
<text x="180" y="42">RAM</text>
<circle cx="0" cy="10" r="4" class="l1" />
<circle cx="0" cy="25" r="4" class="l2" />
<circle cx="0" cy="40" r="4" class="ram" />
</svg>
<figcaption>
<p>Not comfortable with numbers? Let's try to visualise it with 1ns = 1s! On
the left: the CPU. On the right, the L1 cache, the L2 cache, and the RAM. The
“balls” represent the time it takes to move information between the CPU and the
L1/L2 caches or the RAM.</p>
</figcaption>
</figure>
<p>Sadly, in our case, it appears we are allocating 322'042 times to sort the
initial rooms of the Room List, for a total of 743'151'616 bits allocated,
with 287 bytes per allocation. Of course, if we are doing quick napkin
maths<sup class="footnote-reference" id="fr-napkin-math-1"><a href="#fn-napkin-math">5</a></sup>, it should take around 200ms. We are far from The Frozen
Room List<sup><abbr title="Trademark">TM</abbr></sup>, but there is more going
on.<sup class="footnote-reference" id="fr-suspens-1"><a href="#fn-suspens">6</a></sup></p>
<p>Do you remember the memory allocator? Its role is also to avoid <em>fragmentation</em>
as much as possible. The number of memory “blocks” isn't infinite: when memory
blocks are freed, and new ones are allocated later, maybe the previous blocks
are no longer available and cannot be reused. The allocator has to find a good
place, while keeping fragmentation under control. Maybe the blocks must be moved
to create enough space to insert the new blocks (it's often preferable to have
contiguous blocks).</p>
<p>That's what I call <strong>memory pressure</strong>. We are asking too much, too fast, and
the memory allocator we use in the Matrix Rust SDK is not designed to handle
this use case.</p>
<p>What are our solutions then?</p>
<div class="conversation" data-character="factotum">
<div class="conversation--character">
<span lang="fr">Le Factotum</span>
<picture role="presentation">
<source srcset="/image/factotum.avif" type="image/avif" />
<source srcset="/image/factotum.webp" type="image/webp" />
<img src="/image/factotum.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>May I suggest an approach? What about finding where we are allocating and
deallocating memory? Then we might be able to reduce either the number of
allocations, or the size of the value being allocated (and deallocated), with
the hope of making the memory allocator happier. Possible solutions:</p>
<ul>
<li>If the allocated value is too large to fit in the stack, we could return a
pointer to it if possible,</li>
<li>Maybe we don't need the full value: we could return just a pointer to a
fragment of it?</li>
</ul>
</div>
</div>
<p>Excellent ideas. Let's track which sorter creates the problem. We start with
the sorter that was recently modified: <code>latest_event</code>. In short, this sorter
compares the <code>LatestEventValue</code> of two rooms: the idea is that rooms with a
<code>LatestEventValue</code> representing a <em>local event</em>, i.e. an event that is not sent
yet, or is being sent, must be at the top of the Room List. Alright, <a rel="noopener external" target="_blank" href="https://github.com/matrix-org/matrix-rust-sdk/blob/3eb693acadb08db8e41de90ef51730d206168e7c/crates/matrix-sdk-ui/src/room_list_service/sorters/latest_event.rs#L64C1-L69C2">let's look at
its core part</a>:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">pub fn</span><span class="z-entity z-name z-function"> new_sorter</span><span>()</span><span class="z-keyword z-operator"> -></span><span class="z-keyword"> impl</span><span class="z-entity z-name"> Sorter</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> latest_events</span><span class="z-keyword z-operator"> =</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> |</span><span class="z-variable">left</span><span class="z-keyword z-operator">: &</span><span class="z-entity z-name">Room</span><span>,</span><span class="z-variable"> right</span><span class="z-keyword z-operator">: &</span><span class="z-entity z-name">Room</span><span class="z-keyword z-operator">|</span><span> (</span><span class="z-variable">left</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">latest_event</span><span>(),</span><span class="z-variable"> right</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">latest_event</span><span>());</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> move</span><span class="z-keyword z-operator"> |</span><span class="z-variable">left</span><span>,</span><span class="z-variable"> right</span><span class="z-keyword z-operator">| -></span><span class="z-entity z-name"> Ordering</span><span> {</span><span class="z-entity z-name z-function"> cmp</span><span>(</span><span class="z-variable">latest_events</span><span>,</span><span class="z-variable"> left</span><span>,</span><span class="z-variable"> right</span><span>) }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Alright. For each sorting iteration, the <code>Room::latest_event</code> method is called
twice. <a rel="noopener external" target="_blank" href="https://github.com/matrix-org/matrix-rust-sdk/blob/3eb693acadb08db8e41de90ef51730d206168e7c/crates/matrix-sdk-base/src/room/latest_event.rs#L38">This method is as follows</a>:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">pub fn</span><span class="z-entity z-name z-function"> latest_event</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-variable z-language">self</span><span>)</span><span class="z-keyword z-operator"> -></span><span class="z-entity z-name"> LatestEventValue</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable z-language"> self</span><span class="z-keyword z-operator">.</span><span>info</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">read</span><span>()</span><span class="z-keyword z-operator">.</span><span>latest_event</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">clone</span><span>()</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Oh, there it is. We are acquiring a read lock over the <code>info</code> value, then we
are reading the <code>latest_event</code> field, and we are cloning the value. Cloning
is important here as we don't want to hold the read lock for too long. This
is our culprit. The size of the <a rel="noopener external" target="_blank" href="https://github.com/matrix-org/matrix-rust-sdk/blob/3eb693acadb08db8e41de90ef51730d206168e7c/crates/matrix-sdk-base/src/latest_event.rs#L29"><code>LatestEventValue</code></a> type
is 144 bytes (it doesn't count the size of the event itself, because this size
is dynamic).</p>
<p>Before going further, let's check whether another sorter has a similar problem,
shall we? <i>Look at the other sorters</i>, oh!, turns out <a rel="noopener external" target="_blank" href="https://github.com/matrix-org/matrix-rust-sdk/blob/01c0775e5974ad8a8690f5c580e79612ddcdfa2d/crates/matrix-sdk-ui/src/room_list_service/sorters/recency.rs#L90">the <code>recency</code>
sorter</a> also uses the <code>latest_event</code> method! Damn, this is
becoming really annoying.</p>
<p>Question: do we need the entire <code>LatestEventValue</code>? Probably not!</p>
<ul>
<li>For the <code>latest_event</code> sorter, we actually only need to know when this
<code>LatestEventValue</code> is <em>local</em>, that's it.</li>
<li>For the <code>recency</code> sorter, we only need to know the timestamp of the
<code>LatestEventValue</code>.</li>
</ul>
<p>So instead of copying the whole value in memory twice per sorter iteration, for
two sorters, let's try to write more specific methods:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">pub fn</span><span class="z-entity z-name z-function"> latest_event_is_local</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-variable z-language">self</span><span>)</span><span class="z-keyword z-operator"> -></span><span class="z-entity z-name"> bool</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable z-language"> self</span><span class="z-keyword z-operator">.</span><span>info</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">read</span><span>()</span><span class="z-keyword z-operator">.</span><span>latest_event</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">is_local</span><span>()</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">pub fn</span><span class="z-entity z-name z-function"> latest_event_timestamp</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-variable z-language">self</span><span>)</span><span class="z-keyword z-operator"> -></span><span class="z-entity z-name"> Option</span><span><</span><span class="z-entity z-name">MilliSecondsSinceUnixEpoch</span><span>> {</span></span>
<span class="giallo-l"><span class="z-variable z-language"> self</span><span class="z-keyword z-operator">.</span><span>info</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">read</span><span>()</span><span class="z-keyword z-operator">.</span><span>latest_event</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">timestamp</span><span>()</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
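<p>To see why this matters, here is a self-contained sketch that puts both
approaches side by side. Everything in it is simplified and hypothetical: the
<code>LatestEventValue</code> enum, its <code>content</code> field, the <code>u64</code> timestamp, and the
use of <code>std::sync::RwLock</code> (with its <code>unwrap</code>) are assumptions made for the
example, not the SDK's real types.</p>

```rust
use std::sync::RwLock;

/// Hypothetical, simplified stand-in for the SDK's `LatestEventValue`.
#[derive(Clone)]
enum LatestEventValue {
    None,
    Remote { timestamp: u64, content: String },
    Local { timestamp: u64, content: String },
}

struct RoomInfo {
    latest_event: LatestEventValue,
}

struct Room {
    info: RwLock<RoomInfo>,
}

impl Room {
    /// Clones the whole value: the `String` payload is copied into a fresh
    /// heap allocation on every call, and freed when the clone is dropped.
    fn latest_event(&self) -> LatestEventValue {
        self.info.read().unwrap().latest_event.clone()
    }

    /// Only inspects the variant under the lock: nothing is allocated.
    fn latest_event_is_local(&self) -> bool {
        matches!(
            self.info.read().unwrap().latest_event,
            LatestEventValue::Local { .. }
        )
    }

    /// Only copies an integer out of the lock: nothing is allocated.
    fn latest_event_timestamp(&self) -> Option<u64> {
        match &self.info.read().unwrap().latest_event {
            LatestEventValue::None => None,
            LatestEventValue::Remote { timestamp, .. }
            | LatestEventValue::Local { timestamp, .. } => Some(*timestamp),
        }
    }
}
```

<p>The two narrow accessors return values that are copied out directly, while
<code>latest_event</code> has to duplicate whatever lives on the heap behind the value.</p>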
<p>Just like that, <strong>the throughput has been improved by 18%</strong> according to the
<code>room_list</code> benchmark. You can see <a rel="noopener external" target="_blank" href="https://github.com/matrix-org/matrix-rust-sdk/commit/62eb1996d917fb1928bdb9bba40d78a6eefe0bbd">the patch in “action”</a>. Can we
declare victory over memory pressure?</p>
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>I beg your pardon, but I don't believe it's a victory. We have reduced the size
of allocations, but not the number of allocations itself.</p>
<p>Well, actually, <code>latest_event_is_local</code> returns a <code>bool</code>: it
can fit in the stack. And <code>latest_event_timestamp</code> returns an
<code>Option<MilliSecondsSinceUnixEpoch></code>, where <a rel="noopener external" target="_blank" href="https://docs.rs/ruma/0.14.1/ruma/struct.MilliSecondsSinceUnixEpoch.html"><code>MilliSecondsSinceUnixEpoch</code> is a
<code>UInt</code></a>, which <a rel="noopener external" target="_blank" href="https://docs.rs/js_int/0.2.2/js_int/struct.UInt.html">itself wraps a <code>u64</code></a>: it can
also fit in the stack.</p>
<p>So, yes, we may have reduced the number of allocations greatly, that's agreed,
it explains the 18% throughput improvement. However, issue reporters were
mentioning a lag of 5 minutes or so, do you remember? How do you explain the
remaining 4 minutes 6 seconds then? This is still unacceptable, right?</p>
</div>
</div>
<p>Definitely yes! Everything above 200ms (from our napkin maths) is unacceptable
here. Memory pressure was an important problem, and it's now solved, but it
wasn't the only problem.</p>
<h2 id="lock-contention">Lock Contention<a role="presentation" class="anchor" href="#lock-contention" title="Anchor link to this header">#</a>
</h2>
<p>The assiduous reader may have noticed that we are still dealing with a lock
here.</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-variable z-language">self</span><span class="z-keyword z-operator">.</span><span>info</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">read</span><span>()</span><span class="z-keyword z-operator">.</span><span>latest_event</span><span class="z-keyword z-operator">.</span><span>…</span></span>
<span class="giallo-l"><span class="z-comment">// ^^^^^^</span></span>
<span class="giallo-l"><span class="z-comment">// |</span></span>
<span class="giallo-l"><span class="z-comment">// this read lock acquisition</span></span></code></pre>
<p>Do you remember we had 322'042 allocations? It represents the number of times
the <code>latest_event</code> method was called basically, which means…</p>
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>… the lock is acquired 322'042 times!</p>
<p>…</p>
<p>… no?</p>
</div>
</div>
<p>… yes… and please, stop interrupting me, I was trying to build up the suspense for
a climax.</p>
<p>Anyway. Avoiding a lock isn't an easy task. However, this lock around <code>info</code>
is particularly annoying because it's acquired by almost all sorters! They need
information about a <code>Room</code>; all the information is in this <code>info</code> field, which
is guarded by a read-write lock. Hmmm.</p>
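<p>If you want to convince yourself, here is a small experiment: sort a few
rooms with a comparator that reads through a lock, and count. The <code>Room</code>
type below is a hypothetical stand-in (its <code>info</code> only holds a number), and I
use <code>std::sync::RwLock</code> for the sake of a self-contained example.</p>

```rust
use std::cell::Cell;
use std::sync::RwLock;

/// Hypothetical, simplified stand-in for the SDK's `Room`: here `info` only
/// holds a number we sort on.
struct Room {
    info: RwLock<u64>,
}

/// Sorts `rooms` and returns `(comparisons, lock_acquisitions)`.
fn sort_and_count(rooms: &mut [Room]) -> (usize, usize) {
    let comparisons = Cell::new(0);
    let acquisitions = Cell::new(0);

    // Every read of a room's data goes through the read lock.
    let read_stamp = |room: &Room| -> u64 {
        acquisitions.set(acquisitions.get() + 1);
        *room.info.read().unwrap()
    };

    rooms.sort_by(|left, right| {
        comparisons.set(comparisons.get() + 1);
        // One acquisition for `left`, one for `right`.
        read_stamp(left).cmp(&read_stamp(right))
    });

    (comparisons.get(), acquisitions.get())
}
```

<p>Whatever the input, the second number is always twice the first: one
sorter, two rooms, two lock acquisitions per comparison. Chain several sorters
that each read <code>info</code>, and it only gets worse.</p>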
<p>Let's change our strategy. We need to take a step back:</p>
<ol>
<li>The sorters need this data.</li>
<li>Running the sorters won't change this data.</li>
<li>When the data does change, the sorters are re-run.</li>
</ol>
<p>Maybe we could fetch, ahead of time, all the necessary data for all sorters in
a single type: it will be refreshed when the data changes, which is right before the
sorters run again.</p>
<div class="conversation" data-character="procureur">
<div class="conversation--character">
<span lang="fr">Le Procureur</span>
<picture role="presentation">
<source srcset="/image/procureur.avif" type="image/avif" />
<source srcset="/image/procureur.webp" type="image/webp" />
<img src="/image/procureur.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>The idea here is to organise the data around a specific layout. The focus on the
data layout aims at being CPU cache friendly as much as possible. This kind of
approach is called <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Data-oriented_design"><em>Data-oriented Design</em></a>.</p>
</div>
</div>
<p>That's correct. If the type is small enough, it can fit more easily in the
CPU caches, like L1 or L2. Do you remember how fast they are? 1ns and 4ns,
much faster than the 100ns for the main memory. Moreover, it removes the lock
contention and the memory pressure entirely!</p>
<details>
<summary>
<p>I highly recommend watching the following talks<sup class="footnote-reference" id="fr-talks-1"><a href="#fn-talks">7</a></sup> if you want to learn more about Data-oriented Design (DoD)</p>
</summary>
<figure>
<iframe
class="youtube-player"
src="https://www.youtube-nocookie.com/embed/rX0ItVEVjHc"
title="Data-Oriented Design and C++, by Mike Acton, at the CppCon 2014"
frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
referrerpolicy="strict-origin-when-cross-origin"
allowfullscreen
loading="lazy"></iframe>
<figcaption>
<p>Video: Data-Oriented Design and C++, by Mike Acton, at the CppCon 2014</p>
<p>The transformation of data is the only purpose of any program. Common approaches in C++ which are antithetical to this goal will be presented in the context of a performance-critical domain (console game development). Additionally, limitations inherent in any C++ compiler and how that affects the practical use of the language when transforming that data will be demonstrated. <a rel="noopener external" target="_blank" href="https://github.com/CppCon/CppCon2014/tree/master/Presentations/Data-Oriented%20Design%20and%20C%2B%2B">View the slides</a>.</p>
</figcaption>
</figure>
<figure>
<iframe
class="youtube-player"
src="https://www.youtube-nocookie.com/embed/WDIkqP4JbkE"
title="Cpu Caches and Why You Care, by Scott Meyers, at the code::dive conference 2014"
frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
referrerpolicy="strict-origin-when-cross-origin"
allowfullscreen
loading="lazy"></iframe>
<figcaption>
<p>Video: Cpu Caches and Why You Care, by Scott Meyers, at the code::dive conference 2014</p>
<p>This talk explores CPU caches and their impact on program performance.</p>
</figcaption>
</figure>
</details>
<p>So. Let's be serious: I suggest trying to do some Data-oriented Design here.
We start by putting all our data in a single type:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> struct</span><span class="z-entity z-name"> RoomListItem</span><span> {</span></span>
<span class="giallo-l"><span class="z-comment"> /// Cache of `Room::latest_event_timestamp`.</span></span>
<span class="giallo-l"><span class="z-variable"> cached_latest_event_timestamp</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Option</span><span><</span><span class="z-entity z-name">MilliSecondsSinceUnixEpoch</span><span>>,</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> /// Cache of `Room::latest_event_is_local`.</span></span>
<span class="giallo-l"><span class="z-variable"> cached_latest_event_is_local</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> bool</span><span>,</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> /// Cache of `Room::recency_stamp`.</span></span>
<span class="giallo-l"><span class="z-variable"> cached_recency_stamp</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Option</span><span><</span><span class="z-entity z-name">RoomRecencyStamp</span><span>>,</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> /// Cache of `Room::cached_display_name`, already as a string.</span></span>
<span class="giallo-l"><span class="z-variable"> cached_display_name</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Option</span><span><</span><span class="z-entity z-name">String</span><span>>,</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> /// Cache of `Room::is_space`.</span></span>
<span class="giallo-l"><span class="z-variable"> cached_is_space</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> bool</span><span>,</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> /// Cache of `Room::state`.</span></span>
<span class="giallo-l"><span class="z-variable"> cached_state</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> RoomState</span><span>,</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">impl</span><span class="z-entity z-name"> RoomListItem</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> fn</span><span class="z-entity z-name z-function"> refresh_cached_data</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-storage">mut</span><span class="z-variable z-language"> self</span><span>,</span><span class="z-variable"> room</span><span class="z-keyword z-operator">: &</span><span class="z-entity z-name">Room</span><span>) {</span></span>
<span class="giallo-l"><span class="z-variable z-language"> self</span><span class="z-keyword z-operator">.</span><span>cached_latest_event_timestamp </span><span class="z-keyword z-operator">=</span><span class="z-variable"> room</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">new_latest_event_timestamp</span><span>();</span></span>
<span class="giallo-l"><span class="z-comment"> // etc.</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>At this point, the size of <code>RoomListItem</code> is 64 bytes, acceptably small!</p>
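<p>If you want to check such a number yourself, <code>std::mem::size_of</code> is enough. Here is a minimal sketch, assuming stand-in types: <code>MilliSecondsSinceUnixEpoch</code>, <code>RoomRecencyStamp</code> and <code>RoomState</code> below are hypothetical simplifications (integer wrappers and a field-less enum), not the real SDK definitions, and the 128-byte cache line is the value from one particular machine.</p>

```rust
use std::mem::size_of;

// Hypothetical stand-ins for the SDK types: assumed to be thin wrappers
// around an integer, or a small field-less enum.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct MilliSecondsSinceUnixEpoch(u64);

#[allow(dead_code)]
#[derive(Clone, Copy)]
struct RoomRecencyStamp(u64);

#[allow(dead_code)]
#[derive(Clone, Copy)]
enum RoomState {
    Joined,
    Left,
    Invited,
    Knocked,
    Banned,
}

// Same field names as in the article, with the stand-in types above.
#[allow(dead_code)]
struct RoomListItem {
    cached_latest_event_timestamp: Option<MilliSecondsSinceUnixEpoch>,
    cached_latest_event_is_local: bool,
    cached_recency_stamp: Option<RoomRecencyStamp>,
    cached_display_name: Option<String>,
    cached_is_space: bool,
    cached_state: RoomState,
}

/// Assumed cache line size, in bytes (it depends on the hardware).
const CACHE_LINE_SIZE: usize = 128;

/// Size of one `RoomListItem`, in bytes, as laid out by the compiler.
fn room_list_item_size() -> usize {
    size_of::<RoomListItem>()
}

fn main() {
    let size = room_list_item_size();

    println!("size_of::<RoomListItem>() = {size} bytes");
    println!("fits in one cache line: {}", size <= CACHE_LINE_SIZE);
}
```

<p>The exact value depends on the target and on the compiler's layout decisions, so treat the printed size as a measurement, not as a guarantee.</p>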
<div class="conversation" data-character="factotum">
<div class="conversation--character">
<span lang="fr">Le Factotum</span>
<picture role="presentation">
<source srcset="/image/factotum.avif" type="image/avif" />
<source srcset="/image/factotum.webp" type="image/webp" />
<img src="/image/factotum.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>The L1 and L2 caches nowadays have a size of several kilobytes. You can try to
run <a rel="noopener external" target="_blank" href="https://man.freebsd.org/cgi/man.cgi?query=sysctl"><code>sysctl</code></a> or <a rel="noopener external" target="_blank" href="https://linux.die.net/man/1/getconf"><code>getconf</code></a> in a shell to see how much your hardware supports
(look for an entry like “cache line”, or “cache line size” for example).</p>
<p>On my system for example, the L1 (data) cache size is 65KB, and the cache line
size is 128 bytes.</p>
<p>Ideally, we —at the very least— want one <code>RoomListItem</code> to fit in a cache line.
Compacting the type to avoid inner padding would be ideal. If there is a <em>cache
miss</em> in L1, the CPU will look at the next cache, so L2, and so on, until
reaching the main memory. So the cost of a cache miss is: look up in L1, plus
cache miss, plus look up in L2, etc.</p>
</div>
</div>
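<p>To make the “inner padding” remark concrete, here is a small sketch. By default, the Rust compiler is free to reorder the fields of a struct, so I'm using <code>#[repr(C)]</code> to force the declared order and make the padding visible. The <code>Padded</code> and <code>Compact</code> types are illustrative, they don't exist in the SDK.</p>

```rust
use std::mem::size_of;

/// `#[repr(C)]` keeps the declared field order: padding bytes are inserted
/// before `b` (to align it) and after `c` (to round the size up to the
/// alignment of the struct).
#[allow(dead_code)]
#[repr(C)]
struct Padded {
    a: bool,
    b: u64,
    c: bool,
}

/// Same fields, largest first: the two `bool`s share the tail of the struct.
#[allow(dead_code)]
#[repr(C)]
struct Compact {
    b: u64,
    a: bool,
    c: bool,
}

/// Returns the sizes, in bytes, of `Padded` and `Compact`.
fn sizes() -> (usize, usize) {
    (size_of::<Padded>(), size_of::<Compact>())
}

fn main() {
    let (padded, compact) = sizes();

    println!("Padded: {padded} bytes, Compact: {compact} bytes");
}
```

<p>On a typical 64-bit target, where <code>u64</code> is aligned to 8 bytes, I would expect 24 bytes for <code>Padded</code> and 16 bytes for <code>Compact</code>. Without <code>#[repr(C)]</code>, the compiler usually does this reordering for us.</p>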
<p><a rel="noopener external" target="_blank" href="https://github.com/matrix-org/matrix-rust-sdk/commit/a84c97b292c658109bfb40391b5f10b0708276d4">A bit of plumbing later</a>, this new <code>RoomListItem</code> type is
used everywhere by the Room List, by all its filters and all its sorters. For
example, the <code>latest_event</code> sorter now looks like:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">pub fn</span><span class="z-entity z-name z-function"> new_sorter</span><span>()</span><span class="z-keyword z-operator"> -></span><span class="z-keyword"> impl</span><span class="z-entity z-name"> Sorter</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> latest_events</span><span class="z-keyword z-operator"> = |</span><span class="z-variable">left</span><span class="z-keyword z-operator">: &</span><span class="z-entity z-name">RoomListItem</span><span>,</span><span class="z-variable"> right</span><span class="z-keyword z-operator">: &</span><span class="z-entity z-name">RoomListItem</span><span class="z-keyword z-operator">|</span><span> {</span></span>
<span class="giallo-l"><span> (</span><span class="z-variable">left</span><span class="z-keyword z-operator">.</span><span>cached_latest_event_is_local,</span><span class="z-variable"> right</span><span class="z-keyword z-operator">.</span><span>cached_latest_event_is_local)</span></span>
<span class="giallo-l"><span> };</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> move</span><span class="z-keyword z-operator"> |</span><span class="z-variable">left</span><span>,</span><span class="z-variable"> right</span><span class="z-keyword z-operator">| -></span><span class="z-entity z-name"> Ordering</span><span> {</span><span class="z-entity z-name z-function"> cmp</span><span>(</span><span class="z-variable">latest_events</span><span>,</span><span class="z-variable"> left</span><span>,</span><span class="z-variable"> right</span><span>) }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
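<p>To see the whole idea in one place, here is a minimal, self-contained sketch. It is not the SDK code: <code>Room</code>, <code>RoomInfo</code> and <code>by_recency</code> are hypothetical simplifications, with a single cached field and a plain <code>RwLock</code> standing in for the real room state.</p>

```rust
use std::cmp::Ordering;
use std::sync::RwLock;

/// Hypothetical stand-in for the SDK's `Room`: its data sits behind a lock.
struct Room {
    info: RwLock<RoomInfo>,
}

struct RoomInfo {
    latest_event_timestamp: Option<u64>,
}

/// Simplified `RoomListItem` with a single cached field.
struct RoomListItem {
    cached_latest_event_timestamp: Option<u64>,
}

impl RoomListItem {
    fn new(room: &Room) -> Self {
        let mut item = Self { cached_latest_event_timestamp: None };
        item.refresh_cached_data(room);

        item
    }

    /// The only place where the lock is acquired.
    fn refresh_cached_data(&mut self, room: &Room) {
        let info = room.info.read().expect("the lock is poisoned");
        self.cached_latest_event_timestamp = info.latest_event_timestamp;
    }
}

/// Most recent first, rooms without a timestamp last. It only reads plain
/// data: no lock is taken while sorting.
fn by_recency(left: &RoomListItem, right: &RoomListItem) -> Ordering {
    right.cached_latest_event_timestamp.cmp(&left.cached_latest_event_timestamp)
}

/// Builds rooms from timestamps, caches their data, sorts the items, and
/// returns the timestamps in sorted order.
fn sorted_timestamps(timestamps: &[Option<u64>]) -> Vec<Option<u64>> {
    let rooms: Vec<Room> = timestamps
        .iter()
        .map(|&timestamp| Room {
            info: RwLock::new(RoomInfo { latest_event_timestamp: timestamp }),
        })
        .collect();

    let mut items: Vec<RoomListItem> = rooms.iter().map(RoomListItem::new).collect();
    items.sort_by(by_recency);

    items.iter().map(|item| item.cached_latest_event_timestamp).collect()
}

fn main() {
    println!("{:?}", sorted_timestamps(&[Some(10), None, Some(30)]));
}
```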
<p>The lock acquisitions happen only in <code>refresh_cached_data</code>, when a new update
happens, not during the filtering or sorting anymore. Let's see what the
benchmark has to say now.</p>
<p>Before:</p>
<pre class="giallo z-code"><code data-lang="shellscript"><span class="giallo-l"><span class="z-entity z-name">$</span><span class="z-string"> cargo bench</span><span class="z-constant z-other"> --bench</span><span class="z-string"> room_list</span></span>
<span class="giallo-l"><span class="z-entity z-name">RoomList/Create/1000</span><span class="z-string"> rooms ×</span><span class="z-constant z-numeric"> 1000</span><span class="z-string"> events</span></span>
<span class="giallo-l"><span> time: [</span><span class="z-constant z-numeric">53.027</span><span> ms </span><span class="z-constant z-numeric">53.149</span><span> ms </span><span class="z-constant z-numeric">53.273</span><span> ms]</span></span>
<span class="giallo-l"><span class="z-entity z-name"> thrpt:</span><span> [18.771</span><span class="z-string"> Kelem/s</span><span class="z-constant z-numeric"> 18.815</span><span class="z-string"> Kelem/s</span><span class="z-constant z-numeric"> 18.858</span><span class="z-string"> Kelem/s]</span></span></code></pre>
<p>After:</p>
<pre class="giallo z-code"><code data-lang="shellscript"><span class="giallo-l"><span class="z-entity z-name">$</span><span class="z-string"> cargo bench</span><span class="z-constant z-other"> --bench</span><span class="z-string"> room_list</span></span>
<span class="giallo-l"><span class="z-entity z-name">RoomList/Create/1000</span><span class="z-string"> rooms ×</span><span class="z-constant z-numeric"> 1000</span><span class="z-string"> events</span></span>
<span class="giallo-l"><span> time: [</span><span class="z-constant z-numeric">676.29</span><span> µs </span><span class="z-constant z-numeric">676.84</span><span> µs </span><span class="z-constant z-numeric">677.50</span><span> µs]</span></span>
<span class="giallo-l"><span class="z-entity z-name"> thrpt:</span><span> [1.4760</span><span class="z-string"> Melem/s</span><span class="z-constant z-numeric"> 1.4775</span><span class="z-string"> Melem/s</span><span class="z-constant z-numeric"> 1.4787</span><span class="z-string"> Melem/s]</span></span>
<span class="giallo-l"><span class="z-entity z-name"> change:</span></span>
<span class="giallo-l"><span> time: [-98.725% -98.721% -98.716%] (</span><span class="z-entity z-name">p</span><span class="z-string"> =</span><span class="z-constant z-numeric"> 0.00</span><span class="z-keyword z-operator"> <</span><span class="z-constant z-numeric"> 0.05</span><span>)</span></span>
<span class="giallo-l"><span class="z-entity z-name"> thrpt:</span><span> [+7686.9%</span><span class="z-string"> +7718.5% +7745.6%]</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Performance</span><span class="z-string"> has improved.</span></span></code></pre>
<p>Boom!</p>
<p>We don't see the 5-minute lag mentioned by the reporters, but remember it's
random. Nonetheless, <strong>the performance impact is huge</strong>:</p>
<ul>
<li>From 18.8Kelem/s to 1.48Melem/s,</li>
<li>From 53ms to 676µs, or —to compare with the same unit— 0.676ms, so <strong>78× faster</strong>!</li>
<li>The throughput has improved by 7718.5%, and the time by 98.7%.</li>
</ul>
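<p>The two percentages in the last bullet are two views of the same ratio: if the time is multiplied by <code>1 + change</code>, the throughput is divided by it. Here is a small sketch of that arithmetic, assuming throughput is inversely proportional to time. The function name is mine, it is not part of the benchmark tooling.</p>

```rust
/// Relative throughput change implied by a relative time change, assuming
/// the same amount of work is done: `-0.5` (twice as fast) gives `1.0`
/// (+100% of throughput).
fn throughput_change(time_change: f64) -> f64 {
    1.0 / (1.0 + time_change) - 1.0
}

fn main() {
    // The median time change reported by the benchmark: -98.721%.
    let time_change = -0.98721;

    let speed_up = 1.0 / (1.0 + time_change);

    println!("speed-up: {speed_up:.1}×");
    println!("throughput change: {:+.1}%", throughput_change(time_change) * 100.0);
}
```

<p>With the reported -98.721%, this lands around 78× and a bit more than +7700%, in line with the numbers above.</p>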
<p>Can we claim victory now?</p>
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>Apparently yes! The reporters were unable to reproduce the problem anymore. It
seems it's solved! Looking at profilers, we see millions fewer allocations in
the benchmark runs (the benchmark does a lot of allocations for the setup, but
the difference is pretty noticeable).</p>
<p>Data-oriented Design is fascinating. Understanding how computers work, how the
memory and the CPU work, is crucial to optimise algorithms. The changes we've
applied are small compared to the performance improvement they have provided!</p>
<p>You said everything above 200ms is unacceptable. With 676µs, I reckon the target
is reached. It's even below the napkin maths about main memory access, which
suggests we are not hitting the RAM anymore in the filters and sorters (not
in an uncivilised way at least). Also, it's funny that an L1/L2 cache access
(1-4ns) is on average 40 times faster than a main memory access (100ns),
which looks suspiciously similar to the 78 times factor we see
here. It also suggests we are hitting L1 more frequently than L2, which is a
good sign!</p>
</div>
</div>
<p>The benchmark Iteration Times and Regression graphs are interesting to look at.</p>
<figure>
<p><a href="https://mnt.io/articles/about-memory-pressure-lock-contention-and-data-oriented-design/./1-iteration-times.svg"><img src="https://mnt.io/articles/about-memory-pressure-lock-contention-and-data-oriented-design/./1-iteration-times.svg" alt="Iteration times" loading="lazy" decoding="async" /></a></p>
<figcaption>
<p>The initial Iteration Times, before our patches. Notice how the points do not
follow any “trend”. It's a clear sign the program is acting erratically.</p>
</figcaption>
</figure>
<figure>
<p><a href="https://mnt.io/articles/about-memory-pressure-lock-contention-and-data-oriented-design/./2-iteration-times.svg"><img src="https://mnt.io/articles/about-memory-pressure-lock-contention-and-data-oriented-design/./2-iteration-times.svg" alt="Iteration times" loading="lazy" decoding="async" /></a></p>
<figcaption>
<p>The final Iteration Times/Regression, after our patches. Notice how the points
are linear.</p>
</figcaption>
</figure>
<p>The second graph is the kind of graph I like. Predictable.</p>
<div class="conversation" data-character="procureur">
<div class="conversation--character">
<span lang="fr">Le Procureur</span>
<picture role="presentation">
<source srcset="/image/procureur.avif" type="image/avif" />
<source srcset="/image/procureur.webp" type="image/webp" />
<img src="/image/procureur.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>In this concrete case, it's difficult to improve the performance further because
<code>RoomListItem</code> is used by sorters, and by filters, and in other places of the
code. The current usage of <code>RoomListItem</code> falls into the definition of <em>Array of
Structures</em> in the Data-oriented Design terminology. After all, we clearly have
a <code>Vec<RoomListItem></code> at the root of everything. It is efficient but <em>Structure
of Arrays</em> might be even more efficient. Instead of having:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-storage">struct</span><span class="z-entity z-name"> RoomListItem</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> a</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> bool</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> b</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> u64</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> c</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> bool</span><span>,</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">let</span><span class="z-variable"> rooms</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Vec</span><span><</span><span class="z-entity z-name">RoomListItem</span><span>>;</span></span></code></pre>
<p>we would have:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-storage">struct</span><span class="z-entity z-name"> RoomListItems</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> a</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Vec</span><span><</span><span class="z-entity z-name">bool</span><span>>,</span></span>
<span class="giallo-l"><span class="z-variable"> b</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Vec</span><span><</span><span class="z-entity z-name">u64</span><span>>,</span></span>
<span class="giallo-l"><span class="z-variable"> c</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Vec</span><span><</span><span class="z-entity z-name">bool</span><span>>,</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">let</span><span class="z-variable"> rooms</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> RoomListItems</span><span>;</span></span></code></pre>
<p>This is not applicable in our situation because sorters are iterating over
different fields. However, if you're sure only one field in a single loop is
used, this <em>Structure of Arrays</em> is cache friendlier as it loads less data into
the CPU caches: less padding, fewer useless bytes. By making better use of the
cache line, not only are we pretty sure the program will run faster, but the
CPU will be better at predicting what data will be loaded in the cache line,
boosting the performance even more!</p>
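<p>Allow me a minimal sketch of a loop that reads a single field, in both layouts. The helper names (<code>sample_rooms</code>, <code>to_soa</code>, <code>sum_b_aos</code>, <code>sum_b_soa</code>) are mine and purely illustrative. Both functions compute the same result, but the second one only walks a dense <code>Vec&lt;u64&gt;</code>.</p>

```rust
/// Array of Structures: one struct per room.
struct RoomListItem {
    a: bool,
    b: u64,
    c: bool,
}

/// Structure of Arrays: one vector per field.
struct RoomListItems {
    a: Vec<bool>,
    b: Vec<u64>,
    c: Vec<bool>,
}

/// Builds `n` illustrative rooms, with `b` going from `0` to `n - 1`.
fn sample_rooms(n: u64) -> Vec<RoomListItem> {
    (0..n).map(|i| RoomListItem { a: i % 2 == 0, b: i, c: i % 3 == 0 }).collect()
}

/// Converts the Array of Structures into a Structure of Arrays.
fn to_soa(rooms: &[RoomListItem]) -> RoomListItems {
    RoomListItems {
        a: rooms.iter().map(|room| room.a).collect(),
        b: rooms.iter().map(|room| room.b).collect(),
        c: rooms.iter().map(|room| room.c).collect(),
    }
}

/// Reads `b` only, but every `RoomListItem` (padding included) goes through
/// the cache.
fn sum_b_aos(rooms: &[RoomListItem]) -> u64 {
    rooms.iter().map(|room| room.b).sum()
}

/// Reads `b` only, and only the `b` values go through the cache.
fn sum_b_soa(rooms: &RoomListItems) -> u64 {
    rooms.b.iter().sum()
}

fn main() {
    let aos = sample_rooms(1000);
    let soa = to_soa(&aos);

    println!("AoS: {}, SoA: {}", sum_b_aos(&aos), sum_b_soa(&soa));
    println!("fields kept in sync: {}", soa.a.len() == soa.c.len());
}
```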
<p>Just so you know, my role here is not restricted to reciting documentation or
summarising Wikipedia entries.</p>
</div>
</div>
<p>Of course you're valuable! Now, the surprise.</p>
<h2 id="">The Dessert<a role="presentation" class="anchor" href="#" title="Anchor link to this header">#</a>
</h2>
<p>Of course, let's not forget about our dessert! I won't dig too much: the
patch contains all the necessary gory details. In short, it's about how
<code>VectorDiff::Set</code> can create a nasty bug in <code>SortBy</code>. Basically, when a value
in the vector is updated, a <code>VectorDiff::Set</code> is emitted. <code>SortBy</code> is then
responsible for computing a new <code>VectorDiff</code>:</p>
<ul>
<li>it was calculating the old position of the value,</li>
<li>it was calculating the new position,</li>
<li>depending on that, it was emitting the appropriate <code>VectorDiff</code>s.</li>
</ul>
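<p>To fix ideas, here is a deliberately naive sketch of these three steps. It is <em>not</em> the code of <code>SortBy</code>: the <code>Diff</code> enum is a hypothetical subset loosely modelled on <code>VectorDiff</code>, and I'm assuming the old position in the sorted buffer is already known. This version removes the old value eagerly, every time, before looking for the new position.</p>

```rust
/// A tiny subset of diffs, loosely modelled on `VectorDiff` (hypothetical,
/// not the real type).
#[derive(Debug, PartialEq)]
enum Diff<T> {
    Set { index: usize, value: T },
    Remove { index: usize },
    Insert { index: usize, value: T },
}

/// Replaces the value at `old_index` in `sorted` (ascending order) by
/// `new_value`, keeps `sorted` sorted, and returns the diffs an observer
/// must apply to mirror the change.
fn handle_set<T: Ord + Clone>(sorted: &mut Vec<T>, old_index: usize, new_value: T) -> Vec<Diff<T>> {
    // 1. The old position is known: remove the old value right away.
    sorted.remove(old_index);

    // 2. Find the new position, in a buffer that no longer holds the old value.
    let new_index = sorted.partition_point(|value| value <= &new_value);
    sorted.insert(new_index, new_value.clone());

    // 3. Emit the appropriate diffs.
    if new_index == old_index {
        vec![Diff::Set { index: new_index, value: new_value }]
    } else {
        vec![
            Diff::Remove { index: old_index },
            Diff::Insert { index: new_index, value: new_value },
        ]
    }
}

fn main() {
    let mut sorted = vec![1, 3, 5, 7];
    let diffs = handle_set(&mut sorted, 1, 6);

    println!("{diffs:?} gives {sorted:?}");
}
```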
<p>However, the old “value” wasn't removed from the buffer <em>immediately</em> and
not <em>every time</em>. In theory, it should not cause any problem —it was an
optimisation after all— except if… the items manipulated by the stream are
“shallow clones”. Shallow cloning a value won't copy the value entirely: we get
a new value, but its state is synced with the original value. This happens with
types such as:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>#[derive(</span><span class="z-entity z-name">Clone</span><span>)]</span></span>
<span class="giallo-l"><span class="z-storage">struct</span><span class="z-entity z-name"> S</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> inner</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Arc</span><span><</span><span class="z-entity z-name">T</span><span>></span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
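<p>Here, cloning a value of type <code>S</code> only clones the <code>Arc</code>: both values point to
the same <code>T</code>, so changing the data behind <code>inner</code> through the clone will also update the original value.</p>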
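<p>A minimal sketch of this behaviour, assuming <code>T</code> is a <code>Mutex&lt;u32&gt;</code> (mutating through an <code>Arc</code> requires some form of interior mutability). The <code>get</code> and <code>set</code> helpers are only there for the illustration.</p>

```rust
use std::sync::{Arc, Mutex};

/// A “shallow clone” type: `Clone` copies the `Arc`, not the data behind it.
#[derive(Clone)]
struct S {
    inner: Arc<Mutex<u32>>,
}

impl S {
    fn new(value: u32) -> Self {
        Self { inner: Arc::new(Mutex::new(value)) }
    }

    fn get(&self) -> u32 {
        *self.inner.lock().expect("the lock is poisoned")
    }

    fn set(&self, value: u32) {
        *self.inner.lock().expect("the lock is poisoned") = value;
    }
}

fn main() {
    let original = S::new(1);
    let clone = original.clone();

    // Updating the clone…
    clone.set(42);

    // … is visible from the original, since both share the same allocation.
    println!("original = {}", original.get());
    println!("same allocation: {}", Arc::ptr_eq(&original.inner, &clone.inner));
}
```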
<p>Just like that, it was possible to systematically create… <strong>an infinite loop</strong>.
Funky, isn't it?</p>
<p>You can view the patch <a rel="noopener external" target="_blank" href="https://github.com/jplatte/eyeball/pull/80">Fix an infinite loop when <code>SortBy<Stream<Item = T>></code> handles a <code>VectorDiff::Set</code> where <code>T</code> is a shallow clone
type</a> to learn more.</p>
<p>I think this is a concrete example of when jumping on an optimisation can lead
to a bug. I'm not saying we should not prematurely optimise our programs: I'm
a partisan of the “we should” camp. I'm saying that bugs can be pretty subtle
sometimes, and this bug would have been avoided if we hadn't taken a shortcut in
this algorithm. It's important to be correct first, then measure, then improve.</p>
<p>I hope you've learned a couple of things, and you've enjoyed your reading.</p>
<p>I would like to thank <a rel="noopener external" target="_blank" href="https://artificialworlds.net/blog/">Andy Balaam</a> and <a rel="noopener external" target="_blank" href="https://github.com/poljar">Damir Jelić</a> for the
reviews and the feedback!</p>
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn-vectordiff_on_other_uis">
<p>On <a rel="noopener external" target="_blank" href="https://developer.apple.com/swiftui/">SwiftUI</a>, there is the
<a rel="noopener external" target="_blank" href="https://developer.apple.com/documentation/swift/collectiondifference/change"><code>CollectionDifference.Change</code></a> enum. For example: <code>VectorDiff::PushFront</code>
is equivalent to <code>Change.insert(offset: 0)</code>. On <a rel="noopener external" target="_blank" href="https://developer.android.com/compose">Jetpack Compose</a>, there is the
<a rel="noopener external" target="_blank" href="https://kotlinlang.org/api/core/kotlin-stdlib/kotlin.collections/-mutable-list/"><code>MutableList</code></a> object. For example: <code>VectorDiff::Clear</code> is equivalent to
<code>MutableList.clear()</code>! <a href="#fr-vectordiff_on_other_uis-1">↩</a></p>
</li>
<li id="fn-switch">
<p>I would <em>love</em> to talk about how this <code>Stream</code> produces
a <code>Stream</code>, how the outer stream and the inner stream are switched (with
<code>.switch()</code>!), how we've implemented that from scratch, but it's probably
for another article. Meanwhile, you can take a look at <a rel="noopener external" target="_blank" href="https://docs.rs/async-rx/0.1.3/async_rx/struct.Switch.html"><code>async_rx::Switch</code></a>. <a href="#fr-switch-1">↩</a></p>
</li>
<li id="fn-stream_assert">
<p>Do you know <a rel="noopener external" target="_blank" href="https://docs.rs/stream_assert/0.1.1/stream_assert/"><code>stream_assert</code></a>? It's another crate we've
written to easily apply assertions on <code>Stream</code>s. Pretty convenient. <a href="#fr-stream_assert-1">↩</a></p>
</li>
<li id="fn-biscuit">
<p>Yes, <a rel="noopener external" target="_blank" href="https://www.biscuitsec.org/">biscuit</a>. <a href="#fr-biscuit-1">↩</a></p>
</li>
<li id="fn-napkin-math">
<p>I highly recommend reading the <a rel="noopener external" target="_blank" href="https://github.com/sirupsen/napkin-math/">Napkin Math</a> project, with
the great talk at <a rel="noopener external" target="_blank" href="https://www.youtube.com/watch?v=IxkSlnrRFqc">SRECON'19, <cite>Advanced Napkin Math: Estimating System
Performance from First Principles</cite> by Simon Eskildsen</a>. <a href="#fr-napkin-math-1">↩</a></p>
</li>
<li id="fn-suspens">
<p>Do you remember the lock contention? Wait for it. At this step of
the story, I wasn't aware we had a lock contention yet. <a href="#fr-suspens-1">↩</a></p>
</li>
<li id="fn-talks">
<p>If you are curious and enjoy watching talks, I'm maintaining
<a rel="noopener external" target="_blank" href="https://www.youtube.com/playlist?list=PLOkMRkzDhWGX_4YWI4ZYGbwFPqKnDRudf">a playlist of interesting talks I've watched</a>. You
can also read this old article <a href="https://mnt.io/articles/one-conference-per-day-for-one-year-2017/">One conference per day, for one year
(2017)</a>. <a href="#fr-talks-1">↩</a></p>
</li>
</ol>
</section>
From 19k to 4.2M events/sec: story of a SQLite query optimisation2025-09-12T00:00:00+00:002025-09-12T00:00:00+00:00
Unknown
https://mnt.io/articles/from-19k-to-4-2m-events-per-sec-story-of-a-sqlite-query-optimisation/<p>Sit down comfortably. Take a cushion if you wish. This is, <i>clears its
throat</i>, the story of a funny performance quest. The <a rel="noopener external" target="_blank" href="https://github.com/matrix-org/matrix-rust-sdk">Matrix Rust SDK</a> is a
set of crates aiming at providing all the necessary tooling to develop robust
and safe <a rel="noopener external" target="_blank" href="https://matrix.org/">Matrix</a> clients. Of course, it involves databases to persist some
data. The Matrix Rust SDK supports multiple databases: in-memory, <a rel="noopener external" target="_blank" href="https://sqlite.org/">SQLite</a>, and
<a rel="noopener external" target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API">IndexedDB</a>. This story is about the SQLite database.</p>
<p>The structure we want to persist is a novel type we have designed specifically
for the Matrix Rust SDK: a <a rel="noopener external" target="_blank" href="https://docs.rs/matrix-sdk-common/0.14.0/matrix_sdk_common/linked_chunk/index.html"><code>LinkedChunk</code></a>. It's the underlying structure that
holds all events manipulated by the Matrix Rust SDK. It is somewhat similar to
a <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Linked_List">linked list</a>; the differences are subtle and the goal of this article is
<em>not</em> to present all the details. We have developed many APIs around this type
to make all operations fast and efficient in the context of the Matrix protocol.
What we need to know is that in a <code>LinkedChunk<_, Item, Gap></code>, each node
contains a <code>ChunkContent<Item, Gap></code> defined as:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-storage">enum</span><span class="z-entity z-name"> ChunkContent</span><span><</span><span class="z-entity z-name">Item</span><span>,</span><span class="z-entity z-name"> Gap</span><span>> {</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> Gap</span><span>(</span><span class="z-entity z-name">Gap</span><span>),</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> Items</span><span>(</span><span class="z-entity z-name">Vec</span><span><</span><span class="z-entity z-name">Item</span><span>>),</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Put differently: each node contains either a <em>gap</em> or a set of <em>items</em> (here,
Matrix events).</p>
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>May I recapitulate?</p>
<p>Each Matrix <em>room</em> contains a <code>LinkedChunk</code>, which is a set of <em>chunks</em>. Each
<em>chunk</em> is either a <em>gap</em> or a set of <em>events</em>. It seems to map fairly easily to
SQL tables, doesn't it?</p>
</div>
</div>
<p>You're right: it's pretty straightforward! Let's see the first table:
<code>linked_chunks</code> which contains all the chunks. (Note that the schemas are
simplified for the sake of clarity).</p>
<pre class="giallo z-code"><code data-lang="sql"><span class="giallo-l"><span class="z-keyword">CREATE TABLE</span><span> "</span><span class="z-entity z-name z-function">linked_chunks</span><span>" (</span></span>
<span class="giallo-l"><span class="z-comment"> -- Which linked chunk does this chunk belong to?</span></span>
<span class="giallo-l"><span class="z-string"> "linked_chunk_id"</span><span> BLOB </span><span class="z-keyword">NOT NULL</span><span>,</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> -- Identifier of the chunk, unique per linked chunk.</span></span>
<span class="giallo-l"><span class="z-string"> "id"</span><span class="z-storage"> INTEGER</span><span class="z-keyword"> NOT NULL</span><span>,</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> -- Identifier of the previous chunk.</span></span>
<span class="giallo-l"><span class="z-string"> "previous"</span><span class="z-storage"> INTEGER</span><span>,</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> -- Identifier of the next chunk.</span></span>
<span class="giallo-l"><span class="z-string"> "next"</span><span class="z-storage"> INTEGER</span><span>,</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> -- Our enum for the content of the chunk: `E` for events, `G` for a gap.</span></span>
<span class="giallo-l"><span class="z-string"> "type"</span><span class="z-storage"> TEXT CHECK</span><span>(</span><span class="z-string">"type"</span><span class="z-keyword"> IN</span><span> (</span><span class="z-string">'E'</span><span>, </span><span class="z-string">'G'</span><span>)) </span><span class="z-keyword">NOT NULL</span><span>,</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> -- … other things …</span></span>
<span class="giallo-l"><span>);</span></span></code></pre>
<p>Alrighty. Next contenders: the <code>event_chunks</code> and the <code>gap_chunks</code> tables, which
store the <code>ChunkContent</code>s of each chunk, respectively for <code>ChunkContent::Items</code>
and <code>ChunkContent::Gap</code>. In <code>event_chunks</code>, each row corresponds to an event. In
<code>gap_chunks</code>, each row corresponds to a gap.</p>
<pre class="giallo z-code"><code data-lang="sql"><span class="giallo-l"><span class="z-keyword">CREATE TABLE</span><span> "</span><span class="z-entity z-name z-function">event_chunks</span><span>" (</span></span>
<span class="giallo-l"><span class="z-comment"> -- Which linked chunk does this event belong to?</span></span>
<span class="giallo-l"><span class="z-string"> "linked_chunk_id"</span><span> BLOB </span><span class="z-keyword">NOT NULL</span><span>,</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> -- Which chunk does this event refer to?</span></span>
<span class="giallo-l"><span class="z-string"> "chunk_id"</span><span class="z-storage"> INTEGER</span><span class="z-keyword"> NOT NULL</span><span>,</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> -- The event ID.</span></span>
<span class="giallo-l"><span class="z-string"> "event_id"</span><span> BLOB </span><span class="z-keyword">NOT NULL</span><span>,</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> -- Position (index) in the **chunk**.</span></span>
<span class="giallo-l"><span class="z-string"> "position"</span><span class="z-storage"> INTEGER</span><span class="z-keyword"> NOT NULL</span><span>,</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> -- … other things …</span></span>
<span class="giallo-l"><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">CREATE TABLE</span><span> "</span><span class="z-entity z-name z-function">gap_chunks</span><span>" (</span></span>
<span class="giallo-l"><span class="z-comment"> -- Which linked chunk does this gap belong to?</span></span>
<span class="giallo-l"><span class="z-string"> "linked_chunk_id"</span><span> BLOB </span><span class="z-keyword">NOT NULL</span><span>,</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> -- Which chunk does this gap refer to?</span></span>
<span class="giallo-l"><span class="z-string"> "chunk_id"</span><span class="z-storage"> INTEGER</span><span class="z-keyword"> NOT NULL</span><span>,</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> -- … other things …</span></span>
<span class="giallo-l"><span>);</span></span></code></pre>
<p>Last contender, <code>events</code>. The assiduous reader may have noted that
<code>event_chunks</code> doesn't contain the content of the events: only their IDs and their
positions, <i>rolls its eyes</i>… let's digress a bit, shall we? Why is that? To
handle out-of-band events. In the Matrix protocol, we can receive events via:</p>
<ul>
<li><a rel="noopener external" target="_blank" href="https://spec.matrix.org/v1.15/client-server-api/#get_matrixclientv3sync">the <code>/sync</code> endpoint</a>, it's the main source of inputs, we get most
of the events via this API,</li>
<li><a rel="noopener external" target="_blank" href="https://spec.matrix.org/v1.15/client-server-api/#get_matrixclientv3roomsroomidmessages">the <code>/messages</code> endpoint</a>, when we need to get events around a
particular event; this is helpful if we need to paginate backwards or
forwards around an event,</li>
<li><a rel="noopener external" target="_blank" href="https://spec.matrix.org/v1.15/client-server-api/#get_matrixclientv3roomsroomidcontexteventid">the <code>/context</code> endpoint</a>, if we need to get more context about
an event,</li>
<li>but there is more, like <a rel="noopener external" target="_blank" href="https://spec.matrix.org/v1.15/client-server-api/#mroompinned_events">pinned events</a>, and so on.</li>
</ul>
<p>When an event is fetched but cannot be positioned relative to other events, it is
considered <em>out-of-band</em>: it belongs to no linked chunk, but we keep it in the
database. Maybe we can attach it to a linked chunk later, or we want to keep it
to save future network requests. Anyway. You're a great digression companion.
Let's jump back to our tables.</p>
<p>The <code>events</code> table contains <em>all</em> the events: in-band <em>and</em> out-of-band.</p>
<pre class="giallo z-code"><code data-lang="sql"><span class="giallo-l"><span class="z-comment">-- Events and their content.</span></span>
<span class="giallo-l"><span class="z-keyword">CREATE TABLE</span><span> "</span><span class="z-entity z-name z-function">events</span><span>" (</span></span>
<span class="giallo-l"><span class="z-comment"> -- The ID of the event.</span></span>
<span class="giallo-l"><span class="z-string"> "event_id"</span><span> BLOB </span><span class="z-keyword">NOT NULL</span><span>,</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> -- The JSON encoded content of the event (it's an encrypted value).</span></span>
<span class="giallo-l"><span class="z-string"> "content"</span><span> BLOB </span><span class="z-keyword">NOT NULL</span><span>,</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> -- … other things …</span></span>
<span class="giallo-l"><span>);</span></span></code></pre>
<p>At some point, we need to fetch metadata about a <code>LinkedChunk</code>. A certain
algorithm needs these metadata to work efficiently. We don't need to load all
events, however we need:</p>
<ul>
<li>to know all the chunks that are part of a linked chunk,</li>
<li>for each chunk, the number of events: 0 in case of a <code>ChunkContent::Gap</code>
(<code>G</code>), or the number of events in case of a <code>ChunkContent::Items</code> (<code>E</code>).</li>
</ul>
<p>A first implementation has landed in the Matrix Rust SDK. All good. When
suddenly…</p>
<h2 id="incredibly-slow-sync"><q cite="https://github.com/element-hq/element-x-ios-rageshakes/issues/4248">Incredibly slow sync</q><a role="presentation" class="anchor" href="#incredibly-slow-sync" title="Anchor link to this header">#</a>
</h2>
<p>A power-user<sup class="footnote-reference" id="fr-power-user-1"><a href="#fn-power-user">1</a></sup> was <a rel="noopener external" target="_blank" href="https://github.com/element-hq/element-x-ios-rageshakes/issues/4248">experiencing slowness</a>. It's always
a delicate situation. How do we find the reason for the slowness? Is it the device?
The network? The asynchronous runtime? A lock contention? The file system? …
The database?</p>
<p>We don't have the device within easy reach. Fortunately, Matrix users are always
nice and willing to help! We added a bunch of logs, then the user
reproduced the problem and shared their logs (via a rageshake) with us. Logs
are never trivial to analyse. However, here is a tip we use in the Matrix Rust
SDK: we have a special tracing type that logs the time spent in a portion of the
code, called <a rel="noopener external" target="_blank" href="https://docs.rs/matrix-sdk-common/0.14.0/matrix_sdk_common/tracing_timer/struct.TracingTimer.html"><code>TracingTimer</code></a>.</p>
<p>Basically, when a <code>TracingTimer</code> is created, it keeps its creation time in
memory. And when the <code>TracingTimer</code> is dropped, it emits a log containing the
elapsed time since its creation. It looks like this (it uses
<a rel="noopener external" target="_blank" href="https://docs.rs/tracing/">the <code>tracing</code> library</a>):</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> struct</span><span class="z-entity z-name"> TracingTimer</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> id</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> String</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> callsite</span><span class="z-keyword z-operator">: &</span><span>'</span><span class="z-entity z-name">static DefaultCallsite</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> start</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Instant</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> level</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> tracing</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Level</span><span>,</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">impl</span><span class="z-entity z-name"> Drop</span><span class="z-keyword"> for</span><span class="z-entity z-name"> TracingTimer</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> fn</span><span class="z-entity z-name z-function"> drop</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-storage">mut</span><span class="z-variable z-language"> self</span><span>) {</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> enabled</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> tracing</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">level_enabled!</span><span>(</span><span class="z-variable z-language">self</span><span class="z-keyword z-operator">.</span><span>level)</span><span class="z-keyword z-operator"> &&</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> interest</span><span class="z-keyword z-operator"> =</span><span class="z-variable z-language"> self</span><span class="z-keyword z-operator">.</span><span>callsite</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">interest</span><span>();</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword z-operator"> !</span><span class="z-variable">interest</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">is_never</span><span>()</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> &&</span><span class="z-entity z-name"> tracing</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">__macro_support</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">__is_enabled</span><span>(</span><span class="z-variable z-language">self</span><span class="z-keyword z-operator">.</span><span>callsite</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">metadata</span><span>(),</span><span class="z-variable"> interest</span><span>)</span></span>
<span class="giallo-l"><span> };</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> if</span><span class="z-keyword z-operator"> !</span><span class="z-variable">enabled</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span>;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> message</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name z-function"> format!</span><span>(</span><span class="z-string">"_{}_ finished in {:?}"</span><span>,</span><span class="z-variable z-language"> self</span><span class="z-keyword z-operator">.</span><span>id,</span><span class="z-variable z-language"> self</span><span class="z-keyword z-operator">.</span><span>start</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">elapsed</span><span>());</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> metadata</span><span class="z-keyword z-operator"> =</span><span class="z-variable z-language"> self</span><span class="z-keyword z-operator">.</span><span>callsite</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">metadata</span><span>();</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> fields</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> metadata</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">fields</span><span>();</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> message_field</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> fields</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">field</span><span>(</span><span class="z-string">"message"</span><span>)</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">unwrap</span><span>();</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> values</span><span class="z-keyword z-operator"> =</span><span> [(</span><span class="z-keyword z-operator">&</span><span class="z-variable">message_field</span><span>,</span><span class="z-entity z-name"> Some</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-variable">message</span><span class="z-keyword"> as</span><span class="z-keyword z-operator"> &</span><span class="z-keyword">dyn</span><span class="z-entity z-name"> tracing</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Value</span><span>))];</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // This function is hidden from docs, but we have to use it</span></span>
<span class="giallo-l"><span class="z-comment"> // because there is no other way of obtaining a `ValueSet`.</span></span>
<span class="giallo-l"><span class="z-comment"> // It's not entirely clear why it is private. See this issue:</span></span>
<span class="giallo-l"><span class="z-comment"> // https://github.com/tokio-rs/tracing/issues/2363</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> values</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> fields</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">value_set</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-variable">values</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name"> tracing</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Event</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">dispatch</span><span>(</span><span class="z-variable">metadata</span><span>,</span><span class="z-keyword z-operator"> &</span><span class="z-variable">values</span><span>);</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>And with that, let's use its companion macro <a rel="noopener external" target="_blank" href="https://docs.rs/matrix-sdk-common/0.14.0/matrix_sdk_common/macro.timer.html"><code>timer!</code></a> (I won't copy-paste it
here, it's pretty straightforward):</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>{</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> _timer</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name z-function"> timer!</span><span>(</span><span class="z-string">"built something important"</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // … build something important …</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // `_timer` is dropped here, and will emit a log.</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>With this technique, we were able to inspect the logs and see immediately what
was slow… assuming we had added <code>timer!</code>s at the right places! It's not magic,
it doesn't find performance issues for you. You have to probe the correct places
in your code, and refine if necessary.</p>
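<p>To make the drop-based idea concrete without pulling in <code>tracing</code>, here is a
minimal sketch that only uses the standard library. <code>ScopeTimer</code> and <code>timed</code>
are hypothetical names, not part of the Matrix Rust SDK, and the report goes
through a closure instead of a <code>tracing</code> event.</p>

```rust
use std::time::{Duration, Instant};

/// Hypothetical, simplified stand-in for `TracingTimer`: it reports through a
/// callback instead of dispatching a `tracing` event.
pub struct ScopeTimer<F: FnMut(&str, Duration)> {
    id: String,
    start: Instant,
    report: F,
}

impl<F: FnMut(&str, Duration)> ScopeTimer<F> {
    /// Remembers the creation time; nothing is reported yet.
    pub fn new(id: impl Into<String>, report: F) -> Self {
        Self { id: id.into(), start: Instant::now(), report }
    }
}

impl<F: FnMut(&str, Duration)> Drop for ScopeTimer<F> {
    fn drop(&mut self) {
        // Runs when the value goes out of scope, whatever the exit path.
        (self.report)(&self.id, self.start.elapsed());
    }
}

/// Runs `work` under a timer and returns its output along with the messages
/// the timer reported.
pub fn timed<T>(id: &str, work: impl FnOnce() -> T) -> (T, Vec<String>) {
    let mut messages = Vec::new();

    let output = {
        let _timer = ScopeTimer::new(id, |id: &str, elapsed: Duration| {
            messages.push(format!("_{}_ finished in {:?}", id, elapsed));
        });

        work()
        // `_timer` is dropped here, which pushes exactly one message.
    };

    (output, messages)
}
```

<p>The binding has to be named <code>_timer</code> rather than <code>_</code>: a bare <code>_</code> pattern
doesn't bind the value, so the timer would be dropped immediately and measure
nothing.</p>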
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>I don't know if you heard about <em>sampling profilers</em>, but those are programs
far superior at analysing performance problems, compared to your… rustic
<code>TracingTimer</code> (pun intended!). Such programs can provide flamegraphs, call
trees etc.</p>
<p>I'm personally a regular user of <a rel="noopener external" target="_blank" href="https://github.com/mstange/samply">samply</a>, a command line CPU profiler relying
on the <a rel="noopener external" target="_blank" href="https://github.com/firefox-devtools/profiler">Firefox profiler</a> for its UI. It works on macOS, Linux and Windows.</p>
</div>
</div>
<p>I do also use <code>samply</code> pretty often! But you need access to the process
to use such tools. Here, the Matrix Rust SDK is used and embedded inside Matrix
clients. We have no access to it. It lives on devices everywhere around the
world. We may use better log analysers to infer “call trees”, but supporting
asynchronous logs (because the code is asynchronous) makes it very difficult.
And I honestly don't know if such a thing exists.</p>
<p>So. Yes. <a rel="noopener external" target="_blank" href="https://github.com/matrix-org/matrix-rust-sdk/pull/5407">We found the culprit</a>. With <a rel="noopener external" target="_blank" href="https://github.com/BurntSushi/ripgrep"><code>ripgrep</code></a>, we were able to scan
megabytes of logs and find the culprit pretty quickly. I was looking for lags of
the order of a second. I wasn't disappointed:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> rg </span><span class="z-string">'_method_ finished in.*> load_all_chunks_metadata'</span><span> all.log </span><span class="z-keyword z-operator">|</span><span class="z-entity z-name"> rg</span><span class="z-string"> '\d+(\.\d+)?s'</span><span class="z-constant z-other"> --only-matching</span><span class="z-keyword z-operator"> |</span><span class="z-entity z-name"> sort</span><span class="z-constant z-other"> --numeric-sort --reverse</span></span>
<span class="giallo-l"><span>107.121747125s</span></span>
<span class="giallo-l"><span>79.909931458s</span></span>
<span class="giallo-l"><span>10.348993583s</span></span>
<span class="giallo-l"><span>8.827636417s</span></span>
<span class="giallo-l"><span>8.614481625s</span></span>
<span class="giallo-l"><span>8.009787875s</span></span>
<span class="giallo-l"><span>5.99637875s</span></span>
<span class="giallo-l"><span>4.118492334s</span></span>
<span class="giallo-l"><span>3.910040333s</span></span>
<span class="giallo-l"><span>3.718858334s</span></span>
<span class="giallo-l"><span>3.689340667s</span></span>
<span class="giallo-l"><span>3.661383208s</span></span></code></pre>
<p>107 seconds. That's 1 minute and 47 seconds. Hello sweety.</p>
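<p>For readers who prefer Rust to shell pipelines, the same extraction can be
sketched as follows. This is only an illustration: <code>seconds_in</code> and
<code>slowest_first</code> are hypothetical helpers, and the sketch assumes each
interesting line contains <code>finished in</code> followed by a duration token, as in
the message emitted by <code>TracingTimer</code>.</p>

```rust
/// Extracts the duration, in seconds, from a log line containing something
/// like `finished in 3.661383208s`. Sub-second values (`ms`, `µs`, `ns`) are
/// skipped, like in the `rg` pattern.
pub fn seconds_in(line: &str) -> Option<f64> {
    let (_, rest) = line.split_once("finished in ")?;
    let token = rest.split_whitespace().next()?;
    let number = token.strip_suffix('s')?;

    // `12ms` becomes `12m` here, which fails to parse and is skipped.
    number.parse::<f64>().ok()
}

/// Collects every duration expressed in seconds, slowest first.
pub fn slowest_first<'a>(lines: impl IntoIterator<Item = &'a str>) -> Vec<f64> {
    let mut seconds: Vec<f64> = lines.into_iter().filter_map(seconds_in).collect();
    seconds.sort_by(|a, b| b.total_cmp(a));

    seconds
}
```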
<h2 id="the-slow-query">The slow query<a role="presentation" class="anchor" href="#the-slow-query" title="Anchor link to this header">#</a>
</h2>
<p><code>load_all_chunks_metadata</code> is a method that runs this SQL query:</p>
<pre class="giallo z-code"><code data-lang="sql"><span class="giallo-l"><span class="z-keyword">SELECT</span></span>
<span class="giallo-l"><span class="z-constant z-other"> lc</span><span>.</span><span class="z-constant z-other">id</span><span>,</span></span>
<span class="giallo-l"><span class="z-constant z-other"> lc</span><span>.</span><span class="z-constant z-other">previous</span><span>,</span></span>
<span class="giallo-l"><span class="z-constant z-other"> lc</span><span>.</span><span class="z-constant z-other">next</span><span>,</span></span>
<span class="giallo-l"><span class="z-support z-function"> COUNT</span><span>(</span><span class="z-constant z-other">ec</span><span>.</span><span class="z-constant z-other">event_id</span><span>) </span><span class="z-keyword">as</span><span> number_of_events</span></span>
<span class="giallo-l"><span class="z-keyword">FROM</span><span> linked_chunks </span><span class="z-keyword">as</span><span> lc</span></span>
<span class="giallo-l"><span class="z-keyword">LEFT JOIN</span><span> event_chunks </span><span class="z-keyword">as</span><span> ec</span></span>
<span class="giallo-l"><span class="z-keyword">ON</span><span class="z-constant z-other"> ec</span><span>.</span><span class="z-constant z-other">chunk_id</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-other"> lc</span><span>.</span><span class="z-constant z-other">id</span></span>
<span class="giallo-l"><span class="z-keyword">WHERE</span><span class="z-constant z-other"> lc</span><span>.</span><span class="z-constant z-other">linked_chunk_id</span><span class="z-keyword z-operator"> =</span><span> ?</span></span>
<span class="giallo-l"><span class="z-keyword">GROUP BY</span><span class="z-constant z-other"> lc</span><span>.</span><span class="z-constant z-other">id</span></span></code></pre>
<p>For each chunk of the linked chunk, it counts the number of events associated with
this chunk. That's it.</p>
<p>Remember that a chunk can be of two kinds: <code>ChunkContent::Items</code> if it
contains a set of events, or <code>ChunkContent::Gap</code> if it contains a gap, so, no
events.</p>
<p>This query does the following:</p>
<ol>
<li>if the chunk is of kind <code>ChunkContent::Items</code>, it counts all events
associated with it (via <code>ec.chunk_id = lc.id</code>),</li>
<li>otherwise, the chunk is of kind <code>ChunkContent::Gap</code>, so it will try to count
but… no event is associated with it: it's impossible to get
<code>ec.chunk_id = lc.id</code> to be true for a gap. This query will scan <em>all events</em>
for each gap… for no reason whatsoever! This is a linear scan here. If there
are 300 gaps for this linked chunk, and 5000 events, 1.5 million events will
be scanned for <strong>no reason</strong>!</li>
</ol>
<p>How lovingly inefficient.</p>
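<p>The arithmetic can be checked with a small in-memory model. It is only a sketch:
it assumes the join runs as a nested loop without any index, as described
above, and <code>Chunk</code> and <code>count_with_full_scans</code> are hypothetical names.</p>

```rust
/// Hypothetical in-memory stand-in for a `linked_chunks` row.
pub struct Chunk {
    pub id: u64,
    pub is_gap: bool,
}

/// Models `LEFT JOIN … ON ec.chunk_id = lc.id` as a nested loop with no index.
/// Returns the number of events per chunk, and the number of `event_chunks`
/// rows that were visited on behalf of gaps.
pub fn count_with_full_scans(chunks: &[Chunk], event_chunk_ids: &[u64]) -> (Vec<usize>, usize) {
    let mut wasted = 0;

    let counts: Vec<usize> = chunks
        .iter()
        .map(|chunk| {
            if chunk.is_gap {
                // Every row is compared, and none can ever match.
                wasted += event_chunk_ids.len();
            }

            event_chunk_ids.iter().filter(|&&chunk_id| chunk_id == chunk.id).count()
        })
        .collect();

    (counts, wasted)
}
```

<p>With 300 gaps and 5000 rows in <code>event_chunk_ids</code>, <code>wasted</code> ends up at
300 × 5000 = 1 500 000.</p>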
<h2 id="12-6x-faster"><math><mn>12.6</mn><mo>×</mo></math> faster<a role="presentation" class="anchor" href="#12-6x-faster" title="Anchor link to this header">#</a>
</h2>
<p><q>Let's use an <a rel="noopener external" target="_blank" href="https://sqlite.org/lang_createindex.html"><code>INDEX</code></a></q> I hear you say (let's pretend
you're saying that, please, for the sake of the narrative!).</p>
<p>A database index provides rapid lookups after all. It has become a reflex
amongst the developer community.</p>
<div class="conversation" data-character="procureur">
<div class="conversation--character">
<span lang="fr">Le Procureur</span>
<picture role="presentation">
<source srcset="/image/procureur.avif" type="image/avif" />
<source srcset="/image/procureur.webp" type="image/webp" />
<img src="/image/procureur.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>Indexes are designed to quickly locate data without scanning the full table. An
index contains a copy of the data, organised in a way enabling very efficient
search. Behind the scenes, it uses various data structures, involving trade-offs
between lookup performance and index size. Most of the time, an index makes it
possible to transform a linear lookup,
<math>
<mi>O</mi><mo>(</mo>
<mi>n</mi>
<mo>)</mo>
</math>,
to a logarithmic lookup,
<math>
<mi>O</mi><mo>(</mo>
<mo lspace="0" rspace="0">log</mo><mo>(</mo>
<mi>n</mi>
<mo>)</mo>
<mo>)</mo>
</math>.
See <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Database_index">Database index</a> to learn more.</p>
</div>
</div>
<p>That's correct. But we didn't want to use an index here. The reason is twofold:</p>
<ol>
<li><strong>More space</strong>. Remember that <em>Le Procureur</em> said an index contains a
<em>copy</em> of the data. Here, the data is <a rel="noopener external" target="_blank" href="https://spec.matrix.org/v1.15/appendices/#event-ids">the event ID</a>. It's not
heavy, but it's not nothing. Moreover, we are not counting the <em>key</em> to
associate the <em>copied data</em> to the row containing the real data in the source
table.</li>
<li><strong>Still extra useless time</strong>. We would still need to traverse the index for
gaps, which is pointless.
<a rel="noopener external" target="_blank" href="https://sqlite.org/arch.html">SQLite implements indexes as B-Trees</a>, which is really
efficient, but still, we already know that a gap has zero events because…
it's… a gap between events!</li>
</ol>
<p>Do you remember that the <code>linked_chunks</code> table has a <code>type</code> column? It contains
<code>E</code> when the chunk is of kind <code>ChunkContent::Items</code> —it represents a set of
events—, and <code>G</code> when of kind <code>ChunkContent::Gap</code> —it represents a gap—. Maybe…
<i> stare into the void</i></p>
<div class="conversation" data-character="factotum">
<div class="conversation--character">
<span lang="fr">Le Factotum</span>
<picture role="presentation">
<source srcset="/image/factotum.avif" type="image/avif" />
<source srcset="/image/factotum.webp" type="image/webp" />
<img src="/image/factotum.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>May I interrupt?</p>
<p>Do you know that SQLite provides <a rel="noopener external" target="_blank" href="https://sqlite.org/lang_expr.html#the_case_expression">a <code>CASE</code> expression</a>? I know it's
unusual. SQL designers prefer to think in terms of sets, sub-sets, joins,
temporal tables, partial indexes… but honestly, as far as I'm concerned, in
our case, it's simple enough and it can be powerful. It's a maddeningly
pragmatic <code>match</code> statement.</p>
<p>Moreover, the <code>type</code> column is already typed as an enum with the <code>CHECK("type" IN ('E', 'G'))</code> constraint. Maybe the SQL engine can run some even smarter
optimisations for us.</p>
</div>
</div>
<p>Oh, that would be brilliant! If <code>type</code> is <code>E</code>, we count the number of events,
otherwise we conclude it's <em>de facto</em> zero, isn't it? Let's try. The SQL query
then becomes:</p>
<pre class="giallo z-code"><code data-lang="sql"><span class="giallo-l"><span class="z-keyword">SELECT</span></span>
<span class="giallo-l"><span class="z-constant z-other"> lc</span><span>.</span><span class="z-constant z-other">id</span><span>,</span></span>
<span class="giallo-l"><span class="z-constant z-other"> lc</span><span>.</span><span class="z-constant z-other">previous</span><span>,</span></span>
<span class="giallo-l"><span class="z-constant z-other"> lc</span><span>.</span><span class="z-constant z-other">next</span><span>,</span></span>
<span class="giallo-l"><span class="z-keyword"> CASE</span><span class="z-constant z-other"> lc</span><span>.</span><span class="z-constant z-other">type</span></span>
<span class="giallo-l"><span class="z-keyword"> WHEN</span><span class="z-string"> 'E'</span><span class="z-keyword"> THEN</span><span> (</span></span>
<span class="giallo-l"><span class="z-keyword"> SELECT</span><span class="z-support z-function"> COUNT</span><span>(</span><span class="z-constant z-other">ec</span><span>.</span><span class="z-constant z-other">event_id</span><span>)</span></span>
<span class="giallo-l"><span class="z-keyword"> FROM</span><span> event_chunks </span><span class="z-keyword">as</span><span> ec</span></span>
<span class="giallo-l"><span class="z-keyword"> WHERE</span><span class="z-constant z-other"> ec</span><span>.</span><span class="z-constant z-other">chunk_id</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-other"> lc</span><span>.</span><span class="z-constant z-other">id</span></span>
<span class="giallo-l"><span> )</span></span>
<span class="giallo-l"><span class="z-keyword"> ELSE</span></span>
<span class="giallo-l"><span class="z-constant z-numeric"> 0</span></span>
<span class="giallo-l"><span class="z-keyword"> END</span></span>
<span class="giallo-l"><span class="z-keyword"> as</span><span> number_of_events</span></span>
<span class="giallo-l"><span class="z-keyword">FROM</span><span> linked_chunks </span><span class="z-keyword">as</span><span> lc</span></span>
<span class="giallo-l"><span class="z-keyword">WHERE</span><span class="z-constant z-other"> lc</span><span>.</span><span class="z-constant z-other">linked_chunk_id</span><span class="z-keyword z-operator"> =</span><span> ?</span></span></code></pre>
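<p>Since <em>Le Factotum</em> compares <code>CASE</code> to a <code>match</code>, here is what the query
computes, modelled in memory with an actual Rust <code>match</code>. This is a sketch, not
the SDK's code: <code>ChunkType</code>, <code>LinkedChunkRow</code> and <code>number_of_events</code> are
hypothetical names, and the slice of chunk IDs stands for the <code>chunk_id</code> column
of <code>event_chunks</code>.</p>

```rust
/// Mirrors the `type` column of `linked_chunks`: `E` or `G`.
#[derive(Clone, Copy)]
pub enum ChunkType {
    Items,
    Gap,
}

/// Hypothetical in-memory stand-in for a `linked_chunks` row.
pub struct LinkedChunkRow {
    pub id: u64,
    pub kind: ChunkType,
}

/// Returns `(chunk ID, number of events)` for each chunk.
pub fn number_of_events(chunks: &[LinkedChunkRow], event_chunk_ids: &[u64]) -> Vec<(u64, usize)> {
    chunks
        .iter()
        .map(|chunk| {
            let count = match chunk.kind {
                // WHEN 'E' THEN (SELECT COUNT(…) … WHERE ec.chunk_id = lc.id)
                ChunkType::Items => {
                    event_chunk_ids.iter().filter(|&&chunk_id| chunk_id == chunk.id).count()
                }
                // ELSE 0: `event_chunks` is not touched at all for a gap.
                ChunkType::Gap => 0,
            };

            (chunk.id, count)
        })
        .collect()
}
```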
<p>Having spotted the problem, we wrote a benchmark to measure the
solutions. The benchmark simulates 10'000 events, with 1 gap every 80 events.
It's a data set we consider reasonably <em>realistic</em> for a normal user (not for a
power-user though, because a power-user usually has more gaps than events). Here
are the before/after results.</p>
<figure>
<table>
<thead>
<tr>
<th></th>
<th title="0.95 confidence level">Lower bound</th>
<th>Estimate</th>
<th title="0.95 confidence level">Upper bound</th>
</tr>
</thead>
<tbody>
<tr>
<td>Throughput</td>
<td>19.832 Kelem/s</td>
<td>19.917 Kelem/s</td>
<td>19.999 Kelem/s</td>
</tr>
<tr>
<td><math><msup><mi>R</mi><mn>2</mn></msup></math></td>
<td>0.0880234</td>
<td>0.1157540</td>
<td>0.0857823</td>
</tr>
<tr>
<td>Mean</td>
<td>500.03 ms</td>
<td>502.08 ms</td>
<td>504.24 ms</td>
</tr>
<tr>
<td title="Standard Deviation">Std. Dev.</td>
<td>2.2740 ms</td>
<td>3.6256 ms</td>
<td>4.1963 ms</td>
</tr>
<tr>
<td>Median</td>
<td>498.23 ms</td>
<td>500.93 ms</td>
<td>506.25 ms</td>
</tr>
<tr>
<td title="Median Absolute Deviation">MAD</td>
<td>129.84 µs</td>
<td>4.1713 ms</td>
<td>6.1184 ms</td>
</tr>
</tbody>
</table>
<figcaption>
<p>Benchmark's results for the original query with <code>COUNT</code> and <code>LEFT JOIN</code>.</p>
</figcaption>
</figure>
<details>
<summary>
<p>The Probability Distribution Function graph, and the Iteration times graph for
the <code>LEFT JOIN</code> approach</p>
</summary>
<figure>
<p><a href="https://mnt.io/articles/from-19k-to-4-2m-events-per-sec-story-of-a-sqlite-query-optimisation/./1-pdf.svg"><img src="https://mnt.io/articles/from-19k-to-4-2m-events-per-sec-story-of-a-sqlite-query-optimisation/./1-pdf.svg" alt="Probability distribution function" loading="lazy" decoding="async" /></a></p>
<figcaption>
<p>Benchmark's Probability Distribution Function for the <code>LEFT JOIN</code> approach.</p>
</figcaption>
</figure>
<figure>
<p><a href="https://mnt.io/articles/from-19k-to-4-2m-events-per-sec-story-of-a-sqlite-query-optimisation/./1-iteration-times.svg"><img src="https://mnt.io/articles/from-19k-to-4-2m-events-per-sec-story-of-a-sqlite-query-optimisation/./1-iteration-times.svg" alt="Iteration times" loading="lazy" decoding="async" /></a></p>
<figcaption>
<p>Benchmark's Iteration Times for the <code>LEFT JOIN</code> approach.</p>
</figcaption>
</figure>
</details>
<figure>
<table>
<thead>
<tr>
<th></th>
<th title="0.95 confidence level">Lower bound</th>
<th>Estimate</th>
<th title="0.95 confidence level">Upper bound</th>
</tr>
</thead>
<tbody>
<tr>
<td>Throughput</td>
<td>251.61 Kelem/s</td>
<td>251.84 Kelem/s</td>
<td>251.98 Kelem/s</td>
</tr>
<tr>
<td><math><msup><mi>R</mi><mn>2</mn></msup></math></td>
<td>0.9999778</td>
<td>0.9999833</td>
<td>0.9999673</td>
</tr>
<tr>
<td>Mean</td>
<td>39.684 ms</td>
<td>39.703 ms</td>
<td>39.726 ms</td>
</tr>
<tr>
<td title="Standard Deviation">Std. Dev.</td>
<td>8.8237 µs</td>
<td>35.948 µs</td>
<td>47.987 µs</td>
</tr>
<tr>
<td>Median</td>
<td>39.683 ms</td>
<td>39.691 ms</td>
<td>39.725 ms</td>
</tr>
<tr>
<td title="Median Absolute Deviation">MAD</td>
<td>1.9369 µs</td>
<td>13.000 µs</td>
<td>50.566 µs</td>
</tr>
</tbody>
</table>
<figcaption>
<p>Benchmark's results for the new query with the <code>CASE</code> expression.</p>
</figcaption>
</figure>
<details>
<summary>
<p>The Probability Distribution Function graph, and the Linear Regression graph
for the <code>CASE</code> approach</p>
</summary>
<figure>
<p><a href="https://mnt.io/articles/from-19k-to-4-2m-events-per-sec-story-of-a-sqlite-query-optimisation/./2-pdf.svg"><img src="https://mnt.io/articles/from-19k-to-4-2m-events-per-sec-story-of-a-sqlite-query-optimisation/./2-pdf.svg" alt="Probability distribution function" loading="lazy" decoding="async" /></a></p>
<figcaption>
<p>Benchmark's Probability Distribution Function for the <code>CASE</code> approach.</p>
</figcaption>
</figure>
<figure>
<p><a href="https://mnt.io/articles/from-19k-to-4-2m-events-per-sec-story-of-a-sqlite-query-optimisation/./2-linear-regression.svg"><img src="https://mnt.io/articles/from-19k-to-4-2m-events-per-sec-story-of-a-sqlite-query-optimisation/./2-linear-regression.svg" alt="Linear regression" loading="lazy" decoding="async" /></a></p>
<figcaption>
<p>Benchmark's Linear Regression for the <code>CASE</code> approach.</p>
</figcaption>
</figure>
</details>
<p>The throughput and the time are <math><mn>12.6</mn><mo>×</mo></math> better. No
<code>INDEX</code>. No more <code>LEFT JOIN</code>. Just a simple <code>CASE</code> expression. <a rel="noopener external" target="_blank" href="https://github.com/matrix-org/matrix-rust-sdk/pull/5411">You can see the
patches containing the benchmark and the fix</a>.</p>
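<p>For the curious, the ratio comes straight from the estimates in the two tables.
A tiny sketch of the arithmetic follows. <code>ratio</code> and <code>throughput</code> are
hypothetical helpers, and the assumption that the benchmark counts 10'000
elements per iteration is mine: it matches the reported throughput for the
original query, but the tables don't state it.</p>

```rust
/// Ratio between a "before" and an "after" measurement.
pub fn ratio(before: f64, after: f64) -> f64 {
    before / after
}

/// Elements processed per second, given the time of one iteration in
/// milliseconds.
pub fn throughput(elements: f64, mean_ms: f64) -> f64 {
    elements / (mean_ms / 1000.0)
}
```

<p>With the estimates above, <code>ratio(502.08, 39.703)</code> (mean time) and
<code>ratio(251.84, 19.917)</code> (throughput) both come out at roughly 12.6.</p>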
<p>But that's not all…</p>
<h2 id=""><math><mn>211</mn><mo>×</mo></math> faster<a role="presentation" class="anchor" href="#" title="Anchor link to this header">#</a>
</h2>
<p>It's clearly better, but we couldn't stop ourselves. Having spotted the problem,
and having found this solution, we got creative! We noticed that
we are running one sub-query per chunk of kind <code>ChunkContent::Items</code>. If the linked
chunk contains 100 such chunks, it will run 101 queries: the main one, plus
one sub-query per chunk.</p>
<p>Then suddenly, <i>hit forehead with the hand's palm</i>, an idea pops! What if
we could use only 2 queries for all scenarios?</p>
<ol>
<li>The first query would count all events for each chunk in <code>event_chunks</code> in
one pass, and would store that in a <code>HashMap</code>,</li>
<li>The second query would fetch all chunks also in one pass,</li>
<li>Finally, Rust will fill the number of events for each chunk based on the data
in the <code>HashMap</code>.</li>
</ol>
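<p>Before looking at the SQL, the plan can be modelled in memory. This is a sketch
under simplifying assumptions: <code>chunks_with_counts</code> is a hypothetical function,
and the two slices stand for the rows each query would return.</p>

```rust
use std::collections::HashMap;

/// Returns `(chunk ID, number of events)` for each chunk, with one pass over
/// the events and one pass over the chunks.
pub fn chunks_with_counts(chunk_ids: &[u64], event_chunk_ids: &[u64]) -> Vec<(u64, usize)> {
    // Step 1: one pass over `event_chunks`, grouped by chunk ID.
    let mut number_of_events_by_chunk_id: HashMap<u64, usize> = HashMap::new();

    for &chunk_id in event_chunk_ids {
        *number_of_events_by_chunk_id.entry(chunk_id).or_insert(0) += 1;
    }

    // Steps 2 and 3: one pass over the chunks. A chunk that is absent from
    // the map (a gap, or an empty chunk) has zero events.
    chunk_ids
        .iter()
        .map(|&id| (id, number_of_events_by_chunk_id.get(&id).copied().unwrap_or(0)))
        .collect()
}
```

<p>Whatever the number of chunks, the events are walked once and the chunks are
walked once, and a lookup in the <code>HashMap</code> is constant time on average.</p>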
<p>The first query translates like so in Rust:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-comment">// The first query.</span></span>
<span class="giallo-l"><span class="z-storage">let</span><span class="z-variable"> number_of_events_by_chunk_ids</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> transaction</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">prepare</span><span>(</span></span>
<span class="giallo-l"><span class="z-string"> r#"</span></span>
<span class="giallo-l"><span class="z-string"> SELECT</span></span>
<span class="giallo-l"><span class="z-string"> ec.chunk_id,</span></span>
<span class="giallo-l"><span class="z-string"> COUNT(ec.event_id)</span></span>
<span class="giallo-l"><span class="z-string"> FROM event_chunks as ec</span></span>
<span class="giallo-l"><span class="z-string"> WHERE ec.linked_chunk_id = ?</span></span>
<span class="giallo-l"><span class="z-string"> GROUP BY ec.chunk_id</span></span>
<span class="giallo-l"><span class="z-string"> "#</span><span>,</span></span>
<span class="giallo-l"><span> )</span><span class="z-keyword z-operator">?</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">query_map</span><span>((</span><span class="z-keyword z-operator">&</span><span class="z-variable">hashed_linked_chunk_id</span><span>,),</span><span class="z-keyword z-operator"> |</span><span class="z-variable">row</span><span class="z-keyword z-operator">|</span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Ok</span><span>((</span></span>
<span class="giallo-l"><span class="z-variable"> row</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">get</span><span class="z-keyword z-operator">::</span><span><</span><span class="z-variable">_</span><span>,</span><span class="z-entity z-name"> u64</span><span>>(</span><span class="z-constant z-numeric">0</span><span>)</span><span class="z-keyword z-operator">?</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> row</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">get</span><span class="z-keyword z-operator">::</span><span><</span><span class="z-variable">_</span><span>,</span><span class="z-entity z-name"> usize</span><span>>(</span><span class="z-constant z-numeric">1</span><span>)</span><span class="z-keyword z-operator">?</span></span>
<span class="giallo-l"><span> ))</span></span>
<span class="giallo-l"><span> })</span><span class="z-keyword z-operator">?</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">collect</span><span class="z-keyword z-operator">::</span><span><</span><span class="z-entity z-name">Result</span><span><</span><span class="z-entity z-name">HashMap</span><span><</span><span class="z-variable">_</span><span>,</span><span class="z-variable"> _</span><span>>,</span><span class="z-variable"> _</span><span>>>()</span><span class="z-keyword z-operator">?</span><span>;</span></span></code></pre>
<p>And the second query translates like so<sup class="footnote-reference" id="fr-simplified-code-1"><a href="#fn-simplified-code">2</a></sup>:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-variable">transaction</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">prepare</span><span>(</span></span>
<span class="giallo-l"><span class="z-string"> r#"</span></span>
<span class="giallo-l"><span class="z-string"> SELECT</span></span>
<span class="giallo-l"><span class="z-string"> lc.id,</span></span>
<span class="giallo-l"><span class="z-string"> lc.previous,</span></span>
<span class="giallo-l"><span class="z-string"> lc.next,</span></span>
<span class="giallo-l"><span class="z-string"> lc.type</span></span>
<span class="giallo-l"><span class="z-string"> FROM linked_chunks as lc</span></span>
<span class="giallo-l"><span class="z-string"> WHERE lc.linked_chunk_id = ?</span></span>
<span class="giallo-l"><span class="z-string"> "#</span><span>,</span></span>
<span class="giallo-l"><span> )</span><span class="z-keyword z-operator">?</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">query_map</span><span>((</span><span class="z-keyword z-operator">&</span><span class="z-variable">hashed_linked_chunk_id</span><span>,),</span><span class="z-keyword z-operator"> |</span><span class="z-variable">row</span><span class="z-keyword z-operator">|</span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Ok</span><span>((</span></span>
<span class="giallo-l"><span class="z-variable"> row</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">get</span><span class="z-keyword z-operator">::</span><span><</span><span class="z-variable">_</span><span>,</span><span class="z-entity z-name"> u64</span><span>>(</span><span class="z-constant z-numeric">0</span><span>)</span><span class="z-keyword z-operator">?</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> row</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">get</span><span class="z-keyword z-operator">::</span><span><</span><span class="z-variable">_</span><span>,</span><span class="z-entity z-name"> Option</span><span><</span><span class="z-entity z-name">u64</span><span>>>(</span><span class="z-constant z-numeric">1</span><span>)</span><span class="z-keyword z-operator">?</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> row</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">get</span><span class="z-keyword z-operator">::</span><span><</span><span class="z-variable">_</span><span>,</span><span class="z-entity z-name"> Option</span><span><</span><span class="z-entity z-name">u64</span><span>>>(</span><span class="z-constant z-numeric">2</span><span>)</span><span class="z-keyword z-operator">?</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> row</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">get</span><span class="z-keyword z-operator">::</span><span><</span><span class="z-variable">_</span><span>,</span><span class="z-entity z-name"> String</span><span>>(</span><span class="z-constant z-numeric">3</span><span>)</span><span class="z-keyword z-operator">?</span><span>,</span></span>
<span class="giallo-l"><span> ))</span></span>
<span class="giallo-l"><span> })</span><span class="z-keyword z-operator">?</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">map</span><span>(</span><span class="z-keyword z-operator">|</span><span class="z-variable">metadata</span><span class="z-keyword z-operator">| -></span><span class="z-entity z-name"> Result</span><span><</span><span class="z-variable">_</span><span>> {</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span> (</span><span class="z-variable">identifier</span><span>,</span><span class="z-variable"> previous</span><span>,</span><span class="z-variable"> next</span><span>,</span><span class="z-variable"> chunk_type</span><span>)</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> metadata</span><span class="z-keyword z-operator">?</span><span>;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Let's use the `HashMap` from the first query here!</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> number_of_events</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> number_of_events_by_chunk_ids</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">get</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-variable">identifier</span><span>)</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">copied</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">unwrap_or</span><span>(</span><span class="z-constant z-numeric">0</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name"> Ok</span><span>(</span><span class="z-entity z-name">ChunkMetadata</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> identifier</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> previous</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> next</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> number_of_events</span><span>,</span></span>
<span class="giallo-l"><span> })</span></span>
<span class="giallo-l"><span> })</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">collect</span><span class="z-keyword z-operator">::</span><span><</span><span class="z-entity z-name">Result</span><span><</span><span class="z-entity z-name">Vec</span><span><</span><span class="z-variable">_</span><span>>,</span><span class="z-variable"> _</span><span>>>()</span></span></code></pre>
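<p>To make the third step concrete, here is a minimal, database-free sketch of the in-memory merge. It assumes the rows of both queries have already been fetched; the <code>merge</code> function and this reduced <code>ChunkMetadata</code> are hypothetical stand-ins for illustration, not the SDK's actual types.</p>

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the SDK's chunk metadata.
#[derive(Debug, PartialEq)]
struct ChunkMetadata {
    identifier: u64,
    previous: Option<u64>,
    next: Option<u64>,
    number_of_events: usize,
}

/// Merge the rows of the two queries in memory.
///
/// `event_counts` mimics the rows of the first query: `(chunk_id, count)`.
/// `chunks` mimics the rows of the second query: `(id, previous, next)`.
fn merge(
    event_counts: &[(u64, usize)],
    chunks: &[(u64, Option<u64>, Option<u64>)],
) -> Vec<ChunkMetadata> {
    // Step 1: index the counts by chunk identifier.
    let number_of_events_by_chunk_ids: HashMap<u64, usize> =
        event_counts.iter().copied().collect();

    // Steps 2 and 3: walk the chunks once and look up each count.
    chunks
        .iter()
        .map(|&(identifier, previous, next)| ChunkMetadata {
            identifier,
            previous,
            next,
            // A chunk without any event has no row in the first query,
            // hence the fallback to 0.
            number_of_events: number_of_events_by_chunk_ids
                .get(&identifier)
                .copied()
                .unwrap_or(0),
        })
        .collect()
}

fn main() {
    let event_counts: &[(u64, usize)] = &[(1, 3), (3, 5)];
    let chunks: &[(u64, Option<u64>, Option<u64>)] =
        &[(1, None, Some(2)), (2, Some(1), Some(3)), (3, Some(2), None)];

    for chunk in merge(event_counts, chunks) {
        println!("{chunk:?}");
    }
}
```

<p>The lookup in the <code>HashMap</code> is what replaces the per-chunk query: each chunk costs one hash lookup instead of one round trip to SQLite.</p>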
<p>Only two queries. All tests are passing. Now let's see what the benchmark has to
say!</p>
<figure>
<table>
<thead>
<tr>
<th></th>
<th title="0.95 confidence level">Lower bound</th>
<th>Estimate</th>
<th title="0.95 confidence level">Upper bound</th>
</tr>
</thead>
<tbody>
<tr>
<td>Throughput</td>
<td>4.1490 Melem/s</td>
<td>4.1860 Melem/s</td>
<td>4.2221 Melem/s</td>
</tr>
<tr>
<td><math><msup><mi>R</mi><mn>2</mn></msup></math></td>
<td>0.9961591</td>
<td>0.9976310</td>
<td>0.9960356</td>
</tr>
<tr>
<td>Mean</td>
<td>2.3670 ms</td>
<td>2.3824 ms</td>
<td>2.3984 ms</td>
</tr>
<tr>
<td title="Standard Deviation">Std. Dev.</td>
<td>16.065 µs</td>
<td>26.872 µs</td>
<td>31.871 µs</td>
</tr>
<tr>
<td>Median</td>
<td>2.3556 ms</td>
<td>2.3801 ms</td>
<td>2.4047 ms</td>
</tr>
<tr>
<td title="Median Absolute Deviation">MAD</td>
<td>3.8003 µs</td>
<td>36.438 µs</td>
<td>46.445 µs</td>
</tr>
</tbody>
</table>
<figcaption>
<p>Benchmark's results for the two queries approach.</p>
</figcaption>
</figure>
<details>
<summary>
<p>The Probability Distribution Function graph, and the Linear Regression graph
for the two queries approach</p>
</summary>
<figure>
<p><a href="https://mnt.io/articles/from-19k-to-4-2m-events-per-sec-story-of-a-sqlite-query-optimisation/./3-pdf.svg"><img src="https://mnt.io/articles/from-19k-to-4-2m-events-per-sec-story-of-a-sqlite-query-optimisation/./3-pdf.svg" alt="Probability distribution function" loading="lazy" decoding="async" /></a></p>
<figcaption>
<p>Benchmark's Probability Distribution Function for the two queries approach.</p>
</figcaption>
</figure>
<figure>
<p><a href="https://mnt.io/articles/from-19k-to-4-2m-events-per-sec-story-of-a-sqlite-query-optimisation/./3-linear-regression.svg"><img src="https://mnt.io/articles/from-19k-to-4-2m-events-per-sec-story-of-a-sqlite-query-optimisation/./3-linear-regression.svg" alt="Linear regression" loading="lazy" decoding="async" /></a></p>
<figcaption>
<p>Benchmark's Linear Regression for the two queries approach.</p>
</figcaption>
</figure>
</details>
<p><strong>It is <math><mn>16.7</mn><mo>×</mo></math> faster compared to the previous
solution, so <math><mn>211</mn><mo>×</mo></math> faster than the first query!
We went from 502ms to 2ms. That's mental! From a throughput of 19.9 Kelem/s
to 4.2 Melem/s!</strong> <a rel="noopener external" target="_blank" href="https://github.com/matrix-org/matrix-rust-sdk/pull/5425">You can see the patches containing the improvement</a>.</p>
<p>The throughput is measured by <em>element</em>, where an <em>element</em> here represents
a Matrix event. Consequently, 4 Melem/s means 4 million events per second,
which means that <code>load_all_chunks_metadata</code> can do its computation at a rate of
4 million events per second.</p>
<p>I think we can stop here. Performance is finally acceptable.</p>
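<p>The figures above can be cross-checked with a few lines of arithmetic. This is only a sketch: the number of events per benchmark iteration (10 000) is an assumption, inferred from the reported throughput multiplied by the reported mean, and the helper names are hypothetical.</p>

```rust
/// Throughput in elements per second.
fn throughput(elements: u64, seconds: f64) -> f64 {
    elements as f64 / seconds
}

/// How many times faster `after_ms` is compared to `before_ms`.
fn speed_up(before_ms: f64, after_ms: f64) -> f64 {
    before_ms / after_ms
}

fn main() {
    // Assumption: one benchmark iteration handles about 10 000 events.
    let events = 10_000;
    // Mean estimate from the table above, in seconds.
    let mean_seconds = 2.3824e-3;

    println!("{:.2} Melem/s", throughput(events, mean_seconds) / 1e6);
    // 502 ms was the mean of the very first query.
    println!("{:.0} times faster", speed_up(502.0, 2.3824));
}
```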
<h2 id="-1">Lessons<a role="presentation" class="anchor" href="#-1" title="Anchor link to this header">#</a>
</h2>
<ul>
<li><a rel="noopener external" target="_blank" href="https://bheisler.github.io/criterion.rs/book/index.html">Write benchmarks (with Criterion)</a>.</li>
<li>Run benchmarks.</li>
<li>Be aware of <a rel="noopener external" target="_blank" href="https://sqlite.org/queryplanner-ng.html">the SQL query planner</a>.</li>
<li>Be careful with joins.</li>
<li>Know your data.</li>
<li>Take a step back and count.</li>
<li>SQLite is fast.</li>
</ul>
<p>Notice how the SQL tables layout didn't change. Notice how the <code>LinkedChunk</code>
implementation didn't change. Only the SQL queries have changed, and it has
dramatically improved the situation.</p>
<p>This is a joint effort between <a rel="noopener external" target="_blank" href="https://bouvier.cc/">Benjamin Bouvier</a>, <a rel="noopener external" target="_blank" href="https://github.com/poljar">Damir Jelić</a> and
me.</p>
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn-power-user">
<p>We consider a <em>power-user</em> to be a user with more than 2000 rooms. I
hear you laugh! But guess what? We have users with more than 4000 rooms.
And I'm excluding bots here. The Matrix Rust SDK can be used to develop
bots, which can easily sit in thousands and thousands of rooms. That said: we
have to be performant. <a href="#fr-power-user-1">↩</a></p>
</li>
<li id="fn-simplified-code">
<p>The code has been simplified a little bit. In reality, basic
Rust types, like <code>u64</code> or <code>Option<u64></code>, are mapped to linked chunk's types. <a href="#fr-simplified-code-1">↩</a></p>
</li>
</ol>
</section>
Sliding Sync at the Matrix Conference2024-10-30T00:00:00+00:002024-10-30T00:00:00+00:00
Unknown
https://mnt.io/articles/sliding-sync-at-the-matrix-conference/<p>Berlin. <time datetime="2024-09-21 10:00">Saturday, September 21, 2024.
10am</time>. I was live on stage, and broadcast on the Internet, to talk about
(Simplified) Sliding Sync, the next sync mechanism for Matrix, at the first
<a rel="noopener external" target="_blank" href="https://2024.matrix.org/">Matrix Conference</a>.</p>
<p><a rel="noopener external" target="_blank" href="https://matrix.org/">Matrix</a> is an open network for secure, decentralised communication. It is an
important technology for the Internet.</p>
<p>Matrix is a protocol. Everyone can implement it: either by providing their own
server and connecting it to the federation, or by providing their own client and
connecting it to the federation too. Nobody has full control over the network,
and nobody controls the clients nor the servers. And yet, end-to-end encryption
is working, synchronisation is working, everybody can talk to everybody,
communities organise themselves, and the network grows and grows.</p>
<p>I have been working at <a rel="noopener external" target="_blank" href="https://element.io/">Element</a> for 2 years now. I am paid to work on the <a rel="noopener external" target="_blank" href="https://github.com/matrix-org/matrix-rust-sdk">Matrix
Rust SDK</a>, a project owned by the Matrix organisation. Everything we do is
available to the entire Matrix community, not only to Element. Well, this is
the open source world.</p>
<p>Matrix's previous synchronisation mechanism is slow and inefficient. To put
Matrix in the hands of everyone for pleasant daily usage, we have started
to experiment with a new sync mechanism, called Sliding Sync.
<a rel="noopener external" target="_blank" href="https://github.com/matrix-org/matrix-spec-proposals/blob/kegan/sync-v3/proposals/3575-sync.md">MSC3575</a> (MSC stands for Matrix Spec Change, similar to an RFC) was
our experimental foundation to play with a new sync mechanism. After much sweat
and many tears, we ultimately found a working pattern and design that fulfil a
large majority of our use cases. Along the way, the implementation inside the
Sliding Sync Proxy (a proxy that sits on top of a homeserver<sup class="footnote-reference" id="fr-1-1"><a href="#fn-1">1</a></sup> to provide
this new sync mechanism) was starting to feel really buggy and was really slow.
It was time to clean up everything, including the MSC.</p>
<p>Enter <a rel="noopener external" target="_blank" href="https://github.com/matrix-org/matrix-spec-proposals/blob/erikj/sss/proposals/4186-simplified-sliding-sync.md">MSC4186</a>, which is basically Simplified Sliding Sync. We have mostly
removed features from <a rel="noopener external" target="_blank" href="https://github.com/matrix-org/matrix-spec-proposals/blob/kegan/sync-v3/proposals/3575-sync.md">MSC3575</a>, so that the implementation on the server-side
is much simpler and lighter. Simplified Sliding Sync is now implemented and
enabled by default on <a rel="noopener external" target="_blank" href="https://github.com/element-hq/synapse/">Synapse</a>, one of the major homeserver implementations.
Other homeservers have implemented MSC3575 and are working on supporting
MSC4186.</p>
<p>Sliding Sync has a huge impact on the overall user experience. Syncing is now
fast and almost transparent. It also works linearly whether the user has 10
or 10'000 rooms.</p>
<p>My talk can be viewed here:</p>
<figure>
<iframe
class="youtube-player"
src="https://www.youtube-nocookie.com/embed/kI2lSCVEunw"
title="Simplified Sliding Sync, by Ivan Enderlin, at the Matrix Conference 2024, Berlin"
frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
referrerpolicy="strict-origin-when-cross-origin"
allowfullscreen
loading="lazy"></iframe>
<figcaption>
<p>Video: Simplified Sliding Sync, by Ivan Enderlin, at the Matrix Conference 2024, Berlin</p>
<p><a href="./slides.pdf">Download the slides as PDF (21MiB)</a></p>
</figcaption>
</figure>
<h2 id="other-talks">Other talks<a role="presentation" class="anchor" href="#other-talks" title="Anchor link to this header">#</a>
</h2>
<p><a rel="noopener external" target="_blank" href="https://2024.matrix.org/watch/">All the talks are available online</a>, including talks from the public
sector, like NATO, Sweden, French or German administrations… I encourage you to
check the list! Nonetheless, I take the opportunity of this article to highlight
some announcement talks, or technical (Matrix internals) talks, I've enjoyed.</p>
<h3 id="matrix-2-0-and-the-launch-of-element-x">Matrix 2.0 and the launch of Element X!<a role="presentation" class="anchor" href="#matrix-2-0-and-the-launch-of-element-x" title="Anchor link to this header">#</a>
</h3>
<p>Two presentations for the price of one. The first is <cite>Matrix 2.0 Is Here!</cite> by
Matthew Hodgson. 10 years after the original launch of Matrix, and 5 years after
Matrix 1.0, what a great anniversary to announce Matrix 2.0.</p>
<figure>
<iframe
class="youtube-player"
src="https://www.youtube-nocookie.com/embed/ZiRYdqkzjDU"
title="Matrix 2.0 Is Here!, by Matthew Hodgson, at the Matrix Conference 2024, Berlin"
frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
referrerpolicy="strict-origin-when-cross-origin"
allowfullscreen
loading="lazy"></iframe>
<figcaption>
<p>Video: Matrix 2.0 Is Here!, by Matthew Hodgson, at the Matrix Conference 2024, Berlin</p>
<p><a rel="noopener external" target="_blank" href="https://2024.matrix.org/documents/talk_slides/LAB3%202024-09-20%2010_15%20Matthew%20-%20Matrix%202.0%20is%20Here_%20The%20Matrix%20Conference%20Keynote.pdf">View and download the slides</a></p>
</figcaption>
</figure>
<p>The second video is <cite>Element X Launch!</cite> by Amandine Le Pape, Ștefan
Ceriu, and Amsha Kalra. They present Element X, how it's been designed,
developed, how it uses the Matrix Rust SDK, and where you can see awesome demos
of Element X with Element Call and so on! It was a great moment for everyone
working at Element and users!</p>
<figure>
<iframe
class="youtube-player"
src="https://www.youtube-nocookie.com/embed/gHyHO3xPfQU"
title="Element X Launch!, by Amandine Le Pape, Ștefan Ceriu, and Amsha Kalra, at the Matrix Conference 2024, Berlin"
frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
referrerpolicy="strict-origin-when-cross-origin"
allowfullscreen
loading="lazy"></iframe>
<figcaption>
<p>Video: Element X Launch!, by Amandine Le Pape, Ștefan Ceriu, and Amsha Kalra, at the Matrix Conference 2024, Berlin</p>
<p><a rel="noopener external" target="_blank" href="https://2024.matrix.org/documents/talk_slides/LAB3%202024-09-20%2017_45%20Amandine%20Le%20Pape,%20Amsha%20Kalra,%20Stefan%20Ceriu%20-%20Element%20X%20Launch%20Complete%20Presentation.pdf">View and download the slides</a></p>
</figcaption>
</figure>
<h3 id="unable-to-decrypt-this-mesage">Unable to decrypt this message<a role="presentation" class="anchor" href="#unable-to-decrypt-this-mesage" title="Anchor link to this header">#</a>
</h3>
<p><cite>Unable to decrypt this message</cite> by Kegan Dougal. This talk explains
why one can see an <em>Unable To Decrypt</em> error while trying to view a message in
Matrix. Most problems have been solved today, but the key message of this
presentation is how hard it is (was!) to provide reliable end-to-end
encryption over a federated network. One homeserver can be overloaded and then
slowed down, or a connection between two servers can be broken, or one device
can lose its connectivity because it's used in the subway, or whatever. All these
classes of problems are illustrated and explained. I liked it a lot because it
gives a good sense of why end-to-end encryption is hard over a giant
decentralised, federated network, with encryption keys being renewed frequently,
and how problems have been solved.</p>
<figure>
<iframe
class="youtube-player"
src="https://www.youtube-nocookie.com/embed/FHzh2Y7BABQ"
title="Unable to decrypt this message, by Kegan Dougal, at the Matrix Conference 2024, Berlin"
frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
referrerpolicy="strict-origin-when-cross-origin"
allowfullscreen
loading="lazy"></iframe>
<figcaption>
<p>Video: Unable to decrypt this message, by Kegan Dougal, at the Matrix Conference 2024, Berlin</p>
<p><a rel="noopener external" target="_blank" href="https://2024.matrix.org/documents/talk_slides/LAB4%202024-09-21%2014_30%20Kegan%20Dougal%20-%20Unable%20to%20decrypt%20this%20message.pdf">View and download the slides</a></p>
</figcaption>
</figure>
<h3 id="news-from-the-matrix-rust-sdk">News from the Matrix Rust SDK<a role="presentation" class="anchor" href="#news-from-the-matrix-rust-sdk" title="Anchor link to this header">#</a>
</h3>
<p><cite>Strengthening the Base: Laying the Groundwork for a more robust Rust
SDK</cite> by Benjamin Bouvier, a good friend! This talk explains the recent
updates of the Matrix Rust SDK: how we have designed new APIs to make the
developer experience easier and more robust.</p>
<figure>
<iframe
class="youtube-player"
src="https://www.youtube-nocookie.com/embed/KOaoZKc1tgo"
title="Strengthening the Base: Laying the Groundwork for a more robust Rust SDK, by Benjamin Bouvier, at the Matrix Conference 2024, Berlin"
frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
referrerpolicy="strict-origin-when-cross-origin"
allowfullscreen
loading="lazy"></iframe>
<figcaption>
<p>Video: Strengthening the Base: Laying the Groundwork for a more robust Rust SDK, by Benjamin Bouvier, at the Matrix Conference 2024, Berlin</p>
<p><a rel="noopener external" target="_blank" href="https://2024.matrix.org/documents/talk_slides/LAB3%202024-09-20%2011_15%20Benjamin%20Bouvier%20-%20Rust%20SDK%20Foundation.pdf">View and download the slides</a></p>
</figcaption>
</figure>
<h2 id="about-transport">About transport<a role="presentation" class="anchor" href="#about-transport" title="Anchor link to this header">#</a>
</h2>
<p>I currently live in Switzerland. The conference was in Germany. Europe has a
fantastic rail network, and more importantly, a unique <strong>night train</strong> network!</p>
<p>Going there by plane would have generated 1'344 <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Global_warming_potential">kg CO<sub>2</sub>eq</a>,
that is 67% of my annual carbon budget (for a sustainable world, we should all be at
2'000 kg maximum). Taking the train, however, has generated
<strong>6.5 kg CO<sub>2</sub>eq, that is 0.33% of my annual carbon budget</strong>. It's 206 times
less than the plane!</p>
<p>If you are curious <a rel="noopener external" target="_blank" href="https://back-on-track.eu/night-train-map/">to try a night train, you can check this map</a>
that lists all possible connections, stops, companies operating the trains,
etc. Taking the night train is a nice way of travelling, and it saves a lot
of emissions.</p>
<p>I've taken a regular day train to go to Berlin, and a night train to come back
home.</p>
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn-1">
<p>A <em>homeserver</em> in the Matrix terminology is simply a Matrix server. <a href="#fr-1-1">↩</a></p>
</li>
</ol>
</section>
Building a new site!2024-10-08T00:00:00+00:002024-10-08T00:00:00+00:00
Unknown
https://mnt.io/articles/building-a-new-site/<p>The time has come. I needed to rewrite my site from scratch. It was first
implemented with <a rel="noopener external" target="_blank" href="https://github.com/jekxyl/jekxyl">Jekxyl</a>, a static site generator written with
<a rel="noopener external" target="_blank" href="https://github.com/hoaproject/Xyl">the XYL language</a>, a language I developed inside <a rel="noopener external" target="_blank" href="https://github.com/hoaproject/Central">Hoa</a>. I migrated
my blog to <a rel="noopener external" target="_blank" href="https://wordpress.com">WordPress.com</a> when
<a href="https://mnt.io/articles/bye-bye-liip-hello-automattic/">I was working there</a>.
The <a rel="noopener external" target="_blank" href="https://github.com/WordPress/gutenberg">Gutenberg editor</a> is really great, but there is no great
support for <code><code></code>. Plus, the theme I was using was pretty heavy. The
homepage was 1.15MiB! A simple article was 1.9MiB. Clearly not really efficient.
I wanted something more customisable, something light, something I can hack, and
more importantly, I wanted to start series.</p>
<h2 id="enter-zola">Enter Zola<a role="presentation" class="anchor" href="#enter-zola" title="Anchor link to this header">#</a>
</h2>
<p><a rel="noopener external" target="_blank" href="https://www.getzola.org/">Zola</a> is a static site generator written in <a rel="noopener external" target="_blank" href="https://www.rust-lang.org/">Rust</a>. It uses <a rel="noopener external" target="_blank" href="https://commonmark.org/">CommonMark</a> for
the markup, which is nice and straightforward to use. The template system is
powerful and simple. Zola can build 34 pages in 392ms at the time of writing, which I
consider fast.</p>
<p>Nothing particular to say. It's a boring tool, which is a great compliment. It
just works! In a couple of hours, I was able to get everything up and running.</p>
<h2 id="site-s-features">Site's features<a role="presentation" class="anchor" href="#site-s-features" title="Anchor link to this header">#</a>
</h2>
<p>The site contains articles and series. A series is composed of several episodes.
That's it. The URL patterns are the following:</p>
<ul>
<li><code>/articles/<article-id>/</code> to view an article,</li>
<li><code>/series/<series-id>/</code> to view all episodes of a series,</li>
<li><code>/series/<series-id>/<episode-id>/</code> to view a particular episode of a series.</li>
</ul>
<h3 id="homepage">Homepage<a role="presentation" class="anchor" href="#homepage" title="Anchor link to this header">#</a>
</h3>
<p>The homepage provides:</p>
<ul>
<li>the latest series, and</li>
<li>pinned articles.</li>
</ul>
<p>To <em>pin</em> an article, I add the following TOML declarations in the frontmatter of
an article:</p>
<pre class="giallo z-code"><code data-lang="toml"><span class="giallo-l"><span>[</span><span class="z-entity z-name">extra</span><span>]</span></span>
<span class="giallo-l"><span class="z-variable">pinned</span><span class="z-punctuation z-separator"> =</span><span class="z-constant z-language"> true</span></span></code></pre>
<p>This <code>pinned</code> declaration is not recognised by Zola: the <code>[extra]</code> section
contains user-defined values. Then, it's a matter of filtering by this value in
the template:</p>
<pre class="giallo z-code"><code data-lang="html"><span class="giallo-l"><span>{% for page in section.pages | filter(attribute = "extra.pinned", value = true) -%}</span></span></code></pre>
<p>That's a nice feature to promote some articles.</p>
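<p>For readers more at ease with Rust than with templates, the filter above behaves roughly like the sketch below. The <code>Page</code> struct and the <code>pinned_pages</code> function are hypothetical, simplified models and not Zola's own types; the sketch also assumes that pages without a <code>pinned</code> value are simply discarded by the filter.</p>

```rust
/// Hypothetical, simplified model of a page: only what the filter needs.
#[derive(Debug)]
struct Page {
    title: &'static str,
    /// Mirrors `extra.pinned`; `None` when the frontmatter does not set it.
    pinned: Option<bool>,
}

/// Keep only the pages whose `pinned` value is exactly `true`.
fn pinned_pages(pages: &[Page]) -> Vec<&Page> {
    pages
        .iter()
        .filter(|page| page.pinned == Some(true))
        .collect()
}

fn main() {
    let pages = [
        Page { title: "Building a new site!", pinned: Some(true) },
        Page { title: "An unpinned article", pinned: None },
        Page { title: "An explicitly unpinned article", pinned: Some(false) },
    ];

    for page in pinned_pages(&pages) {
        println!("{}", page.title);
    }
}
```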
<p>In comparison to WordPress.com, the new homepage is 36.8KiB, that's about 32 times
smaller!</p>
<h3 id="articles">Articles<a role="presentation" class="anchor" href="#articles" title="Anchor link to this header">#</a>
</h3>
<p>An article has some metadata like:</p>
<ul>
<li>the publishing time,</li>
<li>the reading time,</li>
<li>keywords,</li>
<li>edition.</li>
</ul>
<p>If you read this article in October 2024, you might see all that in this very
article. The beauty of this hides in the source code though:</p>
<pre class="giallo z-code"><code data-lang="html"><span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"><</span><span class="z-entity z-name z-tag">main</span><span class="z-entity z-other z-attribute-name"> vocab</span><span class="z-punctuation z-separator">=</span><span class="z-string">"https://schema.org"</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> <</span><span class="z-entity z-name z-tag">article</span><span class="z-entity z-other z-attribute-name"> class</span><span class="z-punctuation z-separator">=</span><span class="z-string">"article"</span><span class="z-entity z-other z-attribute-name"> typeof</span><span class="z-punctuation z-separator">=</span><span class="z-string">"Article"</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> <</span><span class="z-entity z-name z-tag">header</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> <</span><span class="z-entity z-name z-tag">h1</span><span class="z-entity z-other z-attribute-name"> property</span><span class="z-punctuation z-separator">=</span><span class="z-string">"name"</span><span class="z-punctuation z-definition z-tag z-end">></span><span>Building a new site!</span><span class="z-punctuation z-definition z-tag z-begin"></</span><span class="z-entity z-name z-tag">h1</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> <</span><span class="z-entity z-name z-tag">div</span><span class="z-entity z-other z-attribute-name"> class</span><span class="z-punctuation z-separator">=</span><span class="z-string">"metadata"</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> <</span><span class="z-entity z-name z-tag">time</span><span class="z-entity z-other z-attribute-name"> title</span><span class="z-punctuation z-separator">=</span><span class="z-string">"Published date"</span><span class="z-entity z-other z-attribute-name"> datetime</span><span class="z-punctuation z-separator">=</span><span class="z-string">"2024-10-08"</span><span class="z-entity z-other z-attribute-name"> property</span><span class="z-punctuation z-separator">=</span><span class="z-string">"datePublished"</span><span class="z-punctuation z-definition z-tag z-end">></span><span>October 08, 2024</span><span class="z-punctuation z-definition z-tag z-begin"></</span><span class="z-entity z-name z-tag">time</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> <</span><span class="z-entity z-name z-tag">span</span><span class="z-entity z-other z-attribute-name"> title</span><span class="z-punctuation z-separator">=</span><span class="z-string">"Reading time"</span><span class="z-entity z-other z-attribute-name"> property</span><span class="z-punctuation z-separator">=</span><span class="z-string">"timeRequired"</span><span class="z-entity z-other z-attribute-name"> content</span><span class="z-punctuation z-separator">=</span><span class="z-string">"PT2M"</span><span class="z-punctuation z-definition z-tag z-end">></span><span>2 minutes read</span><span class="z-punctuation z-definition z-tag z-begin"></</span><span class="z-entity z-name z-tag">span</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> <</span><span class="z-entity z-name z-tag">span</span><span class="z-entity z-other z-attribute-name"> title</span><span class="z-punctuation z-separator">=</span><span class="z-string">"Keywords"</span><span class="z-entity z-other z-attribute-name"> property</span><span class="z-punctuation z-separator">=</span><span class="z-string">"keywords"</span><span class="z-entity z-other z-attribute-name"> content</span><span class="z-punctuation z-separator">=</span><span class="z-string">"rust, site"</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"><span> Keywords:</span><span class="z-constant z-character">&nbsp;</span><span class="z-punctuation z-definition z-tag z-begin"><</span><span class="z-entity z-name z-tag">a</span><span class="z-entity z-other z-attribute-name"> href</span><span class="z-punctuation z-separator">=</span><span class="z-string">"/keywords/rust"</span><span class="z-punctuation z-definition z-tag z-end">></span><span>rust</span><span class="z-punctuation z-definition z-tag z-begin"></</span><span class="z-entity z-name z-tag">a</span><span class="z-punctuation z-definition z-tag z-end">></span><span>, </span><span class="z-punctuation z-definition z-tag z-begin"><</span><span class="z-entity z-name z-tag">a</span><span class="z-entity z-other z-attribute-name"> href</span><span class="z-punctuation z-separator">=</span><span class="z-string">"/keywords/site"</span><span class="z-punctuation z-definition z-tag z-end">></span><span>site</span><span class="z-punctuation z-definition z-tag z-begin"></</span><span class="z-entity z-name z-tag">a</span><span class="z-punctuation z-definition z-tag z-end">></span><span>, </span><span class="z-punctuation z-definition z-tag z-begin"><</span><span class="z-entity z-name z-tag">a</span><span class="z-entity z-other z-attribute-name"> href</span><span class="z-punctuation z-separator">=</span><span class="z-string">"/keywords/rdfa"</span><span class="z-punctuation z-definition z-tag z-end">></span><span>rdfa</span><span class="z-punctuation z-definition z-tag z-begin"></</span><span class="z-entity z-name z-tag">a</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> </</span><span class="z-entity z-name z-tag">span</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> <</span><span class="z-entity z-name z-tag">span</span><span class="z-punctuation z-definition z-tag z-end z-punctuation z-definition z-tag z-begin">><</span><span class="z-entity z-name z-tag">a</span><span class="z-entity z-other z-attribute-name"> href</span><span class="z-punctuation z-separator">=</span><span class="z-string">"https://github.com/Hywan/mnt.io/edit/main/content/articles</span><span class="z-constant z-character">&#x2F;</span><span class="z-string">2024-10-08-building-a-new-site</span><span class="z-constant z-character">&#x2F;</span><span class="z-string">index.md"</span><span class="z-entity z-other z-attribute-name"> title</span><span class="z-punctuation z-separator">=</span><span class="z-string">"Submit a patch for this page"</span><span class="z-punctuation z-definition z-tag z-end">></span><span>Edit</span><span class="z-punctuation z-definition z-tag z-begin"></</span><span class="z-entity z-name z-tag">a</span><span class="z-punctuation z-definition z-tag z-end">></span><span> this page</span><span class="z-punctuation z-definition z-tag z-begin"></</span><span class="z-entity z-name z-tag">span</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> <</span><span class="z-entity z-name z-tag">meta</span><span class="z-entity z-other z-attribute-name"> property</span><span class="z-punctuation z-separator">=</span><span class="z-string">"description"</span><span class="z-entity z-other z-attribute-name"> content</span><span class="z-punctuation z-separator">=</span><span class="z-string">"…"</span><span class="z-punctuation z-definition z-tag z-end"> /></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> </</span><span class="z-entity z-name z-tag">div</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> </</span><span class="z-entity z-name z-tag">header</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> <!-- … --></span></span></code></pre>
<p>First off, the site uses semantic HTML as much as possible with <code>article</code>,
<code>header</code>, <code>time</code>, <code>meta</code> etc. Second, you may also notice the <code>vocab</code>, <code>typeof</code>,
<code>property</code> and <code>content</code> attributes. They come from an extension to HTML for <a rel="noopener external" target="_blank" href="https://www.w3.org/TR/rdfa-primer/">Resource
Description Framework in Attributes</a> (RDFa for short). It is commonly used to
add more semantic data to your content. It helps automated tools analyse the
content of a Web document, and make sense of it. <a rel="noopener external" target="_blank" href="https://schema.org/">schema.org</a> is a
collaborative effort to create schemas for structured data, and that's what I
use on this site for the moment.</p>
<p>Last neat thing: did you notice you can edit the page? The code lives on GitHub,
and everyone is free to submit a patch!</p>
<h3 id="series">Series<a role="presentation" class="anchor" href="#series" title="Anchor link to this header">#</a>
</h3>
<p>A series is pretty similar to an article, except that it adds another level of
indirection with episodes.</p>
<p>Similarly to articles with <code>pinned</code>, a series has its own metadata:</p>
<pre class="giallo z-code"><code data-lang="toml"><span class="giallo-l"><span>[</span><span class="z-entity z-name">extra</span><span>]</span></span>
<span class="giallo-l"><span class="z-variable">complete</span><span class="z-punctuation z-separator"> =</span><span class="z-constant z-language"> true</span></span></code></pre>
<p><code>complete</code> indicates whether the series is complete or in progress.</p>
<p>A series also has buttons to navigate to the previous or the next episode.
Nothing fancy, but it's fun to be able to do all that with Zola.</p>
<p>The hierarchy is intuitive to understand, and it uses RDFa heavily too. For
example, a series overview with all its episodes looks like this:</p>
<pre class="giallo z-code"><code data-lang="html"><span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"><</span><span class="z-entity z-name z-tag">main</span><span class="z-entity z-other z-attribute-name"> vocab</span><span class="z-punctuation z-separator">=</span><span class="z-string">"https://schema.org"</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> <</span><span class="z-entity z-name z-tag">section</span><span class="z-entity z-other z-attribute-name"> typeof</span><span class="z-punctuation z-separator">=</span><span class="z-string">"CreativeWorkSeries"</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> <</span><span class="z-entity z-name z-tag">h1</span><span class="z-entity z-other z-attribute-name"> property</span><span class="z-punctuation z-separator">=</span><span class="z-string">"name"</span><span class="z-punctuation z-definition z-tag z-end">></span><span>From Rust to beyond</span><span class="z-punctuation z-definition z-tag z-begin"></</span><span class="z-entity z-name z-tag">h1</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> <!-- … --></span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> <</span><span class="z-entity z-name z-tag">h2</span><span class="z-punctuation z-definition z-tag z-end">></span><span>Episodes</span><span class="z-punctuation z-definition z-tag z-begin"></</span><span class="z-entity z-name z-tag">h2</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> <</span><span class="z-entity z-name z-tag">div</span><span class="z-entity z-other z-attribute-name"> role</span><span class="z-punctuation z-separator">=</span><span class="z-string">"list"</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> <</span><span class="z-entity z-name z-tag">div</span><span class="z-entity z-other z-attribute-name"> role</span><span class="z-punctuation z-separator">=</span><span class="z-string">"listitem"</span><span class="z-entity z-other z-attribute-name"> class</span><span class="z-punctuation z-separator">=</span><span class="z-string">"article-poster"</span><span class="z-entity z-other z-attribute-name"> property</span><span class="z-punctuation z-separator">=</span><span class="z-string">"hasPart"</span><span class="z-entity z-other z-attribute-name"> typeof</span><span class="z-punctuation z-separator">=</span><span class="z-string">"Article"</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> <</span><span class="z-entity z-name z-tag">a</span><span class="z-entity z-other z-attribute-name"> href</span><span class="z-punctuation z-separator">=</span><span class="z-string">"/series/from-rust-to-beyond/prelude/"</span><span class="z-entity z-other z-attribute-name"> property</span><span class="z-punctuation z-separator">=</span><span class="z-string">"url"</span><span class="z-punctuation z-definition z-tag z-end">></span><span>Episode 1 – </span><span class="z-punctuation z-definition z-tag z-begin"><</span><span class="z-entity z-name z-tag">span</span><span class="z-entity z-other z-attribute-name"> property</span><span class="z-punctuation z-separator">=</span><span class="z-string">"name"</span><span class="z-punctuation z-definition z-tag z-end">></span><span>Prelude</span><span class="z-punctuation z-definition z-tag z-begin"></</span><span class="z-entity z-name z-tag">span</span><span class="z-punctuation z-definition z-tag z-end z-punctuation z-definition z-tag z-begin">></</span><span class="z-entity z-name z-tag">a</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> <</span><span class="z-entity z-name z-tag">div</span><span class="z-entity z-other z-attribute-name"> class</span><span class="z-punctuation z-separator">=</span><span class="z-string">"metadata"</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"><span class="z-comment"> <!-- … --></span><span> </span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> </</span><span class="z-entity z-name z-tag">div</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> </</span><span class="z-entity z-name z-tag">div</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> <</span><span class="z-entity z-name z-tag">div</span><span class="z-entity z-other z-attribute-name"> role</span><span class="z-punctuation z-separator">=</span><span class="z-string">"listitem"</span><span class="z-entity z-other z-attribute-name"> class</span><span class="z-punctuation z-separator">=</span><span class="z-string">"article-poster"</span><span class="z-entity z-other z-attribute-name"> property</span><span class="z-punctuation z-separator">=</span><span class="z-string">"hasPart"</span><span class="z-entity z-other z-attribute-name"> typeof</span><span class="z-punctuation z-separator">=</span><span class="z-string">"Article"</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> <</span><span class="z-entity z-name z-tag">a</span><span class="z-entity z-other z-attribute-name"> href</span><span class="z-punctuation z-separator">=</span><span class="z-string">"/series/from-rust-to-beyond/the-webassembly-galaxy/"</span><span class="z-entity z-other z-attribute-name"> property</span><span class="z-punctuation z-separator">=</span><span class="z-string">"url"</span><span class="z-punctuation z-definition z-tag z-end">></span><span>Episode 2 – </span><span class="z-punctuation z-definition z-tag z-begin"><</span><span class="z-entity z-name z-tag">span</span><span class="z-entity z-other z-attribute-name"> property</span><span class="z-punctuation z-separator">=</span><span class="z-string">"name"</span><span class="z-punctuation z-definition z-tag z-end">></span><span>The WebAssembly galaxy</span><span class="z-punctuation z-definition z-tag z-begin"></</span><span class="z-entity z-name z-tag">span</span><span class="z-punctuation z-definition z-tag z-end z-punctuation z-definition z-tag z-begin">></</span><span class="z-entity z-name z-tag">a</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> <</span><span class="z-entity z-name z-tag">div</span><span class="z-entity z-other z-attribute-name"> class</span><span class="z-punctuation z-separator">=</span><span class="z-string">"metadata"</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"><span class="z-comment"> <!-- … --></span><span> </span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag z-begin"> </</span><span class="z-entity z-name z-tag">div</span><span class="z-punctuation z-definition z-tag z-end">></span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> <!-- … --></span></span></code></pre>
<p>First off, we use the <code>role</code> HTML attribute to change the semantics
of some elements: here, a <code>div</code> becomes the equivalent of a <code>ul</code> or a <code>li</code>. Second, we use
<code>typeof="CreativeWorkSeries"</code> to describe a series, which is composed
of different parts: <code>property="hasPart"</code>. Each part is an article:
<code>typeof="Article"</code>, which has its own semantics: <code>property="name"</code> etc. The
markup is extremely simple, but it contains all the required information. HTML is
really powerful, I'm not going to lie!</p>
<h2 id="discuss">Discuss<a role="presentation" class="anchor" href="#discuss" title="Anchor link to this header">#</a>
</h2>
<p>One novelty is the <em>Discuss</em> menu item at the top of the site. It contains a
link to a <a rel="noopener external" target="_blank" href="https://matrix.org/">Matrix</a> public room: <a rel="noopener external" target="_blank" href="https://matrix.to/#/#mnt_io:matrix.org">https://matrix.to/#/#mnt_io:matrix.org</a>, where
anybody can come to talk about an article, a series, ask questions, or simply
chill. You're very welcome there!</p>
<h2 id="good-ol-web">Good ol' Web<a role="presentation" class="anchor" href="#good-ol-web" title="Anchor link to this header">#</a>
</h2>
<p>The site has a short CSS stylesheet written by hand with no framework (oh yeah).
It weighs 11KiB (uncompressed), heavy, I know.</p>
<p>The site also has <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/RSS">RSS</a> and <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Atom_(web_standard)">Atom</a> feeds for syndication. It even has a
blogroll under the <em>Recommandations</em> Section in the footer. Well, you get it,
I'm nostalgic for the old Web. It's absolutely incredible what is possible
to achieve today with HTML and CSS without any frameworks or polyfills. So many
resources are wasted nowadays…</p>
<h2 id="personnages"><q>Personnages</q><a role="presentation" class="anchor" href="#personnages" title="Anchor link to this header">#</a>
</h2>
<p>The biggest novelty is <a href="https://mnt.io/lore/">the lore</a> I've developed for this new version
of the site. Please, welcome 3 characters: <em>Le Comte</em>, <em>Le Factotum</em>, and <em>Le
Procureur</em>.</p>
<p>These characters will help to explain non-trivial concepts by interacting with
me. Let me copy the lore here.</p>
<h3 id="le-comte">Le Comte<a role="presentation" class="anchor" href="#le-comte" title="Anchor link to this header">#</a>
</h3>
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>My name is <em>Le Comte</em>. I enjoy being the main character of this story. I am
mostly here to learn, and to interrogate our dear author.</p>
<p>My resources are unlimited. I am fortunate to have a fortune with a secret
origin. If I want to understand something, I will work as hard as possible to
try to shed light on it. My new caprice is these new modern things people are
calling <em>computers</em>. They seem really powerful and I want to learn everything
about them!</p>
<p>I often ask my Factotum for help with the dirty, and sometimes illegal, tasks. I
rarely ask Le Procureur for help; we don't really appreciate his presence.</p>
</div>
</div>
<h3 id="le-factotum">Le Factotum<a role="presentation" class="anchor" href="#le-factotum" title="Anchor link to this header">#</a>
</h3>
<div class="conversation" data-character="factotum">
<div class="conversation--character">
<span lang="fr">Le Factotum</span>
<picture role="presentation">
<source srcset="/image/factotum.avif" type="image/avif" />
<source srcset="/image/factotum.webp" type="image/webp" />
<img src="/image/factotum.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>My name is <em>Le Factotum</em>. It's a Latin word, literally meaning “do everything”.
I'm here to assist Le Comte in his fancies.</p>
<p>Even if I have an uneventful life now, Le Comte is partly aware of my smuggling
past. He says not a word about it, but he knows I kept in contact with old friends
across various countries and cultures. These relations are useful to Le Comte to
achieve his quest to learn everything about computers.</p>
<p>Fundamentally, when Le Comte wants to do something manky, he asks me the best
way to do that. And I always have a solution.</p>
</div>
</div>
<h3 id="le-procureur">Le Procureur<a role="presentation" class="anchor" href="#le-procureur" title="Anchor link to this header">#</a>
</h3>
<div class="conversation" data-character="procureur">
<div class="conversation--character">
<span lang="fr">Le Procureur</span>
<picture role="presentation">
<source srcset="/image/procureur.avif" type="image/avif" />
<source srcset="/image/procureur.webp" type="image/webp" />
<img src="/image/procureur.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>My name is <em>Le Procureur</em>. I am the son of the Law and the Order. I know what
is legal, and what is illegal. If a piece of information is missing, a detail, a
precision, I know where to find the answer.</p>
<p>Some people believe I am irritating, but I consider myself the defender of
discipline.</p>
</div>
</div>
<h2 id="optimised-for-smallness-speed-semantics-and-fun">Optimised for smallness, speed, semantics and fun!<a role="presentation" class="anchor" href="#optimised-for-smallness-speed-semantics-and-fun" title="Anchor link to this header">#</a>
</h2>
<p>At this point, it should be clear the site has been optimised for smallness,
speed and semantics. Even the fonts aren't custom: I use <a rel="noopener external" target="_blank" href="https://modernfontstacks.com/">Modern Font Stacks</a>
to find a font stack that works on most computers. Two devices may not have the
same look and feel for this site and that's perfectly fine. That's the nature of
the Web.</p>
<p>I encourage you <a rel="noopener external" target="_blank" href="https://github.com/Hywan/mnt.io">to read the source code of this site</a>, to fork it, to
play with it, to get inspired by it. It's important to own your content, and to
not give your work to other platforms.</p>
<p>I really hope you'll enjoy the content I'm preparing. You can start with the
first episode of the new series:
<a href="https://mnt.io/series/reactive-programming-in-rust/observability/">Reactive programming in Rust, Observability</a>.
See you there!</p>
Observability2024-09-19T00:00:00+00:002024-09-19T00:00:00+00:00
Unknown
https://mnt.io/series/reactive-programming-in-rust/observability/<p>Imagine a collection of values <code>T</code>. This collection can be updated by inserting
new values, removing existing ones, or the collection can be truncated, cleared…
This collection acts like <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/vec/index.html">the standard <code>Vec</code></a>. However, there is a
subtlety: this collection is <em>observable</em>. It is possible for someone to
<em>subscribe</em> to this collection and to receive its updates.</p>
<p>This observability pattern is the basis of reactive programming. It applies to
any kind of type. Actually, it can be generalised as a single <code>Observable<T></code>
type. For collections though, we will see that an <code>ObservableVector<T></code> type is
more efficient.</p>
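<p>To make the idea a bit more concrete, here is a minimal sketch using only the
standard library. The <code>Observable</code> and <code>Subscriber</code> types below are hypothetical
stand-ins written for illustration, under the assumption that sharing a value
behind a lock is enough to show the owner/watcher split. It is not how any real
library implements the pattern, and it does not notify subscribers of changes
yet: it only lets them read the latest value.</p>

```rust
use std::sync::{Arc, RwLock};

// Hypothetical types for illustration only (not a real library API).

/// Owns the value and is the only handle able to change it.
struct Observable<T> {
    shared: Arc<RwLock<T>>,
}

/// Read-only handle that always sees the latest value.
struct Subscriber<T> {
    shared: Arc<RwLock<T>>,
}

impl<T: Clone> Observable<T> {
    fn new(value: T) -> Self {
        Self { shared: Arc::new(RwLock::new(value)) }
    }

    /// Create a new handle pointing at the same shared value.
    fn subscribe(&self) -> Subscriber<T> {
        Subscriber { shared: Arc::clone(&self.shared) }
    }

    /// Return a copy of the current value.
    fn get(&self) -> T {
        self.shared.read().unwrap().clone()
    }

    /// Replace the current value; every subscriber sees the new one.
    fn set(&mut self, value: T) {
        *self.shared.write().unwrap() = value;
    }
}

impl<T: Clone> Subscriber<T> {
    /// Return a copy of the current value.
    fn get(&self) -> T {
        self.shared.read().unwrap().clone()
    }
}

fn main() {
    let mut observable = Observable::new(String::from("hello"));
    let subscriber = observable.subscribe();

    // Both handles read the same initial value.
    assert_eq!(subscriber.get(), "hello");

    // Only the observable can write; the subscriber sees the change.
    observable.set(String::from("world"));
    assert_eq!(observable.get(), "world");
    assert_eq!(subscriber.get(), "world");
}
```

<p>The missing piece in this sketch is the reactive part: a subscriber has to ask
for the value, it is never told that something changed.</p>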
<p>I’ve recently played a lot with this pattern as part of my work inside the
<a rel="noopener external" target="_blank" href="https://github.com/matrix-org/matrix-rust-sdk">Matrix Rust SDK</a>, a set of Rust libraries that help develop robust
<a rel="noopener external" target="_blank" href="https://matrix.org/">Matrix</a> clients or bridges. It is notably used by the next generation
Matrix client developed by <a rel="noopener external" target="_blank" href="https://element.io/">Element</a>, namely <a rel="noopener external" target="_blank" href="https://element.io/labs/element-x">Element X</a>. The Matrix Rust SDK is
cross-platform. Element X has two implementations: on iOS, iPadOS and macOS with
Swift, and on Android with Kotlin. Both implementations use our Rust bindings
to <a rel="noopener external" target="_blank" href="https://www.swift.org/">Swift</a> and <a rel="noopener external" target="_blank" href="https://kotlinlang.org/">Kotlin</a>. This is a story for another series (how we have
automated this, how we support asynchronous flows from Rust to foreign languages
etc.), but for the moment, let’s stay focused on reactive programming.</p>
<p>Taking the Element X use case, the room list –which is the central piece of the
app– is fully dynamic:</p>
<ul>
<li>Rooms are sorted by recency, so rooms move to the top when a new interesting
message is received,</li>
<li>The list can be filtered by room properties (one can filter by group or
people, favourites, unreads, invites…),</li>
<li>The list is also searchable by room names.</li>
</ul>
<p>The rooms exposed by the room list are stored in a single <em>observable</em> type.
Why is it dynamic? Because the app continuously syncs new data that updates the
internal state: when a room gets an update from the network, the room list is
automatically updated. The beauty of it: we have nothing to do. Sorters and
filters are run automatically. Why? Spoiler: because everything is a <code>Stream</code>.</p>
<p>Thanks to the Rust async model, every part is lazy. The app never needs to ask
Rust whether a new update is present. It literally just waits for updates.</p>
<p>I believe this reactive programming approach is pretty interesting to explore.
And this is precisely the goal of this series. We are going to play with
<code>Stream</code> a lot, with higher-order <code>Stream</code> a lot more, and w…</p>
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>Hold on a second! I believe this first step is a bit steep for someone who's not
familiar with asynchronous code in Rust, don't you think?</p>
<p>Before digging into the implementation details you are obviously eager to share,
maybe we can start with examples.</p>
</div>
</div>
<p>Alrighty. Fair. Before digging into the really fun bits, we need some basics.</p>
<h2 id="baby-steps-with-reactive-programming">Baby steps with reactive programming<a role="presentation" class="anchor" href="#baby-steps-with-reactive-programming" title="Anchor link to this header">#</a>
</h2>
<p>Everything we are going to share with you has been implemented in <a rel="noopener external" target="_blank" href="https://docs.rs/eyeball">a library
called <code>eyeball</code></a>. To give you a good idea of what reactive
programming in Rust can look like, let's create a Rust program:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cargo new --bin playground</span></span>
<span class="giallo-l"><span> Creating binary (application) `playground` package</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cd playground</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cargo add eyeball</span></span>
<span class="giallo-l"><span> Updating crates.io index</span></span>
<span class="giallo-l"><span> Adding eyeball v0.8.8 to dependencies</span></span>
<span class="giallo-l"><span> Features:</span></span>
<span class="giallo-l"><span> - __bench</span></span>
<span class="giallo-l"><span> - async-lock</span></span>
<span class="giallo-l"><span> - tracing</span></span>
<span class="giallo-l"><span> Updating crates.io index</span></span>
<span class="giallo-l"><span> Locking 3 packages to latest compatible versions</span></span></code></pre><pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-comment">// in `src/main.rs`</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> eyeball</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Observable</span><span>;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">fn</span><span class="z-entity z-name z-function"> main</span><span>() {</span></span>
<span class="giallo-l"><span class="z-storage"> let mut</span><span class="z-variable"> observable</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> Observable</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">new</span><span>(</span><span class="z-constant z-numeric">7</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> subscriber</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> Observable</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">subscribe</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-variable">observable</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> dbg!</span><span>(</span><span class="z-entity z-name">Observable</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">get</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-variable">observable</span><span>));</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> dbg!</span><span>(</span><span class="z-variable">subscriber</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">get</span><span>());</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name"> Observable</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">set</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-storage">mut</span><span class="z-variable"> observable</span><span>,</span><span class="z-constant z-numeric"> 13</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> dbg!</span><span>(</span><span class="z-entity z-name">Observable</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">get</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-variable">observable</span><span>));</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> dbg!</span><span>(</span><span class="z-variable">subscriber</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">get</span><span>());</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>What do we see here? First off, <code>observable</code> is an observable value. The proof:
it is possible to subscribe to it, see <code>subscriber</code>. Both <code>observable</code> and
<code>subscriber</code> see the same initial value: 7. When <code>observable</code> receives a
new value, 13, both <code>observable</code> and <code>subscriber</code> see the updated value. Let's take it for a spin:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cargo run --quiet</span></span>
<span class="giallo-l"><span>[src/main.rs:8:5] Observable::get(&observable) = 7</span></span>
<span class="giallo-l"><span>[src/main.rs:9:5] subscriber.get() = 7</span></span>
<span class="giallo-l"><span>[src/main.rs:13:5] Observable::get(&observable) = 13</span></span>
<span class="giallo-l"><span>[src/main.rs:14:5] subscriber.get() = 13</span></span></code></pre>
<p>Tadaa. Fantastic, isn't it?</p>
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>I… I am… speechless? Is it <em>really</em> reactive programming? Where is the
reactivity here? It seems like you've only shared a value between an <em>owner</em> and
a <em>watcher</em>. You're calling them <em>observable</em> and <em>subscriber</em>, alright, but how
is this thing reactive? I only see synchronous code for the moment.</p>
</div>
</div>
<p>Hold on. You told me to start slow. You're right though: the <code>Observable</code> owns
the value. The <code>Subscriber</code> is able to read the value from the <code>Observable</code>.
However, <code>Subscriber::next</code> returns a <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/future/trait.Future.html"><code>Future</code></a>! Let's add this:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-comment">// in `src/main.rs`</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// …</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">fn</span><span class="z-entity z-name z-function"> main</span><span>() {</span></span>
<span class="giallo-l"><span class="z-comment"> // …</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> dbg!</span><span>(</span><span class="z-variable">subscriber</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">next</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>);</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cargo run --quiet</span></span>
<span class="giallo-l"><span>error[E0728]: `await` is only allowed inside `async` functions and blocks</span></span>
<span class="giallo-l"><span> --> src/main.rs:16:28</span></span>
<span class="giallo-l"><span> |</span></span>
<span class="giallo-l"><span>3 | fn main() {</span></span>
<span class="giallo-l"><span> | --------- this is not `async`</span></span>
<span class="giallo-l"><span>...</span></span>
<span class="giallo-l"><span>16 | dbg!(subscriber.next().await);</span></span>
<span class="giallo-l"><span> | ^^^^^ only allowed inside `async` functions and blocks</span></span></code></pre>
<p>Indeed. Almighty <code>rustc</code> is correct. The <code>main</code> function is not <code>async</code>. We need
an asynchronous runtime. Let's use <a rel="noopener external" target="_blank" href="https://docs.rs/smol">the <code>smol</code> project</a>, a small, fast and
well-written async runtime that I enjoy a lot:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cargo add smol</span></span>
<span class="giallo-l"><span> Updating crates.io index</span></span>
<span class="giallo-l"><span> Adding smol v2.0.2 to dependencies</span></span>
<span class="giallo-l"><span> [ … snip … ]</span></span></code></pre>
<p>Now let's modify our <code>main</code> function a little bit:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-comment">// in `src/main.rs`</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> eyeball</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Observable</span><span>;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">fn</span><span class="z-entity z-name z-function"> main</span><span>() {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> smol</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">block_on</span><span>(</span><span class="z-keyword">async</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> let mut</span><span class="z-variable"> observable</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> Observable</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">new</span><span>(</span><span class="z-constant z-numeric">7</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> let mut</span><span class="z-variable"> subscriber</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> Observable</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">subscribe</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-variable">observable</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> dbg!</span><span>(</span><span class="z-entity z-name">Observable</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">get</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-variable">observable</span><span>));</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> dbg!</span><span>(</span><span class="z-variable">subscriber</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">get</span><span>());</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name"> Observable</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">set</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-storage">mut</span><span class="z-variable"> observable</span><span>,</span><span class="z-constant z-numeric"> 13</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> dbg!</span><span>(</span><span class="z-entity z-name">Observable</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">get</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-variable">observable</span><span>));</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> dbg!</span><span>(</span><span class="z-variable">subscriber</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">get</span><span>());</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> dbg!</span><span>(</span><span class="z-variable">subscriber</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">next</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>);</span></span>
<span class="giallo-l"><span> })</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Please <code>rustc</code>, be nice…</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span>[src/main.rs:9:9] Observable::get(&observable) = 7</span></span>
<span class="giallo-l"><span>[src/main.rs:10:9] subscriber.get() = 7</span></span>
<span class="giallo-l"><span>[src/main.rs:14:9] Observable::get(&observable) = 13</span></span>
<span class="giallo-l"><span>[src/main.rs:15:9] subscriber.get() = 13</span></span>
<span class="giallo-l"><span>[src/main.rs:17:9] subscriber.next().await = Some(</span></span>
<span class="giallo-l"><span> 13,</span></span>
<span class="giallo-l"><span>)</span></span></code></pre>
<p>Hurray!</p>
<p>We can make this a bit more ergonomic with <a rel="noopener external" target="_blank" href="https://docs.rs/smol-macros">the <code>smol-macros</code>
crate</a>, which sets up a default <a rel="noopener external" target="_blank" href="https://docs.rs/smol/2.0.2/smol/struct.Executor.html">async runtime
<code>Executor</code></a> for us. It's useful in our case because we want to play
with something else (reactive programming) and don't want to focus on the async
runtime itself:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cargo add smol-macros macro_rules_attribute</span></span>
<span class="giallo-l"><span> Updating crates.io index</span></span>
<span class="giallo-l"><span> Adding smol-macros v0.1.1 to dependencies</span></span>
<span class="giallo-l"><span> Adding macro_rules_attribute v0.2.0 to dependencies</span></span>
<span class="giallo-l"><span> Features:</span></span>
<span class="giallo-l"><span> - better-docs</span></span>
<span class="giallo-l"><span> - verbose-expansions</span></span>
<span class="giallo-l"><span> Updating crates.io index</span></span>
<span class="giallo-l"><span> Locking 4 packages to latest compatible versions</span></span>
<span class="giallo-l"><span> Adding macro_rules_attribute v0.2.0</span></span>
<span class="giallo-l"><span> Adding macro_rules_attribute-proc_macro v0.2.0</span></span>
<span class="giallo-l"><span> Adding paste v1.0.15</span></span>
<span class="giallo-l"><span> Adding smol-macros v0.1.1</span></span></code></pre>
<p>We will take the opportunity to improve our program a little bit. Let's spawn a
<code>Future</code> that will continuously read new updates from the <code>subscriber</code>.</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> std</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">time</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Duration</span><span>;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> eyeball</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Observable</span><span>;</span></span>
<span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> macro_rules_attribute</span><span class="z-keyword z-operator">::</span><span>apply;</span></span>
<span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> smol</span><span class="z-keyword z-operator">::</span><span>{</span><span class="z-entity z-name">Executor</span><span>,</span><span class="z-entity z-name"> Timer</span><span>};</span></span>
<span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> smol_macros</span><span class="z-keyword z-operator">::</span><span>main;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>#[apply(main</span><span class="z-keyword z-operator">!</span><span>)]</span></span>
<span class="giallo-l"><span class="z-keyword">async fn</span><span class="z-entity z-name z-function"> main</span><span>(</span><span class="z-variable">executor</span><span class="z-keyword z-operator">: &</span><span class="z-entity z-name">Executor</span><span>) {</span></span>
<span class="giallo-l"><span class="z-storage"> let mut</span><span class="z-variable"> observable</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> Observable</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">new</span><span>(</span><span class="z-constant z-numeric">7</span><span>);</span></span>
<span class="giallo-l"><span class="z-storage"> let mut</span><span class="z-variable"> subscriber</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> Observable</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">subscribe</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-variable">observable</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Task that reads new updates from `observable`.</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> task</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> executor</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">spawn</span><span>(</span><span class="z-keyword">async move</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> while</span><span class="z-storage"> let</span><span class="z-entity z-name"> Some</span><span>(</span><span class="z-variable">new_value</span><span>)</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> subscriber</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">next</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> dbg!</span><span>(</span><span class="z-variable">new_value</span><span>);</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> });</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Now, let's update `observable`.</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Observable</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">set</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-storage">mut</span><span class="z-variable"> observable</span><span>,</span><span class="z-constant z-numeric"> 13</span><span>);</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Timer</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">after</span><span>(</span><span class="z-entity z-name">Duration</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">from_secs</span><span>(</span><span class="z-constant z-numeric">1</span><span>))</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name"> Observable</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">set</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-storage">mut</span><span class="z-variable"> observable</span><span>,</span><span class="z-constant z-numeric"> 17</span><span>);</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Timer</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">after</span><span>(</span><span class="z-entity z-name">Duration</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">from_secs</span><span>(</span><span class="z-constant z-numeric">1</span><span>))</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name"> Observable</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">set</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-storage">mut</span><span class="z-variable"> observable</span><span>,</span><span class="z-constant z-numeric"> 23</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Wait on the task.</span></span>
<span class="giallo-l"><span class="z-variable"> task</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>;</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>The little <code>Timer::after</code> calls are here to pretend the values are coming from
random events, for the moment. Let's run it again to see if we get the same
result:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cargo run --quiet</span></span>
<span class="giallo-l"><span>[src/main.rs:16:13] new_value = 13</span></span>
<span class="giallo-l"><span>[src/main.rs:16:13] new_value = 17</span></span>
<span class="giallo-l"><span>[src/main.rs:16:13] new_value = 23</span></span>
<span class="giallo-l"><span>^C</span></span></code></pre>
<p>Here we go, perfect! See, ah ha! It's async and nice now.</p>
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>I believe I am starting to appreciate it. However, I foresee you might be hiding
something behind these <code>Timer::after</code> calls. Am I right?</p>
<p>And this <code>task.await</code> at the end makes the program never finish. It explains
the need to send <a rel="noopener external" target="_blank" href="https://man.freebsd.org/cgi/man.cgi?query=signal">a <code>SIGINT</code> signal</a> to the program to interrupt it,
right?</p>
</div>
</div>
<p>You're slick. Indeed, I wanted to focus on the <code>observable</code> and the
<code>subscriber</code>. Because there is a subtlety here. If the <code>Timer::after</code> calls are
removed, only the last update will be displayed on the output by <code>dbg!</code>.
And that's perfectly normal. The async runtime will execute all the
<code>Observable::set(&mut observable, new_value)</code> calls in a row, and then, once there
is an await point, another task will have room to run. In this case, that's
<code>subscriber.next().await</code>.</p>
<p>The subscriber only receives the <strong>last</strong> update, and that's pretty important
to understand. There is no buffer of all the previous updates here, no memory,
no trace: <code>subscriber</code> returns the latest value at the time it is polled. Note that this
is not always the case, as we will see with <code>ObservableVector</code> later, but for the
moment, that's the case.</p>
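<p>To make the "last value only" behaviour tangible, here is a minimal,
synchronous sketch built on the standard library only. The names <code>LatestOnly</code>,
<code>Watcher</code> and <code>poll_update</code> are made up for the illustration, and this is
certainly not how <code>eyeball</code> is implemented internally. It only assumes one thing:
a <code>set</code> overwrites the previous value instead of queueing it.</p>

```rust
use std::sync::{Arc, Mutex};

/// Shared state: the current value and a version bumped on every `set`.
struct State<T> {
    value: T,
    version: u64,
}

/// Hypothetical, simplified owner of the value (not eyeball's real type).
struct LatestOnly<T> {
    state: Arc<Mutex<State<T>>>,
}

/// Hypothetical watcher: it remembers the last version it has seen.
struct Watcher<T> {
    state: Arc<Mutex<State<T>>>,
    seen: u64,
}

impl<T: Clone> LatestOnly<T> {
    fn new(value: T) -> Self {
        Self { state: Arc::new(Mutex::new(State { value, version: 0 })) }
    }

    fn subscribe(&self) -> Watcher<T> {
        let seen = self.state.lock().unwrap().version;
        Watcher { state: Arc::clone(&self.state), seen }
    }

    /// Overwrites the value: the previous one is gone, nothing is queued.
    fn set(&mut self, value: T) {
        let mut state = self.state.lock().unwrap();
        state.value = value;
        state.version += 1;
    }
}

impl<T: Clone> Watcher<T> {
    /// Returns a clone of the current value if it changed since the last call.
    fn poll_update(&mut self) -> Option<T> {
        let state = self.state.lock().unwrap();
        if state.version == self.seen {
            return None;
        }
        self.seen = state.version;
        Some(state.value.clone())
    }
}

/// Two `set` calls happen before the watcher looks again: `17` is never seen.
fn missed_update_demo() -> Vec<i32> {
    let mut observable = LatestOnly::new(7);
    let mut watcher = observable.subscribe();
    let mut seen = Vec::new();

    observable.set(13);
    seen.extend(watcher.poll_update()); // observes 13

    observable.set(17);
    observable.set(23);
    seen.extend(watcher.poll_update()); // observes 23; 17 was overwritten
    seen.extend(watcher.poll_update()); // nothing new, adds nothing

    seen
}
```

<p>Calling <code>missed_update_demo()</code> returns <code>[13, 23]</code>: the intermediate value
is lost because nobody looked between the two writes.</p>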
<p>And yes, if we want the <code>task</code> to get a chance to consume more updates, we need
to tell the executor that we are willing to wait while the other tasks are woken up. To
do that, we can use <a rel="noopener external" target="_blank" href="https://docs.rs/smol/2.0.2/smol/future/fn.yield_now.html">the <code>smol::future::yield_now</code> function</a>:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-comment"> // Now, let's update `observable`.</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Observable</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">set</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-storage">mut</span><span class="z-variable"> observable</span><span>,</span><span class="z-constant z-numeric"> 13</span><span>);</span></span>
<span class="giallo-l"><span class="z-comment"> // Eh `executor`: `task` can run now, we will wait!</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> yield_now</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // More updates.</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Observable</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">set</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-storage">mut</span><span class="z-variable"> observable</span><span>,</span><span class="z-constant z-numeric"> 17</span><span>);</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Observable</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">set</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-storage">mut</span><span class="z-variable"> observable</span><span>,</span><span class="z-constant z-numeric"> 23</span><span>);</span></span>
<span class="giallo-l"><span class="z-comment"> // Eh `executor`: _bis repetita placent_!</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> yield_now</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> drop</span><span>(</span><span class="z-variable">observable</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable"> task</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>;</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Let's see what happens:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cargo run --quiet</span></span>
<span class="giallo-l"><span>[src/main.rs:14:13] new_value = 13</span></span>
<span class="giallo-l"><span>[src/main.rs:14:13] new_value = 23</span></span></code></pre>
<p>Eh, see, <code>new_value = 17</code> is <strong>not</strong> displayed: both <code>set</code> calls run before the
executor gives the <code>subscriber</code> task a chance to run, so <code>17</code> is already overwritten
by <code>23</code> when the task wakes up. But the others are read, good good.</p>
<p>Note that we are dropping the <code>observable</code>. Once it's dropped, the <code>subscriber</code>
won't be able to read any value from it, so it's going to close itself, and the
<code>task</code> will end. That's why waiting on the task with <code>task.await</code> will terminate
this time. And thus, the program will finish gracefully.</p>
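<p>If this "dropping the producer ends the consumer loop" shape feels familiar,
that's because the standard library channels behave in a comparable way. The
sketch below is only an analogy, with a made-up function name
(<code>drain_until_closed</code>): unlike our <code>Observable</code>, an <code>mpsc</code> channel buffers
<em>every</em> value, and it uses a thread instead of an async task.</p>

```rust
use std::sync::mpsc;
use std::thread;

/// Collects every value until the sending side is dropped.
fn drain_until_closed() -> Vec<i32> {
    let (sender, receiver) = mpsc::channel();

    // The consumer plays the role of `task`: it loops until the stream ends.
    let task = thread::spawn(move || {
        let mut received = Vec::new();
        // `recv` returns `Err` once the sender is gone and the queue is empty.
        while let Ok(value) = receiver.recv() {
            received.push(value);
        }
        received
    });

    sender.send(13).unwrap();
    sender.send(23).unwrap();

    // Without this `drop`, `recv` would wait forever and `join` would hang.
    drop(sender);

    task.join().unwrap()
}
```

<p>Remove the <code>drop(sender)</code> and the function never returns, which is the
thread flavoured version of our <code>^C</code> from earlier.</p>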
<p>And that's it. That's the basis of reactive programming. Also note that
<code>Subscriber<T></code> implements <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/marker/trait.Send.html"><code>Send</code></a> and <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/marker/trait.Sync.html"><code>Sync</code></a> if <code>T</code> implements <code>Send</code> and
<code>Sync</code>, i.e. if the observed type implements these traits. That's pretty useful
actually: it is possible to send the subscriber to a different thread, and keep
waiting for new updates.</p>
<h2 id="attack-of-the-clones">Attack of the Clones<a role="presentation" class="anchor" href="#attack-of-the-clones" title="Anchor link to this header">#</a>
</h2>
<p>However, at the beginning of this episode, we were talking about a collection.
Let's focus on <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/vec/index.html"><code>Vec</code></a>.</p>
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>Why do we focus on <code>Vec</code> <em>only</em>? Why not <code>HashMap</code>, <code>HashSet</code>, <code>BTreeSet</code>,
<code>BTreeMap</code>, <code>BinaryHeap</code>, <code>LinkedList</code> or even <code>VecDeque</code>? It seems a bit
non-inclusive if you ask me. Are you aware there isn't only <code>Vec</code> in life?</p>
</div>
</div>
<p>Well, the reason is simple: <code>Vec</code> is supported by <code>eyeball</code>. Supporting the other
collections is a matter of time and work. It's definitely not impossible, but
you will see that it's not trivial either, for
a simple reason: did you notice that <code>Subscriber</code> produces an owned <code>T</code>? Not a
<code>&T</code>, but a <code>T</code>. That's because
<a rel="noopener external" target="_blank" href="https://docs.rs/eyeball/0.8.8/eyeball/struct.Subscriber.html#method.next-1"><code>Subscriber::next</code></a> requires <code>T: Clone</code>. It means
that the observed value will be cloned every time it is broadcast to a
subscriber.</p>
<p><a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/clone/trait.Clone.html">Cloning a value</a> may be expensive. Here we are manipulating plain integers,
which are primitive types, so it's all fine (it boils down to a <a rel="noopener external" target="_blank" href="https://en.cppreference.com/w/c/string/byte/memcpy"><code>memcpy</code></a>).
But imagine an <code>Observable<Vec<BigType>></code> where <code>BigType</code> is 512 bytes: the
memory impact is going to be quickly noticeable. So th…</p>
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>… Excuse my interruption! You know how I love reading books. I like
defining myself as a bibliophile. Anyway. During my perusal of the <code>eyeball</code>
documentation, I have found
<a rel="noopener external" target="_blank" href="https://docs.rs/eyeball/0.8.8/eyeball/struct.Subscriber.html#method.next_ref-1"><code>Subscriber::next_ref</code></a>. The documentation
says:</p>
<blockquote>
<p>Wait for an update and get a read lock for the updated value.</p>
</blockquote>
<p>and later:</p>
<blockquote>
<p>You can use this method to get updates of an <code>Observable</code> where the inner type
does not implement <code>Clone</code>.</p>
</blockquote>
</div>
</div>
<p>Can you stop cutting me off please? It's really unpleasant. And do not forget we
are not alone… <i>doing sideways head movement</i></p>
<p>You're right though. There is <code>Subscriber::next_ref</code>. However, if you are such
an <em>assiduous reader</em>, you may have read the end of the documentation, haven't
you?</p>
<blockquote>
<p>However, the <code>Observable</code> will be locked (not updateable) while any read guards
are alive.</p>
</blockquote>
<p>Blocking the <code>Observable</code> might be tolerable in some cases, but it cannot be
generalised to all use cases. A user is more likely to prefer <code>next</code> instead of
<code>next_ref</code> by default.</p>
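<p>The trade-off can be reproduced with a plain <code>std::sync::RwLock</code>. This is a
sketch under the assumption that the guard returned by <code>next_ref</code> behaves like
a read guard, as its documentation suggests; the function name
<code>write_attempts</code> is invented for the example.</p>

```rust
use std::sync::RwLock;

/// Returns whether a write was possible (while reading, after reading).
fn write_attempts() -> (bool, bool) {
    let value = RwLock::new(vec![1, 2, 3]);

    // Comparable to holding the read guard handed out to a subscriber.
    let read_guard = value.read().unwrap();
    let first = read_guard[0];

    // `try_write` never blocks: it simply fails while a read guard is alive.
    let while_reading = value.try_write().is_ok();

    drop(read_guard);

    // Once the guard is gone, the writer can proceed.
    let after_reading = match value.try_write() {
        Ok(mut guard) => {
            guard.push(first + 3);
            true
        }
        Err(_) => false,
    };

    (while_reading, after_reading)
}
```

<p><code>write_attempts()</code> returns <code>(false, true)</code>: as long as one reader holds
its guard, the owner cannot update the value. A slow subscriber would then slow
down everybody.</p>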
<p>Back to our <code>Observable<Vec<BigType>></code> then. Imagine the collection contains a
lot of items: cloning the entire <code>Vec<_></code> for every update to every subscriber
is a pretty inefficient way of programming. Remember that, as programmers, we
have the responsibility to make our programs use as few resources as possible,
so that hardware can be used longer. Hardware is the most polluting segment
of our digital world.</p>
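<p>A quick back-of-the-envelope sketch helps to see how fast it adds up.
<code>BigType</code> is the 512 bytes stand-in from the text; the helper names
(<code>bytes_copied</code>, <code>clone_is_deep</code>) and the figures used below are hypothetical,
picked only for the illustration.</p>

```rust
use std::mem::size_of;

/// Stand-in for the `BigType` of the text: 512 bytes, copied on every clone.
#[derive(Clone)]
struct BigType([u8; 512]);

/// Bytes copied when every subscriber receives its own clone of the `Vec`
/// for every update (element payload only, allocator overhead ignored).
fn bytes_copied(items: usize, subscribers: usize, updates: usize) -> usize {
    items * size_of::<BigType>() * subscribers * updates
}

/// Clones a real `Vec<BigType>` to show that the elements are duplicated.
fn clone_is_deep(items: usize) -> bool {
    let original = vec![BigType([0; 512]); items];
    let copy = original.clone();

    // Same content, but two distinct heap buffers.
    let same_content = copy.iter().zip(&original).all(|(a, b)| a.0 == b.0);
    same_content && copy.len() == original.len() && copy.as_ptr() != original.as_ptr()
}
```

<p>With 10 000 items, 3 subscribers and 100 updates, <code>bytes_copied</code> gives
1 536 000 000 bytes, so roughly 1.5 GB moved around for nothing.</p>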
<p>So. How can a data structure like <code>Vec</code> be cloned cheaply? We could put
it inside an <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/sync/struct.Arc.html"><code>Arc</code></a>, right? Cloning an <em>Atomically Reference Counted</em> value
is really cheap: <a rel="noopener external" target="_blank" href="https://github.com/rust-lang/rust/blob/f6bcd094abe174a218f7cf406e75521be4199f88/library/alloc/src/sync.rs#L2118-L2170">it increases the counter by 1 atomically</a>,
the inner value is untouched. Nonetheless, we have a mutation problem now.
If we have <code>Observable<Arc<Vec<_>>></code>, it means that the subscribers will be
<code>Subscriber<Arc<Vec<_>>></code>. In this case, every time the observable wants to
mutate the data, it is going to… be… impossible because an <code>Arc</code> is nothing
more than a shared reference, and shared references in Rust disallow mutation by
default. Using <code>Observable::set</code> will create a new <code>Arc</code>, but we cannot update
the value <em>inside</em> the <code>Arc</code>, except if we use a lock… Well, we are adding more
and more complexity.</p>
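<p>Both halves of the problem fit in a few lines of standard library code. The
function name <code>arc_clone_and_mutation</code> is made up; the <code>Arc</code> methods are the
real ones.</p>

```rust
use std::sync::Arc;

/// Returns (same allocation?, strong count, mutable access granted?).
fn arc_clone_and_mutation() -> (bool, usize, bool) {
    let mut owner: Arc<Vec<u32>> = Arc::new(vec![1, 2, 3]);

    // Cheap: only the reference counter is incremented, the `Vec` is untouched.
    let subscriber_copy = Arc::clone(&owner);

    let same_allocation = Arc::ptr_eq(&owner, &subscriber_copy);
    let strong_count = Arc::strong_count(&owner);

    // `get_mut` only succeeds when there is exactly one owner, and
    // `subscriber_copy` is still alive here.
    let can_mutate = Arc::get_mut(&mut owner).is_some();

    (same_allocation, strong_count, can_mutate)
}
```

<p>The result is <code>(true, 2, false)</code>: the clone is free, but as soon as a
subscriber holds a copy, the owner loses the right to mutate in place.</p>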
<p><q lang="la">Spes salutis</q><sup class="footnote-reference" id="fr-spes_salutis-1"><a href="#fn-spes_salutis">1</a></sup>! Fortunately for us, <em>immutable
data structures</em> exist in Rust.</p>
<blockquote>
<p>An immutable data structure is a data structure which can be copied and modified
efficiently without altering the original.</p>
</blockquote>
<p>It can be modified. However, as soon as it is copied (or cloned), it is still
possible to modify the copy but the original data is not modified. That's
extremely powerful.</p>
<p>Such structures bring many advantages, but one of them is <em>structural sharing</em>:</p>
<blockquote>
<p>If two data structures are mostly copies of each other, most of the memory
they take up will be shared between them. This implies that making copies of
an immutable data structure is cheap: it's really only a matter of copying
a pointer and increasing a reference counter, where in the case of <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/vec/index.html"><code>Vec</code></a> you
have to allocate the same amount of memory all over again and make a copy of
every element it contains. For immutable data structures, extra memory isn't
allocated until you modify either the copy or the original, and then only the
memory needed to record the difference.</p>
</blockquote>
<p>Well, <i>taking a deep breath</i>, it sounds exactly like what we
need to solve our issue, doesn't it? The <code>Observable<Immutable<_>></code> and the
<code>Subscriber<Immutable<_>></code>s will share the same value, with the observable
being able to mutate its inner value. The subscribers can modify the received
value too, in an efficient way, without conflicting with the value from the
observable. Both values will continue to live on their side, but cloning the
value is cheap.</p>
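<p>To get an intuition of structural sharing, here is a deliberately naive
sketch: a vector cut into chunks, each chunk behind an <code>Arc</code>, with
copy-on-write on the touched chunk only. <code>ChunkedVec</code> and its methods are
invented for this example and it assumes <code>chunk_len</code> is greater than zero.
Real immutable vectors rely on much smarter tree structures, so take it as a
mental model, nothing more.</p>

```rust
use std::sync::Arc;

/// Naive chunked vector: each chunk is shared until someone writes to it.
#[derive(Clone)]
struct ChunkedVec<T> {
    chunks: Vec<Arc<Vec<T>>>,
    chunk_len: usize,
}

impl<T: Clone> ChunkedVec<T> {
    /// Assumes `chunk_len > 0`.
    fn from_slice(items: &[T], chunk_len: usize) -> Self {
        let chunks = items
            .chunks(chunk_len)
            .map(|chunk| Arc::new(chunk.to_vec()))
            .collect();
        Self { chunks, chunk_len }
    }

    fn get(&self, index: usize) -> Option<&T> {
        self.chunks.get(index / self.chunk_len)?.get(index % self.chunk_len)
    }

    /// Copy-on-write: `Arc::make_mut` clones the chunk only if it is shared.
    /// Panics if `index` is out of bounds, like `Vec` indexing.
    fn set(&mut self, index: usize, value: T) {
        let chunk = Arc::make_mut(&mut self.chunks[index / self.chunk_len]);
        chunk[index % self.chunk_len] = value;
    }

    /// Number of chunks physically shared with `other`.
    fn shared_chunks(&self, other: &Self) -> usize {
        self.chunks
            .iter()
            .zip(&other.chunks)
            .filter(|&(a, b)| Arc::ptr_eq(a, b))
            .count()
    }
}

/// Returns (shared before, shared after, original[5], copy[5]).
fn sharing_demo() -> (usize, usize, Option<u32>, Option<u32>) {
    let items: Vec<u32> = (0..16).collect();
    let original = ChunkedVec::from_slice(&items, 4);

    // Cloning copies 4 pointers, not 16 elements.
    let mut copy = original.clone();
    let shared_before = copy.shared_chunks(&original);

    // Only the chunk holding index 5 is duplicated.
    copy.set(5, 99);
    let shared_after = copy.shared_chunks(&original);

    (shared_before, shared_after, original.get(5).copied(), copy.get(5).copied())
}
```

<p><code>sharing_demo()</code> returns <code>(4, 3, Some(5), Some(99))</code>: after one write,
three chunks out of four are still shared, and the original is untouched.</p>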
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>Dare I ask how immutable data structures are implemented? It sounds like complex
beasts.</p>
<p>I mean… a naive implementation sounds <em>relatively doable</em> but I am guessing there
are a lot of subtleties, possible conflicts, and many memory guarantees that I am
not anticipating yet, right?</p>
</div>
</div>
<p>Oh… <q lang="la">beati pauperes in spiritu</q><sup class="footnote-reference" id="fr-beati_pauperes_in_spiritu-1"><a href="#fn-beati_pauperes_in_spiritu">2</a></sup>… it
is actually really complex. It may be a topic for another series of articles.
For the moment, if you are interested, let me redirect you to one research paper
that proposes an immutable <code>Vec</code>: <cite>RRB Vector: A Practical General
Purpose Immutable Sequence</cite><sup class="footnote-reference" id="fr-SRUB2015-1"><a href="#fn-SRUB2015">3</a></sup>. Be cool though, understanding this part is
not necessary at all for what we are talking about now. It's a great tool we are going
to use, no matter how it works internally.</p>
<p>Do you know the other good news? We don't have to implement it ourselves,
because some nice people already did it! Enter <a rel="noopener external" target="_blank" href="https://docs.rs/imbl">the <code>imbl</code> crate</a>. This
crate provides <a rel="noopener external" target="_blank" href="https://docs.rs/imbl/3.0.0/imbl/struct.Vector.html">a <code>Vector</code> type</a>. It can be used like a regular
<code>Vec</code>. (Side note: it's even smarter than a <code>Vec</code> because it implements smart
head and tail chunking<sup class="footnote-reference" id="fr-UCR2014-1"><a href="#fn-UCR2014">4</a></sup>, and allocates on the stack or on the heap
depending on the size of the collection, similarly to <a rel="noopener external" target="_blank" href="https://docs.rs/smallvec">the <code>smallvec</code>
crate</a>. End of digression.)</p>
<h2 id="observable-immutable-collection">Observable (immutable) collection<a role="presentation" class="anchor" href="#observable-immutable-collection" title="Anchor link to this header">#</a>
</h2>
<p>The <code>imbl</code> crate then. It provides <a rel="noopener external" target="_blank" href="https://docs.rs/imbl/3.0.0/imbl/struct.Vector.html">a <code>Vector</code> type</a>. <code>eyeball</code>
provides a crate for working with immutable data structures (how surprising
huh?): <a rel="noopener external" target="_blank" href="https://docs.rs/eyeball-im">this crate is <code>eyeball-im</code></a>.</p>
<p>Instead of providing an <code>Observable<T></code> type, it provides <a rel="noopener external" target="_blank" href="https://docs.rs/eyeball-im/0.5.0/eyeball_im/struct.ObservableVector.html">an
<code>ObservableVector<T></code> type</a> which is a <code>Vector</code>,
but an observable one! Let's see… what do we have… <i>scroll the
documentation</i>, hmm, interesting, <i>scroll more…</i>, okay, that's
interesting:</p>
<ul>
<li>First off, there are methods like <code>append</code>, <code>pop_back</code>, <code>pop_front</code>,
<code>push_back</code>, <code>push_front</code>, <code>remove</code>, <code>insert</code>, <code>set</code>, <code>truncate</code> and <code>clear</code>.
It seems this collection is pretty flexible. The vocabulary is clear. They all
take a <code>&mut self</code>, cool.</li>
<li>Then, there is a <code>with_capacity</code> method, this is intriguing, <i>add to
notes</i>,</li>
<li>Finally, we find our not-so-ol' friend <code>subscribe</code>, but this time it returns a
<a rel="noopener external" target="_blank" href="https://docs.rs/eyeball-im/0.5.0/eyeball_im/struct.VectorSubscriber.html"><code>VectorSubscriber<T></code></a>.</li>
</ul>
<p>Let's explore <code>VectorSubscriber</code> a bit more, shall we? <i>Scroll the
documentation</i>, contrary to <code>Subscriber</code> and its
<a rel="noopener external" target="_blank" href="https://docs.rs/eyeball/0.8.8/eyeball/struct.Subscriber.html#method.next-1"><code>Subscriber::next</code></a>, there is no <code>next</code> method here. How
are we supposed to wait for an update?</p>
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>Confer to the assiduous reader! If you <em>carefully</em> read the documentation of the
<code>Subscriber::next</code> method, you will see:</p>
<blockquote>
<p>This method is a convenience so you don't have to import a <code>Stream</code> extension
trait such as <code>futures::StreamExt</code> or <code>tokio_stream::StreamExt</code>.</p>
</blockquote>
</div>
</div>
<p>… fair enough. So <code>Subscriber::next</code> mimics <code>StreamExt::next</code>. Okay. Let's look
at <a rel="noopener external" target="_blank" href="https://docs.rs/futures/0.3.30/futures/stream/trait.Stream.html"><code>Stream</code></a> first, it's from <a rel="noopener external" target="_blank" href="https://docs.rs/futures">the <code>futures</code>
crate</a>. <code>Stream</code> defines itself as:</p>
<blockquote>
<p>A stream of values produced asynchronously.</p>
<p>If <code>Future<Output = T></code> is an asynchronous version of <code>T</code>, then <code>Stream<Item = T></code> is an asynchronous version of <code>Iterator<Item = T></code>. A stream represents
a sequence of value-producing events that occur asynchronously to the caller.</p>
<p>The trait is modeled after <code>Future</code>, but allows <code>poll_next</code> to be called even
after a value has been produced, yielding None once the stream has been fully
exhausted.</p>
</blockquote>
<p>We aren't going to teach everything about <code>Stream</code>: why this design, its
pros and cons… However, <i>wave its hand to ask you to come
closer</i>, did you notice how <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/future/trait.Future.html#tymethod.poll"><code>Future::poll</code></a> returns <code>Poll<Self::Output></code>,
whilst <a rel="noopener external" target="_blank" href="https://docs.rs/futures/0.3.30/futures/stream/trait.Stream.html#tymethod.poll_next"><code>Stream::poll_next</code></a> returns
<code>Poll<Option<Self::Item>></code>? It's really similar to <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/iter/trait.Iterator.html#tymethod.next"><code>Iterator::next</code></a> which
returns <code>Option<Self::Item></code>.</p>
<p>Let's take a look at <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/task/enum.Poll.html"><code>Poll<T></code></a>, if you don't mind? It's an enum with 2 variants:</p>
<ul>
<li><code>Ready(value)</code> means a <code>value</code> is immediately ready,</li>
<li><code>Pending</code> means no value is ready yet.</li>
</ul>
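<p>To make these two variants tangible, here is a minimal sketch that polls a
<code>Future</code> by hand, using only the standard library. The <code>NoopWaker</code> type and
the <code>poll_once</code> helper are made up for this illustration: a real executor would
hand out a waker that actually reschedules the task.</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing: enough to poll by hand, useless in a real
/// executor. (Hypothetical helper, not part of any crate.)
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls `future` exactly once and returns what `Future::poll` returned.
fn poll_once<F: Future>(future: F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut context = Context::from_waker(&waker);
    // Pin the future on the stack so that `poll` can be called on it.
    let mut future = pin!(future);

    future.as_mut().poll(&mut context)
}

fn main() {
    // A future that already holds its value: `Ready(value)`.
    assert_eq!(poll_once(std::future::ready(42)), Poll::Ready(42));

    // A future that never completes: `Pending`.
    assert_eq!(poll_once(std::future::pending::<u8>()), Poll::Pending);
}
```

<p>Nobody polls by hand in application code, that's the executor's job, but it
helps to see what sits behind an <code>.await</code>.</p>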
<p>Then, what does <code>Poll<Option<T>></code> represent for a <code>Stream</code>?</p>
<ul>
<li><code>Poll::Ready(Some(value))</code> means this stream has successfully produced a
<code>value</code>, and may produce more values on subsequent <code>poll_next</code> calls,</li>
<li><code>Poll::Ready(None)</code> means the stream has terminated (and <code>poll_next</code> should
not be called anymore),</li>
<li><code>Poll::Pending</code> means no value is ready yet.</li>
</ul>
<p>It makes perfect sense. A <code>Future</code> produces a single value, whilst a <code>Stream</code>
produces multiple values, and <code>Poll::Ready(None)</code> represents the termination of
the stream, just like <code>None</code> represents the termination of an <code>Iterator</code>.
Ahh, I love consistency.</p>
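<p>If you prefer code over bullet points, here is a small sketch of these three
cases. To keep it free of dependencies, it does not use <code>futures::Stream</code>:
<code>MiniStream</code> is a hypothetical local trait that only copies the shape of
<code>poll_next</code>, and <code>Countdown</code>, <code>NoopWaker</code> and <code>drain</code> are invented for the
example.</p>

```rust
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for `futures::Stream`: same shape for `poll_next`.
trait MiniStream {
    type Item;

    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

/// Yields `remaining`, `remaining - 1`, …, `1`, then terminates.
struct Countdown {
    remaining: u8,
}

impl MiniStream for Countdown {
    type Item = u8;

    fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<u8>> {
        if self.remaining == 0 {
            // Terminated: the equivalent of `Iterator::next` returning `None`.
            return Poll::Ready(None);
        }

        let value = self.remaining;
        self.remaining -= 1;

        // A value has been produced, and more may follow.
        Poll::Ready(Some(value))
    }
}

/// A waker that does nothing, good enough to poll by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls `stream` until it stops producing values, collecting each of them.
fn drain<S: MiniStream + Unpin>(mut stream: S) -> Vec<S::Item> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut values = Vec::new();

    // The loop ends on `Poll::Ready(None)`. It would also end on
    // `Poll::Pending`, which `Countdown` never returns; a real executor would
    // park the task there and wait for the waker to be triggered.
    while let Poll::Ready(Some(value)) = Pin::new(&mut stream).poll_next(&mut cx) {
        values.push(value);
    }

    values
}

fn main() {
    assert_eq!(drain(Countdown { remaining: 3 }), vec![3, 2, 1]);
}
```

<p>Replace <code>Poll::Ready(Some(value))</code> by <code>Some(value)</code> and <code>Poll::Ready(None)</code> by
<code>None</code>, and <code>Countdown</code> reads like a plain <code>Iterator</code>.</p>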
<p>We have the basis. Now let's see <a rel="noopener external" target="_blank" href="https://docs.rs/futures/0.3.30/futures/stream/trait.StreamExt.html"><code>StreamExt</code></a>. It's
a trait extending <code>Stream</code> to add convenient combinator methods. Amongst other
things, we find <a rel="noopener external" target="_blank" href="https://docs.rs/futures/0.3.30/futures/prelude/stream/trait.StreamExt.html#method.next"><code>StreamExt::next</code></a>! Ah ha!
It returns a <code>Next</code> type which implements a <code>Future</code>, exactly what <code>eyeball</code>
does actually. Remember our:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-comment">// from `main.rs`</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">while</span><span class="z-storage"> let</span><span class="z-entity z-name"> Some</span><span>(</span><span class="z-variable">new_value</span><span>)</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> subscriber</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">next</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> dbg!</span><span>(</span><span class="z-variable">new_value</span><span>);</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>It is exactly the same pattern with <code>StreamExt::next</code>:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-comment">// from the documentation of `StreamExt::next`</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> futures</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">stream</span><span class="z-keyword z-operator">::</span><span>{</span><span class="z-variable z-language">self</span><span>,</span><span class="z-entity z-name"> StreamExt</span><span>};</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">let mut</span><span class="z-variable"> stream</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> stream</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">iter</span><span>(</span><span class="z-constant z-numeric">1</span><span class="z-keyword z-operator">..=</span><span class="z-constant z-numeric">3</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert_eq!</span><span>(</span><span class="z-variable">stream</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">next</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>,</span><span class="z-entity z-name"> Some</span><span>(</span><span class="z-constant z-numeric">1</span><span>));</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert_eq!</span><span>(</span><span class="z-variable">stream</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">next</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>,</span><span class="z-entity z-name"> Some</span><span>(</span><span class="z-constant z-numeric">2</span><span>));</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert_eq!</span><span>(</span><span class="z-variable">stream</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">next</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>,</span><span class="z-entity z-name"> Some</span><span>(</span><span class="z-constant z-numeric">3</span><span>));</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert_eq!</span><span>(</span><span class="z-variable">stream</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">next</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>,</span><span class="z-entity z-name"> None</span><span>);</span></span></code></pre>
<p>Pieces start to come together, don't they?</p>
<p>End of the detour. Back to <code>eyeball_im::VectorSubscriber<T></code>. It is possible to
transform this type into a <code>Stream</code> with its
<a rel="noopener external" target="_blank" href="https://docs.rs/eyeball-im/0.5.0/eyeball_im/struct.VectorSubscriber.html#method.into_stream"><code>into_stream</code></a> method. It returns
a <a rel="noopener external" target="_blank" href="https://docs.rs/eyeball-im/0.5.0/eyeball_im/struct.VectorSubscriberStream.html"><code>VectorSubscriberStream</code></a>. Naming is
hard, but if I had to guess, I would say it implements… a… <code>Stream</code>?</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-comment">// from `eyeball-im`</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">impl</span><span><</span><span class="z-entity z-name">T</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Clone</span><span class="z-keyword z-operator"> +</span><span class="z-entity z-name"> Send</span><span class="z-keyword z-operator"> +</span><span class="z-entity z-name"> Sync</span><span class="z-keyword z-operator"> +</span><span> '</span><span class="z-entity z-name">static</span><span>></span><span class="z-entity z-name"> Stream</span><span class="z-keyword"> for</span><span class="z-entity z-name"> VectorSubscriberStream</span><span><</span><span class="z-entity z-name">T</span><span>> {</span></span>
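<span class="giallo-l"><span class="z-storage">    type</span><span class="z-entity z-name"> Item</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> VectorDiff</span><span><</span><span class="z-entity z-name">T</span><span>>;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">    // [ … snip … ]</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>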
<p><a rel="noopener external" target="_blank" href="https://docs.rs/eyeball-im/0.5.0/eyeball_im/struct.VectorSubscriberStream.html#impl-Stream-for-VectorSubscriberStream%3CT%3E">Yes, it does</a>!</p>
<p>Dust blown away, the puzzle starts to appear clearly. Let's get back to coding!</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cargo add eyeball-im futures</span></span>
<span class="giallo-l"><span> Updating crates.io index</span></span>
<span class="giallo-l"><span> Adding eyeball-im v0.5.0 to dependencies</span></span>
<span class="giallo-l"><span> Features:</span></span>
<span class="giallo-l"><span> - serde</span></span>
<span class="giallo-l"><span> - tracing</span></span>
<span class="giallo-l"><span> Adding futures v0.3.30 to dependencies</span></span>
<span class="giallo-l"><span> Features:</span></span>
<span class="giallo-l"><span> + alloc</span></span>
<span class="giallo-l"><span> + async-await</span></span>
<span class="giallo-l"><span> + executor</span></span>
<span class="giallo-l"><span> + std</span></span>
<span class="giallo-l"><span> - bilock</span></span>
<span class="giallo-l"><span> - cfg-target-has-atomic</span></span>
<span class="giallo-l"><span> - compat</span></span>
<span class="giallo-l"><span> - futures-executor</span></span>
<span class="giallo-l"><span> - io-compat</span></span>
<span class="giallo-l"><span> - thread-pool</span></span>
<span class="giallo-l"><span> - unstable</span></span>
<span class="giallo-l"><span> - write-all-vectored</span></span>
<span class="giallo-l"><span> [ … snip … ]</span></span></code></pre><pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-comment">// in `src/main.rs`</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> eyeball_im</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">ObservableVector</span><span>;</span></span>
<span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> futures</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">stream</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">StreamExt</span><span>;</span></span>
<span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> macro_rules_attribute</span><span class="z-keyword z-operator">::</span><span>apply;</span></span>
<span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> smol</span><span class="z-keyword z-operator">::</span><span>{</span><span class="z-entity z-name">future</span><span class="z-keyword z-operator">::</span><span>yield_now,</span><span class="z-entity z-name"> Executor</span><span>};</span></span>
<span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> smol_macros</span><span class="z-keyword z-operator">::</span><span>main;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>#[apply(main</span><span class="z-keyword z-operator">!</span><span>)]</span></span>
<span class="giallo-l"><span class="z-keyword">async fn</span><span class="z-entity z-name z-function"> main</span><span>(</span><span class="z-variable">executor</span><span class="z-keyword z-operator">: &</span><span class="z-entity z-name">Executor</span><span>) {</span></span>
<span class="giallo-l"><span class="z-storage">    let mut</span><span class="z-variable"> observable</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> ObservableVector</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">new</span><span>();</span></span>
<span class="giallo-l"><span class="z-comment">    // Subscribe to `observable` and get a `Stream`.</span></span>
<span class="giallo-l"><span class="z-storage">    let mut</span><span class="z-variable"> subscriber</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">subscribe</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">into_stream</span><span>();</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">    // Push one value.</span></span>
<span class="giallo-l"><span class="z-variable">    observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'a'</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">    // Task that reads new updates from `observable`.</span></span>
<span class="giallo-l"><span class="z-storage">    let</span><span class="z-variable"> task</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> executor</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">spawn</span><span>(</span><span class="z-keyword">async move</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword">        while</span><span class="z-storage"> let</span><span class="z-entity z-name"> Some</span><span>(</span><span class="z-variable">new_value</span><span>)</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> subscriber</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">next</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">            dbg!</span><span>(</span><span class="z-variable">new_value</span><span>);</span></span>
<span class="giallo-l"><span>        }</span></span>
<span class="giallo-l"><span>    });</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">    // Now, let's update `observable`.</span></span>
<span class="giallo-l"><span class="z-variable">    observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'b'</span><span>);</span></span>
<span class="giallo-l"><span class="z-comment">    // Eh `executor`: `task` can run now!</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">    yield_now</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">    // More updates.</span></span>
<span class="giallo-l"><span class="z-variable">    observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'c'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">    observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'d'</span><span>);</span></span>
<span class="giallo-l"><span class="z-comment">    // Eh `executor`, same.</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">    yield_now</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function">    drop</span><span>(</span><span class="z-variable">observable</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">    task</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>;</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Time to show off:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cargo run --quiet</span></span>
<span class="giallo-l"><span>[src/main.rs:18:13] new_value = PushBack {</span></span>
<span class="giallo-l"><span>    value: 'a',</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"><span>[src/main.rs:18:13] new_value = PushBack {</span></span>
<span class="giallo-l"><span>    value: 'b',</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"><span>[src/main.rs:18:13] new_value = PushBack {</span></span>
<span class="giallo-l"><span>    value: 'c',</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"><span>[src/main.rs:18:13] new_value = PushBack {</span></span>
<span class="giallo-l"><span>    value: 'd',</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Do you see something new?</p>
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>Hmm, indeed. With <code>Observable</code>, some values may go “missing” because <code>Observable</code>
and <code>Subscriber</code> have no buffer. The subscribers only return the current value
when asked for it. However, with <code>ObservableVector</code>, things are different: no
missing values. They are all here. As if there… was a buffer!</p>
<p>And the values returned by the subscriber are not the raw <code>T</code>:
we see <code>PushBack</code>. It comes from, <i>check the documentation</i>,
<a rel="noopener external" target="_blank" href="https://docs.rs/eyeball-im/0.5.0/eyeball_im/enum.VectorDiff.html"><code>VectorDiff::PushBack</code></a>!</p>
</div>
</div>
<p>Good eyes, well done.</p>
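<p>To picture the contrast Le Comte just described, here is a toy model, written
with the standard library only. It says nothing about how <code>eyeball</code> or
<code>eyeball-im</code> work internally: <code>LatestOnly</code>, <code>Buffered</code>, <code>Diff</code> and <code>observe</code>
are hypothetical names invented for this sketch. One subscriber only keeps the
latest value, the other one queues every update until it is read.</p>

```rust
use std::collections::VecDeque;

/// Toy diff type, named after `VectorDiff::PushBack` for readability only.
#[derive(Debug, PartialEq)]
enum Diff<T> {
    PushBack { value: T },
}

/// Toy subscriber without a buffer: it only knows the latest value.
struct LatestOnly<T> {
    latest: Option<T>,
}

impl<T> LatestOnly<T> {
    fn notify(&mut self, value: T) {
        // The previous value, if it has not been read yet, is overwritten.
        self.latest = Some(value);
    }

    fn next(&mut self) -> Option<T> {
        self.latest.take()
    }
}

/// Toy subscriber with a buffer: every update is queued until it is read.
struct Buffered<T> {
    queue: VecDeque<Diff<T>>,
}

impl<T> Buffered<T> {
    fn notify(&mut self, value: T) {
        self.queue.push_back(Diff::PushBack { value });
    }

    fn next(&mut self) -> Option<Diff<T>> {
        self.queue.pop_front()
    }
}

/// Sends all `values` to both toy subscribers before reading anything, then
/// returns what each of them is able to read afterwards.
fn observe(values: &[char]) -> (Vec<char>, Vec<Diff<char>>) {
    let mut latest_only = LatestOnly { latest: None };
    let mut buffered = Buffered { queue: VecDeque::new() };

    for &value in values {
        latest_only.notify(value);
        buffered.notify(value);
    }

    let without_buffer: Vec<char> = std::iter::from_fn(|| latest_only.next()).collect();
    let with_buffer: Vec<Diff<char>> = std::iter::from_fn(|| buffered.next()).collect();

    (without_buffer, with_buffer)
}

fn main() {
    let (without_buffer, with_buffer) = observe(&['a', 'b', 'c']);

    // Without a buffer, `'a'` and `'b'` are gone.
    assert_eq!(without_buffer, vec!['c']);

    // With a buffer, nothing is missed and the order is preserved.
    assert_eq!(
        with_buffer,
        vec![
            Diff::PushBack { value: 'a' },
            Diff::PushBack { value: 'b' },
            Diff::PushBack { value: 'c' },
        ],
    );
}
```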
<p>First off, it's correct that <code>PushBack</code> comes from
<a rel="noopener external" target="_blank" href="https://docs.rs/eyeball-im/0.5.0/eyeball_im/enum.VectorDiff.html"><code>VectorDiff</code></a>. Let's come back to this piece in
a second: it is the cornerstone of the entire series, and it deserves a bit of
explanation.</p>
<p>Second, yes, <code>VectorSubscriber</code> returns <strong>all values</strong>! There is actually a
buffer. It's a bit annoying to continue with a <code>task</code> as we did so far, let's
use <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/macro.assert_eq.html"><code>assert_eq!</code></a> instead.</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-comment">// in `src/main.rs`</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> eyeball_im</span><span class="z-keyword z-operator">::</span><span>{</span><span class="z-entity z-name">ObservableVector</span><span>,</span><span class="z-entity z-name"> VectorDiff</span><span>};</span></span>
<span class="giallo-l"><span class="z-comment">// ^^^^^^^^^^ new!</span></span>
<span class="giallo-l"><span class="z-comment">// …</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>#[apply(main</span><span class="z-keyword z-operator">!</span><span>)]</span></span>
<span class="giallo-l"><span class="z-keyword">async fn</span><span class="z-entity z-name z-function"> main</span><span>(</span><span class="z-variable">_executor</span><span class="z-keyword z-operator">: &</span><span class="z-entity z-name">Executor</span><span>) {</span></span>
<span class="giallo-l"><span class="z-storage">    let mut</span><span class="z-variable"> observable</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> ObservableVector</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">new</span><span>();</span></span>
<span class="giallo-l"><span class="z-storage">    let mut</span><span class="z-variable"> subscriber</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">subscribe</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">into_stream</span><span>();</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">    // Push one value.</span></span>
<span class="giallo-l"><span class="z-variable">    observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'a'</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function">    assert_eq!</span><span>(</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">        dbg!</span><span>(</span><span class="z-variable">subscriber</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">next</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>),</span></span>
<span class="giallo-l"><span class="z-entity z-name">        Some</span><span>(</span><span class="z-entity z-name">VectorDiff</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">PushBack</span><span> {</span><span class="z-variable"> value</span><span class="z-keyword z-operator">:</span><span class="z-string"> 'a'</span><span> }),</span></span>
<span class="giallo-l"><span>    );</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">    // Push more values.</span></span>
<span class="giallo-l"><span class="z-variable">    observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'b'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">    observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'c'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">    observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'d'</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function">    assert_eq!</span><span>(</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">        dbg!</span><span>(</span><span class="z-variable">subscriber</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">next</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>),</span></span>
<span class="giallo-l"><span class="z-entity z-name">        Some</span><span>(</span><span class="z-entity z-name">VectorDiff</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">PushBack</span><span> {</span><span class="z-variable"> value</span><span class="z-keyword z-operator">:</span><span class="z-string"> 'b'</span><span> }),</span></span>
<span class="giallo-l"><span>    );</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">    assert_eq!</span><span>(</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">        dbg!</span><span>(</span><span class="z-variable">subscriber</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">next</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>),</span></span>
<span class="giallo-l"><span class="z-entity z-name">        Some</span><span>(</span><span class="z-entity z-name">VectorDiff</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">PushBack</span><span> {</span><span class="z-variable"> value</span><span class="z-keyword z-operator">:</span><span class="z-string"> 'c'</span><span> }),</span></span>
<span class="giallo-l"><span>    );</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">    assert_eq!</span><span>(</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">        dbg!</span><span>(</span><span class="z-variable">subscriber</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">next</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>),</span></span>
<span class="giallo-l"><span class="z-entity z-name">        Some</span><span>(</span><span class="z-entity z-name">VectorDiff</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">PushBack</span><span> {</span><span class="z-variable"> value</span><span class="z-keyword z-operator">:</span><span class="z-string"> 'd'</span><span> }),</span></span>
<span class="giallo-l"><span>    );</span></span>
<span class="giallo-l"><span>}</span></span></code></pre><pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cargo run --quiet</span></span>
<span class="giallo-l"><span>[src/main.rs:16:9] subscriber.next().await = Some(</span></span>
<span class="giallo-l"><span>    PushBack {</span></span>
<span class="giallo-l"><span>        value: 'a',</span></span>
<span class="giallo-l"><span>    },</span></span>
<span class="giallo-l"><span>)</span></span>
<span class="giallo-l"><span>[src/main.rs:26:9] subscriber.next().await = Some(</span></span>
<span class="giallo-l"><span>    PushBack {</span></span>
<span class="giallo-l"><span>        value: 'b',</span></span>
<span class="giallo-l"><span>    },</span></span>
<span class="giallo-l"><span>)</span></span>
<span class="giallo-l"><span>[src/main.rs:30:9] subscriber.next().await = Some(</span></span>
<span class="giallo-l"><span>    PushBack {</span></span>
<span class="giallo-l"><span>        value: 'c',</span></span>
<span class="giallo-l"><span>    },</span></span>
<span class="giallo-l"><span>)</span></span>
<span class="giallo-l"><span>[src/main.rs:34:9] subscriber.next().await = Some(</span></span>
<span class="giallo-l"><span>    PushBack {</span></span>
<span class="giallo-l"><span>        value: 'd',</span></span>
<span class="giallo-l"><span>    },</span></span>
<span class="giallo-l"><span>)</span></span></code></pre>
<p>Beautiful! However… the code is a bit verbose, isn't it? <i>Desperately waiting
for an affirmative answer</i>, okay, okay, something you may not know about me:
I love macros. There. I said it. Let's quickly craft one:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-comment">// in `src/main.rs`</span></span>
<span class="giallo-l"><span class="z-comment">// before the `main` function</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function">macro_rules! assert_next_eq</span><span> {</span></span>
<span class="giallo-l"><span>    (</span><span class="z-keyword z-operator"> $</span><span class="z-variable">stream</span><span class="z-keyword z-operator">:</span><span class="z-variable">ident</span><span>,</span><span class="z-keyword z-operator"> $</span><span class="z-variable">expr</span><span class="z-keyword z-operator">:</span><span class="z-variable">expr</span><span class="z-keyword z-operator"> $</span><span>(,)</span><span class="z-keyword z-operator">?</span><span> )</span><span class="z-keyword z-operator"> =></span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">        assert_eq!</span><span>(</span><span class="z-entity z-name z-function">dbg!</span><span>(</span><span class="z-keyword z-operator"> $</span><span class="z-variable">stream</span><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">next</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-keyword">await</span><span>),</span><span class="z-entity z-name"> Some</span><span>(</span><span class="z-keyword z-operator"> $</span><span class="z-variable">expr</span><span> ));</span></span>
<span class="giallo-l"><span>    };</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>This macro does exactly what our <code>assert_eq!</code> was doing, except now it's shorter
to use, and thus more pleasant. Don't believe me? See for yourself:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-comment">// in `src/main.rs`</span></span>
<span class="giallo-l"><span class="z-comment">// at the end of the `main` function</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// Push one value.</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'a'</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert_next_eq!</span><span>(</span><span class="z-variable">subscriber</span><span>,</span><span class="z-entity z-name"> VectorDiff</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">PushBack</span><span> {</span><span class="z-variable"> value</span><span class="z-keyword z-operator">:</span><span class="z-string"> 'a'</span><span> });</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// Push another value.</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'b'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'c'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'d'</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert_next_eq!</span><span>(</span><span class="z-variable">subscriber</span><span>,</span><span class="z-entity z-name"> VectorDiff</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">PushBack</span><span> {</span><span class="z-variable"> value</span><span class="z-keyword z-operator">:</span><span class="z-string"> 'b'</span><span> });</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert_next_eq!</span><span>(</span><span class="z-variable">subscriber</span><span>,</span><span class="z-entity z-name"> VectorDiff</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">PushBack</span><span> {</span><span class="z-variable"> value</span><span class="z-keyword z-operator">:</span><span class="z-string"> 'c'</span><span> });</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert_next_eq!</span><span>(</span><span class="z-variable">subscriber</span><span>,</span><span class="z-entity z-name"> VectorDiff</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">PushBack</span><span> {</span><span class="z-variable"> value</span><span class="z-keyword z-operator">:</span><span class="z-string"> 'd'</span><span> });</span></span></code></pre>
<p>There we go.</p>
<p>Having a scientific and rigorous approach is important in our domain. We said
<code>ObservableVector</code> seems to contain a buffer, and <code>VectorSubscriber</code> seems to
pop values from this buffer. Let's play with that. I see two things to test:</p>
<ol>
<li>Modify the <code>ObservableVector</code>, and subscribe to it <em>after</em>: Does the
subscriber receive the updates made before it was created?</li>
<li>How many values can the buffer hold?</li>
</ol>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-storage">let mut</span><span class="z-variable"> observable</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> ObservableVector</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">new</span><span>();</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// Push a value before the subscriber exists.</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'a'</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">let mut</span><span class="z-variable"> subscriber</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">subscribe</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">into_stream</span><span>();</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// Push another value.</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'b'</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert_next_eq!</span><span>(</span><span class="z-variable">subscriber</span><span>,</span><span class="z-entity z-name"> VectorDiff</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">PushBack</span><span> {</span><span class="z-variable"> value</span><span class="z-keyword z-operator">:</span><span class="z-string"> 'b'</span><span> });</span></span></code></pre>
<p>If the <code>subscriber</code> receives <code>'a'</code>, the assertion must fail; otherwise, there is no error:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cargo run --quiet</span></span>
<span class="giallo-l"><span>[src/main.rs:25:5] subscriber.next().await = Some(</span></span>
<span class="giallo-l"><span> PushBack {</span></span>
<span class="giallo-l"><span> value: 'b',</span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"><span>)</span></span></code></pre>
<p>Look Ma', no error!</p>
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>We have learned that a <code>VectorSubscriber</code> is aware of the new updates that are
made once it exists. A <code>VectorSubscriber</code> is not aware of updates that happened
before its creation.</p>
<p>In the example, <code>VectorDiff::PushBack { value: 'a' }</code> is not received because it
happened before <code>subscriber</code> was created. However, <code>VectorDiff::PushBack { value: 'b' }</code> is
received because it happened after <code>subscriber</code> was created. It makes perfect
sense.</p>
<p>It suggests that the buffer lives inside <code>VectorSubscriber</code>, and not inside
<code>ObservableVector</code>. Or maybe the buffer is shared between the observable and the
subscribers, with the buffer having some specific semantics, like a <em>channel</em>.
We would need to look at the implementation to be sure.</p>
</div>
</div>
<p>Agreed. This is left as an exercise for the reader, <i>wink to you</i>.</p>
<p>We have an answer to question 1. What about question 2? The size of the buffer.</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-comment">// in `src/main.rs`</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">let mut</span><span class="z-variable"> observable</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> ObservableVector</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">new</span><span>();</span></span>
<span class="giallo-l"><span class="z-storage">let mut</span><span class="z-variable"> subscriber</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">subscribe</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">into_stream</span><span>();</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// Push ALL THE VALUES!</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'a'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'b'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'c'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'d'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'e'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'f'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'g'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'h'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'i'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'j'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'k'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'l'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'m'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'n'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'o'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'p'</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert_next_eq!</span><span>(</span><span class="z-variable">subscriber</span><span>,</span><span class="z-entity z-name"> VectorDiff</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">PushBack</span><span> {</span><span class="z-variable"> value</span><span class="z-keyword z-operator">:</span><span class="z-string"> 'a'</span><span> });</span></span>
<span class="giallo-l"><span class="z-comment">// no need to assert the others</span></span></code></pre><pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cargo run --quiet</span></span>
<span class="giallo-l"><span>[src/main.rs:36:5] subscriber.next().await = Some(</span></span>
<span class="giallo-l"><span> PushBack {</span></span>
<span class="giallo-l"><span> value: 'a',</span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"><span>)</span></span></code></pre>
<p>Hmm, the buffer doesn't seem to be full with 16 values. Let's add a couple more:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-comment">// in `src/main.rs`</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// [ … snip … ]</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'n'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'o'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'p'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'q'</span><span>);</span></span>
<span class="giallo-l"><span class="z-comment">// ^ new!</span></span>
<span class="giallo-l"><span class="z-variable">observable</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-string">'r'</span><span>);</span></span>
<span class="giallo-l"><span class="z-comment">// ^ new!</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert_next_eq!</span><span>(</span><span class="z-variable">subscriber</span><span>,</span><span class="z-entity z-name"> VectorDiff</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">PushBack</span><span> {</span><span class="z-variable"> value</span><span class="z-keyword z-operator">:</span><span class="z-string"> 'a'</span><span> });</span></span></code></pre><pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cargo run --quiet</span></span>
<span class="giallo-l"><span>[src/main.rs:38:5] subscriber.next().await = Some(</span></span>
<span class="giallo-l"><span> Reset {</span></span>
<span class="giallo-l"><span> values: [</span></span>
<span class="giallo-l"><span> 'a',</span></span>
<span class="giallo-l"><span> 'b',</span></span>
<span class="giallo-l"><span> 'c',</span></span>
<span class="giallo-l"><span> 'd',</span></span>
<span class="giallo-l"><span> 'e',</span></span>
<span class="giallo-l"><span> 'f',</span></span>
<span class="giallo-l"><span> 'g',</span></span>
<span class="giallo-l"><span> 'h',</span></span>
<span class="giallo-l"><span> 'i',</span></span>
<span class="giallo-l"><span> 'j',</span></span>
<span class="giallo-l"><span> 'k',</span></span>
<span class="giallo-l"><span> 'l',</span></span>
<span class="giallo-l"><span> 'm',</span></span>
<span class="giallo-l"><span> 'n',</span></span>
<span class="giallo-l"><span> 'o',</span></span>
<span class="giallo-l"><span> 'p',</span></span>
<span class="giallo-l"><span> 'q',</span></span>
<span class="giallo-l"><span> 'r',</span></span>
<span class="giallo-l"><span> ],</span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"><span>)</span></span>
<span class="giallo-l"><span>thread 'main' panicked at src/main.rs:38:5:</span></span>
<span class="giallo-l"><span>assertion `left == right` failed</span></span>
<span class="giallo-l"><span> left: Some(Reset { values: ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r'] })</span></span>
<span class="giallo-l"><span> right: Some(PushBack { value: 'a' })</span></span>
<span class="giallo-l"><span>note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace</span></span></code></pre>
<p>Oh! An error, great! Our <code>assert_next_eq!</code> has failed. <code>subscriber</code>
does not receive a <code>VectorDiff::PushBack</code> but a <code>VectorDiff::Reset</code>.
Let's play with
<a rel="noopener external" target="_blank" href="https://docs.rs/eyeball-im/0.5.0/eyeball_im/struct.ObservableVector.html#method.with_capacity"><code>ObservableVector::with_capacity</code></a>
for a moment: maybe it's related to the buffer capacity? Let's change a single line:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-storage">let mut</span><span class="z-variable"> observable</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> ObservableVector</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">with_capacity</span><span>(</span><span class="z-constant z-numeric">32</span><span>);</span></span>
<span class="giallo-l"><span class="z-comment">// ^^^^^^^^^^^^^^^^^ new!</span></span></code></pre><pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cargo run --quiet</span></span>
<span class="giallo-l"><span>[src/main.rs:38:5] subscriber.next().await = Some(</span></span>
<span class="giallo-l"><span> PushBack {</span></span>
<span class="giallo-l"><span> value: 'a',</span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"><span>)</span></span></code></pre><div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>We have learned that <code>ObservableVector::with_capacity</code> controls the size of
the buffer.</p>
<p>The name could suggest that it controls the capacity of the observed <code>Vector</code>,
<em>à la</em> <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/vec/struct.Vec.html#method.with_capacity"><code>Vec::with_capacity</code></a>, but the two must not be confused.</p>
<p>For a reason we don't know yet, when the buffer is full, we receive a
<code>VectorDiff::Reset</code>. We need to learn more about this type.</p>
</div>
</div>
<h2 id="observable-differences">Observable differences<a role="presentation" class="anchor" href="#observable-differences" title="Anchor link to this header">#</a>
</h2>
<p>The previous section explained how immutable data structures could save us
by cheaply and efficiently cloning the data between the observable and its
subscribers. However, we see that <a rel="noopener external" target="_blank" href="https://docs.rs/eyeball-im"><code>eyeball-im</code></a>, despite using <a rel="noopener external" target="_blank" href="https://docs.rs/imbl"><code>imbl</code></a>, does
not share an <a rel="noopener external" target="_blank" href="https://docs.rs/imbl/3.0.0/imbl/struct.Vector.html"><code>imbl::Vector</code></a> but an <a rel="noopener external" target="_blank" href="https://docs.rs/eyeball-im/0.5.0/eyeball_im/enum.VectorDiff.html"><code>eyeball_im::VectorDiff</code></a>. Why such a design?
It looks like a drama. A betrayal. An act of treachery!</p>
<p>Well. Firstly, <code>eyeball-im</code> does rely on some immutable properties of <code>Vector</code>.
And secondly, the reason why <code>VectorDiff</code> exists is simple. If a
subscriber received <code>Vector</code>s, how would the user be able to see what has changed? The
user (!) would be responsible for <em>calculating</em> the differences between two <code>Vector</code>s
every time! Not only is this costly, but it is also utterly error-prone.</p>
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>Are you suggesting that <code>VectorSubscriber</code> (or <code>VectorSubscriberStream</code>)
calculates the differences between the <code>Vector</code>s itself so that the user doesn't
have to?</p>
<p>I still see many problems though. I believe the order of the <code>VectorDiff</code>s
matters a lot for some use cases. For example, let's consider two consecutive
<code>Vector</code>s:</p>
<ol>
<li><code>['a', 'b', 'c']</code> and</li>
<li><code>['a', 'c', 'b']</code>.</li>
</ol>
<p>Has <code>'b'</code> been removed and pushed back, or has <code>'c'</code> been popped back and inserted?
How can you decide between the two?</p>
</div>
</div>
<p>We can't —it would be implementation specifics anyway— and we don't want to.
The user is manipulating the <code>ObservableVector</code> in a special way, and we should
ideally not change that.</p>
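<p>To make Le Comte's point concrete, here is a minimal sketch that uses a plain
<code>std</code> <code>Vec</code> rather than an <code>imbl::Vector</code>. The two function names are
made up for the illustration. Both edit histories start from the same values and end in
the same state, so comparing two snapshots cannot tell which history really happened:</p>

```rust
// Two different edit histories, applied to a plain `std::vec::Vec`.
// The function names are hypothetical, chosen for this illustration only.

/// First history: remove `'b'`, then push it back.
fn remove_then_push_back() -> Vec<char> {
    let mut values = vec!['a', 'b', 'c'];
    let removed = values.remove(1);
    values.push(removed);
    values
}

/// Second history: pop `'c'` from the back, then insert it at index 1.
fn pop_back_then_insert() -> Vec<char> {
    let mut values = vec!['a', 'b', 'c'];
    let popped = values.pop().expect("the vector is not empty");
    values.insert(1, popped);
    values
}

fn main() {
    // Both histories end in the same state: a snapshot comparison
    // has no way to recover which operations were really applied.
    assert_eq!(remove_then_push_back(), vec!['a', 'c', 'b']);
    assert_eq!(pop_back_then_insert(), vec!['a', 'c', 'b']);
}
```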
<p>These <code>VectorDiff</code>s actually come from <code>ObservableVector</code> itself! Let's look at
the implementation of
<a rel="noopener external" target="_blank" href="https://github.com/jplatte/eyeball/blob/4254403e385715380753bb0def20fb0398e91ebd/eyeball-im/src/vector.rs#L107-L114"><code>ObservableVector::push_back</code></a>:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">pub fn</span><span class="z-entity z-name z-function"> push_back</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-storage">mut</span><span class="z-variable z-language"> self</span><span>,</span><span class="z-variable"> value</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> T</span><span>) {</span></span>
<span class="giallo-l"><span class="z-comment"> // [ … snip … ]</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable z-language"> self</span><span class="z-keyword z-operator">.</span><span>values</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push_back</span><span>(</span><span class="z-variable">value</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">clone</span><span>());</span></span>
<span class="giallo-l"><span class="z-comment"> // ^^^^^^ this is a `Vector`!</span></span>
<span class="giallo-l"><span class="z-variable z-language"> self</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">broadcast_diff</span><span>(</span><span class="z-entity z-name">VectorDiff</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">PushBack</span><span> {</span><span class="z-variable"> value</span><span> });</span></span>
<span class="giallo-l"><span class="z-comment"> // ^^^^^^^^^^ here you are…</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Each method adding or removing values on the <code>ObservableVector</code> emits its own
<code>VectorDiff</code> variant. No calculation, it's purely a mapping:</p>
<figure>
<table><thead><tr><th><code>ObservableVector::…</code></th><th><code>VectorDiff::…</code></th><th>Meaning</th></tr></thead><tbody>
<tr><td><code>append(values)</code></td><td><code>Append { values }</code></td><td>Append many <code>values</code></td></tr>
<tr><td><code>clear()</code></td><td><code>Clear</code></td><td>Clear out all the values</td></tr>
<tr><td><code>insert(index, value)</code></td><td><code>Insert { index, value }</code></td><td>Insert a <code>value</code> at <code>index</code> </td></tr>
<tr><td><code>pop_back()</code></td><td><code>PopBack</code></td><td>Remove the value at the back</td></tr>
<tr><td><code>pop_front()</code></td><td><code>PopFront</code></td><td>Remove the value at the front</td></tr>
<tr><td><code>push_back(value)</code></td><td><code>PushBack { value }</code></td><td>Add <code>value</code> at the back</td></tr>
<tr><td><code>push_front(value)</code></td><td><code>PushFront { value }</code></td><td>Add <code>value</code> at the front</td></tr>
<tr><td><code>remove(index)</code></td><td><code>Remove { index }</code></td><td>Remove value at <code>index</code></td></tr>
<tr><td><code>set(index, value)</code></td><td><code>Set { index, value }</code></td><td>Replace value at <code>index</code> by <code>value</code></td></tr>
<tr><td><code>truncate(length)</code></td><td><code>Truncate { length }</code></td><td>Truncate to <code>length</code> values</td></tr>
</tbody></table>
<figcaption>
<p>Mappings of <code>ObservableVector</code> methods to <code>VectorDiff</code> variants.</p>
</figcaption>
</figure>
<p>See, for each <code>VectorDiff</code> variant, there is an <code>ObservableVector</code> method
triggering it.</p>
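<p>Because each diff describes exactly one operation, a consumer can keep its own copy of
the values up to date by replaying the diffs in order. The sketch below assumes a
simplified <code>ToyDiff</code> enum and an <code>apply</code> function, both invented here: they
mirror only a few rows of the table, work on a plain <code>Vec</code> of <code>char</code>, and are not
the <code>eyeball-im</code> API:</p>

```rust
/// A simplified, hypothetical stand-in for a few `VectorDiff` variants.
#[derive(Debug, Clone, PartialEq)]
enum ToyDiff {
    PushBack { value: char },
    PushFront { value: char },
    Insert { index: usize, value: char },
    Remove { index: usize },
    PopBack,
    Clear,
}

/// Apply one diff to the consumer's local copy of the values.
fn apply(mirror: &mut Vec<char>, diff: ToyDiff) {
    match diff {
        ToyDiff::PushBack { value } => mirror.push(value),
        ToyDiff::PushFront { value } => mirror.insert(0, value),
        ToyDiff::Insert { index, value } => mirror.insert(index, value),
        ToyDiff::Remove { index } => {
            mirror.remove(index);
        }
        ToyDiff::PopBack => {
            mirror.pop();
        }
        ToyDiff::Clear => mirror.clear(),
    }
}

/// Replay a whole sequence of diffs, starting from an empty copy.
fn replay(diffs: Vec<ToyDiff>) -> Vec<char> {
    let mut mirror = Vec::new();
    for diff in diffs {
        apply(&mut mirror, diff);
    }
    mirror
}

fn main() {
    let mirror = replay(vec![
        ToyDiff::PushBack { value: 'a' },         // ['a']
        ToyDiff::PushBack { value: 'c' },         // ['a', 'c']
        ToyDiff::Insert { index: 1, value: 'b' }, // ['a', 'b', 'c']
        ToyDiff::PushFront { value: 'z' },        // ['z', 'a', 'b', 'c']
        ToyDiff::PopBack,                         // ['z', 'a', 'b']
        ToyDiff::Remove { index: 0 },             // ['a', 'b']
    ]);
    assert_eq!(mirror, vec!['a', 'b']);

    // `Clear` empties the local copy, whatever it contained before.
    assert!(replay(vec![ToyDiff::PushBack { value: 'a' }, ToyDiff::Clear]).is_empty());
}
```

<p>No comparison between two snapshots is needed: the consumer only follows the operations
in the order they were emitted.</p>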
<div class="conversation" data-character="comte">
<div class="conversation--character">
<span lang="fr">Le Comte</span>
<picture role="presentation">
<source srcset="/image/comte.avif" type="image/avif" />
<source srcset="/image/comte.webp" type="image/webp" />
<img src="/image/comte.png" loading="lazy" decoding="async" />
</picture>
</div>
<div class="conversation--message">
<p>And what about <code>VectorDiff::Reset</code>?</p>
<p>We were receiving it when the buffer was full apparently. You are not mentioning
it, and if I take a close look at <code>ObservableVector</code>'s documentation, I don't
see any <code>reset</code> method. Is it only an internal thing?</p>
</div>
</div>
<p>You are correct. When the buffer is full, the subscriber will provide a
<code>VectorDiff::Reset { values }</code> where <code>values</code> is the full list of values. The
documentation says:</p>
<blockquote>
<p>The subscriber lagged too far behind, and the next update that should have
been received has already been discarded from the internal buffer.</p>
</blockquote>
<p>If the subscriber didn't catch all the updates, the best thing it can do is to
say: <q>Okay, I am late to the party, I've missed several things, so here is the
current state!</q>. This is not ideal, but the subscriber is responsible for not
lagging, and this design avoids missing values. If a subscriber receives
too many <code>VectorDiff::Reset</code>s, the user may consider increasing the capacity of
the <code>ObservableVector</code>.</p>
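<p>The lagging behaviour can be modelled with a toy bounded buffer. This is only a sketch
under stated assumptions, not how <code>eyeball-im</code> is implemented: <code>ToyObservable</code>,
<code>ToyDiff</code> and <code>first_diff_after</code> are invented names, and the capacities 16 and
32 are picked to mirror the experiments above. The idea is that the oldest diff is
discarded when the buffer is full, and a subscriber whose next diff has been discarded gets
the full current state instead:</p>

```rust
use std::collections::VecDeque;

/// Simplified, hypothetical stand-in for two `VectorDiff` variants.
#[derive(Debug, Clone, PartialEq)]
enum ToyDiff {
    PushBack { value: char },
    Reset { values: Vec<char> },
}

/// Toy observable: the values, plus a bounded buffer of the latest diffs.
struct ToyObservable {
    values: Vec<char>,
    buffer: VecDeque<ToyDiff>,
    capacity: usize,
    /// Total number of diffs broadcast so far.
    sent: usize,
}

impl ToyObservable {
    fn with_capacity(capacity: usize) -> Self {
        Self { values: Vec::new(), buffer: VecDeque::new(), capacity, sent: 0 }
    }

    fn push_back(&mut self, value: char) {
        self.values.push(value);
        if self.buffer.len() >= self.capacity {
            // The buffer is full: the oldest diff is discarded.
            self.buffer.pop_front();
        }
        self.buffer.push_back(ToyDiff::PushBack { value });
        self.sent += 1;
    }

    /// Next diff for a subscriber that has already consumed `seen` diffs.
    fn next(&self, seen: &mut usize) -> Option<ToyDiff> {
        if *seen == self.sent {
            return None; // Nothing new to read.
        }
        // Position, in the whole history, of the oldest diff still buffered.
        let oldest = self.sent - self.buffer.len();
        if *seen < oldest {
            // The diff this subscriber needs is gone: send the full state.
            *seen = self.sent;
            return Some(ToyDiff::Reset { values: self.values.clone() });
        }
        let diff = self.buffer[*seen - oldest].clone();
        *seen += 1;
        Some(diff)
    }
}

/// Push `count` letters starting at 'a', then read the first diff.
fn first_diff_after(capacity: usize, count: u8) -> Option<ToyDiff> {
    let mut observable = ToyObservable::with_capacity(capacity);
    let mut seen = 0;
    for offset in 0..count {
        observable.push_back((b'a' + offset) as char);
    }
    observable.next(&mut seen)
}

fn main() {
    // 16 diffs fit in a buffer of 16: the subscriber gets the first one.
    assert_eq!(first_diff_after(16, 16), Some(ToyDiff::PushBack { value: 'a' }));
    // 18 diffs do not: the subscriber lagged and gets the whole state.
    let values: Vec<char> = ('a'..='r').collect();
    assert_eq!(first_diff_after(16, 18), Some(ToyDiff::Reset { values }));
    // A larger buffer absorbs the same burst.
    assert_eq!(first_diff_after(32, 18), Some(ToyDiff::PushBack { value: 'a' }));
}
```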
<h2 id="filtering-and-sorting-with-higher-order-streams">Filtering and sorting with higher-order <code>Stream</code>s<a role="presentation" class="anchor" href="#filtering-and-sorting-with-higher-order-streams" title="Anchor link to this header">#</a>
</h2>
<p>We are reaching the end of this episode. And you know what? We have set up all the
parts to talk about higher-order <code>Stream</code>s, <i>chant victory and dance at the
same time</i>!</p>
<p>At the beginning of this episode, we were saying that the Matrix Rust SDK is
able to filter and to sort an <code>ObservableVector</code> representing all the rooms.
How? <code>VectorSubscriberStream</code> <em>is</em> a <code>Stream</code>. More specifically, it is a
<code>Stream<Item = VectorDiff<T>></code>. Now questions:</p>
<ul>
<li>What's the difference between an unfiltered <code>Vector</code> and a filtered <code>Vector</code>?</li>
<li>What's the difference between an unsorted <code>Vector</code> and a sorted <code>Vector</code>?</li>
<li>What's the difference between a filtered <code>Vector</code> and a sorted <code>Vector</code>?</li>
<li>and so on.</li>
</ul>
<p>All of them are strictly <code>Stream<Item = VectorDiff<T>></code>! However, the
<code>VectorDiff</code>s aren't the same. A simple example. Let's say we build a vector by
inserting <code>1</code>, <code>2</code>, <code>3</code> and <code>4</code>. We subscribe to it, and we want to filter out
all the even numbers. Instead of receiving:</p>
<ul>
<li><code>VectorDiff::Insert { index: 0, value: 1 }</code>,</li>
<li><code>VectorDiff::Insert { index: 1, value: 2 }</code>,</li>
<li><code>VectorDiff::Insert { index: 2, value: 3 }</code>,</li>
<li><code>VectorDiff::Insert { index: 3, value: 4 }</code>.</li>
</ul>
<p>… we want to receive:</p>
<ul>
<li><code>VectorDiff::Insert { index: 0, value: 1 }</code>,</li>
<li><code>VectorDiff::Insert { index: 1, value: 3 }</code>: note the <code>index</code>, it is not 2
but 1!</li>
</ul>
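<p>A naive sketch of this index adjustment, assuming we only ever see insertions: the
<code>ToyInsert</code> struct and the <code>filter_inserts</code> function are hypothetical names
invented here, and this is not how the SDK's filter is implemented. The filter remembers,
for each position of the unfiltered vector, whether the value was kept; the filtered
index is then the number of kept values located before the insertion point:</p>

```rust
/// Simplified, hypothetical stand-in for `VectorDiff::Insert`.
#[derive(Debug, Clone, PartialEq)]
struct ToyInsert {
    index: usize,
    value: u32,
}

/// Keep only the insertions whose value satisfies `keep`, and rewrite
/// their indices so that they are valid in the filtered vector.
fn filter_inserts(inserts: &[ToyInsert], keep: impl Fn(u32) -> bool) -> Vec<ToyInsert> {
    // One flag per value of the unfiltered vector: was it kept?
    let mut kept: Vec<bool> = Vec::new();
    let mut output = Vec::new();

    for insert in inserts {
        let is_kept = keep(insert.value);
        // Index in the filtered vector: the number of kept values located
        // before `insert.index` in the unfiltered vector.
        let filtered_index = kept[..insert.index].iter().filter(|flag| **flag).count();
        kept.insert(insert.index, is_kept);

        if is_kept {
            output.push(ToyInsert { index: filtered_index, value: insert.value });
        }
    }

    output
}

fn main() {
    // Insert 1, 2, 3 and 4 at indices 0, 1, 2 and 3.
    let inserts: Vec<ToyInsert> = (1..=4u32)
        .enumerate()
        .map(|(index, value)| ToyInsert { index, value })
        .collect();

    // Filter out the even numbers.
    let filtered = filter_inserts(&inserts, |value| value % 2 == 1);

    assert_eq!(
        filtered,
        vec![
            ToyInsert { index: 0, value: 1 },
            ToyInsert { index: 1, value: 3 }, // index 1, not 2
        ]
    );
}
```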
<p>We will see how all that works in the next episodes and how powerful this design
is, especially when it comes to cross-platform UI (user interface). We are going
to learn so much about <code>Stream</code> and <code>Future</code>, it's going to be fun!</p>
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn-spes_salutis">
<p>Latin expression meaning <em>hope of salvation</em>. <a href="#fr-spes_salutis-1">↩</a></p>
</li>
<li id="fn-beati_pauperes_in_spiritu">
<p>Latin expression meaning <em>blessed are the poor in spirit</em>. <a href="#fr-beati_pauperes_in_spiritu-1">↩</a></p>
</li>
<li id="fn-SRUB2015">
<p><cite><a href="https://infoscience.epfl.ch/server/api/core/bitstreams/7c8b929f-1f68-4948-8ea8-e364e4899b2a/content">Relaxed-Radix-Balanced
(RRB) Vector: A Practical General Purpose Immutable
Sequence</a></cite> by Stucki N., Rompf T., Ureche V. and Bagwell P. (2015,
August), in <i>Proceedings of the 20th ACM SIGPLAN International Conference
on Functional Programming (pp. 342-354).</i> <a href="#fr-SRUB2015-1">↩</a></p>
</li>
<li id="fn-UCR2014">
<p><cite><a href="http://deepsea.inria.fr/pasl/chunkedseq.pdf">Theory
and Practice of Chunked Sequences</a></cite> by Acar U. A., Charguéraud
A., and Rainey M. (2014), in <i>Algorithms-ESA 2014: 22th Annual European
Symposium, Wroclaw, Poland, September 8-10, 2014. Proceedings 21 (pp.
25-36).</i>, Springer Berlin Heidelberg. <a href="#fr-UCR2014-1">↩</a></p>
</li>
</ol>
</section>
I've loved Wasmer, I still love Wasmer2021-10-04T00:00:00+00:002021-10-04T00:00:00+00:00
Unknown
https://mnt.io/articles/i-leave-wasmer/<p>This article could also have been titled <em>How I failed to change
Wasmer</em>.</p>
<p>Today is my last day at <a rel="noopener external" target="_blank" href="https://wasmer.io/">Wasmer</a>. For those who
don't know this name, it has a twofold meaning: it's a <a rel="noopener external" target="_blank" href="http://github.com/wasmerio/wasmer">very popular
WebAssembly runtime</a>, as well as a
startup. I want to write about what I've been able to accomplish during
my time at Wasmer (a high-level overview, not a technical one), and what
<em>forces</em> me to leave the company despite being one of its co-founders. I
reckon my testimony can help other people avoid the hell
I (and my colleagues) had to endure. I'm available for work, you can
contact me at <a href="mailto:[email protected]">[email protected]</a>,
<a rel="noopener external" target="_blank" href="https://twitter.com/mnt_io">@mnt_io</a>,
<a rel="noopener external" target="_blank" href="https://www.linkedin.com/in/ivan-enderlin/">ivan-enderlin</a> (LinkedIn).</p>
<h2 id="from-nothing-to-pure-awesomeness">From nothing to pure awesomeness<a role="presentation" class="anchor" href="#from-nothing-to-pure-awesomeness" title="Anchor link to this header">#</a>
</h2>
<p>I joined Wasmer at its very beginning, in March 2019.
The company was 3 months old. My initial role was to write and to
improve the runtime itself, and to create many embeddings, i.e. ways to
integrate the Wasmer runtime inside various technologies, so that
WebAssembly can run everywhere.</p>
<p>I can say with confidence that my work is a success. I've learned a lot,
worked on so many different projects and technologies, hacked on so
many things, and collaborated with so many people. Every action was driven by
<strong>passion</strong>.</p>
<p>At the time of writing, Wasmer is growing incredibly fast. In only 2.5
years, the runtime has gained more than 10'500 stars on GitHub, and is <strong>one of
the most popular WebAssembly runtimes in the world</strong>! It's used by many
companies, such as <a rel="noopener external" target="_blank" href="https://confio.tech/">Confio</a>, <a rel="noopener external" target="_blank" href="https://fluence.network/">Fluence
Labs</a>, <a rel="noopener external" target="_blank" href="https://hotg.dev/">HOT-G</a>,
<a rel="noopener external" target="_blank" href="https://brave.com/">Brave</a>, <a rel="noopener external" target="_blank" href="https://google.com/">Google</a>,
<a rel="noopener external" target="_blank" href="https://www.apple.com/">Apple</a>, <a rel="noopener external" target="_blank" href="https://spacemesh.io/">SpaceMesh</a>,
<a rel="noopener external" target="_blank" href="https://linkerd.io/">Linkerd</a>,
<a rel="noopener external" target="_blank" href="https://www.singlestore.com/">SingleStore</a>,
<a rel="noopener external" target="_blank" href="https://www.clever-cloud.com/">CleverCloud</a> or
<a rel="noopener external" target="_blank" href="https://konghq.com/">Kong</a> to name a few (at least the ones I can
name; other companies are also using Wasmer in very critical
environments).</p>
<p>Most of my engineering work happened on the Wasmer runtime itself. At the
time of writing, I'm the #2 contributor to the project. I worked
on <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer/tree/f9ff574e10d4ee97f836565bdae99035e04ac879/lib">every part of the
runtime</a>:
the API, the C API, the compilers, the ABI (mostly WASI), the engines,
the middlewares, and the VM itself, which is the lowest-level, most
fundamental layer of the runtime.</p>
<p>The runtime provides so many features. It is an impressively powerful
runtime for WebAssembly, and I'm saying that with a neutral and
respectful mindset. Not everything is perfect, obviously, but I did my
best to set up a truly user-friendly learning environment, with
extensive documentation and <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer/tree/f9ff574e10d4ee97f836565bdae99035e04ac879/examples">a collection of
examples</a>
that illustrate many features. I strongly believe it contributed to
Wasmer's popularity to a great extent.</p>
<p>I would like to highlight the most notable embedding projects I've
created:</p>
<ul>
<li><a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer/tree/master/lib/c-api"><code>wasmer-c-api</code></a> is
the C embedding for Wasmer. It's part of the Wasmer runtime itself, and is
fully written in Rust.
<a rel="noopener external" target="_blank" href="https://docs.rs/wasmer-c-api/*/wasmer_c_api/wasm_c_api/index.html">The documentation, the C
examples</a>,
everything is super polished to offer the best experience possible. <a rel="noopener external" target="_blank" href="https://github.com/MarkMcCaskey">Mark
McCaskey</a> and I are the authors of this
project.</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer-python"><code>wasmer-python</code></a> is the Python
embedding for Wasmer. At the time of writing, it has been installed more than 5
million times (I'm counting the compiler packages too, like
<code>wasmer-compiler-cranelift</code> and so on), and has 1300 stars on GitHub. There are
about 300'000 downloads per month, and it continues to grow! The code is
written in Rust, and it relies on
<a rel="noopener external" target="_blank" href="https://pyo3.rs/">the awesome <code>pyo3</code> project</a>.</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer-go/"><code>wasmer-go</code></a> is the Go embedding for
Wasmer. It's hard to know how many total downloads we have because of how the
Go ecosystem is designed, but we have about 60'000 downloads per month from
GitHub (I'm excluding the forks of the project), and 1600 stars on GitHub. The
code is written in Go and uses <a rel="noopener external" target="_blank" href="https://golang.org/cmd/cgo/"><code>cgo</code></a> to bind
against the C API. Almost all blockchain projects that use WebAssembly are
using <code>wasmer-go</code>, a popularity I wasn't expecting.</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer-ruby/"><code>wasmer-ruby</code></a> is the Ruby
embedding for Wasmer. It's not as popular as the others, but it's also very
polished and it's finding its place in the Ruby ecosystem. The code is written
in Rust, and it relies on
<a rel="noopener external" target="_blank" href="https://github.com/danielpclark/rutie">the awesome <code>rutie</code> project</a>.</li>
<li>I won't detail all the projects, but there are also
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer-php"><code>wasmer-php</code></a>,
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer-java"><code>wasmer-java</code></a>,
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer-postgres"><code>wasmer-postgres</code></a>… Because of
the Wasmer runtime API and C API we have designed, many developers around the
globe have been able to create a lot more embeddings, such as in
<a rel="noopener external" target="_blank" href="https://github.com/migueldeicaza/WasmerSharp">C#</a>,
<a rel="noopener external" target="_blank" href="https://github.com/chances/wasmer-d">D</a>,
<a rel="noopener external" target="_blank" href="https://github.com/tessi/wasmex">Elixir</a>,
<a rel="noopener external" target="_blank" href="https://github.com/dirkschumacher/wasmr">R</a>,
<a rel="noopener external" target="_blank" href="https://github.com/AlwaysRightInstitute/SwiftyWasmer">Swift</a>,
<a rel="noopener external" target="_blank" href="https://github.com/zigwasm/wasmer-zig">Zig</a>,
<a rel="noopener external" target="_blank" href="https://github.com/dart-lang/wasm">Dart</a>,
<a rel="noopener external" target="_blank" href="https://github.com/helmutkian/cl-wasm-runtime">Lisp</a> and so on.</li>
</ul>
<p>Other fun notable projects are:</p>
<ul>
<li><a rel="noopener external" target="_blank" href="https://github.com/wasmerio/sonde-rs"><code>sonde-rs</code></a>, a library to
compile USDT probes into a Rust library,</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/wasmerio/llvm-custom-builds"><code>llvm-custom-builds</code></a>, a
sandbox to produce custom LLVM builds for various platforms,</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/wasmerio/loupe"><code>loupe</code></a>, a set of tools to
analyse and to profile Rust code,</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/wasmerio/interface-types"><code>wasmer-interface-types</code></a>, a
“living” (read: an unstable playground) library that implements
<a rel="noopener external" target="_blank" href="https://github.com/WebAssembly/interface-types">the WebAssembly Interface Types proposal</a>,</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/Hywan/inline-c-rs/"><code>inline-c-rs</code></a>, to write and
to execute C code inside Rust,</li>
<li>in-memory filesystem, that acts exactly like <code>std::fs</code>.</li>
</ul>
<p>As you might think, I've learned so much. The impostor syndrome was very
present because I was constantly trying to do something I didn't know how to do.
It's part of the routine at Wasmer: trying something for the first time.
But it's also what kept me motivated, and it was the energy for my
passion.</p>
<p>The list above shows released projects, but I've also experimented (and
sometimes came within a hair's breadth of a release) with:</p>
<ul>
<li><a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Unikernel">Unikernels</a>; this one was really
fun, given a WebAssembly module and a filesystem, we were able to generate a
unikernel that was executing the given program,</li>
<li>Parser; to write the fastest WebAssembly parser possible, it was working, but
never released,</li>
<li>HAL (Hardware Abstraction Layer) ABI for WebAssembly, so that we can run
WebAssembly on small chips super easily (think of IoT),</li>
<li>Networking; an extension to WASI to support networking (TCP and UDP sockets),
with an implementation in <a rel="noopener external" target="_blank" href="https://www.rust-lang.org/">Rust</a>,
<a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/C_standard_library">libc</a>, and even
<a rel="noopener external" target="_blank" href="https://github.com/ziglang/zig/">Zig</a>! We were able to compile C programs to
WebAssembly, like cURL or TCP servers written with kqueue or epoll, and to
execute them on any platform.</li>
</ul>
<p>All those things were working.</p>
<p>It's absolutely crazy what WebAssembly can do today, and I still truly
and deeply believe in this technology. I'm not the only one:
<a rel="noopener external" target="_blank" href="https://www.ycombinator.com/companies/wasmer">YCombinator</a> and
<a rel="noopener external" target="_blank" href="https://medium.com/speedinvest/the-next-generation-of-cloud-computing-investing-in-wasmer-768c9aac5922">SpeedInvest</a>
are also funders that believe in Wasmer.</p>
<p>So. What a dream, huh?</p>
<h2 id="the-toxic-working-environment">The toxic working environment<a role="presentation" class="anchor" href="#the-toxic-working-environment" title="Anchor link to this header">#</a>
</h2>
<p>WebAssembly is <em>nothing</em> without its community. I won't name people to
avoid missing important persons, but all the contributors are doing
amazing work, to create something new, something special, something
<em>right</em>.</p>
<p>Wasmer is a success. The Wasmer runtime is <em>nothing</em> without the
incredible, marvelous, exceptional team of engineers behind it. In no
particular order: <a rel="noopener external" target="_blank" href="https://github.com/lachlansneff">Lachlan Sneff</a>,
<a rel="noopener external" target="_blank" href="https://github.com/MarkMcCaskey">Mark McCaskey</a>, <a rel="noopener external" target="_blank" href="https://github.com/jubianchi">Julien
Bianchi</a>, <a rel="noopener external" target="_blank" href="https://github.com/nlewycky">Nick
Lewycky</a>, <a rel="noopener external" target="_blank" href="https://github.com/losfair">Heyang
Zhou</a>, <a rel="noopener external" target="_blank" href="https://github.com/xmclark">Mackenzie
Clark</a>, <a rel="noopener external" target="_blank" href="https://github.com/bjfish">Brandon
Fish</a>. All of them, with no exception, have
put a lot of passion in this project. It is what it is today because of
them and also because of the contributors we have been honored to
welcome. The open source side of Wasmer was intense but also an
important source of joy. It is a respectful place to work.</p>
<p>However, the reality on the inside was very different. All the employees
listed above have left the company, almost all of them due to burn-out,
conflicts, or strong disagreements with the company leadership. I am
leaving due to a severe burn-out. I would like to briefly share my
journey to my first burn-out in a few points:</p>
<ul>
<li>I started as an engineer. I love coding. I love hacking. At Wasmer, I
found a place to learn a lot and to express my passion. We had a lot of
pressure mostly because our friendly “competitors” had more people dedicated
to work on their runtimes, more money, more power, better marketing and so on.
That's the inevitable burden of any startup. When you're competing against
giants, that's what happens. And that's OK. It's part of the game.</li>
<li>During that time, we were delivering more and more projects, more and more
features, at an incredible pace. New hats: Release manager, project manager,
more product ownership, more customers to be connected with, more contributors
to help, more issues to triage, blog writer etc. The pace was accelerating too
fast, something we did notice on multiple occasions.</li>
<li>The CEO, <a rel="noopener external" target="_blank" href="https://github.com/syrusakbary">Syrus Akbary</a>, evidently had a lot
of pressure on his shoulders. It sadly came out in the worst possible way:
micro-management, stress, pressure, bad leadership, lack of vision, lack of
trust in the employees, changing the roadmap constantly, lies, secrets etc.</li>
<li>As one of the oldest people in the company, with a family of two kids, I probably had
more “wisdom”. I decided to create a safe place for employees to express
their frustrations and their needs, and to find solutions together. <em>De facto</em>, I
became the “person of trust” in the company. I got new hats, new pressures.</li>
<li>SARS-CoV-2 hit. School at home. Lock-down. More micro-management, more stress.
Wasmer was running out of money. I brought a new investor that saved the
company. New hat.</li>
<li>After too many departures (85% of the engineering team!), I tried to take more
space and to take on more responsibilities in the company. That was at the
beginning of 2021. It was my last attempt to save the company from a disaster
before leaving. <strong>I couldn't imagine leaving such brilliant and successful
projects without having tried everything I could</strong>.</li>
<li>Then <strong>I became a <em>late co-founder</em> of Wasmer</strong>. Too many new hats: Doing
hiring interviews, accountabilities, helping to define the roadmap (with
another awesome person, friend, and employee), handling legal aspects to hire
people in multiple countries with non-precarious contracts etc.</li>
<li>Obviously, I was also doing the job of all the engineers who had left. They
were not replaced, for unknown reasons. It was absolute madness. The pace was
unsustainable.</li>
<li>Finally, the crack. The CEO continued to change the roadmap, to make bad
decisions, and to not recognize all the efforts we were making to save/grow the
company. It was my turn to be diagnosed with a <em>severe</em> burn-out by my physician.
The last engineer to fall.</li>
</ul>
<p>Another point: Syrus Akbary has also made many public errors that have
created hostility against the company. Thankfully, people were smart and
kind enough to distinguish between the employees and the CEO (I
won't name the people but they will recognize themselves: Thank you). I
tried to fix that situation, talking with dozens of people to restore
empathy and forgiveness, to create better collaboration, to heal and
move forward. It was exhausting. I know people have appreciated what I
did, but my mental health was ruined.</p>
<p>Considering all the time I devoted to the company, the very little
consideration I got in return, the countless work hours (4 days per
week, but frequently closing the computer at 1am due to very late
meetings; I was working like hell), the precarious contract I had (did
you ever see a co-founder with a freelancer contract?), the toxic
working environment, the constant pressure etc., my passion was intact
but my motivation was seriously damaged. Overtime was never
recorded even though it happened all the time, but taking half a day
to take care of a sick child was immediately counted as holiday; the
balance was broken. Criticism. Micro-management. Disapproval.
Rewriting the facts and the reality to criticize what you're doing,
turning things against you, avoiding discussion when things get stormy.
We even had a meeting titled “Why you're not productive enough” whilst
everyone was working like hell, right after the rewrite of the entire
runtime to release Wasmer 1.0, a period we all affectionately called
“The Rewrite of Hell”. The team deserved vacations, congratulations,
attention, gratitude, … not such a shitty meeting. Well, you get the
idea.</p>
<p>When I was diagnosed with a severe burn-out, I had to take a break. The
reaction from the CEO was… unexpected: zero empathy, asking me to never
ever be sick again (otherwise I would be fired), dividing my equity,
asking me to work more, saying I had never been involved in the company
etc. That was the final straw for me. That's <em>the</em> wrong way to treat an
employee, a collaborator, a contributor, a co-founder.</p>
<h2 id="what-s-next">What's next?<a role="presentation" class="anchor" href="#what-s-next" title="Anchor link to this header">#</a>
</h2>
<p>I need to recover. As you can imagine, working 2.5 years at this pace
leaves sequelae. Hopefully a couple of months should suffice.</p>
<p>I'm still in love with Wasmer, the <strong>runtime</strong>, the open source projects
we have created. It has a bright future. More companies are contributing
to it, more individual contributors are bringing their stones to the
monument. The project is owned by the public, by the users, by the
contributors, they are doing most of the work today. It's well tested,
well documented, it's easy to contribute to it. It's a fabulous open
source piece of technology.</p>
<p>I strongly <em>hope</em> Wasmer, the <strong>company</strong>, will change. The products
that are going to be released are absolutely fantastic. It's a
technology shift. It will enable the Decentralized Web, and change how we compile
and distribute programs, and even how we consume them. Wasmer has a
solid, bright future. I really <em>hope</em> things will change. I wish the
best to the adventurers who will continue to move the company forward,
and I am passing the torch on to them. I'm just too <em>skeptical</em> that
things can improve or even slightly change. We have built something
great. Please take great care of it.</p>
<p>As I said, I'm available for a new adventure, you can contact me at
<a href="mailto:[email protected]">[email protected]</a>,
<a rel="noopener external" target="_blank" href="https://twitter.com/mnt_io">@mnt_io</a>,
<a rel="noopener external" target="_blank" href="https://www.linkedin.com/in/ivan-enderlin/">ivan-enderlin</a> (LinkedIn).</p>
<p>Discussions <a rel="noopener external" target="_blank" href="https://twitter.com/mnt_io/status/1445310721185783811">on
Twitter</a> and <a rel="noopener external" target="_blank" href="https://news.ycombinator.com/item?id=28772863">on
HackerNews</a>.</p>
Bye bye WhatsApp, hello ?2021-01-19T00:00:00+00:002021-01-19T00:00:00+00:00
Unknown
https://mnt.io/articles/bye-bye-whatsapp-hello/<p>This post was originally written in French, mainly for my friends
and family. It gives a very quick, plain-language overview of what is at stake with
<a rel="noopener external" target="_blank" href="https://www.whatsapp.com/">WhatsApp</a>,
<a rel="noopener external" target="_blank" href="https://www.signal.org/fr/">Signal</a>, <a rel="noopener external" target="_blank" href="https://telegram.org/">Telegram</a>
and <a rel="noopener external" target="_blank" href="https://element.io/">Matrix</a> (<em>spoiler</em>: Matrix is the winner). Everyone
asks me the same question, so here is a quickly drafted answer
that will save me from copy-pasting!</p>
<p>I am deliberately <em>not</em> going into the details. This
document has to stay accessible to everyone, with no knowledge of networking,
encryption, security, etc. Those who have that knowledge already know
that Matrix is <em>the</em> network to move to ;-).</p>
<h2 id="les-bases">The basics<a role="presentation" class="anchor" href="#les-bases" title="Anchor link to this header">#</a>
</h2>
<p>When we talk about messaging services, 2 things are essential, plus 1 bonus:</p>
<ul>
<li>encryption;</li>
<li>the network topology (it's easy, don't be afraid);</li>
<li>free and unrestricted access to the source code (<em>open
source</em>).</li>
</ul>
<p>We can also quickly talk about the network's business model;
see the comparison table.</p>
<h3 id="le-chiffrement">Encryption<a role="presentation" class="anchor" href="#le-chiffrement" title="Anchor link to this header">#</a>
</h3>
<p>To respect privacy and to prevent spying and data theft,
encryption must be <em>end-to-end</em>.
That means only you have the key to encrypt
and/or decrypt your messages, and nobody else. By message,
I mean text, audio, image, video, audio and video calls,
everything. Your data is yours, and yours alone, and nobody can
use it, except the person you share it with (who
normally has a decryption key, for example). Keys are
also used to identify the person you are talking to, which helps prevent
identity theft.</p>
<h3 id="la-topologie">Topology<a role="presentation" class="anchor" href="#la-topologie" title="Anchor link to this header">#</a>
</h3>
<p>Most networks are centralized: that means there is one big
silo, a huge computer/server, and everyone is on it. This
causes plenty of problems:</p>
<ul>
<li>impossible to have any control over it;</li>
<li>impossible to trust it;</li>
<li>single point of failure.</li>
</ul>
<p>I'll take WhatsApp as an example to illustrate all this because
everyone can relate to it: Facebook unilaterally decides to decrypt the
network, nobody has any control over that, it is a serious violation of the
privacy of billions of people, and there is nothing we can do about it (apart
from leaving the network). Did we trust what Facebook was doing
with our WhatsApp data before? No, not at all. They said
it was encrypted, but was it really? I place more trust in
the people who created and encrypted the network before it was bought by
Facebook, so I want to believe it, but… <em>I cannot verify it</em>!
Why? Because nobody (apart from a few Facebook
employees) has access to the source code, to the programs, that run
WhatsApp. As for the <em>single point of failure</em> side: if Facebook is attacked,
the entire network collapses.
Likewise, if the
network is <em>hacked</em>, the attacker gets unlimited access to the whole network.</p>
<blockquote>
<p>No transparency = no trust.</p>
</blockquote>
<p>But there is a major alternative, of course! Decentralized
networks. Instead of one server, there are hundreds, even
thousands of them. The network can no longer be controlled. There is no <em>single
point of failure</em> any more. At worst, a <em>hacker</em> can only access the data
of a single server, not of all servers (there are plenty
of exceptions, but I'm simplifying here). We can create as many
servers as we want. These networks are often open
source, so we can read the code of the programs and verify that they
really do what they claim to do.</p>
<h2 id="tableau-comparatif">Comparison table<a role="presentation" class="anchor" href="#tableau-comparatif" title="Anchor link to this header">#</a>
</h2>
<p>Let's compare the popular services against these basic criteria.</p>
<figure>
<table>
<tbody>
<tr>
<td><strong>Service</strong></td>
<td><strong>Encryption</strong></td>
<td><strong>Topology</strong></td>
<td><strong>Open source</strong></td>
</tr>
<tr>
<td><strong>WhatsApp</strong></td>
<td>end-to-end (for now)</td>
<td>centralized (US)</td>
<td>no</td>
</tr>
<tr>
<td><strong>Telegram</strong></td>
<td>end-to-end (for now)</td>
<td>centralized (Dubai, US)</td>
<td>no</td>
</tr>
<tr>
<td><strong>Signal</strong></td>
<td>end-to-end</td>
<td>centralized (US)</td>
<td>yes, but…</td>
</tr>
<tr>
<td><strong>Matrix</strong></td>
<td>end-to-end</td>
<td>decentralized</td>
<td>yes</td>
</tr>
</tbody>
</table>
<figcaption>
<p>Comparing the basics!</p>
</figcaption>
</figure>
<p>Signal is open source, but we cannot verify what is
installed on the servers, because the server is private. Moreover, the
open source server has not been <a rel="noopener external" target="_blank" href="https://github.com/signalapp/Signal-Server">updated since April
2020</a>, which is a very long time in
computing years. Is that hiding something? No idea; I
cannot know, because I have nothing to base a
decision on. Do I want to put my private data on a service
I do not trust?</p>
<p>On top of that, Signal and WhatsApp are both hosted/based in the US, with
its well-known liberty-restricting laws (such as the Cloud Act). Signal
limits the damage thanks to end-to-end encryption, but maybe
a <em>backdoor</em> is present and we will never know.</p>
<blockquote>
<p>No transparency = no trust</p>
</blockquote>
<p>Decentralized networks are superior at every level (no
control, no massive hack, etc.). Open source networks are the ones
we can trust. So the choice is quickly made: the
winner here is Matrix.</p>
<p>Now let's compare how the services are funded, because it
matters. If a service is not profitable, it could develop an
appetite for its users' data, and that is dangerous
(this is exactly what is happening with Facebook and WhatsApp).</p>
<figure>
<table>
<tbody>
<tr>
<td><strong>Service</strong></td>
<td><strong>Revenue</strong></td>
</tr>
<tr>
<td><strong>WhatsApp</strong></td>
<td>Facebook wants to use private data to sell targeted advertising.</td>
</tr>
<tr>
<td><strong>Telegram</strong></td>
<td>The founders are millionaires and inject money.<br>Soon, funding through ads and premium accounts.</td>
</tr>
<tr>
<td><strong>Signal</strong></td>
<td>Non-profit organization that operates on donations.</td>
</tr>
<tr>
<td><strong>Matrix</strong></td>
<td>Matrix develops, offers or sells services around the network, but not around the data!</td>
</tr>
</tbody>
</table>
<figcaption>
<p>How are the services funded?</p>
</figcaption>
</figure>
<p>The winners here are Signal and Matrix.</p>
<h2 id="conclusion-matrix-gagnant">Conclusion: Matrix wins<a role="presentation" class="anchor" href="#conclusion-matrix-gagnant" title="Anchor link to this header">#</a>
</h2>
<p>Among centralized networks, Signal is a better
alternative to WhatsApp and Telegram because of how it is funded (and therefore
its appetite for user data), but they all suffer
from the same problems: no trust because no transparency,
hosted in the US, etc.</p>
<p>But decentralized networks are superior because they solve all
these problems! Matrix is decentralized and is funded by services
around the network, not by the network's data (which is
inaccessible anyway: it only exists on your phones and
computers, nowhere else).</p>
<p>I use Matrix. I recommend that you use Matrix. Moving to
Signal means getting out of one wolf's jaws only to walk into
another's. I admire the work of the Signal developers; they
are really good and their encryption protocol is magnificent, but
I do not trust their service because I <em>cannot</em>. And
nobody <em>can</em>.</p>
<p>I also use WhatsApp and Signal to stay in touch with my friends
and family, and to tell them to use Matrix, but I will never publish
personal data, photos or anything else there; I have
no trust in them. You are also free to use several networks. After
all, we already juggle several networks (email, SMS, WhatsApp,
Matrix, Twitter, <a rel="noopener external" target="_blank" href="https://mastodon.social/about">Mastodon</a>, etc.), so that
is not a problem!</p>
<h2 id="premier-pas-avec-matrix">First steps with Matrix<a role="presentation" class="anchor" href="#premier-pas-avec-matrix" title="Anchor link to this header">#</a>
</h2>
<p>Here we go, a small Matrix tutorial. The network is exceptional, but the
official client (<a rel="noopener external" target="_blank" href="https://element.io/">Element</a>) is still a bit
“rough” to use compared to Signal or WhatsApp. Note that it is evolving very,
very quickly (I count 616 contributors working on it
as volunteers, another great strength of open source!).</p>
<p>What will bother you the most is that you cannot always
identify your contacts by phone number (only if they are
registered on an identity server). Why? Because your
account has an identifier, like an email address. Mine is
<code>@mnt_io:matrix.org</code> (the format is <code>@identifier:server</code>). That is much
better for privacy. Besides, it is no different from MSN or
any other network of that era; it is really WhatsApp that imposed
“discoverability” by phone number. Although very convenient,
it is dangerous for privacy.</p>
<p>So, let's go, install the client:</p>
<ul>
<li>on <a rel="noopener external" target="_blank" href="https://apps.apple.com/us/app/element-messenger/id1083446067">iOS, macOS
etc</a>.,</li>
<li>on
<a rel="noopener external" target="_blank" href="https://play.google.com/store/apps/details?id=im.vector.app&hl=en_US&gl=US">Android</a>,</li>
<li>on your <a rel="noopener external" target="_blank" href="https://element.io/get-started">desktop or your
browser</a>.</li>
</ul>
<p>Then create an account, and add me. Off we go!</p>
<p>Matrix/Element is based on groups. “Direct” chats (1:1)
are groups too. You can even join <em>rooms</em> (large
groups, communities) with hundreds or even thousands of
people in them. It is very flexible.</p>
<p>Because it is open source, anyone can write their own client
(a program that connects to the network). There are alternative
clients, such as <a rel="noopener external" target="_blank" href="https://nio.chat/">Nio</a> or
<a rel="noopener external" target="_blank" href="https://fluffychat.im/en/">FluffyChat</a>, or even <a rel="noopener external" target="_blank" href="https://matrix.org/docs/projects/client/ditto-chat">Ditto
Chat</a>. All these
clients are still in beta, but they point to a very exciting future for
Matrix, with more and more polished clients!</p>
<h3 id="matrix-element-vector-hein">Matrix, Element, Vector, huh?<a role="presentation" class="anchor" href="#matrix-element-vector-hein" title="Anchor link to this header">#</a>
</h3>
<ul>
<li>Element is the name of the company that works on and develops the network, the
servers and the client;</li>
<li>Matrix is the name of the network;</li>
<li>Vector is the former name of Element.</li>
</ul>
<p>People often say Matrix or Element interchangeably, which is a
misnomer.</p>
<blockquote>
<p>Before: Mark as read.</p>
<p>Now: Mark has read.</p>
</blockquote>
<p>And for pity's sake, leave Facebook…</p>
Announcing the first Java library to run WebAssembly: Wasmer JNI2020-05-13T00:00:00+00:002020-05-13T00:00:00+00:00
Unknown
https://mnt.io/articles/announcing-the-first-java-library-to-run-webassembly-wasmer-jni/<p><em>This is a copy of <a rel="noopener external" target="_blank" href="https://medium.com/wasmer/announcing-the-first-java-library-to-run-webassembly-wasmer-jni-89e319d2ac7c">an article I wrote for
Wasmer</a>.</em></p>
<hr />
<p><a rel="noopener external" target="_blank" href="https://webassembly.org/">WebAssembly</a> is a portable binary format.
That means the same file can run anywhere.</p>
<blockquote>
<p>To uphold this bold statement, each language, platform and system must
be able to run WebAssembly — as fast and safely as possible.</p>
</blockquote>
<p>People who are familiar with Wasmer are used to this kind of
announcement! Wasmer is written in Rust, and comes with an additional
native C API. But you can use it in a lot of other languages. After
having announced libraries to use Wasmer, and thus WebAssembly, in:</p>
<ul>
<li><a rel="noopener external" target="_blank" href="https://github.com/wasmerio/php-ext-wasm"><strong>PHP</strong> with the <code>ext/wasm</code>
extension</a>,</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/wasmerio/python-ext-wasm"><strong>Python</strong> with the <code>wasmer</code>
library</a>,</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/wasmerio/ruby-ext-wasm"><strong>Ruby</strong> with the <code>wasmer</code>
library</a>,</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/wasmerio/go-ext-wasm"><strong>Go</strong> with the <code>wasmer</code> library</a>
(see
<a rel="noopener external" target="_blank" href="https://medium.com/wasmer/announcing-the-fastest-webassembly-runtime-for-go-wasmer-19832d77c050">Announcing the fastest WebAssembly runtime for Go: <code>wasmer</code></a>),
and even</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/wasmerio/postgres-ext-wasm"><strong>Postgres</strong> with the <code>wasmer</code> library</a>
(see
<a rel="noopener external" target="_blank" href="https://medium.com/wasmer/announcing-the-first-postgres-extension-to-run-webassembly-561af2cfcb1">Announcing the first Postgres extension to run WebAssembly</a>),</li>
<li>and many other contributions in
<a rel="noopener external" target="_blank" href="https://github.com/migueldeicaza/WasmerSharp">.NET/C#</a>,
<a rel="noopener external" target="_blank" href="https://github.com/dirkschumacher/wasmr">R</a> and
<a rel="noopener external" target="_blank" href="https://github.com/tessi/wasmex">Elixir</a>…</li>
</ul>
<p>…we are jazzed to announce that <strong><a rel="noopener external" target="_blank" href="https://github.com/wasmerio/java-ext-wasm">Wasmer has now landed in
Java</a></strong>!</p>
<p>Let’s discover the Wasmer JNI library together.</p>
<h2 id="installation">Installation<a role="presentation" class="anchor" href="#installation" title="Anchor link to this header">#</a>
</h2>
<p>The Wasmer JNI (<em>Java Native Interface</em>) library is based on the <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer">Wasmer
runtime</a>, which is written in
<a rel="noopener external" target="_blank" href="https://www.rust-lang.org/">Rust</a>, and is compiled to a shared library.
For your convenience, we produce one JAR (<em>Java Archive</em>) per
architecture and platform. For now, the following are supported,
consistently tested, and pre-packaged (available in
<a rel="noopener external" target="_blank" href="https://bintray.com/wasmer/wasmer-jni/wasmer-jni">Bintray</a> and <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/java-ext-wasm/releases">GitHub
Releases</a>):</p>
<ul>
<li>
<p><code>amd64-darwin</code> for macOS, x86 64 bits,</p>
</li>
<li>
<p><code>amd64-linux</code> for Linux, x86 64 bits,</p>
</li>
<li>
<p><code>amd64-windows</code> for Windows, x86 64 bits.</p>
</li>
</ul>
<p>More architectures and more platforms will be added in the near future.
If you need a specific one, <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/java-ext-wasm/issues/new?assignees=&labels=%F0%9F%8E%89+enhancement&template=---feature-request.md&title=">feel free to
ask</a>!
In the meantime, it is possible to <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/java-ext-wasm#development">produce your own JAR for your own platform
and
architecture</a>.</p>
<p>The JAR files are named as follows:
<code>wasmer-jni-$(architecture)-$(os)-$(version).jar</code>. Thus, to include
Wasmer JNI as a dependency of your project (assuming you use
<a rel="noopener external" target="_blank" href="http://gradle.org/">Gradle</a>), write for instance:</p>
<pre class="giallo z-code"><code data-lang="plain"><span class="giallo-l"><span>dependencies {</span></span>
<span class="giallo-l"><span> implementation "org.wasmer:wasmer-jni-amd64-linux:0.2.0"</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
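<p>To make the naming scheme concrete, here is a minimal sketch, not part of the
Wasmer JNI library, that builds the expected JAR file name from its three
components. The class <code>WasmerJarName</code> and its helpers are hypothetical
names, and the mapping from the JVM's <code>os.name</code> and <code>os.arch</code>
properties to the labels listed above is an assumption.</p>

```java
import java.util.Locale;

class WasmerJarName {
    // Follows the pattern quoted in the text:
    // wasmer-jni-$(architecture)-$(os)-$(version).jar
    static String jarName(String architecture, String os, String version) {
        return String.format("wasmer-jni-%s-%s-%s.jar", architecture, os, version);
    }

    // Assumed mapping from the `os.name` system property to an OS label.
    // "darwin" is checked before "win" because it contains that substring.
    static String normalizeOs(String osName) {
        String name = osName.toLowerCase(Locale.ROOT);
        if (name.contains("mac") || name.contains("darwin")) return "darwin";
        if (name.contains("win")) return "windows";
        if (name.contains("linux")) return "linux";
        throw new IllegalArgumentException("No pre-packaged JAR for OS: " + osName);
    }

    // Assumed mapping from the `os.arch` system property to an architecture label.
    static String normalizeArch(String osArch) {
        String arch = osArch.toLowerCase(Locale.ROOT);
        if (arch.equals("amd64") || arch.equals("x86_64")) return "amd64";
        throw new IllegalArgumentException("No pre-packaged JAR for architecture: " + osArch);
    }

    public static void main(String[] args) {
        // The combination used in the Gradle snippet above.
        System.out.println(jarName("amd64", "linux", "0.2.0"));

        // The combination matching the current machine, if it is a supported one.
        try {
            String arch = normalizeArch(System.getProperty("os.arch"));
            String os = normalizeOs(System.getProperty("os.name"));
            System.out.println(jarName(arch, os, "0.2.0"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

<p>On a platform that is not in the list, the sketch only reports the missing
combination; building a JAR for it is left to the project's own instructions.</p>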
<p>The JARs are hosted on the Bintray/JCenter repository under the
<a rel="noopener external" target="_blank" href="https://bintray.com/wasmer/wasmer-jni/wasmer-jni"><code>wasmer-jni</code></a>
project. They are also attached to our <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/java-ext-wasm/releases">GitHub releases as
assets</a>.</p>
<h2 id="calling-a-webassembly-function-from-java">Calling a WebAssembly function from Java<a role="presentation" class="anchor" href="#calling-a-webassembly-function-from-java" title="Anchor link to this header">#</a>
</h2>
<p>As usual, let’s start with a simple Rust program that we will compile to
WebAssembly, and then execute from Java.</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>#[no_mangle]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> extern</span><span class="z-keyword"> fn</span><span class="z-entity z-name z-function"> sum</span><span>(</span><span class="z-variable">x</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> i32</span><span>,</span><span class="z-variable"> y</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> i32</span><span>)</span><span class="z-keyword z-operator"> -></span><span class="z-entity z-name"> i32</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> x</span><span class="z-keyword z-operator"> +</span><span class="z-variable"> y</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>After compilation to WebAssembly, we get a file like <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/java-ext-wasm/raw/master/examples/simple.wasm">this
one</a>,
named <code>simple.wasm</code>.</p>
<p>The following Java program executes the <code>sum</code> exported function by
passing <code>5</code> and <code>37</code> as arguments:</p>
<pre class="giallo z-code"><code data-lang="java"><span class="giallo-l"><span class="z-keyword">import</span><span class="z-storage"> org</span><span class="z-punctuation z-separator">.</span><span class="z-storage">wasmer</span><span class="z-punctuation z-separator">.</span><span class="z-storage">Instance</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">import</span><span class="z-storage"> java</span><span class="z-punctuation z-separator">.</span><span class="z-storage">io</span><span class="z-punctuation z-separator">.</span><span class="z-storage">IOException</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword">import</span><span class="z-storage"> java</span><span class="z-punctuation z-separator">.</span><span class="z-storage">nio</span><span class="z-punctuation z-separator">.</span><span class="z-storage">file</span><span class="z-punctuation z-separator">.</span><span class="z-storage">Files</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword">import</span><span class="z-storage"> java</span><span class="z-punctuation z-separator">.</span><span class="z-storage">nio</span><span class="z-punctuation z-separator">.</span><span class="z-storage">file</span><span class="z-punctuation z-separator">.</span><span class="z-storage">Paths</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> SimpleExample</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span class="z-storage"> public static</span><span class="z-storage z-type z-primitive"> void</span><span class="z-entity z-name z-function"> main</span><span>(</span><span class="z-storage z-type">String</span><span>[]</span><span class="z-variable z-parameter"> args</span><span>)</span><span class="z-storage"> throws</span><span class="z-storage z-type"> IOException</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span class="z-comment"> // Read the WebAssembly bytes.</span></span>
<span class="giallo-l"><span class="z-storage z-type z-primitive"> byte</span><span>[]</span><span class="z-variable"> bytes</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> Files</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">readAllBytes</span><span>(</span><span class="z-variable">Paths</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">get</span><span>(</span><span class="z-string">"simple.wasm"</span><span>))</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Instantiate the WebAssembly module.</span></span>
<span class="giallo-l"><span class="z-storage z-type"> Instance</span><span class="z-variable"> instance</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> new</span><span class="z-entity z-name z-function"> Instance</span><span>(bytes)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Get the `sum` exported function, call it by passing 5 and 37, and get the result.</span></span>
<span class="giallo-l"><span class="z-storage z-type"> Integer</span><span class="z-variable"> result</span><span class="z-keyword z-operator"> =</span><span> (Integer)</span><span class="z-variable"> instance</span><span class="z-punctuation z-separator">.</span><span class="z-variable">exports</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">getFunction</span><span>(</span><span class="z-string">"sum"</span><span>)</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">apply</span><span>(</span><span class="z-constant z-numeric">5</span><span class="z-punctuation z-separator">,</span><span class="z-constant z-numeric"> 37</span><span>)[</span><span class="z-constant z-numeric">0</span><span>]</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> assert</span><span> result </span><span class="z-keyword z-operator">==</span><span class="z-constant z-numeric"> 42</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> instance</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">close</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> }</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span></span></code></pre>
<p>Great! We have successfully executed a Rust program, compiled to
WebAssembly, in Java. As you can see, it is pretty straightforward. The
API is very similar to the standard JavaScript API, or the other APIs we
have designed for PHP, Python, Go, Ruby etc.</p>
<p>The assiduous reader might have noticed the <code>[0]</code> in the <code>.apply(5, 37)[0]</code>
pattern. A WebAssembly function can return zero to many values, and in
this case, we are reading the first one.</p>
<blockquote>
<p>Note: Java values passed to WebAssembly exported functions are
automatically downcast to WebAssembly values. Types are inferred at
runtime, and casting is done automatically. Thus, a WebAssembly
function acts like any regular Java function.</p>
</blockquote>
<p>Technically, an exported function is a <em>functional interface</em> as defined
by the Java Language Specification (i.e. it is a
<a rel="noopener external" target="_blank" href="https://docs.oracle.com/javase/8/docs/api/java/lang/FunctionalInterface.html"><code>FunctionalInterface</code></a>).
Thus, it is possible to write the following code where <code>sum</code> is an
actual function (of kind <code>org.wasmer.exports.Function</code>):</p>
<pre class="giallo z-code"><code data-lang="java"><span class="giallo-l"><span class="z-keyword">import</span><span class="z-storage"> org</span><span class="z-punctuation z-separator">.</span><span class="z-storage">wasmer</span><span class="z-punctuation z-separator">.</span><span class="z-storage">Instance</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword">import</span><span class="z-storage"> org</span><span class="z-punctuation z-separator">.</span><span class="z-storage">wasmer</span><span class="z-punctuation z-separator">.</span><span class="z-storage">exports</span><span class="z-punctuation z-separator">.</span><span class="z-storage">Function</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">import</span><span class="z-storage"> java</span><span class="z-punctuation z-separator">.</span><span class="z-storage">io</span><span class="z-punctuation z-separator">.</span><span class="z-storage">IOException</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword">import</span><span class="z-storage"> java</span><span class="z-punctuation z-separator">.</span><span class="z-storage">nio</span><span class="z-punctuation z-separator">.</span><span class="z-storage">file</span><span class="z-punctuation z-separator">.</span><span class="z-storage">Files</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword">import</span><span class="z-storage"> java</span><span class="z-punctuation z-separator">.</span><span class="z-storage">nio</span><span class="z-punctuation z-separator">.</span><span class="z-storage">file</span><span class="z-punctuation z-separator">.</span><span class="z-storage">Paths</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> SimpleExample</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span class="z-storage"> public static</span><span class="z-storage z-type z-primitive"> void</span><span class="z-entity z-name z-function"> main</span><span>(</span><span class="z-storage z-type">String</span><span>[]</span><span class="z-variable z-parameter"> args</span><span>)</span><span class="z-storage"> throws</span><span class="z-storage z-type"> IOException</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span class="z-comment"> // Read the WebAssembly bytes.</span></span>
<span class="giallo-l"><span class="z-storage z-type z-primitive"> byte</span><span>[]</span><span class="z-variable"> bytes</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> Files</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">readAllBytes</span><span>(</span><span class="z-variable">Paths</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">get</span><span>(</span><span class="z-string">"simple.wasm"</span><span>))</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Instantiate the WebAssembly module.</span></span>
<span class="giallo-l"><span class="z-storage z-type"> Instance</span><span class="z-variable"> instance</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> new</span><span class="z-entity z-name z-function"> Instance</span><span>(bytes)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Declare the `sum` function, as a regular Java function.</span></span>
<span class="giallo-l"><span class="z-storage z-type"> Function</span><span class="z-variable"> sum</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> instance</span><span class="z-punctuation z-separator">.</span><span class="z-variable">exports</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">getFunction</span><span>(</span><span class="z-string">"sum"</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Call `sum`.</span></span>
<span class="giallo-l"><span class="z-storage z-type"> Integer</span><span class="z-variable"> result</span><span class="z-keyword z-operator"> =</span><span> (Integer)</span><span class="z-variable"> sum</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">apply</span><span>(</span><span class="z-constant z-numeric">1</span><span class="z-punctuation z-separator">,</span><span class="z-constant z-numeric"> 2</span><span>)[</span><span class="z-constant z-numeric">0</span><span>]</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> assert</span><span> result </span><span class="z-keyword z-operator">==</span><span class="z-constant z-numeric"> 3</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> instance</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">close</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> }</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span></span></code></pre>
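<p>To see why the <code>[0]</code> and the cast are needed, here is a minimal sketch
that runs without the native library. It models an exported function as a
functional interface taking any number of boxed arguments and returning an array
of boxed results. The <code>ExportedFunction</code> interface, the <code>SUM</code>
lambda and the <code>asIntBinary</code> helper are hypothetical stand-ins, not the
actual <code>org.wasmer.exports.Function</code> definition.</p>

```java
import java.util.function.IntBinaryOperator;

class ExportedFunctionSketch {
    // Hypothetical stand-in for an exported function: boxed arguments in,
    // zero to many boxed results out.
    @FunctionalInterface
    interface ExportedFunction {
        Object[] apply(Object... arguments);
    }

    // A plain Java lambda playing the role of the WebAssembly `sum` function.
    static final ExportedFunction SUM = arguments -> {
        int x = (Integer) arguments[0];
        int y = (Integer) arguments[1];
        return new Object[] { x + y };
    };

    // Hypothetical helper: wraps the untyped shape into a typed (int, int) -> int
    // view, so that the `[0]` and the cast are written only once.
    static IntBinaryOperator asIntBinary(ExportedFunction function) {
        return (x, y) -> (Integer) function.apply(x, y)[0];
    }

    public static void main(String[] args) {
        // Untyped call: read the first result and cast it.
        Integer result = (Integer) SUM.apply(5, 37)[0];
        System.out.println(result);

        // Typed view over the same function.
        IntBinaryOperator typedSum = asIntBinary(SUM);
        System.out.println(typedSum.applyAsInt(1, 2));
    }
}
```

<p>A typed wrapper like this one is a design choice: it gives up the flexibility
of multiple return values in exchange for compile-time checked call sites.</p>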
<p>But a WebAssembly module not only exports functions, it also exports
memory.</p>
<h2 id="reading-the-memory">Reading the memory<a role="presentation" class="anchor" href="#reading-the-memory" title="Anchor link to this header">#</a>
</h2>
<p>A WebAssembly instance has one or more linear memories. A linear memory is a
contiguous and byte-addressable range of memory spanning from offset 0 and
extending up to a varying memory size. It is represented by the
<code>org.wasmer.Memory</code> class. Let’s see how to use it. Consider the following Rust program:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>#[no_mangle]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> extern</span><span class="z-keyword"> fn</span><span class="z-entity z-name z-function"> return_hello</span><span>()</span><span class="z-keyword z-operator"> -> *</span><span class="z-storage">const</span><span class="z-entity z-name"> u8</span><span> {</span></span>
<span class="giallo-l"><span class="z-string"> b"Hello, World!</span><span class="z-constant z-character">\0</span><span class="z-string">"</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">as_ptr</span><span>()</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>The <code>return_hello</code> function returns a pointer to the statically
allocated string. The string exists in the linear memory of the
WebAssembly module. It is then possible to read it in Java:</p>
<pre class="giallo z-code"><code data-lang="java"><span class="giallo-l"><span class="z-keyword">import</span><span class="z-storage"> org</span><span class="z-punctuation z-separator">.</span><span class="z-storage">wasmer</span><span class="z-punctuation z-separator">.</span><span class="z-storage">Instance</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword">import</span><span class="z-storage"> org</span><span class="z-punctuation z-separator">.</span><span class="z-storage">wasmer</span><span class="z-punctuation z-separator">.</span><span class="z-storage">Memory</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">import</span><span class="z-storage"> java</span><span class="z-punctuation z-separator">.</span><span class="z-storage">io</span><span class="z-punctuation z-separator">.</span><span class="z-storage">IOException</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword">import</span><span class="z-storage"> java</span><span class="z-punctuation z-separator">.</span><span class="z-storage">nio</span><span class="z-punctuation z-separator">.</span><span class="z-storage">ByteBuffer</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword">import</span><span class="z-storage"> java</span><span class="z-punctuation z-separator">.</span><span class="z-storage">nio</span><span class="z-punctuation z-separator">.</span><span class="z-storage">file</span><span class="z-punctuation z-separator">.</span><span class="z-storage">Files</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword">import</span><span class="z-storage"> java</span><span class="z-punctuation z-separator">.</span><span class="z-storage">nio</span><span class="z-punctuation z-separator">.</span><span class="z-storage">file</span><span class="z-punctuation z-separator">.</span><span class="z-storage">Paths</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> MemoryExample</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span class="z-storage"> public static</span><span class="z-storage z-type z-primitive"> void</span><span class="z-entity z-name z-function"> main</span><span>(</span><span class="z-storage z-type">String</span><span>[]</span><span class="z-variable z-parameter"> args</span><span>)</span><span class="z-storage"> throws</span><span class="z-storage z-type"> IOException</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span class="z-comment"> // Read the WebAssembly bytes.</span></span>
<span class="giallo-l"><span class="z-storage z-type z-primitive"> byte</span><span>[]</span><span class="z-variable"> bytes</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> Files</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">readAllBytes</span><span>(</span><span class="z-variable">Paths</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">get</span><span>(</span><span class="z-string">"memory.wasm"</span><span>))</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Instantiate the WebAssembly module.</span></span>
<span class="giallo-l"><span class="z-storage z-type"> Instance</span><span class="z-variable"> instance</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> new</span><span class="z-entity z-name z-function"> Instance</span><span>(bytes)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Get a pointer to the statically allocated string returned by `return_hello`.</span></span>
<span class="giallo-l"><span class="z-storage z-type"> Integer</span><span class="z-variable"> pointer</span><span class="z-keyword z-operator"> =</span><span> (Integer)</span><span class="z-variable"> instance</span><span class="z-punctuation z-separator">.</span><span class="z-variable">exports</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">getFunction</span><span>(</span><span class="z-string">"return_hello"</span><span>)</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">apply</span><span>()[</span><span class="z-constant z-numeric">0</span><span>]</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Get the exported memory named `memory`.</span></span>
<span class="giallo-l"><span class="z-storage z-type"> Memory</span><span class="z-variable"> memory</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> instance</span><span class="z-punctuation z-separator">.</span><span class="z-variable">exports</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">getMemory</span><span>(</span><span class="z-string">"memory"</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Get a direct byte buffer view of the WebAssembly memory.</span></span>
<span class="giallo-l"><span class="z-storage z-type"> ByteBuffer</span><span class="z-variable"> memoryBuffer</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> memory</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">buffer</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Prepare the byte array that will hold the data.</span></span>
<span class="giallo-l"><span class="z-storage z-type z-primitive"> byte</span><span>[]</span><span class="z-variable"> data</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> new</span><span class="z-storage z-type z-primitive"> byte</span><span>[</span><span class="z-constant z-numeric">13</span><span>]</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Let's position the cursor, and…</span></span>
<span class="giallo-l"><span class="z-variable"> memoryBuffer</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">position</span><span>(pointer)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // … read!</span></span>
<span class="giallo-l"><span class="z-variable"> memoryBuffer</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">get</span><span>(data)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Let's encode back to a Java string.</span></span>
<span class="giallo-l"><span class="z-storage z-type"> String</span><span class="z-variable"> result</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> new</span><span class="z-entity z-name z-function"> String</span><span>(data)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Hello!</span></span>
<span class="giallo-l"><span class="z-keyword"> assert</span><span class="z-variable"> result</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">equals</span><span>(</span><span class="z-string">"Hello, World!"</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> instance</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">close</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> }</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span></span></code></pre>
<p>As we can see, the <code>Memory</code> API provides a <code>buffer</code> method. It returns a
<a rel="noopener external" target="_blank" href="https://docs.oracle.com/javase/8/docs/api/java/nio/ByteBuffer.html"><em>direct</em> byte
buffer</a>
(of kind <code>java.nio.ByteBuffer</code>) view of the memory. It’s a standard API,
familiar to any Java developer. We think it’s best not to reinvent the wheel and
to use standard APIs as much as possible.</p>
<p>The WebAssembly memory is dissociated from the JVM memory, and thus from
the garbage collector.</p>
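<p>The example above hard-codes the length of the string (13 bytes). Since the
Rust string ends with a <code>\0</code> byte, the length can also be discovered by
scanning the buffer. Here is a minimal sketch using only the standard
<code>java.nio.ByteBuffer</code> API; the <code>CStringReader</code> class is a
hypothetical helper, and a plain direct buffer stands in for the one returned by
<code>memory.buffer()</code>.</p>

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class CStringReader {
    // Reads bytes starting at `pointer` until the first NUL byte (or the end of
    // the buffer), and decodes them as UTF-8.
    static String readCString(ByteBuffer memory, int pointer) {
        // Work on a duplicate so the caller's position and limit stay untouched.
        // `clear()` resets the duplicate's position and limit; it does not erase data.
        ByteBuffer view = memory.duplicate();
        view.clear();
        view.position(pointer);

        // First pass: count the bytes before the terminator.
        int length = 0;
        while (view.hasRemaining() && view.get() != 0) {
            length++;
        }

        // Second pass: copy exactly that many bytes.
        byte[] data = new byte[length];
        view.position(pointer);
        view.get(data);

        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for the WebAssembly memory view.
        ByteBuffer memory = ByteBuffer.allocateDirect(64);

        // Arbitrary offset playing the role of the pointer returned by `return_hello`.
        int pointer = 16;
        memory.position(pointer);
        memory.put("Hello, World!\0".getBytes(StandardCharsets.UTF_8));

        System.out.println(readCString(memory, pointer));
    }
}
```

<p>Passing an explicit charset also avoids depending on the platform's default
encoding when turning the bytes back into a Java string.</p>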
<blockquote>
<p>You can read <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/java-ext-wasm/blob/master/examples/GreetExample.java">the Greet
Example</a>
to see a more in-depth usage of the <code>Memory</code> API.</p>
</blockquote>
<h2 id="more-documentation">More documentation<a role="presentation" class="anchor" href="#more-documentation" title="Anchor link to this header">#</a>
</h2>
<p>The project comes with a <code>Makefile</code>. The <code>make javadoc</code> command will
generate a traditional local Javadoc for you, in the
<code>build/docs/javadoc/index.html</code> file.</p>
<p>In addition, the project’s <code>README.md</code> file has an <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/java-ext-wasm#api-of-the-wasmer-library">“API of
the <code>wasmer</code> library”
section</a>.</p>
<p>Finally, the project comes with <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/java-ext-wasm/tree/master/examples">a set of
examples</a>.
For instance, use <code>make run-example EXAMPLE=Simple</code> to run the
<code>SimpleExample.java</code> example.</p>
<h2 id="performance">Performance<a role="presentation" class="anchor" href="#performance" title="Anchor link to this header">#</a>
</h2>
<p>WebAssembly aims at being safe, but also fast. Since Wasmer JNI is the
<em>first</em> Java library to execute WebAssembly, we can’t compare it to prior
work in the Java ecosystem. However, you might know that Wasmer comes
with 3 backends: Singlepass, Cranelift and LLVM. We’ve even written an
article about it: <a rel="noopener external" target="_blank" href="https://medium.com/wasmer/a-webassembly-compiler-tale-9ef37aa3b537">A WebAssembly Compiler
tale</a>.
The Wasmer JNI library uses the Cranelift backend for the moment, which
offers the best compromise between compilation time and execution time.</p>
<h2 id="credits">Credits<a role="presentation" class="anchor" href="#credits" title="Anchor link to this header">#</a>
</h2>
<p>Asami (<a rel="noopener external" target="_blank" href="https://twitter.com/d0iasm">d0iasm</a> on Twitter) improved
this project during her internship at Wasmer under my guidance. She
finished the internship before the release of the Wasmer JNI project,
but she deserves credit for pushing the project forward! Good work
Asami!</p>
<p>This is an opportunity to remind everyone that we hire anywhere in the
world. Asami was working from Japan while I am working from Switzerland,
and the rest of the team is from the US, Spain, China, etc. Feel free to
contact me (<a rel="noopener external" target="_blank" href="https://twitter.com/mnt_io">@mnt_io</a> or
<a rel="noopener external" target="_blank" href="https://twitter.com/syrusakbary">@syrusakbary</a> on Twitter) if you want
to join us on this big adventure!</p>
<h2 id="conclusion">Conclusion<a role="presentation" class="anchor" href="#conclusion" title="Anchor link to this header">#</a>
</h2>
<p>Wasmer JNI is a library to execute WebAssembly directly in Java. It
embeds the WebAssembly runtime
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer">Wasmer</a>. The first releases provide
the core API with <code>Module</code>, <code>Instance</code>, and <code>Memory</code>. It comes
pre-packaged as a JAR, one per architecture and per platform.</p>
<p>The source code is open and hosted on GitHub at
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/java-ext-wasm">wasmerio/java-ext-wasm</a>.
We are constantly improving the project, so if you have feedback,
issues, or feature requests, please open an issue in the repository, or
reach us on Twitter at <a rel="noopener external" target="_blank" href="https://twitter.com/wasmerio">@wasmerio</a> or
<a rel="noopener external" target="_blank" href="https://twitter.com/mnt_io">@mnt_io</a>.</p>
<p>We look forward to seeing what you build with this!</p>
Announcing the first Postgres extension to run WebAssembly2019-08-29T00:00:00+00:002019-08-29T00:00:00+00:00
Unknown
https://mnt.io/articles/announcing-the-first-postgres-extension-to-run-webassembly/<p><em>This is a copy of <a rel="noopener external" target="_blank" href="https://medium.com/wasmer/announcing-the-first-postgres-extension-to-run-webassembly-561af2cfcb1">an article I wrote for
Wasmer</a>.</em></p>
<hr />
<p>WebAssembly is a portable binary format. That means the same program can
run anywhere.</p>
<blockquote>
<p>To uphold this bold statement, each language, platform and system must
be able to run WebAssembly — as fast and safely as possible.</p>
</blockquote>
<p>Let’s say it again. <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer">Wasmer</a> is a
WebAssembly runtime. We have successfully embedded the runtime in other
languages:</p>
<ul>
<li>In Rust, as it is written in Rust</li>
<li>Using <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer/tree/master/lib/runtime-c-api">C and C++
bindings</a></li>
<li>In PHP, using
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/php-ext-wasm"><code>php-ext-wasm</code></a></li>
<li>In Python, using
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/python-ext-wasm"><code>python-ext-wasm</code></a> —
<a rel="noopener external" target="_blank" href="https://pypi.org/project/wasmer/">wasmer package on PyPI</a></li>
<li>In Ruby, using
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/ruby-ext-wasm"><code>ruby-ext-wasm</code></a> — <a rel="noopener external" target="_blank" href="https://rubygems.org/gems/wasmer">wasmer
gem on RubyGems</a></li>
<li>In Go, using <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/go-ext-wasm"><code>go-ext-wasm</code></a>
— see <a rel="noopener external" target="_blank" href="https://medium.com/wasmer/announcing-the-fastest-webassembly-runtime-for-go-wasmer-19832d77c050">the
announcement</a>.</li>
</ul>
<p>The community has also embedded Wasmer in awesome projects:</p>
<ul>
<li>.NET/C#, using
<a rel="noopener external" target="_blank" href="https://github.com/migueldeicaza/WasmerSharp">WasmerSharp</a></li>
<li>R, using <a rel="noopener external" target="_blank" href="https://github.com/dirkschumacher/wasmr">Wasmr</a>.</li>
</ul>
<p><strong>It is now time to continue the story and to hang around…
<a rel="noopener external" target="_blank" href="https://www.postgresql.org/">Postgres</a>!</strong></p>
<p>We are so happy to announce a new crazy idea: <strong>WebAssembly on
Postgres</strong>. Yes, you read that correctly. On
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/postgres-ext-wasm">Postgres</a>.</p>
<h2 id="calling-a-webassembly-function-from-postgres">Calling a WebAssembly function from Postgres<a role="presentation" class="anchor" href="#calling-a-webassembly-function-from-postgres" title="Anchor link to this header">#</a>
</h2>
<p>As usual, we have to go through the installation process. There is no
package manager for Postgres, so it’s a manual step. The <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/postgres-ext-wasm#installation">Installation
Section of the
documentation</a>
explains the details; here is a summary:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> # Build the shared library.</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> just build</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> # Install the extension in the Postgres tree.</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> just install</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> # Activate the extension.</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> echo </span><span class="z-string">'CREATE EXTENSION wasm;'</span><span class="z-keyword z-operator"> |</span><span> \</span></span>
<span class="giallo-l"><span> psql -h $host -d $database</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> # Initialize the extension.</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> echo </span><span class="z-string">"SELECT wasm_init('$(</span><span class="z-support z-function">pwd</span><span class="z-string">)/target/release/libpg_ext_wasm.dylib');"</span><span class="z-keyword z-operator"> |</span><span> \</span></span>
<span class="giallo-l"><span> psql -h $host -d $database</span></span></code></pre>
<p>Once the extension is installed, activated and initialized, we can start
having fun!</p>
<p>The current API is rather small; however, basic features are available.
The goal is to gather a community and to design a pragmatic API
together, and to discover what developers expect and how they would use
this new technology inside a database engine.</p>
<p>Let’s see how it works. To instantiate a WebAssembly module, we use the
<code>wasm_new_instance</code> function. It takes two arguments: the absolute path to
the WebAssembly module, and a prefix for the module’s exported functions.
For example, if a module exports a function named <code>sum</code>, then a Postgres
function named <code>prefix_sum</code> that calls the <code>sum</code> function is created
dynamically.</p>
<p>Let’s see it in action. Let’s start by editing a Rust program that
compiles to WebAssembly:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>#[no_mangle]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> extern</span><span class="z-keyword"> fn</span><span class="z-entity z-name z-function"> sum</span><span>(</span><span class="z-variable">x</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> i32</span><span>,</span><span class="z-variable"> y</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> i32</span><span>)</span><span class="z-keyword z-operator"> -></span><span class="z-entity z-name"> i32</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> x</span><span class="z-keyword z-operator"> +</span><span class="z-variable"> y</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Once this file is compiled to <code>simple.wasm</code>, we can instantiate the module,
and call the exported <code>sum</code> function:</p>
<pre class="giallo z-code"><code data-lang="sql"><span class="giallo-l"><span class="z-comment">-- New instance of the `simple.wasm` WebAssembly module.</span></span>
<span class="giallo-l"><span class="z-keyword">SELECT</span><span> wasm_new_instance(</span><span class="z-string">'/absolute/path/to/simple.wasm'</span><span>, </span><span class="z-string">'ns'</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">-- Call a WebAssembly exported function!</span></span>
<span class="giallo-l"><span class="z-keyword">SELECT</span><span> ns_sum(</span><span class="z-constant z-numeric">1</span><span>, </span><span class="z-constant z-numeric">2</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">-- ns_sum</span></span>
<span class="giallo-l"><span class="z-comment">-- --------</span></span>
<span class="giallo-l"><span class="z-comment">-- 3</span></span>
<span class="giallo-l"><span class="z-comment">-- (1 row)</span></span></code></pre>
<p><em>Et voilà !</em> The <code>ns_sum</code> function calls the Rust <code>sum</code> function through
WebAssembly! How fun is that 😄?</p>
<h2 id="inspect-a-webassembly-instance">Inspect a WebAssembly instance<a role="presentation" class="anchor" href="#inspect-a-webassembly-instance" title="Anchor link to this header">#</a>
</h2>
<p>This section shows how to inspect a WebAssembly instance. At the same
time, it quickly explains how the extension works under the hood.</p>
<p>The extension provides two foreign data wrappers, gathered together in
the <code>wasm</code> foreign schema:</p>
<ul>
<li><code>wasm.instances</code> is a table with the <code>id</code> and <code>wasm_file</code> columns,
respectively for the unique instance ID, and the path of the WebAssembly
module,</li>
<li><code>wasm.exported_functions</code> is a table with the <code>instance_id</code>, <code>name</code>,
<code>inputs</code>, and <code>outputs</code> columns, respectively for the instance ID of the
exported function, its name, its input types (already formatted for Postgres),
and its output types (already formatted for Postgres).</li>
</ul>
<p>Let’s see:</p>
<pre class="giallo z-code"><code data-lang="sql"><span class="giallo-l"><span class="z-comment">-- Select all WebAssembly instances.</span></span>
<span class="giallo-l"><span class="z-keyword">SELECT</span><span class="z-keyword z-operator"> *</span><span class="z-keyword"> FROM</span><span class="z-constant z-other"> wasm</span><span>.</span><span class="z-constant z-other">instances</span><span>;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">-- id | wasm_file</span></span>
<span class="giallo-l"><span class="z-comment">-- -------------------------------------+-------------------------------</span></span>
<span class="giallo-l"><span class="z-comment">-- 426e17af-c32f-5027-ad73-239e5450dd91 | /absolute/path/to/simple.wasm</span></span>
<span class="giallo-l"><span class="z-comment">-- (1 row)</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">-- Select all exported functions for a specific instance.</span></span>
<span class="giallo-l"><span class="z-keyword">SELECT</span></span>
<span class="giallo-l"><span class="z-keyword"> name</span><span>,</span></span>
<span class="giallo-l"><span> inputs,</span></span>
<span class="giallo-l"><span> outputs</span></span>
<span class="giallo-l"><span class="z-keyword">FROM</span></span>
<span class="giallo-l"><span class="z-constant z-other"> wasm</span><span>.</span><span class="z-constant z-other">exported_functions</span></span>
<span class="giallo-l"><span class="z-keyword">WHERE</span></span>
<span class="giallo-l"><span> instance_id </span><span class="z-keyword z-operator">=</span><span class="z-string"> '426e17af-c32f-5027-ad73-239e5450dd91'</span><span>;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">-- name | inputs | outputs</span></span>
<span class="giallo-l"><span class="z-comment">-- -------+-----------------+---------</span></span>
<span class="giallo-l"><span class="z-comment">-- ns_sum | integer,integer | integer</span></span>
<span class="giallo-l"><span class="z-comment">-- (1 row)</span></span></code></pre>
<p>Based on this information, the <code>wasm</code> Postgres extension is able to
generate the SQL function to call the WebAssembly exported functions.</p>
<p>It sounds simplistic, and… to be honest, it is! The trick is to use
<a rel="noopener external" target="_blank" href="https://www.postgresql.org/docs/current/fdwhandler.html">foreign data
wrappers</a>,
which is an awesome feature of Postgres.</p>
<h2 id="how-fast-is-it-or-is-it-an-interesting-alternative-to-pl-pgsql">How fast is it, or: Is it an interesting alternative to PL/pgSQL?<a role="presentation" class="anchor" href="#how-fast-is-it-or-is-it-an-interesting-alternative-to-pl-pgsql" title="Anchor link to this header">#</a>
</h2>
<p>As we said, the extension API is rather small for now. The idea is to
explore, to experiment, to have fun with WebAssembly inside a database.
It is particularly interesting in two cases:</p>
<ol>
<li>To write extensions or procedures in any language that compiles to
WebAssembly, in place of
<a rel="noopener external" target="_blank" href="https://www.postgresql.org/docs/10/plpgsql.html">PL/pgSQL</a>,</li>
<li>To remove a potential performance bottleneck where speed is
involved.</li>
</ol>
<p>So we ran a basic benchmark. Like most benchmarks out there, it
must be taken with a grain of salt.</p>
<blockquote>
<p>The goal is to compare the execution time between WebAssembly and
PL/pgSQL, and see how both approaches scale.</p>
</blockquote>
<p>The Postgres WebAssembly extension uses
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer">Wasmer</a> as the
runtime, compiled with the Cranelift backend (<a rel="noopener external" target="_blank" href="https://medium.com/wasmer/a-webassembly-compiler-tale-9ef37aa3b537">learn more about the
different
backends</a>).
We ran the benchmark with Postgres 10, on a 2016 15" MacBook Pro with a
2.9 GHz Core i7 and 16 GB of memory.</p>
<p>The methodology is the following:</p>
<ul>
<li>Load both the <code>plpgsql_fibonacci</code> and the <code>wasm_fibonacci</code> functions,</li>
<li>Run them with a query like
<code>SELECT *_fibonacci(n) FROM generate_series(1, 1000)</code> where <code>n</code> has the
following values: 50, 500, and 5000, so that we can observe how both
approaches scale,</li>
<li>Write the timings down,</li>
<li>Run this methodology multiple times, and compute the median of the
results.</li>
</ul>
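<p>To make the workload concrete, the WebAssembly side of such a benchmark
could be a small Rust function like the one below. This is only a sketch,
not the code behind the measurements: the function name <code>fibonacci</code>,
the <code>i64</code> return type, and the wrapping addition are assumptions made
for this illustration. Wrapping is used because Fibonacci numbers no longer
fit in 64 bits past n=92, so for n=500 or n=5000 the returned value is only
meaningful as a workload.</p>
<pre class="giallo z-code"><code data-lang="rust">// Hypothetical benchmark function; its name and types are assumptions.
#[no_mangle]
pub extern "C" fn fibonacci(n: i32) -> i64 {
    let mut previous: i64 = 0;
    let mut current: i64 = 1;

    // Iterate `n` times; after the loop, `previous` holds fib(n),
    // modulo 2^64 once the value overflows.
    for _ in 0..n {
        let next = previous.wrapping_add(current);
        previous = current;
        current = next;
    }

    previous
}</code></pre>
<p>Compiled to WebAssembly and loaded with <code>wasm_new_instance</code> and a
prefix, such a function would be callable from SQL like the <code>ns_sum</code>
function shown earlier.</p>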
<p>Here come the results. The lower, the better.</p>
<figure>
<p><img src="https://mnt.io/articles/announcing-the-first-postgres-extension-to-run-webassembly/./benchmarks.png" alt="Benchmarks" loading="lazy" decoding="async" /></p>
<figcaption>
<p>Comparing WebAssembly vs. PL/pgSQL when computing the Fibonacci sequence
with n=50, 500 and 5000.</p>
</figcaption>
</figure>
<p>We notice that the Postgres WebAssembly extension runs numeric
computations faster. The WebAssembly approach scales pretty well
compared to the PL/pgSQL approach, <em>in this situation</em>.</p>
<h3 id="">When to use the WebAssembly extension?<a role="presentation" class="anchor" href="#" title="Anchor link to this header">#</a>
</h3>
<p>So far, the extension only supports integers (32-bit and 64-bit). The
extension doesn’t support strings <em>yet</em>. It also doesn’t support
records, views or other Postgres types. Keep in mind this is the very
first step.</p>
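<p>As an illustration of what fits within these limits, an exported function
that only takes and returns 32- and 64-bit integers could look like the
sketch below. The function name <code>scale</code> and its behaviour are made
up for this example; they are not part of the extension.</p>
<pre class="giallo z-code"><code data-lang="rust">// Hypothetical exported function: the name `scale` is made up for this sketch.
// Its signature only uses 32- and 64-bit integers.
#[no_mangle]
pub extern "C" fn scale(value: i64, factor: i32) -> i64 {
    // Widen the 32-bit factor, then multiply without panicking on overflow.
    value.wrapping_mul(i64::from(factor))
}</code></pre>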
<p>Hence, it is too soon to tell whether WebAssembly can be an alternative
to PL/pgSQL. But given the benchmark results above, we are sure they
can live side by side: WebAssembly clearly has a place in the ecosystem!
And we want to continue this exploration.</p>
<h2 id="-1">Conclusion<a role="presentation" class="anchor" href="#-1" title="Anchor link to this header">#</a>
</h2>
<p>We are already talking with people who are interested in using
WebAssembly inside databases. If you have any particular use cases,
please reach us at <a rel="noopener external" target="_blank" href="https://wasmer.io/">wasmer.io</a>, or on Twitter at
<a rel="noopener external" target="_blank" href="https://twitter.com/wasmerio">@wasmerio</a>, or reach me directly at
<a rel="noopener external" target="_blank" href="https://twitter.com/mnt_io">@mnt_io</a>.</p>
<p>Everything is open source, as usual! Happy hacking.</p>
Announcing the fastest WebAssembly runtime for Go: wasmer2019-05-29T00:00:00+00:002019-05-29T00:00:00+00:00
Unknown
https://mnt.io/articles/announcing-the-fastest-webassembly-runtime-for-go-wasmer/<p><em>This is a copy of <a rel="noopener external" target="_blank" href="https://medium.com/wasmer/announcing-the-fastest-webassembly-runtime-for-go-wasmer-19832d77c050">an article I wrote for
Wasmer</a>.</em></p>
<hr />
<p>WebAssembly is a portable binary format. That means the same file can
run anywhere.</p>
<blockquote>
<p>To uphold this bold statement, each language, platform and system must
be able to run WebAssembly — as fast and safely as possible.</p>
</blockquote>
<p><a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer">Wasmer</a> is a WebAssembly runtime
written in <a rel="noopener external" target="_blank" href="https://www.rust-lang.org/">Rust</a>. It goes without saying
that the runtime can be used in any Rust application. We have also
successfully embedded the runtime in other languages:</p>
<ul>
<li>Using <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer/tree/master/lib/runtime-c-api">C and C++
bindings</a></li>
<li>In PHP, using
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/php-ext-wasm"><code>php-ext-wasm</code></a></li>
<li>In Python, using
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/python-ext-wasm"><code>python-ext-wasm</code></a> —
<a rel="noopener external" target="_blank" href="https://pypi.org/project/wasmer/">wasmer package on PyPI</a></li>
<li>In Ruby, using
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/ruby-ext-wasm"><code>ruby-ext-wasm</code></a> — <a rel="noopener external" target="_blank" href="https://rubygems.org/gems/wasmer">wasmer
gem on RubyGems</a></li>
<li><strong>It is now time to hang around <a rel="noopener external" target="_blank" href="https://golang.org/">Go</a>
🐹!</strong></li>
</ul>
<p>We are super happy to announce <code>github.com/wasmerio/go-ext-wasm/wasmer</code>,
a <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/go-ext-wasm">Go library to run WebAssembly binaries,
fast</a>.</p>
<h2 id="calling-a-webassembly-function-from-go">Calling a WebAssembly function from Go<a role="presentation" class="anchor" href="#calling-a-webassembly-function-from-go" title="Anchor link to this header">#</a>
</h2>
<p>First, let’s install <code>wasmer</code> in your Go environment (<em>with cgo
support</em>).</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span class="z-storage"> export</span><span class="z-variable"> CGO_ENABLED</span><span class="z-keyword z-operator">=</span><span class="z-constant z-numeric">1</span><span class="z-punctuation z-terminator">;</span><span class="z-storage"> export</span><span class="z-variable"> CC</span><span class="z-keyword z-operator">=</span><span class="z-variable">gcc</span><span class="z-punctuation z-terminator">;</span><span class="z-entity z-name"> go</span><span class="z-string"> install github.com/wasmerio/go-ext-wasm/wasmer</span></span></code></pre>
<p>Let’s jump immediately into some
examples. <code>github.com/wasmerio/go-ext-wasm/wasmer</code> is a regular Go
library. The installation is automated with
<code>import "github.com/wasmerio/go-ext-wasm/wasmer"</code>.</p>
<p>Let’s get our hands dirty. We will write a program that compiles to
WebAssembly easily, using Rust for instance:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>#[no_mangle]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> extern</span><span class="z-keyword"> fn</span><span class="z-entity z-name z-function"> sum</span><span>(</span><span class="z-variable">x</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> i32</span><span>,</span><span class="z-variable"> y</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> i32</span><span>)</span><span class="z-keyword z-operator"> -></span><span class="z-entity z-name"> i32</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> x</span><span class="z-keyword z-operator"> +</span><span class="z-variable"> y</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>After compilation to WebAssembly, we get a file like <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/go-ext-wasm/blob/master/wasmer/test/testdata/examples/simple.wasm">this
one</a>,
named <code>simple.wasm</code>.<br />
The following Go program executes the <code>sum</code> function by passing <code>5</code> and
<code>37</code> as arguments:</p>
<pre class="giallo z-code"><code data-lang="go"><span class="giallo-l"><span class="z-keyword">package</span><span class="z-entity z-name"> main</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">import</span><span> (</span></span>
<span class="giallo-l"><span class="z-string"> "</span><span class="z-entity z-name z-import">fmt</span><span class="z-string">"</span></span>
<span class="giallo-l"><span class="z-variable"> wasm</span><span class="z-string"> "</span><span class="z-entity z-name z-import">github.com/wasmerio/go-ext-wasm/wasmer</span><span class="z-string">"</span></span>
<span class="giallo-l"><span>)</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">func</span><span class="z-entity z-name z-function"> main</span><span>() {</span></span>
<span class="giallo-l"><span class="z-comment"> // Reads the WebAssembly module as bytes.</span></span>
<span class="giallo-l"><span class="z-variable"> bytes</span><span>,</span><span class="z-variable"> _</span><span class="z-keyword z-operator"> :=</span><span class="z-variable"> wasm</span><span>.</span><span class="z-entity z-name z-function">ReadBytes</span><span>(</span><span class="z-string">"simple.wasm"</span><span>)</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Instantiates the WebAssembly module.</span></span>
<span class="giallo-l"><span class="z-variable"> instance</span><span>,</span><span class="z-variable"> _</span><span class="z-keyword z-operator"> :=</span><span class="z-variable"> wasm</span><span>.</span><span class="z-entity z-name z-function">NewInstance</span><span>(</span><span class="z-variable">bytes</span><span>)</span></span>
<span class="giallo-l"><span class="z-keyword"> defer</span><span class="z-variable"> instance</span><span>.</span><span class="z-entity z-name z-function">Close</span><span>()</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Gets the `sum` exported function from the WebAssembly instance.</span></span>
<span class="giallo-l"><span class="z-variable"> sum</span><span class="z-keyword z-operator"> :=</span><span class="z-variable"> instance</span><span>.</span><span class="z-variable">Exports</span><span>[</span><span class="z-string">"sum"</span><span>]</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Calls that exported function with Go standard values. The WebAssembly</span></span>
<span class="giallo-l"><span class="z-comment"> // types are inferred and values are cast automatically.</span></span>
<span class="giallo-l"><span class="z-variable"> result</span><span>,</span><span class="z-variable"> _</span><span class="z-keyword z-operator"> :=</span><span class="z-entity z-name z-function"> sum</span><span>(</span><span class="z-constant z-numeric">5</span><span>,</span><span class="z-constant z-numeric"> 37</span><span>)</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> fmt</span><span>.</span><span class="z-entity z-name z-function">Println</span><span>(</span><span class="z-variable">result</span><span>)</span><span class="z-comment"> // 42!</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Great! We have successfully executed a WebAssembly file inside Go.</p>
<blockquote>
<p><em>Note: Go values passed to the WebAssembly exported function are
automatically cast to WebAssembly values. Types are inferred and
casting is done automatically. Thus, a WebAssembly function acts as
any regular Go function.</em></p>
</blockquote>
<h2 id="webassembly-calling-go-funtions">WebAssembly calling Go functions<a role="presentation" class="anchor" href="#webassembly-calling-go-funtions" title="Anchor link to this header">#</a>
</h2>
<p>A WebAssembly module <em>exports</em> some functions, so that they can be
called from the outside world. This is the entry point to execute
WebAssembly.</p>
<p>Nonetheless, a WebAssembly module can also have <em>imported</em> functions.
Let’s consider the following Rust program:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-storage">extern</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> fn</span><span class="z-entity z-name z-function"> sum</span><span>(</span><span class="z-variable">x</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> i32</span><span>,</span><span class="z-variable"> y</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> i32</span><span>)</span><span class="z-keyword z-operator"> -></span><span class="z-entity z-name"> i32</span><span>;</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>#[no_mangle]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> extern</span><span class="z-keyword"> fn</span><span class="z-entity z-name z-function"> add1</span><span>(</span><span class="z-variable">x</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> i32</span><span>,</span><span class="z-variable"> y</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> i32</span><span>)</span><span class="z-keyword z-operator"> -></span><span class="z-entity z-name"> i32</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> unsafe</span><span> {</span><span class="z-entity z-name z-function"> sum</span><span>(</span><span class="z-variable">x</span><span>,</span><span class="z-variable"> y</span><span>) }</span><span class="z-keyword z-operator"> +</span><span class="z-constant z-numeric"> 1</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>The exported function <code>add1</code> calls the <code>sum</code> function. Its
implementation is absent, only its signature is defined. This is an
“extern function”, and for WebAssembly, this is an <em>imported</em> function,
because its implementation must be <em>imported</em>.</p>
<p>Let’s implement the <code>sum</code> function in Go! To do so, we <em>need</em> to use
<a rel="noopener external" target="_blank" href="https://blog.golang.org/c-go-cgo">cgo</a>:</p>
<ol>
<li>The <code>sum</code> function signature is defined in C (see the comment above
<code>import "C"</code>),</li>
<li>The <code>sum</code> implementation is defined in Go. Notice the <code>//export</code> comment, which is
how cgo exposes Go code to C code,</li>
<li><code>NewImports</code> is an API used to create WebAssembly imports. In this code
<code>"sum"</code> is the WebAssembly imported function name, <code>sum</code> is the Go function
pointer, and <code>C.sum</code> is the cgo function pointer,</li>
<li>Finally, <code>NewInstanceWithImports</code> is the constructor to use to instantiate
the WebAssembly module with imports. That’s it.</li>
</ol>
<p>Let’s see the complete program:</p>
<pre class="giallo z-code"><code data-lang="go"><span class="giallo-l"><span class="z-keyword">package</span><span class="z-entity z-name"> main</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// // 1️⃣ Declare the `sum` function signature (see cgo).</span></span>
<span class="giallo-l"><span class="z-comment">//</span></span>
<span class="giallo-l"><span class="z-comment">// #include <stdlib.h></span></span>
<span class="giallo-l"><span class="z-comment">//</span></span>
<span class="giallo-l"><span class="z-comment">// extern int32_t sum(void *context, int32_t x, int32_t y);</span></span>
<span class="giallo-l"><span class="z-keyword">import</span><span class="z-string"> "</span><span class="z-entity z-name z-import">C</span><span class="z-string">"</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">import</span><span> (</span></span>
<span class="giallo-l"><span class="z-string"> "</span><span class="z-entity z-name z-import">fmt</span><span class="z-string">"</span></span>
<span class="giallo-l"><span class="z-variable"> wasm</span><span class="z-string"> "</span><span class="z-entity z-name z-import">github.com/wasmerio/go-ext-wasm/wasmer</span><span class="z-string">"</span></span>
<span class="giallo-l"><span class="z-string"> "</span><span class="z-entity z-name z-import">unsafe</span><span class="z-string">"</span></span>
<span class="giallo-l"><span>)</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// 2️⃣ Write the implementation of the `sum` function, and export it (for cgo).</span></span>
<span class="giallo-l"><span class="z-comment">//export sum</span></span>
<span class="giallo-l"><span class="z-keyword">func</span><span class="z-entity z-name z-function"> sum</span><span>(</span><span class="z-variable z-parameter">context</span><span class="z-entity z-name"> unsafe</span><span>.</span><span class="z-entity z-name">Pointer</span><span>,</span><span class="z-variable z-parameter"> x</span><span class="z-storage z-type"> int32</span><span>,</span><span class="z-variable z-parameter"> y</span><span class="z-storage z-type"> int32</span><span>)</span><span class="z-storage z-type"> int32</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable"> x</span><span class="z-keyword z-operator"> +</span><span class="z-variable"> y</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">func</span><span class="z-entity z-name z-function"> main</span><span>() {</span></span>
<span class="giallo-l"><span class="z-comment"> // Reads the WebAssembly module as bytes.</span></span>
<span class="giallo-l"><span class="z-variable"> bytes</span><span>,</span><span class="z-variable"> _</span><span class="z-keyword z-operator"> :=</span><span class="z-variable"> wasm</span><span>.</span><span class="z-entity z-name z-function">ReadBytes</span><span>(</span><span class="z-string">"import.wasm"</span><span>)</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // 3️⃣ Declares the imported functions for WebAssembly.</span></span>
<span class="giallo-l"><span class="z-variable"> imports</span><span>,</span><span class="z-variable"> _</span><span class="z-keyword z-operator"> :=</span><span class="z-variable"> wasm</span><span>.</span><span class="z-entity z-name z-function">NewImports</span><span>().</span><span class="z-entity z-name z-function">Append</span><span>(</span><span class="z-string">"sum"</span><span>,</span><span class="z-variable"> sum</span><span>,</span><span class="z-variable"> C</span><span>.</span><span class="z-variable">sum</span><span>)</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // 4️⃣ Instantiates the WebAssembly module with imports.</span></span>
<span class="giallo-l"><span class="z-variable"> instance</span><span>,</span><span class="z-variable"> _</span><span class="z-keyword z-operator"> :=</span><span class="z-variable"> wasm</span><span>.</span><span class="z-entity z-name z-function">NewInstanceWithImports</span><span>(</span><span class="z-variable">bytes</span><span>,</span><span class="z-variable"> imports</span><span>)</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Close the WebAssembly instance later.</span></span>
<span class="giallo-l"><span class="z-keyword"> defer</span><span class="z-variable"> instance</span><span>.</span><span class="z-entity z-name z-function">Close</span><span>()</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Gets the `add1` exported function from the WebAssembly instance.</span></span>
<span class="giallo-l"><span class="z-variable"> add1</span><span class="z-keyword z-operator"> :=</span><span class="z-variable"> instance</span><span>.</span><span class="z-variable">Exports</span><span>[</span><span class="z-string">"add1"</span><span>]</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Calls that exported function.</span></span>
<span class="giallo-l"><span class="z-variable"> result</span><span>,</span><span class="z-variable"> _</span><span class="z-keyword z-operator"> :=</span><span class="z-entity z-name z-function"> add1</span><span>(</span><span class="z-constant z-numeric">1</span><span>,</span><span class="z-constant z-numeric"> 2</span><span>)</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> fmt</span><span>.</span><span class="z-entity z-name z-function">Println</span><span>(</span><span class="z-variable">result</span><span>)</span></span>
<span class="giallo-l"><span class="z-comment"> // add1(1, 2)</span></span>
<span class="giallo-l"><span class="z-comment"> // = sum(1, 2) + 1</span></span>
<span class="giallo-l"><span class="z-comment"> // = 1 + 2 + 1</span></span>
<span class="giallo-l"><span class="z-comment"> // = 4</span></span>
<span class="giallo-l"><span class="z-comment"> // QED</span></span>
<span class="giallo-l"><span>}</span></span></code></pre><h2 id="reading-the-memory">Reading the memory<a role="presentation" class="anchor" href="#reading-the-memory" title="Anchor link to this header">#</a>
</h2>
<p>A WebAssembly instance has a linear memory. Let’s see how to read it.
Consider the following Rust program:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>#[no_mangle]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> extern</span><span class="z-keyword"> fn</span><span class="z-entity z-name z-function"> return_hello</span><span>()</span><span class="z-keyword z-operator"> -> *</span><span class="z-storage">const</span><span class="z-entity z-name"> u8</span><span> {</span></span>
<span class="giallo-l"><span class="z-string"> b"Hello, World!</span><span class="z-constant z-character">\0</span><span class="z-string">"</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">as_ptr</span><span>()</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>The <code>return_hello</code> function returns a pointer to a string. The string
is terminated by a null byte, <em>à la</em> C. Let’s jump to the Go side:</p>
<pre class="giallo z-code"><code data-lang="go"><span class="giallo-l"><span class="z-variable">bytes</span><span>,</span><span class="z-variable"> _</span><span class="z-keyword z-operator"> :=</span><span class="z-variable"> wasm</span><span>.</span><span class="z-entity z-name z-function">ReadBytes</span><span>(</span><span class="z-string">"memory.wasm"</span><span>)</span></span>
<span class="giallo-l"><span class="z-variable">instance</span><span>,</span><span class="z-variable"> _</span><span class="z-keyword z-operator"> :=</span><span class="z-variable"> wasm</span><span>.</span><span class="z-entity z-name z-function">NewInstance</span><span>(</span><span class="z-variable">bytes</span><span>)</span></span>
<span class="giallo-l"><span class="z-keyword">defer</span><span class="z-variable"> instance</span><span>.</span><span class="z-entity z-name z-function">Close</span><span>()</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// Calls the `return_hello` exported function.</span></span>
<span class="giallo-l"><span class="z-comment">// This function returns a pointer to a string.</span></span>
<span class="giallo-l"><span class="z-variable">result</span><span>,</span><span class="z-variable"> _</span><span class="z-keyword z-operator"> :=</span><span class="z-variable"> instance</span><span>.</span><span class="z-entity z-name z-function">Exports</span><span>[</span><span class="z-string">"return_hello"</span><span>]()</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// Gets the pointer value as an integer.</span></span>
<span class="giallo-l"><span class="z-variable">pointer</span><span class="z-keyword z-operator"> :=</span><span class="z-variable"> result</span><span>.</span><span class="z-entity z-name z-function">ToI32</span><span>()</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// Reads the memory.</span></span>
<span class="giallo-l"><span class="z-variable">memory</span><span class="z-keyword z-operator"> :=</span><span class="z-variable"> instance</span><span>.</span><span class="z-variable">Memory</span><span>.</span><span class="z-entity z-name z-function">Data</span><span>()</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable">fmt</span><span>.</span><span class="z-entity z-name z-function">Println</span><span>(</span><span class="z-storage z-type">string</span><span>(</span><span class="z-variable">memory</span><span>[</span><span class="z-variable">pointer</span><span> :</span><span class="z-variable"> pointer</span><span class="z-keyword z-operator">+</span><span class="z-constant z-numeric">13</span><span>]))</span><span class="z-comment"> // Hello, World!</span></span></code></pre>
<p>The <code>return_hello</code> function returns a pointer as an <code>i32</code> value. We get
its value by calling <code>ToI32</code>. Then, we fetch the memory data with
<code>instance.Memory.Data()</code>.</p>
<p>This function returns a slice over the WebAssembly instance memory. It
can be used as any regular Go slice.</p>
<p>Fortunately for us, we already know the length of the string we want to
read, so <code>memory[pointer : pointer+13]</code> is enough to read the bytes,
which are then cast to a string. <em>Et voilà !</em></p>
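<p>When the length is not known in advance, one option is to scan the memory
for the null terminator. The following is only a minimal sketch, not part of
the library: <code>readCString</code> is a hypothetical helper, and the hard-coded
byte slice stands in for the slice returned by <code>instance.Memory.Data()</code>,
so that the example runs without any WebAssembly module:</p>
<pre class="giallo z-code"><code data-lang="go">package main

import (
    "bytes"
    "fmt"
)

// readCString returns the text stored in `memory` from `pointer` up to
// (and excluding) the first null byte. The boolean is false when no
// terminator is found. It assumes `pointer` falls inside `memory`.
func readCString(memory []byte, pointer int32) (string, bool) {
    tail := memory[pointer:]
    end := bytes.IndexByte(tail, 0)

    if end == -1 {
        return "", false
    }

    return string(tail[:end]), true
}

func main() {
    // Stand-in for `instance.Memory.Data()`: four padding bytes, then a
    // null-terminated string, then unrelated bytes.
    memory := []byte("\x00\x00\x00\x00Hello, World!\x00garbage")

    text, ok := readCString(memory, 4)
    fmt.Println(text, ok) // Hello, World! true
}</code></pre>
<p>Slicing out of bounds panics in Go, so a real caller would probably want to
validate the pointer before passing it to such a helper.</p>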
<blockquote>
<p>You can read <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/go-ext-wasm/blob/6934a0fa06558f77884398a2371de182593e6a6c/wasmer/test/example_greet_test.go">the Greet
Example</a>
to see a more advanced usage of the memory API.</p>
</blockquote>
<h2 id="benchmarks">Benchmarks<a role="presentation" class="anchor" href="#benchmarks" title="Anchor link to this header">#</a>
</h2>
<p>So far, <code>github.com/wasmerio/go-ext-wasm/wasmer</code> has a nice API, but
…<em>is it fast</em>?</p>
<p>Contrary to PHP or Ruby, there are already existing runtimes in the Go
world to execute WebAssembly. The main candidates are:</p>
<ul>
<li><a rel="noopener external" target="_blank" href="https://github.com/perlin-network/life">Life</a>, from Perlin Network, a
WebAssembly interpreter</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/go-interpreter/wagon">Wagon</a>, from Go Interpreter,
a WebAssembly interpreter and toolkit</li>
</ul>
<p>In <a rel="noopener external" target="_blank" href="https://medium.com/wasmer/php-ext-wasm-migrating-from-wasmi-to-wasmer-4d1014f41c88">our blog post about the PHP
extension</a>,
we have used <a rel="noopener external" target="_blank" href="https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/nbody.html">the n-body
algorithm</a>
to benchmark the performance. Life provides more benchmarks: <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Fibonacci_number">the
Fibonacci algorithm</a>
(the recursive version), <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Pollard%27s_rho_algorithm">the Pollard’s rho
algorithm</a>, and
the Snappy Compress operation. The latter works successfully with
<code>github.com/wasmerio/go-ext-wasm/wasmer</code> but not with Life or Wagon. We
have removed it from the benchmark suites. <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/go-ext-wasm/tree/master/benchmarks">Benchmark
sources</a>
are online.</p>
<p>We use Life 20190521143330-57f3819c2df0 and Wagon 0.4.0, i.e. <em>the
latest versions to date</em>.</p>
<p>The benchmark numbers represent the average result of 10 runs each. The
computer that ran these benchmarks is a 15-inch MacBook Pro from 2016,
with a 2.9 GHz Core i7 and 16 GB of memory.</p>
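<p>To make the “average of 10 runs” concrete, here is a rough sketch of such a
measurement loop in Go. It is not the code of the actual benchmark suites;
<code>measure</code>, <code>average</code> and the sleeping workload are
assumptions made for illustration only:</p>
<pre class="giallo z-code"><code data-lang="go">package main

import (
    "fmt"
    "time"
)

// measure calls `f` `runs` times and records the wall-clock duration
// of each call.
func measure(runs int, f func()) []time.Duration {
    durations := make([]time.Duration, runs)

    for index := range durations {
        start := time.Now()
        f()
        durations[index] = time.Since(start)
    }

    return durations
}

// average returns the arithmetic mean of the given durations, or zero
// for an empty slice.
func average(durations []time.Duration) time.Duration {
    if len(durations) == 0 {
        return 0
    }

    var total time.Duration

    for _, duration := range durations {
        total += duration
    }

    return total / time.Duration(len(durations))
}

func main() {
    // Hypothetical workload; a real benchmark would call an exported
    // WebAssembly function here instead.
    workload := func() { time.Sleep(time.Millisecond) }

    durations := measure(10, workload)
    fmt.Printf("average over %d runs: %v\n", len(durations), average(durations))
}</code></pre>
<p>Averaging several runs smooths out noise from the operating system, although
it does not remove it; the exact figure printed will vary from one machine to
another.</p>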
<p>Results are grouped by benchmark algorithm on the X axis. The Y axis
represents the time used to run the algorithm, expressed in
milliseconds. The lower, the better.</p>
<figure>
<p><img src="https://mnt.io/articles/announcing-the-fastest-webassembly-runtime-for-go-wasmer/./benchmarks.png" alt="Benchmarks" loading="lazy" decoding="async" /></p>
<figcaption>
<p>Speed comparison between Wasmer, Wagon and Life. Benchmark suites are
the n-body, Fibonacci, and Pollard’s rho algorithms. Speed is expressed
in ms. Lower is better.</p>
</figcaption>
</figure>
<p>While both Life and Wagon provide on average the same speed, Wasmer
(<code>github.com/wasmerio/go-ext-wasm/wasmer</code>) is on average <strong>72 times faster</strong>
🎉.</p>
<p>It is important to know that Wasmer comes with 3 backends:
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer/tree/master/lib/singlepass-backend">Singlepass</a>,
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer/tree/master/lib/clif-backend">Cranelift</a>,
and
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer/tree/master/lib/llvm-backend">LLVM</a>.
The default backend that is used by the Go library is Cranelift (<a rel="noopener external" target="_blank" href="https://github.com/CraneStation/cranelift">learn
more about Cranelift</a>). Using
LLVM will provide performance close to native, but we decided to start
with Cranelift as it offers the best tradeoff between compilation time
and execution time (<a rel="noopener external" target="_blank" href="https://medium.com/wasmer/a-webassembly-compiler-tale-9ef37aa3b537">learn more about the different
backends</a>,
when to use them, their pros and cons, etc.).</p>
<h2 id="">Conclusion<a role="presentation" class="anchor" href="#" title="Anchor link to this header">#</a>
</h2>
<p><a rel="noopener external" target="_blank" href="https://github.com/wasmerio/go-ext-wasm"><code>github.com/wasmerio/go-ext-wasm/wasmer</code></a>
is a new Go library to execute WebAssembly binaries. It embeds the
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer">Wasmer</a> runtime. The first version
provides all the APIs required for the most common use cases.</p>
<p>The current benchmarks (a mix from our benchmark suites and from Life
suites) show that <strong>Wasmer is — on average — 72 times faster than Life
and Wagon</strong>, the two major existing WebAssembly runtimes in the Go
world.</p>
<p>If you want to follow the development, take a look at
<a rel="noopener external" target="_blank" href="https://twitter.com/wasmerio">@wasmerio</a> and
<a rel="noopener external" target="_blank" href="https://twitter.com/mnt_io">@mnt_io</a> on Twitter, or
<a rel="noopener external" target="_blank" href="https://webassembly.social/@wasmer">@wasmer@webassembly.social</a> on
Mastodon.</p>
<p>And of course, everything is open source at
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/go-ext-wasm">wasmerio/go-ext-wasm</a>.</p>
<p>Thank you for your time, we can’t wait to see what you build with us!</p>
🐘+🦀+🕸 php-ext-wasm: Migrating from wasmi to Wasmer2019-04-03T00:00:00+00:002019-04-03T00:00:00+00:00
Unknown
https://mnt.io/articles/elephant-crab-spider-web-php-ext-wasm-migrating-from-wasmi-to-wasmer/<p><em>This is a copy of <a rel="noopener external" target="_blank" href="https://medium.com/wasmer/php-ext-wasm-migrating-from-wasmi-to-wasmer-4d1014f41c88">an article I wrote for
Wasmer</a>.</em></p>
<hr />
<p>First as a joke, now as a real product, I started to develop
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/php-ext-wasm">php-ext-wasm</a>: a
<a rel="noopener external" target="_blank" href="http://php.net/">PHP</a> extension to execute
<a rel="noopener external" target="_blank" href="https://webassembly.org/">WebAssembly</a> binaries.</p>
<p>The PHP virtual machine (VM) is <a rel="noopener external" target="_blank" href="https://github.com/php/php-src/">Zend
Engine</a>. To write an extension, one
needs to develop in C or C++. The extension consisted of simple C bindings to a
Rust library I also wrote. At that time, this Rust library was using
<a rel="noopener external" target="_blank" href="https://github.com/paritytech/wasmi"><code>wasmi</code></a> for the WebAssembly VM. I
knew that <code>wasmi</code> wasn’t the fastest WebAssembly VM in the game, but the
API is solid, well-tested, it compiles quickly, and is easy to hack. All
the requirements to start a project!</p>
<p>After 6 hours of development, I got something working. I was able to run
the following PHP program:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable">$instance</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> new</span><span> Wasm</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Instance</span><span>(</span><span class="z-string">'simple.wasm'</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable">$result</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> $instance</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">sum</span><span>(</span><span class="z-constant z-numeric">1</span><span class="z-punctuation z-separator">,</span><span class="z-constant z-numeric"> 2</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-support z-function">var_dump</span><span>(</span><span class="z-variable">$result</span><span>)</span><span class="z-punctuation z-terminator">;</span><span class="z-comment"> // int(3)</span></span></code></pre>
<p>The API is straightforward: create an instance (here of <code>simple.wasm</code>),
then call functions on it (here <code>sum</code> with 1 and 2 as arguments). PHP
values are transformed into WebAssembly values automatically. For the
record, here is the <code>simple.rs</code> Rust program that is compiled to a
WebAssembly binary:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>#[no_mangle]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> extern</span><span class="z-keyword"> fn</span><span class="z-entity z-name z-function"> sum</span><span>(</span><span class="z-variable">x</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> i32</span><span>,</span><span class="z-variable"> y</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> i32</span><span>)</span><span class="z-keyword z-operator"> -></span><span class="z-entity z-name"> i32</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> x</span><span class="z-keyword z-operator"> +</span><span class="z-variable"> y</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>It was great! Six hours is a relatively short time to get that
far, in my opinion.</p>
<p>However, I quickly noticed that <code>wasmi</code> is… slow. <a rel="noopener external" target="_blank" href="https://webassembly.org/">One of the promises of
WebAssembly</a> is:</p>
<blockquote>
<p>WebAssembly aims to execute at native speed by taking advantage of
<a rel="noopener external" target="_blank" href="https://webassembly.org/docs/portability/#assumptions-for-efficient-execution">common hardware
capabilities</a>
available on a wide range of platforms.</p>
</blockquote>
<p>And clearly, my extension wasn’t fulfilling this promise. Let’s see a
basic comparison with a benchmark.</p>
<p>I chose <a rel="noopener external" target="_blank" href="https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/nbody.html">the <em>n-body</em>
algorithm</a>
from <a rel="noopener external" target="_blank" href="https://benchmarksgame-team.pages.debian.net/benchmarksgame/">the Computer Language Benchmarks
Game</a> from
Debian, mostly because it’s relatively CPU intensive. Also, the
algorithm has a simple interface: based on an integer, it returns a
floating-point number; this API doesn’t involve any advanced instance
memory API, which is perfect to test a proof-of-concept.</p>
<p>As a baseline, I’ve run the <em>n-body</em> algorithm <a rel="noopener external" target="_blank" href="https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/nbody-rust-7.html">written in
Rust</a>,
let’s call it <code>rust-baseline</code>. The same algorithm has been <a rel="noopener external" target="_blank" href="https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/nbody-php-3.html">written in
PHP</a>,
let’s call it <code>php</code>. Finally, the algorithm has been compiled from Rust
to WebAssembly, and executed with the <code>php-ext-wasm</code> extension, let’s
call that case <code>php+wasmi</code>. All results are for <code>nbody(5000000)</code>:</p>
<ul>
<li><code>rust-baseline</code>: 287ms,</li>
<li><code>php</code>: 19,761ms,</li>
<li><code>php+wasmi</code>: 67,622ms.</li>
</ul>
<p>OK, so… <code>php-ext-wasm</code> with <code>wasmi</code> is <strong>3.4 times slower</strong> than PHP
itself. It is pointless to use WebAssembly in such conditions!</p>
<p>It confirms my first intuition though: In our case, <code>wasmi</code> is really
great to mock something up, but it’s not fast enough for our
expectations.</p>
<h2 id="faster-faster-faster">Faster, faster, faster…<a role="presentation" class="anchor" href="#faster-faster-faster" title="Anchor link to this header">#</a>
</h2>
<p>I had wanted to use <a rel="noopener external" target="_blank" href="https://github.com/CraneStation/cranelift">Cranelift</a>
from the beginning. It’s a code generator, <em>à la</em>
<a rel="noopener external" target="_blank" href="http://llvm.org/">LLVM</a> (excuse the brutal shortcut, the goal isn’t to
explain what Cranelift is in detail, but it’s a really awesome
project!). To quote the project itself:</p>
<blockquote>
<p>Cranelift is a low-level retargetable code generator. It translates a
<a rel="noopener external" target="_blank" href="https://cranelift.readthedocs.io/en/latest/ir.html">target-independent intermediate
representation</a>
into executable machine code.</p>
</blockquote>
<p>It basically means that the Cranelift API can be used to generate
executable code.</p>
<p>It’s perfect! I can replace <code>wasmi</code> with Cranelift, and boom, profit. But…
there are other ways to get even faster code execution, at the cost of a
longer code compilation though.</p>
<p>For instance, LLVM can provide very fast code execution, almost at
native speed. Or we can generate assembly code dynamically. Well, there
are multiple ways to achieve that. What if a project could provide a
WebAssembly virtual machine with multiple backends?</p>
<h2 id="enter-wasmer">Enter Wasmer<a role="presentation" class="anchor" href="#enter-wasmer" title="Anchor link to this header">#</a>
</h2>
<p>And it was at that specific time that I was hired by
<a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer">Wasmer</a>. To be totally honest, I
had been looking at Wasmer a few weeks before. It was a surprise and a great
opportunity for me. Well, the universe really wants this rewrite from
<code>wasmi</code> to Wasmer, right 😅?</p>
<p>Wasmer is organized as a set of Rust libraries (called crates). There is
even a <code>wasmer-runtime-c-api</code> crate which is a C and a C++ API on top of
the <code>wasmer-runtime</code> crate and the <code>wasmer-runtime-core</code> crate, i.e. it
allows running the WebAssembly virtual machine as you want, with the
backend of your choice: <em>Cranelift</em>, <em>LLVM</em>, or <em>Dynasm</em> (at the time of
writing). That’s perfect: it removes the need for my Rust library between the PHP
extension and <code>wasmi</code>. <code>php-ext-wasm</code> is then reduced to a PHP extension
without any Rust code; everything goes through <code>wasmer-runtime-c-api</code>. It’s
sad to remove Rust from this project, but it now relies on more Rust code!</p>
<p>Counting the time to make some patches on <code>wasmer-runtime-c-api</code>, I’ve
been able to migrate <code>php-ext-wasm</code> to Wasmer in 5 days.</p>
<p>By default, <code>php-ext-wasm</code> uses Wasmer with the Cranelift backend, which
strikes a great balance between compilation and execution time. It is
really good. Let’s run the benchmark, with the addition of
<code>php+wasmer(cranelift)</code>:</p>
<ul>
<li><code>rust-baseline</code>: 287ms,</li>
<li><code>php</code>: 19,761ms,</li>
<li><code>php+wasmi</code>: 67,622ms,</li>
<li><code>php+wasmer(cranelift)</code>: 2,365ms 🎉.</li>
</ul>
<p>Finally, the PHP extension provides a faster execution than PHP itself!
<code>php+wasmer(cranelift)</code> is <strong>8.4 times faster</strong> than <code>php</code> to be exact.
And it is <strong>28.6 times faster</strong> than <code>php+wasmi</code>. Can we reach
native speed (represented by <code>rust-baseline</code> here)? It’s very likely
with LLVM. That’s for another article. I’m super happy with Cranelift
for the moment. (See <a rel="noopener external" target="_blank" href="https://medium.com/wasmer/benchmarking-webassembly-runtimes-18497ce0d76e">our previous blog post to learn how we benchmark
different backends in Wasmer, and other WebAssembly
runtimes</a>).</p>
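<p>The ratios quoted in this article can be re-derived from the raw timings.
Here is a small Go sketch to do so; <code>speedup</code> is an illustrative
helper, and the constants are simply copied from the list above:</p>
<pre class="giallo z-code"><code data-lang="go">package main

import "fmt"

// speedup tells how many times faster `fast` is than `slow`, both
// being expressed in the same unit (here, milliseconds).
func speedup(slow, fast float64) float64 {
    return slow / fast
}

func main() {
    // Timings for `nbody(5000000)`, in milliseconds.
    const (
        rustBaseline       = 287.0
        php                = 19761.0
        phpWasmi           = 67622.0
        phpWasmerCranelift = 2365.0
    )

    fmt.Printf("php is %.1f times faster than php+wasmi\n", speedup(phpWasmi, php))
    fmt.Printf("php+wasmer(cranelift) is %.1f times faster than php\n", speedup(php, phpWasmerCranelift))
    fmt.Printf("php+wasmer(cranelift) is %.1f times faster than php+wasmi\n", speedup(phpWasmi, phpWasmerCranelift))
    fmt.Printf("rust-baseline is %.1f times faster than php+wasmer(cranelift)\n", speedup(phpWasmerCranelift, rustBaseline))
}</code></pre>
<p>The last line gives an idea of the remaining gap between the Cranelift
backend and native speed for this particular benchmark.</p>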
<h2 id="more-optimizations">More Optimizations<a role="presentation" class="anchor" href="#more-optimizations" title="Anchor link to this header">#</a>
</h2>
<p>Wasmer provides more features, like module caching. Those features are
now included in the PHP extension. Booting the <code>nbody.wasm</code> file
(19 kB) took 4.2ms. By booting, I mean: reading the WebAssembly
binary from a file, parsing it, validating it, and compiling it to
executable code and a WebAssembly module structure.</p>
<p>PHP’s execution model is: start, run, die. Memory is freed after each
request. If you want to use <code>php-ext-wasm</code>, you don’t really want to
pay that “<em>booting cost</em>” every time.</p>
<p>Fortunately, <code>wasmer-runtime-c-api</code> now provides a module serialization
API, which is integrated into the PHP extension itself. It saves the
“booting cost”, but it adds a “deserialization cost”. That second cost
is smaller, but still, we need to know it exists.</p>
<p>Fortunately again, Zend Engine has an API to keep persistent in-memory data
between PHP executions. <code>php-ext-wasm</code> supports that API to get
persistent modules, <em>et voilà</em>.</p>
<p>Now it takes <strong>4.2ms</strong> for the first boot of <code>nbody.wasm</code> and
<strong>0.005ms</strong> for all subsequent boots. It’s 840 times faster!</p>
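<p>The idea behind persistent modules can be sketched as a cache keyed by file
path: the expensive boot happens on the first request only, and later requests
get the module that is already in memory. This is a deliberately simplified Go
illustration, not how <code>php-ext-wasm</code> or Zend Engine implement it;
<code>Module</code>, <code>ModuleCache</code> and <code>Get</code> are
hypothetical names:</p>
<pre class="giallo z-code"><code data-lang="go">package main

import (
    "fmt"
    "sync"
)

// Module is a hypothetical stand-in for a compiled WebAssembly module.
type Module struct {
    Path string
}

// ModuleCache keeps booted modules in memory, keyed by file path.
type ModuleCache struct {
    mutex   sync.Mutex
    modules map[string]Module
    boots   int // how many times the expensive path was taken
}

// NewModuleCache creates an empty cache.
func NewModuleCache() *ModuleCache {
    cache := new(ModuleCache)
    cache.modules = make(map[string]Module)

    return cache
}

// Get returns the module for `path`, booting it only on a cache miss.
func (cache *ModuleCache) Get(path string) Module {
    cache.mutex.Lock()
    defer cache.mutex.Unlock()

    if module, found := cache.modules[path]; found {
        return module
    }

    // Cache miss: a real implementation would read, parse, validate
    // and compile the file here, i.e. pay the "booting cost".
    cache.boots++
    module := Module{Path: path}
    cache.modules[path] = module

    return module
}

func main() {
    cache := NewModuleCache()

    cache.Get("nbody.wasm")
    cache.Get("nbody.wasm")
    cache.Get("nbody.wasm")

    fmt.Println("boots:", cache.boots) // boots: 1
}</code></pre>
<p>The cache only pays off if it outlives a single request, which is exactly
what the persistent in-memory data API of Zend Engine makes possible.</p>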
<h2 id="conclusion">Conclusion<a role="presentation" class="anchor" href="#conclusion" title="Anchor link to this header">#</a>
</h2>
<p>Wasmer is a young but mature framework to build WebAssembly runtimes
on top of. The default backend is Cranelift, and it delivers on its promise:
it strikes a good balance between compilation time and execution time.</p>
<p><code>wasmi</code> has been a good companion to develop a <em>Proof-Of-Concept</em>. This
library still has its place in other use cases though, like very short-lived
WebAssembly binaries (I’m thinking of Ethereum contracts that compile to
WebAssembly for instance, which is one of the actual use cases). It’s
important to understand that no runtime is better than another; it
depends on the use case.</p>
<p>The next step is to stabilize <code>php-ext-wasm</code> to release a 1.0.0 version.</p>
<p>See you there!</p>
<p>If you want to follow the development, take a look at
<a rel="noopener external" target="_blank" href="https://twitter.com/wasmerio">@wasmerio</a> and
<a rel="noopener external" target="_blank" href="https://twitter.com/mnt_io">@mnt_io</a> on Twitter.</p>
Bye bye Automattic, hello Wasmer2019-03-04T00:00:00+00:002019-03-04T00:00:00+00:00
Unknown
https://mnt.io/articles/bye-bye-automattic-hello-wasmer/<p>Today is my first day at <a rel="noopener external" target="_blank" href="https://wasmer.io/">Wasmer</a>.</p>
<p>It's with a lot of regret that I leave Automattic. To be clear, I'm not
leaving because something negative happened; I'm leaving because I've
received the same job offer, 3 times in 10 days, from Wasmer, Google and
Mozilla: namely, to work with Rust or C++ to build a WebAssembly runtime.
This is an offer I can hardly decline. It's an opportunity and a dream
for me. And I was lucky enough to get a choice between 3 excellent
companies!</p>
<p>I can only encourage you to <a rel="noopener external" target="_blank" href="https://automattic.com/work-with-us/">work with
Automattic</a>. It's definitely the
best company I've ever worked with, stealing first place from Mozilla.
Automattic is not only about WordPress.com and other services: It's a
way of living. The culture, the spirit, the interactions between people,
the mission, everything is <em>exceptional</em>. It has been a super great
experience.</p>
<p>I could write 100 pages about my team. They have all been <em>remarkable</em>
in many ways. I'm closer to them, although they live 10,000 km away, than
to colleagues I met every day in person in the past. Congrats to
<a rel="noopener external" target="_blank" href="https://ma.tt/about/">Matt</a> for this incredible project.</p>
<p>Now it's time to work on <a rel="noopener external" target="_blank" href="https://github.com/wasmerio/wasmer">Wasmer</a>.
It's a <a rel="noopener external" target="_blank" href="https://webassembly.org/">WebAssembly</a> runtime written in
<a rel="noopener external" target="_blank" href="https://www.rust-lang.org/">Rust</a>: My two current passions. It's
powerful, modular, well-designed, and it comes with great ambitions. I'm
really excited. I work with an extraordinary team: <a rel="noopener external" target="_blank" href="https://github.com/syrusakbary">Syrus
Akbary</a> (the author of
<a rel="noopener external" target="_blank" href="https://github.com/graphql-python/graphene">Graphene</a>, a GraphQL
framework in Python), <a rel="noopener external" target="_blank" href="https://github.com/lachlansneff">Lachlan Sneff</a>
(the author of <a rel="noopener external" target="_blank" href="https://github.com/nebulet/nebulet">Nebulet</a>, a
microkernel that implements a WebAssembly "usermode" that runs in Ring
0), <a rel="noopener external" target="_blank" href="https://github.com/bjfish">Brandon Fish</a> (a great contributor to
<a rel="noopener external" target="_blank" href="https://github.com/oracle/truffleruby">Truffleruby</a>, a high-performance
implementation of Ruby with GraalVM), <a rel="noopener external" target="_blank" href="https://github.com/xmclark">Mackenzie
Clark</a>, and soon more.</p>
<p>My job will consist of working on the runtime of course, and also of
integrating/embedding the runtime into different languages, such as PHP (like
I did with <a rel="noopener external" target="_blank" href="https://github.com/Hywan/php-ext-wasm"><code>php-ext-wasm</code></a>; more
to come on this blog). More secret projects are coming. Let's turn them into
realities 🎉!</p>
The PHP galaxy2018-10-29T00:00:00+00:002018-10-29T00:00:00+00:00
Unknown
https://mnt.io/series/from-rust-to-beyond/the-php-galaxy/<p>The galaxy we will explore today is the PHP galaxy. This post will
explain what PHP is, how to compile any Rust program to C and then to a
PHP native extension.</p>
<h2 id="what-is-php-and-why">What is PHP, and why?<a role="presentation" class="anchor" href="#what-is-php-and-why" title="Anchor link to this header">#</a>
</h2>
<p><a rel="noopener external" target="_blank" href="https://secure.php.net/">PHP</a> is a:</p>
<blockquote>
<p>popular general-purpose scripting language that is especially suited
to Web development. Fast, flexible, and pragmatic, PHP powers
everything from your blog to the most popular websites in the world.</p>
</blockquote>
<p>PHP has sadly acquired a bad reputation over the years, but recent
releases (since PHP 7.0 mostly) have introduced neat language features,
and many cleanups, which are too often ignored by haters. PHP is also
a fast scripting language, and is very flexible. PHP now has declared
types, traits, variadic arguments, closures (with explicit scopes!),
generators, and <em>huge</em> backward compatibility. The development of PHP
is led by <a rel="noopener external" target="_blank" href="https://wiki.php.net/rfc">RFCs</a>, which is an open and
democratic process.</p>
<p>The Gutenberg project is a new editor for WordPress. The latter is
written in PHP. So naturally, we want a native extension for
PHP to parse the Gutenberg post format.</p>
<p>PHP is a language with <a rel="noopener external" target="_blank" href="https://github.com/php/php-langspec">a
specification</a>. The most popular
virtual machine is <a rel="noopener external" target="_blank" href="http://php.net/manual/en/internals2.php">Zend
Engine</a>. Other virtual machines
exist, like <a rel="noopener external" target="_blank" href="https://hhvm.com/">HHVM</a> (but the PHP support has been
dropped recently in favor of their own PHP fork, called Hack),
<a rel="noopener external" target="_blank" href="https://www.peachpie.io/">Peachpie</a>, or <a rel="noopener external" target="_blank" href="https://github.com/tagua-vm/tagua-vm">Tagua
VM</a> (under development).</p>
<p>In this post, we will create an extension for Zend Engine. This virtual
machine is written in C. Great, we have visited <a href="https://mnt.io/series/from-rust-to-beyond/the-c-galaxy/">the C galaxy in the
previous
episode</a>!</p>
<h2 id="rust-rocket-c-rocket-php">Rust 🚀 C 🚀 PHP<a role="presentation" class="anchor" href="#rust-rocket-c-rocket-php" title="Anchor link to this header">#</a>
</h2>
<figure role="presentation">
<p><img src="https://mnt.io/series/from-rust-to-beyond/the-php-galaxy/./rust-to-php.png" alt="Rust to PHP" loading="lazy" decoding="async" /></p>
</figure>
<p>To port our Rust parser into PHP, we first need to port it to C. It's
been done in the previous episode. Two files result from this port to
C: <code>libgutenberg_post_parser.a</code> and <code>gutenberg_post_parser.h</code>,
respectively a static library, and the header file.</p>
<h3 id="">Bootstrap with a skeleton<a role="presentation" class="anchor" href="#" title="Anchor link to this header">#</a>
</h3>
<p>PHP comes with <a rel="noopener external" target="_blank" href="http://php.net/manual/en/internals2.buildsys.skeleton.php">a script to create an extension
skeleton</a>/template,
called
<a rel="noopener external" target="_blank" href="https://github.com/php/php-src/blob/master/ext/ext_skel.php"><code>ext_skel.php</code></a>.
This script is accessible from the source of the Zend Engine virtual
machine (which we will refer to as <code>php-src</code>). One can invoke the script
like this:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cd php-src/ext/</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> ./ext_skel.php \</span></span>
<span class="giallo-l"><span> --ext gutenberg_post_parser \</span></span>
<span class="giallo-l"><span> --author 'Ivan Enderlin' \</span></span>
<span class="giallo-l"><span> --dir /path/to/extension \</span></span>
<span class="giallo-l"><span> --onlyunix</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cd /path/to/extension</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> ls gutenberg_post_parser</span></span>
<span class="giallo-l"><span>tests/</span></span>
<span class="giallo-l"><span>.gitignore</span></span>
<span class="giallo-l"><span>CREDITS</span></span>
<span class="giallo-l"><span>config.m4</span></span>
<span class="giallo-l"><span>gutenberg_post_parser.c</span></span>
<span class="giallo-l"><span>php_gutenberg_post_parser.h</span></span></code></pre>
<p>The <code>ext_skel.php</code> script recommends going through the following steps:</p>
<ul>
<li>Rebuild the configuration of the PHP source (run <code>./buildconf</code> at the
root of the <code>php-src</code> directory),</li>
<li>Reconfigure the build system to enable the extension, like
<code>./configure --enable-gutenberg_post_parser</code>,</li>
<li>Build with <code>make</code>,</li>
<li>Done.</li>
</ul>
<p>But our extension is very likely to live outside the <code>php-src</code> tree. So
we will use <code>phpize</code> instead. <code>phpize</code> is an executable that comes with
<code>php</code>, <code>php-cgi</code>, <code>phpdbg</code>, <code>php-config</code> etc. It lets us compile
extensions against an already compiled <code>php</code> binary, which is perfect in
our case! We will use it like this:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cd /path/to/extension/gutenberg_post_parser</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> # Get the bin directory </span><span class="z-keyword">for</span><span> PHP utilities.</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span class="z-variable"> PHP_PREFIX_BIN</span><span class="z-keyword z-operator">=</span><span>$(</span><span class="z-entity z-name">php-config</span><span class="z-constant z-other"> --prefix</span><span>)</span><span class="z-string">/bin</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> # Clean (</span><span class="z-entity z-name">except</span><span class="z-string"> if it is the first run</span><span>).</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span class="z-variable"> $PHP_PREFIX_BIN</span><span>/phpize --clean</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> # “phpize” the extension.</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span class="z-variable"> $PHP_PREFIX_BIN</span><span>/phpize</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> # Configure the extension </span><span class="z-keyword">for</span><span> a particular PHP version.</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> ./configure</span><span class="z-variable"> --with-php-config</span><span class="z-keyword z-operator">=</span><span class="z-variable">$PHP_PREFIX_BIN</span><span class="z-string">/php-config</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> # Compile and install.</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> make install</span></span></code></pre>
<p>In this post, we will not show all the edits we have done, but we will
rather focus on the extension binding. <a rel="noopener external" target="_blank" href="https://github.com/Hywan/gutenberg-parser-rs/tree/master/bindings/php/extension/gutenberg_post_parser">All the sources can be found
here</a>.
In short, here is the <code>config.m4</code> file:</p>
<pre class="giallo z-code"><code data-lang="plain"><span class="giallo-l"><span>PHP_ARG_ENABLE(gutenberg_post_parser, whether to enable gutenberg_post_parser support,</span></span>
<span class="giallo-l"><span>[  --enable-gutenberg_post_parser          Include gutenberg_post_parser support], no)</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>if test "$PHP_GUTENBERG_POST_PARSER" != "no"; then</span></span>
<span class="giallo-l"><span> PHP_SUBST(GUTENBERG_POST_PARSER_SHARED_LIBADD)</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> PHP_ADD_LIBRARY_WITH_PATH(gutenberg_post_parser, ., GUTENBERG_POST_PARSER_SHARED_LIBADD)</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> PHP_NEW_EXTENSION(gutenberg_post_parser, gutenberg_post_parser.c, $ext_shared)</span></span>
<span class="giallo-l"><span>fi</span></span></code></pre>
<p>Basically, it does the following:</p>
<ul>
<li>Register the <code>--enable-gutenberg_post_parser</code> option in the build
system (<code>PHP_ARG_ENABLE</code> always produces an <code>--enable-*</code> flag), and</li>
<li>Declare the static library to link against, and the source of the
extension itself.</li>
</ul>
<p>We must add the <code>libgutenberg_post_parser.a</code> and
<code>gutenberg_post_parser.h</code> files in the same directory (a symlink is
perfect), to get a structure such as:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> ls gutenberg_post_parser</span></span>
<span class="giallo-l"><span>tests/ # from ext_skel</span></span>
<span class="giallo-l"><span>.gitignore # from ext_skel</span></span>
<span class="giallo-l"><span>CREDITS # from ext_skel</span></span>
<span class="giallo-l"><span>config.m4 # from ext_skel (edited)</span></span>
<span class="giallo-l"><span>gutenberg_post_parser.c # from ext_skel (will be edited)</span></span>
<span class="giallo-l"><span>gutenberg_post_parser.h # from Rust</span></span>
<span class="giallo-l"><span>libgutenberg_post_parser.a # from Rust</span></span>
<span class="giallo-l"><span>php_gutenberg_post_parser.h # from ext_skel</span></span></code></pre>
<p>The core of the extension is the <code>gutenberg_post_parser.c</code> file. This
file is responsible for creating the module, and for binding our Rust code to
PHP.</p>
<h3 id="-1">The module, aka the extension<a role="presentation" class="anchor" href="#-1" title="Anchor link to this header">#</a>
</h3>
<p>As mentioned, we will work in the <code>gutenberg_post_parser.c</code> file. First,
let's include everything we need:</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-keyword">#include</span><span class="z-string"> "php.h"</span></span>
<span class="giallo-l"><span class="z-keyword">#include</span><span class="z-string"> "ext/standard/info.h"</span></span>
<span class="giallo-l"><span class="z-keyword">#include</span><span class="z-string"> "php_gutenberg_post_parser.h"</span></span>
<span class="giallo-l"><span class="z-keyword">#include</span><span class="z-string"> "gutenberg_post_parser.h"</span></span></code></pre>
<p>The last line includes the <code>gutenberg_post_parser.h</code> file generated by
Rust (more precisely, by <code>cbindgen</code>; if you don't remember, <a href="https://mnt.io/series/from-rust-to-beyond/the-c-galaxy/">take a look
at the previous
episode</a>).</p>
<p>Then, we have to decide which API we want to expose to PHP. As a
reminder, the Rust parser produces an AST defined as:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> enum</span><span class="z-entity z-name"> Node</span><span><'</span><span class="z-entity z-name">a</span><span>> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Block</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> name</span><span class="z-keyword z-operator">:</span><span> (</span><span class="z-entity z-name">Input</span><span><'</span><span class="z-entity z-name">a</span><span>>,</span><span class="z-entity z-name"> Input</span><span><'</span><span class="z-entity z-name">a</span><span>>),</span></span>
<span class="giallo-l"><span class="z-variable"> attributes</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Option</span><span><</span><span class="z-entity z-name">Input</span><span><'</span><span class="z-entity z-name">a</span><span>>>,</span></span>
<span class="giallo-l"><span class="z-variable"> children</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Vec</span><span><</span><span class="z-entity z-name">Node</span><span><'</span><span class="z-entity z-name">a</span><span>>></span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> Phrase</span><span>(</span><span class="z-entity z-name">Input</span><span><'</span><span class="z-entity z-name">a</span><span>>)</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>The C variant of the AST is very similar (with more structures, but the
idea is almost identical). So in PHP, the following structure has been
selected:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> Gutenberg_Parser_Block</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-keyword"> string</span><span class="z-variable"> $namespace</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-keyword"> string</span><span class="z-variable"> $name</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage">    public</span><span class="z-keyword"> ?string</span><span class="z-variable"> $attributes</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-keyword"> array</span><span class="z-variable"> $children</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> Gutenberg_Parser_Phrase</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-keyword"> string</span><span class="z-variable"> $content</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage z-type z-function">function</span><span class="z-entity z-name z-function"> gutenberg_post_parse</span><span>(</span><span class="z-keyword">string</span><span class="z-variable"> $gutenberg_post</span><span>)</span><span class="z-keyword z-operator">:</span><span class="z-keyword"> array</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>The <code>gutenberg_post_parse</code> function will output an array of objects of
kind <code>Gutenberg_Parser_Block</code> or <code>Gutenberg_Parser_Phrase</code>, i.e. our
AST.</p>
<p>So, let's declare those classes!</p>
<h3 id="-2">Declare the classes<a role="presentation" class="anchor" href="#-2" title="Anchor link to this header">#</a>
</h3>
<p><em>Note: The next 4 code blocks are not the core of the post; they are just
code that needs to be written. You can skip them if you are not about to
write a PHP extension.</em></p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span>zend_class_entry </span><span class="z-keyword z-operator">*</span><span>gutenberg_parser_block_class_entry</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>zend_class_entry </span><span class="z-keyword z-operator">*</span><span>gutenberg_parser_phrase_class_entry</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>zend_object_handlers gutenberg_parser_node_class_entry_handlers</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">typedef</span><span class="z-storage z-type"> struct</span><span> _gutenberg_parser_node </span><span class="z-punctuation z-section">{</span></span>
<span class="giallo-l"><span> zend_object zobj</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span><span> gutenberg_parser_node</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>A class entry represents a specific class type. A handler is associated
with a class entry. The logic is somewhat complicated. If you need more
details, I recommend reading the <a rel="noopener external" target="_blank" href="http://www.phpinternalsbook.com/">PHP Internals
Book</a>.</p>
<p>Then, let's create a function to instantiate those objects:</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-storage">static</span><span> zend_object </span><span class="z-keyword z-operator">*</span><span class="z-entity z-name z-function">create_parser_node_object</span><span class="z-punctuation z-section">(</span><span>zend_class_entry </span><span class="z-keyword z-operator">*</span><span class="z-variable z-parameter">class_entry</span><span class="z-punctuation z-section">)</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">{</span></span>
<span class="giallo-l"><span> gutenberg_parser_node </span><span class="z-keyword z-operator">*</span><span>gutenberg_parser_node_object</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> gutenberg_parser_node_object </span><span class="z-keyword z-operator">=</span><span class="z-entity z-name z-function"> ecalloc</span><span class="z-punctuation z-section">(</span><span class="z-constant z-numeric">1</span><span class="z-punctuation z-separator">,</span><span class="z-keyword z-operator"> sizeof</span><span class="z-punctuation z-section">(</span><span class="z-keyword z-operator">*</span><span>gutenberg_parser_node_object</span><span class="z-punctuation z-section">)</span><span class="z-keyword z-operator"> +</span><span class="z-entity z-name z-function"> zend_object_properties_size</span><span class="z-punctuation z-section">(</span><span>class_entry</span><span class="z-punctuation z-section">))</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> zend_object_std_init</span><span class="z-punctuation z-section">(</span><span class="z-keyword z-operator">&</span><span class="z-variable">gutenberg_parser_node_object</span><span class="z-punctuation z-separator">-></span><span class="z-variable">zobj</span><span class="z-punctuation z-separator">,</span><span> class_entry</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> object_properties_init</span><span class="z-punctuation z-section">(</span><span class="z-keyword z-operator">&</span><span class="z-variable">gutenberg_parser_node_object</span><span class="z-punctuation z-separator">-></span><span class="z-variable">zobj</span><span class="z-punctuation z-separator">,</span><span> class_entry</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> gutenberg_parser_node_object</span><span class="z-punctuation z-separator">-></span><span class="z-variable">zobj</span><span class="z-punctuation z-separator">.</span><span class="z-variable">handlers</span><span class="z-keyword z-operator"> = &</span><span>gutenberg_parser_node_class_entry_handlers</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-keyword z-operator"> &</span><span class="z-variable">gutenberg_parser_node_object</span><span class="z-punctuation z-separator">-></span><span class="z-variable">zobj</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span></span></code></pre>
<p>Then, let's create a function to free those objects. It works in two
steps: Destruct the object by calling its destructor (in the user-land),
then free it for real (in the VM-land):</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-storage">static</span><span class="z-storage z-type"> void</span><span class="z-entity z-name z-function"> destroy_parser_node_object</span><span class="z-punctuation z-section">(</span><span>zend_object </span><span class="z-keyword z-operator">*</span><span class="z-variable z-parameter">gutenberg_parser_node_object</span><span class="z-punctuation z-section">)</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">{</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> zend_objects_destroy_object</span><span class="z-punctuation z-section">(</span><span>gutenberg_parser_node_object</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">static</span><span class="z-storage z-type"> void</span><span class="z-entity z-name z-function"> free_parser_node_object</span><span class="z-punctuation z-section">(</span><span>zend_object </span><span class="z-keyword z-operator">*</span><span class="z-variable z-parameter">gutenberg_parser_node_object</span><span class="z-punctuation z-section">)</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">{</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> zend_object_std_dtor</span><span class="z-punctuation z-section">(</span><span>gutenberg_parser_node_object</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span></span></code></pre>
<p>Then, let's initialize the “module”, i.e. the extension. During the
initialization, we will create the classes in the user-land, declare
their attributes, etc.</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-entity z-name z-function">PHP_MINIT_FUNCTION</span><span class="z-punctuation z-section">(</span><span>gutenberg_post_parser</span><span class="z-punctuation z-section">)</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">{</span></span>
<span class="giallo-l"><span> zend_class_entry class_entry</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Declare Gutenberg_Parser_Block.</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> INIT_CLASS_ENTRY</span><span class="z-punctuation z-section">(</span><span>class_entry</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "Gutenberg_Parser_Block"</span><span class="z-punctuation z-separator">,</span><span class="z-constant z-language"> NULL</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> gutenberg_parser_block_class_entry </span><span class="z-keyword z-operator">=</span><span class="z-entity z-name z-function"> zend_register_internal_class</span><span class="z-punctuation z-section">(</span><span class="z-keyword z-operator">&</span><span>class_entry TSRMLS_CC</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Declare the create handler.</span></span>
<span class="giallo-l"><span class="z-variable"> gutenberg_parser_block_class_entry</span><span class="z-punctuation z-separator">-></span><span class="z-variable">create_object</span><span class="z-keyword z-operator"> =</span><span> create_parser_node_object</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // The class is final.</span></span>
<span class="giallo-l"><span class="z-variable"> gutenberg_parser_block_class_entry</span><span class="z-punctuation z-separator">-></span><span class="z-variable">ce_flags</span><span class="z-keyword z-operator"> |=</span><span> ZEND_ACC_FINAL</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Declare the `namespace` public attribute,</span></span>
<span class="giallo-l"><span class="z-comment"> // with an empty string for the default value.</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> zend_declare_property_string</span><span class="z-punctuation z-section">(</span><span>gutenberg_parser_block_class_entry</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "namespace"</span><span class="z-punctuation z-separator">,</span><span class="z-keyword z-operator"> sizeof</span><span class="z-punctuation z-section">(</span><span class="z-string">"namespace"</span><span class="z-punctuation z-section">)</span><span class="z-keyword z-operator"> -</span><span class="z-constant z-numeric"> 1</span><span class="z-punctuation z-separator">,</span><span class="z-string"> ""</span><span class="z-punctuation z-separator">,</span><span> ZEND_ACC_PUBLIC</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Declare the `name` public attribute,</span></span>
<span class="giallo-l"><span class="z-comment"> // with an empty string for the default value.</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> zend_declare_property_string</span><span class="z-punctuation z-section">(</span><span>gutenberg_parser_block_class_entry</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "name"</span><span class="z-punctuation z-separator">,</span><span class="z-keyword z-operator"> sizeof</span><span class="z-punctuation z-section">(</span><span class="z-string">"name"</span><span class="z-punctuation z-section">)</span><span class="z-keyword z-operator"> -</span><span class="z-constant z-numeric"> 1</span><span class="z-punctuation z-separator">,</span><span class="z-string"> ""</span><span class="z-punctuation z-separator">,</span><span> ZEND_ACC_PUBLIC</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Declare the `attributes` public attribute,</span></span>
<span class="giallo-l"><span class="z-comment"> // with `NULL` for the default value.</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> zend_declare_property_null</span><span class="z-punctuation z-section">(</span><span>gutenberg_parser_block_class_entry</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "attributes"</span><span class="z-punctuation z-separator">,</span><span class="z-keyword z-operator"> sizeof</span><span class="z-punctuation z-section">(</span><span class="z-string">"attributes"</span><span class="z-punctuation z-section">)</span><span class="z-keyword z-operator"> -</span><span class="z-constant z-numeric"> 1</span><span class="z-punctuation z-separator">,</span><span> ZEND_ACC_PUBLIC</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Declare the `children` public attribute,</span></span>
<span class="giallo-l"><span class="z-comment"> // with `NULL` for the default value.</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> zend_declare_property_null</span><span class="z-punctuation z-section">(</span><span>gutenberg_parser_block_class_entry</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "children"</span><span class="z-punctuation z-separator">,</span><span class="z-keyword z-operator"> sizeof</span><span class="z-punctuation z-section">(</span><span class="z-string">"children"</span><span class="z-punctuation z-section">)</span><span class="z-keyword z-operator"> -</span><span class="z-constant z-numeric"> 1</span><span class="z-punctuation z-separator">,</span><span> ZEND_ACC_PUBLIC</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">    // Declare Gutenberg_Parser_Phrase.</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> … skip …</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Declare Gutenberg parser node object handlers.</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> memcpy</span><span class="z-punctuation z-section">(</span><span class="z-keyword z-operator">&</span><span>gutenberg_parser_node_class_entry_handlers</span><span class="z-punctuation z-separator">,</span><span class="z-entity z-name z-function"> zend_get_std_object_handlers</span><span class="z-punctuation z-section">()</span><span class="z-punctuation z-separator">,</span><span class="z-keyword z-operator"> sizeof</span><span class="z-punctuation z-section">(</span><span>gutenberg_parser_node_class_entry_handlers</span><span class="z-punctuation z-section">))</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> gutenberg_parser_node_class_entry_handlers</span><span class="z-punctuation z-separator">.</span><span class="z-variable">offset</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name z-function"> XtOffsetOf</span><span class="z-punctuation z-section">(</span><span>gutenberg_parser_node</span><span class="z-punctuation z-separator">,</span><span> zobj</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable"> gutenberg_parser_node_class_entry_handlers</span><span class="z-punctuation z-separator">.</span><span class="z-variable">dtor_obj</span><span class="z-keyword z-operator"> =</span><span> destroy_parser_node_object</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable"> gutenberg_parser_node_class_entry_handlers</span><span class="z-punctuation z-separator">.</span><span class="z-variable">free_obj</span><span class="z-keyword z-operator"> =</span><span> free_parser_node_object</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span> SUCCESS</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span></span></code></pre>
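<p>The <code>offset</code> handler set above records where the <code>zend_object</code> sits
inside our wrapper structure, which is what allows going back from a pointer to
the embedded object to the start of the wrapper. Here is a minimal standalone
sketch of that idea, assuming nothing from the Zend headers:
<code>fake_object</code>, <code>fake_node</code> and <code>node_from_object</code> are hypothetical
names, and plain <code>offsetof</code> plays the role of <code>XtOffsetOf</code>:</p>
<pre class="giallo z-code"><code data-lang="c">#include &lt;stddef.h&gt;

/* Hypothetical stand-in for zend_object (not the real Zend type). */
typedef struct fake_object {
    int refcount;
} fake_object;

/* Hypothetical wrapper: custom data first, embedded object last. */
typedef struct fake_node {
    int custom_data;
    fake_object zobj;
} fake_node;

/* Recover the wrapper from a pointer to its embedded object by
 * subtracting the offset of the `zobj` member. */
static fake_node *node_from_object(fake_object *obj)
{
    return (fake_node *)((char *)obj - offsetof(fake_node, zobj));
}
</code></pre>
<p>In our extension, <code>zobj</code> is the only member of <code>gutenberg_parser_node</code>,
so the offset is simply 0. It would start to matter if custom fields were added
before it.</p>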
<p>If you are still reading, first: Thank you, and second: Congrats!</p>
<p>Then, there are the <code>PHP_RINIT_FUNCTION</code> and <code>PHP_MINFO_FUNCTION</code>
functions, which are already generated by the <code>ext_skel.php</code> script. Same
for the module entry definition and other module configuration details.</p>
<h3 id="gutenberg-post-parse">The <code>gutenberg_post_parse</code> function<a role="presentation" class="anchor" href="#gutenberg-post-parse" title="Anchor link to this header">#</a>
</h3>
<p>We will now focus on the <code>gutenberg_post_parse</code> PHP function. This
function takes a string as a single argument and returns either <code>false</code>
if the parsing failed, or an array of objects of kind
<code>Gutenberg_Parser_Block</code> or <code>Gutenberg_Parser_Phrase</code> otherwise. Let's
write it! Notice that it is declared with <a rel="noopener external" target="_blank" href="https://github.com/php/php-src/blob/52d91260df54995a680f420884338dfd9d5a0d49/main/php.h#L400">the <code>PHP_FUNCTION</code>
macro</a>.</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-entity z-name z-function">PHP_FUNCTION</span><span class="z-punctuation z-section">(</span><span>gutenberg_post_parse</span><span class="z-punctuation z-section">)</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">{</span></span>
<span class="giallo-l"><span class="z-storage z-type"> char</span><span class="z-keyword z-operator"> *</span><span>input</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage z-type"> size_t</span><span> input_len</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Read the input as a string.</span></span>
<span class="giallo-l"><span class="z-keyword"> if</span><span class="z-punctuation z-section"> (</span><span class="z-entity z-name z-function">zend_parse_parameters</span><span class="z-punctuation z-section">(</span><span class="z-entity z-name z-function">ZEND_NUM_ARGS</span><span class="z-punctuation z-section">()</span><span> TSRMLS_CC</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "s"</span><span class="z-punctuation z-separator">,</span><span class="z-keyword z-operator"> &</span><span>input</span><span class="z-punctuation z-separator">,</span><span class="z-keyword z-operator"> &</span><span>input_len</span><span class="z-punctuation z-section">)</span><span class="z-keyword z-operator"> ==</span><span> FAILURE</span><span class="z-punctuation z-section">) {</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> }</span></span></code></pre>
<p>At this step, the argument has been declared and typed as a string
(<code>"s"</code>). The string value is in <code>input</code> and the string length is in
<code>input_len</code>.</p>
<p>The next step is to parse the <code>input</code> (the length of the string is not
needed). This is where we are going to call our Rust code! Let's do
that:</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-comment"> // Parse the input.</span></span>
<span class="giallo-l"><span> Result parser_result </span><span class="z-keyword z-operator">=</span><span class="z-entity z-name z-function"> parse</span><span class="z-punctuation z-section">(</span><span>input</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // If parsing failed, then return false.</span></span>
<span class="giallo-l"><span class="z-keyword"> if</span><span class="z-punctuation z-section"> (</span><span>parser_result.tag </span><span class="z-keyword z-operator">==</span><span> Err</span><span class="z-punctuation z-section">) {</span></span>
<span class="giallo-l"><span> RETURN_FALSE</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Else map the Rust AST into a PHP array.</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span> Vector_Node nodes </span><span class="z-keyword z-operator">=</span><span> parser_result.ok._0</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>The <code>Result</code> type and the <code>parse</code> function come from Rust. If you don't
remember those types, please <a href="https://mnt.io/series/from-rust-to-beyond/the-c-galaxy/">read the previous episode about the C
galaxy</a>.</p>
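<p>As a refresher, here is a minimal, self-contained sketch of how such a tagged
union can be consumed from C. The names <code>Demo_Result</code>,
<code>demo_parse</code> and <code>demo_unwrap</code> are hypothetical stand-ins,
not the generated bindings: the real <code>Result</code> carries a
<code>Vector_Node</code>, whereas this toy version only carries a length.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the generated bindings: the real `Result`
 * carries a `Vector_Node`; here the payload is a plain length. */
typedef enum { Demo_Ok, Demo_Err } Demo_Result_Tag;

typedef struct { size_t _0; } Demo_Ok_Body;

typedef struct {
    Demo_Result_Tag tag;
    union {
        Demo_Ok_Body ok; /* Only meaningful when tag == Demo_Ok. */
    };
} Demo_Result;

/* Toy "parser": fails on empty input, otherwise reports the input length. */
Demo_Result demo_parse(const char *input)
{
    Demo_Result result;

    assert(input != NULL);

    result.tag = Demo_Err;
    result.ok._0 = 0;

    if (input[0] == '\0') {
        return result;
    }

    size_t length = 0;
    while (input[length] != '\0') {
        ++length;
    }

    result.tag = Demo_Ok;
    result.ok._0 = length;

    return result;
}

/* Always check the tag before reading the payload.
 * Returns 1 and writes to `out` on success, 0 otherwise. */
int demo_unwrap(Demo_Result result, size_t *out)
{
    if (result.tag == Demo_Err) {
        return 0;
    }

    *out = result.ok._0;

    return 1;
}
```

<p>The important habit is the same as in the extension: read the tag first, and
only then touch the member of the union that matches it.</p>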
<p>Zend Engine has a macro called <code>RETURN_FALSE</code> to return… <code>false</code>! Handy,
isn't it?</p>
<p>Finally, if everything went well, we get back a collection of nodes as a
<code>Vector_Node</code> type.</p>
<p>The next step is to map those Rust/C types into PHP types, i.e. an array
of the Gutenberg classes. Let's go:</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-comment"> // Note: return_value is a “magic” variable that holds the value to be returned.</span></span>
<span class="giallo-l"><span class="z-comment"> //</span></span>
<span class="giallo-l"><span class="z-comment"> // Allocate an array.</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> array_init_size</span><span class="z-punctuation z-section">(</span><span>return_value</span><span class="z-punctuation z-separator">,</span><span> nodes.length</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Map the Rust AST.</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> into_php_objects</span><span class="z-punctuation z-section">(</span><span>return_value</span><span class="z-punctuation z-separator">,</span><span class="z-keyword z-operator"> &</span><span class="z-variable z-parameter">nodes</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Done 😁! Oh wait… the <code>into_php_objects</code> function needs to be written!</p>
<h3 id="into-php-objects">The <code>into_php_objects</code> function<a role="presentation" class="anchor" href="#into-php-objects" title="Anchor link to this header">#</a>
</h3>
<p>This function is not terribly complex: it's just full of Zend Engine
specific APIs, as expected. We are going to explain how to map a <code>Block</code>
into a <code>Gutenberg_Parser_Block</code> object, and leave the <code>Phrase</code> mapping
to <code>Gutenberg_Parser_Phrase</code> as an exercise for the assiduous reader. And there we go:</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-storage z-type">void</span><span class="z-entity z-name z-function"> into_php_objects</span><span class="z-punctuation z-section">(</span><span>zval </span><span class="z-keyword z-operator">*</span><span class="z-variable z-parameter">php_array</span><span class="z-punctuation z-separator">,</span><span class="z-storage"> const</span><span> Vector_Node </span><span class="z-keyword z-operator">*</span><span class="z-variable z-parameter">nodes</span><span class="z-punctuation z-section">)</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">{</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-storage z-type"> uintptr_t</span><span> number_of_nodes </span><span class="z-keyword z-operator">=</span><span class="z-variable"> nodes</span><span class="z-punctuation z-separator">-></span><span class="z-variable">length</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> if</span><span class="z-punctuation z-section"> (</span><span>number_of_nodes </span><span class="z-keyword z-operator">==</span><span class="z-constant z-numeric"> 0</span><span class="z-punctuation z-section">) {</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Iterate over all nodes.</span></span>
<span class="giallo-l"><span class="z-keyword"> for</span><span class="z-punctuation z-section"> (</span><span class="z-storage z-type">uintptr_t</span><span> nth </span><span class="z-keyword z-operator">=</span><span class="z-constant z-numeric"> 0</span><span class="z-punctuation z-terminator">;</span><span> nth </span><span class="z-keyword z-operator"><</span><span> number_of_nodes</span><span class="z-punctuation z-terminator">;</span><span class="z-keyword z-operator"> ++</span><span>nth</span><span class="z-punctuation z-section">) {</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span> Node node </span><span class="z-keyword z-operator">=</span><span class="z-variable"> nodes</span><span class="z-punctuation z-separator">-></span><span class="z-variable">buffer</span><span>[nth]</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> if</span><span class="z-punctuation z-section"> (</span><span class="z-variable">node</span><span class="z-punctuation z-separator">.</span><span class="z-variable">tag</span><span class="z-keyword z-operator"> ==</span><span> Block</span><span class="z-punctuation z-section">) {</span></span>
<span class="giallo-l"><span class="z-comment"> // Map Block into Gutenberg_Parser_Block.</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> }</span><span class="z-keyword"> else if</span><span class="z-punctuation z-section"> (</span><span class="z-variable">node</span><span class="z-punctuation z-separator">.</span><span class="z-variable">tag</span><span class="z-keyword z-operator"> ==</span><span> Phrase</span><span class="z-punctuation z-section">) {</span></span>
<span class="giallo-l"><span class="z-comment"> // Map Phrase into Gutenberg_Parser_Phrase.</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> }</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> }</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span></span></code></pre>
<p>Now let's map a block. The process is the following:</p>
<ol>
<li>Allocate PHP strings for the block namespace, and for the block
name,</li>
<li>Allocate an object,</li>
<li>Set the block namespace and the block name to their respective
object properties,</li>
<li>Allocate a PHP string for the block attributes if any,</li>
<li>Set the block attributes to its respective object property,</li>
<li>If any children, initialise a new array, and call <code>into_php_objects</code>
with the child nodes and the new array,</li>
<li>Set the children to its respective object property,</li>
<li>Finally, add the block object inside the array to be returned.</li>
</ol>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-storage">const</span><span> Block_Body block </span><span class="z-keyword z-operator">=</span><span> node.block</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>zval php_block</span><span class="z-punctuation z-separator">,</span><span> php_block_namespace</span><span class="z-punctuation z-separator">,</span><span> php_block_name</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// 1. Prepare the PHP strings.</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">ZVAL_STRINGL</span><span class="z-punctuation z-section">(</span><span class="z-keyword z-operator">&</span><span class="z-variable z-parameter">php_block_namespace</span><span class="z-punctuation z-separator">,</span><span> block.namespace.pointer</span><span class="z-punctuation z-separator">,</span><span> block.namespace.length</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">ZVAL_STRINGL</span><span class="z-punctuation z-section">(</span><span class="z-keyword z-operator">&</span><span class="z-variable z-parameter">php_block_name</span><span class="z-punctuation z-separator">,</span><span> block.name.pointer</span><span class="z-punctuation z-separator">,</span><span> block.name.length</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>Do you remember that namespace, name and other similar data are of type
<code>Slice_c_char</code>? It's just a structure with a pointer and a length. The
pointer points into the original input string, so that there is no copy
(this is actually the definition of a slice). Well, Zend Engine has
<a rel="noopener external" target="_blank" href="https://github.com/php/php-src/blob/52d91260df54995a680f420884338dfd9d5a0d49/Zend/zend_API.h#L563-L565">a <code>ZVAL_STRINGL</code>
macro</a>
that creates a string from a pointer and a length, great!
Unfortunately for us, Zend Engine makes <a rel="noopener external" target="_blank" href="https://github.com/php/php-src/blob/52d91260df54995a680f420884338dfd9d5a0d49/Zend/zend_string.h#L152-L159">a copy behind the
scenes</a>…
There is no way to keep only the pointer and the length, but at least
the number of copies stays small. I think the copy lets Zend Engine take full ownership of
the data, which is required for the garbage collector.</p>
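<p>To illustrate the difference between borrowing and owning, here is a small
sketch in plain C. It assumes a slice shaped like <code>Slice_c_char</code> (a
pointer and a length); <code>Demo_Slice</code>, <code>demo_slice</code> and
<code>demo_slice_to_owned</code> are hypothetical names, and the copy is done
with <code>malloc</code> rather than with the Zend Engine allocator.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed shape of the slice: a borrowed pointer plus a length. */
typedef struct {
    const char *pointer;
    uintptr_t length;
} Demo_Slice;

/* Borrow a sub-range of `input` without copying anything. */
Demo_Slice demo_slice(const char *input, uintptr_t start, uintptr_t length)
{
    assert(input != NULL);

    Demo_Slice slice = { input + start, length };

    return slice;
}

/* Copy the slice into a fresh NUL-terminated buffer owned by the caller.
 * Returns NULL if the allocation fails; the caller must `free` the result. */
char *demo_slice_to_owned(Demo_Slice slice)
{
    char *owned = malloc(slice.length + 1);

    if (owned == NULL) {
        return NULL;
    }

    memcpy(owned, slice.pointer, slice.length);
    owned[slice.length] = '\0';

    return owned;
}
```

<p>The slice is only valid as long as the input string is alive, whereas the
owned copy can outlive it. That is the trade-off made when a value crosses into
a runtime that manages its own memory.</p>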
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-comment">// 2. Create the Gutenberg_Parser_Block object.</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">object_init_ex</span><span class="z-punctuation z-section">(</span><span class="z-keyword z-operator">&</span><span class="z-variable z-parameter">php_block</span><span class="z-punctuation z-separator">,</span><span> gutenberg_parser_block_class_entry</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>The object has been instantiated from the class represented by
<code>gutenberg_parser_block_class_entry</code>.</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-comment">// 3. Set the namespace and the name.</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">add_property_zval</span><span class="z-punctuation z-section">(</span><span class="z-keyword z-operator">&</span><span class="z-variable z-parameter">php_block</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "namespace"</span><span class="z-punctuation z-separator">,</span><span class="z-keyword z-operator"> &</span><span class="z-variable z-parameter">php_block_namespace</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">add_property_zval</span><span class="z-punctuation z-section">(</span><span class="z-keyword z-operator">&</span><span class="z-variable z-parameter">php_block</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "name"</span><span class="z-punctuation z-separator">,</span><span class="z-keyword z-operator"> &</span><span class="z-variable z-parameter">php_block_name</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function">zval_ptr_dtor</span><span class="z-punctuation z-section">(</span><span class="z-keyword z-operator">&</span><span class="z-variable z-parameter">php_block_namespace</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function">zval_ptr_dtor</span><span class="z-punctuation z-section">(</span><span class="z-keyword z-operator">&</span><span class="z-variable z-parameter">php_block_name</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
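<p><code>add_property_zval</code> adds 1 to the reference counter of the value it
stores, so we call <code>zval_ptr_dtor</code> to release our own reference: it
subtracts 1 from the counter, and frees the value once the counter reaches
zero. This is required for the garbage collector.</p>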
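<p>To make the reference counting concrete, here is a toy model in plain C. It
is not the Zend Engine implementation: <code>Demo_Value</code>,
<code>demo_store</code> and <code>demo_release</code> are hypothetical names.
It only shows the usual pattern, where storing a value in a property adds an
owner, and the destructor call gives up the local one.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Toy reference-counted value; not the Zend Engine's zval. */
typedef struct {
    unsigned refcount;
    int payload;
} Demo_Value;

/* A fresh value has exactly one owner: its creator. */
Demo_Value *demo_value_new(int payload)
{
    Demo_Value *value = malloc(sizeof *value);

    if (value != NULL) {
        value->refcount = 1;
        value->payload = payload;
    }

    return value;
}

/* Storing the value somewhere else (e.g. in a property) adds an owner. */
void demo_store(Demo_Value **slot, Demo_Value *value)
{
    assert(slot != NULL && value != NULL);

    value->refcount += 1;
    *slot = value;
}

/* Give up one ownership; free the value when nobody owns it any more.
 * Returns the number of remaining owners. */
unsigned demo_release(Demo_Value *value)
{
    assert(value != NULL && value->refcount > 0);

    unsigned remaining = --value->refcount;

    if (remaining == 0) {
        free(value);
    }

    return remaining;
}
```

<p>After the local reference is released, the property is the only remaining
owner, so the value is freed together with the object that holds it.</p>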
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-comment">// 4. Deal with block attributes if some.</span></span>
<span class="giallo-l"><span class="z-keyword">if</span><span class="z-punctuation z-section"> (</span><span>block.attributes.tag </span><span class="z-keyword z-operator">==</span><span> Some</span><span class="z-punctuation z-section">) {</span></span>
<span class="giallo-l"><span> Slice_c_char attributes </span><span class="z-keyword z-operator">=</span><span class="z-variable"> block</span><span class="z-punctuation z-separator">.</span><span class="z-variable">attributes</span><span class="z-punctuation z-separator">.</span><span class="z-variable">some</span><span>.</span><span class="z-variable">_0</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> zval php_block_attributes</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> ZVAL_STRINGL</span><span class="z-punctuation z-section">(</span><span class="z-keyword z-operator">&</span><span>php_block_attributes</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> attributes</span><span class="z-punctuation z-separator">.</span><span class="z-variable">pointer</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> attributes</span><span class="z-punctuation z-separator">.</span><span class="z-variable">length</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // 5. Set the attributes.</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> add_property_zval</span><span class="z-punctuation z-section">(</span><span class="z-keyword z-operator">&</span><span>php_block</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "attributes"</span><span class="z-punctuation z-separator">,</span><span class="z-keyword z-operator"> &</span><span>php_block_attributes</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> zval_ptr_dtor</span><span class="z-punctuation z-section">(</span><span class="z-keyword z-operator">&</span><span>php_block_attributes</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span></span></code></pre>
<p>It is similar to what has been done for <code>namespace</code> and <code>name</code>. Now
let's continue with children.</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-comment">// 6. Handle children.</span></span>
<span class="giallo-l"><span class="z-storage">const</span><span> Vector_Node </span><span class="z-keyword z-operator">*</span><span>children </span><span class="z-keyword z-operator">=</span><span class="z-punctuation z-section"> (</span><span class="z-storage">const</span><span> Vector_Node</span><span class="z-keyword z-operator">*</span><span class="z-punctuation z-section">) (</span><span>block.children</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">if</span><span class="z-punctuation z-section"> (</span><span>children</span><span class="z-keyword z-operator">-></span><span>length </span><span class="z-keyword z-operator">></span><span class="z-constant z-numeric"> 0</span><span class="z-punctuation z-section">) {</span></span>
<span class="giallo-l"><span> zval php_children_array</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> array_init_size</span><span class="z-punctuation z-section">(</span><span class="z-keyword z-operator">&</span><span>php_children_array</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> children</span><span class="z-punctuation z-separator">-></span><span class="z-variable">length</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Recursion.</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> into_php_objects</span><span class="z-punctuation z-section">(</span><span class="z-keyword z-operator">&</span><span>php_children_array</span><span class="z-punctuation z-separator">,</span><span> children</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // 7. Set the children.</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> add_property_zval</span><span class="z-punctuation z-section">(</span><span class="z-keyword z-operator">&</span><span>php_block</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "children"</span><span class="z-punctuation z-separator">,</span><span class="z-keyword z-operator"> &</span><span>php_children_array</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> Z_DELREF</span><span class="z-punctuation z-section">(</span><span>php_children_array</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function">free</span><span class="z-punctuation z-section">((</span><span class="z-storage z-type">void</span><span class="z-keyword z-operator">*</span><span class="z-punctuation z-section">)</span><span class="z-variable z-parameter"> children</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>Finally, add the block instance into the array to be returned:</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-comment"> // 8. Insert the object in the collection.</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> add_next_index_zval</span><span class="z-punctuation z-section">(</span><span>php_array</span><span class="z-punctuation z-separator">,</span><span class="z-keyword z-operator"> &</span><span class="z-variable z-parameter">php_block</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p><a rel="noopener external" target="_blank" href="https://github.com/Hywan/gutenberg-parser-rs/blob/master/bindings/php/extension/gutenberg_post_parser/gutenberg_post_parser.c">The entire code lands
here</a>.</p>
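<p>The recursion at the heart of <code>into_php_objects</code> can also be tried
in isolation, without the Zend Engine. The sketch below assumes a simplified
tree where a block owns a vector of child nodes; <code>Demo_Node</code>,
<code>Demo_Vector_Node</code> and <code>demo_count_nodes</code> are hypothetical
names. Instead of building PHP objects, it simply counts blocks and
phrases.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { Demo_Block, Demo_Phrase } Demo_Node_Tag;

struct Demo_Node;

/* A borrowed vector of nodes: a pointer to the first node plus a length. */
typedef struct {
    const struct Demo_Node *buffer;
    uintptr_t length;
} Demo_Vector_Node;

typedef struct Demo_Node {
    Demo_Node_Tag tag;
    Demo_Vector_Node children; /* Only used when tag == Demo_Block. */
} Demo_Node;

typedef struct {
    uintptr_t blocks;
    uintptr_t phrases;
} Demo_Count;

/* Walk the tree depth-first, the same way `into_php_objects` does. */
void demo_count_nodes(const Demo_Vector_Node *nodes, Demo_Count *count)
{
    assert(nodes != NULL && count != NULL);

    for (uintptr_t nth = 0; nth < nodes->length; ++nth) {
        const Demo_Node *node = &nodes->buffer[nth];

        if (node->tag == Demo_Block) {
            count->blocks += 1;

            /* Recursion: the children form another vector of nodes. */
            demo_count_nodes(&node->children, count);
        } else if (node->tag == Demo_Phrase) {
            count->phrases += 1;
        }
    }
}
```

<p>An empty vector stops the recursion naturally, since the loop body never
runs.</p>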
<h2 id="-3">PHP extension 🚀 PHP userland<a role="presentation" class="anchor" href="#-3" title="Anchor link to this header">#</a>
</h2>
<p>Now that the extension is written, we have to compile it. That's the
repetitive set of commands we have shown above with <code>phpize</code>. Once the
extension is compiled, the generated <code>gutenberg_post_parser.so</code> file
must be placed in the extension directory. This directory can be found
with the following command:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> php-config --extension-dir</span></span></code></pre>
<p>For instance, on my computer, the extension directory is
<code>/usr/local/Cellar/php/7.2.11/pecl/20170718</code>.</p>
<p>Then, to enable the extension for a given execution, you must write:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> php -d</span><span class="z-variable"> extension</span><span class="z-keyword z-operator">=</span><span class="z-string">gutenberg_post_parser</span><span class="z-entity z-name"> -m</span><span class="z-keyword z-operator"> |</span><span> \</span></span>
<span class="giallo-l"><span> grep gutenberg_post_parser</span></span></code></pre>
<p>Or, to enable the extension for all executions, locate the <code>php.ini</code>
file with <code>php --ini</code> and edit it to add:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span>extension=gutenberg_post_parser</span></span></code></pre>
<p>Done!</p>
<p>Now, let's use some reflection to check the extension is correctly
loaded and handled by PHP:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> php --re gutenberg_post_parser</span></span>
<span class="giallo-l"><span>Extension [ <persistent> extension #64 gutenberg_post_parser version 0.1.0 ] {</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> - Functions {</span></span>
<span class="giallo-l"><span> Function [ <internal:gutenberg_post_parser> function gutenberg_post_parse ] {</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> - Parameters [1] {</span></span>
<span class="giallo-l"><span> Parameter #0 [ <required> $gutenberg_post_as_string ]</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> - Classes [2] {</span></span>
<span class="giallo-l"><span> Class [ <internal:gutenberg_post_parser> final class Gutenberg_Parser_Block ] {</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> - Constants [0] {</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> - Static properties [0] {</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> - Static methods [0] {</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> - Properties [4] {</span></span>
<span class="giallo-l"><span> Property [ <default> public $namespace ]</span></span>
<span class="giallo-l"><span> Property [ <default> public $name ]</span></span>
<span class="giallo-l"><span> Property [ <default> public $attributes ]</span></span>
<span class="giallo-l"><span> Property [ <default> public $children ]</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> - Methods [0] {</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> Class [ <internal:gutenberg_post_parser> final class Gutenberg_Parser_Phrase ] {</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> - Constants [0] {</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> - Static properties [0] {</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> - Static methods [0] {</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> - Properties [1] {</span></span>
<span class="giallo-l"><span> Property [ <default> public $content ]</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> - Methods [0] {</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Everything looks good: There is one function and two classes that are
defined as expected. Now, let's write some PHP code for the first time
in this blog post!</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-support z-function">var_dump</span><span>(</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> gutenberg_post_parse</span><span>(</span></span>
<span class="giallo-l"><span class="z-string"> '<!-- wp:foo /-->bar<!-- wp:baz -->qux<!-- /wp:baz -->'</span></span>
<span class="giallo-l"><span> )</span></span>
<span class="giallo-l"><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">/**</span></span>
<span class="giallo-l"><span class="z-comment"> * Will output:</span></span>
<span class="giallo-l"><span class="z-comment"> * array(3) {</span></span>
<span class="giallo-l"><span class="z-comment"> * [0]=></span></span>
<span class="giallo-l"><span class="z-comment"> * object(Gutenberg_Parser_Block)#1 (4) {</span></span>
<span class="giallo-l"><span class="z-comment"> * ["namespace"]=></span></span>
<span class="giallo-l"><span class="z-comment"> * string(4) "core"</span></span>
<span class="giallo-l"><span class="z-comment"> * ["name"]=></span></span>
<span class="giallo-l"><span class="z-comment"> * string(3) "foo"</span></span>
<span class="giallo-l"><span class="z-comment"> * ["attributes"]=></span></span>
<span class="giallo-l"><span class="z-comment"> * NULL</span></span>
<span class="giallo-l"><span class="z-comment"> * ["children"]=></span></span>
<span class="giallo-l"><span class="z-comment"> * NULL</span></span>
<span class="giallo-l"><span class="z-comment"> * }</span></span>
<span class="giallo-l"><span class="z-comment"> * [1]=></span></span>
<span class="giallo-l"><span class="z-comment"> * object(Gutenberg_Parser_Phrase)#2 (1) {</span></span>
<span class="giallo-l"><span class="z-comment"> * ["content"]=></span></span>
<span class="giallo-l"><span class="z-comment"> * string(3) "bar"</span></span>
<span class="giallo-l"><span class="z-comment"> * }</span></span>
<span class="giallo-l"><span class="z-comment"> * [2]=></span></span>
<span class="giallo-l"><span class="z-comment"> * object(Gutenberg_Parser_Block)#3 (4) {</span></span>
<span class="giallo-l"><span class="z-comment"> * ["namespace"]=></span></span>
<span class="giallo-l"><span class="z-comment"> * string(4) "core"</span></span>
<span class="giallo-l"><span class="z-comment"> * ["name"]=></span></span>
<span class="giallo-l"><span class="z-comment"> * string(3) "baz"</span></span>
<span class="giallo-l"><span class="z-comment"> * ["attributes"]=></span></span>
<span class="giallo-l"><span class="z-comment"> * NULL</span></span>
<span class="giallo-l"><span class="z-comment"> * ["children"]=></span></span>
<span class="giallo-l"><span class="z-comment"> * array(1) {</span></span>
<span class="giallo-l"><span class="z-comment"> * [0]=></span></span>
<span class="giallo-l"><span class="z-comment"> * object(Gutenberg_Parser_Phrase)#4 (1) {</span></span>
<span class="giallo-l"><span class="z-comment"> * ["content"]=></span></span>
<span class="giallo-l"><span class="z-comment"> * string(3) "qux"</span></span>
<span class="giallo-l"><span class="z-comment"> * }</span></span>
<span class="giallo-l"><span class="z-comment"> * }</span></span>
<span class="giallo-l"><span class="z-comment"> * }</span></span>
<span class="giallo-l"><span class="z-comment"> * }</span></span>
<span class="giallo-l"><span class="z-comment"> */</span></span></code></pre>
<p>It works very well!</p>
<h2 id="-4">Conclusion<a role="presentation" class="anchor" href="#-4" title="Anchor link to this header">#</a>
</h2>
<p>The journey is:</p>
<ul>
<li>A string written in PHP,</li>
<li>Allocated by the Zend Engine from the Gutenberg extension,</li>
<li>Passed to Rust through FFI (static library + header),</li>
<li>Back to Zend Engine in the Gutenberg extension,</li>
<li>To generate PHP objects,</li>
<li>That are read by PHP.</li>
</ul>
<p>Rust really fits everywhere!</p>
<p>We have seen in detail how to write a real-world parser in Rust, how to
bind it to C and compile it to a static library in addition to C
headers, how to create a PHP extension exposing one function and two
objects, how to integrate the C binding into PHP, and how to use this
extension in PHP.</p>
<p>As a reminder, the C binding is about 150 lines of code. The PHP
extension is about 300 lines of code, but subtracting the “decorations”
(the boilerplate to declare and manage the extension) that are
automatically generated, the PHP extension reduces to about 200 lines of
code. Once again, I find this is a small surface of code to review
considering that the parser is still written in Rust, and that
modifying the parser will not impact the bindings (except if the AST is
updated, obviously)!</p>
<p>PHP is a language with a garbage collector. This explains why all strings
are copied, so that they are owned by PHP itself. However, the fact that
Rust does not copy any data saves memory allocations and deallocations,
which are the biggest cost most of the time.</p>
<p>Rust also provides safety. This property can be questioned considering
the number of bindings we are going through, Rust to C to PHP: does it
still hold? From the Rust perspective, yes, but everything that happens
inside C or PHP must be considered unsafe. Special care must be taken in
the C binding to handle all situations.</p>
<p>Is it still fast? Well, let's benchmark. As a reminder, the
first goal of this experiment was to tackle the poor performance of the
original PEG.js parser. On the JavaScript side, WASM and ASM.js have
proven to be much faster (see <a href="https://mnt.io/series/from-rust-to-beyond/the-webassembly-galaxy/">the WebAssembly
galaxy</a>,
and <a href="https://mnt.io/series/from-rust-to-beyond/the-asm-js-galaxy/">the ASM.js
galaxy</a>).
For PHP, <a rel="noopener external" target="_blank" href="https://github.com/nylen/phpegjs"><code>phpegjs</code> is used</a>: It reads
the grammar written for PEG.js and compiles it to PHP. Let's see how
they compare:</p>
<figure>
<table><thead><tr><th>Document</th><th>PEG PHP parser (ms)</th><th>Rust parser as a PHP extension (ms)</th><th>speedup</th></tr></thead><tbody>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/demo-post.html"><code>demo-post.html</code></a></td><td>30.409</td><td>0.0012</td><td>× 25341</td></tr>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/shortcode-shortcomings.html"><code>shortcode-shortcomings.html</code></a></td><td>76.39</td><td>0.096</td><td>× 796</td></tr>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/redesigning-chrome-desktop.html"><code>redesigning-chrome-desktop.html</code></a></td><td>225.824</td><td>0.399</td><td>× 566</td></tr>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/web-at-maximum-fps.html"><code>web-at-maximum-fps.html</code></a></td><td>173.495</td><td>0.275</td><td>× 631</td></tr>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/early-adopting-the-future.html"><code>early-adopting-the-future.html</code></a></td><td>280.433</td><td>0.298</td><td>× 941</td></tr>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/pygmalian-raw-html.html"><code>pygmalian-raw-html.html</code></a></td><td>377.392</td><td>0.052</td><td>× 7258</td></tr>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/moby-dick-parsed.html"><code>moby-dick-parsed.html</code></a></td><td>5,437.630</td><td>5.037</td><td>× 1080</td></tr>
</tbody></table>
<figcaption>
<p>Benchmarks between PEG PHP parser and Rust parser as a PHP extension.</p>
</figcaption>
</figure>
<p>The PHP extension of the Rust parser is on average 5230 times faster
than the current PEG PHP implementation. The median speedup is
941.</p>
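<p>For readers who want to check these two figures, here is a minimal sketch that recomputes them from the speedup column of the table above. The helper names <code>mean</code> and <code>median</code> and the constant <code>SPEEDUPS</code> are only illustrative; they are not part of the benchmark code:</p>

```rust
/// Arithmetic mean of a non-empty slice.
fn mean(values: &[f64]) -> f64 {
    values.iter().sum::<f64>() / values.len() as f64
}

/// Median of a non-empty slice (mean of the two middle values when the length is even).
fn median(values: &[f64]) -> f64 {
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).expect("no NaN expected"));

    let middle = sorted.len() / 2;

    if sorted.len() % 2 == 0 {
        (sorted[middle - 1] + sorted[middle]) / 2.0
    } else {
        sorted[middle]
    }
}

/// Speedups copied from the last column of the benchmark table.
const SPEEDUPS: [f64; 7] = [25341.0, 796.0, 566.0, 631.0, 941.0, 7258.0, 1080.0];
```

<p>The mean is pulled up by the <code>demo-post.html</code> outlier, which is why the median is the more representative figure here.</p>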
<p>Another huge issue was that the PEG parser was not able to handle many
Gutenberg documents because of a memory limit. Of course, it is possible
to grow the size of the memory, but it is not ideal. With the Rust
parser as a PHP extension, memory stays constant and close to the size
of the parsed document.</p>
<p>I reckon we can optimise the extension further to generate an iterator
instead of an array. I want to explore this and analyse its
impact on performance. The PHP Internals Book has a <a rel="noopener external" target="_blank" href="http://www.phpinternalsbook.com/classes_objects/iterators.html">chapter about
Iterators</a>.</p>
<p>Thanks for reading!</p>
The C galaxy2018-09-11T00:00:00+00:002018-09-11T00:00:00+00:00
Unknown
https://mnt.io/series/from-rust-to-beyond/the-c-galaxy/<p>The galaxy we will explore today is the C galaxy. This post will briefly
explain what C is, how to compile any Rust program for C in theory, and
how to do that in practice with our Rust parser, from both the Rust side and
the C side. We will also see how to test such a binding.</p>
<h2 id="what-is-c-and-why">What is C, and why?<a role="presentation" class="anchor" href="#what-is-c-and-why" title="Anchor link to this header">#</a>
</h2>
<p><a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/C_(programming_language)">C</a> is probably
the most widely used and best-known programming language in the world. Quoting
Wikipedia:</p>
<blockquote>
<p>C […] is a general-purpose, imperative computer programming
language, supporting structured programming, lexical variable scope
and recursion, while a static type system prevents many unintended
operations. By design, C provides constructs that map efficiently to
typical machine instructions, and therefore it has found lasting use
in applications that had formerly been coded in assembly language,
including operating systems, as well as various application software
for computers ranging from supercomputers to embedded systems.</p>
</blockquote>
<figure>
<p><img src="https://mnt.io/series/from-rust-to-beyond/the-c-galaxy/./dennis-ritchie.png" alt="Dennis Ritchie" loading="lazy" decoding="async" /></p>
<figcaption>
<p>Dennis Ritchie, the inventor of the C language.</p>
</figcaption>
</figure>
<p>The impact of C on the programming language
world is probably without precedent. Almost everything is written in C, starting with operating
systems. Today, it is one of the few common denominators between any
program on any system on any machine in the world. In other words,
being compatible with C opens a large door to everything. Your program
will be able to talk directly to any other program easily.</p>
<p>Because languages like PHP or Python are written in C, in our particular
Gutenberg parser use case, it means that the parser can be embedded and
used by PHP or Python directly, with almost no overhead. Neat!</p>
<h2 id="">Rust 🚀 C<a role="presentation" class="anchor" href="#" title="Anchor link to this header">#</a>
</h2>
<figure role="presentation">
<p><img src="https://mnt.io/series/from-rust-to-beyond/the-c-galaxy/./rust-to-c.png" alt="Rust to C" loading="lazy" decoding="async" /></p>
</figure>
<p>In order to use Rust from C, one needs two elements:</p>
<ol>
<li>A static library (<code>.a</code> file),</li>
<li>A header file (<code>.h</code> file).</li>
</ol>
<h3 id="-1">The theory<a role="presentation" class="anchor" href="#-1" title="Anchor link to this header">#</a>
</h3>
<p>To compile a Rust project into a static library, the <code>crate-type</code>
property must contain the <code>staticlib</code> value. Let's edit the <code>Cargo.toml</code>
file as follows:</p>
<pre class="giallo z-code"><code data-lang="toml"><span class="giallo-l"><span>[</span><span class="z-entity z-name">lib</span><span>]</span></span>
<span class="giallo-l"><span class="z-variable">name</span><span class="z-punctuation z-separator"> =</span><span class="z-string"> "gutenberg_post_parser"</span></span>
<span class="giallo-l"><span class="z-variable">crate-type</span><span class="z-punctuation z-separator"> =</span><span> [</span><span class="z-string">"staticlib"</span><span>]</span></span></code></pre>
<p>Once <code>cargo build --release</code> is run, a <code>libgutenberg_post_parser.a</code> file
is created in <code>target/release/</code>. Done. <code>cargo</code> and <code>rustc</code> make this
step a real doddle.</p>
<p>Now the header file. It can be written manually, but that is tedious and the
file easily gets outdated. The goal is to generate it <em>automatically</em>. Enter
<a rel="noopener external" target="_blank" href="https://github.com/eqrion/cbindgen/"><code>cbindgen</code></a>:</p>
<blockquote>
<p><code>cbindgen</code> can be used to generate C bindings for Rust code. It is
currently being developed to support creating bindings for
<a rel="noopener external" target="_blank" href="https://github.com/servo/webrender/">WebRender</a>, but has been
designed to support any project.</p>
</blockquote>
<p>To install <code>cbindgen</code>, edit your <code>Cargo.toml</code> file as follows:</p>
<pre class="giallo z-code"><code data-lang="toml"><span class="giallo-l"><span>[</span><span class="z-entity z-name">package</span><span>]</span></span>
<span class="giallo-l"><span class="z-variable">build</span><span class="z-punctuation z-separator"> =</span><span class="z-string"> "build.rs"</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>[</span><span class="z-entity z-name">build-dependencies</span><span>]</span></span>
<span class="giallo-l"><span class="z-variable">cbindgen</span><span class="z-punctuation z-separator"> =</span><span class="z-string"> "^0.6.0"</span></span></code></pre>
<p>Actually, <code>cbindgen</code> comes in two flavors: a CLI executable or a library. I
prefer the library approach, which makes installation easier.</p>
<p>Note that Cargo has been instructed to use the <code>build.rs</code> file to build
the project. This file is an appropriate place to generate the C header
file with <code>cbindgen</code>. Let's write it!</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-storage">extern</span><span class="z-keyword"> crate</span><span> cbindgen;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">fn</span><span class="z-entity z-name z-function"> main</span><span>() {</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> crate_dir</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> std</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">env</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">var</span><span>(</span><span class="z-string">"CARGO_MANIFEST_DIR"</span><span>)</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">unwrap</span><span>();</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name"> cbindgen</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">generate</span><span>(</span><span class="z-variable">crate_dir</span><span>)</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">expect</span><span>(</span><span class="z-string">"Unable to generate C bindings."</span><span>)</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">write_to_file</span><span>(</span><span class="z-string">"dist/gutenberg_post_parser.h"</span><span>);</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>With this information, <code>cbindgen</code> will scan the source code of the
project and will generate the C header automatically in the
<code>dist/gutenberg_post_parser.h</code> file. Scanning will be detailed in
a moment, but before that, let's quickly see how to control the content
of the header file. With the code snippet above, <code>cbindgen</code> will look
for a <code>cbindgen.toml</code> configuration file in the <code>CARGO_MANIFEST_DIR</code>
directory, i.e. the root of your crate. Mine looks like this:</p>
<pre class="giallo z-code"><code data-lang="toml"><span class="giallo-l"><span class="z-variable">header</span><span class="z-punctuation z-separator"> =</span><span class="z-string"> """</span></span>
<span class="giallo-l"><span class="z-string">/*</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-string">Gutenberg Post Parser, the C bindings.</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-string">Warning, this file is autogenerated by `cbindgen`.</span></span>
<span class="giallo-l"><span class="z-string">Do not modify this manually.</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-string">*/"""</span></span>
<span class="giallo-l"><span class="z-variable">tab_width</span><span class="z-punctuation z-separator"> =</span><span class="z-constant z-numeric"> 4</span></span>
<span class="giallo-l"><span class="z-variable">language</span><span class="z-punctuation z-separator"> =</span><span class="z-string"> "C"</span></span></code></pre>
<p>It is fairly self-explanatory. <a rel="noopener external" target="_blank" href="https://github.com/eqrion/cbindgen/#configuration">The documentation details the
configuration</a> very
well.</p>
<p><code>cbindgen</code> will scan the code and will stop on <code>struct</code>s or <code>enum</code>s that
have the attribute <code>#[repr(C)]</code>, <code>#[repr(_size_)]</code> or
<code>#[repr(transparent)]</code>, or functions that are marked as <code>extern "C"</code> and
are public. So when one writes:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>#[repr(</span><span class="z-entity z-name">C</span><span>)]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> struct</span><span class="z-entity z-name"> Slice</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> pointer</span><span class="z-keyword z-operator">: *</span><span class="z-storage">const</span><span class="z-variable"> c_char</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> length</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> usize</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>#[repr(</span><span class="z-entity z-name">C</span><span>)]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> enum</span><span class="z-entity z-name"> Option</span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Some</span><span>(</span><span class="z-entity z-name">Slice</span><span>),</span></span>
<span class="giallo-l"><span class="z-entity z-name"> None</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>#[no_mangle]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> extern</span><span class="z-string"> "C"</span><span class="z-keyword"> fn</span><span class="z-entity z-name z-function"> parse</span><span>(</span><span class="z-variable">pointer</span><span class="z-keyword z-operator">: *</span><span class="z-storage">const</span><span class="z-variable"> c_char</span><span>)</span><span class="z-keyword z-operator"> -></span><span class="z-variable"> c_void</span><span> { … }</span></span></code></pre>
<p>Then <code>cbindgen</code> will generate this:</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-comment">// … header comment …</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">typedef</span><span class="z-storage z-type"> struct</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-storage z-type"> char</span><span class="z-keyword z-operator"> *</span><span>pointer</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage z-type"> uintptr_t</span><span> length</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span><span> Slice</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">typedef</span><span class="z-storage z-type"> enum</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span> Some</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span> None</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span><span> Option_Tag</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">typedef</span><span class="z-storage z-type"> struct</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span> Slice _0</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span><span> Some_Body</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">typedef</span><span class="z-storage z-type"> struct</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span> Option_Tag tag</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage z-type"> union</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span> Some_Body some</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> }</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span><span> Option</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage z-type">void</span><span class="z-entity z-name z-function"> parse</span><span class="z-punctuation z-section">(</span><span class="z-storage">const</span><span class="z-storage z-type"> char</span><span class="z-keyword z-operator"> *</span><span class="z-variable z-parameter">pointer</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>It works. Great!</p>
<p>Note the <code>#[no_mangle]</code> that decorates the Rust <code>parse</code> function. It
instructs the compiler not to mangle the function's name, so that the function
has the same name from the perspective of C.</p>
<p>OK, that's all for the theory. Let's practise now: we have a parser to
bind to C!</p>
<h3 id="-2">Practice<a role="presentation" class="anchor" href="#-2" title="Anchor link to this header">#</a>
</h3>
<p>We want to bind a function named <code>parse</code>. The function outputs an AST
representing the language being analysed. <a href="https://mnt.io/series/from-rust-to-beyond/prelude/">As a
reminder</a>, the
original AST looks like this:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> enum</span><span class="z-entity z-name"> Node</span><span><'</span><span class="z-entity z-name">a</span><span>> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Block</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> name</span><span class="z-keyword z-operator">:</span><span> (</span><span class="z-entity z-name">Input</span><span><'</span><span class="z-entity z-name">a</span><span>>,</span><span class="z-entity z-name"> Input</span><span><'</span><span class="z-entity z-name">a</span><span>>),</span></span>
<span class="giallo-l"><span class="z-variable"> attributes</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Option</span><span><</span><span class="z-entity z-name">Input</span><span><'</span><span class="z-entity z-name">a</span><span>>>,</span></span>
<span class="giallo-l"><span class="z-variable"> children</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Vec</span><span><</span><span class="z-entity z-name">Node</span><span><'</span><span class="z-entity z-name">a</span><span>>></span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> Phrase</span><span>(</span><span class="z-entity z-name">Input</span><span><'</span><span class="z-entity z-name">a</span><span>>)</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>This AST is defined in the Rust parser. The Rust binding to C will
transform this AST into another set of structs and enums for C. This
transformation is required only for types that are directly exposed to C, not for internal
types that Rust uses. Let's start by defining <code>Node</code>:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>#[repr(</span><span class="z-entity z-name">C</span><span>)]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> enum</span><span class="z-entity z-name"> Node</span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Block</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> namespace</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Slice_c_char</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> name</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Slice_c_char</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> attributes</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Option_c_char</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> children</span><span class="z-keyword z-operator">: *</span><span class="z-storage">const</span><span class="z-variable"> c_void</span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> Phrase</span><span>(</span><span class="z-entity z-name">Slice_c_char</span><span>)</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Some immediate thoughts:</p>
<ul>
<li>The structure <code>Slice_c_char</code> emulates Rust slices (see below),</li>
<li>The enum <code>Option_c_char</code> emulates <code>Option</code> (see below),</li>
<li>The field <code>children</code> has type <code>*const c_void</code>. It should be
<code>*const Vector_Node</code> (our definition of <code>Vector</code>), but the definition
of <code>Node</code> is based on <code>Vector_Node</code> and vice versa. This <a rel="noopener external" target="_blank" href="https://github.com/eqrion/cbindgen/issues/43">cyclical
definition case is unsupported by <code>cbindgen</code> so
far</a>. So… yes, it is
defined as a <code>void</code> pointer, and will be cast later in C,</li>
<li>The fields <code>namespace</code> and <code>name</code> are originally a tuple in Rust.
Tuples have no equivalent in C with <code>cbindgen</code>, so two fields are used
instead.</li>
</ul>
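<p>To illustrate the <code>void</code> pointer workaround from the list above, here is a minimal sketch of the cast. The article performs this cast on the C side; the sketch does it in Rust only to stay in one language. It uses a simplified <code>Node</code> (no names or attributes) and a hypothetical <code>count_children</code> helper that is not part of the binding:</p>

```rust
use std::os::raw::c_void;

#[allow(non_camel_case_types)]
#[repr(C)]
pub struct Vector_Node {
    pub buffer: *const Node,
    pub length: usize,
}

// Simplified node: only the `children` field matters for this illustration.
#[repr(C)]
pub enum Node {
    Block { children: *const c_void },
    Phrase(u8),
}

/// Hypothetical helper: reads the number of children behind the untyped pointer.
pub fn count_children(node: &Node) -> usize {
    match *node {
        Node::Block { children } if !children.is_null() => {
            // The untyped pointer is known (by convention) to point to a `Vector_Node`.
            let vector = children as *const Vector_Node;

            unsafe { (*vector).length }
        }
        _ => 0,
    }
}
```

<p>The cast is only sound because both sides agree, by convention, on what the pointer really references; the type system no longer checks it.</p>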
<p>Let's define <code>Slice_c_char</code>:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>#[repr(</span><span class="z-entity z-name">C</span><span>)]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> struct</span><span class="z-entity z-name"> Slice_c_char</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> pointer</span><span class="z-keyword z-operator">: *</span><span class="z-storage">const</span><span class="z-variable"> c_char</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> length</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> usize</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>This definition borrows the semantics of <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/primitive.slice.html">Rust
slices</a>. The major
benefit is that there is no copy when binding a Rust slice to this
structure.</p>
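<p>As a rough illustration of the “no copy” claim, the sketch below builds such a structure from a Rust slice and reads it back. The helpers <code>to_c_slice</code> and <code>from_c_slice</code> are hypothetical names chosen for this example, and the fields are made public only to keep it short:</p>

```rust
use std::os::raw::c_char;
use std::slice;

#[allow(non_camel_case_types)]
#[repr(C)]
pub struct Slice_c_char {
    pub pointer: *const c_char,
    pub length: usize,
}

/// Borrows the bytes without copying them: only the address and the length are stored.
pub fn to_c_slice(input: &[u8]) -> Slice_c_char {
    Slice_c_char {
        pointer: input.as_ptr() as *const c_char,
        length: input.len(),
    }
}

/// Rebuilds a Rust slice from the C view.
///
/// Safety: the original buffer must still be alive and must not have been modified.
pub unsafe fn from_c_slice<'a>(view: &Slice_c_char) -> &'a [u8] {
    unsafe { slice::from_raw_parts(view.pointer as *const u8, view.length) }
}
```

<p>The pointer stored in the structure is the address of the original buffer, so the buffer must outlive every <code>Slice_c_char</code> that refers to it.</p>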
<p>Let's define <code>Option_c_char</code>:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>#[repr(</span><span class="z-entity z-name">C</span><span>)]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> enum</span><span class="z-entity z-name"> Option_c_char</span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Some</span><span>(</span><span class="z-entity z-name">Slice_c_char</span><span>),</span></span>
<span class="giallo-l"><span class="z-entity z-name"> None</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
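<p>A possible conversion from a Rust <code>Option</code> to this enum could look like the following sketch. The function name <code>to_c_option</code> is an assumption made for this example, and <code>Slice_c_char</code> is repeated (with public fields) so that the snippet is self-contained:</p>

```rust
use std::os::raw::c_char;

#[allow(non_camel_case_types)]
#[repr(C)]
pub struct Slice_c_char {
    pub pointer: *const c_char,
    pub length: usize,
}

#[allow(non_camel_case_types)]
#[repr(C)]
pub enum Option_c_char {
    Some(Slice_c_char),
    None,
}

/// Hypothetical helper: maps `Option<&[u8]>` to its C-compatible counterpart, without copying.
pub fn to_c_option(input: Option<&[u8]>) -> Option_c_char {
    match input {
        Some(bytes) => Option_c_char::Some(Slice_c_char {
            pointer: bytes.as_ptr() as *const c_char,
            length: bytes.len(),
        }),
        None => Option_c_char::None,
    }
}
```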
<p>Finally, we need to define <code>Vector_Node</code> and our own <code>Result</code> for C.
They mimic the Rust semantics closely:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>#[repr(</span><span class="z-entity z-name">C</span><span>)]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> struct</span><span class="z-entity z-name"> Vector_Node</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> buffer</span><span class="z-keyword z-operator">: *</span><span class="z-storage">const</span><span class="z-constant z-other"> Node</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> length</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> usize</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>#[repr(</span><span class="z-entity z-name">C</span><span>)]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> enum</span><span class="z-entity z-name"> Result</span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Ok</span><span>(</span><span class="z-entity z-name">Vector_Node</span><span>),</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Err</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Alright, all types are declared! It's time to write the <code>parse</code>
function:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>#[no_mangle]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> extern</span><span class="z-string"> "C"</span><span class="z-keyword"> fn</span><span class="z-entity z-name z-function"> parse</span><span>(</span><span class="z-variable">pointer</span><span class="z-keyword z-operator">: *</span><span class="z-storage">const</span><span class="z-variable"> c_char</span><span>)</span><span class="z-keyword z-operator"> -></span><span class="z-entity z-name"> Result</span><span> {</span></span>
<span class="giallo-l"><span> …</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>The function takes a pointer from C. It means that the data to analyse
(i.e. the Gutenberg blog post) is allocated and owned by C: the memory
is allocated on the C side, and Rust is only responsible for the parsing.
This is where Rust shines: no copy, no clone, no memory mess; only
pointers to this data will be returned to C as slices and vectors.</p>
<p>The workflow will be the following:</p>
<ul>
<li>First thing to do when we deal with C: Check that the pointer is not
null,</li>
<li>Reconstitute an input from the pointer with
<a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/ffi/struct.CStr.html"><code>CStr</code></a>. This
standard API is useful to abstract C strings from the Rust point of
view. The difference is that a C string is terminated by a <code>NUL</code> byte
and carries no explicit length, while in Rust a string has a length and is not
terminated by a <code>NUL</code> byte,</li>
<li>Run the parser, then transform the AST into the “C AST”.</li>
</ul>
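<p>The first two steps can be sketched in isolation as follows. <code>bytes_from_c</code> is a hypothetical helper, not a function of the binding; it only shows the null check and how <code>CStr</code> turns a NUL-terminated C string into a Rust byte slice with a length:</p>

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Returns the bytes of a C string without its terminating NUL byte,
/// or `None` when the pointer is null.
///
/// Safety: a non-null `pointer` must reference a valid NUL-terminated
/// string that stays alive for the lifetime `'a`.
pub unsafe fn bytes_from_c<'a>(pointer: *const c_char) -> Option<&'a [u8]> {
    if pointer.is_null() {
        return None;
    }

    // `CStr::from_ptr` scans for the NUL byte to compute the length; nothing is copied.
    Some(unsafe { CStr::from_ptr(pointer) }.to_bytes())
}
```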
<p>Let's do that!</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>#[no_mangle]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> extern</span><span class="z-string"> "C"</span><span class="z-keyword"> fn</span><span class="z-entity z-name z-function"> parse</span><span>(</span><span class="z-variable">pointer</span><span class="z-keyword z-operator">: *</span><span class="z-storage">const</span><span class="z-variable"> c_char</span><span>)</span><span class="z-keyword z-operator"> -></span><span class="z-entity z-name"> Result</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> if</span><span class="z-variable"> pointer</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">is_null</span><span>() {</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-entity z-name"> Result</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Err</span><span>;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> input</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> unsafe</span><span> {</span><span class="z-entity z-name"> CStr</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">from_ptr</span><span>(</span><span class="z-variable">pointer</span><span>)</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">to_bytes</span><span>() };</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> if</span><span class="z-storage"> let</span><span class="z-entity z-name"> Ok</span><span>((</span><span class="z-variable">_remaining</span><span>,</span><span class="z-variable"> nodes</span><span>))</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> gutenberg_post_parser</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">root</span><span>(</span><span class="z-variable">input</span><span>) {</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> output</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Vec</span><span><</span><span class="z-entity z-name">Node</span><span>></span><span class="z-keyword z-operator"> =</span></span>
<span class="giallo-l"><span class="z-variable"> nodes</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">into_iter</span><span>()</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">map</span><span>(</span><span class="z-keyword z-operator">|</span><span class="z-variable">node</span><span class="z-keyword z-operator">|</span><span class="z-entity z-name z-function"> into_c</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-variable">node</span><span>))</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">collect</span><span>();</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> vector_node</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> Vector_Node</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> buffer</span><span class="z-keyword z-operator">:</span><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">as_slice</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">as_ptr</span><span>(),</span></span>
<span class="giallo-l"><span class="z-variable"> length</span><span class="z-keyword z-operator">:</span><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">len</span><span>()</span></span>
<span class="giallo-l"><span> };</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name"> mem</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">forget</span><span>(</span><span class="z-variable">output</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name"> Result</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Ok</span><span>(</span><span class="z-variable">vector_node</span><span>)</span></span>
<span class="giallo-l"><span> }</span><span class="z-keyword"> else</span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Result</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Err</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p><code>Vector_Node</code> only holds two things: a pointer to the output, and the
length of the output. Nothing is copied, so the conversion is cheap.</p>
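<p>To make this ownership hand-off more concrete, here is a minimal sketch of the same idea outside of the parser. The names <code>VectorU32</code>, <code>into_vector</code>, <code>as_slice</code> and <code>free_vector</code> are hypothetical and are not part of the bindings. The sketch also swaps <code>mem::forget</code> on a <code>Vec</code> for <code>into_boxed_slice</code> plus <code>Box::into_raw</code>, assuming we want the pointer and the length to be enough to rebuild and free the buffer later.</p>

```rust
use std::ptr;
use std::slice;

// Hypothetical, simplified stand-in for the article's `Vector_Node`:
// it holds `u32` items instead of `Node`s.
#[repr(C)]
pub struct VectorU32 {
    pub buffer: *const u32,
    pub length: usize,
}

/// Hands ownership of `values` over to the caller as a pointer and a length.
pub fn into_vector(values: Vec<u32>) -> VectorU32 {
    // A boxed slice has no spare capacity, so pointer + length fully
    // describe the allocation.
    let boxed: Box<[u32]> = values.into_boxed_slice();
    let length = boxed.len();
    // `Box::into_raw` leaks the allocation on purpose, like `mem::forget`.
    let buffer = Box::into_raw(boxed) as *mut u32;

    VectorU32 {
        buffer: buffer as *const u32,
        length,
    }
}

/// Borrows the items, the way C would read `buffer[0..length]`.
///
/// # Safety
/// `vector` must come from `into_vector` and must not have been freed.
pub unsafe fn as_slice(vector: &VectorU32) -> &[u32] {
    unsafe { slice::from_raw_parts(vector.buffer, vector.length) }
}

/// Takes ownership back and frees the buffer.
///
/// # Safety
/// `vector` must come from `into_vector` and must not be used afterwards.
pub unsafe fn free_vector(vector: VectorU32) {
    let raw = ptr::slice_from_raw_parts_mut(vector.buffer as *mut u32, vector.length);
    drop(unsafe { Box::from_raw(raw) });
}
```

<p>Whatever the technique, the buffer stays allocated until someone hands the pointer back to Rust, which is what <code>free_vector</code> does in this sketch.</p>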
<p>Now let's see the <code>into_c</code> function. Some parts will not be detailed,
not because they are difficult but because they are repetitive. <a rel="noopener external" target="_blank" href="https://github.com/Hywan/gutenberg-parser-rs/blob/master/bindings/c/src/lib.rs">The
entire code lives
here</a>.</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">fn</span><span class="z-entity z-name z-function"> into_c</span><span><'</span><span class="z-entity z-name">a</span><span>>(</span><span class="z-variable">node</span><span class="z-keyword z-operator">: &</span><span class="z-entity z-name">ast</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Node</span><span><'</span><span class="z-entity z-name">a</span><span>>)</span><span class="z-keyword z-operator"> -></span><span class="z-entity z-name"> Node</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> match</span><span class="z-keyword z-operator"> *</span><span class="z-variable">node</span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> ast</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Node</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Block</span><span> {</span><span class="z-variable"> name</span><span>,</span><span class="z-variable"> attributes</span><span>,</span><span class="z-keyword"> ref</span><span class="z-variable"> children</span><span> }</span><span class="z-keyword z-operator"> =></span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Node</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Block</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> namespace</span><span class="z-keyword z-operator">:</span><span> …,</span></span>
<span class="giallo-l"><span class="z-variable"> name</span><span class="z-keyword z-operator">:</span><span> …,</span></span>
<span class="giallo-l"><span class="z-variable"> attributes</span><span class="z-keyword z-operator">:</span><span> …,</span></span>
<span class="giallo-l"><span class="z-variable"> children</span><span class="z-keyword z-operator">:</span><span> …</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name"> ast</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Node</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">Phrase</span><span>(</span><span class="z-variable">input</span><span>)</span><span class="z-keyword z-operator"> =></span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Node</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">Phrase</span><span>(…)</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>I want to show <code>namespace</code> for the warm-up (<code>name</code>, <code>attributes</code> and
<code>Phrase</code> are very similar), and <code>children</code> because it deals with <code>void</code>.</p>
<p>Let's convert <code>ast::Node::Block.name.0</code> into <code>Node::Block.namespace</code>:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-entity z-name">ast</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Node</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Block</span><span> {</span><span class="z-variable"> name</span><span>, …, … }</span><span class="z-keyword z-operator"> =></span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Node</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Block</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> namespace</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Slice_c_char</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> pointer</span><span class="z-keyword z-operator">:</span><span class="z-variable"> name</span><span class="z-keyword z-operator">.</span><span class="z-constant z-numeric">0</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">as_ptr</span><span>()</span><span class="z-keyword"> as</span><span class="z-keyword z-operator"> *</span><span class="z-storage">const</span><span class="z-variable"> c_char</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> length</span><span class="z-keyword z-operator">:</span><span class="z-variable"> name</span><span class="z-keyword z-operator">.</span><span class="z-constant z-numeric">0</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">len</span><span>()</span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> …</span></span></code></pre>
<p>Pretty straightforward so far. <code>namespace</code> is a <code>Slice_c_char</code>. The
<code>pointer</code> is the address of the first byte of the <code>name.0</code> slice, and the <code>length</code> is the
length of that same slice. The process is the same for the other Rust
slices.</p>
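<p>As a rough illustration of that pointer-and-length pair, here is a small self-contained sketch. <code>Slice_c_char</code> is re-declared so that the example compiles on its own, and the helpers <code>to_c_slice</code> and <code>from_c_slice</code> are hypothetical names that do not exist in the bindings.</p>

```rust
use std::os::raw::c_char;
use std::slice;

// Re-declared here so the sketch is self-contained.
#[allow(non_camel_case_types)]
#[repr(C)]
pub struct Slice_c_char {
    pub pointer: *const c_char,
    pub length: usize,
}

/// Hypothetical helper: describes `bytes` as a pointer and a length.
/// Nothing is copied, so `bytes` must outlive the returned value.
pub fn to_c_slice(bytes: &[u8]) -> Slice_c_char {
    Slice_c_char {
        pointer: bytes.as_ptr() as *const c_char,
        length: bytes.len(),
    }
}

/// Hypothetical helper: reads the bytes back, the way C has to do it.
///
/// # Safety
/// `slice` must describe memory that is still alive and unchanged.
pub unsafe fn from_c_slice(slice: &Slice_c_char) -> &[u8] {
    unsafe { slice::from_raw_parts(slice.pointer as *const u8, slice.length) }
}
```

<p>Note that such a slice is not NUL-terminated: it points into the middle of the parsed input, so the consumer has to rely on <code>length</code> rather than on functions that expect a C string.</p>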
<p><code>children</code> is different though. It works in three steps:</p>
<ol>
<li>Collect all children as C AST nodes in a Rust vector,</li>
<li>Transform the Rust vector into a valid <code>Vector_Node</code>,</li>
<li>Transform the <code>Vector_Node</code> into a <code>*const c_void</code> pointer.</li>
</ol>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-entity z-name">ast</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Node</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Block</span><span> { …, …,</span><span class="z-keyword"> ref</span><span class="z-variable"> children</span><span> }</span><span class="z-keyword z-operator"> =></span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Node</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Block</span><span> {</span></span>
<span class="giallo-l"><span> …</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> children</span><span class="z-keyword z-operator">:</span><span> {</span></span>
<span class="giallo-l"><span class="z-comment"> // 1. Collect all children as C AST nodes.</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> output</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Vec</span><span>&lt;Node&gt;</span><span class="z-keyword z-operator"> =</span></span>
<span class="giallo-l"><span class="z-variable"> children</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">into_iter</span><span>()</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">map</span><span>(</span><span class="z-keyword z-operator">|</span><span class="z-variable">node</span><span class="z-keyword z-operator">|</span><span class="z-entity z-name z-function"> into_c</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-variable">node</span><span>))</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> .</span><span class="z-entity z-name z-function">collect</span><span>();</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // 2. Transform the vector into a Vector_Node.</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> vector_node</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> if</span><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">is_empty</span><span>() {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Box</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">new</span><span>(</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Vector_Node</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> buffer</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> ptr</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">null</span><span>(),</span></span>
<span class="giallo-l"><span class="z-variable"> length</span><span class="z-keyword z-operator">:</span><span class="z-constant z-numeric"> 0</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> )</span></span>
<span class="giallo-l"><span> }</span><span class="z-keyword"> else</span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Box</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">new</span><span>(</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Vector_Node</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> buffer</span><span class="z-keyword z-operator">:</span><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">as_slice</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">as_ptr</span><span>(),</span></span>
<span class="giallo-l"><span class="z-variable"> length</span><span class="z-keyword z-operator">:</span><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">len</span><span>()</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> )</span></span>
<span class="giallo-l"><span> };</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // 3. Transform Vector_Node into a *const c_void pointer.</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> vector_node_pointer</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> Box</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">into_raw</span><span>(</span><span class="z-variable">vector_node</span><span>)</span><span class="z-keyword"> as</span><span class="z-keyword z-operator"> *</span><span class="z-storage">const</span><span class="z-variable"> c_void</span><span>;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name"> mem</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">forget</span><span>(</span><span class="z-variable">output</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> vector_node_pointer</span></span>
<span class="giallo-l"><span> }</span></span></code></pre>
<p>Step 1 is straightforward.</p>
<p>Step 2 defines the behavior when there is no node. In other
words, it defines what an empty <code>Vector_Node</code> is. The <code>buffer</code> must
contain a <code>NULL</code> raw pointer, and the length is obviously 0. Without
this behavior, I got various segmentation faults in my code, even when I
checked the <code>length</code> before reading the <code>buffer</code>. Note that <code>Vector_Node</code> is
allocated on the heap with <code>Box::new</code> so that the pointer can be easily
shared with C.</p>
<p>Step 3 uses <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/boxed/struct.Box.html#method.into_raw">the <code>Box::into_raw</code>
function</a>
to consume the box and to return the raw pointer to the data it
owns. Rust will not free anything here: it's our responsibility (or the
responsibility of C, to be pedantic). Then the <code>*mut Vector_Node</code>
returned by <code>Box::into_raw</code> can be freely cast into <code>*const c_void</code>.</p>
<p>Finally, we instruct the compiler to not drop <code>output</code> when it goes out
of scope with <code>mem::forget</code> (at this step of the series, you are very
likely to know what it does).</p>
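<p>The round trip through <code>*const c_void</code> can be sketched in isolation as follows. This is only an illustration under assumptions: <code>VectorNode</code> is a simplified stand-in whose buffer holds bytes instead of <code>Node</code>s, and <code>into_void</code> and <code>from_void</code> are hypothetical helpers, not functions from the bindings.</p>

```rust
use std::os::raw::c_void;

// Hypothetical, simplified stand-in for `Vector_Node`.
#[repr(C)]
pub struct VectorNode {
    pub buffer: *const u8,
    pub length: usize,
}

/// Moves `vector` to the heap and erases its type, as in step 3.
/// The allocation is intentionally leaked: Rust will not free it.
pub fn into_void(vector: VectorNode) -> *const c_void {
    Box::into_raw(Box::new(vector)) as *const c_void
}

/// Rebuilds the box so that Rust can read the value and free it again.
///
/// # Safety
/// `pointer` must come from `into_void` and must be passed here only once.
pub unsafe fn from_void(pointer: *const c_void) -> Box<VectorNode> {
    unsafe { Box::from_raw(pointer as *mut VectorNode) }
}
```

<p>Dropping the <code>Box</code> returned by <code>from_void</code> releases the heap allocation made by <code>Box::new</code>, which is one possible way to give the memory back once the other side is done with the pointer.</p>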
<p>Personally, I spent a few hours understanding why my pointers got random
addresses, or were pointing to <code>NULL</code> data. The resulting code is
simple and fairly clear to read, but it wasn't obvious to me what to
do beforehand.</p>
<p>And that's all for the Rust part! The next section will present the C
code that calls Rust, and how to compile everything all together.</p>
<h2 id="-3">C 🚀 executable<a role="presentation" class="anchor" href="#-3" title="Anchor link to this header">#</a>
</h2>
<figure role="presentation">
<p><img src="https://mnt.io/series/from-rust-to-beyond/the-c-galaxy/./c-to-executable.png" alt="Rust to C to executable" loading="lazy" decoding="async" /></p>
</figure>
<p>Now that the Rust part is ready, the C part must be written to call it.</p>
<h3 id="-4">Minimal Working Example<a role="presentation" class="anchor" href="#-4" title="Anchor link to this header">#</a>
</h3>
<p>Let's do something very quick to see if it links and compiles:</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-keyword">#include</span><span class="z-string"> <stdlib.h></span></span>
<span class="giallo-l"><span class="z-keyword">#include</span><span class="z-string"> <stdio.h></span></span>
<span class="giallo-l"><span class="z-keyword">#include</span><span class="z-string"> <string.h></span></span>
<span class="giallo-l"><span class="z-keyword">#include</span><span class="z-string"> "gutenberg_post_parser.h"</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage z-type">int</span><span class="z-entity z-name z-function"> main</span><span class="z-punctuation z-section">(</span><span class="z-storage z-type">int</span><span class="z-variable z-parameter"> argc</span><span class="z-punctuation z-separator">,</span><span class="z-storage z-type"> char</span><span class="z-keyword z-operator"> **</span><span class="z-variable z-parameter">argv</span><span class="z-punctuation z-section">) {</span></span>
<span class="giallo-l"><span> FILE</span><span class="z-keyword z-operator">*</span><span> file </span><span class="z-keyword z-operator">=</span><span class="z-entity z-name z-function"> fopen</span><span class="z-punctuation z-section">(</span><span class="z-variable">argv</span><span>[</span><span class="z-constant z-numeric">1</span><span>]</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "rb"</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> fseek</span><span class="z-punctuation z-section">(</span><span>file</span><span class="z-punctuation z-separator">,</span><span class="z-constant z-numeric"> 0</span><span class="z-punctuation z-separator">,</span><span> SEEK_END</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage z-type"> long</span><span> file_size </span><span class="z-keyword z-operator">=</span><span class="z-entity z-name z-function"> ftell</span><span class="z-punctuation z-section">(</span><span>file</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> rewind</span><span class="z-punctuation z-section">(</span><span>file</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // One extra byte for the NUL terminator: parse receives a plain C string.</span></span>
<span class="giallo-l"><span class="z-storage z-type"> char</span><span class="z-keyword z-operator">*</span><span> file_content </span><span class="z-keyword z-operator">=</span><span class="z-punctuation z-section"> (</span><span class="z-storage z-type">char</span><span class="z-keyword z-operator">*</span><span class="z-punctuation z-section">)</span><span class="z-entity z-name z-function"> malloc</span><span class="z-punctuation z-section">((</span><span>file_size </span><span class="z-keyword z-operator">+</span><span class="z-constant z-numeric"> 1</span><span class="z-punctuation z-section">)</span><span class="z-keyword z-operator"> * sizeof</span><span class="z-punctuation z-section">(</span><span class="z-storage z-type">char</span><span class="z-punctuation z-section">))</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> fread</span><span class="z-punctuation z-section">(</span><span>file_content</span><span class="z-punctuation z-separator">,</span><span class="z-constant z-numeric"> 1</span><span class="z-punctuation z-separator">,</span><span> file_size</span><span class="z-punctuation z-separator">,</span><span> file</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> file_content[file_size] </span><span class="z-keyword z-operator">=</span><span class="z-string"> '\0'</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Let's call Rust!</span></span>
<span class="giallo-l"><span> Result output </span><span class="z-keyword z-operator">=</span><span class="z-entity z-name z-function"> parse</span><span class="z-punctuation z-section">(</span><span>file_content</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> if</span><span class="z-punctuation z-section"> (</span><span class="z-variable">output</span><span class="z-punctuation z-separator">.</span><span class="z-variable">tag</span><span class="z-keyword z-operator"> ==</span><span> Err</span><span class="z-punctuation z-section">) {</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> printf</span><span class="z-punctuation z-section">(</span><span class="z-string">"Error while parsing.</span><span class="z-constant z-character">\n</span><span class="z-string">"</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-constant z-numeric"> 1</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> const</span><span> Vector_Node nodes </span><span class="z-keyword z-operator">=</span><span class="z-variable"> output</span><span class="z-punctuation z-separator">.</span><span class="z-variable">ok</span><span class="z-punctuation z-separator">.</span><span class="z-variable">_0</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-comment"> // Do something with nodes.</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> free</span><span class="z-punctuation z-section">(</span><span>file_content</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> fclose</span><span class="z-punctuation z-section">(</span><span>file</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-constant z-numeric"> 0</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span></span></code></pre>
<p>To keep the code concise, I left all the error handling out of the
example. <a rel="noopener external" target="_blank" href="https://github.com/Hywan/gutenberg-parser-rs/blob/master/bindings/c/bin/gutenberg_post_parser.c">The entire code lives
here</a>
if you're curious.</p>
<p>What happens in this code? The first thing to notice is
<code>#include "gutenberg_post_parser.h"</code> which is the header file that is
automatically generated by <code>cbindgen</code>.</p>
<p>Then a filename from <code>argv[1]</code> is used to read a blog post to parse. The
<code>parse</code> function is from Rust, just like the <code>Result</code> and <code>Vector_Node</code>
types.</p>
<p>The Rust <code>enum Result { Ok(Vector_Node), Err }</code> is compiled to C as:</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-keyword">typedef</span><span class="z-storage z-type"> enum</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span> Ok</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span> Err</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span><span> Result_Tag</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">typedef</span><span class="z-storage z-type"> struct</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span> Vector_Node _0</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span><span> Ok_Body</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">typedef</span><span class="z-storage z-type"> struct</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span> Result_Tag tag</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage z-type"> union</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span> Ok_Body ok</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> }</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span><span> Result</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>Needless to say, the Rust version is easier and more compact to read,
but this isn't the point. To check if <code>Result</code> contains an <code>Ok</code> value or
an <code>Err</code>or, one has to check the <code>tag</code> field, like we did with
<code>output.tag == Err</code>. To get the content of the <code>Ok</code>, we did
<code>output.ok._0</code> (<code>_0</code> is a field from <code>Ok_Body</code>).</p>
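<p>Seen from the Rust side, the same tag-then-union shape can be spelled out by hand. The sketch below only mirrors the generated C header for illustration: the names <code>ResultTag</code>, <code>OkBody</code>, <code>ResultBody</code>, <code>CResult</code> and <code>ok_length</code> are hypothetical, the <code>err</code> union member is an addition that exists only so the error case can be constructed, and <code>VectorNode</code> is a simplified stand-in for <code>Vector_Node</code>.</p>

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C)]
pub enum ResultTag {
    Ok,
    Err,
}

// Hypothetical, simplified stand-in for `Vector_Node`.
#[derive(Clone, Copy)]
#[repr(C)]
pub struct VectorNode {
    pub buffer: *const u8,
    pub length: usize,
}

#[derive(Clone, Copy)]
#[repr(C)]
pub struct OkBody {
    pub _0: VectorNode,
}

// Only one member is meaningful at a time; `tag` tells which one.
#[repr(C)]
pub union ResultBody {
    pub ok: OkBody,
    pub err: (),
}

#[repr(C)]
pub struct CResult {
    pub tag: ResultTag,
    pub body: ResultBody,
}

impl CResult {
    pub fn ok(vector: VectorNode) -> Self {
        CResult { tag: ResultTag::Ok, body: ResultBody { ok: OkBody { _0: vector } } }
    }

    pub fn err() -> Self {
        CResult { tag: ResultTag::Err, body: ResultBody { err: () } }
    }
}

/// Same logic as `output.tag == Err` followed by `output.ok._0` in C:
/// the tag is checked first, and only then is the union member read.
pub fn ok_length(result: &CResult) -> Option<usize> {
    match result.tag {
        // Reading a union field is unsafe: the tag is what makes it valid.
        ResultTag::Ok => Some(unsafe { result.body.ok._0.length }),
        ResultTag::Err => None,
    }
}
```

<p>This makes explicit what the Rust <code>enum</code> does for us: the compiler keeps the tag and the payload in sync, whereas in C (and in this hand-written version) reading <code>ok</code> without checking <code>tag</code> first is simply a bug waiting to happen.</p>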
<p>Let's compile this with <a rel="noopener external" target="_blank" href="http://clang.llvm.org"><code>clang</code></a>! We assume that
the code above is located in the same directory as the
<code>gutenberg_post_parser.h</code> file, i.e. in a <code>dist/</code> directory. Thus:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> cd dist</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> clang \</span></span>
<span class="giallo-l"><span> # Enable all warnings. \</span></span>
<span class="giallo-l"><span> -Wall \</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> # Output executable name. \</span></span>
<span class="giallo-l"><span> -o gutenberg-post-parser \</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> # Input source file. \</span></span>
<span class="giallo-l"><span> gutenberg_post_parser.c \</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> # Directory where to find the static library (*.a). \</span></span>
<span class="giallo-l"><span> -L ../target/release/ \</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> # Link with the gutenberg_post_parser static library. \</span></span>
<span class="giallo-l"><span> -l gutenberg_post_parser \</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> # Other libraries to link with. \</span></span>
<span class="giallo-l"><span> -l System \</span></span>
<span class="giallo-l"><span> -l pthread \</span></span>
<span class="giallo-l"><span> -l c \</span></span>
<span class="giallo-l"><span> -l m</span></span></code></pre>
<p>And that's all! We end up with a <code>gutenberg-post-parser</code> executable that
runs C and Rust.</p>
<h3 id="-5">More details<a role="presentation" class="anchor" href="#-5" title="Anchor link to this header">#</a>
</h3>
<p><a rel="noopener external" target="_blank" href="https://github.com/Hywan/gutenberg-parser-rs/blob/master/bindings/c/bin/gutenberg_post_parser.c">In the original source
code</a>,
a recursive function that prints the entire AST on <code>stdout</code> can be
found, namely <code>print</code> (original, isn't it?). Here are some side-by-side
comparisons between Rust syntax and C syntax.</p>
<p>The <code>Vector_Node</code> struct in Rust:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> struct</span><span class="z-entity z-name"> Vector_Node</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> buffer</span><span class="z-keyword z-operator">: *</span><span class="z-storage">const</span><span class="z-constant z-other"> Node</span><span>,</span></span>
<span class="giallo-l"><span class="z-variable"> length</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> usize</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>The <code>Vector_Node</code> struct in C:</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-keyword">typedef</span><span class="z-storage z-type"> struct</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span> Node </span><span class="z-keyword z-operator">*</span><span>buffer</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage z-type"> uintptr_t</span><span> length</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span><span> Vector_Node</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>So, given a <code>const Vector_Node*</code> named <code>nodes</code>, to respectively read the number of nodes (length of the vector) and
the nodes themselves in C, one has to write:</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-storage">const</span><span class="z-storage z-type"> uintptr_t</span><span> number_of_nodes </span><span class="z-keyword z-operator">=</span><span> nodes</span><span class="z-keyword z-operator">-></span><span>length</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">for</span><span class="z-punctuation z-section"> (</span><span class="z-storage z-type">uintptr_t</span><span> nth </span><span class="z-keyword z-operator">=</span><span class="z-constant z-numeric"> 0</span><span class="z-punctuation z-terminator">;</span><span> nth </span><span class="z-keyword z-operator"><</span><span> number_of_nodes</span><span class="z-punctuation z-terminator">;</span><span class="z-keyword z-operator"> ++</span><span>nth</span><span class="z-punctuation z-section">) {</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span> Node node </span><span class="z-keyword z-operator">=</span><span class="z-variable"> nodes</span><span class="z-punctuation z-separator">-></span><span class="z-variable">buffer</span><span>[nth]</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span></span></code></pre>
<p>This is almost idiomatic C code!</p>
<p>A <code>Node</code> is defined in C as:</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-keyword">typedef</span><span class="z-storage z-type"> enum</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span> Block</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span> Phrase</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span><span> Node_Tag</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">typedef</span><span class="z-storage z-type"> struct</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span> Slice_c_char namespace</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> Slice_c_char name</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> Option_c_char attributes</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-storage z-type"> void</span><span class="z-keyword z-operator">*</span><span> children</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span><span> Block_Body</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">typedef</span><span class="z-storage z-type"> struct</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span> Slice_c_char _0</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span><span> Phrase_Body</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">typedef</span><span class="z-storage z-type"> struct</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span> Node_Tag tag</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage z-type"> union</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span> Block_Body block</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> Phrase_Body phrase</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> }</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span><span> Node</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>So once a node is fetched, one can write the following code to detect
its kind:</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-keyword">if</span><span class="z-punctuation z-section"> (</span><span>node.tag </span><span class="z-keyword z-operator">==</span><span> Block</span><span class="z-punctuation z-section">) {</span></span>
<span class="giallo-l"><span class="z-comment"> // …</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span><span class="z-keyword"> else if</span><span class="z-punctuation z-section"> (</span><span>node.tag </span><span class="z-keyword z-operator">==</span><span> Phrase</span><span class="z-punctuation z-section">) {</span></span>
<span class="giallo-l"><span class="z-comment"> // …</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span></span></code></pre>
<p>Let's focus on <code>Block</code> for a second, and let's print the namespace and
the name of the block separated by a slash (<code>/</code>):</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-storage">const</span><span> Block_Body block </span><span class="z-keyword z-operator">=</span><span> node.block</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">const</span><span> Slice_c_char namespace </span><span class="z-keyword z-operator">=</span><span> block.namespace</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage">const</span><span> Slice_c_char name </span><span class="z-keyword z-operator">=</span><span> block.name</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function">printf</span><span class="z-punctuation z-section">(</span></span>
<span class="giallo-l"><span class="z-string"> "</span><span class="z-constant z-other">%.*s</span><span class="z-string">/</span><span class="z-constant z-other z-constant z-character">%.*s\n</span><span class="z-string">"</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> (</span><span class="z-storage z-type">int</span><span class="z-punctuation z-section">)</span><span> namespace.length</span><span class="z-punctuation z-separator">,</span><span> namespace.pointer</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> (</span><span class="z-storage z-type">int</span><span class="z-punctuation z-section">)</span><span> name.length</span><span class="z-punctuation z-separator">,</span><span> name.pointer</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
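<p>The special <code>%.*s</code> form in <code>printf</code> prints a string given its
length (passed as an <code>int</code>) and its pointer. This is needed here because the
slices point into the input string and are not NUL-terminated.</p>
<p>I think it is interesting to see the cast from <code>const void*</code> to
<code>const Vector_Node*</code> for <code>children</code>. It's a single line:</p>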
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-storage">const</span><span> Vector_Node</span><span class="z-keyword z-operator">*</span><span> children </span><span class="z-keyword z-operator">=</span><span class="z-punctuation z-section"> (</span><span class="z-storage">const</span><span> Vector_Node</span><span class="z-keyword z-operator">*</span><span class="z-punctuation z-section">) (</span><span>block.children</span><span class="z-punctuation z-section">)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>I think that's all for the details!</p>
<h3 id="-6">Testing<a role="presentation" class="anchor" href="#-6" title="Anchor link to this header">#</a>
</h3>
<p>I reckon it is also interesting to see how to unit test C bindings
directly with Rust. To emulate a C binding, first, the inputs must be in
“C form”, so strings must be C strings. I prefer to write a macro for
that:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-entity z-name z-function">macro_rules! str_to_c_char</span><span> {</span></span>
<span class="giallo-l"><span> (</span><span class="z-keyword z-operator">$</span><span class="z-variable">input</span><span class="z-keyword z-operator">:</span><span class="z-variable">expr</span><span>)</span><span class="z-keyword z-operator"> =></span><span> (</span></span>
<span class="giallo-l"><span> {</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> ::</span><span class="z-entity z-name">std</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">ffi</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">CString</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">new</span><span>(</span><span class="z-keyword z-operator">$</span><span class="z-variable">input</span><span>)</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">unwrap</span><span>()</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> )</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>And second, the opposite: The <code>parse</code> function returns data for C, so
they need to be “converted back” to Rust. Again, I prefer to write a
macro for that:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-entity z-name z-function">macro_rules! slice_c_char_to_str</span><span> {</span></span>
<span class="giallo-l"><span> (</span><span class="z-keyword z-operator">$</span><span class="z-variable">input</span><span class="z-keyword z-operator">:</span><span class="z-variable">ident</span><span>)</span><span class="z-keyword z-operator"> =></span><span> (</span></span>
<span class="giallo-l"><span class="z-keyword"> unsafe</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> ::</span><span class="z-entity z-name">std</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">ffi</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">CStr</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">from_bytes_with_nul_unchecked</span><span>(</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> ::</span><span class="z-entity z-name">std</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">slice</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">from_raw_parts</span><span>(</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> $</span><span class="z-variable">input</span><span class="z-keyword z-operator">.</span><span>pointer </span><span class="z-keyword">as</span><span class="z-keyword z-operator"> *</span><span class="z-storage">const</span><span class="z-entity z-name"> u8</span><span>,</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> $</span><span class="z-variable">input</span><span class="z-keyword z-operator">.</span><span>length </span><span class="z-keyword z-operator">+</span><span class="z-constant z-numeric"> 1</span></span>
<span class="giallo-l"><span> )</span></span>
<span class="giallo-l"><span> )</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">to_str</span><span>()</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">unwrap</span><span>()</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> )</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>All right! The final step is to write a unit test. As an example, a
<code>Phrase</code> will be tested; the idea remains the same for <code>Block</code>, but the
code is more concise for the former.</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>#[test]</span></span>
<span class="giallo-l"><span class="z-keyword">fn</span><span class="z-entity z-name z-function"> test_root_with_a_phrase</span><span>() {</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> input</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name z-function"> str_to_c_char!</span><span>(</span><span class="z-string">"foo"</span><span>);</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> output</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name z-function"> parse</span><span>(</span><span class="z-variable">input</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">as_ptr</span><span>());</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> match</span><span class="z-variable"> output</span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Result</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Ok</span><span>(</span><span class="z-variable">result</span><span>)</span><span class="z-keyword z-operator"> =></span><span class="z-keyword"> match</span><span class="z-variable"> result</span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Vector_Node</span><span> {</span><span class="z-variable"> buffer</span><span>,</span><span class="z-variable"> length</span><span> }</span><span class="z-keyword"> if</span><span class="z-variable"> length</span><span class="z-keyword z-operator"> ==</span><span class="z-constant z-numeric"> 1</span><span class="z-keyword z-operator"> =></span></span>
<span class="giallo-l"><span class="z-keyword"> match unsafe</span><span> {</span><span class="z-keyword z-operator"> &*</span><span class="z-variable">buffer</span><span> } {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Node</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">Phrase</span><span>(</span><span class="z-variable">phrase</span><span>)</span><span class="z-keyword z-operator"> =></span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> assert_eq!</span><span>(</span><span class="z-entity z-name z-function">slice_c_char_to_str!</span><span>(</span><span class="z-variable">phrase</span><span>),</span><span class="z-string"> "foo"</span><span>);</span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> _</span><span class="z-keyword z-operator"> =></span><span class="z-entity z-name z-function"> assert!</span><span>(</span><span class="z-constant z-language">false</span><span>)</span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> _</span><span class="z-keyword z-operator"> =></span><span class="z-entity z-name z-function"> assert!</span><span>(</span><span class="z-constant z-language">false</span><span>)</span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> _</span><span class="z-keyword z-operator"> =></span><span class="z-entity z-name z-function"> assert!</span><span>(</span><span class="z-constant z-language">false</span><span>)</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>What happens here? The <code>input</code> and <code>output</code> have been prepared. The
former is the C string <code>"foo"</code>. The latter is the result of <code>parse</code>.
Then there is a <code>match</code> to validate the form of the AST. Rust is very
expressive, and this test is a good illustration. The <code>Vector_Node</code>
branch is activated if and only if the length of the vector is 1, which
is expressed with the guard <code>if length == 1</code>. Then the content of the
phrase is transformed into a Rust string and compared with a regular
<code>assert_eq!</code> macro.</p>
<p>Note that, in this case, <code>buffer</code> is of type <code>*const Node</code>, so it
points to the first element of the vector. If we wanted to access the
next elements, we would need to use <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/vec/struct.Vec.html#method.from_raw_parts">the <code>Vec::from_raw_parts</code>
function</a>
to get a proper Rust API to manipulate this vector.</p>
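<p>For read-only access in a test, one possible alternative is to borrow the
buffer as a slice with <code>std::slice::from_raw_parts</code>, which does not take
ownership of the allocation. The sketch below is only an illustration: <code>Node</code>
and <code>Vector_Node</code> are simplified, hypothetical stand-ins for the real binding
types, and the <code>nodes</code> helper is an assumption, not part of the parser.</p>

```rust
// Simplified, hypothetical stand-ins for the real binding types.
#[repr(C)]
#[derive(Debug, PartialEq)]
pub struct Node {
    pub id: u32,
}

#[repr(C)]
#[allow(non_camel_case_types)]
pub struct Vector_Node {
    pub buffer: *const Node,
    pub length: usize,
}

/// Borrows the nodes behind a `Vector_Node` as a regular Rust slice.
///
/// # Safety
///
/// `vector.buffer` must point to `vector.length` initialized `Node`s that
/// stay alive and unmodified for the lifetime of the returned slice.
pub unsafe fn nodes(vector: &Vector_Node) -> &[Node] {
    if vector.buffer.is_null() || vector.length == 0 {
        // An empty vector may carry a null pointer: never build a slice from it.
        &[]
    } else {
        unsafe { std::slice::from_raw_parts(vector.buffer, vector.length) }
    }
}

#[test]
fn test_read_all_nodes() {
    // The nodes are owned by a regular `Vec` here, so nothing has to be freed manually.
    let owned = vec![Node { id: 1 }, Node { id: 2 }];
    let vector = Vector_Node { buffer: owned.as_ptr(), length: owned.len() };

    let borrowed = unsafe { nodes(&vector) };

    assert_eq!(borrowed.len(), 2);
    assert_eq!(borrowed[1], Node { id: 2 });
}
```

<p>Borrowing keeps the question of who frees the memory out of the test; taking
ownership with <code>Vec::from_raw_parts</code> is a different trade-off.</p>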
<h2 id="-7">Conclusion<a role="presentation" class="anchor" href="#-7" title="Anchor link to this header">#</a>
</h2>
<p>We have seen that Rust can be embedded in C very easily. In this
example, Rust has been compiled to a static library, and a header file;
the former is native with Rust tooling, the latter is automatically
generated with <code>cbindgen</code>.</p>
<p>The parser written in Rust manipulates a string allocated and owned by
C. Rust only returns pointers (as slices) to this string back to C. Then
C has no difficulty reading those pointers. The only tricky part is
that Rust allocates some data (like vectors of nodes) on the heap that C
must free. The “free” part has been omitted from the article though: it
does not represent a big challenge, and a C developer is likely to be
used to this kind of situation.</p>
<p>The fact that Rust does not use a garbage collector makes it a perfect
candidate for these use cases. The story behind these bindings is
actually all about memory: who allocates what, and what the form of
the data in memory is. Rust has a <code>#[repr(C)]</code> attribute to instruct the
compiler to use a C memory layout, which makes C bindings extremely
simple for the developer.</p>
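<p>To make the memory story a bit more concrete, here is a minimal sketch of a
<code>#[repr(C)]</code> slice that points into a C-owned string without copying it.
The struct mirrors the <code>Slice_c_char</code> name seen in the header, but its
definition here and the <code>slice_of</code> helper are assumptions made for the
example, not the actual code of the parser.</p>

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Hypothetical definition mirroring the `Slice_c_char` type of the header.
/// `#[repr(C)]` guarantees the fields are laid out in declaration order,
/// exactly like the equivalent C `struct`.
#[repr(C)]
#[derive(Clone, Copy)]
#[allow(non_camel_case_types)]
pub struct Slice_c_char {
    pub pointer: *const c_char,
    pub length: usize,
}

/// Builds a slice of `length` bytes starting at `offset` inside `input`.
/// Nothing is copied or allocated: the slice only points into the C string.
pub fn slice_of(input: &CStr, offset: usize, length: usize) -> Slice_c_char {
    assert!(offset + length <= input.to_bytes().len());

    Slice_c_char {
        // The check above keeps the pointer arithmetic inside the string.
        pointer: unsafe { input.as_ptr().add(offset) },
        length,
    }
}

#[test]
fn test_slice_points_into_the_input() {
    let input = std::ffi::CString::new("core/paragraph").unwrap();
    let name = slice_of(&input, 5, 9);

    let bytes = unsafe { std::slice::from_raw_parts(name.pointer as *const u8, name.length) };

    assert_eq!(bytes, b"paragraph");
}
```

<p>Since the slice carries its own length, the C side can read it without
relying on a NUL terminator.</p>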
<p>We have also seen that the C bindings can be unit tested within Rust
itself, and run with <code>cargo test</code>.</p>
<p><code>cbindgen</code> is a precious companion in this adventure: by automating the
header file generation, it reduces the update and the maintenance of the
code to a <code>build.rs</code> script.</p>
<p>In terms of performance, C should get results similar to Rust, i.e.
extremely fast. I didn't run a benchmark to verify this statement; it's
purely theoretical. It can be a subject for a future post!</p>
<p>Now that we have successfully embedded Rust in C, a whole new world
opens up to us! The next episode will push Rust into the PHP world as a
native extension (written in C). Let's go!</p>
The ASM.js galaxy2018-08-28T00:00:00+00:002018-08-28T00:00:00+00:00
Unknown
https://mnt.io/series/from-rust-to-beyond/the-asm-js-galaxy/<p>The second galaxy that our Rust parser will explore is the ASM.js
galaxy. This post will explain what ASM.js is, how to compile the parser
into ASM.js, and how to use the ASM.js module with Javascript in a
browser. The goal is to use ASM.js as a fallback to WebAssembly when it
is not available. I highly recommend reading <a href="https://mnt.io/series/from-rust-to-beyond/the-webassembly-galaxy/">the previous
episode</a>
about WebAssembly since they have a lot in common.</p>
<h2 id="what-is-asm-js-and-why">What is ASM.js, and why?<a role="presentation" class="anchor" href="#what-is-asm-js-and-why" title="Anchor link to this header">#</a>
</h2>
<p>The main programming language on the Web is Javascript. Applications
that wanted to exist on the Web, like games for example, had to compile
to Javascript. But a problem occurs: the resulting file is heavy (hence
WebAssembly) and Javascript virtual machines have difficulty
optimising this particular code, resulting in slow or inefficient
executions (considering the example of games). Also, in this context,
Javascript is a compilation target, and as such, some language
constructions are useless (like <code>eval</code>).</p>
<p>So what if a “new” language could be a compilation target and still be
executed by Javascript virtual machines? This is WebAssembly today, but
in 2013, the solution was <a rel="noopener external" target="_blank" href="http://asmjs.org/">ASM.js</a>:</p>
<blockquote>
<p><strong>asm.js</strong>, a strict subset of Javascript that can be used as a
low-level, efficient target language for compilers. This sublanguage
effectively describes a sandboxed virtual machine for memory-unsafe
languages like C or C++. A combination of static and dynamic
validation allows Javascript engines to employ an ahead-of-time (AOT)
optimizing compilation strategy for valid asm.js code.</p>
</blockquote>
<p>So an ASM.js program is a regular Javascript program. It is not a new
language but a subset of Javascript. It can be executed by any Javascript
virtual machine. However, the specific usage of the magic statement
<code>'use asm';</code> instructs the virtual machine to optimise the program with
an ASM.js “engine”.</p>
<p>ASM.js introduces types by using arithmetical operators as an annotation
system. For instance, <code>x | 0</code> annotates <code>x</code> to be a signed 32-bit integer,
<code>+x</code> annotates <code>x</code> to be a double, and <code>fround(x)</code> annotates <code>x</code> to be a
float. The following example declares the equivalent of a Rust function
<code>fn increment(x: i32) -> i32</code>:</p>
<pre class="giallo z-code"><code data-lang="javascript"><span class="giallo-l"><span class="z-storage z-type z-function">function</span><span class="z-entity z-name z-function"> increment</span><span>(</span><span class="z-variable z-parameter">x</span><span>) {</span></span>
<span class="giallo-l"><span class="z-variable"> x</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> x</span><span class="z-keyword z-operator"> |</span><span class="z-constant z-numeric"> 0</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span> (</span><span class="z-variable">x</span><span class="z-keyword z-operator"> +</span><span class="z-constant z-numeric"> 1</span><span>)</span><span class="z-keyword z-operator"> |</span><span class="z-constant z-numeric"> 0</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
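<p>For readers more at home in Rust, the effect of the <code>| 0</code> coercion can be
modelled with wrapping arithmetic on <code>i32</code>: <code>(x + 1) | 0</code> truncates the
result to a signed 32-bit integer, so it wraps around on overflow instead of
growing into a double. The function below is only a sketch of that behaviour,
written for this comparison; it is not code generated by any tool.</p>

```rust
/// Rust counterpart of the ASM.js `increment` function.
///
/// In ASM.js, `(x + 1) | 0` converts the sum back to a signed 32-bit
/// integer, which is what `wrapping_add` does on an `i32`.
pub fn increment(x: i32) -> i32 {
    x.wrapping_add(1)
}

#[test]
fn test_increment_wraps_like_asm_js() {
    assert_eq!(increment(1), 2);
    // In Javascript, `(2147483647 + 1) | 0` evaluates to `-2147483648`.
    assert_eq!(increment(i32::MAX), i32::MIN);
}
```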
<p>Another important difference is that ASM.js code is organised in modules,
in order to isolate it from regular Javascript. A module is a function
that takes 3 arguments:</p>
<ol>
<li><code>stdlib</code>, an object with references to standard library APIs,</li>
<li><code>foreign</code>, an object with user-defined functionalities (such as
sending something over a WebSocket),</li>
<li><code>heap</code>, an array buffer representing the memory (because memory is
manually managed).</li>
</ol>
<p>But it's still Javascript. So the good news is that if your virtual
machine has no specific optimisations for ASM.js, it is executed as any
regular Javascript program. And if it does, then you get a pleasant
boost.</p>
<figure>
<p><img src="https://mnt.io/series/from-rust-to-beyond/the-asm-js-galaxy/./asm-benchmarks.png" alt="Graph" loading="lazy" decoding="async" /></p>
<figcaption>
<p>A graph showing 3 benchmarks running against different Javascript engines:
Firefox, Firefox + asm.js, Google, and native.</p>
</figcaption>
</figure>
<p>Remember that ASM.js has been designed to be a compilation target. So
normally you don't have to care about that because it is the role of the
compiler. The typical compilation and execution pipeline from C or C++
to the Web looks like this:</p>
<figure>
<p><img src="https://mnt.io/series/from-rust-to-beyond/the-asm-js-galaxy/./asm-pipeline.png" alt="Pipeline" loading="lazy" decoding="async" /></p>
<figcaption>
<p>Classical ASM.js compilation and execution pipeline from C or C++ to the Web.</p>
</figcaption>
</figure>
<p><a rel="noopener external" target="_blank" href="http://kripken.github.io/emscripten-site/">Emscripten</a>, as seen in the
diagram above, is a very important project in this whole evolution of the
Web platform. Emscripten is:</p>
<blockquote>
<p>a toolchain for compiling to asm.js and WebAssembly, built using LLVM,
that lets you run C and C++ on the web at near-native speed without
plugins.</p>
</blockquote>
<p>You are very likely to see this name one day or another if you work with
ASM.js or WebAssembly.</p>
<p>I will not explain ASM.js in depth with a lot of examples. Instead, I
recommend reading <a rel="noopener external" target="_blank" href="https://johnresig.com/blog/asmjs-javascript-compile-target/">Asm.js: The Javascript Compile
Target</a> by
John Resig, or <a rel="noopener external" target="_blank" href="http://kripken.github.io/mloc_emscripten_talk/">Big Web app? Compile
it!</a> by Alon Zakai.</p>
<p>Our process will be different though. We will not compile our Rust code
directly to ASM.js, but instead, we will compile it to WebAssembly,
which in turn will be compiled into ASM.js.</p>
<h2 id="">Rust 🚀 ASM.js<a role="presentation" class="anchor" href="#" title="Anchor link to this header">#</a>
</h2>
<figure role="presentation">
<p><img src="https://mnt.io/series/from-rust-to-beyond/the-asm-js-galaxy/./rust-to-asm-js.png" alt="Rust to ASM.js" loading="lazy" decoding="async" /></p>
</figure>
<p>This episode will be very short, and somehow the easiest one. To
compile Rust to ASM.js, you need to first compile it to WebAssembly
(<a href="https://mnt.io/series/from-rust-to-beyond/the-webassembly-galaxy/">see the previous
episode</a>),
and then compile the WebAssembly binary into ASM.js.</p>
<p>Actually, ASM.js is mostly required when the browser does not support
WebAssembly, like Internet Explorer. It is essentially a fallback to run
our program on the Web.</p>
<p>The workflow is the following:</p>
<ol>
<li>Compile your Rust project into WebAssembly,</li>
<li>Compile your WebAssembly binary into an ASM.js module,</li>
<li>Optimise and shrink the ASM.js module.</li>
</ol>
<p><a rel="noopener external" target="_blank" href="https://github.com/WebAssembly/binaryen">The wasm2js tool</a> will be your
best companion to compile the WebAssembly binary into an ASM.js module.
It is part of the Binaryen project. Then, assuming we have the WebAssembly
binary of our program, all we have to do is:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> wasm2js --pedantic --output gutenberg_post_parser.asm.js gutenberg_post_parser.wasm</span></span></code></pre>
<p>At this step, the <code>gutenberg_post_parser.asm.js</code> weighs 212kb. The file
contains ECMAScript 6 code. And remember that old browsers are
considered, like Internet Explorer, so the code needs to be transformed
a little bit. To optimise and shrink the ASM.js module, we will use <a rel="noopener external" target="_blank" href="https://github.com/mishoo/UglifyJS2/tree/harmony">the
<code>uglify-es</code> tool</a>,
like this:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> # Transform code, and embed in a</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function">.</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> sed -i </span><span class="z-string">'' '1s/^/function GUTENBERG_POST_PARSER_ASM_MODULE() {/; s/export //'</span><span> gutenberg_post_parser.asm.js</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> echo </span><span class="z-string">'return { root, alloc, dealloc, memory }; }'</span><span class="z-keyword z-operator"> >></span><span> gutenberg_post_parser.asm.js</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> # Shrink the code.</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> uglifyjs --compress --mangle --output .temp.asm.js gutenberg_post_parser.asm.js</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> mv .temp.asm.js gutenberg_post_parser.asm.js</span></span></code></pre>
<p>Just like we did for the WebAssembly binary, we can compress the
resulting files with <a rel="noopener external" target="_blank" href="http://www.gzip.org/"><code>gzip</code></a> and
<a rel="noopener external" target="_blank" href="https://github.com/google/brotli"><code>brotli</code></a>:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> # Compress.</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> gzip --best --stdout gutenberg_post_parser.asm.js </span><span class="z-keyword z-operator">></span><span> gutenberg_post_parser.asm.js.gz</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> brotli --best --stdout</span><span class="z-variable"> --lgwin</span><span class="z-keyword z-operator">=</span><span class="z-string">24</span><span class="z-entity z-name"> gutenberg_post_parser.asm.js</span><span class="z-keyword z-operator"> ></span><span class="z-string"> gutenberg_post_parser.asm.js.br</span></span></code></pre>
<p>We end up with the following file sizes:</p>
<ul>
<li><code>.asm.js</code>: 54kb,</li>
<li><code>.asm.js.gz</code>: 13kb,</li>
<li><code>.asm.js.br</code>: 11kb.</li>
</ul>
<p>That's again pretty small!</p>
<p>When you think about it, this is a lot of transformations: from Rust to
WebAssembly to Javascript/ASM.js… The number of tools is rather small
compared to the amount of work. It shows a well-designed pipeline and a
collaboration between many groups of people.</p>
<p>Aside: If you are reading this post, I assume you are a developer. And as
such, I'm sure you can spend hours looking at source code as if it
were a master painting. Did you ever wonder what a Rust program looks like
once compiled to Javascript? See below:</p>
<figure>
<p><img src="https://mnt.io/series/from-rust-to-beyond/the-asm-js-galaxy/./rust-as-asm-js.png" alt="Rust as ASM.js" loading="lazy" decoding="async" /></p>
<figcaption>
<p>A Rust program compiled as WebAssembly compiled as ASM.js.</p>
</figcaption>
</figure>
<p>I like it probably too much.</p>
<h2 id="-1">ASM.js 🚀 Javascript<a role="presentation" class="anchor" href="#-1" title="Anchor link to this header">#</a>
</h2>
<p>The resulting <code>gutenberg_post_parser.asm.js</code> file contains a single
function named <code>GUTENBERG_POST_PARSER_ASM_MODULE</code> which returns an
object exposing 4 private members:</p>
<ol>
<li><code>root</code>, the axiom of our grammar,</li>
<li><code>alloc</code>, to allocate memory,</li>
<li><code>dealloc</code>, to deallocate memory, and</li>
<li><code>memory</code>, the memory buffer.</li>
</ol>
<p>It sounds familiar if you have read <a href="https://mnt.io/series/from-rust-to-beyond/the-webassembly-galaxy/">the previous episode with
WebAssembly</a>.
Don't expect <code>root</code> to return a full AST: it will return a pointer to
the memory, and the data need to be encoded and decoded, written into
and read from the memory, the same way. Yes, the same way. <em>The
exact same way</em>. So the code of the boundary layer is strictly the same.
Do you remember the <code>Module</code> object in our WebAssembly Javascript
boundary? This is exactly what the <code>GUTENBERG_POST_PARSER_ASM_MODULE</code>
function returns. You can replace <code>Module</code> by the returned object, <em>et
voilà</em>!</p>
<p><a rel="noopener external" target="_blank" href="https://github.com/Hywan/gutenberg-parser-rs/blob/master/bindings/asmjs/bin/gutenberg_post_parser.asm.mjs">The entire code lives
here</a>.
It completely reuses the Javascript boundary layer for WebAssembly. It
just sets the <code>Module</code> differently, and it does not load the WebAssembly
binary. Consequently, the ASM.js boundary layer is made of 34 lines of
code, only 🙃. It compresses to 218 bytes.</p>
<h2 id="-2">Conclusion<a role="presentation" class="anchor" href="#-2" title="Anchor link to this header">#</a>
</h2>
<p>We have seen that ASM.js can be a fallback for WebAssembly in environments
that only support Javascript (like Internet Explorer), with or without
ASM.js optimisations.</p>
<p>The resulting ASM.js file and its boundary layer are quite small. By
design, the ASM.js boundary layer reuses almost the entire WebAssembly
boundary layer. Therefore there is again a tiny surface of code to
review and to maintain, which is helpful.</p>
<p>We have seen in the previous episode that Rust is very fast. We have
been able to observe the same for WebAssembly compared to the
actual Javascript parser for the Gutenberg project. However, is it still
true for the ASM.js module? In this case, ASM.js is a fallback, and like
all fallbacks, it is notably slower than the targeted
implementation. Let's run the same benchmark but use the Rust parser as
an ASM.js module:</p>
<figure>
<table><thead><tr><th>Document</th><th>Javascript parser (ms)</th><th>Rust parser as an ASM.js module (ms)</th><th>speedup</th></tr></thead><tbody>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/demo-post.html"><code>demo-post.html</code></a></td><td>15.368</td><td>2.718</td><td>× 6</td></tr>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/shortcode-shortcomings.html"><code>shortcode-shortcomings.html</code></a></td><td>31.022</td><td>8.004</td><td>× 4</td></tr>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/redesigning-chrome-desktop.html"><code>redesigning-chrome-desktop.html</code></a></td><td>106.416</td><td>19.223</td><td>× 6</td></tr>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/web-at-maximum-fps.html"><code>web-at-maximum-fps.html</code></a></td><td>82.92</td><td>27.197</td><td>× 3</td></tr>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/early-adopting-the-future.html"><code>early-adopting-the-future.html</code></a></td><td>119.880</td><td>38.321</td><td>× 3</td></tr>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/pygmalian-raw-html.html"><code>pygmalian-raw-html.html</code></a></td><td>349.075</td><td>23.656</td><td>× 15</td></tr>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/moby-dick-parsed.html"><code>moby-dick-parsed.html</code></a></td><td>2,543.75</td><td>361.423</td><td>× 7</td></tr>
</tbody></table>
<figcaption>
<p>Benchmark between Javascript parser and Rust parser as an ASM.js module.</p>
</figcaption>
</figure>
<p>The ASM.js module of the Rust parser is in average 6 times faster than
the actual Javascript implementation. The median speedup is 6. That's
far from the WebAssembly results, but this is a fallback, and in
average, it is 6 times faster, which is really great!</p>
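<p>As a sanity check, the mean and the median can be recomputed from the table.
The sketch below copies the timings from the benchmark; the function names are
made up for the occasion, and the mean is taken as the arithmetic mean of the
per-document speedups, which is an assumption about how the figure was
obtained.</p>

```rust
/// Timings in milliseconds, copied from the benchmark table:
/// (Javascript parser, Rust parser as an ASM.js module).
pub const TIMINGS_MS: [(f64, f64); 7] = [
    (15.368, 2.718),
    (31.022, 8.004),
    (106.416, 19.223),
    (82.92, 27.197),
    (119.880, 38.321),
    (349.075, 23.656),
    (2543.75, 361.423),
];

/// Speedup of each document: Javascript time divided by ASM.js time.
pub fn speedups(timings: &[(f64, f64)]) -> Vec<f64> {
    timings.iter().map(|(javascript, asmjs)| javascript / asmjs).collect()
}

/// Arithmetic mean. Expects a non-empty slice.
pub fn mean(values: &[f64]) -> f64 {
    values.iter().sum::<f64>() / values.len() as f64
}

/// Median. Expects a non-empty slice.
pub fn median(values: &[f64]) -> f64 {
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));

    let middle = sorted.len() / 2;

    if sorted.len() % 2 == 0 {
        (sorted[middle - 1] + sorted[middle]) / 2.0
    } else {
        sorted[middle]
    }
}

#[test]
fn test_speedups_round_to_six() {
    let all = speedups(&TIMINGS_MS);

    assert_eq!(mean(&all).round(), 6.0);
    assert_eq!(median(&all).round(), 6.0);
}
```

<p>The mean is pulled up by <code>pygmalian-raw-html.html</code>, which is why the
median is worth reporting next to it.</p>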
<p>So not only is the whole pipeline safer because it starts from Rust, but
it also ends up being faster than Javascript.</p>
<p>We will see in the next episodes of this series that Rust can reach a
lot of galaxies, and the more it travels, the more interesting it gets.</p>
<p>Thanks for reading!</p>
The WebAssembly galaxy
2018-08-22
https://mnt.io/series/from-rust-to-beyond/the-webassembly-galaxy/
<p>The first galaxy that our Rust parser will explore is the WebAssembly
(Wasm) galaxy. This post will explain what WebAssembly is, how to
compile the parser into WebAssembly, and how to use the WebAssembly
binary with Javascript in a browser and with NodeJS.</p>
<h2 id="what-is-webassembly-and-why">What is WebAssembly, and why?<a role="presentation" class="anchor" href="#what-is-webassembly-and-why" title="Anchor link to this header">#</a>
</h2>
<p>If you already know WebAssembly, you can skip this section.</p>
<p><a rel="noopener external" target="_blank" href="https://webassembly.org/">WebAssembly</a> defines itself as:</p>
<blockquote>
<p>WebAssembly (abbreviated <em>Wasm</em>) is a binary instruction format for a
stack-based virtual machine. Wasm is designed as a portable target for
compilation of high-level languages like C/C++/Rust, enabling
deployment on the web for client and server applications.</p>
</blockquote>
<p>Should I say more? Probably, yes…</p>
<p>WebAssembly is a <em>new portable binary format</em>. Languages like C, C++, or
Rust already compile to this target. It is the spiritual successor of
<a rel="noopener external" target="_blank" href="http://asmjs.org/">ASM.js</a>. By spiritual successor, I mean that the same
people, trying to extend the Web platform and to make the Web fast,
are working on both technologies. They share some design concepts too,
but that's not really important right now.</p>
<p>Before WebAssembly, programs had to compile to Javascript in order to
run on the Web platform. The resulting files were large most of the
time, and because the Web is a network, downloading them took time.
WebAssembly is designed to be encoded in a size- and
load-time-efficient <a rel="noopener external" target="_blank" href="https://webassembly.org/docs/binary-encoding/">binary
format</a>.</p>
<p>WebAssembly is also faster than Javascript for many reasons. Despite
all the crazy optimisations engineers put in the Javascript virtual
machines, Javascript is a weakly and dynamically typed language, which
has to be interpreted. WebAssembly aims to execute at native speed
by taking advantage of <a rel="noopener external" target="_blank" href="https://webassembly.org/docs/portability/#assumptions-for-efficient-execution">common hardware
capabilities</a>.
<a rel="noopener external" target="_blank" href="https://hacks.mozilla.org/2018/01/making-webassembly-even-faster-firefoxs-new-streaming-and-tiering-compiler/">WebAssembly also loads faster than
Javascript</a>
because parsing and compiling happen while the binary is streamed from
the network. So once the binary is entirely fetched, it is ready to run:
no need to wait on the parser and the compiler before running the
program.</p>
<p>Today, and our blog series is a perfect example of that, it is possible
to write a Rust program and to compile it to run on the Web platform.
Why? Because WebAssembly is implemented by <a rel="noopener external" target="_blank" href="https://caniuse.com/#search=wasm">all major
browsers</a>, and because it has been
designed for the Web: to live and run on the Web platform (like a
browser). But its portable aspect and <a rel="noopener external" target="_blank" href="https://webassembly.org/docs/semantics/#linear-memory">its safe and sandboxed memory
design</a> also make it a
good candidate to run outside of the Web platform (see <a rel="noopener external" target="_blank" href="https://github.com/geal/serverless-wasm">a serverless
Wasm framework</a>, or <a rel="noopener external" target="_blank" href="https://github.com/losfair/IceCore">an
application container built for
Wasm</a>).</p>
<p>I think it is important to remember that WebAssembly is not here to
replace Javascript. It is just another technology which solves many
problems we face today, like load-time, safety, or speed.</p>
<h2 id="rust-rocket-webassembly">Rust 🚀 WebAssembly<a role="presentation" class="anchor" href="#rust-rocket-webassembly" title="Anchor link to this header">#</a>
</h2>
<figure role="presentation">
<p><img src="https://mnt.io/series/from-rust-to-beyond/the-webassembly-galaxy/./rust-to-wasm.png" alt="Rust to Wasm" loading="lazy" decoding="async" /></p>
</figure>
<p><a rel="noopener external" target="_blank" href="https://github.com/rustwasm/team">The Rust Wasm team</a> is a group of
people leading the effort of pushing Rust into WebAssembly with a set of
tools and integrations. <a rel="noopener external" target="_blank" href="https://rustwasm.github.io/book/">There is a
book</a> explaining how to write a
WebAssembly program with Rust.</p>
<p>With the Gutenberg Rust parser, I didn't use tools like
<a rel="noopener external" target="_blank" href="https://github.com/rustwasm/wasm-bindgen/"><code>wasm-bindgen</code></a> (which is a
pure gem) when I started the project a few months ago because I hit some
limitations. Note that some of them have been addressed since then!
Anyway, we will do most of the work by hand, and I think this is an
excellent way to understand how things work in the background. Once you
are familiar with WebAssembly interactions, <code>wasm-bindgen</code> is an
excellent tool to have within easy reach, because it abstracts all the
interactions and lets you focus on your code logic instead.</p>
<p>I would like to remind the reader that the Gutenberg Rust parser exposes
one AST, and one <code>root</code> function (the axiom of the grammar),
respectively defined as:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> enum</span><span class="z-entity z-name"> Node</span><span><'</span><span class="z-entity z-name">a</span><span>> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Block</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> name</span><span class="z-keyword z-operator">:</span><span> (</span><span class="z-entity z-name">Input</span><span><'</span><span class="z-entity z-name">a</span><span>>,</span><span class="z-entity z-name"> Input</span><span><'</span><span class="z-entity z-name">a</span><span>>),</span></span>
<span class="giallo-l"><span class="z-variable"> attributes</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Option</span><span><</span><span class="z-entity z-name">Input</span><span><'</span><span class="z-entity z-name">a</span><span>>>,</span></span>
<span class="giallo-l"><span class="z-variable"> children</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Vec</span><span><</span><span class="z-entity z-name">Node</span><span><'</span><span class="z-entity z-name">a</span><span>>></span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> Phrase</span><span>(</span><span class="z-entity z-name">Input</span><span><'</span><span class="z-entity z-name">a</span><span>>)</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>and</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">pub fn</span><span class="z-entity z-name z-function"> root</span><span>(</span></span>
<span class="giallo-l"><span class="z-variable"> input</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Input</span></span>
<span class="giallo-l"><span>)</span><span class="z-keyword z-operator"> -></span><span class="z-entity z-name"> Result</span><span><(</span><span class="z-entity z-name">Input</span><span>,</span><span class="z-entity z-name"> Vec</span><span><</span><span class="z-variable">ast</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Node</span><span>>),</span><span class="z-variable"> nom</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Err</span><span><</span><span class="z-entity z-name">Input</span><span>>>;</span></span></code></pre>
<p>Knowing that, let's go!</p>
<h3 id="">General design<a role="presentation" class="anchor" href="#" title="Anchor link to this header">#</a>
</h3>
<p>Here is our general design or workflow:</p>
<ol>
<li>Javascript (for instance) writes the blog post to parse into the WebAssembly
module memory,</li>
<li>Javascript runs the <code>root</code> function by passing a pointer to the memory, and
the length of the blog post,</li>
<li>Rust reads the blog post from the memory, runs the Gutenberg parser, compiles
the resulting AST into a sequence of bytes, and returns the pointer to this
sequence of bytes to Javascript,</li>
<li>Javascript reads the memory from the received pointer, and decodes the
sequence of bytes as Javascript objects in order to recreate an AST with a
friendly API.</li>
</ol>
<p>Why a sequence of bytes? Because WebAssembly only supports integers and
floats, not strings or vectors, and also because our Rust parser takes a
slice of bytes as input, so this is handy.</p>
<p>We use the term <em>boundary layer</em> to refer to this Javascript piece of
code responsible for reading from and writing into the WebAssembly module
memory, and for exposing a friendly API.</p>
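<p>The four steps of the workflow can be sketched in plain Rust, without any WebAssembly runtime. This is only an illustration under stated assumptions: the <code>Module</code> type and the <code>run</code> function are hypothetical names, the “linear memory” is a simple <code>Vec&lt;u8&gt;</code>, pointers are offsets into it, and the stand-in <code>root</code> merely uppercases ASCII letters instead of running the Gutenberg parser. What it shows is that only integers (an offset and a length) cross the boundary.</p>

```rust
/// Hypothetical stand-in for a WebAssembly module: one linear memory, and
/// functions that only exchange integers (offsets and lengths) with the host.
pub struct Module {
    memory: Vec<u8>,
}

impl Module {
    pub fn new() -> Self {
        Module { memory: Vec::new() }
    }

    /// Reserves `capacity` bytes at the end of the memory and returns their offset.
    pub fn alloc(&mut self, capacity: usize) -> usize {
        let pointer = self.memory.len();
        self.memory.resize(pointer + capacity, 0);
        pointer
    }

    /// Stand-in for `root`: it reads the input from the memory, "processes" it
    /// (it only uppercases ASCII letters, it does not parse anything), writes
    /// the result into the memory, and returns the offset of the result.
    pub fn root(&mut self, pointer: usize, length: usize) -> usize {
        let result = self.memory[pointer..pointer + length].to_ascii_uppercase();
        let output = self.alloc(result.len());
        self.memory[output..output + result.len()].copy_from_slice(&result);
        output
    }
}

/// Host side, i.e. the role played by the boundary layer.
pub fn run(post: &[u8]) -> Vec<u8> {
    let mut module = Module::new();

    // 1. Write the blog post into the module memory.
    let input = module.alloc(post.len());
    module.memory[input..input + post.len()].copy_from_slice(post);

    // 2. and 3. Call `root` with an offset and a length, get an offset back.
    let output = module.root(input, post.len());

    // 4. Read the result from the memory. The stand-in keeps the length
    // unchanged, so the host knows how many bytes to read.
    module.memory[output..output + post.len()].to_vec()
}
```

<p>In this sketch the result has the same length as the input, which sidesteps the question of how the host learns the size of the output. The real binding has to encode that information in the bytes themselves.</p>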
<p>Now, we will focus on the Rust code. It consists of only 4 functions:</p>
<ul>
<li><code>alloc</code> to allocate memory (exported),</li>
<li><code>dealloc</code> to deallocate memory (exported),</li>
<li><code>root</code> to run the parser (exported),</li>
<li><code>into_bytes</code> to transform the AST into a sequence of bytes.</li>
</ul>
<p><a rel="noopener external" target="_blank" href="https://github.com/Hywan/gutenberg-parser-rs/blob/master/bindings/wasm/src/lib.rs">The entire code lives
here</a>.
It is approximately 150 lines of code. Let's walk through it.</p>
<h3 id="-1">Memory allocation<a role="presentation" class="anchor" href="#-1" title="Anchor link to this header">#</a>
</h3>
<p>Let's start with the memory allocator. I chose to use <a rel="noopener external" target="_blank" href="https://github.com/rustwasm/wee_alloc"><code>wee_alloc</code> for
the memory allocator</a>. It is
specifically designed for WebAssembly by being very small (less than a
kilobyte) and efficient.</p>
<p>The following piece of code describes the memory allocator setup and the
“prelude” for our code (enabling some compiler features, like <code>alloc</code>,
declaring external crates, some aliases, and declaring required functions
like <code>panic</code>, <code>oom</code> etc.). This can be considered boilerplate:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>#![no_std]</span></span>
<span class="giallo-l"><span>#![feature(</span></span>
<span class="giallo-l"><span> alloc,</span></span>
<span class="giallo-l"><span> alloc_error_handler,</span></span>
<span class="giallo-l"><span> core_intrinsics,</span></span>
<span class="giallo-l"><span> lang_items</span></span>
<span class="giallo-l"><span>)]</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">extern</span><span class="z-keyword"> crate</span><span> gutenberg_post_parser;</span></span>
<span class="giallo-l"><span class="z-storage">extern</span><span class="z-keyword"> crate</span><span> wee_alloc;</span></span>
<span class="giallo-l"><span>#[macro_use]</span><span class="z-storage"> extern</span><span class="z-keyword"> crate</span><span> alloc;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> gutenberg_post_parser</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">ast</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Node</span><span>;</span></span>
<span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> alloc</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">vec</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Vec</span><span>;</span></span>
<span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> core</span><span class="z-keyword z-operator">::</span><span>{mem, slice};</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>#[global_allocator]</span></span>
<span class="giallo-l"><span class="z-storage">static</span><span class="z-constant z-other"> ALLOC</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> wee_alloc</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">WeeAlloc</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> wee_alloc</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">WeeAlloc</span><span class="z-keyword z-operator">::</span><span class="z-constant z-other">INIT</span><span>;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>#[panic_handler]</span></span>
<span class="giallo-l"><span class="z-keyword">fn</span><span class="z-entity z-name z-function"> panic</span><span>(</span><span class="z-variable">_info</span><span class="z-keyword z-operator">: &</span><span class="z-entity z-name">core</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">panic</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">PanicInfo</span><span>)</span><span class="z-keyword z-operator"> -> !</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> unsafe</span><span> {</span><span class="z-entity z-name"> core</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">intrinsics</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">abort</span><span>(); }</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>#[alloc_error_handler]</span></span>
<span class="giallo-l"><span class="z-keyword">fn</span><span class="z-entity z-name z-function"> oom</span><span>(</span><span class="z-variable">_</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> core</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">alloc</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Layout</span><span>)</span><span class="z-keyword z-operator"> -> !</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> unsafe</span><span> {</span><span class="z-entity z-name"> core</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">intrinsics</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">abort</span><span>(); }</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// This is the definition of `std::ffi::c_void`, but Wasm runs without std in our case.</span></span>
<span class="giallo-l"><span>#[repr(</span><span class="z-entity z-name">u8</span><span>)]</span></span>
<span class="giallo-l"><span>#[allow(non_camel_case_types)]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> enum</span><span class="z-variable"> c_void</span><span> {</span></span>
<span class="giallo-l"><span> #[doc(hidden)]</span></span>
<span class="giallo-l"><span class="z-variable"> __variant1</span><span>,</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> #[doc(hidden)]</span></span>
<span class="giallo-l"><span class="z-variable"> __variant2</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>The Rust memory is the WebAssembly memory. Rust will allocate and
deallocate memory on its own, but Javascript, for instance, needs to
allocate and deallocate WebAssembly memory in order to
exchange data with Rust. So we need to export one function to allocate
memory and one function to deallocate memory.</p>
<p>Once again, this is almost boilerplate. The <code>alloc</code> function creates
an empty vector of a specific capacity (because it is a linear segment
of memory), and returns a pointer to this empty vector:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>#[no_mangle]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> extern</span><span class="z-string"> "C"</span><span class="z-keyword"> fn</span><span class="z-entity z-name z-function"> alloc</span><span>(</span><span class="z-variable">capacity</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> usize</span><span>)</span><span class="z-keyword z-operator"> -> *</span><span class="z-storage">mut</span><span class="z-variable"> c_void</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> let mut</span><span class="z-variable"> buffer</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> Vec</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">with_capacity</span><span>(</span><span class="z-variable">capacity</span><span>);</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> pointer</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> buffer</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">as_mut_ptr</span><span>();</span></span>
<span class="giallo-l"><span class="z-entity z-name"> mem</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">forget</span><span>(</span><span class="z-variable">buffer</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> pointer</span><span class="z-keyword"> as</span><span class="z-keyword z-operator"> *</span><span class="z-storage">mut</span><span class="z-variable"> c_void</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Note the <code>#[no_mangle]</code> attribute that instructs the Rust compiler to
not mangle the function name, i.e. to not rename it, and <code>extern "C"</code>, which
gives the function a C-compatible calling convention so that it can be
exported from the WebAssembly module and is “public” from
outside the WebAssembly binary.</p>
<p>The code is pretty straightforward and matches what we announced
earlier: a <code>Vec</code> is allocated with a specific capacity, and the pointer
to this vector is returned. The important part is <code>mem::forget(buffer)</code>.
It is required so that Rust will <em>not</em> deallocate the vector once it
goes out of scope. Indeed, Rust enforces <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Resource_acquisition_is_initialization">Resource Acquisition Is
Initialization
(RAII)</a>,
so whenever an object goes out of scope, its destructor is called and
its owned resources are freed. This behavior shields against resource
leak bugs, and this is why we rarely have to manually free memory
or worry about memory leaks in Rust (<a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/rust-by-example/scope/raii.html">see some RAII
examples</a>).
In this case, we want to allocate and keep the allocation after the
function execution, hence <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/mem/fn.forget.html">the <code>mem::forget</code>
call</a>.</p>
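<p>To see the effect of <code>mem::forget</code> in isolation, here is a minimal sketch. The <code>Tracked</code> type and the <code>count_drops</code> function are hypothetical names introduced for this illustration, and they are not part of the binding. A value that goes out of scope runs its destructor once, while a forgotten value never runs it.</p>

```rust
use std::cell::Cell;
use std::mem;

/// Hypothetical type that counts how many times its destructor runs.
struct Tracked<'a> {
    drops: &'a Cell<u32>,
}

impl<'a> Drop for Tracked<'a> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Creates one `Tracked` value in an inner scope and returns the number of
/// destructor calls observed once that scope has ended.
pub fn count_drops(forget: bool) -> u32 {
    let drops = Cell::new(0);

    {
        let value = Tracked { drops: &drops };

        if forget {
            // Ownership is given up without running the destructor.
            mem::forget(value);
        }

        // If `value` has not been forgotten, it is dropped here (RAII).
    }

    drops.get()
}
```

<p>Calling <code>count_drops(false)</code> observes one destructor call, whereas <code>count_drops(true)</code> observes none. This is the behavior the <code>alloc</code> function relies on to keep the buffer alive after it returns.</p>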
<p>Let's jump to the <code>dealloc</code> function. The goal is to recreate a vector
based on a pointer and a capacity, and to let Rust drop it:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>#[no_mangle]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> extern</span><span class="z-string"> "C"</span><span class="z-keyword"> fn</span><span class="z-entity z-name z-function"> dealloc</span><span>(</span><span class="z-variable">pointer</span><span class="z-keyword z-operator">: *</span><span class="z-storage">mut</span><span class="z-variable"> c_void</span><span>,</span><span class="z-variable"> capacity</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> usize</span><span>) {</span></span>
<span class="giallo-l"><span class="z-keyword"> unsafe</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> _</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> Vec</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">from_raw_parts</span><span>(</span><span class="z-variable">pointer</span><span>,</span><span class="z-constant z-numeric"> 0</span><span>,</span><span class="z-variable"> capacity</span><span>);</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p><a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/vec/struct.Vec.html#method.from_raw_parts">The <code>Vec::from_raw_parts</code>
function</a>
is marked as unsafe, so we need to wrap the call in an <code>unsafe</code> block so
that the <code>dealloc</code> function itself is considered safe.</p>
<p>The <code>_</code> pattern does not bind the vector that holds our data to
deallocate, so Rust drops it immediately, at the end of the statement.</p>
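<p>The <code>alloc</code>/<code>dealloc</code> pair can be exercised outside of WebAssembly. The following is a sketch under a few assumptions: it uses <code>std</code> instead of <code>no_std</code>, <code>*mut u8</code> instead of <code>*mut c_void</code>, and a hypothetical <code>round_trip</code> function that plays the role of the host by writing bytes into the buffer, reading them back, and releasing the buffer.</p>

```rust
use std::{mem, ptr, slice};

/// Same idea as the exported `alloc`: allocate, then leak the buffer to the caller.
pub fn alloc(capacity: usize) -> *mut u8 {
    let mut buffer: Vec<u8> = Vec::with_capacity(capacity);

    // `Vec::with_capacity` is allowed to reserve more than requested, whereas
    // `dealloc` rebuilds the vector with the capacity given by the caller.
    // This sketch therefore insists on an exact match.
    assert_eq!(buffer.capacity(), capacity);

    let pointer = buffer.as_mut_ptr();
    mem::forget(buffer);

    pointer
}

/// Same idea as the exported `dealloc`: rebuild the vector, and let Rust drop it.
/// `pointer` and `capacity` must come from a previous call to `alloc`.
pub fn dealloc(pointer: *mut u8, capacity: usize) {
    unsafe {
        drop(Vec::from_raw_parts(pointer, 0, capacity));
    }
}

/// Hypothetical host: writes `input` into a fresh buffer, reads it back, frees the buffer.
pub fn round_trip(input: &[u8]) -> Vec<u8> {
    let pointer = alloc(input.len());

    let copy = unsafe {
        // Write the input into the allocated memory…
        ptr::copy_nonoverlapping(input.as_ptr(), pointer, input.len());
        // … and read it back into an owned vector.
        slice::from_raw_parts(pointer, input.len()).to_vec()
    };

    dealloc(pointer, input.len());

    copy
}
```

<p>Note that the caller is responsible for passing back the very same capacity to <code>dealloc</code>. Nothing in the signature enforces it, which is why the boundary layer has to keep track of the pointer and the capacity together.</p>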
<h3 id="-2">From input to a flat AST<a role="presentation" class="anchor" href="#-2" title="Anchor link to this header">#</a>
</h3>
<p>Now the core of the binding! The <code>root</code> function reads the blog post to
parse based on a pointer and a length, then it parses it. If the result
is OK, it serializes the AST into a sequence of bytes, i.e. it flattens
it, otherwise it returns an empty sequence of bytes.</p>
<figure>
<p><img src="https://mnt.io/series/from-rust-to-beyond/the-webassembly-galaxy/./flatten-ast.png" alt="Flatten AST" loading="lazy" decoding="async" /></p>
<figcaption>
<p>The image illustrates the flow of the data: first off there is a blog post;
second there is the AST of the blog post; finally there is a linear byte-encoded
representation of the AST of the blog post.</p>
</figcaption>
</figure>
<p>The logic flow of the parser: The input on the left is parsed into an
AST, which is serialized into a flat sequence of bytes on the right.</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span>#[no_mangle]</span></span>
<span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> extern</span><span class="z-string"> "C"</span><span class="z-keyword"> fn</span><span class="z-entity z-name z-function"> root</span><span>(</span><span class="z-variable">pointer</span><span class="z-keyword z-operator">: *</span><span class="z-storage">mut</span><span class="z-entity z-name"> u8</span><span>,</span><span class="z-variable"> length</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> usize</span><span>)</span><span class="z-keyword z-operator"> -> *</span><span class="z-storage">mut</span><span class="z-entity z-name"> u8</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> input</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> unsafe</span><span> {</span><span class="z-entity z-name"> slice</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">from_raw_parts</span><span>(</span><span class="z-variable">pointer</span><span>,</span><span class="z-variable"> length</span><span>) };</span></span>
<span class="giallo-l"><span class="z-storage"> let mut</span><span class="z-variable"> output</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name z-function"> vec!</span><span>[];</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> if</span><span class="z-storage"> let</span><span class="z-entity z-name"> Ok</span><span>((</span><span class="z-variable">_remaining</span><span>,</span><span class="z-variable"> nodes</span><span>))</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> gutenberg_post_parser</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">root</span><span>(</span><span class="z-variable">input</span><span>) {</span></span>
<span class="giallo-l"><span class="z-comment"> // Compile the AST (nodes) into a sequence of bytes.</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> pointer</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">as_mut_ptr</span><span>();</span></span>
<span class="giallo-l"><span class="z-entity z-name"> mem</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">forget</span><span>(</span><span class="z-variable">output</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> pointer</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>The variable <code>input</code> contains the blog post. It is fetched from memory
with a pointer and a length. The variable <code>output</code> is the sequence of
bytes the function will return. <code>gutenberg_post_parser::root(input)</code>
runs the parser. If parsing is OK, then the <code>nodes</code> are compiled into a
sequence of bytes (omitted for now). Then the pointer to the sequence of
bytes is grabbed, the Rust compiler is instructed to not drop it, and
finally the pointer is returned. The logic is again pretty
straightforward.</p>
<p>Now, let's focus on compiling the AST into a sequence of bytes (<code>u8</code>).
All the data the AST holds is already bytes, which makes the process easier.
The goal is only to flatten the AST:</p>
<ul>
<li>The first 4 bytes represent the number of nodes at the first level (4
× <code>u8</code> represents <code>u32</code>),</li>
<li>Next, if the node is <code>Block</code>:
<ul>
<li>The first byte is the node type: <code>1u8</code> for a block,</li>
<li>The second byte is the size of the block name,</li>
<li>The third to the sixth bytes are the size of the attributes,</li>
<li>The seventh byte is the number of node children the block has,</li>
<li>Next bytes are the block name,</li>
<li>Next bytes are the attributes (<code>&b"null"[..]</code> if none),</li>
<li>Next bytes are node children as a sequence of bytes,</li>
</ul>
</li>
<li>Next, if the node is <code>Phrase</code>:
<ul>
<li>The first byte is the node type: <code>2u8</code> for a phrase,</li>
<li>The second to the fifth bytes are the size of the phrase,</li>
<li>Next bytes are the phrase.</li>
</ul>
</li>
</ul>
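<p>As a worked example of this layout, here is a sketch restricted to phrases. Two things are assumptions: the body of <code>u32_to_u8s</code> given here, which splits the integer in big-endian order (most significant byte first), and the <code>encode_phrases</code> function, a hypothetical helper that only exists for this illustration.</p>

```rust
/// Assumed implementation: splits a `u32` into 4 bytes, most significant byte first.
fn u32_to_u8s(x: u32) -> (u8, u8, u8, u8) {
    let [a, b, c, d] = x.to_be_bytes();

    (a, b, c, d)
}

/// Hypothetical helper: flattens a first level made of phrases only.
pub fn encode_phrases(phrases: &[&[u8]]) -> Vec<u8> {
    let mut output = Vec::new();

    // The first 4 bytes: the number of nodes at the first level.
    let (a, b, c, d) = u32_to_u8s(phrases.len() as u32);
    output.extend_from_slice(&[a, b, c, d]);

    for phrase in phrases {
        // First byte: the node type, `2u8` for a phrase.
        output.push(2u8);

        // Second to fifth bytes: the size of the phrase.
        let (a, b, c, d) = u32_to_u8s(phrase.len() as u32);
        output.extend_from_slice(&[a, b, c, d]);

        // Next bytes: the phrase itself.
        output.extend_from_slice(phrase);
    }

    output
}
```

<p>With this byte order, a post made of the single phrase <code>abc</code> flattens to <code>[0, 0, 0, 1, 2, 0, 0, 0, 3, 97, 98, 99]</code>: one node, of type phrase, of size 3, followed by its 3 bytes.</p>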
<p>Here is the missing part of the <code>root</code> function:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">if</span><span class="z-storage"> let</span><span class="z-entity z-name"> Ok</span><span>((</span><span class="z-variable">_remaining</span><span>,</span><span class="z-variable"> nodes</span><span>))</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> gutenberg_post_parser</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">root</span><span>(</span><span class="z-variable">input</span><span>) {</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> nodes_length</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name z-function"> u32_to_u8s</span><span>(</span><span class="z-variable">nodes</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">len</span><span>()</span><span class="z-keyword"> as</span><span class="z-entity z-name"> u32</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push</span><span>(</span><span class="z-variable">nodes_length</span><span class="z-keyword z-operator">.</span><span class="z-constant z-numeric">0</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push</span><span>(</span><span class="z-variable">nodes_length</span><span class="z-keyword z-operator">.</span><span class="z-constant z-numeric">1</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push</span><span>(</span><span class="z-variable">nodes_length</span><span class="z-keyword z-operator">.</span><span class="z-constant z-numeric">2</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push</span><span>(</span><span class="z-variable">nodes_length</span><span class="z-keyword z-operator">.</span><span class="z-constant z-numeric">3</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> for</span><span class="z-variable"> node</span><span class="z-keyword"> in</span><span class="z-variable"> nodes</span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> into_bytes</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-variable">node</span><span>,</span><span class="z-keyword z-operator"> &</span><span class="z-storage">mut</span><span class="z-variable"> output</span><span>);</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>And here is the <code>into_bytes</code> function:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">fn</span><span class="z-entity z-name z-function"> into_bytes</span><span><'</span><span class="z-entity z-name">a</span><span>>(</span><span class="z-variable">node</span><span class="z-keyword z-operator">: &</span><span class="z-entity z-name">Node</span><span><'</span><span class="z-entity z-name">a</span><span>>,</span><span class="z-variable"> output</span><span class="z-keyword z-operator">: &</span><span class="z-storage">mut</span><span class="z-entity z-name"> Vec</span><span><</span><span class="z-entity z-name">u8</span><span>>) {</span></span>
<span class="giallo-l"><span class="z-keyword"> match</span><span class="z-keyword z-operator"> *</span><span class="z-variable">node</span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Node</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Block</span><span> {</span><span class="z-variable"> name</span><span>,</span><span class="z-variable"> attributes</span><span>,</span><span class="z-keyword"> ref</span><span class="z-variable"> children</span><span> }</span><span class="z-keyword z-operator"> =></span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> node_type</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-numeric"> 1</span><span class="z-entity z-name">u8</span><span>;</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> name_length</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> name</span><span class="z-keyword z-operator">.</span><span class="z-constant z-numeric">0</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">len</span><span>()</span><span class="z-keyword z-operator"> +</span><span class="z-variable"> name</span><span class="z-keyword z-operator">.</span><span class="z-constant z-numeric">1</span><span class="z-punctuation z-separator">.</span><span class="z-entity z-name z-function">len</span><span>()</span><span class="z-keyword z-operator"> +</span><span class="z-constant z-numeric"> 1</span><span>;</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> attributes_length</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> match</span><span class="z-variable"> attributes</span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Some</span><span>(</span><span class="z-variable">attributes</span><span>)</span><span class="z-keyword z-operator"> =></span><span class="z-variable"> attributes</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">len</span><span>(),</span></span>
<span class="giallo-l"><span class="z-entity z-name"> None</span><span class="z-keyword z-operator"> =></span><span class="z-constant z-numeric"> 4</span></span>
<span class="giallo-l"><span> };</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> attributes_length_as_u8s</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name z-function"> u32_to_u8s</span><span>(</span><span class="z-variable">attributes_length</span><span class="z-keyword"> as</span><span class="z-entity z-name"> u32</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> number_of_children</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> children</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">len</span><span>();</span></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push</span><span>(</span><span class="z-variable">node_type</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push</span><span>(</span><span class="z-variable">name_length</span><span class="z-keyword"> as</span><span class="z-entity z-name"> u8</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push</span><span>(</span><span class="z-variable">attributes_length_as_u8s</span><span class="z-keyword z-operator">.</span><span class="z-constant z-numeric">0</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push</span><span>(</span><span class="z-variable">attributes_length_as_u8s</span><span class="z-keyword z-operator">.</span><span class="z-constant z-numeric">1</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push</span><span>(</span><span class="z-variable">attributes_length_as_u8s</span><span class="z-keyword z-operator">.</span><span class="z-constant z-numeric">2</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push</span><span>(</span><span class="z-variable">attributes_length_as_u8s</span><span class="z-keyword z-operator">.</span><span class="z-constant z-numeric">3</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push</span><span>(</span><span class="z-variable">number_of_children</span><span class="z-keyword"> as</span><span class="z-entity z-name"> u8</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">extend</span><span>(</span><span class="z-variable">name</span><span class="z-keyword z-operator">.</span><span class="z-constant z-numeric">0</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push</span><span>(</span><span class="z-string">b'/'</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">extend</span><span>(</span><span class="z-variable">name</span><span class="z-keyword z-operator">.</span><span class="z-constant z-numeric">1</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> if</span><span class="z-storage"> let</span><span class="z-entity z-name"> Some</span><span>(</span><span class="z-variable">attributes</span><span>)</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> attributes</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">extend</span><span>(</span><span class="z-variable">attributes</span><span>);</span></span>
<span class="giallo-l"><span> }</span><span class="z-keyword"> else</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">extend</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-string">b"null"</span><span>[</span><span class="z-keyword z-operator">..</span><span>]);</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> for</span><span class="z-variable"> child</span><span class="z-keyword"> in</span><span class="z-variable"> children</span><span> {</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> into_bytes</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-variable">child</span><span>,</span><span class="z-variable"> output</span><span>);</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name"> Node</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">Phrase</span><span>(</span><span class="z-variable">phrase</span><span>)</span><span class="z-keyword z-operator"> =></span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> node_type</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-numeric"> 2</span><span class="z-entity z-name">u8</span><span>;</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> phrase_length</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> phrase</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">len</span><span>();</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push</span><span>(</span><span class="z-variable">node_type</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> phrase_length_as_u8s</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name z-function"> u32_to_u8s</span><span>(</span><span class="z-variable">phrase_length</span><span class="z-keyword"> as</span><span class="z-entity z-name"> u32</span><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push</span><span>(</span><span class="z-variable">phrase_length_as_u8s</span><span class="z-keyword z-operator">.</span><span class="z-constant z-numeric">0</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push</span><span>(</span><span class="z-variable">phrase_length_as_u8s</span><span class="z-keyword z-operator">.</span><span class="z-constant z-numeric">1</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push</span><span>(</span><span class="z-variable">phrase_length_as_u8s</span><span class="z-keyword z-operator">.</span><span class="z-constant z-numeric">2</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">push</span><span>(</span><span class="z-variable">phrase_length_as_u8s</span><span class="z-keyword z-operator">.</span><span class="z-constant z-numeric">3</span><span>);</span></span>
<span class="giallo-l"><span class="z-variable"> output</span><span class="z-keyword z-operator">.</span><span class="z-entity z-name z-function">extend</span><span>(</span><span class="z-variable">phrase</span><span>);</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>What I find interesting with this code is that it reads just like the bullet
list above it.</p>
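<p>To make the layout concrete, here is a minimal JavaScript sketch that mirrors
the <code>Node::Phrase</code> branch above. It is only an illustration: the
<code>encodePhrase</code> name is hypothetical and does not exist in the project,
where the encoding is done by the Rust code.</p>
<pre class="giallo z-code"><code data-lang="javascript">// Hypothetical helper, for illustration only: it reproduces in JavaScript
// the byte layout written by the `Node::Phrase` branch of `into_bytes`.
function encodePhrase(text) {
    const phrase = new TextEncoder().encode(text);
    const output = new Uint8Array(1 + 4 + phrase.length);
    const view = new DataView(output.buffer);

    // Byte 0: the node type, 2 for a phrase.
    output[0] = 2;

    // Bytes 1 to 4: the phrase length as a big-endian `u32`.
    view.setUint32(1, phrase.length, false);

    // Remaining bytes: the phrase itself.
    output.set(phrase, 5);

    return output;
}
</code></pre>
<p>With this sketch, <code>encodePhrase('xyz')</code> yields the bytes
<code>2, 0, 0, 0, 3, 120, 121, 122</code>: the node type, the length on 4 bytes,
then the phrase itself.</p>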
<p>For the most curious, here is the <code>u32_to_u8s</code> function:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">fn</span><span class="z-entity z-name z-function"> u32_to_u8s</span><span>(</span><span class="z-variable">x</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> u32</span><span>)</span><span class="z-keyword z-operator"> -></span><span> (</span><span class="z-entity z-name">u8</span><span>,</span><span class="z-entity z-name"> u8</span><span>,</span><span class="z-entity z-name"> u8</span><span>,</span><span class="z-entity z-name"> u8</span><span>) {</span></span>
<span class="giallo-l"><span> (</span></span>
<span class="giallo-l"><span> ((</span><span class="z-variable">x</span><span class="z-keyword z-operator"> >></span><span class="z-constant z-numeric"> 24</span><span>)</span><span class="z-keyword z-operator"> &</span><span class="z-constant z-numeric"> 0xff</span><span>)</span><span class="z-keyword"> as</span><span class="z-entity z-name"> u8</span><span>,</span></span>
<span class="giallo-l"><span> ((</span><span class="z-variable">x</span><span class="z-keyword z-operator"> >></span><span class="z-constant z-numeric"> 16</span><span>)</span><span class="z-keyword z-operator"> &</span><span class="z-constant z-numeric"> 0xff</span><span>)</span><span class="z-keyword"> as</span><span class="z-entity z-name"> u8</span><span>,</span></span>
<span class="giallo-l"><span> ((</span><span class="z-variable">x</span><span class="z-keyword z-operator"> >></span><span class="z-constant z-numeric"> 8</span><span>)</span><span class="z-keyword z-operator"> &</span><span class="z-constant z-numeric"> 0xff</span><span>)</span><span class="z-keyword"> as</span><span class="z-entity z-name"> u8</span><span>,</span></span>
<span class="giallo-l"><span> (</span><span class="z-variable"> x</span><span class="z-keyword z-operator"> &</span><span class="z-constant z-numeric"> 0xff</span><span>)</span><span class="z-keyword"> as</span><span class="z-entity z-name"> u8</span></span>
<span class="giallo-l"><span> )</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
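<p>Since the most significant byte comes first, <code>u32_to_u8s</code> produces a
big-endian sequence. Assuming the bytes are later read from JavaScript, a
<code>DataView</code> can put them back together. Here is a minimal sketch, where
<code>readU32</code> is a hypothetical helper name:</p>
<pre class="giallo z-code"><code data-lang="javascript">// Hypothetical helper: read 4 bytes at `offset` as a big-endian `u32`,
// i.e. the inverse of the Rust `u32_to_u8s` function.
function readU32(bytes, offset = 0) {
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);

    // `false` asks for big-endian, which is also the default of `getUint32`.
    return view.getUint32(offset, false);
}
</code></pre>
<p>For instance, <code>readU32(new Uint8Array([0, 0, 1, 2]))</code> returns
<code>258</code>, that is 1 × 256 + 2.</p>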
<p>Here we are. <code>alloc</code>, <code>dealloc</code>, <code>root</code>, and <code>into_bytes</code>. Four
functions, and everything is done.</p>
<h3 id="-3">Producing and optimising the WebAssembly binary<a role="presentation" class="anchor" href="#-3" title="Anchor link to this header">#</a>
</h3>
<p>To get a WebAssembly binary, the project has to be compiled to the
<code>wasm32-unknown-unknown</code> target. For now (and it will change in the near
future), the nightly toolchain is needed to compile the project, so make
sure you have the latest nightly version of <code>rustc</code> & co. installed with
<code>rustup update nightly</code>. Let's run <code>cargo</code>:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span class="z-variable"> RUSTFLAGS</span><span class="z-keyword z-operator">=</span><span class="z-string">'-g'</span><span class="z-entity z-name"> cargo</span><span class="z-string"> +nightly build</span><span class="z-constant z-other"> --target</span><span class="z-string"> wasm32-unknown-unknown</span><span class="z-constant z-other"> --release</span></span></code></pre>
<p>The WebAssembly binary weighs 22kb. Our goal is to reduce the file
size. For that, the following tools will be required:</p>
<ul>
<li><a rel="noopener external" target="_blank" href="https://github.com/alexcrichton/wasm-gc"><code>wasm-gc</code></a> to garbage-collect unused
imports, internal functions, types etc.,</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/rustwasm/wasm-snip"><code>wasm-snip</code></a> to mark some functions as
unreachable, which is useful when the binary includes unused code that the
linker was not able to remove,</li>
<li><code>wasm-opt</code> from the <a rel="noopener external" target="_blank" href="https://github.com/WebAssembly/binaryen">Binaryen
project</a>, to optimise the
binary,</li>
<li><a rel="noopener external" target="_blank" href="http://www.gzip.org/"><code>gzip</code></a> and
<a rel="noopener external" target="_blank" href="https://github.com/google/brotli"><code>brotli</code></a> to compress the binary.</li>
</ul>
<p>Basically, what we do is the following:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> # Garbage-collect unused data.</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> wasm-gc gutenberg_post_parser.wasm</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> # Mark fmt and panicking as unreachable.</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> wasm-snip --snip-rust-fmt-code --snip-rust-panicking-code gutenberg_post_parser.wasm -o gutenberg_post_parser_snipped.wasm</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> mv gutenberg_post_parser_snipped.wasm gutenberg_post_parser.wasm</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> # Garbage-collect unreachable data.</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> wasm-gc gutenberg_post_parser.wasm</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> # Optimise </span><span class="z-keyword">for</span><span> small size.</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> wasm-opt -Oz -o gutenberg_post_parser_opt.wasm gutenberg_post_parser.wasm</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> mv gutenberg_post_parser_opt.wasm gutenberg_post_parser.wasm</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> # Compress.</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> gzip --best --stdout gutenberg_post_parser.wasm </span><span class="z-keyword z-operator">></span><span> gutenberg_post_parser.wasm.gz</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> brotli --best --stdout</span><span class="z-variable"> --lgwin</span><span class="z-keyword z-operator">=</span><span class="z-string">24</span><span class="z-entity z-name"> gutenberg_post_parser.wasm</span><span class="z-keyword z-operator"> ></span><span class="z-string"> gutenberg_post_parser.wasm.br</span><span> </span></span></code></pre>
<p>We end up with the following file sizes:</p>
<ul>
<li><code>.wasm</code>: 16kb,</li>
<li><code>.wasm.gz</code>: 7.3kb,</li>
<li><code>.wasm.br</code>: 6.2kb.</li>
</ul>
<p>Neat! <a rel="noopener external" target="_blank" href="https://caniuse.com/#search=brotli">Brotli is implemented by most
browsers</a>, so when the client sends
<code>Accept-Encoding: br</code>, the server can respond with the <code>.wasm.br</code> file.</p>
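<p>On the server side, this negotiation could be sketched as follows. This is an
assumption about how one might serve the files, not something the project
ships: <code>pickWasmVariant</code> is a hypothetical helper, and it ignores
quality values (<code>;q=…</code>) to stay short.</p>
<pre class="giallo z-code"><code data-lang="javascript">// Hypothetical server-side helper: choose which pre-compressed file to serve
// from the value of the `Accept-Encoding` request header.
// Simplification: quality values (`;q=…`) are stripped and ignored.
function pickWasmVariant(acceptEncoding) {
    const accepted = (acceptEncoding || '')
        .split(',')
        .map(token => token.split(';')[0].trim().toLowerCase());

    if (accepted.includes('br')) {
        return { file: 'gutenberg_post_parser.wasm.br', contentEncoding: 'br' };
    }

    if (accepted.includes('gzip')) {
        return { file: 'gutenberg_post_parser.wasm.gz', contentEncoding: 'gzip' };
    }

    // No supported compression: serve the raw binary.
    return { file: 'gutenberg_post_parser.wasm', contentEncoding: null };
}
</code></pre>
<p>Whatever the variant, the response would carry the matching
<code>Content-Encoding</code> header, so that the browser decompresses the file
before handing it to WebAssembly.</p>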
<p>To give you a feeling of what 6.2kb represents, the following image also
weighs 6.2kb:</p>
<figure>
<p><img src="https://mnt.io/series/from-rust-to-beyond/the-webassembly-galaxy/./image-example.png" alt="The WordPress logo" loading="lazy" decoding="async" /></p>
<figcaption>
<p>An image that weighs as much as our compressed WebAssembly module.</p>
</figcaption>
</figure>
<p>The WebAssembly binary is ready to run!</p>
<h2 id="-4">WebAssembly 🚀 JavaScript<a role="presentation" class="anchor" href="#-4" title="Anchor link to this header">#</a>
</h2>
<figure role="presentation">
<p><img src="https://mnt.io/series/from-rust-to-beyond/the-webassembly-galaxy/./wasm-to-js.png" alt="Wasm to JS" loading="lazy" decoding="async" /></p>
</figure>
<p>In this section, we assume JavaScript runs in a browser. Thus, what we
need to do is the following:</p>
<ol>
<li>Load/stream and instantiate the WebAssembly binary,</li>
<li>Write the blog post to parse in the WebAssembly module memory,</li>
<li>Call the <code>root</code> function on the parser,</li>
<li>Read the WebAssembly module memory to load the flat AST (a sequence of bytes)
and decode it to build a “JavaScript AST” (with our own objects).</li>
</ol>
<p><a rel="noopener external" target="_blank" href="https://github.com/Hywan/gutenberg-parser-rs/blob/master/bindings/wasm/bin/gutenberg_post_parser.mjs">The entire code lands
here</a>.
It is approximately 150 lines of code too. I won't explain the whole
code since some parts of it are the “friendly API” that is exposed to the
user. Instead, I will explain the major pieces.</p>
<h3 id="-5">Loading/streaming and instantiating<a role="presentation" class="anchor" href="#-5" title="Anchor link to this header">#</a>
</h3>
<p><a rel="noopener external" target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/WebAssembly">The <code>WebAssembly</code>
API</a>
exposes multiple ways to load a WebAssembly binary. The best one to use
is <a rel="noopener external" target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/WebAssembly/instantiateStreaming">the <code>WebAssembly.instantiateStreaming</code>
function</a>:
It streams the binary and compiles it at the same time, so nothing is
blocking. This API relies on <a rel="noopener external" target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API">the <code>Fetch</code>
API</a>. You
might have guessed it: It is asynchronous (it returns a promise).
WebAssembly itself is not asynchronous (unless you use threads), but
the instantiation step is. It is possible to avoid that, but this is
tricky, and Google Chrome has a hard limit of 4kb for the binary size,
which will make you give up quickly.</p>
<p>To be able to stream the WebAssembly binary, the server must send the
<code>application/wasm</code> MIME type (with the <code>Content-Type</code> header).</p>
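<p>If the server cannot be configured to send that MIME type, one possible
fallback is to download the whole binary first and to instantiate it from an
<code>ArrayBuffer</code>. Here is a minimal sketch under that assumption;
<code>loadWasm</code> is a hypothetical helper and is not part of the project:</p>
<pre class="giallo z-code"><code data-lang="javascript">// Hypothetical helper: stream when the MIME type allows it, otherwise fall
// back to the non-streaming `WebAssembly.instantiate`.
async function loadWasm(response, imports = {}) {
    const contentType = response.headers.get('Content-Type') || '';

    if (contentType.startsWith('application/wasm')) {
        // Compile while the bytes are still being downloaded.
        const streamed = await WebAssembly.instantiateStreaming(response, imports);

        return streamed.instance;
    }

    // Download everything first, then compile and instantiate.
    const bytes = await response.arrayBuffer();
    const result = await WebAssembly.instantiate(bytes, imports);

    return result.instance;
}
</code></pre>
<p>It would be called with the result of <code>fetch</code>, for example
<code>fetch(url).then(response => loadWasm(response))</code>. The fallback is
slower since compilation only starts once the download is complete.</p>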
<p>Let's instantiate our WebAssembly:</p>
<pre class="giallo z-code"><code data-lang="javascript"><span class="giallo-l"><span class="z-storage">const</span><span class="z-variable"> url</span><span class="z-keyword z-operator"> =</span><span class="z-string"> '/gutenberg_post_parser.wasm'</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage">const</span><span class="z-variable"> wasm</span><span class="z-keyword z-operator"> =</span></span>
<span class="giallo-l"><span class="z-variable"> WebAssembly</span><span class="z-punctuation z-accessor">.</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> instantiateStreaming</span><span>(</span><span class="z-entity z-name z-function">fetch</span><span>(</span><span class="z-variable">url</span><span>)</span><span class="z-punctuation z-separator">,</span><span> {})</span><span class="z-punctuation z-accessor">.</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> then</span><span>(</span><span class="z-variable z-parameter">object</span><span class="z-storage z-type z-function"> =></span><span class="z-variable"> object</span><span class="z-punctuation z-accessor">.</span><span class="z-variable">instance</span><span>)</span><span class="z-punctuation z-accessor">.</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> then</span><span>(</span><span class="z-variable z-parameter">instance</span><span class="z-storage z-type z-function"> =></span><span> {</span><span class="z-comment"> /* step 2 */</span><span> })</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>The WebAssembly binary has been instantiated! Now we can move to the
next step.</p>
<h3 id="-6">Last polish before running the parser<a role="presentation" class="anchor" href="#-6" title="Anchor link to this header">#</a>
</h3>
<p>Remember that the WebAssembly binary exports 3 functions: <code>alloc</code>,
<code>dealloc</code>, and <code>root</code>. They can be found on the <code>exports</code> property,
along with the memory. Let's write that:</p>
<pre class="giallo z-code"><code data-lang="javascript"><span class="giallo-l"><span class="z-entity z-name z-function"> then</span><span>(</span><span class="z-variable z-parameter">instance</span><span class="z-storage z-type z-function"> =></span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> Module</span><span class="z-keyword z-operator"> =</span><span> {</span></span>
<span class="giallo-l"><span> alloc</span><span class="z-punctuation z-separator">:</span><span class="z-variable"> instance</span><span class="z-punctuation z-accessor">.</span><span class="z-variable">exports</span><span class="z-punctuation z-accessor">.</span><span class="z-variable">alloc</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span> dealloc</span><span class="z-punctuation z-separator">:</span><span class="z-variable"> instance</span><span class="z-punctuation z-accessor">.</span><span class="z-variable">exports</span><span class="z-punctuation z-accessor">.</span><span class="z-variable">dealloc</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span> root</span><span class="z-punctuation z-separator">:</span><span class="z-variable"> instance</span><span class="z-punctuation z-accessor">.</span><span class="z-variable">exports</span><span class="z-punctuation z-accessor">.</span><span class="z-variable">root</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span> memory</span><span class="z-punctuation z-separator">:</span><span class="z-variable"> instance</span><span class="z-punctuation z-accessor">.</span><span class="z-variable">exports</span><span class="z-punctuation z-accessor">.</span><span class="z-variable">memory</span></span>
<span class="giallo-l"><span> }</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> runParser</span><span>(</span><span class="z-variable">Module</span><span class="z-punctuation z-separator">,</span><span class="z-string"> '<!-- wp:foo /-->xyz'</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> })</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>Great, everything is ready to write the <code>runParser</code> function!</p>
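<p>If one of these exports were missing, the error would only show up later,
when the function is called. A small guard can make the failure immediate. This
is a sketch only: <code>pickExports</code> is a hypothetical helper that the
original code does not have.</p>
<pre class="giallo z-code"><code data-lang="javascript">// Hypothetical helper: build the `Module` object and fail early if an
// expected export is absent from the WebAssembly instance.
function pickExports(instance, names = ['alloc', 'dealloc', 'root', 'memory']) {
    const Module = {};

    for (const name of names) {
        if (!(name in instance.exports)) {
            throw new Error(`Missing WebAssembly export: ${name}`);
        }

        Module[name] = instance.exports[name];
    }

    return Module;
}
</code></pre>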
<h3 id="-7">The parser runner<a role="presentation" class="anchor" href="#-7" title="Anchor link to this header">#</a>
</h3>
<p>As a reminder, this function has to write the <code>input</code> (the blog post to
parse) in the WebAssembly module memory (<code>Module.memory</code>), call the
<code>root</code> function (<code>Module.root</code>), and read the result from the
WebAssembly module memory. Let's do that:</p>
<pre class="giallo z-code"><code data-lang="javascript"><span class="giallo-l"><span class="z-storage z-type z-function">function</span><span class="z-entity z-name z-function"> runParser</span><span>(</span><span class="z-variable z-parameter">Module</span><span class="z-punctuation z-separator">,</span><span class="z-variable z-parameter"> raw_input</span><span>) {</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> input</span><span class="z-keyword z-operator"> = new</span><span class="z-entity z-name z-function"> TextEncoder</span><span>()</span><span class="z-punctuation z-accessor">.</span><span class="z-entity z-name z-function">encode</span><span>(</span><span class="z-variable">raw_input</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> input_pointer</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name z-function"> writeBuffer</span><span>(</span><span class="z-variable">Module</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> input</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> output_pointer</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> Module</span><span class="z-punctuation z-accessor">.</span><span class="z-entity z-name z-function">root</span><span>(</span><span class="z-variable">input_pointer</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> input</span><span class="z-punctuation z-accessor">.</span><span>length)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> result</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name z-function"> readNodes</span><span>(</span><span class="z-variable">Module</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> output_pointer</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> Module</span><span class="z-punctuation z-accessor">.</span><span class="z-entity z-name z-function">dealloc</span><span>(</span><span class="z-variable">input_pointer</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> input</span><span class="z-punctuation z-accessor">.</span><span>length)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable"> result</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>In detail:</p>
<ul>
<li>The <code>raw_input</code> is encoded into a sequence of bytes with <a rel="noopener external" target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/API/TextEncoder">the
<code>TextEncoder</code> API</a>,
in <code>input</code>,</li>
<li>The input is written into the WebAssembly module memory with
<code>writeBuffer</code> and its pointer is returned,</li>
<li>Then the <code>root</code> function is called with the pointer to the input and
the length of the input as expected, and the pointer to the output is
returned,</li>
<li>Then the output is decoded,</li>
<li>And finally, the input is deallocated. The output of the parser will
be deallocated in the <code>readNodes</code> function because its length is
unknown at this step.</li>
</ul>
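<p>One detail of the steps above: if <code>root</code> or <code>readNodes</code>
throws, the input is never deallocated. A possible variation, shown here as a
sketch, wraps the call in <code>try</code>/<code>finally</code>. The
<code>runParserSafely</code> name is hypothetical, and <code>writeBuffer</code>
and <code>readNodes</code> are passed as arguments only to keep the example
self-contained.</p>
<pre class="giallo z-code"><code data-lang="javascript">// Hypothetical variation of `runParser`: the input is always deallocated,
// even when parsing or decoding throws.
function runParserSafely(Module, raw_input, writeBuffer, readNodes) {
    const input = new TextEncoder().encode(raw_input);
    const input_pointer = writeBuffer(Module, input);

    try {
        const output_pointer = Module.root(input_pointer, input.length);

        return readNodes(Module, output_pointer);
    } finally {
        // Runs whether the `try` block returned or threw.
        Module.dealloc(input_pointer, input.length);
    }
}
</code></pre>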
<p>Great! So we have 2 functions to write right now: <code>writeBuffer</code> and
<code>readNodes</code>.</p>
<h3 id="-8">Writing the data in memory<a role="presentation" class="anchor" href="#-8" title="Anchor link to this header">#</a>
</h3>
<p>Let's go with the first one, <code>writeBuffer</code>:</p>
<pre class="giallo z-code"><code data-lang="javascript"><span class="giallo-l"><span class="z-storage z-type z-function">function</span><span class="z-entity z-name z-function"> writeBuffer</span><span>(</span><span class="z-variable z-parameter">Module</span><span class="z-punctuation z-separator">,</span><span class="z-variable z-parameter"> buffer</span><span>) {</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> buffer_length</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> buffer</span><span class="z-punctuation z-accessor">.</span><span>length</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> pointer</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> Module</span><span class="z-punctuation z-accessor">.</span><span class="z-entity z-name z-function">alloc</span><span>(</span><span class="z-variable">buffer_length</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> memory</span><span class="z-keyword z-operator"> = new</span><span class="z-entity z-name z-function"> Uint8Array</span><span>(</span><span class="z-variable">Module</span><span class="z-punctuation z-accessor">.</span><span class="z-variable">memory</span><span class="z-punctuation z-accessor">.</span><span class="z-variable">buffer</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> for</span><span> (</span><span class="z-storage">let</span><span class="z-variable"> i</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-numeric"> 0</span><span class="z-punctuation z-terminator">;</span><span class="z-variable"> i</span><span class="z-keyword z-operator"> <</span><span class="z-variable"> buffer_length</span><span class="z-punctuation z-terminator">;</span><span class="z-keyword z-operator"> ++</span><span class="z-variable">i</span><span>) {</span></span>
<span class="giallo-l"><span class="z-variable"> memory</span><span>[</span><span class="z-variable">pointer</span><span class="z-keyword z-operator"> +</span><span class="z-variable"> i</span><span>]</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> buffer</span><span>[</span><span class="z-variable">i</span><span>]</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable"> pointer</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>In detail:</p>
<ul>
<li>The length of the buffer is read in <code>buffer_length</code>,</li>
<li>A space in memory is allocated to write the buffer,</li>
<li>Then a <code>uint8</code> view of the module memory is instantiated, which means
that the memory will be viewed as a sequence of <code>u8</code>, exactly what Rust
expects,</li>
<li>Finally, the buffer is copied into the memory with a very basic loop,
and the pointer is returned.</li>
</ul>
<p>Note that, unlike C strings, adding a <code>NUL</code> byte at the end is not
mandatory. This is just the raw data (on the Rust side, we read it with
<code>slice::from_raw_parts</code>; a slice is a very simple structure).</p>
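<p>As a side note, the copy loop can also be expressed with
<code>Uint8Array.prototype.set</code>, which copies a whole typed array at a given
offset. The sketch below is only an illustration: the <code>FakeModule</code> object,
its naive <code>alloc</code>, and the name <code>writeBufferWithSet</code> are hypothetical
stand-ins for the real WebAssembly exports, so that the snippet can run on its
own.</p>

```javascript
// Hypothetical stand-in for the instantiated WebAssembly module: one page
// of linear memory (64 KiB) and a naive bump allocator. The real `alloc`
// is exported by the Rust code.
const FakeModule = {
    memory: new WebAssembly.Memory({ initial: 1 }),
    next_free: 8,
    alloc(length) {
        const pointer = this.next_free;
        this.next_free += length;
        return pointer;
    },
};

function writeBufferWithSet(Module, buffer) {
    const pointer = Module.alloc(buffer.length);
    const memory = new Uint8Array(Module.memory.buffer);

    // Copy all the bytes of `buffer` into `memory`, starting at `pointer`.
    memory.set(buffer, pointer);

    return pointer;
}

const input = new TextEncoder().encode('<p>phrase</p>');
const pointer = writeBufferWithSet(FakeModule, input);

// Read the bytes back from the memory to check the copy.
const copy = new Uint8Array(FakeModule.memory.buffer, pointer, input.length);
console.log(new TextDecoder().decode(copy));
```

<p>Both approaches write the same bytes; <code>set</code> simply delegates the loop
to the Javascript engine.</p>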
<h3 id="-9">Reading the output of the parser<a role="presentation" class="anchor" href="#-9" title="Anchor link to this header">#</a>
</h3>
<p>At this step, the input has been written in memory and the <code>root</code>
function has been called, which means the parser has run. It has returned
a pointer to the output (the result), and we now have to read it and
decode it.</p>
<p>Recall that the first 4 bytes encode the number of nodes we have to
read. Let's go!</p>
<pre class="giallo z-code"><code data-lang="javascript"><span class="giallo-l"><span class="z-storage z-type z-function">function</span><span class="z-entity z-name z-function"> readNodes</span><span>(</span><span class="z-variable z-parameter">Module</span><span class="z-punctuation z-separator">,</span><span class="z-variable z-parameter"> start_pointer</span><span>) {</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> buffer</span><span class="z-keyword z-operator"> = new</span><span class="z-entity z-name z-function"> Uint8Array</span><span>(</span><span class="z-variable">Module</span><span class="z-punctuation z-accessor">.</span><span class="z-variable">memory</span><span class="z-punctuation z-accessor">.</span><span class="z-variable">buffer</span><span class="z-punctuation z-accessor">.</span><span class="z-entity z-name z-function">slice</span><span>(</span><span class="z-variable">start_pointer</span><span>))</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> number_of_nodes</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name z-function"> u8s_to_u32</span><span>(</span><span class="z-variable">buffer</span><span>[</span><span class="z-constant z-numeric">0</span><span>]</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> buffer</span><span>[</span><span class="z-constant z-numeric">1</span><span>]</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> buffer</span><span>[</span><span class="z-constant z-numeric">2</span><span>]</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> buffer</span><span>[</span><span class="z-constant z-numeric">3</span><span>])</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> if</span><span> (</span><span class="z-constant z-numeric">0</span><span class="z-keyword z-operator"> >=</span><span class="z-variable"> number_of_nodes</span><span>) {</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-constant z-language"> null</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> nodes</span><span class="z-keyword z-operator"> =</span><span> []</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> offset</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-numeric"> 4</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> end_offset</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> for</span><span> (</span><span class="z-storage">let</span><span class="z-variable"> i</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-numeric"> 0</span><span class="z-punctuation z-terminator">;</span><span class="z-variable"> i</span><span class="z-keyword z-operator"> <</span><span class="z-variable"> number_of_nodes</span><span class="z-punctuation z-terminator">;</span><span class="z-keyword z-operator"> ++</span><span class="z-variable">i</span><span>) {</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> last_offset</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name z-function"> readNode</span><span>(</span><span class="z-variable">buffer</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> offset</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> nodes</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> offset</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> end_offset</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> last_offset</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> Module</span><span class="z-punctuation z-accessor">.</span><span class="z-entity z-name z-function">dealloc</span><span>(</span><span class="z-variable">start_pointer</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> start_pointer</span><span class="z-keyword z-operator"> +</span><span class="z-variable"> end_offset</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable"> nodes</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>In detail:</p>
<ul>
<li>A <code>uint8</code> view of the memory is instantiated, or more precisely, of a
slice of the memory starting at <code>start_pointer</code>,</li>
<li>The number of nodes is read, then all nodes are read,</li>
<li>And finally, the output of the parser is deallocated.</li>
</ul>
<p>For the record, here is the <code>u8s_to_u32</code> function, which is the exact
opposite of <code>u32_to_u8s</code>:</p>
<pre class="giallo z-code"><code data-lang="javascript"><span class="giallo-l"><span class="z-storage z-type z-function">function</span><span class="z-entity z-name z-function"> u8s_to_u32</span><span>(</span><span class="z-variable z-parameter">o</span><span class="z-punctuation z-separator">,</span><span class="z-variable z-parameter"> p</span><span class="z-punctuation z-separator">,</span><span class="z-variable z-parameter"> q</span><span class="z-punctuation z-separator">,</span><span class="z-variable z-parameter"> r</span><span>) {</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span> (</span><span class="z-variable">o</span><span class="z-keyword z-operator"> <<</span><span class="z-constant z-numeric"> 24</span><span>)</span><span class="z-keyword z-operator"> |</span><span> (</span><span class="z-variable">p</span><span class="z-keyword z-operator"> <<</span><span class="z-constant z-numeric"> 16</span><span>)</span><span class="z-keyword z-operator"> |</span><span> (</span><span class="z-variable">q</span><span class="z-keyword z-operator"> <<</span><span class="z-constant z-numeric"> 8</span><span>)</span><span class="z-keyword z-operator"> |</span><span class="z-variable"> r</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
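<p>One caveat is worth illustrating: Javascript bitwise operators work on
signed 32-bit integers, so <code>u8s_to_u32</code> as written returns a negative
number as soon as the first byte is greater than 127. This only matters for
values above 2<sup>31</sup> − 1, which is unlikely for a number of nodes.
Here is a minimal sketch of two ways to get an unsigned result, with
<code>>>> 0</code> or with a <code>DataView</code>. The name
<code>u8s_to_u32_unsigned</code> is made up for this illustration.</p>

```javascript
// Same as `u8s_to_u32`, copied from above.
function u8s_to_u32(o, p, q, r) {
    return (o << 24) | (p << 16) | (q << 8) | r;
}

// Hypothetical variant: `>>> 0` reinterprets the signed 32-bit result as
// an unsigned one.
function u8s_to_u32_unsigned(o, p, q, r) {
    return ((o << 24) | (p << 16) | (q << 8) | r) >>> 0;
}

const bytes = new Uint8Array([0xff, 0x00, 0x01, 0x02]);

const signed = u8s_to_u32(bytes[0], bytes[1], bytes[2], bytes[3]);
const unsigned = u8s_to_u32_unsigned(bytes[0], bytes[1], bytes[2], bytes[3]);

// `DataView.getUint32` reads 4 bytes as an unsigned integer; `false`
// selects big-endian, i.e. the most significant byte comes first.
const from_view = new DataView(bytes.buffer).getUint32(0, false);

console.log(signed, unsigned, from_view);
// prints: -16776958 4278190338 4278190338
```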
<p>And I will also share the <code>readNode</code> function, but I won't explain the
details. This is just the decoding part of the output from the parser.</p>
<pre class="giallo z-code"><code data-lang="javascript"><span class="giallo-l"><span class="z-storage z-type z-function">function</span><span class="z-entity z-name z-function"> readNode</span><span>(</span><span class="z-variable z-parameter">buffer</span><span class="z-punctuation z-separator">,</span><span class="z-variable z-parameter"> offset</span><span class="z-punctuation z-separator">,</span><span class="z-variable z-parameter"> nodes</span><span>) {</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> node_type</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> buffer</span><span>[</span><span class="z-variable">offset</span><span>]</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Block.</span></span>
<span class="giallo-l"><span class="z-keyword"> if</span><span> (</span><span class="z-constant z-numeric">1</span><span class="z-keyword z-operator"> ===</span><span class="z-variable"> node_type</span><span>) {</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> name_length</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> buffer</span><span>[</span><span class="z-variable">offset</span><span class="z-keyword z-operator"> +</span><span class="z-constant z-numeric"> 1</span><span>]</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> attributes_length</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name z-function"> u8s_to_u32</span><span>(</span><span class="z-variable">buffer</span><span>[</span><span class="z-variable">offset</span><span class="z-keyword z-operator"> +</span><span class="z-constant z-numeric"> 2</span><span>]</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> buffer</span><span>[</span><span class="z-variable">offset</span><span class="z-keyword z-operator"> +</span><span class="z-constant z-numeric"> 3</span><span>]</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> buffer</span><span>[</span><span class="z-variable">offset</span><span class="z-keyword z-operator"> +</span><span class="z-constant z-numeric"> 4</span><span>]</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> buffer</span><span>[</span><span class="z-variable">offset</span><span class="z-keyword z-operator"> +</span><span class="z-constant z-numeric"> 5</span><span>])</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> number_of_children</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> buffer</span><span>[</span><span class="z-variable">offset</span><span class="z-keyword z-operator"> +</span><span class="z-constant z-numeric"> 6</span><span>]</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> payload_offset</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> offset</span><span class="z-keyword z-operator"> +</span><span class="z-constant z-numeric"> 7</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> next_payload_offset</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> payload_offset</span><span class="z-keyword z-operator"> +</span><span class="z-variable"> name_length</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> name</span><span class="z-keyword z-operator"> = new</span><span class="z-entity z-name z-function"> TextDecoder</span><span>()</span><span class="z-punctuation z-accessor">.</span><span class="z-entity z-name z-function">decode</span><span>(</span><span class="z-variable">buffer</span><span class="z-punctuation z-accessor">.</span><span class="z-entity z-name z-function">slice</span><span>(</span><span class="z-variable">payload_offset</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> next_payload_offset</span><span>))</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> payload_offset</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> next_payload_offset</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable"> next_payload_offset</span><span class="z-keyword z-operator"> +=</span><span class="z-variable"> attributes_length</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> attributes</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> JSON</span><span class="z-punctuation z-accessor">.</span><span class="z-entity z-name z-function">parse</span><span>(</span><span class="z-keyword z-operator">new</span><span class="z-entity z-name z-function"> TextDecoder</span><span>()</span><span class="z-punctuation z-accessor">.</span><span class="z-entity z-name z-function">decode</span><span>(</span><span class="z-variable">buffer</span><span class="z-punctuation z-accessor">.</span><span class="z-entity z-name z-function">slice</span><span>(</span><span class="z-variable">payload_offset</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> next_payload_offset</span><span>)))</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> payload_offset</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> next_payload_offset</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage"> let</span><span class="z-variable"> end_offset</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> payload_offset</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> children</span><span class="z-keyword z-operator"> =</span><span> []</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> for</span><span> (</span><span class="z-storage">let</span><span class="z-variable"> i</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-numeric"> 0</span><span class="z-punctuation z-terminator">;</span><span class="z-variable"> i</span><span class="z-keyword z-operator"> <</span><span class="z-variable"> number_of_children</span><span class="z-punctuation z-terminator">;</span><span class="z-keyword z-operator"> ++</span><span class="z-variable">i</span><span>) {</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> last_offset</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name z-function"> readNode</span><span>(</span><span class="z-variable">buffer</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> payload_offset</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> children</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> payload_offset</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> end_offset</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> last_offset</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> nodes</span><span class="z-punctuation z-accessor">.</span><span class="z-entity z-name z-function">push</span><span>(</span><span class="z-keyword z-operator">new</span><span class="z-entity z-name z-function"> Block</span><span>(</span><span class="z-variable">name</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> attributes</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> children</span><span>))</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable"> end_offset</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span class="z-comment"> // Phrase.</span></span>
<span class="giallo-l"><span class="z-keyword"> else if</span><span> (</span><span class="z-constant z-numeric">2</span><span class="z-keyword z-operator"> ===</span><span class="z-variable"> node_type</span><span>) {</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> phrase_length</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name z-function"> u8s_to_u32</span><span>(</span><span class="z-variable">buffer</span><span>[</span><span class="z-variable">offset</span><span class="z-keyword z-operator"> +</span><span class="z-constant z-numeric"> 1</span><span>]</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> buffer</span><span>[</span><span class="z-variable">offset</span><span class="z-keyword z-operator"> +</span><span class="z-constant z-numeric"> 2</span><span>]</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> buffer</span><span>[</span><span class="z-variable">offset</span><span class="z-keyword z-operator"> +</span><span class="z-constant z-numeric"> 3</span><span>]</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> buffer</span><span>[</span><span class="z-variable">offset</span><span class="z-keyword z-operator"> +</span><span class="z-constant z-numeric"> 4</span><span>])</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> phrase_offset</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> offset</span><span class="z-keyword z-operator"> +</span><span class="z-constant z-numeric"> 5</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-storage"> const</span><span class="z-variable"> phrase</span><span class="z-keyword z-operator"> = new</span><span class="z-entity z-name z-function"> TextDecoder</span><span>()</span><span class="z-punctuation z-accessor">.</span><span class="z-entity z-name z-function">decode</span><span>(</span><span class="z-variable">buffer</span><span class="z-punctuation z-accessor">.</span><span class="z-entity z-name z-function">slice</span><span>(</span><span class="z-variable">phrase_offset</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> phrase_offset</span><span class="z-keyword z-operator"> +</span><span class="z-variable"> phrase_length</span><span>))</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> nodes</span><span class="z-punctuation z-accessor">.</span><span class="z-entity z-name z-function">push</span><span>(</span><span class="z-keyword z-operator">new</span><span class="z-entity z-name z-function"> Phrase</span><span>(</span><span class="z-variable">phrase</span><span>))</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable"> phrase_offset</span><span class="z-keyword z-operator"> +</span><span class="z-variable"> phrase_length</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span><span class="z-keyword"> else</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> console</span><span class="z-punctuation z-accessor">.</span><span class="z-entity z-name z-function">error</span><span>(</span><span class="z-string">'unknown node type'</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> node_type</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Note that this code is pretty simple, and easy for the Javascript
virtual machine to optimise. It is also important to note that this is
not the original code. The original version is a little more optimised
here and there, but the two are very close.</p>
<p>And that's all! We have successfully read and decoded the output of the
parser! We just need to write the <code>Block</code> and <code>Phrase</code> classes like
this:</p>
<pre class="giallo z-code"><code data-lang="javascript"><span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> Block</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> constructor</span><span>(</span><span class="z-variable z-parameter">name</span><span class="z-punctuation z-separator">,</span><span class="z-variable z-parameter"> attributes</span><span class="z-punctuation z-separator">,</span><span class="z-variable z-parameter"> children</span><span>) {</span></span>
<span class="giallo-l"><span class="z-variable z-language"> this</span><span class="z-punctuation z-accessor">.</span><span class="z-variable">name</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> name</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable z-language"> this</span><span class="z-punctuation z-accessor">.</span><span class="z-variable">attributes</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> attributes</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable z-language"> this</span><span class="z-punctuation z-accessor">.</span><span class="z-variable">children</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> children</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> Phrase</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> constructor</span><span>(</span><span class="z-variable z-parameter">phrase</span><span>) {</span></span>
<span class="giallo-l"><span class="z-variable z-language"> this</span><span class="z-punctuation z-accessor">.</span><span class="z-variable">phrase</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> phrase</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>The final output will be an array of those objects. Easy!</p>
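<p>To make the binary layout more tangible, here is a small sketch that builds
an output buffer by hand and decodes it again. The layout is the one that can
be inferred from <code>readNode</code> above; the helpers <code>u32_bytes</code>,
<code>encodePhrase</code>, <code>encodeBlock</code> and <code>decodeNode</code> are
hypothetical (the real encoding happens on the Rust side), and the decoder
returns plain objects instead of <code>Block</code> and <code>Phrase</code> instances to
stay short.</p>

```javascript
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Big-endian encoding of a `u32`, the counterpart of `u8s_to_u32`.
function u32_bytes(n) {
    return [(n >>> 24) & 0xff, (n >>> 16) & 0xff, (n >>> 8) & 0xff, n & 0xff];
}

// Phrase: type (2), length (u32), then the phrase itself.
function encodePhrase(phrase) {
    const payload = encoder.encode(phrase);
    return [2, ...u32_bytes(payload.length), ...payload];
}

// Block: type (1), name length (u8), attributes length (u32), number of
// children (u8), then the name, the attributes as JSON, and the children.
function encodeBlock(name, attributes, children) {
    const name_bytes = encoder.encode(name);
    const attributes_bytes = encoder.encode(JSON.stringify(attributes));
    return [
        1,
        name_bytes.length,
        ...u32_bytes(attributes_bytes.length),
        children.length,
        ...name_bytes,
        ...attributes_bytes,
        ...children.flat(),
    ];
}

function u32_at(buffer, offset) {
    return new DataView(buffer.buffer, buffer.byteOffset).getUint32(offset, false);
}

// Compact decoder mirroring `readNode`; returns `[node, next_offset]`.
function decodeNode(buffer, offset) {
    if (buffer[offset] === 2) {
        const end = offset + 5 + u32_at(buffer, offset + 1);
        return [{ phrase: decoder.decode(buffer.subarray(offset + 5, end)) }, end];
    }
    const name_end = offset + 7 + buffer[offset + 1];
    const attributes_end = name_end + u32_at(buffer, offset + 2);
    const node = {
        name: decoder.decode(buffer.subarray(offset + 7, name_end)),
        attributes: JSON.parse(decoder.decode(buffer.subarray(name_end, attributes_end))),
        children: [],
    };
    let next = attributes_end;
    for (let i = 0; i < buffer[offset + 6]; ++i) {
        const [child, end] = decodeNode(buffer, next);
        node.children.push(child);
        next = end;
    }
    return [node, next];
}

// The block name and attributes below are arbitrary sample values.
const encoded = Uint8Array.from(
    encodeBlock('ns/block-name', { level: 2 }, [encodePhrase('<p>phrase</p>')])
);
const [block, end] = decodeNode(encoded, 0);
console.log(JSON.stringify(block), end === encoded.length);
```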
<h2 id="-10">WebAssembly 🚀 NodeJS<a role="presentation" class="anchor" href="#-10" title="Anchor link to this header">#</a>
</h2>
<figure role="presentation">
<p><img src="https://mnt.io/series/from-rust-to-beyond/the-webassembly-galaxy/./wasm-to-nodejs.png" alt="Wasm to NodeJS" loading="lazy" decoding="async" /></p>
</figure>
<p>The differences between the Javascript version and the NodeJS version
are few:</p>
<ul>
<li>The <code>Fetch</code> API does not exist in NodeJS, so the WebAssembly binary
has to be instantiated with a buffer directly, like this:
<code>WebAssembly.instantiate(fs.readFileSync(url), {})</code>,</li>
<li>The <code>TextEncoder</code> and <code>TextDecoder</code> objects do not exist as global
objects, they are in <code>util.TextEncoder</code> and <code>util.TextDecoder</code>.</li>
</ul>
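<p>A minimal sketch of how both differences could be handled is shown below.
It is an illustration under assumptions, not the code of the project:
<code>EMPTY_MODULE</code> is the smallest valid WebAssembly binary (magic number
plus version) and stands in for <code>fs.readFileSync(url)</code>, and the names
<code>Encoder</code> and <code>Decoder</code> are made up. The lookup falls back to
<code>util</code> only when no global object exists.</p>

```javascript
// Use the global objects when the environment provides them (browsers, and
// NodeJS versions that expose them globally), and fall back to the `util`
// module otherwise.
const Encoder = typeof TextEncoder !== 'undefined' ? TextEncoder : require('util').TextEncoder;
const Decoder = typeof TextDecoder !== 'undefined' ? TextDecoder : require('util').TextDecoder;

// Smallest valid WebAssembly binary: `\0asm` followed by version 1. With
// a real binary, those bytes would come from `fs.readFileSync(url)`.
const EMPTY_MODULE = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

// `WebAssembly.instantiate` accepts the bytes directly, no `fetch` needed.
const instantiated = WebAssembly.instantiate(EMPTY_MODULE, {});

instantiated.then(({ instance }) => {
    // The empty module exports nothing.
    console.log(Object.keys(instance.exports).length);
});

// Encoding then decoding gives the original string back.
const round_trip = new Decoder().decode(new Encoder().encode('phrase'));
```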
<p>In order to share the code between both environments, it is possible to
write the boundary layer (the Javascript code we wrote) in a <code>.mjs</code>
file, aka an ECMAScript Module. This allows writing something like
<code>import { Gutenberg_Post_Parser } from './gutenberg_post_parser.mjs'</code>,
for example (assuming the whole code we wrote before is wrapped in a class). On
the browser side, the script must be loaded
with <code><script type="module" src="…" /></code>, and on the NodeJS side, <code>node</code>
must run with the <code>--experimental-modules</code> flag. To understand the whole
story behind this, I recommend the talk <a rel="noopener external" target="_blank" href="https://www.youtube.com/watch?v=35ZMoH8T-gc&index=4&list=PLOkMRkzDhWGX_4YWI4ZYGbwFPqKnDRudf&t=0s"><em>Please wait… loading: a tale of two loaders</em> by Myles
Borins</a>
at JSConf EU 2018.</p>
<p><a rel="noopener external" target="_blank" href="https://github.com/Hywan/gutenberg-parser-rs/blob/master/bindings/wasm/bin/index.mjs">The entire code lands
here</a>.</p>
<h2 id="-11">Conclusion<a role="presentation" class="anchor" href="#-11" title="Anchor link to this header">#</a>
</h2>
<p>We have seen in detail how to write a real world parser in Rust, how to
compile it into a WebAssembly binary, and how to use it with Javascript
and with NodeJS.</p>
<p>The parser can be used in a browser with regular Javascript code, or as
a CLI with NodeJS, or on any platform NodeJS supports.</p>
<p>The Rust part for WebAssembly plus the Javascript part totals 313 lines
of code. This is a tiny surface of code to review and to maintain
compared to writing a Javascript parser from scratch.</p>
<p>Another argument is safety and performance. Rust is memory safe, we
know that. It is also performant, but is it still true for the
WebAssembly target? The following table shows the benchmark results of
the current Javascript parser for the Gutenberg project (implemented with
<a rel="noopener external" target="_blank" href="https://pegjs.org/">PEG.js</a>), against this project: The Rust parser as
a WebAssembly binary.</p>
<figure>
<table><thead><tr><th>Document</th><th>Javascript parser (ms)</th><th>Rust parser as a WebAssembly binary (ms)</th><th>speedup</th></tr></thead><tbody>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/demo-post.html"><code>demo-post.html</code></a></td><td>13.167</td><td>0.252</td><td>× 52</td></tr>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/shortcode-shortcomings.html"><code>shortcode-shortcomings.html</code></a></td><td>26.784</td><td>0.271</td><td>× 98</td></tr>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/redesigning-chrome-desktop.html"><code>redesigning-chrome-desktop.html</code></a></td><td>75.500</td><td>0.918</td><td>× 82</td></tr>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/web-at-maximum-fps.html"><code>web-at-maximum-fps.html</code></a></td><td>88.118</td><td>0.901</td><td>× 98</td></tr>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/early-adopting-the-future.html"><code>early-adopting-the-future.html</code></a></td><td>201.011</td><td>3.329</td><td>× 60</td></tr>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/pygmalian-raw-html.html"><code>pygmalian-raw-html.html</code></a></td><td>311.416</td><td>2.692</td><td>× 116</td></tr>
<tr><td><a rel="noopener external" target="_blank" href="https://raw.githubusercontent.com/dmsnell/gutenberg-document-library/master/library/moby-dick-parsed.html"><code>moby-dick-parsed.html</code></a></td><td>2,466.533</td><td>25.14</td><td>× 98</td></tr>
</tbody></table>
<figcaption>
<p>Benchmarks between Javascript parser and Rust parser as a WebAssembly binary.</p>
</figcaption>
</figure>
<p>The WebAssembly binary is on average 86 times faster than the current
Javascript implementation. The median of the speedup is 98. Some edge
cases are very interesting, like <code>moby-dick-parsed.html</code>, which takes
2.5s with the Javascript parser against 25ms with WebAssembly.</p>
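<p>The average and the median can be recomputed from the table. The sketch
below simply divides the two timing columns; the array name
<code>benchmarks</code> and the rounding to the nearest integer are choices made
for this illustration.</p>

```javascript
// [document, Javascript parser (ms), WebAssembly binary (ms)], copied from
// the table above.
const benchmarks = [
    ['demo-post.html', 13.167, 0.252],
    ['shortcode-shortcomings.html', 26.784, 0.271],
    ['redesigning-chrome-desktop.html', 75.5, 0.918],
    ['web-at-maximum-fps.html', 88.118, 0.901],
    ['early-adopting-the-future.html', 201.011, 3.329],
    ['pygmalian-raw-html.html', 311.416, 2.692],
    ['moby-dick-parsed.html', 2466.533, 25.14],
];

// Speedup of each document: Javascript time divided by WebAssembly time.
const speedups = benchmarks.map(([, javascript, wasm]) => javascript / wasm);

const mean = speedups.reduce((sum, speedup) => sum + speedup, 0) / speedups.length;

// Median: middle value of the sorted speedups (or the mean of the two
// middle values when there is an even number of them).
const sorted = [...speedups].sort((a, b) => a - b);
const middle = Math.floor(sorted.length / 2);
const median = sorted.length % 2 === 1
    ? sorted[middle]
    : (sorted[middle - 1] + sorted[middle]) / 2;

console.log(Math.round(mean), Math.round(median));
// prints: 86 98
```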
<p>So not only is it safer, but it is also faster than Javascript in this case.
And it is only about 300 lines of code.</p>
<p>Note that WebAssembly does not support SIMD yet: It is still <a rel="noopener external" target="_blank" href="https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md">a
proposal</a>.
Rust is gradually adding support for it (<a rel="noopener external" target="_blank" href="https://github.com/rust-lang-nursery/stdsimd/pull/549">example with PR
#549</a>). It will
dramatically improve performance!</p>
<p>We will see in the next episodes of this series that Rust can reach a
lot of galaxies, and the more it travels, the more it gets interesting.</p>
<p>Thanks for reading!</p>
Prelude2018-08-21T00:00:00+00:002018-08-21T00:00:00+00:00
Unknown
https://mnt.io/series/from-rust-to-beyond/prelude/<p><a rel="noopener external" target="_blank" href="https://automattic.com/">At my work</a>, I had an opportunity to start an
experiment: Writing a single parser implementation in Rust for <a rel="noopener external" target="_blank" href="https://github.com/WordPress/gutenberg">the new
Gutenberg post format</a>, bound to
many platforms and environments.</p>
<figure>
<p><img src="https://mnt.io/series/from-rust-to-beyond/prelude/./gutenberg.png" alt="Gutenberg's logo" loading="lazy" decoding="async" /></p>
<figcaption>
<p>Gutenberg's logo.</p>
</figcaption>
</figure>
<p>This series of posts is about those bindings, and explains how to send
Rust beyond earth, into many different galaxies.</p>
<h2 id="">The Gutenberg post format<a role="presentation" class="anchor" href="#" title="Anchor link to this header">#</a>
</h2>
<p>Let's quickly introduce what Gutenberg is, and why it needs a new post
format. If you want an in-depth presentation, I highly recommend reading <a rel="noopener external" target="_blank" href="https://lamda.blog/2018/04/22/the-language-of-gutenberg/">The
Language of
Gutenberg</a>.
Note that this is <em>not</em> required for the reader to understand the
Gutenberg post format.</p>
<p><a rel="noopener external" target="_blank" href="https://github.com/WordPress/gutenberg">Gutenberg</a> is the next
WordPress editor. It is a small revolution in its own right. The features it
unlocks are very powerful.</p>
<blockquote>
<p>The editor will create a new page- and post-building experience that
makes writing rich posts effortless, and has “blocks” to make it easy
what today might take shortcodes, custom HTML, or “mystery meat” embed
discovery. — Matt Mullenweg</p>
</blockquote>
<p>The format of a blog post was HTML, and it continues to be. However,
another semantic layer is added through annotations. Annotations are
written in comments and borrow the XML syntax, e.g.:</p>
<pre class="giallo z-code"><code data-lang="xml"><span class="giallo-l"><span class="z-comment"><!-- wp:ns/block-name {"attributes": "as JSON"} --></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">p</span><span class="z-punctuation z-definition z-tag">></span><span>phrase</span><span class="z-punctuation z-definition z-tag"></</span><span class="z-entity z-name z-tag">p</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-comment"><!-- /wp:ns/block-name --></span></span></code></pre>
<p>The Gutenberg format provides 2 constructions: Block, and Phrase. The
example above contains both: There is a block wrapping a phrase. A
phrase is basically anything that is not a block. Let's describe the
example:</p>
<ul>
<li>It starts with an annotation (<code><!-- … --></code>),</li>
<li>The <code>wp:</code> is mandatory to represent a Gutenberg block,</li>
<li>It is followed by a fully qualified block name, which is a pair of an optional
namespace (here set to <code>ns</code>, defaults to <code>core</code>) and a block name (here
set to <code>block-name</code>), separated by a slash,</li>
<li>A block has optional attributes encoded as a JSON object (see
<a rel="noopener external" target="_blank" href="https://tools.ietf.org/html/rfc7159">RFC 7159, Section 4, Objects</a>),</li>
<li>Finally, a block has optional children, i.e. a heterogeneous collection of
blocks or phrases. In the example above, there is one child that is the phrase
<code><p>phrase</p></code>. The following example shows a block with no child:</li>
</ul>
<pre class="giallo z-code"><code data-lang="xml"><span class="giallo-l"><span class="z-comment"><!-- wp:ns/block-name {"attributes": "as JSON"} /--></span></span></code></pre>
<p>The complete grammar can be found in <a rel="noopener external" target="_blank" href="https://hywan.github.io/gutenberg-parser-rs/gutenberg_post_parser/parser/index.html">the parser's
documentation</a>.</p>
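<p>To make the naming rule above concrete, here is a minimal sketch of how a
fully qualified block name could be split into its namespace and its name,
with the namespace falling back to <code>core</code>. The helpers
<code>split_block_name</code> and <code>opening_block_name</code> are hypothetical names
used for illustration only. They are not part of the actual parser, which relies on nom and follows the
complete grammar; in particular, this sketch assumes the name is followed by a whitespace.</p>

```rust
/// Splits a fully qualified block name into `(namespace, name)`.
/// The namespace is optional and falls back to `core`.
/// (Hypothetical helper, not taken from the real parser.)
fn split_block_name(qualified: &str) -> (&str, &str) {
    match qualified.split_once('/') {
        Some((namespace, name)) => (namespace, name),
        None => ("core", qualified),
    }
}

/// Extracts the block name from an opening annotation such as
/// `<!-- wp:ns/block-name {…} -->`. Returns `None` when the text
/// is not an opening block annotation (e.g. a phrase or a closing one).
/// Assumption: the name ends at the first whitespace.
fn opening_block_name(annotation: &str) -> Option<(&str, &str)> {
    // An annotation is written inside a comment.
    let inner = annotation.trim().strip_prefix("<!--")?.trim_start();
    // The `wp:` prefix is mandatory to represent a Gutenberg block.
    let rest = inner.strip_prefix("wp:")?;
    let qualified = rest.split_whitespace().next()?;

    Some(split_block_name(qualified))
}
```

<p>With these helpers, <code>wp:ns/block-name</code> yields the pair
<code>("ns", "block-name")</code>, while <code>wp:foo</code> yields
<code>("core", "foo")</code>.</p>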
<p>Finally, the parser is used on the <em>editor</em> side, not on the <em>rendering</em>
side. Once rendered, the blog post is a regular HTML file. Some blocks
are dynamic though, but this is another topic.</p>
<figure>
<p><img src="https://mnt.io/series/from-rust-to-beyond/prelude/./block-logic-flow.png" alt="Block logic flow" loading="lazy" decoding="async" /></p>
<figcaption>
<p>The logic flow of the editor (<a rel="noopener external" target="_blank" href="https://make.wordpress.org/core/2017/05/05/editor-how-little-blocks-work/">How Little Blocks
Work</a>).</p>
</figcaption>
</figure>
<p>The grammar is relatively small. The challenge, however, is to be as
fast and memory efficient as possible on many platforms. Some
posts can reach megabytes, and we don't want the parser to be the
bottleneck. Even if it is used only when creating the post state (cf. the
schema above), we have measured several seconds to load some posts, time
during which the user is blocked and waits, or sees an error. In other
scenarios, we have hit the memory limit of the language's virtual machines.</p>
<p>Hence this experimental project! The current parsers are written in
JavaScript (with <a rel="noopener external" target="_blank" href="https://pegjs.org/">PEG.js</a>) and in PHP (with
<a rel="noopener external" target="_blank" href="https://github.com/nylen/phpegjs"><code>phpegjs</code></a>). This Rust project
proposes a parser written in Rust, that can run in the JavaScript and in
the PHP virtual machines, and on many other platforms. Let's try to be
very performant and memory efficient!</p>
<h2 id="-1">Why Rust?<a role="presentation" class="anchor" href="#-1" title="Anchor link to this header">#</a>
</h2>
<p>That's an excellent question! Thanks for asking. I can summarize my
choice with a bullet list:</p>
<ul>
<li>It is fast, and we need speed,</li>
<li>It is memory safe, and also memory efficient,</li>
<li>No garbage collector, which simplifies memory management across
environments,</li>
<li>It can expose a C API (<a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/ffi/index.html">with Foreign Function Interface,
FFI</a>), which eases the
integration into multiple environments,</li>
<li>It compiles to <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/nightly/rustc/platform-support.html">many
targets</a>,</li>
<li>Because I love it.</li>
</ul>
<p>One of the goals of the experiment is to maintain a single
implementation (maybe the future reference implementation) with multiple
bindings.</p>
<h2 id="-2">The parser<a role="presentation" class="anchor" href="#-2" title="Anchor link to this header">#</a>
</h2>
<p>The parser is written in Rust. It relies on the fabulous <a rel="noopener external" target="_blank" href="https://github.com/Geal/nom/">nom
library</a>.</p>
<figure>
<p><img src="https://mnt.io/series/from-rust-to-beyond/prelude/./nom.png" alt="nom" loading="lazy" decoding="async" /></p>
<figcaption>
<p><em>nom will happily take a byte out of your files</em> 🙂</p>
</figcaption>
</figure>
<p>The source code is available in <a rel="noopener external" target="_blank" href="https://github.com/Hywan/gutenberg-parser-rs">the <code>src/</code> directory in the
repository</a>. It is very
small and fun to read.</p>
<p>The parser produces an Abstract Syntax Tree (AST) of the grammar, where
nodes of the tree are defined as:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">pub</span><span class="z-storage"> enum</span><span class="z-entity z-name"> Node</span><span><'</span><span class="z-entity z-name">a</span><span>> {</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Block</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> name</span><span class="z-keyword z-operator">:</span><span> (</span><span class="z-entity z-name">Input</span><span><'</span><span class="z-entity z-name">a</span><span>>,</span><span class="z-entity z-name"> Input</span><span><'</span><span class="z-entity z-name">a</span><span>>),</span></span>
<span class="giallo-l"><span class="z-variable"> attributes</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Option</span><span><</span><span class="z-entity z-name">Input</span><span><'</span><span class="z-entity z-name">a</span><span>>>,</span></span>
<span class="giallo-l"><span class="z-variable"> children</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Vec</span><span><</span><span class="z-entity z-name">Node</span><span><'</span><span class="z-entity z-name">a</span><span>>></span></span>
<span class="giallo-l"><span> },</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> Phrase</span><span>(</span><span class="z-entity z-name">Input</span><span><'</span><span class="z-entity z-name">a</span><span>>)</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>That's all! We find the block name, the attributes, and the
children again, as well as the phrase. Block children are defined as a collection of
nodes, which makes the definition recursive. <code>Input<'a></code> is defined as <code>&'a [u8]</code>, i.e. a
slice of bytes.</p>
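<p>Since the definition is recursive, walking the tree is naturally recursive
too. The following sketch restates the <code>Node</code> enum so that it is
self-contained, and adds a hypothetical <code>count_blocks</code> helper (it does
not exist in the parser) that counts every block of a tree, nested ones
included:</p>

```rust
/// A slice of bytes borrowed from the parsed input.
pub type Input<'a> = &'a [u8];

/// Same shape as the AST node shown above.
#[derive(Debug, PartialEq)]
pub enum Node<'a> {
    Block {
        name: (Input<'a>, Input<'a>),
        attributes: Option<Input<'a>>,
        children: Vec<Node<'a>>,
    },
    Phrase(Input<'a>),
}

/// Counts the blocks of a tree, nested blocks included.
/// (Hypothetical helper, for illustration only.)
pub fn count_blocks(nodes: &[Node<'_>]) -> usize {
    nodes
        .iter()
        .map(|node| match node {
            // A block counts for one, plus the blocks among its children.
            Node::Block { children, .. } => 1 + count_blocks(children),
            // A phrase is a leaf, and it is not a block.
            Node::Phrase(_) => 0,
        })
        .sum()
}
```

<p>Note that the nodes only borrow slices of the input: walking the tree
does not copy any of the post content.</p>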
<p>The main parser entry is <a rel="noopener external" target="_blank" href="https://hywan.github.io/gutenberg-parser-rs/gutenberg_post_parser/fn.root.html">the <code>root</code>
function</a>.
It represents the axiom of the grammar, and is defined as:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">pub fn</span><span class="z-entity z-name z-function"> root</span><span>(</span></span>
<span class="giallo-l"><span class="z-variable"> input</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Input</span></span>
<span class="giallo-l"><span>)</span><span class="z-keyword z-operator"> -></span><span class="z-entity z-name"> Result</span><span><(</span><span class="z-entity z-name">Input</span><span>,</span><span class="z-entity z-name"> Vec</span><span><</span><span class="z-variable">ast</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Node</span><span>>),</span><span class="z-variable"> nom</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Err</span><span><</span><span class="z-entity z-name">Input</span><span>>>;</span></span></code></pre>
<p>So the parser returns a collection of nodes in the best case. Here is a
simple example:</p>
<pre class="giallo z-code"><code data-lang="rust"><span class="giallo-l"><span class="z-keyword">use</span><span class="z-entity z-name"> gutenberg_post_parser</span><span class="z-keyword z-operator">::</span><span>{root,</span><span class="z-entity z-name"> ast</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Node</span><span>};</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">let</span><span class="z-variable"> input</span><span class="z-keyword z-operator"> = &</span><span class="z-string">b"<!-- wp:foo {</span><span class="z-constant z-character">\"</span><span class="z-string">bar</span><span class="z-constant z-character">\"</span><span class="z-string">: true} /-->"</span><span>[</span><span class="z-keyword z-operator">..</span><span>];</span></span>
<span class="giallo-l"><span class="z-storage">let</span><span class="z-variable"> output</span><span class="z-keyword z-operator"> =</span><span class="z-entity z-name"> Ok</span><span>(</span></span>
<span class="giallo-l"><span> (</span></span>
<span class="giallo-l"><span class="z-comment"> // The remaining data.</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> &</span><span class="z-string">b""</span><span>[</span><span class="z-keyword z-operator">..</span><span>],</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // The Abstract Syntax Tree.</span></span>
<span class="giallo-l"><span class="z-entity z-name z-function"> vec!</span><span>[</span></span>
<span class="giallo-l"><span class="z-entity z-name"> Node</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name">Block</span><span> {</span></span>
<span class="giallo-l"><span class="z-variable"> name</span><span class="z-keyword z-operator">:</span><span> (</span><span class="z-keyword z-operator">&</span><span class="z-string">b"core"</span><span>[</span><span class="z-keyword z-operator">..</span><span>],</span><span class="z-keyword z-operator"> &</span><span class="z-string">b"foo"</span><span>[</span><span class="z-keyword z-operator">..</span><span>]),</span></span>
<span class="giallo-l"><span class="z-variable"> attributes</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name"> Some</span><span>(</span><span class="z-keyword z-operator">&</span><span class="z-string">b"{</span><span class="z-constant z-character">\"</span><span class="z-string">bar</span><span class="z-constant z-character">\"</span><span class="z-string">: true}"</span><span>[</span><span class="z-keyword z-operator">..</span><span>]),</span></span>
<span class="giallo-l"><span class="z-variable"> children</span><span class="z-keyword z-operator">:</span><span class="z-entity z-name z-function"> vec!</span><span>[]</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> ]</span></span>
<span class="giallo-l"><span> )</span></span>
<span class="giallo-l"><span>);</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-entity z-name z-function">assert_eq!</span><span>(</span><span class="z-entity z-name z-function">root</span><span>(</span><span class="z-variable">input</span><span>),</span><span class="z-variable"> output</span><span>);</span></span></code></pre>
<p>The <code>root</code> function and the AST will be the items we are going to use
and manipulate in the bindings. The internal items of the parser will
stay private.</p>
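<p>To give an idea of what manipulating the AST can look like from the caller's
side, here is a small sketch that turns a node into an owned string. It is only
an assumption about what a consumer might do: the <code>describe</code>
function is hypothetical, and the <code>Node</code> enum is restated so that
the example is self-contained. The byte slices are decoded with
<code>String::from_utf8_lossy</code>, because nothing in the type guarantees
the input is valid UTF-8.</p>

```rust
/// A slice of bytes borrowed from the parsed input.
pub type Input<'a> = &'a [u8];

/// Same shape as the AST node shown above.
pub enum Node<'a> {
    Block {
        name: (Input<'a>, Input<'a>),
        attributes: Option<Input<'a>>,
        children: Vec<Node<'a>>,
    },
    Phrase(Input<'a>),
}

/// Renders a short, owned description of a node.
/// (Hypothetical helper, for illustration only.)
pub fn describe(node: &Node<'_>) -> String {
    match node {
        Node::Block {
            name: (namespace, name),
            attributes,
            children,
        } => format!(
            "block {}/{} (attributes: {}, children: {})",
            // Invalid UTF-8 sequences are replaced rather than rejected.
            String::from_utf8_lossy(namespace),
            String::from_utf8_lossy(name),
            attributes.is_some(),
            children.len(),
        ),
        Node::Phrase(bytes) => format!("phrase {}", String::from_utf8_lossy(bytes)),
    }
}
```

<p>Copying the data into a <code>String</code> is a deliberate choice in this
sketch: the result no longer depends on the lifetime of the input.</p>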
<h2 id="-3">Bindings<a role="presentation" class="anchor" href="#-3" title="Anchor link to this header">#</a>
</h2>
<figure role="presentation">
<p><img src="https://mnt.io/series/from-rust-to-beyond/prelude/./rust-to.png" alt="Rust to" loading="lazy" decoding="async" /></p>
</figure>
<p>From now on, our goal is to expose the <code>root</code> function and the <code>Node</code> enum
to different platforms or environments. Ready?</p>
<p>3… 2… 1… lift-off!</p>
How Automattic (WordPress.com & co.) partly moved away from PHPUnit to atoum?2018-02-26T00:00:00+00:002018-02-26T00:00:00+00:00
Unknown
https://mnt.io/articles/how-automattic-partly-moved-away-from-phpunit-to-atoum/<p>Hello fellow developers and testers,</p>
<p>A few months ago at <a rel="noopener external" target="_blank" href="https://automattic.com/">Automattic</a>, my team and I
started a new project: <strong>Having better tests for the payment system</strong>.
The payment system is used by all the services at Automattic, i.e.
<a rel="noopener external" target="_blank" href="https://wordpress.com/">WordPress</a>,
<a rel="noopener external" target="_blank" href="https://vaultpress.com/">VaultPress</a>, <a rel="noopener external" target="_blank" href="https://jetpack.com/">Jetpack</a>,
<a rel="noopener external" target="_blank" href="http://akismet.com/">Akismet</a>, <a rel="noopener external" target="_blank" href="http://polldaddy.com/">PollDaddy</a> etc.
It's a big challenge! Cherry on the cake: Our experiment could define
the future of the testing practices for the entire company. No pressure.</p>
<p>This post is a summary of what has been accomplished so far: the
achievements, the failures, and the future, focused around manual tests.
As the title of this post suggests, we are going to talk about
<a rel="noopener external" target="_blank" href="https://phpunit.de/">PHPUnit</a> and <a rel="noopener external" target="_blank" href="http://atoum.org/">atoum</a>, which are
two PHP test frameworks. This is not a PHPUnit vs. atoum fight. These
are observations made for our software, in our context, with our
requirements, and our expectations. I think the discussion can be useful
for many projects outside Automattic. I would like to apologize in
advance if some parts sound too abstract; I hope you understand I can't
reveal any details about the payment system for obvious reasons.</p>
<h2 id="where-we-were-and-where-to-go">Where we were, and where to go<a role="presentation" class="anchor" href="#where-we-were-and-where-to-go" title="Anchor link to this header">#</a>
</h2>
<p>For historical reasons, WordPress, VaultPress, Jetpack & siblings use
<a rel="noopener external" target="_blank" href="https://phpunit.de/">PHPUnit</a> for server-side manual tests. There are
unit, integration, and system manual tests. There are also end-to-end
tests or benchmarks, but we are not interested in them now. When those
products were built, PHPUnit was the main test framework in town. Since
then, the test landscape has considerably changed in PHP. New
competitors, like <a rel="noopener external" target="_blank" href="http://atoum.org/">atoum</a> or
<a rel="noopener external" target="_blank" href="http://behat.org/">Behat</a>, have a good position in the game.</p>
<p>Those tests have existed for many years. Some of them grew organically. PHPUnit
does not require any form of structure, which is (although questionable,
in my opinion) a reason for its success. It is a
requirement that the code does not need to be well-designed to be
tested, <em>but</em> too much freedom on the test side comes with a cost in the
long term if not enough attention is paid to it.</p>
<p><strong>Our situation is the following</strong>. The code is complex for justified
reasons, and the <em>testability</em> is sometimes lessened. Testing across
many services is indubitably difficult. Some parts of the code are
really old, mixed with others that are new, shiny, and well-done. In
this context, it is really difficult to change something, especially
moving to another test framework. The amount of work it represents is
colossal. No new test framework is worth the price of such a huge
refactoring. But maybe the new test frameworks can help us to better
test our code?</p>
<p>I'm a <a rel="noopener external" target="_blank" href="https://github.com/atoum/atoum/graphs/contributors">long-term contributor to
atoum</a> (top 3
contributors). And at the time of writing, I'm a core member. You have
to believe me when I say that, at each step of the discussions or the
processes, I have been neutral, arguing both for and against atoum. The
idea to switch to atoum partly came from me actually, but my knowledge
about atoum is definitely a plus. I am in a good position to know the
pros and the cons of the tool, and I'm perfectly aware of how it could
solve issues we have.</p>
<p>So after many debates and discussions, we decided to <em>try</em> to move to
atoum. A survey and a meeting were scheduled 2 months later to decide
whether we should continue or not. Spoiler: We will partly continue with
it.</p>
<h2 id="our-needs-and-requirements">Our needs and requirements<a role="presentation" class="anchor" href="#our-needs-and-requirements" title="Anchor link to this header">#</a>
</h2>
<p>Our code is difficult to test. In other words, the testability is low
for some parts of the code. atoum has features to help increase the
testability. I will try to summarize those features in the following
short sections.</p>
<h3 id="atoum-phpunit-extension"><code>atoum/phpunit-extension</code><a role="presentation" class="anchor" href="#atoum-phpunit-extension" title="Anchor link to this header">#</a>
</h3>
<p>As I said, it's not possible to rewrite/migrate all the existing tests.
This is a colossal effort with a non-negligible cost. Enter
<a rel="noopener external" target="_blank" href="https://github.com/atoum/phpunit-extension"><code>atoum/phpunit-extension</code></a>.</p>
<p>As far as I know, atoum is the only PHP framework that is able to run
tests that have been written for another framework. The
<code>atoum/phpunit-extension</code> does exactly that. It runs tests written with
the PHPUnit API with the atoum engines. This is <em>fabulous</em>! PHPUnit is
not required at all. With this extension, we have been able to run our
“legacy” (aka PHPUnit) tests with atoum. The following scenarios can be
fulfilled:</p>
<ul>
<li>Existing test suites written with the PHPUnit API can be run seamlessly by
atoum, no need to rewrite them,</li>
<li>Of course, new test suites are written with the atoum API,</li>
<li>In case of a test suite migration from PHPUnit to atoum, there are two
solutions:
<ol>
<li>Rewrite the test suite entirely from scratch by logically using the atoum
API, or</li>
<li>Only change the parent class from <code>PHPUnit\Framework\TestCase</code> to
<code>atoum\phpunit\test</code>, and suddenly it is possible to use both APIs at the
same time (and thus migrate one test case after the other for instance).</li>
</ol>
</li>
</ul>
<p>This is a very valuable tool for an adventure like ours.</p>
<p><code>atoum/phpunit-extension</code> is not perfect though. Some PHPUnit APIs are
missing. And while the test verdict is strictly the same, error messages
can be different, some PHPUnit extensions may not work properly etc.
Fortunately, our usage of PHPUnit is pretty raw: No extensions except
home-made ones, few hacks… Everything went well. We also have been able
to contribute easily to the extension.</p>
<h3 id="mock-engines-plural">Mock engines (plural)<a role="presentation" class="anchor" href="#mock-engines-plural" title="Anchor link to this header">#</a>
</h3>
<p>atoum comes with <a rel="noopener external" target="_blank" href="http://docs.atoum.org/en/latest/mocking_systems.html">3 mock
engines</a>:</p>
<ul>
<li>Class-like mock engine for classes and interfaces,</li>
<li>Function mock engine,</li>
<li>Constant mock engine.</li>
</ul>
<p>Being able to mock global functions or global constants is an important
feature for us. It suddenly increases the testability of our code! The
following example is fictional, but it's a good illustration. WordPress
is full of global functions, but it is possible to mock them with atoum
like this:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> test_foo</span><span>() {</span></span>
<span class="giallo-l"><span class="z-variable z-language"> $this</span><span class="z-keyword z-operator">-></span><span class="z-variable">function</span><span class="z-keyword z-operator">-></span><span class="z-variable">get_userdata</span><span class="z-keyword z-operator"> =</span><span> (</span><span class="z-storage">object</span><span>)</span><span class="z-punctuation z-section"> [</span></span>
<span class="giallo-l"><span class="z-string"> 'user_login'</span><span class="z-keyword z-operator"> =></span><span class="z-constant z-other"> …</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-string"> 'user_pass'</span><span class="z-keyword z-operator"> =></span><span class="z-constant z-other"> …</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-constant z-other"> …</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> ]</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>In one line of code, it was possible to mock the
<a rel="noopener external" target="_blank" href="https://codex.wordpress.org/Function_Reference/get_userdata"><code>get_userdata</code></a>
function.</p>
<h3 id="runner-engines">Runner engines<a role="presentation" class="anchor" href="#runner-engines" title="Anchor link to this header">#</a>
</h3>
<p>Being able to isolate test execution is a necessity to avoid flaky
tests, and to increase the trust we put in the test verdicts. atoum
comes with <em>de facto</em> 3 runner engines:</p>
<ul>
<li><em>Inline</em>, one test case after another in the same process,</li>
<li><em>Isolate</em>, one test case after another but each time in a new process (full
isolation),</li>
<li><em>Concurrent</em>, like <em>isolate</em> but tests run concurrently (“at the same
time”).</li>
</ul>
<p>I'm not saying PHPUnit doesn't have those features. It is possible to
run tests in a different process each time —with the <em>isolate</em> engine—,
but test execution time blows up, and the isolation is not strict. We
don't use it. The <em>concurrent</em> runner engine in atoum tends to reduce
the execution time to be close to the <em>inline</em> engine, while still
ensuring a strict isolation.</p>
<p>Fun fact: By using atoum and the <code>atoum/phpunit-extension</code>, we are able
to run PHPUnit tests concurrently with a strict isolation!</p>
<h3 id="code-coverage-reports">Code coverage reports<a role="presentation" class="anchor" href="#code-coverage-reports" title="Anchor link to this header">#</a>
</h3>
<p>At the time of writing, PHPUnit is not able to generate code coverage
reports containing the Branch- or Path Coverage Criteria data. atoum
supports them natively with the
<a rel="noopener external" target="_blank" href="https://github.com/atoum/reports-extension"><code>atoum/reports-extension</code></a>
(including nice graphs, see <a rel="noopener external" target="_blank" href="http://atoum.org/reports-extension/">the
demonstration</a>). And we need those
data.</p>
<h2 id="the-difficulties">The difficulties<a role="presentation" class="anchor" href="#the-difficulties" title="Anchor link to this header">#</a>
</h2>
<p>On paper, most of the pain points sound addressable. It was time to
experiment.</p>
<h3 id="integration-to-the-continuous-integration-server">Integration to the Continuous Integration server<a role="presentation" class="anchor" href="#integration-to-the-continuous-integration-server" title="Anchor link to this header">#</a>
</h3>
<p>Our CI does not natively support standard test execution report formats.
Thus we had to create the
<a rel="noopener external" target="_blank" href="https://github.com/Hywan/atoum-teamcity-extension/"><code>atoum/teamcity-extension</code></a>.
<a href="https://mnt.io/articles/atoum-supports-teamcity/">Learn more</a> by
reading a blog post I wrote recently. The TeamCity support is native
inside PHPUnit (see the <a rel="noopener external" target="_blank" href="http://phpunit.readthedocs.io/en/latest/textui.html?highlight=--log-teamcity"><code>--log-teamcity</code>
option</a>).</p>
<h3 id="bootstrap-test-environments">Bootstrap test environments<a role="presentation" class="anchor" href="#bootstrap-test-environments" title="Anchor link to this header">#</a>
</h3>
<p>Our bootstrap files are… challenging. It's expected though. Setting up a
functional test environment for software like WordPress.com is not a
task one can accomplish in 2 minutes. Fortunately, we have been able to
re-use most of the PHPUnit parts.</p>
<p>Today, our unit tests run in complete isolation and concurrently. Our
integration tests, and system tests run in complete isolation but not
concurrently, due to MySQL limitations. We have solutions, but time
needs to be invested.</p>
<p>Generally, even if it works now, it took time to re-organize the
bootstrap so that some parts can be shared between the test runners
(because we didn't switch the whole company to atoum yet, it was an
experiment).</p>
<h3 id="documentation-and-help">Documentation and help<a role="presentation" class="anchor" href="#documentation-and-help" title="Anchor link to this header">#</a>
</h3>
<p>Here is an interesting paradox. The majority of the team recognized that
atoum's documentation is better than PHPUnit's, even if some parts must
be rewritten or reworked. <em>But</em> developers already know PHPUnit, so they
don't look at the documentation. If they have to, they will instead find
their answers on StackOverflow, or by talking to someone else in the
company, but not by checking the official documentation. atoum does not
have many StackOverflow threads, and few people are atoum users within
the company.</p>
<p>What we have also observed is that when people create a new test, it's a
copy-paste from an existing one. Let's admit this is a common and
natural practice. When a difficulty is met, it's legitimate to look
elsewhere in the test repository to check if a similar situation
has been resolved. In our context, that information was a bit lacking.
We tried to write more and more tests, but not fast enough. It should
not be an issue if you have time to try, but in our context, we
unfortunately didn't have this time. The team faced many challenges in
the same period, and the tests we are building are not simple <em>Hello,
World!</em>s as you might think, so it increases the effort.</p>
<p>To be honest, this was not the biggest difficulty, but still, it is
worth noting.</p>
<h3 id="concurrent-integration-test-executions">Concurrent integration test executions<a role="presentation" class="anchor" href="#concurrent-integration-test-executions" title="Anchor link to this header">#</a>
</h3>
<p>Due to some MySQL limitations combined with the complexity of our code,
we are not able to run integration (and system) tests concurrently yet.
Therefore it takes time to run them, probably too much in our
development environments. Even if atoum has friendly options to reduce
the debug loop (e.g. see <a rel="noopener external" target="_blank" href="http://docs.atoum.org/en/latest/mode-loop.html">the <code>--loop</code>
option</a>), the execution
is still slow. The problem can be solved but it requires time, and deep
modifications of our code.</p>
<p>Note that with our PHPUnit tests, no isolation is used. This is wrong.
And thus we have a smaller trust in the test verdict than with atoum.
Almost everyone in the team prefers to have slow test execution but
isolation, rather than fast test execution but no confidence in the test
verdict. So that's partly a difficulty. It's a mix of a positive feature
and a thorn in the side, though a thorn we can live with. atoum is not
responsible for this latency: The state of our code is.</p>
<h2 id="the-results">The results<a role="presentation" class="anchor" href="#the-results" title="Anchor link to this header">#</a>
</h2>
<p>First, let's start with the positive impacts:</p>
<ul>
<li>In 2 months, we have observed that the testability of our code has been
increased by using atoum,</li>
<li>We have been able to find bugs in our code that were not detected by PHPUnit,
mostly because atoum checks the type of the data,</li>
<li>We have been able to migrate “legacy tests” (aka PHPUnit tests) to atoum by
just moving the files from one directory to another: What a smooth migration!</li>
<li>The <em>trust</em> we put in our test verdict has increased thanks to a strict test
execution isolation.</li>
</ul>
<p>Now, the negative impacts:</p>
<ul>
<li>Even if the testability has been increased, it's not enough. Right now, we are
looking at refactoring our code. Introducing atoum right now was probably too
early. Let's refactor first, then use a better test toolchain later, once
things are cleaner,</li>
<li>Moving the whole company at once is hard. There are thousands of manual tests.
The <code>atoum/phpunit-extension</code> is not magical. We have to come up with more solid
results, stuff to blow minds. It is necessary to set the institutional inertia
in motion. For instance, not being able to run integration and system tests
concurrently slows down the builds on the CI; it increases the trust we put in
the test verdict, but this latency is not acceptable at the company scale,</li>
<li>All the issues we faced can be addressed, but it needs time. The experiment
time frame was 2 months. We need 1 or 2 more months to solve the majority of
the remaining issues. Note that I was kind of in charge of this project, but
not full time.</li>
</ul>
<p>We stopped using atoum for <em>manual tests</em>. It's likely to be a pause
though. The experiment has shown we need to refactor and clean our code,
then there will be a good chance for atoum to come back. The experiment
has also shown how to increase the testability of our code: Not
everything can be addressed by using another test framework, even if it
helps a lot. We can focus on those points specifically, because
we know where they are now. Finally, I reckon it has helped
move the test infrastructure inside Automattic forward by showing that
something else exists, and that we can go further.</p>
<p>I said we stopped using atoum “for manual tests”. Yes. Because we also
have <em>automatically generated tests</em>. The experiment was not only about
switching to atoum. Many other aspects of the experiment are still
running! For instance, <a rel="noopener external" target="_blank" href="https://github.com/hoaproject/Kitab">Kitab</a> is
used for our code documentation. Kitab is able to (i) <em>render</em> the
documentation, and (ii) <em>test</em> the examples written inside the
documentation. That way the documentation is ensured to be always
up-to-date and working. Kitab generates tests for- and executes tests
with atoum. It was easy to set up: We just had to use the existing test
bootstraps designed for atoum. We also have another tool to <a rel="noopener external" target="_blank" href="https://github.com/Hywan/atoum-apiblueprint-extension">compile
HTTP API Blueprint specifications into executable
tests</a>. So far,
everyone is happy with those tools, no need to go back, everything is
automat(t)ic. Other tools are likely to be introduced in the future to
automatically generate tests. I want to detail this particular topic in
another blog post.</p>
<h2 id="conclusion">Conclusion<a role="presentation" class="anchor" href="#conclusion" title="Anchor link to this header">#</a>
</h2>
<p>Moving to another test framework is a huge decision with many factors.
The fact atoum has <code>atoum/phpunit-extension</code> is a time saver.
Nonetheless a new test framework does not mean it will fix all the
testability issues of the code. The benefits of the new test framework
must largely overtake the costs. In our current context, it was not the
case. <em>atoum solves issues that are not our priorities</em>. So yes, atoum
can help us to solve important issues, but since these issues are not
priorities, then the move to atoum was too early. During the project, we
gained new automatic test tools, like
<a rel="noopener external" target="_blank" href="https://github.com/hoaproject/Kitab">Kitab</a>. The experiment is not a
failure. Will we retry atoum? It's very likely. When? I hope in a year.</p>
One conference per day, for one year (2017)2018-01-25T00:00:00+00:002018-01-25T00:00:00+00:00
Unknown
https://mnt.io/articles/one-conference-per-day-for-one-year-2017/<p>My self-assigned challenge for 2017 was to watch at least one conference
per day, for one year. This is the first time I have tried this challenge. Let's
dive in for a recap.</p>
<h2 id="267-conferences">267 conferences<a role="presentation" class="anchor" href="#267-conferences" title="Anchor link to this header">#</a>
</h2>
<p>In some way, I failed the challenge because I've been able to watch only
267 conferences. With an average of 34 minutes per conference, I've
watched 9078 minutes, or 151 hours of <em>freely available</em> conferences
online. Why did I fail to watch 365 of them? Because my first kid was
1.5 years old in January 2017, a new little lady arrived in December 2017, I
<a href="https://mnt.io/articles/bye-bye-liip-hello-automattic/">got a new
job</a>, I
<a href="https://mnt.io/articles/automattic-grand-meetup-2017/">travelled for my
job</a>, I <a rel="noopener external" target="_blank" href="https://www.youtube.com/watch?v=Ymy8qAEe0kQ">gave
talks</a>, I maintain
important open source projects requiring a lot of time, I'm building my
own self-sufficient ecological house, the vegetable garden requires many
hours, I watch other videos, and because I'm lazy sometimes. Most of the
time, I was able to watch 2 or 3 conferences in a row.</p>
<h2 id="where-to-find-the-resources">Where to find the resources?<a role="presentation" class="anchor" href="#where-to-find-the-resources" title="Anchor link to this header">#</a>
</h2>
<p>All these conferences are freely available online, on YouTube, or on
Vimeo, for most of them. The channels I mostly watch are the following:</p>
<ul>
<li><a rel="noopener external" target="_blank" href="https://www.youtube.com/channel/UCaYhcUwRBNscFNUKTjgPFiA">Rust</a>,</li>
<li><a rel="noopener external" target="_blank" href="https://www.youtube.com/user/Confreaks">Confreaks</a>,</li>
<li><a rel="noopener external" target="_blank" href="https://www.youtube.com/channel/UCCBVCTuk6uJrN3iFV_3vurg">Devoxx</a>,</li>
<li><a rel="noopener external" target="_blank" href="https://www.youtube.com/channel/UCOpGiN9AkczVjlpGDaBwQrQ">elm-conf</a>,</li>
<li><a rel="noopener external" target="_blank" href="https://www.youtube.com/user/BoostCon">BoostCon</a>,</li>
<li><a rel="noopener external" target="_blank" href="https://www.youtube.com/user/jsconfeu">JSConf</a>,</li>
<li><a rel="noopener external" target="_blank" href="https://www.youtube.com/channel/UCv2_41bSAa5Y_8BacJUZfjQ">LLVM</a>,</li>
<li><a rel="noopener external" target="_blank" href="https://www.youtube.com/user/CppCon">CppCon</a>,</li>
<li><a rel="noopener external" target="_blank" href="https://www.youtube.com/channel/UC_QIfHvN9auy2CoOdSfMWDw">Strange
Loop</a>.</li>
</ul>
<p>It's very Computer Science centric as you might have noticed, and it
targets Rust, C++, Elm, LLVM, or Web technologies (JS, CSS…), but not
only: you can find Haskell or Clojure sometimes.</p>
<h2 id="my-best-of-list">My best-of list<a role="presentation" class="anchor" href="#my-best-of-list" title="Anchor link to this header">#</a>
</h2>
<p>In March 2017, more and more people were questioning me and asking me to
share what I watched. I then decided to start a <a rel="noopener external" target="_blank" href="https://www.youtube.com/playlist?list=PLOkMRkzDhWGX_4YWI4ZYGbwFPqKnDRudf">playlist of my “best-of”
conferences</a>.
I've added 78 conferences in 2017, and 3 new conferences have been added
since then.</p>
<figure>
<iframe class="youtube-player" width="560" height="315" src="https://www.youtube-nocookie.com/embed/videoseries?si=9Q6Qf-EOE4nyFgrn&list=PLOkMRkzDhWGX_4YWI4ZYGbwFPqKnDRudf" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
<figcaption>
<p>My best-of talks playlist.</p>
</figcaption>
</figure>
<h2 id="thoughts-and-conclusion">Thoughts and conclusion<a role="presentation" class="anchor" href="#thoughts-and-conclusion" title="Anchor link to this header">#</a>
</h2>
<p>The challenge was sometimes easy and relaxing, and sometimes it was very hard to
understand everything, especially at 2am after a long day (looking at you
CppCon). But it has been a very enjoyable way to learn a lot in a very
short period of time. Many speakers are talented, and listening to them
is a real pleasure. Some others are just… let's say unprepared, and it's
good to stop and jump onto another talk. It's also a good way to get
inspired by technologies you don't necessarily know (for instance, I'm
not a big fan of Clojure, but some projects are really inspiring, like
<a rel="noopener external" target="_blank" href="https://www.youtube.com/watch?v=buPPGxOnBnk&index=81&list=PLOkMRkzDhWGX_4YWI4ZYGbwFPqKnDRudf">Proto
REPL</a>).</p>
<p>Sometimes <a rel="noopener external" target="_blank" href="https://twitter.com/mnt_io">I tweeted</a> about the talk I
watched, and it was quite appreciated too. I reckon that is because it's a fun
and easy way to learn, especially with the help of video platforms
like YouTube.</p>
<p>Am I going to continue this challenge in 2018? Yes! But maybe not at
this frequency. It's now part of my routine to watch conferences many
times per week. I like it. I don't want to stop.</p>
<p>As a closing note, I would like to <em>thank</em> every speaker, and more
importantly, every conference organizer. You are doing an amazing job:
From the program, to the event, to the final sharing on the Internet with
everyone. Most of you are volunteers. I know the work it represents. You
are producing <em>extremely valuable resources</em>. Thank you!</p>
Random thoughts about `::class` in PHP2018-01-24T00:00:00+00:002018-01-24T00:00:00+00:00
Unknown
https://mnt.io/articles/random-thoughts-about-class-in-php/<blockquote>
<p>The special <strong><code>::class</code></strong> constant allows for fully qualified class
name resolution at compile, this is useful for namespaced classes.</p>
</blockquote>
<p>I'm quoting <a rel="noopener external" target="_blank" href="http://php.net/manual/en/language.oop5.constants.php">the PHP
manual</a>. But
things can be funny sometimes. Let's go through some examples.</p>
<ul>
<li><pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">use</span><span> A</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">B</span><span class="z-keyword"> as</span><span> C</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable">$_</span><span class="z-keyword z-operator"> =</span><span class="z-support z-class"> C</span><span class="z-keyword z-operator">::</span><span class="z-keyword">class</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>resolves to <code>A\B</code>, which is perfect 🙂</p>
</li>
<li><pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> C</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> f</span><span>() {</span></span>
<span class="giallo-l"><span class="z-variable"> $_</span><span class="z-keyword z-operator"> =</span><span class="z-storage"> self</span><span class="z-keyword z-operator">::</span><span class="z-keyword">class</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>resolves to <code>C</code>, which is perfect 😀</p>
</li>
<li><pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> C</span><span> {}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> D</span><span class="z-storage"> extends</span><span class="z-entity z-other z-inherited-class"> C</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> f</span><span>() {</span></span>
<span class="giallo-l"><span class="z-variable"> $_</span><span class="z-keyword z-operator"> =</span><span class="z-storage"> parent</span><span class="z-keyword z-operator">::</span><span class="z-keyword">class</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>resolves to <code>C</code>, which is perfect 😄</p>
</li>
<li><pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> C</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public static</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> f</span><span>() {</span></span>
<span class="giallo-l"><span class="z-variable"> $_</span><span class="z-keyword z-operator"> =</span><span class="z-storage"> static</span><span class="z-keyword z-operator">::</span><span class="z-keyword">class</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> D</span><span class="z-storage"> extends</span><span class="z-entity z-other z-inherited-class"> C</span><span> {}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-support z-class">D</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">f</span><span>()</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>resolves to <code>D</code>, which is perfect 😍</p>
</li>
<li><pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-string">'foo'</span><span class="z-keyword z-operator">::</span><span class="z-keyword">class</span></span></code></pre>
<p>resolves to <code>'foo'</code>, which is… huh? 🤨</p>
</li>
<li><pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-string">"foo"</span><span class="z-keyword z-operator">::</span><span class="z-keyword">class</span></span></code></pre>
<p>resolves to <code>'foo'</code>, which is… expected somehow 😕</p>
</li>
<li><pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable">$a</span><span class="z-keyword z-operator"> =</span><span class="z-string"> 'oo'</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-string">"f{</span><span class="z-variable">$a</span><span class="z-string">}"</span><span class="z-keyword z-operator">::</span><span class="z-keyword">class</span></span></code></pre>
<p>generates a parse error 🙃</p>
</li>
<li><pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-support z-class">PHP_VERSION</span><span class="z-keyword z-operator">::</span><span class="z-keyword">class</span></span></code></pre>
<p>resolves to <code>'PHP_VERSION'</code>, which is… strange: It resolves to the
fully qualified name of the constant, not the <em>class</em> 🤐</p>
</li>
</ul>
<p><code>::class</code> is very useful to get rid of the <code>get_class</code> or the
<code>get_called_class</code> functions, or even the <code>get_class($this)</code> trick. This
is something truly useful in PHP where entities are referenced as
strings, not as symbols. <code>::class</code> on constants makes sense, but the
name is no longer relevant. And finally, <code>::class</code> on single-quoted
strings is absolutely useless; on double-quoted strings it is a source
of error because the value can be dynamic (and remember, <code>::class</code> is
resolved at compile time, not at run time).</p>
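<p>To make the difference between the forms more concrete, here is a minimal
sketch. The names <code>Vendor\Missing\Thing</code>, <code>Alias</code>, <code>Base</code> and <code>Child</code> are
hypothetical and exist only for this illustration. It shows that <code>::class</code> on
an imported name is derived from the <code>use</code> statement alone, without checking
that the class exists, whereas <code>static::class</code> necessarily depends on the
class the call was made on:</p>

```php
<?php

// Hypothetical import: `Vendor\Missing\Thing` is never defined anywhere.
use Vendor\Missing\Thing as Alias;

class Base {
    public static function name(): string {
        // Late static binding: the result depends on the calling class.
        return static::class;
    }
}

class Child extends Base {}

// Resolved from the `use` statement alone: no autoloading, no existence check.
$resolved = Alias::class;

var_dump($resolved === 'Vendor\Missing\Thing');     // bool(true)
var_dump(class_exists($resolved, false));           // bool(false)
var_dump(Child::name());                            // string(5) "Child"
var_dump(Child::name() === get_class(new Child())); // bool(true)
```

<p>In other words, <code>::class</code> on a name is only string manipulation on names: it
is handy, but it says nothing about whether the class can actually be loaded.</p>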
Automattic, Grand Meetup 20172017-11-26T00:00:00+00:002017-11-26T00:00:00+00:00
Unknown
https://mnt.io/articles/automattic-grand-meetup-2017/<p>Awesome company, awesome teams, awesome people. Thanks everyone for this
moment!</p>
<figure>
<p><img src="https://mnt.io/articles/automattic-grand-meetup-2017/./gm.jpg" alt="All the people" loading="lazy" decoding="async" /></p>
<figcaption>
<p>All the people!</p>
</figcaption>
</figure>
atoum supports TeamCity2017-11-06T00:00:00+00:002017-11-06T00:00:00+00:00
Unknown
https://mnt.io/articles/atoum-supports-teamcity/<p><a rel="noopener external" target="_blank" href="http://atoum.org/">atoum</a> is a popular PHP test framework.
<a rel="noopener external" target="_blank" href="https://www.jetbrains.com/teamcity/">TeamCity</a> is a Continuous
Integration and Continuous Delivery software developed by JetBrains.
Although <a rel="noopener external" target="_blank" href="http://atoum.org/features.html#reports">atoum supports many industry
standards</a> to report test
execution verdicts, TeamCity uses <a rel="noopener external" target="_blank" href="https://confluence.jetbrains.com/display/TCD8/Build+Script+Interaction+with+TeamCity">its own non-standard
report</a>,
and thus atoum is not compatible with TeamCity… until now.</p>
<p>The <code>atoum/teamcity-extension</code> provides TeamCity support inside atoum.
When executing tests, the reported verdicts are understandable by
TeamCity, and activate all its UI features.</p>
<h2 id="install">Install<a role="presentation" class="anchor" href="#install" title="Anchor link to this header">#</a>
</h2>
<p>If you have Composer, just run:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> composer require atoum/teamcity-extension </span><span class="z-string">'~1.0'</span></span></code></pre>
<p>From this point, you need to enable the extension in your <code>.atoum.php</code>
configuration file. The following example enables the extension
for every test execution:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable">$extension</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> new</span><span> atoum</span><span class="z-punctuation z-separator">\</span><span>teamcity</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">extension</span><span>(</span><span class="z-variable">$script</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable">$extension</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">addToRunner</span><span>(</span><span class="z-variable">$runner</span><span>)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>The following example enables the extension <strong>only within</strong> a TeamCity
environment:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable">$extension</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> new</span><span> atoum</span><span class="z-punctuation z-separator">\</span><span>teamcity</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">extension</span><span>(</span><span class="z-variable">$script</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable">$extension</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">addToRunnerWithinTeamCityEnvironment</span><span>(</span><span class="z-variable">$runner</span><span>)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>This latter installation is recommended. That's it 🙂.</p>
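<p>For the curious, here is a minimal sketch of what “within a TeamCity
environment” could mean. It assumes the check relies on the
<code>TEAMCITY_VERSION</code> environment variable, which TeamCity sets for its build
processes. The <code>isInsideTeamCity</code> helper is hypothetical; it is not the
extension's actual code:</p>

```php
<?php

/**
 * Hypothetical helper, not part of `atoum/teamcity-extension`.
 * Assumption: a TeamCity build exposes the `TEAMCITY_VERSION` variable.
 */
function isInsideTeamCity(): bool {
    // `getenv` returns `false` when the variable is not set.
    return getenv('TEAMCITY_VERSION') !== false;
}

// Calling `putenv` with a name only removes the variable.
putenv('TEAMCITY_VERSION');
var_dump(isInsideTeamCity()); // bool(false)

// Emulate a TeamCity environment (the version number is arbitrary).
putenv('TEAMCITY_VERSION=2017.1');
var_dump(isInsideTeamCity()); // bool(true)
```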
<h2 id="glance">Glance<a role="presentation" class="anchor" href="#glance" title="Anchor link to this header">#</a>
</h2>
<p>The default CLI report looks like this:</p>
<figure>
<p><img src="https://mnt.io/articles/atoum-supports-teamcity/./cli.png" alt="Default atoum CLI report" loading="lazy" decoding="async" /></p>
<figcaption>
<p>The default CLI report, as produced by atoum out of the box.</p>
</figcaption>
</figure>
<p>The TeamCity report looks like this in your terminal (note the
<code>TEAMCITY_VERSION</code> variable as a way to emulate a TeamCity environment):</p>
<figure>
<p><img src="https://mnt.io/articles/atoum-supports-teamcity/./cli-teamcity.png" alt="TeamCity report inside the terminal" loading="lazy" decoding="async" /></p>
<figcaption>
<p>The TeamCity report is text-based, but it is aimed at being consumed by a
formatter to produce HTML.</p>
</figcaption>
</figure>
<p>This is harder to read. However, once it reaches the TeamCity UI, we
get the following result:</p>
<figure>
<p><img src="https://mnt.io/articles/atoum-supports-teamcity/./teamcity.png" alt="TeamCity running atoum" loading="lazy" decoding="async" /></p>
<figcaption>
<p>The final rendering, as an HTML document inside TeamCity itself.</p>
</figcaption>
</figure>
<p>We are using it at <a rel="noopener external" target="_blank" href="https://automattic.com/">Automattic</a>. Hope it is
useful for someone else!</p>
<p>If you find any bugs, or would like any other features, please use
GitHub at the following repository:
<a rel="noopener external" target="_blank" href="https://github.com/Hywan/atoum-teamcity-extension/">https://github.com/Hywan/atoum-teamcity-extension/</a>.</p>
Export functions in PHP à la Javascript2017-10-30T00:00:00+00:002017-10-30T00:00:00+00:00
Unknown
https://mnt.io/articles/export-functions-in-php-a-la-javascript/<p>Warning: This post is totally useless. It is the result of a fun private
company thread.</p>
<h2 id="export-functions-in-javascript">Export functions in Javascript<a role="presentation" class="anchor" href="#export-functions-in-javascript" title="Anchor link to this header">#</a>
</h2>
<p>In Javascript, a file can export functions like this:</p>
<pre class="giallo z-code"><code data-lang="javascript"><span class="giallo-l"><span class="z-keyword">export</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> times2</span><span>(</span><span class="z-variable z-parameter">x</span><span>) {</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable"> x</span><span class="z-keyword z-operator"> *</span><span class="z-constant z-numeric"> 2</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>And then we can import this function in another file like this:</p>
<pre class="giallo z-code"><code data-lang="javascript"><span class="giallo-l"><span class="z-keyword">import</span><span> {</span><span class="z-variable">times2</span><span>}</span><span class="z-keyword"> from</span><span class="z-string"> 'foo'</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable">console</span><span class="z-punctuation z-accessor">.</span><span class="z-entity z-name z-function">log</span><span>(</span><span class="z-entity z-name z-function">times2</span><span>(</span><span class="z-constant z-numeric">21</span><span>))</span><span class="z-punctuation z-terminator">;</span><span class="z-comment"> // 42</span></span></code></pre>
<p>Is it possible with PHP?</p>
<h2 id="export-functions-in-php">Export functions in PHP<a role="presentation" class="anchor" href="#export-functions-in-php" title="Anchor link to this header">#</a>
</h2>
<p>Every entity is public in PHP: Constant, function, class, interface, or
trait. They can live in a namespace. So exporting functions in PHP is
absolutely useless, but just for fun, let's keep going.</p>
<p>A PHP file can return an integer, a real, an array, an anonymous
function, anything. Let's try this:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">return</span><span class="z-storage z-type z-function"> function</span><span> (</span><span class="z-keyword">int</span><span class="z-variable"> $x</span><span>)</span><span class="z-keyword z-operator">:</span><span class="z-keyword"> int</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable"> $x</span><span class="z-keyword z-operator"> *</span><span class="z-constant z-numeric"> 2</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>And then in another file:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable">$times2</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> require</span><span class="z-string"> 'foo.php'</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">var_dump</span><span>(</span><span class="z-variable">$times2</span><span>(</span><span class="z-constant z-numeric">21</span><span>))</span><span class="z-punctuation z-terminator">;</span><span class="z-comment"> // int(42)</span></span></code></pre>
<p>Great, it works.</p>
<p>What if our file returns more than one function? Let's use an array
(which has most hashmap properties):</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">return</span><span class="z-punctuation z-section"> [</span></span>
<span class="giallo-l"><span class="z-string"> 'times2'</span><span class="z-keyword z-operator"> =></span><span class="z-storage z-type z-function"> function</span><span> (</span><span class="z-keyword">int</span><span class="z-variable"> $x</span><span>)</span><span class="z-keyword z-operator">:</span><span class="z-keyword"> int</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable"> $x</span><span class="z-keyword z-operator"> *</span><span class="z-constant z-numeric"> 2</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-string"> 'answer'</span><span class="z-keyword z-operator"> =></span><span class="z-storage z-type z-function"> function</span><span> ()</span><span class="z-keyword z-operator">:</span><span class="z-keyword"> int</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-constant z-numeric"> 42</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">]</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>To choose what to import, let's use <a rel="noopener external" target="_blank" href="https://github.com/php/php-langspec/blob/master/spec/10-expressions.md#list-intrinsic">the <code>list</code>
intrinsic</a>.
It has several forms: With or without key matching, long (<code>list(…)</code>) and
short syntax (<code>[…]</code>). Because we are modern, we will use the short
syntax with key matching to selectively import functions:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-section">[</span><span class="z-string">'times2'</span><span class="z-keyword z-operator"> =></span><span class="z-variable"> $mul</span><span class="z-punctuation z-section">]</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> require</span><span class="z-string"> 'foo.php'</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-support z-function">var_dump</span><span>(</span><span class="z-variable">$mul</span><span>(</span><span class="z-constant z-numeric">21</span><span>))</span><span class="z-punctuation z-terminator">;</span><span class="z-comment"> // int(42)</span></span></code></pre>
<p>Notice that <code>times2</code> has been aliased to <code>$mul</code>. What a feature!</p>
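<p>For completeness, here is a self-contained sketch that puts the pieces
together. It writes a throwaway module to a temporary file (an assumption made
only so the example runs on its own), imports two functions at once with the
short syntax, then imports one with the long <code>list(…)</code> syntax. Key matching
requires PHP 7.1 or later:</p>

```php
<?php

// Write a throwaway module to a temporary file so the example runs on its own.
$module = tempnam(sys_get_temp_dir(), 'mod');
file_put_contents($module, <<<'PHP'
<?php

return [
    'times2' => function (int $x): int {
        return $x * 2;
    },
    'answer' => function (): int {
        return 42;
    },
];
PHP
);

// Short syntax with key matching: import two functions, aliasing the first.
['times2' => $double, 'answer' => $answer] = require $module;

// Long syntax with key matching: same semantics, older look.
list('times2' => $mul) = require $module;

var_dump($double(21) === $answer()); // bool(true)
var_dump($mul(4));                   // int(8)

// Clean up the temporary module.
unlink($module);
```

<p>A key that is not listed on the left-hand side is simply not imported, which
is the closest we get to a selective <code>import</code>.</p>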
<p>Is it useful? Absolutely not. Is it fun? For me it is.</p>
Finite-State Machine as a Type System illustrated with a store product2017-08-09T00:00:00+00:002017-08-09T00:00:00+00:00
Unknown
https://mnt.io/articles/finite-state-machine-as-a-type-system-illustrated-with-a-store-product/<p>Hello fellow coders!</p>
<p>In this article, I would like to talk about how to implement a
<a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Finite-state_machine">Finite-State
Machine</a> (FSM) with
the PHP type system. The example is a store product (in an e-commerce
solution for instance), something we are likely to meet at least once in our
lifetime. Our goal is to simply <strong>avoid impossible states and
transitions</strong>.</p>
<p>I am in deep love with <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Type_theory">Type
theory</a>, however I will try
to keep the formulas away from this article to focus on the code.
Moreover, you might be aware that the PHP <em>runtime</em> type system is
quite permissive and “poor” (this is not a formal definition);
fortunately, some tricks can help us express nice constraints.</p>
<h2 id="the-product-fsm">The Product FSM<a role="presentation" class="anchor" href="#the-product-fsm" title="Anchor link to this header">#</a>
</h2>
<p>A product in a store might have the following states:</p>
<ul>
<li>Active: Can be purchased,</li>
<li>Inactive: Has been cancelled or discontinued (a discontinued product can no
longer be purchased),</li>
<li>Purchased and renewable,</li>
<li>Purchased and not renewable,</li>
<li>Purchased and cancellable.</li>
</ul>
<p>The transitions between these states can be viewed as a <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Finite-state_machine">Finite-State
Machine</a> (FSM).</p>
<figure>
<p><img src="https://mnt.io/articles/finite-state-machine-as-a-type-system-illustrated-with-a-store-product/./schema1.png" alt="First schema" loading="lazy" decoding="async" /></p>
<figcaption>
<p>We read this graph as: A product is in the state <code>A</code>. If the <code>purchase</code>
action is called, then it transitions to the state <code>B</code>. If the <code>once-off purchase</code> action is called, then it transitions to the state <code>C</code>. From the
state <code>B</code>, if the <code>renew</code> action is called, it remains in the same state. If
the <code>cancel</code> action is called, it transitions to the <code>D</code> state. Same for the
<code>C</code> to <code>D</code> states.</p>
</figcaption>
</figure>
<p>Our goal is to respect this FSM. Invalid actions must be impossible to
do.</p>
<h2 id="">Finite-State Machine as a Type System<a role="presentation" class="anchor" href="#" title="Anchor link to this header">#</a>
</h2>
<p>An FSM is a good way to define the states and the transitions
between them: It is formal and clear. However, it is checked at runtime,
not at compile-time, i.e. <code>if</code> statements are required to test whether the
state of a product can transition into another state, or else throw an
exception. Note that PHP does not really
have a compile-time because it is an online compiler (learn more by
reading <a rel="noopener external" target="_blank" href="https://speakerdeck.com/hywan/tagua-vm-a-safe-php-virtual-machine">Tagua VM, a safe PHP virtual
machine</a>,
at slide 29). Our goal is to prevent illegal/invalid states at
parse-/compile-time so that the PHP virtual machine, IDE or static
analysis tools can prove the state of a product without executing PHP
code.</p>
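<p>To make the contrast concrete, here is a minimal sketch of the runtime-checked
approach described above. The <code>RuntimeProduct</code> class, its constants and its
messages are hypothetical and only serve as an illustration; the once-off
purchase branch (state <code>C</code>) is left out for brevity.</p>
<pre class="giallo z-code"><code data-lang="php"><?php

declare(strict_types=1);

/**
 * Hypothetical runtime-checked product: the state is a plain string, and
 * every action must test it with an `if` before transitioning.
 */
final class RuntimeProduct {
    public const ACTIVE    = 'active';    // State A.
    public const RENEWABLE = 'renewable'; // State B.
    public const INACTIVE  = 'inactive';  // State D.

    private $state = self::ACTIVE;

    public function getState(): string {
        return $this->state;
    }

    public function purchase(): void {
        if (self::ACTIVE !== $this->state) {
            throw new LogicException('Only an active product can be purchased.');
        }

        $this->state = self::RENEWABLE;
    }

    public function renew(): void {
        if (self::RENEWABLE !== $this->state) {
            throw new LogicException('Only a purchased product can be renewed.');
        }

        // Reflexive transition: the state stays the same.
    }

    public function cancel(): void {
        if (self::RENEWABLE !== $this->state) {
            throw new LogicException('Only a purchased product can be cancelled.');
        }

        $this->state = self::INACTIVE;
    }
}

$product = new RuntimeProduct();
$product->purchase();
$product->cancel();

try {
    // Nothing prevents us from writing this call: it only fails once executed.
    $product->renew();
} catch (LogicException $e) {
    echo $e->getMessage(), "\n";
}</code></pre>
<p>Every method exists on every instance, whatever its state, so neither PHP nor
an IDE can tell that the last <code>renew</code> call is invalid without running the code.</p>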
<p>Why is this important? Imagine that we decide to make a product
once-off purchasable instead of purchasable: We can then no longer renew
it. We replace an interface on this product, and boom, the IDE tells us
that the code is broken in <em>x</em> places. It <strong>detects impossible scenarios
ahead of code execution</strong>.</p>
<p>No more talking. Here is the code.</p>
<h3 id="-1">The mighty product<a role="presentation" class="anchor" href="#-1" title="Anchor link to this header">#</a>
</h3>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">/**</span></span>
<span class="giallo-l"><span class="z-comment"> * A product.</span></span>
<span class="giallo-l"><span class="z-comment"> */</span></span>
<span class="giallo-l"><span class="z-storage">interface</span><span class="z-entity z-name"> Product</span><span> {}</span></span></code></pre>
<p>A product is a class implementing the <code>Product</code> interface. It lets us
type a generic product, regardless of its state.</p>
<h3 id="-2">Active and inactive<a role="presentation" class="anchor" href="#-2" title="Anchor link to this header">#</a>
</h3>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">/**</span></span>
<span class="giallo-l"><span class="z-comment"> * A product that is active.</span></span>
<span class="giallo-l"><span class="z-comment"> */</span></span>
<span class="giallo-l"><span class="z-storage">interface</span><span class="z-entity z-name"> Active</span><span class="z-storage"> extends</span><span class="z-entity z-other z-inherited-class"> Product</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> getProduct</span><span>()</span><span class="z-keyword z-operator">:</span><span class="z-storage"> self</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">/**</span></span>
<span class="giallo-l"><span class="z-comment"> * A product that has been cancelled, or not in stock.</span></span>
<span class="giallo-l"><span class="z-comment"> */</span></span>
<span class="giallo-l"><span class="z-storage">interface</span><span class="z-entity z-name"> Inactive</span><span class="z-storage"> extends</span><span class="z-entity z-other z-inherited-class"> Product</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> getProduct</span><span>()</span><span class="z-keyword z-operator">:</span><span class="z-storage"> self</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>The <code>Active</code> and <code>Inactive</code> interfaces are useful to create constraints
such as:</p>
<ul>
<li>A product can be purchased only if it is active, and</li>
<li>A product is inactive if and only if it has been cancelled,</li>
<li>To finally conclude that an inactive product can no longer be purchased, nor
renewed, nor cancelled.</li>
</ul>
<p>Basically, it defines the axiom (initial state) and the final states of
our FSM.</p>
<p>The <code>getProduct(): self</code> trick will make sense later. It helps to
express the following constraint: “An active product cannot be inactive,
and vice-versa”, i.e. the two interfaces cannot both be implemented by the same
value.</p>
<h3 id="-3">Purchase, renew, and cancel<a role="presentation" class="anchor" href="#-3" title="Anchor link to this header">#</a>
</h3>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">/**</span></span>
<span class="giallo-l"><span class="z-comment"> * A product that can be purchased.</span></span>
<span class="giallo-l"><span class="z-comment"> */</span></span>
<span class="giallo-l"><span class="z-storage">interface</span><span class="z-entity z-name"> Purchasable</span><span class="z-storage"> extends</span><span class="z-entity z-other z-inherited-class"> Active</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> purchase</span><span>()</span><span class="z-keyword z-operator">:</span><span class="z-support z-class"> Renewable</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Only an active product can be purchased. The action is <code>purchase</code> and it
generates a product that is renewable. <code>purchase</code> transitions from the
state <code>A</code> to <code>B</code> (regarding the graph above).</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">/**</span></span>
<span class="giallo-l"><span class="z-comment"> * A product that can be cancelled.</span></span>
<span class="giallo-l"><span class="z-comment"> */</span></span>
<span class="giallo-l"><span class="z-storage">interface</span><span class="z-entity z-name"> Cancellable</span><span class="z-storage"> extends</span><span class="z-entity z-other z-inherited-class"> Active</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> cancel</span><span>()</span><span class="z-keyword z-operator">:</span><span class="z-support z-class"> Inactive</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Only an active product can be cancelled. The action is <code>cancel</code> and it
generates an inactive product, so it transitions from the state <code>B</code> to
<code>D</code>.</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">/**</span></span>
<span class="giallo-l"><span class="z-comment"> * A product that can be renewed.</span></span>
<span class="giallo-l"><span class="z-comment"> */</span></span>
<span class="giallo-l"><span class="z-storage">interface</span><span class="z-entity z-name"> Renewable</span><span class="z-storage"> extends</span><span class="z-entity z-other z-inherited-class"> Cancellable</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> renew</span><span>()</span><span class="z-keyword z-operator">:</span><span class="z-storage"> self</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>A renewable product is also cancellable. The action is <code>renew</code> and this
is a reflexive transition from the state <code>B</code> to <code>B</code>.</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">/**</span></span>
<span class="giallo-l"><span class="z-comment"> * A product that can be once-off purchased, i.e. it can be purchased but not</span></span>
<span class="giallo-l"><span class="z-comment"> * renewed.</span></span>
<span class="giallo-l"><span class="z-comment"> */</span></span>
<span class="giallo-l"><span class="z-storage">interface</span><span class="z-entity z-name"> PurchasableOnce</span><span class="z-storage"> extends</span><span class="z-entity z-other z-inherited-class"> Active</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> purchase</span><span>()</span><span class="z-keyword z-operator">:</span><span class="z-support z-class"> Cancellable</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Finally, a once-off purchasable product has one action: <code>purchase</code> that
produces a <code>Cancellable</code> product, and it transitions from the state <code>A</code>
to <code>C</code>.</p>
<h3 id="-4">Take a breath<a role="presentation" class="anchor" href="#-4" title="Anchor link to this header">#</a>
</h3>
<figure role="presentation">
<p><img src="https://mnt.io/articles/finite-state-machine-as-a-type-system-illustrated-with-a-store-product/./schema2.png" alt="Second schema" loading="lazy" decoding="async" /></p>
</figure>
<p>So far we have defined interfaces, but the FSM is not implemented yet.
<strong>Interfaces only define constraints</strong> in our type system. An interface
provides a constraint but also <strong>defines type capabilities</strong>: <strong>What
operations can be performed on a value implementing a particular
interface</strong>.</p>
<h3 id="-5">SecretProduct<a role="presentation" class="anchor" href="#-5" title="Anchor link to this header">#</a>
</h3>
<p>Let's consider the <code>SecretProduct</code> as a new super secret product that
will revolutionise our store:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">/**</span></span>
<span class="giallo-l"><span class="z-comment"> * The `SecretProduct` class is:</span></span>
<span class="giallo-l"><span class="z-comment"> *</span></span>
<span class="giallo-l"><span class="z-comment"> * * A product,</span></span>
<span class="giallo-l"><span class="z-comment"> * * Active,</span></span>
<span class="giallo-l"><span class="z-comment"> * * Purchasable.</span></span>
<span class="giallo-l"><span class="z-comment"> *</span></span>
<span class="giallo-l"><span class="z-comment"> * Note that in this implementation, the `SecretProduct` instance is mutable: Every</span></span>
<span class="giallo-l"><span class="z-comment"> * action happens on the same `SecretProduct` instance. It makes sense because</span></span>
<span class="giallo-l"><span class="z-comment"> * having 2 instances of the same product with different states might be error-prone</span></span>
<span class="giallo-l"><span class="z-comment"> * in most scenarios.</span></span>
<span class="giallo-l"><span class="z-comment"> */</span></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> SecretProduct</span><span class="z-storage"> implements</span><span class="z-entity z-other z-inherited-class"> Active</span><span class="z-punctuation z-separator">,</span><span class="z-entity z-other z-inherited-class"> Purchasable</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> getProduct</span><span>()</span><span class="z-keyword z-operator">:</span><span class="z-support z-class"> Active</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable z-language"> $this</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> /**</span></span>
<span class="giallo-l"><span class="z-comment"> * Purchasing the product returns an active product that is renewable,</span></span>
<span class="giallo-l"><span class="z-comment"> * and also cancellable.</span></span>
<span class="giallo-l"><span class="z-comment"> */</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> purchase</span><span>()</span><span class="z-keyword z-operator">:</span><span class="z-support z-class"> Renewable</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> return new</span><span class="z-storage"> class</span><span> (</span><span class="z-variable z-language">$this</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getProduct</span><span>())</span><span class="z-storage"> implements</span><span class="z-entity z-other z-inherited-class"> Renewable</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> protected</span><span class="z-variable"> $product</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-support z-function"> __construct</span><span>(</span><span class="z-support z-class">SecretProduct</span><span class="z-variable"> $product</span><span>) {</span></span>
<span class="giallo-l"><span class="z-variable z-language"> $this</span><span class="z-keyword z-operator">-></span><span class="z-variable">product</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> $product</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-comment"> // Do the purchase.</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> getProduct</span><span>()</span><span class="z-keyword z-operator">:</span><span class="z-support z-class"> Active</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable z-language"> $this</span><span class="z-keyword z-operator">-></span><span class="z-variable">product</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> renew</span><span>()</span><span class="z-keyword z-operator">:</span><span class="z-support z-class"> Renewable</span><span> {</span></span>
<span class="giallo-l"><span class="z-comment"> // Do the renew.</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable z-language"> $this</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> cancel</span><span>()</span><span class="z-keyword z-operator">:</span><span class="z-support z-class"> Inactive</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> return new</span><span class="z-storage"> class</span><span> (</span><span class="z-variable z-language">$this</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getProduct</span><span>())</span><span class="z-storage"> implements</span><span class="z-entity z-other z-inherited-class"> Inactive</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> protected</span><span class="z-variable"> $product</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-support z-function"> __construct</span><span>(</span><span class="z-support z-class">SecretProduct</span><span class="z-variable"> $product</span><span>) {</span></span>
<span class="giallo-l"><span class="z-variable z-language"> $this</span><span class="z-keyword z-operator">-></span><span class="z-variable">product</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> $product</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-comment"> // Do the cancel.</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> getProduct</span><span>()</span><span class="z-keyword z-operator">:</span><span class="z-support z-class"> Inactive</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable z-language"> $this</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> }</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> }</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>The <code>SecretProduct</code> is a product that is active and purchasable. PHP
verifies that the <code>Active::getProduct</code> method is implemented, and that
the <code>Purchasable::purchase</code> method is implemented too.</p>
<p>When the latter is called, it returns an object implementing the
<code>Renewable</code> interface (which is also a cancellable active product). The
object in this context is an instance of an anonymous class implementing
the <code>Renewable</code> interface. So the <code>Active::getProduct</code>,
<code>Renewable::renew</code>, and <code>Cancellable::cancel</code> methods must be
implemented.</p>
<p>An anonymous class is not required at all; it just keeps the example
simple. A named class may even be better from a testing point
of view.</p>
<p>Note that:</p>
<ul>
<li>The real purchase action is performed in the constructor of the anonymous
class: This is not a hard rule, this is just convenient; it can be done in the
method before returning the new instance,</li>
<li>The real renew action is performed in the <code>renew</code> method before returning
<code>$this</code>,</li>
<li>And the real cancel action is performed in… we have to dig a little bit more
(the principle is exactly the same though):
<ul>
<li>The <code>Cancellable::cancel</code> method must return an object implementing the
<code>Inactive</code> interface.</li>
<li>It generates an instance of an anonymous class implementing the <code>Inactive</code>
interface, and the real cancel action is done in the constructor.</li>
</ul>
</li>
</ul>
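<p>As mentioned, named classes work just as well as anonymous ones. Here is a
possible sketch of the same product written that way. The
<code>PurchasedSecretProduct</code> and <code>CancelledSecretProduct</code> names are hypothetical and
are introduced only for this illustration; the interfaces are repeated so that
the snippet is self-contained.</p>
<pre class="giallo z-code"><code data-lang="php"><?php

interface Product {}

interface Active extends Product {
    public function getProduct(): self;
}

interface Inactive extends Product {
    public function getProduct(): self;
}

interface Cancellable extends Active {
    public function cancel(): Inactive;
}

interface Renewable extends Cancellable {
    public function renew(): self;
}

interface Purchasable extends Active {
    public function purchase(): Renewable;
}

/**
 * State D: a cancelled product (hypothetical name).
 */
final class CancelledSecretProduct implements Inactive {
    public function getProduct(): Inactive {
        return $this;
    }
}

/**
 * State B: a purchased product that is renewable (hypothetical name).
 */
final class PurchasedSecretProduct implements Renewable {
    private $product;

    public function __construct(Active $product) {
        $this->product = $product;
        // Do the purchase.
    }

    public function getProduct(): Active {
        return $this->product;
    }

    public function renew(): Renewable {
        // Do the renew.
        return $this;
    }

    public function cancel(): Inactive {
        // Do the cancel.
        return new CancelledSecretProduct();
    }
}

/**
 * State A: the active, purchasable product.
 */
final class SecretProduct implements Purchasable {
    public function getProduct(): Active {
        return $this;
    }

    public function purchase(): Renewable {
        return new PurchasedSecretProduct($this);
    }
}

$purchased = (new SecretProduct())->purchase();
$cancelled = $purchased->renew()->cancel();</code></pre>
<p>Each state is now a class that can be instantiated and tested in isolation,
at the cost of a few more names to maintain.</p>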
<h3 id="-6">Assert possible and impossible actions<a role="presentation" class="anchor" href="#-6" title="Anchor link to this header">#</a>
</h3>
<p>Let's try some valid and invalid actions. The following are
<strong>possible actions</strong>:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-support z-function">assert</span><span>((</span><span class="z-keyword">new</span><span class="z-support z-class"> SecretProduct</span><span>())</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">purchase</span><span>()</span><span class="z-keyword z-operator"> instanceof</span><span class="z-support z-class"> Product</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">assert</span><span>((</span><span class="z-keyword">new</span><span class="z-support z-class"> SecretProduct</span><span>())</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">purchase</span><span>()</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">renew</span><span>()</span><span class="z-keyword z-operator"> instanceof</span><span class="z-support z-class"> Product</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">assert</span><span>((</span><span class="z-keyword">new</span><span class="z-support z-class"> SecretProduct</span><span>())</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">purchase</span><span>()</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">cancel</span><span>()</span><span class="z-keyword z-operator"> instanceof</span><span class="z-support z-class"> Product</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">assert</span><span>((</span><span class="z-keyword">new</span><span class="z-support z-class"> SecretProduct</span><span>())</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">purchase</span><span>()</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">renew</span><span>()</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">renew</span><span>()</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">cancel</span><span>()</span><span class="z-keyword z-operator"> instanceof</span><span class="z-support z-class"> Product</span><span>)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>It is possible to purchase a product, then renew it zero or many times,
and finally to cancel it. It matches the FSM!</p>
<p>The following are <strong>impossible actions</strong>:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>(</span><span class="z-keyword">new</span><span class="z-support z-class"> SecretProduct</span><span>())</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">renew</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>(</span><span class="z-keyword">new</span><span class="z-support z-class"> SecretProduct</span><span>())</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">cancel</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>(</span><span class="z-keyword">new</span><span class="z-support z-class"> SecretProduct</span><span>())</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">purchase</span><span>()</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">cancel</span><span>()</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">purchase</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>(</span><span class="z-keyword">new</span><span class="z-support z-class"> SecretProduct</span><span>())</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">purchase</span><span>()</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">cancel</span><span>()</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">renew</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>(</span><span class="z-keyword">new</span><span class="z-support z-class"> SecretProduct</span><span>())</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">purchase</span><span>()</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">purchase</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>(</span><span class="z-keyword">new</span><span class="z-support z-class"> SecretProduct</span><span>())</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">purchase</span><span>()</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">cancel</span><span>()</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">cancel</span><span>()</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>It is impossible:</p>
<ul>
<li>To renew or to cancel a product that has not been purchased,</li>
<li>To purchase or renew a product that has been cancelled,</li>
<li>To purchase a product more than once,</li>
<li>To cancel a product more than once.</li>
</ul>
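<p>To observe these rejections without stopping a script at the first one, the
calls can be wrapped and the resulting <code>Error</code> caught (since PHP 7, calling an
undefined method throws an <code>Error</code>). This is only a sketch: The <code>DemoProduct</code>
class and the <code>attempt</code> helper are hypothetical, and the interfaces are
repeated so that the snippet is self-contained.</p>
<pre class="giallo z-code"><code data-lang="php"><?php

interface Product {}

interface Active extends Product {
    public function getProduct(): self;
}

interface Inactive extends Product {
    public function getProduct(): self;
}

interface Cancellable extends Active {
    public function cancel(): Inactive;
}

interface Renewable extends Cancellable {
    public function renew(): self;
}

interface Purchasable extends Active {
    public function purchase(): Renewable;
}

// Hypothetical minimal product following the A, B, D path of the FSM.
final class DemoProduct implements Purchasable {
    public function getProduct(): Active {
        return $this;
    }

    public function purchase(): Renewable {
        return new class ($this) implements Renewable {
            private $product;

            public function __construct(Active $product) {
                $this->product = $product;
            }

            public function getProduct(): Active {
                return $this->product;
            }

            public function renew(): Renewable {
                return $this;
            }

            public function cancel(): Inactive {
                return new class implements Inactive {
                    public function getProduct(): Inactive {
                        return $this;
                    }
                };
            }
        };
    }
}

// Hypothetical helper: run a chain of actions and report whether PHP accepts it.
function attempt(callable $actions): string {
    try {
        $actions();

        return 'allowed';
    } catch (Error $e) {
        // Raised when the method does not exist on the current state.
        return 'rejected';
    }
}

$results = [
    'purchase, renew, cancel' => attempt(function () {
        (new DemoProduct())->purchase()->renew()->cancel();
    }),
    'renew before purchase' => attempt(function () {
        (new DemoProduct())->renew();
    }),
    'purchase twice' => attempt(function () {
        (new DemoProduct())->purchase()->purchase();
    }),
    'renew after cancel' => attempt(function () {
        (new DemoProduct())->purchase()->cancel()->renew();
    }),
    'cancel twice' => attempt(function () {
        (new DemoProduct())->purchase()->cancel()->cancel();
    }),
];

foreach ($results as $scenario => $outcome) {
    echo $scenario, ': ', $outcome, "\n";
}</code></pre>
<p>Only the first chain is reported as allowed. The other chains are rejected
because the method simply does not exist on the object representing the
current state.</p>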
<p>The following are <strong>impossible implementations</strong>:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> SecretProduct</span><span class="z-storage"> implements</span><span class="z-entity z-other z-inherited-class"> Active</span><span class="z-punctuation z-separator">,</span><span class="z-entity z-other z-inherited-class"> Purchasable</span><span class="z-punctuation z-separator">,</span><span class="z-entity z-other z-inherited-class"> PurchasableOnce</span><span> {}</span></span></code></pre>
<p>A product cannot be purchasable and once-off purchasable at the same
time, because <code>Purchasable::purchase</code> is not compatible with
<code>PurchasableOnce::purchase</code>.</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> SecretProduct</span><span class="z-storage"> implements</span><span class="z-entity z-other z-inherited-class"> Inactive</span><span class="z-punctuation z-separator">,</span><span class="z-entity z-other z-inherited-class"> Cancellable</span><span> {}</span></span></code></pre>
<p>An inactive product cannot be purchased nor renewed nor cancelled
because <code>Active::getProduct</code> and <code>Inactive::getProduct</code> are not
compatible.</p>
<p>Wow, those are great guarantees, aren't they? <strong>PHP will raise fatal errors for
impossible actions or impossible states</strong>. No warnings or notices: Fatal
errors. Most of them are correctly inferred by IDEs, so… follow the red
crosses in your IDE.</p>
<h2 id="-7">Restoring a product<a role="presentation" class="anchor" href="#-7" title="Anchor link to this header">#</a>
</h2>
<p>One major thing is missing: The state of a product is stored in the
database. When loading the product, we must be able to get an instance
of a product at its previous state. To avoid repeating code, we will use
traits. Rebuilding the state of a product is “just” (it really is) a
composition of traits.</p>
<p>Note: In these examples, we are using anonymous classes and traits. It
is possible to achieve the same behavior with final named classes. Also
we are using a repository, which is convenient for this article, but not
necessarily the best solution.</p>
<h3 id="-8">Repository<a role="presentation" class="anchor" href="#-8" title="Anchor link to this header">#</a>
</h3>
<p>The following <code>ProductRepository\load</code> function is just here to give you
an idea of how it works.</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">namespace</span><span class="z-entity z-name"> ProductRepository</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage z-type z-function">function</span><span class="z-entity z-name z-function"> load</span><span>(</span><span class="z-keyword">int</span><span class="z-variable"> $id</span><span class="z-punctuation z-separator">,</span><span class="z-keyword"> string</span><span class="z-variable"> $state</span><span>)</span><span class="z-keyword z-operator">:</span><span class="z-support z-class"> Product</span><span> {</span></span>
<span class="giallo-l"><span class="z-comment"> // Load the product from the database with `$id`.</span></span>
<span class="giallo-l"><span class="z-comment"> //</span></span>
<span class="giallo-l"><span class="z-comment"> // The states can be `Renewable`, `Cancellable`, or `Inactive` (check</span></span>
<span class="giallo-l"><span class="z-comment"> // the FSM to double-check). Products that have not been purchased</span></span>
<span class="giallo-l"><span class="z-comment"> // are not in the database.</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Fake minimal active product.</span></span>
<span class="giallo-l"><span class="z-variable"> $product</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> new</span><span class="z-storage"> class implements</span><span class="z-entity z-other z-inherited-class"> Active</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> getProduct</span><span>()</span><span class="z-keyword z-operator">:</span><span class="z-support z-class"> Active</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable z-language"> $this</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> }</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> switch</span><span> (</span><span class="z-variable">$state</span><span>) {</span></span>
<span class="giallo-l"><span class="z-comment"> // State B.</span></span>
<span class="giallo-l"><span class="z-keyword"> case</span><span class="z-support z-class"> Renewable</span><span class="z-keyword z-operator">::</span><span class="z-keyword">class</span><span class="z-punctuation z-terminator">:</span></span>
<span class="giallo-l"><span class="z-keyword"> return new</span><span class="z-storage"> class</span><span> (</span><span class="z-variable">$product</span><span>)</span><span class="z-storage"> implements</span><span class="z-entity z-other z-inherited-class"> Renewable</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> use</span><span class="z-support z-class"> ActiveProduct</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword"> use</span><span class="z-support z-class"> RenewableProduct</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword"> use</span><span class="z-support z-class"> CancellableProduct</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // State C.</span></span>
<span class="giallo-l"><span class="z-keyword"> case</span><span class="z-support z-class"> Cancellable</span><span class="z-keyword z-operator">::</span><span class="z-keyword">class</span><span class="z-punctuation z-terminator">:</span></span>
<span class="giallo-l"><span class="z-keyword"> return new</span><span class="z-storage"> class</span><span> (</span><span class="z-variable">$product</span><span>)</span><span class="z-storage"> implements</span><span class="z-entity z-other z-inherited-class"> Cancellable</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> use</span><span class="z-support z-class"> ActiveProduct</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword"> use</span><span class="z-support z-class"> CancellableProduct</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // State D.</span></span>
<span class="giallo-l"><span class="z-keyword"> case</span><span class="z-support z-class"> Inactive</span><span class="z-keyword z-operator">::</span><span class="z-keyword">class</span><span class="z-punctuation z-terminator">:</span></span>
<span class="giallo-l"><span class="z-keyword"> return new</span><span class="z-storage"> class</span><span> (</span><span class="z-variable">$product</span><span>)</span><span class="z-storage"> implements</span><span class="z-entity z-other z-inherited-class"> Inactive</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> use</span><span class="z-support z-class"> InactiveProduct</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Invalid state.</span></span>
<span class="giallo-l"><span class="z-keyword"> default</span><span class="z-punctuation z-terminator">:</span></span>
<span class="giallo-l"><span class="z-keyword"> throw new</span><span class="z-support z-class"> RuntimeException</span><span>(</span><span class="z-string">'Invalid product state.'</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre><h3 id="-9">Traits<a role="presentation" class="anchor" href="#-9" title="Anchor link to this header">#</a>
</h3>
<p>The code should look familiar because it is simply split out of the
<code>SecretProduct</code> implementation.</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">trait</span><span class="z-entity z-name"> ActiveProduct</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> protected</span><span class="z-variable"> $product</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-support z-function"> __construct</span><span>(</span><span class="z-support z-class">Product</span><span class="z-variable"> $product</span><span>) {</span></span>
<span class="giallo-l"><span class="z-variable z-language"> $this</span><span class="z-keyword z-operator">-></span><span class="z-variable">product</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> $product</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> getProduct</span><span>()</span><span class="z-keyword z-operator">:</span><span class="z-support z-class"> Active</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable z-language"> $this</span><span class="z-keyword z-operator">-></span><span class="z-variable">product</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">trait</span><span class="z-entity z-name"> RenewableProduct</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> renew</span><span>()</span><span class="z-keyword z-operator">:</span><span class="z-support z-class"> Renewable</span><span> {</span></span>
<span class="giallo-l"><span class="z-comment"> // Do the renew.</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable z-language"> $this</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">trait</span><span class="z-entity z-name"> CancellableProduct</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> cancel</span><span>()</span><span class="z-keyword z-operator">:</span><span class="z-support z-class"> Inactive</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> return new</span><span class="z-storage"> class</span><span> (</span><span class="z-variable z-language">$this</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getProduct</span><span>())</span><span class="z-storage"> implements</span><span class="z-entity z-other z-inherited-class"> Inactive</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> protected</span><span class="z-variable"> $product</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-support z-function"> __construct</span><span>(</span><span class="z-support z-class">Product</span><span class="z-variable"> $product</span><span>) {</span></span>
<span class="giallo-l"><span class="z-variable z-language"> $this</span><span class="z-keyword z-operator">-></span><span class="z-variable">product</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> $product</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-comment"> // Do the cancel.</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> getProduct</span><span>()</span><span class="z-keyword z-operator">:</span><span class="z-support z-class"> Inactive</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable z-language"> $this</span><span class="z-keyword z-operator">-></span><span class="z-variable">product</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> }</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">trait</span><span class="z-entity z-name"> InactiveProduct</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> protected</span><span class="z-variable"> $product</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-support z-function"> __construct</span><span>(</span><span class="z-support z-class">Product</span><span class="z-variable"> $product</span><span>) {</span></span>
<span class="giallo-l"><span class="z-variable z-language"> $this</span><span class="z-keyword z-operator">-></span><span class="z-variable">product</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> $product</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> getProduct</span><span>()</span><span class="z-keyword z-operator">:</span><span class="z-support z-class"> Inactive</span><span> {</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable z-language"> $this</span><span class="z-keyword z-operator">-></span><span class="z-variable">product</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre><h3 id="-10">Assert possible and impossible actions<a role="presentation" class="anchor" href="#-10" title="Anchor link to this header">#</a>
</h3>
<p>The <strong>possible actions</strong> are:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable">$product</span><span class="z-keyword z-operator"> =</span><span> ProductRepository</span><span class="z-punctuation z-separator">\</span><span class="z-entity z-name z-function">load</span><span>(</span><span class="z-constant z-numeric">42</span><span class="z-punctuation z-separator">,</span><span class="z-support z-class"> Renewable</span><span class="z-keyword z-operator">::</span><span class="z-keyword">class</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-support z-function">assert</span><span>(</span><span class="z-variable">$product</span><span class="z-keyword z-operator"> instanceof</span><span class="z-support z-class"> Product</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">assert</span><span>(</span><span class="z-variable">$product</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">renew</span><span>()</span><span class="z-keyword z-operator"> instanceof</span><span class="z-support z-class"> Product</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">assert</span><span>(</span><span class="z-variable">$product</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">cancel</span><span>()</span><span class="z-keyword z-operator"> instanceof</span><span class="z-support z-class"> Product</span><span>)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>Product 42 is assumed to be in the state <code>B</code> (<code>Renewable::class</code>), so we
can renew and cancel it.</p>
<p>The following are <strong>impossible actions</strong>:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable">$product</span><span class="z-keyword z-operator"> =</span><span> ProductRepository</span><span class="z-punctuation z-separator">\</span><span class="z-entity z-name z-function">load</span><span>(</span><span class="z-constant z-numeric">42</span><span class="z-punctuation z-separator">,</span><span class="z-support z-class"> Renewable</span><span class="z-keyword z-operator">::</span><span class="z-keyword">class</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable">$product</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">purchase</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable">$product</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">cancel</span><span>()</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">cancel</span><span>()</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>It is impossible to purchase the product 42 because it is in state <code>B</code>,
so it has already been purchased. It is impossible to cancel a product
twice.</p>
<p><strong>The same guarantees apply here</strong>!</p>
<h2 id="-11">Conclusion<a role="presentation" class="anchor" href="#-11" title="Anchor link to this header">#</a>
</h2>
<p>It is possible to re-implement <code>SecretProduct</code> with the traits we have
defined for the <code>ProductRepository</code>, or to use named classes. I leave this
as an easy wrap-up exercise for the reader.</p>
<p>The real conclusion is that we have <strong>successfully implemented the
Finite-State Machine of a product with a Type System</strong>. It is impossible
to have an invalid implementation that violates the constraints, such as
an inactive renewable product. PHP detects it immediately at runtime.
Invalid actions are also impossible, such as purchasing a product twice,
or renewing a one-off purchased product. PHP detects these too.</p>
<p>All violations take the form of PHP fatal errors.</p>
<p>The product repository is an example of how to restore a product at a
particular state, with the help of the defined interfaces, and new small
and simple traits.</p>
<h2 id="-12">One more thing<a role="presentation" class="anchor" href="#-12" title="Anchor link to this header">#</a>
</h2>
<p>It is possible to integrate product categories in this type system (like
bundles). It is more complex, but possible.</p>
<p>I highly recommend the following readings:</p>
<ul>
<li><a rel="noopener external" target="_blank" href="http://blogs.perl.org/users/ovid/2010/08/what-to-know-before-debating-type-systems.html">What to know before debating type systems</a>
to have an overview of different systems,</li>
<li><a rel="noopener external" target="_blank" href="https://sdleffler.github.io/RustTypeSystemTuringComplete/">Rust's Type System is Turing-Complete</a>
to see how powerful a type system can be,</li>
<li><a rel="noopener external" target="_blank" href="https://speakerdeck.com/willroth/fear-not-the-machine-of-state">Fear Not the Machine of State!</a>
to see how to integrate an FSM into an object without using a type system.</li>
</ul>
<p>I would like to particularly emphasize a paragraph from the first
article:</p>
<blockquote>
<p>So what is a type? The only true definition is this: a type is a
<strong>label</strong> used by a type system to <strong>prove</strong> some property of the
<strong>program's behavior</strong>. If the type checker can assign types to the
whole program, then it succeeds in its proof; otherwise it fails and
points out why it failed.</p>
</blockquote>
<p>Seeing types as labels is a very smart way of approaching them.</p>
<p>I would like to thank <a rel="noopener external" target="_blank" href="https://ocramius.github.io/">Marco Pivetta</a> for
the reviews!</p>
Tagua VM, a safe PHP virtual machine2017-06-19T00:00:00+00:002017-06-19T00:00:00+00:00
Unknown
https://mnt.io/articles/tagua-vm-a-safe-php-virtual-machine/<iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/Ymy8qAEe0kQ?si=_7IlrTO1VzOriUKW" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen>
</iframe>
<p>PHPTour Nantes 2017 (in French):</p>
<blockquote>
<p>PHP is an extremely popular language. In 2015, PHP was used by
more than 80% of all websites. However, 500 severe vulnerabilities
are on record. Although this is inherent to every popular
language, it remains very dangerous. The goal of the Tagua VM project
is to provide a PHP VM that guarantees a high level of safety and
quality by eliminating large classes of vulnerabilities, thanks to
appropriate tools like Rust and LLVM. Rust is a remarkable language
that brings strong guarantees about memory safety. It is also a very
fast language that competes with C. LLVM is a famous compiler
infrastructure that brings modernity, state-of-the-art algorithms,
performance, a developer toolchain, etc. This project will solve
three problems at once:</p>
<ol>
<li>Provide a high level of safety and quality by eliminating
large classes of vulnerabilities, and thus avoid dramatic
bug costs;</li>
<li>Provide modernity, a new developer experience, and
state-of-the-art research algorithms, hence performance;</li>
<li>Provide a set of libraries that compose the VM and
that can be reused outside the project (like the
parser, the analysers, the extensions, etc.).</li>
</ol>
<p>During this talk, we will present the goals of this project,
as well as its progress. We will explain why it is crucial and
why it is receiving support from a growing community and from
notable developers (with an important role in the development of
PHP).</p>
</blockquote>
<p><a rel="noopener external" target="_blank" href="https://speakerdeck.com/hywan/tagua-vm-a-safe-php-virtual-machine">View
slides</a>.</p>
Faster find algorithms in nom2017-05-23T00:00:00+00:002017-05-23T00:00:00+00:00
Unknown
https://mnt.io/articles/faster-find-algorithms-in-nom/<p><a rel="noopener external" target="_blank" href="https://github.com/tagua-vm/">Tagua VM</a> is an experimental PHP virtual
machine written in Rust and LLVM. It is composed of a set of libraries.
The one that keeps me busy these days is
<a rel="noopener external" target="_blank" href="https://github.com/tagua-vm/parser"><code>tagua-parser</code></a>. It contains the
lexical and syntactic analysers for the PHP language, in addition to the
AST (Abstract Syntax Tree). If you would like to know more about this
project, you can watch the talk I gave at PHPTour last week:
<a rel="noopener external" target="_blank" href="https://speakerdeck.com/hywan/tagua-vm-a-safe-php-virtual-machine">Tagua VM, a safe PHP virtual
machine</a>.</p>
<p>The library <code>tagua-parser</code> is built with parser combinators. Instead of
having a classical grammar, compiled to a parser, we write pure
functions acting as small parsers. We then combine them together. This
post does not explain why this is a sane approach in our context, but
keep in mind this is much easier to test, to maintain, and to optimise.</p>
<p>Because this project is complex enough, we are delegating the parser
combinator implementation to <a rel="noopener external" target="_blank" href="https://github.com/Geal/nom/">nom</a>.</p>
<blockquote>
<p>nom is a parser combinators library written in Rust. Its goal is to
provide tools to build safe parsers without compromising the speed or
memory consumption. To that end, it uses extensively Rust's <em>strong
typing</em>, <em>zero copy</em> parsing, <em>push streaming</em>, <em>pull streaming</em>, and
provides macros and traits to abstract most of the error prone
plumbing.</p>
</blockquote>
<p>Recently, I have been working on optimisations in the <code>FindToken</code> and
<code>FindSubstring</code> traits from nom itself. These traits provide methods to
find a token (i.e. a lexeme), and to find a substring; crazy naming, right?
That description is not entirely accurate though: <code>FindToken</code> expects to find a single
item (if implemented for <code>u8</code>, it will look for a <code>u8</code> in a <code>&[u8]</code>),
and <code>FindSubstring</code> really is about finding a substring, so a token of
any length.</p>
<p>It turned out that these methods can be optimised in some cases. Both
default implementations use Rust iterators: a regular iterator for
<code>FindToken</code>, and <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/slice/struct.Windows.html">window
iterator</a> for
<code>FindSubstring</code>, i.e. an iterator over overlapping subslices of a given
length. We have benchmarked big PHP comments, which are analysed by
parsers actively using these two trait implementations.</p>
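<p>To make the two iterator-based strategies concrete, here is a minimal sketch.
The function names <code>find_token</code> and <code>find_substring_windows</code> are
hypothetical stand-ins chosen for this illustration; they are not nom's actual trait
methods, and the code only relies on the standard library.</p>

```rust
/// Hypothetical stand-in for a `FindToken`-like lookup:
/// is `token` present in `input`?
fn find_token(input: &[u8], token: u8) -> bool {
    // A regular iterator visits each byte until one matches.
    input.iter().any(|&byte| byte == token)
}

/// Hypothetical stand-in for a `FindSubstring`-like lookup:
/// offset of the first occurrence of `needle` in `input`, if any.
fn find_substring_windows(input: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() {
        // `windows(0)` panics, so the empty needle is handled up front.
        return Some(0);
    }

    // `windows(n)` yields every overlapping subslice of length `n`;
    // each window is compared against the needle in full.
    input
        .windows(needle.len())
        .position(|window| window == needle)
}
```

<p>For instance, <code>find_substring_windows(b"abc", b"bc")</code> returns
<code>Some(1)</code>. Every window is compared against the whole needle, which is
where a specialised byte search can save time.</p>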
<p>Here are the results, before and after our optimisations:</p>
<pre class="giallo z-code"><code data-lang="plain"><span class="giallo-l"><span>test …::bench_span ... bench: 73,433 ns/iter (+/- 3,869)</span></span>
<span class="giallo-l"><span>test …::bench_span ... bench: 15,986 ns/iter (+/- 3,068)</span></span></code></pre>
<p>The time per iteration drops by 78%! Nice!</p>
<p>The <a rel="noopener external" target="_blank" href="https://github.com/Geal/nom/pull/507">pull request has been merged</a>
today, thank you Geoffroy Couprie! The new algorithms heavily rely on
<a rel="noopener external" target="_blank" href="https://github.com/BurntSushi/rust-memchr">the <code>memchr</code> crate</a>. So all
the credit should really go to Andrew Gallant! This crate provides a
safe interface to <code>libc</code>'s <code>memchr</code> and <code>memrchr</code>. It also provides
fallback implementations when either function is unavailable.</p>
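<p>The general idea of a substring search built on a fast single-byte search can be
sketched as follows. This is not the code from the pull request: the helper
<code>first_byte</code> is a hypothetical, standard-library-only stand-in that mirrors
the shape of <code>memchr::memchr(needle, haystack)</code>, and
<code>find_substring_scan</code> is a name invented for this illustration.</p>

```rust
/// Std-only stand-in for a `memchr`-style search (an assumption made to keep
/// this sketch dependency-free): offset of the first `byte` in `haystack`.
fn first_byte(byte: u8, haystack: &[u8]) -> Option<usize> {
    haystack.iter().position(|&b| b == byte)
}

/// Hypothetical substring search: jump to each occurrence of the needle's
/// first byte, then verify the remaining bytes.
fn find_substring_scan(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    let (&first, rest) = match needle.split_first() {
        Some(parts) => parts,
        // An empty needle matches at the very beginning.
        None => return Some(0),
    };

    let mut offset = 0;

    // Stop once the remaining input is too short to hold the needle.
    while offset + needle.len() <= haystack.len() {
        // Jump straight to the next candidate position, or give up.
        let candidate = offset + first_byte(first, &haystack[offset..])?;
        let tail_start = candidate + 1;
        let tail_end = tail_start + rest.len();

        if tail_end > haystack.len() {
            // The candidate is too close to the end to be a full match.
            return None;
        }

        if &haystack[tail_start..tail_end] == rest {
            return Some(candidate);
        }

        // False positive: resume the scan right after this candidate.
        offset = candidate + 1;
    }

    None
}
```

<p>With the real crate, the body of <code>first_byte</code> would simply delegate to
<code>memchr::memchr</code>; the verification loop stays the same.</p>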
<p>The new algorithms are only implemented for <code>&[u8]</code> though. Fortunately,
the implementation for <code>&str</code> falls back to the former.</p>
<p>This is a small contribution, but it brings a very nice boost. I hope it
will benefit other projects!</p>
<p>I am also blowing the dust off of <a rel="noopener external" target="_blank" href="https://www.amazon.com/Algorithms-Strings-Maxime-Crochemore/dp/0521848997">Algorithms on
Strings</a>,
by M. Crochemore, C. Hancart, and T. Lecroq. I am pretty sure it should
be useful for nom and <code>tagua-parser</code>. If you haven't read this book yet,
I can only encourage you to do so!</p>
Welcome to Chaos2017-04-24T00:00:00+00:002017-04-24T00:00:00+00:00
Unknown
https://mnt.io/articles/welcome-to-chaos/<p>Recently, <a href="https://mnt.io/articles/bye-bye-liip-hello-automattic/">I joined
Automattic</a>.
This is a worldwide distributed company. For the first three weeks, you
act as a Happiness Engineer. This is part of the Happiness Rotation
duty. This article explains why I loved it, and why I reckon you should
do it too.</p>
<h2 id="happiness-engineer-really">Happiness Engineer, really?<a role="presentation" class="anchor" href="#happiness-engineer-really" title="Anchor link to this header">#</a>
</h2>
<p>Does it sound mad as a Cheshire cat? Pretentious maybe? Actually, it's
not at all.</p>
<p>As a Happiness Engineer, I had to do support. This is part of the
Happiness Rotation: once a year, almost everyone swaps their position to
help our users. I will come back to this later.</p>
<p>My role was to make our users happy. To achieve that, I had to:</p>
<ul>
<li>Meet our users, understand who they are, what they want to achieve,</li>
<li>Listen to and understand their issues,</li>
<li>Find a way to fix the issues.</li>
</ul>
<h3 id="meet-the-users">Meet the users<a role="presentation" class="anchor" href="#meet-the-users" title="Anchor link to this header">#</a>
</h3>
<p>I need motivation in my job. Learning who our users are, and what they
want to achieve, is a great motivation. After these three weeks, I know
whom my contributions will serve. It gives meaning to each
contribution, to each day I wake up.</p>
<p>Especially in a distributed company on the Internet, our users are
world-wide, they speak almost all the languages on Earth, they are
present on all continents. Their needs vary a lot, they use our
software in ways I was not able to foresee.</p>
<h3 id="listen-to-understand-and-fix-their-issues">Listen to, understand, and fix their issues<a role="presentation" class="anchor" href="#listen-to-understand-and-fix-their-issues" title="Anchor link to this header">#</a>
</h3>
<p>When you are chatting with a “support guy”, you may not imagine that this is a
real engineer. This is not a random person filling in a vague, pre-defined
form, hired wherever labour is cheap. You are chatting with someone
very competent. Someone who has no superior. Someone who has all the
tools to make you happy.</p>
<p>Personally, when I started, it was the first time I was using WordPress.
I was more of a novice than the users I was talking to. So how could I fix
their issues? I had to:</p>
<ul>
<li>Ask the right people for help,</li>
<li>Therefore, meet Automatticians (people working with Automattic),</li>
<li>Discover all the interactions between them,</li>
<li>Understand the structure of the company,</li>
<li>Learn how to ask for help, how to formulate my questions, how to reformulate the issues
of the users…</li>
<li>Discover all the internal tools,</li>
<li>Therefore, learn how the software works internally and together,</li>
<li>Discover the giant internal and public documentation,</li>
<li>When needed, create bug reports or feature requests for the
appropriate teams,</li>
<li>Learn the culture of the company.</li>
</ul>
<p>This is why it is called <em>Welcome to Chaos</em>. Yes, you have to learn a
lot in three weeks, but it is extremely educational. It is like speed
training.</p>
<h3 id="happiness">Happiness<a role="presentation" class="anchor" href="#happiness" title="Anchor link to this header">#</a>
</h3>
<p>I can assure you that when a user is grateful after you fixed their issue, the
term Happiness Engineer makes a lot of sense. Automattic gives its
Happiness Engineers a lot of freedom to make people really happy,
both in terms of tooling and finances.</p>
<p>This is the first time I have seen a company that is this generous with
its customers.</p>
<h3 id="thanks-buddy">Thanks buddy<a role="presentation" class="anchor" href="#thanks-buddy" title="Anchor link to this header">#</a>
</h3>
<p>Of course, when embracing the chaos, you are not alone. Everyone is here
to help you, and to answer your questions. After all, this is part of
<a rel="noopener external" target="_blank" href="https://automattic.com/creed/">the Automattic creed</a> (<a rel="noopener external" target="_blank" href="https://ma.tt/2011/09/automattic-creed/">story of the
creed</a>):</p>
<blockquote>
<p>I will never stop learning. I won’t just work on things that are
assigned to me. I know there’s no such thing as a status quo. I will
build our business sustainably through passionate and loyal customers.
<strong>I will never pass up an opportunity to help out a colleague</strong>, and
<strong>I’ll remember the days before I knew everything</strong>. I am more
motivated by impact than money, and I know that Open Source is one of
the most powerful ideas of our generation. I will communicate as much
as possible, because it’s the oxygen of a distributed company. I am in
a marathon, not a sprint, and no matter how far away the goal is, the
only way to get there is by putting one foot in front of another every
day. Given time, there is no problem that’s insurmountable.</p>
</blockquote>
<p>In addition to everyone being willing to help, a buddy was assigned to me: a
person who helps and teaches you all along the way. This is very helpful. Thank
you Hannah!</p>
<h2 id="happiness-rotation">Happiness Rotation<a role="presentation" class="anchor" href="#happiness-rotation" title="Anchor link to this header">#</a>
</h2>
<p>This experience is great. But after some time, you might forget it. So
as a reminder, once a year, you act as a Happiness Engineer again. This
is part of the Happiness Rotation. As far as I understand, it involves
almost everyone in the company.</p>
<p>Note: Obviously, there are permanent Happiness Engineers.</p>
<h2 id="conclusion">Conclusion<a role="presentation" class="anchor" href="#conclusion" title="Anchor link to this header">#</a>
</h2>
<p>I deeply believe this approach has many advantages. Some of them are
listed above. It helps to understand the company, and more importantly
the users. The Happiness Rotation stresses the fact that users are
central to Automattic, as they probably are to any company, but rarely with
this much care. Remember the creed: I will build our business sustainably through
passionate and loyal customers. To have passionate and loyal users, you
need to know them.</p>
<p>For me, it was a great experience. It was chaotic at first, but it was
worth it.</p>
Bye bye Liip, hello Automattic2017-04-18T00:00:00+00:002017-04-18T00:00:00+00:00
Unknown
https://mnt.io/articles/bye-bye-liip-hello-automattic/<p>In April 2017, I left <a rel="noopener external" target="_blank" href="https://www.liip.ch/">Liip</a> to join
<a rel="noopener external" target="_blank" href="https://automattic.com/">Automattic</a>.</p>
<h2 id="bye-bye-liip">Bye bye Liip<a role="presentation" class="anchor" href="#bye-bye-liip" title="Anchor link to this header">#</a>
</h2>
<p>After almost 20 months at Liip, I am leaving. Liip was a great
experience. It was my first industrial non-remote job. It was also my
first job in the country I am currently living in. And I have discovered
a new way of working.</p>
<h3 id="first-industrial-non-remote-job">First industrial non-remote job<a role="presentation" class="anchor" href="#first-industrial-non-remote-job" title="Anchor link to this header">#</a>
</h3>
<p>Before working for Liip, I was working for <a rel="noopener external" target="_blank" href="https://fruux.com/">fruux</a>.
My situation was the following: A French citizen, living as a foreigner
in Switzerland, working for a German company, with employees from
Germany, Holland, and Canada. Everything happened on chat, mail, and
Skype. When my son was born, I had to change jobs to simplify my
life. It was not the only reason, but one of them.</p>
<p>And before fruux, I was working for <a rel="noopener external" target="_blank" href="https://www.inria.fr/en/">INRIA</a>, a
research institute in France. It was partially a remote job.</p>
<p>Liip has several offices. I was based in Lausanne.</p>
<p>So, yes, Liip was my first industrial non-remote job. And I liked it.
Working on the train in the morning, walking in Lausanne, seeing real
people, everything in my local language. Because yes, it was my first
job in my native language too.</p>
<p>Everything was simpler. And when you have your first baby, anything
that makes things simpler saves your life.</p>
<h3 id="introducing-holacracy">Introducing Holacracy<a role="presentation" class="anchor" href="#introducing-holacracy" title="Anchor link to this header">#</a>
</h3>
<p>Long discussions were taking place about removing any form of hierarchy at
Liip. Then we discovered
<a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Holacracy">Holacracy</a>, and we started
moving to this system. This is a new governance system. If you are
familiar with distributed network topologies in Computer Science, or
data structures, it really looks like a <a rel="noopener external" target="_blank" href="https://hal.archives-ouvertes.fr/hal-00560821/document">Distributed Spanning
Tree</a>
[<a rel="noopener external" target="_blank" href="http://dblp.org/rec/html/journals/tpds/DahanPN09">DahanPN09</a>]. Note:
I am sure that the authors of Holacracy are not aware of DST, but, eh.</p>
<p>So nothing new from a research point of view, but it is cool to see this
algorithm coming alive in real life. And it worked: Fewer meetings, more
self-organisation, more shared responsibilities, no more “boss” etc.
This is not a tool for all companies, but I am sure that if you are
reading my blog, then your company should give it a try.</p>
<h3 id="open-source-projects">Open source projects<a role="presentation" class="anchor" href="#open-source-projects" title="Anchor link to this header">#</a>
</h3>
<p>Liip has been very generous with me regarding my open source
engagements. I was involved in <a rel="noopener external" target="_blank" href="https://hoa-project.net/">Hoa</a>,
<a rel="noopener external" target="_blank" href="https://atoum.org/">atoum</a>, and
<a rel="noopener external" target="_blank" href="https://github.com/FriendsOfPHP/pickle">Pickle</a> when joining the
company. Liip gave me a 5% budget, so roughly 1 day per month to work on
Hoa. Thank you for that!</p>
<p>After that, I started a new big project, called <a rel="noopener external" target="_blank" href="https://github.com/tagua-vm/tagua-vm">Tagua
VM</a>. They gave me an additional 5%
budget. So I got roughly 2 days per month to work on Hoa and Tagua VM. Again, thank
you for that!</p>
<p>Finally, I started an in-house open source project called <a rel="noopener external" target="_blank" href="https://github.com/liip/TheA11yMachine">The A11y
Machine</a> (a11ym for short). I
have written a case study for this tool on Liip's blog:
<a rel="noopener external" target="_blank" href="https://blog.liip.ch/archive/2016/12/06/accessibility-with-a11ym.html">Accessibility: make your website barrier-free with
a11ym!</a></p>
<p>The goal of a11ym is to automate the accessibility testing of any site
by crawling and testing each page. A sweet report is generated, showing
all errors, warnings, and notices, with all information needed by the
developer to fix the issues as fast as possible.</p>
<figure>
<p><img src="https://mnt.io/articles/bye-bye-liip-hello-automattic/./dashboard.jpg" alt="Dashboard" loading="lazy" decoding="async" /></p>
<figcaption>
<p>Dashboard of a11ym, showing the evolution of the accessibility of a site
in time</p>
</figcaption>
</figure>
<figure>
<p><img src="https://mnt.io/articles/bye-bye-liip-hello-automattic/./report.png" alt="Report" loading="lazy" decoding="async" /></p>
<figcaption>
<p>A typical a11ym report listing all errors, warnings, and notices for a
given URL</p>
</figcaption>
</figure>
<p>This project has received really good feedback from the accessibility
community. It has been downloaded 7000 times so far, which is not bad
considering the niche it targets.</p>
<p>A new SaaS platform is being built around this software. I enjoyed
working on it, and it was really tangible.</p>
<h3 id="">Main customer, huge project<a role="presentation" class="anchor" href="#" title="Anchor link to this header">#</a>
</h3>
<p>Liip is a Web agency, so you have dozens of customers at the same time.
However, I was in a special team for an important customer. The site is
a luxury watch and jewellery e-commerce platform, located in several
countries, in 10 languages, accessible from 16 domains, spread across 2
datacenters. This is not a casual site.</p>
<p>I learned a lot about all the complexity such a site brings: Checkout
rules (oh damn…), product catalogs in different formats for different
countries with different references, all the business logic inherent to
each country, different payment providers, crazy front-end
compatibility requirements etc.</p>
<p>I have a hundred crazy anecdotes to tell. This was clearly not a job
for me at first glance: I am a researcher, I have an open source culture
background, I am not tailored for this kind of project. But at the end
of the story, I learned a lot. Really a lot. I have a better overview of
the crazy things any customer can ask, or has to deal with, and the
infrastructure craziness that can be set up. I learned how to make
things better: How to transform really crappy software into something
understandable by everyone, how to not break a 10+ year old program with
no tests etc. And it requires skills. I learned it the hard way, but I
learned it.</p>
<h3 id="-1">Why leave?<a role="presentation" class="anchor" href="#-1" title="Anchor link to this header">#</a>
</h3>
<p>Because even if I learned a lot during my time at Liip, the Web agency model
was definitely not for me. I am very thankful to every Liiper, I had a
great time, I love the Web, but not in an agency.</p>
<p>My son is now 21 months old, and I need fresh air. I can take on new
challenges.</p>
<h2 id="-2">Welcome Automattic<a role="presentation" class="anchor" href="#-2" title="Anchor link to this header">#</a>
</h2>
<p><a rel="noopener external" target="_blank" href="https://automattic.com/">Automattic</a> is the company behind
<a rel="noopener external" target="_blank" href="https://wordpress.com/">WordPress.com</a>,
<a rel="noopener external" target="_blank" href="https://woocommerce.com/">WooCommerce</a>,
<a rel="noopener external" target="_blank" href="https://akismet.com/">Akismet</a>, <a rel="noopener external" target="_blank" href="https://simplenote.com/">Simplenote</a>,
<a rel="noopener external" target="_blank" href="https://cloudup.com/">Cloudup</a>, <a rel="noopener external" target="_blank" href="https://simperium.com">Simperium</a>,
<a rel="noopener external" target="_blank" href="http://en.gravatar.com/">Gravatar</a> and other giant services.</p>
<p>I came to Automattic by coincidence. I was looking for a sponsor for
Tagua VM, and someone pointed me to Automattic. After some research
about the company, it appeared that it could be a really great place
to work. So I applied.</p>
<p>The hiring process was 4 months long. It was exhausting because it
happened at the same time as a big sprint at Liip (remember the SaaS
platform for The A11y Machine?). But after 4 months, it appears I
succeeded, and I am very glad about it!</p>
<p>I am just starting my job at Automattic. I don't have anything strong
and definite to say now, apart from the fact that everything is just awesome so far. In
a few weeks, I am likely to write about my start at Automattic. Update: I did, see
<a href="https://mnt.io/articles/welcome-to-chaos/">Welcome to Chaos</a>. They
have a very interesting way to get you on board.</p>
<p>Time for a new adventure!</p>
DuckDuckGo in a Shell2015-08-05T00:00:00+00:002015-08-05T00:00:00+00:00
Unknown
https://mnt.io/articles/duckduckgo-in-a-shell/<h2 id="the-tip">The tip<a role="presentation" class="anchor" href="#the-tip" title="Anchor link to this header">#</a>
</h2>
<p>When I go outside my terminal, I am kind of lost. I control everything
from my terminal and I hate clicking. That's why I found a small tip
today to open a search on DuckDuckGo directly from the terminal. It
redirects me to my default browser in the background, which is the
expected behavior.</p>
<p>First, I create a function called <code>duckduckgo</code>:</p>
<pre class="giallo z-code"><code data-lang="shellscript"><span class="giallo-l"><span class="z-storage z-type z-function">function</span><span class="z-entity z-name z-function"> duckduckgo</span><span class="z-punctuation z-section"> {</span></span>
<span class="giallo-l"><span class="z-variable"> query</span><span class="z-keyword z-operator">=</span><span class="z-string">`</span><span class="z-entity z-name">php</span><span class="z-constant z-other"> -r</span><span class="z-string"> 'echo urlencode($argv[1]);' "</span><span class="z-variable z-parameter">$1</span><span class="z-string">"`</span></span>
<span class="giallo-l"><span class="z-entity z-name"> open</span><span class="z-string"> 'https://duckduckgo.com/?q='</span><span class="z-variable">$query</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span></span></code></pre>
<p>Note how I (avoid having to) deal with quotes in <code>$1</code>.</p>
<p>Then, I just have to create an alias called <code>?</code>:</p>
<pre class="giallo z-code"><code data-lang="shellscript"><span class="giallo-l"><span class="z-support z-function">alias</span><span class="z-string"> '?'='duckduckgo'</span></span></code></pre>
<p>And here we (duckduck) go!</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span class="z-keyword z-operator"> ?</span><span class="z-string"> "foo bar's baz"</span></span></code></pre>
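<p>To see what actually reaches the browser, here is a minimal Python sketch of the encoding step. It is only an illustration: the <code>duckduckgo_url</code> helper is hypothetical, and <code>quote_plus</code> stands in for PHP's <code>urlencode</code>, which it closely resembles (both turn spaces into <code>+</code> and quotes into <code>%27</code>, though they differ on a few characters such as <code>~</code>).</p>

```python
from urllib.parse import quote_plus


def duckduckgo_url(query):
    """Build the search URL, percent-encoding the query much like PHP's urlencode()."""
    return "https://duckduckgo.com/?q=" + quote_plus(query)


print(duckduckgo_url("foo bar's baz"))
# https://duckduckgo.com/?q=foo+bar%27s+baz
```

<p>The single quote never reaches the shell again once it is encoded, which is why the function does not have to care about it.</p>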
<p>You can <a rel="noopener external" target="_blank" href="https://github.com/Hywan/Dotfiles/commit/fab6d98448240a787eb0e34ab836c5c43d50379c">see the
commit</a>
that adds this to my “shell home framework”.</p>
<p>Oh, and to open the default browser, I use
<a rel="noopener external" target="_blank" href="https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/open.1.html"><code>open (1)</code></a>,
like this:</p>
<pre class="giallo z-code"><code data-lang="shellscript"><span class="giallo-l"><span class="z-storage">alias</span><span class="z-variable"> open</span><span class="z-keyword z-operator">=</span><span class="z-string">'open -g'</span></span></code></pre>
<p>Hope it helps!</p>
sabre/katana2015-07-13T00:00:00+00:002015-07-13T00:00:00+00:00
Unknown
https://mnt.io/articles/sabre-katana-a-contact-calendar-task-list-and-file-server/<figure>
<p><img src="https://mnt.io/articles/sabre-katana-a-contact-calendar-task-list-and-file-server/./logo-katana.png" alt="sabre/katana's logo" loading="lazy" decoding="async" /></p>
<figcaption>
<p>Project's logo.</p>
</figcaption>
</figure>
<h2 id="">What is it?<a role="presentation" class="anchor" href="#" title="Anchor link to this header">#</a>
</h2>
<p><code>sabre/katana</code> is a contact, calendar, task list and file server. What
does it mean? Nowadays, you probably have multiple devices (PC, phones,
tablets, TVs…). If you would like to get your address books, calendars,
task lists and files synced between all these devices from everywhere,
you need a server. All your devices are then considered as clients.</p>
<p>But there is an issue with the server. Most of the time, you might
choose <a rel="noopener external" target="_blank" href="https://google.com/">Google</a> or maybe
<a rel="noopener external" target="_blank" href="https://apple.com/">Apple</a>, but one may wonder: Can we trust these
servers? Can we give them our private data, like all our contacts, our
calendars, all our photos…? What if you are a company or an association
and you have sensitive data that are really private or strategic? So,
can you still trust them? Where are the data stored? Who can look at
these data? More and more, there is a huge need for “personal” servers.</p>
<p>Moreover, servers like Google or Apple are often closed: You reach your
data with specific clients and they are not available on all platforms.
This is for strategic reasons of course. But with <code>sabre/katana</code>, you
are not limited. See the above schema: Firefox OS can talk to iOS or
Android at the same time.</p>
<p><code>sabre/katana</code> is this kind of server. You can install it on your
machine and manage users in a minute. Each user will have a collection
of address books, calendars, task lists and files. This server can talk
to a <a rel="noopener external" target="_blank" href="https://fruux.com/supported-devices/">loong list of devices</a>,
mainly thanks to a scrupulous respect for industry standards:</p>
<ul>
<li>macOS:
<ul>
<li>OS X 10.10 (Yosemite),</li>
<li>OS X 10.9 (Mavericks),</li>
<li>OS X 10.8 (Mountain Lion),</li>
<li>OS X 10.7 (Lion),</li>
<li>OS X 10.6 (Snow Leopard),</li>
<li>OS X 10.5 (Leopard),</li>
<li>BusyCal,</li>
<li>BusyContacts,</li>
<li>Fantastical,</li>
<li>Rainlendar,</li>
<li>ReminderFox,</li>
<li>SoHo Organizer,</li>
<li>Spotlife,</li>
<li>Thunderbird,</li>
</ul>
</li>
<li>Windows:
<ul>
<li>eM Client,</li>
<li>Microsoft Outlook 2013,</li>
<li>Microsoft Outlook 2010,</li>
<li>Microsoft Outlook 2007,</li>
<li>Microsoft Outlook with Bynari WebDAV Collaborator,</li>
<li>Microsoft Outlook with iCal4OL,</li>
<li>Rainlendar,</li>
<li>ReminderFox,</li>
<li>Thunderbird,</li>
</ul>
</li>
<li>Linux:
<ul>
<li>Evolution,</li>
<li>Rainlendar,</li>
<li>ReminderFox,</li>
<li>Thunderbird,</li>
</ul>
</li>
<li>Mobile:
<ul>
<li>Android,</li>
<li>BlackBerry 10,</li>
<li>BlackBerry PlayBook,</li>
<li>Firefox OS,</li>
<li>iOS 8,</li>
<li>iOS 7,</li>
<li>iOS 6,</li>
<li>iOS 5,</li>
<li>iOS 4,</li>
<li>iOS 3,</li>
<li>Nokia N9,</li>
<li>Sailfish.</li>
</ul>
</li>
</ul>
<p>Did you find your device in this list? Probably yes 😉.</p>
<p><code>sabre/katana</code> sits in the middle of all your devices and syncs all
your data. Of course, it is <strong>free</strong> and <strong>open source</strong>. <a rel="noopener external" target="_blank" href="https://github.com/fruux/sabre-katana/">Go check the
source</a>!</p>
<h2 id="-1">List of features<a role="presentation" class="anchor" href="#-1" title="Anchor link to this header">#</a>
</h2>
<p>Here is a non-exhaustive list of features supported by <code>sabre/katana</code>.
Depending on whether you are a user or a developer, the features that might
interest you are radically different. I decided to show you a list
from the user point of view. If you would like to get a list from the
developer point of view, please see this <a rel="noopener external" target="_blank" href="http://sabre.io/dav/standards-support/">exhaustive list of supported
RFCs</a> for more details.</p>
<h3 id="-2">Contacts<a role="presentation" class="anchor" href="#-2" title="Anchor link to this header">#</a>
</h3>
<p>All usual fields are supported, like phone numbers, email addresses,
URLs, birthday, ringtone, texttone, related names, postal addresses,
notes, HD photos etc. Of course, groups of cards are also supported.</p>
<figure>
<p><img src="https://mnt.io/articles/sabre-katana-a-contact-calendar-task-list-and-file-server/./card-inside-macos-client.png" alt="My card on macOS" loading="lazy" decoding="async" /></p>
<figcaption>
<p>My card inside the native Contact application of macOS.</p>
</figcaption>
</figure>
<figure>
<p><img src="https://mnt.io/articles/sabre-katana-a-contact-calendar-task-list-and-file-server/./card-inside-firefox-os-client.png" alt="My card on Firefox OS" loading="lazy" decoding="async" /></p>
<figcaption>
<p>My card inside the native Contact application of Firefox OS.</p>
</figcaption>
</figure>
<p>My photo is not in HD, I really have to update it!</p>
<p>Cards can be encoded into several formats. The most common format is VCF.
<code>sabre/katana</code> allows you to download the whole address book of a user
as a single VCF file. You can also create, update and delete address
books.</p>
<h3 id="-3">Calendars<a role="presentation" class="anchor" href="#-3" title="Anchor link to this header">#</a>
</h3>
<p>A calendar is just a set of events. Each event has several properties,
such as a title, a location, a start date, an end date, some notes, URLs,
alarms etc. <code>sabre/katana</code> also supports recurring events (“each last
Monday of the month, at 11am…”), in addition to scheduling (see below).</p>
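<p>As an illustration, a rule like “each last Monday of the month” is written in iCalendar as the recurrence rule <code>FREQ=MONTHLY;BYDAY=-1MO</code>. Here is a minimal Python sketch of what expanding such a rule amounts to for one month. The <code>last_monday</code> helper is hypothetical and is not how <code>sabre/katana</code> computes recurrences.</p>

```python
import calendar
from datetime import date


def last_monday(year, month):
    """Return the date of the last Monday of the given month."""
    last_day = calendar.monthrange(year, month)[1]  # number of days in the month
    end = date(year, month, last_day)
    # weekday() is 0 for Monday, so stepping back by it lands on a Monday.
    return end.replace(day=last_day - end.weekday())


print(last_monday(2015, 7))
# 2015-07-27
```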
<figure>
<p><img src="https://mnt.io/articles/sabre-katana-a-contact-calendar-task-list-and-file-server/./calendars-inside-macos-client.png" alt="My calendars on macOS" loading="lazy" decoding="async" /></p>
<figcaption>
<p>My calendars inside the native Calendar application of macOS.</p>
</figcaption>
</figure>
<figure>
<p><img src="https://mnt.io/articles/sabre-katana-a-contact-calendar-task-list-and-file-server/./calendars-inside-firefox-os-client.png" alt="My calendars on Firefox OS" loading="lazy" decoding="async" /></p>
<figcaption>
<p>My calendars inside the native Calendar application of Firefox OS.</p>
</figcaption>
</figure>
<p>A few words about calendar scheduling. Let's say you are organizing an
event, like New release (we always enjoy release day!). You would like
to invite several people but you don't know whether they can be present or
not. In your event, all you have to do is to add attendees. How are they
going to be notified about this event? Two situations:</p>
<ol>
<li>Either attendees are registered on your <code>sabre/katana</code> server and they will
receive an invite inside their calendar application (we call this iTIP),</li>
<li>Or they are not registered on your server and they will receive an email with
the event as an attached file (we call this iMIP). All they have to do is to
open this event in their calendar application.</li>
</ol>
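<p>The two situations above boil down to a simple decision per attendee. Here is a minimal Python sketch of it. The <code>split_attendees</code> function and the addresses are hypothetical, used for illustration only; the real logic lives in the scheduling code of the server.</p>

```python
def split_attendees(attendees, registered_users):
    """Group attendee addresses by the mechanism used to notify them."""
    deliveries = {"iTIP": [], "iMIP": []}
    for attendee in attendees:
        # Registered users get the invite in their calendar (iTIP),
        # everyone else gets an email with the event attached (iMIP).
        kind = "iTIP" if attendee in registered_users else "iMIP"
        deliveries[kind].append(attendee)
    return deliveries


print(split_attendees(
    ["gordon@example.org", "hywan@example.org"],
    registered_users={"gordon@example.org"},
))
```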
<figure>
<p><img src="https://mnt.io/articles/sabre-katana-a-contact-calendar-task-list-and-file-server/./invite-by-email.png" alt="Typical mail to invite an attendee to an event" loading="lazy" decoding="async" /></p>
<figcaption>
<p>Invite an attendee by email because she is not registered on your
<code>sabre/katana</code> server.</p>
</figcaption>
</figure>
<p>Notice the gorgeous map embedded inside the email!</p>
<p>Once they have received the event, they can accept it, decline it, or answer “don't know”
(they will try to attend).</p>
<figure>
<p><img src="https://mnt.io/articles/sabre-katana-a-contact-calendar-task-list-and-file-server/./respond-to-invite.png" alt="Receive an invite to an event" loading="lazy" decoding="async" /></p>
<figcaption>
<p>Receive an invite to an event. Here: Gordon is inviting Hywan. Three
choices for Hywan:</p>
</figcaption>
</figure>
<figure>
<p><img src="https://mnt.io/articles/sabre-katana-a-contact-calendar-task-list-and-file-server/./accepted-event.png" alt="Status of all attendees" loading="lazy" decoding="async" /></p>
<figcaption>
<p>Hywan has accepted the event. Here is what the event looks like. Hywan
can see the response of each attendee.</p>
</figcaption>
</figure>
<figure>
<p><img src="https://mnt.io/articles/sabre-katana-a-contact-calendar-task-list-and-file-server/./notification.png" alt="Notification from attendees" loading="lazy" decoding="async" /></p>
<figcaption>
<p>Gordon is even notified that Hywan has accepted the event.</p>
</figcaption>
</figure>
<p>Of course, attendees will be notified too if the event has been moved,
canceled, refreshed etc.</p>
<p>Calendars can be encoded into several formats. The most common format is
ICS. <code>sabre/katana</code> allows you to download the whole calendar of a user
as a single ICS file. You can also create, update and delete calendars.</p>
<h3 id="-4">Task lists<a role="presentation" class="anchor" href="#-4" title="Anchor link to this header">#</a>
</h3>
<p>A task list is exactly like a calendar (from a programmatic point of
view). Instead of containing event objects, it contains todo objects.</p>
<p><code>sabre/katana</code> supports groups of tasks, reminders, progress etc.</p>
<figure>
<p><img src="https://mnt.io/articles/sabre-katana-a-contact-calendar-task-list-and-file-server/./tasks.png" alt="My task lists on macOS" loading="lazy" decoding="async" /></p>
<figcaption>
<p>My task lists inside the native Reminder application of macOS.</p>
</figcaption>
</figure>
<p>Just like calendars, task lists can be encoded into several formats,
including ICS. <code>sabre/katana</code> allows you to download the whole task list of
a user as a single ICS file. You can also create, update and delete task
lists.</p>
<h3 id="-5">Files<a role="presentation" class="anchor" href="#-5" title="Anchor link to this header">#</a>
</h3>
<p>Finally, <code>sabre/katana</code> creates a home collection per user: A personal
directory that can contain files and directories and… is synced between all
your devices (as usual 😄).</p>
<p><code>sabre/katana</code> also creates a special directory called <code>public/</code> which
is a public directory. Every file and directory stored inside this
directory is accessible to anyone who has the correct link. No listing
is provided, to protect your public data.</p>
<p>Just like contact, calendar and task list applications, you need a
client application to connect to your home collection on <code>sabre/katana</code>.</p>
<figure>
<p><img src="https://mnt.io/articles/sabre-katana-a-contact-calendar-task-list-and-file-server/./connect-to-dav.png" alt="Connect to a server in macOS" loading="lazy" decoding="async" /></p>
<figcaption>
<p>Connect to a server with the Finder application of macOS.</p>
</figcaption>
</figure>
<p>Then, your public directory on <code>sabre/katana</code> will be a regular
directory like any other.</p>
<figure>
<p><img src="https://mnt.io/articles/sabre-katana-a-contact-calendar-task-list-and-file-server/./files.png" alt="List of my files" loading="lazy" decoding="async" /></p>
<figcaption>
<p>List of my files, right here in the Finder application of macOS.</p>
</figcaption>
</figure>
<p><code>sabre/katana</code> is able to store any kind of file. Yes, any kind. It's
just files. However, it white-lists the kinds of files that can be shown
in the browser. Only images, audio, video, text, PDF and some vendor
formats (like Microsoft Office) are considered as safe (for the server).
This way, associations can share music, videos or images, companies can
share PDF or Microsoft Word documents etc. Maybe in the future
<code>sabre/katana</code> might white-list more formats. If a format is not
white-listed, the browser is forced to download the file instead of displaying it.</p>
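<p>A white-list of this kind can be sketched in a few lines of Python. This is an assumption-laden illustration: the list of types below is made up, and the real list used by <code>sabre/katana</code> is not given in this article. The idea is simply to answer with an <code>inline</code> content disposition for safe types, and with <code>attachment</code> (a forced download) for everything else.</p>

```python
# Hypothetical white-list, for illustration only.
SAFE_PREFIXES = ("image/", "audio/", "video/")
SAFE_TYPES = {"text/plain", "application/pdf", "application/msword"}


def content_disposition(mime_type):
    """Return "inline" for white-listed types, "attachment" otherwise."""
    if mime_type.startswith(SAFE_PREFIXES) or mime_type in SAFE_TYPES:
        return "inline"
    return "attachment"


print(content_disposition("image/png"))         # inline
print(content_disposition("application/x-sh"))  # attachment
```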
<h2 id="sabre-katana">How is <code>sabre/katana</code> built?<a role="presentation" class="anchor" href="#sabre-katana" title="Anchor link to this header">#</a>
</h2>
<p><code>sabre/katana</code> is based on two big and solid projects:</p>
<ol>
<li><a rel="noopener external" target="_blank" href="http://sabre.io/"><code>sabre/dav</code></a>,</li>
<li><a rel="noopener external" target="_blank" href="http://hoa-project.net/">Hoa</a>.</li>
</ol>
<p><code>sabre/dav</code> is one of the most powerful
<a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/CardDAV">CardDAV</a>,
<a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/CalDAV">CalDAV</a> and
<a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/WebDAV">WebDAV</a> frameworks on the planet.
Trusted by the likes of <a rel="noopener external" target="_blank" href="https://www.atmail.com/">Atmail</a>,
<a rel="noopener external" target="_blank" href="https://www.box.com/blog/in-search-of-an-open-source-webdav-solution/">Box</a>,
<a rel="noopener external" target="_blank" href="https://fruux.com/">fruux</a> and <a rel="noopener external" target="_blank" href="http://owncloud.org/">ownCloud</a>, it
powers millions of users world-wide! It is written in PHP and is open
source.</p>
<p>Hoa is a modular, extensible and structured set of PHP libraries. Fun
fact: it is open source too, and it is also trusted by
<a rel="noopener external" target="_blank" href="http://owncloud.org/">ownCloud</a>, in addition to
<a rel="noopener external" target="_blank" href="http://mozilla.org/">Mozilla</a>, <a rel="noopener external" target="_blank" href="http://jolicode.com/">joliCode</a> etc.
Recently, this project has recorded more than 600,000 downloads and the
community is about to reach 1000 people.</p>
<p><code>sabre/katana</code> is then a program based on <code>sabre/dav</code> for the DAV part
and Hoa for everything else, like the logic code inside the
<code>sabre/dav</code>'s plugins. The result is a ready-to-use server with a nice
interface for the administration.</p>
<p>To ensure code quality, we use <a rel="noopener external" target="_blank" href="http://atoum.org/">atoum</a>, a popular and
modern test framework for PHP. So far, <code>sabre/katana</code> has more than
1000 assertions.</p>
<h2 id="-6">Conclusion<a role="presentation" class="anchor" href="#-6" title="Anchor link to this header">#</a>
</h2>
<p><code>sabre/katana</code> is a server for contacts, calendars, task lists and
files. Everything is synced, every time and everywhere. It connects
to a lot of devices on the market. Several features we need and
use daily have been presented. This is an easy and secure way to
host your own private data.</p>
<p><a rel="noopener external" target="_blank" href="https://github.com/fruux/sabre-katana">Go download it</a>!</p>
RFCs should provide executable test suites2015-02-27T00:00:00+00:002015-02-27T00:00:00+00:00
Unknown
https://mnt.io/articles/rfcs-should-provide-executable-test-suites/<p>Recently, I implemented xCal and xCard formats inside the <code>sabre/dav</code> libraries.
While testing the different RFCs against my implementation, several errata have
been found. This article first quickly lists them and then asks questions
about how such errors can be present and how they can be easily revealed. If
reading my dry humor about RFC errata is boring, the next sections are more
interesting. The whole idea is: Why do RFCs not provide executable test suites?</p>
<h2 id="what-is-xcal-and-xcard">What are xCal and xCard?<a role="presentation" class="anchor" href="#what-is-xcal-and-xcard" title="Anchor link to this header">#</a>
</h2>
<p>The Web is a read-only medium. It is based on the HTTP protocol. However,
there is the <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/WebDAV">WebDAV</a> protocol,
standing for Web Distributed Authoring and Versioning. This is an
extension to HTTP. <em>Et voilà !</em> The Web is a read and write medium.
WebDAV is standardized in <a rel="noopener external" target="_blank" href="https://tools.ietf.org/html/rfc2518">RFC2518</a>
and <a rel="noopener external" target="_blank" href="https://tools.ietf.org/html/rfc4918">RFC4918</a>.</p>
<p>Based on WebDAV, we have <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/CalDAV">CalDAV</a>
and <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/CardDAV">CardDAV</a>, respectively for
reading and writing calendars and addressbooks. They are standardized in
<a rel="noopener external" target="_blank" href="https://tools.ietf.org/html/rfc4791">RFC4791</a>,
<a rel="noopener external" target="_blank" href="https://tools.ietf.org/html/rfc6638">RFC6638</a> and
<a rel="noopener external" target="_blank" href="https://tools.ietf.org/html/rfc6352">RFC6352</a>. Good! But these
protocols only explain how to read and write, not how to represent a
real calendar or an addressbook. So let's leave protocols for formats.</p>
<p>The <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/ICalendar">iCalendar</a> format
represents calendar components, like events (<code>VEVENT</code>), tasks (<code>VTODO</code>),
journal entries (<code>VJOURNAL</code>, very rare…), free/busy time (<code>VFREEBUSY</code>)
etc. The <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/VCard">vCard</a> format represents
cards. The formats are very similar and share a common ancestry: This is
a <strong>horrible</strong> line-based, colon- and semicolon-separated, randomly-escaped
format. For instance:</p>
<pre class="giallo z-code"><code data-lang="plain"><span class="giallo-l"><span>BEGIN:VCALENDAR</span></span>
<span class="giallo-l"><span>VERSION:2.0</span></span>
<span class="giallo-l"><span>CALSCALE:GREGORIAN</span></span>
<span class="giallo-l"><span>PRODID:-//Example Inc.//Example Calendar//EN</span></span>
<span class="giallo-l"><span>BEGIN:VEVENT</span></span>
<span class="giallo-l"><span>DTSTAMP:20080205T191224Z</span></span>
<span class="giallo-l"><span>DTSTART;VALUE=DATE:20081006</span></span>
<span class="giallo-l"><span>SUMMARY:Planning meeting</span></span>
<span class="giallo-l"><span>UID:4088E990AD89CB3DBB484909</span></span>
<span class="giallo-l"><span>END:VEVENT</span></span>
<span class="giallo-l"><span>END:VCALENDAR</span></span></code></pre>
<p>Horrible, yes. You were warned. These formats are standardized in
several RFCs, to list some of them:
<a rel="noopener external" target="_blank" href="https://tools.ietf.org/html/rfc5545">RFC5545</a>,
<a rel="noopener external" target="_blank" href="http://tools.ietf.org/html/rfc2426">RFC2426</a> and
<a rel="noopener external" target="_blank" href="http://tools.ietf.org/html/rfc6350">RFC6350</a>.</p>
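<p>To make the “line-, colon- and semicolon-” part concrete, here is a deliberately naive Python sketch that splits one content line from the example above into a name, its parameters and a value. The <code>parse_content_line</code> function is hypothetical. A real parser must also handle line folding, quoted parameter values (which may contain colons) and escaping, and this is where the pain starts.</p>

```python
def parse_content_line(line):
    """Split one unfolded content line into (name, params, value)."""
    # Everything before the first colon is the name and its parameters.
    head, _, value = line.partition(":")
    # Parameters are separated from the name, and from each other, by semicolons.
    name, *raw_params = head.split(";")
    params = dict(param.split("=", 1) for param in raw_params)
    return name.upper(), params, value


print(parse_content_line("DTSTART;VALUE=DATE:20081006"))
# ('DTSTART', {'VALUE': 'DATE'}, '20081006')
```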
<p>This format is impossible to read, even for a computer. That's why we
have jCal and jCard, which are respectively another representation of
iCalendar and vCard but in <a rel="noopener external" target="_blank" href="http://json.org/">JSON</a>. JSON is quite
popular on the Web today, especially because it eases the manipulation
and exchange of data in JavaScript. This is just a very simple, and
—from my point of view— human readable, serialization format. jCal and
jCard are respectively standardized in
<a rel="noopener external" target="_blank" href="http://tools.ietf.org/html/rfc7265">RFC7265</a> and
<a rel="noopener external" target="_blank" href="http://tools.ietf.org/html/rfc7095">RFC7095</a>. Thus, the equivalent of
the previous iCalendar example in jCal is:</p>
<pre class="giallo z-code"><code data-lang="json"><span class="giallo-l"><span>[</span></span>
<span class="giallo-l"><span class="z-string"> "vcalendar"</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span> [</span></span>
<span class="giallo-l"><span> [</span><span class="z-string">"version"</span><span class="z-punctuation z-separator">,</span><span> {}</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "text"</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "2.0"</span><span>]</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span> [</span><span class="z-string">"calscale"</span><span class="z-punctuation z-separator">,</span><span> {}</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "text"</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "GREGORIAN"</span><span>]</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span> [</span><span class="z-string">"prodid"</span><span class="z-punctuation z-separator">,</span><span> {}</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "text"</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "-</span><span class="z-constant z-character">\/\/</span><span class="z-string">Example Inc.</span><span class="z-constant z-character">\/\/</span><span class="z-string">Example Calendar</span><span class="z-constant z-character">\/\/</span><span class="z-string">EN"</span><span>]</span></span>
<span class="giallo-l"><span> ]</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span> [</span></span>
<span class="giallo-l"><span> [</span></span>
<span class="giallo-l"><span class="z-string"> "vevent"</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span> [</span></span>
<span class="giallo-l"><span> [</span><span class="z-string">"dtstamp"</span><span class="z-punctuation z-separator">,</span><span> {}</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "date-time"</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "2008-02-05T19:12:24Z"</span><span>]</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span> [</span><span class="z-string">"dtstart"</span><span class="z-punctuation z-separator">,</span><span> {}</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "date"</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "2008-10-06"</span><span>]</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span> [</span><span class="z-string">"summary"</span><span class="z-punctuation z-separator">,</span><span> {}</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "text"</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "Planning meeting"</span><span>]</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span> [</span><span class="z-string">"uid"</span><span class="z-punctuation z-separator">,</span><span> {}</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "text"</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "4088E990AD89CB3DBB484909"</span><span>]</span></span>
<span class="giallo-l"><span> ]</span></span>
<span class="giallo-l"><span> ]</span></span>
<span class="giallo-l"><span> ]</span></span>
<span class="giallo-l"><span>]</span></span></code></pre>
<p>Much better. But this is JSON, which is a rather loose format, so we
also have xCal and xCard, another representation of iCalendar and vCard,
but in <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/XML">XML</a>. They are standardized
in <a rel="noopener external" target="_blank" href="https://tools.ietf.org/html/rfc6321">RFC6321</a> and
<a rel="noopener external" target="_blank" href="https://tools.ietf.org/html/rfc6351">RFC6351</a>. The same example in xCal
looks like this:</p>
<pre class="giallo z-code"><code data-lang="xml"><span class="giallo-l"><span class="z-punctuation z-definition z-tag"><</span><span class="z-entity z-name z-tag">icalendar</span><span class="z-entity z-other z-attribute-name"> xmlns</span><span>=</span><span class="z-string">"urn:ietf:params:xml:ns:icalendar-2.0"</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">vcalendar</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">properties</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">version</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">text</span><span class="z-punctuation z-definition z-tag">></span><span>2.0</span><span class="z-punctuation z-definition z-tag"></</span><span class="z-entity z-name z-tag">text</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> </</span><span class="z-entity z-name z-tag">version</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">calscale</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">text</span><span class="z-punctuation z-definition z-tag">></span><span>GREGORIAN</span><span class="z-punctuation z-definition z-tag"></</span><span class="z-entity z-name z-tag">text</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> </</span><span class="z-entity z-name z-tag">calscale</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">prodid</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">text</span><span class="z-punctuation z-definition z-tag">></span><span>-//Example Inc.//Example Calendar//EN</span><span class="z-punctuation z-definition z-tag"></</span><span class="z-entity z-name z-tag">text</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> </</span><span class="z-entity z-name z-tag">prodid</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> </</span><span class="z-entity z-name z-tag">properties</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">components</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">vevent</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">properties</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">dtstamp</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">date-time</span><span class="z-punctuation z-definition z-tag">></span><span>2008-02-05T19:12:24Z</span><span class="z-punctuation z-definition z-tag"></</span><span class="z-entity z-name z-tag">date-time</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> </</span><span class="z-entity z-name z-tag">dtstamp</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">dtstart</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">date</span><span class="z-punctuation z-definition z-tag">></span><span>2008-10-06</span><span class="z-punctuation z-definition z-tag"></</span><span class="z-entity z-name z-tag">date</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> </</span><span class="z-entity z-name z-tag">dtstart</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">summary</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">text</span><span class="z-punctuation z-definition z-tag">></span><span>Planning meeting</span><span class="z-punctuation z-definition z-tag"></</span><span class="z-entity z-name z-tag">text</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> </</span><span class="z-entity z-name z-tag">summary</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">uid</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> <</span><span class="z-entity z-name z-tag">text</span><span class="z-punctuation z-definition z-tag">></span><span>4088E990AD89CB3DBB484909</span><span class="z-punctuation z-definition z-tag"></</span><span class="z-entity z-name z-tag">text</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> </</span><span class="z-entity z-name z-tag">uid</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> </</span><span class="z-entity z-name z-tag">properties</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> </</span><span class="z-entity z-name z-tag">vevent</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> </</span><span class="z-entity z-name z-tag">components</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"> </</span><span class="z-entity z-name z-tag">vcalendar</span><span class="z-punctuation z-definition z-tag">></span></span>
<span class="giallo-l"><span class="z-punctuation z-definition z-tag"></</span><span class="z-entity z-name z-tag">icalendar</span><span class="z-punctuation z-definition z-tag">></span></span></code></pre>
<p>More semantics, more meaning, easier to read (from my point of view),
namespaces… It is very easy to <strong>embed</strong> xCal and xCard inside other XML
formats.</p>
<p>Managing all these formats is an extremely laborious task. I suggest you
take a look at <a rel="noopener external" target="_blank" href="http://sabre.io/vobject/"><code>sabre/vobject</code></a> (see <a rel="noopener external" target="_blank" href="https://github.com/fruux/sabre-vobject/">the
GitHub repository of
<code>sabre/vobject</code></a>). This is a
PHP library to manage all these weird formats. The following example shows
how to read from iCalendar and write to jCal and xCal:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// Read iCalendar.</span></span>
<span class="giallo-l"><span class="z-variable">$document</span><span class="z-keyword z-operator"> =</span><span> Sabre</span><span class="z-punctuation z-separator">\</span><span>VObject</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Reader</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">read</span><span>(</span><span class="z-variable">$icalendar</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// Write jCal.</span></span>
<span class="giallo-l"><span class="z-support z-function">echo</span><span> Sabre</span><span class="z-punctuation z-separator">\</span><span>VObject</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Writer</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">writeJson</span><span>(</span><span class="z-variable">$document</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// Write xCal.</span></span>
<span class="giallo-l"><span class="z-support z-function">echo</span><span> Sabre</span><span class="z-punctuation z-separator">\</span><span>VObject</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Writer</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">writeXml</span><span>(</span><span class="z-variable">$document</span><span>)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>Magic, when you know the complexity of these formats (in terms of both
parsing and validation)!</p>
<h2 id="list-of-errata">List of errata<a role="presentation" class="anchor" href="#list-of-errata" title="Anchor link to this header">#</a>
</h2>
<p>Now, let's talk about all the errata I submitted recently:</p>
<ul>
<li><a rel="noopener external" target="_blank" href="http://www.rfc-editor.org/errata_search.php?eid=4241">4241, in
RFC6351</a>
(xCard),</li>
<li><a rel="noopener external" target="_blank" href="http://www.rfc-editor.org/errata_search.php?eid=4243">4243, in
RFC6351</a>
(xCard),</li>
<li><a rel="noopener external" target="_blank" href="http://www.rfc-editor.org/errata_search.php?eid=4246">4246, in
RFC6350</a>
(vCard),</li>
<li><a rel="noopener external" target="_blank" href="http://www.rfc-editor.org/errata_search.php?eid=4247">4247, in
RFC6351</a>
(xCard),</li>
<li><a rel="noopener external" target="_blank" href="http://www.rfc-editor.org/errata_search.php?eid=4245">4245, in
RFC6350</a>
(vCard),</li>
<li><a rel="noopener external" target="_blank" href="http://www.rfc-editor.org/errata_search.php?eid=4261">4261, in
RFC6350</a>
(vCard).</li>
</ul>
<p>The last two are reported but not yet verified.</p>
<p>4241, 4243 and 4246 are just typos in examples. “<em>Just</em>” is a bit of an
understatement when you have been reading RFCs for days straight, with 10
of them open in your browser, trying to figure out how everything
fits together and whether you are doing everything correctly. Finding typos
at that point in the process can be very confusing…</p>
<p>4247 is more subtle. The RFC about xCard comes with an <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/XML_Schema_%28W3C%29">XML
Schema</a>. That's
great! It will help us to test our documents and see if they are valid
or not! No? No.</p>
<p>Most of the time, I try to relax and deal with problems as they come.
But the date and time format in iCalendar, vCard, jCal, jCard, xCal and
xCard can make my blood boil in a second. In what world, exactly, is <code>--10</code>
or <code>---28</code> a conceivable date and time format? How long did I sleep?
“Well,” I was saying to myself, “no need to make a drama, we have the XML
Schema!” No. Because there is an error in the schema. More precisely,
in a regular expression:</p>
<pre class="giallo z-code"><code data-lang="plain"><span class="giallo-l"><span>value-time = element time {</span></span>
<span class="giallo-l"><span> xsd:string { pattern = "(\d\d(\d\d(\d\d)?)?|-\d\d(\d\d?)|--\d\d)"</span></span>
<span class="giallo-l"><span> ~ "(Z|[+\-]\d\d(\d\d)?)?" }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Did you find the error? <code>(\d\d?)</code> is wrong; it should be <code>(\d\d)?</code>. Don't
get me wrong: everyone makes mistakes, but not this kind of mistake. I
will explain why in the next section.</p>
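<p>To see why this typo matters, here is a minimal sketch that runs both versions of the time pattern through PHP's <code>preg_match</code>. The helper <code>matchesTime</code>, the constant names and the <code>^(?:…)$</code> anchoring are assumptions made for the illustration (a schema pattern applies to the whole value); none of them comes from the RFC:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>// First part of the time pattern, as published and as corrected by erratum 4247.</span></span>
<span class="giallo-l"><span>const PUBLISHED = '(\d\d(\d\d(\d\d)?)?|-\d\d(\d\d?)|--\d\d)';</span></span>
<span class="giallo-l"><span>const CORRECTED = '(\d\d(\d\d(\d\d)?)?|-\d\d(\d\d)?|--\d\d)';</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>// Optional time zone suffix, identical in both versions.</span></span>
<span class="giallo-l"><span>const ZONE = '(Z|[+\-]\d\d(\d\d)?)?';</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>// Hypothetical helper: does `$value` match the whole pattern?</span></span>
<span class="giallo-l"><span>function matchesTime(string $first, string $value): bool</span></span>
<span class="giallo-l"><span>{</span></span>
<span class="giallo-l"><span>    return preg_match('/^(?:' . $first . ZONE . ')$/D', $value) === 1;</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>// `-10` (minutes only) must be valid, `-102` must not.</span></span>
<span class="giallo-l"><span>var_dump(matchesTime(PUBLISHED, '-10'));  // bool(false): rejected by mistake</span></span>
<span class="giallo-l"><span>var_dump(matchesTime(PUBLISHED, '-102')); // bool(true): accepted by mistake</span></span>
<span class="giallo-l"><span>var_dump(matchesTime(CORRECTED, '-10'));  // bool(true)</span></span>
<span class="giallo-l"><span>var_dump(matchesTime(CORRECTED, '-102')); // bool(false)</span></span></code></pre>
<p>With the misplaced <code>?</code>, the third digit becomes mandatory and the fourth optional, so the published pattern refuses a valid value and accepts an invalid one.</p>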
<p>4245 is not an editorial error but a technical one, under review.</p>
<p>4261 is crazy. It deserves a whole sub-section.</p>
<h3 id="welcome-in-the-crazy-world-of-date-and-time-formats">Welcome to the crazy world of date and time formats<a role="presentation" class="anchor" href="#welcome-in-the-crazy-world-of-date-and-time-formats" title="Anchor link to this header">#</a>
</h3>
<p>There are two major popular date and time formats:
<a rel="noopener external" target="_blank" href="http://tools.ietf.org/html/rfc2822">RFC2822</a> and ISO.8601. Examples:</p>
<ul>
<li><code>Fri, 27 Feb 2015 16:06:58 +0100</code> and</li>
<li><code>2015-02-27T16:07:16+01:00</code>.</li>
</ul>
<p>The second one is a good candidate for a computer representation: no
locale, only digits, all the information is present…</p>
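<p>As a small illustration, PHP can write the same instant in both formats with its built-in <code>DateTimeImmutable</code> class and the <code>DATE_RFC2822</code> and <code>DATE_ATOM</code> constants. This is only a sketch; <code>DATE_ATOM</code> is used here as one common ISO.8601-style profile, which is my choice for the example:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>// Parse the second example; the time zone offset is part of the string.</span></span>
<span class="giallo-l"><span>$date = new DateTimeImmutable('2015-02-27T16:07:16+01:00');</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>// Same instant, written in both formats.</span></span>
<span class="giallo-l"><span>echo $date->format(DATE_RFC2822), "\n"; // Fri, 27 Feb 2015 16:07:16 +0100</span></span>
<span class="giallo-l"><span>echo $date->format(DATE_ATOM), "\n";    // 2015-02-27T16:07:16+01:00</span></span></code></pre>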
<p>Maybe you noticed there is no link for ISO.8601. Why? Because ISO
standards are not free and I don't want <a rel="noopener external" target="_blank" href="http://www.iso.org/iso/catalogue_detail?csnumber=40874">to pay
140€</a> to buy a
standard…</p>
<p>The date and time format adopted by iCalendar and vCard (and the rest of
the family) is ISO.8601.2004. I cannot read it. However, as we said,
xCard comes with an XML Schema, so we can read this (after having applied
erratum 4247):</p>
<pre class="giallo z-code"><code data-lang="plain"><span class="giallo-l"><span># 4.3.1</span></span>
<span class="giallo-l"><span>value-date = element date {</span></span>
<span class="giallo-l"><span> xsd:string { pattern = "\d{8}|\d{4}-\d\d|--\d\d(\d\d)?|---\d\d" }</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span># 4.3.2</span></span>
<span class="giallo-l"><span>value-time = element time {</span></span>
<span class="giallo-l"><span> xsd:string { pattern = "(\d\d(\d\d(\d\d)?)?|-\d\d(\d\d)?|--\d\d)"</span></span>
<span class="giallo-l"><span> ~ "(Z|[+\-]\d\d(\d\d)?)?" }</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span># 4.3.3</span></span>
<span class="giallo-l"><span>value-date-time = element date-time {</span></span>
<span class="giallo-l"><span> xsd:string { pattern = "(\d{8}|--\d{4}|---\d\d)T\d\d(\d\d(\d\d)?)?"</span></span>
<span class="giallo-l"><span> ~ "(Z|[+\-]\d\d(\d\d)?)?" }</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span># 4.3.4</span></span>
<span class="giallo-l"><span>value-date-and-or-time = value-date | value-date-time | value-time</span></span></code></pre>
<p>Question: <strong>is <code>--10</code> October or 10 seconds</strong>?</p>
<p><code>--10</code> can fit into <code>value-date</code> and <code>value-time</code>:</p>
<ul>
<li>From <code>value-date</code>, the 3rd element in the disjunction is <code>--\d\d(\d\d)?</code>, so
it matches <code>--10</code>,</li>
<li>From <code>value-time</code>, the last element in the first disjunction is <code>--\d\d</code>, so
it matches <code>--10</code>.</li>
</ul>
<p>If we have a date-and-or-time value, <code>value-date</code> comes first, so <code>--10</code>
is always October. Nevertheless, if we have a time value, <code>--10</code> is
10 seconds. Crazy now?</p>
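<p>The ambiguity is easy to reproduce. The sketch below copies patterns 4.3.1 and 4.3.2 from the schema above into PHP regular expressions; the helper <code>matchingTypes</code>, the constant names and the anchoring are assumptions made for the illustration:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>// Patterns 4.3.1 and 4.3.2, anchored to match the whole value.</span></span>
<span class="giallo-l"><span>const VALUE_DATE = '/^(?:\d{8}|\d{4}-\d\d|--\d\d(\d\d)?|---\d\d)$/D';</span></span>
<span class="giallo-l"><span>const VALUE_TIME = '/^(?:(\d\d(\d\d(\d\d)?)?|-\d\d(\d\d)?|--\d\d)(Z|[+\-]\d\d(\d\d)?)?)$/D';</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>// Hypothetical helper: list every value type that accepts `$value`.</span></span>
<span class="giallo-l"><span>function matchingTypes(string $value): array</span></span>
<span class="giallo-l"><span>{</span></span>
<span class="giallo-l"><span>    $types = [];</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>    if (preg_match(VALUE_DATE, $value) === 1) {</span></span>
<span class="giallo-l"><span>        $types[] = 'date';</span></span>
<span class="giallo-l"><span>    }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>    if (preg_match(VALUE_TIME, $value) === 1) {</span></span>
<span class="giallo-l"><span>        $types[] = 'time';</span></span>
<span class="giallo-l"><span>    }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>    return $types;</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>echo implode(', ', matchingTypes('--10')), "\n";  // date, time</span></span>
<span class="giallo-l"><span>echo implode(', ', matchingTypes('---28')), "\n"; // date</span></span></code></pre>
<p>Only the surrounding element, or the order of the disjunction, decides which interpretation wins for <code>--10</code>.</p>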
<p>Oh, and XML has its own date and time format, which is well-defined and
standardized. Why should we drag this crazy format along?</p>
<p>Oh, and I assume every format depending on ISO.8601.2004 has this bug.
But I am not sure because ISO standards are not free.</p>
<h2 id="how-can-rfcs-have-such-errors">How can RFCs have such errors?<a role="presentation" class="anchor" href="#how-can-rfcs-have-such-errors" title="Anchor link to this header">#</a>
</h2>
<p>So far, RFCs are textual standards. Great. But they are just text,
written by humans, and thus subject to errors and omissions. The process is
even error-prone. I do not understand: why does an RFC not come with
an <strong>executable test suite</strong>? I am pretty sure every reader of an RFC
will try to create a test suite on their own.</p>
<p>I assume xCal and xCard formats are not yet very popular. Consequently,
few people have read the RFCs and tried to write an implementation. This is my
guess. However, it does not change the fact that an executable test suite
should (must?) be provided.</p>
<h2 id="how-did-i-find-them">How did I find them?<a role="presentation" class="anchor" href="#how-did-i-find-them" title="Anchor link to this header">#</a>
</h2>
<p>This is how I found these errors: I wrote <a rel="noopener external" target="_blank" href="https://github.com/fruux/sabre-vobject/blob/master/tests/VObject/Parser/XmlTest.php">a test suite for xCal and
xCard in
<code>sabre/vobject</code></a>.
I would love to write an implementation-agnostic test suite, but I
ran out of time. This is basically format transformation: R:x→y where R
can be a reflexive operator or not (depending on the versions of
iCalendar and vCard we consider).</p>
<p>For “simple” errata, I found the errors by testing manually. For errata 4247
and 4261 (with the regular expressions), I found the errors by applying the
algorithms presented in
<a href="https://mnt.io/articles/generate-strings-based-on-regular-expressions/">Generate strings based on regular expressions</a>.</p>
<h2 id="conclusion">Conclusion<a role="presentation" class="anchor" href="#conclusion" title="Anchor link to this header">#</a>
</h2>
<p><code>sabre/vobject</code> supports xCal and xCard.</p>
Control the terminal, the right way2015-01-04T00:00:00+00:002015-01-04T00:00:00+00:00
Unknown
https://mnt.io/articles/control-the-terminal-the-right-way/<p>Nowadays, there are plenty of terminal emulators in the wild. Each one has
a specific way to handle controls. How many colours does it support? How to
control the style of a character? How to control more than style, like the
cursor or the window? In this article, we are going to explain and show in
action the right ways to control your terminal with a portable and
easy-to-maintain API. We are going to talk about <code>stat</code>, <code>tput</code>, <code>terminfo</code>, <code>hoa/console</code>… but do not be afraid, it's easy and fun!</p>
<h2 id="introduction">Introduction<a role="presentation" class="anchor" href="#introduction" title="Anchor link to this header">#</a>
</h2>
<p>Terminals. They are ancient interfaces, yet still not old-fashioned.
They are fast, efficient, secure, very simple to use, and they work
remotely over low bandwidth.</p>
<p>A terminal is a canvas composed of columns and lines. Only one character
fits at a position. Depending on the terminal, some features are
enabled; for instance, a character might be stylized with a colour, a
decoration, a weight etc. Let's consider the first one. A colour belongs to
a palette, which contains either 2, 8, 256 or more colours. One may
wonder:</p>
<ul>
<li>How many colours does a terminal support?</li>
<li>How to control the style of a character?</li>
<li>How to control more than style, like the cursor or the window?</li>
</ul>
<p>Well, this article is going to explain how a terminal works and how we
interact with it. We are going to talk about terminal capabilities,
terminal information (stored in a database) and
<a rel="noopener external" target="_blank" href="http://github.com/hoaproject/Console"><code>Hoa\Console</code></a>,
a PHP library that provides advanced terminal controls.</p>
<h2 id="the-basis-of-a-terminal">The basis of a terminal<a role="presentation" class="anchor" href="#the-basis-of-a-terminal" title="Anchor link to this header">#</a>
</h2>
<p>A terminal, or a console, is an interface that allows us to interact with
the computer. This interface is textual. Like a graphical interface,
there are inputs: the keyboard and the mouse, and outputs: the screen or
a file (a real file, a socket, a FIFO, something else…).</p>
<p>There are tons of terminals. The most famous ones are:</p>
<ul>
<li><a rel="noopener external" target="_blank" href="http://invisible-island.net/xterm/xterm.html">xterm</a>,</li>
<li><a rel="noopener external" target="_blank" href="http://iterm2.com/">iTerm2</a>,</li>
<li><a rel="noopener external" target="_blank" href="http://software.schmorp.de/pkg/rxvt-unicode.html">urxvt</a>,</li>
<li><a rel="noopener external" target="_blank" href="http://ttssh2.sourceforge.jp/">TeraTerm</a>.</li>
</ul>
<p>Whatever terminal you use, inputs are handled by programs (or
processes) and outputs are produced by them. We said outputs
can be the screen or a file. Actually, everything is a file, so the
screen is also a file. However, the user is able to use
<a rel="noopener external" target="_blank" href="http://gnu.org/software/bash/manual/bashref.html#Redirections">redirections</a>
to choose where the outputs must go.</p>
<p>Let's consider the <code>echo</code> program that prints all its options/arguments
on its output. Thus, in the following example, <code>foobar</code> is printed on
the screen:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> echo </span><span class="z-string">'foobar'</span></span></code></pre>
<p>And in the following example, <code>foobar</code> is redirected to a file called
<code>log</code>:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> echo </span><span class="z-string">'foobar'</span><span class="z-keyword z-operator"> ></span><span> log</span></span></code></pre>
<p>We are also able to redirect the output to another program, like <code>wc</code>
that counts stuff:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> echo </span><span class="z-string">'foobar'</span><span class="z-keyword z-operator"> |</span><span class="z-entity z-name"> wc</span><span class="z-constant z-other"> -c</span></span>
<span class="giallo-l"><span>7</span></span></code></pre>
<p>Now we know there are 7 characters in <code>foobar</code>… no! <code>echo</code> automatically
appends a newline (<code>\n</code>) to its output; so:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> echo -n </span><span class="z-string">'foobar'</span><span class="z-keyword z-operator"> |</span><span class="z-entity z-name"> wc</span><span class="z-constant z-other"> -c</span></span>
<span class="giallo-l"><span>6</span></span></code></pre>
<p>This is more correct!</p>
<h2 id="detecting-type-of-pipes">Detecting type of pipes<a role="presentation" class="anchor" href="#detecting-type-of-pipes" title="Anchor link to this header">#</a>
</h2>
<p>Inputs and outputs are called <strong>pipes</strong>. Yes, trivial, these are nothing
more than basic pipes!</p>
<p>There are 3 standard pipes:</p>
<ul>
<li><code>STDIN</code>, standing for the standard input pipe,</li>
<li><code>STDOUT</code>, standing for the standard output pipe and</li>
<li><code>STDERR</code>, standing for the standard error pipe (also an output one).</li>
</ul>
<p>If the output is attached to the screen, we say this is a “direct
output”. Why is it important? Because if we stylize a text, this is
<strong>only for the screen</strong>, not for a file. A file should receive regular
text, not all the decorations and styles.</p>
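<p>In plain PHP, this rule could be sketched as follows. The helper <code>stylize</code> is hypothetical, the bold escape sequence is only an example of a style, and the built-in <code>stream_isatty</code> function (PHP 7.2 and later) stands in here for a “direct output” check; this is not how <code>Hoa\Console</code> does it:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>// Hypothetical helper: wrap `$text` in escape sequences (bold, then reset)</span></span>
<span class="giallo-l"><span>// only when the output is direct, i.e. attached to the screen.</span></span>
<span class="giallo-l"><span>function stylize(string $text, bool $isDirect): string</span></span>
<span class="giallo-l"><span>{</span></span>
<span class="giallo-l"><span>    if (!$isDirect) {</span></span>
<span class="giallo-l"><span>        return $text;</span></span>
<span class="giallo-l"><span>    }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>    return "\033[1m" . $text . "\033[0m";</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>// `stream_isatty` tells whether the stream is attached to a terminal.</span></span>
<span class="giallo-l"><span>echo stylize('foobar', stream_isatty(STDOUT)), "\n";</span></span></code></pre>
<p>Run directly, the text is printed in bold; piped or redirected, the receiver only gets the 6 characters of <code>foobar</code> and a newline.</p>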
<p>Fortunately, the <a rel="noopener external" target="_blank" href="https://github.com/hoaproject/Console/blob/master/Source/Console.php"><code>Hoa\Console\Console</code>
class</a>
provides the <code>isDirect</code>, <code>isPipe</code> and <code>isRedirection</code> static methods to
know whether the output is respectively direct, a pipe or a redirection
(damn naming…!). Thus, let <code>Type.php</code> be the following program:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-support z-function">echo</span><span class="z-string"> 'is direct: '</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">var_dump</span><span>(Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Console</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">isDirect</span><span>(</span><span class="z-support z-constant">STDOUT</span><span>))</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-support z-function">echo</span><span class="z-string"> 'is pipe: '</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">var_dump</span><span>(Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Console</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">isPipe</span><span>(</span><span class="z-support z-constant">STDOUT</span><span>))</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-support z-function">echo</span><span class="z-string"> 'is redirection: '</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">var_dump</span><span>(Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Console</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">isRedirection</span><span>(</span><span class="z-support z-constant">STDOUT</span><span>))</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>Now, let's test our program:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> php Type.php</span></span>
<span class="giallo-l"><span>is direct: bool(true)</span></span>
<span class="giallo-l"><span>is pipe: bool(false)</span></span>
<span class="giallo-l"><span>is redirection: bool(false)</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> php Type.php </span><span class="z-keyword z-operator">|</span><span class="z-entity z-name"> xargs</span><span class="z-constant z-other"> -I@</span><span class="z-string"> echo @</span></span>
<span class="giallo-l"><span>is direct: bool(false)</span></span>
<span class="giallo-l"><span>is pipe: bool(true)</span></span>
<span class="giallo-l"><span>is redirection: bool(false)</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> php Type.php </span><span class="z-keyword z-operator">></span><span> /tmp/foo</span><span class="z-punctuation z-terminator">;</span><span class="z-entity z-name"> cat</span><span class="z-string"> !!</span><span>$</span></span>
<span class="giallo-l"><span>is direct: bool(false)</span></span>
<span class="giallo-l"><span>is pipe: bool(false)</span></span>
<span class="giallo-l"><span>is redirection: bool(true)</span></span></code></pre>
<p>The first execution is the classic case: <code>STDOUT</code>, the standard output, is
direct. The second execution sends the output to another program,
so <code>STDOUT</code> is a pipe. Finally, the last execution redirects the
output to a file called <code>/tmp/foo</code>, so <code>STDOUT</code> is a redirection.</p>
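<p>These three outcomes can be approximated in plain PHP by looking at the file-type bits of the stream's mode. This is only a sketch, not the code of <code>Hoa\Console</code>: the helper <code>kindOf</code> and the constant names are hypothetical, and the octal values are the traditional Unix ones (Linux, macOS, BSD), which is an assumption about the platform:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>// File-type bits of a mode (traditional Unix values, see sys/stat.h).</span></span>
<span class="giallo-l"><span>const TYPE_MASK      = 0170000; // mask selecting the file-type bits</span></span>
<span class="giallo-l"><span>const TYPE_FIFO      = 0010000; // FIFO, i.e. a pipe</span></span>
<span class="giallo-l"><span>const TYPE_CHARACTER = 0020000; // character device, e.g. a terminal</span></span>
<span class="giallo-l"><span>const TYPE_REGULAR   = 0100000; // regular file</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>// Hypothetical helper: name the kind of output behind a mode.</span></span>
<span class="giallo-l"><span>function kindOf(int $mode): string</span></span>
<span class="giallo-l"><span>{</span></span>
<span class="giallo-l"><span>    switch ($mode & TYPE_MASK) {</span></span>
<span class="giallo-l"><span>        case TYPE_CHARACTER: return 'direct';</span></span>
<span class="giallo-l"><span>        case TYPE_FIFO:      return 'pipe';</span></span>
<span class="giallo-l"><span>        case TYPE_REGULAR:   return 'redirection';</span></span>
<span class="giallo-l"><span>        default:             return 'unknown';</span></span>
<span class="giallo-l"><span>    }</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>// `fstat` returns an array whose `mode` entry holds these bits.</span></span>
<span class="giallo-l"><span>echo kindOf(fstat(STDOUT)['mode']), "\n";</span></span></code></pre>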
<p>How does it work? We use <a rel="noopener external" target="_blank" href="http://php.net/fstat"><code>fstat</code></a> to read the
<code>mode</code> of the file. The underlying <code>fstat</code> implementation is defined in
C, so let's take a look at the <a rel="noopener external" target="_blank" href="http://man.cx/fstat%282%29">documentation of
<code>fstat(2)</code></a>. <code>stat</code> is a C structure that
looks like:</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-storage z-type">struct</span><span> stat </span><span class="z-punctuation z-section">{</span></span>
<span class="giallo-l"><span class="z-storage z-type"> dev_t</span><span> st_dev</span><span class="z-punctuation z-terminator">;</span><span class="z-comment"> /* device inode resides on */</span></span>
<span class="giallo-l"><span class="z-storage z-type"> ino_t</span><span> st_ino</span><span class="z-punctuation z-terminator">;</span><span class="z-comment"> /* inode's number */</span></span>
<span class="giallo-l"><span class="z-storage z-type"> mode_t</span><span> st_mode</span><span class="z-punctuation z-terminator">;</span><span class="z-comment"> /* inode protection mode */</span></span>
<span class="giallo-l"><span class="z-storage z-type"> nlink_t</span><span> st_nlink</span><span class="z-punctuation z-terminator">;</span><span class="z-comment"> /* number of hard links to the file */</span></span>
<span class="giallo-l"><span class="z-storage z-type"> uid_t</span><span> st_uid</span><span class="z-punctuation z-terminator">;</span><span class="z-comment"> /* user-id of owner */</span></span>
<span class="giallo-l"><span class="z-storage z-type"> gid_t</span><span> st_gid</span><span class="z-punctuation z-terminator">;</span><span class="z-comment"> /* group-id of owner */</span></span>
<span class="giallo-l"><span class="z-storage z-type"> dev_t</span><span> st_rdev</span><span class="z-punctuation z-terminator">;</span><span class="z-comment"> /* device type, for special file inode */</span></span>
<span class="giallo-l"><span class="z-storage z-type"> struct</span><span> timespec st_atimespec</span><span class="z-punctuation z-terminator">;</span><span class="z-comment"> /* time of last access */</span></span>
<span class="giallo-l"><span class="z-storage z-type"> struct</span><span> timespec st_mtimespec</span><span class="z-punctuation z-terminator">;</span><span class="z-comment"> /* time of last data modification */</span></span>
<span class="giallo-l"><span class="z-storage z-type"> struct</span><span> timespec st_ctimespec</span><span class="z-punctuation z-terminator">;</span><span class="z-comment"> /* time of last file status change */</span></span>
<span class="giallo-l"><span class="z-storage z-type"> off_t</span><span> st_size</span><span class="z-punctuation z-terminator">;</span><span class="z-comment"> /* file size, in bytes */</span></span>
<span class="giallo-l"><span class="z-storage z-type"> quad_t</span><span> st_blocks</span><span class="z-punctuation z-terminator">;</span><span class="z-comment"> /* blocks allocated for file */</span></span>
<span class="giallo-l"><span class="z-storage z-type"> u_long</span><span> st_blksize</span><span class="z-punctuation z-terminator">;</span><span class="z-comment"> /* optimal file sys I/O ops blocksize */</span></span>
<span class="giallo-l"><span class="z-storage z-type"> u_long</span><span> st_flags</span><span class="z-punctuation z-terminator">;</span><span class="z-comment"> /* user defined flags for file */</span></span>
<span class="giallo-l"><span class="z-storage z-type"> u_long</span><span> st_gen</span><span class="z-punctuation z-terminator">;</span><span class="z-comment"> /* file generation number */</span></span>
<span class="giallo-l"><span class="z-punctuation z-section">}</span></span></code></pre>
<p>The value of <code>mode</code> returned by the PHP <code>fstat</code> function is equal to
<code>st_mode</code> in this structure. And <code>st_mode</code> has the following bits:</p>
<pre class="giallo z-code"><code data-lang="c"><span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IFMT</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">170000</span><span class="z-comment"> /* type of file mask */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IFIFO</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">010000</span><span class="z-comment"> /* named pipe (fifo) */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IFCHR</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">020000</span><span class="z-comment"> /* character special */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IFDIR</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">040000</span><span class="z-comment"> /* directory */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IFBLK</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">060000</span><span class="z-comment"> /* block special */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IFREG</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">100000</span><span class="z-comment"> /* regular */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IFLNK</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">120000</span><span class="z-comment"> /* symbolic link */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IFSOCK</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">140000</span><span class="z-comment"> /* socket */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IFWHT</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">160000</span><span class="z-comment"> /* whiteout */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_ISUID</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">004000</span><span class="z-comment"> /* set user id on execution */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_ISGID</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">002000</span><span class="z-comment"> /* set group id on execution */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_ISVTX</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">001000</span><span class="z-comment"> /* save swapped text even after use */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IRWXU</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">000700</span><span class="z-comment"> /* RWX mask for owner */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IRUSR</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">000400</span><span class="z-comment"> /* read permission, owner */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IWUSR</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">000200</span><span class="z-comment"> /* write permission, owner */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IXUSR</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">000100</span><span class="z-comment"> /* execute/search permission, owner */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IRWXG</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">000070</span><span class="z-comment"> /* RWX mask for group */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IRGRP</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">000040</span><span class="z-comment"> /* read permission, group */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IWGRP</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">000020</span><span class="z-comment"> /* write permission, group */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IXGRP</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">000010</span><span class="z-comment"> /* execute/search permission, group */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IRWXO</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">000007</span><span class="z-comment"> /* RWX mask for other */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IROTH</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">000004</span><span class="z-comment"> /* read permission, other */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IWOTH</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">000002</span><span class="z-comment"> /* write permission, other */</span></span>
<span class="giallo-l"><span class="z-keyword">#define</span><span class="z-entity z-name z-function"> S_IXOTH</span><span class="z-keyword"> 0</span><span class="z-constant z-numeric">000001</span><span class="z-comment"> /* execute/search permission, other */</span></span></code></pre>
<p>Awesome, we have everything we need! We mask <code>mode</code> with <code>S_IFMT</code> to get
the file type. Then we just have to check whether it is a named pipe
<code>S_IFIFO</code>, a character special <code>S_IFCHR</code> etc. Concretely:</p>
<ul>
<li><code>isDirect</code> checks that the mode is equal to <code>S_IFCHR</code>, which means the stream is
attached to the screen (in our case),</li>
<li><code>isPipe</code> checks that the mode is equal to <code>S_IFIFO</code>: this is a special file
that behaves like a FIFO queue (see the
<a rel="noopener external" target="_blank" href="http://www.freebsd.org/cgi/man.cgi?query=mkfifo&sektion=1">documentation of <code>mkfifo(1)</code></a>): everything that is written is
read right after, and the reading order is defined by the writing order
(first-in, first-out!),</li>
<li><code>isRedirection</code> checks that the mode is equal to <code>S_IFREG</code>, <code>S_IFDIR</code>,
<code>S_IFLNK</code>, <code>S_IFSOCK</code> or <code>S_IFBLK</code>, in other words: all kinds of files on
which we can apply a redirection. Why? Because the <code>STDOUT</code> (or another
<code>STD*</code> stream) of the current process is defined as a file pointer to the
redirection destination, and it can only be a file, a directory, a link, a
socket or a block file.</li>
</ul>
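<p>As a rough illustration of the three checks above, here is a minimal sketch
written with plain PHP only. It is not the Hoa implementation: the
<code>kindOfMode</code> function and the <code>MODE_*</code> constant names are
hypothetical, and the octal values are copied from the <code>fstat(2)</code>
listing above.</p>
<pre class="giallo z-code"><code data-lang="php"><?php

// Hypothetical constants; the values come from the fstat(2) listing.
const MODE_IFMT   = 0170000; // type of file mask
const MODE_IFIFO  = 0010000; // named pipe (fifo)
const MODE_IFCHR  = 0020000; // character special
const MODE_IFDIR  = 0040000; // directory
const MODE_IFBLK  = 0060000; // block special
const MODE_IFREG  = 0100000; // regular
const MODE_IFLNK  = 0120000; // symbolic link
const MODE_IFSOCK = 0140000; // socket

// Classify a `mode` value as returned by fstat().
function kindOfMode(int $mode): string
{
    // Keep only the file type bits, drop the permission bits.
    $type = $mode & MODE_IFMT;

    if (MODE_IFCHR === $type) {
        return 'direct';
    }

    if (MODE_IFIFO === $type) {
        return 'pipe';
    }

    $redirections = [MODE_IFREG, MODE_IFDIR, MODE_IFLNK, MODE_IFSOCK, MODE_IFBLK];

    if (in_array($type, $redirections, true)) {
        return 'redirection';
    }

    return 'unknown';
}

// Usage: inspect the standard output of the current process (CLI only).
$stat = fstat(STDOUT);

if (false !== $stat) {
    echo kindOfMode($stat['mode']), "\n";
}</code></pre>
<p>Keeping the classification in a function that only takes an integer makes it
easy to try with hand-written modes, without opening any stream.</p>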
<p>I encourage you to read the <a rel="noopener external" target="_blank" href="https://github.com/hoaproject/Console/blob/master/Source/Console.php">implementation of the
<code>Hoa\Console\Console::getMode</code>
method</a>.</p>
<p>So yes, this is useful to enable styles on text but also to define the
default verbosity level. For instance, if a program outputs the result
of a computation with some explanations around, the highest verbosity
level would output everything (the result and the explanations) while
the lowest level would output only the result. Let's try with the
<code>toUpperCase.php</code> program:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable">$verbose</span><span class="z-keyword z-operator"> =</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Console</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">isDirect</span><span>(</span><span class="z-support z-constant">STDOUT</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable">$string</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> $argv</span><span class="z-punctuation z-section">[</span><span class="z-constant z-numeric">1</span><span class="z-punctuation z-section">]</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable">$result</span><span class="z-keyword z-operator"> =</span><span> (</span><span class="z-keyword">new</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span>String</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">String</span><span>(</span><span class="z-variable">$string</span><span>))</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">toUpperCase</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">if</span><span>(</span><span class="z-constant z-language">true</span><span class="z-keyword z-operator"> ===</span><span class="z-variable"> $verbose</span><span>) {</span></span>
<span class="giallo-l"><span class="z-support z-function"> echo</span><span class="z-variable"> $string</span><span class="z-punctuation z-separator">,</span><span class="z-string"> ' becomes '</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $result</span><span class="z-punctuation z-separator">,</span><span class="z-string"> ' in upper case!'</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "</span><span class="z-constant z-character">\n</span><span class="z-string">"</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span><span class="z-keyword"> else</span><span> {</span></span>
<span class="giallo-l"><span class="z-support z-function"> echo</span><span class="z-variable"> $result</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "</span><span class="z-constant z-character">\n</span><span class="z-string">"</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Then, let's execute this program:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> php toUpperCase.php </span><span class="z-string">'Hello world!'</span></span>
<span class="giallo-l"><span>Hello world! becomes HELLO WORLD! in upper case!</span></span></code></pre>
<p>And now, let's execute this program with a pipe:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> php toUpperCase.php </span><span class="z-string">'Hello world!'</span><span class="z-keyword z-operator"> |</span><span class="z-entity z-name"> xargs</span><span class="z-constant z-other"> -I@</span><span class="z-string"> echo @</span></span>
<span class="giallo-l"><span>HELLO WORLD!</span></span></code></pre>
<p>Useful and very simple, isn't it?</p>
<h2 id="terminal-capabilities">Terminal capabilities<a role="presentation" class="anchor" href="#terminal-capabilities" title="Anchor link to this header">#</a>
</h2>
<p>We can control the terminal with the inputs, like the keyboard, but we
can also control the outputs. How? With the text itself. In fact, an
output does not contain only text: it also includes <strong>control
functions</strong>. It's like HTML: around a piece of text, you can have an element
specifying that the text is a link. It's exactly the same for terminals!
To specify that a text must be in red, we must add a control function
around it.</p>
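<p>To make this concrete, here is a minimal sketch that hard-codes two control
functions instead of reading them from the terminal. The <code>red</code> helper
is a hypothetical name. The sequences are the ECMA-48 “Select Graphic
Rendition” control functions, where the parameter <code>31</code> selects a red
foreground and <code>0</code> resets all attributes; a terminal that does not
follow the standard may render something else.</p>
<pre class="giallo z-code"><code data-lang="php"><?php

// Wrap a text between two control functions so that it is displayed in red.
function red(string $text): string
{
    // "\e" is the escape character (0x1b).
    return "\e[31m" . $text . "\e[0m";
}

echo red('Hello world!'), "\n";</code></pre>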
<p>Fortunately, these control functions have been standardized in the
<a rel="noopener external" target="_blank" href="http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-048.pdf">ECMA-48</a>
document: Control Functions for Coded Character Sets. However, not all
terminals implement the whole standard, and for historical reasons, some
terminals use slightly different control functions. Moreover, some
information does not belong to this standard (because it is out of its
scope), like: how many colours does the terminal support? Or does the
terminal support the meta key?</p>
<p>Consequently, each terminal has a list of <strong>capabilities</strong>. This list is
split into <strong>3 categories</strong>:</p>
<ul>
<li>boolean capabilities,</li>
<li>number capabilities,</li>
<li>string capabilities.</li>
</ul>
<p>For instance:</p>
<ul>
<li>the “does the terminal support the meta key” is a boolean capability called
<code>meta_key</code> whose value is <code>true</code> or <code>false</code>,</li>
<li>the “number of colours supported by the terminal” is a… number capability
called <code>max_colors</code> whose value can be <code>2</code>, <code>8</code>, <code>256</code> or more,</li>
<li>the “clear screen control function” is a string capability called
<code>clear_screen</code> whose value might be <code>\e[H\e[2J</code>,</li>
<li>the “move the cursor one column to the right” is also a string capability
called <code>cursor_right</code> whose value might be <code>\e[C</code>.</li>
</ul>
<p>All the capabilities can be found in the <a rel="noopener external" target="_blank" href="http://www.freebsd.org/cgi/man.cgi?query=terminfo&sektion=5">documentation of
<code>terminfo(5)</code></a>
or in the <a rel="noopener external" target="_blank" href="http://pubs.opengroup.org/onlinepubs/7908799/xcurses/terminfo.html">documentation of
xcurses</a>.
I encourage you to follow these links and see how rich the terminal
capabilities are!</p>
<h2 id="terminal-information">Terminal information<a role="presentation" class="anchor" href="#terminal-information" title="Anchor link to this header">#</a>
</h2>
<p>Terminal capabilities are stored as <strong>information</strong> in <strong>databases</strong>.
Where are these databases located? In files with a binary format.
Common locations are:</p>
<ul>
<li><code>/usr/share/terminfo</code>,</li>
<li><code>/usr/share/lib/terminfo</code>,</li>
<li><code>/lib/terminfo</code>,</li>
<li><code>/usr/lib/terminfo</code>,</li>
<li><code>/usr/local/share/terminfo</code>,</li>
<li><code>/usr/local/share/lib/terminfo</code>,</li>
<li>etc.</li>
<li>or the <code>TERMINFO</code> or <code>TERMINFO_DIRS</code> environment variables.</li>
</ul>
<p>Inside these directories, we have a tree of the form <code>xx/name</code>,
where <code>xx</code> is the ASCII value in hexadecimal of the first letter of
the terminal name <code>name</code>, or <code>n/name</code> where <code>n</code> is the first
letter of the terminal name. The terminal name is stored in the <code>TERM</code>
environment variable. For instance, on my computer:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> echo </span><span class="z-variable">$TERM</span></span>
<span class="giallo-l"><span>xterm-256color</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> file /usr/share/terminfo/78/xterm-256color</span></span>
<span class="giallo-l"><span>/usr/share/terminfo/78/xterm-256color: Compiled terminfo entry</span></span></code></pre>
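<p>The two layouts can be computed by hand. The following sketch is only an
illustration of the naming scheme described above, not the lookup performed
by Hoa: the <code>terminfoCandidates</code> function is a hypothetical name,
and the directory is one of the common locations listed earlier.</p>
<pre class="giallo z-code"><code data-lang="php"><?php

// Build the two candidate paths of a terminal inside one database directory.
// Assumes $term is not empty.
function terminfoCandidates(string $directory, string $term): array
{
    $first = $term[0];

    return [
        // Hexadecimal layout: "x" is 0x78, hence 78/xterm-256color.
        $directory . '/' . dechex(ord($first)) . '/' . $term,
        // Letter layout: x/xterm-256color.
        $directory . '/' . $first . '/' . $term,
    ];
}

// Usage: fall back to a known name when TERM is not set.
$term = getenv('TERM') ?: 'xterm-256color';

foreach (terminfoCandidates('/usr/share/terminfo', $term) as $path) {
    echo $path, file_exists($path) ? ' (found)' : ' (missing)', "\n";
}</code></pre>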
<p>We can use the <a rel="noopener external" target="_blank" href="https://github.com/hoaproject/Console/blob/master/Source/Tput.php"><code>Hoa\Console\Tput</code>
class</a>
to retrieve this information. The <code>getTerminfo</code> static method returns
the path of the terminal information file. The <code>getTerm</code> static
method returns the terminal name. Finally, the class as a whole is able
to parse a terminal information database (it will use the file returned
by <code>getTerminfo</code> by default). For instance:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable">$tput</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> new</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Tput</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">var_dump</span><span>(</span><span class="z-variable">$tput</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">count</span><span>(</span><span class="z-string">'max_colors'</span><span>))</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">/**</span></span>
<span class="giallo-l"><span class="z-comment"> * Will output:</span></span>
<span class="giallo-l"><span class="z-comment"> * int(256)</span></span>
<span class="giallo-l"><span class="z-comment"> */</span></span></code></pre>
<p>On my computer, with <code>xterm-256color</code>, I have 256 colours, as expected.
If we parse the information of <code>xterm</code> and not <code>xterm-256color</code>, we will
have:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-variable">$tput</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> new</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Tput</span><span>(Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Tput</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">getTerminfo</span><span>(</span><span class="z-string">'xterm'</span><span>))</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">var_dump</span><span>(</span><span class="z-variable">$tput</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">count</span><span>(</span><span class="z-string">'max_colors'</span><span>))</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">/**</span></span>
<span class="giallo-l"><span class="z-comment"> * Will output:</span></span>
<span class="giallo-l"><span class="z-comment"> * int(8)</span></span>
<span class="giallo-l"><span class="z-comment"> */</span></span></code></pre><h2 id="the-power-in-your-hand-control-the-cursor">The power in your hand: Control the cursor<a role="presentation" class="anchor" href="#the-power-in-your-hand-control-the-cursor" title="Anchor link to this header">#</a>
</h2>
<p>Let's summarize. We are able to parse and know all the
capabilities of a specific terminal (including the one of the current
user). If we would like a powerful terminal API, we need to control the
basics, like the cursor.</p>
<p>Remember. We said that the terminal is a canvas of columns and lines.
The cursor is like a pen. We can move it and write something. We are
going to (partly) see how the <a rel="noopener external" target="_blank" href="https://github.com/hoaproject/Console/blob/master/Source/Cursor.php"><code>Hoa\Console\Cursor</code>
class</a>
works.</p>
<h3 id="i-like-to-move-it">I like to move it!<a role="presentation" class="anchor" href="#i-like-to-move-it" title="Anchor link to this header">#</a>
</h3>
<p>The <code>moveTo</code> static method moves the cursor to an absolute
position. For example:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Cursor</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">moveTo</span><span>(</span><span class="z-variable">$x</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $y</span><span>)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>The control function we use is <code>cursor_address</code>. So all we need to do is
to use the <code>Hoa\Console\Tput</code> class and call the <code>get</code> method on it to
get the value of this string capability. This is a parameterized one: on
<code>xterm-256color</code>, its value is <code>\e[%i%p1%d;%p2%dH</code>. We replace the
parameters by <code>$x</code> and <code>$y</code> and we output the result. That's all! We are
able to move the cursor to an absolute position on <strong>all terminals</strong>!
This is the right way to do it.</p>
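<p>To see what replacing the parameters means, here is a minimal sketch of a
parameter expander. It is not how Hoa does it, and the
<code>expandCapability</code> function is a hypothetical name. It only
understands the few operations used by this capability, as described in
<code>terminfo(5)</code>: <code>%i</code> adds one to the first two parameters,
<code>%p1</code> and <code>%p2</code> push a parameter on a stack, and
<code>%d</code> pops a value and prints it as a decimal number. In
<code>terminfo(5)</code>, the first parameter of <code>cursor_address</code> is
the line and the second one is the column.</p>
<pre class="giallo z-code"><code data-lang="php"><?php

// Expand a parameterized string capability (tiny subset of terminfo(5)).
function expandCapability(string $capability, int ...$params): string
{
    $stack  = [];
    $out    = '';
    $tokens = preg_split(
        '/(%p[1-9]|%.)/',
        $capability,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    foreach ($tokens as $token) {
        if ('%i' === $token) {
            // Terminals count from 1: increment the first two parameters.
            foreach ([0, 1] as $index) {
                if (isset($params[$index])) {
                    ++$params[$index];
                }
            }
        } elseif (1 === preg_match('/^%p([1-9])$/', $token, $matches)) {
            // Push the n-th parameter on the stack.
            $stack[] = $params[(int) $matches[1] - 1] ?? 0;
        } elseif ('%d' === $token) {
            // Pop a value and print it as a decimal number.
            $out .= (string) array_pop($stack);
        } elseif ('%%' === $token) {
            $out .= '%';
        } elseif ('%' !== $token[0]) {
            // Literal text is copied as is.
            $out .= $token;
        }
        // Any other % operation is ignored by this sketch.
    }

    return $out;
}

// Usage: zero-based line 4 and column 9 (the output moves the cursor).
echo expandCapability("\e[%i%p1%d;%p2%dH", 4, 9);</code></pre>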
<p>We use the same strategy for the <code>move</code> static method that moves the
cursor relative to its current position. For example:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Cursor</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">move</span><span>(</span><span class="z-string">'right up'</span><span>)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>We split the steps and for each step we read the appropriate string
capability using the <code>Hoa\Console\Tput</code> class. For <code>right</code>, we read the
<code>parm_right_cursor</code> string capability, for <code>up</code>, we read
<code>parm_up_cursor</code> etc. Note that <code>parm_right_cursor</code> is different from
<code>cursor_right</code>: the first one is used to move the cursor a certain
number of times while the second one is used to move the cursor only one
time. With performance in mind, we should use the first one if we have
to move the cursor several times.</p>
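<p>A small sketch shows the difference in the number of bytes sent to the
terminal. The values are hard-coded and assumed to be the ones of an
<code>xterm</code>-like terminal (<code>\e[C</code> for <code>cursor_right</code>
and <code>\e[%p1%dC</code> for <code>parm_right_cursor</code>); a real program
would read them from the terminal information. The <code>moveRight</code>
function is a hypothetical name.</p>
<pre class="giallo z-code"><code data-lang="php"><?php

// Build the control function that moves the cursor $columns to the right.
// Assumes $columns is at least 1.
function moveRight(int $columns): string
{
    if (1 === $columns) {
        // cursor_right: one column, no parameter.
        return "\e[C";
    }

    // parm_right_cursor: a single control function with a parameter.
    return "\e[" . $columns . 'C';
}

// Moving 5 columns: 15 bytes when repeating, 4 bytes with a parameter.
var_dump(strlen(str_repeat(moveRight(1), 5)));
var_dump(strlen(moveRight(5)));</code></pre>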
<p>The <code>getPosition</code> static method returns the position of the cursor. The
interaction is a little bit different here: we must write a control
function to the output, and then the terminal replies on the input.
<a rel="noopener external" target="_blank" href="https://github.com/hoaproject/Console/blob/master/Source/Cursor.php">See the implementation for
yourself</a>.</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-support z-function">print_r</span><span>(Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Cursor</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">getPosition</span><span>())</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">/**</span></span>
<span class="giallo-l"><span class="z-comment"> * Will output:</span></span>
<span class="giallo-l"><span class="z-comment"> * Array</span></span>
<span class="giallo-l"><span class="z-comment"> * (</span></span>
<span class="giallo-l"><span class="z-comment"> * [x] => 7</span></span>
<span class="giallo-l"><span class="z-comment"> * [y] => 42</span></span>
<span class="giallo-l"><span class="z-comment"> * )</span></span>
<span class="giallo-l"><span class="z-comment"> */</span></span></code></pre>
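<p>The reply itself is easy to decode. In ECMA-48, the request is a Device
Status Report (<code>\e[6n</code>) and the terminal answers with a Cursor
Position Report of the form <code>\e[line;columnR</code>. The sketch below only
parses such a reply: <code>parseCursorReport</code> is a hypothetical name, and
reading the reply from the input (which requires the terminal not to wait for
a new line) is left out.</p>
<pre class="giallo z-code"><code data-lang="php"><?php

// Parse a Cursor Position Report, for instance "\e[42;7R".
// Returns null when the reply does not have the expected form.
function parseCursorReport(string $reply): ?array
{
    // The line comes first, then the column.
    if (1 !== preg_match('/\e\[(\d+);(\d+)R/', $reply, $matches)) {
        return null;
    }

    return [
        'x' => (int) $matches[2],
        'y' => (int) $matches[1],
    ];
}

print_r(parseCursorReport("\e[42;7R"));</code></pre>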
<p>In the same way, we have the <code>save</code> and <code>restore</code> static methods that
save the current position of the cursor and restore it. This is very
useful. We use the <code>save_cursor</code> and <code>restore_cursor</code> string
capabilities.</p>
<p>Also, the <code>clear</code> static method splits its argument into the parts to clear. For each
part (direction or way), we read from <code>Hoa\Console\Tput</code> the
appropriate string capability: <code>clear_screen</code> to clear the whole
screen, <code>clr_eol</code> to clear everything on the right of the cursor,
<code>clr_eos</code> to clear everything below the cursor etc.</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Cursor</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">clear</span><span>(</span><span class="z-string">'left'</span><span>)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>See what we learnt in action:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-support z-function">echo</span><span class="z-string"> 'Foobar'</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "</span><span class="z-constant z-character">\n</span><span class="z-string">"</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-string"> 'Foobar'</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "</span><span class="z-constant z-character">\n</span><span class="z-string">"</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-string"> 'Foobar'</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "</span><span class="z-constant z-character">\n</span><span class="z-string">"</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-string"> 'Foobar'</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "</span><span class="z-constant z-character">\n</span><span class="z-string">"</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-string"> 'Foobar'</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "</span><span class="z-constant z-character">\n</span><span class="z-string">"</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span> Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Cursor</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">save</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">sleep</span><span>(</span><span class="z-constant z-numeric">1</span><span>)</span><span class="z-punctuation z-terminator">;</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Cursor</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">move</span><span>(</span><span class="z-string">'LEFT'</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">sleep</span><span>(</span><span class="z-constant z-numeric">1</span><span>)</span><span class="z-punctuation z-terminator">;</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Cursor</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">move</span><span>(</span><span class="z-string">'↑'</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">sleep</span><span>(</span><span class="z-constant z-numeric">1</span><span>)</span><span class="z-punctuation z-terminator">;</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Cursor</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">move</span><span>(</span><span class="z-string">'↑'</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">sleep</span><span>(</span><span class="z-constant z-numeric">1</span><span>)</span><span class="z-punctuation z-terminator">;</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Cursor</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">move</span><span>(</span><span class="z-string">'↑'</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">sleep</span><span>(</span><span class="z-constant z-numeric">1</span><span>)</span><span class="z-punctuation z-terminator">;</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Cursor</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">clear</span><span>(</span><span class="z-string">'↔'</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">sleep</span><span>(</span><span class="z-constant z-numeric">1</span><span>)</span><span class="z-punctuation z-terminator">;</span><span class="z-support z-function"> echo</span><span class="z-string"> 'Hahaha!'</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">sleep</span><span>(</span><span class="z-constant z-numeric">1</span><span>)</span><span class="z-punctuation z-terminator">;</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Cursor</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">restore</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-support z-function">echo</span><span class="z-string"> "</span><span class="z-constant z-character">\n</span><span class="z-string">"</span><span class="z-punctuation z-separator">,</span><span class="z-string"> 'Bye!'</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "</span><span class="z-constant z-character">\n</span><span class="z-string">"</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>The result is presented in the following figure.</p>
<figure>
<p><img src="https://mnt.io/articles/control-the-terminal-the-right-way/./cursor_move.gif" alt="Moving a cursor in the terminal" loading="lazy" decoding="async" /></p>
<figcaption>
<p>Saving, moving, clearing and restoring the cursor with <code>Hoa\Console</code>.</p>
</figcaption>
</figure>
<p>The resulting API is portable, clean, simple to read and very easy to
maintain! This is the right way to do it.</p>
<p>To get more information, please <a rel="noopener external" target="_blank" title="Documentation of Hoa\Console\Cursor" href="http://hoa-project.net/Literature/Hack/Console.html#Cursor">read the
documentation</a>.</p>
<h3 id="">Colours and decorations<a role="presentation" class="anchor" href="#" title="Anchor link to this header">#</a>
</h3>
<p>Now: Colours. This is the main reason why I decided to write this
article. We see the same kind of libraries, again and again, handling
only colours in the terminal, and unfortunately not in the right way 😞.</p>
<p>A terminal has a palette of colours. Each colour is indexed by an
integer, from 0 to potentially +∞. The size of the palette is described
by the <code>max_colors</code> number capability. Usually, a palette contains 1, 2,
8, 256 or 16 million colours.</p>
<figure>
<p><img src="https://mnt.io/articles/control-the-terminal-the-right-way/./xterm_256color_chart.svg" alt="xterm-256color palette" loading="lazy" decoding="async" /></p>
<figcaption>
<p>The <code>xterm-256color</code> palette (<a rel="noopener external" target="_blank" title="Source of the `xterm-256color` palette" href="https://commons.wikimedia.org/wiki/File:Xterm_256color_chart.svg">source</a>).</p>
</figcaption>
</figure>
<p>So the first thing to do is to check whether we have more than 1 colour. If
not, we must not colorize the given text. Next, if we have fewer than
256 colours, we have to convert the style into a palette containing
8 colours. Likewise, with fewer than 16 million colours, we have to convert
into 256 colours.</p>
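<p>As a minimal sketch of this rule, assuming the <code>tput</code> command is available to read <code>max_colors</code>, the decision could look like the code below. The function name <code>targetPaletteSize</code> is hypothetical and is not part of <code>Hoa\Console</code>.</p>
<pre class="giallo z-code"><code data-lang="php">&lt;?php

// Hypothetical helper, not part of Hoa\Console: given the value of the
// `max_colors` capability, return the size of the palette a style should
// be converted to (0 means: do not colorize at all).
function targetPaletteSize(int $maxColors): int
{
    if ($maxColors &lt;= 1) {
        return 0;        // 1 colour or fewer: do not colorize
    }

    if ($maxColors &lt; 256) {
        return 8;        // fewer than 256 colours: convert to 8 colours
    }

    if ($maxColors &lt; 16777216) {
        return 256;      // fewer than 16 million colours: convert to 256
    }

    return 16777216;     // the terminal handles 16 million colours
}

// Usage, assuming `tput` is installed: `tput colors` prints `max_colors`.
$maxColors = (int) shell_exec('tput colors 2&gt;/dev/null');
echo targetPaletteSize($maxColors), "\n";</code></pre>
<p>The thresholds simply mirror the rule stated above; a real library may refine them.</p>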
<p>Moreover, we can define the style of the foreground or of the background
with the <code>set_a_foreground</code> and <code>set_a_background</code> string
capabilities respectively. Finally, in addition to colours, we can define other
decorations like bold, underline, blink or even invert the foreground
and the background colours.</p>
<p>One thing to remember is: With this capability, we only define the style
at a given “pixel” and it will apply to the text that follows. In this
respect, it is not exactly like HTML where we have a beginning and an end.
Here we only have a beginning. Let's try!</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Cursor</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">colorize</span><span>(</span><span class="z-string">'underlined foreground(yellow) background(#932e2e)'</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">echo</span><span class="z-string"> 'foo'</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Cursor</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">colorize</span><span>(</span><span class="z-string">'!underlined background(normal)'</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">echo</span><span class="z-string"> 'bar'</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "</span><span class="z-constant z-character">\n</span><span class="z-string">"</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>The API is pretty simple: We start to underline the text, we set the
foreground to yellow and we set the background to <code>#932e2e</code>. Then we
output something. We continue by cancelling the underline decoration
and resetting the background. Finally we output something
else. Here is the result:</p>
<figure>
<p><img src="https://mnt.io/articles/control-the-terminal-the-right-way/./colour.png" alt="A styled text in the terminal" loading="lazy" decoding="async" /></p>
<figcaption>
<p>Fun with <code>Hoa\Console\Cursor::colorize</code>.</p>
</figcaption>
</figure>
<p>What do we observe? My terminal does not support more than 256 colours.
Thus, <code>#932e2e</code> is <strong>automatically converted into the closest colour</strong>
in my actual palette! This is the right way to do it.</p>
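<p>To give an idea of what “closest colour” can mean, here is a sketch that assumes the default <code>xterm-256color</code> palette and a plain Euclidean distance in RGB. It is not necessarily how <code>Hoa\Console</code> computes it, and the function names <code>xterm256ToRgb</code> and <code>nearestXterm256</code> are hypothetical.</p>
<pre class="giallo z-code"><code data-lang="php">&lt;?php

// Hypothetical helper: RGB components of an index of the default
// xterm-256color palette, for indexes 16 to 255.
function xterm256ToRgb(int $index): array
{
    if ($index &gt;= 232) {
        // Grayscale ramp of 24 steps: 8, 18, …, 238.
        $level = 8 + 10 * ($index - 232);

        return [$level, $level, $level];
    }

    // Colour cube of 6 × 6 × 6 entries.
    $levels = [0, 95, 135, 175, 215, 255];
    $cube   = $index - 16;

    return [
        $levels[intdiv($cube, 36)],
        $levels[intdiv($cube, 6) % 6],
        $levels[$cube % 6],
    ];
}

// Hypothetical helper: index of the palette entry closest to (r, g, b).
// Indexes 0 to 15 are skipped because users often redefine them.
function nearestXterm256(int $r, int $g, int $b): int
{
    $best         = 16;
    $bestDistance = PHP_INT_MAX;

    for ($index = 16; $index &lt;= 255; ++$index) {
        [$pr, $pg, $pb] = xterm256ToRgb($index);
        $distance = ($r - $pr) ** 2 + ($g - $pg) ** 2 + ($b - $pb) ** 2;

        if ($distance &lt; $bestDistance) {
            $best         = $index;
            $bestDistance = $distance;
        }
    }

    return $best;
}

// #932e2e is (147, 46, 46); the closest entry is (135, 0, 0).
echo nearestXterm256(0x93, 0x2e, 0x2e), "\n"; // prints 88</code></pre>
<p>A brute-force search over 240 entries is cheap enough here, and it keeps the sketch easy to check by hand.</p>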
<p>For fun, you can change the colours in the palette with the
<code>Hoa\Console\Cursor::changeColor</code> static method. You can also change the
style of the cursor, like <code>▋</code>, <code>_</code> or <code>|</code>.</p>
<p>To get more information, please <a rel="noopener external" target="_blank" title="Documentation of Hoa\Console\Cursor" href="http://hoa-project.net/Fr/Literature/Hack/Console.html#Content">read the
documentation</a>.</p>
<h2 id="-1">The power in your hand: Readline<a role="presentation" class="anchor" href="#-1" title="Anchor link to this header">#</a>
</h2>
<p>A more complete usage of <code>Hoa\Console\Cursor</code> and even
<code>Hoa\Console\Window</code> is the <a rel="noopener external" target="_blank" href="http://central.hoa-project.net/Resource/Library/Console/Readline/Readline.php"><code>Hoa\Console\Readline</code>
class</a>,
a powerful readline. Beyond autocompleters, history, key
bindings etc., it makes advanced use of cursors. See this in action:</p>
<figure>
<p><img src="https://mnt.io/articles/control-the-terminal-the-right-way/./readline_autocompleters.gif" alt="Play with autocompleters" loading="lazy" decoding="async" /></p>
<figcaption>
<p>An autocompletion menu, made with <code>Hoa\Console\Cursor</code> and
<code>Hoa\Console\Window</code>.</p>
</figcaption>
</figure>
<p>We use <code>Hoa\Console\Cursor</code> to move the cursor or change the colours and
<code>Hoa\Console\Window</code> to get the dimensions of the window, scroll some
text in it etc. I encourage you to read the implementation.</p>
<p>To get more information, please <a rel="noopener external" target="_blank" title="Documentation of Hoa\Console\Readline" href="http://hoa-project.net/Literature/Hack/Console.html#Readline">read the
documentation</a>.</p>
<h2 id="-2">The power in your hand: Sound 🎵<a role="presentation" class="anchor" href="#-2" title="Anchor link to this header">#</a>
</h2>
<p>Yes, even sound is defined by terminal capabilities. The famous beep is
given by the <code>bell</code> string capability. Would you like to make a beep?
Easy:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable">$tput</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> new</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Tput</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">echo</span><span class="z-variable"> $tput</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">get</span><span>(</span><span class="z-string">'bell'</span><span>)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>That's it!</p>
<h2 id="-3">Bonus: Window<a role="presentation" class="anchor" href="#-3" title="Anchor link to this header">#</a>
</h2>
<p>As a bonus, a quick demo of <code>Hoa\Console\Window</code> because it's fun.</p>
<p>The video shows the execution of the following code:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Window</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">setSize</span><span>(</span><span class="z-constant z-numeric">80</span><span class="z-punctuation z-separator">,</span><span class="z-constant z-numeric"> 35</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">var_dump</span><span>(Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Window</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">getPosition</span><span>())</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">foreach</span><span>(</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> [</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> [</span><span class="z-constant z-numeric">100</span><span class="z-punctuation z-separator">,</span><span class="z-constant z-numeric"> 100</span><span class="z-punctuation z-section">]</span><span class="z-punctuation z-separator">,</span><span class="z-punctuation z-section"> [</span><span class="z-constant z-numeric">150</span><span class="z-punctuation z-separator">,</span><span class="z-constant z-numeric"> 150</span><span class="z-punctuation z-section">]</span><span class="z-punctuation z-separator">,</span><span class="z-punctuation z-section"> [</span><span class="z-constant z-numeric">200</span><span class="z-punctuation z-separator">,</span><span class="z-constant z-numeric"> 100</span><span class="z-punctuation z-section">]</span><span class="z-punctuation z-separator">,</span><span class="z-punctuation z-section"> [</span><span class="z-constant z-numeric">200</span><span class="z-punctuation z-separator">,</span><span class="z-constant z-numeric"> 80</span><span class="z-punctuation z-section">]</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> [</span><span class="z-constant z-numeric">200</span><span class="z-punctuation z-separator">,</span><span class="z-constant z-numeric"> 60</span><span class="z-punctuation z-section">]</span><span class="z-punctuation z-separator">,</span><span class="z-punctuation z-section"> [</span><span class="z-constant z-numeric">200</span><span class="z-punctuation z-separator">,</span><span class="z-constant z-numeric"> 100</span><span class="z-punctuation z-section">]</span></span>
<span class="giallo-l"><span class="z-punctuation z-section"> ]</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> as</span><span class="z-support z-function"> list</span><span>(</span><span class="z-variable">$x</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $y</span><span>)</span></span>
<span class="giallo-l"><span>) {</span></span>
<span class="giallo-l"><span class="z-support z-function"> sleep</span><span>(</span><span class="z-constant z-numeric">1</span><span>)</span><span class="z-punctuation z-terminator">;</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Window</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">moveTo</span><span>(</span><span class="z-variable">$x</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $y</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-support z-function">sleep</span><span>(</span><span class="z-constant z-numeric">2</span><span>)</span><span class="z-punctuation z-terminator">;</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Window</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">minimize</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">sleep</span><span>(</span><span class="z-constant z-numeric">2</span><span>)</span><span class="z-punctuation z-terminator">;</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Window</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">restore</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">sleep</span><span>(</span><span class="z-constant z-numeric">2</span><span>)</span><span class="z-punctuation z-terminator">;</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Window</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">lower</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">sleep</span><span>(</span><span class="z-constant z-numeric">2</span><span>)</span><span class="z-punctuation z-terminator">;</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span>Console</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Window</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">raise</span><span>()</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>We resize the window, we get its position, we move the window on the
screen, we minimize and restore it, and finally we put it behind all
other windows just before raising it.</p>
<iframe src="https://player.vimeo.com/video/115901611?title=0&byline=0&portrait=0&badge=0&autopause=0&player_id=0&app_id=58479" frameborder="0" allow="autoplay; fullscreen; picture-in-picture; clipboard-write" style="aspect-ratio: 16/9; width: 100%;" title="Hoa\Console\Window in action">
</iframe>
<p>To get more information, please <a rel="noopener external" target="_blank" title="Documentation of Hoa\Console\Window" href="http://hoa-project.net/Literature/Hack/Console.html#Window">read the
documentation</a>.</p>
<h2 id="-4">Conclusion<a role="presentation" class="anchor" href="#-4" title="Anchor link to this header">#</a>
</h2>
<p>In this article, we saw how to control the terminal by, firstly,
detecting the type of pipes and, secondly, reading and using the
terminal capabilities. We know where these capabilities are stored and
we saw a few of them in action.</p>
<p>This approach ensures your code will be <strong>portable</strong>, easy to maintain
and <strong>easy to use</strong>. The portability is very important because, like
browsers and user devices, we have a lot of terminal emulators released
in the wild. We have to care about them.</p>
<p>I encourage you to take a look at the <a rel="noopener external" target="_blank" href="http://github.com/hoaproject/Console"><code>Hoa\Console</code>
library</a> and to contribute to making
it even more awesome 😄.</p>
atoum has two release managers2014-11-28T00:00:00+00:002014-11-28T00:00:00+00:00
Unknown
https://mnt.io/articles/atoum-has-two-release-managers/<h2 id="what-is-atoum">What is atoum?<a role="presentation" class="anchor" href="#what-is-atoum" title="Anchor link to this header">#</a>
</h2>
<p>Short introduction: atoum is a simple, modern and intuitive unit testing
framework for PHP. Originally created by <a rel="noopener external" target="_blank" href="http://blog.mageekbox.net/">Frédéric
Hardy</a>, a good friend, it has grown thanks
to <a rel="noopener external" target="_blank" href="https://github.com/atoum/atoum/graphs/contributors">many
contributors</a>.</p>
<figure>
<p><img src="https://mnt.io/articles/atoum-has-two-release-managers/./atoum-logo.png" alt="atoum's logo" loading="lazy" decoding="async" /></p>
<figcaption>
atoum's logo.
</figcaption>
</figure>
<p>No one can say that atoum is not simple or intuitive. The framework
offers several awesome features and is more like a meta unit testing
framework. Indeed, the “user-land” of atoum, I mean all the assertions
API (“this is an integer and it is equal to…”) is based on a very
flexible mechanism, handled or embedded in runners, reporters etc. Thus,
the framework is very extensible. You can find more information in the
<code>README.md</code> of the project: <a rel="noopener external" target="_blank" href="https://github.com/atoum/atoum#why-atoum">Why
atoum?</a>.</p>
<p>Several important projects or companies use atoum. For instance,
<a rel="noopener external" target="_blank" href="https://github.com/FriendsOfPHP/pickle/">Pickle</a>, the PHP Extension
installer, created by <a rel="noopener external" target="_blank" href="https://twitter.com/PierreJoye">Pierre Joye</a>,
another friend (the world is very small 😉), uses atoum for its unit
tests. Another example is <a rel="noopener external" target="_blank" href="https://github.com/M6Web">M6Web</a>, the geeks
working at <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/M6_%28TV_channel%29">M6</a>, the
most profitable private national French TV channel, who also use atoum.
Another example, <a rel="noopener external" target="_blank" href="http://mozilla.org/">Mozilla</a> is using atoum to test
some of their applications.</p>
<h2 id="">Where is the cap'tain?<a role="presentation" class="anchor" href="#" title="Anchor link to this header">#</a>
</h2>
<p>Since the beginning, Frédéric has been a great leader for the project.
He has inspired many people, users and contributors. In real life, on
stage, on IRC… his personality and charisma were helpful in all aspects.
However, leading such a project is challenging and nerve-wracking
daily work. I know what I am talking about with
<a rel="noopener external" target="_blank" href="http://hoa-project.net/">Hoa</a>. Fortunately for Frédéric, some
contributors were here to help.</p>
<h2 id="-1">Where to go cap'tain?<a role="presentation" class="anchor" href="#-1" title="Anchor link to this header">#</a>
</h2>
<p>However, having contributors does not create a community. A community is a
group of people that share something together. A project needs a
community with strong connections. They do not all need to look in the
same direction, but they have to share something. In the case of atoum,
I would say the project has been a <strong>victim of its own success</strong>. We have
seen the number of users increase very quickly and the project was not
yet ready for such massive use. The documentation was not ready, a lot
of features were not finalized, there were few contributors and the
absence of a real community did not help. Put all these things together,
blend them and you obtain a bomb 😄. The project leaders were
under terrible pressure.</p>
<p>In these conditions, it is not easy to work, especially when users ask
for new features. The need for a roadmap and for people making
decisions was very strong.</p>
<h2 id="-2">When the community acts<a role="presentation" class="anchor" href="#-2" title="Anchor link to this header">#</a>
</h2>
<p>After a couple of months under the sea, we have decided that we need to
create a structure around the project. An organization. Frédéric is not
able to do everything by himself. That's why <strong>2 release managers have
been elected</strong>: Mikaël Randy and I. Thank you to <a rel="noopener external" target="_blank" href="http://jubianchi.fr/">Julien
Bianchi</a>, another friend 😉, for having organized
these elections and being one of the most active contributors of atoum!</p>
<p>Our goal is to define the roadmap of atoum:</p>
<ul>
<li>what will be included in the next version and what will not,</li>
<li>what features need work,</li>
<li>what bugs or issues need to be solved,</li>
<li>etc.</li>
</ul>
<p>Well, a release manager is a pretty common job.</p>
<p>Why 2? To mitigate the bus factor and to delegate. We all have a family,
friends, jobs and side projects. With 2 release managers, we have
2 times more time to organize this project, and it deserves such an
amount of time.</p>
<p>The goal is also to organize the community if it is possible. New great
features are coming and they will allow more people to contribute and
build their “own atoum”. See below.</p>
<h2 id="-3">Features to port!<a role="presentation" class="anchor" href="#-3" title="Anchor link to this header">#</a>
</h2>
<p>Not everything is 100% defined yet, but here is an overview of what is
coming.</p>
<p>First of all, you will find the <a rel="noopener external" target="_blank" href="https://github.com/atoum/atoum/milestones/1.0.0">latest issues and
bugs</a> we have to close
before the first release.</p>
<p>Second, you will notice the version number… 1.0.0. Yes! atoum will have
tags! After several discussions
(<a rel="noopener external" target="_blank" href="https://github.com/atoum/atoum/issues/261">#261</a>,
<a rel="noopener external" target="_blank" href="https://github.com/atoum/atoum/issues/300">#300</a>,
<a rel="noopener external" target="_blank" href="https://github.com/atoum/atoum/issues/342">#342</a>,
<a rel="noopener external" target="_blank" href="https://github.com/atoum/atoum/issues/349">#349</a>…), even if atoum is
rolling-released, it will have tags. And with the <a rel="noopener external" target="_blank" href="http://semver.org/">semver
format</a>. More information on the blog of Julien
Bianchi: <a rel="noopener external" target="_blank" href="http://jubianchi.fr/atoum-release.htm">atoum embraces semver</a>.</p>
<p>Finally, a big feature is the <a rel="noopener external" target="_blank" href="https://github.com/atoum/atoum/pull/330">Extension
API</a>, which allows writing
extensions, such as:</p>
<ul>
<li><a rel="noopener external" target="_blank" href="https://github.com/atoum/visibility-extension"><code>atoum/visibility-extension</code></a>, allows overriding the visibility of methods;
example:</li>
</ul>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> Foo</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> protected</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> bar</span><span>(</span><span class="z-variable">$arg</span><span>) {</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable"> $arg</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// and…</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> Foo</span><span class="z-storage"> extends</span><span> atoum</span><span class="z-punctuation z-separator">\</span><span class="z-entity z-other z-inherited-class">test</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> testBaz</span><span>() {</span></span>
<span class="giallo-l"><span class="z-variable z-language"> $this</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-entity z-name z-function">if</span><span>(</span><span class="z-variable">$sut</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> new</span><span class="z-punctuation z-separator"> \</span><span class="z-support z-class">Foo</span><span>())</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-entity z-name z-function">and</span><span>(</span><span class="z-variable">$arg</span><span class="z-keyword z-operator"> =</span><span class="z-string"> 'bar'</span><span>)</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-variable">then</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-entity z-name z-function">variable</span><span>(</span><span class="z-variable z-language">$this</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">invoke</span><span>(</span><span class="z-variable">$sut</span><span>)</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">bar</span><span>(</span><span class="z-variable">$arg</span><span>))</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-entity z-name z-function">isEqualTo</span><span>(</span><span class="z-variable">$arg</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Now you will be able to test your protected and private methods!</p>
<ul>
<li><a rel="noopener external" target="_blank" href="https://github.com/atoum/bdd-extension"><code>atoum/bdd-extension</code></a>, allows writing tests in the behavior-driven
development style and vocabulary; example:</li>
</ul>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> Formatter</span><span class="z-storage"> extends</span><span> atoum</span><span class="z-punctuation z-separator">\</span><span class="z-entity z-other z-inherited-class">spec</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> should_format_underscore_separated_method_name</span><span>() {</span></span>
<span class="giallo-l"><span class="z-variable z-language"> $this</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-entity z-name z-function">given</span><span>(</span><span class="z-variable">$formatter</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> new</span><span class="z-support z-class"> testedClass</span><span>())</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-variable">then</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-variable">invoking</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">format</span><span>(</span><span class="z-constant z-language">__FUNCTION__</span><span>)</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">on</span><span>(</span><span class="z-variable">$formatter</span><span>)</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-entity z-name z-function">shouldReturn</span><span>(</span><span class="z-string">'should format underscore separated method name'</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Even the output looks familiar:</p>
<figure>
<p><img src="https://mnt.io/articles/atoum-has-two-release-managers/output.png" alt="Example of a terminal output" loading="lazy" decoding="async" /></p>
<figcaption>
Possible output with <code>atoum/bdd-extension</code>.
</figcaption>
</figure>
<ul>
<li><a rel="noopener external" target="_blank" href="https://github.com/atoum/json-schema-extension"><code>atoum/json-schema-extension</code></a>, lets you validate JSON payloads against a
schema; example:</li>
</ul>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> Foo</span><span class="z-storage"> extends</span><span> atoum</span><span class="z-punctuation z-separator">\</span><span class="z-entity z-other z-inherited-class">test</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> testIsJson</span><span>() {</span></span>
<span class="giallo-l"><span class="z-variable z-language"> $this</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-entity z-name z-function">given</span><span>(</span><span class="z-variable">$string</span><span class="z-keyword z-operator"> =</span><span class="z-string"> '{"foo": "bar"}'</span><span>)</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-variable">then</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-entity z-name z-function">json</span><span>(</span><span class="z-variable">$string</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> testValidatesSchema</span><span>() {</span></span>
<span class="giallo-l"><span class="z-variable z-language"> $this</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-entity z-name z-function">given</span><span>(</span><span class="z-variable">$string</span><span class="z-keyword z-operator"> =</span><span class="z-string"> '["foo", "bar"]'</span><span>)</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-variable">then</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-entity z-name z-function">json</span><span>(</span><span class="z-variable">$string</span><span>)</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">validates</span><span>(</span><span class="z-string">'{"title": "test", "type": "array"}'</span><span>)</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-entity z-name z-function">json</span><span>(</span><span class="z-variable">$string</span><span>)</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">validates</span><span>(</span><span class="z-string">'/path/to/json.schema'</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<ul>
<li><a rel="noopener external" target="_blank" href="https://github.com/hoaproject/Contributions-Atoum-PraspelExtension"><code>atoum/praspel-extension</code></a>, lets you use
<a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Praspel">Praspel</a> inside atoum to automatically
generate and validate advanced test data and unit tests; example:</li>
</ul>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> Foo</span><span class="z-storage"> extends</span><span> atoum</span><span class="z-punctuation z-separator">\</span><span class="z-entity z-other z-inherited-class">test</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> testFoo</span><span>() {</span></span>
<span class="giallo-l"><span class="z-variable z-language"> $this</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-entity z-name z-function">if</span><span>(</span><span class="z-variable">$regex</span><span class="z-keyword z-operator"> =</span><span class="z-variable z-language"> $this</span><span class="z-keyword z-operator">-></span><span class="z-variable">realdom</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">regex</span><span>(</span><span class="z-string z-regexp">'/[\w\-_]</span><span class="z-keyword z-operator">+</span><span class="z-string z-regexp z-constant z-character">(\.[\w\-\_]</span><span class="z-keyword z-operator">+</span><span class="z-string z-regexp">)</span><span class="z-keyword z-operator">*</span><span class="z-string z-regexp z-constant z-character">@\w\.(net|org)/'</span><span>))</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-entity z-name z-function">and</span><span>(</span><span class="z-variable">$email</span><span class="z-keyword z-operator"> =</span><span class="z-variable z-language"> $this</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">sample</span><span>(</span><span class="z-variable">$regex</span><span>))</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-variable">then</span></span>
<span class="giallo-l"><span class="z-constant z-other"> …</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Here, we have generated a string from its regular expression. As a reminder, you
might have already seen this on this blog:
<a href="https://mnt.io/articles/generate-strings-based-on-regular-expressions/">Generate strings based on regular expressions</a>.</p>
<p>Fun fact: <code>atoum/json-schema-extension</code> is tested with atoum,
obviously, and… <code>atoum/praspel-extension</code>!</p>
<h2 id="-4">Conclusion<a role="presentation" class="anchor" href="#-4" title="Anchor link to this header">#</a>
</h2>
<p>atoum has a bright future with exciting features! We sincerely hope this
new direction will gather existing and new contributors 😄.</p>
<p>❤️ open-source!</p>
Hello fruux!2014-11-24T00:00:00+00:002014-11-24T00:00:00+00:00
Unknown
https://mnt.io/articles/hello-fruux/<h2 id="leaving-the-research-world">Leaving the research world<a role="presentation" class="anchor" href="#leaving-the-research-world" title="Anchor link to this header">#</a>
</h2>
<p>I have really enjoyed my time at INRIA and Femto-ST, two research
institutes in France. But after 8 years at the university and a hard PhD
thesis (with great results, by the way!), I would like to see other
things.</p>
<p>My time as an intern at Mozilla and my work in the open-source world
have been very <em>seductive</em>. Open-source contrasts a lot with the
research world, where privacy and secrecy are first-class citizens of every
project. All the work I have done and all the algorithms I have
developed during my PhD thesis have been implemented under an
open-source license, and I ran into some issues because of that decision
(patents are sometimes better, sometimes not… long story).</p>
<p>So, I like research but I also like to hack and share everything. And
right now, I need a change of air! So I asked on Twitter:</p>
<blockquote>
<p>I (Ivan Enderlin, a fresh PhD, creator of Hoa) am looking for a job.
Here is my CV:
<a rel="noopener external" target="_blank" href="http://t.co/dAdLm35RUu">http://t.co/dAdLm35RUu</a>.
Please, contact me!
<a rel="noopener external" target="_blank" href="https://twitter.com/hashtag/hoajob?src=hash">#hoajob</a></p>
<p>— Hoa project (@hoaproject) <a rel="noopener external" target="_blank" href="https://twitter.com/hoaproject/status/492382581271572480">July 24th,
2014</a></p>
</blockquote>
<p>And what a surprise! A <strong>lot</strong> of companies replied to my tweet (most
of them privately, of course), but the most interesting one in my eyes
was… fruux 😉.</p>
<h2 id="fruux">fruux<a role="presentation" class="anchor" href="#fruux" title="Anchor link to this header">#</a>
</h2>
<p>fruux defines itself as: “A unified contacts/calendaring system that
works across <a rel="noopener external" target="_blank" href="https://fruux.com/supported-devices/">platforms and
devices</a>. We are behind
<a rel="noopener external" target="_blank" href="https://fruux.com/opensource"><code>sabre/dav</code></a>, which is the most popular
open-source implementation of the
<a rel="noopener external" target="_blank" href="http://en.wikipedia.org/wiki/CardDAV">CardDAV</a> and
<a rel="noopener external" target="_blank" href="http://en.wikipedia.org/wiki/CalDAV">CalDAV</a> standards. Besides us,
developers and companies around the globe use our <code>sabre/dav</code> technology
to deliver sync functionality to millions of users”.</p>
<figure>
<p><img src="https://mnt.io/articles/hello-fruux/./fruux-logo.png" alt="Fruux's logo" loading="lazy" decoding="async" /></p>
<figcaption>
fruux's logo.
</figcaption>
</figure>
<p>Several things attract me to fruux:</p>
<ol>
<li>low layers are open-source,</li>
<li>viable ecosystem based on open-source,</li>
<li>accepts remote working,</li>
<li>close timezone to mine,</li>
<li>touching millions of people,</li>
<li>standards in mind.</li>
</ol>
<p>The first point is the most important for me. I don't want to make a
company richer without any benefit for the rest of the world. I want my
work to be beneficial to everyone: I want to share it, and I want it to be
reused, hacked, criticized, updated and shared again. This
is the spirit of the open-source and hackability paradigms. And
fortunately for me, fruux's low layers are 100% open-source, namely
<code>sabre/dav</code> & co.</p>
<p>However, being able to eat at the end of the month thanks to open-source is
not that simple. Fortunately for me, fruux has a stable economic model
based on open-source. Obviously, I have to work on closed projects and
for some specific customers, but I can go back
to open-source goodness all the time 😉.</p>
<p>In addition, I am currently living in Switzerland and fruux is located
in Germany. Fortunately for me, fruux's team is spread all
around Europe and the world. Naturally, they let me work remotely.
Whilst it can be inconvenient for some people, I enjoy having my own
hours, organizing myself as I like, etc. Periodic meetings and
phone calls help to stay focused. And I like to think that people are
more productive this way. After 4 years at home because of my Master
thesis and PhD thesis, I know how to organize myself and communicate with a
decentralized team. This is a big advantage. Moreover, Germany is in the
same timezone as Switzerland! Compared to companies located in, for
instance, California, this is simpler for my family.</p>
<p>Finally, working on an open-source project that is used by millions of
users is very motivating. I know that my contributions will touch a
lot of people, and it gives meaning to my work on a daily basis. Also,
the last thing I love at fruux is this desire to respect standards, RFCs,
recommendations etc. They are involved in these processes, consortiums
and groups (for instance
<a rel="noopener external" target="_blank" href="http://calconnect.org/mbrlist.shtml">CalConnect</a>). I love standards and
specifications, and this methodology reminds me of the scientific approach
I had with my PhD thesis. I consider that a standard without an
implementation has no meaning, and a well-designed standard is a piece
of a delicious cake, especially when everyone respects this standard 😄.</p>
<p>(… but the cake is a lie!)</p>
<h2 id="sabre"><code>sabre/*</code><a role="presentation" class="anchor" href="#sabre" title="Anchor link to this header">#</a>
</h2>
<p>fruux hired me mostly because of my experience with
<a rel="noopener external" target="_blank" href="http://hoa-project.net/">Hoa</a>. One of my main public jobs is to work on
all the <code>sabre/*</code> libraries, which include:</p>
<ul>
<li><a rel="noopener external" target="_blank" href="https://github.com/fruux/sabre-dav"><code>sabre/dav</code></a>,</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/fruux/sabre-davclient"><code>sabre/davclient</code></a>,</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/fruux/sabre-event"><code>sabre/event</code></a>,</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/fruux/sabre-http"><code>sabre/http</code></a>,</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/fruux/sabre-proxy"><code>sabre/proxy</code></a>,</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/fruux/sabre-tzserver"><code>sabre/tzserver</code></a>,</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/fruux/sabre-vobject"><code>sabre/vobject</code></a>,</li>
<li><a rel="noopener external" target="_blank" href="https://github.com/fruux/sabre-xml"><code>sabre/xml</code></a>.</li>
</ul>
<p>You will find the documentation and the news on
<a rel="noopener external" target="_blank" href="http://sabre.io/">sabre.io</a>.</p>
<p>All these libraries serve the first one: <code>sabre/dav</code>, which is an
implementation of the WebDAV technology, including extensions for
CalDAV and CardDAV, for calendars and tasks, and for address
books, respectively. For those who do not know what WebDAV is, in a few words: the
Web is mostly a read-only medium, but WebDAV extends HTTP in order to be
able to write and collaborate on documents. The way WebDAV is defined is
fascinating, and even more so the way it can be extended.</p>
<p>Most of the work is already done by <a rel="noopener external" target="_blank" href="http://evertpot.com/">Evert</a> and
many contributors, but we can go deeper! More extensions, more
standards, better code, better algorithms etc.!</p>
<p>If you are interested in the work I am doing on <code>sabre/*</code>, you can
check this <a rel="noopener external" target="_blank" href="https://github.com/search?q=user%3Afruux+author%3Ahywan&type=Issues">search result on
GitHub</a>.</p>
<h2 id="">Future of Hoa<a role="presentation" class="anchor" href="#" title="Anchor link to this header">#</a>
</h2>
<p>Some people have asked me about the future of Hoa: whether I am going
to stop it or not, now that I have a job.</p>
<p>Firstly, a PhD thesis is exhausting, and believe me, it requires more
energy than a regular job, even if you are passionate about your job and
you do not count your working hours. With a PhD thesis, you have no weekend,
no holidays, you are always out of time, you always have a ton (sic) of
articles and documents to read… there is no break, no end. Even in these
conditions, I was able to maintain Hoa and to grow the project,
thanks to a very helpful and present community!</p>
<p>Secondly, fruux is planning to use Hoa. I don't know how or when, but if
at a certain moment, using Hoa makes sense, they will. What does it
imply for Hoa and me? It means that I will be paid to work on Hoa for a
small percentage of my time. I don't know how much, it will depend on the period,
but this is a big step forward for the project. Moreover, a project like
fruux using Hoa is a great opportunity! I hope to see fruux's logo very
soon on the homepage of Hoa's website.</p>
<p>Thus, to conclude, I will have more time (on evenings, weekends,
holidays and sometimes during the day) to work on Hoa. Do not be afraid,
the future is bright 😄.</p>
<h2 id="-1">Conclusion<a role="presentation" class="anchor" href="#-1" title="Anchor link to this header">#</a>
</h2>
<p><em>Bref</em>, I am working at fruux!</p>
Generate strings based on regular expressions2014-09-30T00:00:00+00:002014-09-30T00:00:00+00:00
Unknown
https://mnt.io/articles/generate-strings-based-on-regular-expressions/<p>During my PhD thesis, I partly worked on the problem of automatically
generating accurate test data. In order to be complete and self-contained, I
addressed all kinds of data types, including strings. This article aims at
showing how to generate accurate and relevant strings under several constraints.</p>
<h2 id="what-is-a-regular-expression">What is a regular expression?<a role="presentation" class="anchor" href="#what-is-a-regular-expression" title="Anchor link to this header">#</a>
</h2>
<p>We are talking about formal language theory here. In the known world,
there are four kinds of languages. More formally, in 1956, the <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Chomsky_hierarchy">Chomsky
hierarchy</a> was
formulated, classifying grammars (which define languages) in four
levels:</p>
<ol>
<li>unrestricted grammars, matching languages known as Turing languages, no
restriction,</li>
<li>context-sensitive grammars, matching contextual languages,</li>
<li>context-free grammars, matching algebraic languages, based on pushdown
(stack) automata,</li>
<li>regular grammars, matching regular languages.</li>
</ol>
<p>Each level includes the next level. The last level is the “weakest”,
which must not sound negative here. <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Regular_expression">Regular
expressions</a> are used
often because of their simplicity and also because they solve most
problems we encounter daily.</p>
<p>A regular expression is a small language with very few operators and,
most of the time, simple semantics. For instance <code>ab(c|d)</code> means: a
word (a piece of data) starting with <code>ab</code> and followed by <code>c</code> or <code>d</code>. We also have
quantification operators (also known as repetition operators), such as
<code>?</code>, <code>*</code> and <code>+</code>. We also have <code>{x,y}</code> to define a repetition
between <code>x</code> and <code>y</code> times. Thus, <code>?</code> is equivalent to <code>{0,1}</code>, <code>*</code> to
<code>{0,}</code> and <code>+</code> to <code>{1,}</code>. When <code>y</code> is missing, it means +∞, so
unbounded (or more exactly, bounded by the limits of the machine). So,
for instance <code>ab(c|d){2,4}e?</code> means: a word starting with <code>ab</code>, followed
2, 3 or 4 times by <code>c</code> or <code>d</code> (so <code>cc</code>, <code>cd</code>, <code>dc</code>, <code>ccc</code>, <code>ccd</code>, <code>cdc</code>
and so on) and potentially followed by <code>e</code>.</p>
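<p>As a quick sanity check, the reading of <code>ab(c|d){2,4}e?</code> given above can be tried with PHP's <code>preg_match</code>. This is only a sketch: the <code>matchesWord</code> helper and the sample words are assumptions made for illustration, and the pattern is anchored with <code>^</code> and <code>$</code> so that the whole word must match, not just a part of it.</p>
<pre class="giallo z-code"><code data-lang="php"><?php

// Hypothetical helper: does the whole word match `ab(c|d){2,4}e?`?
function matchesWord(string $word): bool
{
    return preg_match('/^ab(c|d){2,4}e?$/', $word) === 1;
}

// Sample words, chosen for illustration only.
foreach (['abcc', 'abdcde', 'abcdcd', 'abc', 'abccccc'] as $word) {
    printf("%-8s %s\n", $word, matchesWord($word) ? 'matches' : 'does not match');
}

// `?`, `*` and `+` behave like `{0,1}`, `{0,}` and `{1,}` respectively.
$pairs = [
    ['/^a?$/', '/^a{0,1}$/'],
    ['/^a*$/', '/^a{0,}$/'],
    ['/^a+$/', '/^a{1,}$/'],
];

foreach ($pairs as [$short, $long]) {
    foreach (['', 'a', 'aa', 'aaa'] as $word) {
        // Both notations must agree on every sample word.
        if (preg_match($short, $word) !== preg_match($long, $word)) {
            echo "Unexpected difference between $short and $long\n";
        }
    }
}</code></pre>
<p>The second loop only compares the two notations on a few words; it illustrates the equivalence, it does not prove it.</p>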
<p>The goal here is not to teach you regular expressions but this is kind
of a tiny reminder. There are plenty of regular expression languages. You might
know <a rel="noopener external" target="_blank" href="http://www.unix.com/man-page/Linux/7/regex/">POSIX regular
expressions</a> or <a rel="noopener external" target="_blank" href="http://pcre.org/">Perl
Compatible Regular Expressions (PCRE)</a>. Forget the
first one, please. Its syntax and semantics are too limited.
PCRE is the regular expression language I recommend all the time.</p>
<p>Behind every formal language there is a graph. A regular expression is compiled
into a <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/">Finite State Machine (FSM)</a>. I am not
going to draw and explain them, but it is interesting to know that behind a
regular expression there is a basic automaton. No magic.</p>
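<p>To make the “basic automaton” idea concrete, here is a hand-written sketch for the simpler expression <code>ab(c|d)</code>. The state numbers, the <code>$transitions</code> table and the <code>accepts</code> function are hypothetical and chosen for illustration; this is not what PCRE or Hoa generates internally.</p>
<pre class="giallo z-code"><code data-lang="php"><?php

// Hypothetical automaton for `ab(c|d)`: state 0 is the initial state,
// state 3 is the only accepting state.
function accepts(string $word): bool
{
    // Transition table: current state => [symbol => next state].
    $transitions = [
        0 => ['a' => 1],
        1 => ['b' => 2],
        2 => ['c' => 3, 'd' => 3],
        3 => [],
    ];
    $state = 0;

    foreach (str_split($word) as $symbol) {
        // No transition for this symbol: the word is rejected.
        if (!isset($transitions[$state][$symbol])) {
            return false;
        }

        $state = $transitions[$state][$symbol];
    }

    // The word is accepted only if we stop on the accepting state.
    return $state === 3;
}

var_dump(accepts('abc'), accepts('abd'), accepts('ab'), accepts('abcd'));</code></pre>
<p>Reading a word is just following arrows in a table, one symbol at a time. No magic indeed.</p>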
<h3 id="why-focussing-regular-expressions">Why focus on regular expressions?<a role="presentation" class="anchor" href="#why-focussing-regular-expressions" title="Anchor link to this header">#</a>
</h3>
<p>This article focuses on regular languages instead of other kinds of
languages because we use them very often (even daily). I am going to
address context-free languages in another article, be patient young
padawan. The needs and constraints with other kinds of languages are not
the same and more complex algorithms must be involved. So we are going
easy for the first step.</p>
<h2 id="understanding-pcre-lex-and-parse-them">Understanding PCRE: lex and parse them<a role="presentation" class="anchor" href="#understanding-pcre-lex-and-parse-them" title="Anchor link to this header">#</a>
</h2>
<p>The <a rel="noopener external" target="_blank" href="https://github.com/hoaproject/Compiler"><code>Hoa\Compiler</code> library</a> provides
both LL(1) and LL(k) compiler-compilers. The
<a rel="noopener external" target="_blank" href="http://hoa-project.net/Literature/Hack/Compiler.html">documentation</a>
describes how to use it. We discover that the LL(k)
compiler comes with a grammar description language called PP. What does
it mean? It means for instance that the grammar of the PCRE can be
written with the PP language and that <code>Hoa\Compiler\Llk</code> will transform
this grammar into a compiler. That's why we call them “compilers of
compilers”.</p>
<p>Fortunately, the <a rel="noopener external" target="_blank" href="https://github.com/hoaproject/Regex"><code>Hoa\Regex</code> library</a>
provides the grammar of the PCRE language in the
<a rel="noopener external" target="_blank" href="https://github.com/hoaproject/Regex/blob/master/Source/Grammar.pp"><code>hoa://Library/Regex/Grammar.pp</code></a>
file. Consequently, we are able to analyze regular expressions written
in the PCRE language! Let's first try in a shell with the
<code>hoa compiler:pp</code> tool:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> echo </span><span class="z-string">'ab(c|d){2,4}e?'</span><span class="z-keyword z-operator"> |</span><span class="z-entity z-name"> hoa</span><span class="z-string"> compiler:pp hoa://Library/Regex/Grammar.pp</span><span class="z-constant z-numeric"> 0</span><span class="z-constant z-other"> --visitor</span><span class="z-string"> dump</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">></span><span> #expression</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">></span><span class="z-keyword z-operator"> ></span><span class="z-comment"> #concatenation</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">></span><span class="z-keyword z-operator"> > ></span><span> token(</span><span class="z-entity z-name">literal,</span><span class="z-string"> a</span><span>)</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">></span><span class="z-keyword z-operator"> > ></span><span> token(</span><span class="z-entity z-name">literal,</span><span class="z-string"> b</span><span>)</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">></span><span class="z-keyword z-operator"> > ></span><span class="z-comment"> #quantification</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">></span><span class="z-keyword z-operator"> > > ></span><span class="z-comment"> #alternation</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">></span><span class="z-keyword z-operator"> > > > ></span><span> token(</span><span class="z-entity z-name">literal,</span><span class="z-string"> c</span><span>)</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">></span><span class="z-keyword z-operator"> > > > ></span><span> token(</span><span class="z-entity z-name">literal,</span><span class="z-string"> d</span><span>)</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">></span><span class="z-keyword z-operator"> > > ></span><span> token(</span><span class="z-entity z-name">n_to_m,</span><span class="z-string"> {2,4}</span><span>)</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">></span><span class="z-keyword z-operator"> > ></span><span class="z-comment"> #quantification</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">></span><span class="z-keyword z-operator"> > > ></span><span> token(</span><span class="z-entity z-name">literal,</span><span class="z-string"> e</span><span>)</span></span>
<span class="giallo-l"><span class="z-punctuation z-separator">></span><span class="z-keyword z-operator"> > > ></span><span> token(</span><span class="z-entity z-name">zero_or_one,</span><span class="z-string"> ?</span><span>)</span></span></code></pre>
<p>We read that the whole expression is composed of a single concatenation
of two tokens, <code>a</code> and <code>b</code>, followed by a quantification, followed by
another quantification. The first quantification is an alternation of (a
choice between) two tokens, <code>c</code> and <code>d</code>, repeated 2 to 4 times. The second
quantification is the <code>e</code> token that can appear zero or one time. Pretty
simple.</p>
<p>The final output of the <code>Hoa\Compiler\Llk\Parser</code> class is an <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Abstract_syntax_tree">Abstract
Syntax Tree (AST)</a>.
The documentation of <code>Hoa\Compiler</code> explains all of this, and you should read
it. The LL(k) compiler is split into very distinct layers in order to improve
hackability. Again, the documentation teaches us that there are <a rel="noopener external" target="_blank" href="http://hoa-project.net/Literature/Hack/Compiler.html#Compilation_process">four levels in the
compilation
process</a>:
lexical analyzer, syntactic analyzer, trace and AST. The lexical analyzer (also
known as lexer) transforms the textual data being analyzed into a sequence of
tokens (formally known as lexemes). It checks whether the data is composed of
the right pieces. Then, the syntactic analyzer (also known as parser) checks that
the order of tokens in this sequence is correct (formally we say that it derives
the sequence, see the <a rel="noopener external" target="_blank" href="http://hoa-project.net/Literature/Hack/Compiler.html#Matching_words">Matching words
section</a>
to learn more).</p>
<p>Still in the shell, we can get the result of the lexical analyzer by
using the <code>--token-sequence</code> option; thus:</p>
<pre class="giallo z-code"><code data-lang="shellsession"><span class="giallo-l"><span class="z-punctuation z-separator">$</span><span> echo </span><span class="z-string">'ab(c|d){2,4}e?'</span><span class="z-keyword z-operator"> |</span><span class="z-entity z-name"> hoa</span><span class="z-string"> compiler:pp hoa://Library/Regex/Grammar.pp</span><span class="z-constant z-numeric"> 0</span><span class="z-constant z-other"> --token-sequence</span></span>
<span class="giallo-l"><span> # … token name token value offset</span></span>
<span class="giallo-l"><span>-----------------------------------------</span></span>
<span class="giallo-l"><span> 0 … literal a 0</span></span>
<span class="giallo-l"><span> 1 … literal b 1</span></span>
<span class="giallo-l"><span> 2 … capturing_ ( 2</span></span>
<span class="giallo-l"><span> 3 … literal c 3</span></span>
<span class="giallo-l"><span> 4 … alternation | 4</span></span>
<span class="giallo-l"><span> 5 … literal d 5</span></span>
<span class="giallo-l"><span> 6 … _capturing ) 6</span></span>
<span class="giallo-l"><span> 7 … n_to_m {2,4} 7</span></span>
<span class="giallo-l"><span> 8 … literal e 12</span></span>
<span class="giallo-l"><span> 9 … zero_or_one ? 13</span></span>
<span class="giallo-l"><span> 10 … EOF 15</span></span></code></pre>
<p>This is the sequence of tokens produced by the lexical analyzer. The
tree is not yet built because this is the first step of the compilation
process. However, it is always interesting to understand these
different steps and see how they work.</p>
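<p>To get a feeling for what a lexical analyzer does, here is a minimal hand-written sketch that produces a similar sequence for our example. It is not how <code>Hoa\Compiler</code> is implemented: the <code>TOKENS</code> table and the <code>lex</code> function are hypothetical, the token names are borrowed from the output above, and the patterns are simplified guesses rather than the content of <code>Grammar.pp</code>.</p>
<pre class="giallo z-code"><code data-lang="php"><?php

// Hypothetical token definitions: token name => regular expression.
const TOKENS = [
    'capturing_'  => '\(',
    '_capturing'  => '\)',
    'alternation' => '\|',
    'n_to_m'      => '\{[0-9]+,[0-9]+\}',
    'zero_or_one' => '\?',
    'literal'     => '[a-z]',
];

function lex(string $input): array
{
    $sequence = [];
    $offset   = 0;
    $length   = strlen($input);

    while ($offset < $length) {
        foreach (TOKENS as $name => $pattern) {
            // The `A` modifier anchors the match at the current offset.
            if (preg_match('/' . $pattern . '/A', $input, $match, 0, $offset) === 1) {
                $sequence[] = ['name' => $name, 'value' => $match[0], 'offset' => $offset];
                $offset    += strlen($match[0]);

                // Token found: go back to the `while` loop.
                continue 2;
            }
        }

        // No token definition matches: the data is not made of the right pieces.
        throw new RuntimeException("Unrecognized character at offset $offset.");
    }

    return $sequence;
}

foreach (lex('ab(c|d){2,4}e?') as $i => $token) {
    printf("%2d  %-12s %-6s %d\n", $i, $token['name'], $token['value'], $token['offset']);
}</code></pre>
<p>Unlike the real tool, this sketch does not emit a final <code>EOF</code> token and only knows the six kinds of tokens needed by the example.</p>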
<p>Now we are able to analyze any regular expression in the PCRE format!
The result of this analysis is a tree. You know what is fun with trees?
<a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Visitor_pattern">Visiting them</a>.</p>
<h2 id="visiting-the-ast">Visiting the AST<a role="presentation" class="anchor" href="#visiting-the-ast" title="Anchor link to this header">#</a>
</h2>
<p>Unsurprisingly, each node of the AST can be visited thanks to the <a rel="noopener external" target="_blank" href="http://github.com/hoaproject/Visitor"><code>Hoa\Visitor</code>
library</a>. Here is an example with the
“dump” visitor:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">use</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Compiler</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword">use</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">File</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// 1. Load grammar.</span></span>
<span class="giallo-l"><span class="z-variable">$compiler</span><span class="z-keyword z-operator"> =</span><span> Compiler</span><span class="z-punctuation z-separator">\</span><span>Llk</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Llk</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">load</span><span>(</span></span>
<span class="giallo-l"><span class="z-keyword"> new</span><span> File</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Read</span><span>(</span><span class="z-string">'hoa://Library/Regex/Grammar.pp'</span><span>)</span></span>
<span class="giallo-l"><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// 2. Parse some data.</span></span>
<span class="giallo-l"><span class="z-variable">$ast</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> $compiler</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">parse</span><span>(</span><span class="z-string">'ab(c|d){2,4}e?'</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// 3. Dump the AST.</span></span>
<span class="giallo-l"><span class="z-variable">$dump</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> new</span><span> Compiler</span><span class="z-punctuation z-separator">\</span><span>Visitor</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Dump</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">echo</span><span class="z-variable"> $dump</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">visit</span><span>(</span><span class="z-variable">$ast</span><span>)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>This program prints the same AST dump we saw earlier in the
shell.</p>
<p>How do we write our own visitor? A visitor is a class with a single <code>visit</code>
method. Let's try a visitor that pretty-prints a regular expression, i.e.
transforms:</p>
<pre class="giallo z-code"><code data-lang="plain"><span class="giallo-l"><span>ab(c|d){2,4}e?</span></span></code></pre>
<p>into:</p>
<pre class="giallo z-code"><code data-lang="plain"><span class="giallo-l"><span>a</span></span>
<span class="giallo-l"><span>b</span></span>
<span class="giallo-l"><span>(</span></span>
<span class="giallo-l"><span> c</span></span>
<span class="giallo-l"><span> |</span></span>
<span class="giallo-l"><span> d</span></span>
<span class="giallo-l"><span>){2,4}</span></span>
<span class="giallo-l"><span>e?</span></span></code></pre>
<p>Why a pretty printer? First, it shows how to visit a tree. Second, it
shows the structure of the visitor: we filter by node ID (<code>#expression</code>,
<code>#quantification</code>, <code>token</code> etc.) and apply the corresponding computation. A
pretty printer is often a good way to become familiar with the
structure of an AST.</p>
<p>Here is the class. It handles only the constructions needed for the given
example:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">use</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Visitor</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> PrettyPrinter</span><span class="z-storage"> implements</span><span> Visitor</span><span class="z-punctuation z-separator">\</span><span class="z-entity z-other z-inherited-class">Visit</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> visit</span><span>(</span></span>
<span class="giallo-l"><span> Visitor</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Element</span><span class="z-variable"> $element</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-storage"> &</span><span class="z-variable">$handle</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-language"> null</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-variable"> $eldnah</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-language"> null</span></span>
<span class="giallo-l"><span> ) {</span></span>
<span class="giallo-l"><span class="z-storage"> static</span><span class="z-variable"> $_indent</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-numeric"> 0</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> $out</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-language"> null</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable"> $nodeId</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> $element</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getId</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> switch</span><span>(</span><span class="z-variable">$nodeId</span><span>) {</span></span>
<span class="giallo-l"><span class="z-comment"> // Reset indentation and…</span></span>
<span class="giallo-l"><span class="z-keyword"> case</span><span class="z-string"> '#expression'</span><span class="z-punctuation z-terminator">:</span></span>
<span class="giallo-l"><span class="z-variable"> $_indent</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-numeric"> 0</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // … visit all the children.</span></span>
<span class="giallo-l"><span class="z-keyword"> case</span><span class="z-string"> '#quantification'</span><span class="z-punctuation z-terminator">:</span></span>
<span class="giallo-l"><span class="z-keyword"> foreach</span><span>(</span><span class="z-variable">$element</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getChildren</span><span>()</span><span class="z-keyword z-operator"> as</span><span class="z-variable"> $child</span><span>)</span></span>
<span class="giallo-l"><span class="z-variable"> $out</span><span class="z-keyword z-operator"> .=</span><span class="z-variable"> $child</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">accept</span><span>(</span><span class="z-variable z-language">$this</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $handle</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $eldnah</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword"> break</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // One new line between each child of the concatenation.</span></span>
<span class="giallo-l"><span class="z-keyword"> case</span><span class="z-string"> '#concatenation'</span><span class="z-punctuation z-terminator">:</span></span>
<span class="giallo-l"><span class="z-keyword"> foreach</span><span>(</span><span class="z-variable">$element</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getChildren</span><span>()</span><span class="z-keyword z-operator"> as</span><span class="z-variable"> $child</span><span>)</span></span>
<span class="giallo-l"><span class="z-variable"> $out</span><span class="z-keyword z-operator"> .=</span><span class="z-variable"> $child</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">accept</span><span>(</span><span class="z-variable z-language">$this</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $handle</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $eldnah</span><span>)</span><span class="z-keyword z-operator"> .</span><span class="z-string"> "</span><span class="z-constant z-character">\n</span><span class="z-string">"</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword"> break</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Add parentheses and increase indentation.</span></span>
<span class="giallo-l"><span class="z-keyword"> case</span><span class="z-string"> '#alternation'</span><span class="z-punctuation z-terminator">:</span></span>
<span class="giallo-l"><span class="z-variable"> $oout</span><span class="z-keyword z-operator"> =</span><span class="z-punctuation z-section"> []</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable"> $pIndent</span><span class="z-keyword z-operator"> =</span><span class="z-support z-function"> str_repeat</span><span>(</span><span class="z-string">' '</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $_indent</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> ++</span><span class="z-variable">$_indent</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable"> $cIndent</span><span class="z-keyword z-operator"> =</span><span class="z-support z-function"> str_repeat</span><span>(</span><span class="z-string">' '</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $_indent</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> foreach</span><span>(</span><span class="z-variable">$element</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getChildren</span><span>()</span><span class="z-keyword z-operator"> as</span><span class="z-variable"> $child</span><span>)</span></span>
<span class="giallo-l"><span class="z-variable"> $oout</span><span class="z-punctuation z-section">[]</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> $cIndent</span><span class="z-keyword z-operator"> .</span><span class="z-variable"> $child</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">accept</span><span>(</span><span class="z-variable z-language">$this</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $handle</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $eldnah</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword z-operator"> --</span><span class="z-variable">$_indent</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable"> $out</span><span class="z-keyword z-operator"> .=</span><span class="z-variable"> $pIndent</span><span class="z-keyword z-operator"> .</span><span class="z-string"> '('</span><span class="z-keyword z-operator"> .</span><span class="z-string"> "</span><span class="z-constant z-character">\n</span><span class="z-string">"</span><span class="z-keyword z-operator"> .</span></span>
<span class="giallo-l"><span class="z-support z-function"> implode</span><span>(</span><span class="z-string">"</span><span class="z-constant z-character">\n</span><span class="z-string">"</span><span class="z-keyword z-operator"> .</span><span class="z-variable"> $cIndent</span><span class="z-keyword z-operator"> .</span><span class="z-string"> '|'</span><span class="z-keyword z-operator"> .</span><span class="z-string"> "</span><span class="z-constant z-character">\n</span><span class="z-string">"</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $oout</span><span>)</span><span class="z-keyword z-operator"> .</span><span class="z-string"> "</span><span class="z-constant z-character">\n</span><span class="z-string">"</span><span class="z-keyword z-operator"> .</span></span>
<span class="giallo-l"><span class="z-variable"> $pIndent</span><span class="z-keyword z-operator"> .</span><span class="z-string"> ')'</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword"> break</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Print token value verbatim.</span></span>
<span class="giallo-l"><span class="z-keyword"> case</span><span class="z-string"> 'token'</span><span class="z-punctuation z-terminator">:</span></span>
<span class="giallo-l"><span class="z-variable"> $tokenId</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> $element</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getValueToken</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable"> $tokenValue</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> $element</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getValueValue</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> switch</span><span>(</span><span class="z-variable">$tokenId</span><span>) {</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> case</span><span class="z-string"> 'literal'</span><span class="z-punctuation z-terminator">:</span></span>
<span class="giallo-l"><span class="z-keyword"> case</span><span class="z-string"> 'n_to_m'</span><span class="z-punctuation z-terminator">:</span></span>
<span class="giallo-l"><span class="z-keyword"> case</span><span class="z-string"> 'zero_or_one'</span><span class="z-punctuation z-terminator">:</span></span>
<span class="giallo-l"><span class="z-variable"> $out</span><span class="z-keyword z-operator"> .=</span><span class="z-variable"> $tokenValue</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword"> break</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> default</span><span class="z-punctuation z-terminator">:</span></span>
<span class="giallo-l"><span class="z-keyword"> throw new</span><span class="z-support z-class"> RuntimeException</span><span>(</span></span>
<span class="giallo-l"><span class="z-string"> 'Token ID '</span><span class="z-keyword z-operator"> .</span><span class="z-variable"> $tokenId</span><span class="z-keyword z-operator"> .</span><span class="z-string"> ' is not well-handled.'</span></span>
<span class="giallo-l"><span> )</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span class="z-keyword"> break</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> default</span><span class="z-punctuation z-terminator">:</span></span>
<span class="giallo-l"><span class="z-keyword"> throw new</span><span class="z-support z-class"> RuntimeException</span><span>(</span></span>
<span class="giallo-l"><span class="z-string"> 'Node ID '</span><span class="z-keyword z-operator"> .</span><span class="z-variable"> $nodeId</span><span class="z-keyword z-operator"> .</span><span class="z-string"> ' is not well-handled.'</span></span>
<span class="giallo-l"><span> )</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable"> $out</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Finally, we apply the pretty printer to the AST, as we did with the dump
visitor:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
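<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">use</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Compiler</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword">use</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">File</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>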
<span class="giallo-l"><span class="z-variable">$compiler</span><span class="z-keyword z-operator"> =</span><span> Compiler</span><span class="z-punctuation z-separator">\</span><span>Llk</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Llk</span><span class="z-keyword z-operator">::</span><span class="z-entity z-name z-function">load</span><span>(</span></span>
<span class="giallo-l"><span class="z-keyword"> new</span><span> File</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Read</span><span>(</span><span class="z-string">'hoa://Library/Regex/Grammar.pp'</span><span>)</span></span>
<span class="giallo-l"><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable">$ast</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> $compiler</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">parse</span><span>(</span><span class="z-string">'ab(c|d){2,4}e?'</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable">$prettyprint</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> new</span><span class="z-support z-class"> PrettyPrinter</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">echo</span><span class="z-variable"> $prettyprint</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">visit</span><span>(</span><span class="z-variable">$ast</span><span>)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p><em>Et voilà !</em></p>
<p>Now, let's put it all together!</p>
<h2 id="isotropic-generation">Isotropic generation<a role="presentation" class="anchor" href="#isotropic-generation" title="Anchor link to this header">#</a>
</h2>
<p>We can use <code>Hoa\Regex</code> and <code>Hoa\Compiler</code> to get the AST of any regular
expression written in the PCRE format. We can use <code>Hoa\Visitor</code> to
traverse the AST and apply computations according to the type of each node.
Our goal is to generate strings based on regular expressions. What kind
of generation are we going to use? There are plenty of them: uniform
random, smallest, coverage-based…</p>
<p>The simplest is isotropic generation, also known as random generation.
But “random” alone says nothing: what is the distribution, and is it
uniform? Isotropic means each choice is resolved randomly and
uniformly. Uniformity has to be defined: does it cover the whole set
of nodes or just the immediate children of the node? Isotropic means we
consider only immediate children. For instance, if an <code>#alternation</code> node
has <em>c</em> immediate children, the probability of choosing a given child <em>C</em> is:</p>
<math xmlns="http://www.w3.org/1998/Math/MathML">
<semantics>
<mrow>
<mi>P</mi>
<mo stretchy="false">(</mo><mi>C</mi><mo stretchy="false">)</mo>
<mo>=</mo>
<mfrac>
<mn>1</mn>
<mi>c</mi>
</mfrac>
</mrow>
<annotation encoding="application/x-tex">P(C) = \frac{1}{c}</annotation>
</semantics>
</math>
<p>Yes, simple as that!</p>
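<p>To get a feel for this probability, here is a minimal sketch that uses plain
PHP only. It assumes PHP 7 or later for the native <code>random_int</code> function,
and the <code>pickUniformly</code> helper is a hypothetical name introduced for this
illustration; it is not part of any Hoa library. It draws one of <em>c</em> = 4
children many times and prints the observed frequencies, which should each land
close to 1/4:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span>&lt;?php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>// Hypothetical helper (not part of Hoa): pick one element uniformly.</span></span>
<span class="giallo-l"><span>function pickUniformly(array $children)</span></span>
<span class="giallo-l"><span>{</span></span>
<span class="giallo-l"><span>    // The bounds of random_int() are inclusive, hence count - 1.</span></span>
<span class="giallo-l"><span>    return $children[random_int(0, count($children) - 1)];</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>// Rough empirical check with c = 4 children.</span></span>
<span class="giallo-l"><span>$children = ['a', 'b', 'c', 'd'];</span></span>
<span class="giallo-l"><span>$counts   = array_fill_keys($children, 0);</span></span>
<span class="giallo-l"><span>$draws    = 10000;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>foreach (range(1, $draws) as $_) {</span></span>
<span class="giallo-l"><span>    ++$counts[pickUniformly($children)];</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>// Print the observed frequency of each child.</span></span>
<span class="giallo-l"><span>foreach ($counts as $child => $count) {</span></span>
<span class="giallo-l"><span>    printf("%s: %.3f\n", $child, $count / $draws);</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>The exact numbers change on every run; only their closeness to 1/<em>c</em>
matters here.</p>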
<p>We can use the <a rel="noopener external" target="_blank" href="https://github.com/hoaproject/Math"><code>Hoa\Math</code> library</a> that
provides the <code>Hoa\Math\Sampler\Random</code> class to sample uniform random integers
and floats. Ready?</p>
<h3 id="structure-of-the-visitor">Structure of the visitor<a role="presentation" class="anchor" href="#structure-of-the-visitor" title="Anchor link to this header">#</a>
</h3>
<p>The structure of the visitor is the following:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">use</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Visitor</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword">use</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Math</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">class</span><span class="z-entity z-name"> IsotropicSampler</span><span class="z-storage"> implements</span><span> Visitor</span><span class="z-punctuation z-separator">\</span><span class="z-entity z-other z-inherited-class">Visit</span><span> {</span></span>
<span class="giallo-l"><span class="z-storage"> protected</span><span class="z-variable"> $_sampler</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-language"> null</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-support z-function"> __construct</span><span>(Math</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Sampler</span><span class="z-variable"> $sampler</span><span>) {</span></span>
<span class="giallo-l"><span class="z-variable z-language"> $this</span><span class="z-keyword z-operator">-></span><span class="z-variable">_sampler</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> $sampler</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage"> public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> visit</span><span>(</span></span>
<span class="giallo-l"><span> Visitor</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Element</span><span class="z-variable"> $element</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-storage"> &</span><span class="z-variable">$handle</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-language"> null</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-variable"> $eldnah</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-language"> null</span></span>
<span class="giallo-l"><span> ) {</span></span>
<span class="giallo-l"><span class="z-keyword"> switch</span><span>(</span><span class="z-variable">$element</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getId</span><span>()) {</span></span>
<span class="giallo-l"><span class="z-comment"> // …</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>We set a sampler and we start visiting and filtering nodes by their node
ID. The following code will generate a string based on the regular
expression contained in the <code>$expression</code> variable:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable">$expression</span><span class="z-keyword z-operator"> =</span><span class="z-string"> '…'</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable">$ast</span><span class="z-keyword z-operator"> =</span><span class="z-variable"> $compiler</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">parse</span><span>(</span><span class="z-variable">$expression</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable">$generator</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> new</span><span class="z-support z-class"> IsotropicSampler</span><span>(</span><span class="z-keyword">new</span><span> Math</span><span class="z-punctuation z-separator">\</span><span>Sampler</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Random</span><span>())</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">echo</span><span class="z-variable"> $generator</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">visit</span><span>(</span><span class="z-variable">$ast</span><span>)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>We are going to change the value of <code>$expression</code> step by step until
we reach <code>ab(c|d){2,4}e?</code>.</p>
<h3 id="case-of-expression">Case of <code>#expression</code><a role="presentation" class="anchor" href="#case-of-expression" title="Anchor link to this header">#</a>
</h3>
<p>A node of type <code>#expression</code> has only one child. Thus, we simply return
the computation of this child:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">case</span><span class="z-string"> '#expression'</span><span class="z-keyword z-operator">:</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable"> $element</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getChild</span><span>(</span><span class="z-constant z-numeric">0</span><span>)</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">accept</span><span>(</span><span class="z-variable z-language">$this</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $handle</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $eldnah</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword"> break</span><span class="z-punctuation z-terminator">;</span></span></code></pre><h3 id="case-of-token">Case of <code>token</code><a role="presentation" class="anchor" href="#case-of-token" title="Anchor link to this header">#</a>
</h3>
<p>We consider only one type of token for now: <code>literal</code>. A literal can
contain an escaped character, can be a single character or can be <code>.</code>
(which matches any character). We consider only a single character for this
example (spoiler: the whole visitor already exists). Thus:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">case</span><span class="z-string"> 'token'</span><span class="z-keyword z-operator">:</span></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable"> $element</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getValueValue</span><span>()</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword"> break</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>Here, with <code>$expression = 'a';</code> we get the string <code>a</code>.</p>
<h3 id="case-of-concatenation">Case of <code>#concatenation</code><a role="presentation" class="anchor" href="#case-of-concatenation" title="Anchor link to this header">#</a>
</h3>
<p>A concatenation is just the computation of all children joined into a
single string. Thus:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">case</span><span class="z-string"> '#concatenation'</span><span class="z-keyword z-operator">:</span></span>
<span class="giallo-l"><span class="z-variable"> $out</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-language"> null</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> foreach</span><span>(</span><span class="z-variable">$element</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getChildren</span><span>()</span><span class="z-keyword z-operator"> as</span><span class="z-variable"> $child</span><span>)</span></span>
<span class="giallo-l"><span class="z-variable"> $out</span><span class="z-keyword z-operator"> .=</span><span class="z-variable"> $child</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">accept</span><span>(</span><span class="z-variable z-language">$this</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $handle</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $eldnah</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable"> $out</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword"> break</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>At this step, with <code>$expression = 'ab';</code> we get the string <code>ab</code>. Totally
crazy.</p>
<h3 id="case-of-alternation">Case of <code>#alternation</code><a role="presentation" class="anchor" href="#case-of-alternation" title="Anchor link to this header">#</a>
</h3>
<p>An alternation is a choice between several children. All we have to do
is select a child based on the probability given above. The number of
children of the current node is given by the
<code>getChildrenNumber</code> method. We are also going to use the integer
sampler. Thus:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">case</span><span class="z-string"> '#alternation'</span><span class="z-keyword z-operator">:</span></span>
<span class="giallo-l"><span class="z-variable"> $childIndex</span><span class="z-keyword z-operator"> =</span><span class="z-variable z-language"> $this</span><span class="z-keyword z-operator">-></span><span class="z-variable">_sampler</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getInteger</span><span>(</span></span>
<span class="giallo-l"><span class="z-constant z-numeric"> 0</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-variable"> $element</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getChildrenNumber</span><span>()</span><span class="z-keyword z-operator"> -</span><span class="z-constant z-numeric"> 1</span></span>
<span class="giallo-l"><span> )</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable"> $element</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getChild</span><span>(</span><span class="z-variable">$childIndex</span><span>)</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-entity z-name z-function">accept</span><span>(</span><span class="z-variable z-language">$this</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $handle</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $eldnah</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword"> break</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>Now, with <code>$expression = 'ab(c|d)';</code> we get the strings <code>abc</code> or <code>abd</code>
at random. Try several times to see for yourself.</p>
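<p>To look at the selection step in isolation, here is a minimal sketch that does not depend on Hoa. The <code>pickChild</code> function and the plain array of children are hypothetical stand-ins for the visitor and the AST node, and PHP's native <code>random_int</code> plays the role of the sampler of integers, assuming a uniform choice:</p>
<pre class="giallo z-code"><code data-lang="php">&lt;?php

// Hypothetical helper: uniformly select one child of an alternation.
// $children stands in for the children of the AST node.
function pickChild(array $children): string
{
    // Same bounds as in the visitor: from 0 to (number of children - 1).
    $childIndex = random_int(0, count($children) - 1);

    return $children[$childIndex];
}

// `(c|d)` has two children, so this prints either `abc` or `abd`.
echo 'ab', pickChild(['c', 'd']), "\n";</code></pre>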
<h3 id="case-of-quantification">Case of <code>#quantification</code><a role="presentation" class="anchor" href="#case-of-quantification" title="Anchor link to this header">#</a>
</h3>
<p>A quantification is an alternation of concatenations. Indeed, <code>e{2,4}</code>
is strictly equivalent to <code>ee|eee|eeee</code>. We have only two
quantifications in our example: <code>?</code> (which is equivalent to <code>{0,1}</code>) and <code>{x,y}</code>. We are going to
find the values of <code>x</code> and <code>y</code>, and then choose a number of repetitions at random between
these bounds. Let's go:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">case</span><span class="z-string"> '#quantification'</span><span class="z-keyword z-operator">:</span></span>
<span class="giallo-l"><span class="z-variable"> $out</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-language"> null</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable"> $x</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-numeric"> 0</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable"> $y</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-numeric"> 0</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Filter the type of quantification.</span></span>
<span class="giallo-l"><span class="z-keyword"> switch</span><span>(</span><span class="z-variable">$element</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getChild</span><span>(</span><span class="z-constant z-numeric">1</span><span>)</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getValueToken</span><span>()) {</span></span>
<span class="giallo-l"><span class="z-comment"> // ?</span></span>
<span class="giallo-l"><span class="z-keyword"> case</span><span class="z-string"> 'zero_or_one'</span><span class="z-punctuation z-terminator">:</span></span>
<span class="giallo-l"><span class="z-variable"> $y</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-numeric"> 1</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword"> break</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // {x,y}</span></span>
<span class="giallo-l"><span class="z-keyword"> case</span><span class="z-string"> 'n_to_m'</span><span class="z-punctuation z-terminator">:</span></span>
<span class="giallo-l"><span class="z-variable"> $xy</span><span class="z-keyword z-operator"> =</span><span class="z-support z-function"> explode</span><span>(</span></span>
<span class="giallo-l"><span class="z-string"> ','</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-support z-function"> trim</span><span>(</span><span class="z-variable">$element</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getChild</span><span>(</span><span class="z-constant z-numeric">1</span><span>)</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getValueValue</span><span>()</span><span class="z-punctuation z-separator">,</span><span class="z-string"> '{}'</span><span>)</span></span>
<span class="giallo-l"><span> )</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable"> $x</span><span class="z-keyword z-operator"> =</span><span> (</span><span class="z-storage">int</span><span>)</span><span class="z-support z-function"> trim</span><span>(</span><span class="z-variable">$xy</span><span class="z-punctuation z-section">[</span><span class="z-constant z-numeric">0</span><span class="z-punctuation z-section">]</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-variable"> $y</span><span class="z-keyword z-operator"> =</span><span> (</span><span class="z-storage">int</span><span>)</span><span class="z-support z-function"> trim</span><span>(</span><span class="z-variable">$xy</span><span class="z-punctuation z-section">[</span><span class="z-constant z-numeric">1</span><span class="z-punctuation z-section">]</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword"> break</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Choose the number of repetitions.</span></span>
<span class="giallo-l"><span class="z-variable"> $max</span><span class="z-keyword z-operator"> =</span><span class="z-variable z-language"> $this</span><span class="z-keyword z-operator">-></span><span class="z-variable">_sampler</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getInteger</span><span>(</span><span class="z-variable">$x</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $y</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment"> // Concatenate.</span></span>
<span class="giallo-l"><span class="z-keyword"> for</span><span>(</span><span class="z-variable">$i</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-numeric"> 0</span><span class="z-punctuation z-terminator">;</span><span class="z-variable"> $i</span><span class="z-keyword z-operator"> <</span><span class="z-variable"> $max</span><span class="z-punctuation z-terminator">;</span><span class="z-keyword z-operator"> ++</span><span class="z-variable">$i</span><span>) {</span></span>
<span class="giallo-l"><span class="z-variable"> $out</span><span class="z-keyword z-operator"> .=</span><span class="z-variable"> $element</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">getChild</span><span>(</span><span class="z-constant z-numeric">0</span><span>)</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">accept</span><span>(</span><span class="z-variable z-language">$this</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $handle</span><span class="z-punctuation z-separator">,</span><span class="z-variable"> $eldnah</span><span>)</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword"> return</span><span class="z-variable"> $out</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword"> break</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
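<p>The same logic can be sketched without any AST, which makes it easy to check the result against the original expression. In this sketch, <code>quantify</code>, <code>$cOrD</code> and <code>$e</code> are hypothetical names that are not part of Hoa, and <code>random_int</code> replaces the sampler:</p>
<pre class="giallo z-code"><code data-lang="php">&lt;?php

// Hypothetical helper: repeat the result of $child between $x and $y times.
// $child stands in for `$element-&gt;getChild(0)-&gt;accept(…)`.
function quantify(callable $child, int $x, int $y): string
{
    $out = '';
    $max = random_int($x, $y);

    for ($i = 0; $i &lt; $max; ++$i) {
        // The child is evaluated again at each iteration, so that
        // `(c|d){2,4}` can produce `cd`, and not only `cc` or `dd`.
        $out .= $child();
    }

    return $out;
}

$cOrD = fn (): string =&gt; ['c', 'd'][random_int(0, 1)];
$e    = fn (): string =&gt; 'e';

// ab(c|d){2,4}e?
$string = 'ab' . quantify($cOrD, 2, 4) . quantify($e, 0, 1);

// Every generated string must be matched by the original expression.
var_dump(preg_match('/^ab(c|d){2,4}e?$/', $string) === 1);</code></pre>
<p>Anchoring the expression with <code>^</code> and <code>$</code> in the check matters: without the anchors, a string that is too long would still be accepted.</p>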
<p>Finally, with <code>$expression = 'ab(c|d){2,4}e?';</code> we can get the
following strings: <code>abdcce</code>, <code>abdc</code>, <code>abddcd</code>, <code>abcde</code> etc. Nice, isn't
it? Want more?</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-keyword">for</span><span>(</span><span class="z-variable">$i</span><span class="z-keyword z-operator"> =</span><span class="z-constant z-numeric"> 0</span><span class="z-punctuation z-terminator">;</span><span class="z-variable"> $i</span><span class="z-keyword z-operator"> <</span><span class="z-constant z-numeric"> 42</span><span class="z-punctuation z-terminator">;</span><span class="z-keyword z-operator"> ++</span><span class="z-variable">$i</span><span>) {</span></span>
<span class="giallo-l"><span class="z-support z-function"> echo</span><span class="z-variable"> $generator</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">visit</span><span>(</span><span class="z-variable">$ast</span><span>)</span><span class="z-punctuation z-separator">,</span><span class="z-string"> "</span><span class="z-constant z-character">\n</span><span class="z-string">"</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">/**</span></span>
<span class="giallo-l"><span class="z-comment"> * Could output:</span></span>
<span class="giallo-l"><span class="z-comment"> * abdce</span></span>
<span class="giallo-l"><span class="z-comment"> * abdcc</span></span>
<span class="giallo-l"><span class="z-comment"> * abcdde</span></span>
<span class="giallo-l"><span class="z-comment"> * abcdcd</span></span>
<span class="giallo-l"><span class="z-comment"> * abcde</span></span>
<span class="giallo-l"><span class="z-comment"> * abcc</span></span>
<span class="giallo-l"><span class="z-comment"> * abddcde</span></span>
<span class="giallo-l"><span class="z-comment"> * abddcce</span></span>
<span class="giallo-l"><span class="z-comment"> * abcde</span></span>
<span class="giallo-l"><span class="z-comment"> * abcc</span></span>
<span class="giallo-l"><span class="z-comment"> * abdcce</span></span>
<span class="giallo-l"><span class="z-comment"> * abcde</span></span>
<span class="giallo-l"><span class="z-comment"> * abdce</span></span>
<span class="giallo-l"><span class="z-comment"> * abdd</span></span>
<span class="giallo-l"><span class="z-comment"> * abcdce</span></span>
<span class="giallo-l"><span class="z-comment"> * abccd</span></span>
<span class="giallo-l"><span class="z-comment"> * abdcdd</span></span>
<span class="giallo-l"><span class="z-comment"> * abcdcce</span></span>
<span class="giallo-l"><span class="z-comment"> * abcce</span></span>
<span class="giallo-l"><span class="z-comment"> * abddc</span></span>
<span class="giallo-l"><span class="z-comment"> */</span></span></code></pre><h2 id="performance">Performance<a role="presentation" class="anchor" href="#performance" title="Anchor link to this header">#</a>
</h2>
<p>It is difficult to give numbers because they depend on many parameters:
your machine configuration, the PHP VM, whether other programs are running, etc. But I have
generated 1 million strings in less than 25 seconds on my machine (an old
MacBook Pro), which is pretty reasonable.</p>
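<p>If you want to measure this on your own machine, a rough sketch could look like the following. The <code>generate</code> function is a hypothetical stand-in for <code>$generator-&gt;visit($ast)</code>, so it measures the shape of the benchmark, not the real visitor. Replace its body with the real call to get meaningful numbers:</p>
<pre class="giallo z-code"><code data-lang="php">&lt;?php

// Hypothetical stand-in for `$generator-&gt;visit($ast)`: it generates one
// string matching `ab(c|d){2,4}e?` without any AST.
function generate(): string
{
    $out = 'ab';

    for ($i = 0, $max = random_int(2, 4); $i &lt; $max; ++$i) {
        $out .= random_int(0, 1) === 0 ? 'c' : 'd';
    }

    return $out . (random_int(0, 1) === 1 ? 'e' : '');
}

// Increase this number to get closer to the 1 million strings above.
$iterations = 100000;
$start      = microtime(true);

for ($i = 0; $i &lt; $iterations; ++$i) {
    generate();
}

$elapsed = microtime(true) - $start;

printf("%d strings in %.3f seconds\n", $iterations, $elapsed);</code></pre>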
<h2 id="conclusion-and-surprise">Conclusion and surprise<a role="presentation" class="anchor" href="#conclusion-and-surprise" title="Anchor link to this header">#</a>
</h2>
<p>So, yes, now we know how to generate strings based on regular
expressions! Supporting the whole PCRE format is difficult. That's why the
<a rel="noopener external" target="_blank" href="https://github.com/hoaproject/Regex"><code>Hoa\Regex</code> library</a> provides
the <code>Hoa\Regex\Visitor\Isotropic</code> class, which is a more advanced visitor.
The latter supports classes, negative classes, ranges, all
quantifications, all kinds of literals (characters, escaped characters,
types of characters such as <code>\w</code>, <code>\d</code>, <code>\h</code>…) etc. Consequently, all you have
to do is:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
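<span class="giallo-l"><span class="z-keyword">use</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Math</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-keyword">use</span><span> Hoa</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Regex</span><span class="z-punctuation z-terminator">;</span></span>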
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-comment">// …</span></span>
<span class="giallo-l"><span class="z-variable">$generator</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> new</span><span> Regex</span><span class="z-punctuation z-separator">\</span><span>Visitor</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Isotropic</span><span>(</span><span class="z-keyword">new</span><span> Math</span><span class="z-punctuation z-separator">\</span><span>Sampler</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Random</span><span>())</span><span class="z-punctuation z-terminator">;</span></span>
<span class="giallo-l"><span class="z-support z-function">echo</span><span class="z-variable"> $generator</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">visit</span><span>(</span><span class="z-variable">$ast</span><span>)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>This algorithm is used in
<a rel="noopener external" target="_blank" href="https://github.com/hoaproject/Praspel">Praspel</a>, a
specification language I designed during my PhD thesis. More
specifically, this algorithm is used inside realistic domains. I am not
going to explain it today but it allows me to introduce the “surprise”.</p>
<h3 id="generate-strings-based-on-regular-expressions-in-atoum">Generate strings based on regular expressions in atoum<a role="presentation" class="anchor" href="#generate-strings-based-on-regular-expressions-in-atoum" title="Anchor link to this header">#</a>
</h3>
<p><a rel="noopener external" target="_blank" href="http://atoum.org/">atoum</a> is an awesome unit test framework. You can
use the <a rel="noopener external" target="_blank" href="https://github.com/hoaproject/Contributions-Atoum-PraspelExtension"><code>Atoum\PraspelExtension</code>
extension</a>
to use Praspel and therefore realistic domains inside atoum. You can use
realistic domains to validate <strong>and</strong> to generate data: they are
designed for that. Obviously, we can use the <code>Regex</code> realistic domain.
This extension provides several features including <code>sample</code>,
<code>sampleMany</code> and <code>predicate</code> to respectively generate one datum,
generate many data and validate a datum based on a realistic domain. To
declare a regular expression, we must write:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable">$regex</span><span class="z-keyword z-operator"> =</span><span class="z-variable z-language"> $this</span><span class="z-keyword z-operator">-></span><span class="z-variable">realdom</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">regex</span><span>(</span><span class="z-string z-regexp">'/ab(c|d){2,4}e?/'</span><span>)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>And to generate a datum, all we have to do is:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-variable">$datum</span><span class="z-keyword z-operator"> =</span><span class="z-variable z-language"> $this</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">sample</span><span>(</span><span class="z-variable">$regex</span><span>)</span><span class="z-punctuation z-terminator">;</span></span></code></pre>
<p>For instance, imagine you are writing a test called <code>test_mail</code> and you
need an email address:</p>
<pre class="giallo z-code"><code data-lang="php"><span class="giallo-l"><span class="z-keyword z-operator"><?</span><span class="z-constant z-other">php</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span class="z-storage">public</span><span class="z-storage z-type z-function"> function</span><span class="z-entity z-name z-function"> test_mail</span><span>() {</span></span>
<span class="giallo-l"><span class="z-variable z-language"> $this</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-entity z-name z-function">given</span><span>(</span></span>
<span class="giallo-l"><span class="z-variable"> $regex</span><span class="z-keyword z-operator"> =</span><span class="z-variable z-language"> $this</span><span class="z-keyword z-operator">-></span><span class="z-variable">realdom</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">regex</span><span>(</span><span class="z-string z-regexp">'/[\w\-_]</span><span class="z-keyword z-operator">+</span><span class="z-string z-regexp z-constant z-character">(\.[\w\-\_]</span><span class="z-keyword z-operator">+</span><span class="z-string z-regexp">)</span><span class="z-keyword z-operator">*</span><span class="z-string z-regexp z-constant z-character">@\w\.(net|org)/'</span><span>)</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-variable"> $address</span><span class="z-keyword z-operator"> =</span><span class="z-variable z-language"> $this</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">sample</span><span>(</span><span class="z-variable">$regex</span><span>)</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-variable"> $mailer</span><span class="z-keyword z-operator"> =</span><span class="z-keyword"> new</span><span class="z-punctuation z-separator"> \</span><span>Mock</span><span class="z-punctuation z-separator">\</span><span class="z-support z-class">Mailer</span><span>(</span><span class="z-constant z-other">…</span><span>)</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span> )</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-entity z-name z-function">when</span><span>(</span><span class="z-variable">$mailer</span><span class="z-keyword z-operator">-></span><span class="z-entity z-name z-function">sendTo</span><span>(</span><span class="z-variable">$address</span><span>))</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-variable">then</span></span>
<span class="giallo-l"><span class="z-keyword z-operator"> -></span><span class="z-variable">…</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>This is easy to read, fast to execute, and it helps to focus on the logic of the test
instead of the test data (also known as fixtures). Note that most of the
time the regular expressions are already in the code (maybe as
constants). It is therefore easier to write and to maintain the tests.</p>
<p>I hope you enjoyed this first part of the series :-)! This work has been
published in the International Conference on Software Testing,
Verification and Validation: <a rel="noopener external" target="_blank" href="https://hal.science/hal-00931662/file/EDGB12.pdf">Grammar-Based Testing using Realistic
Domains in PHP</a>.</p>
Rüsh Release2014-09-15T00:00:00+00:002014-09-15T00:00:00+00:00
Unknown
https://mnt.io/articles/rush-release/<p>For 2 years, at <a rel="noopener external" target="_blank" href="http://hoa-project.net/">Hoa</a>, we have been looking for the
perfect release process. Today, we have finalized the last thing related
to this new process: we have found a name. It is called <strong>Rüsh
Release</strong>, standing for <em>Rolling Ünd ScHeduled Release</em>.</p>
<p>The following explanations are written from the user's point of view, not
from the developer's point of view. It means that we do not explain all
the branches and the workflow between them. We will focus on
the final impact for the user.</p>
<h2 id="rolling-release">Rolling Release<a role="presentation" class="anchor" href="#rolling-release" title="Anchor link to this header">#</a>
</h2>
<p>On the one hand, Hoa is not and will never be finished. We will never reach
the “Holy 1.0 Grail”. So, one might reckon that Hoa is rolling-released?
Let's dive in this direction. There are plenty of <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Rolling_release">rolling release
types</a> out there, such
as:</p>
<ul>
<li>partially rolling,</li>
<li>fully rolling,</li>
<li>truly rolling,</li>
<li>pseudo-rolling,</li>
<li>optionally rolling,</li>
<li>cyclically rolling,</li>
<li>and synonyms…</li>
</ul>
<p>I am not going to explain all of them. All you need to know is that Hoa
is partly and truly rolling released, or <em>part-</em> and <em>true-</em> rolling
released for short. Why? Firstly, “Part-rolling project has a subset of
software packages that are not rolling”. If we look at Hoa only, it is
fully rolling, but Hoa depends on PHP virtual machines to be executed,
and they are not rolling released (for the most popular ones at least).
Thus, Hoa is partly rolling released. Secondly, “True-rolling
[project] are developed solely using a rolling release software
development model”, which is the case for Hoa. Consequently,
the <code>master</code> branch is the final public branch: it
<strong>always</strong> contains the latest version, and users constantly fetch
updates from it.</p>
<h2 id="versioning">Versioning<a role="presentation" class="anchor" href="#versioning" title="Anchor link to this header">#</a>
</h2>
<p>Sounds good. On the other hand, the majority of programs that are using
Hoa use tools called dependency managers. The most popular in PHP is
<a rel="noopener external" target="_blank" href="http://getcomposer.org/">Composer</a>. This is a fantastic tool but with a
little thorn that hurts us a lot: it does not support rolling release!
Most of the time, dependency managers work with version numbers, mainly
of the form <code>x.y.z</code>, with specific semantics for <code>x</code>, <code>y</code>
and <code>z</code>. For instance, some people have agreed on
<a rel="noopener external" target="_blank" href="http://semver.org/">semver</a>, standing for <em>Semantic Versioning</em>.</p>
<p>Also, we are not extremists. We understand the challenges and the needs
behind versioning. So, how do we mix both: rolling release and versioning?
Before answering this question, let's take a little step forward and
learn more about an alternative versioning approach.</p>
<h3 id="scheduled-based-release">Scheduled-based release<a role="presentation" class="anchor" href="#scheduled-based-release" title="Anchor link to this header">#</a>
</h3>
<p>Scheduled-based release, also known as date-based release, consists in making
releases at regular intervals. This approach is widely adopted by
projects that progress quickly, such as Firefox or PHP (see the <a rel="noopener external" target="_blank" href="https://wiki.php.net/rfc/releaseprocess">PHP
RFC: Release Process</a> for
example). For Firefox, every 6 weeks, a new version is released. Note
that we should say <em>a new update</em> to be honest: the term <em>version</em> has
an unclear meaning here.</p>
<p>The scheduled-based release seems to be a good candidate to be mixed with
rolling release, doesn't it?</p>
<h2 id="rush-release">Rüsh Release<a role="presentation" class="anchor" href="#rush-release" title="Anchor link to this header">#</a>
</h2>
<p>Rüsh Release is a mix between part- and true-rolling release and
scheduled-based release. The <code>master</code> branch is part- and true-rolling
released, but with semi-automatic versioning:</p>
<ul>
<li>every 6 weeks, if at least one new patch has been merged into <code>master</code>, a
new version is created,</li>
<li>before 6 weeks, if several critical or significant patches have been applied,
a new version is created.</li>
</ul>
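<p>These two rules can be sketched as a small decision function. This is only an illustration: <code>shouldRelease</code> is a hypothetical name, and the <code>$criticalThreshold</code> parameter is an assumption, since the rule only says “several” patches:</p>
<pre class="giallo z-code"><code data-lang="php">&lt;?php

// Hypothetical decision helper for the two rules above.
// $criticalThreshold is an assumption: the rule only says “several”.
function shouldRelease(
    DateTimeImmutable $lastRelease,
    DateTimeImmutable $today,
    int $mergedPatches,
    int $criticalPatches,
    int $criticalThreshold = 2
): bool {
    // Number of full days elapsed since the last release.
    $days = (int) $lastRelease-&gt;diff($today)-&gt;days;

    // Rule 1: 6 weeks have passed and at least one patch has been merged.
    if ($days &gt;= 6 * 7 &amp;&amp; $mergedPatches &gt;= 1) {
        return true;
    }

    // Rule 2: before 6 weeks, several critical or significant patches.
    return $criticalPatches &gt;= $criticalThreshold;
}

$last = new DateTimeImmutable('2014-08-01');

var_dump(shouldRelease($last, new DateTimeImmutable('2014-09-15'), 1, 0)); // bool(true)
var_dump(shouldRelease($last, new DateTimeImmutable('2014-08-20'), 3, 0)); // bool(false)
var_dump(shouldRelease($last, new DateTimeImmutable('2014-08-20'), 3, 2)); // bool(true)</code></pre>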
<p>What is the version format then? We have proposed <code>Y{2,4}.mm.dd</code>,
where the year is counted from 2000, our “Rüsh Epoch”.</p>
<p>Nevertheless, we are not <strong>infallible</strong> and we can potentially break
backward compatibility. It has never happened, but we have to face it. This
is a problem because neither the part- and true-rolling release nor the
scheduled-based release holds the information that backward
compatibility has been broken. Therefore, the <code>master</code> branch must have
a <strong>compatibility number</strong> <code>x</code>, starting from 1 and incremented by 1.
Consequently, the new and final version format is
<code>x.Y{2,4}.mm.dd</code>. For today, for instance, it is <code>1.14.09.15</code>.</p>
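<p>As an illustration, such a version number can be computed from a compatibility number and a date. The <code>rushVersion</code> function is a hypothetical helper, not something provided by Hoa, and it assumes the 2-digit form of the year used in the example above:</p>
<pre class="giallo z-code"><code data-lang="php">&lt;?php

// Hypothetical helper: build a Rüsh Release version number from a
// compatibility number and a release date.
function rushVersion(int $compatibility, DateTimeImmutable $date): string
{
    // `y` is the year on 2 digits; `m` and `d` are the zero-padded
    // month and day. The dots are kept as-is by `format`.
    return $compatibility . '.' . $date-&gt;format('y.m.d');
}

echo rushVersion(1, new DateTimeImmutable('2014-09-15')), "\n"; // 1.14.09.15</code></pre>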
<p>With the Rüsh Release process, we can freely rolling-release our
libraries while ensuring safety and embracing the pros of
versioning.</p>
<p>So, now, you will be able to change your <code>composer.json</code> files from:</p>
<pre class="giallo z-code"><code data-lang="json"><span class="giallo-l"><span>{</span></span>
<span class="giallo-l"><span class="z-support z-type z-property-name"> "require"</span><span class="z-punctuation z-separator">:</span><span> {</span></span>
<span class="giallo-l"><span class="z-support z-type z-property-name"> "hoa/websocket"</span><span class="z-punctuation z-separator">:</span><span class="z-string"> "dev-master"</span></span>
<span class="giallo-l"><span> }</span><span class="z-punctuation z-separator">,</span></span>
<span class="giallo-l"><span class="z-support z-type z-property-name"> "minimum-stability"</span><span class="z-punctuation z-separator">:</span><span class="z-string"> "dev"</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>to (<a rel="noopener external" target="_blank" href="https://getcomposer.org/doc/01-basic-usage.md#next-significant-release-tilde-operator-">learn more about the tilde
operator</a>):</p>
<pre class="giallo z-code"><code data-lang="json"><span class="giallo-l"><span>{</span></span>
<span class="giallo-l"><span class="z-support z-type z-property-name"> "require"</span><span class="z-punctuation z-separator">:</span><span> {</span></span>
<span class="giallo-l"><span class="z-support z-type z-property-name"> "hoa/websocket"</span><span class="z-punctuation z-separator">:</span><span class="z-string"> "~1.0"</span></span>
<span class="giallo-l"><span> }</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
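<p>The <code>~1.0</code> constraint means <code>&gt;=1.0 &lt;2.0</code>: every new version is accepted as long as the compatibility number stays at 1. The following sketch approximates this check with PHP's native <code>version_compare</code>. It is not Composer's implementation, and <code>satisfiesTilde10</code> is a hypothetical name:</p>
<pre class="giallo z-code"><code data-lang="php">&lt;?php

// Rough approximation of the `~1.0` constraint, i.e. `&gt;=1.0 &lt;2.0`.
// This is not Composer's implementation, only PHP's `version_compare`.
function satisfiesTilde10(string $version): bool
{
    return version_compare($version, '1.0', '&gt;=')
        &amp;&amp; version_compare($version, '2.0', '&lt;');
}

var_dump(satisfiesTilde10('1.14.09.15')); // bool(true)
var_dump(satisfiesTilde10('2.15.01.01')); // bool(false)</code></pre>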
<p>\o/</p>