Typed Chronicles: Vincent Hanquez's adventures
Cryptoxide perf (SHA2 / Blake2), 2021-01-17, https://vincenthz.github.io/cryptoxide-performance/
<p>This post relates to the engine rewrite and the SSE, AVX and AVX2 CPU optimisations I did last
year on <a href="https://github.com/typed-io/cryptoxide/">cryptoxide</a>:</p>
<ul>
<li><a href="https://github.com/typed-io/cryptoxide/pull/8">SHA2 optimisation</a></li>
<li><a href="https://github.com/typed-io/cryptoxide/pull/9">Blake2 optimisation</a></li>
</ul>
<span id="continue-reading"></span><h2 id="history-of-cryptoxide">History of cryptoxide<a class="zola-anchor" href="#history-of-cryptoxide" aria-label="Anchor link for: history-of-cryptoxide">§</a>
</h2>
<p>Cryptoxide is a fork of the original <a href="https://github.com/DaGenix/rust-crypto">rust-crypto</a>,
a one-stop cryptography package that went unmaintained.</p>
<p>In 2018, we needed a pure Rust library to build rust-wasm based
web applications, when this use case was still in its infancy. rust-crypto was an
interesting starting point, as all the algorithms were written in pure Rust,
and it was also easier to build on than the exploded version, which
would have required a lot more time to port.</p>
<p>Many other cryptographic packages are now wasm-friendly as well.</p>
<h2 id="benchmarks-setup">Benchmarks setup<a class="zola-anchor" href="#benchmarks-setup" aria-label="Anchor link for: benchmarks-setup">§</a>
</h2>
<ul>
<li>cpu: 3.6 GHz 8-Core Intel Core i9 (I9-9900K)</li>
<li>rust compiler: stable 1.49</li>
<li>cryptoxide: 0.3.0</li>
<li>rust-crypto: blake2 0.9.1, sha2 0.9.1</li>
<li>ring: 0.16.19</li>
</ul>
<p>The benchmark code runs the main costly part of each algorithm a few times
over a 10 megabyte array and takes the average of the runs. It's possible that
the reported numbers are buggy, but they should be consistently buggy, so here
we're more interested in the relative values than in the absolute values.</p>
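<p>As a rough illustration of how such figures can be derived, here is a minimal sketch that averages a few timings and converts the average into a throughput. The function names are hypothetical and this is not the actual benchmark code, which lives in rcc.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Average of n timing samples, in milliseconds. */
double mean_ms(const double *samples, size_t n) {
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += samples[i];
    }
    return sum / (double)n;
}

/* Throughput in megabytes per second, for `megabytes` of input
   processed in `avg_ms` milliseconds on average. */
double throughput_mb_per_s(double megabytes, double avg_ms) {
    assert(avg_ms > 0.0);
    return megabytes / (avg_ms / 1000.0);
}
```

<p>For instance, 10 megabytes processed in an average of 10 ms corresponds to 1000 MB/s.</p>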
<p>This benchmark only looks at the functions I was interested in, and thus
only compares Sha256, Sha512, Blake2b and Blake2s.</p>
<p>Finally, benchmarks should always be taken with a grain of salt, as different
CPUs and environments can lead to different results.</p>
<p>To play with the benchmark on your own machine, have a look at <a href="https://github.com/vincenthz/rcc">rcc</a>.</p>
<h2 id="raw-numbers">Raw numbers<a class="zola-anchor" href="#raw-numbers" aria-label="Anchor link for: raw-numbers">§</a>
</h2>
<p>Let's start with the raw numbers in release mode.
The tables show the average time (lower is better), the standard deviation (lower means a more reliable benchmark),
and the processing speed (higher is better).</p>
<p>Using the default target CPU:</p>
<table><thead><tr><th>Algorithm</th><th>Crate</th><th>Avg (ms)</th><th>Std Dev (ms)</th><th>Speed (MB/s)</th></tr></thead><tbody>
<tr><td>blake2b</td><td>cryptoxide</td><td>10.18</td><td>0.174</td><td>981</td></tr>
<tr><td>blake2b</td><td>blake2</td><td>10.28</td><td>0.260</td><td>972</td></tr>
<tr><td>blake2s</td><td>cryptoxide</td><td>15.97</td><td>0.264</td><td>625</td></tr>
<tr><td>blake2s</td><td>blake2</td><td>17.07</td><td>0.150</td><td>585</td></tr>
<tr><td>sha256</td><td>cryptoxide</td><td>30.51</td><td>0.220</td><td>327</td></tr>
<tr><td>sha256</td><td>sha2</td><td>35.66</td><td>0.277</td><td>280</td></tr>
<tr><td>sha256</td><td>ring</td><td>19.17</td><td>0.293</td><td>521</td></tr>
<tr><td>sha512</td><td>cryptoxide</td><td>20.86</td><td>0.319</td><td>479</td></tr>
<tr><td>sha512</td><td>sha2</td><td>21.10</td><td>0.422</td><td>473</td></tr>
<tr><td>sha512</td><td>ring</td><td>13.29</td><td>0.296</td><td>752</td></tr>
</tbody></table>
<p>Using the native target CPU (<code>-C target-cpu=native</code>):</p>
<table><thead><tr><th>Algorithm</th><th>Crate</th><th>Avg (ms)</th><th>Std Dev (ms)</th><th>Speed (MB/s)</th></tr></thead><tbody>
<tr><td>blake2b</td><td>cryptoxide</td><td>6.72</td><td>0.229</td><td>1486</td></tr>
<tr><td>blake2b</td><td>blake2</td><td>9.95</td><td>0.388</td><td>1004</td></tr>
<tr><td>blake2s</td><td>cryptoxide</td><td>11.27</td><td>0.232</td><td>886</td></tr>
<tr><td>blake2s</td><td>blake2</td><td>17.23</td><td>0.136</td><td>580</td></tr>
<tr><td>sha256</td><td>cryptoxide</td><td>20.71</td><td>0.243</td><td>482</td></tr>
<tr><td>sha256</td><td>sha2</td><td>28.31</td><td>0.365</td><td>353</td></tr>
<tr><td>sha256</td><td>ring</td><td>19.74</td><td>0.283</td><td>506</td></tr>
<tr><td>sha512</td><td>cryptoxide</td><td>17.13</td><td>0.184</td><td>583</td></tr>
<tr><td>sha512</td><td>sha2</td><td>17.50</td><td>0.339</td><td>571</td></tr>
<tr><td>sha512</td><td>ring</td><td>13.17</td><td>0.133</td><td>759</td></tr>
</tbody></table>
<h2 id="in-graphs">In Graphs<a class="zola-anchor" href="#in-graphs" aria-label="Anchor link for: in-graphs">§</a>
</h2>
<p>Putting this in graphical form, comparing the default and native runs (speed in MB/s):</p>
<script type="text/javascript">
window.onload = function () {
renderMyChart(chartContainer1, [{"type":"column","dataPoints":[{"label":"sha2","y":280},{"label":"cryptoxide","y":327},{"label":"ring","y":521},{"label":"sha2 (nat)","y":353},{"label":"cryptoxide (nat)","y":482},{"label":"ring (nat)","y":506}]}]);
renderMyChart(chartContainer2, [{"type":"column","dataPoints":[{"label":"sha2","y":473},{"label":"cryptoxide","y":479},{"label":"ring","y":752},{"label":"sha2 (nat)","y":571},{"label":"cryptoxide (nat)","y":583},{"label":"ring (nat)","y":759}]}]);
renderMyChart(chartContainer3, [{"type":"column","dataPoints":[{"label":"blake2","y":972},{"label":"cryptoxide","y":981},{"label":"blake2 (nat)","y":1004},{"label":"cryptoxide (nat)","y":1486}]}]);
renderMyChart(chartContainer4, [{"type":"column","dataPoints":[{"label":"blake2","y":585},{"label":"cryptoxide","y":625},{"label":"blake2 (nat)","y":580},{"label":"cryptoxide (nat)","y":886}]}]);
function renderMyChart(theDIVid, myData) {
var chart = new CanvasJS.Chart(theDIVid, {
data: myData
});
chart.render();
}
}
</script>
<h2 id="sha256">SHA256<a class="zola-anchor" href="#sha256" aria-label="Anchor link for: sha256">§</a>
</h2>
<div id="chartContainer1" style="width: 90%; height: 300px"></div>
<h2 id="sha512">SHA512<a class="zola-anchor" href="#sha512" aria-label="Anchor link for: sha512">§</a>
</h2>
<div id="chartContainer2" style="width: 90%; height: 300px"></div>
<h2 id="blake2b">BLAKE2B<a class="zola-anchor" href="#blake2b" aria-label="Anchor link for: blake2b">§</a>
</h2>
<div id="chartContainer3" style="width: 90%; height: 300px"></div>
<h2 id="blake2s">BLAKE2S<a class="zola-anchor" href="#blake2s" aria-label="Anchor link for: blake2s">§</a>
</h2>
<div id="chartContainer4" style="width: 90%; height: 300px"></div>
<h2 id="conclusion">Conclusion<a class="zola-anchor" href="#conclusion" aria-label="Anchor link for: conclusion">§</a>
</h2>
<p>Ring is the uncontested winner in terms of performance (and probably safety).
Most or all of its algorithms are implemented in assembly and use the best level
of optimisation all the time, which explains why the default and native results are
virtually identical.</p>
<p>For Sha256, with native optimisation cryptoxide comes very close
to the heavily optimised ring implementation (482 vs 506 MB/s) and shows a noticeable difference with
the pervasive sha2 crate (353 MB/s).</p>
<p>For Sha512, there's no significant difference between cryptoxide and sha2,
which is not particularly surprising considering that I didn't take the time to write
a SIMD-optimised version of Sha512 in cryptoxide.</p>
<p>Both SHA256 and SHA512 are only partially optimisable with SIMD.</p>
<p>For Blake2b and Blake2s, while performance at the default level
is mostly equivalent, the true difference happens at the AVX/AVX2
level, where cryptoxide gets a massive boost compared to the blake2 crate (roughly 1.5x in both cases). This is enabled
by the really nice design of <a href="https://www.blake2.net/">BLAKE2</a>.</p>
<p>Time permitting, the next step is to add more SIMD optimisations to other
algorithms and, as new architectures achieve tier 1 status and wide support in Rust,
hopefully other types of SIMD optimisations.</p>
Ouroboros Verifiable Random Function, 2020-01-26, https://vincenthz.github.io/ouroboros-vrf-explanation/
<p>Verifiable Random Functions (VRF) are one of the key cryptographic primitives of
<a href="https://eprint.iacr.org/2017/573.pdf">Ouroboros Praos</a>: they allow stake holders to participate
in the block creation lottery. Let's dig into the details of the tech.</p>
<span id="continue-reading"></span><h1 id="what-is-a-vrf">What is a VRF<a class="zola-anchor" href="#what-is-a-vrf" aria-label="Anchor link for: what-is-a-vrf">§</a>
</h1>
<p>A VRF can be thought of as the asymmetric-key equivalent of a keyed cryptographic hash.
In a nutshell, only the secret key owner can produce the output for a given seed value,
and anyone with the public key can verify that it was generated honestly from the right seed.</p>
<p>It's easy to see the similarity with a Message Authentication Code (MAC) construction, except
that generation and verification use different parts of the key.</p>
<p>So, the VRF is mathematically defined as follows:</p>
<ul>
<li>\( \tt{Output} \in \lbrace 0,1 \rbrace^{vrf} , \tt{Seed} \in \lbrace 0,1 \rbrace^* \)</li>
<li>\( \tt{generate} : SecretKey \to \lbrace 0,1 \rbrace^* \to \lbrace 0,1 \rbrace^{vrf} \)</li>
<li>\( \tt{verify} : PublicKey \to \lbrace 0,1 \rbrace^* \to \lbrace 0,1 \rbrace^{vrf} \to \lbrace True,False \rbrace \)</li>
</ul>
<p>You can see the similarity compared to the MAC construction:</p>
<ul>
<li>\( \tt{generate} : MacKey \to \lbrace 0,1 \rbrace^* \to \lbrace 0,1 \rbrace^{mac} \)</li>
<li>\( \tt{verify} : MacKey \to \lbrace 0,1 \rbrace^* \to \lbrace 0,1 \rbrace^{mac} \to \lbrace True,False \rbrace \)</li>
</ul>
<p>There are plenty of online resources with more thorough and formal explanations of VRFs, and of their Ouroboros-specific use:</p>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Verifiable_random_function">Wikipedia</a></li>
<li><a href="https://people.csail.mit.edu/silvio/Selected%20Scientific%20Papers/Pseudo%20Randomness/Verifiable_Random_Functions.pdf">Verifiable Random Functions 1999 - PDF</a></li>
<li><a href="https://eprint.iacr.org/2017/573.pdf">Ouroboros Praos - PDF</a></li>
</ul>
<h1 id="how-is-it-used">How is it used?<a class="zola-anchor" href="#how-is-it-used" aria-label="Anchor link for: how-is-it-used">§</a>
</h1>
<p>Each slot is evaluated individually, and in secret, by each owner of stake.
A successful evaluation gives the authorisation, on a per-slot basis,
to generate a header.</p>
<p>Given the VRF output, we can map it to a range where the minimum of this range represents
0% of the stake, and the maximum of the range represents 100% of the stake:</p>
<p>$$ number : \lbrace 0,1 \rbrace^{vrf} \to \mathbb{N} $$</p>
<p>And we check that this number is under the owner's threshold:</p>
<p>$$ number(\tt{Output}) \lt \tt{StakeThreshold} $$</p>
<p>This realises the major proposition of proof of stake: having stake in the
system allows you to announce new blocks on the network in proportion to the
stake you hold.</p>
<div class="mermaid ">
graph TB
A["Generate(epoch_random ++ slot_number)"]
A -->|Output| B{"compare stake"}
B -->|< Threshold| D["allowed for this slot"]
B -->|>= Threshold| E["not allowed for this slot"]
</div>
<p>So for example, for a 25% stake holder, with the number domain mapped to 8 bits
(0..255), the threshold is 64, and the schedule given to this secret key will be:</p>
<table><thead><tr><th>Slot number</th><th>VRF Output (decimal)</th><th>Authorisation</th></tr></thead><tbody>
<tr><td>1</td><td>0</td><td>✅</td></tr>
<tr><td>2</td><td>124</td><td>❌</td></tr>
<tr><td>3</td><td>180</td><td>❌</td></tr>
<tr><td>4</td><td>65</td><td>❌</td></tr>
<tr><td>5</td><td>80</td><td>❌</td></tr>
<tr><td>6</td><td>63</td><td>✅</td></tr>
<tr><td>...</td><td>...</td><td>...</td></tr>
</tbody></table>
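<p>The toy example above can be sketched in a few lines. This only mirrors the 8-bit illustration with a linear mapping from stake to threshold; the function names are hypothetical and this is not the production leader-election code.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Threshold for a stake given in percent, over an 8-bit output domain
   (256 possible values). A 25% stake gives a threshold of 64. */
uint32_t stake_threshold(uint32_t stake_percent) {
    assert(stake_percent <= 100);
    return (256u * stake_percent) / 100u;
}

/* A slot is authorised when the VRF output, read as a number,
   is strictly below the threshold. */
bool slot_allowed(uint8_t vrf_output, uint32_t threshold) {
    return vrf_output < threshold;
}
```

<p>With a threshold of 64, outputs 0 and 63 are authorised, while 65, 80, 124 and 180 are not, which matches the table.</p>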
<p>Every authorisation gives this stake holder the ability to broadcast a
block to the network. It doesn't force the stake holder to create this block,
nor does it mean that the stake holder is the only one able to broadcast for
this slot.</p>
<h1 id="benchmark">Benchmark<a class="zola-anchor" href="#benchmark" aria-label="Anchor link for: benchmark">§</a>
</h1>
<p>The actual implementation uses 2HashDH-NIZK, as described in the Ouroboros paper.
The curve choice for the discrete log was narrowed down to curve25519, as it allows
a secure implementation and is also blazingly fast:</p>
<p>On x86-64:</p>
<table><thead><tr><th>Curve</th><th>Function</th><th>Time</th></tr></thead><tbody>
<tr><td>Curve25519</td><td>generate</td><td>83 µs</td></tr>
<tr><td>NIST-P256R1</td><td>generate</td><td>211 µs</td></tr>
<tr><td>Curve25519</td><td>verify</td><td>103 µs</td></tr>
<tr><td>NIST-P256R1</td><td>verify</td><td>227 µs</td></tr>
</tbody></table>
<p>On Aarch64:</p>
<table><thead><tr><th>Curve</th><th>Function</th><th>Time</th></tr></thead><tbody>
<tr><td>Curve25519</td><td>generate</td><td>1086 µs</td></tr>
<tr><td>NIST-P256R1</td><td>generate</td><td>1685 µs</td></tr>
<tr><td>Curve25519</td><td>verify</td><td>1411 µs</td></tr>
<tr><td>NIST-P256R1</td><td>verify</td><td>1529 µs</td></tr>
</tbody></table>
<p>Note: the Curve25519 implementation uses curve25519-dalek and is all in
Rust, while the NIST-P256R1 one is written in Haskell with bindings to OpenSSL. The
aarch64 implementation of curve25519-dalek uses normal operations, and
could probably go faster with some low-level optimisation.</p>
<p>It might sound insignificant to optimise at the scale of hundreds of µs, but
given an epoch of 100000 slots, the evaluation budget for the
whole epoch schedule goes from 8.3s with Curve25519 to 21s with P256 on x64,
and from 108s to 168s on aarch64.</p>
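<p>The arithmetic behind these budgets is a simple multiplication, sketched here with a hypothetical helper that takes the per-slot <code>generate</code> time from the tables above:</p>

```c
#include <assert.h>
#include <stdint.h>

/* Total time, in seconds, needed to evaluate every slot of an epoch once,
   given the cost of one evaluation in microseconds. */
double epoch_budget_seconds(uint64_t slots, double micros_per_evaluation) {
    assert(micros_per_evaluation >= 0.0);
    return (double)slots * micros_per_evaluation / 1e6;
}
```

<p>For 100000 slots, 83 µs per evaluation gives 8.3 seconds, and 211 µs gives 21.1 seconds.</p>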
<p>Also, the verification is done every time a block is received from the network,
so it's important to reduce the cost of those operations.</p>
<p>Last but not least, the pure Rust implementation allows seamless, production-ready
compilation to WebAssembly, which makes it possible to run in the context of a
browser / JavaScript.</p>
Compiling GHC with Stack for Stack, 2017-07-20, https://vincenthz.github.io/ghc-stack/
<p>While Stack is really good at magically summoning all the compilers you need,
adding your own compiled compiler is not well documented. For testing a specific
version that doesn't have a release, or testing your own compiler modifications,
it's useful to be able to add your own compiler to a build tool that, by default, works
in a multi-compiler setting.</p>
<span id="continue-reading"></span>
<p>First make sure you have the right building environment (a C compiler, the make
tool, etc.). Also alex and happy are required, which you can get with:</p>
<pre style="background-color:#191919;color:#ffffff;"><code><span>$ stack install alex happy
</span></code></pre>
<p>One of the useful things that Stack does is that there's no system compiler anymore,
which manifests itself by not having ghc on the $PATH. GHC requires a GHC to
bootstrap itself, so we put the default stack one in the $PATH by starting a
new shell environment:</p>
<pre style="background-color:#191919;color:#ffffff;"><code><span>$ stack exec --no-ghc-package-path bash
</span></code></pre>
<h2 id="build-ghc">Build GHC<a class="zola-anchor" href="#build-ghc" aria-label="Anchor link for: build-ghc">§</a>
</h2>
<p>Clone the sources needed:</p>
<pre style="background-color:#191919;color:#ffffff;"><code><span>$ git clone --recursive https://github.com/ghc/ghc
</span></code></pre>
<p>This will obviously give you the latest HEAD, so if you want
to rewind to a specific version, do it here.</p>
<p>Now we just build GHC; apply a variant of these steps if you want
a specific configuration (see the GHC building guide):</p>
<pre style="background-color:#191919;color:#ffffff;"><code><span>$ cd ghc
</span><span>$ ./boot
</span><span>$ ./configure
</span><span>$ make
</span></code></pre>
<h2 id="make-a-binary-dist">Make a binary dist<a class="zola-anchor" href="#make-a-binary-dist" aria-label="Anchor link for: make-a-binary-dist">§</a>
</h2>
<p>It's possible to skip this step by having ./configure called with the right
prefix above and doing <code>make install</code> now, but in the spirit of caching and
re-use, and also to adopt the exact same procedure that stack follows when
installing a GHC compiler, we will create a binary dist.</p>
<p>To create a binary dist:</p>
<pre style="background-color:#191919;color:#ffffff;"><code><span>$ make binary-dist
</span></code></pre>
<p>Now if everything went according to plan, you have a tarball at the root of the
ghc build repository, named roughly <code>ghc-$VERSION-$ARCH-$SYSTEM.tar.xz</code>.
At this stage, if you plan to reuse it, you can cache it somewhere, make it
available for your company, etc.</p>
<h2 id="install-for-stack">Install for stack<a class="zola-anchor" href="#install-for-stack" aria-label="Anchor link for: install-for-stack">§</a>
</h2>
<p>Unpack the bindist in a temp dir (don't forget to replace the variables):</p>
<pre style="background-color:#191919;color:#ffffff;"><code><span>$ mkdir $TMPDIR
</span><span>$ cp $TARBALL $TMPDIR
</span><span>$ cd $TMPDIR
</span><span>$ tar xvJf $(basename $TARBALL)
</span></code></pre>
<p>Then run the bindist to install itself to the right place (again replacing the variables):</p>
<pre style="background-color:#191919;color:#ffffff;"><code><span>$ cd $TARBALLDIR
</span><span>$ ./configure --prefix=$(stack path --programs)/$GHCVER
</span><span>$ make install
</span><span>$ echo -e "installed" > $(stack path --programs)/$GHCVER.installed
</span></code></pre>
<h2 id="configure-your-compiler">Configure your compiler<a class="zola-anchor" href="#configure-your-compiler" aria-label="Anchor link for: configure-your-compiler">§</a>
</h2>
<p>Now create a new <code>my.yaml</code> file to use your new compiler:</p>
<pre style="background-color:#191919;color:#ffffff;"><code><span>compiler: $GHCVER
</span><span>compiler-check: match-exact
</span><span>resolver: $RESOLVER
</span><span>allow-newer: true
</span></code></pre>
<p>Make sure it works:</p>
<pre style="background-color:#191919;color:#ffffff;"><code><span>$ stack --stack-yaml my.yaml ghc -- --version
</span><span>$GHCVER
</span></code></pre>
<p>That's it, it's all ready.</p>
<pre style="background-color:#191919;color:#ffffff;"><code><span>$ stack --stack-yaml my.yaml build
</span><span>...
</span></code></pre>
Efficient CStruct, 2017-03-20, https://vincenthz.github.io/compilation-cstruct/
<p>Dealing with complex C-structure-like data in Haskell often
forces the developer to deal with C files, and to create
a system that is usually a tradeoff between efficiency, modularity
and safety.</p>
<p>The <code>Storable</code> class from <code>Foreign</code> doesn't quite cut it, external programs need C files,
and binary parsers (binary, cereal) are not efficient or modular.</p>
<p>Let's see if we can do better using the advanced Haskell type system.</p>
<span id="continue-reading"></span>
<p>First let's define a typical C structure that we will re-use to compare
the different methods:</p>
<pre data-lang="C" style="background-color:#191919;color:#ffffff;" class="language-C "><code class="language-C" data-lang="C"><span style="color:#80d500;">struct </span><span style="color:#cccccc;">example {
</span><span style="color:#cccccc;"> </span><span style="color:#8aa6c1;">uint64_t</span><span style="color:#cccccc;"> a;
</span><span style="color:#cccccc;"> </span><span style="color:#8aa6c1;">uint32_t</span><span style="color:#cccccc;"> b;
</span><span style="color:#cccccc;"> </span><span style="color:#80d500;">union </span><span style="color:#cccccc;">{
</span><span style="color:#cccccc;"> </span><span style="color:#8aa6c1;">uint64_t</span><span style="color:#cccccc;"> addr64;
</span><span style="color:#cccccc;"> </span><span style="color:#80d500;">struct </span><span style="color:#cccccc;">{
</span><span style="color:#cccccc;"> </span><span style="color:#8aa6c1;">uint32_t</span><span style="color:#cccccc;"> hi;
</span><span style="color:#cccccc;"> </span><span style="color:#8aa6c1;">uint32_t</span><span style="color:#cccccc;"> low;
</span><span style="color:#cccccc;"> } addr32;
</span><span style="color:#cccccc;"> } addr;
</span><span style="color:#cccccc;"> </span><span style="color:#8aa6c1;">uint8_t</span><span style="color:#cccccc;"> data[</span><span style="color:#eddd5a;">16</span><span style="color:#cccccc;">];
</span><span style="color:#cccccc;">};
</span></code></pre>
<h2 id="dealing-with-c-structure">Dealing with C structure<a class="zola-anchor" href="#dealing-with-c-structure" aria-label="Anchor link for: dealing-with-c-structure">§</a>
</h2>
<p>The offset of each field is defined as the displacement (in bytes) from the
beginning of the structure to the beginning of the field's memory
representation. For example here we have:</p>
<ul>
<li><code>a</code> is at offset 0 (relative to the beginning of the structure)</li>
<li><code>b</code> is at offset 8</li>
<li><code>addr.addr64</code> is at offset 12</li>
<li><code>addr.addr32.hi</code> is at offset 12</li>
<li><code>addr.addr32.low</code> is at offset 16</li>
<li><code>data</code> is at offset 20</li>
</ul>
<p>The size of a primitive is simply the number of bytes needed to hold its bits; so a
<code>uint64_t</code>, composed of 64 bits, is 8 bytes. A union is a special construction
where the different options are overlaid on each other and the
biggest element defines its size. The size of a struct is defined recursively
as the sum of the sizes of all its components (assuming, as here, that no padding is inserted between fields).</p>
<ul>
<li>field pointed by <code>a</code> is of size 8</li>
<li>field pointed by <code>b</code> is of size 4</li>
<li>field pointed by <code>addr</code> is of size 8</li>
<li>field pointed by <code>data</code> is of size 16</li>
<li>the whole structure is of size 36</li>
</ul>
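<p>These offsets and sizes can be checked mechanically. The sketch below assumes a packed layout, which is what the numbers above correspond to; <code>__attribute__((packed))</code> is a GCC/Clang extension and the name <code>example_packed</code> is only used for this illustration.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same fields as `struct example`, packed so that the compiler
   inserts no padding between members (GCC/Clang extension). */
struct __attribute__((packed)) example_packed {
    uint64_t a;
    uint32_t b;
    union {
        uint64_t addr64;
        struct {
            uint32_t hi;
            uint32_t low;
        } addr32;
    } addr;
    uint8_t data[16];
};

/* Compile-time checks of the offsets and sizes listed above. */
static_assert(offsetof(struct example_packed, a) == 0, "offset of a");
static_assert(offsetof(struct example_packed, b) == 8, "offset of b");
static_assert(offsetof(struct example_packed, addr.addr64) == 12, "offset of addr.addr64");
static_assert(offsetof(struct example_packed, addr.addr32.hi) == 12, "offset of addr.addr32.hi");
static_assert(offsetof(struct example_packed, addr.addr32.low) == 16, "offset of addr.addr32.low");
static_assert(offsetof(struct example_packed, data) == 20, "offset of data");
static_assert(sizeof(struct example_packed) == 36, "size of the structure");
```

<p>Without the packed attribute, a typical x86-64 compiler aligns the union on 8 bytes, which moves <code>addr</code> to offset 16 and brings the total size to 40.</p>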
<h2 id="what-s-wrong-with-foreign">What's wrong with Foreign<a class="zola-anchor" href="#what-s-wrong-with-foreign" aria-label="Anchor link for: what-s-wrong-with-foreign">§</a>
</h2>
<p>Here's the usual Foreign definition for something equivalent:</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span>data </span><span style="color:#66ccff;">Example </span><span>= </span><span style="color:#66ccff;">Example
</span><span style="color:#cccccc;"> { a </span><span>:: </span><span style="color:#cccccc;">{-# </span><span>UNPACK</span><span style="color:#cccccc;"> #-} </span><span>!</span><span style="color:#66ccff;">Word64
</span><span style="color:#cccccc;"> , b </span><span>:: </span><span style="color:#cccccc;">{-# </span><span>UNPACK</span><span style="color:#cccccc;"> #-} </span><span>!</span><span style="color:#66ccff;">Word32
</span><span style="color:#cccccc;"> , u </span><span>:: </span><span style="color:#cccccc;">{-# </span><span>UNPACK</span><span style="color:#cccccc;"> #-} </span><span>!</span><span style="color:#66ccff;">Word64
</span><span style="color:#cccccc;"> , dat </span><span>:: !</span><span style="color:#66ccff;">ByteString
</span><span style="color:#cccccc;"> }
</span><span style="color:#cccccc;">
</span><span style="color:#cccccc;">peekBs p ofs len </span><span>= ...
</span><span style="color:#cccccc;">
</span><span>instance </span><span style="color:#80d500;">Storable Example </span><span>where
</span><span style="color:#cccccc;"> sizeOf _ </span><span>= </span><span style="color:#eddd5a;">36
</span><span style="color:#cccccc;"> alignment _ </span><span>= </span><span style="color:#eddd5a;">8
</span><span style="color:#cccccc;"> peek p </span><span>= </span><span style="color:#66ccff;">Example </span><span><$></span><span style="color:#cccccc;"> peek (castPtr p)
</span><span style="color:#cccccc;"> </span><span><</span><span style="color:#cccccc;">*</span><span>></span><span style="color:#cccccc;"> peek (castPtr (p </span><span>`plusPtr` </span><span style="color:#eddd5a;">8</span><span style="color:#cccccc;">))
</span><span style="color:#cccccc;"> </span><span><</span><span style="color:#cccccc;">*</span><span>></span><span style="color:#cccccc;"> peek (castPtr (p </span><span>`plusPtr` </span><span style="color:#eddd5a;">12</span><span style="color:#cccccc;">))
</span><span style="color:#cccccc;"> </span><span><</span><span style="color:#cccccc;">*</span><span>></span><span style="color:#cccccc;"> peekBs p </span><span style="color:#eddd5a;">20 16
</span><span style="color:#cccccc;"> poke p _ </span><span>= ...
</span></code></pre>
<p>Given a (valid) Ptr, we can now get the elements behind it by creating a new <code>Example</code>
value by calling <code>peek</code>. This will materialize a new Haskell data structure in
the Haskell GC-managed memory, which holds a copy of all the fields from the Ptr.</p>
<p>In some cases, copying all these values onto the Haskell heap is wasteful and
inefficient. A simple example of such a use case would be quickly iterating over a block
of memory to repeatedly check a few field values in each structure.</p>
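<p>In C terms, the access pattern in question looks roughly like the sketch below: it walks a buffer of consecutive 36-byte records and reads only the 4 bytes of field <code>b</code> (offset 8) from each one, without copying the rest. The function name and constants are hypothetical, and the packed 36-byte layout described earlier is assumed.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { RECORD_SIZE = 36, OFFSET_B = 8 };

/* Count the records in `buf` whose `b` field equals `wanted`.
   Only 4 bytes are read per record; trailing partial records are ignored. */
size_t count_matching_b(const uint8_t *buf, size_t len, uint32_t wanted) {
    assert(buf != NULL || len == 0);
    size_t count = 0;
    for (size_t off = 0; off + RECORD_SIZE <= len; off += RECORD_SIZE) {
        uint32_t b;
        /* memcpy keeps the read safe for unaligned offsets; host byte order. */
        memcpy(&b, buf + off + OFFSET_B, sizeof b);
        if (b == wanted) {
            count++;
        }
    }
    return count;
}
```

<p>Going through <code>peek</code> for the same task would instead build a full <code>Example</code> value, including a copy of the 16 bytes of <code>data</code>, for every record visited.</p>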
<p>The <code>Foreign</code> type classes and co are only about moving data across the foreign
boundary, not really about dealing with this foreign boundary efficiently.</p>
<p>In short:</p>
<ul>
<li>Materializes values on the Haskell side.</li>
<li>Not modular: whole type peeking/poking or nothing.</li>
<li>Size and alignment defined on values, not types.</li>
<li>No distinction between constant size types and variable size types.</li>
<li>Often passing <code>undefined :: SomeType</code> to <code>sizeOf</code> and <code>alignment</code>.</li>
<li>Usually manually created, not typo-proof.</li>
</ul>
<h2 id="what-about-binary-parsers">What about binary parsers<a class="zola-anchor" href="#what-about-binary-parsers" aria-label="Anchor link for: what-about-binary-parsers">§</a>
</h2>
<p>There are many binary parsers on the market: <a href="http://hackage.haskell.org/package/binary">binary</a>,
<a href="http://hackage.haskell.org/package/cereal">cereal</a>, <a href="http://hackage.haskell.org/package/packer">packer</a>,
<a href="http://hackage.haskell.org/package/store">store</a>.</p>
<p>Most of a binary parser's job is taking a stream of bytes and efficiently turning
those bytes into Haskell values. One added job is dealing with chunking: since
you may not have all the data in memory for parsing, you need to deal with values that
are cut across chunk boundaries and with resumption.</p>
<p>Here's an example of a binary parser:</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span>getExample :: </span><span style="color:#80d500;">Get Example
</span><span style="color:#cccccc;">getExample </span><span>=
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">Example </span><span><$></span><span style="color:#cccccc;"> getWord64Host
</span><span style="color:#cccccc;"> </span><span><</span><span style="color:#cccccc;">*</span><span>></span><span style="color:#cccccc;"> getWord32Host
</span><span style="color:#cccccc;"> </span><span><</span><span style="color:#cccccc;">*</span><span>></span><span style="color:#cccccc;"> getWord64Host
</span><span style="color:#cccccc;"> </span><span><</span><span style="color:#cccccc;">*</span><span>></span><span style="color:#cccccc;"> getByteString </span><span style="color:#eddd5a;">16
</span></code></pre>
<p>However, this has the exact same problem as <code>Foreign</code>: you can't
selectively and modularly deal with the data, and it also creates a
full copy of the data on the Haskell side. This is clearly warranted when
dealing with memory that you want processed in chunks, since you
can't hold on to the data stream to refer to it later.</p>
<h2 id="defining-a-c-structure-in-haskell">Defining a C structure in haskell<a class="zola-anchor" href="#defining-a-c-structure-in-haskell" aria-label="Anchor link for: defining-a-c-structure-in-haskell">§</a>
</h2>
<p>Dealing with memory directly is error prone, and it would be nice to be able
to simulate C structures overlaid on memory without having to deal with
size, offset and composition manually, while remaining as efficient as possible.</p>
<p>First we're gonna need a recent GHC (at least 8.0) and the following extensions:</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span style="color:#cccccc;">{-# </span><span>LANGUAGE</span><span style="color:#cccccc;"> DataKinds #-}
</span><span style="color:#cccccc;">{-# </span><span>LANGUAGE</span><span style="color:#cccccc;"> TypeOperators #-}
</span><span style="color:#cccccc;">{-# </span><span>LANGUAGE</span><span style="color:#cccccc;"> UndecidableInstances #-}
</span><span style="color:#cccccc;">{-# </span><span>LANGUAGE</span><span style="color:#cccccc;"> ScopedTypeVariables #-}
</span><span style="color:#cccccc;">{-# </span><span>LANGUAGE</span><span style="color:#cccccc;"> FlexibleContexts #-}
</span><span style="color:#cccccc;">{-# </span><span>LANGUAGE</span><span style="color:#cccccc;"> AllowAmbiguousTypes #-}
</span></code></pre>
<p>Then the following imports:</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span>import </span><span style="color:#8aa6c1;">GHC.TypeLits
</span><span>import </span><span style="color:#8aa6c1;">Data.Type.Bool
</span><span>import </span><span style="color:#8aa6c1;">Data.Proxy
</span><span>import </span><span style="color:#8aa6c1;">Data.Int
</span><span>import </span><span style="color:#8aa6c1;">Data.Word
</span></code></pre>
<p>We define a simple ADT of all the possible elements that you can find, and their compositions:</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span>data </span><span style="color:#66ccff;">Element </span><span>=
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">FInt8
</span><span style="color:#cccccc;"> </span><span>| </span><span style="color:#66ccff;">FWord8
</span><span style="color:#cccccc;"> </span><span>| </span><span style="color:#66ccff;">FInt16
</span><span style="color:#cccccc;"> </span><span>| </span><span style="color:#66ccff;">FWord16
</span><span style="color:#cccccc;"> </span><span>| </span><span style="color:#66ccff;">FInt32
</span><span style="color:#cccccc;"> </span><span>| </span><span style="color:#66ccff;">FWord32
</span><span style="color:#cccccc;"> </span><span>| </span><span style="color:#66ccff;">FInt64
</span><span style="color:#cccccc;"> </span><span>| </span><span style="color:#66ccff;">FWord64
</span><span style="color:#cccccc;"> </span><span>| </span><span style="color:#66ccff;">FFloat
</span><span style="color:#cccccc;"> </span><span>| </span><span style="color:#66ccff;">FDouble
</span><span style="color:#cccccc;"> </span><span>| </span><span style="color:#66ccff;">FLong
</span><span style="color:#cccccc;"> </span><span>| </span><span style="color:#66ccff;">FArray Nat Element </span><span style="background-color:#171717;color:#616161;">-- number of elements and type of element
</span><span style="color:#cccccc;"> </span><span>| </span><span style="color:#66ccff;">FStruct</span><span style="color:#cccccc;"> [(</span><span style="color:#66ccff;">Symbol</span><span style="color:#cccccc;">, </span><span style="color:#66ccff;">Element</span><span style="color:#cccccc;">)] </span><span style="background-color:#171717;color:#616161;">-- list of field * type
</span><span style="color:#cccccc;"> </span><span>| </span><span style="color:#66ccff;">FUnion</span><span style="color:#cccccc;"> [(</span><span style="color:#66ccff;">Symbol</span><span style="color:#cccccc;">, </span><span style="color:#66ccff;">Element</span><span style="color:#cccccc;">)] </span><span style="background-color:#171717;color:#616161;">-- list of field * type
</span></code></pre>
<p>Now <code>struct example</code> can be represented with:</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span>type </span><span style="color:#66ccff;">Example </span><span>=</span><span style="color:#cccccc;"> '</span><span style="color:#66ccff;">FStruct
</span><span style="color:#cccccc;"> '[ '( </span><span style="color:#ffd700;">"a" </span><span style="color:#cccccc;">, '</span><span style="color:#66ccff;">FWord64</span><span style="color:#cccccc;">)
</span><span style="color:#cccccc;"> , '( </span><span style="color:#ffd700;">"b" </span><span style="color:#cccccc;">, '</span><span style="color:#66ccff;">FWord32</span><span style="color:#cccccc;">)
</span><span style="color:#cccccc;"> , '( </span><span style="color:#ffd700;">"addr"</span><span style="color:#cccccc;">, '</span><span style="color:#66ccff;">FUnion</span><span style="color:#cccccc;"> '[ '( </span><span style="color:#ffd700;">"addr64"</span><span style="color:#cccccc;">, '</span><span style="color:#66ccff;">FWord64</span><span style="color:#cccccc;">)
</span><span style="color:#cccccc;"> , '( </span><span style="color:#ffd700;">"addr32"</span><span style="color:#cccccc;">, '</span><span style="color:#66ccff;">FStruct</span><span style="color:#cccccc;"> '[ '( </span><span style="color:#ffd700;">"hi"</span><span style="color:#cccccc;">, '</span><span style="color:#66ccff;">FWord32</span><span style="color:#cccccc;">)
</span><span style="color:#cccccc;"> , '( </span><span style="color:#ffd700;">"low"</span><span style="color:#cccccc;">, '</span><span style="color:#66ccff;">FWord32</span><span style="color:#cccccc;">) ])
</span><span style="color:#cccccc;"> ])
</span><span style="color:#cccccc;"> , '( </span><span style="color:#ffd700;">"data"</span><span style="color:#cccccc;">, '</span><span style="color:#66ccff;">FArray </span><span style="color:#eddd5a;">16</span><span style="color:#cccccc;"> '</span><span style="color:#66ccff;">FWord8</span><span style="color:#cccccc;"> )
</span><span style="color:#cccccc;"> ]
</span></code></pre>
<h2 id="calculating-sizes">Calculating sizes<a class="zola-anchor" href="#calculating-sizes" aria-label="Anchor link for: calculating-sizes">§</a>
</h2>
<p>Computing the size is one of the key things we need to be able to do on an element.</p>
<p>Using a type family, we can define the <code>Size</code> type, which takes an <code>Element</code> and returns a <code>Nat</code>
representing the size of the element.</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span>type</span><span style="color:#cccccc;"> family </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> (t </span><span>:: </span><span style="color:#66ccff;">Element</span><span style="color:#cccccc;">) </span><span>where
</span></code></pre>
<p>This is very easy for our primitive types:</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span style="color:#cccccc;"> </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> ('</span><span style="color:#66ccff;">FInt8</span><span style="color:#cccccc;">) </span><span>= </span><span style="color:#eddd5a;">1
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> ('</span><span style="color:#66ccff;">FWord8</span><span style="color:#cccccc;">) </span><span>= </span><span style="color:#eddd5a;">1
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> ('</span><span style="color:#66ccff;">FInt16</span><span style="color:#cccccc;">) </span><span>= </span><span style="color:#eddd5a;">2
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> ('</span><span style="color:#66ccff;">FWord16</span><span style="color:#cccccc;">) </span><span>= </span><span style="color:#eddd5a;">2
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> ('</span><span style="color:#66ccff;">FInt32</span><span style="color:#cccccc;">) </span><span>= </span><span style="color:#eddd5a;">4
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> ('</span><span style="color:#66ccff;">FWord32</span><span style="color:#cccccc;">) </span><span>= </span><span style="color:#eddd5a;">4
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> ('</span><span style="color:#66ccff;">FInt64</span><span style="color:#cccccc;">) </span><span>= </span><span style="color:#eddd5a;">8
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> ('</span><span style="color:#66ccff;">FWord64</span><span style="color:#cccccc;">) </span><span>= </span><span style="color:#eddd5a;">8
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> ('</span><span style="color:#66ccff;">FFloat</span><span style="color:#cccccc;">) </span><span>= </span><span style="color:#eddd5a;">4
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> ('</span><span style="color:#66ccff;">FDouble</span><span style="color:#cccccc;">) </span><span>= </span><span style="color:#eddd5a;">8
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> ('</span><span style="color:#66ccff;">FLong</span><span style="color:#cccccc;">) </span><span>= </span><span style="color:#eddd5a;">8 </span><span style="background-color:#171717;color:#616161;">-- hardcoded for example sake, but would be dynamic in real code
</span></code></pre>
<p>The size of an array is simply the <code>Size</code> of the element multiplied by the number of elements:</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span style="color:#cccccc;"> </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> ('</span><span style="color:#66ccff;">FArray</span><span style="color:#cccccc;"> n el) </span><span>=</span><span style="color:#cccccc;"> n * </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> el
</span></code></pre>
<p>For the constructed elements, we need to define extra recursive type families.
The size of a structure is recursively defined as the sum of the <code>Size</code> of its components, and
the size of a union is recursively defined as the <code>Size</code> of its biggest element.</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span style="color:#cccccc;"> </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> ('</span><span style="color:#66ccff;">FStruct</span><span style="color:#cccccc;"> ls) </span><span>= </span><span style="color:#66ccff;">StructSize</span><span style="color:#cccccc;"> ls
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> ('</span><span style="color:#66ccff;">FUnion</span><span style="color:#cccccc;"> ls) </span><span>= </span><span style="color:#66ccff;">UnionSize</span><span style="color:#cccccc;"> ls
</span><span style="color:#cccccc;">
</span><span>type</span><span style="color:#cccccc;"> family </span><span style="color:#66ccff;">StructSize</span><span style="color:#cccccc;"> (ls </span><span>::</span><span style="color:#cccccc;"> [(</span><span style="color:#66ccff;">Symbol</span><span style="color:#cccccc;">, </span><span style="color:#66ccff;">Element</span><span style="color:#cccccc;">)]) </span><span>where
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">StructSize</span><span style="color:#cccccc;"> '</span><span style="color:#80d500;">[] </span><span>= </span><span style="color:#eddd5a;">0
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">StructSize</span><span style="color:#cccccc;"> ('(_,l) '</span><span>:</span><span style="color:#cccccc;"> ls) </span><span>= </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> l </span><span>+ </span><span style="color:#66ccff;">StructSize</span><span style="color:#cccccc;"> ls
</span><span style="color:#cccccc;">
</span><span>type</span><span style="color:#cccccc;"> family </span><span style="color:#66ccff;">UnionSize</span><span style="color:#cccccc;"> (ls </span><span>::</span><span style="color:#cccccc;"> [(</span><span style="color:#66ccff;">Symbol</span><span style="color:#cccccc;">, </span><span style="color:#66ccff;">Element</span><span style="color:#cccccc;">)]) </span><span>where
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">UnionSize</span><span style="color:#cccccc;"> '</span><span style="color:#80d500;">[] </span><span>= </span><span style="color:#eddd5a;">0
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">UnionSize</span><span style="color:#cccccc;"> ('(_,l) '</span><span>:</span><span style="color:#cccccc;"> ls) </span><span>= </span><span style="color:#66ccff;">If</span><span style="color:#cccccc;"> (</span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> l </span><span><=? </span><span style="color:#66ccff;">UnionSize</span><span style="color:#cccccc;"> ls) (</span><span style="color:#66ccff;">UnionSize</span><span style="color:#cccccc;"> ls) (</span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> l)
</span></code></pre>
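<p>To make the recursion easier to follow, here is the same calculation written at the value level.
This is only an illustrative sketch: <code>Elem</code>, its constructors and <code>size</code> are hypothetical names
that mirror the type-level <code>Element</code> and <code>Size</code>, trimmed to the constructors used by <code>Example</code>.</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span style="color:#cccccc;">module Main where

-- Hypothetical value-level mirror of Element (not part of the original code).
data Elem
    = VWord8
    | VWord32
    | VWord64
    | VArray Integer Elem       -- number of elements and type of element
    | VStruct [(String, Elem)]  -- list of field * type
    | VUnion [(String, Elem)]   -- list of field * type

-- Mirrors Size, StructSize and UnionSize.
size :: Elem -> Integer
size VWord8         = 1
size VWord32        = 4
size VWord64        = 8
size (VArray n el)  = n * size el
-- A structure is the sum of the sizes of its fields.
size (VStruct flds) = sum (map (size . snd) flds)
-- A union is as big as its biggest field, and 0 when it has no field.
size (VUnion flds)  = maximum (0 : map (size . snd) flds)

example :: Elem
example = VStruct
    [ ("a", VWord64)
    , ("b", VWord32)
    , ("addr", VUnion
        [ ("addr64", VWord64)
        , ("addr32", VStruct [("hi", VWord32), ("low", VWord32)])
        ])
    , ("data", VArray 16 VWord8)
    ]

main :: IO ()
main = print (size example) -- 8 + 4 + 8 + 16
</span></code></pre>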
<p>Almost there: we only need a way to materialize the <code>Size</code> type, to have a
value that we can use in our Haskell code:</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span>getSize :: </span><span style="color:#cccccc;">forall el . </span><span style="color:#80d500;">KnownNat</span><span style="color:#cccccc;"> (</span><span style="color:#80d500;">Size </span><span style="color:#cccccc;">el) </span><span>=> </span><span style="color:#80d500;">Integer
</span><span style="color:#cccccc;">getSize </span><span>=</span><span style="color:#cccccc;"> natVal (</span><span style="color:#66ccff;">Proxy </span><span>:: </span><span style="color:#66ccff;">Proxy</span><span style="color:#cccccc;"> (</span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> el))
</span></code></pre>
<p>This looks a bit magic, so let's decompose it to make clear what happens.
First, <code>getSize</code> is a constant Integer: it doesn't have <em>any</em> parameters. Next,
the <code>el</code> type variable represents the type whose size we want to know,
and the constraint on <code>el</code> is that applying the <code>Size</code> type function to it yields
a <code>KnownNat</code> (Known Natural). In the body of the constant we use
<code>natVal</code>, which takes a Proxy of a <code>KnownNat</code>, to materialize the value.</p>
<p>Given this signature, despite being a constant value, <code>getSize</code> needs to be told
which element it applies to. We can use a Type Application to
force the <code>el</code> element to be the one we want to resolve:</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span>></span><span style="color:#cccccc;"> putStrLn </span><span>$</span><span style="color:#cccccc;"> show (getSize @</span><span style="color:#66ccff;">Example</span><span style="color:#cccccc;">)
</span><span style="color:#eddd5a;">36
</span></code></pre>
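<p>For reference, the pieces of this section can be assembled into one module along the following lines.
This is a sketch rather than the original source: the list of language extensions is an assumption
(<code>NoStarIsType</code> assumes GHC 8.6 or later, so that <code>*</code> means multiplication),
and <code>Element</code> is trimmed to the constructors that <code>Example</code> uses.</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span style="color:#cccccc;">{-# LANGUAGE AllowAmbiguousTypes  #-}
{-# LANGUAGE DataKinds            #-}
{-# LANGUAGE FlexibleContexts     #-}
{-# LANGUAGE KindSignatures       #-}
{-# LANGUAGE NoStarIsType         #-}
{-# LANGUAGE ScopedTypeVariables  #-}
{-# LANGUAGE TypeApplications     #-}
{-# LANGUAGE TypeFamilies         #-}
{-# LANGUAGE TypeOperators        #-}
{-# LANGUAGE UndecidableInstances #-}
module Main where

import Data.Proxy (Proxy (..))
import Data.Type.Bool (If)
import GHC.TypeLits

-- Trimmed-down Element: only what Example needs.
data Element
    = FWord8
    | FWord32
    | FWord64
    | FArray Nat Element
    | FStruct [(Symbol, Element)]
    | FUnion [(Symbol, Element)]

type family Size (t :: Element) :: Nat where
    Size 'FWord8        = 1
    Size 'FWord32       = 4
    Size 'FWord64       = 8
    Size ('FArray n el) = n * Size el
    Size ('FStruct ls)  = StructSize ls
    Size ('FUnion ls)   = UnionSize ls

-- Sum of the sizes of all the fields.
type family StructSize (ls :: [(Symbol, Element)]) :: Nat where
    StructSize '[]             = 0
    StructSize ('(_, l) ': ls) = Size l + StructSize ls

-- Size of the biggest field.
type family UnionSize (ls :: [(Symbol, Element)]) :: Nat where
    UnionSize '[]             = 0
    UnionSize ('(_, l) ': ls) = If (Size l &lt;=? UnionSize ls) (UnionSize ls) (Size l)

type Example = 'FStruct
    '[ '("a", 'FWord64)
     , '("b", 'FWord32)
     , '("addr", 'FUnion '[ '("addr64", 'FWord64)
                          , '("addr32", 'FStruct '[ '("hi", 'FWord32), '("low", 'FWord32) ])
                          ])
     , '("data", 'FArray 16 'FWord8)
     ]

-- el only appears under a type family, hence AllowAmbiguousTypes;
-- callers pick el with a type application.
getSize :: forall el . KnownNat (Size el) => Integer
getSize = natVal (Proxy :: Proxy (Size el))

main :: IO ()
main = do
    print (getSize @Example)              -- 8 + 4 + 8 + 16
    print (getSize @('FArray 16 'FWord8)) -- 16 * 1
</span></code></pre>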
<h2 id="zooming-with-accessors">Zooming with accessors<a class="zola-anchor" href="#zooming-with-accessors" aria-label="Anchor link for: zooming-with-accessors">§</a>
</h2>
<p>First, we need an accessor type to represent how we address
a part of a data structure. For example in C, given the <code>struct example</code>, we want
to be able to do:</p>
<pre data-lang="C" style="background-color:#191919;color:#ffffff;" class="language-C "><code class="language-C" data-lang="C"><span style="color:#cccccc;"> .a
</span><span style="color:#cccccc;"> .addr.addr32.hi
</span><span style="color:#cccccc;"> .data[</span><span style="color:#eddd5a;">3</span><span style="color:#cccccc;">]
</span><span style="color:#cccccc;"> .data
</span></code></pre>
<p>In the case of a structure or a union, we use the field name to dereference it,
but in the case of an array, we use an integral index. This is really straightforward:</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span>data </span><span style="color:#66ccff;">Access </span><span>= </span><span style="color:#66ccff;">Field Symbol </span><span>| </span><span style="color:#66ccff;">Index Nat
</span></code></pre>
<p>A list of <code>Access</code> represents the zooming inside the data structures. The previous
examples can be written in Haskell with:</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span style="color:#cccccc;"> '[ '</span><span style="color:#66ccff;">Field </span><span style="color:#ffd700;">"a"</span><span style="color:#cccccc;"> ]
</span><span style="color:#cccccc;"> '[ '</span><span style="color:#66ccff;">Field </span><span style="color:#ffd700;">"addr"</span><span style="color:#cccccc;">, '</span><span style="color:#66ccff;">Field </span><span style="color:#ffd700;">"addr32"</span><span style="color:#cccccc;">, '</span><span style="color:#66ccff;">Field </span><span style="color:#ffd700;">"hi"</span><span style="color:#cccccc;"> ]
</span><span style="color:#cccccc;"> '[ '</span><span style="color:#66ccff;">Field </span><span style="color:#ffd700;">"data"</span><span style="color:#cccccc;">, '</span><span style="color:#66ccff;">Index </span><span style="color:#eddd5a;">3</span><span style="color:#cccccc;"> ]
</span><span style="color:#cccccc;"> '[ '</span><span style="color:#66ccff;">Field </span><span style="color:#ffd700;">"data"</span><span style="color:#cccccc;"> ]
</span></code></pre>
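<p>To see the correspondence with the C notation, a type-level path can be rendered back into a string.
The sketch below is only an illustration: the <code>ShowPath</code> class and <code>showPath</code> are hypothetical helpers,
not part of the original code, and the extension list is an assumption.</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span style="color:#cccccc;">{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds           #-}
{-# LANGUAGE FlexibleInstances   #-}
{-# LANGUAGE KindSignatures      #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications    #-}
{-# LANGUAGE TypeOperators       #-}
module Main where

import Data.Proxy (Proxy (..))
import GHC.TypeLits

data Access = Field Symbol | Index Nat

-- Hypothetical helper: renders a type-level list of Access in C syntax.
class ShowPath (path :: [Access]) where
    showPath :: String

-- End of the path: nothing left to render.
instance ShowPath '[] where
    showPath = ""

-- A field access is rendered as ".name".
instance (KnownSymbol f, ShowPath rest) => ShowPath ('Field f ': rest) where
    showPath = "." ++ symbolVal (Proxy :: Proxy f) ++ showPath @rest

-- An array access is rendered as "[index]".
instance (KnownNat i, ShowPath rest) => ShowPath ('Index i ': rest) where
    showPath = "[" ++ show (natVal (Proxy :: Proxy i)) ++ "]" ++ showPath @rest

main :: IO ()
main = do
    putStrLn (showPath @('[ 'Field "a" ]))
    putStrLn (showPath @('[ 'Field "addr", 'Field "addr32", 'Field "hi" ]))
    putStrLn (showPath @('[ 'Field "data", 'Index 3 ]))
    putStrLn (showPath @('[ 'Field "data" ]))
</span></code></pre>
<p>Since <code>'Field</code> only accepts a <code>Symbol</code> and <code>'Index</code> only accepts a <code>Nat</code>,
a path such as <code>'[ 'Index "addr32" ]</code> is rejected by the kind checker.</p>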
<h2 id="calculating-offset">Calculating Offset<a class="zola-anchor" href="#calculating-offset" aria-label="Anchor link for: calculating-offset">§</a>
</h2>
<p>Calculating the offset of fields is the next important step to have full capabilities in this system.</p>
<p>We define a type family for this that, given an <code>Element</code> and an <code>[Access]</code>, gives back an offset as a <code>Nat</code>.
Note that, due to the recursive approach, we also pass the offset <code>ofs</code> to start from.</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span>type</span><span style="color:#cccccc;"> family </span><span style="color:#66ccff;">Offset</span><span style="color:#cccccc;"> (ofs </span><span>:: </span><span style="color:#66ccff;">Nat</span><span style="color:#cccccc;">) (accessors </span><span>::</span><span style="color:#cccccc;"> [</span><span style="color:#66ccff;">Access</span><span style="color:#cccccc;">]) (t </span><span>:: </span><span style="color:#66ccff;">Element</span><span style="color:#cccccc;">) </span><span>where
</span></code></pre>
<p>When the list of accessors is empty, we have reached the element, so we can just return the offset we have calculated:</p>
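<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span style="color:#cccccc;"> </span><span style="color:#66ccff;">Offset</span><span style="color:#cccccc;"> ofs '</span><span style="color:#80d500;">[]</span><span style="color:#cccccc;"> t </span><span>=</span><span style="color:#cccccc;"> ofs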
</span></code></pre>
<p>When the list is not empty, we call the type family for the respective data structure with:</p>
<ul>
<li>the current offset</li>
<li>the name of field searched or the index searched</li>
<li>either the dictionary of symbol to element (represented by <code>'[(Symbol, Element)]</code>) or the array size and inner Element</li>
</ul>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span style="color:#cccccc;"> </span><span style="color:#66ccff;">Offset</span><span style="color:#cccccc;"> ofs ('</span><span style="color:#66ccff;">Field</span><span style="color:#cccccc;"> f</span><span>:</span><span style="color:#cccccc;">fs) ('</span><span style="color:#66ccff;">FStruct</span><span style="color:#cccccc;"> dict) </span><span>= </span><span style="color:#66ccff;">StructOffset</span><span style="color:#cccccc;"> ofs f fs dict
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">Offset</span><span style="color:#cccccc;"> ofs ('</span><span style="color:#66ccff;">Field</span><span style="color:#cccccc;"> f</span><span>:</span><span style="color:#cccccc;">fs) ('</span><span style="color:#66ccff;">FUnion</span><span style="color:#cccccc;"> dict) </span><span>= </span><span style="color:#66ccff;">UnionOffset</span><span style="color:#cccccc;"> ofs f fs dict
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">Offset</span><span style="color:#cccccc;"> ofs ('</span><span style="color:#66ccff;">Index</span><span style="color:#cccccc;"> i</span><span>:</span><span style="color:#cccccc;">fs) ('</span><span style="color:#66ccff;">FArray</span><span style="color:#cccccc;"> n t) </span><span>= </span><span style="color:#66ccff;">ArrayOffset</span><span style="color:#cccccc;"> ofs i fs n t
</span></code></pre>
<p>Because the definition is enforced by types, you cannot mix things up by
trying to <code>Index</code> into a Structure, or by dereferencing a <code>Field</code> of an
Array: no equation matches, and the type system will (too) emphatically complain.</p>
<p>Both the Structure and the Union recursively search the dictionary of symbols to find
a matching field. If we reach the empty list, we haven't found the right field
and the developer is notified with a friendly TypeError, at compilation time, that the field is
not present in the structure.</p>
<p>Each time a field is skipped in the structure, the size of the skipped element is added to the current offset.
In a union, every field starts at the same offset, so nothing is added.</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span>type</span><span style="color:#cccccc;"> family </span><span style="color:#66ccff;">StructOffset</span><span style="color:#cccccc;"> (ofs </span><span>:: </span><span style="color:#66ccff;">Nat</span><span style="color:#cccccc;">)
</span><span style="color:#cccccc;"> (field </span><span>:: </span><span style="color:#66ccff;">Symbol</span><span style="color:#cccccc;">)
</span><span style="color:#cccccc;"> (rs </span><span>::</span><span style="color:#cccccc;"> [</span><span style="color:#66ccff;">Access</span><span style="color:#cccccc;">])
</span><span style="color:#cccccc;"> (dict </span><span>::</span><span style="color:#cccccc;"> [(</span><span style="color:#66ccff;">Symbol</span><span style="color:#cccccc;">, </span><span style="color:#66ccff;">Element</span><span style="color:#cccccc;">)]) </span><span>where
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">StructOffset</span><span style="color:#cccccc;"> ofs field rs '</span><span style="color:#80d500;">[] </span><span>=
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">TypeError</span><span style="color:#cccccc;"> ('</span><span style="color:#66ccff;">Text </span><span style="color:#ffd700;">"offset: field "
</span><span style="color:#cccccc;"> '</span><span>:<>:</span><span style="color:#cccccc;"> '</span><span style="color:#66ccff;">ShowType</span><span style="color:#cccccc;"> field
</span><span style="color:#cccccc;"> '</span><span>:<>:</span><span style="color:#cccccc;"> '</span><span style="color:#66ccff;">Text </span><span style="color:#ffd700;">" not found in structure"</span><span style="color:#cccccc;">)
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">StructOffset</span><span style="color:#cccccc;"> ofs field rs ('(field, t) '</span><span>:</span><span style="color:#cccccc;"> _) </span><span>= </span><span style="color:#66ccff;">Offset</span><span style="color:#cccccc;"> ofs rs t
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">StructOffset</span><span style="color:#cccccc;"> ofs field rs ('(_ , v) '</span><span>:</span><span style="color:#cccccc;"> r) </span><span>= </span><span style="color:#66ccff;">StructOffset</span><span style="color:#cccccc;"> (ofs </span><span>+ </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> v) field rs r
</span><span style="color:#cccccc;">
</span><span>type</span><span style="color:#cccccc;"> family </span><span style="color:#66ccff;">UnionOffset</span><span style="color:#cccccc;"> (ofs </span><span>:: </span><span style="color:#66ccff;">Nat</span><span style="color:#cccccc;">)
</span><span style="color:#cccccc;"> (field </span><span>:: </span><span style="color:#66ccff;">Symbol</span><span style="color:#cccccc;">)
</span><span style="color:#cccccc;"> (rs </span><span>::</span><span style="color:#cccccc;"> [</span><span style="color:#66ccff;">Access</span><span style="color:#cccccc;">])
</span><span style="color:#cccccc;"> (dict </span><span>::</span><span style="color:#cccccc;"> [(</span><span style="color:#66ccff;">Symbol</span><span style="color:#cccccc;">, </span><span style="color:#66ccff;">Element</span><span style="color:#cccccc;">)]) </span><span>where
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">UnionOffset</span><span style="color:#cccccc;"> ofs field rs '</span><span style="color:#80d500;">[] </span><span>=
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">TypeError</span><span style="color:#cccccc;"> ('</span><span style="color:#66ccff;">Text </span><span style="color:#ffd700;">"offset: field "
</span><span style="color:#cccccc;"> '</span><span>:<>:</span><span style="color:#cccccc;"> '</span><span style="color:#66ccff;">ShowType</span><span style="color:#cccccc;"> field
</span><span style="color:#cccccc;"> '</span><span>:<>:</span><span style="color:#cccccc;"> '</span><span style="color:#66ccff;">Text </span><span style="color:#ffd700;">" not found in union"</span><span style="color:#cccccc;">)
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">UnionOffset</span><span style="color:#cccccc;"> ofs field rs ('(field, t) '</span><span>:</span><span style="color:#cccccc;"> _) </span><span>= </span><span style="color:#66ccff;">Offset</span><span style="color:#cccccc;"> ofs rs t
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">UnionOffset</span><span style="color:#cccccc;"> ofs field rs (_ </span><span>:</span><span style="color:#cccccc;"> r) </span><span>= </span><span style="color:#66ccff;">UnionOffset</span><span style="color:#cccccc;"> ofs field rs r
</span></code></pre>
<p>In the case of the array, we just make sure, at compilation time, that the user is accessing
an index that is within bounds; otherwise we also notify the developer with a friendly TypeError.</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span>type</span><span style="color:#cccccc;"> family </span><span style="color:#66ccff;">ArrayOffset</span><span style="color:#cccccc;"> (ofs </span><span>:: </span><span style="color:#66ccff;">Nat</span><span style="color:#cccccc;">)
</span><span style="color:#cccccc;"> (idx </span><span>:: </span><span style="color:#66ccff;">Nat</span><span style="color:#cccccc;">)
</span><span style="color:#cccccc;"> (rs </span><span>::</span><span style="color:#cccccc;"> [</span><span style="color:#66ccff;">Access</span><span style="color:#cccccc;">])
</span><span style="color:#cccccc;"> (n </span><span>:: </span><span style="color:#66ccff;">Nat</span><span style="color:#cccccc;">)
</span><span style="color:#cccccc;"> (t </span><span>:: </span><span style="color:#66ccff;">Element</span><span style="color:#cccccc;">) </span><span>where
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">ArrayOffset</span><span style="color:#cccccc;"> ofs idx rs n t </span><span>=
</span><span style="color:#cccccc;"> </span><span style="color:#66ccff;">If</span><span style="color:#cccccc;"> (n </span><span><=?</span><span style="color:#cccccc;"> idx)
</span><span style="color:#cccccc;"> (</span><span style="color:#66ccff;">TypeError</span><span style="color:#cccccc;"> ('</span><span style="color:#66ccff;">Text </span><span style="color:#ffd700;">"out of bounds : index is "
</span><span style="color:#cccccc;"> '</span><span>:<>:</span><span style="color:#cccccc;"> '</span><span style="color:#66ccff;">ShowType</span><span style="color:#cccccc;"> idx
</span><span style="color:#cccccc;"> '</span><span>:<>:</span><span style="color:#cccccc;"> '</span><span style="color:#66ccff;">Text </span><span style="color:#ffd700;">" but array of size "
</span><span style="color:#cccccc;"> '</span><span>:<>:</span><span style="color:#cccccc;"> '</span><span style="color:#66ccff;">ShowType</span><span style="color:#cccccc;"> n))
</span><span style="color:#cccccc;"> (</span><span style="color:#66ccff;">Offset</span><span style="color:#cccccc;"> (ofs </span><span>+</span><span style="color:#cccccc;"> (idx * </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> t)) rs t)
</span></code></pre>
<p>A simple example of how the machinery works:</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span>> </span><span style="color:#66ccff;">Offset</span><span style="color:#cccccc;"> ofs '[ '</span><span style="color:#66ccff;">Field </span><span style="color:#ffd700;">"data"</span><span style="color:#cccccc;">, '</span><span style="color:#66ccff;">Index </span><span style="color:#eddd5a;">3</span><span style="color:#cccccc;"> ] </span><span style="color:#66ccff;">Example
</span><span>> </span><span style="color:#66ccff;">StructOffset</span><span style="color:#cccccc;"> ofs </span><span style="color:#ffd700;">"data"</span><span style="color:#cccccc;"> '[ '</span><span style="color:#66ccff;">Index </span><span style="color:#eddd5a;">3</span><span style="color:#cccccc;"> ]
</span><span style="color:#cccccc;">    '[ '(</span><span style="color:#ffd700;">"a"</span><span style="color:#cccccc;">, </span><span>..</span><span style="color:#cccccc;">), '(</span><span style="color:#ffd700;">"b"</span><span style="color:#cccccc;">, </span><span>..</span><span style="color:#cccccc;">), '(</span><span style="color:#ffd700;">"addr"</span><span style="color:#cccccc;">, </span><span>..</span><span style="color:#cccccc;">), '(</span><span style="color:#ffd700;">"data"</span><span style="color:#cccccc;">, </span><span>..</span><span style="color:#cccccc;">) ]
</span><span>> </span><span style="color:#66ccff;">StructOffset</span><span style="color:#cccccc;"> (ofs </span><span>+ </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> '</span><span style="color:#66ccff;">FWord64</span><span style="color:#cccccc;">) </span><span style="color:#ffd700;">"data"</span><span style="color:#cccccc;"> '[ '</span><span style="color:#66ccff;">Index </span><span style="color:#eddd5a;">3</span><span style="color:#cccccc;"> ]
</span><span style="color:#cccccc;">    '[ '(</span><span style="color:#ffd700;">"b"</span><span style="color:#cccccc;">, </span><span>..</span><span style="color:#cccccc;">), '(</span><span style="color:#ffd700;">"addr"</span><span style="color:#cccccc;">, </span><span>..</span><span style="color:#cccccc;">), '(</span><span style="color:#ffd700;">"data"</span><span style="color:#cccccc;">, </span><span>..</span><span style="color:#cccccc;">) ]
</span><span>> </span><span style="color:#66ccff;">StructOffset</span><span style="color:#cccccc;"> (ofs </span><span>+ </span><span style="color:#eddd5a;">8 </span><span>+ </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> '</span><span style="color:#66ccff;">FWord32</span><span style="color:#cccccc;">) </span><span style="color:#ffd700;">"data"</span><span style="color:#cccccc;"> '[ '</span><span style="color:#66ccff;">Index </span><span style="color:#eddd5a;">3</span><span style="color:#cccccc;"> ]
</span><span style="color:#cccccc;">    '[ '(</span><span style="color:#ffd700;">"addr"</span><span style="color:#cccccc;">, </span><span>..</span><span style="color:#cccccc;">), '(</span><span style="color:#ffd700;">"data"</span><span style="color:#cccccc;">, </span><span>..</span><span style="color:#cccccc;">) ]
</span><span>> </span><span style="color:#66ccff;">StructOffset</span><span style="color:#cccccc;"> (ofs </span><span>+ </span><span style="color:#eddd5a;">12 </span><span>+ </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> ('</span><span style="color:#66ccff;">FUnion </span><span>..</span><span style="color:#cccccc;">)) </span><span style="color:#ffd700;">"data"</span><span style="color:#cccccc;"> '[ '</span><span style="color:#66ccff;">Index </span><span style="color:#eddd5a;">3</span><span style="color:#cccccc;"> ]
</span><span style="color:#cccccc;">    '[ '(</span><span style="color:#ffd700;">"data"</span><span style="color:#cccccc;">, '</span><span style="color:#66ccff;">FArray </span><span style="color:#eddd5a;">16</span><span style="color:#cccccc;"> '</span><span style="color:#66ccff;">FWord8</span><span style="color:#cccccc;">) ]
</span><span>> </span><span style="color:#66ccff;">Offset</span><span style="color:#cccccc;"> (ofs </span><span>+ </span><span style="color:#eddd5a;">20</span><span style="color:#cccccc;">) '[ '</span><span style="color:#66ccff;">Index </span><span style="color:#eddd5a;">3</span><span style="color:#cccccc;"> ] ('</span><span style="color:#66ccff;">FArray </span><span style="color:#eddd5a;">16</span><span style="color:#cccccc;"> '</span><span style="color:#66ccff;">FWord8</span><span style="color:#cccccc;">)
</span><span>> </span><span style="color:#66ccff;">Offset</span><span style="color:#cccccc;"> (ofs </span><span>+ </span><span style="color:#eddd5a;">20 </span><span>+ </span><span style="color:#eddd5a;">3</span><span style="color:#cccccc;"> * </span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> '</span><span style="color:#66ccff;">FWord8</span><span style="color:#cccccc;">) '</span><span style="color:#80d500;">[]</span><span style="color:#cccccc;"> '</span><span style="color:#66ccff;">FWord8
</span><span>></span><span style="color:#cccccc;"> ofs </span><span>+ </span><span style="color:#eddd5a;">23
</span></code></pre>
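<p>The end result of this reduction can be checked by asking GHC to evaluate the type family with <code>ofs</code> set to 0.
The module below is a sketch under a few assumptions: the extension list is a guess at what a recent GHC (8.6 or later) needs,
<code>Element</code> is trimmed to what <code>Example</code> uses, the error messages are shortened,
and <code>natVal</code> is applied directly to a <code>Proxy</code>.</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span style="color:#cccccc;">{-# LANGUAGE DataKinds            #-}
{-# LANGUAGE KindSignatures       #-}
{-# LANGUAGE NoStarIsType         #-}
{-# LANGUAGE TypeFamilies         #-}
{-# LANGUAGE TypeOperators        #-}
{-# LANGUAGE UndecidableInstances #-}
module Main where

import Data.Proxy (Proxy (..))
import Data.Type.Bool (If)
import GHC.TypeLits

data Element
    = FWord8 | FWord32 | FWord64
    | FArray Nat Element
    | FStruct [(Symbol, Element)]
    | FUnion [(Symbol, Element)]

data Access = Field Symbol | Index Nat

type family Size (t :: Element) :: Nat where
    Size 'FWord8        = 1
    Size 'FWord32       = 4
    Size 'FWord64       = 8
    Size ('FArray n el) = n * Size el
    Size ('FStruct ls)  = StructSize ls
    Size ('FUnion ls)   = UnionSize ls

type family StructSize (ls :: [(Symbol, Element)]) :: Nat where
    StructSize '[]             = 0
    StructSize ('(_, l) ': ls) = Size l + StructSize ls

type family UnionSize (ls :: [(Symbol, Element)]) :: Nat where
    UnionSize '[]             = 0
    UnionSize ('(_, l) ': ls) = If (Size l &lt;=? UnionSize ls) (UnionSize ls) (Size l)

type family Offset (ofs :: Nat) (path :: [Access]) (t :: Element) :: Nat where
    Offset ofs '[]              t               = ofs
    Offset ofs ('Field f ': fs) ('FStruct dict) = StructOffset ofs f fs dict
    Offset ofs ('Field f ': fs) ('FUnion dict)  = UnionOffset ofs f fs dict
    Offset ofs ('Index i ': fs) ('FArray n t)   = ArrayOffset ofs i fs n t

type family StructOffset (ofs :: Nat) (field :: Symbol) (rs :: [Access])
                         (dict :: [(Symbol, Element)]) :: Nat where
    StructOffset ofs field rs '[] =
        TypeError ('Text "field " ':&lt;>: 'ShowType field ':&lt;>: 'Text " not found in structure")
    StructOffset ofs field rs ('(field, t) ': _) = Offset ofs rs t
    -- Skipping a field moves the offset forward by its size.
    StructOffset ofs field rs ('(_, v) ': r) = StructOffset (ofs + Size v) field rs r

type family UnionOffset (ofs :: Nat) (field :: Symbol) (rs :: [Access])
                        (dict :: [(Symbol, Element)]) :: Nat where
    UnionOffset ofs field rs '[] =
        TypeError ('Text "field " ':&lt;>: 'ShowType field ':&lt;>: 'Text " not found in union")
    UnionOffset ofs field rs ('(field, t) ': _) = Offset ofs rs t
    -- Union members overlap: the offset does not move.
    UnionOffset ofs field rs (_ ': r) = UnionOffset ofs field rs r

type family ArrayOffset (ofs :: Nat) (idx :: Nat) (rs :: [Access])
                        (n :: Nat) (t :: Element) :: Nat where
    ArrayOffset ofs idx rs n t =
        If (n &lt;=? idx)
           (TypeError ('Text "out of bounds: index " ':&lt;>: 'ShowType idx
                       ':&lt;>: 'Text " in array of size " ':&lt;>: 'ShowType n))
           (Offset (ofs + (idx * Size t)) rs t)

type Example = 'FStruct
    '[ '("a", 'FWord64)
     , '("b", 'FWord32)
     , '("addr", 'FUnion '[ '("addr64", 'FWord64)
                          , '("addr32", 'FStruct '[ '("hi", 'FWord32), '("low", 'FWord32) ])
                          ])
     , '("data", 'FArray 16 'FWord8)
     ]

main :: IO ()
main = do
    -- .data[3]: 8 + 4 + 8 bytes skipped, then 3 elements of 1 byte.
    print (natVal (Proxy :: Proxy (Offset 0 '[ 'Field "data", 'Index 3 ] Example)))
    -- .addr.addr32.low: the union starts at 12, and "hi" is skipped inside it.
    print (natVal (Proxy :: Proxy (Offset 0 '[ 'Field "addr", 'Field "addr32", 'Field "low" ] Example)))
</span></code></pre>
<p>With these definitions the two lines printed should be 23 and 16. Changing <code>'Index 3</code> to <code>'Index 16</code>
should make the module fail to compile with the out of bounds message.</p>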
<p>Now that we can calculate the <code>Offset</code> of accessors in a structure, we just need something to use it.</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span>getOffset :: </span><span style="color:#cccccc;">forall el fields . (</span><span style="color:#80d500;">KnownNat</span><span style="color:#cccccc;"> (</span><span style="color:#80d500;">Offset</span><span style="color:#cccccc;"> 0 fields el)) </span><span>=> </span><span style="color:#80d500;">Integer
</span><span style="color:#cccccc;">getOffset </span><span>=</span><span style="color:#cccccc;"> natVal (</span><span style="color:#66ccff;">Proxy </span><span>:: </span><span style="color:#66ccff;">Proxy</span><span style="color:#cccccc;"> (</span><span style="color:#66ccff;">Offset </span><span style="color:#eddd5a;">0</span><span style="color:#cccccc;"> fields el))
</span></code></pre>
<p>This is the same magic as <code>getSize</code>, and again we define a constant by construction.
We start counting the offset at 0 since we want to calculate an absolute
displacement, but we could start at some other point depending on need; for
example, if the starting offset were known at compilation, this would avoid a
runtime addition.</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span>></span><span style="color:#cccccc;"> putStrLn </span><span>$</span><span style="color:#cccccc;"> show (getOffset @</span><span style="color:#66ccff;">Example</span><span style="color:#cccccc;"> @('</span><span style="color:#80d500;">[]</span><span style="color:#cccccc;">))
</span><span style="color:#eddd5a;">0
</span><span>></span><span style="color:#cccccc;"> putStrLn </span><span>$</span><span style="color:#cccccc;"> show (getOffset @</span><span style="color:#66ccff;">Example</span><span style="color:#cccccc;"> @('[ '</span><span style="color:#66ccff;">Field </span><span style="color:#ffd700;">"a"</span><span style="color:#cccccc;">]))
</span><span style="color:#eddd5a;">0
</span><span>></span><span style="color:#cccccc;"> putStrLn </span><span>$</span><span style="color:#cccccc;"> show (getOffset @</span><span style="color:#66ccff;">Example</span><span style="color:#cccccc;"> @('[ '</span><span style="color:#66ccff;">Field </span><span style="color:#ffd700;">"b"</span><span style="color:#cccccc;">]))
</span><span style="color:#eddd5a;">8
</span><span>></span><span style="color:#cccccc;"> putStrLn </span><span>$</span><span style="color:#cccccc;"> show (getOffset @</span><span style="color:#66ccff;">Example</span><span style="color:#cccccc;"> @('[ '</span><span style="color:#66ccff;">Field </span><span style="color:#ffd700;">"addr"</span><span style="color:#cccccc;">, '</span><span style="color:#66ccff;">Field </span><span style="color:#ffd700;">"addr32"</span><span style="color:#cccccc;">, '</span><span style="color:#66ccff;">Field </span><span style="color:#ffd700;">"lo"</span><span style="color:#cccccc;"> ]))
</span><span style="color:#eddd5a;">16
</span><span>></span><span style="color:#cccccc;"> putStrLn </span><span>$</span><span style="color:#cccccc;"> show (getOffset @</span><span style="color:#66ccff;">Example</span><span style="color:#cccccc;"> @('[ '</span><span style="color:#66ccff;">Field </span><span style="color:#ffd700;">"data"</span><span style="color:#cccccc;">, '</span><span style="color:#66ccff;">Index </span><span style="color:#eddd5a;">3</span><span style="color:#cccccc;"> ]))
</span><span style="color:#eddd5a;">23
</span></code></pre>
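<p>The remark about starting at an offset other than 0 can be sketched as follows. This is a simplified, self-contained illustration and not the code from the gist: the flat <code>Layout</code> list, the <code>OffsetOf</code> family and <code>offsetFrom</code> are hypothetical names, and nested structures, unions and arrays are left out.</p>

```haskell
{-# LANGUAGE AllowAmbiguousTypes  #-}
{-# LANGUAGE DataKinds            #-}
{-# LANGUAGE FlexibleContexts     #-}
{-# LANGUAGE KindSignatures       #-}
{-# LANGUAGE ScopedTypeVariables  #-}
{-# LANGUAGE TypeApplications     #-}
{-# LANGUAGE TypeFamilies         #-}
{-# LANGUAGE TypeOperators        #-}
{-# LANGUAGE UndecidableInstances #-}

import Data.Proxy   (Proxy (..))
import GHC.TypeLits (KnownNat, Nat, Symbol, natVal, type (+))

-- Hypothetical flat layout: a list of (field name, size in bytes) pairs.
type Layout = '[ '("a", 8), '("b", 4), '("c", 2) ]

-- Walk the field list, adding up sizes until the wanted name is found.
-- There is no equation for the empty list, so an unknown field name
-- leaves the type stuck and is rejected at compile time.
type family OffsetOf (start :: Nat) (name :: Symbol) (fields :: [(Symbol, Nat)]) :: Nat where
    OffsetOf start name ('(name,  size) ': rest) = start
    OffsetOf start name ('(other, size) ': rest) = OffsetOf (start + size) name rest

-- The starting offset is a type-level parameter, so it is folded into the
-- resulting constant instead of being added at runtime.
offsetFrom :: forall start name fields . KnownNat (OffsetOf start name fields) => Integer
offsetFrom = natVal (Proxy :: Proxy (OffsetOf start name fields))

main :: IO ()
main = do
    print (offsetFrom @0   @"c" @Layout) -- 12
    print (offsetFrom @100 @"c" @Layout) -- 112
```

<p>Here the base address 100 is assumed to be known at compilation; when it is only known at runtime, the addition has to stay at the value level.</p>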
<h2 id="conclusion">Conclusion<a class="zola-anchor" href="#conclusion" aria-label="Anchor link for: conclusion">§</a>
</h2>
<p>One nice aspect of this is that you can efficiently nest structures, and you can
re-use the same field names in different structures without any problem.</p>
<p>You can also define at compilation all sorts of different offsets and sizes
that are automatically recalculated from their structures, and combine them together.</p>
<p>With this primitive machinery, it's straightforward to define efficient,
safe, modular accessor functions (e.g. peek & poke) on top of this.</p>
<h2 id="code">Code<a class="zola-anchor" href="#code" aria-label="Anchor link for: code">§</a>
</h2>
<p>You can find the code:</p>
<ul>
<li><a href="https://gist.github.com/vincenthz/9c840ec99172c495a811b9e50c15c788">Code Gist</a></li>
<li><a href="https://gist.github.com/vincenthz/34f0dc42128491317329b42f00fe5294">Experimental Example Usage Gist</a></li>
</ul>
<h2 id="notes">Notes<a class="zola-anchor" href="#notes" aria-label="Anchor link for: notes">§</a>
</h2>
<ol>
<li>Packing & Padding</li>
</ol>
<p>In all this code I consider the C structure packed, and not containing any
padding. While the rules of alignment/padding could be added to the calculation
types, I chose to ignore the issue since the developer can always start from a
packed structure definition and add the necessary padding explicitly in the
definition. It would also be possible to define special padding types that
automatically work out their size given how much padding is needed.</p>
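<p>As a rough illustration of such a padding type, the amount of padding can be computed at the type level from the current offset and the wanted alignment. The <code>Padding</code> synonym and the <code>padding</code> function below are hypothetical names, not part of the gist; the sketch only assumes <code>Mod</code> and <code>-</code> from <code>GHC.TypeLits</code>.</p>

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds           #-}
{-# LANGUAGE FlexibleContexts    #-}
{-# LANGUAGE KindSignatures      #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications    #-}
{-# LANGUAGE TypeOperators       #-}

import Data.Proxy   (Proxy (..))
import GHC.TypeLits (KnownNat, Mod, Nat, natVal, type (-))

-- Hypothetical helper: number of padding bytes needed so that a field
-- placed at offset 'ofs' starts on a multiple of 'align'.
-- 'Mod ofs align' is always smaller than 'align', so the subtraction
-- never goes below zero.
type Padding (align :: Nat) (ofs :: Nat) = Mod (align - Mod ofs align) align

-- Bring the type-level result down to a value, as getOffset does.
padding :: forall align ofs . KnownNat (Padding align ofs) => Integer
padding = natVal (Proxy :: Proxy (Padding align ofs))

main :: IO ()
main = do
    print (padding @4 @5)  -- 3 bytes to reach offset 8
    print (padding @8 @16) -- 0, already aligned
```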
<ol start="2">
<li>Endianness</li>
</ol>
<p>I completely ignore endianness for simplicity, but a real library would
likely just extend the definitions to add explicit endianness for all
multi-byte types.</p>
<ol start="3">
<li>Nat and Integer</li>
</ol>
<p>It would be nice to be able to generate offsets as machine Int or Word, instead
of unbounded Integer. Sadly the only capability for Nat is to generate an Integer
with <code>natVal</code>. The optimisation is probably marginal considering it's just a
constructor away, but it would prevent an unnecessary unwrapping and possibly
allow even more efficient code.</p>
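<p>A minimal sketch of the workaround, assuming the value fits in an <code>Int</code>: go through <code>Integer</code> and convert with <code>fromInteger</code>. The name <code>natInt</code> is hypothetical, and whether GHC reduces the conversion to a literal depends on the optimisation level, which is not measured here.</p>

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds           #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications    #-}

import Data.Proxy   (Proxy (..))
import GHC.TypeLits (KnownNat, natVal)

-- natVal only produces an Integer, so the machine Int is obtained by an
-- explicit conversion. Values larger than maxBound :: Int would wrap.
natInt :: forall n . KnownNat n => Int
natInt = fromInteger (natVal (Proxy :: Proxy n))

main :: IO ()
main = print (natInt @23) -- 23
```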
Foundation2016-09-09T00:00:00+00:002016-09-09T00:00:00+00:00https://vincenthz.github.io/foundation/<p>A new hope. Foundation is a new library that tries to define a new modern Haskell framework.
It is also trying to be more than a library: a common place for
the community to improve things and define new things.</p>
<span id="continue-reading"></span>
<p>It started as a thought experiment:</p>
<p><strong>What would a modern Haskell base look like if I could start from scratch ?</strong></p>
<p>What would I need to complete my Haskell projects without
falling into traditional pitfalls like inefficient String, all-in-one Num,
unproductive packing and unpacking, etc. ?</p>
<p>One of the constraints, that was set early on, was not depending on any
external packages, instead depending only on what GHC provides (itself, base, and libraries like ghc-prim).
While it may sound surprising, especially considering the usually high quality and precision of
libraries available on hackage, there are many reasons for not depending on anything;
I’ll motivate the reason later in the article, so hang on.</p>
<p>A very interesting article from Stephen Diehl on <a href="http://www.stephendiehl.com/posts/production.html">production</a> details
some of the pitfalls of Haskell and the workarounds for less than
ideal situations or choices; it outlines pretty well some of the reasons for this effort.</p>
<h1 id="starting-with-the-basic">Starting with the basic<a class="zola-anchor" href="#starting-with-the-basic" aria-label="Anchor link for: starting-with-the-basic">§</a>
</h1>
<p>Some of the few basic things that you'll find in any modern Haskell project are
ByteArray, Array and packed strings, usually in the form of the <code>bytestring</code>,
<code>vector</code> and <code>text</code> packages.</p>
<p>We decided to start here. One of the common problems with those types is
their lack of interoperability. There's usually a way to convert one into
another, but it's either exposed in an <code>Unsafe</code> or <code>Internal</code> module, or has a
scary name like <code>unsafeFromForeignPtr</code>.</p>
<p>Then, if you're unlucky you will see some issues with unpinned and pinned memory
(probably in production settings to maximise fun); the common <code>ByteString</code> uses
pinned memory, <code>ShortByteString</code> and <code>Text</code> use unpinned memory, and
<code>Vector</code>, well, it's complicated (there are 4 different kinds of Vectors).</p>
<p>Note: pinned and unpinned describe whether the GC is allowed to move the
memory. Unpinned is usually better as it allows the memory system to reduce
fragmentation, but pinned memory is crucial for dealing with Input/Output with
the real world, large data (and some other uses).</p>
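<p>To make the pinned/unpinned split concrete, here is a small sketch using the existing <code>bytestring</code> package (it ships with GHC), where moving between the pinned <code>ByteString</code> and the unpinned <code>ShortByteString</code> means copying the bytes each way. This only illustrates the situation described above; it is not foundation code.</p>

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Short as S

main :: IO ()
main = do
    let pinned   = B.pack "hello"       -- ByteString: pinned memory
        unpinned = S.toShort pinned     -- ShortByteString: unpinned, copies the bytes
        back     = S.fromShort unpinned -- back to pinned memory, copies again
    print (S.length unpinned) -- 5
    print (back == pinned)    -- True
```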
<h2 id="unboxed-array">Unboxed Array<a class="zola-anchor" href="#unboxed-array" aria-label="Anchor link for: unboxed-array">§</a>
</h2>
<p>Our cornerstone is the unboxed array. The unboxed array is a native Haskell
ByteArray (represented by the <code>ByteArray#</code> primitive type),
and it is allowed to be unpinned or pinned (at allocation time). To also support
further interesting stuff, we supplement it with another constructor to make it
able to support natively a chunk of memory referenced by a pointer.</p>
<p>In simplified terms it looks like:</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span>data </span><span style="color:#66ccff;">UArray</span><span style="color:#cccccc;"> ty </span><span>= </span><span style="color:#66ccff;">UArrayBA</span><span style="color:#cccccc;"> (</span><span style="color:#66ccff;">Offset</span><span style="color:#cccccc;"> ty) (</span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> ty) </span><span style="color:#66ccff;">PinnedStatus ByteArray</span><span style="color:#cccccc;">#
</span><span style="color:#cccccc;"> </span><span>| </span><span style="color:#66ccff;">UArrayAddr</span><span style="color:#cccccc;"> (</span><span style="color:#66ccff;">Offset</span><span style="color:#cccccc;"> ty) (</span><span style="color:#66ccff;">Size</span><span style="color:#cccccc;"> ty) (</span><span style="color:#66ccff;">Ptr</span><span style="color:#cccccc;"> ty)
</span></code></pre>
<p>With this capability, we have the equivalent of <code>ByteString</code>,
<code>ShortByteString</code>, Unboxed <code>Vector</code> and (Some) Storable <code>Vector</code>, implemented
in one user friendly type. This is a really big win for users, as suddenly all
those types play better together; they are all the same thing working
the same way.</p>
<p>Instead of differentiating <code>ByteString</code> and <code>Vector</code>, now <code>ByteString</code> disappears
completely in favor of <em>just</em> being a <code>UArray Word8</code>. This has been tried
before with the current ecosystem with <a href="http://hackage.haskell.org/package/vector-bytestring">vector-bytestring</a>.</p>
<h2 id="string">String<a class="zola-anchor" href="#string" aria-label="Anchor link for: string">§</a>
</h2>
<p>String is a big pain point. Base represents it as a list of Char <code>[Char]</code>, which
as you can imagine is not efficient for most purposes. <code>Text</code> from the popular <code>text</code>
package implements a packed version of this, using UTF-16 (at the time of writing) and unpinned native <code>ByteArray#</code>.</p>
<p>While <code>text</code> is almost a standard in haskell, it’s very likely you’ll need
to pack and unpack this representation to interact with base functions, or
switch representation often to interact with some libraries.</p>
<p>Note on Unicode: UTF-8 is an encoding format where Unicode code points are
encoded as sequences of 1 to 4 bytes (4 different cases). UTF-16 represents Unicode
code points with either 2 or 4 bytes (2 different cases).</p>
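<p>The cases mentioned in this note can be written down directly. The sketch below, with the hypothetical names <code>utf8Bytes</code> and <code>utf16Bytes</code>, computes how many bytes one code point takes in each encoding; it ignores the surrogate range, which is not valid in either encoding.</p>

```haskell
import Data.Char (ord)

-- Number of bytes needed to encode one code point in UTF-8 (4 cases).
utf8Bytes :: Char -> Int
utf8Bytes c
    | n < 0x80    = 1
    | n < 0x800   = 2
    | n < 0x10000 = 3
    | otherwise   = 4
  where
    n = ord c

-- Number of bytes needed to encode one code point in UTF-16 (2 cases):
-- one 16-bit unit, or a surrogate pair above the Basic Multilingual Plane.
utf16Bytes :: Char -> Int
utf16Bytes c
    | ord c < 0x10000 = 2
    | otherwise       = 4

main :: IO ()
main = mapM_ (\c -> print (utf8Bytes c, utf16Bytes c))
             ['a', '\xE9', '\x20AC', '\x1F600']
```

<p>For the four sample characters this prints (1,2), (2,2), (3,2) and (4,4): ASCII text is half the size in UTF-8, while code points that need 3 bytes in UTF-8 are smaller in UTF-16.</p>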
<p>Foundation’s String is packed UTF-8 data backed by an unboxed vector of bytes.
This means we can offer a lightweight type based on <code>UArray Word8</code>:</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span>newtype </span><span style="color:#66ccff;">String </span><span>= </span><span style="color:#66ccff;">String</span><span style="color:#cccccc;"> (</span><span style="color:#66ccff;">UArray Word8</span><span style="color:#cccccc;">)
</span></code></pre>
<p>So by doing this, we directly inherit all the advantages of our vector types; namely
we have a <code>String</code> type that is unpinned or pinned (depending on needs), and supports
native pointers. It is extremely lightweight to convert between the two: provided
UTF-8 binary data, we only validate the data, without re-allocating anything.</p>
<p>There’s no perfect representation of Unicode; each representation has its own
advantages and disadvantages, and it really depends on what types of data you’re
actually processing. One easy rule of thumb is that the more cases your representation
has, the slower it will be to process the highest Unicode sequences.</p>
<p>By extension, it means that choosing a unique representation leads to compromise.
In early benchmarks against <code>text</code> we are consistently outperforming <code>Text</code> when
the data is predominantly ASCII (i.e. 1-byte encoding).
For other types of data, it really depends; sometimes we’re faster still,
sometimes slower, and sometimes on par.</p>
<p>Caveat emptor: the benchmarks are far from reliable, and have only been run on 2 machines
with similar characteristics so far.</p>
<h2 id="other-types">Other Types<a class="zola-anchor" href="#other-types" aria-label="Anchor link for: other-types">§</a>
</h2>
<p>We also already support:</p>
<ul>
<li><strong>Boxed Array</strong>. This is an array of any other Haskell type. Think of it as
an array of pointers to other Haskell values</li>
<li><strong>Bitmap</strong>. 1 bit packed unboxed array</li>
</ul>
<p>In the short term, we expect to add:</p>
<ul>
<li><strong>tree like structure</strong>.</li>
<li><strong>hash based structure</strong>.</li>
</ul>
<h2 id="unified-collection-api">Unified Collection API<a class="zola-anchor" href="#unified-collection-api" aria-label="Anchor link for: unified-collection-api">§</a>
</h2>
<p>Many types of collections support the same kind of operations.</p>
<p>For example, commonly you have very similar functions defined with different types:</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span style="color:#cccccc;"> </span><span>take :: </span><span style="color:#80d500;">Int </span><span>-> </span><span style="color:#80d500;">UArray </span><span style="color:#cccccc;">a </span><span>-> </span><span style="color:#80d500;">UArray </span><span style="color:#cccccc;">a
</span><span style="color:#cccccc;"> </span><span>take :: </span><span style="color:#80d500;">Int </span><span>-></span><span style="color:#cccccc;"> [a] </span><span>-></span><span style="color:#cccccc;"> [a]
</span><span style="color:#cccccc;"> </span><span>take :: </span><span style="color:#80d500;">Int </span><span>-> </span><span style="color:#80d500;">String </span><span>-> </span><span style="color:#80d500;">String
</span><span style="color:#cccccc;">
</span><span style="color:#cccccc;"> </span><span>head :: </span><span style="color:#80d500;">UArray </span><span style="color:#cccccc;">a </span><span>-> </span><span style="color:#cccccc;">a
</span><span style="color:#cccccc;"> </span><span>head :: </span><span style="color:#80d500;">String </span><span>-> </span><span style="color:#80d500;">Char
</span><span style="color:#cccccc;"> </span><span>head ::</span><span style="color:#cccccc;"> [a] </span><span>-> </span><span style="color:#cccccc;">a
</span></code></pre>
<p>So we tried to avoid monomorphic versions of common Haskell functions and instead
provide type family infused versions of those functions. In foundation we have:</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span style="color:#cccccc;"> </span><span>take :: </span><span style="color:#80d500;">Int </span><span>-> </span><span style="color:#cccccc;">collection </span><span>-> </span><span style="color:#cccccc;">collection
</span><span style="color:#cccccc;">
</span><span style="color:#cccccc;"> </span><span>head :: </span><span style="color:#cccccc;">collection </span><span>-> </span><span style="color:#80d500;">Element </span><span style="color:#cccccc;">collection
</span><span style="color:#cccccc;">
</span></code></pre>
<p>Note: <code>Element collection</code> is a type family. This allows one type to be
defined from another type, here "collection". For example, the Element of a <code>[a]</code> is <code>a</code>, and the Element of
<code>String</code> is <code>Char</code>.</p>
<p>Note2: <code>head</code> is not exactly defined this way in foundation: this was the simplest example
that shows type families and the overloading in action. foundation's <code>head</code> is not partial and is defined as:
<code>head :: NonEmpty collection -> Element collection</code></p>
<p>The consequence is that the same <code>head</code> or <code>take</code> (or other generic functions) work
the same way for many different collection types, even when they are monomorphic (e.g. String).</p>
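<p>A minimal sketch of how such an API can be put together with a type family and a type class is shown here. The names <code>Collection</code>, <code>takeC</code>, <code>headMay</code> and <code>Bits</code> are made up for the example and are not foundation's actual classes; the only point is that one function works on both a polymorphic and a monomorphic container.</p>

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Maps a collection type to the type of its elements.
type family Element c

-- Hypothetical class; foundation's real classes are organised differently.
class Collection c where
    takeC   :: Int -> c -> c
    headMay :: c -> Maybe (Element c)

-- A polymorphic container: lists.
type instance Element [a] = a

instance Collection [a] where
    takeC = take
    headMay []      = Nothing
    headMay (x : _) = Just x

-- A monomorphic container: its element type is fixed to Bool.
newtype Bits = Bits [Bool]
    deriving (Eq, Show)

type instance Element Bits = Bool

instance Collection Bits where
    takeC n (Bits bs) = Bits (takeC n bs)
    headMay (Bits bs) = headMay bs

main :: IO ()
main = do
    print (takeC 2 [1, 2, 3 :: Int])     -- [1,2]
    print (headMay "abc")                -- Just 'a'
    print (takeC 1 (Bits [True, False])) -- Bits [True]
    print (headMay (Bits []))            -- Nothing
```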
<p>For another good example of this approach being taken, have a look at the
<a href="https://hackage.haskell.org/package/mono-traversable">mono-traversable</a> package.</p>
<p>For other operations that are specific to a data structure, and hard to generalize,
we still expose dedicated operations.</p>
<h1 id="the-question-of-dependencies">The question of dependencies<a class="zola-anchor" href="#the-question-of-dependencies" aria-label="Anchor link for: the-question-of-dependencies">§</a>
</h1>
<p>If you’re not convinced by how we provide a better foundation to the standard
Haskell types, then it raises the question: why not depend
on those high quality libraries doing the exact same thing ?</p>
<p><strong>Consistency</strong>. I think it's easier to set a common direction, and have a consistent
approach when working in a central place together, than having N maintainers
working on M packages independently.</p>
<p><strong>Common place</strong>. An example sometimes speaks better than words: I have this X thing,
that depends on the A package, and the B package. Should I add it to A, to B, or
create a new C package ?</p>
<p><strong>Atomic development</strong>. We don't have to jump through hoops to improve our types
or functions against other parts of foundation. Having more things defined in the
same place means we can be more aggressive about improving things faster,
while retaining an overall package that makes sense.</p>
<p><strong>Versions, and Releases</strong>. It is far easier to depend on a small set of libraries
than to depend on hundreds of different versions. Particularly in an industrial
setting, I will be much more confident tracking 1 package, watching 1 issue tracker
and dealing with a set of known people, than having to deal with N packages, N issue trackers
(possibly in different places), and N maintainers.</p>
<h1 id="some-final-notes">Some final notes<a class="zola-anchor" href="#some-final-notes" aria-label="Anchor link for: some-final-notes">§</a>
</h1>
<p><strong>A fast iterative release schedule</strong>. We are planning to release early and release often,
with a predictable release schedule.</p>
<p><strong>We're still in the early stage</strong>. While we're at an exciting place,
don't expect a finished product right now.</p>
<p><strong>You don't need to be an expert to help</strong>. Anyone can help us shape foundation.</p>
<p><strong>Join us</strong>. If you want to get involved: all Foundation work
takes place in the open, on the <a href="https://github.com/haskell-foundation">haskell-foundation organisation</a>
with code, proposals, issues and voting, questions.</p>
Combining Rust and Haskell2015-09-28T00:00:00+00:002015-09-28T00:00:00+00:00https://vincenthz.github.io/rust-with-haskell/<p>Rust is a pretty interesting language, in the same space as C++ but more modern /
better. The stated goal of Rust is to be "a systems programming language focused
on three goals: safety, speed, and concurrency". Combining Rust with Haskell
could create some interesting use cases, and could replace the use of C in some
projects while providing a higher-level and safer approach where Haskell
cannot be used.</p>
<span id="continue-reading"></span>
<p>One of my reasons for doing this is that writing code targeting low-level
features is simpler in Rust than Haskell. For example, writing inline assembly
or some low-level OS routines. Also the performance of Rust is quite close to
C++, and I could see this being useful in certain cases where Haskell is not as
optimised.</p>
<p>In this short tutorial, let's call Rust functions from Haskell.</p>
<h2 id="the-rust-library">The Rust library<a class="zola-anchor" href="#the-rust-library" aria-label="Anchor link for: the-rust-library">§</a>
</h2>
<p>First we start with a hypothetical Rust library that takes a value, prints it to the
console and returns a value.</p>
<p>Our entry point in Rust is a simple <code>rust_hello</code>, in a <code>src/lib.rs</code> file:</p>
<pre data-lang="rust" style="background-color:#191919;color:#ffffff;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#cccccc;">#[no_mangle]
</span><span style="color:#80d500;">pub </span><span>extern </span><span style="color:#ffd700;">"C" </span><span style="color:#80d500;">fn </span><span>rust_hello</span><span style="color:#cccccc;">(</span><span style="font-style:italic;color:#8aa6c1;">v</span><span style="color:#cccccc;">: </span><span style="color:#80d500;">i32</span><span style="color:#cccccc;">) -> </span><span style="color:#80d500;">i32 </span><span style="color:#cccccc;">{
</span><span style="color:#cccccc;"> println!(</span><span style="color:#ffd700;">"Hello Rust World: </span><span style="color:#66ccff;">{}</span><span style="color:#ffd700;">"</span><span style="color:#cccccc;">, v);
</span><span style="color:#cccccc;"> v</span><span>+</span><span style="color:#eddd5a;">1
</span><span style="color:#cccccc;">}
</span></code></pre>
<p>One of the key things here is the presence of the <code>no_mangle</code> attribute, which allows
exporting the name of the function as-is in the library we're going to generate.</p>
<p>Rust uses Cargo to package libraries and executables, akin to Cabal for Haskell.
We can create the <code>Cargo.toml</code> for our test library:</p>
<pre data-lang="ini" style="background-color:#191919;color:#ffffff;" class="language-ini "><code class="language-ini" data-lang="ini"><span style="color:#80d500;">[package]
</span><span style="font-style:italic;color:#8aa6c1;">name </span><span>= </span><span style="color:#ffd700;">"hello"
</span><span style="font-style:italic;color:#8aa6c1;">version </span><span>= </span><span style="color:#ffd700;">"0.0.1"
</span><span style="font-style:italic;color:#8aa6c1;">authors </span><span>=</span><span style="color:#cccccc;"> [</span><span style="color:#ffd700;">"Vincent Hanquez <[email protected]>"</span><span style="color:#cccccc;">]
</span><span style="color:#cccccc;">
</span><span style="color:#80d500;">[lib]
</span><span style="font-style:italic;color:#8aa6c1;">name </span><span>= </span><span style="color:#ffd700;">"hello"
</span><span style="font-style:italic;color:#8aa6c1;">crate-type </span><span>=</span><span style="color:#cccccc;"> [</span><span style="color:#ffd700;">"staticlib"</span><span style="color:#cccccc;">]
</span></code></pre>
<p>The only special trick is that we ask Cargo to build a static library with the
<code>crate-type</code> key of the <code>[lib]</code> section, instead of the default Rust lib (.rlib).</p>
<p>Haskell doesn't know other calling / linking conventions like those of C++ (yet) or
Rust, which is why we need to jump through those hoops.</p>
<p>We can now build our Rust library with:</p>
<pre style="background-color:#191919;color:#ffffff;"><code><span>cargo build
</span></code></pre>
<p>If everything goes according to plan, you should end up with a <code>target</code> directory where you can find the <code>libhello.a</code> library (under <code>target/debug</code> for a default build).</p>
<h2 id="the-haskell-part">The haskell part<a class="zola-anchor" href="#the-haskell-part" aria-label="Anchor link for: the-haskell-part">§</a>
</h2>
<p>Now the Haskell part is really easy, as at this point there's not much difference
from linking with some static C library; first we create a <code>src/Main.hs</code>:</p>
<pre data-lang="haskell" style="background-color:#191919;color:#ffffff;" class="language-haskell "><code class="language-haskell" data-lang="haskell"><span style="color:#cccccc;">{-# </span><span>LANGUAGE</span><span style="color:#cccccc;"> ForeignFunctionInterface #-}
</span><span>module </span><span style="color:#8aa6c1;">Main </span><span>where
</span><span style="color:#cccccc;">
</span><span>import </span><span style="color:#8aa6c1;">Foreign.C.Types
</span><span style="color:#cccccc;">
</span><span style="color:#cccccc;">foreign </span><span>import</span><span style="color:#cccccc;"> ccall unsafe "rust_hello" rust_hello :: </span><span style="color:#8aa6c1;">CInt</span><span style="color:#cccccc;"> -> </span><span style="color:#8aa6c1;">IO CInt
</span><span style="color:#cccccc;">
</span><span style="color:#cccccc;">main </span><span>= </span><span style="color:#80d500;">do
</span><span style="color:#cccccc;"> v </span><span><-</span><span style="color:#cccccc;"> rust_hello </span><span style="color:#eddd5a;">1234
</span><span style="color:#cccccc;"> putStrLn (</span><span style="color:#ffd700;">"Rust returned: " </span><span>++</span><span style="color:#cccccc;"> show v)
</span></code></pre>
<p>Nothing special if you've done some C bindings yourself, otherwise I suggest
having a look at the <a href="https://en.wikibooks.org/wiki/Haskell/FFI">FFI</a> article.</p>
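<p>For comparison, this is what an ordinary C binding looks like, here against <code>abs</code> from the C standard library so that it builds without any extra library. The Haskell-side name <code>c_abs</code> is arbitrary; the shape of the declaration is the same as for <code>rust_hello</code>.</p>

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CInt (..))

-- Bind the C library's abs(); it has no side effects, so it is imported
-- as a pure function rather than in IO.
foreign import ccall unsafe "stdlib.h abs" c_abs :: CInt -> CInt

main :: IO ()
main = print (c_abs (-1234)) -- 1234
```

<p><code>rust_hello</code> prints to the console, which is why its import above returns <code>IO CInt</code> instead of a plain <code>CInt</code>.</p>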
<p>We can try directly linking with ghc:</p>
<pre style="background-color:#191919;color:#ffffff;"><code><span>ghc -o hello-rust --make src/Main.hs -lhello -Ltarget/debug
</span></code></pre>
<p>and running:</p>
<pre data-lang="sh" style="background-color:#191919;color:#ffffff;" class="language-sh "><code class="language-sh" data-lang="sh"><span style="color:#cccccc;">$ ./hello-rust
</span><span style="color:#cccccc;">Hello Rust World: 1234
</span><span style="color:#cccccc;">Rust returned: 1235
</span></code></pre>
<p>That achieves the goal above. From there we can polish things and add this to a cabal file:</p>
<pre style="background-color:#191919;color:#ffffff;"><code><span>name: hello-rust
</span><span>version: 0.1.0.0
</span><span>license: PublicDomain
</span><span>license-file: LICENSE
</span><span>author: Vincent Hanquez
</span><span>maintainer: [email protected]
</span><span>category: System
</span><span>build-type: Simple
</span><span>cabal-version: >=1.10
</span><span>extra-source-files: src/lib.rs
</span><span>
</span><span>executable hello-rust
</span><span> main-is: Main.hs
</span><span> other-extensions: ForeignFunctionInterface
</span><span> build-depends: base >=4.8
</span><span> hs-source-dirs: src
</span><span> default-language: Haskell2010
</span><span> extra-lib-dirs: target/release, target/debug
</span><span> extra-libraries: hello
</span></code></pre>
<p>Note: The <code>target/release</code> path is to support building with the <code>--release</code> flag of cargo build.
Listing <code>target/release</code> and then <code>target/debug</code> should allow you to pick up
the release library in preference to the debug library. It can also create some confusion,
and prints a warning on my system when one of the directories is missing.</p>
<p>The missing step is either adding some pre-build rules to cabal <code>Setup.hs</code> to run cargo build, or
some more elaborate build system, both of which are left as an exercise to the interested reader.</p>
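<p>One possible shape for the <code>Setup.hs</code> route, as an untested sketch: override the <code>preBuild</code> hook of Cabal's <code>simpleUserHooks</code> to run cargo first. It assumes <code>cargo</code> is on the <code>PATH</code>, that the cabal file is switched to <code>build-type: Custom</code>, and that the <code>process</code> package is available to the setup script.</p>

```haskell
import Distribution.Simple (UserHooks (preBuild), defaultMainWithHooks, simpleUserHooks)
import System.Process      (callProcess)

main :: IO ()
main = defaultMainWithHooks simpleUserHooks
    { preBuild = \args flags -> do
        -- Build the Rust static library before Cabal builds the Haskell
        -- code; callProcess throws if cargo exits with a failure code.
        callProcess "cargo" ["build", "--release"]
        -- Then carry on with the default pre-build behaviour.
        preBuild simpleUserHooks args flags
    }
```

<p>With <code>--release</code> the library ends up in <code>target/release</code>, which is already the first entry of <code>extra-lib-dirs</code> in the cabal file above.</p>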
<h2 id="where-this-could-go">Where this could go<a class="zola-anchor" href="#where-this-could-go" aria-label="Anchor link for: where-this-could-go">§</a>
</h2>
<p>Going forward, this could lead to having another language to target from Haskell
that is not as low-level as C, but offers stellar performance and more
high-level constructs (than C) without imposing any other runtime system. This
is interesting where you need to complete some tasks that Haskell is not quite
ready to handle (yet).</p>
<p>For example, a non exhaustive list:</p>
<ul>
<li>Writing cryptographic bindings in <code>rust&asm</code> instead of <code>C&asm</code></li>
<li>Heavily Vector-Optimised routines</li>
<li>Operating system routines (e.g. page table handling) for a hybrid and safer operating system kernel.</li>
<li><a href="https://github.com/servo/servo">Servo</a> embedding</li>
</ul>
<p>Let me know in the comments anything else that might be of interest !</p>
Announcing: cryptonite2015-06-02T00:00:00+00:002015-06-02T00:00:00+00:00https://vincenthz.github.io/announcing-cryptonite/<p>For the last 5 years, I've worked intermittently on cryptographic related packages for Haskell.
Lately, I've consolidated it all in one single package. Announcing <a href="http://hackage.haskell.org/package/cryptonite">cryptonite</a>.</p>
<span id="continue-reading"></span>
<p>This new package merges the following packages:</p>
<ul>
<li><a href="http://hackage.haskell.org/package/cryptohash">cryptohash</a></li>
<li><a href="http://hackage.haskell.org/package/cryptocipher">cryptocipher</a></li>
<li><a href="http://hackage.haskell.org/package/crypto-random">crypto-random</a></li>
<li><a href="http://hackage.haskell.org/package/crypto-pubkey-types">crypto-pubkey-types</a></li>
<li><a href="http://hackage.haskell.org/package/crypto-pubkey">crypto-pubkey</a></li>
<li><a href="http://hackage.haskell.org/package/crypto-numbers">crypto-numbers</a></li>
<li><a href="http://hackage.haskell.org/package/crypto-cipher-types">crypto-cipher-types</a></li>
<li><a href="http://hackage.haskell.org/package/crypto-cipher-tests">crypto-cipher-tests</a></li>
<li><a href="http://hackage.haskell.org/package/crypto-cipher-benchmarks">crypto-cipher-benchmarks</a></li>
<li><a href="http://hackage.haskell.org/package/cprng-aes">cprng-aes</a></li>
<li><a href="http://hackage.haskell.org/package/cipher-rc4">cipher-rc4</a></li>
<li><a href="http://hackage.haskell.org/package/cipher-des">cipher-des</a></li>
<li><a href="http://hackage.haskell.org/package/cipher-camellia">cipher-camellia</a></li>
<li><a href="http://hackage.haskell.org/package/cipher-blowfish">cipher-blowfish</a></li>
<li><a href="http://hackage.haskell.org/package/cipher-aes">cipher-aes</a></li>
<li><a href="http://hackage.haskell.org/package/afis">afis</a></li>
</ul>
<p>Also this package adds support for the following features:</p>
<ul>
<li><a href="http://cr.yp.to/chacha.html">ChaCha</a></li>
<li><a href="http://cr.yp.to/mac.html">Poly1305</a></li>
<li><a href="http://cr.yp.to/snuffle.html">Salsa</a></li>
<li><a href="http://tools.ietf.org/html/rfc2898">PBKDF2</a></li>
<li><a href="http://www.tarsnap.com/scrypt.html">Scrypt</a></li>
<li><a href="http://cr.yp.to/ecdh.html">Curve25519</a></li>
<li><a href="http://ed25519.cr.yp.to/papers.html">Ed25519</a></li>
<li>A faster and more secure NIST P256 ECC support (through Google P256 implementation)</li>
</ul>
<h2 id="why-this-new-packaging-model">Why this new packaging model ?<a class="zola-anchor" href="#why-this-new-packaging-model" aria-label="Anchor link for: why-this-new-packaging-model">§</a>
</h2>
<p>This is mostly rooted in three reasons:</p>
<ul>
<li>Discoverability</li>
<li>Cryptographic taxonomy</li>
<li>Maintenance</li>
</ul>
<p>Discovering new packages in our current world of hackage is not easy.
Unless you communicate heavily on new packages, there's a good chance that most
people would not know about a new package, leading to re-implementation,
duplicated features, and inconsistencies.</p>
<p>Cryptography taxonomy is hard, and getting harder; cryptographic primitives
are re-used creatively for making a hash from a cipher primitive, or a random
generator from a cipher, or an authentication code from Galois field operations.
This creates problems around where the code lives, how the code is tested,
etc.</p>
<p>Then, finally, if I have to choose a unique reason for doing this, it will be
maintenance. Maintenance of many cabal packages is costly in time: lower
bounds, upper bounds, re-installation, compatibility modules, testing framework, benchmarking
framework.</p>
<p>My limited free time has been siphoned into doing unproductive cross-package
tasks, for example:</p>
<ul>
<li>Upgrading bounds</li>
<li>Sorting out ghc database of installed-packages when reinstalling packages for testing features</li>
<li>Duplicating compatibility modules for supporting many compilers and library versions</li>
<li>Maintaining meta data for many packages (e.g. LICENSE, CHANGELOG, README, .travis, .cabal)</li>
<li>Tagging and releasing new versions</li>
</ul>
<p>Doing everything in one package simplifies the building issues, gives a better
ability to test features easily, makes a more consistent cryptographic
solution, and minimizes meta data changes.</p>
<h2 id="what-happens-to-other-crypto-packages">What happens to other crypto packages ?<a class="zola-anchor" href="#what-happens-to-other-crypto-packages" aria-label="Anchor link for: what-happens-to-other-crypto-packages">§</a>
</h2>
<p>Cryptonite should be better in almost every aspect: better features, better testing.
There are no real reasons to maintain any of the old packages anymore, so in
the long run, I expect most of those packages to become deprecated. I encourage
everyone to move to the new package.</p>
<p>I'll try to answer any migration questions as they arise, but most of the migration
should be straightforward in general.</p>
<p>I'm committed to maintaining cryptohash for now, as it is very widely used. I'll
try to maintain the rest of the packages for now, but don't expect this to last
for long.</p>
<p>Otherwise, if some people are interested in keeping certain other pieces
independent and maintained, come talk to me directly with well-motivated arguments.</p>
<h2 id="contributing">Contributing<a class="zola-anchor" href="#contributing" aria-label="Anchor link for: contributing">§</a>
</h2>
<p>I hope this brings contributions, that this becomes a more
community-maintained package, and especially that cryptonite becomes the
canonical place for anything cryptography related in Haskell.</p>
<p>Main things to look out for, for successful contributions:</p>
<ul>
<li>respect the coding style</li>
<li>do not introduce dependencies</li>
</ul>
<p>Also, you don't need to know every little thing in cryptography to help
maintain and add features in cryptonite.</p>
<p>PS: I'm also looking forward to more cryptography related discussions
about timing attacks, what source of random is truly random, etc. :-þ</p>
Downloading safely2015-05-30T00:00:00+00:002015-05-30T00:00:00+00:00https://vincenthz.github.io/downloading-safely/<p>All too often, things are downloaded without safety from hosts and mirrors.
Here's a practical guide to know where you stand and improve the situation.</p>
<span id="continue-reading"></span>
<p>I've seen countless examples of scripts, Dockerfiles, etc., that
do something akin to:</p>
<pre data-lang="{.shell}" style="background-color:#191919;color:#ffffff;" class="language-{.shell} "><code class="language-{.shell}" data-lang="{.shell}"><span>wget http://my-url/package (or curl)
</span><span>install package
</span></code></pre>
<p>For people who don't read Unix-like shell scripts: this downloads a
package from an HTTP server over a plain-text connection, then tries to install it.</p>
<p>If you have some piece of automated machinery or a script, it's very likely
that you want exactly this: to download one very specific piece of data (for example a specific version of a compiler).</p>
<h2 id="what-can-possibly-go-wrong">What can possibly go wrong ?<a class="zola-anchor" href="#what-can-possibly-go-wrong" aria-label="Anchor link for: what-can-possibly-go-wrong">§</a>
</h2>
<p>Pretty much everything:</p>
<ul>
<li>Your query could have been intercepted and modified: you don't know which server you're talking to.</li>
<li>The data could have been tampered with on the server (for example, by adding malware to a .tar.gz)</li>
<li>An unauthorized person could have modified the data (e.g. someone managed to do a release but it's not the </li>
</ul>
<h2 id="can-t-i-just-use-a-secure-transport-and-be-done">Can't I just use a secure transport and be done ?<a class="zola-anchor" href="#can-t-i-just-use-a-secure-transport-and-be-done" aria-label="Anchor link for: can-t-i-just-use-a-secure-transport-and-be-done">§</a>
</h2>
<p>This is a good idea, but a secure transport only secures the transport part.</p>
<p>A secure transport makes sure that the data you send and receive
goes to and comes from a specific server, without fear of eavesdropping and
meddling.</p>
<p>But it doesn't give you any guarantee that the data has not been changed on the server,
e.g. the server was compromised and the package you're downloading has been tampered with.</p>
<h2 id="signature-to-the-rescue">Signature to the rescue !<a class="zola-anchor" href="#signature-to-the-rescue" aria-label="Anchor link for: signature-to-the-rescue">§</a>
</h2>
<p>Using signatures, usually GPG ones, one can make sure that the data has been released
by the correct person/team. For example, the developers in the team sign each
release with their personal keys.</p>
<p>Now, we can detect that data has been released by a specific person (or team),
which prevents the data being tampered with by a third party.</p>
<p>But this is still not enough: for your security, you still rely on something you have no control
over; the GPG key could have been compromised, or the people
owning the key could have been compromised.</p>
<h2 id="knowing-what-to-expect">Knowing what to expect<a class="zola-anchor" href="#knowing-what-to-expect" aria-label="Anchor link for: knowing-what-to-expect">§</a>
</h2>
<p>A simple technique to reduce an arbitrarily sized piece of data to a fixed-size
fingerprint is to compute its digest with a <a href="https://en.wikipedia.org/wiki/Cryptographic_hash_function">cryptographic hash function</a>.</p>
<p>More specifically we're looking for those 2 properties in the algorithm generating the digest in this case:</p>
<blockquote>
<p>it is infeasible to modify a message without changing the hash.
it is infeasible to find two different messages with the same hash.</p>
</blockquote>
<p>By embedding a digest of what you expect to download in your script,
you know that no one can modify it without you knowing about it on the
other side.</p>
<p>Which means that neither the security of the transport nor the security of
the GPG keys of some people (or group) matters anymore to determine
that you have downloaded what you expect.</p>
<p>It doesn't make anything secure by itself, but it allows you to vet what you
install, for example by looking at the source, or by sandboxing the app to see what it does
in a restricted context. Then only you (or your organisation) are in charge
of rubber-stamping this specific version with a digest.</p>
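<p>As a rough illustration of embedding the expected digest, here is a minimal Haskell sketch.
It assumes the cryptonite package for SHA-256; the file name <code>package.tar.gz</code>, the helper
<code>sha256Hex</code> and the pinned value are placeholders made up for this example (the value shown
is simply the SHA-256 of an empty input), not something taken from a real release.</p>
<pre data-lang="{.haskell}" style="background-color:#191919;color:#ffffff;" class="language-{.haskell} "><code class="language-{.haskell}" data-lang="{.haskell}"><span>-- Sketch only: check a downloaded file against a pinned SHA-256 digest.
</span><span>import qualified Data.ByteString.Lazy as L
</span><span>import Crypto.Hash (Digest, SHA256, hashlazy)
</span><span>import System.Exit (exitFailure)
</span><span>
</span><span>-- Placeholder: replace with the digest of the version you vetted.
</span><span>-- (This particular value is the SHA-256 of an empty input.)
</span><span>expectedDigest :: String
</span><span>expectedDigest = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
</span><span>
</span><span>-- Hex-encoded SHA-256 of a lazy bytestring.
</span><span>sha256Hex :: L.ByteString -> String
</span><span>sha256Hex bytes = show (hashlazy bytes :: Digest SHA256)
</span><span>
</span><span>main :: IO ()
</span><span>main = do
</span><span>    contents <- L.readFile "package.tar.gz"   -- placeholder path
</span><span>    if sha256Hex contents == expectedDigest
</span><span>        then putStrLn "digest matches"
</span><span>        else putStrLn "digest mismatch, refusing to install" >> exitFailure
</span></code></pre>
<p>The point of the design is that the digest lives in your script, next to the download
command, so changing what gets installed requires changing something you control.</p>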
<h2 id="finalizing">Finalizing<a class="zola-anchor" href="#finalizing" aria-label="Anchor link for: finalizing">§</a>
</h2>
<p>To conclude, hashing is one of the most powerful and simple techniques
to make sure the data has not changed and is exactly what
you expect.</p>
<p>Here is a short table of the risks related to the transport and validation method used:</p>
<table><thead><tr><th>Validation Method</th><th>Transport</th><th>MITM</th><th>Tampering</th><th>Author Validated</th></tr></thead><tbody>
<tr><td>None</td><td>Plain</td><td>Unprotected</td><td>Possible</td><td>No</td></tr>
<tr><td>None</td><td>Secure</td><td>Protected</td><td>Possible</td><td>No</td></tr>
<tr><td>GPG</td><td>Plain</td><td>Protected</td><td>Possible</td><td>Yes</td></tr>
<tr><td>GPG</td><td>Secure</td><td>Protected</td><td>Possible</td><td>Yes</td></tr>
<tr><td>Hash</td><td>Plain</td><td>Protected</td><td>Impossible</td><td>No</td></tr>
<tr><td>Hash</td><td>Secure</td><td>Protected</td><td>Impossible</td><td>No</td></tr>
<tr><td>Hash+GPG</td><td>Plain</td><td>Protected</td><td>Impossible</td><td>Yes</td></tr>
<tr><td>Hash+GPG</td><td>Secure</td><td>Protected</td><td>Impossible</td><td>Yes</td></tr>
</tbody></table>
<p>Note: it's assumed that the algorithms used for each specific purpose are perfect with regard to the properties they provide.</p>
Simple time with Hourglass2014-05-05T00:00:00+00:002014-05-05T00:00:00+00:00https://vincenthz.github.io/hourglass-simpler-time/<p>Each time I've used the <a href="http://hackage.haskell.org/package/time">time</a> API in
Haskell, I'm left with the distinct feeling that the API is not what I want it
to be. After one time too many searching the API to do some basic thing, I've
decided to look at the design space and just try implementing what I want to
use.</p>
<span id="continue-reading"></span>
<p>Before going into this re-design, this is my list of issues with the current API:</p>
<ul>
<li>UTCTime is represented as a number of days since a date (sometime in the 19th
century), plus a time difference in seconds from the beginning of the day.
This is probably the worst representation to settle on as the main type, as it is
neither a good computer representation nor a good human representation.</li>
<li>Every time I need to use the time API, I need to look at the documentation.
With the number of times I've used the time API, I feel like I shouldn't need to
anymore. Sure it got easier, but it's not as trivial as I want it to be.
The number of functions and the number of types make it difficult. YMMV.</li>
<li>Too many calendar modules. I just want the standard western calendar module.
It's called the gregorian calendar, and time makes sure you need to remember
that, as it's part of many function names useful to do things.</li>
<li>C time format strings for parsing and printing. Each time I need to format time,
I pretty much need to consult the documentation (again), as there are almost
50 different formatters, each represented by a single letter (some of which have no link to what they represent).</li>
<li>Need to add the <a href="http://hackage.haskell.org/package/old-locale">old-locale</a>
package when doing formatting. Why is this old, if it's still in use and
doesn't have a replacement ?</li>
<li>A local time API that gets in the way, with different types than global time:
TimeOfDay, ZonedTime, LocalTime. YMMV.</li>
</ul>
<p>Ironically, old-time seems much closer to what I have in mind with some part of
the time API. The name seems to imply that this was the time api before it got
changed to what is currently available.</p>
<h2 id="re-design">Re-design<a class="zola-anchor" href="#re-design" aria-label="Anchor link for: re-design">§</a>
</h2>
<p>So I've got 4 items on this design list:</p>
<ol>
<li>Some better types</li>
<li>Use the system API to go faster</li>
<li>Unified and open system</li>
<li>Better capability for printing and parsing</li>
</ol>
<h2 id="better-types">Better types<a class="zola-anchor" href="#better-types" aria-label="Anchor link for: better-types">§</a>
</h2>
<p>I wanted the main time type to be computer friendly, and linked to how existing API return the time:</p>
<ul>
<li>On Windows systems, it's the number of 100-nanosecond intervals (ticks) since 1 January 1601.</li>
<li>On Unix systems, it's simply the number of seconds since 1 January 1970.</li>
</ul>
<p>It's probably fair to expect other systems to have a similar accounting method,
and anyway just those two flavors probably cover 99% of usage. I originally
planned to keep the system referential in the type, but instead it's simpler to
choose one.</p>
<p>Inventing a new one would be fairly pointless, as it would force both systems to
do conversions. Converting between the Windows and Unix epochs is really simple and
very cheap (one int64 addition, one int64 multiplication), so Unix has been chosen.</p>
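<p>To make the cost concrete, the conversion can be sketched as below. The function and constant
names are made up for this illustration and are not part of hourglass; the only facts used are
that a tick is 100 nanoseconds and that the two epochs are 11644473600 seconds apart.</p>
<pre data-lang="{.haskell}" style="background-color:#191919;color:#ffffff;" class="language-{.haskell} "><code class="language-{.haskell}" data-lang="{.haskell}"><span>import Data.Int (Int64)
</span><span>
</span><span>-- Seconds between 1601-01-01 (Windows epoch) and 1970-01-01 (Unix epoch).
</span><span>epochDeltaSeconds :: Int64
</span><span>epochDeltaSeconds = 11644473600
</span><span>
</span><span>-- Number of 100 ns ticks in one second.
</span><span>ticksPerSecond :: Int64
</span><span>ticksPerSecond = 10000000
</span><span>
</span><span>-- Unix seconds to Windows ticks: one addition, one multiplication.
</span><span>unixToWindowsTicks :: Int64 -> Int64
</span><span>unixToWindowsTicks s = (s + epochDeltaSeconds) * ticksPerSecond
</span><span>
</span><span>-- Windows ticks to whole Unix seconds (the sub-second part is dropped).
</span><span>windowsTicksToUnix :: Int64 -> Int64
</span><span>windowsTicksToUnix t = t `div` ticksPerSecond - epochDeltaSeconds
</span><span>
</span><span>main :: IO ()
</span><span>main = do
</span><span>    print (unixToWindowsTicks 0)                   -- 116444736000000000
</span><span>    print (windowsTicksToUnix 116444736000000000)  -- 0
</span></code></pre>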
<p>Along with the computer types, proper human types are useful for interacting
with the users. This means a Date type, a TimeOfDay, and a combined DateTime,
described in pseudo haskell as:</p>
<pre data-lang="{.haskell}" style="background-color:#191919;color:#ffffff;" class="language-{.haskell} "><code class="language-{.haskell}" data-lang="{.haskell}"><span> data Date = Date Year Month Day
</span><span> data TimeOfDay = TimeOfDay Hour Minute Seconds
</span><span> data DateTime = DateTime Date TimeOfDay
</span></code></pre>
<h2 id="use-the-system-luke">Use the System, Luke !<a class="zola-anchor" href="#use-the-system-luke" aria-label="Anchor link for: use-the-system-luke">§</a>
</h2>
<p>Heavy conversion between seconds and date is done by the system. Most
systems have a very efficient way to do that:</p>
<ul>
<li>On Unix, that means <a href="http://pubs.opengroup.org/onlinepubs/009695399/functions/gmtime.html">gmtime</a></li>
<li>On Windows, <a href="http://www.cs.rpi.edu/courses/fall01/os/FileTimeToSystemTime.html">FileTimeToSystemTime</a></li>
</ul>
<p>One side effect is that we have the same working code as the system. There's
much less need to worry about exactness or bugs in this critical piece.</p>
<p>For futureproofing, a haskell implementation could be used as a fallback for
other systems or different compiler targets (e.g. haste), if anyone is
interested.</p>
<h2 id="unified-api">Unified API<a class="zola-anchor" href="#unified-api" aria-label="Anchor link for: unified-api">§</a>
</h2>
<p>I don't want to have to remember many different functions to interact with many types.
Also, all time representations should be equivalent as to which time value they represent,
which means it should be easy to convert between them with a unified system.</p>
<p>So 2 type classes have been devised:</p>
<ul>
<li>
<p>one Timeable typeclass to represent types that can be converted to a time
value.</p>
</li>
<li>
<p>one Time typeclass to represent time types that can be created from a time
value.</p>
</li>
</ul>
<p>With this, hourglass supports conversion between time types:</p>
<pre data-lang="{.haskell}" style="background-color:#191919;color:#ffffff;" class="language-{.haskell} "><code class="language-{.haskell}" data-lang="{.haskell}"><span>> timeConvert (Elapsed 0) :: Date
</span><span>Date { dateYear = 1970, dateMonth = January, dateDay = 1 }
</span><span>> timeConvert (Date 1970 January 1) :: Elapsed
</span><span>Elapsed 0
</span><span>> timeConvert (DateTime (Date 1970 January 1) (TimeOfDay 0 0 0 0)) :: Date
</span><span>Date { dateYear = 1970, dateMonth = January, dateDay = 1 }
</span></code></pre>
<p>Anyone can add new calendar types or other low level types, and still interact
with them using the built-in functions, provided they implement conversion to and from
Elapsed. It allows anyone to define a new calendar, for example, without
complicating anything.</p>
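<p>The idea behind the two typeclasses can be sketched in a few lines. This is a deliberately
simplified model, not hourglass's actual definitions: the names <code>toElapsed</code>,
<code>fromElapsed</code>, <code>convert</code> and the <code>UnixDay</code> type are invented for the
illustration, and the real library also deals with sub-second precision.</p>
<pre data-lang="{.haskell}" style="background-color:#191919;color:#ffffff;" class="language-{.haskell} "><code class="language-{.haskell}" data-lang="{.haskell}"><span>-- Simplified sketch of the two-typeclass design (hypothetical names).
</span><span>newtype Elapsed = Elapsed Integer deriving (Show, Eq)
</span><span>
</span><span>-- Types that can be converted to a time value.
</span><span>class Timeable t where
</span><span>    toElapsed :: t -> Elapsed
</span><span>
</span><span>-- Types that can also be created from a time value.
</span><span>class Timeable t => Time t where
</span><span>    fromElapsed :: Elapsed -> t
</span><span>
</span><span>-- One generic conversion, always going through Elapsed.
</span><span>convert :: (Timeable a, Time b) => a -> b
</span><span>convert = fromElapsed . toElapsed
</span><span>
</span><span>instance Timeable Elapsed where toElapsed = id
</span><span>instance Time Elapsed where fromElapsed = id
</span><span>
</span><span>-- A user-defined type: whole days since the Unix epoch.
</span><span>newtype UnixDay = UnixDay Integer deriving (Show, Eq)
</span><span>
</span><span>instance Timeable UnixDay where
</span><span>    toElapsed (UnixDay d) = Elapsed (d * 86400)
</span><span>instance Time UnixDay where
</span><span>    fromElapsed (Elapsed s) = UnixDay (s `div` 86400)
</span><span>
</span><span>main :: IO ()
</span><span>main = do
</span><span>    print (convert (Elapsed 172800) :: UnixDay)  -- UnixDay 2
</span><span>    print (convert (UnixDay 1) :: Elapsed)       -- Elapsed 86400
</span></code></pre>
<p>A new type only needs the two instances; every existing type can then be converted to and
from it through the single <code>convert</code> function.</p>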
<h2 id="better-formatting-api">Better formatting API<a class="zola-anchor" href="#better-formatting-api" aria-label="Anchor link for: better-formatting-api">§</a>
</h2>
<p>Formatters are built from a known enumeration type:</p>
<pre data-lang="{.haskell}" style="background-color:#191919;color:#ffffff;" class="language-{.haskell} "><code class="language-{.haskell}" data-lang="{.haskell}"><span>> timePrint [Format_Day,Format_Text '-',Format_Month2] (Date 2011 January 12)
</span><span>"12-01"
</span></code></pre>
<p>But the format can also be given as a string, or as one of some known formats:</p>
<pre data-lang="{.haskell}" style="background-color:#191919;color:#ffffff;" class="language-{.haskell} "><code class="language-{.haskell}" data-lang="{.haskell}"><span>> timePrint "DD-MM-YYYY" (Date 2011 January 12)
</span><span>"12-01-2011"
</span><span>> timePrint ISO8601_Date (Date 2011 January 12)
</span><span>"2011-01-12"
</span></code></pre>
<p>Someone could also re-add C time format string too with this design,
without changing the API.</p>
<h2 id="implementation">Implementation<a class="zola-anchor" href="#implementation" aria-label="Anchor link for: implementation">§</a>
</h2>
<p>The API and the values returned have been tested under 32 and 64 bit Linux,
FreeBSD, and Windows 7. It's got the same limitations that the system has:</p>
<ul>
<li>32 bit Linux or BSD: between year 1902 and 2038. This doesn't apply to the x32 flavor of Linux, and the latest OpenBSD 5.5.</li>
<li>64 bit Linux or BSD: between year 1 (as dates before that, in BC, bring all sorts of random problems) and a few billion years. This ought to be enough for everyone :-)</li>
<li>Windows is limited to dates between 1601 and 9999.</li>
</ul>
<p>I find the tradeoff acceptable considering that in return we have decent
performance, and all-in-all a working range that is enough.</p>
<p>For a look on performance, as measured by criterion:</p>
<ul>
<li><a href="http://tab.snarc.org/misc/hourglass-small-criterion.html">A quick report, showing trends quite well</a></li>
<li><a href="http://tab.snarc.org/misc/hourglass-criterion.html">the long and heavy to load report</a></li>
</ul>
<p>The library is small too:</p>
<ul>
<li>time (haskell=1434 (94.5%), C=84 (5.5%))</li>
<li>hourglass (haskell=884 (98%), C=19 (2%))</li>
</ul>
<p>And its documentation is available on <a href="http://hackage.haskell.org/package/hourglass">hackage</a>, and the code on <a href="https://github.com/vincenthz/hs-hourglass">github</a>.</p>
<h2 id="example-of-use">Example of use<a class="zola-anchor" href="#example-of-use" aria-label="Anchor link for: example-of-use">§</a>
</h2>
<pre data-lang="{.haskell}" style="background-color:#191919;color:#ffffff;" class="language-{.haskell} "><code class="language-{.haskell}" data-lang="{.haskell}"><span>> t <- timeCurrent
</span><span>> timeGetDate t
</span><span>Date {dateYear = 2014, dateMonth = May, dateDay = 4}
</span><span>> t
</span><span>1399183466s
</span><span>> timeGetElapsed t
</span><span>1399183466s
</span><span>> timeGetDateTimeOfDay t
</span><span>DateTime { dtDate = Date {dateYear = 2014, dateMonth = May, dateDay = 4}
</span><span> , dtTime = TimeOfDay {todHour = 6, todMin = 4, todSec = 26, todNSec = 0ns}}
</span><span>> timePrint "YYYY-MM-DD" t
</span><span>"2014-05-04"
</span><span>> timePrint "DD Mon YYYY EPOCH TZHM" t
</span><span>"04 May 2014 1399183466 +0000"
</span></code></pre>
<h2 id="q-a">Q&A<a class="zola-anchor" href="#q-a" aria-label="Anchor link for: q-a">§</a>
</h2>
<ul>
<li>
<p>Q: Report issue, wishlist ..</p>
</li>
<li>
<p>A: <a href="https://github.com/vincenthz/hs-hourglass/issues">issue-tracker</a></p>
</li>
<li>
<p>Q: Do I have to use this ?</p>
</li>
<li>
<p>A: No, you can still use <a href="http://hackage.haskell.org/package/time">time</a> if you prefer.</p>
</li>
</ul>
Listing licenses with cabal-db2014-03-29T00:00:00+00:002014-03-29T00:00:00+00:00https://vincenthz.github.io/cabal-db-license/<p>Following discussions with fellow haskellers, regarding the need to be careful
with adding packages that could depend on GPL or proprietary licenses, it
turns out it's not easy to get your dependencies' licenses listed.</p>
<span id="continue-reading"></span>
<p>It would be convenient to be able to ask the hackage database those things,
and this is exactly what cabal-db usually works with.</p>
<p><a href="http://hackage.haskell.org/package/cabal-db">cabal-db</a>, for the ones that
missed the previous
<a href="http://tab.snarc.org/posts/haskell/2013-03-13-cabal-db.html">announcement</a>, is a
simple program that, using the index.tar.gz file, is able to recursively display
or search into packages and dependencies.</p>
<p>The license subcommand is mainly there to get a summary of the licenses of a package
and its dependencies, but it can also display the tree of licenses. Once
a package has been listed, it does not appear again in the tree even if
another package depends on it.</p>
<p>A simple example is better than many words:</p>
<pre data-lang="{.shell}" style="background-color:#191919;color:#ffffff;" class="language-{.shell} "><code class="language-{.shell}" data-lang="{.shell}"><span>$ cabal-db license -s -t BNFC
</span><span>BNFC: GPL
</span><span> process: BSD3
</span><span> unix: BSD3
</span><span> time: BSD3
</span><span> old-locale: BSD3
</span><span> base: BSD3
</span><span> deepseq: BSD3
</span><span> array: BSD3
</span><span> bytestring: BSD3
</span><span> filepath: BSD3
</span><span> directory: BSD3
</span><span> pretty: BSD3
</span><span> mtl: BSD3
</span><span> transformers: BSD3
</span><span> containers: BSD3
</span><span>== license summary ==
</span><span>BSD3: 14
</span><span>GPL: 1
</span></code></pre>
<p>cabal-db is only using the license listed in the license field in cabal files,
so if the field is incorrectly set, cabal-db would have no idea.</p>
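<p>The "listed only once" rule for the tree is essentially a depth-first walk with a set of
already-seen packages. The sketch below illustrates that idea only; it is not cabal-db's code,
and the dependency table is a hand-written, hypothetical stand-in for what would be read from
index.tar.gz.</p>
<pre data-lang="{.haskell}" style="background-color:#191919;color:#ffffff;" class="language-{.haskell} "><code class="language-{.haskell}" data-lang="{.haskell}"><span>import qualified Data.Map as M
</span><span>import qualified Data.Set as S
</span><span>
</span><span>type Package = String
</span><span>
</span><span>-- Hypothetical dependency table, made up for the example.
</span><span>deps :: M.Map Package [Package]
</span><span>deps = M.fromList
</span><span>    [ ("app",  ["libA", "libB"])
</span><span>    , ("libA", ["base"])
</span><span>    , ("libB", ["libA", "base"])
</span><span>    , ("base", [])
</span><span>    ]
</span><span>
</span><span>-- Depth-first walk that lists each package once, with its depth in the tree.
</span><span>walk :: Package -> [(Int, Package)]
</span><span>walk root = reverse (snd (go 0 (S.empty, []) root))
</span><span>  where
</span><span>    go depth (seen, acc) pkg
</span><span>        | pkg `S.member` seen = (seen, acc)   -- already listed: skip
</span><span>        | otherwise =
</span><span>            foldl (go (depth + 1))
</span><span>                  (S.insert pkg seen, (depth, pkg) : acc)
</span><span>                  (M.findWithDefault [] pkg deps)
</span><span>
</span><span>main :: IO ()
</span><span>main = mapM_ (\(d, p) -> putStrLn (replicate (2 * d) ' ' ++ p)) (walk "app")
</span></code></pre>
<p>Here libB depends on libA and base, but both were already listed under libA, so they do not
show up a second time under libB.</p>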
unix memory2014-02-25T00:00:00+00:002014-02-25T00:00:00+00:00https://vincenthz.github.io/unix-memory/<p>On unix systems, we get access to syscalls that map files or devices into
memory. The main syscall is mmap, but there are also other syscalls in the
same family to handle mapped memory, like mlock, munlock, mprotect, madvise,
msync.</p>
<span id="continue-reading"></span>
<p>Some limited mmap access is available through the
<a href="http://hackage.haskell.org/package/mmap">mmap</a> or
<a href="http://hackage.haskell.org/package/bytestring-mmap">bytestring-mmap</a> packages,
but both only provide high-level access to those APIs.</p>
<p>To the rescue, I've released
<a href="http://hackage.haskell.org/package/unix-memory">unix-memory</a>. This provides
low level access to all those syscalls. In some places, the API presented is
slightly better than the raw API.</p>
<p>This package is supposed to be ephemeral; the goal is to fold this package into
the venerable <a href="http://hackage.haskell.org/package/unix">unix</a> package when this
becomes less experimental, more stable and is known to work on different
unixoid platforms. If and when this happens, then this package will just
provide compatibility for old versions and eventually be deprecated.</p>
<p>Manipulating memory is unsafe in general, so don't expect any safety from this
package, by design. Also if you don't know what you're doing, don't use those
APIs; it's difficult to get right.</p>
<p>But it also allows interesting patterns when you cooperate with the operating system
to efficiently map files and devices as virtual memory.</p>
<p>A simple example mapping the first memory page of the "/dev/zero" device, and reading 4096 bytes from it:</p>
<pre data-lang="{.haskell .numberLines}" style="background-color:#191919;color:#ffffff;" class="language-{.haskell .numberLines} "><code class="language-{.haskell .numberLines}" data-lang="{.haskell .numberLines}"><span>import System.Posix.IO
</span><span>import System.Posix.Memory
</span><span>import Control.Monad
</span><span>import Control.Exception (bracket)
</span><span>import Foreign.C.Types
</span><span>import Foreign.Storable
</span><span>
</span><span>-- Map the first page of /dev/zero read-only and read its 4096 bytes.
</span><span>-- openFd is called with the signature this post was written against.
</span><span>main :: IO ()
</span><span>main = do
</span><span>    bytes <- bracket (openFd "/dev/zero" ReadWrite Nothing defaultFileFlags) closeFd $ \fd ->
</span><span>        bracket (memoryMap Nothing 4096 [MemoryProtectionRead] MemoryMapPrivate (Just fd) 0)
</span><span>                (\mem -> memoryUnmap mem 4096)
</span><span>                (\mem -> mapM (peekElemOff mem) [0..4095])
</span><span>    -- The annotation fixes the element type read from the mapping.
</span><span>    print (length (bytes :: [CUChar]))
</span></code></pre>
announcement: tls-1.2 is out2014-02-14T00:00:00+00:002014-02-14T00:00:00+00:00https://vincenthz.github.io/announce-tls12/<p>A year ago, I started some big changes on the <a href="http://hackage.haskell.org/package/tls">tls</a> package.
I've finally managed to wrap it up in something that people can use straight out of hackage.</p>
<span id="continue-reading"></span><h2 id="state-improvements">State improvements<a class="zola-anchor" href="#state-improvements" aria-label="Anchor link for: state-improvements">§</a>
</h2>
<p>One major limitation of previous tls versions was that you wouldn't be able to
read and write data at the same time, since all the state was behind a big-lock
single mvar. Now the state is separated into multiple smaller states which
can be used concurrently:</p>
<ul>
<li>the RX state for receiving data.</li>
<li>the TX state for sending data.</li>
<li>the handshake state for creating new security parameters to replace RX/TX state when the handshake is finished</li>
<li>Misc state for other values.</li>
</ul>
<p>For many protocols like HTTP, this was never an issue as reading and
writing are disjoint. But some other protocols that interleave reads and
writes (AMQP, IMAP, SMTP, ..) understandably had difficulties using tls.</p>
<p>This provides a more scalable implementation, and keeps the structure changes
to the minimum needed.</p>
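<p>The difference with the single big lock can be pictured with the following sketch. The
record and function names are hypothetical and the states are reduced to a counter; it only
shows the locking layout, not the actual tls internals.</p>
<pre data-lang="{.haskell}" style="background-color:#191919;color:#ffffff;" class="language-{.haskell} "><code class="language-{.haskell}" data-lang="{.haskell}"><span>import Control.Concurrent.MVar
</span><span>
</span><span>-- Hypothetical stand-ins for the real per-direction state.
</span><span>newtype RxState = RxState Int
</span><span>newtype TxState = TxState Int
</span><span>
</span><span>-- Each direction gets its own lock instead of one MVar for everything.
</span><span>data Context = Context
</span><span>    { ctxRx :: MVar RxState
</span><span>    , ctxTx :: MVar TxState
</span><span>    }
</span><span>
</span><span>newContext :: IO Context
</span><span>newContext = do
</span><span>    rx <- newMVar (RxState 0)
</span><span>    tx <- newMVar (TxState 0)
</span><span>    return (Context rx tx)
</span><span>
</span><span>-- Receiving only takes the RX lock; returns the sequence number used.
</span><span>recordReceived :: Context -> IO Int
</span><span>recordReceived ctx = modifyMVar (ctxRx ctx) $ \(RxState n) ->
</span><span>    return (RxState (n + 1), n)
</span><span>
</span><span>-- Sending only takes the TX lock, so it never blocks a reader.
</span><span>recordSent :: Context -> IO Int
</span><span>recordSent ctx = modifyMVar (ctxTx ctx) $ \(TxState n) ->
</span><span>    return (TxState (n + 1), n)
</span><span>
</span><span>main :: IO ()
</span><span>main = do
</span><span>    ctx <- newContext
</span><span>    _ <- recordSent ctx
</span><span>    n <- recordReceived ctx
</span><span>    print n
</span></code></pre>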
<h2 id="certificate-improvements">Certificate improvements<a class="zola-anchor" href="#certificate-improvements" aria-label="Anchor link for: certificate-improvements">§</a>
</h2>
<p>The second pharaonic change was a major rewrite of ASN.1, X509 and the handling
of certificates. The support libraries are now split in more logical units, and
provide all the necessary foundations for a much simplified handling of
certificates.</p>
<p>ASN.1, which was previously all in <a href="http://hackage.haskell.org/package/asn1-data">asn1-data</a>, is split
into <a href="http://hackage.haskell.org/package/asn1-types">asn1-types</a> for the high level ASN.1 Types,
<a href="http://hackage.haskell.org/package/asn1-encoding">asn1-encoding</a> for BER and DER binary encoding support,
and <a href="http://hackage.haskell.org/package/asn1-parse">asn1-parse</a> to
help with parsing ASN.1 representation into high level types. Generally,
the code is nicer, able to support more cases, and also more stress tested.</p>
<p>The <a href="http://hackage.haskell.org/package/certificate">certificate</a> package is split too and is now deprecated in favor of:</p>
<ul>
<li><a href="http://hackage.haskell.org/package/x509">x509</a>: Contains all the format
parsers and writers for certificates, and now also supports CRL. The code has
been made more generic and better accounts for certificate formats from the real
world.</li>
<li><a href="http://hackage.haskell.org/package/x509-store">x509-store</a>: Contains some
routines to store and access certificates on disk; this is not very different
from what was in <a href="http://hackage.haskell.org/package/certificate">certificate</a>.</li>
<li><a href="http://hackage.haskell.org/package/x509-system">x509-system</a>: Contains all
routines to access system certificates, mainly the trusted CA certificates
supported. The code is not different from the <a href="http://hackage.haskell.org/package/certificate">certificate</a>
package, except there's now Windows support for accessing the system
certificate store.</li>
<li><a href="http://hackage.haskell.org/package/x509-validation">x509-validation</a>: One of
the main security aspects of the TLS stack is certificate validation, which
is a complicated and fiddly process. The main fiddly aspect is the many input
variables that need to be considered, combined with errata and extensions.
The reason to have it as a separate package is to make it easy to debug,
while also isolating this very sensitive piece of code. The feature is
much more configurable and tweakable.</li>
</ul>
<p>On the TLS side, previous versions left the whole validation process to a
callback function. Now that we have a solid stack of validation and support for
all main operating systems, tls automatically provides the validation
function, enabled and with the appropriate security parameters by default. Of
course, it's still possible to change validation parameters, add some hooks and
add a validation cache too.</p>
<p>The validation cache is a way to take a fingerprint and get a cached yes/no
answer about whether the certificate is accepted. It's a generic lookup
mechanism, so that it can work with any storage mechanism. The same mechanism
can be overloaded to support Trust-on-first-use, and exception fingerprint
lists.</p>
<p>An exception list is a great way to use self-signed certificates without
compromising on security; you have to do the validation process out-of-band to
make sure the certificate is really the right one, and then store the name
of the remote service together with a fingerprint. The fingerprint is a
simple hash of the certificate, whereas the name is really just a simple
(hostname, service) tuple.</p>
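<p>As a rough model of such an exception list, the lookup can be sketched like this. The types
and the <code>queryExceptions</code> function are simplified stand-ins invented for the example
(fingerprints as hex strings, a three-way answer); they are not the actual x509-validation API.</p>
<pre data-lang="{.haskell}" style="background-color:#191919;color:#ffffff;" class="language-{.haskell} "><code class="language-{.haskell}" data-lang="{.haskell}"><span>import qualified Data.Map as M
</span><span>
</span><span>-- Hypothetical simplified types for the illustration.
</span><span>type ServiceID   = (String, String)   -- (hostname, service)
</span><span>type Fingerprint = String             -- hex-encoded hash of the certificate
</span><span>
</span><span>data CacheResult = CachePass | CacheFail | CacheUnknown
</span><span>    deriving (Show, Eq)
</span><span>
</span><span>-- Exception list: a pinned service is accepted only with its pinned fingerprint.
</span><span>queryExceptions :: M.Map ServiceID Fingerprint -> ServiceID -> Fingerprint -> CacheResult
</span><span>queryExceptions pinned sid fp =
</span><span>    case M.lookup sid pinned of
</span><span>        Nothing -> CacheUnknown          -- not pinned: do the full validation
</span><span>        Just expected
</span><span>            | expected == fp -> CachePass
</span><span>            | otherwise      -> CacheFail
</span><span>
</span><span>main :: IO ()
</span><span>main = do
</span><span>    let pinned = M.fromList [(("example.org", "443"), "ab12cd34")]
</span><span>    print (queryExceptions pinned ("example.org", "443") "ab12cd34")  -- CachePass
</span><span>    print (queryExceptions pinned ("example.org", "443") "ffffffff")  -- CacheFail
</span><span>    print (queryExceptions pinned ("other.org", "443") "ab12cd34")    -- CacheUnknown
</span></code></pre>
<p>Because the cache is just a lookup function, the same shape can be backed by a file, a
database, or an in-memory map as above.</p>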
<h2 id="key-exchange-methods">Key exchange methods<a class="zola-anchor" href="#key-exchange-methods" aria-label="Anchor link for: key-exchange-methods">§</a>
</h2>
<p>Along with RSA signatures, there's now DSA signature support for certificates.</p>
<p>Previous versions only supported the old RSA key exchange methods. After a bit
of refactoring, we now have DHE-RSA and DHE-DSS support. DHE is ephemeral
Diffie Hellman, and provides <a href="http://en.wikipedia.org/wiki/Forward_secrecy">Forward Secrecy</a>.</p>
<p>In the future, with this refactoring in place, ECDHE based key exchange methods
and ECDSA signature will be very easy to add.</p>
<h2 id="api-and-parameters-changes">API and parameters changes<a class="zola-anchor" href="#api-and-parameters-changes" aria-label="Anchor link for: api-and-parameters-changes">§</a>
</h2>
<p>The initialization parameters for a context are now split into multiple smaller structures:</p>
<ul>
<li>one for the supported parameters (versions, cipher methods, compression methods, ..)</li>
<li>one for shared access structures (x509 validation cache, x509 CA store, session manager, certificate and keys)</li>
<li>the client and server parameters are now 2 distinct structures, no longer a common structure with a role part.</li>
</ul>
<p>All these changes allow better separation of what parameters are for the client
and the server, and should also make it easier to set up better defaults, and allow
tweaking of the configuration to be more self-contained. The aim is to only have
to set your "Shared" structure, and use the defaults for the remaining structures.</p>
<p><a href="http://hackage.haskell.org/package/tls-extra">tls-extra</a> has been merged in tls.</p>
<h2 id="tls-protocol-versions">TLS Protocol Versions<a class="zola-anchor" href="#tls-protocol-versions" aria-label="Anchor link for: tls-protocol-versions">§</a>
</h2>
<p>Previous tls packages were not able to downgrade protocol version. This is now completely fixed, and
by default tls will try to use the maximum supported version (by default, TLS 1.2)
instead of the version specified by the user (by default, TLS 1.0).</p>
<h2 id="client-use">Client use<a class="zola-anchor" href="#client-use" aria-label="Anchor link for: client-use">§</a>
</h2>
<p>For client connections, I recommend using <a href="http://hackage.haskell.org/package/connection">connection</a>
instead of tls directly.</p>
<h2 id="closing-note">Closing note<a class="zola-anchor" href="#closing-note" aria-label="Anchor link for: closing-note">§</a>
</h2>
<p>And finally, this is the extent of the modifications just in tls:</p>
<pre style="background-color:#191919;color:#ffffff;"><code><span> 82 files changed, 5528 insertions(+), 4568 deletions(-)
</span></code></pre>
<p>Enjoy,</p>
haskell crypto platform2013-10-25T00:00:00+00:002013-10-25T00:00:00+00:00https://vincenthz.github.io/haskell-crypto-platform/<p>One of my side projects, running for a couple of years now,
has been to get cryptography up to scratch in haskell. Back when
I started <a href="http://hackage.haskell.org/package/tls">TLS</a>, there were many
various cryptography related projects and libraries. Many were not easy to use,
none were consistent, many had performance problems.</p>
<span id="continue-reading"></span>
<p>Just like the haskell platform, I dreamt of having a go-to set of packages
for cryptography. This is why I started this project on the side of the actual packages.</p>
<p>Currently the platform is made of these features:</p>
<ul>
<li>ASN.1: a terrible serialization format, that is unfortunately pervasive in cryptography.</li>
<li>X.509: public certificate infrastructure based on ASN.1.</li>
<li>symmetric ciphers (block and stream): RC4, DES (inherited from Crypto), 3DES, Blowfish, Camellia, AES</li>
<li>cryptographic hashes (SHA1, SHA2, SHA3, MD2, MD4, MD5, ...) and siphash</li>
<li>asymmetric ciphers: RSA, DSA, DH</li>
<li>cryptographic randomness: entropy gathering, Pseudo and random secure bytes generation</li>
<li>securemem: auto scrubbing "bytestring".</li>
</ul>
<h2 id="a-side-note-about-reimplementing-cryptography">A side note, about reimplementing cryptography<a class="zola-anchor" href="#a-side-note-about-reimplementing-cryptography" aria-label="Anchor link for: a-side-note-about-reimplementing-cryptography">§</a>
</h2>
<p>Many people would think that it is a foolish project to reimplement cryptography.
It's undeniable that cryptography is a hard subject and it's easy to get
some stuff horribly wrong. And many people would rather use the reference
implementations out there (openssl, nacl, ..).</p>
<p>There's still some value in this, despite not always having the best and most audited implementations:</p>
<ul>
<li>portability: all the code available runs on all 3 platforms without having to install external libraries</li>
<li>some cryptography code doesn't require implementation security: for example verifying RSA signature, decrypting your local files for your own purpose, etc.</li>
<li>More diversity: monoculture is dangerous.</li>
<li>It's much more haskell friendly :-)</li>
<li>A common API framework that alternative implementations can use.</li>
</ul>
<p>Despite not necessarily being the best cryptography code out there,
I do want to stress that I still consider many of the libraries
available through this platform to be good in many contexts.</p>
<h2 id="we-want-you">We want you ..<a class="zola-anchor" href="#we-want-you" aria-label="Anchor link for: we-want-you">§</a>
</h2>
<p>Despite my end goals, not everything is as good as I would want:
some ciphers inherited from Crypto are slow, some APIs are not ideal, etc.</p>
<p>Open a discussion on one of the trackers, send some pull requests,
send me an email about supporting your favorite feature, etc.</p>
<p>The organisational part of the platform is not yet defined. While I'm working it
out, I do value suggestions on how to organize.</p>
<p>Also, more documentation would be nice. While I'm trying to document everything, it would
be better to have end-users document the pieces that are not documented well enough.
As an author, it's sometimes difficult to judge where more or better documentation is
needed.</p>
<h2 id="other-implementations">Other implementations<a class="zola-anchor" href="#other-implementations" aria-label="Anchor link for: other-implementations">§</a>
</h2>
<p>One of the side goals of the platform, that I would like to see developed,
would be to also provide implementation alternatives with the same API infrastructure.</p>
<p>Thanks to Stefan Bühler there's already one package that heads in this direction:</p>
<p><a href="http://hackage.haskell.org/package/nettle">nettle haskell bindings</a></p>
<p>I'm looking forward to similar packages covering other parts of the platform.</p>
<h2 id="license">License<a class="zola-anchor" href="#license" aria-label="Anchor link for: license">§</a>
</h2>
<p>For maximum usability, everything in the crypto-platform is under the BSD3 license.
All additions will be required to be under the same (or very similar) license too.</p>
<h2 id="links">Links<a class="zola-anchor" href="#links" aria-label="Anchor link for: links">§</a>
</h2>
<ul>
<li><a href="https://github.com/vincenthz/hs-crypto-platform">The main repository</a></li>
<li><a href="https://github.com/vincenthz/hs-crypto-platform/blob/master/README.md">platform README</a></li>
<li><a href="http://hackage.haskell.org/packages/tag/crypto-platform">individual packages list</a></li>
</ul>
<p>There's also a Google group for asking questions about the packages:</p>
<p><a href="https://groups.google.com/forum/#!forum/haskell-crypto-platform">google group - haskell crypto platform</a></p>
<p>And of course, there are the issue trackers of the specific packages and of the platform itself on GitHub.</p>
<p>Enjoy,</p>
ghc core with style2013-04-22T00:00:00+00:002013-04-22T00:00:00+00:00https://vincenthz.github.io/ghc-core-html/<p>After reading ghc core's output one time too many,
I've been itching to have a more interactive output.</p>
<span id="continue-reading"></span>
<p>ghc-core-html is the result of scratching my itch, and I
think it could be useful to anyone. It creates
an HTML output similar to what ghc-core shows in a terminal,
with the following additional benefits:</p>
<ul>
<li>Symbols index at the beginning of the file</li>
<li>Clickable symbols.</li>
<li>Hover popups: extra information displayed on symbols.</li>
<li>Foldable structures: hide what you don't need.</li>
<li>Core output is (coarsely) parsed, not regex matched: better extensibility.</li>
</ul>
<p>An example is worth a thousand words:
<a href="http://tab.snarc.org/misc/ghc-core-html-example1.html">Example 1</a></p>
<p>It's really simple to use, and very similar to the well known ghc-core:</p>
<pre style="background-color:#191919;color:#ffffff;"><code><span>> ghc-core-html Program.hs > program.html
</span><span>> $browser program.html
</span></code></pre>
<p>There are lots of other things that could be added,
and the style and JavaScript can easily be improved.
Pull requests are gladly accepted at the <a href="http://github.com/vincenthz/ghc-core-html">github repository</a>.</p>
cabal-db : simple tool for cabal database queries2013-03-13T00:00:00+00:002013-03-13T00:00:00+00:00https://vincenthz.github.io/cabal-db/<p>Following my previous experiment with the Cabal library and querying the
state of the Hackage world <a href="http://tab.snarc.org/posts/haskell/2013-03-03-playing-with-cabal-lib.html">here</a>,
I've extended the tool and wrapped it into a cabal package.</p>
<span id="continue-reading"></span><h1 id="a-simple-tool-for-cabal-database-query">A simple tool for cabal database query<a class="zola-anchor" href="#a-simple-tool-for-cabal-database-query" aria-label="Anchor link for: a-simple-tool-for-cabal-database-query">§</a>
</h1>
<ul>
<li><a href="http://hackage.haskell.org/package/cabal-db">hackage</a></li>
<li><a href="http://github.com/vincenthz/cabal-db/">github</a></li>
</ul>
<p>In addition to the graph feature, I've added some
other commands that I consolidated from miscellaneous scripts:</p>
<ul>
<li>diff: run the diff command between two different versions of a package.</li>
<li>revdeps: print all reverse dependencies of a package.</li>
<li>info: print all available versions of a package and some misc information.</li>
<li>search-author: search the whole database for matches in the author field.</li>
<li>search-maintainer: search the whole database for matches in the maintainer field.</li>
</ul>
<p>The revdeps feature of cabal-db is also available as a web-friendly yesod app, the
very useful <a href="http://packdeps.haskellers.com/">packdeps</a>,
whose source is in <a href="https://github.com/snoyberg/packdeps">Michael Snoyman's packdeps repository</a>.</p>
<p>The diff feature of cabal-db is also available through a gitweb frontend at <a href="http://hdiff.luite.com/">hdiff</a>.</p>
<h2 id="dependency-graph">Dependency Graph<a class="zola-anchor" href="#dependency-graph" aria-label="Anchor link for: dependency-graph">§</a>
</h2>
<p>After my previous <a href="http://tab.snarc.org/posts/haskell/2013-03-03-playing-with-cabal-lib.html">experiment</a> with
displaying graphs, I've added some nice improvements to make it even more useful.</p>
<ul>
<li>color: red for the packages queried, and green for the packages in the platform. This makes it easier to see
how "far" you are from just the platform packages.</li>
<li>hiding packages from the output: useful to hide pervasive packages like base or bytestring,
which don't necessarily add information to your graph (e.g. everything depends on base)</li>
</ul>
<p>Running on a single package (cryptohash):</p>
<pre style="background-color:#191919;color:#ffffff;"><code><span> $ cabal-db graph --hide base --hide bytestring cryptohash
</span><span> ...
</span></code></pre>
<p><img src="/pictures/posts/2013-03-03-graph-cryptohash.png" alt="cryptohash deps rendered by dot" /></p>
<p>With this I can produce the graph of all the packages I maintain with a single line:</p>
<pre style="background-color:#191919;color:#ffffff;"><code><span> $ cabal-db graph --hide base --hide bytestring $(cabal-db search-maintainer Hanquez | xargs)
</span><span> ...
</span></code></pre>
<p><a href="http://tab.snarc.org/pictures/posts/2013-03-03-graph-all.png">all my pkgs rendered by dot</a></p>
<h2 id="diff">Diff<a class="zola-anchor" href="#diff" aria-label="Anchor link for: diff">§</a>
</h2>
<p>Running diff between hit 0.4.2 and hit 0.4.3:</p>
<pre style="background-color:#191919;color:#ffffff;"><code><span>$ cabal-db diff hit 0.4.2 0.4.3
</span><span>diff -Naur hit-0.4.2/Data/Git/Storage/FileWriter.hs hit-0.4.3/Data/Git/Storage/FileWriter.hs
</span><span>--- hit-0.4.2/Data/Git/Storage/FileWriter.hs 2013-03-12 11:08:25.453936222 +0000
</span><span>+++ hit-0.4.3/Data/Git/Storage/FileWriter.hs 2013-03-12 11:08:25.963936224 +0000
</span><span>@@ -20,6 +20,14 @@
</span><span>
</span><span> defaultCompression = 6
</span><span>
</span><span>+-- this is a copy of modifyIORef' found in base 4.6 (ghc 7.6),
</span><span>+-- for older version of base.
</span><span>+modifyIORefStrict :: IORef a -> (a -> a) -> IO ()
</span><span>+modifyIORefStrict ref f = do
</span><span>+ x <- readIORef ref
</span><span>+ let x' = f x
</span><span>+ x' `seq` writeIORef ref x'
</span><span>+
</span><span> data FileWriter = FileWriter
</span><span> { writerHandle :: Handle
</span><span> , writerDeflate :: Deflate
</span><span>@@ -42,7 +50,7 @@
</span><span> postDeflate handle = maybe (return ()) (B.hPut handle)
</span><span>
</span><span> fileWriterOutput (FileWriter { writerHandle = handle, writerDigest = digest, writerDeflate = deflate }) bs = do
</span><span>- modifyIORef' digest (\ctx -> SHA1.update ctx bs)
</span><span>+ modifyIORefStrict digest (\ctx -> SHA1.update ctx bs)
</span><span> (>>= postDeflate handle) =<< feedDeflate deflate bs
</span><span>
</span><span> fileWriterClose (FileWriter { writerHandle = handle, writerDeflate = deflate }) =
</span><span>diff -Naur hit-0.4.2/hit.cabal hit-0.4.3/hit.cabal
</span><span>--- hit-0.4.2/hit.cabal 2013-03-12 11:08:25.460602889 +0000
</span><span>+++ hit-0.4.3/hit.cabal 2013-03-12 11:08:25.973936225 +0000
</span><span>@@ -1,5 +1,5 @@
</span><span> Name: hit
</span><span>-Version: 0.4.2
</span><span>+Version: 0.4.3
</span><span> Synopsis: Git operations in haskell
</span><span> Description:
</span></code></pre>
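<p>The <code>modifyIORefStrict</code> helper added in this diff forces the new value before writing it back, which is what <code>modifyIORef'</code> does on newer versions of base. Here is a minimal standalone sketch of the same idea, using only <code>Data.IORef</code> from base. The <code>sumStrict</code> helper and the <code>main</code> driver are hypothetical names added for illustration and are not part of hit:</p>

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Same definition as in the diff: read the current value, apply the
-- function, force the result to weak head normal form, then store it.
modifyIORefStrict :: IORef a -> (a -> a) -> IO ()
modifyIORefStrict ref f = do
    x <- readIORef ref
    let x' = f x
    x' `seq` writeIORef ref x'

-- Hypothetical helper for illustration: sum 1..n through an IORef accumulator.
sumStrict :: Int -> IO Int
sumStrict n = do
    ref <- newIORef 0
    mapM_ (\i -> modifyIORefStrict ref (+ i)) [1 .. n]
    readIORef ref

main :: IO ()
main = sumStrict 1000 >>= print
```

<p>With the lazy <code>modifyIORef</code>, each update would only store an unevaluated thunk on top of the previous one. Note that <code>seq</code> only evaluates to weak head normal form: that is the full value for an <code>Int</code>, but for a larger structure it may leave inner fields unevaluated.</p>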
<h2 id="revdeps">Revdeps<a class="zola-anchor" href="#revdeps" aria-label="Anchor link for: revdeps">§</a>
</h2>
<p>Or running revdeps on tls:</p>
<pre style="background-color:#191919;color:#ffffff;"><code><span>$ cabal-db revdeps tls
</span><span>yesod-platform: tls (==1.1.2)
</span><span>warp-tls: tls (>=1.1)
</span><span>tls-extra: tls (>=1.1.0 && <1.2.0)
</span><span>tls-debug: tls (>=1.1 && <1.2 && >=1.1 && <1.2 && >=1.1 && <1.2 && >=1.1 && <1.2)
</span><span>pontarius-xmpp: tls (>=1.0.0)
</span><span>network-conduit-tls: tls (>=0.9)
</span><span>network-api-support: tls (>=0.9)
</span><span>kevin: tls (==1.1.*)
</span><span>imm: tls (-any)
</span><span>http-proxy: tls (>=0.9 && <0.10)
</span><span>http-enumerator: tls (>=0.9 && <0.10)
</span><span>http-conduit-downloader: tls (-any)
</span><span>http-conduit-browser: tls (-any)
</span><span>http-conduit: tls (>=1.1.0)
</span><span>ez-couch: tls (-any)
</span><span>dropbox-sdk: tls (==0.9.*)
</span><span>connection: tls (>=1.0)
</span><span>azure-service-api: tls (>=1.0 && <1.1)
</span></code></pre>
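<p>Conceptually, a reverse dependency query is a scan of the package database for every package that lists the target among its dependencies. The following is only a toy sketch of that idea and not cabal-db's actual implementation: the <code>toyDb</code> table and its dependency lists are made up for illustration, and version constraints are ignored.</p>

```haskell
import Data.List (sort)

type Package = String

-- Hypothetical toy database: each package with its direct dependencies.
-- The entries are illustrative only, not real Hackage metadata.
toyDb :: [(Package, [Package])]
toyDb =
    [ ("warp-tls", ["tls", "warp"])
    , ("http-conduit", ["tls", "conduit"])
    , ("connection", ["tls", "network"])
    , ("cryptohash", ["bytestring"])
    ]

-- All packages in the database that directly depend on the target,
-- in alphabetical order.
revdeps :: Package -> [(Package, [Package])] -> [Package]
revdeps target db = sort [name | (name, deps) <- db, target `elem` deps]

main :: IO ()
main = mapM_ putStrLn (revdeps "tls" toyDb)
```

<p>A real tool would read the dependencies from the parsed .cabal files of the Hackage index and would also print the version range attached to each dependency, as in the output above.</p>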