asymmetric’s blagh — This is my blagh (Jekyll feed, generated 2024-10-09)

Evaluating the security implications of a company-wide Nix remote builder
2021-05-31 — https://asymmetric.github.io/2021/05/31/remote-nix-builder

I am trying to test Nix’s security properties in this scenario:

  • A company with a remote Nix builder
  • N users have ssh write access to the remote builder
  • A user’s local machine is compromised, with the attacker having root access
  • The attacker cannot SSH to the remote builder without user interaction (e.g. because of a YubiKey-backed SSH key)
  • The builder is also used as a binary cache
  • Question: can the attacker distribute tampered store paths to the rest of the company?

To test this scenario, I’m trying to tamper with my own store paths :).

What I did:

  • remount the store as read-write: mount -o remount,rw /nix/store
  • nix build nixpkgs#hello
  • tamper with nixpkgs#hello:
 echo foo > /nix/store/hkgpl034l6c5zgzhks2dyp7p41z6qyc4-hello-2.12/bin/hello

If I now try to copy this path to a remote store, I get the following:

nix copy --to ssh://foobar /nix/store/hkgpl034l6c5zgzhks2dyp7p41z6qyc4-hello-2.12 
copying path '/nix/store/hkgpl034l6c5zgzhks2dyp7p41z6qyc4-hello-2.12' to 'ssh://foobar'error: hash mismatch importing path '/nix/store/hkgpl034l6c5zgzhks2dyp7p41z6qyc4-hello-2.12';
         specified: sha256:1047k69qrr209c6jgls43620s53w2gw7gsgwx56j0b180bsn1qhw
         got:       sha256:0ilw1adqh4xrqzv37896i2l0966w0sdk8q2wm0mmwmqjlplbq28x
error: unexpected end-of-file

Where are the 2 hashes coming from?

The manpage for nix store verify says that the command checks that a store path’s

contents match the NAR hash recorded in the Nix database

and running

nix store verify /nix/store/hkgpl034l6c5zgzhks2dyp7p41z6qyc4-hello-2.12 
path '/nix/store/hkgpl034l6c5zgzhks2dyp7p41z6qyc4-hello-2.12' was modified! expected hash 'sha256:1047k69qrr209c6jgls43620s53w2gw7gsgwx56j0b180bsn1qhw', got 'sha256:0ilw1adqh4xrqzv37896i2l0966w0sdk8q2wm0mmwmqjlplbq28x'

returns the same two hashes, so I went looking in the nix db, and of the 3 tables there:

sqlite> .tables
DerivationOutputs  Refs               ValidPaths  

ValidPaths seemed the most promising (as it’s the only one with a hash column).

But searching for the specified or got hashes returned no results:

sqlite> SELECT * from ValidPaths WHERE hash = 'sha256:1047k69qrr209c6jgls43620s53w2gw7gsgwx56j0b180bsn1qhw' OR hash = 'sha256:0ilw1adqh4xrqzv37896i2l0966w0sdk8q2wm0mmwmqjlplbq28x';
sqlite> 

The key thing here is that in Nix, hashes are stored/displayed in many different formats:

  • the hash in the DB is base16
  • the hash in the nix copy output is base32
  • the hash in the nix verify output is base32
  • the hash in the nix path-info output is SRI base64
  • the hash in the nix hash path output is SRI base64

I have no idea why this is the case, and it is very confusing.

So in the interest of clarity, I’ll provide the SRI form (using nix hash to-sri) of the two hashes above:

  • specified: sha256-HOJg9QIoLCBN6fzpd/gTfBQNhBlE0ycNS0DkjJOZh4A=
  • got: sha256-HQm86KUSV14rqFxgNJsG3JgEqIgmoTP2x7kTiJsKnEY=

nix path-info returns the expected narHash:

❯ nix path-info --json /nix/store/hkgpl034l6c5zgzhks2dyp7p41z6qyc4-hello-2.12 | jq -r .[0].narHash
sha256-HOJg9QIoLCBN6fzpd/gTfBQNhBlE0ycNS0DkjJOZh4A=

nix hash path returns the actual narHash:

❯ nix hash path /nix/store/hkgpl034l6c5zgzhks2dyp7p41z6qyc4-hello-2.12
sha256-HQm86KUSV14rqFxgNJsG3JgEqIgmoTP2x7kTiJsKnEY=

Two things to note:

  • The store path is actually copied over to the remote host, but it is not added to its nix db’s ValidPaths table, thereby making it invisible.
  • The narHash check is performed by the remote store

An attacker could, at this point, modify the narHash (and the narSize[1]) in the local Nix db:

sqlite> UPDATE ValidPaths
   ...> SET hash = 'sha256:1d09bce8a512575e2ba85c60349b06dc9804a88826a133f6c7b913889b0a9c46',
   ...> narSize = 125720
   ...> WHERE path = '/nix/store/hkgpl034l6c5zgzhks2dyp7p41z6qyc4-hello-2.12';
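The base16 value written to ValidPaths above is just another encoding of the tampered path’s narHash. Since base16 and SRI base64 are both standard encodings of the same 32 digest bytes, the correspondence can be checked with a few lines of Python (a sketch; Nix’s base32 form uses a custom alphabet and isn’t covered here):

```python
import base64

# Convert a "sha256:<hex>" hash as stored in the Nix DB to the SRI
# ("sha256-<base64>") form printed by `nix path-info`/`nix hash path`.
def base16_to_sri(db_hash: str) -> str:
    algo, hex_digest = db_hash.split(":")
    return f"{algo}-{base64.b64encode(bytes.fromhex(hex_digest)).decode()}"

# The base16 hash written to ValidPaths in the UPDATE above.
db_hash = "sha256:1d09bce8a512575e2ba85c60349b06dc9804a88826a133f6c7b913889b0a9c46"
print(base16_to_sri(db_hash))
# → sha256-HQm86KUSV14rqFxgNJsG3JgEqIgmoTP2x7kTiJsKnEY=
```

The result is exactly the “got” hash from the earlier error messages: the attacker has simply registered the tampered path’s real NAR hash.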

And now:

❯ nix copy --to ssh://foobar /nix/store/hkgpl034l6c5zgzhks2dyp7p41z6qyc4-hello-2.12 

❯ echo $?
0

Potential mitigations

The issue here is that paths in the store are input-addressed, meaning that once they’re in the store, there’s nothing (apart from signatures, but they wouldn’t provide any further protection here) that prevents them from being tampered with.

Another option (still partly in the works, AFAIU) is proper content-addressed paths. If we do:

❯ nix store make-content-addressed /nix/store/hkgpl034l6c5zgzhks2dyp7p41z6qyc4-hello-2.12
rewrote '/nix/store/hkgpl034l6c5zgzhks2dyp7p41z6qyc4-hello-2.12' to '/nix/store/9xnc7c495iw6lqk9wxb5r1v8003sp3rp-hello-2.12'

❯ nix path-info --json /nix/store/9xnc7c495iw6lqk9wxb5r1v8003sp3rp-hello-2.12 | jq .
[
  {
    "path": "/nix/store/9xnc7c495iw6lqk9wxb5r1v8003sp3rp-hello-2.12",
    "narHash": "sha256-HQm86KUSV14rqFxgNJsG3JgEqIgmoTP2x7kTiJsKnEY=",
    "narSize": 125720,
    "references": [
      "/nix/store/9xnc7c495iw6lqk9wxb5r1v8003sp3rp-hello-2.12",
      "/nix/store/ah5d5gjqyv319ir95lm8bc44ikkzm10i-glibc-2.34-115"
    ],
    "ca": "fixed:r:sha256:0ilw1adqh4xrqzv37896i2l0966w0sdk8q2wm0mmwmqjlplbq28x",
    "registrationTime": 1653987573
  }
]

we can see that the path now has an additional field, ca, which contains the hash of its actual content.

This cannot be tampered with, unless one is able to break the hashing algorithm and produce a collision.
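The idea can be modelled in a few lines of Python (a toy model only — Nix’s real scheme hashes a serialized NAR and uses its own base32 encoding): when the address is derived from the content, tampering with the content necessarily moves the path somewhere else.

```python
import hashlib

# Toy content-addressed "store": a path's name is derived from the
# hash of its contents (Nix's real computation is more involved).
def store_name(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()[:32] + "-hello-2.12"

original = store_name(b"original hello binary")
tampered = store_name(b"original hello binary" + b"foo")
assert original != tampered  # tampering changes the address itself
```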

After tampering with the narHash and the narSize as above, we are now presented with this error:

nix copy --to ssh://foobar /nix/store/9xnc7c495iw6lqk9wxb5r1v8003sp3rp-hello-2.12
copying path '/nix/store/9xnc7c495iw6lqk9wxb5r1v8003sp3rp-hello-2.12' to 'ssh://foobar'warning: path '/nix/store/9xnc7c495iw6lqk9wxb5r1v8003sp3rp-hello-2.12' claims to be content-addressed but isn't
error: cannot add path '/nix/store/9xnc7c495iw6lqk9wxb5r1v8003sp3rp-hello-2.12' to the Nix store because it claims to be content-addressed but isn't
error: unexpected end-of-file

Conclusions

Adding a remote builder opens up the possibility that an attacker might sneak in malicious store paths, which could be very hard to detect. In effect, the remote builder’s store becomes a database with multiple writers, each of which has full read/write access.

CA derivations provide a mitigation for this use case, and they rely on the security properties of the hashing algorithm.

It’s important to be aware of this when evaluating a Nix build infrastructure.

  1. I haven’t shown this for brevity, but the nix copy command showed an error for the narSize mismatch. 

What happens when you run nixos-rebuild
2019-12-21 — https://asymmetric.github.io/2019/12/21/nixos-rebuild

nixos-rebuild is the command you run to bring your system’s configuration in line with /etc/nixos/configuration.nix.

What happens behind the scenes is the following:

Your system configuration is evaluated

This typically means evaluating /etc/nixos/configuration.nix in the context of a set of packages. The set of packages is the one pointed to by the NIX_PATH variable – specifically, the part after nixpkgs=:

echo $NIX_PATH
/home/asymmetric/.nix-defexpr/channels nixpkgs=/nix/var/nix/profiles/per-user/root/channels/nixos nixos-config=/etc/nixos/configuration.nix /nix/var/nix/profiles/per-user/root/channels

Packages are downloaded/built

including Nix itself! That’s why you see two building/downloading sections.

A system profile is created

System profiles are directories in the nix store encapsulating a snapshot of the system state:

  • system-wide binaries
  • kernel options
  • files in /etc
  • manpages
  • shared libraries

A system profile store path could look something like /nix/store/a50aw2gj9dsn1yk3dsp33xw5y9zqfih9-nixos-system-asbesto-19.09.1484.84586a4514d.

These directories are symlinked into well-known, human-readable locations; the location depends on whether the profile is a running or a boot one.

  • The running system profile is the one currently active on your system, and it’s volatile: it is symlinked in /run/current-system, meaning that when your system reboots, it will be lost.
  • The boot system profile is the one your machine booted from, or will boot from. It’s symlinked at /nix/var/nix/profiles/system, and is therefore persistent. Your bootloader contains an entry that tells it to load the kernel pointed to by the /nix/var/nix/profiles/system/kernel symlink.

Why do we need this distinction? It is to allow users to test a new system configuration out, before adding an entry into the bootloader (achieved via nixos-rebuild test); or to add an entry to the bootloader, without changing the running system (nixos-rebuild boot).

User profiles

User profiles exist alongside system profiles, and encapsulate a user’s view of the system (binaries, manpages, etc.).

They are symlinked under /nix/var/nix/profiles/per-user/$(whoami)/.

A user’s $PATH variable points to their current user profile’s /bin:

echo $PATH
/run/wrappers/bin /home/asymmetric/.nix-profile/bin /nix/var/nix/profiles/default/bin /run/current-system/sw/bin

As you can see above, the user’s profile is also symlinked at ~/.nix-profile. Additionally, there’s a profile called default - this is the “global” user profile: if you run nix-env -i as root, packages installed will be available to all users of the system.

Recap

Name                     Actions                      Location                                      Persistent  Rollback
boot system profile      nixos-rebuild boot/switch    /nix/var/nix/profiles/system                  ✅
running system profile   nixos-rebuild test/switch    /run/current-system                           ❌
user profile             nix-env -i                   /nix/var/nix/profiles/per-user/$(whoami)[1]   ✅          ❌[2]
default user profile     nix-env -i[3]                /nix/var/nix/profiles/default                 ✅
  1. Also symlinked at ~/.nix-profile

  2. Rollback implemented by Home Manager

  3. When run as root

Implementing a Blockchain in Rust
2018-02-11 — https://asymmetric.github.io/2018/02/11/blockchain-rust

This is another one of those blog posts teaching you how to implement a simple, Bitcoin-like blockchain. It has been heavily inspired by Ivan Kuznetsov’s awesome series. What makes it different from previous examples is that this one is written in Rust.

So if these two things happen to interest you, read along.

What’s in a blockchain

A blockchain is a data structure resembling a singly-linked list, i.e. a list where each element has one (and only one) pointer to the preceding one. This pointer is, in the case of a blockchain, the hash of the header of the preceding block in the chain.

We just introduced 3 new terms: hash, block and header.

What’s a hash? A hash is a function that produces a fixed size output for any given input. For example, SHA-256 always produces a 256 bit output, which is usually displayed to humans in hexadecimal. The SHA-256 of the string “I am Satoshi Nakamoto”, displayed in hex, is a756b325faef56ad975c1bf79105bfc427e11102aa159828c8b416f5326a8440.
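As a quick illustration (in Python for brevity, since the property is language-independent), the digest length is fixed regardless of the input size — even a very large input produces exactly 32 bytes:

```python
import hashlib

# SHA-256 maps inputs of any size to a fixed 256-bit (32-byte) digest.
for msg in [b"", b"I am Satoshi Nakamoto", b"x" * 100_000]:
    digest = hashlib.sha256(msg).digest()
    assert len(digest) == 32
    print(msg[:21], "->", digest.hex()[:16], "...")
```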

If a blockchain is a list, a block is an element in the list. It is composed of:

  • a header
  • a body

In real blockchains, the body contains the transactions that represent the exchange of value across accounts, but in our case, the body will just be a string.

A note about structure: The code examples in this blog post are not meant to exemplify the best practices on how to structure a Rust project. All the code is in one file; the code snippets are not meant to be run in isolation; and we don’t separate library and binary crates. Check out this chapter of the Rust book for info on all that and more.

So with that out of the way, let’s take a first stab at implementing a block:

const HASH_BYTE_SIZE: usize = 32;

pub type Sha256Hash = [u8; HASH_BYTE_SIZE];

#[derive(Debug)]
pub struct Block {
    // Headers.
    timestamp: i64,
    prev_block_hash: Sha256Hash,

    // Body.
    // Instead of transaction, blocks contain data.
    data: Vec<u8>,
}

The block has a header, containing an array of 32 bytes (for the 256-bit SHA-256 hash) and a timestamp, and a body, with a data field of type Vec<u8> for variable length strings.

Now that we know what a block is, we can proceed to the next step, i.e. concatenating them in a chain.

In order to do that, we need to have a way to create a new block, and set its prev_block_hash to the hash of the headers of another block. Let’s implement the new function:

use chrono::prelude::*;

impl Block {
    // Creates a new block.
    pub fn new(data: &str, prev_hash: Sha256Hash) -> Self {
        Self {
            prev_block_hash: prev_hash,
            data: data.to_owned().into(),
            timestamp: Utc::now().timestamp(),
        }
    }
}

So far, so good. We pass a hash and a string to the new function, and it gives us an initialized block. But how do we get the hash in the first place?

Proof of Work

As we saw previously, in our blockchain a block’s hash is SHA(header) (in Bitcoin, it’s actually SHA(SHA(header))).

In Bitcoin-like blockchains, miners are the nodes that take new transactions, put them in a block, and then compete with each other for the right to have the block they created added to the blockchain.

What does the competition consist of? The race is to be the first one in the network to find a hash with a numeric value lower than the current target.

The target is a number whose hexadecimal representation has a certain amount x of leading zeroes. x is the difficulty.

For example, if the difficulty is 5, that means we need to produce a hash that, when expressed as hex, has a value lower than 0x0000100000000000000000000000000000000000000000000000000000000000 (notice that any hash below this target has at least 5 leading zeroes).

This is a type of calculation that can only be brute-forced, i.e. there’s no other way to find such a hash than to go through all possible iterations.

But if the header doesn’t change, there is only one iteration to perform: SHA(header), for a header that doesn’t change, will obviously always output the same value, no matter how many times you run it!

For this reason, we need to add another field to our header, the so-called nonce, which is incremented on each iteration.

const HASH_BYTE_SIZE: usize = 32;

pub type Sha256Hash = [u8; HASH_BYTE_SIZE];

#[derive(Debug)]
pub struct Block {
    // Headers.
    timestamp: i64,
    prev_block_hash: Sha256Hash,
    nonce: u64,

    // Body.
    // Instead of transactions, blocks contain data.
    data: Vec<u8>,
}

impl Block {
    // Creates a new block.
    pub fn new(data: &str, prev_hash: Sha256Hash) -> Self {
        Self {
            timestamp: Utc::now().timestamp(),
            prev_block_hash: prev_hash,
            data: data.to_owned().into(),
            nonce: 0,
        }
    }
}

A miner will run the hashing function, check if the hash is below the target, and if it isn’t, increase the nonce (thereby changing the header) and run the hashing function again, until it finds a winning hash (or until someone else finds one before them).
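The loop just described can be sketched in a few lines of Python (the names and the toy header are made up for illustration; the actual implementation, in Rust, follows below):

```python
import hashlib

# Toy proof-of-work: find a nonce such that SHA-256(header || nonce),
# read as a big-endian integer, falls below the target.
def mine(header: bytes, difficulty_bits: int, max_nonce: int = 1_000_000):
    target = 1 << (256 - difficulty_bits)
    for nonce in range(max_nonce):
        digest = hashlib.sha256(header + nonce.to_bytes(8, "little")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce  # a winning nonce
    return None  # hit the iteration limit

nonce = mine(b"toy block header", difficulty_bits=16)
assert nonce is not None
```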

In a real blockchain, the difficulty is adjusted dynamically, in order to keep the rate of block production more-or-less stable: if there’s an increase in hashing power, the difficulty is raised; if there’s a decrease, it is lowered. In Bitcoin, the target rate is one block every 10 minutes.

In our case, the difficulty is a hard-coded constant.

We can now proceed to implementing Proof of Work!

extern crate crypto;
extern crate num_bigint;
extern crate num_traits;

use crypto::digest::Digest;
use crypto::sha2::Sha256;
use num_bigint::BigUint;
use num_traits::One;

const DIFFICULTY: usize = 5;
const MAX_NONCE: u64 = 1_000_000;

impl Block {
  fn try_hash(&self) -> Option<u64> {
      // The target is the number we compare hashes against. Any hash below it,
      // written as 64 hex digits, has at least DIFFICULTY leading zeroes.
      let target = BigUint::one() << (256 - 4 * DIFFICULTY);

      for nonce in 0..MAX_NONCE {
          let hash = Self::calculate_hash(self, nonce);
          let hash_int = BigUint::from_bytes_be(&hash);

          if hash_int < target {
              return Some(nonce);
          }
      }

      None
  }

  pub fn calculate_hash(block: &Block, nonce: u64) -> Sha256Hash {
      let mut headers = block.headers();
      headers.extend_from_slice(&convert_u64_to_u8_array(nonce));

      let mut hasher = Sha256::new();
      hasher.input(&headers);
      let mut hash = Sha256Hash::default();

      hasher.result(&mut hash);

      hash
  }

  pub fn headers(&self) -> Vec<u8> {
      let mut vec = Vec::new();

      vec.extend_from_slice(&convert_u64_to_u8_array(self.timestamp as u64));
      vec.extend_from_slice(&self.prev_block_hash);

      vec
  }

}

// This transforms a u64 into a little endian array of u8
pub fn convert_u64_to_u8_array(val: u64) -> [u8; 8] {
    return [
        val as u8,
        (val >> 8) as u8,
        (val >> 16) as u8,
        (val >> 24) as u8,
        (val >> 32) as u8,
        (val >> 40) as u8,
        (val >> 48) as u8,
        (val >> 56) as u8,
    ]
}

In order to perform the hashing we need the headers and the nonce as an array of bytes, so we create a little helper function (convert_u64_to_u8_array) that does just that (alternatively, you could decide to use the byteorder crate).
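As a sanity check on the byte layout (a Python sketch mirroring the helper above), the eight bytes produced are exactly the little-endian encoding of the integer:

```python
# Mirror of convert_u64_to_u8_array: byte i holds bits 8*i..8*i+7.
def convert_u64_to_u8_array(val: int) -> bytes:
    return bytes((val >> (8 * i)) & 0xFF for i in range(8))

n = 0x0123456789ABCDEF
assert convert_u64_to_u8_array(n) == n.to_bytes(8, "little")
print(convert_u64_to_u8_array(n).hex())
# → efcdab8967452301
```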

The try_hash() function iterates over all the possible nonces sequentially (the limit is MAX_NONCE) and performs the calculation. It either returns the nonce or, if it can’t find any, None.

Notice how this method uses the num-bigint crate for handling large numbers, and bit-wise operations to create a target: since each hex digit encodes 4 bits, we want to multiply DIFFICULTY by 4.
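The shift arithmetic is easy to check (a Python sketch, chosen because Python has arbitrary-precision integers built in): with DIFFICULTY = 5, every hash strictly below the target has at least 5 leading zeroes in its 64-digit hex form.

```python
DIFFICULTY = 5

# Each hex digit encodes 4 bits, so shifting 1 left by (256 - 4 * DIFFICULTY)
# produces a target whose 64-hex-digit form starts with 4 zeroes and a 1;
# everything below it starts with at least DIFFICULTY zeroes.
target = 1 << (256 - 4 * DIFFICULTY)
print(f"{target:064x}")
# → 0000100000000000000000000000000000000000000000000000000000000000

assert f"{target - 1:064x}".startswith("0" * DIFFICULTY)
```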

The calculate_hash() function is where the actual hashing happens. It retrieves the headers, adds the nonce, and hashes the whole thing.

Let’s update our new function to perform hashing when a new block is created:

impl Block {
    pub fn new(data: &str, prev_hash: Sha256Hash) -> Result<Self, MiningError> {
        let mut s = Self {
            timestamp: Utc::now().timestamp(),
            prev_block_hash: prev_hash,
            nonce: 0,
            data: data.to_owned().into(),
        };

        s.try_hash()
            .ok_or(MiningError::Iteration)
            .and_then(|nonce| {
                s.nonce = nonce;

                Ok(s)
            })
    }
}

Notice that we introduced an error type, MiningError:

use std::error;
use std::fmt;

#[derive(Debug)]
pub enum MiningError {
    Iteration,
    NoParent,
}

impl fmt::Display for MiningError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match *self {
            MiningError::Iteration => write!(f, "could not mine block, hit iteration limit"),
            MiningError::NoParent => write!(f, "block has no parent"),
        }
    }
}

impl error::Error for MiningError {
    fn description(&self) -> &str {
        match *self {
            MiningError::Iteration => "could not mine block, hit iteration limit",
            MiningError::NoParent => "block has no parent",
        }
    }

    fn cause(&self) -> Option<&error::Error> {
        None
    }
}

We now have a way to create blocks that are linked to a parent by way of a hash.

A pretty important missing piece is the first block, i.e. the one that has no parent. In blockchain-speak, this is called a genesis block:

impl Block {
    // Creates a genesis block, which is a block with no parent.
    //
    // The `prev_block_hash` field is set to all zeroes.
    pub fn genesis() -> Result<Self, MiningError> {
        Self::new("Genesis block", Sha256Hash::default())
    }
}

The Blockchain

The next step is creating a Blockchain struct:

pub struct Blockchain {
    blocks: Vec<Block>,
}

impl Blockchain {
    // Initializes a new blockchain with a genesis block.
    pub fn new() -> Result<Self, MiningError> {
        let blocks = Block::genesis()?;

        Ok(Self { blocks: vec![blocks] })
    }

    // Adds a newly-mined block to the chain.
    pub fn add_block(&mut self, data: &str) -> Result<(), MiningError> {
        let block = match self.blocks.last() {
            Some(prev) => Block::new(data, prev.hash())?,
            // Adding a block to an empty blockchain is an error: a genesis
            // block needs to be created first.
            None => return Err(MiningError::NoParent),
        };

        self.blocks.push(block);

        Ok(())
    }

    // A method that iterates over the blockchain's blocks and prints out information for each.
    pub fn traverse(&self) {
        for (i, block) in self.blocks.iter().enumerate() {
            println!("block: {}", i);
            println!("hash: {:?}", block.pretty_hash());
            println!("parent: {:?}", block.pretty_parent());
            println!("data: {:?}", block.pretty_data());
            println!()
        }
    }
}

Ok, that’s pretty cool. (The hash() and pretty_*() helper methods used above are small accessors on Block, omitted here for brevity.) We can tie all this together in an example. In a properly structured Rust crate, this would be the main.rs file of a binary crate making use of the library crate:

// this would be the library crate
extern crate rusty_chain;

use std::process;

use rusty_chain::blockchain::Blockchain;
use rusty_chain::error::MiningError;

fn main() {
    println!("Welcome to Rusty Chain");

    run().unwrap_or_else(|e| {
        println!("Error: {}", e);
        process::exit(1)
    })
}

fn run() -> Result<(), MiningError> {
    let mut chain = Blockchain::new()?;
    println!("Send 1 RC to foo");
    chain.add_block("enjoy, foo!")?;

    println!("Traversing blockchain:\n");
    chain.traverse();

    Ok(())
}

In this silly example, we:

  • create a blockchain
  • create a block, with the string enjoy, foo!
  • traverse and pretty print the contents of the block chain
Try Elixir Online
2016-03-28 — https://asymmetric.github.io/2016/03/28/tryelixir


Update: I’ve added an update to the security section.

Today I released something I have been working on for a little while. It’s basically an IEx instance you can access from your browser.

The problem I was trying to solve is one I’ve had while learning Elixir myself: I would often be reading Dave Thomas’s Programming Elixir book on my iPad, and at some point I’d want to try out some Elixir code. Obviously on an iPad I can’t install the runtime, so I was left with no other option than to open my laptop. Hopefully this project will provide an alternative for people on mobile devices.

So, how was the project implemented? It’s actually quite simple. It’s a Phoenix app, without a database, using CodeMirror on the frontend side of things for code editing.

But in order to bring the whole idea to completion, I had to solve the following problems:

  • Evaluate the code sent by the user
  • Capture and return all the output, including errors
  • Find a pretty domain
  • Code the frontend
  • Choose an online code editor
  • Implement a syntax highlighter

Let’s look at the first point.

IO in Elixir

With Elixir’s focus on distributed computing, it should come as no surprise that IO itself can be thought of as a distributed process.

So, how does IO work in Elixir?

Basically every IO function either implicitly or explicitly writes to an IO device. In the case of IO.inspect/2, for example, you do it implicitly; its counterpart IO.inspect/3 receives an additional argument: the device.

Some examples of devices are :stdio and :stderr, but a device can be any Elixir process, and can be passed around as a pid (which is a first-class data type in Elixir) or an atom representing a process.

Another example is IO.puts/2. If you look at its definition, you’ll see the first argument it accepts has a default value, group_leader().

In Erlang, a process is always part of a group, and a group always has a group leader. All IO from the group is sent to the group leader. The group leader is basically the default IO device, and it can be configured by calling :erlang.group_leader/2.

In our case, we want to somehow save all output, and then be able to return it to the browser. How do we go about doing that?

Turns out Elixir has a very nice module that exposes a string as an IO device. The module is aptly called StringIO, and it allows us to save and retrieve all data that has been sent to the device.

Combining these two concepts together, the group leader and StringIO, allows us to achieve our goal: redirect all output to a string, store it, and return it:

{:ok, device} = StringIO.open ""
:erlang.group_leader(device, self)

By the way, this stuff is explained quite well on this page of the documentation.

Evaluating code

The next interesting thing happening in this project is it has to evaluate the code that the user sends at runtime.

Again, Elixir offers a simple solution in the form of Code.eval_string/3. It has only one mandatory argument, the code as a string, and returns a tuple of the form {result, bindings}. We are only interested in the result.

result =
  content
  |> Code.eval_string
  |> elem(0)

IO.inspect result

Note that we use IO.inspect here, instead of IO.puts, as that allows us to leverage the existing implementations of the Inspect protocol.

This gets us almost all the way there. But what happens if the user’s code causes an exception? Ideally, we should recover from the exception and output the error message, same as it would happen on IEx.

Let’s wrap our code in a try/rescue block:

    try do
      result =
        content
        |> Code.eval_string
        |> elem(0)

      IO.inspect result
    rescue
      exception -> IO.inspect Exception.message(exception)
    end

And that’s all. With our backend code in place, let’s now move to the frontend.

Elixir syntax in CodeMirror

CodeMirror is an online editor for code. It is distributed via NPM in a modular way, each module in its own file. This meant that I could use bower for the package, and in my bower.json only require the files that I needed, thereby avoiding bloat and saving a lot of space in the final application.js:

{
  "name": "TryElixir",
  "dependencies": {
    "codemirror-elixir": "asymmetric/codemirror-elixir",
    "fetch": "^0.11.0",
    "es6-promise": "^3.2.1",
    "bootstrap": "^3.3.6",
    "hint.css": "^2.2.0"
  },
  "overrides": {
    "codemirror": {
      "main": [
        "lib/codemirror.js",
        "lib/codemirror.css",
        "theme/material.css",
        "addon/mode/simple.js"
      ]
    },
    "bootstrap": {
      "main": [
        "dist/css/bootstrap.css"
      ],
      "dependencies": {}
    }
  }
}

There was no syntax highlighter file for Elixir, so I had to build one. It’s still a work in progress, and PRs are very welcome. Actually, CodeMirror offers 2 ways of defining syntaxes (what it calls “modes”): one is RegEx based, and it’s very simple. The other one is to define a proper lexer (here is the one for Ruby for example) and although it’s the recommended way, it was definitely too much work for my use case. But if you’re into that kind of thing, there’s an opportunity for you!

Security

(Note: This section was updated on May 11 2016)

Given that this project allows you to execute arbitrary code, just how much risk is there?

Initially, I thought that Heroku’s ephemeral filesystem would be enough: users wouldn’t be able to permanently affect the application, even if they somehow managed to run “malicious” code.

And that’s true, but I was greatly underestimating the amount of damage that can be done when you can run arbitrary code.

For example, you can shut down the VM with :erlang.halt, or restart it with :init.reboot. Additionally, you can run code that will never stop executing, while stealing every CPU cycle available.

To address these problems, I’ve come up with two solutions: a blacklist of forbidden commands, and performing code evaluation inside an async Task.

Async tasks are particularly interesting. You can invoke one with Task.async(fn -> your_function() end), which sends it off to do its work, and afterwards call Task.yield(your_task, your_timeout) to wait for its result.

The key thing here is the timeout: after it expires, the Task is expected to have a value. If it has one, it’s returned; if it doesn’t, we can handle that however we want. In my case, I kill it brutally and just return a message to the user, informing them that the computation took too long.

Domain, hosting and DNS

I was very happy to have found a catchy TLD for the site. The registrar is Namecheap (affiliate link) and it only cost $0.89 for the first year!

The app is hosted on Heroku, and normally what you would do is point www.yourdomain.com to your-app.herokuapp.com. But I wanted to get rid of the www. subdomain, and it turns out you can’t do that with a lot of DNS providers. The reason is that CNAME records can only be created for subdomains. Heroku does not provide each app with a fixed IP obviously, so you can’t just create an A record and call it a day.

Some providers though do offer so-called ANAME/ALIAS records, but most of them cost an order of magnitude more than the domain itself. That’s why I was very surprised to find out that CloudFlare offers their DNS services for free! So I proceeded to hand over DNS resolution for my domain to them (which by the way took something like 5 minutes!), configure the root CNAME and that was it! Now the domain is available without that ugly www.!

TODOs

While I’m pretty happy with the result, there are definitely some ideas that are worth investigating.

It might be interesting to rewrite the app without Phoenix, as a barebones Plug app. The backend is extremely simple, the challenging part would be requiring frontend modules without using Brunch (or having to setup Brunch myself).

Another important thing to do would be to default to HTTPS, maybe using Let’s Encrypt.

And, last but not least, is the tutorial. Most other online interpreters double as intros to the language, with chapters, lessons and exercises. This is not the main focus of the project (as I said, I just wanted an easy way to try out code as I was learning), but if people find it useful, I might implement it.

And I think that’s all.

Have fun!

Installing Erlang on Ubuntu Wily
2015-11-27 — https://asymmetric.github.io/2015/11/27/erlang-on-wily

I’m on Ubuntu Wily, and I was trying to install the erlang-dev package to use HTTPoison. The problem though was that APT wanted to downgrade Erlang to an older version (16.0), in order to resolve some conflicts. This was obviously undesirable, so I set out to investigate.

It turns out that, at least for Wily, the entry in /etc/apt/sources.list.d/erlang-solutions.list added by the erlang-solutions package is incorrect.

It reads: deb http://binaries.erlang-solutions.com/debian squeeze contrib, whereas the official instructions say you should have deb http://packages.erlang-solutions.com/ubuntu wily contrib.

As soon as I changed that, the problem was solved.

TL;DR

  • Uninstall the erlang-solutions package
  • Add the repo manually in /etc/apt/sources.list.d/erlang-solutions.list:
deb http://packages.erlang-solutions.com/ubuntu wily contrib
  • Install packages
Assignments and Pattern-Matching in Elixir
2015-11-25 — https://asymmetric.github.io/2015/11/25/assignments-and-pattern-matching-in-elixir

While reading Dave Thomas’ Programming Elixir book, I was confused by the following piece of code:

defmodule Users do
  dave = %{ name: "Dave", state: "TX", likes: "programming" }

  case dave do
    %{state: some_state} = person ->
      IO.puts "#{person.name} lives in #{some_state}"

    _ ->
      IO.puts "No matches"
  end
end

My problem was understanding line 5: %{state: some_state} = person. How could the person variable get its value, if it was on the right side of the assignment?

A digression on Pattern Matching

Now, if you're not familiar with Elixir, you might not know that in it, the = operator is not exactly an assignment. It is, instead, an operator for Pattern Matching.

If you recall basic math classes from school, you might remember that x = 1 didn’t mean “assign the value 1 to the variable x”. It meant: both sides of the equation are equal. Saying that x = 1 implied that 1 = x.

Now, in imperative languages, this is not the case. To take an example from Ruby, if you try and run 1 = x, you’ll get an error (albeit a very cryptic one):

irb(main):001:0> 1 = x
SyntaxError: (irb):1: syntax error, unexpected '=', expecting end-of-input
1 = x
   ^

Which is Ruby basically telling you that it’s not expecting an assignment after an integer.

In Elixir, the = operator works like it does in algebra:

iex(1)> x = 1
1
iex(2)> 1 = x
1

In other words, Elixir tries to pattern-match (according to rules which are outside the scope of this article) the left hand of the equation with its right hand. If it succeeds, it returns the value of the equation. If it fails, it throws an error.

This allows us to effectively bind values to variables, like we did above. What we cannot do, however, is introduce variables on the right hand side of an assignment/pattern-matching/equation.

So if I try to do 1 = x for an uninitialized variable x, Elixir will complain:

iex(1)> 1 = x     
** (CompileError) iex:1: undefined function x/0

End of digression

So, how could it be that we could run %{state: some_state} = person?

The point is that I was focusing on the wrong part of the expression, which structurally is of the form:

left -> right # %{state: some_state} = person -> IO.puts "..."

The -> operator mandates that its left-hand side be a pattern. So the whole expression %{state: some_state} = person is a pattern.

And when you're binding a variable inside a pattern match, which side of the match it appears on does not matter.

For example, you can do (1 = x) = 1, and it won’t throw an error. Or, as José himself explained, you could run [ h = 1 | t ] = [ 1,2,3 ]. Or [ 1 = h | t ] = [ 1,2,3 ], for that matter.

This is called “Nested pattern matching” in Dave Thomas’ book, but I’m not sure how widely adopted this term is.

So, there it is. I learnt something new. I still think that I’ll use the variable = expression syntax in my code, but it’s good to be aware of this.

Visual Regression Testing with PhantomCSS (2015-04-14)
https://asymmetric.github.io/2015/04/14/phantomcss

PhantomCSS is a fantastic project for doing Visual Regression Testing of your web app.

It’s based on PhantomJS, CasperJS and a library written by the same authors as PhantomCSS, called Resemble.js.

Visual Regression Testing

First of all, what is Visual Regression Testing? Well, according to Wikipedia, regression testing is the practice of detecting unwanted changes in a system (regressions), introduced by other, possibly unrelated changes.

Applied to CSS (hence the “visual” part) this means that you take a screenshot of a CSS selector at a moment in time when it looks as expected (your baseline). You then have tests which, on each run, take a screenshot of the same selector against your app, and compare it to the baseline.

The idea

The idea is that if you run this type of test after introducing changes, you'll be able to catch regressions automatically. What's even more awesome is that you can very easily test across screen sizes (and if you wish, also across rendering engines; cf. SlimerJS).

But aren’t integration tests enough?

You might wonder: why add another level of testing? Can't we already test the page using tools like Capybara or CasperJS itself, asserting on its HTML structure and possibly even on the CSS classes attached to the elements?

You could, but you'd be wrong. Asserting on the HTML structure is a gross misunderstanding of the scope and goal of integration tests; and even if you assert that a div has a .red class, there's no way to be sure that the element is actually red.

The stack

So, with that cleared out of the way, let’s proceed to analyze the stack:

  • PhantomJS is a headless WebKit browser, exposing a JavaScript API you can use to drive the browser.
  • CasperJS is a library built on top of PhantomJS (and SlimerJS), providing a much easier to use API, especially useful for writing navigation steps and tests. Think Capybara for JavaScript.
  • Resemble.js is a library for comparing and diffing images. It allows you to define a threshold, and differences below that threshold are treated as a match. This is useful in many cases, like for when you’re running tests on different OSes that render fonts slightly differently.
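The threshold idea behind Resemble.js can be illustrated with a toy sketch (this is not Resemble.js's actual algorithm; pixels are plain numbers here, and the mismatch is just the fraction of positions that differ):

```javascript
// Toy threshold-based comparison: a difference below the tolerance counts
// as a match, which absorbs e.g. per-OS font-rendering noise.
function mismatchPercentage(a, b) {
  var diff = 0;
  for (var i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) diff++;
  }
  return (diff / a.length) * 100;
}

function matches(a, b, tolerancePct) {
  return mismatchPercentage(a, b) <= tolerancePct;
}

var baseline = [0, 0, 255, 255];
var current  = [0, 0, 255, 250]; // one "pixel" off: 25% mismatch
console.log(matches(baseline, current, 30)); // true: below threshold
console.log(matches(baseline, current, 10)); // false: above threshold
```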

What PhantomCSS does is glue these tools together, adding its own API for taking screenshots and comparing them to the baselines.

How it works

Example

Here is a fairly typical PhantomCSS test:

casper.test.begin("Test the example page", function(test) {
  casper.start();

  casper.then(function() {
    casper.viewportSize(1200, 800);
    casper.open('http://test.example.com');
  }).
  then(function() {
    phantomcss.screenshot('.come-on-my#selector', 'filename');
  });

  casper.then(function() {
    casper.viewportSize(600, 400);
    casper.open('http://test.example.com');
  }).
  then(function() {
    phantomcss.screenshot('.come-on-my#selector', 'filename-small');
  });

  casper.then(function() {
    phantomcss.compareAll();
  });

  casper.run(function() {
    test.done();
  });
});

In this example, we’re loading the page twice, at two different resolutions, and taking screenshots of the same selector.

At the end of the test, we’re comparing the saved screenshots with the baselines.

But when were the baselines saved? Time for some explanation.

The flow

The first time you run a test with PhantomCSS, the library looks into the baselines directory (the one configured via screenshotRoot), looking for a baseline that matches the filename passed to the screenshot function. If it doesn't find one, it will deduce that we're running the test for the first time, and save the screenshot as a baseline. This also means that it will not run any of the compare steps, as there's obviously nothing to compare yet.

On every subsequent run, the baselines directory will already be populated, and PhantomCSS will proceed to make the comparisons.

If you want to create the baselines from scratch, you can either delete the files manually, or configure the command line parameter for the rebase function (more on that here).

Structure of a CasperJS test

A full description of the structure of a CasperJS test is out of the scope of this blog post, but a point should be mentioned:

CasperJS tests are written in an asynchronous way, but without resorting to callbacks (and in fact, that's one of the major advantages of using CasperJS over raw PhantomJS). This means that steps are declared using then(), but they're only executed, in the order in which they were declared, when run() is called.
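The deferred-steps model can be sketched in plain JavaScript (this is just an illustration of the idea, not CasperJS's actual implementation):

```javascript
// Minimal sketch of the deferred-step queue: then() only records a function,
// and run() executes the whole queue in declaration order.
function StepQueue() {
  this.steps = [];
}
StepQueue.prototype.then = function (fn) {
  this.steps.push(fn); // nothing runs yet
  return this;         // allows chaining, like casper.then(...).then(...)
};
StepQueue.prototype.run = function (done) {
  this.steps.forEach(function (fn) { fn(); });
  if (done) done();
};

// Usage: execution order follows declaration order, not call time.
var order = [];
var q = new StepQueue();
q.then(function () { order.push('open page'); })
 .then(function () { order.push('take screenshot'); });
q.run(function () { order.push('done'); });
console.log(order.join(',')); // open page,take screenshot,done
```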

Configuring PhantomCSS

PhantomCSS configuration is done through the init() function. This is how I currently configure it:

phantomcss.init({
  libraryRoot:            'vendor/components/phantomcss',
  screenshotRoot:         'test/visual/screenshots/baselines',
  comparisonResultRoot:   'test/visual/screenshots/results',
  failedComparisonsRoot:  'test/visual/screenshots/failures',
  rebase:                  casper.cli.get('rebase'),
  mismatchTolerance:       0.1,
});

Most options are self-explanatory. The most interesting one is rebase, which is basically saying: when I run the tests using the --rebase flag, overwrite the baselines.

How to run tests

To run a test, you invoke casperjs test filename.js.

It’s also possible to run tests in more than one file, by using globs: casperjs test *.js.

Keeping things DRY

If you're splitting your tests across multiple files (which you should probably do), you'll soon find yourself repeating pieces of code over and over, like the above-mentioned init() call, for example.

I solved this by centralizing my configuration in its own file, which I decided to call common.js (great name, right?).

var require = patchRequire(require);
var phantomcss = require('../../vendor/components/phantomcss/phantomcss');

phantomcss.init({
  libraryRoot:            'vendor/components/phantomcss',
  screenshotRoot:         'test/visual/screenshots/baselines',
  comparisonResultRoot:   'test/visual/screenshots/results',
  failedComparisonsRoot:  'test/visual/screenshots/failures',
  rebase:                  casper.cli.get('rebase'),
  mismatchTolerance:       1,
});

exports.phantomcss = phantomcss;

var viewports = { 
    'smartphone-portrait':  { width: 320,  height: 480  },  
    'smartphone-landscape': { width: 480,  height: 320  },  
    'tablet-portrait':      { width: 768,  height: 1024 },
    'tablet-landscape':     { width: 1024, height: 768  },  
    'desktop-standard':     { width: 1280, height: 1024 }
};

exports.set_viewport = function(name) {
  var viewport = viewports[name];

  return casper.viewport(viewport.width, viewport.height);
};

casper.options.viewportSize = { 
  width: viewports['desktop-standard'].width,
  height: viewports['desktop-standard'].height
};

As you can see, I’m setting some configuration values, defining some helper functions, and exporting the whole thing using CommonJS’s exports object.

In your tests you can then do something like

1
2
3
var common = require('support/common');
common.set_viewport('desktop-standard');
common.phantomcss.screenshot();

For more info on how to use modules in CasperJS, check this page. There you'll also find an explanation for the weird first two lines.

Installing Ruby 2.2 on Ubuntu 14.04 (2014-12-25)
https://asymmetric.github.io/2014/12/25/ruby22-ubuntu

Today I tried installing Ruby 2.2.0, which just came out, on my Ubuntu 14.04 (Trusty) machine.

The build (through the excellent ruby-build) failed though, and inspecting the logs, the error seemed to be related to libffi:

linking shared-object fiddle.so
/usr/bin/ld: ./libffi-3.2.1/.libs/libffi.a(raw_api.o): relocation R_X86_64_32S against `.rodata' can not be used when making a shared object; recompile with -fPIC
./libffi-3.2.1/.libs/libffi.a: error adding symbols: Bad value
collect2: error: ld returned 1 exit status

The solution was to simply install libffi-dev:

sudo aptitude install libffi-dev

And then the installation proceeded without problems.

Cucumber Notes (2014-12-22)
https://asymmetric.github.io/2014/12/22/cucumber-notes

Terminology

  • Feature Step: the textual descriptions which form the body of a scenario
  • Step Definition: matcher methods, implementation of a step.
    • composed of a regexp and a block, which receives all of the matched elements as arguments
  • Feature: indivisible unit of functionality, e.g. an authentication challenge and response user interface. Each Feature is composed of Scenarios
  • Scenario: a block of statements inside a feature file that describe some behaviour desired or deprecated in the feature

Concepts

  • Use descriptive, instead of procedural feature steps
    • Feature steps are about what needs to happen, not how
    • Will this wording need to change if the implementation does?
      • answer should be no
  • The verb used in the step definition doesn’t matter:
    • A Given feature clause can match a When step definition matcher.
  • The physical directory structure is flattened by Cucumber
    • any file ending in .rb is loaded
    • steps defined in one file can be used in any feature
    • name clashes result in errors
    • the -r parameter instead allows you to selectively load files/directories
  • Immediately implement the new step requirement in the application using the absolute minimum code that will satisfy it.
    • The result is that you will never have untested code in your app
      • Because all code is a product of a test
  • Use modules to group together calls to the Capybara API
  • Cucumber uses transactions, and transactions are rolled-back after each Scenario

Method

  • For each feature step
    • Write step definition
    • Run, and watch it fail
    • Write application code that makes it pass
    • Without changing the step’s logic, change the “test criteria”, to make sure it’s passing for the right reason
    • Reset the “test criteria”

A Good Step Definition

  • The matcher is short.
  • The matcher handles both positive and negative (true and false) conditions.
  • The matcher has at most two value parameters
  • The parameter variables are clearly named
  • The body is less than ten lines of code
  • The body does not call other steps

Examples

Feature

Feature: Some terse yet descriptive text of what is desired
In order that some business value is realized
An actor with some explicit system role
Should obtain some beneficial outcome which furthers the goal
To Increase Revenue | Reduce Costs | Protect Revenue  (pick one)

  Scenario: Some determinable business situation
      Given some condition to meet
         And some other condition to meet
       When some action by the actor
         And some other action
         And yet another action
       Then some testable outcome is achieved
         And something else we can check happens too

  Scenario:  A different situation
      ...

Step

When /statement identifier( not)? expectation "([^\"]+)"/i do |boolean, value|
  actual = expectation( value )
  expected = !boolean
  message = "expectation failed for #{value}"
  assert( actual == expected, message )
end

Modules

module LoginSteps
  def login(name, password)
    visit('/login')
    fill_in('User name', :with => name)
    fill_in('Password', :with => password)
    click_button('Log in')
  end
end

World(LoginSteps)

When /^he logs in$/ do
  login(@user.name, @user.password)
end

Given /^a logged in user$/ do
  @user = User.create!(:name => 'Aslak', :password => 'xyz')
  login(@user.name, @user.password)
end

Classes

# features/support/local_env.rb
class LocalHelpers
  def execute( command )
    stderr_file = Tempfile.new( 'script_stdout_stderr' )
    stderr_file.close
    @last_stdout = `#{command} 2> #{stderr_file.path}`
    @last_exit_status = $?.exitstatus
    @last_stderr = IO.read( stderr_file.path )
  end
end

World do
  LocalHelpers.new
end