Async Programming Is Just @Inject Time

All I really wanted to do was learn a little bit more about different models for error handling, but then I kept seeing “effects” and “effect systems” mentioned, and after reading about Koka and Effekt I think I’ve been converted. I want effects now. So here’s what I wished I could have read a few weeks ago.

To start with, you need to remember that functions don’t exist. They’re made up. They’re a social construct.

Your CPU doesn’t know or care what functions are;1 they’re purely a book-keeping abstraction that makes it easier for you to reason about your code, and for the compiler to give you useful feedback about it. That’s the whole idea of structured programming: build some abstractions and have a compiler that can make guarantees about them.

I’ve never really done much assembly so this wasn’t something I’d had to contend with too much, but functions are interesting because they’re a fixed entry point with a dynamic return point. Let me show you what I mean with this C program:

int first_function() {
  // ...
  return 10;
}

int some_function() {
  // ...
  int number = first_function();
  return 4 + number;
}

int main(void) {
  first_function();
  some_function();
  return 0;
}

When this program is compiled, the compiler knows exactly where the instruction pointer needs to jump to get to first_function and some_function, since it knows exactly where in the executable it put them. Chances are that if you looked at the assembly they would each just be a single instruction to jump a nice fixed offset.

What happens when we get to the return statements? first_function is called from both some_function and main—there isn’t just a single place that we can jump back to. The compiler doesn’t know when it’s generating the code for first_function who’s going to be calling it.

How this works is that alongside any function arguments, there’s an invisible argument2 passed that contains the position of the instruction where it made the jump to the top of the function. The compiler knows what the instruction address is—it’s the one that puts it there—and so for each function call site, that’s just a static piece of information that gets passed in. At the end of each function, the compiler just has to generate some code to read that argument (usually stored in a CPU register somewhere, but it doesn’t have to be), jump back to that location, and continue execution.

You don’t think about this complexity because the abstraction is so solid and yet gives immense flexibility to write complicated programs.

The resolution of which function to call can get more complicated by taking into account the number of arguments and their types, instead of just the name of the function.

That’s the simplest case—static dispatch that is known at compile time—but higher-level languages introduce dynamic dispatch, where a function call could end up jumping to one of many different locations. A great example of this is Java:

class MyClass {
  @Override
  public String toString() {
    return "my class";
  }
}

Object someObject = new MyClass();
someObject.toString();

The toString method that gets called depends on the type of the receiver object. This isn’t determined at compile time, but by a lookup that happens at runtime. The compiler effectively generates a switch statement that looks at the result of getClass and then calls the right method. It’s smarter than that for performance, I’m sure, but conceptually that’s what it’s doing.

This abstraction continues to work really well because if you’ve developed in Java (or any of the many, many languages that share this behaviour) you quickly internalise the behaviour of the method resolution algorithm, and it’s almost never surprising which bit of code ends up being executed. The compiler might need a runtime lookup to check, but you can use your big human brain and work it out with deduction while you write the code.

So in Java (and basically every other object-oriented language) we have dynamic function dispatch as well as a dynamic return jump at the end of each function. We can pass an object to a function, and call a method on that object. Since the receiving function doesn’t know the type of the object at compile time, any method calls on it will be completely dynamic:3

String someMethod(Object object) {
  return "This could be anything: " + object.toString();
}

someMethod might be statically dispatched, but the call to toString will have to be dynamically resolved depending on the type of object.

In someMethod, the call to toString will end up jumping to code that is entirely controlled by the object passed in as an argument. The CPU (or in this case, the JVM) will look up the location of toString on whatever type the object is, and jump there.

Just like with the function resolution algorithm, this complexity is manageable both because of the function call abstraction—we know that control will jump into the other function and then return back to our function—as well as type safety—we know the returned type will be a String, so we don’t need to worry about how we got it.

This is something that I find interesting in Rust: since there’s no runtime dynamic dispatch “by default”, you have to be explicit by wrapping your type in Box<dyn MyTrait>; if you instead want the polymorphism resolved at compile time, you can use impl MyTrait.

Now if we’re going to jump to an arbitrary bit of code, why not put that bit of code at the call site? That’s what happens when we create an anonymous subclass:

someMethod(new Object() {
  @Override
  public String toString() {
    return "heh a new string";
  }
});

The actual location in the source file doesn’t really matter—the compiler will end up putting it wherever it feels like—but from a syntax point of view, we’ve now got control flow that jumps into someMethod, then back into our toString method, returns to someMethod, and then finally back to the call site.

This is such a useful pattern that most languages have dedicated syntax for this: closures! I love closures so much that I wrote a review of the various closure syntaxes. Let’s jump out of the JVM for now and appreciate this lovely Swift closure:

[1, 2, 3].map { number in
  number * 3
}

Instead of all that boilerplate to make a new object, we basically just write a block of code that will be used by the function we’re calling. What’s interesting here is that we’re not in complete control when that block of code is running. It might appear like it, but we can’t do anything except give a value back to the function.

This creates the limitation where you can’t create custom control flow that integrates with control flow that’s built into the language. Closures can provide values and have side effects, but they’ve got limited ability to stop the function that called them from running.

Both Ruby and Crystal work around this limitation in interesting ways, but that’s getting a little bit ahead of ourselves.

We’re going to forget about closures for a minute and talk about error handling. I promise it’ll make sense.

Error Handling

The most basic form of error handling is what you get in Go; if something didn’t work, you return a value that says so. By convention the caller checks that value and typically just returns it to say that whatever it was trying to do also didn’t work.

func getConfigPath() (string, error) {
  path, set := os.LookupEnv("CONFIG_PATH")
  if !set {
    // The variable isn't set, report an error
    return "", fmt.Errorf("CONFIG_PATH not set")
  }
  return path, nil
}

This is conceptually very simple: it builds slightly on the function abstraction by allowing multiple return values, but little else. If a function can fail, you can see from its function signature that it will return an error, like in getConfigPath above.

With this model we have to write out the return nil, err after each function call, but semantically we can think of control flow “jumping” to the point where we do something other than immediately return the error.

func getConfig() (*conf.Config, error) {
  path, err := getConfigPath()
  if err != nil {
    return nil, err
  }
  f, err := os.Open(path)
  if err != nil {
    return nil, err
  }
  config, err := configFromFile(f)
  if err != nil {
    return nil, err
  }
  return config, nil
}

config, err := getConfig()
if err != nil {
  panic(err)
}

In this example, any error in loading the path, reading the file, or parsing the config will direct control flow back to the top-level code and into the panic call.

Skipping over macros that make it more succinct to return an error, the next iteration of this pattern is checked exceptions in Java. Any function that can fail is annotated with what is effectively a second return value. The difference is that nothing is needed at the call site to return this value: it gets implicitly passed back up through the call stack (each frame dynamically resolved, remember) until we hit a catch block, which is just a bit of code that takes that value and does something with it, not that different from the Go example above.

If we ignore the fact that exceptions in Java are typed, all that’s actually happening here is that every time we enter a try block, the compiler records in memory the location of the instruction corresponding to the start of the catch block. As we keep calling more functions, some of them might have try blocks of their own, and those are added onto a stack—a shorter stack than the actual call stack, since not all functions have a try/catch. When an exception is thrown, instead of looking up the location the function is supposed to return to, we consult the stack to find the topmost catch block, and jump straight there. We’ve just done a return that has skipped over multiple functions all in one go.

Of course the actual behaviour is much more complicated as it has to worry about finally blocks and types and all that, but the core idea is the same.
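Ruby actually ships a stripped-down, untyped version of exactly this mechanism as catch/throw (separate from its begin/rescue exception system): catch records a jump target, and throw unwinds straight to the nearest matching one.

```ruby
# catch records a jump target on a stack; throw unwinds straight to the
# nearest matching catch, skipping over every intermediate stack frame
def inner
  throw :failed, "something broke"
  "never reached"
end

def middle
  inner
  "also never reached"
end

result = catch(:failed) do
  middle
end
puts result # prints "something broke"
```

Neither inner nor middle gets a chance to run its remaining code; the throw returns straight through both of them in one go.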

Have you got all that? This is where things get weird.

When an exception is thrown, what if the compiler grabbed the instruction pointer and stored it somewhere before jumping out to the catch block? Then if you wanted, inside the catch you could choose to jump back—multiple layers of function calls—into the code that failed as though nothing had happened.

Let’s say that we could grab the current instruction pointer location—which the compiler will know for every line of code, since it’s the one generating the instructions—with a special variable called __location__.

Something like this (if C had try/catch and throw):4

void some_function() {
  print("At the start...");
  throw __location__;
  print("I'm back!");
}

try {
  some_function();
} catch(error_location) {
  print("Caught an exception!");
  goto error_location;
}
print("Finished.");

In some_function we throw and jump out to the nearest try in our call stack, passing the current instruction back. In the code up the call stack we can run some code and then goto back to where the throw happened, resuming the function where we left off.

The output would look like this:

At the start...
Caught an exception!
I'm back!
Finished.
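As it happens, you don’t need a hypothetical language for this. Ruby’s first-class continuations can express it directly: callcc captures “the rest of the function” as an object, which we can throw out to a handler and later invoke to resume. This is a sketch (callcc is deprecated in modern Ruby, which will tell you to use Fiber instead), but it produces exactly the output above:

```ruby
require "continuation" # enables callcc; deprecated, but perfect for this demo

def some_function
  puts "At the start..."
  # capture the current point of execution and "throw" it up the stack
  callcc { |resume_point| throw :error, resume_point }
  puts "I'm back!"
end

resume_point = catch(:error) do
  some_function
  nil
end

if resume_point
  puts "Caught an exception!"
  # jump back in, multiple frames deep, as though nothing had happened
  resume_point.call
end
puts "Finished."
```

After resume_point.call, execution continues inside some_function; when it returns normally, the catch block completes with nil, so the handler doesn’t run a second time.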

Well, that’s effects. Almost.

Coroutines

There’s another feature similar to effects: coroutines. The name is confusing because it’s also what people often call lightweight threads (which are often implemented with some version of coroutines, even if the language doesn’t expose them for anything else). Coroutines allow you to stop the execution of a function and then resume it later, usually passing values back and forth at each step.

I first came across coroutines in the Wren programming language. I got very confused by its concurrency since by default it’s not the “throw stuff at the wall and it’ll run at kinda the same time” model that Go and Crystal have.

var fiber = Fiber.new {
  System.print("Before yield")
  Fiber.yield()
  System.print("Resumed")
}

System.print("Before call")
fiber.call()
System.print("Calling again")
fiber.call()
System.print("All done")

That gives this output:

Before call
Before yield
Calling again
Resumed
All done

Instead of the fibre running in the background (or “background”, depending on your scheduler), it runs until it hits Fiber.yield(), then stops and waits for someone to call .call() again.

This is really powerful for writing a lexer and parser that work together, without complicated coordination code and without storing an entire intermediate result in memory before passing it to the next stage. The lexer can trundle along and once it’s got a full token it can yield() that value. The parser just calls .call() whenever it needs a new token to process. They’re handing off control between each other in a more elaborate way than calling a single function and getting back a single result, and the code in both can be more freely structured since any function can yield() or call() whenever a value is found or needed.
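Ruby’s Fiber works the same way as Wren’s, so a toy version of that lexer/parser handoff might look like this (the “lexer” here just splits on whitespace, purely for illustration):

```ruby
# a toy lexer: it trundles along, and each time it has a full token it
# yields that value and suspends until the parser asks for the next one
lexer = Fiber.new do
  "1 + 2 * 3".split.each do |token|
    Fiber.yield(token)
  end
  nil # signal that we're out of tokens
end

# the "parser" just resumes the lexer whenever it needs another token
tokens = []
while (token = lexer.resume)
  tokens << token
end
tokens # => ["1", "+", "2", "*", "3"]
```

Neither side ever holds the full token stream; control ping-pongs between them one token at a time.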

Remember how I wrote thousands of words about concurrent programming? Well the secret to any language that has async/await is basically that they can do this “jump to catch and then resume again later” trick.

Ignore the fact that catch usually means exceptions, which usually means some kind of failure. A piece of code is running and has just started some work that will take a long time in the background; there’s no point waiting, and the program can do something more useful in the meantime. It “throws” an exception that is caught by a scheduler multiple layers of function calls up the stack. The scheduler saves the return address into a list of pending work to get back to, and goes to find something it can make progress on. Eventually the background task completes and the scheduler is signalled. It pops the return address off the list and jumps to it, continuing the function call exactly where it left off as though nothing happened.
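You can sketch that whole dance with fibres: “awaiting” suspends the function, the scheduler stashes the suspension point in a pending list, and resumes it later. (The scheduler here is a two-line stand-in, not any real event loop API.)

```ruby
pending = [] # the scheduler's list of suspended work to get back to

task = Fiber.new do
  puts "started some slow work"
  Fiber.yield # "await": throw control out to the scheduler
  puts "slow work finished"
end

task.resume          # run the task until it suspends
pending << task      # the scheduler records where to jump back to
puts "doing something more useful in the meantime"
pending.shift.resume # the background work "completed": jump back in
```

The task prints its first line, the main code runs while it is suspended, and the final resume continues the task exactly where it left off.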

If you take nothing else from this post, just know that async/await is just weird exceptions that you can undo.

Now the problem that plagues both async/await and exceptions is that they’re typically not integrated into the rest of the type system. In Java you can’t have a type or function that is generic over whether it will throw an exception.

String readFileOrFail(String path) throws IOException, FileNotFoundException {
  File file = new File(path);
  if (file.exists()) {
    return FileUtils.read(path);
  } else {
    throw new FileNotFoundException("file doesn't exist");
  }
}

List.of("one.txt", "two.txt", "three.txt")
  .stream()
  // doesn't work!
  .map(name -> readFileOrFail(name))
  .collect();

The map method only accepts a lambda that doesn’t throw any checked exceptions, so we can’t directly call our readFileOrFail method. Ideally it would be able to generically say “I throw the same exceptions as the lambda I receive” but you can’t do that in the Java type system.

This isn’t helped by the fact that Java has mostly given up on checked exceptions and instead opted for purely unchecked, runtime exceptions that offer no compile-time guarantees.

Swift is a little better in that it has the rethrows keyword, which marks a function as throwing only if the closure passed to it throws.

You get the same story with async functions. Swift has a whole separate library for dealing with async operations on collections, because the methods on the existing collections can’t be generic to support both synchronous and asynchronous versions. There’s no such thing as re-async.

So here we go: all of these things (closures, exceptions, suspending functions) are just ways of jumping forwards and backwards between different places, plus some compiler guarantees to ensure that the jumping happens in a structured, safe way. That’s what effects give you, and then some.

Effekt

Effekt is a research language with effect handlers and effect polymorphism (it says so on the website!). I also read the docs on Koka but ended up writing the most code in Effekt.

From the language tour on Effekt effects, an effect is written with an interface:

interface Exception {
  def throw(msg: String): Nothing
}

In this case we’ll throw with a String, and the effect handler will give us Nothing back. Here that’s a somewhat magical Nothing type that tells the compiler the function will never return, but it could be a real value, which we’ll see in later examples.

Then we have a function that uses the effect:

def div(a: Double, b: Double) =
  if (b == 0.0) { do throw("division by zero") }
  else { a / b }

What’s interesting here is how that throw changes the function signature of div. In this example it’s elided since it will just be inferred by the compiler. We could write it as Double / { Exception }, which says we’re returning a Double and we’ll use the Exception effect. This means we can only call it from somewhere with an Exception effect handler, like this:

try {
  div(4, 0)
} with Exception {
  def throw(msg) = {
    println("oh no the div failed: " ++ msg)
  }
}
println("finished")

The control flow will start at try, then jump to the div function, since the b argument is 0, div will invoke the throw effect. The effect will jump control flow back down into the def throw block, and we’ll print the error. Since we didn’t call resume() the control flow will continue after the try block and run the last println.

Effekt effects get their power with the resume keyword. This swaps them from acting like exceptions and makes them act like async/await. Control flow jumps to the effect handler, which can then do some work and call resume to continue from the point that triggered the effect.

Let’s continue with the Exception example, but make it possible to recover from errors. The effect would be a little more complicated:

interface Exception[T] {
  def throw(msg: String): T
}

Now we can throw an exception with a message, and the exception handler can give us back a value to use instead.

val result = try {
  div(4, 0)
} with Exception {
  def throw(msg): Double = {
    println("oh no the div failed: " ++ msg)
    resume(42.0)
  }
}
println("finished: " ++ show(result))

The div function will be called, and it’ll again throw back to our exception handler. This time we print the error but then resume with a value. In div this is used as the result of the do throw expression.
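You can fake this resumable flavour in plain Ruby too, with callcc carrying the resume point along with the error. A sketch, not how you’d write real error handling:

```ruby
require "continuation"

def div(a, b)
  if b.zero?
    # trigger the "effect": unwind to the handler, carrying a
    # continuation it can use to resume us with a replacement value
    callcc { |resume| throw :exception, [resume, "division by zero"] }
  else
    a / b
  end
end

result = catch(:exception) do
  div(4.0, 0.0)
end

if result.is_a?(Array)
  resume, msg = result
  puts "oh no the div failed: #{msg}"
  resume.call(42.0) # jump back into div, with 42.0 as the throw's result
end
puts "finished: #{result}"
```

On the first pass result is the [resume, msg] pair; calling resume re-enters div, which returns 42.0, so the catch completes a second time with that value and the handler is skipped.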

In Koka this gets even wilder: resume can be called more than once, which forks the original function so there are two instances, each progressing with a different result. This is absolutely wild.

The key here is that you don’t have to call resume immediately. Just like how you can store a closure to compute some result later, you can wrap resume in a closure and wait to call it some other time. The state of the function that triggered the effect will be stored with the closure just like any other data. You can see this in action in the Effekt async example.

Effects versus yield

That’s only just scratching the surface of how you can use effects for control flow. Something I found interesting while reading about all this was realising that Crystal’s yield keyword is like a little baby effect system.

Crystal inherits the somewhat complicated block semantics from Ruby. This is exposed with the Proc type and the yield keyword. The simple example from the documentation:

def twice(&)
  yield
  yield
end

twice do
  puts "Hello!"
end

The yield keyword yields (aahh!) control back to the calling function. In this example the block of code “passed” to twice is run two times. This is not too dissimilar to passing a Runnable to a Java method:

void twice(Runnable block) {
  block.run();
  block.run();
}

twice(() -> {
  System.out.println("Hello!");
});

Except the yield in Crystal is more powerful, because the caller can change the control flow in the function that accepts the block. You can break from within a block and cause an early return, or return from within the block and return from the method the block is in—not the method it’s calling.

def find_mod_2(items)
  items.each do |i|
    if i % 2 == 0
      return i
    end
  end
end

That return statement will stop the execution of each and return from find_mod_2. If this was another language, or if each was implemented with a Proc rather than a yield, you would have to return a special value to indicate you wanted to stop, or raise an exception. This is how Crystal gets away with having no for loop in the language.5 Otherwise the block would simply cede control to the method that called it.

What’s confusing is that you use the same syntax to create a Proc, which can’t affect the control flow of the function that called it and has the same limitations as in other languages like Java. If you think about the implementation it makes sense: a yield cannot be stored and run later, outside of the execution of the method it is in, whereas a Proc can be stored as an instance variable and executed much later. Like, how would this work?

class Thingie
  getter block : Proc(Nil)? = nil

  def do_thing(&block : Proc(Nil))
    puts "setting thing"
    @block = block
  end
end

def use_thingie(th : Thingie)
  th.do_thing do
    return "this is a value!"
  end
  puts "Am I unreachable?"
end

th = Thingie.new
use_thingie(th)
th.block.call # what should happen here?

How can use_thingie ever finish if the return statement is in the Proc? What should happen when the Proc is called? It can’t return from use_thingie since that function will have already finished by the time it’s called.

The Crystal compiler knows this doesn’t work and the program will fail to compile:

In test.cr:12:5

 12 | return "this is a value!"
      ^
Error: can't return from captured block, use next

This is the exact same distinction as Swift’s @escaping closures, except Swift doesn’t allow control flow in non-escaping closures anyway.

yield in Crystal is a very simple version of effects, since it only allows jumping up one layer in the call stack; if you want to forward a block you need to re-yield when you call another function. There’s also only one possible receiver: the single block passed to the function will be used for all yield statements.

Dependency injection is just effects

You can only use an effect if somewhere up the call stack there is a place where that effect will be handled. In Java you need a catch around every throw, even if for runtime exceptions you can skirt around this slightly. In languages with async/await you must decorate a call to an async function with await, and the function you’re calling from must be async. Eventually up the call stack you’ll get to a call that adds the async work to a task queue, executor, or blocks waiting for it to complete. These are all examples of effect handlers for async programming. They provide the scheduling effects that the async code needs in order to run.

This can define lexical scopes; no code outside of places where a certain effect handler is installed may use that effect. My mind is broken in just the right way that when I realised this, I thought “that’s just dependency injection”.

The key of (Dagger-style) dependency injection is that you can only access certain dependencies in certain parts of the application, and how those dependencies are constructed is separated from their actual use. I like this so much I implemented it with Crystal macros.

Since effects propagate up, they naturally support nested scopes. When an effect is triggered for a dependency provided by a wider scope, it will skip over the handler for the inner scope and jump straight to the outer handler to get the dependency.

This code is still fairly verbose; you would likely want some code generation or macros to tidy it up and make it less of a pain to write. We start with an injection effect:

interface Inject[A] {
  def get(): A
}

As long as we have the right type annotations, we can do get() to defer to the effect handler to provide us with a value:

def functionWithDeps(): Unit / { Inject[Logger], Inject[Config] } = {
  val logger: Logger = do get()
  val config: Config = do get()
  logger.log("Doing stuff, this config: " ++ show(config))
  doImportantStuff(config)
}

This function can only be called in contexts where we can inject both a Logger and a Config. This is what the function call at the scope root would look like:

def doWithInjection() = {
  val config = buildConfig()
  val logger = getLogger()
  try {
    functionWithDeps()
  } with Inject[Config] {
    def get() = resume(config)
  } with Inject[Logger] {
    def get() = resume(logger)
  }
}

This would obviously be unwieldy with lots of dependencies, but that could either be handled by clever type-system trickery, macros, or code generation. You’d also want to create the objects lazily, which I’ve neglected to do here.
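Translated into plain Ruby, you can get the same shape—handlers installed for a scope, lookups deferring to the nearest enclosing handler—without an effect system. All the names here are made up for illustration:

```ruby
# a toy Inject "effect": with_inject installs lazy providers for the
# duration of a block, and inject defers to the nearest enclosing
# scope that has a handler for the requested dependency
$inject_scopes = []

def with_inject(providers)
  $inject_scopes.push(providers)
  yield
ensure
  $inject_scopes.pop
end

def inject(name)
  scope = $inject_scopes.reverse_each.find { |s| s.key?(name) }
  raise "no handler for #{name}" unless scope
  scope[name].call # providers are lazy, like resume(config)
end

def function_with_deps
  logger = inject(:logger)
  config = inject(:config)
  logger.call("Doing stuff, this config: #{config}")
end

log = []
with_inject(logger: -> { ->(msg) { log << msg } }, config: -> { "outer config" }) do
  # the inner scope overrides :config; :logger skips up to the outer scope
  with_inject(config: -> { "inner config" }) do
    function_with_deps
  end
end
log # => ["Doing stuff, this config: inner config"]
```

Of course this loses what makes effects interesting (the compiler checking that every inject has a handler up the stack), but the scoping behaviour is the same.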

What’s neat about this is that it works on functions rather than objects, so you’re not forced to indirect things through lots of different classes if you don’t want to.

Effect syntax

Since many languages already have an effects-adjacent way of throwing and catching exceptions, the syntax changes required to support arbitrary effects are actually fairly minimal. Effects could slot quite nicely into the Swift syntax, at least the parts that I can think of. Since you want to stay as close to the existing throws and async keywords, I’d propose listing the effects between the argument list and the -> before the return type, like this:

// No effects
func foo() -> String {}
// One effect
func foo() async -> String {}
// Two effects, one with a type parameter
func foo() throws<Error>, async -> String {}

It doesn’t fit with other types in Swift, but I think the names of effects should be lowercase to appear more like “tags” than “types”, although I could be persuaded for consistency to uppercase them. Since an effect might have any number of generic parameters, you’d have to specify those within angle brackets which is a little ugly but not terrible.

Defining the effect would be similar to an enum definition:

effect async {
  case suspend
  case cancel
}

This fits well because you’re yielding to a particular case in the effect, and you can add associated data to each case in just the same way you do for enums:

effect throws<T: Error> {
  case throw(T)
}

Just like Effekt, I think the do keyword is nice to indicate that you’re doing something with an effect. I think this would be required on any function call that has an effect, like this:

func fetchUserInfo(id: Int) async -> User {
  let info = do userInfoLoader.load(id)
  return User(from: info)
}

At a point that we want to handle the effects, you would put a block that will match on the effects used in the do expression. Currently in Swift this is the catch keyword, but since this has to be more general I think when is a better fit. It would read as “do that, and when this happens, do this other thing”.

do {
  do userInfoLoader.load(id)
} when throws(error) {
  Log.error("Unable to load user \(id): \(error)")
  return nil
}

If you need to handle multiple effects, you’d tack on extra when blocks just like you can with catch blocks today.

The case with throws would be—like Effekt—a special case for an effect with a single type. For effect handlers with multiple cases, the body of the when block would be equivalent to a switch statement:

do {
  do asyncScheduler.doSomeWork()
} when async {
  case suspend: {
    self.pendingTasks.append {
      resume
    }
  }
  case cancel: {
    self.onTaskCancelled()
  }
}

What I think is interesting about this exercise is that from a syntactic point of view, there isn’t really that much to change. Functions can already be tagged with a fixed set of effects, and there’s already syntactic structures to handle them.

Do you want effects?

I got into this mess because I was reading about ways of handling errors and also ways of handling async programming.

Much like generics, I don’t think most code would need to define its own effects or effect handlers. Having exceptions and async/await not be something that’s built into the language, and instead something that’s built with the language, would be really cool. The language could be less prescriptive about how async code is written, perhaps allowing certain codepaths to have strict guarantees on how fibres can be cancelled, for example.

This might be the way to get structured concurrency into a language without placing the entire burden on the language itself. It would allow library authors to dictate the contexts in which certain functions could be called, enforcing structure and correctness. In most garbage-collected languages many contracts are only enforced in documentation saying “don’t hold onto a reference to this object”, which I’ve also written about before.

Effects would also be valuable in code that deals with deadlines or other scoped data that is typically stored in dependency injection, thread local variables, or passed through function calls manually. Instead of constantly checking the deadline, you could just augment the existing suspension effect to fail at any suspension point when a deadline has run out. Any code that needs to operate with a deadline simply couldn’t be called from contexts without a deadline.

I’ve focussed mostly on how effects relate back to exceptions and async code, since those are control-flow constructs that I (and probably you) are most familiar with. I haven’t given much thought to what it would be like to write code where all I/O is handled through effects. If you had to annotate every single function and function call that you wanted to do I/O, I imagine that would get really tedious. If the language had good type inference on the required effects, then it might not be so bad.

When I wrote all about concurrency I argued that all code should be async by default—like Go or Crystal—since there’s already so much implicit behaviour going on in your typical program, you might as well get low-cost I/O while you’re at it. I do think there are contexts where it’s useful to know that you aren’t going to suspend for an arbitrarily long amount of time, like a UI handler, and having the ability to write APIs that require no effects would allow these kinds of guarantees.

So there you go, I started out wanting to understand error handling and instead learnt that everything I know about programming could somehow be linked back to one language feature. If you want to read more, I’d recommend starting with the Effekt and Koka language tours (I just skipped straight to the good bits).

Please take my explanations of how function calls and exceptions work here as illustrative rather than literal. I wanted to give an example of the kind of thing the computer is doing without getting too bogged down in how the computer actually does it. The aim of this is to think more about how control flow works in different languages, rather than how you’d actually implement it.

  1. Ok I haven’t asked mine. 

  2. More or less, depending on architecture I guess? I’m not a CPU instruction set expert, but this is the general concept. 

  3. The original version of this post incorrectly stated that C had no dynamic dispatch, but this is possible with function pointers, which can even be used to implement an object system. 

  4. Ignoring stack frames and suchlike and the fact goto can’t jump across functions. 

  5. Well apart from the macro language, which is kinda separate. 


Ruby Scripting Utilities

I think I’m pretty good at shell scripting: I quote my variables, I know the difference between $@ and $*, I know about checking that variables are set with the ? suffix. I’ve spent a lot of time messing around with shell scripts, but I always feel like the script that I write is almost always dictated by what’s easy or possible to do in the shell.

The main reason I stuck to shell was a self-inflicted concern about portability: if I only wrote shell scripts, I could have everything working on any platform without having to install additional dependencies or build custom executables.

Last year I decided that I’d had enough, and I gave myself permission to assume that any computer I was actually using would have Ruby installed. This led me to create an easy way to use Ruby expressions as replacements for sed or awk. I then replaced the script that installs all my Zsh, tmux, and Vim plugins with a simple Ruby script.

Ruby has been my go-to scripting language for ages, but now I’ll skip straight past a shell script and go right to Ruby instead. I’ve been using it for one-off scripts as well as small utilities.

The biggest problem is that I constantly end up writing this function:

def run(*args)
  unless system *args
    raise "command failed: #{args.join ' '}"
  end
end

If you’re not fluent in Ruby, system runs a subprocess that inherits the IO of the Ruby process, and returns true or false depending on the exit status of the program. The way I almost always want it to work is to just stop the whole program if something goes wrong, so I always write out this little helper.
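To make system’s contract concrete:

```ruby
# system returns true for a zero exit status, false for a non-zero one,
# and nil if the command couldn't be started at all.
ok   = system("true")                 # => true
bad  = system("false")                # => false
gone = system("no-such-command-here") # => nil
```

That nil case is easy to forget: a truthiness check alone won’t tell you whether the command failed or never ran.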

The next issue is that I’m really used to Crystal’s Process class, which makes it easy to manage subprocesses. Paired with Crystal’s event loop and IO module, it’s easy to read and write data to the program, or just spawn it and wait for it to finish.

You can do lots of stuff with Crystal’s Process, but in a script all I really want to do is:

  • Run a program and throw an exception if it fails
  • Run a program and capture its output (also throwing an exception if it fails)

My run method does the first. The second can almost be done with the special backtick ` method:

files = `ls`

This is great until you want to pass some arguments in, because it only supports string interpolation, not passing in separate arguments. That’s fine for a one-off script where I know the input, but I’d rather not worry unnecessarily about shell injection problems.
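If you do want to stick with backticks, the standard-library escape hatch is Shellwords; this sketch shows both the problem and the fix:

```ruby
require "shellwords"

file = "notes; echo pwned" # a filename that happens to look like shell syntax
# Interpolated directly, `ls #{file}` would run `echo pwned` as a second
# command. Shellwords.escape quotes it back into a single safe argument:
safe = Shellwords.escape(file)
output = `echo #{safe}` # echoes the literal filename, runs nothing else
```

Escaping every interpolation by hand is easy to forget, though, which is why passing arguments separately is the safer default.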

Ruby does have an alternative: Open3.capture2e. It’s not exactly the kind of fluent API I’d like:

require "open3"

output, status = Open3.capture2e('jj', 'commit', '-m', message)
if status.exitstatus != 0
  raise "jj command failed"
end
# now do something with output

So what I’ve done is make use of the RUBYLIB environment variable. It points to an additional place (or places) that Ruby will look when you require a file. Instead of having to bundle common code in a gem, or rely on writing out an absolute path to exactly where my common code is, I can just:1

require "process"

output = Process.capture 'jj', 'commit', '-m', message

I’ve added the two methods for calling external programs that I’ve been wanting, and then of course added a little helper library for interacting with JJ repos. You can see them in this commit in my dotfiles repo. The intention here isn’t to create something to be used by other people, it’s purely so I can write scripts more easily.
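Such a capture helper only needs a few lines on top of Open3. This is an illustrative sketch rather than the exact code from that commit:

```ruby
require "open3"

# Run a command with explicit arguments (no shell involved), return its
# combined stdout/stderr, and raise if it exits with a non-zero status.
def capture(*args)
  output, status = Open3.capture2e(*args)
  unless status.success?
    raise "command failed (#{status.exitstatus}): #{args.join(' ')}"
  end
  output
end
```

Since the arguments are passed separately, there’s no shell involved and no quoting to worry about.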

  1. It’s probably not the best idea to call this process, and chucking this in the RUBYLIB path could definitely cause a weird problem at some point. I just didn’t want to have a non-obvious name. 


Building Dependency Injection with Crystal Macros

When I first came across dependency injection I was a sceptic. Surely we could just create objects the normal way instead of worrying about modules and bindings? Eventually, though, I realised that an injector is just generating the boilerplate code that you’d otherwise write to pass the dependencies around manually, and writing that code yourself is a huge waste of time.

I then wondered how hard it would be to implement the whole thing using Crystal macros. The aim is to have all the dependency resolution happen at compile time, so any failures to find a dependency will result in a compilation failure, and you’re not paying the price of a hash lookup or something similar to find each dependency during construction.

Dependency Model

The model I’ve implemented is based on Dagger (and its cousin, Hilt) since that’s what I’ve used the most. How Dagger works is that every injectable type is “installed” in a component, and can access any dependency from that component or its parent component. In a Hilt Android app, this means that a dependency in the ActivityComponent can inject something from the SingletonComponent, but not the other way around.

Each injectable type also has a policy of whether a new object should be created each time, or whether Dagger should hold on to the object and share it between dependent classes. In Dagger these are “scopes” which everyone gets confused about because ActivityScope and ActivityComponent seem like they should be the same thing, but ActivityScope just says to keep the object for the life of the activity, and can only be used on dependencies in the ActivityComponent. The equivalent for SingletonComponent is just Singleton, which is confusing because it is a scope, but it’s not named that way.

I decided not to match Dagger’s nomenclature and instead invented my own terminology. Each collection of dependencies is a scope, which may have a parent scope (and the parent scope may also have its own parent). Every injectable type must have either a @[Retain] or @[Recreate] annotation that denotes whether it should be held onto or not.

What I like about this model is that you are effectively adding some lifetimes to objects in a language that doesn’t actually support them.

Implementation

The first trick that makes this whole thing work is defining the scopes as classes. They could be defined by annotations or just instances of a single Scope class, but as you’ll see later this lets us trick the compiler into validating our dependency resolution at compile time for us. Doing that would be much harder if the scopes were defined another way.

In a simple HTTP server, we might have scopes that look like this:

class GlobalScope < Scope
end

class RequestScope < SubScope(GlobalScope)
end

GlobalScope holds everything that is accessible anywhere in the application and lives until the process exits, then RequestScope holds things that are only relevant to a single incoming request and will be discarded once it has been handled.

Don’t worry about SubScope—we’ll get to that later.

Now the main thing we need is to generate the code that builds our object and its dependencies. There are a few ways of defining the macro to do this—you could define a macro on a module and call that from inside the class—but what I ended up doing was creating a generic module with an included hook. The actual code looks like this:

class RequestProcessor
  include Injectable(RequestScope)

  ...
end

Using a generic module means that at compile time we have access to T—the type of the scope—and @type—the class we’re building the injector for. Other approaches could get the same thing, but it fits really nicely with the generic module.

In the module we define a hook:

module Injectable(T)
  macro included
    ...
  end
end

The macro hook is invoked immediately on include, but we can do the classic trick of defining a finished macro inside that macro that calls another macro. That way we can run code after the whole class has been defined.

macro included
  {% verbatim do %}
    macro finished
      build_injector
    end
  {% end %}
end

I’ve written so many macros that I don’t even hesitate at the idea of a macro defining a macro that calls a macro. That’s just how I write code.

That build_injector macro does the actual work to generate the code. This happens in a few stages; I didn’t want to be overly prescriptive on which method would be called for injection, so you have to annotate it with @[Inject], which means we first need to find the right method. This is a bit clumsy in a macro:

{% method = nil %}
{% for m in @type.methods %}
  {% if m.annotation(Inject)
       unless method.nil?
         method.raise "multiple @[Inject] methods: #{m.name} and #{method.name}"
       end
       method = m
     end %}
{% end %}

{% if method.nil?
     @type.raise "no @[Inject] annotated method on #{@type}"
   end %}

After that we have method, which is a Def object. Looking at the arguments to a particular method is easier than trying to process instance variables, and only doing constructor injection instead of field injection (in Dagger parlance) gives a bit more flexibility to the class.

The next trick is to use the unsafe .allocate method to grab some uninitialised memory where we can put our object. We then just call the right initialize method (as defined by method) which will set the instance variables. That’s just a matter of generating a method call from the information in method:

instance = {{ @type }}.allocate

instance.{{ method.name }}(
  {% for arg in method.args %}
    {{ arg.restriction }}.inject(scope),
  {% end %}
)

This can be a bit hard to parse, so let’s look at an example. If we have this class:

class RequestProcessor
  include Injectable(RequestScope)

  @[Inject]
  def initialize(
    @params : URI::Params,
    @context : HTTP::Context,
  )
  end
end

Then the generated inject method would look like this:

def inject(scope)
  instance = RequestProcessor.allocate

  instance.initialize(
    URI::Params.inject(scope),
    HTTP::Context.inject(scope),
  )
  return instance
end

Note that @type is expanded to RequestProcessor in the macro.

Now, this doesn’t actually work because we need to be able to retain objects in the scope, so that if they’re injected in two places, both will get the same instance. What we’ve got currently will just make a new instance of every object every time anything is injected, which isn’t very useful.

In this inject method the initial change is quite simple; instead of calling the inject method on each class to create an argument, read it from the scope—that’s why we have the scope in the first place. We’ll assume that the scope has a .get method, and the change is fairly simple:

instance = {{ @type }}.allocate

instance.{{ method.name }}(
  {% for arg in method.args %}
    scope.get({{ arg.restriction }}),
  {% end %}
)

The problem is that we now have to go and write that .get method. Working out how to do this took some serious head scratching. The problem is that we need to generate some code that will look at the type that’s passed in, find out if it should be recreated or retained (and whether there’s an existing retained instance), then return the retained instance or create a new one.

Perhaps the most naïve way of doing this would be to have a Hash(Class, Object) that stores the objects, but Crystal doesn’t support using Class or Object as type constraints for instance variables, so that’s not an option.

I fiddled around with trying to do horrible unsafe things with pointerof() but that didn’t really get anywhere because even if I can store pointers to objects, I still need to know if I even need to store them in the first place.

I even thought about calling a specially-formatted method that would be handled by method_missing, parsed back into a type, and somehow resolved from there.

In the end the solution was much simpler; all it took was a realisation of how redefining classes and namespaces work.

If you do this in Crystal, you will add a new method to the String class:

class String
  def sup
    "Sup, #{self}"
  end
end

But if you do this, you will define an entirely new class called Geode::String that is unrelated to the top-level String class:

module Geode
  class String
    def sup
      "Sup, #{self}"
    end
  end
end

That works the same if Geode is a module, class, or struct.

I thought that because my macro was generating code that lived inside the to-be-injected class, I couldn’t patch new methods into other classes, so I couldn’t define a new field on RequestScope from within that macro.

In Crystal every type is resolved relative to the current module or class, so within Geode if you wrote String you’d get your custom class, not the actual string class. If you want to be unambiguous, you can prefix the type with double colons to turn it into an absolute path (just like in C++). So in Geode you can use ::String to refer to the actual string class.

What I didn’t realise is that you can do this same thing when you’re patching a class. So from within one class, you can patch a class in an outer module just by passing an absolute path. Since we have the type of the scope in our generic module as T, we can patch the class like this:

class ::{{ T }}
  {% if @type.annotation(::Retain) %}
    @var_{{ @type.id }} : {{ @type }}? = nil

    def get_{{ @type.id }} : {{ @type }}
      @var_{{ @type.id }} ||= {{ @type }}.inject(self)
    end
  {% else %}
    def get_{{ @type.id }}
      {{ @type }}.inject(self)
    end
  {% end %}
end

This checks whether our type needs to be retained, then either generates an instance variable and getter method, or just a getter method.

The getter method will be named something like get_RequestProcessor. I didn’t include it in the example, but since we can have generic types or types nested in modules, we actually need to strip any special characters out of the type name, like this: @type.stringify.gsub(/[():]/, "_").id. Since the macro language doesn’t let you define helper methods, every time we want to access the specially-named getter we have to duplicate that snippet.

Now we’re able to store the object, and we’ve got a method to access it, but we still don’t have our .get method. This requires another few tricks that will interact with our specially-named method.

Macro methods get instantiated separately for every type that calls them—at least conceptually. We can capture that type in a generic parameter and then call .get_{{ T.id }} to either get the retained instance or a new object:

def get(cls : T.class) : T forall T
  {% begin %}
    {% name = T.stringify.gsub(/[():]/, "_") %}

    {% if @type.has_method? "get_#{name.id}" %}
      self.get_{{ name.id }}
    {% else %}
      {% T.raise "#{T} not registered in #{@type}" %}
    {% end %}
  {% end %}
end

I’m actually combining this with has_method? in order to fail with a more sensible error message. This is how the compiler is tricked into doing our dependency resolution at compile time: it needs to generate all the instantiations of this .get method, and if it can’t call the right getter, then we’ve got an object that can’t be constructed through dependency injection.

Although that’s still only half the story. I promised earlier that I’d get to SubScope(T), and this is where that comes in. Since scopes are organised in a hierarchy, in order to have the compiler do type checking, that hierarchy needs to be represented in the type system. By making SubScope(T) a generic class, the child scope can have a concrete reference to its parent type, and the fallback .get method call is directly on the parent scope type, rather than being a dynamic dispatch on the Scope superclass.

That’s a lot to take in, so here’s the code, and then we can go through an example.

def get(cls : T.class) : T forall T
  {% begin %}
    {% name = T.stringify.gsub(/[():]/, "_") %}

    {% if @type.has_method? "get_#{name.id}" %}
      self.get_{{ name.id }}
    {% else %}
      @parent.get(cls)
    {% end %}
  {% end %}
end

Let’s say we have RequestLogger which is in RequestScope, and Logger which is in GlobalScope. RequestLogger injects the Logger. How does that dependency get resolved?

The Injectable(T) module macro will generate the inject method, which will build the call to initialize:

def inject(scope)
  instance = RequestLogger.allocate

  instance.initialize(
    scope.get(Logger),
  )
  return instance
end

In this method, scope is a RequestScope. The Crystal compiler sees that we’ve called .get with a parameter of type Class(Logger), and it generates a new specialisation of that method for us, and runs our macro code in that method.

The @type.has_method? check returns false, since Logger is registered in the GlobalScope, and so there’s no get_Logger method patched into the RequestScope. The code generated for .get looks like this:

def get(cls : Logger.class) : Logger
  @parent.get(Logger)
end

@parent is defined in the initialiser for SubScope(S) as being of type S:

class SubScope(S) < Scope
  def initialize(@parent : S)
  end
end

Since RequestScope inherits from SubScope(GlobalScope), the compiler knows that @parent must be a GlobalScope, and so it knows it needs to create a specialisation on that type to satisfy this .get(Logger) call.

In this case, GlobalScope is a regular Scope and so the generated method is slightly different. It does the check for whether there’s a get_Logger method defined—in this case there is, so it will delegate to that. If there wasn’t, then it will raise an exception and compilation will fail.
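Following the same pattern as the RequestScope example, the specialisation generated on GlobalScope would look something like this:

```crystal
def get(cls : Logger.class) : Logger
  self.get_Logger
end
```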

If we put enough @[AlwaysInline] annotations on these methods, then the get(Logger) call in RequestLogger should be able to skip right to reading the field from whichever scope holds the logger, without doing any method dispatches. Talk about zero-cost abstraction.

Provider and Lazy

Anyone who has used a good dependency injector will know that in some cases you can’t inject dependencies without deferring their construction. This lets you do something like:

class MyFeature
  include Injectable(GlobalScope)

  @database : Database

  @[Inject]
  def initialize(
    config : Config,
    old_database : Provider(OldDatabase, GlobalScope),
    new_database : Provider(NewDatabase, GlobalScope)
  )
    if config.use_new_database?
      @database = new_database.get
    else
      @database = old_database.get
    end
  end
end

The implementation for these is fairly straightforward:

struct Provider(T, S)
  def self.inject(scope : S)
    new(scope)
  end

  def initialize(@scope : S)
  end

  def get
    @scope.get(T)
  end
end

The biggest challenge here was that originally these weren’t generic over the scope, which meant the type of @scope was Scope+ (any Scope subclass). When the get(T) method was instantiated, it would be instantiated on the top-level Scope class, which wouldn’t have the special getter method defined, and the dependency resolution would fail. Making them generic over the scope adds a bit of complexity, but it’s necessary to have the resolution work at compile time.

Both of these classes have to be special-cased into the construction of the objects, which is a little messy. The actual call to the initialize method looks like this:

instance.{{ method.name }}(
  {% for arg in method.args %}
    {% if arg.restriction.nil?
         arg.raise "needs restriction on #{arg}"
       end %}
    {% if arg.restriction.resolve?.nil?
         arg.raise "Unable to resolve #{arg.restriction}"
       end %}

    {% if arg.restriction.resolve? < Provider %}
      {{ arg.restriction }}.inject(scope),
    {% elsif arg.restriction.resolve? < Lazy %}
      {{ arg.restriction }}.inject(scope),
    {% else %}
      scope.get({{ arg.restriction }}),
    {% end %}
  {% end %}
)

This will always create a new Provider or Lazy instance (neither wrapper type should be retained) and then the actual object creation is done in Provider#get and Lazy#get, which call @scope.get(T).
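I haven’t shown Lazy, but assuming it mirrors Provider and just memoises the first resolution, a sketch could look like this (the real implementation may differ):

```crystal
struct Lazy(T, S)
  @value : T?

  def self.inject(scope : S)
    new(scope)
  end

  def initialize(@scope : S)
    @value = nil
  end

  # Resolve through the scope on first call, then reuse the result.
  def get : T
    @value ||= @scope.get(T)
  end
end
```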

PartialInjectable

The other classic dependency injection pattern is a class that has some fields injected, and some fields passed in explicitly. Usually this is done by generating a factory class, which is exactly what I did. It was quite satisfying to be able to generate code that would invoke the macro that I’d just written to then generate more code, and have it all fit together.

The macros are fairly similar to the rest: they just generate an inner struct named Factory and use the splat_index to differentiate between injected arguments and regular arguments. The injected arguments are each automatically wrapped in a Provider, so that the dependencies are only resolved when new is called on the factory.

Qualifiers

If you want to inject two objects of the same type in Dagger you can use annotations to differentiate them. I thought about using annotations in Crystal to do the same thing, but decided against it as it would be a bunch more work and complexity. Instead you can implement a single-field struct that wraps the type you want to duplicate—I even made a convenient macro for it.
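As a sketch of the wrapper idea (the names here are made up), Crystal’s record macro makes single-field structs one-liners:

```crystal
# Two bare Strings would collide in the dependency graph, so give
# each one its own type:
record AdminToken, value : String
record MetricsToken, value : String
```

Each wrapper is then registered in a scope like any other type, so the two can never be confused.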

Scope Parameters

For subscopes that correspond to a particular action (like an HTTP request) you need to be able to put values into the dependency graph. In terms of code, this is as simple as defining the right method on the scope so that the has_method? check in the get macro method will find it.

I made a macro that helps define these, since writing them manually would be messy.

class RequestScope < SubScope(GlobalScope)
  params [
    context : HTTP::Server::Context
  ]
end

Then when you make the scope, you pass in the parameters: global.new_request_scope(context: context). These will then be available to inject to any class in that scope.

Using it

Here’s a somewhat contrived, overly simple example of where this might be useful. We’ve got two scopes just like the rest of the post, a global config, and a processor that is only used for one request.

Here are the scopes and config:

class AppConfig
  getter do_stuff : Bool
end

class GlobalScope < Scope
  params [
    config : AppConfig
  ]
end

class RequestScope < SubScope(GlobalScope)
  params [
    context : HTTP::Server::Context
  ]
end

Here’s the actual interesting stuff. What’s neat is that we don’t have to plumb Logger or AppConfig into the Handler, and if we decide that Logger should be request-scoped in order to log with embedded request information, we can just move it into that scope and not have to do all the rewiring.

@[Retain]
class RequestProcessor
  @[Inject]
  def initialize(
    @context : HTTP::Server::Context,
    @logger : Logger,
    @config : AppConfig
  )
  end

  def process
    @logger.log { "Processing request!" }
    ...
    if @config.do_stuff
      write_data(@context)
    end
  end
end

@[Retain]
class Handler
  include HTTP::Handler

  @[Inject]
  def initialize(@global : GlobalScope)
  end

  def call(context)
    req_scope = @global.new_request_scope(context: context)
    processor = req_scope.get(RequestProcessor)
    processor.process
  end
end

Now we just need to set up the application by loading the config, creating the root scope (and passing the config in), and starting our server. The server could also be constructed by dependency injection if we really wanted.

config = AppConfig.load_from_file

global_scope = GlobalScope.new(config: config)
server = HTTP::Server.new [global_scope.get(Handler)]
server.bind_tcp "0", 80
server.listen

But Why

Surprisingly, I didn’t do this entirely for my own amusement; I actually had a use case that dependency injection would have made easier. I made some changes to my SSH honeypot that I wrote years ago so that instead of just logging the commands, it would run them in a container. It would hash the username, password, and remote address to create a container name so that repeated connections would be run in the same container.

I then wanted to change this so that I could optionally share containers based on some logic. I would usually do this by having a ContainerDispatcher or something that would be owned by the SSH server, and pass a reference to it to each PodmanConnection when a client was connected. The PodmanConnection would ask the dispatcher for a container, and since it would have visibility into all running containers, it could return an existing one or create a new one.

This is a tiny microcosm of where dependency injection is useful. The connection should exist in its own scope, and request a dispatcher from somewhere. It doesn’t care if that dispatcher is in the same scope or in the parent scope—it just says “give me the thing that will let me get containers”. The SSH server doesn’t really need to know that the dispatcher should be shared among all connections, it doesn’t even really need to know about dispatching between containers at all.

I know dependency injection is a hallmark of over-architected enterprise software, but in cases like these I think it’s a useful tool.

You can see the full implementation in Geode. If I end up using this in my projects, no doubt I’ll make some changes as I find pain points and other shortcomings.


Why Containers?

There seems to be a recurring sentiment that pops up every now and again on Mastodon: “this project looks interesting, but the installation instructions say to use Docker, so I’m not interested anymore”. Now I totally understand the sentiment. If I came across a project whose installation instructions told me I’d need Nix, I would also do a quick 180.

However, I’ve been using containers for development and for the services I run at home for a few years now, and I quite like it. So I thought it might be interesting to explain what I get out of containers, some of the bad bits, and right at the end some feelings about the mismatch between container enthusiasts and container skeptics.

Use Podman

I’ll get this out of the way: I don’t actually use Docker, I use Podman. Aside from being more permissively licensed, it’s also much easier to install on basically any Linux system, as it’s available in most distros’ package repositories. On Ubuntu I can just apt install podman and on Alpine I can apk add podman. The Docker installation instructions are much more involved.

Ease of installation is the main reason I would recommend Podman, maybe second only to the fact that Podman runs rootless by default, which reduces the chances of a rogue container stomping over something on your machine that it shouldn’t.

Process isolation

By far the biggest benefit of using containers for me is low-effort process isolation. I don’t like the fact that if I run something myself it’ll have access to all the data on my system and be able to read and upload it, or simply just corrupt or delete it.

You can of course get this same benefit by running services as different users, which most well-behaved services installed by system packages should be doing anyway.

Running services as different users does make sharing data between services more difficult, and I’m not very good at managing groups. Being able to just restrict the paths that the container has access to is really simple both to do and conceptually. I don’t have to think about who is a member of which group and what access that grants them, I just say “you can access this folder” and that works really well for me.

The best part of this isolation is that when the container is gone, everything else goes with it. If the process was writing logs, config, temporary files, whatever, it’s all contained within the container. Once I’ve done podman rm I know that everything belonging to that container has been purged.

This makes trying new software a lot more like apps on a phone; you know that the risk of installing is very limited, and when you uninstall you know that everything gets removed. The exact opposite of installing software on Windows XP, where you’d have to go through a wizard, it would require full admin access, and it could make any change to your machine it wanted.

This isolation can also flip who is in control of the program. For example, some program might only support listening on a particular port or saving its data to a particular directory. This is of course bad software. Nevertheless I have the power to say “no thanks” and re-map the port and filesystem locations so from my perspective the program is working the way I want.
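As an illustration (the image name and paths here are made up), all of that re-mapping happens in the podman run flags:

```shell
# Inside the container the program still listens on its hard-coded 8080
# and writes to /var/lib/someservice; from the host's perspective it's
# port 9090 and a directory of my choosing.
podman run -d --name someservice \
  -p 9090:8080 \
  -v ~/services/someservice/data:/var/lib/someservice \
  example.com/someservice:latest
```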

Opaque storage

Since I’m in full control of what data in the container gets persisted, I have much more control about where it actually lives on my system, and thus how it gets backed up.

For all the containers that actually store data (the most important being FreshRSS), I map their data into a Podman volume, which I can treat as an opaque blob. Currently I have a script that takes these volumes, exports them to TAR files, and uploads them to my NAS.

I’m definitely missing out on incremental backup here, as the TAR file requires a full upload each time. This is not perfect.

Processes versus programs

I like the fact that I can think of my programs more like programs instead of having to deal with OS processes. Perhaps I’d get this same benefit if I was good with systemd, but I’m not, so here we are. Instead of haphazardly calling kill with copy-pasted process IDs, I can just podman stop or podman restart with the container name—and it’s a name that I can choose myself.

This grouping of processes also exposes better monitoring—on Ubuntu, at least. I use prometheus-podman-exporter to grab metrics for each container, so I can see where I’m spending my RAM. Annoyingly, due to some cgroups issue on Alpine you just get aggregated stats, not per-container stats. So I can’t actually take advantage of this for containers on my Alpine servers.

Remote management

Something I like about Podman specifically is podman-remote. I’ve written about this before; it’s the secret behind my container dashboard.

Since I can interact with the containers on my local machine in just the same way as I do the ones on my other servers, it is really easy to write scripts that deal with both transparently. Endash merges containers running on my two home servers and a VPS into one interface, even with a mix of Ubuntu and Alpine between them.

Even just for local containers, having an interface to get details about what’s running in a computer-readable JSON format is really convenient. With Endash I can just list the currently running containers and get the ports that they’re listening on to include as links in the dashboard. I can introspect the volumes they have access to, how long they’ve been running, and more all from one API.

The rule of two

This is more geared towards development, but it is helpful for any kind of debugging. Since the process running in the container has no idea about what’s happening in the real world, it’s really quick and easy to run a second instance of some service.

All I have to do is assign it a different port and mount a different directory—or mount no directory to get a fully clean environment—and then I can have both running in parallel.

Paired with the fact that all the data can be isolated to a particular volume, I could duplicate the volume, run a new container pointing to the second copy of my data, and experiment with some alternative configuration or a major version upgrade. All with the peace of mind that if it doesn’t work out, I haven’t actually done anything to my actual service.
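With Podman, that experiment is just a handful of commands (all the names here are illustrative):

```shell
# Duplicate the data volume via export/import, then run a second
# container against the copy.
podman volume export app-data --output app-data.tar
podman volume create app-data-copy
podman volume import app-data-copy app-data.tar
podman run -d --name app-experiment \
  -p 9091:8080 \
  -v app-data-copy:/data \
  example.com/app:experimental
```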

Dependency Heaven

Dependency hell isn’t particularly likely if you’re installing the mainstream versions of packages from the package manager that comes with your distro. People have managed to install a lot of packages at once.

I don’t think I actually run enough services for this to be a serious issue, but in theory you could have services that require mutually incompatible versions of shared libraries, command-line tools, or suchlike. Being able to run these without messing with load paths or $PATH is convenient.

The biggest advantage of this dependency isolation is development, where you can hack around with the system safe in the knowledge that you won’t break something important.

Development

I originally came to containerisation as a way to control my development environment, rather than a way to run things on my home server. For development, using podman directly kinda stinks: the commands are exceptionally verbose and easy to get wrong. So much so that I wrote my own program to make this easier.

Over two years later and I’m still mostly working this way, but I did walk it back a little. I use cargo directly on the system and I gave up on using containers for one-off scripts as it was just too much friction. But containers have made it exceptionally easy for me to write a little server, package it up, and keep it running on my home server. I’ve currently got 6 of my own containers running across 3 different servers. Running, deploying, and monitoring them is straightforward.

The other development advantage is being able to hack around in a safe sandbox to try and get something working. I try not to install too much weird stuff onto my development machine, especially old tools that might conflict with a more recent version that I rely on. If I containerise this, I can do almost anything without affecting my actual machine.

As a concrete example, I recently wanted to find out why jekyll-admin would show an error on basically every page load. To do this I needed to build the project; it’s built with React, which means installing Node.js and Yarn. Some part of this build process requires a Python interpreter, and the versions that jekyll-admin requires are old enough that it would only work with Python 2.1 Now there’s no way that I’d pollute my computer with a Python 2 install, but in a container I can hack around with whatever I want, knowing that it can all just disappear once I’m done.
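The throwaway-environment trick is roughly this (image tag and paths are just examples, not the exact ones I used):

```shell
# Start a disposable container with the project bind-mounted in;
# --rm means the container, and any ancient tooling I install
# inside it, vanishes when I exit. Only changes to /src survive.
podman run --rm -it -v ./jekyll-admin:/src -w /src \
  docker.io/library/node:10 bash

# Inside the container I can install whatever cursed old tooling
# the build needs, e.g. on a Debian-based image:
#   apt-get update && apt-get install -y python2
```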

I did the same thing while trying to work out why I still couldn’t use websockets in Swift. Firstly Swift only supports a few Linux distros,2 so being able to fake Ubuntu from within Alpine is a useful trick, but also the availability of the libcurl version depended on the underlying Ubuntu version, and I could just swap between two versions—or even run both at the same time—trivially.

Running code in containers also forces you to understand what implicit dependencies you’re pulling in: usually development headers for some library, the build-essential package, or maybe just tzdata. Usually these are things that you install and forget about, then when you go to help someone else it’s really hard to remember what you need to do, or even to know what you need in the first place. Having a containerfile doesn’t guarantee reproducibility—it might reference an image that changes—but at least there’s some intention there that can be reverse-engineered.

The bad bits

If you use containers, then you need to trust whichever registry you pull from, just like you trust the system package manager. I don’t know much about the relative security tradeoffs, but if you pull from a registry you’re probably running a newer version, which might have undiscovered flaws. The package manager maintainers might be altering or re-building packages in a way that you find valuable, but they also might be doing something you disagree with.

Trust is definitely something I have in the back of my mind. I don’t have a particularly robust approach here, but if there isn’t a prebuilt image that I feel comfortable with, I’ll just build my own from a standard OS image (probably Alpine) and install the package using apk or apt. That way I’m still getting a container, but using the package from the distro.

The podman CLI is complicated. Somewhat necessarily so, given the amount of stuff it can do, but nonetheless it is really an interface best used by computers, not by people—which I’ll get to more at the end.

Podman networking (or container networking in general) is terrible. I’ve spent ages trying to understand what you can and cannot do, only to run into dead ends and hard-to-debug problems. The key to a happy life is to set ufw to block all incoming connections except on ports 22, 80, and 443, have containers bind to particular known ports, and use Caddy as a proxy.
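Sketched out, that happy-life setup looks something like this (the port numbers, container name, and hostname are examples, not my real configuration):

```shell
# Firewall: deny everything inbound except SSH and HTTP(S).
sudo ufw default deny incoming
sudo ufw allow 22/tcp
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw enable

# Containers bind to known ports on localhost only, so they're
# never directly reachable from outside the machine.
podman run -d --name myapp -p 127.0.0.1:8080:80 localhost/myapp:latest

# Caddy terminates TLS and proxies to the container, with a
# Caddyfile entry along the lines of:
#
#   app.example.com {
#       reverse_proxy localhost:8080
#   }
```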

The opacity you get from using volumes as storage can turn around and bite you when you want to quickly grab something from a volume. The experience is about as good as trying to scp something without knowing the exact path. The less I have to do this the better, and if I’ll need to look at the files with any regularity I’ll use a bind mount instead.
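When I do need something out of a volume, the least-bad option is a throwaway container that mounts it read-only (volume name and file path here are hypothetical):

```shell
# Poke around to find the file, since you probably don't know the
# exact path up front.
podman run --rm -v appdata:/data:ro docker.io/library/alpine ls -R /data

# Then copy it out via stdout.
podman run --rm -v appdata:/data:ro docker.io/library/alpine \
  cat /data/config.toml > config.toml
```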

I’m definitely over-invested

You’ve probably realised now that I’m over-invested. I’ve picked containers as the backbone of how I use computers and there’s no going back now. I’ve written a tool for managing them and then rewritten it in Rust and written a custom web interface to view my containers. Without this, I’m lost.

This has definitely given me an appreciation for what you can do with containers, more so than if I’d pasted in a few podman run commands to get something working. A large part of the reason why I wrote these tools was because I didn’t understand containers, and the way I get an understanding is to get right down into the weeds. I wouldn’t be as comfortable using containers as heavily as I do without having my own system around them.

Putting it all together

Just to make it super clear, I have a custom tool to build, run, deploy, and update containers, just because I didn’t like any of the existing tools. In the shipping container analogy, I’ve built my own boat, harbour, dock, and crane system. So if you find yourself thinking that a particular podman command is hard to remember or complicated, just know that I’ve spent hours writing thousands of lines of code to manage that complexity.

Physical shipping containers aren’t very useful if you don’t have a ship that’s built to transport them. In fact it’s more difficult to cram shipping containers onto a pre-containerisation vessel than it would be to just carry the cargo directly.

That’s the key, really: I’ve built a whole system with containers as the foundation, which has made it really easy for me to use containers. But podman doesn’t come with a system; it’s just a box you can put stuff in, and that’s only half the answer.

To get real value out of containers you need a vessel. Whether that’s podman-compose or Kubernetes, you need something that will let you take advantage of everything being the same shape.

For someone who just wants to run something on a server they administer themselves, these tools are a whole new system compared to the way they’re used to working. When a project says “run this with Docker”, what they’re doing is asking you to fit a 40 foot container onto a dinghy.

I think that containers at this scale should be treated as a tool instead of a whole system; developers can package their application and its dependencies in a standard way across all distros, and have it run in the same environment without needing to adapt to the actual machine it’s running on. It’s then up to the distro to provide a way for the user to manage the application without knowing it’s backed by a container. Let the container be a packaging tool.

With that in mind, if you’re still reading, the tool that gets as close to this as possible is podman quadlets, which let you run containers as systemd services. You still have to know they’re backed by a container, but it can be managed the same way as other services on your system.3
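A minimal quadlet looks something like this (the unit name, image, port, and volume are all made up for illustration); drop it in `~/.config/containers/systemd/` and it becomes a user service:

```
# ~/.config/containers/systemd/myapp.container
[Unit]
Description=My little server

[Container]
Image=docker.io/library/myapp:latest
PublishPort=127.0.0.1:8080:80
Volume=appdata:/data

[Install]
WantedBy=default.target
```

After a `systemctl --user daemon-reload`, it starts, stops, and logs like any other unit: `systemctl --user start myapp`, `journalctl --user -u myapp`.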

  1. The end result was that I tried updating some of the dependencies to work with the current version of Node, realised it would be a serious effort, then gave up. That would have happened with or without containers though. 

  2. Ubuntu, Debian, Fedora, RHEL, and Amazon Linux at the time of writing. 

  3. I’d read this as a nice overview of how and why to use quadlets. 


Upgrading to Jekyll 4.4

This all started as I was getting confused about how syntax highlighting broke on my website, and what GitHub could possibly have done to break it. As it turns out they didn’t do anything, but along the way the idea of moving to an actions-based website started to seem less daunting.

Originally GitHub Pages only supported one way of building a website, which was with Jekyll and a fixed set of plugins. You couldn’t write any custom code (apart from in Liquid templates) or depend on additional gems. Later they added support for building the website from GitHub Actions, allowing the use of custom code, dependencies, or even swapping out Jekyll entirely.

I had been keeping a list of all the things I could do if I moved to using a custom build instead of the default Pages setup. I only wanted to move if I had reasons to actually justify doing it, rather than just complicating the deployment for no real benefit. This list was getting to a reasonable length, so I started to consider taking the plunge.

The thing is that GitHub Actions feels bad. Setting it up correctly and keeping it working just seemed like a lot of effort, whereas the previous branch-based setup had already been working for me for literally a decade. Sure there are some hiccups, but it doesn’t require any extra YAML files.

So instead I set up a mirror of my website on GitLab Pages. GitLab CI is much easier to set up; the most basic config can just be “use this docker image, and run this command”. You then just put pages: true on a job that writes to public/ and you’re done.

Here’s the whole config:

create-pages:
  image: ruby:3.4
  script:
    - gem install bundler
    - bundle install
    - bundle exec jekyll build -d public
  pages: true
  only:
    - main

It’s a little more complicated if you want to cache the result of bundle install to save time, but what I really love about this is I can see exactly what’s going to happen. If there’s some problem, I can just pull ruby:3.4, run a new container with my website in it, then run the commands in the script section.
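That debugging loop is just (assuming the site checkout is the current directory):

```shell
# Same image as CI, with the site bind-mounted in.
podman run --rm -it -v .:/site -w /site docker.io/library/ruby:3.4 bash

# Then, inside the container, the exact commands from the
# script section:
#   gem install bundler
#   bundle install
#   bundle exec jekyll build -d public
```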

Jumping ahead a little, I did end up configuring GitHub Pages, but the config is much longer: it has to configure permissions, you need a special action to actually check out your code, as well as two more actions to set up and deploy to Pages.

Interestingly while the basic setup is simpler, GitLab doesn’t auto-compress your files like GitHub does. You have to manually create .gz versions of each file to have them be served with gzip compression. This is a little inconvenient, but they do support serving other compression schemes like Brotli:

find public -type f -regex '.*\.\(htm\|html\|xml\|txt\|text\|js\|css\|svg\)$' -exec gzip -f -k {} \;
find public -type f -regex '.*\.\(htm\|html\|xml\|txt\|text\|js\|css\|svg\)$' -exec brotli -f -k {} \;

Brotli cuts 1kB off the homepage of my website (4.8kB versus 5.9kB) which is a nice improvement, so all in all I’d accept this slight increase in complexity for smaller response sizes. GitHub is limited to whatever they decide to serve, which currently is just gzip applied automatically where they see fit.

So I had a copy of my website on GitLab Pages working with compression and everything on an auto-generated gitlab.io domain. Everything worked, but you can tell just by looking at it that it’s slower to load. I’m very used to my website loading almost instantly because I pointlessly code golf down the size of the HTML and CSS to be as small as possible. I sent a link to a friend who I assume doesn’t check my site as obsessively as I do, and asked “what’s the performance like?” Their immediate response was they thought it was just a bit slower than my actual site.

Right now if I look at the timing in the web inspector, the increase in load time is entirely in the browser waiting for the server to start sending the actual data. Establishing the SSL connection takes 300ms versus 19ms, then waiting for the response takes 500ms versus 8ms. Downloading the actual data from both is 0.1ms. Time to first byte is 817ms versus 32ms.

Doing a bit of a traceroute seems to indicate that GitLab is getting served from somewhere in Missouri, whereas the GitHub response headers include x-served-by: cache-syd10177-SYD so my request probably isn’t going further than 50km.

I also used the Pingdom website speed test which is probably a bit more scientific than doing random requests from my laptop, and it shows a similar story: GitHub spends almost no time (13ms) establishing the SSL connection, whereas GitLab takes 700ms. Moving the request source from Sydney to San Francisco cuts this down to 300ms, so I’m definitely paying a tax for being on the wrong side of the globe.

It’s fun to see that all the responses are smaller because of the Brotli encoding on the GitLab version, but if you’re spending 700ms initialising the connection that doesn’t really matter.

Probably the only thing that would make me use GitLab Pages would be the ability to configure cache expiration. Both hosts set a default Cache-Control: max-age=600 header to cache the response for 10 minutes, but for web fonts, CSS, and my tiny JS file, it would be great to set this to be much longer.

So it seemed like the best option was to stick with GitHub. I know there are plenty of other static site hosts, but I didn’t really set out to do an exhaustive comparison. Adding a different service would likely complicate my build process even more, and the whole reason I started looking at GitLab was that their CI is so much simpler than GitHub’s.

I rolled up my sleeves and flailed around with GitHub Actions YAML until I had a working site. I actually made a new repository so I could just commit and push over and over until it worked, then squash it all down into one perfect commit and push that to my actual repo, saving my git history.

The jump from Jekyll 3.10 to 4.4 (and updating all the other dependencies at the same time as well) did expose some issues.

Either Rouge (the syntax highlighter) or Kramdown (the markdown processor) has stopped adding a highlighter-rouge wrapper div around code blocks with no highlighting. I was using this in my CSS, and so any non-highlighted code blocks didn’t get styled correctly. This actually ended up being quite convenient, as it forced me to delete and re-write my CSS with respect to <pre> and <code>, resulting in simpler rules.

Kramdown also changed the HTML generated for footnote links, so they no longer have role="doc-noteref". I was using this for styling and all my footnotes jumped back up to being superscripts. This was another easy fix: just change the CSS selector to be sup:has(.footnote).

With the new Sass version, I started getting a lot of deprecation warnings for lighten(), darken(), change-color(), and @import. The colour adjustments could just be rewritten to use color.change() and color.adjust(). Correcting the imports turned out to be trickier, as I’d just split the file somewhat arbitrarily and this didn’t really fit with how Sass wanted imports to work. Instead of working out how rules, variable declarations, and functions should be separated, I just put everything into one file. Maybe I’ll work something smarter out in the future, but for now it works with no warnings.

So after all that I had a website that looked and worked just like it did before. Thankfully I had my list of improvements I could make, and now there was nothing to stop me.

The first thing I did was write a custom Jekyll converter that uses a custom Kramdown converter to always add loading="lazy" to <img> tags. I’d been doing this manually by tacking {:loading="lazy"} onto the end of every markdown image, but now it’ll just happen by default.
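I haven’t reproduced the actual converter here, but the core transformation is easy to sketch in plain Ruby (the function name is mine, and a real converter would work on the Kramdown element tree rather than regexing HTML):

```ruby
# Add loading="lazy" to <img> tags that don't already declare a
# loading attribute; tags that do are left alone.
def lazify_images(html)
  html.gsub(/<img\b(?![^>]*\bloading=)([^>]*)>/) do
    %(<img loading="lazy"#{$1}>)
  end
end
```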

Next I removed the Liquid template that I used to get more accurate publication times for my latest post. Now it is just a custom Liquid filter, so I just have to write post.date | smart_date | date: site.date_format instead of including and capturing a template.

The next thing is slightly cursed. In both the JSON and RSS feeds any code block has the HTML markup to be syntax highlighted, but feed readers don’t have the CSS and so they’ll always render it without styling. Including it in the feeds is a pure waste of bytes, but there’s no great way in Jekyll to render posts differently depending on where they’re being included. Instead I wrote another filter that will use a regex to parse the HTML and strip any <span> tags from within <code> blocks. Obviously it depends on how much code is in the post, but this cut down the size of the RSS feed by 82kB. And you know how I feel about code golfing the size of my pages.1
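The idea is roughly this (a plain-Ruby sketch, not the actual filter): find each <code> block and strip out the highlighting <span> tags, leaving any spans outside code blocks untouched.

```ruby
# Remove <span> markup inside <code> blocks, leaving plain text
# for feed readers that can't apply the highlighting CSS anyway.
def strip_code_spans(html)
  html.gsub(%r{<code[^>]*>.*?</code>}m) do |block|
    block.gsub(%r{</?span[^>]*>}, "")
  end
end
```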

A minor fix I made is the character used in the footnote backlinks—the link in the footnote text at the bottom of the post that jumps back up to where the footnote is referenced. The default character is ↩ (U+21A9, LEFTWARDS ARROW WITH HOOK), which renders differently on iOS versus MacOS. On MacOS it’s similar to other HTML arrows and renders like a letter, like this: ↩︎. On iOS it renders like a colour emoji, a blue shaded box with a white arrow in it, similar to 🆒.

This has always bugged me. I have no idea why there’s this inconsistency, the emoji character looks out of place. I went to see where in Kramdown I needed to make a change to get a different character, and to my surprise I found out there’s already a feature to replace the character—I could have had a different one all along. I chose to swap it to ↑, which I like more as a “jump up to where this was mentioned” link.

There’s actually some interesting discussion on the feature request in Kramdown. You can just add another code point afterwards and force it to display in text mode, but I’m not particularly attached to that character and will stick with the up arrow.

Now we get to bigger changes. Up until now the way to view posts with a particular tag is to go to the tags page and scroll to find the right tag. This is fine, but not particularly nice. Now that I have unlimited power, I can write a custom generator that creates a new page for each tag that’s been defined. Now there’s a dedicated page for each tag which gives me a little more room to group the posts by year instead of just in one big list.

Probably the biggest change is adding a list of related posts to the footer of each post. Previously I just had links for the next and previous post. I don’t even know why I had those links, it must have been in the Jekyll example template or somewhere in the documentation. However since I’m not writing a series it’s not really that important to specifically go to the next or previous entry.

Instead I’ve written some logic in a filter that gives me a certain number of relevant posts to include. Right now this will be posts that share any tags with the current post. I didn’t want to completely throw out the next/previous links, so the related posts list will always include those as well. This gives it a little more variety, and any posts without tags will still get two links at the bottom.
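The selection logic is roughly this (a plain-Ruby sketch with made-up names, not the actual filter, which works on Jekyll post objects):

```ruby
# Minimal stand-in for a post: a title and a list of tags.
Post = Struct.new(:title, :tags)

# Pick posts sharing any tag with the current one, always keeping
# the next/previous neighbours, capped at a limit.
def related_posts(current, all_posts, neighbours, limit: 4)
  shared = all_posts.select do |p|
    p != current && !(p.tags & current.tags).empty?
  end
  (neighbours + shared).uniq.first(limit)
end
```

Untagged posts have no shared-tag matches, so they fall back to just the two neighbours, exactly as described above.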

The obvious other thing to do would be to use third-party plugins or my own Ruby code to generate the RSS and JSON feeds. I’ve held off on this because while templating JSON or XML isn’t the best idea, the templates are pretty good at this point and have been working without me fiddling with them for years. Maybe if I want to add something more complicated to the feeds, but for now I think they’re fine as they are.

Another bit of work would be JavaScript-free footnotes. I’ve recently added a few lines of JS to make footnotes open in a popover instead of just jumping down to the bottom of the page, but it would be nice to do this with no JS at all. Now I’ve got complete control over the HTML generation, maybe there’s a better option here.

If you want to make the same move yourself, you can see my GitHub Actions config as well as my GitLab CI config which are both currently working to publish the site live and to the GitLab mirror.

  1. I just had a thought that I could use this same thing to strip out any <span> with a class that I don’t actually apply a style for. Actually I could include that directly in the custom markdown processor. You see, this is the kind of rabbit hole I was concerned about. 


Help I'm Stuck in a Photo Management Factory and I Can't Get Out

Photo of the convent at Izamal, México

I feel obliged to include a photo, since this post is about photography.

In 2019 I got into this whole photography business and used my iPad for photo editing in Photomator. This went pretty well until I started to shoot raw photos instead of JPEGs, and quickly ran into the meagre 250GB internal storage limit of the iPad.

At the time I wrote about how limitations of the iPad make using it as an exclusive photo-editing device impractical. Basically the only way to get photos off the iPad was to upload them to a cloud service—at the time I had a terrible internet connection, so this was basically infeasible. On paper it was possible to export to an external drive, but for any number of photos to actually make a dent in my growing library, this was unsupported.

The most reliable thing to do would be to copy the photos onto a “real” computer using the MacOS “image capture” utility, where you could then dump them onto an external drive or whatever you desired.

Eventually I decided that if I needed a real computer in order to use the iPad, I should just edit on the computer to begin with, and so I moved from the iPad to an M1 MacBook Air with a nice 1T SSD.

In 2022 I wrote:

My photo library currently takes up 500G and I’m not looking forward to having to split it.

How right I was.

At the end of 2023 my photo library burst out of my laptop like an extra-terrestrial creature from the chest of an unsuspecting space-tug crew member. I did exactly what I had planned: I carefully duplicated, backed up, and split my photo library in two. One library with everything before 2023 would live on an external SSD, the other with all my new photos would live on the laptop’s internal storage.1

I could continue to import and edit any new photos into the photo library on the laptop just as I had been doing. If I wanted to look at the old photos, I’d just have to plug in the SSD.

Except it’s not quite that simple. To open an alternate photo library you have to hold the option key while Photos is launching, then select it from a list (or navigate to it in Finder).

To actually edit something in a third-party editor (like the aforementioned Photomator) you have to go into Photos’ settings and make the current library the system photo library. This is because the third-party app has no knowledge of anything other than a monolithic system library. You have to redirect that API to your external drive, then re-redirect it back once you’ve finished.

The end result was that all my photos before 2023 were lost media. Dead and inaccessible to the world.

So sometime last year I moved everything onto a bigger 2T SSD2. This required a whole extra process where I merged the two libraries back into one using PowerPhotos. Multiple backups, merged, checked. Finally I had everything in one place with a path forward: the 2T SSD left me 700GB of headroom, and if I filled that I could just go and get a 4T or 8T drive and copy everything across.

This even works better than just buying a new laptop with more internal storage, since the MacBook Air doesn’t (currently) come with more than 2T of storage. So even if I’d spent the extra $600 to go from 1T to 2T, going to 4T means getting a bulky MacBook Pro.3

It seemed like a great solution. It’s even a supported way to use the Photos app and mentioned by internet-resident Photos experts.

What this documentation doesn’t mention is that the system expects the library to always be available. The processes that slowly dawdle through your library identifying faces and whatnot will keep the database open, so the drive basically can’t be safely ejected.

Of course you can just rip the cable out or click “force eject” and gamble with data integrity every time. Plenty of people seem to think that ejecting or unmounting is a thing of the past, from the era of floppy disks and CD drives.

Most of the time when I want to unplug the drive I’d have to resort to force ejecting it, since you can’t stop the Photos system from retaining its access to the library. I don’t know if this is what caused it, but fairly often Photos would open the library and have to spend a few minutes “repairing” it before you could do anything.

Other times it would completely fail to open the library with absolutely no recourse, just an error message saying “the library could not be opened”. Through completely dumb luck I worked out that opening PowerPhotos would kick Photos back into gear and it would load the library.

While I haven’t actually got into an irrecoverable state yet, it’s disconcerting seeing your carefully organised photo collection fail to load every so often. Maybe a future OS update will fix whatever causes it to get into this state—but maybe an update will stop PowerPhotos’ ability to kick it back into shape?

Then even after all this trouble, that’s just the photo library. The edits from Photomator (and Pixelmator Pro) are saved in “sidecar” files in ~/Pictures/Linked Files (this has moved around a little). There isn’t a supported way of storing these on the external drive. I have tried using symbolic links to keep all the files for Pixelmator Pro and Photomator together in one folder, but at least when I last tried, one or both of them wouldn’t follow the link and would just fail to save the file.

So here I am, 1.3T of photos sitting on an external drive in a photo library that might be corrupted any time I go to use it.

It won’t come as a surprise that since I filled up my internal drive, the number of times I’ve gone out to take photos has dropped off dramatically. Some of this is me spending my time on other interests, but a large part is the sense of dread I get knowing that every photo I take is just digging me further into a pit of data management hell that I have no way out of.

I don’t know what the solution is; no one seems to enjoy using Lightroom. Affinity Photo has been merged into one mega-app, but it didn’t have photo library management in the first place anyway.

There are interesting other options like Aspect, but it’s only organisation software—no editing. They say it’ll recognise “popular” editors, but the exact details of that could make or break the workflow for me. I really value being able to flick between photos and go from viewing to editing with minimal faff. This was the main reason for me to buy Photomator even though it doesn’t offer different editing capabilities from Pixelmator Pro.

Even if I did find another system, it would likely require a substantial (stressful) migration of my existing library. Would a move just be digging further into the hole, or would it actually get me onto a more sustainable path?

  1. Honestly this was such a process it could be a post of its own. 

  2. A Samsung T7 drive. I’ve got that and a T5 and they seem good. I use SanDisk MicroSD cards but am scared of their SSDs failing. 

  3. Going from a 2T MacBook Air to the cheapest MacBook Pro with 4T costs an extra $2000. Getting 8T costs $3200 on top of that. 


My Website Broke and You Won't Believe Why

When I published my last post I did my usual quick check on the real website, just to make sure it had published, and find the obligatory mistakes that only appear once it’s public. I quickly noticed that the XML code block didn’t have any syntax highlighting, just a plain unstyled <code> section.

My site uses Rouge for syntax highlighting, which I think is the default for GitHub Pages sites that are built with the (now legacy) non-actions system. I’ve never included an XML code block so maybe Rouge doesn’t support it, but it supports so many languages it would be a significant omission.

It’s weird that I hadn’t noticed this while writing when I was running the site locally, and sure enough I hadn’t noticed it because the local site was working exactly as I expected with beautiful hand-crafted syntax highlighting.

So it works locally, but doesn’t work on the live site.

I run the site locally in a container and install all the dependencies through the github-pages gem, which should track the exact version of Jekyll and of all the available plugins, so my local version should be exactly the same as the live one.

Inspecting the actual HTML of the local site versus the live site, there’s a pretty obvious difference. Here’s the markup for the local site (truncated):

<div class="language-xml highlighter-rouge">
  <div class="highlight">
    <pre class="highlight">
      <code>
        <span class="nt">&lt;svg</span> <span class="na">xmlns=</span>
        <span class="s">"http://www.w3.org/2000/svg"</span>
        <span class="na">width=</span><span class="s">"1364"</span>
        <span class="na">height=</span><span class="s">"486"</span>
        <span class="na">viewBox=</span><span class="s">"0 0 1364 486"</span><span class="nt">&gt;</span>
        <span class="nt">&lt;path</span> <span class="na">fill=</span>
        <span class="s">"none"</span> <span class="na">stroke=</span><span class="s">"#000"</span>
        <span class="na">stroke-linecap=</span><span class="s">"round"</span>
        ...
      </code>
    </pre>
  </div>
</div>

Then here’s the markup I was seeing on the live site:

<div class="language-xml highlighter-rouge">
  <div class="highlight">
    <pre class="highlight language-xml" tabindex="0">
      <code class="language-xml">
        <span class="token tag"><span class="token tag"><span class="token punctuation">&lt;</span>svg</span>
          <span class="token attr-name">xmlns</span><span class="token attr-value">
            <span class="token punctuation attr-equals">=</span>
            <span class="token punctuation">"</span>http://www.w3.org/2000/svg<span class="token punctuation">"</span>
          </span>
          <span class="token attr-name">width</span><span class="token attr-value">
            <span class="token punctuation attr-equals">=</span>
            <span class="token punctuation">"</span>1364<span class="token punctuation">"</span>
          </span>
          <span class="token attr-name">height</span><span class="token attr-value">
            <span class="token punctuation attr-equals">=</span>
            <span class="token punctuation">"</span>486<span class="token punctuation">"</span>
          </span>
          ...
      </code>
    </pre>
  </div>
</div>

The exceptionally short class names (na, s, nt and such) are what I expect to get from Rouge, but instead I was getting much more detailed, longer class names that didn’t match my CSS, so they were left as plain un-highlighted text.

Maybe GitHub is rolling out a new version of the Pages gem that includes a newer Rouge version that creates incompatible markup? That would be pretty rude, and also unlikely; there’s nothing in the Rouge changelog that would indicate a breaking change like this.

Or maybe they’ve updated something and Jekyll no longer respects the highlighter: rouge configuration option, so it’s falling back to some other highlighter that creates different markup. That would also be a rude change, and there hasn’t been a release of the Pages gem since August 2024. It seems unlikely that the gem would be out of sync with the system that actually builds your website.

A bit stumped, I asked a friend if they saw the highlighting and they said they did. So it’s just a me problem. I tried in Firefox and had no issue—the highlighting showed up exactly as expected.

You might be thinking “it’s obvious Will, you’ve got some browser extension that’s messing it up!” But no, I’ve only got two extensions: 1Blocker and 1Password.1 There’s no way either of these would alter the syntax highlighting in code blocks.

1Blocker mostly (as far as I’m aware) uses the Content Blocker API that just hides elements in the DOM, rather than mutating them.

1Password should only be adding a little account selection dropdown on pages with a login form. There’s absolutely no reason for their extension to do anything on my website apart from go “nope no login form here”.2

Well, out of complete desperation, as I didn’t have any better ideas, I disabled both, and sure enough the highlighting worked.

It turns out 1Password is applying its own syntax highlighting to any block matching this selector:

code[class*="language-"], [class*="language-"] code,
code[class*="lang-"], [class*="lang-"] code

Searching in the extension code for “token” (you can view the source of all scripts injected by extensions in the developer tools) quickly led me to the unobfuscated highlightAllUnder function name, with that selector. A post on the 1Password forum identifies this code as coming from prism.js, a JavaScript code highlighting library.

So there you go, all my wondering about caching and GitHub Pages gem versions was for nothing. I was blinded by the fact it’s absurd that my password manager is injecting a code highlighting script into every page I visit, so I didn’t bother to try disabling extensions sooner.

I contacted 1Password support and they confirmed what I’d already found in the forum; it’s a known issue and they’re working on a fix. Hopefully they will share some information on how this code got into the extension. My assumption is that it’s used in the main app for some feature (like code blocks in notes) and accidentally got included as a dependency of the browser extension.

  1. Seemingly I only use 1Extension. 

  2. The fact that 1Password requires a browser extension instead of using the OS’s specifically-designed API for autofilling passwords is not great. It was understandable before the widespread availability of system-level APIs for autofill, but iOS and macOS have had these APIs for over 7 years now. They support it on iOS, but macOS is left with a misbehaving browser extension. 


Add a Signature to Your Website

Nothing says “sophisticated and refined” like putting your signature at the bottom of your blog posts. However, if you don’t do it right, you might end up with pixellated garbage, and that’s not refined at all. I’ve just done this (have a look at the bottom of this post) and there are a few tricks to get this working nicely.

First thing is to get a signature. I didn’t want to use the unintelligible scrawl I leave on important documents, but instead put a little more emphasis on my domain name—it’s my name and initials.

I used my iPad and GoodNotes to sign over and over again until I had one I was happy with. You could of course do this with pen and paper; the key is to use a writing implement that leaves a stroke with uniform width and darkness. GoodNotes is great for this because it has my favourite pen mode: “the best fine-point felt-tip pen you’ve ever used”. We’re going to be converting this to an SVG later, and my SVG-ing skills can’t handle variable width strokes, so the uniform width is key.

There’s probably some way to get an SVG out of GoodNotes, but like so many things on iPadOS you’ll fight the implicit conversion to a raster image at some point in the process anyway. I just selected the good signature and copied it to the universal clipboard so I could paste it into Pixelmator Pro on my laptop.

At this point you could just save the signature as a PNG, slap it on your website, and call it a day. But we can do better.

There’s probably some nice tool to convert an image to clean vector shapes, but I opted to just trace it manually. I used the “freeform pen” tool in Pixelmator, a standard tool in any image editor with vector features: it lets you draw a shape freehand and have the result be a Bézier curve.

I traced over each stroke in the signature really badly using the trackpad. It would’ve been easier with a mouse but I didn’t want to get off the couch. The quality of the first pass isn’t really important, as you’ve got to go over each stroke and use the vector control nubs to get it as close as possible to the original signature. Here you could take a little artistic license and smooth out some curves or other imperfections.

Now we’ve got an SVG, which we could totally just put on our website and appreciate the infinite scalability, but it’s not quite that simple.

The first trap is that if you export as SVG from Pixelmator Pro it will do the completely reasonable thing of including all your raster layers as <image> tags with base64 encoded data (even if they’re hidden). My exported SVG was 137kB, which is huge.

You could delete them manually but there’s a better way: SVGO. It’s an SVG optimiser, which shrinks the size of SVGs by combining paths and removing unnecessary data. I have no idea how it works; it seems like magic to me. This is the original SVG (with data omitted for readability):

<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated by Pixelmator Pro 3.7.1 -->
<svg width="1364" height="486" viewBox="0 0 1364 486"
  xmlns="http://www.w3.org/2000/svg"
  xmlns:xlink="http://www.w3.org/1999/xlink">
    <path id="Path" fill="none" stroke="#000"
      stroke-width="25" stroke-linecap="round"
      stroke-linejoin="round" d="..."/>
    <path id="path1" fill="none" stroke="#000"
      stroke-width="25" stroke-linecap="round"
      stroke-linejoin="round" d="..."/>
    <path id="path2" fill="none" stroke="#000"
      stroke-width="25" stroke-linecap="round"
      stroke-linejoin="round" d="..."/>
    <path id="path3" fill="none" stroke="#000"
      stroke-width="25" stroke-linecap="round"
      stroke-linejoin="round" d="..."/>
    <path id="path4" fill="none" stroke="#000"
      stroke-width="25" stroke-linecap="round"
      stroke-linejoin="round" d="..."/>
    <path id="path5" fill="none" stroke="#000"
      stroke-width="25" stroke-linecap="round"
      stroke-linejoin="round" d="..."/>
    <path id="path6" fill="none" stroke="#000"
      stroke-width="25" stroke-linecap="round"
      stroke-linejoin="round" d="..."/>
    <path id="path7" fill="none" stroke="#000"
      stroke-width="25" stroke-linecap="round"
      stroke-linejoin="round" d="..."/>
    <path id="path8" fill="none" stroke="#000"
      stroke-width="25" stroke-linecap="round"
      stroke-linejoin="round" d="..."/>
    <path id="path9" fill="none" stroke="#000"
      stroke-width="25" stroke-linecap="round"
      stroke-linejoin="round" d="..."/>
    <path id="path10" fill="none" stroke="#000"
      stroke-width="25" stroke-linecap="round"
      stroke-linejoin="round" d="..."/>
    <image id="Layer" x="2" y="-1" width="1364"
      height="490" visibility="hidden"
      xlink:href="data:image/png;base64, ..."/>
</svg>

Then here’s the optimised SVG (again with data omitted):

<svg xmlns="http://www.w3.org/2000/svg" width="1364"
  height="486" viewBox="0 0 1364 486">
  <path fill="none" stroke="#000" stroke-linecap="round"
    stroke-linejoin="round" stroke-width="25" d="..."/>
  <path fill="none" stroke="#000" stroke-linecap="round"
    stroke-linejoin="round" stroke-width="25" d="..."/>
  <path fill="none" stroke="#000" stroke-linecap="round"
    stroke-linejoin="round" stroke-width="25" d="..."/>
  <path fill="none" stroke="#000" stroke-linecap="round"
    stroke-linejoin="round" stroke-width="25" d="..."/>
</svg>

It removed the invisible image (I would have done that manually anyway) but it also decided I only needed four <path> tags, removed the comment, and removed a tonne of point data from the actual paths. The original file was 137kB, removing the <image> takes that to 4.6kB (totally reasonable SVG size), then SVGO cuts that in half to 2.4kB. With no difference that I can see.

We can then put the SVG on our website and marvel at the small file size and resolution-independence. However, there’s a tradeoff: if we embed the SVG directly into the page with an <svg> tag, it can be styled by the site’s CSS rules and thus respond to light and dark mode, but that will increase the size of every HTML page by 2.4kB. If it’s included with an <img> tag it’ll be loaded once and cached, but it can’t access styles.

My site has both light and dark themes, so the signature needs to respond to the CSS, but I also don’t want a 2.4kB increase on every page.1 Thankfully we can get the best of both worlds.

The trick is to use the SVG as a mask-image for an element that has the background-color we want the stroke of our SVG to be. This only works for monochrome images (without much more complicated trickery), which is perfect for this use case. On the page we just need a placeholder element:

<div id="signature"></div>

Then in CSS (or in my case, SASS) we set the mask-image and some other mask- properties to ensure we only get a single, correctly-sized signature.

#signature
  mask-image: url("/images/signature.svg")
  mask-size: contain
  mask-repeat: no-repeat
  mask-position: center
  background: $text-colour
  height: 3em

The <div> placeholder will change colour between light mode and dark mode, only visible through the mask-image, giving the impression that the SVG is able to change colour, without wasting 2.4kB on every page.
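
If your site uses plain CSS rather than SASS, the same setup works with a custom property standing in for $text-colour. This is just a sketch; the --text-colour variable name and the media query are illustrative, not my actual stylesheet:

```css
/* Hypothetical plain-CSS version; --text-colour is an illustrative name */
:root {
  --text-colour: #000;
}

@media (prefers-color-scheme: dark) {
  :root {
    --text-colour: #fff;
  }
}

#signature {
  mask-image: url("/images/signature.svg");
  mask-size: contain;
  mask-repeat: no-repeat;
  mask-position: center;
  background: var(--text-colour);
  height: 3em;
}
```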

Related, I really like this introduction to SVG which really shows off how much more you can do with it. That’s what made me choose SVG for the graph in Light Mode InFFFFFFlation.

  1. Don’t ask why I insist on minimising the size of my site but I’m happy serving four web fonts. 


Programming With tmux for Beginners

Last week I gave a short talk on my adventures doing weird things with tmux. It went well, so I decided to record the slides with voiceover:

It’s very high level, the code snippets are simplified for readability, and I don’t go into much detail on the process of working this all out, but I think it’s a nice introduction to the whole project.

Of course you can read more about the whole project in my earlier posts. The things I didn’t mention there are keeping things inside tmux with run -C and unlocking the full speed of tmux.

You can watch the talk in the video above, or on YouTube.


Why Do Containers on Alpine Forget My Tailnet?

Earlier this year I wrote about how I’d swapped my home “production” server over to use Alpine Linux. Overall it’s gone well; I swapped the VPS that runs NZ Topomap of the Day (read more about that) to Alpine in June, and my script that sets up an Alpine install has made this straightforward.

This weekend I swapped the hardware that I was using for the home “production” server to be a little more recent and reliable, and was reminded of a wrinkle in Alpine that I’d run into before. Thankfully I’d written down some notes that helped me solve the issue again. I can highly recommend keeping notes on random problems you’ve seen or solved; it’s saved me loads of time trying to find the right docs again. Or even better, write them up in a blog post so other people can solve the same problem.

Anyway.

The issue I was seeing was that some containers that need to talk to other devices on my Tailnet (i.e. Tailscale network) would just lose the ability to resolve their addresses after a while. Frustratingly this wasn’t consistent; it would be working, and then a while later it would stop.

I use Tailscale’s MagicDNS feature which allows you to refer to any device by its hostname instead of the fully-qualified name or IP on the Tailnet. The culprit containers were endash (my container dashboard) and Prometheus, both of which connect to other devices on the Tailnet by hostname.

What I learnt is that the MagicDNS feature isn’t actually magic: it’s just setting a search domain for the unique Tailnet domain name (like my-tailnet.ts.net) and running a custom DNS server that resolves these names to the Tailnet IP addresses.

This works by setting some config in /etc/resolv.conf that looks like this:

nameserver 100.100.100.100
search my-tailnet.ts.net

It’s not magic, it’s just a config file.

When you create a container, there are a bunch of flags you can pass (like --dns) that override the resolv.conf file. If you don’t provide any of these options, Podman will copy the DNS configuration from the host’s resolv.conf file.

Where the problem comes in is that udhcpc (Alpine’s DHCP client) will fight Tailscale and write its own resolv.conf, removing the MagicDNS config. Tailscale might rewrite the file, but the config is copied into the container on create, so any containers created while the incorrect config was present will continue to be broken. Updating the file and restarting the container isn’t even enough to fix it—you need to delete and recreate the container, since the config is part of its overlay filesystem.

I wish I had a wonderful solution that made all the pieces play nicely together, but instead you can just tell Alpine to please not overwrite the file. In /etc/udhcpc/udhcpc.conf, ensure that this section is uncommented:

# Do not overwrite /etc/resolv.conf
RESOLV_CONF="no"

Then make sure /etc/resolv.conf is in the state you expect (with search for your tailnet and the Tailscale nameserver), then delete and recreate any containers that need this DNS config.


More Commands in the JJ Toolbox

It’s been almost two years since I started using JJ regularly, and almost 18 months since I wrote some tips on how to use it. That post was really just the result of me reading the docs (which at the time were much sparser than they are now) and working out how to manage remotes properly.

That was a long time ago, and I’ve had more time to settle into a rhythm and realise what works for me and what doesn’t.

Before I get too carried away, I want to get up on my high horse for a second. I found a repo that boasted “over 20 aliases for efficient workflows” and I just want to say: no. You don’t need lots of aliases. Aliases that you don’t know are useless. Having to remember which letter salad corresponds to the exact combination of flags you need is not saving you time.

Seriously, the oh-my-zsh git plugin defines over 200 aliases. Something has gone terribly wrong.

I have seven VCS-related shell aliases, which is the amount of letter salad that I can handle.

alias g="jj"
alias gs=" g status"
alias gd=" g diff"
alias gcm=" g commit -m"
alias gc=" g commit"
alias gl=" g log"
alias gp=" g push"

Anyway this post isn’t supposed to just be complaining about git. We’re here for tips.

Grab other versions of files with restore

The command whose usage surprises me most is restore. Since this is such a messy concept in git (are you discarding untracked, unstaged, or staged changes?) I wasn’t in the habit of doing this. The only command I knew was git checkout -- . which would blow away any tracked changes and get you to an empty working copy. It’s not a very precise operation.

I’ve got basically three different usages of restore. The first is when I’ve got a change but it contains some debugging code or something that I don’t want to be included when I send it out for review. I’ll use jj restore -i to show the interactive diff editor and select the bits I want to get rid of.

If I’ve got a commit and I’m working on top of it, sometimes I want to drop a file back to its state on the main branch. I could just rebase the commit I’m working on, but restore makes it easy to get a file to the state it was in on a different revision, usually main. I’ll do this with jj restore -f main path/to/my/file.txt and now my working copy has the updated file.

If you think about it, jj duplicate is just jj restore with all files into an existing empty commit.

The last use is the predictable one, if I’ve made some change and it’s just plain bad, I’ll do jj restore with no extra arguments to simply discard my changes. This is equivalent to jj abandon but feels a little safer.

Of course that safety doesn’t really matter, since I can jj undo anything anyway. This has been surprisingly handy if I get myself into a state with lots of merge conflicts, or accidentally run a command with the wrong flags. It just removes the risk associated with making a mistake, which means I don’t have to be particularly confident that any one command will do exactly what I expect. If it doesn’t, I’ll just undo and check the docs.

Irresponsibly juggle revisions with rebase

I did define an alias onmain that would move the working copy to be based on trunk() instead of wherever it is currently. It’s fine, it works, but to be honest it’s easier to just do rebase -d main.

Initially I think I got a bit confused with the -r flag to rebase, but once I realised -s (or --source) and -d (or --destination) do exactly what you want, I’ve had no trouble.

You can get a little fancy with -A and -B (--insert-after and --insert-before) which lets you splice a change right in-between two others, but this is a bit too much for me to remember. I’ll just run rebase twice.

Move changes between revisions with squash

Something I thought I’d miss in JJ is hg histedit, which has no direct equivalent. It opens a nice TUI that works similarly to an interactive rebase in git. You can choose for each commit whether you want to fold or edit or whatever, and then you say “go” and it does it all.

I’d use this to reorder commits (so one change could get submitted before another) but often all I would do was make a dummy commit, then reorder it to be on top of a commit further down in the history, then fold them together. This is just a really roundabout way of doing squash. So instead of all that nonsense, I’ll just run jj squash -d xyz and the working copy changes will be moved into commit xyz. If I don’t want to move all the files, I’ll use -i to select them interactively. I find the interactive selection easier than passing file paths as arguments most of the time.

In Mercurial I’d use hg absorb for this same job, which is still present as jj absorb. However, neither matches up the edits to the right commit every time, so using jj squash is more predictable.

It’s worth using a little bit of your brain space to learn what the “default” arguments are to various JJ commands. For example with squash if you give it no arguments it takes all the changes from the current commit and moves them to the parent. If you provide a revision with -r then it’ll move the changes from that to its parent. If you provide -f it’ll squash from that revision into the current one, if you provide -t then it’ll squash from the current into that. Other commands like rebase and restore have similar behaviour.

Of course it’s not difficult to just always pass -f and -t, but once you get a little fancy you can throw in some revset expressions (like xyz:: to get all descendants) and do some clever nonsense.

Doing fancy revset expressions

Speaking of revset expressions, since I spent a bit of time learning the syntax I’ll find occasions to use a revset to replace a set of tedious commands with a single command.

I wrote a script to make automated changes to a codebase, and it would do jj new before making any changes. For some files it would make no changes and I’d be left with an empty commit. There were two ways that I ended up solving this: I could get rid of all the empty commits with jj abandon 'empty() & mutable()', or I could merge everything back into one commit with jj squash -f 'mutable()' -t @ (remembering that I could totally omit that -t @ and leave it implied).

Obviously most of the time it’s easier to just write the revision ID, use a simple expression like @-, or a branch name like main, but it’s nice having this in your repertoire for scripting or one-off weirdness.

In a way this is similar to Vim commands; you can get away with super basic editing and movement commands, but if you can remember a few tricks like diw or ci{ you’ll be able to get things done more smoothly.

Scripting with the power of -T

Originally—for some reason—I thought I’d leave scripts using git. I have no idea why I thought this; scripting with JJ is so much easier. I find the documentation a little confusing, but almost every command accepts a -T or --template flag that dictates how the output is formatted. It is then easy to write a command that outputs just the fields you need as JSON, which is trivial to parse in almost any language. This is what I did when I wrote (and then re-wrote) my project progress printer.

The simpler model also makes scripting easier as you don’t have to worry about the working copy state, or things like where you’re going to git pull from. I just run jj sync (aliased to jj git fetch --all-remotes) and the repo is updated.
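
As a sketch of what this looks like from the scripting side: suppose your template emits one JSON object per commit (the template itself is elided here, and the change IDs and descriptions are made up for illustration):

```python
import json

# Pretend this came from something like:
#   jj log --no-graph -T '<a template that emits one JSON object per line>'
# The IDs and descriptions below are invented sample data.
sample_output = """\
{"id": "pzoqtwuv", "description": "add signature to footer"}
{"id": "qxkwllsq", "description": "optimise the signature SVG"}
"""

# One json.loads per line: no porcelain output parsing, no regexes
commits = [json.loads(line) for line in sample_output.splitlines()]
for commit in commits:
    print(commit["id"], commit["description"])
```

With real data you’d swap sample_output for the captured output of the command, but the parsing side stays exactly this simple.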

An alias that makes a lot of scripts easier is my jj ls alias, which lists the files touched by a particular change:

ls = ['log', '--no-graph', '-T', 'diff.files().map(|f| f.target().path()).join("\n") ++ "\n"']

This makes use of the template to process the list of changed files into a list of paths and then join them into a string. Embedded little languages in tools are really useful.

The aliases I do have

I really came out swinging at the start, but I do actually have some handy aliases that make life easier:

clone = ['git', 'clone']
ig = ['git', 'init', '--git-repo=.']
sync = ['git', 'fetch', '--all-remotes']

I think if you’re typing any git subcommand with any regularity, you should alias that away. The only one I use is jj git remote, but that’s quite rare.

evolve = ['rebase', '--skip-emptied', '-d', 'trunk()']
pullup = ['evolve', '-s', 'immutable()+ ~ immutable()']

Both of these are to update commits to sit on top of a newly-synced main branch. evolve works for the currently checked out branch, but I got frustrated at having to do this multiple times if I had multiple parallel changes. For that I made pullup (named since it pulls the changes from below trunk() to be on top of trunk()). The revset could probably be tidier, I don’t know why I didn’t just use mutable().


I know I poke fun at people that say they only use six git commands, but the more I think about it the more I realise I’m slowly reaching the same enlightenment: all these different JJ workflows boil down to the same few commands. This time it’s different, though, because these six commands are good.

Anyway this ended up more of a ramble than I expected. You can see my actual JJ config on Codeberg and maybe when you’re reading this I’m using 200 aliases and have reached new heights of productivity.


Hot ECR Reloading in Your Area

Everyone knows that ECR—the templating system built into the Crystal standard library—works at compile time, which makes it as efficient as writing to an IO manually. This is unlike other templating formats (usually in interpreted languages) like ERB (embedded Ruby) that parse and evaluate the template at runtime. Runtime evaluation has the advantage that you can change the template contents without stopping and restarting the program.

After I realised that ECR is just a few classes in the standard library that are actually very easy to modify, it occurred to me that I could get a lot of the runtime reloading advantages of ERB in ECR with some reasonably horrific hacks.

Firstly, I just want you to understand how cool ECR is. I always assumed it was much more complicated than it actually is; I thought it did a full parse and had to understand the Crystal code within the tags, but it’s actually much cleverer than that.

There isn’t even a parser, there is a lexer and that goes straight into the code generator. No messing about.

What happens is the lexer trundles along until it comes across an opening tag (either <%, <%=, or <%-). All the text before the tag is a single string literal token. It keeps looking at the code inside the tag until it gets to a closing tag (%> or -%>) and then the whole section of code is one single token. It keeps going like this until it gets to the end of the file.

The real magic happens in the code generator. The contents of the code blocks are effectively just dumped unmodified into the output, so if we have this ECR:

ECR solves at least <%= 1 << 10 %> problems

We get this code:

io << "ECR solves at least "
(1 << 10).to_s io
io << " problems"

In this case the code section is just a single expression, so it’s fairly straightforward. Surely though if we have control flow, or a block, we’d have to do something different? No! It just follows the same formula:

<% 10.times do |i| %>
 line number <%= i %>
<% end %>

Since Crystal doesn’t rely on significant whitespace or anything, we can just pop the contents of each of those code blocks into the generated file:

10.times do |i|
io << "line number "
i.to_s io
end

The ECR processor didn’t need to know or care that 10.times do |i| started a new block. If there was a mis-matching end, that would be picked up by the actual Crystal compiler when the generated code is compiled. Syntax errors appear as coming from the ECR file because there are annotations that map the expressions in the generated code to the corresponding line and column number in the ECR file.
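
The whole lexer-straight-to-codegen idea is small enough to sketch in Python. This is a toy that only handles the simple <%= %> case (Python’s significant whitespace rules out the paste-through trick for <% %> blocks), not the real Crystal implementation:

```python
import re

def ecr_to_code(template: str) -> str:
    """Toy ECR-style generator: plain text becomes appends to `io`,
    and <%= expr %> is wrapped so its value gets written out too."""
    code = []
    pos = 0
    for m in re.finditer(r"<%=(.*?)%>", template, re.S):
        text = template[pos:m.start()]
        if text:
            code.append(f"io.append({text!r})")  # string literal token
        code.append(f"io.append(str({m.group(1).strip()}))")  # code token
        pos = m.end()
    if template[pos:]:
        code.append(f"io.append({template[pos:]!r})")  # trailing text
    return "\n".join(code)

generated = ecr_to_code("ECR solves at least <%= 1 << 10 %> problems")
io = []
exec(generated)
print("".join(io))  # → ECR solves at least 1024 problems
```

The generator never evaluates the expression itself; it just pastes it into the output program, and the host language does the rest. That is the whole trick.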

Anyway, we can totally do this compile-time-only stuff at runtime. Well, not actually. But mostly.

The ECR file is basically just a series of string literals separated by code snippets. We can’t change the code snippets at runtime, but the strings are fair game. I did wonder if you could do something where you wrap each code section in a Proc or conditional and if they get removed or re-ordered you could only evaluate the ones that remained in the file, but since they can have any inter-dependence (defining variables, etc) this would get very fragile very quickly. Although since like 95% of the time what I want to change is a misspelled HTML class attribute, being able to update the text content of the template is a huge improvement.

I wrote and then re-wrote it a few times, and the end result is much simpler than I was expecting at the start. The most important thing is failing fast if the ECR file has changed in a way that means we can no longer render it. Any change to the actual code will invalidate the template, and the code will have to be recompiled to pick up the changes.

The processor that runs at compile time generates very similar code to the actual ECR processor. To check whether the code has changed, it builds a list of all the code snippets as strings. At the start of the generated code I call a helper method that takes this list, rereads the ECR file, and iterates through the tokens. If any code token is different or missing, the file has changed too much and we throw an exception. Otherwise we return a list of new strings that will replace the string literals. The generated code takes this list and inserts strings based on their index in the file (since that won’t change, since we’ll have failed already in that case).

Here’s the (slightly abridged) generated code for the ECR example above:

strings = Geode::HTMLSafeECR::RuntimeLoader.get_strings(
  "test.ecr",
  [nil, " 1 << 10 ", nil]
)
io << strings[0]
(1 << 10).to_html io
io << strings[1]

The array passed to get_strings is generated from the original ECR file contents. Each nil is where there’s a string literal—something that can be replaced—and every non-nil string is a bit of code that must remain in the altered ECR file.1
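
The check itself is simple enough to sketch in Python. This is a toy mirroring the idea, not the actual Crystal helper; the name get_strings and the token handling are simplified:

```python
import re

def get_strings(template, expected):
    """Re-lex a (possibly edited) template and compare its code snippets
    to the ones baked in at compile time. `expected` mirrors the
    [nil, " 1 << 10 ", nil] array: None marks a replaceable string slot,
    a string marks code that must be unchanged."""
    tokens = []
    pos = 0
    for m in re.finditer(r"<%=?(.*?)%>", template, re.S):
        tokens.append(template[pos:m.start()])  # string literal slot
        tokens.append(m.group(1))               # code slot
        pos = m.end()
    tokens.append(template[pos:])
    if len(tokens) != len(expected):
        raise RuntimeError("template structure changed, recompile")
    strings = []
    for token, want in zip(tokens, expected):
        if want is None:
            strings.append(token)  # new text replaces the string literal
        elif token != want:
            raise RuntimeError("code changed, recompile")
    return strings

# Editing the text around the code is fine; editing the code raises
print(get_strings("ECR fixes at least <%= 1 << 10 %> bugs",
                  [None, " 1 << 10 ", None]))
# → ['ECR fixes at least ', ' bugs']
```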

On release builds, all this code is removed and I swap back over to the boring compile-time-only processor, so all of this nonsense disappears and it works just like a normal ECR.

I’ve added this to the HTML-safe ECR generator in Geode—that I wrote about the other day—and have also simplified that code a whole bunch by splitting out the HTML-generating code into to_html, which removed the need for the Builder wrapper and unsafe_write method entirely. This has simplified the model of composable components, meaning that any object can override to_html and be inserted into an ECR template with <%= %>, and the escaping (or lack thereof) will work as you’d expect. You can see this commit in endash as an example of swapping templates over to use this method.

  1. I should actually check the type of code block (whether it’s output or control or whatnot) but I haven’t been bothered yet.