Llogiq on stuff — https://llogiq.github.io/feed.xml (generated by Jekyll, 2026-03-05)

Write small Rust scripts (2026-03-05, https://llogiq.github.io/2026/03/05/auto)

Recently I was working on a Rust PR to reduce unreachable_code lint churn after todo!() calls: it removes unreachable_code messages following todo!() and instead adds a todo_macro_uses lint which can be turned off while the code is still being worked on. However, once that change was done, I ran into a number of failing tests, because while they had a #![allow(unused)] or some such, this didn’t cover the todo_macro_uses lint.

Brief digression: rustc itself is tested by a tool called compiletest. That tool runs the compiler on code snippets, captures the output and compares it with known-good golden master output it stores alongside the snippets. In this case, there were a good number of tests that had todo!() but didn’t #![allow(todo_macro_uses)]. More tests than I’d care to change manually.

In this year of the lord, many of us would ask some agent to do it for us, but I didn’t like the fact that I would have to review the output (I have seen too many needless formatting changes to be comfortable with investing time and tokens into that). Also, I had a code snippet lying around to find all Rust files that only used standard library functions and could easily be pasted into a throwaway project.

use std::io;
use std::path::Path;

fn check_files(path: &Path) -> io::Result<()> {
    for e in std::fs::read_dir(path)? {
        let Ok(d) = e else { continue; };
        if d.file_type().is_ok_and(|ft| ft.is_dir()) {
            check_files(&d.path())?;
        } else {
            let path = d.path();
            if path.extension().is_some_and(|ext| ext == "rs") {
                check_file(&path)?;
            }
        }
    }
    Ok(())
}

This can be called on a Path and walks it recursively, calling check_file on all Rust files. I had also written a few read-modify-write functions in Rust (notably in my twirer tool, which I use for my weekly This Week in Rust contributions). They look like this:

fn check_file(path: &Path) -> io::Result<()> {
    let orig_text = std::fs::read_to_string(path)?;

    let text = todo!(); // put the changed `orig_text` into `text`

    std::fs::write(path, text)
}

There was some slight complication in that a) I wanted to amend any #![allow(..)] annotation I would find instead of adding another, and b) to add one, I would have to find the first position after the initial comments (which are interpreted by compiletest and would be foiled by putting them below a non-comment line). Also, I didn’t want to needlessly add empty lines, so I had to check whether to insert a newline. All in all this came out to less than 50 lines of Rust code, which I’m reproducing here; perhaps someone can copy parts of them into their own one-off Rust scripts.

use std::fs::{read_dir, read_to_string, write};
use std::io;
use std::path::Path;

fn check_file(path: &Path) -> io::Result<()> {
    let orig_text = read_to_string(path)?;
    if !orig_text.contains("todo!(") || orig_text.contains("todo_macro_uses") {
        return Ok(());
    }
    let text = if let Some(pos) = orig_text.find("#![allow(") {
        // we have an `#![allow(..)]` we can extend
        let Some(insert_pos) = orig_text[pos..].find(")]") else {
            panic!("unclosed #![allow()]");
        };
        let (before, after) = orig_text.split_at(pos + insert_pos);
        format!("{before}, todo_macro_uses{after}")
    } else {
        // find the first line after all // comments
        let mut pos = 0;
        while orig_text[pos..].starts_with("//") {
            let Some(nl) = orig_text[pos..].find("\n") else {
                pos = orig_text.len();
                break;
            };
            pos += nl + 1;
        }
        let (before, after) = orig_text.split_at(pos);
        // insert a newline unless at beginning or we already have one
        let nl = if pos == 0 || before.ends_with('\n') {
            ""
        } else {
            "\n"
        };
        format!("{before}{nl}#![allow(todo_macro_uses)]\n{after}")
    };
    write(path, text)
}

fn check_files(path: &Path) -> io::Result<()> {
    for e in read_dir(path)? {
        let Ok(d) = e else { continue; };
        if d.file_type().is_ok_and(|ft| ft.is_dir()) {
            check_files(&d.path())?;
        } else {
            let path = d.path();
            if path.extension().is_some_and(|ext| ext == "rs") {
                check_file(&path)?;
            }
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    check_files(Path::new("../rust/tests/ui"))
}

The script ran flawlessly, I didn’t need to check the output for errors, and I can reuse parts of it whenever I feel like it.

Conclusion: It’s easy and quick to write small Rust scripts to transform code. And since you know what the code does, you don’t need any time to review the output. And Rust’s standard library, while missing pieces that might simplify some tasks, is certainly serviceable for work like this. Even if I had the need for, say, regexes, those would’ve been a mere cargo add regex away. So next time you need to mechanically transform some code, don’t reach for AI, simply rust it.

Rust A Decade Later (2025-05-18, https://llogiq.github.io/2025/05/18/ten)

Hi folks, I’m back from RustWeek, and the ten-year celebration also almost marks ten years of me coding in Rust. Cue the (self-)congratulation and the feel-good things. This is hopefully one of those.

I want to raise one single point that most of the other posts seem to be missing: We’re not done yet. Yes, we can have nice things (and comparing with other languages, we very much do have nice things). But we shouldn’t content ourselves with what we have.

We can have even nicer things.

What’s more, we should have even nicer things. I’ll give a few examples:

  • Rust’s error reporting is famously great. But at least for now, paths in error and lint messages are usually fully qualified, because the compiler fails to keep the information about what paths are in scope when creating the diagnostics. I’ve teamed up with Esteban Kuber and Vadim Petrochenkov so that we may get path trimming in those messages.
  • The Rust compiler has more span information than most compilers do. But a) we don’t always get the full info on macro expansion (notably the Rust for Linux folks found an example where macro_rules! m { ($e:tt) => { $e }; } m!(1) fails to mark the span of the 1 as involved in a macro). Also, unlike C, we currently don’t have an annotation to declare that code has been generated from non-Rust code, which would help programs that compile to Rust, such as bindgen. We don’t yet have anyone taking up this thing, but here’s hoping we’ll get it anyway.
  • The clippy lint implementer’s workshop led to multiple PRs to make clippy better. I have yet to review some of them, but the results so far are heartening. In the meantime, the clippy performance project has already given us some initial benefits, but there’s a lot of work to be done still.
  • The cargo team will add their own linting infrastructure and take over the few cargo lints clippy currently has, which will likely improve their performance because they will be able to hook into cargo internals for which we currently need to call out to cargo.
  • The current story around mutable statics is suboptimal, with the replacement API being nightly-only, while the original idiom is already a lint error. I’m positive we’ll see something better come out of it.

And that’s only a small portion of the herculean amount of work that continues to be poured into Rust.

So here’s to the next 10 years of Rust improvement. It’s already become better than most of us would have dared to dream, and we should expect to continue to raise the bar even further.

Rust Life Improvement (2025-05-15, https://llogiq.github.io/2025/05/15/life)

This is the companion blog post to my eponymous Rustikon talk. The video recording and slides are also available now.

As is my tradition, I started with a musical number, this time replacing the lyrics to Deep Blue Something’s “Breakfast at Tiffany’s”, inspired by some recent discussions I got into:

You say that Rust is like a religion
the community is toxic
and you rather stay apart.
You say that C can be coded safely
that it is just a skill issue
still I know you just don’t care.

R: And I say “what about mem’ry unsafety?”
You say “I think I read something about it
and I recall I think that hackers quite like it”
And I say “well that’s one thing you got!”

In C you are tasked with managing mem’ry
no help from the compiler
there’s so much that can go wrong?
So what now? The hackers are all over
your systems, running over
with CVEs galore.

R: And I say “what about…”

You say that Rust is a woke mind virus,
rustaceans are all queer furries
and you rather stay apart.
You say that C can be coded safely
that one just has to get gud
still I know you just don’t care.


Beta Channel

I started out the talk by encouraging people who use Rustup to try the beta channel. Compared to stable, one gets six weeks of performance improvements early, and despite thirty-five point releases since 1.0.0, most of those didn’t fix issues that many people happened upon.

Even if one wants to be sure to only update once the first point release is likely to be out, the median first point release appeared roughly two weeks after the point-zero one it was based on. Besides, if more people test the beta channel, its quality is also likely to improve. It’s a win-win situation.
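With rustup, trying the beta channel takes only a command or two:

```
$ rustup toolchain install beta
$ rustup default beta        # switch over entirely, or ...
$ cargo +beta build          # ... use beta for a single invocation
```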

Cargo

Cargo has a number of tricks up its sleeve that not everyone knows (so if you already do, feel free to skip ahead). E.g. there are a number of shortcuts:

$ cargo b # build
$ cargo c # check
$ cargo d # doc
$ cargo d --open # opens docs in browser
$ cargo t # test
$ cargo r # run
$ cargo rm $CRATE # remove

Besides, if one has Rust programs in the examples/ subfolder, one can run them using cargo r --example <name>.

I also noted that cargo can strip release binaries (but doesn’t by default), if you add the following to your project’s Cargo.toml:

[profile.release]
strip=true

Cargo: Configuration

Cargo not only looks for the Cargo.toml manifests, it also has its own project- or user-wide configuration:

  • project-wide: .cargo/config.toml
  • user-wide, UNIX-like (Linux, MacOS, etc.): ~/.cargo/config.toml
  • user-wide, Windows: %USERPROFILE%\.cargo\config.toml

The user configuration can be helpful to…

Add more shortcuts:

[alias]
c = "clippy"
do = "doc --open"
ex = "run --example"
rr = "run --release"
bl = "bless"
s = "semver-checks"

Have Rust compile code for the CPU in your computer (which lets the compiler use all its bells and whistles that normally are off limits in case you give that executable to a friend):

[build]
rustflags = ["-C", "target-cpu=native"]

Have Rust compile your code into a zero-install, relocatable static binary:

[build]
rustflags = ["-C", "target-feature=+crt-static"]

Use a shared target folder for all your Rust projects (This is very useful if you have multiple Rust projects with some overlap in dependencies, because build artifacts will be shared across projects, so they will only need to be compiled once, conserving both compile time & disk space):

[build]
target = "/home/<user>/.cargo/target"

Configure active lints for your project(s):

[lints.rust]
missing_docs = "deny"
unsafe_code = "forbid"

[lints.clippy]
dbg_macro = "warn"

There are sections for both Rust and Clippy. Speaking of which:

Clippy

This section has a few allow by default lints to try:

missing_panics_doc, missing_errors_doc, missing_safety_doc

If you have a function that looks like this:

pub unsafe fn unsafe_panicky_result(foo: Foo) -> Result<Bar, Error> {
    match unsafe { frobnicate(&foo) } {
        Foo::Amajig(bar) => Ok(bar),
        Foo::Fighters(_) => panic!("at the concert"),
        Foo::FieFoFum => Err(Error::GiantApproaching),
    }
}

The lints will require # Errors, # Panics and # Safety sections in the function docs, respectively:

/// # Errors
/// This function returns a `GiantApproaching` error on detecting giant noises
///
/// # Panics
/// The function might panic when called at a Foo Fighters concert
///
/// # Safety
/// Callers must uphold [`frobnicate`]'s invariants'

There’s also an unnecessary_safety_doc lint that warns on # Safety sections in docs of safe functions (which is likely a remnant of an unsafe function being made safe without removing the section from the docs, which might mislead users):

/// # Safety
///
/// This function is actually completely safe
pub fn actually_safe_fn() { todo!() }

The multiple_crate_versions lint will look at your dependencies and see if you have multiple versions of a dependency there. For example, if you have the following dependencies:

  • mycrate 0.1.0
    • rand 0.9.0
    • quickcheck 1.0.0
      • rand 0.8.0

The rand crate will be compiled twice. Of course, that’s totally OK in many cases (especially if you know that your dependencies will catch up soon-ish), but if you have bigger dependencies, your compile time may suffer. Worse, you may end up with incompatible types, as a type from one version of a crate isn’t necessarily compatible with the same type from another version.

So if you have long compile times, or face error messages where a type seems to be not equal to itself, this lint may help you improve things.
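Note that this lint is part of clippy’s cargo lint group, which is allow-by-default; one way to switch it on for a project (sketched here, assuming you configure lints via the Cargo.toml lints table) is:

```toml
[lints.clippy]
# allow-by-default cargo-group lint; warn on duplicate dependency versions
multiple_crate_versions = "warn"
```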

The non_std_lazy_statics lint will help you to update your code if you still have a dependency on lazy_static or once_cell for functionality that has been pulled into std. For example:

// old school lazy statics
lazy_static! { static ref FOO: Foo = Foo::new(); }
static BAR: once_cell::sync::Lazy<Foo> = once_cell::sync::Lazy::new(Foo::new);

// now in the standard library
static BAZ: std::sync::LazyLock<Foo> = std::sync::LazyLock::new(Foo::new);

The ref_option and ref_option_ref lints should help you avoid using references on options as function arguments. Since an Option<&T> is the same size as an &Option<T>, it’s a good idea to use the former to avoid the double reference.

fn foo(opt_bar: &Option<Bar>) { todo!() }
fn bar(foo: &Foo) -> &Option<&Bar> { todo!() }

// use instead
fn foo(opt_bar: Option<&Bar>) { todo!() }
fn bar(foo: &Foo) -> Option<&Bar> { todo!() }

The same_name_method lint helps you avoid ambiguities which would later require a turbofish to resolve.

struct I;
impl I {
    fn into_iter(self) -> Iter { Iter }
}
impl IntoIterator for I {
    fn into_iter(self) -> Iter { Iter }
    // ...
}

The fn_params_excessive_bools lint will warn if you use multiple bools as arguments in your methods. Those can easily be confused, leading to logic errors.

fn frobnicate(is_foo: bool, is_bar: bool) { ... }

// use types to avoid confusion
enum Fooish {
    Foo,
    NotFoo,
}

Clippy looks for a clippy.toml configuration file you may want to use in your project:

# for non-library or unstable API projects
avoid-breaking-exported-api = false
# let's allow even fewer bools
max-fn-params-bools = 2

# allow certain things in tests
# (if you activate the restriction lints)
allow-dbg-in-tests = true
allow-expect-in-tests = true
allow-indexing-slicing-in-tests = true
allow-panic-in-tests = true
allow-unwrap-in-tests = true
allow-print-in-tests = true
allow-useless-vec-in-tests = true

Cargo-Semver-Checks

If you have a library, please use cargo semver-checks before cutting a release.

$ cargo semver-checks
    Building optional v0.5.0 (current)
       Built [   1.586s] (current)
     Parsing optional v0.5.0 (current)
      Parsed [   0.004s] (current)
    Building optional v0.5.0 (baseline)
       Built [   0.306s] (baseline)
     Parsing optional v0.5.0 (baseline)
      Parsed [   0.003s] (baseline)
    Checking optional v0.5.0 -> v0.5.0 (no change)
     Checked [   0.005s] 148 checks: 148 pass, 0 skip
     Summary no semver update required
    Finished [  10.641s] optional

Your users will be glad you did.

Cargo test

First, doctests are fast now (apart from compile_fail ones), so if you avoided them to keep your turnaround time low, you may want to reconsider.
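As a reminder of what that buys you: a doctest is just a code example in a doc comment that cargo test compiles and runs. A minimal sketch (the crate name my_lib and the function are made up for illustration):

```rust
/// Returns the sum of `a` and `b`.
///
/// ```
/// assert_eq!(my_lib::add(2, 2), 4);
/// ```
pub fn add(a: u64, b: u64) -> u64 {
    a + b
}
```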

Also, if you have a binary crate, you can still use #[test] by converting your crate to a mixed crate. Put this in your Cargo.toml:

[lib]
name = "my_lib"
path = "src/lib.rs"

[[bin]]
name = "my_bin"
path = "src/main.rs"

Now you can test all items you have in lib.rs and any and all modules reachable from there.
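With that in place, tests live in the library as usual. A minimal sketch (all names are made up):

```rust
// src/lib.rs of the hypothetical mixed crate
pub fn greeting() -> String {
    "Hello, World!".to_string()
}

#[cfg(test)]
mod tests {
    #[test]
    fn greets() {
        // `cargo t` picks this up now that there is a [lib] target
        assert_eq!(super::greeting(), "Hello, World!");
    }
}
```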

Insta

Insta is a crate to do snapshot tests. That means it will use the debug representation or a serialization in JSON or YAML to create a “snapshot” once, then complain if the snapshot has changed. This removes the need to come up with known good values, since your tests will create them for you.

#[test]
fn snapshot_test() {
    insta::assert_debug_snapshot!(my_function());
}

insta has a few tricks up its sleeve to deal with uncertainty arising from indeterminism. You can redact the output to e.g. mask randomly chosen IDs:

#[test]
fn redacted_snapshot_test() {
    insta::assert_json_snapshot!(
        my_function(),
        { ".id" => "[id]" }
    );
}

The replacement can also be a function. I have used this with a Mutex<HashMap<..>> in the past to replace random IDs with sequence numbers to ensure that equal IDs stay equal while ignoring their randomness.
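The core idea of that replacement function, sketched here without the insta plumbing (the normalization map is the interesting part; wiring it into insta’s redactions is left out, and all names are invented):

```rust
use std::collections::HashMap;
use std::sync::{LazyLock, Mutex};

// Maps each distinct random ID to a stable sequence number, so equal IDs
// stay equal across a snapshot while their randomness is ignored.
static IDS: LazyLock<Mutex<HashMap<String, usize>>> =
    LazyLock::new(|| Mutex::new(HashMap::new()));

fn normalize_id(id: &str) -> String {
    let mut ids = IDS.lock().unwrap();
    let next = ids.len();
    let seq = *ids.entry(id.to_string()).or_insert(next);
    format!("[id-{seq}]")
}
```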

Cargo Mutants

Mutation testing is a cool technique where you change your code to check your tests. It will apply certain changes (for example replacing a + with a - or returning a default value instead of the function result) to your code and see if tests fail. Those changes are called mutations (or sometimes mutants) and if they don’t fail any tests, they are deemed “alive”.
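To make that concrete, here is a hand-made “mutant” of a trivial function; cargo-mutants generates such changes automatically, so this is only an illustration:

```rust
fn add(a: i32, b: i32) -> i32 {
    a + b
}

// A typical mutation: `+` replaced with `-`. If no test fails against
// this version, the mutant is "missed" and the test suite has a gap.
fn add_mutant(a: i32, b: i32) -> i32 {
    a - b
}

// This test catches the mutant: it passes for `add` but not `add_mutant`.
#[test]
fn addition_works() {
    assert_eq!(add(2, 3), 5);
}
```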

I wrote a bit about that technique in the past and even wrote a tool to do that as a proc macro. Unfortunately, it used specialization and as such was nightly only, so nowadays I recommend cargo-mutants. A typical run might look like this:

$ cargo mutants
Found 309 mutants to test
ok       Unmutated baseline in 3.0s build + 2.1s test
 INFO Auto-set test timeout to 20s
MISSED   src/lib.rs:1448:9: replace <impl Deserialize for Optioned<T>>::deserialize -> Result<Optioned<T>,
 D::Error> with Ok(Optioned::from_iter([Default::default()])) in 0.3s build + 2.1s test
MISSED   src/lib.rs:1425:9: replace <impl Hash for Optioned<T>>::hash with () in 0.3s build + 2.1s test
MISSED   src/lib.rs:1202:9: replace <impl OptEq for u64>::opt_eq -> bool with false in 0.3s build + 2.1s test
TIMEOUT  src/lib.rs:972:9: replace <impl From for Option<bool>>::from -> Option<bool> with Some(false) in 0.4s
 build + 20.0s test
MISSED   src/lib.rs:1139:9: replace <impl Noned for isize>::get_none -> isize with 0 in 0.4s build + 2.3s test
MISSED   src/lib.rs:1228:14: replace == with != in <impl OptEq for i64>::opt_eq in 0.3s build + 2.1s test
MISSED   src/lib.rs:1218:9: replace <impl OptEq for i16>::opt_eq -> bool with false in 0.3s build + 2.1s test
MISSED   src/lib.rs:1248:9: replace <impl OptEq for f64>::opt_eq -> bool with true in 0.4s build + 2.1s test
MISSED   src/lib.rs:1239:9: replace <impl OptEq for f32>::opt_eq -> bool with false in 0.4s build + 2.1s test
...
309 mutants tested in 9m 26s: 69 missed, 122 caught, 112 unviable, 6 timeouts

Unlike code coverage, mutation testing not only finds which code is run by your tests, but which code is actually tested against changes – at least as far as they can be automatically applied.

Also, mutation testing can tell you which tests cover which possible mutations, so you can sometimes remove redundant tests, making your test suite leaner and faster.

rust-analyzer

I just gave a few settings that may improve your experience:

# need to install the rust-src component with rustup
rust-analyzer.rustc.source = "discover" 
# on auto-import, prefer importing from `prelude`
rust-analyzer.imports.preferPrelude = true
# don't look at references from tests
rust-analyzer.references.excludeTests = true

cargo sweep

If you are like me, you can get a very large target/ folder.

cargo sweep will remove outdated build artifacts:

$ cargo sweep --time 14 # remove build artifacts older than 2 weeks
$ cargo sweep --installed # remove build artifacts from old rustcs

Pro Tip: Add a cronjob (for example, every Friday at 10 AM):

0 10 * * fri sh -c "rustup update && cargo sweep --installed"

cargo wizard

This is a subcommand that will give you a TUI to configure your project, giving you a suitable Cargo.toml etc.

cargo pgo

Profile-guided optimization is a great technique to eke out that last bit of performance without needing any code changes. I didn’t go into detail on it because Aliaksandr Zaitsau did a whole talk on it and I wanted to avoid the duplication.

cargo component

This tool will allow you to run your code locally under a WASM runtime.

  • Run your code in wasm32-wasip1 (or later)
  • the typical subcommands (test, run, etc.) work as usual
  • can use a target runner:
[target.wasm32-wasip1]
runner = ["wasmtime", "--dir=."]

The argument is used to allow accessing the current directory (because by default the runtime will disallow all file access). You can of course also use different directories there.

bacon

compiles and runs tests on changes

great to have in a sidebar terminal

‘nuff said.

Language: Pattern matching

Rust patterns are super powerful. You can

  • destructure tuples and slices
  • match integer and char ranges
  • or-combine patterns with the pipe symbol, even within other patterns (note that the bindings need to have the same types). You can even use a pipe at the start of your pattern to get a nice vertical line in your code (see below)
  • use guard clauses within patterns (pattern if guard(pattern) => arm)
match (foo, bar) {
  (1, [a, b, ..]) => todo!(),
  (2 ..= 4, x) if predicate(x) => frobnicate(x),
  (5..8, _) => todo!(),
  _ => ()
}

if let Some(1 | 23) | None = x { todo!() }

match foo {
  | Foo::Bar
  | Foo::Baz(Baz::Blorp | Baz::Blapp)
  | Foo::Boing(_)
  | Foo::Blammo(..) => todo!(),
  _ => ()
}

matches!(foo, Foo::Bar)

Also patterns may appear in surprising places: Arguments in function signatures are patterns, too – and so are closure arguments:

fn frobnicate(Bar { baz, blorp }: Bar) {
  let closure = |Blorp(flip, flop)| blorp(flip, flop);
}

What’s more, patterns can be used in let and in plain assignments:

let (a, mut b, mut x) = (c, d, z);
let Some((e, f)) = foo else { return; };
(b, x) = (e, f);

As you can see, with plain let and assignment, you need an irrefutable pattern (that must always match by definition), otherwise you can do let-else.

Language: Annotations

Use #[expect(..)] instead of #[allow(..)]: it will warn if the code in question is no longer linted (either because the code or clippy changed), whereas an #[allow(..)] would just linger.

#[expect(clippy::collapsible_if)]
fn foo(b: bool, c: u8) {
    if b {
        if c < 25 {
            todo!();
        }
    }
}

Add #[must_use] judiciously on library APIs to help your users avoid mistakes. There’s even a pedantic clippy::must_use_candidates lint that you can auto-apply to help you do it.

You can also annotate types that should always be used when returned from functions.

#[must_use]
fn we_care_for_the_result() -> Foo { todo!() }

#[must_use]
enum MyResult<T> { Ok(T), Err(crate::Error), SomethingElse }

we_care_for_the_result(); // Err: unused_must_use
returns_my_result(); // Err: unused_must_use

Traits sometimes need special handling. Tell your users what to do:

#[diagnostic::on_unimplemented(
    message = "Don't `impl Fooable<{T}>` directly, `#[derive(Bar)]` on `{Self}` instead",
    label = "This is the {Self}",
    note = "additional context"
)]
trait Fooable<T> { .. }

Sometimes, you want internals to stay out of the compiler’s error messages:

#[diagnostic::do_not_recommend]
impl Fooable for FooInner { .. }

Library: Box::leak

For &'static, once-initialized things that don’t need to be dropped:

let config: &'static Configuration = Box::leak(create_config());
main_entry_point(config);

The End?

That’s all I could fit in my talk, so thanks for reading this far.

Easy Mode Rust (2024-03-28, https://llogiq.github.io/2024/03/28/easy)

This post is based on my RustNationUK ‘24 talk with the same title. The talk video is on youtube, the slides are served from here.

Also, here’s the lyrics of the song I introduced the talk with (sung to the tune of Bob Dylan’s “The times, they are a-changin’”):

Come gather Rustaceans wherever you roam
and admit that our numbers have steadily grown.
The community’s awesomeness ain’t set in stone,
so if that to you is worth saving
then you better start teamin’ up instead of toilin’ alone
for the times, they are a-changin’.

Come bloggers and writers who tutorize with your pen
and teach those new folks, the chance won’t come again!
Where there once was one newbie, there soon will be ten
and your knowledge is what they are cravin’.
Know that what you share with them is what you will gain
for the times, they are a-changin’.

Researchers and coders, please heed the call,
Without your efforts Rust would be nothin’ at all
and unsafety would rise where it now meets its fall.
May C++ proponents be ravin’.
What divides them from us is but a rustup install
for the times, they are a-changin’.

Fellow moderators throughout the land,
don’t you dare censor what is not meant to offend
otherwise far too soon helpful people be banned
and what’s left will be angry folks ragin’.
Our first order of business is to help understand
that the times, they are a-changin’.

The line it is drawn, the type it is cast
What debug runs slow, release will run fast
as the present now will later be past
and our values be rapidly fadin’
unless we find new people who can make them last
for the times, they are a-changin’.

Rust has an only somewhat deserved reputation for being hard to learn. That is mostly an unavoidable consequence of being a systems language that has to supply full control over the myriad specifics of your code and runtime. But I’d argue that our method of teaching Rust is rather more at fault for this reputation. So as an antidote to this “the right way to do it” thinking, I offer this set of ideas on how to learn as little Rust as possible to become productive, so you can start and have success right away and learn the harder parts later, once you’re comfortable with the basics.

In the talk I started with the “ground rules” for the exercise: I wanted to identify a small subset of Rust that will allow people to successfully write programs solving the problems in front of them without being overwhelmed by all kinds of new concepts. I am happy to forgo performance, brevity, or idiomatic code. In fact, some of the suggestions fly in the face of conventional guidelines on what good Rust code should look like. One of the questions after the talk was how to deal with new contributors or colleagues pushing “substandard” code to a project, and here my suggestion is to just merge it and clean it up after the fact. New users will feel unsure about their abilities, and nitpicking on details will put them off where we want to encourage their growth and learning, at least in the beginning.

Of course, the flipside of this is that I don’t suggest that every Rustacean learn only this subset and forever avoid all else. The idea here is to make you productive and successful quickly, and you can then build on that. A suggestion that also came up after my talk was to create a poster with a “research tree” (that is sometimes used in strategy games like e.g. Civilization to give people a path to progress without making it too linear). This is still on my list and I’ll open a repo for that soon, in the hope of finding people who’ll help me.

So without further ado, here are the things we want to avoid learning, and how to do that:

Syntax

Rust is not a small language. When starting out, for flow control it’s best to stick to basic things like if, for and while. If you need to distinguish e.g. enum variants, you can also use match, but keep it simple: Only match one thing, and avoid more complex things like guard clauses:

// Don't nest patterns in match arms
match err_opt_val {
  Some(Err(e)) => panic!("{e}"),
  _ => (),
}

// instead, nest `match` expressions
match err_opt_val {
  Some(err_val) => match err_val {
    Err(e) => panic!("{e}"),
    _ => (),
  },
  _ => (),
}

// Don't use guards
match w {
  Some(x) if x > 3 => { one(x) },
  Some(x) => { other(x) },
  None => (),
}

// instead, nest with `if`
// (that might require you to copy code)
match w {
  Some(x) => {
    if x > 3 {
      one(x)
    } else {
      other(x)
    }
  },
  None => (),
}

Avoid other constructs for now (such as if let or let-else). While they might make the code more readable, you can learn them later and have your IDE refactor your code quickly as you become privy to how they work. Within loops, avoid break and continue, especially with values. Rather, introduce a new function that returns the value from within the loop.
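As a sketch of that last suggestion (all names invented): instead of a loop that breaks with a value, a small helper function with an early return does the same job with simpler moving parts.

```rust
// Instead of `let found = loop { ... break x; ... };` ...
fn first_over_three(items: &[i32]) -> Option<i32> {
    for &x in items {
        if x > 3 {
            // an early `return` replaces `break` with a value
            return Some(x);
        }
    }
    None
}
```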

As discussed in the introduction, this will take more code and thus hurt both brevity and readability, but the individual moving parts are far simpler.

The Borrow Checker

In my talk, I used a classic example that will often come up during search algorithms: Extending a collection of items with filtered and modified versions of the prior items.

for item in items.iter() {
  if predicate(item) {
    items.push(modify(item));
  }
}

The code here pretty much mirrors what you’d do in e.g. Python. It’s simple to read and understand, and there aren’t any needless moving parts. Unfortunately, it is also wrong, and the compiler won’t hesitate to point that out:

  Compiling unfortunate v0.0.1
error[E0502]: cannot borrow `items` as mutable because it is also borrowed as immutable
  --> src/main.rs:16:13
   |
13 |     for item in items.iter() {
   |                 ------------
   |                 |
   |                 immutable borrow occurs here
   |                 immutable borrow later used here
...
16 |             items.push(new_item);
   |             ^^^^^^^^^^^^^^^^^^^^ mutable borrow occurs here

For more information about this error, try `rustc --explain E0502`.

Luckily, in almost all of the cases, we can split the immutable from the mutable borrows. In this particular case, we can simply iterate over a clone of our item list:

//               vvvvvvvv
for item in items.clone().iter() {
  if predicate(item) {
    items.push(modify(item));
  }
}

I used to be very wary of cloning in the past, considering it an antipattern, but unless that code is on the hot path, it literally won’t show up on your application’s profile. So going to the effort of avoiding that clone is premature optimization. However, if you have measured and found that the clone does in fact show up on your memory or CPU profile, you can switch to indexing instead:

for i in 0..items.len() {
  if predicate(&items[i]) {
    let new_item = modify(&items[i]);
    items.push(new_item);
  }
}

Note that this approach is more brittle than the clone-based one: while in this case the loop itself only uses integers and thus doesn’t borrow anything, we might still inadvertently introduce overlapping borrows into the loop body. For example, if we replaced the if with if let, items would be borrowed for the duration of the then-clause, thus causing the exact error we were trying to avoid in the first place. Also note that we put modify(..) into a local to avoid having it within the push, which might otherwise trip up the borrow checker.

Again, we’re not generally aiming for performance, so I would prefer the clone-based variant as much as possible.

Macros

Macros come up early in Rust. Literally the first Rust program everyone compiles (or even writes) is:

fn main() {
  println!("Hello, World!");
}

You’ll have a hard time writing Rust without calling macros, so I would suggest you treat them like you’d treat functions, with the caveat that they can have a rather variable syntax, but they’ll usually document that. So as long as you roughly know how to call the macro and what it does, feel free to call them as you like.

Writing macros is something we’ll want to avoid. Rust has a number of macro types (declarative macros, declarative macros 2.0, derive macros, attribute macros and procedural bang-macros), but we’re not going to look into writing any of those. All of these macro variants solve a single problem: code duplication.

Now the obvious simple solution to avoiding macros is: Duplicate your code.

While that sounds very simple, in fact the advice can be split into a hierarchy of solutions that depend on the problem at hand:

  1. up to 5 times, less than 10 lines of code, not expected to change: In this case I’d just copy & paste the code and edit it to fit your requirements. Of course, you still have the risk of introducing errors in one of the instances of the copied code, but with the resulting code being reasonably compact, you’ll have a good chance to catch those quickly.
  2. more than that, still not expected to change within a certain time: Who is better at creating multiple almost-identical instances of the same thing than you? Your computer, of course! So write some Rust code that builds Rust code by building strings (using format!(..) or println!(..)), call it once and copy the output into your code. Voilà!
  3. expected to be up to date with the rest of the code? In that case, put your code generation into a unit test that reads and splits out the current version of the code, generates the possibly updated version, compares both, and if they differ, writes the updated version of the code, then panics with a message telling whoever ran the test to commit the changes. It is helpful to add start and end marker comments to the generated code, both to make splitting it out easier and to document the fact that the code is generated.

In code, instead of doing:

macro_rules! make_foo {
  { $($a:ident),* } => { $(let $a = { "foo" };)* };
}
make_foo!(a, b);

Either 1. copy & paste instead:

let a = { "foo" };
let b = { "foo" };

Or 2. write code to generate code:

fn format_code(names: &[&str]) -> String {
  let mut result = String::new();
  for name in names {
    result += &format!("\nlet {name} = \"foo\";");
  }
  result
}

Additionally, 3. use a test to keep code updated:

#[test]
fn update_code() {
  let (prefix, actual, suffix) = read_code();
  let expected = format_code(&["a", "b"]);
  if expected == actual { return; }
  write_code(prefix, expected, suffix);
  panic!("updated generated code, please commit");
}

Alexey Kladov explains the latter technique better than I could in his blog post about Self-modifying code.
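The read_code part can be sketched using the marker comments mentioned above; the marker strings and the function name here are made up for illustration:

```rust
// Split a source file into (prefix, generated, suffix) on hypothetical
// marker comments; returns None if either marker is missing.
fn split_on_markers(source: &str) -> Option<(&str, &str, &str)> {
    const START: &str = "// @generated start\n";
    const END: &str = "// @generated end\n";
    let (prefix, rest) = source.split_once(START)?;
    let (generated, suffix) = rest.split_once(END)?;
    Some((prefix, generated, suffix))
}

fn main() {
    let file = "fn a() {}\n// @generated start\nfn b() {}\n// @generated end\nfn c() {}\n";
    let (prefix, generated, suffix) = split_on_markers(file).unwrap();
    assert_eq!(prefix, "fn a() {}\n");
    assert_eq!(generated, "fn b() {}\n");
    assert_eq!(suffix, "fn c() {}\n");
}
```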

Generics

Generics let us re-use code in various situations by being able to swap out types. However, they can also be a great source of complexity, so we’ll of course want to avoid them. As Go before version 1.18 has shown, you can get quite far without them, so unless it’s for the element type of collections, we’ll want to avoid using them.

So instead of writing

struct Foo<A, B> { .. }
fn foo<A, B>(a: A, b: B) { .. }

we’d monomorphize by hand (that is, build a copy of the code for each set of concrete types we need), so for each A/B combination, we’d write:

// given `struct X; struct Y`
struct FooXY { .. }
fn foo_x_y(a: X, b: Y) { .. }

Of course, you might end up with a lot of copies, so use the code generation from above to deal with that.
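As a concrete, made-up example of hand-monomorphization, with X and Y standing in for the types we actually need:

```rust
struct X(u32);
struct Y(&'static str);

// one hand-written copy per type combination, instead of `fn foo<A, B>`
fn foo_x_y(a: X, b: Y) -> String {
    format!("{}-{}", a.0, b.0)
}

fn main() {
    assert_eq!(foo_x_y(X(1), Y("one")), "1-one");
}
```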

Lifetimes

Lifetime annotations are those arcane tick+letter things that sometimes even stump intermediate Rust programmers, so you won’t be surprised to find them on this list. They look like this:

struct Borrowed<'a>(&'a u32);
fn borrowing<'a, 'b>(a: &'a str, b: &'b str) -> &'a str { .. }

Of course, we don’t want to burden ourselves with those sigil-laden monstrosities for now. To get rid of them, we have to avoid borrowing in function signatures. So instead of taking a reference, take an owned instance. And yes, this will incur yet more cloning. If you need to share an object, wrap it in an Arc (and if you also need to mutate it, in an Arc<Mutex<_>>):

struct Arced(Arc<u32>);
fn cloned(a: String, b: String) -> String { .. }
fn arced(a: Arc<String>, b: Arc<String>) -> String { .. }

Arc is a smart pointer that lets you .clone() without cloning the wrapped value. Both Arcs will point to the exact same value, so a change made through one handle (via interior mutability, e.g. a Mutex) is visible through the other.
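A minimal sketch of that sharing, using a Mutex for the mutation since an Arc alone only shares ownership:

```rust
use std::sync::{Arc, Mutex};

// mutate through one handle, observe the change through the other
fn shared_append() -> String {
    let a = Arc::new(Mutex::new(String::from("foo")));
    let b = Arc::clone(&a); // both handles point at the same String
    b.lock().unwrap().push_str("bar");
    let result = a.lock().unwrap().clone();
    result
}

fn main() {
    assert_eq!(shared_append(), "foobar");
}
```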

Traits

You can get a lot done without ever implementing a trait in Rust. However, there are some traits (especially in the standard library, but also in trait-heavy crates like serde) that you might need to get some stuff done. In many cases, you can use a #[derive(..)]-annotation, such as

#[derive(Copy, Clone, Default, Eq, PartialEq, Hash)]
struct MyVeryBadExampleIAmSoSorry {
  size: usize,
  makes_sense: bool,
}

In some cases, Rust lore would tell you to use trait based dispatch, but in most of those cases, an enum and a match or even a bag of if-clauses will do the trick. Remember, we’re not attempting to have our code win a beauty contest, just get the job done.
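A sketch of the enum-plus-match approach, with a made-up Shape example standing in for whatever would otherwise be trait-dispatched:

```rust
// instead of a `trait Area` with one impl per shape: one enum, one match
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
}

fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect { width, height } => width * height,
    }
}

fn main() {
    assert_eq!(area(&Shape::Rect { width: 2.0, height: 3.0 }), 6.0);
}
```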

Finally, if you use a framework that requires you to manually implement a trait, write impl WhateverTrait for SomeType and use the insert missing members code action from your IDE if available.

Modules and Imports

This is something we cannot completely avoid. If we don’t use any imports, our code can only use what’s defined in the standard library prelude, and we won’t get very far with that. And even if we did, we’d end up with a single file of 10,000+ lines of code, and no one wants to navigate that. On the other hand, when using modules, we should strive not to go overboard, lest we find ourselves in a maze of twisty little mod.rs files, all different (pardon the text adventure reference).

So we obviously need to import stuff we use, but how do we introduce mods? The key to keeping this simple is the observation that mods conflate code organization with the hierarchy of paths in our crate. So if I have a mod foo containing a bar, people using my code will have to either import or directly specify foo::bar. But there are two recipes we can follow to untangle those. Given an example lib.rs where our code has two functions:

pub fn a() {}
pub fn b() {}

Now in practice, those functions likely won’t be empty, and in most cases we’ll have more than two of them, but you want to read this blog post, not wade through screens of code, so let’s look at the first recipe we will use to move b to a new b.rs file without changing the path where b is visible from the outside. The recipe has three steps:

  1. Declare the b module in lib.rs and pub use b from it:
mod b;
pub use b::b;

pub fn a() {}
pub fn b() {}
  2. Create b.rs, non-publicly importing everything from above, so that fn b() won’t fail to compile because of missing paths from lib.rs:
use super::*;
  3. Move fn b() into b.rs:
use super::*;

// moved from `lib.rs`:
pub fn b() {}

Congratulations, you just split your code without anyone using it being the wiser.

The second recipe is to move the b path into the b module. In this case, we have to make the b module publicly visible and then remove the pub use from our lib.rs:

pub mod b; // added `pub` here
//pub use b::b; <-- no longer needed

pub fn a() {}

Voilà, your users won’t be able to call b() directly anymore unless they import it from b::b. Now modules are still a messy beast, but at least there are easy steps to take to deal with them.

Async

Rust async is arcane, powerful and still has a good number of rough edges. So unless you’re writing a web service that needs to serve more than fifty thousand concurrent users on a single machine, try to avoid it (remember, we’re not after performance here). However, some libraries you may want to use will require async code. In this case, pick an async runtime (most libraries will work with tokio, so that seems a safe choice) and

  • write your functions as you would write a normal function, prepending async before the fn
  • add .await after every function call, and then remove it again wherever the compiler complains
  • avoid potentially locking mechanisms such as Mutex, RwLock, and channels. If you absolutely must use one of them, your async runtime will provide replacements that won’t deadlock on you

You might still run into weird errors. Don’t say I didn’t warn you.

Data Structures

If I had a penny for each internet troll asking me to write a doubly-linked list in safe Rust (hint: I can, but I don’t need to, there’s one in the standard library), I’d be a very well-off man. So you won’t be surprised to read me suggesting you avoid writing your own data structures. In fact I’ll go one step further and allow you to put off learning what data structures are already provided, because you can get by with only two in the majority of cases: Sequence-like and Lookup-like.

Sequences

Whether you call them lists, or arrays, or sequences is of little import. This type of structure is usually there to be filled and later iterated over. For example, if we want to have three items:

  1. start
  2. more
  3. end

we can put them in a Vec:

vec!["start", "more", "end"]

Use Vecs everywhere you want to sequentially iterate items.

Lookups

Those may be called maps, or dictionaries, and allow you to associate a key with a value, to later retrieve the value given the key.

KEY        VALUE
recursion  please look at “recursion”

we will use HashMaps for this case:

HashMap::from([
  ("recursion", "please look at recursion"),
])

While hashmaps can be iterated, the main use case is to get a value given a key.

You will be surprised how many programs you can create by just sticking to those two structures.
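A small sketch combining both structures (the data is made up):

```rust
use std::collections::HashMap;

// sequence: ordered items, filled and later iterated over
fn steps() -> Vec<&'static str> {
    vec!["start", "more", "end"]
}

// lookup: retrieve a value given a key
fn glossary() -> HashMap<&'static str, &'static str> {
    HashMap::from([("recursion", "please look at \"recursion\"")])
}

fn main() {
    assert_eq!(steps().join(", "), "start, more, end");
    assert_eq!(glossary()["recursion"], "please look at \"recursion\"");
}
```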

Custom Iterators

Rust iterators are also very powerful, and by calling their combinator functions (like filter, map etc.) you can create new iterators. There are only two points to be mindful of: First, the combinator functions won’t iterate anything; they just wrap the given iterator in a new iterator type that modifies its behavior. Second, the resulting types are usually very hard, if not impossible, to write out. So you should avoid returning such a custom iterator from a function, instead collecting into a Vec or HashMap (see above). If however your iterator is infinite (yes, that can happen), you obviously cannot collect it. In those rare cases, here’s the magic trick to make it work:

fn return_custom_iterator() -> Box<dyn Iterator<Item = MyItemType>> {
  // let's say we filter and map an effectively infinite range of integers
  let iter = (0_usize..).filter(predicate).map(modify);
  Box::new(iter) as Box<dyn Iterator<Item = MyItemType>>
}
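A concrete, runnable variant of that trick (the even-squares iterator is made up for illustration):

```rust
// boxing the infinite iterator lets us name the return type as a trait object
fn even_squares() -> Box<dyn Iterator<Item = usize>> {
    Box::new((0_usize..).filter(|n| n % 2 == 0).map(|n| n * n))
}

fn main() {
    // infinite, so take a finite prefix instead of collecting it all
    let first: Vec<usize> = even_squares().take(3).collect();
    assert_eq!(first, vec![0, 4, 16]);
}
```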

So now you know what things you can put off learning while you’re being productive in Rust. Have fun!

]]>
Llogiq
Semantic Search with Rust, Bert and Qdrant2023-11-25T00:00:00+00:002023-11-25T00:00:00+00:00https://llogiq.github.io/2023/11/25/searchFirst, a bit of backstory: While working at Qdrant, I proposed a talk to RustLab Italy 2023, where I would live-code a semantic search. Unfortunately, I missed that my window manager (sway) did not support mirroring the screen, so I had to crane my neck to at least partially see what I was doing the whole time, which wasn’t conducive to explaining what I was doing. To everyone who attended the talk and was unhappy about my lackluster explanations, I offer this post, along with my sincere apology.

Before we delve into the code, let me give a short explanation of what semantic search is: Unlike a plain text search, we first transform the input (which may be text or anything else, depending on our transformation) into a vector (usually of floating-point numbers). The transformation has been designed and optimized to give inputs with similar “meaning” nearby vectors according to some distance metric. In machine learning circles, we call the transformation a “transformer model” and the resulting vector an “embedding”. A vector database is a database specialized to find nearby points in such a vector space very efficiently.

Now the program I wrote didn’t only search but also set up the database for the search. So the code contained the following steps:

  1. set up the model
  2. set up the Qdrant client
  3. match on the first program argument. If “insert”, go to step 4, if “find”, go to step 8, otherwise exit with an error
  4. (optionally) delete and re-create the collection in Qdrant
  5. read a JSONL file containing the text along with some data
  6. iterate over the lines, parse the JSON into a HashMap, get the text as a value, embed it using the model and collect all of that into a Vec.
  7. insert the points into Qdrant in chunks of 100 and exit
  8. embed the next argument and
  9. search for it in Qdrant, print the result and exit

The reason I used JSONL for importing my points was that I had a file from the Qdrant page search lying around that was created by crawling the page, containing text, page links and CSS selectors, so the page could highlight the match. In hindsight, I could have pre-populated my database and the talk would still have been pretty decent and 33% shorter, but this way it was really from scratch, unless you count the Cargo.toml, which (because the network at the conference had been flaky the day before) I had prepopulated with the following dependencies, running cargo build to pre-download everything I might need during the talk:

[dependencies]
qdrant-client = "1.6.0"
rust-bert = { version = "0.21.0", features = ["download-libtorch"] }
serde_json = "1.0.108"
tokio = { version = "1.34.0", features = ["macros"] }

I didn’t use an argument parsing library because the arg parsing I had to do was so primitive (just two subcommands with one argument each) and choosing a crate would invariably have led to questions as to why. std::env::args is actually OK, folks. Just don’t forget that the zeroth item is the running program. I also had main return a Result<(), Box<dyn std::error::Error>>, which let me use the ? operator almost everywhere but in one closure (which used unwraps because I wasn’t ready to also explain how Rust will collect Iterators of Result<T, E> into Result<Vec<T>, E>).

For the embedding I chose the rust-bert crate, because it is the easiest way to do workable embeddings. The model I used, AllMiniLmL12V2, is a variant of the simple and well-tested BERT model with an output size of 384 values, trained to perform semantic search via cosine similarity (both facts will feature later in the code). For the data storage and retrieval, I obviously chose Qdrant, not only for the reason that it’s my former employer, but also because it performs really well, and I could run it locally without any compromise. One thing that was a bit irritating is that rust-bert always embeds a slice of texts and always returns a Vec<Vec<f32>> where we only need one Vec<f32>, so I did the .into_iter().next().unwrap() dance to extract that.
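That extraction dance can be shown on a stand-in Vec<Vec<f32>> (no rust-bert needed for the illustration; the function name is made up):

```rust
// rust-bert's encode returns one row per input text; with a single
// input we only want the first row
fn first_embedding(rows: Vec<Vec<f32>>) -> Vec<f32> {
    rows.into_iter().next().unwrap()
}

fn main() {
    let rows = vec![vec![0.1_f32, 0.2, 0.3]];
    assert_eq!(first_embedding(rows), vec![0.1, 0.2, 0.3]);
}
```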

In practice, choosing such a small-ish model allows one to obtain well-performing embedding code, leading to lean resource usage and acceptable recall in most cases. For specialized applications, the model may be re-trained to perform better without any performance penalty. While the trend has been going towards larger and larger models (the so-called Large Language Models, or LLMs for short), there is still a lot of mileage in the small ones, and they’re much less cost-intensive.

In Qdrant, the workflow for setting up the database for search is to create the collection, which must at least be configured with the vector size (here 384) and the distance function (here cosine similarity). I left everything else at the default value, which for such a small demo is just fine. Qdrant, being the speed devil among the vector databases, has a lot of knobs to turn to make it, in the best Rust tradition, run blazingly fast.

In the code I removed the collection first, so I would delete any failed previous attempts at setting up the collection, just as a safety measure. Then I embedded all the JSONL objects, converted them into Qdrant’s PointStructs and upserted (a term that means a mixture of inserting and updating) them in batches of 100, another safety measure to steer well clear of any possible timeouts that might have derailed the demo. For the search, the minimum parameters are the vector, the limit (one can also use a distance threshold instead) and a with_payload argument that will make Qdrant actually retrieve the payload. I cannot explain why the latter is not the default, but it’s not too painful to ask for it. I can only conjecture that some users only care about IDs, having downstream data sources from which to retrieve their payloads.

Because Qdrant’s API is async, I also pulled in tokio with the main macro enabled. This way, I had to await a few things, but it otherwise didn’t detract from the clarity of the code, which is nice.

I hope that clears things up. Here is the code 100% as it was at the end of my talk:

use std::collections::HashMap;

use qdrant_client::{
    prelude::*,
    qdrant::{VectorParams, VectorsConfig},
};
use rust_bert::pipelines::sentence_embeddings::SentenceEmbeddingsBuilder;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let model = SentenceEmbeddingsBuilder::remote(
        rust_bert::pipelines::sentence_embeddings::SentenceEmbeddingsModelType::AllMiniLmL12V2,
    )
    .create_model()?;
    let client = QdrantClient::from_url(&std::env::var("QDRANT_URL").unwrap())
        .with_api_key(std::env::var("QDRANT_API_KEY"))
        .build()?;
    let mut args = std::env::args();
    match args.nth(1).as_deref() {
        Some("insert") => {
            let _ = client.delete_collection("points").await;
            client
                .create_collection(&CreateCollection {
                    collection_name: "points".into(),
                    vectors_config: Some(VectorsConfig {
                        config: Some(qdrant_client::qdrant::vectors_config::Config::Params(
                            VectorParams {
                                size: 384,
                                distance: Distance::Cosine as i32,
                                ..Default::default()
                            },
                        )),
                    }),
                    ..Default::default()
                })
                .await?;
            let Some(input_file) = args.next() else {
                eprintln!("usage: semantic insert <path/to/file>");
                return Ok(());
            };
            let contents = std::fs::read_to_string(input_file)?;
            let mut id = 0;
            let points = contents
                .lines()
                .map(|line| {
                    let payload: HashMap<String, Value> = serde_json::from_str(line).unwrap();
                    let text = payload.get("text").unwrap().to_string();
                    let embeddings = model
                        .encode(&[text])
                        .unwrap()
                        .into_iter()
                        .next()
                        .unwrap()
                        .into();
                    id += 1;
                    PointStruct {
                        id: Some(id.into()),
                        payload,
                        vectors: Some(embeddings),
                        ..Default::default()
                    }
                })
                .collect::<Vec<_>>();
            for batch in points.chunks(100) {
                let _ = client
                    .upsert_points("points", batch.to_owned(), None)
                    .await?;
                print!(".");
            }
            println!()
        }
        Some("find") => {
            let Some(text) = args.next() else {
                eprintln!("usage: semantic find <text>");
                return Ok(());
            };
            let vector = model.encode(&[text])?.into_iter().next().unwrap();
            let result = client
                .search_points(&SearchPoints {
                    collection_name: "points".into(),
                    vector,
                    limit: 3,
                    with_payload: Some(true.into()),
                    ..Default::default()
                })
                .await?;
            println!("{result:#?}");
        }
        _ => {
            eprintln!("usage: semantic [insert <path/to/file> | find <text>]");
            return Ok(());
        }
    }
    Ok(())
}
]]>
Llogiq
bytecount now ARMed and Web-Assembled2023-10-02T00:00:00+00:002023-10-02T00:00:00+00:00https://llogiq.github.io/2023/10/02/armedIt’s been some time since I wrote on this blog. I did blog at other venues and worked a lot, which left too little time to write here. So this is “just” a release announcement for my bytecount crate. And also a bit of talk around SIMD.

First things first: bytecount as of version 0.6.4 now supports ARM natively on stable using NEON intrinsics. Yay! While bytecount had unstable ARM SIMD support under the generic-simd feature (which used the nightly-only packed-simd crate that mimics the upcoming std::simd API), on stable Rust ARM users were restricted to 64-bit-integer-based pseudo-SIMD, which used some bit twiddling hacks to simulate SIMD with scalar integer operations.

One reason for this is that for the longest time, the only ARM CPU I owned was in my mobile phone, a cheap-ish Android device that nonetheless had the 4+4 core ARM CPU setup that is now so common. Many folks probably don’t know, but there’s Termux, an Android terminal emulator + environment which has included Rust for quite some time. So I ran a small benchmark trying to count through 1 gigabyte of random bytes, which came in at around 6G/s. Not too bad, given that it used bit twiddling instead of actual SIMD.

Termux only supplies a stable Rust toolchain though, but that didn’t stop me from also benchmarking the generic-SIMD version. Setting the RUSTC_BOOTSTRAP environment variable lets one use unstable features on stable Rust (beware: this is not guaranteed to be safe for any use!), so I ran the same benchmark with our generic SIMD implementation. The result came in at 11G/s.
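The bit twiddling idea mentioned above can be sketched roughly like this; this is an illustration of the technique, not bytecount’s actual code:

```rust
// Count how many of the eight bytes in `word` equal `needle`, using only
// scalar integer operations ("pseudo-SIMD").
fn count_in_word(word: u64, needle: u8) -> u32 {
    const LO7: u64 = 0x7f7f7f7f7f7f7f7f;
    // bytes equal to the needle become 0x00
    let x = word ^ u64::from_ne_bytes([needle; 8]);
    // set the high bit of every nonzero byte; masking with LO7 before the
    // add keeps carries from crossing byte boundaries
    let nonzero = ((x & LO7) + LO7) | x;
    // invert and count: exactly one high bit remains per matching byte
    (!(nonzero | LO7)).count_ones()
}

fn main() {
    let word = u64::from_ne_bytes(*b"abacabad");
    assert_eq!(count_in_word(word, b'a'), 4);
}
```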

As I recently had some motivation to get into SIMD stuff again, I set out to port bytecount to ARM. The algorithm is mostly the same; the only difference is the data layout: Where Intel’s AVX2 has 256 bit registers, and AVX-512 even 512 bit ones, NEON is still at 128 bits. However, there are enough registers available to check 4 of them for equality while still having enough to keep the count, so that’s what I did.

The implementation work was sped up considerably by me getting a MacBook Pro with an M2 from Flatfile for some optimization work which they generously let me use to test bytecount.

The final implementation is equal to or a small bit faster than the generic-SIMD one, because I can get away with a few fewer instructions in certain cases.

So bolstered by this success, I went on to tackle WebAssembly support next. The simd128 intrinsics in core::arch::wasm32 are actually quite similar to what you’d find on ARM, but named differently, plus for some reason there doesn’t seem to be a horizontal addition operation. However, that lack is compensated for by a number of widening addition ops that I could use to combine multiple values, so in the end the code looks quite similar to the aarch64 one.

This one got into the current version 0.6.5. So if you use bytecount, upgrade to enjoy a very healthy speed boost on ARM and on the web.

]]>
Llogiq
Rust in Rhymes II explainer2023-02-19T00:00:00+00:002023-02-19T00:00:00+00:00https://llogiq.github.io/2023/02/19/rhymes2Update: The talk video is now on YouTube. The slides are here.

As promised, here are some links and explanations to all the rhymes. First of all, the rap lyrics:

My boss said this time I should try a Rust rap
but my rapping skills are positively crap,
Even so, Rust and rapping are a thrilling combination
with which to introduce my talk to RustNation

In London we meet, and the workshops are complete,
so now you people have my talk approaching at full speed,
wakey, wakey, here we are, and I hope you will get
entertainment out of this, or some knowledge instead

Of course I won’t start right away with pages full of code
I’ll start with a few stories, and then switch to teaching mode.
I hope that by the end you’ll know a bit more than before
but otherwise enjoy the show, I’ll try hard not to bore.

I could go on and on about how many talks are lacking
the silliness, the fireworks, just going straight to hacking,
so have some whimsy, have some rhyme, some bars, a little flow
we’re here for having a good time, by golly, HERE WE GO!

In the first bars, we have the very rare self-diss, followed by some flow to announce our whereabouts. Then I go on to establish what I offer in the talk. Again “I’ll try hard not to bore” is a sarcastic self-diss, because of course no one was bored in the slightest. Finally a very tame diss against other talks that lack the entertainment value of this one. After all, I didn’t want to appear too aggressive during a Rust conference. Of course we will have some echoes of this later.

On to the rhymes:

Welcome to this talk, dear Rustacean,
You’re hailing from many a nation,
And you’ve waited some time
for most uplifting rhyme
so let’s start the edification!

A cordial welcome to the audience and setting some expectations for the talk.

Someone from outside London got
his head in my hotel too hot
and the fire alarm
worked this night like a charm
and woke everyone up on the spot.

I would like to have you know that I wrote the rhyme about the fire alarm after experiencing this exact thing: Someone activated the fire alarm in the hotel the RustNation speakers were sleeping in at about 2 AM. So it took a lot of coffee for me to be mostly coherent during the talk.

Since I last talked in rhyme about Rust
I worked in Java and just
three years ago moved
to Rust which behooved
me to go in completely. Or bust.

This is referencing my first “Rust in Rhymes” talk, where after declaring that the talk was about Rust, I showed a Java logo in the second slide as a half-joke. I was actually working in Java during that time.

JFrog, from the name you suppose
they used Java a while, and now those
folks who work to contain
errors in our supply chain
they’re here getting their five lines of prose.

The Rust foundation has teamed up with JFrog to evaluate and improve the crates.io/cargo ecosystem supply chain security. Good for us, I say!

Nowadays I write Rust to help you
getting into Rust code old and new
It will show you the code
and in semantic mode
it will also explain what’s it do.

Please have a look at bloop where we do code search with a semantic twist if you want to know more.

Rust’s steadily been on the rise,
to no Rustacean’s surprise
many people take note
of our bug-antidote
without any speed compromise.

The slide has a Google Trends chart that shows the Rust community experiencing exponential growth. The rhyme also extols the quality of our language, which combines correctness with performance without compromising on either.

Rust has now made it on TV
where hackers compiling you see
so join us and write
Rust code with shades by night
if you want as cool as them to be.

I must admit that I found this claim on twitter and didn’t bother to look up the TV show. The slide shows what I presume is a frame of the show that was part of the tweet. Still, it makes for a good rhyme, don’t you think?

Sometimes in Rust you will need
more time ‘til your code is complete
Where in Python you’re “done”
When the code is still wrong
And on perf it’ll never compete.

The slides show two persons, one having solved a puzzle with square pieces incorrectly, while the other is still missing a few pieces for a correctly solved puzzle, alluding to the fact that Rust won’t let you cut any corners with your solution, while dynamically typed languages will be far more lenient, leading to potentially costly bugs later down the road.

There once was a runtime called bun
who thought beating Rust perf was fun.
Rustaceans said “please
bench with dash-dash-release”
Now look at the benchmarks…they’re gone.

The bun javascript runtime written in Zig arrived with a splash, boasting impressive benchmark claims (“faster than Rust!”) that didn’t hold up well to scrutiny (they had compiled the Rust benchmarks in debug mode). Once Rustaceans pointed that out, the benchmarks were quickly removed without further comment.

With Rust and Ada some find
them competing for shares of our mind
Ferrous and AdaCore
thought “we might achieve more
by applying our powers combined!”

Ferrous Systems and AdaCore announced their partnership to promote the use of both Ada and Rust in safety-critical fields like automotive, aerospace, medicine-tech etc. The rhyme celebrates this announcement while arguing for more unity between developers of different language communities.

The Ferrocene scheme will contain
necessary steps to obtain
a Rust certified to
ISO 26262
Let’s hope the effort’s not in vain.

Ferrous Systems announced working on certification of a Rust compiler that will lag behind the most current version and hopefully be allowed to be used in those safety-critical fields where certification is often required by law, bringing safety, performance and productivity benefits to those working with it.

Some functional zealots will tell
that mutation can never go well
but if we never share
what’s mutated, we spare
us from data race debugging hell.

This rhyme notes that in functional programming, you avoid mutation altogether, even if it’s not mutation itself that is the problem, but only the overlap of mutated and shared state. Rust lets us rule out this overlap without the cost of avoiding mutation. This has led some functional purists to decry Rust as inferior, which of course we know is just more gatekeeping.

Some coders will still look in vain
for reasons with C to remain
but “freedom” to crash
is an unholy mash
of stupidity, hubris and pain.

Coming from the other side, the slide cites a redditor who claims to “use C for freedom” and asks others to follow that example. However, the perceived “freedom” here is mostly the freedom to get undefined behavior, as in practice the constraints Rust places upon us do not keep us from writing delight- and useful programs at all. So I’m poking some fun at this really bad take.

Bjarne wants C++ to sprout
some memory safety without
needing more than some lint
Well, good luck, and a hint:
That’s what Rust’s type system is all about.

This alludes to a discussion led by Bjarne Stroustrup, the original creator of C++, who claims they can get the benefit of memory safety with “some additional static checks”, despite the fact that both the Firefox and the Chrome teams tried to make C++ safer and have both turned to Rust instead. Even though this endeavor hasn’t seen much success in the past, I sincerely wish the C++ committee the utmost success, as improving the memory safety of a great many programs would be a fantastic boon to our digital security.

The Rust foundation is here
not to mandate, prescribe, boss or steer,
they support from within
Rustaceans for the win,
so let’s everyone give them some cheer!

I wrote this rhyme after talking to Rebecca Rumbul at the foundation stand in the conference lobby, mostly paraphrasing what she said to me. I applaud the work the foundation has done so far, and the audience audibly concurred.

By this time you probably just
have heard elsewhere of This Week in Rust.
I collect crate and quote
and PRs people wrote
for our main source of Rust news you trust.

I couldn’t stop myself from doing some advertisement of one of the community projects I’m involved in, also outlining my engagement with the project.

There was a certain Rust PR
that some folks think went too far
make internal code omit
certain numbers, to wit
in hex they’d fill the swearing jar.

The hex numbers are mostly things like 0xB00B1E5 which is juvenile and doesn’t fit in the professional setting of the standard library documentation. As there is no useful reason to have those numbers in the code whatsoever, the PR in question replaced them with plain consecutive digit ranges like 0x1234 and introduced a tidy check that disallowed re-adding those numbers in the future.

In Rust you will gladly embrace
the compiler’s help if you face
a refactor so hairy
your brain’s feeling airy
It’ll keep track of time, stuff and space.

I’ll get back to the reasons later, but this rhyme airs my feelings towards refactoring in Rust, where the compiler is often very helpful, not only showing what needs to be changed, but also how.

We all have learned that a bad
craftsman blamed the tools that they had
but Rust will well teach
us it’s not out of reach
to build far better tooling instead.

Rust’s tooling is top-notch in many places. Kudos to the respective teams. The rhyme still reminds us that the work here is never really done, and exhorts us to think about how we can improve our tooling even further, then do the work to get there.

For Androids it’s been a must
To be shiny and free of Rust
But to safety’s advance
Google loosened their stance
hoping blue tooths no longer combust.

Some word play here; both for Rust (the language, here mixed up with iron oxide) and Android (the operating system, here taken as a humanoid machine). The last line references the first part of Android that got reimplemented in Rust, which is the bluetooth stack, and the security problems of the previous implementation.

Now in Android version thirteen
there’s more Rust than ever has been.
But now that they wrote
millions Rust lines of code
memory bugs remain to be seen.

I had read a blog post by Jeffrey Vander Stoep, who found that unlike their C++ code, which contained roughly one vulnerability per 1000 lines of code (which, by the way, makes Android’s code top of the line in the industry), no memory-safety vulnerability has been found in their Rust code so far, across more than a million lines of code!

Meanwhile Rust in Linux progressed
to the point where it’s put to the test,
The patch builds the foundation
for the carcination
of drivers. This stuff is the best!

The Rust for Linux patches have been merged into mainline and there are some experimental drivers using it already. Those who have been benchmarked show similar performance to their C counterparts despite not having been thoroughly optimized, which is very promising.

Asahi Lina just took
the GPU of a Macbook
to write drivers in Rust
for Linux, one must
think this is an elite-worthy look.

15 thousand lines of Rust code later
Despite claims from many a hater
Lina’s work here shows
How kernel dev goes
Without any in-GPU crater.

Those two rhymes refer to the M1 GPU drivers project, which has working drivers written in Rust. The author has so far experienced very few problems during the implementation, presumably because of the strong type system and memory safety guarantees the language affords. Contrast this with the development experience of other GPU drivers, which still suffer from frequent crashes and hard-to-debug memory problems.

When your Rust goes embedded, you’ll find
no need to leave abstractions behind,
with a HAL (not the boor
who won’t open the door)
you get a piece of low-level mind.

I must admit that I’m not that knowledgeable around embedded Rust. That said, I’ve seen some hardware abstraction layers, which offer really nice abstractions on top of the often complex and arcane hardware APIs. The abbreviation of hardware abstraction layer is HAL, which I associated with HAL 9000, the antagonistic AI from Arthur C. Clarke’s “Space Odyssey” series, who is most certainly the opposite of helpful. The slide for this rhyme shows the famous red camera eye from the 1968 film.

When asked curl in Rust to rewrite
Daniel Stenberg said that he might
But he won’t rearrange
everything in one change
So it sure won’t be done overnight.

Daniel Stenberg, the author of the popular HTTP client tool curl has now introduced Rust into the code base. He’s quite pragmatic with it though, and has so far carefully and considerately introduced Rust.

Rust would be far too arcane
Without Esteban keeping us sane
for the message on error
sparks joy and not terror
though the steep learning curve does remain.

Esteban Küber (I hope that this is the right spelling) is leading the diagnostics working group, which has made rustc produce the most delightful error and warning messages ever known by programmerkind. The rhyme imagines a dystopian alternate reality where Rust error messages would be as incoherent and unhelpful as their counterparts in some other languages’ compiler implementations (although I should note that a rising tide lifts all boats; many of those implementations have improved following rustc’s example). The final line reminds us that while Rust has gotten far easier to learn in the last 7 years since 1.0.0, we can still improve on that front.

When rustc surprisingly showed
it converts a large int to a float
Mara Bos went complete-
ly berserk for more speed
the compiler’s builtins to demote.

Mara Bos, who leads the libs team and is the author of the fantastic “Rust Atomics and Locks” book, had blogged about a both entertaining and educational story which led her to reimplement the int-to-float conversion routines for Rust’s compiler builtins.

Nyx toolkit brings Rust into space
as other tools they replace
at speed they simulate
how craft accelerate
at this scale you can’t let data race.

Given that the industry sending people into space is usually quite conservative in what tech they use, it was surprising to me to see a Rust application in astrodynamics. Yet here we are.

nushell builds on Rust to be able
to work with data by the table
and letting you pipe
your data with type.
It’s elegant, mighty and stable.

I use nushell as my main shell on Windows (haven’t made the switch on Linux yet though), and I find it much more usable than PowerShell.

If your code depends at any rate
on the so-called “rustdecimal” crate
and you also did choose
Gitlab CI to use
Check your systems before it’s too late.

There was a security advisory from the rustsec group. Someone had typosquatted the rust-decimal (note the dash) crate by copying it and including some malicious code in the build script that installed a backdoor when it found itself running on GitLab CI.

Rust enums enjoy much attention,
well deserved, so allow me to mention
with match they enable
us to code more stable
and refactor to match our intention.

Here the main joke was the slide which showed an enum Talk { Boring(TechnicalStuff), Interesting(Rhymes) }, echoing the diss from my Rap. The rhyme itself alludes to the fact that we have exhaustiveness checking for matching enums, and so changing them will give us compiler errors at the exact places we need to change to continue our refactor.
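To sketch what the rhyme means by refactoring to match our intention: a minimal version of the slide’s enum (the variant payloads here are made up), where exhaustive matching turns every new variant into a compile error at exactly the places the refactor must touch.

```rust
// Variant payloads are invented for this sketch.
enum Talk {
    Boring(&'static str),
    Interesting(&'static str),
}

fn describe(talk: &Talk) -> String {
    // Exhaustive match: adding a `Talk` variant later makes this a
    // compile error, pointing us at every spot that needs updating.
    match talk {
        Talk::Boring(topic) => format!("technical stuff: {topic}"),
        Talk::Interesting(kind) => format!("rhymes: {kind}"),
    }
}

fn main() {
    let talk = Talk::Interesting("limericks");
    assert_eq!(describe(&talk), "rhymes: limericks");
}
```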

Now if you just need a bool
from a match, this macro is cool
unless you must take out
of your type, it’s about
the best match code-size reducing tool.

Having written a good number of if let ... { true } else { false } during my work on clippy, I’m totally happy to have the matches! macro at my disposal, so here it got its well-deserved rhyme. The code on the slide just matches the enum of the previous one.
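A small sketch of the macro in action (reusing the invented enum from the previous slide): matches! collapses the whole if let ... { true } else { false } dance into a single expression.

```rust
enum Talk {
    Boring(&'static str),
    Interesting(&'static str),
}

fn is_interesting(talk: &Talk) -> bool {
    // Equivalent to: if let Talk::Interesting(_) = talk { true } else { false }
    matches!(talk, Talk::Interesting(_))
}

fn main() {
    assert!(is_interesting(&Talk::Interesting("rhymes")));
    assert!(!is_interesting(&Talk::Boring("tech")));
}
```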

With traits your code faces ascension
by internal Rust type extension
where you tell everything
how to do anything
so allow me the pay-offs to mention.

This rhyme refers to the power of extension traits which allow us to implement behavior for types defined elsewhere. I wrote this rhyme on short notice after Luciano Mammino’s talk where he spoke about his delight at being able to implement his own iterator methods.
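As a minimal sketch of the pattern (trait and method names invented here): a trait we define ourselves can hang a new method on a foreign type like str.

```rust
// Hypothetical extension trait: `Shout` and `shout` are made-up names.
trait Shout {
    fn shout(&self) -> String;
}

// We own the trait, so we may implement it for `str`,
// a type defined in the standard library.
impl Shout for str {
    fn shout(&self) -> String {
        self.to_uppercase() + "!"
    }
}

fn main() {
    assert_eq!("hello rustfest".shout(), "HELLO RUSTFEST!");
}
```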

When in Rust code macros you use
I worry a hard way you choose
for debugging it might
become heck of a fight
which means more time later you lose.

I’ve written quite a few macros during my work in Rust, and I’m keenly aware that they can make debugging much harder. So here’s a warning to the unwary to think twice about whether they need a macro for something. Often, generics can do the trick, or the duplication a macro would prevent might be the lesser evil.

In Rust when a thing you mutate
And desire to annotate
that it no longer should
change you easily could
let it be itself until its fate.

This rhyme shows how one can make use of let bindings and shadowing to restrict mutability of a value to a certain area of the code. Keeping values immutable often makes it easier to follow the code, because you don’t have to worry that they change from under your nose. The code on the slide is as follows:

let mut x = get_x();
mutate_x(&mut x);
let x = x;
// x is now immutable

Last time I showed how to break Rust
but this easter egg is now dust,
here’s how out of a scope
you can jump without rope
which is useful and far more robust.

The first two lines reference the RustFest Zurich talk again, which had a rhyme about a Rust easter egg where if you wrote loop { break rust; }, the compiler would output a funny message. Now in the meantime, Rust has allowed break to be used outside loops as a limited form of goto. The slide shows a code example of this:

'scope: { ..
    if broke { break 'scope; }
    ..
}
let result = 'scoop: { ..
    if let Some(value) = scoop() {
        break 'scoop value;
    }
    default_value
};

Rust loops come in many a style
but few know it too has a do-while
though its looks may bemuse
you can put it to use
and rest assured it’ll compile.

Again, a funny yet sometimes useful trick where one can emulate a do-while loop using a block in the while condition:

let mut x = 1;
// behold: do-while!
while {
    x += 1;
    x < 10
} {}

Sometimes iterating a Vec
by index incurs a bounds check
however a slice
often won’t, which is nice.
So go get your runtime perf back.

Sergey “Shnatsel” Davidoff wrote a very good blog post on how to avoid bounds checks, which sometimes cost a bit of performance when they appear on the hot code path. The rhyme and code on the slide show the easiest technique, which is using a slice.
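One way the slice technique can look (a sketch, with made-up function names): slicing to a known length up front performs a single bounds check, after which the optimizer can drop the per-iteration checks inside the loop.

```rust
fn sum_first_four(values: &[i32]) -> i32 {
    // One bounds check here (panics if `values` is shorter than 4)...
    let head = &values[..4];
    let mut sum = 0;
    for i in 0..4 {
        // ...so the indices below are provably in range and the
        // optimizer can elide the checks inside the loop.
        sum += head[i];
    }
    sum
}

fn main() {
    assert_eq!(sum_first_four(&[1, 2, 3, 4, 5]), 10);
}
```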

Just sometimes a Rust Result looks
like an owl sitting on stacks of books.
With two units as eyes,
on embedded it flies
over meadows and forests and brooks.

This poem is dedicated to the “Owl” result type (Result<(), ()>), which is often used in embedded systems to denote success or failure, where it is far nicer than using a bool while still having the same size.

Rust has special syntax to let
you a function to drop values get
It is named for its looks
toilet closure, the books
tell you to use mem::drop instead.

Another syntax-based poem, this time about the toilet closure (|_|()).

In February a year ago
Rust 1.59 stole the show
with assignments that could
destructure where you would
do multiple ones, now you know.

Rust let bindings use patterns to allow for destructuring, which also afforded them the option to assign multiple bindings via tuples (e.g. let (a, b) = (1, 2)). However, assignments were not awarded this luxury. Rust 1.59.0 finally brought them up to feature parity, so we can write (a, b) = (1, 2); in valid Rust code (given pre-existing mutable bindings of a and b).
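For illustration, a tiny sketch of destructuring assignment, here used for the classic swap:

```rust
fn main() {
    let (mut a, mut b) = (1, 2); // `let` with a tuple pattern
    (a, b) = (b, a);             // plain assignment, destructured (Rust 1.59+)
    assert_eq!((a, b), (2, 1));
}
```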

One cool byproduct of this option
that has not seen too much adoption
is that you can ignore foo
without let as you do,
it’s a small RSI antitoxin.

A good number of types and functions in Rust have the #[must_use] attribute, which makes the unused_must_use lint warn when the return value of such a function, or a value of such a type, is left unused. However, as an escape hatch, the lint does not warn if the result is assigned or bound via let _ = ... So you’ll see such wildcard lets from time to time in Rust code. Since assignments now also target patterns, we can simply assign to a wildcard, thus saving the effort of writing let. It will likely be a while until this simplification is widely used.
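A minimal sketch of the two spellings (the function here is made up):

```rust
#[must_use]
fn compute() -> i32 {
    42
}

fn main() {
    // The classic escape hatch for unused_must_use:
    let _ = compute();
    // Since destructuring assignment, the `let` can be dropped entirely:
    _ = compute();
}
```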

Rust used to use let but with any
irrefutable bindings, so many
uses where one would guard
the code was somewhat hard
now you get them a dozen a penny.

This rhyme celebrates the advent of let-else, which allows us to use a refutable let as a guard, followed by a diverging else block.
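A sketch of let-else as a guard (the function and the default value are invented for this example):

```rust
fn port_from(arg: &str) -> u16 {
    // Refutable pattern: if parsing fails, the `else` block must diverge,
    // here by returning a made-up default.
    let Ok(port) = arg.parse() else {
        return 8080;
    };
    port
}

fn main() {
    assert_eq!(port_from("80"), 80);
    assert_eq!(port_from("not-a-port"), 8080);
}
```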

Atomics allow you to choose
what ordering on access to use.
While on X-eighty-six
You may easily mix
them, on ARM the results may confuse.

The slide showed a chain link fence. After reciting the rhyme, I asked the attendants: “Know what that is? – A memory fence.”

In Rust you may know that the Range
type does not impl Copy, how strange!
But I’ve heard that some fine
folks hatch plans to combine
their powers this to rearrange.

Many people have noted that Range<T> doesn’t implement Copy even when T does. The reason for this is that historically, Range was defined to be an iterator itself (instead of merely implementing IntoIterator), and accidentally copying the iterator would mean that iterating the copy leaves the original unchanged, a subtle source of bugs. In the meantime the iterator protocol (how Rust implements a for loop internally) has been overhauled so that this restriction is no longer necessary, but it will probably take another edition to allow us to copy ranges.
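A quick sketch of the consequence: since Range is not Copy, reusing one today takes an explicit clone.

```rust
fn main() {
    let range = 0..5;
    // `range` is not Copy, so we clone it to keep the original usable:
    let sum: i32 = range.clone().sum();
    let count = range.count(); // this call consumes `range`
    assert_eq!((sum, count), (10, 5));
}
```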

Sometimes dyn traits in Rust do require
thinking outside the box to acquire
a solution to track
different types on the stack
to quench our full-stack desire.

I’ve blogged about this before: You can get &dyn references to values of different types without boxing them, but you need a distinct stack slot for each, because a slot can only ever hold one type.
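A minimal sketch of the idea (the values and branch condition are made up): two deferred stack slots, each holding a different concrete type, unified behind one &dyn reference without any Box.

```rust
use std::fmt::Display;

fn main() {
    // Two distinct stack slots, initialized in at most one branch each:
    let number;
    let text;
    let chosen: &dyn Display = if std::env::args().count() > 1 {
        number = 42_u32;
        &number
    } else {
        text = "forty-two";
        &text
    };
    println!("{chosen}");
}
```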

To many a newbie’s surprise
we often use a str slice
for a lot of stuff
borrowing is enough
and not needing more memory is nice.

This rhyme merely states the fact that when working with strings, borrowing a slice of them often is sufficient.
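For instance (a sketch, with invented names): a function taking &str works with borrowed subslices of a String without any new allocation for the name itself.

```rust
fn greet(name: &str) -> String {
    format!("Hello, {name}!")
}

fn main() {
    let owned = String::from("Ferris the crab");
    // Borrow the first word as a subslice; nothing is copied.
    let first = &owned[..6];
    assert_eq!(greet(first), "Hello, Ferris!");
}
```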

To defer a task isn’t hard,
in Rust you can just use a guard
which is a type that
does on drop what you had
it in mind to complete from the start.

The slide for this rhyme shows a royal English guard. In Rust, those guard types are really useful, not only for locks.
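A minimal drop-guard sketch (the guard type is invented here): the closure runs when the guard leaves scope, no matter how the scope is exited.

```rust
use std::cell::RefCell;

// Hypothetical guard type: runs its closure on drop.
struct DeferGuard<F: FnMut()>(F);

impl<F: FnMut()> Drop for DeferGuard<F> {
    fn drop(&mut self) {
        (self.0)();
    }
}

fn main() {
    let log = RefCell::new(Vec::new());
    {
        let _guard = DeferGuard(|| log.borrow_mut().push("cleaned up"));
        log.borrow_mut().push("working");
    } // guard dropped here, the deferred closure runs
    assert_eq!(*log.borrow(), ["working", "cleaned up"]);
}
```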

When a large naïve linked list you drop
your program will crash and then stop
for you’ll exhaust the stack
from there it’s no way back
so please do manual while let-pop.

Aleksey Kladov pointed out that the autogenerated Drop implementation would recurse into a singly-linked list (as defined by struct Node<T> { value: T, next: Option<Box<Node<T>>> }) in a way that could not be done by tail recursion, so a reference to each node would land on the stack. For large lists, this could overflow the stack. The solution here is to manually follow the list with while let:

impl<T> Drop for Node<T> {
    fn drop(&mut self) {
        // Take the link first, so the node we overwrite below has
        // `next == None` and dropping it cannot recurse.
        while let Some(next) = self.next.take() {
            *self = *next;
        }
    }
}

With Rust, every coder feels safe,
so some venture ever so brave
into OS flaws to
poke some holes into
the cover to then rant and rave.

The slide shows a tweet by Patrick Walton who lamented that it is hard to explain why Rust doesn’t forbid reading from or writing to /proc/self/mem on Linux (which can be misused to poke holes in our aliasing guarantees). It is generally felt that this is a weakness of the operating system and out of the purview of the language; besides, checking this would carry a cost on every file access, which is certainly not proportional to the risk.

When in Rust crates features you use
and don’t ask for docs.rs to choose
to document all
it’s dropping the ball,
as with features it’s somewhat obtuse.

If you want docs.rs to document all your crate’s features, put the following into your Cargo.toml:

[package.metadata.docs.rs]
all-features = true

A cool thing in nightly rust be
called “portable SIMD”,
which means you can say
in a cross-platform way
how instructions should parallel be.

The slides show the following code and output:

#![feature(portable_simd)]
use std::simd::f32x4;
fn main() {
    let x = f32x4::from_array([0., 1., 2., 3.]);
    let y = x * f32x4::splat(2.);
    dbg!(y);
}
$ rustc +nightly simd.rs -O && ./simd
[simd.rs:6] y = [
    0.0,
    2.0,
    4.0,
    6.0,
]

This showcases a simple use of portable SIMD that works on all CPU architectures (unlike std::arch-based SIMD, which is stable on e.g. x86 but bound to that platform).

If your code has the requirement
that a type be not Sync or not Send
it’s time to introduce
a phantom to choose
which type’s contract you will amend.

The slide shows the following code, which defines type aliases for PhantomDatas that, when added to a type, preclude that type being Send or Sync, respectively. This way, Rust allows us to guarantee that values of certain types won’t ever pass or be shared across a thread boundary, which can sometimes be useful for correctness and optimizations.

use std::cell::Cell;
use std::marker::PhantomData;

type NoSend = PhantomData<*mut u8>;
type NoSync = PhantomData<Cell<u8>>;

struct MyType(usize, NoSend, NoSync);

thread::scope, the first time around
Rust one-zero, alas, was unsound
But if you’re keeping track
you know that it’s back,
this time with correct lifetime bound.

One of Rust’s most impressive features, borrowing across threads, had a problem when it first appeared; it relied on the drop of the JoinGuard actually being called for soundness. So the feature was left unstable in the run-up to 1.0.0. Later, the crossbeam crate improved on this with its own implementation, whose slightly different interface introduced a lifetime bound into the method’s scope parameter. The standard library even improved on that in Rust version 1.63 by using two different lifetimes, one for the environment and one for the scope.
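A sketch of the stabilized API (the data and the summing are made up): scoped threads may borrow from the enclosing stack frame, and are all joined when the scope ends.

```rust
use std::thread;

fn main() {
    let data = vec![1, 2, 3];
    let mut sum = 0;
    thread::scope(|s| {
        s.spawn(|| {
            // Borrows `data` (shared) and `sum` (mutable) from the
            // enclosing stack frame: no 'static bound, no Arc needed.
            sum = data.iter().sum();
        });
    }); // every spawned thread has been joined by this point
    assert_eq!(sum, 6);
}
```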

I told you about Cow last time
but failed to mention in my rhyme
that you can assign-add
change in-place what you had
here to_mut is its partner in crime.

“last time” here is of course referring to the first “Rust in Rhymes” talk I gave at RustFest Zürich. This time, I also returned to Cow (no silly pictures this time, sorry) to showcase its abilities. The slide reads:

use std::borrow::Cow;

// Cow says Moo
let mut cow = Cow::Borrowed("Clone on write");
cow += " makes ownership optional";

cow.to_mut().make_ascii_uppercase();

This code constructs the abbreviated sentence “Clone on Write makes ownership optional” as in “Cow” says “moo”.

Formatting macros now combine
format strings with arguments inline!
Note that this will fail to
work with all things you do
not in local bindings define.

This new feature is really cool; you can write format!("<{foo}>") to put the contents of foo in angle brackets. Unfortunately, rust-analyzer isn’t yet ready to follow references through inline formatting arguments, so it’ll likely be a while until this feature is widely used.
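A tiny sketch of the feature (the variable names are made up): identifiers are captured straight from the format string, and even width parameters can be inlined.

```rust
fn main() {
    let foo = "captured";
    // The identifier is looked up directly from the format string:
    assert_eq!(format!("<{foo}>"), "<captured>");
    // Width (and precision) arguments can be inlined the same way:
    let width = 9;
    assert_eq!(format!("[{foo:>width$}]"), "[ captured]");
}
```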

When in Rust async functions you call
they don’t do any work at all
instead they will wait
for you to await
them, this might cause your program to stall

When async Rust was young, many coders fell into the trap of forgetting the .await, leading their async calls to go nowhere. In Rust, calling an async function simply creates the future, but it doesn’t poll it; that’s what awaiting is for. This is in contrast to other languages where simply calling an async function will start executing its code until the first yield or return.

When in Rust async code you borrow
across an await, you reap sorrow
for those two things don’t mix
but with Arc you might fix
your code’s problems until tomorrow.

Another pitfall for new async Rust programmers is that while synchronous Rust is very keen on borrowing, this strategy will fall flat in async Rust. The reason for this is that you no longer control when your code is run, therefore it is impossible to constrain the lifetime of values within your async code. That includes borrowed values, and having something borrowed endlessly is usually a bad idea in Rust.

When a future no longer you need,
and you won’t wait for it to complete,
you drop it, it’s gone
there’s no need to hold on,
and your other code works at full speed.

Cancellation is implemented in idiomatic Rust by dropping the future. Many people still don’t know this and ask how to cancel a future.

Rust hashmaps have some subtle tricks
now throw rustc-hash into the mix
if you do not expect
to be perf-wise attacked
use the hasher it has called Fx.

The Fx hash function is used throughout the Rust compiler. It works very well on integers and short strings, of which there are a great many within the normal lifetime of a Rust compiling session.

When in Rust you’re coding for speed
and write to a file, you will need
to buffer the write
or your system might
wait for every byte to complete.

Continuing our performance theme, this rhyme warns of a performance pitfall that many have experienced: Rust writes (and reads) are by default unbuffered. So you need a BufReader and BufWriter for buffering your reads and writes, lest every IO operation incur far more OS overhead than actually needed.

Likewise it may come as a shock
all your println! calls silently block
on standard out, be aware
if for speed you do care
in that case best use standard out’s lock.

Unlike C, where printf may usually just assume to be able to write on stdout unimpeded, Rust is far more careful, locking the standard output on every write call. You can avoid this by manually calling .lock on io::stdout() and using that locked output.
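A small sketch of the locking trick: take the lock once up front instead of once per println! call.

```rust
use std::io::{stdout, Write};

fn main() {
    // Acquire the stdout lock a single time (Rust 1.61+ makes the
    // returned lock 'static, so this one-liner works):
    let mut out = stdout().lock();
    for i in 0..3 {
        writeln!(out, "count: {i}").unwrap();
    }
} // lock released here
```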

Those decrying Rust’s unsafe ‘scape hatch
because it “kills all safety”, natch?
Most experts will tell
that it works pretty well
if all invariants you do catch.

Very often in online discussion, people profess or pretend to mistake the unsafe keyword for meaning that the whole idea of safety goes out the window. Of course, we know better, and it has even been mathematically proven that safe code can rely on unsafe code as long as the latter upholds its safety invariants.

Some people tell you unsafe will
turn off borrowck and then shill
for C, which is why
they tell this bald lie
because rustc is checking things still.

Another popular myth busted: unsafe lets you work with raw pointers, which sidestep the borrow checker, but it doesn’t turn the latter off.

When in unsafe Rust code you write
and get some things not quite right
it’s no longer the season
for rhyme nor for reason
though your program may run out of spite.

The flipside of all of this is of course that the unsafe code must uphold its invariants come what may. This is often not exactly trivial, and the worst thing about such unsound code is that it may work (“out of spite”) for a good while before finally leading to surprising results come the next compiler version, target architecture or whatever; once UB appears, all bets are off.

The slide for this one is a meme picture showing a crab with the caption “If you put a crab up to your ear, you can hear what it’s like to be attacked by a crab”. The audience loved that one.

Though Rust saves you from a large class
of errors when RAM you amass,
it won’t fix any test
on its own, so you best
make sure that each one does pass

This rhyme serves as a humbling reminder that while Rust’s type system and borrow checker will save us from a good set of error classes, it cannot ensure program correctness, so we still need to test. And preferably also run the tests and make sure they pass.

When rustc with garbage you feed
Don’t fret if the errors you read
Are confusing as hell
The compiler goes: “Well,
I’m telling you straight what 𝐼 need.”

The slide for this shows a @ReductRs tweet reading:

I’m sorry for being so confusing when you were feeding me garbage — Compiler Diagnostics Module Speaks Out.

When secrets you handle in Rust,
deleting them safely you must,
so you don’t get hacked
and your defenses cracked.
The zeroize crate I would trust.

The zeroize crate offers a set of traits you can derive on your types that will make it easy to have data overwritten with 0 bytes when it gets dropped, thus ensuring that e.g. passphrases aren’t lingering in memory for malicious hackers to find them.

Dacquiri: you’d normally think
I was rhyming while having a drink.
But in Rust it describes
A framework to use types
so folks can’t step over the brink

The dacquiri framework allows annotating methods to require certain authorization elements in a way that ensures the implementation cannot be called without authorization.

Andrew Gallant of ripgrep fame
is upping Rust’s byte string game
to make prodding with Rust
real-world data robust
with impressive performance to claim.

This of course refers to the bstr crate.

If a Rust library you change
to avoid things downstream going strange
you can deftly depend
on what future amend
you’re going to pre-re-arrange.

I was cackling about how clever I was when rhyming this (and while choosing the “back to the future” slide), but apparently people in the audience mostly failed to get the reference to David Tolnay’s great semver trick where a crate depends on a future self to implement certain things, thus avoiding incompatibilities between both versions.

If in Rust code you need to parse
some args either dense or sparse
where one went for structopt
this dependency stopped
as now clap pulls it out of its…derives.

The clap crate implements a really great argument parser with many useful features. But the idea to derive the argument parser from a type was born in the structopt crate before being added to clap proper. The last line not rhyming reaped a lot of laughter and applause.

A bevy of games, not of geese
in Rust is a sight that will please
so join in the fun
get some Rust games to run
you’ll find one can do that with ease.

The Bevy engine is a production-ready game engine written in Rust. There is a lot of good documentation, so it’s a great way to get into Rust game programming.

With tauri you stand on the shoulder
of web tech giants, tread bolder
to build quick applications
in all variations
and run them before getting older.

Tauri is a framework that relies on web technology to provide a UI for your application that can still be written in Rust. Relying on the system-native web view, it doesn’t need to bundle an almost-complete Chrome, which makes binaries much smaller compared to e.g. Electron.

When some Rustaceans exercise
their muscles, it’s not very nice,
their squats at any rate
each go on their own crate
so that others need to improvise.

This rhyme laments the sad state of many a crate on crates.io, being a mere shell that has been put there to reserve the name. In consequence, others need to exercise their imagination to come up with a different name. I note that this topic has been discussed to death many times, and there is still no satisfactory solution in sight. So I won’t prescribe one here, I feel that every one that has been suggested so far had flaws large enough to render the idea moot.

We’re surprised to hear cargo test
Is speedwise no longer the best
so I’ll gladly install
nextest, after all
my cores don’t need too much rest.

The cargo nextest subcommand is a new, fast test runner for Rust that will often outpace good old cargo test, mostly by maximizing parallelism.

insta will easily check
if the objects your test code gets back
are the same as before
when a good state they bore
so your tests are quickly on track.

Armin Ronacher’s insta crate offers support for snapshot tests, where when you run the test it records the output which you can then save with a subcommand to be compared by future test runs. I cordially recommend it.

From Embark in Stockholm says hi
the subcommand cargo deny
which will thoroughly see
your dependency tree
through for…wow, lots of problems, oh my.

The cargo deny subcommand checks the whole dependency tree for security as well as licensing problems. Suffice to say, I feel much better when it prints nothing upon running it on my code.

Clippy users who at first sense
they’re constantly making amends
To make their code “good”
(as they probably should)
will later become lifelong friends.

cargo clippy now has dash-dash-fix
which changes your code’s subpar schticks
to improve its style
and perf, and meanwhile
it has got up its sleeve some more tricks.

As a clippy maintainer, of course I had to write a couple of rhymes on it. The first one puts in rhyme a story someone (sorry, I cannot remember who it was) told me that they hated clippy in the beginning, feeling that “it always had something to criticize”, but later were proud and happy that their code was “clippy clean” and now use clippy on all of their code.

The second informs people that many lints now have suggestions that can be automatically applied using the cargo clippy --fix subcommand.

The clippy workshop went great
most people stayed with us late
with good work, much respect,
clippy lints to correct
and also new lints to create

This one I wrote after holding the RustNation-adjacent clippy workshop. My hat’s off to all participants, you are an awesome bunch!

cargo semverver will check
if your current crate version is back-
wards compatible and
warn if change is on hand
that gives users a pain in the neck.

The subcommand has since been renamed to cargo semver-checks, but the idea is still the same: giving you advance warning before you introduce changes to your Rust library crate in a minor version that would break your users’ builds. It will likely become part of cargo proper in due time.

Your performance desire to quench
you can put your code on the bench
but to make it complete
a criterion you need
lest measurement errors avenge.

Some people rely on the bencher crate, but I will usually recommend criterion, which has more benchmark configuration options, as well as more solid statistics and optionally a nice graph output.

A helix of Rust can be made
to work on the code you create.
Between entering code
you switch to command mode
with complete & assists, it is great!

The helix editor is a modal text editor like vi that has IDE-like features using the Language Server Protocol (LSP). Unlike vi, it goes selection-first, which takes some time to adapt your muscle memory, but the sleek interface and great performance are well worth it.

With the crypto winter afoot,
Rust jobs are now mostly good
in search, cloud, UI
embedded, AI,
if you haven’t applied yet, you should.

I usually get a few job offers per week. During the late ’10s, most were blockchain job spam. I now get far fewer of those and more jobs in other fields, which given my stance on blockchain I personally find very pleasant. So if you haven’t sought out a Rust job because you thought it’d all be crypto blockchain web3 stuff, perhaps have another look.

Should I buy an aquarium, my wish
is to fill it with Rust turbofish
One fro, one reverse
(to complete this verse)
& a TV with satellite dish.

I should have put that one with the other syntax rhymes, but somehow failed to do so. Anyway, the famed and feared turbofish is a somewhat weird feature of the Rust language that ensures Rust code stays linearly parseable. The problem it solves is that a < could either start a generic argument list or be a less-than sign. So the fish disambiguates those cases.

Festive Ferris the Crab crossed the street
a bunch of gophers to meet,
and pythons and others
all FFI brothers
to get something to drink and to eat.

This rhyme was a very roundabout way of saying that we programmers are all in the same boat, whether we write Rust, Go, Python or any other language. No programmer is better than another because of what language they choose and we should all strive to improve collectively instead of stupidly fighting among one another.

If you tell people that they must
from now on write all code in Rust
you will earn some ‘aw, shucks’,
for your lobbying sucks.
Good luck on regaining their trust.

In my last Java job before switching my career to Rust, I often felt limited by what I couldn’t express in Java. Yet I knew my colleagues, unlike myself, did not sign up for learning a new language, and forcing them to do so would surely make them hate me. So when you want to introduce Rust at a job, please think of your colleagues who haven’t learned it yet and may not feel the same way about it as you do.

If your thirst for rhymes isn’t quenched
and your love of Rust is entrenched
you’ll find more of my verse
on Twitter – or worse
if that service is tabled – or benched.

For now I have a twitter account where I will sometimes post more Rust rhymes. So if you want to avoid missing any of them, follow me there. The last line alludes to the fact that twitter has been steering towards an unknown future since being bought and shedding a great many people. Perhaps I might move to some mastodon instance in the future. Don’t worry, I’ll let all my twitter followers know.

An ML model showed it could rhyme
about Rust in a fraction of time
it does take me to write
a poem, I won’t fight
it ‘cause having more rhymes is sublime.

Well, tickle me pink! I admire,
this work, for now I can retire
and sit on a beach
a stiff drink in my reach.
Why should this AI draw my ire?

The slide for this shows a GPT prompt “Show a rust error message in the form of a limerick.”, which the model answered with

There once was a code that was fraught
with a borrow checker that thought
“You’re taking a slice,
But not paying the price!”
And the error message it brought.

I was sent this on twitter with the question whether I was worried that the AI would replace me. Why would I be worried? It’s my rhymes and the model’s, not either-or. Besides, should I choose to retire, you can still ask the model for more rhymes if you want them.

Farewell now my friends, time to go
I hope you’re enjoying the show
and thank you so much
for you being such
a great audience! Cheerio!

The final rhyme was me being thankful for the great audience (which I had correctly predicted would appear). I can assure you that I wouldn’t have given this rhyme to a lesser audience. A good update to the disclaimer I ended the first “Rust in Rhymes” talk with, don’t you think?

And that’s it. The whole talk explained. I hope you’ve had a good time reading it, and feel free to ask questions on /r/rust if anything is still unclear.

Llogiq

Catch 22! Rust in Review (2022-12-11)

It’s getting late in the year, so it’s time to write another blog post looking at what good and bad came out of the year for Rust, and what’s likely to happen next year.

I guess the second part is also a roadmap post for the next year, so there’s that.

Language

Language-wise, 2022 was a great year for syntactic sugar, with let-else rapidly reaching stable (I must admit I wasn’t even aware of it until I read the release blog) and let chains being worked on in nightly. I was wary of those until I tried them in the clippy codebase; since then I’ve been very happy we’ll get them sooner or later. We also got GATs in 1.65! Those who don’t know what that is, don’t worry, it’s all good, and for the others, that was well worth the wait. Finally!
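For those who haven’t met let-else yet, here is a minimal sketch of my own (a made-up example, not from the clippy codebase): bind the happy path, diverge otherwise, with no nested match or if let.

```rust
// `let`-else, stabilized in Rust 1.65: the pattern binds on success,
// and the `else` branch must diverge (return, break, panic, …).
fn first_word_len(input: &str) -> usize {
    let Some(word) = input.split_whitespace().next() else {
        return 0; // no first word: bail out early
    };
    word.len()
}

fn main() {
    assert_eq!(first_word_len("hello world"), 5);
    assert_eq!(first_word_len("   "), 0);
}
```

The same logic with match would push the happy path one indentation level deeper, which is exactly the churn let-else avoids.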

There has also been some really great work on the async front, with an MVP for async functions in traits landing in nightly in mid-November. No more hacks with attribute macros and boxing all futures, yay!

Some new features just filled gaps, like destructuring assignments, which brought plain assignments to parity with let bindings. While they undoubtedly made the implementation more complex, the language complexity was reduced by one special case.
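A tiny illustrative example of that gap being closed (my own sketch): patterns now work on the left-hand side of a plain assignment, just as they always did after let.

```rust
fn main() {
    // Destructuring assignment, stabilized in Rust 1.59: a tuple
    // pattern on the left of `=`, mirroring what `let` already allowed.
    let (mut a, mut b) = (1, 2);
    (a, b) = (b, a); // swap through a tuple pattern, no temporary needed
    assert_eq!((a, b), (2, 1));
}
```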

Compiler

The compiler still got faster, despite serving more features and improving the error messages even more. Cranelift and rustc-codegen-gcc are progressing nicely, and gcc-rust was recently merged into GCC proper. mrustc is still alive and well, so we’ll soon have three compilers (rustc, gcc-rust, mrustc) and three backends (LLVM, GCC, cranelift) for the official compiler.

Tools & Ecosystem

Rust analyzer got a well-deserved promotion to a proper rust-lang project. With helix for the terminal and lapce using its own GUI, we now have two native Rust editors written in Rust that use the rust-analyzer backend. Cargo gained a built-in implementation of the cargo add subcommand, which was formerly only available through the cargo-edit crate (update: with this year’s last release, the cargo remove subcommand joined cargo proper, too). The gitoxide implementation of the git source control system finally gained enough functionality to clone itself.

The tokio runtime saw a steady stream of improvements, making headway in safe & fast web service implementations, and notably gained tooling to dump an “async backtrace” in case something goes wrong, which finally closes an important gap between sync and async debugging. Async-std development seems to have stalled, with the last commit dating from July.

GUI-wise, we’ve seen multiple frameworks come to life, from the Xi-editor-inspired druid and its likely successor xilem, through the startup-sponsored slint, to iced and dioxus. Other toolkits saw progress, too; the AreWeGUIyet page lists many of them.

Growth

Crates.io has more than 99k crates now. This time last year it was about 70k, so the growth hasn’t stalled at all. The registry has counted 24.4 billion downloads, which is more than double last year’s count! The subreddit has cracked 200k subscribers in October and is now bigger than r/golang which it had trailed all the years before. Rust has kept its 19th spot in the redmonk index, and won top spot in the yearly “most loved” StackOverflow survey, in addition it’s “most wanted” for the second year in a row.

The blockchain winter is now in full swing, with many of the fraudulent players going broke. Luckily, my fear that Rust would suffer by association didn’t become reality. The job growth in other fields like machine learning, cloud and elsewhere easily outweighed the losses in blockchain-related Rust jobs. Phew!

This part of the growth curve doesn’t come out of thin air: many companies large and small spun up Rust teams. We’re making inroads in the cloud, on mobile, in embedded (notably automotive and aerospace, where the groundwork is currently being laid), and in machine learning (where python still reigns supreme, but Rust can sometimes win through superior performance and type-based coding ergonomics). The current Android version 13 has more than 1.5 million lines of Rust code. There’s Rust in the Linux kernel!

Those folks who spelled Rust’s doom because there wasn’t enough buy-in from companies now spell its doom because there’s so much buy-in. I guess some things never change.

Governance

The foundation has really come into its own this year. Kudos to all involved! With the community grants, the foundation directly allowed more people to work on what they love while boosting the Rust ecosystem. Also I hear there has been some lobbying in the US that resulted in Rust being the official recommendation for low-level projects.

The project restructuring also saw some good progress; I am personally very happy with the outcome so far.


Now for a look in the crystal ball on the roadmap:

Language

I see chains of let. Async traits, too. I see them bloom for me and you. And I think to myself, what a wonderful Rust… or something. try blocks will have to prove their worth in the nightly implementation. I haven’t tried them yet, so I cannot predict whether they land next year. We will see more whittling away at the missing pieces of the features we got this year, for sure. Especially on the async side of things I expect more movement regarding pinning and avenues for optimizing the task machinery for various use cases.

I have high hopes for negative trait bounds, which sound like they will enable a bucket of interesting use cases, especially with regard to avoiding the need for specialization. Though there is some compatibility hazard, I’m sure the lang team will take sufficient precautions to make it a net win.

Compiler

I guess it’s safe to say the compiler will get even faster in ‘23, though we may already see diminishing returns.

As for the backends, we’ll see gcc-rust mature somewhat, and rustc-codegen-gcc as well as cranelift become usable for more cases. Perhaps we’ll even see some kind of stabilization.

I sure hope we’ll see more evolution of the macro machinery. WebAssembly could be a great base on which to rebuild the macro system; we could even cache pre-compiled macros or have crates.io serve proc-macro crates as WASM blobs, thus reducing compile time for their users.

Tools & Ecosystem

The Rust experience is already pretty good, and the stream of improvements is unlikely to ebb next year. I’d note that with the newly set up fmt team, I expect notable advances in rustfmt functionality.

Most Rustaceans wish for some consolidation; do we really need three async runtimes, eleven web frameworks, fourteen GUI libraries and a dozen or so hash map crates? On the other hand, I think we haven’t seen the end of the experimentation phase; in fact it may just have begun. So I fully expect more delightful new libraries and surprising crates come up every now and then. One size need not fit all, as they say.

Growth

Rust has reached the tipping point. From here on the only way is up. The only thing that could pull the trajectory back down would be for an even better new language to emerge, and I don’t expect one to appear in the next three years. That said, if it does appear, I’m so here for it.

With this explosive growth come a set of challenges, which nicely brings us to the next point:

Governance

The foundation has shown it can do more than taking care of the trademark or sponsoring email services for This Week in Rust. Now is the time to expand the involvement. The current community grants are a good start, but they’re pretty narrow. The foundation could team up with universities to sponsor PhD theses around Rust. It could give out a yearly award for the most innovative use of Rust in the industry. It could lobby more governments to make memory safe languages the official recommendation.

I dare not hope to see memory safety become mandatory, but perhaps at some point it could become a sign of negligence to write code in a memory-unsafe language unless there is a good reason to do so. The first country to enact such laws is going to rule the software sector for years to come.

The project will continue spinning out working groups. Currently those can organize however they see fit, but we’ll start to see some standards emerge. While this will lead to some small frustration on some members’ part, it will ultimately help transparency and inter-team ventures.

Llogiq

Rust 2021 – Looking Back and Forth (2021-12-30)

2021 was a great year for Rust. The language got more powerful. The standard library gained a good number of new functions and a lot of improvements to the existing ones. The error messages were tweaked and tweaked again. The compiler got faster still, despite at last enabling noalias information by default.

Also, incremental compilation is the default again after being backed out due to soundness problems. We got a new experimental gcc-based backend, and another gcc-based implementation (both are work in progress for now). Clippy gained a number of lints and lost a lot of false positives in turn. Rust is getting into the Linux kernel, which also brings some improvements to both language and libraries to facilitate this feat.

The community also grew. Crates.io is now at more than 70000 crates in stock, and more than 11 billion downloads! The rust subreddit grew from about 122k to more than 162k users, narrowly trailing r/golang. Rust entered the top 20 of the Redmonk index for the first time, and won the Stack Overflow survey’s “Most loved programming language” crown for the sixth year in a row. I’ve had more people ask me for mentoring than ever before.

We have a foundation which is gaining members. The foundation has actually started to do useful stuff, like professionally organizing the operations of crates.io which has so far been done by volunteers. Recently, the DevX initiative has started sponsoring work on Rust. This is great news for Rust and Open Source alike!

That’s not to say all has been roses. The mod team quit due to a problem with the rules not allowing us to enforce the CoC. Fortunately, there is active work underway to fix this problem, and the new mod team also seems to be doing a good job as far as I can tell.

We still have a security flaw in one of the most popular time/date libraries, which most seem to simply have forgotten about. Ok, it’s fixed in time 0.3, but not in chrono so far. At least we have a CVE to remind us of that.


Looking forward, we’re going to see more of the same. The compiler will be getting more powerful while speeding up even more. Features that people have been missing are in the process of being implemented and will arrive in the new year. The error message code will at one point become sentient and decide to send a robot back in time to save mankind from skynet or something. The community will continue to grow, and more Rust jobs will be available, ready to be taken by more Rust programmers.

My guess is that we will see more active involvement from the foundation in the new year, which is great news to the community. The work started after the mod team change will likely conclude within the second quarter of 2022, improving the governance structure and paving the way for even more success in the future.

With a bit of luck we will find a way to harden our systems against supply-chain attacks before they become a real problem. We likely need more integration, documentation and raising awareness.

I still see some risk that there might be a tipping point where the backlash against the out of control grift around blockchains will also harm Rust by association, because many projects in that space use the language because of performance, safety and productivity. Knocking on wood here.

Around us, the pandemic is still raging. Meetup activity has gone online or is reduced. More people work remotely than ever before. My hope and wish is for all of you to stay safe & healthy.

Llogiq

Rust 2021 – Ethical Development (2020-09-21)

This is my Rust 2021 post. I believe that Rust has shown that “bending the curve” is both possible and fruitful in programming language design. We also have a strong set of values as a community, enshrined both in the Code of Conduct and in the example of various high-profile rustaceans.

For 2021 I really couldn’t care less what features go into the next edition. Don’t get me wrong, I want the next edition to be awesome, but I believe this is table stakes by now. I want us to take on the harder problems.

I want us to grow as a community, and I don’t mean that in a population count sense. This is what I meant when I snarkily wrote “Less doing the wrong thing, safely”. The keyword here is ethical development. Just like the Rust programming language has a set of values, a set of tradeoffs between them, and a set of bent curves where we found a place in the design space that lets us have our cake and eat it where a tradeoff was formerly accepted, I want the Rust community to have a shared set of values, a shared set of tradeoffs between them, and as many bent curves as possible.

Now what I mean by ‘values’ obviously differs depending on whether I talk about programming language design or a programming language community. For language design, the values can include execution speed, productivity, safety, learnability and so on. For software ethics, values may be inclusion and diversity, welfare for all people, privacy, freedom, lawfulness, friendliness and hospitality, integrity, humor, and a lot more. Some of them are by now well-known and have seen implementations, for example privacy. But how might we hope to encode e.g. hospitality?

There are a few hard questions to ask, and I won’t cover everything here, but I’ll give you a few examples of things the Rust community will have to face at some point:

  • I have heard reports of companies treating their Rust developers badly, as if using Rust should make up for abuse by an employer. This is obviously unacceptable, and we should call out instances wherever we find them. Such employers should not be considered part of the community. (Names withheld to protect the innocent.)
  • Rust is still hosted on github, a company that has a contract with ICE, an institution that has proven to act out genocide against immigrants in the US. For now, we act as if this doesn’t affect us and have the mod team quell all discussion. I’m having a hard time bringing this up after having personally removed discussion about it on /r/rust. But while we don’t want the drama to continue, and expressly wish this point not to be debated in the replies, we as a community need to tackle this problem
  • I’m still not writing that “why we shouldn’t write more blockchains in Rust” article just yet, but as I’ve written before, blockchain tech hasn’t shown much benefit to society just yet while binding a lot of resources and burning through a lot of electricity (obviously proof of stake is better than proof of work in that regard, but being better than the worst thing doesn’t make something good), thus hastening the climate crisis

So here’s my suggestion: I want an Ethics WG, tasked with both researching and teaching Rustic Software Ethics. With the above examples, I hope to have given a useful set of initial work items for that group.

Llogiq