llimllib's notes

I am building a cloud

2026-04-23T11:14:19.193Z

I do not like the cloud today.

I want to. Computers are great, whether it is a BSD installed directly on a PC or a Linux VM. I can enjoy Windows, BeOS, Novell NetWare, I even installed OS/2 Warp back in the day and had a great time with it. Linux is particularly powerful today and a source of endless potential. And for all the pages of products, the cloud is just Linux VMs. Better, they are API driven Linux VMs. I should be in heaven.

But every cloud product I try is wrong. Some are better than others, but I am constantly constrained by the choices cloud vendors make in ways that make it hard to get computers to do the things I want them to do.

David Crawshaw

Chapter 2 - defining nonfunctional requirements

2026-04-20T15:17:14.922Z

The authors define a functional requirement as "the functionality that the application must offer", and the nonfunctional requirements as everything else you need to do.

Right off the bat, I struggle with this definition! They give performance as an example of a nonfunctional requirement, but obviously that's a matter of degree. Let's see if they clarify, or just allow the definition to be very fuzzy at the edges.

Twitter example

They start off with a motivating example, of a common interview question: let's design a service like twitter. Given three tables:

users	id	screen_name	profile_image
	12	jack	123.png

posts	id	sender_id fk(users)	timestamp	text
	20	12	123456	just setting up my twttr

follows	follower_id fk(users)	followee_id fk(users)
	9923882	12

They give the first pass SQL query to create a timeline page:

SELECT posts.*, users.*
  FROM posts
  JOIN follows ON posts.sender_id = follows.followee_id
  JOIN users   ON posts.sender_id = users.id
 WHERE follows.follower_id = current_user
 ORDER BY posts.timestamp DESC
 LIMIT 1000

This query will be expensive, so we will in practice need to materialize the timeline. For each user, we store their home timeline; when a user posts, we look up all their followers and insert that post into their timeline.

fan-out means the factor by which the number of requests increases in such a scenario.
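
A minimal sketch of the materialized-timeline idea, assuming in-memory dicts stand in for the follows table and the stored home timelines. The function and variable names are illustrative, not from the book. One post turns into one timeline write per follower, and that multiplier is the fan-out.

```python
from collections import defaultdict

# Illustrative in-memory stand-ins; none of these names come from the book.
followers = defaultdict(set)       # followee_id -> set of follower_ids
home_timeline = defaultdict(list)  # user_id -> list of (timestamp, post_id)


def follow(follower_id, followee_id):
    followers[followee_id].add(follower_id)


def publish(sender_id, post_id, timestamp):
    """Insert the post into every follower's timeline; return the fan-out."""
    for follower_id in followers[sender_id]:
        home_timeline[follower_id].append((timestamp, post_id))
    return len(followers[sender_id])


def read_timeline(user_id, limit=1000):
    # No join at read time: the work was already done when the post was written.
    return sorted(home_timeline[user_id], reverse=True)[:limit]
```

The trade-off is that writes get more expensive for accounts with many followers, while reads become a lookup in a single per-user list.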

describing performance

Two main types of performance metric:

response time - the elapsed time from a request to a response
throughput - the requests per second the system is processing

Generally, response time increases as throughput increases, because requests spend longer queueing as the system gets closer to its capacity.

They have a brief sidebar about using jitter and exponential backoff plus circuit breakers to avoid thundering herd issues, but don't dive into it.
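
A rough sketch of what retrying with exponential backoff and jitter can look like, assuming the "full jitter" variant (a uniformly random delay below an exponentially growing cap). The names and default values are illustrative, and circuit breaking is not shown.

```python
import random
import time


def backoff_delays(base=0.1, cap=10.0, attempts=5, rng=random.random):
    """Yield one delay per retry: exponential growth, capped, with full jitter."""
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        # Full jitter: pick uniformly in [0, ceiling) so clients that failed
        # together do not all retry at the same instant.
        yield rng() * ceiling


def call_with_retries(fn, sleep=time.sleep, **kwargs):
    """Call fn(), sleeping a jittered delay after each failure."""
    last_error = None
    for delay in backoff_delays(**kwargs):
        try:
            return fn()
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
            sleep(delay)
    raise last_error
```

The randomness is the part that addresses the thundering herd: without it, every client that failed at the same moment would retry at the same moment too.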

response time is usually what users care about the most, whereas the throughput determines the required computing resources

response time is what the client sees; includes all delays anywhere in a system
service time is the duration for which the service is actively processing the client's request
queueing delays they don't define, oddly, just put it in italics and assume its definition is understood
latency is a catchall term for time during which a request is not being actively processed (i.e. during which it is latent)
- This is interesting to me because they make latency a relativistic measurement; it matters if, from your perspective, you know that work is being done on it
- We can call response time "request latency" because it's time when we (the client) don't know that the request is being actively processed (or we can say that it is not being actively processed by our system)
  - the authors give "network latency" as a particular example - the time that a request and response spend traveling through the network, but from the network stack's perspective, there are several different times within that span when the request is latent, and others when it's working on it

There's a brief discussion of median/mean/percentiles

Amazon describes response time requirements for internal services in terms of the 99.9th percentile... Optimizing the 99.99th percentile (the slowest 1 in 10,000 requests) was deemed too expensive and found to not yield enough benefit

The authors point to several sources for the "performance == money" idea, and suggest that they're all insufficient and somewhat contradictory, and that we don't really know how much performance is functional, in the sense of making more money from users.

They point to the tail at scale to describe tail latency amplification, whereby as a single service request makes more and more requests to other endpoints, it becomes more likely that one of them suffers a tail latency spike.
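
The arithmetic behind tail latency amplification is short enough to sketch. This assumes each backend call is slow independently with the same probability, which is a simplification; the function name is illustrative.

```python
def p_any_slow(fanout, p_slow):
    """Probability that at least one of `fanout` independent backend calls is slow."""
    return 1 - (1 - p_slow) ** fanout


# If 1% of backend calls are slow, how often does a request that
# touches n backends hit at least one slow call?
for n in (1, 10, 100):
    print(n, round(p_any_slow(n, 0.01), 3))
```

With a 1% chance of a slow call per backend, a request that fans out to 100 backends hits at least one slow call about 63% of the time under this model.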

There's a brief discussion of how SLOs and SLAs may use percentiles to define their expectations, and an important sidebar that averaging percentiles is meaningless. I've seen people make that mistake many times in practice, and it's possible I have (I certainly have (sorry))
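
A small made-up example of why averaging percentiles goes wrong, using a nearest-rank percentile and two hosts with very different request counts. The data and names are invented for illustration.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up response times in ms from two hosts with very different traffic.
busy_host = [10] * 98 + [500, 900]   # 100 requests
quiet_host = [10] * 10               # 10 requests

# Wrong: average the per-host p99 values.
averaged = (percentile(busy_host, 99) + percentile(quiet_host, 99)) / 2

# Better: compute p99 over the merged samples (or merged histograms).
combined = percentile(busy_host + quiet_host, 99)
```

Here the averaged figure is 255 ms while the p99 over all requests is 500 ms, so the average describes no real request population at all.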

reliability and fault tolerance

reliability is, roughly, continuing to work correctly, even when things go wrong

They distinguish between faults and failures:

A fault occurs when a particular part of a system stops working correctly
A failure occurs when the system as a whole stops providing the required service to the user

They are the same thing at different levels

I'm glad they said that, it was my immediate objection to that definition!

We call a system fault-tolerant if it continues providing the required service to users in spite of faults occurring, and a part is a single point of failure if a fault in that part causes the whole system to fail.

Counterintuitively, in fault-tolerant systems, it can make sense to increase the rate of faults by triggering them deliberately -- for example, by randomly killing individual processes without warning. This is called fault injection... by deliberately inducing faults, you ensure that the fault-tolerance machinery is continually exercised and tested

Hardware and software faults

Hardware redundancy increases the uptime of a single machine... using a distributed system has advantages, such as being able to tolerate a complete outage of one datacenter. For this reason, cloud systems tend to focus less on the reliability of individual machines and instead aim to make services highly available by tolerating faulty nodes at the software level. Cloud providers use availability zones to identify which resources are physically co-located

hardware failures are often less correlated than software faults, because it is common for many nodes to run the same software and thus have the same bugs

Human beings

One study of large internet services found that configuration changes by operators were the leading cause of outages, whereas hardware faults played a role in only 10-25% of cases [72]

ed: that study is from 2003, I wonder if there's anything more recent

What we call "human error" is... a symptom of a problem with the sociotechnical system in which people are trying their best to do their jobs

They cite "the field guide to understanding human error", which I had for a while but failed to read

Scalability

Scalability is the term we use to describe a system's ability to cope with increased load

ggsql

2026-04-20T13:46:10.953Z

https://ggsql.org/

ggsql brings the elegance of the Grammar of Graphics to SQL. Write familiar queries, add visualization clauses, and see your data transform into beautiful, composable charts — no context switching, no separate tools, just SQL with superpowers.

ggsql introduces the VISUALIZE keyword into its sql dialect, allowing you to do grammar of graphics charts from SQL queries. The home page has an example:

-- Regular query
SELECT * FROM ggsql:penguins
WHERE island = 'Biscoe'

-- Followed by visualization declaration
VISUALISE bill_len AS x, bill_dep AS y, body_mass AS fill
DRAW point
PLACE rule 
  SETTING slope => 0.4, y => -1
SCALE BINNED fill
LABEL
  title => 'Relationship between bill dimensions in 3 species of penguins',
  x => 'Bill length (mm)',
  y => 'Bill depth (mm)'

There's an introductory blog post here: https://opensource.posit.co/blog/2026-04-20_ggsql_alpha_release/ with some more motivations and introductory text

How the Heck Does Shazam Work?

2026-04-20T13:42:31.266Z

https://perthirtysix.com/how-the-heck-does-shazam-work

How audio fingerprinting and a connect-the-dots trick lets Shazam identify a song in seconds.

Another in Shri Khalpada's series of interactive explainers. Previously:

QR Codes
GPS

Gwenifer Raymond Tiny Desk

2026-04-18T02:59:33.588Z

I talked about her previously in Last Night I Heard the Dog Star Bark - Gwenifer Raymond, and her tiny desk really showcases her energy

Chapter 1

2026-04-17T13:23:40.062Z

We call an application data intensive if data management is one of the primary challenges in developing the application. While in compute-intensive systems the challenge is parallelizing a very large computation, in data-intensive applications we usually worry more about things like storing and processing large data volumes

The difference in use between backend engineers (who modify data & generally look at one user at a time) and business analysts/data scientists has led to a split between operational systems and analytical systems that are often kept separate.

point query: a query that looks up a small number of records based on a key
OLTP: Online Transaction Processing - a system that inserts, updates or deletes records generally based on a key
OLAP: Online Analytical Processing - a system that generally scans a huge number of records to calculate aggregate statistics rather than returning individual records to the user
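
To make the two query shapes concrete, here is a small sketch using Python's built-in sqlite3 with an invented orders table. SQLite is only a stand-in: a real analytical system would differ in storage layout and scale, but the shape of the queries is the point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (id, customer, amount) VALUES (?, ?, ?)",
    [(1, "ann", 20.0), (2, "bob", 5.0), (3, "ann", 12.5)],
)

# OLTP-style point query: fetch one record by key.
point = conn.execute(
    "SELECT customer, amount FROM orders WHERE id = ?", (2,)
).fetchone()

# OLAP-style query: scan every row and return aggregates, not individual records.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
```

The first query touches one row through the primary key; the second has to read the whole table to produce its answer.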

They differentiate ClickHouse etc (Pinot, Druid? never heard of them) as product analytics or real-time analytics which are designed for analytical workloads, but serve user-facing products

Data from OLTP systems is often spread across the enterprise, and BAs do not want to have to query across potentially dozens of systems. A data warehouse contains data extracted from many OLTP systems in a company. (They go on to say that it can come from many other sources as well, which is more accurate imo)

A data warehouse often uses relational tables, which is well suited to what BAs want, but less well suited to data scientists' desires, training ML models and using NLP or computer vision.

(They say that feature engineering is particularly difficult to express using SQL? I'd like to understand what they mean by that more here. [citation needed])

a data lake is a centralized data repository that holds a copy of any data that might be useful for analysis. The difference from a data warehouse is that a data lake simply contains files, without imposing any particular file format. (So, s3 is a data lake? Seems like a kinda useless definition imo)

the term cloud native is used to describe an architecture that is designed to take advantage of cloud services

This is an extremely fuzzy definition to me. They say that Postgres and ClickHouse are self-hosted systems that are not cloud native, but that Aurora and Snowflake are cloud-native, and I think you'd be hard-pressed to really apply the definition above to make that distinction clear.

The things that distinguish them to me are:

usually specialized to the operating environment of a particular cloud service provider
usually not available as open source
usually not available to run on a general-purpose computer, or available but in severely degraded form

To me, if we want to split the mysqls and postgreses of the world from the aurorae and BigQueries, it's about how specialized they are to cloud service providers' computers as opposed to general-purpose computation

The key idea of cloud native services is not only to use the computing resources managed by your operating system, but also to build upon the lower-level cloud services to create higher-level services

yeah that's closer for me

zignal

2026-04-17T13:19:50.528Z

https://github.com/arrufat/zignal
https://arrufat.github.io/zignal/

Zignal is a zero-dependency image processing library

Core Math: Matrices (SMatrix, Matrix, SVD), PCA, ND Geometry (SIMD Points, affine/projective transforms, convex hull), Statistics, Optimization.

Computer Vision: Feature detection and matching (FAST, ORB), Edge detection (Shen-Castan), Hough Transform, Feature Distribution Matching (style transfer).

Image Processing: Spatial transforms (resize, crop, rotate), morphology, convolution filters (blur, sharpen), thresholding, advanced Color Spaces (Lab, Oklab, Oklch, Xyb, Lms, etc.), Perlin noise generation.

I/O & Graphics: Pure-Zig PNG/JPEG codecs, Canvas API (antialiasing, Bézier curves), Bitmap/PCF Fonts, Colormaps, Terminal graphics (Kitty/Sixel).

Platform Support: Native Zig, first-class Python bindings, and WASM compilation for the web.

neat-looking zig image processing library with python bindings. Has a CLI that is currently pretty bare-bones.

Pretty impressive to have a library that does its own image decoding, font rendering, etc etc!

Clubs, really? That's what you call those?

2026-04-16T15:00:54.775Z

https://approximateknowledge.net/inkhaven/2026/04/14/playing-cards.html

A fun history of how we ended up with the suits common in playing card decks.

I tried to find some word to describe which playing card decks they're common in, but I failed because I have no idea how widespread the playing cards we have in America are

Use Apple’s App Store at your own risk

2026-04-16T14:54:51.317Z

The App Store, in other words, is rotten. And whatever Apple’s app-vetting procedure is, it’s not working. Perhaps that reflects the magnitude of the job. At last count there were approximately two million iOS apps on the store, which across its 18-year history equates very roughly to 9,000 per month. Factor in the acceleration over time, not to mention all the other apps that were vetted once but have since been removed because the developers stopping updating them, and that’s a lot of vetting, even for a company with major resources.

But is that an excuse? Not really. If running an app store is too much trouble, close it down. If comprehensive vetting is impractical, stop pretending the App Store is completely safe. (And definitely stop scaremongering about sideloading.) If you can’t make the App Store a truly reliable resource for good, safe, legitimate software, then give iPhone users the freedom to install from other places. Or just stop pretending the App Store monopoly is about anything other than revenue.

David Price

amen

An AI tool I find useful

2025-07-27T14:31:07.68Z

update: This tool has been superseded by pr-review, a similar but more powerful tool

One of the tasks that I do most often is to review code. I've written a review command that asks an AI to review a code sample, and I've gotten a lot of value out of it.

I ignore most of the suggestions that the tool outputs, but it has already saved me often enough from painful errors that I wanted to share it in the hope that others might find it useful.

How to install it

install llm and configure it to use the provider and model you'd like to use
- on a mac, brew install llm
optionally install bat
- on a mac, brew install bat
save the review script anywhere on your path and make it executable

How it works

The main job of the script is to generate context from a git diff and pass it to llm for code review.

If you run review with no arguments, it will:

run git diff -U10
- the -U argument changes the amount of context given in a git diff
- any additional arguments you pass will be forwarded directly to git diff
estimate the tokens in the output and check if that fits within the context window
- shrink the context window if necessary and re-run git diff
- if you want to manually expand the context beyond 10 lines
  - you can append to the system prompt with the --context flag
  - you can expand the diff context to 100 by passing -U100 or any number you prefer
pipe the context to llm
- You can configure llm to use whichever model or AI company you prefer
highlight the output with bat if available
- bat is great and I highly recommend using it if you don't already. brew install bat on a mac will install it
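
The shrink-and-retry step in the list above could be sketched roughly as follows. This is not the actual review script: the four-characters-per-token estimate, the halving strategy, and every function name here are assumptions made for illustration.

```python
import subprocess


def estimate_tokens(text):
    # Crude heuristic (an assumption): roughly four characters per token.
    return len(text) // 4


def git_diff(context_lines, extra_args=()):
    """Run `git diff -U<n>` plus any forwarded arguments and return its output."""
    cmd = ["git", "diff", f"-U{context_lines}", *extra_args]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def fit_diff(max_tokens, context_lines=10, run_diff=git_diff):
    """Shrink the diff context until the estimated token count fits."""
    while True:
        diff = run_diff(context_lines)
        if estimate_tokens(diff) <= max_tokens or context_lines == 0:
            return context_lines, diff
        context_lines //= 2
```

The returned diff can then be passed to llm on standard input, with the review instructions supplied as the system prompt via llm's -s option.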

The output of review looks like this in my terminal:

How I use it

My main use of the command is to review a PR I'm preparing before I file it. The biggest value I've gotten out of it is that it frequently catches embarrassing errors before I file a PR - misspellings, DELETEMEs I forgot to remove, and occasionally logic errors.

It also often suggests features that make sense to add before finishing the PR, or as next steps.

It is very important to use it intelligently! The LLM is just an LLM, and it also may be missing context. The screenshot above has two examples of mistaken suggestions that I read and ignored; you have to apply your own understanding and taste to its output.

Keep in mind that it is tasked via its system prompt with finding problems and making suggestions; no matter how good your code is it will try to find and suggest something.

I also use it for reviewing other people's PRs, with review origin/main origin/some-feature-branch. In these cases, I really am just using it for clues as to some things that I may need to investigate with my actual human brain. Please do not just dump llm suggestions into a PR! That's both rude and likely to be unhelpful.

How it differs

That last point brings me to why I prefer this tool to github's own copilot review tool.

I can use my review tool in private, and evaluate its suggestions in private
I can run it repeatedly as I change the code
The separation of my terminal from the code review tool provides a space for me to apply critical thinking

Areas for improvement

A lot of things are hard-coded into the script, because I'm its only user
- If you find use in it, please let me know!
the system prompt seems to work fine, but the range of possible system prompts is so large that I'm sure it could be better

Postscript

Thanks to a suggestion on lobste.rs from davidcrespo, I added the ability to provide context via stdin. Thanks for the suggestion!

Retrofitting JIT compilers into C interpreters

2026-04-15T12:41:08.38Z

https://tratt.net/laurie/blog/2026/retrofitting_jit_compilers_into_c_interpreters.html

Very cool work by Laurence Tratt, Edd Barrett, Lukas Diekmann, and Pavel Durov to create a system, yk, which accepts an interpreter for a language (lua or python for example) and emits a meta-tracing JIT compiler for it.

the blog post explains what a meta-tracing JIT is much better than I could here

The upshot of this is that you can, with a reasonable amount of work, automate the construction of a JIT compiler from an interpreter that doesn't natively support it.

Put another way, you can build a PyPy-like JIT interpreter from the regular CPython interpreter, automatically.

The article does a great job of walking you through what all this means and how it does it. Very cool work!

pagefind

2023-10-20T13:54:09Z

https://pagefind.app/
https://github.com/CloudCannon/pagefind/

Pagefind is a fully static search library that aims to perform well on large sites, while using as little of your users’ bandwidth as possible, and without hosting any infrastructure.

...The goal of Pagefind is that websites with tens of thousands of pages should be searchable by someone in their browser, while consuming a reasonable amount of bandwidth. Pagefind’s search index is split into chunks, so that searching in the browser only ever needs to load a small subset of the search index. Pagefind can run a full-text search on a 10,000 page site with a total network payload under 300KB, including the Pagefind library itself. For most sites, this will be closer to 100KB.

The search engine appears to be a rust application built to a wasm bundle. It looks like it builds an index of the site, creates a bunch of page files, and then when you search it checks the index for what the relevant pages are, downloads only those ones, and serves results.
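
A toy sketch of the chunked-index idea as described above, not Pagefind's actual format or API. It assumes an inverted index split into chunks by a word's first letter, so that a search only needs to load the one chunk that could contain the query word. All names and the sample pages are invented.

```python
from collections import defaultdict

# Toy site; a real indexer would crawl the built HTML pages.
pages = {
    "/cloud": "i am building a cloud",
    "/shazam": "how audio fingerprinting works",
    "/pagefind": "static search for large sites",
}


def build_chunks(pages):
    """Build an inverted index, split into chunks keyed by a word's first letter."""
    chunks = defaultdict(lambda: defaultdict(set))
    for url, text in pages.items():
        for word in text.split():
            chunks[word[0]][word].add(url)
    return chunks


def search(word, chunks, loaded):
    """Look up one word, recording which chunk had to be 'downloaded'."""
    key = word[0]
    loaded.add(key)
    return sorted(chunks.get(key, {}).get(word, set()))
```

In this sketch a query for "search" loads only the "s" chunk and returns only the pages listed under that word, which is the bandwidth-saving property the description is getting at.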

via Carl M. Johnson on mastodon

No one can force me to have a secure website!!!

2026-04-14T00:23:56.928Z

In which the always-worth-watching author tom7 creates the least-secure HTTPS server they can manage

Previously: BoVeX

magika - file type detection

2026-04-13T16:36:57.706Z

https://github.com/google/magika
https://opensource.googleblog.com/2024/02/magika-ai-powered-fast-and-efficient-file-type-identification.html

Magika is a novel AI-powered file type detection tool that relies on the recent advance of deep learning to provide accurate detection. Under the hood, Magika employs a custom, highly optimized model that only weighs about a few MBs, and enables precise file identification within milliseconds, even when running on a single CPU. Magika has been trained and evaluated on a dataset of ~100M samples across 200+ content types (covering both binary and textual file formats), and it achieves an average ~99% accuracy on our test set.

An interesting-looking successor to libmagic

impeccable (design skill)

2026-04-12T21:31:48.605Z

https://impeccable.style/

Anthropic created frontend-design, a skill that guides Claude toward better UI design. Impeccable builds on that foundation with deeper expertise and more control.

Every LLM learned from the same generic templates. Without guidance, you get the same predictable mistakes: Inter font, purple gradients, cards nested in cards, gray text on colored backgrounds.

Impeccable fights that bias with:

An expanded skill with 7 domain-specific reference files (view source)

18 steering commands to audit, review, polish, distill, animate, and more

Curated anti-patterns that explicitly tell the AI what NOT to do