The Supreme Court issued a decision recently in United States v. Skrmetti, a case challenging the constitutionality of a restriction on puberty-altering drugs. Here's a layman / engineer's guide to the constitutional concepts in the case and the reasoning in the decision.
The law enacting the restriction was challenged under the Equal Protection Clause of the 14th Amendment, which states:
[states may not] deny to any person within its jurisdiction the equal protection of the laws
The 14th Amendment was ratified following the Civil War, and ensured that people could not (legally) be denied rights based on race.
While racial discrimination was a clear motivator for passing the amendment, race is not mentioned in it. Consequently, Equal Protection arguments are used to challenge laws that treat people differently based on characteristics other than race.
Guaranteeing equal treatment sounds uncontroversial, but our laws categorize and treat people differently all the time:
In a series of cases since the amendment's ratification, the Court has had to grapple with how exactly laws are able to treat people differently. In addition to ruling on the individual cases, much of the Court's role is to provide clear guidance to potential litigants and the hundreds of US Federal Judges that could be involved in these disputes. This guidance gets fleshed out piecemeal as cases are brought before the Court and new edge cases are dealt with.
The result of this process is a complex system where the Court has recognized certain suspect and quasi-suspect classes of characteristics that the Court treats differently. When those classes are implicated, the law being challenged is held to a higher standard of review (also known as the level of scrutiny). The Court has not closed the door to adding new characteristics to the suspect or quasi-suspect classes, but it has not recognized any new ones in over 40 years.
As a practical matter, the Court's choice of which level of scrutiny to apply largely dictates the outcome of the case.
The lowest level is called Rational Basis, which means that the legislature just has to have some logical reason that the law will help achieve some legitimate goal (even if only partially, and even if it's unfair, arbitrary, or inefficient). It's exceptionally unusual for a law to not reach this bar. Laws that do NOT implicate any suspect or quasi-suspect classes get this level of scrutiny.
The highest level is called Strict Scrutiny. This requires that the law is necessary to achieve a "compelling state interest" and is the least restrictive way possible to achieve it. The common saying is "strict in theory, fatal in fact" because almost no law passes this bar. The law is presumed unconstitutional from the outset, and the burden of proof is on the government to prove it can't be done any better way. Laws implicating the suspect classes get this level of scrutiny. The suspect classes recognized by the Court that apply to federal laws are race and national origin.
A middle level, called Intermediate Scrutiny, was added later. It's a much fuzzier and subjective standard, under which the government must show that the "classification serves important governmental objectives" and that "the discriminatory means employed are substantially related to the achievement of those objectives". Laws implicating quasi-suspect classes get this level of treatment. Only sex and illegitimacy (children born out of wedlock) have been treated as quasi-suspect classes by the Court.
You won't find any mention of scrutiny levels in the Constitution. They're a purely judicial creation, along with the concept of suspect and quasi-suspect classes. The concept of scrutiny standards was first mentioned in a (now famous) footnote of a Supreme Court decision about an interstate milk shipping requirement in United States v. Carolene Products Company.
The Skrmetti case dealt with a Tennessee law banning the prescription of puberty blockers to minors experiencing gender dysphoria. As with most cases, it comes to the Supreme Court on appeal after a District Court and appellate court (Sixth Circuit, in this case) have issued judgements and opinions.
The District Court found that transgender status should be considered a new quasi-suspect class alongside sex. Since laws implicating quasi-suspect classes receive Intermediate Scrutiny, the District Court applied that standard. It determined that the law was unlikely to pass that heightened bar of review, and granted a preliminary injunction to prevent the law from going into effect.
On appeal, the Sixth Circuit reversed the District Court. It found that since neither the Supreme Court nor the Sixth Circuit has recognized transgender status as a quasi-suspect class, there was no basis for a heightened standard of review. The Sixth Circuit looked at 2 considerations for adding suspect classes that the Supreme Court had previously mentioned: whether the characteristic is immutable and whether the group in question is "politically powerless". The court ruled that transgenderism was not immutable, pointing to cases of detransitioners that the law's challengers did not contest. The court also held that transgender people were not politically powerless, pointing to the fact that the then-President, Department of Justice, and major medical associations supported the law's challengers.
The Supreme Court agreed to hear an appeal of the Sixth Circuit's decision, which is called granting certiorari. The Court reviews legal reasoning in the courts below it de novo, meaning they proceed through their own analysis of the law from scratch.
Starting fresh, the Court first considered the obvious discrimination based on age—this restriction applies only to children under the age of 18. Since age is not a suspect or quasi-suspect class, this differentiation does not trigger heightened review.
Next, the Court considered whether the law discriminates based on sex. The law's challengers argued that there is de facto sex discrimination because the law makes it legal to provide a boy testosterone to live as a boy, but illegal to provide a girl testosterone to live as a boy. They tried to draw parallels to the Court's ruling in Loving v. Virginia, which struck down laws banning inter-racial marriage. The argument there was similar—it's an Equal Protection violation to allow a white person to marry a white person but prohibit a black person from marrying a white person.
In this case, the Court found that the law did not discriminate based on sex, but on the medical diagnosis. Since the law bans prescribing testosterone to anyone (boy or girl) to treat gender dysphoria, and allows prescribing it to anyone for any other diagnosis, the Court found that the diagnosis is the actual differentiator. The Court had previously found that regulating sex-specific medical conditions does not necessarily entail sex-based discrimination in a 1974 case called Geduldig v. Aiello regarding pregnancy coverage. Since medical diagnosis is not a suspect or quasi-suspect class, this also does not trigger heightened review.
There was some language in the legislative record about the legislature's desire to "[encourage] minors to appreciate their sex". Laws seeking to enforce sex stereotypes or that conceal an intent to discriminate based on sex can be treated as discriminatory and subjected to Intermediate Scrutiny. The Court looked at this and decided there was not convincing evidence of this, pointing to additional language showing that the legislature had concerns about the safety and potential harms of the procedures. Since those are permissible intents, this also did not trigger heightened review.
Applying the low bar of Rational Basis, the Court affirmed the decision of the Sixth Circuit and allowed the law to go into effect. Because the Court did not "reach" consideration of what differentiation is allowed based on transgender status, that question remains unresolved.
I've found these tools helpful building "pre-LLM" software in small and large startups, and for LLM-enabled applications like the one I'm working on now. My tool choices are largely informed by how I structure software. I don't spend much time justifying my tools here. That's both for brevity, and because I think tools either resonate with a person or they don't. It's also not exhaustive—these are what I keep in the tool chest, not the attic.1
My primary development environment. Working roughly from the bottom up:
I use a MacBook Air 15". I used to always go with a beefed up MacBook Pro, but switched and haven't missed it. If I do anything compute intensive, I usually just use another computer.
I use Karabiner-Elements to map caps_lock -> left-ctrl, left-ctrl -> esc, and right-cmd + hjkl -> arrow keys. This makes ctrl+{x} more ergonomic, esc more accessible, and gives me an alternative to the arrow keys. The first two can be done with system preferences alone.
I use Rectangle as a window manager. It's lightweight and fast. I used AeroSpace before, but had some issues with it and switched. I used divvy before that, and it was ok as well.
I use Drafts to save notes, and as a general scratchpad. The quick capture is great, and I have it globally bound to ctrl+d. I have it on my phone as well, and use it as a simple workout tracker. It's got a lot more functionality that I barely scratch the surface of.
I use OrbStack for Docker containers and local long-running services. I've sung its praises before.
I use Wezterm as my terminal. I like it because I can write custom functions in Lua to e.g. run a command when I press a key or set up a specific layout per project. I used kitty before, which is also great but didn't have as much flexibility.
I use Ice to manage my menu bar. I used Bartender until they changed control without notifying users.
I use CleanShot for screenshots and screen recording. It's simple and easy to use.
I use Alfred as a more powerful Spotlight replacement and launcher.
By this, I mean computers that I own or rent to run services on.
I use Tailscale for a private network between my devices and services. It simplifies connecting to locally running services, and exit nodes are handy when traveling. I run a tailscale router in docker on other subnets that I want to connect to privately.
I use fly.io for hosting internal services and prototypes. It's simple to spin up an app that sleeps when not in use, and has private networking for running internal services. It has somewhat frequent small scale service interruptions, but I don't have to spend time dealing with IAM roles or enabling GCP APIs.
I use AWS when I can't use fly.
I use GCP when I can't use AWS.
I run PicoShare to share links to files, like product demo videos that I email to larger groups.2
I run ArchiveBox by the talented Nick Sweeting to save webpages and prevent link rot.
I use Cloudflare for DNS and SSL.
I handle most provisioning across providers with Terraform Cloud. This includes VPS, DNS, buckets, managed services, and anything else I configure +/- once.
I start new repositories from a standard layout, with an mkrepo alias configured to create it from a copier template. It includes:
- /bin to store binaries. I add it to PATH via direnv.
- /etc for configuration. I store non-sensitive config in shared.env, test-specific config in test.env, and sensitive config in secret.env. secret.env is gitignored. shared.env and secret.env are sourced via direnv.
- /pkgs for code if it's a monorepo, which I prefer.
- /var for runtime data. It's gitignored, and can be cleaned to reset my dev environment.
- Make, with some improvements (e.g. no .PHONY).

Mostly installed through Homebrew or uv.
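As a sketch of the direnv wiring described above (assuming direnv's stdlib PATH_add and dotenv helpers; the filenames mirror the layout, but the exact .envrc is mine, not the author's):

```shell
# .envrc — hypothetical sketch; adds repo-local binaries to PATH
# and sources the shared and secret env files.
PATH_add bin
dotenv etc/shared.env
dotenv_if_exists etc/secret.env
```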
I use zsh as my shell with oh my zsh. I switched from bash so long ago that I don't remember why. I tried fish but didn't see a reason to pay the switching cost.
I use chezmoi to manage my dotfiles. I don't love the workflow I use with it, especially once I started using templates. It's a bit manual to keep updated, but I like having my config in git. I use brew bundle dump --global --force to include my brew installs, which also picks up my VSCode extensions.
I use llm3 to prompt via the command line and pipe to and from my other tools.4 I use OpenRouter to use different models depending on the task.5
- cat. It gives me syntax color and paging.
- dig.
- ip address show and ip route get 8.8.8.8.

I like "boring" and reliable here. These are the most commonly used.
- redis for simple pub/sub, key/value, and caching. I started trying out valkey after the license change.

I mostly write backends in Python, unless there's a good reason not to.
I use uv for virtual environments, dependency management, tool management, building packages, and running Python scripts. It's a step function improvement over what we had before.
I use ruff for linting and formatting. Like everything from Astral, it's so fast that I introduce errors to verify it's working.
I use Pylance for type checking in strict mode. Microsoft restricts it from being used outside of VS Code, and it's a big reason I still use VS Code. Astral is working on a type checker, and with their track record, I'm very optimistic about it.6
I use Pydantic and attrs for defining types. I use pydantic for serializing and parsing across boundaries (like the API), and attrs for internal types. Having multiple models feels pedantic at times, but it's largely born of my philosophy around types. I used dataclass before, but attrs is effectively a superset.
I use Pydantic Settings to strongly type environment variables.
I use fastapi as a web framework. I generate an OpenAPI spec from it that feeds the frontend.
I use prisma as an ORM. Unusual choice for Python, but I like the DSL and the types that the unofficial Python port generates are better than the other ORMs I've tried.
I use PydanticAI to call AI models. It's simple, intuitive, and let me rip out many lines of boilerplate around tool calling. I'd highly suggest starting with this library instead of a more expansive "framework". I initially assumed that interacting with LLMs was complicated, and would benefit from a framework the same way HTTP services do. I now think of it as more of a complex interaction, where the framework is adding a layer of indirection and distracting you from your usual tools. PydanticAI is simple enough to read and even modify—I use a custom fork where I've hacked in support for passing images.
I use stamina to retry things that fail intermittently.7 It's tenacity with the right defaults set.
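stamina's decorator is the real interface; as a rough stdlib sketch of the behavior it provides (bounded attempts with jittered exponential backoff; this is illustrative, not stamina's actual implementation):

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    fn: Callable[[], T],
    *,
    attempts: int = 3,
    base_delay: float = 0.1,
    on: type[BaseException] = Exception,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    # Call fn up to `attempts` times, sleeping with jittered
    # exponential backoff between failures; re-raise the last error.
    for attempt in range(attempts):
        try:
            return fn()
        except on:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2**attempt * random.uniform(0.5, 1.5))
    raise AssertionError("unreachable")
```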
I use my fork of result when I want to bubble up nested error states.
I use httpx as an async HTTP client.
I use gcloud-aio for better typing when using GCP.
I use fern to generate Python SDKs for APIs that don't have an SDK or have a clunky / untyped one. I usually track down an OpenAPI spec and generate an SDK locally.
I use structlog as a logging library. Nicely formatted, structured, and eliminates the need for string interpolation in log messages.
I use Logfire for observability. It's a cleaner and less expensive Datadog from the strong Pydantic team.8
I use Sentry for error reporting.
I use VCR.py to record and replay HTTP interactions—mainly for testing with pytest-recording but also for evaluating tool calling models.
I use inline-snapshot to automate assertions in tests. It works really well in conjunction with VCR.py when testing LLM output, like I added here.
Mainly web apps, mobile apps, and static landing pages.
output: export and hosting the static files with nginx if I can.

These are my opinions and beliefs on effectively structuring software. They're born of my experience and style of building. I believe they are "right" in a sense, but not the only right way. They are mainly in the context of building B2B/B2C application code, and not e.g. firmware for pacemakers.
Type checking is a great superpower of building software. A good type system allows us to translate our understanding of a domain into a set of precise rules, and the type checker ensures that our understanding remains consistent as we evolve the system. It's a powerful tool to ensure that we are in control of the complexity in our projects. I use strong typing wherever possible, and highly favor libraries that do as well.
It's ok to have duplicative-looking types for different layers of a project—e.g. ORM, API schema, and business logic. Types need to be easy to change and reason about so we can adapt them to our evolving requirements. Types near the edge (like API schemas and database tables) are inherently less flexible. It's often better to explicitly map types between these layers rather than create a "franken-type" and allow the least flexible layer to calcify the rest.11
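As an illustration of mapping between layers (all names hypothetical), an edge type can mirror the wire format while the domain type stays free to evolve:

```python
from dataclasses import dataclass


# Edge type: mirrors the JSON payload the API accepts, so it is
# stringly-typed and shaped by the wire format.
@dataclass
class UserPayload:
    name: str
    age: str  # arrives as a string from a form or query param


# Domain type: shaped for business logic, free to evolve independently.
@dataclass
class User:
    name: str
    age: int

    @property
    def is_adult(self) -> bool:
        return self.age >= 18


def user_from_payload(payload: UserPayload) -> User:
    # The explicit mapping is where edge quirks get absorbed, so they
    # don't leak into (or calcify) the rest of the codebase.
    return User(name=payload.name.strip(), age=int(payload.age))
```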
- I use match statements liberally.12
- A 500 Internal Server Error response is loud and clear. A 200 OK with an empty list is just confusing. Exceptions we anticipate can be incorporated in the type system with a result type13, allowing callers to opt in to more specific error handling (e.g. a more descriptive error message).

I don't like "convention over configuration" and generally avoid libraries/platforms that use a lot of "magic" like metaprogramming. The magic is a liability when learning or coming back to a technology, and the pain of explicitly writing configuration is going to 0 with LLMs. Good libraries expose their capabilities in their function signatures and don't require regularly referencing a book of incantations.
I like understanding how the tools I use work. I almost never use a starter template or automated install script. If it's complicated enough to warrant automating configuration, it's worth taking a few minutes to understand what tradeoffs it's making.
TODOs), and any algorithmic choices that would make a good leetcode question.

This guide mostly applies to B2B/B2C "web"-based applications—my tools for e.g. PKI and announcing BGP are kept in the attic. ↩︎
PicoShare's creator Michael Lynch has a very honest and transparent blog about building products / businesses that I enjoy. ↩︎
llm's creator Simon Willison has a high-signal/high-volume blog that I read every day. It's a great resource, particularly for working with LLMs. His style of working in the open is inspiring, and sparks a lot of ideas for me. Dan Corin is another great example of building in the open like this. ↩︎
There are a bunch of great plugins for llm, and you can write your own. I use a wrapper I wrote to generate terminal commands and my sister's commitm to generate commit messages. It pairs well with files-to-prompt for including file context. ↩︎
OpenRouter's got a stacked team. I'm very bullish on them & happy to call them friends. ↩︎
Obviously, I'm bullish on Astral. Charlie Marsh is building a team of wizards, including sharkdp and BurntSushi, who authored tools on this list. ↩︎
Hynek Schlawack created attrs, stamina, and structlog on this list. I resonate well with his approach to structuring software. His blog and the documentation on his projects are excellent resources, especially on Python and Docker. ↩︎
Pydantic is another great team pushing the Python ecosystem forward. Samuel Colvin and the team have great technical taste that shows across their projects. ↩︎
The This Week in React newsletter is a good source for keeping up to date with React Native news. ↩︎
I think UIs are going to change a lot as AI improves. Geoffrey Litt does insightful work on the topic of "malleable software" that I suggest reading. ↩︎
I see a lot of natural hesitation to do this because of a misapplication of the DRY Principle. There's a difference between being explicit and being duplicative. The cost of being explicit is going to 0 with LLMs, and we can harness the benefits for free. ↩︎
In Python, pattern matching is newer and a bit unintuitive. Raymond Hettinger has a talk explaining the quirks and some complicated workarounds. I use it more simply for matching types. I always check for exhaustiveness, and exclude a default case (or use assert_never) so the type checker tells me when I am missing a case. ↩︎
I use my fork of result for this in Python, along with stamina to retry things that fail intermittently. I prefer decorators over a bunch of if (err) style logic. ↩︎
You try to install Python dependencies in Github Actions using a private Github repository as a source. You receive an error like:
fatal: could not read Username for 'https://github.com': No such device
remote: Repository not found.
fatal: repository 'https://github.com/example/repo/' not found
The GITHUB_TOKEN secret that is available in Github Actions environments does NOT have access to other private repositories.
Here's a sample pyproject.toml that can break, where example/otherproj is an example private repository:
[project]
name = "myproj"
version = "0.1.0"
dependencies = [
"otherproj",
]
[tool.uv.sources]
otherproj = { git = "https://github.com/example/otherproj" } # my private repo
This will work fine locally, but fail when you uv sync it in Github Actions.
You need to create a Github PAT with permissions to access the private repository and pipe it through to the uv sync command.
Here's an example of one way to do it:
1. Create a Github PAT with repo permissions at https://github.com/settings/tokens.
2. Save it as an Actions secret; I use UV_GH_TOKEN as the name below.
3. Pass it through to the docker/bake-action step. I left the permissions section in the example below so that other steps can use the limited token.

.github/workflows/docker.yaml:
name: Build Docker Images
on:
  push:
jobs:
  build-and-push:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - name: Build and push
        uses: docker/bake-action@v6
        env:
          UV_GH_TOKEN: ${{ secrets.UV_GH_TOKEN }}
        with:
          files: docker-bake.json
          set: |
            *.platform=linux/amd64
            *.secrets+=id=github_token,env=UV_GH_TOKEN
In the Dockerfile, write the token to a .netrc file during uv sync, deleting it after to avoid leaking it in the image. uv will respect this during installation.

RUN --mount=type=secret,id=github_token \
    echo "machine github.com login x-access-token password $(cat /run/secrets/github_token)" > ~/.netrc \
    && uv sync \
    && rm ~/.netrc
Let me know if you find a better way to do this!